Visual Search – Please, I’m begging you!

At breakfast the other morning, my colleague Alan Wood was excited to try out a software tool, DEVONthink, that he’d just read about in Steven Johnson’s Where Good Ideas Come From:

“Private serendipity can be cultivated by technology as well. For more than a decade now, I have been curating a private digital archive of quotes that I’ve found intriguing, my twenty-first-century version of the commonplace book. Some of these passages involve very focused research on a specific project; others are more random discoveries, hunches waiting to make a connection. Some of them are passages that I’ve transcribed from books or articles; others were clipped directly from Web pages. (In the past few years, thanks to Google Books and the Kindle, copying and storing interesting quotes from a book has grown far simpler.) I keep all these quotes in a database using a program called DEVONthink, where I also store my own writing: chapters, essays, blog posts, notes. By combining my own words with passages from other sources, the collection becomes something more than just a file storage system. It becomes a digital extension of my imperfect memory, an archive of all my old ideas, and the ideas that have influenced me. There are now more than five thousand distinct entries in that database, and more than 3 million words—sixty books’ worth of quotes, fragments, and hunches, all individually captured by me, stored in a single database.

“Having all that information available at my fingertips is not just a quantitative matter of finding my notes faster. Yes, when I’m trying to track down an article I wrote many years ago, it’s now much easier to retrieve. But the qualitative change lies elsewhere: in finding documents that I’ve forgotten about altogether, finding documents that I didn’t know I was looking for. What makes the system truly powerful is the way that it fosters private serendipity.

“DEVONthink features a clever algorithm that detects subtle semantic connections between distinct passages of text. These tools are smart enough to get around the classic search-engine failing of excessive specificity: searching for “dog” and missing all the articles that only have the word “canine” in them. Modern indexing software like DEVONthink’s learns associations between individual words by tracking the frequency with which words appear near each other. This can create almost lyrical connections between ideas. Several years ago, I was working on a book about cholera in London and queried DEVONthink for information about Victorian sewage systems. Because the software had detected that the word “waste” is often used alongside “sewage,” it directed me to a quote that explained the way bones evolved in vertebrate bodies: namely, by repurposing the calcium waste products created by the metabolism of cells. At first glance that might seem like an errant result, but it sent me off on a long and fruitful tangent into the way complex systems—whether cities or bodies—find productive uses for the waste they create. That idea became a central organizing theme for one of the chapters in the cholera book. (It will, in fact, reappear in this book in a different guise.)

“Now, strictly speaking, who was responsible for that initial idea? Was it me, or the software? It sounds like a facetious question, but I mean it seriously. Obviously, the computer wasn’t conscious of the idea taking shape, and I supplied the conceptual glue that linked the London sewers to cell metabolism. But I’m not at all confident that I would have made the initial connection without the help of the software. The idea was a true collaboration, two very different kinds of intelligence playing off one another, one carbon-based, the other silicon. When I’d first captured that quote about calcium and bone structure, I’d had no idea that it would ultimately connect to the history of London’s sewage system (or to a book about innovation). But there was something about that concept that intrigued me enough to store it in the database. It lingered there for years in the software’s primordial soup, a slow hunch waiting for its connection.

“I use DEVONthink as an improvisational tool as well. I write a paragraph about something—let’s say it’s about the human brain’s remarkable facility for interpreting facial expressions. I then plug that paragraph into the software, and ask DEVONthink to find other passages in my archive that are similar. Instantly, a list of quotes appears on my screen: some delving into the neural architecture that triggers facial expressions, others exploring the evolutionary history of the smile, others dealing with the expressiveness of our near-relatives, the chimpanzees. Invariably, one or two of these triggers a new association in my head—perhaps I’ve forgotten about the chimpanzee connection—and so I select that quote, and ask the software to find a new batch of passages similar to it. Before long, a larger idea takes shape in my head, built upon the trail of associations the machine has assembled for me.

“Compare that to the traditional way of exploring your files, where the computer is like a dutiful, but dumb, butler: “Find me that document about the chimpanzees!” That’s searching. The other feels radically different, so different that we don’t quite have a verb for it: it’s riffing, or exploring. There are false starts and red herrings, but there are just as many happy accidents and unexpected discoveries. Indeed, the fuzziness of the results is part of what makes the software so powerful. The serendipity of the system emerges out of two distinct forces. First, there is the connective power of the semantic algorithm, which is smart but also slightly unpredictable, thus creating a small amount of randomizing noise that makes the results more surprising. But that randomizing force is held in check by the fact that I have curated all these passages myself, which makes each individual connection far more likely to be useful to me in some way. When you start a new query in DEVONthink and look down at the initial results, at first glance they can sometimes seem jumbled and disconnected, but then you read through them in more detail, and inevitably something tantalizing catches your eye. “Jumbled” and “disconnected” is of course also how we describe the strange explorations of our dreams, and the comparison is an apt one. DEVONthink takes the strange but generative combinations of the dream state and turns them into software.”

Johnson, Steven (2010-10-05). Where Good Ideas Come From: The Natural History of Innovation (pp. 115-117). Riverhead. Kindle Edition.
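
To make the co-occurrence idea in Johnson’s description a little more concrete, here is a minimal sketch in Python. It is not DEVONthink’s actual algorithm, which the passage does not spell out. The passages, the stop-word list, the window size, the `min_count` threshold, and the function names are all my own assumptions, chosen only to show how counting words that appear near each other can pull up a passage that never mentions the query word.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption, not DEVONthink's).
STOP_WORDS = {"the", "into", "and", "by", "from", "as", "do", "much"}

# Made-up passages standing in for a personal archive of quotes.
PASSAGES = [
    "Victorian sewage carried household waste into the Thames.",
    "London sewage and industrial waste fouled the river.",
    "Bones evolved by repurposing calcium waste from cell metabolism.",
    "Chimpanzees read facial expressions much as humans do.",
]


def tokenize(text):
    """Lower-case the text and return its words, minus stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def cooccurrence(passages, window=4):
    """Count how often two words fall within `window` tokens of each other."""
    counts = defaultdict(Counter)
    for passage in passages:
        tokens = tokenize(passage)
        for i, word in enumerate(tokens):
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    counts[word][other] += 1
                    counts[other][word] += 1
    return counts


def expand_query(query, counts, min_count=2):
    """Weight query words at 1.0 and add neighbours seen at least `min_count` times."""
    weights = Counter()
    for word in tokenize(query):
        weights[word] += 1.0
        neighbours = counts.get(word, Counter())
        total = sum(neighbours.values())
        for other, seen in neighbours.items():
            if seen >= min_count:
                # Associated words count for less than the query word itself.
                weights[other] += seen / total
    return weights


def rank(query, passages, counts):
    """Return the passages with a non-zero score, best match first."""
    weights = expand_query(query, counts)
    scored = [(sum(weights[t] for t in set(tokenize(p))), p) for p in passages]
    hits = [(score, p) for score, p in scored if score > 0]
    return [p for score, p in sorted(hits, key=lambda hit: hit[0], reverse=True)]


if __name__ == "__main__":
    for passage in rank("sewage", PASSAGES, cooccurrence(PASSAGES)):
        print(passage)
```

In this toy archive a query for “sewage” returns the two sewage passages first and then the passage about bones, because “waste” sits near “sewage” twice and so earns a small weight of its own. The chimpanzee passage shares no associated words and is left out. A plain keyword match would have stopped at the first two.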

Alan shared as much as he could remember of the above passage.

Finally, I couldn’t help myself and blurted out, “You don’t want to do that. You’ll find it hard to use and it won’t do what you think it will do. What you really want is a personal version of Attenex Patterns.”

What ensued was a long discussion about visual analytics, semantic networks, social networks, and event networks. Finally, Alan looked at me and said “Well, why don’t you just go build it? What are you waiting for?”

After David Socha and I stopped laughing and crying simultaneously (we’ve been having this same discussion for ten years), we all looked at each other and said, “Maybe it is time.”

As I shared this story with one of David’s students, Yulana Shestak, she asked me if I’d ever heard of a tool called Zet Universe that a friend of hers is working on. I hadn’t heard of it so she sent me a pointer. The tool looks interesting and I can’t wait to try it out. A description of the tool can be found at Neocytelabs.

Project Overview

What

Zet Universe is the ubiquitous digital work environment with a game-changing natural user interface that learns and expands on users interest graph over the time. Zet Universe is a new, living metaphor of working with information in the Post-PC world.

Problems

Zet Universe addresses the needs of generation Z mindsets to have a simple yet powerful digital work environment with natural user experience across multiple devices.

Tools to tackle information are currently split between various products and platforms, leading to overload, context loss, permanent thought and action flow disruption, productivity decrease and extremely poor experience.

Goals

Like original Context-aware Computing Shell UX, Zet Universe is a system that is designed to enable user to:

  • Concentrate on important projects
  • Switch between projects without losing context
  • Return to previous projects after long time
  • See whole picture of the project and easily jump to its details

in order to:

  • Reintroduce people’s information processes around their interests

Zet Universe is the Interest Graph

  • Zet Universe will reintroduce people’s information processes around their interests – from inception to learning, updates and sharing.
  • Zet Universe is interest graph that uses sophisticated machine learning algorithms to extract interests from personal information.
  • Zet Universe stores and makes user information of any kind (files, notes, people, places, documents, etc.) available across one’s devices via cloud service.
  • Zet Universe gives relevant recommendations within current user activity over the time.

As I poked around Daniel Kornev’s websites, I found a pointer to the Google Knowledge Graph. I’ve wondered whether Google was ever going to get into this space, so it was nice to see some acknowledgement on their part of the importance of visualization. Lance Ulanoff had this to say about the knowledge graph:

“Google has a confession to make: It does not understand you. If you ask it “the 10 deepest lakes in the U.S,” it will give you a very good result based on the keywords in the phrase and sites with significant authority on those words and even word groupings, but Google Fellow and SVP Amit Singhal says Google doesn’t understand the question. “We cross our fingers and hope someone on the web has written about these things or topics.”

“The future of Google Search, though, could be a very different story. In an extensive conversation, Singhal, who has been in the search field for 20 years, outlined a developing vision for search that takes it beyond mere words and into the world of entities, attributes and the relationship between those entities. In other words, Google’s future search engine will not only understand your lake question but know a lake is a body of water and tell you the depth, surface areas, temperatures and even salinities for each lake.

“To understand where Google is going, however, you need to know where it’s been.

“Search, Singhal explained, started as a content-based, keyword index task that changed little in the latter half of the 20th century, until the arrival of the World Wide Web, that is. Suddenly search had a new friend: links. Google, Amit said, was the first to use links as “recommendation surrogates.” In those early days, Google based its results on content links and the authority of those links. Over time, Google added a host of signals about content, keywords and you to build an even better query result.

“Eventually Google transitioned from examining keywords to meaning. “We realized that the words ‘New’ and ‘York’ appearing next to each other suddenly changed the meaning of both those words.” Google developed statistical heuristics that recognized that those two words appearing together is a new kind of word. However, Google really did not yet understand that New York is a city, with a population and particular location.”

Google Knowledge Graph

To build it (and they will come) or to wait for someone to build the personal visual analytics engine that I really want and need. That is always the question when it comes to my crazy ideas.

Maybe, just maybe, Curtis Wong from Microsoft, with the engine underneath the WorldWide Telescope project, will provide what I need.

