Feature
posted 1 Sep 1999 in Volume 3 Issue 1
Natural navigation
Technology to support knowledge management must access relevant
information and help make sense of it. This is particularly important when a
huge amount of information is available from multiple, disparate sources,
including unstructured documents and images, as well as structured databases. In
this article, Ramana Rao and Ralph Sprague, Jr. describe two underlying
technologies that support knowledge work: information visualisation and
knowledge extraction.
Much research has been done on designing tools that are natural to us as
humans as well as to the tasks we must perform with them. In some sense, the
entire history of computing has been about a shift from the artificial, even the
strange, to the natural. As computation has become less and less expensive and
rare, more and more of it could be used to adapt interaction to us and our work.
Some acts are completely natural for humans and foreign to computers. For
example, we expect humans to identify which branch of a tree sticks out, or
recognise that something in particular is meant by a string of words like 'hot
and cold running water.' Other acts we expect computers to do easily and humans
to do poorly, like counting the number of leaves on the tree or the number of
spaces in this sentence.
Information visualisation and knowledge extraction, as elements of knowledge
management, are about applying computation in ways that leverage our natural
abilities: vision, language, memory based on visual and spatial cues, and
pattern recognition.
Information visualisation is about utilising interactive graphics to present information and support interactions. By leveraging the strengths as well as respecting the constraints of our cognitive, perceptual, and motor systems, it is possible to support interaction with much larger amounts of information than possible with previous graphical interfaces.
Knowledge extraction is about extracting attributes or structure from information objects (documents or datasets) or collections (e.g. corpora or databases) of the kind that we naturally use in sense-making and in decisions about how we are going to use our time. This extracted information, often called metadata or even knowledge, is valuable in guiding access and use of the information and often for automating portions of the work.
Early on, the use of computer power moved beyond calculation to information processing in support of business functions. This required structuring information so that it could be handled routinely through computation. Data is organised into entities and their attributes, attributes are grouped into data records, and records are accumulated into files, databases and information systems. A major challenge in using computers for knowledge management derives from the increasing use of so-called 'unstructured' information in documents, in addition to the 'structured' information in databases. The assertion that documents are unstructured results from attempts to extend the structural model of databases to documents. Yet it also betrays a computer or implementation bias.
A human or user perspective might label things exactly the opposite. Large databases in some sense seem very unstructured to a human because patterns in the data are hard to observe. But from the human point of view, there is a great deal of structure in documents. Humans regularly scan a document and see the title to indicate general topic, and the table of contents to see the parts of the content in context. They see several levels of paragraph headings, format, bold and italic to infer structure and emphasis. Even overall physical attributes such as size, weight, and graphic design provide clues to content and structure of a book or document.
What is needed then is a way to exploit structure that is available in the information system in one case (traditionally called structured) and easily perceived by humans in the other (traditionally called unstructured). The two technologies focus on the two sides of this goal, respectively.
Information visualisation
Humans as a species have evolved to have great visual and spatial skills. We are quick to see edges and discontinuities, things that stick out, things of a similar colour, or sharp changes of light or motion. Similarly, we are great at retrieving memories by visual and spatial cues: 'it had a picture, mostly red, in its upper left corner.' By leveraging these perceptual, automatic skills in systems, we can reduce the cognitive, conscious loads on users as they navigate through large amounts of information.
Navigating spaces of much and many
Navigation is about steering to an item of interest. The conventional navigation techniques of graphical user interfaces (GUIs) do not scale when faced with large numbers of objects. Opening and closing windows, weaving through hierarchical menus, folding and unfolding levels in a tree view, and scrolling through long lists of textual items all add a great deal of mechanical overhead to the work process and seem to break the flow of thinking.
Instead, information visualisation and automatic space management techniques can be used to a variety of ends in knowledge management systems to visualise document collections, query results, relationships, sets of sources, datasets, and metadata. These technologies exploit a few common design elements, each of which is based on our natural abilities as humans or the natural constraints of working in space:
Graphical representation: Seeing graphics is fundamentally different from reading text. For example, scanning a thousand tiny bars with your eyes requires hardly any conscious effort, unlike reading a thousand numbers, which takes a great deal of mental energy and time. Table Lens by Inxight is a good example. Furthermore, graphical representations can be packed more densely than text. Thus by using graphical marks (of varying colour, light level, texture, shape, size, iconography, and so on) and by linking or arranging them in space, a large number of objects and relationships can be shown in ways that can be quickly assimilated.
Focus and context: Even with the move to denser visual representation, we are still subject to the fundamental limitation of available screen space. Thus, there is a tradeoff between showing a lot about a little or a little about a lot. One strategy for dealing with this is to switch between overviews (a little about a lot) and more detailed views like a page view (a lot about a little). So across time, a user is able to navigate around in the large view, while still being able to access detailed information about particular items of interest. Hyperbolic Tree by Inxight is a good example.
Animated transitions: Changes to the view, e.g. changing the focus, are more effective if they are animated. Rapidly showing a series of frames that depict the change from the initial position to the final position preserves the sense that the structure is a coherent object that is being rearranged.
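The idea of an animated transition can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the function name `transition_frames`, the dictionary-based layout, and the use of plain linear interpolation are all choices made for this sketch, not a description of how any particular product computes its frames.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Layout = Dict[str, Point]  # node name -> (x, y) screen position


def transition_frames(start: Layout, end: Layout, steps: int) -> List[Layout]:
    """Return steps + 1 layouts that move each node linearly from start to end.

    Hypothetical helper: assumes every node in `start` also appears in `end`.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    frames = []
    for i in range(steps + 1):
        t = i / steps  # 0.0 at the initial layout, 1.0 at the final layout
        frame = {
            node: (
                x0 + (end[node][0] - x0) * t,
                y0 + (end[node][1] - y0) * t,
            )
            for node, (x0, y0) in start.items()
        }
        frames.append(frame)
    return frames


if __name__ == "__main__":
    before = {"root": (0.0, 0.0), "child": (10.0, 0.0)}
    after = {"root": (10.0, 4.0), "child": (0.0, 0.0)}
    for frame in transition_frames(before, after, steps=4):
        print(frame)
```

Drawing these intermediate layouts in quick succession, rather than jumping straight to the final one, is what lets the eye track each item as the structure is rearranged.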
Knowledge extraction
Knowledge extraction is about automatically extracting properties of documents or context that humans are quite effective at reading. Knowledge workers typically leverage these properties in deciding whether and how to use particular documents or to choose among various resources. When there are large numbers of documents available, automatic extraction of this metadata can be used during visualisation to assist the user in the performance of tasks. Technology available today for language analysis allows for higher quality extraction from textual sources. KM solutions should also consider the types of metadata useful in knowledge work and the methods of extracting them.
Treating text as language
Many applications that process text treat the text as a stream of character codes, not as human language. Yet human languages support rich and complex structure in their words, sentences, syntax and grammar. To deal with this, many text processing applications, including indexing, search, and extraction, employ ad hoc methods of breaking a text stream into words, phrases, and larger units of language. So, for example, a common technique used to stem words to their root forms is to use rules like breaking off certain endings. Though these types of heuristics have worked for some purposes in English, they certainly don't work for all applications in other languages. Furthermore, since these techniques are language-specific, linguistic phenomena that show up in different languages, such as agglutination or compounding, require different software runtimes. What is required is some uniform way of capturing the linguistic rules of each given language that allows text to be processed uniformly.
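To make the limitation concrete, here is a minimal sketch of the 'break off certain endings' style of stemming. The function name `naive_stem`, the suffix list, and the minimum stem length are assumptions invented for this illustration; it is not the method of any particular product or library.

```python
# Illustrative only: a naive suffix-stripping stemmer.
# The suffix list and the three-character minimum are assumptions for this sketch.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    lower = word.lower()
    for suffix in SUFFIXES:
        if lower.endswith(suffix) and len(lower) - len(suffix) >= 3:
            return lower[: -len(suffix)]
    return lower


if __name__ == "__main__":
    for w in ["walking", "walked", "walks", "running", "ponies", "Häuser"]:
        print(w, "->", naive_stem(w))
```

The rule maps 'walking', 'walked' and 'walks' to 'walk', but it turns 'running' into 'runn' and 'ponies' into 'poni', and it leaves the German plural 'Häuser' unrelated to 'Haus'. Each such case needs another special rule, which is why a uniform, language-aware description of the rules is preferable.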
Extracting the 'about'
Knowing about a thing can help a person decide whether it is worth spending time on it. Consider how you decide whether to go to a movie. Most likely you would want to know what the movie is about, who the actors and the director are, and what kind of reviews it is getting. Previews and reviews, associations and interests, all play a role in deciding whether the two hours is worth spending. Similarly, in dealing with the massive amounts and diverse sources of information available to a knowledge worker, knowing about the information makes it possible to select and filter documents and sources more effectively. Thus techniques for automatically extracting properties of documents and collections are quite important for knowledge management systems. Often this kind of extracted data is called metadata or meta-information, which can be a catchall category covering many different types of features or descriptions of documents or collections of documents. Some advanced examples of metadata are:
Language identification: This is an important capability for systems that will be used in global markets by multi-national companies and users of the Internet.
Genre detection: Knowing that a document is a newspaper article, a scholarly publication, a letter to the editor, or an email message can make a difference. Users derive important information about the context, structure, contents, and value of a document from its genre.
Summarisation: This reduces document volume and complexity while still retaining the essential information of the document. Research has shown that summaries containing only 20% of the original content can be as informative as the full text. Research indicates that even shorter summaries can retain the essence of the document meaning. Summarisation enables readers to get a sense of a document's meaning without reading its entire contents. When this capability is added to an information retrieval application, it increases users' awareness of what documents are available and improves their performance in navigating to documents of greatest interest.
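As a rough illustration of summarisation, the sketch below keeps about 20% of a document's sentences, choosing those whose words occur most often in the document as a whole. This simple frequency heuristic is an assumption made for the example and is not the technique behind the research mentioned above; the names `summarise` and `STOPWORDS`, and the tiny stop-word list, are likewise hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "in", "is", "it", "of", "that", "the", "to"}


def _tokens(text: str) -> list:
    """Lower-case words in the text, with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarise(text: str, ratio: float = 0.2) -> str:
    """Keep roughly `ratio` of the sentences, ranked by average word frequency."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return ""
    freq = Counter(_tokens(text))

    def score(sentence: str) -> float:
        tokens = _tokens(sentence)
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    keep = max(1, round(len(sentences) * ratio))
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    # Emit the chosen sentences in their original order.
    return " ".join(sentences[i] for i in sorted(ranked[:keep]))


if __name__ == "__main__":
    print(summarise("Cats sleep. Cats eat fish. Dogs bark.", ratio=0.34))
```

Selecting whole sentences keeps the summary readable, at the cost of occasionally missing points that the document makes only once.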
Conclusion
Information visualisation and knowledge extraction focus on the use of computation to lead to more natural performance by humans. The technologies support people by leveraging acts that people are expected to do well, and that fit into the work they do in a natural manner. By designing information systems fit for humans and for knowledge work, not only are certain tasks made more efficient, but in fact many complex tasks are enabled that would otherwise not be possible. In particular, much larger amounts of information and a greater variety of sources can be dealt with effectively by a broader range of users.
Ramana Rao is Chief Technology Officer and Director of Engineering at Inxight Software, Inc. He can be contacted at: rao@inxight.com
Ralph H. Sprague, Jr. is Professor of Decision Science at the University of Hawaii. He can be contacted at: sprague@hawaii.edu