Organizing a Ranked List of Search Matches

IP.com Disclosure Number: IPCOM000114072D
Original Publication Date: 1994-Nov-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 4 page(s) / 131K

Publishing Venue

IBM

Related People

Wecker, AJ: AUTHOR [+2]

Abstract

The results of a keyword search through hierarchically organized documents are typically presented as a ranked list of topics, where a topic is the general name for any single subsection at any level of the hierarchy. The ranking indicates how well a topic matched the query. Such a list of search matches pays no attention to the contiguity of the topics that appear in it and is unable to capture units of discussion longer than a single topic. A highly ranked topic appears at the top of the list and a low-ranked topic at the bottom, even if the two topics appear in uninterrupted succession in the book itself.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 39% of the total text.

      The results of a keyword search through hierarchically
organized documents are typically presented as a ranked list of
topics, where a topic is the general name for any single subsection
at any level of the hierarchy.  The ranking indicates how well a
topic matched the query.  Such a list of search matches pays no
attention to the contiguity of the topics that appear in it and is
unable to capture units of discussion longer than a single topic.  A
highly ranked topic appears at the top of the list and a low-ranked
topic at the bottom, even if the two topics appear in uninterrupted
succession in the book itself.

      A user of a standard ranked list of topics with search matches
has two problems.  First, when cycling through the results, the user
is very likely to be taken back to a discussion already read.  A user
who asks to see a specific topic in the list does not necessarily
stay within that topic; instead, the user is likely to browse into
the contiguous topics that continue to discuss the material of
interest, and those topics may appear again further down the list.

      Second, a user who is presented with a standard ranked list
of topics with search matches can miss the forest for the trees: the
presence of a large number of topics in the list might obscure the
fact that there are really only a few discussions of the material in
the book, with each of these discussions ranging over several topics.
A user might thus be misled into thinking that their query was not
specific enough, or that too much effort will be needed to sift
through the results for it to be worthwhile.

      For example, if the word "dingbat" is submitted as a search
query using IBM BookManager*, the following ranked list of
search results is returned (only topic numbers are shown):
  2.33.2.1
  2.32.1
  GLOSSARY
  2.33.2
  2.32
  2.33.1
  2.33.2.2
  2.33.2.4
  2.33.2.5
  2.33.2.3
  2.33.2.2.1
  CHANGES.4
  2.33.2.5.2
  2.33.2.6
  2.33.3.1
  5.1.2.1
  5.1.2.2
  5.1.3
  2.33.1.2
  2.33.2.3.2
  2.33.2.4.3
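
      One way to see the contiguity hidden in this list is to re-sort
the numeric topic numbers into book order.  The following is a minimal
sketch, not part of the original disclosure.  It assumes that a topic
number such as 2.33.2 precedes its own subsections in the book, and it
leaves out GLOSSARY and CHANGES.4 because their position in the book is
not stated.  The names ranked, book_order_key and in_book_order are
illustrative only.

```python
# Numeric topics from the ranked list above, in ranked order.
# GLOSSARY and CHANGES.4 are omitted: their place in the book is unknown.
ranked = [
    "2.33.2.1", "2.32.1", "2.33.2", "2.32", "2.33.1", "2.33.2.2",
    "2.33.2.4", "2.33.2.5", "2.33.2.3", "2.33.2.2.1", "2.33.2.5.2",
    "2.33.2.6", "2.33.3.1", "5.1.2.1", "5.1.2.2", "5.1.3",
    "2.33.1.2", "2.33.2.3.2", "2.33.2.4.3",
]


def book_order_key(topic):
    """Turn '2.33.2.1' into (2, 33, 2, 1).

    Tuples compare element by element, and a shorter tuple sorts before
    a longer one that extends it, so a parent topic sorts ahead of its
    subsections (assumed here to match the order of the book).
    """
    return tuple(int(part) for part in topic.split("."))


# The same hits, now in assumed book order rather than ranked order.
in_book_order = sorted(ranked, key=book_order_key)

for topic in in_book_order:
    print(topic)
```

Read in this order, the hits fall into a few long runs under 2.32, 2.33
and 5.1, which the ranked order scatters across the list.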

      Examination of the list (and consideration of topic length)
reveals that there are seven major points of entry in the book where
the word "dingbat" can be found.  The following reorganized list
presents the required information in a more usable form:
  +2.32 - 2.32.2
  +2.33.1 - 2.33.2.6
  GLOSSARY
  CHANGES
  5.1.2.1
  5.1.2.2
  5.1.3

      The revised list has simply grouped together sequences of
topics, listing the first and last topic number represented in the
sequence.  At the same time, it has retained the ranking order of
each group (or ungrouped topic).
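
      The grouping step can be sketched as follows.  This is an
illustration under stated assumptions, not the method of the original
disclosure.  It assumes that a table of contents listing every topic in
book order is available, it treats two hits as contiguous only when
they are adjacent in that table of contents (the text also mentions
topic length, which is not modeled here), and it ranks a group by its
best-ranked member.  The text does not say how a group's rank is
derived, so that rule may not reproduce the ordering shown in the
example.  The table of contents, the hit list and all function names
below are hypothetical.

```python
def group_ranked_hits(ranked_hits, toc):
    """Group hits that are adjacent in book order.

    ranked_hits: topic ids, best match first.
    toc: every topic id in the book, in book order (assumed available).
    Returns a list of groups, each a list of topic ids in book order.
    """
    position = {topic: i for i, topic in enumerate(toc)}
    rank = {topic: r for r, topic in enumerate(ranked_hits)}

    groups = []
    # Walk the hits in book order and extend the current run whenever
    # the next hit directly follows the previous one in the book.
    for topic in sorted(ranked_hits, key=position.__getitem__):
        if groups and position[topic] == position[groups[-1][-1]] + 1:
            groups[-1].append(topic)
        else:
            groups.append([topic])

    # Assumption: a group is ranked by its best-ranked member.
    groups.sort(key=lambda group: min(rank[t] for t in group))
    return groups


def format_group(group):
    """Show a run as '+first - last', and a lone topic as itself."""
    if len(group) == 1:
        return group[0]
    return f"+{group[0]} - {group[-1]}"


# Hypothetical toy book and ranked hit list, for illustration only.
toc = ["1", "1.1", "1.2", "2", "2.1", "2.2", "3", "3.1", "4"]
ranked_hits = ["2.1", "1.1", "4", "2.2", "1.2"]

groups = group_ranked_hits(ranked_hits, toc)
for group in groups:
    print(format_group(group))
# prints:
# +2.1 - 2.2
# +1.1 - 1.2
# 4
```

Five ranked hits collapse into three entries, and each entry still
appears in an order derived from the original ranking.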

      Now the user knows that there are seven principal places in the
book where dingbats are discussed, and these discussions are ranked.
The user can then select which discussion they are interested in, and
jump to the beginning of that discussion.  If this granularity is not
fine enough, the user can clic...