Sort of Search Results of Information Retrieval System

IP.com Disclosure Number: IPCOM000112241D
Original Publication Date: 1994-Apr-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 2 page(s) / 38K

Publishing Venue

IBM

Related People

Nomiyama, H: AUTHOR

Abstract

Disclosed is a mechanism to sort results of information retrieval systems based on frequencies of characters.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 77% of the total text.

      In information retrieval systems, keyword search is simple,
efficient, and the most widely used method for searching huge
databases.  But it is often difficult to find the right keywords to
match the information that users want.  Sometimes the keywords are too
vague, and sometimes they are too specific.

      It would be useful if the system showed what was retrieved as
the result of a search.  Users could then refine a set of keywords
incrementally by using such information.  The proposed mechanism
shows rough characteristics of the search result based on the
frequencies of characters or words.

System Configuration

      This system consists of three steps:

1.  Extract frequently occurring strings (characteristic strings) from
    search results.

2.  Eliminate invalid characteristic strings.

3.  Sort search results based on the characteristic strings.
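
      The three steps above can be sketched end to end as follows.
This is only an illustration under stated assumptions, not the
disclosed implementation: the function names are hypothetical, word
pairs stand in for characteristic strings, and the elimination rule
(drop strings found in only one result or in every result) is an
assumed stand-in for a rule that is cut off in this abbreviated text.

```python
from collections import Counter


def extract(results, n=2):
    """Step 1: count, for each word n-gram, how many results contain it."""
    df = Counter()
    for text in results:
        words = text.lower().split()
        # A set is used so each n-gram is counted once per result.
        df.update({" ".join(words[i:i + n]) for i in range(len(words) - n + 1)})
    return df


def eliminate(df, total):
    """Step 2 (assumed rule): keep strings found in more than one
    result but not in all of them, most frequent first."""
    kept = [s for s, c in df.items() if 1 < c < total]
    return sorted(kept, key=lambda s: (-df[s], s))


def sort_results(results, strings):
    """Step 3: place results that share a characteristic string together."""
    def rank(text):
        lowered = text.lower()
        for i, s in enumerate(strings):
            if s in lowered:
                return i
        return len(strings)  # results with no characteristic string go last
    return sorted(results, key=rank)  # stable sort keeps the original order within a group


results = [
    "neural network training methods",
    "database index tuning",
    "training a neural network on images",
    "tuning a database index for speed",
]
strings = eliminate(extract(results), len(results))
ordered = sort_results(results, strings)
for text in ordered:
    print(text)
```

      In this sketch the two database results end up adjacent, as do
the two neural network results, so a user can see at a glance which
groups the search returned.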

2.1 Extraction of Characteristic Strings

      In this step, frequently occurring strings are extracted.
Strings can be character sequences (as in Japanese) or word sequences,
depending on the nature of each language.  The algorithm is shown in
[*].
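
      The algorithm cited as [*] is not reproduced in this text, so
the following is not that algorithm.  It is a minimal sketch that
substitutes plain n-gram counting to show the difference between
character sequences and word sequences.  The names units and
frequent_strings, the parameter min_count, and the sample texts are
all hypothetical.

```python
from collections import Counter


def units(text, by_words):
    """Split text into words, or into characters for languages
    that are written without spaces between words."""
    return text.split() if by_words else [ch for ch in text if not ch.isspace()]


def frequent_strings(results, n, by_words, min_count=2):
    """Return (string, count) pairs for n-grams that occur at least
    min_count times across all results, most frequent first."""
    counts = Counter()
    joiner = " " if by_words else ""
    for text in results:
        u = units(text, by_words)
        counts.update(joiner.join(u[i:i + n]) for i in range(len(u) - n + 1))
    frequent = [(s, c) for s, c in counts.items() if c >= min_count]
    return sorted(frequent, key=lambda item: (-item[1], item[0]))


# Character sequences: no word segmentation is needed.
japanese = ["情報検索システム", "情報検索の結果", "検索結果の整列"]
print(frequent_strings(japanese, n=2, by_words=False))

# Word sequences: the text is split on whitespace first.
english = ["information retrieval system", "results of information retrieval"]
print(frequent_strings(english, n=2, by_words=True))
```

      Counting character pairs also yields fragments that straddle
word boundaries (for example, a pair made of the last character of
one word and the first of the next), which is one reason a later
elimination step is useful.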

2.2 Elimination of Invalid Characteristic Strings

      Characteristic strings must be effective in distinguishing one
search result from another.  So characteristic string th...