Comparison Reducing Information Retrieval Method

IP.com Disclosure Number: IPCOM000094652D
Original Publication Date: 1965-Apr-01
Included in the Prior Art Database: 2005-Mar-06
Document File: 2 page(s) / 13K

Publishing Venue

IBM

Related People

Raver, N: AUTHOR

Abstract

One way to match information in documents against requests for that information is to set up a descriptor-word profile of each document and a descriptor-word profile of the information requested. Each word in the request profile is then compared with each word in the document profile to find the required matches. An entry in the request profile can have a must operator associated with it, which indicates that, if this word is found, the document answers the request. Alternatively, entries in the request can have a may operator associated with them, which means that a predetermined number of matches between the request profile and the document profile must be made before the document is selected. This scheme requires a very large number of word comparisons.

This is the abbreviated version, containing approximately 52% of the total text.

One way to match information in documents against requests for that information is to set up a descriptor-word profile of each document and a descriptor-word profile of the information requested. Each word in the request profile is then compared with each word in the document profile to find the required matches. An entry in the request profile can have a must operator associated with it, which indicates that, if this word is found, the document answers the request. Alternatively, entries in the request can have a may operator associated with them, which means that a predetermined number of matches between the request profile and the document profile must be made before the document is selected. This scheme requires a very large number of word comparisons. Only a small percentage of comparisons, usually less than 2%, results in a successful hit.
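The word-by-word scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the disclosure's implementation: the function name `naive_match`, the `(word, operator)` layout of the request profile, and the `may_threshold` parameter are all introduced here for the example.

```python
def naive_match(request, document, may_threshold):
    """Compare every request word against the document words.

    request       -- list of (word, op) pairs, op is "must" or "may" (assumed layout)
    document      -- list of descriptor words for one document
    may_threshold -- assumed name for the predetermined number of "may" matches
    Returns (selected, comparisons_made).
    """
    comparisons = 0
    may_hits = 0
    for word, op in request:
        for doc_word in document:
            comparisons += 1
            if word == doc_word:
                if op == "must":
                    # A single "must" hit is enough to select the document.
                    return True, comparisons
                may_hits += 1
                break  # this request word is matched; move to the next one
    return may_hits >= may_threshold, comparisons


if __name__ == "__main__":
    request = [("laser", "may"), ("optics", "may"), ("patent", "must")]
    document = ["memory", "optics", "laser", "storage"]
    selected, comparisons = naive_match(request, document, may_threshold=2)
    print(selected, comparisons)
```

The comparison counter makes the cost visible: the work grows with the product of the two profile lengths, and it is repeated for every document examined.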

The number of word comparisons required can be reduced by over 70% with a randomization technique, or eliminated entirely with a modification of that technique. The first step is to randomize each descriptor word in a profile into a given bit position of an N-bit profile word. That is, each X-bit descriptor word is transformed into one bit at a fixed position of the N-bit word. A given descriptor word is always randomized to the same bit position of the N-bit profile word, although it can share this position with other descriptor words. Thus, each document profile and each request profile is represented by a single N-bit word. A single comparison can then be made between the N-bit word of the request profile and the N-bit word of the document profile.
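A minimal sketch of the randomization step, under loud assumptions: the disclosure does not say which randomizing function is used, so CRC-32 modulo N stands in for it here, and the width `N = 64` and the names `bit_position`, `profile_word`, and `may_share_descriptor` are hypothetical. The only property relied on is the one stated in the text: a given descriptor word always lands on the same bit, and different words may share a bit.

```python
import zlib

N = 64  # assumed width of the profile word; the source leaves N unspecified


def bit_position(word, n=N):
    """Map a descriptor word to a fixed bit position in [0, n).

    CRC-32 is a stand-in for the unspecified randomizing function. It is
    deterministic, so the same word always yields the same position.
    """
    return zlib.crc32(word.encode("utf-8")) % n


def profile_word(words, n=N):
    """OR together the bits of all descriptor words into one n-bit integer."""
    bits = 0
    for word in words:
        bits |= 1 << bit_position(word, n)
    return bits


def may_share_descriptor(request_words, document_words, n=N):
    """Single comparison of the two profile words.

    False means no descriptor word can be common to both profiles.
    True means a common word is possible, but it may also be a collision
    between different words that share a bit position.
    """
    return (profile_word(request_words, n) & profile_word(document_words, n)) != 0


if __name__ == "__main__":
    print(may_share_descriptor(["laser", "optics"], ["optics", "memory"]))
```

Because a word shared by both profiles sets the same bit in both profile words, a zero result from the bitwise AND rules out any common descriptor in one operation. A nonzero result is only a candidate, since distinct words can occupy the same bit position.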

If the must operator is being used, a single match between these N-bit words indicates that a detail...