Browse Prior Art Database

Indexing Method for Fast Random Access to Collated Data

IP.com Disclosure Number: IPCOM000061728D
Original Publication Date: 1986-Sep-01
Included in the Prior Art Database: 2005-Mar-09
Document File: 1 page(s) / 13K

Publishing Venue

IBM

Related People

Evans, CW: AUTHOR

Abstract

A mechanism for rapidly retrieving collated entries, such as phone directory listings, from a large data file is described. Only a very small portion of the data file is brought into memory, thus substantially improving performance over other file organizations. On-line access to large data files of directory-type data needs to be very fast for productive use. Since it is impractical to keep these large files in memory for immediate access, an efficient mechanism is needed to minimize the amount of data retrieved from the file with each directory look-up. The method involves pre-processing the data, collating it and developing a table of special entry points to random records in the file. Once the computation has been performed to fill the table, the table is stored as part of the file.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 51% of the total text.

Page 1 of 1

Indexing Method for Fast Random Access to Collated Data

A mechanism for rapidly retrieving collated entries, such as phone directory listings, from a large data file is described. Only a very small portion of the data file is brought into memory, thus substantially improving performance over other file organizations. On-line access to large data files of directory-type data needs to be very fast for productive use. Since it is impractical to keep these large files in memory for immediate access, an efficient mechanism is needed to minimize the amount of data retrieved from the file with each directory look-up. The method involves pre-processing the data, collating it and developing a table of special entry points to random records in the file. Once the computation has been performed to fill the table, the table is stored as part of the file. The table is designed to return a two-byte sector number when presented with a two- character string (the first two characters of a Surname, for example). The organization of the table is as follows: Eight entries are reserved for each of the 26 letters in the alphabet. The entries for names beginning with A are subdivided as follows: AA-AD, AE-AH, AI-AL, AM-AP, AQ-AT, AU-AX, AY-AZ, 88, BA-BD, BE-BH, BI-BL, BM-BP,..... with every eighth entry left unused. The subsequent
letters are divided in the same manner. The total size of the table is 2 bytes x 26 letters x 8 entries = 416 bytes, which fits efficiently in one 512-byte sector of a file, leaving room for some additional control information, such as directory title and tab stop information. The binary nature of the table dimensions make calculating the index from the input characters very efficient: 1. Take the first character and convert it to upper case. 2. Subtract from it the value assigned to the code for "A" to generate a binary numbe...