Browse Prior Art Database

REDUCED COMBINED INDEXES FOR EFFICIENT MULTIPLE ATTRIBUTE RETRIEVAL

IP.com Disclosure Number: IPCOM000148915D
Original Publication Date: 1975-Nov-30
Included in the Prior Art Database: 2007-Mar-30

Publishing Venue

Software Patent Institute

Related People

Shneiderman, Ben: AUTHOR [+2]

Abstract

Reduced Combined Indexes For Efficient Multiple Attribute Retrieval

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 15% of the total text.

Page 1 of 28

[This page contains 1 picture or other non-text object]

Page 2 of 28

[This page contains 1 picture or other non-text object]

Page 3 of 28

Reduced Combined Indexes


For Efficient Multiple Attribute Retrieval

Ben Shneiderman

Computer Science Department

   Indiana University
Bloomington, Indiana 47401

Abstract

   Combined indexes were proposed by Lum 111 as an alternative
to the traditional approach of intersecting multiple single attribute

D

indexes. The combined index approach is appealing for queries re- quiring conjunctions of attribute values since it eliminates the need for time consuming intersections. The large penalty of wasted auxiliary storage space in the combined index approach can be mini- mized by adopting the reduced combined index technique proposed
in this paper. A detailed description, with a number of variations and suggestions for implementation economies, is presented.

Keywords
combined indexes, reduced combined indexes, inverted files,

indexing, multiple attribute retrievals, f i l e organization, secon- dary index files, information retrieval, data management, f i l e organ- ization, query.

CR Categories

3.73, 3.74, 3.79.

[This page contains 1 picture or other non-text object]

Page 4 of 28

A variety of techniques exist for single attribute retrievals

from record oriented f i l e structures, but there are few practical methods available for multiple attribute retrievals. We assume a f i l e t o be a collection of records containing a number of keyed attributes and non-keyed attributes. Queries are specified by single keyed attributes or combinations of keyed attributes, for example:

Select student-records where (age equals 18)

Select student-records where (age equals 18) and (home-state equals Indiana)

Select student-records wht?r3e (age equals 18) and (home-state equals Indiana) and (class-standing equals sophomore)

t

The common practice of inverted indexing requires separate indexes t o each of the attributes. Multiple attribute queries are answered by taking the intersection of the index pointers and then retrieving only the records i n the intersection set. This simple approach
is easy t o implement but the cost of responding t o a query is poten- tially great since the intersection operation can be costly, partic- ularly if the number of attribute values for an attribute is small and the number of records is large.

   To remedy the high cost of' performing an intersection each time a query is made, Lum [l] proposed the idea of combined indexes or

compound indexing. Lum recognized that i f , for example, three attrl- butes were t o be keyed then only three combined indexes would be necessary to answer all first, second and third degree queries.

The degree of a query is the number of attributes specified i n a conjunctive query. Queries containing disjunctions can be converted t o a series of conjunctiv...