Method for Extending Index and Segmentation

IP.com Disclosure Number: IPCOM000119160D
Original Publication Date: 1997-Dec-01
Included in the Prior Art Database: 2005-Apr-01
Document File: 4 page(s) / 145K

Publishing Venue

IBM

Related People

Chiang, S: AUTHOR [+2]

Abstract

A concern with any indexing methodology is the granularity and the strength of the solution (volume or size of data). For text data, the solution should support indexing at the line level, the page level, the chapter or section level, and the document level. Indexing support should work with data from JES, from a pre-processed file, or from data with an external index descriptor.

This is the abbreviated version, containing approximately 40% of the total text.

      If the philosophy chosen for dealing with an index requires
that all index materials be online, then the high maintenance and
database support costs must also be addressed.  An ideal solution
would minimize manual intervention and provide high availability of
the descriptors necessary to respond to queries against the
information stored within the archival system.

      A truly flexible solution would enable the user to segment the
document at multiple levels, such as a line, a page, a chapter or the
entire document.  Within that segmentation, any reasonable number of
fields can be supported; we have selected 9 arbitrarily, and 8 or 16
would work as well.
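
      The segmentation levels and the field limit described above can
be sketched as a small record type.  This is a minimal illustration
only: the names SegmentLevel, IndexEntry, position and MAX_FIELDS are
hypothetical and do not come from the disclosure, which does not
specify a record layout.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple

# The text picks 9 arbitrarily; 8 or 16 would work as well.
MAX_FIELDS = 9


class SegmentLevel(Enum):
    """Levels at which a document may be segmented (per the text)."""
    LINE = "line"
    PAGE = "page"
    CHAPTER = "chapter"
    DOCUMENT = "document"


@dataclass(frozen=True)
class IndexEntry:
    """One index record: a segment plus its index field values.

    'position' is an assumed ordinal of the segment at its level
    (for example, page 12); the disclosure does not define it.
    """
    level: SegmentLevel
    position: int
    fields: Tuple[str, ...]

    def __post_init__(self):
        # Enforce the per-segment field limit.
        if len(self.fields) > MAX_FIELDS:
            raise ValueError(
                f"at most {MAX_FIELDS} index fields are supported")


# Example: index page 12 of a report on two illustrative field values.
entry = IndexEntry(SegmentLevel.PAGE, 12, ("ACCT-0042", "1997-12-01"))
```

Keeping the limit in a single constant reflects the point that the
number 9 is arbitrary and could be changed without touching the rest
of the structure.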

      The size of the index data can vary and may be as big as or
bigger than the data being indexed; this is especially true when
multiple index values are used.  Segmentation does not solve the
data/index volume problem; it only adds a degree of complexity that
may confuse the user.  The user knows only about a specific report
or application of interest and the data item within that report
which is being interrogated.  Adding an artificial boundary does not
help in finding data; it only adds a level of indirection.

      The solution will continue to use the hierarchical index
philosophy, but the way that philosophy is implemented is changed to
deliver a flexible and extensible solution.  Every portion of the
hierarchy changes, from the separation of the index from the data
object to the way the hierarchy is navigated.  The data object will
contain only data (no preprocessing is needed to sort the data into
ascending order).  The index is separated from the data it describes
and stored as its own segmented index object or set of objects.  The
index objects are segmented into 32K blocks, and each index set will
build an index object or series of objects.  In this new object, an
unlimited number of indices and index fields can be built.  The
result is a high-level Direct Access Storage Device (DASD) index
that contains a pointer to the low-level index set(s), which in turn
contain the detailed index information.  The detailed index
information is gathered together in index sets; each index set is a
complete index with the fields needed to locate a segment of data.
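
      The two-level navigation described above can be sketched as
follows.  This is an illustration under stated assumptions, not the
disclosed implementation: the disclosure does not give an entry
format, so Locator, entry_size and the choice of "first key of each
32K block" as the high-level entry are all hypothetical.  The data
object is left in arrival order, and only the index is sorted.

```python
import bisect
from dataclasses import dataclass

BLOCK_SIZE = 32 * 1024  # index objects are segmented into 32K blocks


@dataclass(frozen=True)
class Locator:
    """Hypothetical pointer to a segment of the unsorted data object."""
    offset: int
    length: int


def build_index_set(entries, entry_size):
    """Sort (key, Locator) pairs and pack them into 32K-sized blocks.

    entry_size is an assumed fixed size, in bytes, of one stored entry.
    """
    per_block = max(1, BLOCK_SIZE // entry_size)
    ordered = sorted(entries, key=lambda e: e[0])
    return [ordered[i:i + per_block]
            for i in range(0, len(ordered), per_block)]


def build_high_level_index(blocks):
    """High-level index: the first key of each low-level block."""
    return [block[0][0] for block in blocks]


def lookup(key, high_level, blocks):
    """Go from the high-level index to the block(s), then to entries."""
    # Start one block early: its tail may still hold the key.
    start = max(bisect.bisect_left(high_level, key) - 1, 0)
    hits = []
    for block in blocks[start:]:
        if block[0][0] > key:
            break  # every later block starts past the key
        hits.extend(loc for k, loc in block if k == key)
    return hits


# The data object holds records in arrival order (no pre-sorting).
data = b"ZEBRA report;APPLE report;MANGO report;"
entries = [
    ("ZEBRA", Locator(0, 13)),
    ("APPLE", Locator(13, 13)),
    ("MANGO", Locator(26, 13)),
]

# An artificially large entry_size forces two entries per block so
# that the small example spans more than one block.
blocks = build_index_set(entries, entry_size=BLOCK_SIZE // 2)
high_level = build_high_level_index(blocks)

hits = lookup("MANGO", high_level, blocks)
segment = data[hits[0].offset:hits[0].offset + hits[0].length]
```

Because only the index objects are ordered, new data can be appended
to the data object without re-sorting it; a query touches the small
high-level index first and then reads only the block or blocks that
can contain the key.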

      An index set can be used to construct a DASD SQL online
index.  The index set is a predetermine s...