Invention for Full-Text Searching of a Compound Document

IP.com Disclosure Number: IPCOM000114298D
Original Publication Date: 1994-Dec-01
Included in the Prior Art Database: 2005-Mar-28
Document File: 2 page(s) / 46K

Publishing Venue

IBM

Related People

Lumsden, MW: AUTHOR

Abstract

A method for performing full-text search over all content of a compound document is described. The invention includes a caching strategy to greatly improve search performance.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 84% of the total text.

      A method for performing full-text search over all content of a
compound document is described.  The invention includes a caching
strategy to greatly improve search performance.

      The approach relies on a convention among the developers of
compound document part type handlers: each handler supports a method
that returns the full text (if any) of a part.
Note that even non-traditional part content may have text associated
with it.  For example, a video part may include a textual caption
that describes it.
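
      As an illustration only, the convention could be sketched in
Python as follows.  The names Part, TextPart, VideoPart and
full_text are hypothetical; the disclosure does not name the method
or prescribe a language.

```python
from typing import Protocol


class Part(Protocol):
    """Hypothetical convention: every part type handler offers full_text()."""

    def full_text(self) -> str:
        """Return all text associated with the part, or "" if it has none."""
        ...


class TextPart:
    """A traditional part whose content is itself text."""

    def __init__(self, body: str) -> None:
        self.body = body

    def full_text(self) -> str:
        return self.body


class VideoPart:
    """Non-textual content that still carries text through its caption."""

    def __init__(self, frames: bytes, caption: str = "") -> None:
        self.frames = frames      # opaque data; never inspected by a search
        self.caption = caption

    def full_text(self) -> str:
        return self.caption
```

      A caller needs to know only the method, not the part's data
format: VideoPart(b"", "Launch footage").full_text() yields the
caption text.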

      When a search of the compound document is requested, the root
part type handler obtains the full text of all of its immediately
contained parts via this method.  It can then apply its own powerful
search functionality, consistently, over the text of the entire
document, without requiring any knowledge of the format of the
various parts' data.
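
      A minimal sketch of such a search is given below.  It assumes
a hypothetical RootHandler class and uses a case-insensitive
substring match as a stand-in for the root handler's own search
functionality, which the disclosure does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class SimplePart:
    """Hypothetical part; only full_text() is visible to the root handler."""
    name: str
    text: str

    def full_text(self) -> str:
        return self.text


@dataclass
class RootHandler:
    """Hypothetical root part type handler."""
    parts: list = field(default_factory=list)

    def search(self, query: str) -> list[str]:
        """Return the names of immediately contained parts that match query."""
        needle = query.casefold()
        # Assumed matcher: plain substring test over each part's full text.
        return [p.name for p in self.parts
                if needle in p.full_text().casefold()]


root = RootHandler([
    SimplePart("memo", "Quarterly budget review"),
    SimplePart("clip", "Budget meeting video"),
])
matches = root.search("budget")
```

      The same matcher is applied to every part, which is what makes
the search behave consistently across the whole document.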

      For any part type that is a container, the implementation of
the full text method would include "rolling up" the text of all
contained parts, obtained by those parts' full text methods.
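
      The roll-up can be sketched as a simple recursion.  The class
names, the own_text field and the newline separator are assumptions
made for this example only.

```python
class Leaf:
    """Hypothetical non-container part."""

    def __init__(self, text: str) -> None:
        self.text = text

    def full_text(self) -> str:
        return self.text


class ContainerPart:
    """Hypothetical container part that rolls up its children's text."""

    def __init__(self, own_text: str = "", children=None) -> None:
        self.own_text = own_text
        self.children = list(children or [])

    def full_text(self) -> str:
        # Start with the container's own text, then append each child's
        # text; a child that is itself a container recurses the same way.
        pieces = [self.own_text]
        pieces.extend(child.full_text() for child in self.children)
        # Parts with no text contribute nothing to the result.
        return "\n".join(piece for piece in pieces if piece)


table = ContainerPart("Table title", [Leaf("cell one"), Leaf("")])
document = ContainerPart("", [Leaf("intro"), table])
rolled_up = document.full_text()
```

      Because each container rolls up its own contents, the root
only ever has to call its immediately contained parts.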

      The performance of this approach is improved by having the root
part type handler retain the full text of each contained part in an
indexed and compressed form after the search is performed.  Then, on
a subsequent search of the document, only those parts that have been
updated since the previous search must be again interrogat...
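
      One possible shape for this cache is sketched below.  It
assumes each part exposes a revision counter that changes on every
update (the disclosure does not say how updates are detected), uses
a dictionary keyed by part name as the index, and uses zlib for the
compressed form.  All class and attribute names are hypothetical.

```python
import zlib


class CountingPart:
    """Hypothetical part that records how often it is interrogated."""

    def __init__(self, name: str, text: str) -> None:
        self.name = name
        self.revision = 0     # assumed change marker, bumped on every update
        self.calls = 0        # number of full_text() calls, for illustration
        self._text = text

    def update(self, text: str) -> None:
        self._text = text
        self.revision += 1

    def full_text(self) -> str:
        self.calls += 1
        return self._text


class CachingRoot:
    """Hypothetical root handler that retains compressed part text."""

    def __init__(self, parts) -> None:
        self.parts = list(parts)
        self._cache = {}      # part name -> (revision, compressed text)

    def _text_of(self, part) -> str:
        entry = self._cache.get(part.name)
        if entry is not None and entry[0] == part.revision:
            # Unchanged since the previous search: reuse the cached copy.
            return zlib.decompress(entry[1]).decode("utf-8")
        text = part.full_text()   # only new or updated parts reach here
        self._cache[part.name] = (part.revision,
                                  zlib.compress(text.encode("utf-8")))
        return text

    def search(self, query: str) -> list[str]:
        needle = query.casefold()
        return [p.name for p in self.parts
                if needle in self._text_of(p).casefold()]


a = CountingPart("a", "alpha text")
b = CountingPart("b", "beta text")
root = CachingRoot([a, b])
first = root.search("alpha")      # interrogates both parts
b.update("alpha again")
second = root.search("alpha")     # interrogates only the updated part b
```

      After the second search, part a has been interrogated once and
part b twice, since only b changed between the two searches.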