A Multimedia Document Search and Retrieval System

IP.com Disclosure Number: IPCOM000021600D
Original Publication Date: 2004-Jan-26
Included in the Prior Art Database: 2004-Jan-26
Document File: 2 page(s) / 45K

Publishing Venue

IBM

Abstract

This invention describes a system for multimedia and cross-media search and retrieval of documents in a multimedia database. The invention processes a multimedia database into sets of features, which it stores and indexes. Multimedia queries can then be made against the indexed database. The invention takes a multimedia query and converts it into features, which it matches against those in the index database. The process results in a list of documents sorted by their estimated relevance to the query. The recognition engines generate N-best lists for the objects in each multimedia document. These N-best lists are concise statistical representations of the multimedia data. They make the retrieval methods robust to machine recognition errors, allowing retrieval of documents that other retrieval systems would miss.

This is the abbreviated version, containing approximately 51% of the total text.

The system performs two basic functions: indexing the documents in a multimedia database and retrieving indexed multimedia documents by query. The multimedia database may be any repository or set of repositories where media documents are stored. The documents themselves can be identified using URLs or other methods known to those skilled in the art, and may consequently be stored in a local or remote filesystem, database, FTP site, or web server.

Before retrieval is possible, an index must be built, as follows. A user interacts with an indexing client and requests that multimedia documents be indexed. The indexing client is responsible for verifying that the corresponding media types are supported, locating and retrieving the documents from a multimedia database, and passing them to an index builder along with any user-specified indexing preferences. The index builder is responsible for converting the documents it receives into "patterns" (statistical representations), which it then stores in a pattern database. To do so, it passes each document to a multimedia decomposer, which decomposes the document into its constituent "primitive" media elements. The media elements are then passed to their corresponding pattern builders, which generate statistical representations of the media. These representations are called pattern objects. The pattern objects are passed back to the index builder, which adds them to the pattern database.
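The indexing flow described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the disclosed implementation: the names `MediaElement`, `decompose`, `text_pattern_builder`, `PATTERN_BUILDERS`, and `build_index` are hypothetical, the document layout (a dictionary with `url` and `parts` keys) is invented for the example, and relative word frequencies stand in for whatever statistical representation a real pattern builder would produce.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaElement:
    """A primitive media element (hypothetical structure)."""
    media_type: str  # e.g. "text"; the labels are assumptions
    payload: str


def decompose(document):
    """Hypothetical multimedia decomposer: split a document into primitive elements."""
    return [MediaElement(media_type, payload) for media_type, payload in document["parts"]]


def text_pattern_builder(element):
    """Toy pattern builder: relative word frequencies as a stand-in pattern object."""
    words = element.payload.lower().split()
    total = len(words) or 1  # avoid division by zero on empty payloads
    return {word: count / total for word, count in Counter(words).items()}


# One pattern builder per supported primitive media type (assumed registry).
PATTERN_BUILDERS = {"text": text_pattern_builder}


def build_index(documents, pattern_db=None):
    """Index builder: decompose each document, build patterns, store them by URL."""
    pattern_db = {} if pattern_db is None else pattern_db
    for document in documents:
        patterns = []
        for element in decompose(document):
            builder = PATTERN_BUILDERS.get(element.media_type)
            if builder is None:
                continue  # unsupported type; an indexing client would reject it earlier
            patterns.append((element.media_type, builder(element)))
        pattern_db[document["url"]] = patterns
    return pattern_db


docs = [{"url": "file:///a", "parts": [("text", "red red car"), ("video", "...")]}]
index = build_index(docs)
```

In this sketch the pattern database is a plain dictionary keyed by document URL; a persistent store could be substituted without changing the flow.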

Once an index has been built, a user may search the corresponding documents by submitting a multimedia query to a query client. The query client is responsible for constructing queries, validating that the media types in the query are supported, and submitting the query to the query engine. The query engine uses the media decomposer and pattern builders to convert the query into pattern objects, which are then compared to the pattern database entries using a pattern similarity metric. The comparison process results in a relevance score for each indexed document. These scores are then passed to the query client, which retrieves documents from the media database ranked by their relevance to the query.
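The query-side comparison can be illustrated with a short Python sketch. The disclosure does not say which pattern similarity metric is used, so cosine similarity is swapped in here purely as an assumption. The function names, the representation of a pattern object as a dictionary mapping recognition hypotheses to weights (a toy N-best list), and the sample data are all hypothetical.

```python
import math


def cosine_similarity(p, q):
    """Assumed similarity metric between two sparse pattern objects."""
    dot = sum(weight * q.get(key, 0.0) for key, weight in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


def rank_documents(query_patterns, pattern_db):
    """Score every indexed document against the query and sort by relevance."""
    scores = {}
    for doc_id, doc_patterns in pattern_db.items():
        best = 0.0
        for q_type, q_pattern in query_patterns:
            for d_type, d_pattern in doc_patterns:
                if q_type == d_type:  # only compare patterns of the same media type
                    best = max(best, cosine_similarity(q_pattern, d_pattern))
        scores[doc_id] = best
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy pattern database: each pattern keeps several weighted hypotheses.
pattern_db = {
    "doc_a": [("speech", {"cat": 0.6, "cap": 0.3, "cut": 0.1})],
    "doc_b": [("speech", {"dog": 0.7, "dock": 0.3})],
}
query = [("speech", {"cat": 0.9, "cot": 0.1})]
ranked = rank_documents(query, pattern_db)
```

Because each pattern keeps several weighted hypotheses rather than a single recognition result, a document can still receive a nonzero score when the recognizer's top hypothesis is wrong, which is the kind of robustness the abstract attributes to N-best lists.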

A pattern builder is the component of the system that converts specific primitive media elements into sets of pattern objects. Machine learning algorithms can be used to convert the media into statistical representations. Different pattern builders may be designed...