Contextual Search for Multimedia Presentation

IP.com Disclosure Number: IPCOM000111859D
Original Publication Date: 1994-Apr-01
Included in the Prior Art Database: 2005-Mar-26
Document File: 4 page(s) / 120K

Publishing Venue

IBM

Related People

Manthuruthil, GC: AUTHOR [+3]

Abstract

Multimedia presentations can consist of text, video, graphics and audio segments. No mechanism exists which will answer contextual search queries on multimedia documents, which can be a combination of text, video, audio or graphics. These queries may include Boolean AND, OR, and NOT type operators. (Some example queries are included in this document.)


Contextual Search for Multimedia Presentation

      Multimedia presentations can consist of text, video, graphics
and audio segments.  No mechanism exists which will answer contextual
search queries on multimedia documents, which can be a combination of
text, video, audio or graphics.  These queries may include Boolean
AND, OR, and NOT type operators.  (Some example queries are included
in this document.)

      This method seeks to provide, for multimedia documents, a
contextual search capability analogous to that available for textual
documents.  This invention proposes an index mechanism, specifically
based on a bipartite-graph-oriented data structure.  A combination of
1) the data structure and 2) the algorithms which work in conjunction
with this data structure yields contextual search capabilities.
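A minimal sketch of the kind of Boolean query the disclosure describes, assuming (hypothetically; the disclosure gives no implementation) that each index element maps to the set of frame numbers in which it occurs, so that AND, OR and NOT reduce to set operations:

```python
# Hypothetical sketch: Boolean contextual queries over a multimedia index.
# Each index element (a video object, sound sequence, or text item) maps
# to the set of frame numbers where it occurs.
index = {
    "dog":   {3, 7, 12},     # e.g., a video object
    "bark":  {7, 12, 20},    # e.g., a sound sequence
    "title": {1, 7},         # e.g., a text element
}

# Universe of all indexed frames, needed to evaluate NOT.
all_frames = set().union(*index.values())

def query_and(a, b):
    """Frames containing both elements."""
    return index[a] & index[b]

def query_or(a, b):
    """Frames containing either element."""
    return index[a] | index[b]

def query_not(a):
    """Frames not containing the element."""
    return all_frames - index[a]

print(sorted(query_and("dog", "bark")))   # [7, 12]
print(sorted(query_not("title")))         # [3, 12, 20]
```

The element names and frame numbers here are illustrative only; the point is that once the index exists, the Boolean operators mentioned in the abstract become simple set algebra.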

      Presently, no mechanism exists for searching through a
multimedia presentation; only workarounds are available at this time.

1.  Go through the presentation at a different speed (e.g., fast
    forward while viewing the picture on a VCR) until the object
    being searched for is found.

2.  Only search through one component (for example, search through
    the textual component of the multimedia presentation by
    specifying a particular word in a sentence).

      An efficient mechanism is needed to search in the various
dimensions (i.e., text, audio, video and graphics) of a multimedia
presentation.

      The author of the multimedia document is - like any other
author of regular textual documents - expected to identify the
entities within the document which deserve to be indexed.  This may
be accomplished by playing back the document and, while in playback
mode, identifying the 'squares' on the screen which are interesting
and deserve to be indexed.  The software running in the background
captures these 'index elements' together with the corresponding frame
numbers and the locations within the frames where they occur (which
additionally, and indirectly, identifies the time at which they
occurred).
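The capture step described above might be sketched as follows (all names are hypothetical; the disclosure does not specify an implementation): the playback tool calls into a background recorder each time the author marks a region, and the recorder stores the element together with its frame number and on-screen location.

```python
# Hypothetical sketch of the background capture step: during playback,
# the author marks interesting regions ('squares') on screen, and the
# software records each index element with its frame number and its
# location within the frame.
from dataclasses import dataclass, field

@dataclass
class IndexEntry:
    element: str        # label the author assigns to the marked region
    frame: int          # frame number at which it was marked
    location: tuple     # (x, y, width, height) of the 'square'

@dataclass
class MultimediaIndex:
    entries: list = field(default_factory=list)

    def mark(self, element, frame, location):
        """Called by the playback tool when the author marks a region."""
        self.entries.append(IndexEntry(element, frame, location))

    def frames_for(self, element):
        """Frames where the element occurs (indirectly, the times)."""
        return sorted(e.frame for e in self.entries
                      if e.element == element)

idx = MultimediaIndex()
idx.mark("logo", frame=120, location=(10, 10, 64, 64))
idx.mark("logo", frame=480, location=(12, 10, 64, 64))
print(idx.frames_for("logo"))   # [120, 480]
```

For audio, the same `mark` call could record a sound sequence against the frame(s) or instant in time at which it occurs, as the next paragraph of the disclosure describes.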

      For the audio portion of the document:  The author may
similarly identify the 'sound sequences' which are interesting and
which deserve to be indexed.  The background software can similarly
capture the instant in time or the frame(s) in which these occur.

      For the textual portion of the document: A method essentially
the same as described above can be used.

      The index is attached to the presentation.  The objects in the
index are selected based on the content of each multimedia
presentation.  In these respects the index is no different from a
textual index.

      For the actual layout of the index, please refer to the
attached Figure.  The index consists of a bipartite graph, a
well-studied and well-researched type of graph for which several
algorithms already exist.  Each link in the graph is BIDIRECTIONAL.
On specifying an element ('e'), the correspon...
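The text is truncated here, but the bidirectional-link structure it describes can be sketched as follows (a hypothetical illustration, not the disclosure's own implementation): one vertex set holds index elements, the other holds frames, and each edge is stored on both sides so a lookup can start from either an element or a frame.

```python
# Hypothetical sketch of the bipartite index: elements on one side,
# frames on the other, with every link stored in BOTH directions.
from collections import defaultdict

class BipartiteIndex:
    def __init__(self):
        self.element_to_frames = defaultdict(set)
        self.frame_to_elements = defaultdict(set)

    def link(self, element, frame):
        # Bidirectional: record the edge on both sides of the graph.
        self.element_to_frames[element].add(frame)
        self.frame_to_elements[frame].add(element)

    def frames(self, element):
        """On specifying an element 'e', return its corresponding frames."""
        return self.element_to_frames[element]

    def elements(self, frame):
        """On specifying a frame, return the elements occurring in it."""
        return self.frame_to_elements[frame]

g = BipartiteIndex()
g.link("dog", 7)
g.link("bark", 7)
g.link("dog", 12)
print(sorted(g.frames("dog")))   # [7, 12]
print(sorted(g.elements(7)))     # ['bark', 'dog']
```

Because each link is bidirectional, the same structure answers both "in which frames does this element occur?" and "which elements occur in this frame?", which is what makes the Boolean query algorithms workable in either direction.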