Maintaining Descriptive Metadata with Integrity

IP.com Disclosure Number: IPCOM000016274D
Original Publication Date: 2002-Sep-16
Included in the Prior Art Database: 2003-Jun-21
Document File: 3 page(s) / 114K

Publishing Venue

IBM

This is the abbreviated version, containing approximately 53% of the total text.

We can distinguish two types of metadata: formal and descriptive. Formal metadata records the originator, creation date, and other durable information related to the provenance of the item to which the metadata relates. Descriptive metadata records classification information including, but not limited to, keywords that portray the content of the item. Other forms of metadata might relate to the circumstances in which the item was accessed. Such metadata can be data mined.

    This article is concerned with descriptive metadata. Such metadata is important for multimedia documents, which comprise one or more of the following media: text, image, sound, and video. Descriptive metadata is particularly useful in the search and retrieval of multimedia documents. For the purposes of illustration, the discussion will concentrate on keyword metadata in search and retrieval, but it should be understood that the technology proposed is relevant to, and useful for, other forms of metadata.

    It is important to allow descriptive metadata to evolve, for example to encompass additional contexts of use for the documents, where those contexts were not originally envisaged or where additional attributes can refine the metadata. Thus, additional keywords can be added to those already associated with an image as it becomes apparent that the image depicts a concept not included when the image was originally tagged.

    One mechanism for acquiring update information is relevance feedback. Users of the multimedia documents can provide the retrieval system with feedback about how valuable individual documents are in the user's context of enquiry. To be useful, this feedback should update the descriptive metadata. It has been suggested [1] that images receiving positive feedback can have the search keywords automatically added to the set associated with each image. Such semi-automatic annotation carries the risk that the integrity of the metadata will become compromised, for example by introducing bias. A safer approach is to accumulate potential updates for collation and analysis prior to completing the update. This article describes an approach that enables an update policy to be defined that maintains metadata integrity.
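    The difference between immediate annotation and accumulation for later analysis can be illustrated with a minimal sketch. It is not the mechanism of this disclosure: the class name, the method names, and the acceptance rule (a keyword must be proposed by a minimum number of distinct users) are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical policy parameter: how many distinct users must propose a
# keyword before it qualifies. The source text does not specify any rule.
MIN_DISTINCT_USERS = 3


class CandidateStore:
    """Holds proposed keywords apart from the committed metadata."""

    def __init__(self):
        # image_id -> keyword -> set of user ids that proposed the keyword
        self._pending = defaultdict(lambda: defaultdict(set))

    def record_feedback(self, image_id, user_id, search_keywords):
        """Accumulate the keywords of a search that drew positive feedback."""
        for keyword in search_keywords:
            self._pending[image_id][keyword.lower()].add(user_id)

    def collate(self, image_id, committed_keywords):
        """Return keywords that satisfy the policy and are not yet committed."""
        committed = {k.lower() for k in committed_keywords}
        pending = self._pending.get(image_id, {})
        return sorted(
            keyword
            for keyword, users in pending.items()
            if len(users) >= MIN_DISTINCT_USERS and keyword not in committed
        )


store = CandidateStore()
for user in ("u1", "u2", "u3"):
    store.record_feedback("img-7", user, ["harbour", "sunset"])
store.record_feedback("img-7", "u1", ["boat"])

# "sunset" is already committed and "boat" has a single proposer,
# so only "harbour" qualifies under the assumed rule.
print(store.collate("img-7", committed_keywords=["sunset"]))  # ['harbour']
```

    Nothing is written to the committed keyword set until the candidates have been collated, so a single user's feedback cannot bias the metadata on its own.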

    The approach requires the owner or curator of the multimedia document to specify the following additional information, to be stored with the document metadata:

    - Intermediary locations for candidate updates
    - Format template for update information
    - Update mechanism
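    One way to picture the three items stored with the document metadata is as a small record. The sketch below is an assumption-laden illustration only: the field types, the example LDAP URL, the template layout (a tuple of required field names), and the mechanism label are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class UpdatePolicy:
    """Curator-specified update information kept alongside document metadata."""

    # Where candidate updates wait pending analysis (example value is hypothetical).
    intermediary_locations: List[str]
    # Field names a candidate update must supply; this layout is an assumption.
    format_template: Tuple[str, ...]
    # Label for how accepted updates are applied; the value is an assumption.
    update_mechanism: str


def conforms(candidate: dict, policy: UpdatePolicy) -> bool:
    """Return True if the candidate supplies every field named by the template."""
    return all(name in candidate for name in policy.format_template)


policy = UpdatePolicy(
    intermediary_locations=["ldap://candidates.example.org"],
    format_template=("keyword", "submitted_by"),
    update_mechanism="curator-review",
)

print(conforms({"keyword": "harbour", "submitted_by": "u1"}, policy))  # True
print(conforms({"keyword": "harbour"}, policy))                        # False
```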

    The intermediary location describes where the information for a potential update is stored pending further analysis. One or more LDAP servers could be used for this purpose. The format template can be as specific as the analysis requires. The template could allow for a freeform...