Method of Keyword Categorization

IP.com Disclosure Number: IPCOM000118980D
Original Publication Date: 1997-Oct-01
Included in the Prior Art Database: 2005-Apr-01
Document File: 2 page(s) / 59K

Publishing Venue

IBM

Related People

Nasukawa, T: AUTHOR

Abstract

Disclosed is a method for categorizing keywords in databases so that they can be treated as meaningful entities rather than mere character strings, for use in advanced information retrieval systems. To categorize an ambiguous keyword that may belong to more than one category, the method refers to the categories of the other keywords associated with the same data.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 53% of the total text.

      A keyword is assigned to a category if it is listed in the
dictionary for that category.  However, some keywords belong to more
than one category and require disambiguation.  For example, the word
"Washington" may be categorized either as a human or as a place,
depending on its context.  In this method, the context used to
categorize an ambiguous keyword is the set of other keywords
associated with the same data in the database, together with their
categories.  The method can therefore handle databases in which
keywords were not extracted from texts or in which the original texts
are not accessible.  It also gains processing speed by avoiding
complex natural language processing.
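
      The dictionary lookup described above can be illustrated with
a minimal Python sketch.  The dictionary contents, the function names
and the sample record are assumptions made for illustration only; the
disclosure does not specify them.  The sketch covers only the lookup
of candidate categories and the test for ambiguity, not the
disambiguation itself.

```python
from typing import Dict, List, Set

# Hypothetical category dictionaries (contents are illustrative only;
# the disclosure does not list the actual dictionary entries).
CATEGORY_DICTIONARIES: Dict[str, Set[str]] = {
    "human": {"washington", "george washington"},
    "place": {"washington", "seattle"},
}


def possible_categories(keyword: str) -> List[str]:
    """Return every category whose dictionary lists the keyword."""
    key = keyword.lower()
    return sorted(
        category
        for category, words in CATEGORY_DICTIONARIES.items()
        if key in words
    )


def attach_categories(keywords: List[str]) -> Dict[str, List[str]]:
    """Attach candidate categories to each keyword of one data record."""
    return {keyword: possible_categories(keyword) for keyword in keywords}


def is_ambiguous(categories: List[str]) -> bool:
    """A keyword is ambiguous if more than one category is attached."""
    return len(categories) > 1


if __name__ == "__main__":
    # Keywords associated with the same (hypothetical) data record.
    record_keywords = ["Washington", "Seattle", "George Washington"]
    for keyword, categories in attach_categories(record_keywords).items():
        status = "ambiguous" if is_ambiguous(categories) else "unambiguous"
        print(f"{keyword}: {categories} ({status})")
```

      In this sketch "Washington" is listed in two dictionaries and
is therefore flagged as ambiguous, while the other keywords of the
same record each receive a single category and can serve as context.
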

This method consists of three steps:
  1.  Attach the possible categories to each keyword associated with
       the same data in a database.  If only one category is
       attached to a keyword, classify the keyword as an
       unambiguous keyword.
  2.  Add preference values to each category of each ambiguous
       keyword (a keyword to which more than one category is
       attached) in the following manner:
      o  If an ambiguous keyword is matched with an element word
          or phrase of an unambiguous compound keyword, add a
          preference value calculated by the following formula to
          the category attached to the unambi...