Resolution of Word-Sense Ambiguity by Example Sentences

IP.com Disclosure Number: IPCOM000101698D
Original Publication Date: 1990-Aug-01
Included in the Prior Art Database: 2005-Mar-16
Document File: 3 page(s) / 90K

Publishing Venue

IBM

Related People

Tsutsumi, T: AUTHOR

Abstract

Disclosed is a method which automatically selects appropriate word-senses of a subject, a verb, and an object in a sentence by using sample sentences which have already been manually disambiguated and by using synonym and taxonym hierarchy data.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      This method provides more precise and robust disambiguation
than conventional methods.  The text database of sample sentences
is easy to create and maintain, and the method does not need the
semantic categorizations that a conventional knowledge base
requires.

      The figure above shows the elements and the process of this
method.  Process (A) and data (1), (4), and (5) are existing
technologies.

      In the figure, the input is first analyzed by an existing
language analysis method (process (A)).  If the input includes
ambiguous (multi-sense) words, the analysis yields more than one
interpretation (process (B)) by referring to the table of
word-sense numbers to be handled (data (2)).  After related sample
sentences are extracted from the text database (data (3)), the
plausibility of each interpretation is calculated (process (C)) by
using the sample sentences and data (4) and (5).  Selecting the
interpretation with the highest plausibility score amounts to
selecting the most plausible word-senses of the ambiguous words.
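
      Processes (B) and (C) can be sketched roughly as follows in
Python.  This is only an illustration under stated assumptions: the
sense labels, the sample triples, the synonym and taxonym entries,
the 3/2/1 similarity weights, and the rule of scoring an
interpretation by its best-matching sample sentence are all
hypothetical.  The abbreviated text does not give the actual
plausibility calculation.

```python
from itertools import product

# Hypothetical table of word-sense numbers (data (2)): word -> candidate senses.
SENSES = {
    "bank": ["bank#1", "bank#2"],        # 1: financial firm, 2: river side
    "charge": ["charge#1", "charge#2"],  # 1: bill, 2: attack
    "fee": ["fee#1"],
}

# Hypothetical text database (data (3)): manually disambiguated
# (subject, verb, object) triples taken from sample sentences.
SAMPLES = [
    ("company#1", "charge#1", "price#1"),
    ("army#1", "charge#2", "fort#1"),
]

# Hypothetical synonym data: each set groups senses treated as synonyms.
SYNONYMS = [{"fee#1", "price#1"}]

# Hypothetical taxonym hierarchy: sense -> its more general sense.
PARENT = {
    "bank#1": "company#1",
    "company#1": "organization#1",
    "bank#2": "land#1",
}


def ancestors(sense):
    """Return every more general sense above `sense` in the hierarchy."""
    found = []
    while sense in PARENT:
        sense = PARENT[sense]
        found.append(sense)
    return found


def similarity(a, b):
    """Assumed weights: identical 3, synonym 2, taxonym-related 1, else 0."""
    if a == b:
        return 3
    if any(a in group and b in group for group in SYNONYMS):
        return 2
    if a in ancestors(b) or b in ancestors(a):
        return 1
    return 0


def interpretations(words):
    """Process (B): every combination of candidate senses for the words."""
    return product(*(SENSES[w] for w in words))


def plausibility(interp):
    """Process (C), assumed form: score of the best-matching sample triple."""
    return max(
        sum(similarity(a, b) for a, b in zip(interp, sample))
        for sample in SAMPLES
    )


def disambiguate(words):
    """Select the interpretation with the highest plausibility score."""
    return max(interpretations(words), key=plausibility)


best = disambiguate(("bank", "charge", "fee"))
print(best, plausibility(best))
```

      With this toy data, the four interpretations of "bank charge
fee" are compared against both sample triples, and the financial
senses are selected because the first sample matches the verb
exactly, the object through a synonym, and the subject through the
taxonym hierarchy.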

      The text database (data (3)) contains disambiguated canonical
forms (predicate-argument structures) of sample sentences that are
extracted from texts.  A canonical form, which...