Algorithms for Disambiguating Word Senses in Thesauri

IP.com Disclosure Number: IPCOM000109686D
Original Publication Date: 1992-Sep-01
Included in the Prior Art Database: 2005-Mar-24
Document File: 2 page(s) / 75K

Publishing Venue

IBM

Related People

Chodorow, MS: AUTHOR [+4]

Abstract

Two algorithms are disclosed that identify the intended sense of each occurrence of a word in an alphabetical thesaurus. The intended sense is marked as a numerical index, corresponding to one of the senses specified for that word in its thesaurus entry. As a result, multi-sense synonyms are disambiguated, and the synonymy relation is shown to hold between word-senses, rather than between (ambiguous) words.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Algorithms for Disambiguating Word Senses in Thesauri

      Dictionary-style thesauri consist of alphabetically arranged
discrete entries.  Each entry consists of a headword (W), separated
into senses (W1, W2, ...), and each sense is followed by a list of
its synonyms (W1: A, B, C, ...).

      The synonyms that appear in the list are not indexed for their
intended sense.  Thus, if A is a synonym of W1, and A is also a
headword with 2 senses (A1 and A2), it is not known which of these
two senses is intended as the synonym of W1.  The two algorithms
offered here automatically index synonyms for their intended senses.
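
The entry structure and the ambiguity described above can be pictured with a small data model. The following Python sketch is only an illustration: the `Thesaurus` type, the `candidate_senses` helper, and the sample synonyms B, C, D and E are assumptions introduced here, not part of the disclosure.

```python
from typing import Dict, List

# Hypothetical representation: headword -> list of senses, where each
# sense is the list of its (still unindexed) synonyms.
# Sense numbers are 1-based, so sense k is stored at list index k - 1.
Thesaurus = Dict[str, List[List[str]]]

thesaurus: Thesaurus = {
    "W": [["A", "B"], ["C"]],   # W1: A, B    W2: C
    "A": [["D"], ["W", "E"]],   # A1: D       A2: W, E
}


def candidate_senses(thesaurus: Thesaurus, headword: str) -> List[int]:
    """Return the sense numbers defined for a headword (empty if absent)."""
    return list(range(1, len(thesaurus.get(headword, [])) + 1))


# "A" is listed under W1 without a sense index, so the entry alone
# does not say whether A1 or A2 is the intended synonym.
print(candidate_senses(thesaurus, "A"))  # prints [1, 2]
```

In this model, disambiguation amounts to choosing one of the candidate sense numbers for each synonym occurrence.
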

SYMMETRY ALGORITHM

1.  Look up headword W.
2.  Find all the senses of headword W that have A as a synonym.
3.  Look up headword A.
4.  Find all the senses of headword A that have W as a synonym.
5.  If A is a synonym of a single W sense (Wj), index each occurrence
    of W in the entry for headword A with the sense number of that
    sense (j).
6.  If W is a synonym of a single A sense (Ai), index each occurrence
    of A in the entry for headword W with the sense number of that
    sense (i).

BEFORE INDEXING               AFTER INDEXING
W                             W
  W1: ... A ...                 W1: ... A2 ...
  .                             .
  .                             .
  Wn: ...                       Wn: ...
A                             A
  A1: ...                       A1: ...
  A2: ... W ...                 A2: ... W1 ...
  .                ...
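
The six steps of the symmetry algorithm can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the dictionary-of-lists representation, the names `senses_listing` and `symmetry_index`, the choice to record results in a separate index rather than rewriting entries, and the sample synonyms B, C, D and E are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: headword -> list of senses (1-based),
# each sense being the list of its unindexed synonyms.
Thesaurus = Dict[str, List[List[str]]]
# (headword, sense number, synonym) -> sense number assigned to the synonym
Index = Dict[Tuple[str, int, str], int]


def senses_listing(thesaurus: Thesaurus, headword: str, synonym: str) -> List[int]:
    """Sense numbers of `headword` whose synonym list contains `synonym`."""
    senses = thesaurus.get(headword, [])
    return [n for n, syns in enumerate(senses, start=1) if synonym in syns]


def symmetry_index(thesaurus: Thesaurus) -> Index:
    index: Index = {}
    for w in thesaurus:                                   # step 1
        for a in {s for syns in thesaurus[w] for s in syns}:
            if a not in thesaurus:                        # step 3: A must be a headword
                continue
            w_senses = senses_listing(thesaurus, w, a)    # step 2
            a_senses = senses_listing(thesaurus, a, w)    # step 4
            if len(w_senses) == 1:                        # step 5
                for i in a_senses:                        # each W in A's entry
                    index[(a, i, w)] = w_senses[0]
            if len(a_senses) == 1:                        # step 6
                for j in w_senses:                        # each A in W's entry
                    index[(w, j, a)] = a_senses[0]
    return index


sample: Thesaurus = {
    "W": [["A", "B"], ["C"]],   # W1: A, B    W2: C
    "A": [["D"], ["W", "E"]],   # A1: D       A2: W, E
}
index = symmetry_index(sample)
print(index[("W", 1, "A")])  # prints 2: the A under W1 is indexed as A2
print(index[("A", 2, "W")])  # prints 1: the W under A2 is indexed as W1
```

With this sample, the result corresponds to the before/after illustration: A under W1 becomes A2, and W under A2 becomes W1. When A does not list W in any of its senses, the sketch leaves the occurrence unindexed.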