Two-Pass Lexical Ambiguity Resolution

IP.com Disclosure Number: IPCOM000122407D
Original Publication Date: 1991-Dec-01
Included in the Prior Art Database: 2005-Apr-04
Document File: 5 page(s) / 167K

Publishing Venue

IBM

Related People

Nomiyama, H: AUTHOR [+2]

Abstract

Disclosed is a mechanism for resolving lexical ambiguities by using lexical information acquired from the results of two-pass processing of a complete document.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 43% of the total text.

      In Japanese, there are two types of lexical ambiguity:
(1) Word Identification
  o Word Segmentation
                               (A letter and a card)        Ex. 1
                               (A letter is a kid.)         Ex. 2
  o Part of Speech
                                            (TSUKURINASU)  Ex. 3
                                            (SAKUSEISURU)  Ex. 4
(2) Word Interpretation
  o
                   HUM (subcontractor)                  Ex. 5
                   ACT (subcontracting)                 Ex. 6
  o
                          handle
                          steering

      The proposed mechanism resolves such ambiguities by processing
a document twice to obtain lexical constraints and preferences.

      Natural language processing systems that analyze sentences
semantically have three layers: (1) morphological analysis, (2)
syntactic analysis, and (3) semantic analysis. Lexical ambiguities in
word identification arise during morphological analysis.
Lexical ambiguities in word interpretation arise during syntactic
or semantic analysis.

      The following sections explain how each of these two types of
ambiguity is resolved.
 (1) Word Identification

      In examples 1 and 2, the ambiguity arises from the rule that
the function word "  " can be connected before the function word
"  ". In the following example, however, there is no ambiguity, even
though the same word "      " appears in the sentence and the same
rules apply to the word "      " in both cases.
                                               Ex. 7

      This means that the occurrence of ambiguities depends on the
context. If the word "      " occurs more than once and at least one
of its occurrences is unambiguous, then the word "      " is
considered preferable to the word "      ".

      Similarly, if a document contains the sentence shown below,
then "SAKUSEISURU" is considered preferable to "TSUKURINASU."
                                (SAKUSEISURU)  Ex. 8
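
      The preference rule described above can be sketched in Python
as follows. This is only a minimal illustration under stated
assumptions, not the disclosed implementation: the data layout (each
sentence as a list of candidate analyses, each analysis as a tuple of
words), the function names collect_preferences and resolve, and the
toy tokens "report", "program", and "o" are all hypothetical. Only
the readings SAKUSEISURU and TSUKURINASU come from the examples.

```python
from collections import Counter


def collect_preferences(document):
    """Pass 1: count the words that appear in unambiguous sentences."""
    preferred = Counter()
    for candidates in document:
        if len(candidates) == 1:  # a single analysis means no ambiguity
            preferred.update(candidates[0])
    return preferred


def resolve(document, preferred):
    """Pass 2: keep, for each sentence, the analysis with the highest preference score."""
    resolved = []
    for candidates in document:
        # Score = sum of pass-1 counts of the words in the analysis.
        # On a tie, max() keeps the first candidate.
        best = max(candidates, key=lambda analysis: sum(preferred[w] for w in analysis))
        resolved.append(best)
    return resolved


# Toy document (hypothetical tokens). The first sentence has two candidate
# analyses; the second is unambiguous, in the spirit of Ex. 8.
document = [
    [("report", "o", "TSUKURINASU"), ("report", "o", "SAKUSEISURU")],
    [("program", "o", "SAKUSEISURU")],
]

preferred = collect_preferences(document)
resolved = resolve(document, preferred)
```

      In this sketch the unambiguous second sentence makes
SAKUSEISURU a preferred word, so the second pass selects the
SAKUSEISURU analysis for the first sentence as well. Summing counts
is one possible scoring choice; the disclosure does not specify one.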
 (2) Word Interpretation

      Generally, words that play important roles in the document
(keywords) occur many times and their meanings are constant [1]. On
the other hand, words that are not specific to the context (general
words) also occur, but their meanings depend on the context.
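
      The distinction between keywords and general words can be
illustrated with a short Python sketch. Because this text is
abbreviated, the sketch is an assumption about how the observation
might be used, not the disclosed procedure: the names observe_senses
and apply_senses, the min_count threshold, and the placeholder words
"word_A" and "word_B" are hypothetical. The sense labels are borrowed
from the examples above.

```python
from collections import Counter, defaultdict


def observe_senses(occurrences):
    """Pass 1: record the sense of every occurrence that has exactly one candidate."""
    observed = defaultdict(set)
    for word, senses in occurrences:
        if len(senses) == 1:
            observed[word] |= senses
    return observed


def apply_senses(occurrences, observed, min_count=2):
    """Pass 2: reuse the observed sense for words that behave like keywords."""
    counts = Counter(word for word, _ in occurrences)
    result = []
    for word, senses in occurrences:
        known = observed.get(word, set())
        # Treat a word as a keyword (an assumption of this sketch) if it is
        # frequent and pass 1 saw exactly one sense for it.
        if len(senses) > 1 and counts[word] >= min_count and len(known) == 1:
            senses = (known & senses) or senses
        result.append((word, senses))
    return result


# Toy input: (word, candidate senses) pairs with hypothetical placeholder words.
occurrences = [
    ("word_A", {"HUM"}),
    ("word_A", {"HUM", "ACT"}),
    ("word_B", {"handle"}),
    ("word_B", {"steering"}),
    ("word_B", {"handle", "steering"}),
]

observed = observe_senses(occurrences)
result = apply_senses(occurrences, observed)
```

      Here "word_A" has a single observed sense, so its ambiguous
occurrence is narrowed to HUM. "word_B" was seen with two different
senses, so it is treated as a general word and left for
context-dependent resolution.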

      If the m...