Parsing Algorithm for Japanese Text Analysis

IP.com Disclosure Number: IPCOM000112472D
Original Publication Date: 1994-May-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 4 page(s) / 124K

Publishing Venue

IBM

Related People

Saito, T: AUTHOR [+2]

Abstract

This article describes a parsing algorithm that can be used for the pre-analysis of machine translation or the prosody control of a Japanese text-to-speech system.

This is the abbreviated version, containing approximately 48% of the total text.

      Background - In Japanese, it is difficult to analyze the
syntactic structure of a sentence completely without considering
meaning and context.  Using all the available information on meaning
and context, however, is too costly.  Proposed here is a parsing
algorithm based on pattern matching of modificational relations.  It
is intended for the pre-selection of syntactic candidates in machine
translation and for the prosody control of Japanese text-to-speech,
and it relies on the rules of modificational relation in Japanese
rather than on meaning and context.

      Principles and tendencies of modificational relation in
Japanese - In Japanese, there are three principles governing
modificational relations among phrases.

o   Prin.  1 - A phrase modifies only one following phrase.

o   Prin.  2 - A phrase is modified only by previous ones.

o   Prin.  3 - Two modificational relations do not cross.
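
The three principles can be read as a well-formedness check on a
candidate analysis.  The following is a minimal sketch, not part of
the original disclosure.  It assumes a hypothetical encoding in which
heads[i] is the index of the phrase that phrase i modifies, and it
assumes that only the final phrase of the sentence has no head.

```python
from typing import List, Optional


def satisfies_principles(heads: List[Optional[int]]) -> bool:
    """Check a head assignment against Prin. 1, Prin. 2, and Prin. 3.

    heads[i] is the index of the phrase modified by phrase i, or None.
    This encoding is an illustrative assumption, not the source's.
    """
    n = len(heads)
    for i, h in enumerate(heads):
        if h is None:
            # Assumption: only the last phrase may modify nothing.
            if i != n - 1:
                return False
            continue
        # Prin. 1: one head per phrase (guaranteed by the encoding).
        # Prin. 1 and 2: the head must be a following phrase.
        if not (i < h < n):
            return False
    # Prin. 3: relations (i, heads[i]) and (j, heads[j]) with i < j
    # cross when j < heads[i] < heads[j].
    for i in range(n):
        for j in range(i + 1, n):
            hi, hj = heads[i], heads[j]
            if hi is None or hj is None:
                continue
            if j < hi < hj:
                return False
    return True
```

For four phrases, the assignment [2, 3, 3, None] is rejected because
the relation from phrase 0 to phrase 2 crosses the relation from
phrase 1 to phrase 3.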

      By analyzing modificational relations, even without using
information on meaning and context, we can find the relations to
which these principles do not apply, namely, those involving
ambiguity (in which Prin.  1 and Prin.  2 do not apply) and those
involving inconsistency (in which Prin.  3 does not apply).  To
exclude relations that involve ambiguity or inconsistency, we propose
two tendencies of modificational relations among phrases.

<Tend.1> If the modificational relations are analyzed by using only
grammatical information, the reliability of a modificational relation
weakens in inverse proportion to the number of modifier relations.
Fig. 1 shows an example of the reliability of modificational
relations.  The figure shows two cases of modificational relations
among three phrases.  Phrase A modifies phrase B in both case 1 and
case 2 (X1, X2).  In case 1, however, phrase A modifies phrase C
grammatically (Y1).  In these relations, the reliability of
modification X1 is stronger than that of modification X2.

<Tend.2> In Japanese, a phrase is likely to modify the nearer phrase.

The proposed parsing algorithm is based on these principles and
tendencies.
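
As an illustration of Tend.2 only, the sketch below picks, for each
phrase, the nearest following phrase among its grammatical
candidates.  The function name and the dictionary of candidates are
hypothetical, and this greedy choice is not the mapping step of the
disclosed method.

```python
from typing import Dict, List, Optional


def nearest_candidate(
    candidates: Dict[int, List[int]], n_phrases: int
) -> List[Optional[int]]:
    """Choose the nearest following candidate for each phrase.

    candidates[i] lists the phrases that phrase i could modify on
    grammatical grounds (a hypothetical input format).
    """
    heads: List[Optional[int]] = []
    for i in range(n_phrases):
        # Keep only candidates that follow phrase i (Prin. 1 and 2).
        following = [j for j in candidates.get(i, []) if j > i]
        # Tend.2: prefer the nearer phrase.
        heads.append(min(following) if following else None)
    return heads
```

Choosing each head independently in this way does not guarantee that
Prin.  3 holds, so the result would still need a crossing check.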

      Proposed Method - The parsing algorithm is used for
disambiguation of the syntactic structure of a sentence.  In this
method, there are two stages of analysis.  First, the modificational
relation pattern of the sentence is analyzed.  Next, the
modificational relation pattern is mapped to one that has less
ambiguity and inconsistency.
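
The first stage can be sketched as follows, assuming that analyzing
the pattern means testing each phrase against every following phrase
with a purely grammatical predicate.  The predicate can_modify and
both function names are hypothetical.  The second stage is not
sketched, since this text does not give its details.

```python
from typing import Callable, List, Sequence


def relation_pattern(
    phrases: Sequence[str],
    can_modify: Callable[[str, str], bool],
) -> List[List[bool]]:
    """Build an n-by-n table of grammatically possible relations.

    pattern[i][j] is True when phrase i may modify phrase j.  Only
    j > i is considered, since a phrase modifies a following phrase.
    """
    n = len(phrases)
    return [
        [j > i and can_modify(phrases[i], phrases[j]) for j in range(n)]
        for i in range(n)
    ]


def ambiguous_phrases(pattern: List[List[bool]]) -> List[int]:
    """Return the phrases that have more than one possible head."""
    return [i for i, row in enumerate(pattern) if sum(row) > 1]
```

With three phrases A, B, and C, where A may grammatically modify
both B and C, ambiguous_phrases reports phrase A (index 0) as the one
that needs disambiguation.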

1.  Analysis of the modificational relation pattern.  In the proposed
    parsing algorithm, the modificational relations of all pairs of
    phrases are analyzed.  For example, in the case of a sentence
    t...