Method for Machine Translation by Generalizing Translation Examples

IP.com Disclosure Number: IPCOM000112893D
Original Publication Date: 1994-Jun-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 4 page(s) / 97K

Publishing Venue

IBM

Related People

Watanabe, H: AUTHOR

Abstract

This article describes a method of translation by generalizing translation examples.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 53% of the total text.

Data Structure

We assume that a translation example (TE) is given as follows:

        TE = <S, T, M>
where S is a source sentence, T is a target sentence, and M is a
correspondence between S and T.  Precisely, S and T are each a
sequence of terms, and M is a list of index pairs, each linking a
source term position to a target term position.  A term has the
following form:

        term = [{type:}elements]
where type is optional and names a category of the elements, and each
element is a word or the id of a translation example.  We assume that
type hierarchies for the source and target languages are given.  An
example is shown below:

      S: [watashi]  [ga]  [kuruma]  [wo]  [katta]
      T: [I]  [bought]  [a]  [car]
      M: 1-1, 3-4, 5-2
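
The structure above can be sketched in Python as follows.  This is a
minimal illustration, not the article's implementation: the class
names Term and TranslationExample and the helpers words, label, and
aligned_pairs are assumptions introduced here.  The mapping is stored
as 1-based (source, target) index pairs, as in the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Term:
    """One bracketed term: an optional type plus its elements."""
    elements: FrozenSet[str]      # words, or ids of other translation examples
    type: Optional[str] = None    # optional category of the elements


@dataclass(frozen=True)
class TranslationExample:
    """TE = <S, T, M> as described in the text."""
    source: Tuple[Term, ...]               # S
    target: Tuple[Term, ...]               # T
    mapping: Tuple[Tuple[int, int], ...]   # M: 1-based (source, target) pairs


def words(*tokens: str) -> Tuple[Term, ...]:
    """Hypothetical helper: build untyped single-word terms."""
    return tuple(Term(frozenset([tok])) for tok in tokens)


def label(term: Term) -> str:
    """Render a term's elements as a readable string."""
    return ",".join(sorted(term.elements))


def aligned_pairs(te: TranslationExample) -> List[Tuple[str, str]]:
    """List the (source, target) terms linked by the mapping M."""
    return [(label(te.source[s - 1]), label(te.target[t - 1]))
            for s, t in te.mapping]


te = TranslationExample(
    source=words("watashi", "ga", "kuruma", "wo", "katta"),
    target=words("I", "bought", "a", "car"),
    mapping=((1, 1), (3, 4), (5, 2)),
)
print(aligned_pairs(te))
# [('watashi', 'I'), ('kuruma', 'car'), ('katta', 'bought')]
```

Reading the mapping this way shows that the particles [ga] and [wo]
and the article [a] have no counterpart in M.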

Simple Generalization

Two translation examples are called consistent if they satisfy the
following conditions:

  1.  The lengths of both source sentences are the same.

  2.  The source of one TE subsumes the source of the other TE.

  3.  Both mappings are the same.

Here, a source A subsumes a source B if each term of A subsumes the
corresponding term of B, and a term X subsumes a term Y if the type
of Y is the same as or a sub-class of the type of X, or if both terms
have the same elements.
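
The consistency check can be sketched in Python as follows.  This is
an illustration under stated assumptions rather than the article's
code: a term is modelled as a (type, elements) pair, a TE as an
(S, T, M) tuple, and the small type hierarchy PARENT (with the type
names LIQUOR and SOFT_DRINK) is hypothetical.

```python
from typing import Dict, FrozenSet, Optional, Sequence, Tuple

Term = Tuple[Optional[str], FrozenSet[str]]          # (type or None, elements)
Mapping = Tuple[Tuple[int, int], ...]
TE = Tuple[Sequence[Term], Sequence[Term], Mapping]  # <S, T, M>

# Hypothetical source-language type hierarchy (child -> parent).
PARENT: Dict[str, str] = {"LIQUOR": "DRINKABLE", "SOFT_DRINK": "DRINKABLE"}


def is_same_or_subclass(child: Optional[str], ancestor: str) -> bool:
    """True if child equals ancestor or lies below it in PARENT."""
    while child is not None:
        if child == ancestor:
            return True
        child = PARENT.get(child)
    return False


def term_subsumes(x: Term, y: Term) -> bool:
    """X subsumes Y if elements match, or Y's type is X's type or a sub-class."""
    x_type, x_elems = x
    y_type, y_elems = y
    if x_elems == y_elems:
        return True
    return x_type is not None and is_same_or_subclass(y_type, x_type)


def source_subsumes(a: Sequence[Term], b: Sequence[Term]) -> bool:
    """A subsumes B if each term of A subsumes the corresponding term of B."""
    return len(a) == len(b) and all(term_subsumes(x, y) for x, y in zip(a, b))


def consistent(te1: TE, te2: TE) -> bool:
    """Check the three conditions: length, subsumption, identical mappings."""
    s1, _, m1 = te1
    s2, _, m2 = te2
    return (len(s1) == len(s2)
            and (source_subsumes(s1, s2) or source_subsumes(s2, s1))
            and set(m1) == set(m2))


def w(word: str, type_: Optional[str] = None) -> Term:
    """Hypothetical helper: build a single-word term."""
    return (type_, frozenset([word]))


M = ((1, 1), (3, 3), (5, 2))
general = ([w("kanojo"), w("ga"), ("DRINKABLE", frozenset({"sake", "juusu"})),
            w("wo"), w("nomu")],
           [w("She"), w("drinks"), ("DRINKABLE", frozenset({"alcohol", "juice"}))],
           M)
specific = ([w("kanojo"), w("ga"), w("sake", "LIQUOR"), w("wo"), w("nomu")],
            [w("She"), w("drinks"), w("alcohol")],
            M)
print(consistent(general, specific))  # True
```

In this sketch two untyped terms with different words, such as [sake]
and [juusu], do not subsume each other, so TEs that differ only in
such a term fail condition 2 at that one position.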

      Simple generalization is performed on two TEs that are
consistent except for exactly one term.  Given two translation
examples TE1=<S1,T1,M1> and TE2=<S2,T2,M2>, a generalized translation
example is created from them as follows:

      Let ts1 be the inconsistent term in S1, ts2 be the one in S2,
tt1 be the term in T1 corresponding to ts1, and tt2 be the one in T2
corresponding to ts2.

      A generalized translation example (GTE=<SG,TG,MG>) consists of
the following:

      SG is S1 with ts1 replaced by the lowest common ancestor term
of ts1 and ts2.

      TG is T1 with tt1 replaced by the lowest common ancestor term
of tt1 and tt2.

      MG is M1.

      The following is an example of simple generalization.  A
generalized translation example p3) is created from p1) and p2).

  p1) S: [kanojo]  [ga]  [sake]  [wo]  [nomu]

      T: [She]  [drinks]  [alcohol]

      M: 1-1, 3-3, 5-2

  p2) S: [kanojo]  [ga]  [juusu]  [wo]  [nomu]

      T: [She]  [drinks]  [juice]

      M: 1-1, 3-3, 5-2

  p3) S: [kanojo]  [ga]  [DRINKABLE:{sake,juusu}]  [wo]  [nomu]

      T: [She]  [drinks]  [DRINKABLE:{alcohol,juice}]

      M: 1-1, 3-3, 5-2
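
The construction of p3) from p1) and p2) can be sketched in Python as
follows.  This is a minimal illustration under several assumptions
that are not in the article: the type hierarchy PARENT and the
word-to-type lexicon WORD_TYPE are hypothetical, a single hierarchy is
shared by both languages for brevity, the inputs contain only plain
words, and the differing source term is assumed to appear in the
mapping.

```python
from typing import List, Optional, Sequence, Tuple, Union

# A term is either a plain word or a (type, elements) pair.
Term = Union[str, Tuple[str, Tuple[str, ...]]]
Mapping = Tuple[Tuple[int, int], ...]
TE = Tuple[Sequence[Term], Sequence[Term], Mapping]   # <S, T, M>

# Hypothetical type hierarchy (child -> parent) and word lexicon.
PARENT = {"LIQUOR": "DRINKABLE", "SOFT_DRINK": "DRINKABLE"}
WORD_TYPE = {"sake": "LIQUOR", "juusu": "SOFT_DRINK",
             "alcohol": "LIQUOR", "juice": "SOFT_DRINK"}


def ancestors(type_name: Optional[str]) -> List[str]:
    """Return the type and all its ancestors, lowest first."""
    chain = []
    while type_name is not None:
        chain.append(type_name)
        type_name = PARENT.get(type_name)
    return chain


def lca_term(word1: str, word2: str) -> Term:
    """Build the lowest-common-ancestor term covering both words."""
    seen = set(ancestors(WORD_TYPE.get(word1)))
    for candidate in ancestors(WORD_TYPE.get(word2)):
        if candidate in seen:
            return (candidate, (word1, word2))
    raise ValueError(f"no common ancestor for {word1!r} and {word2!r}")


def simple_generalize(te1: TE, te2: TE) -> TE:
    """Create <SG, TG, MG> from two TEs that differ in one source term."""
    (s1, t1, m1), (s2, t2, m2) = te1, te2
    if len(s1) != len(s2) or m1 != m2:
        raise ValueError("source lengths or mappings differ")
    diffs = [i for i, (a, b) in enumerate(zip(s1, s2)) if a != b]
    if len(diffs) != 1:
        raise ValueError("expected exactly one differing source term")
    i = diffs[0]
    # Target position mapped to the differing source position (M is 1-based).
    j = next(t for s, t in m1 if s == i + 1) - 1
    sg = list(s1)
    sg[i] = lca_term(s1[i], s2[i])
    tg = list(t1)
    tg[j] = lca_term(t1[j], t2[j])
    return sg, tg, m1


def render(terms: Sequence[Term]) -> str:
    """Format terms in the bracket notation used in the text."""
    return "  ".join(
        "[" + t + "]" if isinstance(t, str)
        else "[" + t[0] + ":{" + ",".join(t[1]) + "}]"
        for t in terms)


M = ((1, 1), (3, 3), (5, 2))
p1 = (["kanojo", "ga", "sake", "wo", "nomu"], ["She", "drinks", "alcohol"], M)
p2 = (["kanojo", "ga", "juusu", "wo", "nomu"], ["She", "drinks", "juice"], M)
sg, tg, mg = simple_generalize(p1, p2)
print(render(sg))  # [kanojo]  [ga]  [DRINKABLE:{sake,juusu}]  [wo]  [nomu]
print(render(tg))  # [She]  [drinks]  [DRINKABLE:{alcohol,juice}]
```

The sketch rejects pairs that differ in more than one source term,
which mirrors the requirement that the two TEs be consistent except
for exactly one term.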

Correction of Over Generalization

To avoid over generalization, the following check is needed.

  1.  If TE1 and TE2 are consistent but exactly one target term in
     each TE is not consistent, go to step 2.

  2.  Let ts1, ts2, tt1, tt2 be terms in S1, S2, T1, T2 which caused
    ...