Ranking Candidate Translations of Source Words in Machine Translation

IP.com Disclosure Number: IPCOM000111472D
Original Publication Date: 1994-Feb-01
Included in the Prior Art Database: 2005-Mar-26
Document File: 2 page(s) / 67K

Publishing Venue

IBM

Related People

Brown, PF: AUTHOR [+4]

Abstract

Described is a method of ranking candidate translations of particular source language words in a machine translation system according to the average scores the candidate translations will receive when used as extensions to current hypotheses in a stack search algorithm.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      Statistical machine translation, as described by [*], relies on
a language model, which provides estimates of the prior probabilities
of sentences in a target language, such as English, and on a
translation model, which provides estimates of the conditional
probabilities that sentences in a source language, such as French,
are translations of given sentences in the target language [*].
The probabilities from these two models can be multiplied together to
determine the joint probability of a source sentence and a target
sentence.  One translates a source sentence into the target language
by attempting to find the target sentence which maximizes the joint
probability as estimated by the language and translation models.
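
      As a rough illustration of the selection rule just described,
the following sketch scores a handful of candidate target sentences
by the product of a language model probability and a translation
model probability, and keeps the best one.  The candidate sentences,
the probability values, and the function names are all hypothetical
and chosen only for illustration; they are not taken from the cited
work.  Logarithms are summed instead of multiplying probabilities,
which leaves the maximizing sentence unchanged.

```python
import math

# Hypothetical toy values, for illustration only.
# Language model: prior probability P(E) of each English candidate.
LANGUAGE_MODEL = {
    "the house is small": 0.02,
    "the home is small": 0.01,
    "small the house is": 0.0001,
}

# Translation model: P(F | E) for one fixed French sentence F,
# given each English candidate E.
TRANSLATION_MODEL = {
    "the house is small": 0.30,
    "the home is small": 0.35,
    "small the house is": 0.30,
}


def joint_log_prob(target):
    """Return log P(E) + log P(F | E), the log of the joint probability."""
    return math.log(LANGUAGE_MODEL[target]) + math.log(TRANSLATION_MODEL[target])


def best_translation(candidates):
    """Return the candidate target sentence with the highest joint probability."""
    return max(candidates, key=joint_log_prob)


best = best_translation(list(LANGUAGE_MODEL))
print(best)
```

      Enumerating every candidate in this way is feasible only for a
toy list.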

      For computational reasons, it is impossible to examine all
target-language sentences.  Instead, a stack search algorithm is
employed.  The particular search algorithm advocated by Brown et al.
iteratively extends a set of hypotheses.  Each hypothesis consists of
a prefix of a target sentence, together with a list of which words in
the source sentence have been accounted for by that hypothesis.  Each
time a hypothesis is extended, an additional source language word is
accounted for.  This may be done by adding more words to the right
end of the hypothesis such that at least one of these additional
words is a translation of the new source word which has been
accounted for, or by using at least one word in the existing
hypothesis to account for an additional source word.
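
      The two ways of extending a hypothesis can be sketched as
follows.  This is a minimal illustration under stated assumptions,
not the implementation of Brown et al.: the class name, field names,
and function names are hypothetical, scoring is omitted entirely, and
the sketch does not record which existing target word is reused in
the second kind of extension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """A partial translation (names here are hypothetical)."""
    target_prefix: tuple   # target words produced so far
    covered: frozenset     # indices of source words accounted for


def extend_with_new_words(hyp, source_index, new_words):
    """Account for one more source word by appending target words on the right."""
    if source_index in hyp.covered:
        raise ValueError("source word already accounted for")
    if not new_words:
        raise ValueError("at least one new target word is required")
    return Hypothesis(hyp.target_prefix + tuple(new_words),
                      hyp.covered | {source_index})


def extend_with_existing_word(hyp, source_index):
    """Account for one more source word using a word already in the prefix."""
    if source_index in hyp.covered:
        raise ValueError("source word already accounted for")
    if not hyp.target_prefix:
        raise ValueError("no existing target word to reuse")
    return Hypothesis(hyp.target_prefix, hyp.covered | {source_index})


# Source sentence: "je ne sais pas" (indices 0..3).
h0 = Hypothesis((), frozenset())
h1 = extend_with_new_words(h0, 0, ["I"])          # "je"  -> "I"
h2 = extend_with_new_words(h1, 1, ["do", "not"])  # "ne"  -> "do not"
h3 = extend_with_existing_word(h2, 3)             # "pas" -> existing "not"
print(h3.target_prefix, sorted(h3.covered))
```

      In this example each extension accounts for exactly one
additional source word, and the last step leaves the target prefix
unchanged while the set of covered source words grows.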

 ...