
Method for Computing the Conditional Distribution of a Word Given the Previous Word in Text

IP.com Disclosure Number: IPCOM000111909D
Original Publication Date: 1994-Apr-01
Included in the Prior Art Database: 2005-Mar-26
Document File: 2 page(s) / 61K

IBM

Related People

Brown, PF: AUTHOR [+4]

Abstract

Disclosed is a method for estimating the conditional probability of a word in text given the previous word in the text. Essen and Steinbiss [*] describe a method for expressing the conditional probability distribution of a word given the preceding word in text as a linear combination of a set of conditional probability distributions. The method described here is an improvement on their scheme.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Method for Computing the Conditional Distribution of a Word Given
the Previous Word in Text

Disclosed is a method for estimating the conditional
probability of a word in text given the previous word in the text.
Essen and Steinbiss [*] describe a method for expressing the
conditional probability distribution of a word given the preceding
word in text as a linear combination of a set of conditional
probability distributions.  The method described here is an
improvement on their scheme.

Let $T$ be a sequence of words from a vocabulary $V$ of size $v$.
Let $c(w_1 w_2)$ be the number of times that the pair of words
$w_1 w_2$ occurs in sequence in $T$, and let $c(w_1\,\cdot)$ be the
number of times that $w_1$ occurs at the beginning of a sequence
of two words in $T$.  Similarly, let $c(\cdot\,w_2)$ be the number of
times that $w_2$ occurs at the end of a sequence of two words in $T$,
and let $c(\cdot\,\cdot)$ be the number of two-word sequences in $T$.
Let $S$ be a set of distribution indices, $1, 2, \ldots, s$.  Let
$P_\sigma(w)$, $\sigma \in S$, be a set of probability distributions
over the words in the vocabulary, and let $C_w(\sigma)$, $w \in V$,
be a set of probability distributions over the indices in $S$.  Then
a conditional probability distribution of $w_2$ given $w_1$ is given by

\[ P(w_2 \mid w_1) \equiv \sum_{\sigma \in S} C_{w_1}(\sigma)\, P_\sigma(w_2). \]
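
The counts and the mixture formula defined above can be illustrated with a short Python sketch. The function names `bigram_counts` and `mixture_prob`, the toy token sequence, and the component distributions and weights are all assumptions made for illustration; none of them come from the disclosure.

```python
from collections import Counter

def bigram_counts(tokens):
    """Return c(w1 w2), c(w1 .), c(. w2) and c(..) for a token sequence."""
    pairs = list(zip(tokens, tokens[1:]))          # all two-word sequences in T
    c_pair = Counter(pairs)                        # c(w1 w2)
    c_first = Counter(w1 for w1, _ in pairs)       # c(w1 .)
    c_second = Counter(w2 for _, w2 in pairs)      # c(. w2)
    return c_pair, c_first, c_second, len(pairs)   # len(pairs) is c(..)

def mixture_prob(w2, w1, C, P):
    """P(w2 | w1) as the sum over sigma of C[w1][sigma] * P[sigma][w2]."""
    return sum(C[w1][sigma] * P[sigma][w2] for sigma in P)

# Toy text (hypothetical).
tokens = "the cat sat on the mat the cat ran".split()
c_pair, c_first, c_second, c_total = bigram_counts(tokens)
vocab = sorted(set(tokens))

# Two illustrative component distributions over words (s = 2).
P = {
    1: {w: c_second[w] / c_total for w in vocab},  # relative frequency as second word
    2: {w: 1.0 / len(vocab) for w in vocab},       # uniform
}
# Illustrative mixing weights over the indices, one distribution per word.
C = {w: {1: 0.7, 2: 0.3} for w in vocab}

p = mixture_prob("cat", "the", C, P)
```

Because each $C_w$ sums to one over the indices and each $P_\sigma$ sums to one over the vocabulary, `mixture_prob` summed over all second words also gives one for any first word.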

According to the method described herein, the distributions
$P_\sigma(w)$ and $C_w(\sigma)$ for $s = 2$ are chosen as follows.

1.    Set $n = 0$.

2.    Set $P_1^{(0)}(w) = c(\cdot\,w)/c(\cdot\,\cdot)$, and
$P_2^{(0)}(w) = v^{-1}$.

3.    Set $C_w^{(0)}(1) = 0.5 + i_w \epsilon$ and
$C_w^{(0)}(2) = 0.5 - i_w \epsilon$, where $i_w$ is chosen
randomly to be $+1$ for approximately half of the words in $V$ and $-1$
for the remainder of the words in $V$, and $\epsilon$ is some suitable
small number, say 0.1.

4.    Determine $J^{(n)}(w_1\,\sigma\,w_2)$ according to the
formula

\[ J^{(n)}(w_1\,\sigma\,w_2) = C_{w_1}^{(n)}(\sigma)\, P_\sigma^{(n)}(w_2) \left[ \sum_{\sigma' \in S} C_{w_1}^{(n)}(\sigma')\, P_{\sigma'}^{(n)}(w_2) \right]^{-1}. \]
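
Steps 1 through 4 above can be sketched in Python for the case s = 2. This is a minimal illustration under stated assumptions: the function names `initialize` and `posterior`, the seeded random split used to pick the signs, and the toy counts are hypothetical and do not come from the disclosure.

```python
import random

def initialize(vocab, c_second, c_total, epsilon=0.1, seed=0):
    """Steps 1-3 for s = 2: starting distributions at n = 0."""
    rng = random.Random(seed)
    v = len(vocab)
    P = {
        1: {w: c_second.get(w, 0) / c_total for w in vocab},  # c(. w) / c(..)
        2: {w: 1.0 / v for w in vocab},                       # uniform, 1 / v
    }
    # i_w is +1 for about half of the words and -1 for the rest
    # (assumption: a shuffled split is one way to choose "randomly").
    shuffled = list(vocab)
    rng.shuffle(shuffled)
    sign = {w: (1 if k < v // 2 else -1) for k, w in enumerate(shuffled)}
    C = {w: {1: 0.5 + sign[w] * epsilon, 2: 0.5 - sign[w] * epsilon}
         for w in vocab}
    return P, C, sign

def posterior(w1, w2, C, P):
    """Step 4: J(w1 sigma w2) for every sigma, normalized over sigma."""
    joint = {sigma: C[w1][sigma] * P[sigma][w2] for sigma in P}
    total = sum(joint.values())  # positive, since the uniform component is never zero
    return {sigma: joint[sigma] / total for sigma in joint}

# Toy counts (hypothetical): c(. w) for each word, and c(..) = 4.
vocab = ["a", "b", "c", "d"]
c_second = {"a": 2, "b": 1, "c": 1}
P, C, sign = initialize(vocab, c_second, c_total=4)
J = posterior("a", "b", C, P)
```

For a fixed word pair, the values in `J` sum to one over the indices, so they can be read as the share of that pair attributed to each component under the current estimates.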

5.    Determine $N^{(n)}(w_1\,\sigma\,w_2)$ according to the
formula

N sup <(n)> (w sub 1  sigma w sub 2 ) = c(w sub...