Using Grammatical Classes to Obtain Improved Estimates of Word Probabilities in a Speech Recognition System
Original Publication Date: 1986-Nov-01
Included in the Prior Art Database: 2005-Mar-09
Assume that each word in a text has a unique grammatical class associated with it and that the number of different grammatical classes is significantly smaller than the number of different words. Under these assumptions, word-level m-gram probabilities can be improved by using grammatical class statistics as an estimator in the probability calculations. The present invention involves predicting the next word a speaker utters based on predetermined "m-grams" and estimated probabilities therefor. An "m-gram" is a sequence of m words, and the estimated probabilities take the form p(w_m | w_1, w_2, ..., w_{m-1}), where w_1, w_2, w_3, ..., w_m represent the m words. In accordance with the invention, the grammatical classes c_i of w_1 through w_{m-1}, and the probabilities p(c_i) therefor, are used.
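As an illustration only, the idea of estimating a word probability from class statistics can be sketched in Python. The sketch assumes one common class-based factorization, p(w_m | w_1, ..., w_{m-1}) ≈ p(c_m | c_1, ..., c_{m-1}) * p(w_m | c_m), where c_i is the grammatical class of w_i. This factorization is an assumption: the text above does not state the exact formula used. The function names (train_class_model, class_based_prob), the toy corpus, and the class labels are hypothetical.

```python
from collections import Counter


def train_class_model(tagged_sentences, m=2):
    """Collect class m-gram counts and word counts.

    tagged_sentences: list of sentences, each a list of (word, class) pairs.
    Each word is assumed to have exactly one grammatical class.
    """
    word_to_class = {}
    word_counts = Counter()
    class_counts = Counter()
    gram_counts = Counter()      # counts of class sequences of length m
    context_counts = Counter()   # counts of their length m-1 prefixes
    for sentence in tagged_sentences:
        for word, cls in sentence:
            word_to_class[word] = cls
            word_counts[word] += 1
            class_counts[cls] += 1
        classes = [cls for _, cls in sentence]
        for i in range(len(classes) - m + 1):
            gram = tuple(classes[i:i + m])
            gram_counts[gram] += 1
            context_counts[gram[:-1]] += 1
    return {
        "m": m,
        "word_to_class": word_to_class,
        "word_counts": word_counts,
        "class_counts": class_counts,
        "gram_counts": gram_counts,
        "context_counts": context_counts,
    }


def class_based_prob(model, history, word):
    """Estimate p(word | history) as p(class | history classes) * p(word | class)."""
    m = model["m"]
    w2c = model["word_to_class"]
    recent = list(history)[-(m - 1):] if m > 1 else []
    if word not in w2c or any(w not in w2c for w in recent):
        return 0.0  # unknown words are not handled in this sketch
    context = tuple(w2c[w] for w in recent)
    cls = w2c[word]
    context_total = model["context_counts"][context]
    if context_total == 0:
        return 0.0  # class context never observed; no smoothing here
    p_class = model["gram_counts"][context + (cls,)] / context_total
    p_word = model["word_counts"][word] / model["class_counts"][cls]
    return p_class * p_word


# Hypothetical toy corpus with made-up class labels.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("runs", "VERB")],
]
model = train_class_model(corpus, m=2)
p_dog_after_a = class_based_prob(model, ["a"], "dog")
```

In this toy corpus the word pair "a dog" never occurs, so a purely word-level count would give it zero probability. The class-based estimate is nonzero because the class pair DET NOUN is well attested: p(NOUN | DET) = 3/3 and p(dog | NOUN) = 1/3. This reflects the premise that class statistics are denser than word statistics when there are far fewer classes than words.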