
Recursive Self-Smoothing of Linguistic Contingency Tables

IP.com Disclosure Number: IPCOM000044667D
Original Publication Date: 1984-Dec-01
Included in the Prior Art Database: 2005-Feb-06

Publishing Venue

IBM

Related People

Authors:
Nadas, AJ

Abstract

A contingency table of n-gram counts, which has many zeros for n > 1 and is mostly zeros for n > 2, may be smoothed by using three ideas, as follows:

1. Probabilities conditioned on the more recent past are obtained as weighted averages involving long-term memory.
2. Short-term memory is regarded as prior information for estimating probabilities based on long-term memory.
3. The data required for estimating the parameters of the prior distribution are identified, using probabilities conditioned as in 1.

The estimator is constructed and normalized so as to sum to unity. In probabilistic modeling of natural language text, the numbers of occurrences of the possible n-grams of words ln = (l1 l2 ... ln) can be arranged in a contingency table. For n > 1, such tables have many zeros; for n > 2, they are sparse.
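The disclosure gives no further algorithmic detail beyond the three ideas above. As an illustration only, the sketch below shows one common way such recursive smoothing can be realized: the relative frequency at order n is mixed with the already-smoothed estimate at order n-1 using a fixed interpolation weight lam (a hypothetical parameter, not taken from the disclosure), so that zero counts still receive probability mass and each conditional distribution remains normalized. This is a minimal stand-in for the idea of weighting short-term against long-term memory, not the author's exact estimator.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count every n-gram in the token sequence (a contingency table over word tuples)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def smoothed_prob(ngram, counts_by_order, vocab_size, lam=0.5):
    """Recursively smoothed estimate of P(w_n | w_1 .. w_{n-1}).

    The order-n relative frequency is mixed with the smoothed order-(n-1)
    estimate (the longer-term memory).  Because each level is a convex
    combination of normalized distributions, the result sums to unity
    over the vocabulary for any fixed context.
    """
    n = len(ngram)
    if n == 1:
        # Base case: unigram relative frequency mixed with a uniform distribution.
        total = sum(counts_by_order[1].values())
        return lam * counts_by_order[1][ngram] / total + (1 - lam) / vocab_size
    backoff = smoothed_prob(ngram[1:], counts_by_order, vocab_size, lam)
    context_count = counts_by_order[n - 1][ngram[:-1]]
    if context_count == 0:
        # Unseen context: rely entirely on the shorter history.
        return backoff
    rel_freq = counts_by_order[n][ngram] / context_count
    return lam * rel_freq + (1 - lam) * backoff

# Example: trigram estimates over a toy corpus.
tokens = "the cat sat on the mat the cat ran".split()
counts_by_order = {n: ngram_counts(tokens, n) for n in (1, 2, 3)}
vocab_size = len(set(tokens))
print(smoothed_prob(("the", "cat", "sat"), counts_by_order, vocab_size))
print(smoothed_prob(("the", "cat", "flew"), counts_by_order, vocab_size))  # unseen trigram, yet nonzero
```

In this toy setting the unseen trigram ("the", "cat", "flew") still gets a nonzero probability drawn from the lower-order statistics, which is the practical effect the smoothing of sparse contingency tables is meant to achieve.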