Language Model Adaptation Using Word Clustering

IP.com Disclosure Number: IPCOM000016435D
Original Publication Date: 2003-Feb-08
Included in the Prior Art Database: 2003-Jun-21
Document File: 1 page(s) / 27K

Publishing Venue

IBM

Abstract

Building a stochastic language model (LM) for speech recognition and similar applications requires a large corpus from the target task. For some tasks no sufficiently large corpus is available, which is an obstacle to achieving high recognition accuracy. In this paper, we propose a method that uses large corpora from different tasks to build an LM with higher predictive power than an LM estimated from a small corpus of the target task alone. In our experiment, we used transcriptions of air university lectures and articles from the Nikkei newspaper, and compared an existing interpolation-based method with our new method. The results show that our method achieves a perplexity reduction of 9.71%.
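
To make the evaluation terms concrete, the following is a minimal sketch of an interpolation-based baseline and of the perplexity metric, not the method proposed in the disclosure (the abstract does not describe the word-clustering procedure in enough detail to reproduce it). It assumes add-one smoothed unigram models for brevity; the function names, the toy sentences, and the weight `lam=0.5` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def unigram_model(tokens, vocab):
    """Add-one smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def interpolate(p_target, p_general, lam):
    """Linear interpolation: lam * P_target(w) + (1 - lam) * P_general(w)."""
    return {w: lam * p_target[w] + (1 - lam) * p_general[w] for w in p_target}


def perplexity(model, tokens):
    """Perplexity = exp of the average negative log-probability per token."""
    log_sum = sum(math.log(model[w]) for w in tokens)
    return math.exp(-log_sum / len(tokens))


# Toy data (hypothetical, not taken from the disclosure).
target_train = "the lecture covers speech recognition".split()
general_train = "the market news covers the stock market and the economy".split()
test = "the lecture covers the economy".split()

# Assumption: a closed vocabulary that already contains every test word.
vocab = set(target_train) | set(general_train) | set(test)

p_target = unigram_model(target_train, vocab)    # small target-task corpus
p_general = unigram_model(general_train, vocab)  # larger out-of-task corpus
p_mix = interpolate(p_target, p_general, lam=0.5)

ppl_target = perplexity(p_target, test)
ppl_mix = perplexity(p_mix, test)

# Relative perplexity change, in percent, of the mixture against the
# target-only model; positive values mean the mixture predicts better.
reduction = 100 * (ppl_target - ppl_mix) / ppl_target
print(f"target-only: {ppl_target:.2f}  interpolated: {ppl_mix:.2f}  "
      f"change: {reduction:.2f}%")
```

In practice the interpolation weight would be tuned on held-out target-task data rather than fixed, and the component models would typically be n-gram models rather than unigrams.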
