Procedure for Using Shortlist Reference Templates to Obtain Improved Shortlists of Candidate Words in a Speech Recognition System

IP.com Disclosure Number: IPCOM000039115D
Original Publication Date: 1987-Apr-01
Included in the Prior Art Database: 2005-Feb-01
Document File: 2 page(s) / 14K

Publishing Venue

IBM

Related People

Bahl, LR: AUTHOR [+4]

Abstract

In speech recognition, an utterance corresponds to one word in a vocabulary. To reduce the amount of processing and time required to determine the correct word, a coarse evaluation is first made to derive a shortlist of likely candidate words which can then be processed in detail. The present invention involves an algorithm for increasing the likelihood that the correct word is in the shortlist without increasing the length thereof. In particular, shortlist reference templates are used to improve a given shortlist(s). The algorithm includes three parts: preparation, training, and application, each of which is described step-by-step as follows: (1) Preparation 1. For input utterances, obtain, by a prescribed method, an acoustic shortlist of candidate words for each vocabulary word.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 57% of the total text.

In speech recognition, an utterance corresponds to one word in a vocabulary. To reduce the amount of processing and time required to determine the correct word, a coarse evaluation is first made to derive a shortlist of likely candidate words which can then be processed in detail. The present invention involves an algorithm for increasing the likelihood that the correct word is in the shortlist without increasing the length thereof. In particular, shortlist reference templates are used to improve a given shortlist(s). The algorithm includes three parts: preparation, training, and application, each of which is described step-by-step as follows:

(1) Preparation

1. For input utterances, obtain, by a prescribed method, an acoustic shortlist of candidate words for each vocabulary word. These shortlists are reference shortlists which will serve as templates.

2. Determine which shortlists contain word W, for each word W in the vocabulary. For each word W' which has a reference shortlist containing W, compute the median rank (position) of W on the lists of W', and store the triple W-W'-R' (where R' denotes the median rank). This triple indicates approximately what rank W would be expected to have on the shortlist obtained when the true word is actually W'.

(2) Training A New Speaker

1. Record a set of training utterances spoken by a speaker who is to be recognized.

2. For each training utterance, obtain an acoustic shortlist of candidate words by the prescribed method.

These shortlists will be referred to as training shortlists.

3. Let M be the maximum length of any training or reference shortlist. Define V to be an M x M vote matrix. The purpose of this training phase is to compute optimal values of V(i,j) for use in the voting scheme described in the Application part. Define two other M x M matrices A and C, used below in computing V. A is a matrix of vote adjustments, and C is a matrix of counts.

4. Initialize V as follows. Let n(i) be the number of times the correct word had rank i in the training shortlists. For i = 1,2,...,M set

V(i,1) = log(n(i)) if n(i) > 0, and 0 otherwise

V(i,j) = 0, for j = 2,3,...,M

The following iterative steps improve on these initial values of V.

5. Set all elements of the matrices A and C to 0. For each training shortlist in turn perform steps 6 - 9.

6. Let T be a vector of vote totals, where T(w) denotes the total vote cast for word w. Set all elements of T to 0, and perform steps 7 - 9 for each word W on the current training shortlist.

7. Let R be the rank of W...
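
As an illustration only, the two fully specified computations above (the median-rank triples of Preparation step 2 and the initial vote matrix of Training step 4) might be sketched in Python as follows. The function names, the dictionary and list layouts, and the sample words are hypothetical and do not come from the disclosure. The sketch also assumes that the median is taken only over those lists of W' that actually contain W, that a training utterance whose correct word is missing from its shortlist adds to no n(i), that no word appears twice on one shortlist, and that log is the natural logarithm; the text does not settle any of these points.

```python
import math
from collections import defaultdict
from statistics import median


def median_rank_triples(reference_shortlists):
    """Preparation step 2: map (W, W') to the median rank R' of W on the lists of W'.

    reference_shortlists maps each true word W' to a list of its reference
    shortlists; each shortlist is ordered best-first, so rank = index + 1.
    Assumption: the median is taken only over the lists of W' that contain W.
    """
    triples = {}
    for true_word, shortlists in reference_shortlists.items():
        ranks = defaultdict(list)
        for shortlist in shortlists:
            for rank, word in enumerate(shortlist, start=1):
                ranks[word].append(rank)
        for word, word_ranks in ranks.items():
            triples[(word, true_word)] = median(word_ranks)
    return triples


def initial_vote_matrix(training, max_len):
    """Training step 4: V(i,1) = log(n(i)) if n(i) > 0, else 0; V(i,j) = 0 for j >= 2.

    training is a list of (correct_word, shortlist) pairs; max_len is M.
    The matrix is stored 0-based, so V(i,j) lives at v[i - 1][j - 1].
    Assumption: an utterance whose correct word is missing from its shortlist
    contributes to no n(i). Natural log is also an assumption.
    """
    n = [0] * max_len
    for correct_word, shortlist in training:
        if correct_word in shortlist:
            n[shortlist.index(correct_word)] += 1
    v = [[0.0] * max_len for _ in range(max_len)]
    for i, count in enumerate(n):
        if count > 0:
            v[i][0] = math.log(count)
    return v


# Hypothetical toy data: three reference shortlists for the true word "cat",
# and three training utterances with their shortlists.
reference = {
    "cat": [["cat", "cap", "cut"], ["cap", "cat", "cot"], ["cat", "cut", "cap"]],
}
training = [
    ("cat", ["cat", "cap", "cut"]),
    ("cap", ["cat", "cap", "cut"]),
    ("cut", ["cut", "cat", "cap"]),
]
triples = median_rank_triples(reference)
v = initial_vote_matrix(training, max_len=3)
```

In this toy data the correct word is ranked first twice and second once. Since log(1) = 0, a rank observed exactly once starts with the same vote as a rank never observed; this follows directly from the initialization as stated, and the iterative steps that follow are what refine these starting values.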