N-Segment Label Histogram Speech Recognition Method Modified for Small Computer System

IP.com Disclosure Number: IPCOM000040190D
Original Publication Date: 1987-Oct-01
Included in the Prior Art Database: 2005-Feb-02
Document File: 2 page(s) / 23K

Publishing Venue

IBM

Related People

Watanuki, O: AUTHOR

Abstract

This article describes a modification of the N-segment label histogram speech recognition method that uses a linear approximation of the logarithmic function without sacrificing recognition accuracy. The conventional N-segment label histogram method is based on a probabilistic technique in which label output probabilities are defined as the probabilities of the respective labels being produced at each segment of each reference word. These probabilities are easily obtained from label histograms during training. In decoding, the label string produced from an unknown input word is divided into the same number of segments as the reference words. The output probability of each label in the string is a function of the identity of the label and the identity of the segment to which the label belongs.

At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 65% of the total text.

The likelihood of the unknown input word being one of the reference words is calculated as follows.

$L(w) = \sum_i \log P'_i(w)$, where $P'_i = \max(P_i, \epsilon)$,

$L(w)$ is the likelihood for the word $w$, and

$P_i$ is the output probability of the label $i$, with $\epsilon$ a small positive constant that keeps the logarithm finite when $P_i = 0$. The sum runs over the labels in the input string. The reference word having the highest likelihood is taken as the recognition result. The conventional N-segment label histogram method requires less computation than DP-matching and HMM-based recognition methods. However, computing the logarithm of the label output probabilities is time-consuming on a small computer system, such as a personal computer....
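
The conventional method described above (histogram training, segmentation of the input label string, floored log-likelihood scoring, and selection of the best reference word) can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's implementation: the function names, the segment count, the label alphabet size, the floor value, and the equal-length segmentation rule are all hypothetical choices made for the example. It uses the exact logarithm; the linear approximation proposed by the article is not reproduced here.

```python
import math
from collections import Counter

N_SEGMENTS = 4   # assumed number of segments per word (the "N" in N-segment)
N_LABELS = 8     # assumed size of the label alphabet
FLOOR = 1e-4     # assumed floor constant in max(P_i, floor)


def split_into_segments(labels, n_segments=N_SEGMENTS):
    """Return, for each position in the label string, its segment index.

    Assumption: segments are of (nearly) equal length.
    """
    length = len(labels)
    return [t * n_segments // length for t in range(length)]


def train_word(utterances, n_segments=N_SEGMENTS, n_labels=N_LABELS):
    """Build one label histogram per segment and normalize it.

    Returns probs[segment][label], the label output probabilities.
    """
    counts = [Counter() for _ in range(n_segments)]
    for labels in utterances:
        segments = split_into_segments(labels, n_segments)
        for label, seg in zip(labels, segments):
            counts[seg][label] += 1
    probs = []
    for seg_counts in counts:
        total = sum(seg_counts.values())
        probs.append([seg_counts[lab] / total if total else 0.0
                      for lab in range(n_labels)])
    return probs


def log_likelihood(labels, probs, floor=FLOOR):
    """Sum of log(max(P, floor)) over the labels of the input string."""
    segments = split_into_segments(labels, len(probs))
    return sum(math.log(max(probs[seg][label], floor))
               for label, seg in zip(labels, segments))


def recognize(labels, models, floor=FLOOR):
    """Return the reference word with the highest likelihood."""
    return max(models, key=lambda w: log_likelihood(labels, models[w], floor))


# Toy training data: two utterances per word, labels are integers 0..7.
models = {
    "yes": train_word([[0, 0, 1, 1, 2, 2, 3, 3], [0, 1, 1, 2, 2, 3, 3, 3]]),
    "no": train_word([[4, 4, 5, 5, 6, 6, 7, 7], [4, 5, 5, 6, 6, 7, 7, 7]]),
}
print(recognize([0, 0, 1, 2, 2, 3, 3, 3], models))
```

In this sketch every input label costs one call to `math.log`, which is the per-label operation the text identifies as expensive on a small computer. The floor keeps a label that never occurred in a segment during training from driving the likelihood to negative infinity.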