Acoustic Processor Activation Method by Using Input Power Level

IP.com Disclosure Number: IPCOM000112911D
Original Publication Date: 1994-Jun-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 2 page(s) / 35K

Publishing Venue

IBM

Related People

Sugawara, K: AUTHOR

Abstract

A technique is described that makes real-time speech recognition possible on slow machines that cannot perform acoustical processing in real time. The system uses the input power level to decide which frames require acoustical analysis.

      A technique is described that makes real-time speech recognition
possible on slow machines that cannot perform acoustical processing
in real time.  The system uses the input power level to decide which
frames require acoustical analysis.

      First, the input power level is checked over a big frame (512
points).  This is the initial mode.  If the level is below a
predefined threshold, the frame is discarded as non-sound.  If the
level is above the threshold, the processing mode is changed to the
small frame mode (128 points).
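
      The big-frame check above can be sketched as follows.  This is
only an illustration: the dB scale, the -40 dB threshold value, the
normalisation of samples to [-1, 1], and the names frame_power_db and
next_mode are assumptions that do not appear in the original text,
which says only that the threshold is "predefined."

```python
import math

BIG_FRAME = 512    # points per frame in the initial mode (from the text)
SMALL_FRAME = 128  # points per frame in the small frame mode (from the text)
POWER_THRESHOLD_DB = -40.0  # assumed value; the text only says "predefined"


def frame_power_db(samples):
    """Mean power of one frame in dB, for samples normalised to [-1, 1]."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square + 1e-12)  # epsilon avoids log10(0)


def next_mode(big_frame, threshold_db=POWER_THRESHOLD_DB):
    """Decide the mode after inspecting one big frame.

    Returns "small" when the frame is above the threshold, otherwise
    "big", meaning the frame is discarded as non-sound and the system
    stays in the initial mode.
    """
    return "small" if frame_power_db(big_frame) > threshold_db else "big"
```

Only one pass of multiply-and-add over the frame is needed here, which
is why this check is cheap compared with the acoustical processing it
guards.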

      The small-frame input-power-level data are passed to the next
process, segmentation, which determines whether the frame is within
an utterance of a word by using the double threshold method.  For
each small frame, it returns a frame status such as "silence,"
"within an utterance," "end point candidate," or "end point
established."  If the frame precedes an utterance, no further
processing is done for it.  If the frame follows the beginning of the
utterance and the end point is not yet established, the
time-consuming acoustical processing is done for the frame.
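
      One possible reading of the segmentation step is sketched below.
The text names the double threshold method but gives no details, so
everything specific here is an assumption: a high threshold opens an
utterance, a lower threshold keeps it open, and the end point is
established after a fixed number of consecutive quiet frames.  The
threshold values, the hangover count, and the names Segmenter and
needs_acoustic_processing are hypothetical.

```python
from enum import Enum


class Status(Enum):
    SILENCE = "silence"
    WITHIN = "within an utterance"
    END_CANDIDATE = "end point candidate"
    END_ESTABLISHED = "end point established"


class Segmenter:
    """Assumed double-threshold segmenter fed one small-frame power at a time."""

    def __init__(self, high_db=-30.0, low_db=-45.0, hangover=3):
        self.high_db = high_db      # assumed: power needed to start an utterance
        self.low_db = low_db        # assumed: power needed to stay inside one
        self.hangover = hangover    # assumed: quiet frames that confirm the end
        self.in_utterance = False
        self.quiet_frames = 0

    def update(self, power_db):
        if not self.in_utterance:
            if power_db >= self.high_db:
                self.in_utterance = True
                self.quiet_frames = 0
                return Status.WITHIN
            return Status.SILENCE
        if power_db >= self.low_db:
            self.quiet_frames = 0   # speech resumed, cancel the candidate
            return Status.WITHIN
        self.quiet_frames += 1
        if self.quiet_frames >= self.hangover:
            self.in_utterance = False
            self.quiet_frames = 0
            return Status.END_ESTABLISHED
        return Status.END_CANDIDATE


def needs_acoustic_processing(status):
    """True for frames after the utterance start whose end is not yet established."""
    return status in (Status.WITHIN, Status.END_CANDIDATE)
```

With these assumed settings, frames marked "end point candidate" are
still analysed, since a short pause inside a word must not cut the
utterance in two.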

      Once the end of the utterance is established, the processing
mode is reset to the initial mode (512-point frame shift).
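
      The complete mode switching can be sketched as a small loop.
This is a minimal illustration under stated assumptions, not the
disclosed implementation: the function names, the dB power scale, and
the decision to re-read a loud big frame as small frames (so that the
utterance onset is not skipped) are all assumptions.  The segment and
analyze arguments are hypothetical callables standing in for the
segmentation step and the acoustical processing.

```python
import math

BIG, SMALL = 512, 128  # frame sizes in points (from the text)


def power_db(frame):
    """Mean power of a frame in dB, for samples normalised to [-1, 1]."""
    mean_square = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(mean_square + 1e-12)  # epsilon avoids log10(0)


def run(samples, gate_db, segment, analyze):
    """Walk the input and return the start offsets of the analysed frames.

    segment(power_db) must return one of the four status strings used
    in the text; analyze(frame) is the expensive acoustical step.
    """
    analysed = []
    pos, mode = 0, "big"
    while pos + (BIG if mode == "big" else SMALL) <= len(samples):
        if mode == "big":
            if power_db(samples[pos:pos + BIG]) > gate_db:
                mode = "small"   # assumption: re-read this region in small frames
            else:
                pos += BIG       # below threshold: discarded as non-sound
            continue
        frame = samples[pos:pos + SMALL]
        status = segment(power_db(frame))
        if status == "end point established":
            mode = "big"         # back to the initial mode
        elif status in ("within an utterance", "end point candidate"):
            analyze(frame)       # only these frames pay the full cost
            analysed.append(pos)
        pos += SMALL
    return analysed
```

In this sketch, silent stretches cost one power computation per 512
points, and the expensive step runs only between the detected start
of an utterance and its established end point.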