
Fast Silence Detection

IP.com Disclosure Number: IPCOM000044518D
Original Publication Date: 1984-Dec-01
Included in the Prior Art Database: 2005-Feb-06
Document File: 2 page(s) / 33K

Publishing Venue

IBM

Related People

Davies, K: AUTHOR [+2]

Abstract

Fast silence detection quickly and inexpensively recognizes periods of 'silence' by using vocoder techniques to characterize the actual background sounds, with computation light enough for a microprocessor to operate in real time. Many speech processing applications, particularly the currently popular isolated word recognition techniques, need to identify the 'silence' or background noise portions of a signal that contains sounds, especially speech sounds, separated by background noise or silences. At the beginning of a session in which 'silence' is to be detected, no intentional sounds are made for a period of N seconds. Typically N is about 8. During this period the silence signal, presumably very quiet, is processed.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 52% of the total text.


Fast silence detection quickly and inexpensively recognizes periods of 'silence' by using vocoder techniques to characterize the actual background sounds, with computation light enough for a microprocessor to operate in real time. Many speech processing applications, particularly the currently popular isolated word recognition techniques, need to identify the 'silence' or background noise portions of a signal that contains sounds, especially speech sounds, separated by background noise or silences.

At the beginning of a session in which 'silence' is to be detected, no intentional sounds are made for a period of N seconds. Typically N is about 8. During this period the silence signal, presumably very quiet, is processed. This processing consists of labeling the sound (see silence waveform 1) every 10 msecs with a scalar identifier that categorizes the sound for that period. The labeling uses standard feature extraction and vector quantization algorithms, typical of those used in some speech vocoders. The list of labels for the first N secs is saved.

A mapping computation is then performed. It counts the number of occurrences of each label, sorts these counts into descending order, and assigns to each label a class, either 0 or 1, as follows: the labels with the highest counts receive a class of 1 until the cumulative count for all such labels reaches or exceeds 95% of the total count for the entire N secs. Because one label is produced every 10 msecs, the total count is N*100, so the threshold is 95% of N*100, or 95*N. All other labels then receive a class of 0. See class register 2.

Regular processing of the signal (see waveform 3) can now begin. Each 10 msec u...
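
The mapping computation described above (count each label, sort the counts in descending order, assign class 1 until the cumulative count reaches or exceeds 95% of the total) can be sketched in Python as follows. This is only an illustrative sketch, not the disclosure's implementation: the function name build_class_map, the parameter coverage_percent, and the use of a dictionary in place of the class register are assumptions made here, and the labels are taken to be integers already produced by the vector quantizer.

```python
from collections import Counter


def build_class_map(labels, coverage_percent=95):
    """Assign class 1 or 0 to each label seen in the initial silence period.

    `labels` is the saved list of labels, one per 10 msec frame.
    The name and signature are hypothetical, chosen for illustration.
    """
    counts = Counter(labels)
    total = len(labels)  # N*100 labels for an N-second period
    class_map = {}
    cumulative = 0
    # most_common() yields (label, count) pairs in descending order of count.
    for label, count in counts.most_common():
        # Integer comparison avoids floating-point rounding at the threshold.
        if cumulative * 100 < coverage_percent * total:
            class_map[label] = 1
            cumulative += count
        else:
            class_map[label] = 0
    return class_map


# Example with N = 8 secs, i.e. 800 labels, so the threshold is 95*8 = 760.
silence_labels = [3] * 700 + [7] * 60 + [1] * 30 + [9] * 10
class_map = build_class_map(silence_labels)
```

In this example labels 3 and 7 together account for exactly 760 of the 800 frames, so they receive class 1 and the remaining labels receive class 0.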