
Digital Signal Processing Algorithm for Microphone Input Energy Detection Having Adaptive Sensitivity

IP.com Disclosure Number: IPCOM000039370D
Original Publication Date: 1987-May-01
Included in the Prior Art Database: 2005-Feb-01
Document File: 4 page(s) / 67K

Publishing Venue

IBM

Related People

Einkauf, MA: AUTHOR [+3]

Abstract

In many audio applications, such as voice recording or speech recognition, a microphone is used as an input device for voice signals. Often, it is desired that the application ignore all microphone input except that which is truly speech. The determination of whether the input is speech or noise is often achieved by comparing the input signal amplitude to a predetermined value, known as the noise threshold. If the input amplitude is above this threshold, it is determined that the input is speech. Otherwise, it is determined that the input is background noise and the input is ignored. This determination method, while simple and inexpensive to implement, has a serious drawback in that the background noise level may not be equal to the predetermined threshold.

This is the abbreviated version, containing approximately 36% of the total text.



In many audio applications, such as voice recording or speech recognition, a microphone is used as an input device for voice signals. Often, it is desired that the application ignore all microphone input except that which is truly speech. The determination of whether the input is speech or noise is often achieved by comparing the input signal amplitude to a predetermined value, known as the noise threshold. If the input amplitude is above this threshold, it is determined that the input is speech. Otherwise, it is determined that the input is background noise and the input is ignored.

This determination method, while simple and inexpensive to implement, has a serious drawback in that the background noise level may not be equal to the predetermined threshold. If the noise level is higher than the threshold, the microphone input will, erroneously, always be determined to be speech. Similarly, if the noise level is lower than the threshold, the detection algorithm may be less sensitive than necessary and may ignore some input signals which are actually speech. The problem is compounded by the fact that the ambient noise level can quickly change. Therefore, even if the exact noise level is determined at the beginning of an audio session and the detection algorithm calibrated accordingly, the noise level could soon change, invalidating the calibrated noise threshold.

This new algorithm continually samples the microphone input and automatically updates the noise threshold to correspond to the ambient noise level. This is achieved by first determining if the input is background noise or speech, and if it is background noise, measuring the amplitude. This noise amplitude is adjusted slightly and becomes the new noise threshold.

The method for determining if the input is speech or noise is to full-wave rectify and low-pass filter the input. This produces a positive signal with a magnitude that corresponds to the energy of the microphone input signal.
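The rectify-and-filter step can be illustrated with a minimal Python sketch. The disclosure does not say which low-pass filter is used, so the one-pole smoothing filter below, the function name `energy_envelope`, and the coefficient `alpha` are all assumptions made for illustration only.

```python
def energy_envelope(samples, alpha=0.01):
    """Full-wave rectify the input, then smooth it with a one-pole low-pass filter.

    samples -- iterable of signed microphone samples
    alpha   -- assumed smoothing coefficient in (0, 1]; not from the disclosure
    """
    envelope = []
    level = 0.0
    for x in samples:
        rectified = abs(x)                    # full-wave rectification
        level += alpha * (rectified - level)  # one-pole low-pass filter
        envelope.append(level)
    return envelope
```

Because every rectified sample is non-negative, the filter output is non-negative as well. A smaller `alpha` gives a smoother envelope at the cost of a slower response to changes in input level.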
If this energy level remains relatively constant for a certain period of time (about 2 seconds), it is determined that the input is background noise, since background noise normally has a semi-constant amplitude over short time spans. If the input is determined to be background noise, the new noise threshold is calculated to be slightly higher than the maximum microphone input received for the 2-second period. If the energy level of the microphone input (low-pass filter output) is not relatively constant, the input is determined to be speech, and the noise threshold is not adjusted. In this way, during short breaks in speech input, the noise threshold can be adjusted, requiring no action by the user.

A flow chart of the algorithm is shown. The definitions of the constants and variables employed in the flow chart are as follows:

STATE - State of noise threshold detection algorithm. If STATE=0, the al...
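The threshold update described above (an energy level that stays roughly constant for about 2 seconds is treated as noise, and the threshold is then set slightly above the maximum seen in that period) can be sketched in Python as follows. This is not the flow chart or the STATE machine of the disclosure, neither of which is available here. The block-by-block windowing, the function names, the `flatness` test for "relatively constant", and the `margin` factor for "slightly higher" are all assumptions.

```python
def update_threshold(window, threshold, flatness=0.25, margin=1.1):
    """Return the noise threshold after examining one window of energy values.

    window    -- non-negative energy values covering roughly 2 seconds
    threshold -- current noise threshold
    flatness  -- assumed limit on (max - min) / max for "relatively constant"
    margin    -- assumed factor that places the threshold slightly above the max
    """
    if not window:
        return threshold
    hi, lo = max(window), min(window)
    if (hi - lo) <= flatness * hi:
        return hi * margin  # steady level: treat as noise and recalibrate
    return threshold        # varying level: treat as speech, leave unchanged


def classify(envelope, window_len, threshold):
    """Label each energy value as speech (True) or noise (False).

    The threshold is re-evaluated after every complete window.
    """
    labels = []
    for start in range(0, len(envelope), window_len):
        window = envelope[start:start + window_len]
        labels.extend(v > threshold for v in window)
        if len(window) == window_len:
            threshold = update_threshold(window, threshold)
    return labels, threshold
```

In this sketch, a quiet window lowers an initial threshold that was set too high, so later speech of moderate level is no longer ignored, and a window containing speech leaves the threshold where it was.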