Voice Identification Based Programmable Speech Data Filter

IP.com Disclosure Number: IPCOM000018949D
Original Publication Date: 2003-Aug-21
Included in the Prior Art Database: 2003-Aug-21
Document File: 1 page(s) / 39K

Publishing Venue

IBM

Abstract

Speech products, from cell phones to Automatic Speech Recognition (ASR) applications, require high-quality audio input to achieve optimal performance. Applications in which the user is some distance from the microphone, known as far-field microphone applications, are turning to microphone arrays to achieve the necessary audio quality, specifically a high Signal-to-Noise Ratio (SNR). To optimize SNR, the microphone array's Digital Signal Processor (DSP) adjusts its parameters to change the array's sensitivity profile. In essence, a high-sensitivity beam is directed at the intended speaker, and low-sensitivity areas, or nulls, are directed toward detected noise sources. When the speaker of interest is one of several people in proximity to the array, as in a conference room, it is not always clear at whom the array's beam should be aimed. One way to determine which of several speakers is the speaker of interest, and therefore the direction in which to steer the beam, is through the use of Automatic Speaker Identification (ASI).

This is the abbreviated version, containing approximately 52% of the total text.

Microphone arrays have existed for many years. Advances in technology have reduced their cost, complexity, and size, making them practical for many of today's speech applications. Most speech applications use a fixed-beam array, which relies on the intended user being in a fixed, predetermined location relative to the microphone array, as is the case with many current automotive telematics products. Applications in which the user's location relative to the array is not predetermined, or changes during operation, need an array with beam-steering capability. To steer the beam, the array's DSP calculates the location of a speaker and moves a sensitivity beam to that location. The assumption is that any speech comes from the intended user. The problem becomes more complex if the user is one of several speakers in proximity to the array: the system must then determine which person is the intended user of the application and which speakers should be treated as noise. Solving this problem is the subject of this publication.
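
As an illustration of what moving a sensitivity beam involves, the following is a minimal sketch of delay-and-sum steering for a uniform linear array. The publication does not say which beamforming method the DSP uses, so delay-and-sum is an assumption chosen for simplicity; unlike the adaptive arrays described in the abstract, it does not place nulls on noise sources. The function names, the array geometry, and the speed-of-sound constant are all hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def steering_delays(num_mics, spacing_m, angle_deg):
    """Relative arrival delays (seconds) of a plane wave across a linear array.

    The angle is measured from broadside (0 degrees = straight ahead).
    Applying the negative of these delays aligns the channels for that direction.
    """
    theta = math.radians(angle_deg)
    return [m * spacing_m * math.sin(theta) / SPEED_OF_SOUND
            for m in range(num_mics)]


def array_gain(num_mics, spacing_m, steer_deg, source_deg, freq_hz):
    """Normalized delay-and-sum response (0 to 1) to a source at source_deg
    when the beam is steered toward steer_deg."""
    steer = steering_delays(num_mics, spacing_m, steer_deg)
    arrive = steering_delays(num_mics, spacing_m, source_deg)
    # After compensation, each channel keeps a residual phase that depends
    # on the mismatch between the source direction and the steering direction.
    total = sum(cmath.exp(-2j * math.pi * freq_hz * (a - s))
                for a, s in zip(arrive, steer))
    return abs(total) / num_mics
```

With a hypothetical four-microphone array at 5 cm spacing, `array_gain(4, 0.05, 30, 30, 1000)` returns 1.0 for a talker in the steered direction, while a talker at -60 degrees is attenuated to roughly a quarter of that amplitude.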

Automatic Speaker Identification (ASI) is a software function that identifies or excludes a person from a list of known speakers. This publication is based on the combined operation of an ASI function, beam-steering array technology, and a speech application or product. In this configuration, the speech application would have an intended user preprogrammed or otherwise designated. Then, when the target speech application is operating, ASI would be used to identify and...
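
The idea stated in the abstract, using ASI to decide which of several talkers the beam should follow, might be sketched as below. This is only an illustration under stated assumptions: the publication does not describe the ASI algorithm, so a cosine similarity between feature vectors stands in for it, and the function names, the `threshold` parameter, and the data layout are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def choose_beam_target(detections, enrolled_voiceprint, threshold=0.8):
    """Pick the direction of the talker who best matches the intended user.

    detections: list of (angle_deg, voice_features) pairs, one per active talker.
    Returns (beam_angle, noise_angles); beam_angle is None when no talker
    scores at or above the threshold.
    """
    best_angle, best_score = None, threshold
    for angle, features in detections:
        # Stand-in for the ASI function: score each talker against the
        # voiceprint of the preprogrammed intended user.
        score = cosine_similarity(features, enrolled_voiceprint)
        if score >= best_score:
            best_angle, best_score = angle, score
    # Every other talker is treated as a noise source.
    noise_angles = [angle for angle, _ in detections if angle != best_angle]
    return best_angle, noise_angles


# Hypothetical example: three talkers around a conference table.
enrolled = [1.0, 0.0, 0.5]
talkers = [(-40.0, [0.1, 1.0, 0.0]),
           (25.0, [0.9, 0.1, 0.5]),
           (60.0, [0.0, 0.2, 1.0])]
beam_angle, noise_angles = choose_beam_target(talkers, enrolled)
```

Here `beam_angle` is 25.0 and `noise_angles` is `[-40.0, 60.0]`: the beam would be steered toward the matching talker, and the remaining directions would be candidates for nulls.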