Speech Recognition for Telephone Access Systems

IP.com Disclosure Number: IPCOM000120534D
Original Publication Date: 1991-May-01
Included in the Prior Art Database: 2005-Apr-02
Document File: 4 page(s) / 150K

Publishing Venue

IBM

Related People

Davis, GT: AUTHOR [+2]

Abstract

Described is a telephone access speech recognition system using digital transmission of pre-processed speech. The system is an improvement over telephone access speech recognition systems which have very low vocabulary capabilities due to bandwidth limitations and noise induced by the transmission medium.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      The concept provides a means of pre-processing speech at the
handset so as to generate a template for speech data analysis.
Digital transmission of the coded speech data keeps all necessary
information intact, in contrast to the bandwidth limitations of
analog speech.  This is done at a lower rate than log pulse code
modulation (PCM) techniques and at a higher resolution.  A localized
pre-processing unit provides high quality microphone speech input for
a speech recognition unit.  A library of stored template sets at a
data base center is provided enabling user selection before voice
connection occurs.
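
      The rate comparison in the preceding paragraph can be made
concrete with a little arithmetic.  The sketch below is only an
illustration: the log PCM figures (8000 samples per second, 8 bits
per sample) are the standard telephone values, while the frame rate,
coefficient count, and word size of the pre-processed stream are
assumptions chosen for the example and are not given in the text.

```python
# Rough bit-rate comparison. The feature-stream parameters are
# illustrative assumptions, not values taken from the disclosure.

def bit_rate(units_per_second: int, values_per_unit: int, bits_per_value: int) -> int:
    """Bits per second for a stream of fixed-size units."""
    return units_per_second * values_per_unit * bits_per_value

# Log PCM telephone speech: 8000 samples/s, one 8-bit value per sample.
LOG_PCM_BPS = bit_rate(8000, 1, 8)

# Hypothetical pre-processed stream: 100 frames/s, 12 coefficients per
# frame, 16 bits per coefficient (twice the per-value resolution of log PCM).
FEATURE_BPS = bit_rate(100, 12, 16)

if __name__ == "__main__":
    print(f"log PCM : {LOG_PCM_BPS} bit/s")
    print(f"features: {FEATURE_BPS} bit/s")
```

Under these assumed parameters the pre-processed stream needs less
than a third of the log PCM rate while carrying 16-bit values.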

      Distortion in speech quality is often the biggest obstacle to
the recognition of telephone speech.  To minimize the distortion,
the speech is pre-processed at the user's telephone handset and the
information is transmitted as coded digital data.

      With the advent of the integrated services digital network
(ISDN) -1-, the problems concerning transmission line noise are
minimized since the speech is transmitted digitally.  Coding from
and decoding to analog form are done at the transmitting and
receiving ends, respectively.  Bandwidth limitations are overcome by
transmitting pre-processed voice information.

      Fig. 1 shows a block diagram of the typical system consisting
of telephone handset 10, which incorporates a pre-processing
unit, telephone exchange 11, and data base unit 12, with its
automatic speech recognition unit.

      Analog handset 10 is shown in detail in the block diagram of
Fig.  2 and consists of speech pre-processor 13, which analyzes the
speech input to generate digital templates for speech recognition,
and modem 14, which transmits them.
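
      A minimal sketch of the handset side follows, assuming a very
simple pre-processor.  The text does not say which features
pre-processor 13 computes or how the templates are framed for modem
14, so the frame length, the two toy features (log energy and
zero-crossing count), and the byte layout are all hypothetical.

```python
import math
import struct
from typing import List, Sequence

FRAME_LEN = 80  # assumed: 10 ms frames at 8000 samples/s


def frame_features(frame: Sequence[float]) -> List[float]:
    """Two toy features for one frame: log energy and zero-crossing count."""
    energy = sum(x * x for x in frame) / len(frame)
    log_energy = math.log10(energy + 1e-10)  # small offset avoids log(0)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return [log_energy, float(crossings)]


def preprocess(samples: Sequence[float]) -> List[List[float]]:
    """Split speech into whole frames and reduce each to a feature vector."""
    n_frames = len(samples) // FRAME_LEN
    return [
        frame_features(samples[i * FRAME_LEN:(i + 1) * FRAME_LEN])
        for i in range(n_frames)
    ]


def pack_template(template: List[List[float]]) -> bytes:
    """Hypothetical wire format: 2-byte frame count, then big-endian float32s."""
    flat = [v for vec in template for v in vec]
    return struct.pack(f">H{len(flat)}f", len(template), *flat)


def unpack_template(payload: bytes, dims: int = 2) -> List[List[float]]:
    """Inverse of pack_template, as the receiving end would apply it."""
    (n,) = struct.unpack_from(">H", payload, 0)
    flat = struct.unpack_from(f">{n * dims}f", payload, 2)
    return [list(flat[i * dims:(i + 1) * dims]) for i in range(n)]
```

Here `pack_template(preprocess(samples))` yields the bytes that
would be handed to the modem, and `unpack_template` recovers the
feature frames at the far end.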

      Data base unit 12 (Fig. 1), shown in detail in Fig. 3, consists
of phone line interface 15 and modem 18, which demodulates the digital
data for speech recognition unit 19, which in turn sends the
received commands to database interpreter 20.  In the ISDN
environment, the modem sections would be eliminated and coded speech
would be transmitted as digital data at rates as high as 64k bits per
second.
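
      The receiving side can be sketched in the same spirit.  The
text does not name the matching method used by recognition unit 19,
so dynamic time warping is swapped in here as a common choice for
template-based recognizers.  The per-caller template sets, the
caller name, and the command words are hypothetical, and reading
the stored template library as one set per caller is an assumption.

```python
from typing import Dict, List, Sequence

Template = List[List[float]]  # a sequence of feature vectors


def frame_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def dtw_distance(s: Template, t: Template) -> float:
    """Dynamic time warping cost between two feature sequences."""
    inf = float("inf")
    n, m = len(s), len(t)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(s[i - 1], t[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(received: Template, library: Dict[str, Template]) -> str:
    """Return the command whose stored template is nearest to the input."""
    return min(library, key=lambda word: dtw_distance(received, library[word]))


# Hypothetical stored template sets, one per caller, chosen before the call.
TEMPLATE_SETS: Dict[str, Dict[str, Template]] = {
    "caller_a": {
        "open": [[0.0], [1.0], [2.0]],
        "close": [[2.0], [1.0], [0.0]],
    },
}

if __name__ == "__main__":
    received = [[0.0], [0.0], [1.0], [2.0]]  # a time-stretched "open"
    print(recognize(received, TEMPLATE_SETS["caller_a"]))
```

The string returned by `recognize` stands in for the command that
would be passed on to database interpreter 20.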

      Fig. 4 shows a block diagram of speech pre-processor unit 13
(Fig.  2) located in telephone handset 10 and speech recognition unit
19 located in data base unit 12 (Fig. 3). Hardware realization is
performed using digital signal processors 24 and 27, as shown in
Figs....