Remote Capture of PCM

IP.com Disclosure Number: IPCOM000104180D
Original Publication Date: 1993-Mar-01
Included in the Prior Art Database: 2005-Mar-18
Document File: 2 page(s) / 70K

Publishing Venue

IBM

Related People

Daggett, G: AUTHOR [+4]

Abstract

Disclosed is a method of capturing audio input to a speech recognition system from sources remote to the provider of recognition services.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      Automatic Speech Recognition (ASR) systems require audio voice
information from a speaker.  Modern ASR systems take as input a
digital form of audio information called Pulse Code Modulation, or
PCM.  In a client-server model for speech recognition, the ASR system
can be located at a distance from the user, so the PCM must be
captured remotely from the recognizer.  The remote capture of PCM, or
RPCM, is a mechanism which provides audio data to an ASR system.
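
      As an illustration of what PCM data looks like, the following
minimal sketch quantizes a short test tone into 16-bit linear PCM.
The 8 kHz sample rate, 16-bit sample width, 10 ms frame length, and
the names pcm16_encode and sine_wave are assumptions chosen for the
example; the disclosure does not specify any of them.

```python
import math
import struct

SAMPLE_RATE_HZ = 8000  # assumed telephone-grade rate, not from the disclosure


def sine_wave(freq_hz, rate_hz, count):
    """Return `count` samples of a unit-amplitude sine tone as floats."""
    return [math.sin(2 * math.pi * freq_hz * i / rate_hz) for i in range(count)]


def pcm16_encode(samples):
    """Quantize floats in [-1.0, 1.0] to signed 16-bit little-endian PCM bytes.

    Values outside the range are clipped rather than wrapped.
    """
    peak = 32767
    ints = [max(-32768, min(peak, round(s * peak))) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)


# One hypothetical 10 ms frame: 80 samples at 8 kHz, 2 bytes per sample.
frame = pcm16_encode(sine_wave(440.0, SAMPLE_RATE_HZ, 80))
```

A frame like this is the kind of unit a capture component could hand
to the later stages of the mechanism.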

      The method described in this article consists of an audio
capture mechanism which can be connected to an ASR server to
implement an audio channel.  The major components of RPCM are:

1.  Audio Capture

2.  Audio Encoding

3.  Communication Channel

4.  Control.

          Audio Capture - This includes the use of traditional audio
    capture components such as microphone preamplifiers and
    analog-to-digital converters.  Computer workstations often
    include these functions as standard features, or else can provide
    these functions by adding an adapter specifically designed for
    this purpose.  Audio capture can also be accomplished by
    traditional telephone systems, including digital telephones or
    digital PBX systems.

          Audio Encoding - There are a variety of digital
    representations of audio, each with a different bandwidth (bits
    of data per second of audio), audio quality (after decoding), and
    processing requirement (to encode and decode).  The RPCM
    mechanism can potentially use any existing or future encoding
    scheme, although some are better suited than others.  The actual
    encoding scheme used to digitally represent the audio data is not
    critical to this method.
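
          The bandwidth trade-off can be made concrete with a small
    sketch.  It compares the bit rate of 16-bit linear PCM with an
    8-bit mu-law style companded encoding at the same sample rate.
    The 8 kHz rate and both encodings are assumptions picked for
    illustration, since the disclosure names no particular scheme,
    and the companding below uses the continuous mu-law formula
    rather than a bit-exact telephony codec.

```python
import math

MU = 255  # companding constant commonly used for mu-law; assumed here


def bit_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Bandwidth in bits of data per second of audio."""
    return sample_rate_hz * bits_per_sample * channels


def mulaw_compress(x):
    """Map a sample in [-1.0, 1.0] to an 8-bit code (0..255)."""
    x = max(-1.0, min(1.0, x))
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((y + 1.0) / 2.0 * 255))


def mulaw_expand(code):
    """Approximate inverse of mulaw_compress; returns a float in [-1.0, 1.0]."""
    y = code / 255 * 2.0 - 1.0
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


linear_bps = bit_rate(8000, 16)     # 16-bit linear PCM
companded_bps = bit_rate(8000, 8)   # 8-bit companded samples
```

    Under these assumptions the companded form needs half the
    bandwidth of the linear form, at the cost of extra processing
    and some loss of accuracy after decoding.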

          Communication Channel - The Communication Channel provides
    the means for the ASR Server to access the audio information.  It
    consists of a logical layer which formats messages and a
    communication mecha...