Speech Storage Time Markers for Variable Bit Rate Speech Coders

IP.com Disclosure Number: IPCOM000113667D
Original Publication Date: 1994-Sep-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 2 page(s) / 70K

Publishing Venue

IBM

Related People

Ware, MS: AUTHOR

Abstract

Disclosed is a data structure with a header called Speech Storage Time Markers, and techniques that use these time markers to overcome disadvantages associated with variable rate speech coding.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      The key advantage of variable rate speech coding is that every
speech block is coded with only the bits it actually requires, and
this bit requirement varies through time.  Every speech block
generates some number of bits based on its "natural" information rate
rather than some arbitrary fixed amount.  A fixed rate coder, by
contrast, requires that the speech coder generate the same number of
bits for every block of time.  Since silence between words and
sentences requires very few bits, a variable rate coder can take
advantage of this by coding a smaller number of bits for silence
blocks.  A fixed rate coder cannot.

      In speech coding, digital speech samples are normally processed
as a block or group in fixed units of time.  For example, one might
process 32 speech samples produced by an analog-to-digital converter
sampling an analog speech waveform 8000 times per second.  Each
speech block then spans 125 microseconds per sample times 32 samples,
or 4 milliseconds of time.  Every 4 milliseconds, another 32 samples
are processed.
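
      The block timing arithmetic above can be checked with a short
sketch.  The function names below are illustrative assumptions and do
not come from the disclosure; only the figures (8000 samples per
second, 32 samples per block) are taken from the text.

```python
from fractions import Fraction


def sample_period_us(sample_rate_hz: int) -> Fraction:
    """Time between consecutive samples, in microseconds."""
    return Fraction(1_000_000, sample_rate_hz)


def block_duration_ms(samples_per_block: int, sample_rate_hz: int) -> Fraction:
    """Duration of one speech block, in milliseconds."""
    return Fraction(samples_per_block * 1000, sample_rate_hz)


# Figures from the text: 8000 samples per second, 32 samples per block.
print(sample_period_us(8000))       # 125
print(block_duration_ms(32, 8000))  # 4
```

Exact fractions are used here only so that the result is not subject
to floating point rounding.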

      In a fixed rate speech coder running at, e.g., 16k bits per
second, every 4 millisecond speech block would be allotted a total of
64 bits (16,000 bits per second times 0.004 seconds).  At no point
can a fixed rate coder change this 64 bit figure, regardless of the
information content in the speech block.
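
      As a minimal sketch of the fixed rate allotment, assuming the
block size and sampling rate given earlier, the per-block bit budget
follows directly from the bit rate.  The function name is a
hypothetical helper, not part of the disclosure.

```python
def fixed_bits_per_block(bit_rate_bps: int,
                         samples_per_block: int,
                         sample_rate_hz: int) -> int:
    """Bits allotted to every block by a fixed rate coder.

    bits = bit_rate * (samples_per_block / sample_rate)
    """
    bits, remainder = divmod(bit_rate_bps * samples_per_block, sample_rate_hz)
    if remainder:
        # A fixed rate coder needs a whole number of bits per block.
        raise ValueError("bit rate does not divide evenly into blocks")
    return bits


# 16k bits per second, 32-sample blocks at 8000 samples per second.
print(fixed_bits_per_block(16_000, 32, 8000))  # 64
```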

      A variable rate speech coder achieves some average bit rate
through time, e.g., around 16k bits per second.  Rather than being
required to use 64 bits for each speech block, it can use as few as
1 bit per block...
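
      The idea of an average bit rate over variable-size blocks can
be illustrated as follows.  This is only a sketch: the per-block bit
counts are invented for illustration (two near-silent blocks at 1 bit
each, the rest assumed to carry speech), and the function name is
hypothetical.  The disclosure does not specify how bits are
distributed among blocks.

```python
def average_bit_rate_bps(block_bits: list[int],
                         samples_per_block: int = 32,
                         sample_rate_hz: int = 8000) -> float:
    """Average bit rate over a run of equal-duration speech blocks."""
    if not block_bits:
        raise ValueError("need at least one block")
    total_bits = sum(block_bits)
    total_samples = len(block_bits) * samples_per_block
    # bits / seconds, with seconds = total_samples / sample_rate_hz
    return total_bits * sample_rate_hz / total_samples


# Hypothetical bit counts for six consecutive 4 millisecond blocks.
variable_blocks = [1, 1, 120, 134, 64, 64]
fixed_blocks = [64] * 6

print(average_bit_rate_bps(variable_blocks))  # 16000.0
print(average_bit_rate_bps(fixed_blocks))     # 16000.0
```

Both sequences average 16k bits per second, but only the variable
sequence spends very few bits on the near-silent blocks and more on
the others.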