Low Bit Rate Predictive Voice Encoding

IP.com Disclosure Number: IPCOM000086467D
Original Publication Date: 1976-Sep-01
Included in the Prior Art Database: 2005-Mar-03
Document File: 3 page(s) / 46K

Publishing Venue

IBM

Related People

Esteban, D: AUTHOR [+2]

Abstract

This is a low-bit-rate predictive voice coding system that makes it possible to predetermine, in real time, all the basic parameters needed to encode and decode the voice signal efficiently.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 54% of the total text.

Speech sounds are produced when an excitation signal, either larynx pulses or noise, excites the resonances of the vocal tract. These resonances, convolved with the excitation signal, yield a speech signal with spectral redundancy. Removing this redundancy at the transmitting end of a transmission system reduces the bit rate required to code the speech signal. If certain parameters are transmitted with the coded speech, the redundancy can then be reinstated at the receiving end of the system, recovering the original speech information with a good signal-to-noise ratio.
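The remove-then-reinstate idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the redundancy is modelled by a fixed linear predictor, and the two coefficients and the sample values are made up for the example rather than taken from the disclosure.

```python
def remove_redundancy(samples, coeffs):
    """Transmitter side: subtract a linear prediction from each sample."""
    residual = []
    for n, s in enumerate(samples):
        # Prediction from earlier input samples; samples before the start count as 0.
        prediction = sum(a * samples[n - i]
                         for i, a in enumerate(coeffs, start=1) if n - i >= 0)
        residual.append(s - prediction)
    return residual


def reinstate_redundancy(residual, coeffs):
    """Receiver side: add the same prediction back, using already decoded samples."""
    decoded = []
    for n, e in enumerate(residual):
        prediction = sum(a * decoded[n - i]
                         for i, a in enumerate(coeffs, start=1) if n - i >= 0)
        decoded.append(e + prediction)
    return decoded


# Hypothetical two-coefficient predictor and a short made-up sample block.
coeffs = [1.5, -0.75]
samples = [0.0, 1.0, 1.5, 1.5, 1.125, 0.5625]

residual = remove_redundancy(samples, coeffs)
decoded = reinstate_redundancy(residual, coeffs)
```

For this particular block the residual collapses to a single pulse, `[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]`, and `decoded` equals `samples`. The receiver can rebuild the signal only because it uses the same coefficients, which is why they have to be transmitted along with the coded speech.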

This redundancy is usually removed by deconvolving the signal with a predictive filter. Since the excitation signal varies slowly (with a time constant on the order of 15 to 25 ms for a normal human voice), the filter coefficients can be computed in real time and transmitted periodically, once per block of samples, together with the residual signal.
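To give a feel for the block size this implies, the following sketch computes how many samples one set of coefficients would cover and cuts a sample stream into such blocks. The 8 kHz sampling rate and the 20 ms block duration are assumptions chosen for illustration; the text gives only the 15 to 25 ms range.

```python
def samples_per_block(sample_rate_hz, block_ms):
    """Number of samples covered by one set of filter coefficients."""
    return sample_rate_hz * block_ms // 1000


def split_into_blocks(samples, block_len):
    """Cut the sample stream into consecutive blocks; the last one may be short."""
    return [samples[i:i + block_len] for i in range(0, len(samples), block_len)]


SAMPLE_RATE_HZ = 8000   # assumption: not given in the source
BLOCK_MS = 20           # assumption: picked from the 15 to 25 ms range

block_len = samples_per_block(SAMPLE_RATE_HZ, BLOCK_MS)   # 160 samples
blocks = split_into_blocks(list(range(400)), block_len)   # blocks of 160, 160, 80
```

Under these assumptions the coefficients would be refreshed 50 times per second, once for every 160 residual samples.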

Accordingly, in this system the voice signal samples S(n) are fed into a deconvolver consisting mainly of a buffer memory (BUF 1) that feeds both a predictor and a processor. The predictor is a filter whose transfer function in the z domain is:

(Image Omitted)
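The equation itself is not recoverable from this extract. For orientation only, a linear predictor is conventionally written in the form below, which is consistent with the a(i) coefficients and the subtraction from S(n) described in the text. The order p is an assumed symbol whose value is not given here, and the original figure may use different notation.

```latex
P(z) = \sum_{i=1}^{p} a(i)\, z^{-i},
\qquad
e(n) = S(n) - \sum_{i=1}^{p} a(i)\, S(n-i)
```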

The a(i) coefficients are computed in the processor, using any known method (see for instance IEEE Trans. on Audio and Electroacoustics, Vol. AU-20, pp. 69-79, April 1973). The predictor output is subtracted from each sample S(n) to give the residual samples e(n), from which the spectral redundancy has thus been removed.
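As one example of a "known method", the sketch below uses the autocorrelation method solved with the Levinson-Durbin recursion. The text does not say which method the processor uses, so this choice is an assumption, as are the predictor order and the synthetic test block.

```python
def autocorrelation(block, max_lag):
    """r[k] = sum over n of block[n] * block[n + k], for k = 0..max_lag."""
    n = len(block)
    return [sum(block[j] * block[j + k] for j in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations.

    Returns (coeffs, err): coeffs[i - 1] is a(i), the weight of S(n - i) in the
    prediction, and err is the remaining prediction error energy.
    """
    a = [0.0] * (order + 1)   # a[0] is unused
    err = r[0]                # error energy before any prediction
    for m in range(1, order + 1):
        if err <= 0.0:
            break             # silent or perfectly predictable block
        k = (r[m] - sum(a[i] * r[m - i] for i in range(1, m))) / err
        prev = a[:]
        a[m] = k
        for i in range(1, m):
            a[i] = prev[i] - k * prev[m - i]
        err *= 1.0 - k * k
    return a[1:], err


ORDER = 2                                  # assumption: order not given in the source
block = [0.9 ** n for n in range(200)]     # made-up decaying signal, not speech
r = autocorrelation(block, ORDER)
coeffs, err = levinson_durbin(r, ORDER)
```

For this test block the first coefficient comes out very close to 0.9 and the second close to zero, and `err` is well below `r[0]`, which indicates how much energy the predictor removes before the residual is coded.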

The residual e(n) is not, however, free of all redundancy in the time domain. The bit rate required to code it may be lowered further by removing...