
Reduction of the Voice Excited Vocoder Data

IP.com Disclosure Number: IPCOM000091426D
Original Publication Date: 1968-Jan-01
Included in the Prior Art Database: 2005-Mar-05
Document File: 3 page(s) / 57K

Publishing Venue

IBM

Related People

Buron, R: AUTHOR

Abstract

The voice excited vocoder is a nine-channel vocoder with a base-band. The data format before reduction is obtained by coding the signals delivered by the voice excited vocoder.

This is the abbreviated version, containing approximately 54% of the total text.



The base-band is sampled according to the duet sampling process. Each sample is coded as a six-bit word. The first bit is 1 if the signal is positive and 0 if it is negative. The following five bits give the value in log PCM form. The particular value 0 is encoded as 000000.
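The six-bit word described above can be sketched as follows. This is only an illustration: the text does not give the log PCM law, so the sketch assumes a mu-law style curve with MU = 255 and an input normalized to [-1.0, 1.0]. The function names are hypothetical, and reserving magnitude code 0 so that only the value 0 maps to 000000 is also an assumption.

```python
import math

SIGN_BIT = 0b100000             # first bit: 1 = positive, 0 = negative
MAG_BITS = 5                    # five following bits: log PCM value
MAG_MAX = (1 << MAG_BITS) - 1   # largest five-bit code, 31
MU = 255.0                      # ASSUMED compression constant, not from the source


def encode_sample(x: float) -> int:
    """Encode a sample in [-1.0, 1.0] as a six-bit word (sign bit + 5-bit log code)."""
    if x == 0:
        return 0b000000         # the particular value 0 is encoded as 000000
    x = max(-1.0, min(1.0, x))  # clip to the assumed input range
    compressed = math.log1p(MU * abs(x)) / math.log1p(MU)
    # ASSUMPTION: magnitude code 0 is reserved, so a small negative sample
    # never collides with the all-zero word.
    magnitude = max(1, round(compressed * MAG_MAX))
    sign = SIGN_BIT if x > 0 else 0
    return sign | magnitude


def decode_sample(word: int) -> float:
    """Invert encode_sample, up to quantization error."""
    if word == 0:
        return 0.0
    magnitude = word & MAG_MAX
    value = math.expm1(magnitude / MAG_MAX * math.log1p(MU)) / MU
    return value if word & SIGN_BIT else -value
```

A logarithmic code spends more of its 31 non-zero levels on small amplitudes, which is why it suits speech. The actual quantization curve of the vocoder may differ from the one assumed here.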

For each pair of base-band samples, a different channel is sampled, and the output of this channel is coded with four bits. These four bits are combined with the two consecutive base-band samples (2 x 6 + 4 = 16 bits) to provide two bytes of eight bits, as in drawing A. Thus eighteen base-band samples are associated with the nine channel samples, as in drawing B. If the speech is sampled every 15 msec, these 18 x 8 = 144 bits represent 15 msec of speech. This corresponds to a bit rate of 144 / 0.015 = 9,600 bits per second. This is format F1.
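The F1 frame arithmetic can be checked with a short sketch. Drawing A is not reproduced in this text, so the bit layout below is an assumption: each byte holds one six-bit base-band word in its upper bits and half of the four-bit channel value in its lower two bits. The names pack_pair, pack_f1_frame and bit_rate are hypothetical.

```python
CHANNELS = 9           # nine vocoder channels
FRAME_PERIOD_MS = 15   # one full frame every 15 msec


def pack_pair(sample_a: int, sample_b: int, channel_value: int) -> bytes:
    """Pack two six-bit base-band words and one four-bit channel value into two bytes."""
    if not (0 <= sample_a < 64 and 0 <= sample_b < 64):
        raise ValueError("base-band words must fit in six bits")
    if not 0 <= channel_value < 16:
        raise ValueError("channel value must fit in four bits")
    # ASSUMED layout: upper two channel bits ride with the first sample,
    # lower two channel bits with the second.
    first = (sample_a << 2) | (channel_value >> 2)
    second = (sample_b << 2) | (channel_value & 0b11)
    return bytes([first, second])


def pack_f1_frame(baseband_words, channel_values) -> bytes:
    """Build one 18-byte F1 frame from 18 base-band words and 9 channel values."""
    if len(baseband_words) != 2 * CHANNELS or len(channel_values) != CHANNELS:
        raise ValueError("expected 18 base-band words and 9 channel values")
    frame = bytearray()
    for i, channel_value in enumerate(channel_values):
        frame += pack_pair(baseband_words[2 * i], baseband_words[2 * i + 1], channel_value)
    return bytes(frame)


def bit_rate(frame_bytes: int, period_ms: int = FRAME_PERIOD_MS) -> float:
    """Bits per second for a frame of frame_bytes bytes sent every period_ms."""
    return frame_bytes * 8 * 1000 / period_ms
```

With eighteen bytes per frame, bit_rate(18) reproduces the 9,600 bits per second stated for format F1.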

To achieve a less redundant coding of the signals delivered by a voice excited vocoder, certain properties of the speech signals are taken into account. Statistical analysis of human speech shows that pauses represent 30% of the time and that unvoiced consonants having no energy below 900 cps represent about 15% of the time.

Three additional data formats are defined. F2, which is very similar to F1, is for pitched sections of the voice. F3 is for unvoiced consonants. F4 is for pauses.

Format F2 is identical to F1 except for a special byte at the beginning, which allows the frame to be recognized as a pitched sound format. The basic frame contains nineteen bytes (the special byte plus the eighteen bytes of F1) as in...