Speech Compression and Reconstruction

IP.com Disclosure Number: IPCOM000052774D
Original Publication Date: 1981-Jul-01
Included in the Prior Art Database: 2005-Feb-11
Document File: 2 page(s) / 42K

Publishing Venue

IBM

Related People

Nassimbene, EG: AUTHOR

Abstract

This invention relates to a method for the lossy compression, recording, and expansion of asymmetrical speech waves. The waves are compressed by sampling a leading partial-cycle segment of each pitch period, extracting pitch and contour information, and then digitizing and recording the sample. Only the first cycle of each pitch period is used for the compression and reconstruction of speech. The method is premised on the observation that, within most pitch periods, the first one-fifth to one-fourth of the waveform is significantly larger in amplitude than the portions that follow. Further, this first one-fifth to one-fourth of the waveform contains nearly all the frequency components found in the remainder of the waveform.

This invention relates to a method for the lossy compression, recording, and expansion of asymmetrical speech waves. The waves are compressed by sampling a leading partial-cycle segment of each pitch period, extracting pitch and contour information, and then digitizing and recording the sample. Only the first cycle of each pitch period is used for the compression and reconstruction of speech. The method is premised on the observation that, within most pitch periods, the first one-fifth to one-fourth of the waveform is significantly larger in amplitude than the portions that follow. Further, this first one-fifth to one-fourth of the waveform contains nearly all the frequency components found in the remainder of the waveform. Consequently, this fractional part of the waveform makes the most significant contribution to the perceived sound.
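
The compression step described above can be illustrated with a minimal Python sketch. It is not the disclosure's implementation: the names PitchRecord, compress_voiced, pitch_marks, and keep_fraction are hypothetical, the pitch-period boundaries are assumed to be supplied by some separate pitch detector, and the default of one-fourth is simply one end of the range the text mentions.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class PitchRecord:
    segment: List[float]   # leading samples kept from one pitch period
    period_len: int        # original length of that pitch period, in samples


def compress_voiced(samples: Sequence[float],
                    pitch_marks: Sequence[int],
                    keep_fraction: float = 0.25) -> List[PitchRecord]:
    """Keep only the leading fraction of each pitch period.

    pitch_marks lists the sample index where each pitch period starts,
    followed by one final index marking the end of the last period.
    (Assumption: these marks are given; pitch detection is out of scope.)
    """
    records = []
    for start, end in zip(pitch_marks, pitch_marks[1:]):
        period_len = end - start
        # Number of leading samples to retain, at least one per period.
        keep = max(1, round(period_len * keep_fraction))
        segment = list(samples[start:start + keep])
        records.append(PitchRecord(segment, period_len))
    return records
```

With two pitch periods of 8 and 12 samples and keep_fraction=0.25, this sketch retains 2 and 3 samples respectively, so 5 samples plus two period lengths stand in for the original 20 samples.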

Referring now to the figure, there is shown an original asymmetrical waveform, the portion extracted from each wave, the compressed version of the waveform, and its reconstruction. When the small portion of a pitch period is stored, the length of time the original pitch period lasted is also stored. When an unvoiced (nonvoicing) sound is encountered, one of two procedures can be used: (1) digitizing and storing the entire waveform, or (2) storing only one millisecond of sound along with the length of time it should last for those nonvoicing sounds which are basically repetitive whi...
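
Procedure (2) for unvoiced sounds might be sketched in Python as follows. The names UnvoicedRecord, compress_unvoiced, and expand_unvoiced are hypothetical. The expansion step is an assumption: the available text says only that one millisecond of sound and its intended duration are stored, so repeating the stored snippet to fill that duration is a guess at how playback could work, not a statement of the disclosed method.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class UnvoicedRecord:
    snippet: List[float]   # roughly one millisecond of the unvoiced sound
    duration: int          # how long the sound should last, in samples


def compress_unvoiced(samples: Sequence[float],
                      sample_rate: int) -> UnvoicedRecord:
    """Procedure (2): keep about 1 ms of sound plus its total duration."""
    snippet_len = max(1, sample_rate // 1000)   # samples in one millisecond
    return UnvoicedRecord(list(samples[:snippet_len]), len(samples))


def expand_unvoiced(record: UnvoicedRecord) -> List[float]:
    """Assumed expansion: repeat the snippet until the duration is filled."""
    if not record.snippet:
        return [0.0] * record.duration
    repeats = -(-record.duration // len(record.snippet))   # ceiling division
    return (record.snippet * repeats)[:record.duration]
```

At an 8 kHz sample rate this sketch stores 8 samples per unvoiced sound regardless of its length, and the expanded output always has the same number of samples as the original.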