
New Applications & Recent Research

IP.com Disclosure Number: IPCOM000131460D
Original Publication Date: 1980-Sep-01
Included in the Prior Art Database: 2005-Nov-11
Document File: 3 page(s) / 19K

Publishing Venue

Software Patent Institute

Related People

Demetrios A. Michalopoulos: AUTHOR

Abstract

California State University, Fullerton. IBM reports progress in speech recognition and transcription. Scientists at IBM Research have used a computer to transcribe speech, composed of sentences drawn from a 1000-word vocabulary and read at a normal speaking pace, into printed form with what is believed to be the best accuracy yet obtained under complex experimental conditions -- 91 percent. The experimental laboratory results represent an "encouraging early step along an enormously difficult path that someday may lead to computer recognition of unlimited speech," said Frederick Jelinek, head of the continuous speech recognition group at the IBM Thomas J. Watson Research Center, Yorktown. He reported the results at the 1980 SAE Congress and Exposition, sponsored by the Society of Automotive Engineers.

This is the abbreviated version, containing approximately 32% of the total text.


This record contains textual material that is copyright © 1980 by the Institute of Electrical and Electronics Engineers, Inc. All rights reserved. Contact the IEEE Computer Society http://www.computer.org/ (714-821-8380) for copies of the complete work that was the source of this textual material and for all use beyond that as a record from the SPI Database.


New Applications Editor: Prof. Demetrios A. Michalopoulos

California State University, Fullerton

IBM reports progress in speech recognition and transcription

Scientists at IBM Research have used a computer to transcribe speech, composed of sentences drawn from a 1000-word vocabulary and read at a normal speaking pace, into printed form with what is believed to be the best accuracy yet obtained under complex experimental conditions -- 91 percent.

The experimental laboratory results represent an "encouraging early step along an enormously difficult path that someday may lead to computer recognition of unlimited speech," said Frederick Jelinek, head of the continuous speech recognition group at the IBM Thomas J. Watson Research Center, Yorktown. He reported the results at the 1980 SAE Congress and Exposition, sponsored by the Society of Automotive Engineers.

Although much work remains before continuous speech recognition devices can come into practical use, laboratory results indicate that this goal stands a reasonable chance of being achieved, Jelinek told the meeting. Jelinek visualizes the ideal voice recognition device as one that instantaneously transcribes speech as a person talks into a microphone and also offers a verbal editing feature for correcting mistakes and making immediate revisions -- in effect, a very advanced dictation machine.

The task of having a computer "recognize" continuous speech is far different from the jobs for which speech-input devices are being used today, such as sorting packages by destination codes or controlling inventory. These devices typically use built-in microprocessors to respond to words from a very small vocabulary, enunciated very carefully.

The voice recognition experiments are carried out on an IBM 370/168 in a "quiet room" environment with high fidelity equipment. The speaker talks into a microphone, and after a period of analysis that may be very long, the words as recognized by the computer appear on a CRT.

Technical background.

Ordinary speech is made up of very complex variables, and therefore poses some difficulty for analysis by computer. Several major factors must be considered.

One is word separation. In normal speech, one word follows another rapidly and continuously; therefore, the computer must have some way of identifying, within a speech stream, the end of one word and the beginning of another. The problem is like listening to a foreign language you don't understand and trying to pick out individual words. Anot...