
Realtime Speech Power Indicator

IP.com Disclosure Number: IPCOM000061337D
Original Publication Date: 1986-Jul-01
Included in the Prior Art Database: 2005-Mar-09
Document File: 3 page(s) / 82K

Publishing Venue

IBM

Related People

Kuroda, A: AUTHOR

Abstract

A power indicator is proposed as a device to improve the user interface for speech input. It enables users to check the power of their utterances correctly and easily in real time. A number of speech recognizers and speech filing/messaging systems have been announced, but they offer no efficient interface for a user to check the loudness of an utterance. For example, one system displays a console message such as "Speak louder!" when speech is too quiet and "Speak lower!" when it is too loud. However, the user has no means of knowing how much louder or softer to speak. Another system displays a bar whose length changes according to the momentary power or the peak power over a short period of time. This provides a better user interface than the message-based approach.

This is the abbreviated version, containing approximately 48% of the total text.


A power indicator is proposed as a device to improve the user interface for speech input. It enables users to check the power of their utterances correctly and easily in real time. A number of speech recognizers and speech filing/messaging systems have been announced, but they offer no efficient interface for a user to check the loudness of an utterance.

For example, one system displays a console message such as "Speak louder!" when speech is too quiet and "Speak lower!" when it is too loud. However, the user has no means of knowing how much louder or softer to speak. Another system displays a bar whose length changes according to the momentary power or the peak power over a short period of time. This provides a better user interface than the message-based approach, but ambiguity remains, since the bar conveys only momentary information. In addition, when overflow occurs during A/D (analog-to-digital) conversion, the digital values saturate at the maximum value and the input speech is significantly distorted, yet the user has no direct way of knowing this: the values are saturated before the input power is calculated, so the length of the bar changes little.

The speech power indicator proposed here provides a better interface for checking the loudness of an utterance: it keeps and displays the speech power profile of the latest few seconds, instead of displaying the momentary power or the peak power over a short period of time.
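The saturation problem described above can be illustrated with a small numerical sketch. It is not taken from the disclosure: the 16-bit full-scale value, the 160-sample frame, and the helper names `saturate`, `log_power_db`, and `sine_frame` are all assumptions chosen for illustration. It compares the power gap between a full-scale tone and a tone four times larger, before and after the converter clamps the samples.

```python
import math

FULL_SCALE = 32767  # assumed 16-bit converter; the disclosure does not state the word length


def saturate(samples, full_scale=FULL_SCALE):
    """Clamp samples as an overloaded A/D converter would, and count the clamped ones."""
    clipped = [max(-full_scale, min(full_scale, s)) for s in samples]
    overflow_count = sum(1 for s in samples if abs(s) > full_scale)
    return clipped, overflow_count


def log_power_db(samples):
    """Mean-square power of one frame in decibels, floored to avoid log(0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(max(mean_square, 1e-12))


def sine_frame(amplitude, n=160, cycles=4):
    """Hypothetical test signal: a whole number of sine cycles in one frame."""
    return [amplitude * math.sin(2 * math.pi * cycles * i / n) for i in range(n)]


just_full = sine_frame(FULL_SCALE)      # peaks exactly at full scale
too_loud = sine_frame(4 * FULL_SCALE)   # four times the amplitude the converter can represent

clipped_ok, count_ok = saturate(just_full)
clipped_loud, count_loud = saturate(too_loud)

# Gap in the analog domain: about 12 dB.
true_gap = log_power_db(too_loud) - log_power_db(just_full)
# Gap after saturation: under about 3 dB, because a clipped sine
# can at most approach the power of a full-scale square wave.
seen_gap = log_power_db(clipped_loud) - log_power_db(clipped_ok)

print(f"true gap {true_gap:.1f} dB, gap after clipping {seen_gap:.1f} dB, "
      f"clamped samples {count_loud}")
```

Under these assumptions the power computed after clipping moves only slightly while the count of clamped samples rises sharply, which is why a separate overflow count is a more direct warning signal than the bar length.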

With this profile on screen, a user can trace the power transition in real time. Overflow in the A/D conversion process is also checked; if significant overflow (described later) is detected, the corresponding point is displayed in another color as a warning of too-loud speech.

Base System (Fig. 1)

The power indicator is implemented on the following system:

- A personal computer system with a bit-mapped color display.
- An acoustic processing card, supplied as an optional card for the system, and a microphone/speaker or a telephone handset.

The signal processor on the acoustic processing card performs input speech analysis and produces:

- logarithmic power
- A/D conversion overflow count (described later)
- other application-dependent speech features, for example:
  - ADPCM data
  - linear predictive coefficients

Speech Power Indication

Fig. 2 shows a display panel of the realtime speech power indicator. When a user says "Sapporo" (the name of a Japanese city), the panel displays the logarithmic power of the utterance in real time. The display area is wide enough to hold at least one word of utterance (approximately 2 seconds in our case).

For realtime display:

1. The acoustic processing card calculates the logarithmic power for one frame and stores it in shared memory once per frame period (10-20 ms), then raises an interrupt request to the PC.
2. The interrupt service routine on the PC gets the logarithmic power and saves it to a PC-lo...
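The rolling profile described in this section could be kept in a fixed-length buffer, as in the following minimal sketch. It is an illustration under stated assumptions, not the disclosed implementation: the class `PowerProfile`, its methods, the 10 ms frame period (the text gives 10-20 ms), and the `OVERFLOW_THRESHOLD` value are hypothetical, and the criterion for "significant" overflow is defined in a part of the source not reproduced here.

```python
from collections import deque

FRAME_MS = 10            # assumed; the text gives a frame period of 10-20 ms
WINDOW_MS = 2000         # roughly 2 seconds of display width, per the text
OVERFLOW_THRESHOLD = 3   # hypothetical; the source's "significant" criterion is not shown


class PowerProfile:
    """Keeps the most recent frames so the whole profile can be redrawn each frame."""

    def __init__(self, frame_ms=FRAME_MS, window_ms=WINDOW_MS):
        # A bounded deque discards the oldest frame automatically when full.
        self.frames = deque(maxlen=window_ms // frame_ms)

    def on_frame(self, log_power, overflow_count):
        """Stand-in for the interrupt service routine: record one frame's values."""
        self.frames.append((log_power, overflow_count))

    def columns(self, threshold=OVERFLOW_THRESHOLD):
        """Return (log_power, color) pairs, oldest first, one per display column."""
        return [
            (power, "warning" if overflows >= threshold else "normal")
            for power, overflows in self.frames
        ]


profile = PowerProfile()
for i in range(250):                 # 2.5 s of frames; only the last 2 s are kept
    loud = 120 <= i < 125            # five consecutive frames with heavy clipping
    profile.on_frame(log_power=60.0 if loud else 40.0,
                     overflow_count=5 if loud else 0)

cols = profile.columns()
warnings = [index for index, (_, color) in enumerate(cols) if color == "warning"]
print(len(cols), warnings)
```

A bounded buffer keeps memory use constant and makes the oldest frame scroll off the display as each new frame arrives; the color tag per column corresponds to drawing overflowed points in a different color.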