Autocorrelation-Faces: an Aid to Deaf Children Learning to Speak

IP.com Disclosure Number: IPCOM000059662D
Original Publication Date: 1986-Jan-01
Included in the Prior Art Database: 2005-Mar-08
Document File: 3 page(s) / 48K

Publishing Venue

IBM

Related People

Evangelisti, CJ: AUTHOR [+2]

Abstract

Various graphical methods of representing multivariate data using icons, or symbols, have been discussed previously [1,2,3,4,5,6]. In general, data parameters are each mapped into a figure with features, each feature varying in size or shape according to the point's coordinate in that dimension. One particularly novel method of representing multivariate data has been presented by Chernoff [1]. The data sample variables are mapped to facial characteristics; thus, each multivariate observation is visualized as a computer-drawn face. Such faces have been shown to be more reliable and more memorable than other tested icons [2] and allow the human analyst to grasp many of the essential regularities and irregularities in the data.


Autocorrelation-Faces: an Aid to Deaf Children Learning to Speak

Various graphical methods of representing multivariate data using icons, or symbols, have been discussed previously [1,2,3,4,5,6]. In general, data parameters are each mapped into a figure with features, each feature varying in size or shape according to the point's coordinate in that dimension. One particularly novel method of representing multivariate data has been presented by Chernoff [1]. The data sample variables are mapped to facial characteristics; thus, each multivariate observation is visualized as a computer-drawn face. Such faces have been shown to be more reliable and more memorable than other tested icons [2] and allow the human analyst to grasp many of the essential regularities and irregularities in the data. This aspect of the graphical point displays capitalizes on the feature-integration abilities of the human visual system, particularly at higher levels of cognitive processing [2].

In the current application, ten facial parameters (F1, F2, ..., F10) are used, and each facial characteristic has ten settings (S1, S2, ..., S10), providing for 10 billion possible different faces. The controlled features are: head eccentricity, eye eccentricity, pupil size, eyebrow slant, nose size, mouth shape, eye spacing, eye size, mouth length, and degree of mouth opening. The mouth is constructed using parabolic interpolation routines, and the other features are derived from circles, lines, and ellipses.

A pilot study was conducted to determine whether people without formal training in phonetics or acoustics, and with no preparation, could group speech sounds represented by computer-drawn faces computed from voice input (Figs. 1, 2, 3). Nine isolated sounds were used, with three examples of each sound. The sounds covered a range of classes: four fricatives (two voiced (Z, V) and two unvoiced (S, SH)), three vowel sounds (EE, AA, UU), and two nasal sounds (M, N).

The 10 facial parameters were computed from the first 10 points of the autocorrelation function of a 50 ms segment of the speech, sampled at 10 kHz. The autocorrelation of a signal x(n) with lag k is defined as:

(Image Omitted)
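The equation itself is suppressed in this copy; presumably it is the usual short-time autocorrelation taken across the 50 ms segment, R(k) = sum over n of x(n)·x(n+k) for k = 0, 1, ..., 9, possibly normalized by R(0). The sketch below illustrates the pipeline described above under that assumption; NumPy/Matplotlib, the normalization by R(0), and the assignment of autocorrelation points to particular facial features are illustrative choices, not the original IBM implementation.

    # Minimal sketch (assumptions noted above): compute the first 10 autocorrelation
    # points of a 50 ms speech segment sampled at 10 kHz, quantize them to ten
    # settings, and render a toy Chernoff-style face from circles, lines, ellipses,
    # and a parabolic mouth. The setting-to-feature assignments are hypothetical.

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.patches import Circle, Ellipse


    def autocorr_points(x, n_lags=10):
        """R(k) = sum_n x(n) x(n+k) for k = 0..n_lags-1, normalized by R(0)."""
        x = np.asarray(x, dtype=float)
        r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(n_lags)])
        return r / r[0] if r[0] != 0 else r


    def to_settings(r, n_settings=10):
        """Map normalized autocorrelation values in [-1, 1] to settings 1..n_settings."""
        return np.clip(np.round((r + 1.0) / 2.0 * (n_settings - 1)) + 1,
                       1, n_settings).astype(int)


    def draw_face(settings, ax):
        """Draw a toy face from ten settings (each 1..10)."""
        t = (np.asarray(settings, dtype=float) - 1.0) / 9.0    # rescale to [0, 1]
        ax.add_patch(Ellipse((0, 0), 2.0, 1.4 + 0.8 * t[0], fill=False))  # head eccentricity
        dx = 0.25 + 0.25 * t[6]                                 # eye spacing
        eye_w = 0.15 + 0.2 * t[7]                               # eye size
        eye_h = eye_w * (0.5 + t[1])                            # eye eccentricity
        for sign in (-1, 1):
            ax.add_patch(Ellipse((sign * dx, 0.3), eye_w, eye_h, fill=False))
            ax.add_patch(Circle((sign * dx, 0.3), 0.02 + 0.05 * t[2]))    # pupil size
            slant = (t[3] - 0.5) * 0.3                          # eyebrow slant
            ax.plot([sign * dx - 0.15, sign * dx + 0.15],
                    [0.5 - sign * slant, 0.5 + sign * slant], 'k-')
        ax.plot([0, 0], [0.15, 0.15 - 0.25 * t[4]], 'k-')       # nose size
        mx = np.linspace(-0.2 - 0.3 * t[8], 0.2 + 0.3 * t[8], 50)   # mouth length
        curvature = (t[9] - 0.5) * (0.5 + t[5])                 # opening and shape
        ax.plot(mx, -0.4 + curvature * mx ** 2, 'k-')           # parabolic mouth
        ax.set_xlim(-1.5, 1.5)
        ax.set_ylim(-1.5, 1.5)
        ax.set_aspect('equal')
        ax.axis('off')


    if __name__ == "__main__":
        fs = 10_000                                  # 10 kHz sampling rate
        time = np.arange(int(0.050 * fs)) / fs       # 50 ms segment
        x = np.sin(2 * np.pi * 200 * time)           # stand-in for a speech segment
        fig, ax = plt.subplots()
        draw_face(to_settings(autocorr_points(x)), ax)
        plt.show()

Normalizing by R(0) keeps the values in [-1, 1] regardless of signal level; the disclosure does not state whether such a normalization was applied before mapping the points to the ten settings.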

The results of the study suggest that untrained subjects can categorize facial sound-displays with a high level of performance, indicating that there are salient and reliable cues in the speech-faces which are sufficiently different to distinguish even phonetically similar sounds, such as the steady-state 'M' and 'N' sounds. When the faces were computed from the output of a developmental speech synthesizer (an all-digital 5-formant cascade synthesizer), the faces contained many, but not all, of the distinguishing characteristics of their natural-speech counterparts. For example, the fricative sounds S, SH, and V contained the characteristic elongated head, the UU sound had the overlapping eyes, and the AA sounds in both natural and synthetic speech gave rise to the only faces to have...