Simple Lip-Synchronisation of Animated Icons with Audio O/P

IP.com Disclosure Number: IPCOM000106311D
Original Publication Date: 1993-Oct-01
Included in the Prior Art Database: 2005-Mar-20
Document File: 2 page(s) / 79K

Publishing Venue

IBM

Related People

Sharman, RA: AUTHOR

Abstract

Disclosed is a means of toggling a matched pair of icons or other bitmaps on a computer display in synchrony with the appearance/absence of true human speech in a sampled audio signal. If the bitmaps represent a face or a mouth, the mouth can appear to open and shut in synchrony with spoken words. The advantage over crude bitmap flipping is much greater verisimilitude, as the changes in appearance occur at the psychologically appropriate times. The advantage over hand-drawn animation is that the icon changes can occur for arbitrary spoken audio signals and do not have to be matched by hand. The film industry, robot talking simulators and cartoons employ laborious hand alignment of pictures and audio.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      An icon displayed on a "windows" screen is usually a static
bitmap.  It can be refreshed with alternating bitmaps at intervals,
making an "animated icon".  Such an animated icon can be used to
display a "face" with an opening and shutting mouth which appears to
"talk".  The icon can be associated with audio output from a
multimedia audio adapter which plays an audio message to the user,
e.g. from a voice-mail or voice-response unit.  The problem is how to
synchronise the "lips" of the icon to the audio output, to achieve
"realistic" animation of the icon and enhance the visual/aural
impact on the user.  In particular, it must be decided when to change
from the open-mouth bitmap (representing a vowel) to the closed-mouth
bitmap, and vice versa, so that the overall effect appears realistic.
Since the icon can be modified only slowly, a simple and robust
technique of modelling "open-mouth" vs "closed-mouth" is required.
This can be done by a simple Voiced/Unvoiced (V/UV) classification of
the output audio signal.  Voiced audio broadly corresponds to vowels
(for which the mouth is open) and unvoiced audio is roughly
equivalent to consonants, for which the mouth is closed.
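
      The V/UV idea can be illustrated with a minimal sketch.  The
classifier is not specified in the text available here, so the sketch
assumes a common textbook heuristic: a frame is treated as voiced
when its short-time energy is high and its zero-crossing rate is low.
The function names, thresholds, frame length and the "hold" count
(which keeps the icon from changing faster than it can be redrawn)
are all illustrative assumptions, not values from the disclosure.

```python
import math


def frame_is_voiced(frame, energy_threshold=0.01, zcr_threshold=0.25):
    """Classify one frame of samples (floats in [-1, 1]).

    Assumed heuristic: voiced speech has high energy and few zero
    crossings. Both thresholds are hypothetical and would need tuning.
    """
    if not frame:
        return False
    energy = sum(s * s for s in frame) / len(frame)
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
    )
    zcr = crossings / max(len(frame) - 1, 1)
    return energy > energy_threshold and zcr < zcr_threshold


def mouth_states(samples, frame_len=400, hold_frames=2):
    """Return "open" or "closed" for each complete frame.

    The bitmap only changes after hold_frames consecutive frames ask
    for the other state, since the icon can be modified only slowly.
    """
    states = []
    current = "closed"
    run = 0  # consecutive frames that disagree with the current state
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        wanted = "open" if frame_is_voiced(frame) else "closed"
        if wanted == current:
            run = 0
        else:
            run += 1
            if run >= hold_frames:
                current = wanted
                run = 0
        states.append(current)
    return states


if __name__ == "__main__":
    # Synthetic test signal: silence, a low-pitched tone standing in
    # for a vowel, then silence again (8000 samples/s assumed).
    silence = [0.0] * 1600
    tone = [0.5 * math.sin(math.pi * n / 20) for n in range(1600)]
    for index, state in enumerate(mouth_states(silence + tone + silence)):
        print(index, state)
```

      In a real display loop, each returned state would select which
of the two bitmaps to draw while the corresponding frame of audio is
being played.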

Hence, th...