Visual Speech Recognition by Tracking of Lip Motion
Original Publication Date: 2003-Feb-25
Included in the Prior Art Database: 2003-Feb-25
The invention provides a visual speech recognition system based on lip-reading. A gradient technique tracks the motion of the facial parts involved in articulation with sub-pixel accuracy. To compensate for bulk motion of the head, the system measures relative motions (i.e., how the lips open and close). In addition, particular regions of the face that carry significant speech information are identified. The core of this invention lies in the identification of novel features and in the dynamical approach to feature extraction: shape recognition is only required for initialization; from then on, motions are tracked.

Novel features: The invention uses novel features which carry speech information in addition to lip motion:
- chin motion (measured as the chin-to-nose distance)
- motion of the upper lip (measured as the upper-lip-to-nose distance)
- motion of the lower lip (measured as the lower-lip-to-chin distance)

Previously, mainly the opening of the lips and the width of the mouth have been used as features. The invention instead proposes a dynamical approach, which does not extract shapes but tracks motions.
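The disclosure names a gradient technique for sub-pixel motion tracking but gives no formula. A minimal sketch of one common gradient-based estimator (a 1-D Lucas-Kanade-style least-squares fit to the brightness-constancy equation) is shown below; the function name and the 1-D restriction are illustrative assumptions, not the disclosed implementation.

```python
import numpy as np

def gradient_displacement(i0, i1):
    """Estimate the sub-pixel displacement d (in samples) between two
    1-D intensity profiles, assuming i1(x) ~= i0(x - d) with small d.

    Brightness constancy gives  It + d * Ix ~= 0, so the least-squares
    solution is  d = -sum(It * Ix) / sum(Ix**2).
    (Illustrative sketch; not the disclosed implementation.)"""
    i0 = np.asarray(i0, dtype=float)
    i1 = np.asarray(i1, dtype=float)
    ix = np.gradient(i0)   # spatial gradient (per-sample units)
    it = i1 - i0           # temporal difference between frames
    return -np.sum(it * ix) / np.sum(ix * ix)
```

Applied along a line from, say, the nose tip to the chin, such an estimator yields the relative distances listed above frame by frame, without re-detecting lip shape in every frame.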