Interactive Language Training System

IP.com Disclosure Number: IPCOM000111470D
Original Publication Date: 1994-Feb-01
Included in the Prior Art Database: 2005-Mar-26
Document File: 2 page(s) / 67K

Publishing Venue

IBM

Related People

Farrett, PW: AUTHOR [+2]

Abstract

Disclosed is a language training environment in which the end-user is "drilled" routinely until correct pronunciation and prosody are achieved for a given language.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      Disclosed is a language training environment in which the
end-user is "drilled" routinely until correct pronunciation and
prosody are achieved for a given language.

      A number of multimedia products currently support language
training exercises.  These products offer good pre-recorded
(waveform-encoded) speech in various Romance, East European, and
Asian languages.  Their shortcoming is that they give the user
little interactive voice capability: the user simply listens to an
audio/speech file and replays it.  For a language training system to
be "interactive", a more flexible approach is needed, and that is
what this disclosure addresses.

      A language training system would consist of the following
components (contingent upon the application):

   Voice Input: A voice recognition system capable of discrete or
   continuous (word or phrase) recognition, which takes into account
   syntactic (lexical) and prosodic (stress, intonation, etc.)
   attributes.

   Voice Output: A speech synthesizer that includes at least one of
   the following technologies: waveform encoding for direct speech
   reconstruction; synthesis by rule for phonetic speech synthesis;
   mathematical reconstruction of speech from the time domain (e.g.,
   LPC).

   Application Session: The application is based on a Socratic
   dialogue session (i.e., drills based on a teacher/student
   relationship).
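
As an illustration of how the three components could fit together in
a drill, the following is a minimal sketch only.  The function name
`drill`, the callables `speak` and `listen_and_score`, the 0-to-1
score scale, and the `threshold` and `max_attempts` parameters are all
assumptions made for this example; the disclosure does not specify an
interface or a passing criterion.

```python
from typing import Callable


def drill(prompt: str,
          speak: Callable[[str], None],
          listen_and_score: Callable[[str], float],
          threshold: float = 0.8,
          max_attempts: int = 5) -> int:
    """Repeat one prompt until the learner's score reaches the threshold.

    speak            -- hypothetical Voice Output hook (plays the model utterance)
    listen_and_score -- hypothetical Voice Input hook (records the learner and
                        returns a similarity score between 0.0 and 1.0)

    Returns the number of attempts used, or -1 if the learner did not
    reach the threshold within max_attempts.
    """
    for attempt in range(1, max_attempts + 1):
        speak(prompt)                      # teacher models the utterance
        score = listen_and_score(prompt)   # student repeats; system scores it
        if score >= threshold:
            return attempt                 # drill ends once the score passes
    return -1
```

Passing the recognizer and synthesizer in as callables keeps the
session logic independent of which speech technology is used, which
matches the "contingent upon application" wording above.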

The following algorithmic process illustrates the approach:

Short Language Training Session (Discrete):

   For discrete utterances (isolated words) do:

        1) Match pattern of pre-stored voice template
           for single word-utterance.

        2) User's voice print best matches waveform (one of the
           above speech technologies):
                     - Prosody of voice print is matched against
                       waveform for {intonational|rhythmical-stress|
                       temporal-continuity} contours.  (e.g., f0,
                       f1-f5 parameters of the waveform "best matched";
                       note phoneme articulation/pronunciation is
                       f1-f3 or 200 Hz - 1500 Hz; language prosody
   ...
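
The prosody comparison in step 2 can be pictured with a small sketch.
This is one possible reading, not the disclosed method: it compares
only the f0 (pitch) contour, resamples both contours to a common
length, divides each by its own mean so that speakers with different
pitch ranges remain comparable, and takes the RMS difference.  The
function names, the 32-point resampling, and the mean normalization
are assumptions introduced for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def resample(contour: Sequence[float], n: int = 32) -> List[float]:
    """Linearly interpolate a contour to n points (n >= 2)."""
    if len(contour) == 1:
        return [contour[0]] * n
    out = []
    for i in range(n):
        pos = i * (len(contour) - 1) / (n - 1)   # fractional source index
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(contour) - 1)
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return out


def contour_distance(user: Sequence[float],
                     template: Sequence[float],
                     n: int = 32) -> float:
    """RMS difference between two f0 contours after mean normalization.

    Assumes both contours hold positive f0 values in Hz (voiced frames).
    """
    u, t = resample(user, n), resample(template, n)
    mean_u, mean_t = sum(u) / n, sum(t) / n
    squared = sum((a / mean_u - b / mean_t) ** 2 for a, b in zip(u, t))
    return math.sqrt(squared / n)


def best_template(user_f0: Sequence[float],
                  templates: Dict[str, Sequence[float]]) -> Tuple[str, float]:
    """Return the (name, distance) of the stored contour closest to the user's."""
    scored = ((name, contour_distance(user_f0, c)) for name, c in templates.items())
    return min(scored, key=lambda pair: pair[1])
```

For example, a rising contour spoken an octave above a stored rising
template still selects that template, because only the shape of the
contour is compared.  The f1-f5 formant parameters mentioned in the
text are not covered by this sketch.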