Improved "Added Word" User Interface Using Integrated Automatic Speech and Handwriting Recognition

IP.com Disclosure Number: IPCOM000103849D
Original Publication Date: 1993-Feb-01
Included in the Prior Art Database: 2005-Mar-18
Document File: 4 page(s) / 142K

Publishing Venue

IBM

Related People

Bellegarda, JR: AUTHOR [+6]

Abstract

Speech (resp. handwriting) recognition systems introduce errors in the transcription of spoken (resp. written) text for a variety of reasons, one of which is that the misrecognized word often does not belong to the recognition vocabulary. It is therefore imperative to provide the user with the capability of adding a new word to the vocabulary. This disclosure presents a new ADDWORD user interface which draws on the complementarity between speech and handwriting recognition.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 37% of the total text.

Improved "Added Word" User Interface Using Integrated Automatic Speech and Handwriting Recognition

      Out-of-vocabulary words represent a persistent problem in
speech and handwriting recognition tasks, because they severely
diminish the versatility and added value of the product.  Since, by
definition, the required word is not in the list of alternative
candidate words, the user must usually type in the correct word.
This leads to exactly the type of interface that automatic speech
recognition (ASR) is trying to avoid -- using a keyboard.  More
specifically, there are currently two scenarios a user follows to
add a new word to the active vocabulary.

      1) From past experience with the recognizer the user knows that
this word is not in the vocabulary and simply types the word into the
vocabulary before proceeding.  At this point, in the case of ASR for
example, a phonetic baseform is created for this word using [1].

      2) The user identifies a word that has been wrongly decoded by
the recognizer.  (Note: there is currently no automatic method for
doing this; that, however, is beyond the scope of this disclosure,
and here it is assumed that the user is able to find the wrong
words.)  The user first checks the list of alternative words and
discovers that the word is not an alternate candidate.  The user
then types in the word, the recognizer automatically checks that the
word is not in the vocabulary, and, finally, creates a baseform for
it (again, as in [1] in the case of ASR).
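      The checking-and-adding step common to both scenarios can be
sketched as follows.  This is an illustrative sketch only: the names
(Vocabulary, make_baseform) are hypothetical, and the toy baseform
here is just the word's letters, standing in for the phonetic
baseform generation of [1].

```python
def make_baseform(word):
    """Toy stand-in for phonetic baseform generation (cf. [1]):
    here the 'baseform' is simply the word's letters."""
    return list(word.lower())

class Vocabulary:
    """Hypothetical active vocabulary with add-word support."""

    def __init__(self, words=()):
        self.baseforms = {w: make_baseform(w) for w in words}

    def __contains__(self, word):
        return word in self.baseforms

    def add_word(self, word):
        """Check membership, then create a baseform for a new word.

        Returns True if the word was actually added (scenarios 1/2),
        False if it was already in the vocabulary.
        """
        if word in self.baseforms:
            return False
        self.baseforms[word] = make_baseform(word)
        return True

vocab = Vocabulary(["speech", "pen"])
assert vocab.add_word("tablet")      # new word: baseform created
assert not vocab.add_word("speech")  # already known: nothing to do
assert "tablet" in vocab
```

In scenario 1 the user invokes add_word directly; in scenario 2 the
same call follows the check of the alternative-word list.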

      This disclosure provides an improved user interface based on
pen-tablet technology.  This interface is especially desirable in
light of planned future products (such as notebooks) with an
interface that will be completely pen-based (without a conventional
keyboard).

      The underlying observation is that acoustic evidence bears
complementary characteristics in comparison with information provided
by handwriting (as was observed in [2]).  In [2], various algorithms
were suggested for integrating speech and handwriting technology to
exploit this complementarity and improve overall recognition
accuracy.  These algorithms are based on the time alignment of the
speech utterance with the handwriting strokes, which is particularly
simple if words are spoken and written simultaneously.  But in the
case considered here words are written after they were spoken, which
makes such a...
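      Although the extract breaks off here, the integration idea
described above can be sketched as a weighted combination of the two
recognizers' candidate scores.  The function name, the weight lam,
and the example scores below are hypothetical, not taken from [2]:

```python
import math

def combine(speech_scores, writing_scores, lam=0.5):
    """Log-linear combination of per-candidate log-probabilities
    from a speech recognizer and a handwriting recognizer; the
    candidate with the highest combined score wins."""
    candidates = set(speech_scores) | set(writing_scores)
    floor = math.log(1e-9)  # back-off for a candidate missing from one list
    combined = {
        w: lam * speech_scores.get(w, floor)
           + (1 - lam) * writing_scores.get(w, floor)
        for w in candidates
    }
    return max(combined, key=combined.get)

# The speech recognizer confuses acoustically similar words
# ("no" vs. "know"), while the handwriting recognizer separates
# them by their distinct spellings.
speech = {"know": math.log(0.45), "no": math.log(0.55)}
writing = {"know": math.log(0.90), "no": math.log(0.10)}
print(combine(speech, writing))  # prints "know"
```

The complementarity is visible in the example: neither evidence
source is decisive alone, but the combination recovers the correct
word.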