Using Speech to Correct Optical Character Recognition Output

IP.com Disclosure Number: IPCOM000114015D
Original Publication Date: 1994-Oct-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 2 page(s) / 71K

Publishing Venue

IBM

Related People

Cohen, PS: AUTHOR [+3]

Abstract

Described are several methods whereby speech is used to correct Optical Character Recognition (OCR) output. The methods combine the statistics from the OCR process with speech recognition to create a composite model.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      Described are several methods whereby speech is used to correct
Optical Character Recognition (OCR) output.  The methods combine the
statistics from the OCR process with speech recognition to create a
composite model.

      In prior art, many difficulties have been outlined for speech
recognition tasks (1,2).  The concept described herein includes
methods designed to eliminate OCR errors that arise when
distinguishing similarly shaped characters, such as 0 and O, 1 and I,
or j and g.  The following illustrates how speech can be used to
correct OCR errors:
 1.  To input (speak) a corrected word, field, or phrase once the
    user has identified the error:  The user may use a mouse or tab
    key to identify an incorrect word and then speak the corrected
    text.  This can be faster than keying in the correction.
 2.  To input (speak) a correction once the system has identified an
    error or area of low confidence:  The user might say "accept" or
    "change to...".  (Several OCR systems flag areas of low
    confidence with color or reverse video.)
 3.  To identify an erroneous line with speech:  For example, "change
    line 12 to ..."
 4.  To find and correct an error with speech:  For example, "change
    '100 many elephants' to 'too many elephants'".  Or the user might
    say, "phrase should read, 'too many elephants'".  The system
    would use speech to both find the error and correct it.  The
    error is found by matching a phrase or sentence with the text on
    screen.
 5.  To combine the OCR output and the speech recognition system
    results to create the highest probability combined result:  OCR
    algorithms, like speech recognition algorithms, evaluate multiple
    competing solutions.  Combining the probabilities from both
    algorithms can yield higher accuracy...
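
      Items 4 and 5 above can be illustrated with a minimal Python
sketch.  It is not the disclosure's implementation:  the function
names, the word-by-word matching score, the independence assumption
used to multiply the two probabilities, the floor value for missing
candidates, and the example probabilities are all assumptions made
for illustration.

```python
import math


def locate_phrase(screen_words, spoken_words):
    """Find the on-screen word span that best matches a spoken phrase.

    Hypothetical helper: each same-length window of on-screen words is
    scored by the fraction of positions whose words agree exactly
    (case-insensitive). Returns ((start, end), score).
    """
    n = len(spoken_words)
    if n == 0:
        return None, 0.0
    best_span, best_score = None, -1.0
    for start in range(len(screen_words) - n + 1):
        window = screen_words[start:start + n]
        hits = sum(a.lower() == b.lower() for a, b in zip(window, spoken_words))
        score = hits / n
        if score > best_score:
            best_span, best_score = (start, start + n), score
    return best_span, best_score


def combine_hypotheses(ocr_probs, speech_probs, floor=1e-6):
    """Pick the candidate with the highest combined score.

    Assumes the two recognizers err independently, so their
    probabilities are multiplied (summed in the log domain). A
    candidate missing from one list gets the small `floor` probability
    instead of zero; both choices are assumptions of this sketch.
    """
    candidates = set(ocr_probs) | set(speech_probs)
    scores = {
        word: math.log(max(ocr_probs.get(word, floor), floor))
        + math.log(max(speech_probs.get(word, floor), floor))
        for word in candidates
    }
    return max(scores, key=scores.get), scores


# Item 4: the user says "phrase should read, 'too many elephants'".
screen = "there are 100 many elephants here".split()
span, score = locate_phrase(screen, "too many elephants".split())

# Item 5: made-up competing solutions for the one word that differs.
ocr = {"100": 0.5, "too": 0.3, "loo": 0.2}      # shape confusions
speech = {"too": 0.6, "two": 0.3, "to": 0.1}    # sound confusions
best, _ = combine_hypotheses(ocr, speech)
print(span, best)  # (2, 5) too
```

      In this example the OCR candidates are confused by shape and
the speech candidates by sound, so only "too" scores well in both
lists.  This is one way the combined result can be more accurate
than either recognizer alone.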