Multi-Term Words Post-Process Program

IP.com Disclosure Number: IPCOM000113559D
Original Publication Date: 1994-Sep-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 2 page(s) / 67K

Publishing Venue

IBM

Related People

Kita, Y: AUTHOR [+2]

Abstract

This article describes the mechanism of the Post-Processor (PP) for multi-term words used in a hand-written character OCR system. The program creates a kind of network whose nodes are composed of words beginning in the same column. It then determines the optimal path through this network by scoring all the possible paths. This optimal path gives one string as the output of the post-processor.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 62% of the total text.

Multi-Term Words Post-Process Program

      This article describes the mechanism of the Post-Processor (PP)
for multi-term words used in a hand-written character OCR system.
The program creates a kind of network whose nodes are composed of
words beginning in the same column.  It then determines the optimal
path through this network by scoring all the possible paths.  This
optimal path gives one string as the output of the post-processor.

      PP handles multi-term words such as those in Fig. 1, which are
written in one field of the OCR form, and takes the candidate matrix
as input.  Because this kind of data may or may not be punctuated by
a delimiter, PP must judge the optimal punctuation for the input.
From this input it creates the term candidate network N, as shown in
Fig. 2.  Each node is composed of the word candidates whose initial
letters are in the same column.  Every path that links the initial
node and the last node can be the result of post-processing.  PP
evaluates all the possible linkages (paths) and selects the optimal
path as the output of processing.  It scores each word in each node,
and the sum of the scores of the words that compose a path Γ becomes
the "score of the path Γ".  The score of a word is the sum of the
orders (ranks) at which each character forming the word appears in
the matrix.  If a character does not appear in the proper column, a
penalty score of 23 is given for that column.  This scoring means
that the smaller the...
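
      The network construction and path scoring described above can
be sketched in Python as follows.  This is only an illustration under
stated assumptions, not the original program: the candidate matrix is
assumed to be a list of columns, each listing its character
candidates best-first; the rank of a character is assumed to be its
1-based position in its column; a lower path score is assumed to be
better; and delimiters are ignored.  The names word_score,
build_network, all_paths, best_path and the sample lexicon are
hypothetical.  Only the penalty value 23 comes from the text.

```python
PENALTY = 23  # per-column penalty when the character is absent (value from the text)


def word_score(word, start, matrix):
    """Sum the 1-based rank of each character in its column (assumed ranking)."""
    total = 0
    for offset, ch in enumerate(word):
        column = matrix[start + offset]
        if ch in column:
            total += column.index(ch) + 1
        else:
            total += PENALTY
    return total


def build_network(matrix, lexicon):
    """Node i holds every (word, score) whose initial letter sits on column i."""
    n = len(matrix)
    return [
        [(w, word_score(w, i, matrix)) for w in lexicon if w and i + len(w) <= n]
        for i in range(n)
    ]


def all_paths(network, n, col=0):
    """Yield every chain of words linking the first column to the end."""
    if col == n:
        yield []
        return
    for word, score in network[col]:
        for rest in all_paths(network, n, col + len(word)):
            yield [(word, score)] + rest


def best_path(matrix, lexicon):
    """Score all paths and return (words, score) of the lowest-scoring one."""
    network = build_network(matrix, lexicon)
    paths = list(all_paths(network, len(matrix)))
    if not paths:
        return None
    best = min(paths, key=lambda p: sum(s for _, s in p))
    return [w for w, _ in best], sum(s for _, s in best)


# Hypothetical 7-column candidate matrix for an undelimited field.
matrix = [
    ["N", "M"], ["E", "F"], ["W", "V"], ["V", "Y", "S"],
    ["O", "0"], ["R", "P"], ["K", "X"],
]
lexicon = ["NEW", "YORK", "NEWS", "FORK"]
result = best_path(matrix, lexicon)
print(result)  # (['NEW', 'YORK'], 8)
```

      In this sample, "YORK" scores 5 because "Y" is only the second
candidate in its column, while "FORK" scores 26 because "F" is
missing from that column and incurs the penalty.  Exhaustive
enumeration mirrors the statement that all possible paths are scored;
for long fields a memoized search would give the same result with
less work.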