Handwritten Segmentation/Recognition of Currency Amount Punctuation Symbols

IP.com Disclosure Number: IPCOM000123823D
Original Publication Date: 1999-May-01
Included in the Prior Art Database: 2005-Apr-05
Document File: 6 page(s) / 210K

Publishing Venue

IBM

Related People

Mai, DD: AUTHOR

Abstract

Disclosed is an effective method of segmenting and recognizing multiple Currency Amount Punctuation Symbols (CAPS) in the Convenience Amount Recognition algorithm.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 49% of the total text.

   In the financial industry, the success of a Convenience
Amount Recognition algorithm depends heavily on its error rate.
Most banks want to run unassisted character recognition at a very low
error rate.  Correct identification of Currency Amount
Punctuation Symbols (CAPS) is not only a major factor in reducing the
error rate; it can also help to increase the correct rate.  This
invention provides a means to achieve that goal.  In general,
two basic punctuation symbols are used in US currency
amounts: the decimal point and the comma.  However, some countries
with other alphabetic written languages use different symbols for
currency amount punctuation, for example the dash, the double
dash, Pound, Dollar, Franc, etc.  This disclosure describes an
effective method of segmenting and recognizing multiple CAPSs.

The following disclosure provides:
  1.  An effective contextual segmentation and recognition
      algorithm for multiple CAPS (Currency Amount Punctuation
      Symbols).
  2.  A convenient way to eliminate noise and broken box lines
      that resemble CAPS points and dashes.

   General terms:
  o  Object Repertoire List (ORL): an object list that contains
     non-useful objects.
  o  Active Objects List (AOL): an object list that contains useful
     objects.
  o  An object is a segment of an image.  It can be a fragment
     or the whole image of a character, symbol, noise or extraneous
     object.
  o  Terminating CAPS: the currency amount punctuation symbol at
     the end of an amount.  In the absence of a decimal CAPS,
     this symbol equates to a double zero digit.
  o  Decimal CAPS: a decimal punctuation symbol.
  o  Thousand CAPS: a thousand punctuation symbol.
  o  Million CAPS: a million punctuation symbol.
  o  Standard object: an object that contains a single object.
  o  Composite object: an object that contains more than one single
     object.
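
   The terminating CAPS rule in the terms above can be illustrated
with a small Python sketch.  It is not taken from the disclosure: the
Caps enumeration, the token-list representation and the function name
expand_amount_digits are assumptions made only for this example.

```python
from enum import Enum, auto


class Caps(Enum):
    """The four CAPS roles named in the general terms (names are hypothetical)."""
    TERMINATING = auto()
    DECIMAL = auto()
    THOUSAND = auto()
    MILLION = auto()


def expand_amount_digits(tokens):
    """Return the digit string of an amount after applying the terminating rule.

    `tokens` is an assumed representation: a list, in reading order, whose
    items are digit characters ("0".."9") or Caps members.
    """
    # Keep only the digit characters; CAPS tokens carry no digits themselves.
    digits = "".join(t for t in tokens if isinstance(t, str))
    has_decimal = Caps.DECIMAL in tokens
    ends_with_terminating = bool(tokens) and tokens[-1] is Caps.TERMINATING
    # A terminating CAPS with no decimal CAPS equates to a double zero digit.
    if ends_with_terminating and not has_decimal:
        digits += "00"
    return digits
```

   Under these assumptions, the digits 1, 2, 5 followed by a
terminating CAPS expand to "12500", while an amount that already
contains a decimal CAPS is left unchanged by the terminating symbol.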

   Step 1:
  Proceed with the conventional method of numeral segmentation
  and recognition.  In the process, non-useful objects should
  be removed and saved where they can be retrieved later.
  For clarity, this disclosure uses the ORL to store non-useful
  objects and the AOL to store useful objects.
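
   The bookkeeping in Step 1 might look like the following Python
sketch.  The ImageObject record, its fields and the is_useful
predicate are hypothetical; the disclosure does not say how the
conventional recognizer decides that an object is non-useful, so
that decision is passed in as a caller-supplied function.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ImageObject:
    """Hypothetical record for one segmented object of the amount image."""
    object_id: int
    width: int
    height: int


def partition_objects(
    objects: List[ImageObject],
    is_useful: Callable[[ImageObject], bool],
) -> Tuple[List[ImageObject], List[ImageObject]]:
    """Split segmented objects into (AOL, ORL), preserving reading order.

    Non-useful objects are not discarded; they go to the ORL so that a
    later step can retrieve them.
    """
    aol: List[ImageObject] = []
    orl: List[ImageObject] = []
    for obj in objects:
        if is_useful(obj):
            aol.append(obj)
        else:
            orl.append(obj)
    return aol, orl
```

   A caller could, for instance, pass a size threshold as is_useful;
that threshold is purely illustrative and is not part of the
disclosed method.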

   Step 2:
  a) Select possible CAPSs: Use look-up table 2 to select
     possible CAPSs and categorize them into one type of CAPS,
     namely logical CAPS.
  b) Remove all logical CAPS objects from the AOL and save them
     in the ORL.  The advantages of removing logical CAPSs are that
     the remaining noise-like and box-line-like CAPS are finally
     removed, and it is simpler to contextually process the AOL
    ...
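
   Steps 2a and 2b, as far as they appear in this abbreviated text,
can be sketched in Python as follows.  The contents of the look-up
table are not shown in the extract, so CAPS_LOOKUP below is a
hypothetical stand-in, and representing each object by its recognized
class label is likewise an assumption for illustration.

```python
LOGICAL_CAPS = "LOGICAL_CAPS"

# Hypothetical stand-in for the disclosure's look-up table: every
# candidate punctuation label maps to the single "logical CAPS" type.
CAPS_LOOKUP = {
    ".": LOGICAL_CAPS,
    ",": LOGICAL_CAPS,
    "-": LOGICAL_CAPS,
    "--": LOGICAL_CAPS,
}


def move_logical_caps(aol, orl, lookup=CAPS_LOOKUP):
    """Return (new_aol, new_orl) with logical CAPS moved from AOL to ORL.

    `aol` and `orl` are lists of recognized class labels (strings).
    The input lists are not modified.
    """
    # Step 2a: an object is a logical CAPS if its label is in the table.
    kept = [label for label in aol if label not in lookup]
    moved = [label for label in aol if label in lookup]
    # Step 2b: logical CAPS leave the AOL and are saved in the ORL.
    return kept, orl + moved
```

   With an AOL of ["1", ",", "2", "5", "0", ".", "0", "0"], the sketch
leaves only the digit labels in the AOL and appends the comma and the
point to the ORL, which matches the stated aim of leaving a simpler
AOL for contextual processing.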