A PARSER FOR OPTICAL DOCUMENT LAYOUT

IP.com Disclosure Number: IPCOM000027343D
Original Publication Date: 1996-Apr-30
Included in the Prior Art Database: 2004-Apr-07
Document File: 2 page(s) / 54K

Publishing Venue

Xerox Disclosure Journal

Abstract

Proposed is a method for document processing that uses an attributed ambiguous grammar as a representation and preferences to determine the best or optimal results. The grammar possesses finite-state machines, and the preference logic compares values in two different possibilities to indicate which possibility is better. The proposed method comprises the steps of: describing the possible resultant document structures (such as layout) using the ambiguous grammar; defining preferences among the possible results; and parsing the input according to the ambiguous grammar to determine possible results, while using the preferences to eliminate less desirable or sub-optimal results.


A PARSER FOR OPTICAL DOCUMENT LAYOUT

Sidney W. Marshall

Proposed Classification
U.S. Cl. 364/419 Int. Cl. G06F 15/38

Proposed is a method for document processing that uses an attributed ambiguous grammar as a representation and preferences to determine the best or optimal results. The grammar possesses finite-state machines, and the preference logic compares values in two different possibilities to indicate which possibility is better. The proposed method comprises the steps of: describing the possible resultant document structures (such as layout) using the ambiguous grammar; defining preferences among the possible results; and parsing the input according to the ambiguous grammar to determine possible results, while using the preferences to eliminate less desirable or sub-optimal results.
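The three steps above can be illustrated with a minimal sketch. It is not the parser from the disclosure: it swaps in a CYK-style chart parser, does not model the finite-state machines the grammar is said to possess, and every name in it (`Candidate`, `lexical_candidates`, `BINARY_RULES`, `prefer`, `parse`, the `Heading`/`Body`/`Section` symbols, and the font-size cost attribute) is a hypothetical choice made for illustration. The input is assumed to be a list of text-line font sizes. Keeping one candidate per symbol per span is only safe here because the assumed cost attribute is additive.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

BODY_SIZE = 10  # assumed body-text font size (illustrative value)


@dataclass(frozen=True)
class Candidate:
    symbol: str   # grammar symbol this candidate derives
    cost: int     # synthesized attribute that the preference compares
    tree: tuple   # nested tuples recording the derivation


def lexical_candidates(size: int) -> List[Candidate]:
    """Step 1 (ambiguity): every line may be read as a heading or as body text."""
    return [
        Candidate("Heading", max(0, 12 - size), ("Heading", size)),
        Candidate("Body", abs(size - BODY_SIZE), ("Body", size)),
    ]


# Step 1 (structure): hypothetical ambiguous rules as (parent, left, right).
BINARY_RULES = [
    ("Body", "Body", "Body"),
    ("Section", "Heading", "Body"),
    ("Section", "Section", "Section"),
]


def prefer(a: Candidate, b: Candidate) -> bool:
    """Step 2: compare values in two possibilities; True if `a` is better."""
    return a.cost < b.cost


def keep_preferred(cell: Dict[str, Candidate], cand: Candidate) -> None:
    """Keep only the preferred candidate for each symbol in a chart cell."""
    current = cell.get(cand.symbol)
    if current is None or prefer(cand, current):
        cell[cand.symbol] = cand


def parse(sizes: List[int], root: str = "Section") -> Optional[Candidate]:
    """Step 3: parse bottom-up, pruning less desirable results as they appear."""
    n = len(sizes)
    chart: Dict[Tuple[int, int], Dict[str, Candidate]] = {}
    for i, size in enumerate(sizes):
        cell = chart.setdefault((i, i + 1), {})
        for cand in lexical_candidates(size):
            keep_preferred(cell, cand)
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            cell = chart.setdefault((start, end), {})
            for mid in range(start + 1, end):
                left, right = chart[(start, mid)], chart[(mid, end)]
                for parent, lsym, rsym in BINARY_RULES:
                    if lsym in left and rsym in right:
                        lc, rc = left[lsym], right[rsym]
                        keep_preferred(
                            cell,
                            Candidate(parent, lc.cost + rc.cost,
                                      (parent, lc.tree, rc.tree)),
                        )
    return chart.get((0, n), {}).get(root)


if __name__ == "__main__":
    best = parse([18, 10, 10, 14, 10])
    if best is not None:
        print(best.cost, best.tree)
```

For the sample sizes in the usage lines, the surviving parse treats the lines of size 18 and 14 as headings and the lines of size 10 as body text. A different preference, such as one that compares several attributes in turn, could replace `prefer` without changing the parsing loop, provided the pruning remains valid for that preference.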

REFERENCES

U.S. Patent No. 5,060,155, issued October 22, 1991, to Job M. van Zuijlen, discloses a method for unambiguously coding multiple parsing analyses of a natural language word sequence in dependency grammar. The dependencies are defined between pairs of words, wherein each pair consists of a superordinate word, or governor, and a related word, or dependent.

XEROX DISCLOSURE JOURNAL - Vol. 21, No. 2 March/April 1996 173

