Automatic Word Extraction

IP.com Disclosure Number: IPCOM000063365D
Original Publication Date: 1985-Mar-01
Included in the Prior Art Database: 2005-Feb-18

Publishing Venue

IBM

Related People

Authors:
Hashihara, H; Itoh, H

Abstract

This article describes the automatic extraction of words from a full-page document image. N scan lines are ORed together, and black bits in the result of the OR operation indicate that a word is present across those scan lines. A table is formed from these results. In accordance with the data in the table, the word images are extracted from the full-page document image and displayed on a display screen.

Referring to Fig. 1, a document 1 is scanned, and the document image is converted to a bi-level image and stored in a full-page image buffer 31 (Fig. 3). N scan lines of the bi-level image are ORed. If a word exists over the N scan lines, the signal resulting from the OR operation includes black bits ("1"), as shown in the partial image in Fig. 1.
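The OR step can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the disclosure does not specify the layout of the table, so the (band top row, start column, end column) entries, the function names, and the `max_gap` parameter for bridging small gaps between characters are all assumptions made for illustration. The image is represented as a list of rows of 0/1 pixels, with 1 standing for a black bit.

```python
from typing import List, Tuple

Row = List[int]  # one scan line of a bi-level image: 1 = black, 0 = white


def or_scan_lines(rows: List[Row]) -> Row:
    """OR equal-length scan lines together, column by column."""
    width = len(rows[0])
    return [1 if any(row[x] for row in rows) else 0 for x in range(width)]


def black_runs(bits: Row, max_gap: int = 0) -> List[Tuple[int, int]]:
    """Return (start, end) column pairs, end exclusive, for runs of black bits.

    Runs separated by at most `max_gap` white bits are merged into one run.
    `max_gap` is a hypothetical parameter, not taken from the disclosure.
    """
    runs: List[Tuple[int, int]] = []
    start = None
    last_black = None
    for x, bit in enumerate(bits):
        if not bit:
            continue
        if start is None:
            start = x
        elif x - last_black - 1 > max_gap:
            # The white gap is too wide: close the current run, open a new one.
            runs.append((start, last_black + 1))
            start = x
        last_black = x
    if start is not None:
        runs.append((start, last_black + 1))
    return runs


def build_table(image: List[Row], n: int, max_gap: int = 0) -> List[Tuple[int, int, int]]:
    """Form an assumed table of (band_top_row, start_col, end_col) entries.

    The image is processed in bands of `n` scan lines; each band is ORed
    into a single line, and every run of black bits yields one entry.
    """
    table: List[Tuple[int, int, int]] = []
    for top in range(0, len(image), n):
        merged = or_scan_lines(image[top:top + n])
        for start, end in black_runs(merged, max_gap):
            table.append((top, start, end))
    return table


# A tiny 4 x 6 example image, processed in bands of N = 2 scan lines.
example_image = [
    [0, 1, 0, 0, 0, 1],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
]
example_table = build_table(example_image, n=2)
```

Each entry in such a table gives a column range that could be cropped from the corresponding band of the full-page buffer to obtain a candidate word image. A real system would also have to choose N and the gap tolerance relative to the scan resolution and character size, which this sketch leaves open.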