General-Purpose Thresholding Algorithm for Mixed Format Documents

IP.com Disclosure Number: IPCOM000062026D
Original Publication Date: 1986-Oct-01
Included in the Prior Art Database: 2005-Mar-09
Document File: 3 page(s) / 60K

Publishing Venue

IBM

Related People

Fox, SJ: AUTHOR [+2]

Abstract

This article describes a family of general-purpose discriminating/thresholding techniques for processing documents containing line copy information (LC), continuous tone information (CT) and halftone information (HT). The picture elements (PELs) of a scanned document are automatically classified as either line copy (LC) or non-line copy (NLC) and then thresholded accordingly. Devices for making this classification are referred to as "discriminators". Devices for thresholding the classified PELs are known as "thresholders". This article discusses a family of discriminators and thresholders that can be combined in various ways to cover the range of quality, function and cost. The menu for discriminators and thresholding techniques is given in Fig. 1. The following is an explanation of the terms used in Fig. 1.

At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 52% of the total text.

Discriminators:

(1) Defocused Symmetry (DS) is known in the art and is a general-purpose discriminator that can be used at several different levels of complexity, with a corresponding increase in the reliability of discriminating between LC and NLC.

(2) Information Homogeneity (IH) is a technique which cannot be used alone but, when used with another discriminator, can provide a very significant increase in the quality of the overall discrimination. It relies on the fact that halftones, inasmuch as they reproduce pictures, must occupy some minimum area (probably about one square inch). Text, too, usually has macroscopic spatial dimensions and is usually embedded in a white background. Information Homogeneity makes use of these properties to bridge over small misidentified areas and to correct them.

(3) The LCCT Algorithm is the line copy/continuous tone (LCCT) algorithm, with the CT branch and counters representing both CT and HT. This technique is described in U.S. Patent 4,554,593. It is the most general algorithm for reproducing mixed format LC and CT documents and preserves the fullest gray scale range of CT without sacrificing the quality of line copy. To achieve a complete separation of LC and CT, it depends on history or hysteresis counters, the spatial extent of pictorial matter, and the high-gradient edge transitions and binary nature of line copy. The algorithm uses both level and gradient informa...
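The discriminate-then-threshold flow described above, together with the bridging idea behind Information Homogeneity, can be illustrated with a minimal Python sketch. It is not the method of the article or of the cited patent. The function names, the one-dimensional run-length rule, the fixed LC threshold of 128, and the 2x2 ordered-dither matrix used for NLC PELs are all assumptions chosen for illustration; the article's actual thresholders are listed in Fig. 1, which is not reproduced in this text.

```python
# Illustrative sketch only. All names, thresholds and the dither matrix
# are assumptions; they are not taken from the article or the patent.
LC, NLC = 0, 1  # hypothetical class labels for each PEL

# Hypothetical 2x2 ordered-dither index matrix for non-line-copy PELs.
BAYER_2X2 = ((0, 2), (3, 1))


def homogenize_row(labels, min_run):
    """Bridge over short, isolated runs of labels along one scan line.

    An interior run shorter than min_run whose two neighbouring PELs
    carry the same (other) label is relabelled to match them. Runs that
    touch either end of the line are left unchanged.
    """
    out = list(labels)
    n = len(out)
    start = 0
    while start < n:
        end = start
        while end < n and out[end] == out[start]:
            end += 1
        is_interior = start > 0 and end < n
        if is_interior and (end - start) < min_run and out[start - 1] == out[end]:
            for i in range(start, end):
                out[i] = out[end]
        start = end
    return out


def threshold_pels(gray, labels, lc_threshold=128):
    """Threshold each PEL according to its class.

    gray   : rows of gray levels, 0 (black) to 255 (white)
    labels : rows of LC / NLC labels with the same shape as gray
    Returns rows of 1 (white) or 0 (black).
    """
    out = []
    for y, (gray_row, label_row) in enumerate(zip(gray, labels)):
        row = []
        for x, (g, label) in enumerate(zip(gray_row, label_row)):
            if label == LC:
                t = lc_threshold  # one fixed threshold for line copy
            else:
                # Position-dependent thresholds 32, 160, 224, 96.
                t = (BAYER_2X2[y % 2][x % 2] + 0.5) * 64
            row.append(1 if g >= t else 0)
        out.append(row)
    return out


if __name__ == "__main__":
    # A single misclassified PEL inside a line-copy region is bridged over.
    print(homogenize_row([LC, LC, LC, NLC, LC, LC, LC], min_run=2))
    # A uniform mid-gray patch: dithered as NLC, solid as LC.
    patch = [[128, 128], [128, 128]]
    print(threshold_pels(patch, [[NLC, NLC], [NLC, NLC]]))  # [[1, 0], [0, 1]]
    print(threshold_pels(patch, [[LC, LC], [LC, LC]]))      # [[1, 1], [1, 1]]
```

The run-length rule is a deliberately simple stand-in: the article describes Information Homogeneity in terms of area (about one square inch for halftones), so a fuller treatment would presumably work on two-dimensional regions rather than on single scan lines.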