Document Interchange Unit Parser for Multiple Data Processing Architectures

IP.com Disclosure Number: IPCOM000061359D
Original Publication Date: 1986-Jul-01
Included in the Prior Art Database: 2005-Mar-09
Document File: 3 page(s) / 17K

Publishing Venue

IBM

Related People

Ho, A: AUTHOR [+2]

Abstract

In today's office data processing environment, two architectures have evolved: one governs how information is communicated from office equipment to a host computer, and the other governs communication between host computers to allow "document" interchange between host systems. A standard Generalized Data Stream (GDS) encoding scheme is used to form information into a data stream. Each atomic piece of information in a GDS is self-defined in a basic information unit known as the Document Interchange Unit (DIU). The DIU starts with the length of the piece of information (LL), followed by an identification (ID), the attribute or format (F), and then the information itself (data); in some systems the DIU is instead an LT type (length, type, data).

This is the abbreviated version, containing approximately 36% of the total text.

In today's office data processing environment, two architectures have evolved: one governs how information is communicated from office equipment to a host computer, and the other governs communication between host computers to allow "document" interchange between host systems. A standard Generalized Data Stream (GDS) encoding scheme is used to form information into a data stream. Each atomic piece of information in a GDS is self-defined in a basic information unit known as the Document Interchange Unit (DIU). The DIU starts with the length of the piece of information (LL), followed by an identification (ID), the attribute or format (F), and then the information itself (data); in some systems the DIU is instead an LT type (length, type, data). The rules governing the possible combinations of atomic pieces of information inside a DIU are defined by these two architectures.

When a computer system receives a DIU into its memory, a program is needed to analyze the structure of that DIU according to the syntax rules of the respective architecture and then extract each piece of information from the DIU. This process of analysis is usually known as parsing, and the program that does the job is called the parser.

In the field of data communication, the cost of transmission is directly proportional to the amount of data (or characters) that must be shuttled between two computer systems to constitute a meaningful conversation and get a task performed. The ideal, therefore, is to use the minimum amount of data to convey the maximum amount of information. Because of this requirement, the rules that govern the possible combinations of atomic pieces of information are very flexible and complex. In addition, the architecture will evolve with time in order to solve unforeseen problems and handle additional requirements in an electronic office.
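
The LL, ID, F, data layout described above can be illustrated with a minimal sketch. The field widths are assumptions made for illustration only, since the text does not state them: a 2-byte big-endian LL that counts the whole unit including its header, a 2-byte ID, and a 1-byte F. The names `Unit` and `split_units` are hypothetical.

```python
import struct
from typing import List, NamedTuple

# Assumed header layout (not given in the text): 2-byte LL, 2-byte ID, 1-byte F,
# all big-endian. LL is assumed to count the header bytes as well as the data.
HEADER = struct.Struct(">HHB")


class Unit(NamedTuple):
    ident: int   # the ID field
    fmt: int     # the F (attribute or format) field
    data: bytes  # the information itself


def split_units(stream: bytes) -> List[Unit]:
    """Walk a byte stream and split it into self-defined LL/ID/F/data units."""
    units = []
    pos = 0
    while pos < len(stream):
        if len(stream) - pos < HEADER.size:
            raise ValueError("stream ends inside a unit header")
        ll, ident, fmt = HEADER.unpack_from(stream, pos)
        if ll < HEADER.size or pos + ll > len(stream):
            raise ValueError("LL does not describe a complete unit")
        units.append(Unit(ident, fmt, stream[pos + HEADER.size:pos + ll]))
        pos += ll  # LL tells the parser where the next unit begins
    return units
```

Because every unit carries its own length, the loop needs no delimiter characters; it only needs to trust LL, which is why the sketch validates it before slicing.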
As the architectures evolve, more information will be encoded into the data stream, which makes the task of building a parser very difficult. This article describes a generalized table-driven parser that can be used to parse data streams having different syntax rules. Where two syntaxes are used, the two sets of syntax rules are stored in two sets of parse tables. When the parser encounters a data stream of one of the architectures, the respective set of parse tables is loaded into the core memory of the system. The parser has an input function that reads data from the communication line into the program's memory (the parser's working area), as dictated by the parse table, and processes only one atomic piece of information at a time. The parsing algorithm begins by reading the first entry from the first parse table. The table entry has sufficient information to guide the input function to read a certain amount of data into the parser's working area from the communication...
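
The idea of a table-driven parser whose input function reads only what the current table entry dictates might be sketched as follows. This is not the article's actual table format, which is not shown in the available text. The table layout, the field widths, and the names `PARSE_TABLES`, `read_piece`, and `parse_unit` are all assumptions for illustration, and an in-memory byte stream stands in for the communication line.

```python
import io
from typing import BinaryIO, Dict, List, Optional, Tuple

# Hypothetical parse tables, one per syntax. Each entry is (field name, byte
# count). A count of None means "the rest of the unit", derived from the LL
# field already read. All widths here are assumed, not taken from the article.
PARSE_TABLES: Dict[str, List[Tuple[str, Optional[int]]]] = {
    "LLIDF": [("LL", 2), ("ID", 2), ("F", 1), ("data", None)],
    "LT": [("LL", 1), ("T", 1), ("data", None)],
}


def read_piece(line: BinaryIO, count: int) -> bytes:
    """Input function: read exactly `count` bytes into the working area."""
    piece = line.read(count)
    if len(piece) != count:
        raise EOFError("communication line ended inside a unit")
    return piece


def parse_unit(line: BinaryIO, table: List[Tuple[str, Optional[int]]]) -> Dict[str, bytes]:
    """Parse one unit, one atomic piece at a time, as directed by the table."""
    unit: Dict[str, bytes] = {}
    consumed = 0
    for name, count in table:
        if count is None:
            # LL is assumed to count every byte of the unit, header included.
            count = int.from_bytes(unit["LL"], "big") - consumed
            if count < 0:
                raise ValueError("LL is smaller than the unit header")
        unit[name] = read_piece(line, count)
        consumed += count
    return unit


# Usage: the same parser code handles both syntaxes by switching tables.
line = io.BytesIO(bytes([0x00, 0x07, 0xC1, 0x01, 0x00]) + b"hi")
print(parse_unit(line, PARSE_TABLES["LLIDF"]))
```

In this sketch, supporting a new or revised syntax means adding or editing a table rather than changing `parse_unit`, which is the property the table-driven approach is meant to provide.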