
Extension (CVTTEXT) to REXX language to allow rules based processing of input

IP.com Disclosure Number: IPCOM000014575D
Original Publication Date: 2002-May-21
Included in the Prior Art Database: 2003-Jun-19
Document File: 2 page(s) / 47K

Publishing Venue

IBM

Abstract

Disclosed is a program that converts a flatfile of text to any other given format via a simple rules file. The rules language is simpler to learn than Lexx, Yacc, Perl, Snobol, or Lisp but nearly as robust. Performance is at least as good as Perl. It alleviates the need to write one-time-only parsers and greatly reduces the lines of code necessary while maintaining a high standard of readability (far more readable than the other languages mentioned). Some example uses follow. Given a forum or bulletin-board file with any number of appends or posts, write a rules file to handle the repeating blocks and the variable info in-between. Output to "structured text" for importing the entire forum/bulletin-board into Lotus Notes. Output to XML or HTML for internet use. Typical rules file for this is less than 30 lines long. Given a complete SQL dump of DB variables in a standard format, write a rules file to create FTP, JCL, and SMP/E jobs to move, load, and process the objects identified in the SQL dump. Typical rules file for this is less than 60 lines long. Given SGML, XML, or HTML with numerous tagged elements in random order, write a rules file to organize the tags in a standard manner, correctly moving the text between tags with the re-ordered tags. Typical rules file is under 300 lines long, with subsecond response on thousands of files.

This is the abbreviated version, containing approximately 54% of the total text.



    Disclosed is a program that converts a flatfile of text to any other given format via a simple rules file. The rules language is simpler to learn than Lexx, Yacc, Perl, Snobol, or Lisp but nearly as robust. Performance is at least as good as Perl. It alleviates the need to write one-time-only parsers and greatly reduces the lines of code necessary while maintaining a high standard of readability (far more readable than the other languages mentioned). Some example uses follow.

Given a forum or bulletin-board file with any number of appends or posts, write a rules file to handle the repeating blocks and the variable info in-between. Output to "structured text" for importing the entire forum/bulletin-board into Lotus Notes. Output to XML or HTML for internet use. Typical rules file for this is less than 30 lines long.

Given a complete SQL dump of DB variables in a standard format, write a rules file to create FTP, JCL, and SMP/E jobs to move, load, and process the objects identified in the SQL dump. Typical rules file for this is less than 60 lines long.

Given SGML, XML, or HTML with numerous tagged elements in random order, write a rules file to organize the tags in a standard manner, correctly moving the text between tags with the re-ordered tags. Typical rules file is under 300 lines long, with subsecond response on thousands of files.

Given a text database, write a rules file to "data mine" or analyze the database for trends in data. Typical rules file is under 20 lines long per item to be analyzed. The richest data mining occurs in databases with structured information, e.g., SGML, XML, HTML, Lotus Notes structured text, etc.
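
The kind of per-item trend analysis described above can be sketched as a single pattern applied to every record, with a tally of what it captures. This is only an illustration in Python, not CVTTEXT rules syntax (which this disclosure does not show); the field name `Severity`, the sample records, and the function name `tally` are assumptions invented for the example.

```python
import re
from collections import Counter

# Hypothetical field to analyze; "Severity" is an invented example field.
FIELD = re.compile(r"^Severity:\s*(?P<value>\w+)\s*$")


def tally(lines):
    """Count how often each value of the field appears in the input."""
    counts = Counter()
    for line in lines:
        match = FIELD.match(line)
        if match:
            counts[match.group("value")] += 1
    return counts


# Invented structured-text records, one field per line.
database = [
    "Title: Printer offline",
    "Severity: high",
    "Title: Typo in manual",
    "Severity: low",
    "Title: Login fails",
    "Severity: high",
]
counts = tally(database)
for value, n in counts.most_common():
    print(value, n)
```

Each additional item to analyze would need its own pattern and tally, which is consistent with the per-item sizing given above.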

Given Java, Ada, C++, or other languages that need documenting, write a rules file to provide reference information for all classes, relations, attributes, methods, and exceptions. Output to SGML, XML, or HTML. Typically handled with about 6 rules files (one for each large category, e.g., classes), with each rules file being about 50 lines of code.

A functional overview follows: The CVTTEXT RULES file, a file of expected input syntax and corresponding Rexx statements, is loaded first. Then, the input file is processed a record at a time through this logic starting with the level 1 (l=1) <START><END> group. This logic generates the output data, while CVTTEXT EXEC generates the output file. If the output file already exists it is ERASEd first.
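
The flow in this overview (load the rules, then pass the input through them one record at a time to produce the output data) can be sketched as follows. This is a minimal illustration in Python rather than Rexx, and it does not reproduce the CVTTEXT rules syntax or the <START><END> grouping, neither of which is shown here. The rule format (a regular expression paired with an output template), the sample forum lines, and the names `RULES` and `convert` are all assumptions made for the example.

```python
import re

# Hypothetical rules: (pattern for an expected input line, output template).
# CVTTEXT pairs expected input syntax with Rexx statements; a regular
# expression plus a format string stands in for that pairing here.
RULES = [
    (re.compile(r"^Appended by (?P<author>.+) on (?P<date>.+)$"),
     '<post author="{author}" date="{date}">'),
    (re.compile(r"^-{4,}$"), "</post>"),
    (re.compile(r"^(?P<text>.+)$"), "  <line>{text}</line>"),
]


def convert(records, rules=RULES):
    """Pass each input record through the rules; the first match wins."""
    output = []
    for record in records:
        line = record.rstrip("\n")
        for pattern, template in rules:
            match = pattern.match(line)
            if match:
                output.append(template.format(**match.groupdict()))
                break  # stop at the first matching rule
        # Records that match no rule (here, empty lines) produce no output.
    return output


# Invented forum-style input: a header line, body text, and a separator.
sample = [
    "Appended by J. Smith on 2002-05-21",
    "Has anyone tried the new rules file?",
    "----",
]
result = convert(sample)
print("\n".join(result))
```

The sketch does no XML escaping and keeps no state between records; a real rules file for repeating blocks would need both.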

Command Syntax:

CVTTEXT ifile-id < (
        <CTLCHAR c>
        <DEBUG>
        <OUTFILE ofile-id>
        <REPLACE>
        <RULES rfn>
        <(parmtext
        >

where:

ifile-id...fn ft fm of input file....(required)
c..........control character.........(def: \)
DEBUG......run with REXX trace on....(optional)
ofile-id...fn ft fm of output file...(def: ifile-fn CVTTEXT A)
REPLACE....if the return code is 0, then copy the output file into the
           input file (replacing it) and erase the output file.
rfn........fn of the RULES file......(def: CVTTEXT RULES *)
parm...
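
The REPLACE behavior described above (act only on return code 0, copy the output over the input, then erase the output) can be sketched as a small helper. This is an illustration in Python under assumed names, not the CVTTEXT EXEC itself; `finish_replace` and the placeholder file names are hypothetical.

```python
import os
import shutil
import tempfile


def finish_replace(input_path, output_path, return_code):
    """Apply REPLACE as described: only on return code 0, copy the output
    over the input, then erase the output. Returns True if replaced."""
    if return_code != 0:
        return False  # leave both files untouched on a nonzero return code
    shutil.copyfile(output_path, input_path)
    os.remove(output_path)
    return True


# Usage with throwaway files (the file names are placeholders).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "input.txt")
out = os.path.join(workdir, "output.txt")
with open(src, "w") as f:
    f.write("original\n")
with open(out, "w") as f:
    f.write("converted\n")

replaced = finish_replace(src, out, return_code=0)
with open(src) as f:
    content = f.read()
output_gone = not os.path.exists(out)
print(replaced, content.strip(), output_gone)

shutil.rmtree(workdir)  # clean up the throwaway directory
```

Checking the return code first means a failed conversion leaves the original input file intact.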