Kanji DATA Tokenizer

IP.com Disclosure Number: IPCOM000061832D
Original Publication Date: 1986-Sep-01
Included in the Prior Art Database: 2005-Mar-09
Document File: 1 page(s) / 12K

Publishing Venue

IBM

Related People

Olson, D: AUTHOR [+2]

Abstract

This is a method for scanning text data that contains regular EBCDIC characters as well as double-byte characters (DBCS). The method allows: (1) the location of delimiters, which are used to differentiate between EBCDIC and DBCS data, (2) validation of the presence of DBCS data (by ensuring that the number of bytes in the DBCS data field is always a multiple of two), and (3) isolation of the DBCS data for subsequent processing. The routine searches for DBCS data strings and delimiters (such as 'comma', 'parentheses') in a specified field (specifically, a 'mixed field'). The delimiters to search for are defined in the input parameter, and a maximum of 8 different characters can be specified. When DBCS data are found, the position and length of the DBCS string are set in the parameter area.

This is the abbreviated version, containing approximately 84% of the total text.

Kanji DATA Tokenizer

This is a method for scanning text data that contains regular EBCDIC characters as well as double-byte characters (DBCS). The method allows: (1) the location of delimiters, which are used to differentiate between EBCDIC and DBCS data, (2) validation of the presence of DBCS data (by ensuring that the number of bytes in the DBCS data field is always a multiple of two), and (3) isolation of the DBCS data for subsequent processing.

The routine searches for DBCS data strings and delimiters (such as 'comma', 'parentheses') in a specified field (specifically, a 'mixed field'). The delimiters to search for are defined in the input parameter, and a maximum of 8 different characters can be specified. When DBCS data are found, the position and length of the DBCS string are set in the parameter area. When a delimiter is found first, searching stops and the delimiter position is saved in the parameter area (if delimiters were specified), and the type of delimiter is returned to the caller as the delimiter sequence number. The search field length can be at most 32767 bytes. If the second entry point is used, searching stops when any character other than a delimiter is found (if a blank is specified as a delimiter, the first non-blank character position is returned).

PROCESSING LOGIC

.SEARCH 'SO', 'SI' AND DELIMITER CHARACTERS SPECIFIED IN THE INPUT PARAMETER
.IF DELIMITER IS SPECIFIED IN INPUT PARAMETER, SEARCH PROCESSING STOPS AS...
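
The scan described above can be illustrated with a minimal Python sketch. It is not the disclosed routine: the processing logic is truncated in this copy, so the function names (scan_mixed_field, skip_delimiters), the ScanResult layout, and the error handling are all assumptions made for illustration. The only values taken as given are the standard EBCDIC shift-out (X'0E') and shift-in (X'0F') control codes that bracket DBCS data, and the limits of 8 delimiters and 32767 bytes stated in the text.

```python
from typing import NamedTuple

SO = 0x0E  # EBCDIC shift-out: DBCS data starts after this byte
SI = 0x0F  # EBCDIC shift-in: DBCS data ends before this byte
MAX_DELIMITERS = 8        # limit stated in the disclosure
MAX_FIELD_LENGTH = 32767  # limit stated in the disclosure


class ScanResult(NamedTuple):
    """Hypothetical stand-in for the 'parameter area' of the disclosure."""
    kind: str          # "dbcs", "delimiter", or "none"
    position: int      # 0-based offset into the field, -1 if nothing found
    length: int        # byte length of the DBCS string (0 otherwise)
    delimiter_no: int  # 1-based delimiter sequence number (0 otherwise)


def _check(field: bytes, delimiters: bytes) -> None:
    if len(field) > MAX_FIELD_LENGTH:
        raise ValueError("search field longer than 32767 bytes")
    if len(delimiters) > MAX_DELIMITERS or len(set(delimiters)) != len(delimiters):
        raise ValueError("at most 8 different delimiter characters allowed")


def scan_mixed_field(field: bytes, delimiters: bytes = b"", start: int = 0) -> ScanResult:
    """First entry point (assumed): stop at the first DBCS string or delimiter."""
    _check(field, delimiters)
    for pos in range(start, len(field)):
        byte = field[pos]
        if byte == SO:
            end = field.find(bytes([SI]), pos + 1)
            if end < 0:
                raise ValueError("SO without a matching SI")
            length = end - (pos + 1)
            if length % 2:
                # DBCS data must occupy a whole number of byte pairs.
                raise ValueError("DBCS data has an odd number of bytes")
            return ScanResult("dbcs", pos + 1, length, 0)
        if byte in delimiters:
            return ScanResult("delimiter", pos, 0, delimiters.index(byte) + 1)
    return ScanResult("none", -1, 0, 0)


def skip_delimiters(field: bytes, delimiters: bytes, start: int = 0) -> int:
    """Second entry point (assumed): offset of the first non-delimiter byte, or -1."""
    _check(field, delimiters)
    for pos in range(start, len(field)):
        if field[pos] not in delimiters:
            return pos
    return -1
```

In this sketch, a caller would pass the mixed field and the delimiter set as EBCDIC bytes (for example X'6B' for a comma or X'40' for a blank) and resume scanning from the returned position. Bytes between SO and SI are never compared against the delimiter set, so a DBCS byte that happens to equal a delimiter code is not misreported.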