
Byte-Type Recognition Algorithm

IP.com Disclosure Number: IPCOM000113390D
Original Publication Date: 1994-Aug-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 2 page(s) / 29K

Publishing Venue

IBM

Related People

Nguyen, BQ: AUTHOR

Abstract

A method is described for differentiating a double-byte character from a single-byte character in a mixed character string.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 92% of the total text.

Byte-Type Recognition Algorithm

      A method is described for differentiating a double-byte
character from a single-byte character in a mixed character string.

o   Algorithm: ByteType

o   Input: s - The string containing the byte in question

o          n - The index in s of the byte in question

o   Output: DBCS_1ST - s[n] is the first byte of a double-byte
    character

o           DBCS_2ND - s[n] is the second byte of a double-byte
    character

o           SBCS     - s[n] is a single-byte character

o     Assumptions:

o     The macro isdbcs1( asciiValue ) returns DBCS_1ST if asciiValue
    is in the DBCS ranges and returns SBCS otherwise.  For example,
    isdbcs1(83hex) returns DBCS_1ST, but isdbcs1(45hex) returns SBCS.

o     The input string s starts at index 0; that is, 0 indicates the
    first byte in the string, though this assumption is not
    necessary.

o     The index n is a valid index into the string s; that is,
    sizeof(s)>n.

o     The last byte in the string is reported as DBCS_1ST if it is
    in the DBCS ranges, even though it does not form a complete
    double-byte character because no second byte follows it.
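
The isdbcs1 assumption above can be illustrated with a minimal Python
sketch. The text does not name a code page, so the lead-byte ranges
below (81hex-9Fhex and E0hex-FChex, as in Shift-JIS) are an assumption
chosen only because they agree with the two examples given; the
numeric values of the constants are likewise hypothetical.

```python
# Hypothetical numeric values for the result names used in the text.
SBCS = 0
DBCS_1ST = 1
DBCS_2ND = 2

# Assumed lead-byte ranges (Shift-JIS style). The source does not say
# which DBCS code page is intended, so treat these as placeholders.
LEAD_BYTE_RANGES = ((0x81, 0x9F), (0xE0, 0xFC))


def isdbcs1(value: int) -> int:
    """Return DBCS_1ST if value lies in an assumed lead-byte range, else SBCS."""
    for low, high in LEAD_BYTE_RANGES:
        if low <= value <= high:
            return DBCS_1ST
    return SBCS
```

With these assumed ranges, isdbcs1(0x83) gives DBCS_1ST and
isdbcs1(0x45) gives SBCS, matching the examples in the assumption.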

   BEGIN
       byteType = isdbcs1(s[n]);
       LOOP:
          IF (n=0) OR (isdbcs1(s[n-1]) = SBCS)
              RETURN byteType;
          n = n-1;
          IF (n=0)
             RETURN DBCS_2ND;
  ...
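
Because the listing above is abbreviated, the following Python sketch
is one possible way to express a backward scan of this kind, not a
reconstruction of the omitted steps. It counts the run of lead-range
bytes immediately before s[n]: an odd run means s[n] is consumed as a
second byte, and an even run means s[n] starts a character. The
lead-byte ranges (Shift-JIS style), the constant values, and the name
byte_type are assumptions made for illustration.

```python
# Hypothetical numeric values for the result names used in the text.
SBCS, DBCS_1ST, DBCS_2ND = 0, 1, 2


def isdbcs1(value: int) -> int:
    """Assumed lead-byte test (Shift-JIS style ranges, not from the source)."""
    if 0x81 <= value <= 0x9F or 0xE0 <= value <= 0xFC:
        return DBCS_1ST
    return SBCS


def byte_type(s: bytes, n: int) -> int:
    """Classify s[n] as SBCS, DBCS_1ST, or DBCS_2ND by scanning backward."""
    if not 0 <= n < len(s):
        raise IndexError("n must be a valid index into s")

    # Count consecutive lead-range bytes directly before position n.
    # The scan stops at the start of the string or at the first byte
    # outside the lead ranges, after which a new character must begin.
    run = 0
    i = n - 1
    while i >= 0 and isdbcs1(s[i]) == DBCS_1ST:
        run += 1
        i -= 1

    # The bytes in the run pair up from the left, so an odd run leaves
    # one unpaired lead byte whose second byte is s[n].
    if run % 2 == 1:
        return DBCS_2ND
    # Otherwise s[n] begins a character and is classified on its own.
    return isdbcs1(s[n])
```

For the bytes 41hex 83hex 41hex 42hex, this sketch classifies index 1
as DBCS_1ST and index 2 as DBCS_2ND, even though 41hex on its own
would be SBCS. This agrees with the visible steps of the listing: the
byte's own type is returned when n is 0 or the preceding byte is SBCS,
and DBCS_2ND is returned when a single lead byte at index 0 precedes
it.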