Extended Binary Coded Decimal Interchange Code Extension Technique to Allow Coexistence of Multiple Coded Character Sets

IP.com Disclosure Number: IPCOM000122834D
Original Publication Date: 1998-Jan-01
Included in the Prior Art Database: 2005-Apr-04
Document File: 4 page(s) / 113K

Publishing Venue

IBM

Related People

Enomoto, Y: AUTHOR [+2]

Abstract

Disclosed is an architecture to extend the DBCS (Double-byte Character Set) host code, which has been designed based on EBCDIC (Extended Binary Coded Decimal Interchange Code) of IBM S/390 and AS/400 systems, to multiple-byte codes. This architecture also allows coexistence of more than one coded character set (e.g., coexistence of Japanese, Korean, Simplified Chinese and Traditional Chinese DBCS-Host codes) in EBCDIC code schemes, which was not considered feasible in the past. This architecture is also considered to be a basic technology or framework towards realizing multi-language processing systems.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Extended Binary Coded Decimal Interchange Code Extension Technique
to Allow Coexistence of Multiple Coded Character Sets

      Disclosed is an architecture to extend the DBCS (Double-byte
Character Set) host code, which has been designed based on EBCDIC
(Extended Binary Coded Decimal Interchange Code) of IBM S/390 and
AS/400 systems, to multiple-byte codes.  This architecture also
allows coexistence of more than one coded character set (e.g.,
coexistence of Japanese, Korean, Simplified Chinese and Traditional
Chinese DBCS-Host codes) in EBCDIC code schemes, which was not
considered feasible in the past.  This architecture is also
considered to be a basic technology or framework towards realizing
multi-language processing systems.

      In this architecture, N bytes represent a character.  In
theory N can be any integer greater than 2; in practice N ranges
from 3 through 190 inclusive.  Each byte of a character ranges from
X'41' through X'FE', giving 190 possible values per byte.  N
consecutive bytes of X'40' represent the N-byte space character.
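
The byte-range rule above can be sketched as a small validity check.
This is only an illustration under stated assumptions: the names
is_valid_char, LOW, HIGH and SPACE are hypothetical and do not come
from the disclosure.

```python
# Hypothetical helper names; only the byte values come from the text above.
LOW, HIGH = 0x41, 0xFE   # each byte of a character lies in X'41'..X'FE'
SPACE = 0x40             # N bytes of X'40' form the N-byte space character


def is_valid_char(code: bytes, n: int) -> bool:
    """Return True if `code` is a well-formed n-byte character or the n-byte space."""
    if not 3 <= n <= 190 or len(code) != n:
        return False
    if code == bytes([SPACE]) * n:
        return True
    return all(LOW <= b <= HIGH for b in code)
```

For example, is_valid_char(bytes.fromhex("404040"), 3) accepts the
triple-byte space, while a character that mixes X'40' with other
bytes is rejected.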

      Fig. 1 shows, as an example, the "Triple-byte code
architecture": the 1st, 2nd and 3rd bytes each range from X'41'
through X'FE', and X'404040' represents the triple-byte space
character.  The three-byte code can accommodate 6,859,000 (= 190^3)
characters, excluding the space character.  Fig. 2 shows, as another
example, the "Four-byte code architecture": the 1st, 2nd, 3rd and
4th bytes each range from X'41' through X'FE', and X'40404040'
represents the four-byte space character.  The four-byte code can
accommodate about 1.3 billion (= 190^4) characters.  The 1st byte
indicates a collection of 190 double-byte code areas, where each
double-byte code area consists of 190 wards and 190 points.
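
The capacity figures follow from 190 possible values per byte.  The
sketch below reproduces the arithmetic and, purely as an assumption
for illustration, numbers the characters as base-190 integers with
the 1st byte most significant.  The disclosure does not define such
an ordering, and the names capacity and to_index are hypothetical.

```python
FIRST, LAST = 0x41, 0xFE
VALUES_PER_BYTE = LAST - FIRST + 1   # 190 values in X'41'..X'FE'


def capacity(n: int) -> int:
    """Number of n-byte characters, excluding the n-byte space character."""
    return VALUES_PER_BYTE ** n


def to_index(code: bytes) -> int:
    """Assumed ordering: treat the bytes as base-190 digits, 1st byte first."""
    index = 0
    for b in code:
        if not FIRST <= b <= LAST:
            raise ValueError(f"byte X'{b:02X}' is outside X'41'..X'FE'")
        index = index * VALUES_PER_BYTE + (b - FIRST)
    return index
```

Under this assumed ordering, X'414141' maps to index 0 and X'FEFEFE'
maps to the last index of the three-byte space, capacity(3) - 1.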

      The following shift code sequences differentiate the
multiple-byte codes from the existing EBCDIC or DBCS-Host
(double-byte) codes.
  Shift from EBCDIC to three-byte code            ... X'0EFE41'
  Shift from EBCDIC to four-byte code             ... X'0EFE42'
  Shift from EBCDIC to five-byte code             ... X'0EFE43'
                           :
  Shift from EBCDIC to 190-byte code              ... X'0EFEFC'
  Shift back from multiple-byte codes to EBCDIC   ... X'0F'

      Note: DBCS-Host codes are differentiated from EBCDIC codes by
X'0E', and EBCDIC codes are differentiated from DBCS-Host codes by
X'0F'.  So that X'FExx' can be used as part of the shift code
sequence, the existing DBCS-Host codes will not define any graphic
characters from X'FE41' through X'FEFC'.
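
As a rough sketch of how a reader of such a data stream might apply
the shift codes, the following builds the shift sequence for a given
N and splits a stream into runs by character width.  It is not a
definitive decoder.  The function names shift_sequence and segment
are hypothetical, and the code assumes that X'0F' never occurs inside
a multiple-byte character, which is consistent with the stated byte
range of X'40' through X'FE'.

```python
SO, SI, ESC = 0x0E, 0x0F, 0xFE   # shift-out, shift-in, 2nd byte of the N-byte shift


def shift_sequence(n: int) -> bytes:
    """Shift code from EBCDIC to the n-byte code: X'0EFE41' for n = 3, and so on."""
    if not 3 <= n <= 190:
        raise ValueError("n must be in 3..190")
    return bytes([SO, ESC, 0x41 + (n - 3)])


def segment(stream: bytes) -> list:
    """Split a stream into (width, payload) runs; width 1 is EBCDIC, 2 is DBCS-Host."""
    runs = []
    i = 0
    while i < len(stream):
        if stream[i] == SO:
            # X'0E' followed by X'FE' and a byte in X'41'..X'FC' selects an N-byte code.
            if (stream[i + 1:i + 2] == bytes([ESC]) and i + 2 < len(stream)
                    and 0x41 <= stream[i + 2] <= 0xFC):
                width = stream[i + 2] - 0x41 + 3
                i += 3
            else:
                width = 2          # plain X'0E' starts DBCS-Host data
                i += 1
            end = stream.find(bytes([SI]), i)
            if end == -1:
                end = len(stream)
            runs.append((width, stream[i:end]))
            i = end + 1            # skip the X'0F' that shifts back to EBCDIC
        else:
            end = stream.find(bytes([SO]), i)
            if end == -1:
                end = len(stream)
            runs.append((1, stream[i:end]))
            i = end
    return runs
```

Because existing DBCS-Host codes leave X'FE41' through X'FEFC'
undefined, checking the two bytes after X'0E' is enough here to tell
an N-byte shift from ordinary double-byte data.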

     ...