Extended Binary Coded Decimal Interchange Code Extension Technique to Allow Coexistence of Multiple Coded Character Sets

IP.com Disclosure Number: IPCOM000122834D
Original Publication Date: 1998-Jan-01
Included in the Prior Art Database: 2005-Apr-04
Document File: 4 page(s) / 113K

Publishing Venue

IBM

Related People

Enomoto, Y: AUTHOR [+2]

Abstract

Disclosed is an architecture to extend the DBCS (Double-byte Character Set) host code, which has been designed based on EBCDIC (Extended Binary Coded Decimal Interchange Code) of IBM S/390 and AS/400 systems, to multiple-byte codes. This architecture also allows coexistence of more than one coded character set (e.g., coexistence of Japanese, Korean, Simplified Chinese and Traditional Chinese DBCS-Host codes) in EBCDIC code schemes, which was not considered feasible in the past. This architecture is also considered to be a basic technology or framework towards realizing multi-language processing systems.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Extended Binary Coded Decimal Interchange Code Extension Technique
to Allow Coexistence of Multiple Coded Character Sets

      Disclosed is an architecture to extend the DBCS (Double-byte
Character Set) host code, which has been designed based on EBCDIC
(Extended Binary Coded Decimal Interchange Code) of IBM S/390 and
AS/400 systems, to multiple-byte codes.  This architecture also
allows coexistence of more than one coded character set (e.g.,
coexistence of Japanese, Korean, Simplified Chinese and Traditional
Chinese DBCS-Host codes) in EBCDIC code schemes, which was not
considered feasible in the past.  This architecture is also
considered to be a basic technology or framework towards realizing
multi-language processing systems.

      In this architecture, N bytes represent a character.  In
theory N can be any integer of 3 or more; in practice N ranges from 3
through 190 inclusive.  Each byte ranges from X'41' through X'FE'.  N
consecutive X'40' bytes represent the N-byte space character.
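
      The byte-range rule above can be sketched as a small validity
check.  This is only an illustration under stated assumptions: the
function name is_valid_char and the constant names are hypothetical
and do not come from the disclosure, and the check enforces the
practical limit of 3 to 190 bytes.

```python
# Hypothetical names; only the byte values come from the text above.
BYTE_MIN = 0x41    # lowest value allowed in any byte of a character
BYTE_MAX = 0xFE    # highest value allowed in any byte of a character
SPACE_BYTE = 0x40  # N of these in a row form the N-byte space
N_MIN, N_MAX = 3, 190  # practical range of N stated in the text


def is_valid_char(code: bytes, n: int) -> bool:
    """Return True if `code` is the N-byte space or an N-byte character."""
    if not N_MIN <= n <= N_MAX or len(code) != n:
        return False
    if code == bytes([SPACE_BYTE]) * n:
        return True  # the N-byte space character
    # Every byte of a non-space character must lie in X'41'..X'FE'.
    return all(BYTE_MIN <= b <= BYTE_MAX for b in code)
```

      Under these assumptions X'404040' is accepted as the
triple-byte space, while X'404141' is rejected because X'40' may
appear only in the all-space pattern.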

      Fig. 1 shows, as an example, the "Triple-byte code
architecture": the 1st, 2nd and 3rd bytes each range from X'41'
through X'FE', and X'404040' represents the triple-byte space
character.  The three-byte code can accommodate 6,859,000 (= 190^3)
characters excluding the space character.  Fig. 2 shows, as another
example, the "Four-byte code architecture": the 1st, 2nd, 3rd and 4th
bytes each range from X'41' through X'FE', and X'40404040' represents
the four-byte space character.  The four-byte code can accommodate
about 1.3 billion (= 190^4 = 1,303,210,000) characters.  The 1st byte
indicates a collection of 190 double-byte code areas, where each
double-byte code area consists of 190 wards and 190 points.
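
      The capacity figures follow from each byte taking one of
X'FE' - X'41' + 1 = 190 values.  The sketch below reproduces that
arithmetic.  The helper char_index, which treats the bytes as
base-190 digits to give each character a sequential number, is an
assumption added for illustration and is not part of the disclosure.

```python
VALUES_PER_BYTE = 0xFE - 0x41 + 1  # 190 usable values per byte


def capacity(n: int) -> int:
    """Number of non-space characters an N-byte code can hold."""
    return VALUES_PER_BYTE ** n


def char_index(code: bytes) -> int:
    """Hypothetical helper: zero-based index of a character, reading
    its bytes as base-190 digits (X'41' is digit 0, X'FE' is digit 189)."""
    index = 0
    for b in code:
        if not 0x41 <= b <= 0xFE:
            raise ValueError(f"byte X'{b:02X}' is outside X'41'..X'FE'")
        index = index * VALUES_PER_BYTE + (b - 0x41)
    return index
```

      With these definitions capacity(3) is 6,859,000 and capacity(4)
is 1,303,210,000, matching the counts quoted for Fig. 1 and Fig. 2.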

      The following shift code sequences differentiate the
multiple-byte codes from existing EBCDIC or DBCS-Host (double-byte)
codes.
  Shift from EBCDIC to three-byte code   ... X'0EFE41'
  Shift from EBCDIC to four-byte code    ... X'0EFE42'
  Shift from EBCDIC to five-byte code    ... X'0EFE43'
                           :
  Shift from EBCDIC to 190-byte code     ... X'0EFEFC'
  Shift back from multiple-byte codes to EBCDIC  ... X'0F'

      Note: DBCS-Host codes are differentiated from EBCDIC codes by
X'0E', and EBCDIC codes are differentiated from DBCS-Host codes by
X'0F'.  So that X'FExx' can serve as part of the shift code sequence,
existing DBCS-Host codes will not define any graphic characters from
X'FE41' through X'FEFC'.
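
      The shift code table can be sketched as follows.  The listed
entries (three-byte at X'41', 190-byte at X'FC') are consistent with
the third byte being X'41' + (N - 3); the code assumes that this
linear rule also holds for the entries elided by the colon.  The
function names and the segment routine are hypothetical; segment
only illustrates how a reader of a mixed stream might tell the three
kinds of code apart, starting in EBCDIC mode.

```python
SO, SI, PREFIX = 0x0E, 0x0F, 0xFE  # shift-out, shift-in, X'FE' marker


def shift_sequence(n: int) -> bytes:
    """Shift sequence from EBCDIC to the N-byte code (3 <= N <= 190)."""
    if not 3 <= n <= 190:
        raise ValueError("N must be in 3..190")
    return bytes([SO, PREFIX, 0x41 + (n - 3)])  # assumed linear mapping


def width_from_shift(seq: bytes) -> int:
    """Inverse of shift_sequence: recover N from X'0EFExx'."""
    if len(seq) != 3 or seq[0] != SO or seq[1] != PREFIX:
        raise ValueError("not a multiple-byte shift sequence")
    if not 0x41 <= seq[2] <= 0xFC:
        raise ValueError("third byte outside X'41'..X'FC'")
    return seq[2] - 0x41 + 3


def segment(stream: bytes) -> list:
    """Split a stream into (width, character) pairs.
    Width 1 is EBCDIC, 2 is DBCS-Host, 3 and up is an N-byte code."""
    out, width, i = [], 1, 0
    while i < len(stream):
        b = stream[i]
        if width == 1 and b == SO:
            nxt = stream[i + 1:i + 3]
            if len(nxt) == 2 and nxt[0] == PREFIX and 0x41 <= nxt[1] <= 0xFC:
                width = width_from_shift(stream[i:i + 3])
                i += 3
            else:
                width = 2  # plain X'0E': existing DBCS-Host code
                i += 1
        elif width > 1 and b == SI:
            width = 1  # X'0F' shifts back to EBCDIC
            i += 1
        else:
            if i + width > len(stream):
                raise ValueError("truncated character at end of stream")
            out.append((width, stream[i:i + width]))
            i += width
    return out
```

      This disambiguation relies on the note above: because
DBCS-Host defines no graphic characters from X'FE41' through
X'FEFC', an X'0E' followed by such a pair can be read as a
multiple-byte shift rather than as the start of double-byte text.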

     ...