Double-Byte Encoding for Thai Characters

IP.com Disclosure Number: IPCOM000062309D
Original Publication Date: 1986-Nov-01
Included in the Prior Art Database: 2005-Mar-09
Document File: 2 page(s) / 36K

Publishing Venue

IBM

Related People

Metwaly, MF: AUTHOR

At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 88% of the total text.

Thai characters include vowels, which are written in various positions relative to consonants, and "tones", which indicate the tonal vocalization used in enunciating a character. The tones likewise occupy certain selected positions relative to characters and their associated vowels. Current data processing equipment for generating Thai supports fifty-three characters, five tones, two lower vowels and six upper vowels. Fig. 1 illustrates the various possible positions of tones and vowels relative to each other and to an associated character. At present, each tone and each vowel is represented by one byte, with the result that each complete Thai character requires at least three bytes.

In order to reduce the number of bytes per character, a double-byte encoding system can be used wherein one byte represents the basic character and the other byte represents the tone and the vowel together with their respective locations relative to the basic character.

With reference to Fig. 2, it is seen that in byte-2, bit 6 always carries a 1-code, so that the contents of the byte are at least 40 hex, i.e., in conformity with the EBCDIC coding system. Bit 7 of byte-2 in Fig. 2 is used to indicate the vowel-tone position, or VP code. Thus:

   VP = 0: The vowel is above the character, and the tone is on Level 1 (see Fig. 1).
   VP = 1: The vowel is below the character, and the tone is on Level 2.

With this double-byte encoding system, as oppos...
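The byte-2 rules stated above (bit 6 always 1, bit 7 carrying the VP code) can be illustrated with a minimal Python sketch. Several points in it are assumptions rather than part of the disclosure: bits are numbered with bit 0 as the least significant, which is inferred from the statement that bit 6 corresponds to 40 hex; the layout of bits 0 to 5 is not given in the surviving text (Fig. 2 is unavailable), so they are treated as one opaque six-bit field; and the function names and the base-character code 0x81 are hypothetical.

```python
# Sketch only: bit numbering (bit 0 = least significant) is inferred from
# "bit 6 ... 40 hex"; the meaning of bits 0-5 is NOT stated in the text.
BIT6_ALWAYS_ONE = 0x40  # bit 6, always 1 according to the text
LOW_BITS_MASK = 0x3F    # bits 0-5, treated as an opaque field (assumption)


def pack_byte2(vp, low_bits=0):
    """Build byte-2 from the VP code (bit 7) and an opaque 6-bit field."""
    if vp not in (0, 1):
        raise ValueError("VP must be 0 or 1")
    if not 0 <= low_bits <= LOW_BITS_MASK:
        raise ValueError("low_bits must fit in bits 0-5")
    return (vp << 7) | BIT6_ALWAYS_ONE | low_bits


def unpack_byte2(byte2):
    """Return (vp, low_bits) after checking that bit 6 is set."""
    if not 0 <= byte2 <= 0xFF:
        raise ValueError("byte-2 must be a single byte")
    if not byte2 & BIT6_ALWAYS_ONE:
        raise ValueError("bit 6 of byte-2 must be 1")
    return (byte2 >> 7) & 1, byte2 & LOW_BITS_MASK


def encode_character(base_code, vp, low_bits=0):
    """Return the two-byte form: byte-1 is the basic character, then byte-2."""
    return bytes([base_code, pack_byte2(vp, low_bits)])


if __name__ == "__main__":
    # 0x81 is a hypothetical code for a basic character.
    encoded = encode_character(0x81, vp=1)
    print(len(encoded), hex(encoded[1]))  # prints: 2 0xc0
```

Whatever the low six bits hold, byte-2 falls in the range 40 to 7F hex when VP = 0 and C0 to FF hex when VP = 1, so it never drops below 40 hex.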