Interactive Ideographic System
Original Publication Date: 1974-Dec-01
Included in the Prior Art Database: 2005-Feb-28
Miller, IM: AUTHOR [+3]
INTRODUCTION
Ideograms are the means of written communication used by some 750 million people throughout the Far East. Syllabic writing, such as Japanese Kana, is used primarily as a limited supplement to Kanji, which is the most widely used form of Japanese writing. Kanji is very similar to Chinese writing; in fact, many of the ideograms of Kanji are taken directly from Chinese. Other Oriental languages use different ideograms, but their general structure is similar to that of Chinese.
One of the noteworthy characteristics of ideographic languages is complexity. A typical Chinese dictionary contains over 40,000 unique ideograms. A vocabulary of over 2,000 ideograms is required to read a Chinese newspaper without substantial loss of meaning.
The system described below provides a practical structure for Chinese ideograms, together with a system for applying that structure to everyday problems of data entry, storage, retrieval, and updating. Similar work is described by Shi-Kuo Chang, "An Interactive System for Chinese Character Generation and Retrieval," IEEE Transactions on Systems, Man, and Cybernetics, May 1973, pp. 257-265.
If a Chinese ideogram is equated to an English word (they are somewhat analogous), then it is possible to break the ideogram into nonunique components (radicals) which are analogous to English letters. Thus, it is possible to represent an ideogram as a number of simple radicals together with a "structure" definition, which shows how to arrange the radicals to form the ideogram. It is estimated that 40,000 ideograms can be represented by approximately 500 radicals and 16 structures.
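As an illustration of the radicals-plus-structure representation, the following is a minimal sketch in Python. The index numbering, the `Ideogram` class, and the fixed-width bit estimate are assumptions made for illustration only; the source does not specify how radicals or structures are numbered or stored.

```python
from dataclasses import dataclass
from typing import Tuple

NUM_RADICALS = 500    # approximate radical count estimated in the text
NUM_STRUCTURES = 16   # structure count estimated in the text


def bits_needed(count: int) -> int:
    """Smallest number of bits that can index `count` distinct items."""
    return max(1, (count - 1).bit_length())


@dataclass(frozen=True)
class Ideogram:
    """One ideogram: a structure index plus an ordered tuple of radical indices.

    The numbering of structures and radicals here is hypothetical.
    """
    structure: int
    radicals: Tuple[int, ...]

    def __post_init__(self) -> None:
        if not 0 <= self.structure < NUM_STRUCTURES:
            raise ValueError("structure index out of range")
        if not self.radicals:
            raise ValueError("an ideogram needs at least one radical")
        if any(not 0 <= r < NUM_RADICALS for r in self.radicals):
            raise ValueError("radical index out of range")

    def storage_bits(self) -> int:
        """Bits used by a naive fixed-width encoding (an assumption, not the source's algorithm)."""
        return bits_needed(NUM_STRUCTURES) + len(self.radicals) * bits_needed(NUM_RADICALS)


# A made-up ideogram built from two radicals arranged by structure 3.
example = Ideogram(structure=3, radicals=(17, 42))
print(example.storage_bits())
```

Under these assumptions a structure index fits in 4 bits and a radical index in 9 bits, so the two-radical example occupies 22 bits.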
The Chinese language interface subsystem embodies the ideogram structuring technique in a system comprising:
- A data entry and display device, such as a vector display terminal.
- A data encoding algorithm to permit effective data storage and manipulation.
- A set of computer programs to permit definition, entry, and retrieval of Chinese ideograms.

SYSTEM DESCRIPTION
An interactive system for rapid entry, display and retrieval of ideographic characters may utilize a graphic display terminal connected via a voice-grade telephone line to an APL system, together with a large tablet which is connected to the terminal to provide X-Y position input.
Structure definitions, radicals, and system control commands are laid out on the tablet in a rectangular array. Pointing to an element of the array with the tablet's pen communicates an X-Y position to the APL program, which interprets the X-Y value as a structure, radical, or control command. The resultant ideograms are displayed on the terminal as they are stored in the APL workspace.
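The mapping from a pen position to a tablet element can be sketched as follows. This is an illustrative Python sketch, not the APL program described above: the tablet dimensions, the 25 x 21 grid, and the ordering of cells (structures first, then radicals, then commands) are all hypothetical.

```python
from typing import Tuple

# Hypothetical tablet geometry, in arbitrary integer units.
TABLET_WIDTH = 1000
TABLET_HEIGHT = 1000
COLS = 25
ROWS = 21             # 25 * 21 = 525 cells

# Hypothetical cell layout: 16 structures, then 500 radicals, then commands.
NUM_STRUCTURES = 16
NUM_RADICALS = 500


def cell_index(x: int, y: int) -> int:
    """Convert a pen position to a row-major cell index in the array."""
    if not (0 <= x < TABLET_WIDTH and 0 <= y < TABLET_HEIGHT):
        raise ValueError("pen position is off the tablet")
    col = x * COLS // TABLET_WIDTH
    row = y * ROWS // TABLET_HEIGHT
    return row * COLS + col


def interpret(x: int, y: int) -> Tuple[str, int]:
    """Classify a pen position as a structure, radical, or control command."""
    idx = cell_index(x, y)
    if idx < NUM_STRUCTURES:
        return ("structure", idx)
    idx -= NUM_STRUCTURES
    if idx < NUM_RADICALS:
        return ("radical", idx)
    return ("command", idx - NUM_RADICALS)


print(interpret(0, 0))      # top-left cell
print(interpret(640, 0))    # first cell after the structures
print(interpret(999, 999))  # bottom-right cell
```

With this layout, the first 16 cells select a structure, the next 500 select a radical, and the remaining 9 cells are left for control commands.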
It is estimated that a subset of 200 Kanji characters (which can be represented as radicals) accounts for 60% of all Kanji character usage, and that an average of 3.5 tablet operations (2.5 radicals + 1 structure) is required for each character in the remaining 40%.
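The estimates above can be combined into an average number of tablet operations per character. The short Python calculation below does this under one added assumption that the source does not state explicitly: each of the 200 frequent characters is entered with a single tablet operation.

```python
# Figures taken from the text.
FREQUENT_SHARE = 0.60   # share of usage covered by the 200 frequent characters
AVG_RADICALS = 2.5      # average radicals per remaining character
STRUCTURE_OPS = 1       # one structure selection per remaining character

# Assumption (not stated in the source): one operation per frequent character.
OPS_PER_FREQUENT = 1


def composed_ops() -> float:
    """Average operations for a character built from radicals and a structure."""
    return AVG_RADICALS + STRUCTURE_OPS


def expected_ops_per_character() -> float:
    """Usage-weighted average of operations over both groups of characters."""
    rare_share = 1.0 - FREQUENT_SHARE
    return FREQUENT_SHARE * OPS_PER_FREQUENT + rare_share * composed_ops()


print(composed_ops())
print(round(expected_ops_per_character(), 2))
```

Under that assumption the weighted average comes to about two tablet operations per character (0.6 x 1 + 0.4 x 3.5).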
Existing Kanji typewriters...