Conventions for Encoding the Vietnamese Language VISCII: VIetnamese Standard Code for Information Interchange VIQR: VIetnamese Quoted-Readable Specification (RFC1456)
Original Publication Date: 1993-May-01
Included in the Prior Art Database: 2019-Feb-10
Internet Society Requests For Comment (RFCs)
Network Working Group                Vietnamese Standardization Working Group
Request for Comments: 1456                                           May 1993

Conventions for Encoding the Vietnamese Language

VISCII: VIetnamese Standard Code for Information Interchange
VIQR: VIetnamese Quoted-Readable Specification

Revision 1.1
Status of this Memo
This memo provides information for the Internet community. It does not specify an Internet standard. Distribution of this memo is unlimited.
This document provides information to the Internet community on the currently used conventions for encoding Vietnamese characters into 7-bit US ASCII and in an 8-bit form. These conventions are widely used by the overseas Vietnamese who are on the Internet and are active in USENET. This document only provides information and specifies no level of standard.
In this paper we describe two conventions for representing Vietnamese characters. VISCII (pronounced "visky") is an 8-bit character encoding that is similar to that used with ISO-8859. VIQR (pronounced "vicker") is a mnemonic encoding of Vietnamese characters into US ASCII for use on 7-bit systems. A substantial body of freely distributable software implementing these conventions already exists online for UNIX and personal computers. These encodings enable Vietnamese-language users to take full advantage of powerful tools already developed for the English-speaking world, eliminating unnecessary reinvention. This paper describes these conventions in part so that MIME-compliant software might also support the Vietnamese language.
NOTE: The accented Vietnamese letters are herein represented by their VIQR equivalents, set off by enclosing angle brackets. For example, the single letter "a acute" is written as <a'>, where the apostrophe is the mnemonic symbol for the acute.
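The notation in the note above can be illustrated with a minimal Python sketch. It is not part of the memo: the function name viqr_letter_to_unicode and the table name VIQR_MARKS are hypothetical, and the table covers only the two mnemonics that appear in this section (the apostrophe for the acute, and the caret used in <e^>). The sketch assumes that a VIQR letter is a base letter followed by its mnemonic marks, and it uses Unicode combining marks purely as a convenient way to display the result.

```python
import unicodedata

# Hypothetical, deliberately partial table: only the two mnemonics that
# appear in this section are mapped to Unicode combining marks.
VIQR_MARKS = {
    "'": "\u0301",  # apostrophe -> combining acute accent
    "^": "\u0302",  # caret -> combining circumflex accent
}

def viqr_letter_to_unicode(token: str) -> str:
    """Convert one VIQR letter (without the angle brackets) to Unicode.

    Assumes the token is a base letter followed by zero or more mnemonic
    marks. Raises KeyError for a mark that is not in VIQR_MARKS.
    """
    base, marks = token[0], token[1:]
    combined = base + "".join(VIQR_MARKS[mark] for mark in marks)
    # NFC folds the base letter and combining marks into one code point
    # where a precomposed form exists.
    return unicodedata.normalize("NFC", combined)

if __name__ == "__main__":
    for token in ("a'", "e^", "e'"):
        print("<" + token + ">", "->", viqr_letter_to_unicode(token))
```

A complete converter would also need the remaining mnemonics and the quoting rules of the VIQR specification, which this sketch leaves out.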
2. LINGUISTIC OVERVIEW
As a romanized language, Vietnamese appears to lend itself readily to integration into existing English-based systems. To cite a simple example, consider implementing support for French in such systems. One can allocate code positions in the 8-bit space necessary for accented letters such as <e^> or <e'>, then provide a means for users to access these codes through the keyboard. The required number of "extra" code positions is small (see, e.g., ISO-8859/Latin-1), and the relatively low frequency of occurrence of accented letters does not place heavy demand on efficient keyboard input schemes. The same cannot be said for Vietnamese, where both the number and the frequency of occurrence of accented letters are large. Apart from the alphabetics already available in ASCII, Vietnamese requires an additional 134 combinations of a letter and diacritical symbols.
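One way to see where the figure of 134 comes from is to enumerate the letters. The Python sketch below is not taken from the memo: the breakdown it assumes (twelve vowel forms, five tone marks plus the unmarked form, two cases, and the letter d with stroke) and the names BASE_VOWELS, TONE_MARKS, and vietnamese_non_ascii_letters are illustrative assumptions. Unicode precomposed characters are used here only as a counting device.

```python
import unicodedata

# Assumed inventory: the twelve Vietnamese vowel forms, lowercase.
BASE_VOWELS = [
    "a", "\u0103", "\u00e2",  # a, a breve, a circumflex
    "e", "\u00ea",            # e, e circumflex
    "i",
    "o", "\u00f4", "\u01a1",  # o, o circumflex, o horn
    "u", "\u01b0",            # u, u horn
    "y",
]

# No tone mark, then acute, grave, hook above, tilde, dot below.
TONE_MARKS = ["", "\u0301", "\u0300", "\u0309", "\u0303", "\u0323"]

def vietnamese_non_ascii_letters() -> set:
    """Return every Vietnamese letter that plain ASCII cannot represent."""
    letters = {"\u0111", "\u0110"}  # d with stroke, lowercase and uppercase
    for vowel in BASE_VOWELS:
        for tone in TONE_MARKS:
            for cased in (vowel, vowel.upper()):
                letter = unicodedata.normalize("NFC", cased + tone)
                if not letter.isascii():
                    letters.add(letter)
    return letters

if __name__ == "__main__":
    needed = len(vietnamese_non_ascii_letters())
    upper_half = 256 - 128  # positions an 8-bit code offers beyond 7-bit ASCII
    print(needed, "letters needed;", upper_half, "upper-half positions")
```

Under these assumptions the count matches the 134 stated above, which is more than the 128 positions an 8-bit code adds beyond 7-bit ASCII.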
Note that one can resort to a composite encoding scheme to reduce this requirement, but that would mean giving up on integration into today’s computing platforms which for the most part do not support such...