
Japanese Character Encoding for Internet Messages (RFC1468)

IP.com Disclosure Number: IPCOM000002297D
Original Publication Date: 1993-Jun-01
Included in the Prior Art Database: 2000-Sep-12
Document File: 5 page(s) / 10K

Publishing Venue

Internet Society Requests For Comment (RFCs)

Related People

J. Murai: AUTHOR [+3]

Abstract

This document describes the encoding used in electronic mail [RFC822] and network news [RFC1036] messages in several Japanese networks. It was first specified by and used in JUNET [JUNET]. The encoding is now also widely used in Japanese IP communities.

This text was extracted from an ASCII document.
This is the abbreviated version, containing approximately 27% of the total text.

Network Working Group                                           J. Murai
Request for Comments: 1468                               Keio University
                                                              M. Crispin
                                                       Panda Programming
                                                         E. van der Poel
                                                               June 1993

Japanese Character Encoding for Internet Messages

Status of this Memo

This memo provides information for the Internet community. It does
not specify an Internet standard. Distribution of this memo is
unlimited.

Introduction

This document describes the encoding used in electronic mail [RFC822]
and network news [RFC1036] messages in several Japanese networks. It
was first specified by and used in JUNET [JUNET]. The encoding is now
also widely used in Japanese IP communities.

The name given to this encoding is "ISO-2022-JP", which is intended
to be used in the "charset" parameter field of MIME headers (see
[MIME1] and [MIME2]).
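
As an illustration of the "charset" parameter, the following minimal sketch uses Python's standard email package to build a message labeled ISO-2022-JP. The sample body text and the choice of library are assumptions made for illustration; the memo itself does not prescribe any implementation.

```python
from email.message import EmailMessage

msg = EmailMessage()
# Hypothetical sample body, chosen only for illustration.
msg.set_content("日本語", charset="iso-2022-jp")

content_type = msg["Content-Type"]                    # carries the charset parameter
charset = msg.get_content_charset()                   # charset value, lower-cased
transfer_encoding = msg["Content-Transfer-Encoding"]  # chosen by the library
```

Because every byte of the encoded body is a 7-bit value, the library can send it without a further transfer encoding.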

Description

The text starts in ASCII [ASCII], and switches to Japanese characters
through an escape sequence. For example, the escape sequence ESC $ B
(three bytes, hexadecimal values: 1B 24 42) indicates that the bytes
following this escape sequence are Japanese characters, which are
encoded in two bytes each. To switch back to ASCII, the escape
sequence ESC ( B is used.

The following table gives the escape sequences and the character sets
used in ISO-2022-JP messages. The ISOREG number is the registration
number in ISO's registry [ISOREG].

Esc Seq    Character Set                  ISOREG

ESC ( B    ASCII                             6
ESC ( J    JIS X 0201-1976 ("Roman" set)    14
ESC $ @    JIS X 0208-1978                  42
ESC $ B    JIS X 0208-1983                  87
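
The table lends itself to a lookup structure. The sketch below, with hypothetical names (ESCAPES, split_segments), splits a byte string into runs labeled by the character set in effect. It assumes well-formed input and rejects any escape sequence not listed in the table.

```python
ESC = b"\x1b"

# The two bytes following ESC -> (character set, bytes per character)
ESCAPES = {
    b"(B": ("ASCII", 1),
    b"(J": ('JIS X 0201-1976 ("Roman" set)', 1),
    b"$@": ("JIS X 0208-1978", 2),
    b"$B": ("JIS X 0208-1983", 2),
}

def split_segments(data: bytes):
    """Split ISO-2022-JP bytes into (character set name, raw bytes) runs."""
    segments = []
    charset = "ASCII"  # the text starts in ASCII
    run = bytearray()
    i = 0
    while i < len(data):
        if data[i:i + 1] == ESC:
            seq = data[i + 1:i + 3]
            if seq not in ESCAPES:
                raise ValueError(f"unsupported escape sequence at offset {i}")
            if run:  # close the run that was in progress
                segments.append((charset, bytes(run)))
                run = bytearray()
            charset = ESCAPES[seq][0]
            i += 3
        else:
            run.append(data[i])
            i += 1
    if run:
        segments.append((charset, bytes(run)))
    return segments
```

For example, split_segments(b"abc\x1b$BF|K\\8l\x1b(Bdef") yields an ASCII run, a JIS X 0208-1983 run, and a final ASCII run.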

Note that JIS X 0208 was called JIS C 6226 until the name was changed
on March 1st, 1987. Likewise, JIS C 6220 was renamed JIS X 0201.

The "Roman" character set of JIS X 0201 [JISX0201] is identical to
ASCII except for backslash (\) and tilde (~). The backslash is
replaced by the Yen sign, and the tilde is replaced by overline. This
set is Japan's national variant of ISO 646 [ISO646].
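
The two differences from ASCII can be expressed as a small translation step. This is a sketch under stated assumptions: the function name is hypothetical, and the mapping to the Unicode code points U+00A5 (YEN SIGN) and U+203E (OVERLINE) is an illustrative choice that this memo does not specify.

```python
# Byte positions that differ from ASCII in the JIS X 0201 "Roman" set.
# Assumption: the Unicode targets below are chosen for illustration.
ROMAN_DIFFERENCES = {
    0x5C: 0x00A5,  # backslash position -> Yen sign
    0x7E: 0x203E,  # tilde position -> overline
}

def decode_jis_roman(data: bytes) -> str:
    """Decode bytes of a JIS X 0201 "Roman" run into a Python string."""
    # All other positions are identical to ASCII.
    return data.decode("ascii").translate(ROMAN_DIFFERENCES)
```

A run such as b"100\\" would therefore read as "100" followed by the Yen sign rather than a backslash.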

The JIS X 0208 [JISX0208] character sets consist of Kanji, Hiragana,
Katakana and some other symbols and characters. Each character takes
up two bytes.
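
Since each character takes up two bytes, a run of JIS X 0208 data can be split into byte pairs. The sketch below uses a hypothetical function name and additionally assumes that both bytes of a pair lie in the 7-bit graphic range 0x21 to 0x7E; that range comes from the general structure of such character sets, not from the text above.

```python
def split_two_byte_chars(run: bytes):
    """Split a JIS X 0208 run into (first byte, second byte) pairs."""
    if len(run) % 2 != 0:
        raise ValueError("a JIS X 0208 run must have an even number of bytes")
    pairs = []
    for i in range(0, len(run), 2):
        first, second = run[i], run[i + 1]
        # Assumption: both bytes fall in the graphic range 0x21..0x7E.
        if not (0x21 <= first <= 0x7E and 0x21 <= second <= 0x7E):
            raise ValueError(f"byte pair at offset {i} is outside 0x21..0x7E")
        pairs.append((first, second))
    return pairs
```

A six-byte run therefore yields three pairs, one per character.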

For further details about the JIS Japanese national character set
standards, refer to [JISX0201] and [JISX0208]. For further
information about the escap...