
Using Unicode with MIME (RFC1641)

IP.com Disclosure Number: IPCOM000002477D
Original Publication Date: 1994-Jul-01
Included in the Prior Art Database: 2001-Nov-12
Document File: 7 page(s) / 11K

Publishing Venue

Internet Society Requests For Comment (RFCs)

Related People

D. Goldsmith: AUTHOR [+2]

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 32% of the total text.

Network Working Group                                       D. Goldsmith
Request for Comments: 1641                                      M. Davis
Category: Experimental                                    Taligent, Inc.
                                                               July 1994

                        Using Unicode with MIME

Status of this Memo

   This memo defines an Experimental Protocol for the Internet
   community.  This memo does not specify an Internet standard of any
   kind.  Distribution of this memo is unlimited.

Abstract

   The Unicode Standard, version 1.1, and ISO/IEC 10646-1:1993(E)
   jointly define a 16 bit character set (hereafter referred to as
   Unicode) which encompasses most of the world's writing systems.
   However, Internet mail (STD 11, RFC 822) currently supports only
   7-bit US ASCII as a character set. MIME (RFC 1521 and RFC 1522)
   extends Internet mail to support different media types and character
   sets, and thus could support Unicode in mail messages. MIME neither
   defines Unicode as a permitted character set nor specifies how it
   would be encoded, although it does provide for the registration of
   additional character sets over time.

   This document specifies the usage of Unicode within MIME.

Motivation

   Since Unicode is starting to see widespread commercial adoption,
   users will want a way to transmit information in this character set
   in mail messages and other Internet media. Since MIME was expressly
   designed to allow such extensions and is on the standards track for
   the Internet, it is the most appropriate means for encoding Unicode.
   RFC 1521 and RFC 1522 do not define Unicode as an allowed character
   set, but allow registration of additional character sets.

   In addition to allowing use of Unicode within MIME bodies, another
   goal is to specify a way of using Unicode that allows text which
   consists largely, but not entirely, of US-ASCII characters to be
   represented in a way that can be read by mail clients that do not
   understand Unicode. This is in keeping with the philosophy of MIME.
   Such an encoding is described in another document, "UTF-7: A Mail
   Safe Transformation Format of Unicode" [UTF-7].

Overview

   Several ways of using Unicode are possible. This document specifies
   both guidelines for use of Unicode within MIME, and a specific usage.
   The usage specified in this document is a straightforward use of
   Unicode as specified in "The Unicode Standard, Version 1.1".

   This encoding is intended for situations where sender and recipient
   do not want to do a lot of processing, when the text does not consist
   primarily of characters from the US-ASCII character set, or when
   sender and receiver are known in advance to support Unicode.
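
   As a rough illustration of this "straightforward" usage, the sketch
   below turns text into 16-bit code units and wraps them for 7-bit
   mail transport. Big-endian byte order and Base64 as the transfer
   encoding are assumptions made for this example; the excerpt above
   does not state them. The function names are hypothetical.

```python
import base64


def encode_unicode_body(text: str) -> str:
    """Encode text as 16-bit code units, then Base64 (illustrative only)."""
    # Unicode 1.1 is a 16-bit character set, so reject anything above U+FFFF.
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("character outside the 16-bit range")
    # Assumption: big-endian byte order, two bytes per character.
    raw = text.encode("utf-16-be")
    # Assumption: Base64 makes the binary data safe for 7-bit mail.
    return base64.b64encode(raw).decode("ascii")


def decode_unicode_body(payload: str) -> str:
    """Reverse encode_unicode_body."""
    return base64.b64decode(payload).decode("utf-16-be")


body = encode_unicode_body("Привет, мир")
print(body)
print(decode_unicode_body(body))
```

   Every character costs two bytes before Base64 expansion, which is
   why this form suits text that is not primarily US-ASCII; the encoded
   body is unreadable to a recipient without Unicode support.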

   Another encoding is intended for situations where the text consists
   primarily of US-ASCII, with occasional characters from other parts of
   Unicode. This encoding allows the US-ASCII portion to be read by all
   recipients without having to support Unicode. This encoding is
   specified in another document, "UTF-7: A Mail Safe Transformation
   Format of Unicode" [UTF-7].
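
   The property described above can be observed with a minimal sketch
   that uses Python's built-in "utf-7" codec. That codec follows a
   later revision of UTF-7 and is used here only as a stand-in; the
   sample string and the helper name are hypothetical.

```python
def to_utf7(text: str) -> bytes:
    """Encode text with Python's built-in UTF-7 codec (illustrative only)."""
    return text.encode("utf-7")


# Mostly US-ASCII text with one character from elsewhere in Unicode.
sample = "Price: 100 \u20ac (approx.)"
encoded = to_utf7(sample)

# The US-ASCII portions pass through unchanged, so a reader without
# Unicode support can still make sense of most of the message.
print(encoded)

# The result contains only 7-bit bytes and decodes back to the original.
print(all(b < 0x80 for b in encoded))
print(encoded.decode("utf-7") == sample)
```

   Only the non-ASCII character is replaced by a short escaped
   sequence; the surrounding US-ASCII text stays legible as-is.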

   Finally, in keeping with the principles set forth in RFC 1521, text
   which ca...