IETF Policy on Character Sets and Languages (RFC2277)

IP.com Disclosure Number: IPCOM000002838D
Original Publication Date: 1998-Jan-01
Included in the Prior Art Database: 2000-Sep-13
Document File: 7 page(s) / 15K

Publishing Venue

Internet Society Requests For Comment (RFCs)

Related People

H. Alvestrand: AUTHOR

Abstract

The Internet is international.

This text was extracted from an ASCII text document.
This is the abbreviated version, containing approximately 19% of the total text.

Network Working Group                                      H. Alvestrand
Request for Comments: 2277                                       UNINETT
BCP: 18                                                     January 1998
Category: Best Current Practice

IETF Policy on Character Sets and Languages

Status of this Memo

This document specifies an Internet Best Current Practices for the
Internet Community, and requests discussion and suggestions for
improvements. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (1998). All Rights Reserved.

1. Introduction

The Internet is international.

With the international Internet follows an absolute requirement to
interchange data in a multiplicity of languages, which in turn
utilize a bewildering number of characters.

This document describes the current policies being applied by the
Internet Engineering Steering Group (IESG) towards the standardization
efforts in the Internet Engineering Task Force (IETF) in order to help
Internet protocols fulfill these requirements.

The document is very much based upon the recommendations of the IAB
Character Set Workshop of February 29-March 1, 1996, which is
documented in RFC 2130 [WR]. This document attempts to be concise,
explicit and clear; people wanting more background are encouraged to
read RFC 2130.

The document uses the terms 'MUST', 'SHOULD' and 'MAY', and their
negatives, in the way described in [RFC 2119]. In this case, 'the
specification' as used by RFC 2119 refers to the processing of
protocols being submitted to the IETF standards process.

2. Where to do internationalization

Internationalization is for humans. This means that protocols are not
subject to internationalization; text strings are. Where protocol
elements look like text tokens, such as in many IETF application
layer protocols, protocols MUST specify which parts are protocol and
which are text. [WR 2.2.1.1]
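
As an illustration of this separation, the following is a minimal
Python sketch that splits a reply line of a hypothetical SMTP-like
protocol into a protocol element (a three-digit status code) and text
intended for humans. The line format, the names Reply and split_reply,
and the assumption that the text part is UTF-8 are illustrative only
and are not taken from this document.

```python
from typing import NamedTuple


class Reply(NamedTuple):
    code: int  # protocol element: processed by machines, never translated
    text: str  # text for humans: the part subject to internationalization


def split_reply(line: bytes) -> Reply:
    """Split b'<3 digits> <text>' into its protocol part and its text part."""
    code_part, sep, text_part = line.rstrip(b"\r\n").partition(b" ")
    if not sep or len(code_part) != 3 or not code_part.isdigit():
        raise ValueError("malformed reply line")
    # Assumption: the hypothetical protocol declares its text to be UTF-8.
    return Reply(int(code_part), text_part.decode("utf-8"))


reply = split_reply("250 Forespørsel fullført\r\n".encode("utf-8"))
```

Here the status code is parsed strictly and independently of any
language, while only the trailing text is decoded as character data.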

Names are a problem, because people feel strongly about them, many of
them are mostly for local usage, and all of them tend to leak out of
the local context at times. RFC 1958 [RFC 1958] recommends US-ASCII
for all globally visible names.

This document does not mandate a policy on name internationalization,
but requires that all protocols describe whether names are
internationalized or US-ASCII.
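
A protocol that declares its names to be US-ASCII can have that rule
checked mechanically. The following is a minimal Python sketch under
that assumption; the function name is_us_ascii_name is hypothetical,
and the check covers only the character range, not which US-ASCII
characters a particular protocol allows in names.

```python
def is_us_ascii_name(name: str) -> bool:
    """Return True if every character of name is in the US-ASCII range (0-127)."""
    return all(ord(ch) < 128 for ch in name)


ascii_ok = is_us_ascii_name("example.org")
non_ascii_ok = is_us_ascii_name("blåbær.example")
```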

NOTE: In the protocol stack for any given application, there is
usually one or a few layers that need to address these problems.

It would, for instance, not be appropriate to define language tags
for Ethernet frames. But it is the responsibility of the WGs to
ensure that whenever responsibility for internationalization is left
to "another layer", those responsible for that layer are in fact
aware that they HAVE that responsibility.

3. Definition of Terms

This document uses the term "charset" to mean a set of rules for
mapping from a sequence of octets to a sequence of characters, suc...