Tags for the Identification of Languages (RFC1766) Disclosure Number: IPCOM000004017D
Original Publication Date: 1995-Mar-01
Included in the Prior Art Database: 2000-Sep-12
Internet Society Requests For Comment (RFCs)

H. Alvestrand: AUTHOR


This document describes a language tag for use in cases where it is desired to indicate the language used in an information object.

This text was extracted from an ASCII document.
This is the abbreviated version, containing approximately 17% of the total text.

Network Working Group H. Alvestrand

Request for Comments: 1766 UNINETT

Category: Standards Track March 1995

Tags for the Identification of Languages

Status of this Memo

This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.


This document describes a language tag for use in cases where it is
desired to indicate the language used in an information object.

It also defines a Content-Language: header, for use in the case where
one desires to indicate the language of something that has
RFC-822-like headers, like MIME body parts or Web documents, and a new
parameter to the Multipart/Alternative type, to aid in the usage of
the Content-Language: header.

1. Introduction

There are a number of languages spoken by human beings in this world.
A great number of these people would prefer to have information
presented in a language that they understand.

In some contexts, it is possible to have information in more than one
language, or it might be possible to provide tools for assisting in
the understanding of a language (like dictionaries).

A prerequisite for any such function is a means of labelling the
information content with an identifier for the language in which it
is written.

In the tradition of solving only problems that we think we
understand, this document specifies an identifier mechanism, and one
possible use for it.

2. The Language tag

The language tag is composed of 1 or more parts: A primary language
tag and a (possibly empty) series of subtags.

The syntax of this tag in RFC-822 EBNF is:

Language-Tag = Primary-tag *( "-" Subtag )

Primary-tag = 1*8ALPHA

Subtag = 1*8ALPHA

Whitespace is not allowed within the tag.
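
As an illustration only, the grammar above can be checked with a
regular expression. The sketch below is in Python; the names
parse_language_tag and _TAG_RE are hypothetical and are not defined by
this document. It assumes ALPHA means the ASCII letters A-Z and a-z,
and it checks syntax only, not whether a tag is registered.

```python
import re

# Language-Tag = Primary-tag *( "-" Subtag )
# Primary-tag and Subtag are each 1*8ALPHA (assumed: ASCII letters only).
_TAG_RE = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*")


def parse_language_tag(tag):
    """Split a tag into [primary, subtag, ...]; raise ValueError if malformed."""
    # fullmatch rejects anything outside the grammar, including whitespace,
    # empty parts, and parts longer than eight letters.
    if not _TAG_RE.fullmatch(tag):
        raise ValueError("not a valid language tag: %r" % (tag,))
    return tag.split("-")
```

For example, parse_language_tag("en-US") yields the primary tag "en"
and one subtag "US", while "en us" and "en-" are rejected.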

All tags are to be treated as case insensitive; there exist
conventions for capitalization of some of them, but these should not
be taken to carry meaning.
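
One possible way to honour the case-insensitivity rule is to fold both
tags to a single case before comparing them. The Python sketch below
uses a hypothetical helper, same_language_tag, which is not part of
this document, and assumes both inputs are already syntactically valid
tags made of ASCII letters and hyphens.

```python
def same_language_tag(a, b):
    """Return True if two language tags are equal, ignoring letter case."""
    # Tags contain only ASCII letters and "-", so lower-casing is a safe
    # normalization; capitalization conventions carry no meaning.
    return a.lower() == b.lower()
```

Under this sketch, "en-US", "EN-us" and "en-us" all compare equal,
while "en" and "en-US" do not.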

The namespace of language tags is administered by the IANA according
to the rules in section 5 of this document.

The following registrations are predefined:

In the primary language tag:

- All 2-letter tags are interpreted according to ISO standard

639, "Code for the representation of names of languages" [ISO