Internationalization of the Hypertext Markup Language (RFC2070)

IP.com Disclosure Number: IPCOM000002622D
Original Publication Date: 1997-Jan-01
Included in the Prior Art Database: 2000-Sep-13
Document File: 36 page(s) / 85K

Publishing Venue

Internet Society Requests For Comment (RFCs)

Related People

F. Yergeau: AUTHOR

This text was extracted from an ASCII document.
This is the abbreviated version, containing approximately 3% of the total text.

Network Working Group                                         F. Yergeau
Request for Comments: 2070                             Alis Technologies
Category: Standards Track                                       G. Nicol
                                            Electronic Book Technologies
                                                                G. Adams
                                                                Spyglass
                                                               M. Duerst
                                                    University of Zurich
                                                            January 1997

Internationalization of the Hypertext Markup Language

Status of this Memo

This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.

Abstract

The Hypertext Markup Language (HTML) is a markup language used to
create hypertext documents that are platform independent. Initially,
the application of HTML on the World Wide Web was seriously
restricted by its reliance on the ISO-8859-1 coded character set,
which is appropriate only for Western European languages. Despite
this restriction, HTML has been widely used with other languages,
using other coded character sets or character encodings, at the
expense of interoperability.

This document is meant to address the issue of the
internationalization (i18n, i followed by 18 letters followed by n)
of HTML by extending the specification of HTML and giving additional
recommendations for proper internationalization support. A foremost
consideration is to make sure that HTML remains a valid application
of SGML, while enabling its use with all languages of the world.

Table of Contents

1. Introduction
  1.1. Scope
  1.2. Conformance
2. The document character set
  2.1. Reference processing model
  2.2. The document character set
  2.3. Undisplayable characters
3. The LANG attribute
4. Additional entities, attributes and elements
  4.1. Full Latin-1 entity set ....