A Generalized Unified Character Code: Western European and CJK Sections (RFC5242)

IP.com Disclosure Number: IPCOM000168882D
Original Publication Date: 2008-Apr-01
Included in the Prior Art Database: 2008-Apr-02
Document File: 15 page(s) / 31K

Publishing Venue

Internet Society Requests For Comment (RFCs)

Related People

J. Klensin: AUTHOR [+2]

Abstract

Many issues have been identified with the use of general-purpose character sets for internationalized domain names and similar purposes. This memo describes a fully unified coded character set for scripts based on Latin, Greek, Cyrillic, and Chinese (CJK) characters. It is not a complete specification of that character set.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 8% of the total text.

Network Working Group                                         J. Klensin
Request for Comments: 5242
Category: Informational                                    H. Alvestrand
                                                                  Google
                                                            1 April 2008

 A Generalized Unified Character Code: Western European and CJK Sections

Status of This Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard of any kind.  Distribution of this
   memo is unlimited.

IESG Note

   This is not an IETF document.  Readers should be aware of RFC 4690,
   "Review and Recommendations for Internationalized Domain Names
   (IDNs)", and its references.

   This document is not a candidate for any level of Internet Standard.
   The IETF disclaims any knowledge of the fitness of this document for
   any purpose, and in particular notes that it has not had IETF review
   for such things as security, congestion control, or inappropriate
   interaction with deployed protocols.  The RFC Editor has chosen to
   publish this document at its discretion.  Readers of this document
   should exercise caution in evaluating its value for implementation
   and deployment.

Klensin & Alvestrand         Informational                      [Page 1]
 RFC 5242                      Unified CCS                     April 2008

 Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.1.  Terminology  . . . . . . . . . . . . . . . . . . . . . . .  3
     1.2.  Discussion . . . . . . . . . . . . . . . . . . . . . . . .  4
   2.  Types of Characters  . . . . . . . . . . . . . . . . . . . . .  4
     2.1.  Base Character . . . . . . . . . . . . . . . . . . . . . .  4
     2.2.  Nonspacing Marks . . . . . . . . . . . . . . . . . . . . .  4
     2.3.  Case Indicators  . . . . . . . . . . . . . . . . . . . . .  4
     2.4.  Joining Indicators . . . . . . . . . . . . . . . . . . . .  5
     2.5.  Character-Matrix Positioning Indicators  . . . . . . . . .  5
     2.6.  Position Shaping Controls  . . . . . . . . . . . . . . . .  6
     2.7.  Repetition In...