Method to Make Double-Byte and Single-Byte Identifiers URL Addressable
Original Publication Date: 2004-Oct-04
Included in the Prior Art Database: 2004-Oct-04
The HTTP 1.1 Specification (RFC 2616) permits only a subset of ASCII characters as valid content in a URL. This restriction becomes a problem when a URI containing double-byte character content (Japanese characters, Chinese characters, etc.) needs to be addressable. For instance, suppose a web-based document management system has two files: english.html and japanese.html (assume the characters in 'japanese.html' are actually Japanese characters). Accessing the files using the URLs http://documentmanagementsystem.com/english.html and http://documentmanagementsystem.com/japanese.html would violate the HTTP 1.1 spec, because the Japanese characters in the second URL are not within the permitted ASCII subset. This article proposes a method that replaces the path and filename part of the URL with a GUID identifier in such a way that any embedded links (e.g. relative links) resolved by the client program (browser) from within the document still result in a correct and manageable URL to the linked documents.
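The core idea of addressing a document by GUID rather than by its native name can be illustrated with a minimal Python sketch. It is not the article's implementation: the GuidRegistry class, its method names, the in-memory dictionaries, and the sample filename 日本語.html are all assumptions made for illustration, and the handling of relative links is not shown.

```python
import uuid


class GuidRegistry:
    """Hypothetical in-memory map between document paths and GUIDs."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")
        self._guid_by_path = {}
        self._path_by_guid = {}

    def register(self, path):
        """Return the GUID for a path, creating one on first use."""
        if path not in self._guid_by_path:
            guid = str(uuid.uuid4())  # hex digits and hyphens only
            self._guid_by_path[path] = guid
            self._path_by_guid[guid] = path
        return self._guid_by_path[path]

    def url_for(self, path):
        """Build a URL whose path component is the document's GUID."""
        return f"{self.base_url}/{self.register(path)}"

    def resolve(self, url):
        """Map a GUID-based URL back to the stored document path."""
        guid = url.rsplit("/", 1)[-1]
        return self._path_by_guid[guid]


registry = GuidRegistry("http://documentmanagementsystem.com")
# Assumed example filename containing Japanese characters.
japanese_url = registry.url_for("日本語.html")
english_url = registry.url_for("english.html")
```

Because the GUID string contains only hexadecimal digits and hyphens, the resulting URL stays within the permitted character set regardless of the characters in the original filename, while the server can still recover the native path through the lookup.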