Using language markup and destination information to automatically change spellings, code page and markup of documents

IP.com Disclosure Number: IPCOM000035205D
Original Publication Date: 2005-Jan-20
Included in the Prior Art Database: 2005-Jan-20
Document File: 2 page(s) / 44K

Publishing Venue

IBM

Abstract

A method for providing a client user accessing a server, such as a web browser accessing a web server, means for detecting and correcting differences in code page, language and spelling conventions.

This is the abbreviated version, containing approximately 51% of the total text.

Documents written in a particular language may be intended for publication in several countries which have different standards for that language. Specifically, English has both British and American spellings, German has traditional and reformed spellings (and different hyphenation standards), and Chinese has traditional and simplified writing systems. Publication may mean a web page served to browsers throughout the world, or a writer submitting articles to journals or CVs to companies.

    The normal solution is to publish in one variant of the language, to maintain two or more similar versions, or to maintain one version and convert it automatically as required. This last method is the basis for this disclosure. It is straightforward to write a converter for American to British spelling, or vice versa. The problem is deciding when the conversion should be done. Often, the writer is not aware of who might be reading the document, and the reader might not be aware of who has written it. A common example is when a browser visits a foreign-language page and misdisplays the accented characters, because the writer has not included a language tag which the browser can use.
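As a rough illustration of how simple such a converter can be, the following minimal sketch maps American spellings to British ones and back. The word list and the names US_TO_UK, UK_TO_US and convert_spelling are assumptions made for this example and are not part of the disclosure; a practical converter would need a full lexicon.

```python
import re

# Illustrative word list only (an assumption for this sketch);
# a real converter would need a complete lexicon.
US_TO_UK = {
    "color": "colour",
    "center": "centre",
    "organize": "organise",
    "traveling": "travelling",
}
# The reverse table gives British-to-American conversion.
UK_TO_US = {uk: us for us, uk in US_TO_UK.items()}


def convert_spelling(text, mapping):
    """Replace whole words found in `mapping`, keeping the original capitalisation."""

    def swap(match):
        word = match.group(0)
        replacement = mapping.get(word.lower())
        if replacement is None:
            return word  # not in the table: leave untouched
        if word.isupper():
            return replacement.upper()
        if word[0].isupper():
            return replacement.capitalize()
        return replacement

    # Visit every run of ASCII letters and swap it if the table knows it.
    return re.sub(r"[A-Za-z]+", swap, text)
```

For instance, convert_spelling("The Color of the center", US_TO_UK) yields "The Colour of the centre", and passing UK_TO_US reverses the change. The converter itself is the easy part; it says nothing about when it should run, which is the question the disclosure addresses.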

    The transmission channels at various stages of document transferral can use embedded tags and routing information to change the mark-up and content of documents. Examples:
1) A server can detect from a client's IP address what country the client is in, and adjust the spelling of a page's contents, specifically that enclosed in <english> ... </english> or <chinese> ... </chinese>, as appropriate. This will make online news, etc., more accessible.
2) A browser can detect the location of a web page, and change, or advise of a change to, the spelling the user is entering into a search engine. An example from my dissertation (see below) would be where a user types a transliterated Chinese name into a search engine. This name could be transliterated into pinyin, Wade-Giles or Yale, but the user is unlikely to know or care about the distinction. A Wade-Giles transliteration is unlikely to be found in a Singaporean database, for example; a browser might detect the .sg at the end of the URL and suggest that the name be transliterated to pinyin automatically.
3) Mail merge software can look at the address a letter, article or CV is being sent to, and adjust the spelling according to the country the recipient is in.
4) An ISP offering a 'safe surfing' package might detect which country or time zone (from customer supplied data) a browser is in, work out the local time in that country...
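
Examples 1 and 2 above can be sketched as follows. This is only an illustration under stated assumptions: the lookup tables, the function names localise_page and suggested_romanisation, and the idea that the page is authored with American spellings are all hypothetical, and the IP-address-to-country lookup is assumed to have been done elsewhere, so the country code is simply passed in.

```python
import re
from urllib.parse import urlparse

# Hypothetical lookup tables, for illustration only.
US_TO_UK = {"color": "colour", "center": "centre"}
BRITISH_SPELLING_COUNTRIES = {"GB", "IE", "AU", "NZ"}
TLD_ROMANISATION = {"sg": "pinyin"}

# Matches one <english> ... </english> region, including line breaks.
ENGLISH_SPAN = re.compile(r"(<english>)(.*?)(</english>)", re.DOTALL)


def localise_page(page, country_code):
    """Convert spellings only inside <english>...</english> for the reader's country."""
    if country_code not in BRITISH_SPELLING_COUNTRIES:
        return page  # assumption: the page was written with American spellings

    def convert_span(match):
        opening, body, closing = match.groups()
        # Lowercase whole words only, to keep the sketch short.
        body = re.sub(
            r"\b[a-z]+\b",
            lambda m: US_TO_UK.get(m.group(0), m.group(0)),
            body,
        )
        return opening + body + closing

    return ENGLISH_SPAN.sub(convert_span, page)


def suggested_romanisation(url):
    """Return the romanisation suggested by the URL's country-code TLD, or None."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1]
    return TLD_ROMANISATION.get(tld)
```

With these tables, a page containing "<english>the color</english>" is served to a reader in "GB" as "<english>the colour</english>", while text inside <chinese> ... </chinese> is left alone, and a search URL ending in .sg produces the suggestion "pinyin". Keying the conversion on the language tags means that text outside a tagged region is never altered by mistake.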