
An Extensible Approach for Isolating Translatable Resources of a Preexisting XML Documents in Language-specific Sub-documents (Files)
Disclosure Number: IPCOM000012405D
Original Publication Date: 2003-May-05
Included in the Prior Art Database: 2003-May-05
Document File: 2 page(s) / 56K

Publishing Venue



Globalizing software applications often requires translating XML documents (files). Because translatable resources are an inherent part of the XML model definition, translatable resources (text) are almost always mixed with non-translatable resources (logical resources that are constant and not affected by translation) within the XML markup. This is a major translation challenge, because translation tools cannot reliably recognize the translatable material within an XML textual object.

In the standard translation process, the XML files are written in one specific language, usually English, and then sent to translation services centers. The translators reproduce the document in other languages by copying the original and replacing each translatable element with the appropriate translation (a process usually handled by translation tools). Determining which elements to translate has always been problematic for the translator. One solution has been to comment the source heavily to indicate which attributes and text are translatable, but this approach is error-prone and laborious. It is also a maintenance nightmare: each time translators edit a translatable XML file, non-translatable information can get translated, which creates unnecessary functional problems and defects.

For any proposed solution to this problem to be viable, the following considerations must be taken into account:

1. Preexisting XML documents (files) mingle translatable and non-translatable resources (translatable text and logical elements are mixed together in the same file). Isolating the translatable resources in separate XML sub-documents (files) must not require a change to the existing XML structure (that is, no change to the DTD/schema).

2. XML document structure is often dictated by data population and/or other DTD/schema generation tools. Tool-generated structure usually does not conform to the national language XML syntax and style guidelines adopted by almost all translation tool sets. XML is an open SGML language that allows an application-specific tag set or Document Type Definition (DTD). This open SGML architecture requires that XML national language enablement guidelines specify standards for identifying and presenting translatable strings within the XML textual object. One guideline that this standard strongly stresses, and that DTD/schema generation tools do not conform to, is that an XML element's attribute value is considered non-translatable unless the translatable text is defined as an internal entity that is assigned to that attribute. (For example, WCM Massloader's DTD Generator tool defines all translatable resources as XML elements' attribute values.)

This is the abbreviated version, containing approximately 75% of the total text.



Our proposed approach attempts to resolve the isolation problem by carrying out the following main tasks:

1. Use a special key to flag translatable elements/attributes in the preexisting XML documents (files).

2. Extract the translatable resources of the XML document as follows:

A. Recognize the translatable elements/attributes by their special prefix (for example, $$$_).

B. Separate the existing translatable XML document into one main non-translatable intermediary document and one language-specific (translatable) sub-document (file). This step is structure independent (i.e., the isolation process does not depend on the structure of the XML document).

3. Process the generated translation-friendly sub-document separately by sending it to translation services centers.

4. After translation, merge the translated XML sub-document with the intermediary XML document to generate the final translated XML document.
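
Steps 1 and 2 can be sketched in Python as follows. This is a minimal illustration and not the disclosed tool: only the `$$$_` prefix comes from the text above. The `extract` function, the generated keys (`k1`, `k2`, ...), and the `<resources>`/`<string>` layout of the sub-document are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

PREFIX = "$$$_"  # flag taken from the text; all other names are hypothetical


def extract(source_xml, lang="en"):
    """Split flagged XML into (intermediary, sub-document) XML strings."""
    root = ET.fromstring(source_xml)
    # Assumed sub-document layout: <resources lang=".."><string key="..">
    resources = ET.Element("resources", {"lang": lang})
    counter = 0

    def new_key(text):
        """Store the text in the sub-document and return its placeholder."""
        nonlocal counter
        counter += 1
        key = f"k{counter}"
        entry = ET.SubElement(resources, "string", {"key": key})
        entry.text = text
        return PREFIX + key

    # Visit every element, whatever the document structure is.
    for elem in root.iter():
        for name, value in list(elem.attrib.items()):
            if value.startswith(PREFIX):
                elem.set(name, new_key(value[len(PREFIX):]))
        if elem.text and elem.text.strip().startswith(PREFIX):
            elem.text = new_key(elem.text.strip()[len(PREFIX):])

    return (ET.tostring(root, encoding="unicode"),
            ET.tostring(resources, encoding="unicode"))


if __name__ == "__main__":
    flagged = ('<product id="42" name="$$$_Blue Chair">'
               '<desc>$$$_A comfortable chair</desc></product>')
    intermediary, sub_document = extract(flagged)
    print(intermediary)
    print(sub_document)
```

Because the walk uses `root.iter()` and inspects only values, the sketch never consults the DTD/schema, which is one way to keep the step independent of the document structure.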

(In short: flag translatable attributes/text by prefixing them with a special key, "$$$_"; write a tool that extracts all flagged attributes/text into one translatable XML file, leaving one intermediary file (original file + key); and later merge the translated XML file with the intermediary file to get the final translated file.)
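
The merge step can be sketched in the same spirit. The code below is a hedged example under stated assumptions: the intermediary file is assumed to carry placeholders of the form `$$$_<key>`, and the translated sub-document is assumed to be a flat `<resources>` list of `<string key="...">` entries. Neither layout is specified in the text, and the `merge` function name is hypothetical.

```python
import xml.etree.ElementTree as ET

PREFIX = "$$$_"  # flag taken from the text; file layouts below are assumed


def merge(intermediary_xml, translated_xml):
    """Replace each $$$_<key> placeholder with its translated string."""
    translations = {
        entry.get("key"): entry.text or ""
        for entry in ET.fromstring(translated_xml).iter("string")
    }
    root = ET.fromstring(intermediary_xml)

    def resolve(value):
        # Leave non-flagged values and unknown keys untouched.
        if value and value.strip().startswith(PREFIX):
            key = value.strip()[len(PREFIX):]
            return translations.get(key, value)
        return value

    for elem in root.iter():
        for name, value in list(elem.attrib.items()):
            elem.set(name, resolve(value))
        elem.text = resolve(elem.text)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    intermediary = ('<product id="42" name="$$$_k1">'
                    '<desc>$$$_k2</desc></product>')
    french = ('<resources lang="fr">'
              '<string key="k1">Chaise bleue</string>'
              '<string key="k2">Une chaise confortable</string>'
              '</resources>')
    print(merge(intermediary, french))
```

Leaving unknown keys in place, rather than raising an error, is a design choice for the sketch: an untranslated placeholder stays visible in the output and is easy to spot in review.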


XML file1

  Translatable attributes are prefixed wit...