Converting HTML to Well Formed XML With Preference Based Tag Expansion

IP.com Disclosure Number: IPCOM000123939D
Original Publication Date: 1999-Jul-01
Included in the Prior Art Database: 2005-Apr-05
Document File: 3 page(s) / 184K

Publishing Venue

IBM

Related People

Wesley, AA: AUTHOR [+2]

Abstract

The Trans-Proxy architecture is designed to modify web content to accommodate device, browser, and network bandwidth limitations as well as user preferences. The Text Transformation Engine consists of two sub-engines: the Translation Engine, which converts HTML to XML and is the focus of this write-up, and the Transformation Engine, which applies transformation objects to well-formed XML documents to generate a resulting document optimized for the client device, network link, and browser. Both engines coordinate a set of transformation beans (i.e. transforms), which perform the actual document modification. The Translation Engine's transforms convert specific HTML elements to well-formed XML elements. The Transformation Engine's transforms apply the conversions specified by those XML elements.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 34% of the total text.

   The Trans-Proxy architecture is designed to modify web
content to accommodate device, browser, and network bandwidth
limitations as well as user preferences.  The Text Transformation
Engine consists of two sub-engines: the Translation Engine, which
converts HTML to XML and is the focus of this write-up, and the
Transformation Engine, which applies transformation objects to
well-formed XML documents to generate a resulting document optimized
for the client device, network link, and browser.  Both engines
coordinate a set of transformation beans (i.e. transforms), which
perform the actual document modification.  The Translation Engine's
transforms convert specific HTML elements to well-formed XML
elements.  The Transformation Engine's transforms apply the
conversions specified by those XML elements.  Hence, the output of
the Translation Engine is well-formed XML whose elements relay a
"transformation bias" that transformation beans managed by the
Transformation Engine should apply to specific HTML constructs
(e.g. convert a table to an unordered list).  Thus HTML documents
that are not well formed are translated into well-formed documents
in which tags are expanded to relay not only HTML constructs but
transformation directives as well.
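
   The translation step described above can be sketched in Python.
This is only an illustration under stated assumptions, not the
disclosed implementation: the "transform" attribute, the BIAS table,
and the translate() helper are hypothetical names, and the
tag-closing rules cover only the simplest cases (void elements,
an immediately repeated sibling such as <td> after <td>, and
elements left open at a later end tag).

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

VOID = {"br", "hr", "img", "input", "meta", "link"}
# Elements whose end tag is commonly omitted before a same-named sibling.
IMPLIED_END = {"p", "li", "td", "th", "tr", "option"}
# Hypothetical bias table: HTML tag -> directive for a later transform.
BIAS = {"table": "to-unordered-list"}


class Translator(HTMLParser):
    """Emit well-formed XML, expanding tags with a directive attribute."""

    def __init__(self, bias):
        super().__init__(convert_charrefs=True)
        self.bias = bias
        self.out = []    # output fragments
        self.stack = []  # names of currently open elements

    def handle_starttag(self, tag, attrs):
        # Close an immediately preceding sibling, e.g. <td> after <td>.
        if tag in IMPLIED_END and self.stack and self.stack[-1] == tag:
            self.out.append(f"</{self.stack.pop()}>")
        # Valueless attributes (e.g. "checked") get their name as value;
        # the dict also drops duplicate attribute names.
        pairs = {k: (v if v is not None else k) for k, v in attrs}
        if tag in self.bias:
            pairs["transform"] = self.bias[tag]  # assumed attribute name
        text = "".join(f" {k}={quoteattr(v)}" for k, v in pairs.items())
        if tag in VOID:
            self.out.append(f"<{tag}{text}/>")
        else:
            self.out.append(f"<{tag}{text}>")
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag: drop it
        # Close everything opened since the matching start tag.
        while self.stack:
            top = self.stack.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))


def translate(html, bias=BIAS):
    parser = Translator(bias)
    parser.feed(html)
    parser.close()
    while parser.stack:  # close anything still open at end of input
        parser.out.append(f"</{parser.stack.pop()}>")
    return "".join(parser.out)


print(translate("<table><tr><td>a<td>b</table>"))
```

   With the assumed bias table, the unclosed <td> and <tr> elements
are closed and the <table> start tag gains a transform attribute
that a later transform could act on.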

   The translation of HTML to well-formed XML documents with
expanded tags is desirable for several reasons.

   Well-formed XML documents yield a Document Object Model
(i.e. DOM) that any XML parser can build.  Thus the Translation
Engine yields documents that can be parsed by any XML 1.0 compliant
processor.
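
   As a small illustration of this point, the sketch below feeds
two inputs to Python's standard XML DOM parser.  The to_dom()
helper is a hypothetical name used only for this example.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def to_dom(text):
    """Return a DOM for well-formed XML, or None if parsing fails."""
    try:
        return minidom.parseString(text)
    except ExpatError:
        return None


# Typical HTML with omitted </li> tags is rejected by an XML parser.
print(to_dom("<ul><li>one<li>two</ul>"))

# The well-formed equivalent yields a DOM that can be walked.
dom = to_dom("<ul><li>one</li><li>two</li></ul>")
print([li.firstChild.data for li in dom.getElementsByTagName("li")])
```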

   Transformation objects will potentially be developed by
disparate sources (e.g. ISVs).  To support such a model it is
imperative that transformation objects be independent of one
another.  Independent transformations are not possible with
documents that are not well formed, due to the "entangled" nature
of the document elements (i.e. tags).

   The Transformation Engine treats input documents as
transformation scripts: the source document's elements provide
sufficient information for transforms to modify XML tags to comply
with stored preferences.
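
   One way to picture a document acting as a transformation script
is a dispatcher that reads a directive from each element and hands
the element to the matching transform.  The sketch below assumes a
hypothetical "transform" attribute and two made-up directives; each
transform touches only its own element, so the two functions could
be written independently of one another.

```python
import xml.etree.ElementTree as ET


def table_to_list(elem):
    """Rewrite a <table> in place as a <ul> with one <li> per cell."""
    cells = ["".join(td.itertext()).strip() for td in elem.iter("td")]
    tail = elem.tail
    elem.clear()  # drops children, attributes, text, and tail
    elem.tag, elem.tail = "ul", tail
    for text in cells:
        ET.SubElement(elem, "li").text = text


def image_to_alt_text(elem):
    """Replace an <img> with a <span> holding its alt text."""
    alt, tail = elem.get("alt", ""), elem.tail
    elem.clear()
    elem.tag, elem.text, elem.tail = "span", alt, tail


# Hypothetical directive names mapped to independent transforms.
TRANSFORMS = {
    "to-unordered-list": table_to_list,
    "to-alt-text": image_to_alt_text,
}


def run(xml_text, transforms=TRANSFORMS):
    root = ET.fromstring(xml_text)
    # Snapshot the elements first, because transforms mutate the tree.
    for elem in list(root.iter()):
        action = transforms.get(elem.get("transform"))
        if action:
            action(elem)
    return ET.tostring(root, encoding="unicode")


doc = (
    '<body><table transform="to-unordered-list">'
    "<tr><td>a</td><td>b</td></tr></table>"
    '<img transform="to-alt-text" alt="logo"/></body>'
)
print(run(doc))
```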

   The Trans-Proxy utilizes Orion Preferences to "suggest"
transformations that should be invoked to circumvent or remedy
device, browser, and network limitations and to honor user
preferences.  Thus converting HTML to well-formed XML must be
extended to generate dynamic XML tags, which indicate HTML
constructs as well as transform preferences.
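
   A rough sketch of preference-based tag expansion follows.  The
Orion Preference API is not described in this text, so a plain
dictionary stands in for it; the preference keys, the directive
names, and the "transform" attribute are all assumptions made for
illustration only.

```python
from xml.sax.saxutils import quoteattr


def directives_for(prefs):
    """Map a preference dict (a stand-in for stored preferences)
    to a table of {html_tag: directive}."""
    directives = {}
    if not prefs.get("device", {}).get("renders_tables", True):
        directives["table"] = "to-unordered-list"
    if prefs.get("network", {}).get("low_bandwidth", False):
        directives["img"] = "to-alt-text"
    return directives


def expand_tag(tag, attrs, directives):
    """Build a start tag, adding a directive attribute if one applies."""
    attrs = dict(attrs)
    if tag in directives:
        attrs["transform"] = directives[tag]  # assumed attribute name
    text = "".join(f" {k}={quoteattr(v)}" for k, v in attrs.items())
    return f"<{tag}{text}>"


prefs = {"device": {"renders_tables": False},
         "network": {"low_bandwidth": False}}
directives = directives_for(prefs)
print(expand_tag("table", {"border": "1"}, directives))
print(expand_tag("p", {"class": "intro"}, directives))
```

   Here only the table tag is expanded, because only the assumed
device preference calls for a transform; other tags pass through
unchanged.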

   The Trans-Proxy leverages the Orion Preference API to
access preferences.  These preferences fall into one or more of the
following categories:
  Device Preferences
  - Transforms invoked to accommodate device/browser limitations
  Network Preferences
  - Transforms invoked to accommodate limited network bandwidths
 ...