Dynamic interpretation of XML trees

IP.com Disclosure Number: IPCOM000021448D
Original Publication Date: 2004-Jan-19
Included in the Prior Art Database: 2004-Jan-19
Document File: 2 page(s) / 53K

Publishing Venue

IBM

Abstract

Disclosed is a method for dynamically interpreting XML data, based on two principles. First, for each unknown XML namespace or node type, the XML interpreter fetches a piece of code or a library that can process it. Second, the handler of a node gives the interpreter the address of the next node to be interpreted, and so defines the way the interpreter traverses the XML tree.

This is the abbreviated version, containing approximately 52% of the total text.

Disclosed is a method for dynamically interpreting XML data. Today, software that interprets XML data has static knowledge of the XML format it handles: a routine or a portion of code describes what action should be taken for each XML node. This works well when the XML file contains only nodes of a kind that the application knows about. Increasingly, however, XML trees mix many different flavors of XML data (namespaces), and it is often impossible to predict every namespace that will be used. When an application finds an unknown XML node or namespace, it has no choice but to ignore it. The solution described below provides a dynamic processing paradigm for XML, based on two principles. First, for each unknown XML namespace or node type, the XML interpreter fetches a piece of code or a library that can process it. Second, the handler of a node gives the interpreter the address of the next node to be interpreted, and so defines the way the interpreter traverses the XML tree.

We define an "XML shell" with the following recursive behavior:

When given a node to interpret (e.g. <svg xmlns="http://www.w3.org/2000/svg">...</svg>), the shell inspects the namespace of the node and determines the URL associated with that namespace (e.g. http://www.w3.org/2000/svg). It checks in a hash table whether it already has a handler for that node. If not, it retrieves from the namespace URL a dynamic library that handles that namespace (e.g. http://www.w3.org/2000/svg/default.so) and updates its hash table. It then looks within the dynamic library for the symbol named after the node (e.g. "svg") and calls the associated routine, passing it the address of the XML node as an argument. When it finishes, this routine returns the address of the next XML node to be interpreted. The process repeats until the shell is given a NULL address, at which point it stops.
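The loop described above can be sketched in C as follows. This is only an illustration under stated assumptions, not the disclosed implementation: `ShellNode` is a reduced stand-in for a real XML node type, a small linear table keyed by namespace takes the place of the hash table, and `fetch_library` is a hypothetical placeholder for downloading the library and loading it (for instance with `dlopen` and `dlsym` on a POSIX system). The `default.so` file name follows the example in the text. All other names are invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for a real XML node type: only what this sketch needs (assumption). */
typedef struct ShellNode {
    const char *ns;          /* namespace URL of the node */
    const char *name;        /* node name, used as the symbol to look up */
    struct ShellNode *next;  /* hypothetical link followed by the demo handler */
    int visited;
} ShellNode;

typedef ShellNode *(*Handler)(ShellNode *node);
/* A "library" resolves a symbol name to a handler, as dlsym would on a real .so. */
typedef Handler (*Library)(const char *symbol);

typedef struct { const char *ns; Library lib; } CacheEntry;

#define MAX_LIBRARIES 16
CacheEntry cache[MAX_LIBRARIES];  /* linear table standing in for the hash table */
size_t cache_len = 0;
int fetch_count = 0;              /* number of simulated downloads */

/* Build "<namespace>/default.so", the file name used in the text's example. */
int library_url(const char *ns, char *out, size_t out_size) {
    size_t len = strlen(ns);
    const char *sep = (len > 0 && ns[len - 1] == '/') ? "" : "/";
    int n = snprintf(out, out_size, "%s%sdefault.so", ns, sep);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Demo handler: marks the node as processed and returns its successor. */
ShellNode *mark_and_advance(ShellNode *node) {
    node->visited = 1;
    return node->next;
}

/* Demo library: every symbol resolves to the same handler. */
Handler demo_library(const char *symbol) {
    (void)symbol;
    return mark_and_advance;
}

/* Hypothetical stand-in for downloading the library and loading it. */
Library fetch_library(const char *ns) {
    char url[256];
    if (library_url(ns, url, sizeof url) != 0)
        return NULL;
    fetch_count++;  /* a real shell would download `url` and load the file here */
    return demo_library;
}

/* Return the cached library for a namespace, fetching it on first use. */
Library library_for(const char *ns) {
    for (size_t i = 0; i < cache_len; i++)
        if (strcmp(cache[i].ns, ns) == 0)
            return cache[i].lib;          /* already known: no fetch */
    Library lib = fetch_library(ns);
    if (lib != NULL && cache_len < MAX_LIBRARIES)
        cache[cache_len++] = (CacheEntry){ ns, lib };
    return lib;
}

/* Shell loop: resolve, call, follow the returned address, stop on NULL. */
int run_shell(ShellNode *node) {
    int steps = 0;
    while (node != NULL) {
        assert(node->ns != NULL && node->name != NULL);
        Library lib = library_for(node->ns);
        Handler handler = (lib != NULL) ? lib(node->name) : NULL;
        if (handler == NULL)
            break;  /* no handler: this sketch simply stops (assumption) */
        node = handler(node);
        steps++;
    }
    return steps;
}
```

Calling `run_shell` on a chain of nodes that use two namespaces triggers two simulated fetches on the first run and none on a second run, since both libraries are then cached. `run_shell` returns the number of handlers it called.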

The prototype of a node handler is:

xmlNodePtr handler(xmlNodePtr node);

The handler takes a node pointer as an argument and returns the address of the next XML node to be processed.
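Because each handler chooses the next node, the traversal order is a property of the handlers rather than of the shell. The sketch below shows two possible handlers of the same shape as the prototype. It assumes a stand-in `TreeNode` type with `children`, `next` and `parent` links in place of a real XML node type, and both handler names are hypothetical. The first handler walks the tree in document order, the second skips the subtree of the node it was given.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a real XML node type (assumption): first child, next sibling, parent. */
typedef struct TreeNode {
    const char *name;
    struct TreeNode *children;  /* first child */
    struct TreeNode *next;      /* next sibling */
    struct TreeNode *parent;
    int visits;                 /* counts how often the node was processed */
} TreeNode;

/* Climb until some ancestor-or-self has a next sibling; NULL when none is left. */
TreeNode *next_outside_subtree(TreeNode *node) {
    while (node != NULL && node->next == NULL)
        node = node->parent;
    return (node != NULL) ? node->next : NULL;
}

/* Handler that continues in document order: children first, then siblings. */
TreeNode *document_order_handler(TreeNode *node) {
    assert(node != NULL);
    node->visits++;  /* placeholder for the real processing of this node */
    if (node->children != NULL)
        return node->children;
    return next_outside_subtree(node);
}

/* Handler that processes the node itself but never descends into it. */
TreeNode *skip_subtree_handler(TreeNode *node) {
    assert(node != NULL);
    node->visits++;
    return next_outside_subtree(node);
}
```

Returning NULL from the last node is what ends the interpretation, so a handler can also stop the shell early by returning NULL before the tree is exhausted.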

Suppose for instance that we have the following XML tree:

<X xmlns="http://address1/">
<Y xmlns="http://address1/">
<W xmlns="http://address1/" />
<Z xmlns="http://address2/" />
</X>

After parsing the XML tree into memory, the shell would start by checking the first node X with namespace "http://address1/". It connects to the internet and retrieves the dynamic l...