
A high-performance method to insert a DOM tree to another DOM tree

IP.com Disclosure Number: IPCOM000030394D
Original Publication Date: 2004-Aug-10
Included in the Prior Art Database: 2004-Aug-10
Document File: 3 page(s) / 22K

Publishing Venue

IBM

Abstract

Disclosed is a method to merge multiple XML documents into one XML document. The Document Object Model (DOM) defines a programming interface for XML documents. Merging multiple documents with DOM normally relies on the importNode() method, which has a large performance overhead. This article discloses an efficient way to merge XML documents into one DOM tree using an existing SAX parser.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 53% of the total text.


A high-performance method to insert a DOM tree to another DOM tree

Disclosed is a method to merge multiple XML documents into one XML document. The Document Object Model (DOM) [DOM] defines a programming interface for XML documents. Merging multiple documents with DOM normally relies on the importNode() [IMPORT] method, which has a large performance overhead. This article discloses an efficient way to merge XML documents into one DOM tree using an existing SAX parser [SAX].

A straightforward way to merge multiple XML documents into one DOM tree is:
1. Parse each of the documents to obtain their DOM trees, and then either
2-a. Call the importNode() method on the destination Document for each source DOM tree, and insert the results into the destination DOM tree (Figure 1), or
2-b. Call the adoptNode() [ADOPT] method on the destination Document for each source DOM tree, and insert them into the destination DOM tree.

The importNode() method is inefficient because it duplicates the specified DOM tree. The adoptNode() method is more efficient than importNode(), but it is rarely used at present because it was introduced only in the newer DOM Level 3 [DOM3] and it cannot be used across different DOM implementations.
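As a rough illustration of steps 2-a and 2-b, the following Java sketch merges one parsed document into another with each method. The class and helper names (ImportVersusAdopt, parse, mergeByImport, mergeByAdopt) and the sample XML are assumptions made for this example and are not part of the disclosure; only standard JAXP and org.w3c.dom calls are used.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class ImportVersusAdopt {

    // Step 1: parse a document into its own DOM tree (its own owner Document).
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Step 2-a: importNode() deep-copies the source subtree; the source is left intact.
    static Node mergeByImport(Document dest, Document src) {
        Node copy = dest.importNode(src.getDocumentElement(), true);
        return dest.getDocumentElement().appendChild(copy);
    }

    // Step 2-b: adoptNode() moves the subtree and changes its owner without copying.
    // DOM Level 3 lets it return null, e.g. for a node from another implementation.
    static Node mergeByAdopt(Document dest, Document src) {
        Node adopted = dest.adoptNode(src.getDocumentElement());
        if (adopted == null) {
            throw new IllegalStateException("adoptNode() could not adopt the source node");
        }
        return dest.getDocumentElement().appendChild(adopted);
    }

    public static void main(String[] args) throws Exception {
        Document dest = parse("<a:root xmlns:a='urn:a'/>");
        Document src = parse("<b:root xmlns:b='urn:b'><b:child1/></b:root>");
        Node copy = mergeByImport(dest, src);
        System.out.println("importNode() created a new object: " + (copy != src.getDocumentElement()));
        mergeByAdopt(dest, src);
        System.out.println("source root after adoptNode(): " + src.getDocumentElement());
    }
}
```

In this sketch the imported subtree is a second set of node objects, while the adopted subtree is the original set of objects re-parented into the destination.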

Figure 1: Merging documents with importNode()

According to the DOM specification, every node in a DOM tree is owned by a Document node, and a node cannot be inserted into a tree owned by another Document node. importNode() must therefore duplicate the input node, which incurs a large performance overhead because object creation is generally an expensive operation.
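The ownership rule can be observed directly. The Java sketch below, with an assumed class name (OwnerDocumentCheck) and assumed sample documents, tries to append a node from one Document into a tree owned by another; a conforming DOM implementation is expected to reject this with a WRONG_DOCUMENT_ERR DOMException.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class OwnerDocumentCheck {

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Returns true if inserting a node owned by src into dest's tree is rejected
    // because the two trees have different owner Documents.
    static boolean rejectsForeignNode(Document dest, Document src) {
        try {
            dest.getDocumentElement().appendChild(src.getDocumentElement());
            return false;
        } catch (DOMException e) {
            return e.code == DOMException.WRONG_DOCUMENT_ERR;
        }
    }

    public static void main(String[] args) throws Exception {
        Document dest = parse("<a/>");
        Document src = parse("<b/>");
        System.out.println("foreign node rejected: " + rejectsForeignNode(dest, src));
    }
}
```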

[Figure 1, image not reproduced: Document B, whose instance is <b:root xmlns:b=...> <b:child1/> <b:child2/> <b:child3> <b:grandchild/> </b:child3> </b:root>, is parsed by a DOM parser into DOM-B. A.importNode(B) yields a DOM-B owned by A, which is then inserted into DOM-A.]

The main idea of the disclosed method is to use a common owner Document when parsing all of the documents to be merged. Because every node then shares the same owner, node duplication is not required. The basic procedure follows:

1. DOM-parse the destination document to obtain the destination DOM tree.
2. Prepare a SAX ContentHandler that creates a DOM tree from incoming events, using a specified owner Document.
3. Pass the owner Document of the destination DOM tree to the ContentHandler, set the ContentHandler on a SAX parser, and parse a source document with the SAX parser.
4. Insert the DOM tree created by the ContentHandler into the destination DOM tree.
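The four steps above might be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the class names (SharedOwnerMerge, DomBuildingHandler), the choice of a DocumentFragment as the temporary container, and the decision to handle only elements, attributes, and text are all assumptions of this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class SharedOwnerMerge {
    private static final String XMLNS_URI = "http://www.w3.org/2000/xmlns/";

    // Step 2: a hypothetical ContentHandler that builds nodes with a given owner Document.
    static final class DomBuildingHandler extends DefaultHandler {
        private final Document owner;
        private final DocumentFragment fragment;
        private Node current;

        DomBuildingHandler(Document owner) {
            this.owner = owner;
            this.fragment = owner.createDocumentFragment();
            this.current = fragment;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            Element element = owner.createElementNS(uri.isEmpty() ? null : uri, qName);
            for (int i = 0; i < atts.getLength(); i++) {
                String name = atts.getQName(i);
                String attUri = atts.getURI(i);
                // Namespace declarations, if the parser reports them, belong to the xmlns namespace.
                if (name.equals("xmlns") || name.startsWith("xmlns:")) {
                    attUri = XMLNS_URI;
                }
                element.setAttributeNS(attUri.isEmpty() ? null : attUri, name, atts.getValue(i));
            }
            current.appendChild(element);
            current = element;
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            current = current.getParentNode();
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            current.appendChild(owner.createTextNode(new String(ch, start, length)));
        }

        DocumentFragment result() {
            return fragment;
        }
    }

    // Step 1: DOM-parse the destination document.
    static Document parseDom(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Steps 3 and 4: SAX-parse the source with dest as owner, then insert the result.
    static void merge(Document dest, String sourceXml) throws Exception {
        DomBuildingHandler handler = new DomBuildingHandler(dest);
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(sourceXml)));
        dest.getDocumentElement().appendChild(handler.result());
    }

    public static void main(String[] args) throws Exception {
        Document dest = parseDom("<a:root xmlns:a='urn:a'/>");
        merge(dest, "<b:root xmlns:b='urn:b'><b:child1/><b:child3><b:grandchild/></b:child3></b:root>");
        System.out.println("merged children: " + dest.getDocumentElement().getChildNodes().getLength());
    }
}
```

Because every node is created through the destination Document's factory methods, the final appendChild() needs neither importNode() nor adoptNode(). A fuller handler would also cover comments, processing instructions, and CDATA sections.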

Figure 2 shows the procedure.

Figure...