Lazy parser

IP.com Disclosure Number: IPCOM000013002D
Original Publication Date: 2000-Jun-01
Included in the Prior Art Database: 2003-Jun-12
Document File: 1 page(s) / 37K

Publishing Venue

IBM

Abstract

In a flexible message processing system, or with the XML DOM interface, information is passed from a message to an application on an element-by-element basis. To retrieve each element, certain parts of the message may need to be parsed. In standard technology, the entire message is parsed into an internal tree structure on the first request, and the requested element is passed to the application via a programming interface. Subsequent requests can be satisfied directly from the parse tree without further parsing. However, the initial request takes much longer than necessary, since the entire message must be parsed.

This is the abbreviated version, containing approximately 70% of the total text.



     We propose a 'lazy' parser that parses only as much of the message as is necessary to satisfy each request. It holds a tree for the information parsed so far, and uses this tree to satisfy requests where possible. Consider an XML or other string-delimited structure holding elements E1, E2, ..., E10. When a request is made for E3, E3 can only be found by first parsing E1 and E2. A simple string search for '<E3>' is not adequate, as it does not cater for the possibility of a 'lower level' E3 element embedded in an earlier element. After this request the parser has a parse tree holding E1, E2 and E3. The parse is suspended, and an indication of the location in the string where the parse was suspended is held. The parser is now able to return E3.
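The behaviour described above can be illustrated with a minimal Python sketch. It is not the disclosure's implementation: the names `LazyParser`, `Node` and `get` are hypothetical, and the sketch assumes a simplified message format made of attribute-free tags with no comments, CDATA or entities, and no enclosing root element. It keeps a partial tree of the top-level elements parsed so far, plus the offset at which the parse was suspended.

```python
import re

# Assumption: tags carry no attributes, e.g. <E1> ... </E1>.
_TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)>")


class Node:
    """One parsed element: its name, direct text, and child elements."""

    def __init__(self, name):
        self.name = name
        self.text = ""
        self.children = []


class LazyParser:
    """Parses top-level elements only as far as each request requires."""

    def __init__(self, message):
        self.message = message
        self.pos = 0    # offset at which the parse is currently suspended
        self.tree = {}  # top-level elements parsed so far, keyed by name

    def _parse_element(self, pos):
        """Parse one complete element starting at pos; return (node, end)."""
        m = _TAG.match(self.message, pos)
        if m is None or m.group(1):
            raise ValueError(f"expected an opening tag at offset {pos}")
        node = Node(m.group(2))
        pos = m.end()
        while True:
            m = _TAG.search(self.message, pos)
            if m is None:
                raise ValueError(f"unterminated element <{node.name}>")
            node.text += self.message[pos:m.start()]
            if m.group(1):  # closing tag: it must close the current element
                if m.group(2) != node.name:
                    raise ValueError(f"mismatched </{m.group(2)}>")
                return node, m.end()
            # Opening tag: parse the embedded element recursively.
            child, pos = self._parse_element(m.start())
            node.children.append(child)

    def get(self, name):
        """Return the top-level element called name, parsing on demand."""
        while name not in self.tree:
            # Skip whitespace between top-level elements.
            while self.pos < len(self.message) and self.message[self.pos].isspace():
                self.pos += 1
            if self.pos >= len(self.message):
                raise KeyError(name)
            node, self.pos = self._parse_element(self.pos)
            # Assumption: if a top-level name repeats, the first one wins.
            self.tree.setdefault(node.name, node)
        return self.tree[name]


message = "<E1>a</E1><E2><E3>nested</E3></E2><E3>top</E3><E4>d</E4><E5>e</E5>"
parser = LazyParser(message)
e3 = parser.get("E3")  # parses E1, E2 and E3, then suspends before <E4>
e1 = parser.get("E1")  # served from the partial tree, no further parsing
```

In this sample message a plain `message.find("<E3>")` would land on the E3 embedded inside E2, which is why the sketch parses E1 and E2 in full before it can return the top-level E3.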

     A subsequent request for E1 can be immediately satisfied from the tree. A subsequent request for E5 requires tha...