Dynamically Displaying and Mapping XML Documents

IP.com Disclosure Number: IPCOM000019362D
Original Publication Date: 2003-Sep-12
Included in the Prior Art Database: 2003-Sep-12
Document File: 4 page(s) / 73K

Publishing Venue

IBM

Abstract

XML is important for business because it represents an open, standards-based solution to data exchange. Because of its common format, a human-readable grammar notation, it provides a common vehicle for data transfer between different databases, legacy systems, and businesses. As a result, applications are developed that assume hierarchical XML documents are modeled in another form, such as relational, which can then be used to query or extract information from the XML and/or combine it with data from other sources. Thus there is a need for a tool to exchange and search metadata. Plowing through an XML document or schema and trying to reconstruct its hierarchical form is often very tedious, error-prone, and time-consuming. In addition, the XML document itself can model very complex business applications. Dynamically converting the XML format into a hierarchical representation for ease of use, and dynamically mapping it to relational sources with matching elements, requires many difficult manual steps.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 37% of the total text.

The disclosure is a software application that automatically creates a mapping from hierarchical schema objects (e.g., XML Schemas) to a lossless relational schema representation. The disclosure uses this mapping to deduce the SQL statements that allow access to the hierarchical objects (e.g., an XML document) as a relational view. These SQL statements make use of the capabilities of the DB2 XML Wrapper to realize the access to those objects. The disclosure also includes a graphical user interface that 1) shows a representation of the input hierarchical schema, 2) shows the corresponding relational schema, which was generated automatically using a default decomposition strategy or a user-selected strategy, 3) shows how the particles of each schema are related (the mapping), 4) allows users to modify the resulting relational schema and mapping, and 5) shows the realization of the mapping as a set of SQL statements that can be used with the XML Wrapper.

The disclosure is designed to provide a relational view of hierarchical schema objects like those found in instances of XML Schemas. The disclosure achieves the goals stated above by following these steps:

The disclosure starts by reading and understanding one such hierarchical schema. Currently, the disclosure can read "XML Schemas", "WSDL" definitions, and "XML documents." When an XML document is used as input, the disclosure automatically deduces an XML Schema that would accept the input document. XML Schemas are read using DOM and parsed into an internal model that represents schemas as nested repeating relations.

The disclosure then decides how to "decompose" (sometimes referred to as "flattening") the hierarchical schema (called the "input schema") into a lossless relational schema. Lossless in this context means that the resulting relational schema must a) provide access to all the elements and attributes appearing in the input schema, and b) preserve the parent-child relations implied by the nested structure of the input schema. The disclosure uses a "default" decomposition strategy that works as follows:

The nested structure of the input schema is visited recursively and, for each parent-child relationship found, the disclosure tests the cardinality of that relationship. If the cardinality is 1-N (i.e., the child part repeats within its parent), a new relational view is started for the child. The parent-child relationship is remembered in the relational schema by introducing special columns on the parent and child views that represent key and foreign key fields (i.e., the disclosure introduces a referential constraint on the relational schema). Otherwise, the relationship is 1-1 and the elements in the child relation are added to the view representing the parent part. Care is taken to rename child elements appropriately when they are added to their parent's view, since no two elements in a view can have the same na...
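
The default decomposition described above can be illustrated with a minimal Python sketch. It is not the disclosure's implementation: the `Node` model, the `decompose` function, the `_key`/`_fkey` column names, and the renaming rule (prefix a folded child's columns with the child's name, then add a numeric suffix on any remaining clash) are all assumptions made for illustration, since the source text is cut off before it states the actual renaming scheme.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One element of the input schema (hypothetical model for this sketch)."""
    name: str
    fields: list = field(default_factory=list)    # attributes and simple values
    children: list = field(default_factory=list)  # nested Node objects
    repeats: bool = False                          # True if cardinality in parent is 1-N


def decompose(root):
    """Flatten a nested schema into {view_name: [column, ...]}."""
    views = {root.name: []}
    keyed = set()  # views that already carry a key column

    def add_column(view, column):
        # No two columns in a view may share a name (assumed rule: numeric suffix).
        name, n = column, 1
        while name in views[view]:
            n += 1
            name = f"{column}_{n}"
        views[view].append(name)

    def visit(node, view, prefix):
        for f in node.fields:
            add_column(view, prefix + f)
        for child in node.children:
            if child.repeats:
                # 1-N: start a new view for the child and link it to the
                # parent view through key / foreign key columns.
                if view not in keyed:
                    add_column(view, f"{view}_key")
                    keyed.add(view)
                child_view = f"{view}_{prefix}{child.name}"
                views[child_view] = [f"{view}_fkey"]
                visit(child, child_view, "")
            else:
                # 1-1: fold the child's columns into the current view,
                # prefixed with the child's name to avoid clashes.
                visit(child, view, f"{prefix}{child.name}_")

    visit(root, root.name, "")
    return views


# Hypothetical input schema: an order with one customer and many items.
order = Node("order", ["id", "date"], [
    Node("customer", ["name", "id"]),
    Node("item", ["sku", "qty"], [Node("note", ["text"])], repeats=True),
])
views = decompose(order)
for view_name, columns in views.items():
    print(view_name, columns)
```

In this example the repeating `item` element gets its own view, linked back to `order` through the added key columns, while the 1-1 `customer` and `note` elements are folded into the views of their parents. The customer's `id` becomes `customer_id`, so it does not collide with the order's own `id`.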