
Template Conformance using XPath analysis
Disclosure Number: IPCOM000241749D
Publication Date: 2015-May-28
Document File: 4 page(s) / 31K

Publishing Venue

The Prior Art Database


Ensuring that an office document meets the styles and format defined in a template is an important feature for commercial, legal, academic and government organisations. While there are techniques to examine documents for their semantic structure, to compare XML documents, and to apply schema rules, there is little available for managing the conformance of a document to a given template. We present a technique based on XML path analysis to measure and report how well a document conforms to its template.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 39% of the total text.


Template Conformance using XPath analysis

Templates are used within typical office documents to provide a guide as to what is expected in a given document, for instance a design document in an engineering environment. The purpose of the template is to ensure that a consistent approach (e.g. style and structure) is used across all instances of a document type throughout an organisation. This applies in many settings, including academic institutions and legal documents.

A template should in effect be more than just a guide. It defines a set of rules as to what is allowed in a given document. The challenge is how to assess and manage, on an ongoing basis, the compliance of all instance documents with the rules defined by the document template.

How can a writer or a reader of a given document ensure that the document conforms to the template? The same question applies to a documentation manager responsible for a company's repository of documents.

One way to manage document conformity and uniformity is simply to avoid Office-like documents and use a scheme instead, but that is not an easy or user-friendly way of creating documents.

A simple text-based comparison of a template to an instance document is only meaningful at the time the instance document is created from the template. It does not provide any context regarding styles and formatting defined in the template.

What is needed is a "rules based" check.

Modern office document tools use file formats based on a zipped collection of XML documents, describing style and other metadata as well as the actual document content. XML Schema validation could perhaps be used to validate compliance to a set of document 'rules', but the XML Schema would be specific to a document type, so it would need to be manually written or generated (e.g. based on a template) and the document authoring tool would need to support this notion of document-type-specific schema validation. For example, it may need to generate from a template or accept from a user an XML Schema, refer to that Schema in the instance document and include the Schema with the instance document or provide some way to resolve the Schema reference and perform the schema validation.
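
As a rough illustration of the zipped-XML packaging described above, the following Python sketch opens such a package and reports the root element of each XML part. The part names (content.xml, styles.xml), the element names and both helper functions are hypothetical stand-ins chosen for illustration, not the layout of any particular office format.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def list_xml_parts(package):
    """Return {part name: root element tag} for each XML part in a zipped document."""
    parts = {}
    with zipfile.ZipFile(package) as zf:
        for name in zf.namelist():
            if name.endswith(".xml"):
                # Parse the part and record only its root element tag.
                root = ET.fromstring(zf.read(name))
                parts[name] = root.tag
    return parts


def build_demo_package():
    """Build a tiny in-memory package (hypothetical layout, for the demo only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("content.xml", "<document><p style='Body'>Hello</p></document>")
        zf.writestr("styles.xml", "<styles><style name='Body'/></styles>")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for part, root_tag in list_xml_parts(build_demo_package()).items():
        print(part, root_tag)
```

Because the content and style information are ordinary XML parts, any later conformance check can work on parsed element trees without support from the authoring tool itself.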

The core idea of this invention is to provide a mechanism to use the document template itself for measuring compliance of instance documents. That is, measuring how well an instance document conforms to its template. This mechanism can be used repeatedly over time, as the instance document changes. The template becomes more than just a general guide or creation-time cookie cutter.

A template comparison is based on the underlying document element types and their style and formatting as defined in the template document. It is not concerned with the data content of the instance document.
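
The idea of comparing element types and styles while ignoring data content might be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosure's actual algorithm: the `style` attribute, the toy `doc`, `h` and `p` elements, and the functions `style_paths` and `conformance` are all hypothetical. Each element is reduced to an XPath-like path qualified by its style, and an instance is scored by the fraction of its distinct paths that also occur in the template.

```python
import xml.etree.ElementTree as ET

STYLE_ATTR = "style"  # hypothetical attribute carrying the style name


def style_paths(root):
    """Collect the set of style-qualified element paths, ignoring text content."""
    paths = set()

    def walk(elem, prefix):
        step = elem.tag
        style = elem.get(STYLE_ATTR)
        if style is not None:
            step += "[@%s='%s']" % (STYLE_ATTR, style)
        path = prefix + "/" + step
        paths.add(path)
        for child in elem:
            walk(child, path)

    walk(root, "")
    return paths


def conformance(template_xml, instance_xml):
    """Return (score, violations): the share of instance paths found in the template,
    and the sorted list of instance paths that the template does not contain."""
    allowed = style_paths(ET.fromstring(template_xml))
    used = style_paths(ET.fromstring(instance_xml))
    violations = sorted(used - allowed)
    score = 1.0 if not used else len(used & allowed) / len(used)
    return score, violations


# Toy documents (assumed structure): the instance adds a paragraph style
# that the template does not define.
TEMPLATE = "<doc><h style='Heading1'>Title</h><p style='Body'>Text</p></doc>"
INSTANCE = ("<doc><h style='Heading1'>Design</h>"
            "<p style='Body'>A</p><p style='Fancy'>B</p></doc>")

if __name__ == "__main__":
    score, violations = conformance(TEMPLATE, INSTANCE)
    print(score, violations)
```

Since only paths and style names are compared, the check can be rerun whenever the instance document changes, and edits to the text alone do not affect the score.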

A 'template' can be as simple as an initial document created with an example of each of the required document element types (headings, paragraphs, tables, etc), with their st...