
Conversion of style based documents to arbitrary XML formats using externalized rule database.

IP.com Disclosure Number: IPCOM000016178D
Original Publication Date: 2002-Aug-16
Included in the Prior Art Database: 2003-Jun-21
Document File: 4 page(s) / 77K

Publishing Venue

IBM

Abstract

Disclosed is a mechanism for converting style-based word processor documents to XML or HTML format, using a set of externalized rules to determine the markup applied to the converted documents. Style-based word processor documents associate a style name with each paragraph, and that name signifies the markup to be applied to the paragraph (see Fig 1). In addition, style names marking runs of text within a paragraph indicate character-level attributes such as bold or underline (see Fig 2). Fig 1: A word processor document showing a number of paragraphs using different styles. Fig 2: A word processor paragraph showing character-level styling.

At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 53% of the total text.

Disclosed is a mechanism for converting style-based word processor documents to XML or HTML format, using a set of externalized rules to determine the markup applied to the converted documents.

    Style-based word processor documents associate a style name with each paragraph, and that name signifies the markup to be applied to the paragraph (see Fig 1). In addition, style names marking runs of text within a paragraph indicate character-level attributes such as bold or underline (see Fig 2).

Fig 1: A word processor document showing a number of paragraphs using different styles.

Fig 2: A word processor paragraph showing character-level styling.

    In this disclosure, a set of rules specified in an external database is used to convert the style-based document to an XML format.

The conversion rules comprise document, paragraph, text and special character rules:

Document level rule: Specifies the prefix and postfix XML content to be applied to the entire document.

Paragraph level rules: Specify the prefix and postfix XML content to be applied to document paragraphs. A separate rule is specified for each named paragraph style in the input document.

Text level rules: Specify the prefix and postfix XML content to be applied to runs of text marked with a particular style or attribute.

Special character rules: Specify mappings between characters and the special representations that they should have in the XML file.
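The four rule types above can be sketched in a few lines of Python. This is a minimal illustration, not the disclosed implementation: the names `Rule`, `RuleSet` and `convert`, the in-memory document model (a list of styled paragraphs, each holding styled runs), and the choice to emit unstyled content with no markup are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Rule:
    """Prefix and postfix markup wrapped around a unit of content."""
    prefix: str = ""
    postfix: str = ""


@dataclass
class RuleSet:
    """Hypothetical container for the four rule types."""
    document: Rule               # one rule for the whole document
    paragraph: Dict[str, Rule]   # keyed by paragraph style name
    text: Dict[str, Rule]        # keyed by character style name
    special: Dict[str, str]      # character -> special representation


# A run is (text, character style name or None);
# a paragraph is (paragraph style name, list of runs).
Run = Tuple[str, Optional[str]]
Paragraph = Tuple[str, List[Run]]


def convert(paragraphs: List[Paragraph], rules: RuleSet) -> str:
    """Apply document, paragraph, text and special character rules."""
    out = [rules.document.prefix]
    for style, runs in paragraphs:
        # Assumption: a style with no rule contributes no markup.
        p_rule = rules.paragraph.get(style, Rule())
        out.append(p_rule.prefix)
        for text, char_style in runs:
            t_rule = rules.text.get(char_style, Rule()) if char_style else Rule()
            mapped = "".join(rules.special.get(ch, ch) for ch in text)
            out.append(t_rule.prefix + mapped + t_rule.postfix)
        out.append(p_rule.postfix)
    out.append(rules.document.postfix)
    return "".join(out)


HTML_RULES = RuleSet(
    document=Rule("<html><body>", "</body></html>"),
    paragraph={"Heading 1": Rule("<h1>", "</h1>"),
               "Body Text": Rule("<p>", "</p>")},
    text={"Bold": Rule("<b>", "</b>")},
    special={"&": "&amp;", "<": "&lt;", ">": "&gt;"},
)

DOC = [
    ("Heading 1", [("Tips & tricks", None)]),
    ("Body Text", [("Press ", None), ("Enter", "Bold"), (" to continue.", None)]),
]

print(convert(DOC, HTML_RULES))
# <html><body><h1>Tips &amp; tricks</h1><p>Press <b>Enter</b> to continue.</p></body></html>
```

Because the converter only ever looks up prefixes, postfixes and character mappings, the output format is determined entirely by the rule set passed in.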

    In addition to the above, further rules specify the formatting of graphics and tables.

    In one specific implementation of this technology, word processor documents are converted to variations of HTML to provide a product-specific look and feel for software help.

    In one specific implementation of this technology, the conversion rules are specified within an "IBM Lotus Notes" database. The file converter program, "builder", is pointed at the source files and at the database containing the rules; it then generates the output files according to the format specified within the rules.
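The separation between converter and rule store might look roughly like the sketch below. It is illustrative only: a JSON string stands in for the Lotus Notes database, and the function names `load_rules` and `build`, the JSON layout, and the sample style names are hypothetical rather than taken from the disclosure.

```python
import json

# Assumption: rules live outside the program. A JSON document is used
# here as a stand-in for the external rule database.
RULES_JSON = """
{
  "document": {"prefix": "<topic>", "postfix": "</topic>"},
  "paragraph": {
    "Heading 1": {"prefix": "<title>", "postfix": "</title>"},
    "Body Text": {"prefix": "<para>", "postfix": "</para>"}
  },
  "text": {"Bold": {"prefix": "<emph>", "postfix": "</emph>"}},
  "special": {"&": "&amp;", "<": "&lt;"}
}
"""


def load_rules(source: str) -> dict:
    """Parse the external rule store and check that all sections exist."""
    rules = json.loads(source)
    for key in ("document", "paragraph", "text", "special"):
        if key not in rules:
            raise ValueError(f"rule store is missing the '{key}' section")
    return rules


def wrap(rule: dict, body: str) -> str:
    """Surround body with the rule's prefix and postfix, if any."""
    return rule.get("prefix", "") + body + rule.get("postfix", "")


def build(paragraphs, rules: dict) -> str:
    """Generate output for the source content using only the loaded rules."""
    special = rules["special"]
    parts = []
    for style, runs in paragraphs:
        body = ""
        for text, char_style in runs:
            mapped = "".join(special.get(ch, ch) for ch in text)
            # Runs without a matching text rule are emitted unwrapped.
            body += wrap(rules["text"].get(char_style, {}), mapped)
        parts.append(wrap(rules["paragraph"].get(style, {}), body))
    return wrap(rules["document"], "".join(parts))


SOURCE = [
    ("Heading 1", [("Save & exit", None)]),
    ("Body Text", [("Click ", None), ("OK", "Bold"), (".", None)]),
]

print(build(SOURCE, load_rules(RULES_JSON)))
# <topic><title>Save &amp; exit</title><para>Click <emph>OK</emph>.</para></topic>
```

Pointing `build` at a different rule store changes the output format without any change to the converter code or to the source content.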

    The following series of figures shows some rules from a database that will produce a basic HTML layout.

[This page contains 1 picture or other non-text object]

    This allows non p...