Conversion of Character Based Graphics to a Structured Graphic Representation

IP.com Disclosure Number: IPCOM000123310D
Original Publication Date: 1998-Sep-01
Included in the Prior Art Database: 2005-Apr-04
Document File: 3 page(s) / 140K

Publishing Venue

IBM

Related People

Kirk, M: AUTHOR [+2]

Abstract

One option for achieving a uniform product documentation practice is to use SGML for printed material and HTML for online material. However, a significant amount of legacy documentation has to be converted as part of this process. Unfortunately, neither of the new formats understands the cgraphic format, hence the need for a converter.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 44% of the total text.

   The following procedure performs the required conversion:
  1.  Starting from a Bookmaster root file, search for a .cgraphic
      tag, which indicates the start of a cgraphic, and write the
      graphic to a file until a .ecgraphic tag is encountered.
  2.  Match patterns found in the graphics and convert them
      according to entries made in a table.  Tags that represent
      characters (e.g. corner characters, general Bookmaster tags)
      are replaced with single characters that are not generally
      used, or characters that directly map to their ANSI
      equivalent.  Tags that are not supported are removed.
      User-defined tags within graphics, such as tags defined with
      .nameit macros, are resolved according to the .nameit
      statements found within the document from which the graphic
      was extracted.
  3.  Transfer the cgraphic(s) from HOST to PC via FTP to ensure
      consistent character mapping.
  4.  A cgraphic is read into a two-dimensional character array.
  5.  Any line in the array containing only formatting characters
      is removed.  The formatting characters are placed at the
      start of the next line.
  6.  All text characters and their co-ordinates are then
      extracted from the cgraphic, and formatting characters are
      resolved.  Horizontal text blocks are then constructed,
      breaking a text block after two sequential spaces, unless
      the previous character was a period, in which case it is
      broken after three sequential spaces.  These single line
      text blocks are then combined into multi-line text blocks
      by joining those text blocks that have the same x
      co-ordinate as the text block on the line immediately
      below.
  7.  Horizontal lines and their co-ordinates are then extracted
      in a sequential manner, top left to bottom right.  A line
      is defined by the co-ordinates of its start and end
      points.  The co-ordinates are calculated in relation
      to the position of the line within the character
      array.  A line is considered to be any number of
      adjacent characters with a complete horizontal
      element.  The line is delimited by characters with
      no horizontal element or only a partial one.
  8.  Vertical lines and their co-ordinates are then
      extracted in a similar fashion.
  9.  Dashed and dotted lines are then...
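
Step 1 of the procedure above (collecting each graphic between a .cgraphic tag and the following .ecgraphic tag) can be sketched roughly as follows. This is only an illustration: the function name extract_cgraphics is hypothetical, the tags are assumed to start a line exactly as written in the step, and the graphics are returned in memory instead of being written to files.

```python
def extract_cgraphics(lines):
    """Collect the lines found between each .cgraphic / .ecgraphic pair.

    Assumption: each tag starts its own line (ignoring leading blanks
    and case). Returns a list of graphics, each a list of text lines.
    """
    graphics = []
    current = None  # None while we are outside a graphic
    for line in lines:
        tag = line.strip().lower()
        if current is None:
            if tag.startswith(".cgraphic"):
                current = []  # start collecting a new graphic
        elif tag.startswith(".ecgraphic"):
            graphics.append(current)  # graphic is complete
            current = None
        else:
            current.append(line.rstrip("\n"))
    return graphics
```

A graphic that is opened but never closed is silently dropped in this sketch; a real converter would probably want to report it.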
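
The text-block rules in step 6 above can be illustrated with a short sketch. It assumes that each row has already had its line-drawing and formatting characters blanked out, so that only text and spaces remain; the names split_text_blocks and merge_vertical and the (x, y, text) tuple layout are hypothetical and not taken from the original converter.

```python
def split_text_blocks(row, y):
    """Split one row into horizontal text blocks (x, y, text).

    A block ends after two sequential spaces, or after three when the
    character before the spaces is a period.
    """
    blocks = []
    start = None  # x co-ordinate of the current block, None if no block is open
    text = ""
    spaces = 0    # length of the current run of spaces
    for x, ch in enumerate(row):
        if ch != " ":
            if start is None:
                start, text = x, ""
            text += ch
            spaces = 0
        elif start is not None:
            spaces += 1
            text += ch
            limit = 3 if text.rstrip(" ").endswith(".") else 2
            if spaces >= limit:
                blocks.append((start, y, text.rstrip(" ")))
                start, spaces = None, 0
    if start is not None:
        blocks.append((start, y, text.rstrip(" ")))
    return blocks


def merge_vertical(blocks):
    """Join single-line blocks that share an x co-ordinate on adjacent lines.

    Returns a list of (x, y, [line, line, ...]) multi-line blocks.
    """
    merged = []
    latest = {}  # x co-ordinate -> index in merged of the newest block at that x
    for x, y, text in sorted(blocks, key=lambda b: (b[1], b[0])):
        idx = latest.get(x)
        # merged[idx][1] + number of lines is the row directly below the block
        if idx is not None and merged[idx][1] + len(merged[idx][2]) == y:
            merged[idx][2].append(text)
        else:
            latest[x] = len(merged)
            merged.append([x, y, [text]])
    return [tuple(m) for m in merged]
```

For example, the row "Start  End.  Next   Last" would split into "Start", "End.  Next", and "Last": the two spaces after the period do not break the block, but the two spaces after "Start" do.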
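
The horizontal line extraction in step 7 above might look something like the sketch below. It is a minimal illustration under stated assumptions: the Unicode box-drawing characters stand in for whatever single characters the table in step 2 actually produces, FULL_HORIZONTAL and horizontal_lines are hypothetical names, and the end points exclude the delimiting characters (corners and other characters with no horizontal element or only a partial one).

```python
# Assumption: these characters have a complete horizontal element.
# Corners such as "┌" or "┘" only have a partial one, so they delimit a line.
FULL_HORIZONTAL = frozenset("─┬┴┼")


def horizontal_lines(grid, full=FULL_HORIZONTAL):
    """Return [((x_start, y), (x_end, y)), ...] for each horizontal line.

    grid is a list of rows (strings or lists of characters). Rows are
    scanned top to bottom and each row left to right, so lines come out
    in top-left to bottom-right order.
    """
    lines = []
    for y, row in enumerate(grid):
        start = None  # x co-ordinate where the current run began
        for x, ch in enumerate(row):
            if ch in full:
                if start is None:
                    start = x
            elif start is not None:
                lines.append(((start, y), (x - 1, y)))
                start = None
        if start is not None:  # run reaches the end of the row
            lines.append(((start, y), (len(row) - 1, y)))
    return lines
```

The vertical lines of step 8 could presumably be found the same way by scanning columns instead of rows with a set of characters that have a complete vertical element, although the source does not spell this out.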