Method of HTML Page maintenance

IP.com Disclosure Number: IPCOM000015013D
Original Publication Date: 2001-Aug-03
Included in the Prior Art Database: 2003-Jun-20
Document File: 2 page(s) / 42K

Publishing Venue

IBM

Abstract

Method and Apparatus for HTML Page Maintenance

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 47% of the total text.

Method and Apparatus for HTML Page Maintenance

A. Boilerplate Management

Disclosed are a method and apparatus for maintaining a multiplicity of HTML pages that all have the same general appearance (the same masthead and the same footer, for example). The method permits "factoring out" the HTML for the common page parts without requiring Server-Side Include (SSI). Avoiding SSI is desirable for two reasons: SSI places additional load on the HTTP server machine, because a page is built each time it is served, and it renders browser caches useless, which increases network traffic.

This method uses architected HTML comments to mark the regions of the HTML file where the "real content" of the page is. HTML outside these markers is taken to be boilerplate. For example, a page might look like this:

<html>
<head>
... meta tags, etc. ...
</head>
<!-- BEGIN HEADER -->
... boilerplate markup for masthead...
<!-- BEGIN CONTENT -->
... the page's "real content"...
<!-- END CONTENT -->
... boilerplate markup for footer...

In this example, the "real content" is in two regions: everything from the beginning of the file up to but not including "<!-- BEGIN HEADER -->", and everything between "<!-- BEGIN CONTENT -->" and "<!-- END CONTENT -->" noninclusive.
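The two-region extraction described above can be sketched in a few lines of Python. This is only an illustration, not the disclosed program: the function name extract_regions and the tolerance for extra whitespace inside the marker comments are assumptions, and only the three marker spellings come from the example page.

```python
import re

# Marker spellings are taken from the example page. Allowing flexible
# whitespace inside the comment is an assumption made for this sketch.
BEGIN_HEADER = re.compile(r"<!--\s*BEGIN HEADER\s*-->")
BEGIN_CONTENT = re.compile(r"<!--\s*BEGIN CONTENT\s*-->")
END_CONTENT = re.compile(r"<!--\s*END CONTENT\s*-->")


def extract_regions(html: str) -> tuple[str, str]:
    """Return the two "real content" regions of a marked page.

    The first region runs from the start of the file up to, but not
    including, the BEGIN HEADER marker. The second is the text strictly
    between the BEGIN CONTENT and END CONTENT markers.
    """
    header = BEGIN_HEADER.search(html)
    if header is None:
        raise ValueError("missing BEGIN HEADER marker")
    begin = BEGIN_CONTENT.search(html, header.end())
    if begin is None:
        raise ValueError("missing BEGIN CONTENT marker")
    end = END_CONTENT.search(html, begin.end())
    if end is None:
        raise ValueError("missing END CONTENT marker")
    return html[:header.start()], html[begin.end():end.start()]
```

Everything not returned by this function (the masthead between the first two markers and the footer after the last one) is treated as replaceable boilerplate.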

The invention is a computer program that can read a so-marked HTML file, extract the "real content" sections, and then rewrite the HTML file by combining prespecified boilerplate HTML with the extracted "real content". The rules for page rebuilding are contained in a control file that the rebuilder reads before beginning to rebuild pages. The rules specify the names of the files containing the boilerplate HTML fragments. The rules also specify the order in which these fragments are to be written to the rebuilt HTML file and the places at which extracted "real content" sections are to be inserted. The program iterates through the whole directory tree comprising the web site, rebuilding each encountered HTML file in turn.
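A minimal sketch of such a rebuilder follows, in Python. The disclosure does not specify the control file format, so the JSON layout used here (a "sequence" list whose entries name either a boilerplate file or an extracted region) is purely hypothetical, as are the function names, the region names "head" and "body", and the choice to visit only files ending in .html. The sketch re-emits the marker comments around the extracted regions so that a rebuilt page can be rebuilt again later.

```python
import json
import re
from pathlib import Path

BEGIN_HEADER = re.compile(r"<!--\s*BEGIN HEADER\s*-->")
BEGIN_CONTENT = re.compile(r"<!--\s*BEGIN CONTENT\s*-->")
END_CONTENT = re.compile(r"<!--\s*END CONTENT\s*-->")

# Text written before and after each extracted region, so the markers
# survive the rebuild. The region names are assumptions of this sketch.
WRAP = {
    "head": ("", "<!-- BEGIN HEADER -->"),
    "body": ("<!-- BEGIN CONTENT -->", "<!-- END CONTENT -->"),
}


def extract_regions(html):
    """Return {"head": ..., "body": ...}, or None for an unmarked page."""
    header = BEGIN_HEADER.search(html)
    begin = BEGIN_CONTENT.search(html, header.end()) if header else None
    end = END_CONTENT.search(html, begin.end()) if begin else None
    if end is None:
        return None
    return {"head": html[:header.start()], "body": html[begin.end():end.start()]}


def rebuild_page(html, sequence, fragments):
    """Interleave boilerplate fragments and extracted regions in rule order."""
    regions = extract_regions(html)
    if regions is None:
        return html  # leave pages without markers untouched
    parts = []
    for step in sequence:
        if "region" in step:
            before, after = WRAP[step["region"]]
            parts.append(before + regions[step["region"]] + after)
        else:
            parts.append(fragments[step["file"]])
    return "".join(parts)


def rebuild_site(root, control_path):
    """Rebuild every .html file under root; return the number rewritten."""
    control_path = Path(control_path)
    sequence = json.loads(control_path.read_text(encoding="utf-8"))["sequence"]
    # Boilerplate files are resolved relative to the control file (assumption).
    fragments = {
        step["file"]: (control_path.parent / step["file"]).read_text(encoding="utf-8")
        for step in sequence
        if "file" in step
    }
    rewritten = 0
    for path in sorted(Path(root).rglob("*.html")):
        original = path.read_text(encoding="utf-8")
        rebuilt = rebuild_page(original, sequence, fragments)
        if rebuilt != original:
            path.write_text(rebuilt, encoding="utf-8")
            rewritten += 1
    return rewritten
```

Under these assumptions, a control file for the example page might contain the sequence: region "head", file "masthead.frag", region "body", file "footer.frag". Changing masthead.frag and rerunning rebuild_site would then update the masthead on every marked page, with no work left for the HTTP server to do at serving time.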

Some extensions to the aforementioned program make it even more powerful.

1. One such extension is the idea that a directive placed in the HTML (via architected HTML comment, for example) can tell the rebuilder program which of several predefined "looks" it should use when rebuilding the file. The aforementioned control file contains the rules for all predefined looks, and said directive tells the program which one to use.

2. Another extension lets such a directive be a selector for a customized HTML snippet for a certain region of the page, such as a customized left-navigation bar. This is useful because thematically related sets of pages on the site might all use the same customized page region, so factoring that region out into a single snippet is valuable.

3. A further extension is the idea that the correspondence between HTML pages and customized page regions can be recorded in the aforementioned control file. This is most useful when the corresponde...
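Extensions 1 and 2 can be illustrated with a short Python sketch. The disclosure does not give a directive syntax, so the comment forms "<!-- LOOK: name -->" and "<!-- LEFTNAV: name -->" are hypothetical, as are the function names and the layout of the control data (a "looks" table, a "default_look" entry, and a "leftnav" table). For a directive to survive a rebuild it would have to sit inside one of the "real content" regions, which is also an assumption here.

```python
import re

# Hypothetical directive syntax: <!-- LOOK: wide --> or <!-- LEFTNAV: products -->
DIRECTIVE = re.compile(r"<!--\s*(LOOK|LEFTNAV)\s*:\s*([\w.-]+)\s*-->")


def read_directives(html: str) -> dict[str, str]:
    """Collect directive comments from a page as {keyword: value}."""
    return {m.group(1): m.group(2) for m in DIRECTIVE.finditer(html)}


def choose_rules(html: str, control: dict):
    """Pick the rebuild rules and optional left-nav snippet for one page.

    Returns (rules_for_selected_look, leftnav_snippet_name_or_None).
    """
    directives = read_directives(html)
    # Extension 1: the page names one of several predefined looks.
    look = directives.get("LOOK", control["default_look"])
    if look not in control["looks"]:
        raise KeyError(f"page requests unknown look {look!r}")
    # Extension 2: the page selects a customized snippet for one region.
    nav_key = directives.get("LEFTNAV")
    snippet = control["leftnav"][nav_key] if nav_key is not None else None
    return control["looks"][look], snippet
```

A page with no directives falls back to the default look and receives no customized snippet, so existing pages keep rebuilding as before.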