System for paginating markup in the absence of direct feedback from an output device context

IP.com Disclosure Number: IPCOM000013934D
Original Publication Date: 2001-Oct-03
Included in the Prior Art Database: 2003-Jun-19
Document File: 5 page(s) / 54K

Publishing Venue

IBM

Abstract

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 22% of the total text.

Abstract

This document describes an algorithm for paginating markup in the absence of direct feedback from an output context. The algorithm uses a simplified numerical model of the behaviour of downstream output transformations to make upstream pagination decisions. The benefit of paginating the input upstream is that the downstream transform stages can be simplified: they need only lay out page-sized chunks of output, without having to consider the complicated issues of spanning content across multiple pages.

Context

With the advent of markup languages such as XML and HTML, there have been strong incentives to generate various kinds of application output as XML documents and then, separately, to transform that XML into a presentation format such as HTML. This two-phase approach to output generation has two advantages: first, the device-independent XML can be used as input to other processes unrelated to presentation [for example: analysis processing]; second, by substituting different presentation transforms, the device-independent XML can be retargeted to different kinds of device context. Reporting outputs are a type of application output that appears well suited to this approach, particularly since the intermediate XML representation is likely to have a number of uses aside from presentation.

One difficulty, however, is introduced when there is a requirement to paginate the final output, particularly when HTML and CSS are used as the presentation languages. The problem is that HTML and CSS provide only very limited support for output pagination. The support they do provide is a mechanism for signalling to a printing context where a page break should be forced. It remains the responsibility of the generator of the HTML to ensure that the content generated between two page break directives will fit within the confines of a typical page. This means the HTML generator must account for the depth of output generated thus far and, when necessary, generate a page footer, a page break directive and a subsequent page header, then continue processing the input in the same fashion until the end of input has been reached. It is this need for the HTML generator to account for the depth of generated output that makes the problem a reasonably tricky one to solve.
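
The depth accounting described above can be illustrated with a minimal sketch. It is not the algorithm of this disclosure, only one way the footer, page break and header bookkeeping might look. It assumes a deliberately crude model in which every row has a known, fixed depth. The names `Row`, `paginate`, `PAGE_DEPTH`, `HEADER_DEPTH` and `FOOTER_DEPTH`, and all of the numbers, are hypothetical; the only real mechanism used is the CSS `page-break-after` property as the page break directive.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Hypothetical model values in arbitrary units. A real browser's layout
# will differ; these stand in for a simplified numerical model.
PAGE_DEPTH = 900     # usable depth of one printed page
HEADER_DEPTH = 60    # depth consumed by the page header
FOOTER_DEPTH = 40    # depth that must be reserved for the page footer

# CSS page break directive understood by printing contexts.
PAGE_BREAK = '<div style="page-break-after: always"></div>'


@dataclass
class Row:
    html: str    # markup for one unit of content
    depth: int   # estimated depth of that unit once rendered


def paginate(rows: Iterable[Row], header: str, footer: str) -> List[str]:
    """Return HTML fragments with footer, page break and header inserted
    whenever the next row would overflow the modelled page."""
    out = [header]
    used = HEADER_DEPTH
    for row in rows:
        overflow = used + row.depth + FOOTER_DEPTH > PAGE_DEPTH
        # Only break if the current page already holds content, so that an
        # oversized row is placed alone instead of producing empty pages.
        if overflow and used > HEADER_DEPTH:
            out += [footer, PAGE_BREAK, header]
            used = HEADER_DEPTH
        out.append(row.html)
        used += row.depth
    out.append(footer)
    return out
```

With these assumed values, ten rows of depth 200 would be split four, four and two across three pages. The accuracy of the result depends entirely on how well the per-row depth estimates match what the output processor actually renders.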
In fact, in the general case, the problem can be solved with complete accuracy only if the HTML generator has a fully accurate model of the output processor (usually a web browser) that writes to the final output context. Such accuracy is an unreasonable requirement to place on an HTML generator. However, provided one is willing to make certain assumptions about the configuration of the output processor, restrict the nature of the generated HTML, and so trade model accuracy for simplicity, it is possible to predict t...