Method and System of Block Printing for Markup Page
Disclosure Number: IPCOM000198968D
Publication Date: 2010-Aug-19
Document File: 2 page(s) / 123K

Publishing Venue

The Prior Art Database


Printing is one of the most frequent operations when people use a computer in daily life. Whenever people want to read a document carefully, or cannot or do not want to read it in front of the computer, they choose to print it out. Nowadays, using a printer is easy and fast. In most offices, the printers are connected to an intranet, so users can submit print requests from their own computers and collect the printed material a few minutes later. Almost all popular document-processing and text-editing software provides a print function. The user can simply press the Ctrl+P shortcut to print. Printing has become a fundamental feature of such software.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 54% of the total text.

Method and System of Block Printing for Markup Page

Main Idea

What led to such a bad situation? In fact, a page defined by a markup language is fit for browsing rather than printing. When a web page is printed, the browsing layout is rarely consistent with the printing layout because of paper size and printer capability. In this disclosure, we convert the layout of a web page before it is printed, to make it suitable for printing.

In this disclosure, [...] page. It mainly consists of two components:
1) Main Content Automatic Extraction. A web page usually contains diverse content: some content focuses on one topic while other content focuses on another. In most cases, the content on a specific topic is displayed in its own block. In addition, a web page always has a main block that displays the main content for the people browsing it. Other blocks, such as advertisements and logos, are normally displayed at the side of the page. Users often only [...]. For this [...] we apply web page segmentation and block identification techniques to automatically extract the main block from the web page. Optionally, if a web page contains multiple main blocks, we could extract all of these blocks for [...]
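As a rough illustration of main block extraction, the sketch below splits a page into blocks and ranks them by how much non-link text they carry. The disclosure does not specify a segmentation or scoring method, so the set of block tags, the link-density score, and the names `BlockCollector`, `score`, and `extract_main_blocks` are all assumptions made for this example.

```python
from html.parser import HTMLParser

# Assumption: these tags delimit candidate blocks; the disclosure names no tag set.
BLOCK_TAGS = {"div", "article", "section", "main", "td"}
SKIP_TAGS = {"script", "style"}


class BlockCollector(HTMLParser):
    """Collects the text of each block, crediting text to the innermost open block."""

    def __init__(self):
        super().__init__()
        self.blocks = []    # one dict per block, in document order
        self.stack = []     # indices of the blocks that are currently open
        self.in_link = 0    # nesting depth of <a> elements
        self.in_skip = 0    # nesting depth of <script>/<style> elements

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.blocks.append(
                {"id": dict(attrs).get("id"), "text": [], "link_chars": 0}
            )
            self.stack.append(len(self.blocks) - 1)
        elif tag == "a":
            self.in_link += 1
        elif tag in SKIP_TAGS:
            self.in_skip += 1

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS and self.stack:
            self.stack.pop()
        elif tag == "a" and self.in_link:
            self.in_link -= 1
        elif tag in SKIP_TAGS and self.in_skip:
            self.in_skip -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text or not self.stack or self.in_skip:
            return
        block = self.blocks[self.stack[-1]]
        block["text"].append(text)
        if self.in_link:
            block["link_chars"] += len(text)


def score(block):
    """Hypothetical heuristic: text length discounted by the share of link text."""
    total = sum(len(t) for t in block["text"])
    if total == 0:
        return 0.0
    return total * (1.0 - block["link_chars"] / total)


def extract_main_blocks(html, top_n=1):
    """Returns up to top_n (id, text) pairs for the highest-scoring blocks."""
    parser = BlockCollector()
    parser.feed(html)
    parser.close()
    ranked = sorted(parser.blocks, key=score, reverse=True)
    return [(b["id"], " ".join(b["text"])) for b in ranked[:top_n] if score(b) > 0]
```

Passing a larger `top_n` covers the optional case of several main blocks, returning a ranked list that a user could choose from. Navigation bars and advertising blocks tend to score low under this heuristic because most of their text sits inside links.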

After the main content is extracted from the web page for the users to select to print, the next question is how to print it. Recall that the main block is laid out in the web page only for browsing. The original layout of the main block might be quite unsuitable for printing. We need to lay out the content of the main block again according to the size of the paper. Note that a main block can contain some complex situations. For example, pictures may be embedded in the block, and advertisements may be embedded in it as well. Thus, we need to further filter out the irrelevant content and then lay out the relevant content adaptively. We could adopt two techniques to achieve this goal:
2.1 Content Topic Shift Judgment. For the large main block, there...
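To make the idea of laying out the main block again according to the paper size more concrete, here is a minimal sketch that wraps extracted paragraphs to the printable width and splits them into pages. It assumes plain text in a fixed-width font; the function name `layout_for_paper`, the default A4 dimensions, the margins, and the character metrics are hypothetical values chosen for illustration, not details from the disclosure.

```python
import textwrap


def layout_for_paper(paragraphs, paper_width_mm=210.0, paper_height_mm=297.0,
                     margin_mm=20.0, char_width_mm=2.5, line_height_mm=5.0):
    """Wraps paragraphs to the printable width and groups the lines into pages."""
    # Printable area in characters per line and lines per page (fixed-width assumption).
    cols = int((paper_width_mm - 2 * margin_mm) // char_width_mm)
    rows = int((paper_height_mm - 2 * margin_mm) // line_height_mm)
    if cols < 1 or rows < 1:
        raise ValueError("margins leave no printable area")

    lines = []
    for paragraph in paragraphs:
        lines.extend(textwrap.wrap(paragraph, width=cols))
        lines.append("")  # blank line between paragraphs
    if lines and lines[-1] == "":
        lines.pop()       # no trailing blank line after the last paragraph

    # Each page is a list of at most `rows` lines.
    return [lines[i:i + rows] for i in range(0, len(lines), rows)]
```

Changing the paper dimensions changes only the computed column and row counts, so the same extracted content can be laid out for different paper sizes. Images and proportional fonts would need real layout measurement, which this sketch does not attempt.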