Local Saves of Complete User-Relevant HTML Content

IP.com Disclosure Number: IPCOM000012957D
Original Publication Date: 1999-Oct-01
Included in the Prior Art Database: 2003-Jun-11
Document File: 1 page(s) / 41K

Publishing Venue

IBM

Related People

Carl Binding: AUTHOR [+2]

Abstract

This technical disclosure describes how the mechanisms described in [1] can be adopted to enable Hypertext Markup Language (HTML) browsers to provide complete local saves of HTML content.

When the "Save As" command of a commonly deployed World-Wide Web (WWW) browser is used, the HTML root source page is saved to a file in the local machine's file system under a Uniform Resource Locator (URL) of the form "file:<local file path name>". However, relative URLs that refer to data embedded by reference within the HTML source (for example, an image file in GIF format) are not resolved, and the referenced data is not saved locally. As a consequence, when the saved HTML root page is accessed again, the browser cannot retrieve the embedded data because the absolute URL has changed. The embedded data cannot be retrieved by concatenating the saved root page's local file path name with the relative path name of the GIF data because a) the data is not even available on the local file system, and b) there would be a name conflict between the root page's local file path name and the directory expected to contain the image source data. For example, a relative URL of the form "/image.gif" becomes the absolute "file:<local file path name>/image.gif" instead of "http://www.some.host/<relative path>/image.gif".

The agent mechanism described in [1] can evidently be incorporated into the "Save As" command of the browser. Instead of periodic monitoring of Web-based HTML pages, it is the user's explicit action that triggers retrieval of an HTML page and the resolution of its embedded, relative URLs: the HTML document is scanned, the embedded URLs are retrieved and their content is stored, and each embedded URL is re-labelled to point to the locally saved content. The local save does not create a single file; it creates a directory in which the HTML root page is stored together with all the embedded image data. Care must be taken to avoid naming conflicts between the HTML root file and the created directory.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 66% of the total text.

This technical disclosure describes how the mechanisms described in [1] can be adopted to enable Hypertext Markup Language (HTML) browsers to provide complete local saves of HTML content.

When the "Save As" command of a commonly deployed World-Wide Web (WWW) browser is used, the HTML root source page is saved to a file in the local machine's file system under a Uniform Resource Locator (URL) of the form "file:<local file path name>". However, relative URLs that refer to data embedded by reference within the HTML source (for example, an image file in GIF format) are not resolved, and the referenced data is not saved locally. As a consequence, when the saved HTML root page is accessed again, the browser cannot retrieve the embedded data because the absolute URL has changed. The embedded data cannot be retrieved by concatenating the saved root page's local file path name with the relative path name of the GIF data because a) the data is not even available on the local file system, and b) there would be a name conflict between the root page's local file path name and the directory expected to contain the image source data. For example, a relative URL of the form "/image.gif" becomes the absolute "file:<local file path name>/image.gif" instead of "http://www.some.host/<relative path>/image.gif".
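The change of base URL can be illustrated with a short Python sketch. It uses the standard library's `urljoin`, which performs standard relative-reference resolution rather than the plain concatenation described above; the host, the directory names, and the helper `resolve` are hypothetical and chosen only for illustration.

```python
from urllib.parse import urljoin


def resolve(base_url, relative_url):
    """Resolve a relative URL against the URL of the document that contains it."""
    return urljoin(base_url, relative_url)


# Hypothetical locations of the same root page before and after "Save As".
original_page = "http://www.some.host/docs/page.html"
saved_page = "file:///home/user/saved/page.html"

# A relative reference as it might appear in an <img src="..."> attribute.
relative = "image.gif"

on_server = resolve(original_page, relative)  # points at the Web server
on_disk = resolve(saved_page, relative)       # points into the local file system

print(on_server)
print(on_disk)
```

The same relative reference yields two different absolute URLs. The second one names a local file that "Save As" never created, which is why the saved page is displayed without its image.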

The agent mechanism described in [1] can evidently be incorporated into the "Save As" command of the browser. Instead of periodically monitoring...
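The procedure outlined in the abstract (scan the HTML document, retrieve the embedded URLs, store their content, and re-label each URL to point to the local copy) might be sketched as follows. This is a minimal illustration, not the agent mechanism of [1], which is not reproduced here. It assumes that only `<img src="...">` references are handled, that the root page is stored as `index.html` inside the created directory, and that the names `save_complete`, `fetch_bytes`, and `IMG_SRC` are hypothetical.

```python
import os
import re
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

# Deliberately narrow pattern: the quoted src attribute of an <img> tag.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc\s*=\s*)(["\'])(.*?)\2', re.IGNORECASE)


def fetch_bytes(url):
    """Default fetcher: download the content behind a URL."""
    with urlopen(url) as response:
        return response.read()


def save_complete(page_url, html, target_dir, fetch=fetch_bytes):
    """Store the root page and its embedded images in target_dir.

    Returns the path of the saved root page.
    """
    os.makedirs(target_dir, exist_ok=True)
    used_names = {"index.html"}  # reserve the root page's name to avoid conflicts
    saved = {}                   # absolute URL -> local file name

    def localize(match):
        prefix, quote, src = match.groups()
        absolute = urljoin(page_url, src)  # resolve against the original page URL
        if absolute not in saved:
            name = os.path.basename(urlparse(absolute).path) or "resource"
            stem, ext = os.path.splitext(name)
            candidate, n = name, 1
            while candidate in used_names:  # pick a name not yet used in the directory
                candidate = f"{stem}_{n}{ext}"
                n += 1
            used_names.add(candidate)
            with open(os.path.join(target_dir, candidate), "wb") as out:
                out.write(fetch(absolute))
            saved[absolute] = candidate
        # Re-label the embedded URL so that it points to the local copy.
        return f"{prefix}{quote}{saved[absolute]}{quote}"

    rewritten = IMG_SRC.sub(localize, html)
    root_path = os.path.join(target_dir, "index.html")
    with open(root_path, "w", encoding="utf-8") as out:
        out.write(rewritten)
    return root_path
```

Because the root page and the images share one directory, every rewritten reference is a bare file name that resolves correctly against the page's new "file:" URL. Passing the fetcher as a parameter is a design choice of this sketch that allows the function to be exercised without network access.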