Using hash values in filenames to provide dynamic local caching of remote resources

IP.com Disclosure Number: IPCOM000239035D
Publication Date: 2014-Oct-02
Document File: 3 page(s) / 195K

Publishing Venue

The IP.com Prior Art Database

Abstract

When using remote resources it is often desirable to cache the retrieved content locally, so as to avoid the cost of repeatedly downloading resources that have not changed. There are a number of existing solutions for detecting when a remote resource has changed; some rely on the protocol used to communicate with the remote server, while others are protocol independent. The core idea of the solution herein is to create a hash (e.g. SHA1, MD5) of the URL of the remote resource, create a hash of the ETag received from the server, convert both to hex strings, and use the results as the name and extension, respectively, of the file in which a copy of the remote resource is, or will be, stored on the local disk. If a server does not return an ETag for a resource, either because ETags are not supported or because support has been removed, the created file has no extension. The file system is checked for an existing file that matches; if one is found, the local version is used. Otherwise the updated resource is downloaded and all existing files whose name corresponds to the URL hash are deleted, which automatically manages the cache by removing old entries.
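The naming and lookup scheme described in the abstract might be sketched in Python as follows. This is only an illustrative sketch, not the disclosure's own implementation: the function names (hex_hash, cache_filename, get_resource), the choice of SHA-1, and the idea that the caller has already obtained the current ETag (for example from a HEAD request) and supplies a download callable are all assumptions introduced here.

```python
import hashlib
from pathlib import Path
from typing import Callable, Optional


def hex_hash(text: str) -> str:
    """Return the SHA-1 digest of text as a hex string (hash choice is an assumption)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def cache_filename(url: str, etag: Optional[str]) -> str:
    """File name is the URL hash; the extension is the ETag hash, omitted if there is no ETag."""
    name = hex_hash(url)
    return f"{name}.{hex_hash(etag)}" if etag else name


def get_resource(
    url: str,
    etag: Optional[str],
    cache_dir: Path,
    download: Callable[[str], bytes],  # hypothetical fetcher supplied by the caller
) -> bytes:
    """Return the cached copy if a matching file exists, otherwise download and re-cache."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / cache_filename(url, etag)

    # Cache hit: a file with the same URL hash and ETag hash already exists.
    if target.is_file():
        return target.read_bytes()

    # Cache miss: fetch the resource, then delete every stale entry for this URL.
    data = download(url)
    url_hash = hex_hash(url)
    for old in cache_dir.iterdir():
        if old.name == url_hash or old.name.startswith(url_hash + "."):
            old.unlink()
    target.write_bytes(data)
    return data
```

Because the ETag hash is part of the file name, a changed ETag simply fails the existence check, and no separate URL-to-ETag mapping has to be stored. In this sketch a resource without an ETag is treated as a hit whenever its extension-less file exists; how the full disclosure refreshes such entries is not stated in the abbreviated text.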

This is the abbreviated version, containing approximately 52% of the total text.

When using remote resources it is often desirable to cache the retrieved content locally, so as to avoid the cost of repeatedly downloading resources that have not changed. There are a number of existing solutions for detecting when a remote resource has changed; some rely on the protocol used to communicate with the remote server, while others are protocol independent. Some examples are as follows:
1. HTTP 304 status code (http://en.wikipedia.org/wiki/List_of_HTTP_status_codes): this status tells the client that a resource has not changed, based on the headers sent by the client. However, this is not a solution on its own; it requires the client to provide data in the headers that the server can then match, such as the time/date of the resource being requested, or an ETag (http://en.wikipedia.org/wiki/HTTP_ETag) if the server has generated one. The problems with this approach are that:
- for dynamically generated content the last-modified time and date can be constantly changing, which means that the content will be continually served by the web site even though it hasn't changed.

- ETags solve the problem with using dates to determine whether a local copy is stale, but these tags are opaque and only mean something to the web server. The client therefore has to look up the tag for each resource in order to supply it in the request to the server, and this mapping needs to be maintained as a separate entity.

2. Hash / checksum: a resource can be downloaded and a checksum or hash produced, which can then be compared against the local copy. The problem with this approach is that a user still needs to download the file ev...