Timestamp Method for Ensuring that Content Matches Notification in a Pull-Based Web Content Distribution System

IP.com Disclosure Number: IPCOM000015090D
Original Publication Date: 2002-Apr-26
Included in the Prior Art Database: 2003-Jun-20
Document File: 3 page(s) / 144K

Publishing Venue

IBM

Abstract

Presented is a mechanism for validating the freshness of web content in a push-pull web Content Distribution (CD) system. In a push-pull CD system, CD servers push notifications to web servers and caches in the Content Distribution Network (CDN). (For the rest of the document, web servers and caches are generically referred to as CD nodes.) Each notification specifies a list of Uniform Resource Locators (URLs) that must be updated at each CD node that receives the notification. Upon receiving a notification, each CD node pulls web content from one or more CD nodes in an earlier wave [*]. CD nodes are partitioned into many waves. On receiving a notification, a CD node in wave n, where n >= 1, pulls content from one of the nodes in waves 1 through n-1 (or from the origin server if n = 1). A key problem to be solved in this design is to ensure that a CD node pulls the most recent version of the content from another CD node in an earlier wave. This problem is illustrated in the following example.

This is the abbreviated version, containing approximately 52% of the total text.


  Presented is a mechanism for validating the freshness of web content in a push-pull web Content Distribution (CD) system. In a push-pull CD system, CD servers push notifications to web servers and caches in the Content Distribution Network (CDN). (For the rest of the document, web servers and caches are generically referred to as CD nodes.) Each notification specifies a list of Uniform Resource Locators (URLs) that must be updated at each CD node that receives the notification. Upon receiving a notification, each CD node pulls web content from one or more CD nodes in an earlier wave [*]. CD nodes are partitioned into many waves. On receiving a notification, a CD node in wave n, where n >= 1, pulls content from one of the nodes in waves 1 through n-1 (or from the origin server if n = 1). A key problem to be solved in this design is to ensure that a CD node pulls the most recent version of the content from another CD node in an earlier wave. This problem is illustrated in the following example.
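The wave rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `ORIGIN`, `candidate_sources`, and `nodes_by_wave`, and the node labels, are hypothetical and do not come from the disclosure, which does not say how a node chooses among the eligible earlier-wave nodes.

```python
# Hypothetical sketch of the wave rule; all names are illustrative assumptions.
ORIGIN = "origin"


def candidate_sources(wave, nodes_by_wave):
    """Return the sources a CD node in `wave` may pull from.

    Wave 1 pulls from the origin server; wave n > 1 may pull from any
    node in waves 1 through n-1.
    """
    if wave < 1:
        raise ValueError("waves are numbered from 1")
    if wave == 1:
        return [ORIGIN]
    return [node for w in range(1, wave) for node in nodes_by_wave.get(w, [])]


# Example partition (assumed): one node in wave 1, two in wave 2, one in wave 3.
nodes_by_wave = {1: ["A"], 2: ["S", "T"], 3: ["C"]}

print(candidate_sources(1, nodes_by_wave))
print(candidate_sources(3, nodes_by_wave))
```

With this partition, a wave-3 cache may pull from any of the wave-1 or wave-2 nodes, which is why it depends on those nodes already holding the current version.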

Consider a web cache C in the third wave attempting to pull URL u from its preferred web server S in the second wave, as illustrated in Figure 1. During correct operation, S is expected to have the current version of URL u before C does. However, because of the distributed nature of the network, S may not have the correct version of the content that C is attempting to pull. When the notification was distributed to the servers in the second wave, S may have been momentarily disconnected from the CDN without yet having realized it (i.e., the appropriate timeouts may not have expired). Until S realizes that it has been disconnected from the CDN, S cannot prevent C from requesting and pulling an out-of-date version of URL u. (The order of occurrence of the various events leading to this scenario is illustrated in Figure 1.) The key problem is that C can never determine that the content is out-of-date in this scenario. C is out-of-sync with the rest of the CDN, but it will continue to serve stale data to its clients. Downstream caches and browsers may then see invalid content for a long time, much longer than the failure...
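The failure scenario can be reproduced with a small simulation. The sketch below is an assumption-laden illustration, not the disclosed mechanism: the `Node` class, its fields, and the integer timestamps are all hypothetical. In particular, the `matches_notification` check assumes that each notification carries a timestamp that can be compared with the timestamp of the pulled content; the title of the disclosure suggests a timestamp comparison, but the details are not part of this excerpt.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical CD node; `store` maps url -> (timestamp, body)."""
    name: str
    connected: bool = True
    store: dict = field(default_factory=dict)

    def notify(self, url, source):
        """Handle a pushed notification by pulling `url` from `source`."""
        if not self.connected:
            return False  # notification is lost; the node does not know yet
        self.store[url] = source.store[url]
        return True


def matches_notification(node, url, notified_timestamp):
    """Assumed check: is the stored copy at least as new as the notification?"""
    stored_timestamp, _ = node.store[url]
    return stored_timestamp >= notified_timestamp


# All nodes start with version 1 of URL u.
origin = Node("origin", store={"u": (1, "old")})
s = Node("S", store={"u": (1, "old")})  # second wave
c = Node("C", store={"u": (1, "old")})  # third wave

# The origin publishes version 2 while S is momentarily disconnected.
origin.store["u"] = (2, "new")
s.connected = False
s_updated = s.notify("u", origin)  # S misses the notification
s.connected = True

# C is notified and pulls from S, which still holds version 1.
c_updated = c.notify("u", s)

print(s_updated, c_updated, c.store["u"])
print(matches_notification(c, "u", notified_timestamp=2))
```

C's pull succeeds, so without some additional information C has no reason to suspect that its copy is stale. Under the assumption that the notification carries a timestamp, the final comparison is the kind of check that would expose the mismatch.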