Smart Consistency Management for Query Response Caching

IP.com Disclosure Number: IPCOM000015105D
Original Publication Date: 2001-Nov-03
Included in the Prior Art Database: 2003-Jun-20
Document File: 3 page(s) / 60K

Publishing Venue

IBM

Abstract

We disclose the architecture of a system that can efficiently detect potential inconsistency between a cached copy of data and the origin data. The system as described can be used in reverse proxy caches in a server farm, as well as in proxy caches that are deployed in a content distribution network. This architecture can help reduce the amount of processing that web-servers must perform to narrow the window of inconsistency between the cached copy of data and the original data, and it does so without processing at the origin server.

Figure 1 shows the environment of a server farm where this system can be used. The server farm consists of one origin server which is front-ended by several proxy servers. When hosting a web-site, both the origin server and the proxy servers support the HTTP protocol. The origin server may in turn contain a scheme for running programs, such as cgi-bin scripts or servlets, which access a database. Many of the pages provided by the web-server are generated by these cgi-bin scripts or servlets, which may access the database. These pages are referred to as dynamic web-pages.

The proxy servers are deployed to improve the scalability of the server farm when the origin server is not fast enough to process all of the client requests. In the environment shown in Figure 1, the proxy servers are located in front of the origin server, and the site contains a load-balancer which routes each incoming request to one of the proxy servers. To improve the performance of the site, the proxy servers cache the dynamic web-pages that are generated by the origin server. When a client request arrives at a proxy server, the proxy checks whether it has a cached response for the request. If it does not, the proxy contacts the origin server to get the response page and caches it. When an identical request is received again by the proxy server, the proxy provides the response from the locally cached copy. Two requests are defined as identical if the programs they invoke at the servers are the same and the parameters passed to those programs are also the same.

The environment in a content distribution network is similar, except that the network connecting the proxy servers and the origin site is a wide area network instead of a local area network.
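
The lookup behavior described above can be illustrated with a minimal sketch. It is not taken from the disclosure: the names `request_key`, `ProxyCache`, and `fetch_from_origin` are hypothetical, and treating parameter order as irrelevant when comparing requests is an assumption made here for illustration.

```python
from typing import Callable, Dict, Mapping, Tuple

# A request is identified by the program it invokes plus the parameters
# passed to that program. (Type and function names are illustrative.)
RequestKey = Tuple[str, Tuple[Tuple[str, str], ...]]


def request_key(program: str, params: Mapping[str, str]) -> RequestKey:
    """Build a key that compares equal for identical requests.

    Parameters are sorted so that ordering differences do not matter;
    this is an assumption of the sketch, not a statement from the source.
    """
    return (program, tuple(sorted(params.items())))


class ProxyCache:
    """Toy proxy: answer from the local cache, otherwise ask the origin."""

    def __init__(
        self, fetch_from_origin: Callable[[str, Mapping[str, str]], str]
    ) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._responses: Dict[RequestKey, str] = {}

    def handle(self, program: str, params: Mapping[str, str]) -> str:
        key = request_key(program, params)
        if key not in self._responses:
            # Cache miss: contact the origin server and cache its response.
            self._responses[key] = self._fetch_from_origin(program, params)
        # Cache hit (or freshly filled entry): serve the local copy.
        return self._responses[key]
```

With a stand-in origin function that counts its invocations, two calls to `handle` with the same program and the same parameters would reach the origin only once; a call with different parameters would reach it again.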

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 49% of the total text.

In the environment shown in Figure 1, requests arriving at the server farm can also modify the data that is stored at the origin site, or in the database at the origin site. In these cases, cached responses that were generated from the data before it was modified are no longer valid, and they need to be marked as invalid at every site where they are cached. In existing implementations, the origin site is the only entity that can determine which of the cached response pages have become invalid as a result of a new request. This is because of...
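
As a rough illustration of the origin-driven invalidation used in existing implementations, the sketch below has the origin tell every proxy which cached responses to discard after an update. It is not the mechanism of this disclosure. The classes `Proxy` and `Origin`, the `record_read` bookkeeping, and the idea of tagging each response with the data items it was generated from are all assumptions introduced for the example.

```python
from typing import Dict, Hashable, Iterable, List, Set


class Proxy:
    """Toy proxy holding cached responses keyed by request."""

    def __init__(self) -> None:
        self.cache: Dict[Hashable, str] = {}

    def invalidate(self, keys: Iterable[Hashable]) -> None:
        # Drop each named entry; keys that are not cached are ignored.
        for key in keys:
            self.cache.pop(key, None)


class Origin:
    """Toy origin that alone knows which responses depend on which data."""

    def __init__(self, proxies: List[Proxy]) -> None:
        self.proxies = proxies
        # Hypothetical bookkeeping: data item -> request keys whose
        # responses were generated from that item.
        self.readers: Dict[str, Set[Hashable]] = {}

    def record_read(self, key: Hashable, data_items: Iterable[str]) -> None:
        for item in data_items:
            self.readers.setdefault(item, set()).add(key)

    def apply_update(self, modified_items: Iterable[str]) -> Set[Hashable]:
        # Collect every request key whose response used a modified item.
        stale: Set[Hashable] = set()
        for item in modified_items:
            stale |= self.readers.get(item, set())
        # Notify all proxies, since any of them may hold a stale copy.
        for proxy in self.proxies:
            proxy.invalidate(stale)
        return stale
```

In this sketch all dependency tracking and all invalidation work sits at the origin, which is the processing cost that the text attributes to existing implementations.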