System and Method for Efficient Clustering of Heterogenous Stream Servers

IP.com Disclosure Number: IPCOM000175245D
Original Publication Date: 2008-Oct-06
Included in the Prior Art Database: 2008-Oct-06
Document File: 2 page(s) / 30K

Publishing Venue

IBM

Abstract

System and Method for Efficient Clustering of Heterogenous Stream Servers

This is the abbreviated version, containing approximately 40% of the total text.


Overall Idea:

Disclosed is a clustering mechanism for heterogeneous groups of streaming media servers that addresses the problem of scaling such servers incrementally. Web servers and caching web proxies generally achieve incremental scale by clustering several systems so that they act as a single, more powerful system. However, clustering techniques developed for servers and proxies that target ordinary web objects, such as web pages and images, do not work well for streaming media objects. The basic idea of clustering is to send all traffic to a single device, which then distributes each request to one of the web caches in the cluster. Web caches generally use one of the following mechanisms for this clustering.

Clustering technology exists for web servers, but these solutions are inadequate for streaming media:

The first method allows a request for a given object to be sent to any cache within the cluster. The advantage of this method is that processing is evenly distributed over all caches in the cluster. However, multiple caches in the cluster may end up holding a copy of the same object. This is not a serious issue for ordinary (small) web objects, but for streaming media objects, which tend to be large (megabytes or gigabytes), it is an inefficient use of disk space. The method also ignores the fact that many streaming formats are proprietary, so a given media format may be supported by some caches in the cluster but not by others.
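The duplication cost of the first method can be illustrated with a minimal sketch. It assumes a simple round-robin dispatcher, which is only one possible way to send a request to "any cache"; the `Cache` class, the URL, and the 2 GB object size are hypothetical and do not come from the disclosure.

```python
from itertools import cycle

GB = 10**9


class Cache:
    """Hypothetical cache node that stores objects keyed by URL."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # url -> object size in bytes

    def serve(self, url, size):
        # On a miss, the cache fetches the object and keeps its own copy.
        if url not in self.store:
            self.store[url] = size
        return self.name


def dispatch_round_robin(caches, requests):
    """Send each request to the next cache in turn (one form of method 1)."""
    turn = cycle(caches)
    for url, size in requests:
        next(turn).serve(url, size)


def disk_used(caches):
    """Total bytes stored across every cache in the cluster."""
    return sum(sum(c.store.values()) for c in caches)


caches = [Cache(f"cache{i}") for i in range(3)]
url = "rtsp://example.com/movie"  # hypothetical streaming object
dispatch_round_robin(caches, [(url, 2 * GB)] * 6)

copies = sum(url in c.store for c in caches)
print(copies, disk_used(caches) // GB)  # prints "3 6"
```

Six requests for one 2 GB object leave a copy on each of the three caches, so the cluster spends 6 GB of disk on a single title. The sketch does not model format support; a real dispatcher would also need to know which caches can serve a given proprietary format.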

The second method ensures that all requests for a given object are directed to a single cache in the cluster. This is generally accomplished by hashing the name (URL) of the requested object to determine which cache in the cluster should receive the request. While this avoids multiple copies of a given object, it has its own disadvantages for streaming media. Specifically, because streaming media objects consume substantial bandwidth and may play for a long time, a very popular object may require more aggregate bandwidth (to serve all concurrent users) than a single cache can support. An additional disadvantage is that when a new node is added to the cache cluster, the hash must be modified and objects must be migrated from the other servers in the cluster.

The advantage of both schemes is that they are simple and efficient. For ordinary web objects, erroneous caching decisions do not incur a significant penalty because the objects are small. For streaming media, however, penalties such as wasted disk space and fetch cost are magnified many times over.
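The hash-based method and its node-addition cost can be sketched as follows. This is an illustration under stated assumptions, not the scheme any particular product uses: it assumes the cache is chosen as a SHA-256 hash of the URL modulo the number of caches, and the function names and URLs are hypothetical.

```python
import hashlib


def cache_for(url, num_caches):
    """Map a URL to a cache index with a stable hash (assumed: hash mod N)."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_caches


def fraction_remapped(urls, old_n, new_n):
    """Fraction of URLs whose assigned cache changes when the cluster grows."""
    moved = sum(cache_for(u, old_n) != cache_for(u, new_n) for u in urls)
    return moved / len(urls)


# Hypothetical catalogue of streaming objects.
urls = [f"rtsp://example.com/stream/{i}" for i in range(10_000)]

# Every request for the same URL lands on the same cache.
assert cache_for(urls[0], 4) == cache_for(urls[0], 4)

# Growing the cluster from 4 to 5 caches reassigns most objects.
print(f"{fraction_remapped(urls, 4, 5):.0%} of objects change cache")
```

Under the modulo assumption, growing from 4 to 5 caches should reassign roughly 80% of objects, since an object keeps its cache only when its hash gives the same remainder under both moduli. For multi-gigabyte streaming objects, each reassignment implies either a large migration or a costly re-fetch, which is the penalty the text describes.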

The disclosed idea enables multiple copies of a given object within the cluster (as in method 1), but has the advantage that multiple c...