Hybrid cache eviction policy for near caches in spatially distributed cache platforms.
Publication Date: 2014-Jan-28
The IP.com Prior Art Database
A distributed cache is an extension of the traditional concept of a cache: it spans multiple servers so that it can grow in size and transactional capacity, and it is mainly used to store application data or web session data. The concept of a Near Cache is to store data frequently accessed by a client on the client node itself. If the data is not present in the near cache, it must be fetched from the distributed cache, which involves:
1) Identifying the node where the data is stored, based on the hash value of the key.
2) Making an RPC (network) call to fetch the data from that node.
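The two-step read path above can be sketched as follows. This is a minimal illustration, not the implementation of any particular product; the class name NearCache, the fetch_remote callable, and the use of a simple hash-modulo node lookup are all assumptions for the example.

```python
import hashlib

class NearCache:
    """Hypothetical sketch of a client-side near cache in front of a
    distributed cache partitioned across several remote nodes."""

    def __init__(self, nodes, fetch_remote):
        self.nodes = nodes                # list of remote node identifiers
        self.fetch_remote = fetch_remote  # callable(node, key) -> value; stands in for the RPC
        self.local = {}                   # the near cache held on the client node

    def node_for(self, key):
        # Step 1: identify the owning node from the hash value of the key.
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self.nodes[h % len(self.nodes)]

    def get(self, key):
        if key in self.local:             # near-cache hit: no network call needed
            return self.local[key]
        node = self.node_for(key)         # step 1: locate the node
        value = self.fetch_remote(node, key)  # step 2: RPC to that node
        self.local[key] = value           # populate the near cache for later reads
        return value
```

A second read of the same key is then served entirely from the client node, avoiding the RPC.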
Prior Art: Existing cache eviction mechanisms are based on TTL (Time To Live). The disclosed technique proposes a new cache eviction policy for Near Caches in distributed caching products like Extreme Scale that is based on the fetch time of records from their respective nodes. This is distinct from the TTL-based cache eviction policy.
The core concept is "In a distributed caching system where the data is spread across multiple nodes that are spatially distributed, a hybrid eviction policy is proposed that dynamically computes the cost of fetching the data from its source and uses it as a factor while evicting the data elements from the near cache" to optimize the overall application performance.
In the disclosed technique we propose multiple cache eviction policies that take into account the fetch time for deciding which entry to evict in addition to the standard factors like time since last access and frequency of access.
In Figure 1 we show different records that are fetched into the near cache from different nodes. Each record R(i) is fetched from node N(i). If the time taken to fetch a record from node N(i) is T(i), then T(i) is the factor we inspect when deciding whether to discard an entry from the near cache. T(i) varies with a number of criteria, such as the distance of node N(i) from the near cache and the load on the network connection between N(i) and the node hosting the near cache. It is therefore dynamic, i.e. the fetch time can vary over time, and to account for that variation we propose evaluating T(i) over a time window.
T(i) = ( Σ_{k=1..n} t_k ) / n

where t_k is the k-th fetch time observed for node N(i) within the time window and n is the number of such fetches.
Thus T(i) is the average of the fetch times for a node in a given window of time. We use the fetch times for a node, rather than per entry, because all entries fetched from the same node in a distributed cache have similar fetch times; therefore we maintain only one set of fetch times for each node in the distributed cache from which records have been fetched. The values are maintained for a configurable time window. Each record in the near cache carries metadata that identifies the remote node from which it was fetched.
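The per-node bookkeeping described above might look like the following sketch. The class name FetchTimeTracker, its method names, and the injectable clock are assumptions made for illustration; the windowed average corresponds to the formula for T(i) given earlier, and a median variant is included as well.

```python
import time
from collections import defaultdict, deque

class FetchTimeTracker:
    """Hypothetical per-node fetch-time store: keeps only the samples that
    fall inside a configurable time window, one deque per remote node."""

    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock                 # injectable for testing
        self.samples = defaultdict(deque)  # node -> deque of (timestamp, fetch_time)

    def record(self, node, fetch_time):
        # Called after every remote fetch; entries only need to carry the
        # node identifier as metadata, not their own fetch time.
        self.samples[node].append((self.clock(), fetch_time))

    def _prune(self, node):
        # Drop samples that have aged out of the window.
        cutoff = self.clock() - self.window
        q = self.samples[node]
        while q and q[0][0] < cutoff:
            q.popleft()

    def average(self, node):
        # T(i): mean fetch time for node i over the current window.
        self._prune(node)
        q = self.samples[node]
        return sum(t for _, t in q) / len(q) if q else None

    def median(self, node):
        # Alternative T(i): median fetch time over the current window.
        self._prune(node)
        times = sorted(t for _, t in self.samples[node])
        if not times:
            return None
        n, mid = len(times), len(times) // 2
        return times[mid] if n % 2 else (times[mid - 1] + times[mid]) / 2
```

Because samples are kept per node rather than per cache entry, the memory cost grows with the number of remote nodes, not with the size of the near cache.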
Thus T(i) can be the median fetch time for remote accesses to N(i) in the given time window, or the average of the fetch times for remote accesses to N(i) over the same window. The basic idea is to evict the entries with the minimum fetch time from the cache, since they are the cheapest to fetch again. This is combined with the LFU and LRU algorithms to create a hybrid cache eviction policy. We will also evict entries after a certain TTL to ma...
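One way such a hybrid policy could combine the three factors is a weighted score, where the entry with the lowest score is evicted. The scoring formula and the weights below are assumptions chosen for illustration, not a formula from the disclosure: cheap-to-refetch (low fetch time, the new factor), stale (LRU) and rarely used (LFU) entries score lowest.

```python
def eviction_score(fetch_time, seconds_since_access, access_count,
                   w_fetch=1.0, w_recency=1.0, w_freq=1.0):
    # Illustrative hybrid score; the weights are hypothetical tuning knobs.
    # Low fetch cost, long idle time and low access frequency all push the
    # score down, making the entry a better eviction candidate.
    return (w_fetch * fetch_time
            - w_recency * seconds_since_access
            + w_freq * access_count)

def choose_victim(entries):
    # entries: key -> (fetch_time, seconds_since_access, access_count).
    # Returns the key with the minimum hybrid score.
    return min(entries, key=lambda k: eviction_score(*entries[k]))
```

Tuning the weights lets the policy degrade gracefully toward pure LRU/LFU when fetch times across nodes are uniform.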