Hybrid cache eviction policy for near caches in spatially distributed cache platforms.
Publication Date: 2014-Jan-28
The IP.com Prior Art Database
A distributed cache is an extension of the traditional concept of a cache: a cache that spans multiple servers so that it can grow in size and transactional capacity. It is mainly used to store application data or web session data. A Near Cache stores data frequently accessed by a client on the client node itself. If the data is not present in the near cache, it must be fetched from the distributed cache, which involves 1) identifying the node where the data is stored based on the hash value of the key, and 2) making an RPC (network) call to fetch the data from that node. This is much slower than local memory access because of the RPC calls the client has to make.
The near cache contains the data most recently or most frequently used by the client, and that data may originally have been fetched from various nodes in the distributed caching topology. Near caches have eviction policies based on the time of creation of an entry, the time of last use of an entry, the lifetime of the entry in the cache, etc. Typical eviction policies used for near caches are Least Recently Used (LRU), Least Frequently Used (LFU), and hybrids of both based on weighted averages. However, these policies do not take into consideration a very important factor: the latency to repopulate an entry in case of a cache miss. This factor is not used today when deciding which entry to evict from the cache. Consider a scenario with entries that take a longer time to fetch from the distributed cache due to lower network speed, network latency, network congestion, or data proximity. In such cases the eviction algorithm evicts these entries even when other entries of slightly older age, or of slightly less frequent usage, would take much less time to fetch from their remote nodes.
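The two-step miss path described above (hash the key to find the owning node, then make a network call) can be sketched as follows. The class and method names, the node list, and the remote_fetch callable are illustrative assumptions, not the API of any particular caching product; the callable stands in for the RPC.

```python
import hashlib

class NearCacheClient:
    """Minimal sketch of the near-cache read path: a local in-process map,
    with a hash-partitioned remote fetch on a miss. Names are illustrative."""

    def __init__(self, nodes, remote_fetch):
        self.nodes = nodes                # remote node identifiers
        self.remote_fetch = remote_fetch  # callable(node, key) -> value; stands in for an RPC
        self.near_cache = {}              # local cache: key -> value

    def node_for(self, key):
        # Step 1: identify the owning node from the hash value of the key.
        digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self.nodes[digest % len(self.nodes)]

    def get(self, key):
        # Fast path: local memory access.
        if key in self.near_cache:
            return self.near_cache[key]
        # Slow path, step 2: a network call to the node that owns the key.
        node = self.node_for(key)
        value = self.remote_fetch(node, key)
        self.near_cache[key] = value
        return value
```

A second `get` for the same key is then served entirely from local memory, which is the performance gap the near cache exists to exploit.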
While traditional cache eviction methods like LRU and LFU work for traditional caching systems with a single source of data, the same techniques do not suffice for spatially distributed sources of data. The worst-case scenario is an application requesting a cache entity whose cost of fetching is far above that of the other elements in the cache but which is accessed relatively less frequently or less recently. Hence we need an eviction mechanism that caters to the distributed cache context.
Prior Art: There are cache eviction mechanisms based on TTL (Time To Live). The disclosed technique proposes a new cache eviction policy for Near Caches in distributed caching products like Extreme Scale that is based on the fetch time for records from respective nodes. This is different from the TTL based cache eviction policy.
The core concept is "In a distributed caching system where the data is spread across multiple nodes that are spatially distributed, a hybrid eviction policy is proposed that dynamically computes the cost of fetching the data from its source and uses it as a factor while evicting the data elements from the near cache" to optimize the overall application performance.
In the disclosed technique we propose multiple cache eviction policies that take into account the fetch time for deciding which entry to evict in addition to the standard factors like time since last access and frequency of access.
In Figure 1 we show different records fetched into the near cache from different nodes. Each record R(i) is fetched from node N(i). If the time taken to fetch a record from node N(i) is T(i), then T(i) is the factor we inspect when deciding whether to discard an entry from the near cache. T(i) varies with a number of criteria, such as the distance of node N(i) from the near cache and the load on the network connection between N(i) and the node hosting the near cache. T(i) is therefore dynamic: the fetch time can vary over time, and to account for that variation we propose evaluating T(i) over a time window.
T(i) = (t_1 + t_2 + ... + t_n) / n, where t_1 ... t_n are the fetch times observed for node N(i) within the time window.
Thus T(i) is the average of the fetch times for a node in a given window of time. We speak of fetch times for a node because all entries fetched from a given node in a distributed cache have similar fetch times; therefore we maintain only one set of fetch times per node from which records have been fetched. The values are maintained for a configurable time window. Each record in the near cache carries metadata identifying the remote node from which it was fetched.
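The per-node bookkeeping described above could be sketched as follows: one list of samples per remote node, expired after a configurable window, with T(i) computed as the mean of the surviving samples. The class name and the injectable clock are illustrative assumptions made so the sketch is self-contained and testable.

```python
import time
from collections import defaultdict, deque

class FetchTimeTracker:
    """Per-node fetch-time samples kept for a configurable time window.
    A sketch, not any product's API; the clock is injectable for testing."""

    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self.samples = defaultdict(deque)  # node -> deque of (timestamp, fetch_time)

    def record(self, node, fetch_time):
        # Called after each remote fetch, with the measured RPC latency.
        self.samples[node].append((self.clock(), fetch_time))
        self._expire(node)

    def _expire(self, node):
        # Drop samples older than the window.
        cutoff = self.clock() - self.window
        q = self.samples[node]
        while q and q[0][0] < cutoff:
            q.popleft()

    def average(self, node):
        """T(i): mean fetch time for the node inside the current window."""
        self._expire(node)
        q = self.samples[node]
        return sum(t for _, t in q) / len(q) if q else 0.0
```

Replacing `average` with a median over the same deque gives the median variant mentioned below; the window mechanics are identical.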
Thus T(i) can be the median fetch time for remote accesses to N(i) in the given time window, or the average of the fetch times for remote accesses to N(i) over the same window. The basic idea is to evict the entries that have the minimum fetch time from the cache. This is combined with the LFU and LRU algorithms to create a hybrid cache eviction policy. We will also evict entries after a certain TTL to ma...
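One way to combine the fetch-time factor with LRU and LFU is a weighted score per entry, evicting the entry with the lowest score. The weights, the recency normalization, and the entry layout below are illustrative assumptions; the disclosure states only that fetch time is combined with the LRU and LFU factors, not how.

```python
import time

def eviction_score(entry, avg_fetch_time, now=None,
                   w_recency=1.0, w_frequency=1.0, w_cost=1.0):
    """Hybrid score: lower means 'evict first'. Weighting and normalization
    are illustrative assumptions, not the disclosed formula."""
    now = time.monotonic() if now is None else now
    recency = 1.0 / (1.0 + (now - entry["last_access"]))  # LRU component
    frequency = entry["hits"]                             # LFU component
    cost = avg_fetch_time                                 # repopulation cost T(i)
    return w_recency * recency + w_frequency * frequency + w_cost * cost

def pick_victim(entries, node_avg_fetch_time, now=None):
    """Return the key with the lowest hybrid score: an entry that is stale,
    rarely hit, AND cheap to refetch from its remote node."""
    return min(entries,
               key=lambda k: eviction_score(entries[k],
                                            node_avg_fetch_time(entries[k]["node"]),
                                            now=now))
```

With equal recency and frequency, the entry served by the faster node is evicted first, which is exactly the preference the disclosure argues for: keep the entries that are expensive to repopulate.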