Hybrid cache eviction policy for near caches in spatially distributed cache platforms.
Publication Date: 2014-Jan-28
The IP.com Prior Art Database
A distributed cache is an extension of the traditional concept of cache. It is a cache that spans multiple servers so that it can grow in size and transactional capacity. It is mainly used to store application data or web session data. The concept of a Near Cache is to store frequently accessed data by a client on the client node itself. In case the data is not present in the near cache it will have to be fetched from the distributed cache and this would involve 1) Identifying the node where the data is stored based on the hash value of the key. 2) Making an RPC (Network) call to fetch the data from that node. This is much slower than local memory access due to multiple RPC calls the client has to make. The Near cache will contain data that is most recently or most frequently used by the client and that the data may originally have been fetched from various nodes in a distributed caching topology. Near Caches have eviction policies based on time of creation of entry, time of use of entry, lifetime of the entry in the cache etc . Typical eviction policies used for Near Caches are Least Recently Used(LRU), Least Frequently Used(LFU) and a Hybrid of both based on weighted averages. However these policies do not take into consideration a very important function and that is the latency to repopulate an entry in case of a cache miss. This is a very important factor that is not used currently when deciding which entry to evict from the cache. Consider a scenario where we have entries that take a longer time to fetch from the distributed cache due to - lesser network speed or network latency or network congestion or data proximity. In such cases the cache eviction algorithm evicts these entries from the cache even if there are other entries of the slightly older age or of slightly less frequent usage that take much less time to fetch from the remote nodes. With traditional cache eviction methods like LRU, LFU works for traditional caching systems where there is single source of data, the same techniques would not suffice for spatially distributed sources of data. The worst case scenario is - where the application requests for a cache entity whose cost of fetching is way above the other elements in the cache but accessed "Relatively" less frequently/less recently. Hence we need an optimal eviction mechanism that caters to the distributed cache context.