Cache memory management

IP.com Disclosure Number: IPCOM000240105D
Publication Date: 2015-Jan-02
Document File: 3 page(s) / 66K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is an idea in the area of cache memory management to optimize power and performance of computer systems.

Cache memory management

In conventional computer memory arrangements, cache memory provides large performance improvements by sourcing required data to the "core" with far lower latency than "main memory". Cache memory is typically organized into several hierarchy levels to optimize data access performance as far as possible. The cache at each level of the hierarchy has its own latency for a read access by the core: cache memory "close" to the core provides the lowest latency, and levels farther away provide progressively higher latency. Depending on the overall performance needs of the system, multiple levels of cache are designed in, and the size at each level is tuned to balance performance targets against other engineering factors (chip real estate, technology, and cost). During system boot, the cache memories at all levels are initialized by their respective cache controllers and engage in the data path between the core and main memory. The core first looks for data in the first level of the hierarchy (L1) and, if it is not available there, searches L2, and so on. With such fixed engagement of cache memory at the system level, cache usage is dedicated to minimizing the read latency seen by the core. At the same time, next-generation system designs demand sophisticated functions and features to accommodate newer workloads. This drives the adoption of newer technologies in the design, as well as continual optimization of the available methodologies and techniques at the sub-system level.
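
As an illustration of the hierarchical lookup described above, the following sketch simulates a core read that searches L1 first, then L2, and so on, falling back to main memory on a miss and filling the levels on the way back so later accesses hit closer to the core. The level count, latencies, line geometry, and names (cache_level, core_read) are illustrative assumptions, not taken from any particular implementation.

    /* cache_lookup_sketch.c -- toy model of a multi-level cache lookup */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define NUM_LEVELS 3          /* L1, L2, L3 */
    #define LINES      64         /* lines per (toy) level, direct-mapped */

    /* Each level keeps only the tags of the lines it holds; latency grows
     * with distance from the core, as described above. */
    typedef struct {
        uint64_t tag[LINES];
        bool     valid[LINES];
        unsigned latency;         /* illustrative cycle counts */
    } cache_level;

    static cache_level hierarchy[NUM_LEVELS] = {
        { .latency = 4 }, { .latency = 12 }, { .latency = 40 }
    };
    #define MEM_LATENCY 200

    /* Search L1 first, then L2, and so on; on a miss everywhere, go to main
     * memory and fill every level on the way back.  Returns the latency
     * "paid" for this access. */
    static unsigned core_read(uint64_t addr)
    {
        uint64_t line = addr >> 6;              /* 64-byte lines */
        unsigned idx  = line % LINES;

        for (int lvl = 0; lvl < NUM_LEVELS; lvl++) {
            cache_level *c = &hierarchy[lvl];
            if (c->valid[idx] && c->tag[idx] == line) {
                /* Hit: promote the line into the faster levels. */
                for (int up = lvl - 1; up >= 0; up--) {
                    hierarchy[up].tag[idx]   = line;
                    hierarchy[up].valid[idx] = true;
                }
                return c->latency;
            }
        }
        /* Miss in every level: fetch from main memory, fill the hierarchy. */
        for (int lvl = 0; lvl < NUM_LEVELS; lvl++) {
            hierarchy[lvl].tag[idx]   = line;
            hierarchy[lvl].valid[idx] = true;
        }
        return MEM_LATENCY;
    }

    int main(void)
    {
        printf("first access : %u cycles\n", core_read(0x1000)); /* miss  */
        printf("second access: %u cycles\n", core_read(0x1000)); /* L1 hit */
        return 0;
    }

Running the sketch shows the first access paying the full main-memory latency and the repeat access being served from the fastest level, which is the latency benefit the paragraph above describes.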

     Memory subsystem technology has evolved to handle huge volumes of real-time data at low latency in order to support very fast response times (data analytics, mobile advertising, etc.), and there is also a growing need to conserve and optimize memory power to the maximum extent possible. Workloads that need sequential data access from main memory benefit significantly from cache memory because of the latency improvement of sourcing data directly from the cache (instead of going to main memory). Data pre-fetch mechanisms are available at the HW/SW level to source data and keep the cache full. However, customer workloads are dynamic in nature; if a workload is very random, driving non-sequential access, then the cached data may not be fully used, in which case the power spent on fetching that data brings no benefit. Profiling of workloads is helpful for deciding "when to cache" or "when not to cache" data and, accordingly, for implementing run-time sourcing policies; however, this requires access to the code prior to actual runtime and is therefore not desirable. At the same time, not all workloads will leverage cache memory to the fullest extent possible; partial cache memory would be good enough in such cases. This presents an opportunity to optimize the remaining unused memory for some oth...
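
As a rough sketch of making the "when to cache / when not to cache" decision at run time, without ahead-of-time profiling of the code, the following toy stride detector issues prefetches only once recent accesses look sequential and backs off when they look random, so no power is spent fetching data that would go unused. The names (on_access, prefetch_line), the confidence threshold, and the fixed line-size stride are illustrative assumptions, not part of the disclosed design.

    /* prefetch_policy_sketch.c -- toy run-time "when to cache" heuristic */
    #include <stdint.h>
    #include <stdio.h>

    #define CONFIDENCE_MAX  8
    #define LINE_SIZE       64

    static uint64_t last_addr;
    static int      confidence;   /* how sequential recent accesses look */

    /* Stand-in for the HW/SW prefetch hook; a real system would issue a
     * cache-line fetch here and spend the associated memory power. */
    static void prefetch_line(uint64_t addr)
    {
        printf("prefetching line at 0x%llx\n", (unsigned long long)addr);
    }

    /* Called on every demand access.  Prefetch only when the recent pattern
     * looks sequential, so random workloads do not waste power on lines
     * that will never be used. */
    static void on_access(uint64_t addr)
    {
        if (addr == last_addr + LINE_SIZE) {
            if (confidence < CONFIDENCE_MAX)
                confidence++;
        } else if (confidence > 0) {
            confidence--;                     /* looks random, back off */
        }
        last_addr = addr;

        if (confidence >= CONFIDENCE_MAX / 2)
            prefetch_line(addr + LINE_SIZE);  /* likely to be used soon */
    }

    int main(void)
    {
        /* Sequential run: confidence builds up and prefetch kicks in. */
        for (uint64_t a = 0; a < 10 * LINE_SIZE; a += LINE_SIZE)
            on_access(a);
        /* Random run: confidence decays and prefetch stops. */
        on_access(0x9000); on_access(0x200); on_access(0x7400);
        return 0;
    }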