Method and Apparatus for Software Managed Coherency of Hardware Buffers

IP.com Disclosure Number: IPCOM000020052D
Original Publication Date: 2003-Oct-21
Included in the Prior Art Database: 2003-Oct-21
Document File: 2 page(s) / 46K

Publishing Venue

IBM

Abstract

An incomplete or incorrectly implemented cache coherency layer in hardware can often force an expensive or time-consuming hardware fix before the hardware is usable. However, once the problem is identified, software can manage the coherency manually by forcing coherent writes through the cache as needed. This solution eliminates the need for a hardware change while keeping much of the original performance benefit of the caching architecture.

This is the abbreviated version, containing approximately 52% of the total text.

The problem is a broken cache coherency layer when accessing DRAM from a PCI device. Buffers used by the interface may not be invalidated correctly during read/write operations, so successive reads can return stale data. The correct sequence of events during an operation of this type would be:

1) DRAM area is allocated for use by software
2) DRAM contents are modified through normal processor code
3) PCI device does DMA-read of DRAM to retrieve data
4) New data is placed in DRAM
5) PCI device does DMA-read of DRAM to retrieve new data
...repeat steps 4-5...

In the problem case, step 5 will not return the correct new data.
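The failure in step 5 can be illustrated with a toy model. This is only a sketch under stated assumptions: the class names `Dram` and `BrokenReadBuffer` and the function `run_sequence` are hypothetical, and the buffer is modeled as simply holding whatever it captured on the first DMA read of an address. Real bridge hardware is considerably more involved.

```python
class Dram:
    """Toy DRAM: a flat list of byte values (illustrative only)."""

    def __init__(self, size):
        self.cells = [0] * size

    def cpu_write(self, addr, value):
        # Steps 2 and 4: the processor modifies DRAM contents.
        self.cells[addr] = value


class BrokenReadBuffer:
    """Hypothetical interface buffer that CPU writes never invalidate."""

    def __init__(self, dram):
        self.dram = dram
        self.held = {}  # addr -> value captured on the first DMA read

    def dma_read(self, addr):
        # Steps 3 and 5: the PCI device reads DRAM through the buffer.
        if addr not in self.held:
            self.held[addr] = self.dram.cells[addr]
        return self.held[addr]


def run_sequence():
    dram = Dram(16)              # step 1: DRAM area is allocated
    buf = BrokenReadBuffer(dram)
    dram.cpu_write(0, 0xAA)      # step 2: contents are modified
    first = buf.dma_read(0)      # step 3: DMA read retrieves the data
    dram.cpu_write(0, 0xBB)      # step 4: new data is placed in DRAM
    second = buf.dma_read(0)     # step 5: DMA read should see new data
    return first, second, dram.cells[0]
```

In this model the second DMA read returns the value captured in step 3 even though DRAM already holds the value written in step 4, which is the stale-data symptom described above.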

One solution is to have the hardware manage the coherency. This is the ideal solution from a performance and ease of programming standpoint. The primary drawback arises when the hardware coherency does not work correctly: an expensive hardware fix would then be required, if a solution can be found at all.

Another case is hardware that does not implement any coherency because of other factors, such as complexity or functions missing from the adjoining buses. In such a case, hardware management of the coherency is not a viable option. An example is the 40X bus used in the 403 processor family.

Another option, always invalidating the read buffer as soon as the current read completes, would greatly reduce the bandwidth of the PCI bus.
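A rough way to see the bandwidth cost is to count how often the buffer must be refilled from DRAM. The sketch below assumes a single-line read buffer and a line size of 8 words. Both are illustrative assumptions rather than details from this disclosure, and the name `count_line_fetches` is hypothetical.

```python
LINE_WORDS = 8  # assumed buffer line size in words; not from the source


def count_line_fetches(addresses, invalidate_after_read):
    """Count DRAM line fetches needed to serve a sequence of DMA word reads."""
    held_line = None  # line currently held by the (single-line) read buffer
    fetches = 0
    for addr in addresses:
        line = addr // LINE_WORDS
        if held_line != line:
            # Buffer miss: the line must be fetched from DRAM.
            fetches += 1
            held_line = line
        if invalidate_after_read:
            # Policy under discussion: drop the buffer after every read.
            held_line = None
    return fetches
```

Under these assumptions, a sequential DMA read of 32 words costs 4 line fetches when the buffer is retained and 32 when it is invalidated after every read, so each word forces a new DRAM access.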

The ideal solution from a performance and ease of programming standpoint is to have the hardware manage the coherency. The primary drawbacks are the complexity of the fix and the risk that the fix introduces other problems into the design.

Another hardware / software solution would allow software to writ...