Improved Cache Performance for Personal Computers

IP.com Disclosure Number: IPCOM000114124D
Original Publication Date: 1994-Nov-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 4 page(s) / 114K

Publishing Venue

IBM

Related People

McKnight, GJ: AUTHOR [+2]

Abstract

Described is an architectural implementation designed to improve the performance of cache operations in Personal Computers (PCs). The technique improves busmaster memory access latency by eliminating unnecessary snoop cycles.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      In the prior art, write-back cache architecture was used in
server cache operations to increase the available Central Processing
Unit (CPU) bandwidth and to reduce the number of memory accesses made
by the processor, because memory accesses by the processor could
interfere with Input/Output (I/O) busmaster access to the memory.
However, the write-back cache implementation had a drawback relative
to a write-through cache design: the processor cache had to be
snooped whenever a busmaster read from, or wrote to, cacheable system
memory.  This additional snooping usually degraded server
performance.  The concept described herein is designed to eliminate
many of the unnecessary snoop cycles performed in the prior art,
thereby improving overall PC performance.

      Typically, in server-type environments, system memory may be
set as non-cacheable by the operating system software through the
use of the Page-level Cache Disable (PCD) bit in the CPU, as used in
the Intel* x86 architecture.  The CPU's PCD bits are set, or cleared,
by software on a 4 KB page basis.  However, the memory controller
does not have access to this PCD information during busmaster
accesses to system memory.  This lack of PCD information forces the
memory controller to snoop the processor cache on each busmaster
memory transfer that addresses a unique cache line, usually 16 or 32
bytes.  Therefore, the memory controller may unnecessarily snoop the
cache during busmaster transfers to non-cacheable system memory.
Each such snoop can delay the I/O busmaster memory transfer until the
snoop is completed.  This unnecessary snooping also reduces the
available CPU and memory bandwidth, further decreasing performance.
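
      To give a sense of scale for the snoop overhead described
above, the following minimal sketch counts how many unique cache
lines a busmaster transfer touches, which is the number of snoops the
memory controller would issue when it has no PCD information.  The
function name and its parameters are illustrative assumptions and do
not come from the disclosure; only the 4 KB page size and the 16 or
32 byte line sizes are taken from the text.

```python
def unique_lines_touched(start_address, length_bytes, line_bytes):
    """Count the distinct cache lines covered by one busmaster transfer.

    Hypothetical helper: with no PCD information, the memory controller
    snoops once for every distinct cache line the transfer addresses.
    """
    if length_bytes <= 0 or line_bytes <= 0 or start_address < 0:
        raise ValueError("address must be >= 0; length and line size must be > 0")
    first_line = start_address // line_bytes
    last_line = (start_address + length_bytes - 1) // line_bytes
    return last_line - first_line + 1


# One aligned 4 KB page transferred to non-cacheable memory:
print(unique_lines_touched(0, 4096, 16))  # 256 snoops with 16-byte lines
print(unique_lines_touched(0, 4096, 32))  # 128 snoops with 32-byte lines
```

      Under these assumptions, every one of those snoops is
unnecessary when the target page is non-cacheable, since the
processor cache cannot hold any line from that page.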

      The state of the CPU's PCD bits can be tracked in a hardware
array by setting, or clearing, a bit for each 4 KB page of memory
during each CPU read cycle of system memory.  These bits are defined
as PCD tracking bits.  Fig. 1 shows a flow chart for setting and
storing PCD tracking bits.  At the start of each CPU
memory cycle, the C...
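
      The PCD tracking array introduced above can be modelled in
software as one bit per 4 KB page.  The sketch below is a behavioural
illustration only, not the disclosed hardware or the flow of Fig. 1,
whose details are not reproduced in this abbreviated text.  The class
name, the method names, and the choice to treat every page as
cacheable until a CPU read says otherwise are all assumptions made
for the example.

```python
PAGE_SHIFT = 12  # 4 KB pages, as stated in the text


class PcdTracker:
    """Hypothetical software model of a PCD tracking-bit array."""

    def __init__(self, memory_bytes):
        self._pages = (memory_bytes + (1 << PAGE_SHIFT) - 1) >> PAGE_SHIFT
        # Assumption: all bits start cleared, meaning "cacheable", so the
        # controller snoops until it has observed PCD set for a page.
        self._bits = bytearray((self._pages + 7) // 8)

    def _locate(self, address):
        page = address >> PAGE_SHIFT
        if not 0 <= page < self._pages:
            raise ValueError("address outside tracked memory")
        return page >> 3, 1 << (page & 7)

    def record_cpu_read(self, address, pcd):
        """Set or clear the page's tracking bit during a CPU read cycle."""
        byte, mask = self._locate(address)
        if pcd:
            self._bits[byte] |= mask
        else:
            self._bits[byte] &= ~mask & 0xFF

    def busmaster_needs_snoop(self, address):
        """Return False when the page is tracked as cache-disabled."""
        byte, mask = self._locate(address)
        return not (self._bits[byte] & mask)


tracker = PcdTracker(64 * 1024)            # 16 pages of tracked memory
tracker.record_cpu_read(0x2010, pcd=True)  # CPU read shows page 2 as PCD
print(tracker.busmaster_needs_snoop(0x2FFF))  # False: snoop can be skipped
print(tracker.busmaster_needs_snoop(0x3000))  # True: page 3 still snooped
```

      Defaulting to "snoop" is the conservative choice in this
model: a page is exempted from snooping only after a CPU read cycle
has reported it as cache-disabled.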