Bidi Protocol to Block Store Operations for Minimum Fetch Latency

IP.com Disclosure Number: IPCOM000014037D
Original Publication Date: 2001-Nov-08
Included in the Prior Art Database: 2003-Jun-19
Document File: 3 page(s) / 49K

Publishing Venue

IBM

The Problem:

As more companies migrate toward e-commerce and worldwide trade, the dependency on high-performance, high-bandwidth computers is growing at a rapid pace. One measure of the quality of these computer systems is their performance. Because of the volume of data being transferred, it is important for these machines to keep up with the growing demand.

One aspect of this growth is the need for a Central Processing Unit (CPU) to obtain data as quickly as possible. This has led to larger L1 and L2 cache sizes as well as larger storage to support these processors. Unfortunately, the packaging technology will usually not allow all of these levels of cache to reside on one chip. This leads to two concurrent problems:

1. The delay involved in retrieving data from the next-level cache increases due to the packaging delay associated with a chip crossing, and

2. The bandwidth is limited by the restricted I/O available on each chip.

One way to effectively increase the I/O bandwidth is to use the data nets as bi-directional (bidi) nets. For example, instead of having an 8-byte store bus and an 8-byte fetch bus (which would take 32 cycles to fetch a 256-byte line), there could be a single 16-byte bus which sometimes stores and sometimes fetches, depending on the bandwidth needs (and which would take only 16 cycles to fetch a 256-byte line).
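As a quick check of the arithmetic above, the following sketch (in C, purely illustrative; none of it appears in the disclosure) computes the fetch cycle counts for the two bus configurations named in the text.

    #include <stdio.h>

    /* Cycles needed to move one cache line across a data bus of a given width. */
    static unsigned fetch_cycles(unsigned line_bytes, unsigned bus_bytes)
    {
        return line_bytes / bus_bytes;
    }

    int main(void)
    {
        const unsigned line = 256;  /* 256-byte cache line from the example above */

        /* Split unidirectional buses: only the 8-byte fetch bus carries the line. */
        printf("8-byte fetch bus : %u cycles\n", fetch_cycles(line, 8));   /* 32 */

        /* Single 16-byte bidirectional bus: full width available for the fetch.  */
        printf("16-byte bidi bus : %u cycles\n", fetch_cycles(line, 16));  /* 16 */
        return 0;
    }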

Another problem, however, is that driver-terminated CMOS nets need one or more dead cycles at frequencies over 200 MHz in order to turn the bidi around with proper impedance termination.

One of the critical timing relationships in a level-2 cache is the fetch alert. This is a directory lookup to determine whether the data resides in the L2 cache. If the data is found in the cache, the bidi needs to be turned around in time to send the fetch data from the L2 to the CPU. The bidi latency adds to this critical time. There is also a fetch alert from the L3-level main storage which would experience similar bidi latencies.

The Solution:

Since the fetch is pipe-based, the invention allows for the reporting of a 'possible hit'. This in turn blocks the STORE from the CP in anticipation of turning the bidi around. This BLOCK acts as an automatic QUIESCE cycle if the data HITs in the L2. The data chip can then send ZERO data on the same cycle the store is blocked, which is the cycle PRIOR to the cache data becoming available. Therefore, there is no bidi penalty cycle.
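A minimal cycle-by-cycle sketch of the hit case is shown below. The pipeline spacing assumed here (possible hit reported two cycles after the fetch request, cache data available one cycle later) and all names are illustrative assumptions, not details taken from the disclosure.

    #include <stdio.h>

    /* Who owns the shared bidirectional data bus on a given cycle. */
    enum bus_dir { STORE_CP_TO_L2, BLOCKED_QUIESCE, FETCH_L2_TO_CP };

    /* Trace a fetch that HITs in the L2.  The spacing (possible hit on cycle 2,
     * cache data ready on cycle 3) is assumed for illustration only. */
    static void trace_hit(void)
    {
        for (int cycle = 0; cycle < 5; cycle++) {
            enum bus_dir dir;

            if (cycle < 2)
                dir = STORE_CP_TO_L2;   /* stores stream from the CP as usual   */
            else if (cycle == 2)
                dir = BLOCKED_QUIESCE;  /* possible hit: store blocked, data
                                           chip drives ZEROs, bus turns around  */
            else
                dir = FETCH_L2_TO_CP;   /* fetch data flows with no dead cycle  */

            printf("cycle %d: %s\n", cycle,
                   dir == STORE_CP_TO_L2  ? "store  CP -> L2" :
                   dir == BLOCKED_QUIESCE ? "store blocked / ZEROs driven (quiesce)" :
                                            "fetch  L2 -> CP");
        }
    }

    int main(void) { trace_hit(); return 0; }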

If the data is NOT in the L2, the data chip would NOT drive the ZEROs and the quiesce would not occur. Instead, one cycle's worth of store data would have been blocked. On the cycle after the blocked store, when the hit results are known, the CP can send another store without penalty of the bidi...
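For the miss case described above, a companion sketch under the same assumed pipeline spacing loses only the single blocked store cycle:

    #include <stdio.h>

    /* Trace a fetch that MISSes in the L2: the store is blocked for one cycle
     * on the possible hit, the data chip never drives ZEROs, and stores resume
     * as soon as the miss is known.  Same assumed spacing as the hit sketch. */
    static void trace_miss(void)
    {
        for (int cycle = 0; cycle < 5; cycle++) {
            const char *what = (cycle == 2)
                ? "store blocked (possible hit turned out to be a miss)"
                : "store  CP -> L2";

            printf("cycle %d: %s\n", cycle, what);
        }
    }

    int main(void) { trace_miss(); return 0; }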