
Fast Distributed History Buffer Restore of Partially Written Data to a GPR in A Multislice Microprocessor

IP.com Disclosure Number: IPCOM000250304D
Publication Date: 2017-Jun-26
Document File: 5 page(s) / 662K

Publishing Venue

The IP.com Prior Art Database

Abstract

In a microprocessor, load instructions may access data that is unaligned and spread across multiple data-cache blocks. Each cache block may broadcast only a portion of the data back to the register file array as a ‘partial write’. Typically, when the next piece of data becomes available, the previously-written incomplete data is read out, merged with the incoming data, and then rewritten into the register file array as a full set of data. ECC is generated along with each write operation to compensate for array read errors. However, this read-modify-write process is inefficient and complex. This paper describes a new mechanism for ECC generation on partial data writes, allowing enhancements to flush-recovery for entries with partially written data.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 42% of the total text.



Background: Instructions in a microprocessor can execute and write their results to a multitude of destinations. These destinations may include register mappers, issue queues, reservation stations, and history buffers.

Figure 1 shows an example processor, where instructions are fetched from the instruction cache (I$), buffered and dispatched to execution slices. The Data Cache (DCache) is partitioned among the execution slices, with a network allowing any execution slice to talk to any LSU DCache block.

Figure 1: Multislice processor

The DCache is partitioned into blocks that are Double Word (DW) aligned, with one block per execution slice. Each DCache block can return Load data on a separate result bus. This allows load instructions to issue from any execution slice and access any DCache block. When a Load access is unaligned and has to access multiple DCache blocks, the Load Data will return from multiple LSU DCache blocks on multiple result buses. Each result bus will have partial Load data. Figure 2 shows an unaligned load, requesting data from two DCache blocks. Each DCache block broadcasts data back to all the execution slices.
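
As an illustration of how an unaligned load ends up touching more than one DW-aligned block, the following is a minimal sketch rather than the disclosed hardware. The function name split_load, the interleaving of consecutive doublewords across blocks, the default of four blocks, and the byte ordering of the mask are all assumptions made for this example; the text does not specify them.

```python
DW_BYTES = 8  # a doubleword is 8 bytes


def split_load(addr, size, num_blocks=4):
    """Split a load of `size` bytes at byte address `addr` into per-block pieces.

    Returns a list of (block_index, dest_byte_mask) tuples, where bit i of
    dest_byte_mask marks byte i of the load result as supplied by that block.
    Assumption: consecutive doublewords are interleaved across the blocks.
    """
    if not 1 <= size <= DW_BYTES:
        raise ValueError("size must be between 1 and 8 bytes")
    pieces = []
    offset = 0  # next byte of the load result that is still uncovered
    while offset < size:
        cur = addr + offset
        dw_index = cur // DW_BYTES
        # Bytes remaining in this doubleword, capped by bytes left in the load.
        n = min(DW_BYTES - cur % DW_BYTES, size - offset)
        mask = ((1 << n) - 1) << offset
        pieces.append((dw_index % num_blocks, mask))
        offset += n
    return pieces


aligned = split_load(0x1000, 8)    # one block supplies all 8 bytes
unaligned = split_load(0x1005, 8)  # two blocks each supply part of the data
```

In this sketch the aligned load yields a single piece with a full byte mask, while the load at 0x1005 yields two pieces whose masks are disjoint and together cover all 8 bytes, corresponding to the two result buses in Figure 2.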


Figure 2: LSU writeback

The LSU DCache blocks can return their partial data on the result buses at different times, so each writeback destination must be able to write each byte independently rather than writing all 8 bytes at once. A writeback byte-mask is broadcast from the LSU slices to the execution slices and tells each destination which bytes are valid to be written. For the purposes of this paper, we are concerned only with array destinations, because arrays require ECC calculations. These arrays are located in the Mappers and the History Buffers.
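
The byte-masked writeback can be modeled in software as follows. This is only a sketch of the behavior described above: the class name Entry, the valid_mask bookkeeping, and the complete() helper are hypothetical and are not taken from the disclosure.

```python
class Entry:
    """One 8-byte array entry that accepts independent per-byte writes."""

    def __init__(self):
        self.data = bytearray(8)
        self.valid_mask = 0  # bit i set means byte i has been written

    def writeback(self, bus_data, byte_mask):
        """Write only the bytes of bus_data that byte_mask marks as valid."""
        for i in range(8):
            if (byte_mask >> i) & 1:
                self.data[i] = bus_data[i]
        self.valid_mask |= byte_mask

    def complete(self):
        """True once all 8 bytes have been written."""
        return self.valid_mask == 0xFF


entry = Entry()
# First result bus delivers bytes 0..2; the other bytes on the bus are ignored.
entry.writeback(bytes([1, 2, 3, 0, 0, 0, 0, 0]), 0x07)
partial = entry.complete()
# Second result bus delivers bytes 3..7 at a later time.
entry.writeback(bytes([0, 0, 0, 4, 5, 6, 7, 8]), 0xF8)
```

The two writebacks can arrive in either order; the entry holds the full data once the accumulated mask covers all 8 bytes.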

ECC generation:

ECC (error-correcting code) allows the processor to detect and correct single-bit errors in data from the array. The ECC mechanism can correct single-bit errors and regenerate the data from a corrupted memory cell. Multi-bit errors are detected and reported, but are uncorrectable.

ECC is generated on the data before it is written into the array. When an entry is read from the array, the stored ECC value is read out with it. New ECC is generated on the read data and compared against the ECC value that was stored in the array. This comparison indicates whether there was no error, a correctable (single-bit) error, or an uncorrectable (multi-bit) error in the array.
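
The generate-then-compare flow can be sketched with an extended Hamming (SECDED) code over a 64-bit value. The disclosure does not say which code the arrays use, so the choice of code, the function names ecc_generate and ecc_check, and the returned status strings are assumptions for illustration only. A code of this kind is guaranteed to detect double-bit errors; behavior for three or more flipped bits is not guaranteed.

```python
def _data_positions(n_data):
    """1-based codeword positions that hold data (every non-power-of-two)."""
    pos, out = 1, []
    while len(out) < n_data:
        if pos & (pos - 1):  # nonzero when pos is not a power of two
            out.append(pos)
        pos += 1
    return out


def ecc_generate(data, n_data=64):
    """Return (check_bits, overall_parity) for an n_data-bit value."""
    check = 0
    for i, pos in enumerate(_data_positions(n_data)):
        if (data >> i) & 1:
            check ^= pos  # bit k of check is the parity bit at position 2**k
    overall = (bin(data).count("1") + bin(check).count("1")) & 1
    return check, overall


def ecc_check(data, stored_check, stored_overall, n_data=64):
    """Regenerate ECC on the read data and compare it with the stored ECC.

    Returns (status, data), with status "none", "correctable", or
    "uncorrectable"; data is corrected when a single data bit was flipped.
    """
    new_check, _ = ecc_generate(data, n_data)
    syndrome = new_check ^ stored_check
    total_parity = (bin(data).count("1") + bin(stored_check).count("1")
                    + stored_overall) & 1
    if syndrome == 0 and total_parity == 0:
        return "none", data
    if total_parity == 1:  # odd number of flipped bits, treated as one
        positions = _data_positions(n_data)
        if syndrome in positions:
            data ^= 1 << positions.index(syndrome)  # repair the data bit
        elif syndrome & (syndrome - 1):
            return "uncorrectable", data  # syndrome points outside the word
        return "correctable", data  # otherwise a check or parity bit flipped
    return "uncorrectable", data  # even number of flips, nonzero syndrome


word = 0x0123456789ABCDEF
check, overall = ecc_generate(word)           # done at array write time
status, fixed = ecc_check(word ^ (1 << 17), check, overall)  # done at read
```

With 64 data bits this sketch uses seven check bits plus one overall parity bit, so each entry would store 72 bits.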

Figure 3: ECC generation during Array write

New Idea:

In a microprocessor, data may be partially written into storage when only a portion of the data is available. When the next piece of data becomes available, the previously written data is read out, merged with the incoming data, and then rewritten into memory as a full set of data. However, this read-modify-write sequence is inefficient and complex.

Since an entry of the Register File or History Buffer array can be written multiple times, each time with a pa...