Performance Enhancements for Systems that Use Translation Control Entries (TCEs) to Translate Addresses

IP.com Disclosure Number: IPCOM000123462D
Original Publication Date: 1998-Dec-01
Included in the Prior Art Database: 2005-Apr-04
Document File: 3 page(s) / 146K

Publishing Venue

IBM

Related People

Buckland, P: AUTHOR [+4]

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 42% of the total text.

   In systems where 32-bit I/O devices must access 64-bit
address spaces, the bridges need a mechanism that translates each
32-bit address to a 64-bit address.  If this is not to be a fixed
translation (which has many restrictions), then a dynamic method
must be used, and such is the case in PowerPC platforms.  In these
systems a Translation Control Entry (TCE) is associated with each
4KB block of address space, and it determines which I/O bus 4KB
page will access which System Memory 4KB page.  A TCE is generally
fetched when a device first accesses a new page.  The TCEs
themselves are stored in System Memory, so each fetch of a TCE
incurs a latency, and this latency reduces I/O performance.
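
   The page-granular translation described above can be sketched
as follows.  This is a minimal illustration, not the actual TCE
format: the table is modeled as a plain mapping from I/O bus page
number to System Memory page number, and the names translate,
tce_table, PAGE_SHIFT and PAGE_MASK are assumptions introduced
here.

```python
PAGE_SHIFT = 12                      # 4KB pages, as stated in the text
PAGE_MASK = (1 << PAGE_SHIFT) - 1    # low 12 bits: offset within a page


def translate(tce_table, io_addr):
    """Map a 32-bit I/O bus address to a 64-bit System Memory address.

    tce_table is a hypothetical stand-in for the TCE table: a dict from
    I/O bus page number to System Memory page number.
    """
    if not 0 <= io_addr < (1 << 32):
        raise ValueError("I/O bus address must fit in 32 bits")
    io_page = io_addr >> PAGE_SHIFT
    sys_page = tce_table[io_page]    # KeyError if no TCE maps this page
    # The page offset passes through unchanged; only the page is remapped.
    return (sys_page << PAGE_SHIFT) | (io_addr & PAGE_MASK)


# Hypothetical mapping: I/O bus page 0x10 -> System Memory page 0x123456.
tce_table = {0x10: 0x123456}
print(hex(translate(tce_table, 0x00010ABC)))   # prints 0x123456abc
```

   The translated address in this example does not fit in 32 bits,
which is the case a fixed 32-bit window cannot cover.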

   The TCE buffers in a bridge are generally managed as a
pool of resources.  When an I/O device gets off the bus and
another one gets on, if the resource that is holding the old TCE
is needed to buffer a new TCE, the old TCE is thrown out.  If the
previous I/O device then gets back on the bus and needs the TCE that
it was using before, that TCE must be fetched again, possibly
replacing another TCE that will be needed at a future time.  Throwing
away TCEs that will be needed again wastes the I/O devices' time,
which is spent waiting on the latency of the TCE fetch, and it also
degrades system performance, because the same TCEs are refetched
from System Memory multiple times.
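
   The refetch behavior described above can be illustrated with a
small simulation.  It is a sketch under stated assumptions: the
pool throws out the oldest-loaded TCE when it is full (the text
does not specify the replacement policy), and the function name
count_tce_fetches and the access trace are hypothetical.

```python
def count_tce_fetches(accesses, pool_size):
    """Count TCE fetches for a sequence of (device, io_page) accesses.

    The bridge's TCE buffers are modeled as one shared pool.  When the
    pool is full, the oldest-loaded TCE is thrown out; that policy is an
    assumption made for this sketch.
    """
    pool = []        # I/O bus pages whose TCEs are buffered, oldest first
    fetches = 0
    for _device, io_page in accesses:
        if io_page in pool:
            continue             # TCE still buffered, no fetch latency
        fetches += 1             # TCE must be fetched from System Memory
        if len(pool) == pool_size:
            pool.pop(0)          # throw out an old TCE to make room
        pool.append(io_page)
    return fetches


# Two devices alternate bus tenures, each working within its own page.
accesses = [("A", 0x10), ("B", 0x20)] * 4
print(count_tce_fetches(accesses, pool_size=1))   # prints 8
print(count_tce_fetches(accesses, pool_size=2))   # prints 2
```

   With a single buffer, every bus tenure in this trace pays the
TCE fetch latency again, even though only two distinct TCEs are
ever used.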

   A similar statement can be made about the buffers in a
bridge that hold the data being transferred to and from the I/O
device.  These data buffers are generally managed as a pool of
resources, too.  When an I/O device gets off the bus and another one
gets on, if the resource that is holding the data is needed to
buffer a new transaction, the old data is disposed of: on a write
operation it is written to the destination, and on a read operation
the prefetchable data is discarded.  If the previous I/O device then
gets back on the bus later to resume the previous operation, it
cannot pick up where it left off without prefetching again the data
that was discarded (if a read) or starting a new write buffer (if a
write).  Discarding prefetched read data that will be needed again,
or writing partial buffers, wastes the I/O devices' time waiting on
the latency and also wastes system performance.

   One solution to this problem is to increase the number of
TCE or data buffers in the bridge and to implement a least
recently used (LRU) algorithm for throwing out TCEs or data.  If the
buffer area is large enough, then the LRU algorithm might do a
sufficient job.  However, there is a better way.
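
   The LRU approach mentioned above can be sketched as follows,
applied here to TCEs.  This is an illustration only, not the
bridge's actual logic: the class LruTceCache, its fetch_tce
callback and the page numbers are hypothetical names introduced
for the example.

```python
from collections import OrderedDict


class LruTceCache:
    """Hypothetical model of a bridge's TCE buffers with LRU replacement."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entries = OrderedDict()  # io_page -> TCE, least recent first
        self.fetches = 0              # number of fetches from System Memory

    def lookup(self, io_page, fetch_tce):
        """Return the TCE for io_page, calling fetch_tce(io_page) on a miss."""
        if io_page in self.entries:
            self.entries.move_to_end(io_page)   # mark as most recently used
            return self.entries[io_page]
        self.fetches += 1
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)    # throw out the LRU entry
        self.entries[io_page] = fetch_tce(io_page)
        return self.entries[io_page]


def fetches_for(pages, capacity):
    """Replay a sequence of page accesses and report the fetch count."""
    cache = LruTceCache(capacity)
    for page in pages:
        cache.lookup(page, fetch_tce=lambda p: p + 0x1000)  # dummy TCE value
    return cache.fetches


# Three devices take turns on the bus, each using its own page.
pages = [1, 2, 3] * 3
print(fetches_for(pages, capacity=2))   # prints 9
print(fetches_for(pages, capacity=3))   # prints 3
```

   In this trace LRU helps only once the buffer area is large
enough to hold every active TCE; with one buffer too few, each
access still misses.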

   Now most systems use standard I/O buses, and these buses
do not provide a way for the I/O device to communicate to the
h...