Minimizing Read Transfer Overhead by using a Flexible Transfer Window

IP.com Disclosure Number: IPCOM000104295D
Original Publication Date: 1993-Apr-01
Included in the Prior Art Database: 2005-Mar-19
Document File: 4 page(s) / 126K

Publishing Venue

IBM

Related People

Bowen, AD: AUTHOR [+2]

Abstract

In tiled screen memory architectures, the price paid for rendering performance is the time spent on the many read transfer cycles that move data from the RAM array into the Serial Access Memory (SAM) port of the VRAM. In many systems, this overhead can reach values on the order of 15%. To retain as much of the performance gain as possible, it is desirable to reduce this overhead as far as possible without increasing other forms of overhead.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      In tiled screen memory architectures, the price paid for
rendering performance is the time spent on the many read transfer
cycles that move data from the RAM array into the Serial Access
Memory (SAM) port of the VRAM.  In many systems, this overhead can
reach values on the order of 15%.  To retain as much of the
performance gain as possible, it is desirable to reduce this overhead
as far as possible without increasing other forms of overhead.

      High-performance graphics systems need to be able to render
data in a wide variety of directions and manners.  To facilitate
rendering in any direction, the concept of tiling has been adopted
across the computer industry.  A tile maps a two-dimensional patch of
screen to a single row in the memory device.  This allows the memory
device to remain in Page Mode over a greater range of motion, which
increases the rendering performance of the system.  The ideal tile
from the rendering point of view is a square patch of screen.
Unfortunately, the data in the VRAM must also be scanned out to the
screen or RAMDAC.  For every tile that is used to span the horizontal
axis of the screen, data from a different row must be placed in the
SAM.  This process, called Data Transfer (DT), degrades performance
because the random access port must be inactive while the DT cycle is
active, which halts rendering operations.
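
      The tile-to-row mapping described above can be sketched as
follows.  This is only an illustration: the screen width, the tile
dimensions, and the names tile_address and rows_on_scanline are
assumptions made for the example and are not taken from the
disclosure.

```python
# Illustrative tiled-address sketch. All dimensions below are assumptions
# for this example, not values taken from the disclosure.
SCREEN_W = 1024   # pixels per scanline (assumed)
TILE_W = 128      # tile width in pixels, giving 8 tiles across (assumed)
TILE_H = 4        # tile height in scanlines (assumed)
TILES_ACROSS = SCREEN_W // TILE_W


def tile_address(x, y):
    """Map screen pixel (x, y) to a (row, column) pair in the memory device."""
    tile_col, in_x = divmod(x, TILE_W)   # which tile horizontally, offset inside it
    tile_row, in_y = divmod(y, TILE_H)   # which tile vertically, offset inside it
    row = tile_row * TILES_ACROSS + tile_col   # one memory row per tile
    column = in_y * TILE_W + in_x              # position within that row
    return row, column


def rows_on_scanline(y):
    """Return the distinct memory rows that one scanline passes through."""
    return sorted({tile_address(x, y)[0] for x in range(SCREEN_W)})
```

      Under these assumptions, a short vertical move such as (5, 0)
to (5, 3) stays in the same memory row, so the device can remain in
Page Mode, whereas a single scanline passes through one row per tile
and therefore needs one transfer into the SAM for each of them.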

      If one were to plot performance versus tile size, the curve
would look somewhat parabolic.  When a single tile spans the entire
screen width, the rendering performance is limited by the overhead
of starting new RAS cycles, because rendering crosses from one tile
to another in the vertical direction very often.  As the tile width
decreases, the row violation overhead also decreases; however, the
number of DT cycles per horizontal scanline begins to increase.
Up to a point, there is an overall performance gain.  For the
architecture designed, the number of tiles spanning the horizontal
axis was found to be optimal at 8.  For a screen with 1024 scanlines,
this means 8 x 1024 = 8192 DT cycles must be performed every 1/60th
of a second.  A typical 4Mbit VRAM specifies a DT cycle time of
120 ns, so the DT cycles occupy about 0.98 ms of each 16.7 ms frame.
This means that, at best, there will be a performance loss of about
6% due to DT cycles.
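
      The arithmetic behind the 6% figure can be checked with a short
sketch.  The function name dt_overhead and its parameter names are
hypothetical; the default values are the ones quoted in the text
(8 tiles across, 1024 scanlines, 60 frames per second, 120 ns per DT
cycle).

```python
def dt_overhead(tiles_across=8, scanlines=1024, refresh_hz=60, dt_cycle_ns=120):
    """Return (DT cycles per frame, fraction of frame time spent in DT cycles).

    This is a best-case estimate: it counts only the DT cycle time itself
    and ignores any other overhead associated with the transfer.
    """
    dt_per_frame = tiles_across * scanlines   # one DT per tile per scanline
    dt_time_ns = dt_per_frame * dt_cycle_ns   # total time spent in DT cycles
    frame_ns = 1e9 / refresh_hz               # duration of one frame
    return dt_per_frame, dt_time_ns / frame_ns


cycles, fraction = dt_overhead()
print(cycles, f"{fraction:.1%}")
```

      Because the cycle count scales linearly with the number of
tiles across the screen, halving the tile width doubles this
best-case loss, which is why narrower tiles eventually stop paying
off.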

      There is other overhead associated with the DT cycle.  In
previous VRAMs, the time at which the DT cycle had to occur was
tightly constrained.  The cycle had to be timed so that, after the
last pixel of one tile was scanned, the next tile was loaded in
between shift clocks.  To alleviate this timing constraint, the
concept of the Split Register was introduced.  This allowed the VRAM
controller to do the DT cycle at a time not near the row edge by
treating the SAM as two separate entities.  While one...