Method for render-cache optimization for zone rendering

IP.com Disclosure Number: IPCOM000009708D
Publication Date: 2002-Sep-11
Document File: 5 page(s) / 125K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method for render-cache optimization for zone rendering. Benefits include improved functionality, improved performance, and improved utilization of silicon surface space.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 39% of the total text.

Background

              There are two modes of rendering: classical rendering and binning. In classical rendering, the polygons are processed in the order in which they appear. A polygon designated as P may be at the top left corner of the screen, and the next polygon, designated as P+1, may be at the bottom right corner of the screen. Classical rendering therefore requires more memory bandwidth than binning.

General description

              The disclosed method optimizes the render cache for zone rendering. Implementing binning, timeslicing, and zone clearing improves performance beyond the standard 1 pixel per clock.

              The render cache (see Figure 1) has 16 KB of storage. It services the following drawing engines:

·        3-D rendering

·        Blitter

·        Motion compensation

·        Overlay

 

              The renderer requires/manipulates depth (Z) information and color (C) information for the polygons that make up the render scene. The blitter reads source (S) data and reads/manipulates destination (D) data. Motion compensation reads error (E) data and reads/manipulates destination (D) data. All clients operate in the virtual space and use screen coordinates.

              Each client has two streams accessing the cache, and each client interfaces with the cache via the windower. The cache first arbitrates between the two streams, as indicated by the mux. The virtual address is then translated to the physical memory address where the data is stored. This mapping is implemented via a translation look-aside buffer (TLB). The TLB has 16 entries, with each entry storing the translation for 2 pages. A 14-deep TLB hit/miss queue is provided so that the cache does not block or stall on TLB misses.
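              The translation step can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the disclosed hardware: the 16 entries, 2 pages per entry, and 14-deep queue come from the text, while the 4-KB page size, the pairing of adjacent virtual pages in one entry, the oldest-first replacement, and the names Tlb, lookup, drain, and page_table are hypothetical.

```python
from collections import OrderedDict, deque

PAGE_SIZE = 4096        # assumption: the disclosure does not state a page size
TLB_ENTRIES = 16        # from the text
PAGES_PER_ENTRY = 2     # from the text
QUEUE_DEPTH = 14        # from the text


class Tlb:
    """Toy TLB whose entries each translate a pair of adjacent virtual pages."""

    def __init__(self, page_table):
        self.page_table = page_table   # hypothetical: virtual page -> physical page
        self.entries = OrderedDict()   # pair index -> tuple of physical pages
        self.queue = deque()           # pending requests tagged hit or miss

    def lookup(self, vaddr):
        """Queue a request without stalling.

        Returns True on a hit, False on a miss, None if the queue is full.
        """
        if len(self.queue) >= QUEUE_DEPTH:
            return None
        pair = (vaddr // PAGE_SIZE) // PAGES_PER_ENTRY
        hit = pair in self.entries
        self.queue.append((vaddr, hit))
        return hit

    def drain(self):
        """Resolve the oldest queued request and return its physical address."""
        vaddr, _hit = self.queue.popleft()
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        pair, slot = divmod(vpage, PAGES_PER_ENTRY)
        if pair not in self.entries:               # miss: fill the entry
            if len(self.entries) >= TLB_ENTRIES:
                self.entries.popitem(last=False)   # evict the oldest entry (assumed policy)
            base = pair * PAGES_PER_ENTRY
            self.entries[pair] = tuple(
                self.page_table[base + i] for i in range(PAGES_PER_ENTRY)
            )
        return self.entries[pair][slot] * PAGE_SIZE + offset
```

              In this model a miss is simply recorded in the queue and resolved later by drain, so later lookups can still be accepted while earlier misses are outstanding. That is one plausible reading of how the hit/miss queue keeps the cache from stalling.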

              In the caching operation of the render cache, the allocator is a 6-stage pipe that uses the physical address for allocation (see Figure 2). The main components are the cache tag and the write-back engine, which evicts cache lines to memory. The cache tag is read by both the allocator and the write-back engine, and it is written by the allocator. The architecture guarantees that the allocator and the write-back engine never read the tag at the same time. However, the tag can be read and written at the same time. Physically, the tag has a single port that is used for both reads and writes, and it operates at twice the core frequency. Because of the doubled frequency, the tag requires just one port instead of two, which saves silicon surface space.
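              The single-port arrangement can be pictured as two tag-clock slots inside every core clock. The sketch below is a behavioral illustration only. The ordering (read slot first, write slot second) is an assumption that the text does not state, and SinglePortTag, core_cycle, and port_uses are hypothetical names.

```python
class SinglePortTag:
    """Toy model of a one-port tag array clocked at twice the core frequency."""

    def __init__(self, num_lines):
        self.tags = [None] * num_lines
        self.port_uses = 0             # counts accesses through the single port

    def core_cycle(self, read=None, write=None):
        """Run one core clock, which contains two tag-clock slots.

        read  -- (reader, line); reader is "allocator" or "writeback".
                 Only one reader per core clock, matching the guarantee that
                 the two readers never access the tag at the same time.
        write -- (line, tag); the allocator's tag update.
        Returns the tag value read, or None if no read was requested.
        """
        result = None
        # Tag-clock slot 0 (assumed order): the port serves the read.
        if read is not None:
            reader, line = read
            if reader not in ("allocator", "writeback"):
                raise ValueError("unknown reader: %r" % (reader,))
            self.port_uses += 1
            result = self.tags[line]
        # Tag-clock slot 1 (assumed order): the same port serves the write.
        if write is not None:
            line, tag = write
            self.port_uses += 1
            self.tags[line] = tag
        return result
```

              From the core's point of view, a read and a write complete in the same clock, even though the port only ever handles one access per tag clock.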

              The disclosed method uses binning. The rendering image is divided into 8-KB bins. All polygons in one bin are rendered completely before moving to the next bin. This approach requires that the Z value be initialized to infinity before any polygons are processed. Optionally, the frame buffer may be cleared to a background ...
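              The bin-by-bin flow described so far can be sketched in software as follows. This is a minimal illustration under several assumptions, not the disclosed implementation. A bin is taken to be 64 x 32 pixels at 4 bytes per pixel, which equals 8 KB, although the text gives only the 8-KB size. Polygons are simplified to axis-aligned rectangles with a constant depth, and bin_polygons, render_bin, and clear_color are hypothetical names.

```python
import math

# Assumption: 64 x 32 pixels x 4 bytes per pixel = 8 KB per bin.
BIN_W, BIN_H = 64, 32


def bin_polygons(rects):
    """Sort rectangles into every bin they overlap.

    Each rectangle is (x0, y0, x1, y1, depth, color) with x1 and y1 exclusive.
    """
    bins = {}
    for r in rects:
        x0, y0, x1, y1 = r[:4]
        for by in range(y0 // BIN_H, (y1 - 1) // BIN_H + 1):
            for bx in range(x0 // BIN_W, (x1 - 1) // BIN_W + 1):
                bins.setdefault((bx, by), []).append(r)
    return bins


def render_bin(bx, by, rects, clear_color=0):
    """Render all rectangles of one bin completely and return its color buffer."""
    # Z starts at infinity so the first polygon at each pixel always passes.
    z = [[math.inf] * BIN_W for _ in range(BIN_H)]
    c = [[clear_color] * BIN_W for _ in range(BIN_H)]
    ox, oy = bx * BIN_W, by * BIN_H        # screen position of the bin origin
    for x0, y0, x1, y1, depth, color in rects:
        for y in range(max(y0, oy), min(y1, oy + BIN_H)):
            for x in range(max(x0, ox), min(x1, ox + BIN_W)):
                if depth < z[y - oy][x - ox]:   # nearer than what is stored
                    z[y - oy][x - ox] = depth
                    c[y - oy][x - ox] = color
    return c
```

              Because each bin's Z and color data fit in a small fixed-size buffer, all depth tests and color writes for the bin can stay local until the bin is finished, which is the locality that classical rendering lacks.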