
Organization of index data for cache efficiency

IP.com Disclosure Number: IPCOM000022027D
Original Publication Date: 2004-Feb-19
Included in the Prior Art Database: 2004-Feb-19

Publishing Venue

IBM

Abstract

Cache misses have an increasingly negative impact on performance. The following describes how cache misses in a binary tree structure can be reduced.

Disclosed is a method for reducing cache misses on binary trees. This is significant because processor speeds have been increasing far faster than memory latencies have been improving, so the relative cost of a cache miss keeps growing.

     Indexes are typically organized into logical pages of storage, each usually one or more physical pages of memory in size. Within index pages, node structures refer to the root nodes of child pages. New index entries are added by finding the appropriate location in a page and logically inserting a new branch point in the tree. Typically, these new nodes are physically located at the next available byte of storage in the page; there would normally be no direct connection between the logical position of a node in the tree and its physical location in the page.
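
     As a point of reference, the following is a minimal sketch (in C) of such a page with the conventional allocation scheme; the structure names, field widths, and page size are assumptions for illustration and are not taken from the disclosure.

    #include <stdint.h>

    #define PAGE_SIZE 4096        /* one logical index page (assumed size) */
    #define NIL       0xFFFF      /* offset value meaning "no child"       */

    /* A node refers to its children by page-relative byte offsets. */
    struct node {
        uint16_t key;             /* search key (width is illustrative)    */
        uint16_t left, right;     /* child offsets within the page, or NIL */
    };

    struct page {
        uint16_t root;            /* offset of this page's root node       */
        uint16_t next_free;       /* next available byte in buf[]          */
        uint8_t  buf[PAGE_SIZE];
    };

    /* Conventional allocation: the new node lands at the next available
     * byte, so its physical location is unrelated to its logical position
     * in the tree; consecutive levels of a search path can therefore
     * fall in different cache lines. */
    static uint16_t alloc_node(struct page *p, uint16_t key)
    {
        uint16_t off = p->next_free;
        struct node *n = (struct node *)&p->buf[off];
        n->key  = key;
        n->left = n->right = NIL;
        p->next_free += sizeof(struct node);
        return off;
    }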

     The proposed method groups the nodes on a page into subtrees, with each subtree approximately the size of one full cache line. When a node path in an index is traversed, from the root of the tree down to some leaf node, the nodes on the path reside in various cache lines. If nodes are grouped as described, then when a cache miss is encountered while referencing some node, the next N nodes in the node path will usually be in the same cache line (where N depends on the size of a cache line and the structure of the index nodes). Unless the cache is thrashing severely, no cache misses will be encountered on those next N levels of the subtree. This method has the added effect of decreasing the number of cache lines referenced (i.e., the footprint of the index in the cache), increasing the probability that other data and instruction streams will remain in their cache lines for subsequent use.
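
     To make the role of N concrete: with 4-byte nodes and 32-byte cache lines (the sizes used in the example below), one line holds 8 node slots, enough for a complete 3-level subtree of 7 nodes, so a path that descends D levels touches on the order of D/3 cache lines when nodes are clustered, versus up to D lines when they are scattered. A small counting sketch follows; the helper lines_touched and its inputs are hypothetical.

    #include <stdint.h>

    #define LINE_SIZE 32                   /* assumed cache-line width */

    /* Count the distinct cache lines touched along one root-to-leaf
     * path, given the page-relative byte offset of each node visited.
     * With clustered nodes, runs of consecutive levels share a line,
     * so the count approaches depth/3 for the sizes above; with
     * scattered nodes it approaches the full depth. */
    int lines_touched(const uint16_t *path_offsets, int depth)
    {
        int count = 0;
        long last_line = -1;
        for (int i = 0; i < depth; i++) {
            long line = path_offsets[i] / LINE_SIZE;
            if (line != last_line) {
                count++;
                last_line = line;
            }
        }
        return count;
    }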

     A description of the implementation follows. Assume the numbers in the diagram below represent the cache line to which each node will be assigned; thus all the nodes labeled "1" will be assigned to the first cache line. (Hereafter, these nodes are referred to collectively as "subtree 1".) For this example, nodes are 4 bytes long and cache lines are assumed to be 32 bytes wide (enough room for at most 8 nodes). As subtree 1 is processed, its nodes are assigned to the first cache line of a logical page. For each branch point at the leaf level of subtree 1 (e.g., the nodes above subtrees 2 and 3), the position is stored in a queue. When subtree 1 has been processed, these nodes will occupy approximately one full cache line.

                     1
             1               1
         1       1       1       1
        2 3     4 5     6 7     8 9

     (Each node labeled 2 through 9 is the root of the subtree bearing that number.)
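
     Under those same assumptions (4-byte nodes addressed by array index, 32-byte lines, 8 slots per line), a sketch of the clustering pass might look as follows. The queue layout, fixed capacities, and all names are illustrative, not taken from the disclosure.

    #include <stdint.h>
    #include <stddef.h>

    #define LINE_SIZE 32                   /* assumed cache-line width */
    #define NONE      0xFF                 /* "no child" index         */

    /* Source: an ordinary pointer-based binary tree node. */
    struct src { int key; struct src *left, *right; };

    /* Destination: a compact 4-byte node addressed by array index,
     * matching the 4-byte nodes of the example (widths illustrative;
     * 8-bit indexes cap the tree at 255 nodes). */
    struct dst { uint16_t key; uint8_t left, right; };

    #define NODES_PER_LINE (LINE_SIZE / sizeof(struct dst))  /* 8 slots */

    /* Copy the tree at `root` into out[] (assumed to start on a
     * cache-line boundary), grouping nodes into cache-line-sized
     * subtrees.  Each queued subtree root is laid out breadth-first
     * into one line; branch points that overflow the line are queued
     * as the roots of subtrees 2, 3, and so on.  Fixed capacities and
     * the absence of overflow checks keep the sketch short.  Returns
     * the number of slots consumed, including per-line padding. */
    size_t cluster(struct src *root, struct dst out[256])
    {
        struct item { struct src *s; uint8_t *parent_link; };
        struct item outer[256], inner[32];
        size_t ohead = 0, otail = 0, used = 0;
        uint8_t root_link;                 /* throwaway link for the root */

        if (root)
            outer[otail++] = (struct item){ root, &root_link };

        while (ohead < otail) {
            size_t line_end = used + NODES_PER_LINE;  /* this subtree's line */
            size_t ihead = 0, itail = 0;
            inner[itail++] = outer[ohead++];

            /* Breadth-first fill of one cache line. */
            while (ihead < itail && used < line_end) {
                struct item it = inner[ihead++];
                struct dst *d = &out[used];
                d->key  = (uint16_t)it.s->key;
                d->left = d->right = NONE;
                *it.parent_link = (uint8_t)used++;  /* patch parent's child */
                if (it.s->left)
                    inner[itail++] = (struct item){ it.s->left,  &d->left  };
                if (it.s->right)
                    inner[itail++] = (struct item){ it.s->right, &d->right };
            }
            /* Line full (or subtree exhausted): the remaining branch
             * points become the roots of later subtrees, in queue order. */
            while (ihead < itail)
                outer[otail++] = inner[ihead++];
            used = line_end;               /* pad to the line boundary */
        }
        return used;
    }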

     T...