Browse Prior Art Database

Method and System for Providing Stable Readdir Cookies for Large Hashed Directories

IP.com Disclosure Number: IPCOM000236294D
Publication Date: 2014-Apr-17
Document File: 2 page(s) / 92K

Publishing Venue

The IP.com Prior Art Database

Abstract

A method and system is disclosed for providing stable readdir cookies for large hashed directories.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 51% of the total text.

Page 01 of 2

Method and System for Providing Stable Readdir Cookies for Large Hashed Directories

Disclosed is a method and system for providing stable readdir cookies for large hashed directories. The method and system divides readdir offset/cookies into two parts, one based on a directory block (external hash bucket) of M bits and other of C bits based on either uniquifier bits from name's hash or, a value unique within the entries sharing the same M bit value. Upon hash collision, the uniquifier is stored with the directory entry to provide persistence. The hash table within each directory block (internal hashing) indexes entries both by name hash and by uniquifier, if different. This allows for finding an entry by name quickly as well as easily enumerating a block's entries in order of readdir offset.

In accordance with the method and system, the available uniquifier value can be determined by considering only the entries in a single block. Internal hash table is well-organized to make selection of an unused uniquifier efficient; only a small range of hash buckets needs to be examined. Without this block locality, picking a unique cookie value to use as a readdir offset for a new directory entry in a large directory would be intractable. Relying on the existing (GPFS) method for locating blocks based on directory names (i.e., external hashing), no additional global metadata overhead is imposed by the scheme. Few small additions to each block are required to support uniquifiers and these are only necessary when name hash collisions actually require selection of value different from the preferred choice provided by the name hash. So for small directories, where hash value

collisions are vanishingly unlikely, the additional overhead value is zero.

Uniquifier bits are either from the name hash value or, in case of a collision, some value unused by any existing directory entry in the same block having the same M bit value. Two special linked directory entries are used to persistently record the additional information needed when the uniquifier bits are assigned in cases of a collision with the bits from the name hash value. The first, called a "tiny crumb", is used by lookup and indexed in the internal hash table by the name hash value. The second, called "unlucky", is used by readdir and indexed in the internal hash table by the uniquifier bits. The two entries are linked so they can share common information.

Hash chains are maintained in order of uniquifier.

Readdir iterates over blocks in cookie (bit reversed) order, then over the internal hash table in bucket t...