Improved Access for Sequentially Numbered Files in a Hashing File System

IP.com Disclosure Number: IPCOM000113269D
Original Publication Date: 1994-Aug-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 4 page(s) / 128K

Publishing Venue

IBM

Related People

Hall, T: AUTHOR

Abstract

A method is disclosed for improving access to sequentially numbered files, and to files within the same subdirectory, in a hashing file system with a cache. For sequentially numbered files, the trailing numerics of the file name are ignored when hashing. As an extension, for files within the same subdirectory, only the subdirectory (path) name is used in the hashing algorithm.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      A file system is the software layer within an operating system
that handles access to data on a given medium.  It stores "pointers"
to file data in a structure known as a directory entry and uses that
entry to locate the data for a file on the medium.

      Hash classes are a means of locating a directory entry, and
thus retrieving file data, more quickly.  The file name is passed
through a "hashing algorithm", which computes a numeric value from
the values of the individual characters in the name.  Typically the
name includes the directories of which the file is a member, often
referred to as the "path".  Each directory entry is then placed in
the "hash class" that corresponds to its hash value.  A hash class
corresponds to physical storage space on the medium.  This space can
often be contiguous.
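
      The hashing step described above can be illustrated with a
minimal sketch.  The disclosure does not give the actual algorithm,
so the multiply-and-add formula, the function name hash_class, and
the count of 64 hash classes are all assumptions made only for
illustration.

```python
NUM_HASH_CLASSES = 64  # assumed value; the disclosure does not state a count


def hash_class(full_path: str, num_classes: int = NUM_HASH_CLASSES) -> int:
    """Map a full path (directories plus file name) to a hash class number.

    The value is built from the individual character values of the name.
    The multiplier 31 is an illustrative choice, not the disclosed method.
    """
    value = 0
    for ch in full_path:
        # Multiply-and-add so that character order affects the result.
        value = (value * 31 + ord(ch)) % num_classes
    return value


if __name__ == "__main__":
    for name in ("/docs/report.txt", "/docs/notes.txt", "/src/main.c"):
        print(name, "-> hash class", hash_class(name))
```

      Any function that maps the character values deterministically
into the range of hash class numbers would serve the same purpose.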

      When the directory entry for a file needs to be found, its hash
class is determined and only that hash class is searched.  This can
significantly decrease access time when the individual hash classes
are small in relation to the total number of directory entries
stored.  The alternative would be to keep all the directory entries
in one area on the medium, which would require a sequential scan
through this potentially large area to locate a directory entry.
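
      The saving can be seen in a small in-memory model.  This is a
sketch under stated assumptions, not the file system's actual
structure: the class HashedDirectory, its methods, the sum-of-
characters hash, and the default of 8 hash classes are hypothetical.
Each hash class is modeled as a list standing in for one area of the
medium.

```python
class HashedDirectory:
    """Toy model: directory entries grouped into hash classes."""

    def __init__(self, num_classes: int = 8):
        self.num_classes = num_classes
        # One list per hash class; each holds (path, location) entries.
        self.classes = [[] for _ in range(num_classes)]

    def _class_of(self, path: str) -> int:
        # Illustrative hash: sum of character values modulo the class count.
        return sum(ord(ch) for ch in path) % self.num_classes

    def add(self, path: str, location: int) -> None:
        self.classes[self._class_of(path)].append((path, location))

    def find(self, path: str):
        """Return (location, entries_examined); location is None if absent."""
        examined = 0
        # Only the one hash class for this path is scanned.
        for name, location in self.classes[self._class_of(path)]:
            examined += 1
            if name == path:
                return location, examined
        return None, examined


if __name__ == "__main__":
    d = HashedDirectory()
    for i in range(100):
        d.add(f"/data/file{i}.txt", i)
    print(d.find("/data/file42.txt"))
```

      With all entries kept in a single area, the same search could
examine up to every stored entry; with hash classes it examines at
most the entries in one class.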

      Presented here are means of selectively hashing on portions of
a file name and its path to optimize file retrieval in some scenarios
that standard hashing handles poorly.
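
      The two selective-hashing ideas can be sketched as functions
that derive the key actually fed to the hashing algorithm.  The
function names, the "/" path separator, the hash formula, and the
assumption that the numerics sit at the very end of the full name are
all illustrative; the abbreviated text does not specify them.

```python
import posixpath


def key_ignoring_trailing_digits(path: str) -> str:
    """Drop trailing numerics so PAGE0001 and PAGE0002 share one key."""
    return path.rstrip("0123456789")


def key_directory_only(path: str) -> str:
    """Use only the subdirectory (path) portion as the key."""
    return posixpath.dirname(path)


def hash_class(key: str, num_classes: int = 64) -> int:
    """Illustrative multiply-and-add hash over the character values."""
    value = 0
    for ch in key:
        value = (value * 31 + ord(ch)) % num_classes
    return value


if __name__ == "__main__":
    for name in ("/scans/page0001", "/scans/page0002", "/scans/page0003"):
        key = key_ignoring_trailing_digits(name)
        print(name, "key:", key, "hash class:", hash_class(key))
```

      Because files that differ only in their trailing numerics (or,
in the extension, files in the same subdirectory) produce the same
key, their directory entries land in the same hash class.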

      Accessing Sequentially Numbered Files - It may be difficult for
a file system to access sequentially numbered files quickly,
particularly on very large media.  File systems will often use hash
classes to improve access times to directory entries pointing to a
given file.  Traditional hashing techniques, however, will spread
sequentially numbered file directory entries amongst different hash
classes, requiring many hash classes to be loaded before the entire
...