Browse Prior Art Database

Reorganizing Data in Log-Structured File Systems

IP.com Disclosure Number: IPCOM000112798D
Original Publication Date: 1994-Jun-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 2 page(s) / 119K

Publishing Venue

IBM

Related People

Franaszek, PA: AUTHOR [+3]

Abstract

Disclosed is a class of methods for lower cost reorganization in log-structured file systems and, additionally, for improving sequential read performance in such systems. The design of a log-structured file system is described in (1). The basic idea is to optimize file system disk organization for writes by dividing the disk into a number of segments, each of which is in turn used as a log. Each file system write is handled as a "log write", i.e., written to the disk location that is physically the next sequential block after the previous write. Each such write causes another block, in the same or another segment, to become old (i.e., invalid) data. Such a block can be thought of as a hole in the segment (i.e., one block of free space). Over time all segments are eventually used and disk space becomes fragmented, i.e.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 47% of the total text.

Reorganizing Data in Log-Structured File Systems

      Disclosed is a class of methods for lower cost reorganization
in log-structured file systems and, additionally, for improving
sequential read performance in such systems.  The design of a
log-structured file system is described in (1).  The basic idea is
to optimize file system disk organization for writes by dividing the
disk into a number of segments, each of which is in turn used as a
log.  Each file system write is handled as a "log write", i.e.,
written to the disk location that is physically the next sequential
block after the previous write.  Each such write causes another
block, in the same or another segment, to become old (i.e., invalid)
data.  Such a block can be thought of as a hole in the segment (i.e.,
one block of free space).  Over time all segments are eventually used
and disk space becomes fragmented, i.e., there is no large physically
sequential area consisting entirely of free blocks.  Therefore, it is
necessary to reorganize the disk in order to produce such free areas
for log space.  The method proposed in (1) is called "copy and
compact":  an entire segment is read, and then re-written with all
valid (in-use) blocks written at the beginning of the segment.  This
can be thought of as "left-justifying" the in-use blocks of the
segment.  The remainder of the segment then consists of free space
which can be used for log writes.

      In (2), a class of methods for reducing storage fragmentation
on disks is described.  These methods were designed so as to give a
large reduction in fragmentation with a relatively small amount of
data movement.  Various measures of fragmentation are described, and
algorithms are given for choosing sets of blocks to move that reduce
fragmentation optimally in terms of minimizing cost/gain functions.
The disk is not considered to be divided into fixed regions; rather,
it is considered to be one linear space.  To make the problem
tractable, a disk is divided into sets of "proper intervals", but the
boundaries are determined by the usage of the disk in that proper
intervals are defined as intervals with free spaces at each endpoint.
These methods were implemented, and, in addition to the feature of
being able to reorganize disk packs dynamically (i.e., while in use),
resulted in approximately an order of magnitude reduction in data
movement for the same reduction in fragmentation as the previously
used method (off-line copying of disk packs, with in-use data
"left-justified" on the pack).

      Note that the copy and compact segment reorganization method
described in (1) is analagous to the disk pack copying
reorganization technique prior to the implementation of the methods
described in (2), where the analog of a segment is a disk pack.  This
implies that reductions in reorganization overhead could be achieved
by applying the methods of (2) to log-structured file systems.

      One way to do this is to maint...