
Method and Means of Maximizing Throughput by Batching Dirty Data for Minimum Disk Head Movement

IP.com Disclosure Number: IPCOM000013600D
Original Publication Date: 2000-Mar-01
Included in the Prior Art Database: 2003-Jun-18
Document File: 1 page(s) / 39K

Publishing Venue

IBM

Abstract

RAID has two reasons for being: fault tolerance and performance. RAID performance is gated by the performance it can get out of the physical drives. Maximum performance would be achieved in a purely sequential I/O mix, where seek latency from one I/O to the next is zero. In a random I/O mix, seek latency can never be zero, but we can try to minimize it. In write-through mode we have very little control over this, as we must go to disk with each host write. In write-back mode, host data simply goes to cache and is flushed to disk later. In flushing, we can choose to write the data to disk in any sequence we want, not necessarily in the order it came from the host. This disclosure seeks to minimize seek latency when flushing dirty data by "batching" dirty pages according to disk location. The disk is divided logically into n areas of equal size. When a page is written to cache by the host, it is linked into one of the n "batch" lists according to which area of the disk it falls into. When we go to flush the LRU dirty page, we pull up to m - 1 other dirty pages from the same "batch" list and flush all the pages concurrently. Thus we can flush up to m dirty pages without moving the disk head across more than 1/nth of the disk.
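A minimal Python sketch of the batching scheme described above (the class, its method names, and the LBA-to-area mapping are illustrative assumptions, not part of the disclosure):

```python
from collections import OrderedDict

class BatchedWriteCache:
    """Write-back cache that links dirty pages into per-area "batch" lists.

    The disk (disk_blocks logical blocks) is divided into n_areas equal
    areas. A flush takes the globally LRU dirty page and drains up to
    m - 1 more pages from the same area's list, so one flush moves the
    head across at most 1/nth of the disk.
    """

    def __init__(self, disk_blocks, n_areas):
        self.area_size = disk_blocks // n_areas
        self.batches = [OrderedDict() for _ in range(n_areas)]
        self.lru = OrderedDict()  # dirty pages in LRU order: lba -> data

    def _area(self, lba):
        # Which of the n equal-size areas this logical block falls into.
        return lba // self.area_size

    def write(self, lba, data):
        # Host write lands in cache; the page is marked dirty and linked
        # into the batch list for the disk area its LBA falls into.
        self.lru.pop(lba, None)
        self.lru[lba] = data            # most recently used goes to the end
        self.batches[self._area(lba)][lba] = data

    def flush_batch(self, m):
        # Choose the LRU dirty page, then pull other dirty pages from the
        # same batch list (up to m total) and return them sorted by LBA,
        # as they would be issued to disk concurrently.
        if not self.lru:
            return []
        lru_lba = next(iter(self.lru))  # oldest dirty page
        batch = self.batches[self._area(lru_lba)]
        victims = [lru_lba] + [lba for lba in batch if lba != lru_lba][: m - 1]
        flushed = sorted((lba, batch.pop(lba)) for lba in victims)
        for lba, _ in flushed:
            self.lru.pop(lba, None)     # flushed pages are now clean
        return flushed
```

Sorting the batch by LBA before issuing it lets the head sweep once across the area; choosing n and m trades per-write bookkeeping against head travel per flush.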

