
Method and apparatus to reduce backup window for files stored on disk based storage systems

IP.com Disclosure Number: IPCOM000235883D
Publication Date: 2014-Mar-28
Document File: 5 page(s) / 60K

Publishing Venue

The IP.com Prior Art Database

Abstract

Traditional disk-based storage systems used with file systems store data that is mostly scattered across the disk, leading to fragmentation. Backup applications that need to back up a set of files perform poorly on such systems because reading from them is very slow. This article presents a method to address that limitation and reduce the backup window for the system. The methods presented in the article can help in all kinds of disk-based systems and can be applied directly or indirectly to applications.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 41% of the total text.



Background:

In shared storage environments, file systems are created over block storage. To protect the primary data from failures, these files must then be backed up to other storage media. SAN administrators predominantly use backup clients that run on hosts and back up the important files by copying them periodically. These backup applications generally work by creating a list of files to back up and then copying them one by one to the backup storage. Backup jobs are IO intensive and load the hosts and storage from which backups are taken. Hence backup jobs are typically executed in off-peak hours, and they must complete before peak hours start. The ability to complete a backup within this allotted time (called the backup window) is therefore critical. With burgeoning data stores, backup windows are getting stretched, so reducing backup windows is considered a critical requirement for backup applications and storage devices.
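The list-then-copy behaviour described above can be sketched roughly as follows. This is only a minimal illustration of the general pattern, not the code of any particular backup client; the function names `build_file_list` and `naive_backup` are hypothetical and introduced here for illustration.

```python
import os
import shutil


def build_file_list(source_root):
    """Walk the source tree and return the relative paths of all files, sorted."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            paths.append(os.path.relpath(full, source_root))
    return sorted(paths)


def naive_backup(source_root, backup_root):
    """Copy every listed file, one at a time, to the backup location.

    Hypothetical helper: each file is read completely before the next one
    is started, regardless of where its blocks sit on the source disk.
    """
    copied = []
    for rel_path in build_file_list(source_root):
        src = os.path.join(source_root, rel_path)
        dst = os.path.join(backup_root, rel_path)
        os.makedirs(os.path.dirname(dst) or ".", exist_ok=True)
        shutil.copy2(src, dst)  # copies file data and metadata
        copied.append(rel_path)
    return copied
```

Note that the order of the reads is decided purely by the file list, so the disk access pattern is whatever the on-disk layout of those files happens to produce.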

With faster network speeds, the main bottleneck in reducing the backup window is disk speed, and the backup duration is often dominated by the time taken to read files randomly from disk. This follows from the design of filesystems: the blocks of different files are generally scattered across the disk in random order, so reading the files one by one results in a lot of disk seek operations, and the seek is the most expensive part of a disk read request. This causes suboptimal READ performance from disks, and hence the whole backup operation becomes slow.
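To make the seek argument concrete, the read time can be approximated as the time spent seeking plus the time spent transferring data. The sketch below is a simplified model under stated assumptions: the function name `estimated_read_seconds` is hypothetical, and the 10 ms seek cost, 100 MB/s transfer rate, and seek counts are illustrative figures, not measurements from the article.

```python
def estimated_read_seconds(total_bytes, num_seeks, seek_ms, transfer_mb_per_s):
    """Rough read-time model: seek overhead plus sequential transfer time.

    All parameters are illustrative assumptions supplied by the caller.
    """
    seek_time = num_seeks * seek_ms / 1000.0
    transfer_time = total_bytes / (transfer_mb_per_s * 1_000_000)
    return seek_time + transfer_time


# Assumed workload: 100 GB of file data on a disk that transfers 100 MB/s
# and needs about 10 ms per seek.
data_bytes = 100_000_000_000

# Fragmented layout: assume one seek per scattered extent, 1,000,000 in total.
fragmented = estimated_read_seconds(data_bytes, 1_000_000, 10, 100)

# Mostly contiguous layout: assume only 1,000 seeks for the same data.
contiguous = estimated_read_seconds(data_bytes, 1_000, 10, 100)

print(fragmented, contiguous)  # 11000.0 1010.0
```

Under these assumed numbers the transfer itself takes 1000 seconds in both cases, and the difference comes entirely from the seeks.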

Existing Solution:


The above diagram shows a typical setup in a SAN environment: a filesystem resides on disk storage, and the backup client on the host reads all files from primary storage and writes them to the backup storage.

The following diagram illustrates the disk access pattern when reading files F1, F2 and F3 in the given disk layout.


As is typically the case, the blocks of different files are scattered across the disk, so reading the files generates a random disk access pattern that causes a lot of SEEKs on the disk. As a result, overall read throughput is low and backup windows are longer.
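The effect of scattered blocks on a file-by-file read can be illustrated with a small simulation. The block layout below is hypothetical, since the article's diagram is not reproduced here: it assumes a nine-block disk on which the blocks of F1, F2 and F3 are interleaved, and compares it with a contiguous layout of the same files purely as a baseline. The helper names are likewise made up for this sketch.

```python
def file_by_file_order(layout, files):
    """Return the block addresses in the order a file-by-file read visits them."""
    return [block for name in files for block in layout[name]]


def access_stats(block_order, start=0):
    """Return (seeks, travel) for reading blocks in the given order.

    A seek is counted whenever the next block is not the one directly
    after the block just read; travel is the total distance, in blocks,
    that the head has to jump.
    """
    seeks = 0
    travel = 0
    head = start
    for block in block_order:
        if block != head:
            seeks += 1
            travel += abs(block - head)
        head = block + 1  # after a read the head sits at the next block
    return seeks, travel


# Hypothetical layouts (block addresses per file), not taken from the article.
scattered = {"F1": [0, 3, 6], "F2": [1, 4, 7], "F3": [2, 5, 8]}
contiguous = {"F1": [0, 1, 2], "F2": [3, 4, 5], "F3": [6, 7, 8]}

files = ["F1", "F2", "F3"]
print(access_stats(file_by_file_order(scattered, files)))   # (8, 24)
print(access_stats(file_by_file_order(contiguous, files)))  # (0, 0)
```

In the assumed scattered layout, eight of the nine block reads require a seek, while the contiguous baseline needs none.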

Problems with existing solutions:

1. Generates random read IO load on the source storage - As files are created over time on a filesystem, file blocks are typically scattered across the disk. As a result, reading a set of files (especially a large set) one by one in sequential order generates a random IO pattern on the disks, which causes a lot of disk head movement (seeks) and significantly reduces disk read throughput.

2. Increases backup window - As a consequence of the random disk IO pattern, reading many files is slow. This increases the backup window, causing backup times to extend into peak hours, where the backup contends with critical applications for precious data center resources and slows them down.

3. Poor Disk bandwidth Utilization -...