
System and Method to optimize data synchronization between Panache Cache and Home by using prioritization based mechanism

IP.com Disclosure Number: IPCOM000240560D
Publication Date: 2015-Feb-09
Document File: 3 page(s) / 86K

Publishing Venue

The IP.com Prior Art Database

Abstract

This method optimizes data synchronization between GPFS (Panache) cache filesets and the GPFS (Panache) home site by using a priority mechanism. By defining priorities on objects at various levels, such as files and filesets at the cache or home (destination) site, the available resources and bandwidth can be devoted to synchronizing changed high-priority data from a priority source to a priority destination. Priority data therefore takes precedence over all other data, and resources and bandwidth are used efficiently so that the data most urgently required at the home (destination) site is available there at the earliest opportunity. This helps achieve business continuity by syncing the awaited changed data from cache to home first; less urgently required, low-priority files, filesets, or destination data can be synced later.




Panache

Panache is a scalable, high-performance, file-system caching layer integrated within the GPFS cluster file system. It introduces the concept of a persistent data store at the cache site, which masks wide-area network latencies and outages by using GPFS to cache massive data sets, allowing data access and modification even when the remote storage cluster is unavailable.

Figure 1

In Panache terms, the home site is the source of the original data, and the cache site caches data locally in a persistent store for client applications. When a client application accesses a file for the first time, the file is fetched from the home site and copied into the GPFS file system at the cache site. Subsequent requests for the file are served from the local cache site, eliminating the need for WAN bandwidth. The Panache design takes care of keeping the cache-site copy in sync with the home-site copy. File data is transferred using a protocol in which the home site acts as an NFS server and the cache site acts as an NFS client. The GPFS file system at both sites stores files on devices managed by the storage server. Panache is based on the concept of on-demand caching: the cost of the first access to a file, and of subsequent accesses to a stale file (stale access) within certain constraints, is proportional to the file size and the bandwidth of the link. This may be a significant overhead in the case of a slow or limited-bandwidth link.
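As an illustration of the on-demand caching flow described above, the following Python sketch models a first access that fetches a file from the home site and later accesses that are served from the local cache copy. The class and method names (HomeSite, CacheSite, fetch, read) are hypothetical and are not the actual Panache/GPFS interfaces.

    class HomeSite:
        """Stand-in for the home site exporting files over NFS (hypothetical)."""

        def __init__(self, files):
            self.files = dict(files)

        def fetch(self, path):
            # In Panache this would be an NFS read over the WAN link.
            return self.files[path]


    class CacheSite:
        """Stand-in for the cache site's persistent GPFS store (hypothetical)."""

        def __init__(self, home):
            self.home = home
            self.local_store = {}

        def read(self, path):
            if path not in self.local_store:
                # First access (or stale access): fetched from home; the cost is
                # proportional to file size and limited by the WAN bandwidth.
                self.local_store[path] = self.home.fetch(path)
            # Subsequent accesses are served locally, saving WAN bandwidth.
            return self.local_store[path]


    home = HomeSite({"/data/report.txt": b"original contents"})
    cache = CacheSite(home)
    cache.read("/data/report.txt")  # first access: pulled from the home site
    cache.read("/data/report.txt")  # second access: served from the cache site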

Currently, in Panache, data is synchronized based on the changes observed at the cache site. This is fine if we can wait until all data is synced to the other site. But consider a huge amount of I/O at the cache causing the queues to fill up, with all data being synced in an as-it-comes manner. This means waiting until the required file is synchronized at home, which delays data availability at the home site. The user has no idea whether the required file that has arrived at home is complete, or when it will be available.




Figure 2

Proposed Optimizations ::

Instead of performing changed-data synchronization of every file from cache to home at the same level, a priority can be set on files, on a Panache fileset, or on a Panache home, such that changed data at the cache is checked against the priority of the file, fileset, or targeted home and is then moved up or down in the synchronization queue accordingly. This syncs priority data to home first, so the required data can be made available to the user early, and it directly serves the purpose of making the required data available at the destination before any other data.
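A minimal sketch of such a priority-ordered synchronization queue is shown below, assuming (hypothetically) that priorities can be looked up per file, per fileset, and per targeted home, with the most specific setting winning. The names and the heap-based queue are illustrative only, not the actual Panache implementation.

    import heapq
    import itertools

    # Hypothetical priority tables: lower number = higher priority.
    FILE_PRIORITY = {"/cache/fs1/orders.db": 0}
    FILESET_PRIORITY = {"fs1": 1}
    HOME_PRIORITY = {"home-dr-site": 2}
    DEFAULT_PRIORITY = 9

    _order = itertools.count()   # tie-breaker: FIFO within the same priority
    sync_queue = []              # min-heap of pending changed-data items


    def effective_priority(path, fileset, home):
        # Most specific setting wins: file, then fileset, then targeted home.
        if path in FILE_PRIORITY:
            return FILE_PRIORITY[path]
        if fileset in FILESET_PRIORITY:
            return FILESET_PRIORITY[fileset]
        return HOME_PRIORITY.get(home, DEFAULT_PRIORITY)


    def enqueue_change(path, fileset, home, offset, length):
        # Changed data is moved up or down in the queue based on its priority.
        prio = effective_priority(path, fileset, home)
        heapq.heappush(sync_queue, (prio, next(_order), (path, offset, length, home)))


    def next_change_to_sync():
        # The gateway drains the highest-priority pending change first.
        return heapq.heappop(sync_queue)[2] if sync_queue else None

With this ordering, changes belonging to priority files, filesets, or homes are pushed to the home site before other pending changes, even if they were queued later.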

Prioritization of files :: Marking a priority on files in the cache, so that all changed offset-length pairs of a priority file take precedence at the selected gateway (GW). This ensures that the available bandwidth and resources are utilized first to sync priority data, which is especially useful when bandwidth or resources are limited. When many files have changed but only a few of them are priority files, we can provide bandwidth...
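The sketch below illustrates this file-level rule under an assumed per-cycle byte budget at the gateway: changed offset-length pairs of files marked as priority are drained first, and whatever budget remains goes to other changes. The budget model and function names are assumptions for illustration, not part of the original disclosure.

    def drain_cycle(pending_changes, priority_files, byte_budget):
        """pending_changes: list of (path, offset, length) tuples.
        Returns the changes sent this cycle, priority files first."""
        # Stable sort: changes of priority files come before all others,
        # original queue order is preserved within each group.
        ordered = sorted(pending_changes, key=lambda c: c[0] not in priority_files)
        sent = []
        for path, offset, length in ordered:
            if length > byte_budget:
                break            # bandwidth budget for this cycle is exhausted
            byte_budget -= length
            sent.append((path, offset, length))
        return sent


    pending = [("/cache/fs1/log.txt", 0, 4096),
               ("/cache/fs1/orders.db", 8192, 1024),
               ("/cache/fs1/tmp.bin", 0, 2048)]
    drain_cycle(pending, priority_files={"/cache/fs1/orders.db"}, byte_budget=4096)
    # -> only the priority file's changed region fits this cycle; the remaining
    #    changes wait for a later cycle.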