Method for distributing and validating data files in a clustered computing environment

IP.com Disclosure Number: IPCOM000015918D
Original Publication Date: 2002-May-27
Included in the Prior Art Database: 2003-Jun-21

Publishing Venue

IBM

Abstract

Data distribution in clustered computing environments has been an issue in high performance computing for many years. However, two recent developments have introduced new issues in this problem domain:

• The recent rise of “commodity clusters”, primarily built with Linux on x86 hardware
• Data growth in the Life Sciences industry, particularly as it relates to genomic and proteomic data

Linux clusters, commonly known as “Beowulf” clusters, are increasingly being sought as cost-effective solutions to problems requiring high performance computing systems. A Linux cluster interconnects a group of servers (typically x86-based servers connected via 10/100 Mbps Ethernet) and runs parallel or “embarrassingly parallel” applications across them. A considerable challenge in this environment is moving data across the cluster in an efficient, reliable, and dynamic manner. Further, the problem grows linearly with the size of the cluster.