Peer To Peer (P2P) Computer Data Backup
Publication Date: 2002-Aug-09
The IP.com Prior Art Database
Wayne Gramlich: INVENTOR [+2]
[ IPCOM000000001S originally published 2001-05-01 16:44 UTC ] Disclosed is a way to provide data backup and recovery using peer-to-peer (P2P) technology. Backup data can be stored on local machines (for example, machines sharing a local Ethernet). Advantages include automation of backup and recovery, reduction in equipment and storage costs, more frequent backups, and streamlining of offsite backup functionality. Much of this functionality is simply the result of applying RAID techniques to the problem of data backup. Other useful functionality is possible with a minimum of technology.
This disclosure describes a system for backing up multiple networked computers or other storage systems connected by various communication systems, including wireless and infrared links; in many cases, no configuration or hardware purchase will be required: just install and forget.
The invention includes three parts. First, detect computers to cooperate with. Second, keep track of files. Third, store the files in a safe place. (The second and third parts may be combined in some implementations.) Using this invention, no computer needs to have special "server" status or special hardware, or be manually configured in any way.
For completely automated backup, computers connected by a TCP/IP network may find cooperative computers by scanning for others whose IP addresses share the same network prefix, that is, addresses on the same subnet. On Ethernet, a broadcast message can serve the same purpose. For restore, a computer with the same Ethernet card, IP number, or name as a backed-up computer, but missing files, may have its files replaced automatically. The system may also be configured manually, including over a network.
In practice, many computers have substantial amounts of free disk space. The program would allocate some portion of this space to storing backup data from other nodes. If space became scarce, installing a new hard disk in one or two nodes would be sufficient intervention.
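The allocation of free disk space to backup storage could be sketched as below. This is only an illustration: the function names, the 25% fraction, and the 1 GiB reserve are assumptions chosen for the example and do not come from the disclosure.

```python
import shutil


def backup_quota_bytes(free_bytes: int,
                       fraction: float = 0.25,
                       reserve_bytes: int = 1 << 30) -> int:
    """Return how many bytes of free space to offer to other nodes.

    `fraction` and `reserve_bytes` are hypothetical policy knobs: a reserve
    is kept back for the local user, and only a fraction of the remainder
    is offered for backup data.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    usable = max(0, free_bytes - reserve_bytes)
    return int(usable * fraction)


def local_backup_quota(path: str = ".") -> int:
    """Apply the policy above to the volume that holds `path`."""
    return backup_quota_bytes(shutil.disk_usage(path).free)
```

A node would presumably recompute this figure periodically, so that the space offered to its peers shrinks as the local user fills the disk.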
The basic technology to keep track of which files are backed up, and what piece of media they are backed up on, is in use in backup systems today and is in the prior art. This system will simply treat each allocated area of storage ("node") as a separate piece of media. Several nodes may be in one computer, so that a standalone computer with multiple disks can be protected from a single-disk failure. The system can be extended to allow selective restore, revision tracking, and version control. Because the backup storage is always available, it can provide instant automated backup. Since several nodes cooperate to back up each of them, various RAID (Redundant Arrays of Independent Disks) technologies may be used to improve functionality. For example, the backup of several files of similar length and arbitrary content, each held on a different node, may occupy a space no bigger than the longest of them, as with RAID parity.
The system can handle offsite backup by treating removable media or networked storage as another node, and copying all files onto that node. The system may be configured to copy all files to nodes "nearby" the offsite node in terms of network speed before the offsite backup is begun. Various issues in scheduling the swapping of removable media, and other issues of offsite backup including automated determination of and adaptation to network conditions, may be solved with techniques in use in standard backup systems.
Incremental backup is handled simply by storing only new files to the incremental node; restoration will require little or no extra bookkeeping, since the system already handles files on multiple nodes. Files may be encrypted, so that no access lists or trusted systems are required. Disk space may be conserved by compressing files, by a variation of copy-on-write, or by any of several RAID techniques.
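The space claim for files of similar length can be illustrated with XOR parity, the technique used by parity-based RAID levels. The sketch below is a minimal example under stated assumptions: the function names are hypothetical, files are held in memory as byte strings, and the length of each original file is assumed to be recorded in the backup catalog.

```python
def xor_parity(blobs: list[bytes]) -> bytes:
    """XOR all blobs together, treating shorter blobs as zero-padded.

    The result is exactly as long as the longest input, however many
    inputs there are.
    """
    width = max((len(b) for b in blobs), default=0)
    parity = bytearray(width)
    for blob in blobs:
        for i, byte in enumerate(blob):
            parity[i] ^= byte
    return bytes(parity)


def recover(parity: bytes, survivors: list[bytes], lost_length: int) -> bytes:
    """Rebuild the single missing blob from the parity and the survivors.

    `lost_length` must come from the catalog, because the zero padding
    added for the parity calculation has to be trimmed off again.
    """
    rebuilt = xor_parity([parity, *survivors])
    return rebuilt[:lost_length]


if __name__ == "__main__":
    files = [b"report for node A", b"spreadsheet from node B!", b"notes"]
    parity = xor_parity(files)
    # Simulate losing the second file and rebuilding it from the rest.
    restored = recover(parity, [files[0], files[2]], len(files[1]))
    print(restored == files[1])
```

This protects against the loss of any one of the files, provided each file and the parity block are stored on different nodes. Surviving a second simultaneous loss would need additional redundancy.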
The system can be used for installing software on new computers, restoring preset configurations, etc., and for detecting unauthorized changes such as those caused by a computer virus. Additional improvements will be apparent to those skilled in backup technology, network hardware or software technology, system administration, or RAID technology. The accompanying text file contains more detail on these and other innovations.
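One way the detection of unauthorized changes might work is to compare a manifest of file digests recorded at backup time against the current state of the disk. The sketch below assumes SHA-256 digests and hypothetical function names; the disclosure does not specify a mechanism.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to `root`) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_manifests(baseline: dict[str, str],
                   current: dict[str, str]) -> dict[str, list[str]]:
    """Report files added, removed, or changed since the baseline."""
    common = baseline.keys() & current.keys()
    return {
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
        "changed": sorted(k for k in common if baseline[k] != current[k]),
    }
```

A changed digest alone does not distinguish a virus from a legitimate edit, so a real system would still need a policy for which paths are expected to stay fixed (for example, installed program files).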
Abstract: Disclosed is a way to provide data backup and recovery using peer-to-peer (P2P) technology. Backup data can be stored on local machines (for example, machines sharing a local Ethernet). Advantages include automation of backup and recovery, reduction in equipment and storage costs, more frequent backups, and streamlining of offsite backup functionality. Much of this functionality is simply the result of applying RAID techniques to the problem of data backup. Other useful functionality is possible with a minimum of technology.
It is common for computer systems to be run with no backup storage of their data. When data is backed up, solutions frequently include extra hardware and extra chores, which may be neglected or done infrequently. Automated solutions typically involve a central server that must be purchased, configured, and maintained. This disclosure describes a system for backing up multiple networked computers; in many cases, no configuration or hardware purchase will be required: just install and forget.
The invention includes three parts. First, detect computers to cooperate with. Second, keep track of files. Third, store the files in a safe place. (The second and third parts may be combined in some implementations.) Using this invention, no computer needs to have special "server" status or special hardware, or be manually configured in any way.
Cooperating Computers
If computers have been assigned IP numbers from an address range allocated to the organization that owns them, any computer whose IP address differs only in the subnet or host portion is likely to be local. All computers in the same subnet can be polled on a preestablished port to find cooperative systems running the software. (Computers connected through dialup connections typically receive addresses from a recognizable range, so this condition can be detected and some other means used to determine which computer to back up to.) If computers are connected with Ethernet, a broadcast message will reach any computer sharing the local network. Other ways of determining "local" computers without user action will be apparent, depending on the hardware and software configuration. Computers can scan periodically, e.g., once per day, and determine the other computers they should be working with. They may also retrieve configuration lists from each other. Optionally, a list of cooperating computers can be configured manually; these computers need not be physically nearby. (Configuration also allows for excluding computers that would otherwise be determined to be "local" but are, e.g., owned by a different cost center.)
For this disclosure, "nodes" will refer to computers participating in this backup scheme, however determined. "Program" will refer to the software running on each of the nodes that performs the functionality described. "P2PB" will refer to the peer-to-peer backup system being disclosed here.
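The subnet polling described above can be sketched as follows. This is a minimal illustration under several assumptions: the port number 47000 and all function names are hypothetical (the disclosure says only "a preestablished port"), the node is assumed to know its own address and prefix length, and an accepted TCP connection stands in for whatever handshake a real P2PB program would use to identify itself.

```python
import ipaddress
import socket

P2PB_PORT = 47000  # hypothetical; the disclosure names no specific port


def candidate_peers(interface: str) -> list[str]:
    """List every other host address on the subnet of `interface`.

    `interface` is an address with prefix length, e.g. "192.168.1.10/24".
    """
    iface = ipaddress.ip_interface(interface)
    return [str(h) for h in iface.network.hosts() if h != iface.ip]


def is_cooperative(host: str, port: int = P2PB_PORT,
                   timeout: float = 0.2) -> bool:
    """Return True if something accepts a TCP connection on the agreed port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def discover(interface: str) -> list[str]:
    """Poll each address on the local subnet and keep those that respond."""
    return [h for h in candidate_peers(interface) if is_cooperative(h)]
```

Polling addresses one at a time is slow on a large subnet, since each unanswered address costs up to one timeout. That is tolerable for a scan that runs about once per day, but a broadcast message, where the network supports it, would reach the same computers in a single step.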
In practice, many computers have substantial amounts of free...