On Demand Migration for Content/Resource Management Systems
Original Publication Date: 2005-Oct-06
Included in the Prior Art Database: 2005-Oct-06
Even in the most advanced Content Management systems today, there is no good way to handle content storage when the data repository is approaching its capacity. Typically, we set a threshold and hope that no file we store is large enough to push usage past this threshold and beyond the storage device's capacity. For example, if the threshold is set to 70%, current usage is 69%, and the file we are trying to store needs 32% of the device's capacity, the store will fail (69% + 32% > 100%). The failure is made worse by the experience of the user, who may sit at a terminal for hours (imagine a large file) and watch the transfer come close to completion (the device can accept 100 - 69 = 31 of the 32 percentage points the file needs, about 97% of the file), only to be told that the disk ran out of space. This happens because we fail to do the proper calculations when a file transfer to the data repository is initiated. There is no reason not to warn the user before the file transfer takes place. We cannot assume that the migrator will always migrate successfully and clear enough space for the next incoming file, nor should we prevent the user from importing a file when the transfer is in fact possible. Thus, the migration scheme needs to be improved.
Other migration schemes check file system space when the import occurs and, if the file system does not have sufficient space, return an error telling the user that the file cannot be transferred. This is simply not true: the file transfer CAN take place; it only requires that objects be migrated first.
Yet another scheme is to kick off the migrator once an import takes place. This is also insufficient in the following scenario: imagine you are importing a 500 MB file and, upon import, the RM sees only 200 MB of free space. The migrator then locates 300 MB of files and starts to migrate them to TSM, which is known to be a slower medium (the backup/failover/archive storage area must by definition be slower than the RM disk, since otherwise it would be the primary RM storage area). In theory, the 500 MB file transfer will complete. But this is not guaranteed, because clearing space to TSM is slow while the import to the RM disk is fast. The result is that, some time after 200 MB has been imported, the user gets a timeout from the RM during the file transfer. Again, this wastes the user's (customer's) time unnecessarily.
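The threshold arithmetic in the example above can be evaluated before any data is transferred. The following is a minimal sketch, not the disclosed implementation: the function name `precheck_import` and its three result labels are hypothetical, and all quantities are expressed as percentages of the storage device's capacity, as in the text.

```python
def precheck_import(used_pct: float, file_pct: float,
                    threshold_pct: float = 70.0) -> str:
    """Classify an import before it starts.

    All arguments are percentages of the device's total capacity.
    The result labels are illustrative names chosen for this sketch.
    """
    after = used_pct + file_pct
    if after <= threshold_pct:
        return "fits"                 # usage stays at or under the threshold
    if after <= 100.0:
        return "fits_over_threshold"  # fits on the device but crosses the threshold
    return "needs_migration"          # cannot fit unless space is freed first


# The example from the text: 70% threshold, 69% used, file needs 32%.
result = precheck_import(69.0, 32.0)
```

A check of this kind lets the system warn the user, or start migration, at the moment the import is requested rather than after most of the file has been sent.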
Any given file transfer falls into one of four scenarios. As we explain below, this disclosure handles each scenario better than the current implementations of Content/Resource Management Systems do.
Below we use the term "migrator" loosely: it is any process that moves objects from one storage space to another. Its implementation is therefore straightforward and is not discussed further.
First scenario: when an import is initiated, the migrator sees that sufficient space exists, so the file transfer can complete successfully. The import proceeds normally.
Second scenario: the migrator sees that the storage repository does not have enough disk space and searches for migration candidates. The candidates' total size must be greater than or equal to the shortfall, that is, the size of the incoming file minus the space still available under the storage device's threshold. If the search succeeds, the migrator migrates the candidates and clears disk space while the file transfer takes place. As the inbound stream (file import) and the outbound stream (migration) proceed, we can calculate their transfer rates. Let S be the amount of space we need to free, T the size of the file to transfer, O the timeout value, F the file import rate, and M the migration rate. If the inequality (T/F + O) - S/M < 0 holds for any extended period of time, we can predict ahead of time that the file transfer will not succeed: the storage management system cannot clear space fast enough, and the actual file import will exceed its timeout. In this scenario, we do clear space fast enough, and the file transfer completes.
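The candidate search and the rate check in the second scenario can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the names `shortfall`, `pick_candidates`, and `transfer_will_fail` are hypothetical, candidates are chosen largest-first purely for simplicity (the text does not prescribe a selection policy), and sizes and rates are assumed to use consistent units (for example MB and MB per second, with the timeout in seconds).

```python
from typing import Dict, List, Optional


def shortfall(file_size: float, free_space: float) -> float:
    """Space S that must be freed before the file fits (0 if it already fits)."""
    return max(0.0, file_size - free_space)


def pick_candidates(objects: Dict[str, float],
                    needed: float) -> Optional[List[str]]:
    """Return names of objects whose sizes total at least `needed`.

    Returns None when no such set exists. Largest-first ordering is an
    arbitrary policy chosen for this sketch.
    """
    chosen: List[str] = []
    total = 0.0
    for name, size in sorted(objects.items(), key=lambda kv: kv[1], reverse=True):
        if total >= needed:
            break
        chosen.append(name)
        total += size
    return chosen if total >= needed else None


def transfer_will_fail(S: float, T: float, O: float, F: float, M: float) -> bool:
    """True when (T/F + O) - S/M < 0, i.e. freeing S takes longer than
    the import time plus the timeout allowance."""
    return (T / F + O) - S / M < 0


# 500 MB import with 200 MB free, as in the earlier example.
S = shortfall(500.0, 200.0)
candidates = pick_candidates({"a": 100.0, "b": 250.0, "c": 80.0}, S)
slow = transfer_will_fail(S=S, T=500.0, O=60.0, F=50.0, M=1.0)
fast = transfer_will_fail(S=S, T=500.0, O=60.0, F=50.0, M=10.0)
```

Because F and M are measured while both streams are running, a real system would re-evaluate the inequality periodically and act only if it holds for an extended period, as the text describes; the rates used here are made-up sample values.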
Third scenario: as in the second scenario, the import sees that there is not enough space and kicks off the migrator on demand, but the migrator FAILS to find migration candidates, in which case it warns the user and perhaps even disallows the file trans...