
Stochastic Identification of Duplicate Computer Files
Disclosure Number: IPCOM000033691D
Publication Date: 2004-Dec-23

Publishing Venue

The Prior Art Database


This invention inserts a stochastic filtering step before any comparison of actual file contents. It calculates a K-bit checksum for each candidate file and discards from further consideration every file whose checksum is unique, since identical files always produce identical checksums and such a file therefore cannot duplicate any other candidate. Actual file contents are then compared only between files having identical checksums, which further reduces the time required to confirm actual duplicates.
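
The two-stage procedure described above can be illustrated with a minimal sketch. It is not taken from the disclosure: the choice of CRC-32 as the checksum (that is, K = 32), the chunk size, and the function names `checksum` and `find_duplicates` are all assumptions made for illustration.

```python
import filecmp
import zlib
from collections import defaultdict


def checksum(path, chunk_size=65536):
    """Return a K-bit checksum of a file; CRC-32 (K = 32) is an assumed choice."""
    crc = 0
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc


def find_duplicates(paths):
    """Return lists of paths whose contents are byte-for-byte identical."""
    # Stage 1: bucket the candidate files by checksum.
    buckets = defaultdict(list)
    for p in paths:
        buckets[checksum(p)].append(p)

    duplicates = []
    for bucket in buckets.values():
        if len(bucket) < 2:
            continue  # unique checksum: this file cannot have a duplicate

        # Stage 2: compare actual contents, but only within one bucket,
        # because equal checksums do not guarantee equal contents.
        groups = []
        for p in bucket:
            for group in groups:
                if filecmp.cmp(group[0], p, shallow=False):
                    group.append(p)
                    break
            else:
                groups.append([p])
        duplicates.extend(g for g in groups if len(g) > 1)
    return duplicates
```

Calling `find_duplicates` with a list of file paths returns one list per set of confirmed duplicates. The content comparison in the second stage stays necessary because two different files can share a checksum; the filter only narrows the set of files that must be compared.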