Leverage storage devices' antivirus interface to gain knowledge about file access frequency
Publication Date: 2015-Oct-01
The IP.com Prior Art Database
The access frequency of a file is a particularly interesting parameter with regard to the importance and value of its content. However, well-established filesystems do not natively provide such statistics. We present a method for calculating access statistics on storage devices or computers featuring an interface to a centralized antivirus solution. Furthermore, since this method can initiate content analysis during file creation or modification, the usual practice of scheduled filesystem scans for new or changed content can be avoided.
The invention addresses some problems in the domain of Information Lifecycle Management (ILM). Information in this context can be viewed as files stored on a data storage system. ILM depends on metadata about these files, i.e. information about a particular file and its content. Such data is kept in an information inventory. The information inventory not only provides information about known files, such as where a file is stored, when it was created, its size, or when it was last accessed; it may also use content classification systems that provide further metadata on the file's content, for example whether the file is a picture taken with a camera, a music track, or a contract document. All of this metadata is consolidated for each file in the information inventory's metadata catalog. ILM decisions such as file archival, definition of retention periods, disposal, etc., are based on the information provided by the information inventory.
Among the many parameters which characterize a particular file for ILM is the file's access frequency, i.e. the number of accesses to the file within a certain time span. This is one of the most valuable inputs for the calculation of a file's retention period or its disposability: the more often a file has been accessed in the recent past, the more important it is and the less likely it is to be a candidate for disposal. File access frequency can thus be regarded as an indicator of the importance of the file's content. The invention described herein provides a method of determining a file's access frequency, provided the file is located on a storage device capable of initiating a virus scan for the files stored on it. While operating systems and storage devices usually know the latest access or modification time of a file, they do not provide information on how often a file was accessed during a certain time period.
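The definition of access frequency given above (number of accesses within a time span) can be illustrated with a minimal Python sketch. The class name `AccessLog`, its methods, and the use of plain numeric timestamps are assumptions made for illustration only; the disclosure does not prescribe a data structure or storage format.

```python
from bisect import bisect_left, bisect_right, insort
from collections import defaultdict


class AccessLog:
    """Hypothetical in-memory log of access timestamps per file path."""

    def __init__(self):
        # path -> list of access timestamps, kept sorted on insertion
        self._events = defaultdict(list)

    def record(self, path, timestamp):
        """Store one access event (e.g. derived from one scan request)."""
        insort(self._events[path], timestamp)

    def access_frequency(self, path, start, end):
        """Count recorded accesses to `path` with start <= t <= end."""
        times = self._events.get(path, [])
        return bisect_right(times, end) - bisect_left(times, start)
```

Keeping the timestamps sorted lets a query for any time span be answered with two binary searches, which is one possible design; a real inventory would more likely persist counters or events in its metadata catalog.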
Current information inventory systems usually initiate scans of file storage at certain intervals (harvesting). New and modified files therefore become known to the system only asynchronously. To discover a new file, the whole existing storage has to be traversed and the list of existing files has to be compared to the list of files found during the previous scan. The described invention provides a method to inform an information inventory system about new files. Since the method relies on the initiation of a virus scan on a file's content, a classification system could be invoked synchronously in order to obtain current classification metadata. The information inventory system is thereby relieved of scanning whole filesystems in order to discover new files and have them classified.
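The contrast between harvesting and scan-triggered notification can be sketched in Python as follows. This is an illustration under stated assumptions, not the disclosed implementation: `harvest_new_files`, `InformationInventory`, the `on_scan_request` hook, and the `classify` callable are hypothetical names, and how a real storage device hands over a scan request is not modeled.

```python
def harvest_new_files(previous_scan, current_scan):
    """Harvesting: new files are those found now but not in the previous scan."""
    return set(current_scan) - set(previous_scan)


class InformationInventory:
    """Hypothetical inventory updated per scan request instead of by harvesting."""

    def __init__(self, classify):
        self._classify = classify  # assumed callable: content -> label
        self.catalog = {}          # path -> metadata dict

    def on_scan_request(self, path, content):
        """Assumed hook, invoked when the storage device requests a virus scan.

        Registers the file if it is unknown and refreshes its classification
        synchronously. Returns True if the file was new to the inventory.
        """
        is_new = path not in self.catalog
        entry = self.catalog.setdefault(path, {})
        entry["classification"] = self._classify(content)
        return is_new
```

With harvesting, both complete file lists must be produced and compared for every run; with the hook, each created or modified file is registered and classified individually at the moment its scan is requested.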
Many storage systems, namely network attached storage systems (NAS), come with an interface to antivirus software. These systems employ the antivirus software to analyze a file whenever it is created or written, and - option...