
A self-learning mechanism to improve pattern removal hit rate
Disclosure Number: IPCOM000249263D
Publication Date: 2017-Feb-15
Document File: 5 page(s) / 86K

Publishing Venue

The Prior Art Database


In industry storage systems, pattern removal is usually designed and implemented with a static pattern database. This disclosure describes a new design and implementation based on a self-learning method that samples data at random time points during host input/output. With this new mechanism, the pattern database is customized for the user's host input/output, which can provide a more predictable pattern removal hit rate and more efficient pattern detection.



A self-learning mechanism to improve pattern removal hit rate

1. What is pattern removal: A storage system divides host IO (Input/Output) into smaller data chunks and then checks these chunks for known patterns. If a known pattern is detected, the data chunk is replaced with the corresponding pattern ID.

2. In a storage system where a static pattern database is designed and implemented for pattern detection, the pattern database is fixed and comes from known structures in the industry, e.g. zero-filled blocks and known operating system structures. Without further adaptation to the customer's hosts and applications, the hit rate is unpredictable and not efficient enough. With this self-learning mechanism, the storage system learns some key characteristics of host IO (Input/Output) and then generates a customized pattern database for the customer, which improves the pattern hit rate with a predictable result.

3. In this mechanism, host IO (Input/Output) is divided into 8K data chunks. A fingerprint is calculated for each 8K data chunk and then compared against the existing pattern DB (pattern database) in the storage system. Once a fingerprint matches, the data is removed and replaced with the pattern ID. How this mechanism works:





Additional notes for the above design chart: Pattern DB means pattern database. IO means Input/Output from the host.
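
The chunking, fingerprint matching, and self-learning steps described in items 1 to 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the disclosure does not name a fingerprint function (SHA-256 is assumed here), and the names PatternDB, sample_rate, promote_after, and write_io are hypothetical. Random sampling is modeled as a per-chunk probability, and a sampled chunk is promoted into the pattern DB after it has been seen a fixed number of times; the actual sampling and promotion policy may differ.

```python
import hashlib
import random
from collections import Counter

CHUNK_SIZE = 8 * 1024  # host IO is divided into 8K chunks, as in item 3


def fingerprint(chunk: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the unspecified fingerprint function.
    return hashlib.sha256(chunk).digest()


class PatternDB:
    """Hypothetical pattern DB that starts static and learns from host IO."""

    def __init__(self, sample_rate=0.1, promote_after=3, rng=None):
        self.patterns = {}           # fingerprint -> pattern ID
        self.data = {}               # pattern ID -> original chunk bytes
        self.candidates = Counter()  # fingerprint -> times seen in samples
        self.sample_rate = sample_rate      # assumed sampling probability
        self.promote_after = promote_after  # assumed promotion threshold
        self.rng = rng or random.Random()

    def add_pattern(self, chunk: bytes) -> int:
        """Register a chunk as a known pattern and return its pattern ID."""
        fp = fingerprint(chunk)
        if fp not in self.patterns:
            pid = len(self.patterns)
            self.patterns[fp] = pid
            self.data[pid] = chunk
        return self.patterns[fp]

    def lookup(self, chunk: bytes):
        """Return the pattern ID for a chunk, or None if it is not known."""
        pid = self.patterns.get(fingerprint(chunk))
        # Compare bytes as well, so a fingerprint collision cannot lose data.
        if pid is not None and self.data[pid] == chunk:
            return pid
        return None

    def learn(self, chunk: bytes) -> None:
        """Randomly sample unmatched chunks; promote the recurring ones."""
        if self.rng.random() >= self.sample_rate:
            return  # this chunk was not sampled
        fp = fingerprint(chunk)
        self.candidates[fp] += 1
        if self.candidates[fp] >= self.promote_after:
            self.add_pattern(chunk)
            del self.candidates[fp]


def write_io(db: PatternDB, payload: bytes):
    """Split one host write into chunks and replace known patterns by ID."""
    out = []
    for off in range(0, len(payload), CHUNK_SIZE):
        chunk = payload[off:off + CHUNK_SIZE]
        if len(chunk) < CHUNK_SIZE:
            out.append(("data", chunk))  # short tail is stored as plain data
            continue
        pid = db.lookup(chunk)
        if pid is not None:
            out.append(("pattern", pid))  # pattern removal hit
        else:
            out.append(("data", chunk))
            db.learn(chunk)
    return out
```

In this sketch, a zero-filled block can be seeded with add_pattern to model the static part of the pattern DB, while chunks that recur in a particular host's IO are promoted by learn, so later writes of the same chunk are returned as ("pattern", id) instead of raw data.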