Framework for Stream De-duplication using Biased Reservoir Sampling

IP.com Disclosure Number: IPCOM000216344D
Publication Date: 2012-Mar-31
Document File: 6 page(s) / 93K

Publishing Venue

The IP.com Prior Art Database

Abstract

This work presents a novel Reservoir Sampling based Bloom Filter (RSBF), a data structure that combines the concepts of reservoir sampling and Bloom filters for approximate detection of duplicates in evolving data streams. It shows that RSBF currently offers the lowest False Negative Rate (FNR) and the best convergence rate, both better than those of the Stable Bloom Filter (SBF) while using the same memory. Empirical analysis on varied datasets shows up to a 2x improvement in FNR and better convergence rates compared to SBF, with a comparable False Positive Rate (FPR).
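
The abstract does not give the RSBF algorithm itself, so the following Python sketch is only an illustration of the general idea of pairing Bloom filter membership tests with reservoir-style probabilistic insertion. The class name, the parameters (`bits_per_filter`, `num_filters`, `seed`), the salted SHA-256 hashing, and the random-bit eviction rule are all assumptions made for this example and should not be read as the disclosed method.

```python
import hashlib
import random


class ReservoirBloomSketch:
    """Illustrative sketch only; NOT the RSBF algorithm from the disclosure."""

    def __init__(self, bits_per_filter=1024, num_filters=4, seed=0):
        self.m = bits_per_filter          # bits in each filter (assumed value)
        self.k = num_filters              # number of filters (assumed value)
        self.filters = [bytearray(self.m) for _ in range(self.k)]
        self.seen = 0                     # stream elements processed so far
        self.rng = random.Random(seed)

    def _positions(self, item):
        # One bit position per filter, derived from a salted SHA-256 digest.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield i, int.from_bytes(digest[:8], "big") % self.m

    def is_duplicate(self, item):
        """Report whether item looks like a duplicate, then possibly insert it."""
        self.seen += 1
        positions = list(self._positions(item))
        duplicate = all(self.filters[i][p] for i, p in positions)
        # Reservoir-style insertion (assumption): always insert while the
        # stream is shorter than m, afterwards with probability m / seen.
        if not duplicate and self.rng.random() < min(1.0, self.m / self.seen):
            for i, p in positions:
                # Clear one random bit first so the filter does not saturate,
                # then set the bit for the new item.
                self.filters[i][self.rng.randrange(self.m)] = 0
                self.filters[i][p] = 1
        return duplicate


sketch = ReservoirBloomSketch()
first = sketch.is_duplicate("a")   # nothing stored yet, so not a duplicate
second = sketch.is_duplicate("a")  # bits set by the first call are found
```

Clearing bits is what introduces false negatives in a structure like this, while hash collisions cause false positives; these correspond to the FNR and FPR that the abstract compares against SBF.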