Par-BF: A parallel partitioned Bloom filter for dynamic data sets

Yi Liu, Xiongzi Ge, David Hung Chang Du, Xiaoxia Huang

Research output: Contribution to journalArticlepeer-review

6 Scopus citations

Abstract

Compared with a hash table, a Bloom filter (BF) is more space efficient for supporting fast matching through a controllable and acceptable false positive probability. The space size of the basic BF is predetermined based on the expected number of elements to be stored. However, we cannot predict the space scale of a BF for dynamic sets. It is still challenging for the two existing solutions, scalable BF (SBF) and dynamic BF (DBF), to manipulate dynamic data sets with low memory overhead but achieving high performance. This article presents a partitioned BF (Par-BF) for dynamic data sets. Compared with DBF and SBF, Par-BF is able to leverage a sweet spot between high performance and low overhead by a group of formulas to support fast concurrent matching. Specifically, the size and the range of the false positive probability in Par-BF can be deliberately derived. From our trace-driven experimental results, the input/output operations per second of Par-BF outperforms that of DBF and SBF by 10× to 14× and by 3× to 8×, respectively. Besides, through our proposed garbage collection policy, Par-BF consumes less than half of the memory usage of SBF.

Original languageEnglish (US)
Pages (from-to)259-275
Number of pages17
JournalInternational Journal of High Performance Computing Applications
Volume30
Issue number3
DOIs
StatePublished - Aug 1 2016

Keywords

  • Bloom filter
  • Par-BF
  • dynamic sets
  • false positive
  • fast matching

Fingerprint Dive into the research topics of 'Par-BF: A parallel partitioned Bloom filter for dynamic data sets'. Together they form a unique fingerprint.

Cite this