A Forest-structured Bloom Filter with flash memory

Lu Guanlin, Biplob Debnath, David H.C. Du

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

15 Scopus citations

Abstract

A Bloom Filter (BF) is a probabilistic data structure that compactly represents a set of elements (keys). It is widely used to identify efficiently whether a key has been seen before while using a minimal amount of recording space. BFs are heavily used in chunking-based data de-duplication. Traditionally, a BF is implemented as an in-RAM data structure; hence its size is limited by the RAM available on the machine. For applications such as data de-duplication that require a BF larger than the available RAM, it becomes necessary to store the BF on a secondary storage device. Since BF operations are inherently random in nature, and magnetic disks perform poorly on random read and write operations, a magnetic disk is not a good fit for storing a large BF. Flash memory based Solid State Drives (SSDs) are considered an emerging storage device with superior performance that can potentially replace disks as the preferred secondary storage. However, several special characteristics of flash memory make designing a flash memory based BF very challenging. In this paper, our goal is to design an efficient flash memory based BF that is fully aware of these physical characteristics. To this end, we propose a Forest-structured BF design (FBF). FBF uses a combination of RAM and flash memory: the BF is stored on flash, while RAM helps to mitigate the impact of the slow write performance of flash memory. In addition, the in-flash BF is organized in a forest-like structure in order to improve lookup performance. Our experimental results show that the FBF design achieves 2 times faster processing speed with 50% fewer flash write operations when compared with existing flash memory based BF designs.
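For readers unfamiliar with the baseline structure, the following is a minimal sketch of a conventional in-RAM Bloom filter in Python. It is not the FBF design proposed in the paper, whose layout the abstract does not specify. The class name, the parameters `num_bits` and `num_hashes`, and the SHA-256 double-hashing scheme are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Plain in-RAM Bloom filter (illustrative only; not the paper's FBF)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        if num_bits <= 0 or num_hashes <= 0:
            raise ValueError("num_bits and num_hashes must be positive")
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed eight to a byte.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Assumed hashing scheme: derive all positions from two 64-bit
        # slices of a single SHA-256 digest (double hashing).
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # forced odd, so never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: bytes) -> None:
        # Record a key by setting every bit position it hashes to.
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key: bytes) -> bool:
        # True means "possibly seen" (false positives can occur);
        # False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )
```

Each `add` or lookup touches `num_hashes` bit positions scattered across the whole bit array. This is the random access pattern that the abstract identifies as a poor fit for magnetic disks once the filter no longer fits in RAM.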

Original language: English (US)
Title of host publication: 2011 IEEE 27th Symposium on Mass Storage Systems and Technologies, MSST 2011
State: Published - 2011
Event: 2011 IEEE 27th Symposium on Mass Storage Systems and Technologies, MSST 2011 - Denver, CO, United States
Duration: May 23 2011 → May 27 2011

Publication series

Name: IEEE Symposium on Mass Storage Systems and Technologies
ISSN (Print): 2160-1968

Other

Other: 2011 IEEE 27th Symposium on Mass Storage Systems and Technologies, MSST 2011
Country: United States
City: Denver, CO
Period: 5/23/11 → 5/27/11
