A Bloom Filter (BF) is a probabilistic data structure that compactly represents a set of elements (keys). It is widely used to efficiently identify keys that have been seen before while using a minimal amount of space, and it is heavily used in chunking-based data de-duplication. Traditionally, a BF is implemented as an in-RAM data structure, so its size is limited by the RAM available on the machine. For applications such as data de-duplication that require a BF larger than the available RAM, it becomes necessary to store the BF on a secondary storage device. Because BF operations are inherently random in nature, and magnetic disks perform poorly on random read and write operations, disks are not a good fit for storing a large BF. Flash memory based Solid State Drives (SSDs) are considered an emerging storage technology with superior performance that can potentially replace disks as the preferred secondary storage devices. However, several special characteristics of flash memory make designing a flash memory based BF very challenging. In this paper, our goal is to design an efficient flash memory based BF that is fully aware of these physical characteristics. To this end, we propose a Forest-structured BF design (FBF). FBF uses a combination of RAM and flash memory: the BF is stored on flash, while RAM helps to mitigate the impact of the slow write performance of flash memory. In addition, the in-flash BF is organized in a forest-like structure in order to improve lookup performance. Our experimental results show that the FBF design achieves 2 times faster processing speed with 50% fewer flash write operations when compared with existing flash memory based BF designs.
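For readers unfamiliar with the basic structure, the following is a minimal sketch of a conventional in-RAM BF, not the FBF design proposed in the paper. The class name `BloomFilter`, its parameters, the use of SHA-256 with double hashing, and the standard sizing formulas are all assumptions made for illustration. Each insert or lookup touches several scattered bit positions, which illustrates why BF operations are random in nature.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative in-RAM Bloom filter (hypothetical; not the paper's FBF)."""

    def __init__(self, expected_items, false_positive_rate=0.01):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(
            1,
            math.ceil(-expected_items * math.log(false_positive_rate)
                      / (math.log(2) ** 2)),
        )
        self.num_hashes = max(
            1, round(self.num_bits / expected_items * math.log(2))
        )
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key):
        # Derive k bit positions from one digest via double hashing (an assumption).
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force a non-zero step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key):
        # Set every bit selected for this key.
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # True means "possibly seen"; False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


# Usage: record chunk fingerprints, then test whether a chunk was seen before.
bf = BloomFilter(expected_items=1000)
bf.add(b"chunk-fingerprint-1")
seen = b"chunk-fingerprint-1" in bf  # an added key is always reported as seen
```

A lookup can return a false positive but never a false negative, which is why de-duplication systems typically use a BF to rule out chunks that are definitely new before consulting a full index.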
|Original language||English (US)|
|Title of host publication||2011 IEEE 7th International Workshop on Storage Network Architecture and Parallel I/Os, SNAPI 2011|
|State||Published - Dec 1 2011|
|Event||2011 IEEE 7th International Workshop on Storage Network Architecture and Parallel I/Os, SNAPI 2011 - Denver, CO, United States|
Duration: May 25 2011 → May 25 2011