TY - GEN
T1 - Hot data identification for flash-based storage systems using multiple bloom filters
AU - Park, Dongchul
AU - Du, David H C
PY - 2011
Y1 - 2011
N2 - Hot data identification can be applied to a variety of fields. Particularly in flash memory, it has a critical impact on performance (due to garbage collection) as well as life span (due to wear leveling). Although hot data identification is an issue of paramount importance in flash memory, it has received little investigation. Moreover, all existing schemes focus almost exclusively on frequency. However, recency must be considered equally with frequency for effective hot data identification. In this paper, we propose a novel hot data identification scheme adopting multiple bloom filters to efficiently capture finer-grained recency as well as frequency. In addition to this scheme, we propose a Window-based Direct Address Counting (WDAC) algorithm to approximate an ideal hot data identification as our baseline. Unlike the existing baseline algorithm, which cannot appropriately capture recency information due to its exponential batch decay, our WDAC algorithm, using a sliding window concept, can capture very fine-grained recency information. Our experimental evaluation with diverse realistic workloads, including real SSD traces, demonstrates that our multiple bloom filter-based scheme outperforms the state-of-the-art scheme. In particular, ours not only consumes 50% less memory and requires up to 58% less computational overhead, but also improves performance by up to 65%.
AB - Hot data identification can be applied to a variety of fields. Particularly in flash memory, it has a critical impact on performance (due to garbage collection) as well as life span (due to wear leveling). Although hot data identification is an issue of paramount importance in flash memory, it has received little investigation. Moreover, all existing schemes focus almost exclusively on frequency. However, recency must be considered equally with frequency for effective hot data identification. In this paper, we propose a novel hot data identification scheme adopting multiple bloom filters to efficiently capture finer-grained recency as well as frequency. In addition to this scheme, we propose a Window-based Direct Address Counting (WDAC) algorithm to approximate an ideal hot data identification as our baseline. Unlike the existing baseline algorithm, which cannot appropriately capture recency information due to its exponential batch decay, our WDAC algorithm, using a sliding window concept, can capture very fine-grained recency information. Our experimental evaluation with diverse realistic workloads, including real SSD traces, demonstrates that our multiple bloom filter-based scheme outperforms the state-of-the-art scheme. In particular, ours not only consumes 50% less memory and requires up to 58% less computational overhead, but also improves performance by up to 65%.
KW - Bloom Filter
KW - Flash Memory
KW - Hot and Cold Data
KW - Hot Data Identification
KW - SSD
KW - WDAC
UR - http://www.scopus.com/inward/record.url?scp=79960920699&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79960920699&partnerID=8YFLogxK
U2 - 10.1109/MSST.2011.5937216
DO - 10.1109/MSST.2011.5937216
M3 - Conference contribution
AN - SCOPUS:79960920699
SN - 9781457704284
T3 - IEEE Symposium on Mass Storage Systems and Technologies
BT - IEEE Symposium on Mass Storage Systems and Technologies
T2 - 2011 IEEE 27th Symposium on Mass Storage Systems and Technologies, MSST 2011
Y2 - 23 May 2011 through 27 May 2011
ER -