Hot Data Identification with Multiple Bloom Filters: Block-Level Decision vs I/O Request-Level Decision

Dongchul Park, Weiping He, David H.C. Du

Research output: Contribution to journal, Article, peer-review

4 Scopus citations


Hot data identification is crucial for many applications, though few investigations have examined the subject. Existing studies focus almost exclusively on frequency. However, effectively identifying hot data requires considering recency and frequency equally. Moreover, previous studies make hot data decisions at the data block level. Such a fine-grained decision fits flash-based storage particularly well because its random access performance is comparable to its sequential access performance. Hard disk drives (HDDs), however, show a significant performance disparity between sequential and random access. Therefore, unlike flash-based storage, exploiting asymmetric HDD access performance requires a coarse-grained decision. This paper proposes a novel hot data identification scheme that adopts multiple bloom filters to efficiently characterize recency as well as frequency. Consequently, it not only consumes 50% less memory and incurs up to 58% less computational overhead, but also lowers false identification rates by up to 65% compared with a state-of-the-art scheme. Moreover, we apply the scheme to a next-generation HDD technology, Shingled Magnetic Recording (SMR), to verify its effectiveness. For this, we design a new SMR drive based on hot data identification with a coarse-grained decision. The experiments demonstrate the importance and benefits of accurate hot data identification, which improves the proposed SMR drive performance by up to 42%.
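The abstract does not spell out how the multiple bloom filters capture recency and frequency, so the following is only a minimal sketch of one plausible arrangement, not the authors' implementation. It assumes a small ring of bloom filters: each write sets the key in the newest filter that does not yet hold it (frequency), and after a fixed number of writes the oldest filter is erased and reused as the newest (recency). The class names, the linear age weights, and every parameter value (`num_filters`, `decay_interval`, `threshold`, filter size, hash count) are hypothetical and chosen purely for illustration.

```python
import hashlib


class BloomFilter:
    """A plain Bloom filter; one byte per bit for readability."""

    def __init__(self, num_bits=2048, num_hashes=2):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        # Derive the hash positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key):
        return all(self.bits[pos] for pos in self._positions(key))

    def clear(self):
        self.bits = bytearray(self.num_bits)


class MultiBloomHotness:
    """Hypothetical hot-data tracker built on a ring of Bloom filters."""

    def __init__(self, num_filters=4, decay_interval=512, threshold=2.0):
        self.filters = [BloomFilter() for _ in range(num_filters)]
        self.newest = 0  # index of the most recently erased filter
        self.decay_interval = decay_interval
        self.threshold = threshold
        self.writes = 0

    def record(self, key):
        n = len(self.filters)
        # Frequency: mark the key in the first filter, newest to oldest,
        # that does not already contain it.
        for age in range(n):
            f = self.filters[(self.newest + age) % n]
            if key not in f:
                f.add(key)
                break
        self.writes += 1
        if self.writes % self.decay_interval == 0:
            # Recency: erase the oldest filter and reuse it as the newest.
            self.newest = (self.newest - 1) % n
            self.filters[self.newest].clear()

    def score(self, key):
        n = len(self.filters)
        total = 0.0
        for age in range(n):
            if key in self.filters[(self.newest + age) % n]:
                total += 1.0 - age / n  # assumed weights: newer counts more
        return total

    def is_hot(self, key):
        return self.score(key) >= self.threshold


tracker = MultiBloomHotness()
for _ in range(4):
    tracker.record(1024)  # 1024 stands in for a logical block address
print(tracker.is_hot(1024), tracker.is_hot(9999))
```

In this sketch the key is an arbitrary hashable identifier. A block-level decision would pass a single block address, while a coarse-grained, request-level decision could pass an identifier for the whole I/O request instead; how the paper actually forms that identifier is not stated in the abstract.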

Original language: English (US)
Pages (from-to): 79-97
Number of pages: 19
Journal: Journal of Computer Science and Technology
Issue number: 1
State: Published - Jan 1 2018

Bibliographical note

Funding Information:
This work was supported by the Hankuk University of Foreign Studies Research Fund of Korea, and was also partially supported by the National Science Foundation (NSF) Awards of USA under Grant Nos. 1053533, 1439622, 1217569, 1305237, and 1421913. A preliminary version of the work was published in the Proceedings of MSST 2012. © 2018 Springer Science + Business Media, LLC & Science Press, China

Publisher Copyright:
© 2018, Springer Science+Business Media, LLC, part of Springer Nature.


Keywords

  • bloom filter
  • hot data
  • shingled magnetic recording (SMR)

