TDDFS: A tier-aware data deduplication-based file system

Zhichao Cao, Hao Wen, Xiongzi Ge, Jingwei Ma, Jim Diehl, David H.C. Du

Research output: Contribution to journal › Article

1 Scopus citations

Abstract

With the rapid increase in the amount of data produced and the development of new types of storage devices, storage tiering continues to be a popular way to achieve a good tradeoff between performance and cost-effectiveness. In a basic two-tier storage system, a storage tier with higher performance and typically higher cost (the fast tier) is used to store frequently-accessed (active) data while a large amount of less-active data are stored in the lower-performance and low-cost tier (the slow tier). Data are migrated between these two tiers according to their activity. In this article, we propose a Tier-aware Data Deduplication-based File System, called TDDFS, which can operate efficiently on top of a two-tier storage environment. Specifically, to achieve better performance, nearly all file operations are performed in the fast tier. To achieve higher cost-effectiveness, files are migrated from the fast tier to the slow tier if they are no longer active, and this migration is done with data deduplication. The distinctiveness of our design is that it maintains the non-redundant (unique) chunks produced by data deduplication in both tiers if possible. When a file is reloaded (called a reloaded file) from the slow tier to the fast tier, if some data chunks of the file already exist in the fast tier, then the data migration of these chunks from the slow tier can be avoided. Our evaluation shows that TDDFS achieves close to the best overall performance among various file-tiering designs for two-tier storage systems.
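
The reload behavior described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The class `TwoTierStore`, its method names, the fixed 4-byte chunk size, and the use of SHA-256 fingerprints are all assumptions made for illustration. It shows only the core idea: deduplicate on migration to the slow tier, and on reload copy only the chunks that the fast tier does not already hold.

```python
import hashlib

CHUNK_SIZE = 4  # assumption: tiny fixed-size chunks, chosen only to keep the demo readable


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Split data into fixed-size chunks (the last chunk may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk: bytes) -> str:
    """Content fingerprint of a chunk (SHA-256 is an assumption, not from the source)."""
    return hashlib.sha256(chunk).hexdigest()


class TwoTierStore:
    """Hypothetical in-memory model of a fast tier and a deduplicated slow tier."""

    def __init__(self):
        self.fast_files = {}   # name -> bytes; active files kept whole in the fast tier
        self.fast_chunks = {}  # fingerprint -> chunk; unique chunks present in the fast tier
        self.slow_chunks = {}  # fingerprint -> chunk; unique chunks in the slow tier
        self.recipes = {}      # name -> ordered list of fingerprints for migrated files

    def write(self, name: str, data: bytes) -> None:
        """File operations happen in the fast tier."""
        self.fast_files[name] = data

    def migrate_to_slow(self, name: str) -> None:
        """Move an inactive file to the slow tier, storing each unique chunk once."""
        data = self.fast_files.pop(name)
        recipe = []
        for chunk in split_chunks(data):
            fp = fingerprint(chunk)
            self.slow_chunks.setdefault(fp, chunk)  # duplicate chunks are not stored again
            recipe.append(fp)
        self.recipes[name] = recipe

    def reload_to_fast(self, name: str) -> int:
        """Reload a file; return how many chunks had to be copied from the slow tier."""
        copied = 0
        for fp in self.recipes[name]:
            if fp not in self.fast_chunks:  # chunk already in the fast tier: skip the copy
                self.fast_chunks[fp] = self.slow_chunks[fp]
                copied += 1
        return copied

    def read(self, name: str) -> bytes:
        """Read an active file directly, or a reloaded file from its fast-tier chunks."""
        if name in self.fast_files:
            return self.fast_files[name]
        return b"".join(self.fast_chunks[fp] for fp in self.recipes[name])


store = TwoTierStore()
store.write("a", b"AAAABBBBCCCC")
store.write("b", b"AAAABBBBDDDD")
store.migrate_to_slow("a")
store.migrate_to_slow("b")
copied_a = store.reload_to_fast("a")  # all three chunks of "a" are copied
copied_b = store.reload_to_fast("b")  # only the one chunk not shared with "a" is copied
```

In this toy run the two files share two of their three chunks, so the slow tier holds four unique chunks rather than six, and reloading the second file moves a single chunk. A real system would also need to handle chunk eviction, metadata persistence, and modification of reloaded files, none of which is modeled here.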

Original language: English (US)
Article number: 4
Journal: ACM Transactions on Storage
Volume: 15
Issue number: 1
State: Published - Feb 2019

Keywords

  • Data deduplication
  • Data migration
  • File system
  • Tiered storage
