TY - JOUR
T1 - Practical techniques for purging deleted data using liveness information
AU - Boutcher, David
AU - Chandra, Abhishek
PY - 2008/7/1
Y1 - 2008/7/1
N2 - The layered design of the Linux operating system hides the liveness of file system data from the underlying block layers. This lack of liveness information prevents the storage system from discarding blocks deleted by the file system, often resulting in poor utilization, security problems, inefficient caching, and migration overheads. In this paper, we define a generic "purge" operation that can be used by a file system to pass liveness information to the block layer with minimal changes in the layer interfaces, allowing the storage system to discard deleted data. We present three approaches for implementing such a purge operation: direct call, zero blocks, and flagged writes, which differ in their architectural complexity and potential performance overhead. We evaluate the feasibility of these techniques through a reference implementation of a dynamically resizable copy-on-write (COW) data store in User Mode Linux (UML). Performance results obtained from this reference implementation show that all these techniques can achieve significant storage savings with a reasonable execution time overhead. At the same time, our results indicate that while the direct call approach has the best performance, the zero block approach provides the best compromise between performance overhead and semantic and architectural simplicity. Overall, our results demonstrate that passing liveness information across the interface between the file system and the block layer with minimal changes is not only feasible but practical.
AB - The layered design of the Linux operating system hides the liveness of file system data from the underlying block layers. This lack of liveness information prevents the storage system from discarding blocks deleted by the file system, often resulting in poor utilization, security problems, inefficient caching, and migration overheads. In this paper, we define a generic "purge" operation that can be used by a file system to pass liveness information to the block layer with minimal changes in the layer interfaces, allowing the storage system to discard deleted data. We present three approaches for implementing such a purge operation: direct call, zero blocks, and flagged writes, which differ in their architectural complexity and potential performance overhead. We evaluate the feasibility of these techniques through a reference implementation of a dynamically resizable copy-on-write (COW) data store in User Mode Linux (UML). Performance results obtained from this reference implementation show that all these techniques can achieve significant storage savings with a reasonable execution time overhead. At the same time, our results indicate that while the direct call approach has the best performance, the zero block approach provides the best compromise between performance overhead and semantic and architectural simplicity. Overall, our results demonstrate that passing liveness information across the interface between the file system and the block layer with minimal changes is not only feasible but practical.
UR - http://www.scopus.com/inward/record.url?scp=77952245289&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77952245289&partnerID=8YFLogxK
U2 - 10.1145/1400097.1400107
DO - 10.1145/1400097.1400107
M3 - Article
AN - SCOPUS:77952245289
SN - 0163-5980
VL - 42
SP - 85
EP - 94
JO - Operating Systems Review (ACM)
JF - Operating Systems Review (ACM)
IS - 5
ER -