TY - GEN
T1 - A highly parallel GPU-based hash accelerator for a data deduplication system
AU - Li, Xin
AU - Lilja, David J
N1 - Copyright 2010 Elsevier B.V., All rights reserved.
PY - 2009
Y1 - 2009
AB - Recently, data storage systems with data deduplication have been introduced as a method of reducing storage space by eliminating redundant data. In a deduplication storage system, the collision-resistant fingerprint of each data segment must be calculated using a hash algorithm. This paper presents a GPU-based accelerator, called g-Dedu, for processing the hash computation of the deduplication system. The g-Dedu accelerator algorithm is specifically designed to handle the small, variable-sized data segments used in a deduplication system, which a GPU cannot process efficiently in a straightforward way. Our data organization approach uses a hierarchical data structure to organize the data being processed. A scheduler manages these data for optimal GPU processing. Our patterned data segment approach overcomes some noticeable performance drops resulting from the GPU memory model. Furthermore, unlike some previous GPU hash accelerator work, our approach strictly follows the hash processing standard. Using this new approach, g-Dedu achieves a 6-fold speedup on the SHA-1 computation and a 7.4-fold speedup on the SHA-2 computation compared with a CPU-based implementation.
KW - CUDA
KW - Deduplication system
KW - GPU computing
KW - Hash computing
UR - https://www.scopus.com/pages/publications/77952364312
M3 - Conference contribution
AN - SCOPUS:77952364312
SN - 9780889868113
T3 - Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Systems
SP - 268
EP - 275
BT - Proceedings of the 21st IASTED International Conference on Parallel and Distributed Computing and Systems, PDCS 2009
T2 - 21st IASTED International Conference on Parallel and Distributed Computing and Systems, PDCS 2009
Y2 - 2 November 2009 through 4 November 2009
ER -