Low-power, Low-storage-overhead chipkill correct via Multi-line error correction

Xun Jian, Henry Duwe, John Sartori, Vilas Sridharan, Rakesh Kumar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

28 Scopus citations

Abstract

Due to their large memory capacities, many modern servers require chipkill correct, an advanced type of memory error detection and correction, to meet their reliability requirements. However, existing chipkill-correct solutions incur high power or storage overheads, or both, because they use dedicated error-correction resources per codeword to perform error correction. This requires high overhead for correction and results in high overhead for error detection. We propose a novel chipkill-correct solution, multi-line error correction, that uses resources shared across multiple lines in memory for error correction to reduce the overhead of both error detection and correction. Our evaluations show that the proposed solution reduces memory power by a mean of 27%, and up to 38% with respect to commercial solutions, at a cost of 0.4% increase in storage overhead and minimal impact on reliability.
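
The general idea of sharing correction resources across lines can be illustrated with a toy sketch. It is not the scheme evaluated in the paper: the CRC32 detection code, the XOR parity line, the group size, and the names encode_group and read_line are all assumptions chosen for illustration. Each line keeps only a small detection code of its own, while a single parity line serves the whole group for correction.

```python
import zlib
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_group(lines):
    """Return per-line CRC32 detection codes and one parity line for the group.

    Hypothetical layout: detection is per line, correction is shared.
    """
    checks = [zlib.crc32(line) for line in lines]
    parity = reduce(xor_bytes, lines)
    return checks, parity


def read_line(index, lines, checks, parity):
    """Read one line, rebuilding it from the shared parity if detection fails."""
    line = lines[index]
    if zlib.crc32(line) == checks[index]:
        return line  # common case: only the cheap detection check runs
    # Rare case: XOR the parity with every other line in the group.
    others = [l for i, l in enumerate(lines) if i != index]
    recovered = reduce(xor_bytes, others, parity)
    if zlib.crc32(recovered) != checks[index]:
        raise ValueError("uncorrectable: more than one bad line in the group")
    return recovered


# Four 64-byte lines share a single parity line.
original = [bytes([i] * 64) for i in range(1, 5)]
checks, parity = encode_group(original)

# Simulate a fault that overwrites the first 4 bytes of line 2.
stored = list(original)
stored[2] = b"\xff" * 4 + stored[2][4:]

assert read_line(2, stored, checks, parity) == original[2]
```

In this toy layout the correction storage is one parity line per group rather than correction bits attached to every line, and the extra reads needed for reconstruction happen only when the detection check fails.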

Original language: English (US)
Title of host publication: Proceedings of SC 2013
Subtitle of host publication: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: IEEE Computer Society
ISBN (Print): 9781450323789
DOIs: https://doi.org/10.1145/2503210.2503243
State: Published - Jan 1 2013
Event: 2013 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2013 - Denver, CO, United States
Duration: Nov 17 2013 → Nov 22 2013

Publication series

Name: International Conference for High Performance Computing, Networking, Storage and Analysis, SC
ISSN (Print): 2167-4329
ISSN (Electronic): 2167-4337

Other

Other: 2013 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2013
Country: United States
City: Denver, CO
Period: 11/17/13 → 11/22/13

Cite this

Jian, X., Duwe, H., Sartori, J., Sridharan, V., & Kumar, R. (2013). Low-power, Low-storage-overhead chipkill correct via Multi-line error correction. In Proceedings of SC 2013: The International Conference for High Performance Computing, Networking, Storage and Analysis [24] (International Conference for High Performance Computing, Networking, Storage and Analysis, SC). IEEE Computer Society. https://doi.org/10.1145/2503210.2503243