Automated algorithmic error resilience for structured grid problems based on outlier detection

Amoghavarsha Suresh, John M Sartori

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

In this paper, we propose automated algorithmic error resilience based on outlier detection. Our approach exploits the characteristic behavior of a class of applications to create metric functions that normally produce metric values according to a designed distribution or behavior and produce outlier values (i.e., values that do not conform to the designed distribution or behavior) when computations are affected by errors. For a robust algorithm that employs such an approach, error detection becomes equivalent to outlier detection. As such, we can make use of well-established, statistically rigorous techniques for outlier detection to effectively and efficiently detect errors, and subsequently correct them. Our error-resilient algorithms incur significantly lower overhead than traditional hardware and software error resilience techniques. Also, compared to previous approaches to application-based error resilience, our approaches parameterize the robustification process, making it easy to automatically transform large classes of applications into robust applications with the use of parser-based tools and minimal programmer effort. We demonstrate the use of automated error resilience based on outlier detection for structured grid problems, leveraging the flexibility of algorithmic error resilience to achieve improved application robustness and lower overhead compared to previous error resilience approaches. We demonstrate 2×-3× improvement in output quality compared to the original algorithm with only 22% overhead, on average, for non-iterative structured grid problems. Average overhead is as low as 4.5% for error-resilient iterative structured grid algorithms that tolerate error rates up to 10E-3 and achieve the same output quality as their error-free counterparts.
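
The idea of turning error detection into outlier detection can be illustrated with a small sketch. This is not the authors' implementation: the 3-point averaging stencil, the metric function (the per-point change across one sweep), the modified z-score test based on the median and the median absolute deviation, the 3.5 threshold, and all function names are assumptions chosen for illustration. On a smooth grid the metric values stay within a narrow range, so a corrupted point shows up as an outlier and can be recomputed from the previous grid.

```python
import math
import statistics

def stencil_sweep(grid):
    """One 3-point averaging sweep over a 1D grid; boundary points stay fixed."""
    new = list(grid)
    for i in range(1, len(grid) - 1):
        new[i] = (grid[i - 1] + grid[i] + grid[i + 1]) / 3.0
    return new

def metric_values(old, new):
    """Assumed metric function: change of each interior point across the sweep."""
    return [new[i] - old[i] for i in range(1, len(old) - 1)]

def outlier_indices(values, threshold=3.5):
    """Indices whose modified z-score (median/MAD based) exceeds the threshold."""
    med = statistics.median(values)
    deviations = [abs(v - med) for v in values]
    mad = statistics.median(deviations)
    if mad == 0.0:
        # Degenerate case: at least half the values are identical, so any deviation is flagged.
        return [k for k, d in enumerate(deviations) if d > 0.0]
    return [k for k, d in enumerate(deviations) if 0.6745 * d / mad > threshold]

def detect_and_correct(old, new):
    """Flag grid points with outlier metric values and recompute them from old."""
    # Metric index k corresponds to grid index k + 1 (boundaries are skipped).
    flagged = [k + 1 for k in outlier_indices(metric_values(old, new))]
    repaired = list(new)
    for i in flagged:
        repaired[i] = (old[i - 1] + old[i] + old[i + 1]) / 3.0
    return flagged, repaired

# Smooth test grid, one clean sweep, and a copy with a simulated error at point 20.
grid = [math.sin(math.pi * i / 64) for i in range(65)]
clean = stencil_sweep(grid)
faulty = list(clean)
faulty[20] += 1.0  # simulated computation error (hypothetical fault model)

flagged, repaired = detect_and_correct(grid, faulty)
print("flagged grid points:", flagged)
print("repaired matches clean sweep:", repaired == clean)
```

A median-based test is used here because a single large error would inflate a mean and standard deviation computed over the same values; other outlier tests could be substituted in `outlier_indices` without changing the rest of the sketch.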

Original language: English (US)
Title of host publication: Proceedings of the 12th ACM/IEEE International Symposium on Code Generation and Optimization, CGO 2014
Publisher: Association for Computing Machinery
Pages: 240-250
Number of pages: 11
ISBN (Print): 9781450326704
DOIs
State: Published - 2014
Event: 12th ACM/IEEE International Symposium on Code Generation and Optimization, CGO 2014 - Orlando, FL, United States
Duration: Feb 15 2014 to Feb 19 2014

Publication series

Name: Proceedings of the 12th ACM/IEEE International Symposium on Code Generation and Optimization, CGO 2014

Other

Other: 12th ACM/IEEE International Symposium on Code Generation and Optimization, CGO 2014
Country/Territory: United States
City: Orlando, FL
Period: 2/15/14 to 2/19/14

Keywords

  • Algorithmic error resilience
  • Application robustification
  • Outlier detection
  • Structured grids
