Extracting the textual and temporal structure of supercomputing logs

Sourabh Jain, Inderpreet Singh, Abhishek Chandra, Zhi Li Zhang, Greg Bronevetsky

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Scopus citations

Abstract

Supercomputers are prone to frequent faults that adversely affect their performance, reliability, and functionality. System logs collected on these systems are a valuable source of information about their operational status and health. However, their massive size, complexity, and lack of standard format make it difficult to automatically extract information that can be used to improve system management. In this work we propose a novel method to succinctly represent the contents of supercomputing logs by using textual clustering to automatically find the syntactic structures of log messages. This information is used to automatically classify messages into semantic groups via an online clustering algorithm. Further, we describe a methodology for using the temporal proximity between groups of log messages to identify correlated events in the system. We apply our proposed methods to two large, publicly available supercomputing logs and show that our technique achieves nearly perfect accuracy for online log classification and extracts meaningful structural and temporal message patterns that can be used to improve the accuracy of other log analysis techniques.
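The pipeline outlined in the abstract (derive syntactic templates, classify messages online, then correlate groups by temporal proximity) can be illustrated with a minimal sketch. This is not the authors' algorithm: the digit-based masking rule, the exact-match grouping, the fixed time window, and all function names below are assumptions chosen for illustration only.

```python
import re
from collections import Counter

# Assumption: any token containing a digit is a variable field (IDs, counts, addresses).
_VARIABLE = re.compile(r"\S*\d\S*")

def template_of(message: str) -> str:
    """Mask variable-looking tokens with '*' to approximate a syntactic template."""
    return " ".join("*" if _VARIABLE.fullmatch(tok) else tok
                    for tok in message.split())

def classify_online(messages):
    """Label each message as it arrives; an unseen template opens a new group."""
    groups = {}   # template -> group id
    labels = []
    for msg in messages:
        labels.append(groups.setdefault(template_of(msg), len(groups)))
    return labels, groups

def cooccurring_groups(events, window):
    """Count pairs of distinct groups that occur within `window` seconds.

    `events` is a list of (timestamp, group_id) tuples sorted by timestamp.
    """
    counts = Counter()
    start = 0
    for i, (t, g) in enumerate(events):
        # Slide the left edge so that only events inside the window remain.
        while events[start][0] < t - window:
            start += 1
        for _, g2 in events[start:i]:
            if g2 != g:
                counts[tuple(sorted((g, g2)))] += 1
    return counts
```

For example, `classify_online(["node 12 link error", "node 7 link error"])` places both messages in the same group because they share the template `node * link error`. Pairs with high counts from `cooccurring_groups` are candidates for correlated events; a real system would likely need a more robust similarity measure than exact template equality.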

Original language: English (US)
Title of host publication: 16th International Conference on High Performance Computing, HiPC 2009 - Proceedings
Pages: 254-263
Number of pages: 10
DOIs: https://doi.org/10.1109/HIPC.2009.5433202
State: Published - Dec 1 2009
Event: 16th International Conference on High Performance Computing, HiPC 2009 - Kochi, India
Duration: Dec 16 2009 to Dec 19 2009

Publication series

Name: 16th International Conference on High Performance Computing, HiPC 2009 - Proceedings

Other

Other: 16th International Conference on High Performance Computing, HiPC 2009
Country: India
City: Kochi
Period: 12/16/09 to 12/19/09


  • Cite this

    Jain, S., Singh, I., Chandra, A., Zhang, Z. L., & Bronevetsky, G. (2009). Extracting the textual and temporal structure of supercomputing logs. In 16th International Conference on High Performance Computing, HiPC 2009 - Proceedings (pp. 254-263). [5433202] (16th International Conference on High Performance Computing, HiPC 2009 - Proceedings). https://doi.org/10.1109/HIPC.2009.5433202