A machine learning framework for the analysis and prediction of catalytic activity from experimental data

Alexander Smith, Andrea Keane, James A. Dumesic, George W. Huber, Victor M. Zavala

Research output: Contribution to journal › Article › peer-review

62 Scopus citations


We present a machine learning framework to explore the predictability limits of catalytic activity from experimental descriptor data, which characterize catalyst formulations and reaction conditions. Artificial neural networks fuse the descriptor data to predict activity, and principal component analysis (PCA) and sparse PCA project the experimental data into an information space in which we identify regions of low and high predictability. The framework also incorporates a constrained-PCA optimization formulation that identifies new experimental points while filtering out regions of the experimental space excluded by constraints on technology, economics, and expert knowledge. This allows us to navigate the experimental space in a more targeted manner. The framework is applied to a comprehensive water–gas shift reaction data set containing 2228 experimental data points collected from the literature. Neural network analysis reveals strong predictability of activity across reaction conditions (e.g., varying temperature) but also important gaps in predictability across catalyst formulations (e.g., varying metal, support, and promoter). PCA reveals that these gaps arise because most experiments reported in the literature lie within narrow regions of the information space. We demonstrate that the framework can systematically guide experiments and the selection of descriptors to improve predictability and identify promising new formulations.
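The idea of projecting experiments into a low-dimensional information space and checking how well a candidate point is covered by existing data can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the three descriptor columns are hypothetical, and the PCA is a plain power-iteration implementation written against the Python standard library, not the authors' code. It does not attempt the sparse PCA or constrained-PCA optimization steps described in the abstract.

```python
import math
import random


def fit_pca(rows, k=2, iters=500):
    """Return (means, components, variances) for the top-k principal components."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered descriptors.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    components, variances = [], []
    for _comp in range(k):
        v = [1.0 / math.sqrt(d)] * d  # fixed start vector for power iteration
        for _step in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient: variance captured along the direction v.
        lam = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        components.append(v)
        variances.append(lam)
        # Deflate: remove the component just found before extracting the next.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return means, components, variances


def project(point, means, components):
    """Coordinates of one descriptor vector in the principal-component space."""
    centered = [x - m for x, m in zip(point, means)]
    return [sum(c * x for c, x in zip(comp, centered)) for comp in components]


def nearest_distance(score, scores):
    """Distance from a projected point to the closest projected experiment."""
    return min(math.dist(score, s) for s in scores)


# Synthetic stand-in for descriptor data (hypothetical columns): two strongly
# correlated descriptors plus one weakly varying descriptor.
random.seed(0)
experiments = []
for _ in range(200):
    t = random.gauss(0.0, 1.0)
    experiments.append(
        [t, 2.0 * t + random.gauss(0.0, 0.1), random.gauss(0.0, 0.2)]
    )

means, components, variances = fit_pca(experiments, k=2)
scores = [project(row, means, components) for row in experiments]

inside = project(means, means, components)                # centre of the sampled region
outside = project([10.0, 20.0, 0.0], means, components)  # far along the dominant direction

print("variance captured by PC1 and PC2:", variances)
print("nearest experiment (inside candidate): ", nearest_distance(inside, scores))
print("nearest experiment (outside candidate):", nearest_distance(outside, scores))
```

In this toy setting, a candidate whose projection lies far from every existing experiment sits in a region the data do not cover, which is the kind of gap the abstract associates with low predictability. A real analysis would more likely rely on an established numerical library than on hand-written power iteration.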

Original language: English (US)
Article number: 118257
Journal: Applied Catalysis B: Environmental
State: Published - Apr 2020
Externally published: Yes

Bibliographical note

Funding Information:
Andrea Keane was supported by the WARF 2020 program at the University of Wisconsin-Madison.

Publisher Copyright:
© 2019 Elsevier B.V.


  • Catalysis
  • Data analysis
  • High-dimensional
  • Machine learning
  • Predictability

