Two-Dimensional Energy Histograms as Features for Machine Learning to Predict Adsorption in Diverse Nanoporous Materials

Kaihang Shi, Zhao Li, Dylan M. Anstine, Dai Tang, Coray M. Colina, David S. Sholl, J. Ilja Siepmann, Randall Q. Snurr

Research output: Contribution to journal; Article; peer-review

19 Scopus citations


A major obstacle for machine learning (ML) in chemical science is the lack of physically informed feature representations that provide both accurate prediction and easy interpretability of the ML model. In this work, we describe adsorption systems using novel two-dimensional energy histogram (2D-EH) features, which are obtained from the probe-adsorbent energies and energy gradients at grid points located throughout the adsorbent. The 2D-EH features encode both energetic and structural information of the material and lead to highly accurate ML models (coefficient of determination R² ∼ 0.94–0.99) for predicting single-component adsorption capacity in metal-organic frameworks (MOFs). We consider the adsorption of spherical molecules (Kr and Xe), linear alkanes with a wide range of aspect ratios (ethane, propane, n-butane, and n-hexane), and a branched alkane (2,2-dimethylbutane) over a wide range of temperatures and pressures. The interpretable 2D-EH features enable the ML model to learn the basic physics of adsorption in pores from the training data. We show that these MOF-data-trained ML models are transferable to different families of amorphous nanoporous materials. We also identify several adsorption systems where capillary condensation occurs and ML predictions are more challenging. Nevertheless, our 2D-EH features still outperform structural features, including those derived from persistent homology. The novel 2D-EH features may help accelerate the discovery and design of advanced nanoporous materials using ML for gas storage and separation in the future.
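As a rough illustration of the idea described in the abstract, the sketch below bins per-grid-point probe-adsorbent energies and energy-gradient magnitudes into a normalized two-dimensional histogram and flattens it into a feature vector. It is not the authors' implementation: the function name `energy_histogram_2d`, the bin edges, the units, and the choice to clip out-of-range values into the edge bins are all assumptions made for this example, and the grid energies here are random placeholder numbers rather than values computed from a force field.

```python
import numpy as np


def energy_histogram_2d(energies, grad_norms, energy_edges, grad_edges):
    """Flattened, normalized 2D histogram of (energy, gradient magnitude).

    Hypothetical helper: the binning scheme is an assumption, not the
    scheme used in the paper.
    """
    energies = np.asarray(energies, dtype=float).ravel()
    grad_norms = np.asarray(grad_norms, dtype=float).ravel()

    # Assumption: values outside the bin range are counted in the edge bins
    # so that every grid point contributes to the feature vector.
    e = np.clip(energies, energy_edges[0], energy_edges[-1])
    g = np.clip(grad_norms, grad_edges[0], grad_edges[-1])

    counts, _, _ = np.histogram2d(e, g, bins=[energy_edges, grad_edges])

    # Normalize by the number of grid points so that materials sampled with
    # different grid sizes yield comparable features.
    return (counts / energies.size).ravel()


# Placeholder grid data (random numbers standing in for computed values).
rng = np.random.default_rng(0)
grid_energies = rng.normal(loc=-5.0, scale=3.0, size=10_000)
grid_grad_norms = rng.uniform(0.0, 10.0, size=10_000)

energy_edges = np.linspace(-20.0, 0.0, 11)  # 10 energy bins (assumed range)
grad_edges = np.linspace(0.0, 8.0, 5)       # 4 gradient bins (assumed range)

features = energy_histogram_2d(
    grid_energies, grid_grad_norms, energy_edges, grad_edges
)
print(features.shape)
```

A vector of this kind, computed once per material, could then be combined with state variables such as temperature and pressure and passed to any standard regression model; which model and which additional inputs the authors used is not stated in the abstract.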

Original language: English (US)
Pages (from-to): 4568-4583
Number of pages: 16
Journal: Journal of Chemical Theory and Computation
Issue number: 14
State: Published - Jul 25 2023

Bibliographical note

Funding Information:
This research was supported by the U.S. Department of Energy, Office of Basic Energy Sciences, Division of Chemical Sciences, Geosciences and Biosciences, under Award DE-FG02-17ER16362. Simulations in this work were made possible by the high-performance computing facility Quest at Northwestern University, the high-performance computing services core facility (RRID:SCR_022168) provided by North Carolina State University, the high-performance computing system HiPerGator 2.0 at the University of Florida, and the Hive cluster at the Georgia Institute of Technology which is supported by the National Science Foundation under Grant Number 1828187. Simulations in this research were supported in part through research cyberinfrastructure resources and services provided by the Partnership for an Advanced Computing Environment (PACE) at the Georgia Institute of Technology, Atlanta, Georgia, U.S.A. This research also used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility located at Lawrence Berkeley National Laboratory, operated under Contract No. DE-AC02-05CH11231 using NERSC Award BES-ERCAP0020094.

Publisher Copyright:
© 2023 American Chemical Society.

PubMed: MeSH publication types

  • Journal Article

