The modular nature of metal-organic frameworks (MOFs) leads to a very large number of possible structures. High-throughput computational screening has led to a rapid increase in property data that has enabled several potential applications for MOFs, including gas storage, separations, catalysis, and other fields. Despite their rich chemistry, MOFs are typically named using an ad hoc approach, which can impede their searchability and the discovery of broad insights. In this article, we develop two systematic MOF identifiers, coined MOFid and MOFkey, and algorithms for deconstructing MOFs into their building blocks and underlying topological network. We review existing cheminformatics formats for small molecules and address the challenges of adapting them to periodic crystal structures. Our algorithms are distributed as open-source software, and we apply them here to extract insights from several MOF databases. Through the process of designing MOFid and MOFkey, we provide a perspective on opportunities for the community to facilitate data reuse, improve searchability, and rapidly apply cheminformatics analyses.
Bibliographical note
Funding Information:
This work was primarily supported by the U.S. Department of Energy, Office of Basic Energy Sciences, Division of Chemical Sciences, Geosciences and Biosciences through the Nanoporous Materials Genome Center under Award Numbers DE-FG02-12ER16362 and DE-FG02-17ER16362. B.J.B. also acknowledges support from a National Science Foundation Graduate Research Fellowship under Grant No. DGE-1324585. A.S.R. acknowledges government support under Contract FA9550-11-C-0028 and awarded by the Department of Defense (DOD), Air Force Office of Scientific Research, National Defense Science and Engineering Graduate (NDSEG) Fellowship, 32 CFR 168a. B.J.B. and A.S.R. acknowledge support from Ryan Fellowships via the International Institute for Nanotechnology at Northwestern University. This research was supported in part through the computational resources and staff contributions provided for the Quest high-performance computing facility at Northwestern University, which is jointly supported by the Office of the Provost, the Office for Research, and Northwestern University Information Technology. The authors thank Jeffrey R. Long for helpful conversations about the identifiers.
© 2019 American Chemical Society.