TY - JOUR
T1 - CatEmbed
T2 - A Machine-Learned Representation Obtained via Categorical Entity Embedding for Predicting Adsorption and Reaction Energies on Bimetallic Alloy Surfaces
AU - Kirkvold, Clara
AU - Collins, Brianna A.
AU - Goodpaster, Jason D.
N1 - Publisher Copyright: © 2024 American Chemical Society.
PY - 2024/7/4
Y1 - 2024/7/4
N2 - Machine-learning models for predicting adsorption energies on metallic surfaces often rely on basic elemental properties and electronic and geometric descriptors. Here, we apply categorical entity embedding, a featurization method inspired by natural language processing techniques, to predict adsorption energies on bimetallic alloy surfaces using categorical descriptors. Using this method, we develop a machine-learned representation from categorical descriptors (e.g., surface composition, adsorbate type, and site type) of the slab/adsorbate complex. By combining this representation with numerical features (e.g., slab metal stoichiometric ratios), we create the CatEmbed representation. Remarkably, decision tree models trained using CatEmbed, which includes no explicit geometric information, achieve a Mean Absolute Error (MAE) of 0.12 eV. Additionally, we extend this technique to predict reaction energies on bimetallic surfaces, creating the CatEmbed-React representation, which achieves an MAE of 0.08 eV. These findings highlight the effectiveness of categorical entity embedding for predicting adsorption and reaction energies on bimetallic alloy surfaces.
UR - http://www.scopus.com/inward/record.url?scp=85196848284&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85196848284&partnerID=8YFLogxK
U2 - 10.1021/acs.jpclett.4c01492
DO - 10.1021/acs.jpclett.4c01492
M3 - Article
C2 - 38913414
AN - SCOPUS:85196848284
SN - 1948-7185
VL - 15
SP - 6791
EP - 6797
JO - Journal of Physical Chemistry Letters
JF - Journal of Physical Chemistry Letters
IS - 26
ER -