Data Set for the Development and Testing of the MC23 Nonclassical-Energy Functional

  • Jie J. Bao (Creator)
  • Dayou Zhang (Creator)
  • Shaoting Zhang (Creator)
  • Laura Gagliardi (Creator)
  • Donald G Truhlar (Creator)

Dataset

Description

This dataset contains files used to train and test the Multi-Configuration 23 (MC23) functional and to compare the results to other methods. It includes files to carry out electronic structure calculations. These include molecular geometries in xyz format, OpenMolcas input files for CASSCF calculations, converged CASSCF natural orbitals, OpenMolcas basis set files, and Gaussian 16 formatted checkpoint files for KS-DFT calculations. It also includes data used for data processing such as stoichiometries, absolute energies, and reference energies.

Each file in this dataset is a .tar.xz archive. One can extract them by the following command:

    tar -xJf name_of_archive.tar.xz

Below is a description of the content of each archive.

gaussian_16_fchk.tar.xz contains Gaussian 16 formatted checkpoint files for all KS-DFT calculations used in this work. The files in the archive are named as functional/database/system.fchk.

openmolcas_basis_set.tar.xz contains OpenMolcas basis set files used for multireference calculations. To reproduce the results in this work, the basis set files should be placed in the “basis_library” directory in the OpenMolcas installation location.

openmolcas_wave_function.tar.xz contains files needed by OpenMolcas to reproduce the CASSCF wave function used in this work. The files in the archive are named database/system.*. The file system.xyz contains the Cartesian coordinates. Note that for Data Set 2, the coordinates are in the input files system.inp. The file system.inp contains the OpenMolcas input file to perform CASSCF calculations. The files system.RasOrb, system.rasscf.h5, and system.rasscf.molden contain the converged CASSCF natural orbitals.

gaussian_16_stoichiometry_energy.tar.xz and openmolcas_stoichiometry_energy.tar.xz contain files used for data processing.

Files with names like database.ref contain information used to calculate the final energies and errors. They are tab-delimited files. Each row represents an energy difference (e.g. atomization energy, barrier height, etc.). The first column contains the name of the energy difference (note: spaces may be present in this column). This is followed by the file names of each electronic structure calculation and the stoichiometries used to calculate the energy difference from the absolute energies. Each name or stoichiometry occupies one column. The second from the last column contains the reference value in kcal/mol. The reference values contain spin–orbit coupling. The last column contains the factor by which the final energy should be divided. This factor usually equals 1, but it can be greater than 1 for databases calculating atomization energies per bond or per atom.

Files with names like method.elist contain the absolute energies of each electronic structure calculation. They are tab-delimited files. Each row represents an electronic structure calculation, and each row always contains two columns. The first column is the file name of the calculation in the format database/system. The second column is the absolute energy in atomic units extracted from the output file of electronic structure programs.

The file named SOC.dat contains the spin–orbit coupling term in kcal/mol to be added to each electronic structure calculation prior to calculating energy differences. It has the same format as files with names like method.elist.

The database names in the directory names use a slightly different convention than the ones in the article describing MC23. A prefix DS2_ or DS3_ is used to indicate the data set to which a database belongs, and the number of data points is removed from the database name. For example, the MR-MGN-BE8 database from Data Set 2 has a file name DS2_MR-MGN-BE.
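
The data-processing files described above can be combined roughly as in the following Python sketch. It is an illustration under stated assumptions, not the authors' processing script. In particular, it assumes that the columns between the name and the reference value in a database.ref row alternate as file name, stoichiometry, file name, stoichiometry, and so on (the description only says that each occupies one column), that stoichiometric coefficients carry their own sign, and that a conversion factor of 627.5095 kcal/mol per hartree is acceptable (the value used in the original work is not stated). The sample rows, the DS2_DEMO database name, and all function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Common hartree -> kcal/mol factor; ASSUMPTION: the dataset does not
# state which value the authors used.
HARTREE_TO_KCAL_PER_MOL = 627.5095


def parse_energy_table(text: str) -> Dict[str, float]:
    """Parse a two-column, tab-delimited table (method.elist or SOC.dat)."""
    table: Dict[str, float] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, value = line.split("\t")
        table[name.strip()] = float(value)
    return table


def parse_ref_row(line: str) -> Tuple[str, List[Tuple[str, float]], float, float]:
    """Split one database.ref row into (label, terms, reference, divisor)."""
    fields = line.rstrip("\r\n").split("\t")
    label = fields[0]              # may contain spaces, never tabs
    reference = float(fields[-2])  # kcal/mol, includes spin-orbit coupling
    divisor = float(fields[-1])    # usually 1
    middle = fields[1:-2]
    # ASSUMPTION: names and stoichiometries alternate (name, coeff, ...).
    if not middle or len(middle) % 2 != 0:
        raise ValueError(f"unexpected column count in row: {label!r}")
    terms = [(middle[i], float(middle[i + 1])) for i in range(0, len(middle), 2)]
    return label, terms, reference, divisor


def energy_difference(
    terms: List[Tuple[str, float]],
    energies_hartree: Dict[str, float],
    soc_kcal: Dict[str, float],
    divisor: float,
) -> float:
    """Return the energy difference in kcal/mol.

    The spin-orbit term is added to each calculation before the
    stoichiometric sum; the sum is then divided by the last-column factor.
    """
    total = 0.0
    for name, coeff in terms:
        # ASSUMPTION: a calculation missing from SOC.dat has a zero SOC term.
        e_kcal = energies_hartree[name] * HARTREE_TO_KCAL_PER_MOL + soc_kcal.get(name, 0.0)
        total += coeff * e_kcal
    return total / divisor


# Hypothetical sample data, for illustration only (not from the dataset).
SAMPLE_ELIST = "DS2_DEMO/A\t-1.0\nDS2_DEMO/A2\t-2.1\n"
SAMPLE_SOC = "DS2_DEMO/A\t-0.5\nDS2_DEMO/A2\t0.0\n"
SAMPLE_REF_ROW = "A2 atomization\tDS2_DEMO/A\t2\tDS2_DEMO/A2\t-1\t60.0\t1"

energies = parse_energy_table(SAMPLE_ELIST)
soc = parse_energy_table(SAMPLE_SOC)
label, terms, reference, divisor = parse_ref_row(SAMPLE_REF_ROW)
computed = energy_difference(terms, energies, soc, divisor)
print(f"{label}: computed {computed:.3f}, error {computed - reference:+.3f} kcal/mol")
```

If the real database.ref files list all file names first and all stoichiometries afterwards, only the construction of `terms` in `parse_ref_row` needs to change; inspecting one row of an extracted file settles which layout applies.
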
Date made available: Sep 20 2024
Publisher: ZENODO
