TY - JOUR
T1 - Massive Assessment of the Binding Energies of Atmospheric Molecular Clusters
AU - Jensen, Andreas Buchgraitz
AU - Kubečka, Jakub
AU - Schmitz, Gunnar
AU - Christiansen, Ove
AU - Elm, Jonas
N1 - Publisher Copyright: © 2022 American Chemical Society. All rights reserved.
PY - 2022/12
Y1 - 2022/12
N2 - Quantum chemical studies of the formation and growth of atmospheric molecular clusters are important for understanding aerosol particle formation. However, the search for the lowest free-energy cluster configuration is extremely time-consuming. This makes high-level benchmark data sets extremely valuable in the quest for the global minimum, as they allow the identification of cost-efficient computational methodologies, as well as the development of high-level machine learning (ML) models. Herein, we present a highly versatile quantum chemical data set comprising a total of 11 749 (acid)1-2(base)1-2 cluster configurations, containing up to 44 atoms. Utilizing the LUMI supercomputer, we calculated highly accurate PNO-CCSD(F12*)(T)/cc-pVDZ-F12 binding energies of the full set of cluster configurations, leading to an unprecedented data set both in regard to sheer size and with respect to the level of theory. We employ the constructed benchmark set to assess the performance of various semiempirical and density functional theory methods. In particular, we find that the r2-SCAN-3c method shows excellent performance across the data set with respect to both accuracy and CPU time, making it a promising method to employ during cluster configurational sampling. Furthermore, applying the data sets, we construct ML models based on Δ-learning and provide recommendations for future application of ML in cluster configurational sampling.
KW - Benchmarking
KW - Dimerization
KW - Quantum Theory
KW - Thermodynamics
UR - http://www.scopus.com/inward/record.url?scp=85143087309&partnerID=8YFLogxK
U2 - 10.1021/acs.jctc.2c00825
DO - 10.1021/acs.jctc.2c00825
M3 - Journal article
C2 - 36417753
AN - SCOPUS:85143087309
SN - 1549-9618
VL - 18
SP - 7373
EP - 7383
JO - Journal of Chemical Theory and Computation
JF - Journal of Chemical Theory and Computation
IS - 12
ER -