Unifying machine learning and quantum chemistry with a deep neural network for molecular wavefunctions

Source: Ryan Daws
Date: 2019-12-05


Machine learning advances chemistry and materials science by enabling large-scale exploration of chemical space based on quantum chemical calculations. While these models supply fast and accurate predictions of atomistic chemical properties, they do not explicitly capture the electronic degrees of freedom of a molecule, which limits their applicability for reactive chemistry and chemical analysis. Here we present a deep learning framework for the prediction of the quantum mechanical wavefunction in a local basis of atomic orbitals from which all other ground-state properties can be derived. This approach retains full access to the electronic structure via the wavefunction at force-field-like efficiency and captures quantum mechanics in an analytically differentiable representation. On several examples, we demonstrate that this opens promising avenues to perform inverse design of molecular structures for targeting electronic property optimisation and a clear path towards increased synergy of machine learning and quantum chemistry.


Machine learning (ML) methods reach ever deeper into quantum chemistry and materials simulation, delivering predictive models of interatomic potential energy surfaces, molecular forces, electron densities, density functionals, and molecular response properties such as polarisabilities and infrared spectra. Large data sets of molecular properties, calculated from quantum chemistry or measured in experiment, are likewise used to construct predictive models that explore the vast chemical compound space in search of new sustainable catalyst materials and that help design new synthetic pathways. Recent research has explored the potential role of machine learning in constructing approximate quantum chemical methods, as well as in predicting MP2 and coupled cluster energies from Hartree–Fock orbitals. There have also been approaches that use neural networks as a basis representation of the wavefunction.
Most existing ML models have in common that they learn from quantum chemistry to describe molecular properties as scalar, vector, or tensor fields. Figure 1a shows schematically how quantum chemistry data for different electronic properties, such as energies or dipole moments, are used to construct individual ML models for the respective properties. This allows for the efficient exploration of chemical space with respect to these properties. Yet these ML models do not explicitly capture the electronic degrees of freedom in molecules that lie at the heart of quantum chemistry. All chemical concepts and physical molecular properties are determined by the electronic Schrödinger equation and derive from the ground-state wavefunction. Thus, an electronic structure ML model that directly predicts the ground-state wavefunction (see Fig. 1b) would not only give access to all ground-state properties, but could also open avenues towards new approximate quantum chemistry methods based on an interface between ML and quantum chemistry. Hegde and Bowen (ref. 28) have explored this idea using kernel ridge regression to predict the band structure and ballistic transmission in a limited study on straining single-species bulk systems with up to four atomic orbitals. Another recent example of this scheme is the prediction of coupled-cluster singles and doubles amplitudes from MP2-derived properties by Townsend and Vogiatzis.

Fig. 1

Synergy of quantum chemistry and machine learning. a Forward model: ML predicts chemical properties based on reference calculations. If another property is required, an additional ML model has to be trained. b Hybrid model: ML predicts the wavefunction. All ground-state properties can then be calculated from it, and no additional ML model is required. The wavefunction can act as an interface between ML and quantum mechanics (QM).

In this work, we develop a deep learning framework that provides an accurate ML model of molecular electronic structure via a direct representation of the electronic Hamiltonian in a local basis. The model provides a seamless interface between quantum mechanics and ML by predicting the eigenvalue spectrum and molecular orbitals (MOs) of the Hamiltonian for organic molecules close to ‘chemical accuracy’ (~0.04 eV). This is achieved by training a flexible ML model to capture the chemical environment of atoms in molecules and of pairs of atoms. The model thereby provides access to electronic properties that are important for the chemical interpretation of reactions, such as charge populations, bond orders, and dipole and quadrupole moments, without the need for specialised ML models for each property. We demonstrate how our model retains the conceptual strength of quantum chemistry by performing an ML-driven molecular dynamics simulation of malondialdehyde that shows the evolution of the electronic structure during a proton transfer while reducing the computational cost by 2–3 orders of magnitude. As we obtain a symmetry-adapted and analytically differentiable representation of the electronic structure, we are able to optimise electronic properties, such as the HOMO-LUMO gap, in a step towards inverse design of molecular structures. Beyond that, we show that the electronic structure predicted by our approach may serve as input to further quantum chemical calculations. For example, wavefunction restarts based on this ML model provide a significant speed-up of the self-consistent field (SCF) procedure due to a reduced number of iterations, without loss of accuracy. The latter showcases that quantum chemistry and machine learning can be used in tandem for future electronic structure methods.
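To make concrete what "predicting the eigenvalue spectrum and molecular orbitals of the Hamiltonian" involves downstream of such a model, the following minimal sketch shows how orbital energies, MO coefficients, and a HOMO-LUMO gap can be obtained once a Hamiltonian matrix H and an overlap matrix S in a non-orthogonal atomic-orbital basis are available. It is an illustration under stated assumptions, not the authors' implementation: the function names `solve_orbitals` and `homo_lumo_gap` are hypothetical, the 2×2 matrices are made-up toy numbers rather than model output, and a closed-shell system (doubly occupied orbitals) is assumed. The generalised eigenvalue problem H C = S C diag(ε) is solved here by Löwdin symmetric orthogonalisation using NumPy only.

```python
import numpy as np


def solve_orbitals(H, S):
    """Solve H C = S C diag(eps) for symmetric H and positive-definite S.

    Returns orbital energies in ascending order and the matching MO
    coefficient matrix (one orbital per column).
    """
    # Loewdin symmetric orthogonalisation: X = S^(-1/2)
    s_vals, s_vecs = np.linalg.eigh(S)
    X = s_vecs @ np.diag(s_vals ** -0.5) @ s_vecs.T
    # Ordinary symmetric eigenproblem in the orthogonalised basis
    eps, C_ortho = np.linalg.eigh(X @ H @ X)
    # Transform the eigenvectors back to the original atomic-orbital basis
    C = X @ C_ortho
    return eps, C


def homo_lumo_gap(eps, n_electrons):
    """Gap between lowest unoccupied and highest occupied orbital.

    Assumes a closed-shell system: each orbital holds two electrons.
    """
    n_occ = n_electrons // 2
    return eps[n_occ] - eps[n_occ - 1]


# Toy example with made-up numbers: two basis functions, two electrons.
H = np.array([[-1.0, -0.5],
              [-0.5, -1.0]])
S = np.array([[1.0, 0.25],
              [0.25, 1.0]])

eps, C = solve_orbitals(H, S)
gap = homo_lumo_gap(eps, n_electrons=2)
print("orbital energies:", eps)
print("HOMO-LUMO gap:", gap)
```

Because every step is a smooth matrix operation, the same computation written in an automatic-differentiation framework would yield gradients of the gap with respect to the matrix elements, which is the property that the property optimisation described above relies on.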


Atomic representation of molecular electronic structure

In quantum chemistry, the wavefunction associated with the electronic Hamiltonian