# Fitting energies to classical model Hamiltonians

### Introduction

Questaal has a limited ability to build and use empirical classical model Hamiltonians for efficiently relaxing structures.

### Mathematical Framework

#### Definition of a Keating model

The ground-state total energy $E$ is a function of the nuclear coordinates $\{\mathbf{R}\}$ of the system and the atomic number of each nucleus. A good, sometimes very good, approximation to $E$ can be found from density-functional theory. This page explains how to fit an empirical classical Hamiltonian to DFT results and use the model to relax the system in other configurations.

Let the energy be expanded as a linear combination of one-body terms $O$, two-body interactions $P$ and three-body interactions $T$. Initially consider the following form:

$$ E = \sum_{mM} o_{m} O_{mM} + \sum_{mM} p_{m} P_{mM} + \sum_{mM} t_{m} T_{mM} $$

$O_{mM}$, $P_{mM}$ and $T_{mM}$ are functions of a single site ($M{\in}\mathbf{R}$), pairs ($M{\in}[\mathbf{R}_i,\mathbf{R}_j]$) and triplets ($M{\in}[\mathbf{R}_i,\mathbf{R}_j,\mathbf{R}_k]$), respectively. $m$ labels a function type, and it can change with the environment, i.e. $m=m(M)$. $o_{m}$, $p_{m}$, and $t_{m}$ are expansion coefficients to be determined by fitting to reference data.

We will assume $P_{mM}$ is a purely radial force, of the form

where $\lambda_m$ and $h_m$ are undetermined parameters. For any $M$, there can be a multiplicity of $m_{\small P}$ independent functions $(M1)$, i.e. with different ${\lambda_m}$ and ${h_m}$. Further, the shape of $P_{mM}$ can vary with $M$ by allowing a unique $m$ for each $M$. In the simplest case, ${\lambda_m}$ and ${h_m}$ would be universal constants, independent of $M$. Realistically, however, $\lambda_m$ and $h_m$ should depend on the two species types of the pair.

It is a truism of materials chemistry that bond lengths are largely a function of the size of the two atoms: this is the logic behind standard tables of “ionic radii,” for example. With this view in mind, $h_m$ should depend on the two chemical species at $\mathbf{R}_i$ and $\mathbf{R}_j$, and be approximately the “equilibrium” bond length, since $dP/dr$ vanishes at $h_m$. If we allow $m_{\small P}$ functions per $M$, and if $\{\mathbf{R}\}$ contains $\mathcal{N}_s$ distinct species, there will be $m_{\small P}{\times}\mathcal{N}_s(\mathcal{N}_s+1)/2$ distinct kinds of functions $P$. For a particular pair $M$, however, there will be at most $m_{\small P}$ functions.
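The counting argument above can be checked by enumerating the species pairs directly. The following is only a sketch: the function names and the `(species_a, species_b, k)` labels are hypothetical and are not part of Questaal. It assumes that a pair function is identified by the unordered pair of species plus an index `k` running over the $m_{\small P}$ functions.

```python
from itertools import combinations_with_replacement

def pair_function_types(species, m_p):
    """Label every distinct kind of pair function as (species_a, species_b, k).

    Assumption: the pair is unordered, so (Ga, As) and (As, Ga) share parameters.
    """
    pairs = combinations_with_replacement(sorted(species), 2)
    return [(a, b, k) for a, b in pairs for k in range(m_p)]

def types_for_pair(species_i, species_j, m_p):
    """Return the (at most m_p) function labels that apply to one particular pair."""
    a, b = sorted((species_i, species_j))
    return [(a, b, k) for k in range(m_p)]

# Two species and m_p = 2 give 2 * (2*3/2) = 6 kinds of pair function.
all_types = pair_function_types(["Ga", "As"], m_p=2)
print(len(all_types))
print(types_for_pair("Ga", "As", m_p=2))
```

Each label would carry its own $\lambda_m$ and $h_m$; any given pair only ever sees the labels matching its two species.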

Keating considered a sum of two kinds of forces: radial and angular. For a general tool, more flexibility would be necessary, which can be realized by folding more environmental information than the pair species into $P$. But such a more sophisticated approach would entail a proliferation of coefficients. This is a problem tailor-made for machine learning. For our purposes (small relaxations about known structures) we will adopt this simple approach in the spirit of Keating.

The simplest triplet functions purely of angle are the cosine and sine functions. The short-sightedness principle tells us that tightly bound clusters should carry more weight than extended ones, so we introduce a scaling factor into the angular function:

A natural choice for $l_{\small M}$ is the triangle’s perimeter. The factor $e^{-{\gamma_m}l_M}$ could be some more complicated function of the environment, but in this mode we adopt this form for simplicity, once again allowing $l_{\small M}$ to depend on $M$ only through the species at the three vertices of the triplet. Thus, there will be $2{\times}\mathcal{N}_s(\mathcal{N}_s+1)(\mathcal{N}_s+2)/6$ distinct kinds of functions, though only 2 for any particular triplet.
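As an illustration of the scaled angular functions, the sketch below evaluates a cosine and a sine term for one triplet, each multiplied by $e^{-\gamma l}$ with $l$ the perimeter. Several things here are assumptions rather than statements about Questaal: the angle is taken at the first vertex, the sine is taken as non-negative, and the function names are hypothetical. The second function just counts the species triplets to confirm the formula above.

```python
import math
from itertools import combinations_with_replacement

def triplet_terms(r_i, r_j, r_k, gamma):
    """Return (weight*cos(theta), weight*sin(theta)) for one triplet.

    Assumptions: theta is the angle at vertex r_i, and the weight is
    exp(-gamma * perimeter) with the perimeter of triangle (r_i, r_j, r_k).
    """
    d_ij = math.dist(r_i, r_j)
    d_ik = math.dist(r_i, r_k)
    d_jk = math.dist(r_j, r_k)
    perimeter = d_ij + d_ik + d_jk
    # Law of cosines for the angle between bonds i-j and i-k.
    cos_theta = (d_ij**2 + d_ik**2 - d_jk**2) / (2.0 * d_ij * d_ik)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against round-off
    sin_theta = math.sqrt(1.0 - cos_theta**2)
    weight = math.exp(-gamma * perimeter)
    return weight * cos_theta, weight * sin_theta

def n_triplet_function_types(n_species):
    """Two functions (cos, sin) per unordered species triplet."""
    return 2 * sum(1 for _ in combinations_with_replacement(range(n_species), 3))

# A right angle at the origin: the cosine term vanishes, the sine term is the bare weight.
c, s = triplet_terms((0, 0, 0), (1, 0, 0), (0, 1, 0), gamma=1.0)
print(c, s)
print(n_triplet_function_types(2))  # 2 * (2*3*4/6) = 8
```

A larger $\gamma$ suppresses extended triangles more strongly, which is the intent of weighting compact clusters more heavily.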

In the same spirit, the one-body term should be a constant, but there can be an independent constant for each species. However, these functions will be linearly independent only if the configurations $\mathbf{R}$ included in the fit contain different proportions of species. The number of one-body terms should not exceed the number of linearly independent proportions of species. Collecting the coefficients $o_m$, $p_m$ and $t_m$ into a single set $x_l$, the energy can be written compactly as

$$ E = \sum_{lL} x_{l} X_{lL} $$

$X_{lL}$ is a one-, two-, or three-body term, while $l$ denotes the function type: it includes the $\mathcal{N}_s$ one-body terms, the $m_{\small P}{\times}\mathcal{N}_s(\mathcal{N}_s+1)/2$ two-body terms, and the $2{\times}\mathcal{N}_s(\mathcal{N}_s+1)(\mathcal{N}_s+2)/6$ three-body terms. For a particular $L$, $X_{lL}$ vanishes unless the species at the vertices coincide with the species belonging to that $l$.
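To make the bookkeeping concrete, the sketch below totals the function types for a given $\mathcal{N}_s$ and $m_{\small P}$, and checks how many one-body constants a set of reference configurations can actually determine. The rank test is an illustrative choice, not a description of what Questaal does internally: each configuration is represented by its atom counts per species, and the rank of that matrix is the number of linearly independent proportions. All names are hypothetical.

```python
from fractions import Fraction

def n_function_types(n_species, m_p):
    """Total number of function types: one-, two- and three-body."""
    one_body = n_species
    two_body = m_p * n_species * (n_species + 1) // 2
    three_body = 2 * n_species * (n_species + 1) * (n_species + 2) // 6
    return one_body + two_body + three_body

def rank(rows):
    """Rank of a small matrix by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# Rows are configurations, columns are atom counts for each of two species.
same_proportions = [[4, 4], [8, 8]]           # both 1:1, so only one constant is resolvable
mixed_proportions = [[4, 4], [8, 8], [5, 3]]  # a 5:3 cell makes both constants resolvable
print(n_function_types(2, 1))   # 2 + 3 + 8 = 13
print(rank(same_proportions), rank(mixed_proportions))
```

If the rank is smaller than $\mathcal{N}_s$, the surplus one-body constants cannot be separated by the fit and should be dropped or tied together.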