# Self-consistency

### Summary

Self-consistency is a concept closely linked with mean-field theories. In mean-field treatment of the Schrodinger equation, interacting particles give off a field, and interact with the (mean) field other particles generate. The field a particle senses controls its motion, which in turn affects the field it generates. Mean-field theories link the field a particle generates to the field it senses in a self-consistent manner. For the field to be time-independent and the effective field must be some average field. Thus in the mean-field approximation a particle moves in an average effective external field Veff(r) which determines the particles’ motion, which in turn determines Veff(r). Each feeds the other; the condition for which both are satisfied is the self-consistency condition.

### Self-Consistency in Density-functional Theory

Practical implementations of DFT, such as the LDA are instances of a mean-field theory. Formally the density n is determined by minimization of a total energy functional E[n]. In principle it could be done through iterating trial densities nin until the point of minimum energy is found. In practice this is not possible because realistic schemes must calculate the kinetic energy contribution to E[n] via an effective non-interacting Schrodinger equation. An effective one-particle potential Veff(r) is generated from the functional derivative Veff(r)=δE[n]/δn(r). Veff(r) defines a one-particle Hamiltonian which generates a new density nout, which differs from nin. Self-consistency is reached when an nout=nin. This density is also the one for which E[n] is minimum, or at least stationary, for certain formulations. The codes here implement the standard Hohenberg Kohn (HK) functional, for which the energy is a minimum, and also the Harris-Foulkes (HF) functional 1, for which the energy is stationary at self-consistency, though typically it is a maximum point.

Approaching self-consistency can be a tricky business, especially in magnetic systems (or whenever there is a degree of freedom with a very low energy scale associated with it).

#### Charge mixing, general considerations

Questaal codes follow the usual procedure of mixing a linear combination of input density  nin  and output density  nout  to make a trial  n*  as an estimate for the self-consistent density  n (see for example Chapter 9 in Richard Martin’s book2. In a perfect mixing scheme, it would be possible to infer the self-consistent density  n  directly from  nin  and  nout. Instead we must make an estimate  n*  from  nin  and  nout, use it for a new nin  find a new  nout, iterating until  noutnin  is small.

Practical implementaton of density mixing in Questaal is controlled by the ITER_MIX tag, as explained on this page.

If the static dielectric response is known, n* can approach the self-consistent density n to linear order in noutnin. It is not difficult to show that choosing

yields $n^* = n$ to linear order in $(n^\mathrm{out}{-}n^\mathrm{in})$. ε is a function of source and field point coordinates r and r′: ε = ε(r,r′) and in any case it is not given by the standard self-consistency procedure. Neverthless it can be important and some estimate for ε is needed. An especially pertinent example can be found in the NbFe superlattice tutorial.

In any case the Questaal codes do not have a plane wave representation so they do something else.

##### Charge mixing in the full-potential code

lmf uses a three component density: a smooth density n0 represented as plane waves (carried on a uniform mesh), defined everywhere in space and two local densities: the true density n1 and a one-center expansion n2 of the smooth density. (See Section 3.6 of Ref. 6 for a detailed description of augmentation and the density derived from it.)

If the density were expanded solely in plane waves n = ΣGCGnG, a simple mixing scheme would be to mix each CG using a model dielectric function. The Thomas Fermi approximation provides a reasonable, if rough estimate for ε, which reads in reciprocal space

Eq.(2) has one free parameter, the Thomas Fermi wave number kTF. It can be estimated given the total number of electrons qval from the free electron gas formula.

kF = (3π3/vol×qval)1/3 = EF1/2

This is called the “Kerker mixing” algorithm. The beauty of Kerker mixing is that charges in small G components of the density get damped out, while the short-ranged, large G components do not. One can use the Lindhard function instead. The idea is similar, but the Lindhard function is exact for free electrons; it has the same advantages as Kerker.

The mixing algorithm must mix all three components and it is somewhat involved, though the plane wave part does indeed use the Lindhard function. See fp/mixrho.f for details.

##### Charge mixing in the ASA

The ASA uses a simplified mixing scheme since the logarithmic derivative parameters P and energy moments of charge Q for each class is sufficient to completely specify the charge density. The density is not explicitly mixed. Since it does not have plane waves, some alternative model must be employed. One has been implemented (which diagonalizes the Madelung matrix and substitutes the eigenvalues for a kind of screened version). It dampens the the small G fluctuations, but at the same time it conceals true physical changes in the moments, and it is not ideal.

The ASA codes have the ability to make a discretized form ε(q=0) explicitly in spirit of the energy moments of charge that serve as a discretized representation of the density. This matrix is a good approximation to ε(q=0) in this context, and its use greatly stabilizes the convergence in difficult cases. The ASA Nb/Fe superlattice tutorial provides a classic illustration of this feature.

In summary the ASA codes (lm, lmgf, lmpg) offer two options:

1. A rough ε is obtained from eigenvalues of the Madelung matrix
2. The q=0 discretized polarization is explicitly calculated

There is some overhead associated with the second option, but it is not too large and having it greatly facilitates convergence in large systems. This is particularly important in magnetic metals, where there are low-energy degrees of freedom associated with the magnetic parts that require that a healthy component of  nout  for reasonable approach to self-consistency.

#### Mixing the density

The mixing process reduces to estimating a vector X* related to the density (e.g.  P,Q  in the ASA) where δX = XoutXin vanishes at Xin = X*.

Mixing algorithms mix linear combinations of (Xin,Xout) pairs taken from the current iteration together with pairs from prior iterations. If there are no prior iterations, then

X* = Xin + beta × (XoutXin)     (3)

It is evident from Eq.(1) that beta should be connected with the dielectric function. However, beta is just a number. If beta=1, X* = Xout; if beta→0, X* scarcely changes from Xin. Thus in that case you move like an “amoeba” downhill towards the self-consistent solution. For small systems it is usually sufficient to take beta on the order of, but smaller than unity. For large systems charge sloshing becomes a problem so you have to do something different. This is because the potential change goes as δV ~ G−2×δn so small G components of δn determine the rate of mixing. The simplest (but inefficient) choice is to make beta small.

By some method (see above) an estimate ε for the dielectric function is constructed. Construct δn = ε−1 (noutnin) and build δX from δn. Then estimate

X* = Xin + beta × δX     (4)

Now beta can be much larger again, of order unity.

The mixing algorithms use yet another device to accelerate the approach to self-consistency. Each iteration produces a new (Xin,Xout) pairs. There are acceleration techniques that given a family (Xin,Xout) pairs some combination of them can be taken that is better than taking any one pair. See ITER_MIX on this page.

#### LDA+U

LDA+U theory, an extension to LDA, is also a mean-field theory. At sites where U is added there is a self-consistency condition on the site density matrix, which must be incorporated together with the standard charge self-consistency. As of this writing, the density matrix is mixed separarately from the density, with its own mixing parameter

#### Self-Consistency in noncollinear magnetism

In noncollinear extensions to standard DFT, the density turns into a 2×2 matrix, for which at any point r the density may be decomposed into linear combinations of Pauli matrices. The decomposition defines the x,y,z components of the spin density. Using the connection between Pauli matrices and Cartesian coordinates, the spin density represented as spin amplitude and direction. This is how it is represented externally. Internally it is represented at times either as (amplitude, direction), or (when constructing the hamiltonian or charge density) as a 2×2 matrix. Noncollinear magnetism is implemented at present in the ASA only, and the spin orientation a site is assumed to be constant for the whole site. Self-consistency in this context means that the spin at each site is rotated so that the off-diagonal part of the 2×2 spin density matrix vanish.

### Self-Consistency in the Coherent Potential Approximation

Alloys consist of different kinds of atoms sitting at a given lattice site, or in the magnetic version, an atom of a particular type (e.g. Fe) with different spin configurations. The CPA replaces a statistical ensemble of atoms with which is taken to be an average, or composite of the true atoms. CPA theory is a way to construct the “average” properties of a composite atom, in an optimal manner, by constructing an the average scattering properties of a site. The theory is naturally formulated in the language of Green’s functions. The CPA has been implemented in the ASA for both magnetic and chemical disorder, in program lmgf. Thus lmgf requires two kinds of self-consistency: an outer loop for charge and an inner loop for the CPA condition. See here for a description of Questaal’s implementation of CPA.

### Self-Consistency in nonequilibrium transport

lmpg is a Green’s function code, similar to lmgf but designed for layered systems (periodic in two dimensions but not in the third). One of the best uses of this method is its ability to calculate transport in the Landauer-Buttiker framework. Outside the small bias regime, the potential must be determined self-consistently under nonequilibrium conditions. Questaal’s implementation is described in Ref [2].

### Self-Consistency in the Quasiparticle Self-Consistent GW Approximation

Self-consistency has a different purpose in the quasiparticle self-consistent GW approximation (QSGW) [3]. QSGW is based on a kind of self-consistent perturbation theory, where the self-consistency is used to construct an optimal non-interacting hamiltonian by minimizing the difference between it and the many-body hamiltonian, within the GW approximation. Self-consistency enables QSGW to describe optical properties in a wide range of materials rather well, including cases where the local-density and LDA-based GW approximations fail qualitatively. Self-consistency dramatically improves agreement with experiment, and is sometimes essential. QSGW avoids some formal and practical problems encountered in conventional self-consistent GW, which is more akin to the mean-field self-consistency described earlier. QSGW handles both itinerant and correlated electrons on an equal footing, without any ambiguity about how a localized state is defined, or how double-counting terms should be subtracted. Weakly correlated materials such as Na and sp semiconductors are described with uniformly high accuracy. Discrepancies with experiment are small and systematic, and can be explained in terms of the approximations made.

### Self-Consistency in tight-binding models

Typically empirical tight-binding models, for which the form of matrix elements is postulated, ignore self-consistency. However, the effects of charge transfer can be important, and the quality tight-binding models can be vastly improved with minimal complexity added to the hamiltonian [4].

This package’s implementation of empirical tight-binding hamiltonians, tbe, has a facility for including electrostatic interactions in a self-consistent manner. Site charges are accumulated from eigenvectors, which is used to make a Madelung potential

Typically tight-binding models ignore self-consistency; however, tbe has a capability to include potential changes from charge transfer in a self-consistent manner. The site charges generate an electrostatic potential through a Madelung matrix, and must be determined self-consistently.

### References

[1] M. Foulkes and R. Haydock, Tight-binding models and density-functional theory, Phys. Rev. B39, 12520 (1989). See also comment 3 on this page.

[2] S. V. Faleev, F. Leonard, D. A. Stewart, and M. van Schilfgaarde, Ab initio tight-binding LMTO method for nonequilibrium electron transport in nanosystems, Phys. Rev. B71, 195422 (2005).

[3] Takao Kotani, M. van Schilfgaarde, S. V. Faleev, Quasiparticle self-consistent GW method: a basis for the independent-particle approximation, Phys. Rev. B 76, 165106 (2007).

[4] M. W. Finnis, A. T. Paxton, M. Methfessel and M. van Schilfgaarde, Crystal Structures of Zirconia from First Principles and Self-Consistent Tight Binding, Phys. Rev. Lett. 81, 5149 (1998)

[5] R. M. Martin, Electronic Structure, Cambridge University Press (2004).

[6] Dimitar Pashov, Swagata Acharya, Walter R. L. Lambrecht, Jerome Jackson, Kirill D. Belashchenko, Athanasios Chantis, Francois Jamet, Mark van Schilfgaarde, Questaal: a package of electronic structure methods based on the linear muffin-tin orbital technique, submitted to CPC. Preprint https://arxiv.org/abs/1907.06021