Chapter 3 General Optimisation Algorithm to Find Classical Magnetic Ground States

Abstract

A number of methods exist to determine the ground state magnetic order of interacting spins on periodic lattices. This chapter illustrates the method used in this thesis for the case of an Ising-like antiferromagnet on a square lattice, for general orientation of applied magnetic field, where the magnetic structure is not collinear. This is followed by the general formalisation of the approach for the case of magnetic systems of many sublattices, with arbitrary bilinear exchange.

§ 3.1 Introduction

In order to make predictions of physical properties such as magnetic torque and linear magnetisation, calculations of magnetic structure using a gradient descent algorithm are performed in this thesis. This allowed the development of a classical understanding of the pseudo-spin model of the materials discussed later. The approach treats spins as classical vectors and allows calculations to be performed much faster than quantum spin methods such as exact diagonalisation, which is intractable for many three-dimensional systems. A classical, three-dimensional pseudospin model is particularly useful when the system orders, where classical models are good approximations [81, 59].

The Hamiltonian for a collection of interacting spins in some applied magnetic field 𝑩 can be written as

$$\mathcal{H} = \sum_{ij\alpha\beta} J_{ij}^{\alpha\beta} S_i^{\alpha} S_j^{\beta} \;-\; \sum_{i\alpha\beta} g_i^{\alpha\beta} B^{\alpha} S_i^{\beta} \qquad (3.1)$$
$$\;\;\, = \sum_{ij} \mathbf{S}_i^{\mathsf{T}} \mathbb{J}_{ij} \mathbf{S}_j \;-\; \sum_{i} \mathbf{B}^{\mathsf{T}} \mathbb{G}_i \mathbf{S}_i \qquad (3.2)$$

where $\mathbb{J}_{ij}$ is the exchange tensor between sites $i$ and $j$, and $\mathbf{S}_i^{\mathsf{T}}$ is the transpose of the fixed-length spin vector at site $i$. $\mathbb{G}_i$, with elements $g_i^{\alpha\beta}$, is the g-tensor at site $i$ in units of $\mu_B$. The classical ground state is the set of spins $\psi_0 = \{\mathbf{S}_i\}$ that minimises $\mathcal{H}$, under the condition that all spins have a fixed length $s_i$:

$$|\mathbf{S}_i| = s_i \quad \forall\, i \qquad (3.3)$$

Finding this ground state is fundamentally an optimisation problem in which the energy of the system is minimised over the state space. Numerical solutions to these problems are well studied, in physics as well as computer science. A number of algorithms exist to solve these, such as branch and bound, simulated annealing, and gradient descent.

An approach to finding the exact solution in zero field is the Luttinger-Tisza method [63], which works by replacing the strong condition (3.3) with the so-called weak condition

$$\sum_i \left(|\mathbf{S}_i|^2 - s_i^2\right) = 0 \qquad (3.4)$$

which is necessary but insufficient for the validity of (3.3). The Hamiltonian can be exactly solved under this condition using a Lagrange multiplier method [64]. If the solution also satisfies the strong condition then it is the true ground state.

A commonly used numerical method for solving these problems in physics is simulated annealing [54]. This iterative method involves considering the current state $\mathbf{x}_n$ with energy $E$ and some 'neighbouring' state $\mathbf{x}_n^{*}$ with energy $E^{*}$. The state is updated to this new value with probability $P(E, E^{*}, T)$, which is commonly

$$P(E, E^{*}, T) = \begin{cases} 1 & \text{if } E^{*} < E \\ \exp\!\left(\dfrac{E - E^{*}}{T}\right) & \text{otherwise} \end{cases} \qquad (3.5)$$

where $T$ is a 'temperature' parameter which decreases over time. This algorithm provably converges if the cooling schedule is sufficiently slow; however, it is challenging to determine the ideal cooling rate a priori.
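As an illustration of the acceptance rule (3.5), the following is a minimal Python sketch of a single simulated-annealing update. The spin representation, the random perturbation used to generate the 'neighbouring' state, and the energy_fn callable are illustrative assumptions rather than the implementation used in this thesis.

```python
import numpy as np

def anneal_step(state, energy_fn, temperature, rng, angle=0.2):
    """One Metropolis-style update: perturb one spin and accept with Eqn. (3.5)."""
    new_state = state.copy()
    i = rng.integers(len(state))
    # 'neighbouring' state: nudge spin i by a small random vector, then renormalise
    new_state[i] = new_state[i] + angle * rng.normal(size=3)
    new_state[i] /= np.linalg.norm(new_state[i])
    e_old, e_new = energy_fn(state), energy_fn(new_state)
    if e_new < e_old or rng.random() < np.exp((e_old - e_new) / temperature):
        return new_state  # accepted
    return state          # rejected
```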

A different method, used in this thesis, is based on gradient descent (GD). GD works by repeatedly adjusting the state in the direction which locally decreases the energy fastest. This method reliably and efficiently finds local minima in energy, but some care is needed to find global minima. A more in-depth explanation of the algorithm is given in Section 3.4.

§ 3.2 Example: Phase diagram of an Ising-like antiferromagnet on a square lattice for general orientation of magnetic field.

Figure 3.1: Schematic diagram showing a Néel order antiferromagnet on a square lattice. Red plus '+' symbols indicate spins along the +𝒛 axis, out of the page. Blue minus '−' symbols indicate spins along the −𝒛 axis, into the page. Overlaid diagrams show a) primitive lattice vectors 𝒂, 𝒃, b) magnetic unit cell lattice vectors 𝑴𝒂, 𝑴𝒃, c) magnetic unit cell outline with labelled sublattices A and B, and d) nearest-neighbour exchange path J.

Numerical mean field calculations of magnetic structures are particularly useful when considering systems with many exchange terms, or with applied magnetic fields along non-symmetry directions. In order to demonstrate their utility, I will use the example of a square lattice antiferromagnet with nearest-neighbour anisotropic XXZ exchange. A schematic diagram of such a lattice can be seen in Figure 3.1, which shows the primitive lattice vectors 𝒂 and 𝒃.

In this system, the total energy can be written as

β„‹=12β’βˆ‘i,jJ⁒(Six⁒Sjx+Siy⁒Sjy+Δ⁒Siz⁒Sjz)βˆ’βˆ‘ig⁒μB⁒𝑩⋅𝑺i (3.6)

where the half prefactor is to correct for double counting of bonds, J>0 parametrises the antiferromagnetic exchange, and Δ allows for exchange anisotropy. In this example Δ>1 indicates easy-axis Ising-type anisotropy.

The energy on a bond is minimised by having the adjacent spins antiparallel and collinear with the Ising 𝒛 axis, such that the bond energy is −ΔJS². On the square lattice, this can be satisfied for every bond by adopting Néel order, such that adjacent spins are always antiparallel, shown as alternating + and − symbols in Figure 3.1. Because this structure minimises the energy on every bond, it must be the ground state when 𝑩=0.

The magnetic unit cell vectors are

$$\begin{pmatrix}\mathbf{M}_a \\ \mathbf{M}_b\end{pmatrix} = \begin{pmatrix}1 & -1 \\ 1 & 1\end{pmatrix}\begin{pmatrix}\mathbf{a} \\ \mathbf{b}\end{pmatrix} \qquad (3.7)$$

with

$$\det\begin{pmatrix}1 & -1 \\ 1 & 1\end{pmatrix} = 2 \qquad (3.8)$$

showing that there are two primitive lattice points per magnetic unit cell. As such, the magnetic unit cell volume (in this case, area) is twice that of the primitive unit cell. These magnetic unit cell vectors can be seen in Figure 3.1b, and the outline of the magnetic unit cell in Figure 3.1c, with the two magnetic sublattices labelled A and B.

In order to understand how the spin orientations vary with applied magnetic field in this system, it is assumed that the magnetic unit cell does not change with applied magnetic field. The mean field energy per magnetic unit cell (Eqn. 3.6) can then be rewritten in terms of the spin orientations of the two sublattices, 𝑺A and 𝑺B:

$$\frac{\mathcal{H}}{N} = nJ\left(S_A^x S_B^x + S_A^y S_B^y + \Delta\, S_A^z S_B^z\right) \;-\; g\mu_B\, \mathbf{B}\cdot\left(\mathbf{S}_A + \mathbf{S}_B\right) \qquad (3.9)$$

where n=4 is the number of nearest neighbours of each site, and N is the total number of magnetic unit cells in the crystal. In this simple case, constructing the Hamiltonian as a function of sublattice spins is straightforward by inspection. In the most general case, with longer-range interactions and arbitrary numbers of sublattices, the formalism is non-trivial and is discussed in detail in Section 3.3.

One interesting feature of this system is that it undergoes a spin-flop transition when Δ>1: a metamagnetic phase transition which occurs when a sufficiently strong field is applied along the Ising 𝒛 axis. The zero-field ground state has spins along the 𝒛 axis, so when a field is applied the Zeeman energy

$$E_Z = -\frac{1}{2} g\mu_B\, \mathbf{B}\cdot\left(\mathbf{S}_A + \mathbf{S}_B\right) \qquad (3.10)$$

does not decrease with a magnetic field along 𝒛. The spins are also not able to cant towards the field as they would if the field were applied in the xy plane. Determining the field at which the spin-flop occurs, and the field dependence of the magnetisation, can be done analytically and is demonstrated in many textbooks on the subject [13]. In this model, the spin-flop field $B_{\mathrm{SF}}$ is found to be

$$\mu_B\, g\, B_{\mathrm{SF}} = S\, n\, J\sqrt{\Delta^2 - 1} \qquad (3.11)$$

where a spin-flop occurs for all values of Δ>1. At this field, the magnetisation changes discontinuously from 0 at $B < B_{\mathrm{SF}}$ to a finite value at $B > B_{\mathrm{SF}}$.

However, the question of how the system behaves for non-axial fields, applied at some arbitrary angle away from the 𝒛 axis, cannot be answered so cleanly analytically. For this reason, it is useful to develop numerical methods for solving the problem. A natural approach is an iterative algorithm called gradient descent. This algorithm works by starting from some state and finding the direction in which the spins can be adjusted so as to decrease the energy the most. This is repeated until a minimum in energy is found with satisfactory precision.

In this example, Equation 3.9 can be differentiated to give

1Nβ’βˆ‡π‘ΊAβ„‹=n⁒J⁒(SBxSByΔ⁒SBz)βˆ’g⁒μB⁒𝑩 (3.12)

which can be computed, and the sublattice spin 𝑺A then adjusted along the $-\nabla_{\mathbf{S}_A}\mathcal{H}$ direction, such that the energy decrease is maximised. Unfortunately, this runs into the problem that the magnitude of the spin |𝑺A| is unconstrained. For instance, in the Néel-ordered ground state with no field,

$$\mathbf{S}_A = S\hat{\mathbf{z}}$$
$$\mathbf{S}_B = -S\hat{\mathbf{z}}$$
$$\frac{1}{N}\nabla_{\mathbf{S}_A}\mathcal{H} = -\Delta\, n J S\, \hat{\mathbf{z}}$$

so adjusting 𝑺A by $-\gamma\nabla_{\mathbf{S}_A}\mathcal{H}$ with small γ>0 will increase the magnitude |𝑺A|:

$$\left|\mathbf{S}_A - \gamma\nabla_{\mathbf{S}_A}\mathcal{H}\right| = |\mathbf{S}_A| - \gamma\, \hat{\mathbf{S}}_A\cdot\nabla_{\mathbf{S}_A}\mathcal{H} + \mathcal{O}(\gamma^2) \qquad (3.13)$$

To solve this problem, I note that in the limit of small step size γ≪1, |𝑺A| will not change if $\mathbf{S}_A\cdot\nabla_{\mathbf{S}_A}\mathcal{H} = 0$. This means that the sublattice spins should be rotated in the plane containing the spin vector and the gradient $\nabla_{\mathbf{S}_A}\mathcal{H}$. The formalism for this is developed in Section 3.4, but in summary the condition |𝑺A|=S is treated as defining a spherical manifold ℳ, and the adjustments to the state occur in the tangent space of this manifold, $T_p\mathcal{M}$.

In order to keep the step direction in this plane, I project the energy-minimising vector $-\nabla_{\mathbf{S}_A}\mathcal{H}$ into the plane and use that as the step direction, and then normalise the new vector again to avoid accumulation of 𝒪(γ²) errors:

$$\mathbf{S}_{A,n+1} = \mathrm{proj}_S\!\left[\mathbf{S}_{A,n} - \gamma\, \mathrm{proj}_T \nabla_{\mathbf{S}_A}\mathcal{H}\right] \qquad (3.14)$$

where $\mathrm{proj}_S$ indicates the projection onto a sphere with radius S and $\mathrm{proj}_T$ indicates projection into the plane normal to 𝑺A. This can then be iterated for each spin until an energy minimum is found. Section 3.4 formalises this approach and derives a generalisation of the above expression.
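The update of Equation 3.14 for this two-sublattice example is simple enough to sketch directly in Python. The function names, the fixed step size, and the crude stopping criterion below are illustrative choices for this sketch only, not the implementation used for the calculations in this thesis.

```python
import numpy as np

def gradient(s_other, n, J, Delta, gmuB_B):
    """(1/N) * dH/dS for one sublattice, Eqn. (3.12); s_other is the other sublattice spin."""
    return n * J * np.array([s_other[0], s_other[1], Delta * s_other[2]]) - gmuB_B

def step(s, grad, gamma, length):
    """Eqn. (3.14): project the gradient into the tangent plane, step, re-project onto the sphere."""
    tangent = grad - (grad @ s) * s / length**2   # remove the component parallel to s
    s_new = s - gamma * tangent
    return length * s_new / np.linalg.norm(s_new)

# relax the two sublattices from a random initial state at zero field
rng = np.random.default_rng(0)
S, n, J, Delta = 0.5, 4, 1.0, 2.0                 # Table 3.1 parameters, J in meV
gmuB_B = np.zeros(3)                              # g * mu_B * B, also in meV
sA, sB = (S * v / np.linalg.norm(v) for v in rng.normal(size=(2, 3)))
for _ in range(10000):
    sA, sB = (step(sA, gradient(sB, n, J, Delta, gmuB_B), 1e-3, S),
              step(sB, gradient(sA, n, J, Delta, gmuB_B), 1e-3, S))
print(sA, sB)   # antiparallel spins along +/- z: the Neel ground state
```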

Figure 3.2: Phase diagram showing magnetisation per site for magnetic fields in the xz plane up to 50 T, for an Ising-type XXZ antiferromagnet with calculation parameters found in Table 3.1. Shading shows magnetisation projected onto the magnetic field direction, as described in the text. Overlaid arrows show the alignment of the sublattice spins at various magnetic fields. θ is the angle between the magnetic field and the Ising 𝒛 axis. α, β are the angles the sublattice spins make with the magnetic field direction. The circular black boundary shows the second-order phase transition from the canted antiferromagnet to the polarised phase. For the special directions 𝑩∥𝒛 and 𝑩⟂𝒛, α=β; for general field orientations, α≠β.
Figure 3.3: a) Evolution of magnetisation versus magnetic field for field aligned along (cyan) and perpendicular to (red) the Ising axis of an XXZ antiferromagnet on a square lattice. b) Schematic diagram showing the arrangement of axes and the definition of angles in the rest of the figure. The angle θ is measured from the Ising 𝒛 axis towards the xy plane. c) Evolution of magnetisation around the spin-flop field for angles 0° ≤ θ ≤ 15° and fields 10 T ≤ B ≤ 25 T. d) Evolution of magnetic torque with magnetic field for angles 0° ≤ θ ≤ 15° and fields 0 T ≤ B ≤ 50 T. Calculation parameters are found in Table 3.1.
Table 3.1: Calculation parameters for the square-lattice XXZ antiferromagnet, using the definitions in Equation 3.9

  n   4
  S   1/2
  g   2
  J   1 meV
  Δ   2

This calculation was performed with the parameters in Table 3.1. From these, one expects a spin-flop transition to occur at $\mu_B B_{\mathrm{SF}} = 1$ meV, or $B_{\mathrm{SF}} \approx 17.28$ T, and polarisation to occur at $\mu_B B_{\mathrm{P}} = 2.414$ meV, or $B_{\mathrm{P}} \approx 41.71$ T. The results of these calculations can be seen in Figure 3.3. Fig. 3.3a) shows the evolution of magnetisation against magnetic field for field orientations parallel and perpendicular to the 𝒛 axis. For fields perpendicular to 𝒛, the magnetisation increases linearly in field from 0 μB at 0 T to 1 μB at 41.7(1) T. For fields parallel to 𝒛, the magnetisation remains at 0 μB between 0 and 17.3(1) T, and then discontinuously joins the 𝑩⟂𝒛 curve at higher fields.

Figure 3.2 shows how the projection of magnetisation onto the magnetic field direction

$$\frac{\mathbf{M}\cdot\mathbf{B}}{|\mathbf{B}|} \qquad (3.15)$$

varies as the magnetic field vector 𝑩 varies in the xz plane. This shows a spin-flop transition at the isolated point $\mathbf{B} = B_{\mathrm{SF}}\,\hat{\mathbf{z}}$, such that if the magnetic field is applied directly along the 𝒛 axis, there is a phase transition and a discontinuous change in magnetisation at $|\mathbf{B}| = B_{\mathrm{SF}}$. This also shows that, for off-axis fields, there is no phase transition, but instead a smooth crossover between the two regimes. It also shows that the spin orientations which minimise the free energy take different angles with the magnetic field, α≠β, except in the special cases of axial fields θ=0 and θ=π/2, highlighting why this calculation is challenging to do analytically. For all field orientations, the system polarises at the same field $|\mathbf{B}| = B_{\mathrm{P}} \approx 41.71$ T, which is in all cases a continuous phase transition.

Fig. 3.3b-d) show the effect of off-axis fields on this system. Fig. 3.3b) shows how the angle θ is defined in this setup: measured from the 𝒛 axis towards the xy plane. At zero field the spins are aligned parallel to the 𝒛 axis. Fig. 3.3c) shows the magnetisation against field curves for θ between 0° and 15°. At θ=0° the field is exactly along the axis and the 𝑩∥𝒛 curve discussed above is recovered. As the field moves further off axis, the discontinuity at $|\mathbf{B}| = B_{\mathrm{SF}}$ is immediately replaced by a smooth crossover between the low- and high-field regimes. At higher values of θ, the crossover becomes less sharp. Fig. 3.3d) shows how the magnetic torque, taken around the 𝒚=𝒛×𝒙 axis shown in Fig. 3.3b), evolves with applied magnetic field at different values of θ. At θ=0° the field is exactly along the 𝒛 axis, and the torque is constrained by symmetry to be exactly 0 at all fields. For small values of θ, there is a sharp peak in torque around the spin-flop field. As θ increases, the peak becomes broader and moves slightly up in field. At the polarisation field $B_{\mathrm{P}}$, the magnetisation changes slope at all angles θ, indicating a continuous phase transition. There is no anomaly in torque at $B_{\mathrm{P}}$ at any angle θ. These results reproduce what is expected theoretically [84, 14] and what has been seen experimentally [113].

§ 3.3 General formalism for multi-sublattice magnetic systems

The previous section outlines the ground state determination method for the case of two sublattices in the magnetic unit cell and nearest-neighbour exchange. This is appropriate for many simple systems. This section demonstrates the appropriate formalism for multi-sublattice systems with exchanges beyond nearest-neighbours.

The general form of a spin Hamiltonian with exchange and Zeeman terms is

$$\mathcal{H} = \sum_{\alpha\beta ij} J_{ij}^{\alpha\beta} S_i^{\alpha} S_j^{\beta} \;-\; \sum_{i\alpha\beta} g_i^{\alpha\beta} B^{\alpha} S_i^{\beta} \qquad (3.16)$$
$$\;\;\, = \sum_{ij} \mathbf{S}_i^{\mathsf{T}} \mathbb{J}_{ij} \mathbf{S}_j \;-\; \sum_{i} \mathbf{B}^{\mathsf{T}} \mathbb{G}_i \mathbf{S}_i \qquad (3.17)$$

allowing for all bilinear exchange terms $J_{ij}^{\alpha\beta}$ and anisotropic g-factors $g_i^{\alpha\beta}$. This form of the Hamiltonian requires indexing over all sites in the crystal. These sites, however, exist on a Bravais lattice, whose symmetry constraints are hard to exploit in this indexed form. An equivalent formulation of the Hamiltonian makes this explicit by summing over the sites in each primitive unit cell, and over the lattice points in the crystal. This gives

β„‹=βˆ‘Ξ±β’Ξ²β’i⁒j⁒𝒖⁒𝒗Ji⁒j⁒𝒖⁒𝒗α⁒β⁒Si⁒𝒖α⁒Sjβ’π’—Ξ²βˆ’βˆ‘Ξ±β’Ξ²β’i⁒𝒖giα⁒β⁒Bα⁒S𝒖β (3.18)

where i,j index over the sites in the unit cell and 𝒖,𝒗 index over the lattice vectors in the crystal. Additionally, the value of the Hamiltonian must be invariant under the symmetry operations of the crystal, including translations by lattice vectors. In other words, it must be invariant under the transformation:

$$\mathbf{S}_{i\mathbf{u}} \rightarrow \mathbf{S}_{i,\mathbf{u}+\mathbf{t}} \qquad (3.19)$$

where 𝒕 is a lattice vector. This constrains the exchange parameters, allowing them to be rewritten:

$$J_{ij\,\mathbf{u},\mathbf{u}+\mathbf{t}} \rightarrow J_{ij\mathbf{t}} \qquad (3.20)$$

Including only the unique terms allowed by the translational symmetry of the lattice then gives

β„‹=βˆ‘Ξ±β’Ξ²β’i⁒j⁒𝒖⁒𝒕Ji⁒j⁒𝒕α⁒β⁒Si⁒𝒖α⁒Sj⁒𝒖+π’•Ξ²βˆ’βˆ‘Ξ±β’Ξ²β’i⁒𝒖giα⁒β⁒Bα⁒Si⁒𝒖β (3.21)

The exchange part of the Hamiltonian can then be fully specified by a set of exchange parameters (nearest-neighbour, next-nearest neighbour, and so on).

In order to convert the Hamiltonian into a sum over sublattices, a magnetic unit cell needs to be assumed. This means that the spins $\mathbf{S}_{i\mathbf{u}}$ and $\mathbf{S}_{i\mathbf{v}}$ at the same site index i, for two different lattice vectors 𝒖, 𝒗, are equivalent if 𝒖 and 𝒗 differ by a magnetic lattice vector. Formally:

$$\mathbf{S}_{i\mathbf{u}} \equiv \mathbf{S}_{i\mathbf{v}} \quad \text{if } \exists\, \mathbf{m}\in\mathbb{M} : \mathbf{u} = \mathbf{v} + \mathbf{m} \qquad (3.22)$$

where 𝕄 is the vector space of all magnetic lattice vectors.

This means that the lattice vectors 𝒖, 𝒗 can be treated as equivalent if they differ by a magnetic lattice vector. This is the same as performing the algebra in the quotient space V/𝕄, where V is the vector space of primitive lattice vectors. It is worth noting that 𝕄 is a vector subspace of V. A useful object to define is the equivalence class of a vector in V,

[𝒖]≔{𝒖+π’Žβˆ£βˆ€π’Žβˆˆπ•„} (3.23)

which is the set of all primitive lattice vectors which are equivalent under the magnetic unit cell.

To put it simply, by imposing a magnetic unit cell 𝕄, two spins $\mathbf{S}_{i\mathbf{u}}$ and $\mathbf{S}_{i\mathbf{v}}$ at the same site index i, for two different primitive lattice vectors 𝒖, 𝒗, are equivalent if [𝒖]=[𝒗]. The set of equivalence classes here can be understood as the set of primitive lattice points in each magnetic unit cell. If there is only one site per primitive cell, then the equivalence classes are just the sublattices. If there is more than one site per primitive cell, the sublattices can be indexed by i,[𝒖], and the number of sublattices is equal to the number of sites per primitive unit cell × the number of primitive lattice points per magnetic cell.

Because of this imposed symmetry, the spin at site i,𝒖 can be rewritten

$$\mathbf{S}_{i\mathbf{u}} \rightarrow \mathbf{S}_{i[\mathbf{u}]} \qquad (3.24)$$

and the Hamiltonian can be rewritten as

β„‹=βˆ‘Ξ±β’Ξ²β’i⁒j⁒𝒖⁒𝒕Ji⁒j⁒𝒕α⁒β⁒Si⁒[𝒖]α⁒Sj⁒[𝒖+𝒕]Ξ²βˆ’βˆ‘Ξ±β’Ξ²β’i⁒𝒖giα⁒β⁒Bα⁒Si⁒[𝒖]Ξ² (3.25)

This still sums over all of the sites in the crystal, but the sum can be simplified by noting that for a finite size magnetic unit cell, there are a finite number of equivalence classes. The sum is taken over the magnetic unit cells, and the sites per magnetic unit cell

β„‹ =βˆ‘π’Žβˆˆπ•„βˆ‘Ξ±β’Ξ²β’i⁒j⁒𝒑⁒𝒕Ji⁒j⁒𝒕α⁒β⁒Si⁒[𝒑+π’Ž]α⁒Sj⁒[𝒑+π’Ž+𝒕]Ξ²βˆ’βˆ‘π’Žβˆˆπ•„βˆ‘Ξ±β’Ξ²β’i⁒𝒑giα⁒β⁒Bα⁒Si⁒[𝒑+π’Ž]Ξ² (3.26)
=N⁒(βˆ‘Ξ±β’Ξ²β’i⁒j⁒𝒑⁒𝒕Ji⁒j⁒𝒕α⁒β⁒Si⁒[𝒑]α⁒Sj⁒[𝒑+𝒕]Ξ²βˆ’βˆ‘Ξ±β’Ξ²β’i⁒𝒑giα⁒β⁒Bα⁒Si⁒[𝒑]Ξ²) (3.27)

where the first sum is taken over all magnetic unit cells in the crystal, 𝒎∈𝕄, N is the number of magnetic unit cells in the crystal, and the sum over 𝒑 is over the primitive lattice points per magnetic unit cell (the equivalence classes of V/𝕄). The above uses the property of the equivalence classes that [𝒑+𝒎]=[𝒑] for all 𝒎∈𝕄.

Here, the factor of N can be safely removed by working with the energy per magnetic unit cell, ℋ/N → ℋ, such that

β„‹=βˆ‘Ξ±β’Ξ²β’i⁒j⁒𝒑⁒𝒕Ji⁒j⁒𝒕α⁒β⁒Si⁒[𝒑]α⁒Sj⁒[𝒑+𝒕]Ξ²βˆ’βˆ‘Ξ±β’Ξ²β’i⁒𝒑giα⁒β⁒Bα⁒Si⁒[𝒑]Ξ² (3.28)

The final hurdle is the sum over 𝒕, which still runs over all primitive lattice vectors. This is easily resolved because, in practice, only a small number of interaction terms $J_{ij\mathbf{t}}^{\alpha\beta}$ are non-zero. The sum can therefore run over only the non-zero interactions which have been defined (nearest-neighbour, next-nearest-neighbour, etc.), which is likely to be a small set in any real-world application, except for dipolar-coupled systems, which are not the focus of this thesis.

For the sake of completeness, the classical Hamiltonian can be rewritten in terms of the exchange interaction between the sublattices by defining

$$\tilde{J}_{i[\mathbf{p}],j[\mathbf{r}]}^{\alpha\beta} := \sum_{\mathbf{m}\in\mathbb{M}} J_{ij(\mathbf{r}-\mathbf{p}+\mathbf{m})}^{\alpha\beta} \qquad (3.29)$$

such that

β„‹=βˆ‘Ξ±β’Ξ²β’i⁒[𝒑]⁒j⁒[𝒓]J~i⁒[𝒑],j⁒[𝒓]α⁒β⁒Si⁒[𝒑]α⁒Sj⁒[𝒓]Ξ²βˆ’βˆ‘Ξ±β’Ξ²β’i⁒[𝒑]giα⁒β⁒Bα⁒Si⁒[𝒑]Ξ² (3.30)

where $i[\mathbf{p}], j[\mathbf{r}]$ sum pairwise over the sublattices. This is the same form as Equation 3.9 in the example, and provides a good starting point for numerically solving these problems. The following section formalises a method for numerically minimising the mean field energy for Hamiltonians of this form.
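The main bookkeeping step when building $\tilde{J}$ from Equation 3.29 is folding a primitive lattice vector into its equivalence class [𝒖]. A minimal Python sketch of this fold is given below, assuming the magnetic cell is specified as an integer matrix M whose rows are the magnetic cell vectors in the primitive basis (as in Equation 3.7); the function name and conventions are illustrative only.

```python
import numpy as np

def fold(u, M):
    """Return a canonical representative of the equivalence class [u] (Eqn. 3.23):
    u minus the integer combination of magnetic cell vectors (rows of M) needed to
    bring it into the first magnetic unit cell."""
    u = np.asarray(u)
    frac = np.linalg.solve(M.T, u)           # coordinates of u in the magnetic cell basis
    rep = u - M.T @ np.floor(frac + 1e-9)    # subtract whole magnetic lattice vectors
    return tuple(np.rint(rep).astype(int))

# square-lattice Neel cell of Eqn. (3.7): two classes, matching sublattices A and B
M = np.array([[1, -1], [1, 1]])
print(fold((0, 0), M), fold((1, 1), M))   # both fold to (0, 0): sublattice A
print(fold((1, 0), M), fold((0, 1), M))   # both fold to (1, 0): sublattice B
```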

The effect of applying a finite size magnetic unit cell is that only a subset of magnetic propagation vectors 𝒒 are allowed, such that

π’’β‹…π’Žβˆˆβ„€βˆ€π’Žβˆˆπ•„ (3.31)

This 'rationalisation' of k-space is the standard consequence of applying periodic boundary conditions to a crystal. It should be noted that incommensurate magnetic structures have irrational propagation vectors, so they can never be found with this method.

As the choice of magnetic unit cell 𝕄 limits the solutions that can be found, it must be chosen from either prior theoretical work or empirical evidence of the propagation vector such as neutron diffraction experiments.
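Condition 3.31 is straightforward to check numerically for a candidate propagation vector. The sketch below assumes 𝒒 is given in reciprocal lattice units and that the magnetic cell is again specified by an integer matrix M whose rows are the magnetic cell vectors in the primitive basis; the function name is illustrative.

```python
import numpy as np

def is_commensurate(q, M, tol=1e-8):
    """Eqn. (3.31): q is allowed by the magnetic cell if q.m is an integer for every
    magnetic lattice vector m, i.e. if every entry of M @ q is an integer."""
    x = M @ np.asarray(q, dtype=float)
    return bool(np.all(np.abs(x - np.rint(x)) < tol))

M = np.array([[1, -1], [1, 1]])              # the Neel cell of Eqn. (3.7)
print(is_commensurate([0.5, 0.5], M))        # True: the Neel propagation vector
print(is_commensurate([0.3, 0.0], M))        # False: incommensurate with this cell
```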

§ 3.4 Riemannian Optimisation Algorithm

This section formalises a method for solving optimisation problems where the state space is not a vector space. This is needed for finding the ground state of magnetic Hamiltonians of the form

$$\mathcal{H} = \sum_{\alpha\beta ij} \tilde{J}_{ij}^{\alpha\beta} S_i^{\alpha} S_j^{\beta} \;-\; \sum_{\alpha\beta i} g_i^{\alpha\beta} B^{\alpha} S_i^{\beta} \;+\; \sum_i A_i(\mathbf{S}_i) \qquad (3.32)$$
$$\;\;\, = \sum_{ij} \mathbf{S}_i^{\mathsf{T}} \mathbb{J}_{ij} \mathbf{S}_j \;-\; \sum_{i} \mathbf{B}^{\mathsf{T}} \mathbb{G}_i \mathbf{S}_i \;+\; \sum_i A_i(\mathbf{S}_i) \qquad (3.33)$$

where $A_i(\mathbf{S}_i)$ parametrises local anisotropy on site i; some more complex anisotropy terms are discussed in Chapter 6.

These have the constraint that the sublattice spin vectors have a fixed magnitude, i.e.

$$|\mathbf{S}_i| = s_i \quad \forall\, i \qquad (3.34)$$

As such, the space of allowed 𝑺i does not form a vector space. Fortunately, since each spin is constrained to be a vector with a fixed length, this is equivalent to saying that each 𝑺i is a point on a sphere of radius si. This forms a differentiable Riemannian manifold, over which some optimisation algorithms such as gradient descent can be generalised, as shown below.

Traditionally, gradient descent is an algorithm for finding a local minimum of a scalar, differentiable function F over a vector space,

$$F : \mathbb{R}^d \rightarrow \mathbb{R} \qquad (3.35)$$

The principle of the algorithm comes from defining the gradient operator ∇F such that its inner product with some vector direction 𝒗,

βŸ¨βˆ‡F,π’—βŸ©β‰”d⁒F⁒(𝒗) (3.36)

is the derivative of F along 𝒗. This is equivalent to the more commonly used definition

$$\nabla F := \sum_i \frac{\partial F}{\partial x_i}\, \hat{\mathbf{e}}_i \qquad (3.37)$$

where $\hat{\mathbf{e}}_i$ is the i-th unit basis vector of the vector space, and $\partial F/\partial x_i$ is the partial derivative of F with respect to the coordinate $x_i$.

For a small step 𝜹, the change in F is maximised when the step is parallel to the gradient ∇F. This prompts the iterative update

$$\mathbf{x}_{n+1} = \mathbf{x}_n - \gamma\, \nabla F(\mathbf{x}_n) \qquad (3.38)$$

where γ is the learning rate, a free parameter.
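For reference, the unconstrained update of Equation 3.38 takes only a few lines of Python; the quadratic test function below is an illustrative example only.

```python
import numpy as np

def gradient_descent(grad_f, x0, gamma=0.1, steps=1000):
    """Plain gradient descent (Eqn. 3.38) on an unconstrained vector space."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x - gamma * grad_f(x)
    return x

# e.g. minimise F(x) = |x - a|^2, whose gradient is 2(x - a)
a = np.array([1.0, -2.0, 0.5])
print(gradient_descent(lambda x: 2 * (x - a), np.zeros(3)))   # converges to a
```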

In order to include the constant-length spin constraint (Eqn. 3.3), modifications have to be made to the standard gradient descent algorithm. One option is to use a penalty method, which involves adding some smooth term to the Hamiltonian that penalises moving away from the constraint. In this case one could add the penalty term

$$P = \sum_i \eta\left(|\mathbf{S}_i| - 1\right)^2 \qquad (3.39)$$

where η>0 is a free parameter. This adds a cost for violating the constraint, so that it stays approximately satisfied.
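In practice, the penalty term of Equation 3.39 simply contributes an extra term to the unconstrained gradient. A minimal sketch of that contribution, assuming unit-length target spins stored as an (N, 3) array, is:

```python
import numpy as np

def penalty_gradient(spins, eta):
    """Gradient of Eqn. (3.39), P = sum_i eta (|S_i| - 1)^2, with respect to each spin;
    added to the gradient of the Hamiltonian when using a penalty method."""
    norms = np.linalg.norm(spins, axis=1, keepdims=True)
    return 2.0 * eta * (norms - 1.0) * spins / norms
```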

Another solution, which has been implemented for this project, is to consider the constrained state space Ψ as a Riemannian manifold, described below. The gradient descent algorithm can be generalised to apply to any Riemannian manifold, not just the Euclidean geometry $\mathbb{R}^d$ shown above [2, 88, 87]. In this form the algorithm is sometimes known as Riemannian optimisation. However, some of the steps which are trivial in the Euclidean case are not trivial in the general case.

A Riemannian manifold is a smooth, differentiable manifold whose tangent spaces have an inner product which varies smoothly across the manifold. The tangent space $T_p\mathcal{M}$ of a differentiable manifold ℳ at p is the vector space spanned by the tangent vectors of all of the curves on ℳ through p. If ℳ can be embedded in an ambient Euclidean space, this can be easily visualised as the (hyper-)plane spanned by the tangent vectors, as in Figure 3.4a).

In this case, F is instead a differentiable, scalar function over the manifold ℳ,

$$F : \mathcal{M} \rightarrow \mathbb{R} \qquad (3.40)$$

This allows the gradient ∇F to be defined again such that

βŸ¨βˆ‡F,π’—βŸ©=d⁒F⁒(𝒗) (3.41)

is the derivative of F along 𝒗. It should be noted that the vectors 𝒗 and ∇F are now elements of the tangent space $T_p\mathcal{M}$. Clearly, a step along the vector ∇F will maximise the change in F, so the approach of gradient descent will still work. However, unlike in the Euclidean case, the tangent space and the manifold have very different geometries. A tangent vector in $T_p\mathcal{M}$ can be mapped back to ℳ by the exponential map,

$$\exp_p : T_p\mathcal{M} \rightarrow \mathcal{M} \qquad (3.42)$$

In some cases, this map may be computationally expensive to evaluate. Overall this gives the iteration step for gradient descent on an arbitrary Riemannian manifold:

$$p_{n+1} = \exp_{p_n}\!\left(-\gamma\, \nabla F\right) \qquad (3.43)$$
Figure 3.4: Schematic showing some properties of the spherical S² manifold ℳ. a) shows ℳ embedded in the ambient Euclidean space ℝ³. At a point p on ℳ, the tangent space $T_p\mathcal{M}$ is visualised as a plane tangent to the sphere at p. Orthogonal unit vectors 𝒖, 𝒗 which span $T_p\mathcal{M}$ are shown as red arrows. The exponential map $\exp_p \mathbf{v}$ is shown as a red line along the sphere. b) shows a cross section of the sphere including 𝒗 and the origin. The projection operator $\mathrm{proj}_{\mathcal{M}}$ is shown as a black dashed line. Note that p is written without bold when treated as a point on ℳ, and as bold 𝒑 when treated as a vector in the ambient Euclidean space.

Applying this to the geometry of the problem in magnetism is straightforward. A point ψ in the overall space Ψ is a collection of N spins 𝑺i, which are constrained to be unit vectors. Each spin is therefore a point on a sphere, that is, a point on the spherical manifold S². The overall space Ψ is then the product manifold

$$\Psi = \underbrace{S^2 \times S^2 \times \dots \times S^2}_{N\ \text{times}} \qquad (3.44)$$

and is easily visualised as N independent coordinates on a sphere. Formally:

$$\psi = (s_1, \dots, s_N) \in \Psi \qquad (3.45)$$
$$s_i \in S^2 \qquad (3.46)$$

The tangent space of a product manifold X×Y is the product of the tangent spaces of the individual manifolds [2]:

$$T_{(x,y)}(X\times Y) = T_x X \times T_y Y \qquad (3.47)$$

The exponential map has a similar relationship [2]:

$$\exp^{X\times Y}_{(x,y)} = \left(\exp^{X}_{x}, \exp^{Y}_{y}\right) \qquad (3.48)$$

This means that the update step with N spins can be divided into N update steps each on a sphere:

$$\psi_{n+1} = (\dots, s^{i}_{n+1}, \dots) = \exp^{\Psi}_{\psi_n}\!\left(-\gamma\, \nabla_{\Psi} F\right) \qquad (3.49)$$
$$\;\;\, = \left(\dots, \exp^{S^i}_{s^i_n}\!\left(-\gamma\, \nabla_{S^i} F\right), \dots\right) \qquad (3.50)$$
$$\therefore\; s^{i}_{n+1} = \exp^{S^i}_{s^i_n}\!\left(-\gamma\, \nabla_{S^i} F\right) \qquad (3.51)$$

Thankfully, the spherical geometry allows this to be computed without much trouble. Figure 3.4a,b) show how the exponential map of the d-sphere $S^d$ can be cheaply approximated by projecting the first-order approximation of the map, the point on the tangent (hyper-)plane, back onto the sphere:

$$\exp_{\mathbf{p}}(\mathbf{u}) = \mathbf{p}\cos|\mathbf{u}| + \frac{\mathbf{u}}{|\mathbf{u}|}\sin|\mathbf{u}| \qquad (3.52)$$
$$\;\;\, \approx \mathrm{proj}_S(\mathbf{p}+\mathbf{u}) = \frac{\mathbf{p}+\mathbf{u}}{|\mathbf{p}+\mathbf{u}|} = \exp_{\mathbf{p}}\!\left(\frac{\arctan|\mathbf{u}|}{|\mathbf{u}|}\, \mathbf{u}\right) \qquad (3.53)$$

The effect of the approximation is to slightly lower the learning rate while the path along which the state is adjusted is unchanged.
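The two maps in Equations 3.52-3.53 are easy to compare numerically. The sketch below, written for the unit sphere with illustrative function names, confirms that the projection follows the same great circle as the exact exponential map but travels arctan(|𝒖|) rather than |𝒖| along it.

```python
import numpy as np

def exp_map(p, u):
    """Exact exponential map on the unit sphere, Eqn. (3.52); u must lie in the
    tangent plane at p (u . p = 0)."""
    norm_u = np.linalg.norm(u)
    if norm_u < 1e-12:
        return p
    return p * np.cos(norm_u) + (u / norm_u) * np.sin(norm_u)

def retract(p, u):
    """Projection-based approximation of the exponential map, Eqn. (3.53)."""
    v = p + u
    return v / np.linalg.norm(v)

p = np.array([0.0, 0.0, 1.0])
u = np.array([0.3, 0.0, 0.0])
rescaled = np.arctan(np.linalg.norm(u)) / np.linalg.norm(u) * u
print(retract(p, u), exp_map(p, rescaled))   # identical up to rounding
```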

By embedding the sphere in ℝ³, the coordinates of a point on the sphere can be stored as x, y, z components. The tangent space $T_{\mathbf{p}}S^2$ is the plane $\mathbf{v}\cdot\mathbf{p}=0$, a 2-dimensional vector subspace of the ambient ℝ³. This allows for a very simple method for computing the gradient from the unconstrained Hamiltonian $\tilde{\mathcal{H}} : \mathbb{R}^3 \rightarrow \mathbb{R}$. The unconstrained gradient $\nabla\tilde{\mathcal{H}} \in \mathbb{R}^3$ can be computed cheaply and projected into the tangent plane to give the gradient of the constrained Hamiltonian:

$$\nabla\mathcal{H}\big|_{\mathbf{p}} = \mathrm{proj}_{T_{\mathbf{p}}S}\, \nabla\tilde{\mathcal{H}} = \nabla\tilde{\mathcal{H}} - \left(\nabla\tilde{\mathcal{H}}\cdot\mathbf{p}\right)\mathbf{p} \qquad (3.54)$$

This gives the final expression for the iteration step in the gradient descent algorithm on the sphere as

$$\mathbf{x}_{n+1} = \mathrm{proj}_S\!\left(\mathbf{x}_n - \gamma\, \nabla\mathcal{H}\right) \qquad (3.55)$$
$$\;\;\, = \mathrm{proj}_S\!\left(\mathbf{x}_n - \gamma\, \mathrm{proj}_T \nabla\tilde{\mathcal{H}}\right) \qquad (3.56)$$
$$\;\;\, = \frac{\mathbf{x}_n - \gamma\left(\nabla\tilde{\mathcal{H}} - (\nabla\tilde{\mathcal{H}}\cdot\mathbf{x}_n)\,\mathbf{x}_n\right)}{\left|\mathbf{x}_n - \gamma\left(\nabla\tilde{\mathcal{H}} - (\nabla\tilde{\mathcal{H}}\cdot\mathbf{x}_n)\,\mathbf{x}_n\right)\right|} \qquad (3.57)$$

where S represents the sphere, and $T = T_{\mathbf{x}_n}S^2$ is the tangent space of the sphere at $\mathbf{x}_n$. This is the generalised form of the expression found in the example (Equation 3.14). In the implementation of the algorithm used in this thesis, γ is chosen to keep the maximum angle any spin rotates in one step below a threshold, which starts at 0.05 rad and is decreased until a specified precision is reached.
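To make the update concrete, a minimal sketch of one iteration of Equation 3.57 applied to an array of unit spins is given below, including a simple version of the maximum-rotation-angle step control just described. This is an illustrative sketch, not the implementation in the qafm library.

```python
import numpy as np

def descent_step(spins, grad, max_angle=0.05):
    """One iteration of Eqn. (3.57) for an (N, 3) array of unit spins, given the
    (N, 3) unconstrained gradient. gamma is chosen so that no spin rotates by more
    than max_angle (radians) in this step."""
    radial = np.sum(grad * spins, axis=1, keepdims=True) * spins
    tangent = grad - radial                          # proj_T, Eqn. (3.54)
    largest = np.max(np.linalg.norm(tangent, axis=1))
    gamma = np.tan(max_angle) / largest if largest > 0 else 0.0
    stepped = spins - gamma * tangent
    return stepped / np.linalg.norm(stepped, axis=1, keepdims=True)   # proj_S
```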

Numerically this can be computed with double precision floating point arithmetic very quickly, as it contains only linear operations, with the exception of the square-root in the magnitude. The final ingredient is the gradient of the unconstrained Hamiltonian with respect to an individual spin. Analytically this is

βˆ‡π‘Ίiβ„‹~=βˆ‘jβ‰ i𝕁i⁒j⁒𝑺j+(𝕁i⁒i+𝕁i⁒iT)⁒𝑺iβˆ’π”Ύi⁒𝑩+βˆ‡π‘ΊiAi⁒(𝑺i) (3.58)

where the term $\mathbb{J}_{ii} + \mathbb{J}_{ii}^{\mathsf{T}}$ captures exchange interactions between two different sites in the same sublattice. For instance, in the square lattice example, a next-nearest-neighbour interaction would couple the A and B sublattices to themselves. The term $\nabla_{\mathbf{S}_i}A_i(\mathbf{S}_i)$ captures single-ion anisotropy, such as a $-D S_z^2$ term.
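The exchange and Zeeman parts of Equation 3.58 can be evaluated for all sublattices at once with dense tensor contractions. The sketch below assumes the sublattice exchange is stored as an (N, N, 3, 3) array and the g-tensors as an (N, 3, 3) array, which are illustrative storage choices rather than those of the qafm library, and it omits the anisotropy term.

```python
import numpy as np

def unconstrained_gradient(spins, J, G, B):
    """Exchange and Zeeman parts of Eqn. (3.58) for every sublattice spin.
    spins: (N, 3); J: (N, N, 3, 3) sublattice exchange; G: (N, 3, 3) g-tensors; B: (3,)."""
    N = len(spins)
    diag = J[np.arange(N), np.arange(N)]          # the on-sublattice blocks J_ii
    grad = np.einsum('ijab,jb->ia', J, spins)     # sum_j J_ij S_j (includes J_ii S_i once)
    grad += np.einsum('iba,ib->ia', diag, spins)  # add the remaining J_ii^T S_i term
    grad -= np.einsum('iab,b->ia', G, B)          # Zeeman term, -G_i B
    return grad
```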

The time complexity of an algorithm is defined as the leading-order scaling of the execution time as the size of the input grows, in the limit of large input size. For finding local minima, this algorithm has a time complexity of 𝒪(N²), quadratic in the number of spins and equal to the time complexity of computing the gradient of the Hamiltonian. To find global minima in energy, the initial point 𝒙0 can be guessed at many points across ℳ. However, in the case of magnetism, the number of points needed to cover the manifold scales exponentially with the number of spins in the calculation, giving an overall time complexity of 𝒪(N² b^N), where the branching factor b depends on the exact form of the Hamiltonian. In examples with only one local minimum (b=1) the scaling is very efficient, but examples with many local minima in energy are pathological for this algorithm.

In this case, Riemannian optimisation is a significant improvement over penalty methods in that it ensures that the constraint is satisfied exactly, without sacrificing speed. Compared to simulated annealing, it has the advantage of robustly determining local minima in energy, which is useful for tracking the evolution and stability of domains, and for visualising the evolution of the state when there are degenerate ground states. On the other hand, simulated annealing is able to efficiently find approximate global minima in energy, while gradient descent is exceptionally poor at that task for large numbers of spins and many local minima in energy. In such a case, a hybrid approach could work by using simulated annealing to produce an ansatz to seed the gradient descent algorithm; however, this was not required for any of the cases in this thesis.

This algorithm is similar to variations of gradient descent used in many popular mean-field calculation programs. For example, SpinW's [99] optmagsteep method rotates spins towards the local mean field, similarly to gradient descent. However, the precise formulation used in this thesis means that it provably converges to a local minimum in energy [2].

I used the principles outlined in this chapter to develop the Python library qafm, which is able to numerically find local and global minima of the mean-field energy with and without an applied magnetic field. This can be done for arbitrary crystal structures and commensurate magnetic unit cells, and the library is able to verify whether given exchange tensors are symmetry-allowed. It is used in later chapters to understand theoretically the response of different systems to magnetic fields at the mean-field level.