Chapter 3 General Optimisation Algorithm to Find Classical Magnetic Ground States

Abstract

A number of methods exist to determine the ground state magnetic order of interacting spins on periodic lattices. This chapter illustrates the method used in this thesis for the case of an Ising-like antiferromagnet on a square lattice, for general orientation of applied magnetic field, where the magnetic structure is not collinear. This is followed by the general formalisation of the approach for the case of magnetic systems of many sublattices, with arbitrary bilinear exchange.

§ 3.1 Introduction

In order to make predictions of physical properties such as magnetic torque and linear magnetisation, calculations of magnetic structure using a gradient descent algorithm are performed in this thesis. This allowed the development of a classical understanding of the pseudo-spin model of the materials discussed later. The approach treats spins as classical vectors and allows calculations to be performed much faster than quantum spin methods such as exact diagonalisation, which is intractable for many three-dimensional systems. A classical, three-dimensional pseudospin model is particularly useful when the system orders, where classical models are good approximations [81, 59].

The Hamiltonian for a collection of interacting spins in some applied magnetic field 𝑩 can be written as

$$\mathcal{H} = \sum_{ij\alpha\beta} J_{ij}^{\alpha\beta} S_i^{\alpha} S_j^{\beta} \;-\; \sum_{i\alpha\beta} g_i^{\alpha\beta} B^{\alpha} S_i^{\beta} \qquad (3.1)$$
$$\;\;\, = \sum_{ij} \mathbf{S}_i^{\mathsf{T}} \mathbb{J}_{ij} \mathbf{S}_j \;-\; \sum_{i} \mathbf{B}^{\mathsf{T}} \mathbb{G}_i \mathbf{S}_i \qquad (3.2)$$

where $\mathbb{J}_{ij}$ is the exchange tensor between sites $i$ and $j$, and $\mathbf{S}_i^{\mathsf{T}}$ is the transpose of the fixed-length spin vector at site $i$. $\mathbb{G}_i$, with elements $g_i^{\alpha\beta}$, is the g-tensor at site $i$ in units of $\mu_B$. The classical ground state is the set of spins $\psi_0 = \{\mathbf{S}_i\}$ that minimises $\mathcal{H}$, under the condition that all spins have a fixed length $s_i$:

$$|\mathbf{S}_i| = s_i \quad \forall\, i \qquad (3.3)$$

Finding this ground state is fundamentally an optimisation problem in which the energy of the system is minimised over the state space. Numerical solutions to these problems are well studied, in physics as well as computer science. A number of algorithms exist to solve these, such as branch and bound, simulated annealing, and gradient descent.

An approach to finding the exact solution in zero field is the Luttinger-Tisza method [63], which works by replacing the strong condition (3.3) with the so-called weak condition

$$\sum_i \left(|\mathbf{S}_i|^2 - s_i^2\right) = 0 \qquad (3.4)$$

which is necessary but insufficient for the validity of (3.3). The Hamiltonian can be exactly solved under this condition using a Lagrange multiplier method [64]. If the solution also satisfies the strong condition then it is the true ground state.

A commonly used numerical method for solving these problems in physics is simulated annealing [54]. This iterative method involves considering the current state $\mathbf{x}_n$ with energy $E$ and some 'neighbouring' state $\mathbf{x}_n^{*}$ with energy $E^{*}$. The state is updated to this new value with probability $P(E, E^{*}, T)$, which is commonly

$$P(E, E^{*}, T) = \begin{cases} 1 & \text{if } E^{*} < E \\ \exp\!\left(\dfrac{E - E^{*}}{T}\right) & \text{otherwise} \end{cases} \qquad (3.5)$$

where $T$ is a 'temperature' parameter which decreases over time. This algorithm provably converges if the cooling schedule is sufficiently slow; however, it is challenging to determine the ideal cooling rate a priori.
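As an illustration of the acceptance rule (3.5), the following is a minimal Python sketch of a single simulated-annealing update. The spin representation, the random perturbation used to generate the 'neighbouring' state, and the energy_fn callable are illustrative assumptions rather than the implementation used in this thesis.

```python
import numpy as np

def anneal_step(state, energy_fn, temperature, rng, angle=0.2):
    """One Metropolis-style update: perturb one spin and accept with Eqn. (3.5)."""
    new_state = state.copy()
    i = rng.integers(len(state))
    # 'neighbouring' state: nudge spin i by a small random vector, then renormalise
    new_state[i] = new_state[i] + angle * rng.normal(size=3)
    new_state[i] /= np.linalg.norm(new_state[i])
    e_old, e_new = energy_fn(state), energy_fn(new_state)
    if e_new < e_old or rng.random() < np.exp((e_old - e_new) / temperature):
        return new_state  # accepted
    return state          # rejected
```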

A different method, used in this thesis, is based on gradient descent (GD). GD works by repeatedly adjusting the state in the direction which locally decreases the energy fastest. This method reliably and efficiently finds local minima in energy, but some care is needed to find global minima. A more in-depth explanation of the algorithm is given in Section 3.4.

§ 3.2 Example: Phase diagram of an Ising-like antiferromagnet on a square lattice for general orientation of magnetic field.

Figure 3.1: Schematic diagram showing a Néel order antiferromagnet on a square lattice. Red plus '+' symbols indicate spins along the +𝒛 axis, out of the page. Blue minus '−' symbols indicate spins along the −𝒛 axis, into the page. Overlaid diagrams show a) primitive lattice vectors 𝒂, 𝒃, b) magnetic unit cell lattice vectors 𝑴𝒂, 𝑴𝒃, c) magnetic unit cell outline with labelled sublattices A and B, and d) nearest-neighbour exchange path J.

Numerical mean field calculations of magnetic structures are particularly useful when considering systems with many exchange terms, or with applied magnetic fields along non-symmetry directions. In order to demonstrate their utility, I will use the example of a square lattice antiferromagnet with nearest-neighbour anisotropic XXZ exchange. A schematic diagram of such a lattice can be seen in Figure 3.1, which shows the primitive lattice vectors 𝒂 and 𝒃.

In this system, the total energy can be written as

β„‹=12β’βˆ‘i,jJ⁒(Six⁒Sjx+Siy⁒Sjy+Δ⁒Siz⁒Sjz)βˆ’βˆ‘ig⁒μB⁒𝑩⋅𝑺i (3.6)

where the half prefactor is to correct for double counting of bonds, J>0 parametrises the antiferromagnetic exchange, and Δ allows for exchange anisotropy. In this example Δ>1 indicates easy-axis Ising-type anisotropy.

The energy on a bond is minimised by having the adjacent spins antiparallel and collinear with the Ising 𝒛 axis, such that the bond energy is −ΔJS². On the square lattice, this can be satisfied for every bond by adopting Néel order, such that adjacent spins are always antiparallel, shown as alternating + and − symbols in Figure 3.1. Because this structure minimises the energy on every bond, it must be the ground state when 𝑩=0.

The magnetic unit cell vectors are

$$\begin{pmatrix}\mathbf{M}_a \\ \mathbf{M}_b\end{pmatrix} = \begin{pmatrix}1 & -1 \\ 1 & 1\end{pmatrix}\begin{pmatrix}\mathbf{a} \\ \mathbf{b}\end{pmatrix} \qquad (3.7)$$

with

$$\det\begin{pmatrix}1 & -1 \\ 1 & 1\end{pmatrix} = 2 \qquad (3.8)$$

showing that there are two primitive lattice points per magnetic unit cell. As such, the magnetic unit cell volume (in this case, area) is twice that of the primitive unit cell. These magnetic unit cell vectors can be seen in Figure 3.1b, and the outline of the magnetic unit cell in Figure 3.1c, with the two magnetic sublattices labelled A and B.

In order to understand how the spin orientations vary with applied magnetic field in this system, it is assumed that the magnetic unit cell does not change with applied magnetic field. The mean field energy per magnetic unit cell (Eqn. 3.6) can then be rewritten in terms of the spin orientations of the two sublattices, 𝑺A and 𝑺B:

$$\frac{\mathcal{H}}{N} = nJ\left(S_A^x S_B^x + S_A^y S_B^y + \Delta\, S_A^z S_B^z\right) \;-\; g\mu_B\, \mathbf{B}\cdot\left(\mathbf{S}_A + \mathbf{S}_B\right) \qquad (3.9)$$

where n=4 is the number of nearest neighbours of each site, and N is the total number of magnetic unit cells in the crystal. In this simple case, constructing the Hamiltonian as a function of sublattice spins is straightforward by inspection. In the most general case, with longer-range interactions and arbitrary numbers of sublattices, the formalism is non-trivial and is discussed in detail in Section 3.3.

One interesting feature of this system is that it undergoes a spin-flop transition when Δ>1: a metamagnetic phase transition which occurs when a sufficiently strong field is applied along the Ising 𝒛 axis. The zero-field ground state has spins along the 𝒛 axis, so when a field is applied the Zeeman energy

$$E_Z = -\frac{1}{2} g\mu_B\, \mathbf{B}\cdot\left(\mathbf{S}_A + \mathbf{S}_B\right) \qquad (3.10)$$

does not decrease with a magnetic field along 𝒛. The spins are also not able to cant towards the field as they would if the field were applied in the xy plane. Determining the field at which the spin-flop occurs, and the field dependence of the magnetisation, can be done analytically and is demonstrated in many textbooks on the subject [13]. In this model, the spin-flop field $B_{\mathrm{SF}}$ is found to be

$$\mu_B\, g\, B_{\mathrm{SF}} = S\, n\, J\sqrt{\Delta^2 - 1} \qquad (3.11)$$

where a spin-flop occurs for all values of Δ>1. At this field, the magnetisation changes discontinuously from 0 at $B < B_{\mathrm{SF}}$ to a finite value at $B > B_{\mathrm{SF}}$.

However, the question of how the system behaves for non-axial fields, applied at some arbitrary angle away from the 𝒛 axis, cannot be answered so cleanly analytically. For this reason, it is useful to develop numerical methods for solving the problem. A natural approach is an iterative algorithm called gradient descent. This algorithm works by starting from some state and finding the direction in which the spins can be adjusted so as to decrease the energy the most. This is repeated until a minimum in energy is found with satisfactory precision.

In this example, Equation 3.9 can be differentiated to give

1Nβ’βˆ‡π‘ΊAβ„‹=n⁒J⁒(SBxSByΔ⁒SBz)βˆ’g⁒μB⁒𝑩 (3.12)

which can be computed, and the sublattice spin 𝑺A then adjusted along the $-\nabla_{\mathbf{S}_A}\mathcal{H}$ direction, such that the energy decrease is maximised. Unfortunately, this runs into the problem that the magnitude of the spin |𝑺A| is unconstrained. For instance, in the Néel-ordered ground state with no field,

$$\mathbf{S}_A = S\hat{\mathbf{z}}$$
$$\mathbf{S}_B = -S\hat{\mathbf{z}}$$
$$\frac{1}{N}\nabla_{\mathbf{S}_A}\mathcal{H} = -\Delta\, n J S\, \hat{\mathbf{z}}$$

so adjusting 𝑺A by $-\gamma\nabla_{\mathbf{S}_A}\mathcal{H}$ with small γ>0 will increase the magnitude |𝑺A|:

$$\left|\mathbf{S}_A - \gamma\nabla_{\mathbf{S}_A}\mathcal{H}\right| = |\mathbf{S}_A| - \gamma\, \hat{\mathbf{S}}_A\cdot\nabla_{\mathbf{S}_A}\mathcal{H} + \mathcal{O}(\gamma^2) \qquad (3.13)$$

To solve this problem, I note that in the limit of small step size γ≪1, |𝑺A| will not change if $\mathbf{S}_A\cdot\nabla_{\mathbf{S}_A}\mathcal{H} = 0$. This means that the sublattice spins should be rotated in the plane containing the spin vector and the gradient $\nabla_{\mathbf{S}_A}\mathcal{H}$. The formalism for this is developed in Section 3.4, but in summary the condition |𝑺A|=S is treated as defining a spherical manifold ℳ, and the adjustments to the state occur in the tangent space of this manifold, $T_p\mathcal{M}$.

In order to keep the step direction in this plane, I project the energy-minimising vector $-\nabla_{\mathbf{S}_A}\mathcal{H}$ into the plane and use that as the step direction, and then normalise the new vector again to avoid accumulation of 𝒪(γ²) errors:

$$\mathbf{S}_{A,n+1} = \mathrm{proj}_S\!\left[\mathbf{S}_{A,n} - \gamma\, \mathrm{proj}_T \nabla_{\mathbf{S}_A}\mathcal{H}\right] \qquad (3.14)$$

where $\mathrm{proj}_S$ indicates the projection onto a sphere with radius S and $\mathrm{proj}_T$ indicates projection into the plane normal to 𝑺A. This can then be iterated for each spin until an energy minimum is found. Section 3.4 formalises this approach and derives a generalisation of the above expression.
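The update of Equation 3.14 for this two-sublattice example is simple enough to sketch directly in Python. The function names, the fixed step size, and the crude stopping criterion below are illustrative choices for this sketch only, not the implementation used for the calculations in this thesis.

```python
import numpy as np

def gradient(s_other, n, J, Delta, gmuB_B):
    """(1/N) * dH/dS for one sublattice, Eqn. (3.12); s_other is the other sublattice spin."""
    return n * J * np.array([s_other[0], s_other[1], Delta * s_other[2]]) - gmuB_B

def step(s, grad, gamma, length):
    """Eqn. (3.14): project the gradient into the tangent plane, step, re-project onto the sphere."""
    tangent = grad - (grad @ s) * s / length**2   # remove the component parallel to s
    s_new = s - gamma * tangent
    return length * s_new / np.linalg.norm(s_new)

# relax the two sublattices from a random initial state at zero field
rng = np.random.default_rng(0)
S, n, J, Delta = 0.5, 4, 1.0, 2.0                 # Table 3.1 parameters, J in meV
gmuB_B = np.zeros(3)                              # g * mu_B * B, also in meV
sA, sB = (S * v / np.linalg.norm(v) for v in rng.normal(size=(2, 3)))
for _ in range(10000):
    sA, sB = (step(sA, gradient(sB, n, J, Delta, gmuB_B), 1e-3, S),
              step(sB, gradient(sA, n, J, Delta, gmuB_B), 1e-3, S))
print(sA, sB)   # antiparallel spins along +/- z: the Neel ground state
```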

Figure 3.2: Phase diagram showing magnetisation per site for magnetic fields in the xz plane up to 50 T, for an Ising-type XXZ antiferromagnet with calculation parameters found in Table 3.1. Shading shows magnetisation projected onto the magnetic field direction, as described in the text. Overlaid arrows show the alignment of the sublattice spins at various magnetic fields. θ is the angle between the magnetic field and the Ising 𝒛 axis. α, β are the angles the sublattice spins make with the magnetic field direction. The circular black boundary shows the second-order phase transition from the canted antiferromagnet to the polarised phase. For the special directions 𝑩∥𝒛 and 𝑩⟂𝒛, α=β; for general field orientations, α≠β.
Figure 3.3: a) Evolution of magnetisation versus magnetic field for field aligned along (cyan) and perpendicular to (red) the Ising axis of an XXZ antiferromagnet on a square lattice. b) Schematic diagram showing the arrangement of axes and the definition of angles in the rest of the figure. The angle θ is measured from the Ising 𝒛 axis towards the xy plane. c) Evolution of magnetisation around the spin-flop field for angles 0° ≤ θ ≤ 15° and fields 10 T ≤ B ≤ 25 T. d) Evolution of magnetic torque with magnetic field for angles 0° ≤ θ ≤ 15° and fields 0 T ≤ B ≤ 50 T. Calculation parameters are found in Table 3.1.
Table 3.1: Calculation parameters for the square-lattice XXZ antiferromagnet, using the definitions in Equation 3.9

  n   4
  S   1/2
  g   2
  J   1 meV
  Δ   2

This calculation was performed with the parameters in Table 3.1. From these, one expects a spin-flop transition to occur at $\mu_B B_{\mathrm{SF}} = 1$ meV, or $B_{\mathrm{SF}} \approx 17.28$ T, and polarisation to occur at $\mu_B B_{\mathrm{P}} = 2.414$ meV, or $B_{\mathrm{P}} \approx 41.71$ T. The results of these calculations can be seen in Figure 3.3. Fig. 3.3a) shows the evolution of magnetisation against magnetic field for field orientations parallel and perpendicular to the 𝒛 axis. For fields perpendicular to 𝒛, the magnetisation increases linearly in field from 0 μB at 0 T to 1 μB at 41.7(1) T. For fields parallel to 𝒛, the magnetisation remains at 0 μB between 0 and 17.3(1) T, and then discontinuously joins the 𝑩⟂𝒛 curve at higher fields.

Figure 3.2 shows how the projection of magnetisation onto the magnetic field direction

$$\frac{\mathbf{M}\cdot\mathbf{B}}{|\mathbf{B}|} \qquad (3.15)$$

varies as the magnetic field vector 𝑩 varies in the xz plane. This shows a spin-flop transition at the isolated point $\mathbf{B} = B_{\mathrm{SF}}\,\hat{\mathbf{z}}$, such that if the magnetic field is applied directly along the 𝒛 axis, there is a phase transition and a discontinuous change in magnetisation at $|\mathbf{B}| = B_{\mathrm{SF}}$. This also shows that, for off-axis fields, there is no phase transition, but instead a smooth crossover between the two regimes. It also shows that the spin orientations which minimise the free energy take different angles with the magnetic field, α≠β, except in the special cases of axial fields θ=0 and θ=π/2, highlighting why this calculation is challenging to do analytically. For all field orientations, the system polarises at the same field $|\mathbf{B}| = B_{\mathrm{P}} \approx 41.71$ T, which is in all cases a continuous phase transition.

Fig. 3.3b-d) show the effect of off-axis fields on this system. Fig. 3.3b) shows how the angle θ is defined in this setup: measured from the 𝒛 axis towards the xy plane. At zero field the spins are aligned parallel to the 𝒛 axis. Fig. 3.3c) shows the magnetisation against field curves for θ between 0° and 15°. At θ=0° the field is exactly along the axis and the 𝑩∥𝒛 curve discussed above is recovered. As the field moves further off axis, the discontinuity at $|\mathbf{B}| = B_{\mathrm{SF}}$ is immediately replaced by a smooth crossover between the low- and high-field regimes. At higher values of θ, the crossover becomes less sharp. Fig. 3.3d) shows how the magnetic torque, taken around the 𝒚=𝒛×𝒙 axis shown in Fig. 3.3b), evolves with applied magnetic field at different values of θ. At θ=0° the field is exactly along the 𝒛 axis, and the torque is constrained by symmetry to be exactly 0 at all fields. For small values of θ, there is a sharp peak in torque around the spin-flop field. As θ increases, the peak becomes broader and moves slightly up in field. At the polarisation field $B_{\mathrm{P}}$, the magnetisation changes slope at all angles θ, indicating a continuous phase transition. There is no anomaly in torque at $B_{\mathrm{P}}$ at any angle θ. These results reproduce what is expected theoretically [84, 14] and what has been seen experimentally [113].

§ 3.3 General formalism for multi-sublattice magnetic systems

The previous section outlines the ground state determination method for the case of two sublattices in the magnetic unit cell and nearest-neighbour exchange. This is appropriate for many simple systems. This section demonstrates the appropriate formalism for multi-sublattice systems with exchanges beyond nearest-neighbours.

The general form of a spin Hamiltonian with exchange and Zeeman terms is

$$\mathcal{H} = \sum_{\alpha\beta ij} J_{ij}^{\alpha\beta} S_i^{\alpha} S_j^{\beta} \;-\; \sum_{i\alpha\beta} g_i^{\alpha\beta} B^{\alpha} S_i^{\beta} \qquad (3.16)$$
$$\;\;\, = \sum_{ij} \mathbf{S}_i^{\mathsf{T}} \mathbb{J}_{ij} \mathbf{S}_j \;-\; \sum_{i} \mathbf{B}^{\mathsf{T}} \mathbb{G}_i \mathbf{S}_i \qquad (3.17)$$

allowing for all bilinear exchange terms $J_{ij}^{\alpha\beta}$ and anisotropic g-factors $g_i^{\alpha\beta}$. This form of the Hamiltonian requires indexing over all sites in the crystal. These sites, however, exist on a Bravais lattice, whose symmetry constraints are hard to exploit in this indexed form. An equivalent formulation of the Hamiltonian makes this explicit by summing over the sites in each primitive unit cell, and over the lattice points in the crystal. This gives

β„‹=βˆ‘Ξ±β’Ξ²β’i⁒j⁒𝒖⁒𝒗Ji⁒j⁒𝒖⁒𝒗α⁒β⁒Si⁒𝒖α⁒Sjβ’π’—Ξ²βˆ’βˆ‘Ξ±β’Ξ²β’i⁒𝒖giα⁒β⁒Bα⁒S𝒖β (3.18)

where i,j index over the sites in the unit cell and 𝒖,𝒗 index over the lattice vectors in the crystal. Additionally, the value of the Hamiltonian must be invariant under the symmetry operations of the crystal, including translations by lattice vectors. In other words, it must be invariant under the transformation:

$$\mathbf{S}_{i\mathbf{u}} \rightarrow \mathbf{S}_{i,\mathbf{u}+\mathbf{t}} \qquad (3.19)$$

where 𝒕 is a lattice vector. This constrains the exchange parameters, allowing them to be rewritten:

$$J_{ij\,\mathbf{u},\mathbf{u}+\mathbf{t}} \rightarrow J_{ij\mathbf{t}} \qquad (3.20)$$

Including only the unique terms allowed by the translational symmetry of the lattice then gives

β„‹=βˆ‘Ξ±β’Ξ²β’i⁒j⁒𝒖⁒𝒕Ji⁒j⁒𝒕α⁒β⁒Si⁒𝒖α⁒Sj⁒𝒖+π’•Ξ²βˆ’βˆ‘Ξ±β’Ξ²β’i⁒𝒖giα⁒β⁒Bα⁒Si⁒𝒖β (3.21)

The exchange part of the Hamiltonian can then be fully specified by a set of exchange parameters (nearest-neighbour, next-nearest neighbour, and so on).

In order to convert the Hamiltonian into a sum over sublattices, a magnetic unit cell needs to be assumed. This means that the spins $\mathbf{S}_{i\mathbf{u}}$ and $\mathbf{S}_{i\mathbf{v}}$ at the same site index i, for two different lattice vectors 𝒖, 𝒗, are equivalent if 𝒖 and 𝒗 differ by a magnetic lattice vector. Formally:

$$\mathbf{S}_{i\mathbf{u}} \equiv \mathbf{S}_{i\mathbf{v}} \quad \text{if } \exists\, \mathbf{m}\in\mathbb{M} : \mathbf{u} = \mathbf{v} + \mathbf{m} \qquad (3.22)$$

where 𝕄 is the vector space of all magnetic lattice vectors.

This means that the lattice vectors 𝒖, 𝒗 can be treated as equivalent if they differ by a magnetic lattice vector. This is the same as performing the algebra in the quotient space V/𝕄, where V is the vector space of primitive lattice vectors. It is worth noting that 𝕄 is a vector subspace of V. A useful object to define is the equivalence class of a vector in V,

[𝒖]≔{𝒖+π’Žβˆ£βˆ€π’Žβˆˆπ•„} (3.23)

which is the set of all primitive lattice vectors which are equivalent under the magnetic unit cell.

To put it simply, by imposing a magnetic unit cell 𝕄, two spins $\mathbf{S}_{i\mathbf{u}}$ and $\mathbf{S}_{i\mathbf{v}}$ at the same site index i, for two different primitive lattice vectors 𝒖, 𝒗, are equivalent if [𝒖]=[𝒗]. The set of equivalence classes here can be understood as the set of primitive lattice points in each magnetic unit cell. If there is only one site per primitive cell, then the equivalence classes are just the sublattices. If there is more than one site per primitive cell, the sublattices can be indexed by i,[𝒖], and the number of sublattices is equal to the number of sites per primitive unit cell × the number of primitive lattice points per magnetic cell.

Because of this imposed symmetry, the spin at site i,𝒖 can be rewritten

$$\mathbf{S}_{i\mathbf{u}} \rightarrow \mathbf{S}_{i[\mathbf{u}]} \qquad (3.24)$$

and the Hamiltonian can be rewritten as

β„‹=βˆ‘Ξ±β’Ξ²β’i⁒j⁒𝒖⁒𝒕Ji⁒j⁒𝒕α⁒β⁒Si⁒[𝒖]α⁒Sj⁒[𝒖+𝒕]Ξ²βˆ’βˆ‘Ξ±β’Ξ²β’i⁒𝒖giα⁒β⁒Bα⁒Si⁒[𝒖]Ξ² (3.25)

This still sums over all of the sites in the crystal, but the sum can be simplified by noting that for a finite size magnetic unit cell, there are a finite number of equivalence classes. The sum is taken over the magnetic unit cells, and the sites per magnetic unit cell

β„‹ =βˆ‘π’Žβˆˆπ•„βˆ‘Ξ±β’Ξ²β’i⁒j⁒𝒑⁒𝒕Ji⁒j⁒𝒕α⁒β⁒Si⁒[𝒑+π’Ž]α⁒Sj⁒[𝒑+π’Ž+𝒕]Ξ²βˆ’βˆ‘π’Žβˆˆπ•„βˆ‘Ξ±β’Ξ²β’i⁒𝒑giα⁒β⁒Bα⁒Si⁒[𝒑+π’Ž]Ξ² (3.26)
=N⁒(βˆ‘Ξ±β’Ξ²β’i⁒j⁒𝒑⁒𝒕Ji⁒j⁒𝒕α⁒β⁒Si⁒[𝒑]α⁒Sj⁒[𝒑+𝒕]Ξ²βˆ’βˆ‘Ξ±β’Ξ²β’i⁒𝒑giα⁒β⁒Bα⁒Si⁒[𝒑]Ξ²) (3.27)

where the first sum is taken over all magnetic unit cells in the crystal, 𝒎∈𝕄, N is the number of magnetic unit cells in the crystal, and the sum over 𝒑 is over the primitive lattice points per magnetic unit cell (the equivalence classes of V/𝕄). The above uses the property of the equivalence classes that [𝒑+𝒎]=[𝒑] for all 𝒎∈𝕄.

Here, the factor of N can be safely removed by working with the energy per magnetic unit cell, ℋ/N → ℋ, such that

β„‹=βˆ‘Ξ±β’Ξ²β’i⁒j⁒𝒑⁒𝒕Ji⁒j⁒𝒕α⁒β⁒Si⁒[𝒑]α⁒Sj⁒[𝒑+𝒕]Ξ²βˆ’βˆ‘Ξ±β’Ξ²β’i⁒𝒑giα⁒β⁒Bα⁒Si⁒[𝒑]Ξ² (3.28)

The final hurdle is the sum over 𝒕, which still runs over all primitive lattice vectors. This is easily resolved because, in practice, only a small number of interaction terms $J_{ij\mathbf{t}}^{\alpha\beta}$ are non-zero. The sum can therefore run over only the non-zero interactions which have been defined (nearest-neighbour, next-nearest-neighbour, etc.), which is likely to be a small set in any real-world application, except for dipolar-coupled systems, which are not the focus of this thesis.

For the sake of completeness, the classical Hamiltonian can be rewritten in terms of the exchange interaction between the sublattices by defining

$$\tilde{J}_{i[\mathbf{p}],j[\mathbf{r}]}^{\alpha\beta} := \sum_{\mathbf{m}\in\mathbb{M}} J_{ij(\mathbf{r}-\mathbf{p}+\mathbf{m})}^{\alpha\beta} \qquad (3.29)$$

such that

β„‹=βˆ‘Ξ±β’Ξ²β’i⁒[𝒑]⁒j⁒[𝒓]J~i⁒[𝒑],j⁒[𝒓]α⁒β⁒Si⁒[𝒑]α⁒Sj⁒[𝒓]Ξ²βˆ’βˆ‘Ξ±β’Ξ²β’i⁒[𝒑]giα⁒β⁒Bα⁒Si⁒[𝒑]Ξ² (3.30)

where $i[\mathbf{p}], j[\mathbf{r}]$ sum pairwise over the sublattices. This is the same form as Equation 3.9 in the example, and provides a good starting point for numerically solving these problems. The following section formalises a method for numerically minimising the mean field energy for Hamiltonians of this form.
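The main bookkeeping step when building $\tilde{J}$ from Equation 3.29 is folding a primitive lattice vector into its equivalence class [𝒖]. A minimal Python sketch of this fold is given below, assuming the magnetic cell is specified as an integer matrix M whose rows are the magnetic cell vectors in the primitive basis (as in Equation 3.7); the function name and conventions are illustrative only.

```python
import numpy as np

def fold(u, M):
    """Return a canonical representative of the equivalence class [u] (Eqn. 3.23):
    u minus the integer combination of magnetic cell vectors (rows of M) needed to
    bring it into the first magnetic unit cell."""
    u = np.asarray(u)
    frac = np.linalg.solve(M.T, u)           # coordinates of u in the magnetic cell basis
    rep = u - M.T @ np.floor(frac + 1e-9)    # subtract whole magnetic lattice vectors
    return tuple(np.rint(rep).astype(int))

# square-lattice Neel cell of Eqn. (3.7): two classes, matching sublattices A and B
M = np.array([[1, -1], [1, 1]])
print(fold((0, 0), M), fold((1, 1), M))   # both fold to (0, 0): sublattice A
print(fold((1, 0), M), fold((0, 1), M))   # both fold to (1, 0): sublattice B
```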

The effect of applying a finite size magnetic unit cell is that only a subset of magnetic propagation vectors 𝒒 are allowed, such that

π’’β‹…π’Žβˆˆβ„€βˆ€π’Žβˆˆπ•„ (3.31)

This 'rationalisation' of k-space is the standard consequence of applying periodic boundary conditions to a crystal. It should be noted that incommensurate magnetic structures have irrational propagation vectors, so they can never be found with this method.

As the choice of magnetic unit cell 𝕄 limits the solutions that can be found, it must be chosen from either prior theoretical work or empirical evidence of the propagation vector such as neutron diffraction experiments.
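Condition 3.31 is straightforward to check numerically for a candidate propagation vector. The sketch below assumes 𝒒 is given in reciprocal lattice units and that the magnetic cell is again specified by an integer matrix M whose rows are the magnetic cell vectors in the primitive basis; the function name is illustrative.

```python
import numpy as np

def is_commensurate(q, M, tol=1e-8):
    """Eqn. (3.31): q is allowed by the magnetic cell if q.m is an integer for every
    magnetic lattice vector m, i.e. if every entry of M @ q is an integer."""
    x = M @ np.asarray(q, dtype=float)
    return bool(np.all(np.abs(x - np.rint(x)) < tol))

M = np.array([[1, -1], [1, 1]])              # the Neel cell of Eqn. (3.7)
print(is_commensurate([0.5, 0.5], M))        # True: the Neel propagation vector
print(is_commensurate([0.3, 0.0], M))        # False: incommensurate with this cell
```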

§ 3.4 Riemannian Optimisation Algorithm

This section formalises a method for solving optimisation problems where the state space is not a vector space. This is needed for finding the ground state of magnetic Hamiltonians of the form

$$\mathcal{H} = \sum_{\alpha\beta ij} \tilde{J}_{ij}^{\alpha\beta} S_i^{\alpha} S_j^{\beta} \;-\; \sum_{\alpha\beta i} g_i^{\alpha\beta} B^{\alpha} S_i^{\beta} \;+\; \sum_i A_i(\mathbf{S}_i) \qquad (3.32)$$
$$\;\;\, = \sum_{ij} \mathbf{S}_i^{\mathsf{T}} \mathbb{J}_{ij} \mathbf{S}_j \;-\; \sum_{i} \mathbf{B}^{\mathsf{T}} \mathbb{G}_i \mathbf{S}_i \;+\; \sum_i A_i(\mathbf{S}_i) \qquad (3.33)$$

where $A_i(\mathbf{S}_i)$ parametrises local anisotropy on site i; some more complex anisotropy terms are discussed in Chapter 6.

These have the constraint that the sublattice spin vectors have a fixed magnitude, i.e.

$$|\mathbf{S}_i| = s_i \quad \forall\, i \qquad (3.34)$$

As such, the space of allowed 𝑺i does not form a vector space. Fortunately, since each spin is constrained to be a vector with a fixed length, this is equivalent to saying that each 𝑺i is a point on a sphere of radius si. This forms a differentiable Riemannian manifold, over which some optimisation algorithms such as gradient descent can be generalised, as shown below.

Traditionally, gradient descent is an algorithm for finding a local minimum of a scalar, differentiable function F over a vector space,

$$F : \mathbb{R}^d \rightarrow \mathbb{R} \qquad (3.35)$$

The principle of the algorithm comes from defining the gradient operator ∇F such that its inner product with some vector direction 𝒗,

βŸ¨βˆ‡F,π’—βŸ©β‰”d⁒F⁒(𝒗) (3.36)

is the derivative of F along 𝒗. This is equivalent to the more commonly used definition

$$\nabla F := \sum_i \frac{\partial F}{\partial x_i}\, \hat{\mathbf{e}}_i \qquad (3.37)$$

where $\hat{\mathbf{e}}_i$ is the i-th unit basis vector of the vector space, and $\partial F/\partial x_i$ is the partial derivative of F with respect to the coordinate $x_i$.

For a small step 𝜹, the change in F is maximised when the step is parallel to the gradient ∇F. This prompts the iterative update

$$\mathbf{x}_{n+1} = \mathbf{x}_n - \gamma\, \nabla F(\mathbf{x}_n) \qquad (3.38)$$

where γ is the learning rate, a free parameter.
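For reference, the unconstrained update of Equation 3.38 takes only a few lines of Python; the quadratic test function below is an illustrative example only.

```python
import numpy as np

def gradient_descent(grad_f, x0, gamma=0.1, steps=1000):
    """Plain gradient descent (Eqn. 3.38) on an unconstrained vector space."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x - gamma * grad_f(x)
    return x

# e.g. minimise F(x) = |x - a|^2, whose gradient is 2(x - a)
a = np.array([1.0, -2.0, 0.5])
print(gradient_descent(lambda x: 2 * (x - a), np.zeros(3)))   # converges to a
```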

In order to include the constant-length spin constraint (Eqn. 3.3), modifications have to be made to the standard gradient descent algorithm. One option is to use a penalty method, which involves adding some smooth term to the Hamiltonian that penalises moving away from the constraint. In this case one could add the penalty term

$$P = \sum_i \eta\left(|\mathbf{S}_i| - 1\right)^2 \qquad (3.39)$$

where η>0 is a free parameter. This adds a cost for violating the constraint, so that it stays approximately satisfied.
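In practice, the penalty term of Equation 3.39 simply contributes an extra term to the unconstrained gradient. A minimal sketch of that contribution, assuming unit-length target spins stored as an (N, 3) array, is:

```python
import numpy as np

def penalty_gradient(spins, eta):
    """Gradient of Eqn. (3.39), P = sum_i eta (|S_i| - 1)^2, with respect to each spin;
    added to the gradient of the Hamiltonian when using a penalty method."""
    norms = np.linalg.norm(spins, axis=1, keepdims=True)
    return 2.0 * eta * (norms - 1.0) * spins / norms
```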

Another solution, which has been implemented for this project, is to consider the constrained state space Ψ as a Riemannian manifold, described below. The gradient descent algorithm can be generalised to apply to any Riemannian manifold, not just the Euclidean geometry $\mathbb{R}^d$ shown above [2, 88, 87]. In this form the algorithm is sometimes known as Riemannian optimisation. However, some of the steps which are trivial in the Euclidean case are not trivial in the general case.

A Riemannian manifold is a smooth, differentiable manifold whose tangent spaces have an inner product which varies smoothly across the manifold. The tangent space $T_p\mathcal{M}$ of a differentiable manifold ℳ at p is the vector space spanned by the tangent vectors of all of the curves on ℳ through p. If ℳ can be embedded in an ambient Euclidean space, this can be easily visualised as the (hyper-)plane spanned by the tangent vectors, as in Figure 3.4a).

In this case, F is instead a differentiable, scalar function over the manifold ℳ,

$$F : \mathcal{M} \rightarrow \mathbb{R} \qquad (3.40)$$

This allows the gradient ∇F to be defined again such that

βŸ¨βˆ‡F,π’—βŸ©=d⁒F⁒(𝒗) (3.41)

is the derivative of F along 𝒗. It should be noted that the vectors 𝒗 and ∇F are now elements of the tangent space $T_p\mathcal{M}$. Clearly, a step along the vector ∇F will maximise the change in F, so the approach of gradient descent will still work. However, unlike in the Euclidean case, the tangent space and the manifold have very different geometries. A tangent vector in $T_p\mathcal{M}$ can be mapped back to ℳ by the exponential map,

$$\exp_p : T_p\mathcal{M} \rightarrow \mathcal{M} \qquad (3.42)$$

In some cases, this map may be computationally expensive to evaluate. Overall this gives the iteration step for gradient descent on an arbitrary Riemannian manifold:

$$p_{n+1} = \exp_{p_n}\!\left(-\gamma\, \nabla F\right) \qquad (3.43)$$
Figure 3.4: Schematic showing some properties of the spherical S² manifold ℳ. a) shows ℳ embedded in the ambient Euclidean space ℝ³. At a point p on ℳ, the tangent space $T_p\mathcal{M}$ is visualised as a plane tangent to the sphere at p. Orthogonal unit vectors 𝒖, 𝒗 which span $T_p\mathcal{M}$ are shown as red arrows. The exponential map $\exp_p \mathbf{v}$ is shown as a red line along the sphere. b) shows a cross section of the sphere including 𝒗 and the origin. The projection operator $\mathrm{proj}_{\mathcal{M}}$ is shown as a black dashed line. Note that p is written without bold when treated as a point on ℳ, and as bold 𝒑 when treated as a vector in the ambient Euclidean space.

Applying this to the geometry of the problem in magnetism is straightforward. A point ψ in the overall space Ψ is a collection of N spins 𝑺i, which are constrained to be unit vectors. Each spin is therefore a point on a sphere, that is, a point on the spherical manifold S². The overall space Ψ is then the product manifold

$$\Psi = \underbrace{S^2 \times S^2 \times \dots \times S^2}_{N\ \text{times}} \qquad (3.44)$$

and is easily visualised as N independent coordinates on a sphere. Formally:

$$\psi = (s_1, \dots, s_N) \in \Psi \qquad (3.45)$$
$$s_i \in S^2 \qquad (3.46)$$

The tangent space of a product manifold X×Y is the product of the tangent spaces of the individual manifolds [2]:

$$T_{(x,y)}(X\times Y) = T_x X \times T_y Y \qquad (3.47)$$

The exponential map has a similar relationship [2]:

$$\exp^{X\times Y}_{(x,y)} = \left(\exp^{X}_{x}, \exp^{Y}_{y}\right) \qquad (3.48)$$

This means that the update step with N spins can be divided into N update steps each on a sphere:

$$\psi_{n+1} = (\dots, s^{i}_{n+1}, \dots) = \exp^{\Psi}_{\psi_n}\!\left(-\gamma\, \nabla_{\Psi} F\right) \qquad (3.49)$$
$$\;\;\, = \left(\dots, \exp^{S^i}_{s^i_n}\!\left(-\gamma\, \nabla_{S^i} F\right), \dots\right) \qquad (3.50)$$
$$\therefore\; s^{i}_{n+1} = \exp^{S^i}_{s^i_n}\!\left(-\gamma\, \nabla_{S^i} F\right) \qquad (3.51)$$

Thankfully, the spherical geometry allows this to be computed without much trouble. Figure 3.4a,b) show how the exponential map of the d-sphere $S^d$ can be cheaply approximated by projecting the first-order approximation of the map, the point on the tangent (hyper-)plane, back onto the sphere:

$$\exp_{\mathbf{p}}(\mathbf{u}) = \mathbf{p}\cos|\mathbf{u}| + \frac{\mathbf{u}}{|\mathbf{u}|}\sin|\mathbf{u}| \qquad (3.52)$$
$$\;\;\, \approx \mathrm{proj}_S(\mathbf{p}+\mathbf{u}) = \frac{\mathbf{p}+\mathbf{u}}{|\mathbf{p}+\mathbf{u}|} = \exp_{\mathbf{p}}\!\left(\frac{\arctan|\mathbf{u}|}{|\mathbf{u}|}\, \mathbf{u}\right) \qquad (3.53)$$

The effect of the approximation is to slightly lower the learning rate while the path along which the state is adjusted is unchanged.
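The two maps in Equations 3.52-3.53 are easy to compare numerically. The sketch below, written for the unit sphere with illustrative function names, confirms that the projection follows the same great circle as the exact exponential map but travels arctan(|𝒖|) rather than |𝒖| along it.

```python
import numpy as np

def exp_map(p, u):
    """Exact exponential map on the unit sphere, Eqn. (3.52); u must lie in the
    tangent plane at p (u . p = 0)."""
    norm_u = np.linalg.norm(u)
    if norm_u < 1e-12:
        return p
    return p * np.cos(norm_u) + (u / norm_u) * np.sin(norm_u)

def retract(p, u):
    """Projection-based approximation of the exponential map, Eqn. (3.53)."""
    v = p + u
    return v / np.linalg.norm(v)

p = np.array([0.0, 0.0, 1.0])
u = np.array([0.3, 0.0, 0.0])
rescaled = np.arctan(np.linalg.norm(u)) / np.linalg.norm(u) * u
print(retract(p, u), exp_map(p, rescaled))   # identical up to rounding
```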

By embedding the sphere in ℝ³, the coordinates of a point on the sphere can be stored as x, y, z components. The tangent space $T_{\mathbf{p}}S^2$ is the plane $\mathbf{v}\cdot\mathbf{p}=0$, a 2-dimensional vector subspace of the ambient ℝ³. This allows for a very simple method for computing the gradient from the unconstrained Hamiltonian $\tilde{\mathcal{H}} : \mathbb{R}^3 \rightarrow \mathbb{R}$. The unconstrained gradient $\nabla\tilde{\mathcal{H}} \in \mathbb{R}^3$ can be computed cheaply and projected into the tangent plane to give the gradient of the constrained Hamiltonian:

$$\nabla\mathcal{H}\big|_{\mathbf{p}} = \mathrm{proj}_{T_{\mathbf{p}}S}\, \nabla\tilde{\mathcal{H}} = \nabla\tilde{\mathcal{H}} - \left(\nabla\tilde{\mathcal{H}}\cdot\mathbf{p}\right)\mathbf{p} \qquad (3.54)$$

This gives the final expression for the iteration step in the gradient descent algorithm on the sphere as

$$\mathbf{x}_{n+1} = \mathrm{proj}_S\!\left(\mathbf{x}_n - \gamma\, \nabla\mathcal{H}\right) \qquad (3.55)$$
$$\;\;\, = \mathrm{proj}_S\!\left(\mathbf{x}_n - \gamma\, \mathrm{proj}_T \nabla\tilde{\mathcal{H}}\right) \qquad (3.56)$$
$$\;\;\, = \frac{\mathbf{x}_n - \gamma\left(\nabla\tilde{\mathcal{H}} - (\nabla\tilde{\mathcal{H}}\cdot\mathbf{x}_n)\,\mathbf{x}_n\right)}{\left|\mathbf{x}_n - \gamma\left(\nabla\tilde{\mathcal{H}} - (\nabla\tilde{\mathcal{H}}\cdot\mathbf{x}_n)\,\mathbf{x}_n\right)\right|} \qquad (3.57)$$

where S represents the sphere, and $T = T_{\mathbf{x}_n}S^2$ is the tangent space of the sphere at $\mathbf{x}_n$. This is the generalised form of the expression found in the example (Equation 3.14). In the implementation of the algorithm used in this thesis, γ is chosen to keep the maximum angle any spin rotates in one step below a threshold, which starts at 0.05 rad and is decreased until a specified precision is reached.
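To make the update concrete, a minimal sketch of one iteration of Equation 3.57 applied to an array of unit spins is given below, including a simple version of the maximum-rotation-angle step control just described. This is an illustrative sketch, not the implementation in the qafm library.

```python
import numpy as np

def descent_step(spins, grad, max_angle=0.05):
    """One iteration of Eqn. (3.57) for an (N, 3) array of unit spins, given the
    (N, 3) unconstrained gradient. gamma is chosen so that no spin rotates by more
    than max_angle (radians) in this step."""
    radial = np.sum(grad * spins, axis=1, keepdims=True) * spins
    tangent = grad - radial                          # proj_T, Eqn. (3.54)
    largest = np.max(np.linalg.norm(tangent, axis=1))
    gamma = np.tan(max_angle) / largest if largest > 0 else 0.0
    stepped = spins - gamma * tangent
    return stepped / np.linalg.norm(stepped, axis=1, keepdims=True)   # proj_S
```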

Numerically this can be computed with double precision floating point arithmetic very quickly, as it contains only linear operations, with the exception of the square-root in the magnitude. The final ingredient is the gradient of the unconstrained Hamiltonian with respect to an individual spin. Analytically this is

βˆ‡π‘Ίiβ„‹~=βˆ‘jβ‰ i𝕁i⁒j⁒𝑺j+(𝕁i⁒i+𝕁i⁒iT)⁒𝑺iβˆ’π”Ύi⁒𝑩+βˆ‡π‘ΊiAi⁒(𝑺i) (3.58)

where the term $\mathbb{J}_{ii} + \mathbb{J}_{ii}^{\mathsf{T}}$ captures exchange interactions between two different sites in the same sublattice. For instance, in the square lattice example, a next-nearest-neighbour interaction would couple the A and B sublattices to themselves. The term $\nabla_{\mathbf{S}_i}A_i(\mathbf{S}_i)$ captures single-ion anisotropy, such as a $-D S_z^2$ term.
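The exchange and Zeeman parts of Equation 3.58 can be evaluated for all sublattices at once with dense tensor contractions. The sketch below assumes the sublattice exchange is stored as an (N, N, 3, 3) array and the g-tensors as an (N, 3, 3) array, which are illustrative storage choices rather than those of the qafm library, and it omits the anisotropy term.

```python
import numpy as np

def unconstrained_gradient(spins, J, G, B):
    """Exchange and Zeeman parts of Eqn. (3.58) for every sublattice spin.
    spins: (N, 3); J: (N, N, 3, 3) sublattice exchange; G: (N, 3, 3) g-tensors; B: (3,)."""
    N = len(spins)
    diag = J[np.arange(N), np.arange(N)]          # the on-sublattice blocks J_ii
    grad = np.einsum('ijab,jb->ia', J, spins)     # sum_j J_ij S_j (includes J_ii S_i once)
    grad += np.einsum('iba,ib->ia', diag, spins)  # add the remaining J_ii^T S_i term
    grad -= np.einsum('iab,b->ia', G, B)          # Zeeman term, -G_i B
    return grad
```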

The time complexity of an algorithm is defined as the leading-order scaling of the execution time as the size of the input grows, in the limit of large input size. For finding local minima, this algorithm has a time complexity of 𝒪(N²), quadratic in the number of spins and equal to the time complexity of computing the gradient of the Hamiltonian. To find global minima in energy, the initial point 𝒙0 can be guessed at many points across ℳ. However, in the case of magnetism, the number of points needed to cover the manifold scales exponentially with the number of spins in the calculation, giving an overall time complexity of 𝒪(N² b^N), where the branching factor b depends on the exact form of the Hamiltonian. In examples with only one local minimum (b=1) the scaling is very efficient, but examples with many local minima in energy are pathological for this algorithm.

In this case, Riemannian optimisation is a significant improvement over penalty methods in that it ensures that the constraint is satisfied exactly, without sacrificing speed. Compared to simulated annealing, it has the advantage of robustly determining local minima in energy, which is useful for tracking the evolution and stability of domains, and for visualising the evolution of the state when there are degenerate ground states. On the other hand, simulated annealing is able to efficiently find approximate global minima in energy, while gradient descent is exceptionally poor at that task for large numbers of spins and many local minima in energy. In such a case, a hybrid approach could work by using simulated annealing to produce an ansatz to seed the gradient descent algorithm; however, this was not required for any of the cases in this thesis.

This algorithm is similar to variations of gradient descent used in many popular mean-field calculation programs. For example, SpinW's [99] optmagsteep method rotates spins towards the local mean field, similarly to gradient descent. However, the precise formulation used in this thesis means that it provably converges to a local minimum in energy [2].

I used the principles outlined in this chapter to develop the Python library qafm, which is able to numerically find local and global minima of the mean-field energy with and without an applied magnetic field. This can be done for arbitrary crystal structures and commensurate magnetic unit cells, and the library is able to verify whether given exchange tensors are symmetry-allowed. It is used in later chapters to understand theoretically the response of different systems to magnetic fields at the mean-field level.