A number of methods exist to determine the ground state magnetic order of interacting spins on periodic lattices. This chapter illustrates the method used in this thesis for the case of an Ising-like antiferromagnet on a square lattice, for general orientation of applied magnetic field, where the magnetic structure is not collinear. This is followed by the general formalisation of the approach for the case of magnetic systems of many sublattices, with arbitrary bilinear exchange.
In order to make predictions of physical properties such as magnetic torque and linear magnetisation, calculations of magnetic structure using a gradient descent algorithm are performed in this thesis. This allowed the development of a classical understanding of the pseudospin model of the materials discussed later. The approach treats spins as classical vectors and allows calculations to be performed much faster than quantum treatments such as exact diagonalisation, which is intractable for many three-dimensional systems. A classical, three-dimensional pseudospin model is particularly useful when the system orders, where classical models are good approximations [81, 59].
The Hamiltonian for a collection of interacting spins in some applied magnetic field can be written as
\begin{align}
\mathcal{H} &= \frac{1}{2}\sum_{i \neq j}\mathbf{S}_i^{T}\,\mathbf{J}_{ij}\,\mathbf{S}_j \tag{3.1}\\
&\quad - \mu_B\sum_i \mathbf{B}^{T}\mathbf{g}_i\,\mathbf{S}_i , \tag{3.2}
\end{align}
where $\mathbf{J}_{ij}$ is the exchange tensor between sites $i$ and $j$, and $\mathbf{S}_i^{T}$ is the transpose of the fixed-length vector spin at site $i$. $\mathbf{g}_i$, with elements $g_i^{\alpha\beta}$, is the g-factor tensor at site $i$, and $\mu_B$ is the Bohr magneton. The classical ground state is the set of spins $\{\mathbf{S}_i\}$ that minimises $\mathcal{H}$, under the condition that all spins have a fixed length $S_i$:
$$\left|\mathbf{S}_i\right| = S_i \quad \forall\, i . \tag{3.3}$$
Finding this ground state is fundamentally an optimisation problem in which the energy of the system is minimised over the state space. Numerical solutions to these problems are well studied, in physics as well as computer science. A number of algorithms exist to solve these, such as branch and bound, simulated annealing, and gradient descent.
An approach to finding the exact solution in zero field is the Luttinger-Tisza method [63], which works by replacing the strong condition (3.3) with the so-called weak condition
$$\sum_i \left|\mathbf{S}_i\right|^2 = \sum_i S_i^2 , \tag{3.4}$$
which is necessary but insufficient for the validity of (3.3). The Hamiltonian can be exactly solved under this condition using a Lagrange multiplier method [64]. If the solution also satisfies the strong condition then it is the true ground state.
A commonly used numerical method for solving these problems in physics is simulated annealing [54]. This iterative method involves considering the current state with energy $E_1$ and some "neighbouring" state with energy $E_2$. The state is updated to this new value with probability $P$, which is commonly
$$P = \min\!\left[1,\ \exp\!\left(-\frac{E_2 - E_1}{T}\right)\right] , \tag{3.5}$$
where $T$ is a "temperature" parameter which decreases over time. This algorithm provably converges if the cooling schedule is sufficiently slow; however, it is challenging to determine the ideal cooling rate a priori.
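To make the acceptance rule concrete, the following is a minimal sketch of a simulated-annealing loop built around Eqn 3.5. The `energy` and `propose` callables, the geometric cooling schedule, and all parameter values are illustrative assumptions rather than a prescription used in this thesis.

```python
import numpy as np

def anneal(state, energy, propose, t_start=10.0, t_end=1e-3, cooling=0.999, rng=None):
    """Minimise energy(state) by simulated annealing with the rule of Eqn 3.5.

    `energy` maps a state to a scalar; `propose` returns a random neighbouring
    state (e.g. one spin rotated slightly).  Both are problem specific and
    assumed to be supplied by the caller.
    """
    rng = rng or np.random.default_rng()
    e = energy(state)
    t = t_start
    while t > t_end:
        candidate = propose(state, rng)
        e_new = energy(candidate)
        # accept downhill moves always, uphill moves with Boltzmann probability
        if e_new < e or rng.random() < np.exp(-(e_new - e) / t):
            state, e = candidate, e_new
        t *= cooling  # geometric cooling schedule
    return state, e
```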
A different method, used in this thesis, is based on gradient descent (GD). GD works by adjusting the state in the direction which locally minimises the energy. This method reliably and efficiently finds local minima in energy, but some care is needed to find global minima. A more in-depth explanation of the algorithm is in a later section.
Numerical mean field calculations of magnetic structures are particularly useful when considering systems with many exchange terms, or with applied magnetic fields along non-symmetry directions. In order to demonstrate their utility, I will use the example of a square lattice antiferromagnet with nearest-neighbour anisotropic XXZ exchange. A schematic diagram of such a lattice can be seen in Figure 3.1, which shows the primitive lattice vectors $\mathbf{a}_1$ and $\mathbf{a}_2$.
In this system, the total energy can be written as
$$E = \frac{J}{2}\sum_{\langle i,j\rangle}\left(S_i^x S_j^x + S_i^y S_j^y + \Delta\, S_i^z S_j^z\right) - g\mu_B\sum_i \mathbf{B}\cdot\mathbf{S}_i , \tag{3.6}$$
where the half prefactor corrects for the double counting of bonds in the sum over nearest-neighbour pairs $\langle i,j\rangle$, $J>0$ parametrises the antiferromagnetic exchange, and $\Delta$ allows for exchange anisotropy. In this example $\Delta > 1$, indicating easy-axis Ising-type anisotropy.
The energy of a bond can be minimised by having the adjacent spins antiparallel and collinear with the Ising ($z$) axis, such that the bond energy is $-J\Delta S^2$. On the square lattice, this can be satisfied for every bond by adopting Néel order, in which adjacent spins are always antiparallel, shown as alternating up and down spins in Figure 3.1. Because this structure minimises the energy on every bond simultaneously, it must be the ground state when $\mathbf{B} = 0$.
The magnetic unit cell vectors are
$$\mathbf{a}_m = \mathbf{a}_1 + \mathbf{a}_2 , \qquad \mathbf{b}_m = \mathbf{a}_1 - \mathbf{a}_2 , \tag{3.7}$$
with
$$\left|\mathbf{a}_m \times \mathbf{b}_m\right| = 2\left|\mathbf{a}_1 \times \mathbf{a}_2\right| , \tag{3.8}$$
showing that there are two primitive lattice points per magnetic unit cell. As such, the magnetic unit cell volume (in this case, area) is twice as large as the primitive unit cell volume. These magnetic unit cell vectors can be seen in Figure 3.1b; the outline of the unit cell can be seen in Figure 3.1c, with the two magnetic sublattices labelled $A$ and $B$.
In order to understand how the spin orientations vary with applied magnetic field in this system, it is assumed that the magnetic unit cell does not change with applied magnetic field. The mean field energy (Eqn. 3.6) can then be rewritten, per magnetic unit cell, in terms of the spin orientations of the two sublattices $\mathbf{S}_A$ and $\mathbf{S}_B$:
$$\frac{E}{N_M} = zJ\left(S_A^x S_B^x + S_A^y S_B^y + \Delta\, S_A^z S_B^z\right) - g\mu_B\,\mathbf{B}\cdot\left(\mathbf{S}_A + \mathbf{S}_B\right) , \tag{3.9}$$
where $z = 4$ is the number of nearest neighbours of each site, and $N_M$ is the total number of magnetic unit cells in the crystal. In this simple case, constructing the Hamiltonian as a function of sublattice spins is straightforward by inspection. In the most general case, with longer range interactions and arbitrary numbers of sublattices, the formalism is non-trivial; this is discussed in detail in Section 3.3.
One interesting feature of this system is that it undergoes a spin-flop transition when $\Delta > 1$: a metamagnetic phase transition which occurs when a sufficiently strong field is applied along the Ising axis. The zero-field ground state has spins along the $z$ axis, so when a field is applied the Zeeman energy
$$\frac{E_\mathrm{Z}}{N_M} = -g\mu_B\,\mathbf{B}\cdot\left(\mathbf{S}_A + \mathbf{S}_B\right) \tag{3.10}$$
does not decrease with a magnetic field applied along $\hat{\mathbf{z}}$. The spins are also not able to cant towards the field as they would if the field were applied in the $xy$ plane. Determining the field at which this occurs, and the field dependence of the magnetisation, can be done analytically and is demonstrated in many textbooks on the subject [13]. In this model, the spin-flop field is found to be
$$B_\mathrm{SF} = \frac{zJS}{g\mu_B}\sqrt{\Delta^2 - 1} , \tag{3.11}$$
where a spin-flop occurs for all values of $\Delta > 1$. At this field, the magnetisation changes discontinuously from zero at $B < B_\mathrm{SF}$ to a finite value at $B > B_\mathrm{SF}$.
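For reference, Eqn 3.11 follows from comparing the energies of the two competing states in Eqn 3.9 with the field along $\hat{\mathbf{z}}$; the canting angle $\varphi$ below is introduced purely for this illustration. The Néel state, with spins locked to $\pm\hat{\mathbf{z}}$, has energy
$$E_\mathrm{AF}/N_M = -zJ\Delta S^2 ,$$
while the spin-flopped state, $\mathbf{S}_{A,B} = S\left(\pm\cos\varphi,\,0,\,\sin\varphi\right)$, has
$$E_\mathrm{SF}(\varphi)/N_M = zJS^2\!\left(\Delta\sin^2\varphi - \cos^2\varphi\right) - 2g\mu_B B S\sin\varphi ,$$
which is minimised at $\sin\varphi = g\mu_B B/\left[zJS(1+\Delta)\right]$, giving $E_\mathrm{SF}/N_M = -zJS^2 - (g\mu_B B)^2/\left[zJ(1+\Delta)\right]$. The two energies cross when $g\mu_B B = zJS\sqrt{\Delta^2-1}$, which is the spin-flop field of Eqn 3.11.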
However, the question of how the system behaves for non-axial fields, at some arbitrary angle away from the $z$ axis, cannot be answered so cleanly analytically. For this reason, it is useful to develop numerical methods for solving the problem. A natural approach is an iterative algorithm called gradient descent. This algorithm works by starting from a state and finding the direction in which the spins can be adjusted that decreases the energy the most. This is repeated until a minimum in energy is found with satisfactory precision.
In this example, Equation 3.9 can be differentiated to give
$$\frac{\partial}{\partial\mathbf{S}_A}\!\left(\frac{E}{N_M}\right) = zJ\begin{pmatrix} S_B^x \\ S_B^y \\ \Delta S_B^z \end{pmatrix} - g\mu_B\,\mathbf{B} , \tag{3.12}$$
which can be computed, and the sublattice spin then adjusted along the $-\partial E/\partial\mathbf{S}_A$ direction, such that the energy decrease is maximised. Unfortunately, this runs into the problem that the magnitude of the spins is unconstrained. For instance, in the Néel order ground state with no field the gradient $-zJ\Delta S\,\hat{\mathbf{z}}$ is antiparallel to $\mathbf{S}_A = S\hat{\mathbf{z}}$, so adjusting by $-\eta\,\partial E/\partial\mathbf{S}_A$ with small $\eta > 0$ will increase the magnitude of $\mathbf{S}_A$:

$$\left|\mathbf{S}_A + \eta\, zJ\Delta S\,\hat{\mathbf{z}}\right| = \left(1 + \eta\, zJ\Delta\right)S > S . \tag{3.13}$$
To solve this problem, I note that in the limit of small step size $\eta$, $\left|\mathbf{S}_A\right|$ will not change if the adjustment is perpendicular to $\mathbf{S}_A$. This means that the sublattice spins will be rotated in the plane containing the spin vector and the gradient. I approach the formalism for this in a later section, but in summary this treats the condition $\left|\mathbf{S}_A\right| = S$ as defining a spherical manifold $S^2$, and the adjustments to the state occur in the tangent space of this manifold, $T_{\mathbf{S}_A}S^2$.
In order to keep the step direction in this plane, I project the energy-minimising vector into the plane normal to $\mathbf{S}_A$ and use that as the step direction, and then normalise the new vector again to avoid accumulation of errors:
$$\mathbf{S}_A \rightarrow \mathcal{P}_{S}\!\left[\mathbf{S}_A - \eta\,\mathcal{P}_{\perp\mathbf{S}_A}\!\left(\frac{\partial E}{\partial\mathbf{S}_A}\right)\right] , \tag{3.14}$$
where $\mathcal{P}_{S}$ indicates the projection onto a sphere with radius $S$ and $\mathcal{P}_{\perp\mathbf{S}_A}$ indicates projection into the plane normal to $\mathbf{S}_A$. This can then be iterated for each spin until an energy minimum is found. Section 3.4 formalises this approach and derives a generalisation of the above expression.
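As an illustration, the following is a minimal sketch of this projected update (Eqns 3.12 and 3.14) for the two-sublattice model of Eqn 3.9. The function names, the fixed learning rate, the choice of units ($g\mu_B = 1$) and the default parameter values are assumptions made for this example only; this is not the qafm implementation described later.

```python
import numpy as np

def gradient(s_other, J, delta, z, B):
    """Unconstrained gradient of Eqn 3.9 with respect to one sublattice spin (Eqn 3.12)."""
    aniso = np.array([1.0, 1.0, delta])          # XXZ anisotropy (1, 1, Delta)
    return z * J * aniso * s_other - B           # g*mu_B = 1 absorbed into B

def step(s, grad, eta, length):
    """One update of Eqn 3.14: project into the tangent plane, step, renormalise."""
    tangent = grad - (grad @ s) * s / length**2  # remove the component along s
    new = s - eta * tangent
    return length * new / np.linalg.norm(new)    # back onto the sphere of radius `length`

def minimise(J=1.0, delta=2.0, z=4, B=np.zeros(3), S=0.5, eta=0.05, n_iter=10_000):
    """Relax the two sublattice spins of Eqn 3.9 from a random starting state."""
    rng = np.random.default_rng(1)
    sA, sB = (S * v / np.linalg.norm(v) for v in rng.normal(size=(2, 3)))
    for _ in range(n_iter):
        sA = step(sA, gradient(sB, J, delta, z, B), eta, S)
        sB = step(sB, gradient(sA, J, delta, z, B), eta, S)
    return sA, sB
```

Sweeping the field along $\hat{\mathbf{z}}$ and recording the component of $\mathbf{S}_A + \mathbf{S}_B$ along the field reproduces the discontinuity at $B_\mathrm{SF}$, while a slightly tilted field smooths it into the crossover discussed below.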
Table 3.1: Model parameters used for the example calculations in Figures 3.2 and 3.3.
This calculation was performed with the parameters in Table 3.1. From these, one expects a spin-flop transition to occur at $B_\mathrm{SF}$ (Eqn 3.11) and polarisation to occur at a higher field $B_\mathrm{pol}$. The results of these calculations can be seen in Figure 3.3. Fig. 3.3a) shows the evolution of magnetisation against magnetic field for field orientations parallel and perpendicular to the $z$ axis. For fields perpendicular to $\hat{\mathbf{z}}$, the magnetisation increases linearly in field from zero at $B = 0$ to its saturated value at $B_\mathrm{pol}$. For fields parallel to $\hat{\mathbf{z}}$, the magnetisation remains at zero for $B < B_\mathrm{SF}$, and then discontinuously joins the perpendicular-field curve at higher fields.
Figure 3.2 shows how the projection of magnetisation onto the magnetic field direction
$$M_{\parallel} = \frac{\mathbf{M}\cdot\mathbf{B}}{\left|\mathbf{B}\right|} \tag{3.15}$$
varies as the magnetic field vector varies within the $xz$ plane. This shows a spin-flop transition at the isolated point $\mathbf{B} = B_\mathrm{SF}\,\hat{\mathbf{z}}$, such that if the magnetic field is applied directly along the $z$ axis, there is a phase transition and a discontinuous change in magnetisation at $B_\mathrm{SF}$. This also shows that for off-axis fields there is no phase transition, but instead a smooth crossover between the two regimes. It shows that the spin orientations which minimise the free energy take different angles with the magnetic field except in the special cases of fields applied parallel or perpendicular to the Ising axis, highlighting why this calculation is challenging to do analytically. For all field orientations, the system polarises at the same field $B_\mathrm{pol}$, which is in all cases a continuous phase transition.
Fig. 3.3b-d) show the effect of off-axis fields on this system. Fig. 3.3b) shows how the angle $\theta$ is defined in this setup: measured from the $z$ axis, towards the $xy$ plane. At zero field the spins are aligned parallel to the $z$ axis. Fig. 3.3c) shows magnetisation against field curves for a range of $\theta$. At $\theta = 0$ the field is exactly along the $z$ axis and the curve discussed above is recovered. As the field gets further off axis, the discontinuity at $B_\mathrm{SF}$ is immediately replaced by a smooth crossover between the low- and high-field regimes. At higher values of $\theta$, the crossover becomes less sharp. Fig. 3.3d) shows how magnetic torque, taken around the axis shown in Fig. 3.3b), evolves with applied magnetic field at different values of $\theta$. At $\theta = 0$ the field is exactly along the $z$ axis, and the torque is constrained by symmetry to be exactly zero at all fields. For small values of $\theta$, there is a sharp peak in torque around the spin-flop field. As $\theta$ increases, the peak becomes broader and moves to slightly higher field. At the polarisation field $B_\mathrm{pol}$, the magnetisation changes slope at all angles $\theta$, indicating a continuous phase transition. There is no anomaly in torque at $B_\mathrm{pol}$ at any angle $\theta$. These results reproduce what is expected theoretically [84, 14] and have been seen before experimentally [113].
The previous section outlines the ground state determination method for the case of two sublattices in the magnetic unit cell and nearest-neighbour exchange. This is appropriate for many simple systems. This section demonstrates the appropriate formalism for multi-sublattice systems with exchanges beyond nearest-neighbours.
The general form of a spin Hamiltonian with exchange and Zeeman terms is
\begin{align}
\mathcal{H} &= \frac{1}{2}\sum_{i \neq j}\mathbf{S}_i^{T}\,\mathbf{J}_{ij}\,\mathbf{S}_j \tag{3.16}\\
&\quad - \mu_B\sum_i \mathbf{B}^{T}\mathbf{g}_i\,\mathbf{S}_i , \tag{3.17}
\end{align}
allowing for all bilinear exchange terms $\mathbf{J}_{ij}$ and anisotropic g-factors $\mathbf{g}_i$. This form of the Hamiltonian requires indexing over all sites in the crystal. These sites, however, exist on a Bravais lattice with symmetry constraints which are hard to evaluate in this indexed form. An equivalent formalism for the Hamiltonian is to make this explicit, by summing over the sites in each primitive unit cell, and the lattice points in the crystal. This gives
$$\mathcal{H} = \frac{1}{2}\sum_{\mathbf{r},\mathbf{r}'}\sum_{m,n}\mathbf{S}_{m\mathbf{r}}^{T}\,\mathbf{J}_{mn}(\mathbf{r},\mathbf{r}')\,\mathbf{S}_{n\mathbf{r}'} - \mu_B\sum_{\mathbf{r}}\sum_{m}\mathbf{B}^{T}\mathbf{g}_m\,\mathbf{S}_{m\mathbf{r}} , \tag{3.18}$$
where $m, n$ index over the sites in the unit cell and $\mathbf{r}, \mathbf{r}'$ index over the lattice vectors in the crystal. Additionally, the value of the Hamiltonian must be invariant under the symmetry operations of the crystal, including translations by lattice vectors. In other words, it must be invariant under the transformation:
$$\mathbf{S}_{m\mathbf{r}} \rightarrow \mathbf{S}_{m,\mathbf{r}+\mathbf{R}} \quad \forall\, m, \mathbf{r} , \tag{3.19}$$
where $\mathbf{R}$ is a lattice vector. This constrains the exchange parameters, allowing them to be rewritten:
$$\mathbf{J}_{mn}(\mathbf{r},\mathbf{r}') = \mathbf{J}_{mn}(\mathbf{r}' - \mathbf{r}) \equiv \mathbf{J}_{mn}(\boldsymbol{\delta}) . \tag{3.20}$$
So including only the unique terms allowed by translational symmetry of the lattice:
$$\mathcal{H} = \frac{1}{2}\sum_{\mathbf{r}}\sum_{\boldsymbol{\delta}}\sum_{m,n}\mathbf{S}_{m\mathbf{r}}^{T}\,\mathbf{J}_{mn}(\boldsymbol{\delta})\,\mathbf{S}_{n,\mathbf{r}+\boldsymbol{\delta}} - \mu_B\sum_{\mathbf{r}}\sum_{m}\mathbf{B}^{T}\mathbf{g}_m\,\mathbf{S}_{m\mathbf{r}} . \tag{3.21}$$
The exchange part of the Hamiltonian can then be fully specified by a set of exchange matrices $\mathbf{J}_{mn}(\boldsymbol{\delta})$ (nearest-neighbour, next-nearest-neighbour, and so on), as illustrated below.
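For example, in the square-lattice model of Section 3.2 (which has one site per primitive cell, so the indices $m, n$ can be dropped), the set would contain only the nearest-neighbour matrices
$$\mathbf{J}(\boldsymbol{\delta}) = \begin{pmatrix} J & 0 & 0 \\ 0 & J & 0 \\ 0 & 0 & J\Delta \end{pmatrix} \quad\text{for } \boldsymbol{\delta} \in \left\{\pm\mathbf{a}_1,\ \pm\mathbf{a}_2\right\} ,$$
with $\mathbf{J}(\boldsymbol{\delta}) = 0$ for all other displacements.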
In order to convert the Hamiltonian into a sum over sublattices, a magnetic unit cell needs to be assumed. This means that the spins $\mathbf{S}_{m\mathbf{r}}$ and $\mathbf{S}_{m\mathbf{r}'}$ at the same site index $m$, for two different lattice vectors $\mathbf{r}$, $\mathbf{r}'$, are equivalent if $\mathbf{r}$ and $\mathbf{r}'$ differ by a magnetic lattice vector. Formally:
$$\mathbf{S}_{m\mathbf{r}} = \mathbf{S}_{m\mathbf{r}'} \iff \mathbf{r} - \mathbf{r}' \in L_M , \tag{3.22}$$
where $L_M$ is the lattice of all magnetic lattice vectors (all integer combinations of the magnetic unit cell vectors).
This means that the lattice vectors $\mathbf{r}$, $\mathbf{r}'$ can be treated as equivalent if they differ by a magnetic lattice vector. This is the same as performing the algebra in the quotient $L/L_M$, where $L$ is the lattice of primitive lattice vectors. It is worth noting that $L_M$ is a sublattice (subgroup) of $L$. A useful object to define is the equivalence class of a vector $\mathbf{r}$ in $L/L_M$,
$$[\mathbf{r}] = \left\{\mathbf{r}' \in L \ :\ \mathbf{r}' - \mathbf{r} \in L_M\right\} , \tag{3.23}$$
which is the set of all primitive lattice vectors which are equivalent under the magnetic unit cell.
To put it simply, by imposing a magnetic unit cell, two spins $\mathbf{S}_{m\mathbf{r}}$ and $\mathbf{S}_{m\mathbf{r}'}$ at the same site index $m$, for two different primitive lattice vectors $\mathbf{r}$, $\mathbf{r}'$, are equivalent if $[\mathbf{r}] = [\mathbf{r}']$. The set of equivalence classes here can be understood as the set of primitive lattice points in each magnetic unit cell. If there is only one site per primitive cell, then the equivalence classes are just the sublattices. If there is more than one site per primitive cell, the sublattices can be indexed by the pair $(m, [\mathbf{r}])$, and the number of sublattices is equal to the number of sites per primitive unit cell multiplied by the number of primitive lattice points per magnetic cell.
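As a concrete illustration, for the square-lattice Néel cell of Section 3.2, with magnetic cell vectors $\mathbf{a}_m = \mathbf{a}_1 + \mathbf{a}_2$ and $\mathbf{b}_m = \mathbf{a}_1 - \mathbf{a}_2$,
$$[\mathbf{0}] = \left\{ n_1\mathbf{a}_1 + n_2\mathbf{a}_2 \ :\ n_1 + n_2 \ \text{even} \right\} = L_M , \qquad [\mathbf{a}_1] = \left\{ n_1\mathbf{a}_1 + n_2\mathbf{a}_2 \ :\ n_1 + n_2 \ \text{odd} \right\} ,$$
so $L/L_M = \left\{[\mathbf{0}],\,[\mathbf{a}_1]\right\}$ and the two equivalence classes are exactly the two Néel sublattices $A$ and $B$ of Figure 3.1.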
Because of this imposed symmetry, the spin at site $(m, \mathbf{r})$ can be rewritten
$$\mathbf{S}_{m\mathbf{r}} = \mathbf{S}_{m[\mathbf{r}]} , \tag{3.24}$$
and the Hamiltonian can be rewritten as
$$\mathcal{H} = \frac{1}{2}\sum_{\mathbf{r}}\sum_{\boldsymbol{\delta}}\sum_{m,n}\mathbf{S}_{m[\mathbf{r}]}^{T}\,\mathbf{J}_{mn}(\boldsymbol{\delta})\,\mathbf{S}_{n[\mathbf{r}+\boldsymbol{\delta}]} - \mu_B\sum_{\mathbf{r}}\sum_{m}\mathbf{B}^{T}\mathbf{g}_m\,\mathbf{S}_{m[\mathbf{r}]} . \tag{3.25}$$
This still sums over all of the sites in the crystal, but the sum can be simplified by noting that for a finite size magnetic unit cell, there are a finite number of equivalence classes. The sum is taken over the magnetic unit cells, and the sites per magnetic unit cell
\begin{align}
\mathcal{H} &= \frac{1}{2}\sum_{\mathbf{R}_M}\sum_{[\mathbf{r}]}\sum_{\boldsymbol{\delta}}\sum_{m,n}\mathbf{S}_{m[\mathbf{r}]}^{T}\,\mathbf{J}_{mn}(\boldsymbol{\delta})\,\mathbf{S}_{n[\mathbf{r}+\boldsymbol{\delta}]} - \mu_B\sum_{\mathbf{R}_M}\sum_{[\mathbf{r}]}\sum_{m}\mathbf{B}^{T}\mathbf{g}_m\,\mathbf{S}_{m[\mathbf{r}]} \tag{3.26}\\
&= \frac{N_M}{2}\sum_{[\mathbf{r}]}\sum_{\boldsymbol{\delta}}\sum_{m,n}\mathbf{S}_{m[\mathbf{r}]}^{T}\,\mathbf{J}_{mn}(\boldsymbol{\delta})\,\mathbf{S}_{n[\mathbf{r}+\boldsymbol{\delta}]} - N_M\,\mu_B\sum_{[\mathbf{r}]}\sum_{m}\mathbf{B}^{T}\mathbf{g}_m\,\mathbf{S}_{m[\mathbf{r}]} , \tag{3.27}
\end{align}
where the first sum over $\mathbf{R}_M$ is taken over all magnetic unit cells in the crystal, $N_M$ is the number of magnetic unit cells in the crystal, and the sum over $[\mathbf{r}]$ is over the primitive lattice points per magnetic unit cell (the equivalence classes of $L/L_M$). The above uses the property of the equivalence classes that $[\mathbf{r} + \boldsymbol{\delta}] = [\mathbf{r}] + [\boldsymbol{\delta}]$, so the summand does not depend on which magnetic unit cell the site $\mathbf{r}$ lies in.
Here, the factor of $N_M$ can be safely removed by transforming $\mathcal{H} \rightarrow \mathcal{H}/N_M$, such that
$$\frac{\mathcal{H}}{N_M} = \frac{1}{2}\sum_{[\mathbf{r}]}\sum_{\boldsymbol{\delta}}\sum_{m,n}\mathbf{S}_{m[\mathbf{r}]}^{T}\,\mathbf{J}_{mn}(\boldsymbol{\delta})\,\mathbf{S}_{n[\mathbf{r}+\boldsymbol{\delta}]} - \mu_B\sum_{[\mathbf{r}]}\sum_{m}\mathbf{B}^{T}\mathbf{g}_m\,\mathbf{S}_{m[\mathbf{r}]} . \tag{3.28}$$
The final hurdle is the sum over $\boldsymbol{\delta}$, which still runs over all primitive lattice displacements. This is easily resolved, as in the calculation only a small number of interaction terms are non-zero. This allows the sum to run over only the non-zero interactions which have been defined (nearest-neighbour, next-nearest-neighbour, etc.), which is likely to be a small set in any real-world application, except for dipolar-coupled systems, which are not the focus of this thesis.
For the sake of completeness, the classical Hamiltonian can be rewritten in terms of the exchange interaction between the sublattices by defining
$$\tilde{\mathbf{J}}_{m[\mathbf{r}],\,n[\mathbf{r}']} = \sum_{\boldsymbol{\delta}\,:\,[\mathbf{r}+\boldsymbol{\delta}] = [\mathbf{r}']} \mathbf{J}_{mn}(\boldsymbol{\delta}) , \tag{3.29}$$
such that
$$\frac{\mathcal{H}}{N_M} = \frac{1}{2}\sum_{\mu,\nu}\mathbf{S}_{\mu}^{T}\,\tilde{\mathbf{J}}_{\mu\nu}\,\mathbf{S}_{\nu} - \mu_B\sum_{\mu}\mathbf{B}^{T}\mathbf{g}_{\mu}\,\mathbf{S}_{\mu} , \tag{3.30}$$
where $\mu, \nu$ sum pairwise over the sublattices $(m, [\mathbf{r}])$. This is the same form as Equation 3.9 in the example, and provides a good starting point for numerically solving these problems. The following section formalises a method for numerically minimising the mean field energy for Hamiltonians of this form.
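To make Eqn 3.29 concrete, the following is a minimal sketch of how the sublattice exchange matrices $\tilde{\mathbf{J}}_{\mu\nu}$ could be assembled from a list of bonds, for a lattice with one site per primitive cell. The function name, data layout and use of integer lattice coordinates are assumptions made for this illustration and do not reflect the qafm interface.

```python
import numpy as np

def sublattice_exchange(bonds, magnetic_cell, sublattice_sites):
    """Assemble J~[mu, nu] (Eqn 3.29) by folding every bond into the magnetic cell.

    bonds            : list of (delta, J) pairs, where delta is a displacement in
                       integer primitive-lattice coordinates and J is a 3x3 matrix
    magnetic_cell    : square integer matrix whose columns are the magnetic cell
                       vectors expressed in the primitive basis
    sublattice_sites : one representative primitive lattice point per sublattice
    """
    reps = [np.asarray(r, dtype=float) for r in sublattice_sites]
    M_inv = np.linalg.inv(np.asarray(magnetic_cell, dtype=float))

    def sublattice_index(r):
        # r belongs to a class if r minus its representative is an integer
        # combination of the magnetic cell vectors
        for k, rep in enumerate(reps):
            coeff = M_inv @ (np.asarray(r, dtype=float) - rep)
            if np.allclose(coeff, np.round(coeff)):
                return k
        raise ValueError(f"{r} does not reduce onto a listed sublattice site")

    n = len(reps)
    J_tilde = np.zeros((n, n, 3, 3))
    for mu, rep in enumerate(reps):
        for delta, J in bonds:
            J_tilde[mu, sublattice_index(rep + np.asarray(delta))] += np.asarray(J)
    return J_tilde

# Square-lattice XXZ example: Neel cell (a1+a2, a1-a2), sublattices at (0,0) and (1,0)
J, Delta = 1.0, 2.0
Jmat = np.diag([J, J, J * Delta])
bonds = [((1, 0), Jmat), ((-1, 0), Jmat), ((0, 1), Jmat), ((0, -1), Jmat)]
J_tilde = sublattice_exchange(bonds, [[1, 1], [1, -1]], [(0, 0), (1, 0)])
# J_tilde[0, 1] equals 4 * Jmat: all z = 4 nearest neighbours couple A to B
```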
The effect of applying a finite size magnetic unit cell is that only a subset of magnetic propagation vectors $\mathbf{k}$ are allowed, such that
$$e^{i\mathbf{k}\cdot\mathbf{R}_M} = 1 \quad \forall\,\mathbf{R}_M \in L_M . \tag{3.31}$$
This "rationalisation" of k-space is the standard consequence of applying periodic boundary conditions to a crystal. It should be noted that incommensurate magnetic structures have irrational propagation vectors, so they can never be solved with this method.
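For example, writing $\mathbf{b}_1, \mathbf{b}_2$ for the primitive reciprocal lattice vectors ($\mathbf{b}_i\cdot\mathbf{a}_j = 2\pi\delta_{ij}$), the Néel propagation vector of the square-lattice example satisfies Eqn 3.31 for the magnetic cell of Eqn 3.7:
$$\mathbf{k} = \tfrac{1}{2}\left(\mathbf{b}_1 + \mathbf{b}_2\right) \;\Rightarrow\; \mathbf{k}\cdot\left(\mathbf{a}_1 + \mathbf{a}_2\right) = 2\pi , \quad \mathbf{k}\cdot\left(\mathbf{a}_1 - \mathbf{a}_2\right) = 0 ,$$
so $e^{i\mathbf{k}\cdot\mathbf{R}_M} = 1$ for every magnetic lattice vector, whereas a generic incommensurate $\mathbf{k}$ cannot satisfy the condition.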
As the choice of magnetic unit cell limits the solutions that can be found, it must be chosen from either prior theoretical work or empirical evidence of the propagation vector such as neutron diffraction experiments.
This section formalises a method for solving optimisation problems where the state space is not a vector space. This is needed for finding the ground state of magnetic Hamiltonians of the form
\begin{align}
\frac{\mathcal{H}}{N_M} &= \frac{1}{2}\sum_{\mu,\nu}\mathbf{S}_{\mu}^{T}\,\tilde{\mathbf{J}}_{\mu\nu}\,\mathbf{S}_{\nu} + \sum_{\mu} A_{\mu}\!\left(\mathbf{S}_{\mu}\right) \tag{3.32}\\
&\quad - \mu_B\sum_{\mu}\mathbf{B}^{T}\mathbf{g}_{\mu}\,\mathbf{S}_{\mu} , \tag{3.33}
\end{align}
where $A_{\mu}$ parametrises local anisotropy on site $\mu$; some more complex anisotropy terms are discussed in Chapter 6.
These have the constraint that the sublattice spin vectors have a fixed magnitude i.e.
$$\left|\mathbf{S}_{\mu}\right| = S_{\mu} \quad \forall\,\mu . \tag{3.34}$$
As such, the space of allowed $\{\mathbf{S}_{\mu}\}$ does not form a vector space. Fortunately, as each spin is constrained to be a vector with a fixed length, this is equivalent to saying that each $\mathbf{S}_{\mu}$ is a point on a sphere of radius $S_{\mu}$. This forms a differentiable Riemannian manifold over which some optimisation algorithms such as gradient descent can be generalised, as shown below.
Traditionally, gradient descent is an algorithm for finding a local minimum of a scalar, differentiable function over a vector space,
$$f : \mathbb{R}^n \rightarrow \mathbb{R} . \tag{3.35}$$
The principle of the algorithm comes from defining the gradient operator $\nabla$ such that its inner product with some vector direction $\mathbf{v}$ gives
$$\nabla f \cdot \mathbf{v} = D_{\mathbf{v}} f , \tag{3.36}$$
the derivative of $f$ along $\mathbf{v}$. This is equivalent to the more commonly used definition
$$\nabla f = \sum_i \frac{\partial f}{\partial x_i}\,\hat{\mathbf{e}}_i , \tag{3.37}$$
where $\hat{\mathbf{e}}_i$ is the $i$th unit basis vector of the vector space, and $\partial f/\partial x_i$ is the partial derivative of $f$ with respect to the coordinate $x_i$.
For a small step of fixed length, the change in $f$ is maximised when the step is taken parallel to the gradient $\nabla f$, and the decrease is maximised antiparallel to it. This prompts the iterative solution
$$\mathbf{x}_{n+1} = \mathbf{x}_n - \eta\,\nabla f\!\left(\mathbf{x}_n\right) , \tag{3.38}$$
where $\eta$ is a learning rate, which is a free parameter.
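As a minimal sketch of the iteration in Eqn 3.38 (the central-difference gradient estimate, fixed learning rate and stopping criterion are all assumptions of this example):

```python
import numpy as np

def gradient_descent(f, x0, eta=0.1, tol=1e-8, max_iter=10_000, h=1e-6):
    """Minimise a scalar function f over R^n by the iteration of Eqn 3.38."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # central-difference estimate of the gradient of Eqn 3.37
        grad = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(x.size)])
        if np.linalg.norm(grad) < tol:
            break
        x = x - eta * grad
    return x

# e.g. the minimum of a simple quadratic bowl lies at (1, 0)
print(gradient_descent(lambda x: (x[0] - 1)**2 + 2 * x[1]**2, [3.0, -2.0]))
```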
In order to include the constant spin-length constraint (Eqn. 3.3), modifications have to be made to the standard gradient descent algorithm. One option is to use a penalty method, which involves adding some smooth term to the Hamiltonian to penalise moving away from the constraint. In this case one could add the penalty term
$$\mathcal{H}_\mathrm{penalty} = \lambda\sum_{\mu}\left(\left|\mathbf{S}_{\mu}\right|^2 - S_{\mu}^2\right)^2 , \tag{3.39}$$
where $\lambda$ is a free parameter. This would add a cost for the optimiser to violate the constraint, so it would stay approximately satisfied.
Another solution, which has been implemented for this project, is to consider the constrained state space as a Riemannian manifold, described below. The gradient descent algorithm can be generalised to apply to any Riemannian manifold, not just the Euclidean geometry shown above [2, 88, 87]. In this form the algorithm is sometimes known as Riemannian optimisation. However, some of the steps which are trivial in the Euclidean case are not trivial in the general case.
A Riemannian manifold $\mathcal{M}$ is a smooth, differentiable manifold whose tangent spaces have an inner product which varies smoothly across the manifold. The tangent space $T_x\mathcal{M}$ of a differentiable manifold at a point $x$ is the vector space spanned by the tangent vectors at $x$ of all of the curves on $\mathcal{M}$ passing through $x$. If $\mathcal{M}$ can be embedded in an ambient Euclidean space, this can be easily visualised as the (hyper-)plane spanned by the tangent vectors, as in Figure 3.4a).
In this case, $f$ is instead a differentiable, scalar function over the manifold,
$$f : \mathcal{M} \rightarrow \mathbb{R} . \tag{3.40}$$
This allows the gradient $\operatorname{grad} f$ to be defined again such that
$$\left\langle \operatorname{grad} f(x),\, \mathbf{v} \right\rangle_x \tag{3.41}$$
is the derivative of $f$ along $\mathbf{v}$. It should be noted that the vectors $\operatorname{grad} f(x)$ and $\mathbf{v}$ are now elements of the tangent space $T_x\mathcal{M}$. Clearly, a step along the vector $\operatorname{grad} f$ will maximise the change in $f$, so the approach of gradient descent will work. However, unlike in the Euclidean case, the tangent space and the manifold have very different geometry. A tangent vector in $T_x\mathcal{M}$ can be mapped to $\mathcal{M}$ by the exponential map:
$$\operatorname{Exp}_x : T_x\mathcal{M} \rightarrow \mathcal{M} , \tag{3.42}$$
which sends a tangent vector $\mathbf{v}$ to the point reached by following the geodesic from $x$ along $\mathbf{v}$ for an arc length $\left|\mathbf{v}\right|$.
In some cases, this map may be computationally expensive to evaluate. Overall this gives the iteration step for gradient descent on an arbitrary Riemannian manifold:
$$x_{n+1} = \operatorname{Exp}_{x_n}\!\left(-\eta\,\operatorname{grad} f\!\left(x_n\right)\right) . \tag{3.43}$$
Applying this to the geometry of the problem in magnetism is straightforward. A point on the overall space is a collection of spins $\{\mathbf{S}_{\mu}\}$, which are constrained to be fixed-length vectors. So each spin is a point on a sphere, that is, a point on the spherical manifold $S^2$. The overall space is therefore the product manifold
$$\mathcal{M} = \underbrace{S^2 \times S^2 \times \cdots \times S^2}_{N} , \tag{3.44}$$
and is easily visualised as $N$ independent coordinates, one on each sphere. Formally:
\begin{align}
x &= \left(\mathbf{S}_1,\,\mathbf{S}_2,\,\ldots,\,\mathbf{S}_N\right) \in \mathcal{M} , \tag{3.45}\\
\mathcal{M} &= \left\{\left(\mathbf{S}_1,\ldots,\mathbf{S}_N\right)\ :\ \left|\mathbf{S}_{\mu}\right| = S_{\mu}\ \forall\,\mu\right\} . \tag{3.46}
\end{align}
The tangent space of a product manifold is the product of the tangent spaces of the individual manifolds [2]:
$$T_{(x_1,x_2)}\!\left(\mathcal{M}_1 \times \mathcal{M}_2\right) = T_{x_1}\mathcal{M}_1 \times T_{x_2}\mathcal{M}_2 . \tag{3.47}$$
The exponential map has a similar relationship [2]:
$$\operatorname{Exp}_{(x_1,x_2)}\!\left(\mathbf{v}_1, \mathbf{v}_2\right) = \left(\operatorname{Exp}_{x_1}\!\left(\mathbf{v}_1\right),\ \operatorname{Exp}_{x_2}\!\left(\mathbf{v}_2\right)\right) . \tag{3.48}$$
This means that the update step with $N$ spins can be divided into $N$ update steps, each on a sphere:
\begin{align}
x_{n+1} &= \operatorname{Exp}_{x_n}\!\left(-\eta\,\operatorname{grad}\mathcal{H}\!\left(x_n\right)\right) \tag{3.49}\\
&= \left(\operatorname{Exp}_{\mathbf{S}_1}\!\left(-\eta\,\operatorname{grad}_{\mathbf{S}_1}\mathcal{H}\right),\ \ldots,\ \operatorname{Exp}_{\mathbf{S}_N}\!\left(-\eta\,\operatorname{grad}_{\mathbf{S}_N}\mathcal{H}\right)\right) , \tag{3.50}\\
\text{i.e.}\quad \mathbf{S}_{\mu} &\rightarrow \operatorname{Exp}_{\mathbf{S}_{\mu}}\!\left(-\eta\,\operatorname{grad}_{\mathbf{S}_{\mu}}\mathcal{H}\right) \quad \text{for each } \mu . \tag{3.51}
\end{align}
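On a single sphere of radius $S_{\mu}$ embedded in $\mathbb{R}^3$, this exponential map has a simple closed form (a standard result, quoted here for reference): it rotates $\mathbf{S}_{\mu}$ by an angle $\left|\mathbf{v}\right|/S_{\mu}$ within the plane spanned by $\mathbf{S}_{\mu}$ and $\mathbf{v}$,
$$\operatorname{Exp}_{\mathbf{S}_{\mu}}\!\left(\mathbf{v}\right) = \cos\!\left(\frac{\left|\mathbf{v}\right|}{S_{\mu}}\right)\mathbf{S}_{\mu} + S_{\mu}\sin\!\left(\frac{\left|\mathbf{v}\right|}{S_{\mu}}\right)\frac{\mathbf{v}}{\left|\mathbf{v}\right|} , \qquad \mathbf{v} \in T_{\mathbf{S}_{\mu}}S^2,\ \mathbf{v}\neq\mathbf{0} .$$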
Thankfully, the spherical geometry allows this to be computed without much trouble. Figure 3.4a,b) show how the exponential map of the $2$-sphere can be cheaply approximated by projecting the first-order approximation of the map, the point $\mathbf{S}_{\mu} + \mathbf{v}$ on the tangent (hyper-)plane, back onto the sphere:
\begin{align}
\operatorname{Exp}_{\mathbf{S}_{\mu}}\!\left(\mathbf{v}\right) &\approx \mathcal{P}_{S^2}\!\left(\mathbf{S}_{\mu} + \mathbf{v}\right) \tag{3.52}\\
&= S_{\mu}\,\frac{\mathbf{S}_{\mu} + \mathbf{v}}{\left|\mathbf{S}_{\mu} + \mathbf{v}\right|} . \tag{3.53}
\end{align}
The effect of the approximation is to slightly lower the learning rate while the path along which the state is adjusted is unchanged.
By embedding the sphere in $\mathbb{R}^3$, the coordinate of the point on the sphere can be stored as its $x$, $y$, $z$ components. The tangent space is the plane normal to $\mathbf{S}_{\mu}$, and is a 2-dimensional vector subspace of the ambient $\mathbb{R}^3$. This allows for a very simple method for computing the gradient from the unconstrained Hamiltonian $\mathcal{H}$. The unconstrained gradient $\partial\mathcal{H}/\partial\mathbf{S}_{\mu}$ can be computed cheaply and projected into the tangent plane to give the gradient of the constrained Hamiltonian:
$$\operatorname{grad}_{\mathbf{S}_{\mu}}\mathcal{H} = \mathcal{P}_{T_{\mathbf{S}_{\mu}}S^2}\!\left(\frac{\partial\mathcal{H}}{\partial\mathbf{S}_{\mu}}\right) = \frac{\partial\mathcal{H}}{\partial\mathbf{S}_{\mu}} - \frac{1}{S_{\mu}^2}\left(\mathbf{S}_{\mu}\cdot\frac{\partial\mathcal{H}}{\partial\mathbf{S}_{\mu}}\right)\mathbf{S}_{\mu} . \tag{3.54}$$
This gives the final expression for the iteration step in the gradient descent algorithm on the sphere as
\begin{align}
\mathbf{v}_{\mu} &= -\eta\,\mathcal{P}_{T_{\mathbf{S}_{\mu}}S^2}\!\left(\frac{\partial\mathcal{H}}{\partial\mathbf{S}_{\mu}}\right) , \tag{3.55}\\
\mathbf{S}_{\mu} &\rightarrow \mathcal{P}_{S^2}\!\left(\mathbf{S}_{\mu} + \mathbf{v}_{\mu}\right) \tag{3.56}\\
&= S_{\mu}\,\frac{\mathbf{S}_{\mu} + \mathbf{v}_{\mu}}{\left|\mathbf{S}_{\mu} + \mathbf{v}_{\mu}\right|} , \tag{3.57}
\end{align}
where $S^2$ represents the sphere, and $T_{\mathbf{S}_{\mu}}S^2$ is the tangent space of the sphere at $\mathbf{S}_{\mu}$. This is the generalised form of the expression found in the example (Equation 3.14). In the implementation of the algorithm used in this thesis, $\eta$ is chosen to keep the maximum angle a spin rotates below a threshold, which is progressively decreased until a specified precision is reached.
Numerically this can be computed with double precision floating point arithmetic very quickly, as it contains only linear operations, with the exception of the square-root in the magnitude. The final ingredient is the gradient of the unconstrained Hamiltonian with respect to an individual spin. Analytically this is
$$\frac{\partial\mathcal{H}}{\partial\mathbf{S}_{\mu}} = \frac{1}{2}\sum_{\nu\neq\mu}\left(\tilde{\mathbf{J}}_{\mu\nu} + \tilde{\mathbf{J}}_{\nu\mu}^{T}\right)\mathbf{S}_{\nu} + \frac{1}{2}\left(\tilde{\mathbf{J}}_{\mu\mu} + \tilde{\mathbf{J}}_{\mu\mu}^{T}\right)\mathbf{S}_{\mu} + \frac{\partial A_{\mu}}{\partial\mathbf{S}_{\mu}} - \mu_B\,\mathbf{g}_{\mu}^{T}\mathbf{B} , \tag{3.58}$$
where the $\tilde{\mathbf{J}}_{\mu\mu}$ term captures exchange interactions between two different sites in the same sublattice. For instance, in the square lattice example, a next-nearest-neighbour interaction would couple the $A$ and $B$ sublattices to themselves. The $A_{\mu}$ term captures single-ion anisotropy such as a $D\left(S_{\mu}^{z}\right)^2$ term.
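Putting these pieces together, the following is a minimal sketch of one full iteration of Eqns 3.54-3.58 for a Hamiltonian specified by the sublattice exchange matrices $\tilde{\mathbf{J}}_{\mu\nu}$ (e.g. as assembled in the earlier sketch), g-tensors and field. The single-ion anisotropy term and the adaptive step size are omitted for brevity, and the function names and array layout are assumptions of this sketch rather than the qafm implementation.

```python
import numpy as np

MU_B = 1.0  # Bohr magneton in the chosen units

def unconstrained_gradient(spins, J_tilde, g, B):
    """dH/dS_mu for the bilinear Hamiltonian of Eqn 3.30 (Eqn 3.58 without anisotropy)."""
    grad = np.zeros_like(spins)
    for mu in range(len(spins)):
        acc = -MU_B * g[mu].T @ B
        for nu in range(len(spins)):
            # symmetrised combination covers J~[mu,nu] and its transpose partner,
            # including the same-sublattice (mu == nu) term
            acc = acc + 0.5 * (J_tilde[mu, nu] + J_tilde[nu, mu].T) @ spins[nu]
        grad[mu] = acc
    return grad

def riemannian_step(spins, lengths, J_tilde, g, B, eta):
    """One iteration of Eqns 3.54-3.57 over every sublattice spin."""
    grad = unconstrained_gradient(spins, J_tilde, g, B)
    new = np.empty_like(spins)
    for mu, (s, S) in enumerate(zip(spins, lengths)):
        tangent = grad[mu] - (grad[mu] @ s) * s / S**2   # Eqn 3.54: project into the tangent plane
        stepped = s - eta * tangent                      # Eqns 3.55-3.56
        new[mu] = S * stepped / np.linalg.norm(stepped)  # Eqn 3.57: back onto the sphere
    return new
```

In practice this step is repeated, reducing $\eta$ as the largest rotation angle shrinks, until the required precision is reached.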
The time complexity of an algorithm is defined as the leading-order scaling of the execution time as the size of the input grows, in the limit of large input size. For finding local minima this algorithm has a time complexity of $\mathcal{O}(N^2)$, quadratic in the number of spins $N$ and equal to the time complexity of computing the gradient of the Hamiltonian. To find global minima in energy, the initial point can be guessed at many points across $\mathcal{M}$. However, in the case of magnetism, the number of points needed to cover the manifold scales exponentially in the number of spins in the calculation, giving an overall time complexity of $\mathcal{O}\!\left(b^N N^2\right)$, where the branching factor $b$ depends on the exact form of the Hamiltonian. In examples with only one local minimum the scaling is very efficient, but examples with many local minima in energy are pathological for this algorithm.
In this case, Riemannian optimisation is a significant improvement over penalty methods in that it ensures that the constraint is satisfied exactly, without sacrificing speed. Compared to simulated annealing, it has the advantage of robustly determining local minima in energy, which is useful for tracking the evolution and stability of domains, and for visualising the evolution of the state when there are degenerate ground states. On the other hand, simulated annealing is able to efficiently find approximate global minima in energy, while gradient descent is exceptionally poor at that task for large numbers of spins and many local minima in energy. In such a case, a hybrid approach could work by using simulated annealing to produce an ansatz to seed the gradient descent algorithm; however, this was not required in any of the cases in this thesis.
This algorithm is similar to variations of gradient descent used in many popular mean-field calculation programs. For example, SPINW's [99] optmagsteep method rotates spins towards the local mean field, similar to gradient descent. However, the precise formulation used in this thesis means that it provably converges to a local minimum in energy [2].
I used the principles outlined in this chapter to develop the Python library qafm, which is able to numerically find local and global minima in the mean-field energy with and without an applied magnetic field. This can be performed for arbitrary crystal structures and commensurate magnetic unit cells, and the library is able to verify whether given exchange tensors are symmetry allowed. It is used in later chapters to understand theoretically the response of different systems to magnetic fields at the mean-field level.