1 Introduction

Cosmic rays (CRs), that is the population of charged, relativistic particles with non-thermal spectra, are ubiquitous in the Universe. They pervade systems of all sizes, from stellar systems to whole galaxies, from galaxy clusters to the intercluster medium. See Berezinsky et al. (1990), Strong et al. (2007), Grenier et al. (2015), Kotera and Olinto (2011) for reviews on Galactic and extragalactic cosmic rays. CRs are not only responsible for genuinely non-thermal phenomena, such as the fluxes of CRs observed at Earth, the non-thermal emission of radio, X-ray and gamma-ray sources or the diffuse Galactic and extragalactic emission; they also often have energy densities comparable to or even exceeding those of other components, like the thermal gas, magnetic fields or radiation backgrounds. As such, CRs can contribute to the pressure equilibrium or even drive large-scale outflows (e.g. Everett et al. 2008, Hanasz et al. 2013, Simpson et al. 2016, Recchia et al. 2016). At the largest scales, it has been suggested that CRs (or gamma-rays from blazars) contribute to the heating of the Universe at redshifts as high as \(z \sim 10\) (Nath and Biermann 1993; Sazonov and Sunyaev 2015; Leite et al. 2017).

Any detailed modelling of CRs relies on understanding transport in coordinate and momentum space. For instance, modelling the locally observed CRs involves their propagation from the sources to the observer. It is believed that diffusion is the dominant process in shaping the spectra, both during shock or stochastic acceleration inside the sources and during their transport from the sources. Indeed, for Galactic CRs the most important effect, that is the softening of the observed spectra with respect to the source spectra and the relative softness of so-called secondary species (e.g. boron) with respect to so-called primary species (e.g. carbon), can be explained with a rigidity-dependent diffusion coefficient. (See Gabici et al. (2019) for a recent review of the challenges to this picture.) Scrutinizing this picture and improving upon it requires a better, more refined understanding of spatial transport. A prominent example is the issue of small-scale anisotropies, that is variations of the flux of CRs on angular scales as small as \(5^{\circ }\), which are absent in simple diffusion models. (See Ahlers and Mertsch (2017) for a review on small-scale anisotropies.)

Progress has mainly been hampered by two issues. First, the transport of high-energy, charged particles through a turbulent magnetised plasma is intrinsically non-linear: The temporal evolution of the phase space density of particles can be described by a Fokker-Planck equation with coefficients that depend on the small-scale magnetic field, as will be reviewed below. At the same time, however, CRs contribute to the dielectric tensor of the plasma, thus affecting its dispersion relation. Broadly speaking, waves are damped if the phase space density is very isotropic, but they can grow if there is sufficient anisotropy. In general, sources are distributed inhomogeneously, which leads to anisotropy in momentum and to the growth of wave modes. This is called the streaming instability and can lead to self-confinement of CRs. While this fact was known already in the 1960s (Kulsrud and Pearce 1969; Kulsrud and Cesarsky 1971; Skilling 1975), only recently has it been incorporated into (simple) phenomenological models (Blasi et al. 2012; Evoli et al. 2018). Note that self-generated turbulence is also important close to the sources of Galactic CRs (Malkov et al. 2013; Ptuskin et al. 2008; Nava et al. 2016, 2019). The amplified magnetic fields necessary for shock acceleration to the highest energies are thought to be provided by a different, but related instability (Bell and Lucek 2001).

The other issue is the lack of a fundamental microscopic theory for the transport of charged particles through turbulent magnetic fields. More than 50 years since its inception, quasi-linear theory (QLT) (Jokipii 1966; Kennel and Engelmann 1966; Hall and Sturrock 1967; Hasselmann and Wibberenz 1970) is still very much the paradigm for phenomenological applications to Galactic cosmic rays. In QLT, the Fokker-Planck equation for the temporal evolution of the phase space density of CRs is derived in a perturbative approach where the force on a particle due to a turbulent magnetic field is evaluated along the unperturbed trajectory in a regular background field. The Fokker-Planck coefficients, most prominently the components of the spatial diffusion tensor, can be computed for a given model of turbulence, parametrised by the two-point function of the turbulent magnetic field. Famously, in QLT the interactions between plasma waves and particles are found to be resonant, meaning that particles of a certain gyroradius \(r_{\text{g}} = v / \Omega \), \(\Omega \) denoting the gyrofrequency, are only affected by waves with a wavenumber \(k\) that satisfies \(k r_{\text{g}} \mu \approx 1\) (for low-frequency waves like Alfvén waves) where \(\mu \) is the cosine of the pitch-angle, that is the angle between the particle momentum \(\boldsymbol{p}\) and the regular magnetic field \(\langle \boldsymbol {B} \rangle \), \(\mu \equiv \boldsymbol{p} \cdot \langle \boldsymbol {B} \rangle / (|\boldsymbol{p}| | \langle \boldsymbol {B} \rangle |)\).
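To give a feeling for the scales involved in the resonance condition, the following minimal sketch evaluates the gyroradius and the resonant wavenumber. It assumes SI units and expresses the gyroradius through the rigidity \(R = pc/(Ze)\) in volts, \(r_{\text{g}} = R / (B c)\); the function names and the example values (1 GV, 1 μG) are illustrative choices and not taken from the text.

```python
C_LIGHT = 299_792_458.0          # speed of light in m/s
PARSEC = 3.0856775814913673e16   # metres per parsec

def gyroradius_m(rigidity_volt, b_tesla):
    """Gyroradius r_g = R / (B c) in metres, for rigidity R = p c / (Z e) in volts."""
    return rigidity_volt / (b_tesla * C_LIGHT)

def resonant_wavenumber(rigidity_volt, b_tesla, mu):
    """Wavenumber (in 1/m) satisfying k r_g |mu| = 1; undefined at mu = 0."""
    return 1.0 / (gyroradius_m(rigidity_volt, b_tesla) * abs(mu))

# Illustrative values: a 1 GV particle in a 1 microgauss (1e-10 T) field.
r_g = gyroradius_m(1e9, 1e-10)
k_res = resonant_wavenumber(1e9, 1e-10, mu=0.5)
print(f"r_g   = {r_g / PARSEC:.3e} pc")
print(f"k_res = {k_res * PARSEC:.3e} pc^-1")
```

As \(\mu \to 0\) the resonant wavenumber grows without bound, so the particle resonates with ever smaller scales of the turbulence.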

While some of QLT’s predictions are qualitatively confirmed by data, e.g. the rigidity-dependence of the diffusion coefficients, there are a number of concerns. The most famous one is the \(90^{\circ }\) problem: Due to the resonance condition, particles with pitch-angle close to \(90^{\circ }\) (\(\mu \approx 0\)) can only be in resonance with modes of very large wavenumber \(k\), which for the usual turbulent spectra contain little energy. In the limit \(\mu \to 0\), the scattering rate vanishes and particles cannot change direction along the background field, resulting in ballistic transport. This is obviously at variance with the diffusive transport inferred from observations.

The root cause of the \(90^{\circ}\) problem is the assumption of unperturbed trajectories in QLT. This is remedied in non-linear theories, where the decay of correlations leads to a broadening of the resonance condition which allows for efficient enough scattering through \(90^{\circ}\). However, any such extension of QLT requires additional assumptions, for instance on the form of temporal decorrelation of the particle’s trajectory. Ideally, one would test these non-linear theories by comparing their predictions with data from observations. While this approach is being followed by the heliospheric community, a difficulty remains in that the actual turbulence (if known) turns out to be much more complex than what is routinely assumed in analytical transport theories. Alternatively, transport theories can be tested by comparing their predictions with those from numerical experiments.

Test particle simulations compute the transport of high-energy charged particles through prescribed electromagnetic fields, ignoring the contribution of these particles to the fields themselves. To this end, a realisation of the turbulent magnetic field is generated and the equations of motion (Newton-Lorentz equations) are solved for a large number of test particles. Given the trajectories of a large enough number of test particles, one can numerically compute the Fokker-Planck coefficients.
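As an illustration of what solving the Newton-Lorentz equations for a test particle can look like in practice, here is a minimal sketch of a particle pusher for a static magnetic field without electric field. It uses the Boris rotation, a common choice that is not prescribed by the text above, and it assumes dimensionless units in which time is measured in inverse gyrofrequencies, so that \(\mathrm{d} \boldsymbol{v} / \mathrm{d} t = \boldsymbol{v} \times \boldsymbol{b}\) with \(\boldsymbol{b} = \boldsymbol{B} / B_{0}\). The names `boris_push` and `b_func` are hypothetical.

```python
import numpy as np

def boris_push(x, v, b_func, dt, n_steps):
    """Advance one test particle in a static magnetic field (no electric field).

    Assumed dimensionless units: dv/dt = v x b, with b = B / B_0 and time
    in units of the inverse gyrofrequency. b_func(x) returns b at position x.
    """
    x = np.array(x, dtype=float)
    v = np.array(v, dtype=float)
    traj = np.empty((n_steps + 1, 3))
    traj[0] = x
    for n in range(n_steps):
        x_half = x + 0.5 * dt * v            # drift half a step
        t = 0.5 * dt * b_func(x_half)        # rotation vector at the midpoint
        s = 2.0 * t / (1.0 + t @ t)
        v_prime = v + np.cross(v, t)
        v = v + np.cross(v_prime, s)         # pure rotation, |v| is preserved
        x = x_half + 0.5 * dt * v            # drift the second half step
        traj[n + 1] = x
    return traj, v

# Usage: gyration in a uniform field along z (one gyroperiod is 2 pi).
uniform = lambda x: np.array([0.0, 0.0, 1.0])
traj, v_end = boris_push([0, 0, 0], [1, 0, 0], uniform, dt=0.01, n_steps=628)
print(np.linalg.norm(v_end))  # speed, conserved up to round-off
```

For a turbulent field, `b_func` would return the sum of the regular field and one realisation of \(\boldsymbol{\delta B}\) at the requested position; because the update is a pure rotation, the particle energy is conserved, in line with the absence of electric fields.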

This idea has been very popular ever since computers became powerful enough to allow for the computation of thousands if not millions of test particle trajectories. Yet, we have found the body of literature on this rather disjoint, with no single widely agreed upon method for how to synthetically generate the turbulent magnetic field. It is the intention of this review to provide a low-level introduction to the uninitiated while also discussing some of the applications of test particle simulations.

In addition to testing transport theories by comparing the analytically computed diffusion coefficients to simulated ones, there are at least two more applications of test particle simulations: First, for sources at distances closer than or similar to the scattering mean free path, the diffusive transport theory is not necessarily applicable. An often-cited pathology of solutions to the diffusion equation is superluminal propagation. Lately, there has been increased interest in the transition between the ballistic and diffusive phases of transport (Effenberger and Litvinenko 2014; Malkov and Sagdeev 2015) and test particle simulations allow exploring this transition for given turbulent electromagnetic fields (e.g. Tautz and Lerche 2016). Second, analytical transport theories usually make predictions only for the ensemble-averaged phase-space density and it is usually assumed that the observed phase-space densities are close to the ensemble average. Recently, this has been called into question, in particular in view of the small-scale anisotropies observed in the arrival directions of TeV-PeV CRs. Test particle simulations naturally simulate CR distributions for individual realisations of the turbulent fields and thus provide direct access to such stochasticity effects.

This review will be structured as follows. In Sect. 2 we give a brief review of QLT, describing how the diffusion coefficients are evaluated and introducing some of the simplest and most popular turbulence models. We will also review a few of QLT’s non-linear extensions. In Sect. 3, we explain the two main methods that have been employed in generating turbulent magnetic fields on a computer. We will reproduce the recipes from the literature in a way that should allow the interested reader to produce her/his own synthetic turbulence. In Sect. 4, we will discuss two applications of test particle simulations, that is the computation of parallel and perpendicular mean free paths and the prediction of anisotropies in the arrival direction of CRs. Specifically, in Sect. 4.3 we will clarify some of the issues related to backtracking, a technique based on solving the equations of motion backward in time. We will conclude with a short summary and outlook in Sect. 5.

2 Quasi-linear theory and extensions

For some 50 years, quasi-linear theory (QLT) (Jokipii 1966; Kennel and Engelmann 1966; Hall and Sturrock 1967; Hasselmann and Wibberenz 1970) has been the broadly accepted and widely employed theory of CR transport. Its success and popularity can be ascribed to its conceptual simplicity and its validity in a number of important environments, including the solar wind, the interstellar medium and galaxy clusters. In addition, QLT allows in principle for a straightforward computation of the transport parameters, although the computation can become arbitrarily complex in practice. Finally, its results are found to agree with inferences from observations, e.g. the normalisation and power law shape of the Galactic diffusion coefficient.

At the heart of QLT is the evaluation of the turbulent magnetic field and its contribution to the Lorentz force along “unperturbed orbits”, that is trajectories calculated in only a large-scale, regular magnetic field. Interactions of CRs with small-scale, magnetised turbulence result in resonant interactions, that is particles of Larmor radius \(r_{\text{g}}\) and pitch-angle cosine \(\mu \) interact predominantly with modes of wavenumber \(k\) that satisfies \(k r_{\text{g}} \mu \approx 1\). These resonant interactions lead to pitch-angle scattering and for a spectrum of magnetic turbulence with random phases, the particle performs a random walk in pitch-angle. The evolution of the phase space density can be described by a Fokker-Planck equation and the Fokker-Planck coefficients, e.g. the pitch-angle diffusion coefficient or the rate of second-order Fermi acceleration, depend on the two-point correlation functions of the turbulent magnetic field. In addition, under the assumption of slow variation of the phase space density with position and time, pitch-angle diffusion results in spatial diffusion along the background magnetic field (Earl et al. 1988). Finally, QLT also allows computing the dipole anisotropy in the arrival directions of CRs for a given spatial gradient of the phase space density.

In the following, we review the foundations of QLT, starting from the derivation of the Fokker-Planck equation. After an introduction to the various turbulence geometries in use, we outline how the transport coefficients can be computed. Motivated by the short-comings of QLT, we review some of its non-linear extensions.

2.1 Derivation of the Fokker-Planck equation

Charged particles in electric and magnetic fields \(\boldsymbol{E}\) and \(\boldsymbol{B}\) are subject to the Lorentz force,

$$ \boldsymbol{F}_{\text{L}} = e \left ( \boldsymbol{E} + \boldsymbol{v} \times \boldsymbol{B} / c \right ) \, , $$
(1)

with \(e\) and \(\boldsymbol{v}\) the charge and velocity of the particle and \(c\) the speed of light. It is customary to decompose the magnetic field into a large-scale, homogeneous, regular background field, \(\langle \boldsymbol {B} \rangle {}\) and a small-scale, turbulent, random field \(\boldsymbol{\delta B}\), that is \(\boldsymbol{B} = \langle \boldsymbol {B} \rangle {} + \boldsymbol{\delta B}\) with \(\langle \boldsymbol{\delta B} \rangle \equiv 0\). (Throughout this article, we use angled brackets to denote averages over an ensemble of turbulent magnetic fields.) Without loss of generality, we assume in the following that the regular field is oriented along the \(z\)-direction, \(\langle \boldsymbol {B} \rangle {} = B_{z} \hat{z}\), unless stated otherwise. Large-scale electric fields are usually ignored, \(\langle \boldsymbol{E} \rangle = 0\), as the large mobility of charges in astrophysical plasmas is efficiently shielding against regular electric fields (that is on scales much larger than the Debye length). Small-scale electric fields \(\boldsymbol{\delta E}\) are necessarily present, but from Faraday’s induction law, their magnitude can be estimated to be \(|\boldsymbol{\delta E}| \sim ( v_{\text{A}} / c) |\boldsymbol{\delta B}|\) with \(v_{\text{A}} \) the Alfvén velocity and \(v_{\text{A}} / c \ll 1\) in most astrophysical environments. Thus, to lowest order in \(( v_{\text{A}} / c)\), there is no electric field and as the magnetic force is not performing any work on the particle, particle energy is consequently conserved. Note that at higher orders in \(( v_{\text{A}} / c)\), the particle energy can change in a second-order Fermi type process. For simplicity, we constrain ourselves here to considering the lowest order case which results in pitch-angle scattering. For the fuller picture including the higher-order processes, we refer the interested reader to Schlickeiser (2002).

A charged particle in a magnetic field forms a Hamiltonian system as long as dissipative processes (or any form of energy losses) can be ignored. A consequence of this is Liouville’s theorem, that is the conservation of phase space volume under canonical transformations. As time evolution is a canonical transformation, phase space volume is conserved in time (Goldstein et al. 2002). Together with particle number conservation this implies the conservation of phase space density \(f = f(\boldsymbol{r}, \boldsymbol{p}, t)\). This is conveniently captured by what we will call Liouville’s equation,

$$ \frac{ \mathrm {d} f}{ \mathrm {d} t} = \frac{\partial f}{\partial t} + \frac{ \mathrm {d} \boldsymbol{r}}{ \mathrm {d} t} \cdot \boldsymbol{\nabla}_{\boldsymbol{r}} f + \frac{ \mathrm {d} \boldsymbol{p}}{ \mathrm {d} t} \cdot \boldsymbol{\nabla}_{\boldsymbol{p}} f = 0 \, , $$
(2)

encoding the incompressibility of the phase space flow. Here,

$$ \frac{ \mathrm {d} \boldsymbol{r}}{ \mathrm {d} t} = \boldsymbol{v} \, , \quad \frac{ \mathrm {d} \boldsymbol{p}}{ \mathrm {d} t} = e \left ( \boldsymbol{E} + \frac{\boldsymbol{v} \times \boldsymbol{B}}{c} \right ) \, , $$
(3)

are the equations of motion. Note that a necessary condition for a Hamiltonian system is that the forces are conservative and differentiable (“\(p\)-divergence-free”).

A collisionless plasma under the influence of external electric and magnetic fields, \(\boldsymbol{E}\) and \(\boldsymbol{B}\), is an example of a Hamiltonian system. Its Hamiltonian is (Jackson 1998)

$$ H = \sqrt{(c \boldsymbol{P} - e \boldsymbol{A})^{2} + m^{2} c^{4}} + e \Phi \, . $$
(4)

Here, \(\boldsymbol{P} = \boldsymbol{p} + (e/c) \boldsymbol{A}\) is the canonical momentum, \(\boldsymbol{A}\) the vector potential, \(m\) the particle mass, \(e\) its charge and \(\Phi \) the electric potential. Therefore, the phase space density of this collisionless plasma satisfies Eq. (2) and substituting the Lorentz force, Eq. (1), in Eq. (2) gives the Vlasov equation,

$$ \frac{\partial f}{\partial t} + \frac{ \mathrm {d} \boldsymbol{r}}{ \mathrm {d} t} \cdot \boldsymbol{\nabla}_{\boldsymbol{r}} f + e \left ( \boldsymbol{E} + \frac{\boldsymbol{v} \times \boldsymbol{B}}{c} \right ) \cdot \boldsymbol{\nabla}_{\boldsymbol{p}} f = 0 \, , $$
(5)

which together with Maxwell’s equations forms the basis of plasma kinetic theory. For a collisional plasma, a term needs to be added to the right-hand side, the famous collision operator. For a collisionless plasma (as appropriate for CRs) the right-hand side remains zero.

Considering turbulent fields, the phase space density also becomes a random field, \(f = \langle f \rangle + \delta f\), with an expectation value, \(\langle f \rangle \), and fluctuations around it, \(\delta f\), that satisfy \(\langle \delta f \rangle = 0\).

In any realistic astrophysical situation, it is of course impossible to know the small-scale turbulent field at all positions in order to exactly solve Eq. (5). Instead, one can only hope to predict statistical moments of the phase space density for a statistical ensemble of turbulent magnetic fields. Traditionally, one is mostly interested in the first moment, the ensemble average, though see Mertsch and Ahlers (2019) for the computation of a second-order moment.

In the following, we ignore electric fields, see above. Averaging Eq. (5), we find, see e.g. Jokipii (1972),

$$\begin{aligned} \frac{ \mathrm {d} \langle f \rangle }{ \mathrm {d} t} &= \frac{\partial \langle f \rangle }{\partial t} + \frac{ \mathrm {d} \boldsymbol{r}}{ \mathrm {d} t} \cdot \boldsymbol{\nabla}_{\boldsymbol{r}} \langle f \rangle + e \frac{\boldsymbol{v} \times \langle \boldsymbol {B} \rangle {}}{c} \cdot \boldsymbol{\nabla}_{ \boldsymbol{p}} \langle f \rangle \\ &= - \left \langle e \frac{\boldsymbol{v} \times \boldsymbol{\delta B}}{c} \cdot \nabla _{\boldsymbol{p}} \delta f \right \rangle \neq 0 \, . \end{aligned}$$
(6)

Note that unlike the phase space density \(f\), the ensemble averaged phase space density \(\langle f \rangle \) is not conserved, \(\mathrm {d} \langle f \rangle / \mathrm {d} t \neq 0\). (More on this in Sect. 4.3.)

One way to glean some physical insight from Eq. (6) is to identify its right-hand-side with a damping term, (Earl et al. 1988; Webb 1989),

$$ \left \langle e \frac{\boldsymbol{v} \times \boldsymbol{\delta B}}{c} \cdot \nabla _{\boldsymbol{p}} \delta f \right \rangle \to \nu \left ( \langle f \rangle - \frac{1}{4 \pi } \int \mathrm {d} \hat {\boldsymbol {p}} \, \langle f \rangle \right ) \, , $$
(7)

(where \(\hat {\boldsymbol {p}} \equiv \boldsymbol{p} / |\boldsymbol{p}|\)) that is driving the phase space density towards isotropy at a rate \(\nu \), an approach that can also be motivated by gas kinetic theory (Bhatnagar et al. 1954). This way, Eq. (6) can be solved and shown to lead to a spatial diffusion equation. The parallel diffusion coefficient can be identified as \(\kappa _{\parallel } = v^{2}/(3 \nu )\) whereas the perpendicular diffusion coefficient satisfies \(\kappa _{\perp}/\kappa _{\parallel} = (1 + \Omega ^{2}/\nu ^{2})^{-1}\), a result referred to as the “classical scattering limit” (Gleeson 1969). Here, \(\Omega \) is the particle’s gyrofrequency.
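The two limits of the classical scattering result are easy to explore numerically. The following minimal sketch simply evaluates the two expressions quoted above; the function name and the numerical values are illustrative assumptions.

```python
def classical_scattering(v, nu, omega):
    """Diffusion coefficients in the classical scattering limit.

    kappa_par  = v^2 / (3 nu)
    kappa_perp = kappa_par / (1 + omega^2 / nu^2)
    """
    kappa_par = v**2 / (3.0 * nu)
    kappa_perp = kappa_par / (1.0 + (omega / nu) ** 2)
    return kappa_par, kappa_perp

# Strong scattering (nu >> omega): transport is nearly isotropic.
print(classical_scattering(v=1.0, nu=100.0, omega=1.0))
# Weak scattering (nu << omega): perpendicular diffusion is suppressed
# by a factor of roughly (nu / omega)^2.
print(classical_scattering(v=1.0, nu=0.01, omega=1.0))
```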

In QLT, however, a more systematic solution for \(\langle f \rangle \) is sought through an equation for the temporal evolution of the fluctuations \(\delta f\). Such an equation can be obtained by subtracting the ensemble-averaged Vlasov equation, Eq. (6), from the original Vlasov equation, Eq. (5),

$$\begin{aligned} & \frac{\partial \delta f}{\partial t} + \frac{ \mathrm {d} \boldsymbol{r}}{ \mathrm {d} t} \cdot \boldsymbol{\nabla}_{\boldsymbol{r}} \delta f + e \left ( \frac{\boldsymbol{v} \times \langle \boldsymbol {B} \rangle {}}{c} \right ) \cdot \boldsymbol{\nabla}_{\boldsymbol{p}} \delta f \\ & \simeq -e \left ( \frac{\boldsymbol{v} \times \boldsymbol{\delta B}}{c} \right ) \cdot \boldsymbol{\nabla}_{\boldsymbol{p}} \langle f \rangle \, . \end{aligned}$$
(8)

Here, we have chosen to ignore the difference

$$ e \frac{\boldsymbol{v} \times \boldsymbol{\delta B}}{c} \cdot \nabla _{\boldsymbol{p}} \delta f - \left \langle e \frac{\boldsymbol{v} \times \boldsymbol{\delta B}}{c} \cdot \nabla _{\boldsymbol{p}} \delta f \right \rangle \, , $$
(9)

which is second order in perturbed quantities, \(\boldsymbol{\delta B}\) and \(\delta f\). This assumes, of course, that \(|\boldsymbol{\delta B}| \ll | \langle \boldsymbol {B} \rangle {}|\) and therefore \(\delta f \ll \langle f \rangle \). Equation (8) can now be integrated with the method of characteristics, the formal solution being

$$ \delta f = \delta f_{0} - \int _{t_{0}}^{t} \mathrm {d} t' \left [ e \left ( \frac{\boldsymbol{v} \times \boldsymbol{\delta B}}{c} \right ) \cdot \boldsymbol{\nabla}_{\boldsymbol{p}} \langle f \rangle \right ]_{P(t')} \, . $$
(10)

Here, \(\delta f_{0} \equiv \delta f(\boldsymbol{r}, \boldsymbol{p}, t_{0})\) denotes the fluctuation of the phase space density at time \(t_{0}\) and the subscript \(P(t')\) indicates that positions and momenta in the square brackets are to be evaluated along the characteristics of Eq. (8), that is the solutions of the equations of motion, Eq. (3), with \(\boldsymbol{B}\) replaced by the regular field \(\langle \boldsymbol {B} \rangle {}\) only (and again no electric field). These solutions \(P\) are commonly referred to as “unperturbed orbits” or “unperturbed trajectories”. For the homogeneous regular magnetic field \(\langle \boldsymbol {B} \rangle {} = B_{z} \hat{z}\) assumed here they are of course helices along the \(z\)-direction.
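For concreteness, the unperturbed orbit can be written down in closed form. The sketch below assumes a positively charged particle, for which \(\mathrm{d} \boldsymbol{v} / \mathrm{d} t = \Omega \, \boldsymbol{v} \times \hat{z}\), with initial gyrophase `phi0`; the function name and argument conventions are hypothetical.

```python
import numpy as np

def unperturbed_orbit(t, r0, v, mu, phi0, omega):
    """Helical orbit in a homogeneous field along z (positive charge assumed).

    Velocity: v_x = v_perp cos(phi0 - omega t), v_y = v_perp sin(phi0 - omega t),
    v_z = v mu, which solves dv/dt = omega * (v x z_hat).
    Returns positions with shape (..., 3) for scalar or array t.
    """
    t = np.asarray(t, dtype=float)
    v_perp = v * np.sqrt(1.0 - mu**2)
    phase = phi0 - omega * t
    x = r0[0] - (v_perp / omega) * (np.sin(phase) - np.sin(phi0))
    y = r0[1] + (v_perp / omega) * (np.cos(phase) - np.cos(phi0))
    z = r0[2] + v * mu * t
    return np.stack([x, y, z], axis=-1)

# After one gyroperiod the particle returns to its initial x and y
# and has advanced by v * mu * (2 pi / omega) along z.
period = 2.0 * np.pi / 0.5
print(unperturbed_orbit([0.0, period], (1.0, 2.0, 3.0), 1.0, 0.3, 0.7, 0.5))
```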

We can now substitute Eq. (10) into Eq. (6),

$$\begin{aligned} & \frac{\partial \langle f \rangle }{\partial t} + \frac{ \mathrm {d} \boldsymbol{r}}{ \mathrm {d} t} \cdot \boldsymbol{\nabla}_{\boldsymbol{r}} \langle f \rangle + e \frac{\boldsymbol{v} \times \langle \boldsymbol {B} \rangle {}}{c} \cdot \boldsymbol{\nabla}_{ \boldsymbol{p}} \langle f \rangle \\ &\simeq \int _{t_{0}}^{t} \!\! \mathrm {d} t' \! \left \langle e \frac{\boldsymbol{v} \! \times \! \boldsymbol{\delta B}}{c} \! \cdot \! \nabla _{\boldsymbol{p}} \left [ e \frac{\boldsymbol{v} \! \times \! \boldsymbol{\delta B}}{c} \! \cdot \! \boldsymbol{\nabla}_{\boldsymbol{p}} \langle f \rangle \right ]_{P(t')} \right \rangle , \end{aligned}$$
(11)

where we have dropped the term \(\propto \delta f_{0}\). At this stage, we can already see that the right-hand side will lead to diffusion terms (courtesy of the two momentum derivatives) and that it depends on the turbulent magnetic field’s two-point function, integrated along the unperturbed trajectory \(P(t')\). To make further progress, we consider the momentum \(\boldsymbol{p}\) in spherical coordinates, that is \(\boldsymbol{p} = p (\sqrt{1 - \mu ^{2}} \cos \phi , \sqrt{1 - \mu ^{2}} \sin \phi , \mu )^{T}\) and introduce the correlation lengths \(l_{\text{c}}\) and correlation times \(\tau _{\text{c}}\) of the turbulent magnetic field, defined through

$$\begin{aligned} \langle \delta B^{2} \rangle l_{\text{c}} &\equiv \int _{0}^{\infty} \mathrm {d} \Delta r \, \langle \delta B(\boldsymbol{r}) \delta B(\boldsymbol{r}+\boldsymbol{\Delta r}) \rangle \,, \end{aligned}$$
(12)
$$\begin{aligned} \langle \delta B^{2} \rangle \tau _{\text{c}} &\equiv \int _{0}^{\infty} \mathrm {d} \Delta t \, \langle \delta B(t) \delta B(t+\Delta t) \rangle \,. \end{aligned}$$
(13)

(Strictly speaking, the correlation lengths and times are tensors because of the vector nature of the magnetic field in the two-point functions; here, however, we only require them for order of magnitude arguments, so we do not distinguish the different components.) Here, \(\delta B(t)\) is short-hand for \(\left [ \delta B(\boldsymbol{r}) \right ]_{P(t)}\), that is \(\delta B(\boldsymbol{r})\) evaluated along the unperturbed trajectory at time \(t\).
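As a simple numerical illustration of Eq. (12), the following sketch integrates a sampled two-point function with the trapezoidal rule. The exponential correlation function used in the example is an assumption made purely for illustration, chosen because its correlation length is known analytically; the function name is hypothetical.

```python
import numpy as np

def correlation_length(delta_r, corr):
    """Correlation length l_c = (1 / C(0)) * integral of C over delta_r >= 0.

    delta_r: increasing separations starting at 0; corr: two-point function
    C(delta_r) sampled on that grid, so corr[0] equals <delta B^2>.
    """
    delta_r = np.asarray(delta_r, dtype=float)
    corr = np.asarray(corr, dtype=float)
    integral = np.sum(0.5 * (corr[1:] + corr[:-1]) * np.diff(delta_r))
    return integral / corr[0]

# Illustrative exponential two-point function, for which l_c = L.
L = 2.0
dr = np.linspace(0.0, 40.0 * L, 20001)
l_c = correlation_length(dr, 3.0 * np.exp(-dr / L))
print(l_c)  # close to L
```

The same routine applies to Eq. (13) if the separations are replaced by time lags along the unperturbed trajectory.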

The right-hand side of Eq. (11) is still rather unwieldy and further progress requires a number of assumptions. In addition to

  1. Smallness of perturbations, \(|\boldsymbol{\delta B}| \ll | \langle \boldsymbol {B} \rangle {}|\) (see above);

these are:

  2. Gyrotropy: The ensemble-averaged phase space density \(\langle f \rangle \) does not depend on the azimuthal angle \(\phi \), so \(\langle f \rangle (\boldsymbol{r}, p, \mu , \phi , t) \to \langle f \rangle (\boldsymbol{r}, p, \mu , t)\).

  3. Adiabatic approximation: The phase space density only varies on time-scales much larger than the correlation time of the turbulent magnetic field, \(\tau _{\text{c}}\),

    $$ \langle f \rangle \left / \frac{\partial \langle f \rangle }{\partial t} \right . \gg \tau _{\text{c}} \, . $$
    (14)

  4. Finite correlation times: The correlation times of the turbulent magnetic field are much larger than the Larmor time, \(\tau _{\text{c}} \gg \Omega ^{-1}\).

  5. Homogeneous and stationary turbulence.

Under these conditions, the ensemble-averaged Vlasov equation ultimately results in a Fokker-Planck type equation (Fokker 1914; Planck 1917), also known as the Kolmogorov forward equation (Kolmogorov 1931) or as the Smoluchowski equation (Bogoliubov and Krylov 1939), describing diffusion in pitch-angle,

$$ \frac{\partial \langle f \rangle }{\partial t} + v \mu \frac{\partial \langle f \rangle }{\partial z} = \frac{\partial }{\partial \mu } \left ( D_{\mu \mu } \frac{\partial \langle f \rangle }{\partial \mu } \right ) \, . $$
(15)

Following the approach sketched above, the pitch-angle diffusion coefficient,

$$ D_{\mu \mu} \equiv \frac{\langle (\Delta \mu )^{2} \rangle }{2 \Delta t} \,, $$
(16)

can be expressed in terms of the correlation function of the magnetic field.

If we had not decided to ignore any electric field, additional terms would have appeared in the Fokker-Planck equation (15), relating to changes in momentum \(p\) and pitch-angle, with diffusion coefficients \(D_{\mu p} = D_{p \mu}\) and \(D_{pp}\) defined analogously to Eq. (16). We have furthermore assumed that \(v_{\text{A}} / v \ll 1\) in order for \(D_{xx}\), \(D_{yy}\), \(D_{xy}\) and \(D_{yx}\) to be negligible. Not doing so would have resulted in the additional terms

$$\begin{aligned} & \frac{\partial}{\partial x} \left (D_{xx} \frac{\partial \langle f \rangle }{\partial x} + D_{xy} \frac{\partial \langle f \rangle }{\partial y} \right ) \\ + & \frac{\partial}{\partial y} \left (D_{yx} \frac{\partial \langle f \rangle }{\partial x} + D_{yy} \frac{\partial \langle f \rangle }{\partial y} \right ) \,, \end{aligned}$$
(17)

to be added to the right-hand side of Eq. (15).

In summary, under the influence of a turbulent magnetic field, charged particles are performing a random walk in pitch-angle which in the ensemble average results in diffusion in pitch-angle (cosine).
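The connection between a random walk in \(\mu \) and the definition in Eq. (16) can be illustrated with a toy model. The sketch below is not a test particle simulation: it assumes independent Gaussian kicks with a constant, \(\mu \)-independent diffusion coefficient `d_true` and ignores the boundaries at \(\mu = \pm 1\), which is acceptable only as long as the walk stays well inside the interval. All names and numbers are illustrative.

```python
import numpy as np

def estimate_d_mumu(mu_start, mu_end, delta_t):
    """Estimator of Eq. (16): D = <(Delta mu)^2> / (2 Delta t), averaged over particles."""
    d_mu = np.asarray(mu_end) - np.asarray(mu_start)
    return np.mean(d_mu**2) / (2.0 * delta_t)

# Toy random walk in mu with constant diffusion coefficient (an assumption).
rng = np.random.default_rng(seed=1)
d_true, dt, n_steps, n_particles = 0.01, 0.01, 100, 20_000
kicks = rng.normal(0.0, np.sqrt(2.0 * d_true * dt), size=(n_particles, n_steps))
mu0 = np.zeros(n_particles)
mu1 = mu0 + kicks.sum(axis=1)

d_est = estimate_d_mumu(mu0, mu1, n_steps * dt)
print(d_est, d_true)  # agree up to statistical noise
```

In an actual simulation, `mu0` and `mu1` would be the pitch-angle cosines of the test particles at two times separated by \(\Delta t\), and the estimate would in general depend on \(\mu \).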

2.2 The diffusion approximation

Particle transport can be conveniently categorised if the mean-square displacement in direction \(i\), \(\langle \Delta r_{i}^{2} \rangle \), has a power law dependence,

$$ \langle \Delta {r}_{i}^{2} \rangle \propto (\Delta t)^{\alpha } $$
(18)

as

$$ \textstyle\begin{array}{r@{\quad } l} \alpha < 1: & \text{sub-diffusive,} \\ \alpha = 1: & \text{diffusive,} \\ \alpha > 1:& \text{super-diffusive, in particular} \\ \alpha = 2: & \text{ballistic.} \end{array} $$
(19)
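Given a mean-square displacement measured at a set of times, the exponent \(\alpha \) of Eq. (18) can be estimated with a straight-line fit in log-log space. The sketch below assumes that the data are well described by a single power law over the fitted range; the function names and the tolerance used for the classification are illustrative choices.

```python
import numpy as np

def transport_exponent(t, msd):
    """Least-squares slope of log(msd) versus log(t), i.e. alpha in Eq. (18)."""
    slope, _intercept = np.polyfit(np.log(t), np.log(msd), 1)
    return slope

def classify(alpha, tol=0.05):
    """Map alpha to the regimes of Eq. (19); tol is an arbitrary tolerance."""
    if abs(alpha - 2.0) <= tol:
        return "ballistic"
    if abs(alpha - 1.0) <= tol:
        return "diffusive"
    return "sub-diffusive" if alpha < 1.0 else "super-diffusive"

t = np.linspace(1.0, 10.0, 50)
print(classify(transport_exponent(t, 4.0 * t**2)))   # ballistic input
print(classify(transport_exponent(t, 0.3 * t)))      # diffusive input
```

In practice the fit should be restricted to a time window in which a single regime holds, since the exponent changes between early and late times.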

It seems clear that transport in any perturbative theory with \(|\boldsymbol{\delta B}| \ll | \langle \boldsymbol {B} \rangle {}|\) must be ballistic at early enough times: Particles just gyrate around \(\langle \boldsymbol {B} \rangle {}\) and \(\langle \Delta z^{2} \rangle = (v \mu \Delta t)^{2}\) while \(\langle \Delta x^{2} \rangle = \langle \Delta y^{2} \rangle = 0\) when integrated over full gyroperiods. At late times, that is for \(t \gg D_{\mu \mu }^{-1}\), we would expect diffusive behaviour for the transport along the field.

In order to formalise this picture, we derive a spatial diffusion equation from the Fokker-Planck equation (15). To this end, we decompose \(\langle f \rangle \) into an isotropic part, \(g\), and an anisotropic part, \(h\),

$$\begin{aligned} \langle f \rangle (p, \mu , t) = g(p, t) + h(p, \mu , t) \, , \end{aligned}$$
(20)
$$\begin{aligned} \text{where } g(p, t) = \frac{1}{2} \int _{-1}^{1} \mathrm {d} \mu \langle f \rangle (p, \mu , t) \end{aligned}$$
(21)
$$\begin{aligned} \text{and} \quad \int _{-1}^{1} \mathrm {d} \mu \, h(p, \mu , t) = 0 \, . \end{aligned}$$
(22)

If \(g\) varies only slowly with time and position,

$$ g \left / \frac{\partial g}{\partial t} \right . \gg \tau _{\text{sc}} \quad \text{and} \quad g \left / \frac{\partial g}{\partial z} \right . \gg \lambda _{\text{sc}} \, , $$
(23)

where \(\tau _{\text{sc}} \sim D_{\mu \mu }^{-1}\) and \(\lambda _{\text{sc}} \sim v \tau _{\text{sc}}\) are the scattering time and mean-free path, respectively, the phase space density will be very isotropic, \(h \ll g\). In this case, we can derive a spatial diffusion equation for the isotropic part \(g\) (e.g. Hasselmann and Wibberenz 1970),

$$ \frac{\partial g}{\partial t} - \frac{\partial }{\partial z} \left ( \kappa _{\parallel } \frac{\partial g}{\partial z} \right ) = 0 \, , $$
(24)

with the parallel diffusion coefficient

$$ \kappa _{\parallel } = \frac{v^{2}}{8} \int _{-1}^{1} \mathrm {d} \mu \, \frac{(1 - \mu ^{2})^{2}}{D_{\mu \mu }} \, . $$
(25)
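Equation (25) is straightforward to evaluate numerically for a given \(D_{\mu \mu }(\mu )\). The following sketch uses Gauss-Legendre quadrature, whose nodes lie strictly inside \((-1, 1)\) and thus avoid the end points where \(D_{\mu \mu }\) typically vanishes. As a check we assume, purely for illustration, the isotropic form \(D_{\mu \mu } = (\nu / 2)(1 - \mu ^{2})\), for which Eq. (25) gives \(\kappa _{\parallel } = v^{2} / (3 \nu )\); the function name is hypothetical.

```python
import numpy as np

def kappa_parallel(d_mumu, v, n_nodes=64):
    """Evaluate Eq. (25) by Gauss-Legendre quadrature on [-1, 1].

    d_mumu: callable returning D_mumu for an array of mu values.
    """
    mu, weights = np.polynomial.legendre.leggauss(n_nodes)
    integrand = (1.0 - mu**2) ** 2 / d_mumu(mu)
    return v**2 / 8.0 * np.sum(weights * integrand)

# Illustrative isotropic scattering, D_mumu = (nu / 2) (1 - mu^2).
nu = 2.0
isotropic = lambda mu: 0.5 * nu * (1.0 - mu**2)
print(kappa_parallel(isotropic, v=1.0), 1.0 / (3.0 * nu))
```

For a \(D_{\mu \mu }\) that vanishes at \(\mu = 0\), as in the \(90^{\circ }\) problem, the integrand is singular there and the quadrature does not converge with increasing `n_nodes`.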

Furthermore, we would expect the anisotropic part \(h\) to be dominated by the dipole anisotropy, that is \(h \approx h_{1} \mu \) with

$$ h_{1} = \frac{3}{2} \int _{-1}^{1} \mathrm {d} \mu \, \mu \, h(\mu ) = - \frac{3}{v} \kappa _{\parallel } \frac{\partial g}{\partial z} \, . $$
(26)

2.3 Computation of transport coefficients

So far, we have not specified the functional form of the Fokker-Planck coefficients, e.g. the pitch-angle diffusion coefficient \(D_{\mu \mu }\), and its dependence on the two-point correlation function of turbulence \(P_{ij}(\boldsymbol{k})\) that emerges in the derivation of the Fokker-Planck equation (15). An alternative to the derivation of Sect. 2.1 is to directly compute the Fokker-Planck coefficients from solutions of the equations of motion. In fact, an arbitrary Fokker-Planck coefficient \(D_{PQ}\) can be defined in terms of the mean displacements of the variables in question, \(P\) and \(Q\). For instance, the pitch-angle diffusion coefficient can be derived as the \(t \to \infty \) limit of the running diffusion coefficient,

$$ d_{\mu \mu }(t) = \frac{1}{2} \frac{ \mathrm {d} }{ \mathrm {d} t} \left \langle (\Delta \mu )^{2} \right \rangle \, . $$
(27)

This is a consequence of the Taylor-Green-Kubo formula (Taylor 1922; Green 1951; Kubo 1957),

$$ D_{\mu \mu } = \int _{0}^{\infty } \mathrm {d} t \langle \dot{\mu}(0) \dot{\mu}(t) \rangle \, , $$
(28)

where the dots denote derivatives with respect to time. For diffusive transport, Eqs. (16) and (27) coincide, of course. Moreover, this allows computing the parallel diffusion coefficient \(\kappa _{\parallel }\) without the detour of computing \(D_{\mu \mu }\) first and then applying the diffusion approximation, Eq. (25).
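In a test particle simulation, the running diffusion coefficient of Eq. (27) can be estimated directly from the stored pitch-angle cosines. The sketch below assumes that all particles are sampled at the same times and uses a finite-difference derivative; the function name and array layout are hypothetical.

```python
import numpy as np

def running_coefficient(t, mu_traj):
    """Running coefficient d(t) = 0.5 * d/dt <(mu(t) - mu(0))^2>, cf. Eq. (27).

    t: sample times, shape (n_times,);
    mu_traj: pitch-angle cosines, shape (n_particles, n_times).
    """
    mu_traj = np.asarray(mu_traj, dtype=float)
    msd = np.mean((mu_traj - mu_traj[:, :1]) ** 2, axis=0)
    return 0.5 * np.gradient(msd, t)

# Synthetic check: two "particles" with mu = +/- sqrt(2 D t) have a
# mean-square displacement of exactly 2 D t, so d(t) = D at all times.
D = 0.05
t = np.linspace(0.0, 1.0, 101)
s = np.sqrt(2.0 * D * t)
print(running_coefficient(t, np.array([s, -s]))[:3])
```

If the running coefficient settles to a plateau at late times, the plateau value is the estimate of \(D_{\mu \mu }\); the absence of a plateau signals non-diffusive behaviour.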

From the equations of motion, see Eq. (3), we find

$$ \dot{\mu } = \frac{e}{c p} \left ( \boldsymbol{v} \times \boldsymbol{B} \right )_{z} = \frac{1}{r_{\text{g}} B_{z}} \left (v_{x} \delta B_{y}(\boldsymbol{r}) - v_{y} \delta B_{x}(\boldsymbol{r}) \right ) , $$
(29)

and thus

$$\begin{aligned} & D_{\mu \mu } \\ =& \frac{1}{ B_{z} ^{2} r_{\text{g}}^{2}} \!\! \int _{0}^{\infty } \!\!\!\!\! \mathrm {d} t \left [ v_{x}(t) v_{x}(0) \mathcal{P}_{yy}(t) \! + \! v_{y}(t) v_{y}(0) \mathcal{P}_{xx}(t) \right ] . \end{aligned}$$
(30)

Here, we have defined

$$ \mathcal{P}_{ij}(t) \equiv \langle \delta B_{i}(0) \delta B_{j}(t) \rangle \, , $$
(31)

and both the velocities and the magnetic fields are to be evaluated along unperturbed trajectories. Note that, since the Fokker-Planck coefficients only depend on the two-point function, we can restrict ourselves to the Gaussian part of the turbulent magnetic field.

2.4 Turbulence geometries and spectra

To make further progress, we need to specify the turbulence correlation tensor \(P_{ij}\). In the derivation of the Fokker-Planck equation we had to assume that turbulence is homogeneous and stationary, that is its statistical moments are invariant under translations in space and time (see assumption 5). In this case, the field can be represented very economically in Fourier space. To this end, we introduce the Fourier transform pair

$$\begin{aligned} \delta \tilde{B}_{j}(\boldsymbol{k}, t) &= (2 \pi )^{-3/2} \int _{-\infty }^{ \infty } \mathrm {d} ^{3} r \, \delta B_{j}(\boldsymbol{r}, t) \mathrm {e} ^{\imath \boldsymbol{k} \cdot \boldsymbol{r}} \, , \end{aligned}$$
(32)
$$\begin{aligned} \delta B_{j}(\boldsymbol{r}, t) &= (2 \pi )^{-3/2} \int _{-\infty }^{\infty } \mathrm {d} ^{3} k \, \delta \tilde{B}_{j}(\boldsymbol{k}, t) \mathrm {e} ^{- \imath \boldsymbol{k} \cdot \boldsymbol{r}} \, . \end{aligned}$$
(33)

Note that for the magnetic field to be real-valued, \(\delta B_{j}(\boldsymbol{r}) = \delta B^{*}_{j}(\boldsymbol{r})\), the Fourier components must be related to their complex conjugates by

$$ \delta \tilde{B}^{*}_{j}(\boldsymbol{k}) = \delta \tilde{B}_{j}(-\boldsymbol{k}) \, . $$
(34)

Homogeneity and stationarity now guarantee that the two-point functions \(\langle \delta B_{i}(\boldsymbol{r},t ) \delta B_{j}(\boldsymbol{r}', t') \rangle \) depend on the positions \(\boldsymbol{r}\) and \(\boldsymbol{r}'\) and times \(t\) and \(t'\) only through the differences \(\Delta \boldsymbol{r} \equiv (\boldsymbol{r} - \boldsymbol{r}')\) and \((t - t')\). It is then easy to see that the two-point function in Fourier space is diagonal,

$$\begin{aligned} & \langle \delta \tilde{B}_{i}(\boldsymbol{k}, t) \delta \tilde{B}^{*}_{j}(\boldsymbol{k}', t') \rangle \\ &= \! (2 \pi )^{-3} \!\!\! \int \!\! \mathrm {d} ^{3} r \, \mathrm {e} ^{ \imath \boldsymbol{k} \cdot \boldsymbol{r}} \!\!\! \int \!\! \mathrm {d} ^{3} r' \mathrm {e} ^{-\imath \boldsymbol{k}' \cdot \boldsymbol{r}'} \!\! \langle \delta B_{i}(\boldsymbol{r}, t) \delta B_{j}(\boldsymbol{r}'\!, t') \rangle \end{aligned}$$
(35)
$$\begin{aligned} &= \delta ^{(3)} (\boldsymbol{k} - \boldsymbol{k}') P_{ij}(\boldsymbol{k}, t-t') \, , \end{aligned}$$
(36)

where the turbulence correlation tensor \(P_{ij}(\boldsymbol{k}, \Delta t)\) is the Fourier transform of the coordinate space two-point function,

$$\begin{aligned} P_{ij}(\boldsymbol{k}, \Delta t) \equiv & (2 \pi )^{-3/2} \int _{-\infty}^{\infty} \mathrm {d} ^{3} (\Delta r) \, \mathrm {e} ^{\imath \boldsymbol{k} \cdot \Delta \boldsymbol{r}} \\ & \times \langle \delta B_{i}(\boldsymbol{r}, t) \delta B_{j}(\boldsymbol{r} - \Delta \boldsymbol{r}\!, t') \rangle \,. \end{aligned}$$
(37)

It contains all the (statistical) information on the magnetic turbulence that enters into the computation of the Fokker-Planck coefficients. This includes information on the turbulence geometry, for instance whether there is a preferred direction for the propagation of waves, information on the turbulence spectrum, that is the distribution of energy among different turbulent scales, as well as information on the time-dependence of the correlations. We will discuss a few parametrisations below.

Oftentimes, it is assumed that \(P_{ij}(\boldsymbol{k}, \Delta t)\) factorises into a magnetostatic correlation tensor \(P_{ij}(\boldsymbol{k}) \equiv P_{ij}(\boldsymbol{k}, 0)\) independent of time and a time-dependent dynamical correlation function \(\Gamma (\boldsymbol{k}, \Delta t)\),

$$ P_{ij}(\boldsymbol{k}, \Delta t) = P_{ij}(\boldsymbol{k}) \Gamma (\boldsymbol{k}, \Delta t) \, . $$
(38)

In the magnetostatic approximation, we ignore any time-dependence altogether, so \(\Gamma \equiv 1\).

While in reality \(P_{ij}\) may be arbitrarily complicated, three turbulence geometries have dominated much of the literature, both in analytical studies of transport coefficients and numerical test particle simulations. These three geometries are conceptually simple and particularly amenable to analytical computations of the components of the diffusion tensor and the other Fokker-Planck coefficients: 3D isotropic turbulence, slab turbulence and a composition of slab and 2D isotropic turbulence. In the following, we will give explicit formulas for the turbulence correlation tensor for these models in terms of a scalar power spectrum \(g(k)\), the spectral part of the turbulence tensors. Afterwards, we introduce two popular parametrisations for \(g(k)\) and conclude with an example for the computation of the pitch-angle diffusion coefficient.

2.4.1 3D isotropic turbulence

It is easy to show (Batchelor 1982) that for 3D isotropic turbulence the magnetostatic correlation tensor takes the form

$$ P^{\text{3D}}_{ij}(\boldsymbol{k}) = g^{\text{3D}}(k) \left ( \delta _{ij} - \frac{k_{i} k_{j}}{k^{2}} + \imath \sigma (k) \epsilon _{ijm} \frac{k_{m}}{k} \right ) \, , $$
(39)

with \(k = |\boldsymbol{k}|\). The \(k\)-dependent real functions \(g^{\text{3D}}(k)\) and \(\sigma (k)\) allow modelling of the overall spectrum and of a wavenumber-dependent helicity, respectively. Note that for linearly polarised waves \(\sigma (k) \equiv 0\). The normalisation of \(g^{\text{3D}}(k)\) is fixed by requiring

$$\begin{aligned} \delta B^{2} &\equiv \langle \boldsymbol{\delta B}^{2}(\boldsymbol{x}) \rangle \! = \!\! \int \mathrm {d} ^{3} k \left ( P^{\text{3D}}_{xx}(\boldsymbol{k}) + P^{\text{3D}}_{yy}( \boldsymbol{k}) + P^{\text{3D}}_{zz}(\boldsymbol{k}) \right ) \\ &= 2 \int \mathrm {d} ^{3} k \, g^{\text{3D}}(k) = 8 \pi \int _{0}^{\infty } \mathrm {d} k \, k^{2} g^{\text{3D}}(k) \, . \end{aligned}$$
(40)

2.4.2 Slab turbulence

In slab turbulence, it is assumed that all quantities are independent of the coordinates perpendicular to the background field (in our case: \(x\) and \(y\)) and that the turbulent field has no \(z\)-component. Consequently, the wave vectors satisfy \(\boldsymbol{k} \parallel \hat {\boldsymbol {z}} \), and if we further demand turbulence to be axisymmetric, the turbulence correlation tensor reads

$$ P^{\text{slab}}_{ij}(\boldsymbol{k}) = g^{\text{slab}}(k_{\parallel }) \frac{\delta (k_{\perp })}{k_{\perp }} \left ( \delta _{ij} + \imath \sigma (k_{\parallel }) \epsilon _{ijz} \right ) \, , $$
(41)

for \(i, j \in x, y\) and zero otherwise. In our case, \(k_{\parallel } = k_{z}\) and \(k_{\perp } = \sqrt{k_{x}^{2} + k_{y}^{2}}\). Again, \(\sigma (k_{\parallel })\) allows for wavenumber-dependent helicity, but vanishes for linear polarisation. The normalisation is then

$$\begin{aligned} \delta B^{2} &\equiv \int \mathrm {d} ^{3} k \left ( P^{\text{slab}}_{xx}( \boldsymbol{k}) + P^{\text{slab}}_{yy}(\boldsymbol{k}) \right ) \\ \end{aligned}$$
(42)
$$\begin{aligned} &= 4 \pi \int _{-\infty }^{\infty } \mathrm {d} k_{\parallel } \, g^{ \text{slab}}(k_{\parallel }) \, . \end{aligned}$$
(43)

While slab turbulence might seem rather restrictive a turbulence model, it is quite attractive due to its simplicity. In addition, it could be argued that it is of physical relevance in situations where the turbulence is self-generated by anisotropies in the distribution of CRs (Kulsrud and Pearce 1969; Skilling 1975): It has been shown (e.g. Tademaru 1969) that the modes with wavevectors along the background magnetic field grow fastest.

2.4.3 Composite (slab + 2D isotropic) turbulence

Motivated by observations of the turbulence in the solar wind (Matthaeus et al. 1990), the heliospheric community has adopted a composite model for the correlation tensor as a superposition of a slab component and a 2D isotropic component. The motivation for this composite turbulence model was the observation that CR mean-free paths were in conflict with the observed turbulent energy densities. In fact, the observed mean-free path was significantly larger than what was predicted for the measured turbulence level in a pure slab model. As 2D turbulence contributes to pitch-angle scattering (and therefore to the parallel mean-free path) only marginally, the measured level of turbulence could be reconciled with the mean-free path by moving part of the turbulent energy density from the slab to the 2D component. According to Bieber et al. (1994), an \(80 \, \%\) to \(20 \, \%\) split between 2D and slab turbulence, respectively, reconciles the available data sets.

For linearly polarised waves, we can write

$$ P^{\text{comp}}_{ij}(\boldsymbol{k}) = P^{\text{slab}}_{ij}(\boldsymbol{k}) + P^{ \text{2D}}_{ij}(\boldsymbol{k}) \, , $$
(44)

with \(P^{\text{slab}}_{ij}(\boldsymbol{k})\) as in Eq. (41) and

$$ P^{\text{2D}}_{ij}(\boldsymbol{k}) = g^{\text{2D}}(k_{\perp }) \frac{\delta (k_{\parallel })}{k_{\perp }} \left ( \delta _{ij} - \frac{k_{i} k_{j}}{k^{2}} \right ) \, , $$
(45)

for \(i, j \in x, y\) and zero otherwise. This turbulent 2D field only depends on the \(x\)- and \(y\)-coordinate, and has no \(z\)-component. The normalisation condition for the 2D component is

$$\begin{aligned} \delta B^{2}_{\text{2D}} &\equiv \langle \boldsymbol{\delta B}^{2}(\boldsymbol{x}) \rangle \! = \!\! \int \mathrm {d} ^{3} k \left ( P^{\text{2D}}_{xx}(\boldsymbol{k}) + P^{ \text{2D}}_{yy}(\boldsymbol{k}) \right ) \\ &= 2 \pi \int \mathrm {d} k_{\perp } \, g^{\text{2D}}(k_{\perp }) \, . \end{aligned}$$
(46)

2.4.4 Turbulence spectra

Having reviewed three simple turbulence geometries, we need to specify the spectral shapes \(g(k)\) in order to compute transport coefficients. In cascade models of turbulence (Kolmogorov 1941; Iroshnikov 1963; Kraichnan 1965), energy is injected on the largest scales in the so-called energy range. Non-linear interactions transfer energy to smaller scales over the so-called inertial range. At very small scales, the turbulent energy is dissipated in the so-called dissipation range. The scale separating the energy and inertial ranges is called the outer scale of turbulence and the scale separating the inertial and the dissipation range is called the dissipation scale. For an introduction to turbulence theory, see e.g. Frisch (1995). Both turbulence theory and observations point at the existence of power law spectra in the inertial range. In fact, power law spectra have been observed in interplanetary and interstellar space (Armstrong et al. 1995). (For a review on interstellar turbulence, see Elmegreen and Scalo 2004).

Both in numerical simulations and in analytical work, most authors have confined themselves to one of two spectra. The first one is a simple power law with spectral index \(q\) and low wavenumber cut-off \(k_{0}\), corresponding to the outer scale \((2 \pi / k_{0})\),

$$ g_{\text{PL}}(k) = \left \{ \textstyle\begin{array}{l@{\quad }l} g_{0} (k/k_{0})^{-q} & \text{for } k \geq k_{0} \, , \\ 0 & \text{otherwise.} \end{array}\displaystyle \right . $$
(47)

The alternative is a broken power law with a flat spectrum below the wavenumber \(k_{0}\) and a power law slope \(q\) above,

$$ g_{\text{BPL}}(k) = g_{0} \left ( 1 + \left ( \frac{k}{k_{0}} \right )^{1/s} \right )^{-q s} \, . $$
(48)

Here, \(s\) is parametrising the softness of the break and \(s \to 0\) corresponds to a sharp break. It is assumed that the broken power law form can potentially also capture turbulence in the energy range, that is for \(k < k_{0}\).
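
The two parametrisations of Eqs. (47) and (48) are easily coded up. The Python sketch below (function names are hypothetical, all quantities in arbitrary units) can be used to check that the broken power law is flat for \(k \ll k_{0}\), follows \((k/k_{0})^{-q}\) for \(k \gg k_{0}\), and that a small \(s\) sharpens the break.

```python
def g_pl(k, g0, k0, q):
    """Simple power law, Eq. (47): zero below the cut-off k0."""
    return g0 * (k / k0) ** (-q) if k >= k0 else 0.0

def g_bpl(k, g0, k0, q, s):
    """Broken power law, Eq. (48): flat for k << k0, slope -q for k >> k0."""
    return g0 * (1.0 + (k / k0) ** (1.0 / s)) ** (-q * s)

q = 5.0 / 3.0  # Kolmogorov-type index, chosen for illustration
for k in (1e-3, 1.0, 2.0, 1e3):
    # Columns: k, power law, soft break (s = 1/2), sharp break (s = 0.05)
    print(k, g_pl(k, 1.0, 1.0, q), g_bpl(k, 1.0, 1.0, q, 0.5), g_bpl(k, 1.0, 1.0, q, 0.05))
```

Note that for very small \(s\) and large \(k/k_{0}\) the intermediate power \((k/k_{0})^{1/s}\) can overflow in floating-point arithmetic, so this naive form is only meant for moderate parameter values.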

2.4.5 Slab turbulence with broken power law spectrum

By way of example, we report the result for the pitch-angle diffusion coefficient in slab turbulence with a broken power law spectrum (Shalchi 2009),

$$ g^{\text{slab}}(k_{\parallel }) = \frac{C \left (q, \frac{1}{2} \right )}{2 \pi k_{0}} \delta B^{2} \left ( 1 + \left ( \frac{k_{\parallel }}{k_{0}} \right )^{2} \right )^{-q/2} \, . $$
(49)

The function \(C(q, s)\) is fixed by the normalisation condition, see Eq. (43),

$$\begin{aligned} \frac{1}{C(q, s)} & \equiv \frac{4}{k_{0}} \int _{0}^{\infty } \mathrm {d} k \, \left ( 1 + \left ( \frac{k}{k_{0}} \right )^{1/s} \right )^{-q s} \\ &= 4 \, \frac{\Gamma (s(q-1)) \Gamma (1+s)}{\Gamma (q \, s)} \, , \end{aligned}$$
(50)

where \(\Gamma (\cdot )\) denotes the gamma function. We have assumed that \(q>1\) in order for the \(k\)-integral in Eq. (50) to converge. For \(q=1\), instead, we need to assume a cut-off, that is \(g^{\text{slab}} = 0\) for \(k_{\parallel} > k_{\text{max}}\).
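
The normalisation can be verified numerically. With the substitution \(x = k/k_{0}\) the factor \(k_{0}\) cancels, so that \(1/C(q,s)\) is a pure number. The Python sketch below (hypothetical function names; the change of variables \(x = \mathrm{e}^{y}\) and the trapezoidal rule are choices made here for convenience) compares the integral with the gamma-function expression.

```python
import math

def inv_c_analytic(q, s):
    """1/C(q, s) from the gamma-function expression; requires q > 1."""
    return 4.0 * math.gamma(s * (q - 1.0)) * math.gamma(1.0 + s) / math.gamma(q * s)

def inv_c_numeric(q, s, y_max=40.0, n=4000):
    """4 * int_0^inf (1 + x^(1/s))^(-q s) dx with x = k/k0 = exp(y).

    Trapezoidal rule on y in [-y_max, y_max]; exp(y/s) overflows for s below
    roughly 0.06 with these defaults.
    """
    h = 2.0 * y_max / n
    total = 0.0
    for i in range(n + 1):
        y = -y_max + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * (1.0 + math.exp(y / s)) ** (-q * s) * math.exp(y) * h
    return 4.0 * total

q, s = 5.0 / 3.0, 0.5  # Kolmogorov-type index with the break shape of Eq. (49)
print(inv_c_numeric(q, s), inv_c_analytic(q, s))
```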

Substituting Eq. (49) into Eqs. (41) and (30), one encounters the resonance function

$$ R^{\text{slab}} = \pi \delta (k_{\parallel } \mu v \pm \Omega ) \, , $$
(51)

see Schlickeiser (2002) for details. Eventually, this simplifies to

$$ D_{\mu \mu } = \frac{\pi }{2} C \! \left ( \! q, \frac{1}{2} \! \right ) v k_{0} \frac{\delta B^{2}}{B_{z}^{2}} \frac{(1 - \mu ^{2}) \mu ^{q-1} (r_{\text{g}} k_{0})^{q-2}}{(1 + \mu ^{2} (r_{\text{g}} k_{0})^{2})^{q/2}} . $$
(52)

Here, \(r_{\text{g}}\) denotes again the particle’s gyroradius.

For relativistic particles \(r_{\text{g}} \propto \mathcal{R}\) (where \(\mathcal{R}\) again denotes rigidity) and if \(\mathcal{R}\) is small enough such that \(\mu ^{2} (r_{\text{g}} k_{0})^{2} \ll 1\), we observe that the rigidity-dependence of \(D_{\mu \mu }\) is of power law form, \(D_{\mu \mu } \propto \mathcal{R}^{q-2}\), reflecting the power law nature of the underlying turbulence spectrum. For Kolmogorov and Kraichnan type values, \(q = 5/3\) and \(3/2\), the rigidity-dependence of the pitch-angle diffusion coefficient is \(D_{\mu \mu } \propto \mathcal{R}^{-1/3}\) and \(\mathcal{R}^{-1/2}\) and the spatial diffusion coefficient \(\kappa _{\parallel } \sim 1 / D_{\mu \mu } \propto \mathcal{R}^{1/3}\) and \(\mathcal{R}^{1/2}\), respectively.
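
The quoted scalings can be checked with a few lines of Python. The sketch below keeps only the \(\mu \)- and \(r_{\text{g}}\)-dependent part of \(D_{\mu \mu }\), that is it drops all constant prefactors, and writes the denominator as \((1 + \mu ^{2} (r_{\text{g}} k_{0})^{2})^{q/2}\), which is what follows from evaluating Eq. (49) at the resonant wavenumber \(k_{\parallel } = 1 / (|\mu | r_{\text{g}})\). Function names and parameter values are illustrative assumptions.

```python
import math

def d_mumu_shape(mu, x, q):
    """D_mumu up to a constant prefactor; x = r_g * k0 is dimensionless."""
    return ((1.0 - mu**2) * abs(mu) ** (q - 1.0) * x ** (q - 2.0)
            / (1.0 + (mu * x) ** 2) ** (q / 2.0))

def log_slope(q, mu=0.5, x=1e-4, eps=1e-3):
    """d ln D_mumu / d ln r_g at fixed mu, by a centred finite difference."""
    up = d_mumu_shape(mu, x * (1.0 + eps), q)
    down = d_mumu_shape(mu, x * (1.0 - eps), q)
    return (math.log(up) - math.log(down)) / (math.log(1.0 + eps) - math.log(1.0 - eps))

for q in (5.0 / 3.0, 3.0 / 2.0):
    # Slope of D_mumu in rigidity next to the expected value q - 2
    print(q, log_slope(q), q - 2.0)
```

At \(r_{\text{g}} k_{0} \gtrsim 1\) the denominator becomes important and the local slope departs from \(q - 2\); this can be explored by increasing `x`.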

2.5 Field-line random walk

The computation of the pitch-angle diffusion coefficient in Eqs. (28), (30) and (52) is based on an evaluation of the turbulent part of the Lorentz force along trajectories around the homogeneous background field. As long as perturbations are small, this gives the dominant contribution to the parallel diffusion coefficient, Eq. (25).

For perpendicular transport, however, there is another important contribution due to the fact that the field line is not perfectly homogeneous. Instead, the large-scale magnetic field evaluated for a particle along a field line changes direction with distance along this field line. Under certain conditions, this movement can be shown to be diffusive, see below. If the movement of the particle due to this effect is included in the computation of the mean-square displacements (or equivalently through the Taylor-Green-Kubo approach), this gives the so-called field-line random walk (FLRW) contribution to perpendicular transport. The contribution without this is oftentimes called the microscopic contribution.

For slab turbulence, the microscopic diffusion coefficient vanishes (the transport is in fact sub-diffusive), hence FLRW gives the only contribution. For other turbulence geometries, FLRW can also contribute, but might not be dominating.

Let us again assume the regular background field \(\langle \boldsymbol{B} \rangle = B_{z} \hat{z}\) to be dominating over the perturbations \(\boldsymbol{\delta B}\). The equation determining the field line \(\{ x(z), y(z) \}\) is

$$ \frac{ \mathrm {d} x}{ \mathrm {d} z} = \frac{\delta B_{x}}{B_{z}} \, , $$
(53)

and similarly for \(y(z)\). This can formally be integrated to obtain the mean square displacement in the perpendicular directions, e.g.

$$ \langle (\Delta x)^{2} \rangle \! = \! \frac{1}{B_{z}^{2}} \int _{0}^{z} \!\! \! \mathrm {d} z' \!\! \int _{0}^{z} \!\!\! \mathrm {d} z'' \langle \delta B_{x}(\boldsymbol{r}(z')) \delta B_{x}(\boldsymbol{r}(z'')) \rangle . $$
(54)

In slab turbulence, the integrand only depends on \(z\) and it is easy to show that the perpendicular mean-square displacement \(\langle (\Delta r_{\perp})^{2} \rangle \) is ballistic at small \(z\) and diffusive for large \(z\), e.g.

$$ \langle (\Delta x)^{2} \rangle = \left \{ \textstyle\begin{array}{l@{\quad } l} z^{2} \left ( \delta B_{x} / B_{z} \right )^{2} & \text{for } z \to 0 \, , \\ 2 \kappa _{\text{FLRW}} |z| & \text{for } z \to \infty \, , \end{array}\displaystyle \right . $$
(55)

with the FLRW diffusion coefficient

$$ \kappa _{\text{FLRW}} = \frac{2 \pi ^{2}}{B_{z}^{2}} g^{\text{slab}}(0) \, . $$
(56)

In other turbulence geometries, the integrand in Eq. (54) also depends on \(x\) and \(y\), such that an explicit solution is not possible without further assumptions. See Shalchi (2009) for a more detailed discussion.
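
The two limits of Eq. (55) can be illustrated numerically. The Python sketch below assumes, purely for illustration, an exponential correlation along the field, \(\langle \delta B_{x}(0) \delta B_{x}(\zeta ) \rangle = \delta B_{x}^{2} \exp (-|\zeta | / \ell )\), with a hypothetical correlation length \(\ell \). For a correlation that depends only on \(z' - z''\), the double integral in Eq. (54) reduces to \(2 \int _{0}^{z} \mathrm {d} \zeta \, (z - \zeta ) \langle \delta B_{x}(0) \delta B_{x}(\zeta ) \rangle \), and for this model the diffusive limit gives \(\kappa _{\text{FLRW}} = \ell \, \delta B_{x}^{2} / B_{z}^{2}\).

```python
import math

def msd_field_line(z, db_over_b, ell, n=20000):
    """<(Delta x)^2>(z) from Eq. (54) for the assumed exponential correlation."""
    h = z / n
    total = 0.0
    for i in range(n):
        zeta = (i + 0.5) * h  # midpoint rule on [0, z]
        total += (z - zeta) * math.exp(-zeta / ell) * h
    return 2.0 * db_over_b**2 * total

db_over_b, ell = 0.1, 1.0          # delta B_x / B_z and correlation length (illustrative)
kappa_flrw = db_over_b**2 * ell    # diffusive limit for this correlation model

z_small, z_large = 1e-3 * ell, 1e3 * ell
print(msd_field_line(z_small, db_over_b, ell) / (z_small * db_over_b) ** 2)   # close to 1: ballistic
print(msd_field_line(z_large, db_over_b, ell) / (2.0 * kappa_flrw * z_large)) # close to 1: diffusive
```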

If particles are assumed to diffuse along field lines, \(\langle (\Delta z)^{2} \rangle \propto \Delta t\), FLRW leads to subdiffusive perpendicular transport, \(\langle (\Delta r_{\perp})^{2} \rangle \propto \sqrt{\Delta t}\), a phenomenon known as compound (sub)diffusion (Jokipii 1966; Matthaeus et al. 1995; Ragot 2006; Ruffolo et al. 2006). Theoretical predictions (Kóta and Jokipii 2000) have been largely confirmed by numerical simulations (Giacalone and Jokipii 1999; Mace et al. 2000; Qin et al. 2002b). Compound subdiffusion has been applied to a variety of environments like laboratory plasmas (Rechester and Rosenbluth 1978; Isichenko 1991), the heliosphere (Jokipii and Parker 1969; Zimbardo et al. 2006), Galactic transport (Getmantsev 1963; Lingenfelter et al. 1971; Chuvilgin and Ptuskin 1993), near-source transport (Nava and Gabici 2013) as well as shock acceleration (Achterberg and Ball 1994; Duffy et al. 1995; Kirk et al. 1996).
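
A heuristic estimate shows where the \(\sqrt{\Delta t}\) scaling comes from. Assuming, as an illustration, that the parallel displacement \(\Delta z\) is Gaussian with variance \(2 \kappa _{\parallel } \Delta t\) and that \(|\Delta z|\) is large enough for the diffusive limit of Eq. (55) to apply, averaging Eq. (55) over \(\Delta z\) gives

$$ \langle (\Delta x)^{2} \rangle \approx 2 \kappa _{\text{FLRW}} \langle |\Delta z| \rangle = 4 \kappa _{\text{FLRW}} \sqrt{\frac{\kappa _{\parallel } \Delta t}{\pi }} \, , $$

where \(\langle |\Delta z| \rangle = \sqrt{4 \kappa _{\parallel } \Delta t / \pi }\) is the mean absolute value of a Gaussian variable with the stated variance.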

2.6 Short-comings of QLT

Despite its popularity, QLT exhibits a number of issues which we will briefly review in the following.

The most well-known pathology of magnetostatic QLT is its inability to scatter particles through \(90^{\circ }\). While present in a number of turbulence geometries, it is most easily illustrated in slab turbulence where the dependence of the pitch-angle diffusion coefficient \(D_{\mu \mu }\) on the spectrum \(g^{\text{slab}}(k)\) becomes very simple. In fact, inspecting Eq. (52) we see that \(D_{\mu \mu } \to 0\) for \(\mu \to 0\).

The root cause for the \(90^{\circ }\) problem is the narrow resonance condition in magnetostatic QLT, \(k_{\parallel } \mu \, r_{\text{g}} = \pm 1\), see Eq. (51). Particles at finite \(\mu \) are in resonance with waves of finite parallel wavenumber, \(k_{\parallel } = \pm 1 / (\mu r_{\text{g}})\). For \(\mu \) approaching 0, however, the resonant parallel wavenumber grows without bound. Since the turbulence spectra are falling power laws, there is only little energy at small scales and the pitch-angle scattering rate vanishes. In practice, there is of course no energy at all at scales below the dissipation scale.

We note that the vanishing of \(D_{\mu \mu}\) does not necessarily imply that the parallel diffusion coefficient \(\kappa _{\parallel}\) diverges. In fact, for slab turbulence, the \(\mu \)-integral in Eq. (25) remains finite as long as \(q<2\). Whether the QLT predictions of \(D_{\mu \mu}\) near \(\mu =0\) and of \(\kappa _{\parallel}\) are accurate is a different question altogether; test particle simulations can provide answers. We also note that for \(q=1\) it might appear from Eq. (52) that there is no \(90^{\circ}\) problem. However, Eq. (52) was derived under the assumption of \(q>1\). In fact, for \(q=1\), the necessary cut-off in \(g^{\text{slab}}\) at \(k_{\text{max}}\) leads to a finite resonance gap for \(|\mu | < \Omega / (v k_{\text{max}})\). Finally, for isotropic turbulence, \(D_{\mu \mu}\) also vanishes at \(\mu =0\) and this time \(\kappa _{\parallel}\) does diverge, even for \(q<2\) (Tautz et al. 2006a). This is in stark contrast with test particle simulations which have shown that parallel transport is perfectly diffusive in isotropic turbulence, meaning that \(\kappa _{\parallel}\) attains a finite value.
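
The condition \(q<2\) for slab turbulence can be made explicit. In the small-rigidity limit, Eq. (52) gives \(D_{\mu \mu } \propto (1 - \mu ^{2}) |\mu |^{q-1}\), so that the \(\mu \)-integral of Eq. (25) becomes, up to constant factors,

$$ \int _{-1}^{1} \mathrm {d} \mu \, \frac{(1 - \mu ^{2})^{2}}{(1 - \mu ^{2}) |\mu |^{q-1}} = 2 \int _{0}^{1} \mathrm {d} \mu \, (1 - \mu ^{2}) \mu ^{1-q} = \frac{4}{(2-q)(4-q)} \, , $$

which is finite for \(q<2\) and diverges at the lower limit \(\mu \to 0\) for \(q \geq 2\).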

Another comment is in order: In the above discussion, we have restricted ourselves to the simplest turbulence model, in particular slab turbulence and linear polarisation, and considered the limit \(v_{\text{A}} /v \to 0\) where \(v\) is again the particle speed. If we had allowed for oblique propagation of waves, we would have had to deal with compressive modes, like the magnetosonic mode. Due to its finite \(\delta B_{z}\) component, the magnetosonic wave allows for transit-time damping, that is another resonant interaction besides gyro-resonance. Note, however, that this does not cure the \(90^{\circ}\) problem. It can be shown that the gyro-resonant contribution to \(D_{\mu \mu}(\mu =0)\) is \(\propto ( v_{\text{A}} /v)^{q}\) while the contribution from transit-time damping vanishes at \(\mu =0\). The parallel diffusion coefficient in turn is determined solely by the gyro-resonant contribution and scales like \(( v_{\text{A}} /v) ^{1-q}\), thus again diverging in the limit of \(v_{\text{A}} /v \to 0\).

Nature has of course no difficulty in scattering particles through \(90^{\circ }\), as evidenced by the isotropy of Galactic CRs. Therefore, the vanishing of \(D_{\mu \mu }\) at \(\mu = 0\) must be considered a theoretical issue. It was realised early on (Voelk 1975; see Tautz et al. (2008) for other references) that the origin of the \(90^{\circ }\) problem is actually the delta-like resonance function of QLT in the magnetostatic approximation and it was claimed that plasma wave effects or dynamical turbulence would in fact cure this issue. Other authors (Tautz et al. 2006b) have however pointed out that non-linear effects are likely more important. Non-linear theories, in particular, exhibit finite resonance widths, thus curing the \(90^{\circ }\) problem. In addition, non-resonant scattering can also play a role, e.g. Ragot (1999). A certain degree of non-resonant scattering has been inferred from PIC simulations of whistler wave turbulence (Camporeale 2015), for instance.

Another important issue with QLT is its difficulty in describing perpendicular transport for slab turbulence. Whereas simulations find subdiffusive behaviour, \(\langle (\Delta r_{\perp})^{2} \rangle \propto \sqrt{\Delta t}\) (Qin et al. 2002b), the answer from analytical models is not quite as clear and depends on what kind of assumptions enter the definition of the perpendicular displacements and which equations of motion are assumed. If we define the perpendicular diffusion coefficients as found in the derivation of the Fokker-Planck equation (15), we assume the equations of motion as in Eq. (3), meaning that the turbulent field is evaluated along the unperturbed trajectories in the homogeneous background field \(\langle \boldsymbol{B} \rangle \), see Fig. 1a. In this case, \(\kappa _{\perp }\) vanishes (Schlickeiser 2002), again due to the narrow resonance condition. This assumption is of course strictly only true for small enough turbulent magnetic fields. If we instead make the assumption that particles follow field lines, see Fig. 1b, diffusive behaviour is found, \(\langle (\Delta r_{\perp})^{2} \rangle \propto \Delta t\). However, what has been ignored here is the diffusive nature of transport along the field line. If this is taken into account, see Fig. 1c, subdiffusive behaviour is found again. (This is the compound subdiffusion of Sect. 2.5 above.) Numerical simulations indeed confirm the subdiffusive behaviour. Whether the ambiguity of evaluating the perpendicular transport is an issue with QLT or of the additional assumptions made when evaluating \(\langle (\Delta r_{\perp})^{2} \rangle \) is a matter of debate. Note that for non-slab geometries, diffusive behaviour is recovered.

Fig. 1 Illustration of the different choices for how to evaluate perpendicular velocities when considering the perpendicular transport: (a) standard quasi-linear theory; (b) quasi-linear theory with field line random walk (FLRW); (c) FLRW and diffusion along the field line. See text for details

Finally, it has been noted (Shalchi 2009) that in other turbulence geometries there are also deviations between the QLT predictions and numerical results. Noteworthy are the deviations for composite geometry (Shalchi et al. 2004b).

2.7 Non-linear extensions

So far, we have only considered magnetostatic turbulence which for QLT implies the \(\delta \)-like resonance function. Both dynamical turbulence and plasma wave damping lead to broadening of the resonance function. This has the potential of curing some of the deficiencies of QLT. (See Tautz et al. (2006b) for a discussion of the failure of QLT in undamped plasma wave models.)

Non-linear theories offer another way to broaden the resonance function. These replace the unperturbed orbits of QLT with perturbed orbits, which are more realistic at finite turbulence levels. Below, we review a number of non-linear theories and cite their respective resonance functions.

2.7.1 BAM model (Bieber and Matthaeus 1997)

Bieber and Matthaeus (1997) start from the velocity autocorrelation functions, \(V_{ij}(t) \equiv \langle v_{i}(0) v_{j}(t) \rangle \) that are required for computing diffusion coefficients with the TGK formalism,

$$ \kappa _{ij} = \int _{0}^{\infty } \mathrm {d} t \, \langle v_{i}(0) v_{j}(t) \rangle \, . $$
(57)

In QLT, particle trajectories are perfect helices and the velocities of a particle along its trajectory stay correlated forever. This leads to simple, oscillatory correlations, \(V_{xx}(t) = V_{yy}(t) \propto \cos \Omega t\) and \(-V_{xy}(t) = V_{yx}(t) \propto \sin \Omega t\). In reality, however, velocities will not stay correlated indefinitely as particles will scatter in pitch-angle, and therefore these correlations should decay with time. In the BAM model, the decay is assumed exponential and thus the velocity correlation functions read

$$\begin{aligned} V_{xx}(t) = V_{yy}(t) &= \frac{v^{2}}{3} \cos \Omega t \exp [- \omega _{\perp } t ] \, , \end{aligned}$$
(58)
$$\begin{aligned} -V_{xy}(t) = V_{yx}(t) &= \frac{v^{2}}{3} \sin \Omega t \exp [- \omega _{\perp } t ] \, , \end{aligned}$$
(59)
$$\begin{aligned} V_{zz}(t) &= \frac{v^{2}}{3} \exp [- \omega _{\parallel } t ] \, . \end{aligned}$$
(60)

Substituting those into Eq. (57), one finds for the diffusion coefficients,

$$\begin{aligned} \kappa _{\perp } = \kappa _{xx} = \kappa _{yy} &= \frac{v^{2}}{3} \frac{\omega _{\perp }}{\omega _{\perp }^{2} + \Omega ^{2}} \, , \end{aligned}$$
(61)
$$\begin{aligned} \kappa _{\text{A}} = -\kappa _{xy} = \kappa _{yx} &= \frac{v^{2}}{3} \frac{\Omega }{\omega _{\perp }^{2} + \Omega ^{2}} \, , \end{aligned}$$
(62)
$$\begin{aligned} \kappa _{\parallel } = \kappa _{zz} &= \frac{v^{2}}{3} \frac{1}{ \omega _{\parallel } } \, . \end{aligned}$$
(63)

This is of similar form as the classical scattering result (Gleeson 1969), see Sect. 2.1.
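
The closed forms of Eqs. (61) to (63) can be cross-checked by integrating the correlation functions numerically. The Python sketch below uses a composite Simpson rule on a finite time interval; the numerical values and the function name are illustrative assumptions, and the same decorrelation rate \(\omega _{\perp }\) is used in \(V_{xx}\) and \(V_{yx}\), as required to reproduce Eq. (62).

```python
import math

def tgk_integral(corr, t_max, n=20000):
    """Eq. (57): integrate a velocity correlation from 0 to t_max (Simpson rule, n even)."""
    h = t_max / n
    total = corr(0.0) + corr(t_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * corr(i * h)
    return total * h / 3.0

v, gyro = 1.0, 2.0          # particle speed and gyrofrequency Omega (arbitrary units)
w_perp, w_par = 0.5, 0.3    # decorrelation rates (illustrative values)
t_max = 100.0               # long enough for the exponentials to have decayed

kappa_perp = tgk_integral(lambda t: v**2 / 3 * math.cos(gyro * t) * math.exp(-w_perp * t), t_max)
kappa_a = tgk_integral(lambda t: v**2 / 3 * math.sin(gyro * t) * math.exp(-w_perp * t), t_max)
kappa_par = tgk_integral(lambda t: v**2 / 3 * math.exp(-w_par * t), t_max)

print(kappa_perp, v**2 / 3 * w_perp / (w_perp**2 + gyro**2))  # Eq. (61)
print(kappa_a, v**2 / 3 * gyro / (w_perp**2 + gyro**2))       # Eq. (62)
print(kappa_par, v**2 / (3 * w_par))                          # Eq. (63)
```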

In order to fix the perpendicular decorrelation rate, Bieber and Matthaeus (1997) consider FLRW and postulate that the distance \(z_{c}\) over which the field lines decorrelate is \(z_{c} = r_{\text{g}}^{2} / \kappa _{\text{FLRW}}\) and thus \(\omega _{\perp } = v / z_{c} = v \kappa _{\text{FLRW}} / r_{\text{g}}^{2}\).

For a given turbulence geometry and spectrum, both \(\kappa _{\parallel }\) and \(\kappa _{\text{FLRW}}\) can be computed and the BAM model then allows determining \(\kappa _{\perp }\) and \(\kappa _{\text{A}}\). In slab turbulence, however, the BAM model predicts diffusive behaviour in the perpendicular direction which is at variance with what is seen in simulations. Furthermore, in composite turbulence (slab+2D) the BAM model cannot deal with the superdiffusive behaviour of FLRW seen in simulations (Shalchi 2009). We thus conclude that the BAM model does not agree with simulation results, at least for two of the most important turbulence geometries.

2.7.2 Non-linear guiding centre (NLGC) theory

Non-linear guiding centre (NLGC) theory (Matthaeus et al. 2003) improves upon the velocity correlation functions of the BAM model insofar as the perpendicular velocities (\(i \in x, y\)) are assumed to fulfil

$$ v_{i} = a v_{z} \frac{\delta B_{i}}{B_{z}} \, , $$
(64)

where \(a\) is a free parameter that needs to be determined by fitting to simulations. This is inspired by the requirement for particle guiding centres to stay on field lines. In fact, for \(a=1\), Eq. (64) reduces to the field line equation (53).

The perpendicular diffusion coefficient is then evaluated with the Taylor-Green-Kubo formula (Taylor 1922; Green 1951; Kubo 1957). This gives four-point correlation functions \(\langle v_{z}(0) v_{z}(t) \delta B_{x}(0) \delta B_{x}(t) \rangle \) with two factors of magnetic field strength and two factors of parallel velocity. In NLGC theory this is assumed to factorise into two two-point functions. The (parallel) velocity part has a simple exponential form, if pitch-angle diffusion is isotropic, i.e. \(D_{\mu \mu } \propto (1 - \mu ^{2})\). The Fourier transform of the two-point correlation for the magnetic field is further assumed to factorise into the power spectrum \(P_{xx}\) and the so-called characteristic function, \(\langle \exp [\imath \boldsymbol{k} \cdot \boldsymbol{\Delta x} ] \rangle \). If the particle separations \(\boldsymbol{\Delta x}\) are assumed normal-distributed and diffusive, e.g. \(\langle (\Delta x)^{2} \rangle = 2 \kappa _{\perp } t\), the characteristic function takes a simple Gaussian form and the perpendicular diffusion coefficient reads,

$$\begin{aligned} \kappa _{\perp } &= \frac{a^{2}}{B_{z}^{2}} \frac{v^{2}}{3} \int \mathrm {d} ^{3} k \int _{0}^{\infty } \mathrm {d} t \, P_{\perp }(\boldsymbol{k}, t) \\ & \times \exp [ - \omega _{\parallel } t - \kappa _{\perp } k_{\perp }^{2} t - \kappa _{\parallel } k_{\parallel }^{2} t ] \, . \end{aligned}$$
(65)

With a power spectrum of the form \(P_{\perp }(\boldsymbol{k}, t) = P_{\perp }(\boldsymbol{k}) \Gamma (\boldsymbol{k}, t)\) and a dynamical correlation function \(\Gamma (\boldsymbol{k}, t) = \exp [ - \gamma (\boldsymbol{k}) t ]\) this simplifies to

$$ \kappa _{\perp } = \frac{a^{2}}{B_{z}^{2}} \frac{v^{2}}{3} \int \mathrm {d} ^{3} k \frac{P_{\perp }(\boldsymbol{k})}{ \omega _{\parallel } + \kappa _{\perp } k_{\perp }^{2} + \kappa _{\parallel } k_{\parallel }^{2} + \gamma (\boldsymbol{k}) } \, . $$
(66)

Note how the sought-for perpendicular diffusion coefficient appears on both sides of the equation. Oftentimes, \(\kappa _{\perp }\) is therefore computed iteratively.
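
The iterative solution can be sketched with a deliberately simple toy case. The Python example below assumes magnetostatic (\(\gamma = 0\)), purely 2D turbulence (\(k_{\parallel } = 0\)) with all power concentrated at a single perpendicular wavenumber \(k_{c}\) and with \(P_{\perp }\) normalised such that \(\int \mathrm {d} ^{3} k \, P_{\perp } = \delta B_{x}^{2}\). None of these assumptions is taken from the NLGC literature; they merely reduce Eq. (66) to \(\kappa _{\perp } = A / (\omega _{\parallel } + \kappa _{\perp } k_{c}^{2})\) with \(A = a^{2} v^{2} \delta B_{x}^{2} / (3 B_{z}^{2})\), for which a closed-form root is available as a check. All names are hypothetical.

```python
import math

def nlgc_kappa_perp(amp, omega_par, k_c, tol=1e-12, max_iter=10000):
    """Fixed-point iteration kappa -> amp / (omega_par + kappa * k_c^2)."""
    kappa = amp / omega_par  # starting guess: neglect the kappa_perp term
    for _ in range(max_iter):
        new = amp / (omega_par + kappa * k_c**2)
        if abs(new - kappa) <= tol * abs(new):
            return new
        kappa = new
    raise RuntimeError("fixed-point iteration did not converge")

def closed_form(amp, omega_par, k_c):
    """Positive root of k_c^2 kappa^2 + omega_par kappa - amp = 0."""
    return (-omega_par + math.sqrt(omega_par**2 + 4.0 * amp * k_c**2)) / (2.0 * k_c**2)

amp, omega_par, k_c = 0.01, 0.3, 1.0  # illustrative values, arbitrary units
print(nlgc_kappa_perp(amp, omega_par, k_c), closed_form(amp, omega_par, k_c))
```

For a realistic spectrum the right-hand side becomes a \(k\)-integral that has to be evaluated numerically at every iteration, but the structure of the loop stays the same.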

For slab turbulence and in the magnetostatic case (\(\gamma =0\)), the integral in Eq. (66) can be computed analytically (Shalchi et al. 2004a; Zank et al. 2004). Comparing the parallel mean-free path \(\lambda _{\parallel } = 3 \kappa _{\parallel } / v\) to the correlation length \(\ell _{\text{c}}\), two limiting cases are noteworthy: For \(\lambda _{\parallel } \ll \ell _{\text{c}}\) and for \(\lambda _{\parallel } \gg \ell _{\text{c}}\), the results for \(\lambda _{\perp }\) from QLT and from the nonlinear closure approximation of Owens (1974) are recovered respectively. Note, however, that even though no assumption is made about the transport in the perpendicular directions (since \(k_{\perp }= 0\) in slab turbulence) perpendicular transport turns out to be diffusive, again at variance with numerical test particle simulations (see Sect. 4.1). For a composite slab+2D model, however, the NLGC theory agrees well with simulations if \(a = \sqrt{1/3}\).

2.7.3 Weakly non-linear theory

In weakly non-linear theory (WNLT, Shalchi et al. 2004b), the first two steps of NLGC theory are followed: (1) the factorisation of the fourth-order correlation function of two velocities and two magnetic field factors into two separate second-order correlation functions for velocities and magnetic field strength; (2) the decomposition of the field strength correlation function into the magnetic power spectrum and a characteristic function. The crucial difference with respect to the BAM theory is the form of the velocity correlations. Instead of Eqs. (58) to (60), the QLT velocity correlations are kept for the perpendicular motions and only the parallel velocities are assumed to decorrelate at a rate \(\omega \),

$$\begin{aligned} V_{xx}(t) = V_{yy}(t) &= v^{2} (1-\mu ^{2}) \cos \Omega t \, , \end{aligned}$$
(67)
$$\begin{aligned} -V_{xy}(t) = V_{yx}(t) &= v^{2} (1-\mu ^{2}) \sin \Omega t \, , \end{aligned}$$
(68)
$$\begin{aligned} V_{zz}(t) &= v^{2} \mu ^{2} \exp [- \omega t ] \, , \end{aligned}$$
(69)

where \(\omega \) is identified with the pitch-angle scattering frequency, \(\omega = 2 D_{\mu \mu } / (1 - \mu ^{2})\). For the characteristic function, a Gaussian distribution is assumed in the perpendicular direction whereas for the parallel motion, any possible diffusive contribution is ignored altogether.

Comparing the resulting expressions with those from QLT, it appears that only additional exponential factors with a linear time-dependence in the exponent have been introduced. When performing the time-integration, these lead to resonance broadening which can be ascribed to pitch-angle scattering, perpendicular motion and the deviation of the particle orbits from purely helical motion. The resonance function is of the Breit-Wigner form. From this, the Fokker-Planck coefficients can be computed, in particular the pitch-angle diffusion coefficient and the perpendicular diffusion coefficient. Note, however, that the perpendicular diffusion coefficient depends on the pitch-angle diffusion rate (or equivalently on the parallel diffusion coefficient). In order to probe the perpendicular diffusion independently when comparing to simulations, the empirical parallel mean-free path from the simulations is oftentimes adopted.
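The appearance of the Breit-Wigner form can be checked with a short worked integral. As a sketch under stated assumptions, let \(\nu \) denote the total decay rate collected from the additional exponential factors and \(\Delta \) the mismatch from the sharp QLT resonance (both symbols are introduced here for illustration only and are not the notation of Shalchi et al. 2004b). The time integration then gives

$$ \int _{0}^{\infty } \mathrm {d} t \, \mathrm {e} ^{- \nu t} \cos ( \Delta \, t ) = \frac{\nu }{\nu ^{2} + \Delta ^{2}} \, , $$

a Lorentzian of width \(\nu \) centred on \(\Delta = 0\). In the limit \(\nu \to 0\) it tends to \(\pi \delta (\Delta )\), so that the sharp resonance of QLT is recovered.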

2.7.4 Other approaches

Tautz et al. (2008) use a broadening of the resonance condition in isotropic turbulence, parametrised by smoothing of the particle position along the magnetic field as motivated by second-order QLT (Shalchi 2005). The width of the particle position is computed from the usual QLT. As a consequence, \(D_{\mu \mu }\) now has its maximum at \(\mu =0\). The authors find good agreement with the numerical simulations of Giacalone and Jokipii (1999). Also noteworthy is the work of Shalchi et al. (2009) who present an analytical computation of the pitch-angle diffusion coefficient and the mean-free path for slab turbulence. It is shown that QLT is a good approximation for \(|\mu | > \delta B / B_{z}\).

3 Generating turbulent magnetic fields on a computer

The most realistic way of generating a turbulent magnetic field on a computer to propagate particles in is of course to rely on simulations of this turbulence. This offers the opportunity to include (some of) the known complexity beyond the simple turbulence models described above, for instance anisotropic turbulence like the Goldreich-Sridhar picture (Sridhar and Goldreich 1994; Goldreich and Sridhar 1995). Given the large dynamical range required for most applications, it is however also the most computationally expensive. In the following, we will review such attempts and their results, before discussing the generation of synthetic turbulence.

3.1 Simulated turbulence

The most extensive set of simulations to date has been performed by Cohet and Marcowith (2016), hereafter CM16, who tracked test particles through MHD turbulence generated with the RAMSES code (Teyssier 2002). They followed the pioneering work of Beresnyak et al. (2011) and Xu and Yan (2013) and discussed differences in setups and results.

For the most part, CM16 ran the MHD part of their simulations on a \(512^{3}\) grid, and the box length of the simulation was taken to be five times larger than the turbulence injection scale \(L_{\text{inj}}\). This resulted in about one-and-a-half orders of magnitude in dynamical range between the coherence length of turbulence and the dissipation length, the latter being due to the finite numerical resolution. Turbulence was injected either by solenoidal or compressible forcing and the results differ significantly. It is hypothesised that this is due to the preferential driving of Alfvénic turbulence for the solenoidal and of fast-magnetosonic turbulence for the compressible case, the latter leading to an isotropic turbulence cascade and being more efficient in CR scattering (Chandran 2000; Yan and Lazarian 2002).

CM16 studied in detail the dependence of parallel and perpendicular mean-free paths on the Alfvénic Mach number \(M_{\text{A}}\) (which is defined as the ratio of the rms fluid velocity and the Alfvén speed in the total magnetic field, i.e. background plus turbulent). For the parallel mean-free path, a power law scaling with the Alfvénic Mach number, \(\lambda _{\parallel } \propto (M_{\text{A}})^{\alpha }\), is found. At small \(M_{\text{A}}\), the results differ strongly between solenoidal and compressible forcing, with the parallel mean-free path at \(M_{\text{A}} = 0.3\) being about two orders of magnitude larger in the former case. For the solenoidal case, \(\lambda _{\parallel }\) is much larger than found by Xu and Yan (2013) and the dependence on \(M_{\text{A}}\) is much stronger: Typically \(\alpha \) is between −7 and −5, which is also in tension with expectations from QLT where \(\lambda _{\parallel } \propto M_{\text{A}}^{-2}\) (e.g. Sun 2011). Note that this scaling was also confirmed in test particle simulations of synthetic isotropic turbulence, notably beyond the limits of validity of QLT (Casse et al. 2002). For the compressible driving, \(\lambda _{\parallel } \propto (M_{\text{A}})^{-2}\) as expected. The perpendicular mean-free path, on the other hand, scales like \(\lambda _{\perp } \propto M_{\text{A}}^{2}\) in QLT, which is largely confirmed by CM16. This is ascribed to the contribution from field-line random walk to the perpendicular transport. Another prediction for compressible MHD turbulence (Yan and Lazarian 2008) is \(\lambda _{\perp } \propto M_{\text{A}}^{4}\), but this only applies in the limits \(\lambda _{\parallel } \ll L_{\text{inj}}\) or \(\lambda _{\parallel } \gg L_{\text{inj}}\), whereas the simulations of CM16 are in between.

An equally crucial result is the dependence of the parallel and perpendicular mean-free paths on gyroradius \(r_{\text{g}}\) (normalised with respect to the simulation scale \(L\)). Here, the results for \(\lambda _{\parallel }\) again depend very sensitively on the driving at \(L_{\text{inj}}\): If the forcing is solenoidal, the rigidity dependence of \(\lambda _{\parallel }\) can be very weak. The dependence is power-law-like in the range of rigidities tested, \(\lambda _{\parallel } \propto r_{\text{g}}^{\delta }\), and \(\delta \) can even become negative, especially for large \(M_{\text{A}}\). In QLT this is only possible for turbulence spectra \(g(k) \propto k^{-q}\) with \(q > 2\) while the power spectral indices found by CM16 are \(q \sim 1.5\), that is consistently smaller than 2. In the compressible case, the agreement with expectations is much better and the observed scaling is compatible with both \(\delta = 1/3\) and \(1/2\). (The dynamical range is in fact too small to tell.) Perpendicular mean-free paths show less of a difference between the solenoidal and compressible cases and are largely consistent with a scaling \(\propto r_{\text{g}}^{1/2}\). For gyroradii larger than \(L_{\text{inj}}\), the transition to small-angle scattering with \(\lambda _{\parallel } \propto r_{\text{g}}^{2}\) is observed, as expected.

3.2 Synthetic turbulence

Realistic modelling of CR transport requires a rather wide dynamical range for the turbulent modes. MHD simulations of turbulence usually cover no more than one and a half orders of magnitude between the coherence length and the dissipation scale (see e.g. CM16). An alternative to using simulated turbulence is to adopt one of the turbulence correlation tensors \(P_{ij}(\boldsymbol{k}, t)\) discussed in Sect. 2.4 and to directly generate random realisations of a field with such a correlation structure on a computer. The turbulence generated in this way is usually referred to as “synthetic turbulence”. The obvious drawback of this method is its reliance on a turbulence model instead of using the more realistic results from MHD simulations of turbulence. The advantages are the large dynamical range possible in principle, and the possibility of directly testing some of the results of QLT and its non-linear extensions which are more straight-forward to compute for simple turbulence models.

When solving the equations of motion, we will need to evaluate the turbulent magnetic field \(\boldsymbol{\delta B}\) at many different positions, possibly also at different times, the latter distinction becoming relevant when considering models of dynamical turbulence. In order to do this, we need to keep track not only of the amplitudes of the turbulent field, but also of its phases which are random. This implies generating a random sequence of phases and storing them for the duration of the test particle simulation. On a computer, the turbulent magnetic field will be characterised by a finite number of real numbers, that is the corresponding magnetic field is band-limited. In the literature, two methods have been suggested, depending on whether the phases of a finite number of modes are stored or whether the turbulent magnetic field \(\boldsymbol{\delta B}(\boldsymbol{r})\) is stored on a discrete grid. We will refer to the former as the harmonic method and to the latter as the grid method. Both methods have their advantages, but also disadvantages which we will discuss.

3.2.1 Harmonic method

In the harmonic method, pioneered by Giacalone and Jokipii (1994) and others (Michałek and Ostrowski 1997, 1998; Giacalone and Jokipii 1999), the turbulent field is defined as a superposition of plane waves,

$$ \boldsymbol{\delta B}(\boldsymbol{r}) = \operatorname{Re} \left ( \sum _{n=0}^{N-1} \delta \tilde{\boldsymbol{B}}_{n} \mathrm {e} ^{\imath \boldsymbol{k}_{n} \cdot \boldsymbol{r}} \right ) \, . $$
(70)

Here, only the wavenumbers are discrete, and in order to cover as broad a dynamical range with as small a number \(N\) of modes as possible, the spacing in \(\boldsymbol{k}\) is oftentimes assumed to be logarithmic.

The alternative, but equivalent representation,

$$ \boldsymbol{\delta B}(\boldsymbol{r}) = \sum _{n=0}^{N-1} A_{n} \hat {\boldsymbol {\xi}} _{n} \cos \left [ k_{n} \hat {\boldsymbol {k}} _{n} \cdot \boldsymbol{r} + \beta _{n} \right ] \, , $$
(71)

makes explicit the interpretation as a superposition of \(N\) independent waves travelling in the directions \(\hat {\boldsymbol {k}} _{n}\) with amplitudes \(A_{n}\), polarisations \(\hat {\boldsymbol {\xi}} _{n}\), wavenumbers \(k_{n}\) and phase factors \(\beta _{n}\). Each mode \(n\) is thus specified by six real numbers: one for \(A_{n}\), one for \(\hat {\boldsymbol {\xi}} _{n}\) (as it needs to be \(\perp \hat {\boldsymbol {k}} _{n}\) in order for \(\boldsymbol{\delta B}\) to be divergence-free), one for \(k_{n}\), one for \(\beta _{n}\) and two for \(\hat {\boldsymbol {k}} _{n}\). Of these, \(\hat {\boldsymbol {\xi}} _{n}\), \(\hat {\boldsymbol {k}} _{n}\) and \(\beta _{n}\) are random variables and their statistical distributions are determined by the turbulence model.

For instance, in isotropic turbulence (see Sect. 2.4.1), \(\hat {\boldsymbol {\xi}} _{n}\) is uniformly distributed on the unit circle (such that \(\hat {\boldsymbol {\xi}} _{n} \cdot \hat {\boldsymbol {k}} _{n} = 0\)), \(\hat {\boldsymbol {k}} _{n}\) is uniformly distributed on the unit sphere and \(\beta _{n}\) is uniformly distributed in \([0, 2 \pi [\). Giacalone and Jokipii (1999) suggested the following construction

$$ \boldsymbol{\delta B}(x, y, z) = \sum _{n=1}^{N_{m}} A(k_{n}) \hat {\boldsymbol {\xi}} _{n} \exp \left [ \imath (k_{n}' z' + \beta _{n}) \right ] \, , $$
(72)

with polarisation vector

$$\begin{aligned} \hat {\boldsymbol {\xi}} _{n} &= \cos \alpha \, \hat {\boldsymbol {x}} _{n}' + \imath \sin \alpha \, \hat {\boldsymbol {y}} _{n}' \, , \end{aligned}$$
(73)

and

$$ \left ( \!\! \textstyle\begin{array}{c} x' \\ y' \\ z' \end{array}\displaystyle \!\! \right ) \!\! = \!\! \left ( \!\! \textstyle\begin{array}{c c c} \cos \theta _{n} \cos \phi _{n} & \cos \theta _{n} \sin \phi _{n} & - \sin \theta _{n} \\ - \sin \phi _{n} & \cos \phi _{n} & 0 \\ \sin \theta _{n} \cos \phi _{n} & \sin \theta _{n} \sin \phi _{n} & \cos \theta _{n} \end{array}\displaystyle \!\! \right ) \!\!\! \left ( \!\! \textstyle\begin{array}{c} x \\ y \\ z \end{array}\displaystyle \!\! \right ) $$
(74)

These equations describe a superposition of waves with wavenumbers \(k_{n}\) and (complex) amplitudes \(A(k_{n})\). The direction of each mode is along the \(z'\)-axis in a coordinate system generated from the lab system through a rotation by \(\phi _{n}\) around the \(z\)-axis and a subsequent rotation by \(\theta _{n}\) around the new \(y\)-axis, as can be read off from the matrix in Eq. (74). \(\{ \phi _{n}, \theta _{n}, 0 \}\) are thus the Euler angles defining the rotation of the lab system into the rotated system in the \(zyz\) convention. Note that the first term in the exponent of Eq. (72) has been simplified in primed coordinates, \(\boldsymbol{k} \cdot \boldsymbol{x} = \boldsymbol{k}' \cdot \boldsymbol{x}' = k_{z}' z'\).

It has been claimed (Tautz and Dosch 2013) that this construction does not guarantee the correct variances for all three components of the turbulent magnetic field. We believe, however, that Tautz and Dosch (2013) did not compute the averages correctly and that with appropriate averages, the construction by Giacalone and Jokipii gives the correct results.

The \(A_{n}\) in turn are fully determined by the power spectrum of turbulence. Again, for an isotropic turbulence tensor, \(\langle \delta \tilde{B}_{i}(\boldsymbol{k}) \delta \tilde{B}_{j}(\boldsymbol{k}') \rangle = \delta _{ij} \delta ^{(3)} (\boldsymbol{k} - \boldsymbol{k}') g(k)\) and thus \(A_{n} = \sqrt{g(k_{n})}\) is the discrete approximation for the desired power spectrum.

While the turbulence model fixes the \(A_{n}\) and the statistical distributions of the \(\hat {\boldsymbol {\xi}} _{n}\), \(\hat {\boldsymbol {k}} _{n}\) and \(\beta _{n}\), what is not fixed is the binning of the \(k_{n}\) and the total number of modes, \(N\). Both are usually constrained by the need to cover as wide a dynamical range as possible. Given our understanding from QLT that interactions are resonant, what is required at a minimum in the magneto-static limit for one particle energy is a spectrum spanning at least a factor of a few around the resonant wavenumber. In addition, power on larger scales can have an impact, depending on what exactly the observable is. This means that easily a few orders of magnitude in wavenumber range are required, even at minimum. Therefore, oftentimes a logarithmic spacing in \(k\) is adopted. This leaves open the question of what the required number \(N\) of modes is. For the case of slab turbulence, this question has been investigated using the convergence with number of modes of a “quasi-Lyapunov exponent” (Tautz and Dosch 2013). On a more practical level, we note that the numbers oftentimes adopted are \(N = \mathcal{O}(100)-\mathcal{O}(1000)\) for a dynamical range \(k_{\text{max}} / k_{\text{min}} \sim 10^{4}\).
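The harmonic method for isotropic turbulence can be sketched in a few lines. This is a minimal illustration of Eq. (71), not the implementation of any of the cited works. It assumes a power-law spectrum \(g(k) \propto k^{-q}\) sampled on logarithmic bins with \(A_{n}^{2} \propto g(k_{n}) \Delta k_{n}\), normalised such that \(\sum _{n} A_{n}^{2} / 2\) equals the desired mean square field. The function names, the normalisation and the default parameters are hypothetical.

```python
import numpy as np


def make_modes(n_modes, k_min, k_max, db_rms, q=5.0 / 3.0, rng=None):
    """Draw the random mode parameters of Eq. (71) for isotropic turbulence."""
    rng = np.random.default_rng() if rng is None else rng
    # Logarithmically spaced wavenumbers; dk is the width of each bin.
    edges = np.geomspace(k_min, k_max, n_modes + 1)
    k = np.sqrt(edges[1:] * edges[:-1])
    dk = np.diff(edges)
    # Assumption: A_n^2 ~ g(k_n) dk_n with g ~ k^-q,
    # normalised such that sum(A_n^2) / 2 = db_rms^2.
    amp2 = k**(-q) * dk
    amp = np.sqrt(2.0 * db_rms**2 * amp2 / amp2.sum())
    # Directions uniform on the sphere, polarisation angle and phase uniform.
    cos_t = rng.uniform(-1.0, 1.0, n_modes)
    sin_t = np.sqrt(1.0 - cos_t**2)
    phi = rng.uniform(0.0, 2.0 * np.pi, n_modes)
    alpha = rng.uniform(0.0, 2.0 * np.pi, n_modes)
    beta = rng.uniform(0.0, 2.0 * np.pi, n_modes)
    k_hat = np.stack([sin_t * np.cos(phi), sin_t * np.sin(phi), cos_t], axis=1)
    # Two unit vectors spanning the plane perpendicular to k_hat
    # (the first two rows of the matrix in Eq. 74).
    e1 = np.stack([cos_t * np.cos(phi), cos_t * np.sin(phi), -sin_t], axis=1)
    e2 = np.stack([-np.sin(phi), np.cos(phi), np.zeros(n_modes)], axis=1)
    xi = np.cos(alpha)[:, None] * e1 + np.sin(alpha)[:, None] * e2
    return amp, xi, k[:, None] * k_hat, beta


def delta_b(positions, amp, xi, k_vec, beta):
    """Evaluate Eq. (71) at an array of positions with shape (M, 3)."""
    phase = positions @ k_vec.T + beta  # shape (M, N)
    return (amp * np.cos(phase)) @ xi   # shape (M, 3)


rng = np.random.default_rng(1)
modes = make_modes(200, 1.0, 1.0e4, db_rms=1.0, rng=rng)
print(delta_b(np.array([[0.3, -1.2, 4.0]]), *modes))
```

The cost of one field evaluation scales with the number of modes, which is the main motivation for the grid method discussed next.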

3.2.2 Grid method

Standard grid method

An alternative way to set up turbulent magnetic fields on a computer is called the grid method (e.g. Qin et al. 2002b). While in the harmonic method the amplitudes and phases of the turbulent modes are stored (e.g. in the combination \(\{ A_{n}, \hat {\boldsymbol {\xi}} _{n}, \hat {\boldsymbol {k}} _{n}, \beta _{n} \}\)), in the grid method the turbulent magnetic field itself, \(\boldsymbol{\delta B}(\boldsymbol{r})\), is stored on a spatial grid \(\boldsymbol{r}_{i,j,k}\) and can be interpolated between these grid points.

Here, we introduce the discretisations of the position \(\boldsymbol{r}_{n_{1},n_{2},n_{3}} = (x_{n_{1}}, y_{n_{2}}, z_{n_{3}})^{T} = (n_{1} \Delta r_{1}, n_{2} \Delta r_{2}, n_{3} \Delta r_{3})^{T}\) and wavenumber \(\boldsymbol{k}_{m_{1},m_{2},m_{3}} = ((k_{x})_{m_{1}}, (k_{y})_{m_{2}}, (k_{z})_{m_{3}})^{T} = (m_{1} \Delta k_{1}, m_{2} \Delta k_{2}, m_{3} \Delta k_{3})^{T}\). The Fourier transform pair of Eqs. (32) and (33), \(\delta \tilde{B}_{j}(\boldsymbol{k})\) and \(\delta B_{j}(\boldsymbol{x})\), then corresponds to the discrete Fourier transform pair \(\delta \tilde{B}_{j}^{m_{1}, m_{2}, m_{3}}\) and \(\delta B_{j}^{n_{1}, n_{2}, n_{3}}\),

$$\begin{aligned} &\delta \tilde{B}_{j}^{m_{1}, m_{2}, m_{3}} \\ &= \sum _{n_{1}=0}^{N_{1}-1} \sum _{n_{2}=0}^{N_{2}-1} \sum _{n_{3}=0}^{N_{3}-1} \mathrm {e} ^{2 \pi \imath (\frac{m_{1} n_{1}}{N_{1}} + \frac{m_{2} n_{2}}{N_{2}} + \frac{m_{3} n_{3}}{N_{3}})} \delta B_{j}^{n_{1}, n_{2}, n_{3}} \, , \end{aligned}$$
(75)
$$\begin{aligned} &\delta B_{j}^{n_{1}, n_{2}, n_{3}} = \frac{1}{N_{1} N_{2} N_{3}} \\ &\qquad \times \sum _{m_{1}=0}^{N_{1}-1} \sum _{m_{2}=0}^{N_{2}-1} \sum _{m_{3}=0}^{N_{3}-1} \mathrm {e} ^{-2 \pi \imath (\frac{m_{1} n_{1}}{N_{1}} + \frac{m_{2} n_{2}}{N_{2}} + \frac{m_{3} n_{3}}{N_{3}})} \delta \tilde{B}_{j}^{m_{1}, m_{2}, m_{3}} \, , \end{aligned}$$
(76)

for discretely sampled \(\delta B_{i}(\boldsymbol{r})\) and \(\delta \tilde{B}_{i}(\boldsymbol{k})\),

$$\begin{aligned} \delta \tilde{B}_{j}^{m_{1}, m_{2}, m_{3}} &= \frac{(2 \pi )^{3/2}}{\Delta x_{1} \Delta x_{2} \Delta x_{3}} \delta \tilde{B}_{j}(\boldsymbol{k}_{m_{1},m_{2},m_{3}}) \,, \end{aligned}$$
(77)
$$\begin{aligned} \delta B_{j}^{n_{1}, n_{2}, n_{3}} &= \delta B_{j}(\boldsymbol{r}_{n_{1},n_{2},n_{3}}) \,. \end{aligned}$$
(78)

A fast way of setting up a homogeneous scalar Gaussian random field in 3 dimensions with a given power spectrum works in harmonic space. The requirement

$$ \langle \delta \tilde{B}_{i}(\boldsymbol{k}) \delta \tilde{B}^{*}_{j}(\boldsymbol{k}) \rangle = P_{ij}(\boldsymbol{k}) \,, $$
(79)

only fixes the amplitudes, but not the complex phases. To obtain a homogeneous Gaussian random field (with the correlation structure defined by the power spectrum), the Fourier coefficients must be complex normal distributed, \(\delta \tilde{B}_{n} \propto \mathcal{N}(0,1) + \imath \, \mathcal{N}(0,1)\), such that their phases are uniformly distributed. However, for a real turbulent field the coefficients need to further satisfy the relation implied by Eq. (34). For a discrete field in one dimension with \(N\) modes indexed by \(m = 0, \ldots , N-1\), that is \(\delta \tilde{B}^{N-m} = (\delta \tilde{B}^{m})^{*}\). Instead of enforcing the reality conditions by hand, it has proven convenient to use an efficient routine for the generation of a real Gaussian random field with no correlation structure, that is white noise, to Fourier transform it, and then to scale the complex amplitudes with the square root of the desired power spectrum before transforming back. Note that modern Fourier transform libraries provide routines for reconstructing the full inverse Fourier transform from the Fourier transform at just the positive (spatial) frequencies.

Knowing how to generate a scalar Gaussian random field, it might seem that we just need to combine three independent scalar fields into a 3D vector. However, in general this 3D random field will not be divergence-free. In order to guarantee that the field is divergence-free, only the polarisations perpendicular to \(\hat {\boldsymbol {k}} \) should be retained. This can be achieved by subtracting from each \(\delta \tilde{B}_{j}^{m_{1}, m_{2}, m_{3}}\) the projection of it onto \(\hat {\boldsymbol {k}} \).
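The two steps described above (colouring white noise in harmonic space and removing the component along \(\hat {\boldsymbol {k}} \)) can be sketched with NumPy's real-to-complex FFT routines. This is a minimal illustration under stated assumptions, not the code of any of the cited works: the function name, the band limits `k_lo` and `k_hi`, and the example spectrum are hypothetical, and no overall normalisation of the field is attempted.

```python
import numpy as np


def solenoidal_gaussian_field(n, box_size, spectrum, k_lo, k_hi, rng=None):
    """Real, divergence-free Gaussian random vector field on an n^3 periodic grid."""
    rng = np.random.default_rng() if rng is None else rng
    dx = box_size / n
    if k_hi >= np.pi / dx:
        raise ValueError("k_hi must stay below the Nyquist wavenumber")
    # Wavenumbers matching the layout of numpy's real-to-complex FFT.
    k1 = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)
    k3 = 2.0 * np.pi * np.fft.rfftfreq(n, d=dx)
    kx, ky, kz = np.meshgrid(k1, k1, k3, indexing="ij")
    k_vec = np.stack([kx, ky, kz])
    k_mag = np.sqrt(kx**2 + ky**2 + kz**2)

    # Band-limit the spectrum; modes outside [k_lo, k_hi] carry no power.
    band = (k_mag >= k_lo) & (k_mag <= k_hi)
    amp = np.zeros_like(k_mag)
    amp[band] = np.sqrt(spectrum(k_mag[band]))

    # White noise in real space -> Fourier space -> scale the amplitudes.
    white = rng.standard_normal((3, n, n, n))
    b_k = np.fft.rfftn(white, axes=(1, 2, 3)) * amp

    # Remove the component along k_hat so that k . B_k = 0.
    k_safe = np.where(k_mag > 0.0, k_mag, 1.0)
    k_hat = k_vec / k_safe
    b_k -= k_hat * np.sum(k_hat * b_k, axis=0)

    field = np.fft.irfftn(b_k, s=(n, n, n), axes=(1, 2, 3))
    return field, k_vec


field, k_vec = solenoidal_gaussian_field(
    32, 2.0 * np.pi, lambda k: k**(-11.0 / 3.0), 2.0, 8.0,
    rng=np.random.default_rng(0))
print(field.shape, field.std())
```

Starting from real white noise means that the reality condition is satisfied automatically, since both the amplitude scaling and the projection are even in \(\boldsymbol{k}\). Keeping `k_hi` below the Nyquist wavenumber avoids the special treatment that the Nyquist planes would otherwise need.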

The advantage of the grid method is most importantly its speed: Instead of performing a sum of \(N\) modes for a large number of test particles at each timestep of the test particle propagation, only an interpolation between the relevant grid points is needed. For a fine enough grid in 3D, a tri-linear interpolation is sufficient. (See, however, Schlegel et al. (2019).) In most cases, this is computationally more efficient. However, this gain in speed is achieved at the price of increased memory requirements. For example, a 3D field of doubles on a \(2048^{3}\) grid requires \(192 \, \text{GB}\) of RAM, where we have ignored overhead. While certain nodes of computing clusters can have more RAM, as of the writing of this review, this is already beyond the reach of all but the most powerful personal computers.
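A tri-linear interpolation on a periodic grid, together with the memory estimate quoted above, can be sketched as follows. The function name and the storage layout (components first, shape `(3, n, n, n)`) are assumptions made for this illustration.

```python
import numpy as np


def trilinear_periodic(field, position, dx):
    """Tri-linear interpolation of a field stored with shape (3, n, n, n)."""
    n = field.shape[1]
    s = np.asarray(position, dtype=float) / dx
    i0 = np.floor(s).astype(int)
    f = s - i0  # fractional offsets within the cell
    result = np.zeros(field.shape[0])
    for ox in (0, 1):
        for oy in (0, 1):
            for oz in (0, 1):
                w = ((f[0] if ox else 1.0 - f[0])
                     * (f[1] if oy else 1.0 - f[1])
                     * (f[2] if oz else 1.0 - f[2]))
                ix, iy, iz = (i0 + (ox, oy, oz)) % n  # periodic wrap-around
                result += w * field[:, ix, iy, iz]
    return result


# Memory for a 3-component field of 8-byte doubles on a 2048^3 grid.
bytes_needed = 3 * 8 * 2048**3
print(bytes_needed / 2**30, "GiB")
```

Each evaluation touches only the eight corners of one grid cell, independently of how many modes were used to set up the field. The memory estimate reproduces the quoted 192 GB when a GB is read as \(2^{30}\) bytes.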

At any rate, a finite grid size implies issues with periodicity and accuracy of interpolation. The latter can be minimised by ensuring that the smallest wavelengths are a factor of a few larger than the grid spacing, \(\lambda _{\text{min}} = (\text{a few}) \, \Delta x\). At the same time, a few of the largest modes should fit onto the extent \(L\) of the grid, \(L = (\text{a few}) \, \lambda _{\text{max}}\), in order to reduce possible periodicity issues. Thus with 2048 grid points, we can cover at most a dynamical range of \(\lambda _{\text{max}} / \lambda _{\text{min}} \sim \mathcal{O}(100)\). This is probably enough to capture the particle-wave resonance, even for broadened resonances. However, modes at scales larger than the resonant scale can also have an effect on particle transport, e.g. through FLRW, but cannot be taken into account for such a small dynamical range.

Nested grid method

In light of these considerations, it was suggested (Giacinti et al. 2012) to increase the dynamical range by using nested grids. This method was later also used by Mertsch and Funk (2015) and Savchenko et al. (2015). The idea is that the total dynamical range \([ k_{\text{min}}, k_{\text{max}}]\) is divided into \(N\) intervals \([ k_{i}, k_{i+1}]\), \(i = 0, \ldots , N-1\), with \(k_{0} =k_{\text{min}}\) and \(k_{N} = k_{\text{max}}\). Each interval is set up on a separate grid and these sub-grids are then periodically replicated over the whole computational domain. See Fig. 2 for an illustration of the method in 3D.

Fig. 2 Illustration of the idea of using nested grids. Note that in this illustration padding is not used and thus the grids are not overlapping

The total turbulent field is given by the sum of turbulent fields on each grid. For a power law power spectrum, \(P(k) \propto k^{-q}\), the turbulent energy \(\delta B_{i}^{2}\) to be localised on a sub-grid \(i\) is

$$ \delta B_{i}^{2} = \delta B^{2} \frac{k_{i}^{3-q} - k_{i+1}^{3-q}}{k_{\text{min}}^{3-q} - k_{\text{max}}^{3-q}} \, . $$
(80)
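The partition of Eq. (80) is easily evaluated numerically. The following sketch assumes \(q \neq 3\) and takes the sub-grid boundaries \(k_{i}\) as input; the function name and the example boundaries are hypothetical.

```python
import numpy as np


def subgrid_energies(k_edges, q, db2_total=1.0):
    """Turbulent energy per sub-grid according to Eq. (80), assuming q != 3."""
    k_edges = np.asarray(k_edges, dtype=float)
    p = 3.0 - q
    numerator = k_edges[:-1]**p - k_edges[1:]**p
    return db2_total * numerator / (k_edges[0]**p - k_edges[-1]**p)


# Four sub-grids, each covering a factor 4 in wavenumber (example values).
energies = subgrid_energies(np.geomspace(1.0, 256.0, 5), q=11.0 / 3.0)
print(energies, energies.sum())
```

By construction the contributions sum to the total \(\delta B^{2}\), and for a spectrum falling steeply with \(k\) most of the energy resides on the sub-grid with the smallest wavenumbers.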

As for the case of a single grid, it is advisable not to use the whole range of the grid for turbulent modes, but to use part of the range for padding. In Fig. 3, we illustrate the overlapping nested grids produced by this construction.

Fig. 3 Illustration of the nested grid approach. Shown is the power spectrum and how it is partitioned onto four sub-grids \(i\), each only contributing in a limited range of wavenumbers

In this way a much larger dynamical range can be achieved. For definiteness, we close the discussion of nested grids with an example for how to set up the (sub-)grids for a test particle simulation. In Fig. 3, we illustrate the nesting of four grids with 32 points each. On each grid \(i\), we are only using 12 points to set up the turbulent modes with a dynamical range of \(k_{i+1}/k_{i} = 12\). The remaining 20 points are used for padding. For example, we can set the amplitude to zero for the first \((a-1) = 3\) modes, have finite power between \(j_{i} = a\) and \(b\) (corresponding to the wavenumbers \(k_{i}\) and \(k_{i+1}\)) and again no power for the remaining grid points. Note how the wavenumber grids are organised in order for the different grids to smoothly connect.

The parameters of this example have been chosen to allow for a clear presentation in Fig. 3. As a real application example, we might instead consider the propagation of \(10 \text{ TeV}\) test particles in a \(\sqrt{ \langle B^{2} \rangle } = 4 \, \mu \text{G}\) isotropic field with a largest turbulent scale (corresponding to \(k_{\text{min}}\)) of \(0.1 \, \text{kpc}\). The gyroradius in the \(4 \, \mu \text{G}\) field is \(\sim 2.7 \times 10^{-6} \, \text{kpc}\), thus the dynamical range required is at least \(0.1 / (2.7 \times 10^{-6}) \simeq 3.7 \times 10^{4}\). This could be achieved by nesting five grids of 128 points each, each grid only covering a factor 16 in dynamical range. The remaining range of \(128/16 = 8\) would be used for padding. Note that without nesting, the dynamical range of \(3.7 \times 10^{4}\) would have required a number of grid points per dimension of \(131\,072\) or more, which corresponds to \(48 \, \text{PB} \) of RAM for a 3D vector field of doubles!
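The numbers in this example can be reproduced with a few lines of arithmetic. The sketch below assumes ultra-relativistic particles of unit charge, so that \(r_{\text{g}} = E / (e c B)\), and counts memory in binary units; the variable names are chosen for illustration only.

```python
# Physical constants (SI).
C_LIGHT = 2.99792458e8            # speed of light in m / s
KPC_IN_M = 3.0856775814913673e19  # one kiloparsec in metres

energy_ev = 1.0e13   # 10 TeV; ultra-relativistic, charge Z = 1 assumed
b_tesla = 4.0e-10    # 4 microgauss
l_max_kpc = 0.1      # largest turbulent scale

# Gyroradius r_g = E / (Z e c B); with E in eV the charge e cancels.
r_g_kpc = energy_ev / (C_LIGHT * b_tesla) / KPC_IN_M
dynamical_range = l_max_kpc / r_g_kpc

# Five nested grids, each covering a factor 16 in wavenumber.
n_grids, n_points, factor_per_grid = 5, 128, 16
covered_range = factor_per_grid**n_grids

bytes_per_point = 3 * 8  # three components, 8-byte doubles
nested_mib = n_grids * n_points**3 * bytes_per_point / 2**20
single_pib = 131072**3 * bytes_per_point / 2**50

print(f"r_g = {r_g_kpc:.2e} kpc, required range = {dynamical_range:.2e}")
print(f"nested grids: {nested_mib:.0f} MiB, single grid: {single_pib:.0f} PiB")
```

The five nested grids together occupy a few hundred MiB, compared to tens of PiB for a single grid with the same dynamical range.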

4 Applications

Traditionally, test particle simulations have been used primarily for computation of diffusion coefficients which would then be compared with analytical results in order to test CR transport theories (Giacalone and Jokipii 1999; DeMarco et al. 2007; Snodin et al. 2016; Subedi et al. 2017). In addition, test particle simulations have been used (and are still being used) to study the deflection of ultra-high energy CRs in the Galactic magnetic fields where transport is certainly not resonant pitch-angle scattering (Karakula et al. 1971; Harari et al. 2000; Tinyakov and Tkachev 2002; Alvarez-Muniz et al. 2001; Harari et al. 2002; Kachelriess et al. 2006; Bretz et al. 2014; Farrar and Sutherland 2019). There are however situations where even Galactic transport is not diffusive or where the diffusive picture is questionable. These include the escape of Galactic CRs from the CR halo around the knee (DeMarco et al. 2007; Giacinti et al. 2015), near source transport (Giacinti et al. 2012; Kachelrieß et al. 2015), stochastic acceleration (O’Sullivan et al. 2009; Winchen and Buitink 2018), also in relativistic turbulence (Demidem et al. 2019), and the study of CR anisotropies (Giacinti and Sigl 2012; Giacinti et al. 2012; Schwadron et al. 2014; Mertsch and Funk 2015; Ahlers and Mertsch 2015; López-Barquero et al. 2016; Pohl and Rettig 2016; Kumar et al. 2019; Mertsch and Ahlers 2019). In the following, we will briefly review the use of test particle simulations and discuss the results for a few physics cases.

4.1 Computing transport coefficients

All the non-linear extensions that are meant to address QLT’s issues need to make certain assumptions (see Sect. 2.7). While these assumptions may be well motivated, it is not clear a priori whether they result in an accurate description of CR transport. It is therefore of great interest to test these theories by comparing their results with those of numerical simulations.

Central predictions of the non-linear models are the parallel and perpendicular mean-free paths or equivalently the parallel and perpendicular diffusion coefficients, \(\kappa _{\parallel }\) and \(\kappa _{\perp }\). To a lesser extent, numerical simulations have also been employed to compute the pitch-angle diffusion coefficient \(D_{\mu \mu }\) and the off-diagonal, anti-symmetric elements of the diffusion tensor, \(\kappa _{\text{A}}\), describing drifts. Of course, checking if transport is diffusive in the first place (instead of subdiffusive or superdiffusive) is another important application of test particle simulations.

4.1.1 Technical details

We start by recalling the definition of the instantaneous diffusion coefficients,

$$\begin{aligned} d_{ii}(t) = \frac{\langle (\Delta x_{i})^{2} \rangle }{2 t} \, . \end{aligned}$$
(81)

The mean square displacements \(\langle (\Delta x_{i})^{2} \rangle \) are directly accessible for a set of trajectories \(\{ \boldsymbol{r}_{j} \}\) from test particle simulations

$$ \langle (\Delta x_{i})^{2} \rangle = \langle |r_{j,i}(t) - r_{j,i}(0)|^{2} \rangle \, . $$
(82)

Assuming again that the regular magnetic field \(\langle \boldsymbol{B} \rangle = B_{z} {} \hat {\boldsymbol {z}} \), we identify \(d_{\parallel } = d_{zz}\) and \(d_{\perp } = d_{xx} = d_{yy}\).
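Eqs. (81) and (82) translate directly into array operations. The sketch below assumes that the trajectories of all particles (pooled over initial directions and field realisations) are stored in one array of shape `(P, T, 3)` sampled at common times starting at \(t = 0\); the function name and this layout are assumptions. The usage example feeds in a plain Gaussian random walk rather than actual test particle trajectories.

```python
import numpy as np


def running_diffusion_coefficients(trajectories, times):
    """d_ii(t) = <(Delta x_i)^2> / (2 t) from trajectories of shape (P, T, 3)."""
    disp = trajectories - trajectories[:, :1, :]  # displacement from r_j(0)
    msd = np.mean(disp**2, axis=0)                # average over particles, (T, 3)
    d = np.full_like(msd, np.nan)                 # undefined at t = 0
    d[1:] = msd[1:] / (2.0 * times[1:, None])
    d_par = d[:, 2]                               # regular field along z
    d_perp = 0.5 * (d[:, 0] + d[:, 1])
    return d_par, d_perp


# Usage with a synthetic random walk of known diffusion coefficients.
rng = np.random.default_rng(2)
dt, n_t, n_p = 0.1, 100, 4000
kappa_true = np.array([0.5, 0.5, 2.0])
steps = rng.standard_normal((n_p, n_t, 3)) * np.sqrt(2.0 * kappa_true * dt)
traj = np.concatenate([np.zeros((n_p, 1, 3)), np.cumsum(steps, axis=1)], axis=1)
times = dt * np.arange(n_t + 1)
d_par, d_perp = running_diffusion_coefficients(traj, times)
print(d_par[-1], d_perp[-1])
```

For the random walk the running coefficients are flat in time; for test particles they only level off after the initial ballistic phase.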

As far as the averaging on the RHS of Eq. (82) is concerned, most authors have adopted an averaging over initial particle velocity and over magnetic field realisations. The former is necessary as the (instantaneous) diffusion coefficients do not retain any pitch-angle dependence, cf. Eq. (25), and the latter is a consequence of QLT considering the ensemble-averaged phase space density. There is no agreement in the literature, however, on how many particle directions and how many field realisations are required to accurately compute diffusion coefficients.

For times much larger than the scattering time, the instantaneous diffusion coefficients should converge towards the asymptotic diffusion coefficients, \(\kappa _{\parallel }\) and \(\kappa _{\perp }\). Depending on the normalised rigidity, that is the gyroradius divided by the correlation length, \(r_{\text{g}}/\ell _{\text{c}}\), and on the level of turbulence, this only happens after many gyroperiods. Correspondingly, the computational expense can be very high. In order to increase the statistics at intermediate times, it was suggested (Casse et al. 2002) to not only use the initial position \(\boldsymbol{r}_{j}(0)\) as one endpoint of simulated trajectories in computing the mean squared distances, but to also consider intermediate intervals \([t_{i}, t_{i+1}]\). This improves the statistics of trajectories for intermediate times; however, it is not clear whether this introduces some unwanted correlations.
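One possible reading of this suggestion is to average the squared displacement at a fixed time lag over all starting times along each trajectory. The sketch below implements this sliding-origin average; it is an illustration, not necessarily the exact prescription of Casse et al. (2002), and the function name and array layout `(P, T, 3)` are assumptions.

```python
import numpy as np


def msd_sliding_origins(trajectories, lag_steps):
    """MSD at a fixed lag, averaged over particles and over all time origins."""
    if lag_steps < 1:
        raise ValueError("lag_steps must be at least 1")
    disp = trajectories[:, lag_steps:, :] - trajectories[:, :-lag_steps, :]
    return np.mean(disp**2, axis=(0, 1))  # one value per coordinate, shape (3,)


# Usage with a synthetic random walk: fewer particles, but many time origins.
rng = np.random.default_rng(3)
dt, n_t, n_p, lag = 0.1, 2000, 200, 10
kappa_true = np.array([0.5, 0.5, 2.0])
steps = rng.standard_normal((n_p, n_t, 3)) * np.sqrt(2.0 * kappa_true * dt)
traj = np.concatenate([np.zeros((n_p, 1, 3)), np.cumsum(steps, axis=1)], axis=1)
d_lag = msd_sliding_origins(traj, lag) / (2.0 * lag * dt)
print(d_lag)
```

Windows starting at neighbouring time origins overlap, so the samples entering the average are not independent. This is one way in which the correlations mentioned above can enter the error estimate.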

We note that it is also possible to test the diffusion approximation by computing \(\kappa _{\parallel }\) from the pitch-angle diffusion coefficient \(D_{\mu \mu }\). Note that in practice, oftentimes the scattering rate is derived from the already pitch-angle averaged correlation function \(\langle \mu (t) \mu (0) \rangle \) instead of from the pitch-angle diffusion coefficient \(D_{\mu \mu }(\mu )\).
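The connection between the two routes can be illustrated with a worked example. As a sketch, assume the commonly used relation between the parallel diffusion coefficient and the pitch-angle diffusion coefficient (quoted here from the general literature, not re-derived) and isotropic pitch-angle scattering, \(D_{\mu \mu } = D_{0} (1 - \mu ^{2})\), where \(D_{0}\) is a constant introduced only for this example:

$$ \kappa _{\parallel } = \frac{v^{2}}{8} \int _{-1}^{1} \mathrm {d} \mu \, \frac{(1 - \mu ^{2})^{2}}{D_{\mu \mu }(\mu )} = \frac{v^{2}}{8 D_{0}} \int _{-1}^{1} \mathrm {d} \mu \, (1 - \mu ^{2}) = \frac{v^{2}}{6 D_{0}} \, . $$

The same value follows from the pitch-angle averaged correlation function, which for this choice decays as \(\langle \mu (t) \mu (0) \rangle = \frac{1}{3} \exp [- 2 D_{0} t]\), so that \(v^{2} \int _{0}^{\infty } \mathrm {d} t \, \langle \mu (t) \mu (0) \rangle = v^{2} / (6 D_{0})\).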

We note that already Giacalone and Jokipii (1999) explored alternatives for computing the diffusion coefficients. The solution of the diffusion equation for an initially localised distribution is a multi-variate Gaussian with variances \(\sigma _{\parallel }^{2} = 2 \kappa _{\parallel } t\) and \(\sigma _{\perp }^{2} = 2 \kappa _{\perp } t\) in the parallel and perpendicular directions. Determining the spread of a set of trajectories from their common origin therefore allows computing the diffusion coefficients.

4.1.2 Results

In the following, we provide a brief overview of some of the first and some more recent computations of diffusion coefficients using test particle simulations. The first to use test particle simulations for the computation of transport coefficients were Giacalone and Jokipii (1994). Considering simplified turbulence models with 2D and 3D magnetostatic, isotropic turbulence they showed that diffusive perpendicular transport required 3D turbulence. For that case, they numerically computed \(\kappa _{\parallel}\) and \(\kappa _{\perp}\) for the first time. The ratio (\(\kappa _{\perp}/\kappa _{\parallel}\)) was found to deviate from the prediction of classical scattering (see Sect. 2.1) which was ascribed to the small dynamical range of the turbulence spectrum. This pioneering paper was followed up on by Michałek and Ostrowski a few years later (Michałek and Ostrowski 1997, 1998). Adopting the same harmonic method as Giacalone and Jokipii, Michałek and Ostrowski already considered a more complex and realistic scenario, including time-dependent turbulence with electric fields, that allowed them to study the role of stochastic acceleration and compare both parallel spatial diffusion and momentum diffusion with the prediction of QLT (Michałek and Ostrowski 1996). For slab-like turbulence they found good agreement. They also took into account the proper polarisation properties of the linear MHD waves in the cold plasma limit, that is shear-Alfvén and fast magnetosonic waves (Michałek and Ostrowski 1998). They found a much more effective cross-field diffusion for magnetosonic waves, compared to the case with Alfvén waves.

However, all these early simulations exhibited a rather limited dynamical range, \(\lesssim 100\). In a follow-up to their earlier work, Giacalone and Jokipii (1999) extended this dynamical range to \(10^{4}\). A first study employing not only the harmonic method, but also the grid-based approach was performed by Casse et al. (2002), who showed that both methods gave similar results. Qin et al. (2002b) shifted the focus back to perpendicular transport and showed that in close-to-slab turbulence, perpendicular transport is indeed compound subdiffusion. This is the case as long as there is too little structure in the perpendicular directions. If, however, there is sufficient structure, a second regime of diffusion is attained after a transitory phase of subdiffusion (Qin et al. 2002a).

Another study with interest in applications to Galactic transport was the one by DeMarco et al. (2007). Not only did they consider extended turbulence and rigidity ranges, but also the mixed effects of particle scattering and drifts due to small-scale turbulence and inhomogeneities in the large-scale regular, background field. This is to be expected for the Galactic magnetic fields which are thought to trace out a spiral structure, similar to gas and stars in the Galaxy. It was shown that both diffusion and drifts play an important role in the escape of cosmic rays from the extended halo around \(10^{17} \, \text{eV}\). Giacinti et al. (2012) followed a similar interest in the transport of Galactic cosmic rays at PeV energies and the possibility of using the predicted dipole anisotropies to set limits on the Galactic contribution at even higher energies. To simulate particles at rigidities of \(30 \, \text{PV}\), they needed to cover a dynamical range that was beyond the reach of single grids for the turbulent magnetic field and they adopted nested grids instead. More recently, Giacinti et al. (2018) have also simulated anisotropic turbulence structures. Snodin et al. (2016) have presented simulation results for the largest dynamical range yet. They have considered 3D isotropic turbulence with a broken power law spectrum, motivated by the need for a heuristic description of diffusion coefficients for MHD Galaxy/ISM simulations. They also took into account the contribution from FLRW, fitting the field line diffusion coefficient from the simulated turbulence.

In Fig. 4, we present a compilation of mean free paths computed for a 3D isotropic turbulence model and two different spectral shapes: a broken power law, see Eq. (48), in the left panels, and a power law with cut-off, Eq. (47), in the right panels. In the top panels, we show the parallel and perpendicular mean free paths \(\lambda _{\parallel}\) and \(\lambda _{\perp}\) as a function of the gyroradius \(r_{\text{g}}\) for various levels of turbulence, \(\eta \equiv \delta B^{2} / (B_{0}^{2} + \delta B^{2})\). (The simulations have been performed at discrete energies, of course, and the points are connected only to guide the eye.) The different scalings of \(\lambda _{\parallel}\) and \(\lambda _{\perp}\) with \(r_{\text{g}}\) in the regimes of resonant scattering (\(r_{\text{g}} \ll l_{\text{c}}\)) and small-angle scattering (\(r_{\text{g}} \gg l_{\text{c}}\)) are clearly visible. The behaviour with \(\eta \) is largely monotonic. There is agreement between different groups that have presented results for the same setup; for instance, compare the light green lines (\(\eta = 0.5\)) in the top left panel. In the resonant scattering regime, the mean-free paths seem to agree even for the different spectral shapes; compare the light green lines in the top left and top right panels for \(r_{\text{g}}/l_{\text{c}} \lesssim 1\). At larger gyroradii, \(r_{\text{g}}/l_{\text{c}} \gg 1\), the agreement is worse, however, with the power law with cut-off exhibiting somewhat larger parallel and smaller perpendicular mean-free paths. This behaviour was to be expected as for \(r_{\text{g}}/l_{\text{c}} \gg 1\), the particle transport becomes more sensitive to the large modes of the turbulent spectrum where the difference between the spectra is most stark.

Fig. 4

Compilation of mean free paths (normalised to the correlation length \(l_{\mathrm{c}}\)), computed from test particle simulations for isotropic turbulence as a function of gyroradius \(r_{\text{g}}\) (also normalised to \(l_{\mathrm{c}}\)). Top left: Parallel and perpendicular mean free paths \(\lambda _{\parallel}\) and \(\lambda _{\perp}\), assuming a broken power law turbulence spectrum, Eq. (48), for various values of the turbulence level \(\eta \equiv \delta B^{2} / (B_{0}^{2} + \delta B^{2})\). Top right: \(\lambda _{\parallel}\) and \(\lambda _{\perp}\), assuming a power law turbulence spectrum with cut-off, Eq. (47). Bottom left and right: Ratio \(\lambda _{\perp} / \lambda _{\parallel}\), again for a broken power law spectrum and a power law turbulence spectrum with cut-off, respectively

In the lower panels of Fig. 4, we show the ratio of perpendicular and parallel mean free paths, \(\lambda _{\perp} / \lambda _{\parallel}\). The agreement between different groups is again fair. It appears that at low turbulence levels, \(\lambda _{\perp}\) has a stronger dependence on \(r_{\text{g}}\) than \(\lambda _{\parallel}\), but this requires further tests.

It becomes apparent from the top panels of Fig. 4 that the parallel and perpendicular mean free paths start converging towards the isotropic mean-free path \(\lambda \) as \(\eta \to 1\). (For clarity, we have plotted the limit \(\eta = 1\), that is no background field, separately in Fig. 5.) Taken together, the simulations span three orders of magnitude in gyroradius and the different scalings in the low- and high-rigidity regimes are easily identified as \(\lambda \propto r_{\text{g}}^{1/3}\) for \(r_{\text{g}} \ll l_{\text{c}}\) and \(\lambda \propto r_{\text{g}}^{2}\) for \(r_{\text{g}} \gg l_{\text{c}}\). The agreement between different groups for the same spectral shape is excellent and the differences between the results for different spectral shapes are most pronounced in the small-angle scattering regime where the mean-free path is again larger for the power law with cut-off than for the broken power law shape.

Fig. 5

Compilation of mean free paths (normalised to the correlation length \(l_{\mathrm{c}}\)), computed from test particle simulations for isotropic turbulence as a function of gyroradius \(r_{\text{g}}\) (also normalised to \(l_{\mathrm{c}}\)). Left: Mean-free path \(\lambda \), assuming a broken power law turbulence spectrum, Eq. (48), without regular field, that is \(\eta = 1\). Right: \(\lambda \), assuming a power law turbulence spectrum with cut-off, Eq. (47), without regular field, that is \(\eta = 1\)

In Fig. 6 we show the dependence of the parallel and perpendicular mean free paths on \((\delta B/B_{0})\). While the parallel mean free paths are \(\propto (\delta B/B_{0})^{-2}\), as expected from QLT, the perpendicular mean free paths are closer to \(\propto (\delta B/B_{0})^{1.5}\) whereas QLT (without the FLRW contribution) predicts \(\propto (\delta B/B_{0})^{2}\). We emphasise again that for isotropic turbulence, QLT is not valid as it predicts an infinite parallel mean-free path.

Fig. 6

Compilation of mean free paths (normalised to the correlation length \(l_{\mathrm{c}}\)), computed from test particle simulations as a function of \((\delta B/B_{0})\). The dashed and dash-dotted lines show power laws \(\propto (\delta B/B_{0})^{-2}\) and \(\propto (\delta B/B_{0})^{1.5}\), respectively

In Table 1 we compare the predictions for \(\kappa _{\parallel }\) and \(\kappa _{\perp }\) from various transport theories with the results from numerical simulations.

Table 1 Comparison of parallel and perpendicular transport in simulations and theories for different turbulence geometries. Here, we assume magnetostatic turbulence

4.2 CR anisotropies and backtracking

Another application of test particle simulations is the study of anisotropies. These studies are motivated by observations of both large-scale and small-scale anisotropies that hint at limitations of the standard diffusive picture of Sect. 2.2.

In this standard picture, a small spatial gradient in the CR phase space density leads to the formation of a small dipole in the arrival directions, aligned with the direction of the regular or mean magnetic field. What matters for the formation of the dipole is the gradient over a few mean-free paths before observation and any anisotropy imprinted at larger distances will be destroyed by pitch-angle scattering. However, the phase space density \(f\) in the actual realisation of the turbulent field will in general differ from the ensemble average \(\langle f \rangle \), see the discussion in Sect. 2.1, and therefore, also the arrival directions seen by an observer will differ from the dipole predicted for the ensemble-averaged phase space density.

This reasoning has been applied by Mertsch and Funk (2015) to the CR anisotropy problem (Hillas 2005; Zirakashvili 2005; Erlykin and Wolfendale 2006; Ptuskin et al. 2006; Blasi and Amato 2012; Evoli et al. 2012; Pohl and Eichler 2013; Sveshnikova et al. 2013; Kumar and Eichler 2014; Schwadron et al. 2014; Ahlers 2016), that is the discrepancy between the measured dipole anisotropy and the one predicted in isotropic diffusion models. Test particle simulations can be used to explore the deviations of the phase space density and anisotropies from the ensemble average in particular realisations of the turbulent magnetic field.

To this end, particles are followed backward in time, starting at position \(\boldsymbol{r}_{\oplus}\) at the time \(t\) of observation and computing the trajectories back to an earlier time \(t_{0}\). For a given set of trajectories \(\{ \boldsymbol{r}_{j} \}\) from test particle simulations, we can then use Liouville’s theorem, that is the conservation of phase space density along trajectories, to connect the phase space density seen by an observer at time \(t\) and at the origin of the trajectories \(\boldsymbol{r}_{\oplus }\) to the assumed phase space density \(f(t_{0})\) at the other end of the trajectories. More specifically,

$$ f(\boldsymbol{r}_{\oplus }, \hat {\boldsymbol {p}} _{i}(t), t) \simeq f(\boldsymbol{r}_{i}(t_{0}), \hat {\boldsymbol {p}} _{i}(t_{0}), t_{0}) \, , $$
(83)

where \(\boldsymbol{r}_{i}(t')\) and \(c \hat {\boldsymbol {p}} _{i}(t')\) are the positions and velocities of a particle with position \(\boldsymbol{r}_{i}(t) = \boldsymbol{r}_{\oplus }\) and velocity \(c \hat {\boldsymbol {p}} _{i}(t)\) at observation. In order to predict the phase space density seen by an observer at time \(t\), some assumptions need to be made on the phase space density at the other ends of the trajectories, specifically at time \(t_{0}\). Usually, for \(f(t_{0})\) the random fluctuations are ignored and the ensemble-averaged \(\langle f(t_{0}) \rangle \) is adopted. Equation (83) then becomes exact in the limit of infinite backtracking time, \((t - t_{0}) \to \infty \). This is motivated by the fact that ensemble averages of second moments of the phase space density, e.g. the dipole amplitude or the angular power spectrum, are insensitive to the fluctuations \(\delta f\) at \(t_{0}\) (Ahlers and Mertsch 2015). For the ensemble average, a solution of the CR transport equation is adopted, e.g. one with a spatial gradient.
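The mechanics of Eq. (83) can be illustrated with a short Python sketch. It rests on several assumptions that are not taken from the studies cited here: units with \(c = 1\) and unit gyrofrequency in the regular field, a magnetostatic field supplied through a hypothetical callable `field`, a Boris-type rotation as the integrator, and an ensemble average of the form \(\langle f \rangle = 1 + \boldsymbol{G} \cdot \boldsymbol{r}\) with a hypothetical gradient \(\boldsymbol{G}\). For brevity the example uses a purely uniform field, so there is no scattering; in an actual application `field` would return the regular plus turbulent field of one realisation.

```python
import numpy as np

def boris_step(r, v, field, dt):
    """One time-symmetric step of dv/dt = v x B (c = 1, unit gyrofrequency)."""
    r = r + 0.5 * dt * v                         # half drift
    t = 0.5 * dt * field(r)                      # rotation vector at midpoint
    v_prime = v + np.cross(v, t)
    s = 2.0 * t / (1.0 + np.sum(t * t, axis=1, keepdims=True))
    v = v + np.cross(v_prime, s)                 # pure rotation, |v| unchanged
    r = r + 0.5 * dt * v                         # half drift
    return r, v

def backtrack(p_hat, field, duration, dt):
    """Follow particles observed at the origin with directions p_hat back in time."""
    r, v = np.zeros_like(p_hat), p_hat.copy()
    for _ in range(int(round(duration / dt))):
        r, v = boris_step(r, v, field, -dt)      # negative step = backward in time
    return r, v

def dipole_from_liouville(p_hat, r_start, gradient):
    """Assign f(r_obs, p_i(t), t) = <f>(r_i(t0)) and extract the dipole vector."""
    f = 1.0 + r_start @ gradient                 # assumed <f> = 1 + G . r
    return 3.0 * np.mean(f[:, None] * p_hat, axis=0) / np.mean(f)

def uniform_field(r):
    """Hypothetical regular field of unit strength along z, no turbulence."""
    return np.broadcast_to(np.array([0.0, 0.0, 1.0]), r.shape)

rng = np.random.default_rng(0)
p = rng.normal(size=(5000, 3))
p /= np.linalg.norm(p, axis=1, keepdims=True)
p_hat = np.concatenate([p, -p])                  # antipodal pairs, mean direction ~ 0

gradient = np.array([0.01, 0.0, 0.01])           # hypothetical, 45 degrees off the field
r_start, _ = backtrack(p_hat, uniform_field, duration=50.0, dt=0.1)
dipole = dipole_from_liouville(p_hat, r_start, gradient)
print(dipole)   # dipole in momentum direction; the arrival direction is -p_hat
```

In this scatter-free toy case the perpendicular displacement stays bounded by the gyro-orbit while the parallel one grows with the backtracking time, so the resulting dipole is nearly aligned with the field even though the gradient is not. This is the projection effect mentioned in the text, shown here only for the trivial field configuration.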

We show the dipole amplitudes computed with test particle simulations for five different realisations of the turbulent magnetic field in Fig. 7. It was shown that the intermittency effects due to the turbulent magnetic field can lead to a significant uncertainty in the prediction of the dipole amplitude and direction, both without and with a strong background field (Mertsch and Funk 2015). Together with the projection effect due to a potential misalignment between CR gradient and magnetic field direction, this can bring the predicted dipole anisotropy back into agreement with the observations.

Fig. 7

The dipole amplitude of Galactic cosmic rays (black open symbols) for five different realisations of the turbulent magnetic field under the assumption of misalignment of background field and cosmic ray gradient. Also shown are some measurements and the expectation from an isotropic diffusion model. From Mertsch and Funk (2015)

The same backtracking technique and Liouville’s theorem can be used to also investigate the appearance of anisotropies on small scales (Abdo et al. 2008; Abbasi et al. 2011; Abeysekara et al. 2014; Aartsen et al. 2016; Abeysekara et al. 2018, 2019) due to intermittency effects in small-scale turbulence (Giacinti and Sigl 2012; Ahlers and Mertsch 2015; López-Barquero et al. 2016; Pohl and Rettig 2016; Kumar et al. 2019). We refer the interested reader to the recent review by Ahlers and Mertsch (2017).

4.3 The validity of Liouville’s theorem

It has been questioned whether backtracking can be used reliably to investigate the formation of (small-scale) anisotropies (López-Barquero et al. 2017) and whether Liouville’s theorem is valid in the presence of pitch-angle scattering. We therefore provide a few comments on its validity.

First, we note that pitch-angle scattering is to be distinguished from collisions. In collisions the particle trajectory changes abruptly due to short-range forces, e.g. hard-sphere collisions in gas kinetic theory. In contrast, in collisionless plasmas each interaction between the particle and a wave-packet changes the particle’s pitch-angle only very moderately due to the small turbulent magnetic field, \(\delta B^{2} / B_{z}^{2} \ll 1\) (e.g. Kulsrud 2005). Thus, interactions with many wave-packets are needed for a particle to scatter (which can be defined as a particle changing direction by \(180^{\circ }\)). The particle trajectories are smooth since the Lorentz force mediating this change is differentiable.

Second, the validity of Liouville’s theorem is not only the basis for numerical backtracking, but is also at the heart of kinetic theory, including QLT and its non-linear extensions. If Liouville’s theorem were not applicable to collisionless plasmas in the presence of small-scale turbulence, then we would also need to abandon the majority of microscopic particle transport theories and, in fact, much of plasma theory.

It has been claimed (López-Barquero et al. 2017) that conservation of phase space density is equivalent to the conservation of the magnetic moment \(M = m v_{\perp }^{2} / (2 B)\) of individual particles which can be checked by simulating test particles in random (electro)magnetic fields. We have elsewhere already argued against this view (Ahlers and Mertsch 2017): While conservation of phase space density requires only differentiability of forces, conservation of the magnetic moment requires the magnetic field to change only adiabatically, that is \(B / |\nabla B| \gg r_{\text{g}}\) and \(B / \dot{B} \gg \Omega ^{-1}\) where \(r_{\text{g}}\) and \(\Omega \) are the gyroradius and gyrofrequency. Therefore, the conditions for the conservation of the magnetic moment are stricter and variability of the magnetic moment does not imply violation of Liouville’s theorem. Note that, of course, magnetic moment \(M\) and pitch-angle cosine \(\mu \) are closely related for fixed particle energy, such that any pitch-angle scattering necessarily implies the violation of magnetic moment conservation (Dalena et al. 2012; Weidl et al. 2015). The validity of Liouville’s theorem is however not affected by this.

Due to the equivalence of phase space volume and (negative) information entropy, it can be said that in the ensemble-average information is lost. The increase of entropy also implies that the evolution of the system is irreversible, reflecting the diffusive nature of the process. However, it is important to realise that the loss of reversibility only occurs through the ensemble averaging. By contrast, in one particular realisation of the turbulent magnetic field, even though particles scatter, phase space volume is conserved, entropy does not increase and the equations of motion are reversible. It is possible to confirm this fact in numerical test particle simulations.
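A numerical check of this kind could look like the following Python sketch. It is an illustration under assumptions of its own choosing: units with \(c = 1\) and unit gyrofrequency in \(B_{0}\), a static slab field built from three hypothetical modes with fixed phases (one "realisation"), and a time-symmetric Boris-type integrator. Particles are pushed forward, then backward with the sign of the time step flipped, and the round-trip error is compared with the change of the magnetic moment over the forward leg.

```python
import numpy as np

K = np.array([0.5, 1.0, 2.0])     # hypothetical slab wavenumbers (units 1/r_g)
A = np.array([0.3, 0.3, 0.3])     # hypothetical mode amplitudes (units B_0)
PHI = np.array([0.0, 1.0, 2.0])   # fixed phases define one realisation

def field(r):
    """Static field: B_0 along z plus transverse modes that depend on z only."""
    arg = np.outer(r[:, 2], K) + PHI
    bx, by = np.cos(arg) @ A, np.sin(arg) @ A
    return np.stack([bx, by, np.ones_like(bx)], axis=1)

def boris_step(r, v, dt):
    """Time-symmetric step of dv/dt = v x B; a step with -dt is its inverse."""
    r = r + 0.5 * dt * v
    t = 0.5 * dt * field(r)
    v_prime = v + np.cross(v, t)
    s = 2.0 * t / (1.0 + np.sum(t * t, axis=1, keepdims=True))
    v = v + np.cross(v_prime, s)
    r = r + 0.5 * dt * v
    return r, v

def magnetic_moment(r, v):
    """v_perp^2 / B with respect to the local field (factor m/2 dropped)."""
    b = field(r)
    b_norm = np.linalg.norm(b, axis=1)
    v_par = np.sum(v * b, axis=1) / b_norm
    return (np.sum(v * v, axis=1) - v_par**2) / b_norm

rng = np.random.default_rng(2)
v0 = rng.normal(size=(50, 3))
v0 /= np.linalg.norm(v0, axis=1, keepdims=True)
r0 = np.zeros((50, 3))
dt, n_steps = 0.05, 2000

r, v = r0.copy(), v0.copy()
m_start = magnetic_moment(r, v)
for _ in range(n_steps):                      # forward leg
    r, v = boris_step(r, v, dt)
m_end = magnetic_moment(r, v)
for _ in range(n_steps):                      # backward leg, same field
    r, v = boris_step(r, v, -dt)

round_trip_error = np.max(np.abs(r - r0)) + np.max(np.abs(v - v0))
moment_change = np.max(np.abs(m_end - m_start))
print(round_trip_error, moment_change)
```

The magnetic moment is expected to change appreciably because the field varies on scales comparable to the gyroradius, whereas the trajectories retrace themselves up to floating-point accuracy. The sketch tests reversibility of the discrete map only; it does not by itself measure phase space volume.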

5 Summary and outlook

In this review, we have given an overview of test particle simulations of CRs that are used to check transport theories, compute their parameters and predict observables beyond the current reach of such theories. In the first part, we summarised the findings of the current paradigm theory, QLT, and its possible extensions. In deriving the Fokker-Planck Eq. (15) and the diffusion Eq. (24), we have reviewed the salient features of QLT, that is the evaluation of the force due to the turbulent magnetic field along unperturbed trajectories and the hierarchy of time scales involved. We have introduced the three most popular analytical turbulence geometries (3D isotropic, slab and composite) and, as an example, have reviewed the derivation of the pitch-angle diffusion coefficient in slab geometry with a broken power law turbulence spectrum. Pointing out some of the shortcomings of QLT, in particular the so-called \(90^{\circ }\) problem, we have motivated the need to go beyond the simplest quasi-linear theories. For non-linear theories of CR transport, we have mostly limited ourselves to the BAM model, to NLGC theory and to WLNT.

The second part of this review was concerned with test particle simulations themselves. First, we discussed a technical but central part of running test particle simulations: the generation of the turbulent magnetic field. We have reviewed the two approaches that are regularly used, the harmonic method and the grid method. Both have advantages and disadvantages, but the grid method allows for a much faster evaluation if a large dynamic range in wavenumbers is to be considered. This is particularly true for the nested grid method. We have concluded by reviewing some of the applications of test particle simulations, the major motivation being the current lack of an agreed-upon microscopic transport theory that addresses the various issues that point beyond QLT. Extensions of QLT need to be tested against observations or simulations. Any theory necessarily relies on a certain turbulence model and since the nature of turbulence in the interstellar medium (to a lesser extent also in the interplanetary medium) is uncertain, comparing analytical approaches and numerical simulations based on the same assumed turbulence model is most reliable. We have sketched two important application cases, that is the computation of transport coefficients and the investigation of anisotropies. In doing so, we have stressed the validity of Liouville’s theorem for the phase space density before ensemble-averaging which is the basis not only of the backtracking used in anisotropy studies, but also of the analytical approaches.

Over the 25 years since their first use in CR transport studies, test particle simulations have proven a very useful tool. They have confirmed the sub-diffusive nature of perpendicular transport in slab turbulence, proving the importance of FLRW. Furthermore, parallel transport in isotropic turbulence has been shown to be diffusive where QLT predicts infinite mean-free paths. Test particle simulations have also allowed for tests of non-linear extensions of QLT, however, with no clear winner yet. Despite QLT’s deficiencies in detail, test particle simulations have reproduced some of its results in a qualitative fashion, e.g. the scaling of the parallel mean-free path with rigidity and turbulence level, justifying to a certain extent the use of such scalings in phenomenological applications to Galactic CRs. More recently and thanks to increased computing power they have proven helpful in addressing phenomenological issues, for instance the transport at the transition from the resonant to the small-angle scattering regime (Giacinti et al. 2012) or the interpretation of cosmic ray small-scale anisotropies (Ahlers and Mertsch 2015).

Open questions that should be further studied and addressed with test particle simulations are the decorrelation of trajectories which leads to the broadening of the resonance condition in non-linear extensions of QLT, the transition in transport from the ballistic to the diffusive regime and a more detailed understanding of CR anisotropies.

On the technological side, a number of improvements are needed to allow for a broader use of test particle simulations. It is widely accepted that turbulence is anisotropic in the presence of a background magnetic field. The direction of the anisotropy is determined by the effective large-scale field seen at a particular point and on a particular spatial distance scale. Yet, there have been no implementations for the generation of synthetic turbulence with anisotropies resembling those observed, for instance, in MHD simulations. The difficulty here is to allow for the direction of the anisotropy to vary over the spatial domain. The only cases we are aware of (Giacinti et al. 2018; Demidem et al. 2019) have considered low turbulence levels, such that the direction of the anisotropy is effectively the same at all positions and on all spatial scales. (Of course, this problem does not arise if the magnetic field is the result of MHD simulations.) However, we stress that in the spirit of keeping turbulence physics and transport physics apart, it would be valuable to have such a prescription. In addition, this would allow covering a larger dynamical range than with MHD simulations.

Another trend that is imminent in our view is the adoption of computing architectures other than CPUs, which is what most previous codes have been focussed on. The solution of a large number of equations of motion is perfectly amenable to single instruction, multiple data architectures like graphics processing units (GPUs). The addition of a large number of wave modes needed in the harmonic approach is another example (Tautz 2016).
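The data-parallel structure of the harmonic sum can be made explicit with a short Python sketch. NumPy on a CPU is used here only as a stand-in for a GPU array library, and none of this is taken from Tautz (2016). The helper `make_modes`, the logarithmic mode spacing, the single linear polarisation per mode and the amplitude weighting \(\propto k^{1-q}\) per mode (meant to mimic a one-dimensional spectrum \(\propto k^{-q}\)) are simplifying assumptions made for this illustration.

```python
import numpy as np

def make_modes(n_modes, k_min, k_max, delta_b_rms, rng, q=5.0 / 3.0):
    """Draw random isotropic plane-wave modes (hypothetical helper)."""
    k_mag = np.geomspace(k_min, k_max, n_modes)
    k_hat = rng.normal(size=(n_modes, 3))
    k_hat /= np.linalg.norm(k_hat, axis=1, keepdims=True)
    xi = np.cross(k_hat, rng.normal(size=(n_modes, 3)))  # perpendicular to k
    xi /= np.linalg.norm(xi, axis=1, keepdims=True)
    weight = k_mag ** (1.0 - q)            # assumed power per log-spaced mode
    amp = delta_b_rms * np.sqrt(2.0 * weight / weight.sum())
    phase = rng.uniform(0.0, 2.0 * np.pi, n_modes)
    return k_mag[:, None] * k_hat, xi, amp, phase

def delta_b(r, k_vec, xi, amp, phase):
    """Sum all modes at all positions with two matrix products."""
    arg = r @ k_vec.T + phase              # shape (n_particles, n_modes)
    return (amp * np.cos(arg)) @ xi        # shape (n_particles, 3)

def delta_b_loop(r, k_vec, xi, amp, phase):
    """Reference version with an explicit loop over modes."""
    out = np.zeros_like(r)
    for kn, xn, an, pn in zip(k_vec, xi, amp, phase):
        out += an * np.cos(r @ kn + pn)[:, None] * xn
    return out

rng = np.random.default_rng(3)
k_vec, xi, amp, phase = make_modes(100, 0.1, 10.0, 1.0, rng)
r = rng.uniform(0.0, 1000.0, size=(5000, 3))   # hypothetical particle positions

b = delta_b(r, k_vec, xi, amp, phase)
rms = np.sqrt(np.mean(np.sum(b * b, axis=1)))
print(rms)
```

Every particle evaluates the same expression on different data, which is the pattern that single instruction, multiple data hardware executes efficiently. The intermediate array of phases grows as the number of particles times the number of modes, so a production code would have to process particles or modes in chunks.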

Given the conceptual simplicity of test particle simulations of CR transport and the availability of the necessary computational resources, test particle simulations are thus one of the most important computational tools in studies of CR transport. It is, however, also necessary to point to the limitations of test particle simulations. First, as alluded to above, the comparison of the results with data is hindered by our ignorance of the underlying turbulence model. Of course, analytical transport theories suffer from the same shortcoming. Turning this argument around, we can however hope to constrain the nature of magnetised interstellar turbulence by comparing the results from test particle codes with observations, for example for anisotropies. Also, with ever increasing computational resources, computing trajectories in simulated turbulence will become increasingly important, but for the time being synthetic turbulence is more useful in investigating a number of phenomenological questions.

Second, test particle simulations ignore feedback of the cosmic rays onto the magnetised turbulence, by definition. Approaches like particle-in-cell simulations are appropriate for studying such processes in principle, but for the application of such instabilities to astrophysical phenomena, the large dynamical range between plasma skin depths and the relevant astrophysical scales is still challenging. We believe that careful hybrid approaches, combining kinetic cosmic rays with a magnetohydrodynamic background plasma, will prove most fruitful.

Another open question is the nature of MHD turbulence itself. While the Goldreich-Sridhar picture is an often employed model for anisotropic turbulence, it is based on the assumption of so-called critical balance, meaning that the Alfvénic and cascade times are identical. We note that this assumption of a single time-scale is regularly contested in the literature, see e.g. Lugones et al. (2019).