Astrophysics and cosmology notes

Jacopo Tissino, Giorgio Mentasti, Eleonora Vanzan

2020-11-03
Contents

1 Cosmography
1.1 The cosmological principle
1.2 The geometry of spacetime
1.2.1 A bidimensional example
1.2.2 Other forms of the RW metric
1.3 The energy budget of the universe
1.3.1 Energy density
1.3.2 Estimating the masses of galaxies
1.3.3 Dark energy
1.3.4 Radiation
1.3.5 The Hubble law

2 Friedmann models
2.1 A Newtonian derivation of the Friedmann equations
2.2 The equation of state
2.2.1 Common equations of state
2.3 Solutions of the Friedmann Equations
2.3.1 Einstein-De Sitter models
2.4 Measuring distances
2.5 The cosmological constant
2.5.1 Evolution of a dark energy dominated universe
2.6 Curved models
2.6.1 Positive curvature: a closed universe
2.6.2 Negative curvature: an open universe
2.6.3 Considerations on curvature

3 The thermal history of the universe
3.1 Radiation energy density and the equality redshift
3.2 Thermodynamics in the early universe
3.2.1 Number density, energy density and pressure
3.2.2 Entropy
3.2.3 Explicit expressions for the thermodynamic quantities
3.2.4 Decoupling and radiation temperature
3.3 Problems with the Hot Big Bang model, inflation
3.3.1 The cosmological horizon problem
3.3.2 The flatness problem
3.3.3 Mechanisms for inflation
3.4 Baryogenesis
3.5 Decoupling of particle species
3.6 Hydrogen recombination
3.7 Primordial nucleosynthesis
3.8 Dark Matter dynamics
3.8.1 The Boltzmann equation and decoupling
3.8.2 HDM density estimate
3.8.3 CDM density estimate

4 Stellar Astrophysics
4.1 Stellar formation
4.1.1 The freefall timescale
4.1.2 Hydrostatic equilibrium
4.2 Jeans instability
4.2.1 Star formation
4.2.2 Collapsing a solar-mass cloud
4.2.3 Conditions for stardom and brown dwarfs
4.3 The Sun and other stars
4.3.1 Radiative diffusion
4.3.2 Thermonuclear fusion
4.3.3 Stellar evolution
4.3.4 The Hertzsprung-Russell diagram
4.3.5 The interior of a Main Sequence star
4.3.6 The maximum mass
4.3.7 Degenerate electron gas
4.4 Stellar remnants
4.4.1 Full degeneracy and white dwarfs
4.4.2 The Chandrasekhar limit in more detail
4.4.3 White dwarf characteristics
4.4.4 Neutron stars
4.4.5 Relativistic corrections to the equation of state

5 Structure formation
5.1 The nonlinear evolution of a spherical perturbation
5.2 Press-Schechter theory

Introduction and relevant material


These are the (yet to be) revised notes for the course “Fundamentals of Astrophysics and
Cosmology” held by professor Sabino Matarrese in fall 2019 at the university of Padua.

They are based on the notes I took during lectures, complemented with notes from the
previous years.
The professor has not revised them yet, but is expected to do so in the future.
The exam is a traditional oral exam. There are fixed dates, but they do not matter: each student should email the professor to arrange a date and time.

Material. There is a Dropbox folder with notes by a student from the previous years [Pac18]
and handwritten notes by the professor.
There are many good textbooks, for example “Cosmology” by Coles and Lucchin [CL02].

Chapter 1

Cosmography

1.1 The cosmological principle


The basis for the modern treatment of cosmology is the Copernican principle: roughly
stated, it is “we do not occupy a special, atypical position in the universe”. We will discuss the
validity of this later in this section. It is extremely useful to make such an assumption
since it endows our model of the universe with a great deal of symmetry, which makes its
mathematical treatment manageable.
As we will discuss shortly, this principle can be combined with our observations of
isotropy to yield:

Proposition 1.1.1 (Cosmological principle). Every comoving observer observes the Universe
around them at a fixed time in their reference frame as being homogeneous and isotropic.

Comoving means moving coherently with the absolute reference frame, which is defined
as the rest frame of the cosmic fluid, which determines the geometry of the universe.
When we observe the Cosmic Microwave Background (CMB) we see that we are surrounded by radiation distributed like a blackbody of temperature $T_{\rm CMB} \approx 2.725$ K [Fix09].
This radiation is not uniformly distributed in the sky: we see a dipole modulation of around $\Delta T \approx 3.4$ mK [Pla+19]. This is due to the Doppler effect: the Solar System is not comoving with respect to the CMB.
In fact, we can measure the peculiar velocity of the Solar System this way: it comes out to be around $c \Delta T / T_{\rm CMB} \approx 370$ km/s. This cannot be explained by the movement of the Sun through the galaxy, nor by the movement of the galaxy through the Local Group: the Local Group (LG) itself is moving with respect to the absolute reference frame. In fact, the velocity of the Sun with respect to the LG and the velocity of the LG with respect to the CMB point in almost opposite directions, so the velocity of the LG with respect to the CMB is larger than that of the Solar System: it is measured to be $\approx 620$ km/s [Pla+19, Table 3].
So, the absolute reference frame can be experimentally defined as the frame of the
observer who sees the CMB with zero dipole moment.
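
As a quick numerical check of the velocity quoted above, the first-order Doppler relation $v \approx c \, \Delta T / T_{\rm CMB}$ can be evaluated directly. This is a minimal sketch: the function name `dipole_velocity` is ours, and the inputs are the rounded values quoted in the text.

```python
# Peculiar velocity of the Solar System from the CMB dipole, v = c * dT / T.
C_KM_S = 299_792.458   # speed of light in km/s
T_CMB_K = 2.725        # CMB temperature in K (value quoted in the text)
DELTA_T_K = 3.4e-3     # dipole amplitude in K (value quoted in the text)


def dipole_velocity(delta_t, t_mean, c=C_KM_S):
    """First-order Doppler estimate of the observer's speed, in the units of c."""
    return c * delta_t / t_mean


v_sun = dipole_velocity(DELTA_T_K, T_CMB_K)
print(f"v = {v_sun:.0f} km/s")
```

With these rounded inputs the result lands slightly above 370 km/s, in agreement with the estimate in the text.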
The CMB has anisotropies of the order of 20 µK (root mean square) [Wri03] at higher-order multipole moments: it is uniform to approximately 1 part in $10^5$.

The word “time” in fixed time refers to the proper time of a comoving observer, which
is called cosmic time.
Homogeneity means that the characteristics of the universe as observed from a point
are the same as they would be as observed from any other point.
Isotropy means that the characteristics of the universe as observed in a certain direction
are the same as they would be if they were observed in another direction.

On the validity of the cosmological principle. The principle is expected to hold only on very large¹ scales: at small scales we see structures, such as galaxies or our Solar System, so we surely do not have homogeneity.
How can we talk about homogeneity if we can only look at the universe from a single
point? We must assume that any other observer would also see isotropy as we do: this is
precisely what the Copernican principle tells us.
Isotropy around every point implies homogeneity. We observe isotropy, and with the
assumption that we are typical observers we obtain homogeneity.
In the end, this assumption is the basis of modern cosmology, and it has to be made before any cosmological study starts. It might not be completely correct, but it allows us to make falsifiable predictions, so we shall keep it until the models it allows us to create no longer match observations.

1.2 The geometry of spacetime


The best description for gravity we have so far is given by the general theory of relativity (GR). In it, spacetime is modelled as a 3+1-dimensional semi-Riemannian manifold with a line element which is generally given by the expression $ds^2 = g_{ab} \, dx^a dx^b$.
Latin indices take values from 0 to 3, and we adopt the “mostly minus” metric signature.
Such a spacetime can have up to 4(4 + 1)/2 = 10 global continuous symmetries, which
can be classified into:

(a) 1 time translation;

(b) 3 Lorentz boosts;

(c) 3 spatial translations;

(d) 3 spatial rotations.

¹ Larger than $260 h^{-1}$ Mpc $\approx 380$ Mpc $\approx 1.2 \times 10^{25}$ m [YBK10], where $h \approx 0.68$ [Ade+16]: this will be made clearer in later sections, but is meant to give an idea of the length scales involved. The portion of the Universe we can see is of order 10 Gpc.

The metric of Minkowski spacetime, which has all of these 10 symmetries, reads:

\begin{align}
ds^2 &= c^2 dt^2 - d\vec{x}^2 && \text{(Cartesian coordinates)} \tag{1.1a} \\
&= c^2 dt^2 - dr^2 - r^2 d\Omega^2 && \text{(spherical coordinates)}, \tag{1.1b}
\end{align}

where $r = |\vec{x}|$, while $\theta$ and $\varphi$ are the spherical angles and $d\Omega^2 = d\theta^2 + \sin^2\theta \, d\varphi^2$.
Minkowski space is maximally symmetric, which means that no spacetime could have more symmetries than these. It is not the only such spacetime: there are three maximally symmetric spacetimes, namely Minkowski, de Sitter and anti-de Sitter.
In cosmology we lose time translation symmetry, since the universe is expanding, and
Lorentz boost symmetry, since as we saw in the previous section we can tell at which speed
we are moving with respect to the CMB.
We keep the purely spatial symmetries: so, our description of spacetime will be as a
3+1-dimensional manifold which, if we fix the temporal coordinate for a comoving observer,
reduces to a maximally symmetric 3-dimensional space — for 3 dimensions the maximum
number of symmetries is 3(3 + 1)/2 = 6, which correspond to (c) and (d).
It can be shown that the most general form of a metric satisfying these conditions is the Friedmann-Lemaître-Robertson-Walker line element,² which, in the comoving frame, reads

\begin{equation}
ds^2 = c^2 dt^2 - a^2(t) \left( \frac{dr^2}{1 - k r^2} + r^2 d\Omega^2 \right). \tag{1.2}
\end{equation}

The coordinate r does not have the dimensions of a length: we choose our variables so that a(t) is a length, while r is dimensionless.³
The parameter a(t) is called the scale factor: varying it amounts to rescaling all of space.
It has the dimensions of a length, and it depends on the cosmic time t.
The parameter k is a constant describing the spatial curvature, which can always be
normalized to ±1 or 0. Universes with:

• $k = 1$ are called closed universes;

• $k = -1$ are called open universes;

• $k = 0$ are called flat universes.

We can choose the normalization of k, but its sign is fixed. Positive values correspond to $k = 1$, negative values to $k = -1$. When computing probability distributions for k, we must not consider it to be a discrete variable but a continuous one: so, the set of flat universes with $k = 0$ has zero measure, and thus zero probability under any probability density function.

² This is sometimes called just “Robertson-Walker”, or RW, or FLRW.
³ This is a matter of convention: if we normalize $|k| = 1$ then we will have $0 \le r < 1$; we could also let k be arbitrarily large; then r would also be.

1.2.1 A bidimensional example
We consider surfaces: these are the simplest manifolds which can have intrinsic curvature.

Intrinsic curvature is described by the Riemann tensor, which has 20 independent components in 4D and 6 in 3D, so it is difficult to visualize.
On the other hand, in 2D it has only 1 independent component: $R_{1212} = \frac{1}{2} R \det g_{ab}$, where R is the scalar curvature.
The scalar curvature R has an immediate geometric interpretation: it is equal to $2/(r_1 r_2)$, where $r_i$ are the radii of the osculating circles at the point; it is positive if the circles curve in the same direction, as they do for a sphere, and negative if they curve in different directions, as they do for a hyperboloid. For a flat surface we cannot define an osculating circle (at least not in both directions): its radius diverges, so the curvature vanishes.

The metric for a flat 2D plane in polar coordinates is $dl^2 = a^2 (dr^2 + r^2 d\theta^2)$. The constant a is included since we want r to be dimensionless.
The metric for the surface of a sphere is $dl^2 = a^2 (d\theta^2 + \sin^2\theta \, d\varphi^2)$, where $a^2 = R^2$ is the square radius of the sphere.
The metric for the surface of a hyperboloid is $dl^2 = a^2 (d\theta^2 + \sinh^2\theta \, d\varphi^2)$: the only difference is that trigonometric functions become hyperbolic ones.
Do note that for the sphere $\theta \in [-\pi, \pi]$ while for the hyperboloid we can in principle have $\theta \in \mathbb{R}$: this is indicative of the fact that the sphere is bounded, while the hyperboloid is not.
For both of these, let us define the variable $r = \sin\theta$ in the spherical case, and $r = \sinh\theta$ in the hyperbolic case.
As we change variable we do the following manipulation for the sphere:

\begin{align}
dr^2 &= \left( \frac{dr}{d\theta} \right)^2 d\theta^2 = \cos^2\theta \, d\theta^2 \tag{1.3a} \\
\implies d\theta^2 &= \frac{dr^2}{\cos^2\theta} = \frac{dr^2}{1 - r^2}, \tag{1.3b}
\end{align}

and similarly for the hyperboloid, except for the fact that in that case $\cosh^2\theta = 1 + \sinh^2\theta = 1 + r^2$.
So, the line elements become respectively:

\begin{align}
dl^2_{\rm sphere} &= a^2 \left( \frac{dr^2}{1 - r^2} + r^2 d\varphi^2 \right) \tag{1.4a} \\
dl^2_{\rm hyperboloid} &= a^2 \left( \frac{dr^2}{1 + r^2} + r^2 d\varphi^2 \right). \tag{1.4b}
\end{align}

We have a striking similarity to the Robertson-Walker metric: we only need to make the substitution $d\varphi \to d\Omega$ in order to recover it.
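
The change of variable above can be checked numerically: the length of a small displacement should be the same whether it is computed in the angular coordinates or in the coordinates $(r, \varphi)$ of (1.4a) and (1.4b). The following is only a sketch under our own naming (`dl2_angular`, `dl2_radial` are hypothetical helpers), using a finite difference for dr.

```python
import math


def dl2_angular(theta, dtheta, dphi, a=1.0, hyperbolic=False):
    """Line element a^2 (dtheta^2 + S(theta)^2 dphi^2), with S = sin or sinh."""
    s = math.sinh(theta) if hyperbolic else math.sin(theta)
    return a**2 * (dtheta**2 + s**2 * dphi**2)


def dl2_radial(theta, dtheta, dphi, a=1.0, hyperbolic=False):
    """Same displacement in (r, phi) coordinates, equations (1.4a) and (1.4b)."""
    f = math.sinh if hyperbolic else math.sin
    r = f(theta)
    dr = f(theta + dtheta) - r           # finite-difference approximation of dr
    sign = 1.0 if hyperbolic else -1.0   # 1 + r^2 for the hyperboloid, 1 - r^2 for the sphere
    return a**2 * (dr**2 / (1.0 + sign * r**2) + r**2 * dphi**2)


theta, dtheta, dphi = 0.7, 1e-6, 2e-6    # arbitrary point and small displacement
for hyperbolic in (False, True):
    print(hyperbolic,
          dl2_angular(theta, dtheta, dphi, hyperbolic=hyperbolic),
          dl2_radial(theta, dtheta, dphi, hyperbolic=hyperbolic))
```

The two columns agree up to terms of higher order in the displacement, which is all a finite-difference check can show.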

Figure 1.1: Plot of the functions $\sin(\theta)$ and $\sinh(\theta)$ in the interval $-\pi \le \theta \le \pi$.

We can also work backwards and rewrite the RW line element in sphere- or hyperboloid-like coordinates:

\begin{equation}
ds^2 = c^2 dt^2 - a^2 \begin{cases} d\chi^2 + \sin^2\chi \, d\Omega^2 & (k = +1) \\ d\chi^2 + \chi^2 d\Omega^2 & (k = 0) \\ d\chi^2 + \sinh^2\chi \, d\Omega^2 & (k = -1) \end{cases} \tag{1.5}
\end{equation}

where we introduce a variable $\chi$ defined so that if $k = +1$ then $r = \sin\chi$, if $k = 0$ then $r = \chi$, and if $k = -1$ then $r = \sinh\chi$.
The properties of the sphere and of the hyperboloid actually carry over to the 3D case: a space with positive curvature is bounded, while if it is flat or hyperbolic it is unbounded.
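
The relation between r and $\chi$ can be illustrated numerically: comparing (1.2) with (1.5) gives $d\chi = dr / \sqrt{1 - k r^2}$, so integrating this from 0 to r should return the $\chi$ we started from. This is a minimal sketch; the helper names `r_of_chi` and `chi_of_r` and the midpoint-rule integration are our own choices.

```python
import math


def r_of_chi(chi, k):
    """Dimensionless radial coordinate r as a function of chi for k = +1, 0, -1."""
    if k == 1:
        return math.sin(chi)
    if k == 0:
        return chi
    if k == -1:
        return math.sinh(chi)
    raise ValueError("k must be +1, 0 or -1")


def chi_of_r(r, k, steps=10_000):
    """Integrate d(chi) = dr / sqrt(1 - k r^2) from 0 to r with the midpoint rule."""
    h = r / steps
    return sum(h / math.sqrt(1.0 - k * ((i + 0.5) * h) ** 2) for i in range(steps))


chi = 0.5  # arbitrary test value
for k in (1, 0, -1):
    r = r_of_chi(chi, k)
    print(k, r, chi_of_r(r, k))  # the last column should be close to chi
```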

1.2.2 Other forms of the RW metric


If we wish to use Cartesian coordinates the RW metric takes the following expression:
\begin{equation}
ds^2 = c^2 dt^2 - a^2(t) \left( 1 + \frac{k |\vec{x}|^2}{4} \right)^{-2} \left( dx^2 + dy^2 + dz^2 \right). \tag{1.6}
\end{equation}

Universes in which a is a constant are called Einstein spaces.
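We can also change time variable: the conformal time $\eta$ is such that $dt = a(\eta) \, d\eta$, where $a(\eta) \overset{\rm def}{=} a(t(\eta))$: so, we will have

\begin{equation}
ds^2 = a^2(\eta) \left( c^2 d\eta^2 - \frac{dr^2}{1 - k r^2} - r^2 d\Omega^2 \right). \tag{1.7}
\end{equation}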


Weyl transformations are defined to be those which preserve angles locally; since angles are defined through the metric but the angle between two vectors does not change if they are rescaled, this can be translated into the condition that the metric is rescaled by a generic function:

\begin{equation}
g_{ab} \to a^2(x^i) \, g_{ab}. \tag{1.8}
\end{equation}

If two metrics are mapped into each other by a Weyl transformation, they are said to be in the same conformal class.
In our case, the dependence on the point in spacetime is reduced to a dependence on
the cosmic time only since we have symmetry with respect to spatial translations.
The conformal time is called that because if we use it we can map our spacetime at
any time to spacetime at another time using a Weyl transformation. If there is no spatial
curvature, we can also map it to flat Minkowski spacetime.

The universe we inhabit does not have conformal symmetry: a generic massive particle in it has a Compton wavelength $\lambda = h/(mc)$ which defines its interaction cross section, so if the spacetime expands an ensemble of these particles will have different dynamics.
However, conformal geometry is useful for the description of particles which have no
characteristic length, such as photons. Particles with no characteristic length are insensitive
to dilation, since they do not have a “meter” to probe the expansion of spacetime. The
photons of the CMB look like they are thermal: they were thermal in the early universe, since
they were in thermal equilibrium with matter (photons and matter particles were constantly
Compton-scattering off each other); when matter and radiation decoupled the photons
scattered for the last time and then kept travelling. The universe has since expanded by a factor of $\approx 1090$, but since the photons do not have a characteristic length their distribution can still be modelled as a blackbody distribution with an appropriately rescaled temperature. However, strictly speaking, we are not allowed to say that they are thermal, since maintaining thermal equilibrium requires interactions to occur.
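
To make the rescaling concrete: for free-streaming photons every wavelength is stretched by the same factor, so a blackbody spectrum stays a blackbody with temperature inversely proportional to the scale factor. The notes do not write this relation explicitly here, so the sketch below assumes $T \propto 1/a$ and uses the two numbers quoted in the text; `temperature_at` is a hypothetical helper name.

```python
T_CMB_TODAY_K = 2.725      # present-day CMB temperature (value quoted in the text)
EXPANSION_FACTOR = 1090.0  # expansion since last scattering (value quoted in the text)


def temperature_at(expansion_factor, t_today=T_CMB_TODAY_K):
    """Blackbody temperature when lengths were smaller by `expansion_factor`.

    Assumes T scales as 1/a, i.e. free-streaming photons.
    """
    return t_today * expansion_factor


t_last_scattering = temperature_at(EXPANSION_FACTOR)
print(f"T at last scattering: about {t_last_scattering:.0f} K")
```

Under this assumption the photons last scattered at a temperature of roughly three thousand kelvin.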
Notice that we have made no use of dynamics so far: we wrote the line element, the
solution of the Einstein equations, without the Einstein equations themselves! Of course we
can obtain the Robertson-Walker metric starting from the field equations as well, but here
we have only based our considerations on geometrical assumptions. This approach is called
Cosmography.

1.3 The energy budget of the universe
Up until now we did not consider any dynamics in our spacetime. We will discuss this
topic in more detail in later sections, but for now we give the result: the dynamics of the
universe are described by the Friedmann equations:
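\begin{align}
\dot{a}^2 &= \frac{8 \pi G_N}{3} \rho a^2 - k c^2 \tag{1.9a} \\
\ddot{a} &= - \frac{4 \pi G_N}{3} a \left( \rho + \frac{3P}{c^2} \right) \tag{1.9b} \\
\dot{\rho} &= - 3 \frac{\dot{a}}{a} \left( \rho + \frac{P}{c^2} \right) \tag{1.9c}
\end{align}

where dots denote differentiation with respect to the cosmic time t, $G_N$ is Newton’s gravitational constant, $\rho = \rho(t)$ is the energy density, and $P = P(t)$ is the isotropic pressure.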
The curvature k appears in the first equation: so we can try to measure it by comparing
the other two terms in the equations. This way, we can determine whether the universe is
flat or curved.
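In order to discuss this problem, let us establish some notation: an important parameter is $H(t) \overset{\rm def}{=} \dot{a}/a$, the Hubble parameter. We can write an equation for it from the first Friedmann one:

\begin{equation}
H^2 = \frac{8 \pi G}{3} \rho - \frac{k c^2}{a^2}. \tag{1.10}
\end{equation}

If $k = 0$, then we must have a critical energy density

\begin{equation}
\rho_C(t) = 3 H^2(t) / (8 \pi G), \tag{1.11}
\end{equation}

and we define $\Omega(t) = \rho(t)/\rho_C(t)$. For a flat universe, $\Omega = 1$, and in general we can determine $k = \operatorname{sign}(\Omega - 1)$. To see this, we divide equation (1.10) through by $H^2$ and then take the sign of both sides:

\begin{align}
1 &= \frac{8 \pi G}{3} \frac{\rho}{H^2} - k \frac{c^2}{a^2 H^2} \tag{1.12} \\
\frac{\rho}{\rho_C} - 1 &= k \frac{c^2}{a^2 H^2} \tag{1.13} \\
\operatorname{sign}(\Omega - 1) &= \operatorname{sign}\left( k \frac{c^2}{a^2 H^2} \right) = \operatorname{sign}(k) = k. \tag{1.14}
\end{align}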

This is a promising way to measure the curvature of the universe. As we will see, we
can infer the densities of the various constituents of the universe through their dynamics.
Notice that the measurement of the energy density is a “Newtonian” measurement, while
that of the geometry of spacetime is a General Relativity one.
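
As an illustration of how a density measurement translates into geometry, equation (1.13) with the normalization $|k| = 1$ can be solved for the scale factor, $a = c / (H \sqrt{|\Omega - 1|})$. The sketch below uses hypothetical values of $\Omega$ chosen only for illustration, and the function names are ours.

```python
import math

C_KM_S = 299_792.458  # speed of light in km/s


def curvature_sign(omega):
    """Return k = sign(Omega - 1), as in equation (1.14)."""
    return (omega > 1.0) - (omega < 1.0)


def curvature_scale_mpc(omega, hubble_km_s_mpc):
    """Scale factor a = c / (H sqrt|Omega - 1|) in Mpc, from (1.13) with |k| = 1."""
    if omega == 1.0:
        return math.inf  # flat universe: (1.13) does not fix a
    return C_KM_S / (hubble_km_s_mpc * math.sqrt(abs(omega - 1.0)))


# Hypothetical inputs: Omega within one percent of unity, H = 70 km/s/Mpc.
for omega in (0.99, 1.0, 1.01):
    print(omega, curvature_sign(omega), curvature_scale_mpc(omega, 70.0))
```

Even a one-percent departure from $\Omega = 1$ corresponds to a curvature scale of tens of Gpc, which is why small uncertainties in the density matter so much for determining k.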
The alternative is to measure the curvature geometrically; however, we would need to look at really large scales in order to see any effect: we are thus drawn to the CMB. Unfortunately, the geometrical effects of spatial curvature on the CMB power spectrum⁴ are highest for the lowest multipoles, for which we have the largest variance. This means that the direct geometric effects of curvature cannot be discerned in the CMB power spectrum; the dynamical effects, however, can.

⁴ The power spectrum is, roughly speaking, the set of the square moduli of the coefficients in the spherical-harmonics decomposition, classified according to the multipole index $\ell$; the higher $\ell$, the smaller the angular scale we are considering.
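Further, we define the Hubble constant:

\begin{equation}
H_0 = H(t_0) = 100 h \ \text{km s}^{-1} \, \text{Mpc}^{-1} \approx 70 \ \text{km s}^{-1} \, \text{Mpc}^{-1} \tag{1.15}
\end{equation}

where $h \approx 0.7$ is a number, and $t_0$ just means now.⁵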


The reason for this peculiar way to write the constant is that historically it has been difficult to determine the value of $H_0$ precisely, and it affects many astronomical conversions: keeping it indeterminate in this way allows us to quickly update our old estimates if we measure $H_0$ more precisely later. Historically, in the American school, the pupils of Hubble thought $h \sim 0.5$, while the French school thought $h \sim 1$. Now, a great issue in cosmology is the disagreement between the measurements of $H_0$ obtained from the cosmic distance ladder and those obtained from the CMB [Won+19].
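
To get a feeling for the units, the Hubble constant can be converted to inverse seconds, and its inverse to years, for different values of h. This is a sketch only: the parsec is built from its definition as 1 AU divided by one arcsecond in radians, the AU and year lengths are standard values not given in these notes, and the function names are our own.

```python
import math

AU_M = 1.495978707e11                    # astronomical unit in metres (IAU value, assumed)
ARCSEC_RAD = math.pi / (180.0 * 3600.0)  # one arcsecond in radians
PC_M = AU_M / ARCSEC_RAD                 # parsec in metres
MPC_M = 1e6 * PC_M                       # megaparsec in metres
YEAR_S = 365.25 * 24 * 3600              # Julian year in seconds (assumed convention)


def hubble_si(h):
    """H0 = 100 h km/s/Mpc converted to 1/s."""
    return 100.0 * h * 1e3 / MPC_M


def hubble_time_gyr(h):
    """1/H0 in billions of years."""
    return 1.0 / hubble_si(h) / YEAR_S / 1e9


for h in (0.5, 0.7, 1.0):
    print(f"h = {h}: H0 = {hubble_si(h):.3e} 1/s, 1/H0 = {hubble_time_gyr(h):.1f} Gyr")
```

The historical range $0.5 \lesssim h \lesssim 1$ thus corresponds to a factor of two in the characteristic timescale $1/H_0$.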

1.3.1 Energy density


In order to determine the spatial curvature of the universe we need to look at $\Omega \propto \rho$: we need to measure the energy density of the universe.
How do we do it?
Let us start by considering the energy density today (index 0) due to galaxies (index g): $\rho_{0g}$.
We do not directly observe the mass⁶ of galaxies: we can only measure their luminosities.
So, we do the following: we compute the mean value of $\rho$, the mass per unit volume, with the aid of the galaxy luminosity per unit volume $\ell$: the mean density is given by the mean luminosity times the average ratio of mass to luminosity of galaxies:⁷

\begin{equation}
\langle \rho \rangle \sim \langle \ell \rangle \left\langle \frac{M}{L} \right\rangle, \tag{1.18}
\end{equation}

where $\langle M/L \rangle$ is the average ratio of mass over luminosity per galaxy: we had a ratio of densities, but since we are considering averages we can integrate numerator and denominator over the spatial volume.
⁵ Also, pc means “parsec”:

\begin{equation}
1 \ \text{pc} = \frac{1 \ \text{AU}}{1 \ \text{arcsec}} \approx 3.085 \times 10^{16} \ \text{m} \approx 3.26 \ \text{ly}, \tag{1.16}
\end{equation}

where the angle is to be interpreted as dimensionless (in radians), and AU is an astronomical unit, the average Earth-Sun distance. The definition is as such because of a way we have to measure the distances to nearby objects, by measuring their parallax between winter and summer: if they are close enough, they will appear to have moved with respect to more distant objects, because of the movement of the Earth around the Sun.
⁶ Here the terms mass and energy are used equivalently: the velocity of galaxies with respect to the CMB is nonrelativistic, so we approximate $E = \gamma m c^2 \approx m c^2$.
⁷ Formally, the steps are

\begin{equation}
\langle \rho \rangle = \left\langle \ell \frac{\rho}{\ell} \right\rangle \approx \langle \ell \rangle \left\langle \frac{\rho}{\ell} \right\rangle = \langle \ell \rangle \left\langle \frac{M}{L} \right\rangle, \tag{1.17}
\end{equation}

where we made the assumption of the ratio $\rho/\ell$ being uncorrelated to the luminosity $\ell$. This is not precisely verified, but we are giving order-of-magnitude estimates so this is close enough for our purposes.
The ratio $\langle M/L \rangle$ is measured in units of solar mass over solar luminosity: $M_\odot / L_\odot$. Reference values for these are $M_\odot \sim 1.99 \times 10^{33}$ g, while $L_\odot \sim 3.9 \times 10^{33}$ erg s$^{-1}$.
We denote $\langle \ell \rangle \overset{\rm def}{=} \mathcal{L}_g$: it is the mean (intrinsic, bolometric⁸) luminosity of galaxies per unit volume.
By definition, it is given by

\begin{equation}
\mathcal{L}_g = \int_0^\infty L \, \Phi(L) \, dL, \tag{1.19}
\end{equation}

where $\Phi(L)$ is the number density of galaxies per unit volume and unit luminosity: the luminosity function.
The Schechter function is an empirical estimate for the shape of this distribution:

\begin{equation}
\Phi(L) = \frac{\Phi_*}{L_*} \left( \frac{L}{L_*} \right)^{-\alpha} \exp\left( - \frac{L}{L_*} \right), \tag{1.20}
\end{equation}

where $\Phi_*$, $L_*$ and $\alpha$ are parameters, with dimensions of respectively a number density, a luminosity and a pure number.
These can be fit by observation: we find $\Phi_* \approx 10^{-2} h^3$ Mpc$^{-3}$, $L_* \approx 10^{10} h^{-2} L_\odot$ (a typical galaxy contains roughly ten billion Suns) and $\alpha \approx 1$.
The integral for $\mathcal{L}_g$ converges despite the divergence of $\Phi(L)$ as $L \to 0$, since it is multiplied by L: so we do not really need to worry about the low-luminosity divergence of the distribution.
The result of the integral for a generic value of $\alpha$ is $\mathcal{L}_g = \Phi_* L_* \Gamma(2 - \alpha)$, where $\Gamma$ is the Euler gamma function; for the $\alpha = 1$ case we get a factor $\Gamma(2 - 1) = 1$.
Numerically, inserting reasonable estimates for the parameters, we get the following estimate for the mean luminosity: $\mathcal{L}_g \approx (2.0 \pm 0.7) \times 10^8 h L_\odot$ Mpc$^{-3}$.
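
The closed form $\Phi_* L_* \Gamma(2 - \alpha)$ can be compared with a direct numerical integration of (1.19). This is a rough sketch using the round parameter values quoted above with $h = 1$; the truncation of the integral at $50 L_*$, the midpoint rule and the function names are our own choices.

```python
import math


def schechter(L, phi_star, L_star, alpha):
    """Schechter luminosity function, equation (1.20)."""
    x = L / L_star
    return phi_star / L_star * x ** (-alpha) * math.exp(-x)


def luminosity_density_numeric(phi_star, L_star, alpha, x_max=50.0, steps=20_000):
    """Midpoint-rule estimate of the integral (1.19), truncated at L = x_max * L_star."""
    h = x_max * L_star / steps
    total = 0.0
    for i in range(steps):
        L = (i + 0.5) * h
        total += L * schechter(L, phi_star, L_star, alpha) * h
    return total


def luminosity_density_closed(phi_star, L_star, alpha):
    """Closed form Phi_* L_* Gamma(2 - alpha)."""
    return phi_star * L_star * math.gamma(2.0 - alpha)


# Round values from the text with h = 1: Phi_* in Mpc^-3, L_* in solar luminosities.
PHI_STAR, L_STAR, ALPHA = 1e-2, 1e10, 1.0
print(luminosity_density_numeric(PHI_STAR, L_STAR, ALPHA))
print(luminosity_density_closed(PHI_STAR, L_STAR, ALPHA))
```

With these round inputs both numbers are of order $10^8 L_\odot$ Mpc$^{-3}$, the same order of magnitude as the quoted estimate, which comes from more careful fits.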
Now, we must estimate h M/Li. The luminosity of galaxies can be measured readily, the
great difficulty lies in estimating their mass.

1.3.2 Estimating the masses of galaxies


We must distinguish between the different shapes of the galaxies: spiral galaxies are
characterized by rotation of the stars about the galactic center, while in elliptical galaxies
the stars’ motion is disordered.

Spiral galaxies

If we see a spiral galaxy edge-on, we will have a side of it coming in our direction, and
the other side moving away from us (after correcting for other sources of Doppler shift,
such as the velocity of the whole galaxy). So, using the Doppler effect we can measure the
distribution of the velocity in the galaxy as a function of the radius.
In order to get a theoretical model, we can approximate the galaxy as a sphere: this is
very rough (spiral galaxies are closer to being disk-like), but it gives the same qualitative
result, so there is no need for a more precise model in this context.
⁸ Bolometric means “total, over all wavelengths”, as opposed to the luminosity in a certain wavelength band, which is easier to measure in astronomy.

Figure 1.2: A rough plot of the Schechter function for $\alpha = 1$, showing $\Phi / \Phi_*$ against $L / L_*$ on logarithmic axes.

We model the galaxy velocity distribution using Newtonian mechanics: the GR corrections are negligible at these scales, since galaxies are much larger than their Schwarzschild radii.
Equating the gravitational acceleration $GM/R^2$ to the centripetal acceleration $v^2/R$ we find:

\begin{equation}
v = \sqrt{\frac{GM}{R}}, \tag{1.21}
\end{equation}

where v and M are functions of R: M is the mass contained within the sphere of radius R, and v is the orbital velocity at its boundary.
In the inner regions of the galaxy, where $M(R) \propto R^3$ since the density is approximately constant, we will have $v \propto R$, while in the outskirts of the galaxy M(R) will not change much, since all the mass is inside, so we will have $v \propto R^{-1/2}$.
Our prediction is then a roughly linear region, followed by a region with $v \propto R^{-1/2}$. This is shown as the “Predicted” curve in figure 1.3. A plot with real data can be found in Garrett and Duda [GD11, fig. 1], which in turn quotes a seminal paper by Rubin [Rub83].
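
The predicted curve can be sketched numerically for the simplest case, a uniform sphere with a sharp edge. The galaxy mass and radius below are hypothetical round numbers chosen only to give realistic velocities, and the physical constants are standard values not taken from these notes.

```python
import math

G_SI = 6.674e-11     # Newton's constant in m^3 kg^-1 s^-2
M_SUN_KG = 1.989e30  # solar mass in kg
KPC_M = 3.0857e19    # kiloparsec in metres


def enclosed_mass(R, M_total, R_edge):
    """Mass within radius R for a uniform sphere of radius R_edge."""
    return M_total * min(R / R_edge, 1.0) ** 3


def circular_velocity(R, M_total, R_edge):
    """Equation (1.21): v = sqrt(G M(R) / R), in m/s."""
    return math.sqrt(G_SI * enclosed_mass(R, M_total, R_edge) / R)


# Hypothetical galaxy: 1e11 solar masses inside 10 kpc.
M_TOT, R_EDGE = 1e11 * M_SUN_KG, 10 * KPC_M
for R_kpc in (2.5, 5, 10, 20, 40):
    v = circular_velocity(R_kpc * KPC_M, M_TOT, R_EDGE)
    print(f"R = {R_kpc:5.1f} kpc  v = {v / 1e3:6.1f} km/s")
```

Doubling R doubles v inside the edge, while quadrupling R outside it halves v, as expected from $v \propto R$ and $v \propto R^{-1/2}$.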
Instead of this, when in the 1980s people started to be able to measure this curve accurately they saw that, after the linear region, v(R) was approximately constant. So, is Newtonian gravity wrong?

Figure 1.3: Predicted and observed velocity distribution v(R) for galaxies. The point at which they start to diverge is approximately the radius of the bulk of the galaxy. This plot is approximate: realistic models do not have such sharp corners, since there will not be a precise edge after which the density drops to zero immediately.

An option to solve this problem was proposed by Milgrom and collaborators: it is called MOdified Newtonian Dynamics, or MOND. They propose that gravity is not actually always described by an $r^{-1}$ potential, but instead behaves differently at low accelerations. Specifically, they posit that the gravitational acceleration g should be modulated by a factor $\mu(g/a_0)$, where $a_0 \approx 8 \times 10^{-10} h^2$ m/s$^2$ is a characteristic acceleration while $\mu(x)$ is a dimensionless function such that $\mu \to 1$ for $x \gg 1$ and $\mu \to x$ for $x \ll 1$, such as $\mu = x/(1 + x)$ [BM84].
This approach is Newtonian, but there are also relativistic MOND variants. They do not match observation as well as the alternative approach does.⁹ The heaviest thing weighing against MOND is the fact that even using it we still need dark matter in order to fully explain observations.

⁹ MOND would be compatible with the speed of gravity being less than the speed of light, which is equivalent to the graviton being massive. Recent measurements of gravitational waves seem to agree with the general-relativistic prediction that it is massless.
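
To see how such a modulation acts, we can assume the usual way of writing it, $\mu(g/a_0) \, g = g_N$ with $g_N$ the Newtonian acceleration (the notes only say “modulated by a factor”, so this form is an assumption here). With $\mu = x/(1 + x)$ it becomes a quadratic equation for g. The sketch below solves it, taking $h = 0.7$ for illustration.

```python
import math


def mond_acceleration(g_newton, a0):
    """Solve mu(g/a0) * g = g_newton for g, with mu(x) = x / (1 + x).

    The relation mu(g/a0) * g = g_newton is assumed, not taken from the notes.
    """
    # mu(g/a0) * g = g^2 / (a0 + g), so g^2 - g_newton * g - g_newton * a0 = 0.
    return 0.5 * (g_newton + math.sqrt(g_newton**2 + 4.0 * g_newton * a0))


A0 = 8e-10 * 0.7**2  # a0 from the text in m/s^2, taking h = 0.7 (an assumption)
for g_n in (1e-6, 1e-10, 1e-14):
    g = mond_acceleration(g_n, A0)
    print(f"g_N = {g_n:.0e}  g = {g:.3e}  sqrt(g_N a0) = {math.sqrt(g_n * A0):.3e}")
```

For $g_N \gg a_0$ we recover $g \approx g_N$, while for $g_N \ll a_0$ we get $g \approx \sqrt{g_N a_0}$. Under this assumption, a point mass gives $v^2/R \approx \sqrt{G M a_0}/R$ in the low-acceleration regime, so v no longer depends on R, which is how the modification produces flat rotation curves.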

Dark matter

The alternative option is that Newtonian mechanics describes galactic mechanics well, but the galaxy’s matter distribution inferred from our observations is actually smaller than the real distribution, which extends outward further than the matter we can see: this is dark matter.
We’d need mass obeying $M(R) \propto R$ in order to have a constant value of v: since $M(R) = 4\pi \int_0^{R} R'^2 \rho(R') \, dR'$, we need the density profile to decay like $\rho(R) \propto R^{-2}$. This is called an isothermal density profile: we call it the dark matter halo, which surrounds all spiral galaxies.
We do not know what dark matter is: we can say that it interacts gravitationally but not electromagnetically. People tend to believe that it is made up of beyond-the-standard-model particles, like a neutralino or an axion. Historically, people thought the effect could be due to massive neutrinos; however their mass would need to be around 30 eV, and the analysis of the CMB data showed that the sum of the masses of the neutrino species must be $\sum m_\nu < 0.120$ eV [Pla+19, Table 7].
The total density of matter (dark+regular) is $\sim 6$ times more than that of regular matter alone: this must be accounted for in our estimate of the M/L ratio for spiral galaxies (since we have additional mass but not additional luminosity): with this correction, we find that for spiral galaxies

\begin{equation}
\left\langle \frac{M}{L} \right\rangle \approx 300 h \frac{M_\odot}{L_\odot}. \tag{1.22}
\end{equation}

Historically, this was the first evidence for dark matter.
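
The link between an $R^{-2}$ density profile and a flat rotation curve can be verified with a short numerical integration of the enclosed mass. This is a sketch in arbitrary units (G = 1, unit normalization of the profile), with function names of our own choosing; only the scaling with R is meaningful.

```python
import math


def enclosed_mass(R, rho0, R0, steps=1_000):
    """M(R) = 4 pi * integral of R'^2 rho(R') dR' for rho = rho0 (R0 / R')^2."""
    h = R / steps
    total = 0.0
    for i in range(steps):
        Rp = (i + 0.5) * h                     # midpoint of the i-th radial shell
        total += Rp**2 * rho0 * (R0 / Rp) ** 2 * h
    return 4.0 * math.pi * total


def circular_velocity(R, rho0, R0, G=1.0):
    """v = sqrt(G M(R) / R), in units where G = 1 by default."""
    return math.sqrt(G * enclosed_mass(R, rho0, R0) / R)


for R in (1.0, 2.0, 4.0, 8.0):
    print(R, enclosed_mass(R, 1.0, 1.0), circular_velocity(R, 1.0, 1.0))
```

The enclosed mass grows linearly with R and the velocity column stays constant, as required by the observed curves.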

Elliptical galaxies

If galaxies are not spiral-shaped, we have to weigh them in a different way: the Doppler
broadening of spectral lines gives us a measure of the root-mean-square velocity.
Later in the course we will obtain the (nonrelativistic) virial theorem; for now we just state it:
if $T$ is the kinetic energy of a gravitationally bound system and $U$ is its potential energy, then

$$ 2T + U = 0 \tag{1.23} $$

holds when the inertia tensor stabilizes, that is, when we have dynamical equilibrium.
The kinetic energy is $T = \frac{3}{2} M \langle v_r^2 \rangle$, where $v_r$ is the radial (footnote 10) component of the velocity,
which we expect to account for one third of the total kinetic energy by the equipartition theorem. $M$ is
the total mass of the galaxy.
10 By “radial” we mean directed towards us, not towards the center of the galaxy.

The potential energy, instead, is $U = -GM^2/R$. Substituting these expressions into the
virial theorem we get

$$ 2 \times \frac{3}{2} M \langle v_r^2 \rangle - \frac{GM^2}{R} = 0 \tag{1.24} $$
$$ M = \frac{3R}{G} \langle v_r^2 \rangle , \tag{1.25} $$

so if we can measure $\langle v_r^2 \rangle$ through Doppler broadening and we can give a reasonable
estimate for the radius $R$ of the galaxy, we can give an estimate for $M$.
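To get a feeling for the numbers, here is a minimal Python sketch of the estimate in equation (1.25). The velocity dispersion (200 km/s) and radius (10 kpc) are illustrative assumptions, not measurements of a specific galaxy.

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg
KPC = 3.0857e19    # one kiloparsec, m


def virial_mass(sigma_r_km_s, radius_kpc):
    """Mass from eq. (1.25), M = 3 R <v_r^2> / G, returned in solar masses."""
    v2 = (sigma_r_km_s * 1.0e3) ** 2   # <v_r^2> in m^2/s^2
    radius = radius_kpc * KPC          # R in m
    return 3.0 * radius * v2 / G / M_SUN


# Assumed (illustrative) inputs: sigma_r = 200 km/s, R = 10 kpc.
mass = virial_mass(200.0, 10.0)
print(f"M ~ {mass:.2e} M_sun")
```

Since $M \propto R \langle v_r^2 \rangle$, doubling the measured dispersion at fixed radius quadruples the inferred mass.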

Global matter contributions

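Accounting for the dark matter mass, we get $\langle M/L \rangle \approx 300 h \, M_\odot/L_\odot$.

The value of the critical energy density today, $\rho_{0c}$, is given by (footnote 11)

$$ \rho_{0c} = \frac{3H_0^2}{8\pi G} \approx 1.88 \times 10^{-29} h^2 \ \mathrm{g/cm^3} , \tag{1.26} $$

so, in order to have $\Omega = 1$, we’d need $\langle M/L \rangle$ to be equal to:

$$ \left\langle \frac{M}{L} \right\rangle = \frac{\rho_{0c}}{L_g} \approx \frac{1.88 \times 10^{-29} h^2 \ \mathrm{g/cm^3}}{2 \times 10^{8} h \, L_\odot/\mathrm{Mpc}^3} \approx 1390 h \, M_\odot/L_\odot . \tag{1.27} $$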
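The arithmetic behind equations (1.26) and (1.27) can be checked with a few lines of Python. This is only a sketch: the constants are rounded CGS values, and the luminosity density is the figure quoted in (1.27).

```python
import math

G = 6.674e-8         # gravitational constant, cm^3 g^-1 s^-2
MPC_CM = 3.0857e24   # one megaparsec in cm
M_SUN_G = 1.989e33   # solar mass in g

# H0 = 100 h km/s/Mpc; we work with h = 1 and restore powers of h by hand.
H100 = 100.0 * 1.0e5 / MPC_CM                     # s^-1
rho_crit = 3.0 * H100**2 / (8.0 * math.pi * G)    # g/cm^3, times h^2

lum_density = 2.0e8                               # L_sun/Mpc^3, times h (from eq. 1.27)
rho_crit_msun_mpc3 = rho_crit * MPC_CM**3 / M_SUN_G
ml_critical = rho_crit_msun_mpc3 / lum_density    # M_sun/L_sun, times h

print(f"rho_0c ~ {rho_crit:.3e} h^2 g/cm^3")
print(f"<M/L> needed for Omega = 1: ~ {ml_critical:.0f} h M_sun/L_sun")
print(f"Omega implied by <M/L> = 300 h: ~ {300.0 / ml_critical:.2f}")
```

Comparing the observed $\langle M/L \rangle \approx 300h$ with the critical value gives a matter fraction of roughly 0.2, independently of $h$.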

We can define quantities of the form $\Omega_{0i} = \rho_{0i}/\rho_{0c}$, where $i$ is a type of matter, such as
baryonic matter, dark matter, dark energy, radiation and so on, whose density is represented
as $\rho_{0i}$.
These variables quantify how much of the cosmic energy budget is accounted for, at the
present time, by that type of matter.
So, only $\Omega_{0b} \approx 5\,\%$ of the energy budget is given by baryonic matter (not all of which
is visible), while around $\Omega_{0DM} \approx 27\,\%$ is dark matter. Together, they are just denoted as
“matter”, and $\Omega_{0m} \approx 30\,\%$.
We can ask ourselves: is dark matter actually baryonic matter which for some reason we
cannot see, such as black holes or brown dwarfs?
This cannot be the case: our observations, combined with models for primordial
nucleosynthesis, give the following bounds for the baryonic energy density:

$$ 0.013 \le \Omega_{0b} h^2 \le 0.025 . \tag{1.28} $$

The upper bound for $\Omega_{0b}$ is around $2.5\,\%/h^2 \approx 5.4\,\%$.


This would seem to indicate that $\Omega_0 \approx 0.3 \ll 1$: however we are failing to consider
a crucial contribution. Consider the second Friedmann equation (1.9b): in the Newtonian
limit $P \sim 0$ while $\rho > 0$, so we get $\ddot{a} < 0$: the expansion of the universe decelerates. This is not
what is observed: we actually see it in accelerated expansion.
11 Do note that the numerical figure, $1.88 \times 10^{-29}$, is approximate but its value is known to at least four
significant digits, since the only source of uncertainty in it lies in the uncertainty in our measurement of $G$:
all the uncertainty in $H_0$ is expressed in variable form, with the parameter $h$.

1.3.3 Dark energy
The measurements leading to this conclusion are performed by estimating the distance
and redshift of far-away objects whose intrinsic luminosity is well known, called standard
candles: the most commonly used are type Ia supernovae and Cepheid variables.
So, if the expansion is accelerated then $\ddot{a} > 0$: this means, again from the second
Friedmann equation (1.9b), that $P < -\rho c^2/3$. This is commonly expressed by defining
$w = P/\rho c^2$. In order to have accelerated expansion we need $w < -1/3$; what is observed is
closer to $w \sim -1$.
This negative pressure has the effect of a tension, pulling the universe apart. As we
will discuss in section 2.5, a candidate for a cosmological fluid with negative pressure
(specifically, $w = -1$) is a cosmological constant term, called $\Lambda$, which can be inserted in the
Einstein Field Equations. It is not the only possible one: dark energy is defined to be what causes
the accelerated expansion we see, so it could be constituted by any kind of fluid which is uniformly
distributed in space and which has negative pressure.
We cannot directly see either dark matter or dark energy: how do we distinguish the
two? Dark matter tends to cluster, while dark energy is uniformly distributed; furthermore,
dark energy has negative pressure.
From observations of both the anisotropies in the CMB and the distribution of galaxies
we can determine that $\Omega_\Lambda \approx 0.7$.

1.3.4 Radiation
We still need to compute the contribution of the energy of electromagnetic and neutrino
radiation to the total energy balance. Let us start with EM radiation: the greatest fraction
of the radiation energy density is contained in the CMB (footnote 12), which is extremely close to a
Planckian distribution:

$$ B(\nu, T) = \frac{2h\nu^3}{c^2} \, \frac{1}{\exp\left( \frac{h\nu}{k_B T} \right) - 1} , \tag{1.29} $$

12 This fact is not obvious: the CMB permeates the universe, but each photon in it has a much lower energy
than the typical photon emitted by a star. Also, stellar fusion is quite efficient, being able to convert around
0.7 % of the mass of its baryons into energy in the form of photons. If all the baryonic matter in the universe
underwent fusion, this would give us a contribution of the order of $\Omega_{0b} \times 0.7\,\% \approx 5\,\% \times 0.007 \approx 3 \times 10^{-4}$, which
as we will see in a moment is of the same order of magnitude as the CMB contribution.
However, assuming that all baryonic matter undergoes fusion leads us to overestimate the result: first, a large
fraction of the baryons in the universe are not in star-forming galaxies but are instead distributed in filaments
among them, in what is called the Warm-Hot Intergalactic Medium, which accounts for around 30 % of the
baryonic density in the universe [dGra+19].
Also, in a given galaxy only a small fraction of the baryonic mass is contained in stars: roughly speaking,
15 % [And10, fig. 11].
A third point: stars only started forming in the relatively-early universe, with redshifts of around $z \sim 10$ to $30$.
As for high-energy (UV) photons, their contribution is truly negligible: their intensity is around $3\ \mathrm{nW/m^2/sr}$
[Col18, page 2], corresponding to a density of around $10^{-36}\ \mathrm{g/cm^3}$ (we multiply by $4\pi/c^3$ to find this result).
The same paper estimates the infrared density at around 4 times the UV one, which still means it is a couple
of orders of magnitude below the CMB density.

with $T = T_{0\gamma} \approx (2.725 \pm 0.001)$ K. $B$ is a measure of spectral intensity: it is measured in
units of energy per unit time, area, solid angle and frequency. This is a well-known
distribution, whose integral is given by

$$ \rho_{0\gamma} = \frac{\sigma_r T_{0\gamma}^4}{c^2} = 4.6 \times 10^{-34} \ \mathrm{g\,cm^{-3}} , \tag{1.30} $$

where $\sigma_r = \pi^2 k_B^4/(15 \hbar^3 c^3)$, while $\sigma_{SB} = \sigma_r c/4$.

So, the radiation contribution to the global energy balance is $\Omega_{0\gamma} \approx 2.5 \times 10^{-5} h^{-2} <
0.01\,\%$, definitely negligible.
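The numbers in equation (1.30) and the resulting density parameter can be reproduced with the short Python sketch below. It uses SI values of the constants and the critical density figure from equation (1.26); the variable names are ours.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 2.99792458e8        # speed of light, m/s

# Radiation constant sigma_r = pi^2 k_B^4 / (15 hbar^3 c^3), in J m^-3 K^-4.
sigma_r = math.pi**2 * K_B**4 / (15.0 * HBAR**3 * C**3)

T_CMB = 2.725                               # K
rho_gamma = sigma_r * T_CMB**4 / C**2       # mass density in kg/m^3
rho_gamma_cgs = rho_gamma * 1.0e-3          # 1 kg/m^3 = 1e-3 g/cm^3

RHO_CRIT_H2 = 1.88e-29                      # g/cm^3 per h^2, from eq. (1.26)
omega_gamma_h2 = rho_gamma_cgs / RHO_CRIT_H2

print(f"rho_0gamma ~ {rho_gamma_cgs:.2e} g/cm^3")
print(f"Omega_0gamma h^2 ~ {omega_gamma_h2:.2e}")
```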
We are going to show in later sections that if neutrinos were massless, their temperature
would be $T_\nu = (4/11)^{1/3} T_\gamma < T_\gamma$.
They might not be massless, and if they are not, their main contribution to the energy
density will come from their masses. However, recent observations (for example,
by the Planck satellite) are bounding the mass of the neutrinos, $\sum m_\nu \le 0.12$ eV: we have

$$ \rho_\nu = 3 N_\nu \frac{\langle m_\nu \rangle}{10\ \mathrm{eV}} \, 10^{-30} \ \mathrm{g\,cm^{-3}} , \tag{1.31} $$

where $N_\nu$ is the number of neutrino species. Even if we assume the upper bound, $N_\nu \langle m_\nu \rangle =
0.12$ eV, we get $\Omega_{0\nu} < 0.5\,\%$.
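A quick way to check this bound is to evaluate equation (1.31) directly, as in the Python sketch below. The value $h = 0.7$ is an assumption made here only to turn $\Omega_{0\nu} h^2$ into a number.

```python
RHO_CRIT_H2 = 1.88e-29   # g/cm^3 per h^2, from eq. (1.26)


def neutrino_density(sum_masses_ev):
    """rho_nu in g/cm^3 from eq. (1.31), with N_nu * <m_nu> = sum_masses_ev."""
    return 3.0 * (sum_masses_ev / 10.0) * 1.0e-30


def omega_nu(sum_masses_ev, h=0.7):
    """Density parameter of neutrinos; h = 0.7 is an assumed value."""
    return neutrino_density(sum_masses_ev) / (RHO_CRIT_H2 * h**2)


omega_upper = omega_nu(0.12)
print(f"Omega_0nu < {100.0 * omega_upper:.2f} %")
```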

Conclusions

If we add up all the contributions to $\Omega = \Omega_b + \Omega_{DM} + \Omega_\Lambda$ (neglecting, as we said,
the contributions by EM radiation and neutrinos) we find experimentally
$\Omega_k \overset{\mathrm{def}}{=} 1 - \Omega \approx \left( 5^{+38}_{-40} \right) \times 10^{-4}$ [Pla+19, Table A.2].

So, with the observational uncertainties we currently have we cannot determine the
sign of the universe’s spatial curvature: the value of $\Omega_k$ is very much compatible with 0.
Even though one is drawn to say that this means that an open universe is more likely, no
particular meaning should be drawn from the fact that the nominal value of $\Omega_k$ is slightly
above 0.

1.3.5 The Hubble law


At the end of the 1920s Edwin Hubble compared the estimates for the distances of
far-away galaxies (obtained through standard candles and other types of estimates) to their
velocities relative to us, as measured through their redshift. His results are shown in figure
1.4: he obtained a roughly linear relation in the form:

$$ v = H_0 d , \tag{1.32} $$

where $v$ is the velocity of the galaxies, and $d$ is their distance from us, while $H_0$ is a
constant of proportionality. Hubble’s measurements suggested $H_0 \sim 500$ km/s/Mpc; more
refined modern ones using techniques similar to those used by Hubble yield a value of
$(73.24 \pm 1.74)$ km/s/Mpc [Rie+16].

Figure 1.4: Original velocity versus distance data from Edwin Hubble’s paper in 1929
[Hub29].

Measurements of $H_0$ through analysis of the CMB give an incompatible value: $H_0 =
(67.8 \pm 0.9)$ km/s/Mpc [Ade+16].
Let us now show that this $H_0$ is actually the same one we defined before, $H_0 = \dot{a}/a$. $H_0$
is called the Hubble constant, since it is constant with respect to the direction we look in
the sky.
We are considering the distance connecting us (the center of our reference frame) to a
distant galaxy, so we drop the angular part in the flat ($k = 0$) FLRW line element:

$$ \mathrm{d}s^2 = c^2 \, \mathrm{d}t^2 - a^2(t) \, \mathrm{d}r^2 . \tag{1.33} $$

So, at a fixed time the physical distance is given by $d = a(t) r$: therefore
$v = \dot{d} = \dot{a} r = \frac{\dot{a}}{a} d = H_0 d$.
Assuming the universe is spatially flat is correct up to second order: for a general value
of $k$ we have

$$ d = a \int_0^r \frac{\mathrm{d}\tilde{r}}{\sqrt{1 - k \tilde{r}^2}} = a \left( r + O(r^3) \right) , \tag{1.34} $$

since the integral gives either $r$, $\arcsin(r)$ or $\operatorname{arcsinh}(r)$ depending on $k$, and all three of these
equal $r$ up to second order.
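The size of the neglected term can be checked numerically. The Python sketch below evaluates the closed forms of the integral for $k = 1, 0, -1$ and compares the deviation from $r$ with the leading correction $\pm r^3/6$ (the standard next term in the series of the arcsine and hyperbolic arcsine); the function name is ours.

```python
import math


def comoving_integral(r, k):
    """Closed form of int_0^r dr' / sqrt(1 - k r'^2) for k in {1, 0, -1}."""
    if k == 1:
        return math.asin(r)
    if k == 0:
        return r
    if k == -1:
        return math.asinh(r)
    raise ValueError("k must be 1, 0 or -1")


r = 0.1
for k in (1, 0, -1):
    deviation = comoving_integral(r, k) - r
    print(f"k = {k:+d}: integral - r = {deviation:+.3e}  (k r^3 / 6 = {k * r**3 / 6:+.3e})")
```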

Do note that we neglected the temporal part of the metric: this is equivalent to assuming
that photons travel instantaneously. So, this is Newtonian and rough, but it gives us the
correct intuitive idea. We now wish to make this reasoning more precise.
The first step, since we want to discuss our observations of light, is to define the redshift
and the luminosity distance.

Definition 1.3.1 (Redshift). The redshift $z$ of a photon is defined by

$$ z = \frac{\lambda_0 - \lambda_e}{\lambda_e} = \frac{\lambda_0}{\lambda_e} - 1 = \frac{\nu_e}{\nu_0} - 1 , \tag{1.35} $$

where $\lambda_0$ and $\lambda_e$ are the observed and emission wavelengths respectively, while $\nu$ are frequencies with
the same notation.

We will show that the redshift can be found from the ratio of the scale factors now and
at emission: $1 + z = a_0/a_e$. Therefore, $\nu_0/\nu_e = a_e/a_0$.
We wish to study the distribution of light from an astronomical source: in Minkowski
spacetime the apparent luminosity $\ell$ decreases like $r^{-2}$ if $r$ is the distance from the object.
In a generic spacetime this will not be the case: however, we can define a measure of spatial
distance $d_L$ such that $\ell = L/(4\pi d_L^2)$, where $L$ is the intrinsic luminosity of the object.
Do note that $\ell$ is dimensionally a luminosity flux: it is measured in units of energy per
unit time per unit area.

Definition 1.3.2. The luminosity distance $d_L$ is defined as:

$$ d_L = \sqrt{\frac{L}{4\pi \ell}} . \tag{1.36} $$

Since $L \propto \ell$, this is a well-defined measure of distance between two generic points in spacetime,
regardless of the presence of a source of light there.

How do we relate the luminosity distance and the scale factor? The radiation from our
source is spread on a sphere: we integrate the angular part of the FLRW metric over a sphere
of fixed comoving radius $r$ to find its area.
The metric restricted to the angular coordinates at fixed $r$ is given by

$$ \mathrm{d}s^2 = a^2 r^2 \left( \mathrm{d}\theta^2 + \sin^2\theta \, \mathrm{d}\varphi^2 \right) , \tag{1.37} $$

so the area form on the surface of the sphere is

$$ \mathrm{d}A = \sqrt{\det g} \, \mathrm{d}\theta \wedge \mathrm{d}\varphi , \tag{1.38} $$

where $\sqrt{\det g} = \sqrt{a^4 r^4 \sin^2\theta} = a^2 r^2 \sin\theta$. So,

$$ A = \int_{S^2} \mathrm{d}A = a^2 r^2 \int_0^{\pi} \mathrm{d}\theta \int_0^{2\pi} \mathrm{d}\varphi \, \sin\theta = 4\pi a^2 r^2 , \tag{1.39} $$

where we substituted the wedge product for the regular tensor product of the differentials
since our axes are orthogonal. Now, at which time do we compute the scale factor? We

are measuring the flux at the surface of the sphere, at the time at which we are observing:
therefore we need to compute the scale factor at observation time as well. This gives us
$A = 4\pi r^2 a_0^2$.
Now, the emitted luminosity is in the form:

$$ L = \frac{\mathrm{d}N}{\mathrm{d}t_e} \langle h\nu_e \rangle , \tag{1.40} $$

where $\mathrm{d}N/\mathrm{d}t_e$ is the number of photons emitted per unit time, whose average energy is
$\langle h\nu_e \rangle$. From the point of view of the observer, the number of photons is the same, while the
frequency of the observed photon and the time interval $\mathrm{d}t_e$ change: specifically, $\nu_e = \nu_0 a_0/a_e$
and $\mathrm{d}t_e = \mathrm{d}t_0 \, a_e/a_0$ (footnote 13). Therefore, the observed absolute luminosity $L_0$ obeys the relation
$L = L_0 (a_0/a_e)^2$.
So, putting everything together we get:

$$ \ell = \frac{L_0}{A} = \frac{L}{4\pi r^2 a_0^2} \left( \frac{a_e}{a_0} \right)^2 ; \tag{1.41} $$

note that the value of $k$ does not enter into the equation.
Therefore:

$$ d_L = \sqrt{\frac{L}{4\pi \ell}} = \sqrt{\frac{4\pi r^2 a_0^2}{4\pi} \left( \frac{a_0}{a_e} \right)^2} = \frac{a_0^2}{a_e} r = a_0 (1 + z) r . \tag{1.42} $$
Another way of defining a distance is the one we get by directly integrating the radial
part of the line element: this way, we are effectively using a space-like measuring stick,
working at fixed cosmic time and fixing both of the angles $\theta$ and $\varphi$.
Definition 1.3.3 (Proper distance). The proper distance $d_P$ at a fixed cosmic time $t$ to an object at
a comoving radial coordinate $r$ is given by:

$$ d_P = a(t) \int_0^r \frac{\mathrm{d}\tilde{r}}{\sqrt{1 - k \tilde{r}^2}} . \tag{1.43} $$

A derivation of the Hubble law

We want to derive the Hubble law ($v = H_0 d$) mathematically. It can also be restated as
$cz = H_0 d$: the observed velocity of recession of the objects is measured through redshift,
which is described (in the nonrelativistic limit; see footnote 14) by the formula

$$ \lambda_0 = \lambda_e \left( 1 \pm \frac{v}{c} \right) , \tag{1.45} $$
13 This is the case even though the temporal component of the metric does not change, since we are not
considering a fixed instant of cosmic time (which is unphysical): instead, we are considering the time intervals
that the emitter and observer measure between the crests of a light wave sent from one to the other.
14 The relativistic expression is

$$ 1 + z = \sqrt{\frac{1 + v/c}{1 - v/c}} \gtrsim 1 + \frac{v}{c} . \tag{1.44} $$

This will not be relevant for our discussion, however, since before the special-relativistic corrections become
relevant we will need to use general-relativistic corrections: roughly speaking, the effects of spacetime expanding
while the photon travels from its source to us.

where the sign in the $\pm$ is a plus if the object is receding from us. We also defined $z$ through
$\lambda_0/\lambda_e = 1 + z$: this means that we can identify $z = v/c$.
Now, we are going to “move away from the current epoch by Taylor expanding”: the
scale factor at a time $t$, $a(t)$, can be written as

$$ a(t) = a_0 + \dot{a}_0 (t - t_0) + \frac{1}{2} \ddot{a}_0 (t - t_0)^2 + O\left( |t - t_0|^3 \right) \tag{1.46a} $$
$$ \approx a_0 \left( 1 + H_0 (t - t_0) - \frac{1}{2} q_0 H_0^2 (t - t_0)^2 \right) , \tag{1.46b} $$

where in the second line we dropped the error term and substituted $H_0 = \dot{a}_0/a_0$, and
where $q_0 \overset{\mathrm{def}}{=} -\ddot{a}_0 a_0/(\dot{a}_0)^2$ is called the deceleration parameter for historical reasons: people
thought they would see deceleration ($\ddot{a} < 0$) when first writing this, so a positive $q_0$, but the
deceleration parameter is instead measured to be negative. The Hubble law is only correct
up to first order, and we will show this by also calculating the second order correction.
Now, $1 + z = a_0/a$ can be expressed as:

$$ 1 + z \simeq \left( 1 + H_0 (t - t_0) - \frac{1}{2} q_0 H_0^2 (t - t_0)^2 \right)^{-1} . \tag{1.47} $$

Do note that this is derived starting with a formula which is correct up to second order
in the time interval $\Delta t = t - t_0$: when expanding we cannot trust terms of order higher than
second. Expanding with this in mind we get (footnote 15):

$$ 1 + z \simeq 1 - H_0 \Delta t + \frac{q_0}{2} H_0^2 \Delta t^2 + H_0^2 \Delta t^2 , \tag{1.52} $$

therefore, changing the time intervals to $t_0 - t = -\Delta t$ (the square of which is the same as before),

$$ z = H_0 (t_0 - t) + \left( 1 + \frac{q_0}{2} \right) H_0^2 (t_0 - t)^2 \tag{1.53} $$
$$ = (t_0 - t) \left[ H_0 + \left( 1 + \frac{q_0}{2} \right) H_0^2 (t_0 - t) \right] . \tag{1.54} $$
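15 We need to compute the first and second derivatives of

$$ \left( 1 + H_0 \Delta t - \frac{q_0}{2} H_0^2 \Delta t^2 \right)^{-1} = \left( 1 + a \Delta t + b \Delta t^2 \right)^{-1} = f(\Delta t) , \tag{1.48} $$

with $a = H_0$ and $b = -q_0 H_0^2/2$ (here $a$ is just a shorthand, not the scale factor): they are

$$ \frac{\mathrm{d}f}{\mathrm{d}\Delta t} = -\left( 1 + a \Delta t + b \Delta t^2 \right)^{-2} (a + 2b \Delta t) \tag{1.49} $$
$$ \frac{\mathrm{d}^2 f}{\mathrm{d}\Delta t^2} = 2 (a + 2b \Delta t)^2 \left( 1 + a \Delta t + b \Delta t^2 \right)^{-3} - \left( 1 + a \Delta t + b \Delta t^2 \right)^{-2} (2b) , \tag{1.50} $$

so we have

$$ \left. \frac{\mathrm{d}f}{\mathrm{d}\Delta t} \right|_{\Delta t = 0} = -a \quad \text{and} \quad \frac{1}{2!} \left. \frac{\mathrm{d}^2 f}{\mathrm{d}\Delta t^2} \right|_{\Delta t = 0} = \frac{1}{2!} \left( 2a^2 - 2b \right) = a^2 - b . \tag{1.51} $$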

Bringing the bracket to the other side we get

$$ t_0 - t = z \left[ H_0 + \left( 1 + \frac{q_0}{2} \right) H_0^2 (t_0 - t) \right]^{-1} \tag{1.55} $$
$$ = z \left[ H_0 + \left( 1 + \frac{q_0}{2} \right) H_0 z \right]^{-1} \tag{1.56} $$
$$ = \frac{z}{H_0} \left[ 1 + \left( 1 + \frac{q_0}{2} \right) z \right]^{-1} , \tag{1.57} $$

where we substituted the first order expression $t_0 - t = z/H_0$: we are allowed to make this
substitution since the expression is multiplied by $z$, which has the same asymptotic order as
$t_0 - t$ (since $H_0$ is finite): working to first order inside the brackets is equivalent to working
to second order in the global expression.
By the same reasoning, we can expand the inverse bracket to first order:

$$ t_0 - t = \frac{z}{H_0} \left[ 1 - \left( 1 + \frac{q_0}{2} \right) z \right] = \frac{z}{H_0} - \left( 1 + \frac{q_0}{2} \right) \frac{z^2}{H_0} . \tag{1.58} $$

We would like the time interval to disappear: we want a distance, not a time, so we
should seek an expression for $r$ instead of $\Delta t$. We are observing photons, for which $\mathrm{d}s^2 = 0$,
which is equivalent to $c^2 \, \mathrm{d}t^2 = a^2(t) \, \mathrm{d}r^2/(1 - kr^2)$. Taking a square root and integrating we
get:

$$ \int_t^{t_0} \frac{c \, \mathrm{d}\tilde{t}}{a(\tilde{t})} = \pm \int_r^0 \frac{\mathrm{d}\tilde{r}}{\sqrt{1 - k \tilde{r}^2}} , \tag{1.59} $$

where we should select the negative sign since we want positive quantities on both sides.
The other choice would correspond to the photon being emitted from the Earth and received
at the comoving radius of the source.
The integral on the right hand side can be solved analytically: with the negative sign it is

$$ -\int_r^0 \frac{\mathrm{d}\tilde{r}}{\sqrt{1 - k \tilde{r}^2}} = \begin{cases} \arcsin r = r + O(r^3) & k = 1 \\ r & k = 0 \\ \operatorname{arcsinh} r = r + O(r^3) & k = -1 \end{cases} \tag{1.60} $$

in all cases, it is just $r$ up to second order (since the next term in the expansion of an arcsine
or hyperbolic arcsine is of third order).
On the left hand side, we can substitute in $a(t)$ from equation (1.46b):

$$ \int_t^{t_0} \frac{c \, \mathrm{d}\tilde{t}}{a(\tilde{t})} = \frac{c}{a_0} \int_t^{t_0} \mathrm{d}\tilde{t} \left[ 1 + H_0 (\tilde{t} - t_0) + O(\Delta \tilde{t}^2) \right]^{-1} \tag{1.61} $$
$$ = \frac{c}{a_0} \left[ t_0 - t + \frac{H_0}{2} (t - t_0)^2 + O(\Delta t^3) \right] , \tag{1.62} $$

where we used the expression for the scale factor to first order only since the integration
raised the order of the estimate by one. We have:

$$ \frac{c}{a_0} \left( (t_0 - t) + \frac{1}{2} H_0 (t_0 - t)^2 + O(\Delta t^3) \right) = r , \tag{1.63} $$

since the term proportional to $q_0$ only gives a third order contribution. We can now substitute
the expression for the time difference with respect to the redshift (1.58), only computed to
second order:

$$ r = \frac{c}{a_0} \left[ \frac{z}{H_0} \left( 1 - \left( 1 + \frac{q_0}{2} \right) z \right) + \frac{H_0}{2} \left( \frac{z}{H_0} \left( 1 - \left( 1 + \frac{q_0}{2} \right) z \right) \right)^2 \right] \tag{1.64} $$
$$ = \frac{c}{a_0} \left[ \frac{z}{H_0} - \left( 1 + \frac{q_0}{2} \right) \frac{z^2}{H_0} + \frac{H_0}{2} \frac{z^2}{H_0^2} \right] \tag{1.65} $$
$$ = \frac{c}{a_0 H_0} z \left[ 1 - \frac{1}{2} (1 + q_0) z \right] , \tag{1.66} $$

where in the second line we ignored the third and higher order terms in the square.
Now, we can insert this expression for $r$ into the formula for the luminosity distance
(1.42):

$$ d_L = a_0^2 \frac{r}{a} = a_0 (1 + z) r \tag{1.67} $$
$$ = a_0 (1 + z) \frac{c}{a_0 H_0} \left( z - \frac{1}{2} (1 + q_0) z^2 \right) . \tag{1.68} $$

As we expect, the term $a_0$ disappears: it is a bookkeeping parameter, since the physical
properties of a universe described by a FLRW metric are invariant under a global rescaling
of the scale factor.
Our expression also contains cubic terms in $z$: removing these to get back to second
order we find

$$ d_L = \frac{cz}{H_0} (1 + z) \left( 1 - \frac{1}{2} (1 + q_0) z \right) \tag{1.69} $$
$$ = \frac{cz}{H_0} \left( 1 + \left( 1 - \frac{1}{2} - \frac{1}{2} q_0 \right) z \right) \tag{1.70} $$
$$ = \frac{cz}{H_0} \left( 1 + \frac{1}{2} (1 - q_0) z \right) . \tag{1.71} $$

We can turn this into a relation for $cz$ in terms of $d_L$ by substituting the first order
expression $d_L H_0 = cz$ into the second order term (the error we make by this approximation
is of third order at least):

$$ cz = d_L H_0 + \frac{q_0 - 1}{2} c z^2 \tag{1.72} $$
$$ cz = H_0 \left( d_L + \frac{1}{2} (q_0 - 1) \frac{H_0}{c} d_L^2 \right) , \tag{1.73} $$

and we can notice that the relation is approximately linear and independent of acceleration
for low redshift, but we can detect the acceleration at higher redshift. Typically we need to
measure galaxies at least 10 Mpc away in order to detect these second order effects. As we
mentioned in the beginning, the data show the parameter q0 to be negative.
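The second order relations (1.71) and (1.73) can be explored numerically with the Python sketch below. The values $H_0 = 70$ km/s/Mpc and $q_0 = -0.55$ are illustrative assumptions, not fitted parameters. The round trip $z \to d_L \to cz$ should return the input up to a relative error of order $z^2$, since (1.73) is obtained from (1.71) by discarding third order terms.

```python
C_KM_S = 299792.458   # speed of light, km/s


def luminosity_distance(z, H0=70.0, q0=-0.55):
    """Second order d_L(z) in Mpc, eq. (1.71); H0 and q0 are assumed values."""
    return C_KM_S * z / H0 * (1.0 + 0.5 * (1.0 - q0) * z)


def cz_from_distance(d_L, H0=70.0, q0=-0.55):
    """Second order Hubble law cz(d_L) in km/s, eq. (1.73)."""
    return H0 * (d_L + 0.5 * (q0 - 1.0) * H0 / C_KM_S * d_L**2)


for z in (0.001, 0.01, 0.1):
    d_L = luminosity_distance(z)
    cz_back = cz_from_distance(d_L)
    linear = 70.0 * d_L   # plain Hubble law, v = H0 d
    print(f"z = {z:5.3f}  d_L = {d_L:8.2f} Mpc  "
          f"cz = {C_KM_S * z:9.2f}  eq.(1.73) = {cz_back:9.2f}  H0 d_L = {linear:9.2f} km/s")
```

At $z = 0.001$ the linear law and the second order one are practically indistinguishable, while at $z = 0.1$ they differ at the level of several percent, which is where the sign of $q_0$ starts to matter.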
This effect is similar to a Doppler effect, but the analogy is not perfect: the redshift is
caused by the expansion of space itself, and the apparent velocities of the galaxies at high
redshift would be superluminal.

Interpretation of superluminal recession velocities

What follows is a synthesis of the enlightening article by Davis and Lineweaver [DL04],
which should be referred to for clarification.
The proper way to define velocities is directly through the metric: so, we will need to
differentiate the relation $d = a\chi$, where $\chi$ is the distance in comoving coordinates and $d$
is the physical distance, with respect to the cosmic time. This yields $\dot{d} = \dot{a}\chi = Hd$. Note
that we are assuming that the object is stationary with respect to the comoving coordinates
($\dot{\chi} = 0$), so we are ignoring what is called the peculiar velocity.
This $\dot{d}$ is precisely the recession velocity, and this definition can be extended to any
redshift. As written it is quite implicit, since we do not know what the distance in comoving
coordinates is to an object we observe with redshift $z$. This will be treated in section 2.4, but
for now the result is (see also the aforementioned paper [DL04, eq. 1], and an older paper
by Harrison [Har93, eq. 13]) that the velocity now of an object observed with redshift $z$ is

$$ v_{\text{rec}} = H_0 d_C(z) = c H_0 \int_0^z \frac{\mathrm{d}\tilde{z}}{H(\tilde{z})} , \tag{1.74} $$

which can be greater than $c$: doing the computation allows us to see that $v > c$ for $z \gtrsim 1.5$ (footnote 16).
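A sketch of "doing the computation" in Python follows. It assumes a flat universe containing only matter and a cosmological constant, with $\Omega_m = 0.3$ and $\Omega_\Lambda = 0.7$, so that $H(z) = H_0 \sqrt{\Omega_m (1+z)^3 + \Omega_\Lambda}$; this expression for $H(z)$ and the parameter values are assumptions made for the example (radiation is neglected), not results derived in this section.

```python
import math


def E(z, omega_m=0.3, omega_l=0.7):
    """Dimensionless Hubble rate H(z)/H0 for an assumed flat matter + Lambda model."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_l)


def recession_velocity_over_c(z, steps=2000):
    """v_rec / c = int_0^z dz' / E(z'), eq. (1.74), via composite Simpson (steps even)."""
    if z == 0.0:
        return 0.0
    h = z / steps
    total = 1.0 / E(0.0) + 1.0 / E(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) / E(i * h)
    return total * h / 3.0


def luminal_redshift(lo=0.1, hi=5.0, tol=1.0e-6):
    """Bisection for the redshift at which v_rec = c."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if recession_velocity_over_c(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


z_lum = luminal_redshift()
print(f"v_rec = c at z ~ {z_lum:.2f}")
for z in (0.1, 1.0, 1.5, 3.0):
    print(f"z = {z:3.1f}  v_rec/c = {recession_velocity_over_c(z):.3f}")
```

With these assumed parameters the crossing happens slightly below $z = 1.5$, consistent with the statement above.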
Does this result contradict General, or even Special Relativity? It does not! In General
Relativity we cannot directly compare velocities of objects which are far away from each
other: they lie in different vector spaces, and there can be no local inertial frame extending
that far. So, it is consistent to have superluminal recession velocities, while observers near
the emission, as well as observers near Earth, always measure velocities locally to be $\le c$.
Now we can clear some common misconceptions:

1. recession velocities can indeed exceed the speed of light;

2. they can do so in periods of “regular” expansion of the universe: we did not consider
inflation, which surely did not occur for $z \sim 1.5$, which is the point at which the
recession velocities start being superluminal;

3. we can indeed see the light from objects which are currently receding superluminally:
the formula for the comoving distance is derived by integrating a photon’s trajectory.
16 An interesting (although meaningless) fact: the current recession velocity of the CMB ($z \approx 1090$) is around
$3.14c \approx \pi c$, according to the latest Planck data [Ade+16]. For more details on how the recession velocity scales
with the redshift, see figure 2 in Davis & Lineweaver [DL04].

Definition 1.3.4 (Angular diameter distance). The angular diameter distance is defined as the
ratio of the object’s physical transverse size $L$ to its angular size in radians $\Delta\theta$:

$$ d_A = \frac{L}{\Delta\theta} = a(t) r , \tag{1.75} $$

which is peculiar in that it is not monotonic in $z$ [Hog00]: at $z \gtrsim 1$ it starts decreasing.

Redshift-scale factor relation

Let us prove the statement from before, $\lambda_0/\lambda = a_0/a$: photons are emitted with a certain
wavelength $\lambda_e$, at a comoving radius $r$ from us, and detected at $\lambda_0$.
The line element for the photon is $\mathrm{d}s^2 = 0$, therefore $c \, \mathrm{d}t/a(t) = \pm \mathrm{d}r/\sqrt{1 - kr^2}$.
As before, we can integrate this relation from the emission to the absorption: we call the result
$f(r)$ (it can be any of the functions shown in equation (1.60)):

$$ \int_t^{t_0} \frac{c \, \mathrm{d}\tilde{t}}{a(\tilde{t})} = f(r) . \tag{1.76} $$

If we map $t \to t + \mathrm{d}t$ and $t_0 \to t_0 + \mathrm{d}t_0$ in the integration limits, the integral must be constant
since it only depends on $r$: do note that all the expansion of the universe is accounted
for by the increasing scale factor, while objects are stationary with respect to the comoving radial
coordinate $r$. We are computing the integral for two successive wavefronts of the light. Then,
we equate the two:

$$ f(r) = \int_t^{t_0} \frac{c \, \mathrm{d}\tilde{t}}{a(\tilde{t})} = \int_{t + \mathrm{d}t}^{t_0 + \mathrm{d}t_0} \frac{c \, \mathrm{d}\tilde{t}}{a(\tilde{t})} , \tag{1.77} $$

which we can split into:

$$ \left[ \int_{t + \mathrm{d}t}^{t} + \int_t^{t_0} + \int_{t_0}^{t_0 + \mathrm{d}t_0} - \int_t^{t_0} \right] \frac{c \, \mathrm{d}\tilde{t}}{a(\tilde{t})} = 0 , \tag{1.78} $$

where, since all the integrals have the same argument, we collect it at the end for clarity. We
simplify the original integral and swap the integration limits to get:

$$ \int_t^{t + \mathrm{d}t} \frac{c \, \mathrm{d}\tilde{t}}{a(\tilde{t})} = \int_{t_0}^{t_0 + \mathrm{d}t_0} \frac{c \, \mathrm{d}\tilde{t}}{a(\tilde{t})} , \tag{1.79} $$

which we can approximate by

$$ \frac{c \, \mathrm{d}t}{a(t)} = \frac{c \, \mathrm{d}t_0}{a(t_0)} , \tag{1.80} $$

since the periods of the photons we are considering are generally much smaller than the
cosmic timescales.
Since the frequency of the emitted and observed photons must be proportional to the
inverse of the time intervals $\mathrm{d}t$ or $\mathrm{d}t_0$, we have

$$ \nu_e a(t_e) = \nu_0 a(t_0) , \tag{1.81} $$

therefore

$$ 1 + z = \frac{\lambda_0}{\lambda_e} = \frac{a_0}{a} . \tag{1.82} $$
Chapter 2

Friedmann models

The Friedmann equations describe the dynamical evolution of the universe, as opposed
to the static description given by the FLRW metric.

2.1 A Newtonian derivation of the Friedmann equations


The Friedmann equations are derived starting from the Einstein Field Equations for the
FLRW metric; however, we can derive them using an almost purely Newtonian argument.
We will not be able to recover the full equations, since a Newtonian fluid’s pressure is
$P \ll \rho c^2$: its contribution to the stress-energy-momentum tensor is negligible [TM20, eqs.
441–443]. So, through our argument we will recover the equations we would have with
$P = 0$.
The only non-Newtonian step in our derivation is the justification of the Newtonian
approximation: we wish to make use of the theorem attributed to Birkhoff, but first derived
by the Norwegian physicist J.T. Jebsen [JR05].

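Proposition 2.1.1 (Jebsen-Birkhoff). The only solution to the vacuum Einstein Field Equations
which is spherically symmetric is given by the Schwarzschild metric [MTW73, sec. 32.2] (footnote 1):

$$ \mathrm{d}s^2 = \left( 1 - \frac{2GM}{c^2 r} \right) c^2 \, \mathrm{d}t^2 - \left( 1 - \frac{2GM}{c^2 r} \right)^{-1} \mathrm{d}r^2 - r^2 \, \mathrm{d}\Omega^2 . \tag{2.1} $$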

With this in mind, let us take a spacetime with uniform density $\rho$. We consider a sphere,
and imagine taking all the mass inside the sphere away.
By the Jebsen-Birkhoff theorem, the internal geometry of this shell is only determined
by the mass distribution inside the shell: since we took all the mass inside the sphere away,
the inside spacetime is Minkowski, that is, Schwarzschild spacetime with $M = 0$ (footnote 2).
1 This is not relevant for what we will discuss here, but do note that there is no requirement of the generating
(internal) mass distribution to be static: this is, in fact, the reason why spherically symmetric collapsing or
pulsating stars cannot emit gravitational waves.
2 There is actually some nuance to this: as is expounded upon in an article by Zhuang and Yi [ZY12], in

general relativity the geometry inside the shell is actually influenced by the presence of the shell, in that its
time coordinate is different to the one which would be measured by an outside observer. This is relevant, for

We describe this system through a mass coordinate: $M(\ell)$ is the mass enclosed by a
spherical shell of radius $\ell$.
The mass taken away will be $M(\ell) = \frac{4\pi}{3} \rho \ell^3$, where $\vec{\ell} = a(t) \vec{r}$ is the radius of the sphere,
whose norm is $\ell = |\vec{\ell}|$, and $\rho$, as mentioned before, is the constant density.
We suppose that the gravitational field is weak: this is quantitatively expressed using the
relation $\ell \gg r_g$, where $r_g$ is the Schwarzschild radius of the system: this relation becomes

$$ \frac{GM(\ell)}{\ell c^2} \ll 1 , \tag{2.2} $$

and is a necessary assumption in order to apply a Newtonian approximation.
Now, we “put back” the mass which was removed, in order to restore the initial situation.
We put a test mass on the surface of the sphere. What is the motion of the mass due to
the gravitational field from the center? It will surely be radial, and since as we said we are
in the Newtonian approximation we can calculate it using Newton’s equation:

$$ \ddot{\ell} = -\frac{GM(\ell)}{\ell^2} = -\frac{4\pi G}{3} \rho \ell . \tag{2.3} $$

This seems to give us a net force even though we expect everything to be stationary
because of isotropy. This is because we are not working in comoving coordinates: the
radius of the sphere, $\ell$, can change with the scale factor, even when the comoving vector $\vec{r}$ is
constant.
Our final result will not depend on the unit vector we choose.
Plugging in $\ell = ar$ we find

$$ \ddot{a} r = -\frac{4\pi G}{3} \rho a r \tag{2.4} $$
$$ \ddot{a} = -\frac{4\pi G}{3} \rho a . \tag{2.5} $$

This is the second Friedmann equation, (1.9b), without the pressure term for the reasons
mentioned before.
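Starting from the acceleration equation (2.3) and multiplying by $\dot{\ell}$ we can get

$$ \dot{\ell} \ddot{\ell} = -\frac{GM}{\ell^2} \dot{\ell} , \tag{2.6} $$

and identifying $2 \dot{\ell} \ddot{\ell} = \mathrm{d}(\dot{\ell})^2/\mathrm{d}t$ we find the conservation of energy equation:

$$ \frac{1}{2} \frac{\mathrm{d}}{\mathrm{d}t} \left( \dot{\ell} \right)^2 = -\frac{GM \dot{\ell}}{\ell^2} \tag{2.7} $$
$$ \frac{1}{2} \left( \dot{\ell} \right)^2 = -GM \int \frac{1}{\ell^2} \frac{\mathrm{d}\ell}{\mathrm{d}t} \, \mathrm{d}t \tag{2.8} $$

(here we integrated on both sides in $\mathrm{d}t$)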
instance, if we wish to compute the Shapiro delay of a ray of light passing through the shell and coming back
out. This is not an issue for us, since the geometry on the inside is locally indistinguishable from a pure vacuum
Minkowski spacetime if we measure only inside the shell.

$$ \frac{1}{2} \dot{\ell}^2 = \frac{GM}{\ell} + C = \frac{4\pi}{3} G \rho \ell^2 + C , \tag{2.9} $$

where we used $M = 4\pi \rho \ell^3/3$ and $C$ is an arbitrary constant. We can express this in terms of the
scale factor (absorbing a factor of 2 into the arbitrary constant $C$):

$$ \dot{a}^2 r^2 = \frac{8\pi G}{3} \rho a^2 r^2 + C \tag{2.10} $$

or, dividing by $r^2$, which is a constant (since it is a comoving radius, with respect to
which objects are stationary),

$$ \dot{a}^2 = \frac{8\pi G}{3} \rho a^2 + \frac{C}{r^2} . \tag{2.11} $$

Now, the dimensions of this constant are those of a speed squared, therefore we can express
it as $C/r^2 = k_N = -kc^2$, where $k$ is dimensionless: we are allowed to do this since
$k_N$, the Newtonian curvature constant, has the dimensions of an energy per unit mass, or
equivalently a velocity squared.
This clarifies the statement that the magnitude of $k$ is arbitrary: we get it by dividing
the constant $C$ (which is fixed) by the square of the dimensionless comoving radius, whose magnitude is
indeed arbitrary, therefore we can normalize it however we wish: so we choose $|k| = 1$ or $0$.
The equation we found is of the form:

$$ E_{\text{kin}} + E_{\text{grav}} = -kc^2 , \tag{2.12} $$

where $E_{\text{kin}}$ and $E_{\text{grav}}$ are the energies per unit mass in the form of either kinetic or
gravitational energy. So we can directly see that $k$, in a sense, describes the intrinsic energy of a
free, stationary (with respect to the comoving coordinates) test mass.
If this energy is positive ($k < 0$) then the particles have an intrinsic positive energy, which causes
expansion, while if it is negative ($k > 0$) it is as if they were intrinsically gravitationally
bound, which causes contraction.
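The role of the sign of the integration constant can be illustrated by integrating the Newtonian equation of motion (2.3) for a test mass on the shell, keeping the enclosed mass fixed. The Python sketch below uses units in which $GM = 1$ and $\ell(0) = 1$, and a velocity Verlet integrator; the initial velocities, time step and function names are choices made for this example.

```python
import math


def evolve_shell(l0, v0, gm=1.0, dt=1.0e-3, t_max=10.0):
    """Integrate l'' = -GM / l^2 with velocity Verlet at fixed enclosed mass.

    Returns (final radius, final velocity, largest drift of
    C = l'^2 / 2 - GM / l, whether the shell turned around).
    Integration stops at turnaround, to stay away from the l -> 0 singularity.
    """
    l, v = l0, v0
    acc = -gm / l**2
    c0 = 0.5 * v**2 - gm / l
    max_drift = 0.0
    for _ in range(int(t_max / dt)):
        v_half = v + 0.5 * dt * acc
        l += dt * v_half
        acc = -gm / l**2
        v = v_half + 0.5 * dt * acc
        max_drift = max(max_drift, abs(0.5 * v**2 - gm / l - c0))
        if v < 0.0:
            return l, v, max_drift, True
    return l, v, max_drift, False


# With GM = 1 and l0 = 1 the conserved quantity is C = v0^2 / 2 - 1.
bound = evolve_shell(1.0, math.sqrt(2.0 * (1.0 - 0.5)))     # C = -0.5
unbound = evolve_shell(1.0, math.sqrt(2.0 * (1.0 + 0.5)))   # C = +0.5

print("C = -0.5: turned around:", bound[3], " max radius ~", round(bound[0], 3))
print("C = +0.5: turned around:", unbound[3], " final velocity ~", round(unbound[1], 3))
```

The shell with negative $C$ reaches a maximum radius $GM/|C| = 2$ and then falls back, while the one with positive $C$ keeps expanding, with its velocity approaching $\sqrt{2C}$ from above.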
We can also recover the third Friedmann equation

$$ \dot{\rho} = -3 \frac{\dot{a}}{a} \left( \rho + \frac{P}{c^2} \right) \approx -3 \frac{\dot{a}}{a} \rho , \tag{2.13} $$

where the approximation as before is the Newtonian one, $P \ll \rho c^2$. In this case, however, we
will be able to also recover the nonrelativistic pressure term if we account for conservation
of energy and not just mass.
When deriving these equations relativistically the third one comes from the conservation
of the temporal component of the stress-energy tensor ($\nabla_\mu T^{\mu 0} = 0$), i.e. the “conservation
of energy” (footnote 3), so to find it in our Newtonian calculation we will need to consider the
nonrelativistic equivalent of that equation, which is the first law of thermodynamics.
3 This is actually not a conservation law as stated, since it is not covariant: in order for it to be we would need
to project it along a temporal Killing vector, which does not exist in cosmology. However, if we did have a
temporal Killing vector $\xi^\mu$ the projection $\xi_\nu \nabla_\mu T^{\mu\nu} = 0$ would indeed be the conservation of energy.
Indeed, it is more correct to say that this equation is not a conservation law at all, but is instead just an
expression of the geometric Bianchi identities $\nabla_\mu G^{\mu\nu} = 0$, which are written in terms of the Einstein tensor
$G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2} R g_{\mu\nu}$; if the Einstein equations hold then $G_{\mu\nu}$ is proportional to $T_{\mu\nu}$, therefore the two statements
are equivalent.

We will consider ideal fluids. The first law, assuming adiabaticity (which must be
present, since net heat transfer in the universe would violate isotropy), states:

$$ \mathrm{d}E + P \, \mathrm{d}V = 0 . \tag{2.14} $$

We can write the total energy as the product of the energy density times the volume:
$E = \frac{4\pi}{3} \rho c^2 a^3$, since the volume is $V = \frac{4\pi}{3} a^3$.
So, the first law reads:

$$ 0 = \frac{4\pi}{3} \left[ \mathrm{d}\left( \rho c^2 a^3 \right) + P \, \mathrm{d}(a^3) \right] \tag{2.15} $$
$$ = c^2 \rho \, \mathrm{d}(a^3) + c^2 a^3 \, \mathrm{d}\rho + P \, \mathrm{d}(a^3) \tag{2.16} $$
$$ = 3 \left( \rho + \frac{P}{c^2} \right) \frac{\mathrm{d}a}{a} + \mathrm{d}\rho , \tag{2.17} $$

where we divided through by $4\pi/3$ in the second line, and by $c^2 a^3$ (collecting terms) in the third.
This is the third Friedmann equation, (1.9c): the only manipulation left to do is to apply
the differentials, which are covectors, to the temporal vector $\mathrm{d}/\mathrm{d}t$ in order to turn them
into time derivatives.
Why were we able to recover the relativistic term this time? The completely
non-relativistic approach to this would be to write $M = \frac{4\pi}{3} \rho a^3$, and to write down the equation
for the conservation of mass alone. Indeed, this would yield the Friedmann equation
without the $P$ term. What we have effectively done is to consider the first relativistic
order instead, since the classical kinetic energy is recovered by the first order expansion
$E = \gamma mc^2 \approx mc^2 + mv^2/2$.
The three Friedmann equations are not all independent, we only need two of them: for
example, the second one (1.9b) can be derived from the first and third.
This means that we can derive the full relativistic equations in this Newtonian context,
using the derivations we have shown for the first and third equation, and then combining
these to find the second.
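Let us do this derivation explicitly: we differentiate the first Friedmann equation

$$ \dot{a}^2 = \frac{8\pi G}{3} \rho a^2 - kc^2 \tag{2.18} $$

with respect to time to find

$$ 2 \dot{a} \ddot{a} = \frac{8\pi G}{3} \dot{\rho} a^2 + \frac{16\pi G}{3} \rho \dot{a} a . \tag{2.19} $$

We then substitute in the expression we have for $\dot{\rho}$ from the third equation:

$$ \dot{\rho} = -3 \frac{\dot{a}}{a} \left( \rho + \frac{P}{c^2} \right) , \tag{2.20} $$

which gives us

$$ 2 \dot{a} \ddot{a} = \frac{8\pi G}{3} \left[ -3 \frac{\dot{a}}{a} \left( \rho + \frac{P}{c^2} \right) \right] a^2 + \frac{16\pi G}{3} \rho \dot{a} a \tag{2.21} $$
$$ \ddot{a} = -4\pi G \left( \rho + \frac{P}{c^2} \right) a + \frac{8\pi G}{3} \rho a \tag{2.22} $$
$$ \frac{\ddot{a}}{a} = \frac{8\pi G}{3} \left[ -\frac{3}{2} \left( \rho + \frac{P}{c^2} \right) + \rho \right] \tag{2.23} $$
$$ \frac{\ddot{a}}{a} = -\frac{4\pi G}{3} \left( \rho + 3 \frac{P}{c^2} \right) , \tag{2.24} $$

where we divided through by $2\dot{a}$ to get (2.22) and by $a$ to get (2.23). This
is precisely equation (1.9b).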

2.2 The equation of state


So, the equation system is underdetermined: we do not in fact have three independent
Friedmann equations, but just two. The variables we want to find, however, are three: $P(t)$, $\rho(t)$, $a(t)$.
We have to make an assumption: we will assume our fluid is a barotropic perfect fluid,
that is, one for which the pressure only depends on the density: $P = P(\rho)$.
Very often this equation of state will be linear: $P = w \rho c^2$, with a dimensionless constant
$w$.[4] We will assume this relation to be true.

2.2.1 Common equations of state


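A thing we will compute for the different equations of state is the adiabatic[5] speed of
sound:

\[
  c_s^2 = \frac{\partial P}{\partial \rho} = \frac{dP}{d\rho} = w c^2 . \tag{2.25}
\]

We also will be able to tell what the evolution of the energy density is for a varying scale
factor: this can be derived from the third Friedmann equation (1.9c), which in this case reads

\begin{align}
  \frac{\dot\rho}{\rho} + 3 \frac{P}{\rho c^2} \frac{\dot a}{a} + 3 \frac{\dot a}{a} &= 0 \tag{2.26} \\
  \frac{\dot\rho}{\rho} + 3 (1 + w) \frac{\dot a}{a} &= 0 \tag{2.27} \\
  \frac{d}{dt} \log\left( \rho a^{3(1+w)} \right) = 0 \implies \rho a^{3(1+w)} &= \text{const} , \tag{2.28}
\end{align}

so $\rho \propto a^{-3(1+w)}$.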

1. $w = 0$ is equivalent to $P \equiv 0$: this is what we get in the nonrelativistic limit, for
$P \ll \rho c^2$; since there is no pressure this can be interpreted as dust. In this case
$\rho \propto a^{-3}$; also, we have $c_s^2 \ll c^2$.
[4] This is a Latin $w$, not a Greek $\omega$: students historically call it “omega” for some reason.
[5] The speed of sound is usually computed for adiabatic transformations, since the transmission of sound
is usually close to an adiabatic process. In our case, adiabaticity is embedded in the hypotheses made in
the derivation of the Friedmann equations. So, we can calculate the derivative without worrying about the
adiabaticity condition being respected since for the solutions we will consider it always will be.
2. $w = 1/3$ is what we get if we seek the pressure of radiation.[6] In this case we have
$c_s = c/\sqrt{3}$, while the energy density goes like $\rho \propto a^{-4}$, since we get a factor $a^{-3}$
from the volume expansion and another $a^{-1}$ from the decrease of the energy of each
photon due to redshift. Alternatively, from what was derived before we can see that
the exponent in the power law must be $-3(1 + 1/3) = -4$.
So, for a radiation-dominated universe the total energy $E \propto \rho a^3 \propto a^{-1}$ is not conserved.

3. $w = 1$ is called stiff matter: it has $P = \rho c^2$ and $c_s = c$. This is an incompressible fluid:
it is so difficult to set this matter in motion that once one does it travels at the speed
of light. Now, $\rho \propto P \propto a^{-6}$.

4. $w = -1$ means that $P = -\rho c^2$. We cannot compute a speed of sound (it would be
imaginary). Now $\rho$ and $P$ are constants, since they are proportional to $a^0 = 1$. This
is the case of dark energy: we will show in section 2.5 that inserting a cosmological
constant $\Lambda$ into the Einstein equations has precisely this effect.[7]

So, we replace the third Friedmann equation with $w = \text{const}$ and

\[
  \rho(t) = \rho_* \left( \frac{a(t)}{a_*} \right)^{-3(1+w)} , \tag{2.29}
\]

where $a_*$ and $\rho_*$ are the scale factor and density at some chosen time.
If we substitute this expression into the second Friedmann equation we get that gravity is attractive and
the universe decelerates ($\ddot a < 0$) if and only if $w > -1/3$.
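
The scaling law (2.29) and the deceleration condition can be tabulated for the four equations of state listed above. The following is a minimal sketch; the function names and the choice of evaluating the density at $a = 2a_*$ are illustrative assumptions, not part of the notes.

```python
def density_ratio(a_ratio, w):
    """Return rho/rho_* for a/a_* = a_ratio, using rho ∝ a^(-3(1+w)), eq. (2.29)."""
    return a_ratio ** (-3.0 * (1.0 + w))


def decelerates(w):
    """True if the second Friedmann equation gives a negative acceleration, i.e. 1 + 3w > 0."""
    return 1.0 + 3.0 * w > 0.0


# The four equations of state discussed in the text.
FLUIDS = {
    "dust": 0.0,
    "radiation": 1.0 / 3.0,
    "stiff matter": 1.0,
    "cosmological constant": -1.0,
}

for name, w in FLUIDS.items():
    # Density after the scale factor has doubled, relative to the reference density.
    print(name, density_ratio(2.0, w), decelerates(w))
```

Doubling the scale factor dilutes dust by a factor 8, radiation by 16 and stiff matter by 64, while the cosmological constant is unchanged and is the only one of the four that does not decelerate the expansion.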
Throughout this section we worked as if we had a single type of cosmic fluid in the
universe. This is not really the case: there are several, and they interact, but
it is a good first approximation to consider them as separate.
In figure 2.1, $a$ can be interpreted as the time, since their relation is monotonic. We
could insert the spatial curvature $k$ in the plot: it decreases, but slower than matter, since
it appears in the first Friedmann equation with a factor $a^{-2}$. We can find an effective
$\rho(a)$ law for the curvature by defining an effective $\rho_k$ through $H^2 = \frac{8\pi G}{3} (\rho + \rho_k)$,
which implies $\rho_k = -3kc^2/(8\pi G a^2)$.

We can also express this in units of the critical energy density $\rho_c = 3H^2/(8\pi G)$: we find

\[
  \frac{\rho_k}{\rho_c} = \Omega_k = -\frac{kc^2}{H^2 a^2} . \tag{2.30}
\]

At the present time, dark energy is the most important component of the universe: it is dominant
over matter, radiation and spatial curvature.
[6] This expression can be derived in different ways, one of which is to start from the fact that the stress-energy
tensor must be traceless, since it is of the form $T_{\mu\nu} \sim \sum_i \rho \, u^{(i)}_\mu u^{(i)}_\nu$, where $u^{(i)}_\mu$ are the four-velocities of photons:
their norm is zero, so we must have $g^{\mu\nu} T_{\mu\nu} = 0$, but also for a perfect fluid $T = T^\mu_\mu = \rho - 3P/c^2$. Another,
perhaps more illustrative derivation was given in the General Relativity course [TM20, pag. 86-87].
[7] This is shown by interpreting the additional term in the EFE as an addition to the stress-energy tensor and
interpreting it as a perfect-fluid tensor [TM20, eqs. 434-438].

[Figure 2.1: plot of the energy density $\rho/\rho_{0c}$ against the scale factor $a/a_0$ for the cosmological constant, radiation and matter; the current epoch is marked “Now”.]

Figure 2.1: Contributions to the energy density varying with the scale factor. They are
normalized to the current critical energy density, using data from the 2015 Planck mission
[Ade+16]. The increase of the radiation energy density with a decreasing scale factor is
sharp, but the crossover point with the matter density is at $a = \rho_{rad}/\rho_m \approx 10^{-4}$, at which
point we have $\rho \approx 10^{12} \rho_{0c}$.

2.3 Solutions of the Friedmann Equations


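We want to solve the equation system

\begin{align}
  \frac{\dot a^2}{a^2} &= \frac{8\pi G}{3} \rho - \frac{kc^2}{a^2} \tag{2.31} \\
  \rho(t) &= \rho_* \left( \frac{a(t)}{a_*} \right)^{-3(1+w)} , \tag{2.32}
\end{align}

which encompasses all the physical content of the Friedmann equations, since the second equation can be
derived from these two.
Inserting (2.32) into (2.31) we find:

\[
  \left( \frac{\dot a}{a} \right)^2 = \frac{8\pi G}{3} \rho_* \left( \frac{a}{a_*} \right)^{-3(1+w)} - \frac{kc^2}{a^2} . \tag{2.33}
\]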

2.3.1 Einstein-De Sitter models
As we discussed before, if we had zero spatial curvature ($k = 0$) then we would find
$\rho = \rho_c = 3H^2/(8\pi G)$: so, we define the parameter $\Omega = 8\pi G \rho/(3H^2) = \rho/\rho_c$, which quantifies how
close this is to being true; experimentally, this is compatible with 1.
The Einstein-de Sitter model is one where we take $\Omega \equiv 1$: negligible spatial curvature,
which is equivalent to setting $k = 0$. So, the equation becomes:

\[
  \dot a^2 = \underbrace{\frac{8\pi G}{3} \rho_* a_*^{3(1+w)}}_{A^2} \, a^{-(1+3w)} , \tag{2.34}
\]

(we set $k = 0$, multiplied by $a^2$ and defined $A$), therefore $\dot a = \pm A a^{-\frac{1+3w}{2}}$, or $a^{\frac{1+3w}{2}} \, da = \pm A \, dt$. We choose the positive sign, since we observe
the universe to be expanding. This can be integrated directly: the equation is

\[
  \int_{a_*}^{a} \tilde a^{\frac{1+3w}{2}} \, d\tilde a = A \int_{t_*}^{t} d\tilde t = A (t - t_*) , \tag{2.35}
\]

but we must distinguish two cases: either $(1 + 3w)/2 = -1$, which is equivalent to $w = -1$,
or not. Let us first assume that $w \neq -1$. Then, we get:
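
\begin{align}
  \frac{2}{3 + 3w} \left. \tilde a^{\frac{3+3w}{2}} \right|_{a_*}^{a} &= A (t - t_*) \tag{2.36} \\
  a^{\frac{3+3w}{2}} - a_*^{\frac{3+3w}{2}} &= \frac{3(1+w)}{2} \underbrace{\sqrt{\frac{8\pi G}{3} \rho_*}}_{H_*} \, a_*^{\frac{3+3w}{2}} (t - t_*) && \text{(since $k = 0$ we have $H_*^2 = \tfrac{8\pi G}{3} \rho_*$)} \tag{2.37} \\
  a^{\frac{3+3w}{2}} &= a_*^{\frac{3+3w}{2}} \left( 1 + \frac{3}{2} (1 + w) H_* (t - t_*) \right) \tag{2.38} \\
  a(t) &= a_* \left( 1 + \frac{3}{2} (1 + w) H_* (t - t_*) \right)^{\frac{2}{3(1+w)}} , \tag{2.39}
\end{align}

which we can couple to the equation for the evolution of the density, by plugging this
expression for the scale factor directly into (2.32):

\[
  \rho(t) = \rho_* \left( 1 + \frac{3}{2} (1 + w) H_* (t - t_*) \right)^{-2} , \tag{2.40}
\]

and also the Hubble parameter $H \propto \sqrt{\rho}$:

\[
  H(t) = H_* \left( 1 + \frac{3}{2} (1 + w) H_* (t - t_*) \right)^{-1} . \tag{2.41}
\]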

There is a time at which the bracket in $a(t)$ is zero, which means $a = 0$: this moment is
commonly known as the “Big Bang”, so we call it $t_{BB}$, defined by

\[
  1 + \frac{3}{2} (1 + w) H_* (t_{BB} - t_*) = 0 . \tag{2.42}
\]

Since the curvature scalar is $R \propto H^2 \propto \rho$,[8] at $t_{BB}$ the curvature diverges.
This time can be expressed by inverting the equation:

\[
  t_{BB} = t_* - \frac{2}{3 (1 + w) H_*} . \tag{2.44}
\]

Hawking & Ellis proved that if $w > -1/3$ we unavoidably must have a Big Bang.
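We can define a new time variable by

\[
  t_{new} \equiv t - t_{BB} = (t - t_*) + \frac{2}{3 H_* (1 + w)} . \tag{2.45}
\]

Using this new variable the $t_*$ simplifies, and we can just write:

\[
  a \propto t_{new}^{\frac{2}{3(1+w)}} . \tag{2.46}
\]

Inserting this new time variable (which we will just call $t$), we get that the Hubble
parameter is:

\begin{align}
  H(t) = \frac{1}{a} \frac{da}{dt} &= \frac{2}{3(1+w)} \, t^{\frac{2}{3(1+w)} - 1} \, t^{-\frac{2}{3(1+w)}} \tag{2.47} \\
  H(t) &= \frac{2}{3 (1 + w) t} , \tag{2.48}
\end{align}

so we can compute the density:

\begin{align}
  \rho(t) = \frac{3H^2}{8\pi G} &= \frac{3}{8\pi G} \frac{4}{9 (1 + w)^2 t^2} \tag{2.49} \\
  &= \frac{1}{6 (1 + w)^2 \pi G t^2} . \tag{2.50}
\end{align}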

Let us now revisit the cases from before:

1. $w = 0$ is nonrelativistic matter: it has $a \propto t^{2/3}$, $\rho = 1/(6\pi G t^2)$ and $H = 2/(3t)$.
This yields a prediction for the age of the universe of $t \approx 9.6$ Gyr (using the Planck
data [Ade+16]): this is not correct, since the actual value is more like $t \approx 13.8$ Gyr,
but it has the right order of magnitude; the discrepancy is due to the fact that the
assumption of the universe being dominated by nonrelativistic matter is wrong.

2. $w = 1/3$ is radiation: it has $a \propto t^{1/2}$, $\rho = 3/(32\pi G t^2)$ and $H = 1/(2t)$;


[8] This can be shown by a simple argument: we take the trace of the EFE, to get

\[
  g^{\mu\nu} \left( R_{\mu\nu} - \frac{1}{2} R g_{\mu\nu} \right) = 8\pi G g^{\mu\nu} T_{\mu\nu} \implies -R = 8\pi G T , \tag{2.43}
\]

where $R = g^{\mu\nu} R_{\mu\nu}$ and $T = g^{\mu\nu} T_{\mu\nu}$ are the traces of the Ricci tensor and of the stress-energy tensor. We’ve
shown that $R \propto T$: so, since $T \propto \rho$, we have $R \propto \rho$, but if we also assume that there is no spatial curvature then
$H^2 \propto \rho$, therefore $R \propto H^2$.
3. $w = -1$ is dark energy: as we mentioned before, this is the case which must be treated
separately. The integral yields:

\[
  \log\left( \frac{a}{a_*} \right) = A (t - t_*) , \tag{2.51}
\]

which means

\[
  a(t) = a_* \exp\left( H_* a_*^{\frac{3(1+w)}{2}} (t - t_*) \right) , \tag{2.52}
\]

while, as we saw, $\rho$ and $P$ (and, therefore, $H$) are constant.
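
The closed-form results (2.48) and (2.50) are easy to evaluate numerically. The sketch below reproduces the age estimate quoted in case 1; the value $H_0 = 67.74$ km/s/Mpc is an assumed Planck-like number, and the function names and unit constants are illustrative choices.

```python
import math

KM_PER_MPC = 3.0857e19       # kilometres in one megaparsec
SECONDS_PER_GYR = 3.1557e16  # seconds in 10^9 Julian years
G_SI = 6.674e-11             # Newton's constant in m^3 kg^-1 s^-2


def hubble_time_gyr(h0_km_s_mpc):
    """1/H0 in Gyr, for H0 given in km/s/Mpc."""
    return KM_PER_MPC / h0_km_s_mpc / SECONDS_PER_GYR


def eds_hubble(t, w):
    """H(t) = 2 / (3 (1 + w) t), eq. (2.48); t is measured from the Big Bang."""
    return 2.0 / (3.0 * (1.0 + w) * t)


def eds_density(t, w, big_g=G_SI):
    """rho(t) = 1 / (6 pi G (1 + w)^2 t^2), eq. (2.50); SI units if t is in seconds."""
    return 1.0 / (6.0 * math.pi * big_g * (1.0 + w) ** 2 * t ** 2)


def eds_age_gyr(h0_km_s_mpc, w):
    """Age t0 = 2 / (3 (1 + w) H0), from eq. (2.48) evaluated today. Valid for w != -1."""
    return 2.0 / (3.0 * (1.0 + w)) * hubble_time_gyr(h0_km_s_mpc)


H0_ASSUMED = 67.74  # km/s/Mpc, assumed value for illustration
print("matter-dominated age [Gyr]:", eds_age_gyr(H0_ASSUMED, 0.0))
print("radiation-dominated age [Gyr]:", eds_age_gyr(H0_ASSUMED, 1.0 / 3.0))
```

With this assumed $H_0$ the matter-dominated age comes out between 9.5 and 9.7 Gyr, consistent with the value of about 9.6 Gyr given in the text.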

From the redshift we can trace back the time of emission of the photon: for a matter-dominated
universe ($w = 0$), for example, we have:

\[
  1 + z = \frac{a_0}{a} = \left( \frac{t_0}{t} \right)^{2/3} = \left( \frac{2}{3 H_0 t} \right)^{2/3} . \tag{2.53}
\]

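We can calculate the deceleration parameter for this case, to check that it is indeed
positive: we need to differentiate the expression

\[
  a(t) = a_0 \left( \frac{3 H_0 t}{2} \right)^{2/3} , \tag{2.54}
\]

and we find

\[
  q_0 = -\frac{\ddot a_0 a_0}{\dot a_0^2}
      = -\frac{t_0^{-4/3} \, t_0^{2/3}}{\left( t_0^{-1/3} \right)^2} \times \frac{\frac{2}{3} \left( -\frac{1}{3} \right)}{\left( \frac{2}{3} \right)^2}
      = \frac{1}{2} , \tag{2.55}
\]

where the $a_0 (3H_0/2)^{2/3}$ terms all simplify.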

This is a special case of the fact that [CL02, eq. 2.2.4b]

\[
  q \equiv q_0 = \frac{1 + 3w}{2} . \tag{2.56}
\]
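
The relation (2.56) can be checked numerically against the power-law solution (2.46). This is a minimal sketch using central finite differences; the function names, the evaluation time $t = 1$ and the step size are illustrative assumptions.

```python
def scale_factor(t, w):
    """Einstein-de Sitter scale factor a = t^(2 / (3 (1 + w))), eq. (2.46) with unit prefactor."""
    return t ** (2.0 / (3.0 * (1.0 + w)))


def deceleration_parameter(w, t=1.0, step=1e-3):
    """q = -a_ddot * a / a_dot^2, with derivatives from central finite differences."""
    a_minus = scale_factor(t - step, w)
    a_mid = scale_factor(t, w)
    a_plus = scale_factor(t + step, w)
    a_dot = (a_plus - a_minus) / (2.0 * step)
    a_ddot = (a_plus - 2.0 * a_mid + a_minus) / step ** 2
    return -a_ddot * a_mid / a_dot ** 2


for w in (0.0, 1.0 / 3.0, 1.0):
    # Numerical value next to the closed form (1 + 3w) / 2.
    print(w, deceleration_parameter(w), (1.0 + 3.0 * w) / 2.0)
```

Since the prefactor of $a(t)$ cancels in $q$, the unit normalization does not affect the result.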

2.4 Measuring distances


We want to be able to compute the comoving radius, given our knowledge of the
evolution of the distribution of energy density in time.
We have shown that the luminosity distance is given by:

\[
  d_L \equiv \sqrt{\frac{L}{4\pi \ell}} = a_0 (1 + z) r(z) . \tag{2.57}
\]

Also recall conformal time $\eta$, which is defined by its relation to cosmic time, $a(\eta) \, d\eta = dt$:
it allows us to write the FLRW metric as

\[
  ds^2 = a^2(\eta) \left( c^2 d\eta^2 - \frac{dr^2}{1 - kr^2} - r^2 d\Omega^2 \right) . \tag{2.58}
\]

This is very important when we talk about massless particles, with no intrinsic length
scale: the photon, which is our primary tool for astrophysical observations, is one of them.
This can be written in terms of the variable $\chi$:

\[
  ds^2 = a^2(\eta) \left( c^2 d\eta^2 - d\chi^2 - f_k^2(\chi) \, d\Omega^2 \right) , \tag{2.59}
\]

where $f_k(\chi) = r$ is equal to $\sin(\chi)$, $\chi$ or $\sinh(\chi)$ if $k$ is equal to $1$, $0$ or $-1$; in other words we
either have $\chi = \arcsin(r)$, $\chi = r$ or $\chi = \operatorname{arcsinh}(r)$.
If we look at photons moving radially we do not need to account for the angular part,
and we find

\[
  ds^2 = 0 = a^2(\eta) \left( c^2 d\eta^2 - d\chi^2 \right) , \tag{2.60}
\]

therefore $c^2 d\eta^2 = d\chi^2$: we get $c \left[ \eta(t_0) - \eta(t_e) \right] = \chi(r_e) - \chi(r_0)$, where a subscript $e$ means
“emission”, while a subscript $0$ means detection. We are choosing the negative sign when
simplifying the square, since the problem we are considering is that of radiation starting
from an astrophysical source and coming towards us: its radial coordinate $\chi$ decreases when
the temporal coordinate $\eta$ increases.
This means that we can find out the comoving distance $\Delta\chi$ between two events by
calculating the difference between their conformal times $\Delta\eta$. This is what was meant by the
fact that this expression of the metric is useful for massless particles: the scale factor gets
factored out, and we can write the expression in a very simple way.

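From the definition of conformal time we have

\[
  d\eta = \frac{dt}{a} = \frac{da}{a \dot a} , \tag{2.61}
\]

and now recall $(1 + z) = a_0/a$: we differentiate this with respect to time to find

\[
  \frac{dz}{dt} = -\frac{a_0}{a^2} \dot a = -\frac{a_0 H(z)}{a} , \tag{2.62}
\]

which means (taking the inverse of the equation, splitting the differentials and using the definition of $\eta$)

\[
  d\eta = \frac{dt}{a} = -\frac{dz}{a_0 H(z)} , \tag{2.63}
\]

so we get our final expression, using the fact that $d\chi = -c \, d\eta$:

\[
  d\chi = \frac{c \, dz}{a_0 H(z)} . \tag{2.64}
\]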

So, if we can find a way to parametrize the Hubble parameter $H(z)$ in terms of the
redshift we will be able to measure distances.
The Hubble parameter is given by

\[
  H^2 = \frac{8\pi G}{3} \rho - \frac{kc^2}{a^2} , \tag{2.65}
\]

where the density comes from several components: $\rho(t) = \rho_r(t) + \rho_m(t) + \rho_\Lambda$, where the
first term is the density of radiation and scales like $a^{-4}$, the second is the density of matter
and scales like $a^{-3}$, the third is the density of dark energy and is constant.

In terms of the redshift, they scale like $(1 + z)^4$, $(1 + z)^3$ and $(1 + z)^0$ respectively.
We express the Hubble parameter as a multiple of its value now: $H(z) = H_0 E(z)$, where
$E(z)$ is a dimensionless function.
Recall the definition of $\Omega(t)$: it describes the ratio of the density of a certain type of fluid
to the critical density. We can look at the $\Omega_i(t)$ for $i$ corresponding to matter, radiation and
so on:

\[
  \Omega_i(z) = \frac{8\pi G \rho_i(z)}{3 H^2(z)} = \frac{8\pi G \rho_i(z = 0)}{3 H_0^2} \times \frac{\rho_i(z)/\rho_i(z = 0)}{E^2(z)} = \Omega_{i,0} \frac{(1 + z)^\alpha}{E^2(z)} , \tag{2.66}
\]

where $\alpha$ is the exponent of the scaling of the fluid: $\alpha = 4$ for radiation, $\alpha = 3$ for matter,
$\alpha = 0$ for the cosmological constant $\Lambda$, while for spatial curvature $\alpha = 2$.
For the $\Omega$ corresponding to the curvature we define: $\Omega_k = -kc^2/(a^2 H^2)$ (see equation
(2.30)).
We must have

\[
  1 = \Omega_r + \Omega_m + \Omega_\Lambda + \Omega_k . \tag{2.67}
\]

We can write an expression for $E^2(z)$ by taking the ratio of the densities at emission
versus now:

\[
  E^2(z) = \frac{H^2}{H_0^2} = \Omega_{\Lambda,0} + \Omega_{m,0} (1 + z)^3 + \Omega_{r,0} (1 + z)^4 + \Omega_{k,0} (1 + z)^2 , \tag{2.68}
\]

and to get $E$ we just take the square root.


Now we can finally compute our integral

\[
  \chi(z) = \frac{c}{a_0 H_0} \int_0^z \frac{dz'}{E(z')} , \tag{2.69}
\]

therefore

\[
  r = f_k\left( \frac{c}{a_0 H_0} \int_0^z \frac{dz'}{E(z')} \right) . \tag{2.70}
\]

This does depend on $k$, but the differences between positive and negative curvature are
only relevant starting from third order: regardless of the curvature, $f_k$ is close to the identity
for small $z$. If the curvature is zero, we get the comoving distance:

\[
  d_C = r a_0 = \frac{c}{H_0} \int_0^z \frac{dz'}{E(z')} . \tag{2.71}
\]

If the curvature is not zero, we can still define a useful distance: the transverse comoving
distance,

\[
  d_M = a_0 r = a_0 f_k\left( \frac{c}{a_0 H_0} \int_0^z \frac{dz'}{E(z')} \right) ; \tag{2.72}
\]

for $k = 0$ these two coincide.
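
The integrals (2.69) to (2.72) rarely have closed forms, so in practice they are evaluated numerically. The following is a minimal sketch using a composite Simpson rule; the function names, the default number of steps and the density parameters used in the example call are illustrative assumptions. For $k = \pm 1$ it uses $c/(a_0 H_0) = \sqrt{|\Omega_{k,0}|}$, which follows from the definition of $\Omega_k$ evaluated today.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def e_of_z(z, omega_m, omega_l, omega_r=0.0):
    """Dimensionless Hubble rate E(z) = H(z)/H0, eq. (2.68)."""
    omega_k = 1.0 - omega_m - omega_l - omega_r  # closure relation (2.67) today
    zp1 = 1.0 + z
    return math.sqrt(omega_l + omega_m * zp1**3 + omega_r * zp1**4 + omega_k * zp1**2)


def comoving_integral(z, omega_m, omega_l, omega_r=0.0, steps=1000):
    """Integral of dz'/E(z') from 0 to z, composite Simpson rule (steps must be even)."""
    h = z / steps
    total = 1.0 / e_of_z(0.0, omega_m, omega_l, omega_r) + 1.0 / e_of_z(z, omega_m, omega_l, omega_r)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / e_of_z(i * h, omega_m, omega_l, omega_r)
    return total * h / 3.0


def transverse_comoving_distance(z, h0, omega_m, omega_l, omega_r=0.0):
    """d_M in Mpc, eq. (2.72), for H0 in km/s/Mpc."""
    omega_k = 1.0 - omega_m - omega_l - omega_r
    hubble_distance = C_KM_S / h0
    integral = comoving_integral(z, omega_m, omega_l, omega_r)
    if abs(omega_k) < 1e-12:            # k = 0: d_M coincides with d_C, eq. (2.71)
        return hubble_distance * integral
    root = math.sqrt(abs(omega_k))      # equals c / (a0 H0) when k = +1 or -1
    if omega_k > 0.0:                   # k = -1, f_k = sinh
        return hubble_distance / root * math.sinh(root * integral)
    return hubble_distance / root * math.sin(root * integral)  # k = +1, f_k = sin


def luminosity_distance(z, h0, omega_m, omega_l, omega_r=0.0):
    """d_L = d_M (1 + z)."""
    return (1.0 + z) * transverse_comoving_distance(z, h0, omega_m, omega_l, omega_r)


def angular_diameter_distance(z, h0, omega_m, omega_l, omega_r=0.0):
    """d_A = d_M / (1 + z)."""
    return transverse_comoving_distance(z, h0, omega_m, omega_l, omega_r) / (1.0 + z)


# Example with assumed parameters: flat universe, Omega_m = 0.3, Omega_Lambda = 0.7.
print(transverse_comoving_distance(0.33, 67.74, 0.3, 0.7), "Mpc")
```

A useful sanity check is the Einstein-de Sitter case ($\Omega_{m,0} = 1$), where the integral is $2\left[1 - (1+z)^{-1/2}\right]$, which equals 1 at $z = 3$.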

| Distance name | Formula | Description |
|---|---|---|
| Comoving distance | $d_C = r a_0 = \frac{c}{H_0} \int_0^z \frac{dz'}{E(z')}$ | Distance in comoving coordinates multiplied by the current scale factor: if the expansion of the universe froze during our measurement, this is the distance we would measure between the two events. Assumes $k = 0$. |
| Transverse comoving distance | $d_M = r a_0 = a_0 f_k\left( \frac{c}{H_0 a_0} \int_0^z \frac{dz'}{E(z')} \right)$ | Generalization of the comoving distance to $k \neq 0$. |
| Luminosity distance | $d_L = d_M (1 + z) = \sqrt{L/(4\pi \ell)}$ | Distance defined so that the radiative intensity we measure follows the inverse square law. |
| Angular diameter distance | $d_A = d_M (1 + z)^{-1} = \Delta x/\Delta\theta$ | Distance defined by the ratio of a far-away object’s size (measured using the scale factor at the time of the emission of the radiation we observe now) to its angular size. |

Figure 2.2: A summary of the cosmological distances we defined, drawing on the summary
by Hogg [Hog00].

Now, suppose we are looking at a certain far-away object with angular size $\Delta\theta$ and
linear size at emission of $\Delta x$: then the angular diameter distance is given, in the small-angle
approximation, by

\[
  d_A = \frac{\Delta x}{\Delta\theta} = a(t_e) \, r = \frac{a_0 r}{1 + z} = \frac{d_M}{1 + z} . \tag{2.73}
\]

Since the luminosity distance is given by

\[
  d_L = a_0 (1 + z) r = d_M (1 + z) , \tag{2.74}
\]

their ratio is

\[
  \frac{d_L}{d_A} = (1 + z)^2 . \tag{2.75}
\]

2.5 The cosmological constant


Einstein thought that the universe had to be static: it was a common notion at the time
that it should be, almost a philosophical principle.[9] Now we know that the universe is
neither static nor stationary.[10]

[9] An interesting historical fact: this was corroborated by a calculation error on Einstein’s part, which was
later pointed out by Friedmann. Einstein thought [Ein22] that $\nabla_\mu T^{\mu\nu} = 0$ implied $\partial_t \rho = 0$, while Friedmann
pointed out [Fri22] that the correct equation reads $\partial_t \left( \sqrt{g} \rho \right) = 0$: the density of the universe is not forced to be
time independent if the determinant of the metric changes accordingly. Even the best make mistakes.

So, he sought static solutions ($a = \text{const}$) for matter ($P = 0$) to the Friedmann equations
(1.9): if we set $\dot a = \ddot a = 0$ the third equation becomes $\dot\rho = 0$, the second equation gives us
$\rho \equiv 0$, and from the first we must also have $k = 0$: the only way to have a static matter-filled
universe is for the density of matter to be zero, and for the spatial curvature to be also zero.
In order to satisfy what he thought was an empirical fact, Einstein modified his equations
in order to get a static non-empty solution.
The Einstein equations, from which the Friedmann ones are derived, read

\[
  G_{\mu\nu} = 8\pi G T_{\mu\nu} , \tag{2.76}
\]

when $c = 1$, where the Einstein tensor $G_{\mu\nu}$ can be defined in terms of the Ricci curvature
tensor $R_{\mu\nu}$ and the scalar curvature $R$ as:

\[
  G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R . \tag{2.77}
\]

This peculiar construction is the only one which can be made in terms of the curvature
tensor and which has zero covariant divergence: $\nabla_\mu G^{\mu\nu} = 0$. This is a necessary condition since
$\nabla_\mu T^{\mu\nu} = 0$: the Einstein equations state that they are proportional, so if we take the covariant
divergence of the equations we must get the identity $0 = 0$.
Einstein added a term $-\Lambda g_{\mu\nu}$ to the LHS of the Einstein equations, with $\Lambda$ a constant
scalar. This is allowed since

1. it is tensorial (since it is a scalar multiple of the metric, which is a tensor);

2. it is symmetric;

3. it has zero covariant divergence, since $\Lambda$ is constant and the metric is covariantly
constant: $\nabla_\mu g^{\mu\nu} = 0$.

Then, we can rewrite the Einstein equations in two equivalent ways: either

\begin{align}
  \tilde G_{\mu\nu} &= 8\pi G T_{\mu\nu} & &\text{with} & \tilde G_{\mu\nu} &= G_{\mu\nu} - \Lambda g_{\mu\nu} \tag{2.78} \\
  G_{\mu\nu} &= 8\pi G \tilde T_{\mu\nu} & &\text{with} & \tilde T_{\mu\nu} &= T_{\mu\nu} + \frac{\Lambda g_{\mu\nu}}{8\pi G} . \tag{2.79}
\end{align}

In the first interpretation, the cosmological constant is an intrinsic geometric property of
spacetime; in the second interpretation the cosmological constant is a particular kind of fluid,
with the property of its contribution to the stress-energy tensor always being a constant
multiple of the metric.
[10] The distinction between static and stationary is subtle but significant [Lud99]: stationarity is about the
existence of a timelike Killing vector, while staticity is about the timelike Killing vector being orthogonal to
spacelike submanifolds. A concrete example: Schwarzschild geometry is both static and stationary, Kerr
geometry is stationary but not static, FLRW geometry is neither, since there is no timelike Killing vector field.

In order to find out what the properties of this fluid are, we compare its stress-energy
tensor to a generic ideal fluid tensor:

\[
  T^{(\text{generic})}_{\mu\nu} =
  \begin{bmatrix}
    \rho & 0 & 0 & 0 \\
    0 & P & 0 & 0 \\
    0 & 0 & P & 0 \\
    0 & 0 & 0 & P
  \end{bmatrix}
  \qquad
  T^{(\Lambda)}_{\mu\nu} = \frac{\Lambda g_{\mu\nu}}{8\pi G} = \frac{\Lambda}{8\pi G}
  \begin{bmatrix}
    1 & 0 & 0 & 0 \\
    0 & -1 & 0 & 0 \\
    0 & 0 & -1 & 0 \\
    0 & 0 & 0 & -1
  \end{bmatrix} , \tag{2.80}
\]

so the corrections to the stress-energy tensor must be $\rho \to \rho + \Lambda/8\pi G$ and $P \to P - \Lambda/8\pi G$,
or, in other words, the density and pressure of the “cosmological constant fluid” are $\rho_\Lambda =
-P_\Lambda = \Lambda/8\pi G$. This proves that the equation of state of the cosmological constant is
$w = -1$.
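Inserting this into the Friedmann equations (for matter, $P = 0$, in the second one) we get:

\begin{align}
  \left( \frac{\dot a}{a} \right)^2 &= \frac{8\pi G}{3} \rho + \frac{\Lambda}{3} - \frac{k}{a^2} \tag{2.81} \\
  \frac{\ddot a}{a} &= -\frac{4\pi G}{3} \rho + \frac{\Lambda}{3} \tag{2.82} \\
  \dot\rho &= -3 \frac{\dot a}{a} \left( \tilde\rho + \tilde P \right) = -3 \frac{\dot a}{a} \left( \rho + P \right) , \tag{2.83}
\end{align}

and we can see that in the third equation the effect of the source is encompassed in a
term $\tilde\rho + \tilde P$: the two $\Lambda$ terms cancel, since they are opposite. For a cosmological constant-dominated
universe (that is, for a universe in which the only fluid behaves like the
cosmological constant) we have $\dot\rho = \dot P = 0$.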
So, proceeding with the derivation by Einstein, we set $\dot a = \ddot a = 0$: for the first Friedmann
equation we get

\[
  \frac{8\pi G}{3} \rho + \frac{\Lambda}{3} = \frac{k}{a^2} , \tag{2.84}
\]

and for the second:

\[
  4\pi G \rho = \Lambda . \tag{2.85}
\]

So, we substitute the expression for $4\pi G \rho$ into the first Friedmann equation:

\[
  \frac{2}{3} \, 4\pi G \rho + \frac{\Lambda}{3} = \Lambda \left( \frac{2}{3} + \frac{1}{3} \right) = \Lambda = \frac{k}{a^2} . \tag{2.86}
\]

What are the physical conclusions to draw? Since we want matter in the universe we
must have $\rho > 0$, which implies $\Lambda > 0$, which implies $k = 1$: so the universe must be closed.
Friedmann studied perturbations around this solution and found it to be unstable: so,
it is not suitable as a description of the universe. This, combined with the observations by
Hubble of an expanding universe, prompted the scientific community to discard the idea of
a stationary universe in favor of an expanding one.
Einstein probably [Aut18] called the introduction of the cosmological constant into the
equation his “greatest blunder”; however in modern cosmology the idea of a cosmological
constant has gained new vigor: we observe the universe’s expansion to be accelerated, that
is $\ddot a > 0$, and the only way for this to be the case if $\rho > 0$ is if $\Lambda > 0$ as well. It is the only
kind of fluid which has a repulsive gravitational effect.
As opposed to the approach by Einstein, in which the cosmological constant was inserted
to make the universe static, we make it a measurable parameter of our theory.
A candidate for the cosmological constant term, which is a kind of intrinsic energy of
space, is the vacuum energy in Quantum Field Theory: however, the estimate we get when
trying to make this quantitative is around $10^{120}$ times the measured value of $\Lambda$.

Friday 2010-10-18, compiled 2020-11-03

2.5.1 Evolution of a dark energy dominated universe

In order to find out how this parameter affects the universe’s expansion, we consider a
universe in which the only fluid behaves like the cosmological constant. So, we take the first
Friedmann equation (2.81) in the absence of ordinary matter ($\rho = \rho_\Lambda$) and with negligible
spatial curvature ($k = 0$). This yields:

\[
  \left( \frac{\dot a}{a} \right)^2 = \frac{\Lambda}{3} . \tag{2.87}
\]

This is actually a good approximation for the asymptotic state of the universe, since the
cosmological constant term is the only one which does not decay with the scale factor (and
so, with time).
The solution to this differential equation is, as we mentioned in section 2.3:

\[
  a(t) = a_* \exp\left( \sqrt{\frac{\Lambda}{3}} (t - t_*) \right) , \tag{2.88}
\]

which can also be written as $a \propto e^{Ht}$, since $H = \dot a/a = \sqrt{\Lambda/3}$. This is called a steady-state
solution, since the Hubble parameter is constant. It is also called a de Sitter solution:
it belongs to the maximally symmetric 4D spacetime solutions to the Einstein equations,
which are Minkowski, de Sitter and Anti de Sitter: de Sitter has $\Lambda > 0$, while Anti de Sitter has $\Lambda < 0$.
This actually seems to model the observed expansion of the universe well, and until
recently it competed with the Big Bang theory.
The fraction of the cosmic fluid which behaves like dark energy is bound to increase
with time, since as we saw it is the only component which does not decrease in density over
time.
This is expressed formally using the so-called cosmic no-hair theorem, which is actually
a conjecture if it is meant to describe the universe: it states that asymptotically only the
dark energy contribution is relevant, all the matter and everything else is forgotten. In order
to interpolate between the current universe (matter dominated, or in which at least matter has a
sizeable contribution) and the asymptotic one we can use a solution in the form

\[
  a \propto \sinh^{2/3}(At) , \tag{2.89}
\]

where we define $2A/3 = \sqrt{\Lambda/3}$, since the hyperbolic sine is asymptotically close to an
exponential.
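
The two limits of the interpolating solution (2.89) can be checked numerically. The sketch below uses the Hubble rate $H = \dot a/a = \frac{2A}{3} \coth(At)$, which follows from differentiating (2.89); the function names and the sample times are illustrative assumptions.

```python
import math


def scale_factor(t, big_a):
    """Unnormalized scale factor a = sinh^(2/3)(A t), eq. (2.89)."""
    return math.sinh(big_a * t) ** (2.0 / 3.0)


def hubble(t, big_a):
    """H = a_dot / a = (2A/3) coth(A t), obtained by differentiating eq. (2.89)."""
    return (2.0 * big_a / 3.0) / math.tanh(big_a * t)


BIG_A = 1.0  # arbitrary units; sqrt(Lambda / 3) = 2 A / 3

# Early times: H t tends to 2/3, the matter-dominated value of eq. (2.48) with w = 0.
print(hubble(1e-4, BIG_A) * 1e-4)

# Late times: H tends to the constant 2A/3 = sqrt(Lambda / 3), the de Sitter value.
print(hubble(50.0, BIG_A))
```

So the same function reproduces $a \propto t^{2/3}$ for $At \ll 1$ and exponential expansion for $At \gg 1$.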

2.6 Curved models
We seek solutions to the Friedmann equations for nonzero spatial curvature $k$, for a
universe containing nonrelativistic matter ($w = 0$) without dark energy. We make these
assumptions since with them we can find an analytic solution; they do not match our
observations of the whole universe, but the model we will derive will find an application in
chapter 5, where we will use it to model the collapse of a dark matter halo.
We can rewrite the two independent Friedmann equations as

\begin{align}
  \dot a^2 &= \frac{8\pi G}{3} \rho a^2 - k \tag{2.90} \\
  \rho &= \rho_0 \left( \frac{a}{a_0} \right)^{-3} , \tag{2.91}
\end{align}

and now we will solve them with $k = \pm 1$.

Solutions to parametric ODEs. In general, for an ODE like $y = f(y')$ for the function
$y = y(x)$ with $f'$ continuous we introduce $y' \equiv p$, assuming $p \neq 0$: then $y = f(p)$, which
implies (differentiating both sides of $y = f(p)$)

\[
  y' = \frac{df}{dp} p' , \tag{2.92}
\]

which we can manipulate to get

\[
  p = \frac{df}{dp} p' \implies \frac{dx}{dp} = \frac{1}{p} \frac{df}{dp} , \tag{2.93}
\]

so we can get the solution by integration: we get an expression for $x$ in terms of $p$, which
we will be able to invert since by assumption $p' \neq 0$: so, we get

\[
  x = \int \frac{1}{p} \frac{df}{dp} \, dp \qquad \text{and} \qquad y = f(p) . \tag{2.94}
\]

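We use this for our problem: our differential equation looks like

\begin{align}
  \dot a^2 &= \frac{8\pi G}{3} \rho a^2 - k \tag{2.95} \\
  \dot a^2 &= \frac{8\pi G}{3} \rho_0 \frac{a_0^3}{a^3} a^2 - k && \text{(substituted $\rho = \rho_0 a_0^3/a^3$ from the third Friedmann equation)} \tag{2.96} \\
  \dot a^2 &= A a^{-1} - k , \tag{2.97}
\end{align}

where we defined $A \equiv 8\pi G a_0^3 \rho_0/3$. We can rewrite this as

\[
  a = \frac{A}{p^2 + k} = f(p) \qquad \text{where} \qquad p = \dot a . \tag{2.98}
\]

Then, using the general formula we get:

\begin{align}
  t &= \int \frac{1}{p} \frac{df}{dp} \, dp \qquad \text{where} \qquad \frac{df}{dp} = -\frac{2Ap}{(p^2 + k)^2} \tag{2.99} \\
    &= -2A \int \frac{dp}{(p^2 + k)^2} . \tag{2.100}
\end{align}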

2.6.1 Positive curvature: a closed universe
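If $k = +1$, then we can make the substitution $p = \tan(\theta)$, which is helpful since $1 + p^2 =
\sec^2\theta$; for the change of variable we have $dp = \sec^2\theta \, d\theta$. So, for the time we find:

\begin{align}
  t &= -2A \int \frac{\sec^2\theta \, d\theta}{\sec^4\theta} \tag{2.101} \\
    &= -2A \int \cos^2(\theta) \, d\theta = -A \left( \theta + \sin(\theta) \cos(\theta) \right) + \text{const} , \tag{2.102}
\end{align}

and we can apply the trigonometric identity $\sin(\theta) \cos(\theta) = \frac{1}{2} \sin(2\theta)$:

\[
  t = -\frac{A}{2} \left( 2\theta + \sin(2\theta) \right) + \text{const} . \tag{2.103}
\]

Now we can define $2\theta = \pi - \alpha$, which allows for the simplification $\sin(2\theta) = \sin(\alpha)$;
also, we can express $p = \tan\theta$ in terms of $\alpha$. This gives us (absorbing the factor $A\pi/2$ into the constant)

\[
  t = \frac{A}{2} \left( \alpha - \sin(\alpha) \right) + \text{const} \qquad \text{and} \qquad p = \tan\left( \frac{\pi}{2} - \frac{\alpha}{2} \right) . \tag{2.104}
\]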

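We almost have our solution: inserting $p(\alpha)$ into the main equation for $a$ (2.98) we get

\begin{align}
  a &= \frac{A}{1 + \tan^2(\pi/2 - \alpha/2)} = A \cos^2\left( \frac{\pi}{2} - \frac{\alpha}{2} \right) && \text{(used $1 + \tan^2 x = 1/\cos^2 x$)} \tag{2.105} \\
    &= \frac{A}{2} \left( 1 + \cos(\pi - \alpha) \right) = \frac{A}{2} \left( 1 - \cos(\alpha) \right) , && \text{(used $\cos^2(x/2) = (1 + \cos x)/2$ and $\cos x = -\cos(\pi - x)$)} \tag{2.106}
\end{align}

which should be complemented with the equation we found for $t$: in the end, our solution
looks like

\begin{align}
  t &= \frac{A}{2} (\alpha - \sin\alpha) + \text{const} \tag{2.107} \\
  a &= \frac{A}{2} (1 - \cos\alpha) , \tag{2.108}
\end{align}

so, in order to interpret this physically we fix $t = 0 \iff \alpha = 0$, which sets “const” to zero,
and we reinsert the constants. In order to do so, we wish to express the constant $A/2$ in
terms of observables such as $H_0$ and $\Omega_0 = \rho_0/\rho_{0c}$. We have:

\[
  \Omega_0 = \frac{8\pi G \rho_0}{3 H_0^2} \implies A = \frac{8\pi G a_0^3 \rho_0}{3} = \Omega_0 H_0^2 a_0^3 , \tag{2.109}
\]

which we can simplify by making use of the first Friedmann equation, which reads (we are treating the case $k = 1$):

\[
  H_0^2 = \frac{8\pi G}{3} \rho_0 - \frac{k}{a_0^2} \implies 1 = \Omega_0 - \frac{1}{a_0^2 H_0^2} \implies a_0^2 H_0^2 = \frac{1}{\Omega_0 - 1} , \tag{2.110}
\]

so we can write $A/2$ in two different ways:

\[
  \frac{A}{2} = \frac{a_0}{2} \frac{\Omega_0}{\Omega_0 - 1} = \frac{1}{2 H_0} \frac{\Omega_0}{(\Omega_0 - 1)^{3/2}} . \tag{2.111}
\]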

We use one of these for $a$ and the other for $t$: this is done because it makes the prefactor
of the expression manifestly dimensionally consistent with the quantity we are expressing
(this is not always the case when working with $c = 1$). Then, the expressions for $a$ and $t$
become:

\begin{align}
  a &= a_0 \frac{\Omega_0}{2 (\Omega_0 - 1)} \left( 1 - \cos(\alpha) \right) = \tilde a_0 \frac{1 - \cos\alpha}{2} \tag{2.112} \\
  t &= \frac{1}{H_0} \frac{\Omega_0}{2 (\Omega_0 - 1)^{3/2}} \left( \alpha - \sin(\alpha) \right) = \tilde t_0 \frac{\alpha - \sin\alpha}{2} . \tag{2.113}
\end{align}

For the discussion of these results we rename the angle variable from $\alpha$ to $\theta$ for historical
reasons.

[Figure 2.3: plot of the normalized scale factor $a/\tilde a_0$ and of the normalized time $t/\tilde t_0$ against the angle parameter $\theta$, for $\theta$ between $0$ and $2\pi$.]

Figure 2.3: A plot of $a(\theta)$ and $t(\theta)$.

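We have $\dot a > 0$ when $0 \leq \theta \leq \theta_m = \pi$, while $\dot a < 0$ when $\theta_m \leq \theta \leq 2\pi$: so, we call $\theta_m$ the
turn-around angle. The angles $0$ and $2\pi$ correspond to the Big Bang and the Big Crunch.
At $\theta_m$ we have:

\begin{align}
  a_m &= \tilde a_0 = a_0 \frac{\Omega_0}{\Omega_0 - 1} \tag{2.114} \\
  t_m &= \frac{\pi}{2} \tilde t_0 = \frac{\pi}{2 H_0} \frac{\Omega_0}{(\Omega_0 - 1)^{3/2}} . \tag{2.115}
\end{align}

[Figure 2.4: plot of the scale factor $a/\tilde a_0$ against the time $t/\tilde t_0$.]

Figure 2.4: A plot of the reparametrization $a(t)$.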
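
The parametric solution (2.112), (2.113) can be evaluated directly to trace the expansion and recollapse. The following is a minimal sketch in units where $c = 1$; the function name and the default values $H_0 = 1$, $a_0 = 1$ are illustrative assumptions.

```python
import math


def closed_model(theta, omega0, h0=1.0, a0=1.0):
    """Return (a, t) at angle theta for a closed (k = +1) matter universe with omega0 > 1."""
    if omega0 <= 1.0:
        raise ValueError("the closed model requires omega0 > 1")
    a_scale = a0 * omega0 / (omega0 - 1.0)           # \tilde a_0 in eq. (2.112)
    t_scale = omega0 / (h0 * (omega0 - 1.0) ** 1.5)  # \tilde t_0 in eq. (2.113)
    a = a_scale * (1.0 - math.cos(theta)) / 2.0
    t = t_scale * (theta - math.sin(theta)) / 2.0
    return a, t


# Turn-around (theta = pi) and Big Crunch (theta = 2 pi) for an assumed omega0 = 2.
print(closed_model(math.pi, 2.0))
print(closed_model(2.0 * math.pi, 2.0))
```

For $\Omega_0 = 2$ the turn-around values are $a_m = 2 a_0$ and $t_m = \pi/H_0$, in agreement with (2.114) and (2.115), and the scale factor returns to zero at $\theta = 2\pi$.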

The age of a closed universe

The total lifetime of the universe, in this scenario, is equal to $2 t_m = \pi \tilde t_0$. How does this
compare to the result we found for a flat universe, namely $t_0 = 2/(3H_0)$ (equation (2.48)
with $w = 0$)?
We set $a(t) = a_0$, which means we are normalizing the scale factor to the current one:
this yields

\[
  1 = \frac{\Omega_0}{\Omega_0 - 1} \frac{1 - \cos\theta}{2} \implies \cos\theta = 1 - \frac{2 (\Omega_0 - 1)}{\Omega_0} = \frac{2}{\Omega_0} - 1 , \tag{2.116}
\]

so we can invert the cosine (assuming we are in the expanding phase: it is not invertible
globally) and insert our expression for $\theta$ into the expression for the time, to get[11]

\[
  t_0 = \frac{1}{2 H_0} \frac{\Omega_0}{(\Omega_0 - 1)^{3/2}} \left( \arccos\left( \frac{2}{\Omega_0} - 1 \right) - \frac{2}{\Omega_0} \sqrt{\Omega_0 - 1} \right) . \tag{2.118}
\]

[Figure 2.5: plot of the universe age $t_0$ in units of $H_0^{-1}$ against the current ratio of density to critical density $\Omega_0$, for the curved model with $k = 1$, the curved model with $k = -1$ and the flat model.]

Figure 2.5: Universe age at the current time as a function of $\Omega_0$. For $\Omega_0 > 1$ we use the
positive curvature model in equation (2.118): the age is lower than $2/(3H_0)$; for $\Omega_0 < 1$ we
use the negative curvature model (2.121): the age is greater than $2/(3H_0)$. The flat model
is plotted with a horizontal line for clarity, but if the universe is spatially flat then we must
have $\Omega_0 = 1$.

As we can see in figure 2.5, this means that the estimated age of the universe is lower
than $2/(3H_0) \approx 9.6$ Gyr, while the measured age of the universe is around 14 Gyr.

[11] We need to use the expression $\sin \arccos(x) = \sqrt{1 - \cos^2(\arccos x)} = \sqrt{1 - x^2}$ with $x = 2/\Omega_0 - 1$, and
then the following manipulation:

\[
  \sqrt{1 - \left( \frac{2}{\Omega_0} - 1 \right)^2} = \sqrt{1 - \frac{4}{\Omega_0^2} + \frac{4}{\Omega_0} - 1} = \frac{2}{\Omega_0} \sqrt{\Omega_0 - 1} . \tag{2.117}
\]

2.6.2 Negative curvature: an open universe
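For $k = -1$ we do exactly the same steps with hyperbolic functions instead of trigonometric
ones, calling the argument of these functions $\psi$ instead of $\theta$: we get

\begin{align}
  a(\psi) &= a_0 \frac{\Omega_0}{2 (1 - \Omega_0)} \left( \cosh\psi - 1 \right) \tag{2.119} \\
  t(\psi) &= \frac{1}{H_0} \frac{\Omega_0}{2 (1 - \Omega_0)^{3/2}} \left( \sinh\psi - \psi \right) , \tag{2.120}
\end{align}

and as before we can calculate the independent variable with $\cosh\psi = 2/\Omega_0 - 1$.[12]
An analogous reasoning to the one before, now using $\sinh \operatorname{arccosh}(x) = \sqrt{x^2 - 1}$, gives us

\[
  t_0 = \frac{1}{2 H_0} \frac{\Omega_0}{(1 - \Omega_0)^{3/2}} \left( \frac{2}{\Omega_0} \sqrt{1 - \Omega_0} - \operatorname{arccosh}\left( \frac{2}{\Omega_0} - 1 \right) \right) , \tag{2.121}
\]

which is plotted, again, in figure 2.5: in this case $t_0 > 2/(3H_0)$! This is then more attractive.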
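
The two age formulas (2.118) and (2.121), together with the flat value $2/(3H_0)$, can be combined into a single function of $\Omega_0$. This is a minimal sketch; the function name and the width of the window around $\Omega_0 = 1$, inside which the flat value is returned to avoid numerical cancellation, are illustrative assumptions.

```python
import math


def age_in_hubble_times(omega0):
    """Return H0 * t0 for a matter-only universe with density parameter omega0 > 0."""
    if omega0 <= 0.0:
        raise ValueError("omega0 must be positive")
    if abs(omega0 - 1.0) < 1e-4:
        # Both curved formulas become 0/0 here; use the flat limit 2/3 (small approximation).
        return 2.0 / 3.0
    x = 2.0 / omega0 - 1.0
    if omega0 > 1.0:
        # Closed model, eq. (2.118).
        bracket = math.acos(x) - (2.0 / omega0) * math.sqrt(omega0 - 1.0)
        return omega0 / (2.0 * (omega0 - 1.0) ** 1.5) * bracket
    # Open model, eq. (2.121).
    bracket = (2.0 / omega0) * math.sqrt(1.0 - omega0) - math.acosh(x)
    return omega0 / (2.0 * (1.0 - omega0) ** 1.5) * bracket


for omega0 in (0.3, 0.5, 1.0, 1.5, 2.0):
    print(omega0, age_in_hubble_times(omega0))
```

For $\Omega_0 = 2$ the closed formula gives exactly $\pi/2 - 1$, while for $\Omega_0 < 1$ the result lies above $2/3$ and approaches 1 as $\Omega_0 \to 0$.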

2.6.3 Considerations on curvature


The experimental fact that $t_0 > 2/(3H_0)$ seems to favour an open universe. However, the
age of the universe is $t_0 \approx 0.96 H_0^{-1}$: looking at figure 2.5 it is clear that in order to account
for it with spatial curvature only we would need $\Omega_0 \ll 1$, and actually $\Omega_0 < \Omega_{0m}$, where
$\Omega_{0m}$ is the current measured ratio of the density of matter to the critical density.
In fact, in the current $\Lambda$CDM model of cosmology this is accounted for using dark energy,
which means a positive cosmological constant.
From the second Friedmann equation

\[
  \frac{\ddot a}{a} = -\frac{4\pi G}{3} \rho (1 + 3w) \tag{2.122}
\]

we know that $\ddot a < 0$, that is, the expansion of the universe decelerates, if $w > -1/3$. A
singular instant at which $a = 0$ must be reached if $w > -1/3$, while this is not necessarily
the case for $w < -1/3$. As we will discuss in the next chapter, this will be one of the
motivations behind the theory of inflation, which does not require the presence of an initial
singularity.

[12] A doubt one might have: where does the sign change from $\theta - \sin(\theta)$ to $\sinh\psi - \psi$?
The difference between the calculations with the trigonometric functions and the hyperbolic functions lies in
the substitution $2\theta = \pi - \alpha$: in the hyperbolic case we cannot do it this way, since the hyperbolic functions do
not have any periodicity like this. Instead, the right substitution looks like $2\psi = i\pi + u$, since $\sinh(i\pi + u) =
-\sinh(u)$.
Then, we have the same expressions as before, but their sign is flipped.

Chapter 3

The thermal history of the universe

In the study of the early stages of the universe the variation of the temperature, which
determines the distribution of the energies of the collisions of the particles, plays a central
role. It makes sense to talk about temperature when the particles are actually in thermal
equilibrium, as they were in the early universe: photons and electrons were continuously
Compton-scattering off each other. After the particles stop continuously interacting we say
that they decoupled, and each component evolves independently.
The fact that the universe’s temperature was much higher in the past is needed to explain
primordial nucleosynthesis: Helium-4 is the outcome of Hydrogen burning but in stellar
evolution it is burned into heavier elements after it is formed, so we would expect to see
small amounts of it. Instead, we see a relatively large amount of Helium-4: it makes up
about a quarter of the universe by mass. The primordial universe being very hot helps
account for this. This was first predicted in 1948, in the notorious $\alpha\beta\gamma$ paper [ABG48].

3.1 Radiation energy density and the equality redshift


In section 2.2.1 we discussed the evolution of the energy density of matter and radiation,
showing that for radiation $\rho_r(z) = \rho_{0r} (1 + z)^4$ while for matter $\rho_m(z) = \rho_{0m} (1 + z)^3$. We
discussed this in the context of electromagnetic radiation, but it also describes well the behavior
of very relativistic particles, such as neutrinos.
We can define a moment called the equality redshift $z_{eq}$. This is when the energy density
of radiation and that of matter were equal: $\rho_r(z_{eq}) = \rho_m(z_{eq})$. This means that

$$ \rho_{0,r} (1 + z_{eq})^4 = \rho_{0,m} (1 + z_{eq})^3 \implies (1 + z_{eq}) = \frac{\rho_{0,m}}{\rho_{0,r}} = \frac{\Omega_{0,m}}{\Omega_{0,r}} , \tag{3.1} $$

where we divided and multiplied by the critical density today.


We know that $\Omega_{0,m}$ is around 0.3, while for the radiation we can deduce the density from
the spectrum of the CMB.
Accounting for everything, we think that

$$ 1 + z_{eq} \simeq 2.3 \times 10^4 \, \Omega_{0,m} h^2 \approx 3370 . \tag{3.2} $$

The value is that obtained by the Planck Collaboration [Ade+16].
This means that the recombination of electrons and protons into Hydrogen, which
occurred around redshift $z_{CMB} \approx 1090$, happened when the universe was already matter
dominated: specifically, the density of matter was $\approx 3$ times that of radiation.
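As a quick numerical check of this statement, here is a minimal sketch that assumes only the scalings $\rho_m \propto (1+z)^3$ and $\rho_r \propto (1+z)^4$ and the two redshifts quoted in the text; the function and constant names are illustrative choices of ours.

```python
# Sketch: matter-to-radiation density ratio as a function of redshift.
# Since rho_m ∝ (1+z)^3 and rho_r ∝ (1+z)^4, the ratio is (1+z_eq)/(1+z).

ONE_PLUS_Z_EQ = 3370.0  # equality value quoted in the text (Planck)
Z_CMB = 1090.0          # recombination redshift quoted in the text


def matter_to_radiation_ratio(z, one_plus_z_eq=ONE_PLUS_Z_EQ):
    """Return rho_m(z) / rho_r(z); it equals 1 at the equality redshift."""
    return one_plus_z_eq / (1.0 + z)


ratio_at_cmb = matter_to_radiation_ratio(Z_CMB)
print(f"rho_m / rho_r at recombination: {ratio_at_cmb:.2f}")
```

The ratio comes out slightly above 3, in agreement with the estimate given above.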
Another interesting time is $z_\Lambda$, when the energy density due to the cosmological constant
equalled that of matter: $\rho_m(z_\Lambda) = \rho_\Lambda(z_\Lambda)$, which is calculated with the same reasoning as
$z_{eq}$, recalling that $\rho_\Lambda$ is constant with respect to the redshift:

$$ 1 + z_\Lambda = \left( \frac{\rho_{0,\Lambda}}{\rho_{0,m}} \right)^{1/3} \simeq \left( \frac{0.7}{0.3} \right)^{1/3} \approx 1.33 , \quad \text{that is,} \quad z_\Lambda \approx 0.33 . \tag{3.3} $$

This is relatively close, in cosmological terms: the comoving distance corresponding to
this redshift is around 1350 Mpc, less than 10 % of the comoving distance to the CMB.
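The two numbers above can be reproduced with a short sketch. It assumes a flat $\Lambda$CDM universe with the density parameters used in the text, neglects radiation (which is fine at low redshift), and takes $H_0 = 70$ km/s/Mpc, a value which is our assumption and is not fixed by the text.

```python
import math

OMEGA_M = 0.3        # matter density parameter used in the text
OMEGA_L = 0.7        # cosmological-constant density parameter used in the text
H0 = 70.0            # km/s/Mpc: assumed value, not given in the text
C_KM_S = 299792.458  # speed of light in km/s


def z_lambda(omega_m=OMEGA_M, omega_l=OMEGA_L):
    """Redshift at which rho_m equals rho_Lambda."""
    return (omega_l / omega_m) ** (1.0 / 3.0) - 1.0


def comoving_distance_mpc(z, n_steps=10000):
    """Midpoint-rule integral of c dz / H(z) for flat LCDM without radiation."""
    dz = z / n_steps
    total = 0.0
    for i in range(n_steps):
        zi = (i + 0.5) * dz
        total += dz / math.sqrt(OMEGA_M * (1.0 + zi) ** 3 + OMEGA_L)
    return C_KM_S / H0 * total


z_l = z_lambda()
d_l = comoving_distance_mpc(z_l)
print(f"z_Lambda = {z_l:.3f}, comoving distance = {d_l:.0f} Mpc")
```

The distance scales as $1/H_0$, so a slightly lower Hubble constant moves the result towards the value quoted above.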
What is the temperature of a radiation-dominated universe? From the Stefan-Boltzmann
law we know that $\rho_r \propto T^4$, while as we have discussed previously $\rho_r \propto a^{-4}$. Therefore, we
expect $T \propto 1/a$ to hold: this is known as Tolman’s law. In this chapter we will discuss how
this does approximately hold, but we need to make some corrections due to the annihilation
of ultrarelativistic particles.
We know that in a radiation dominated universe $a \propto t^{1/2}$, which means that $T \propto t^{-1/2}$.
We shall describe the pressure $P$, number density $n$ and energy density $\rho$ in the universe,
as functions of the chemical potentials $\mu$ and of the temperature $T$. We will use natural units,
so that $c = \hbar = k_B = 1$: so, temperatures and masses will be measured in electronvolts.
This is very convenient, since it allows us to make the following consideration: when
the temperature is larger than the mass of a certain elementary particle, statistically that
type of particle will usually be ultra-relativistic.
This section will mostly follow Weinberg’s book [Wei72, page 538, section 15.6].

3.2 Thermodynamics in the early universe


We will express the quantities mentioned above: P, r and n in terms of the distribution
of particles in phase space: in general the phase space for a single particle in 3D is six-
dimensional, but we operate under the assumption that the cosmological principle holds, so
by homogeneity the spatial dependence of the distribution function can be neglected. Thus,
we can talk of densities,1 neglecting the spatial position, and integrating over momentum
space to gather all the information there is to know about the particles in that position.
1 Also, as a general rule, one should avoid talking about “global quantities”: the universe is, in principle,
infinite, so we should refer at most to what is or could be inside our light cone. It is better to work in terms of
densities.

3.2.1 Number density, energy density and pressure
Number density

If $f(\vec{q})$ is the distribution density of particles with three-dimensional momentum $\vec{q}$, the
number density of particles is given by:

$$ n = \frac{g}{(2\pi)^3} \int d^3 q \, f(\vec{q}, T, \mu) , \tag{3.4} $$

where the parameter $g$ is the number of helicity states: it is the number of particles we can
have with different quantum numbers, after fixing momentum and position.
This essentially is our choice for the normalization of the distribution function. We
include the factor $(2\pi)^3$ in order to normalize the integral: the number of particles, a pure
number, is given by

$$ N \propto \int d^3 q \, d^3 x \, f(\vec{q}, \vec{x}) , \tag{3.5} $$

so the right hand side’s differentials have the dimensions of an action cubed: we need to
normalize them, and the conventional action used to do so is $h = 2\pi\hbar$. So, when we set
$\hbar = 1$ we get a factor $(2\pi)^3$ in the denominator. Do note that the $d^3 x$ integral, giving a
volume, is brought to the left in our expression to give a number density.

The number of helicity states

The only quantum number which can vary after fixing those, if we are considering an
elementary particle, is the spin component $s_z$; therefore if the total spin is $s$ we should
have $2s + 1$ possible spin states. For example electrons, which have spin 1/2, will have
$g = 2s + 1 = 2$.
Things, however, are more complicated than this: for photons, we only have two spin
states ($g = 2$) even though they have $s = 1$, since $s_z = 0$ is unphysical for a photon. Gravitons
also have $g = 2$ even though their total spin is $s = 2$: this is because $|s_z| \leq 1$ is unphysical
for a graviton. In general for massless particles Lorentz invariance guarantees that
longitudinal modes cannot exist, since we cannot go to the rest frame of the particle.2
Even for massive particles we do not always have g = 2s + 1: g accounts for all internal
degrees of freedom, and as the temperature drops below a certain value we need to consider
composite particles as well: for atoms we also have vibration, rotation and such. These all
contribute to g.
2 Some more details on this: if we measure the spin component $s_z$ to be equal to some value in a certain
reference frame, then this will mean: (1) if $s_z \neq 0$ we need a rotation of at least $2\pi / s_z$ around the z axis in
order to recover the system we started with, while (2) if $s_z = 0$ the system is symmetric with respect to rotations
around the z axis.
So, for a photon to have $s_z = 0$ we would need to be in a frame in which its wavefunction was cylindrically
symmetric. This cannot be the case if the photon is travelling in the z direction, so we must be in the rest frame
of the photon, which does not exist.
Similarly, for gravitons the argument as to why $s_z \neq 0$ still applies, and we can exclude the spins $|s_z| = 1$
by the following argument: in full generality we can remove all the gauge freedom in a gravitational wave by
going to TT gauge, and we can show that in TT gauge the wave is symmetric under rotations of angle $\pi$ about
the z axis. Therefore, the spin of the graviton must be at least 2 in magnitude.

Energy density

The energy density is given by

$$ \rho = \frac{g}{(2\pi)^3} \int d^3 q \, E(q) \, f(\vec{q}, T, \mu) , \tag{3.6} $$

where $E^2 = q^2 + m^2$.3 For photons $E = q$, for nonrelativistic particles $E \approx m + q^2 / 2m$. Here
we are denoting the modulus of the momentum vector as $q = |\vec{q}|$.
This formula is an average of the energies, weighted by the distribution function over the
momenta.

Pressure

The adiabatic pressure is

$$ P = \frac{g}{(2\pi)^3} \int d^3 q \, \frac{q^2}{3 E(q)} \, f(\vec{q}, T, \mu) . \tag{3.7} $$

This comes from a consideration of the diagonal components of the stress energy tensor of
an ideal fluid: we know that

$$ T^{\mu\nu} = \begin{bmatrix} \rho & 0 & 0 & 0 \\ 0 & P & 0 & 0 \\ 0 & 0 & P & 0 \\ 0 & 0 & 0 & P \end{bmatrix} = \left\langle \frac{g}{(2\pi)^3} \int \frac{d^3 q}{E(\vec{q})} \, f(\vec{q}) \, p^\mu(\vec{q}) \, p^\nu(\vec{q}) \right\rangle , \tag{3.8} $$

where $p^\mu = (E(q), \vec{q})$ is the momentum vector. Although it may not look like it, this
formula is covariant.a The average sign is meant to indicate a spatial average, across
volumes which are wide enough for homogeneity to hold.
This formula reproduces equation (3.6) for $\mu = \nu = 0$: indeed, in this case $p^0(q) = E(q)$.
The off-diagonal components are zero by isotropy: if they were not, we would see heat and
particle flow in specific directions.
So, for the spatial components the integrand looks like $q^i q^j / E(q)$.
The formula for the pressure then follows by isotropy: the total force per unit area to
go around is $q^2 / E$, and it must be distributed equally in the three spatial directions, so if
we want to switch from the directional integral $T^{ii} = P \propto \int q^i q^i / E$ (not summed over $i$) to
an integral of the modulus of the momentum, $P \propto \int q^2 / E$, we must divide by 3.
a We are integrating a tensorial expression ($f p^\mu p^\nu$ is a tensor) with respect to a covariant integration element:
$d^3 q / E(q)$ is a scalar with respect to Lorentz transformations, since it can be obtained, keeping only the positive-energy root, as

$$ d^4 q \, \delta(E^2 - p^2 - m^2) = \frac{d^3 q}{2E} \, dE \, \delta\!\left( E - \sqrt{m^2 + p^2} \right) . \tag{3.9} $$

3 Particles are on-shell, that is they obey the equations of motion (which is not mandatory, and this is what

Quantum Mechanics is all about).

This definition gives us $P = \rho / 3$ for photons directly, which can be seen by substituting
$E = q$.

The distribution function

If the particles are in thermal equilibrium, the distribution in momentum space will be
given by the following expression:

$$ f(\vec{q}) = \left[ \exp\left( \frac{E(q) - \mu}{T} \right) \pm 1 \right]^{-1} , \tag{3.10} $$

where we have a plus for fermions, and a minus for bosons. Here, $\mu = \partial\rho / \partial n$ is the
chemical potential, the derivative of the energy (density) with respect to a change in the
number (density) of particles: it becomes relevant when the gas becomes hot and dense, while if
it is sparse then adding particles does not affect the energy.
The Planck distribution, which describes the statistics of photons, is consistent with this,
since it is given by:

$$ f_k(\vec{q}) = \left[ \exp\left( \frac{q}{T} \right) - 1 \right]^{-1} , \tag{3.11} $$

since they are bosons with no chemical potential.4 The fact that the distribution of photons
is indeed described by this distribution with $\mu = 0$ is a way to experimentally determine the
fact that the chemical potential of photons is indeed zero. If we observed the distribution
for physical blackbodies to have $\mu \neq 0$ this would be called a spectral distortion. The CMB is
wonderfully consistent with $\mu = 0$: it is actually the best Planckian spectrum observed in Nature.
It is a fact that the chemical potential µ can be neglected when dealing with the early
universe. Let us justify this.
4 This may seem weird at first: is the Planck function not

$$ B_\nu(T) = \frac{2\nu^3}{e^{2\pi\nu / T} - 1} \tag{3.12} $$

in natural units? Well, these two are actually equivalent formulations. To see this, recall that in natural units
$q = E = \omega = 2\pi\nu$ for a photon. First of all, they describe different physical quantities: the Planck function
describes the spectral radiance, $dE = B_\nu(T) \, dt \, dA \, d\nu \, d\Omega$, while the distribution $f(q)$ describes the number density
of particles per unit momentum volume: $dN = f(q) \, d^3 q$.
To check their equivalence, let us compute the energy density with both:

\begin{align}
\rho &= \frac{g}{(2\pi)^3} \int d\Omega \, q^2 \, dq \, \frac{q}{e^{q/T} - 1} = \frac{2}{(2\pi)^3} \int d\Omega \, dq \, \frac{\omega^3}{e^{q/T} - 1} = \int d\Omega \, dq \, \frac{2\nu^3}{e^{2\pi\nu/T} - 1} \tag{3.13} \\
\text{but also} \quad \rho &= \frac{dE}{dV} = \int \frac{dE}{dt \, dA \, dq \, d\Omega} \, dq \, d\Omega = \int B_q(T) \, d\Omega \, dq , \tag{3.14}
\end{align}

where we used the fact that, in natural units, $dV = dt \, dA$.

In general, we can say that if for some chemical species we have the reaction $i + j \leftrightarrow k + l$,
and we reach chemical equilibrium, then the chemical potentials of the species will be
connected by $\mu_i + \mu_j = \mu_k + \mu_l$: this is called the Saha equation.
Assuming that we are in thermal equilibrium is not valid in general. We will do so
in our discussion for simplicity, but non-equilibrium dynamics must be considered when
dealing with CMB anisotropies. Although the assumption does not perfectly hold, it is
quite instructive: the CMB spectrum is very close to an equilibrium blackbody spectrum,
and the deviations from equilibrium are small.
By enumerating all the possible chemical reactions between the various particle types
we will get a system of equations for their chemical potentials, complemented with some
known facts, such as the fact that photons have $\mu_\gamma = 0$, which is the case since their
number is not conserved.
For example, from the annihilation of electron and positron $e^+ + e^- \leftrightarrow 2\gamma$ we can derive
a relation between the chemical potentials of $e^+$ and $e^-$: $\mu_{e^+} = -\mu_{e^-}$.
We can relate some chemical potentials by reactions, but not all of them: our system
of equations will be degenerate, with degeneracy corresponding precisely to the globally
conserved quantities (electric charge, lepton number, baryon number) which follow from
the symmetry group of our theory. These can have any value and are conserved in any
reaction,5 so they cannot be fixed by the system.
If there was a global electric charge, we’d expect global magnetic fields, but we only
see them with magnitudes of the order of 1 nT, which gives an upper bound on the
global charge of the universe. So, any global electric charge would be quite small: we will
assume it is exactly zero.
We can estimate the orders of magnitude for the abundances of the various particle
species in the universe. The baryon number is very small when compared to the number of
photons in the universe, roughly speaking $n_\gamma / n_b \sim 10^{10}$.
The lepton number is harder to estimate, but it is reasonable to assume that it is quite
small as well. For a slightly more detailed discussion, see the book by Weinberg [Wei72, before
eq. 15.6.5].
In the end, we can say that in the early universe $\mu / T \ll 1$, so we can assume $\mu \approx 0$. This
is just a reasonable simplification, which we make in order to get analytic results.
Under this assumption the quantities characterizing the matter distribution in the universe
only depend on the temperature: so, we will just write $n(T)$, $\rho(T)$ and $P(T)$.
In general, when dealing with thermodynamic problems in an expanding spacetime
there is a complication: in Minkowski spacetime we have symmetry under time translations,
thus it makes sense to talk about stationarity. In an expanding universe, instead, we have
no Killing vector with respect to time. There is a competition between two evolutions, the
thermodynamic evolution of the system and the expansion of the universe: we cannot truly
have equilibrium!
5 This holds as long as the temperature is low enough: we are considering the reactions which are allowed by
the Standard Model of interactions, with its symmetry group $SU(3)_c \times SU(2)_L \times U(1)_Y$, but it is not currently
known whether at higher temperatures (i.e. earlier times) this is the full symmetry group, or whether a larger
group is spontaneously broken to the SM group. So, the statements we make only apply at relatively late times.

The way to deal with this problem is: we assume that the first evolution is much faster
than the other, that is, we reach thermal equilibrium on timescales that are short if compared
to the expansion. This way, we can neglect the expansion of the universe while our system
reaches equilibrium.
So our problem is oversimplified: we assume thermodynamic equilibrium, which makes
sense in certain periods of the life of the universe, and that allows us to embed a thermal
situation into a universe which evolves in time.

3.2.2 Entropy
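From the second principle of thermodynamics we know that the entropy in a certain
volume $V$ at temperature $T$, denoted $S(V, T)$, satisfies:

$$ dS = \frac{1}{T} \Big( \underbrace{d\big( \rho(T) V \big)}_{dE} + P(T) \, dV \Big) = \frac{1}{T} \Big( V \, d\rho(T) + \big( P(T) + \rho(T) \big) \, dV \Big) , \tag{3.15} $$

since in order to get the total energy we must multiply the energy density by the volume:
$E = \rho(T) V$.
Then we can read off the partial derivatives of the entropy:

$$ \frac{\partial S}{\partial V} = \frac{1}{T} \big( \rho(T) + P(T) \big) \quad \text{and} \quad \frac{\partial S}{\partial T} = \frac{V}{T} \frac{d\rho(T)}{dT} . \tag{3.16} $$

In order for the differential to be exact it needs to be closed, which means that the second
partial derivatives need to commute (these are known as the Pfaff relations):6

\begin{align}
\frac{\partial^2 S}{\partial T \partial V} &= \frac{\partial^2 S}{\partial V \partial T} \tag{3.17} \\
\frac{\partial}{\partial T} \left[ \frac{1}{T} \big( \rho(T) + P(T) \big) \right] &= \frac{\partial}{\partial V} \left[ \frac{V}{T} \frac{d\rho(T)}{dT} \right] \tag{3.18} \\
-\frac{1}{T^2} (\rho + P) + \frac{1}{T} \left( \frac{d\rho}{dT} + \frac{dP}{dT} \right) &= \frac{1}{T} \frac{d\rho}{dT} \tag{3.19} \\
\frac{dP}{dT} &= \frac{1}{T} (\rho + P) . \tag{3.20}
\end{align}

(In the last step we simplified $T^{-1} (d\rho / dT)$, multiplied by $T$ and brought $T^{-1} (\rho + P)$ to the other side.)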
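6 They only hold in a simply connected space.

Cosmology has not entered into the picture yet, but it can by the third Friedmann
equation, which can be rewritten as follows (multiplying by $a^3$ in the second line, and
adding $a^3 \dot{P}$ on both sides in the third):

\begin{align}
\dot{\rho} &= -3 \frac{\dot{a}}{a} (\rho + P) \tag{3.21} \\
0 &= 3 \dot{a} a^2 (\rho + P) + a^3 \dot{\rho} \tag{3.22} \\
a^3 \dot{P} &= 3 \dot{a} a^2 (\rho + P) + a^3 \dot{\rho} + a^3 \dot{P} \tag{3.23} \\
a^3 \dot{P} &= \frac{d(a^3)}{dt} (\rho + P) + a^3 \frac{d(\rho + P)}{dt} \tag{3.24} \\
a^3 \dot{P} &= \frac{d}{dt} \left( a^3 (\rho + P) \right) , \tag{3.25}
\end{align}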
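and these two, when put together, are equivalent to

$$ \frac{d}{dt} \left[ \frac{a^3}{T} \big( \rho(T) + P(T) \big) \right] = 0 , \tag{3.26} $$

therefore this quantity is a constant of motion. Let us verify this statement: expanding the
derivative, and using equation (3.20) in the last step, we get

\begin{align}
\frac{d}{dt} \left[ \frac{a^3}{T} (\rho + P) \right] &= \frac{1}{T} \frac{d}{dt} \left( a^3 (\rho + P) \right) + a^3 (\rho + P) \frac{d}{dt} \left( \frac{1}{T} \right) \tag{3.27} \\
&= \frac{1}{T} a^3 \dot{P} - a^3 (\rho + P) \frac{\dot{T}}{T^2} \tag{3.28} \\
&= \frac{a^3}{T} \frac{dP}{dT} \dot{T} - a^3 (\rho + P) \frac{\dot{T}}{T^2} \tag{3.29} \\
&= \frac{a^3}{T} \frac{(\rho + P)}{T} \dot{T} - a^3 (\rho + P) \frac{\dot{T}}{T^2} = 0 . \tag{3.30}
\end{align}

For the RW line element, the square root of the determinant is given by $\sqrt{-g} = a^3$, so
the conserved quantity can be written as

$$ \frac{d}{dt} \left( \sqrt{-g} \, \frac{\rho + P}{T} \right) = 0 . \tag{3.31} $$

This is relevant because the volume of any given spatial region scales with $\sqrt{-g}$ as the
universe expands.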
So the quantity which is differentiated is constant. If we plug this back into the differential
expression for the entropy, we get:7

$$ dS = d\left( \frac{(\rho + P) V}{T} \right) , \tag{3.37} $$

therefore the differentiated quantities are equal up to an additive constant; from the conserved
quantity we found and the fact that $V \propto a^3$ we now get that the entropy is constant
in a comoving volume in thermal equilibrium:

$$ S \equiv S(a^3, T) = \frac{a^3}{T} (\rho + P) = \text{const} . \tag{3.38} $$

Let us see what this entails: if we take photons, for example, we have $\rho \propto P \propto a^{-4}$: if we
substitute this in we find that $a^{-4+3} / T = (aT)^{-1}$ must be a constant, therefore $T \propto a^{-1}$. This is
known as Tolman’s law.
We only consider photons since they have a much larger number density.

3.2.3 Explicit expressions for the thermodynamic quantities


Let us give explicit expressions for the number density, energy density and pressure as
functions of temperature. We are always assuming isotropy, so in all cases we will be able to simplify
the angular part of the triple integral in $d^3 q$ as

$$ \int d^3 \vec{q} = 4\pi \int_0^\infty dq \, q^2 , \tag{3.39} $$

so the three expressions will read

\begin{align}
n(T) &= \frac{g}{2\pi^2} \int dq \, q^2 f(q) \tag{3.40a} \\
\rho(T) &= \frac{g}{2\pi^2} \int dq \, q^2 f(q) E(q) \tag{3.40b} \\
P(T) &= \frac{g}{6\pi^2} \int dq \, q^2 f(q) \frac{q^2}{E(q)} . \tag{3.40c}
\end{align}

7 The procedure to prove this result is as follows: the expression we want to show is equal to $dS$ can be
written like

$$ dS \overset{?}{=} d\left( \frac{(\rho + P) V}{T} \right) = \left( -\frac{(\rho + P) V}{T^2} + \frac{V}{T} \left( \frac{d\rho}{dT} + \frac{dP}{dT} \right) \right) dT + \frac{\rho + P}{T} \, dV , \tag{3.32} $$

while the definition of $dS$ (3.15) can be written as

$$ dS = \frac{V}{T} \frac{d\rho}{dT} \, dT + \frac{\rho + P}{T} \, dV , \tag{3.33} $$

so we can see that, since the term proportional to $dV$ is the same in both cases, we only need to show that the
coefficients of $dT$ are equal, so what we need to prove is

\begin{align}
-\frac{(\rho + P) V}{T^2} + \frac{V}{T} \left( \frac{d\rho}{dT} + \frac{dP}{dT} \right) &\overset{?}{=} \frac{V}{T} \frac{d\rho}{dT} \tag{3.34} \\
-\frac{(\rho + P) V}{T^2} + \frac{V}{T} \frac{dP}{dT} &\overset{?}{=} 0 \tag{3.35} \\
\frac{dP}{dT} &= \frac{\rho + P}{T} , \tag{3.36}
\end{align}

which is precisely the statement we found to be equivalent to the Pfaff relations (3.20).

In general these do not have analytic solutions, however if we only consider the ultrarel-
ativistic and nonrelativistic limiting cases we can do the calculation.

Ultrarelativistic limit A particle being ultrarelativistic means that its momentum is much
greater than its rest energy, $q \gg m$.
In our case we do not really care about any single particle being ultrarelativistic; rather,
we ask that the temperature is high enough that the bulk of the particles is ultrarelativistic.
The momentum of any single particle will not always be large (in fact the distribution
has its maximum at $q = 0$) but the regions in which it is large give a much greater
contribution than those in which it is small, as long as the temperature is large.
We define the rescaled momentum $x = q / T$, so that the term appearing in the
exponential is $E(q) / T = \sqrt{x^2 + m^2 / T^2} \approx x = q / T$ under the assumption that $m / T \ll 1$.
With this assumption we get:

\begin{align}
n(T) &= \frac{g}{2\pi^2} \int_{\mathbb{R}^+} dq \, q^2 \left[ \exp(q/T) \mp 1 \right]^{-1} \tag{3.41} \\
\rho(T) &= \frac{g}{2\pi^2} \int_{\mathbb{R}^+} dq \, q^3 \left[ \exp(q/T) \mp 1 \right]^{-1} \tag{3.42} \\
P(T) &= \frac{g}{6\pi^2} \int_{\mathbb{R}^+} dq \, q^3 \left[ \exp(q/T) \mp 1 \right]^{-1} , \tag{3.43}
\end{align}

so we can see that in this approximation, which is equivalent to $m \approx 0$, we get matter
behaving like radiation: $P = \rho / 3$.
The result of the integrals depends on the statistics of the particles (which determine the
sign in the distribution: minus for bosons, plus for fermions), and it is given by the following expressions:

$$ n(T) = \begin{cases} \dfrac{\zeta(3)}{\pi^2} \, g T^3 & \text{Bose-Einstein} \\[2ex] \dfrac{3}{4} \dfrac{\zeta(3)}{\pi^2} \, g T^3 & \text{Fermi-Dirac} , \end{cases} \tag{3.44} $$

where $\zeta(3)$ is the Riemann zeta function calculated at 3, giving

$$ \zeta(3) = \sum_{n=1}^{\infty} \frac{1}{n^3} \approx 1.202 . \tag{3.45} $$

For the energy density:

$$ \rho(T) = \begin{cases} \dfrac{\pi^2}{30} \, g T^4 & \text{Bose-Einstein} \\[2ex] \dfrac{7}{8} \dfrac{\pi^2}{30} \, g T^4 & \text{Fermi-Dirac} , \end{cases} \tag{3.46} $$

while to get the result for the pressure $P(T) = \rho(T) / 3$ we just divide by 3.
We note that in natural units the Stefan-Boltzmann constant is $\sigma_{SB} = \pi^2 / 60$, so that the
radiation constant is $4 \sigma_{SB} = \pi^2 / 15$; the result we
have found coincides with the Stefan-Boltzmann law for photons, which obey Bose-Einstein
statistics and have two helicity states: $g = 2$, therefore

$$ \rho(T) = 2 \, \frac{\pi^2}{30} \, T^4 = 4 \sigma_{SB} T^4 . \tag{3.47} $$
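The coefficients in the ultrarelativistic expressions for $n(T)$ and $\rho(T)$ can be checked numerically. The following is a minimal sketch using a plain midpoint rule from the standard library; the helper names, the number of steps and the cutoff at $x = 60$ are our own choices, not part of the derivation.

```python
import math


def integrate_midpoint(func, upper, n_steps=60000):
    """Midpoint-rule approximation of the integral of func over [0, upper]."""
    h = upper / n_steps
    return h * sum(func((i + 0.5) * h) for i in range(n_steps))


def thermal_integral(power, sign):
    """Integral of x**power / (exp(x) + sign) over x > 0, with x = q / T.

    sign = -1 gives Bose-Einstein statistics, sign = +1 Fermi-Dirac.
    The upper limit 60 is an assumed cutoff: beyond it exp(-x) is negligible.
    """
    return integrate_midpoint(lambda x: x ** power / (math.exp(x) + sign), 60.0)


# Number density: n = g T^3 / (2 pi^2) * integral of x^2 / (e^x -+ 1)
n_coeff_bose = thermal_integral(2, -1) / (2.0 * math.pi ** 2)
n_coeff_fermi = thermal_integral(2, +1) / (2.0 * math.pi ** 2)

# Energy density: rho = g T^4 / (2 pi^2) * integral of x^3 / (e^x -+ 1)
rho_coeff_bose = thermal_integral(3, -1) / (2.0 * math.pi ** 2)
rho_coeff_fermi = thermal_integral(3, +1) / (2.0 * math.pi ** 2)

zeta3_estimate = n_coeff_bose * math.pi ** 2

print(f"zeta(3) estimate: {zeta3_estimate:.6f}")
print(f"fermion/boson ratio for n:   {n_coeff_fermi / n_coeff_bose:.4f}")
print(f"fermion/boson ratio for rho: {rho_coeff_fermi / rho_coeff_bose:.4f}")
```

The two ratios reproduce the factors 3/4 and 7/8 which distinguish fermions from bosons.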

Nonrelativistic limit Now we work in the opposite limit, $m \gg T$. We can then expand the
energy in powers of $q / m$ (or $T / m$: as before, the point is that the typical value of $q$ is $T$, so
we can do it either way):

$$ E = m \sqrt{1 + \frac{q^2}{m^2}} \approx m + \frac{q^2}{2m} + O\left( \frac{q^4}{m^3} \right) . \tag{3.48} $$

The first temptation one might have is to work at the lowest possible order, approximating
$E \approx m$. The exponential $\exp(E / T)$ will be very large compared to 1, so we can
neglect the $\pm 1$ in the denominator (which also means that the difference between bosons
and fermions becomes negligible).
So, to zeroth order in $(q / m)$ we get

$$ f \approx \exp\left( -\frac{m - \mu}{T} \right) , \tag{3.49} $$

therefore the number density will be given by

$$ n = \frac{g}{2\pi^2} \exp\left( -\frac{m - \mu}{T} \right) \int_{\mathbb{R}^+} dq \, q^2 , \tag{3.50} $$

which diverges.
This is called the ultraviolet catastrophe: it is due to the fact that, while we are assuming
$q$ is small, we are not enforcing this in any way, and approximating all states as having the
same energy regardless of their momentum. If, instead, we keep the next term, $q^2 / 2m$, then we
find

$$ n = \frac{g}{2\pi^2} \exp\left( -\frac{m - \mu}{T} \right) \int_{\mathbb{R}^+} dq \, q^2 \exp\left( -\frac{q^2}{2mT} \right) = g \left( \frac{mT}{2\pi} \right)^{3/2} \exp\left( \frac{\mu - m}{T} \right) , \tag{3.51} $$

where we applied the identity

$$ \int_{\mathbb{R}} dx \, x^2 \exp\left( -a x^2 \right) = \frac{\sqrt{\pi}}{2 a^{3/2}} , \tag{3.52} $$

with $a = 1 / 2mT$, halving the result since the integrand is even and we only integrate over $\mathbb{R}^+$.
The exponential factor $\exp(-m / T)$ is known as the Boltzmann suppression factor, which
tells us that as long as relativistic and nonrelativistic particles are in thermal equilibrium
there will be a much smaller number of the latter.
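To see how good the nonrelativistic formula for $n$ is, we can compare it with a direct numerical integration of the full equilibrium distribution. This is a sketch under the assumptions $\mu = 0$ and $m / T = 50$; the function names and the integration cutoff are illustrative choices of ours.

```python
import math


def number_density_numeric(m, T, g=1.0, sign=1, n_steps=50000):
    """Midpoint-rule integral of g/(2 pi^2) q^2 / (exp(E/T) + sign), with mu = 0.

    sign = +1 for fermions, -1 for bosons. The cutoff at 12 * sqrt(2 m T)
    is an assumed value, far beyond the region where the integrand matters.
    """
    q_max = 12.0 * math.sqrt(2.0 * m * T)
    h = q_max / n_steps
    total = 0.0
    for i in range(n_steps):
        q = (i + 0.5) * h
        energy = math.sqrt(q * q + m * m)
        total += q * q / (math.exp(energy / T) + sign)
    return g / (2.0 * math.pi ** 2) * h * total


def number_density_nonrel(m, T, g=1.0):
    """Nonrelativistic result g (m T / 2 pi)^(3/2) exp(-m / T), with mu = 0."""
    return g * (m * T / (2.0 * math.pi)) ** 1.5 * math.exp(-m / T)


ratio = number_density_numeric(50.0, 1.0) / number_density_nonrel(50.0, 1.0)
print(f"numeric / nonrelativistic at m/T = 50: {ratio:.4f}")
```

The two agree up to a correction of a few percent, which is of relative order $T / m$ and comes from the higher order terms we dropped in the expansion of the energy.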

The energy density can be easily recovered from the number density if we neglect higher
order terms:

$$ \rho(T) = \frac{g}{2\pi^2} \int dq \, q^2 E(q) f(q) \approx \frac{g}{2\pi^2} \int dq \, q^2 \left( m + \frac{q^2}{2m} \right) f(q) \approx m \underbrace{\frac{g}{2\pi^2} \int dq \, q^2 f(q)}_{n(T)} . \tag{3.53} $$

For the pressure, on the other hand, we have

$$ P(T) = \frac{g}{6\pi^2} \int dq \, q^4 \frac{f(q)}{m + \frac{q^2}{2m}} \approx \frac{g}{6\pi^2} e^{-m/T} \int dq \, \frac{q^4}{m} \exp\left( -\frac{q^2}{2mT} \right) , \tag{3.54} $$

and now we can apply the Gaussian integral identity [WA03, special case of eq. 10.1.11 (b)]:

$$ \int_{\mathbb{R}^+} dx \, x^4 \exp\left( -a x^2 \right) = \frac{3}{8 a^2} \sqrt{\frac{\pi}{a}} , \tag{3.55} $$

where for us $a = 1 / 2mT$, which gives us

\begin{align}
P(T) &= \frac{g}{6\pi^2} \frac{e^{-m/T}}{m} \frac{3 (2mT)^2}{8} \sqrt{2mT\pi} = g \, \frac{m^{3/2} T^{5/2}}{\pi^{3/2} 2^{3/2}} \, e^{-m/T} = g \left( \frac{mT}{2\pi} \right)^{3/2} e^{-m/T} \times T \tag{3.56} \\
&= n(T) \, T . \tag{3.57}
\end{align}

Therefore, $P = T n = (T / m) \rho$, which tells us that the pressure of the nonrelativistic
particles is much smaller than their energy density, since $T / m \ll 1$: we characterize them
as noninteracting dust. The result we found, $P = nT$, is just the ideal gas law.
If we compare relativistic particles to nonrelativistic ones, the former dominate the latter
in terms of all of these three quantities.
The physical context in which this becomes relevant, in the early universe, is that when-
ever the temperature drops below the mass of a certain particle, that particle starts to become
nonrelativistic and its density drops exponentially, due to the Boltzmann suppression.
The main way for the particle to do so is generally to annihilate with its own antiparticle,
thus producing radiation.

Effective degrees of freedom We have been discussing the behavior of a single particle
species with $g$ degrees of freedom; however we know that there were many types of particles
in the early universe, so we need a way to generalize these results. We do so by defining the
number of effective degrees of freedom:

$$ g_*(T) = \sum_{i \in \text{BE}} g_i \left( \frac{T_i}{T} \right)^4 + \frac{7}{8} \sum_{i \in \text{FD}} g_i \left( \frac{T_i}{T} \right)^4 , \tag{3.58} $$

where the index $i$ labels all the particle species in our model, running over all of those which
are relativistic at temperature $T$ (that is, as a first approximation, we only count those with
masses $m_i < T$). The equilibrium temperature is $T$, while $T_i$ are the temperatures of the
various particle species, which we allow to be different from $T$; we will elaborate on this
point in a moment. We distinguish two different terms in the sum, depending on whether
the particle species obey Bose-Einstein or Fermi-Dirac statistics, since as we have seen the
latter have a prefactor of 7/8 in the expression for the energy density.
This definition is constructed so that we can write the compact relation

$$ \rho(T) = g_*(T) \, \frac{\pi^2}{30} \, T^4 . \tag{3.59} $$

Of course, considering particles completely when $m_i < T$ and not at all when $m_i > T$ is
a simplification: in the region in which the temperature is of the order of the mass of the
particle there will be a transition, which can be calculated properly by doing the integrals
numerically. The results are shown in figure 3 of a paper by Husdal [Hus16], which can also
be referred to for many more details on effective degrees of freedom. Figure 1 of the same
paper shows how $g_*$ decreases while the temperature of the universe decreases and more
and more particle species become nonrelativistic.
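The definition of $g_*$ translates directly into a few lines of code. In this sketch the function name and the tuple layout `(g_i, is_fermion, T_i / T)` are our own illustrative choices; as an example we take a plasma of photons, electrons, positrons and three neutrino flavors (with two states per flavor) all at the same temperature, as is the case at temperatures of a few MeV.

```python
def g_star(species, power=4):
    """Effective degrees of freedom for a list of relativistic species.

    Each entry is (g_i, is_fermion, T_i / T). Fermions carry a weight 7/8,
    bosons a weight 1; power=4 corresponds to the energy density.
    """
    total = 0.0
    for g_i, is_fermion, temp_ratio in species:
        weight = 7.0 / 8.0 if is_fermion else 1.0
        total += weight * g_i * temp_ratio ** power
    return total


# Photons (g = 2), electrons and positrons (g = 4 in total) and three
# neutrino flavors (g = 6 in total), all at the common temperature T.
plasma_few_mev = [(2, False, 1.0), (4, True, 1.0), (6, True, 1.0)]
g_star_few_mev = g_star(plasma_few_mev)
print(f"g_* = {g_star_few_mev}")
```

Species which are nonrelativistic at the temperature considered are simply left out of the list, which is the sharp-threshold simplification discussed above.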
Why do we consider the possibility of the temperature of a particle species being different
from the equilibrium temperature?
Each process involving particles, be it decay or scattering, is characterized by a certain
timescale. If the timescale of a certain interaction is larger than the cosmological timescale
(the age of the universe), then that interaction statistically will not happen. Particles which
cannot reach thermal equilibrium because of this are called decoupled, ones for which this is
not the case are called coupled.
Although they may not interact, as long as they are relativistic decoupled particles can
still affect the energy density of the universe, so we need to count them.

The time-temperature relation We want to find a relation between time and temperature
in the early universe. Let us consider ultrarelativistic particles which are coupled, in the
radiation-dominated early universe (here “radiation” refers to all kinds of ultrarelativistic
particles).
We start from the first Friedmann equation

$$ H^2 = \frac{8\pi G}{3} \rho - \frac{k}{a^2} , \tag{3.60} $$

neglect the curvature term8 and use the facts that for a radiation-dominated universe $\rho \propto a^{-4}$
while $a \propto t^{1/2}$, meaning that $H = \dot{a} / a = 1 / (2t)$.
Substituting these, as well as the expression we have found for the energy density in
terms of the effective degrees of freedom, we get

$$ \frac{1}{4 t^2} = \frac{8\pi G}{3} \, g_*(T) \, \frac{\pi^2}{30} \, T^4 . \tag{3.61} $$

8 We can do so since we know that right now the contribution to the global $\Omega$ of curvature is small (we have
not been able to distinguish it from zero) and while this term scales as $a^{-2}$ the matter term scales as $a^{-3}$. Since
the matter term is dominant over the curvature term now, it was even more so earlier.

Then we have a formula for temperature in terms of time:

\begin{align}
\frac{1}{2t} &= \left( \frac{8\pi G}{3} \right)^{1/2} \left( \frac{\pi^2}{30} \right)^{1/2} g_*^{1/2} \, T^2 \tag{3.62} \\
t &= \underbrace{\frac{1}{2 \sqrt{\frac{8\pi \pi^2}{3 \times 30}}}}_{\approx 0.301} \, g_*^{-1/2} \, \frac{m_P}{T^2} \approx \left( \frac{T}{\text{MeV}} \right)^{-2} \text{s} , \tag{3.63}
\end{align}

where $m_P = G^{-1/2} \approx 1.2 \times 10^{19}$ GeV is the Planck mass. Beware: there are different conventions
for this, sometimes the definition is chosen as $m_P = (8\pi G)^{-1/2}$, which simplifies the
Friedmann and Einstein equations somewhat. This mass corresponds to the energy scale at
which quantum gravitational effects cannot be neglected.
The last approximation in (3.63) is quite rough, as it neglects the variation of the effective
number of degrees of freedom completely: however, the factor $g_*^{-1/2}$ is of order 1 around
$T \approx 1$ MeV, which is the region in which we will apply our formula, so this is fine for our
purposes.
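To get a feeling for the numbers, here is a minimal sketch evaluating the time-temperature relation at $T = 1$ MeV. The value $g_* = 10.75$ (photons, electrons, positrons and three neutrino flavors) is an assumption of ours, as are the rounded values of the constants and the function name.

```python
import math

M_PLANCK_MEV = 1.22e22  # Planck mass G^(-1/2) in MeV, rounded
HBAR_MEV_S = 6.582e-22  # hbar in MeV s: converts a time in MeV^-1 to seconds

# Numerical prefactor 1 / (2 sqrt(8 pi^3 / 90)) appearing in the relation.
PREFACTOR = 0.5 / math.sqrt(8.0 * math.pi ** 3 / 90.0)


def time_from_temperature(T_mev, g_star):
    """Age of a radiation-dominated universe, in seconds, at temperature T."""
    t_natural = PREFACTOR * M_PLANCK_MEV / (math.sqrt(g_star) * T_mev ** 2)
    return t_natural * HBAR_MEV_S


t_1mev = time_from_temperature(1.0, g_star=10.75)  # g_star = 10.75 is assumed
print(f"prefactor = {PREFACTOR:.3f}, t(1 MeV) = {t_1mev:.2f} s")
```

The result is somewhat below one second, consistent with the rough rule $t \approx (T / \text{MeV})^{-2}$ s.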

Entropy effective degrees of freedom Entropy density is defined as entropy per unit
volume, $s = S / V = (P + \rho) / T$. Since the total entropy in a comoving region is conserved
(if there is thermal equilibrium) the quantity $s a^3$, proportional to $S$, is conserved.
If we only have relativistic particles (which satisfy $P = \rho / 3$), the entropy density can be
expressed as

$$ s = \frac{P + \rho}{T} = \frac{4\rho}{3T} = \frac{2\pi^2}{45} \, g_{*s} T^3 , \tag{3.64} $$

where we defined a new number of effective degrees of freedom, $g_{*s}$, whose definition is
slightly different from that of the one used for the energy, since $s \propto T^3$ as opposed to $\rho \propto T^4$:

$$ g_{*s} \equiv \sum_{i \in \text{BE}} g_i \left( \frac{T_i}{T} \right)^3 + \frac{7}{8} \sum_{i \in \text{FD}} g_i \left( \frac{T_i}{T} \right)^3 . \tag{3.65} $$

The expression $s \propto g_{*s} T^3$ is more general than simply $s \propto T^3$, and in fact with this
new one we can update Tolman’s law: taking the cube root of the conserved quantity $s a^3$
we find $T a g_{*s}^{1/3} = \text{const}$.

3.2.4 Decoupling and radiation temperature


The temperature 1 MeV occurs when the age of the universe is approximately 1 s, and
this is the point at which the weak interactions involving neutrinos stop occurring.9
9 This point is also relevant for another process: the weak interaction mediated processes are also what
allows there to be an equilibrium between protons and neutrons, so when those reactions stop they become
independent, and they evolve differently, since protons are stable while neutrons are not. This is crucial when
discussing nucleosynthesis, the formation of nuclei.

When this happens, neutrinos decouple, so they stop interacting: they start evolving as
any relativistic particle species would (they are relativistic since their mass is much lower
than the MeV).
At this point, the temperatures of neutrinos and photons are “disconnected”: there is
no mechanism to equalize them. However, while the neutrinos are freely floating by, their
energy density scaling like $\rho \propto a^{-4}$, the photons will be active for a few more seconds.
The mass of electrons and positrons is around 0.5 MeV, so until about 4 s into the life of
the universe, 3 s from the decoupling of the neutrinos, the reaction $e^+ + e^- \leftrightarrow 2\gamma$ is still in
equilibrium. After this, the populations of electrons and positrons annihilate, and since there
are fewer effective degrees of freedom in which to deposit their energy than before (since
the neutrinos are not coupled anymore) they dump it all into photons, thus increasing their
temperature.
After this occurs, the photons keep evolving like any other relativistic particle species,
with $\rho \propto a^{-4}$, but their temperature is higher than that of the neutrinos. Since they evolve in
the same way, the ratio of the temperatures is constant. Now we will calculate this ratio.
We impose continuity of the entropy in a comoving volume across the transition which
happens at 0.5 MeV between the stage in which electrons, positrons and photons are in
equilibrium and the stage in which they decouple, since the electrons and positrons are not
relativistic anymore and thus annihilate.
This transition only affects the temperature of the photons, while the neutrinos decoupled
three seconds earlier; so, before the transition the temperatures of photons and
neutrinos are equal, after it the photons’ temperature increases.
Let us denote with an index $>$ the quantities pertaining to an earlier time, $t_{\text{transition}} > t$,
while an index $<$ will denote the quantities pertaining to a later time, $t_{\text{transition}} < t$.
The “updated” version of Tolman’s law reads $T a g_{*s}^{1/3} = \text{const}$, and if the transition is fast
enough the scale factor can be taken to be equal on both sides of it. Therefore, we have

$$ T_< = T_> \left( \frac{g_{*s>}}{g_{*s<}} \right)^{1/3} , \tag{3.66} $$

so we can compute the temperature after the transition, $T_<$, if we find the effective degrees
of freedom before and after: before the transition we have photons, electrons and positrons.
Photons have two polarizations, and so do both electrons and positrons; also, the latter are
fermions, so we find

$$ g_{*s>} = 2 + \frac{7}{8} (4) = \frac{11}{2} , \tag{3.67} $$

while after the transition only photons are relativistic, so we have

$$ g_{*s<} = 2 . \tag{3.68} $$

This means that the temperature of the photons increases by a factor

$$ T_< = T_> \left( \frac{11}{4} \right)^{1/3} \approx 1.4 \, T_> , \tag{3.69} $$

which allows us to compute the neutrino temperature at any time, since $T_< / T_> = T_\gamma / T_\nu$
and the two temperatures scale in the same way:
\[ T_\nu = T_\gamma \left( \frac{4}{11} \right)^{1/3} . \tag{3.70} \]
Right now, the neutrino temperature will be around $T_{0\nu} \approx (4/11)^{1/3} T_{0\gamma} \approx 1.94$ K, where $T_{0\gamma}$ is
the current CMB temperature.
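The counting of degrees of freedom and the resulting temperature ratio can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper name `g_star_s` and the value 2.7255 K for the present CMB temperature are assumptions made for illustration.

```python
from fractions import Fraction

def g_star_s(bosonic_dof, fermionic_dof):
    """Effective entropy degrees of freedom for species sharing one temperature."""
    return Fraction(bosonic_dof) + Fraction(7, 8) * fermionic_dof

# Before the transition: photons (2) plus electrons and positrons (2 + 2).
g_before = g_star_s(2, 4)
# After the transition: photons only.
g_after = g_star_s(2, 0)

# Entropy conservation at fixed scale factor gives T_< / T_> = (g_before / g_after)^(1/3).
ratio = float(g_before / g_after) ** (1 / 3)

T_CMB_K = 2.7255          # assumed present-day CMB temperature, in kelvin
T_NU_K = T_CMB_K / ratio  # present-day neutrino temperature

print(g_before, round(ratio, 3), round(T_NU_K, 3))
```

Exact fractions are used for the counting so that the value 11/2 appears without rounding; only the cube root is evaluated in floating point.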
As an exercise, let us compute the number of effective degrees of freedom some time
after the decoupling of electrons, say at $T = 0.1$ MeV. The global temperature $T$ we are
referring to is the one of the photons: so, applying the definition we find
\begin{align}
g_* &= \sum_{i \in \text{BE}} g_i \left( \frac{T_i}{T} \right)^4 + \frac{7}{8} \sum_{i \in \text{FD}} g_i \left( \frac{T_i}{T} \right)^4 \tag{3.71a} \\
&= 2 + \frac{7}{8} \left( 3 \times 2 \left( \frac{T_\nu}{T_\gamma} \right)^4 \right) \tag{3.71b} \\
&= 2 + \frac{21}{4} \left( \frac{4}{11} \right)^{4/3} \approx 3.36 , \tag{3.71c}
\end{align}
since we need to consider neutrinos (of which there are three flavors, each having two
polarization states), which contribute to the total energy density, but not electrons, which are
not relativistic anymore.
With this result, we can find the energy density of radiation at that time according to
(3.59): we get
\[ \rho_r (0.1\ \text{MeV}) \approx 1.1 \times 10^{-4}\ \text{MeV}^4 = 25.7\ \text{g}\,\text{cm}^{-3} , \tag{3.72} \]
where we multiplied by $\hbar^{-3} c^{-5}$ to get the CGS units.
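The numbers in (3.71c) and (3.72) can be reproduced with a few lines of code. This is a sketch under stated assumptions: it takes (3.59) to be the standard relativistic-gas expression $\rho_r = (\pi^2/30)\, g_* T^4$, and the function names and rounded CODATA constants are introduced here for illustration only.

```python
import math

# CGS constants (rounded CODATA values, assumed here).
HBAR = 1.054571817e-27   # erg s
C = 2.99792458e10        # cm / s
MEV = 1.602176634e-6     # erg

def g_star_after_annihilation(n_flavors=3):
    """g_* with photons at T and neutrinos at T_nu = (4/11)^(1/3) T, as in (3.71)."""
    t_ratio = (4 / 11) ** (1 / 3)
    return 2 + (7 / 8) * (n_flavors * 2) * t_ratio ** 4

def radiation_density_natural(T_mev, g_star):
    """Energy density in MeV^4, assuming rho = pi^2 / 30 * g_* * T^4."""
    return math.pi ** 2 / 30 * g_star * T_mev ** 4

def mev4_to_g_per_cm3(rho_mev4):
    """Convert an energy density in MeV^4 to a mass density in g / cm^3."""
    return rho_mev4 * MEV ** 4 / (HBAR ** 3 * C ** 5)

g_star = g_star_after_annihilation()
rho_natural = radiation_density_natural(0.1, g_star)
rho_cgs = mev4_to_g_per_cm3(rho_natural)

print(f"g_* = {g_star:.2f}, rho = {rho_natural:.2e} MeV^4 = {rho_cgs:.1f} g/cm^3")
```

The conversion factor is just the statement that an energy to the fourth power becomes a mass density once divided by $\hbar^3 c^5$.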

3.3 Problems with the Hot Big Bang model, inflation


Around the 1960s, cosmologists were trying to piece together a description of the early
universe in terms of particle physics, as we discussed in this chapter up to here. However,
it soon became apparent that the standard cosmological model in use had some inconsistencies.
Let us now explore these.

3.3.1 The cosmological horizon problem


This was noticed as early as 1956. Let us consider radial null geodesics in a universe
described by a FLRW metric. These are the worldlines of photons we can detect with
telescopes. Imposing $ds^2 = 0$ we find:
\[ c^2 dt^2 = a^2 (t) \frac{dr^2}{1 - k r^2} , \tag{3.73} \]
which we can integrate (taking one of the two solutions for simplicity; choosing one over
the other just amounts to parametrizing time in the opposite direction) to find
\[ \int_0^t \frac{c\, d\tilde{t}}{a(\tilde{t})} = \int_0^r \frac{d\tilde{r}}{\sqrt{1 - k \tilde{r}^2}} = f(r) . \tag{3.74} \]
The function $f(r)$ gives us the proper distance between emission and detection of a
photon, but it does so in terms of the adimensional coordinate $r$: in order to get something
which has the dimensions of a length we need to multiply by $a$ calculated at a certain time,¹⁰
\[ d_\text{hor} (t) = a(t) \int_0^t \frac{c\, d\tilde{t}}{a(\tilde{t})} . \tag{3.75} \]
If this integral is convergent, we should be worried: let us see why.


If we integrate from the beginning of time to now, we get the (current) comoving
distance traveled by a photon which started moving at the start of time. This is the radius of
the largest region we could in principle observe. It is of the order of 3 Gpc. Since the integral
is convergent this is finite, and it is increasing as time passes. So, ever-further regions are
“coming into view” (at least in principle). Roughly speaking, the issue is that the regions
which we start seeing at the edges should be causally disconnected from the ones already
in view, so we would not expect them to exhibit the same properties; but they do. This is
the basic idea; let us formalize it slightly and connect it to observations.
We cannot actually see light coming from the very edge of the in-principle-observable
universe, since for redshifts larger than $z_\text{LS} \approx 1100$ the universe was opaque to electromagnetic
radiation. The surface of points at this redshift is called the Last Scattering surface. So,
we refer our expectations to the CMB, which was emitted as the primordial plasma became
transparent.
The CMB was emitted at a cosmic time of $t \approx 3.8 \times 10^5$ yr after the Big Bang. It is
observed to be very close to being uniform, with $\Delta T / T \sim 10^{-5}$ after correcting for the Doppler
dipole modulation: it looks like a distribution emitted by matter in thermal equilibrium.
Crucially, this holds at any angular scale we choose: the equilibrium is there across the
whole sphere.
Recall that the angular diameter distance $d_A$ is defined so that if an object with linear size
at emission $\Delta x$ spans an angle $\Delta\theta$ then we have (in the small-angle approximation)
\[ d_A = \frac{\Delta x}{\Delta \theta} . \tag{3.76} \]
The angular diameter distance to the last scattering surface is approximately $d_A (z_\text{LS}) \approx 12.8$ Mpc.
On the other hand, the scale of the particle horizon at that redshift can be
calculated by taking the difference of comoving distances to us, $d_C (z_\infty) - d_C (z_\text{LS}) \approx 281$ Mpc,
and multiplying it by the scale factor, $a(z_\text{LS}) \approx 9.2 \times 10^{-4}$, which yields a horizon scale of
approximately $r_H \approx 260$ kpc at that time.¹¹

¹⁰ Note that this choice is arbitrary: we are computing the comoving distance as measured at the cosmic time of
detection.
We can then say that $\Delta x \sim r_H$, therefore the angular scale at which we expect to be able
to observe correlations, since there can be causal connections, is around
\[ \Delta\theta \approx \frac{\Delta x}{d_A} \approx 0.02\ \text{rad} \approx 1.2° . \tag{3.77} \]

A similar calculation [Toj, eqs. 8–12] yields $\Delta\theta \sim (1 + z_\text{LS})^{-1/2} \approx 1.7°$, using the (reasonable)
assumption of matter dominance in the epoch of recombination. Some steps there
are not really clear to me, so I’m not sure whether my line of reasoning is equivalent
(and valid) besides the assumption.
This is in stark opposition with the scale of observed correlations, which span the whole
sky!
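As a dependency-free cross-check of the horizon angle, the two distances can be integrated directly. This is only a sketch: the flat model with matter, radiation and a cosmological constant, and the parameter values below (chosen to resemble the Planck 2015 ones, with three massless neutrinos folded into the radiation density), are assumptions, so the output should agree with the quoted numbers only to within a few percent.

```python
import math

C_KM_S = 299792.458   # speed of light, km / s
H0 = 67.74            # km / s / Mpc (assumed)
OMEGA_M = 0.3075      # matter density parameter (assumed)
OMEGA_R = 9.1e-5      # photons + 3 massless neutrinos (assumed)
OMEGA_L = 1.0 - OMEGA_M - OMEGA_R  # flatness fixes the cosmological constant
Z_LS = 1089

def inv_a2_hubble(a):
    """1 / (a^2 H(a)) in units of 1 / H0 for the flat model above."""
    return 1.0 / math.sqrt(OMEGA_R + OMEGA_M * a + OMEGA_L * a ** 4)

def comoving_distance(a_start, a_end, steps=20000):
    """c * integral of da / (a^2 H) between two scale factors, in Mpc (Simpson's rule)."""
    h = (a_end - a_start) / steps
    total = inv_a2_hubble(a_start) + inv_a2_hubble(a_end)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_a2_hubble(a_start + i * h)
    return C_KM_S / H0 * total * h / 3

a_ls = 1 / (1 + Z_LS)
horizon_comoving = comoving_distance(0.0, a_ls)   # from the Big Bang to last scattering
horizon_proper = a_ls * horizon_comoving          # physical size at last scattering
d_A = a_ls * comoving_distance(a_ls, 1.0)         # angular diameter distance
theta_deg = math.degrees(horizon_proper / d_A)

print(f"{horizon_comoving:.0f} Mpc, {horizon_proper * 1e3:.0f} kpc, "
      f"{d_A:.1f} Mpc, {theta_deg:.2f} deg")
```

Integrating over the scale factor rather than over redshift keeps the integrand finite all the way down to $a = 0$, which avoids having to truncate an infinite redshift range.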
Mention in the lecture of the Mixmaster Universe by Misner (and the Bianchi classifica-
tion of Lie Algebras for context) as an alternative to inflation — is this relevant here?

Cosmic inflation Now, if the quantity $d_\text{Hor} (t)$ were to diverge this would mean that we
could have a causal connection with any point in the universe, provided we went far enough
back in time.
We can approximate
\[ d_\text{Hor} (t) = a(t) \int_0^t \frac{c\, d\tilde{t}}{a(\tilde{t})} \approx ct \sim \frac{c}{H} \equiv d_H , \tag{3.78} \]
where we defined the new Hubble distance, $d_H = c/H$. This is a physical distance, but we can
also define the corresponding dimensionless comoving Hubble radius: $r_H = c/(Ha) = c/\dot{a}$,
which satisfies $d_H = a r_H$.
We can hypothesize that there was a period in the early universe when the comoving
radius $r_H$ was decreasing with time: if this is the case, the regions we are observing today
as “coming into view” could actually have been in causal contact in the early universe.
For the comoving radius to be decreasing ($\dot{r}_H < 0$), the condition is (neglecting factors of
$c$, or working in natural units):
\[ \dot{r}_H = \frac{d}{dt} \frac{1}{\dot{a}} = - \frac{\ddot{a}}{\dot{a}^2} < 0 , \tag{3.79} \]
therefore we need $\ddot{a} > 0$ for at least some time.

¹¹ All the calculations were made automatically using the astropy package, using a flat ΛCDM model with
parameters obtained from the Planck mission [Ade+16].

    from astropy.cosmology import Planck15 as cosmo
    import numpy as np
    import astropy.units as u

    z_LS = 1089  # redshift of the last scattering surface
    # proper size of the particle horizon at last scattering
    dx = (cosmo.comoving_distance(np.inf) - cosmo.comoving_distance(z_LS)) * cosmo.scale_factor(z_LS)
    # angular diameter distance to the last scattering surface
    dA = cosmo.angular_diameter_distance(z_LS)
    # angle subtended by the horizon, converted to degrees
    (dx / dA).to(u.degree, equivalencies=u.dimensionless_angles())
The second Friedmann equation (in natural units) tells us that
\[ \frac{\ddot{a}}{a} = - \frac{4 \pi G}{3} \left( \rho + 3P \right) , \tag{3.80} \]
therefore the condition we need to have is $\rho + 3P < 0$. So, since the energy density is
positive, the condition is $P < -\rho/3$.

Types of inflation Another way to express the parameter $\ddot{a}$ is as the derivative of $\dot{a} = Ha$:
\[ \ddot{a} = \dot{a} H + a \dot{H} = a ( H^2 + \dot{H} ) > 0 , \tag{3.81} \]
so the condition can also be expressed as $H^2 + \dot{H} > 0$.


We have shown earlier that, neglecting curvature, the time-dependence of the scale factor
looks like
\[ a(t) = a_* \left[ 1 + \frac{3}{2} (1 + w) H_* (t - t_*) \right]^{2/(3(1+w))} , \tag{3.82} \]
and it is reasonable to use this result since, as we will discuss in this section, inflation makes
curvature negligible.
We can characterize the solutions based on the sign of $\dot{H}$, which determines whether the
equation of state parameter $P/\rho = w$ is larger or smaller than $-1$: the possibilities are

1. $\dot{H} < 0$ while $\dot{H} + H^2 > 0$: this corresponds to $-1 < w < -1/3$, which in General
Relativity-speak is called a violation of the strong energy condition, and inserting it in our
general solution gives $a(t) \propto t^\alpha$ with $\alpha = 2/(3(1+w)) > 1$, a power-law inflation;

2. $\dot{H} = 0$: this is a De Sitter, dark-energy dominated universe, whose scale factor evolves
like $a(t) \propto \exp(Ht)$, corresponding to $w = -1$;

3. $\dot{H} > 0$ (and so also $\dot{H} + H^2 > 0$), which corresponds to $w < -1$ and which gives us
$a(t) \propto |t - t_\text{bounce}|^{-\alpha}$ with $\alpha > 0$, a singularity in the future.

The boundary at $w = -1$ is called the phantom divide.
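The three regimes can be explored by evaluating (3.82) directly and estimating the sign of $\ddot{a}$ with a finite difference. This is a minimal sketch: the function names, the unit choice $a_* = H_* = 1$, $t_* = 0$ and the sample values of $w$ are illustrative assumptions.

```python
import math

def scale_factor(t, w, a_star=1.0, H_star=1.0, t_star=0.0):
    """Scale factor from eq. (3.82); w = -1 is treated as the exponential limit."""
    if w == -1:
        return a_star * math.exp(H_star * (t - t_star))
    base = 1 + 1.5 * (1 + w) * H_star * (t - t_star)
    return a_star * base ** (2 / (3 * (1 + w)))

def acceleration(t, w, dt=1e-3):
    """Central finite-difference estimate of the second time derivative of a(t)."""
    return (scale_factor(t + dt, w) - 2 * scale_factor(t, w)
            + scale_factor(t - dt, w)) / dt ** 2

# radiation, matter, power-law inflation, De Sitter, phantom
for w in (1 / 3, 0.0, -0.5, -1, -1.2):
    print(f"w = {w:+.2f}: second derivative of a at t = 0.5 is {acceleration(0.5, w):+.3f}")
```

The sign of the output changes exactly at $w = -1/3$, in agreement with the condition $\rho + 3P < 0$.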

An estimate of the inflation e-foldings. By how much does the early universe need to
inflate? The condition we need to impose is that the comoving radius of the universe at
some early time, $r_H (t_\text{in})$, should be larger than the current one, $r_H (t_0)$. Since $r_H = d_H / a$, we
can write this inequality as
\[ d_H (t_\text{in}) \frac{a(t_f)}{a(t_\text{in})} \geq d_H (t_0) \frac{a(t_f)}{a(t_0)} , \tag{3.83} \]
where we multiplied both sides by the scale factor calculated at $t_f$, a time corresponding to
the end of inflation, a minimum for the scale factor. Let us define $Z_\text{min} = a(t_f)/a(t_\text{in})$. This
will be $\gg 1$, and it will describe by how much the universe inflated.
In our rough approximation $d_H \sim H^{-1}$, so we can say that the boundary of the inequality,
the minimum inflationary expansion, will be
\begin{align}
Z_\text{min} &= \frac{d_H (t_0)}{d_H (t_\text{in})} \frac{a(t_f)}{a(t_0)} \tag{3.84} \\
&= \frac{H (t_\text{in})}{H (t_0)} \frac{a(t_f)}{a(t_0)} \tag{3.85} \\
&= \frac{H (t_\text{in})}{H (t_f)} \frac{H (t_f)}{H (t_0)} \frac{a(t_f)}{a(t_0)} \tag{3.86} \\
Z_\text{min} \frac{H_f}{H_\text{in}} &= \frac{H_f}{H_0} \frac{a_f}{a_0} , \tag{3.87}
\end{align}
denoting $H(t_i) \equiv H_i$ and similarly for $a$.
Now, we want to put some numbers into this expression: we know from the first
Friedmann equation that (as long as there is no spatial curvature) $H^2 \propto \rho$, while the third tells
us that $\rho \propto a^{-3(1+w)}$: therefore, $H \propto a^{-3(1+w)/2}$. Since we are working with ratios, proportionality
is all we need. Now, what $w$ should we use? At any stage in the evolution of the universe
there are several fluids, but in order to simplify the calculation we will only consider the
dominant one and neglect dark energy in the current phase of the evolution of the universe.
In the inflationary phase we will have an undetermined $w = w_\text{inf}$, so that
\[ \frac{H_f}{H_\text{in}} = \left( \frac{a_f}{a_\text{in}} \right)^{-3(1 + w_\text{inf})/2} , \tag{3.88} \]
so the left-hand side of the equation reads
\[ Z_\text{min} \frac{H_f}{H_\text{in}} = Z_\text{min}^{1 - 3(1 + w_\text{inf})/2} = Z_\text{min}^{(-1 - 3 w_\text{inf})/2} = Z_\text{min}^{|1 + 3 w_\text{inf}|/2} , \tag{3.89} \]
since $w_\text{inf} < -1/3$ means $1 + 3 w_\text{inf} < 0$. The right-hand side has precisely the same form, so
we can express it as
\[ \frac{H_f}{H_0} \frac{a_f}{a_0} = \left( \frac{a_f}{a_0} \right)^{-(1 + 3w)/2} , \tag{3.90} \]
where $w$ is that of the dominant fluid from the end of inflation to now. The issue is, there is
not a single one! In the early stages radiation was dominant, then matter started dominating
(now dark energy is dominant, but we shall not worry about it). So, we split the term in two,
with the radiation-matter equality being the breaking point. The earlier radiation-dominated
phase is characterized by $w = 1/3$, while the later matter-dominated phase is characterized
by $w = 0$, so we can compactly write the term as
\[ \frac{H_f}{H_0} \frac{a_f}{a_0} = \left( \frac{a_f}{a_\text{eq}} \right)^{-(1+1)/2} \left( \frac{a_\text{eq}}{a_0} \right)^{-(1+0)/2} = \left( \frac{a_\text{eq}}{a_f} \right) \left( \frac{a_0}{a_\text{eq}} \right)^{1/2} . \tag{3.91} \]
With this result, we now have an almost explicit expression for $Z_\text{min}$:
\[ Z_\text{min} = \left[ \left( \frac{a_\text{eq}}{a_f} \right) \left( \frac{a_0}{a_\text{eq}} \right)^{1/2} \right]^{-2/(1 + 3 w_\text{inf})} = \left[ \left( \frac{a_0}{a_f} \right) \left( \frac{a_0}{a_\text{eq}} \right)^{-1/2} \right]^{-2/(1 + 3 w_\text{inf})} . \tag{3.92} \]

The ratio $a_0 / a_\text{eq}$ can be written as $1 + z_\text{eq} \approx 2.3 \times 10^4\, \Omega h^2 \approx 10^4$ (very roughly). The
other rough estimate we make is to apply Tolman’s law, so that
\[ \frac{a_0}{a_f} \approx \frac{T_f}{T_0} = \frac{T_f}{m_p} \underbrace{\frac{m_p}{T_0}}_{\sim 10^{32}} ; \tag{3.93} \]
properly speaking this only holds when there is radiation dominance so it does not apply
for the whole range in which we are applying it (up to today), but we are estimating the
order of magnitude of an exponent, so even an order-of-magnitude error is not an issue. We
normalized by the Planck mass (or temperature, since we are using natural units) since the
temperature $T_f$ at the end of inflation probably was of that order of magnitude. The final
estimate we get is
\[ Z_\text{min} \approx \left[ 10^{30}\, \frac{T_f}{m_p} \right]^{-2/(1 + 3 w_\text{inf})} . \tag{3.94} \]
We do not know what $w_\text{inf}$ is besides it being smaller than $-1/3$; let us say that it is of
the order of $-1$ like the current dark-energy dominated phase. If we further assume that
$T_f / m_p \sim 1$,¹² we find $Z_\text{min} \sim 10^{30} \approx \exp(70)$.
This is often written as “70 e-foldings”, meaning 70 e-fold increases in size.
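To see how sensitive the number of e-foldings is to the two unknowns, (3.92) can be evaluated for a few choices of $w_\text{inf}$ and $T_f / m_p$. The following is a sketch: the function name and the sample inputs are illustrative assumptions, while the rough values $10^{32}$ and $10^{4}$ are the ones used in the text.

```python
import math

def min_inflation_factor(w_inf, tf_over_mp=1.0, mp_over_t0=1e32, a0_over_aeq=1e4):
    """Minimum expansion factor Z_min from eq. (3.92), with the rough inputs of the text."""
    if w_inf >= -1 / 3:
        raise ValueError("inflation requires w_inf < -1/3")
    a0_over_af = mp_over_t0 * tf_over_mp           # Tolman's law, eq. (3.93)
    bracket = a0_over_af * a0_over_aeq ** -0.5     # content of the square bracket
    return bracket ** (-2 / (1 + 3 * w_inf))

# (w_inf, T_f / m_p): De Sitter-like, a lower reheating temperature, a milder equation of state
for w_inf, ratio in [(-1.0, 1.0), (-1.0, 1e-3), (-2 / 3, 1.0)]:
    z_min = min_inflation_factor(w_inf, ratio)
    print(f"w_inf = {w_inf:+.2f}, T_f/m_p = {ratio:g}: {math.log(z_min):.0f} e-foldings")
```

Lowering $T_f / m_p$ to $10^{-3}$ only removes about seven e-foldings, while moving $w_\text{inf}$ toward $-1/3$ makes the required expansion grow very quickly.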

3.3.2 The flatness problem


Now, let us consider the flatness problem, which was first proposed by Dicke and Peebles
in 1979.
Cannot seem to find the paper or article. . .
The first Friedmann equation can be rearranged as
\begin{align}
\underbrace{\frac{3 a^2 H^2}{8 \pi G}}_{= a^2 \rho_C} &= \rho a^2 - \frac{3k}{8 \pi G} \tag{3.95} \\
( \rho_C - \rho ) a^2 = \left( \Omega^{-1} - 1 \right) \rho a^2 &= - \frac{3k}{8 \pi G} = \text{const} , \tag{3.96}
\end{align}
where $\rho_C = 3H^2 / 8\pi G$ is the critical density. The right-hand side only contains constants,
therefore the left-hand side must be constant as well. As the universe evolves $\rho$ decreases
while $a$ increases; however $\rho \propto a^{-3(1+w)}$ by the third Friedmann equation, meaning that
the term scales like $\rho a^2 \propto a^{-1-3w}$: as long as $w > -1/3$, which is the case for most of the
universe’s evolution (with radiation and matter dominance) the term is decreasing, meaning
that the other term must be increasing.

¹² The constraints on $T_f$ are the parameters of baryogenesis (we can observe it through the isotope distribution
in early galaxies) and the fact we have not detected gravitational waves from this early time. These tell us that
$T_f / m_p$ cannot be larger than $10^{-3}$.
If $k = 0$ we also have $\Omega \equiv 1$, so the point is moot; however, as we have already mentioned,
there are reasons to think this is unlikely.
So, let us consider the case $k = \pm 1$: then, the term $\Omega^{-1} - 1$ must scale like $a^{1+3w}$ in order
to balance the other.
If we assume $w = 1/3$ for all times we get (using the fact that $a \propto (1 + z)^{-1}$ and Tolman’s
law) that the term calculated at a redshift $z$ is given by:
\[ \Omega^{-1} (z) - 1 = ( \Omega_0^{-1} - 1 ) (1 + z)^{-2} = ( \Omega_0^{-1} - 1 ) \left( \frac{T_0}{T(z)} \right)^2 . \tag{3.97} \]
The reasoning in [Pac18] looks way more complicated, but this way seems just as
valid. . .
The assumption $w \equiv 1/3$ is not really correct, but the result would only change by a few
orders of magnitude if we did the calculation properly, and as before we are giving only a
rough estimate of an exponent.
Let us extend this line of reasoning back to the Planck epoch, since beyond that our theory
of gravity might behave differently so it is not justified to apply Friedmann’s equations.
If we compute $T_\text{Pl} / T_0$ we get approximately $10^{32}$. This means that
\[ \Omega^{-1} (z_\text{Pl}) - 1 \approx ( \Omega_0^{-1} - 1 )\, 10^{-64} . \tag{3.98} \]
The correct calculation, keeping track of the dominant fluid in each phase, gives $10^{-60}$
instead of $10^{-64}$: not that significant a difference when discussing these kinds of numbers.
The current estimate for $\Omega_0$, as discussed in section 12, is at most of the order of $10^{-3}$
away from 1; therefore, in the Planck epoch the parameter would have needed to be
different from 1 by one part in $10^{63}$.
This is a type of fine-tuning problem: the initial conditions seem to require an “unnatural”
number like $\Omega \sim 1 \pm 10^{-63}$. Typically, a fine-tuning problem is interpreted as a signal that
we should improve our theory.¹³
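The size of the required tuning follows from (3.97) in one line. This sketch assumes, as the text does, $w = 1/3$ at all times; the function name and the sample value $\Omega_0 = 1.001$ are illustrative assumptions.

```python
T_PLANCK_OVER_T0 = 1e32   # rough temperature ratio quoted in the text

def flatness_deviation(omega_0, temperature_ratio):
    """|1/Omega - 1| at an earlier epoch from eq. (3.97), assuming w = 1/3 throughout."""
    return abs(1 / omega_0 - 1) * temperature_ratio ** -2

# Present-day deviation of order 1e-3, evolved back to the Planck epoch.
deviation = flatness_deviation(1.001, T_PLANCK_OVER_T0)
print(f"|1/Omega - 1| at the Planck epoch is about {deviation:.0e}")
```

With the pure-radiation approximation the deviation comes out near $10^{-67}$; replacing the factor $10^{-64}$ with the more careful $10^{-60}$ mentioned in the text gives the quoted $10^{-63}$.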

3.3.3 Mechanisms for inflation


Both the horizon problem and the flatness problem are addressed by the theory of
inflation; up until now we have seen what inflation does, but we still have to discuss how it
occurs. The way to properly describe it is through the language of Quantum Field Theory
in curved spacetime, which surely cannot be introduced here; this section is meant to just
give a flavor of the mechanism.
¹³ This approach has also received criticism [Hos19, sec. 3.2]: we do not know what distribution the initial
conditions are drawn from, so how can we say whether a certain number is more likely (or “natural”) than
another?

In regular Quantum Field Theory the Hamiltonian of our theory is often in the form (this
example is for a real scalar field, with $a^\dagger$ and $a$ being its creation and annihilation operators
and $k$ being the momentum, with corresponding energy $E_k = \sqrt{m^2 + k^2}$):
\[ H = \int d^3 k\, E_k \frac{a^\dagger a + a a^\dagger}{2} = \int d^3 k\, E_k \underbrace{a^\dagger a}_{N} + \underbrace{\int d^3 k\, \frac{E_k}{2} \left[ a, a^\dagger \right]}_{\to \infty} , \tag{3.99} \]
meaning that it can be expressed in terms of an integral of the number operator times the
energy, plus an integral which is constant and which diverges, since $[ a, a^\dagger ] = \delta^{(3)} (0)$ and
the ground state energy of each harmonic oscillator in our continuum of possible values of the
momentum is nonzero.
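The step from the symmetrized product to the number operator in (3.99) can be spelled out explicitly. This is a sketch assuming the normalization $[a_{\mathbf{k}}, a^\dagger_{\mathbf{k}'}] = \delta^{(3)}(\mathbf{k} - \mathbf{k}')$, which is the one implied by the value $\delta^{(3)}(0)$ quoted above; momentum labels are suppressed as in the text.

```latex
\begin{align*}
a a^\dagger &= a^\dagger a + \left[ a, a^\dagger \right] \\
\frac{a^\dagger a + a a^\dagger}{2} &= a^\dagger a + \frac{1}{2} \left[ a, a^\dagger \right] \\
H &= \int d^3 k\, E_k\, a^\dagger a + \frac{1}{2} \int d^3 k\, E_k\, \delta^{(3)} (0)
\end{align*}
```

The second integral does not depend on the state, which is why it acts as a constant (and divergent) shift of the energy.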
As long as we are writing our theory in flat spacetime (and without the Einstein equations)
this is not an issue: we are just adding a constant to the Hamiltonian, which does not
affect the equations of motion, which depend on its derivatives. In GR this is not the case:
this energy gravitates, as we have seen any energy density does.
Is the ground-state energy just an artifact of the mathematical description of the fields?
If so, we do not have a problem; unfortunately this is not the case: we can see this ground
state energy through the Casimir effect.
If we put two metallic plates close to each other in a vacuum we can detect an attractive
force between them, due to the fact that long-wavelength fluctuations do not “fit” in the gap
between the plates, decreasing the energy density between the plates compared to the one
outside, which is equivalent to the binding energy of an attractive force.¹⁴
An interesting digression, but I’m not sure whether it fits in this section.
In QFT fields are classified based on how they transform under rotations: scalars do
not change, vectors come back to themselves after a rotation of $2\pi$, spinors come back to
themselves after a rotation of $4\pi$. In the standard model generally “matter” particles are
spinors, while “force” particles are vectors. Scalars are rare: the only one is the Higgs boson.
We want our mechanism for inflation to be a field with a nonzero expectation value,
which should also satisfy the symmetries of the FLRW metric. Our only option then is a
scalar field which is homogeneous (and automatically isotropic since it does not define a
direction), but which can change in time.
A vector and a spinor both define a direction in space, and thus do not satisfy the
requirement of isotropy. A term like $\bar{\psi} \psi$, where $\psi$ is a spinor and $\bar{\psi}$ is its adjoint, can
actually respect the required symmetries. We will not explore this further, but it gives rise
to what is called a fermion condensate.
So, we will add a scalar field $\Phi$ to our model.

The Lagrangian formulation of GR Usually, an action for the Standard Model particles
in a general-relativistic setting will have a term containing the derivatives of the metric, $S_g$,
and a term containing the actions of all the Standard Model particles, $S_\text{SM}$:
\[ S = S_g + S_\text{SM} . \tag{3.100} \]
As in classical mechanics, the equations of motion are derived from a variational principle,
$\delta S = 0$; however since $S$ is a functional of many fields this actually contains the equations
of motion for each of them. Varying with respect to the SM fields yields their equations of
motion, while varying with respect to the inverse metric $g^{\mu\nu}$ yields the Einstein equations:
\[ \underbrace{R_{\mu\nu} - \frac{1}{2} R g_{\mu\nu}}_{\propto\, \delta S_g / \delta g^{\mu\nu}} = \underbrace{8 \pi G T_{\mu\nu}}_{\propto\, \delta S_\text{SM} / \delta g^{\mu\nu}} . \tag{3.101} \]
In this Lagrangian approach, the very definition of the stress-energy tensor is as a certain
multiple of the functional derivative of the action of the particles with respect to the inverse
metric.

¹⁴ This was demonstrated to actually occur: the first group to do the experiment with the original parallel-plate
configuration was in Padua [Bre+02]!

Coupling between a field and gravity We update the action by adding a term for the field
$\Phi$:
\[ S = S_\Phi + S_g + S_\text{SM} . \tag{3.102} \]
As usual the action is found by integrating a Lagrangian density, but since we have a
nontrivial metric we need to use the invariant volume element $d^4 x \sqrt{-g}$ (which for FLRW
is just $d^4 x\, a^3$):
\[ S = \int d^4 x \sqrt{-g}\, \mathcal{L} . \tag{3.103} \]
The gravitational Lagrangian is given in terms of the Ricci scalar, $\mathcal{L}_g = R / 16 \pi G$, and
varying it with respect to the metric yields the left-hand side of the Einstein equations.
A typical Lagrangian for a particle of mass $m$ whose position is described by the coordinates
$q$ is given by a kinetic term and a potential term:
\[ L = \frac{m}{2} \dot{q}^2 - V(q) , \tag{3.104} \]
for a scalar field in Minkowski spacetime its equivalent would be
\[ \mathcal{L}_\Phi = \frac{1}{2} \left( \partial_\mu \Phi \right) \left( \partial^\mu \Phi \right) - V(\Phi) . \tag{3.105} \]
In the GR case it is customary to be more explicit about the metric appearing in the
kinetic term; also, the derivatives should in general become covariant ones, although in this
case there is no change since the covariant derivative of a scalar is equal to its partial one:
$\nabla_\mu \Phi = \partial_\mu \Phi$. The Lagrangian then becomes:
\[ \mathcal{L} = \frac{1}{2} g^{\mu\nu} \partial_\mu \Phi \partial_\nu \Phi - V(\Phi) . \tag{3.106} \]
The potential may include different terms; a common one is a mass term, which looks
like $V(\Phi) = m^2 \Phi^2 / 2$.
If we add a massive term, proportional to $R \Phi^2$, we get that adding it to the global action
looks like gravity.
Clarify
Actions are dimensionless since $\hbar = 1$, and since $ds^2 = g_{\mu\nu} dx^\mu dx^\nu$ the metric $g_{\mu\nu}$ is also
dimensionless. The Riemann tensor is given by the second derivatives of the metric, so its
dimension is a length to the $-2$, or a mass squared.
So, the dimensional analysis of $d^4 x \sqrt{-g}\, \mathcal{L}$ gives us that $\mathcal{L}$ must have dimensions of
a length to the $-4$, or a mass to the 4. The field $\Phi$ has the dimensions of a mass, which
is an inverse length. The coupling constants are conventionally taken to be dimensionless:
therefore if we are to add a term to the Lagrangian, it must be $\xi \Phi^2$ times an inverse square
length, which is often $m^2$, $m$ being the mass of the field, while $\xi$ is a real number.
With all of this said, in terms of dimensionality we can add to our Lagrangian a term
$\xi R \Phi^2$, where $R$ is the Ricci scalar. This is a prototype for modified GR theories.
The value of $\xi$ is undetermined: setting $\xi = 1/6$ gives us conformal symmetry, while in
other cases it is useful to set it as $\xi = 1/4$. A Weyl transformation (a local rescaling of the
metric) allows us [FGN98] to remove this term: we move from the Jordan frame (where we
do have coupling between our scalar field and the curvature, with a term such as the one
we described) to the Einstein frame, in which we do not have this term, but we do have an
additional matter-like term in the Einstein equations, a new component in the stress-energy
tensor, which will look like:
\[ T_{\mu\nu} (\Phi) = \Phi_{,\mu} \Phi_{,\nu} - g_{\mu\nu} \left( \frac{1}{2} g^{\rho\sigma} \Phi_{,\rho} \Phi_{,\sigma} - V(\Phi) \right) , \tag{3.107} \]
where commas denote partial derivatives: $\Phi_{,\alpha} = \partial_\alpha \Phi$.
We can get an explicit solution of the equations of motion of this field by
using the symmetries of our spacetime: we assume that, because of homogeneity, $\Phi(x^\mu) = \varphi(t)$.
There is another issue: in QFT any field $\Phi$ is an operator acting on a Fock space, while
the left-hand side of the Einstein equation is a simple tensor: we are not quantizing space!
The solution to this problem is a semiclassical mean-field approximation, which is similar to
the Hartree-Fock mean-field method: we assume we are “close” to the ground state, and so
we substitute the stress-energy tensor on the right-hand side with its mean value computed
in the ground state of the theory:
\[ G_{\mu\nu} = 8 \pi G \left\langle \hat{T}_{\mu\nu} \right\rangle_0 , \tag{3.108} \]
where we define the ground state $|0\rangle$ as the one with the most symmetry allowed (that is, it
should be invariant under rotations and translations, the symmetries of the FLRW metric).
If we perturb the state of the field $\Phi$, we get $\Phi = \varphi + \delta\Phi$: so $\langle \Phi^2 \rangle = \varphi^2 + 2 \varphi \langle \delta\Phi \rangle + \langle \delta\Phi^2 \rangle$,
but the second term is zero since $\langle \delta\Phi \rangle = 0$ and $\varphi$ is constant. The last term
diverges. We do not know how to deal with it. We therefore assume that it is small.
What? is this about renormalization?
When computing the stress energy tensor we get only diagonal terms: this scalar acts
like a perfect fluid!
The energy density is equal to the Hamiltonian:
\[ \rho = T_{00} = \frac{1}{2} \dot{\varphi}^2 + V(\varphi) = \mathcal{H} , \tag{3.109} \]
while the pressure is the Lagrangian:
\[ P = \frac{1}{2} \dot{\varphi}^2 - V(\varphi) = \mathcal{L} . \tag{3.110} \]
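We assume that anything in the universe which is not our field behaves like radiation,
with energy density $\rho_r$ (and pressure $P_r = \rho_r / 3$). Then, the Friedmann equations (assuming
zero spatial curvature) will read
\begin{align}
H^2 &= \frac{8 \pi G}{3} \left( \frac{1}{2} \dot{\varphi}^2 + V + \rho_r \right) \tag{3.111a} \\
\frac{\ddot{a}}{a} &= - \frac{8 \pi G}{3} \left( \dot{\varphi}^2 - V + \rho_r \right) \tag{3.111b} \\
\dot{\rho}_\text{tot} &= - 3 \frac{\dot{a}}{a} \left( \rho_\text{tot} + P_\text{tot} \right) , \tag{3.111c}
\end{align}
but in the continuity equation we can split the contributions by inserting an unknown factor
$\Gamma$, the transfer of energy between the field and radiation, so that the respective energy
densities evolve as
\begin{align}
\dot{\rho}_\varphi &= - 3 \frac{\dot{a}}{a} \dot{\varphi}^2 + \Gamma \tag{3.112a} \\
\dot{\rho}_r &= - 4 \frac{\dot{a}}{a} \rho_r - \Gamma . \tag{3.112b}
\end{align}
In order to see what the evolution of the field looks like we drop $\Gamma$, assuming that there
is little energy transfer between the field and radiation. The equation of motion of the field
reads
\[ \ddot{\varphi} + 3 H \dot{\varphi} = - V' , \tag{3.113} \]
where $V' = \partial_\varphi V$.¹⁵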
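The equation of motion (3.113) can also be recovered from the continuity equation alone. The following short derivation is a sketch that assumes $\Gamma = 0$ and $\dot{\varphi} \neq 0$, and uses the energy density (3.109):

```latex
\begin{align*}
\dot{\rho}_\varphi &= \frac{d}{dt} \left( \frac{1}{2} \dot{\varphi}^2 + V(\varphi) \right)
  = \dot{\varphi} \left( \ddot{\varphi} + V' \right) \\
\dot{\varphi} \left( \ddot{\varphi} + V' \right) &= - 3 H \dot{\varphi}^2 \\
\ddot{\varphi} + 3 H \dot{\varphi} &= - V'
\end{align*}
```

The Hubble term therefore plays the role of a friction coefficient for the homogeneous field.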
We have already mentioned that this field behaves like a fluid: so, what is its equation
of state? The definition of $w$ is
\[ w = \frac{P}{\rho} = \frac{\frac{1}{2} \dot{\varphi}^2 - V}{\frac{1}{2} \dot{\varphi}^2 + V} , \tag{3.115} \]
whose limiting case, when $\dot{\varphi}^2 \ll 2|V|$, is $w = -1$. As we have seen, this corresponds to an
evolution of the universe with $a \sim \exp(Ht)$: an exponential expansion!
The “old inflation model” has inflation being caused by the breaking of a certain sym-
metry through quantum tunneling, while a new inflation model involves “slow rolling”.
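¹⁵ This may look peculiar, but it is in fact just the Klein-Gordon equation written in curved spacetime: the
difference comes about because the d’Alembertian operator now reads [NS19]
\[ \Box \varphi = \frac{1}{\sqrt{-g}} \partial_\mu \left( \sqrt{-g}\, \partial^\mu \varphi \right) = \partial_\mu \partial^\mu \varphi + \frac{\partial_\mu \sqrt{-g}}{\sqrt{-g}} \partial^\mu \varphi = \ddot{\varphi} + \dot{\varphi}\, \frac{\partial_t ( a^3 )}{a^3} = \ddot{\varphi} + 3 \frac{\dot{a}}{a} \dot{\varphi} . \tag{3.114} \]

The equation $\ddot{\varphi} + 3 H \dot{\varphi} = - V'$ looks like a regular equation of motion with a kinetic and
friction term: after a time $1/H$ the “friction” velocity-dependent term dominates.
Then we get a slow-roll (friction-dominated) regime: $H^2 = \frac{8 \pi G}{3} V$ and $\dot{\varphi} \approx - V' / 3H$.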
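The friction-dominated behaviour can be seen by integrating (3.111a) and (3.113) numerically. This is a minimal sketch, not the model used in the lectures: the quadratic potential $V = m^2 \varphi^2 / 2$, the units $8 \pi G = 1$ and $m = 1$, the neglect of radiation, the initial value $\varphi_0 = 16$ and all function names are assumptions made for illustration.

```python
import math

M = 1.0  # field mass; with 8*pi*G = 1 it also sets the unit of time (assumption)

def V(phi):
    """Quadratic potential, chosen here only as an example."""
    return 0.5 * M ** 2 * phi ** 2

def dV(phi):
    return M ** 2 * phi

def hubble(phi, dphi):
    """First Friedmann equation (3.111a) with 8*pi*G = 1 and no radiation."""
    return math.sqrt((0.5 * dphi ** 2 + V(phi)) / 3)

def rhs(state):
    """Time derivatives of (phi, dphi/dt, number of e-foldings)."""
    phi, dphi, _ = state
    H = hubble(phi, dphi)
    return (dphi, -3 * H * dphi - dV(phi), H)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(state)
    k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def run_inflation(phi0=16.0, dt=1e-3):
    """Integrate until the expansion stops accelerating; return (e-foldings, initial w)."""
    dphi0 = -dV(phi0) / (3 * hubble(phi0, 0.0))  # slow-roll initial velocity
    w_start = (0.5 * dphi0 ** 2 - V(phi0)) / (0.5 * dphi0 ** 2 + V(phi0))
    state = (phi0, dphi0, 0.0)
    # by (3.111b) with rho_r = 0 the expansion accelerates while dphi^2 < V
    while state[1] ** 2 < V(state[0]):
        state = rk4_step(state, dt)
    return state[2], w_start

efolds, w_start = run_inflation()
print(f"initial w = {w_start:.3f}, e-foldings of inflation = {efolds:.1f}")
```

For this potential the slow-roll estimate of the number of e-foldings is roughly $\varphi_0^2 / 4$ in these units, so the chosen starting value gives an expansion comparable to the minimum discussed in the horizon problem.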
Mention of chaotic inflation and many more things. . . could not really form a coherent
narrative. Should try again after reading [CL02, sec. 7.11].
This section could probably do with some pictures of potentials. . .
Let us consider a region of radius approximately 1/H (tb ), where tb is a certain moment
before the start of the inflationary phase. This region is expanded by many e-foldings
through inflation, such that any inhomogeneities are smoothed out: this is the cosmic
no-hair theorem [CL02, pag. 159].
So, there might have been perturbations before inflation: we cannot know. Perturbations
on scales larger than the cosmological horizon are not perceivable as perturbations: we only
perceive our local mean value.

Reheating The energy density of radiation through the inflationary period scales as
$\rho_r \propto a^{-4} \propto e^{-4Ht}$, while that of matter scales as $\rho_m \propto a^{-3} \propto e^{-3Ht}$, since $a \propto e^{Ht}$.
Qualitatively speaking, as the universe rapidly expands it becomes basically devoid of particles,
and its temperature drops substantially.
At the end of inflation, reheating takes place, substantially raising the temperature and
allowing for the formation of most of the SM particles we observe today. This is due to
the latent energy released by our scalar field due to its coupling to the rest of the universe,
which acts as a sort of viscous force. Intuitively speaking, the field “falls” into its ground
state and then oscillates ever slower, dissipating its energy in the process [CL02, fig. 7.6].

3.4 Baryogenesis
This is the process which led to the formation of baryons.
The main issue we seek to address is that of baryon asymmetry. In the Standard
Model each fermion has a corresponding antifermion, and the same holds for the composite
particles of quarks, such as mesons and baryons.
We usually talk of baryons only; they are characterized by the conserved baryon number
$B$, which is the number of baryons minus the number of antibaryons.¹⁶
Matter and antimatter annihilate if they meet, producing radiation ($\gamma$ photons and/or
other high-energy particles). This allows us to investigate the presence of antimatter in the
universe: if there were patches of antimatter as well as patches of matter their boundaries
would be regions of great $\gamma$-ray emission, which we would be able to detect as a background.
Observations then allow us to rule out the presence of antimatter regions large enough to
have $B = 0$ in the observable universe: small patches of antimatter may exist, but there
are not enough of them to balance out all the matter [CDG98].
This constitutes the problem of baryon asymmetry: it seems unnatural for the universe
to start off with more baryons than antibaryons, all the known Standard Model interactions
conserve B, yet we observe more baryons than antibaryons.
¹⁶ This number can actually be computed from the number of quarks already, so it makes sense to discuss it
even before the formation of proper baryons.
In particle physics the absolute number $B$ is used, but in cosmology absolute numbers
are not used: we work in terms of densities. All number densities scale like $a^{-3}$, so we
need to normalize the difference $n_b - n_a$ with another number density: we define $\eta / 2 =
(n_b - n_a)/(n_b + n_a)$. Here $b$ means baryons, while $a$ means antibaryons.
The reason for the factor 2 is given by how we can actually estimate this parameter:
in the early universe, when the baryons were still relativistic, processes like $b + \bar{b} \leftrightarrow 2\gamma$
occurred at thermal equilibrium, so roughly speaking we should have had $n_b \approx n_a \approx n_\gamma$,
that is, $n_b + n_a \approx 2 n_\gamma$.
What would be the proper chemical-equilibrium considerations?
Now, all of these densities scaled like $a^{-3}$ from the moment the baryons became nonrelativistic
to now. So, we have
\[ 2\, \frac{n_b - n_a}{\underbrace{n_b + n_a}_{2 n_\gamma}} \approx \frac{n_{0b} - \overbrace{n_{0a}}^{\approx 0}}{n_{0\gamma}} \approx \frac{n_{0b}}{n_{0\gamma}} = \eta_0 . \tag{3.116} \]
This allows us to estimate the asymmetry constant $\eta$ with parameters measured today:
the baryon number density and the photon number density.
For the baryon number density we can estimate,¹⁷ using $\Omega_{0b} \approx 0.0486$,
\[ n_{0b} \approx \frac{\rho_{0b}}{m_p} = \frac{\Omega_{0b}\, \rho_C}{m_p} \approx 5.46 \times 10^{-7}\ \text{cm}^{-3}\, h^2 , \tag{3.117} \]
while for the photons we can find the number density by integrating the CMB Planckian:
the result is¹⁸
\[ n_{0\gamma} \approx \frac{2 \zeta (3)}{\pi^2}\, T_{0\gamma}^3 \left( \frac{k_B}{\hbar c} \right)^3 \approx 410\ \text{cm}^{-3} . \tag{3.118} \]
¹⁷ The result comes about from the following code:

    from astropy.cosmology import Planck15 as cosmo
    import numpy as np
    import astropy.units as u
    from astropy.constants import codata2018 as ac

    # Hubble constant written in terms of the dimensionless parameter h
    H0 = u.littleh * 100 * u.km / u.s / u.Mpc
    # critical density, then baryon number density in cm^-3 h^2
    rhoC = 3 * H0**2 / (8 * np.pi * ac.G)
    (rhoC * cosmo.Ob0 / ac.m_p).to(u.cm**-3 * u.littleh**2)

¹⁸ The result comes about from the following code:

    from astropy.cosmology import Planck15 as cosmo
    import numpy as np
    from astropy.constants import codata2018 as ac

    # truncated series for the Riemann zeta function at 3
    z3 = np.sum([n**-3 for n in range(1, 1000000)])
    # photon number density of a Planckian at the CMB temperature, in CGS units
    (2 * z3 / np.pi**2 * cosmo.Tcmb0**3 * (ac.k_B / ac.hbar / ac.c)**3).cgs
Combining these results yields
\[ \eta_0 \approx 3 \times 10^{-8}\, \Omega_{0b} h^2 \approx 1.3 \times 10^{-9}\, h^2 \approx 6.1 \times 10^{-10} . \tag{3.119} \]

How do we interpret this result? It means that in the radiation-dominated epoch the
asymmetry between baryons and antibaryons was very slight, about one part in a billion;
most of the pairs annihilated but a bit of matter was leftover, and that is the matter we
currently have.
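The whole estimate of $\eta_0$ can be reproduced without astropy, which makes the unit bookkeeping explicit. This is a sketch: the rounded CGS constants, the CMB temperature 2.7255 K and the value $h = 0.6774$ are assumptions (they resemble the Planck 2015 values used in the footnotes), while $\Omega_{0b} = 0.0486$ is the value quoted in the text.

```python
import math

# CGS constants (rounded CODATA values, assumed here).
G = 6.67430e-8            # cm^3 g^-1 s^-2
M_P = 1.67262192e-24      # proton mass, g
K_B = 1.380649e-16        # erg / K
HBAR = 1.054571817e-27    # erg s
C = 2.99792458e10         # cm / s
MPC_CM = 3.0856775814913673e24   # centimeters in one megaparsec
ZETA_3 = 1.2020569031595943      # Riemann zeta function at 3

OMEGA_B = 0.0486   # baryon density parameter, from the text
T_CMB = 2.7255     # K (assumed)
LITTLE_H = 0.6774  # dimensionless Hubble parameter (assumed)

def critical_density(h):
    """Critical density 3 H0^2 / (8 pi G) in g / cm^3 for H0 = 100 h km/s/Mpc."""
    H0 = 100 * h * 1e5 / MPC_CM   # converted to 1 / s
    return 3 * H0 ** 2 / (8 * math.pi * G)

n_b = OMEGA_B * critical_density(LITTLE_H) / M_P                          # eq. (3.117)
n_gamma = 2 * ZETA_3 / math.pi ** 2 * (K_B * T_CMB / (HBAR * C)) ** 3     # eq. (3.118)
eta0 = n_b / n_gamma                                                      # eq. (3.119)

print(f"n_b = {n_b:.2e} cm^-3, n_gamma = {n_gamma:.0f} cm^-3, eta_0 = {eta0:.2e}")
```

Dividing `n_b` by the square of `LITTLE_H` recovers the $h$-independent coefficient quoted in (3.117).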
In 1966, the Soviet physicist Sakharov postulated that in order to generate baryon-
antibaryon asymmetry in the early universe there are three necessary conditions:

1. violation of baryon number conservation;

2. C and CP violation (while CPT symmetry must hold for any well-behaved QFT);

3. out-of-equilibrium processes.

B violation is needed for the existence of processes which can generate more baryons than
antibaryons. C violation is needed because otherwise any process which generates baryons
would have a counterpart generating antibaryons occurring with the same probability. CP
violation is needed because otherwise processes generating left-handed baryons would
be balanced by processes generating right-handed antibaryons, and vice versa. Out-of-
equilibrium processes are needed because otherwise B-increasing and B-reducing processes
would balance out.

3.5 Decoupling of particle species


Earlier we made the claim that neutrinos decouple at $T \approx 1$ MeV: let us justify this, and
explore the concept of a particle species being decoupled in general.
We define the interaction rate $\Gamma$: it is the number of interactions per unit time that each
particle undergoes.
The Hubble rate $H = \dot{a}/a$ is also, dimensionally, an inverse time, so we can compare $H$
and $\Gamma$ directly. What does this comparison tell us? We have seen earlier that the age of the
universe is roughly given by $1/H$, while $1/\Gamma$ is the average time for an interaction to occur
(modelling the interaction times as a Poisson process). So, if $1/H < 1/\Gamma$, then each particle
of the species has undergone on average less than one interaction in the whole existence
of the universe. This condition is called decoupling, and is equivalent to $\Gamma < H$. If $\Gamma \ll H$
then the interaction basically does not happen.
In general, $\Gamma$ can be calculated as $\Gamma = n \langle \sigma v \rangle$, where $n$ is the number density, $v$ is the
velocity of the particles, $\sigma$ is the cross section of the interaction, and we average over the
velocity distribution of the particles.
In the Standard Model, interactions are mediated by gauge bosons, which can be either
massless (like the photon) or massive (like the weak interaction $W^\pm$ and $Z$ bosons19).
19 Properly speaking, these bosons are not massive in general since they acquire their mass through the
spontaneous breaking of the $SU(2)_L \times U(1)_Y$ symmetry of the Standard Model to $U(1)_{\mathrm{em}}$, which happens

For massless boson mediators, the cross section is (roughly) given by $\sigma \sim \alpha^2 / T^2$, where
the coupling constant is $g = \sqrt{4 \pi \alpha}$.
For massive boson mediators with mass $m_x$, we need to distinguish between two cases:
for temperatures $T \leq m_x$, the cross section is of the order $\sigma \sim G_x^2 T^2$, while for higher
temperatures $T > m_x$ we have $\sigma \sim \alpha^2 / T^2$ as in the massless case.20 Typically, $G_x = \alpha / m_x^2$.
Let us then estimate the decoupling temperatures in the two cases. In our computation,
$v$ is the relative velocity between the two particles; in both cases we assume that the particles
are relativistic, therefore $v \approx 1$. We will make very rough estimates, neglecting numerical factors
like $\pi$ in front of our expressions; the things we want to get right are the dimensionality
and the asymptotic scaling of the equations.

Massless boson decoupling We know that the number density of particles scales like
$n \sim T^3 \sim a^{-3}$ in this radiation-dominated epoch, therefore the interaction rate will scale like
\[
  \Gamma \sim T^3 \sigma \sim T^3 \frac{\alpha^2}{T^2} \sim \alpha^2 T \,. \tag{3.121}
\]
What is the scaling of $H$? We can recover it from the first Friedmann equation (see
equation (3.63)):
\[
  H = \left( \frac{8 \pi G}{3} \right)^{1/2} \left( g_* \frac{\pi^2}{30} \right)^{1/2} T^2 \sim \frac{T^2}{m_{\mathrm{pl}}} \,, \tag{3.122}
\]
since Newton's constant is $G \sim 1/m_{\mathrm{pl}}^2$. The numeric factor we are neglecting is about 10, only one order
of magnitude, which is fine given the roughness of our calculation.
Let us then see when $\Gamma / H < 1$, which corresponds to decoupling:
\[
  \frac{\Gamma}{H} \sim \frac{\alpha^2 T m_{\mathrm{pl}}}{T^2} \sim \alpha^2 m_{\mathrm{pl}} \frac{1}{T} \,, \tag{3.123}
\]
so we have decoupling when $T > \alpha^2 m_{\mathrm{pl}}$.
For electromagnetism $\alpha \approx 1/137$, so at temperatures larger than $T \approx 10^{16}$ GeV the
interaction mediated by the massless photon is decoupled.
Note that decoupling occurs at very high temperatures, meaning very early times, close
to the Planck epoch; further, the photons start off decoupled and then couple. After this,
there is no lower bound: they may remain coupled down to arbitrarily low temperatures.
below the scale of electroweak symmetry breaking $\sim 10^2$ GeV. The bosons being massive yields an exponential
cutoff $e^{-mr}$ in the expression of the potential for the interaction (this is the Yukawa potential), so at energies
higher than the electroweak SB scale the weak interaction becomes long-range.
20 The reason for this is that when we compute the cross section for an interaction with a massive boson we
find an expression like
\[
  \sigma \sim \frac{s}{(s - m_x^2)^2} \,, \tag{3.120}
\]
where $s$ is the square of the center-of-mass energy of the interaction; if there is thermal equilibrium then typically
we will have $s \sim T^2$. We can then see that at high energies the expression is asymptotically $\sigma \sim T^{-2}$, while at
low energies it is $\sigma \sim T^2 / m_x^4$. This expression also holds for $m_x = 0$.

Some comments are made about gravitational interactions, but what would $\alpha_{\mathrm{grav}}$ be?

The calculation seems to yield $T \approx 10^{15}$ GeV, or even less if we account for $g_*$... I guess
it does not really matter but it seemed curious.
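
The estimate $T \sim \alpha^2 m_{\mathrm{pl}}$ can be checked with a couple of lines of Python. This is only a
sketch of the scaling argument: the values of the fine-structure constant and of the Planck
mass are standard ones supplied here as assumptions (they are not quoted in these notes), and
all numerical prefactors such as $g_*$ are neglected, as in the text.

```python
# Order-of-magnitude check of T ~ alpha^2 * m_pl (natural units, energies in GeV).
ALPHA_EM = 1 / 137.036     # fine-structure constant (assumed standard value)
M_PLANCK_GEV = 1.22e19     # Planck mass in GeV (assumed standard value)


def massless_coupling_temperature(alpha, m_pl=M_PLANCK_GEV):
    """Temperature (GeV) at which Gamma ~ alpha^2 T equals H ~ T^2 / m_pl."""
    return alpha**2 * m_pl


T_em = massless_coupling_temperature(ALPHA_EM)
print(f"alpha^2 m_pl = {T_em:.1e} GeV")
```

With these inputs the result falls between $10^{14}$ and $10^{15}$ GeV, so the power of ten quoted
for this temperature should be read as a rough figure only.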

Massive boson decoupling It is of interest for us to consider the low-energy situation, in
which $T < m_x$. The opposite limit, $T \gg m_x$, behaves like the massless case: we find an
upper limit above which there is decoupling.
So, we have $\Gamma \sim T^3 G_x^2 T^2 = G_x^2 T^5$, to compare with $H \sim T^2 / m_{\mathrm{pl}}$: as before we take the
ratio, to find
\[
  \frac{\Gamma}{H} \sim \frac{T^5 G_x^2}{T^2 / m_{\mathrm{pl}}} \sim G_x^2 m_{\mathrm{pl}} T^3 \,, \tag{3.124}
\]
so we have decoupling ($\Gamma / H < 1$) when $T < m_{\mathrm{pl}}^{-1/3} G_x^{-2/3}$. This is a lower bound on the
temperatures at which the interaction is coupled! The situation is qualitatively different from
the massless case.
For the weak interaction, $G_x$ is called the Fermi constant $G_F \sim (300\ \mathrm{GeV})^{-2}$:21 this yields
the bound
\[
  T \lesssim 1\ \mathrm{MeV} \,, \tag{3.125}
\]
which is why below 1 MeV neutrinos are decoupled.
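
The bound (3.125) follows from plugging numbers into $T \sim m_{\mathrm{pl}}^{-1/3} G_x^{-2/3}$. The sketch below
does this with the rough value $G_F \sim (300\ \mathrm{GeV})^{-2}$ used above; the Planck mass is a standard
value supplied here as an assumption, and order-one prefactors are dropped as in the text.

```python
# Rough weak-decoupling temperature T_d ~ m_pl^(-1/3) * G_x^(-2/3), energies in GeV.
M_PLANCK_GEV = 1.22e19       # Planck mass in GeV (assumed standard value)
G_FERMI = 300.0 ** -2        # G_F ~ (300 GeV)^-2, the rough value quoted in the text


def massive_decoupling_temperature(g_x, m_pl=M_PLANCK_GEV):
    """Temperature (GeV) below which Gamma ~ G_x^2 T^5 drops under H ~ T^2 / m_pl."""
    return m_pl ** (-1 / 3) * g_x ** (-2 / 3)


T_d_MeV = 1e3 * massive_decoupling_temperature(G_FERMI)
print(f"T_d ~ {T_d_MeV:.2f} MeV")
```

The output is slightly below 1 MeV, in line with the bound above.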


Then Pacciani [Pac18] applies the formula to gravitation setting $G_x = G$ to find that it
decouples at $T \sim m_P$, but the graviton is massless!
To summarize, interactions which are mediated by massless particles couple at a certain
(high) temperature and are thereafter coupled. On the other hand, interactions which are
mediated by massive particles couple at a certain (high) temperature, stay coupled for a
while, and then decouple again at a relatively low temperature.

3.6 Hydrogen recombination


In the early universe, at $z \sim$ a few $\times 1000$, there were free electrons and free protons. The
temperature had dropped below the binding energy of hydrogen, 13.6 eV, quite a long time
earlier, around $z \sim 5 \times 10^4$; however, as we will see in more detail, the conditions were such
that most hydrogen was still ionized.
Around $z \sim 2000$ the process of recombination22 started. Then, around $z \sim 1400$ to $1600$
[CL02, table 9.1] the process really started picking up, crossing the half-way point for the
fraction of hydrogen which was ionized.
21 This value can be experimentally measured from weak interactions, but it is consistent with the masses of
the mediators being of the order of 100 GeV.
22 The process $e^- + p \to H + \gamma$ is known as “recombination” for historical reasons, although electrons and
protons were not re-combining, since bound hydrogen could not have existed earlier in the history of the
universe.

At around $z \sim 1100$ the fraction of ionized hydrogen became so low that the universe
became transparent to radiation. Photons which were being constantly scattered reached
their last scattering and then kept going; the temperature at that point was $T \sim 3000$ K, so
now we see the CMB as a thermal distribution at a temperature of $3000\ \mathrm{K} / (1 + z) \approx 2.7$ K.
Now, let us get to the details of how this all happened.
In order to precisely treat the process $e^- + p \leftrightarrow H + \gamma$ we would need all the scattering
matrix elements, as well as all the phase space densities of the particles. This would allow
us to treat a general distribution in phase space, even out of equilibrium.
What we will do instead is to provide a bulk estimate, making use of the Saha equation,
which assumes we are at both thermal and chemical equilibrium. It relates the chemical
potentials of the particles as $\mu_e + \mu_p = \mu_H + \mu_\gamma$; however we know that $\mu_\gamma = 0$, so we have
\[
  \mu_e + \mu_p = \mu_H \,. \tag{3.126}
\]

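Electrons, protons and hydrogen atoms were all nonrelativistic at this stage: therefore
their number densities can be approximated with Boltzmann statistics (the low-energy
approximation: equation (3.51)),
\begin{align}
  n_e &= g_e \left( \frac{m_e T}{2 \pi} \right)^{3/2} \exp\left( \frac{\mu_e - m_e}{T} \right) \tag{3.127} \\
  n_p &= g_p \left( \frac{m_p T}{2 \pi} \right)^{3/2} \exp\left( \frac{\mu_p - m_p}{T} \right) \tag{3.128} \\
  n_H &= g_H \left( \frac{m_H T}{2 \pi} \right)^{3/2} \exp\left( \frac{\mu_H - m_H}{T} \right) \,. \tag{3.129}
\end{align}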

The statistical weights here are $g_e = g_p = 2$ and $g_H = 4$ [CL02, pag. 194].23 Also, we will
later need the fact that the universe being globally neutral implies $n_e = n_p$.
Let us now introduce the total baryon density, $n_b$. In principle, we should account for
helium: as we will see, a couple of minutes after the Big Bang He-4 nuclei started to form,
but they made up only something like 25% of the mass, which means 6% of the number
density: so we ignore them and say that the number density of baryons is
\[
  n_b = n_p + n_H \,. \tag{3.130}
\]
The quantity we want to describe is how much hydrogen is still ionized: this is given
by the ratio of the free electrons to the total baryons, $n_e / n_b$. This is called the ionization
fraction $X_e$, and is also equal to $n_p / n_b$ since the universe must be locally neutral.
We expect to have $X_e = 1$ in the early universe, and $X_e = 0$ after the end of recombination.
This is a good model to keep in mind although it is not precisely correct, since simulations
show that there will be some residual ionization: $X_e$ only goes down to around $10^{-4}$ to $10^{-5}$
at the end of recombination. Also, in the modern universe a large fraction of the hydrogen
has become ionized again, especially in the intergalactic medium; this is likely due to the
energy injected into it by structure formation, which definitely breaks the approximation of
23 These numbers are given as fact here; they come from a quantum-mechanical study of the system.

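global thermal and chemical equilibrium. This is not a concern for us: here, we only wish
to describe the processes in the interval $1000 \lesssim z \lesssim 2000$.
The binding energy of hydrogen is $B = m_p + m_e - m_H = 13.6$ eV, so instead of $m_H$
we can write $m_H = m_p + m_e - B$. Let us insert this expression and the Saha equation in the
expression for the hydrogen number density; then we will recognize part of the expressions
for the proton and electron number densities, which we can substitute in:
\begin{align}
  n_H &= g_H \left( \frac{m_H T}{2 \pi} \right)^{3/2} \exp\left( \frac{\mu_H - m_H}{T} \right) \tag{3.131a} \\
  &= g_H \left( \frac{m_H T}{2 \pi} \right)^{3/2} \exp\left( \frac{\mu_e + \mu_p - m_e - m_p + B}{T} \right) \tag{3.131b} \\
  &= \underbrace{\frac{g_H}{g_e g_p}}_{= 1} n_e n_p \cancel{\left( \frac{m_H T}{2 \pi} \right)^{3/2}} \left( \frac{m_e T}{2 \pi} \right)^{-3/2} \cancel{\left( \frac{m_p T}{2 \pi} \right)^{-3/2}} \exp\left( \frac{B}{T} \right) \tag{3.131c} \\
  \frac{n_H}{n_e n_p} &= \left( \frac{m_e T}{2 \pi} \right)^{-3/2} \exp\left( \frac{B}{T} \right) \tag{3.131d} \\
  \frac{n_b - n_p}{n_p^2} &= \left( \frac{m_e T}{2 \pi} \right)^{-3/2} \exp\left( \frac{B}{T} \right) \,, \tag{3.131e}
\end{align}
where the two cancelled factors in (3.131c) simplify because $m_H \approx m_p$. This is an expression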

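which we can manipulate, using the following identity:
\[
  \frac{n_b - n_p}{n_p^2} = \frac{n_b \left( 1 - n_p / n_b \right)}{n_b^2 X_e^2} = \frac{1}{n_b} \frac{1 - X_e}{X_e^2} \,, \tag{3.132}
\]
where we use $n_e = n_p$ and the definition $X_e = n_p / n_b$. Then, we bring the $n_b$ to the other
side of the equation and multiply and divide by the photon number density, which is given
by
\[
  n_\gamma = \frac{2 \zeta(3)}{\pi^2} T^3 \,. \tag{3.133}
\]
With this, we can insert the baryon-to-photon ratio $\eta_b$ (which is conserved, so we can use its
current value $\eta_0$):
\begin{align}
  \frac{1 - X_e}{X_e^2} &= \underbrace{\frac{n_b}{n_\gamma}}_{\eta_b} n_\gamma \left( \frac{m_e T}{2 \pi} \right)^{-3/2} \exp\left( \frac{B}{T} \right) \tag{3.134a} \\
  &= \eta_0 \frac{2 \zeta(3) T^3}{\pi^2} \left( \frac{m_e T}{2 \pi} \right)^{-3/2} \exp\left( \frac{B}{T} \right) \tag{3.134b} \\
  &= \eta_0 \frac{4 \sqrt{2} \zeta(3)}{\sqrt{\pi}} \left( \frac{T}{m_e} \right)^{3/2} \exp\left( \frac{B}{T} \right) \,, \tag{3.134c}
\end{align}
which we can solve numerically to find $X_e$ as a function of temperature, or of redshift. The
results are shown in figure 3.1.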
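
A minimal sketch of such a numerical solution follows. Since (3.134c) is a quadratic equation
in $X_e$, no root finder is needed. The electron mass, the present CMB temperature and the
value of $\zeta(3)$ are standard values supplied here as assumptions; the temperature is taken to
scale as $T = T_0 (1 + z)$, and the function is only meant for $z \gtrsim 100$ (at lower redshift the
exponential overflows).

```python
import math

ETA_0 = 6.1e-10                 # baryon-to-photon ratio, equation (3.119)
B_EV = 13.6                     # hydrogen binding energy in eV
M_E_EV = 510998.95              # electron mass in eV (assumed standard value)
T0_EV = 2.725 * 8.617333e-5     # CMB temperature today in eV (assumed T0 = 2.725 K)
ZETA_3 = 1.2020569              # Riemann zeta function at 3


def saha_rhs(T):
    """Right-hand side of (3.134c) at temperature T (in eV)."""
    prefactor = ETA_0 * 4 * math.sqrt(2) * ZETA_3 / math.sqrt(math.pi)
    return prefactor * (T / M_E_EV) ** 1.5 * math.exp(B_EV / T)


def ionization_fraction(z):
    """Solve (1 - X) / X^2 = S for X in (0, 1], with S evaluated at T = T0 (1 + z)."""
    S = saha_rhs(T0_EV * (1 + z))
    # Positive root of S X^2 + X - 1 = 0, written so that no cancellation occurs.
    return 2 / (1 + math.sqrt(1 + 4 * S))


for z in (2000, 1600, 1400, 1200, 1000):
    print(f"z = {z:4d}   X_e = {ionization_fraction(z):.4f}")
```

The printed values go from essentially full ionization at $z = 2000$ to a fraction well below one
percent at $z = 1000$, with the half-way point a little below $z = 1400$.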

Figure 3.1: Ionization fraction $X_e$ (vertical axis, from 0 to 1) as a function of redshift $z$
(horizontal axis, from 2000 down to 800): a numerical solution of equation (3.134c).

Let us try to understand what is going on. Roughly speaking, we get intermediate values
for $X_e$, like 0.5, when the right-hand side is of order 1. If the right-hand side only contained
the terms $(T / m_e)^{3/2} \exp(B / T)$ without any multiplicative factor in front (meaning, roughly,
that we had $\eta_0 \sim 1$), then we would see recombination starting to occur already at $z \sim 4000$
($T \sim 1$ eV), and by $z \sim 2000$ it would almost be over. The reason this does not happen is that
$\eta_0$ is very small: there are a lot of photons, many more than the baryons, which are ready to
dissociate any atoms which start to form.
So, hydrogen is truly formed only when $T \sim 0.3$ eV, much lower than its ionization
energy.
But even with $\eta_0 \sim 1$ we get formation at $T \sim 1$ eV, an order of magnitude less than $B$!
Is there an intuitive argument as to why this is the case?
The estimates we gave do depend on the value we assign to $\Omega_{0b}$ and $h$; however, within
the currently accepted experimental ranges the main predictions do not vary substantially.
The process of recombination is gradual, but we can choose a conventional redshift as,
for example, the time when we reach an arbitrary threshold like $X_e = 0.1$.

The interaction rate between photons and free electrons is $\Gamma_\gamma = n_e \sigma_T c$, where $\sigma_T$ is
the Thomson cross section while $n_e$ is the number density of free electrons, which can
be estimated as $n_e = n_b X_e \approx \rho_C \Omega_b X_e / m_p$ (with the present-day baryon density scaled by
$(1 + z)^3$). Then, we can estimate the moment of the last
scattering by checking the decoupling condition: $\Gamma_\gamma < H$. With the numbers given earlier,
we find $z \sim 1120$: a very good approximation to the currently accepted value $z \sim 1089$!
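
The condition $\Gamma_\gamma = H$ can be located numerically by bisection. The sketch below combines the
Saha ionization fraction from (3.134c) with an expansion rate containing only matter and
radiation. All the present-day parameters ($H_0$, $\Omega_m$, $\Omega_r$, the photon number density, the
Thomson cross section) are round standard values supplied here as assumptions, and the
baryon density is written as $n_b = \eta_0 n_\gamma$ instead of $\rho_C \Omega_b / m_p$; the two are equivalent
through (3.119).

```python
import math

# Assumed round present-day parameters (standard values, not taken from this section).
H0 = 67.7 * 1e5 / 3.0857e24    # Hubble constant in 1/s (67.7 km/s/Mpc)
OMEGA_M = 0.31                 # matter density parameter
OMEGA_R = 9.2e-5               # radiation density parameter (photons and neutrinos)
ETA_0 = 6.1e-10                # baryon-to-photon ratio, equation (3.119)
N_GAMMA_0 = 411.0              # photon number density today, cm^-3
SIGMA_T = 6.652e-25            # Thomson cross section, cm^2
C = 2.998e10                   # speed of light, cm/s
T0_EV, M_E_EV, B_EV, ZETA_3 = 2.725 * 8.617333e-5, 510998.95, 13.6, 1.2020569


def ionization_fraction(z):
    """Saha ionization fraction from (3.134c), with T = T0 (1 + z)."""
    T = T0_EV * (1 + z)
    S = (ETA_0 * 4 * math.sqrt(2) * ZETA_3 / math.sqrt(math.pi)
         * (T / M_E_EV) ** 1.5 * math.exp(B_EV / T))
    return 2 / (1 + math.sqrt(1 + 4 * S))


def scattering_rate(z):
    """Gamma = n_e sigma_T c in 1/s, with n_e = eta_0 n_gamma0 (1 + z)^3 X_e."""
    n_e = ETA_0 * N_GAMMA_0 * (1 + z) ** 3 * ionization_fraction(z)
    return n_e * SIGMA_T * C


def hubble_rate(z):
    """H(z) in 1/s for matter plus radiation."""
    return H0 * math.sqrt(OMEGA_M * (1 + z) ** 3 + OMEGA_R * (1 + z) ** 4)


def last_scattering_redshift(z_low=800.0, z_high=1600.0, tol=0.01):
    """Bisect on Gamma - H, which is negative at z_low and positive at z_high."""
    while z_high - z_low > tol:
        z_mid = 0.5 * (z_low + z_high)
        if scattering_rate(z_mid) > hubble_rate(z_mid):
            z_high = z_mid
        else:
            z_low = z_mid
    return 0.5 * (z_low + z_high)


z_ls = last_scattering_redshift()
print(f"Gamma = H at z ~ {z_ls:.0f}")
```

With these inputs the crossing lands in the neighbourhood of $z \sim 1100$; the exact figure shifts
by a few tens if the assumed parameters are varied within their usual ranges.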
Then Pacciani has an argument as to why $T \sim a^{-2}$ for matter, should this be included
here? It basically comes from entropy conservation $dE = -P \, dV$, the ideal gas law
$P = n_b k_B T$ and the equipartition theorem $E = \frac{3}{2} n_b a^3 k_B T$.

3.7 Primordial nucleosynthesis


Already in the 1940s it was noticed by Alpher, Bethe and Gamow that the abundances
of certain nuclides could not be explained if they were formed in stellar interiors
alone. Specifically, the issue is with the abundances of light elements: deuterium $^2$H, as well
as $^3$He and $^4$He, while heavier nuclides (with mass number $A \geq 7$) could not form in the
early universe [CL02, sec. 8.6.1].24

The most abundant of these light nuclides by far is $^4$He, whose mass fraction is denoted
as $Y \approx 25\,\%$, while its number fraction is approximately 6 %. Our model will need to yield
this many helium-4 nuclei after primordial nucleosynthesis.
A mechanism for the synthesis of these light nuclides in the early universe is needed.
We shall model this mechanism, under the following assumptions (which are, as far as
we know, roughly verified):

1. the universe passed through a very high temperature phase, with $T > 10^{12}$ K, during
which thermal equilibrium held;

2. the universe at this stage is described by General Relativity and the Standard Model
of particle physics, and it is homogeneous and isotropic;

3. the chemical potentials for the neutrinos $\mu_\nu$ have certain upper bounds, such that the
number of neutrino types is approximately 3;

4. there is no matter-antimatter separation (as in, antimatter “bubbles”);

5. there are no strong magnetic fields;

6. the number of exotic particles has a certain upper bound (it is small compared to the
number of photons).

The main formation channels in the early universe are:

1. $n + p \leftrightarrow d + \gamma$ ($d = {}^2$H denotes deuterium);
24 Environments which allow their production (stellar interiors) have lower temperature but a higher density,
since the main obstruction to their production is the absence of stable nuclides at $A = 5$ and $A = 8$; the
environment needs to raise the odds of an unstable $^8$Be nuclide colliding with an $\alpha$ particle to form carbon
before it decays.

2. $d + d \leftrightarrow {}^3\mathrm{He} + n$;

3. ${}^3\mathrm{He} + d \leftrightarrow {}^4\mathrm{He} + p$.

Note that these processes, unlike the stellar ones, do not involve the weak interaction:
neutrons and protons do not turn into each other. In stars, there are no free neutrons so this
chain is not possible.
The slowest process of the three is the first, since it is heavily affected by photons, which
destroy deuterium. After we have produced deuterium, helium-4 is readily produced.
In order to find out how much deuterium we have, we need the neutron-to-proton ratio.
We are working at energies of around 1 MeV, so protons and neutrons are not relativistic
anymore; as long as they are in equilibrium through weak processes both will obey Boltzmann
statistics, so for $i = n, p$:
\[
  n_i = g_i \left( \frac{m_i T}{2 \pi} \right)^{3/2} \exp\left( \frac{\mu_i - m_i}{T} \right) \,, \tag{3.135}
\]

meaning that their number ratio is given by25


The explanation in [CL02] is not the same as the one given by Pacciani [Pac18]... I'm
inclined to trust the former.

\[
  \frac{n_n}{n_p} \sim \exp\left( - \frac{m_n - m_p}{T} \right) \,, \tag{3.136}
\]
where $m_n - m_p \approx 1.3$ MeV $\approx 1.5 \times 10^{10}$ K.


The proton is the lightest baryon and is therefore stable, while the neutron is unstable:
it can turn into a proton through the weak-interaction processes

1. $n + \nu_e \leftrightarrow p + e^-$;

2. $n + e^+ \leftrightarrow p + \bar{\nu}_e$;

3. $n \to p + e^- + \bar{\nu}_e$.

The neutron fraction keeps decreasing as $T$ decreases and these processes keep happening;
however, as we have previously discussed, at around $T_{d\nu} \sim 1$ MeV neutrinos decouple,
at which point the first two back-and-forth reactions stop, and we are left with
$n_n / n_p \approx \exp(- \Delta m / T_{d\nu}) \approx 0.27$.
We define the number fraction of neutrons, which is approximately
\[
  X_n(t) \equiv \frac{n_n}{n_n + n_p} \approx 0.21 \,. \tag{3.137}
\]
25 We are neglecting the chemical potentials since, as explained by Coles and Lucchin [CL02, sec. 8.6.2], as long
as both weak and electromagnetic interactions are in chemical equilibrium the chemical potentials are forced to
be zero by all the balance equations.

The third reaction, which is $\beta$ decay, keeps occurring since it does not require the
presence of neutrinos, so after the decoupling of neutrinos the number fraction decays
exponentially as:
\[
  X_n(t) = X_n(t_{d\nu}) \exp\left( - \frac{t - t_{d\nu}}{\tau_n} \right) \,, \tag{3.138}
\]
where $\tau_n = t_{1/2} / \log 2$ is the mean lifetime, and the half-life of neutrons is given by
$t_{1/2} \approx (10.5 \pm 0.2)$ min. So, each minute neutrons stay unbound some of them are decaying;
the process of deuterium
formation however is rather fast as we shall see, so not many of them are lost.
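
The numbers in (3.136), (3.137) and (3.138) can be reproduced with a few lines. This is a
sketch which only uses the values quoted in the text (the mass difference, the decoupling
temperature and the half-life); the function names are illustrative.

```python
import math

DELTA_M_MEV = 1.3                    # neutron-proton mass difference, as in (3.136)
T_DNU_MEV = 1.0                      # neutrino decoupling temperature from the text
T_HALF_S = 10.5 * 60                 # neutron half-life quoted in the text, in seconds
TAU_N_S = T_HALF_S / math.log(2)     # mean lifetime entering (3.138)


def frozen_neutron_fraction(delta_m=DELTA_M_MEV, T_d=T_DNU_MEV):
    """X_n = n_n / (n_n + n_p) at neutrino decoupling, from the ratio (3.136)."""
    ratio = math.exp(-delta_m / T_d)
    return ratio / (1 + ratio)


def neutron_fraction(elapsed_s):
    """X_n after a time elapsed_s (seconds) of free decay following decoupling."""
    return frozen_neutron_fraction() * math.exp(-elapsed_s / TAU_N_S)


print(f"n_n / n_p at decoupling: {math.exp(-DELTA_M_MEV / T_DNU_MEV):.2f}")
print(f"X_n at decoupling:       {frozen_neutron_fraction():.2f}")
print(f"X_n one minute later:    {neutron_fraction(60.0):.3f}")
```

By construction, after one half-life the fraction has dropped to half of its frozen value.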
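Let us then move to deuterium formation: its binding energy is around $B_d = m_p + m_n - m_d \approx 2.2$ MeV.
We proceed exactly like we did with hydrogen: since $\mu_p + \mu_n = \mu_d$, the
deuterium number density can be expressed as
\begin{align}
  n_d &= g_d \left( \frac{m_d T}{2 \pi} \right)^{3/2} \exp\left( \frac{\mu_d - m_d}{T} \right) \tag{3.139} \\
  &= \frac{g_d}{g_p g_n} n_n n_p \left( \frac{m_d}{m_n m_p} \right)^{3/2} \left( \frac{T}{2 \pi} \right)^{-3/2} \exp\left( \frac{B_d}{T} \right) \,, \tag{3.140}
\end{align}
which, dividing through by $n_b$ and using $g_d = 3$ (since deuterium has spin 1) and $g_p = g_n = 2$,
can be expressed as:
\begin{align}
  X_d &= \frac{3}{4} n_b X_n X_p \left( \frac{m_d}{m_n m_p} \right)^{3/2} \left( \frac{T}{2 \pi} \right)^{-3/2} \exp\left( \frac{B_d}{T} \right) \tag{3.141a} \\
  &= \frac{3}{4} \eta_0 X_n X_p \left( \frac{m_d}{m_n m_p} \right)^{3/2} \frac{2 \zeta(3)}{\pi^2} (2 \pi T)^{3/2} \exp\left( \frac{B_d}{T} \right) && \text{(substituted } n_b = \eta_0 n_\gamma \text{)} \tag{3.141b} \\
  &\approx \frac{3}{4} \eta_0 X_n (1 - X_n) \left( \frac{m_d}{m_n m_p} \right)^{3/2} \frac{2}{\pi^2} (2 \pi T)^{3/2} \zeta(3) \exp\left( \frac{B_d}{T} \right) \,, && \text{(approximated } X_p + X_n \approx 1 \text{, ignoring heavier nuclides)} \tag{3.141c}
\end{align}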

Does approximating $X_p + X_n \approx 1$ not ignore deuterium as well? Or rather: by the
definition given before $X_n + X_p = 1$ is exact, and if we do normalize by $n_n + n_p + n_d + \dots$
we should specify...
which describes the deuterium bottleneck: similarly to hydrogen recombination, the presence
of many photons for each nuclide keeps compound particles from forming for quite a
long time, and this precludes the formation of helium.
We can approximate this as
\[
  X_d \approx X_n (1 - X_n) \exp\left( -29.33 + \frac{25.82}{T_9} + \frac{3}{2} \log T_9 + \log\left( \Omega_{0b} h^2 \right) \right) \,, \tag{3.142}
\]
where $T_9 = T / 10^9$ K. The term $\frac{3}{2} \log T_9$ comes from the factor $(2 \pi T)^{3/2}$ in (3.141c), while the
factor $\Omega_{0b} h^2$ comes from the scaling of the baryon-to-photon ratio
$\eta$ (equation (3.119)).
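
To see at which temperature the bottleneck opens, we can look for the value of $T_9$ at which
the exponent in (3.142) vanishes, so that $X_d$ stops being exponentially suppressed. The sketch
below does this by bisection. The value $\Omega_{0b} h^2 = 0.022$ is an assumption supplied here, and the
logarithmic term is taken with the sign implied by the factor $(2 \pi T)^{3/2}$ in (3.141c); since the
$1 / T_9$ term dominates, the result is only mildly sensitive to that term.

```python
import math

OMEGA_B_H2 = 0.022    # assumed present-day value of Omega_0b h^2


def bottleneck_exponent(T9, omega_b_h2=OMEGA_B_H2):
    """Exponent of (3.142): X_d ~ X_n (1 - X_n) exp(exponent)."""
    return -29.33 + 25.82 / T9 + 1.5 * math.log(T9) + math.log(omega_b_h2)


def bottleneck_temperature(T9_low=0.1, T9_high=10.0, tol=1e-6):
    """Bisect for the T9 where the exponent vanishes (it decreases with T9 in this range)."""
    while T9_high - T9_low > tol:
        T9_mid = 0.5 * (T9_low + T9_high)
        if bottleneck_exponent(T9_mid) > 0:
            T9_low = T9_mid
        else:
            T9_high = T9_mid
    return 0.5 * (T9_low + T9_high)


T9_open = bottleneck_temperature()
T_open_MeV = T9_open * 1e9 * 8.617333e-11   # convert 10^9 K to MeV with k_B in MeV/K
print(f"T9 ~ {T9_open:.2f}, i.e. T ~ {T_open_MeV:.3f} MeV")
```

The bottleneck opens well below $B_d \approx 2.2$ MeV, at a temperature of several tens of keV, for
the same reason that recombination happens well below 13.6 eV.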

Pacciani [Pac18] says we use the exponential decay of the neutron population to find
this formula: how does it come in, though?
Almost all the deuterium which is formed then turns into $^4$He, so since those nuclei are
about as heavy as four neutrons, and two neutrons are needed to make each, we can estimate that
the helium mass fraction will be given by $Y \approx 2 X_n \approx 0.25$. The calculation is quite rough;
in order to do it properly we should use the Boltzmann equation.
Very rough calculation... the corrective factor of 0.8 is kind of thrown in there, also I
think using $\exp(-1.5)$ instead of $\exp(-1.3)$ helps nudging the number and hides just
how rough this is
This provides an experimental bound for many parameters: the lifetime of the neutron,
for example, cannot be much shorter since otherwise too many would have decayed before
forming deuterium.
The reaction rate $\Gamma$ for weak interactions, as we have discussed earlier, is given by
$\Gamma \sim G_F^2 T^5 \propto T^5 / \tau_n$, where the Fermi coupling constant $G_F$ is connected to the lifetime of
the neutron $\tau_n$ since that is also a weak-interaction process.
Then, the moment of decoupling $\Gamma \sim H$ also constrains $\tau_n$, since it happens at a temperature
$T_{d\nu} \sim G_F^{-2/3} m_P^{-1/3} \propto \tau_n^{1/3} m_P^{-1/3}$.26
Increasing the lifetime of neutrons increases the amount of He-4 in the universe: the
decoupling temperature rises, freezing in a larger neutron fraction, and fewer neutrons decay
before being bound into nuclei.
We know that the Hubble parameter is given by
\[
  H^2 = \frac{8 \pi G}{3} \underbrace{\frac{\pi^2}{30} g_*(T) T^4}_{\rho = \rho_r} \,, \tag{3.143}
\]
so $H(T) \propto \sqrt{g_*(T)}$: this means that if we add more particles to our model which are
coupled in the $\sim$ MeV range, thus increasing $g_*$ at that temperature, we also increase $H$,
which means that the moment at which $\Gamma < H$ comes about earlier, meaning that we get
more helium. This is a very useful constraint on our models; for instance, any dark matter
particle we hypothesize needs to not “break nucleosynthesis”.
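
The dependence on $g_*$ can be made concrete with a small sketch. It sets the rough rate
$\Gamma = G_F^2 T^5$ equal to $H$ from (3.143), written in natural units with $G = 1 / m_{\mathrm{pl}}^2$, and
compares the standard value $g_* = 10.75$ with $g_* = 12.5$, which would correspond to one extra
neutrino-like species. The Fermi constant, the Planck mass and the two values of $g_*$ are
standard numbers supplied here as assumptions; since all order-one factors in $\Gamma$ are dropped,
only the trend is meaningful, not the absolute values.

```python
import math

M_PLANCK_MEV = 1.22e22     # Planck mass in MeV (assumed standard value)
G_FERMI = 1.166e-11        # Fermi constant in MeV^-2 (assumed standard value)
DELTA_M_MEV = 1.3          # neutron-proton mass difference, as in (3.136)


def hubble_rate(T, g_star):
    """H from (3.143) with G = 1 / m_pl^2; T and H in MeV."""
    return math.sqrt(8 * math.pi**3 * g_star / 90) * T**2 / M_PLANCK_MEV


def weak_rate(T):
    """Rough weak interaction rate Gamma = G_F^2 T^5, in MeV."""
    return G_FERMI**2 * T**5


def freeze_out_temperature(g_star):
    """Temperature (MeV) at which Gamma = H; the ratio H / Gamma scales as T^-3."""
    return (hubble_rate(1.0, g_star) / weak_rate(1.0)) ** (1 / 3)


def frozen_neutron_fraction(g_star):
    """X_n frozen in at the decoupling temperature, from the ratio (3.136)."""
    ratio = math.exp(-DELTA_M_MEV / freeze_out_temperature(g_star))
    return ratio / (1 + ratio)


for g_star in (10.75, 12.5):
    print(f"g* = {g_star:5.2f}   T_d = {freeze_out_temperature(g_star):.3f} MeV"
          f"   X_n = {frozen_neutron_fraction(g_star):.3f}")
```

The decoupling temperature scales as $g_*^{1/6}$, so the extra species raises it by a few percent
and leaves more neutrons available to form helium.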
Decreasing the baryon fraction $\Omega_b$ inhibits the production of deuterium, since then there
are more photons per baryon to dissociate it. We find that the model fits observation as long as
$0.011 h^{-2} \leq \Omega_{0b} \leq 0.25 h^{-2}$; most people agree that we are near the upper bound of this
range.
There is also another parameter, $m_P \sim G^{-1/2}$: modified gravity theories often predict
variations of the gravitational constant with time, but since this appears in our calculations
we can constrain its value at that stage in the early universe.
26 One might think that the lifetime of the neutron would be a well-established result in Earthly particle
physics, but in fact there is a conflict between two different kinds of measurements [Wol], so while the value is
roughly known the error bars of these measurements do not overlap: this is the “neutron lifetime puzzle”.

3.8 Dark Matter dynamics
We have described how dark matter came to be accepted as a necessary component of
the matter content of the universe; now we wish to understand what its basic properties are.
Observationally, dark matter has no relevant electromagnetic interactions: in the low-
redshift universe it only interacts gravitationally (it is decoupled), and is able to cluster.
This allows us to distinguish it from dark energy, which instead is uniformly distributed.
Models of dark matter can be classified into Hot and Cold dark matter: HDM and CDM.
We can also have Warm dark matter, which has intermediate properties.
In HDM, particles decoupled while they were ultrarelativistic, so they have very high
thermal motion. They move fast, and tend to destroy gravitational potential wells in which
they might settle by moving out of them, thus decreasing the quantity of matter there.
Neutrinos were once thought to be the dark matter, and would have been classified as HDM.
HDM particles erase potential wells on scales comparable to the maximum distance they have
travelled: this is calculated as $vt$, where $v$ is their average thermal velocity, and $t$ is the
age of the universe.
This means that the structures formed in the presence of HDM are larger than $10^{15} M_\odot$;
however this seems to conflict with observation, since we also see smaller structures! The
hypothesis is then that they were formed later, by fragmentation: this is known as the
top-down approach.
The top-down approach, however, is falsified by the observation of high-redshift quasars
combined with the scale of the anisotropies of the CMB: in order to account for high-redshift
small-scale structures (we have seen stars at $z \sim 20$!) we would have to increase the
amplitude of the anisotropies to a scale which is not compatible with the anisotropies we see
in the CMB. This does not mean that HDM cannot exist, but it cannot make up most of the
observed DM density.
CDM, on the other hand, is dark matter which decoupled at a time when it was already
nonrelativistic. A bottom-up approach to structure formation is compatible with the existence
of CDM. This is the kind of dark matter which appears in the currently-accepted $\Lambda$CDM
model of cosmology.
Although it is the best model to date, there are also issues with CDM, which are current
research topics. Dark matter is distributed in halos, gravitationally-bound regions which
contain dark matter and are currently decoupled from cosmic expansion.
One issue is that simulations predict small-scale halos in the larger ones, which are
however not observed. Also, the density profiles from simulations do not seem to match
what is observed: they have a cusp in the center of the halo, while observations suggest that
density profiles are flat in the center.
There are many proposed solutions to these problems: for example, dark matter could
self-interact in ways other than gravitational, or it could be warm.

3.8.1 The Boltzmann equation and decoupling


In order to understand how Dark Matter evolves, let us introduce the main equation
used to analyze non-equilibrium phenomena: the Boltzmann equation. It describes the

evolution of the phase space distribution of particles, $f$. Its compact form is:
\[
  L[f] = C[f] \,, \tag{3.144}
\]
where $L$ is the Liouville operator, which is a total derivative of the phase space density
with respect to changes in time, position as well as momentum; while $C$ is the collision
operator, which describes the effect (in terms of variation per unit time) on the phase space
distribution of collisions between particles; it is written in terms of scattering matrices. In
general, we should consider the phase space distribution of all the particle species, writing
an equation for each of these $f_i$; for now we will treat a single particle species for simplicity.
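This basic form is quite general: it can be used both for a simple Newtonian calculation
and to include general-relativistic and quantum-mechanical effects.
We start with the Newtonian description: the phase space has position, momentum and
time as coordinates, and a density function $f(\vec{q}, \vec{p}, t)$ is defined on it.
The classical form of the Liouville operator is the convective derivative,
\begin{align}
  L = \frac{D}{Dt} &= \frac{\partial}{\partial t} + \frac{d \vec{x}}{dt} \cdot \nabla_x + \frac{d \vec{p}}{dt} \cdot \nabla_p \tag{3.145a} \\
  &= \frac{\partial}{\partial t} + \vec{v} \cdot \nabla_x + \frac{\vec{F}}{m} \cdot \nabla_v \,, \tag{3.145b}
\end{align}
where in the second line we used $\vec{p} = m \vec{v}$, so that $\nabla_p = \nabla_v / m$.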
This equation describes motion in much more detail than, say, the Navier-Stokes equations,
since it allows for arbitrary momentum at each point in space. We can recover the N-S
equations, as well as the energy and mass conservation equations, by taking moments of the
Boltzmann equation: we multiply it by $m \vec{v}$, or $m v^2 / 2$, or $m$, and integrate in $d^3 p$.
In the Newtonian case, the force term $\vec{F}$ includes gravity, which in GR instead is
“geometrized”, meaning that it is included in the inertial motion of the particles. The
relativistic version of the Liouville operator is written in terms of position $x^\alpha$ and momentum
$p^\alpha = dx^\alpha / d\lambda$:
\begin{align}
  L &= \frac{dx^\alpha}{d\lambda} \frac{\partial}{\partial x^\alpha} + \frac{dp^\alpha}{d\lambda} \frac{\partial}{\partial p^\alpha} \tag{3.146} \\
  &= p^\alpha \partial_\alpha - \Gamma^\alpha_{\beta \gamma} p^\beta p^\gamma \frac{\partial}{\partial p^\alpha} \,. \tag{3.147}
\end{align}

Is the momentum supposed to be dimensionless or not? Setting $dx^\alpha / d\lambda = p^\alpha$ as well
as $p^2 = m^2$ seems contradictory...
This is because, since any particle must move along geodesics, $p^\alpha$ must satisfy the
geodesic equation:
\[
  \frac{D p^\alpha}{D \lambda} = \frac{dp^\alpha}{d\lambda} + \Gamma^\alpha_{\beta \gamma} p^\beta p^\gamma = 0 \,. \tag{3.148}
\]
We require that the particles are on shell:
\[
  g_{\alpha \beta} p^\alpha p^\beta = m^2 \,, \tag{3.149}
\]

meaning that we are neglecting the uncertainty principle and our discussion is relativistic
but classical (not fully quantum-mechanical).
Let us make this explicit, using the Christoffel symbols of the FLRW metric and assuming
isotropy and homogeneity (which together imply that the phase space density $f$ only
depends on $t$ and $p^0 = E$): many terms vanish, and we are left with
\[
  L[f] = p^0 \partial_t f - \Gamma^0_{ij} p^i p^j \frac{\partial f}{\partial E} = E \partial_t f - \frac{\dot{a}}{a} |\vec{p}|^2 \frac{\partial f}{\partial E} \,, \tag{3.150}
\]
since the relevant Christoffel symbols (in Cartesian coordinates) are
\[
  \Gamma^0_{ij} = \frac{\dot{a}}{a} \delta_{ij} \,. \tag{3.151}
\]
So, we can write the Boltzmann equation as
\[
  \frac{\partial f}{\partial t} - \frac{\dot{a}}{a} \frac{p^2}{E} \frac{\partial f}{\partial E} = \frac{C[f]}{E} \,. \tag{3.152}
\]
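Let us now do what is also done in the classical case and integrate in $d^3 \vec{p}$: we will lose
detail in the description but gain in computability. First, recall the definition of the number
density from the phase space distribution:
\[
  n(t) = \frac{g}{(2 \pi)^3} \int d^3 \vec{p} \, f(|\vec{p}|, t) \,, \tag{3.153}
\]
which appears in the first term if we integrate the Boltzmann equation (also multiplying by
$g / (2 \pi)^3$):
\[
  \frac{\partial n}{\partial t} - \frac{g}{(2 \pi)^3} \frac{\dot{a}}{a} \int d^3 \vec{p} \, \frac{p^2}{E} \frac{\partial f}{\partial E} = \frac{g}{(2 \pi)^3} \int d^3 p \, \frac{1}{E} C[f] \,. \tag{3.154}
\]
The number density actually also appears in the second term: we can manipulate it as
\begin{align}
  \int d^3 p \, \frac{p^2}{E} \frac{\partial f}{\partial E} &= 2 \int d^3 p \, p^2 \frac{\partial f}{\partial (E^2)} = 2 \int d^3 p \, p^2 \frac{\partial f}{\partial (p^2)} \tag{3.155a} \\
  &= \int d^3 p \, p \frac{\partial f}{\partial p} = 4 \pi \int_0^\infty dp \, p^3 \frac{\partial f}{\partial p} \tag{3.155b} \\
  &= 4 \pi \left[ p^3 f \right]_0^\infty - 4 \pi \int_0^\infty dp \, 3 p^2 f = -3 \int d^3 p \, f = -3 \frac{(2 \pi)^3}{g} n \,. \tag{3.155c}
\end{align}
In (3.155a) we used $d(E^2) = 2 E \, dE$ and similarly for $d(p^2)$, together with $d(E^2) = d(p^2)$,
which follows from $E^2 = p^2 + m^2$.
The boundary term vanishes since at 0 we have $p = 0$, and at (momentum) infinity we
have $f = 0$ (since otherwise the energy would diverge). We are also moving back and forth
between integrals in $d^3 p$ and in $dp$ times $4 \pi p^2$, which we can do because of isotropy.
So, we can see that the left-hand side is equal to $\dot{n} + 3 \dot{a} n / a$, and we have the cosmological
Boltzmann equation
\[
  \dot{n} + 3 \frac{\dot{a}}{a} n = \frac{g}{(2 \pi)^3} \int d^3 p \, \frac{1}{E} \hat{C}[f] \,. \tag{3.156}
\]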

Right away we can see that this makes sense in the decoupling limit: if there are no
collisions the right-hand side must vanish, so we are left with
\[
  \dot{n} + 3 \frac{\dot{a}}{a} n = \frac{1}{a^3} \frac{d}{dt} \left( n a^3 \right) = 0 \,, \tag{3.157}
\]
meaning that $n \propto a^{-3}$: this is the usual scaling of the number density, so our approach is
working.
Now we need to understand what the right-hand side looks like if particle species are
coupled. This is difficult in general; we will give an expression using “bulk” parameters:
\[
  \frac{g}{(2 \pi)^3} \int \frac{d^3 p}{E} \hat{C}[f] = \Psi - \Gamma_A n = \Psi - \langle \sigma v \rangle n^2 \,, \tag{3.158}
\]
where $\Psi$ is the rate of creation of particles per unit volume, while $\Gamma_A$ is the rate of
annihilation. Note that $\Psi$ has the dimensions $\mathrm{s}^{-1} \mathrm{m}^{-3}$, while $\Gamma_A$ has the dimensions $\mathrm{s}^{-1}$, so it
needs to be multiplied by $n$ in order to be dimensionally consistent. Physically speaking,
the reason the term has that form is that the rate of annihilation must be proportional to the
number density of particles there are at the moment.
The annihilation rate is found in terms of the cross section of the process, which we
must average over the whole momentum distribution of the particles: this is the reason for the
appearance of $\langle \sigma v \rangle$.
At equilibrium, the left-hand side is equal to zero. The number density will be given by
some equilibrium value, $n_{\mathrm{eq}}$: inserting this in the right-hand side, which must also vanish,
we find $\Psi = \langle \sigma v \rangle n_{\mathrm{eq}}^2$. This result can then be used in general: the right-hand side is written
as $\langle \sigma v \rangle \left( n_{\mathrm{eq}}^2 - n^2 \right)$.
This means that the equation reads
\[
  \dot{n} + 3 \frac{\dot{a}}{a} n = \langle \sigma_A v \rangle \left( n_{\mathrm{eq}}^2 - n^2 \right) \,. \tag{3.159}
\]
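Since the number density $n$ itself is not conserved, we define the comoving number
density, which is conserved whenever the right-hand side vanishes:
\[
  n_C = n \left( \frac{a}{a_0} \right)^3 \,, \tag{3.160}
\]
for some arbitrary initial scale factor $a_0$. With this, we can simplify the left-hand side:
\begin{align}
  \dot{n} + 3 \frac{\dot{a}}{a} n &= \frac{d}{dt} \left[ n_C \frac{a_0^3}{a^3} \right] + 3 \frac{\dot{a}}{a} n_C \frac{a_0^3}{a^3} = \dot{n}_C \frac{a_0^3}{a^3} - 3 n_C \frac{\dot{a}}{a} \frac{a_0^3}{a^3} + 3 \frac{\dot{a}}{a} n_C \frac{a_0^3}{a^3} \tag{3.161} \\
  &= \dot{n}_C \left( \frac{a_0}{a} \right)^3 \,. \tag{3.162}
\end{align}
Similarly, we can express the right-hand side in terms of comoving densities: $\langle \sigma_A v \rangle$ is
the same, while
\[
  n_{\mathrm{eq}}^2 - n^2 = \left( n_{C, \mathrm{eq}}^2 - n_C^2 \right) \frac{a_0^6}{a^6} = n_{C, \mathrm{eq}}^2 \frac{a_0^6}{a^6} \left[ 1 - \frac{n_C^2}{n_{C, \mathrm{eq}}^2} \right] \,, \tag{3.163}
\]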

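so the equation will read:
\begin{align}
  \dot{n}_C \frac{a_0^3}{a^3} &= \langle \sigma_A v \rangle n_{C, \mathrm{eq}}^2 \frac{a_0^6}{a^6} \left[ 1 - \frac{n_C^2}{n_{C, \mathrm{eq}}^2} \right] && \text{(the } a^3 \text{ factors simplify)} \tag{3.164} \\
  \dot{n}_C &= \langle \sigma_A v \rangle \underbrace{n_{C, \mathrm{eq}}^2 \left( \frac{a_0}{a} \right)^3}_{= n_{C, \mathrm{eq}} n_{\mathrm{eq}}} \left[ 1 - \frac{n^2}{n_{\mathrm{eq}}^2} \right] \,. \tag{3.165}
\end{align}
We want to write this in a more intuitive way; the derivative with respect to time can be
expressed in terms of the scale factor, as
\[
  \frac{d}{dt} = \dot{a} \frac{d}{da} = H a \frac{d}{da} \,. \tag{3.166}
\]
With it, we can express the equation as
\[
  \frac{a}{n_{C, \mathrm{eq}}} \frac{dn_C}{da} = - \frac{\langle \sigma_A v \rangle n_{\mathrm{eq}}}{H} \left[ \left( \frac{n}{n_{\mathrm{eq}}} \right)^2 - 1 \right] \,. \tag{3.167}
\]
The ratio before the parenthesis has an intuitive physical meaning: the characteristic time
of the collisions which can annihilate this particle species is $\tau_{\mathrm{coll}} = 1 / \Gamma = 1 / (\langle \sigma_A v \rangle n_{\mathrm{eq}})$,
while the timescale of the expansion of the universe is $\tau_{\mathrm{exp}} = 1 / H$, so we get
\[
  \frac{a}{n_{C, \mathrm{eq}}} \frac{dn_C}{da} = - \frac{\tau_{\mathrm{exp}}}{\tau_{\mathrm{coll}}} \left[ \left( \frac{n}{n_{\mathrm{eq}}} \right)^2 - 1 \right] \,. \tag{3.168}
\]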

This allows us to characterize decoupling in a much more specific way than before.

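1. If $\Gamma \gg H$, then $\tau_{\mathrm{exp}} \gg \tau_{\mathrm{coll}}$, therefore $n \approx n_{\mathrm{eq}}$, which also implies $n_C = n_{C, \mathrm{eq}}$ (this
quantity can vary!). This is the equilibrium case: the particles are coupled.

2. If $\Gamma \ll H$, then $\tau_{\mathrm{exp}} \ll \tau_{\mathrm{coll}}$: we have decoupling, and $n_C = \mathrm{const}$, therefore $n \propto a^{-3}$.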

Pacciani [Pac18] writes that in the coupled case the comoving density is constant, but
this will not be the case in general: allowed annihilation processes may change over
time, as other particle species decouple!
This approach is much more powerful than the qualitative one we gave earlier, since
while the limiting cases are the same this equation allows us to also treat all the intermediate
situations.
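
As an illustration of the intermediate regime, here is a toy numerical integration of (3.159)
rewritten for the comoving density $N = n a^3$ (taking $a_0 = 1$), which gives
$dN / d \log a = - \left( \langle \sigma_A v \rangle / (H a^3) \right) \left( N^2 - N_{\mathrm{eq}}^2 \right)$. Everything specific in the sketch is an assumption
made for illustration: $\langle \sigma_A v \rangle$ is taken constant and $H \propto a^{-2}$ (radiation domination), so that the
prefactor is $\lambda / a$ with a hypothetical constant $\lambda$; the scale factor is normalized so that $a = m / T$;
and $N_{\mathrm{eq}} = a^{3/2} e^{-a}$ is a toy nonrelativistic equilibrium density in arbitrary units. The step is
a backward Euler one, chosen because the equation is stiff while the species is coupled.

```python
import math

LAMBDA = 1.0e7    # hypothetical constant <sigma v> / (H a^2), in the toy units used here


def n_eq(a):
    """Toy comoving equilibrium density of a nonrelativistic species, with a = m / T."""
    return a ** 1.5 * math.exp(-a)


def evolve(a_start=1.0, a_end=2000.0, steps=50_000):
    """Integrate dN/dlog(a) = -(LAMBDA / a) (N^2 - N_eq^2) with backward Euler steps."""
    h = math.log(a_end / a_start) / steps
    N = n_eq(a_start)                 # start in equilibrium
    history = []
    for i in range(1, steps + 1):
        a = a_start * math.exp(i * h)
        c = h * LAMBDA / a
        source = N + c * n_eq(a) ** 2
        # Positive root of c N_new^2 + N_new - source = 0 (the implicit update).
        N = 2 * source / (1 + math.sqrt(1 + 4 * c * source))
        history.append((a, N))
    return history


history = evolve()
for a_target in (5.0, 15.0, 50.0, 2000.0):
    a, N = min(history, key=lambda point: abs(point[0] - a_target))
    print(f"a = {a:7.1f}   N = {N:.3e}   N_eq = {n_eq(a):.3e}")
```

In this toy model the comoving density follows $N_{\mathrm{eq}}$ closely at early times (case 1 above),
then departs from it and levels off to a constant relic value while $N_{\mathrm{eq}}$ keeps falling
exponentially (case 2).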
Let us apply this to both HDM and CDM.

3.8.2 HDM density estimate
Neutrinos are a candidate for HDM: we know that at temperatures below $T_d = 1$ MeV
they decoupled, after which, as we have seen earlier, their temperature27 evolved like:
\[
  T_\nu = \left( \frac{g_*^{\text{after decoupling}}}{g_*^{\text{before decoupling}}} \right)^{1/3} T_\gamma \,, \tag{3.169}
\]
and so this scaled like $T_\nu \propto a^{-1}$, while their number density scaled like $n_\nu \propto a^{-3} \propto T_\nu^3$.
There is nothing special about neutrinos: we can apply the same line of reasoning to a generic HDM species $x$, whose number density today will then be given by
$$ n_{0x} = B g_x \frac{\zeta(3)}{\pi^2} T_{0x}^3, \tag{3.170} $$
where the factor $B$ accounts for the statistics: it is 1 for bosons, 3/4 for fermions. The parameter $g_x$ is the number of degrees of freedom of the particle species $x$.
We can rescale this in terms of the photon number density, which is given by the same expression, with $B = 1$ and $g_x = 2$: using the temperature relation (3.169) for the ratio $T_{0x} / T_{0\gamma}$ we find
$$ \frac{n_{0x}}{n_{0\gamma}} = \frac{B g_x \frac{\zeta(3)}{\pi^2} T_{0x}^3}{2 \frac{\zeta(3)}{\pi^2} T_{0\gamma}^3} \tag{3.171} $$
$$ n_{0x} = n_{0\gamma} \frac{B}{2} g_x \frac{g_{*0}}{g_{*dx}}, \tag{3.172} $$

where $g_{*0}$ is the current amount of effective degrees of freedom, while $g_{*dx}$ is the same quantity, computed at the decoupling time of particle $x$.
It should not be $g_{*0}$ though, right? We compute $g_*$ on the two boundaries of the transition; the effective dof right now are irrelevant, I’d think.
The energy density, as long as today the particles in question are nonrelativistic,²⁸ is given by $\rho_{0x} = m_x n_{0x}$, so we can write
$$ \rho_{0x} = m_x n_{0\gamma} \frac{B}{2} g_x \frac{g_{*0}}{g_{*dx}}, \tag{3.173} $$
27 We call it “temperature” for clarity, but properly speaking it is not one, since ever since decoupling neutrinos are not thermal. It is better understood as the “temperature parameter” of the neutrinos’ phase space distribution.
28 Which we assume they are. This is not meant to be interpreted as an experimental fact: for neutrinos, say, we do not actually know what their mass is, so we are not sure, although given their mass differences (which can be inferred from neutrino oscillations) at the current temperature $T_{0\nu} \sim 200\,\mu\text{eV}$ at least some neutrino species must be nonrelativistic.
Rather, the point is that if the mass of one of these HDM particles were so low that they were relativistic today then their energy density would be very low, comparable to that of photons, which as we have seen earlier is negligible: they could not constitute the large fraction of the critical density $\rho_C$ that we know DM constitutes. In order for HDM to have a chance to be a significant constituent of DM it must be nonrelativistic today.

therefore the mass fraction of the HDM particle $x$ today will be roughly
$$ \Omega_{0x} h^2 \approx \frac{\rho_{0x}}{\rho_{0c}} h^2 \approx 2 B g_x \frac{g_{*0}}{g_{*dx}} \frac{m_x}{10^2\,\text{eV}}. \tag{3.174} $$

This, together with what we know the $\Omega$ of dark matter to be, allows us to check whether a candidate for HDM is viable or not, based on its mass and on when it decouples. We can already see that if the mass is lower than a few eV the particle cannot make up most of the DM budget.
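As a rough illustration of how equation (3.174) can be used, the sketch below evaluates it for a single neutrino-like flavour. The numerical inputs ($B = 3/4$, $g_x = 2$, $g_{*0} \approx 3.91$, $g_{*dx} \approx 10.75$) and the target value $\Omega h^2 = 0.12$ are assumptions chosen for the example, not values taken from these notes, and the function names are hypothetical.

```python
def omega_h2_hdm(m_x_eV, B, g_x, g_star_0, g_star_dx):
    """Omega_0x h^2 of a hot relic of mass m_x (in eV), following eq. (3.174)."""
    return 2.0 * B * g_x * (g_star_0 / g_star_dx) * m_x_eV / 1.0e2


def mass_for_omega_h2(omega_h2, B, g_x, g_star_0, g_star_dx):
    """Invert eq. (3.174): mass in eV needed to reach a given Omega h^2."""
    return omega_h2 * 1.0e2 / (2.0 * B * g_x * (g_star_0 / g_star_dx))


# Assumed, illustrative inputs for one fermionic flavour (not from the notes).
NEUTRINO_LIKE = dict(B=0.75, g_x=2, g_star_0=3.91, g_star_dx=10.75)

if __name__ == "__main__":
    # Abundance of a 1 eV relic, and the mass needed for Omega h^2 = 0.12.
    print(omega_h2_hdm(1.0, **NEUTRINO_LIKE))
    print(mass_for_omega_h2(0.12, **NEUTRINO_LIKE))
```

With these assumed inputs a 1 eV particle contributes about a percent to $\Omega h^2$, and a mass of order 10 eV would be needed to saturate the assumed dark matter abundance, in line with the "few eV" statement above.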

3.8.3 CDM density estimate


CDM is made of particles which were already nonrelativistic when they decoupled, so we can describe them using Boltzmann statistics: we keep referring to the DM candidate as $x$, so at the temperature of decoupling $T_{dx}$ we have
$$ n_x(T_{dx}) = g_x \left( \frac{m_x T_{dx}}{2\pi} \right)^{3/2} \exp\left( - \frac{m_x}{T_{dx}} \right), \tag{3.175} $$

after which the density will scale like $a^{-3}$, so²⁹
$$ n_{0x} = n_x(T_{dx}) \left( \frac{a(T_{dx})}{a_0} \right)^3 = n_x(T_{dx}) \frac{g_{*0}}{g_{*x}} \left( \frac{T_{0\gamma}}{T_{dx}} \right)^3. \tag{3.176} $$

The difficulty lies in determining the decoupling temperature $T_{dx}$, which is when the collision and expansion timescales are equal.
The first Friedmann equation combined with the expression for the energy density (of radiation, but corrected according to the effective degrees of freedom at that time) tells us
$$ H^2(T_{dx}) = \frac{8\pi G}{3} g_{*dx} \frac{\pi^2}{30} T_{dx}^4, \tag{3.177} $$
which we can use to estimate the expansion timescale $\tau_{\rm exp} = 1/H$ (writing $G = 1/m_{\rm pl}^2$):
$$ \tau_{\rm exp} \approx 0.6 \, g_{*dx}^{-1/2} \frac{m_{\rm pl}}{T_{dx}^2}, \qquad 0.6 \approx \left( \frac{8\pi}{3} \frac{\pi^2}{30} \right)^{-1/2}. \tag{3.178} $$

Pacciani [Pac18] writes 0.3, am I getting the calculation wrong?


Now, let us estimate the collision timescale: we know that its inverse is $\Gamma = n \langle \sigma_A v \rangle$, and it is a fact from particle physics that the average cross section scales with the temperature like:
$$ \langle \sigma_A v \rangle = \sigma_0 \left( \frac{T}{m_x} \right)^N, \tag{3.179} $$
29 Applying the “updated” version of Tolman’s law, $T a g_{*s}^{1/3} = \text{const}$.

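where $N = 0$ or $1$, while $\sigma_0$ is a constant characteristic cross section of the process. So, the collision timescale is
$$ \tau_{\rm coll}(T_{dx}) = \left( n(T_{dx}) \, \sigma_0 \left( \frac{T_{dx}}{m_x} \right)^N \right)^{-1}. \tag{3.180} $$

Equating the two timescales we get the following equation:
$$ \left( n(T_{dx}) \, \sigma_0 \left( \frac{T_{dx}}{m_x} \right)^N \right)^{-1} = 0.6 \, g_*^{-1/2} \frac{m_{\rm pl}}{T_{dx}^2}, \tag{3.181} $$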

which is transcendental in $T$, since we have an exponential as well as a polynomial term in the expression for $n_x(T_{dx})$.
We can solve it iteratively, in terms of the parameter $x_{dx} = m_x / T_{dx}$, which we are assuming to be much larger than one (in order for the procedure we have done so far to be valid, and in order for $x$ to be CDM): this allows us to select the physical solution to the equation among the nonphysical ones.
The solution, after the second iteration, is found to be something like:
$$ x_{dx} = \log\left( 0.038 \frac{g_x}{g_{*xd}^{1/2}} m_{\rm pl} m_x \sigma_0 \right) - \left( N - \frac{1}{2} \right) \log\left( \log\left( 0.038 \frac{g_x}{g_{*xd}^{1/2}} m_{\rm pl} m_x \sigma_0 \right) \right). \tag{3.182} $$

From this we can determine the contribution of this CDM particle to the current energy density.
Qualitatively, we find that there is a very significant dependence on the mass of the particle (since we have an exponential suppression in its density) and on the way it interacts ($\sigma_0$).
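The iterative procedure can be sketched as a fixed-point iteration. Inserting (3.175) into (3.181) and taking the logarithm gives $x = \log A + (1/2 - N) \log x$, where $A$ stands for the dimensionless combination $0.038 \, g_x g_*^{-1/2} m_{\rm pl} m_x \sigma_0$ appearing in (3.182). The code below is only an illustration: the value of $A$ is a hypothetical placeholder chosen so that $x \gg 1$, and the function name is an assumption.

```python
import math


def solve_x_decoupling(A, N, n_iter=50):
    """Fixed-point iteration for x = ln(A) + (1/2 - N) ln(x).

    A is the dimensionless combination 0.038 g_x g_*^{-1/2} m_pl m_x sigma_0
    from eq. (3.182). Returns the final iterate and the list of all iterates.
    """
    log_A = math.log(A)
    x = log_A  # first iterate: neglect the slowly varying log(x) term
    history = [x]
    for _ in range(n_iter):
        x = log_A + (0.5 - N) * math.log(x)
        history.append(x)
    return x, history


if __name__ == "__main__":
    A = 1.0e9  # hypothetical value, used only to illustrate convergence
    for N in (0, 1):
        x, history = solve_x_decoupling(A, N)
        # history[1] is the second iterate, i.e. the form quoted in eq. (3.182)
        print(N, history[1], x)
```

Since the correction term varies only logarithmically, the map is a strong contraction for $x \gg 1$, which is why the second iterate is already close to the converged value.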

Chapter 4

Stellar Astrophysics

4.1 Stellar formation


We will now discuss the formation of stars, which started happening at a redshift $z \sim 20$.
We shall do so in a Newtonian approximation, neglecting the expansion of the universe; later we will discuss how cosmology affects this.
A star is a gravitationally-bound sphere of plasma, inside which fusion occurs. Stars
form from the gravitational collapse of instabilities, which is followed by an increase in
temperature and pressure from the release of gravitational energy.
We will model this through simple assumptions, since they already give a good picture
of what this collapse looks like. Two forces are at play: gravity and pressure.1 As we will
see, the gravitational force initially dominates and compresses the material up to a certain
point, at which the pressure prevents it from going further.
We start with a spherically symmetric region of baryonic matter, characterized by a density $\rho(r)$: the mass enclosed in a radius $r$ is
$$ m(r) = \int_0^r 4\pi \tilde r^2 \rho(\tilde r) \, d\tilde r. \tag{4.1} $$

The modulus of the gravitational acceleration of the material at a radial coordinate $r$ can be calculated from Gauss’ theorem:
$$ g(r) = \frac{G m(r)}{r^2}. \tag{4.2} $$
Let us now consider a spherical shell at a radius $r$, with mass $\Delta M = \rho(r) \Delta A \Delta r$, where $\Delta A = 4\pi r^2$. Let us denote $P$ as the pressure at the inner surface, and $P + \Delta P$ the pressure at the outer surface. Then, the net force on the shell is given by
$$ (P + \Delta P) \Delta A - P \Delta A = \left( P(r) + \frac{dP}{dr} \Delta r \right) \Delta A - P(r) \Delta A = \frac{dP}{dr} \Delta A \Delta r = \frac{dP}{dr} \frac{\Delta M}{\rho(r)}. \tag{4.3} $$
1 In this section we will neglect the Pauli exclusion principle, which does not allow fermionic matter to compress beyond a certain point. We will come back to this point when we discuss the Chandrasekhar mass.

Note that this is an inward force if $\Delta P$ is positive (and so $dP/dr$ also is), since then there is more pressure outside than inside.
The equation of motion of the spherical shell is given by $ma = F$:
$$ \Delta M \ddot r = - \left( \Delta M g(r) + \frac{dP}{dr} \frac{\Delta M}{\rho(r)} \right) \tag{4.4} $$
$$ \ddot r = - \left( \frac{G m(r)}{r^2} + \frac{1}{\rho(r)} \frac{dP}{dr} \right), \tag{4.5} $$
where the minus sign comes from the fact that the two forces are positive if they are directed inward.
This means that, in order to achieve equilibrium ($\ddot r = 0$), the pressure gradient $dP/dr$ must be negative, since $m(r)$ can never be.
Now we shall give two estimates about stellar formation: the first is the free-fall
timescale, in which we ignore pressure forces in order to ballpark the time taken for the
matter to fall onto itself.
Then, we will study the equilibrium configuration of the star, in order to understand
what are the conditions under which it is actually gravitationally bound (that is, with
negative total energy).

4.1.1 The freefall timescale


We are ignoring pressure (or, equivalently, assuming that it is constant), so the equation of motion reads
$$ \ddot r = - g(r). \tag{4.6} $$

This will not generally be the case, but let us suppose that the collapse is orderly: the
ordering of the layers stays the same as they fall.
We use the energy integral instead of directly solving the differential equation, that is, we impose energy conservation. Computing the total energy per unit mass (kinetic plus potential) at the initial radius of the cloud, $r_0$ (at which the gas is stationary), and at a radius $r$ we get
$$ - \frac{G m_0}{r_0} = \frac{1}{2} \left( \frac{dr}{dt} \right)^2 - \frac{G m_0}{r} \tag{4.7} $$
$$ \frac{1}{2} \left( \frac{dr}{dt} \right)^2 = G m_0 \left( \frac{1}{r} - \frac{1}{r_0} \right). \tag{4.8} $$
This way, we have found a first-order ODE instead of a second-order one. Note that the potential energy at a radius $r$ is computed using $m_0$ since as the layer falls the other layers below it are still below it, so the mass inside the layer is always equal to the initial one.
We can now directly compute the freefall time by integrating from $r_0$ to $0$:
$$ \tau_{\text{free fall}} = \int_{r_0}^0 \frac{dt}{dr} \, dr = - \int_{r_0}^0 dr \, \frac{1}{\sqrt{2 G m_0}} \left( \frac{1}{r} - \frac{1}{r_0} \right)^{-1/2}, \tag{4.9} $$

where we have a minus sign since, when taking the square root of $\dot r^2$, we must choose the negative sign: $dr/dt < 0$, since the material is infalling.

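We can then change variables to $x = r / r_0$ (with $dr = r_0 \, dx$) and switch the bounds of integration, recovering the positive sign:
$$ \tau_{\text{free fall}} = \frac{1}{\sqrt{2 G m_0}} \int_0^1 r_0 \, dx \left[ \frac{1}{r_0} \left( \frac{1}{x} - 1 \right) \right]^{-1/2} \tag{4.10} $$
$$ = \sqrt{\frac{r_0^3}{2 G m_0}} \int_0^1 dx \sqrt{\frac{x}{1 - x}} = \frac{\pi}{2} \sqrt{\frac{r_0^3}{2 G m_0}}. \tag{4.11} $$

Although the integrand diverges for $x \to 1$ (that is, at the start of the collapse, $r \to r_0$) the integral converges to $\pi / 2$.²
The average density is given by $\bar\rho = m_0 / (4\pi r_0^3 / 3)$, which we can insert into our expression to get
$$ \tau_{\text{free fall}} = \sqrt{\frac{3\pi}{32 G \bar\rho}} \approx 0.54 \, G^{-1/2} \bar\rho^{-1/2}. \tag{4.13} $$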
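The two numbers entering the free-fall estimate can be checked numerically. The sketch below evaluates the integral $\int_0^1 \sqrt{x/(1-x)} \, dx$ with a midpoint rule (after the substitution $x = \sin^2\theta$, which removes the endpoint singularity) and then evaluates (4.13) in SI units. The example density is a hypothetical value, not one quoted in these notes.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def freefall_integral(n_steps=100_000):
    """Midpoint-rule estimate of int_0^1 sqrt(x / (1 - x)) dx.

    With x = sin^2(theta) the integral becomes
    int_0^{pi/2} 2 sin^2(theta) dtheta, which has a smooth integrand.
    """
    h = (math.pi / 2) / n_steps
    return sum(2.0 * math.sin((i + 0.5) * h) ** 2 for i in range(n_steps)) * h


def freefall_time(rho_mean):
    """Free-fall time in seconds for a mean density in kg/m^3, eq. (4.13)."""
    return math.sqrt(3.0 * math.pi / (32.0 * G * rho_mean))


if __name__ == "__main__":
    print(freefall_integral(), math.pi / 2)
    print(math.sqrt(3.0 * math.pi / 32.0))  # numerical prefactor of eq. (4.13)
    rho = 1.0e-17  # kg/m^3, hypothetical dense cloud core
    print(freefall_time(rho) / 3.156e7, "years")
```

For the assumed density the timescale comes out at several hundred thousand years, which gives a feeling for the orders of magnitude involved.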

Comparison with the expansion timescale. We might be tempted to ignore the expansion of the universe in these calculations: we know that
$$ H^2 = \frac{8\pi G}{3} \bar\rho, \tag{4.14} $$
where we wrote $\bar\rho$ to mean that this holds on the scales at which homogeneity applies; if we consider smaller scales we must take a spatial average of the density. In the Newtonian (matter-dominated and flat) case we know that $a(t) \propto t^{2/3}$ and $H = 2/(3t)$.
So, we can substitute this expression to find the expansion timescale
$$ \frac{4}{9 t^2} = \frac{8\pi G}{3} \bar\rho \implies t^2 = \frac{4}{9} \frac{3}{8\pi G} \bar\rho^{-1}, \tag{4.15} $$
so
$$ \tau_{\rm exp} \approx 0.23 \, G^{-1/2} \bar\rho^{-1/2}. \tag{4.16} $$

If the universe were perfectly homogeneous we would then expect structure formation to be forbidden: however, if there are some over-dense regions to start with, their characteristic freefall time can become lower than the characteristic expansion time of the universe.
But still, shouldn’t we consider expansion in the computation? By how much are we
getting it wrong?

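2 The integral can be computed with the substitution $x = \sin^2\theta$ and then $y = \cos\theta$:
$$ \int_0^1 \sqrt{\frac{x}{1 - x}} \, dx = \int_0^{\pi/2} \sqrt{\frac{\sin^2\theta}{\cos^2\theta}} \, 2 \sin\theta \cos\theta \, d\theta = 2 \int_0^{\pi/2} \sin^2\theta \, d\theta = 2 \int_0^1 \sqrt{1 - y^2} \, dy = \frac{\pi}{2}. \tag{4.12} $$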

4.1.2 Hydrostatic equilibrium
At equilibrium the stellar layers are static: $\ddot r = 0$, so the equation of motion reads
$$ \ddot r = 0 = - \left( g(r) + \frac{1}{\rho(r)} \frac{dP}{dr} \right) \implies \frac{dP}{dr} = - G \frac{m(r) \rho(r)}{r^2}. \tag{4.17} $$

We multiply both sides by $4\pi r^3$ and integrate in $dr$ from the core ($r = 0$) to the surface of the star, $r = R$:
$$ \int_0^R dr \, 4\pi r^3 \frac{dP}{dr} = - G \int_0^R dr \, \frac{m(r) \rho(r) 4\pi r^2}{r}, \tag{4.18} $$

and we can change variables: $\rho(r) 4\pi r^2 \, dr = dm$ (this is physically meaningful: it is the differential mass of the layer at a radius $r$), so we can identify the right-hand side with the total gravitational energy:
$$ E_{\rm grav} = - G \int \frac{m(r) \, dm}{r}. \tag{4.19} $$
On the left-hand side, instead, we can integrate by parts:
$$ \int_0^R dr \, 4\pi r^3 \frac{dP}{dr} = \left[ P(r) 4\pi r^3 \right]_0^R - 3 \int_0^R dr \, 4\pi r^2 P(r), \tag{4.20} $$

where the boundary term vanishes: at the origin $r = 0$, at the surface (by definition of surface) $P = 0$.
We can better understand what this means if we divide and multiply by the volume $V(R) \equiv 4\pi R^3 / 3$:
$$ - 3 \int_0^R \underbrace{dr \, 4\pi r^2}_{dV} P(r) = - 3 V(R) \underbrace{\frac{\int_0^R dr \, 4\pi r^2 P(r)}{V(R)}}_{\langle P \rangle} = - 3 V(R) \langle P \rangle, \tag{4.21} $$

where we interpret the integral as a weighted average, so we get
$$ E_{\rm grav} = - 3 \langle P \rangle V \qquad \text{or} \qquad \langle P \rangle = - \frac{1}{3} \frac{E_{\rm grav}}{V} = - \frac{1}{3} \rho_{\rm grav}. \tag{4.22} $$
This is the Virial Theorem.
Now, the question we want to ask is: is this equilibrium configuration stable? This is equivalent to asking whether the system is gravitationally bound, $E_{\rm grav} < 0$, which as we have shown is equivalent to $\langle P \rangle > 0$.
In order to answer this question we shall use a statistical-mechanics, microscopic ap-
proach.
We consider a cubic box of volume $V = L^3$ with $N$ particles inside it, each of which has a velocity $\vec v = (v_x, v_y, v_z)^\top$ and a momentum $\vec p$. Let us select a face of the box, which we assume to be perpendicular to the $x$ axis. Each particle will hit it with a frequency $\tau^{-1} = v_x / 2L$, and each time it does so it imparts upon it a momentum $2 p_x$, since it is reflected backwards.
Summing over all the particles, the rate of momentum transfer (so, the force) in the direction $x$ is given by
$$ \frac{N}{2L} \langle 2 p_x v_x \rangle, \tag{4.23} $$
so the pressure upon that face will be the force divided by the area of the face
$$ P_x = \frac{N}{L} \langle p_x v_x \rangle \frac{1}{L^2} = \underbrace{\frac{N}{V}}_{= n} \langle p_x v_x \rangle. \tag{4.24} $$

This will be the same for each direction by isotropy: $P_x = P_y = P_z$, and by the same argument we can write $\langle p_x v_x \rangle = \langle \vec p \cdot \vec v \rangle / 3$: so
$$ P = \frac{n}{3} \langle \vec p \cdot \vec v \rangle, \tag{4.25} $$
which, although we will not show it, generalizes to a configuration of any shape, and does not change if we consider quantum-mechanical or relativistic effects. This is a simple expression for the equipartition theorem, a crucial result in Hamiltonian mechanics.
Let us consider two limits: nonrelativistic and fully relativistic particles.
Let us consider two limits: nonrelativistic and fully relativistic particles.

Nonrelativistic particles. In this case, since $\gamma \approx 1$ the four-momentum of the particles is approximately
$$ p^\mu = \begin{bmatrix} \gamma m c^2 \\ \gamma m \vec v \end{bmatrix} \approx \begin{bmatrix} m c^2 \\ m \vec v \end{bmatrix}, \tag{4.26} $$

so $\vec p = m \vec v$, which means $\langle \vec p \cdot \vec v \rangle = \langle m v^2 \rangle$.
Then, for a gas of nonrelativistic particles we can write the pressure as
$$ P = \frac{n}{3} \langle m v^2 \rangle = \frac{2}{3} \rho_{E_K}, \tag{4.27} $$
where $\rho_{E_K} = n m \langle v^2 \rangle / 2$ is the density of translational kinetic energy.
Combining this result with the fact that, as we have seen before, $\langle P \rangle = - \rho_{\rm grav} / 3$, we find
$$ - \frac{1}{3} \rho_{\rm grav} = \frac{2}{3} \rho_{E_K} \implies 2 E_K + E_{\rm grav} = 0, \tag{4.28} $$
which is an alternate statement of the nonrelativistic case of the virial theorem.
The total energy is then given by $E_{\rm tot} = E_K + E_{\rm grav} = - E_K$: this means that in general the system will be bound (the kinetic energy is quadratic, so always positive) and that the hotter it is, the more bound it is.

Relativistic case. In this case $v \approx c$, so $\langle \vec p \cdot \vec v \rangle \approx p c = \gamma m c^2$. We can apply the reasoning from before, but the density of translational kinetic energy is given by
$$ \rho_{E_K} = n (E - m c^2) = n (\gamma - 1) m c^2 \approx n \gamma m c^2 = n p c, \tag{4.29} $$

so we have
$$ P = \frac{n}{3} \langle \vec p \cdot \vec v \rangle = \frac{\rho_{E_K}}{3}. \tag{4.30} $$
Then, we can apply the same reasoning as the nonrelativistic case, with the difference of the missing factor 2: we then get
$$ \rho_{E_K} + \rho_{\rm grav} = 0 \implies E_{\rm grav} + E_K = E_{\rm tot} = 0, \tag{4.31} $$

so the system is unbound: it does not have any constraint preventing it from dissociating.

Adiabatic gas. We have seen the limiting cases, now let us consider a slightly more general one: a gas undergoing an adiabatic transformation, such that $P V^\gamma$ (with some real number $\gamma$) is constant.³ We will show that this is equivalent to the equations of state considered in cosmology, where $P = w \rho$. This will allow us to characterize the gravitational stability of the to-be star depending on the equation of state of the gas.
We start by differentiating: $d(P V^\gamma) = 0$, which means that we also have $d(\log(P V^\gamma)) = 0$, which we can expand into
$$ d \log(V^\gamma) + d \log(P) = \gamma \frac{dV}{V} + \frac{dP}{P} = 0, \tag{4.32} $$
so
$$ - (\gamma - 1) P \, dV = P \, dV + V \, dP = d(PV). \tag{4.33} $$

In an adiabatic transformation the entropy must not change: so, we can write
$$ T \, dS = dE_{\rm in} + P \, dV = 0, \tag{4.34} $$

which we can then write using the relation we derived previously:
$$ dE_{\rm in} = \frac{1}{\gamma - 1} d(PV) \tag{4.35} $$
$$ E_{\rm in} = \frac{PV}{\gamma - 1} \tag{4.36} $$
$$ P = (\gamma - 1) \frac{E_{\rm in}}{V} = (\gamma - 1) \rho_{\rm in}. \tag{4.37} $$
We can then see that if we impose that the transformation be adiabatic, we find the equation of state $P = w \rho$, with $\gamma - 1 = w$.
3 A more realistic model would allow $\gamma$ to vary, which it definitely does in the stages of stellar formation and evolution and even across a single transformation. We will not, however, get that deep in the weeds.

Using the fact that, as we have shown before, $P = - \rho_{\rm grav} / 3$, this means
$$ - \frac{\rho_{\rm grav}}{3} = (\gamma - 1) \rho_{\rm in} \implies 3 (\gamma - 1) E_{\rm in} + E_{\rm gr} = 0. \tag{4.38} $$
The total energy of the star after the collapse is the initial energy plus the (negative) gravitational binding energy, $E_{\rm tot} = E_{\rm in} + E_{\rm gr}$, so
$$ E_{\rm tot} = - (3\gamma - 4) E_{\rm in}, \tag{4.39} $$

which means that $\gamma > 4/3$ characterizes a bound system, while $\gamma < 4/3$ characterizes a free system. This is consistent with what we have seen before: the limiting case $\gamma = 4/3$ is equivalent to $w = 1/3$, the equation of state of radiation (or ultrarelativistic matter), which as we have already seen is unbound.
From classical thermodynamics we know that, for instance, a monoatomic gas has $\gamma = 5/3$.
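As a quick consistency check (a worked substitution added here for illustration, under the assumption that for a monoatomic gas $E_{\rm in}$ is just the translational kinetic energy $E_K$), inserting the two special values of the adiabatic index into equation (4.39) reproduces the two limits found earlier:

```latex
\begin{align*}
  \gamma = \tfrac{5}{3}: \quad
    & E_{\rm tot} = -\left(3 \cdot \tfrac{5}{3} - 4\right) E_{\rm in} = -E_{\rm in}
    && \text{(nonrelativistic result, } E_{\rm tot} = -E_K\text{)} \\
  \gamma = \tfrac{4}{3}: \quad
    & E_{\rm tot} = -\left(3 \cdot \tfrac{4}{3} - 4\right) E_{\rm in} = 0
    && \text{(relativistic result, } E_{\rm tot} = 0\text{)}
\end{align*}
```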

4.2 Jeans instability


Let us now try to understand the conditions under which a cloud of gas may become
unstable and collapse onto itself to form a star (or a planet, for that matter).
In general, the gravitational potential energy of a body whose characteristic size is $R$ and whose mass is $M$ is given by
$$ E_{\rm grav} = - \frac{1}{2} \int_{x, y \in V} d^3x \, d^3y \, \frac{G}{|x - y|} \rho(x) \rho(y) = - f \frac{G M^2}{R}, \tag{4.40} $$

where $f$ is a numerical factor depending on the mass distribution (the factor $1/2$ avoids counting each pair of mass elements twice). If the object at hand is a uniform-density sphere, we have $f = 3/5$. In general, the factor is of order 1.
The kinetic component of the energy, on the other hand, is
$$ E_K = \frac{3}{2} N k_B T. \tag{4.41} $$
The cloud is unstable if the magnitude of the gravitational energy is larger than the kinetic energy:
Why should this be? The way Keeton [Kee14] discusses it makes more sense to me: he studies the response of the total energy to a decrease in radius, and checks that it is positive; it might be the same as what we are doing here but that’s not really obvious.
$$ f \frac{G M^2}{R} > \frac{3}{2} N k_B T, \tag{4.42} $$
and the Jeans mass, $M_J$, corresponds to the boundary of the stability region: the number of particles, $N$, depends on it as $N = M_J / \bar m$, where $\bar m$ is the average particle mass.
The criterion then reads:
$$ f \frac{G M_J^2}{R} = \frac{3}{2} \frac{M_J}{\bar m} k_B T \tag{4.43} $$
$$ M_J = \frac{3}{2} \frac{k_B T}{G \bar m} R, \tag{4.44} $$
where we set $f = 1$, since we are only interested in an order-of-magnitude calculation.
As usual, we want to reframe our result in terms of densities: the Jeans mass corresponds to a Jeans density times the volume of the sphere:
$$ M_J = \frac{4\pi}{3} \rho_J R^3. \tag{4.45} $$
In order to find out what this density is we start off by cubing the expression for the Jeans mass, and then substituting the expression for $M_J$ in terms of $\rho_J$:
$$ M_J^3 = \left( \frac{3 k_B T}{2 G \bar m} \right)^3 R^3 \tag{4.46} $$
$$ = \left( \frac{3 k_B T}{2 G \bar m} \right)^3 \frac{3 M_J}{4\pi \rho_J} \tag{4.47} $$
$$ \rho_J = \frac{3}{4\pi M_J^2} \left( \frac{3 k_B T}{2 G \bar m} \right)^3. \tag{4.48} $$

Alternatively, we can write
$$ \frac{4\pi}{3} \rho_J R^3 = \frac{3}{2} \frac{k_B T}{G \bar m} R \tag{4.49} $$
$$ \rho_J = \frac{9}{8\pi} \frac{1}{R^2} \frac{k_B T}{G \bar m}. \tag{4.50} $$
We will have an instability if the density is larger than this. As we have seen in the previous section, a lower temperature facilitates the collapse. It should be stressed that the precise numerical coefficient will depend on the geometry of the cloud of material: this is not a hard rule but more of a guide for the understanding of the behavior of clouds.
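To get a feeling for the numbers, the sketch below evaluates the Jeans mass (4.44) and Jeans density (4.50) in SI units and checks that they are related by (4.45). The cloud parameters (temperature 10 K, radius 1 pc, mean particle mass of two hydrogen masses) are hypothetical example values, not figures from these notes, and the function names are assumptions.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
K_B = 1.381e-23    # Boltzmann constant, J/K
M_H = 1.673e-27    # hydrogen atom mass, kg
PARSEC = 3.086e16  # m
M_SUN = 1.989e30   # kg


def jeans_mass(T, R, m_bar):
    """Eq. (4.44) with f = 1: M_J = (3/2) k_B T R / (G m_bar)."""
    return 1.5 * K_B * T * R / (G * m_bar)


def jeans_density(T, R, m_bar):
    """Eq. (4.50): rho_J = 9 k_B T / (8 pi R^2 G m_bar)."""
    return 9.0 * K_B * T / (8.0 * math.pi * R**2 * G * m_bar)


if __name__ == "__main__":
    # Hypothetical cold molecular cloud, for illustration only.
    T, R, m_bar = 10.0, 1.0 * PARSEC, 2.0 * M_H
    M_J = jeans_mass(T, R, m_bar)
    rho_J = jeans_density(T, R, m_bar)
    print(M_J / M_SUN, "solar masses")
    print(rho_J, "kg/m^3")
    # Consistency with eq. (4.45): M_J = (4 pi / 3) rho_J R^3
    print(M_J / (4.0 * math.pi / 3.0 * rho_J * R**3))
```

For these assumed values the Jeans mass is of the order of ten solar masses; as stressed above, only the order of magnitude is meaningful.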
Here appears in the lecture the argument for the fact that the temperature of matter decreases as $T \sim a^{-2}$; it does not really seem to fit with the rest of the chapter, perhaps it should go earlier?
I’ll leave it here, commented out.

Equations for stellar structure

In order to properly study the dynamics of the stellar collapse, however, we need to
analyze the differential equations which govern it. We will start out by doing so on a static
background, following the original reasoning by Jeans (who, working in the early 1900s,
did not know about the expansion of the universe). Then, we will discuss the effects of the
universe’s expansion on the gravitational instability.

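The continuity equation, imposed by mass conservation, is
$$ \partial_t \rho + \vec\nabla \cdot (\rho \vec v) = 0, \tag{4.51} $$

where $\rho$ is the matter density while $\vec v$ is the velocity field; the Euler equation, imposed by momentum conservation (assuming no viscosity), is
$$ \partial_t \vec v + \left( \vec v \cdot \vec\nabla \right) \vec v = - \frac{1}{\rho} \vec\nabla P - \vec\nabla \Phi, \tag{4.52} $$

where $P$ is the pressure while $\Phi$ is the gravitational potential.

If we define the convective time derivative,
$$ \frac{D}{Dt} = \partial_t + \vec v \cdot \vec\nabla_x \approx u^\mu \partial_\mu, \tag{4.53} $$
we can write the two equations as
$$ \frac{D}{Dt} \rho + \rho \vec\nabla \cdot \vec v = 0 \tag{4.54} $$
$$ \frac{D}{Dt} \vec v = - \frac{\vec\nabla P}{\rho} - \vec\nabla \Phi. \tag{4.55} $$

Lastly, the gravitational potential $\Phi$ must obey Poisson’s equation:
$$ \nabla^2 \Phi = 4\pi G \rho. \tag{4.56} $$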

Right now we have five equations (Euler is a vector equation, corresponding to three scalar ones) and six variables: $\rho$, $\Phi$, $P$ and the three components of $\vec v$. In order to be able to solve this system we need one more condition; typically this is provided as an equation of state, giving $P$ in terms of the other variables.
One way to go about this is to consider entropy: we define the entropy density $s$ by the relation $S = s \rho$, where $S$ is the (total?) entropy.
We will consider isentropic processes, in which
$$ \frac{Ds}{Dt} + s \vec\nabla \cdot \vec v = 0. \tag{4.57} $$
We introduce this additional equation, as well as an additional variable, in order to complete our equations with an equation of state in the form:
$$ P = P(\rho, s). \tag{4.58} $$

Now, then, we are left with seven equations and seven variables: let us solve them! This is in general very hard; no general analytic solutions exist.
Jeans’ approach, which we will follow, is to find a fixed background solution and then to perturb it. We are then looking to see whether the perturbation is dampened or amplified. Perturbations are always present, so this will tell us whether the configuration is stable or unstable.

Static ansatz

Jeans’ first ansatz was $\rho = \rho_0 = \text{const}$, $\vec v = 0$, $s = s_0 = \text{const}$, $\Phi = \Phi_0 = \text{const}$, $P = P_0 = \text{const}$.
This is a very simplified model, and it is not even self-consistent: unless $\rho = 0$, Poisson’s equation cannot be satisfied, but we want to have matter in our proto-star. We will ignore this problem, since despite it we get a physically meaningful result. The equation cannot precisely hold, but in low-density regions it is not that far from equality.
We perturb the variables: for each variable we will have $x = x_0 + \delta x$ (except $\vec v$: since there is no $\vec v_0$, we just write $\vec v$ instead of $\delta \vec v$).
With this, the equations read:
$$ \partial_t \delta\rho + \rho_0 \vec\nabla \cdot \vec v = 0 \tag{4.59} $$
$$ \partial_t \vec v = - \frac{1}{\rho_0} \vec\nabla \delta P - \vec\nabla \delta\Phi \tag{4.60} $$
$$ \nabla^2 \delta\Phi = 4\pi G \delta\rho \tag{4.61} $$
$$ \partial_t \delta s = 0. \tag{4.62} $$

The pressure perturbation can be expressed in terms of the density and entropy ones:
$$ \delta P = \underbrace{\left. \frac{\partial P}{\partial \rho} \right|_s}_{= c_s^2} \delta\rho + \frac{\partial P}{\partial s} \delta s, \tag{4.63} $$

where we recognize the constant-entropy derivative of the pressure with respect to the density: the square of the adiabatic speed of sound.
Next, in order to solve these we need to move to Fourier space. Properly speaking, we would need to take the Fourier transform of all our variables; in terms of computation it is as if we were considering an exponential ansatz for all of them, in that taking spatial derivatives corresponds to bringing down a factor $i \vec k$:
$$ \delta x_i = x_{i0} \exp\left( i \vec k \cdot \vec x - i \omega t \right), \tag{4.64} $$

with $x_i = \rho, \vec v, \Phi, s$.
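To first order, then, our equations become:
$$ - i \omega \delta\rho + \rho_0 \, i \vec k \cdot \vec v = 0 \tag{4.65a} $$
$$ - i \omega \vec v = - \frac{1}{\rho_0} i \vec k \left( c_s^2 \delta\rho + \frac{\partial P}{\partial s} \delta s \right) - i \vec k \, \delta\Phi \tag{4.65b} $$
$$ - k^2 \delta\Phi = 4\pi G \delta\rho \tag{4.65c} $$
$$ \omega \, \delta s = 0. \tag{4.65d} $$

Now, the last equation can be solved by either $\omega = 0$ (so, the solutions are time-independent) or $\delta s = 0$ (so, they are isentropic).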

These two cases differ substantially in the shape of the vector field they give. A result from Helmholtz is the fact that every velocity field can be decomposed into a divergenceless part and an irrotational part, at least locally: there exist $\Psi$ and $\vec T$ such that $\vec v = \vec\nabla \Psi + \vec T$ with $\vec\nabla \cdot \vec T = 0$.
The two cases, as we will see, amount to making the velocity field fully irrotational or fully divergenceless.

Time-independent solutions. We start by considering the first option, $\omega = 0$: then, we get
$$ \vec k \cdot \vec v = 0 \tag{4.66a} $$
$$ 0 = \frac{1}{\rho_0} \vec k \left( c_s^2 \delta\rho + \frac{\partial P}{\partial s} \delta s \right) + \vec k \, \delta\Phi \tag{4.66b} $$
$$ - k^2 \delta\Phi = 4\pi G \delta\rho. \tag{4.66c} $$

The first equation tells us that $\vec k \cdot \vec v = 0$, which in position space translates to $\vec\nabla \cdot \vec v = 0$, so the velocity field describes the motion of an incompressible fluid.
. . . which is turbulent? That’s not the case in general, water is nearly incompressible but it can have laminar motion. This seems to be what the professor and Pacciani [Pac18] state though. This is dismissed as uninteresting together with the $\omega = \delta s = 0$ case. The argument made later, from the Helmholtz decomposition and the decoupling of the divergenceless and irrotational components of the velocity, seems more convincing.

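Isentropic solutions. Now, we consider the case in which $\delta s = 0$, while $\omega \neq 0$ in general. The equations read
$$ - \omega \delta\rho + \rho_0 \vec k \cdot \vec v = 0 \tag{4.67a} $$
$$ - \omega \vec v = - \frac{1}{\rho_0} \vec k c_s^2 \delta\rho - \vec k \, \delta\Phi \tag{4.67b} $$
$$ - k^2 \delta\Phi = 4\pi G \delta\rho, \tag{4.67c} $$

which we can write as a linear system for the vector $[\delta\rho, \vec v, \delta\Phi]$.
The 5x5 coefficient matrix is:
$$ \begin{bmatrix} - \omega & \rho_0 \vec k & 0 \\ \frac{1}{\rho_0} \vec k c_s^2 & - \omega & \vec k \\ 4\pi G & 0 & k^2 \end{bmatrix}, \tag{4.68a} $$

and in order to have nontrivial solutions (we need a family of them, since parameters like $\omega$ are variable) we need to set its determinant to zero, which yields:
$$ \omega^2 k^2 - \rho_0 \vec k \cdot \left( \frac{1}{\rho_0} \vec k c_s^2 k^2 - 4\pi G \vec k \right) = 0, \tag{4.69} $$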
which gives the dispersion relation $\omega^2 = c_s^2 k^2 - 4\pi G \rho_0$.
This has a direct physical interpretation: if $\omega^2$ is positive, then $\omega$ is real so the solution is oscillatory; while if $\omega^2$ is negative then $\omega$ is imaginary, therefore the solution is given by a real exponential, which quickly amplifies (or dampens, but that case is not interesting since it does not have macroscopic effects) the perturbation. This is the unstable case.
A generic solution will be a combination of these, with varying $\vec k$.
We can connect this result with what we have found for the freefall timescale earlier: if the pressure is negligible then so is $c_s$, so we have $\omega^2 = - 4\pi G \rho_0$: so, we have a solution increasing on a timescale dictated by $|\omega| = (4\pi G \rho_0)^{1/2}$, the characteristic time
$$ \tau = \frac{1}{|\omega|} = \frac{1}{\sqrt{4\pi G \rho_0}}, \tag{4.70} $$

which is very similar to the freefall timescale
$$ \tau_{\text{free fall}} = \left( \frac{3\pi}{32 G \rho_0} \right)^{1/2}; \tag{4.71} $$

the difference is only the numerical factor in front, and they are quite similar (0.28 versus 0.54).
The separation between real and imaginary $\omega$ is reached when $\omega^2 = 0$, which gives us the critical Jeans wavenumber:
$$ k_J^2 = \frac{4\pi G \rho_0}{c_s^2}. \tag{4.72} $$
The wavenumber $k_J$ also defines a wavelength $\lambda_J = 2\pi / k_J$, which will then tell us what the length scale above which we have instability is.
This is, up to an order-1 difference in the numerical factor in front, the same result we had before: we can see this if we consider the fact that, for an ideal gas, $c_s \approx \sqrt{k_B T / m}$, so we recover
$$ \lambda_J \sim \frac{1}{\sqrt{G \rho_0}} \sqrt{\frac{k_B T}{m}}, \tag{4.73} $$

which is consistent with equation (4.50).
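The behaviour of the dispersion relation on either side of the Jeans scale can be illustrated with a few lines of code. This is only a sketch in SI units: the sound speed and density below are hypothetical example values, and the function names are assumptions.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def omega_squared(k, c_s, rho_0):
    """Jeans dispersion relation: omega^2 = c_s^2 k^2 - 4 pi G rho_0."""
    return c_s**2 * k**2 - 4.0 * math.pi * G * rho_0


def jeans_wavenumber(c_s, rho_0):
    """Critical wavenumber of eq. (4.72), at which omega^2 = 0."""
    return math.sqrt(4.0 * math.pi * G * rho_0) / c_s


def classify(k, c_s, rho_0):
    """Oscillating for omega^2 > 0, exponentially growing otherwise."""
    return "oscillating" if omega_squared(k, c_s, rho_0) > 0 else "unstable"


if __name__ == "__main__":
    c_s, rho_0 = 200.0, 1.0e-17  # m/s and kg/m^3, hypothetical values
    k_J = jeans_wavenumber(c_s, rho_0)
    print("Jeans wavelength [m]:", 2.0 * math.pi / k_J)
    print(classify(2.0 * k_J, c_s, rho_0))  # scale smaller than lambda_J
    print(classify(0.5 * k_J, c_s, rho_0))  # scale larger than lambda_J
    # Prefactors of the two timescales (4.70) and (4.71), in units of (G rho)^(-1/2)
    print(1.0 / math.sqrt(4.0 * math.pi), math.sqrt(3.0 * math.pi / 32.0))
```

Perturbations on scales smaller than the Jeans wavelength are classified as sound-like oscillations, while those on larger scales grow.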

A plasma analogy. The result we found is similar to what we get with a plasma of charged particles, with the electrostatic potential instead of the gravitational field. In that case, the dispersion relation is given by
$$ \omega^2 = c_s^2 k^2 + \frac{4\pi n_e e^2}{m_e}, \tag{4.74} $$
where $n_e$ is the number density of electrons and $m_e$ is the electron mass.
We have the following analogies:
$$ n_e \to \rho_0 / m \tag{4.75a} $$
$$ m_e \to m \tag{4.75b} $$
$$ e^2 \to G m^2. \tag{4.75c} $$

The equations are similar but, crucially, the sign of the additional term in the dispersion relation is negative in the gravitational case and positive in the electromagnetic case. This is due to the fact that there exists only positive gravitational “charge”, while we have both positive and negative electric charge: in the electromagnetic case, then, we can have screening effects, not so in the gravitational one.

Expanding ansatz

Now, we will study the same problem, but instead of a static background we will use an expanding one, described by a flat FLRW metric (the effects of spatial curvature will be negligible on the scale of a stellar formation cloud anyway).
The physical radial vector will be given by $\vec r = a(t) \vec x$, where $\vec x$ is the radial vector in comoving coordinates.
We now drop the vector sign, but still imply it; the physical velocity is given by
$$ u = \dot r = \dot a x + a \dot x = \frac{\dot a}{a} r + v = H r + v, \tag{4.76} $$
where $v = a \dot x$ is called the peculiar velocity. This has a direct physical implication: the distant galaxies we observe are, generally speaking, the more redshifted the further they are from us; however, there is noise in this relation due to the Doppler shift caused by the peculiar velocity. An extreme example of this is the Andromeda galaxy which, despite being almost a Mpc away, is actually blueshifted to $z \approx -0.001$ since its peculiar velocity is directed towards the Solar System.
Let us then seek perturbed solutions to the equations of motion of the fluid. We will neglect the pressure gradient in order to simplify the considerations. This is not a bad approximation, since we are always perturbing around the equilibrium situation in which the instability has not yet formed, so in a low-pressure scenario. When the pressure starts to take over we will be far from the background anyway, so we cannot hope to describe the situation in this way. This discussion will then also apply to dark-matter structure formation, since it is pressureless.
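We will use a slightly different notation from before, denoting the background with an index $b$, so that
$$ \rho(\vec r, t) = \rho_b(t) + \delta\rho(\vec r, t) \tag{4.77} $$
$$ \vec u(\vec r, t) = \vec v_b(\vec r, t) + \vec v(\vec r, t) \tag{4.78} $$
$$ \Phi(\vec r, t) = \Phi_b(\vec r, t) + \phi(\vec r, t), \tag{4.79} $$
where, by equation (4.76), the background velocity is the Hubble flow $\vec v_b = H \vec r$.

The equations, using the customary notation for a partial derivative taken while keeping a certain variable constant, read:
$$ \left. \frac{\partial \rho}{\partial t} \right|_{\vec r} + \vec\nabla_{\vec r} \cdot (\rho \vec u) = 0 \tag{4.80} $$
$$ \left. \frac{\partial \vec u}{\partial t} \right|_{\vec r} + (\vec u \cdot \vec\nabla_{\vec r}) \vec u = - \vec\nabla_{\vec r} \Phi \tag{4.81} $$
$$ \nabla_{\vec r}^2 \Phi = 4\pi G \rho. \tag{4.82} $$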

Let us now try to find the background solution. We assumed that the background density of matter $\rho_b(t)$ is space independent but time dependent: therefore, up to a constant (which we set to zero) and a linear term in $\vec r$ (which would violate isotropy) we must have
$$ \Phi_b(\vec r, t) = \frac{2\pi G}{3} \rho_b(t) r^2, \tag{4.83} $$
which is consistent with what we found earlier, since its gradient and then Laplacian are:
$$ \vec\nabla_{\vec r} \Phi_b = \frac{4\pi G}{3} \rho_b(t) \vec r \tag{4.84} $$
$$ \nabla_{\vec r}^2 \Phi_b = 4\pi G \rho_b. \tag{4.85} $$

This potential diverges at $r \to \infty$; however this is not an issue with the solution, but with the Newtonian approximation we made. It will not affect our treatment of the problem.
We want to get equations in the comoving coordinates ($\vec x$), not the local inertial ones ($\vec r$).
Let us consider a generic function $f(\vec r, t)$, which can also be expressed with respect to $(\vec x, t)$.
It can be shown⁴ that the difference between the time derivatives at fixed $\vec r$ and at fixed $\vec x$ of a generic function $f$ (which can also be a vector) is given by
$$ \left. \frac{\partial f}{\partial t} \right|_{\vec x} = \left. \frac{\partial f}{\partial t} \right|_{\vec r} + H (\vec r \cdot \vec\nabla_{\vec r}) f. \tag{4.88} $$
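We can then mold the continuity equation:
$$ \left. \frac{\partial \rho}{\partial t} \right|_{\vec x} - H (\vec r \cdot \vec\nabla_{\vec r}) \rho + \rho \vec\nabla_{\vec r} \cdot \vec u + (\vec u \cdot \vec\nabla_{\vec r}) \rho = 0 \tag{4.89} $$
$$ \left. \frac{\partial \rho}{\partial t} \right|_{\vec x} - H (\vec r \cdot \vec\nabla_{\vec r}) \rho + \rho \vec\nabla_{\vec r} \cdot (H \vec r + \vec v) + H \left( \vec r \cdot \vec\nabla_{\vec r} \right) \rho + \left( \vec v \cdot \vec\nabla_{\vec r} \right) \rho = 0 \tag{4.90} $$
$$ \left. \frac{\partial \rho}{\partial t} \right|_{\vec x} + 3 H \rho + \rho \vec\nabla_{\vec r} \cdot \vec v + \left( \vec v \cdot \vec\nabla_{\vec r} \right) \rho = 0 \tag{4.91} $$
$$ \left. \frac{\partial \rho}{\partial t} \right|_{\vec x} + 3 H \rho + \underbrace{\frac{1}{a} \vec\nabla_{\vec x}}_{\vec\nabla_{\vec r}} \cdot (\rho \vec v) = 0. \tag{4.92} $$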

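4 The way to go about it is to impose the equality of the total time derivatives of $f(\vec x, t)$ and $f(\vec r, t)$, which read:
$$ \frac{D f(\vec r, t)}{Dt} = \left. \frac{\partial f}{\partial t} \right|_{\vec r} + \vec\nabla_{\vec r} f \cdot \underbrace{(\vec v + H \vec r)}_{\dot{\vec r}} \tag{4.86} $$
$$ \frac{D f(\vec x, t)}{Dt} = \left. \frac{\partial f}{\partial t} \right|_{\vec x} + \vec\nabla_{\vec x} f \cdot \underbrace{\frac{\vec v}{a}}_{\dot{\vec x}}. \tag{4.87} $$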

As we would expect, if the velocity $\vec v$ is equal to zero then the density scales like $\rho \sim a^{-3}$.
The computation for the Euler equation is an application of the same principles, with one exception: we can simplify the background potential term with some terms which appear on the left-hand side, using the equation
$$ \left. \frac{\partial (H \vec r)}{\partial t} \right|_{\vec r} + H (\vec r \cdot \vec\nabla_{\vec r}) (H \vec r) = - \vec\nabla_{\vec r} \Phi_b. \tag{4.93} $$
The advective term is just $r_j \partial_j r_i = r_i$. Inserting the expression we know for the background gravitational potential, whose gradient is proportional to $\vec r$, we get:
$$ \vec r \left( \dot H + H^2 \right) = - \frac{4\pi G}{3} \rho_b \vec r, \tag{4.94} $$
which must hold for any $\vec r$, so we drop it and recover the second Friedmann equation (using $\dot H + H^2 = \ddot a / a$). Working backwards, we can prove the equation.
Using this, we can write the Euler equation as:
$$ \left. \frac{\partial \vec v}{\partial t} \right|_{\vec x} + H \vec v + \frac{1}{a} (\vec v \cdot \vec\nabla_{\vec x}) \vec v = - \frac{1}{a} \vec\nabla_{\vec x} \delta\Phi; \tag{4.95} $$
getting the Poisson equation in comoving coordinates is faster and directly yields
$$ \nabla^2 \delta\Phi = a^2 4\pi G \delta\rho. \tag{4.96} $$
We express the density perturbation by defining the variable $\delta$ as

\[ \delta(\vec{x}, t) = \frac{\delta\rho}{\rho_b} = \frac{\rho(\vec{x}, t) - \rho_b(t)}{\rho_b(t)} , \tag{4.97} \]

which can take values anywhere from $-1$ to $+\infty$: we can interpret a negative $\delta$ as a sort of “negative effective gravitational charge”. This has the physical meaning that we can expect screenings: under-dense and over-dense regions can “balance out” at large distances, just like we do not observe the effects of large-scale electric charge imbalances.
We can then give a quantitative check of whether the Newtonian approximation is a good one. In terms of $\delta$, the Poisson equation reads:

\[ \nabla^2 \delta\Phi = 4\pi G \rho_b \delta a^2 , \tag{4.98} \]

which we can estimate: let us say that $\lambda$ is the typical variation scale of the potential, so that $\nabla^2 \delta\Phi \sim \lambda^{-2} \delta\Phi$; also let us use the first Friedmann equation relating $H^2$ with the background density $\rho_b$ (since the FE only hold at large scales):

\begin{align}
\lambda^{-2} \delta\Phi &\sim 4\pi G \frac{H^2}{8\pi G/3} a^2 \delta \tag{4.99} \\
\delta\Phi &\sim \frac{3}{2} H^2 \delta a^2 \lambda^2 \sim \frac{\lambda^2}{\lambda_{\rm hor}^2} \delta , \tag{4.100}
\end{align}

where $\delta\Phi$ is measured in units of $c^2$, the typical variation scale of the gravitational field is $\lambda \sim$ Mpc, while $c/(Ha) = \lambda_{\rm hor} \sim$ Gpc is the (comoving) Hubble horizon scale.
The density perturbation will be, at most, of order 1, so we get that $\delta\Phi$ is indeed small.
As long as the perturbations are only galactic, the Newtonian approximation is good.
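To put a number on the estimate (4.100), here is a minimal sketch that evaluates the ratio for the scales quoted in the text. The specific values (exactly 1 Mpc, exactly 1 Gpc, and a density contrast of 1) are illustrative assumptions, not results from the notes.

```python
# Order-of-magnitude check of (4.100):
#   delta_Phi / c^2 ~ (3/2) * (lambda / lambda_hor)^2 * delta
MPC_IN_M = 3.0857e22  # one megaparsec in metres

lambda_pert = 1.0 * MPC_IN_M    # assumed perturbation scale, ~ Mpc
lambda_hor = 1.0e3 * MPC_IN_M   # assumed comoving Hubble horizon, ~ Gpc
delta = 1.0                     # largest density contrast considered here

potential_over_c2 = 1.5 * (lambda_pert / lambda_hor) ** 2 * delta
print(f"delta_Phi / c^2 ~ {potential_over_c2:.1e}")
```

A dimensionless potential of order $10^{-6}$ is what justifies treating gravity as Newtonian on these scales.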

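Solving the equations. The derivative of the density, which appears in the continuity equation, reads:

\[ \frac{\partial \rho}{\partial t} = \frac{\partial \rho_b}{\partial t} (1 + \delta) + \rho_b \frac{\partial \delta}{\partial t} . \tag{4.101} \]

The procedure we will then apply is to simplify the equation making use of the fact that it holds at zeroth order (with all the perturbations set to zero): we have $\partial_t \rho_b + 3H\rho_b = 0$.
Also, we neglect second and higher order terms (recall that $\vec{v}$ is already first order): the computation goes like

\begin{align}
\frac{\partial \rho_b}{\partial t} (1 + \delta) + \rho_b \frac{\partial \delta}{\partial t} + 3H\rho_b (1 + \delta) + \frac{1}{a} \nabla \cdot \left( \rho_b (1 + \delta) \vec{v} \right) &= 0 \tag{4.102} \\
(1 + \delta) \left[ \frac{\partial \rho_b}{\partial t} + 3H\rho_b \right] + \rho_b \frac{\partial \delta}{\partial t} + \frac{\rho_b}{a} \nabla \cdot \vec{v} &= 0 \tag{4.103} \\
\frac{\partial \delta}{\partial t} + \frac{1}{a} \nabla \cdot \vec{v} &= 0 . \tag{4.104}
\end{align}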
On the other hand, for the momentum equation we only need to neglect the $(\vec{v} \cdot \nabla)\vec{v}$ term, which is second order, to find:

\[ \frac{\partial \vec{v}}{\partial t} + H\vec{v} = -\frac{1}{a} \nabla \phi , \tag{4.105} \]

where $\phi = \delta\Phi$ is the perturbation of the gravitational potential.
Finally, the Poisson equation is already in its simplest, linear form.
In order to solve these we expand in Fourier space: the density perturbation $\delta$ is expressed in terms of its Fourier transform $\tilde{\delta}$ as

\[ \delta(\vec{x}, t) = \frac{1}{(2\pi)^3} \int d^3\vec{k} \, \tilde{\delta}_{\vec{k}}(t) \exp\left( i \vec{k} \cdot \vec{x} \right) , \tag{4.106} \]

and similarly for $\vec{v}$ and $\phi$. Note that we only expand in 3D space: plane waves do not propagate nicely in an expanding universe, so they are not a good Fourier basis in time. We keep time derivatives, substituting spatial ones with multiplication by $i\vec{k}$.
The actual quantities must be real: therefore we know that $\tilde{\delta}^*_{\vec{k}}(t) = \tilde{\delta}_{-\vec{k}}(t)$.
Before we Fourier-transform the equations, we can make a simplification: as we mentioned before, any vector field $\vec{v}$ can be decomposed by the Helmholtz theorem into the gradient of a scalar field and a divergenceless vector field:

\[ \vec{v} = \nabla \Psi + \vec{T} , \tag{4.107} \]

where $\nabla \cdot \vec{T} = 0$ (and, as is known from vector calculus, $\nabla \times (\nabla \Psi) = 0$).

Before we restricted ourselves to first order, the Euler equation read:

\[ \frac{D\vec{v}}{Dt} + H\vec{v} = -\frac{1}{a} \nabla \phi . \tag{4.108} \]

If we substitute $\vec{v}$ with its Helmholtz decomposition, we can split the equation into two, one for $\vec{T}$ and one for $\Psi$:

\[ \frac{D\vec{T}}{Dt} + H\vec{T} = 0 \qquad \text{and} \qquad \frac{D(\nabla \Psi)}{Dt} + H \nabla \Psi = -\frac{\nabla \phi}{a} . \tag{4.109} \]

We could determine which terms went on either side based on whether they had zero divergence or zero curl.
Does the convective derivative commute with $\nabla \times$ and $\nabla \cdot$, though?
This has an important physical meaning: the divergenceless part of the velocity evolves by itself, unaffected by the gravitational field perturbation. At linear order the equation $\dot{\vec{T}} + H\vec{T} = 0$ gives $\vec{T} \propto 1/a$, so its magnitude will diminish over time. Therefore, any $\vec{T}$ component which is part of the velocity field initially gets ever more diluted.
Because of this, we neglect the divergenceless component of the velocity field, and only consider the $\vec{v} = \nabla \Psi$ part. In Fourier space, this reads $\vec{v} \propto \vec{k} \Psi$, so we can project the three equations along the unit vector $\hat{k} = \vec{k}/|\vec{k}|$, simplifying them to a single scalar one. We will denote $v = \vec{v} \cdot \hat{k}$ and $k = \vec{k} \cdot \hat{k} = |\vec{k}|$.
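With this and denoting time derivatives with a dot, our equations will read:

\begin{align}
\dot{\delta} + \frac{ik}{a} v &= 0 \tag{4.110} \\
\dot{v} + Hv &= -\frac{ik}{a} \phi \tag{4.111} \\
-k^2 \phi &= 4\pi G a^2 \rho_b \delta . \tag{4.112}
\end{align}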

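We can find an equation for $\delta$ alone by differentiating the first equation with respect to time and substituting the Euler equation:

\begin{align}
\ddot{\delta} + \frac{ik}{a} \dot{v} - \frac{ik}{a^2} \dot{a} v &= 0 \tag{4.113a} \\
\ddot{\delta} + \frac{ik}{a} \left( -Hv - \frac{ik}{a} \phi \right) - \frac{ik}{a} Hv &= 0 \tag{4.113b} \\
\ddot{\delta} - \frac{2ik}{a} Hv + \frac{k^2 \phi}{a^2} &= 0 , \tag{4.113c}
\end{align}

but, from the Poisson equation, the last term is equal to $-4\pi G \rho_b \delta$, and from the continuity equation again the second term is equal to $2H\dot{\delta}$: the equation then reads

\[ \ddot{\delta} + 2H\dot{\delta} - 4\pi G \rho_b \delta = 0 . \tag{4.114} \]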

This looks promising: if $H$ and $\rho_b$ were constant, it would be a simple second-order ODE.
They are not constant, but they are functions corresponding to the background: we can use the solutions found earlier corresponding to a matter-dominated universe,

\[ a(t) \propto t^{2/3} , \qquad H = \frac{2}{3t} , \qquad \rho_b = \left( 6\pi G t^2 \right)^{-1} . \tag{4.115} \]

With these, the equation reads

\[ \ddot{\delta} + \frac{4}{3t} \dot{\delta} - \frac{2}{3t^2} \delta = 0 , \tag{4.116} \]

which will have two independent solutions since it is of second order: it turns out that both can be recovered using a power-law ansatz, $\delta \propto t^\alpha$: the equation for $\alpha$ reads

\[ \alpha(\alpha - 1) + \frac{4}{3} \alpha - \frac{2}{3} = 0 , \tag{4.117} \]

whose solutions are $\alpha = -1$ and $\alpha = 2/3$.
The solution with $\alpha = 2/3$ is the growing mode, while the one with $\alpha = -1$ (so, $\delta \propto t^{-1} \propto H(t)$) is the decaying mode.

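1. The growing mode has $\delta \propto t^{2/3} \propto a(t)$, $v \propto t^{1/3}$, $\phi = \text{const}$;

2. the decaying mode has $\delta \propto t^{-1} \propto H(t)$, $v \propto t^{-4/3}$ (from $v \propto a\dot{\delta}$), $\phi \propto t^{-5/3}$.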

Typically, we are more interested in the growing mode.
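As a quick sanity check of the power-law ansatz, the following sketch solves the quadratic (4.117) and verifies that both exponents satisfy the matter-dominated equation (4.116). The function name `residual` and the test time `t = 2.0` are illustrative choices.

```python
import math

# Quadratic (4.117): alpha^2 + alpha/3 - 2/3 = 0
b, c = 1.0 / 3.0, -2.0 / 3.0
disc = math.sqrt(b * b - 4.0 * c)
alpha_growing = (-b + disc) / 2.0
alpha_decaying = (-b - disc) / 2.0


def residual(alpha, t=2.0):
    """Left-hand side of (4.116) evaluated on delta = t**alpha."""
    d0 = t ** alpha
    d1 = alpha * t ** (alpha - 1.0)
    d2 = alpha * (alpha - 1.0) * t ** (alpha - 2.0)
    return d2 + 4.0 / (3.0 * t) * d1 - 2.0 / (3.0 * t * t) * d0


print(alpha_growing, alpha_decaying)
print(residual(alpha_growing), residual(alpha_decaying))
```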


"When the inflaton field becomes classical we lose a degree of freedom: this removes
the decaying solutions. ": not really clear to me.

4.2.1 Star formation


Let us now discuss stellar formation specifically. The presence of molecular hydrogen, $\mathrm{H}_2$, is correlated with stellar formation since it can absorb some of the kinetic energy of the collapsing cloud by dissociating, thus allowing for further collapse; it forms through the channels

\begin{align}
\mathrm{H} + e^- &\leftrightarrow \mathrm{H}^- + \gamma & \mathrm{H}^- + \mathrm{H} &\leftrightarrow \mathrm{H}_2 + e^- \tag{4.118a} \\
\mathrm{H} + p &\leftrightarrow \mathrm{H}_2^+ + \gamma & \mathrm{H}_2^+ + \mathrm{H} &\leftrightarrow \mathrm{H}_2 + p , \tag{4.118b}
\end{align}

and these processes, generally speaking, start to become efficient at redshifts of about $z \sim 200$.
As we have seen, if we fix the temperature then the Jeans critical density scales like $\rho_J \propto M^{-2}$ (4.48): so, it is easier to get above the Jeans density if the mass is high (and the temperature is low). This means that we expect the formation of stars to be a top-down process: larger structures form first.
Let us now give some quantitative estimates of the densities at hand.
Approximately, the baryonic density today is $\rho_{0b} \sim 1 \times 10^{-28}\,\mathrm{kg\,m^{-3}}$.
Going backwards in time, the baryonic density scales like $\rho_b(z) = (1 + z)^3 \rho_{0b}$: at $z \sim 200$, for example, we have $\rho_b(z \sim 200) \approx 8 \times 10^{-22}\,\mathrm{kg\,m^{-3}}$.
Stars will not form just anywhere: they will only do so in over-dense regions, which form preferentially in the centers of dark-matter halos, acting as gravitational “traps”.

Let us take, as an example, a molecular hydrogen ($m \approx 2m_p$) cloud with a mass of $M = 1000 M_\odot$ and $T \approx 20$ K (a typical temperature for matter at this redshift): we have a Jeans critical density of $\rho_J \approx 4 \times 10^{-22}\,\mathrm{kg\,m^{-3}}$.
Perhaps this is the place to put the scaling of the temperature of matter! The fact that the temperature of baryons decreases slower than the radiation’s is crucial to reach a temperature low enough to reach the Jeans bound.
For a solar-mass cloud, the critical density is much higher, by a factor $1000^2$: $\rho_J \approx 4 \times 10^{-16}\,\mathrm{kg\,m^{-3}}$.
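The two Jeans densities quoted above are related purely by the $\rho_J \propto M^{-2}$ scaling at fixed temperature. A minimal sketch, taking the $1000 M_\odot$ value from the text as the reference point (the function names are illustrative, and no attempt is made to reproduce formula (4.48) itself):

```python
RHO_J_REF = 4e-22      # kg/m^3, Jeans density for the reference cloud (from the text)
M_REF_MSUN = 1000.0    # reference cloud mass in solar masses, at T ~ 20 K
RHO_B_TODAY = 1e-28    # kg/m^3, baryon density today (value used in the text)


def jeans_density(mass_msun):
    """Jeans density at fixed temperature, using only rho_J ~ M^-2."""
    return RHO_J_REF * (M_REF_MSUN / mass_msun) ** 2


def mean_baryon_density(z):
    """Mean baryon density at redshift z, rho_b(z) = (1 + z)^3 rho_0b."""
    return RHO_B_TODAY * (1.0 + z) ** 3


print(f"rho_J(1 Msun)  = {jeans_density(1.0):.1e} kg/m^3")
print(f"rho_b(z = 200) = {mean_baryon_density(200.0):.1e} kg/m^3")
```

Comparing the two numbers shows why a solar-mass clump cannot collapse out of the mean background: it needs an over-density of several orders of magnitude first.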

4.2.2 Collapsing a solar-mass cloud


Let us suppose that, by virtue of being in a dark matter halo and after some large-scale collapse, we have indeed reached the conditions of $\rho \sim 4 \times 10^{-16}\,\mathrm{kg\,m^{-3}}$ and $T \sim 20$ K. We will try to understand how a Sun-like star might form.
Molecular hydrogen is present in the cloud: its dissociation energy is $\epsilon_D \approx 4.5$ eV while, as we know, the ionization energy of hydrogen is $\epsilon_I \approx 13.6$ eV.
As the collapse starts, any kinetic energy which is developed is used first to dissociate molecular hydrogen and then to ionize hydrogen, so for a while the evolution goes according to the free-fall equation of motion:

\[ \frac{1}{2} \left( \frac{dr}{dt} \right)^2 = \frac{Gm_0}{r} - \frac{Gm_0}{r_0} , \tag{4.119} \]

which, as we have seen earlier, corresponds to a free-fall time given by (4.13): we get $t_{FF} \approx 100$ kyr.
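The quoted free-fall time can be reproduced with the standard uniform-sphere result $t_{FF} = \sqrt{3\pi/(32 G \rho)}$. This sketch assumes that equation (4.13), which is not reproduced in this section, has this form.

```python
import math

G = 6.674e-11          # m^3 kg^-1 s^-2
YEAR_IN_S = 3.156e7


def free_fall_time(rho):
    """Free-fall time of a uniform sphere of density rho (assumed form of (4.13))."""
    return math.sqrt(3.0 * math.pi / (32.0 * G * rho))


t_ff_kyr = free_fall_time(4e-16) / YEAR_IN_S / 1e3
print(f"t_FF ~ {t_ff_kyr:.0f} kyr")
```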
The energy needed to dissociate all the $\mathrm{H}_2$ and then ionize the H is given in terms of the mass of the cloud, $M = M_\odot$, by

\[ E = \frac{M}{2m_H} \epsilon_D + \frac{M}{m_H} \epsilon_I , \tag{4.120} \]

where $m_H \approx m_p$ is the mass of hydrogen. The mass of $\mathrm{H}_2$ is slightly different from $2m_H$, but the difference is of the order of $\epsilon_D / (m_H c^2) \sim 10^{-8}$, completely negligible.
The cloud starts from a radius $R_1 \sim \sqrt[3]{3M/4\pi\rho_J} \approx 1 \times 10^{15}$ m $\approx 0.1$ ly: what radius $R_2$ does it reach by the time all the hydrogen is ionized?
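We can calculate it by equating the difference in potential energy to the ionization and dissociation energy:

\begin{align}
GM^2 \left( \frac{1}{R_2} - \frac{1}{R_1} \right) &= E = \frac{M}{2m_H} \epsilon_D + \frac{M}{m_H} \epsilon_I \tag{4.121} \\
GM \left( \frac{1}{R_2} - \frac{1}{R_1} \right) &= \frac{\epsilon_D}{2m_H} + \frac{\epsilon_I}{m_H} \approx 1.7 \times 10^{-8} c^2 \tag{4.122} \\
\left( \frac{1}{R_2} - \frac{1}{R_1} \right)^{-1} &\approx \frac{GM}{c^2} \times 6 \times 10^7 \approx 9 \times 10^{10}\ \text{m} , \tag{4.123}
\end{align}

so, since $R_1 \gg 10^{11}$ m the $R_1^{-1}$ term is basically negligible, while $R_2 \approx 10^{11}$ m. We have shrunk our cloud from 0.1 ly to about 0.6 AU, just smaller than the radius of the orbit of Venus. This is still much larger than $R_\odot \approx 7 \times 10^8$ m.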
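The chain of estimates for $R_1$ and $R_2$ can be checked numerically. The following sketch uses rounded physical constants, and follows the text in omitting the order-one geometric factor of the potential energy of a sphere.

```python
import math

G = 6.674e-11        # m^3 kg^-1 s^-2
C = 2.998e8          # m/s
M_SUN = 1.989e30     # kg
M_H = 1.673e-27      # kg, hydrogen mass
EV = 1.602e-19       # J
AU = 1.496e11        # m

EPS_D = 4.5 * EV     # H2 dissociation energy
EPS_I = 13.6 * EV    # H ionization energy
RHO_J = 4e-16        # kg/m^3, Jeans density of the solar-mass cloud

# Initial radius of a uniform solar-mass cloud at the Jeans density
r1 = (3.0 * M_SUN / (4.0 * math.pi * RHO_J)) ** (1.0 / 3.0)

# Energy per unit mass absorbed by dissociation plus ionization, as in (4.122)
energy_per_mass = EPS_D / (2.0 * M_H) + EPS_I / M_H

# Solve GM (1/R2 - 1/R1) = energy_per_mass for R2
r2 = 1.0 / (energy_per_mass / (G * M_SUN) + 1.0 / r1)

print(f"R1 ~ {r1:.2e} m")
print(f"energy per mass ~ {energy_per_mass / C**2:.2e} c^2")
print(f"R2 ~ {r2:.2e} m = {r2 / AU:.2f} AU")
```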

The rest of the collapse is much slower, and it happens under hydrostatic equilibrium; the proto-star will still shrink under its own gravity, getting ever hotter, until it reaches the temperature needed for the ignition of fusion, around $10^7$ K.
At this stage the virial theorem applies: since the plasma is optically thick, very little energy is lost to radiation, so

\[ 2E_k + E_{gr} = 0 , \tag{4.124} \]

and we can recover the total kinetic energy from the equipartition theorem:

\[ E_k = \frac{3}{2} N k_B T = \frac{3M}{2m} k_B T \approx \frac{3M}{m_H} k_B T , \tag{4.125} \]

where we used the fact that $m = 0.5 m_H$: the hydrogen is ionized, so we have both free electrons and free protons, and basically all the mass is with the latter.
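The gravitational binding energy can be estimated as what was lost in the dissociation and ionization:

\begin{align}
2 \times \underbrace{\frac{M}{m_H} 3 k_B T}_{E_k} = -E_{gr} &\approx \frac{M}{m_H} \left( \frac{\epsilon_D}{2} + \epsilon_I \right) \tag{4.126} \\
k_B T &\approx \frac{1}{6} \left( \frac{\epsilon_D}{2} + \epsilon_I \right) \approx 2.6\ \text{eV} \approx 30\ \text{kK} , \tag{4.127}
\end{align}

still very much lower than the temperature needed to ignite fusion, which is on the order of a keV $\sim 10$ MK.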
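The arithmetic behind (4.127) is short enough to check directly; this sketch only converts the energies quoted in the text into a temperature.

```python
K_B_EV_PER_K = 8.617e-5   # Boltzmann constant in eV/K
EPS_D_EV = 4.5            # H2 dissociation energy, eV
EPS_I_EV = 13.6           # H ionization energy, eV

# Virial estimate (4.127): k_B T = (eps_D / 2 + eps_I) / 6
kT_ev = (EPS_D_EV / 2.0 + EPS_I_EV) / 6.0
temperature_K = kT_ev / K_B_EV_PER_K

print(f"k_B T ~ {kT_ev:.2f} eV ~ {temperature_K / 1e3:.0f} kK")
```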

4.2.3 Conditions for stardom and brown dwarfs


At this point, the proto-star reaching a high enough temperature to fuse hydrogen is not a given: it may not happen.
In order to discuss the conditions for stardom we need to account (albeit in an approximate way, as usual) for the fermionic nature of protons and electrons, which will give us a maximum density due to the Pauli exclusion principle: the particles will form a degenerate Fermi gas. Depending on the mass of the star, this maximum density may be reached before the ignition temperature: when this is the case, a brown dwarf is formed.
We estimate the minimum space a fermion can occupy with its De Broglie wavelength:

\[ \lambda = \frac{h}{p} = \frac{2\pi\hbar}{p} , \tag{4.128} \]

where $p$ is the momentum of the particle. Since the temperature is still less than a keV, both electrons and protons are nonrelativistic, so we can approximate $E_k = p^2/2m$, which will typically be of the order $E_k \sim k_B T$.
Then, the momentum and wavelength will be:

\[ p \sim \sqrt{2 m k_B T} \implies \lambda \sim \frac{2\pi\hbar}{\sqrt{m k_B T}} . \tag{4.129} \]

We can see that $\lambda \sim 1/\sqrt{m}$: this means that the wavelength is smaller (by a factor of about 40) for the protons than it is for the electrons, and they appear in equal numbers; therefore the first bound to be reached will be that of the degenerate electron gas, and for this reason we will neglect protons now.
The critical density will be reached when we have an electron for each $\lambda^3$; however in order to ensure local charge neutrality we must have protons distributed in the same way, and their mass contribution to the density will be the largest:
I’d actually say, then, that we should write $\rho_c = m_p/\lambda^3$!

\[ \rho_c = \frac{m}{\lambda^3} \sim m \frac{(m_e k_B T)^{3/2}}{(2\pi\hbar)^3} . \tag{4.130} \]
As we have seen earlier, since the virial theorem applies the temperature of the proto-star is tied to its gravitational binding energy by

\[ 3 N k_B T = \frac{GM^2}{R} \implies k_B T = \frac{GMm}{3R} . \tag{4.131} \]

The mass can be expressed in terms of the average density as $M = \frac{4}{3} \pi \rho R^3$, so

\[ \frac{1}{R} = \left( \frac{4\pi}{3} \frac{\rho}{M} \right)^{1/3} , \tag{4.132} \]

which gives us the result

\[ k_B T = \frac{GMm}{3} \left( \frac{4\pi}{3} \frac{\rho}{M} \right)^{1/3} . \tag{4.133} \]

If we substitute the critical density $\rho_c$ for $\rho$ we will get the maximum possible temperature allowed at a given mass $M$: after some manipulation we get (up to a small constant, which we neglect since the calculation is rough anyway):

\begin{align}
k_B T &= \frac{GMm}{3} \left( \frac{4\pi}{3M} \right)^{1/3} \frac{m^{1/3}}{2\pi\hbar} (m_e k_B T)^{1/2} \tag{4.134} \\
(k_B T)^2 &= \frac{G^2 M^2 m^2}{9} \left( \frac{4\pi}{3M} \right)^{2/3} \frac{m^{2/3}}{(2\pi\hbar)^2} m_e k_B T \tag{4.135} \\
k_B T &\approx \frac{G^2 m^{8/3} M^{4/3} m_e}{(2\pi\hbar)^2} . \tag{4.136}
\end{align}

Inserting the ignition temperature of around 1 keV we get $M_{\min} \approx M_\odot$, which is almost the right order of magnitude: more accurate models agree with the observational result of $M_{\min} \approx 0.08 M_\odot$. We have treated electron-degenerate matter in a very rough way; we will improve our estimate in a later section with a more accurate quantum-mechanical description.
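Inverting (4.136) for the mass gives $M_{\min} \approx \left[ k_B T (2\pi\hbar)^2 / (G^2 m^{8/3} m_e) \right]^{3/4}$. The sketch below evaluates it, assuming the mean particle mass $m = 0.5 m_H$ of ionized hydrogen used earlier and dropping the order-one constant, exactly as the text does; the function name is illustrative.

```python
G = 6.674e-11        # m^3 kg^-1 s^-2
H_PLANCK = 6.626e-34 # J s, equal to 2 pi hbar
M_H = 1.673e-27      # kg
M_E = 9.109e-31      # kg
EV = 1.602e-19       # J
M_SUN = 1.989e30     # kg


def minimum_stellar_mass(kT_joule, mean_particle_mass=0.5 * M_H):
    """Invert (4.136): mass at which degeneracy sets in exactly at temperature kT."""
    ratio = kT_joule * H_PLANCK**2 / (G**2 * mean_particle_mass ** (8.0 / 3.0) * M_E)
    return ratio ** 0.75


m_min_msun = minimum_stellar_mass(1e3 * EV) / M_SUN
print(f"M_min ~ {m_min_msun:.2f} Msun for k_B T = 1 keV")
```

The result is quite sensitive to the choice of $m$ (it scales as $m^{-2}$), which is one reason this estimate is only good to within an order of magnitude.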

4.3 The Sun and other stars
Let us start off with a list of the physical characteristics of the Sun: its mass is $M_\odot \approx 1.99 \times 10^{30}$ kg, its radius is $R_\odot \approx 6.96 \times 10^8$ m, its bolometric luminosity$^5$ is $L_\odot = 3.86 \times 10^{26}$ W.
Its age is around $t_\odot \approx 4.55 \times 10^9$ yr, roughly a third of the age of the Universe.
At its core, the density is $\rho_c \approx 1.48 \times 10^5\,\mathrm{kg\,m^{-3}}$, the temperature is $T_c = 1.56 \times 10^7$ K $\approx 1.3$ keV, and the pressure is around $P_c = 2.29 \times 10^{16}$ Pa.
The radiation emitted by the Sun approximately follows a blackbody curve, whose characteristic temperature is called the effective temperature: it is around $T_E \approx 5780$ K $\approx 0.5$ eV.
The power emitted by the Sun is quite small as a fraction of its total energy, so we can apply the virial theorem: one of its formulations we found is

\[ \langle P \rangle = -\frac{1}{3} \frac{E_{gr}}{V} , \tag{4.137} \]

where $E_{gr} \approx -GM^2/R$ while $V = 4\pi R^3/3$: plugging the Sun’s numbers we get

\[ \langle P \rangle = \frac{GM_\odot^2}{4\pi R_\odot^4} \approx 10^{14}\ \text{Pa} , \tag{4.138} \]

more than 100 times less than the central pressure. Similarly, the average density can be computed as $\langle \rho \rangle = M_\odot / V_\odot \approx 1.4 \times 10^3\,\mathrm{kg/m^3}$, just slightly more than the density of water! This is also roughly 100 times less than the central value.
By the ideal gas law (which does not precisely apply, but as we saw the Sun has a rather low density overall so it is a reasonable approximation) we can find the mean internal temperature of the Sun, $T_I$, as:

\[ \langle P \rangle = \frac{\langle \rho \rangle}{m} k_B T_I , \tag{4.139} \]

where $m$, the average particle mass, is $m \approx 0.61 m_H$ instead of $0.5 m_H$ when considering the proper composition of the Sun, which has a noticeable fraction of helium and heavier elements.
We get

\[ k_B T_I = \frac{GM_\odot^2}{4\pi R_\odot^4} \left( \frac{M_\odot}{4\pi R_\odot^3/3} \right)^{-1} m = \frac{GM_\odot m}{3R_\odot} \approx 0.5\ \text{keV} \approx 5 \times 10^6\ \text{K} . \tag{4.140} \]

Again, we see the relation $x_{\rm central} \gg x_{\rm mean} \gg x_{\rm surface}$, which holds for $x$ = density, pressure, temperature.
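The mean solar quantities in (4.138) to (4.140) can be reproduced in a few lines. This is a sketch using the solar parameters listed above and rounded constants; the variable names are illustrative.

```python
import math

G = 6.674e-11       # m^3 kg^-1 s^-2
K_B = 1.381e-23     # J/K
M_H = 1.673e-27     # kg
M_SUN = 1.99e30     # kg
R_SUN = 6.96e8      # m

volume = 4.0 * math.pi * R_SUN**3 / 3.0
mean_pressure = G * M_SUN**2 / (4.0 * math.pi * R_SUN**4)   # (4.138)
mean_density = M_SUN / volume
mean_particle_mass = 0.61 * M_H

# Ideal gas law (4.139) solved for the mean internal temperature
mean_temperature = mean_pressure * mean_particle_mass / (mean_density * K_B)

print(f"<P>   ~ {mean_pressure:.1e} Pa")
print(f"<rho> ~ {mean_density:.0f} kg/m^3")
print(f"T_I   ~ {mean_temperature:.1e} K")
```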
The measured bolometric luminosity of the Sun is consistent with

\[ L_\odot = 4\pi R_\odot^2 \sigma T_E^4 , \tag{4.141} \]

where $\sigma$ is Stefan’s constant: $\sigma \approx 5.67 \times 10^{-8}\,\mathrm{W\,m^{-2}\,K^{-4}}$.

$^5$ This means: the total luminosity, integrated across all the electromagnetic spectrum.

Why does this hold with the comparatively low temperature TE and not with the mean
internal temperature TI ? We will now see that this is because the interior of the Sun is
optically thick (opaque), so any photon from the interior cannot simply be emitted, it will
undergo many scatterings before doing so.

4.3.1 Radiative diffusion


The motion of the photon is Brownian, and the total displacement $\vec{D}$ from production to emission will be written in terms of $N$ short straight tracts,

\[ \vec{D} = \sum_{i=1}^{N} \vec{\ell}_i . \tag{4.142} \]

The process is inherently stochastic, and we will need to describe it as such. We need two equations in order to describe it: the Langevin and Fokker-Planck equations.
The Langevin equation gives us the derivative of the position of the particle in terms of a stochastic force $\vec{\eta}$:

\[ \dot{\vec{D}}(t) = \vec{\eta}(t) . \tag{4.143} \]

The requirements on $\vec{\eta}$ are that it should have zero mean: $\langle \vec{\eta}(t) \rangle = 0$ and that it should satisfy $\langle \eta_i(t) \eta_j(t') \rangle = 2D \delta_{ij} \delta(t - t')$. This means that it is spatially and temporally uncorrelated, making the process a Markovian one: it has “no memory”. The parameter $D$ is called the diffusion constant, its units are $\mathrm{m^2/s}$ (and it is not related to the displacement $\vec{D}$).
Nontrivially (we will not discuss the details of the derivation) this gives us the Fokker-Planck formula:

\[ \frac{\partial P}{\partial t} = D \nabla^2 P , \tag{4.144} \]

where $P$, a function of position and time, quantifies the probability density of finding a photon there.
Mathematically speaking this is a parabolic differential equation, which concretely means that we need to give it both initial conditions and boundary conditions.
The kind of boundary condition we want for the study of the Sun is a so-called absorbing boundary: physically, as a photon reaches the surface it is emitted, which from the inside looks as if the boundary “absorbed it”.
The solution to the Fokker-Planck equation with an impulsive initial condition like $P(\vec{x}, t = 0) = \delta^{(3)}(\vec{x})$ is a Gaussian centered around zero, with variance $\langle x^2 \rangle = 2Dt$ along each Cartesian direction:

\[ P(\vec{x}, t) = \frac{1}{(4\pi D t)^{3/2}} \exp\left( -\frac{x^2}{4Dt} \right) . \tag{4.145} \]

This holds for $0 < |\vec{x}| < R_\odot$, while for $|\vec{x}| > R_\odot$ the particles have escaped.

Integrating the Gaussian within the stellar boundary at a time $t$ yields the probability that any single photon emitted at the center of the Sun is still inside it after a time $t$. Roughly, this will be the case as long as $\sigma = \sqrt{2Dt} \lesssim R_\odot$.
Let us now consider the problem geometrically. The mean square value of the displacement will be given by

\[ \left\langle \vec{D}^2 \right\rangle = \sum_i \left\langle \vec{\ell}_i^{\,2} \right\rangle + 2 \sum_{i < j} \left\langle \vec{\ell}_i \cdot \vec{\ell}_j \right\rangle , \tag{4.146} \]

but if we have isotropy then the scalar products have mean zero, since they are averages of two lengths times a cosine.
Then we find that, in order for the photon to have reached the boundary of the star on average, it will need to have undergone $N$ scatterings: if we set $\langle \vec{D}^2 \rangle = R_\odot^2$ we get

\[ \left\langle \vec{D}^2 \right\rangle = N \ell^2 = R_\odot^2 , \tag{4.147} \]

so $N = R_\odot^2 / \ell^2$, where $\ell$ is the typical path travelled between scatterings.
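The statement $\langle \vec{D}^2 \rangle = N \ell^2$ is easy to verify with a small Monte Carlo experiment. The following is a minimal sketch of an isotropic random walk with fixed step length; the number of steps, number of walkers and the seed are arbitrary illustrative choices.

```python
import math
import random


def mean_square_displacement(n_steps, step_length, n_walkers, seed=0):
    """Average |D|^2 over many isotropic 3D random walks with fixed step length."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walkers):
        x = y = z = 0.0
        for _ in range(n_steps):
            # A normalized Gaussian triple is uniformly distributed on the sphere
            gx, gy, gz = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
            norm = math.sqrt(gx * gx + gy * gy + gz * gz)
            x += step_length * gx / norm
            y += step_length * gy / norm
            z += step_length * gz / norm
        total += x * x + y * y + z * z
    return total / n_walkers


n_steps = 100
msd = mean_square_displacement(n_steps, 1.0, 2000)
print(f"<D^2> / (N l^2) ~ {msd / n_steps:.3f}")
```

The ratio should scatter around 1 within a few percent, with the cross terms of (4.146) averaging away.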


The time it takes for a photon to cover a distance $\ell$ is $\tau = \ell/c$. Then, the total time taken in the random walk is given by $t_{RW} = N\tau = R_\odot^2 \ell / (\ell^2 c)$, which means

\[ t_{RW} = \frac{R_\odot^2}{c \ell} , \tag{4.148} \]

while in direct flight the photon would only have taken $t_0 = R_\odot / c$: their ratio is

\[ \frac{t_{RW}}{t_0} = \frac{R_\odot}{\ell} . \tag{4.149} \]

The argument which follows is still unconvincing to me: the photons take $\sim 50$ kyr to come out of the Sun, so this process will have stabilized in the Sun’s 5 Gyr lifetime!
Estimating the mean free path as $\ell \sim 1/(n \sigma_T)$ could work, it yields $\sim 1$ cm with the mean density and $\sim 0.1$ mm with the central density, so perhaps since in the low-$\ell$ regions the photons stay for a longer time this can average out to 1 mm when doing the calculation properly.
If $L_0$ is the luminosity for the Sun calculated using the average internal temperature $T_I$ instead of the effective temperature $T_E$ we get

\[ L_\odot = L_0 \frac{\ell}{R_\odot} , \tag{4.150} \]

which means

\[ T_E = T_I \left( \frac{\ell}{R_\odot} \right)^{1/4} , \tag{4.151} \]

so we can calculate $\ell$ by knowing the other three parameters: we get that the mean free path, averaged over the star, is $\ell \approx 1$ mm.
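Inverting (4.151) gives $\ell = R_\odot (T_E/T_I)^4$, and (4.148) then gives the escape time. A minimal sketch with the rounded values $T_E = 5780$ K and $T_I = 5 \times 10^6$ K quoted in the text:

```python
R_SUN = 6.96e8       # m
C = 2.998e8          # m/s
YEAR_IN_S = 3.156e7
T_EFFECTIVE = 5780.0 # K
T_INTERNAL = 5e6     # K, mean internal temperature from (4.140)

# Mean free path from (4.151), then random-walk escape time from (4.148)
mean_free_path = R_SUN * (T_EFFECTIVE / T_INTERNAL) ** 4
t_random_walk_kyr = R_SUN**2 / (C * mean_free_path) / YEAR_IN_S / 1e3

print(f"mean free path ~ {mean_free_path * 1e3:.1f} mm")
print(f"escape time    ~ {t_random_walk_kyr:.0f} kyr")
```

Because of the fourth power, the result for $\ell$ is very sensitive to the value adopted for $T_I$, so only the order of magnitude should be trusted.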

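The total solar luminosity is then:

\[ L_\odot = L_0 \frac{\ell}{R_\odot} = 4\pi R_\odot^2 \sigma T_I^4 \frac{\ell}{R_\odot} , \tag{4.152} \]

and we know that $k_B T_I = \frac{GMm}{3R}$: inserting this we find

\[ L = 4\pi R^2 \sigma \left( \frac{GMm}{3R k_B} \right)^4 \frac{\ell}{R} = \frac{(4\pi)^2 \sigma}{3^5 k_B^4} G^4 m^4 \rho \ell M^3 . \tag{4.153} \]

The most important part of this result, which is observationally verifiable, is $L \propto M^3$.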


This allows us to estimate the lifespan of a star: the fraction of mass available to be turned into energy through fusion is a constant multiple of $M$ (slightly less than 1 %): therefore, we have $\tau \propto M/L \propto M^{-2}$. More massive stars die earlier.
The relation $L \propto M^\alpha$ for $\alpha = 3$ is quite close to the data globally; in specific regions we can have better estimates for the power-law index: the value $\alpha = 3.5$ is more commonly used for Main Sequence stars (which we will discuss later).
The lifetime always scales like $\tau \propto M^{1 - \alpha}$, and we always have $\alpha > 1$, so the fact that more massive stars die earlier always holds.
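To get a feeling for the scaling $\tau \propto M^{1-\alpha}$, this sketch normalizes it to a solar lifetime of about 10 Gyr (the figure quoted in the thermonuclear fusion section); the function name and the normalization are illustrative assumptions.

```python
SOLAR_LIFETIME_GYR = 10.0  # assumed normalization for a 1 Msun star


def main_sequence_lifetime_gyr(mass_msun, alpha=3.0):
    """Lifetime from tau ~ M / L with L ~ M^alpha, normalized to the Sun."""
    return SOLAR_LIFETIME_GYR * mass_msun ** (1.0 - alpha)


for mass in (0.5, 1.0, 10.0):
    print(mass, main_sequence_lifetime_gyr(mass), main_sequence_lifetime_gyr(mass, alpha=3.5))
```

With $\alpha = 3$, a $10 M_\odot$ star lives about a hundred times less than the Sun, while a $0.5 M_\odot$ star outlives the current age of the Universe.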
We then expect that in each galaxy there can have been several “generations” of heavy
stars, while the very lightest are still in their first generation. This can be observationally
confirmed by looking at the abundances of elements, and comparing them with the expected
production inside heavy stars.

4.3.2 Thermonuclear fusion


The reaction chain which produces Helium in stellar cores is the following:

\begin{align}
p + p &\to d + e^+ + \nu_e \tag{4.154a} \\
p + d &\to {}^3\mathrm{He} + \gamma \tag{4.154b} \\
{}^3\mathrm{He} + {}^3\mathrm{He} &\to {}^4\mathrm{He} + p + p , \tag{4.154c}
\end{align}

which involves the weak interaction (for the first process, whose lifetime is $\tau \sim 5 \times 10^9$ yr), EM interaction (for the second process: $\tau \sim 1$ s) and strong interaction (for the third process: $\tau \sim 3 \times 10^5$ yr).
The net balance is 4 protons in, one ${}^4\mathrm{He}$ and some photons and neutrinos out, accounting for the $\approx 0.66$ % mass difference.
We can then see that the weak-interaction part of the chain is the bottleneck.
The chain we had during primordial nucleosynthesis, on the other hand, did not need any weak-interaction processes:

\begin{align}
n + p &\to d + \gamma \tag{4.155a} \\
d + d &\to {}^3\mathrm{He} + n \tag{4.155b} \\
{}^3\mathrm{He} + d &\to {}^4\mathrm{He} + p . \tag{4.155c}
\end{align}

The issue is that all the free neutrons quickly decayed after primordial nucleosynthesis, and free deuterium is scarce: secondary production chains can then prevail.
The weak interaction chain has a very low power density: $L_\odot / V_\odot \approx 0.27\,\mathrm{W/m^3}$, a lower power density than a human (who generally has a volume less than a cubic meter, but can still output a few hundreds of Watts).
For each ${}^4\mathrm{He}$ nucleus we get an energy $E = (4m_p - m_{{}^4\mathrm{He}})c^2 \approx 25$ MeV. Then, we can calculate the number of protons per second the Sun uses in order to produce the power it does.
The rate $r$ of proton usage is given by

\[ r = \frac{4L_\odot}{E} \approx 4 \times 10^{38}\ \text{Hz} , \tag{4.156} \]

which corresponds to about $6.5 \times 10^{11}$ kg/s.$^6$
For each process one electron neutrino is also emitted, and the first two steps of the process happen twice for each ${}^4\mathrm{He}$ nucleus, so around $2 \times 10^{38}$ neutrinos per second are produced. These travel basically undisturbed through the Sun and out.
A proton’s mass is around $m_p \approx 1$ GeV $\approx 1.78 \times 10^{-27}$ kg, so in the Sun there are around $10^{57}$ protons, of which roughly a tenth ($\sim 10^{56}$, those in the core) are available for fusion: this means that the total lifetime of the Sun will be around $10^{10}$ years. The Sun is approximately half-way through this lifetime.
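The bookkeeping of this subsection can be condensed into a few lines. In this sketch the number of protons available for burning, $10^{56}$, is taken from the text as an input rather than derived.

```python
L_SUN = 3.86e26         # W
MEV = 1.602e-13         # J
M_P = 1.673e-27         # kg
YEAR_IN_S = 3.156e7

ENERGY_PER_HELIUM = 25.0 * MEV   # energy released per 4He nucleus
BURNABLE_PROTONS = 1e56          # protons available for fusion (value from the text)

proton_rate = 4.0 * L_SUN / ENERGY_PER_HELIUM   # (4.156), protons per second
mass_rate = proton_rate * M_P                   # kg of hydrogen processed per second
neutrino_rate = proton_rate / 2.0               # two neutrinos per four protons
lifetime_yr = BURNABLE_PROTONS / proton_rate / YEAR_IN_S

print(f"proton rate   ~ {proton_rate:.1e} /s")
print(f"mass rate     ~ {mass_rate:.1e} kg/s")
print(f"neutrino rate ~ {neutrino_rate:.1e} /s")
print(f"lifetime      ~ {lifetime_yr:.1e} yr")
```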

4.3.3 Stellar evolution


This section is massively simplified: stellar evolution is complicated, not completely understood, and there can be many confounding variables. We will only try to give a general overview.
Throughout the evolution of the star, the interior is near equilibrium between gravitation and the pressure gradient due to the energy emitted by fusion. If the fusion starts to produce more energy, then the star expands and reaches a new equilibrium.
The series of processes which can happen in stellar fusion is shown in figure 4.1.

Process            Fuel       Products                   T_min                  M_min

Hydrogen burning   Hydrogen   Helium                     $10^7$ K               $0.08 M_\odot$
Helium burning     Helium     Carbon, Oxygen             $10^8$ K               $0.5 M_\odot$
Carbon burning     Carbon     Oxygen, Neon, Sodium       $5 \times 10^8$ K      $8 M_\odot$
Neon burning       Neon       Magnesium, Oxygen          $10^9$ K               $9 M_\odot$
Oxygen burning     Oxygen     Magnesium to Sulphur       $2 \times 10^9$ K      $10 M_\odot$
Silicon burning    Silicon    Iron and nearby elements   $3 \times 10^9$ K      $11 M_\odot$

Figure 4.1: Solar processes.

As one type of fusion fuel starts to run out, the pressure from the inside diminishes,
therefore the interior of the star starts contracting, which typically allows the interior to start
fusing the next kind of fuel, while the exterior keeps expanding.
$^6$ Which, to use a gruesome comparison, corresponds to roughly the mass of the entire human population every second.
The required temperature to fuse every next element is ever higher. After every cycle there is the possibility of reaching the maximum density allowed by electron degeneracy pressure, which yields a mass threshold for each burning stage, denoted as $M_{\min}$ in the table.
For example, the Sun will reach the Helium burning phase, but it will go no further.
A star with a mass of less than $11 M_\odot$ will not reach the iron stage; as it starts burning helium it becomes a red giant, and it only keeps growing as it goes through the burning phases. As it runs out of its last fuel, it starts shrinking and becomes either a white dwarf or a neutron star.
There is a boundary, the Chandrasekhar mass $M_C \approx 1.4 M_\odot$, between the final fate of the star being a white dwarf or a neutron star: this is the maximum mass a star made of protons and electrons can have if it must resist its own gravitational collapse only through electron degeneracy pressure.
Now, let us consider massive stars with $M > 11 M_\odot$. They are able to burn silicon into iron in their cores; now, the thing to note here is that the iron nuclide ${}^{56}\mathrm{Fe}$ has one of the highest binding energies per nucleon,$^7$ so once it is reached any successive burning stages would absorb energy, instead of releasing it.
This region in the core will not have any way to provide pressure to counteract the gravity of all the rest of the star above it. When the iron core reaches the Chandrasekhar mass, it collapses onto itself, and the rebound from this creates a supernova, which will either leave a neutron star or a black hole as a remnant.
In the supernova the conditions allow for the formation of heavier elements, beyond iron. The matter which is expelled forms a supernova remnant.
Maybe add more details here about supernovae and the final fates of very massive
stars? Not sure whether it makes sense, maybe refer to the advanced astrophysics
notes.

4.3.4 The Hertzsprung-Russell diagram


In the Hertzsprung-Russell diagram we plot $L/L_\odot$ versus $T_{\rm eff}$, the latter increasing right to left.
The Main Sequence runs from the upper left to the lower right; we have Red Giants on the upper right and White Dwarfs on the lower left. Most of the stars are on the Main Sequence: the hydrogen burning phase lasts a long time.
Pacciani here has a more in-depth discussion of HR diagrams, absolute magnitudes
and so on: this would be a useful thing to insert. Also, a figure would be useful.

$^7$ It is not the highest, since there are a couple of nuclides like ${}^{62}\mathrm{Ni}$ which slightly exceed its binding energy per nucleon. These are collectively known as the “iron group”, and are also formed, albeit in smaller amounts, during stellar fusion. For more details, see section IV in Fewell [Few95], a very clear (and not so technical) paper.

4.3.5 The interior of a Main Sequence star
We wish to describe the statics of the fluid which makes up a Main Sequence star; the
goal we set is to calculate what is the maximum mass of a Main Sequence star — the Main
Sequence includes all the stars which are in the process of burning hydrogen.

The density profile. The first thing we will need is to calculate the density profile of the star, which is written as $\rho(r)$ since as always we are assuming spherical symmetry.
Our equation of hydrostatic equilibrium can be written as:

\begin{align}
\frac{dP}{dr} &= -\frac{Gm(r)\rho(r)}{r^2} \tag{4.157} \\
\frac{r^2}{\rho(r)} \frac{dP}{dr} &= -Gm(r) , \tag{4.158}
\end{align}

and we can relate the differential mass of a spherical shell with the differential radius through $dm = 4\pi r^2 \rho(r)\,dr$.
Differentiating the equation of hydrostatic equilibrium we find

\[ \frac{d}{dr} \left( \frac{r^2}{\rho(r)} \frac{dP}{dr} \right) = -G \frac{dm}{dr} = -4\pi G \rho(r) r^2 , \tag{4.159} \]

more commonly stated as

\[ \frac{1}{r^2} \frac{d}{dr} \left( \frac{r^2}{\rho(r)} \frac{dP}{dr} \right) = -4\pi G \rho(r) . \tag{4.160} \]

This resembles the Laplacian in spherical coordinates, $\nabla^2 f(r) = r^{-2} \partial_r \left( r^2 \partial_r f \right)$. . . can this be stated more precisely?
We can solve this using an equation of state which gives us $P = P(\rho)$; for stellar interiors the cosmological equations of state $P \propto \rho$ do not in general work well, and instead we must generalize to a polytropic equation:

\[ P = k \rho^{\frac{n+1}{n}} = k \rho^\gamma , \tag{4.161} \]

where $k$ and $n$ are constants; also $n = 1/(\gamma - 1)$. This $\gamma$ is the adiabatic index: for a monoatomic nonrelativistic gas $\gamma = 5/3$ and $n = 3/2$, for an ultrarelativistic gas $\gamma = 4/3$ and $n = 3$.
This will yield a second order differential equation for $\rho$, which we must complement with two boundary conditions. We can set the value of the central density, $\rho(r = 0) = \rho_c$; also, in order for the density to be a differentiable function of position inside the star we must also have $\partial_r \rho(r = 0) = 0$: this is because if we move in a straight line through the center of the star, as we pass $r = 0$ we move from a certain value of the derivative to minus that value, since $r$ goes from decreasing to increasing. The only way for this to be continuous is if the derivative is zero.

This is confirmed by the fact that, if we substitute the polytropic equation of state, we find

\[ \rho^{1/n} \frac{d\rho}{dr} \propto -\frac{Gm(r)}{r^2} \rho(r) , \tag{4.162} \]

and the mass in a small region around the origin is approximately $m(r) \sim \rho_c r^3$: therefore $\rho^{1/n} \frac{d\rho}{dr} \propto -\rho_c r$, which vanishes at $r = 0$.
These equations can be solved numerically. The radius of the star can be calculated as the one at which the density goes to 0: $\rho(R) = 0$, and the mass of the star is given by $m(R) = M$.
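As an illustration of such a numerical solution, here is a minimal sketch that integrates the standard dimensionless form of (4.160) with a polytropic equation of state, known as the Lane-Emden equation, $\theta'' + (2/\xi)\theta' + \theta^n = 0$ with $\rho = \rho_c \theta^n$, $\theta(0) = 1$ and $\theta'(0) = 0$. The notes do not write the equation in this form, so the rescaling, the RK4 scheme, the step size and the function name are all assumptions of this example.

```python
import math


def lane_emden_first_zero(n, h=1e-3):
    """Integrate theta'' + (2/xi) theta' + theta^n = 0 with RK4.

    Returns the first zero xi_1 of theta, which sets the stellar radius
    in units of the polytropic length scale.
    """
    def rhs(xi, theta, dtheta):
        # Clamp theta at zero so that fractional powers stay real
        return dtheta, -max(theta, 0.0) ** n - 2.0 * dtheta / xi

    # Start slightly off-centre using the series theta ~ 1 - xi^2 / 6
    xi = 1e-6
    theta = 1.0 - xi**2 / 6.0
    dtheta = -xi / 3.0
    while True:
        k1t, k1d = rhs(xi, theta, dtheta)
        k2t, k2d = rhs(xi + h / 2, theta + h / 2 * k1t, dtheta + h / 2 * k1d)
        k3t, k3d = rhs(xi + h / 2, theta + h / 2 * k2t, dtheta + h / 2 * k2d)
        k4t, k4d = rhs(xi + h, theta + h * k3t, dtheta + h * k3d)
        new_theta = theta + h / 6 * (k1t + 2 * k2t + 2 * k3t + k4t)
        new_dtheta = dtheta + h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
        if new_theta <= 0.0:
            # Linear interpolation of the zero crossing inside the last step
            return xi + h * theta / (theta - new_theta)
        xi, theta, dtheta = xi + h, new_theta, new_dtheta


for index in (1.0, 1.5, 3.0):
    print(index, lane_emden_first_zero(index))
```

The case $n = 1$ has the closed-form solution $\theta = \sin\xi/\xi$, whose first zero at $\xi_1 = \pi$ provides a check of the integrator.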
This model is unphysical in that it assumes that the star’s interior can be described by a
constant g throughout; as we have seen there are several orders of magnitude of difference in
pressure, temperature and density from the core to the surface, so it is a strong assumption
to say that the plasma behaves in the same way throughout the star.

The Clayton model. Let us discuss a model proposed by Clayton in 1986, which uses an ansatz for the density profile in order to extract information about the star.
Near the center of the star, we can estimate the mass contained within a sphere of radius $r$ by assuming $\rho = \rho_c$ throughout, so $m(r) \sim \frac{4\pi}{3} \rho_c r^3$, which we can substitute into the equation of hydrostatic equilibrium:

\[ \frac{dP}{dr} = -\frac{Gm\rho}{r^2} \approx -\frac{4\pi G}{3} \rho_c^2 r , \tag{4.163} \]

so the pressure gradient goes to zero linearly in $r$.
Also, as $r$ approaches the radius of the star, $R$, we get $\frac{dP}{dr} \to 0$ as well, since the pressure gradient is proportional to $\rho(r)$ in that region, which goes to zero, while $m(r)/r^2$ approaches a constant.
So, the pressure gradient approaches zero both at the core and at the surface, while in the interior it has a negative value.
The ansatz by Clayton is a relatively simple expression which satisfies these requirements:

\[ \frac{dP}{dr} = -\frac{4\pi}{3} G \rho_c^2 r \exp\left( -\frac{r^2}{a^2} \right) , \tag{4.164} \]

where the parameter $a$ has the dimensions of a length, and we take it to be $a \ll R$. This model is quite accurate near the center, not so much near the surface!
By integrating we can calculate the pressure profile:

\[ P(r) = \frac{2\pi}{3} G \rho_c^2 a^2 \left[ \exp\left( -\frac{r^2}{a^2} \right) - \exp\left( -\frac{R^2}{a^2} \right) \right] , \tag{4.165} \]

so that the pressure is exactly zero at the surface: $P(R) = 0$.

Substituting the relation $dm = 4\pi r^2 \rho\,dr$ into the hydrostatic equilibrium equation we find

\[ Gm(r)\,dm = -4\pi r^4\,dP , \tag{4.166} \]

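which can be integrated in order to calculate the mass from the pressure profile:

\begin{align}
\frac{1}{2} G m^2(r) &= -4\pi \int_0^r d\tilde{r}\, \tilde{r}^4 \frac{dP}{d\tilde{r}} \tag{4.167} \\
m^2(r) &= \frac{8\pi}{G} \left( \frac{4\pi}{3} \right) G \rho_c^2 \int_0^r d\tilde{r}\, \tilde{r}^5 \exp\left( -\frac{\tilde{r}^2}{a^2} \right) \tag{4.168} \\
m(r) &= \frac{4\pi a^3}{3} \rho_c \Phi(x) , \tag{4.169}
\end{align}

where we performed a change of variable to $x = r/a$ (bringing out $a^6$) and defined $\Phi(x)$ as the square root of the dimensionless integral:

\[ \Phi^2(x) = 6 \int_0^x dy\, y^5 e^{-y^2} = 6 - 3 \left( x^4 + 2x^2 + 2 \right) e^{-x^2} . \tag{4.170} \]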

Near the surface $x$ is very large (since, as we mentioned, $a \ll R$), so the exponential suppression dominates over the polynomial: we find $\Phi^2(x) \approx 6$ there.
The density profile can also be expressed in terms of $\Phi(x)$:

\begin{align}
\rho(r) &= \frac{1}{4\pi r^2} \frac{dm}{dr} = \frac{1}{4\pi a^2 x^2} \frac{1}{a} \frac{4\pi a^3 \rho_c}{3} \frac{d\Phi}{dx} \tag{4.171} \\
&= \frac{\rho_c}{3x^2} \frac{1}{2\Phi} \frac{d}{dx} \left( 6 \int_0^x y^5 \exp\left( -y^2 \right) dy \right) \tag{4.172} \\
&= \rho_c \left( \frac{x^3 e^{-x^2}}{\Phi(x)} \right) , \tag{4.173}
\end{align}

which, under the assumptions of the model, is a complete description of the density profile.
We can also recover the temperature profile from the ideal gas law (which holds as long as the gas inside the star is nonrelativistic):

\[ T(r) = \frac{m}{k_B} \frac{P(r)}{\rho(r)} . \tag{4.174} \]

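Let us calculate this near the center of the star, meaning $x \ll 1$:

\begin{align}
\Phi(x) &= \left[ 6 - 3 \left( x^4 + 2x^2 + 2 \right) \left( 1 - x^2 + \frac{x^4}{2} - \frac{x^6}{6} + \dots \right) \right]^{1/2} \tag{4.175} \\
&\approx \left( x^6 - \frac{3}{4} x^8 + \frac{3}{10} x^{10} - \frac{1}{12} x^{12} + \dots \right)^{1/2} , \tag{4.176}
\end{align}

which we can insert into our expression for $\rho(r)$; also let us expand $P(r)$ to second order, so that we can also calculate $T(r)$:

\begin{align}
\rho(r) &\approx \rho_c \left( 1 - \frac{5}{8} \frac{r^2}{a^2} + \dots \right) \tag{4.177} \\
P(r) &\approx \frac{2\pi}{3} G \rho_c^2 a^2 \left( 1 - \frac{r^2}{a^2} + \dots \right) \tag{4.178} \\
T(r) &\approx T_c \left( 1 - \frac{3}{8} \frac{r^2}{a^2} + \dots \right) , \tag{4.179}
\end{align}

where

\[ T_c = \frac{m}{k_B} \frac{2\pi}{3} G \rho_c a^2 . \tag{4.180} \]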
Moving to the surface, we can calculate the total mass

$$ M = m(R) = \frac{4\pi \rho_c a^3}{3} \Phi(R/a) \approx \frac{4\pi \rho_c a^3 \sqrt{6}}{3} . \tag{4.181} $$

Then, the average density is given by

$$ \langle \rho \rangle = \frac{M}{\frac{4\pi}{3} R^3} = \sqrt{6} \, \rho_c \left( \frac{a}{R} \right)^3 , \tag{4.182} $$

so if it is the case that $a \ll R$ then we also have $\rho_c \gg \langle \rho \rangle$.
We can invert this relation to find $a$ in terms of $M$ and $\rho_c$; for the Sun we find $a \approx R_\odot / 5.4$, $\rho(a) = 0.53 \rho_c$ and $m(a) = 0.28 M_\odot$.
This means that, as we expected, the Sun is quite concentrated: over a quarter of its mass is contained within $(1/5.4)^3 \approx 0.6\,\%$ of its volume.
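The two dimensionless numbers quoted for the Sun, $\rho(a)/\rho_c$ and $m(a)/M$, depend only on $\Phi(1)$, so they can be reproduced without any solar data. The following is a minimal sketch under the approximation $\Phi(R/a) \approx \sqrt{6}$; the function names are illustrative.

```python
import math

def phi(x):
    """Phi(x) from eq. (4.170)."""
    return math.sqrt(6.0 - 3.0 * (x**4 + 2.0 * x**2 + 2.0) * math.exp(-x * x))

def density_ratio(x):
    """rho(r) / rho_c = x^3 exp(-x^2) / Phi(x), eq. (4.173)."""
    return x**3 * math.exp(-x * x) / phi(x)

def mass_fraction(x):
    """m(r) / M, using the surface approximation Phi(R/a) ~ sqrt(6)."""
    return phi(x) / math.sqrt(6.0)

R_OVER_A = 5.4  # value quoted in the text for the Sun

print("rho(a)/rho_c    =", density_ratio(1.0))
print("m(a)/M          =", mass_fraction(1.0))
print("volume fraction =", 1.0 / R_OVER_A**3)
```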
A useful result we can derive from this model is the central pressure expressed in terms of the mass and central density:

$$ P_c = \frac{2\pi}{3} G \rho_c^2 a^2 = \frac{2\pi}{3} G \rho_c^2 \left( \frac{3M}{4\pi \rho_c \sqrt{6}} \right)^{2/3} \tag{4.183} $$

$$ \approx \left( \frac{\pi}{36} \right)^{1/3} G M^{2/3} \rho_c^{4/3} . \tag{4.184} $$

This model then predicts the prefactor $q = (\pi/36)^{1/3} \approx 0.44$. The powers of $M$ and $\rho_c$ are the same in more sophisticated models.
From simulations with varying $\gamma$ we get: for $\gamma = 5/3$ the factor is $q \approx 0.48$, for $\gamma = 4/3$ the factor is $q \approx 0.36$. The results are quite close to ours!
Pacciani also states that $q < (\pi/6)^{1/3} \approx 0.14$... not sure how that would make sense!

4.3.6 The maximum mass


We can apply the ideal gas relation to the core of the star: then we find

$$ k_B T_c = \mu \frac{P_c}{\rho_c} \approx \left( \frac{\pi}{36} \right)^{1/3} G \mu M^{2/3} \rho_c^{1/3} . \tag{4.185} $$

In the core of the star, we have both nonrelativistic and relativistic material in equilibrium: electrons and protons are nonrelativistic, while photons are relativistic. We have discussed earlier that if most of the material in a star were relativistic it would become unstable (since, as $\gamma \to 4/3$, the binding energy approaches 0); let us then discuss the composition of the star.
We can decompose the pressure at the core into the fractions due to nonrelativistic matter and to radiation:

$$ P_c = P_m + P_r = \beta P_c + (1 - \beta) P_c , \tag{4.186} $$

where we define $\beta \in (0, 1)$ as the fraction of the core pressure which is due to matter: $\beta = P_m / P_c$.
The two contributions can be separately expressed as:

$$ \beta P_c = P_m = \frac{\rho_c k_B T_c}{\mu} \tag{4.187} $$

$$ (1 - \beta) P_c = P_r = \frac{1}{3} a T_c^4 , \tag{4.188} $$

where $a$ is the radiation constant, related to the Stefan-Boltzmann constant $\sigma$ by $a = 4\sigma / c$:

$$ a = \frac{\pi^2 k_B^4}{15 \hbar^3 c^3} . \tag{4.189} $$
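We can then relate $\beta$ to the mass of the star: in order to simplify the core temperature, we start by computing

$$ \frac{(\beta P_c)^4}{(1 - \beta) P_c} = \frac{\rho_c^4}{\mu^4} (k_B T_c)^4 \, \frac{3}{a T_c^4} \tag{4.190} $$

$$ \frac{\beta^4}{1 - \beta} P_c^3 = \frac{3}{a} \left( \frac{k_B \rho_c}{\mu} \right)^4 , \tag{4.191} $$

which we can invert to find an expression for the core pressure $P_c$ in terms of $\beta$, which we then compare to the expression we found for the core pressure as a result of the Clayton model:

$$ P_c = \left( \frac{3}{a} \frac{1 - \beta}{\beta^4} \right)^{1/3} \left( \frac{k_B \rho_c}{\mu} \right)^{4/3} = \left( \frac{\pi}{36} \right)^{1/3} G M^{2/3} \rho_c^{4/3} \tag{4.192} $$

$$ \left( \frac{\pi}{36} \right)^{1/3} G M^{2/3} = \left( \frac{3}{a} \frac{1 - \beta}{\beta^4} \right)^{1/3} \left( \frac{k_B}{\mu} \right)^{4/3} , \tag{4.193} $$

the core density simplifies!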


So, if we compare stars at the same stage of fusion so that $\mu$ is constant, we have $M \propto f(\beta) = (1 - \beta)^{1/2} / \beta^2$.
$f(\beta)$ decreases as $\beta$ increases, and it diverges to $+\infty$ for $\beta \to 0$.
Looking at the plot the other way, the heavier the star, the larger the contribution of radiation to the core pressure, which is what $1 - \beta$ quantifies.

[Figure 4.2: A plot of $M$ in terms of $\beta$. Vertical axis: nonrelativistic pressure fraction $\beta$, from 0 to 1. Horizontal axis: mass in units of $M_\odot$, logarithmic, from $10^1$ to $10^6$.]

This makes sense intuitively: heavier stars reach higher temperatures and densities, so they have more radiation in the core.
We know that for $\beta \to 0$ the star is surely unstable, but the instability is actually reached earlier: even before the gravitational binding energy reaches exactly zero, large parts of the star can be flung out as stellar winds. Proper considerations about what an appropriate critical value of $\beta$ should be allow us to bound the stellar mass from above, at around $50 M_\odot$.

4.3.7 Degenerate electron gas


Now, we will deal with the degenerate electron gas in stars, and see what its effect is on the minimum and maximum mass of a star.
The distribution function of the electrons, which are fermions, is given by

$$ f(p) = \left[ \exp\left( \frac{\epsilon_p - \mu}{k_B T} \right) + 1 \right]^{-1} , \tag{4.194} $$

where $\epsilon_p = \sqrt{m^2 c^4 + p^2 c^2}$. With it, we can calculate the number density of electrons:

$$ n_e = \frac{g_s}{h^3} \int d^3 p \, f(p) , \tag{4.195} $$

where $g_s$, the number of helicity states of the electron, is equal to 2.
We want to consider the degenerate case for this distribution, which corresponds to the saturation of all the low-energy configurations in phase space: this is known as a Fermi gas. As the temperature approaches zero, the phase space distribution approaches the configuration

$$ f(p) = \lim_{T \to 0} \left[ \exp\left( \frac{\epsilon_p - \mu}{k_B T} \right) + 1 \right]^{-1} = \begin{cases} 1 & \epsilon_p < \mu \\ 0 & \epsilon_p > \mu . \end{cases} \tag{4.196} $$

The chemical potential yields a critical energy, known as the Fermi energy, $\epsilon_F = \mu$, which is also tied to a Fermi momentum:

$$ \epsilon_F^2 = c^2 p_F^2 + m^2 c^4 . \tag{4.197} $$

The number density of electrons in this configuration is given by the integral mentioned before: since the distribution is spherically symmetric we have

$$ n_e = 2 \int_0^{p_F} dp \, p^2 \, 4\pi \frac{1}{h^3} = \frac{8\pi}{3} \left( \frac{p_F}{h} \right)^3 , \tag{4.198} $$

which allows us to express the Fermi momentum in terms of the number density of electrons:

$$ p_F = \left( \frac{3 n_e}{8\pi} \right)^{1/3} h . \tag{4.199} $$

In natural units ($\hbar = 1$, so $h = 2\pi$), this is $p_F = (3\pi^2 n_e)^{1/3} \approx 3.1 \sqrt[3]{n_e}$.
The energy density is given by the expression

$$ \rho = \frac{2}{h^3} \int_0^{p_F} 4\pi p^2 \, dp \, \epsilon_p , \tag{4.200} $$

which we can consider in either the nonrelativistic or the ultrarelativistic limit, since the analytic integral is complicated and not very enlightening.

Nonrelativistic limit. In this limit the energy is approximately

$$ \epsilon_p = m c^2 + \frac{p^2}{2m} , \tag{4.201} $$

so in the computation we need to integrate a polynomial: the result is

$$ \rho = n \left( m c^2 + \frac{3}{10} \frac{p_F^2}{m} \right) , \tag{4.202} $$

where the first term corresponds to the rest energy of the electrons, while the second gives their kinetic energy. We have derived earlier the following expression for the pressure of a nonrelativistic gas:

$$ P = \frac{2}{3} \frac{E_k}{V} , \tag{4.203} $$

and $E_k / V$ is precisely the kinetic energy density, the second term of the expression for the total energy density $\rho$; so for our nonrelativistic Fermi gas we will have:

$$ P = n \frac{p_F^2}{5m} . \tag{4.204} $$

Since the Fermi momentum $p_F$ can be written as a function of the number density $n_e$, so can the pressure $P$: we find

$$ P = \underbrace{\frac{h^2}{5m} \left( \frac{3}{8\pi} \right)^{2/3}}_{K_{\mathrm{NR}}} n_e^{5/3} . \tag{4.205} $$

Ultrarelativistic limit. In the relativistic case, on the other hand, we can approximate the energy as $\epsilon_p \approx c p$, so the energy density will be given by

$$ \rho = \frac{3}{4} n \, p_F c . \tag{4.206} $$

In this case, we also know that the pressure becomes

$$ P = \frac{1}{3} \frac{E_k}{V} , \tag{4.207} $$

so we find

$$ P = \underbrace{\frac{hc}{4} \left( \frac{3}{8\pi} \right)^{1/3}}_{K_{\mathrm{UR}}} n^{4/3} . \tag{4.208} $$
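To get a feeling for the two limiting equations of state, the sketch below evaluates the constants $K_{\mathrm{NR}}$ and $K_{\mathrm{UR}}$ for electrons and finds the number density at which the two pressure formulas coincide. The constant values are rounded CODATA figures and the function names are illustrative; the crossover is only a rough marker of where the gas stops being nonrelativistic.

```python
import math

H = 6.62607015e-34       # Planck constant [J s]
C = 2.99792458e8         # speed of light [m/s]
M_E = 9.1093837015e-31   # electron mass [kg]

# Prefactors of eqs. (4.205) and (4.208)
K_NR = H**2 / (5.0 * M_E) * (3.0 / (8.0 * math.pi)) ** (2.0 / 3.0)
K_UR = H * C / 4.0 * (3.0 / (8.0 * math.pi)) ** (1.0 / 3.0)

def pressure_nr(n):
    """Nonrelativistic degenerate pressure, eq. (4.205)."""
    return K_NR * n ** (5.0 / 3.0)

def pressure_ur(n):
    """Ultrarelativistic degenerate pressure, eq. (4.208)."""
    return K_UR * n ** (4.0 / 3.0)

def x_fermi(n):
    """Dimensionless Fermi momentum p_F / (m_e c), from eq. (4.199)."""
    return (3.0 * n / (8.0 * math.pi)) ** (1.0 / 3.0) * H / (M_E * C)

# Density at which K_NR n^(5/3) = K_UR n^(4/3)
n_cross = (K_UR / K_NR) ** 3

print("K_NR    =", K_NR)
print("K_UR    =", K_UR)
print("n_cross =", n_cross, "m^-3")
print("p_F / (m_e c) at crossover =", x_fermi(n_cross))
```

Algebraically the crossover sits at $p_F = \frac{5}{4} m_e c$, independently of the numerical values of the constants, which is a convenient check of the two prefactors.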

Fermion gas classification. We have discussed some expressions describing a nonrelativistic or ultrarelativistic degenerate fermion gas.
We have derived our results with the assumption $T \to 0$, but a gas can behave very similarly with nonzero temperatures as well. What is the temperature threshold under which the gas behaves in a degenerate-like way? We will not discuss how the transition region looks, but if $k_B T \ll \epsilon_F$ then the gas behaves like a degenerate one, while if $k_B T \gg \epsilon_F$ then there will be many unfilled gaps in the phase space distribution, so the gas will not be degenerate.
Recall that $p_F \propto n_e^{1/3}$: therefore, in a log-log plot of temperature $T$ versus electron number density $n_e$ we can draw a line distinguishing the degenerate and nondegenerate cases, with the critical temperature becoming higher for higher $n_e$.
Having distinguished the degenerate and nondegenerate regions, we can distinguish the relativistic and nonrelativistic ones: for the nondegenerate case, as is usual, we reach the relativistic condition if we increase the temperature.
In the degenerate case this does not hold: as long as the gas is degenerate, the temperature does not really matter, and the gas becomes relativistic when the kinetic part of the Fermi energy $\epsilon_F$ becomes larger than the rest energy of the fermion. Since $\epsilon_F$ is only a function of the number density, this means that the gas can become relativistic at arbitrarily low temperatures as long as it is dense enough.

[Figure 4.3: Regions in which an electron gas is degenerate (red) and/or relativistic (blue), depending on its density $n_e$ and on its temperature $T$. Vertical axis: temperature in K, logarithmic, from $10^0$ to $10^{12}$. Horizontal axis: electron number density in m$^{-3}$, logarithmic, from $10^{25}$ to $10^{39}$. Labelled points: Sun core, white dwarf, iron core, iron at room temperature. The colors are decided defining "degenerate" as having $T < T_F = (\epsilon_F - m_e c^2)/k_B$ and "relativistic" as having the average kinetic energy of an electron, $E_k = (\gamma - 1) m_e c^2$, be larger than $m_e c^2$. The transition region for both spans an order of magnitude, symmetrically around the equality condition. Orders of magnitude are also given regarding common objects; "iron core" refers to the core of a massive star in the silicon burning phase.]

Application to the Sun. As we have discussed earlier, the core temperature, pressure and density of the Sun are related by the following relation:

$$ P_c = \frac{\rho_c}{\mu} k_B T_c . \tag{4.209} $$

The average mass which appears here is a function of the chemical composition of the interior: we can neglect all the metals and only consider the mass fractions of hydrogen ($x_1$) and of helium ($x_4$):

$$ \mu = 2 m_H \times \frac{1}{1 + 3 x_1 + 0.5 x_4} . \tag{4.210} $$

This expression works well in the limits of $x_1 = 1$ and $x_4 = 1$, but where does it come from? I would have expected $\mu = (x_1 m_H + x_4 m_{He})/2$...
The Clayton model gave us an expression for the central pressure $P_c$, which we turned into one for the central temperature $T_c$:

$$ P_c \approx \left( \frac{\pi}{36} \right)^{1/3} G M^{2/3} \rho_c^{4/3} \tag{4.211} $$

$$ k_B T_c \approx \left( \frac{\pi}{36} \right)^{1/3} G \mu M^{2/3} \rho_c^{1/3} , \tag{4.212} $$

however, when deriving it we did not consider the fact that the gas there may be at least partly degenerate.
Let us consider a different approximation: suppose that the electrons in the core are fully degenerate and nonrelativistic, while the ions (whose density is $n_i$, which by local neutrality is also equal to $n_e = \rho_c / \mu$) are completely classical (see footnote 8). Our estimate for the central pressure will need to account for both electrons and ions:

$$ P_c = K_{\mathrm{NR}} n_e^{5/3} + n_i k_B T_c . \tag{4.213} $$

Let us equate this expression with the one given by the Clayton model for the central pressure: we find

$$ \left( \frac{\pi}{36} \right)^{1/3} G M^{2/3} \rho_c^{4/3} = K_{\mathrm{NR}} \left( \frac{\rho_c}{m_H} \right)^{5/3} + \frac{\rho_c}{m_H} k_B T_c \tag{4.214} $$

$$ k_B T_c = \underbrace{\left( \frac{\pi}{36} \right)^{1/3} G m_H M^{2/3}}_{A} \rho_c^{1/3} - \underbrace{K_{\mathrm{NR}} m_H^{-2/3}}_{B} \rho_c^{2/3} . \tag{4.215} $$
Footnote 8: In order to see why it makes sense to consider the ions as classical while the electrons are degenerate, let us look at the fact that in the nonrelativistic approximation the Fermi energy is given by $\epsilon_F = p_F^2 / 2m \sim m^{-1} n^{2/3}$, so the critical temperature needed for a Fermi gas to become degenerate depends on the number density as well as the mass of the particle: for a higher-mass particle, the Fermi energy is lower.
The electrons being degenerate means that the temperature of the core is (roughly speaking) lower than their Fermi temperature; the Fermi energy of the ions however is at least three orders of magnitude lower, so it makes sense that the core temperature is not as low as the Fermi temperature of the ions.
In order to have some numbers at hand, with a number density like that of the core of the Sun the Fermi temperature for electrons is $\sim 11$ MK, while the Fermi temperature for protons is a measly $\sim 6000$ K. The actual temperature of the core is $T_c \sim 15$ MK, slightly above the Fermi temperature of electrons. It is close enough that modelling them as degenerate works, while the assumption of the ions being nondegenerate is completely valid.
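We can then ask what is the maximum temperature $T_c$ we can reach for a given mass $M$ if we vary the core density $\rho_c$: setting the derivative of (4.215) with respect to $\rho_c^{1/3}$ to zero, the maximum is reached at

$$ \rho_c^{\max} = (A / 2B)^3 \approx 5 \times 10^7 \ \mathrm{kg/m^3} \left( \frac{M}{M_\odot} \right)^2 , \tag{4.216} $$

where we have

$$ k_B T_c = \frac{A^2}{4B} = \left( \frac{\pi}{36} \right)^{2/3} \frac{G^2 m_H^{8/3}}{4 K_{\mathrm{NR}}} M^{4/3} \approx 5.7 \ \mathrm{keV} \left( \frac{M}{M_\odot} \right)^{4/3} . \tag{4.217} $$

This allows us to estimate the minimum mass a star needs to have in order to fuse hydrogen: we just need to set $T_c$ to be equal to the ignition temperature, $k_B T_c = k_B T_{\mathrm{ign}} \approx 1$ keV, and we find

$$ M_{\min} = \left( \frac{36}{\pi} \right)^{1/2} \left( \frac{4 K_{\mathrm{NR}}}{G^2 m_H^{8/3}} \right)^{3/4} \left( k_B T_{\mathrm{ign}} \right)^{3/4} \approx 0.27 M_\odot . \tag{4.218} $$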

This is a much better estimate than the one we found earlier since we are now accounting for the degenerate Fermi gas nature of the electrons in the core (this lowers the estimate, since it means that even at relatively low temperatures there will be electrons with high energy) and since we are computing the core density $\rho_c$ instead of the average density $\langle \rho \rangle$.
The estimate is... still not great really, right? It is still 3 times larger than the correct value of $0.08 M_\odot$! How do we account for such a discrepancy?
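The numbers in equations (4.217) and (4.218) can be reproduced directly from the definitions of $A$ and $B$. The sketch below assumes pure hydrogen (so that $m_H$ appears as in equation (4.214)) and uses rounded values for the physical constants; the function names are illustrative.

```python
import math

G = 6.67430e-11          # gravitational constant [m^3 kg^-1 s^-2]
H = 6.62607015e-34       # Planck constant [J s]
M_E = 9.1093837015e-31   # electron mass [kg]
M_H = 1.6735575e-27      # hydrogen atom mass [kg]
M_SUN = 1.98847e30       # solar mass [kg]
KEV = 1.602176634e-16    # 1 keV in joules

K_NR = H**2 / (5.0 * M_E) * (3.0 / (8.0 * math.pi)) ** (2.0 / 3.0)
B = K_NR * M_H ** (-2.0 / 3.0)

def a_coefficient(mass):
    """Coefficient A of eq. (4.215) for a star of the given mass [kg]."""
    return (math.pi / 36.0) ** (1.0 / 3.0) * G * M_H * mass ** (2.0 / 3.0)

def core_kt(mass, rho_c):
    """k_B T_c [J] from eq. (4.215): A rho_c^(1/3) - B rho_c^(2/3)."""
    return a_coefficient(mass) * rho_c ** (1.0 / 3.0) - B * rho_c ** (2.0 / 3.0)

def max_core_kt(mass):
    """Maximum of eq. (4.215) over rho_c: A^2 / (4 B), as in eq. (4.217)."""
    return a_coefficient(mass) ** 2 / (4.0 * B)

def minimum_mass(kt_ign):
    """Mass [kg] whose maximum core k_B T equals kt_ign [J], eq. (4.218)."""
    factor = 4.0 * K_NR / (G**2 * M_H ** (8.0 / 3.0))
    return math.sqrt(36.0 / math.pi) * (factor * kt_ign) ** 0.75

print("max k_B T_c for 1 M_sun:", max_core_kt(M_SUN) / KEV, "keV")
print("M_min for 1 keV ignition:", minimum_mass(1.0 * KEV) / M_SUN, "M_sun")
```

By construction, `max_core_kt(minimum_mass(kt))` returns `kt` again, so the two functions are inverses of each other.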

Expressing the result with coupling constants. The gravitational potential energy between two hydrogen nuclei separated by a distance equal to their (reduced!) Compton wavelength $r = \hbar / m_H c$ is

$$ E_g = \frac{G m_H^2}{r} = \frac{G m_H^3 c}{\hbar} . \tag{4.219} $$

Comparing this to the rest energy of a hydrogen nucleus, $E = m_H c^2$, is the way to calculate the gravitational coupling constant $\alpha_G$, a dimensionless parameter quantifying the "strength" of the gravitational interaction between hydrogen nuclei:

$$ \alpha_G = \frac{E_g}{E} = \frac{G m_H^2}{\hbar c} \sim 5.9 \times 10^{-39} . \tag{4.220} $$

In natural units, $\alpha_G = m_H^2 / m_P^2$, where $m_P$ is the Planck mass.
By a similar line of reasoning we find the electromagnetic coupling constant:

$$ \alpha_{EM} = \frac{e^2}{4\pi \epsilon_0 \hbar c} \approx \frac{1}{137} , \tag{4.221} $$

which is enormously greater.

In terms of the gravitational coupling constant the minimum mass we found can be written as

$$ M_{\min} \approx 16 \, \alpha_G^{-3/2} m_H \left( \frac{k_B T_{\mathrm{ign}}}{m_e c^2} \right)^{3/4} . \tag{4.222} $$

If $T_{\mathrm{ign}} \sim 1.5 \times 10^6$ K, one tenth of the temperature of the Sun, we find

$$ M_{\min} \sim 0.03 \, \alpha_G^{-3/2} m_H . \tag{4.223} $$

Is 0.1 keV really enough to reach ignition? This seems to contradict what was said earlier...

We can apply a similar line of reasoning to the formula we found for the maximum mass: taking equation (4.193) with a critical fraction of nonrelativistic matter of $\beta = 0.5$ and $\mu = 0.61 m_H$ we get a result which, once again, scales with $\alpha_G^{-3/2} m_H$:

$$ M_{\max} \approx 56 \, \alpha_G^{-3/2} m_H . \tag{4.224} $$

This hints at the fact that $m_* = \alpha_G^{-3/2} m_H$ is an important characteristic mass for all of stellar evolution.
This is around $1.85 M_\odot$, and it corresponds to a number of nucleons of

$$ N_* = \frac{m_*}{m_H} \approx 2 \times 10^{57} . \tag{4.225} $$
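The characteristic mass $m_*$ and the dimensionless prefactors of the minimum and maximum mass can be evaluated numerically as follows. This is a minimal sketch with rounded constants; the maximum-mass coefficient is obtained by solving equation (4.193) for $M$, using the radiation constant $a = \pi^2 k_B^4 / (15 \hbar^3 c^3)$, and the function names are illustrative.

```python
import math

G = 6.67430e-11          # gravitational constant [m^3 kg^-1 s^-2]
HBAR = 1.054571817e-34   # reduced Planck constant [J s]
C = 2.99792458e8         # speed of light [m/s]
K_B = 1.380649e-23       # Boltzmann constant [J/K]
M_E = 9.1093837015e-31   # electron mass [kg]
M_H = 1.6735575e-27      # hydrogen atom mass [kg]
M_SUN = 1.98847e30       # solar mass [kg]

alpha_g = G * M_H**2 / (HBAR * C)     # eq. (4.220)
m_star = alpha_g ** (-1.5) * M_H      # characteristic stellar mass
n_star = m_star / M_H                 # eq. (4.225)

def min_mass_coefficient(t_ign):
    """M_min / m_star from eq. (4.222), for an ignition temperature in K."""
    return 16.0 * (K_B * t_ign / (M_E * C**2)) ** 0.75

def max_mass_coefficient(beta, mu_over_mh):
    """M_max / m_star obtained by solving eq. (4.193) for M."""
    a_rad = math.pi**2 * K_B**4 / (15.0 * HBAR**3 * C**3)
    mu = mu_over_mh * M_H
    pressure_factor = math.sqrt(3.0 * (1.0 - beta) / (a_rad * beta**4))
    mass = (math.sqrt(36.0 / math.pi) * pressure_factor
            * (K_B / mu) ** 2 / G**1.5)
    return mass / m_star

print("alpha_G        =", alpha_g)
print("m_star / M_sun =", m_star / M_SUN)
print("N_star         =", n_star)
print("M_min / m_star =", min_mass_coefficient(1.5e6))
print("M_max / m_star =", max_mass_coefficient(0.5, 0.61))
```

With these inputs the maximum-mass coefficient comes out in the mid fifties, consistent with the value 56 quoted in equation (4.224) to within the rounding of the constants.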

4.4 Stellar remnants


4.4.1 Full degeneracy and white dwarfs
White dwarfs are the remnants of low-mass stars which have exhausted the elements they are able to fuse in their core. They glow, emitting thermal radiation, which causes their temperature to slowly decrease until they become black dwarfs. They are dim but observable; the closest one to the Solar System is Sirius B, a companion to the brightest star in the night sky.
They are of interest to us since they allow us to apply the theory of degenerate Fermi gases once more: they are very dense objects, since there is no fusion-induced pressure gradient inside them to balance gravity, and the electrons inside them form a degenerate gas.
The number density of electrons inside a white dwarf is given by

$$ n_e = Y_e \frac{\rho_c}{m_H} , \tag{4.226} $$

where $Y_e = (1 + x_1)/2$ quantifies the number of electrons per baryon ($x_1$ is the hydrogen mass fraction).
Why would $Y_e$ be given by that expression? Hydrogen has one electron per each baryon, but helium also has half an electron per baryon... If the white dwarf was exclusively hydrogen, would we not expect $Y_e = 1/2$?
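Let us start by assuming that the matter is nonrelativistic: then the pressure is given by

$$ P = K_{\mathrm{NR}} n_e^{5/3} = K_{\mathrm{NR}} \left( \frac{Y_e \rho_c}{m_H} \right)^{5/3} , \tag{4.227} $$

which as usual we compare to the results of the Clayton model:

$$ P_c = \left( \frac{\pi}{36} \right)^{1/3} G M^{2/3} \rho_c^{4/3} . \tag{4.228} $$

Equating these two we find

$$ \rho_c \approx \frac{3.1}{Y_e^5} \left( \frac{M}{m_*} \right)^2 \frac{m_H}{(h / m_e c)^3} . \tag{4.229} $$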
If, on the other hand, we were to assume that the matter is ultrarelativistic the pressure would be given by

$$ P = K_{\mathrm{UR}} n_e^{4/3} = K_{\mathrm{UR}} \left( \frac{Y_e \rho_c}{m_H} \right)^{4/3} , \tag{4.230} $$

so, instead of getting an expression for the central density $\rho_c$, we would find

$$ K_{\mathrm{UR}} \left( \frac{Y_e \rho_c}{m_H} \right)^{4/3} \approx \left( \frac{\pi}{36} \right)^{1/3} G M^{2/3} \rho_c^{4/3} : \tag{4.231} $$

in this limit the expression becomes independent of $\rho_c$!
This gives us a limit mass: as we increase the mass of a white dwarf which is not relativistic we increase its density and thus its Fermi momentum, making it closer to being relativistic, and this is the mass we get for the fully relativistic configuration (which is unstable because of the usual binding energy considerations).
The limit is known as the Chandrasekhar mass, the largest mass at which a fully degenerate white dwarf can support itself:

$$ M_{\mathrm{Ch}} = \left( \frac{36}{\pi} \right)^{1/2} \left( \frac{Y_e}{m_H} \right)^2 \left( \frac{K_{\mathrm{UR}}}{G} \right)^{3/2} \approx 2.3 \, Y_e^2 m_* \approx 4.3 \, Y_e^2 M_\odot \approx 1.4 M_\odot . \tag{4.232} $$
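The dimensionless prefactors in equation (4.232) follow from the definition of $K_{\mathrm{UR}}$ alone. The sketch below evaluates the first equality of (4.232) for a given $Y_e$ and expresses it in units of $m_*$ and of $M_\odot$; constants are rounded and the function name is illustrative.

```python
import math

G = 6.67430e-11          # gravitational constant [m^3 kg^-1 s^-2]
H = 6.62607015e-34       # Planck constant [J s]
HBAR = H / (2.0 * math.pi)
C = 2.99792458e8         # speed of light [m/s]
M_H = 1.6735575e-27      # hydrogen atom mass [kg]
M_SUN = 1.98847e30       # solar mass [kg]

K_UR = H * C / 4.0 * (3.0 / (8.0 * math.pi)) ** (1.0 / 3.0)
m_star = (G * M_H**2 / (HBAR * C)) ** (-1.5) * M_H

def limit_mass(y_e):
    """First equality of eq. (4.232), in kg, for electron fraction y_e."""
    return math.sqrt(36.0 / math.pi) * (y_e / M_H) ** 2 * (K_UR / G) ** 1.5

print("coefficient in units of m_star:", limit_mass(1.0) / m_star)
print("coefficient in units of M_sun :", limit_mass(1.0) / M_SUN)
```

The limit mass scales as $Y_e^2$, so halving the electron fraction lowers it by a factor of four.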

4.4.2 The Chandrasekhar limit in more detail


We want to derive the Chandrasekhar limit in a more precise manner.
Instead of approximating the gas as either ultrarelativistic or nonrelativistic, we can use the correct expression for the particle energy in the integral for the pressure:

$$ P = g_* \frac{4\pi}{3 h^3} \int_0^{p_F} dp \, p^2 \, \frac{p^2 c^2}{\epsilon_p} , \tag{4.233} $$

with $\epsilon_p = \left( p^2 c^2 + m^2 c^4 \right)^{1/2}$.
We change variables to the dimensionless $x = p / (m_e c)$ and substitute $g_* = 2$, since electrons have spin 1/2:

$$ P = \frac{8\pi}{3 h^3} m_e^4 c^5 \int_0^{x_F} dx \, \frac{x^4}{(1 + x^2)^{1/2}} . \tag{4.234} $$

The variable $x_F$ is given by:

$$ x_F = \frac{p_F}{m_e c} = \left( \frac{3 n_e}{8\pi} \right)^{1/3} \frac{h}{m_e c} = \left( \frac{3 Y_e \rho_c}{8\pi m_H} \right)^{1/3} \frac{h}{m_e c} , \tag{4.235} $$

and, since the electrons are fully degenerate, $x_F \gg 1$ corresponds to the ultrarelativistic case while $x_F \ll 1$ corresponds to the nonrelativistic case.
Now, as $x_F \to \infty$ the integral is asymptotically

$$ \int_0^{x_F} \frac{x^4}{\sqrt{1 + x^2}} \, dx \sim \frac{x_F^4}{4} , \tag{4.236} $$

so we define

$$ I(x_F) = \frac{4}{x_F^4} \int_0^{x_F} \frac{x^4}{(1 + x^2)^{1/2}} \, dx \tag{4.237} $$

$$ = \frac{3}{2 x_F^4} \left[ x_F (1 + x_F^2)^{1/2} \left( \frac{2 x_F^2}{3} - 1 \right) + \log\left( x_F + (1 + x_F^2)^{1/2} \right) \right] , \tag{4.238} $$

which approaches 1 as $x_F \to \infty$. Then, we can write the pressure as

$$ P = \frac{8\pi}{3 h^3} m_e^4 c^5 \, \frac{x_F^4}{4} \, I(x_F) \tag{4.239} $$

$$ = K_{\mathrm{UR}} n_e^{4/3} I(x_F) . \tag{4.240} $$

We can see that this manipulation works by explicitly doing the calculation, but we can also just observe that the limiting case $x_F \to \infty$ must reduce to the ultrarelativistic approximation, so the prefactor must be the same.
In the nonrelativistic case, $x_F \ll 1$, we have $I(x_F) \sim 4 x_F / 5$, and since $x_F \sim n_e^{1/3}$ this yields $P \sim n_e^{5/3}$ as expected. This expression interpolates between the two limits.
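The interpolating function $I(x_F)$ and its two limits can be checked numerically. The sketch below compares the closed form (4.238) with a midpoint-rule evaluation of the definition (4.237); the function names and step count are illustrative.

```python
import math

def interp_closed(x):
    """Closed form of I(x_F), eq. (4.238); log(x + sqrt(1 + x^2)) = asinh(x)."""
    root = math.sqrt(1.0 + x * x)
    bracket = x * root * (2.0 * x * x / 3.0 - 1.0) + math.asinh(x)
    return 1.5 * bracket / x**4

def interp_numeric(x, steps=20000):
    """Midpoint-rule evaluation of the definition, eq. (4.237)."""
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t**4 / math.sqrt(1.0 + t * t)
    return 4.0 * total * h / x**4

def mass_ratio(x):
    """M / M_Ch = I(x_F)^(3/2)."""
    return interp_closed(x) ** 1.5

for x in (0.1, 1.0, 2.0, 10.0, 1000.0):
    print(x, interp_closed(x), interp_numeric(x), mass_ratio(x))
```

For small $x_F$ the closed form suffers from cancellation between its two terms, so for $x_F$ much below $10^{-2}$ the series $I \approx 4 x_F / 5$ is the safer choice in practice.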
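Then, we can apply the same reasoning as before: we compare the pressure of the Fermi gas with the prediction of the Clayton model to find

$$ K_{\mathrm{UR}} \left( \frac{Y_e \rho_c}{m_H} \right)^{4/3} I(x_F) \approx \left( \frac{\pi}{36} \right)^{1/3} G M^{2/3} \rho_c^{4/3} , \tag{4.241} $$

so we can extract the mass:

$$ M = I(x_F)^{3/2} M_{\mathrm{Ch}} , \tag{4.242} $$

where $M_{\mathrm{Ch}} \approx 1.4 M_\odot$ is the Chandrasekhar mass we defined earlier.

[Figure 4.4: A plot of $M / M_{\mathrm{Ch}}$ against $x_F \propto n_e^{1/3} \propto \rho_c^{1/3}$. Vertical axis: $x_F$, from 0 to 10. Horizontal axis: $M / M_{\mathrm{Ch}} = I(x_F)^{3/2}$, from 0 to 1.]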


We can see that as we increase the mass approaching $M_{\mathrm{Ch}}$ the central density diverges:
this is not physically possible, of course, so as it gets higher some usually-prohibited process
takes over; typically for white dwarfs this is electron capture, by which electrons and protons
combine into neutrons, forming a neutron star.
We have not used any particular characteristics of electrons beyond their being fermions,
so this line of reasoning may be used to also bound the mass of a neutron star, since neutrons
are fermions as well. The issue, as we will see, is that general-relativistic corrections become
important in that case, since neutron stars have a much higher density.

4.4.3 White dwarf characteristics


Now that we have a model for the equation of state at the core of a fully degenerate object like a white dwarf, we can try to extract some of its characteristics: it is reasonable (from more complete studies of the object) to estimate the mean density as 1/6 of the central one,

$$ \langle \rho \rangle = \frac{1}{6} \rho_c = \frac{0.51}{Y_e^5} \left( \frac{M}{m_*} \right)^2 \frac{m_H}{(h / m_e c)^3} , \tag{4.243} $$

so we can estimate the radius as

$$ R = \left( \frac{3M}{4\pi \langle \rho \rangle} \right)^{1/3} \approx 0.77 \, Y_e^{5/3} \left( \frac{M}{m_*} \right)^{-1/3} \underbrace{\alpha_G^{-1/2} \frac{h}{m_e c}}_{\ell_{\mathrm{WD}}} , \tag{4.244} $$

which is of the same order of magnitude as the characteristic length

$$ \ell_{\mathrm{WD}} = \alpha_G^{-1/2} \frac{h}{m_e c} \approx 3 \times 10^7 \ \mathrm{m} \approx 0.04 R_\odot , \tag{4.245} $$

so, taking $Y_e = 0.5$ we can express the radius as

$$ R = \frac{R_\odot}{74} \left( \frac{M}{M_\odot} \right)^{-1/3} . \tag{4.246} $$

Using the radius we can estimate the luminosity of the thermal radiation emitted by these bodies:

$$ L = 4\pi R^2 \sigma T_E^4 = \frac{1}{74^2} \left( \frac{M}{M_\odot} \right)^{-2/3} \left( \frac{T_E}{6000 \ \mathrm{K}} \right)^4 L_\odot , \tag{4.247} $$

so if we take a typical effective temperature of around $10^4$ K (recall that white dwarfs are in the blue part of the HR diagram) and $M = 0.4 M_\odot$ we get $L \approx 3 \times 10^{-3} L_\odot$: they are very dim.
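The radius and luminosity estimates above can be reproduced with a few lines of code. The sketch below evaluates equations (4.244) and (4.247) in SI units; the solar radius and luminosity are the nominal IAU values, the other constants are rounded, and the function names are illustrative.

```python
import math

G = 6.67430e-11          # gravitational constant [m^3 kg^-1 s^-2]
H = 6.62607015e-34       # Planck constant [J s]
HBAR = H / (2.0 * math.pi)
C = 2.99792458e8         # speed of light [m/s]
M_E = 9.1093837015e-31   # electron mass [kg]
M_H = 1.6735575e-27      # hydrogen atom mass [kg]
SIGMA = 5.670374419e-8   # Stefan-Boltzmann constant [W m^-2 K^-4]
M_SUN = 1.98847e30       # solar mass [kg]
R_SUN = 6.957e8          # nominal solar radius [m]
L_SUN = 3.828e26         # nominal solar luminosity [W]

alpha_g = G * M_H**2 / (HBAR * C)
m_star = alpha_g ** (-1.5) * M_H
l_wd = alpha_g ** (-0.5) * H / (M_E * C)   # eq. (4.245)

def wd_radius(mass, y_e=0.5):
    """White dwarf radius [m] from eq. (4.244)."""
    return 0.77 * y_e ** (5.0 / 3.0) * (mass / m_star) ** (-1.0 / 3.0) * l_wd

def wd_luminosity(mass, t_eff, y_e=0.5):
    """Blackbody luminosity [W], 4 pi R^2 sigma T^4."""
    return 4.0 * math.pi * wd_radius(mass, y_e) ** 2 * SIGMA * t_eff**4

print("l_WD / R_sun          =", l_wd / R_SUN)
print("R_sun / R(1 M_sun)    =", R_SUN / wd_radius(M_SUN))
print("L(0.4 M_sun, 1e4 K)   =", wd_luminosity(0.4 * M_SUN, 1.0e4) / L_SUN, "L_sun")
```

Note that the radius decreases with increasing mass, which is the characteristic behaviour of degenerate objects.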

4.4.4 Neutron stars


The first thing to consider when discussing neutron stars is the fact that free neutrons, as we discussed, are unstable, with lifetimes on the order of 10 min. How can a neutron star be stable then? The process through which neutrons decay is:

$$ n \to p + e^- + \bar{\nu}_e , \tag{4.248} $$

and the crucial fact is that neutron stars are composed of a degenerate neutron gas as well as a degenerate ultrarelativistic electron gas: the Fermi temperature is much higher than $m_e c^2 / k_B$ [Yak+, eq. 1]. While the electron gas is ultrarelativistic the neutron gas is not; the energy released by neutron decay is of the order of 800 keV, so the momentum an emitted electron would have would be well within the Fermi sphere, which is already full!
Thus, neutron decay is inhibited; on the other hand, electron capture, which looks like

$$ e^- + p \to n + \nu_e , \tag{4.249} $$

is favoured, and it can increase the number of neutrons.

We can look at the Saha formula to get numerical estimates for the equilibrium between these processes. The chemical potential of neutrinos can be neglected, therefore we find

$$ \mu_n = \mu_p + \mu_e , \tag{4.250} $$

and, since as we saw earlier the chemical potential of a Fermi gas is its Fermi energy, the same equation holds for their Fermi energies:

$$ \epsilon_{F,n} = \epsilon_{F,p} + \epsilon_{F,e} . \tag{4.251} $$

How does this translate into the number densities of the three constituents? We have the constraint that the number density of protons must equal the number density of electrons in order to ensure local neutrality, while there is no constraint on the ratio between neutrons and protons.
We will not get into the calculation, but due to the slight mass imbalance $m_n > m_p$ we have a large difference in the number densities: typically,

$$ n_p = n_e = \frac{n_n}{200} . \tag{4.252} $$

Since the overwhelming majority of the particles in the neutron star are neutrons, we can make our calculations with the approximation that the number of neutrons per baryon is $Y_n \approx 1$:

$$ n_n = Y_n \frac{\rho_c}{m_n} \approx \frac{\rho_c}{m_n} . \tag{4.253} $$

Typical values for these densities are $\rho_c \approx 2 \times 10^{17}$ kg/m$^3$ and $n_n \approx 10^{44}$ m$^{-3}$.
With a similar reasoning to the one we applied to white dwarfs we can calculate

$$ \rho_c^{\mathrm{NS}} \approx 3.1 \left( \frac{M}{M_*} \right)^2 \frac{m_n}{(h / m_n c)^3} , \tag{4.254} $$

which, due to the fact that $m_n \gg m_e$, is much larger than the corresponding result for white dwarfs

$$ \rho_c^{\mathrm{WD}} \approx \frac{3.1}{Y_e^5} \left( \frac{M}{M_*} \right)^2 \frac{m_H}{(h / m_e c)^3} . \tag{4.255} $$

In both cases, we used the characteristic mass $M_* = \alpha_G^{-3/2} m_n \approx 1.85 M_\odot$.


As we did before, knowing the central density we can estimate the average one, which then allows us to calculate the radius:

$$ R = 0.77 \left( \frac{M_*}{M} \right)^{1/3} \alpha_G^{-1/2} \frac{h}{m_n c} , \tag{4.256} $$

where the characteristic length is given by

$$ L_n = \alpha_G^{-1/2} \frac{h}{m_n c} \approx 17 \ \mathrm{km} \approx \frac{L_e}{1800} , \tag{4.257} $$

about 1800 times smaller (a factor $m_n / m_e$) than the corresponding length scale for white dwarfs (denoted with an $e$ for "electron degeneracy").
Finally, we can compute a maximum mass: $M_{\max}^{\mathrm{NS}} = 3.1 M_* = 5.8 M_\odot$.

We have neglected general-relativistic effects, but would they be relevant? The quantity we need to compute is the ratio of the Schwarzschild radius to the actual radius of the NS:

$$ \frac{R_{\mathrm{Schw}}}{R} = \frac{2GM}{R c^2} \approx 0.4 \left( \frac{M}{M_*} \right)^{4/3} , \tag{4.258} $$

which is large, of order 1! We have not computed a minimum mass for a neutron star, but typically their mass is of the order of the Chandrasekhar mass, $M_{\mathrm{NS}} \approx 1.4 M_\odot$, because of how they form in supernovae (so, $M / M_* \sim 1$). The neutron star might not be small enough to actually collapse into a black hole, but surely general relativity must be considered when describing its dynamics.
Neutron stars were first detected as very regular radio pulses: pulsars. These are due to the very strong ($\sim 10^8$ T) magnetic fields accelerating particles in beams aligned with the magnetic poles of the NS; these are not aligned with the rotation axis of the NS, so the beams constantly change the direction of emission, sweeping out a cone, and the Earth can happen to be in this cone.
In order to estimate how fast these pulses can be (in a classical and rough way), let us assume that the NS is rotating barely below a speed which would disintegrate it, so that its binding energy equals its rotational energy: neglecting the order-1 numerical factors we have

$$ \frac{G M^2}{R} \sim M R^2 \omega_{\max}^2 , \tag{4.259} $$

we can then compute the minimum period using the expression we have for the radius in terms of $M$ and $M_*$:

$$ \tau_{\min} = \frac{2\pi}{\omega_{\max}} \approx 2\pi \left( \frac{R^3}{GM} \right)^{1/2} \approx 11 \, \alpha_G^{-1/2} \frac{h}{m_n c^2} \frac{M_*}{M} \approx 0.6 \ \mathrm{ms} \ \frac{M_*}{M} . \tag{4.260} $$

The signals produced by pulsars are of this order of magnitude — we have observed
“millisecond pulsars”, so NSs do indeed rotate close to these extremely high rates.
Neutron stars can also produce gravitational waves in their rotation, as long as they
have a slight asymmetry (a “mountain”, although their typical sizes are of the order of
centimeters); these would have frequencies in the Hz range. We cannot detect these with
ground-based detectors, but we might be able to do so with space-based ones.

4.4.5 Relativistic corrections to the equation of state


Black holes are objects which are so dense that their radius is smaller than the Schwarzschild radius:

$$ R < \frac{2GM}{c^2} = R_{\mathrm{Sch}} . \tag{4.261} $$

General Relativity predicts that when this is the case an event horizon forms, a surface which acts as a causal boundary: as is almost cliché, not even light can escape.
Let us see how the classical description of a star fails for objects with relativistic masses.
The equation of hydrostatic balance, derived under classical assumptions, reads:

$$ \frac{dP}{dr} = -\frac{G m \rho}{r^2} . \tag{4.262} $$

If we seek a relativistic analogue under similar assumptions (spherical symmetry and equilibrium) we find the Tolman-Oppenheimer-Volkov equation:

$$ \frac{dP}{dr} = -\frac{G m \rho}{r^2} \left( 1 + \frac{P}{\rho c^2} \right) \left( 1 + \frac{4\pi r^3 P}{m c^2} \right) \left( 1 - \frac{2Gm}{r c^2} \right)^{-1} . \tag{4.263} $$

This equation is exact under the assumptions we mentioned. In the classical limit ($2Gm / c^2 \ll r$ and $P \ll \rho c^2$) this reduces to the classical hydrostatic balance equation. The first correction is reminiscent of the second Friedmann equation: the pressure itself contributes to the inertia of the system.
Let us see how their predictions differ assuming constant density, $\rho \equiv \rho_0$. In the Newtonian case the mass below a radius $r$ is given by

$$ m(r) = \frac{4\pi}{3} \rho_0 r^3 , \tag{4.264} $$

using which we can integrate the hydrostatic balance equation (from the surface, where the pressure vanishes) to find:

$$ P(r) = \int_R^r \underbrace{\left( -\frac{G m \rho_0}{r'^2} \right)}_{dP/dr'} dr' = \frac{2\pi G}{3} \rho_0^2 \left( R^2 - r^2 \right) . \tag{4.265} $$

Then, the central pressure is given by

$$ P_c^{\mathrm{classical}} = \frac{2\pi}{3} G \rho_0^2 R^2 = \left( \frac{\pi}{6} \right)^{1/3} G M^{2/3} \rho_c^{4/3} , \tag{4.266} $$

where for constant density $\rho_c = \rho_0$.
Keeping the constant-density assumption, this can be done analytically in the relativistic case as well! We get

$$ P(r) = \rho_0 c^2 \left( \frac{\left( 1 - 2GMr^2 / R^3 c^2 \right)^{1/2} - \left( 1 - 2GM / R c^2 \right)^{1/2}}{3 \left( 1 - 2GM / R c^2 \right)^{1/2} - \left( 1 - 2GMr^2 / R^3 c^2 \right)^{1/2}} \right) \tag{4.267} $$

$$ P_c^{\mathrm{relativistic}} = \rho_0 c^2 \, \frac{1 - \sqrt{1 - \frac{2GM}{R c^2}}}{3 \sqrt{1 - \frac{2GM}{R c^2}} - 1} . \tag{4.268} $$

In the classical model we had a finite central pressure for each value of the mass and density; now, instead, in order for the pressure to not diverge we must require

$$ R > \frac{9}{8} \frac{2GM}{c^2} . \tag{4.269} $$

This is known as the Buchdahl bound; the radius of a non-black-hole (with constant density) cannot be arbitrarily close to the Schwarzschild radius: it must be at least 12.5 % larger in order for the pressure to not diverge at the center (see footnote 9).
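The difference between the two central pressures is easiest to see in terms of the compactness $u = 2GM / (R c^2)$: using $M = \frac{4\pi}{3} \rho_0 R^3$, the Newtonian result (4.266) becomes $P_c = \rho_0 c^2 \, u / 4$, while (4.268) depends on $u$ only through $\sqrt{1 - u}$. The sketch below compares the two in units of $\rho_0 c^2$; the function names and the sample values of $u$ are illustrative.

```python
import math

BUCHDAHL_LIMIT = 8.0 / 9.0  # maximum compactness 2GM/(R c^2), eq. (4.269)

def central_pressure_newtonian(u):
    """P_c / (rho_0 c^2) from eq. (4.266), written via u = 2GM/(R c^2)."""
    return u / 4.0

def central_pressure_relativistic(u):
    """P_c / (rho_0 c^2) from eq. (4.268); infinite at or beyond the bound."""
    if u >= BUCHDAHL_LIMIT:
        return math.inf
    s = math.sqrt(1.0 - u)
    return (1.0 - s) / (3.0 * s - 1.0)

for u in (1e-4, 0.1, 0.4, 0.5, 0.8, 0.88):
    newt = central_pressure_newtonian(u)
    rel = central_pressure_relativistic(u)
    print(u, newt, rel, rel / newt)
```

For a compactness of order 0.4, as estimated above for neutron stars, the relativistic central pressure is already well above the Newtonian one, and the ratio grows without bound as $u \to 8/9$.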
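This also yields a mass limit for neutron stars. Let us estimate the constant density $\rho_0$ by assuming that each neutron takes up a sphere of radius its Compton wavelength, $r_n \approx h / m_n c$, so

$$ \rho_0 \approx \frac{m_n}{\frac{4\pi}{3} r_n^3} \approx \frac{3 m_n^4 c^3}{4\pi h^3} , \tag{4.270} $$

so that, using the fact that $\rho_0 = M / (4\pi R^3 / 3)$, we can start manipulating the Buchdahl bound:

$$ M < \frac{4 c^2}{9G} R = \frac{4 c^2}{9G} \left( \frac{4\pi \rho_0}{3M} \right)^{-1/3} \tag{4.271} $$

$$ M^{2/3} < \frac{4 c^2}{9G} \left( \frac{4\pi}{3} \frac{3 m_n^4 c^3}{4\pi h^3} \right)^{-1/3} \tag{4.272} $$

$$ M < \left( \frac{8\pi}{9} \right)^{3/2} M_* . \tag{4.273} $$

A more accurate estimate would be given by $r_n \approx 0.7 h / m_n c$.
This yields a bound on the order of $M \lesssim 5 M_\odot$.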

Footnote 9: More stringent bounds can also be derived: in Lattimer and Prakash [LP07, fig. 2] a plot is shown of possible equations of state of neutron stars in a mass versus radius plane. The bound we derived is denoted as $P < \infty$, and we can also see a "causality" bound, which is related to the speed of sound in an ultrarelativistic medium. Realistic equations of state can reach approximately $R \gtrsim 3GM / c^2$.
Chapter 5

Structure formation

5.1 The nonlinear evolution of a spherical perturbation


We want to discuss how a dark matter halo might form and evolve. In order to do so, we
must make certain simplifying assumptions about the shape of the perturbation we want to
consider. We will assume spherical symmetry: the fact that this is reasonable is not obvious,
and was the subject of debate historically; the American school used spherical models, while
the Russian school studied “pancakes”, ellipsoids for which one axis was much shorter than
the others.
A result which can be derived is that, starting from a generic ellipsoid, if it is less dense
than the background it will tend towards a sphere, while it will become less spherical if it is
denser than the background.
So, our model being over-dense and spherical will be kind of unphysical, right?
We will discuss the evolution of a spherical perturbation in the shape of a top-hat: a
constant-density spherical region, embedded in a universe whose density is constant as well,
and which is described by the usual FLRW metric, assuming zero spatial curvature and
matter dominance.
Let us denote the background density as $\rho_b(t)$; by the results we know about the Einstein-De Sitter model this will be given by $\rho_b(t) = 1 / (6\pi G t^2)$.
In the perturbed region we will have a different density, $\rho(\vec{x}, t)$. As we did earlier, when discussing gravitational collapse, we introduce the dimensionless density perturbation

$$ \delta(\vec{x}, t) = \frac{\rho(\vec{x}, t) - \rho_b(t)}{\rho_b(t)} . \tag{5.1} $$

This can have values from $-1$ to $+\infty$; when it is positive we have an over-density while when it is negative we have an under-density. We will assume that $0 < \delta \ll 1$: a small over-density, which will allow us to apply perturbation theory.
Earlier we found that if we take the linear order of the equations of motion for a perturbation in an Einstein-De Sitter universe we get a growing mode $\delta \propto t^{2/3}$ and a decaying mode $\delta \propto t^{-1}$, for which the velocities read $v \propto t^{1/3}$ and $v \propto t^{-4/3}$ respectively.

If $t_i$ is the initial time, then the density perturbation at a time $t$ is given by

$$ \delta(t) = \delta_+(t_i) \left( \frac{t}{t_i} \right)^{2/3} + \delta_-(t_i) \left( \frac{t}{t_i} \right)^{-1} . \tag{5.2} $$

The linearized continuity equation (4.110) allows us to express the velocity in terms of the derivative of the density:

$$ v = i \frac{\dot{\delta}}{k} a \propto \frac{2}{3} \delta_+(t_i) \left( \frac{t}{t_i} \right)^{1/3} - \delta_-(t_i) \left( \frac{t}{t_i} \right)^{-4/3} , \tag{5.3} $$

since $a \propto t^{2/3}$.
We suppose that at $t = t_i$ we have unperturbed Hubble flow: the comoving coordinates of each particle are constant, so $v(t_i) = 0$. Imposing this for the equation we just found for the velocity we get

$$ \delta_-(t_i) = \frac{2}{3} \delta_+(t_i) , \tag{5.4} $$

so we can express the initial density as

$$ \delta(t_i) = \delta_i = \delta_+ + \delta_- = \frac{5}{3} \delta_+ . \tag{5.5} $$

Our perturbation can be dealt with as if it were a local closed FLRW universe: if the background universe is flat, with $k = 0$, then $\Omega_{\mathrm{bg}} = 1$, so in the perturbed "bubble" the perturbed density parameter will be

$$ \Omega_p(t_i) = 1 + \delta_i > 1 . \tag{5.6} $$
W p (ti ) = 1 + di > 1 . (5.6)

When studying curved models, we have shown that they exhibit a turnaround time after which the scale factor decreases: since our bubble behaves like a closed matter-dominated universe, it will do the same. After the turnaround, a closed universe collapses to a single point at a cosmic time $t_{\mathrm{collapse}} = 2 t_{\mathrm{turnaround}}$. This will not actually happen in our perturbed model: as the cloud nears collapse oscillations dissipate energy, so that the equilibrium configuration is a finitely-dense cloud with a radius $R_{\mathrm{vir}}$, where "vir" stands for "virialized". These effects take over at the very end of the collapse, so we can estimate the time until equilibrium is reached as twice the turnaround time.
A corollary of Birkhoff's theorem tells us that we can then treat this region using the Friedmann equations with $k = +1$: the first FE reads

$$ \dot{a}^2 = \frac{8\pi G}{3} \rho a^2 - k , \tag{5.7} $$

which we can write as

$$ -k = (1 - \Omega_p) a^2 H^2 . \tag{5.8} $$
143
This allows us to write the following equation by substituting this equation evaluated at
the initial time ti : k = (1 W p (ti )) a2i Hi2 :
ȧ2 = H 2 W p a2 + (1 W p (ti )) a2i Hi2 (5.9)
ȧ2 a2
2
= H 2 W p 2 + (1 W p (ti )) Hi2 (5.10)
ai ai
✓ ◆
ȧ2 2 ai
= Hi W (
p it ) + ( 1 W (
p it )) , (5.11)
a2i a
where we introduce the index p to denote the fact that we are talking about a perturbation.
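The factor $a_i / a$ in the last step comes from the matter scaling $\rho_p \propto a^{-3}$: since $H^2 \Omega_p = 8\pi G \rho_p / 3$, we have $H^2 \Omega_p a^2 = H_i^2\, \Omega_p(t_i)\, a_i^3 / a$, and dividing by $a_i^2$ gives the first term of (5.11).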
The perturbed density $\rho_p(t)$ evolves like
\begin{align}
\rho_p(t) &= \rho_p(t_i) \left(\frac{a_p(t_i)}{a_p(t)}\right)^3 \tag{5.12a} \\
&= \rho_b(t_i)\, \Omega_p(t_i) \left(\frac{a_p(t_i)}{a_p(t)}\right)^3 . \tag{5.12b}
\end{align}

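If we define the turnaround time $t_m$ by imposing $\dot a(t_m) = 0$ in equation (5.11), we get $\Omega_p(t_i)\, a_i / a(t_m) = \Omega_p(t_i) - 1$, that is, $a_i / a(t_m) = (\Omega_p(t_i) - 1) / \Omega_p(t_i)$. Inserting this ratio into (5.12b) we find a density equal to
\begin{align}
\rho_p(t_m) &= \rho_b(t_i)\, \Omega_p(t_i) \left(\frac{\Omega_p(t_i) - 1}{\Omega_p(t_i)}\right)^3 \tag{5.13a} \\
&= \rho_b(t_i)\, \frac{\left(\Omega_p(t_i) - 1\right)^3}{\Omega_p(t_i)^2} . \tag{5.13b}
\end{align}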
As we discussed with curved models, the turnaround time can be calculated by finding a parametric solution to the Friedmann equations: this yields, changing the reference time from “0” (now) to the initial moment $t_i$,
$$ t_m = \frac{\pi}{2 H_i}\, \frac{\Omega_i}{(\Omega_i - 1)^{3/2}} = \frac{\pi}{2 H_i} \left(\frac{\rho_b(t_i)}{\rho_p(t_m)}\right)^{1/2} , \tag{5.14} $$
where $\Omega_i = \Omega_p(t_i)$ and we inserted the expression we found for the density calculated at the turnaround moment.
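As a cross-check of (5.14), one can integrate $dt = da / \dot a$ numerically from $a = 0$ up to the turnaround scale factor, using (5.11). The sketch below assumes the normalization $a_i = 1$ and uses the substitution $a = a_m \sin^2\varphi$ to remove the endpoint singularities; the function names and the midpoint rule are illustrative choices, not part of the notes.

```python
import math

def turnaround_time_numeric(omega_i, hubble_i=1.0, steps=20_000):
    """Integrate dt = da / adot from a = 0 to the turnaround scale factor.

    Uses adot^2 = H_i^2 [Omega_i / a + 1 - Omega_i] (equation (5.11) with
    a_i = 1, an assumed normalization). With a = a_m sin^2(phi) the
    integrand becomes 2 a_m sin^2(phi) / (H_i sqrt(Omega_i - 1)).
    """
    a_m = omega_i / (omega_i - 1.0)  # scale factor at which adot vanishes
    prefactor = 2.0 * a_m / (hubble_i * math.sqrt(omega_i - 1.0))
    d_phi = (math.pi / 2.0) / steps
    # Midpoint rule for the integral of sin^2(phi) over [0, pi/2].
    total = sum(math.sin((j + 0.5) * d_phi) ** 2 for j in range(steps))
    return prefactor * total * d_phi

def turnaround_time_analytic(omega_i, hubble_i=1.0):
    """Closed-form turnaround time, first equality of equation (5.14)."""
    return math.pi * omega_i / (2.0 * hubble_i * (omega_i - 1.0) ** 1.5)

if __name__ == "__main__":
    for omega in (1.001, 1.01, 1.1):
        print(omega, turnaround_time_numeric(omega), turnaround_time_analytic(omega))
```

The two columns should agree to within floating-point accuracy for any $\Omega_i > 1$.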
Since we assumed that at the initial time there was unperturbed Hubble flow, at that time the Hubble parameter inside and outside was the same, and we can compute it through the first Friedmann equation applied to the background spacetime:
$$ H^2(t_i) = \frac{8\pi G}{3}\, \rho_b(t_i) , \tag{5.15} $$
so we have a cancellation, since the ratio $\rho_b(t_i)^{1/2} / H_i$ yields a constant:
\begin{align}
t_m &= \frac{\pi}{2 H_i} \left(\frac{\rho_b(t_i)}{\rho_p(t_m)}\right)^{1/2} = \left(\frac{3\pi}{32\, G \rho_p(t_m)}\right)^{1/2} \tag{5.16} \\
\rho_p(t_m) &= \frac{3\pi}{32\, G t_m^2} , \tag{5.17}
\end{align}
which holds inside the bubble.
Outside it, the density evolves according to the usual law:
$$ \rho_b(t_m) = \frac{1}{6\pi G t_m^2} . \tag{5.18} $$
We are implicitly using the synchronous gauge, in which the proper time defines the time
coordinate for each observer. The exact solution to the Einstein Field Equations we found is
known as the Lemaître-Tolman-Bondi solution.
We can then ask: at the turnaround time, how much larger is the interior density than the exterior one? This is given by
$$ 1 + \delta_p(t_m) = \chi(t_m) = \frac{\rho_p(t_m)}{\rho_b(t_m)} = \frac{3\pi}{32 G}\, 6\pi G = \left(\frac{3\pi}{4}\right)^2 \approx 5.6 . \tag{5.19} $$

This means that $\delta_p(t_m) \approx 4.6$. This certainly is not smaller than 1, so it makes sense to have sought an exact solution instead of a perturbative one.
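The arithmetic behind (5.19) can be checked in a few lines. This is only a sketch: the units of $G$ and $t_m$ are arbitrary (both cancel in the ratio), and the variable names are ours.

```python
import math

G = 1.0    # gravitational constant in arbitrary units; it cancels in the ratio
t_m = 1.0  # turnaround time in arbitrary units; it cancels as well

rho_p = 3.0 * math.pi / (32.0 * G * t_m**2)  # inside the bubble, eq. (5.17)
rho_b = 1.0 / (6.0 * math.pi * G * t_m**2)   # background, eq. (5.18)

chi = rho_p / rho_b   # density contrast 1 + delta_p at turnaround
delta_p = chi - 1.0   # nonlinear overdensity at turnaround

print(f"chi(t_m)     = {chi:.2f}")
print(f"delta_p(t_m) = {delta_p:.2f}")
```

The ratio equals $(3\pi/4)^2 = 9\pi^2/16$ independently of the chosen units.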

How would linear theory have fared? It would not have predicted a turnaround: in it the growing mode keeps growing, so in order to make the comparison we will need to use the expression for the turnaround time from the exact model.
The density perturbation at the turnaround time, considering only the growing mode since the decaying one becomes negligible, would have been
$$ \delta_p(t_m) \approx \delta_+(t_i) \left(\frac{t_m}{t_i}\right)^{2/3} = \frac{3}{5}\, \delta_p(t_i) \left(\frac{t_m}{t_i}\right)^{2/3} . \tag{5.20} $$

Let us take the expression for the turnaround time and substitute $H_i = 2/(3 t_i)$; also, we want to express the turnaround time in terms of the density perturbation, so we must use $\Omega_p - 1 = (1 + \delta_p) - 1 = \delta_p$. This yields
$$ t_m = \frac{3\pi t_i}{4} \times \frac{1 + \delta_i}{\delta_i^{3/2}} \approx \frac{3\pi t_i}{4}\, \delta_i^{-3/2} , \tag{5.21} $$

since $\delta_i$ is small by assumption.

The ratio $t_m / t_i$, which appears in the expression for the perturbation at the turnaround time, is then given by
$$ \frac{t_m}{t_i} = \frac{3\pi}{4}\, \delta_i^{-3/2} , \tag{5.22} $$
which fortunately means that the size of the initial perturbation cancels out:
$$ \delta_p(t_m) = \frac{3}{5}\, \delta_i \left(\frac{3\pi}{4}\, \delta_i^{-3/2}\right)^{2/3} = \frac{3}{5} \left(\frac{3\pi}{4}\right)^{2/3} \approx 1.06 . \tag{5.23} $$
As expected, when the perturbation grows, linear theory stops giving us a good approximation for the result.
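To see how good the small-$\delta_i$ approximation in (5.21) is, one can evaluate the linear prediction both with and without the factor $(1 + \delta_i)$. The following is a minimal sketch; the function name and the flag `keep_full_factor` are hypothetical.

```python
import math

def linear_delta_at_turnaround(delta_i, keep_full_factor=True):
    """Linear-theory overdensity evaluated at the exact turnaround time.

    Combines equation (5.20), delta = (3/5) delta_i (t_m/t_i)^(2/3), with
    equation (5.21) for t_m/t_i. With keep_full_factor=True the factor
    (1 + delta_i) in (5.21) is retained instead of being set to 1.
    """
    ratio = 0.75 * math.pi * delta_i ** -1.5  # t_m / t_i, small-delta_i form
    if keep_full_factor:
        ratio *= 1.0 + delta_i
    return 0.6 * delta_i * ratio ** (2.0 / 3.0)

if __name__ == "__main__":
    for delta_i in (1e-1, 1e-2, 1e-3):
        print(delta_i, linear_delta_at_turnaround(delta_i))
```

As $\delta_i$ decreases the printed values approach $\frac{3}{5}(3\pi/4)^{2/3}$, the value quoted in (5.23).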

The virialization time. As mentioned before, by the virialization time the perturbation has become a halo which is hotter than the initial cloud and to which the virial theorem applies. We estimate this time as
$$ t_{\rm vir} \approx t_{\rm collapse} = 2 t_m . \tag{5.24} $$

The virial theorem tells us that $2T + E_{\rm gr} = 0$, so the total energy of the system is given by
$$ E_{\rm tot} = T + E_{\rm gr} = \frac{1}{2} E_{\rm gr} = -T . \tag{5.25} $$
The total energy at virialization is given by
$$ E_{\rm vir} = \frac{1}{2} \underbrace{\left(-\frac{3}{5} \frac{G M^2}{R_{\rm eq}}\right)}_{E_{\rm gr}} , \tag{5.26} $$

where the factor $3/5$ comes from the spherical symmetry of the system: we are calculating the gravitational potential energy of a constant-density sphere of radius $R_{\rm eq}$.
At the turnaround time, instead, there is no kinetic energy since we are at a stationary point, so we only have the gravitational contribution:
$$ E_m = -\frac{3}{5} \frac{G M^2}{R_m} . \tag{5.27} $$
Energy conservation tells us that $E_m = E_{\rm vir}$, which means that the radius at virialization, $R_{\rm eq}$, is half the radius at the start of the collapse, $R_m$: $2 R_{\rm eq} = R_m$, since the factor $1/2$ and the radius are the only difference between the formulas.
Since the mass is the same while the volume shrinks by a factor $8 = 2^3$, this then means $\rho_{\rm vir} = 8 \rho_m$.
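Written out explicitly, the energy-conservation step reads as follows (a worked version of the argument above, with no new assumptions beyond the constant-density sphere):

```latex
\begin{align}
  E_m = E_{\rm vir}
  \quad &\Longrightarrow \quad
  -\frac{3}{5} \frac{G M^2}{R_m} = -\frac{1}{2} \cdot \frac{3}{5} \frac{G M^2}{R_{\rm eq}}
  \quad \Longrightarrow \quad
  R_{\rm eq} = \frac{R_m}{2} , \\
  \frac{\rho_{\rm vir}}{\rho_m}
  &= \frac{M / \left(\tfrac{4}{3} \pi R_{\rm eq}^3\right)}{M / \left(\tfrac{4}{3} \pi R_m^3\right)}
  = \left(\frac{R_m}{R_{\rm eq}}\right)^3 = 8 .
\end{align}
```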
With this, we can calculate the ratio of the density of the virialized cloud to the density of the background: we find
$$ \frac{\rho_p(t_c)}{\rho_b(t_c)} = \underbrace{\frac{\rho_p(t_c)}{\rho_p(t_m)}}_{8}\, \underbrace{\frac{\rho_p(t_m)}{\rho_b(t_m)}}_{\chi \approx 5.6}\, \underbrace{\frac{\rho_b(t_m)}{\rho_b(t_c)}}_{2^2} \approx 180 , \tag{5.28} $$

where the $2^2$ factor for the background density comes from the fact that $\rho_b \propto t^{-2}$ and $t_c = 2 t_m$.
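The three factors in (5.28) multiply to $18\pi^2$, which the text rounds to 180. A short sketch of the arithmetic (variable names are ours):

```python
import math

compression = 2.0 ** 3                  # rho_p(t_c) / rho_p(t_m): the radius halves
chi_turnaround = (0.75 * math.pi) ** 2  # rho_p(t_m) / rho_b(t_m), eq. (5.19)
background_dilution = 2.0 ** 2          # rho_b(t_m) / rho_b(t_c): rho_b ~ t^-2, t_c = 2 t_m

virial_overdensity = compression * chi_turnaround * background_dilution
print(f"rho_p(t_c) / rho_b(t_c) = {virial_overdensity:.1f}")
```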

What would happen if we were to use linear theory at this time? It predicts that the density perturbation scales as $t^{2/3}$, so we get
$$ \delta_+(t_c) = \delta_+(t_m) \left(\frac{t_c}{t_m}\right)^{2/3} \approx \frac{3}{5} \left(\frac{3\pi}{4}\right)^{2/3} 2^{2/3} \approx 1.686 . \tag{5.29} $$

Once again, we see that linear theory cannot be applied in this context: it is not able to predict the large over-densities which are generated by the collapse of clouds.
This result might seem trivial: we already knew that linear theory did not apply! However, this is in fact very useful: in many contexts linear theory can still be applied successfully, but it can be hard to know when the approximation it provides breaks down. This value is then used as a heuristic: when linear theory predicts an over-density of $\delta \sim 1.686$ we know that what physically would have happened there is a collapse with $\delta \sim 180$.
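The heuristic can be phrased as a consistency check: the time at which the linear growing mode reaches $\delta_c$ should coincide with $t_c = 2 t_m$ from the exact model. The sketch below works in the small-$\delta_i$ limit of (5.21); the function names are hypothetical.

```python
import math

# Linear extrapolation of the growing mode to t_c = 2 t_m, equation (5.29).
DELTA_C = 0.6 * (0.75 * math.pi) ** (2.0 / 3.0) * 2.0 ** (2.0 / 3.0)

def collapse_time_from_linear_theory(delta_i, t_i=1.0):
    """Time at which the linear growing mode (3/5) delta_i (t/t_i)^(2/3) hits DELTA_C."""
    return t_i * (DELTA_C / (0.6 * delta_i)) ** 1.5

def collapse_time_from_exact_model(delta_i, t_i=1.0):
    """t_c = 2 t_m, with t_m from equation (5.21) in the small-delta_i limit."""
    return 2.0 * 0.75 * math.pi * t_i * delta_i ** -1.5

if __name__ == "__main__":
    print(f"delta_c = {DELTA_C:.3f}")
    for delta_i in (1e-2, 1e-3):
        print(delta_i,
              collapse_time_from_linear_theory(delta_i),
              collapse_time_from_exact_model(delta_i))
```

By construction the two times agree, which is what makes $\delta_c$ usable as a collapse criterion inside linear theory.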

5.2 Press-Schechter theory


This is a theory which was developed in the seventies, and which allows us to study structure formation by estimating the mass function
$$ n(M) = \frac{dN}{dM} = \text{\# of objects per unit volume with mass in } [M, M + dM] , \tag{5.30} $$
where “objects” is taken to mean “virialized clouds”, described according to the top-hat collapse of the last section.
We will use linear theory and apply the “$\delta(\vec{x}, t) > 1.686$ criterion”. Let us denote this critical perturbation value as $\delta_c = 1.686$.
In order to only consider objects above a certain mass, we use spatial filtering: we ignore perturbations whose characteristic length is smaller than a certain radius $R$, which corresponds to a certain mass $M \propto R^3$. This is obtained by the application of a low-pass filter $W_R(\vec{x})$.
The probability density of seeing a perturbation $\delta_M$ (the index $M$ denotes the fact that we applied the low-pass filter) is well approximated by a Gaussian
$$ p(\delta_M)\, d\delta_M = \frac{1}{\sqrt{2\pi \sigma_M^2}} \exp\left(-\frac{\delta_M^2}{2\sigma_M^2}\right) d\delta_M , \tag{5.31} $$

where the variance is typically diverging before the application of the filter; after its application instead
$$ \sigma_M^2 = \left\langle \delta_M^2 \right\rangle \propto M^{-2\alpha} , \tag{5.32} $$

and typically $\alpha \sim 1/2$.


In order to have nonvanishing probabilities of $\delta > \delta_c$ we must have $\sigma$ be quite large, almost of order one: how do we deal with the tail $\delta < -1$ then?

We can then calculate the probability of seeing a perturbation higher than $\delta_c$ in a generic location as
$$ P_{>\delta_c}(M) = \int_{\delta_c}^{\infty} d\delta_M\, p(\delta_M) . \tag{5.33} $$

This basically gives us the fraction of the universe which is occupied by virialized objects of mass larger than $M$.
Also counting their “areas of influence”, right? The region which has $\delta > \delta_c$ in linear theory is much larger than the true volume of the perturbation...
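For a Gaussian, the tail integral (5.33) has the closed form $P_{>\delta_c} = \frac{1}{2} \operatorname{erfc}\left(\delta_c / (\sqrt{2}\, \sigma_M)\right)$. The sketch below evaluates it assuming the power law $\sigma_M = (M/M_0)^{-\alpha}$ (consistent with $\sigma_M^2 \propto M^{-2\alpha}$); the normalization $M_0 = 1$ and the function names are illustrative assumptions.

```python
import math

DELTA_C = 1.686  # linear-theory collapse threshold from the top-hat model

def sigma_m(mass, m0=1.0, alpha=0.5):
    """Filtered rms fluctuation, assuming the power law sigma_M = (M/M0)^(-alpha)."""
    return (mass / m0) ** (-alpha)

def collapsed_fraction(mass, m0=1.0, alpha=0.5, delta_c=DELTA_C):
    """Gaussian tail probability P(delta_M > delta_c), equation (5.33)."""
    return 0.5 * math.erfc(delta_c / (math.sqrt(2.0) * sigma_m(mass, m0, alpha)))

if __name__ == "__main__":
    for mass in (1e-6, 1e-2, 1.0, 10.0):
        print(mass, collapsed_fraction(mass))
```

Note that as $M \to 0$ the variance grows without bound and the fraction tends to $1/2$ rather than $1$: a Gaussian centred on zero never puts more than half of its probability above a positive threshold.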

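The mass density in objects with mass in $[M, M + dM]$ is then
\begin{align}
n(M)\, M\, dM &= \rho_m \left[ P_{>\delta_c}(M) - P_{>\delta_c}(M + dM) \right] \tag{5.34a} \\
&= -\rho_m\, \frac{dP_{>\delta_c}}{dM}\, dM \tag{5.34b} \\
&= -\rho_m\, \frac{dP_{>\delta_c}}{d\sigma_M} \frac{d\sigma_M}{dM}\, dM . \tag{5.34c}
\end{align}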

Integrating this expression from $M = 0$ to $M \to \infty$ we expect to find $\rho_m$, but if we actually compute it we find $\rho_m / 2$.
This comes from a miscount: as the mass we are considering shrinks, we might be already including smaller objects inside the gravitational influence of larger ones. Properly accounting for this one gets precisely a factor 2; this means that all the matter in the universe is found in virialized objects.
Carrying out the derivatives, then, with the factor 2 and taking $\sigma_M = (M/M_0)^{-\alpha}$ we find
$$ n(M) = \frac{2}{\sqrt{\pi}}\, \alpha\, \frac{\rho_m}{M_*^2} \left(\frac{M}{M_*}\right)^{\alpha - 2} \exp\left(-\left(\frac{M}{M_*}\right)^{2\alpha}\right) , \tag{5.35} $$

where $M_* = (2/\delta_c^2)^{1/(2\alpha)} M_0$.
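The normalization claim can be verified numerically. The sketch below builds $n(M)$ directly from (5.34c) with the extra factor 2, written in terms of $\nu = \delta_c / \sigma_M$, and integrates $M\, n(M)$ over all masses. It assumes $\sigma_M = (M/M_0)^{-\alpha}$ with $M_0 = 1$; the integration range in $\ln M$ and the trapezoidal rule are our choices.

```python
import math

DELTA_C = 1.686  # linear-theory collapse threshold

def sigma_m(mass, m0=1.0, alpha=0.5):
    """Filtered rms fluctuation, assuming sigma_M = (M/M0)^(-alpha)."""
    return (mass / m0) ** (-alpha)

def mass_function(mass, rho_m=1.0, m0=1.0, alpha=0.5, delta_c=DELTA_C):
    """n(M) from equation (5.34c), including the extra factor 2.

    With nu = delta_c / sigma_M this is
    n(M) = sqrt(2/pi) * alpha * rho_m / M^2 * nu * exp(-nu^2 / 2).
    """
    nu = delta_c / sigma_m(mass, m0, alpha)
    return (math.sqrt(2.0 / math.pi) * alpha * rho_m / mass**2
            * nu * math.exp(-0.5 * nu**2))

def mass_fraction(rho_m=1.0, m0=1.0, alpha=0.5,
                  ln_min=-40.0, ln_max=10.0, steps=20_000):
    """Trapezoidal integral of M n(M) dM / rho_m, carried out in ln M."""
    h = (ln_max - ln_min) / steps
    total = 0.0
    for j in range(steps + 1):
        mass = m0 * math.exp(ln_min + j * h)
        weight = 0.5 if j in (0, steps) else 1.0
        # dM = M dlnM, so the integrand in ln M is M^2 n(M).
        total += weight * mass**2 * mass_function(mass, rho_m, m0, alpha)
    return total * h / rho_m

if __name__ == "__main__":
    print(f"fraction of mass in virialized objects: {mass_fraction():.6f}")
```

With the factor 2 included the integral comes out as 1 to good accuracy; dropping it would give $1/2$, which is the miscount discussed above.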


This theory is usually applied to compute the number density of dark matter halos, but in principle it could also be used to find the luminosity function of galaxies. Indeed, the mass function resembles the Schechter luminosity function $\Phi(L)$ if we set $\alpha = 1/2$ and assume $M/L = \text{const}$. This assumption is, however, not very accurate, so empirical estimates of the luminosity function are preferred when available.
In 1999 Sheth, Mo, and Tormen [SMT01] improved the estimates significantly by accounting for nonspherical collapse: specifically, they allowed for halos shaped like ellipsoids with arbitrary axes. Their results very closely resemble those coming from N-body simulations.

Bibliography

[Ade+16] P. A. R. Ade et al. “Planck 2015 Results - XIII. Cosmological Parameters”. In: Astronomy & Astrophysics 594 (Oct. 1, 2016), A13. issn: 0004-6361, 1432-0746. doi: 10.1051/0004-6361/201525830. url: https://www.aanda.org/articles/aa/abs/2016/10/aa25830-15/aa25830-15.html (visited on 2020-03-03) (cit. on pp. 5, 19, 25, 33, 35, 50, 66).
[ABG48] R. A. Alpher, H. Bethe, and G. Gamow. “The Origin of Chemical Elements”. In: Physical Review 73.7 (Apr. 1, 1948), pp. 803–804. doi: 10.1103/PhysRev.73.803. url: https://link.aps.org/doi/10.1103/PhysRev.73.803 (visited on 2020-03-17) (cit. on p. 49).
[And10] S. Andreon. “The Stellar Mass Fraction and Baryon Content of Galaxy Clusters and Groups”. In: Monthly Notices of the Royal Astronomical Society 407.1 (Sept. 1, 2010). Comment: MNRAS, in press, pp. 263–276. issn: 00358711. doi: 10.1111/j.1365-2966.2010.16856.x. arXiv: 1004.2785. url: http://arxiv.org/abs/1004.2785 (visited on 2020-09-14) (cit. on p. 17).
[Aut18] Cormac O’Raifeartaigh. “Investigating the Legend of Einstein’s “Biggest Blunder””. In: (Oct. 30, 2018). doi: 10.1063/PT.6.3.20181030a. url: https://physicstoday.scitation.org/do/10.1063/PT.6.3.20181030a/abs/ (visited on 2020-03-11) (cit. on p. 41).
[BM84] J. Bekenstein and M. Milgrom. “Does the Missing Mass Problem Signal the Breakdown of Newtonian Gravity?” In: The Astrophysical Journal 286 (Nov. 1, 1984), pp. 7–14. issn: 0004-637X. doi: 10.1086/162570. url: http://adsabs.harvard.edu/abs/1984ApJ...286....7B (visited on 2020-03-03) (cit. on p. 14).
[Bre+02] G. Bressi et al. “Measurement of the Casimir Force between Parallel Metallic Surfaces”. In: Physical Review Letters 88.4 (Jan. 15, 2002). Comment: 4 Figures, p. 041804. issn: 0031-9007, 1079-7114. doi: 10.1103/PhysRevLett.88.041804. arXiv: quant-ph/0203002. url: http://arxiv.org/abs/quant-ph/0203002 (visited on 2020-09-15) (cit. on p. 71).
[CDG98] A. G. Cohen, A. De Rujula, and S. L. Glashow. “A Matter-Antimatter Universe?” In: The Astrophysical Journal 495.2 (Mar. 10, 1998). Comment: 28 pages, 5 figures, uses graphicx package; updated for Ap. J. version. Includes additional discussion of turbulence, pp. 539–549. issn: 0004-637X, 1538-4357. doi: 10.1086/305328. arXiv: astro-ph/9707087. url: http://arxiv.org/abs/astro-ph/9707087 (visited on 2020-09-16) (cit. on p. 75).
[CL02] P. Coles and F. Lucchin. Cosmology. 2nd ed. Wiley, 2002 (cit. on pp. 3, 36, 75, 79, 80, 83, 84).
[Col18] The Fermi-LAT Collaboration. “A Gamma-Ray Determination of the Universe’s Star Formation History”. In: Science 362.6418 (Nov. 30, 2018), pp. 1031–1034. issn: 0036-8075, 1095-9203. doi: 10.1126/science.aat8123. pmid: 30498122. url: https://science.sciencemag.org/content/362/6418/1031 (visited on 2020-07-30) (cit. on p. 17).
[DL04] Tamara M. Davis and Charles H. Lineweaver. “Expanding Confusion: Common Misconceptions of Cosmological Horizons and the Superluminal Expansion of the Universe”. In: Publications of the Astronomical Society of Australia 21.1 (2004). Comment: To appear in Publications of the Astronomical Society of Australia, 26 pages (preprint format), 6 figures. Version 2: Section 4.1 revised, pp. 97–109. issn: 1323-3580, 1448-6083. doi: 10.1071/AS03040. arXiv: astro-ph/0310808. url: http://arxiv.org/abs/astro-ph/0310808 (visited on 2020-03-09) (cit. on p. 25).
[dGra+19] Anna de Graaff et al. “Probing the Missing Baryons with the Sunyaev-Zel’dovich Effect from Filaments”. Version 1. In: Astronomy & Astrophysics 624 (Apr. 2019). Comment: 13 pages, 8 figures; accepted for publication in A&A, A48. issn: 0004-6361, 1432-0746. doi: 10.1051/0004-6361/201935159. arXiv: 1709.10378. url: http://arxiv.org/abs/1709.10378 (visited on 2020-07-30) (cit. on p. 17).
[Ein22] A. Einstein. Comment on A. Friedmann’s Paper "On the Curvature of Space". Letter. Einstein wrongly corrects Friedmann. 1922. url: https://einsteinpapers.press.princeton.edu/vol13-trans/301 (visited on 2020-03-11) (cit. on p. 39).
[FGN98] Valerio Faraoni, Edgard Gunzig, and Pasquale Nardone. Conformal Transformations in Classical Gravitational Theories and in Cosmology. Comment: LaTeX, 54 pages, no figures. To appear in Fundamentals of Cosmic Physics. Nov. 14, 1998. arXiv: gr-qc/9811047. url: http://arxiv.org/abs/gr-qc/9811047 (visited on 2020-09-16) (cit. on p. 73).
[Few95] M. P. Fewell. “The Atomic Nuclide with the Highest Mean Binding Energy”. In: American Journal of Physics 63.7 (July 1, 1995), pp. 653–658. issn: 0002-9505. doi: 10.1119/1.17828. url: https://aapt.scitation.org/doi/10.1119/1.17828 (visited on 2020-09-24) (cit. on p. 121).
[Fix09] D. J. Fixsen. “The Temperature of the Cosmic Microwave Background”. In: The Astrophysical Journal 707.2 (Nov. 2009), pp. 916–920. issn: 0004-637X. doi: 10.1088/0004-637X/707/2/916. url: https://doi.org/10.1088%2F0004-637x%2F707%2F2%2F916 (visited on 2020-03-03) (cit. on p. 4).
[Fri22] A. Friedmann. From Alexander Friedmann. Letter. 1922. url: https://einsteinpapers.press.princeton.edu/vol13-trans/363 (visited on 2020-03-11) (cit. on p. 39).
[GD11] Katherine Garrett and Gintaras Duda. “Dark Matter: A Primer”. Version 2. In: Advances in Astronomy 2011 (2011). Comment: 26 pages, 6 figures, pp. 1–22. issn: 1687-7969, 1687-7977. doi: 10.1155/2011/968283. arXiv: 1006.2483. url: http://arxiv.org/abs/1006.2483 (visited on 2020-10-23) (cit. on p. 13).
[Har93] E. Harrison. “The Redshift-Distance and Velocity-Distance Laws”. In: (1993). url: http://adsabs.harvard.edu/full/1993ApJ...403...28H (visited on 2020-03-24) (cit. on p. 25).
[Hog00] David W. Hogg. Distance Measures in Cosmology. Version 4. Comment: This short and purely pedagogical text is not submitted anywhere but here. Errors reported to the author will be rewarded with gratitude and acknowledgements. Dec. 15, 2000. arXiv: astro-ph/9905116. url: http://arxiv.org/abs/astro-ph/9905116 (visited on 2020-03-07) (cit. on pp. 26, 39).
[Hos19] S. Hossenfelder. “Screams for Explanation: Finetuning and Naturalness in the Foundations of Physics”. In: Synthese (Sept. 3, 2019). Comment: 17 pages, no figures, typos fixed. issn: 0039-7857, 1573-0964. doi: 10.1007/s11229-019-02377-5. arXiv: 1801.02176. url: http://arxiv.org/abs/1801.02176 (visited on 2020-09-14) (cit. on p. 70).
[Hub29] Edwin Hubble. “A Relation Between Distance And Radial Velocity Among Extra-Galactic Nebulae”. In: (1929). url: https://www.pnas.org/content/pnas/15/3/168.full.pdf (visited on 2020-03-03) (cit. on p. 19).
[Hus16] Lars Husdal. “On Effective Degrees of Freedom in the Early Universe”. In: Galaxies 4.4 (Dec. 17, 2016). Comment: 29 pages, 7 figures, p. 78. issn: 2075-4434. doi: 10.3390/galaxies4040078. arXiv: 1609.04979. url: http://arxiv.org/abs/1609.04979 (visited on 2020-06-30) (cit. on p. 61).
[JR05] Nils Voje Johansen and Finn Ravndal. On the Discovery of Birkhoff’s Theorem. Version 2. Comment: 4 pages, no figures. More references have been included, small changes in text plus new data on Eiesland. Sept. 6, 2005. arXiv: physics/0508163. url: http://arxiv.org/abs/physics/0508163 (visited on 2020-03-08) (cit. on p. 27).
[Kee14] Charles Keeton. “Star and Planet Formation”. In: Principles of Astrophysics: Using Gravity and Stellar Physics to Explore the Cosmos. Ed. by Charles Keeton. Undergraduate Lecture Notes in Physics. New York, NY: Springer, 2014, pp. 377–394. isbn: 978-1-4614-9236-8. doi: 10.1007/978-1-4614-9236-8_19. url: https://doi.org/10.1007/978-1-4614-9236-8_19 (visited on 2020-04-21) (cit. on p. 101).
[LP07] James M. Lattimer and Madappa Prakash. “Neutron Star Observations: Prognosis for Equation of State Constraints”. In: Physics Reports 442.1-6 (Apr. 2007). Comment: 70 pages, 19 figures, submitted to Hans Bethe Centennial Physics Reports, pp. 109–165. issn: 03701573. doi: 10.1016/j.physrep.2007.02.003. arXiv: astro-ph/0612440. url: http://arxiv.org/abs/astro-ph/0612440 (visited on 2020-09-26) (cit. on p. 141).
[Lud99] Malcolm Ludvigsen. General Relativity: A Geometric Approach. For distinction between static and stationary spacetimes. Cambridge University Press, May 28, 1999. 234 pp. isbn: 978-0-521-63976-7. Google Books: YA8rxOn9H1sC (cit. on p. 40).
[MTW73] C. W. Misner, K. S. Thorne, and J. A. Wheeler. Gravitation. W. H. Freeman & Co., 1973 (cit. on p. 27).
[NS19] Jose Natario and Amol Sasane. Decay of Solutions to the Klein-Gordon Equation on Some Expanding Cosmological Spacetimes. Comment: 52 pages, 4 figures. Sept. 9, 2019. arXiv: 1909.01292 [gr-qc, physics:math-ph]. url: http://arxiv.org/abs/1909.01292 (visited on 2020-09-16) (cit. on p. 74).
[Pac18] L. Pacciani. Appunti Del Corso The Physical Universe. Pacciani notes for Fundamentals of Astrophysics and Cosmology. 2018. url: https://leonardo.pm/teaching/ (visited on 2020-03-03) (cit. on pp. 3, 70, 79, 84, 86, 91, 93, 105).
[Pla+19] Planck Collaboration et al. Planck 2018 Results. I. Overview and the Cosmological Legacy of Planck. Dec. 3, 2019. arXiv: 1807.06205 [astro-ph]. url: http://arxiv.org/abs/1807.06205 (visited on 2020-03-03) (cit. on pp. 4, 15, 18).
[Rie+16] Adam G. Riess et al. “A 2.4% Determination of the Local Value of the Hubble Constant”. In: The Astrophysical Journal 826.1 (July 21, 2016). Comment: accepted ApJ, includes proof corrections and edits, 63 pages, 16 figures, 8 tables. Table 4 available electronically by ApJ Revised since v1 to include one new supernova/calibrator and updated Planck constraints, p. 56. issn: 1538-4357. doi: 10.3847/0004-637X/826/1/56. arXiv: 1604.01424. url: http://arxiv.org/abs/1604.01424 (visited on 2020-03-05) (cit. on p. 18).
[Rub83] V. C. Rubin. “Dark Matter in Spiral Galaxies”. In: Scientific American 248 (June 1, 1983), pp. 96–106. doi: 10.1038/scientificamerican0683-96. url: http://adsabs.harvard.edu/abs/1983SciAm.248f..96R (visited on 2020-10-23) (cit. on p. 13).
[SMT01] Ravi K. Sheth, H. J. Mo, and Giuseppe Tormen. “Ellipsoidal Collapse and an Improved Model for the Number and Spatial Distribution of Dark Matter Haloes”. In: Monthly Notices of the Royal Astronomical Society 323.1 (May 1, 2001). Comment: 12 pages, 6 figures, submitted to MNRAS, pp. 1–12. issn: 0035-8711, 1365-2966. doi: 10.1046/j.1365-8711.2001.04006.x. arXiv: astro-ph/9907024. url: http://arxiv.org/abs/astro-ph/9907024 (visited on 2020-09-27) (cit. on p. 148).
[TM20] J. Tissino and G. Mentasti. General Relativity Notes. 2020. url: https://github.com/jacopok/notes/blob/master/ap_first_semester/general_relativity/main.pdf (cit. on pp. 27, 32).
[Toj] Rita Tojeiro. “Understanding the Cosmic Microwave Background Temperature Power Spectrum”. In: (), p. 9 (cit. on p. 66).
[WA03] H. J. Weber and G. B. Arfken. Essential Mathematical Methods for Physicists. Academic Press, 2003 (cit. on p. 60).
[Wei72] S. Weinberg. Gravitation and Cosmology - Principles and Applications of the General Theory of Relativity. Wiley, 1972. isbn: 0-471-92567-5 (cit. on pp. 50, 54).
[Wol] Natalie Wolchover. Neutron Lifetime Puzzle Deepens, but No Dark Matter Seen. url: https://www.quantamagazine.org/neutron-lifetime-puzzle-deepens-but-no-dark-matter-seen-20180213/ (visited on 2020-09-19) (cit. on p. 86).
[Won+19] Kenneth C. Wong et al. H0LiCOW XIII. A 2.4% Measurement of $H_{0}$ from Lensed Quasars: $5.3\sigma$ Tension between Early and Late-Universe Probes. Comment: Accepted for publication in MNRAS; 23 pages, 13 figures, 8 tables. Nov. 5, 2019. arXiv: 1907.04869 [astro-ph]. url: http://arxiv.org/abs/1907.04869 (visited on 2020-03-03) (cit. on p. 11).
[Wri03] E. L. Wright. Theoretical Overview of Cosmic Microwave Background Anisotropy. Comment: Review written for the Carnegie Observatories Centennial Symposium II. Includes the angular power spectrum and parameter fits as of November, 2002. 18 pages Latex with 19 embedded Postscript figures. May 29, 2003. arXiv: astro-ph/0305591. url: http://arxiv.org/abs/astro-ph/0305591 (visited on 2020-03-03) (cit. on p. 4).
[YBK10] Jaswant K. Yadav, J. S. Bagla, and Nishikanta Khandai. “Fractal Dimension as a Measure of the Scale of Homogeneity”. In: Monthly Notices of the Royal Astronomical Society 405.3 (July 1, 2010), pp. 2009–2015. issn: 0035-8711. doi: 10.1111/j.1365-2966.2010.16612.x. url: https://academic.oup.com/mnras/article/405/3/2009/967466 (visited on 2020-03-03) (cit. on p. 5).
[Yak+] D. G. Yakovlev et al. “Neutrino Emission from Neutron Stars”. In: (), p. 165 (cit. on p. 137).
[ZY12] Shuang-Nan Zhang and Shuxu Yi. “On a Common Misunderstanding of the Birkhoff Theorem and Light Deflection Calculation: Generalized Shapiro Delay and Its Possible Laboratory Test”. In: International Journal of Modern Physics: Conference Series 12 (Jan. 2012), pp. 419–430. issn: 2010-1945, 2010-1945. doi: 10.1142/S2010194512006642. url: https://www.worldscientific.com/doi/abs/10.1142/S2010194512006642 (visited on 2020-03-08) (cit. on p. 27).
