AQM2024-25
6ccm436a/7ccmms31
Ana Raclariu∗
N.B. These notes are not particularly original but rather based on the lecture notes of
Profs. N. Lambert, N. Drukker and G. Watts.
∗email: ana-maria.raclariu@kcl.ac.uk
Contents
0.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
0.2 Plan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 Angular Momentum 25
2.1 Transformations and Unitary Operators . . . . . . . . . . . . . . . . . . 25
2.1.1 Transformation of states . . . . . . . . . . . . . . . . . . . . . . . 25
2.1.2 Transformation of operators . . . . . . . . . . . . . . . . . . . . . 27
2.1.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.1.4 Heisenberg vs. Schrödinger pictures, revisited . . . . . . . . . . . 28
2.2 Symmetries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.2.1 Aside: relation to classical mechanics . . . . . . . . . . . . . . . . 30
2.3 A Spherically Symmetric Potential . . . . . . . . . . . . . . . . . . . . . 31
2.4 Rotations and Angular Momentum Operators . . . . . . . . . . . . . . . 33
0.1 Introduction
Quantum Mechanics arguably presents our most profound and revolutionary understanding of objective reality. In the classical world, as studied for example in “Classical Dynamics” (5ccm231a), the physically observable world consists of knowing the positions and momenta (velocities) of every single particle at a moment in time. The dynamical equations can then be used to evolve the system so as to know the positions and momenta in the future, leading to a causal and highly predictive theory. You could forgive the physicists of the 19th century for thinking that they were almost done. Everything seemed to work perfectly and there were just a few loose strings to get under control. It turned out that by pulling on those loose strings the whole structure of classical physics and our understanding of objective reality fell apart. We now have a more predictive and precise theory in the form of Quantum Mechanics. But no one has got to grips with what it truly means.
The classic experiment, although there are many, that led to the unravelling of classical mechanics relates to the question of whether light is a particle or a wave. If you consider a coherent light source and send a beam of light through two small slits and see what happens, you will find an interference pattern. This is what you would expect if light were a wave and the two slits produced two wave sources that could add constructively or cancel destructively. Thus an interference pattern is observed. Fine, light is a wave. Let’s do the same with electrons. We find the same thing! Shooting an electron beam at a double slit also leads to an interference pattern. So electrons, which otherwise appear as point-like particles with fixed electric charges, also behave like waves (and similarly there are experiments, such as the photoelectric effect, which show that light behaves like a particle). This interference pattern even exists if you slow the beam down so that one electron at a time is released.
What has happened? Nothing that has a classical interpretation. We can think of electrons as waves with profile ψ. After passing through the slits we have two wavefunctions ψhole1 and ψhole2. Quantum mechanics tells us that the system is described by ψhole1 + ψhole2, but in this case the two wave profiles can constructively or destructively interfere (i.e. add or cancel out). Classically there isn’t really an explanation, but one would expect the two electron beams coming out from each of the slits to behave independently. Experiment tells us which is true (see figure 0.1.1). Needless to say, there have been countless experiments since which confirm the quantum picture and refute any classical explanation (most notably via Bell’s inequalities).
There are further revelations that we won’t get to speak much of. Relativity came and told us that time is not absolute. Combining this with Quantum Mechanics means that there really aren’t any particles at all, just local excitations of fields that
permeate spacetime. Why, after all, is every electron identical to every other? This is
the topic of Quantum Field Theory. However exploring Quantum Mechanics beyond an
introductory module such as “Introductory Quantum Theory” (6cmm332c) is crucial
for any physicist.
Figure 0.1.1: After 100, 200, 500 and 1000 electrons hit the screen the interference pattern forms (top). Compare this to the probability distribution computed from the wave function of the electron (bottom). In particular the quantum probability (left) is obtained from |ψhole1 + ψhole2|² vs. the classical sum of probabilities |ψhole1|² + |ψhole2|² (right).
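The two formulas in the caption are easy to explore numerically. The sketch below is not part of the original notes: each slit is modelled as a point source, and the wave number, slit separation and screen distance are arbitrary illustrative values.

```python
import numpy as np

# Two slits a distance d apart; screen at distance L. All lengths and the
# wave number k are illustrative choices in arbitrary units.
k, d, L = 50.0, 1.0, 20.0
x = np.linspace(-10, 10, 201)          # positions on the screen

def psi(x, source_y):
    """Spherical-wave amplitude at screen position x from a slit at height source_y."""
    r = np.sqrt(L**2 + (x - source_y)**2)
    return np.exp(1j * k * r) / np.sqrt(r)

psi1, psi2 = psi(x, +d / 2), psi(x, -d / 2)
quantum = np.abs(psi1 + psi2)**2                 # |psi1 + psi2|^2: interference
classical = np.abs(psi1)**2 + np.abs(psi2)**2    # |psi1|^2 + |psi2|^2: no cross term

# The difference is the interference term 2 Re(psi1* psi2), which oscillates
# in sign across the screen and produces the fringes.
cross = quantum - classical
print(cross.min(), cross.max())
```

The cross term is what the classical sum of probabilities misses.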
0.2 Plan
We hope to cover each chapter in a week. The main lectures, where new material is
presented, are on Tuesdays 12-2pm and Wednesdays 12-1pm. The Tuesday lectures
will focus more on foundations, while in the Wednesday lecture we will aim to cover
examples and sometimes real world applications of the material.
In addition there will be weekly problem sets found on Keats. You are strongly
urged to do the problems which are essential to understanding the material. There
are weekly tutorials on Monday at 9am which will discuss the problem set from the
previous week. Each week you will find a questionnaire on Keats asking you to indicate
one topic that you understood well and one topic that you are struggling with. The
answers will be anonymous and I strongly encourage you to reflect on these questions
after the Wednesday class each week and submit your response. This will allow us to
identify the confusing points and focus on clarifying them during the tutorials.
There is a reading week starting Oct 28 where there will not be any lectures, discus-
sion sessions or tutorials.
Chapter 1
Review: Elements of Quantum Mechanics
Let’s jump head first into the formulation of Quantum Mechanics. In this section we
will just consider one-dimensional systems, meaning one spatial dimension and time.
Conceptually the extension to higher dimensions is easy but more technically involved
and is the subject of much of this module. One can also consider quantum systems with
no spatial dimension. We will consider these later. This chapter is expected to be a
review.
Quantum mechanics describes physical systems through a state ψ in a complex vector space evolving according to the Schrödinger equation
iℏ ∂ψ/∂t = Ĥψ. (1.1)
Here Ĥ is the Hamiltonian operator and carries information about the energy of the
system. For a non-relativistic particle in one spatial dimension subject to a potential
V (x), (1.1) becomes
iℏ ∂ψ(t, x)/∂t = −(ℏ²/2m) ∂²ψ(t, x)/∂x² + V̂(x)ψ(t, x). (1.2)
We can compare and contrast this to classical mechanics, where in closed systems
(ie. systems that are not acted upon by any forces) energy E is conserved. This energy
depends on the details of the system, but typically consists of a kinetic and a potential
component. For a one-particle system of mass m and momentum p subject to a potential
V (x), the energy is given by
E = p²/2m + V(x) (1.3)
For a classical system, momenta p and positions x are entirely determined by the equa-
tions of motion and are constrained to obey (1.3) for definite E. (1.2) takes a form very
similar to (1.3) subject to the replacements
E ↔ iℏ ∂/∂t , p ↔ −iℏ ∂/∂x , V ↔ V̂ . (1.4)
In other words, the numbers in (1.3) are replaced by operators acting on a wavefunction
ψ(x, t)!
More generally, (1.1) tells us that, at microscopic scales, the energy of the system
cannot be determined with certainty: the system may be found in a superposition of
energy eigenstates ψn,
ψ(t, x) = Σn cn ψn(t, x), (1.5)
where ψn are defined by
En ψn = −(ℏ²/2m) ∂²ψn/∂x² + V̂(x)ψn. (1.6)
Unlike in classical mechanics, upon measurement, the system is found to have energy
En with probability
pn ≡ |cn |2 ≡ cn c∗n . (1.7)
Demanding that ψ is normalized ensures that Σn pn = 1. Quantum mechanics is
fundamentally probabilistic in its predictions! One may of course choose to measure
other observables of the system, such as position or momentum. We will review these
examples in section 1.2 below.
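As a concrete illustration of (1.5)–(1.7), here is a minimal numerical sketch; the coefficients cn below are an arbitrary choice, not taken from the notes.

```python
import numpy as np

# A state expanded in the energy eigenbasis, psi = sum_n c_n psi_n.
# The coefficients are an arbitrary illustrative choice.
c = np.array([1.0, 1.0j, -2.0])
c = c / np.linalg.norm(c)      # normalize psi so the probabilities sum to 1

p = np.abs(c)**2               # p_n = |c_n|^2: probability of measuring E_n
print(p, p.sum())
```

Measuring the energy returns En with probability pn; the normalization of ψ is exactly the statement that these probabilities sum to one.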
This all can be formalized by introducing the notion of a Hilbert space H and
Hermitian operators Ô acting on H. As we will see in sections 1.1, 1.2, Ô correspond
to the physical observables. An example of an operator above is the Hamiltonian Ĥ. In
general
• the wavefunction or state ψ determines the probabilities with which these outcomes will be observed
Note: the probabilities of the outcomes should not be confused with the outcomes
themselves. The probabilities are a property of the state ψ (cf. |cn |2 in (1.5), (1.7)):
given a state ψ, it admits a unique decomposition in the basis {ψn } with coefficients
{cn }. The measurement outcomes are a property of the observable Ô (cf. En in (1.6) if
Ô = Ĥ).
We will also see in section 1.4 that closed quantum systems evolve unitarily in time.
Definition: A Hilbert space is a complete, complex vector space with a positive definite inner product.
There is a theorem which states that H∗ is also a Hilbert space and is isomorphic to
H. This means that for each |ψ⟩ there is a unique ⟨ψ| and vice versa.
O:H→H (1.11)
In Dirac notation, given a basis |en⟩ for H, the linear maps O admit the decomposition
O = Σmn Omn |em⟩⟨en| (1.13)
where, just as in finite dimensions, (O†)mn = O∗nm.
In addition to a Hilbert space of states Quantum Mechanics exploits the fact that
there exists a preferred class of linear maps which are self-adjoint (aka Hermitian in the
case when H is finite dimensional) in the sense that
for all vectors |ψ1⟩ and |ψ2⟩. Such maps are called observables and they lead to an important theorem.
First we suppose that |ψ1⟩ = |ψ2⟩ and hence λ2 = λ1. Thus we have, since |ψ1⟩ must have non-zero norm, λ1 = λ∗1. Next we suppose that |ψ1⟩ ≠ |ψ2⟩. Since we now know that λ1 and λ2 are real we have
Thus if λ2 ̸= λ1 then ⟨ψ1 |ψ2 ⟩ = 0. If there are degeneracies, meaning that there are
many eigenstates with the same eigenvalue then one can arrange to find an orthogonal
basis by a Gram-Schmidt process. But we can’t conclude that two such eigenstates are
orthogonal without extra work. On the other hand, we know immediately that if two
eigenstates of a self-adjoint operator have different eigenvalues then they are orthogonal.
If an operator is self-adjoint then there is a choice of basis |en⟩ so that (see problem set 1)
O = Σn λn |en⟩⟨en| (1.18)
with λn ∈ R. In other words, the operator is diagonalizable and this is what a diagonal
operator looks like in Dirac notation.
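The spectral decomposition (1.18) is easy to verify numerically in a finite-dimensional example; the operator below is a randomly generated Hermitian matrix, purely for illustration.

```python
import numpy as np

# A random Hermitian operator on a 4-dimensional Hilbert space.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
O = (A + A.conj().T) / 2                     # Hermitian by construction

lam, V = np.linalg.eigh(O)                   # real eigenvalues, orthonormal eigenvectors

# Rebuild O from its spectral decomposition  O = sum_n lambda_n |e_n><e_n|
O_rebuilt = sum(l * np.outer(v, v.conj()) for l, v in zip(lam, V.T))

print(np.allclose(O, O_rebuilt), np.allclose(V.conj().T @ V, np.eye(4)))
```

`eigh` returns real eigenvalues precisely because the input is Hermitian, matching the theorem above.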
How do we extract numbers that we can compare with the so-called real world
through experiment? The key idea of Quantum Mechanics is that observables represent
a physical measurement and the eigenvalues are the possible outcomes of such measurements. We may define the expectation value of an operator O in a state |ψ⟩ to be
⟨O⟩ ≡ ⟨ψ|Oψ⟩. (1.19)
The above theorem tells us that we can find an orthonormal basis of eigenvectors of
an observable O. Thus any state |ψ⟩ can be written as
|ψ⟩ = Σn cn |ψn⟩ (1.20)
where O|ψn ⟩ = λn |ψn ⟩. Intuitively the physical idea is that in order to measure an
observable one must probe the state you are observing, e.g. to find the position of an
electron you must look for it which means hitting it with light which will then impart
momentum and change it. Thus the act of measurement changes the system and hence
one doesn’t know what the system is doing after the measurement. Eigenstates are
special as they are, in some sense, unaffected by a particular measurement (but not
others).
Looking at |ψ⟩ we can compute the expectation value of O to find
⟨ψ|Oψ⟩ = Σm,n c∗m cn ⟨ψm|Oψn⟩
= Σm,n λn c∗m cn ⟨ψm|ψn⟩
= Σn λn c∗n cn
= Σn λn pn (1.21)
where we think of pn = c∗n cn as a probability distribution since the unit norm of |ψ⟩ implies that
1 = ⟨ψ|ψ⟩
= Σm,n c∗m cn ⟨ψm|ψn⟩
= Σn c∗n cn
= Σn pn (1.22)
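The identities (1.21) and (1.22) can be checked numerically; the eigenvalues, basis and state below are arbitrary illustrative choices.

```python
import numpy as np

# Observable with eigenvalues lam and orthonormal eigenvectors (columns of V),
# plus a normalized state psi; all chosen arbitrarily for illustration.
lam = np.array([-1.0, 0.5, 2.0])
V = np.linalg.qr(np.random.default_rng(1).normal(size=(3, 3)))[0]  # orthonormal basis
O = V @ np.diag(lam) @ V.T

psi = np.array([1.0, 2.0, 2.0]) / 3.0        # unit norm

c = V.T @ psi                                # c_n = <psi_n|psi>
p = np.abs(c)**2                             # probabilities; they sum to 1

# <psi|O psi> agrees with the weighted average sum_n lam_n p_n, eq. (1.21)
print(psi @ O @ psi, lam @ p)
```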
Note that the right hand side can be written as ⟨(O − ⟨O⟩I)2 ⟩ and hence is positive.
Another theorem asserts that if two operators commute:
[O1 , O2 ] = O1 O2 − O2 O1 = 0 (1.24)
then one can find an orthonormal basis of states which are simultaneously eigenstates
of both operators. On the other hand for pairs of operators that don’t commute we find
the famous Heisenberg uncertainty principle:
∆O1 ∆O2 ≥ (1/2)|⟨[O1, O2]⟩| (1.25)
The classic example is the position and momentum operators:
x̂ψ(x) = xψ(x) , p̂ψ(x) = −iℏ ∂ψ(x)/∂x (1.26)
so that [x̂, p̂] = iℏ and hence ∆x̂ ∆p̂ ≥ ℏ/2.
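The commutator [x̂, p̂] = iℏ can be verified symbolically by acting on an arbitrary test wavefunction; a small sketch using sympy (not part of the notes):

```python
import sympy as sp

x, hbar = sp.symbols('x hbar', real=True, positive=True)
psi = sp.Function('psi')(x)                      # arbitrary test wavefunction

p_hat = lambda f: -sp.I * hbar * sp.diff(f, x)   # momentum as -i hbar d/dx

# [x, p] psi = x (p psi) - p (x psi); equals i hbar psi for any psi
commutator = x * p_hat(psi) - p_hat(x * psi)
print(sp.simplify(commutator))                   # I*hbar*psi(x)
```

The ψ-independence of the result is exactly why [x̂, p̂] = iℏ can be written as an operator identity.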
Examples
Furthermore
⟨ψ|ψ⟩ = 1 =⇒ |c0 |2 + |c1 |2 = 1. (1.29)
|c0|², |c1|² are the probabilities to find the system in states |0⟩ and |1⟩ respectively, provided we measure an operator whose eigenvectors are |0⟩, |1⟩. In vector notation we may choose
|0⟩ = (1, 0)ᵀ , |1⟩ = (0, 1)ᵀ . (1.30)
An observable in this case is a Hermitian 2×2 matrix and since {|0⟩, |1⟩} are its eigenvectors, we may write
O = λ0 |0⟩⟨0| + λ1 |1⟩⟨1| = diag(λ0, λ1), (1.31)
then measuring O will give outcome λ0 with probability |c0 |2 and outcome λ1 with prob-
ability |c1 |2 . We will see later that a spin- 21 system is characterized by a state of the
form (1.29) and the spin operator along a direction in space that can be taken to be
proportional to one of the Pauli matrices (say σz ) is an observable.
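A quick numerical sketch of this two-level example, with an arbitrary choice of c0, c1 and σz as the observable (eigenvalues +1 and −1):

```python
import numpy as np

# Qubit state c0|0> + c1|1>; the coefficients are an illustrative choice.
c0, c1 = 1 / np.sqrt(3), np.sqrt(2 / 3) * 1j
psi = np.array([c0, c1])

sigma_z = np.array([[1, 0], [0, -1]])        # observable with eigenvectors |0>, |1>

p0, p1 = abs(c0)**2, abs(c1)**2              # outcome probabilities
expectation = psi.conj() @ sigma_z @ psi     # equals (+1) p0 + (-1) p1

print(p0, p1, expectation.real)              # p0 = 1/3, p1 = 2/3, <sigma_z> = -1/3
```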
The ‘canonical’ way to quantise this system is to consider the Hilbert space L²(R^N) of functions of qi and make the replacements²
as required (and used above). Note that this corresponds to a particular choice of coordinates qi and conjugate momenta pi. However Hamiltonian systems admit canonical transformations that mix these and hence there are several ways to quantise, corresponding to choosing a ‘polarisation’, i.e. a choice of what is a q and what is a p.
However this is really putting the cart before the horse as the classical world emerges
from the quantum world and not the other way around. Indeed there are quantum
systems with no classical limit at all. They obey the rules of quantum mechanics that we
are outlining here but they need not come from “quantising” some classical Hamiltonian
system.
dO/dt = {O, H} (1.44)
²In general we will avoid putting hats on operators but we need to do so here as qi have very specific meanings as real numbers.
becomes
dOH/dt = −(i/ℏ)[OH, H] (1.45)
Here I have introduced a subscript H on O to indicate that we are in the so-called Heisenberg picture, where operators evolve in time and states are constant. And (1.45) is known as the Heisenberg equation.
A more familiar picture is the Schrödinger picture where
OS = e^{−iHt/ℏ} OH e^{iHt/ℏ}
|ΨS⟩ = e^{−iHt/ℏ} |ΨH⟩ (1.46)
iℏ ∂|ΨS⟩/∂t = H|ΨS⟩ (1.48)
This follows by thinking of the Hamiltonian operator H as the energy operator Ê and
making the identification
Ê = iℏ ∂/∂t (1.49)
In what follows we drop the subscript S and will always work in the Schrödinger picture.
One then sees that eigenstates of H with eigenvalue En evolve rather trivially (for
simplicity here we assume that H and hence its eigenvectors and eigenvalues are time
independent):
I will try to stick to a convention where the time dependent wavefunction is |Ψ⟩ and
the time independent one, meaning it is an eigenstate of the Hamiltonian, |ψ⟩. The
difference then is just the time-dependent phase.
More generally given any state |Ψ⟩ we can expand it in an energy eigenstate basis:
|Ψ⟩ = Σn cn(t)|ψn⟩ (1.51)
and the cn (0) can be found by expanding |Ψ(0)⟩ in the energy eigenstate basis. We
have recovered (1.5). Of course finding the eigenstates and eigenvalues of a typical
Hamiltonian is highly non-trivial. But in principle solving Quantum Mechanics is down
to (infinite-dimensional) linear algebra.
A central point is that time evolution is unitary, meaning that it preserves the
inner-product between two states. It is crucial for the consistency of the theory as
otherwise the probabilities we discussed above will fail to sum to unity.
iℏ ∂Ψ/∂t = −(ℏ²/2m) ∂²Ψ/∂x² + (1/2)kx²Ψ (1.55)
We could proceed by finding the general solution to this differential equation. However
this is a difficult task and a much better analysis can be done using algebra.
Again our Hilbert space is H = L2 (R). We can write the Schrödinger equation as
Ê|Ψ⟩ = ( p̂²/2m + (k/2)x̂² )|Ψ⟩ (1.56)
We introduce the operators
â = (2ℏ√(mk))^{−1/2} p̂ − i(√(mk)/2ℏ)^{1/2} x̂ , ↠= (2ℏ√(mk))^{−1/2} p̂ + i(√(mk)/2ℏ)^{1/2} x̂ . (1.57)
These satisfy
ââ† = (1/2ℏ)( p̂²/√(mk) + √(mk) x̂² ) − (i/2ℏ)[x̂, p̂] = (1/2ℏ)( p̂²/√(mk) + √(mk) x̂² ) + 1/2
â†â = (1/2ℏ)( p̂²/√(mk) + √(mk) x̂² ) + (i/2ℏ)[x̂, p̂] = (1/2ℏ)( p̂²/√(mk) + √(mk) x̂² ) − 1/2
(1.58)
and hence
[â, â†] = ââ† − â†â = 1.
Defining the number operator N̂ = â†â and ω = √(k/m), we have H = ℏω(N̂ + 1/2). Acting on an eigenstate |n⟩ of N̂, with N̂|n⟩ = n|n⟩,
N̂ â|n⟩ = ([N̂, â] + âN̂)|n⟩ = (−â + nâ)|n⟩ = (n − 1) â|n⟩
and
N̂ ↠|n⟩ = ([N̂ , ↠] + ↠N̂ )|n⟩ = (↠+ n↠)|n⟩ = (n + 1)↠|n⟩ (1.72)
The operators ↠and â are therefore known as raising and lowering operators respec-
tively. From this we see that
H|n⟩ = ℏω( N̂ + 1/2 )|n⟩ = ℏω( n + 1/2 )|n⟩ (1.73)
i.e. the energy of the state |n⟩ is
En = ℏω( n + 1/2 ) (1.74)
So we have completely solved the system without ever solving a differential equation.
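The algebraic solution translates directly into finite matrices. Below is a sketch, not from the notes: the number basis is truncated at an arbitrary level and ℏω is set to 1 for illustration.

```python
import numpy as np

# Truncated matrix representation of the lowering operator in the number basis:
# a|n> = sqrt(n)|n-1>, so a has sqrt(1), sqrt(2), ... on its superdiagonal.
N_LEVELS = 8
a = np.diag(np.sqrt(np.arange(1, N_LEVELS)), k=1)
adag = a.T

N_op = adag @ a                                    # number operator a^dagger a
hbar_omega = 1.0                                   # set hbar*omega = 1 for illustration
H = hbar_omega * (N_op + 0.5 * np.eye(N_LEVELS))

print(np.diag(H))    # E_n = hbar*omega*(n + 1/2): 0.5, 1.5, ..., 7.5
```

Near the truncation level the commutator [â, â†] = 1 fails (a finite matrix cannot satisfy it exactly), which is why one keeps the cutoff well above the states of interest.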
Furthermore this approach also allows us to explicitly construct the corresponding
energy eigenstate wavefunctions. For example the ground state satisfies
â|0⟩ = [ (2ℏ√(mk))^{−1/2} p̂ − i(√(mk)/2ℏ)^{1/2} x̂ ]|0⟩ = 0 (1.75)
where Ψ0(x, t) = e^{−iE0 t/ℏ} ψ0(x) is the explicit element of L²(R) that represents |0⟩.
Rewriting this equation gives
dψ0/dx = −(√(mk)/ℏ) x ψ0 (1.77)
The solution to this differential equation is simply ψ0 ∝ e^{−√(mk) x²/2ℏ} and hence
Ψ0 = N0 e^{−iE0 t/ℏ} e^{−√(mk) x²/2ℏ} (1.78)
These have the form of a polynomial times the Gaussian e^{−√(mk) x²/2ℏ}. We can explicitly construct the first few wavefunctions, for example,
Ψ1(x, t) = N0 e^{−iE1 t/ℏ} [ (2ℏ√(mk))^{−1/2} (−iℏ ∂/∂x) + i(√(mk)/2ℏ)^{1/2} x ] e^{−√(mk) x²/2ℏ}
= N0 e^{−iE1 t/ℏ} [ i(√(mk)/2ℏ)^{1/2} x + i(√(mk)/2ℏ)^{1/2} x ] e^{−√(mk) x²/2ℏ}
= 2i N0 (√(mk)/2ℏ)^{1/2} e^{−iE1 t/ℏ} x e^{−√(mk) x²/2ℏ}
(1.80)
It would have been very difficult to find all these solutions to the Schrödinger equation
by brute force and also to use them to calculate quantities such as ⟨x̂⟩ and ⟨x̂2 ⟩.
We can also calculate various expectation values in this formalism. For example, in
the state |n⟩, we have
⟨x̂⟩ = ⟨n|x̂n⟩
= (i/2)(2ℏ/√(mk))^{1/2} ⟨n|(â − â†)n⟩
= (i/2)(2ℏ/√(mk))^{1/2} ( ⟨n|ân⟩ − ⟨n|â†n⟩ ) = 0, (1.81)
where the last equality follows since â|n⟩ ∝ |n − 1⟩ and â†|n⟩ ∝ |n + 1⟩ are both orthogonal to |n⟩. Similarly,
⟨x̂²⟩ = ⟨n|x̂²n⟩
= −(1/4)(2ℏ/√(mk)) ⟨n|(â − â†)²n⟩
= −(1/4)(2ℏ/√(mk)) ( ⟨n|â²n⟩ + ⟨n|(â†)²n⟩ − ⟨n|ââ†n⟩ − ⟨n|â†ân⟩ ) (1.83)
Now again (â)2 |n⟩ ∝ |n − 2⟩ and (↠)2 |n⟩ ∝ |n + 2⟩ so that the first two terms give zero.
However the last two terms give
⟨x̂²⟩ = (1/4)(2ℏ/√(mk)) ( ⟨â†n|â†n⟩ + ⟨ân|ân⟩ ) (1.84)
Now we have already seen that
|n + 1⟩ = (1/√(n+1)) â†|n⟩ , (1.85)
which implies that
â|n + 1⟩ = (1/√(n+1)) ââ†|n⟩ = (1/√(n+1)) ( N̂ + [â, â†] )|n⟩ = (1/√(n+1)) (n + 1)|n⟩ (1.86)
Thus
|n⟩ = (1/√(n+1)) â|n + 1⟩ (1.87)
and finally
⟨x̂²⟩ = (1/4)(2ℏ/√(mk)) (2n + 1) (1.88)
You can also evaluate ⟨p̂⟩ and ⟨p̂2 ⟩ this way and check that the Heisenberg uncertainty
relation is satisfied.
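These expectation values can be cross-checked with truncated ladder-operator matrices. In the sketch below (not from the notes) we set ℏ = m = k = 1, so √(mk) = 1 and ⟨x̂²⟩ = (2n+1)/2.

```python
import numpy as np

# Truncated ladder operators; keep n well below the cutoff M.
M = 20
a = np.diag(np.sqrt(np.arange(1, M)), k=1)
adag = a.T

x = 1j * np.sqrt(0.5) * (a - adag)   # x = i sqrt(hbar/2) (mk)^(-1/4) (a - a^dag)
p = np.sqrt(0.5) * (a + adag)        # p =   sqrt(hbar/2) (mk)^(1/4)  (a + a^dag)

for n in range(5):
    ket = np.zeros(M); ket[n] = 1.0
    x2 = (ket @ (x @ x) @ ket).real  # <n|x^2|n> = (2n+1)/2 in these units
    p2 = (ket @ (p @ p) @ ket).real  # <n|p^2|n> = (2n+1)/2 in these units
    # <x> = <p> = 0 in |n>, so Delta x Delta p = sqrt(<x^2><p^2>) = n + 1/2 >= 1/2
    print(n, x2, np.sqrt(x2 * p2))
```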
For completeness, we finally show how the energy spectrum of the harmonic oscillator may also be recovered by directly solving the Schrödinger equation. Consider the time-independent Schrödinger equation
−(ℏ²/2m) d²ψ/dx² + (1/2)kx²ψ = Eψ (1.89)
It is helpful to write
ψ = f(x) e^{−αx²} (1.90)
so that
dψ/dx = ( df/dx − 2αxf ) e^{−αx²}
d²ψ/dx² = ( d²f/dx² − 4αx df/dx − 2αf + 4α²x²f ) e^{−αx²}
(1.91)
and
−(ℏ²/2m)( d²f/dx² − 4αx df/dx − 2αf + 4α²x²f ) + (1/2)kx²f = Ef (1.92)
Choosing α = √(mk)/2ℏ, the x²f terms cancel and (1.92) reduces to
d²f/dx² − 4αx df/dx − 2αf = −(2mE/ℏ²) f (1.93)
This equation will have polynomial solutions. Indeed if we write f = xⁿ + . . . then matching the coefficient of the leading order xⁿ term gives
−4αn − 2α = −2mE/ℏ² (1.94)
From here we read off that
E = (2ℏ²α/m)(n + 1/2) = ℏ√(k/m) (n + 1/2) (1.95)
as before.
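The same spectrum can also be recovered by brute-force numerics: discretise the Hamiltonian on a grid and diagonalise. A minimal sketch (grid size and units are illustrative choices, with ℏ = m = k = 1 so ω = 1):

```python
import numpy as np

# Finite-difference oscillator Hamiltonian; expect E_n = n + 1/2.
n_pts, x_max = 1000, 8.0
x = np.linspace(-x_max, x_max, n_pts)
h = x[1] - x[0]

# H = -(1/2) d^2/dx^2 + x^2/2 with the standard 3-point stencil for the Laplacian
main = 1.0 / h**2 + 0.5 * x**2
off = -0.5 / h**2 * np.ones(n_pts - 1)
H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

E = np.linalg.eigvalsh(H)[:4]
print(E)    # close to 0.5, 1.5, 2.5, 3.5
```

The agreement with ℏω(n + 1/2) improves as the grid spacing shrinks, at far greater computational cost than the algebraic solution.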
Chapter 2
Angular Momentum
Let us start our exploration of Quantum Mechanics by looking at more realistic three-
dimensional models. In Classical Dynamics the simplest, but also most common, sys-
tems have rotational symmetry which leads via Noether’s theorem to conserved angular
momentum. This in turn means that the system can typically be solved. The classic
example is the Kepler problem of a planet moving around the Sun.
It’s ridiculous to think of the quantum problem of something so macroscopic as a
planet moving around something even bigger such as the Sun. The correspondence
principle tells us that we should just reproduce the classical results with incredible
accuracy. But happily there is a suitable quantum analogue: that of a negatively charged particle (an electron) moving around a heavier positively charged particle (a nucleus), as found in atoms! But before we tackle this problem head on and see Quantum
Mechanics really working for us and matching experiment we should step back and
think of transformations of states, symmetries, and in particular angular momentum in
Quantum Mechanics. We will see that the angular momentum is simply the generator of
rotations and, just like in classical mechanics, it is conserved in systems with rotational
symmetry. An example of such a system is the Hydrogen atom, which we will introduce
in this chapter and discuss in more detail later.
This transformation must preserve the norm of |ψ⟩ (recall the first Postulate) and so
1 = ⟨ψ|ψ⟩ = ⟨ψ ′ |ψ ′ ⟩ = ⟨ψ|U † U |ψ⟩. (2.2)
It follows that U must be a unitary matrix, ie.
U † U = I. (2.3)
To see this, let
|ψ⟩ = |ϕ⟩ + λ|χ⟩ (2.4)
and substitute into (2.2). After elementary algebra, this yields
λ∗( ⟨χ|ϕ⟩ − ⟨χ|U†U|ϕ⟩ ) = −λ( ⟨ϕ|χ⟩ − ⟨ϕ|U†U|χ⟩ ), (2.5)
which must be true as we vary λ ∈ C. This means that both sides must vanish, resulting
in (2.3).
We will be interested in transformations that form a group. Recall this means that,
given g1 , g2 ∈ G, we have g1 ·g2 ∈ G, where · is the group composition which is associative.
G has an identity element and every element has an inverse. We want U (gi ) to form a
representation of gi on H, which means that they provide a group homomorphism
U (g1 ) ◦ U (g2 ) = U (g1 · g2 ). (2.6)
We will also usually be interested in transformations that depend on a continuous
parameter θ (cf. translations, rotations). Then given an infinitesimal transformation
θ = δθ, we have
U (δθ) = I − iδθT + O(δθ2 ), (2.7)
where θ = 0 corresponds to no transformation hence the identity on the RHS above.
This is just the operator analog of the Taylor expansion for functions. The unitarity
condition for U then implies
U † (δθ)U (δθ) = I ⇐⇒ iδθT † − iδθT = 0 =⇒ T † = T . (2.8)
T is therefore a Hermitian matrix, hence a good candidate for an observable. More
generally, we have
|ψ ′ ⟩ = |ψ(θ + δθ)⟩ = U (θ + δθ)|ψ⟩ = |ψ(θ)⟩ − iδθT |ψ(θ)⟩ + O(δθ2 ). (2.9)
Rearranging and taking the limit δθ → 0, we find
∂|ψ(θ)⟩
i = T |ψ(θ)⟩. (2.10)
∂θ
This means that |ψ⟩ varies smoothly as a function of θ in H and the rate at which it
varies is governed by T .
Given an infinitesimal transformation, we can exponentiate it as follows,
U(θ) = lim_{N→∞} ( 1 − iθT/N )^N = e^{−iθT} . (2.11)
In other words, we can generate a finite transformation θ by applying θ/N N times and
taking the continuum limit N → ∞.
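The limit (2.11), and the unitarity of the result, can be checked numerically; the generator T below is an arbitrary 2×2 Hermitian matrix and the large N is a finite stand-in for the limit.

```python
import numpy as np
from scipy.linalg import expm

# A Hermitian generator T (arbitrary illustrative choice).
T = np.array([[0.0, 1.0], [1.0, 0.0]])
theta = 0.7

# Finite transformation from many infinitesimal steps, (1 - i theta T / N)^N
N = 100000
step = np.eye(2) - 1j * theta * T / N
U_steps = np.linalg.matrix_power(step, N)

U_exact = expm(-1j * theta * T)                # U(theta) = exp(-i theta T)

print(np.allclose(U_steps, U_exact, atol=1e-3))            # product formula converges
print(np.allclose(U_exact.conj().T @ U_exact, np.eye(2)))  # unitarity: U^dag U = I
```

Unitarity here is exactly the statement (2.8): the exponential of −i times a Hermitian matrix is unitary.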
Hence we see that we can keep |ψ⟩ fixed and transform A instead
A 7→ A′ = U † AU. (2.13)
2.1.3 Examples
a) Translations
Indeed, we have
U † (a)x̂U (a)|x⟩ = (x̂ + aI)|x⟩ = (x + a)|x⟩ ⇐⇒ x̂U (a)|x⟩ = (x + a)U (a)|x⟩. (2.16)
In other words, U (a)|x⟩ = |x + a⟩. So what is U (a)? To find it, consider first the
infinitesimal translation
U(δa) = I − (i/ℏ) δa · P̂ + O(δa²). (2.17)
Here we allowed for translations in 3d space where δa = (δax , δay , δaz ) and ℏ−1 P̂ is a
3-component vector of translation generators (operators). From (2.14) and (2.15), we
have
U†(δa) x̂ U(δa) = x̂ + δa I + O(δa²) ⇐⇒ (i/ℏ)[δa · P̂, x̂] = δa I. (2.18)
For this to hold for any 3-vector δa, we must have
[P̂i, x̂j] = −iℏ δij I. (2.19)
But these are just the canonical commutation relations between position and momenta
in QM! Exponentiating,
U(a) = e^{−(i/ℏ) a · P̂} . (2.20)
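Underlying (2.20) is the fact that the exponential of the derivative operator implements a shift: e^{a d/dx} f(x) = f(x + a). For a polynomial the Taylor series terminates, so this can be checked exactly in sympy (the test function is an arbitrary choice):

```python
import sympy as sp

# Summing the Taylor series of exp(a d/dx) applied to f reproduces f(x + a).
x, a = sp.symbols('x a')
f = x**3 - 2*x                   # polynomial test function: the series terminates

shifted = sum(a**k / sp.factorial(k) * sp.diff(f, x, k) for k in range(5))
print(sp.expand(shifted - f.subs(x, x + a)))   # 0
```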
b) Time translations
From the same argument as before, we conclude that time translations are imple-
mented by unitary operators parameterized by the time t
U(t) = e^{−(i/ℏ) Ĥ t} , (2.21)
where Ĥ is the generator of time translations. Then a state |ψ(t)⟩ is related to the state |ψ(0)⟩ at time t = 0 by the action of U(t), namely
|ψ(t)⟩ = U(t)|ψ(0)⟩ = e^{−(i/ℏ) Ĥ t}|ψ(0)⟩. (2.22)
iℏ ∂|ψ(t)⟩/∂t = Ĥ|ψ(t)⟩ (2.23)
which is nothing but the time-dependent Schrödinger equation!
By construction, we have
∂|ψH⟩/∂t = 0, (2.25)
but we could also derive the same relation by substituting (2.24) into the Schrödinger equation (2.23), provided that H doesn’t depend explicitly on time t.
Consider now the action of operators with no explicit time dependence on |ψ(t)⟩
dOH/dt = (i/ℏ)[Ĥ, OH] + e^{iĤt/ℏ} (∂O/∂t) e^{−iĤt/ℏ} (2.27)
where the last term on the RHS vanishes due to the assumed time independence of O.
(2.27) contains the same information as the Schrödinger equation.
We have recovered the relation between the Schrödinger and Heisenberg pictures discussed in Chapter 1. In the former, states |ψ(t)⟩ are time-dependent and their evolution is governed by the Schrödinger equation, while operators O are taken to be time-independent. In the latter, states |ψH⟩ are time-independent and instead the operators OH are time-dependent, governed by the Heisenberg equation (2.27). There is nothing physically different about the two pictures; you can think about them as a change in the point of view. In particular, expectation values of operators are the same regardless of the picture we choose to use.
The situation becomes more complicated if Ĥ or the operators are dependent on time.
This is usually the case in quantum field theory (there you will encounter yet another
picture, namely the “interaction” picture). Such problems are usually quite hard to
solve in generality and one has to resort to approximate methods such as perturbation
theory. We will talk about this later in the course.
2.2 Symmetries
Consider a Heisenberg picture operator that has no explicit time dependence and com-
mutes with the Hamiltonian, in other words let
dQ(t)/dt = (i/ℏ)[Ĥ, Q(t)] = 0, (2.30)
so Q(t) is time independent. Operators that are time independent in the Heisenberg
picture are said to be conserved. Note that if we start out with an eigenstate of Q at
time t = 0
Q|ψ(0)⟩ = q|ψ(0)⟩ (2.31)
at a later time t, we find
Q|ψ(t)⟩ = Q U(t)|ψ(0)⟩ = U(t) Q|ψ(0)⟩ = q U(t)|ψ(0)⟩ = q|ψ(t)⟩,
where in the second equality we used that [Q, Ĥ] = 0 and hence [Q, U(t)] = 0. We
see that a particle that starts off in a Q eigenstate will remain so at all later times. Furthermore, since [Q, Ĥ] = 0, eigenstates of Q can be chosen to also be eigenstates of Ĥ. In the Schrödinger picture, conservation is reflected in the time independence of expectation values:
d/dt ⟨ψ|Q|ψ⟩ = ⟨(dψ/dt)|Q|ψ⟩ + ⟨ψ|Q|(dψ/dt)⟩
= ⟨−(i/ℏ)Ĥψ|Q|ψ⟩ + ⟨ψ|Q| − (i/ℏ)Ĥψ⟩
= (i/ℏ)⟨ψ|Ĥ†Q − QĤ|ψ⟩
= (i/ℏ)⟨ψ|[Ĥ, Q]|ψ⟩
= 0. (2.34)
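A numerical sketch of this conservation law (not from the notes): we build an operator that commutes with a random Hamiltonian (any polynomial in H does), evolve a state, and watch ⟨Q⟩ stay constant. All matrices below are arbitrary illustrative choices, with ℏ = 1.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4))
H = (A + A.T) / 2                      # Hermitian Hamiltonian (hbar = 1)
Q = H @ H + 3 * np.eye(4)              # any polynomial in H commutes with H

psi0 = rng.normal(size=4) + 1j * rng.normal(size=4)
psi0 /= np.linalg.norm(psi0)

expQ = []
for t in np.linspace(0, 5, 6):
    psi_t = expm(-1j * H * t) @ psi0   # |psi(t)> = U(t)|psi(0)>
    expQ.append((psi_t.conj() @ Q @ psi_t).real)

print(np.ptp(expQ))                    # spread is ~0: <Q> is conserved
```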
The most important source of conserved quantities are symmetries of the Hamilto-
nian. A symmetry of Ĥ is a transformation U (θ) that leaves Ĥ invariant, namely
where T is the generator of the transformation discussed before. In this special case, T
is also a conserved quantity by (2.29). We arrive at the important conclusion that symmetries of the Hamiltonian correspond to conserved quantities. The analogous statement in classical mechanics is the famous Noether theorem.
Recall that a canonical transformation qi → qi′ , pi → p′i preserves the Poisson brackets,
namely
qi′ = qi + εTi + . . .
p′i = pi + εUi + . . . (2.39)
Here the ellipses denote higher order powers of ε and Ti, Ui are some functions on phase space. It is known that for this to be a canonical transformation there must exist a function Q on phase space such that¹
Ti = ∂Q/∂pi , Ui = −∂Q/∂qi . (2.40)
{Q, H} = 0 ⇐⇒ dQ/dt = 0, (2.43)
where we have assumed that Q has no explicit time-dependence. This is the famous Noether theorem: every infinitesimal symmetry gives rise to a conserved quantity and vice-versa (although the converse is not true in a Lagrangian formulation).
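Noether’s statement can be verified symbolically for a central potential in the plane, where the angular momentum Q = x p_y − y p_x Poisson-commutes with H. This sketch (not from the notes) implements the bracket by hand:

```python
import sympy as sp

# Classical check: for H = (px^2 + py^2)/2m + V(r) with r = sqrt(x^2 + y^2),
# the angular momentum Q = x*py - y*px satisfies {Q, H} = 0.
x, y, px, py = sp.symbols('x y p_x p_y', real=True)
m = sp.symbols('m', positive=True)
r = sp.sqrt(x**2 + y**2)
V = sp.Function('V')
H = (px**2 + py**2) / (2 * m) + V(r)
Q = x * py - y * px

def poisson(f, g):
    # {f, g} = sum_i (df/dq_i dg/dp_i - df/dp_i dg/dq_i)
    return sum(sp.diff(f, q) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, q)
               for q, p in [(x, px), (y, py)])

print(sp.simplify(poisson(Q, H)))      # 0
```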
iℏ ∂Ψ/∂t = −(ℏ²/2m) ∇²Ψ + V(x, y, z)Ψ (2.44)
Here we have assumed the potential is time-independent and
∇² = ∂²/∂x² + ∂²/∂y² + ∂²/∂z² . (2.45)
This is in general too hard to solve, but let us consider a potential with spherical symmetry, V = V(r) with r = √(x² + y² + z²). Because the system is symmetric, it is helpful to switch to coordinates that make this symmetry manifest, namely spherical coordinates:
x = r sin θ cos ϕ ,
y = r sin θ sin ϕ ,
z = r cos θ. (2.46)
¹One can easily check that the first equation in (2.38) tells us that Ti, Ui must be total derivatives wrt pi, qi respectively of some functions Q, P, while the second equation gives that the functions are the same up to a sign.
In this case (you can show this by a tedious calculation just using the chain rule, see the problem sets; it’s easier if you know Riemannian geometry)
∇² = (1/r²) ∂/∂r( r² ∂/∂r ) + (1/(r² sin θ)) ∂/∂θ( sin θ ∂/∂θ ) + (1/(r² sin²θ)) ∂²/∂ϕ² ≡ (1/r²) ∂/∂r( r² ∂/∂r ) + (1/r²) L² , (2.47)
where the differential operator L2 only depends on the angular variables. We will see
later how this is related to the angular momentum generator in some basis.
We now look for energy eigenstates. We can make the Ansatz (see section 1.4)
Ψ(t, x) = e^{−iEt/ℏ} ψ(x), (2.48)
so that (2.44) becomes
−(ℏ²/2m)[ (1/r²) ∂/∂r( r² ∂/∂r ) + L²(θ, ϕ)/r² ]ψ + V(r)ψ = Eψ. (2.49)
We can solve this equation by separation of variables. In particular we write
ψ = u(r) Y(θ, ϕ).
By the usual logic of separation of variables (divide (2.49) by uY/r² and see that the first and third terms only depend on r whereas the second is independent of r), the equation splits into two equations
−(ℏ²/2mu) d/dr( r² du/dr ) + (V(r) − E) r² = −ϵ
−(1/Y)(ℏ²/2m)[ (1/sin θ) ∂/∂θ( sin θ ∂/∂θ ) + (1/sin²θ) ∂²/∂ϕ² ]Y = ϵ , (2.52)
(the angular operator in square brackets is L²)
where ϵ is a constant. In particular the second equation knows nothing about the
potential and hence nothing about the problem at hand:
[ (1/sin θ) ∂/∂θ( sin θ ∂/∂θ ) + (1/sin²θ) ∂²/∂ϕ² + 2mϵ/ℏ² ] Y = 0. (2.53)
Indeed this equation can be completely solved and the solutions are the spherical harmonics Yl,m(θ, ϕ), labelled by two integers l, m with l ≥ 0 and |m| ≤ l.
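This eigenvalue property can be checked symbolically for a particular spherical harmonic: acting with the angular operator of (2.53) on Y_{2,1} should return −l(l+1) = −6 times the function, consistent with (2.55). A sketch using sympy’s built-in Ynm:

```python
import sympy as sp

theta, phi = sp.symbols('theta phi')
l = 2   # check one case: Y_{l,m} should satisfy L^2 Y = -l(l+1) Y in this notation
Y = sp.Ynm(l, 1, theta, phi).expand(func=True)   # explicit Y_{2,1}(theta, phi)

# Angular operator appearing in (2.53)
LY = (sp.diff(sp.sin(theta) * sp.diff(Y, theta), theta) / sp.sin(theta)
      + sp.diff(Y, phi, 2) / sp.sin(theta)**2)

print(sp.simplify(LY / Y))   # should simplify to -l(l+1) = -6
```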
On the other hand the first equation is a single ordinary differential equation for
u(r) and we have some hope to solve it:
−(1/r²) d/dr( r² du/dr ) + (2m/ℏ²)( V(r) + ϵ/r² − E ) u = 0 (2.54)
This is the only place which is sensitive to the particular problem at hand through the
choice of potential V (r).
Thus we have reduced the problem to solving two independent differential equations.
And the only one specific to the particular problem is a second order ordinary linear
differential equation. One can proceed with brute force (the Y ′ s can be found by sepa-
ration of variables once more). You can find them on Wikipedia or Mathematica so we
won’t say more about them here. Although we will say that one finds
ϵ = ℏ² l(l + 1)/2m (2.55)
for some non-negative integer l. Later we will construct them in an insightful way. Indeed
spherical harmonics arise from some very important physics (angular momentum) and
mathematics (Lie algebras) underlying the system. So let us explore...
A rotation acts on vectors v ∈ R³ as
v′ = R(α) v , (2.56)
where
v′ · v′ = v · v , det R(α) = 1. (2.57)
These two constraints imply that R(α) ∈ SO(3), ie. they are special orthogonal matrices. You may recall from studying rotations in classical mechanics that an infinitesimal rotation of v by an angle δα around some unit vector n̂ (pointing along the axis of rotation) is given by
v ′ = v + δαn̂ × v + O(δα2 ), (2.58)
where × is the cross product. Draw a picture to convince yourself that this is the case (recall that the cross product of two vectors gives a vector perpendicular to the plane defined by the two vectors).
In quantum mechanics, rotations are defined by a unitary transformation U (α) whose
action on the position operator is given by (see also the discussion in section 2.1.3)
34 CHAPTER 2. ANGULAR MOMENTUM
The important thing to keep in mind is that the components of these vectors are not
numbers, but operators on H.
From (2.7), we can write an infinitesimal rotation as
U(δα) = I − (i/ℏ) δα · Ĵ + O(δα²) .    (2.61)
Substituting this into the definition (2.59), and using (2.14) and (2.58), we deduce that
an infinitesimal rotation of x̂ is given by
(i/ℏ) [δα · Ĵ, x̂] = δα × x̂ .    (2.62)
Using
δα · Ĵ = Σ_{i=1}^{3} δα_i Ĵ_i ,  (δα × x̂)_i = Σ_{j,k} ε_{ijk} δα_j x̂_k    (2.63)
and given that (2.62) has to hold for all δαi , we find
[Ĵ_i, x̂_j] = (ℏ/i) ε_{jik} x̂_k = iℏ ε_{ijk} x̂_k .    (2.64)
We can now compute the commutator of two infinitesimal rotations (i.e. the difference between first rotating around α and then around β, and vice versa). Using (2.58) twice,
R(δβ)R(δα)v − R(δα)R(δβ)v = δβ × (δα × v) − δα × (δβ × v) = (δβ × δα) × v .    (2.66)
The last line can be checked from the definition of the cross product and please do so!
Since U(α) is a homomorphism, (2.66) implies that the same relation must hold for the corresponding unitary operators, or using (2.61),
−(1/ℏ²) [δβ · Ĵ, δα · Ĵ] = −(i/ℏ) (δβ × δα) · Ĵ .    (2.68)
Inserting the formulas (2.63) for the scalar and cross products, since (2.68) must hold for any δα, δβ we must have that
[Ĵ_i, Ĵ_j] = iℏ ε_{ijk} Ĵ_k .    (2.69)
This means that the Ĵ_i form an algebra. We will see in the next chapter that this is the Lie algebra of the Lie groups SO(3) and SU(2), whose representations we will discuss in detail.
A special case is the angular momentum from classical mechanics L = x × p, which in quantum mechanics becomes
L̂ = x̂ × p̂    (2.70)
or, in components,
L̂_i = ε_{ijk} x̂_j p̂_k ,    (2.71)
where i = 1, 2, 3 and we use the convention that repeated indices are summed over. Therefore, following canonical quantisation, we find the operator
L_i = −iℏ ε_{ijk} x_j ∂/∂x_k ,    (2.72)
for example
L_1 = −iℏ x² ∂/∂x³ + iℏ x³ ∂/∂x² .    (2.73)
To gain some intuition let us look at
[L_1, L_2] = L_1 L_2 − (1 ↔ 2)
= −ℏ² ε_{1kl} x^k ∂/∂x^l ( ε_{2mn} x^m ∂/∂x^n ) − (1 ↔ 2)
= −ℏ² ε_{1kl} ε_{2mn} ( x^k δ_{lm} ∂/∂x^n + x^k x^m ∂²/∂x^l ∂x^n ) − (1 ↔ 2)
= −ℏ² ε_{1kl} ε_{2ln} x^k ∂/∂x^n − ℏ² ε_{1kl} ε_{2mn} x^k x^m ∂²/∂x^l ∂x^n − (1 ↔ 2) .    (2.74)
Let's look at the second term. We see that it is symmetric under k ↔ m and l ↔ n. Therefore it is symmetric under 1 ↔ 2 and is cancelled when we subtract 1 ↔ 2. Thus we have
[L_1, L_2] = −ℏ² ε_{1kl} ε_{2ln} x^k ∂/∂x^n − (1 ↔ 2)
= −ℏ² ε_{123} ε_{231} x² ∂/∂x¹ − (1 ↔ 2)
= −ℏ² x² ∂/∂x¹ − (1 ↔ 2)
= iℏ L_3 ,    (2.75)
where we have used the fact that l ≠ 1 in ε_{1kl} and l ≠ 2 in ε_{2ln}, so the only non-zero term comes from l = 3 and hence k = 2, n = 1. This is a general fact,
[L_i, L_j] = iℏ ε_{ijk} L_k ,
which follows from the identity ε_{ikl} ε_{jnl} = δ_{ij} δ_{kn} − δ_{in} δ_{kj} . We find again (2.69). This structure is powerful enough for us to deduce more or less
anything we need to know about states with angular momentum from purely algebraic
considerations. In the next chapter we will also discuss another important observable
that obeys the same relation (2.69) but has no classical analog: the spin.
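As a quick numerical sanity check (not part of the original notes), the relations (2.69) with ℏ = 1 can be verified for an explicit set of 3 × 3 matrices, taking the standard choice (J_k)_{ij} = −iε_{kij}:

```python
import numpy as np

def levi_civita():
    """Build the totally antisymmetric tensor eps_{ijk}."""
    eps = np.zeros((3, 3, 3))
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[i, j, k] = 1.0
        eps[i, k, j] = -1.0
    return eps

eps = levi_civita()
# (J_k)_{ij} = -i eps_{kij}: one concrete 3-dimensional representation
J = [-1j * eps[k] for k in range(3)]

# Check [J_i, J_j] = i eps_{ijk} J_k for all pairs (hbar = 1)
for i in range(3):
    for j in range(3):
        comm = J[i] @ J[j] - J[j] @ J[i]
        expected = 1j * sum(eps[i, j, k] * J[k] for k in range(3))
        assert np.allclose(comm, expected)
```

The same check would pass for any other representation, e.g. the Pauli matrices divided by two.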
Chapter 3

All About su(2)

Consider (as in the Stern-Gerlach (SG) experiment) a neutral atom of mass M moving in a magnetic field B, with Hamiltonian
H = P̂²/(2M) − µ̂ · B .    (3.1)
Classically, µ would correspond to the magnetic dipole moment of the atom and is proportional to the angular momentum of electrons going around a loop. In quantum mechanics, the magnetic dipole moment is proportional to the spin, an intrinsic property of atoms:
µ → µ̂ ≡ (µ/ℏs) Ŝ .    (3.2)
Ŝ obeys the same commutation relations as the angular momentum, namely
[Ŝ_i, Ŝ_j] = iℏ ε_{ijk} Ŝ_k ,    (3.3)
and so the state of the atom transforms non-trivially under Ŝ. Essentially, the magnetic moment captures deviations of the atom from spherical symmetry (i.e. there is a preferred direction of “spin” around an axis).
As discussed before, observables are obtained by computing expectation values of
various operators whose evolution is governed by the Heisenberg equations of motion.
Recalling that in the Heisenberg picture states are time-independent, we have
d⟨x̂⟩/dt = ⟨ (i/ℏ)[H, x̂] ⟩ = ⟨ (i/ℏ)[P̂²/(2M), x̂] ⟩ = ⟨P̂⟩/M ,    (3.4)
where we used the commutation relations for P̂ and x̂ (2.19). This is the analog of the
classical relation between position and momentum. Furthermore, we have
d⟨P̂⟩/dt = (i/ℏ) ⟨[H, P̂]⟩ = ⟨∇(µ̂ · B)⟩ .    (3.5)
This is the analog of Newton's second law, where the RHS is interpreted as a force that acts on the atom.
Provided that the magnetic field is inhomogeneous, i.e. ∇(µ̂ · B) ≠ 0, the atoms experience a force that pushes them in the direction of increasing B if their spins are aligned, and in the direction of decreasing B if their spins are anti-aligned. If the atoms are unpolarized, classically we would expect a continuous distribution of atoms on a remote screen, as shown in Figure ??. Instead the SG experiment found in the case of silver atoms that the beam split into two after passing through the magnetic field. This observation was consistent with the quantization of µ (i.e. Ŝ promoted to a vector of operators, one for each direction in space). In this case, dim H = 2 and each component of µ̂ is a 2 × 2 matrix with eigenvalues ±1 (in some units).
We will see that this case corresponds to a 2-dimensional representation of the su(2) Lie algebra. In general su(2) admits (2s + 1)-dimensional representations, and indeed, going through the periodic table, the pattern on the screen in the SG experiment was found to consist of an integer number of lines! This observation is fully consistent with the description of spin in terms of representations of su(2) with s ∈ ½ℕ. The goal of this Chapter is to explain these observations using mathematics.
[ · , · ]:G×G →G (3.8)
[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0. (3.9)
In a basis {T_a} of G the bracket is determined by the structure constants,
[T_a, T_b] = i f_{ab}^c T_c ,    (3.10)
where there is a sum over c and the T_a are known as generators of the Lie-algebra.
The factor of i requires some explanation. It is often not there in the mathematics
literature. However in Physics we like our generators to be Hermitian (just like the
angular momentum operators). If the T_a are Hermitian then
[T_a, T_b]† = −i f_{ab}^{c*} T_c = [T_b, T_a] = −i f_{ab}^c T_c .    (3.11)
Thus the factor of i ensures that the f_{ab}^c are real. In Mathematics one often drops the factor of i and takes T_a = −T_a†. This is essentially the same reason why we included a factor of i in the expansion of (2.7).
A representation is a linear map
Π : G → End(V ) (3.12)
such that
Π([A, B]) = [Π(A), Π(B)] ≡ Π(A)Π(B) − Π(B)Π(A) .    (3.13)
Here V is a vector space (typically complex) and End(V) is the set of endomorphisms of V (linear maps from V to V). In physics language V = Cᴺ and End(V) are N × N matrices.
We are particularly interested in irreducible representations. These are representations with no non-trivial invariant subspaces. That is, there are no proper, non-zero vector subspaces of V that are mapped to themselves by all the Π(A).
Let us suppose that we are given matrices Ji that satisfy [Ji , Jj ] = iϵijk Jk with
i = 1, 2, 3. We have dimG = 3 and fabc = ϵabc which define the su(2) ≃ so(3) Lie algebra.
Since ϵijk are real, Ji† = Ji but we do not know anything else yet and we certainly don’t
assume that they are 2 × 2 matrices or differential operators. It is important not to
confuse the dimension of the Lie-algebra dimG with the dimension of the representation,
N. We will see that we can construct N × N Hermitian matrices J_i, i.e. N-dimensional representations of su(2), for any N = 1, 2, 3, . . ..
We note that the quadratic Casimir
J² = (J_1)² + (J_2)² + (J_3)²    (3.15)
commutes with all the generators:
[J², J_i] = 0 .    (3.16)
There is a famous theorem known as Schur’s lemma which states that any such Casimir
must act as a multiple of the identity in an irreducible representation. This means that
J 2 = λI in any irreducible representation. In practical terms if J 2 commutes with all
other operators then nothing will change the eigenvalue of J 2 .
Since the J_i are Hermitian we can choose to diagonalise one, but only one since su(2) has rank 1, say J_3. Thus the representation has a basis of states labelled by eigenvalues of J_3 and J²:
J3 |λ, m⟩ = m|λ, m⟩ J 2 |λ, m⟩ = λ|λ, m⟩. (3.17)
In analogy to the harmonic oscillator we can trade J_1 and J_2 for the operators
J_± = J_1 ± iJ_2 .    (3.18)
Notice that
[J_3, J_±] = [J_3, J_1] ± i[J_3, J_2] = iJ_2 ± J_1 = ±J_± .    (3.19)
Therefore we have
J_3 J_± |λ, m⟩ = (J_± J_3 ± J_±)|λ, m⟩ = (m ± 1) J_± |λ, m⟩ ,    (3.20)
so that
J_± |λ, m⟩ = c_±(m) |λ, m ± 1⟩ ,    (3.21)
where the constants c_+(m) and c_−(m) are chosen to ensure that the states are normalized (we are assuming for simplicity that the eigenspaces of J_3 are one-dimensional - we will return to this shortly).
It will also be useful to know that
[J+ , J− ] = [J1 + iJ2 , J1 − iJ2 ] = −i[J1 , J2 ] + i[J2 , J1 ] = −2i × iϵ123 J3 = 2J3 . (3.22)
Furthermore, it is easy to show that in terms of J± , the Casimir (3.15) takes the form
(please check!)
J² = ½ (J_+ J_− + J_− J_+) + J_3² .    (3.23)
Using these identities, we can now determine the normalization constants in (3.21) as follows. Using that the states are normalized to 1, namely ⟨m|m⟩ = ⟨m + 1|m + 1⟩ = 1, and (3.18), we have
|c_+(m)|² = ⟨λ, m| J_− J_+ |λ, m⟩ ,  |c_−(m)|² = ⟨λ, m| J_+ J_− |λ, m⟩ .    (3.24)
We can now conveniently use (3.23) in the first identity and (3.22) in the second one to find
|c_+|² + |c_−|² = 2⟨λ, m|(J² − J_3²)|λ, m⟩ = 2λ − 2m²
|c_+|² − |c_−|² = −2⟨λ, m|J_3|λ, m⟩ = −2m ,    (3.25)
where in the last line we used that |λ, m⟩ is an eigenstate of J 2 , J3 (3.17). We can now
easily solve for c±
c_+ = √(λ − m² − m)
c_− = √(λ − m² + m) .    (3.26)
Thus we see that any irrep of su(2) is labelled by λ and has states with J3 eigenvalues
m, m ± 1, m ± 2, . . .. If we look for finite dimensional representations then there must be
a highest value of J_3-eigenvalue m_h and lowest value m_l. Furthermore the corresponding states must satisfy
J_+ |λ, m_h⟩ = 0 ,  J_− |λ, m_l⟩ = 0 ,
i.e. c_+(m_h) = 0 and c_−(m_l) = 0. From (3.26) this gives
λ = m_h² + m_h = m_h(m_h + 1)  and  λ = m_l² − m_l = m_l(m_l − 1) ,
so that m_l(m_l − 1) = m_h(m_h + 1). This is a quadratic equation for m_l as a function of m_h and hence has two solutions. Simple inspection tells us that
ml = −mh or ml = mh + 1 . (3.31)
The second solution is impossible since ml ≤ mh and hence the spectrum of J3 eigenvalues
is:
mh , mh − 1, ..., −mh + 1, −mh , (3.32)
with a single state assigned to each eigenvalue. Furthermore there are 2mh + 1 such
eigenvalues and hence the representation has dimension 2mh + 1. This must be an
integer so we learn that
2mh = 0, 1, 2, 3.... . (3.33)
We now return to the issue whether or not the eigenspaces |λ, m⟩ can be more
than one-dimensional. If the space of eigenvalues with m = mh is N -dimensional then
when we act with J− we obtain N -dimensional eigenspaces for each eigenvalue m. This
would lead to a reducible representation where one could simply take one-dimensional
subspaces of each eigenspace. Let us then suppose that there is only a one-dimensional
eigenspace for m = mh , spanned by |λ, mh ⟩. It is then clear that acting with J− produces
all states and each eigenspace of J3 has only a one-dimensional subspace spanned by
|λ, m⟩ ∝ (J_−)ⁿ |λ, m_h⟩ for some n = 0, 1, ..., 2m_h.
In summary, and changing notation slightly to match standard conventions, we have obtained a (2l + 1)-dimensional unitary representation determined by any l = 0, 1/2, 1, 3/2, ... having the Casimir J² = l(l + 1) I (in terms of what we had before, l = m_h). The states can be labelled by |l, m⟩ where m = −l, −l + 1, ..., l − 1, l.
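The construction above can be made completely explicit on a computer. The sketch below (the helper name su2_irrep is ours, and ℏ = 1) builds the (2l+1)-dimensional matrices from the coefficients (3.26) and checks both the algebra and the value of the Casimir:

```python
import numpy as np

def su2_irrep(l):
    """Build the (2l+1)-dimensional irrep of su(2), hbar = 1.
    J3 is diagonal; J+ has entries c+(m) = sqrt(l(l+1) - m^2 - m) as in (3.26)."""
    dim = int(round(2 * l)) + 1
    m = np.array([l - k for k in range(dim)])   # eigenvalues l, l-1, ..., -l
    J3 = np.diag(m)
    Jp = np.zeros((dim, dim))
    for k in range(1, dim):                     # J+ |l, m> = c+(m) |l, m+1>
        mm = m[k]
        Jp[k - 1, k] = np.sqrt(l * (l + 1) - mm**2 - mm)
    Jm = Jp.T                                   # J- = (J+)†, entries are real
    J1 = (Jp + Jm) / 2
    J2 = (Jp - Jm) / (2j)
    return J1, J2, J3

# Check [J1, J2] = i J3 and J^2 = l(l+1) I for a few values of l
for l in [0.5, 1, 1.5, 2]:
    J1, J2, J3 = su2_irrep(l)
    dim = int(round(2 * l)) + 1
    assert np.allclose(J1 @ J2 - J2 @ J1, 1j * J3)
    Casimir = J1 @ J1 + J2 @ J2 + J3 @ J3
    assert np.allclose(Casimir, l * (l + 1) * np.eye(dim))
```

For l = 1/2 this reproduces the Pauli matrices divided by two, as in the example below.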
Let us look at some examples.
l = 0: Here we have just one state |0, 0⟩ and the matrices Ji act trivially. This is the
trivial representation.
l = 1/2: Here we have 2 states:
|1/2, 1/2⟩ = (1, 0)ᵀ ,  |1/2, −1/2⟩ = (0, 1)ᵀ .    (3.34)
By construction J_3 is diagonal:
J_3 = ( 1/2 0 ; 0 −1/2 ) .    (3.35)
Acting with J_+ using (3.26) (here λ = 3/4) gives
J_+ |1/2, −1/2⟩ = √(3/4 − 1/4 + 1/2) |1/2, 1/2⟩ = |1/2, 1/2⟩ ,  J_+ |1/2, 1/2⟩ = 0 ,    (3.36)
so that
J_+ = ( 0 1 ; 0 0 ) .    (3.37)
And we can determine J_− through
J_− |1/2, 1/2⟩ = √(3/4 − 1/4 + 1/2) |1/2, −1/2⟩ = |1/2, −1/2⟩ ,  J_− |1/2, −1/2⟩ = 0 ,    (3.38)
so that
J_− = ( 0 0 ; 1 0 ) .    (3.39)
Or alternatively
J_1 = ½ (J_+ + J_−) = ½ ( 0 1 ; 1 0 )
J_2 = (1/2i) (J_+ − J_−) = ½ ( 0 −i ; i 0 ) .    (3.40)
A group element near the identity can be expanded as g = I − i δθ^i J_i + O(δθ²), where the δθ_i ≪ 1 are arbitrarily small parameters. This is exactly the same expansion we have encountered before in the discussion of rotations (there n = 3). To obtain representations of the group we simply exponentiate these matrices (see also (??)):
g = e^{−i θ^i J_i} .    (3.42)
For l = 1/2 we can take J_i = τ_i/2, where the τ_i are the Pauli matrices, and write
θ = θ n̂ ,    (3.44)
with n̂ a unit vector. Since (n̂ · τ)² = I, this means that (i n̂ · τ)² = −I. Thus we have (the proof is the same as for e^{iθ} = cos θ + i sin θ)
g = e^{−(i/2) θ·τ} = cos(θ/2) I − i n̂ · τ sin(θ/2)    (3.46)
and hence
g† = cos(θ/2) I + i n̂ · τ sin(θ/2) .    (3.47)
In other words g† = e^{(i/2) θ·τ} = g^{−1}, so we have unitary matrices. We can also compute
det g = det e^{−(i/2) θ·τ} = e^{tr(−(i/2) θ·τ)} = 1 .    (3.48)
This follows because the τ_i are traceless. Thus we have g ∈ SU(2), the group of 2 × 2 unitary matrices with unit determinant.
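The closed form (3.46) is easy to test numerically. The sketch below (the helper name su2_element is ours) checks unitarity, det g = 1, and that a rotation by 2π is represented by −I, anticipating the SU(2)/Z₂ discussion below:

```python
import numpy as np

# Pauli matrices
tau = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]

def su2_element(theta, n):
    """Closed form (3.46): g = cos(theta/2) I - i (n·tau) sin(theta/2)."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    ntau = sum(n[i] * tau[i] for i in range(3))
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * ntau

g = su2_element(0.7, [1, 2, 2])
assert np.allclose(g.conj().T @ g, np.eye(2))   # g is unitary, (3.47)
assert np.isclose(np.linalg.det(g), 1)          # det g = 1, (3.48)

# theta = 2*pi gives g = -I: the nontrivial element of the centre
assert np.allclose(su2_element(2 * np.pi, [0, 0, 1]), -np.eye(2))
```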
One caveat: these are unitary representations because we assumed that the J_i are Hermitian (so the group elements e^{−iθ^i J_i} are unitary). But we started off by looking at
rotations, that is SO(3). What happened?
Here we encounter the fact that there are two groups with the same Lie-algebra: SU(2) and SO(3). Indeed there is a well-known theorem:
SO(3) ≅ SU(2)/Z₂    (3.50)
where Z₂ is generated by the centre of SU(2) - that is to say the set of all elements of SU(2) that commute with every element. These must be multiples of the identity, which leaves just ±I. In particular, setting θ = 2π in (3.46) gives g = −I, so a rotation by 2π is represented in SU(2) by −I.
To exhibit the isomorphism, let's go back to our discussion of rotations, which are elements of SO(3). Looking at infinitesimal rotations in R³, we have
R = I − iδθ · J + . . . (3.52)
with
J_1 = i ( 0 0 0 ; 0 0 1 ; 0 −1 0 ) ,  J_2 = i ( 0 0 1 ; 0 0 0 ; −1 0 0 ) ,  J_3 = i ( 0 1 0 ; −1 0 0 ; 0 0 0 ) ,    (3.53)
where rows are separated by semicolons. For example, for δθ = (0, 0, δθ₃),
R (x, y, z)ᵀ = ( x + δθ₃ y , y − δθ₃ x , z )ᵀ ,    (3.54)
which is indeed an infinitesimal rotation around the z axis. One can see explicitly that
[J_i, J_j] = i ε_{ijk} J_k .    (3.55)
This is the su(2) Lie algebra, but unlike the Pauli matrices, the generators of rotations are 3 × 3 matrices. On the one hand, this is a different representation of su(2).
On the other hand, exponentiating these gives a relatively complicated expression.
Restricting again our attention to rotations around z, we get
g = e^{−iθ₃ J₃} = ( 0 0 0 ; 0 0 0 ; 0 0 1 ) + cos θ₃ ( 1 0 0 ; 0 1 0 ; 0 0 0 ) − i sin θ₃ J₃
= ( cos θ₃ sin θ₃ 0 ; −sin θ₃ cos θ₃ 0 ; 0 0 1 ) .    (3.56)
Here we have used the fact that J3 , and all powers of J3 , split into a non-trivial 2 × 2
bit that squares to minus the identity and a trivial part with only zeros. Thus we have
rotations with θ3 ∈ [0, 2π].
We can now see that the homomorphism Φ : SU(2) → SO(3) is
Φ( e^{−(i/2) θ·τ} ) = e^{−i θ·J} .    (3.57)
It's also interesting to think about how SU(2) and SO(3) look as spaces (manifolds). Rewriting equation (3.46) as
g = A I + i B · τ ,    (3.58)
with A = cos(θ/2) and B = −n̂ sin(θ/2), we see that A² + B · B = 1. Thus SU(2) is the three-sphere S³, which is simply connected. In SO(3), on the other hand, g and −g are identified, so a path in SU(2) from I to −I descends to a loop that is closed in SO(3) but can't be continuously deformed to a point. But curiously enough going around this loop twice is deformable to a point, i.e. π₁(SO(3)) = Z₂.
More general representations arise by considering tensors T_{µ₁...µₙ} over C² for su(2) or R³ for SO(3). The group elements act on each of the µ_i indices in the natural way. In general this does not give an irreducible representation. For larger groups such as SU(N) and SO(N), taking T_{µ₁...µₙ} to be totally anti-symmetric does lead to an irreducible representation. So does taking it totally symmetric and traceless on any pair of indices.
What happens in Nature? We will see that the spherical harmonics are basis functions for representations of SO(3). They arise because there is a fundamental SO(3) rotational symmetry of space. For spherical harmonics l is an integer. This is because for rotations, we need the image of −I in SU(2) under the homomorphism Φ in (3.57) to be the identity in SO(3), namely
e^{2πi J₃} = I .    (3.60)
L_i = −iℏ ε_{ijk} x_j ∂/∂x_k ≡ (x̂ × p̂)_i ,    (4.1)
obey the su(2) commutation relations. These are clearly not N × N matrices, so where
do they fit? Recall that the position and momentum operators (x̂, p̂) act on the space
of differentiable functions on R3 which is an infinite dimensional Hilbert space. This
means that Li in (4.1) must also act on this space. However, we will now show that they
do not act irreducibly on this space. In other words, functions on R3 (or more precisely
functions on the 2-dimensional sphere S 2 ) form a reducible representation of the su(2)
algebra. We can study the action of (4.1) on this space to decompose it into irreducible
subspaces.
Following the abstract construction of finite-dimensional irreducible representations
in Section ??, we want to look for eigenstates of the Casimir L² and of L₃. As before, we introduce the ladder operators L_± = L₁ ± iL₂.
Note that all these equations follow from the su(2) algebra
[L_i, L_j] = iℏ ε_{ijk} L_k ,
and hence must hold for our differential operators (4.1) as well. Nevertheless, it will be helpful to see what the previous analysis looks like in terms of these differential operators.
Consider first purely radial functions, ψ = f(r²). For these,
L_i ψ = −iℏ ε_{ijk} x_j ∂ψ/∂x_k = −2iℏ ε_{ijk} x_j x_k f′(r²) = 0 ,    (4.6)
since ε_{ijk} x_j x_k = 0 by antisymmetry, so these form an invariant subspace.
To continue it will be convenient to work in a spherical coordinate system,
x = r sin θ cos ϕ
y = r sin θ sin ϕ
z = r cos θ (4.7)
and express the L_i generators as differential operators acting on S². This is done by a straightforward coordinate transformation (see Problem Sheets). First, we invert (4.7) and find
r = √(x² + y² + z²) ,    (4.8)
θ = arccos(z/r) ,  ϕ = arctan(y/x) .    (4.9)
The Jacobian of the transformation is
( ∂r/∂x ∂r/∂y ∂r/∂z ; ∂θ/∂x ∂θ/∂y ∂θ/∂z ; ∂ϕ/∂x ∂ϕ/∂y ∂ϕ/∂z )
= ( sin θ cos ϕ , sin θ sin ϕ , cos θ ; cos θ cos ϕ/r , cos θ sin ϕ/r , −sin θ/r ; −sin ϕ/(r sin θ) , cos ϕ/(r sin θ) , 0 ) ,    (4.10)
so that by the chain rule
∂/∂x = (∂r/∂x) ∂/∂r + (∂θ/∂x) ∂/∂θ + (∂ϕ/∂x) ∂/∂ϕ
  = sin θ cos ϕ ∂/∂r + (cos θ cos ϕ/r) ∂/∂θ − (sin ϕ/(r sin θ)) ∂/∂ϕ ,
∂/∂y = (∂r/∂y) ∂/∂r + (∂θ/∂y) ∂/∂θ + (∂ϕ/∂y) ∂/∂ϕ
  = sin θ sin ϕ ∂/∂r + (cos θ sin ϕ/r) ∂/∂θ + (cos ϕ/(r sin θ)) ∂/∂ϕ ,
∂/∂z = (∂r/∂z) ∂/∂r + (∂θ/∂z) ∂/∂θ + (∂ϕ/∂z) ∂/∂ϕ
  = cos θ ∂/∂r − (sin θ/r) ∂/∂θ .    (4.11)
Finally, this allows us to express the angular momentum generators in spherical coordinates:
L₁ = −iℏ y ∂/∂z + iℏ z ∂/∂y
  = −iℏ r sin θ sin ϕ ( cos θ ∂/∂r − (sin θ/r) ∂/∂θ )
    + iℏ r cos θ ( sin θ sin ϕ ∂/∂r + (cos θ sin ϕ/r) ∂/∂θ + (cos ϕ/(r sin θ)) ∂/∂ϕ )
  = iℏ ( sin ϕ ∂/∂θ + cot θ cos ϕ ∂/∂ϕ ) ,    (4.12)
L₂ = −iℏ z ∂/∂x + iℏ x ∂/∂z
  = −iℏ r cos θ ( sin θ cos ϕ ∂/∂r + (cos θ cos ϕ/r) ∂/∂θ − (sin ϕ/(r sin θ)) ∂/∂ϕ )
    + iℏ r sin θ cos ϕ ( cos θ ∂/∂r − (sin θ/r) ∂/∂θ )
  = iℏ ( −cos ϕ ∂/∂θ + cot θ sin ϕ ∂/∂ϕ ) ,    (4.13)
and similarly
L₃ = −iℏ x ∂/∂y + iℏ y ∂/∂x = −iℏ ∂/∂ϕ .    (4.14)
L₊ = L₁ + iL₂ = iℏ e^{iϕ} ( −i ∂/∂θ + cot θ ∂/∂ϕ ) ,
L₋ = L₁ − iL₂ = iℏ e^{−iϕ} ( i ∂/∂θ + cot θ ∂/∂ϕ ) .    (4.15)
From here we see that
L² = −ℏ² ( (1/sin θ) ∂/∂θ ( sin θ ∂/∂θ ) + (1/sin²θ) ∂²/∂ϕ² ) .    (4.16)
For a spherically symmetric potential the Hamiltonian can then be written as
H = −(ℏ²/2m) (1/r²) ∂/∂r ( r² ∂/∂r ) + V_eff(r) ,    (4.17)
with
V_eff(r) = V(r) + L²/(2mr²)
  = V(r) + ℏ² l(l + 1)/(2mr²) ,    (4.18)
where in the second line we have assumed that we are looking at angular momentum eigenstates, i.e. eigenstates of L². It's an easy exercise to see that L² does indeed commute with H.
From last Chapter, to construct su(2) irreducible representations, we need a highest weight state,
L₃ |l, m⟩ = ℏ m |l, m⟩ ,  L₊ |l, m⟩ = 0 ,    (4.19)
and we saw that the last condition implies that l = m. We would like to construct functions on the sphere that form a basis for the l representations of su(2). To this end, define
Yl,m (θ, ϕ) ≡ ⟨θ, ϕ|l, m⟩, (4.20)
L3 Yl,m = mℏYl,m ,
L2 Yl,m = ℏ2 l(l + 1)Yl,m . (4.21)
The set {Yl,m |m = −l, ..., +l} therefore provides an irreducible, (2l + 1)-dimensional
representation of su(2) inside the space of all differentiable functions on S 2 (viewed as
the unit sphere in R3 ).
Let us look for the associated eigenfunctions corresponding to the states |l, m⟩, which
are also known as spherical harmonics. We can construct them easily as solutions to
differential equations imposed by (4.21), namely
L₃ Y_{l,m} = −iℏ ∂Y_{l,m}/∂ϕ = mℏ Y_{l,m} .    (4.22)
Thus
Y_{l,m}(θ, ϕ) = e^{imϕ} Θ_{l,m}(θ)    (4.23)
for some function Θl,m (θ). To fix Θl,m , we start with the highest weight state |l, l⟩. We
know that
0 = L₊ Y_{l,l} = iℏ e^{iϕ} ( −i ∂/∂θ + cot θ ∂/∂ϕ ) e^{ilϕ} Θ_{l,l}(θ)
⟹ 0 = ( −i d/dθ + il cot θ ) Θ_{l,l}(θ) .    (4.24)
This is easily solved:
dΘ_{l,l}/Θ_{l,l} = l cot θ dθ = l d(ln sin θ)  ⟹  Θ_{l,l}(θ) = C_l (sin θ)^l ,    (4.25)
for some constant C_l. The remaining Y_{l,m} are then obtained by acting repeatedly with L₋, Y_{l,m} ∝ (L₋)^{l−m} Y_{l,l} = e^{imϕ} Θ_{l,m}(θ) for some explicit function Θ_{l,m}. Note that we haven't worried here about the normalization, but one can. In practice these functions are given in books or built in to Mathematica.
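The solution (4.25) can be checked numerically: for Θ_{l,l} = sin^l θ, a finite-difference derivative should agree with l cot θ Θ_{l,l}. The grid and tolerance below are arbitrary choices:

```python
import numpy as np

l = 3
theta = np.linspace(0.2, np.pi - 0.2, 2001)   # stay away from the poles
Theta = np.sin(theta) ** l                    # Theta_{l,l} from (4.25), C_l = 1

# Compare d(Theta)/d(theta) with the right-hand side l cot(theta) Theta
dTheta = np.gradient(Theta, theta)
rhs = l * (np.cos(theta) / np.sin(theta)) * Theta
# interior points only: np.gradient is less accurate at the endpoints
assert np.allclose(dTheta[1:-1], rhs[1:-1], atol=1e-3)
```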
A special case of a beautiful result in mathematics (harmonic analysis), known as the
Peter-Weyl theorem then implies that the regular representation of su(2) on functions on
S 2 decomposes as a direct sum of irreducible, unitary representations. In other words,
ψ(θ, ϕ) ≡ ⟨θ, ϕ|ψ⟩ = Σ_{ℓ=0}^{∞} Σ_{m=−ℓ}^{ℓ} c_{ℓ,m} ⟨θ, ϕ|ℓ, m⟩ = Σ_{ℓ=0}^{∞} Σ_{m=−ℓ}^{ℓ} c_{ℓ,m} Y_{ℓ,m}(θ, ϕ) .    (4.27)
In fact this is a general result which applies to any compact group G and allows one to
decompose L2 functions on G as direct sums of all unitary, irreducible representations of
G. The most familiar example of this fact is the Fourier series, in which case G = U (1)
(the group of unit complex numbers).
(p1 p2 ) ∈ H1 ⊗ H2 . (4.29)
|j T , mT , j1 , j2 ⟩, (4.34)
where
J_i^T ≡ J_i^{(1)} + J_i^{(2)} ,  i = 1, 2, 3 ,
(J^T)² ≡ Σ_{i=1}^{3} (J_i^T)² ,    (4.35)
and
J₃^T |j^T, m^T, j₁, j₂⟩ = m^T |j^T, m^T, j₁, j₂⟩ ,
(J^T)² |j^T, m^T, j₁, j₂⟩ = j^T(j^T + 1) |j^T, m^T, j₁, j₂⟩ ,    (4.36)
which is possible because (J^T)² and J₃^T commute. Here j^T and m^T relate to the total angular momentum-squared and the total angular momentum around the z-axis. Importantly, one can check from the definitions that
[JiT , JjT ] = iϵijk JkT , (4.38)
and it is the invariant subspaces of this su(2) algebra that we want to identify inside the
vector space (2j₁ + 1) ⊗ (2j₂ + 1). The corresponding raising and lowering operators J_±^T are constructed as before.
In general, the change of basis from |j₁, m₁; j₂, m₂⟩ to a basis labelled by |j^T, m^T, j₁, j₂⟩ takes the form
|j^T, m^T, j₁, j₂⟩ = Σ_{m₁=−j₁}^{j₁} Σ_{m₂=−j₂}^{j₂} C^{j^T,m^T}_{m₁,m₂}(j₁, j₂) |j₁, m₁; j₂, m₂⟩ .    (4.39)
The constants C^{j^T,m^T}_{m₁,m₂}(j₁, j₂) are known as Clebsch-Gordan coefficients. They can be computed recursively, as we now describe. The idea is to start with a highest weight state with respect to J₃^T. An obvious choice is the state of maximum eigenvalue of
J3T = J3 ⊗ I + I ⊗ J3 , (4.40)
Acting on this state with J₋^T lowers m^T by one unit, producing
c₋(j₁ + j₂, j₁ + j₂) |j₁ + j₂, j₁ + j₂ − 1, j₁, j₂⟩
(the first label being j^T and the second m^T),
where c− were defined in (??). We emphasize that the derivation of these coefficients
only relied on the existence of the su(2) algebra and not the form of the generators. It
therefore applies equally well to the su(2) generated by J_i^T. Straightforward evaluation using (??) gives
c₋(j₁, j₁) = √(2j₁) ,
c₋(j₂, j₂) = √(2j₂) ,
c₋(j₁ + j₂, j₁ + j₂) = √(2(j₁ + j₂)) .    (4.44)
The other coefficients can be determined by continuing to act with J₋^T on this state, until we hit the lowest weight state |j₁ + j₂, −(j₁ + j₂), j₁, j₂⟩ (with j^T = j₁ + j₂, m^T = −j^T). We have therefore isolated a
2(j1 + j2 ) + 1 dimensional su(2) subspace inside (2j1 + 1) ⊗ (2j2 + 1). Provided (2j1 +
1) · (2j2 + 1) − 2(j1 + j2 ) − 1 > 0, we can continue applying the same procedure to
the highest state with respect to J+T in the complement of the subspace that we just
isolated. It is not hard to see that the next such state has mT = j1 + j2 − 1 = j T . Going
on like this, one can show that
(2j₁ + 1) ⊗ (2j₂ + 1) = ⊕_{J=|j₁−j₂|}^{j₁+j₂} (2J + 1) .    (4.46)
We can check that this formula makes sense by comparing the dimensions of the
LHS and the RHS:
dim(RHS) = Σ_{J=|j₁−j₂|}^{j₁+j₂} (2J + 1) = Σ_{J=0}^{2j₂} ( 2(J + j₁ − j₂) + 1 ) = (2j₂ + 1)(2j₁ + 1) = dim(LHS) ,
where we took j₁ ≥ j₂ without loss of generality.
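The dimension count in (4.46) is pure bookkeeping and can be checked directly for many values of j₁, j₂, including half-integers:

```python
# Check dim LHS = dim RHS of (4.46): (2j1+1)(2j2+1) = sum over J of (2J+1),
# where J runs from |j1 - j2| to j1 + j2 in integer steps.
j_values = [k / 2 for k in range(8)]   # 0, 1/2, 1, ..., 7/2
for j1 in j_values:
    for j2 in j_values:
        lhs = (2 * j1 + 1) * (2 * j2 + 1)
        n_terms = int(j1 + j2 - abs(j1 - j2)) + 1
        rhs = sum(2 * (abs(j1 - j2) + k) + 1 for k in range(n_terms))
        assert lhs == rhs
```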
Let us be more specific and look at the simplest non-trivial case of j1 = j2 = 1/2
which appeared before. To clean up the notation, we denote in this exercise
|1/2, m₁; 1/2, m₂⟩ ≡ |m₁, m₂⟩ = |m₁⟩ ⊗ |m₂⟩ .    (4.48)
In the tensor product we have the following basis of states (this is exactly the basis of
(4.30))
|1/2, 1/2⟩ = |1/2⟩ ⊗ |1/2⟩
|1/2, −1/2⟩ = |1/2⟩ ⊗ |−1/2⟩
|−1/2, 1/2⟩ = |−1/2⟩ ⊗ |1/2⟩
|−1/2, −1/2⟩ = |−1/2⟩ ⊗ |−1/2⟩ .    (4.49)
These are eigenstates of J₃^{(1)} + J₃^{(2)}:
(J₃^{(1)} + J₃^{(2)}) |1/2, 1/2⟩ = J₃^{(1)}|1/2⟩ ⊗ |1/2⟩ + |1/2⟩ ⊗ J₃^{(2)}|1/2⟩
  = ½ |1/2⟩ ⊗ |1/2⟩ + ½ |1/2⟩ ⊗ |1/2⟩ = |1/2, 1/2⟩
(J₃^{(1)} + J₃^{(2)}) |1/2, −1/2⟩ = ½ |1/2⟩ ⊗ |−1/2⟩ − ½ |1/2⟩ ⊗ |−1/2⟩ = 0
(J₃^{(1)} + J₃^{(2)}) |−1/2, 1/2⟩ = −½ |−1/2⟩ ⊗ |1/2⟩ + ½ |−1/2⟩ ⊗ |1/2⟩ = 0
(J₃^{(1)} + J₃^{(2)}) |−1/2, −1/2⟩ = −½ |−1/2⟩ ⊗ |−1/2⟩ − ½ |−1/2⟩ ⊗ |−1/2⟩ = −|−1/2, −1/2⟩ .    (4.50)
Thus we find the J₃^T eigenvalues (−1, 0, 0, +1). This doesn't correspond to any irreducible representation of su(2), but we can split it as (0) and (−1, 0, 1), which are the eigenvalues of the j^T = 0 and j^T = 1 representations. Thus we expect to find
2 ⊗ 2 = 1 ⊕ 3. (4.51)
Indeed, the 1 is spanned by the singlet
|j^T = 0, m^T = 0⟩ = (1/√2) ( |1/2, −1/2⟩ − |−1/2, 1/2⟩ ) ,    (4.52)
and the 3 by
|j^T = 1, m^T = −1⟩ = |−1/2, −1/2⟩
|j^T = 1, m^T = 0⟩ = (1/√2) |1/2, −1/2⟩ + (1/√2) |−1/2, 1/2⟩
|j^T = 1, m^T = 1⟩ = |1/2, 1/2⟩ .    (4.53)
You can check this for yourself. These form another basis for 2 ⊗ 2 which exhibits the manifestly invariant subspaces: the 3-dimensional and 1-dimensional irreducible representations of su(2).
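The decomposition 2 ⊗ 2 = 1 ⊕ 3 can also be seen numerically by diagonalising (J^T)² on the four-dimensional tensor product (a minimal sketch, ℏ = 1):

```python
import numpy as np

# Spin-1/2 operators: S_i = sigma_i / 2 (hbar = 1)
sx = np.array([[0, 1], [1, 0]]) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]]) / 2
I2 = np.eye(2)

# Total spin J^T_i = S_i ⊗ I + I ⊗ S_i on the tensor product (4.35)
JT = [np.kron(s, I2) + np.kron(I2, s) for s in (sx, sy, sz)]
JT2 = sum(j @ j for j in JT)

# Eigenvalues of (J^T)^2 are j(j+1): one singlet (0) and a triplet (2)
evals = np.sort(np.linalg.eigvalsh(JT2))
assert np.allclose(evals, [0, 2, 2, 2])
```

The eigenvector with eigenvalue 0 is (up to phase) exactly the singlet combination (|1/2, −1/2⟩ − |−1/2, 1/2⟩)/√2.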
In this section, we will focus on the su(2) symmetry and consider a spin-½ particle in 3d, or simply a 2-dimensional irreducible representation of su(2).
The Pauli matrices defined before allow us to define the operators (or observables)
S_x = (ℏ/2) ( 0 1 ; 1 0 ) = (ℏ/2) σ_x ,  S_y = (ℏ/2) ( 0 −i ; i 0 ) = (ℏ/2) σ_y ,  S_z = (ℏ/2) ( 1 0 ; 0 −1 ) = (ℏ/2) σ_z .    (4.54)
The eigenvalues of these matrices are ± ℏ2 which are the values that one observes when
measuring spin along the x, y or z axes (recall the Stern-Gerlach experiment).
Let's put the qubit in a magnetic field B = (0, 0, B₀), in which case the Hamiltonian takes the form
H = −µ̂ · B = −(ℏ/2) γ σ_z B₀ ,  µ̂_z ≡ γ S_z ,    (4.55)
for some constant γ. The state of the system evolves according to the Schrödinger equation
iℏ ∂/∂t |ψ(t)⟩ = −(ℏγ/2) ( B₀ 0 ; 0 −B₀ ) |ψ(t)⟩ .    (4.56)
Let {|0⟩, |1⟩} be a basis for the 2 irrep of su(2). The state of the spin at some time t must be a linear combination of these basis elements with time-dependent coefficients,
|ψ(t)⟩ = a(t)|0⟩ + b(t)|1⟩ .    (4.57)
Plugging this into (4.56), we find a system of differential equations for a(t), b(t):
∂/∂t (a, b)ᵀ = (iγ/2) ( B₀ 0 ; 0 −B₀ ) (a, b)ᵀ  ⟹  a(t) = e^{iγB₀t/2} a(0) ,  b(t) = e^{−iγB₀t/2} b(0) .    (4.58)
Without loss of generality, we take the spin |ψ(0)⟩ to “point” in the n̂ direction (this amounts to choosing a boundary condition). Of course, you shouldn't take this literally: |ψ⟩ is a vector in a 2-dimensional complex vector space. What this means is that |ψ(0)⟩ is an eigenstate of the spin in that direction, i.e.
n̂ · Ŝ |ψ(0)⟩ = (ℏ/2) |ψ(0)⟩ .    (4.59)
Further (for simplicity) taking n̂ = (0, sin θ, cos θ), i.e. ϕ = π/2, we have
n̂ · Ŝ = (ℏ/2) (sin θ σ_y + cos θ σ_z) = (ℏ/2) ( cos θ −i sin θ ; i sin θ −cos θ ) ,    (4.60)
whose eigenvalues are ±ℏ/2. The eigenvector corresponding to the positive eigenvalue is then (CHECK!)
|ψ(0)⟩ = cos(θ/2)|0⟩ + i sin(θ/2)|1⟩ = ( cos(θ/2) , i sin(θ/2) )ᵀ .    (4.61)
From this boundary condition, we can read off a(0) and b(0) and so
|ψ(t)⟩ = ( cos(θ/2) e^{iω₀t/2} , i sin(θ/2) e^{−iω₀t/2} )ᵀ ,  ω₀ ≡ γB₀ .    (4.62)
This means that the spin will precess around the B₀ẑ axis with frequency ω₀. To see this you can check that (4.62) is an eigenstate of n̂(t) · Ŝ, where n̂(t) = (sin θ sin ω₀t, sin θ cos ω₀t, cos θ).
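The precession is also visible in the expectation values ⟨S_i⟩ computed in the state (4.62). A minimal sketch (ℏ = 1; the values of θ and ω₀ are arbitrary samples, not from the text):

```python
import numpy as np

theta, omega0 = 0.9, 2.5                       # arbitrary sample values
sx = np.array([[0, 1], [1, 0]]) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]]) / 2

def psi(t):
    """The precessing state (4.62)."""
    return np.array([np.cos(theta / 2) * np.exp(1j * omega0 * t / 2),
                     1j * np.sin(theta / 2) * np.exp(-1j * omega0 * t / 2)])

def expval(op, t):
    p = psi(t)
    return np.real(p.conj() @ op @ p)

ts = np.linspace(0, 5, 200)
# <S_z> is constant while <S_x> oscillates at frequency omega0:
assert np.allclose([expval(sz, t) for t in ts], np.cos(theta) / 2)
assert np.allclose([expval(sx, t) for t in ts],
                   [np.sin(theta) / 2 * np.sin(omega0 * t) for t in ts])
```

Together with ⟨S_y⟩ = (sin θ/2) cos ω₀t, this is exactly a vector of fixed length precessing about ẑ.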
MRI essentially exploits the fact that spins couple, hence react to and can be manipu-
lated with external magnetic fields. In particular, one can use a time-dependent, probe
magnetic field to flip spin states from |0⟩ to |1⟩. These states have different energies.
Every time the transition happens, the change in state (hence energy) of the system is
compensated by the emission of radiation. Upon detection, this allows one to get infor-
mation about the structure of molecules. Specifically, molecules consist of many spins
in general (here is where addition of su(2) would be relevant in practice); each spin will
experience a local B field (analog of the constant B0 above) due to other atoms around.
Molecules will hence have several resonance frequencies depending on their constituents
which can be found by applying a probe field (see B(t) below). In this way, one can get
information about the structure of molecules.
To be specific, let's apply to our spin in the constant magnetic field B₀ another time-dependent probe field B(t) that rotates in the x−y plane, namely
B(t) = B₁ ( cos ωt , −sin ωt , 0 ) ,    (4.64)
so that the Hamiltonian becomes
H(t) = −(ℏγ/2) ( B₀ B₁e^{iωt} ; B₁e^{−iωt} −B₀ ) .
This one is harder to solve directly because the Hamiltonian is time-dependent (more on this later).
Luckily, in this case we can actually still solve it exactly by noting that we can go to
a rotating frame in which the Hamiltonian becomes time-independent. In other words,
we can change basis from (a, b) to (a′, b′), where
( a′(t) , b′(t) )ᵀ = ( e^{−iωt/2} 0 ; 0 e^{iωt/2} ) ( a(t) , b(t) )ᵀ ≡ R(t) ( a(t) , b(t) )ᵀ ,    (4.67)
or equivalently
( ȧ′(t) , ḃ′(t) )ᵀ = iH′ ( a′(t) , b′(t) )ᵀ ,    (4.69)
with
H′ ≡ R [ (γ/2) ( B₀ B₁e^{iωt} ; B₁e^{−iωt} −B₀ ) ] R† + iR dR†/dt = ½ ( −∆ω ω₁ ; ω₁ ∆ω ) ,  ω₁ ≡ γB₁ ,  ∆ω ≡ ω − ω₀ .    (4.70)
You can check this by direct substitution (or in Mathematica). The nice feature of (4.69) is that H′ is again time-independent, and so we have (see Chapter 2)
|ψ′(t)⟩ ≡ ( a′(t) , b′(t) )ᵀ = e^{iH′t} ( a′(0) , b′(0) )ᵀ .    (4.71)
Starting from |0⟩, the probability of finding the spin in |1⟩ at time t works out to be
P₁₂ = |b(t)|² = ( ω₁² / (ω₁² + ∆ω²) ) sin²( √(ω₁² + ∆ω²) t / 2 ) .    (4.73)
These are called Rabi oscillations and are at the heart of MRI.
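One can check (4.73) by integrating the lab-frame Schrödinger equation with the rotating probe field directly. The sketch below writes everything in terms of ω₀ = γB₀ and ω₁ = γB₁ (ℏ = 1; the numerical values are arbitrary samples) and compares |b(T)|² at the final time with the Rabi formula:

```python
import numpy as np

omega0, omega1, omega = 1.0, 0.3, 1.2   # arbitrary sample frequencies
dw = omega - omega0

def H(t):
    """Lab-frame Hamiltonian -gamma/2 (B0 sigma_z + probe), in frequency units."""
    return -0.5 * np.array([[omega0, omega1 * np.exp(1j * omega * t)],
                            [omega1 * np.exp(-1j * omega * t), -omega0]])

def f(t, y):
    return -1j * H(t) @ y               # i dpsi/dt = H psi

psi = np.array([1.0 + 0j, 0.0 + 0j])    # start in |0>
dt, T = 1e-3, 20.0
for n in range(int(T / dt)):            # 4th-order Runge-Kutta steps
    t = n * dt
    k1 = f(t, psi)
    k2 = f(t + dt / 2, psi + dt / 2 * k1)
    k3 = f(t + dt / 2, psi + dt / 2 * k2)
    k4 = f(t + dt, psi + dt * k3)
    psi = psi + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

Omega = np.sqrt(omega1**2 + dw**2)
P12 = omega1**2 / Omega**2 * np.sin(Omega * T / 2)**2   # formula (4.73)
assert np.isclose(abs(psi[1])**2, P12, atol=1e-3)
```

Sweeping ω and recording the amplitude ω₁²/(ω₁² + ∆ω²) reproduces the resonance at ω = ω₀ used in MRI.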
Chapter 5

The Hydrogen Atom
Let us put what we learnt above to use in solving for an electron moving around a positively charged nucleus. We consider it as a two-body problem where the position of the electron is denoted by r_e and that of the nucleus by r_n. Thus our wavefunction is
ψ = ψ(r_e, r_n) .    (5.1)
Furthermore the potential arising from the electrostatic force between them is
V = −Ze² / |r_n − r_e| ,    (5.2)
where Z is the atomic number and e the charge of an electron in appropriate units. Thus the time-independent Schrödinger equation is
Eψ = −(ℏ²/2m_e) ∇²_{r_e} ψ − (ℏ²/2m_n) ∇²_{r_n} ψ − (Ze²/|r_n − r_e|) ψ ,    (5.3)
where m_e and m_n are the masses of the electron and nucleus respectively.
Changing variables to centre of mass and relative coordinates, R = (m_e r_e + m_n r_n)/M and r₁₂ = r_e − r_n, the kinetic terms become
−(ℏ²/2m_e) ∇²_{r_e} − (ℏ²/2m_n) ∇²_{r_n} = −(ℏ²/2M) ∇²_R − (ℏ²/2µ) ∇²_{r₁₂} ,    (5.6)
where M = me + mn and µ = me mn /M .
Thus our time-independent Schrödinger equation is
Eψ = −(ℏ²/2M) ∇²_R ψ − (ℏ²/2µ) ∇²_{r₁₂} ψ − (Ze²/|r₁₂|) ψ .    (5.7)
Now we can use separation of variables and consider the Ansatz
ψ = ψ_CoM(R) ψ_rel(r₁₂) ,    (5.8)
which splits (5.7) into
ϵ_CoM ψ_CoM = −(ℏ²/2M) ∇²_R ψ_CoM
ϵ_rel ψ_rel = −(ℏ²/2µ) ∇²_{r₁₂} ψ_rel − (Ze²/|r₁₂|) ψ_rel    (5.9)
with E = ϵCoM + ϵrel .
The first equation is solved by plane waves, ψ_CoM = e^{ik·R}, with ϵ_CoM = ℏ²|k|²/2M. These just describe a free particle, the atom, in a basis where the linear momentum is fixed. We can then construct wave packets through
ψ_CoM(R) = ∫ d³k/(2π)³ e^{ik·R} χ(k) ,    (5.11)
which will not be energy eigenstates but can be localized in position to an arbitrary
accuracy.
The more interesting part is to solve for ψ_rel(r₁₂). Here we can switch from r₁₂ to spherical coordinates and use separation of variables yet again:
ψ_rel = u_l(r) Y_{l,m}(θ, ϕ) ,    (5.12)
where r = |r₁₂|. We already know what the Y_{l,m}'s are and we know that u_l satisfies:
−(1/r²) d/dr ( r² du_l/dr ) + (2µ/ℏ²) ( −Ze²/r + ℏ²l(l+1)/(2µr²) − ϵ_rel ) u_l = 0 .    (5.13)
So solving the Hydrogen atom (which corresponds to Z = 1 but we can be more general
without causing any more pain) comes down to solving this equation and finding ul , E
and ψrel and ψ(re , rn ).
To continue we write
u_l(r) = r^l f_l(r)    (5.14)
to find
−d²f_l/dr² − (2(l+1)/r) df_l/dr − (2µZe²/ℏ²r) f_l − (2µϵ_rel/ℏ²) f_l = 0 .    (5.15)
This substitution is relatively standard in spherically symmetric examples as it removes the l(l + 1) term from the equation.
To continue we look at the large r limit where only the first and last terms are
important. If ϵrel > 0 then we find oscillating solutions:
f ∼ C₁ cos( √(2µϵ_rel) r/ℏ ) + C₂ sin( √(2µϵ_rel) r/ℏ ) ,    (5.16)
which will not be normalizable. Solutions with ϵrel = 0 will also not be normalizable.
Thus we conclude that ϵrel < 0. In this case we expect solutions, at large r, to be of the
form
f ∼ C₁ e^{−√(−2µϵ_rel) r/ℏ} + C₂ e^{+√(−2µϵ_rel) r/ℏ} ,    (5.17)
and normalizability requires C₂ = 0.
−(2µZe²/ℏ²) + (2n/ℏ) √(−2µϵ_rel) + (2(l+1)/ℏ) √(−2µϵ_rel) = 0 .    (5.19)
We can rearrange this to give
ϵ_rel = −(µZ²e⁴/2ℏ²) ( 1/(n + l + 1) )² .    (5.20)
Substituting back in simply leads to a recursion relation obtained from setting the
coefficient of ρk−2 to zero:
We see that taking k = 0 leads to C−1 = 0 and hence the series indeed terminates giving
a polynomial. These polynomials are known as Laguerre Polynomials. We haven’t
shown here that these are the only normalizable solutions but this does turn out to be
the case.
In summary our solutions are of the form
ψ = ψ_CoM(R) ψ_rel(r₁₂) ,  ψ_rel = r^l f_l(r) Y_{l,m}(θ, ϕ) ,
and ψ_CoM is a generic free wave packet. It's fun to plot |ψ_rel|² for various choices of n, l and m. For example look here https://en.wikipedia.org/wiki/Atomic_orbital#/media/File:Hydrogen_Density_Plots.png On the other hand there are actual pictures of atoms such as here: https://www.nature.com/articles/498009d
We can also reproduce the famous formula postulated by Bohr in the earliest days of
Quantum Mechanics (before all this formalism that we have discussed was formulated):
E_Bohr = −(µZ²e⁴/2ℏ²) (1/N²)    (5.26)
for some integer N = n + l + 1 = 1, 2, 3, .... In particular, since a proton is roughly 2000 times more massive than an electron, we have m_n ≫ m_e and so µ ≈ m_e to very high accuracy. Thus one finds (for Hydrogen where Z = 1)
E_photon = −(m_e Z²e⁴/2ℏ²)(1/N₁²) + (m_e Z²e⁴/2ℏ²)(1/N₂²) = R ( 1/N₂² − 1/N₁² ) .    (5.28)
• N = 1: n = l = 0 (1 state)
• N = 2: n = 1, l = 0 or n = 0, l = 1 (1 + 3 = 4 states)
• N = 3: n = 2, l = 0 or n = 1, l = 1 or n = 0, l = 2 (1 + 3 + 5 = 9 states)
In fact we find N 2 states for each energy level (the sum of the first N odd numbers is
N 2 ).
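This counting can be verified with a quick script (a sketch; the function name `degeneracy` is ours):

```python
# For fixed N = n + l + 1 the allowed values are l = 0, ..., N-1
# (with radial quantum number n = N - l - 1) and m = -l, ..., l.
# Summing the 2l + 1 values of m adds up the first N odd numbers.
def degeneracy(N):
    return sum(2 * l + 1 for l in range(N))

for N in (1, 2, 3, 4):
    print(N, degeneracy(N))  # 1, 4, 9, 16 = N**2
```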
Now, what are the lowest energy states with multiple electrons? In fact there are twice as many as counted above, since each electron can be spin up or spin down (more on this later). This is just an additional quantum number (which only takes two values: up or
down) and corresponds to the fact that the correct rotation group of Nature isn’t SO(3)
but SU (2) = Spin(3). Note also that the terminology of spin up and spin down has
no physical interpretation in terms of up and down: if you take an electron of spin up
to Australia it does not become spin down. It has more to do with the fact that we
write vectors as column matrices and the up state is on top and the down state on the bottom. We must note another fact: no two electrons can be in the same state
(the multi-electron wavefunction must be anti-symmetric - odd - under interchange of
any two electrons). Thus the degeneracies of low energy multi-electron states are (this
ignores inter electron interactions which become increasingly important)
• 2: n = l = 0
This pattern is evident in the periodic table whose rows have 2, 8 and 18 elements. We
have now predicted this based on the “crazy” idea that states are vectors in a Hilbert
space and observables are self-adjoint linear maps.
66 CHAPTER 5. THE HYDROGEN ATOM
Chapter 6: Time Independent Perturbation Theory
Next we must face the fact that solving the Schrödinger equation in general is too
complicated and we need to make approximations to make progress (for spherically
symmetric systems one is on better ground but still one has to solve a second order
ODE). The idea is to find a system you can solve exactly, for example a free, non-
interacting, system or a system you have solved exactly such as a single electron in an
atom, and then imagine that you perturb it (say by adding in another electron or putting
the atom in a small background magnetic field). This means adding an interaction whose strength is controlled by a parameter g which we can make as small as we like, such that setting g = 0 reduces us to the problem we can solve exactly. One then computes
the physically meaningful quantities in a power series:
E = E0 + gE1 + g 2 E2 + . . . (6.1)
The constant g is referred to as a coupling constant; there can be several in any given problem. The contributions at order g are called first-order, those at g² second-order, etc. Computing each term in this expansion is known as perturbation theory. Much, almost all, of Physics is done this way.
Of course in Nature g is not a free parameter but some constant that you determine
from experiment. If g is small then we say the theory is weakly coupled whereas if g
is not small then it is strongly coupled. For example in electromagnetism the relevant
coupling constant is
α = e²/ℏc ∼ 1/137.   (6.2)
This is known as the fine structure constant. It is so named as it leads to corrections to the Hydrogen atom spectrum that correspond to "fine", i.e. small, corrections to what we found above. This is indeed small. Nature was kind to Physicists as this meant
that accurate predictions from quantum electrodynamics (QED) could be made using
perturbation theory. Indeed in some cases theory and experiment agree to 13 decimal
place accuracy. This is like knowing the distance to the moon to within the width of a
human hair.
On the other hand in Quantum Chromodynamics (QCD), the theory that describes
quarks inside protons and neutrons, the relevant coupling is
α_s ∼ 1/2   (6.3)
In fact neither α nor αs are constant. αs becomes larger at longer distances whereas α
gets smaller. The values I gave here are the approximate values of αs at the distance of
a proton and α at infinite distance. So we are not lucky with QCD (superconductivity
is another system that is strongly coupled and perturbation theory fails). At distances
around the size of a proton, QCD is strongly coupled and computations in perturbation
theory are next to useless (but we can use perturbation theory well at very short, sub
proton, distances where they behave in a weakly coupled way). This is asymptotic
freedom, meaning that at short distances the theory is weakly coupled. Pulling quarks
apart increases the force between them, somewhat like a spring. Quarks are confined
into protons (and other baryons such as neutrons) but this is still poorly understood
largely because we can’t compute much.
It is important to ask what small means. It is meaningless to say that a quantity
that is dimensionful is small or large. By dimensionful we mean that it is measured in
some units. For example, is a fighter jet fast? A typical speed for a fighter jet is 2000 kph and that is very fast compared to driving a car (say 100 kph). But compared to the speed of light (about 10⁹ kph) it is minuscule. So in this sense a fighter jet is very, very slow. Thus when we do perturbation theory
we need to identify a small dimensionless number. Small then means g² ≪ g, so that higher order terms in the expansion are indeed smaller (although their coefficients might grow). Note that this requires us to compare g and g², so they must have the same units and hence g must be dimensionless.
Lastly we note that even if we have a small, even tiny, dimensionless coupling con-
stant perturbation theory can still fail. This is because not all functions admit a Taylor
expansion about g = 0. The classic example is
f(g) = e^{−1/g²} for g ≠ 0,   f(0) = 0   (6.6)
All derivatives of f vanish at g = 0, so its Taylor series is identically zero, i.e. perturbation theory misses all the information in f(g). This may seem like an esoteric example but actually functions of this form arise all the time in quantum theories, as one can see in the path integral formulation.
In practice in quantum theories the perturbative series of the form (6.1) do not even converge! They become more accurate as we include higher order terms for a while, but then they get worse if you include too many, and would ultimately diverge if one could do infinitely many. In QED where g = α = 1/137 one expects the series to start failing around the 137-th term. Such a series is known as an asymptotic series. The full theory is not divergent and one expects the complete answer to take the form
E = (E_{0,0} + E_{0,1} g + E_{0,2} g² + ...) + (E_{1,0} + E_{1,1} g + E_{1,2} g² + ...) e^{−1/g²} + ...   (6.8)
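The behaviour of an asymptotic series can be seen in a toy example (not from these notes): the Stieltjes integral I(g) = ∫₀^∞ e^{−t}/(1 + gt) dt has the divergent expansion Σ (−1)ⁿ n! gⁿ, whose partial sums first improve and then blow up, with the best accuracy around N ∼ 1/g:

```python
import math

def stieltjes(g, steps=200000, tmax=50.0):
    # trapezoidal quadrature of I(g) = ∫_0^∞ e^{-t}/(1 + g t) dt
    h = tmax / steps
    total = 0.5 * (1.0 + math.exp(-tmax) / (1 + g * tmax))
    for i in range(1, steps):
        t = i * h
        total += math.exp(-t) / (1 + g * t)
    return total * h

def partial_sum(g, N):
    # asymptotic series Σ_{n=0}^{N} (-1)^n n! g^n
    return sum((-1) ** n * math.factorial(n) * g ** n for n in range(N + 1))

g = 0.1
exact = stieltjes(g)
errs = [abs(partial_sum(g, N) - exact) for N in range(25)]
best = min(range(25), key=lambda N: errs[N])
print(best, errs[best])  # the error is smallest around N ~ 1/g = 10, then grows
```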
|E_2^{(ϵ)}⟩ = N_2 ( (E_2^{(ϵ)} − E_2)/ϵ , 1 )^T = N_2 ( ((E_1 − E_2) − √((E_1 − E_2)² + 4ϵ²))/(2ϵ) , 1 )^T   (6.13)

where the normalization constants are equal and given by

N_{1/2} = 1 / √( 1 + (E_2 − E_1 + √((E_1 − E_2)² + 4ϵ²))²/(4ϵ²) ).   (6.14)
70 CHAPTER 6. TIME INDEPENDENT PERTURBATION THEORY
Not nice but solved exactly! Note that for E_1 ≠ E_2 we can smoothly take ϵ → 0 as

E_{1/2}^{(ϵ)} = (1/2) [ E_1 + E_2 ± |E_1 − E_2| √(1 + 4ϵ²/(E_1 − E_2)²) ]
            = (1/2) [ E_1 + E_2 ± |E_1 − E_2| ± 2ϵ²/|E_1 − E_2| + ... ]
            = E_{1/2} ± ϵ²/|E_1 − E_2| + O(ϵ⁴).   (6.15)
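The expansion (6.15) is easy to check numerically (a sketch using numpy, with illustrative values):

```python
import numpy as np

E1, E2, eps = 1.0, 3.0, 0.01
H = np.array([[E1, eps], [eps, E2]])          # the perturbed 2x2 Hamiltonian
exact = np.sort(np.linalg.eigvalsh(H))
# level repulsion: the lower level is pushed down and the upper level up,
# each by eps**2 / |E1 - E2|
approx = np.sort([E1 - eps**2 / abs(E1 - E2), E2 + eps**2 / abs(E1 - E2)])
err = np.max(np.abs(exact - approx))
print(err)  # O(eps^4)
```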
where

H^{(0)} = ( E_1^{(0)}  0 ; 0  E_2^{(0)} ),   |E_1^{(0)}⟩ = (1, 0)^T,   |E_2^{(0)}⟩ = (0, 1)^T   (6.19)

and we have relabelled E_n = E_n^{(0)}. Next we expand to lowest order in ϵ. In principle this involves an infinite number of terms but let's start with just the first order terms

H = H^{(0)} + ϵH^{(1)} + ...
E_n = E_n^{(0)} + ϵE_n^{(1)} + ...
|E_n⟩ = |E_n^{(0)}⟩ + ϵ|E_n^{(1)}⟩ + ...   (6.20)
The terms of order ϵ⁰ cancel as we have solved the unperturbed problem. The O(ϵ) equation on the other hand allows us to deduce the new spectrum and eigenstates of the perturbed Hamiltonian as follows. We know H^{(0)}, H^{(1)}, E_n^{(0)} and |E_n^{(0)}⟩, so we need to solve for E_n^{(1)} and |E_n^{(1)}⟩. To this end we use the fact that the |E_n^{(0)}⟩ are an orthonormal basis. So we can take the inner product of both sides of (6.23) with |E_m^{(0)}⟩ to find
⟨E_m^{(0)}|H^{(1)}|E_n^{(0)}⟩ + ⟨E_m^{(0)}|H^{(0)}|E_n^{(1)}⟩ = E_n^{(1)} ⟨E_m^{(0)}|E_n^{(0)}⟩ + E_n^{(0)} ⟨E_m^{(0)}|E_n^{(1)}⟩
                                                      = E_n^{(1)} δ_{nm} + E_n^{(0)} ⟨E_m^{(0)}|E_n^{(1)}⟩,   (6.24)
where in the second equality we have used orthogonality of the |E_n^{(0)}⟩. However we also know that ⟨E_m^{(0)}|H^{(0)} = E_m^{(0)} ⟨E_m^{(0)}| and hence
⟨E_m^{(0)}|H^{(1)}|E_n^{(0)}⟩ = E_n^{(1)} δ_{nm} + (E_n^{(0)} − E_m^{(0)}) ⟨E_m^{(0)}|E_n^{(1)}⟩   (6.25)
⟨E_m^{(0)}|E_n^{(1)}⟩ = (1/(E_n^{(0)} − E_m^{(0)})) ⟨E_m^{(0)}|H^{(1)}|E_n^{(0)}⟩   (6.27)
Here is where the non-degeneracy is important. (6.27) are nothing but the coefficients of |E_n^{(1)}⟩ in an expansion in the |E_m^{(0)}⟩ basis, for m ≠ n. To get the complete state, we still need the coefficient for m = n. This can be obtained by demanding that the complete eigenstates are properly normalized, namely
⟨E_n^{(1)}|E_m^{(0)}⟩ = −⟨E_n^{(0)}|E_m^{(1)}⟩ = −(⟨E_m^{(1)}|E_n^{(0)}⟩)^*   (6.29)
For m = n, this tells us that the real part of ⟨E_n^{(0)}|E_n^{(1)}⟩ vanishes. In particular we can take
This tells us that the first order correction is orthogonal to the unperturbed eigenstate.
Finally

|E_n^{(1)}⟩ = Σ_m ⟨E_m^{(0)}|E_n^{(1)}⟩ |E_m^{(0)}⟩
           = Σ_{m≠n} (1/(E_n^{(0)} − E_m^{(0)})) ⟨E_m^{(0)}|H^{(1)}|E_n^{(0)}⟩ |E_m^{(0)}⟩   (6.31)
as required to keep the eigenvectors orthonormal. Note that in this derivation, although we started with a two-dimensional Hilbert space, we didn't use this and our answer is true more generally.
6.2. FIRST ORDER NON-DEGENERATE PERTURBATION THEORY 73
This agrees with our exact result but will not be true in general since our H^{(1)} is off-diagonal. However |E_n^{(1)}⟩ will be non-zero, as:
⟨E_1^{(0)}|H^{(1)}|E_2^{(0)}⟩ = (1, 0) (0, 1; 1, 0) (0, 1)^T = 1
⟨E_2^{(0)}|H^{(1)}|E_1^{(0)}⟩ = (0, 1) (0, 1; 1, 0) (1, 0)^T = 1   (6.36)
and

|E_2⟩ = |E_2^{(0)}⟩ + ϵ (⟨E_1^{(0)}|H^{(1)}|E_2^{(0)}⟩/(E_2^{(0)} − E_1^{(0)})) |E_1^{(0)}⟩
      = (0, 1)^T + (ϵ/(E_2^{(0)} − E_1^{(0)})) (1, 0)^T
      = ( −ϵ/(E_1^{(0)} − E_2^{(0)}) , 1 )^T   (6.38)
But we can assume that the zeroth and first order equations have been solved, so we find, at second order,

H^{(2)}|E_n^{(0)}⟩ + H^{(1)}|E_n^{(1)}⟩ + H^{(0)}|E_n^{(2)}⟩ = E_n^{(0)}|E_n^{(2)}⟩ + E_n^{(1)}|E_n^{(1)}⟩ + E_n^{(2)}|E_n^{(0)}⟩   (6.42)
Remember that the unknowns are E_n^{(2)} and |E_n^{(2)}⟩; everything else is known. So we take matrix elements again:
⟨E_m^{(0)}|H^{(2)}|E_n^{(0)}⟩ + ⟨E_m^{(0)}|H^{(1)}|E_n^{(1)}⟩ + ⟨E_m^{(0)}|H^{(0)}|E_n^{(2)}⟩
= E_n^{(0)} ⟨E_m^{(0)}|E_n^{(2)}⟩ + E_n^{(1)} ⟨E_m^{(0)}|E_n^{(1)}⟩ + E_n^{(2)} ⟨E_m^{(0)}|E_n^{(0)}⟩   (6.43)
which becomes
⟨E_m^{(0)}|H^{(2)}|E_n^{(0)}⟩ + ⟨E_m^{(0)}|H^{(1)}|E_n^{(1)}⟩ + E_m^{(0)} ⟨E_m^{(0)}|E_n^{(2)}⟩
= E_n^{(0)} ⟨E_m^{(0)}|E_n^{(2)}⟩ + E_n^{(1)} ⟨E_m^{(0)}|E_n^{(1)}⟩ + E_n^{(2)} δ_{nm}   (6.44)

Setting m = n gives

⟨E_n^{(0)}|H^{(2)}|E_n^{(0)}⟩ + ⟨E_n^{(0)}|H^{(1)}|E_n^{(1)}⟩ + E_n^{(0)} ⟨E_n^{(0)}|E_n^{(2)}⟩ = E_n^{(0)} ⟨E_n^{(0)}|E_n^{(2)}⟩ + E_n^{(2)}   (6.45)
where we have used ⟨E_n^{(0)}|E_n^{(1)}⟩ = 0. The ⟨E_n^{(0)}|E_n^{(2)}⟩ terms cancel, so this tells us the second order correction to E_n:
Since we can compute everything on the right hand side, we now know E_n^{(2)}. It is left as an exercise to compute |E_n^{(2)}⟩ (just do it for H^{(2)} = 0).
So let us try to compute the correction to the energy in our simple example. Here we have H^{(2)} = 0 and so (using the matrix elements we computed before in (6.36))
E_1^{(2)} = ⟨E_2^{(0)}|H^{(1)}|E_1^{(0)}⟩ ⟨E_1^{(0)}|H^{(1)}|E_2^{(0)}⟩ / (E_1^{(0)} − E_2^{(0)}) = 1/(E_1 − E_2)
E_2^{(2)} = ⟨E_1^{(0)}|H^{(1)}|E_2^{(0)}⟩ ⟨E_2^{(0)}|H^{(1)}|E_1^{(0)}⟩ / (E_2^{(0)} − E_1^{(0)}) = −1/(E_1 − E_2)   (6.47)
which agrees with what we found above from the exact solution (6.15).
Let us summarise what we found and in a slightly cleaner notation. Typically (but
not always) the perturbation to the Hamiltonian comes from the potential so
E_n^{(0)} ψ_n = −(ℏ²/2m) ∇² ψ_n + V^{(0)} ψ_n.   (6.51)
Then our formulae are (see the problem set)

E_n = E_n^{(0)} + ϵ V_{nn} + ϵ² Σ_{m≠n} V_{mn}V_{nm}/(E_n^{(0)} − E_m^{(0)}) + ...

|E_n⟩ = |n⟩ + ϵ Σ_{m≠n} (V_{nm}/(E_n^{(0)} − E_m^{(0)})) |m⟩
      + ϵ² [ Σ_{m≠n} Σ_{p≠n} (V_{pn}V_{mp}/((E_n^{(0)} − E_m^{(0)})(E_n^{(0)} − E_p^{(0)}))) |m⟩
           − Σ_{m≠n} (V_{nn}V_{mn}/(E_n^{(0)} − E_m^{(0)})²) |m⟩
           − (1/2) Σ_{m≠n} (V_{mn}V_{mn}^*/(E_n^{(0)} − E_m^{(0)})²) |n⟩ ] + ...   (6.52)

The last term is obtained by solving for the normalization of the full state to second order in ϵ.
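These formulae can be tested against exact diagonalization; here is a minimal numpy sketch for the energy up to second order (the specific levels and random perturbation are our own choice):

```python
import numpy as np

rng = np.random.default_rng(0)
E0 = np.array([0.0, 1.0, 2.5, 4.0])              # non-degenerate unperturbed levels
V = rng.normal(size=(4, 4))
V = (V + V.T) / 2                                 # real symmetric perturbation
eps = 1e-3

exact = np.sort(np.linalg.eigvalsh(np.diag(E0) + eps * V))
pert = np.sort([
    E0[n] + eps * V[n, n]
    + eps**2 * sum(V[m, n] * V[n, m] / (E0[n] - E0[m]) for m in range(4) if m != n)
    for n in range(4)
])
err = np.max(np.abs(exact - pert))
print(err)  # O(eps^3)
```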
In principle, you can continue like this to find third-order and higher corrections. The
calculations will become increasingly tedious and, furthermore, at some point perturba-
tion theory will break down. This is because, as we see from the first two corrections to
the Hamiltonian eigenvalues and eigenstates above, the expansion is roughly speaking
in powers of
⟨E_m^{(0)}|∆H|E_n^{(0)}⟩ / (E_m^{(0)} − E_n^{(0)}),   m ≠ n.   (6.53)
• Spin-orbit coupling, due to the spin of the electron coupling to the magnetic field generated by the proton, as seen in the electron's rest frame
We will only discuss the first two. The last one should be a topic in Quantum Field
Theory.
where c is the speed of light and m_e is the mass of the electron.² If you haven't seen Special Relativity before, you have surely heard about the famous E = mc². (6.54) is just that.
Now the speed of the electron can be assumed to be much smaller than the speed of light, hence

|p|²/(m_e²c²) ≪ 1   (6.55)

and we can Taylor expand (6.54) in this small parameter. We find

E = m_e c² + |p|²/(2m_e) − |p|⁴/(8m_e³c²) + ···   (6.56)
¹ A precise criterion is actually hard to find, since at higher order in perturbation theory the corrections will involve sums over these quantities.
² We work in the approximation from before where the mass of the electron is much smaller than the mass of the nucleus and µ ≃ m_e.
6.4. APPLICATION: FINE STRUCTURE OF HYDROGEN 77
The first two terms are just the rest energy and the usual kinetic term. The last term is a correction to the Hydrogen atom Hamiltonian discussed in Chapter 5. We want to use time-independent, non-degenerate perturbation theory to compute the correction to the energy spectrum of the Hydrogen atom that it gives rise to.
From the last section we have

E_{n,ℓ}^{(1)} = ⟨n, ℓ, m|∆H|n, ℓ, m⟩,   (6.57)

where

∆H = −P̂⁴/(8m_e³c²).   (6.58)
In fact, we have seen in the discussion of the Hydrogen atom that states with different
ℓ, m but the same n have the same energy and are hence degenerate. We may be
worried that we need to use degenerate perturbation theory in this case. The fact that
the perturbation (6.58) is spherically symmetric saves us: the perturbation cannot lead
to mixing between the degenerate energy eigenstates since
between (ℓ′ , m′ ) and (ℓ, m) states. (6.59) is an example of a selection rule: if the Hamil-
tonian has certain properties such as symmetries, transitions between certain states may
be forbidden.
To proceed let us notice that

P̂⁴/(4m_e²) = (H₀ − V(r))²   (6.61)
where H₀ is the Hamiltonian for the hydrogen atom from the last Chapter (i.e. without corrections). We found last week that

H₀|n, ℓ, m⟩ = E_n|n, ℓ, m⟩ = −(m_e c²α²/2n²)|n, ℓ, m⟩.   (6.62)
We drop the superscript (0) on E_n above; it is to be understood that the above is the unperturbed energy. We hence have

E_{n,ℓ}^{(1)} = −(1/2m_e c²) ( E_n² − 2E_n ⟨V⟩_{n,ℓ,m} + ⟨V²⟩_{n,ℓ,m} )   (6.63)
Recall now that for the Hydrogen atom (previously we worked in units where 4πϵ₀ = 1 to avoid unnecessary clutter)

V = −e²/(4πϵ₀ r).   (6.64)
As such, we just need to evaluate the expectation values of 1/r and 1/r² with respect to the Hydrogen wavefunctions derived before. One finds (exercise)

⟨1/r⟩_{n,ℓ,m} = 1/(a₀n²),
⟨1/r²⟩_{n,ℓ,m} = 1/(a₀²n³(ℓ + ½)).   (6.65)
E_{n,ℓ}^{(1)} = −(1/2m_e c²) ( E_n² + 2E_n e²/(4πϵ₀ a₀ n²) + e⁴/(16π²ϵ₀² (ℓ + ½) n³ a₀²) )
            = −(2E_n²/m_e c²) ( n/(ℓ + ½) − 3/4 ) = −(m_e c²α⁴/2n⁴) ( n/(ℓ + ½) − 3/4 ).   (6.68)
We see that this correction is α2 suppressed with respect to En . We also see that
the ℓ degeneracy is lifted. Needless to say, these corrections have been experimentally
measured a long time ago (in fact before the discovery of quantum mechanics!). It is
quite nice that the conceptually simple perturbative framework that we set up allows
us to predict these.
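As a quick numerical check of the α² suppression, using (6.68) and E_n = −m_e c²α²/2n² (a sketch; the chosen level is illustrative):

```python
alpha = 1 / 137.035999
n, l = 1, 0                                    # the ground state, as an example
# ratio E^(1)_{n,l} / E_n = (α²/n²)(n/(l + 1/2) - 3/4), from (6.68) and (6.62)
ratio = (alpha**2 / n**2) * (n / (l + 0.5) - 0.75)
print(ratio)  # ≈ 6.7e-5, i.e. O(α²)
```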
B = (γ/c²) v × E = (1/m_e c²) p × (e r̂/(4πϵ₀ r²)) = −(e/(4πϵ₀ m_e c² r³)) L.   (6.69)
We have also seen before that the electron carries spin S and spins couple to magnetic fields via

H_SO = (e/2m_e) S · B = −(αℏ/(2m_e²c r³)) S · L.   (6.70)
The correction to the energy due to spin-orbit coupling is therefore

E_SO^{(1)} = ⟨n, ℓ, m; s|H_SO|n, ℓ, m; s⟩   (6.71)

where

|n, ℓ, m; s⟩ ≡ |n, ℓ, m⟩ ⊗ |s⟩   (6.72)
6.5. DEGENERATE PERTURBATION THEORY 79
are a basis for the Hilbert space of the electron (obtained by the tensor product). We
now notice that
S · L = (1/2)(J² − L² − S²),   (6.73)
where
J =L+S (6.74)
is the net (orbital + spin) angular momentum. Unlike before where we looked at two-
particle systems and added their angular momenta, here we have one particle (the
electron) that can carry two kinds of angular momentum. The physics is different, but
the math is the same.
Just like before, it is convenient to go to a basis that diagonalizes J², L², S² and J_z instead of (6.72), which diagonalizes L², L_z, S², S_z. In this new basis, S · L is diagonal with eigenvalues

S · L |j, m_j; s, ℓ⟩ = (ℏ²/2) { ℓ, j = ℓ + ½;  −(ℓ + 1), j = ℓ − ½ } |j, m_j; s, ℓ⟩   (6.75)
Hence

E_{n;j,ℓ}^{(1)} = −(αℏ³/(4m_e²c)) ⟨1/r³⟩_{n;j,ℓ} { ℓ, j = ℓ + ½;  −(ℓ + 1), j = ℓ − ½ }   (6.76)
These two equations allow us to express ⟨r⁻³⟩ in terms of ⟨r⁻²⟩, which we already found before.
We can finally combine the spin-orbit correction with the relativistic correction to obtain

E_{n,j,ℓ} = −(m_e α²c²/2) [ 1/n² + (α²/n³)( 1/(j + ½) − 3/(4n) ) ].   (6.80)
This formula holds when ℓ ≠ 0, for both j = ℓ ± ½.
where sgn(ϵ) = ϵ/|ϵ|. Here we see the problem: the answer is not analytic in ϵ and so a naive Taylor series expansion will fail. In addition, taking ϵ → 0± does not give us the unperturbed eigenvectors. This is fine, as the eigenvalues at ϵ = 0 are degenerate and hence we could just as well have started with
|E_1^{(0)}⟩ = (1/√2) (1, 1)^T,   |E_2^{(0)}⟩ = (1/√2) (−1, 1)^T   (6.84)
or the other way around. Thus we need to be more careful and treat degenerate eigenspaces as a special case. The degeneracy may be just a single pair or, more typically, every energy eigenstate has some fixed degeneracy, e.g. the two spin states of an electron.
To this end, let V ⊂ H be an N-dimensional subspace of states of equal (degenerate) energies E_V of H^{(0)}. All of these states satisfy

H^{(0)}|E_{V,n}^{(0)}⟩ = E_V |E_{V,n}^{(0)}⟩,   n = 1, ..., N.   (6.85)
H = P_V H ⊕ P_⊥ H = V ⊕ H_⊥.   (6.88)
Let's first analyze the first equation in (6.93). Start with a state in V and expand

E_n(ϵ) = E_{V,n}^{(0)} + ϵE_n^{(1)} + O(ϵ²),
|E_n⟩ = |E_{V,n}^{(0)}⟩ + ϵ|E_n^{(1)}⟩ + O(ϵ²).   (6.94)
Note that the unperturbed state must lie in V, but the perturbation can in principle take us out of V. Note also that P_⊥|E_n⟩ = O(ϵ). To leading order in ϵ, we just get that E^{(0)} = E_V. We already knew this. To O(ϵ) we get something more interesting, namely
−E_n^{(1)}|E_{V,n}^{(0)}⟩ + P_V H^{(1)} P_V |E_{V,n}^{(0)}⟩ = 0.   (6.95)
In other words
P_V H^{(1)} P_V |E_{V,n}^{(0)}⟩ = E_n^{(1)} |E_{V,n}^{(0)}⟩   (6.96)
meaning that |E_{V,n}^{(0)}⟩ is also an eigenstate of the perturbation projected onto the degenerate subspace! This is important. It tells us that, under the assumption that a small change in H causes a small change in the state, |E_{V,n}^{(0)}⟩ must not only be an eigenstate of H^{(0)} but also of the perturbation H^{(1)}, i.e. it is an eigenstate of the full Hamiltonian. In other words, degenerate systems stable under perturbations are those that are eigenstates of the perturbation.
In conclusion, we get that the perturbation lifts the degeneracy of the |E_{V,n}⟩ subspace according to

E_n^{(1)} = ⟨E_{V,n}^{(0)}|H^{(1)}|E_{V,n}^{(0)}⟩.   (6.97)
This is the same formula as before. We can also use the second equation in (6.93) to deduce that the corrections to the H^{(0)} eigenstates |E_p^{(0)}⟩ ∈ H_⊥ are given by a similar formula.
To compute the perturbed eigenstates in (6.94), we use the second eq. in (6.93). We
find
P_⊥ H^{(1)}|E_{V,n}^{(0)}⟩ + P_⊥ H^{(0)}|E_n^{(1)}⟩ − E_{V,n}^{(0)} P_⊥ |E_n^{(1)}⟩ = 0   (6.99)
Taking the inner product with |E_p^{(0)}⟩ ∈ H_⊥, we find

⟨E_p^{(0)}|E_n^{(1)}⟩ = ⟨E_p^{(0)}|H^{(1)}|E_{V,n}^{(0)}⟩ / (E_{V,n}^{(0)} − E_p^{(0)}).   (6.100)
This only determines the components of |E_n^{(1)}⟩ along |E_p^{(0)}⟩ ∈ H_⊥. To determine the components along the degenerate subspace, we need to go to second order in perturbation theory. This is an interesting exercise.
Thus looking at our simple two-dimensional example we should start from the basis

|E_1⟩ = (1/√2) (1, 1)^T + O(ϵ²),   |E_2⟩ = (1/√2) (−1, 1)^T + O(ϵ²)   (6.101)
In fact in this simple case the perturbation ends at first order but this won’t be true in
general.
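The two-dimensional example can be sketched in code: the first-order shifts are the eigenvalues of the perturbation restricted to the degenerate subspace, and in this particular case they happen to be exact:

```python
import numpy as np

EV, eps = 2.0, 1e-3
H0 = EV * np.eye(2)                           # degenerate unperturbed Hamiltonian
H1 = np.array([[0.0, 1.0], [1.0, 0.0]])       # the perturbation
shifts, good_basis = np.linalg.eigh(H1)       # shifts ±1; "good" states (1, ∓1)/√2
exact = np.linalg.eigvalsh(H0 + eps * H1)
err = np.max(np.abs(exact - (EV + eps * shifts)))
print(err)  # 0 up to rounding: here the perturbative series ends at first order
```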
Of course it could be that there are still degeneracies, but then one simply goes to the next order. In quantum theories one expects that the only degeneracies that persist to all orders in perturbation theory are those that are protected by a symmetry. That is, there exists an observable Q that commutes with the full Hamiltonian, [Q, H] = 0, and hence the states |E_n⟩ and Q|E_n⟩ have the same energy to all orders in perturbation theory:
One then works with energy eigenstates that are also eigenstates of Q.
Chapter 7: Time Dependent Perturbation Theory
Our next topic in perturbation theory is to allow for a small time dependence. So
far we have assumed that the Hamiltonian has no explicit time dependence. But this
need not be the case. So what happens if we consider a perturbation that includes
time-dependence:
iℏ ∂|Ψ⟩/∂t = H^{(0)}|Ψ⟩ + gH^{(1)}(t)|Ψ⟩   (7.1)
(we assume the unperturbed Hamiltonian is time independent). Recall that in the time-independent case we solve the Schrödinger equation by
|Ψ^{(0)}(t)⟩ = Σ_n c_n e^{−iE_n t/ℏ} |ψ_n^{(0)}⟩,   (7.2)

where the |ψ_n^{(0)}⟩ are a basis of eigenstates of H^{(0)}:
In this case the cn ’s are just constants that characterise the state at t = 0.
To solve the perturbed Schrödinger equation we allow cn to be functions of time.
Substituting back we now find
Σ_n (iℏ dc_n/dt + E_n c_n) e^{−iE_n t/ℏ} |ψ_n^{(0)}⟩ = Σ_n c_n e^{−iE_n t/ℏ} H^{(0)}|ψ_n^{(0)}⟩ + g Σ_n c_n e^{−iE_n t/ℏ} H^{(1)}(t)|ψ_n^{(0)}⟩   (7.4)
Next we use the fact that the |ψ_n^{(0)}⟩ satisfy the unperturbed time-independent Schrödinger equation, so

iℏ Σ_n (dc_n/dt) e^{−iE_n t/ℏ} |ψ_n^{(0)}⟩ = g Σ_n c_n(t) e^{−iE_n t/ℏ} H^{(1)}(t)|ψ_n^{(0)}⟩   (7.5)
Note that again we can still use the fact that the |ψ_n^{(0)}⟩ are an orthonormal basis of the Hilbert space. Thus we can take the inner product of this equation with |ψ_m^{(0)}⟩ to obtain

iℏ (dc_m/dt) e^{−iE_m t/ℏ} = g Σ_n c_n e^{−iE_n t/ℏ} ⟨ψ_m^{(0)}|H^{(1)}(t)|ψ_n^{(0)}⟩

dc_m/dt = −(ig/ℏ) Σ_n c_n e^{i(E_m − E_n)t/ℏ} ⟨ψ_m^{(0)}|H^{(1)}(t)|ψ_n^{(0)}⟩   (7.6)
For the usual case where the |ψ_n^{(0)}⟩ are realised by functions on R³ we have

⟨ψ_m^{(0)}|H^{(1)}(t)|ψ_n^{(0)}⟩ = ∫ (ψ_m^{(0)}(x))^* H^{(1)}(t) ψ_n^{(0)}(x) d³x   (7.7)
Thus we obtain a first order differential equation for each c_m(t), but it is coupled to all the other c's. The solution c_n(t) is then determined by the original initial condition c_n(0).
So far we haven’t made any approximation. But we have an infinite set of coupled
differential equations! At lowest order cn is constant so let us expand
where by (7.8), all c_m^{(n)}(0) = 0 for n ≥ 1.
It is natural to imagine a situation where the perturbation is turned on at t = 0
and then switched off at some later stage. That is to say H (1) (t) = 0 for t ≤ 0 and
H (1) (t) → 0 rapidly at late times. Further let us suppose that at t ≤ 0 the system is in
the k-th energy eigenstate:
The interpretation is that |c_m(t)|² gives the probability that after the perturbation the system will lie in the m-th energy state (m ≠ k) at time t. c_m^{(1)}(t) is called the first order transition amplitude. Note that we don't expect to find
Σ_n |c_n(t)|² = 1   (7.13)
The interpretation is that the perturbation has introduced (or taken away) energy into
the system which then gets redistributed.
So let's do an example! Consider a harmonic oscillator
H^{(0)} = p̂²/2m + (1/2) k x̂²   (7.14)
with states |n⟩, n = 0, 1, 2, ... and energies E_n = ℏω(n + ½). Next we perturb it by

H^{(1)} = −x̂ e^{−t²/τ²}   (7.15)
Classically this corresponds to an applied force F = g in the x-direction, but only for a period of time of order 2τ. In this case we assume that the system is in the ground state at t → −∞. Thus we need to evaluate (we have changed the initial time to t = −∞)
c_n^{(1)}(t) = −(i/ℏ) ∫_{−∞}^t e^{inωt′} ⟨n|H^{(1)}(t′)|0⟩ dt′
            = (i/ℏ) ⟨n|x̂|0⟩ ∫_{−∞}^t e^{inωt′ − t′²/τ²} dt′   (7.16)
There is no closed form for this integral but we can look at late times, so

c_n^{(1)}(∞) = (i/ℏ) ⟨n|x̂|0⟩ ∫_{−∞}^∞ e^{inωt′ − t′²/τ²} dt′
            = (iτ√π/ℏ) e^{−τ²n²ω²/4} ⟨n|x̂|0⟩   (7.17)
where we have used the integral

∫_{−∞}^∞ e^{−at′² + bt′} dt′ = √(π/a) e^{b²/4a}   (7.18)
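The Gaussian integral (7.18) also holds for imaginary b, which is what we used above; a quick numerical check (the values of a and ω are illustrative):

```python
import numpy as np

a, b = 1.0 / 1.5**2, 1j * 2.0                  # a = 1/τ² with τ = 1.5, b = iω with ω = 2
t = np.linspace(-40.0, 40.0, 400001)
riemann = np.sum(np.exp(-a * t**2 + b * t)) * (t[1] - t[0])   # simple Riemann sum
closed = np.sqrt(np.pi / a) * np.exp(b**2 / (4 * a))
err = abs(riemann - closed)
print(err)  # ~ 0
```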
Lastly we need to compute

⟨n|x̂|0⟩ = √(ℏ/2mω) ⟨n|â + â†|0⟩
        = √(ℏ/2mω) ⟨n|â†|0⟩
        = √(ℏ/2mω) δ_{n,1}   (7.19)
So it is only non-zero for the first excited state:

c_1(∞) = ig τ √(π/2mℏω) e^{−τ²ω²/4}   (7.20)
More generally it's easy to see that if we started in |N⟩ then we could only jump to |N ± 1⟩, with the same dependence on τ.
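This selection rule can be checked with truncated ladder-operator matrices (a numpy sketch, in units ℏ = m = ω = 1):

```python
import numpy as np

N = 8
a = np.diag(np.sqrt(np.arange(1.0, N)), k=1)   # annihilation operator: a|n⟩ = √n |n-1⟩
x = np.sqrt(0.5) * (a + a.T)                   # x̂ = √(ℏ/2mω)(â + â†)
col = x[:, 0]                                  # ⟨n|x̂|0⟩ for n = 0, ..., N-1
i, j = np.indices((N, N))
off = np.max(np.abs(x[np.abs(i - j) != 1]))    # x̂ connects only |n⟩ and |n ± 1⟩
print(col[1], off)  # √(1/2) ≈ 0.70710..., 0.0
```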
To make predictions it is necessary to see what happens to the ground state. The equation for c₀ is

c_0(t) = 1 + g c_0^{(1)}(t) + ...   (7.21)
with

c_0^{(1)}(∞) = −(i/ℏ) ∫_{−∞}^∞ ⟨0|H^{(1)}(t′)|0⟩ dt′ = 0   (7.22)
Here ∆ = ∆(x, p, ...) has no explicit time dependence and H^{(1)} is manifestly Hermitian. You can think about H^{(1)} as resulting from a harmonic driving force. An external electromagnetic field will give rise to such perturbations (although EM radiation is not necessarily monochromatic, i.e. the perturbation may involve terms of different ω).
Starting in an energy eigenstate as before and performing the time integrals, we find

c_n^{(1)}(t) = −(i/ℏ) ⟨n|∆|m⟩ ∫_0^t dt′ e^{i(ω_n − ω_m)t′} e^{−iωt′} − (i/ℏ) ⟨n|∆†|m⟩ ∫_0^t dt′ e^{i(ω_n − ω_m)t′} e^{iωt′}
            = −(1/ℏ) [ ⟨n|∆|m⟩ (e^{i(ω_n − ω_m − ω)t} − 1)/(ω_n − ω_m − ω) + ⟨n|∆†|m⟩ (e^{i(ω_n − ω_m + ω)t} − 1)/(ω_n − ω_m + ω) ]   (7.27)
where the ω_n are the H^{(0)} eigenvalues in units of ℏ (i.e. frequencies). In the late time limit t → ∞, we obtain a significant transition amplitude only when ω_{nm} ≡ ω_n − ω_m = ±ω. For ω > 0, the + sign corresponds to energy absorption, while the − sign corresponds to energy emission. As such, the time-dependent monochromatic perturbation acts as a source or sink of energy that can be exchanged with the system.
7.1. FERMI GOLDEN RULE 87
lim_{t→∞} sin(xt)/(πx) = δ(x).   (7.31)
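The δ-function limit can be checked numerically: for large t, sin(xt)/(πx) integrates to ≈ 1 and concentrates near x = 0 (a sketch; the grid and cutoff are arbitrary choices):

```python
import numpy as np

t = 200.0
x = np.linspace(-50.0, 50.0, 2000001)
x[np.abs(x) < 1e-12] = 1e-300   # avoid 0/0; the integrand tends to t/π there
f = np.sin(x * t) / (np.pi * x)
total = np.sum(f) * (x[1] - x[0])
print(total)  # ≈ 1
```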
(7.30) is a very useful equation that allows us to determine the transition rates between
atomic energy levels: shining monochromatic light of frequency that matches the energy
difference between two levels will induce a transition between the levels.
These results were first obtained by Dirac and it was Fermi who later called them
“golden rules”.
H = H0 + eE(t) · x̂ (7.32)
This is an assumption that tells us that there are no correlations among different spatial components of the fields. P(ω) is the average energy density in the radiation. We can use this result to simplify the average probability of transition
|c_n|² = (e²/ℏ²) Σ_i ∫_0^t dt′ ∫_0^t dt′′ E_i(t′) E_i(t′′) ⟨n|x̂_i|m⟩⟨m|x̂_i|n⟩ e^{iω_{nm}(t′ − t′′)}
       = (e²/ℏ²) Σ_{i=1}^3 |⟨n|x̂_i|m⟩|² ∫_{−∞}^∞ dω P(ω) |∫_0^t dt′ e^{i(ω_{nm} − ω)t′}|²   (7.36)
       = (e²/ℏ²) Σ_{i=1}^3 |⟨n|x̂_i|m⟩|² ∫_{−∞}^∞ dω P(ω) 4 sin²((ω_{nm} − ω)t/2)/(ω_{nm} − ω)².
In the large t limit, we can use the result from before, (7.31). In that case, the integral over ω sets the argument of P to ω_{nm} and the result is proportional to the probability that the electric dipole moment can link n and m. Depending on whether ω_{nm} > 0 or ω_{nm} < 0, we get absorption or stimulated emission.
Chapter 8: Semi-Classical Quantization: WKB
Let us now look at another type of approximation which holds whenever a system is
“nearly classical”. We will make this more precise below, but to get some intuition
about what this means, note that we expect that for large systems, classical results
should emerge. Much of the quantum-ness of quantum theory should disappear. This
disappearance arises as the wavefunctions typically oscillate so quickly that the quantum
effects cancel out and only the dominant classical configurations remain. Furthermore,
we should be able to use quantum mechanics to characterize deviations from classicality.
8.1 WKB
The WKB approximation is named after Wentzel, Kramers & Brillouin ('26), although the methods were previously developed by Jeffreys ('23) in the mathematical context of approximate methods for differential equations. The same method appears under different names in various areas of physics. In optics, it goes under the name of the "eikonal" approximation. In QCD similar methods, employing a separation of the degrees of freedom into fast and slow ones, are used in computing high-energy observables.
Before we describe the math, let’s think about the question: “How do we quantify
the semi-classical regime of a system?” There are two naive answers:
• ℏ → 0. On the one hand, recall that the uncertainty principle tells us that ∆x∆p ∼
ℏ, hence we cannot measure both position and momenta with precision. In classical
mechanics, we of course can, so we expect the classical limit to be one in which
the uncertainty principle disappears, hence ℏ → 0. On the other hand, ℏ is a fixed
constant of nature, and it makes no sense to take it to 0. What we should instead
be doing is to look for a dimensionless quantity involving ℏ which can be taken to
be small. As we will see, it’s still useful to think about ℏ → 0 as taking a classical
limit and you may encounter this in other courses such as QFT.
• λ_dB ≪ L, where

λ_dB ≡ 2πℏ/p   (8.1)

is the de Broglie wavelength, p is the momentum of the system and L its size. Recall that for a system at rest, E = m₀c², where m₀ is the rest mass of the system/object. You can calculate the de Broglie wavelength for mundane objects like a ball or a cup of tea and you will find a tiny answer, much much smaller than the size of the object itself. So quantum mechanics is irrelevant when λ_dB ≪ L. In this limit, we cannot resolve the wave-like nature of particles. We enter the quantum regime as λ_dB ∼ L.
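For concreteness, the de Broglie wavelength of a mundane object versus an electron (SI values; the specific masses and speeds are illustrative):

```python
import math

hbar = 1.054571817e-34                                 # J s
lam_ball = 2 * math.pi * hbar / (0.058 * 30.0)         # tennis ball, 58 g at 30 m/s
lam_electron = 2 * math.pi * hbar / (9.109e-31 * 1e6)  # electron at ~1e6 m/s
print(lam_ball, lam_electron)  # ~4e-34 m versus ~7e-10 m
```

The ball's wavelength is some 24 orders of magnitude smaller than an atom, while the electron's is comparable to atomic sizes, which is why the latter is quantum.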
In terms of λ_dB we will see that we can apply semi-classical methods when:

a) dλ_dB/dx ≪ 1   (8.2)

and

b) λ_dB (dV/dx) ≪ |p|²/m,   (8.3)
where V is the potential experienced by the system. a) tells us that the dB wavelength
has to be slowly varying. One can imagine that in this case deviations from classicality
will be under control. b) tells us that the potential is slowly varying. We will understand
how these conditions arise when solving the Schrodinger equation with “approximate
wavefunctions”.
We expect that classical solutions will be just waves (of huge frequency), so let's consider wave-functions of the form (in one dimension)

ψ(x) = e^{iσ(x)/ℏ}

For now, we won't worry about the normalization. We can fix that later. Taking the exponent to be σ(x)/ℏ, as opposed to simply kx, will allow us to characterize deviations from simple wave-like behavior. Note that
dψ/dx = (i/ℏ)(dσ/dx) e^{iσ(x)/ℏ}
d²ψ/dx² = (i/ℏ)(d²σ/dx²) e^{iσ(x)/ℏ} − (1/ℏ²)(dσ/dx)² e^{iσ(x)/ℏ}   (8.5)
Substituting this Ansatz into the Schrödinger equation restricted to one dimension, we find

Eψ = −(ℏ²/2m) d²ψ/dx² + V(x)ψ
E = −(iℏ/2m) d²σ/dx² + (1/2m)(dσ/dx)² + V(x).   (8.6)
8.1. WKB 91
We now see that taking ℏ to be small really means that the first term on the RHS is much smaller than the second, namely

(dσ/dx)² ≫ ℏ d²σ/dx².   (8.7)
In fact this condition is nothing but condition a) stated above. To see this note that

σ(x)/ℏ = k(x)x = (2π/λ(x)) x = kx + (dk/dx) x² + ··· ⟹
(1/ℏ) dσ/dx = k + 2(dk/dx) x + ··· ,   (1/ℏ) d²σ/dx² = 2 dk/dx + ···   (8.8)
Hence (8.7) is equivalent to

ℏ²k² ≫ ℏ² dk/dx   (8.9)

or in terms of λ

4π²/λ² ≫ (2π/λ²) dλ/dx ⟹ dλ/dx ≪ 1. ✓   (8.10)
If this condition holds, then we can make a semi-classical expansion:
Thus we have determined the wavefunction (in principle, and assuming the condition (8.7)). In other words our semi-classical wavefunction is

ψ_semiclassical = A e^{(i/ℏ)∫^x p(y)dy} + B e^{−(i/ℏ)∫^x p(y)dy}   (8.15)

where p(y) is defined by

E = p²/2m + V(y)   (8.16)
92 CHAPTER 8. SEMI-CLASSICAL QUANTIZATION: WKB
ψ = cAi(z) (8.27)
for some constant c. It is named after George Airy who was an Astronomer Royal at
the Greenwich Observatory. He features in their museum if you have ever been. The
Airy function is the poster-child of functions that are difficult to understand using perturbative techniques, due to its bi-polar characteristics of oscillating and decaying. But we love it nonetheless as it is a thing of beauty with important and varied applications (and with modern computer techniques it can be evaluated numerically with high precision).
It can be defined by

Ai(z) = (1/√π) ∫_0^∞ cos(u³/3 + zu) du   (8.28)
From here you can check that

Ai′(z) = −(1/√π) ∫_0^∞ u sin(u³/3 + zu) du
Ai″(z) = −(1/√π) ∫_0^∞ u² cos(u³/3 + zu) du   (8.29)
¹ There is a second solution that grows at large z.
94 CHAPTER 8. SEMI-CLASSICAL QUANTIZATION: WKB
[Figure: plot of Ai(z), oscillating for z < 0 and decaying exponentially for z > 0.]
so

Ai″(z) − zAi(z) = −(1/√π) ∫_0^∞ (u² + z) cos(u³/3 + zu) du
               = −(1/√π) ∫_0^∞ (d/du) sin(u³/3 + zu) du
               = 0   (8.30)
(one must be careful with the boundary term at infinity, which oscillates wildly). A plot of Ai(z) is given above. It oscillates to the left of the turning point (where E > F x) and then dies off exponentially to the right (where E < F x). Therefore, near a turning point, this is what we expect the wavefunction to look like. In particular the asymptotic form of Ai(z) is known:

Ai(z) ∼ (1/z^{1/4}) e^{−(2/3)z^{3/2}}   as z → ∞,
Ai(z) ∼ (1/|z|^{1/4}) sin( (2/3)|z|^{3/2} + π/4 )   as z → −∞. (8.31)
It's worth mentioning that Ai(z), and hence the wavefunction, is not zero to the right of the turning point, where the potential energy V = F x is greater than the total energy: V > E. Thus there is a non-zero probability to find the particle in a region which is strictly forbidden in the classical world.
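One can check this behaviour numerically. The sketch below assumes scipy is available; note that scipy normalizes Ai with a 1/π prefactor rather than the 1/√π of (8.28), but this overall constant drops out of the checks:

```python
import numpy as np
from scipy.special import airy   # assumes scipy is available

# Check Ai''(z) = z Ai(z) by central differences, and the decay/oscillation
# behaviour on either side of z = 0.
ai = lambda x: airy(x)[0]        # airy returns (Ai, Ai', Bi, Bi')
z = np.linspace(-10.0, 3.0, 131)
h = 1e-4
ai_dd = (ai(z + h) - 2 * ai(z) + ai(z - h)) / h**2
assert np.allclose(ai_dd, z * ai(z), atol=1e-4)

# exponential suppression in the classically forbidden region z > 0
assert ai(5.0) < 1e-3 and ai(10.0) < ai(5.0)
```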
So let us try this with the WKB approximation. The idea is that WKB should be good away from the turning points, while near a turning point we can use the Airy function. It is then a question of patching together the various wavefunctions.
Let's do the WKB procedure for V = E + F (x − b). This is a potential that rises to the right (F > 0) with the turning point at x = b:
p(x) = √(2m(F b − F x)) (8.32)
and hence (we pick the lower bound of integration constant to make things simple as
the effect is just a constant that can be absorbed elsewhere)
σ⁽⁰⁾ = ∫_b^x √(2m(F b − F y)) dy = −(2/3) √(2mF) (b − x)^{3/2} (8.33)
Thus our WKB wavefunction is
ψ = (A/√p(x)) e^{i(2/3ℏ)√(2mF)(b−x)^{3/2}} + (B/√p(x)) e^{−i(2/3ℏ)√(2mF)(b−x)^{3/2}} ,   b > x (8.34)
but we don’t trust it near x = b and in particular p = 0 there. Rather we take the
following
ψ = (A_r/√|p(x)|) e^{−(2/3ℏ)√(2mF)(x−b)^{3/2}} + (B_r/√|p(x)|) e^{(2/3ℏ)√(2mF)(x−b)^{3/2}}   x > b
ψ = c Ai( (2mF/ℏ²)^{1/3} (x − b) )   x ∼ b (8.35)
ψ = (A_l/√p(x)) e^{i(2/3ℏ)√(2mF)(b−x)^{3/2}} + (B_l/√p(x)) e^{−i(2/3ℏ)√(2mF)(b−x)^{3/2}}   x < b
In other words the WKB solution agrees with the asymptotic regions of the exact Airy
function but not the region where it changes from oscillating to damping.
We can also imagine a turning point, rising to the left, located at x = a < b. The
analysis is the same but with x − b → a − x.
ψ = (c/√p(x)) sin( −(2/3ℏ)√(2mF)(x−a)^{3/2} − π/4 )   x > a
ψ = c Ai( (2mF/ℏ²)^{1/3} (a − x) )   x ∼ a (8.38)
ψ = (c/√|p(x)|) e^{−(2/3ℏ)√(2mF)(a−x)^{3/2}}   x < a
Let us now consider a more general situation where there is a potential V with turning points on the left and right. In the middle, a < x < b, we need to match the two solutions we find (this time assuming a general potential):
ψ_b = (c/√p(x)) sin( −(1/ℏ)∫_b^x p(y)dy + π/4 )   x < b
ψ_a = (c/√p(x)) sin( −(1/ℏ)∫_a^x p(y)dy − π/4 )   x > a (8.39)
where a and b are the two turning points. For ψ_a and ψ_b to agree in the middle their arguments must match up to a shift by nπ, which requires

∫_a^b p(y)dy = πℏ(n − 1/2).

This quantization condition was more or less guessed by Bohr and Sommerfeld in the early days of quantum mechanics, where they conjectured that n = 1, 2, ....
The associated wavefunctions will not be the ones we found before: i.e. a polynomial
times an exponential suppression. But that’s okay, we don’t expect to land on the exact
answer in any non-trivial case.
Let us see what we can do. At large x we have σ⁽⁰⁾ ∼ ±(i/2)√(mk) x², and hence we do see an exponential suppression in ψ for the right choice of sign (the wrong choice must be discarded by normalizability). We can apply the Bohr-Sommerfeld quantization condition:
∫_a^b p(x)dx = ∫_{−√(2E/k)}^{√(2E/k)} √(2mE − mkx²) dx = πE √(m/k) = πE/ω (8.43)
Setting this to πℏ(n − 1/2) indeed gives us the correct spectrum E_n = ℏω(n − 1/2), n = 1, 2, ....
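The integral (8.43) can be checked numerically; the sketch below uses the illustrative values m = k = E = 1 (so ω = 1):

```python
import numpy as np

# Numerical check of (8.43) for the harmonic oscillator V = k x^2 / 2,
# with assumed illustrative values m = k = E = 1 (so omega = 1).
m, k, E = 1.0, 1.0, 1.0
w = np.sqrt(k / m)
xt = np.sqrt(2 * E / k)                        # turning points at +/- xt
x = np.linspace(-xt, xt, 200001)
p = np.sqrt(np.clip(2 * m * E - m * k * x**2, 0, None))
phase = np.trapz(p, x)                         # int_a^b p(x) dx

assert abs(phase - np.pi * E / w) < 1e-4
# Setting phase = pi * hbar * (n - 1/2) with hbar = 1 then gives E_n = n - 1/2.
```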
Chapter 9

Entanglement, Density Matrices and Thermal States

We consider states in a tensor product H1 ⊗ H2 of two Hilbert spaces (for example two spin-1/2 systems), normalized so that

⟨ψ|ψ⟩ = 1. (9.2)
and so on. We will from now on drop the tensor product sign, namely | ↑⟩⊗| ↑⟩ ≡ | ↑⟩| ↑⟩.
From (9.1) we see that states such as |ψ1⟩ and |ψ2⟩ in (9.4) and (9.5) are both in H1 ⊗ H2.
Definition: States that can be written as |ψ⟩ = |φ⟩ ⊗ |χ⟩, for some |φ⟩ ∈ H1 and |χ⟩ ∈ H2, are called product states. States that are not product states are called entangled.
We see that |ψ1 ⟩ in (9.4) is a product state, while |ψ2 ⟩ in (9.5) is entangled.
Entanglement is a fundamental property of quantum mechanics. To see this, we
consider the following thought experiment (EPR 1935): Let 1 and 2 be two particles
produced at the same source and sent to two distinct detectors with respectively settings
a and b. An example of setting here would be the spatial orientation of the detector.
Let A and B be the measurement outcomes at the two detectors, given the respective
settings a and b. Einstein, Podolsky and Rosen in their now famous 1935 paper proposed that any “physical” theory should meet the following two criteria:
a) Reality: if, without disturbing the system, we can predict with certainty the value of a physical quantity, then there exists an element of reality corresponding to that quantity.
b) Locality: the outcome of a measurement on one particle cannot be influenced by events (such as the choice of detector setting) at the other, distant detector.
Criteria a) and b) go under the name of “local realism”. Quantum mechanics is in-
compatible with local realism, as famously demonstrated through a violation of “Bell’s
inequality” first in 1972 by Freedman and Clauser and a dozen other times since. Indeed, the 2022 Nobel prize went to three experimenters (Aspect, Clauser and Zeilinger) who showed, in a variety of experiments, that Bell's inequalities are violated.
Bell’s inequality is an inequality that ought to be obeyed by physical systems with
properties a) and b) above. To be more specific, let particles 1 and 2 in our thought
experiment above be simple, two-level systems. Classically, this means that each of these
particles can be in one of two states, and hence the measurement outcomes A and B
above can take one of two possible values: ±1. Furthermore, (a, b) could involve rotating
the detector (ie. measuring the spin or polarization of the particles along different axes).
Each of the detectors is allowed to be in one of two possible settings, a or a′ and b or b′ .
Now given detector settings (a, b) and measurement outcomes as above, one can define the correlation function

E(a, b) = ⟨A(a)B(b)⟩,

where the angle brackets mean averaging over many runs in which the particles are
subjected to measurements with the same detector setting (a, b). Note that since (clas-
sically) A(a) = ±1 and B(b) = ±1, we must also necessarily have A(a)B(b) = ±1.
Therefore (think about the average correlated outcome of tossing two coins)
−1 ≤ E(a, b) ≤ 1. (9.8)
Claim 1: Any physical theory obeying local realism must have |S| ≤ 2, where S is the average of

S0 ≡ A(B − B′) + A′(B + B′).

This is called Bell's inequality.
Proof: Simply notice that S0 = ±2. This is because B′ = ±B, so only one of the two brackets can be non-zero, and it must equal ±2. The average over this quantity is then
S ≡ ⟨S0⟩ = ∫ dλ p({λ}) S0({λ}) (9.11)
where λ are possible “hidden variables”, ie. parameters that could vary across different
experimental runs, assumed to be the same for both particles since they are created at
the same source. Then

|S| ≤ ∫ dλ p({λ}) |S0({λ})| ≤ 2, (9.12)

where we have used that ∫ dλ p({λ}) = 1.
Claim 2: Quantum mechanics violates the Bell inequality (9.12).
To see this, consider the entangled state

|ψ⟩ = (1/√2)( |↑⟩_A |↓⟩_B − |↓⟩_A |↑⟩_B ). (9.13)
Note that entanglement violates locality (property b) in the definition of local realism
above) through measurement of either particle.
Then consider the following settings of the detector:
a) Rotate the first device to measure spin/polarization along n̂a in the plane per-
pendicular to the direction of propagation of particle 1. In other words, the first
device will perform a measurement of the particle 1 in the rotated basis
|↑′⟩_A = R_a |↑⟩_A ,  |↓′⟩_A = R_a |↓⟩_A ,
R_a = (  cos θ_a   sin θ_a
        −sin θ_a   cos θ_a ) (9.14)
b) Rotate the second device to measure spin/polarization along n̂b in the plane per-
pendicular to the direction of propagation of particle 2. In other words, the second
device will perform a measurement of the particle 2 in the rotated basis
|↑′⟩_B = R_b |↑⟩_B ,  |↓′⟩_B = R_b |↓⟩_B ,
R_b = (  cos θ_b   sin θ_b
        −sin θ_b   cos θ_b ) (9.15)
In this new basis, the outcomes of the measurement will still be ±1, however the probabilities with which they occur will be different. Quantum mechanics predicts

p(1, 1; θ_a, θ_b) = (1/2)[sin(θ_ab)]². (9.16)

A similar computation then gives the probabilities of the other outcomes (given the same detector settings above)
p(1, −1; θ_a, θ_b) = (1/2)[cos(θ_ab)]²
p(−1, 1; θ_a, θ_b) = (1/2)[cos(θ_ab)]² (9.17)
p(−1, −1; θ_a, θ_b) = (1/2)[sin(θ_ab)]² ,   θ_ab ≡ θ_a − θ_b .
In this case (with the settings θ_a = 0, θ_b = θ, θ_a′ = 2θ, θ_b′ = 3θ)

S = −3 cos 2θ + cos 6θ, (9.21)
and maximizing this quantity with respect to θ gives
dS/dθ = 0 =⇒ 6 sin 2θ − 6 sin 6θ = 0, which is solved by θ = π/8. (9.22)
Plugging this value back into (9.21), we find

S_max = −2√2 =⇒ |S| > 2. (9.23)

Hence, given an entangled state and assuming the postulates of quantum mechanics, one can violate Bell's inequality. This violation has been experimentally confirmed, with some of the experiments also recently closing loopholes regarding the existence of “hidden variables”.
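A small numerical sketch of this computation, using the probabilities (9.17) and the assumed settings θ_a = 0, θ_b = θ, θ_a′ = 2θ, θ_b′ = 3θ that reproduce (9.21):

```python
import numpy as np

# CHSH sketch from the quantum probabilities (9.17).
def E_corr(theta_ab):
    # E = p(1,1) + p(-1,-1) - p(1,-1) - p(-1,1) = -cos(2 theta_ab)
    return np.sin(theta_ab)**2 - np.cos(theta_ab)**2

def S(t):
    # assumed settings: theta_a = 0, theta_b = t, theta_a' = 2t, theta_b' = 3t
    ta, tb, tap, tbp = 0.0, t, 2 * t, 3 * t
    return E_corr(ta - tb) - E_corr(ta - tbp) + E_corr(tap - tb) + E_corr(tap - tbp)

assert abs(E_corr(0.0) + 1) < 1e-12                  # perfect anti-correlation
assert abs(S(np.pi / 8) + 2 * np.sqrt(2)) < 1e-12    # |S| = 2 sqrt(2) > 2
```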
9.2 Density Matrices

Given a state |ψ⟩ we can define the density matrix

ρ = |ψ⟩⟨ψ|. (9.24)

Given any observable O we can compute its expectation value in the state |ψ⟩ by computing a trace:
tr(Oρ) = Σ_n ⟨e_n|Oρ|e_n⟩ = Σ_n ⟨e_n|O|ψ⟩⟨ψ|e_n⟩ (9.25)
Next we expand

|ψ⟩ = Σ_m c_m |e_m⟩ ⇐⇒ c_m = ⟨e_m|ψ⟩ = (⟨ψ|e_m⟩)* (9.26)

so, taking the |e_m⟩ to be eigenstates of O with eigenvalues λ_m,

tr(Oρ) = Σ_{n,m} c_m ⟨e_n|O|e_m⟩ c_n*
= Σ_{n,m} c_m λ_m ⟨e_n|e_m⟩ c_n*
= Σ_n λ_n |c_n|²
= ⟨ψ|O|ψ⟩. (9.27)

In particular, taking O = I gives

tr(ρ) = ⟨ψ|ψ⟩ = 1. (9.28)
Thus we can swap a state for a density matrix. Such a state is called a pure state,
meaning that it is equivalent to a single state in the more traditional formulation.
So what's the point? We can consider more general density matrices of the form

ρ = Σ_i p_i |ψ_i⟩⟨ψ_i| ,  p_i ≥ 0 ,  tr(ρ) = 1. (9.29)

Note that this is with respect to some basis. If we choose a different basis then ρ may not take such a diagonal form. The second condition translates into

Σ_i p_i = 1. (9.30)
In general these are called mixed states when they can’t be written as
ρ = |ψ⟩⟨ψ| (9.31)
for a single state |ψ⟩. A mixed state allows us to introduce a statistical notion of
uncertainty, in the sense that we don’t know what the quantum state is (but maybe we
could if we did further experiments). For example we can compute

tr(Oρ) = Σ_i p_i ⟨ψ_i|O|ψ_i⟩.

The expectation value then has the interpretation of a classical statistical average over the individual quantum expectation values.
Let’s consider the following examples of density matrices:
ρ1 = |ψ⟩⟨ψ| = (1/2)( |↑⟩ + |↓⟩ )( ⟨↑| + ⟨↓| ) = (1/2)( |↑⟩⟨↑| + |↑⟩⟨↓| + |↓⟩⟨↑| + |↓⟩⟨↓| ) (9.33)

and

ρ2 = (1/2)( |↑⟩⟨↑| + |↓⟩⟨↓| ). (9.34)
Note that although the density matrix ρ2 may look simpler than ρ1 it is a mixed state
whereas ρ1 is pure.
How can we tell in general whether or not a density matrix corresponds to a pure or mixed state? Well a pure state means ρ = |ψ⟩⟨ψ| for some state |ψ⟩ and hence

ρ² = |ψ⟩⟨ψ|ψ⟩⟨ψ| = ρ.

So in particular tr(ρ²) = tr(ρ) = 1. But for a mixed state (we assume the |ψ_n⟩ form an orthonormal basis)
ρ = Σ_n p_n |ψ_n⟩⟨ψ_n|
ρ² = Σ_{n,m} p_n p_m |ψ_n⟩⟨ψ_n|ψ_m⟩⟨ψ_m| = Σ_n p_n² |ψ_n⟩⟨ψ_n| (9.36)

and since p_n² ≤ p_n we have

tr(ρ²) = Σ_n p_n² ≤ 1 (9.38)
with equality iff ρ represents a pure state. For example, for ρ2 above we have p1 = p2 = 1/2, so tr(ρ) = 1 as required, but tr(ρ²) = 1/4 + 1/4 = 1/2 < 1, which tells us it is indeed a mixed state.
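These purity computations are easy to verify directly; a minimal sketch in the {|↑⟩, |↓⟩} basis:

```python
import numpy as np

# Purity check for rho1 (9.33) and rho2 (9.34) in the {|up>, |down>} basis.
up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])

psi = (up + down) / np.sqrt(2)
rho1 = np.outer(psi, psi)                                  # pure state
rho2 = 0.5 * (np.outer(up, up) + np.outer(down, down))     # mixed state

assert abs(np.trace(rho1) - 1) < 1e-12 and abs(np.trace(rho2) - 1) < 1e-12
assert abs(np.trace(rho1 @ rho1) - 1.0) < 1e-12    # tr(rho^2) = 1: pure
assert abs(np.trace(rho2 @ rho2) - 0.5) < 1e-12    # tr(rho^2) < 1: mixed
```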
9.3 Thermal States

A particularly important mixed state is the thermal density matrix

ρ_thermal = (1/Z) Σ_n e^{−E_n/k_B T} |E_n⟩⟨E_n| (9.39)

where |E_n⟩ are the energy eigenstates, T is the temperature and k_B is Boltzmann's constant, k_B = 1.380649 × 10⁻²³ J/K, which converts temperature into energy.¹ Clearly this is a mixed state and is known as the Boltzmann distribution.
To determine the normalization Z we need to impose

1 = tr(ρ) = (1/Z) Σ_{n,m} e^{−E_n/k_B T} ⟨E_m|E_n⟩⟨E_n|E_m⟩ = (1/Z) Σ_n e^{−E_n/k_B T}. (9.40)

Thus

Z = Σ_n e^{−E_n/k_B T}. (9.41)
This is known as the partition function and plays a very central role. It counts the number of states available at each energy E_n:

Z = Σ_{E_n} d(n) e^{−βE_n} (9.42)

where d(n) counts the degeneracy of states at energy level E_n and β = 1/k_B T is the inverse temperature.
At low temperatures, where T → 0, the density matrix strongly peaks around the lowest energy state:

lim_{T→0} ρ_thermal = |E_0⟩⟨E_0| (9.43)

(assuming a non-degenerate ground state E_0) and hence becomes pure. However at high temperature, T → ∞, all the energy states contribute more or less equally and so

lim_{T→∞} ρ_thermal = (1/dim H) Σ_n |E_n⟩⟨E_n|. (9.44)

This is proportional to the identity matrix (in the |E_n⟩ basis). Such states are called maximally mixed.
¹This value of k_B is exact; it is a definition.
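A minimal numerical sketch of the two limits, for an assumed three-level system with energies E_n = n in units where k_B = 1:

```python
import numpy as np

# Thermal density matrix for an assumed 3-level system with E_n = n,
# in units where k_B = 1.
E = np.array([0.0, 1.0, 2.0])

def rho_thermal(T):
    w = np.exp(-E / T)
    return np.diag(w / w.sum())       # diagonal in the energy basis; Z = w.sum()

cold, hot = rho_thermal(1e-3), rho_thermal(1e6)
assert abs(np.trace(cold) - 1) < 1e-12
assert abs(cold[0, 0] - 1) < 1e-6       # T -> 0: pure ground state
assert abs(hot[0, 0] - 1 / 3) < 1e-3    # T -> infinity: maximally mixed
```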
Figure 9.3.1: The Planck Curve: N.B. The horizontal axis is wavelength λ ∼ 1/ω
This is the famous Planck formula for the spectrum of light emitted from a so-called black body of temperature T,

E_thermal = ℏω / ( e^{ℏω/k_B T} − 1 ),

albeit in just one dimension. In three dimensions there is an extra factor of 4πω² in the numerator, which comes from counting modes in three dimensions; basically it arises from the growth of the size of a sphere, with ω = |k| where k is the spatial momentum. A plot of E_th/(ℏω) is given in Fig. 9.3.1. In particular it was observed that all bodies radiate with a certain spectrum that depends on their temperature. Humans (living ones) radiate in the infra-red. A hotter body radiates visible light, such as an oven hob (not an induction one!). Even the empty space of the Universe emits radiation, in the form of the cosmic microwave background (CMB), corresponding to a temperature of about −270 °C. Planck's formula for E_thermal matches experiment, but deriving it was one of those problems the nineteenth century physicists thought was just a loose string. Rather, Planck had to introduce the notion of discrete energies and ℏ to derive it. That was the beginning of the end for Classical Physics.
9.4 Entropy
To continue we need to talk about the idea of entropy. Entropy is a measure of disorder.
More precisely it measures how many microscopic states lead to the same macroscopic
one. All your various charger cables are always tangled up as there are so many more
tangled possibilities than untangled ones. It's not that life is against you, it's just that being organised is an exponentially unlikely state of affairs: organisation requires some
organiser and quite some effort². The most basic definition of entropy is that it is the logarithm of the number of states with the same energy. In quantum mechanics the natural notion is the von Neumann entropy,

S_vN = −tr(ρ ln ρ).

This requires a little motivation. Firstly, the easiest way to compute the logarithm of an operator is to diagonalise it, so
S_vN = −Σ_n p_n ln p_n (9.53)
In fact there is a theorem that SvN ≤ ln(dim H). A mixed state with SvN = ln(dim H)
is said to be maximally mixed. For example a maximally mixed state arises when
pn = 1/N for all n, where N = dim H. Such a mixed state is equally mixed between all
pure states.
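A quick sketch of (9.53), checking the pure and maximally mixed extremes:

```python
import numpy as np

# Von Neumann entropy S_vN = -sum_n p_n ln p_n for a diagonal density matrix.
def svn(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # 0 ln 0 -> 0 by convention
    return -np.sum(p * np.log(p))

N = 4
assert svn([1.0, 0.0, 0.0, 0.0]) == 0                      # pure state: S = 0
assert abs(svn(np.full(N, 1 / N)) - np.log(N)) < 1e-12     # maximally mixed: ln N
```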
9.5 More on Entanglement

Now these two particles might be very far from each other. Perhaps they were created in some experiment as a pair and then flew away across the universe, like endless rain into
a paper cup. Suppose sometime later we measure one of the particles, say the first, and it is in the state |½, ½⟩. Then we know, with absolute certainty, that the second particle, wherever it is in the Universe, is in the |½, −½⟩ state. This is very counter-intuitive
and indeed it violates condition b) in the local realism hypothesis. Nevertheless, we
can’t use it to send messages faster than light as we can’t control how the first particle’s
wavefunction ‘collapses’: to know what state the second particle is in one needs to
receive communication about the outcome on particle 1, and this cannot happen faster
than the speed of light.
Now recall that a general state in the tensor product H = H1 ⊗ H2 of Hilbert spaces
can be written as
|Ψ⟩ = Σ_{n′,n′′} c_{n′n′′} |e_{n′}⟩ ⊗ |e_{n′′}⟩ (9.56)
with |e_{n′}⟩ a basis of H1 and |e_{n′′}⟩ a basis of H2. Note that the corresponding pure state takes the form

ρ = |Ψ⟩⟨Ψ|
= Σ_{n′,n′′,m′,m′′} c_{n′n′′} c*_{m′m′′} |e_{n′}⟩ ⊗ |e_{n′′}⟩ ⟨e_{m′}| ⊗ ⟨e_{m′′}|
= Σ_{n′,n′′,m′,m′′} c_{n′n′′} c*_{m′m′′} |e_{n′}⟩⟨e_{m′}| ⊗ |e_{n′′}⟩⟨e_{m′′}| (9.57)
Next we admit that we have no idea what is going on in the second Hilbert space H2. Thus we construct a reduced density matrix where we sum (trace) over all the options for H2:

ρ1 = tr_{H2}(ρ) = Σ_{n′,m′} ( Σ_{n′′} c_{n′n′′} c*_{m′n′′} ) |e_{n′}⟩⟨e_{m′}| ≡ Σ_{n′,m′} a_{n′m′} |e_{n′}⟩⟨e_{m′}| (9.59)
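As a sketch, tracing out the second spin of the entangled state (9.13) leaves a maximally mixed reduced density matrix, with matrix elements a_{n′m′} as in (9.59):

```python
import numpy as np

# Reduced density matrix of the singlet state (9.13): coefficients c_{n'n''}
# in the basis {|up>, |down>} for each factor.
c = np.zeros((2, 2))
c[0, 1], c[1, 0] = 1 / np.sqrt(2), -1 / np.sqrt(2)

rho1 = c @ c.conj().T                 # a_{n'm'} = sum_{n''} c_{n'n''} c*_{m'n''}
assert np.allclose(rho1, np.eye(2) / 2)   # maximally mixed
```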
Chapter 10

Relativistic Quantum Mechanics
In this chapter we discuss the interplay between quantum mechanics and special relativity. As a warm-up, we will revisit the problem of a particle in an electromagnetic field, the properties of the Schrödinger equation under “gauge” transformations and the implications for physical observables. In particular, we will study the Aharonov-Bohm effect. We will then see how demanding that quantum mechanics be compatible with special relativity leads us to new equations, such as the Klein-Gordon and Dirac equations.
Recall that a classical particle of charge q moving in electric and magnetic fields E and B experiences the Lorentz force:

m dv/dt = qE + (q/c) v × B. (10.1)
The electric and magnetic field can be combined into an electromagnetic field strength
tensor Fµν , µ, ν = 0, · · · , 3 that transforms covariantly under Lorentz transformations
(which are symmetries of special relativity, as we will see later). Fµν is antisymmetric
and the electric and magnetic fields correspond to the following components
E_i ≡ −F_{0i} = ( −∇V − (1/c) dA/dt )_i ,
B_i ≡ ϵ_{ijk} F^{jk} = (∇ × A)_i (10.2)
Here A is the vector or gauge potential and together with the Coulomb potential V ,
combines into a 4-vector
Aµ = (V, A) . (10.3)
L = (1/2)mv² + (Q/c) v·A − qV , (10.4)

where the last two terms combine into the covariant coupling (1/c) p_µ A^µ.
Note that p is the canonical momentum, which in this case differs from the usual momentum mv. The Schrödinger equation for a quantum particle in an EM field is obtained from (10.5) by simply replacing p → −iℏ∇. We obtain

Ĥ = (1/2m)( −iℏ∇ − (Q/c)A )² + qV . (10.6)
All physical quantities (electric, magnetic fields) are invariant under the gauge trans-
formations
V → Ṽ = V − (1/c) ∂Λ(x,t)/∂t ,
A → Ã = A + ∇Λ(x,t) , (10.7)

or in 4-vector notation

A_µ → A_µ + ∂_µ Λ. (10.8)

Under these transformations, the wavefunction is not invariant, but instead transforms as

ψ̃(x, t) = e^{iQΛ/ℏc} ψ(x, t). (10.9)
One can check that

( −iℏ∇ − (Q/c)Ã ) ψ̃ = e^{iQΛ/ℏc} ( −iℏ∇ − (Q/c)A ) ψ , (10.10)

which means that the Schrödinger equation is invariant under gauge transformations,

[ (1/2m)( −iℏ∇ − (Q/c)Ã )² + QṼ ] ψ̃ = iℏ ∂ψ̃/∂t . (10.11)
• two particles emitted at a source and directed along paths enclosing (but not
crossing) a region with non-vanishing magnetic flux
• particles recombine
The key outcome of this experiment is that, upon recombination, the particles give rise to an interference pattern that is sensitive to the magnetic flux. This is all despite the fact that the particles were never directly exposed to the flux. Classically, this would not be possible.
Let
Φ=B·S (10.12)
be the magnetic flux generated by a solenoid of area S (magnitude proportional to the
surface area, direction perpendicular to the area). Note that
Φ = ∫ dS · B = ∫ dS · (∇ × A) = ∮ A · dx , (10.13)

where in the last step we used Stokes' theorem. Now let's work in cylindrical coordinates
with the z axis aligned along the solenoid, in the direction of the magnetic field. Then
Φ = ∮ A · dx = ∫₀^{2π} A_ϕ r dϕ =⇒ A_ϕ = Φ/(2πr). (10.14)
All other components of the gauge potential vanish. The Hamiltonian (10.6) is then
H = (1/2mr²)( −iℏ ∂/∂ϕ − QΦ/2π )². (10.15)
Taking energy eigenstates ψ = e^{inϕ}, n ∈ Z, the spectrum is

E_n = (ℏ²/2mr²)( n − Φ/Φ0 )² ,

where

Φ0 = 2πℏ/Q (10.17)

is the flux quantum.
Now notice that the spectrum is unaffected if Φ = mΦ0 , m ∈ Z. On the other hand,
if Φ ̸= mΦ0 , the spectrum is shifted. This phenomenon is known as spectral flow and
is physical: it knows about B and Φ despite the fact that the particles are not directly
exposed to Φ. It can be measured by the Aharonov-Bohm effect, ie. an interference
pattern.
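Spectral flow is easy to visualise numerically. The sketch below works in units where ℏ²/(2mr²) = 1; the truncation to a finite range of n is an artifact of the sketch, so only the lowest few levels are compared:

```python
import numpy as np

# Spectral flow sketch: E_n = (n - Phi/Phi0)^2 in units hbar^2/(2 m r^2) = 1.
def spectrum(phi_over_phi0, nmax=10):
    n = np.arange(-nmax, nmax + 1)
    return np.sort((n - phi_over_phi0)**2)[:5]   # lowest few levels

# an integer number of flux quanta leaves the spectrum unchanged...
assert np.allclose(spectrum(0.0), spectrum(2.0))
# ...while a fractional flux shifts the levels: spectral flow
assert not np.allclose(spectrum(0.0), spectrum(0.5))
```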
To see this, consider the more general case
(1/2m)( −iℏ∇ − QA )² ψ = Eψ (10.18)

and set

ψ(x) = e^{(iQ/ℏ) ∫^x A(x′)·dx′} ϕ(x). (10.19)
Substituting (10.19) into the Schrodinger equation (10.18) immediately shows that ϕ(x)
obeys the free equation, ie.
(1/2m)( −iℏ∇ − QA )² ψ = e^{(iQ/ℏ)∫^x A(x′)·dx′} (1/2m)( −iℏ∇ + QA − QA )² ϕ(x) = E e^{(iQ/ℏ)∫^x A(x′)·dx′} ϕ(x). (10.20)
Recall that under gauge transformations

A → A + ∇Λ ,  ψ(x) → e^{iQΛ/ℏc} ψ. (10.21)
For two paths passing on opposite sides of the solenoid, the accumulated phases in (10.19) differ by (Q/ℏ)∮A·dx′ = QΦ/ℏ. This phase difference will have an effect on the interference pattern provided that

Φ ≠ 2πℏn/Q = nΦ0. (10.23)
iℏ ∂Ψ/∂t = −Σ_a (ℏ²/2m_a) ∇²_a Ψ + V(x_a)Ψ (10.24)
where Ψ = Ψ(t, x1, ..., xN). We don't expect there to be a preferred point or direction in space, so let us suppose that the potential only depends on the distances |x_a − x_b| between the positions of the particles. Schrödinger's equation is then invariant under translations and rotations. What about a Galilean boost,
xa → x′a = xa + vt (10.27)
Clearly V is invariant as xa − xb = x′a − x′b . And the Laplacian terms don’t change.
However if we write Ψ′ (t, xa ) = Ψ(t, x′a )
∂Ψ′/∂t = ∂Ψ/∂t + v · Σ_a ∇_a Ψ (10.28)
Thus we find an extra term on the left hand side and the Schrödinger equation is no longer invariant. Drawing inspiration from our discussion of gauge invariance above, we can ask whether we can fix this by demanding that the wavefunction transforms under Galilean boosts by a phase.
Indeed, one can show that the unwanted term in (10.28) can be compensated for by
taking instead
Ψ′(t, x_a) = e^{−(i/ℏ) v · Σ_a ( m_a x_a + (1/2) m_a v t )} Ψ(t, x′_a) (10.29)
i.e. we include an extra phase factor in the wavefunction. As a result we find (see the
problem set)
iℏ ∂Ψ′/∂t = −Σ_a (ℏ²/2m_a) ∇′²_a Ψ′ + V(x′_a)Ψ′ (10.30)
Thus we have a symmetry under what are known as Galilean boosts, corresponding to a notion of Galilean relativity. In particular, in Galilean relativity time is absolute: everyone agrees on t and on an infinitesimal tick of the clock dt. So too do they all agree on spatial lengths dx² + dy² + dz². Thus observers may move in reference frames which are related by rotations and boosts of the form (10.27).
That’s great but as Einstein showed in 1905 Galilean boosts are not symmetries of
space and time. Rather Lorentzian boosts are and we need to consider Special Relativity!
In Galilean relativity the speed of light is not a constant because under a boost velocities
are shifted:
ẋ → ẋ + v. (10.31)
But Einstein's great insight (well, one of them) was to insist that the speed of light is a universal constant. I won't go into why, except to mention that Maxwell's equations for electromagnetism do not allow you to change the speed of light by going to a moving frame. They are not invariant under the Galilean boosts we saw above. They are instead invariant under Lorentz transformations, which include rotations and Lorentz boosts, which we now describe.
So what is a Lorentz boost? In Special Relativity we have that the invariant notion of length is, for an infinitesimal displacement,¹

ds² = −c²dt² + dx² + dy² + dz².

Here c is the speed of light. We only require that ds² is the same for all observers (not dt and dx² + dy² + dz² separately). Thus we are allowed to consider a transformation of the
form
we need ds′² = ds² and hence (we could also take the other sign, but that would flip the direction of time)

γ = 1/√(1 − |β|²). (10.35)

Thus to recover the Galilean boost x′ = x + vt we identify β = v/c. Small β means |v| ≪ c.
Then we see that γ ≈ 1 and t′ ≈ t, and we have recovered absolute time. These transformations are called Lorentzian boosts, and together with spatial rotations they form the Lorentz group.
E² = |p|²c² + m²c⁴
(setting p = 0 immediately gives a famous formula). But this allows for both positive and negative energy.
One can find a conserved density: ρ = iΨ∗ ∂t Ψ − iΨ∂t Ψ∗ but this is not positive
definite (see the problems).
The other way to go, and one of those oh-so-clever moments of discovery, is to find an equation that is first order in time and space. This is what Dirac did, and it's beautiful.
So let us try something like

( (iℏ/c) γ⁰ ∂/∂t + iℏ γ · ∇ − mc ) Ψ = 0. (10.39)
We will not be very specific yet about what γ 0 and γ i are. We want this equation to
imply the Klein-Gordon equation so we square it (ie. take DD† , where D† is the adjoint
of the differential operator above):
( (iℏ/c) γ⁰ ∂/∂t + iℏ γ·∇ − mc )( −(iℏ/c) γ⁰ ∂/∂t − iℏ γ·∇ − mc ) Ψ = 0
=⇒ ( (ℏ²/c²)(γ⁰)² ∂²/∂t² + (ℏ²/2){γ^i, γ^j} ∂²/∂x^i∂x^j + (ℏ²/c){γ⁰, γ^i} ∂²/∂t∂x^i + m²c² ) Ψ = 0 (10.40)

To recover the Klein-Gordon equation we therefore need

(γ⁰)² = 1 ,  {γ⁰, γ^i} = 0 ,  {γ^i, γ^j} = −2δ^{ij}. (10.41)
Clearly this can't happen if the γ's are just numbers. But maybe it can if they are matrices. Some trial and error gives the following solution (unique up to conjugation γ → U γ U⁻¹):

γ⁰ = (  I   0
        0  −I )   γ^i = (  0    τ_i
                          −τ_i   0  ) (10.42)
where I is the 2 × 2 unit matrix and τ_i are the Pauli matrices. Thus Ψ must be a complex four-component vector:

Ψ = (ψ1, ψ2, ψ3, ψ4)ᵀ. (10.43)
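The algebra (10.41) for the representation (10.42) can be verified directly; a minimal numpy sketch:

```python
import numpy as np

# Verify the Clifford algebra (10.41) for the block representation (10.42).
I2, Z = np.eye(2), np.zeros((2, 2))
tau = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]

g0 = np.block([[I2, Z], [Z, -I2]]).astype(complex)
gi = [np.block([[Z, t], [-t, Z]]) for t in tau]

anti = lambda a, b: a @ b + b @ a
assert np.allclose(g0 @ g0, np.eye(4))                       # (g0)^2 = 1
for i in range(3):
    assert np.allclose(anti(g0, gi[i]), 0)                   # {g0, gi} = 0
    for j in range(3):
        assert np.allclose(anti(gi[i], gi[j]), -2 * (i == j) * np.eye(4))
```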
x0 = ct , x1 = x , x2 = y , x3 = z (10.44)
so that ds² = η_µν dx^µ dx^ν, where

η_µν = diag(−1, 1, 1, 1) (10.46)
and repeated indices are summed over. Note that in such sums one index is always up
and one down. This ensures that any quantity that has no left-over indices is a Lorentz
invariant. It is also helpful to introduce the matrix inverse to ηµν which is denoted by
η^µν = (η_µν)⁻¹ = diag(−1, 1, 1, 1). (10.47)
10.5 Spinors
What about Ψ? It now has four components, so we could introduce an index Ψ_α, α = 1, 2, 3, 4, in which case the γ-matrices also pick up indices:

γ^µ = (γ^µ)^α_β (10.49)

and the relations (10.41) become

{γ^µ, γ^ν} = −2η^{µν}.

This is called a Clifford algebra². Ψ is called a spinor and the α, β indices spinor indices. This is a new kind of object for us, so let's see the consequences.
²Clifford was a student at King's in the 1800s.
x′µ = Λµ ν xν (10.51)
In Relativity we adopt a notation where contracting an index with ηµν lowers it: ω µ ρ ηµσ =
ωρσ . Similarly one can raise an index by contracting with η µν . Thus the Lie-algebra
consists of anti-symmetric ωρσ :
These are just 4 × 4 matrices with a single ±1 somewhere in the upper triangle and a ∓1 in the corresponding lower triangle to make it anti-symmetric. For example:

(M_{12})^ρ_σ = δ^ρ_1 η_{2σ} − δ^ρ_2 η_{1σ} = ( 0  0  0  0
                                               0  0 −1  0
                                               0  1  0  0
                                               0  0  0  0 ) (10.58)
[Mµν , Mλρ ] = Mµρ ηνλ − Mνρ ηµλ + Mνλ ηµρ − Mµλ ηνρ (10.59)
One can check that the matrices Σ_{µν} ≡ (1/2)γ_{µν}, with γ_{µν} ≡ (1/2)[γ_µ, γ_ν], also form a representation of so(1, 3), i.e. satisfy (10.59) (see the problem set). This representation is also four-dimensional, because the spinor indices have the range α, β = 1, 2, 3, 4, but this time it is four complex dimensions and it acts on Ψ. This is called the spinor representation of so(1, 3).³
Thus under an infinitesimal Lorentz transformation we have
S −1 Λµ ν γ ν S = γ µ (10.66)
or infinitesimally
−(1/2) ω^{ρσ} Σ_{ρσ} γ^µ + ω^µ_ν γ^ν + (1/2) γ^µ ω^{ρσ} Σ_{ρσ} = 0
=⇒ ω^{ρσ} ( −(1/4) γ_{ρσ} γ^µ + (1/2) δ^µ_ρ γ_σ − (1/2) δ^µ_σ γ_ρ + (1/4) γ^µ γ_{ρσ} ) = 0 (10.67)
which boils down to the identity

[γ^µ, γ_{ρσ}] = 2δ^µ_σ γ_ρ − 2δ^µ_ρ γ_σ.

This is in fact true, as you can check. But for now just try some cases: if µ, ρ, σ are all distinct then γ^µ commutes with γ_{ρσ} and the left hand side vanishes. So does the right
³Things are a bit confusing in four dimensions as many different indices are four-dimensional: x^µ and Ψ_α. This is another coincidence. For example, in ten dimensions x^µ would have µ = 0, 1, 2, ..., 9 but α = 1, 2, 3, ..., 32.
hand side. If µ = σ and ρ ̸= σ then the two terms on the left hand side will add to give
a factor 2 which agrees with the right hand side.
As an example let’s consider a rotation in the x − y plane by an angle θ: ω12 = θ.
On the coordinates xµ we have
x′^µ = ( e^{(1/2) ω_{λρ} M^{λρ}} )^µ_ν x^ν = ( e^{θ M_{12}} )^µ_ν x^ν

=⇒ ( ct′, x′, y′, z′ )ᵀ = ( 1   0      0     0
                            0  cos θ −sin θ  0
                            0  sin θ  cos θ  0
                            0   0      0     1 ) ( ct, x, y, z )ᵀ (10.69)
On spinors the same rotation acts as

Ψ′ = e^{(1/2)ω_{ρσ}Σ^{ρσ}} Ψ = e^{(θ/2) γ_1 γ_2} Ψ = ( cos(θ/2) + γ_1 γ_2 sin(θ/2) ) Ψ ,

so for θ = 2π we find

Ψ′ = −Ψ. (10.71)

Thus under a rotation of 2π spinors come back with a minus sign. This is very important.
It turns out that this means particles described by spinors must have wavefunctions which change sign under swapping any two particles. Such particles are called Fermions (particles that don't pick up such a sign, such as photons, are called Bosons). This in turn implies the Pauli exclusion principle: no two spinor particles can be in the same state. This is what makes matter hard: when you add electrons to a Hydrogen atom to build heavier elements they must fill out different energy levels.
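The sign under a 2π rotation can be checked explicitly by exponentiating the rotation generator; the sketch below uses the γ-matrices of (10.42) and assumes scipy for the matrix exponential:

```python
import numpy as np
from scipy.linalg import expm   # assumes scipy is available

# A 2*pi rotation in the x-y plane acts on spinors as exp(pi * g1 g2) = -1.
Z = np.zeros((2, 2))
tau1 = np.array([[0, 1], [1, 0]], dtype=complex)
tau2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
g1 = np.block([[Z, tau1], [-tau1, Z]])
g2 = np.block([[Z, tau2], [-tau2, Z]])

theta = 2 * np.pi
S_rot = expm(0.5 * theta * (g1 @ g2))   # spinor rotation exp((theta/2) g1 g2)
assert np.allclose(S_rot, -np.eye(4))

# a rotation by 4*pi returns to the identity
assert np.allclose(expm(2 * np.pi * (g1 @ g2)), np.eye(4))
```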
How can we construct a Lorentz scalar from a spinor? For a vector we form the contraction, e.g. ds² = η_µν dx^µ dx^ν = dx_µ dx^µ. For a spinor we need to consider something like

Ψ†₁ C Ψ₂

for some spinor matrix C. Now under a Lorentz transformation δΨ_{1,2} = (1/4) ω_{µν} γ^{µν} Ψ_{1,2}
and hence
δΨ†₁ = (1/4) Ψ†₁ ω_{µν} (γ^{µν})† = (1/4) Ψ†₁ ω_{µν} (γ^ν)† (γ^µ)† (10.73)
Now it is easy to check that
(γ^µ)† = γ⁰ γ^µ γ⁰ =⇒ (γ^ν)† (γ^µ)† = γ⁰ γ^ν γ^µ γ⁰ (10.74)
thus
δΨ†₁ = (1/4) Ψ†₁ ω_{µν} γ⁰ γ^{νµ} γ⁰ = −(1/4) Ψ†₁ ω_{µν} γ⁰ γ^{µν} γ⁰ (10.75)
What we need is
0 = δ(Ψ†₁ C Ψ₂) = δΨ†₁ C Ψ₂ + Ψ†₁ C δΨ₂ = (1/4) Ψ†₁ ω_{µν} ( −γ⁰ γ^{µν} γ⁰ C + C γ^{µν} ) Ψ₂ (10.76)
Thus we can simply take C = γ 0 . This defines the Dirac conjugate:
Ψ̄ = Ψ† γ 0 (10.77)
and hence Ψ̄₁Ψ₂ is a Lorentz scalar. Thus C = γ⁰ plays the same role for spinor indices that η^{µν} plays for spacetime indices. If we put indices on it we have C^{αβ} = (γ⁰)^α_β.
A final comment is that Ψ is called a Dirac spinor. This is not an irreducible representation of so(1, 3). One can impose the constraint

γ₅ Ψ = γ⁰γ¹γ²γ³ Ψ = ±Ψ (10.78)
The eigenstates of γ5 are known as Weyl spinors and are often called left or right handed
spinors. Given the form of the γ-matrices above we have
γ₅ = ( 0  I
       I  0 ) (10.79)

and its ±1 eigenspinors take the form (χ, ±χ), where χ is a complex 2-vector; the corresponding eigenstates Ψ_{L/R} are the left and right handed spinors. These form irreducible representations of so(1, 3) and each is a complex 2-vector in spinor space.
So we still have negative energy states! But we do have a positive definite density.
Indeed we can identify a 4-current
J µ = Ψ̄γ µ Ψ (10.82)
∂_µ J^µ = ∂_µ Ψ̄ γ^µ Ψ + Ψ̄ γ^µ ∂_µ Ψ = i(mc/ℏ) Ψ̄Ψ − i(mc/ℏ) Ψ̄Ψ = 0 (10.83)
as you can show in the problem set, along with showing that Ψ̄γ^µΨ is a Lorentz vector. This means that the time component can be used as a conserved quantity:
(1/c) d/dt ∫ J⁰ d³x = ∫ ∂₀ J⁰ d³x = −∫ ∂_i J^i d³x = 0 (10.85)
And furthermore
J 0 = Ψ̄γ 0 Ψ = Ψ† Ψ (10.86)
is positive definite.
So we’ve discovered something beautiful and new but haven’t solved all our problems.
We still have negative energy states and not all particles are described by the Dirac
equation: just Fermions such as electrons and quarks. Dirac’s solution to the negative
energy states was to use the Pauli exclusion principle to argue that the ground state
has all negative energy states filled. This however implies that you should be able to
knock a state with negative energy out of the vacuum. Such an excitation has the same
energy (mass) but opposite charge to a positive energy state. In so doing he predicted anti-particles, which were subsequently observed. We now know that the Higgs boson (as with other Bosons) satisfies a Klein-Gordon-like equation and that these too have
anti-particles. The full solution comes by allowing the number of particles to change,
i.e. one allows for the creation and annihilation of particles, and leads to quantum field
theory. But that’s another module....
Chapter 11

The Feynman Path Integral
Our last topic is to present another way of formulating quantum mechanics known as
the path integral formulation. It is perhaps the most popular way as one can incorporate
symmetries such as Lorentz invariance from the outset. It also allows us to make contact
with the principle of least action and hence the Classical world.
Let us start by considering a general quantum mechanical system with coordinate q̂ and its conjugate momentum p̂. The position eigenstates q̂|a⟩ = a|a⟩ are normalized so that

⟨a1|a2⟩ = δ(a2 − a1),

while the momentum eigenstates p̂|k⟩ = k|k⟩ have wavefunctions ψ_k(q) = e^{ikq/ℏ}, so that

⟨k1|k2⟩ = ∫ ψ_{k1}(q)* ψ_{k2}(q) dq = ∫ e^{i(k2−k1)q/ℏ} dq = 2πℏ δ(k2 − k1). (11.6)
In a suitable sense (don't ask too many questions) the |a⟩ form a basis. We can write the identity operator as

I = ∫ da |a⟩⟨a| (11.7)

and similarly, using (11.6),

I = ∫ (dk/2πℏ) |k⟩⟨k|. (11.8)
K(a_f, t_f; a_0, t_0) = ⟨a_f| I e^{iĤ(t_f − t_{N−1})/ℏ} I e^{iĤ(t_{N−1} − t_{N−2})/ℏ} · · · I e^{iĤ(t_2 − t_1)/ℏ} I e^{iĤ(t_1 − t_0)/ℏ} |a_0⟩ (11.16)
Next we insert the momentum-space identity (11.8) alongside each occurrence of I:

K(a_f, t_f; a_0, t_0) = ⟨a_f| ∫ (dk_N/2πℏ) |k_N⟩ ⟨k_N| e^{iĤ(t_N − t_{N−1})/ℏ} I ∫ (dk_{N−1}/2πℏ) |k_{N−1}⟩ ⟨k_{N−1}| e^{iĤ(t_{N−1} − t_{N−2})/ℏ} · · · |a_0⟩ (11.18)
where a_{i+1} − a_i = ȧ(t_{i+1} − t_i) = ȧ δt, with a dot denoting differentiation with respect to time. Thus we find an example of a path-integral:
K(a_f, t_f; a_0, t_0) = ∫ [dk/2πℏ][da] e^{(i/ℏ) ∫ ( k(t)ȧ(t) + H(k(t), a(t)) ) dt} (11.21)
where k(t) and a(t) are arbitrary paths such that a(t_0) = a_0 and a(t_f) = a_f. This is an infinite dimensional integral: we are integrating over all paths k(t) and a(t). It is not well understood mathematically, but it is very powerful in physics.
where N is an infinite but irrelevant constant that we can absorb. Again we recognise
the exponent as an integral
lim_{N→∞} Σ_i [ m(a_{i+1} − a_i)²/(2(δt)²) − V(a_i) ] δt = ∫_{t1}^{t2} ( (1/2) m ȧ² − V(a) ) dt = S[a] (11.25)
$$ = \int [da]\, e^{-\frac{i}{\hbar} S[a(t)]} \qquad (11.26) $$
This is known as the Schrödinger propagator. You can check that, for $t_2 \neq t_1$,
$$ i\hbar \frac{\partial K_S}{\partial t_2} = -\frac{\hbar^2}{2m} \frac{\partial^2 K_S}{\partial a_2^2}\,. \qquad (11.31) $$
A more careful treatment reveals that
$$ i\hbar \frac{\partial K_S}{\partial t_2} + \frac{\hbar^2}{2m} \frac{\partial^2 K_S}{\partial a_2^2} = i\hbar\, \delta(t_2 - t_1)\, \delta(a_2 - a_1)\,. \qquad (11.32) $$
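For the free particle ($V = 0$) the propagator takes the standard Gaussian form, and (11.31) can be checked symbolically. The sketch below uses sympy, takes the textbook form of the free propagator as given, and works in the relative variables $a = a_2 - a_1$, $t = t_2 - t_1 > 0$:

```python
import sympy as sp

# Free-particle propagator K_S ~ t^{-1/2} exp(i m a^2 / (2 hbar t)),
# in relative variables a = a2 - a1 and t = t2 - t1.
a = sp.symbols('a', real=True)
t, m, hbar = sp.symbols('t m hbar', positive=True)
K = sp.sqrt(m / (2 * sp.pi * sp.I * hbar * t)) * sp.exp(sp.I * m * a**2 / (2 * hbar * t))

# Check (11.31): i*hbar dK/dt + (hbar^2/2m) d^2K/da^2 should vanish for t != 0.
residual = sp.I * hbar * sp.diff(K, t) + hbar**2 / (2 * m) * sp.diff(K, a, 2)
print(sp.simplify(residual))  # 0
```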
Thus we see that in the Path Integral formulation of Quantum Mechanics the amplitude for a particle to go from $q = a_1$ at $t = t_1$ to $q = a_2$ at $t = t_2$ is given by an infinite-dimensional integral over all possible paths, classical or not, that start at $a_1$ and end at $a_2$. Each path is then given a weight $e^{-iS/\hbar}$.
Here we see the Lagrangian of classical mechanics appearing. As you may recall from classical mechanics, the action, as determined by the Lagrangian, was a mysterious object with no physical interpretation, yet it controlled everything (symmetries, equations of motion, ...). Here, however, it arises as the logarithm of the weight that appears in the path integral. Each path contributes a weight that is a pure phase, so they all contribute equally in magnitude. However there can be interference. The dominant contributions then correspond to those where the interference is the weakest. These correspond to paths where $\delta S = 0$, i.e. extrema. Thus the classical paths give the dominant contribution to the path integral, with other paths cancelling out due to the rapid oscillatory nature of the path integral for 'small' $\hbar$. This explains the principle of least action.
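The stationary-phase mechanism can be seen numerically in a one-dimensional toy model: for a toy "action" $S(x) = x^3/3 - x$ with stationary points at $x = \pm 1$, the oscillatory integral $\int e^{iS(x)/\hbar}\, dx$ is, for small $\hbar$, well approximated by the sum over those two points alone. The toy action, integration range and value of $\hbar$ below are illustrative choices:

```python
import numpy as np

hbar = 0.05

def S(x):
    # toy "action" with S'(x) = x^2 - 1, stationary at x = +1 and x = -1
    return x**3 / 3 - x

# Brute-force (trapezoidal) evaluation of the oscillatory integral.
x = np.linspace(-6.0, 6.0, 2_000_001)
f = np.exp(1j * S(x) / hbar)
dx = x[1] - x[0]
numeric = (0.5 * f[0] + 0.5 * f[-1] + f[1:-1].sum()) * dx

# Stationary-phase estimate: sum over critical points x_c of
# sqrt(2*pi*hbar/|S''(x_c)|) * exp(i S(x_c)/hbar + i*(pi/4)*sign(S''(x_c)))
# with S''(x) = 2x here.
est = sum(
    np.sqrt(2 * np.pi * hbar / abs(2 * xc))
    * np.exp(1j * (S(xc) / hbar + np.sign(2 * xc) * np.pi / 4))
    for xc in (-1.0, 1.0)
)
print(abs(numeric - est))  # small compared to |numeric|
```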
Furthermore we see that we can marry Quantum Mechanics with Special Relativ-
ity by simply taking an action which is invariant under Lorentz transformations. For
example it is not too hard to see that the Klein-Gordon equation arises from the action
$$ S_{KG} = \int \Big( c\, \partial_\mu \Psi^* \partial^\mu \Psi - \frac{m^2 c^3}{\hbar^2}\, \Psi^* \Psi \Big)\, d^4x \qquad (11.33) $$
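Indeed, varying $\Psi^*$ in this action (a standard integration by parts, dropping boundary terms) gives back the Klein-Gordon equation:

```latex
\delta S_{KG} = \int \left( c\,\partial_\mu \delta\Psi^{*}\, \partial^\mu \Psi
              - \frac{m^{2}c^{3}}{\hbar^{2}}\, \delta\Psi^{*}\, \Psi \right) d^4x
            = -\int \delta\Psi^{*} \left( c\,\partial_\mu \partial^\mu \Psi
              + \frac{m^{2}c^{3}}{\hbar^{2}}\, \Psi \right) d^4x \,,
```

so demanding $\delta S_{KG} = 0$ for arbitrary $\delta\Psi^*$ gives $\partial_\mu \partial^\mu \Psi + \frac{m^2 c^2}{\hbar^2} \Psi = 0$.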
This object is similar to the partition function that we saw above, but at imaginary temperature $\beta = -i/\hbar$ and weighted by the action rather than the energy. In fact the relation between the partition function of a quantum theory, as defined by (11.34), and the partition function we saw in our discussion of thermal statistical physics is quite deep.
Thus
$$ S = -\frac{1}{2} \sum_{n,m} q_n q_m \int e_n(t)\, E\, e_m(t)\, dt \qquad (11.38) $$
Let us choose en (t) to be eigenstates of E with eigenvalues λn :
Let us choose $e_n(t)$ to be eigenstates of $E$ with eigenvalues $\lambda_n$:
$$ S = -\frac{1}{2} \sum_{n,m} q_n q_m \int e_n(t)\, \lambda_m\, e_m(t)\, dt = -\frac{1}{2} \sum_n \lambda_n q_n^2 \qquad (11.39) $$
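In finite dimensions this diagonalisation is easy to check explicitly. The sketch below verifies, for a random symmetric matrix $E$ playing the role of the operator, that $q \cdot E q = \sum_n \lambda_n q_n^2$ once $q$ is expanded in eigenvectors (all matrix and vector values are illustrative):

```python
import numpy as np

# Finite-dimensional analogue of (11.38)-(11.39): expanding q in the
# orthonormal eigenvectors of a symmetric E turns q.E.q into sum_n lambda_n q_n^2.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
E = A + A.T                         # symmetric, so eigh gives orthonormal eigenvectors
lam, vecs = np.linalg.eigh(E)

q = rng.standard_normal(5)
qn = vecs.T @ q                     # coefficients of q in the eigenbasis
print(q @ E @ q, np.sum(lam * qn**2))  # the two agree
```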
Thus
$$ \int [dq]\, e^{-\frac{i}{\hbar} S[q]} = \int \prod_n dq_n\, e^{\frac{i}{2\hbar} \sum_n \lambda_n q_n^2} = \prod_n \int dq_n\, e^{\frac{i}{2\hbar} \lambda_n q_n^2} = \prod_n \sqrt{\frac{2\pi i \hbar}{\lambda_n}} = \frac{\mathcal{N}}{\sqrt{\det E}}\,, \qquad \det E = \prod_n \lambda_n \qquad (11.40) $$
where the determinant is the infinite product of eigenvalues of the operator $E_{ab}$. More generally still, if one has
$$ S = -\int \Big( \frac{1}{2} q_a E_{ab} q_b + J_a q_a \Big)\, dt \qquad (11.43) $$
we find
$$ \int \prod_a [dq_a]\, e^{-\frac{i}{\hbar} S[q_1,\dots,q_n]} = \frac{\mathcal{N}}{\sqrt{\det E_{ab}}}\, e^{-\frac{i}{2\hbar} \int dt\, J_a E^{-1}_{ab} J_b} \qquad (11.44) $$
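The structure of this formula is easy to verify in low dimensions if one works with the convergent Euclidean analogue (replacing the oscillatory $-i/\hbar$ weight by a damping one, footnote 1 notwithstanding): $\int d^n q\, e^{-\frac{1}{2} q \cdot E q - J \cdot q} = (2\pi)^{n/2} (\det E)^{-1/2}\, e^{\frac{1}{2} J \cdot E^{-1} J}$ for positive definite $E$. A two-dimensional numerical check, with illustrative matrix and source values:

```python
import numpy as np
from scipy.integrate import nquad

# Euclidean (convergent) analogue of (11.44) in n = 2 dimensions.
E = np.array([[2.0, 0.5],
              [0.5, 1.0]])          # positive definite
J = np.array([0.3, -0.7])

def integrand(q0, q1):
    q = np.array([q0, q1])
    return np.exp(-0.5 * q @ E @ q - J @ q)

# The Gaussian is sharply localised, so a finite box suffices.
numeric, _ = nquad(integrand, [[-10, 10], [-10, 10]])

exact = (2 * np.pi) ** (len(J) / 2) / np.sqrt(np.linalg.det(E)) \
        * np.exp(0.5 * J @ np.linalg.inv(E) @ J)
print(numeric, exact)  # agree to quadrature accuracy
```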
$^1$Some of you might be worrying that these integrals are not well defined and that this is just hocus-pocus. Don't: just relax and don't ask too many questions.