
Advanced Quantum Mechanics

6ccm436a/7ccmms31

Ana Raclariu∗

Department of Mathematics, King’s College London


The Strand, London WC2R 2LS, UK

N.B. These notes are not particularly original but rather based on the lecture notes of
Profs. N. Lambert, N. Drukker and G. Watts.


email: ana-maria.raclariu@kcl.ac.uk
Contents

0.1 Introduction
0.2 Plan

1 Review: Elements of Quantum Mechanics
  1.1 Hilbert Spaces
  1.2 Operators, Observables and Measurements
  1.3 Aside: Canonical Quantisation
  1.4 Time Evolution
  1.5 Example: The Quantum Harmonic Oscillator

2 Angular Momentum
  2.1 Transformations and Unitary Operators
    2.1.1 Transformation of states
    2.1.2 Transformation of operators
    2.1.3 Examples
    2.1.4 Heisenberg vs. Schrödinger pictures, revisited
  2.2 Symmetries
    2.2.1 Aside: relation to classical mechanics
  2.3 A Spherically Symmetric Potential
  2.4 Rotations and Angular Momentum Operators

3 All about su(2)
  3.1 The Stern-Gerlach Experiment
  3.2 su(2) Lie algebras and representations
    3.2.1 Construction of irreducible unitary su(2) representations
  3.3 Lie groups vs. algebras

4 Back to Angular Momentum
  4.1 Differential Operators vs Irreducible Representations
  4.2 Addition of Angular Momentum
  4.3 An application of spin: Magnetic Resonance Imaging (MRI)

5 The Hydrogen Atom
  5.1 The Quantum 2-Body Problem
  5.2 The Hydrogen Atom

6 Time Independent Perturbation Theory
  6.1 A Simple Example
  6.2 First Order Non-Degenerate Perturbation Theory
  6.3 Second Order Non-Degenerate Perturbation Theory
  6.4 Application: Fine structure of Hydrogen
    6.4.1 Relativistic correction to KE
    6.4.2 Spin-orbit coupling
  6.5 Degenerate Perturbation Theory

7 Time Dependent Perturbation Theory
  7.1 Fermi Golden Rule
    7.1.1 Absorption and stimulated emission

8 Semi-Classical Quantization: WKB
  8.1 WKB
  8.2 Particle in a Box
  8.3 Turning Points, Airy and Bohr-Sommerfeld Quantization
  8.4 Harmonic Oscillator

9 Entanglement, Density Matrices and Thermal States
  9.1 Entanglement and Bell's Inequality
  9.2 Density Matrices
  9.3 Thermal States
  9.4 Entropy
  9.5 More on entanglement

10 Relativistic Quantum Mechanics
  10.1 Schrödinger equation for a charged particle
    10.1.1 Aharonov-Bohm effect
  10.2 Relativistic quantum mechanics
  10.3 Relativistic Wave Equations
  10.4 Relativistic Notation
  10.5 Spinors
  10.6 Back to Quantum Mechanics

11 The Feynman Path Integral
  11.1 Gaussian Path Integrals
  11.2 Computing Determinants

0.1 Introduction
Quantum Mechanics arguably presents our most profound and revolutionary under-
standing of objective reality. In the classical world, as studied for example in “Classical
Dynamics” (5ccm231a) the physically observable world consists of knowing the positions
and momenta (velocities) of every single particle at a moment in time. The dynamical
equations can then be used to evolve the system so as to know the positions and mo-
menta in the future leading to a causal and highly predictive theory. You could forgive
the physicists of the 19th century for thinking that they were almost done. Everything
seemed to work perfectly and there were just a few loose strings to get under control.
It turned out that by pulling on those loose strings the whole structure of classical
physics and our understanding of objective reality fell apart. We now have a more
predictive and precise theory in the form of Quantum Mechanics. But no one has got
to grips with what it truly means.
The classic experiment, although there are many, that led to the unravelling of
classical mechanics relates to the question of whether light is a particle or a wave.
If you take a coherent light source, send a beam of light through two small slits and
see what happens, you will find an interference pattern. This is what you would expect
if light were a wave and the two slits produced two wave sources that could add
constructively or cancel destructively. Thus an interference pattern is observed.
Fine, light is a wave. Let's do the same with electrons. We find the same thing!
Shooting an electron beam at a double slit also leads to an interference pattern.
So electrons, which otherwise appear as point-like particles with fixed electric charges,
also behave like waves (and conversely there are experiments, such as the photoelectric
effect, which show that light behaves like a particle). This interference pattern even
persists if you slow the beam down so that one electron at a time is released.
What has happened? Nothing that has a classical interpretation. We can think of
electrons as waves with profile ψ. After passing through the slits we have two
wavefunctions ψhole1 and ψhole2. Quantum mechanics tells us that the system is described by
ψhole1 + ψhole2, and in this case the two wave profiles can constructively or destructively
interfere (i.e. add or cancel out). Classically there isn't really an explanation, but
one would expect the two electron beams coming out from each of the slits to behave
independently. Experiment tells us which is true (see figure 0.1.1). Needless to say there
have been countless experiments since which confirm the quantum picture and refute
any classical explanation (most notably via Bell's inequalities).
There are further revelations that we won't get to speak much of. Relativity
came and told us that time is not absolute. Combining this with Quantum Mechanics
means that there really aren't any particles at all, just local excitations of fields that
permeate spacetime. Why, after all, is every electron identical to every other? This is
the topic of Quantum Field Theory. In any case, exploring Quantum Mechanics beyond an
introductory module such as "Introductory Quantum Theory" (6cmm332c) is crucial
for any physicist.

Figure 0.1.1: After 100, 200, 500 and 1000 electrons hit the screen the interference
picture is formed (on top). Compare this to the probability distribution computed from
the wave function of the electron (bottom). In particular the quantum probability (left)
is obtained from |ψhole1 + ψhole2 |2 vs. the classical sum of probabilities |ψhole1 |2 + |ψhole2 |2
(right).
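The contrast described in the caption can be reproduced in a few lines. The sketch below is not from the notes: it models the two slits as point sources with made-up geometry and units, and compares the quantum intensity |ψhole1 + ψhole2|² with the classical sum |ψhole1|² + |ψhole2|². The difference is the oscillating cross term that produces the fringes.

```python
import numpy as np

# Toy model: two slits at x = +/- d/2 act as point sources; each contributes
# exp(i k r_i)/sqrt(r_i) at a point on a distant screen.  (Units arbitrary.)
k, d, L = 2 * np.pi, 5.0, 100.0     # wavenumber, slit separation, screen distance
x = np.linspace(-10, 10, 2001)      # coordinate along the screen
r1 = np.sqrt(L**2 + (x - d / 2) ** 2)
r2 = np.sqrt(L**2 + (x + d / 2) ** 2)
psi1 = np.exp(1j * k * r1) / np.sqrt(r1)
psi2 = np.exp(1j * k * r2) / np.sqrt(r2)

quantum = np.abs(psi1 + psi2) ** 2                  # shows fringes
classical = np.abs(psi1) ** 2 + np.abs(psi2) ** 2   # smooth envelope, no fringes

# The fringes come entirely from the cross term 2 Re(psi1* psi2):
cross = quantum - classical
print(cross.min() < 0 < cross.max())   # True: the cross term oscillates through zero
```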

0.2 Plan

We hope to cover each chapter in a week. The main lectures, where new material is
presented, are on Tuesdays 12-2pm and Wednesdays 12-1pm. The Tuesday lectures
will focus more on foundations, while in the Wednesday lecture we will aim to cover
examples and sometimes real world applications of the material.
In addition there will be weekly problem sets found on Keats. You are strongly
urged to do the problems, which are essential to understanding the material. There
are weekly tutorials on Mondays at 9am which will discuss the problem set from the
previous week. Each week you will find a questionnaire on Keats asking you to indicate
one topic that you understood well and one topic that you are struggling with. The
answers will be anonymous and I strongly encourage you to reflect on these questions
after the Wednesday class each week and submit your response. This will allow us to

identify the confusing points and focus on clarifying them during the tutorials.
There is a reading week starting Oct 28 where there will not be any lectures, discus-
sion sessions or tutorials.
Chapter 1

Review: Elements of Quantum


Mechanics

Let’s jump head first into the formulation of Quantum Mechanics. In this section we
will just consider one-dimensional systems, meaning one spatial dimension and time.
Conceptually the extension to higher dimensions is easy but more technically involved
and is the subject of much of this module. One can also consider quantum systems with
no spatial dimension. We will consider these later. This chapter is expected to be a
review.
Quantum mechanics describes physical systems through a state ψ in a complex vector
space evolving according to the Schrodinger equation

iℏ ∂ψ/∂t = Ĥψ.  (1.1)

Here Ĥ is the Hamiltonian operator and carries information about the energy of the
system. For a non-relativistic particle in one spatial dimension subject to a potential
V (x), (1.1) becomes

iℏ ∂ψ(t, x)/∂t = −(ℏ²/2m) ∂²ψ(t, x)/∂x² + V̂(x)ψ(t, x).  (1.2)
We can compare and contrast this to classical mechanics, where in closed systems
(ie. systems that are not acted upon by any forces) energy E is conserved. This energy
depends on the details of the system, but typically consists of a kinetic and a potential
component. For a one-particle system of mass m and momentum p subject to a potential
V (x), the energy is given by
E = p²/2m + V(x)  (1.3)
For a classical system, momenta p and positions x are entirely determined by the equa-
tions of motion and are constrained to obey (1.3) for definite E. (1.2) takes a form very
similar to (1.3) subject to the replacements

E ↔ iℏ ∂/∂t,  p ↔ −iℏ ∂/∂x,  V ↔ V̂.  (1.4)

In other words, the numbers in (1.3) are replaced by operators acting on a wavefunction
ψ(x, t)!
More generally, (1.1) tells us that, at microscopic scales, the energy of the system
cannot be determined with certainty: the system may be found in a superposition of
energy eigenstates ψn,¹

ψ(t, x) = ∑n cn ψn(t, x),  (1.5)
where ψn are defined by
En ψn = −(ℏ²/2m) ∂²ψn/∂x² + V̂(x)ψn.  (1.6)
Unlike in classical mechanics, upon measurement, the system is found to have energy
En with probability
pn ≡ |cn|² ≡ cn cn*.  (1.7)

Demanding that ψ is normalized ensures that ∑n pn = 1. Quantum mechanics is
fundamentally probabilistic in its predictions! One may of course choose to measure
other observables of the system, such as position or momentum. We will review these
examples in section 1.2 below.
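As a concrete numerical illustration of (1.5)-(1.7), with hypothetical coefficients and energies (not an example from the notes):

```python
import numpy as np

# Hypothetical expansion coefficients c_n of a state psi = sum_n c_n psi_n:
c = np.array([0.5, 0.5j, np.sqrt(0.5)])
p = np.abs(c) ** 2                  # Born rule (1.7): p_n = |c_n|^2
assert np.isclose(p.sum(), 1.0)     # unit norm of psi forces sum_n p_n = 1

# With hypothetical energies E_n, a measurement returns E_n with probability p_n;
# averaging over many runs gives the expectation value sum_n E_n p_n:
E = np.array([1.0, 2.0, 3.0])
print(np.round(E @ p, 10))          # 2.25 for these numbers
```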
This all can be formalized by introducing the notion of a Hilbert space H and
Hermitian operators Ô acting on H. As we will see in sections 1.1, 1.2, Ô correspond
to the physical observables. An example of an operator above is the Hamiltonian Ĥ. In
general

• the eigenvectors of Ô provide a basis in which one can expand ψ ∈ H

• the eigenvalues of Ô correspond to the possible observed outcomes of a measurement of Ô (i.e. the measurement is performed with respect to the basis of eigenvectors of Ô)

• the wavefunction or state ψ determines the probabilities with which these outcomes will be observed

Note: the probabilities of the outcomes should not be confused with the outcomes
themselves. The probabilities are a property of the state ψ (cf. |cn |2 in (1.5), (1.7)):
given a state ψ, it admits a unique decomposition in the basis {ψn } with coefficients
{cn }. The measurement outcomes are a property of the observable Ô (cf. En in (1.6) if
Ô = Ĥ).
We will also see in section 1.4 that closed quantum systems evolve unitarily in time.

1.1 Hilbert Spaces


In Quantum Mechanics the state of a system is given by an element |ψ⟩ of unit norm in
a Hilbert space H.
¹We assume here that the energy spectrum of Ĥ is discrete.

Definition: A Hilbert space is a complete, complex vector space with positive definite
inner product

⟨ψ1 |ψ2 ⟩ : H × H → C. (1.8)

The inner product obeys the following properties:

• ⟨ψ1|ψ1⟩ ≡ |ψ1|² ≥ 0, with |ψ1|² = 0 ⇐⇒ |ψ1⟩ = 0

• ⟨ψ1 |ψ2 ⟩ = ⟨ψ2 |ψ1 ⟩∗

• ⟨ϕ|αψ1 + βψ2 ⟩ = α⟨ϕ|ψ1 ⟩ + β⟨ϕ|ψ2 ⟩, α, β ∈ C

Completeness is the condition that if ψn ∈ H is a Cauchy sequence, then limn→∞ ψn ∈ H.


A Hilbert space can be finite (e.g. a qubit) or infinite dimensional (e.g. a particle confined
to a box). For infinite-dimensional Hilbert spaces, we will sometimes use the terminology
state and wavefunction interchangeably (see later).
The statement that the state of the system is an element of a Hilbert space is already
profoundly weird as vector spaces allow you to add vectors. Thus if |ψ1 ⟩ and |ψ2 ⟩ are
two orthogonal states then
|ψ⟩ = (1/√2)(|ψ1⟩ + |ψ2⟩)  (1.9)
is also a state. There is no classical analogue of this.
Definition The space of linear maps from H to C is called the dual space H∗ .
Here we are using Dirac notation where we denote vectors by “kets” |ψ⟩. Elements
of the dual space are denoted by “bras” ⟨ψ| such that

⟨ψ1 |(|ψ2 ⟩) = ⟨ψ1 |ψ2 ⟩. (1.10)

There is a theorem which states that H∗ is also a Hilbert space and is isomorphic to
H. This means that for each |ψ⟩ there is a unique ⟨ψ| and vice versa.

1.2 Operators, Observables and Measurements


Hilbert spaces admit linear maps (the analogue of matrices in finite dimensions)

O:H→H (1.11)

such that O(λ1 |ψ1 ⟩ + λ2 |ψ2 ⟩) = λ1 O|ψ1 ⟩ + λ2 O|ψ2 ⟩ where λ1 , λ2 ∈ C.


We can use the inner product to define the notion of an adjoint operator O† .
Definition The adjoint of a linear map O is a linear map O† : H → H that satisfies

⟨O† ψ1 |ψ2 ⟩ = ⟨ψ1 |Oψ2 ⟩ (1.12)

for any two vectors |ψ1 ⟩, |ψ2 ⟩ ∈ H.



In Dirac notation, given a basis |en⟩ for H, the linear maps O admit the decomposition

O = ∑mn Omn |em⟩⟨en|  (1.13)

where Omn ∈ C. It is easy to see that the adjoint is then

O† = ∑mn O†mn |em⟩⟨en|,  (1.14)

where, just as in finite dimensions, O†mn = O*nm.
In addition to a Hilbert space of states, Quantum Mechanics exploits the fact that
there exists a preferred class of linear maps which are self-adjoint (aka Hermitian in the
case when H is finite dimensional) in the sense that

⟨ψ1|Oψ2⟩ = ⟨Oψ1|ψ2⟩ ⇐⇒ O† = O

for all vectors |ψ1⟩ and |ψ2⟩. Such maps are called observables and they lead to an
important

Theorem: A (compact) self-adjoint operator admits an orthonormal basis of eigenvectors with real eigenvalues.

N.B. An eigenvector |ψ⟩ ∈ H and corresponding eigenvalue λ ∈ C of an operator O


satisfy O|ψ⟩ = λ|ψ⟩.
We won’t prove this theorem here, in other words, we won’t prove existence of
the basis. However we can at least prove the following statement: eigenstates of self-
adjoint operators have real eigenvalues and eigenvectors with distinct eigenvalues are
orthogonal. To see this we consider two eigenvectors |ψ1 ⟩ and |ψ2 ⟩ with eigenvalues λ1
and λ2 respectively. Then we can evaluate

⟨ψ1|Oψ2⟩ = λ2 ⟨ψ1|ψ2⟩
⟨Oψ1|ψ2⟩ = λ1* ⟨ψ1|ψ2⟩  (1.15)

If O is self-adjoint then these two expressions are equal:

λ2 ⟨ψ1|ψ2⟩ = λ1* ⟨ψ1|ψ2⟩  (1.16)

First we suppose that |ψ1⟩ = |ψ2⟩ and hence λ2 = λ1. Then, since |ψ1⟩ must
have non-zero norm, λ1 = λ1*. Next we suppose that |ψ1⟩ ≠ |ψ2⟩. Since we now know
that λ1 and λ2 are real we have

(λ2 − λ1)⟨ψ1|ψ2⟩ = 0  (1.17)

Thus if λ2 ≠ λ1 then ⟨ψ1|ψ2⟩ = 0. If there are degeneracies, meaning that there are
many eigenstates with the same eigenvalue, then we can't conclude that two such
eigenstates are orthogonal without extra work; however one can always arrange to find
an orthogonal basis by a Gram-Schmidt process. On the other hand, we know immediately
that if two eigenstates of a self-adjoint operator have different eigenvalues then they
are orthogonal.
If an operator is self-adjoint then there is a choice of basis |en ⟩ so that (see problem
set 1)
O = ∑n λn |en⟩⟨en|  (1.18)

with λn ∈ R. In other words, the operator is diagonalizable and this is what a diagonal
operator looks like in Dirac notation.
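The content of the theorem and of (1.18) is easy to check numerically when H is finite dimensional. Below is a sketch using NumPy's Hermitian eigensolver `eigh`; the matrix is random, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
O = A + A.conj().T                      # any matrix plus its adjoint is Hermitian

lam, U = np.linalg.eigh(O)              # eigenvalues lam, eigenvectors as columns of U
assert np.allclose(lam.imag, 0)         # real eigenvalues
assert np.allclose(U.conj().T @ U, np.eye(4))   # orthonormal eigenbasis

# Spectral decomposition O = sum_n lam_n |e_n><e_n|, eq. (1.18):
O_rebuilt = sum(l * np.outer(U[:, n], U[:, n].conj()) for n, l in enumerate(lam))
assert np.allclose(O, O_rebuilt)
```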
How do we extract numbers that we can compare with the so-called real world
through experiment? The key idea of Quantum Mechanics is that observables represent
a physical measurement and the eigenvalues are the possible outcomes of such mea-
surements. We may define the expectation value of an operator O in a state |ψ⟩ to
be

⟨O⟩ = ⟨ψ|O|ψ⟩ (1.19)

The above theorem tells us that we can find an orthonormal basis of eigenvectors of
an observable O. Thus any state |ψ⟩ can be written as
|ψ⟩ = ∑n cn |ψn⟩  (1.20)

where O|ψn ⟩ = λn |ψn ⟩. Intuitively the physical idea is that in order to measure an
observable one must probe the state you are observing, e.g. to find the position of an
electron you must look for it which means hitting it with light which will then impart
momentum and change it. Thus the act of measurement changes the system and hence
one doesn’t know what the system is doing after the measurement. Eigenstates are
special as they are, in some sense, unaffected by a particular measurement (but not
others).
Looking at |ψ⟩ we can compute the expectation value of O to find
⟨ψ|Oψ⟩ = ∑m,n cm* cn ⟨ψm|Oψn⟩
       = ∑m,n λn cm* cn ⟨ψm|ψn⟩
       = ∑n λn cn* cn
       = ∑n λn pn  (1.21)

where we think of pn = c∗n cn as a probability distribution since the unit norm of |ψ⟩

implies that

1 = ⟨ψ|ψ⟩
  = ∑m,n cm* cn ⟨ψm|ψn⟩
  = ∑m,n cm* cn δmn
  = ∑n cn* cn
  = ∑n pn  (1.22)

with 0 ≤ pn ≤ 1. The interpretation of this equation is that pn = |cn|² is the probability
that the state will be observed to have the value λn in the measurement associated to
O. The expectation value is therefore the average value that repeated experiments will
obtain for measuring the observable O on the state |ψ⟩. It follows that if the state is an
eigenstate of an operator O then a measurement of O will produce the corresponding
eigenvalue with probability 1.
But this is not the general situation. Rather the system is not in an eigenstate
and so repeated measurements will produce different answers and hence there will be a
non-zero variance known as the uncertainty:

(∆O)² = ⟨O²⟩ − ⟨O⟩²  (1.23)

Note that the right hand side can be written as ⟨(O − ⟨O⟩I)²⟩ and hence is non-negative.
Another theorem asserts that if two operators commute:

[O1 , O2 ] = O1 O2 − O2 O1 = 0 (1.24)

then one can find an orthonormal basis of states which are simultaneously eigenstates
of both operators. On the other hand for pairs of operators that don’t commute we find
the famous Heisenberg uncertainty principle:
∆O1 ∆O2 ≥ (1/2)|⟨[O1, O2]⟩|  (1.25)
The classic example of this is the position and momentum operators:

x̂ψ(x) = xψ(x),  p̂ψ(x) = −iℏ ∂ψ(x)/∂x  (1.26)
so that [x̂, p̂] = iℏ and hence

∆x̂∆p̂ ≥ ℏ/2. (1.27)
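The bound (1.25) can be checked numerically; here for two spin operators on a qubit rather than x̂ and p̂ (the Pauli matrices and the chosen state are stand-ins, not from the notes):

```python
import numpy as np

# Two non-commuting observables on a qubit: the Pauli matrices sigma_x, sigma_y.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])

psi = np.array([1.0, 0.5 + 0.5j])
psi = psi / np.linalg.norm(psi)            # unit-norm state (arbitrary choice)

def expval(O):
    return (psi.conj() @ O @ psi).real     # <O> for Hermitian O, eq. (1.19)

def uncertainty(O):
    return np.sqrt(expval(O @ O) - expval(O) ** 2)   # Delta O, cf. (1.23)

comm = sx @ sy - sy @ sx                   # [sigma_x, sigma_y] = 2i sigma_z
lhs = uncertainty(sx) * uncertainty(sy)
rhs = 0.5 * abs(psi.conj() @ comm @ psi)   # right hand side of (1.25)
assert lhs >= rhs - 1e-12                  # the uncertainty bound holds
```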

Examples

A two-level system. Consider a two-state system (aka a qubit). In this case H is


two-dimensional. Let {|0⟩, |1⟩} be a basis. Then any state of the system takes the form

|ψ⟩ = c0 |0⟩ + c1 |1⟩, c0 , c1 ∈ C. (1.28)



Furthermore
⟨ψ|ψ⟩ = 1 =⇒ |c0 |2 + |c1 |2 = 1. (1.29)
|c0|², |c1|² are the probabilities to find the system in states |0⟩ and |1⟩ respectively, provided
we measure an operator whose eigenvectors are |0⟩, |1⟩. In vector notation we may choose

|0⟩ = (1, 0)ᵀ,  |1⟩ = (0, 1)ᵀ.  (1.30)

An observable in this case is a Hermitian 2×2 matrix and since {|0⟩, |1⟩} are its eigenvectors,
we may write

O = λ0|0⟩⟨0| + λ1|1⟩⟨1| = diag(λ0, λ1),  (1.31)

then measuring O will give outcome λ0 with probability |c0|² and outcome λ1 with probability
|c1|². We will see later that a spin-½ system is characterized by a state of the form (1.28)
and that the spin operator along a direction in space, which can be taken to be
proportional to one of the Pauli matrices (say σz), is an observable.
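The two-level measurement rule can be spelled out numerically. In the sketch below the outcome values λ0, λ1 and the coefficients c0, c1 are hypothetical choices:

```python
import numpy as np

ket0, ket1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])    # basis vectors of (1.30)
c0, c1 = 0.6, 0.8j                                         # |c0|^2 + |c1|^2 = 1
psi = c0 * ket0 + c1 * ket1

lam0, lam1 = -1.0, 1.0                                     # hypothetical outcomes
O = lam0 * np.outer(ket0, ket0) + lam1 * np.outer(ket1, ket1)   # the O of (1.31)

p0 = abs(ket0.conj() @ psi) ** 2    # probability of outcome lam0: |c0|^2 = 0.36
p1 = abs(ket1.conj() @ psi) ** 2    # probability of outcome lam1: |c1|^2 = 0.64
expval = (psi.conj() @ O @ psi).real
assert np.isclose(expval, lam0 * p0 + lam1 * p1)   # <O> = sum over outcome * probability
```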

Position and momenta. Consider an infinite-dimensional Hilbert space acted
upon by the position and momentum operators x̂ = x and p̂ = −iℏ ∂/∂x. These operators
have continuous spectra. To find the position operator eigenfunctions ϕ(x; x0), we
expand

ψ(x) = ∫ dx0 c(x0) ϕ(x; x0).  (1.32)
Now note that

|c(x0)|² = |ψ(x0)|²  (1.33)

gives the probability density of finding the system at position x0. We can then immediately
read off the position eigenfunctions

ϕ(x; x0) = N δ(x − x0),  (1.34)

with N fixed by the normalization condition

1 = ∫ dx |ψ(x)|².  (1.35)

Similarly, for momentum

ψ(x) = ∫ dk c(k) ϕ(x; k),  (1.36)

with

c(k) = ψ̃(k)  (1.37)

and therefore

ϕ(x; k) = N e^{ikx}.  (1.38)

Functions obeying

0 < ∫ dx |ψ(x)|² < ∞  (1.39)

are said to be L2 normalizable. Thanks to the normalization condition (1.35), ψ(x) is


such a function. L2 -normalizable functions provide an example of a Hilbert space.
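For a concrete (numerical) version of (1.36)-(1.37): the FFT of a sampled Gaussian packet gives |ψ̃(k)|² peaked at the packet's mean momentum, and the normalization (1.35) carries over to k-space. The grid, cutoff, and FFT normalization below are choices of this sketch, not conventions from the notes:

```python
import numpy as np

# Momentum-space wavefunction of a Gaussian packet via the FFT.
N, Lbox, k0 = 4096, 80.0, 3.0
dx = Lbox / N
x = (np.arange(N) - N // 2) * dx
psi = np.exp(-x**2 / 2) * np.exp(1j * k0 * x)       # packet with mean momentum k0
psi /= np.sqrt(np.sum(np.abs(psi) ** 2) * dx)       # L^2 normalization, cf. (1.35)

k = 2 * np.pi * np.fft.fftfreq(N, d=dx)
psi_k = np.fft.fft(psi) * dx / np.sqrt(2 * np.pi)   # |psi_k| is |psi_tilde(k)|
dk = 2 * np.pi / Lbox

assert np.isclose(np.sum(np.abs(psi_k) ** 2) * dk, 1.0)   # Parseval: normalized in k
assert abs(k[np.argmax(np.abs(psi_k))] - k0) < dk         # |psi_tilde|^2 peaks at k0
```

Only the modulus |ψ̃(k)| is used here, so the FFT's phase conventions (which depend on where the grid starts) do not matter.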

1.3 Aside: Canonical Quantisation


How does one construct such a Hilbert space and its corresponding observables in a
physically meaningful way? There is a notion of canonical quantization whereby one
takes the physical quantities of a classical Hamiltonian system and replaces them by
self-adjoint operators on a Hilbert space, where the Poisson bracket is replaced by the
commutator:

f(q, p), g(q, p) → Of, Og,  {f, g} → −(i/ℏ)[Of, Og]  (1.40)

In particular the most fundamental quantities in a Hamiltonian system are the
positions qi and conjugate momenta pi with i = 1, ..., N which obey

{qi, qj} = {pi, pj} = 0,  {qi, pj} = δij  (1.41)

The ‘canonical’ way to quantise this system is to consider the Hilbert space L²(R^N) of
functions of qi and make the replacements²

qi → q̂i :  q̂i ψ(q1, ..., qN) = qi ψ(q1, ..., qN)
pi → p̂i :  p̂i ψ(q1, ..., qN) = −iℏ ∂ψ(q1, ..., qN)/∂qi  (1.42)

and it is easy to see that indeed

[q̂i, q̂j] = [p̂i, p̂j] = 0,  [q̂i, p̂j] = iℏδij  (1.43)

as required (and used above). Note that this corresponds to a particular choice of
coordinates qi and conjugate momenta pi. However Hamiltonian systems admit canonical
transformations that mix these, and hence there are several ways to quantise,
corresponding to choosing a ‘polarisation’, i.e. a choice of what is a q and what is a p.
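The canonical commutator (1.43) can be checked numerically by applying q̂ and a finite-difference version of p̂ to a smooth test function. This is a sketch with ℏ = 1 and a single degree of freedom; the identity holds only up to discretization error:

```python
import numpy as np

hbar = 1.0
q = np.linspace(-8, 8, 20001)
psi = np.exp(-q**2 / 2)                  # smooth, rapidly decaying test function

def p_op(f):
    # p = -i hbar d/dq, approximated by second-order central differences
    return -1j * hbar * np.gradient(f, q)

# [q, p] psi = q (p psi) - p (q psi); analytically this equals i hbar psi:
comm = q * p_op(psi) - p_op(q * psi)
middle = slice(5000, 15001)              # avoid the one-sided stencils at the edges
assert np.allclose(comm[middle], 1j * hbar * psi[middle], atol=1e-4)
```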
However this is really putting the cart before the horse as the classical world emerges
from the quantum world and not the other way around. Indeed there are quantum
systems with no classical limit at all. They obey the rules of quantum mechanics that we
are outlining here but they need not come from “quantising” some classical Hamiltonian
system.

1.4 Time Evolution


The most important of all observables is the Hamiltonian H and its eigenvalues are the
possible energies of the system. Under canonical quantisation the Hamiltonian equation

dO/dt = {O, H}  (1.44)
²In general we will avoid putting hats on operators but we need to do so here as qi have very specific meanings as real numbers.

becomes

dOH/dt = −(i/ℏ)[OH, H]  (1.45)

Here I have introduced a subscript H on O to indicate that we are in the so-called
Heisenberg picture where operators evolve in time and states are constant. Equation
(1.45) is known as the Heisenberg equation.
A more familiar picture is the Schrödinger picture where

OS = e−iHt/ℏ OH eiHt/ℏ
|ΨS ⟩ = e−iHt/ℏ |ΨH ⟩ (1.46)

With this choice we see that


dOS/dt = 0
d|ΨS⟩/dt = −(i/ℏ) H|ΨS⟩  (1.47)
Furthermore the Hamiltonian governs the time evolution of the system through
Schrödinger’s equation


iℏ ∂|ΨS⟩/∂t = H|ΨS⟩  (1.48)

This follows by thinking of the Hamiltonian operator H as the energy operator Ê and
making the identification


Ê = iℏ ∂/∂t  (1.49)
In what follows we drop the subscript S and will always work in the Schrödinger picture.
One then sees that eigenstates of H with eigenvalue En evolve rather trivially (for
simplicity here we assume that H and hence its eigenvectors and eigenvalues are time
independent):

|Ψ(t)⟩ = e−iEn t/ℏ |ψn ⟩ (1.50)

I will try to stick to a convention where the time dependent wavefunction is |Ψ⟩ and
the time independent one, meaning it is an eigenstate of the Hamiltonian, |ψ⟩. The
difference then is just the time-dependent phase.
More generally given any state |Ψ⟩ we can expand it in an energy eigenstate basis:
|Ψ⟩ = ∑n cn(t)|ψn⟩  (1.51)

in which case substituting into the Schrödinger equation we find

cn (t) = cn (0)e−iEn t/ℏ (1.52)



and hence we know the state at all times:


|Ψ(t)⟩ = ∑n cn(0) e^{−iEn t/ℏ} |ψn⟩  (1.53)

and the cn (0) can be found by expanding |Ψ(0)⟩ in the energy eigenstate basis. We
have recovered (1.5). Of course finding the eigenstates and eigenvalues of a typical
Hamiltonian is highly non-trivial. But in principle solving Quantum Mechanics is down
to (infinite-dimensional) linear algebra.
A central point is that time evolution is unitary, meaning that it preserves the
inner product between two states. This is crucial for the consistency of the theory, as
otherwise the probabilities we discussed above would fail to sum to unity.
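The expansion (1.53) is also a practical recipe: diagonalize H once, then evolve each coefficient by a phase. A finite-dimensional sketch (the Hermitian H is random, purely for illustration; ℏ = 1) that also checks unitarity:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
H = A + A.conj().T                       # a Hermitian "Hamiltonian"

E, V = np.linalg.eigh(H)                 # energies E_n, eigenvectors as columns of V
psi0 = np.zeros(5, dtype=complex)
psi0[0] = psi0[1] = 1 / np.sqrt(2)       # some initial superposition

c0 = V.conj().T @ psi0                   # c_n(0): components in the energy eigenbasis
def psi(t):
    # |Psi(t)> = sum_n c_n(0) e^{-i E_n t} |psi_n>, eq. (1.53) with hbar = 1
    return V @ (np.exp(-1j * E * t) * c0)

for t in (0.0, 0.7, 3.0):
    assert np.isclose(np.linalg.norm(psi(t)), 1.0)   # unitary: norm is preserved
```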

1.5 Example: The Quantum Harmonic Oscillator


Let us review this most classic example of a quantum system corresponding to
V(x) = (1/2) kx²,  (1.54)
hence the Schrödinger equation is

iℏ ∂Ψ/∂t = −(ℏ²/2m) ∂²Ψ/∂x² + (1/2) kx² Ψ  (1.55)
We could proceed by finding the general solution to this differential equation. However
this is a difficult task and a much better analysis can be done using algebra.
Again our Hilbert space is H = L2 (R). We can write the Schrödinger equation as
 
Ê|Ψ⟩ = ((1/2m) p̂² + (k/2) x̂²)|Ψ⟩  (1.56)

Next we introduce new operators â and â†


â = (1/(2ℏ√mk))^(1/2) p̂ − i (√mk/2ℏ)^(1/2) x̂,   â† = (1/(2ℏ√mk))^(1/2) p̂ + i (√mk/2ℏ)^(1/2) x̂  (1.57)

These satisfy
ââ† = (1/(2ℏ√mk)) p̂² + (√mk/2ℏ) x̂² − (i/2ℏ)[x̂, p̂] = (1/(2ℏ√mk)) p̂² + (√mk/2ℏ) x̂² + 1/2
â†â = (1/(2ℏ√mk)) p̂² + (√mk/2ℏ) x̂² + (i/2ℏ)[x̂, p̂] = (1/(2ℏ√mk)) p̂² + (√mk/2ℏ) x̂² − 1/2
(1.58)

and hence

[â, ↠] = 1 (1.59)



Thus the Hamiltonian can be written as

H = (1/2m) p̂² + (k/2) x̂² = √(kℏ²/m) (â†â + 1/2)  (1.60)

At this point, if not before, one usually introduces ω = √(k/m) so that

H = ℏω (â†â + 1/2)  (1.61)
We know that the normalisable modes have energies which are bounded below by
zero. In particular there is a lowest energy state that we call the ground state |0⟩. Let
us introduce another operator N̂ called the number operator

N̂ = â†â  (1.62)

i.e. H = ℏω(N̂ + 1/2). Note that N̂ is Hermitian with non-negative eigenvalues:

⟨Ψ|N̂Ψ⟩ = ⟨Ψ|â†âΨ⟩ = ⟨âΨ|âΨ⟩ ≥ 0  (1.63)

Thus it follows that the lowest energy eigenstate |0⟩ must satisfy

N̂|0⟩ = 0  (1.64)

which means that

â|0⟩ = 0  (1.65)

hence Ĥ|0⟩ = (1/2)√(kℏ²/m) |0⟩, i.e. the ground state energy is

E0 = (1/2) ℏω  (1.66)

We can create new states by acting with â†:

|n⟩ = Nn (â†)ⁿ |0⟩  (1.67)

where Nn is a normalisation constant chosen so that ⟨n|n⟩ = 1. In particular it is

Nn = √(1/n!)  (1.68)
Next we prove the following lemma: N̂|n⟩ = n|n⟩. To do this we proceed by induction.
By definition we have that N̂|0⟩ = 0. Next, assuming that N̂|n − 1⟩ = (n − 1)|n − 1⟩,
we consider

N̂|n⟩ = (Nn/Nn−1) N̂ â†|n − 1⟩
     = (Nn/Nn−1) â†ââ†|n − 1⟩
     = (Nn/Nn−1) (â†[â, â†] + â†â†â)|n − 1⟩
     = (Nn/Nn−1) â†(1 + N̂)|n − 1⟩
     = n (Nn/Nn−1) â†|n − 1⟩
     = n|n⟩

This can also been seen from the following commutators

[N̂ , â] = [↠â, â]


= ↠ââ − â↠â
= ↠ââ − [â, ↠]â − ↠ââ
= −â (1.69)

and

[N̂ , ↠] = [↠â, ↠]


= ↠â↠− ↠↠â
= ↠â↠+ ↠[â, ↠] − ↠ââ†
= ↠(1.70)

Hence it follows that

N̂ â|n⟩ = ([N̂ , â] + âN̂ )|n⟩ = (−â + nâ)|n⟩ = (n − 1)â|n⟩ (1.71)

and

N̂ ↠|n⟩ = ([N̂ , ↠] + ↠N̂ )|n⟩ = (↠+ n↠)|n⟩ = (n + 1)↠|n⟩ (1.72)

The operators ↠and â are therefore known as raising and lowering operators respec-
tively. From this we see that
   
H|n⟩ = ℏω(N̂ + 1/2)|n⟩ = ℏω(n + 1/2)|n⟩  (1.73)

i.e. the energy of the state |n⟩ is

En = ℏω(n + 1/2)  (1.74)
So we have completely solved the system without ever solving a differential equation.
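The algebra above can be mirrored with truncated matrices in the number basis {|0⟩, ..., |D−1⟩}. This is a sketch with ℏ = ω = 1 and an arbitrary cutoff D; the truncation spoils [â, â†] = 1 only in the last diagonal entry:

```python
import numpy as np

# a|n> = sqrt(n)|n-1>, so a has sqrt(1), ..., sqrt(D-1) on its first superdiagonal.
D = 50
a = np.diag(np.sqrt(np.arange(1, D)), k=1)   # annihilation (lowering) operator
ad = a.T                                     # creation operator (a is real here)

N = ad @ a                                   # number operator, eq. (1.62)
assert np.allclose(np.diag(N), np.arange(D))         # N|n> = n|n>

comm = a @ ad - ad @ a
assert np.allclose(np.diag(comm)[:-1], 1.0)          # [a, a^dag] = 1 away from the cutoff

hbar = omega = 1.0
H = hbar * omega * (N + 0.5 * np.eye(D))
assert np.allclose(np.diag(H), hbar * omega * (np.arange(D) + 0.5))   # E_n of (1.74)
```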
Furthermore this approach also allows us to explicitly construct the corresponding
energy eigenstate wavefunctions. For example the ground state satisfies

  21 √ ! 21 
1 mk
â|0⟩ =  √ p̂ − i x̂ |0⟩ = 0 (1.75)
2ℏ mk 2ℏ

Recalling the definitions of x̂ and p̂ we obtain a first order differential equation

[−iℏ (1/(2ℏ√mk))^(1/2) ∂/∂x − i (√mk/2ℏ)^(1/2) x] Ψ0 = 0  (1.76)

where Ψ0(x, t) = e^{−iE0 t/ℏ} ψ0(x) is the explicit element of L²(R) that represents |0⟩.
Rewriting this equation gives

dψ0/dx = −(√mk/ℏ) x ψ0  (1.77)

The solution to this differential equation is simply ψ0 ∝ e^{−√mk x²/2ℏ} and hence

Ψ0 = N0 e^{−itE0/ℏ} e^{−√mk x²/2ℏ}  (1.78)

where N0 is a normalisation and we have included the time-dependent phase factor
e^{−itE0/ℏ}.
Furthermore we can obtain explicit solutions for the higher energy states Ψn(x, t) by acting with â†:

Ψn(x, t) = (1/√(n!)) e^{−iEn t/ℏ} (â†)ⁿ ψ0(x)
         = (1/√(n!)) e^{−iEn t/ℏ} [ (1/(2ℏ√(mk)))^{1/2} (−iℏ ∂/∂x) + i (√(mk)/(2ℏ))^{1/2} x ]ⁿ ψ0(x)
         = N0 (1/√(n!)) e^{−iEn t/ℏ} [ (1/(2ℏ√(mk)))^{1/2} (−iℏ ∂/∂x) + i (√(mk)/(2ℏ))^{1/2} x ]ⁿ e^{−√(mk) x²/(2ℏ)} (1.79)

These have the form of a polynomial times the Gaussian e^{−√(mk) x²/(2ℏ)}. We can explicitly construct the first few wavefunctions, for example,
Ψ1(x, t) = N0 e^{−iE1 t/ℏ} [ (1/(2ℏ√(mk)))^{1/2} (−iℏ ∂/∂x) + i (√(mk)/(2ℏ))^{1/2} x ] e^{−√(mk) x²/(2ℏ)}
         = N0 e^{−iE1 t/ℏ} [ i (√(mk)/(2ℏ))^{1/2} x + i (√(mk)/(2ℏ))^{1/2} x ] e^{−√(mk) x²/(2ℏ)}
         = 2i N0 (√(mk)/(2ℏ))^{1/2} e^{−iE1 t/ℏ} x e^{−√(mk) x²/(2ℏ)} (1.80)
It would have been very difficult to find all these solutions to the Schrödinger equation
by brute force and also to use them to calculate quantities such as ⟨x̂⟩ and ⟨x̂2 ⟩.
We can also calculate various expectation values in this formalism. For example, in
the state |n⟩, we have
⟨x̂⟩ = ⟨n|x̂|n⟩
    = (i/2)(2ℏ/√(mk))^{1/2} ⟨n|(â − â†)|n⟩
    = (i/2)(2ℏ/√(mk))^{1/2} ( ⟨n|â|n⟩ − ⟨n|â†|n⟩ ) (1.81)

Now â|n⟩ ∝ |n − 1⟩ and â†|n⟩ ∝ |n + 1⟩ so that

⟨x̂⟩ ∝ ⟨n|n + 1⟩ − ⟨n|n − 1⟩ = 0 (1.82)
since the |n⟩ are orthonormal. On the other hand

⟨x̂²⟩ = ⟨n|x̂²|n⟩
     = −(1/4)(2ℏ/√(mk)) ⟨n|(â − â†)²|n⟩
     = −(1/4)(2ℏ/√(mk)) ( ⟨n|â²|n⟩ + ⟨n|(â†)²|n⟩ − ⟨n|ââ†|n⟩ − ⟨n|â†â|n⟩ ) (1.83)

Now again â²|n⟩ ∝ |n − 2⟩ and (â†)²|n⟩ ∝ |n + 2⟩ so that the first two terms give zero. However the last two terms give

⟨x̂²⟩ = (1/4)(2ℏ/√(mk)) ( ⟨â†n|â†n⟩ + ⟨ân|ân⟩ ) (1.84)

Now we have already seen that

|n + 1⟩ = (1/√(n+1)) â†|n⟩ , (1.85)

which implies that

â|n + 1⟩ = (1/√(n+1)) ââ†|n⟩ = (1/√(n+1)) (N̂ + [â, â†])|n⟩ = √(n+1) |n⟩ (1.86)

Thus

|n⟩ = (1/√(n+1)) â|n + 1⟩ (1.87)

and finally

⟨x̂²⟩ = (1/4)(2ℏ/√(mk)) (2n + 1) (1.88)
You can also evaluate ⟨p̂⟩ and ⟨p̂2 ⟩ this way and check that the Heisenberg uncertainty
relation is satisfied.
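The algebra above is easy to check numerically. Below is a minimal sketch (not part of the notes) using truncated matrices for â and â† in the |n⟩ basis, in units where ℏ = m = k = 1 so that (1.88) becomes ⟨x̂²⟩ = (2n + 1)/2. It uses the common phase convention x̂ = (â + â†)/√2, which differs from the convention above only by a phase that cancels in expectation values.

```python
import numpy as np

# Truncated matrix representation of the ladder operators in the |n> basis.
# Units hbar = m = k = 1 (an assumption of the check), so <x^2> = (2n+1)/2.
N = 20                                        # truncation dimension
a = np.diag(np.sqrt(np.arange(1, N)), k=1)    # a|n> = sqrt(n)|n-1>
adag = a.conj().T                             # a†|n> = sqrt(n+1)|n+1>

x = (a + adag) / np.sqrt(2)                   # position operator, up to a phase
x2 = x @ x

for n in range(5):                            # stay well below the truncation
    e = np.zeros(N); e[n] = 1.0
    assert abs(e @ x @ e) < 1e-12             # <x> = 0, eq. (1.82)
    assert abs(e @ x2 @ e - (2*n + 1)/2) < 1e-12   # eq. (1.88)
print("ladder-operator expectation values verified")
```

The same matrices can be used to check ⟨p̂⟩ = 0, ⟨p̂²⟩, and the uncertainty relation.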
For completeness, we finally show how the energy spectrum of the harmonic oscillator may also be recovered by directly solving the Schrödinger equation. Consider the time-independent Schrödinger equation

−(ℏ²/2m) d²ψ/dx² + (1/2) k x² ψ = Eψ (1.89)

It is helpful to write

ψ = f(x) e^{−αx²} (1.90)

so that

dψ/dx = ( df/dx − 2αxf ) e^{−αx²}
d²ψ/dx² = ( d²f/dx² − 4αx df/dx − 2αf + 4α²x²f ) e^{−αx²} (1.91)

and

−(ℏ²/2m) ( d²f/dx² − 4αx df/dx − 2αf + 4α²x²f ) + (1/2) k x² f = Ef (1.92)

Let us take α² = km/4ℏ² so that the x² terms cancel:

d²f/dx² − 4αx df/dx − 2αf = −(2mE/ℏ²) f (1.93)

This equation has polynomial solutions. Indeed if we write f = xⁿ + ..., then the coefficients of the leading xⁿ terms must match, giving

−4αn − 2α = −2mE/ℏ² (1.94)

From here we read off that

E = (2ℏ²α/m)(n + 1/2) = ℏ√(k/m) (n + 1/2) (1.95)

as before.
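As a cross-check of (1.74) and (1.95), one can also diagonalize the Hamiltonian numerically. The sketch below (an illustration, not from the notes; grid size and box width are arbitrary choices) uses a standard three-point finite-difference discretization with ℏ = m = k = 1 and recovers E_n ≈ n + 1/2.

```python
import numpy as np

# Finite-difference check of E_n = hbar*omega*(n + 1/2) with hbar = m = k = 1.
L, M = 10.0, 2000
xs = np.linspace(-L, L, M)
h = xs[1] - xs[0]

# kinetic term -(1/2) d^2/dx^2 via the 3-point stencil, potential x^2/2 on the diagonal
T = (2*np.eye(M) - np.eye(M, k=1) - np.eye(M, k=-1)) / (2*h**2)
H = T + np.diag(xs**2 / 2)

E = np.linalg.eigvalsh(H)[:5]
print(E)   # close to [0.5, 1.5, 2.5, 3.5, 4.5]
```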
Chapter 2

Angular Momentum

Let us start our exploration of Quantum Mechanics by looking at more realistic three-
dimensional models. In Classical Dynamics the simplest, but also most common, sys-
tems have rotational symmetry which leads via Noether’s theorem to conserved angular
momentum. This in turn means that the system can typically be solved. The classic
example is the Kepler problem of a planet moving around the Sun.
It’s ridiculous to think of the quantum problem of something so macroscopic as a
planet moving around something even bigger such as the Sun. The correspondence
principle tells us that we should just reproduce the classical results with incredible
accuracy. But happily there is a suitable quantum analogue which is of a negatively
charged particle (an electron) moving around a heavier positively charged particle (a
nucleus) as found in atoms! But before we tackle this problem head on and see Quantum
Mechanics really working for us and matching experiment we should step back and
think of transformations of states, symmetries, and in particular angular momentum in
Quantum Mechanics. We will see that the angular momentum is simply the generator of
rotations and, just like in classical mechanics, it is conserved in systems with rotational
symmetry. An example of such a system is the Hydrogen atom, which we will introduce
in this chapter and discuss in more detail later.

2.1 Transformations and Unitary Operators


As in classical mechanics, symmetries are transformations that leave the Hamiltonian
invariant. Hence, before we discuss symmetries, we need to understand what we mean
by transformations and how they act on states and operators.

2.1.1 Transformation of states


Recall that linear transformations on states |ψ⟩ ∈ H are given by operators

U : H → H , |ψ⟩ ↦ |ψ′⟩ = U|ψ⟩. (2.1)


This transformation must preserve the norm of |ψ⟩ (recall the first Postulate) and so
1 = ⟨ψ|ψ⟩ = ⟨ψ ′ |ψ ′ ⟩ = ⟨ψ|U † U |ψ⟩. (2.2)
It follows that U must be a unitary matrix, i.e.
U † U = I. (2.3)
To see this, let
|ψ⟩ = |ϕ⟩ + λ|χ⟩ (2.4)
and substitute into (2.2). After elementary algebra, this yields
λ* ( ⟨χ|ϕ⟩ − ⟨χ|U†U|ϕ⟩ ) = −λ ( ⟨ϕ|χ⟩ − ⟨ϕ|U†U|χ⟩ ) , (2.5)
which must be true as we vary λ ∈ C. This means that both sides must vanish, resulting
in (2.3).
We will be interested in transformations that form a group. Recall this means that,
given g1 , g2 ∈ G, we have g1 ·g2 ∈ G, where · is the group composition which is associative.
G has an identity element and every element has an inverse. We want U (gi ) to form a
representation of gi on H, which means that they provide a group homomorphism
U (g1 ) ◦ U (g2 ) = U (g1 · g2 ). (2.6)
We will also usually be interested in transformations that depend on a continuous
parameter θ (cf. translations, rotations). Then given an infinitesimal transformation
θ = δθ, we have
U (δθ) = I − iδθT + O(δθ2 ), (2.7)
where θ = 0 corresponds to no transformation hence the identity on the RHS above.
This is just the operator analog of the Taylor expansion for functions. The unitarity
condition for U then implies
U † (δθ)U (δθ) = I ⇐⇒ iδθT † − iδθT = 0 =⇒ T † = T . (2.8)
T is therefore a Hermitian matrix, hence a good candidate for an observable. More
generally, we have
|ψ ′ ⟩ = |ψ(θ + δθ)⟩ = U (θ + δθ)|ψ⟩ = |ψ(θ)⟩ − iδθT |ψ(θ)⟩ + O(δθ2 ). (2.9)
Rearranging and taking the limit δθ → 0, we find

i ∂|ψ(θ)⟩/∂θ = T|ψ(θ)⟩. (2.10)
This means that |ψ⟩ varies smoothly as a function of θ in H and the rate at which it
varies is governed by T .
Given an infinitesimal transformation, we can exponentiate it as follows,
U(θ) = lim_{N→∞} ( I − iθT/N )^N = e^{−iθT}. (2.11)
In other words, we can generate a finite transformation θ by applying θ/N N times and
taking the continuum limit N → ∞.
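The exponentiation formula (2.11) can be checked directly with matrices. Below is a small sketch (with a randomly generated Hermitian generator, an arbitrary choice of this example) comparing the N-fold product of infinitesimal transformations with the matrix exponential, and verifying unitarity.

```python
import numpy as np
from scipy.linalg import expm

# Check U(theta) = lim_{N->inf} (I - i*theta*T/N)^N = exp(-i*theta*T) and U†U = I
# for a randomly chosen Hermitian generator T (an illustrative choice).
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
T = (A + A.conj().T) / 2                      # Hermitian generator
theta = 0.7

U = expm(-1j * theta * T)
assert np.allclose(U.conj().T @ U, np.eye(3))        # unitarity from T† = T

Nsteps = 100_000
step = np.eye(3) - 1j * theta * T / Nsteps           # infinitesimal transformation
assert np.allclose(np.linalg.matrix_power(step, Nsteps), U, atol=1e-3)
print("exponentiation of the generator verified")
```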

2.1.2 Transformation of operators


We can think of the transformations discussed before as acting on operators as opposed
to states. To see this consider the transformation of the expectation value of an operator
A in state |ψ⟩ under the change of |ψ⟩

⟨ψ|A|ψ⟩ 7→ ⟨ψ ′ |A|ψ ′ ⟩ = ⟨ψ|U † AU |ψ⟩. (2.12)

Hence we see that we can keep |ψ⟩ fixed and transform A instead

A 7→ A′ = U † AU. (2.13)

For infinitesimal transformations (2.7) you can check that

A′ = A + iδθ[T, A] + O(δθ2 ). (2.14)

2.1.3 Examples
a) Translations

Let x̂ be the position operator. Translations should be implemented by unitary operators U(a) defined by

U†(a) x̂ U(a) = x̂ + aI. (2.15)

Indeed, we have

U † (a)x̂U (a)|x⟩ = (x̂ + aI)|x⟩ = (x + a)|x⟩ ⇐⇒ x̂U (a)|x⟩ = (x + a)U (a)|x⟩. (2.16)

In other words, U(a)|x⟩ = |x + a⟩. So what is U(a)? To find it, consider first the infinitesimal translation

U(δa) = I − (i/ℏ) δa · P̂ + O(δa²). (2.17)

Here we allowed for translations in 3d space where δa = (δa_x, δa_y, δa_z) and ℏ⁻¹P̂ is a 3-component vector of translation generators (operators). From (2.14) and (2.15), we have

U†(δa) x̂ U(δa) = x̂ + δa I + O(δa²) ⇐⇒ (i/ℏ)[δa · P̂, x̂] = δa I. (2.18)

For this to hold for any 3-vector δa, we must have

[P̂_i, x̂_j] = −iℏ δ_ij. (2.19)

But these are just the canonical commutation relations between position and momenta in QM! Exponentiating,

U(a) = e^{−(i/ℏ) a·P̂}. (2.20)

We conclude that the momentum operator is simply the generator of translations.
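A quick numerical illustration (not from the notes; the grid, the packet width, and ℏ = 1 are assumptions): in the momentum basis the translation operator e^{−ia·P̂/ℏ} is diagonal with eigenvalue e^{−ika}, so applying it via a Fourier transform shifts a sampled wavefunction by a.

```python
import numpy as np

# Momentum generates translations: exp(-i a P / hbar) psi(x) = psi(x - a).
# On a periodic grid the operator is diagonal in Fourier space, exp(-i k a).
M, L = 256, 20.0
h = L / M
xs = np.arange(M) * h
psi = np.exp(-0.5 * (xs - L/2)**2)            # a Gaussian wave packet

a = 2 * h                                     # translate by two grid spacings
k = 2 * np.pi * np.fft.fftfreq(M, d=h)        # momentum eigenvalues on the grid
shifted = np.fft.ifft(np.exp(-1j * k * a) * np.fft.fft(psi)).real

assert np.allclose(shifted, np.roll(psi, 2), atol=1e-8)
print("U(a) psi(x) = psi(x - a) verified")
```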

b) Time translations

From the same argument as before, we conclude that time translations are implemented by unitary operators parameterized by the time t,

U(t) = e^{−(i/ℏ)Ĥt}, (2.21)

where Ĥ is the generator of time translations. Then a state |ψ(t)⟩ is related to the state |ψ(0)⟩ at time t = 0 by the action of U(t), namely

|ψ(t)⟩ = U(t)|ψ(0)⟩ = e^{−(i/ℏ)Ĥt}|ψ(0)⟩. (2.22)

From (2.10), for infinitesimal time translations we have

iℏ ∂/∂t |ψ(t)⟩ = Ĥ|ψ(t)⟩ (2.23)

which is nothing but the time-dependent Schrödinger equation!

2.1.4 Heisenberg vs. Schrödinger pictures, revisited
We can now understand better the relation between the Heisenberg and the Schrödinger
pictures encountered in Chapter 1. In particular, we can use (2.22) to relate time-
dependent states |ψ(t)⟩ (such as those governed by the Schrödinger equation) to time-
independent states |ψH ⟩ ≡ |ψ(0)⟩ via
|ψ(t)⟩ = e^{−(i/ℏ)Ĥt} |ψH⟩. (2.24)

By construction, we have

(∂/∂t)|ψH⟩ = 0, (2.25)

but we could also derive the same relation by substituting (2.24) into the Schrödinger equation (2.23), provided that Ĥ doesn't depend explicitly on time t.

Consider now the action of operators with no explicit time dependence on |ψ(t)⟩:

|ψ′(t)⟩ ≡ O U(t)|ψH⟩ = U(t) [U†(t) O U(t)] |ψH⟩ ≡ U(t) OH |ψH⟩ ≡ U(t) |ψ′H⟩. (2.26)

We see that OH is time-dependent and related to the time-independent operator O by conjugation by the time-evolution operator U(t). The time evolution of OH is governed by the Heisenberg equation of motion

(d/dt) OH = (i/ℏ)[Ĥ, OH] + e^{(i/ℏ)Ĥt} (∂O/∂t) e^{−(i/ℏ)Ĥt} (2.27)
where the last term on the RHS vanishes due to the assumed time independence of O.
(2.27) contains the same information as the Schrödinger equation.

We have recovered the relation between the Schrödinger and Heisenberg pictures
discussed in Chapter 1. In the former, states |ψ(t)⟩ are time-dependent and their evo-
lution is governed by the Schrödinger equation while operators O are taken to be time-
independent. In the latter states are time-independent |ψH ⟩ and instead, the operators
OH are time dependent and governed by the Heisenberg equation (2.27). There is noth-
ing physically different about the two pictures, you can think about them as a change in
the point of view. In particular, expectation values of operators are the same regardless
of the picture we choose to use

⟨ψH |OH |ψH ⟩ = ⟨ψH |U † (t)OU (t)|ψH ⟩ = ⟨ψ(t)|O|ψ(t)⟩. (2.28)
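Equation (2.28) is easy to verify numerically; here is a minimal sketch with a randomly generated 4-level Hamiltonian, observable, and state (all arbitrary choices, ℏ = 1).

```python
import numpy as np
from scipy.linalg import expm

# <psi_H| O_H |psi_H> = <psi(t)| O |psi(t)>, eq. (2.28), for random H, O, psi.
rng = np.random.default_rng(1)
def herm(n):
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (A + A.conj().T) / 2

H, O = herm(4), herm(4)
psi0 = rng.normal(size=4) + 1j * rng.normal(size=4)
psi0 /= np.linalg.norm(psi0)                  # Heisenberg-picture state |psi_H>

t = 1.3
U = expm(-1j * H * t)                         # time-evolution operator
OH = U.conj().T @ O @ U                       # Heisenberg-picture operator
psit = U @ psi0                               # Schrodinger-picture state

assert np.allclose(psi0.conj() @ OH @ psi0, psit.conj() @ O @ psit)
print("expectation values agree in both pictures")
```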
The situation becomes more complicated if Ĥ or the operators are dependent on time.
This is usually the case in quantum field theory (there you will encounter yet another
picture, namely the “interaction” picture). Such problems are usually quite hard to
solve in generality and one has to resort to approximate methods such as perturbation
theory. We will talk about this later in the course.

2.2 Symmetries
Consider a Heisenberg picture operator that has no explicit time dependence and com-
mutes with the Hamiltonian, in other words let

OH = Q(t) = U † (t)QU (t), [Ĥ, Q(t)] = [Ĥ, Q] = 0. (2.29)
From (2.27), we immediately get

dQ(t)/dt = (i/ℏ)[Ĥ, Q(t)] = 0, (2.30)
so Q(t) is time independent. Operators that are time independent in the Heisenberg
picture are said to be conserved. Note that if we start out with an eigenstate of Q at
time t = 0
Q|ψ(0)⟩ = q|ψ(0)⟩ (2.31)
at a later time t, we find

Q|ψ(t)⟩ = QU(t)|ψ(0)⟩ = U(t)Q|ψ(0)⟩ = qU(t)|ψ(0)⟩ = q|ψ(t)⟩, (2.32)

where in the second equality we used that [Q, Ĥ] = 0 and hence [Q, U(t)] = 0. We see that a particle that starts off in a Q eigenstate will remain so at all later times. Furthermore, since [Q, Ĥ] = 0, eigenstates of Q can be chosen to also be eigenstates of Ĥ,

Q|e_n⟩ = q_n|e_n⟩ ,  Ĥ|e_n⟩ = E_n|e_n⟩. (2.33)
A typical state will not be an eigenstate of either Ĥ or Q. Nevertheless it is also easy to see that the expectation value of Q in any state will be time independent (assuming that Q has no explicit time dependence):

d/dt ⟨ψ|Q|ψ⟩ = ⟨(d/dt)ψ|Q|ψ⟩ + ⟨ψ|Q|(d/dt)ψ⟩
            = ⟨−(i/ℏ)Ĥψ|Q|ψ⟩ + ⟨ψ|Q|(−(i/ℏ)Ĥψ)⟩
            = (i/ℏ) ⟨ψ|Ĥ†Q − QĤ|ψ⟩
            = (i/ℏ) ⟨ψ|[Ĥ, Q]|ψ⟩
            = 0. (2.34)

The most important source of conserved quantities are symmetries of the Hamiltonian. A symmetry of Ĥ is a transformation U(θ) that leaves Ĥ invariant, namely

U†(θ) Ĥ U(θ) = Ĥ. (2.35)

From the general discussion of section 2.1.2, this implies that

[T, Ĥ] = 0, (2.36)

where T is the generator of the transformation discussed before. In this special case, T is also a conserved quantity by (2.29). We arrive at the important conclusion that symmetries of the Hamiltonian correspond to conserved quantities. The analog statement in classical mechanics is the famous Noether theorem.
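A small numerical illustration of (2.29)–(2.34) (the operators and state below are arbitrary choices, ℏ = 1): take Q = J1 in the spin-1 representation and H any function of J1, so that [H, Q] = 0; then ⟨Q⟩ stays constant under time evolution while ⟨J3⟩ does not.

```python
import numpy as np
from scipy.linalg import expm

# Conserved quantities: if [H, Q] = 0 then <Q> is time independent, eq. (2.34).
s2 = np.sqrt(2)
Jp = np.array([[0, s2, 0], [0, 0, s2], [0, 0, 0]])   # spin-1 raising operator
J1 = (Jp + Jp.T) / 2
J3 = np.diag([1.0, 0.0, -1.0])

H = J1 + 0.3 * (J1 @ J1)              # any function of J1 commutes with J1
Q = J1
assert np.allclose(H @ Q - Q @ H, 0)

psi0 = np.array([1.0, 0.0, 0.0])      # a J3 eigenstate, not a Q eigenstate
qvals, j3vals = [], []
for t in np.linspace(0, 5, 11):
    psit = expm(-1j * H * t) @ psi0
    qvals.append((psit.conj() @ Q @ psit).real)
    j3vals.append((psit.conj() @ J3 @ psit).real)

assert np.allclose(qvals, qvals[0])            # <Q> is conserved
assert max(j3vals) - min(j3vals) > 0.1         # <J3> is not
print("conserved <Q>, non-conserved <J3>")
```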

2.2.1 Aside: relation to classical mechanics


In classical dynamics symmetries play an important role. Simply put, these are transformations of the system that leave the dynamics unchanged. In a Hamiltonian formulation these correspond to canonical transformations that leave the Hamiltonian invariant:

H(q′_i, p′_i) = H(q_i, p_i). (2.37)

Recall that a canonical transformation q_i → q′_i, p_i → p′_i preserves the Poisson brackets, namely

{q′_i, q′_j} = {p′_i, p′_j} = 0 ,  {q′_i, p′_j} = δ_ij. (2.38)
A particularly important class of transformations are infinitesimal ones, meaning that one can expand

q′_i = q_i + εT_i + ...
p′_i = p_i + εU_i + ... (2.39)

where the ellipses denote higher order powers of ε and T_i, U_i are some functions on phase space. It is known that for this to be a canonical transformation there must exist a function Q on phase space such that¹

T_i = ∂Q/∂p_i ,  U_i = −∂Q/∂q_i . (2.40)

This in turn means that

q′_i = q_i + ε{q_i, Q} ,  p′_i = p_i + ε{p_i, Q}. (2.41)

One then finds that

H(q′_i, p′_i) = H(q_i, p_i) + ε Σ_i ( T_i ∂H/∂q_i + U_i ∂H/∂p_i ) + ...
             = H(q_i, p_i) + ε Σ_i ( ∂H/∂q_i ∂Q/∂p_i − ∂H/∂p_i ∂Q/∂q_i ) + ...
             = H(q_i, p_i) + ε{H, Q} + ... (2.42)

Thus an infinitesimal symmetry corresponds to a function Q such that

{Q, H} = 0 ⇐⇒ dQ/dt = 0, (2.43)

where we have assumed that Q has no explicit time-dependence. This is the famous Noether theorem: every infinitesimal symmetry gives rise to a conserved quantity and vice-versa (although the converse is not true in a Lagrangian formulation).

2.3 A Spherically Symmetric Potential

In three dimensions the Schrödinger equation takes the form

iℏ ∂Ψ/∂t = −(ℏ²/2m) ∇²Ψ + V(x, y, z)Ψ (2.44)

Here we have assumed the potential is time-independent and

∇² = ∂²/∂x² + ∂²/∂y² + ∂²/∂z². (2.45)

This is in general too hard to solve, but let us consider a potential with spherical symmetry, V = V(r) with r = √(x² + y² + z²). Because the system is symmetric, it is helpful to switch to coordinates that make this symmetry manifest, namely spherical coordinates:

x = r sin θ cos ϕ ,
y = r sin θ sin ϕ ,
z = r cos θ. (2.46)
¹One can easily check that the first equation in (2.38) tells us that T_i, U_i must be total derivatives wrt p_i, q_i respectively of some functions Q, P, while the second equation tells us the functions are the same up to a sign.
In this case (you can show this by a tedious calculation just using the chain rule, see the problem sets; it's easier if you know Riemannian geometry)

∇² = (1/r²) ∂/∂r ( r² ∂/∂r ) + (1/(r² sin θ)) ∂/∂θ ( sin θ ∂/∂θ ) + (1/(r² sin²θ)) ∂²/∂ϕ² ≡ (1/r²) ∂/∂r ( r² ∂/∂r ) + (1/r²) L² , (2.47)
where the differential operator L2 only depends on the angular variables. We will see
later how this is related to the angular momentum generator in some basis.
We now look for energy eigenstates. We can make the Ansatz (see section 1.4)

Ψ = e−iEt/ℏ ψ(r, θ, ϕ) (2.48)
which leads to the following equation

−(ℏ²/2m) [ (1/r²) ∂/∂r ( r² ∂/∂r ) + L²(θ, ϕ)/r² ] ψ + V(r)ψ = Eψ. (2.49)
We can solve this equation by separation of variables. In particular we write

ψ(r, θ, ϕ) = u(r)Y (θ, ϕ) (2.50)

From here we find

−(ℏ²/2mr²) d/dr ( r² du/dr ) Y − (ℏ²u/2mr²) L² Y + (V(r) − E) uY = 0. (2.51)
By the usual logic of separation of variables (divide by uY/r² and note that the first and third terms only depend on r whereas the second is independent of r), this splits into two equations

−(ℏ²/2mu) d/dr ( r² du/dr ) + (V(r) − E) r² = −ϵ ,
−(1/Y)(ℏ²/2m) [ (1/sin θ) ∂/∂θ ( sin θ ∂/∂θ ) + (1/sin²θ) ∂²/∂ϕ² ] Y = ϵ , (2.52)

where the operator in square brackets is L² and ϵ is a constant. In particular the second equation knows nothing about the potential and hence nothing about the problem at hand:
[ (1/sin θ) ∂/∂θ ( sin θ ∂/∂θ ) + (1/sin²θ) ∂²/∂ϕ² + 2mϵ/ℏ² ] Y = 0. (2.53)
Indeed the equation for Y(θ, ϕ) can be completely solved and the solutions are related to the spherical harmonics Y_{l,m}(θ, ϕ), labelled by two integers l, m with l ≥ 0 and |m| ≤ l.

On the other hand the first equation is a single ordinary differential equation for u(r) and we have some hope to solve it:

−(1/r²) d/dr ( r² du/dr ) + (2m/ℏ²) ( V(r) + ϵ/r² − E ) u = 0 (2.54)
This is the only place which is sensitive to the particular problem at hand through the
choice of potential V (r).
Thus we have reduced the problem to solving two independent differential equations, and the only one specific to the particular problem is a second order ordinary linear differential equation. One can proceed by brute force (the Y's can be found by separation of variables once more). You can find them on Wikipedia or in Mathematica so we won't say more about them here, although we will say that one finds

ϵ = ℏ² l(l + 1) / 2m (2.55)
for some non-negative integer l. Later we will construct them in an insightful way. Indeed
spherical harmonics arise from some very important physics (angular momentum) and
mathematics (Lie algebras) underlying the system. So let us explore...
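Before doing so, the claim (2.55) can be checked directly on low-l examples. The sketch below (an illustration; the unnormalized harmonics cos θ and sin θ e^{iϕ} are standard l = 1 solutions) applies the angular operator appearing in (2.52) and recovers l(l + 1) = 2.

```python
import sympy as sp

# Check the angular equation (2.53): L^2 Y = -l(l+1) Y for l = 1 harmonics.
th, ph = sp.symbols('theta phi')

def L2(Y):
    # the angular differential operator appearing in (2.52)
    return (sp.diff(sp.sin(th) * sp.diff(Y, th), th) / sp.sin(th)
            + sp.diff(Y, ph, 2) / sp.sin(th)**2)

for Y in (sp.cos(th), sp.sin(th) * sp.exp(sp.I * ph)):   # ~ Y_{1,0}, Y_{1,1}
    assert sp.simplify(L2(Y) + 2 * Y) == 0               # l(l+1) = 2
print("l = 1 spherical harmonics verified")
```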

2.4 Rotations and Angular Momentum Operators
There is an important example of transformation on H that we did not discuss yet,
namely rotations. Let’s first recall how these are defined in R3 . Let v ∈ R3 . A rotation
around n̂ by an angle α is a map

R(α) : R³ → R³ , v ↦ v′ = R(α)v, (2.56)

where
v ′ · v ′ = v · v, det R(α) = 1. (2.57)
These two constraints imply that R(α) ∈ SO(3), i.e. they are special orthogonal matrices. You may recall from studying rotations in classical mechanics that an infinitesimal rotation of v by an angle δα around some unit vector n̂ (pointing along the axis of rotation) is given by

v′ = v + δα n̂ × v + O(δα²), (2.58)

where × is the cross product. Draw a picture to convince yourself that this is the case (recall that the cross product of two vectors gives a vector perpendicular to the plane defined by the two vectors).
In quantum mechanics, rotations are defined by a unitary transformation U (α) whose
action on the position operator is given by (see also the discussion in section 2.1.3)

U † (α)x̂U (α) = R(α)x̂. (2.59)

Recall that in quantum mechanics, the vector x ∈ R³ is promoted to a vector of position operators x̂ = (x̂, ŷ, ẑ) and R(α) acts on this vector as usual, resulting in another vector of operators. For example, if R(α) is a rotation by an angle α around the z axis (i.e. a rotation in the x−y plane), we get

           ( cos α  −sin α  0 ) ( x̂ )   ( cos α x̂ − sin α ŷ )
R(α) x̂ =  ( sin α   cos α  0 ) ( ŷ ) = ( sin α x̂ + cos α ŷ ) . (2.60)
           (   0       0    1 ) ( ẑ )   (         ẑ         )

The important thing to keep in mind is that the components of these vectors are not
numbers, but operators on H.
From (2.7), we can write an infinitesimal rotation as

U(δα) = I − (i/ℏ) δα · Ĵ + O(δα²). (2.61)

Substituting this into the definition (2.59), and using (2.14) and (2.58), we deduce that an infinitesimal rotation of x̂ is given by

(i/ℏ) [δα · Ĵ, x̂] = δα × x̂. (2.62)

Using²

δα · Ĵ = Σ_i δα_i Ĵ_i ,  (δα × x̂)_i = Σ_jk ε_ijk δα_j x̂_k , (2.63)

and given that (2.62) has to hold for all δα_i, we find

[Ĵ_i, x̂_j] = (ℏ/i) ε_jik x̂_k = iℏ ε_ijk x̂_k . (2.64)

In summary, x̂ transforms as a vector under rotations and α · Ĵ is the generator of rotations around α ≡ α n̂.
What about successive rotations? Consider two infinitesimal rotations around dif-
ferent axes α, β

R(δβ)R(δα)v = R(δβ) (v + δα × v) = v+δα×v+δβ×(v+δα×v)+O(δα2 , δβ 2 ). (2.65)

We can now compute the commutator (or the difference between first rotating around
α and then around β and vice versa)

(R(δβ)R(δα) − R(δα)R(δβ)) v = δβ × (δα × v) − δα × (δβ × v)
                             = (δβ × δα) × v = R(δβ × δα)v − v. (2.66)

The last line can be checked from the definition of the cross product and please do so!
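For readers who prefer a numerical check, the sketch below (the test vectors are arbitrary choices) builds finite rotations with Rodrigues' formula and confirms (2.66) to leading order in the small angles.

```python
import numpy as np

# Check (2.66): (R(db)R(da) - R(da)R(db)) v = (db x da) x v at leading order.
def R(alpha):
    # Rodrigues' rotation formula for a rotation vector alpha = angle * axis
    a = np.linalg.norm(alpha)
    if a == 0:
        return np.eye(3)
    n = alpha / a
    K = np.array([[0, -n[2], n[1]], [n[2], 0, -n[0]], [-n[1], n[0], 0]])
    return np.eye(3) + np.sin(a) * K + (1 - np.cos(a)) * (K @ K)

da = 1e-4 * np.array([1.0, 0.3, -0.5])
db = 1e-4 * np.array([-0.2, 0.7, 0.4])
v = np.array([0.3, -1.2, 2.0])

lhs = (R(db) @ R(da) - R(da) @ R(db)) @ v
rhs = np.cross(np.cross(db, da), v)
assert np.allclose(lhs, rhs, atol=1e-11)       # agree up to O(delta^3) terms
print("rotation commutator identity verified")
```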
Since U (α) is a homomorphism, (2.66) implies that

[U (δβ), U (δα)] = U (δβ × δα) − I, (2.67)

or using (2.61),

−(1/ℏ²) [δβ · Ĵ, δα · Ĵ] = −(i/ℏ) (δβ × δα) · Ĵ. (2.68)

Inserting the formulas (2.63) for the scalar and cross products, since (2.68) must hold for any δα, δβ, we must have that

[Ĵ_i, Ĵ_j] = iℏ ε_ijk Ĵ_k . (2.69)
²From now on we will adopt the Einstein summation convention Σ_i a_i b_i = a_i b_i.
This means that Jˆi form an algebra. We will see in the next chapter that this is the Lie
algebra of the Lie groups SO(3) and SU (2), whose representations we will discuss in
detail.
A special case is the angular momentum from classical mechanics L = x × p, which
in quantum mechanics becomes

L = x̂ × p̂ (2.70)

If we denote x = (x, y, z) = (x1 , x2 , x3 ) then we have, in components

Li = εijk x̂j p̂k (2.71)

where i = 1, 2, 3 and we use the convention that repeated indices are summed over. Therefore, following canonical quantisation, we find the operator

L_i = −iℏ ε_ijk x^j ∂/∂x^k , (2.72)

for example

L_1 = −iℏ x^2 ∂/∂x^3 + iℏ x^3 ∂/∂x^2 . (2.73)
To gain some intuition let us look at

[L_1, L_2] = L_1 L_2 − (1 ↔ 2)
           = −ℏ² ε_1kl x^k (∂/∂x^l) ( ε_2mn x^m ∂/∂x^n ) − (1 ↔ 2)
           = −ℏ² ε_1kl ε_2mn ( x^k δ_lm ∂/∂x^n + x^k x^m ∂²/(∂x^l ∂x^n) ) − (1 ↔ 2)
           = −ℏ² ε_1kl ε_2ln x^k ∂/∂x^n − ℏ² ε_1kl ε_2mn x^k x^m ∂²/(∂x^l ∂x^n) − (1 ↔ 2) (2.74)

Let's look at the second term. We see that it is symmetric under k ↔ m and l ↔ n. Therefore it is symmetric under 1 ↔ 2 and is cancelled when we subtract (1 ↔ 2). Thus we have

[L_1, L_2] = −ℏ² ε_1kl ε_2ln x^k ∂/∂x^n − (1 ↔ 2)
           = −ℏ² ε_123 ε_231 x^2 ∂/∂x^1 − (1 ↔ 2)
           = −ℏ² x^2 ∂/∂x^1 − (1 ↔ 2)
           = iℏ L_3 (2.75)

where we have used the fact that l ̸= 1 in ε1kl and l ̸= 2 in ε2ln so the only non-zero
term comes from l = 3 and hence k = 2, n = 1. This is a general fact which follows from
the identity³

ε_ikl ε_jlm − ε_jkl ε_ilm = −ε_ijl ε_lkm (2.76)
³Check it for yourself or use Mathematica. I don't know any "smart" way to do it; just think of the cases and note that the left hand side is anti-symmetric in i, j.
and so in general (see the problem sets)

[L_i, L_j] = iℏ ε_ijk L_k . (2.77)
We find again (2.69). This structure is powerful enough for us to deduce more or less
anything we need to know about states with angular momentum from purely algebraic
considerations. In the next chapter we will also discuss another important observable
that obeys the same relation (2.69) but has no classical analog: the spin.
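The general commutator (2.77) can also be verified symbolically by applying the differential operators (2.72) to an arbitrary test function; a minimal sketch:

```python
import sympy as sp

# Symbolic check of [L1, L2] = i*hbar*L3 for the differential operators (2.72),
# applied to an arbitrary test function f(x, y, z).
x, y, z, hbar = sp.symbols('x y z hbar')
f = sp.Function('f')(x, y, z)

L1 = lambda g: -sp.I * hbar * (y * sp.diff(g, z) - z * sp.diff(g, y))
L2 = lambda g: -sp.I * hbar * (z * sp.diff(g, x) - x * sp.diff(g, z))
L3 = lambda g: -sp.I * hbar * (x * sp.diff(g, y) - y * sp.diff(g, x))

comm = L1(L2(f)) - L2(L1(f))
assert sp.simplify(comm - sp.I * hbar * L3(f)) == 0
print("[L1, L2] = i hbar L3 verified symbolically")
```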
Chapter 3

All about su(2)

3.1 The Stern-Gerlach Experiment
Quantization of angular momentum is due to Bohr and Sommerfeld (1920s) and was tested by Stern & Gerlach in a famous 1922 experiment. The idea of the experiment is as follows. Consider a collection of atoms, isotropically distributed in space. Each atom of mass M is predicted to be governed by the quantum Hamiltonian

H = P̂²/2M − µ̂ · B. (3.1)
Classically, µ would correspond to the magnetic dipole moment of the atom and is proportional to the angular momentum of electrons going around a loop. In quantum mechanics, the magnetic dipole moment is proportional to the spin, an intrinsic property of atoms:

µ → µ̂ ≡ (µ/ℏs) Ŝ. (3.2)
Ŝ obeys the same commutation relations as the angular momentum, namely

[Ŝi , Ŝj ] = iℏεijk Ŝk (3.3)

and so the state of the atom transforms non-trivially under Ŝ. Essentially, the magnetic
moment captures deviations of the atom from spherical symmetry (i.e. there is a preferred
direction of “spin” around an axis).
As discussed before, observables are obtained by computing expectation values of various operators whose evolution is governed by the Heisenberg equations of motion. Recalling that in the Heisenberg picture states are time-independent, we have

d⟨x̂⟩/dt = (i/ℏ)⟨[H, x̂]⟩ = (i/ℏ)⟨[P̂²/2M, x̂]⟩ = ⟨P̂⟩/M , (3.4)

where we used the commutation relations (2.19) for P̂ and x̂. This is the analog of the classical relation between position and momentum. Furthermore, we have

d⟨P̂⟩/dt = (i/ℏ)⟨[H, P̂]⟩ = ⟨∇(µ̂ · B)⟩. (3.5)

This is the analog of Newton's second law, where the RHS is interpreted as a force that acts on the atom.
Provided that the magnetic field is inhomogeneous, i.e. ∇(µ̂ · B) ≠ 0, the atoms
experience a force that pushes them in the direction of increasing B if their spins are
aligned, and in the direction of decreasing B if their spins are anti-aligned. Provided the
atoms are unpolarized, classically, we would expect a continuum distribution of atoms
on a remote screen, as shown in Figure ??. Instead the SG experiment found in the
case of silver atoms that the beam split into two after passing through the magnetic
field. This observation was consistent with the quantization of µ (ie. Ŝ promoted to a
vector of operators, one for each direction in space). In this case, dimH = 2 and each
component of µ̂ is a 2 × 2 matrix with eigenvalues ±1 (in some units).
We will see that this case corresponds to a 2-dimensional representation of the su(2)
Lie algebra. In general su(2) admits (2s + 1)-dimensional representations, and indeed,
going through the periodic table, the pattern on the screen in the SG experiment was
found to consist of an integer number of lines! This observation is fully consistent with the description of spin in terms of representations of su(2) with s ∈ ½ℕ. The goal of this chapter is to explain these observations using math.

3.2 su(2) Lie algebras and representations
We have seen that the rotation generators (2.69), as well as the angular momentum operators (2.77) (with factors of ℏ stripped off), satisfy [J_i, J_j] = iε_ijk J_k. It is not hard to see that

J_i = τ_i / 2 , (3.6)

where the τ_i are the Pauli matrices, also satisfy [J_i, J_j] = iε_ijk J_k. Recall that the Pauli matrices are by definition

τ_1 = ( 0  1 )    τ_2 = ( 0  −i )    τ_3 = ( 1   0 )
      ( 1  0 )          ( i   0 )          ( 0  −1 )   (3.7)

So what other matrices or operators satisfy these relations?
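A direct check that the Pauli matrices give a 2-dimensional solution (a small numerical sketch, not part of the notes):

```python
import numpy as np

# Verify [J_i, J_j] = i eps_{ijk} J_k for J_i = tau_i / 2.
tau = np.array([[[0, 1], [1, 0]],
                [[0, -1j], [1j, 0]],
                [[1, 0], [0, -1]]], dtype=complex)
J = tau / 2

eps = np.zeros((3, 3, 3))                     # Levi-Civita symbol
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

for i in range(3):
    for j in range(3):
        comm = J[i] @ J[j] - J[j] @ J[i]
        assert np.allclose(comm, 1j * np.einsum('k,kab->ab', eps[i, j], J))
print("Pauli matrices satisfy the su(2) algebra")
```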
The commutation relation satisfied by the J ′ s is an example of a Lie-algebra. Roughly
speaking (and there is a whole module devoted to this) a Lie algebra is what you get
when you look at infinitesimally small group transformations. We have already encoun-
tered this in the last chapter when we looked at transformations U (δθ) parameterized
by a continuous, infinitesimal variable δθ like in (2.7). The formal definition is that a
Lie-algebra is a vector space G (often taken to be complex) with a bi-linear map

[ · , · ]:G×G →G (3.8)

which is antisymmetric: [A, B] = −[B, A] and satisfies the Jacobi identity:

[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0. (3.9)
This last relation is automatically obeyed by matrices if we take [A, B] = AB − BA, since matrix multiplication is associative (A(BC) = (AB)C). If we let T_a be a basis for G, where a = 1, ..., dim G, then there must exist constants, called structure constants, such that

[T_a, T_b] = i f_ab^c T_c , (3.10)

where there is a sum over c and the T_a are known as generators of the Lie-algebra.
The factor of i requires some explanation. It is often not there in the mathematics
literature. However in Physics we like our generators to be Hermitian (just like the
angular momentum operators). If the Ta are Hermitian then
([T_a, T_b])† = T_b† T_a† − T_a† T_b†
             = T_b T_a − T_a T_b
             = −[T_a, T_b] (3.11)

Thus the factor of i ensures that fab c are real. In Mathematics one often drops the factor
of i and takes Ta = −Ta† . This is essentially the same reason why we included a factor
of i in the expansion of (2.7).
A representation is a linear map

Π : G → End(V ) (3.12)

such that

Π([Ji , Jj ]) = [Π(Ji ), Π(Jj )] (3.13)

Here V is a vector space (typically complex) and End(V ) are the set of endomorphisms
of V (linear maps from V to V ). In physics language V = CN and End(V ) are N × N
matrices.
We are particularly interested in irreducible representations. These are represen-
tations with no non-trivial invariant subspaces. That is, there are no vector subspaces
of V that are mapped to itself by Π.
Let us suppose that we are given matrices Ji that satisfy [Ji , Jj ] = iϵijk Jk with
i = 1, 2, 3. We have dimG = 3 and fabc = ϵabc which define the su(2) ≃ so(3) Lie algebra.
Since ϵijk are real, Ji† = Ji but we do not know anything else yet and we certainly don’t
assume that they are 2 × 2 matrices or differential operators. It is important not to
confuse the dimension of the Lie-algebra dimG with the dimension of the representation,
N . We will see that we can construct N × N Hermitian matrices Ji , i.e. N -dimensional
representations of su(2), for any N = 0, 1, 2, ....

3.2.1 Construction of irreducible unitary su(2) representations

Let us now return to our task: finding all N × N Hermitian matrices J_i that satisfy

[J_i, J_j] = iε_ijk J_k . (3.14)

We note that

J² = (J_1)² + (J_2)² + (J_3)² (3.15)

is a Casimir.¹ That means it commutes with all the generators:

[J², J_i] = Σ_j [J_j², J_i]
          = Σ_j ( J_j [J_j, J_i] + [J_j, J_i] J_j )
          = Σ_jk ( J_j iε_jik J_k + iε_jik J_k J_j )
          = i Σ_jk ε_jik ( J_j J_k + J_k J_j )
          = 0 . (3.16)

There is a famous theorem known as Schur’s lemma which states that any such Casimir
must act as a multiple of the identity in an irreducible representation. This means that
J 2 = λI in any irreducible representation. In practical terms if J 2 commutes with all
other operators then nothing will change the eigenvalue of J 2 .
Since the J_i are Hermitian we can choose to diagonalise one, but only one since su(2) has rank 1, say J_3. Thus the representation has a basis of states labelled by eigenvalues
of J3 and J 2 :
J3 |λ, m⟩ = m|λ, m⟩ J 2 |λ, m⟩ = λ|λ, m⟩. (3.17)

In analogy to the harmonic oscillator we can trade J1 and J2 for the operators

J_± = J_1 ± iJ_2 ,  J_±† = J_∓ . (3.18)

Notice that

[J_3, J_±] = [J_3, J_1] ± i[J_3, J_2]
           = iJ_2 ± J_1
           = ±J_± . (3.19)

We can therefore use J_± to raise and lower the eigenvalue of J_3:

J_3 (J_±|λ, m⟩) = ([J_3, J_±] + J_± J_3)|λ, m⟩
               = (±J_± + m J_±)|λ, m⟩
               = (m ± 1) J_±|λ, m⟩ (3.20)
¹The Casimir is an element of the universal enveloping algebra of su(2). You may wonder whether we can find other operators that commute with all the generators. A theorem by Harish-Chandra tells us that the dimension of the center of the universal enveloping algebra of an algebra is equal to the rank of the algebra. Since su(2) has rank 1, it has only one Casimir.
Therefore we have

J_+|λ, m⟩ = c_+(m)|λ, m + 1⟩ ,  J_−|λ, m⟩ = c_−(m)|λ, m − 1⟩ , (3.21)

where the constants c+ (m) and c− (m) are chosen to ensure that the states are normalized
(we are assuming for simplicity that the eigenspaces of J3 are one-dimensional - we will
return to this shortly).
It will also be useful to know that

[J+ , J− ] = [J1 + iJ2 , J1 − iJ2 ] = −i[J1 , J2 ] + i[J2 , J1 ] = −2i × iϵ123 J3 = 2J3 . (3.22)

Furthermore, it is easy to show that in terms of J_±, the Casimir (3.15) takes the form (please check!)

J² = ½ (J_+ J_− + J_− J_+) + J_3² . (3.23)
Using these identities, we can now determine the normalization constants in (3.21) as
follows. Using that the states are normalized to 1, namely ⟨m|m⟩ = ⟨m + 1|m + 1⟩ = 1,
and (3.18), we have

|c+ |2 ⟨m + 1|m + 1⟩ = |c+ |2 = ⟨λ, m|J− J+ |λ, m⟩,


|c− |2 ⟨m − 1|m − 1⟩ = |c− |2 = ⟨λ, m|J+ J− |λ, m⟩.

We can now take the sum and difference of the above,

|c+ |2 + |c− |2 = ⟨λ, m|(J− J+ + J+ J− )|λ, m⟩,


(3.24)
|c+ |2 − |c− |2 = ⟨λ, m|[J− , J+ ]|λ, m⟩.

We can now conveniently use (3.23) in the first identity and (3.22) in the second one to
find
|c+ |² + |c− |² = 2⟨λ, m|(J² − J3² )|λ, m⟩ = 2λ − 2m² ,
|c+ |² − |c− |² = −2⟨λ, m|J3 |λ, m⟩ = −2m ,        (3.25)

where in the last line we used that |λ, m⟩ is an eigenstate of J 2 , J3 (3.17). We can now
easily solve for c±

c+ = √(λ − m² − m) ,
c− = √(λ − m² + m) .        (3.26)

Thus we see that any irrep of su(2) is labelled by λ and has states with J3 eigenvalues
m, m ± 1, m ± 2, . . .. If we look for finite dimensional representations then there must be
a highest value of J3 -eigenvalue mh and lowest value ml . Furthermore the corresponding
states must satisfy

J+ |λ, mh ⟩ = 0 J− |λ, ml ⟩ = 0 (3.27)

This in turn requires that c+ (mh ) = c− (ml ) = 0:

λ − mh (mh + 1) = 0 and λ − ml (ml − 1) = 0 . (3.28)



This implies that


λ = mh (mh + 1) (3.29)

and also that

mh (mh + 1) = ml (ml − 1) . (3.30)

This is a quadratic equation for ml as a function of mh and hence has two solutions.
Simple inspection tells us that

ml = −mh or ml = mh + 1 . (3.31)

The second solution is impossible since ml ≤ mh and hence the spectrum of J3 eigenvalues
is:
mh , mh − 1, ..., −mh + 1, −mh , (3.32)

with a single state assigned to each eigenvalue. Furthermore there are 2mh + 1 such
eigenvalues and hence the representation has dimension 2mh + 1. This must be an
integer so we learn that
2mh = 0, 1, 2, 3.... . (3.33)

We now return to the issue of whether or not the eigenspaces of J3 can be more
than one-dimensional. If the eigenspace with m = mh is N -dimensional then
acting with J− we obtain N -dimensional eigenspaces for each eigenvalue m. This
would lead to a reducible representation, where one could simply take one-dimensional
subspaces of each eigenspace. Let us then suppose that there is only a one-dimensional
eigenspace for m = mh , spanned by |λ, mh ⟩. It is then clear that acting with J− produces
all states, and each eigenspace of J3 is one-dimensional, spanned by
|λ, m⟩ ∝ (J− )^n |λ, mh ⟩ for some n = 0, 1, ..., 2mh .
In summary, and changing notation slightly to match standard conventions, we have obtained
a (2l + 1)-dimensional unitary representation for each l = 0, 1/2, 1, 3/2, ..., with
Casimir J² = l(l + 1)I (in terms of what we had before, l = mh ). The states can be
labelled by |l, m⟩ where m = −l, −l + 1, ..., l − 1, l.
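The ladder construction above is easy to check numerically. The following is a minimal sketch (assuming NumPy is available, and working in units ℏ = 1): it builds J3 and J± for a few values of l directly from the coefficients c±(m) of (3.26) and verifies the su(2) commutator and the Casimir J² = l(l + 1)I.

```python
import numpy as np

def spin_matrices(l):
    """Build the (2l+1)-dim J3, J+, J- in the basis |l,l>, |l,l-1>, ..., |l,-l>."""
    dim = int(2 * l + 1)
    ms = [l - k for k in range(dim)]          # J3 eigenvalues, descending
    J3 = np.diag(ms).astype(complex)
    Jp = np.zeros((dim, dim), dtype=complex)
    lam = l * (l + 1)
    for k, m in enumerate(ms):
        if k > 0:                             # J+ |l,m> = c+(m) |l,m+1>
            Jp[k - 1, k] = np.sqrt(lam - m * m - m)   # c+(m), cf. (3.26)
    Jm = Jp.conj().T                          # J- = (J+)^dagger
    return J3, Jp, Jm

for l in (0.5, 1, 1.5):
    J3, Jp, Jm = spin_matrices(l)
    J1, J2 = (Jp + Jm) / 2, (Jp - Jm) / (2j)
    # su(2) algebra: [J1, J2] = i J3
    assert np.allclose(J1 @ J2 - J2 @ J1, 1j * J3)
    # Casimir J^2 = l(l+1) I on the whole irrep
    Casimir = J1 @ J1 + J2 @ J2 + J3 @ J3
    assert np.allclose(Casimir, l * (l + 1) * np.eye(int(2 * l + 1)))
print("su(2) irreps check out")
```

For l = 1/2 this reproduces exactly the matrices constructed in the examples below.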
Let us look at some examples.
l = 0: Here we have just one state |0, 0⟩ and the matrices Ji act trivially. This is the
trivial representation.
l = 1/2: Here we have 2 states:

| 1/2, 1/2 ⟩ = (1, 0)ᵀ ,   | 1/2, −1/2 ⟩ = (0, 1)ᵀ .        (3.34)

By construction J3 is diagonal:

J3 = [ 1/2  0 ; 0  −1/2 ] .        (3.35)

We can determine J+ through

J+ | 1/2, 1/2 ⟩ = 0 ,   J+ | 1/2, −1/2 ⟩ = √(3/4 − 1/4 + 1/2) | 1/2, 1/2 ⟩ = | 1/2, 1/2 ⟩        (3.36)

so that

J+ = [ 0  1 ; 0  0 ] .        (3.37)
We can similarly determine J− through

J− | 1/2, 1/2 ⟩ = √(3/4 − 1/4 + 1/2) | 1/2, −1/2 ⟩ ,   J− | 1/2, −1/2 ⟩ = 0        (3.38)

so that

J− = [ 0  0 ; 1  0 ] .        (3.39)
Or alternatively

J1 = (J+ + J− )/2 = (1/2) [ 0  1 ; 1  0 ] ,
J2 = (J+ − J− )/(2i) = (1/2) [ 0  −i ; i  0 ] .        (3.40)

Thus we have recovered the Pauli matrices: Ji = τi /2.

3.3 Lie groups vs. algebras


An important observation is that we have been looking at representations of the Lie
algebra and not the group. A Lie group is defined as a group G which is also a smooth
manifold, in which multiplication and inversion are smooth maps. A manifold is a
space that is locally Rⁿ, where n is the dimension of the manifold. At every point in G, we
can define the tangent space Tp G which is a vector space also of dimension n. The Lie
algebra defined abstractly before is defined as the tangent space of G at the identity.
Near the identity, every element of G admits the following expansion

g(δθi ) = I − iδθi Ji + . . . , i = 1, · · · n, (3.41)

where the δθ^i ≪ 1 are arbitrarily small parameters. This is exactly the same expansion
we have encountered before in the discussion of rotations (there n = 3). To obtain
representations of the group we simply exponentiate these matrices (see also (??)):
i i
g = e−iθ J (3.42)

where now θi are not small.


Let us now look at what we get by exponentiating the Pauli matrices we found above:

g = e^{−(i/2) θ·τ}        (3.43)

To this end let us take

θ = θn̂ (3.44)

where n̂ is a unit vector in R3 and θ = |θ|. We note that

(n̂ · τ)² = n̂i n̂j τi τj
        = n̂i n̂j (δij I + iεijk τk )
        = n̂ · n̂ I        (since εijk n̂i n̂j = 0 by antisymmetry)
        = I        (3.45)

This means that (i n̂ · τ)² = −I. Thus we have (the proof is the same as for e^{iθ} =
cos θ + i sin θ)

g = e^{−(i/2) θ·τ} = cos(θ/2) I − i n̂·τ sin(θ/2)        (3.46)

and hence

g † g = (cos(θ/2)I + in̂ · τ sin(θ/2)) (cos(θ/2)I − in̂ · τ sin(θ/2))


= (cos2 (θ/2) + sin2 (θ/2))I
=I (3.47)

In other words g† = e^{(i/2) θ·τ} = g⁻¹, so we have unitary matrices. We can also compute

det g = det e^{−(i/2) θ·τ} = e^{tr(−(i/2) θ·τ)} = 1.        (3.48)

This follows because τi are traceless. Thus we have g ∈ SU (2) where SU (2) is the group

SU (2) = {2 × 2 complex matrices| det g = 1, g † = g −1 } (3.49)

One caveat: these are unitary representations because we assumed that the Ji are Hermitian
(so the group elements e^{−iθ^i Ji} are unitary). But we started off by looking at
rotations, that is SO(3). What happened?
Here we encounter the fact that there are two groups with the same Lie algebra: SU(2)
and SO(3). Indeed there is a well-known theorem:

SO(3) ≅ SU(2)/Z2        (3.50)

where Z2 is generated by the centre of SU(2), that is to say, the set of all elements
of SU(2) that commute with every element of SU(2). Such elements must be multiples of the
identity, and this leaves just ±I. In particular

g(θ = 0) = I g(θ = 2π) = −I. (3.51)
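These properties of the exponential map can be checked directly. The sketch below (assuming NumPy; the brute-force Taylor-series exponential is an illustrative choice, not part of the notes) compares the closed form (3.46) with the matrix exponential, and verifies unitarity, det g = 1, and that a 2π rotation gives −I:

```python
import numpy as np

tau = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
I = np.eye(2, dtype=complex)

def expm_series(A, terms=60):
    """Matrix exponential by Taylor series (plenty accurate for small 2x2 A)."""
    out, term = np.eye(2, dtype=complex), np.eye(2, dtype=complex)
    for n in range(1, terms):
        term = term @ A / n
        out = out + term
    return out

def g(theta_vec):
    """Closed form (3.46): cos(|theta|/2) I - i sin(|theta|/2) n.tau."""
    th = np.linalg.norm(theta_vec)
    n = theta_vec / th
    ntau = sum(ni * ti for ni, ti in zip(n, tau))
    return np.cos(th / 2) * I - 1j * np.sin(th / 2) * ntau

v = np.array([0.3, -1.2, 0.7])
ntau_full = sum(vi * ti for vi, ti in zip(v, tau))
assert np.allclose(g(v), expm_series(-1j * ntau_full / 2))  # matches e^{-i theta.tau/2}
assert np.allclose(g(v).conj().T @ g(v), I)                 # unitary
assert np.isclose(np.linalg.det(g(v)), 1)                   # det = 1
assert np.allclose(g(2 * np.pi * np.array([0, 0, 1.0])), -I)  # 2*pi rotation is -I
print("SU(2) exponentials verified")
```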



To exhibit the isomorphism, let’s go back to our discussion of rotations which are
elements of SO(3). Looking at infinitesimal rotations in R3 , we have

R = I − iδθ · J + . . . (3.52)

with
     
J1 = i [ 0 0 0 ; 0 0 1 ; 0 −1 0 ] ,   J2 = i [ 0 0 1 ; 0 0 0 ; −1 0 0 ] ,   J3 = i [ 0 1 0 ; −1 0 0 ; 0 0 0 ]        (3.53)

Each of these generates an infinitesimal rotation about the x, y and z axis
respectively. For example take only δθ3 non-zero:
     
R (x, y, z)ᵀ = (x, y, z)ᵀ − i δθ3 J3 (x, y, z)ᵀ = ( x + δθ3 y, y − δθ3 x, z )ᵀ ,        (3.54)
 

which is indeed an infinitesimal rotation around the z axis. One can see explicitly that

[Ji , Jj ] = iεijk Jk . (3.55)

This is the su(2) Lie algebra, but unlike the Pauli matrices, the generators of rotations
are 3 × 3 matrices. On the one hand, this is a different representation of su(2).
On the other hand, exponentiating these gives a relatively complicated expression.
Restricting again our attention to rotations around z, we get
     
g = e^{−iθ3 J3} = [ 0 0 0 ; 0 0 0 ; 0 0 1 ] + cos θ3 [ 1 0 0 ; 0 1 0 ; 0 0 0 ] + sin θ3 [ 0 1 0 ; −1 0 0 ; 0 0 0 ]

  = [ cos θ3  sin θ3  0 ; −sin θ3  cos θ3  0 ; 0  0  1 ]        (3.56)

Here we have used the fact that J3 , and all powers of J3 , split into a non-trivial 2 × 2
bit that squares to minus the identity and a trivial part with only zeros. Thus we have
rotations with θ3 ∈ [0, 2π].
We can now see that the homomorphism Φ : SU (2) → SO(3) is
 
Φ( e^{−(i/2) θ·τ} ) = e^{−iθ·J}        (3.57)

and indeed both θ = 0 and |θ| = 2π are mapped to the identity in R3 .
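One concrete way to realize Φ numerically is via R_ij(g) = (1/2) tr( τ_i g† τ_j g ). The index placement below is a convention chosen so that the output matches (3.56); it is an assumption of this sketch rather than a formula from the notes. The check also exhibits the 2-to-1 nature of the map:

```python
import numpy as np

tau = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]

def su2_element(theta):
    """g = cos(|theta|/2) I - i sin(|theta|/2) n.tau, the closed form (3.46)."""
    th = np.linalg.norm(theta)
    n = theta / th
    ntau = sum(ni * ti for ni, ti in zip(n, tau))
    return np.cos(th / 2) * np.eye(2) - 1j * np.sin(th / 2) * ntau

def Phi(g):
    """Map SU(2) -> SO(3); index placement chosen to match (3.56)."""
    return np.array([[0.5 * np.trace(ti @ g.conj().T @ tj @ g).real
                      for tj in tau] for ti in tau])

th3 = 0.7
R = Phi(su2_element(np.array([0, 0, th3])))
expected = np.array([[np.cos(th3), np.sin(th3), 0],
                     [-np.sin(th3), np.cos(th3), 0],
                     [0, 0, 1]])
assert np.allclose(R, expected)
# the map is 2-to-1: g and -g land on the same rotation
g1 = su2_element(np.array([0.4, -0.2, 0.9]))
assert np.allclose(Phi(g1), Phi(-g1))
print("Phi: SU(2) -> SO(3) verified")
```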



It's also interesting to think about how SU(2) and SO(3) look as spaces (manifolds).
Rewriting equation (3.46) as

g = A I + iB · τ        (3.58)

we see that A² + B · B = 1 parameterises a 3-sphere in 4 dimensions: SU(2) ≅ S³.


What does SO(3) look like? To describe a rotation we choose an axis n̂ and an angle
θ ∈ [−π, π]. This allows us to construct θ = θn̂. But (−n̂, −θ)
parameterises the same rotation as (n̂, θ). As a result SO(3) ≅ S³/Z2 and is not simply connected.
In particular the path

θ(s) = πs (1, 0, 0)ᵀ ,   s ∈ [−1, 1]        (3.59)

is closed in SO(3) but can't be continuously deformed to a point. Curiously enough,
going around this loop twice gives a loop that is deformable to a point: π1(SO(3)) = Z2.
More general representations arise by considering tensors Tμ1...μn over C² for su(2)
or R³ for SO(3). The group elements act on each of the μi indices in the natural
way. In general this does not give an irreducible representation. For larger groups
such as SU(N) and SO(N), taking Tμ1...μn to be totally antisymmetric does lead to
an irreducible representation, as does taking it totally symmetric and traceless on any pair of
indices.
What happens in Nature? We will see that the spherical harmonics are basis
functions for representations of SO(3). They arise because there is a fundamental SO(3)
rotational symmetry of space. For spherical harmonics l is an integer. This is because
for rotations, we need the image of −I in SU(2) under the homomorphism Φ in (3.57)
to be the identity in SO(3), namely

e^{2πiJ3} = I.        (3.60)

Hence the eigenvalues of J3 must be integers.


We have also just seen that representations with half-integer l exist. Later we will
see that the symmetry of space is actually SU(2) and that there are particles, Fermions,
which transform under SU(2) and not just SO(3). Sometimes one writes SU(2) = Spin(3),
where Spin(d) is the simply connected cover of SO(d). This is critical, as Fermions
must satisfy the Pauli exclusion principle, which states that no two Fermions can be in
the same state. This is ultimately what makes atoms and matter stable enough for us
to exist.
Furthermore, with relativity the symmetry group of spacetime is enhanced to SO(1, 3).
But again there are Fermions and the actual symmetry group is Spin(1, 3) ≅ SL(2, C),
whose complexified Lie algebra splits into two copies of su(2). By a happy coincidence everything is still described in terms of SU(2). This is a
fluke of being in a relatively low dimension. In higher dimensions the spacetime groups
and algebras are more complicated than those of SU(2). However many physicists can
happily spend their lives only looking at SU(2).
Chapter 4

Back to Angular Momentum

We have previously constructed N = 2s + 1 dimensional representations of su(2). We
have seen that the s = 1/2 (N = 2) representations describe the Hilbert space of a qubit,
while the s = 1 representations (N = 3) describe the invariant subspace of rotations in
3d. We have seen that Hermitian operators acting on these spaces are the Pauli matrices
(??) and the rotation generators (2.61) respectively.
We have also seen that the angular momentum generators

Li = −iℏ ϵijk xj ∂/∂xk ≡ (x̂ × p̂)i ,        (4.1)

obey the su(2) commutation relations. These are clearly not N × N matrices, so where
do they fit? Recall that the position and momentum operators (x̂, p̂) act on the space
of differentiable functions on R3 which is an infinite dimensional Hilbert space. This
means that Li in (4.1) must also act on this space. However, we will now show that they
do not act irreducibly on this space. In other words, functions on R3 (or more precisely
functions on the 2-dimensional sphere S 2 ) form a reducible representation of the su(2)
algebra. We can study the action of (4.1) on this space to decompose it into irreducible
subspaces.
Following the abstract construction of finite-dimensional irreducible representations
in Section ??, we want to look for eigenstates of the Casimir L2 , and L3 . As before, we
introduce

L+ = L1 + iL2 L− = L1 − iL2 (4.2)

which satisfy L− = L†+ . The commutation relations are now

[L3 , L+ ] = [L3 , L1 + iL2 ] = iℏL2 + ℏL1 = ℏL+


[L3 , L− ] = [L3 , L1 − iL2 ] = iℏL2 − ℏL1 = −ℏL−
[L+ , L− ] = [L1 + iL2 , L1 − iL2 ] = −2i[L1 , L2 ] = 2ℏL3 (4.3)


Finally we note that

L² = ((L+ + L− )/2)² + ((L+ − L− )/2i)² + L3²
   = (1/4)( L+² + L−² + L+ L− + L− L+ − L+² − L−² + L+ L− + L− L+ ) + L3²
   = (1/2)(L+ L− + L− L+ ) + L3²
   = L− L+ + (1/2)[L+ , L− ] + L3²
   = L− L+ + ℏL3 + L3²        (4.4)

Note that all these equations follow from the su(2) algebra

[Li , Lj ] = iℏεijk Lk (4.5)

and hence must hold for our differential operators (4.1) as well. Nevertheless, it will
be helpful to see what the previous analysis looks like in terms of these differential
operators.

4.1 Differential Operators vs Irreducible Represen-


tations
It’s not surprising that the space of differentiable functions on R3 is not an irreducible
su(2) representation. To see this, consider the set of functions of the form ψ = f (r2 ),
where r2 = x2 + y 2 + z 2 . These are spherically symmetric, so we expect them to form a
trivial invariant subspace of su(2). Indeed

Li ψ = −iℏ εijk xj ∂ψ/∂xk = −2iℏ εijk xj xk f′(r²) = 0        (4.6)

so these indeed form an invariant subspace.
To continue it will be convenient to work in a spherical coordinate system,

x = r sin θ cos ϕ
y = r sin θ sin ϕ
z = r cos θ (4.7)

and express the Li generators as differential operators acting on S². This is done by a
straightforward coordinate transformation (see Problem Sheets). First, we invert (4.7)
and find
r = √(x² + y² + z²) ,        (4.8)
θ = arccos(z/r) ,
ϕ = arctan(y/x).        (4.9)

From here, we can read off the change of basis matrix

   
[ ∂r/∂x  ∂r/∂y  ∂r/∂z ; ∂θ/∂x  ∂θ/∂y  ∂θ/∂z ; ∂ϕ/∂x  ∂ϕ/∂y  ∂ϕ/∂z ]
  = [ sin θ cos ϕ   sin θ sin ϕ   cos θ ;
      cos θ cos ϕ/r   cos θ sin ϕ/r   −sin θ/r ;
      −sin ϕ/(r sin θ)   cos ϕ/(r sin θ)   0 ] ,        (4.10)

which further allows us to find

∂/∂x = (∂r/∂x) ∂/∂r + (∂θ/∂x) ∂/∂θ + (∂ϕ/∂x) ∂/∂ϕ
     = sin θ cos ϕ ∂/∂r + (cos θ cos ϕ/r) ∂/∂θ − (sin ϕ/(r sin θ)) ∂/∂ϕ ,

∂/∂y = (∂r/∂y) ∂/∂r + (∂θ/∂y) ∂/∂θ + (∂ϕ/∂y) ∂/∂ϕ
     = sin θ sin ϕ ∂/∂r + (cos θ sin ϕ/r) ∂/∂θ + (cos ϕ/(r sin θ)) ∂/∂ϕ ,

∂/∂z = (∂r/∂z) ∂/∂r + (∂θ/∂z) ∂/∂θ + (∂ϕ/∂z) ∂/∂ϕ
     = cos θ ∂/∂r − (sin θ/r) ∂/∂θ .        (4.11)

Finally, this allows us to express the angular momentum generators in spherical coordi-
nates

L1 = −iℏ y ∂/∂z + iℏ z ∂/∂y
   = −iℏ r sin θ sin ϕ ( cos θ ∂/∂r − (sin θ/r) ∂/∂θ )
     + iℏ r cos θ ( sin θ sin ϕ ∂/∂r + (cos θ sin ϕ/r) ∂/∂θ + (cos ϕ/(r sin θ)) ∂/∂ϕ )
   = iℏ ( sin ϕ ∂/∂θ + cot θ cos ϕ ∂/∂ϕ ) ,        (4.12)

L2 = −iℏ z ∂/∂x + iℏ x ∂/∂z
   = −iℏ r cos θ ( sin θ cos ϕ ∂/∂r + (cos θ cos ϕ/r) ∂/∂θ − (sin ϕ/(r sin θ)) ∂/∂ϕ )
     + iℏ r sin θ cos ϕ ( cos θ ∂/∂r − (sin θ/r) ∂/∂θ )
   = iℏ ( −cos ϕ ∂/∂θ + cot θ sin ϕ ∂/∂ϕ ) ,        (4.13)
∂θ ∂ϕ

and, nicest of all,

L3 = −iℏ x ∂/∂y + iℏ y ∂/∂x
   = −iℏ r sin θ cos ϕ ( sin θ sin ϕ ∂/∂r + (cos θ sin ϕ/r) ∂/∂θ + (cos ϕ/(r sin θ)) ∂/∂ϕ )
     + iℏ r sin θ sin ϕ ( sin θ cos ϕ ∂/∂r + (cos θ cos ϕ/r) ∂/∂θ − (sin ϕ/(r sin θ)) ∂/∂ϕ )
   = −iℏ ∂/∂ϕ .        (4.14)
Note that the r-dependence dropped out! This means that Li naturally act on functions
on the sphere, as anticipated.
It will also be convenient to find expressions for the raising and lowering operators
(see also Problem Sheet 3)

L+ = L1 + iL2 = iℏ e^{iϕ} ( −i ∂/∂θ + cot θ ∂/∂ϕ ) = ℏ e^{iϕ} ( ∂/∂θ + i cot θ ∂/∂ϕ ) ,

L− = L1 − iL2 = iℏ e^{−iϕ} ( i ∂/∂θ + cot θ ∂/∂ϕ ) = ℏ e^{−iϕ} ( −∂/∂θ + i cot θ ∂/∂ϕ ) .        (4.15)
From here we see that

L² = L1² + L2² + L3²
   = L− L+ + ℏL3 + L3²
   = ℏ² e^{−iϕ} ( −∂/∂θ + i cot θ ∂/∂ϕ ) e^{iϕ} ( ∂/∂θ + i cot θ ∂/∂ϕ ) − iℏ² ∂/∂ϕ − ℏ² ∂²/∂ϕ²
   = −ℏ² ( ∂²/∂θ² + cot θ ∂/∂θ + cot²θ ∂²/∂ϕ² + ∂²/∂ϕ² )
   = −ℏ² ( (1/sin θ) ∂/∂θ ( sin θ ∂/∂θ ) + (1/sin²θ) ∂²/∂ϕ² ) .        (4.16)
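As a sanity check of the spherical form of L², one can apply the operator by finite differences to known eigenfunctions (a minimal sketch assuming NumPy, in units ℏ = 1): Y_{1,0} ∝ cos θ and Y_{1,1} ∝ sin θ e^{iϕ} should both return eigenvalue l(l + 1) = 2.

```python
import numpy as np

def L2(f, theta, phi, h=1e-4):
    """Apply -[d^2/dtheta^2 + cot(theta) d/dtheta + (1/sin^2 theta) d^2/dphi^2]
    to f(theta, phi) by central finite differences (hbar = 1)."""
    ftt = (f(theta + h, phi) - 2 * f(theta, phi) + f(theta - h, phi)) / h**2
    ft = (f(theta + h, phi) - f(theta - h, phi)) / (2 * h)
    fpp = (f(theta, phi + h) - 2 * f(theta, phi) + f(theta, phi - h)) / h**2
    return -(ftt + ft / np.tan(theta) + fpp / np.sin(theta)**2)

# Y_{1,0} ~ cos(theta): expect L^2 Y = l(l+1) Y = 2 Y
f10 = lambda t, p: np.cos(t)
# Y_{1,1} ~ sin(theta) e^{i phi}: same eigenvalue, complex-valued
f11 = lambda t, p: np.sin(t) * np.exp(1j * p)

t0, p0 = 1.1, 0.3
assert np.isclose(L2(f10, t0, p0), 2 * f10(t0, p0), atol=1e-5)
assert np.allclose(L2(f11, t0, p0), 2 * f11(t0, p0), atol=1e-5)
print("L^2 eigenvalue check passed")
```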
This is an operator that we already encountered when studying the Schrödinger equation
for spherically symmetric systems in Chapter 2 (up to a sign; for consistency with the
conventions used in the construction of su(2) irreps, we will stick to the definition (4.16)
from now on). In particular we see that the 3-dimensional Hamiltonian with a spherically
symmetric potential can be written as
H = −(ℏ²/2m) (1/r²) ∂/∂r ( r² ∂/∂r ) + Veff(r)        (4.17)

with

Veff(r) = V(r) + L²/(2mr²)
        = V(r) + ℏ² l(l + 1)/(2mr²)        (4.18)

where in the second line we have assumed that we are looking at angular momentum
eigenstates, i.e. eigenstates of L². It's an easy exercise to see that L² does indeed
commute with H.
From the last chapter, to construct su(2) irreducible representations, we need a highest
weight state

L3 |l, m⟩ = ℏm |l, m⟩,
L² |l, m⟩ = ℏ² l(l + 1) |l, m⟩,        (4.19)
L+ |l, m⟩ = 0

and we saw that the last condition implies that m = l. We would like to construct
functions on the sphere that form a basis for the l representations of su(2). To this end,
define
Yl,m (θ, ϕ) ≡ ⟨θ, ϕ|l, m⟩, (4.20)

where the inner product is to be interpreted as picking up the coefficient of |θ, ϕ⟩ in
an expansion of |l, m⟩ in a continuum of states labelled by (θ, ϕ), similar to what we do when
decomposing states in a continuum of position or momentum eigenbases (see section
1.2). Now apply ⟨θ, ϕ| to both sides of the equations in (4.19). Letting the
differential operators act on the left, we find

L3 Yl,m = mℏYl,m ,
L2 Yl,m = ℏ2 l(l + 1)Yl,m . (4.21)

The set {Yl,m |m = −l, ..., +l} therefore provides an irreducible, (2l + 1)-dimensional
representation of su(2) inside the space of all differentiable functions on S 2 (viewed as
the unit sphere in R3 ).
Let us look for the associated eigenfunctions corresponding to the states |l, m⟩, which
are also known as spherical harmonics. We can construct them easily as solutions to
differential equations imposed by (4.21), namely

L3 Yl,m = −iℏ ∂Yl,m /∂ϕ = mℏ Yl,m        (4.22)

Thus

Yl,m = eimϕ Θl,m (θ) (4.23)



for some function Θl,m (θ). To fix Θl,m , we start with the highest weight state |l, l⟩. We
know that

0 = L+ Yl,l
  = iℏ e^{iϕ} ( −i ∂/∂θ + cot θ ∂/∂ϕ ) e^{ilϕ} Θl,l(θ)

⇒ 0 = ( −i d/dθ + il cot θ ) Θl,l(θ)        (4.24)

This is easily solved:

dΘl,l /Θl,l = l cot θ dθ = l d ln sin θ
⇒ Θl,l(θ) = Cl (sin θ)^l        (4.25)

We can then obtain

Yl,m ∝ (L− )^{l−m} Yl,l
     = Cl ( ℏ e^{−iϕ} ( −∂/∂θ + i cot θ ∂/∂ϕ ) )^{l−m} (sin θ)^l e^{ilϕ}
     = Θl,m(θ) e^{imϕ}        (4.26)

for some explicit function Θl,m. Note that we haven't worried about the normalization
here, but one can. In practice these functions are tabulated in books or built into
Mathematica.
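The highest-weight construction can be spot-checked numerically; a sketch (NumPy assumed, ℏ = 1) verifying that Y_{l,l} ∝ (sin θ)^l e^{ilϕ} is annihilated by L+ and has L3 eigenvalue l, using the differential forms (4.14) and (4.15):

```python
import numpy as np

hbar = 1.0  # working in units hbar = 1

def Y_ll(l):
    # highest-weight wavefunction C_l (sin theta)^l e^{i l phi}, with C_l = 1
    return lambda t, p: np.sin(t)**l * np.exp(1j * l * p)

def L_plus(f, t, p, h=1e-5):
    """L+ = hbar e^{i phi} (d/dtheta + i cot(theta) d/dphi), finite differences."""
    ft = (f(t + h, p) - f(t - h, p)) / (2 * h)
    fp = (f(t, p + h) - f(t, p - h)) / (2 * h)
    return hbar * np.exp(1j * p) * (ft + 1j * fp / np.tan(t))

def L3(f, t, p, h=1e-5):
    """L3 = -i hbar d/dphi, finite differences."""
    fp = (f(t, p + h) - f(t, p - h)) / (2 * h)
    return -1j * hbar * fp

t0, p0 = 0.9, 1.7
for l in (1, 2, 3):
    Y = Y_ll(l)
    assert abs(L_plus(Y, t0, p0)) < 1e-6              # L+ kills the top state
    assert np.allclose(L3(Y, t0, p0), l * Y(t0, p0))  # L3 eigenvalue l
print("highest-weight spherical harmonics verified")
```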
A special case of a beautiful result in mathematics (harmonic analysis), known as the
Peter-Weyl theorem then implies that the regular representation of su(2) on functions on
S² decomposes as a direct sum of irreducible, unitary representations. In other words,

ψ(θ, ϕ) ≡ ⟨θ, ϕ|ψ⟩ = Σ_{ℓ=0}^{∞} Σ_{m=−ℓ}^{ℓ} cℓ,m ⟨θ, ϕ|ℓ, m⟩ = Σ_{ℓ=0}^{∞} Σ_{m=−ℓ}^{ℓ} cℓ,m Yℓ,m(θ, ϕ).        (4.27)

In fact this is a general result which applies to any compact group G and allows one to
decompose L2 functions on G as direct sums of all unitary, irreducible representations of
G. The most familiar example of this fact is the Fourier series, in which case G = U (1)
(the group of unit complex numbers).
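In the U(1) case the statement is just the Fourier series. A quick numerical illustration (NumPy assumed): projecting cos²ϕ onto the characters e^{imϕ} recovers the expected coefficients 1/2 at m = 0 and 1/4 at m = ±2.

```python
import numpy as np

# cos^2(phi) = 1/2 + (1/4) e^{2i phi} + (1/4) e^{-2i phi}
N = 256
phi = 2 * np.pi * np.arange(N) / N
f = np.cos(phi)**2

def coeff(m):
    # c_m = (1/2pi) integral of f(phi) e^{-i m phi} dphi, as a discrete mean
    return np.mean(f * np.exp(-1j * m * phi))

assert np.isclose(coeff(0), 0.5)
assert np.isclose(coeff(2), 0.25) and np.isclose(coeff(-2), 0.25)
assert np.isclose(coeff(1), 0, atol=1e-12)
print("U(1) harmonic analysis (Fourier) example passed")
```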

4.2 Addition of Angular Momentum


So far we have looked at representations of su(2). We have seen that these describe single-
particle systems that carry spin or angular momentum. How do we deal with 2 or more
particles? Multi-particle systems are described as states in a Hilbert space constructed
from the single-particle Hilbert spaces by taking the tensor product. For example, for a
2-particle system with states

p1 ∈ H1 ,   p2 ∈ H2 ,        (4.28)

the joint state (p1 p2 ) is an element of the tensor product

(p1 p2 ) ∈ H1 ⊗ H2 . (4.29)

If the dimensions of H1 and H2 are n1 and n2 respectively, the dimension of H1 ⊗ H2
is the product n1 · n2 . Given basis elements {e1 , · · · , e_{n1}} of H1 and
{f1 , · · · , f_{n2}} of H2 , one can form a basis of H1 ⊗ H2 from pairs of elements, one from
H1 and one from H2 , namely {e1 ⊗ f1 , e1 ⊗ f2 , · · · , e_{n1} ⊗ f_{n2}}. Elements of H1 ⊗ H2 are
then linear combinations of these.
For example, consider 2 qubits each characterized by a state in a 2-dimensional
representation of su(2), which we denote by V1 , V2 . The joint state will then be an
element of V1 ⊗ V2 , which is a 4-dimensional Hilbert space spanned by
|1/2, 1/2⟩ ⊗ |1/2, 1/2⟩ ,  |1/2, 1/2⟩ ⊗ |1/2, −1/2⟩ ,  |1/2, −1/2⟩ ⊗ |1/2, 1/2⟩ ,  |1/2, −1/2⟩ ⊗ |1/2, −1/2⟩.        (4.30)
Now let's consider two irreps of su(2), of dimensions 2j1 + 1 and 2j2 + 1 respec-
tively. We will denote their tensor product by (2j1 + 1) ⊗ (2j2 + 1). In general, such
representations are reducible. We would like to find the su(2) invariant subspaces of
this representation. To do this, we follow the raising/lowering operator based approach
discussed before. On the one hand, states in (2j1 + 1) ⊗ (2j2 + 1) are labelled by
eigenvalues of the 4 commuting operators:
(J^(1))² ,  J_3^(1) ,  (J^(2))² ,  J_3^(2)        (4.31)

where (J^(A))² ≡ (J_1^(A))² + (J_2^(A))² + (J_3^(A))², A = 1, 2 are particle labels, and

J_3^(1) ≡ J3 ⊗ I ,   J_3^(2) ≡ I ⊗ J3 ,
(J^(1))² ≡ J² ⊗ I ,   (J^(2))² ≡ I ⊗ J² .        (4.32)

In general we can construct operators acting on H1 ⊗ H2 by taking linear combinations


of tensor products of operators acting on H1 and H2 .
Since these operators act on the different factors of V1 ⊗ V2 , it is easy to check that
they commute. The decomposition of the tensor product of su(2) irreps, into su(2)
irreps is then obtained by finding the change of basis from

|j1 , m1 ; j2 , m2 ⟩ = |j1 , m1 ⟩ ⊗ |j2 , m2 ⟩, −ji ≤ mi ≤ ji (4.33)

to a basis of net angular momentum eigenstates

|j T , mT , j1 , j2 ⟩, (4.34)

where

J_i^T ≡ J_i^(1) + J_i^(2) ,   i = 1, 2, 3,
(J^T)² ≡ Σ_{i=1}^{3} (J_i^T)² ,        (4.35)

and
J_3^T |j^T, m^T, j1, j2⟩ = m^T |j^T, m^T, j1, j2⟩,
(J^T)² |j^T, m^T, j1, j2⟩ = j^T(j^T + 1) |j^T, m^T, j1, j2⟩        (4.36)

You can also check that these 4 new operators

(J^T)² ,  J_3^T ,  (J^(1))² ,  (J^(2))²        (4.37)

commute. Here j^T and m^T relate to the total angular momentum squared and the total
angular momentum around the z-axis. Importantly, one can check from the definitions
that
[JiT , JjT ] = iϵijk JkT , (4.38)
and it is the invariant subspaces of this su(2) algebra that we want to identify inside the
vector space (2j1 + 1) ⊗ (2j2 + 1). The corresponding raising and lowering operators
J_±^T are constructed as before.
In general, the change of basis from |j1, m1; j2, m2⟩ to a basis labelled by |j^T, m^T, j1, j2⟩
takes the form

|j^T, m^T, j1, j2⟩ = Σ_{m1=−j1}^{j1} Σ_{m2=−j2}^{j2} C^{j^T,m^T}_{m1,m2}(j1, j2) |j1, m1; j2, m2⟩        (4.39)

The constants C^{j^T,m^T}_{m1,m2}(j1, j2) are known as Clebsch-Gordan coefficients. They can be
computed recursively, as we now describe. The idea is to start with a highest weight
state with respect to J_3^T. An obvious choice is the state of maximum eigenvalue of

J3T = J3 ⊗ I + I ⊗ J3 , (4.40)

which has m^T = j1 + j2 = j^T. There is a unique state in (2j1 + 1) ⊗ (2j2 + 1) with this
property and one can easily check that J_+^T annihilates it, so it is a highest weight state with
respect to the su(2) algebra (4.38). It follows that for this state, the decomposition (4.39) is
simply

|j1 + j2, j1 + j2, j1, j2⟩ = |j1, j1⟩ ⊗ |j2, j2⟩.        (4.41)
To get the other states in the j T = j1 + j2 irrep, we now act on both sides of (4.41)
with
J_−^T = J_−^(1) + J_−^(2) .        (4.42)
This gives the equation

c−(j1 + j2, j1 + j2) |j1 + j2, j1 + j2 − 1, j1, j2⟩   (here j^T = j1 + j2 and m^T = j1 + j2 − 1)

   = c−(j1, j1) |j1, j1 − 1⟩ ⊗ |j2, j2⟩ + c−(j2, j2) |j1, j1⟩ ⊗ |j2, j2 − 1⟩,        (4.43)

where c− were defined in (??). We emphasize that the derivation of these coefficients
only relied on the existence of the su(2) algebra and not the form of the generators. It

therefore, applies equally well to the su(2) generated by JiT . Straightforward evaluation
using (??) gives
c−(j1, j1) = √(2j1) ,
c−(j2, j2) = √(2j2) ,        (4.44)
c−(j1 + j2, j1 + j2) = √(2(j1 + j2)) ,

from which we can read off the Clebsch-Gordan coefficients

C^{j1+j2, j1+j2−1}_{j1−1, j2}(j1, j2) = √( j1/(j1 + j2) ) ,   C^{j1+j2, j1+j2−1}_{j1, j2−1}(j1, j2) = √( j2/(j1 + j2) ) .        (4.45)

The other coefficients can be determined by continuing to act with J_−^T on these states, until
we hit the lowest weight state |j1 + j2, −(j1 + j2), j1, j2⟩ (with j^T = j1 + j2 and m^T = −j^T).
We have therefore isolated a 2(j1 + j2) + 1 dimensional su(2) subspace inside (2j1 + 1) ⊗ (2j2 + 1).
Provided (2j1 + 1) · (2j2 + 1) − 2(j1 + j2) − 1 > 0, we can continue by applying the same procedure
to the highest weight state (with respect to J_+^T) in the complement of the subspace that we just
isolated. It is not hard to see that the next such state has m^T = j1 + j2 − 1 = j^T. Going
on like this, one can show that
(2j1 + 1) ⊗ (2j2 + 1) = ⊕_{J=|j1−j2|}^{j1+j2} (2J + 1).        (4.46)

We can check that this formula makes sense by comparing the dimensions of the
LHS and the RHS:
dim(RHS) = Σ_{J=|j1−j2|}^{j1+j2} (2J + 1) = Σ_{J=0}^{2j2} (2(J + j1 − j2) + 1)        (taking j1 ≥ j2)
         = 2 · (2j2(2j2 + 1)/2) + 2(j1 − j2)(2j2 + 1) + 2j2 + 1        (4.47)
         = (2j1 + 1)(2j2 + 1) = dim(LHS).
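The counting in (4.46)-(4.47) can be spot-checked for a few (j1, j2); a small sketch using Python's exact fractions to handle half-integer spins:

```python
from fractions import Fraction

def decompose(j1, j2):
    """Return the list of total spins J appearing in (2j1+1) x (2j2+1), per (4.46)."""
    J = abs(j1 - j2)
    out = []
    while J <= j1 + j2:
        out.append(J)
        J += 1
    return out

def dim(j):
    return int(2 * j + 1)

for j1, j2 in [(Fraction(1, 2), Fraction(1, 2)), (1, Fraction(1, 2)),
               (2, 1), (Fraction(5, 2), Fraction(3, 2))]:
    total = sum(dim(J) for J in decompose(j1, j2))
    assert total == dim(j1) * dim(j2), (j1, j2)
print("dimension counting in (4.46) verified")
```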

Let us be more specific and look at the simplest non-trivial case of j1 = j2 = 1/2,
which appeared before. To clean up the notation, in this exercise we denote

|1/2, m1; 1/2, m2⟩ ≡ |m1, m2⟩ = |m1⟩ ⊗ |m2⟩.        (4.48)
In the tensor product we have the following basis of states (this is exactly the basis of
(4.30))

| 21 , 12 ⟩ = | 21 ⟩ ⊗ | 12 ⟩
| 12 , − 12 ⟩ = | 21 ⟩ ⊗ | − 12 ⟩
| − 21 , 12 ⟩ = | − 21 ⟩ ⊗ | 12 ⟩
| − 21 , − 12 ⟩ = | − 21 ⟩ ⊗ | − 21 ⟩. (4.49)

These are eigenstates of J_3^(1) + J_3^(2):

(J_3^(1) + J_3^(2)) |1/2, 1/2⟩ = J3 |1/2⟩ ⊗ |1/2⟩ + |1/2⟩ ⊗ J3 |1/2⟩ = (1/2 + 1/2) |1/2, 1/2⟩ = |1/2, 1/2⟩
(J_3^(1) + J_3^(2)) |1/2, −1/2⟩ = (1/2 − 1/2) |1/2, −1/2⟩ = 0
(J_3^(1) + J_3^(2)) |−1/2, 1/2⟩ = (−1/2 + 1/2) |−1/2, 1/2⟩ = 0
(J_3^(1) + J_3^(2)) |−1/2, −1/2⟩ = (−1/2 − 1/2) |−1/2, −1/2⟩ = −|−1/2, −1/2⟩        (4.50)
Thus we find the J_3^T eigenvalues (+1, 0, 0, −1). This doesn't correspond to any irre-
ducible representation of su(2), but we can split them as (0) and (−1, 0, 1), which are the
eigenvalues of the j^T = 0 and j^T = 1 representations. Thus we expect to find

2 ⊗ 2 = 1 ⊕ 3. (4.51)

Indeed, following the algorithm described above, we find the state in 1:

|j^T = 0, m^T = 0⟩ = (1/√2) |1/2, −1/2⟩ − (1/√2) |−1/2, 1/2⟩        (4.52)
and the states in 3:

|j^T = 1, m^T = −1⟩ = |−1/2, −1/2⟩
|j^T = 1, m^T = 0⟩ = (1/√2) |1/2, −1/2⟩ + (1/√2) |−1/2, 1/2⟩
|j^T = 1, m^T = 1⟩ = |1/2, 1/2⟩        (4.53)

You can check this for yourself. These form another basis for 2 ⊗ 2, one which manifestly
splits into invariant subspaces: the 3-dimensional and 1-dimensional irreducible representations of
su(2).
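The decomposition 2 ⊗ 2 = 1 ⊕ 3 can also be found by brute force: build J_i^T = J_i ⊗ I + I ⊗ J_i and diagonalise (J^T)². A minimal sketch (NumPy assumed, ℏ = 1):

```python
import numpy as np

# spin-1/2 generators (hbar = 1)
sx = np.array([[0, 1], [1, 0]]) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]]) / 2
I2 = np.eye(2)

# total angular momentum J^T_i = J_i x I + I x J_i on the 4-dim space
J = [np.kron(s, I2) + np.kron(I2, s) for s in (sx, sy, sz)]
J2 = sum(Ji @ Ji for Ji in J)

# (J^T)^2 has eigenvalues j(j+1): one 0 (singlet) and three 2 (triplet)
evals = np.sort(np.linalg.eigvalsh(J2))
assert np.allclose(evals, [0, 2, 2, 2])

# the singlet is (|+-> - |-+>)/sqrt(2), in the basis ordering of (4.49)
singlet = np.array([0, 1, -1, 0]) / np.sqrt(2)
assert np.allclose(J2 @ singlet, 0)
print("2 x 2 = 1 + 3 verified")
```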

4.3 An application of spin: Magnetic Resonance Imaging (MRI)
So far we have only discussed symmetries, in particular those generated by su(2) elements.
Particles are defined to be described by states in irreducible representations of symmetry
algebras. In this section we will focus on the su(2) symmetry and consider
a spin-1/2 particle in 3d, or simply a 2-dimensional irreducible representation of su(2).
The Pauli matrices defined before allow us to define the operators (or observables)

Sx = (ℏ/2) [ 0 1 ; 1 0 ] = (ℏ/2) σx ,  Sy = (ℏ/2) [ 0 −i ; i 0 ] = (ℏ/2) σy ,  Sz = (ℏ/2) [ 1 0 ; 0 −1 ] = (ℏ/2) σz .        (4.54)

The eigenvalues of these matrices are ± ℏ2 which are the values that one observes when
measuring spin along the x, y or z axes (recall the Stern-Gerlach experiment).
Let's put the qubit in a magnetic field B = (0, 0, B0), in which case the Hamiltonian
takes the form

H = −µ · B = −(ℏ/2) γ B0 σz ,   µz ≡ γ Sz ,        (4.55)

for some constant γ. The state of the system evolves according to the Schrödinger
equation

iℏ ∂/∂t |ψ(t)⟩ = −(ℏ/2) γ [ B0 0 ; 0 −B0 ] |ψ(t)⟩.        (4.56)
Let {|0⟩, |1⟩} be a basis for the 2 irrep of su(2). The state of the spin at some time t
must be a linear combination of these basis elements with time-dependent coefficients,

|ψ(t)⟩ = a(t)|0⟩ + b(t)|1⟩, a, b ∈ C, |a|2 + |b|2 = 1. (4.57)

Plugging this into (4.56), we find a system of differential equations for a(t), b(t):

∂/∂t (a, b)ᵀ = (iγ/2) [ B0 0 ; 0 −B0 ] (a, b)ᵀ  ⟹  a(t) = e^{iγB0 t/2} a(0) ,  b(t) = e^{−iγB0 t/2} b(0).        (4.58)

Without loss of generality, we take the spin |ψ(0)⟩ to “point” in the n̂ direction (this
amounts to choosing a boundary condition). Of course, you shouldn’t take this literally,
|ψ⟩ is a vector in a 2-dimensional complex vector space. What this means is that |ψ(0)⟩
is an eigenstate of the spin in that direction, i.e.

n̂ · Ŝ |ψ(0)⟩ = (ℏ/2) |ψ(0)⟩.        (4.59)

Further (for simplicity) taking n̂ = (0, sin θ, cos θ), i.e. ϕ = π/2, we have

n̂ · Ŝ = (ℏ/2)( sin θ σy + cos θ σz ) = (ℏ/2) [ cos θ  −i sin θ ; i sin θ  −cos θ ] ,        (4.60)

whose eigenvalues are ±ℏ/2. The eigenvector corresponding to the positive eigenvalue is then
(check!)

|ψ(0)⟩ = cos(θ/2) |0⟩ + i sin(θ/2) |1⟩ = ( cos(θ/2), i sin(θ/2) )ᵀ .        (4.61)
From this boundary condition, we can read off a(0) and b(0), and so

|ψ(t)⟩ = ( cos(θ/2) e^{iω0 t/2}, i sin(θ/2) e^{−iω0 t/2} )ᵀ ,   ω0 ≡ γ B0 .        (4.62)

This means that the spin will precess around the B0 ẑ axis with frequency ω0. To see
this you can check that (4.62) is an eigenstate of n̂(t) · Ŝ, where

n̂(t) = (sin θ sin ω0 t, sin θ cos ω0 t, cos θ).        (4.63)
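A quick numerical check of the precession (NumPy assumed, ℏ = 1); note that the sign of the x-component of the precessing direction below is fixed by requiring (4.62) to remain an eigenstate at all times:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]]) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]]) / 2

theta, omega0 = 0.8, 2.5   # sample tilt angle and Larmor frequency

def psi(t):
    # the evolved spinor (4.62), hbar = 1
    return np.array([np.cos(theta / 2) * np.exp(1j * omega0 * t / 2),
                     1j * np.sin(theta / 2) * np.exp(-1j * omega0 * t / 2)])

def n_of_t(t):
    # the precessing spin direction (x-sign chosen so psi(t) is an eigenstate)
    return np.array([np.sin(theta) * np.sin(omega0 * t),
                     np.sin(theta) * np.cos(omega0 * t),
                     np.cos(theta)])

for t in (0.0, 0.37, 1.9):
    n = n_of_t(t)
    nS = n[0] * sx + n[1] * sy + n[2] * sz
    # psi(t) stays the +1/2 eigenstate of n(t).S
    assert np.allclose(nS @ psi(t), 0.5 * psi(t))
print("Larmor precession verified")
```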

MRI essentially exploits the fact that spins couple, hence react to and can be manipu-
lated with external magnetic fields. In particular, one can use a time-dependent, probe
magnetic field to flip spin states from |0⟩ to |1⟩. These states have different energies.
Every time the transition happens, the change in state (hence energy) of the system is
compensated by the emission of radiation. Upon detection, this allows one to get infor-
mation about the structure of molecules. Specifically, molecules consist of many spins
in general (here is where addition of su(2) would be relevant in practice); each spin will
experience a local B field (analog of the constant B0 above) due to other atoms around.
Molecules will hence have several resonance frequencies depending on their constituents
which can be found by applying a probe field (see B(t) below). In this way, one can get
information about the structure of molecules.
To be specific, let's apply to our spin, in addition to the constant magnetic field B0,
a time-dependent probe field B(t) that rotates in the x − y plane, namely

B = (B1 cos ωt, −B1 sin ωt, 0) . (4.64)

The new Hamiltonian is time-dependent and takes the form

HB = −(ℏγ/2) [ B0  B1 e^{iωt} ; B1 e^{−iωt}  −B0 ]        (4.65)

The new time-dependent Schrödinger equation is

∂/∂t ( a(t), b(t) )ᵀ = (iγ/2) [ B0  B1 e^{iωt} ; B1 e^{−iωt}  −B0 ] ( a(t), b(t) )ᵀ .        (4.66)

This one is harder to solve directly because the Hamiltonian is time-dependent (more
on this later).
Luckily, in this case we can actually still solve it exactly by noting that we can go to
a rotating frame in which the Hamiltonian becomes time-independent. In other words,
we can change basis from (a, b) to (a′, b′), where

( a′(t), b′(t) )ᵀ = [ e^{−iωt/2}  0 ; 0  e^{iωt/2} ] ( a(t), b(t) )ᵀ ≡ R(t) ( a(t), b(t) )ᵀ .        (4.67)

Substituting this into our Schrödinger equation (4.66) we get

(dR†/dt) ( a′(t), b′(t) )ᵀ + R† ( ȧ′(t), ḃ′(t) )ᵀ = (iγ/2) [ B0  B1 e^{iωt} ; B1 e^{−iωt}  −B0 ] R† ( a′(t), b′(t) )ᵀ        (4.68)

or equivalently

( ȧ′(t), ḃ′(t) )ᵀ = iH′ ( a′(t), b′(t) )ᵀ ,        (4.69)

with

H′ ≡ (γ/2) R [ B0  B1 e^{iωt} ; B1 e^{−iωt}  −B0 ] R† + iR (dR†/dt) = (1/2) [ −∆ω  ω1 ; ω1  ∆ω ] ,   ω1 ≡ γB1 ,  ∆ω ≡ ω − ω0 .        (4.70)

You can check this by direct substitution (or in Mathematica). The nice feature of (4.69)
is that H′ is again time independent, and so we have (see Chapter 2)
|ψ′(t)⟩ ≡ ( a′(t), b′(t) )ᵀ = e^{iH′t} ( a′(0), b′(0) )ᵀ .        (4.71)

In terms of our original variables we have

( a(t), b(t) )ᵀ = R†(t) ( a′(t), b′(t) )ᵀ = R†(t) e^{iH′t} ( a(0), b(0) )ᵀ .        (4.72)
Now starting with ( a(0), b(0) ) = (1, 0), i.e. a(0) = 1, b(0) = 0, given the solution above, we find the
probability that the state transitions to (0, 1)ᵀ to be

P12 = |b(t)|² = ( ω1²/(ω1² + ∆ω²) ) sin²( (t/2) √(ω1² + ∆ω²) ) .        (4.73)

We see that for ∆ω = 0 (at resonance), P12 = 1 at the special times

t = (2n + 1)π / √(ω1² + ∆ω²) = (2n + 1)π / ω1 .        (4.74)

These are called Rabi oscillations and are at the heart of MRI imaging.
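The Rabi formula (4.73) can be checked against a direct numerical integration of the time-dependent equation (4.66); the RK4 integrator and the sample field strengths below are illustrative choices, not part of the notes.

```python
import numpy as np

omega0, omega1 = 2 * np.pi * 5.0, 2 * np.pi * 0.5   # gamma*B0, gamma*B1 (sample values)

def rhs(t, psi, omega):
    # d/dt psi = (i/2) M(t) psi, per (4.66), with omega0 = gamma*B0, omega1 = gamma*B1
    M = np.array([[omega0, omega1 * np.exp(1j * omega * t)],
                  [omega1 * np.exp(-1j * omega * t), -omega0]])
    return 0.5j * M @ psi

def evolve(omega, T, steps=4000):
    """Integrate the time-dependent Schrodinger equation (4.66) with RK4."""
    psi = np.array([1, 0], dtype=complex)
    h, t = T / steps, 0.0
    for _ in range(steps):
        k1 = rhs(t, psi, omega)
        k2 = rhs(t + h / 2, psi + h / 2 * k1, omega)
        k3 = rhs(t + h / 2, psi + h / 2 * k2, omega)
        k4 = rhs(t + h, psi + h * k3, omega)
        psi = psi + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return psi

def rabi(omega, t):
    # the closed-form transition probability (4.73)
    d = omega - omega0
    Om = np.sqrt(omega1**2 + d**2)
    return omega1**2 / Om**2 * np.sin(Om * t / 2)**2

for omega in (omega0, omega0 + omega1):             # on and off resonance
    T = 0.8
    p_numeric = abs(evolve(omega, T)[1])**2
    assert np.isclose(p_numeric, rabi(omega, T), atol=1e-4)
print("Rabi formula confirmed against direct integration")
```

Note that no rotating-wave approximation is needed here: for a circularly polarized drive the rotating-frame solution is exact, which is why the agreement is to integrator precision.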
Chapter 5

The Hydrogen Atom

Let us put what we learnt above to use in solving for an electron moving around a
positively charged nucleus. We consider it as a two-body problem where the position of
the electron is denoted by re and that of the nucleus by rn . Thus our wavefunction is

Ψ(t, re , rn ) = e−iEt/ℏ ψ(re , rn ). (5.1)

Furthermore the potential arising from the electrostatic force between them is

V = -\frac{Ze^2}{|r_n - r_e|}   (5.2)

where Z is the atomic number (the number of protons in the nucleus) and e is the charge of an electron in appropriate units. Thus the time-independent Schrödinger equation is

E\psi = -\frac{\hbar^2}{2m_e}\nabla^2_{r_e}\psi - \frac{\hbar^2}{2m_n}\nabla^2_{r_n}\psi - \frac{Ze^2}{|r_n - r_e|}\psi   (5.3)

where m_e and m_n are the masses of the electron and the nucleus respectively.

5.1 The Quantum 2-Body Problem


Thus we need to solve for a wavefunction of six coordinates. As in classical mechanics
we can exploit the fact that the potential only depends on rn − re to reduce the problem
to two one-body problems:

ψ(re , rn ) = ψCoM (R)ψrel (r12 ) (5.4)

for suitable choices of R and r12 namely:


r_{12} = r_e - r_n, \qquad R = \frac{m_e r_e + m_n r_n}{m_e + m_n}.   (5.5)
In particular it is a small exercise (see the problem set) to show that

-\frac{\hbar^2}{2m_e}\nabla^2_{r_e} - \frac{\hbar^2}{2m_n}\nabla^2_{r_n} = -\frac{\hbar^2}{2M}\nabla^2_{R} - \frac{\hbar^2}{2\mu}\nabla^2_{r_{12}}   (5.6)


where M = me + mn and µ = me mn /M .
Thus our time-independent Schrödinger equation is

E\psi = -\frac{\hbar^2}{2M}\nabla^2_R\,\psi - \frac{\hbar^2}{2\mu}\nabla^2_{r_{12}}\psi - \frac{Ze^2}{|r_{12}|}\psi.   (5.7)
Now we can use separation of variables and consider the Ansatz

ψ(re , rn ) = ψCoM (R)ψrel (r12 ). (5.8)

We therefore find two eigenvalue equations

\epsilon_{CoM}\,\psi_{CoM} = -\frac{\hbar^2}{2M}\nabla^2_R\,\psi_{CoM}

\epsilon_{rel}\,\psi_{rel} = -\frac{\hbar^2}{2\mu}\nabla^2_{r_{12}}\psi_{rel} - \frac{Ze^2}{|r_{12}|}\psi_{rel}   (5.9)
with E = ϵCoM + ϵrel .

5.2 The Hydrogen Atom


The centre of mass part is trivial to solve:

ψCoM = eik·R χ(k) (5.10)

with ϵCoM = ℏ2 |k|2 /2M . These just describe a free particle, the atom, in a basis where
the linear momentum is fixed. We can then construct wave packets through

\psi_{CoM}(R) = \int \frac{d^3k}{(2\pi)^3}\, e^{ik\cdot R}\,\chi(k)   (5.11)
which will not be energy eigenstates but can be localized in position to an arbitrary
accuracy.
The more interesting part is to solve for ψrel (r12 ). Here we can switch from r12 to
spherical coordinates and use separation of variables yet again:

ψrel = ul (r)Yl,m (θ, ϕ) (5.12)

where r = |r12 |. We already know what the Yl,m ’s are and we know that ul satisfies:

-\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{du_l}{dr}\right) + \frac{2\mu}{\hbar^2}\left(-\frac{Ze^2}{r} + \frac{\hbar^2 l(l+1)}{2\mu r^2} - \epsilon_{rel}\right)u_l = 0   (5.13)

So solving the Hydrogen atom (which corresponds to Z = 1, but we can be more general without causing any more pain) comes down to solving this equation, finding u_l and E, and hence ψ_rel and ψ(r_e, r_n).
To continue we write

ul (r) = rl fl (r) (5.14)



to find
-\frac{d^2 f_l}{dr^2} - \frac{2(l+1)}{r}\frac{df_l}{dr} - \frac{2\mu Ze^2}{\hbar^2 r}f_l - \frac{2\mu\epsilon_{rel}}{\hbar^2}f_l = 0   (5.15)
This substitution is relatively standard in spherically symmetric examples as it removes
the l(l + 1) term from the equation.
To continue we look at the large r limit where only the first and last terms are
important. If ϵrel > 0 then we find oscillating solutions:
f \sim C_1\cos\!\left(\sqrt{2\mu\epsilon_{rel}}\,r/\hbar\right) + C_2\sin\!\left(\sqrt{2\mu\epsilon_{rel}}\,r/\hbar\right)   (5.16)

which will not be normalizable. Solutions with ϵrel = 0 will also not be normalizable.
Thus we conclude that ϵrel < 0. In this case we expect solutions, at large r, to be of the
form
f \sim C_1 e^{-\sqrt{-2\mu\epsilon_{rel}}\,r/\hbar} + C_2 e^{+\sqrt{-2\mu\epsilon_{rel}}\,r/\hbar}   (5.17)

where now we must set C2 = 0 to find a normalizable solution.



Thus our last step is to write f_l = P_l(r)\,e^{-\sqrt{-2\mu\epsilon_{rel}}\,r/\hbar}, which leads to

-\frac{d^2 P_l}{dr^2} - \frac{2(l+1)}{r}\frac{dP_l}{dr} - \frac{2\mu Ze^2}{\hbar^2 r}P_l + \frac{2}{\hbar}\sqrt{-2\mu\epsilon_{rel}}\,\frac{dP_l}{dr} + \frac{2(l+1)}{r}\frac{\sqrt{-2\mu\epsilon_{rel}}}{\hbar}\,P_l = 0   (5.18)
This probably doesn’t look much better but it is! Indeed it has polynomial solutions.
To see this let us assume we have a polynomial solution and look at the highest-order term P_l \sim r^n + \ldots. The differential equation has a leading-order r^{n-1} term whose coefficient must vanish:

-\frac{2\mu Ze^2}{\hbar} + 2n\sqrt{-2\mu\epsilon_{rel}} + 2(l+1)\sqrt{-2\mu\epsilon_{rel}} = 0.   (5.19)

We can rearrange this to give

\epsilon_{rel} = -\frac{\mu Z^2 e^4}{2\hbar^2}\left(\frac{1}{n+l+1}\right)^2   (5.20)

Let us return to the differential equation, which we now write as

\frac{d^2P_l}{dr^2} + \frac{2(l+1)}{r}\frac{dP_l}{dr} + \frac{2\mu Ze^2/\hbar^2}{r}P_l - 2\,\frac{\mu Ze^2/\hbar^2}{n+l+1}\frac{dP_l}{dr} - 2(l+1)\,\frac{\mu Ze^2/\hbar^2}{n+l+1}\frac{1}{r}P_l = 0   (5.21)

and next introduce ρ = 2µZe²r/ℏ² to find

(n+l+1)\left[\frac{d^2P_l}{d\rho^2} + \frac{2(l+1)}{\rho}\frac{dP_l}{d\rho} + \frac{1}{\rho}P_l\right] - \frac{dP_l}{d\rho} - \frac{(l+1)}{\rho}P_l = 0   (5.22)

This is more amenable to solving through an expansion


P_{n,l} = \sum_{k=0}^{n} C_k\,\rho^k   (5.23)

Substituting back in simply leads to a recursion relation obtained from setting the coefficient of ρ^{k−2} to zero:

(n+l+1)\big(k(k-1) + 2k(l+1)\big)C_k + \big((n+l+1) - (k-1) - (l+1)\big)C_{k-1} = 0

\Longrightarrow\quad C_{k-1} = -\frac{(n+l+1)\big(k(k-1) + 2k(l+1)\big)}{n-k+1}\,C_k   (5.24)

We see that taking k = 0 leads to C−1 = 0 and hence the series indeed terminates giving
a polynomial. These polynomials are known as Laguerre Polynomials. We haven’t
shown here that these are the only normalizable solutions but this does turn out to be
the case.
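The recursion (5.24) can be checked symbolically. The sketch below (using sympy; the helper names are ours, not from the text) builds P_{n,l} by running the recursion down from C_n = 1 and verifies that the result solves the differential equation (5.22) exactly.

```python
import sympy as sp

rho = sp.symbols('rho', positive=True)

def radial_polynomial(n, l):
    """Build P_{n,l} from the recursion (5.24), starting from C_n = 1 and working down."""
    C = {n: sp.Integer(1)}
    for k in range(n, 0, -1):
        C[k - 1] = -sp.Rational((n + l + 1) * (k * (k - 1) + 2 * k * (l + 1)),
                                n - k + 1) * C[k]
    return sum(C[k] * rho**k for k in range(n + 1))

def ode_residual(n, l):
    """Left-hand side of (5.22) evaluated on P_{n,l}; should simplify to zero."""
    P = radial_polynomial(n, l)
    lhs = ((n + l + 1) * (sp.diff(P, rho, 2)
                          + 2 * (l + 1) / rho * sp.diff(P, rho)
                          + P / rho)
           - sp.diff(P, rho) - (l + 1) / rho * P)
    return sp.simplify(lhs)

print(radial_polynomial(1, 0))   # rho - 4
print(ode_residual(2, 1))        # 0
```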
In summary our solutions are of the form

\psi(r_e, r_n) = \psi_{CoM}(R)\,\psi_{rel}(r_e - r_n)

\psi_{rel}(r_e - r_n) = r^l\, e^{-\frac{\mu Ze^2 r}{(n+l+1)\hbar^2}}\, P_{n,l}(r)\,\Theta_l(\cos\theta)\, e^{im\phi}   (5.25)

and ψCoM is a generic free wave packet. It’s fun to plot |ψrel |2 for various choices of n, l
and m. For example look here https://en.wikipedia.org/wiki/Atomic_orbital#
/media/File:Hydrogen_Density_Plots.png On the other hand there are actual pic-
tures of atoms such as here: https://www.nature.com/articles/498009d
We can also reproduce the famous formula postulated by Bohr in the earliest days of
Quantum Mechanics (before all this formalism that we have discussed was formulated):

E_{Bohr} = -\frac{\mu Z^2 e^4}{2\hbar^2}\frac{1}{N^2}   (5.26)

for some integer N = n + l + 1 = 1, 2, 3, .... In particular, since a proton is about 2000 times more massive than an electron, we have m_n ≫ m_e and so µ ≈ m_e to very high accuracy. Thus one finds (for Hydrogen, where Z = 1)

E_{Bohr} \sim -\frac{13.6\ \mathrm{eV}}{N^2} \sim -\frac{2.18\times 10^{-18}\ \mathrm{J}}{N^2}   (5.27)
This in turn also leads to the Rydberg formula for the energy spectrum of light coming
from excited atoms - known as spectral lines. In particular the energy of a photon (a
single particle of light) corresponds to the energy lost when an electron drops from a
higher orbital to a lower one (N1 > N2 ):

E_{photon} = -\frac{m_e Z^2 e^4}{2\hbar^2}\frac{1}{N_1^2} + \frac{m_e Z^2 e^4}{2\hbar^2}\frac{1}{N_2^2} = R\left(\frac{1}{N_2^2} - \frac{1}{N_1^2}\right)   (5.28)

where we have introduced the Rydberg constant R = m_e Z^2 e^4/(2\hbar^2).
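Plugging in numbers is a good check of (5.26) and (5.28). The sketch below evaluates the Rydberg constant and the Balmer N₁ = 3 → N₂ = 2 line in SI units, where e² is replaced by e²/(4πϵ₀); the constant values are standard ones we supply, not from the text.

```python
import math

# Standard SI constant values (supplied by us, not from the text).
hbar = 1.054571817e-34    # J s
me   = 9.1093837015e-31   # kg
e    = 1.602176634e-19    # C
eps0 = 8.8541878128e-12   # F/m
c    = 2.99792458e8       # m/s

# In SI units e^2 is replaced by e^2/(4 pi eps0), so the Rydberg constant
# R = me e^4 / (2 hbar^2) of (5.28) becomes:
R = me * e**4 / (2 * hbar**2 * (4 * math.pi * eps0)**2)   # in Joules

def bohr_energy_eV(N):
    """Bohr energy (5.26) for hydrogen (Z = 1, mu ~ me), in electron-volts."""
    return -R / e / N**2

print(bohr_energy_eV(1))   # ≈ -13.6 eV

# Rydberg formula: the Balmer transition N1 = 3 -> N2 = 2.
E_photon = R * (1 / 2**2 - 1 / 3**2)
wavelength_nm = 2 * math.pi * hbar * c / E_photon * 1e9
print(wavelength_nm)       # ≈ 656 nm, the red H-alpha spectral line
```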


We can also ask how many states there are at each energy level:

• N = 1: n = l = 0, giving 1 state

• N = 2: n = 1, l = 0 or n = 0, l = 1, giving 1 + 3 = 4 states

• N = 3: n = 2, l = 0 or n = 1, l = 1 or n = 0, l = 2, giving 1 + 3 + 5 = 9 states

In fact we find N 2 states for each energy level (the sum of the first N odd numbers is
N 2 ).
Now, what are the lowest energy states with multiple electrons? In fact there are twice as many states as counted above, since each electron can be spin up or spin down (more on this later). This is just an additional quantum number (which only takes two values: up or down) and corresponds to the fact that the correct rotation group of Nature isn't SO(3) but SU(2) = Spin(3). Note also that the terminology of spin up and spin down has no physical interpretation in terms of up and down: if you take an electron of spin up to Australia it does not become spin down. It has more to do with the fact that we write vectors as column matrices, with the up state on top and the down state on the bottom. We must also know another fact: no two electrons can be in the same state (the multi-electron wavefunction must be anti-symmetric, i.e. odd, under interchange of any two electrons). Thus the degeneracies of low-energy multi-electron states are (ignoring inter-electron interactions, which become increasingly important)

• 2: n = l = 0

• 8: 2 states with n = 1, l = 0 and 6 from n = 0, l = 1

• 18: 2 from n = 2, l = 0, 6 from n = 1, l = 1 and 10 from n = 0, l = 2

This pattern is evident in the periodic table whose rows have 2, 8 and 18 elements. We
have now predicted this based on the “crazy” idea that states are vectors in a Hilbert
space and observables are self-adjoint linear maps.
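The counting above is a one-liner to verify: summing 2l + 1 over l = 0, …, N − 1 gives N² states per level, or 2N² once spin is included.

```python
def degeneracy(N, spin=False):
    """Number of hydrogen states at Bohr level N = n + l + 1 (optionally with spin)."""
    count = sum(2 * l + 1 for l in range(N))   # l = 0, 1, ..., N - 1
    return 2 * count if spin else count

print([degeneracy(N) for N in (1, 2, 3)])              # [1, 4, 9]
print([degeneracy(N, spin=True) for N in (1, 2, 3)])   # [2, 8, 18]
```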
Chapter 6

Time Independent Perturbation Theory

Next we must face the fact that solving the Schrödinger equation in general is too
complicated and we need to make approximations to make progress (for spherically
symmetric systems one is on better ground but still one has to solve a second order
ODE). The idea is to find a system you can solve exactly, for example a free, non-
interacting, system or a system you have solved exactly such as a single electron in an
atom, and then imagine that you perturb it (say by adding in another electron or putting
the atom in a small background magnetic field). This means adding an interaction whose strength is controlled by a parameter g, which we can make as small as we like, such that setting g = 0 reduces us to the problem we can solve exactly. One then computes the physically meaningful quantities in a power series:

E = E0 + gE1 + g 2 E2 + . . . (6.1)

The constants g are referred to as coupling constants; there can be several in any given problem. The contributions at order g are called first-order, those at order g² second-order, and so on.
Computing each term in this expansion is known as perturbation theory. Much, almost
all, of Physics is done this way.
Of course in Nature g is not a free parameter but some constant that you determine
from experiment. If g is small then we say the theory is weakly coupled whereas if g
is not small then it is strongly coupled. For example in electromagnetism the relevant
coupling constant is

\alpha = \frac{e^2}{\hbar c} \sim \frac{1}{137}.   (6.2)
This is known as the fine structure constant. It is so named as it leads to corrections to the Hydrogen atom spectrum that correspond to “fine”, i.e. small, corrections to what we found above. This is indeed small. Nature was kind to Physicists as this meant
that accurate predictions from quantum electrodynamics (QED) could be made using
perturbation theory. Indeed in some cases theory and experiment agree to 13 decimal


place accuracy. This is like knowing the distance to the moon to within the width of a
human hair.
On the other hand in Quantum Chromodynamics (QCD), the theory that describes
quarks inside protons and neutrons, the relevant coupling is
\alpha_s \sim \frac{1}{2}   (6.3)
In fact neither α nor αs are constant. αs becomes larger at longer distances whereas α
gets smaller. The values I gave here are the approximate values of αs at the distance of
a proton and α at infinite distance. So we are not lucky with QCD (superconductivity is another strongly coupled system where perturbation theory fails). At distances around the size of a proton, QCD is strongly coupled and computations in perturbation theory are next to useless (but perturbation theory works well at very short, sub-proton, distances, where the theory behaves in a weakly coupled way). This is asymptotic freedom: at short distances the theory becomes weakly coupled. Pulling quarks apart increases the force between them, somewhat like a spring. Quarks are confined into protons (and other baryons such as neutrons), but confinement is still poorly understood, largely because we can't compute much.
It is important to ask what small means. It is meaningless to say that a quantity
that is dimensionful is small or large. By dimensionful we mean that it is measured in
some units. For example is a fighter jet fast? A typical speed for a fighter jet is 2000
kph and that is very fast compared to driving a car (say 100 kph):

vcar /vjet = 0.05 << 1 (6.4)

But the speed of light is about 10⁹ kph, so

v_{jet}/v_{light} = 0.000002 \ll 1   (6.5)

So, in this sense, a fighter jet is very, very slow. Thus when we do perturbation theory we need to identify a small dimensionless number. Small then means g² ≪ g, so that higher-order terms in the expansion are indeed smaller (although their coefficients might grow). Note that this requires us to compare g and g², so they must have the same units and hence g must be dimensionless.
Lastly we note that even if we have a small, even tiny, dimensionless coupling con-
stant perturbation theory can still fail. This is because not all functions admit a Taylor
expansion about g = 0. The classic example is
 1
e− g2 g ̸= 0
f (g) = (6.6)
0 g=0

This is an example of a non-analytic function. All derivatives of f (g) vanish at g = 0


and so a Taylor expansion about g = 0 gives

e^{-1/g^2} = 0 + 0 + 0 + \ldots   (6.7)

i.e. perturbation theory misses all the information in f (g). This may seem like an esoteric
example but actually functions of this form arise all the time in quantum theories as
one can see in the path integral formulation.
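The failure is easy to see numerically: f(g) = e^{−1/g²} is finite at any g ≠ 0, yet its derivatives at g = 0 (approximated below by finite differences) all vanish, so every Taylor coefficient is zero. A minimal sketch:

```python
import math

def f(g):
    """The non-analytic function (6.6)."""
    return math.exp(-1.0 / g**2) if g != 0 else 0.0

# Central finite difference for f'(0): exactly zero, since f is even and
# vanishingly small near g = 0.
h = 0.05
first_derivative = (f(h) - f(-h)) / (2 * h)
print(first_derivative)   # 0.0

# Yet f is perfectly finite at moderate coupling, so the all-zero Taylor
# series (6.7) misses everything:
print(f(0.5))             # exp(-4) ≈ 0.018
```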
In practice, in quantum theories the perturbative series of the form (6.1) do not even converge! They become more accurate as we include higher-order terms for a while, but then they get worse if you include too many, and would ultimately diverge if one could do infinitely many. In QED, where g = α = 1/137, one expects the series to start failing around the 137-th term. Such a series is known as an asymptotic series. The full theory is not divergent, and one expects the complete answer to take the form

E = (E_{0,0} + E_{0,1}g + E_{0,2}g^2 + \ldots) + (E_{1,0} + E_{1,1}g + E_{1,2}g^2 + \ldots)e^{-1/g^2} + \ldots   (6.8)

6.1 A Simple Example


If we have a finite-dimensional Hilbert space then solving the Schrödinger equation boils down to diagonalising an N × N Hermitian matrix H. For illustration purposes let us look at the simplest case of N = 2. Suppose that ϵ ∈ ℝ and take

H = \begin{pmatrix} E_1 & \epsilon \\ \epsilon & E_2 \end{pmatrix}.   (6.9)

For ϵ = 0 we clearly have two eigenvalues E_1, E_2 and eigenstates

|E_1\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |E_2\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.   (6.10)

This is the unperturbed system that we have solved exactly.


In this case the ϵ ̸= 0 system is simple enough that we can solve it exactly. So let’s
do that first and then compare with perturbation theory. It’s not hard to see that the
eigenvalues are
E^{(\epsilon)}_{1/2} = \frac{1}{2}\left(E_1 + E_2 \pm \sqrt{(E_1 - E_2)^2 + 4\epsilon^2}\right).   (6.11)
Here 1 refers to the + sign and 2 to the − sign. The square root is of a positive number
so we find real eigenvalues as we should. The eigenvectors are
|E_1^{(\epsilon)}\rangle = N_1\begin{pmatrix} 1 \\ \frac{E_1^{(\epsilon)} - E_1}{\epsilon} \end{pmatrix} = N_1\begin{pmatrix} 1 \\ \frac{-(E_1-E_2) + \sqrt{(E_1-E_2)^2 + 4\epsilon^2}}{2\epsilon} \end{pmatrix}   (6.12)

|E_2^{(\epsilon)}\rangle = N_2\begin{pmatrix} \frac{E_2^{(\epsilon)} - E_2}{\epsilon} \\ 1 \end{pmatrix} = N_2\begin{pmatrix} \frac{(E_1-E_2) - \sqrt{(E_1-E_2)^2 + 4\epsilon^2}}{2\epsilon} \\ 1 \end{pmatrix}   (6.13)

where the normalization constants are equal and given by

N_{1/2} = \frac{1}{\sqrt{1 + \frac{\left(E_2 - E_1 + \sqrt{(E_1-E_2)^2 + 4\epsilon^2}\right)^2}{4\epsilon^2}}}.   (6.14)

Not nice, but solved exactly! Note that for E_1 \neq E_2 we can smoothly take \epsilon \to 0, as

E^{(\epsilon)}_{1/2} = \frac{1}{2}\left(E_1 + E_2 \pm |E_1 - E_2|\sqrt{1 + \frac{4\epsilon^2}{(E_1 - E_2)^2}}\right)
= \frac{1}{2}\left(E_1 + E_2 \pm |E_1 - E_2| \pm \frac{2\epsilon^2}{|E_1 - E_2|} + \ldots\right)
= E_{1/2} \pm \frac{\epsilon^2}{|E_1 - E_2|} + O(\epsilon^4).   (6.15)

To first order in \epsilon we find N_{1/2} = 1 + O(\epsilon^2) and

|E_1^{(\epsilon)}\rangle = \begin{pmatrix} 1 \\ \frac{\epsilon}{E_1 - E_2} \end{pmatrix} + O(\epsilon^2), \qquad |E_2^{(\epsilon)}\rangle = \begin{pmatrix} -\frac{\epsilon}{E_1 - E_2} \\ 1 \end{pmatrix} + O(\epsilon^2).   (6.16)

Thus we see that the small dimensionless parameter is g = ϵ/|E1 − E2 |. Here we


can already identify a potential problem. What if E1 = E2 ? The problem is that even
at ϵ = 0 we could rotate |E1 ⟩ and |E2 ⟩ and still have energy eigenstates. So the limit
ϵ → 0 need not return us to where we started. We will have to be more careful.
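Before moving on, it is worth checking (6.15) numerically for a non-degenerate pair. The sketch below (with arbitrary illustrative values of E₁, E₂, ϵ of our choosing) compares exact diagonalization with the second-order formula; note the level repulsion: the lower level is pushed down, the upper one up.

```python
import numpy as np

# Illustrative values with eps/|E1 - E2| small.
E1, E2, eps = 1.0, 3.0, 0.05
H = np.array([[E1, eps], [eps, E2]])
exact = np.linalg.eigvalsh(H)        # ascending: [near E1, near E2]

# (6.15): the levels repel by eps^2/|E1 - E2| at second order.
shift = eps**2 / abs(E1 - E2)
pert = [E1 - shift, E2 + shift]      # E1 < E2 here, so the lower level goes down

print(exact)   # ≈ [0.99875, 3.00125]
print(pert)    # matches to O(eps^4)
```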

6.2 First Order Non-Degenerate Perturbation Theory
For now let us assume that E1 ̸= E2 . This is called non-degenerate perturba-
tion theory. The key assumption that needs to hold in order to solve a system with
perturbation theory is that as we turn on a small perturbation in the Hamiltonian, the
change from the original eigenstates to the new eigenstates is also small. In other words,
consider a one-parameter family of Hamiltonians

Hλ = H0 + λ∆H, λ ∈ [0, 1], (6.17)

where λ = 0 corresponds to the original Hamiltonian (whose solution is assumed to be


known) and λ = 1 corresponds to the perturbed Hamiltonian which we want to solve. Note that H_λ depends analytically on λ. Perturbation theory will work as long as the
eigenstates |Eλ ⟩ are also analytic in λ. This is just to say that small changes in the
system (Hamiltonian) lead to small changes in the outcome (state).
This need not be true in general (and indeed, we have already encountered this
phenomenon in our simple example above when E1 = E2 ). In this section, we will
take this as an assumption and see how far we can get. For concreteness, let’s try
to solve the simple, two-dimensional example of the last section in perturbation theory.
The approach is easily generalized to finite-dimensional systems/Hamiltonians with non-
degenerate energy eigenstates.

Let us introduce a parameter ϵ to keep track of the perturbative expansion. We assume that at zeroth order, i.e. ϵ = 0, we have (n = 1, 2)

H^{(0)}|E_n^{(0)}\rangle = E_n^{(0)}|E_n^{(0)}\rangle   (6.18)

where

H^{(0)} = \begin{pmatrix} E_1^{(0)} & 0 \\ 0 & E_2^{(0)} \end{pmatrix}, \qquad |E_1^{(0)}\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |E_2^{(0)}\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}   (6.19)

and we have relabelled E_n = E_n^{(0)}. Next we expand to lowest order in ϵ. In principle this involves an infinite number of terms, but let's start with just the first-order terms

H = H (0) + ϵH (1) + . . .
En = En(0) + ϵEn(1) + . . .
|En ⟩ = |En(0) ⟩ + ϵ|En(1) ⟩ + . . . (6.20)

We want to solve the following equations

H|En ⟩ = En |En ⟩ (6.21)

which to first order are

(H (0) + ϵH (1) + . . .)(|En(0) ⟩ + ϵ|En(1) ⟩ + . . .) = (En(0) + ϵEn(1) + . . .)(|En(0) ⟩ + ϵ|En(1) ⟩ + . . .) .

Collecting terms of order ϵ0 and ϵ, we find

O(ϵ0 ) : H (0) |En(0) ⟩ = En(0) |En(0) ⟩, (6.22)


O(ϵ) : H (1) |En(0) ⟩ + H (0) |En(1) ⟩ = En(1) |En(0) ⟩ + En(0) |En(1) ⟩. (6.23)

The terms of order ϵ⁰ cancel as we have solved the unperturbed problem. The O(ϵ) equation on the other hand allows us to deduce the new spectrum and eigenstates of the perturbed Hamiltonian as follows. We know H^{(0)}, H^{(1)}, E_n^{(0)} and |E_n^{(0)}⟩, so we need to solve for E_n^{(1)} and |E_n^{(1)}⟩. To this end we use the fact that the |E_n^{(0)}⟩ form an orthonormal basis. So we can take the inner product of both sides of (6.23) with |E_m^{(0)}⟩ to find

\langle E_m^{(0)}|H^{(1)}|E_n^{(0)}\rangle + \langle E_m^{(0)}|H^{(0)}|E_n^{(1)}\rangle = E_n^{(1)}\langle E_m^{(0)}|E_n^{(0)}\rangle + E_n^{(0)}\langle E_m^{(0)}|E_n^{(1)}\rangle = E_n^{(1)}\delta_{nm} + E_n^{(0)}\langle E_m^{(0)}|E_n^{(1)}\rangle,   (6.24)

where in the second equality we have used orthogonality of the |E_n^{(0)}⟩. However we also know that \langle E_m^{(0)}|H^{(0)} = E_m^{(0)}\langle E_m^{(0)}| and hence

\langle E_m^{(0)}|H^{(1)}|E_n^{(0)}\rangle = E_n^{(1)}\delta_{nm} + (E_n^{(0)} - E_m^{(0)})\langle E_m^{(0)}|E_n^{(1)}\rangle   (6.25)

We can now take n = m to find

E_n^{(1)} = \langle E_n^{(0)}|H^{(1)}|E_n^{(0)}\rangle.   (6.26)



This is the first-order correction to the energies.


We can now write, for m ≠ n,

\langle E_m^{(0)}|E_n^{(1)}\rangle = \frac{1}{E_n^{(0)} - E_m^{(0)}}\,\langle E_m^{(0)}|H^{(1)}|E_n^{(0)}\rangle   (6.27)

Here is where the non-degeneracy is important. (6.27) are nothing but the coefficients of |E_n^{(1)}⟩ in an expansion in the |E_m^{(0)}⟩ basis, for m ≠ n. To get the complete state, we still need the coefficients for m = n. These can be obtained by demanding that the complete eigenstates are properly normalized, namely

\delta_{nm} = \langle E_n|E_m\rangle = (\langle E_n^{(0)}| + \epsilon\langle E_n^{(1)}| + \ldots)(|E_m^{(0)}\rangle + \epsilon|E_m^{(1)}\rangle + \ldots) \quad\Rightarrow\quad 0 = \epsilon\langle E_n^{(1)}|E_m^{(0)}\rangle + \epsilon\langle E_n^{(0)}|E_m^{(1)}\rangle   (6.28)

Thus we learn that

\langle E_n^{(1)}|E_m^{(0)}\rangle = -\langle E_n^{(0)}|E_m^{(1)}\rangle = -\left(\langle E_m^{(1)}|E_n^{(0)}\rangle\right)^*   (6.29)

For m = n, this tells us that the real part of \langle E_n^{(0)}|E_n^{(1)}\rangle vanishes. In particular we can take

\langle E_n^{(1)}|E_n^{(0)}\rangle = \langle E_n^{(0)}|E_n^{(1)}\rangle = 0.   (6.30)

This tells us that the first order correction is orthogonal to the unperturbed eigenstate.
Finally

|E_n^{(1)}\rangle = \sum_m \langle E_m^{(0)}|E_n^{(1)}\rangle\,|E_m^{(0)}\rangle = \sum_{m\neq n} \frac{1}{E_n^{(0)} - E_m^{(0)}}\,\langle E_m^{(0)}|H^{(1)}|E_n^{(0)}\rangle\,|E_m^{(0)}\rangle   (6.31)

We can then verify that

\langle E_p^{(0)}|E_n^{(1)}\rangle = \sum_{m\neq n} \frac{1}{E_n^{(0)} - E_m^{(0)}}\,\langle E_m^{(0)}|H^{(1)}|E_n^{(0)}\rangle\,\langle E_p^{(0)}|E_m^{(0)}\rangle
= \frac{1}{E_n^{(0)} - E_p^{(0)}}\,\langle E_p^{(0)}|H^{(1)}|E_n^{(0)}\rangle
= -\left(\frac{1}{E_p^{(0)} - E_n^{(0)}}\,\langle E_n^{(0)}|H^{(1)}|E_p^{(0)}\rangle\right)^*
= -(\langle E_n^{(0)}|E_p^{(1)}\rangle)^*   (6.32)

as required to keep the eigenvectors orthonormal. Note that in this derivation, although we started with a two-dimensional Hilbert space, we didn't use this fact, so our answer holds more generally.
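As a check that the derivation is indeed general, the sketch below applies the first-order formula (6.26) to a random 5×5 Hermitian perturbation of a non-degenerate diagonal H⁽⁰⁾ (all values illustrative) and compares with exact diagonalization; the residual error is O(ϵ²), as expected.

```python
import numpy as np

rng = np.random.default_rng(0)

# Non-degenerate unperturbed spectrum plus a random Hermitian perturbation.
N = 5
E0 = np.array([0.0, 1.0, 2.5, 4.0, 6.0])
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
H1 = (A + A.conj().T) / 2

eps = 1e-4
exact = np.linalg.eigvalsh(np.diag(E0) + eps * H1)   # ascending, same order as E0

# First-order prediction (6.26): E_n ≈ E_n^(0) + eps * <n|H1|n>.
first_order = E0 + eps * np.real(np.diag(H1))

print(np.max(np.abs(exact - first_order)))   # O(eps^2), far below eps
```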

Let's look at the simple example above where

H^{(1)} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}   (6.33)

Here we see that E_n^{(1)} = \langle E_n^{(0)}|H^{(1)}|E_n^{(0)}\rangle = 0 for n = 1, 2: e.g.

\langle E_1^{(0)}|H^{(1)}|E_1^{(0)}\rangle = \begin{pmatrix} 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = 0

\langle E_2^{(0)}|H^{(1)}|E_2^{(0)}\rangle = \begin{pmatrix} 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = 0   (6.34)

Thus the energies are not corrected at first order in ϵ:

E_n = E_n^{(0)} + O(\epsilon^2)   (6.35)

This agrees with our exact result, but it is special to this example: it holds only because our H^{(1)} is purely off-diagonal, and will not be true in general.
However |E_n^{(1)}\rangle will be non-zero, as:

\langle E_1^{(0)}|H^{(1)}|E_2^{(0)}\rangle = \begin{pmatrix} 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = 1

\langle E_2^{(0)}|H^{(1)}|E_1^{(0)}\rangle = \begin{pmatrix} 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = 1   (6.36)

From which we learn

|E_1\rangle = |E_1^{(0)}\rangle + \epsilon\,\frac{\langle E_2^{(0)}|H^{(1)}|E_1^{(0)}\rangle}{E_1^{(0)} - E_2^{(0)}}\,|E_2^{(0)}\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \frac{\epsilon}{E_1^{(0)} - E_2^{(0)}}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ \frac{\epsilon}{E_1^{(0)} - E_2^{(0)}} \end{pmatrix}   (6.37)

and

|E_2\rangle = |E_2^{(0)}\rangle + \epsilon\,\frac{\langle E_1^{(0)}|H^{(1)}|E_2^{(0)}\rangle}{E_2^{(0)} - E_1^{(0)}}\,|E_1^{(0)}\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} + \frac{\epsilon}{E_2^{(0)} - E_1^{(0)}}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} -\frac{\epsilon}{E_1^{(0)} - E_2^{(0)}} \\ 1 \end{pmatrix}   (6.38)

which agrees with what we found above.

However, to see the shift in energy we will have to go to second order in perturbation theory!

6.3 Second Order Non-Degenerate Perturbation Theory
To see the correction to the energy in our simple example we need to go one step further
and consider second order terms:

H = H (0) + ϵH (1) + ϵ2 H (2) + . . .


En = En(0) + ϵEn(1) + ϵ2 En(2) + . . .
|En ⟩ = |En(0) ⟩ + ϵ|En(1) ⟩ + ϵ2 |En(2) ⟩ + . . . (6.39)

We still want to solve the following equations

H|En ⟩ = En |En ⟩ (6.40)

But now we have to expand to second order

(H (0) + ϵH (1) + ϵ2 H (2) + . . .)(|En(0) ⟩ + ϵ|En(1) ⟩ + ϵ2 |En(2) ⟩ + . . .)


= (En(0) + ϵEn(1) + ϵ2 En(2) + . . .)(|En(0) ⟩ + ϵ|En(1) ⟩ + ϵ2 |En(2) ⟩ . . .) (6.41)

But we can assume that the zeroth and first order equations have been solved so we
find, at second order,

H (2) |En(0) ⟩ + H (1) |En(1) ⟩ + H (0) |En(2) ⟩ = En(0) |En(2) ⟩ + En(1) |En(1) ⟩ + En(2) |En(0) ⟩ (6.42)
Remember that the unknowns are E_n^{(2)} and |E_n^{(2)}⟩; everything else is known. So we take matrix elements again:

\langle E_m^{(0)}|H^{(2)}|E_n^{(0)}\rangle + \langle E_m^{(0)}|H^{(1)}|E_n^{(1)}\rangle + \langle E_m^{(0)}|H^{(0)}|E_n^{(2)}\rangle
= E_n^{(0)}\langle E_m^{(0)}|E_n^{(2)}\rangle + E_n^{(1)}\langle E_m^{(0)}|E_n^{(1)}\rangle + E_n^{(2)}\langle E_m^{(0)}|E_n^{(0)}\rangle   (6.43)

which becomes

\langle E_m^{(0)}|H^{(2)}|E_n^{(0)}\rangle + \langle E_m^{(0)}|H^{(1)}|E_n^{(1)}\rangle + E_m^{(0)}\langle E_m^{(0)}|E_n^{(2)}\rangle
= E_n^{(0)}\langle E_m^{(0)}|E_n^{(2)}\rangle + E_n^{(1)}\langle E_m^{(0)}|E_n^{(1)}\rangle + E_n^{(2)}\delta_{nm}   (6.44)

Again we first look at m = n and find

\langle E_n^{(0)}|H^{(2)}|E_n^{(0)}\rangle + \langle E_n^{(0)}|H^{(1)}|E_n^{(1)}\rangle + E_n^{(0)}\langle E_n^{(0)}|E_n^{(2)}\rangle = E_n^{(0)}\langle E_n^{(0)}|E_n^{(2)}\rangle + E_n^{(2)}   (6.45)

where we have used \langle E_n^{(0)}|E_n^{(1)}\rangle = 0. The \langle E_n^{(0)}|E_n^{(2)}\rangle terms cancel, so this tells us the second-order correction to E_n:

E_n^{(2)} = \langle E_n^{(0)}|H^{(2)}|E_n^{(0)}\rangle + \langle E_n^{(0)}|H^{(1)}|E_n^{(1)}\rangle
= \langle E_n^{(0)}|H^{(2)}|E_n^{(0)}\rangle + \sum_{m\neq n} \frac{\langle E_n^{(0)}|H^{(1)}|E_m^{(0)}\rangle\langle E_m^{(0)}|H^{(1)}|E_n^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}}   (6.46)

Since we can compute everything on the right-hand side, we now know E_n^{(2)}. It is left as an exercise to compute |E_n^{(2)}⟩ (just do it for H^{(2)} = 0).

So let us try to compute the correction to the energy in our simple example. Here we have H^{(2)} = 0 and so (using the matrix elements (6.36) we computed before)

E_1^{(2)} = \frac{\langle E_2^{(0)}|H^{(1)}|E_1^{(0)}\rangle\langle E_1^{(0)}|H^{(1)}|E_2^{(0)}\rangle}{E_1^{(0)} - E_2^{(0)}} = \frac{1}{E_1 - E_2}

E_2^{(2)} = \frac{\langle E_1^{(0)}|H^{(1)}|E_2^{(0)}\rangle\langle E_2^{(0)}|H^{(1)}|E_1^{(0)}\rangle}{E_2^{(0)} - E_1^{(0)}} = -\frac{1}{E_1 - E_2}   (6.47)

which agrees with what we found above from the exact solution (6.15).
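We can confirm (6.47) numerically: the shift of the lower eigenvalue of the 2×2 example, divided by ϵ², should approach 1/(E₁ − E₂) as ϵ → 0 (the values of E₁, E₂ below are arbitrary illustrative ones).

```python
import numpy as np

E1, E2 = 2.0, 5.0
prediction = 1.0 / (E1 - E2)   # (6.47): E_1^(2) = 1/(E1 - E2) = -1/3 here

for eps in (1e-2, 1e-3):
    H = np.array([[E1, eps], [eps, E2]])
    lam = np.linalg.eigvalsh(H)[0]   # eigenvalue continuously connected to E1
    print((lam - E1) / eps**2)       # approaches -1/3 as eps -> 0
```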
Let us summarise what we found, in slightly cleaner notation. Typically (but not always) the perturbation to the Hamiltonian comes from the potential, so

H^{(1)} = V^{(1)}   (6.48)

and for simplicity we take H^{(2)} = 0. Let us denote the unperturbed eigenstates by |E_n^{(0)}\rangle = |n\rangle and introduce the matrix elements

\langle m|V^{(1)}|n\rangle = V_{mn}.   (6.49)

If we look at wavefunctions on \mathbb{R}^3, then

V_{nm} = \int \psi_n^*(x)\, V^{(1)}(x)\,\psi_m(x)\, d^3x,   (6.50)

where the ψ_n(x) are a basis of normalised eigenfunctions of the unperturbed Hamiltonian:

E_n^{(0)}\psi_n = -\frac{\hbar^2}{2m}\nabla^2\psi_n + V^{(0)}\psi_n.   (6.51)

Then our formulae are (see the problem set)

E_n = E_n^{(0)} + \epsilon V_{nn} + \epsilon^2 \sum_{m\neq n} \frac{V_{nm}V_{mn}}{E_n^{(0)} - E_m^{(0)}} + \ldots

|E_n\rangle = |n\rangle + \epsilon \sum_{m\neq n} \frac{V_{mn}}{E_n^{(0)} - E_m^{(0)}}\,|m\rangle
+ \epsilon^2\left[\sum_{m\neq n}\sum_{p\neq n} \frac{V_{pn}V_{mp}}{(E_n^{(0)} - E_m^{(0)})(E_n^{(0)} - E_p^{(0)})}\,|m\rangle - \sum_{m\neq n} \frac{V_{nn}V_{mn}}{(E_n^{(0)} - E_m^{(0)})^2}\,|m\rangle - \frac{1}{2}\sum_{m\neq n} \frac{V_{nm}V_{mn}}{(E_n^{(0)} - E_m^{(0)})^2}\,|n\rangle\right] + \ldots   (6.52)

The last term is obtained by solving for the normalization of the full state to second order in ϵ.
In principle, you can continue like this to find third-order and higher corrections. The calculations will become increasingly tedious and, furthermore, at some point perturbation theory will break down. This is because, as we see from the first two corrections to the Hamiltonian eigenvalues and eigenstates above, the expansion is roughly speaking in powers of

\frac{\langle E_m^{(0)}|\Delta H|E_n^{(0)}\rangle}{E_m^{(0)} - E_n^{(0)}}, \qquad m \neq n.   (6.53)

We are fine as long as this quantity is small.¹

6.4 Application: Fine structure of Hydrogen


We put what we just learned (and also some past material!) into practice by computing
corrections to the spectrum of the Hydrogen atom. There are three sources for these
corrections:

• Relativistic correction to the Kinetic Energy

• Spin-orbit coupling due to the spin of the electron coupling to the B field due to
the proton in the electron frame

• Darwin term due to quantum fluctuations of the electron

We will only discuss the first two. The last one should be a topic in Quantum Field
Theory.

6.4.1 Relativistic correction to KE


We will talk about the relativistic Schrodinger equation later. Its implications are easy
to describe now. The relativistic formula for the energy of a particle of 3-momentum p
is given by

E = \sqrt{|p|^2 c^2 + m_e^2 c^4},   (6.54)

where c is the speed of light and me is the mass of the electron.2 If you haven’t seen
Special Relativity before, you surely have heard about the famous E = mc2 . (6.54) is
just that.
Now the speed of the electron can be assumed to be much smaller than the speed of
light hence
|p|2
≪1 (6.55)
m2e c2
and we can Taylor expand (6.54) in this small parameter. We find

E = m_e c^2 + \frac{|p|^2}{2m_e} - \frac{|p|^4}{8m_e^3 c^2} + \cdots   (6.56)
¹A precise criterion is actually hard to find, since at higher order in perturbation theory the corrections will involve sums over these quantities.
²We work in the approximation from before, where the mass of the electron is much smaller than the mass of the nucleus and µ ≃ m_e.

The first term is the constant rest energy and the second is the usual kinetic term. The last term is a correction to the Hydrogen atom Hamiltonian discussed in Chapter 5. We want to use time-independent,
non-degenerate perturbation theory to compute the correction on the energy spectrum
of the Hydrogen atom that it gives rise to.
From the last section we have

E_{n,\ell}^{(1)} = \langle n,\ell,m|\Delta H|n,\ell,m\rangle,   (6.57)

where

\Delta H = -\frac{\hat P^4}{8m_e^3 c^2}.   (6.58)
In fact, we have seen in the discussion of the Hydrogen atom that states with different
ℓ, m but the same n have the same energy and are hence degenerate. We may be
worried that we need to use degenerate perturbation theory in this case. The fact that
the perturbation (6.58) is spherically symmetric saves us: the perturbation cannot lead
to mixing between the degenerate energy eigenstates since

⟨n, ℓ′ , m′ |∆H|n, ℓ, m⟩ = 0, ℓ ̸= ℓ′ , m ̸= m′ . (6.59)

This can be seen by evaluating

[Lz , ∆H] = [L2 , ∆H] = 0 (6.60)

between (ℓ′ , m′ ) and (ℓ, m) states. (6.59) is an example of a selection rule: if the Hamil-
tonian has certain properties such as symmetries, transitions between certain states may
be forbidden.
To proceed let us notice that

\frac{\hat P^4}{4m_e^2} = (H_0 - V(r))^2   (6.61)

where H_0 is the Hamiltonian for the hydrogen atom from the last chapter (i.e. without corrections). We found last week that

H_0|n,\ell,m\rangle = E_n|n,\ell,m\rangle = -\frac{m_e c^2\alpha^2}{2n^2}|n,\ell,m\rangle.   (6.62)
We drop the superscript (0) on E_n above; it is to be understood that this is the unperturbed energy. We hence have

E_{n,\ell}^{(1)} = -\frac{1}{2m_e c^2}\left(E_n^2 - 2E_n\langle V\rangle_{n,\ell,m} + \langle V^2\rangle_{n,\ell,m}\right)   (6.63)

Recall now that for the Hydrogen atom (where previously we worked in units with 4πϵ₀ = 1 to avoid unnecessary clutter)

V = -\frac{e^2}{4\pi\epsilon_0 r}.   (6.64)

As such, we just need to evaluate the expectation values of 1/r and 1/r² with respect to the Hydrogen wavefunctions derived before. One finds (exercise)

\left\langle \frac{1}{r}\right\rangle_{n,\ell,m} = \frac{1}{a_0 n^2}, \qquad \left\langle \frac{1}{r^2}\right\rangle_{n,\ell,m} = \frac{1}{a_0^2\, n^3\,(\ell + \frac{1}{2})}.   (6.65)

Here a_0 is the Bohr radius

a_0 = \frac{\hbar}{m_e c\,\alpha},   (6.66)

and

\alpha = \frac{e^2}{4\pi\epsilon_0\hbar c}   (6.67)

is the fine structure constant. It's always good to check that dimensions work out, and writing things in terms of a_0, which has units of length, makes this easy.
Putting it all together, we have

E_{n,\ell}^{(1)} = -\frac{1}{2m_e c^2}\left(E_n^2 + 2E_n\,\frac{e^2}{4\pi\epsilon_0 a_0 n^2} + \frac{e^4}{16\pi^2\epsilon_0^2\,(\ell+\frac{1}{2})\,n^3 a_0^2}\right) = -\frac{2}{m_e c^2}\left(\frac{n}{\ell+\frac{1}{2}} - \frac{3}{4}\right)E_n^2 = -\frac{m_e c^2\alpha^4}{2n^4}\left(\frac{n}{\ell+\frac{1}{2}} - \frac{3}{4}\right).   (6.68)

We see that this correction is α² suppressed with respect to E_n. We also see that the ℓ degeneracy is lifted. Needless to say, these corrections were measured experimentally a long time ago (in fact before the discovery of quantum mechanics!). It is quite nice that the conceptually simple perturbative framework that we set up allows us to predict them.
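To get a feel for the numbers: the ratio of (6.68) to the Bohr energy E_n is α²(n/(ℓ + ½) − ¾)/n², so the corrections sit at the 10^-4 to 10^-5 level. A minimal sketch (the value of α is the standard one, supplied by us):

```python
alpha = 1 / 137.035999   # fine structure constant (standard value, supplied by us)

def relative_correction(n, l):
    """Ratio E^(1)_{n,l} / E_n implied by (6.68): alpha^2 (n/(l + 1/2) - 3/4) / n^2."""
    return alpha**2 * (n / (l + 0.5) - 0.75) / n**2

print(relative_correction(1, 0))   # ≈ 6.7e-5: a very small, "fine" correction
print(relative_correction(2, 0))   # states with the same n but different l now
print(relative_correction(2, 1))   # shift by different amounts: degeneracy lifted
```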

6.4.2 Spin-orbit coupling


This correction appears also at O(α⁴). It is due to the fact that a charge moving in an electric field experiences a magnetic field

B = \frac{\gamma}{c^2}\, v\times E = \frac{1}{m_e c^2}\, p \times \frac{e\hat r}{4\pi\epsilon_0 r^2} = -\frac{e}{4\pi\epsilon_0 m_e c^2 r^3}\, L.   (6.69)

We have also seen before that the electron carries spin S, and spins couple to magnetic fields via

H_{SO} = \frac{e}{2m_e}\, S\cdot B = -\frac{\alpha\hbar}{2m_e^2 c}\,\frac{1}{r^3}\, S\cdot L.   (6.70)
The correction to the energy due to spin-orbit coupling is therefore

E_{SO}^{(1)} = \langle n,\ell,m;s|H_{SO}|n,\ell,m;s\rangle   (6.71)

where

|n,\ell,m;s\rangle \equiv |n,\ell,m\rangle \otimes |s\rangle   (6.72)

are a basis for the Hilbert space of the electron (obtained by the tensor product). We now notice that

S\cdot L = \frac{1}{2}\left(J^2 - L^2 - S^2\right),   (6.73)

where

J = L + S   (6.74)

is the net (orbital + spin) angular momentum. Unlike before, where we looked at two-particle systems and added their angular momenta, here we have one particle (the electron) that can carry two kinds of angular momentum. The physics is different, but the math is the same.
Just like before, it is convenient to go to a basis that diagonalizes J², L², S² and J_z, instead of (6.72), which diagonalizes L², L_z, S², S_z. In this new basis, S·L is diagonal, with eigenvalues

S\cdot L\;|j, m_j; s, \ell\rangle:\qquad \frac{\hbar^2}{2}\begin{cases} \ell, & j = \ell + \frac{1}{2}, \\ -(\ell+1), & j = \ell - \frac{1}{2}. \end{cases}   (6.75)

Hence

E_{n;j,\ell}^{(1)} = -\frac{\alpha\hbar^3}{4m_e^2 c}\left\langle\frac{1}{r^3}\right\rangle_{n;j,\ell}\begin{cases} \ell, & j = \ell + \frac{1}{2}, \\ -(\ell+1), & j = \ell - \frac{1}{2}. \end{cases}   (6.76)

It turns out that

\left\langle\frac{1}{r^3}\right\rangle_{n;j,\ell} = \frac{1}{n^3 a_0^3\,\ell(\ell+\frac{1}{2})(\ell+1)}.   (6.77)
You can show this either by explicit computation, or by noticing that on the one hand
we have
⟨[Pr , H0 ]⟩n,ℓ,m = 0, (6.78)

but on the other hand


[P_r, H_0] = -i\hbar\left(-\frac{\hbar^2\,\ell(\ell+1)}{m_e r^3} + \frac{e^2}{4\pi\epsilon_0 r^2}\right).   (6.79)

These two equations allow us to express ⟨r−3 ⟩ in terms of ⟨r−2 ⟩ which we already found
before.
We can finally combine the spin-orbit correction with the relativistic correction to obtain

E_{n,j,\ell} = -\frac{m_e\alpha^2 c^2}{2}\left[\frac{1}{n^2} + \frac{\alpha^2}{n^3}\left(\frac{1}{j+\frac{1}{2}} - \frac{3}{4n}\right)\right].   (6.80)

This formula holds when ℓ ≠ 0, for both j = ℓ ± ½.

6.5 Degenerate Perturbation Theory


It's clear from our previous discussion that if two or more energy eigenvalues are equal, so that E_n^{(0)} = E_m^{(0)} for some pair m ≠ n, then all Hell breaks loose. Let's go back to our simple example and look at what happens when E_1 = E_2 = E:

H = \begin{pmatrix} E & \epsilon \\ \epsilon & E \end{pmatrix}   (6.81)

Again it's easy to solve for the energy

E_{1/2} = \frac{1}{2}\left(E_1 + E_2 \pm \sqrt{(E_1 - E_2)^2 + 4\epsilon^2}\right) = E \pm |\epsilon|   (6.82)

and the eigenvectors are

|E_1\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ \mathrm{sgn}(\epsilon) \end{pmatrix}, \qquad |E_2\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} -\mathrm{sgn}(\epsilon) \\ 1 \end{pmatrix}   (6.83)

where sgn(ϵ) = ϵ/|ϵ|. Here we see the problem: the answer is not analytic in ϵ, and so a naive Taylor series expansion will fail. In addition, taking ϵ → 0± does not give us the unperturbed eigenvectors. This is fine, as the eigenvalues at ϵ = 0 are degenerate and hence we could just as well have started with

|E_1^{(0)}\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad |E_2^{(0)}\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} -1 \\ 1 \end{pmatrix}   (6.84)

or the other way around. Thus we need to be more careful and treat degenerate eigenspaces as a special case. The degeneracy may be just a single pair or, more typically, every energy eigenstate has some fixed degeneracy, e.g. the two spin states of an electron.
To this end, let V ⊂ H be an N-dimensional subspace of states of equal (degenerate) energy E_V of H^(0). All of these states satisfy

    H^(0) |E_{V,n}^(0)⟩ = E_V |E_{V,n}^(0)⟩,   n = 1, · · · , N.   (6.85)

Given any orthonormal basis {|n⟩} of V, we can define the projector onto V,

    P_V = Σ_{n∈V} |n⟩⟨n|,   (6.86)

and the projector onto its orthogonal complement,

    P⊥ = I − P_V.   (6.87)

These obey P_V² = P_V, P⊥² = P⊥. Consequently, H decomposes as

    H = P_V H ⊕ P⊥ H = V ⊕ H⊥.   (6.88)

Since V is a degenerate eigenspace of H^(0), it is easy to see that

    [H^(0), P_V] = [H^(0), P⊥] = 0.   (6.89)
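These projector identities are easy to verify on a toy example (a sketch using numpy, assumed available; the 3-level H^(0) with a doubly degenerate lowest eigenvalue is invented for illustration):

```python
import numpy as np

# Toy check of (6.86)-(6.89): H0 has a doubly degenerate eigenvalue E_V = 1
# and one non-degenerate eigenvalue 3 (numbers invented for illustration).
H0 = np.diag([1.0, 1.0, 3.0])
PV = np.diag([1.0, 1.0, 0.0])            # projector onto the degenerate subspace V
Pperp = np.eye(3) - PV                   # projector onto the orthogonal complement

assert np.allclose(PV @ PV, PV)          # PV^2 = PV
assert np.allclose(Pperp @ Pperp, Pperp) # P_perp^2 = P_perp
assert np.allclose(H0 @ PV, PV @ H0)     # [H0, PV] = 0
assert np.allclose(H0 @ Pperp, Pperp @ H0)
print("projector identities hold")
```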

We can now proceed as before to solve

    ( H^(0) + ϵH^(1) ) |E⟩ = E(ϵ) |E⟩   (6.90)

in perturbation theory. We have

    ( H^(0) + ϵH^(1) )(P_V + P⊥)|E⟩ = E(ϵ)(P_V + P⊥)|E⟩   (6.91)

or equivalently, using H^(0) P_V |E⟩ = E_V P_V |E⟩,

    ( E_V − E(ϵ) + ϵH^(1) ) P_V |E⟩ + ( H^(0) + ϵH^(1) − E(ϵ) ) P⊥ |E⟩ = 0.   (6.92)

These equations split into two by acting with P_V and P⊥ respectively:

    P_V :  (E_V − E(ϵ)) P_V |E⟩ + ϵ P_V H^(1) P_V |E⟩ + ϵ P_V H^(1) P⊥ |E⟩ = 0,
    P⊥ :  ϵ P⊥ H^(1) P_V |E⟩ + P⊥ ( H^(0) + ϵH^(1) ) P⊥ |E⟩ − E(ϵ) P⊥ |E⟩ = 0.   (6.93)
Let's first analyze the first equation in (6.93). Start with a state in V and expand

    E_n(ϵ) = E_V + ϵ E_n^(1) + O(ϵ²),
    |E_n⟩ = |E_{V,n}^(0)⟩ + ϵ |E_n^(1)⟩ + O(ϵ²).   (6.94)

Note that the unperturbed state must lie in V, but the perturbation can in principle take us out of V. Note also that P⊥|E_n⟩ = O(ϵ). To leading order in ϵ we just get that E^(0) = E_V, which we already knew. At O(ϵ) we get something more interesting, namely

    −E_n^(1) |E_{V,n}^(0)⟩ + P_V H^(1) P_V |E_{V,n}^(0)⟩ = 0.   (6.95)

In other words,

    P_V H^(1) P_V |E_{V,n}^(0)⟩ = E_n^(1) |E_{V,n}^(0)⟩,   (6.96)
meaning that |E_{V,n}^(0)⟩ is also an eigenstate of the perturbation projected onto the degenerate subspace! This is important. It tells us that, under the assumption that a small change in H causes a small change in the state, |E_{V,n}^(0)⟩ must not only be an eigenstate of H^(0) but also of the perturbation H^(1), i.e. it is an eigenstate of the full Hamiltonian. In other words, degenerate systems stable under perturbations are those that are eigenstates of the perturbation.
In conclusion, we find that the perturbation lifts the degeneracy of the subspace V according to

    E_n^(1) = ⟨E_{V,n}^(0)| H^(1) |E_{V,n}^(0)⟩.   (6.97)
This is the same formula as before. We can also use the second equation in (6.93) to deduce that the corrections to the H^(0) eigenstates |E_p^(0)⟩ ∈ H⊥ are given by a similar formula:

    E_p^(1) = ⟨E_p^(0)| P⊥ H^(1) P⊥ |E_p^(0)⟩.   (6.98)

To compute the perturbed eigenstates in (6.94), we use the second equation in (6.93). We find

    P⊥ H^(1) |E_{V,n}^(0)⟩ + P⊥ H^(0) |E_n^(1)⟩ − E_V P⊥ |E_n^(1)⟩ = 0.   (6.99)

Taking the inner product with |E_p^(0)⟩ ∈ H⊥, we find

    ⟨E_p^(0)|E_n^(1)⟩ = ⟨E_p^(0)| H^(1) |E_{V,n}^(0)⟩ / ( E_V − E_p^(0) ).   (6.100)

This only determines the components of |E_n^(1)⟩ along the |E_p^(0)⟩ ∈ H⊥. To determine the components along the degenerate subspace, we need to go to second order in perturbation theory. This is an interesting exercise.
Thus, looking at our simple two-dimensional example, we should start from the basis

    |E₁⟩ = (1/√2) ( 1 ; 1 ) + O(ϵ²),   |E₂⟩ = (1/√2) ( −1 ; 1 ) + O(ϵ²),   (6.101)

which are eigenstates of

    H^(1) = ( 0  1 ; 1  0 )   (6.102)

with eigenvalues 1 and −1 respectively, and then we find

    E₁ = E + ϵ + O(ϵ²),   E₂ = E − ϵ + O(ϵ²).   (6.103)

In fact, in this simple case the perturbative series ends at first order, but this won't be true in general.
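The whole story can be checked numerically on the 2×2 example (a sketch using numpy; E and ϵ are arbitrary sample values):

```python
import numpy as np

# The 2x2 degenerate example (6.81): exact eigenvalues are E ± |eps| and the
# eigenvectors are the eps-independent states of (6.101), i.e. eigenvectors
# of the perturbation H^(1).  E and eps are arbitrary sample values.
E, eps = 1.0, 1e-3
H = np.array([[E, eps], [eps, E]])
vals, vecs = np.linalg.eigh(H)
print(vals)                                  # [E - eps, E + eps]

H1 = np.array([[0.0, 1.0], [1.0, 0.0]])      # the perturbation
for v in vecs.T:
    ratio = (H1 @ v) / v                     # constant vector => eigenvector of H^(1)
    assert np.allclose(ratio, ratio[0])
```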
Of course it could be that there are still degeneracies, but then one simply goes to the next order. In quantum theories one expects that the only degeneracies that persist to all orders in perturbation theory are those protected by a symmetry. That is, there exists an observable Q that commutes with the full Hamiltonian, [Q, H] = 0, and hence the states |E_n⟩ and Q|E_n⟩ have the same energy to all orders in perturbation theory:

    H Q |E_n⟩ = Q H |E_n⟩ = E_n Q |E_n⟩.   (6.104)

One then works with energy eigenstates that are also eigenstates of Q.
Chapter 7

Time Dependent Perturbation Theory

Our next topic in perturbation theory is to allow for a small time dependence. So far we have assumed that the Hamiltonian has no explicit time dependence. But this need not be the case. So what happens if we consider a perturbation that includes time dependence:

    iℏ ∂|Ψ⟩/∂t = H^(0)|Ψ⟩ + g H^(1)(t)|Ψ⟩   (7.1)

(we assume the unperturbed Hamiltonian is time independent). Recall that in the time-independent case we solve the Schrödinger equation by

    |Ψ^(0)(t)⟩ = Σ_n c_n e^{−iE_n t/ℏ} |ψ_n^(0)⟩,   (7.2)

where the |ψ_n^(0)⟩ are a basis of eigenstates of H^(0):

    H^(0)|ψ_n^(0)⟩ = E_n^(0)|ψ_n^(0)⟩.   (7.3)

In this case the c_n's are just constants that characterise the state at t = 0.
To solve the perturbed Schrödinger equation we allow the c_n to be functions of time. Substituting back we now find

    Σ_n ( iℏ dc_n/dt + E_n c_n ) e^{−iE_n t/ℏ} |ψ_n^(0)⟩ = Σ_n c_n e^{−iE_n t/ℏ} H^(0)|ψ_n^(0)⟩ + g Σ_n c_n e^{−iE_n t/ℏ} H^(1)(t)|ψ_n^(0)⟩.   (7.4)

Next we use the fact that the |ψ_n^(0)⟩ satisfy the unperturbed time-independent Schrödinger equation, so

    iℏ Σ_n (dc_n/dt) e^{−iE_n t/ℏ} |ψ_n^(0)⟩ = g Σ_n c_n(t) e^{−iE_n t/ℏ} H^(1)(t)|ψ_n^(0)⟩.   (7.5)


Note that again we can still use the fact that the |ψ_n^(0)⟩ are an orthonormal basis of the Hilbert space. Thus we can take the inner product of this equation with |ψ_m^(0)⟩ to obtain

    iℏ (dc_m/dt) e^{−iE_m t/ℏ} = g Σ_n c_n e^{−iE_n t/ℏ} ⟨ψ_m^(0)|H^(1)(t)|ψ_n^(0)⟩
    ⟹ dc_m/dt = −(ig/ℏ) Σ_n c_n e^{i(E_m−E_n)t/ℏ} ⟨ψ_m^(0)|H^(1)(t)|ψ_n^(0)⟩.   (7.6)

For the usual case where the |ψ_n^(0)⟩ are realised by functions on R³ we have

    ⟨ψ_m^(0)|H^(1)(t)|ψ_n^(0)⟩ = ∫ (ψ_m^(0)(x))* H^(1)(t) ψ_n^(0)(x) d³x.   (7.7)

Thus we obtain a first order differential equation for each c_n(t), but it is coupled to all the other c's; the solution c_n(t) is determined by the original initial condition c_n(0).
So far we haven't made any approximation, but we have an infinite set of coupled differential equations! At lowest order c_n is constant, so let us expand

    c_m(t) = c_m(0) + g c_m^(1)(t) + g² c_m^(2)(t) + . . .   (7.8)

with the boundary condition c_m^(k)(0) = 0 for k = 1, 2, 3, . . .. This sets up an infinite series of recursion relations between c_m^(k+1) and c_m^(k), but the first order one is

    dc_m^(1)/dt = −(i/ℏ) Σ_n c_n(0) e^{i(E_m−E_n)t/ℏ} ⟨ψ_m^(0)|H^(1)(t)|ψ_n^(0)⟩,   (7.9)

which has the solution

    c_m^(1)(t) = −(i/ℏ) Σ_n c_n(0) ∫₀^t e^{i(E_m−E_n)t′/ℏ} ⟨ψ_m^(0)|H^(1)(t′)|ψ_n^(0)⟩ dt′,   (7.10)

where by (7.8) all c_m^(k)(0) = 0 for k ≥ 1.
It is natural to imagine a situation where the perturbation is turned on at t = 0 and then switched off at some later stage. That is to say, H^(1)(t) = 0 for t ≤ 0 and H^(1)(t) → 0 rapidly at late times. Further, let us suppose that at t ≤ 0 the system is in the k-th energy eigenstate:

    c_m(0) = δ_{mk}.   (7.11)

Then we find that

    c_m^(1)(t) = −(i/ℏ) ∫₀^t e^{i(E_m−E_k)t′/ℏ} ⟨ψ_m^(0)|H^(1)(t′)|ψ_k^(0)⟩ dt′.   (7.12)

The interpretation is that |c_m(t)|² gives the probability that after the perturbation the system will lie in the m-th energy state (m ≠ k) at time t; c_m^(1)(t) is called the first order transition amplitude. Note that we don't expect to find

    Σ_n |c_n(t)|² = 1.   (7.13)

The interpretation is that the perturbation has introduced (or taken away) energy into the system, which then gets redistributed.
So let's do an example! Consider a harmonic oscillator

    H^(0) = p̂²/2m + (1/2) k x̂²   (7.14)

with states |n⟩, n = 0, 1, 2, . . ., and energies E_n = ℏω(n + 1/2). Next we perturb it by

    H^(1) = −x̂ e^{−t²/τ²}.   (7.15)

Classically this corresponds to an applied force F = g in the x-direction, but only for a period of time of order 2τ. In this case we assume that the system is in the ground state at t → −∞. Thus we need to evaluate (we have changed the initial time to t = −∞)

    c_n^(1)(t) = −(i/ℏ) ∫_{−∞}^t e^{iωnt′} ⟨n|H^(1)(t′)|0⟩ dt′
               = (i/ℏ) ⟨n|x̂|0⟩ ∫_{−∞}^t e^{iωnt′ − t′²/τ²} dt′.   (7.16)

There is no closed form for this integral, but we can look at late times:

    c_n^(1)(∞) = (i/ℏ) ⟨n|x̂|0⟩ ∫_{−∞}^∞ e^{iωnt′ − t′²/τ²} dt′ = (iτ√π/ℏ) e^{−τ²n²ω²/4} ⟨n|x̂|0⟩,   (7.17)

where we have used the integral

    ∫_{−∞}^∞ e^{−at′² + bt′} dt′ = √(π/a) e^{b²/4a}.   (7.18)
Lastly we need to compute

    ⟨n|x̂|0⟩ = √(ℏ/2mω) ⟨n|(â + â†)|0⟩ = √(ℏ/2mω) ⟨n|â†|0⟩ = √(ℏ/2mω) δ_{n,1}.   (7.19)

So it is only non-zero for the first excited state:

    c₁(∞) = i g τ √(π/2mℏω) e^{−τ²ω²/4}.   (7.20)

More generally, it's easy to see that if we started in |N⟩ then we could only jump to |N ± 1⟩, with the same dependence on τ.
To make predictions it is necessary to see what happens to the ground state. The equation for c₀ is

    c₀(t) = 1 + g c₀^(1)(t) + . . .   (7.21)

with

    c₀^(1)(∞) = −(i/ℏ) ∫_{−∞}^∞ ⟨0|H^(1)(t′)|0⟩ dt′ = 0   (7.22)

as ⟨0|(â + â†)|0⟩ = 0. Thus at late times our state is

    |Ψ(t → ∞)⟩ = e^{−iωt/2} |0⟩ + i g τ √(π/2mℏω) e^{−τ²ω²/4} e^{−3iωt/2} |1⟩.   (7.23)

This is not normalized, but one could normalize it and read off the relative probabilities for finding the system in the ground state |0⟩ or first excited state |1⟩. To lowest order in g we find

    P_{0→0} = 1/(1 + A²) ∼ 1 − A²,
    P_{0→1} = A²/(1 + A²) ∼ A²,   (7.24)

where

    A² = (πg²/2mℏω) τ² e^{−ω²τ²/2} ≪ 1.   (7.25)
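Both the Gaussian integral (7.18) and the smallness of A² in (7.25) are easy to verify numerically (a sketch; the parameter values ω, τ and m = ℏ = g = 1 are invented for illustration):

```python
import math, cmath

# Check the Gaussian integral (7.18) with a = 1/tau^2, b = i*omega:
# integral of e^{i*omega*t - t^2/tau^2} dt = tau*sqrt(pi)*e^{-omega^2 tau^2/4}.
omega, tau = 1.3, 2.0                       # sample values
T, n = 40.0, 200000                         # truncate the integral to [-T, T]
h = 2 * T / n
num = h * sum(cmath.exp(1j * omega * t - t * t / tau**2)
              for t in (-T + (i + 0.5) * h for i in range(n)))
exact = tau * math.sqrt(math.pi) * math.exp(-omega**2 * tau**2 / 4)
print(abs(num - exact))                     # ≈ 0

# The resulting transition probability scale (7.25), with m = hbar = g = 1:
m = hbar = g = 1.0
A2 = (math.pi * g**2 / (2 * m * hbar * omega)) * tau**2 * math.exp(-omega**2 * tau**2 / 2)
print(A2)                                   # ≈ 0.16, comfortably below 1
```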
2mℏω

7.1 Fermi Golden Rule

Consider the following time-dependent perturbation:

    H^(1) = { 0,                          t < 0,
              ∆ e^{−iωt} + ∆† e^{iωt},    t > 0.   (7.26)

Here ∆ = ∆(x, p, · · ·) has no explicit time dependence and H^(1) is manifestly Hermitian. You can think of H^(1) as resulting from a harmonic driving force. An external electromagnetic field will give rise to such perturbations (although EM radiation is not necessarily monochromatic, i.e. the perturbation may involve terms with different ω). Starting in the energy eigenstate |m⟩ as before and performing the time integrals, we find

    c_n^(1)(t) = −(i/ℏ) ⟨n|∆|m⟩ ∫₀^t dt′ e^{i(ω_n−ω_m)t′} e^{−iωt′} − (i/ℏ) ⟨n|∆†|m⟩ ∫₀^t dt′ e^{i(ω_n−ω_m)t′} e^{iωt′}
              = −(1/ℏ) [ ⟨n|∆|m⟩ (e^{i(ω_n−ω_m−ω)t} − 1)/(ω_n − ω_m − ω) + ⟨n|∆†|m⟩ (e^{i(ω_n−ω_m+ω)t} − 1)/(ω_n − ω_m + ω) ],   (7.27)

where the ω_n are the H^(0) eigenvalues in units of ℏ (i.e. frequencies). In the late time limit t → ∞, we obtain a significant transition amplitude only when ω_nm ≡ ω_n − ω_m = ±ω. For ω > 0, the + sign corresponds to energy absorption, while the − sign corresponds to energy emission. As such, the time-dependent monochromatic perturbation acts as a source or sink of energy that can be exchanged with the system.

Near ω_nm = ω the first term dominates and we find

    c_n(t) ≃ 2i (⟨n|∆|m⟩ / ℏ(ω − ω_nm)) sin((ω_nm − ω)t/2)   (7.28)

and the transition probability from level m to level n is

    |c_n(t)|² ≃ (4|⟨n|∆|m⟩|² / ℏ²(ω − ω_nm)²) sin²((ω_nm − ω)t/2).   (7.29)

The transition rate is then

    Γ(m → n) ≡ lim_{t→∞} ∂|c_n(t)|²/∂t = (2π/ℏ²) |⟨n|∆|m⟩|² δ(ω_nm − ω).   (7.30)

To obtain the last equality we have used the delta function identity

    lim_{t→∞} sin(xt)/(πx) = δ(x).   (7.31)

(7.30) is a very useful equation that allows us to determine the transition rates between atomic energy levels: shining monochromatic light at a frequency that matches the energy difference between two levels will induce a transition between them. These results were first obtained by Dirac, and it was Fermi who later called them “golden rules”.
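The delta function identity (7.31) can be illustrated numerically by smearing sin(xt)/(πx) against a test function g with g(0) = 1 (a sketch; the Gaussian test function and grid sizes are arbitrary choices):

```python
import math

# Smearing test of (7.31): integral of g(x) sin(xt)/(pi*x) dx -> g(0) = 1
# as t -> infinity, here with the arbitrary test function g(x) = exp(-x^2).
def smeared(t, X=10.0, n=200000):
    h = 2 * X / n
    total = 0.0
    for i in range(n):
        x = -X + (i + 0.5) * h              # midpoint rule (avoids x = 0)
        total += math.exp(-x * x) * math.sin(x * t) / (math.pi * x)
    return total * h

for t in (1.0, 5.0, 25.0):
    print(t, smeared(t))                    # tends to 1 as t grows
```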

7.1.1 Absorption and stimulated emission

In this section we discuss a generalization of the Fermi Golden Rule to the case where the perturbation is not monochromatic. For example, consider an electron in a time-dependent electric field, with Hamiltonian

    H = H₀ + e E(t) · x̂.   (7.32)

As before, for a transition from |m⟩ to |n⟩ we have

    c_n(t) = −(ie/ℏ) ∫₀^t dt′ ⟨n|E(t′) · x̂|m⟩ e^{iω_nm t′},   (7.33)

where we assumed that H^(1) = 0 for t < 0.
The probability of transition is then given by

    |c_n(t)|² = (e²/ℏ²) ∫₀^t dt′ ∫₀^t dt′′ ⟨n|E(t′) · x̂|m⟩⟨m|E(t′′) · x̂|n⟩ e^{iω_nm(t′−t′′)}.   (7.34)

To be able to proceed, consider the average over a distribution of electric fields

    ⟨E_i(t₁) E_j(t₂)⟩ ≡ δ_ij ∫ dω P(ω) e^{−iω(t₁−t₂)}.   (7.35)

This is an assumption that tells us that there are no correlations among different spatial components of the fields; P(ω) is the average energy density in the radiation. We can use this result to simplify the averaged probability of transition:

    ⟨|c_n|²⟩ = (e²/ℏ²) Σ_i ∫₀^t dt′ ∫₀^t dt′′ ⟨E_i(t′)E_i(t′′)⟩ ⟨n|x̂_i|m⟩⟨m|x̂_i|n⟩ e^{iω_nm(t′−t′′)}
             = (e²/ℏ²) Σ_{i=1}^{3} |⟨n|x̂_i|m⟩|² ∫_{−∞}^{∞} dω P(ω) |∫₀^t dt′ e^{i(ω_nm−ω)t′}|²
             = (e²/ℏ²) Σ_{i=1}^{3} |⟨n|x̂_i|m⟩|² ∫_{−∞}^{∞} dω P(ω) ( 4 sin²((ω_nm−ω)t/2) / (ω_nm−ω)² ).   (7.36)

In the large t limit we can use the result (7.31) from before. In that case the integral over ω sets the argument of P to ω_nm, and the result is proportional to the probability that the electric dipole moment can link n and m. Depending on whether ω_nm > 0 or ω_nm < 0, we get absorption or stimulated emission.
Chapter 8

Semi-Classical Quantization: WKB

Let us now look at another type of approximation which holds whenever a system is
“nearly classical”. We will make this more precise below, but to get some intuition
about what this means, note that we expect that for large systems, classical results
should emerge. Much of the quantum-ness of quantum theory should disappear. This
disappearance arises as the wavefunctions typically oscillate so quickly that the quantum
effects cancel out and only the dominant classical configurations remain. Furthermore,
we should be able to use quantum mechanics to characterize deviations from classicality.

8.1 WKB

The WKB approximation is named after Wentzel, Kramers & Brillouin ('26), although the methods were previously developed by Jeffreys ('23) in the mathematical context of approximate methods for differential equations. The same method appears under different names in various areas of physics. In optics, it goes under the name of the “eikonal” approximation. In QCD, similar methods, employing a separation of the degrees of freedom into fast and slow ones, are used in computing high-energy observables.
Before we describe the math, let’s think about the question: “How do we quantify
the semi-classical regime of a system?” There are two naive answers:

• ℏ → 0. On the one hand, recall that the uncertainty principle tells us that ∆x∆p ∼
ℏ, hence we cannot measure both position and momenta with precision. In classical
mechanics, we of course can, so we expect the classical limit to be one in which
the uncertainty principle disappears, hence ℏ → 0. On the other hand, ℏ is a fixed
constant of nature, and it makes no sense to take it to 0. What we should instead
be doing is to look for a dimensionless quantity involving ℏ which can be taken to
be small. As we will see, it’s still useful to think about ℏ → 0 as taking a classical
limit and you may encounter this in other courses such as QFT.

• The de Broglie wavelength is a rough measure that allows us to characterize the


quantum-ness of a system. It is given by

    λ_dB ≡ 2πℏ/p,   (8.1)

  where p is the momentum of the system and L its size. Recall that for a system at rest, E = m₀c², where m₀ is the rest mass of the system/object. You can calculate the de Broglie wavelength for mundane objects like a ball or a cup of tea and you will find a tiny answer, much much smaller than the size of the object itself. So quantum mechanics is irrelevant when λ_dB ≪ L; in this limit, we cannot resolve the wave-like nature of particles. We enter the quantum regime as λ_dB ∼ L.
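For instance (a sketch; the ball's mass and speed are invented everyday numbers):

```python
import math

# de Broglie wavelength of a 0.1 kg ball moving at 1 m/s
hbar = 1.054571817e-34        # J s
p = 0.1 * 1.0                 # momentum in kg m/s
lam = 2 * math.pi * hbar / p
print(lam)                    # ≈ 6.6e-33 m: utterly negligible next to the ball's size
```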

In terms of λ_dB we will see that we can apply semi-classical methods when:

a)
    dλ_dB/dx ≪ 1   (8.2)

and

b)
    λ_dB dV/dx ≪ |p|²/m,   (8.3)

where V is the potential experienced by the system. Condition a) tells us that the de Broglie wavelength has to be slowly varying; one can imagine that in this case deviations from classicality will be under control. Condition b) tells us that the potential is slowly varying. We will understand how these conditions arise when solving the Schrödinger equation with “approximate wavefunctions”.
We expect that classical solutions will be just waves (of huge frequency), so let's consider wavefunctions of the form (in one dimension)

    ψ(x) ∼ e^{iσ(x)/ℏ}.   (8.4)

For now we won't worry about the normalization; we can fix that later. Taking the exponent to be σ(x)/ℏ, as opposed to simply kx, will allow us to characterize deviations from simple wave-like behavior. Note that

    dψ/dx = (i/ℏ)(dσ/dx) e^{iσ(x)/ℏ},
    d²ψ/dx² = (i/ℏ)(d²σ/dx²) e^{iσ(x)/ℏ} − (1/ℏ²)(dσ/dx)² e^{iσ(x)/ℏ}.   (8.5)

Substituting this Ansatz into the one-dimensional Schrödinger equation

    Eψ = −(ℏ²/2m) d²ψ/dx² + V(x)ψ,

we find

    E = −(iℏ/2m) d²σ/dx² + (1/2m)(dσ/dx)² + V(x).   (8.6)

We now see that taking ℏ to be small really means that the first term on the RHS is
much smaller than the second, namely
 2
dσ d2 σ
≫ ℏ 2. (8.7)
dx dx

In fact this condition is nothing but condition a) stated above. To see this note that

2π dk
σ(x)/ℏ = k(x)x = x = kx + x2 + · · · =⇒
λ(x) dx
(8.8)
1 dσ dk 1 d2 σ dk
= k + 2 x + · · · =⇒ 2
= 2 + ···
ℏ dx dx ℏ dx dx
Hence (8.7) is equivalent to
dk
ℏ2 k 2 ≫ ℏ (8.9)
dx
or in terms of λ
4π 2π dλ dλ
≫ =⇒ ≪ 1.✓ (8.10)
λ2 λ2 dx dx
If this condition holds, then we can make a semi-classical expansion:

    σ = σ^(0) + ℏσ^(1) + . . .   (8.11)

which gives, at zeroth and first order (we assume there is no ℏ in V),

    O(ℏ⁰):  E = (1/2m)(dσ^(0)/dx)² + V(x),
    O(ℏ):   0 = −(i/2) d²σ^(0)/dx² + (dσ^(0)/dx)(dσ^(1)/dx).   (8.12)

Thus at zeroth order we find a familiar formula:

    dσ^(0)/dx = ±√(2m(E − V(x))),   (8.13)

which is solved by

    σ^(0)(x) = ±∫_{x₀}^x p(y) dy,   p(y) = √(2m(E − V(y))).   (8.14)

Thus we have determined the wavefunction (in principle, and assuming the condition (8.7)). In other words, our semi-classical wavefunction is

    ψ_semiclassical = A e^{(i/ℏ)∫^x p(y)dy} + B e^{−(i/ℏ)∫^x p(y)dy},   (8.15)

where we have allowed for the two possible choices of sign.
The integrand in (8.14) has a classical interpretation, where p is the momentum of a particle in a potential V(y):

    E = p²/2m + V(y).   (8.16)
2m

Next we can easily solve for the first order term:

    (i/2) d²σ^(0)/dx² = (dσ^(0)/dx)(dσ^(1)/dx)
    (i/2) (d/dx) ln(dσ^(0)/dx) = dσ^(1)/dx
    (i/2) ln p = σ^(1) + const.   (8.17)

We can absorb the constant into the normalization coefficients A, B and hence, to first order,

    ψ_WKB(x) = (A/√p(x)) e^{(i/ℏ)∫^x p(y)dy} + (B/√p(x)) e^{−(i/ℏ)∫^x p(y)dy}.   (8.18)

This is known as the WKB approximation. It captures a surprisingly large amount of quantum information.
Before we do some examples, let's look a bit more closely at when our approximation breaks down. From (8.7),

    ℏ d²σ^(0)/dx² ≪ (dσ^(0)/dx)²   ⟹   (mℏ/p) dV/dx ≪ p²   ⟺   λ dV/dx ≪ p²/m.   (8.19)

We have therefore recovered condition b) stated above. Note also that

    d²σ^(0)/dx² = ∓(1/2)√(2m/(E − V)) dV/dx,   (dσ^(0)/dx)² = 2m(E − V).   (8.20)

Hence (8.7) is clearly violated for E ∼ V, so the semi-classical approximation breaks down at the “turning points” where V = E. There the momentum p vanishes, so nothing can be smaller than it. We will explain how to deal with turning points in the next section.

8.2 Particle in a Box

Let us look at the particle in a box, where V = 0 but we restrict the wavefunction to vanish for x ≤ 0 and x ≥ L. Our wavefunction is therefore

    ψ(x) = (A/√p) e^{(i/ℏ)∫^x p dy} + (B/√p) e^{−(i/ℏ)∫^x p dy} = A′ e^{ipx/ℏ} + B′ e^{−ipx/ℏ},   (8.21)

where p = √(2mE) is constant and we have absorbed √p into the coefficients A′, B′. Next we need to impose the boundary conditions ψ(0) = ψ(L) = 0, which lead to

    A′ + B′ = 0,   A′ e^{ipL/ℏ} + B′ e^{−ipL/ℏ} = 2iA′ sin(pL/ℏ) = 0.   (8.22)

So we find p = nℏπ/L and E = n²π²ℏ²/2mL² for n = 1, 2, . . .. This agrees perfectly with the exact answer (because d²σ^(0)/dx² = dp/dx = 0).

8.3 Turning Points, Airy and Bohr-Sommerfeld Quantization

So you might think that the next easiest thing to consider is the harmonic oscillator - that always works! Well, no: the Devil is in the detail, and in the WKB approximation the detail is the behaviour near a turning point. Near a turning point we can assume that the potential is roughly linear, so let us look at this example first exactly, before using WKB.
Let us consider

    Eψ = −(ℏ²/2m) d²ψ/dx² + F x ψ,   (8.23)

where F is a constant. Classically this is just a linear force in the x-direction. The energy won't be bounded from below, but we can still find solutions. Cleaning this up, we first shift

    x = x′ + E/F   ⟹   −(ℏ²/2m) d²ψ/dx′² + x′ F ψ = 0   (8.24)

and next we rescale

    x′ = (ℏ²/2mF)^{1/3} z   (8.25)

to obtain the deceptively simple equation

    d²ψ/dz² − zψ = 0.   (8.26)
The solution to this equation which decays at large z is known as the Airy function¹:

    ψ = c Ai(z)   (8.27)

for some constant c. It is named after George Airy, who was Astronomer Royal at the Greenwich Observatory; he features in their museum if you have ever been. The Airy function is the poster-child of functions that are difficult to understand using perturbative techniques, due to its bi-polar characteristics of oscillating and decaying. But we love it nonetheless, as it is a thing of beauty with important and varied applications (and with modern computer techniques it can be evaluated numerically to high precision). It can be defined by

    Ai(z) = (1/π) ∫₀^∞ cos(u³/3 + zu) du.   (8.28)

From here you can check that

    Ai′(z) = −(1/π) ∫₀^∞ u sin(u³/3 + zu) du,
    Ai′′(z) = −(1/π) ∫₀^∞ u² cos(u³/3 + zu) du,   (8.29)
¹ There is a second solution that grows at large z.

0.4

0.2

-20 -10 10 20

-0.2

-0.4

Figure 8.3.1: The Airy function Ai(z)

so

    Ai′′(z) − z Ai(z) = −(1/π) ∫₀^∞ (u² + z) cos(u³/3 + zu) du
                      = −(1/π) ∫₀^∞ (d/du) sin(u³/3 + zu) du = 0   (8.30)

(one must be careful with the boundary term at infinity, which oscillates wildly). A plot of Ai(z) is shown in Figure 8.3.1. It oscillates to the left of the turning point (where E > Fx) and then dies off exponentially to the right (where E < Fx). Therefore, near a turning point, this is what we expect the wavefunction to look like. In particular, the asymptotic form of Ai(z) is known:

 1 e− 23 z3/2 z→∞
1/4
Ai(z) ∼ z (8.31)
 1 sin 2 |z|3/2 + π/4 z → −∞
|z|1/4 3

It's worth mentioning here that Ai(z), and hence the wavefunction, is not zero to the right of the turning point, where the potential energy V = Fx is greater than the total energy: V > E. Thus there is a non-zero probability to find the particle in a region which is strictly forbidden in the classical world.
So let us try this with the WKB approximation. The idea is that WKB should be good away from the turning points, and near a turning point we can use the Airy function. So it's a question of patching together the various wavefunctions.
So let's do the WKB procedure for V = E + F(x − b). This is a potential that rises to the right (F > 0) with the turning point at x = b:

    p(x) = √(2m(Fb − Fx))   (8.32)

and hence (we pick the lower bound of integration to make things simple, as its effect is just a constant that can be absorbed elsewhere)

    σ^(0) = ∫_b^x √(2m(Fb − Fy)) dy = −(2/3)√(2mF) (b − x)^{3/2}.   (8.33)

Thus our WKB wavefunction is

    ψ = (A/√p(x)) e^{i(2/3ℏ)√(2mF)(b−x)^{3/2}} + (B/√p(x)) e^{−i(2/3ℏ)√(2mF)(b−x)^{3/2}},   b > x,   (8.34)
but we don't trust it near x = b, and in particular p = 0 there. Rather we take the following:

    ψ = { (A_r/√p(x)) e^{−(2/3ℏ)√(2mF)(x−b)^{3/2}} + (B_r/√p(x)) e^{+(2/3ℏ)√(2mF)(x−b)^{3/2}},    x > b,
          c Ai((2mF/ℏ²)^{1/3}(x − b)),                                                            x ∼ b,
          (A_l/√p(x)) e^{i(2/3ℏ)√(2mF)(b−x)^{3/2}} + (B_l/√p(x)) e^{−i(2/3ℏ)√(2mF)(b−x)^{3/2}},   x < b.   (8.35)

Normalizability tells us that B_r = 0. Looking at the asymptotic values of Ai for positive z we find agreement if A_r = c. Looking at large negative x − b we find agreement if

    A_l = (c/2i) e^{iπ/4},   B_l = −(c/2i) e^{−iπ/4}.   (8.36)
2i 2i
Thus our solution is

    ψ = { (c/√p(x)) e^{−(2/3ℏ)√(2mF)(x−b)^{3/2}},           x > b,
          c Ai((2mF/ℏ²)^{1/3}(x − b)),                      x ∼ b,
          (c/√p(x)) sin((2/3ℏ)√(2mF)(b − x)^{3/2} + π/4),   x < b.   (8.37)

In other words, the WKB solution agrees with the asymptotic regions of the exact Airy function, but not in the region where it changes from oscillating to damping.
We can also imagine a turning point, rising to the left, located at x = a < b. The analysis is the same but with x − b → a − x:

    ψ = { (c/√p(x)) sin(−(2/3ℏ)√(2mF)(x − a)^{3/2} − π/4),   x > a,
          c Ai((2mF/ℏ²)^{1/3}(a − x)),                       x ∼ a,
          (c/√p(x)) e^{−(2/3ℏ)√(2mF)(a−x)^{3/2}},            x < a.   (8.38)

Let us now consider a more general situation where there is a potential V with turning points on the left and right. In the middle, a < x < b, we need to match the two solutions we find (this time assuming a general potential):

    ψ_b = (c/√p(x)) sin( −(1/ℏ)∫_b^x p(y)dy + π/4 ),   x < b,
    ψ_a = (c/√p(x)) sin( −(1/ℏ)∫_a^x p(y)dy − π/4 ),   x > a.   (8.39)

These won't agree unless

    −(1/ℏ)∫_b^x p(y)dy + π/4 = −(1/ℏ)∫_a^x p(y)dy − π/4 + nπ   (8.40)

for some integer n. Rearranging this gives the so-called Bohr-Sommerfeld quantization rule:

    ∫_a^b p(y)dy = πℏ(n − 1/2),   n ∈ Z,   (8.41)

where a and b are the two turning points. This was more or less guessed by Bohr and Sommerfeld in the early days of quantum mechanics, where they conjectured that n = 1, 2, . . ..

8.4 Harmonic Oscillator

Having mastered turning points, we can now tackle the harmonic oscillator. We have V = (1/2)kx², so p(x) = √(2mE − mkx²) and

    σ^(0) = ∫₀^x √(2mE − mky²) dy
          = √(2mE) ∫₀^x √(1 − ky²/2E) dy
          = √(2mE) [ (x/2)√(1 − kx²/2E) + (1/2)√(2E/k) arcsin(√(k/2E) x) ].   (8.42)

The associated wavefunctions will not be the ones we found before, i.e. a polynomial times an exponential suppression. But that's okay: we don't expect to land on the exact answer in any non-trivial case.
Let us see what we can do. At large x we have σ^(0) ∼ ±(i/2)√(mk) x², and hence we do see an exponential suppression in ψ (for the right choice of sign; we must discard the wrong choice due to normalizability). We can apply the Bohr-Sommerfeld quantization condition:

    ∫_a^b p(x)dx = ∫_{−√(2E/k)}^{√(2E/k)} √(2mE − mkx²) dx = πE√(m/k) = πE/ω.   (8.43)

Setting this to πℏ(n − 1/2) indeed gives us the correct spectrum (n = 1, 2, . . .):

    E = ℏω(n − 1/2).   (8.44)

Although more generally we would only expect agreement at large n.
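The integral (8.43) is easy to confirm numerically (a sketch with m = k = ℏ = 1, so ω = 1; the sample energy E = 5/2 corresponds to n = 3 in (8.44)):

```python
import math

# Bohr-Sommerfeld check for V = k x^2 / 2 with m = k = hbar = 1 (so omega = 1):
# (8.43) claims the integral of p between the turning points equals pi*E/omega.
m = k = 1.0
omega = math.sqrt(k / m)
E = 2.5                                    # = hbar*omega*(n - 1/2) for n = 3

b = math.sqrt(2 * E / k)                   # right turning point; the left one is -b
n_steps = 200000
h = 2 * b / n_steps
I = h * sum(math.sqrt(max(2 * m * E - m * k * x * x, 0.0))
            for x in (-b + (i + 0.5) * h for i in range(n_steps)))

print(I, math.pi * E / omega)              # both ≈ 7.854
```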


Chapter 9

Entanglement, Density Matrices and Thermal States

In this chapter we turn to more foundational aspects of quantum mechanics, such as entanglement and properties of quantum states.

9.1 Entanglement and Bell's Inequality

Entanglement is a basic consequence of the quantum mechanical description of particles as states in a Hilbert space; more specifically, of the description of many particles as states in the tensor product of individual particle Hilbert spaces.
Recall that in our discussion of angular momentum we introduced the tensor product of su(2) representations (or vector spaces). More generally, let H₁ and H₂ be two Hilbert spaces with bases {|e_i⟩} and {|f_j⟩}, with i = 1, · · · , dim H₁ and j = 1, · · · , dim H₂. The tensor product H = H₁ ⊗ H₂ is then a Hilbert space spanned by linear combinations of |e_i⟩|f_j⟩. The dimension of H is clearly the product of the dimensions of the constituents, namely dim H = dim H₁ dim H₂.
For example, if H₁ and H₂ are each two-dimensional and spanned by {|↑⟩, |↓⟩}, a state in (or element of) H will take the form

|ψ⟩ = N (a| ↑⟩ ⊗ | ↑⟩ + b| ↑⟩ ⊗ | ↓⟩ + c| ↓⟩ ⊗ | ↑⟩ + d| ↓⟩ ⊗ | ↓⟩) , (9.1)

where N is a normalization obtained by imposing

⟨ψ|ψ⟩ = 1. (9.2)


In vector notation we have

    |↑⟩ ≡ (1, 0)ᵀ,   |↓⟩ ≡ (0, 1)ᵀ,
    |↑⟩ ⊗ |↑⟩ = (1, 0)ᵀ ⊗ (1, 0)ᵀ = (1, 0, 0, 0)ᵀ,
    |↑⟩ ⊗ |↓⟩ = (1, 0)ᵀ ⊗ (0, 1)ᵀ = (0, 1, 0, 0)ᵀ,   (9.3)

and so on. We will from now on drop the tensor product sign, namely |↑⟩ ⊗ |↑⟩ ≡ |↑⟩|↑⟩.
From (9.1) we see that states such as

    |ψ₁⟩ = |↑⟩|↑⟩,   (9.4)
    |ψ₂⟩ = (1/√2) (|↑⟩|↑⟩ + |↓⟩|↓⟩)   (9.5)

are both in H₁ ⊗ H₂.
Definition: States that can be written as

|ψ⟩ = |ψ1 ⟩ ⊗ |ψ2 ⟩ (9.6)

are called product states. States that are not product states are called entangled.
We see that |ψ1 ⟩ in (9.4) is a product state, while |ψ2 ⟩ in (9.5) is entangled.
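For two qubits there is in fact a simple criterion, standard but not stated in the notes: writing |ψ⟩ = a|↑⟩|↑⟩ + b|↑⟩|↓⟩ + c|↓⟩|↑⟩ + d|↓⟩|↓⟩ as in (9.1), the state is a product state exactly when the coefficient matrix [[a, b], [c, d]] has rank one, i.e. ad − bc = 0. A sketch:

```python
import math

# a|↑↑> + b|↑↓> + c|↓↑> + d|↓↓> factorizes as a product state exactly when
# the coefficient matrix [[a, b], [c, d]] has rank one, i.e. ad - bc = 0.
def is_product(a, b, c, d, tol=1e-12):
    return abs(a * d - b * c) < tol

s = 1 / math.sqrt(2)
print(is_product(1, 0, 0, 0))    # True:  |psi_1> = |↑>|↑> of (9.4)
print(is_product(s, 0, 0, s))    # False: |psi_2> of (9.5) is entangled
```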
Entanglement is a fundamental property of quantum mechanics. To see this, we consider the following thought experiment (EPR 1935): let 1 and 2 be two particles produced at the same source and sent to two distinct detectors with settings a and b respectively. An example of a setting here would be the spatial orientation of the detector. Let A and B be the measurement outcomes at the two detectors, given the respective settings a and b. Einstein, Podolsky and Rosen in their now famous 1935 paper proposed that any “physical” theory should meet the following two criteria:

a) Reality: if, without disturbing the system, we can predict with certainty the value of a physical quantity, then there exists an element of reality corresponding to that physical quantity.

b) Locality: since at the time of measurement particles 1 and 2 no longer interact, no real change can take place in the second system in consequence of anything done to the first.

Criteria a) and b) go under the name of “local realism”. Quantum mechanics is incompatible with local realism, as famously demonstrated through a violation of “Bell's inequality”, first in 1972 by Clauser and Freedman and a dozen other times since. Indeed, the Nobel prize a few years ago went to three experimenters who showed, in a variety of experiments, that Bell's inequalities are violated.
Bell’s inequality is an inequality that ought to be obeyed by physical systems with
properties a) and b) above. To be more specific, let particles 1 and 2 in our thought
experiment above be simple, two-level systems. Classically, this means that each of these
particles can be in one of two states, and hence the measurement outcomes A and B
above can take one of two possible values: ±1. Furthermore, (a, b) could involve rotating
the detector (i.e. measuring the spin or polarization of the particles along different axes).
Each of the detectors is allowed to be in one of two possible settings, a or a′ and b or b′ .
Now given detector settings (a, b) and measurement outcomes as above, one can
define the correlation function

E(a, b) = ⟨A(a)B(b)⟩ (9.7)

where the angle brackets mean averaging over many runs in which the particles are subjected to measurements with the same detector setting (a, b). Note that since (classically) A(a) = ±1 and B(b) = ±1, we must also necessarily have A(a)B(b) = ±1. Therefore (think about the average correlated outcome of tossing two coins)

−1 ≤ E(a, b) ≤ 1. (9.8)

Consider then the following linear combination of correlators

S ≡ E(a, b) + E(a′ , b) − E(a, b′ ) + E(a′ , b′ ). (9.9)

Claim 1: Any physical theory obeying local realism must have |S| ≤ 2. This is called
Bell’s inequality.
Proof: Simply notice that

S0 = A′ (B + B ′ ) + A(B − B ′ ) =⇒ S0 = ±2. (9.10)

This is because B = ±B′, so only one of the two brackets can be non-zero, and it must equal ±2. The average of this quantity is then

    S ≡ ⟨S₀⟩ = ∫ dλ p({λ}) S₀({λ}),   (9.11)

where λ are possible “hidden variables”, i.e. parameters that could vary across different experimental runs, assumed to be the same for both particles since they are created at the same source. Then

    |S| ≤ ∫ dλ p({λ}) |S₀({λ})| ≤ 2,   (9.12)

where we have used that ∫ dλ p({λ}) = 1.
Claim 2: Quantum mechanics violates the Bell inequality (9.12).

Proof: Let the source emit a pair of particles in the entangled state

    |ψ⟩ = (1/√2) (|↑⟩_A |↓⟩_B − |↓⟩_A |↑⟩_B).   (9.13)

Note that entanglement violates locality (property b) in the definition of local realism above) through measurement of either particle.
Then consider the following settings of the detectors:

a) Rotate the first device to measure spin/polarization along $\hat{n}_a$ in the plane perpendicular to the direction of propagation of particle 1. In other words, the first device will perform a measurement of particle 1 in the rotated basis

$$|\uparrow'\rangle_A = R_a |\uparrow\rangle_A, \quad |\downarrow'\rangle_A = R_a |\downarrow\rangle_A, \qquad R_a = \begin{pmatrix} \cos\theta_a & \sin\theta_a \\ -\sin\theta_a & \cos\theta_a \end{pmatrix} \qquad (9.14)$$

b) Rotate the second device to measure spin/polarization along $\hat{n}_b$ in the plane perpendicular to the direction of propagation of particle 2. In other words, the second device will perform a measurement of particle 2 in the rotated basis

$$|\uparrow'\rangle_B = R_b |\uparrow\rangle_B, \quad |\downarrow'\rangle_B = R_b |\downarrow\rangle_B, \qquad R_b = \begin{pmatrix} \cos\theta_b & \sin\theta_b \\ -\sin\theta_b & \cos\theta_b \end{pmatrix} \qquad (9.15)$$

In this new basis, the outcomes of the measurement will still be $\pm 1$, however the probabilities with which they occur will be different. Quantum mechanics predicts

$$\begin{aligned}
p(1,1;\theta_a,\theta_b) &\equiv \left|(\langle\uparrow'|_A \langle\uparrow'|_B)|\psi\rangle\right|^2 \\
&= \left|(\cos\theta_a \langle\uparrow|_A - \sin\theta_a \langle\downarrow|_A)(\cos\theta_b \langle\uparrow|_B - \sin\theta_b \langle\downarrow|_B)\,\frac{1}{\sqrt{2}}\left(|\uparrow\rangle_A |\downarrow\rangle_B - |\downarrow\rangle_A |\uparrow\rangle_B\right)\right|^2 \\
&= \frac{1}{2}\left|-\cos\theta_a \sin\theta_b + \sin\theta_a \cos\theta_b\right|^2 \\
&= \frac{1}{2}\left[\sin(\theta_a - \theta_b)\right]^2. \qquad (9.16)
\end{aligned}$$

A similar computation then gives the probabilities of the other outcomes (given the
same detector settings above)

$$\begin{aligned}
p(1,-1;\theta_a,\theta_b) &= \frac{1}{2}\left[\cos(\theta_{ab})\right]^2 \\
p(-1,1;\theta_a,\theta_b) &= \frac{1}{2}\left[\cos(\theta_{ab})\right]^2 \\
p(-1,-1;\theta_a,\theta_b) &= \frac{1}{2}\left[\sin(\theta_{ab})\right]^2, \qquad \theta_{ab} \equiv \theta_a - \theta_b. \qquad (9.17)
\end{aligned}$$

This allows us to compute our correlation function defined above


$$\begin{aligned}
E(a,b) &= \sum_{A,B=\pm 1} A\,B\, p(A,B;a,b) \\
&= \left(p(1,1;a,b) + p(-1,-1;a,b)\right) - \left(p(1,-1;a,b) + p(-1,1;a,b)\right) \\
&= \sin^2\theta_{ab} - \cos^2\theta_{ab} = -\cos 2\theta_{ab}. \qquad (9.18)
\end{aligned}$$
Finally, consider the Bell-CHSH parameter

$$S \equiv E(a,b) + E(a',b) - E(a,b') + E(a',b') = -\cos 2\theta_{ab} - \cos 2\theta_{a'b} + \cos 2\theta_{ab'} - \cos 2\theta_{a'b'}. \qquad (9.19)$$
Consider then the following settings

θab = θa′ b′ = −θa′ b = θ =⇒ θab′ = θ + θb − θb′ = 3θ. (9.20)

In this case
$$S = -3\cos 2\theta + \cos 6\theta, \qquad (9.21)$$
and extremizing this quantity with respect to $\theta$ gives
$$\frac{dS}{d\theta} = 0 \implies 6\sin 2\theta - 6\sin 6\theta = 0, \qquad (9.22)$$
which is solved by $\theta = \frac{\pi}{8}$.
Plugging this value back into (9.21), we find
$$S_{\max} = -2\sqrt{2} \implies |S| = 2\sqrt{2} > 2. \qquad (9.23)$$

Hence, given an entangled state and assuming the postulates of quantum mechanics, one can violate Bell's inequality. This violation has been experimentally confirmed, with some of the experiments recently also closing loopholes regarding the existence of "hidden variables".
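As a quick numerical sanity check of the result above (a sketch in Python, not part of the original notes), one can evaluate $S(\theta) = -3\cos 2\theta + \cos 6\theta$ from (9.21) and confirm that $|S|$ exceeds the classical bound of 2, reaching $2\sqrt{2}$ at $\theta = \pi/8$:

```python
import math

def chsh(theta):
    """Bell-CHSH parameter S(theta) for the detector settings of eq. (9.20)."""
    return -3 * math.cos(2 * theta) + math.cos(6 * theta)

# Scan theta over [0, pi] and find where |S| is largest
thetas = [i * math.pi / 10000 for i in range(10001)]
best = max(thetas, key=lambda t: abs(chsh(t)))

print(abs(chsh(best)))          # close to 2*sqrt(2) ~ 2.828, above the bound of 2
print(abs(chsh(math.pi / 8)))   # 2*sqrt(2) up to rounding
```

Any local-realist model is confined to $|S| \leq 2$, so the scan makes the quantum violation explicit.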

9.2 Density Matrices


Whether a system is entangled or not can in general be hard to tell. For large systems, checking whether a state is a product state or not may be challenging. Furthermore, how can we describe the "classical" case in the quantum theory?
It turns out to be useful to introduce the notion of a density matrix where we
trade the state |ψ⟩ for the operator

ρ = |ψ⟩⟨ψ|. (9.24)

Given any observable $O$ we can compute its expectation value in the state $|\psi\rangle$ by computing a trace:
$$\mathrm{tr}(O\rho) = \sum_n \langle e_n|O\rho|e_n\rangle = \sum_n \langle e_n|O|\psi\rangle \langle\psi|e_n\rangle \qquad (9.25)$$

Next we expand
$$|\psi\rangle = \sum_m c_m |e_m\rangle \iff c_m = \langle e_m|\psi\rangle = (\langle\psi|e_m\rangle)^* \qquad (9.26)$$

so
$$\begin{aligned}
\mathrm{tr}(O\rho) &= \sum_{n,m} c_m \langle e_n|O|e_m\rangle c_n^* \\
&= \sum_{n,m} c_m \lambda_m \langle e_n|e_m\rangle c_n^* \\
&= \sum_n \lambda_n |c_n|^2 \\
&= \langle\psi|O|\psi\rangle \qquad (9.27)
\end{aligned}$$
where we have used an orthonormal basis $|e_m\rangle$ of eigenstates of $O$ with eigenvalues $\lambda_m$: $O|e_m\rangle = \lambda_m |e_m\rangle$.
Note that the normalization of $|\psi\rangle$ translates into the statement that (just take $O = 1$)
$$\mathrm{tr}(\rho) = \sum_n \langle e_n|\rho|e_n\rangle = \sum_n \langle e_n|\psi\rangle\langle\psi|e_n\rangle = \sum_n c_n c_n^* = 1. \qquad (9.28)$$

Thus we can swap a state for a density matrix. Such a state is called a pure state,
meaning that it is equivalent to a single state in the more traditional formulation.
So what’s the point? We can consider more general density matrices:

$$\rho = p_1 |\psi_1\rangle\langle\psi_1| + p_2 |\psi_2\rangle\langle\psi_2| + \ldots, \qquad \mathrm{tr}(\rho) = 1, \quad p_i \geq 0. \qquad (9.29)$$

Note that this is with respect to some basis. If we choose a different basis then ρ may
not take such a diagonal form. The second condition translates into
$$\sum_i p_i = 1. \qquad (9.30)$$

In general these are called mixed states when they can’t be written as

ρ = |ψ⟩⟨ψ| (9.31)

for a single state |ψ⟩. A mixed state allows us to introduce a statistical notion of
uncertainty, in the sense that we don’t know what the quantum state is (but maybe we
could if we did further experiments). For example we can compute

$$\mathrm{tr}(O\rho) = p_1 \mathrm{tr}(O|\psi_1\rangle\langle\psi_1|) + p_2 \mathrm{tr}(O|\psi_2\rangle\langle\psi_2|) + \ldots = p_1 \langle\psi_1|O|\psi_1\rangle + p_2 \langle\psi_2|O|\psi_2\rangle + \ldots \qquad (9.32)$$

The expectation value then has the interpretation of a classical statistical average over the individual quantum expectation values.
Let’s consider the following examples of density matrices:

$$\begin{aligned}
\rho_1 = |\psi\rangle\langle\psi| &= \frac{1}{2}(|\uparrow\rangle + |\downarrow\rangle)(\langle\uparrow| + \langle\downarrow|) \\
&= \frac{1}{2}\left(|\uparrow\rangle\langle\uparrow| + |\uparrow\rangle\langle\downarrow| + |\downarrow\rangle\langle\uparrow| + |\downarrow\rangle\langle\downarrow|\right) \qquad (9.33)
\end{aligned}$$

and

$$\rho_2 = \frac{1}{2}\left(|\uparrow\rangle\langle\uparrow| + |\downarrow\rangle\langle\downarrow|\right) \qquad (9.34)$$

Note that although the density matrix ρ2 may look simpler than ρ1 it is a mixed state
whereas ρ1 is pure.
How can we tell in general whether or not a density matrix corresponds to a pure or
mixed state? Well a pure state means ρ = |ψ⟩⟨ψ| for some state |ψ⟩ and hence

$$\rho^2 = |\psi\rangle\langle\psi|\psi\rangle\langle\psi| = |\psi\rangle\langle\psi| = \rho \qquad (9.35)$$

So in particular $\mathrm{tr}(\rho^2) = \mathrm{tr}(\rho) = 1$. But for a mixed state (we assume the $|\psi_n\rangle$ form an orthonormal basis)
$$\rho = \sum_n p_n |\psi_n\rangle\langle\psi_n|, \qquad \rho^2 = \sum_{n,m} p_n p_m |\psi_n\rangle\langle\psi_n|\psi_m\rangle\langle\psi_m| = \sum_n p_n^2 |\psi_n\rangle\langle\psi_n| \qquad (9.36)$$

from which it follows that
$$\mathrm{tr}(\rho^2) = \sum_n p_n^2 \leq \sum_n p_n = 1 \qquad (9.37)$$
because $0 \leq p_n \leq 1$ implies $p_n^2 \leq p_n$ for each $p_n$. Furthermore we can only have $\mathrm{tr}(\rho^2) = 1$ if one, and therefore only one, $p_n = 1$ with all others vanishing. Thus we have a theorem:
$$\mathrm{tr}(\rho^2) \leq 1 \qquad (9.38)$$
with equality iff $\rho$ represents a pure state. For example in our "classical" density matrix $\rho_2$ above we have $p_1 = p_2 = 1/2$, so $\mathrm{tr}(\rho_2) = 1$ as required, but $\mathrm{tr}(\rho_2^2) = 1/4 + 1/4 = 1/2 < 1$, which tells us it is indeed a mixed state.
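These purity statements are easy to check numerically. The sketch below (Python with NumPy; an illustration, not part of the notes) builds $\rho_1$ and $\rho_2$ in the $\{|\uparrow\rangle, |\downarrow\rangle\}$ basis and compares $\mathrm{tr}(\rho^2)$:

```python
import numpy as np

up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])

# Pure state rho_1 = |psi><psi| with |psi> = (|up> + |down>)/sqrt(2)
psi = (up + down) / np.sqrt(2)
rho1 = np.outer(psi, psi.conj())

# Mixed state rho_2 = (|up><up| + |down><down|)/2
rho2 = 0.5 * (np.outer(up, up) + np.outer(down, down))

purity = lambda rho: np.trace(rho @ rho).real

print(purity(rho1))  # 1.0 -> pure
print(purity(rho2))  # 0.5 -> mixed
```

The trace of $\rho^2$ is basis-independent, so this test works however $\rho$ was constructed.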

9.3 Thermal States


The most classic example of a mixed state is what we find at finite temperature. Here
we imagine that there are many particles with a variety of energies and we take

$$\rho_{\mathrm{thermal}} = \frac{1}{Z}\sum_n e^{-E_n/k_B T} |E_n\rangle\langle E_n| \qquad (9.39)$$

where $|E_n\rangle$ are the energy eigenstates, $T$ is the temperature and $k_B$ is Boltzmann's constant, $k_B = 1.380649 \times 10^{-23}\,\mathrm{J/K}$, which converts temperature into energy.¹ Clearly this is a mixed state and is known as the Boltzmann distribution.
To determine the normalization Z we need to impose

$$1 = \mathrm{tr}(\rho) = \frac{1}{Z}\sum_{n,m} e^{-E_n/k_B T} \langle E_m|E_n\rangle\langle E_n|E_m\rangle = \frac{1}{Z}\sum_n e^{-E_n/k_B T}. \qquad (9.40)$$

Thus

$$Z = \sum_n e^{-E_n/k_B T}. \qquad (9.41)$$

This is known as the partition function and plays a central role. It counts the number of states available at each energy $E_n$:
$$Z = \sum_{E_n} d(n)\, e^{-\beta E_n} \qquad (9.42)$$
where $d(n)$ counts the degeneracy of states at energy level $E_n$ and $\beta = 1/k_B T$ is the inverse temperature.
At low temperatures, where $T \to 0$, the density matrix becomes strongly peaked around the lowest energy state:
$$\lim_{T\to 0} \rho_{\mathrm{thermal}} = |E_0\rangle\langle E_0| \qquad (9.43)$$
and hence becomes pure. However at high temperature, $T \to \infty$, all the energy eigenstates contribute more or less equally and so
$$\lim_{T\to\infty} \rho_{\mathrm{thermal}} = \frac{1}{\dim\mathcal{H}} \sum_n |E_n\rangle\langle E_n|. \qquad (9.44)$$

This is proportional to the identity matrix (in the |En ⟩ basis). Such states are called
maximally mixed.
¹ This value of $k_B$ is exact: it's a definition.

We can ask what is the expected energy of a thermal state:
$$\begin{aligned}
\mathrm{tr}(H\rho_{\mathrm{thermal}}) &= \sum_m \langle E_m|H\rho_{\mathrm{thermal}}|E_m\rangle \\
&= \frac{1}{Z}\sum_{n,m} e^{-E_n/k_B T} \langle E_m|H|E_n\rangle\langle E_n|E_m\rangle \\
&= \frac{1}{Z}\sum_n E_n e^{-E_n/k_B T} \\
&= \frac{1}{Z}\sum_n E_n e^{-\beta E_n} \\
&= -\frac{1}{Z}\frac{\partial}{\partial\beta}\sum_n e^{-\beta E_n} \\
&= -\frac{\partial}{\partial\beta}\ln Z \qquad (9.45)
\end{aligned}$$
These formulae may (or may not) be familiar from statistical physics.
For example we can consider our friend the harmonic oscillator. Here we have a unique state at each level $n$ with $E_n = \hbar\omega(n + 1/2)$. Thus
$$Z = \sum_{n=0}^{\infty} q^{n+1/2} = \frac{q^{1/2}}{1-q} \qquad (9.46)$$

where $q = e^{-\hbar\omega\beta}$. Furthermore
$$\rho_{\mathrm{thermal}} = \frac{1}{Z}\sum_n q^{n+1/2} |n\rangle\langle n| = (1-q)\sum_n q^n |n\rangle\langle n| \qquad (9.47)$$

We can compute the expected energy to find
$$\begin{aligned}
\mathrm{tr}(H\rho_{\mathrm{thermal}}) &= -\frac{\partial}{\partial\beta}\ln\left(\frac{q^{1/2}}{1-q}\right) \\
&= -\left(\frac{1}{2q} + \frac{1}{1-q}\right)(-\hbar\omega)\, q \\
&= \frac{1}{2}\hbar\omega\,\frac{1+q}{1-q} \qquad (9.48)
\end{aligned}$$
We can re-write this as
$$\mathrm{tr}(H\rho_{\mathrm{thermal}}) = \frac{1}{2}\hbar\omega + \hbar\omega\frac{q}{1-q} \qquad (9.49)$$
The first term is just the ground state energy and so we can subtract it off, since it doesn't have a physical meaning, to find
$$E_{\mathrm{thermal}} = \hbar\omega\,\frac{1}{q^{-1} - 1} = \frac{\hbar\omega}{e^{\hbar\omega/k_B T} - 1} \qquad (9.50)$$

Figure 9.3.1: The Planck Curve: N.B. The horizontal axis is wavelength λ ∼ 1/ω

This is the famous Planck formula for the spectrum of light emitted from a so-called black body of temperature $T$, albeit in just one dimension. In three dimensions there is an extra factor of $4\pi\omega^2$ in the numerator, which comes from counting modes in three dimensions; basically it arises from the growth of the size of a sphere of radius $\omega = |\mathbf{k}|$, where $\mathbf{k}$ is the spatial momentum. A plot of $E_{\mathrm{thermal}}/\hbar\omega$ is given in Fig. 9.3.1. In particular it was observed that all bodies radiate with a certain spectrum that depends on their temperature. Humans (living ones) radiate in the infra-red. Hotter bodies radiate visible light, such as an oven hob (not an induction one!). Even the empty space of the Universe emits radiation in the form of the cosmic microwave background (CMB), corresponding to a temperature of about $-270\,^{\circ}\mathrm{C}$. Planck's formula for $E_{\mathrm{thermal}}$ matches experiment, but deriving it was one of those problems the nineteenth century physicists thought was just a loose end. Instead Planck had to introduce the notion of discrete energies and $\hbar$ to derive it. That was the beginning of the end for Classical Physics.
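The closed form (9.50) can be checked against a brute-force Boltzmann average over the oscillator levels. The sketch below (Python; illustrative, in units where $\hbar\omega = k_B = 1$) truncates the sum at a large level $n_{\max}$:

```python
import math

def thermal_energy_exact(T, hw=1.0):
    """Closed form (9.50), with the ground-state energy subtracted."""
    return hw / (math.exp(hw / T) - 1.0)

def thermal_energy_sum(T, hw=1.0, nmax=5000):
    """Boltzmann average of E_n - hw/2 over a truncated level sum."""
    beta = 1.0 / T
    weights = [math.exp(-beta * hw * n) for n in range(nmax)]
    Z = sum(weights)
    return sum(hw * n * w for n, w in zip(range(nmax), weights)) / Z

print(thermal_energy_exact(2.0), thermal_energy_sum(2.0))  # the two should agree
```

Since the weights decay geometrically, the truncation error is negligible for any reasonable temperature.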

9.4 Entropy

To continue we need to talk about the idea of entropy. Entropy is a measure of disorder. More precisely it measures how many microscopic states lead to the same macroscopic one. All your various charger cables are always tangled up because there are so many more tangled possibilities than untangled ones. It's not that life is against you, it's just that being organised is an exponentially unlikely state of affairs: organisation requires some

organiser and quite some effort². The most basic definition of entropy is that it is the logarithm of the number of states with the same energy. So
$$S(E_n) = \ln d(n) \qquad (9.51)$$
where $d(n)$ is given in (9.42).


But there are a variety of notions of entropy³. To measure the entropy of a mixed state we consider the von Neumann entropy:
$$S_{vN} = -\mathrm{tr}(\rho \ln \rho) \qquad (9.52)$$
This requires a little motivation. Firstly, the easiest way to compute the logarithm of an operator is to diagonalise it, so:
$$S_{vN} = -\sum_n p_n \ln p_n \qquad (9.53)$$

The minus sign is needed because $0 \leq p_n \leq 1$, to ensure that $S_{vN} \geq 0$. We see that $p_n \ln p_n = 0$ if $p_n = 1$ (corresponding to a pure state) and also $p_n \ln p_n = 0$ if $p_n = 0$ (meaning that states that don't appear in $\rho$ don't contribute to the entropy). It follows that only mixed states have $S_{vN} > 0$. So this is another way to characterise mixed states.
In our simple mixed state $\rho_2$ we have
$$S_{vN} = -\frac{1}{2}\ln\frac{1}{2} - \frac{1}{2}\ln\frac{1}{2} = \ln 2 \qquad (9.54)$$

In fact there is a theorem that SvN ≤ ln(dim H). A mixed state with SvN = ln(dim H)
is said to be maximally mixed. For example a maximally mixed state arises when
pn = 1/N for all n, where N = dim H. Such a mixed state is equally mixed between all
pure states.
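The von Neumann entropy is simple to compute from the eigenvalues of $\rho$, using the convention $0\ln 0 = 0$. A hedged numerical illustration (Python/NumPy, not from the notes):

```python
import numpy as np

def von_neumann_entropy(rho):
    """S_vN = -tr(rho ln rho), computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]          # drop zero eigenvalues (0 ln 0 = 0)
    return float(-np.sum(evals * np.log(evals)))

rho_pure = np.array([[1.0, 0.0], [0.0, 0.0]])   # a pure state
rho_mixed = 0.5 * np.eye(2)                      # maximally mixed qubit

print(von_neumann_entropy(rho_pure))    # 0.0
print(von_neumann_entropy(rho_mixed))   # ln 2 ~ 0.693 = ln(dim H)
```

The maximally mixed example saturates the bound $S_{vN} \leq \ln(\dim\mathcal{H})$ quoted above.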

9.5 More on entanglement


Let's again consider the example of two particles in the $|0,0\rangle$ state that we saw in Section 4.2:
$$|0,0\rangle = \frac{1}{\sqrt{2}}\left(|\tfrac{1}{2},\tfrac{1}{2}\rangle \otimes |\tfrac{1}{2},-\tfrac{1}{2}\rangle - |\tfrac{1}{2},-\tfrac{1}{2}\rangle \otimes |\tfrac{1}{2},\tfrac{1}{2}\rangle\right) \qquad (9.55)$$
This is a state that consists of two particles each of which carries ±ℏ/2 units of angular
momentum around the z-axis in such a way that the total angular momentum vanishes.
But we don’t know if the first one is spinning clockwise and the second anti-clockwise
or vice-versa.
² Of course life could still be against you; even though everyone's charger cables get tangled, yours are more tangled than others and always at the worst time.
³ One could say that the entropy of entropy definitions is non-zero.

Now these two particles might be very far from each other. Perhaps they were created
in some experiment as a pair and then flew away across the universe, like endless rain into
a paper cup. Suppose sometime later we measure one of the particles, say the first, and
it is in the state $|\frac{1}{2},\frac{1}{2}\rangle$. Then we know, with absolute certainty, that the second particle, wherever it is in the Universe, is in the $|\frac{1}{2},-\frac{1}{2}\rangle$ state. This is very counter-intuitive
and indeed it violates condition b) in the local realism hypothesis. Nevertheless, we
can’t use it to send messages faster than light as we can’t control how the first particle’s
wavefunction ‘collapses’: to know what state the second particle is in one needs to
receive communication about the outcome on particle 1, and this cannot happen faster
than the speed of light.
Now recall that a general state in the tensor product $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$ of Hilbert spaces can be written as
$$|\Psi\rangle = \sum_{n',n''} c_{n'n''} |e_{n'}\rangle \otimes |e_{n''}\rangle \qquad (9.56)$$
with $|e_{n'}\rangle$ a basis of $\mathcal{H}_1$ and $|e_{n''}\rangle$ a basis of $\mathcal{H}_2$. Note that a pure state takes the form
$$\rho = |\Psi\rangle\langle\Psi| = \sum_{n',n'',m',m''} c_{n'n''} c^*_{m'm''}\, |e_{n'}\rangle\langle e_{m'}| \otimes |e_{n''}\rangle\langle e_{m''}| \qquad (9.57)$$

Next suppose we admit that we have no idea what is going on in the second Hilbert space $\mathcal{H}_2$. Then we would construct a density matrix where we sum (trace) over all the options for $\mathcal{H}_2$:
$$\begin{aligned}
\rho_{\mathrm{reduced}} &= \mathrm{tr}_{\mathcal{H}_2}(|\Psi\rangle\langle\Psi|) \\
&= \sum_{p''} \langle e_{p''}| \left(\sum_{n',n'',m',m''} c_{n'n''} c^*_{m'm''}\, |e_{n'}\rangle\langle e_{m'}| \otimes |e_{n''}\rangle\langle e_{m''}|\right) |e_{p''}\rangle \\
&= \sum_{n',m'} \left(\sum_{n''} c_{n'n''} c^*_{m'n''}\right) |e_{n'}\rangle\langle e_{m'}| \\
&\equiv \sum_{n',m'} a_{n'm'} |e_{n'}\rangle\langle e_{m'}| \qquad (9.58)
\end{aligned}$$

Note that $a_{n'm'}$ defines a self-adjoint operator on $\mathcal{H}_1$, as
$$a^*_{m'n'} = \sum_{n''} c^*_{m'n''}\, c_{n'n''} = a_{n'm'} \qquad (9.59)$$

So we can find a new basis $|e'_{n'}\rangle$ of $\mathcal{H}_1$ such that
$$\rho_{\mathrm{reduced}} = \sum_{n'} p_{n'} |e'_{n'}\rangle\langle e'_{n'}| \qquad (9.60)$$
Furthermore the $p_{n'}$, which are the eigenvalues of $a_{n'm'}$, are real and in fact non-negative (because, roughly, $a \sim cc^\dagger$ is positive semi-definite) and at most one (because $|\Psi\rangle$ is normalised).
Thus we find a density matrix, called the reduced density matrix, on the first Hilbert space. Even though we started from a pure state, from the point of view of the first Hilbert space $\rho_{\mathrm{reduced}}$ is in general a mixed state. Furthermore we can evaluate its von Neumann entropy:
$$S_{EE} = -\mathrm{tr}_{\mathcal{H}_1}(\rho_{\mathrm{reduced}} \ln \rho_{\mathrm{reduced}}) \qquad (9.61)$$
which is generally referred to as the entanglement entropy. Note that there is a theorem that, had we traced over $\mathcal{H}_1$ and computed the entanglement entropy of the resulting density matrix on $\mathcal{H}_2$, we would find the same value for $S_{EE}$. In other words the state is equally entangled between the two sub-Hilbert spaces.
So let's go back to our two-particle system and first consider the pure state
$$|\Psi\rangle = |\uparrow\rangle \otimes |\downarrow\rangle \qquad (9.62)$$
This leads to the density matrix
$$\rho = |\uparrow\rangle\langle\uparrow| \otimes |\downarrow\rangle\langle\downarrow| \qquad (9.63)$$
If we trace over the second Hilbert space we find
$$\rho_{\mathrm{reduced}} = |\uparrow\rangle\langle\uparrow| \left(\langle\uparrow|\downarrow\rangle\langle\downarrow|\uparrow\rangle + \langle\downarrow|\downarrow\rangle\langle\downarrow|\downarrow\rangle\right) = |\uparrow\rangle\langle\uparrow| \qquad (9.64)$$
which is pure, and indeed $S_{EE} = 0$. The reason is that the unknown part of the state arising from $\mathcal{H}_2$ is just one state, so averaging over it doesn't really lose any information. However, suppose we start with
$$|\Psi\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle \otimes |\downarrow\rangle + |\downarrow\rangle \otimes |\uparrow\rangle\right) \qquad (9.65)$$
The density matrix is therefore
$$\rho = \frac{1}{2}\left(|\uparrow\rangle\langle\uparrow| \otimes |\downarrow\rangle\langle\downarrow| + |\uparrow\rangle\langle\downarrow| \otimes |\downarrow\rangle\langle\uparrow| + |\downarrow\rangle\langle\uparrow| \otimes |\uparrow\rangle\langle\downarrow| + |\downarrow\rangle\langle\downarrow| \otimes |\uparrow\rangle\langle\uparrow|\right) \qquad (9.66)$$
Then, tracing over the second Hilbert space, the middle two terms drop out and we find
$$\rho_{\mathrm{reduced}} = \frac{1}{2}\left(|\uparrow\rangle\langle\uparrow| + |\downarrow\rangle\langle\downarrow|\right) \qquad (9.67)$$
which agrees with the $\rho_2$ that we had above. We also see that
$$S_{EE} = -\frac{1}{2}\ln\frac{1}{2} - \frac{1}{2}\ln\frac{1}{2} = \ln 2 \qquad (9.68)$$
as before. The lesson is that the "classical" density matrix we considered above arises in a quantum theory by tracing over hidden degrees of freedom. In other words the quantum description is more refined, i.e. it contains more information.
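The partial trace and entanglement entropy above can be sketched numerically (Python/NumPy; an illustration, not part of the notes). For a two-qubit pure state with coefficient matrix $c_{n'n''}$, the reduced density matrix on the first factor is $a = c\,c^\dagger$, exactly as in (9.58):

```python
import numpy as np

def entanglement_entropy(c):
    """c[n1, n2]: coefficients of |Psi> = sum c_{n1 n2} |e_n1> (x) |e_n2>.
    Returns S_EE of the reduced density matrix on the first factor."""
    rho_reduced = c @ c.conj().T           # a_{n'm'} = sum_{n''} c_{n'n''} c*_{m'n''}
    p = np.linalg.eigvalsh(rho_reduced)
    p = p[p > 1e-12]                        # convention 0 ln 0 = 0
    return float(-np.sum(p * np.log(p)))

# Product state |up> (x) |down>: no entanglement
c_product = np.array([[0.0, 1.0], [0.0, 0.0]])
# (|up,down> + |down,up>)/sqrt(2): maximal entanglement
c_bell = np.array([[0.0, 1.0], [1.0, 0.0]]) / np.sqrt(2)

print(entanglement_entropy(c_product))  # 0.0
print(entanglement_entropy(c_bell))     # ln 2 ~ 0.693
```

The two examples reproduce $S_{EE} = 0$ for the product state (9.62) and $S_{EE} = \ln 2$ for the entangled state (9.65).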
Chapter 10

Relativistic Quantum Mechanics

In this chapter we discuss the interplay between quantum mechanics and special relativity. As a warm-up, we will revisit the problem of a particle in an electromagnetic field, the properties of the Schrödinger equation under "gauge" transformations and the implications for physical observables. In particular, we will study the Aharonov-Bohm effect. We will then see how demanding that quantum mechanics be compatible with special relativity leads us to new equations such as the Klein-Gordon and Dirac equations.

10.1 Schrödinger equation for a charged particle


Classically, a particle of charge $q$ in an electromagnetic field evolves according to
$$m\frac{d\mathbf{v}}{dt} = q\mathbf{E} + \frac{q}{c}\,\mathbf{v} \times \mathbf{B}. \qquad (10.1)$$
The electric and magnetic fields can be combined into an electromagnetic field strength tensor $F_{\mu\nu}$, $\mu,\nu = 0,\ldots,3$, that transforms covariantly under Lorentz transformations (which are symmetries of special relativity, as we will see later). $F_{\mu\nu}$ is antisymmetric and the electric and magnetic fields correspond to the following components
$$E_i \equiv -F_{0i} = \left(-\nabla V - \frac{1}{c}\frac{\partial \mathbf{A}}{\partial t}\right)_i, \qquad B_i \equiv \epsilon_{ijk} F^{jk} = (\nabla \times \mathbf{A})_i \qquad (10.2)$$
Here $\mathbf{A}$ is the vector or gauge potential and, together with the Coulomb potential $V$, combines into a 4-vector
$$A^\mu = (V, \mathbf{A}). \qquad (10.3)$$

The equations of motion can then be derived from the Lagrangian
$$L = \frac{m\mathbf{v}^2}{2} + \underbrace{\frac{q}{c}\,\mathbf{v}\cdot\mathbf{A} - qV}_{\;\equiv\; -\frac{q}{c}\dot{x}^\mu A_\mu}, \qquad (10.4)$$


or equivalently the Hamiltonian
$$H \equiv \mathbf{p}\cdot\dot{\mathbf{q}} - L = \frac{1}{2m}\left(\mathbf{p} - \frac{q}{c}\mathbf{A}\right)^2 + qV, \qquad \dot{\mathbf{q}} = \mathbf{v}, \quad \mathbf{p} = \frac{\partial L}{\partial\dot{\mathbf{q}}} = m\mathbf{v} + \frac{q}{c}\mathbf{A}. \qquad (10.5)$$

Note that $\mathbf{p}$ is the canonical momentum, which in this case differs from the usual momentum $m\mathbf{v}$. The Schrödinger equation for a quantum particle in an EM field is obtained from (10.5) by simply replacing $\mathbf{p} \to -i\hbar\nabla$. We obtain
$$\hat{H} = \frac{1}{2m}\left(-i\hbar\nabla - \frac{q}{c}\mathbf{A}\right)^2 + qV. \qquad (10.6)$$

All physical quantities (electric and magnetic fields) are invariant under the gauge transformations
$$V \to \tilde{V} = V - \frac{1}{c}\frac{\partial}{\partial t}\Lambda(\mathbf{x},t), \qquad \mathbf{A} \to \tilde{\mathbf{A}} = \mathbf{A} + \nabla\Lambda(\mathbf{x},t), \qquad (10.7)$$
or more compactly (and covariantly),
$$A_\mu \to A_\mu + \partial_\mu\Lambda. \qquad (10.8)$$
Under these transformations, the wavefunction is not invariant, but instead transforms as
$$\tilde{\psi}(\mathbf{x},t) = e^{\frac{iq\Lambda}{\hbar c}}\,\psi(\mathbf{x},t). \qquad (10.9)$$

To see this note that
$$i\hbar\frac{\partial}{\partial t}\tilde{\psi} = e^{\frac{iq\Lambda}{\hbar c}}\left(i\hbar\frac{\partial}{\partial t}\psi - \frac{q}{c}\frac{\partial\Lambda}{\partial t}\psi\right),$$
$$\left[\frac{1}{2m}\left(-i\hbar\nabla - \frac{q\tilde{\mathbf{A}}}{c}\right)^2 + q\tilde{V}\right]\tilde{\psi} = e^{\frac{iq\Lambda}{\hbar c}}\left[\frac{1}{2m}\Big(-i\hbar\nabla + \underbrace{\frac{q\nabla\Lambda}{c} - \frac{q\tilde{\mathbf{A}}}{c}}_{-\frac{q\mathbf{A}}{c}}\Big)^2 + q\tilde{V}\right]\psi \qquad (10.10)$$
which means that the Schrödinger equation is invariant under gauge transformations,
$$\left[\frac{1}{2m}\left(-i\hbar\nabla - \frac{q}{c}\tilde{\mathbf{A}}\right)^2 + q\tilde{V}\right]\tilde{\psi} = i\hbar\frac{\partial}{\partial t}\tilde{\psi} \qquad (10.11)$$
provided $\psi$ transforms according to (10.9).

10.1.1 Aharonov-Bohm effect


This is an experiment that demonstrates that quantum particles are sensitive to mag-
netic fields even if they are not directly exposed to them. Consider the following setup:

• two particles emitted at a source and directed along paths enclosing (but not
crossing) a region with non-vanishing magnetic flux

• particles recombine

The key outcome of this experiment is that, upon recombination, the particles give rise to an interference pattern that is sensitive to the magnetic flux. This is all despite the fact that the particles were never exposed directly to the flux. Classically, this would not be possible.
Let
$$\Phi = \mathbf{B} \cdot \mathbf{S} \qquad (10.12)$$
be the magnetic flux generated by a solenoid of area $\mathbf{S}$ (magnitude proportional to the surface area, direction perpendicular to the surface). Note that
$$\Phi = \int d\mathbf{S} \cdot \mathbf{B} = \int d\mathbf{S} \cdot (\nabla \times \mathbf{A}) = \oint \mathbf{A} \cdot d\mathbf{x}, \qquad (10.13)$$
where in the last step we used Stokes' theorem. Now let's work in cylindrical coordinates with the $z$ axis aligned along the solenoid, in the direction of the magnetic field. Then
$$\Phi = \oint \mathbf{A} \cdot d\mathbf{x} = \int_0^{2\pi} A_\phi\, r\, d\phi \implies A_\phi = \frac{\Phi}{2\pi r}. \qquad (10.14)$$

All other components of the gauge potential vanish. The Hamiltonian (10.6) (for a particle moving on a circle of radius $r$ around the solenoid, in units where $c = 1$) is then
$$H = \frac{1}{2mr^2}\left(-i\hbar\frac{\partial}{\partial\phi} - \frac{q\Phi}{2\pi}\right)^2. \qquad (10.15)$$
Energy eigenstates satisfy
$$H\psi = E_n\psi \implies \psi = N e^{in\phi}, \qquad E_n = \frac{\hbar^2}{2mr^2}\left(n - \frac{\Phi}{\Phi_0}\right)^2, \qquad (10.16)$$
where
$$\Phi_0 = \frac{2\pi\hbar}{q} \qquad (10.17)$$
is the flux quantum.
is the flux quantum.
Now notice that the spectrum is unaffected if $\Phi = k\Phi_0$, $k \in \mathbb{Z}$. On the other hand, if $\Phi \neq k\Phi_0$, the spectrum is shifted. This phenomenon is known as spectral flow and is physical: it knows about $\mathbf{B}$ and $\Phi$ despite the fact that the particles are not directly exposed to $\Phi$. It can be measured via the Aharonov-Bohm effect, i.e. an interference pattern.
To see this, consider the more general case
$$\frac{1}{2m}(-i\hbar\nabla - q\mathbf{A})^2\psi = E\psi \qquad (10.18)$$
and set
$$\psi(\mathbf{x}) = e^{\frac{iq}{\hbar}\int^{\mathbf{x}} \mathbf{A}(\mathbf{x}')\cdot d\mathbf{x}'}\,\phi(\mathbf{x}). \qquad (10.19)$$

Substituting (10.19) into the Schrödinger equation (10.18) immediately shows that $\phi(\mathbf{x})$ obeys the free equation, i.e.
$$\frac{1}{2m}(-i\hbar\nabla - q\mathbf{A})^2\psi = e^{\frac{iq}{\hbar}\int^{\mathbf{x}}\mathbf{A}(\mathbf{x}')\cdot d\mathbf{x}'}\,\frac{1}{2m}(-i\hbar\nabla + q\mathbf{A} - q\mathbf{A})^2\phi(\mathbf{x}) = E\, e^{\frac{iq}{\hbar}\int^{\mathbf{x}}\mathbf{A}(\mathbf{x}')\cdot d\mathbf{x}'}\,\phi(\mathbf{x}). \qquad (10.20)$$
Recall that under gauge transformations
$$\mathbf{A} \to \mathbf{A} + \nabla\Lambda, \qquad \psi(\mathbf{x}) \to e^{\frac{iq\Lambda}{\hbar c}}\psi, \qquad (10.21)$$
which is manifestly obeyed by (10.19).


The phase difference between particles travelling along paths enclosing the magnetic flux is
$$\Delta\varphi = \frac{q}{\hbar}\int_{P_1} \mathbf{A}\cdot d\mathbf{x} - \frac{q}{\hbar}\int_{P_2} \mathbf{A}\cdot d\mathbf{x} = \frac{q}{\hbar}\oint \mathbf{A}\cdot d\mathbf{x} = \frac{q}{\hbar}\int \mathbf{B}\cdot d\mathbf{S} = \frac{q\Phi}{\hbar}. \qquad (10.22)$$
This phase difference will have an effect on the interference pattern provided that
$$\Phi \neq \frac{2\pi\hbar n}{q} = n\Phi_0. \qquad (10.23)$$
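A small numeric sketch of (10.22)-(10.23) (Python; the units $\hbar = q = 1$ are an illustrative assumption, not from the notes): the interference pattern is unchanged exactly when the enclosed flux is an integer multiple of the flux quantum, since then the Aharonov-Bohm phase is a multiple of $2\pi$.

```python
import math

def ab_phase(flux, q=1.0, hbar=1.0):
    """Aharonov-Bohm phase difference (10.22), in illustrative units."""
    return q * flux / hbar

flux_quantum = 2 * math.pi  # Phi_0 = 2*pi*hbar/q with hbar = q = 1

# Integer multiples of Phi_0 give a phase that is a multiple of 2*pi:
print(ab_phase(3 * flux_quantum) % (2 * math.pi))    # ~0 (up to rounding): no shift
# A generic flux shifts the interference fringes:
print(ab_phase(0.25 * flux_quantum) % (2 * math.pi)) # pi/2
```

Only the phase modulo $2\pi$ is observable, which is why the spectrum (10.16) is periodic in $\Phi$ with period $\Phi_0$.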

10.2 Relativistic quantum mechanics


Let us consider the Schrödinger equation for a collection of $N$ particles with positions $\mathbf{x}_a$, $a = 1,\ldots,N$:
$$i\hbar\frac{\partial\Psi}{\partial t} = -\sum_a \frac{\hbar^2}{2m_a}\nabla_a^2\Psi + V(\mathbf{x}_a)\Psi \qquad (10.24)$$
where Ψ = Ψ(t, x1 , ..., xN ). We don’t expect there to be a preferred point or direction
in space. So let us suppose that the potential is of the form

V (x1 , ..., xN ) = V (|xa − xb |) (10.25)

i.e. we assume that the potential only depends on distance between positions of the
particles. Schrödinger’s equation is then rotationally invariant:

xa → x′a = Rxa (10.26)

where R−1 = RT is an orthogonal matrix. It also has a notion of Galilean boosts:

xa → x′a = xa + vt (10.27)

Clearly $V$ is invariant as $\mathbf{x}_a - \mathbf{x}_b = \mathbf{x}'_a - \mathbf{x}'_b$, and the Laplacian terms don't change. However if we write $\Psi'(t, \mathbf{x}_a) = \Psi(t, \mathbf{x}'_a)$ then
$$\frac{\partial\Psi'}{\partial t} = \frac{\partial\Psi}{\partial t} + \mathbf{v}\cdot\sum_a \nabla_a\Psi \qquad (10.28)$$

Thus we find an extra term on the left hand side and the Schrödinger equation is no longer invariant. Drawing inspiration from our discussion of gauge invariance above, we can ask whether we can fix this by demanding that the wavefunction transforms under Galilean boosts by a phase.
Indeed, one can show that the unwanted term in (10.28) can be compensated for by taking instead
$$\Psi'(t, \mathbf{x}_a) = e^{-\frac{i}{\hbar}\mathbf{v}\cdot\sum_a\left(m_a\mathbf{x}_a + \frac{1}{2}m_a\mathbf{v}t\right)}\,\Psi(t, \mathbf{x}'_a) \qquad (10.29)$$

i.e. we include an extra phase factor in the wavefunction. As a result we find (see the problem set)
$$i\hbar\frac{\partial\Psi'}{\partial t} = -\sum_a\frac{\hbar^2}{2m_a}\nabla_a'^2\Psi' + V(\mathbf{x}'_a)\Psi' \qquad (10.30)$$
Thus we have a symmetry under what are known as Galilean boosts, corresponding to a notion of Galilean relativity. In particular, in Galilean relativity time is absolute: everyone agrees on $t$ and on an infinitesimal tick of the clock $dt$. They also all agree on spatial lengths $dx^2 + dy^2 + dz^2$. Thus observers may move in reference frames which are related by rotations and boosts of the form (10.27).
That’s great but as Einstein showed in 1905 Galilean boosts are not symmetries of
space and time. Rather Lorentzian boosts are and we need to consider Special Relativity!
In Galilean relativity the speed of light is not a constant because under a boost velocities
are shifted:

ẋ → ẋ + v. (10.31)

But Einstein’s great insight (well one of them) was to insist that the speed of light is a
universal constant. I won’t go into why except to mention that Maxwell’s equations for
electromagnetism don’t allow you to go to a frame where c = 0. They are not invariant
under the Galilean boosts we saw above. They are instead invariant under Lorentz
transformations which include rotations and Lorentz boosts which we now describe.
So what is a Lorentz boost? In Special Relativity the invariant notion of length is, for an infinitesimal displacement,¹
$$ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2. \qquad (10.32)$$

Here $c$ is the speed of light, and we only require that $ds^2$ is the same for all observers (not $dt$ and $dx^2 + dy^2 + dz^2$ separately). Thus we are allowed to consider a transformation of the form
$$ct' = \gamma ct + \gamma\boldsymbol{\beta}\cdot\mathbf{x}, \qquad \mathbf{x}' = \gamma\mathbf{x} + \gamma\boldsymbol{\beta}\,ct,$$
$$c\,dt' = \gamma c\,dt + \gamma\boldsymbol{\beta}\cdot d\mathbf{x}, \qquad d\mathbf{x}' = \gamma\, d\mathbf{x} + \gamma\boldsymbol{\beta}\,c\,dt. \qquad (10.33)$$
¹ A more common convention is $ds^2 = -c^2dt^2 + dx^2 + dy^2 + dz^2$ but either will do - that's not true: no sane person likes the convention we are using.

However we require that
$$\begin{aligned}
ds'^2 &= c^2dt'^2 - d\mathbf{x}'\cdot d\mathbf{x}' \\
&= \gamma^2(c\,dt + \boldsymbol{\beta}\cdot d\mathbf{x})^2 - \gamma^2(d\mathbf{x} + \boldsymbol{\beta}c\,dt)\cdot(d\mathbf{x} + \boldsymbol{\beta}c\,dt) \\
&= \gamma^2(1 - |\boldsymbol{\beta}|^2)c^2dt^2 - \gamma^2(1 - |\boldsymbol{\beta}|^2)\,d\mathbf{x}\cdot d\mathbf{x}, \qquad (10.34)
\end{aligned}$$
and since we need $ds'^2 = ds^2$, we must have (we could also take the other sign but that would flip the direction of time)
$$\gamma = \frac{1}{\sqrt{1 - |\boldsymbol{\beta}|^2}}. \qquad (10.35)$$

What is $\boldsymbol{\beta}$? Well, consider a small $\boldsymbol{\beta}$; then to first order we have $\gamma = 1$ and
$$t' = t + c^{-1}\boldsymbol{\beta}\cdot\mathbf{x}, \qquad \mathbf{x}' = \mathbf{x} + \boldsymbol{\beta}\,ct. \qquad (10.36)$$
Thus to recover the Galilean boost $\mathbf{x}' = \mathbf{x} + \mathbf{v}t$ we identify $\boldsymbol{\beta} = \mathbf{v}/c$. Small $\boldsymbol{\beta}$ means $|\mathbf{v}| \ll c$. Then we see that
$$t' = t + c^{-2}\mathbf{v}\cdot\mathbf{x} = t + O(|\mathbf{v}|/c^2), \qquad (10.37)$$
and we have recovered absolute time.
and together with spatial rotations they form the Lorentz group.
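The defining property (10.34)-(10.35) is easy to verify numerically. The sketch below (Python; illustrative, with $c = 1$ and a boost along $x$) checks that a Lorentz boost preserves the interval while a Galilean boost does not:

```python
import math

def lorentz_boost(t, x, beta):
    """Boost along x with velocity beta = v/c, in units c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (t + beta * x), gamma * (x + beta * t)

def interval(t, x):
    """ds^2 = t^2 - x^2 in the (+,-) convention of these notes, c = 1."""
    return t**2 - x**2

t, x, beta = 2.0, 0.5, 0.6
tp, xp = lorentz_boost(t, x, beta)

print(interval(t, x))             # 3.75
print(interval(tp, xp))           # 3.75 -> invariant under the Lorentz boost
print(interval(t, x + beta * t))  # Galilean boost: interval is NOT preserved
```

For $|\beta| \ll 1$ the boosted coordinates reduce to the Galilean form, matching (10.36)-(10.37).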

10.3 Relativistic Wave Equations


Let's get back to Quantum Mechanics, where time plays a central role. If time is not absolute then what happens to Quantum Mechanics? How does this fit with the fact that the Schrödinger equation is first order in time derivatives and second order in spatial derivatives? In Special Relativity time and space are interchangeable.
The first attempt to find a relativistic wave equation is to simply make time second order and consider
$$\frac{\hbar^2}{c^2}\frac{\partial^2\Psi}{\partial t^2} - \hbar^2\nabla^2\Psi + m^2c^2\Psi = 0 \qquad (10.38)$$
This is invariant under Special relativity if Ψ(x) → Ψ′ (x) = Ψ(x′ ) (see the problems).
It is known as the Klein-Gordon equation. But there are some problems:

1. If we look at energy and momentum eigenstates $\Psi = e^{-iEt/\hbar + i\mathbf{p}\cdot\mathbf{x}/\hbar}$ we find
$$E^2 = |\mathbf{p}|^2c^2 + m^2c^4$$
(setting $\mathbf{p} = 0$ immediately gives a famous formula). But this allows for both positive and negative energy.

2. We would like to interpret $\Psi^*\Psi$ as a probability density but it is not conserved:
$$\frac{d}{dt}\int \Psi^*\Psi\, d^3x \neq 0$$
One can find a conserved density, $\rho = i\Psi^*\partial_t\Psi - i\Psi\partial_t\Psi^*$, but this is not positive definite (see the problems).

3. Knowing the state at time $t = 0$ is not enough information to determine it at later times; one also needs $\partial\Psi/\partial t$ at $t = 0$, but what does that mean?

The other way to go, and one of those oh so clever moments of discovery, is to find an equation that is first order in both time and space. This is what Dirac did, and it's beautiful. So let us try something like
$$\left(\frac{i\hbar}{c}\gamma^0\frac{\partial}{\partial t} + i\hbar\boldsymbol{\gamma}\cdot\nabla - mc\right)\Psi = 0. \qquad (10.39)$$

We will not be very specific yet about what $\gamma^0$ and $\gamma^i$ are. We want this equation to imply the Klein-Gordon equation, so we "square" it (i.e. act with the operator obtained by flipping the signs of the derivative terms):
$$\left(\frac{i\hbar}{c}\gamma^0\frac{\partial}{\partial t} + i\hbar\boldsymbol{\gamma}\cdot\nabla - mc\right)\left(-\frac{i\hbar}{c}\gamma^0\frac{\partial}{\partial t} - i\hbar\boldsymbol{\gamma}\cdot\nabla - mc\right)\Psi = 0$$
$$\left(\frac{\hbar^2}{c^2}(\gamma^0)^2\frac{\partial^2}{\partial t^2} + \frac{\hbar^2}{2}\{\gamma^i,\gamma^j\}\frac{\partial^2}{\partial x^i\partial x^j} + \frac{\hbar^2}{c}\{\gamma^0,\gamma^i\}\frac{\partial^2}{\partial t\,\partial x^i} + m^2c^2\right)\Psi = 0 \qquad (10.40)$$
where $\{\gamma^i,\gamma^j\} = \gamma^i\gamma^j + \gamma^j\gamma^i$ is known as the anti-commutator. Thus we will recover the Klein-Gordon equation if
$$(\gamma^0)^2 = 1, \qquad \{\gamma^0,\gamma^i\} = 0, \qquad \{\gamma^i,\gamma^j\} = -2\delta^{ij}. \qquad (10.41)$$

Clearly this can't happen if the $\gamma$'s are just numbers. But maybe if they are matrices? Some trial and error gives the following solution (unique up to conjugation $\gamma \to U\gamma U^{-1}$):
$$\gamma^0 = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}, \qquad \gamma^i = \begin{pmatrix} 0 & \tau_i \\ -\tau_i & 0 \end{pmatrix} \qquad (10.42)$$
where $I$ is the $2\times 2$ unit matrix and the $\tau_i$ are the Pauli matrices. Thus $\Psi$ must be a complex four-component vector:
$$\Psi = \begin{pmatrix} \psi_1 \\ \psi_2 \\ \psi_3 \\ \psi_4 \end{pmatrix}. \qquad (10.43)$$
And yet again we see the Pauli matrices.
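The algebra (10.41) can be verified directly for the matrices (10.42). A sketch in Python/NumPy (an illustration, not part of the notes):

```python
import numpy as np

I2 = np.eye(2)
tau = [np.array([[0, 1], [1, 0]], dtype=complex),        # Pauli matrices tau_1..tau_3
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]

Z2 = np.zeros((2, 2))
gamma0 = np.block([[I2, Z2], [Z2, -I2]])
gammas = [np.block([[Z2, t], [-t, Z2]]) for t in tau]

anticomm = lambda a, b: a @ b + b @ a

assert np.allclose(gamma0 @ gamma0, np.eye(4))                 # (gamma^0)^2 = 1
for i in range(3):
    assert np.allclose(anticomm(gamma0, gammas[i]), 0)         # {gamma^0, gamma^i} = 0
    for j in range(3):
        assert np.allclose(anticomm(gammas[i], gammas[j]),
                           -2 * (i == j) * np.eye(4))          # {gamma^i, gamma^j} = -2 delta^ij
print("Clifford algebra (10.41) verified")
```

Any set of matrices related to these by conjugation passes the same checks, reflecting the uniqueness statement above.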



10.4 Relativistic Notation


It’s time to adopt a better notation. We introduce the new coordinate xµ , µ = 0, 1, 2, 3
with

x0 = ct , x1 = x , x2 = y , x3 = z (10.44)

Note that the invariant length we introduced above is
$$ds^2 = \eta_{\mu\nu}\, dx^\mu dx^\nu \qquad (10.45)$$
where
$$\eta_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \qquad (10.46)$$
so that $ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2$ as in (10.32), and repeated indices are summed over. Note that in such sums one index is always up and one down. This ensures that any quantity that has no left-over indices is a Lorentz invariant. It is also helpful to introduce the matrix inverse of $\eta_{\mu\nu}$, which is denoted by
$$\eta^{\mu\nu} = \eta_{\mu\nu}^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \qquad (10.47)$$
Note that numerically $\eta_{\mu\nu} = \eta^{\mu\nu}$, but this is really a coincidence.


Lastly we have γ µ = (γ 0 , γ 1 , γ 2 , γ 3 ) above. Then the Dirac equation is simply

−iℏγ µ ∂µ Ψ + mcΨ = 0. (10.48)

10.5 Spinors
What about $\Psi$? It now has four components, so we could introduce an index $\Psi_\alpha$, $\alpha = 1,2,3,4$, in which case the $\gamma$-matrices also pick up indices:
$$\gamma^\mu = (\gamma^\mu)_\alpha{}^\beta \qquad (10.49)$$
The condition we found in (10.41) is
$$\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu} I \iff \{\gamma^\mu, \gamma^\nu\}_\alpha{}^\beta = 2\eta^{\mu\nu}\,\delta_\alpha{}^\beta \qquad (10.50)$$
This is called a Clifford algebra². $\Psi$ is called a spinor and the $\alpha,\beta$ indices spinor indices. This is a new kind of object for us, so let's see the consequences.
² Clifford was a student at King's in the 1800s.

Under a Lorentz transformation
$$x'^\mu = \Lambda^\mu{}_\nu x^\nu \qquad (10.51)$$
we require that $ds'^2 = ds^2$ and hence
$$\Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma \eta_{\mu\nu} = \eta_{\rho\sigma} \qquad (10.52)$$
The coordinates $x^\mu$ provide a representation of the Lorentz group, which is sometimes called $SO(1,3)$:
$$SO(1,3) = \{\Lambda \,|\, \eta = \Lambda^T\eta\Lambda,\ \det(\Lambda) = 1\} \qquad (10.53)$$
It is the spacetime version of the Euclidean rotation group in four dimensions, $SO(4)$. Note that $\eta = \Lambda^T\eta\Lambda$ is just the index-free, matrix version of (10.52). Here we have suppressed all the vector $\mu,\nu$ indices and think of $x^\mu$ as a vector in $\mathbb{R}^4$, while $\eta$ and $\Lambda$ are $4\times 4$ real matrices.
Let us look at the infinitesimal form, i.e. the Lie algebra:
$$\Lambda^\mu{}_\nu = \delta^\mu_\nu + \omega^\mu{}_\nu, \qquad \omega^\mu{}_\nu \ll 1 \qquad (10.54)$$
then (10.52) becomes (to first order)
$$\omega^\mu{}_\rho\, \eta_{\mu\sigma} + \omega^\nu{}_\sigma\, \eta_{\rho\nu} = 0 \qquad (10.55)$$
In Relativity we adopt a notation where contracting an index with $\eta_{\mu\nu}$ lowers it: $\omega^\mu{}_\rho\, \eta_{\mu\sigma} = \omega_{\sigma\rho}$. Similarly one can raise an index by contracting with $\eta^{\mu\nu}$. Thus the Lie algebra consists of antisymmetric $\omega_{\rho\sigma}$:
$$\omega_{\sigma\rho} + \omega_{\rho\sigma} = 0. \qquad (10.56)$$

A basis for $\mathfrak{so}(1,3)$ is therefore provided by the matrices
$$(M_{\mu\nu})^\rho{}_\sigma = \delta^\rho_\mu\, \eta_{\nu\sigma} - \delta^\rho_\nu\, \eta_{\mu\sigma} \qquad (10.57)$$
These are just $4\times 4$ matrices with a single $\pm 1$ somewhere in the upper triangle and a $\mp 1$ in the corresponding lower triangle to make them antisymmetric. For example:
$$M_{12} = \delta^\rho_1\, \eta_{2\sigma} - \delta^\rho_2\, \eta_{1\sigma} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \qquad (10.58)$$

One finds the Lie algebra satisfies (see the problem set)
$$[M_{\mu\nu}, M_{\lambda\rho}] = M_{\mu\rho}\,\eta_{\nu\lambda} - M_{\nu\rho}\,\eta_{\mu\lambda} + M_{\nu\lambda}\,\eta_{\mu\rho} - M_{\mu\lambda}\,\eta_{\nu\rho} \qquad (10.59)$$
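A quick NumPy check (illustrative, not part of the notes) that the matrices (10.57), with $\eta = \mathrm{diag}(1,-1,-1,-1)$ matching the signature of (10.32), satisfy the commutation relations (10.59):

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])

def M(mu, nu):
    """(M_{mu nu})^rho_sigma = delta^rho_mu eta_{nu sigma} - delta^rho_nu eta_{mu sigma}."""
    out = np.zeros((4, 4))
    out[mu, :] += eta[nu, :]   # row rho = mu picks up eta_{nu sigma}
    out[nu, :] -= eta[mu, :]   # row rho = nu picks up -eta_{mu sigma}
    return out

comm = lambda a, b: a @ b - b @ a

# Verify (10.59) for all index choices
for mu in range(4):
    for nu in range(4):
        for lam in range(4):
            for rho in range(4):
                lhs = comm(M(mu, nu), M(lam, rho))
                rhs = (M(mu, rho) * eta[nu, lam] - M(nu, rho) * eta[mu, lam]
                       + M(nu, lam) * eta[mu, rho] - M(mu, lam) * eta[nu, rho])
                assert np.allclose(lhs, rhs)
print("so(1,3) algebra (10.59) verified")
```

The same loop with `eta = np.eye(4)` would verify the ordinary $\mathfrak{so}(4)$ rotation algebra instead.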

What about $\Psi$? Well, it turns out that
$$\Sigma_{\mu\nu} = \frac{1}{2}\gamma_{\mu\nu} = \frac{1}{4}(\gamma_\mu\gamma_\nu - \gamma_\nu\gamma_\mu) \qquad (10.60)$$
also forms a representation of $\mathfrak{so}(1,3)$, i.e. of (10.59) (see the problem set). It is also four-dimensional, because the spinor indices have the range $\alpha,\beta = 1,2,3,4$, but this time it is four complex dimensions, and it acts on $\Psi$. This is called the spinor representation of $\mathfrak{so}(1,3)$.³
Thus under an infinitesimal Lorentz transformation we have
$$\Psi(x) \to \Psi'(x) = S\Psi(\Lambda x) = \left(1 + \frac{1}{2}\omega^{\mu\nu}\Sigma_{\mu\nu}\right)\Psi(\Lambda x) \qquad (10.61)$$
The extra factor of $1/2$ comes from the overcounting due to the antisymmetry. To find a finite transformation we exponentiate:
$$S = e^{\frac{1}{2}\omega^{\mu\nu}\Sigma_{\mu\nu}} \qquad (10.62)$$

Let us check that this is a symmetry of the Dirac equation:
$$\begin{aligned}
-i\hbar\gamma^\mu\partial_\mu\Psi' + mc\Psi' &= -i\hbar\gamma^\mu\partial_\mu\left(S\Psi(\Lambda x)\right) + mcS\Psi(\Lambda x) \\
&= -i\hbar\gamma^\mu S\,\Lambda^\nu{}_\mu\,\partial'_\nu\Psi(\Lambda x) + mcS\Psi(\Lambda x) \\
&= S\left(-i\hbar S^{-1}\Lambda^\nu{}_\mu\gamma^\mu S\,\partial'_\nu\Psi(x') + mc\Psi(x')\right) \qquad (10.63)
\end{aligned}$$
where we have used
$$\partial_\mu\Psi = \frac{\partial x'^\nu}{\partial x^\mu}\frac{\partial\Psi}{\partial x'^\nu} = \Lambda^\nu{}_\mu\,\partial'_\nu\Psi \qquad (10.64)$$
∂x ∂x
We want to show
$$-i\hbar\gamma^\mu\partial_\mu\Psi' + mc\Psi' = S\left(-i\hbar\gamma^\mu\partial'_\mu\Psi(x') + mc\Psi(x')\right) = 0 \qquad (10.65)$$
Thus we need that
$$S^{-1}\Lambda^\mu{}_\nu\gamma^\nu S = \gamma^\mu \qquad (10.66)$$
or infinitesimally
$$-\frac{1}{2}\omega^{\rho\sigma}\Sigma_{\rho\sigma}\gamma^\mu + \omega^\mu{}_\nu\gamma^\nu + \frac{1}{2}\gamma^\mu\omega^{\rho\sigma}\Sigma_{\rho\sigma} = 0$$
$$\omega^{\rho\sigma}\left(-\frac{1}{4}\gamma_{\rho\sigma}\gamma^\mu + \frac{1}{2}\delta^\mu_\rho\gamma_\sigma - \frac{1}{2}\delta^\mu_\sigma\gamma_\rho + \frac{1}{4}\gamma^\mu\gamma_{\rho\sigma}\right) = 0 \qquad (10.67)$$
which boils down to the identity
$$\gamma_{\rho\sigma}\gamma^\mu - \gamma^\mu\gamma_{\rho\sigma} = 2\delta^\mu_\sigma\gamma_\rho - 2\delta^\mu_\rho\gamma_\sigma \qquad (10.68)$$

This is in fact true, as you can check. But for now just try some cases: if $\mu, \rho, \sigma$ are all distinct then $\gamma^\mu$ commutes with $\gamma_{\rho\sigma}$ and the left hand side vanishes. So does the right hand side. If $\mu = \sigma$ and $\rho \neq \sigma$ then the two terms on the left hand side add to give a factor of 2, which agrees with the right hand side.

³ Things are a bit confusing in four dimensions as many different indices are four-dimensional: $x^\mu$ and $\Psi_\alpha$. This is another coincidence. For example in ten dimensions $x^\mu$ would have $\mu = 0, 1, 2, \ldots, 9$ but $\alpha = 1, 2, 3, \ldots, 32$.
As an example let's consider a rotation in the $x$-$y$ plane by an angle $\theta$: $\omega^{12} = \theta$. On the coordinates $x^\mu$ we have (using $M_{12}^3 = -M_{12}$ to resum the exponential)
$$x'^\mu = \left(e^{\frac{1}{2}\omega^{\lambda\rho}M_{\lambda\rho}}\right)^\mu{}_\nu x^\nu = \left(e^{\theta M_{12}}\right)^\mu{}_\nu x^\nu = \left(I + M_{12}\sin\theta + M_{12}^2(1-\cos\theta)\right)^\mu{}_\nu x^\nu$$
$$\implies \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} \qquad (10.69)$$

But what happens to a spinor? Since (γ_12)² = −1 we have

    Ψ′_α = (e^{(1/2) ω^µν Σ_µν})_α{}^β Ψ_β
         = (e^{(1/2) θ γ_12})_α{}^β Ψ_β
         = (cos(θ/2) I + γ_12 sin(θ/2))_α{}^β Ψ_β                     (10.70)

In particular for θ = 2π we find

    Ψ′ = −Ψ .                                                         (10.71)

Thus under a rotation by 2π spinors come back with a minus sign. This is very important. It turns out that particles described by spinors must have wavefunctions which change sign under swapping any two particles. Such particles are called Fermions (particles that don't have such a sign, such as photons, are called Bosons). This in turn implies the Pauli exclusion principle: no two spinor particles can be in the same state. This is what makes matter hard: when you add electrons to a Hydrogen atom to build heavier elements they must fill up different energy levels.
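We can verify the 2π sign flip numerically. The sketch below assumes a standard Dirac representation (any representation gives the same answer, since (γ_12)² = −1 always holds):

```python
import numpy as np
from scipy.linalg import expm

# A standard Dirac representation (an assumed basis):
I2 = np.eye(2)
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
g1 = np.block([[0*I2, s1], [-s1, 0*I2]])
g2 = np.block([[0*I2, s2], [-s2, 0*I2]])

# gamma_12 (lower indices) = (1/2)[gamma_1, gamma_2] = gamma^1 gamma^2
g12 = g1 @ g2

# Spinor rotation by theta is S = exp((theta/2) gamma_12); take theta = 2*pi:
S = expm(np.pi * g12)
print(np.allclose(S, -np.eye(4)))  # True: a 2*pi rotation gives S = -1
```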
How can we construct a Lorentz scalar from a spinor? For a vector we form the contraction, e.g. ds² = η_µν dx^µ dx^ν = dx^µ dx_µ. For a spinor we need to consider something like

    Ψ†_1 C Ψ_2                                                        (10.72)

for some spinor matrix C. Now under a Lorentz transformation δΨ_{1,2} = (1/4) ω_µν γ^µν Ψ_{1,2} and hence

    δΨ†_1 = (1/4) Ψ†_1 ω_µν (γ^µν)† = (1/4) Ψ†_1 ω_µν (γ^ν)† (γ^µ)†     (10.73)
Now it is easy to check that

    (γ^µ)† = γ^0 γ^µ γ^0
    =⇒ (γ^ν)† (γ^µ)† = γ^0 γ^ν γ^µ γ^0                                (10.74)

thus

    δΨ†_1 = (1/4) Ψ†_1 ω_µν γ^0 γ^νµ γ^0 = −(1/4) Ψ†_1 ω_µν γ^0 γ^µν γ^0     (10.75)
What we need is

    0 = δ(Ψ†_1 C Ψ_2)
      = δΨ†_1 C Ψ_2 + Ψ†_1 C δΨ_2
      = (1/4) Ψ†_1 ω_µν (−γ^0 γ^µν γ^0 C + C γ^µν) Ψ_2                (10.76)
Thus we can simply take C = γ^0. This defines the Dirac conjugate:

    Ψ̄ = Ψ† γ^0                                                        (10.77)

and hence Ψ̄_1 Ψ_2 is a Lorentz scalar. Thus C = γ^0 plays the same role for spinor indices that η^µν plays for spacetime indices. If we put indices on it we have C^{αβ} = (γ^0)^α{}_β.
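The key relation (γ^µ)† = γ^0 γ^µ γ^0 of (10.74) is easy to confirm in an explicit representation. The check below assumes the same standard Dirac basis as above (in a non-unitary basis the Hermitizing matrix can differ):

```python
import numpy as np

# Standard Dirac representation (an assumed basis):
I2 = np.eye(2)
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]]),
       np.array([[1, 0], [0, -1]], dtype=complex)]
g = [np.block([[I2, 0*I2], [0*I2, -I2]]).astype(complex)] + \
    [np.block([[0*I2, s], [-s, 0*I2]]) for s in sig]

# Check (gamma^mu)^dagger = gamma^0 gamma^mu gamma^0 for mu = 0, 1, 2, 3:
herm = all(np.allclose(g[mu].conj().T, g[0] @ g[mu] @ g[0]) for mu in range(4))
print(herm)  # True
```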
A final comment is that Ψ is called a Dirac spinor. This is not an irreducible representation of so(1, 3). One can impose the constraint

    γ_5 Ψ = γ^0 γ^1 γ^2 γ^3 Ψ = ±Ψ                                    (10.78)

The eigenstates of γ_5 are known as Weyl spinors and are often called left or right handed spinors. Given the form of the γ-matrices above we have

    γ_5 = ( 0  I )
          ( I  0 )                                                    (10.79)

and hence we can write

    Ψ = ( Ψ_L )
        ( Ψ_R )                                                       (10.80)

where Ψ_{L/R} are the left and right handed spinors. These form irreducible representations of so(1, 3) and each is a complex 2-vector in spinor space.

10.6 Back to Quantum Mechanics


Okay, so now we have a wave equation that is relativistic and first order in time. In particular we can re-write the Dirac equation in a form more compatible with quantum mechanics:

    iℏ ∂Ψ/∂t = HΨ ,    H = −iℏc γ^0 γ·∇ + mc² γ^0                     (10.81)

This H is self-adjoint but not positive definite. Indeed one can just look at constant Ψ and observe that γ^0 has eigenvalues +1 and −1.

So we still have negative energy states! But we do have a positive definite density. Indeed we can identify a 4-current

    J^µ = Ψ̄ γ^µ Ψ                                                     (10.82)

It is easy to see that this is conserved:

    ∂_µ J^µ = ∂_µ Ψ̄ γ^µ Ψ + Ψ̄ γ^µ ∂_µ Ψ
            = i (mc/ℏ) Ψ̄Ψ − i (mc/ℏ) Ψ̄Ψ
            = 0                                                        (10.83)

Here we have used the equation of motion

    iℏ ∂_µ Ψ̄ γ^µ + mcΨ̄ = 0                                            (10.84)

which you can show in the problem set, along with showing that Ψ̄γ^µΨ is a Lorentz vector. This means that the time component can be used as a conserved quantity:

    (1/c) d/dt ∫ J^0 d³x = ∫ ∂_0 J^0 d³x
                         = −∫ ∂_i J^i d³x
                         = 0                                           (10.85)

Furthermore,

    J^0 = Ψ̄ γ^0 Ψ = Ψ† Ψ                                              (10.86)

is positive definite.
So we've discovered something beautiful and new, but we haven't solved all our problems. We still have negative energy states, and not all particles are described by the Dirac equation: just Fermions such as electrons and quarks. Dirac's solution to the negative energy states was to use the Pauli exclusion principle to argue that the ground state has all negative energy states filled. This implies that you should be able to knock a negative energy state out of the vacuum. Such an excitation has the same energy (mass) but opposite charge to a positive energy state. In so doing he predicted anti-particles, which were subsequently observed. We now know that the Higgs boson (as with other Bosons) satisfies a Klein-Gordon-like equation and that these too have anti-particles. The full solution comes by allowing the number of particles to change, i.e. one allows for the creation and annihilation of particles, and leads to quantum field theory. But that's another module....
Chapter 11

The Feynman Path Integral

Our last topic is to present another way of formulating quantum mechanics known as
the path integral formulation. It is perhaps the most popular way as one can incorporate
symmetries such as Lorentz invariance from the outset. It also allows us to make contact
with the principle of least action and hence the Classical world.
Let us start by considering a general quantum mechanical system with coordinate q̂ and its conjugate momentum p̂:

    [q̂, p̂] = iℏ                                                       (11.1)

Let us imagine that we have an eigenstate of q̂:

    q̂|a⟩ = a|a⟩                                                        (11.2)

In a usual wavefunction representation this is slightly dodgy as

ψa (q) = ⟨q|a⟩ = δ(q − a) (11.3)

Similarly we can consider the momentum eigenstates

p̂|k⟩ = k|k⟩ (11.4)

which has a more reasonable, but still not normalizable, wavefunction

ψk (q) = ⟨q|k⟩ = eiqk/ℏ (11.5)

which is clearly an eigenstate of p̂ = −iℏ∂/∂q. The important point is that these eigenstates are delta-function normalized:

    ⟨a_1|a_2⟩ = ∫ ψ_{a_1}(q)* ψ_{a_2}(q) dq = δ(a_2 − a_1)

    ⟨k_1|k_2⟩ = ∫ ψ_{k_1}(q)* ψ_{k_2}(q) dq
              = ∫ e^{i(k_2−k_1)q/ℏ} dq
              = 2πℏ δ(k_2 − k_1)                                       (11.6)


In a suitable sense (don't ask too many questions) the |a⟩ form a basis. We can write the identity operator as

    I = ∫ da |a⟩⟨a|                                                    (11.7)

Similarly, so too do the |k⟩:

    I = ∫ (dk/2πℏ) |k⟩⟨k|                                              (11.8)
Let's look at how |a⟩ evolves in time (note that we need to put a hat on the Hamiltonian, as we will see later):

    |a; t⟩ = e^{−iĤt/ℏ} |a⟩                                            (11.9)

Let us consider the overlap

    K(a_2, t_2; a_1, t_1) = ⟨a_2, t_2|a_1, t_1⟩ = ⟨a_2| e^{iĤ(t_2−t_1)/ℏ} |a_1⟩     (11.10)

This is known as the propagator: it takes a particle at position a_1 at time t_1 and computes the amplitude that it is found at a_2 at time t_2. In addition, if we start with a more conventional wavefunction, i.e. one not localised in q:

    |ψ(t_1)⟩ = ∫ da_1 ψ(a_1, t_1) |a_1, t_1⟩                           (11.11)

then by linearity we have

    ⟨a_2, t_2|ψ(t_1)⟩ = ∫ ψ(a_1, t_1) ⟨a_2, t_2|a_1, t_1⟩ da_1
                      = ∫ K(a_2, t_2; a_1, t_1) ψ(a_1, t_1) da_1

i.e.

    ψ(a_2, t_2) = ∫ K(a_2, t_2; a_1, t_1) ψ(a_1, t_1) da_1             (11.12)

Thus the propagator evolves (propagates) wavefunctions in time. It is a Green’s function


for the Schrödinger equation.
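To see this propagation concretely, one can discretize everything: on a spatial grid the evolution operator becomes an ordinary matrix exponential, and (11.12) becomes matrix multiplication. A minimal sketch for a free particle follows (units ℏ = m = 1, a periodic grid, and the sign convention of (11.10) are all choices made here, not part of the notes):

```python
import numpy as np
from scipy.linalg import expm

# Free Hamiltonian discretized on a periodic grid (hbar = m = 1 for simplicity):
n, L = 200, 20.0
dx = L / n
x = np.arange(n) * dx - L / 2
lap = -2 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
lap[0, -1] = lap[-1, 0] = 1.0           # periodic boundary conditions
H = -0.5 * lap / dx**2                  # H = p^2 / 2m

# Propagator matrix, K[j, i] ~ K(x_j, t; x_i, 0) dx, with the sign of (11.10):
t = 0.5
K = expm(1j * H * t)

# Evolving a wavefunction: psi(t) is the "integral" of K against psi(0), eq. (11.12):
psi0 = (2 / np.pi) ** 0.25 * np.exp(-x**2)
psi_t = K @ psi0

# K is unitary, so the norm of the state is preserved:
print(np.isclose(np.sum(abs(psi_t)**2) * dx, np.sum(abs(psi0)**2) * dx))
```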
We want to find an expression for K(a_2, t_2; a_1, t_1). Let us insert the identity written as an integral over momentum eigenstates:

    K(a_2, t_2; a_1, t_1) = ⟨a_2| I e^{iĤ(t_2−t_1)/ℏ} |a_1⟩
                          = ∫ (dk/2πℏ) ⟨a_2|k⟩ ⟨k| e^{iĤ(t_2−t_1)/ℏ} |a_1⟩
                          = ∫ (dk/2πℏ) e^{ika_2/ℏ} ⟨k| e^{iĤ(t_2−t_1)/ℏ} |a_1⟩     (11.13)

where we have used ⟨a_2|k⟩ = (⟨k|a_2⟩)* = e^{ika_2/ℏ}. We will consider Hamiltonians of the form

    Ĥ = p̂²/2m + V(q̂)                                                  (11.14)

Next we consider an infinitesimal time step t_2 − t_1 = δt and expand:

    K(a_2, t_1 + δt; a_1, t_1)
      = ∫ (dk/2πℏ) e^{ika_2/ℏ} ⟨k| I + (i/ℏ) Ĥ δt |a_1⟩
      = ∫ (dk/2πℏ) e^{ika_2/ℏ} ⟨k|a_1⟩ + (i/ℏ) δt ∫ (dk/2πℏ) e^{ika_2/ℏ} ⟨k| ( p̂²/2m + V(q̂) ) |a_1⟩
      = ∫ (dk/2πℏ) e^{ika_2/ℏ} ⟨k|a_1⟩ + (i/ℏ) δt ∫ (dk/2πℏ) e^{ika_2/ℏ} ( k²/2m + V(a_1) ) ⟨k|a_1⟩
      = ∫ (dk/2πℏ) e^{ika_2/ℏ} e^{(i/ℏ) H(k, a_1) δt} ⟨k|a_1⟩
      = ∫ (dk/2πℏ) e^{(i/ℏ)( ka_2 + H(k, a_1) δt )} ⟨k|a_1⟩
      = ∫ (dk/2πℏ) e^{(i/ℏ)( k(a_2 − a_1) + H(k, a_1) δt )}            (11.15)

Note that now the Hamiltonian does not have a hat, as it has been evaluated on position and momentum eigenstates.
Now we want to add these all up to find K(a_f, t_f; a_0, t_0) for a finite time difference t_f − t_0. To do this we split t_f − t_0 = (t_f − t_{N−1}) + (t_{N−1} − t_{N−2}) + ... + (t_1 − t_0) with N → ∞:

    K(a_f, t_f; a_0, t_0) = ⟨a_f| I e^{iĤ(t_f−t_{N−1})/ℏ} I e^{iĤ(t_{N−1}−t_{N−2})/ℏ} ... I e^{iĤ(t_1−t_0)/ℏ} |a_0⟩     (11.16)

Next we replace each occurrence of I by I × (11.8):

    K(a_f, t_f; a_0, t_0) = ⟨a_f| ∫ (dk_N/2πℏ) |k_N⟩
        × ⟨k_N| e^{iĤ(t_N−t_{N−1})/ℏ} I ∫ (dk_{N−1}/2πℏ) |k_{N−1}⟩ ⟨k_{N−1}| e^{iĤ(t_{N−1}−t_{N−2})/ℏ} ... |a_0⟩     (11.17, 11.18)

and then each remaining I by (11.7).


We further suppose that each pair ti − ti−1 = (tf − t0 )/N = δt so that
N Z 
Y dki i
(ki (ai −ai−1 )+H(ki ,ai−1 )δt)
K(af , tf ; a0 , t0 ) = dai−1 e ℏ
i=1
2πℏ
Z Y N N −1 i P
dki Y i+1 −ai
 a 
i ki +H(ki ,ai ) δt
= dai e ℏ δt
. (11.19)
i=1
2πℏ i=0
In the large N limit we can recognize the object appearing in the exponential as an integral:

    lim_{N→∞} Σ_i ( k_i (a_{i+1}−a_i)/δt + H(k_i, a_i) ) δt = ∫_{t_0}^{t_f} ( k(t)ȧ(t) + H(k(t), a(t)) ) dt     (11.20)

where a_{i+1} − a_i = ȧ(t_i) δt, with a dot denoting differentiation with respect to time. Thus we find an example of a path integral:
    K(a_f, t_f; a_0, t_0) = ∫ [dk/2πℏ][da] e^{(i/ℏ) ∫ ( k(t)ȧ(t) + H(k(t), a(t)) ) dt}     (11.21)

where k(t) and a(t) are arbitrary paths such that a(t_0) = a_0 and a(t_f) = a_f. This is an infinite-dimensional integral: we are integrating over all paths k(t) and a(t). It is not well understood mathematically but is very powerful in physics.
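Although the full path integral is hard to define, the time-sliced construction can be tested numerically. The sketch below works with the Wick-rotated (imaginary time) free particle, where each slice contributes a Gaussian heat kernel; composing N slices on a grid, the discrete analogue of integrating over all intermediate positions a_i, reproduces the kernel for the total time, as the construction requires. Units ℏ = m = 1 and the grid parameters are choices made for this sketch:

```python
import numpy as np

# Wick-rotated (Euclidean) time slicing for a free particle, m = hbar = 1.
# One imaginary-time slice eps contributes the heat kernel
#   K_eps(a, b) = sqrt(1/(2*pi*eps)) * exp(-(a - b)^2 / (2*eps)),
# and composing N slices (the discrete "sum over paths") should reproduce
# the single-step kernel for total imaginary time N*eps.
x = np.linspace(-10, 10, 801)
dx = x[1] - x[0]

def kernel(tau):
    return np.sqrt(1/(2*np.pi*tau)) * np.exp(-(x[:, None] - x[None, :])**2/(2*tau))

N, eps = 20, 0.05
composed = np.linalg.matrix_power(kernel(eps) * dx, N) / dx  # N grid convolutions

mid = slice(200, 601)  # compare away from the grid edges, |x| <= 5
print(np.allclose(composed[mid, mid], kernel(N * eps)[mid, mid], atol=1e-4))
```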

11.1 Gaussian Path Integrals


The path integral formulation is conceptually very nice but of course impossible to compute exactly for general systems. Formally it is defined (in so far as it is defined) as a limit over a finer and finer net of finite-dimensional integrals. However, there is one class where we can have success: if the Lagrangian is quadratic then all the finite-dimensional integrals that appear are just Gaussian integrals of the form

    ∫ dk e^{−Ak² + Bk} = √(π/A) e^{B²/4A}                              (11.22)

The integral of a Gaussian gives another Gaussian.
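The formula (11.22) is easy to check numerically for real A > 0 and real B (in the path integral A is imaginary, and one Wick rotates to reach this damped case):

```python
import numpy as np
from scipy.integrate import quad

# Numerical check of the Gaussian integral (11.22) for real A > 0, real B:
A, B = 1.3, 0.7
val, _ = quad(lambda k: np.exp(-A*k**2 + B*k), -np.inf, np.inf)
exact = np.sqrt(np.pi / A) * np.exp(B**2 / (4*A))
print(np.isclose(val, exact))  # True
```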
In particular we note that the integral over k(t) is Gaussian. What exactly do we mean by that? Let us look at one of the integrals in (11.19):

    ∫ (dk/2πℏ) e^{(i/ℏ)( k(a_{i+1}−a_i) + (k²/2m)δt + V(a_i)δt )}
        = (1/2πℏ) √(2mπℏ/iδt) e^{(i/ℏ)( −m(a_{i+1}−a_i)²/2δt + V(a_i)δt )}
        = √(m/2πℏiδt) e^{−(i/ℏ)( m(a_{i+1}−a_i)²/2(δt)² − V(a_i) ) δt}     (11.23)
(We should be more careful and Wick rotate (t_2 − t_1) → −i(τ_2 − τ_1) to get a damped exponential.) We can do this for each of the k_i integrals. Putting this back into the finite t_2 − t_1 expression gives

    K(a_2, t_2; a_1, t_1) = N ∫ ∏_i da_i e^{−(i/ℏ) Σ_i ( m(a_{i+1}−a_i)²/2(δt)² − V(a_i) ) δt}     (11.24)

where N is an infinite but irrelevant constant that we can absorb. Again we recognise the exponent as an integral:

    lim_{N→∞} Σ_i ( m(a_{i+1}−a_i)²/2(δt)² − V(a_i) ) δt = ∫_{t_1}^{t_2} ( (1/2) m ȧ² − V(a) ) dt = S[a]     (11.25)

the classical action! Thus the propagator can be written as


    K(a_2, t_2; a_1, t_1) = ∫ [da] e^{−(i/ℏ) ∫ ( (1/2) m ȧ(t)² − V(a(t)) ) dt}
                          = ∫ [da] e^{−(i/ℏ) S[a(t)]}                  (11.26)

An interesting and tractable case to consider is the free Schrödinger equation, V = 0. Going back a step we have

    K_S(a_2, t_2; a_1, t_1) = ∫ ∏_i (dk_i/2πℏ) da_i e^{(i/ℏ) Σ_i ( k_i (a_{i+1}−a_i)/δt + k_i²/2m ) δt}     (11.27)

The integrals over da_i give δ-functions: the a_i-dependent part of the exponent is (i/ℏ)(k_i − k_{i+1})a_i, so

    ∫ da_i e^{(i/ℏ)(k_i − k_{i+1}) a_i} = 2πℏ δ(k_i − k_{i+1})         (11.28)

This leaves us with only one independent k:

    K_S(a_2, t_2; a_1, t_1) = ∫ (dk/2πℏ) e^{(i/ℏ)( k(a_2−a_1) + Σ_i (k²/2m) δt )}
                            = ∫ (dk/2πℏ) e^{(i/ℏ)( k(a_2−a_1) + (k²/2m)(t_2−t_1) )}     (11.29)

where we have used Σ_i δt = t_2 − t_1. This leaves just one Gaussian integral, which evaluates to:
    K_S(a_2, t_2; a_1, t_1) = √( im / 2πℏ(t_2−t_1) ) e^{ im(a_2−a_1)² / 2ℏ(t_2−t_1) }     (11.30)

This is known as the Schrödinger propagator. You can check that, for t_2 ≠ t_1,

    iℏ ∂K_S/∂t_2 = −(ℏ²/2m) ∂²K_S/∂a_2² .                              (11.31)

A more careful treatment reveals that

    iℏ ∂K_S/∂t_2 + (ℏ²/2m) ∂²K_S/∂a_2² = iℏ δ(t_2 − t_1) δ(a_2 − a_1) .     (11.32)
Thus we see that in the path integral formulation of quantum mechanics the amplitude for a particle to go from q = a_1 at t = t_1 to q = a_2 at t = t_2 is given by an infinite-dimensional integral over all possible paths, classical or not, that start at a_1 and end at a_2. Each path is then given a weight e^{−iS/ℏ}.

Here we see the Lagrangian of classical mechanics appearing. As you may recall from classical mechanics, the action, as determined by the Lagrangian, was a mysterious object with no physical interpretation, yet it controlled everything (symmetries, equations of motion, ...). Here we see that it arises as the logarithm of the weight that appears in the path integral. Each path contributes a weight that is a pure phase, so they all contribute equally in magnitude. However, there can be interference. The dominant contributions then correspond to those where the interference is weakest. These correspond to paths where δS = 0, i.e. extrema. Thus the classical paths give the dominant contribution to the path integral, with other paths cancelling out due to the rapid oscillatory nature of the integrand for 'small' ℏ. This explains the principle of least action.
Furthermore, we see that we can marry quantum mechanics with special relativity by simply taking an action which is invariant under Lorentz transformations. For example, it is not too hard to see that the Klein-Gordon equation arises from the action

    S_KG = ∫ ( c ∂_µΨ* ∂^µΨ − (m²c³/ℏ²) Ψ*Ψ ) d⁴x                      (11.33)


which is indeed invariant under Lorentz transformations. Therefore we can build a relativistic quantum theory, in this case a quantum field theory, by considering

    Z = ∫ [dΨ] e^{−(i/ℏ) S_KG[Ψ]}                                      (11.34)

This object is similar to the partition function that we saw above, only at imaginary temperature β = −i/ℏ and weighted by the action rather than the energy. In fact the relation between the partition function of a quantum theory, as defined by (11.34), and the partition function we saw in our discussion of thermal statistical physics is quite deep.

11.2 Computing Determinants


Let us look at a simple example: H = p̂²/2m + (1/2) k q̂². The corresponding Lagrangian is of course just

    L = (1/2) m q̇² − (1/2) k q²                                        (11.35)

It is helpful to rewrite the action as

    S = −(1/2) ∫ q E q dt ,    E = m d²/dt² + k                        (11.36)
where we view E as a differential operator. Let us expand q(t) in a real orthonormal basis of functions

    q = Σ_n q_n e_n(t)                                                  (11.37)

with

    ∫ e_n(t) e_m(t) dt = δ_nm

Thus

    S = −(1/2) Σ_{n,m} q_n q_m ∫ e_n(t) E e_m(t) dt                     (11.38)
Let us choose the e_n(t) to be eigenstates of E with eigenvalues λ_n:

    S = −(1/2) Σ_{n,m} q_n q_m ∫ e_n(t) λ_m e_m(t) dt
      = −(1/2) Σ_n λ_n q_n²                                             (11.39)
Thus

    ∫ [dq] e^{−(i/ℏ) S[q]} = ∫ ∏_n dq_n e^{(i/2ℏ) Σ_n λ_n q_n²}
                           = ∏_n ∫ dq_n e^{(i/2ℏ) λ_n q_n²}
                           = ∏_n √( 2πiℏ / λ_n )
                           = N / √(det E) ,    det E = ∏_n λ_n          (11.40)

Here N is another normalization constant that we aren't interested in.1

It's easy to see how this generalises: for a quadratic action with many coordinates q_a, a = 1, ..., n, of the form
    S = −(1/2) ∫ q_a E_ab q_b dt                                        (11.41)
we find

    ∫ ∏_a [dq_a] e^{−(i/ℏ) S[q_1,...,q_n]} = N / √(det E_ab)            (11.42)

where the determinant is the infinite product of eigenvalues of the operator E_ab. More generally still, if one has

    S = −∫ ( (1/2) q_a E_ab q_b + J_a q_a ) dt                          (11.43)

we find, by completing the square,

    ∫ ∏_a [dq_a] e^{−(i/ℏ) S[q_1,...,q_n]} = ( N / √(det E_ab) ) e^{−(i/2ℏ) ∫ dt J_a (E⁻¹)_ab J_b}     (11.44)
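The finite-dimensional (Euclidean) version of (11.42) can be checked directly: for a real positive-definite matrix E, ∫ d^n q e^{−q·Eq/2} = (2π)^{n/2}/√(det E). A sketch in n = 2 dimensions (the oscillatory i/ℏ version differs only by the Wick rotation and the absorbed constant N):

```python
import numpy as np
from scipy.integrate import dblquad

# Euclidean analogue of (11.42) in n = 2 dimensions:
#   integral d^2 q exp(-q.E.q / 2) = (2*pi) / sqrt(det E)
E = np.array([[2.0, 0.5],
              [0.5, 1.0]])   # positive definite
f = lambda y, x: np.exp(-0.5 * (E[0, 0]*x*x + 2*E[0, 1]*x*y + E[1, 1]*y*y))
val, _ = dblquad(f, -8, 8, lambda x: -8, lambda x: 8)
exact = (2*np.pi) / np.sqrt(np.linalg.det(E))
print(np.isclose(val, exact))  # True
```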

1
Some of you might be worrying that these integrals are not well defined and this is just hocus-pocus.
Don’t, just relax and don’t ask too many questions.
