PHYS30201
2 Angular momentum
2.1 General properties of angular momentum
2.2 Electron spin and the Stern-Gerlach experiment
2.3 Spin-1/2
2.3.1 Spin precession
2.3.2 Spin and measurement: Stern-Gerlach revisited
2.4 Higher spins
2.5 Addition of angular momenta
2.5.1 Using tables of Clebsch-Gordan coefficients
2.5.2 Example: Two spin-1/2 particles
2.5.3 Angular Momentum of Atoms and Nuclei
2.6 Vector Operators
4 Approximate methods II: Time-independent perturbation theory
4.1 Non-degenerate perturbation theory
4.1.1 Connection to variational approach
4.1.2 Perturbed infinite square well
4.1.3 Perturbed harmonic oscillator
4.2 Degenerate perturbation theory
4.2.1 Example of degenerate perturbation theory
4.2.2 Symmetry as a guide to the choice of basis
4.3 The fine structure of hydrogen
4.3.1 Pure Coulomb potential and nomenclature
4.3.2 Fine structure: the lifting of l degeneracy
4.4 The Zeeman effect: hydrogen in an external magnetic field
4.5 The Stark effect: hydrogen in an external electric field
5 Quantum Measurement
5.1 The Einstein-Podolsky-Rosen “paradox” and Bell’s inequalities
All of quantum mechanics follows from a small set of assumptions, which cannot themselves
be derived. There is no unique formulation or even number of postulates, but all formulations
I’ve seen have the same basic content. This formulation follows Shankar most closely, though
he puts III and IV together. Nothing significant should be read into my separating them (as
many other authors do); it just seems easier to explore the consequences bit by bit.
I: The state of a particle is given by a vector |ψ(t)⟩ in a Hilbert space. The state is normalised:
⟨ψ(t)|ψ(t)⟩ = 1.
This is as opposed to the classical case where the position and momentum can be specified at
any given time.
This is a pretty abstract statement, but more informally we can say that the wave function
ψ(x, t) contains all possible information about the particle. How we extract that information
is the subject of subsequent postulates.
The really major consequence we get from this postulate is superposition, which is behind most
quantum weirdness such as the two-slit experiment.
II: There is a Hermitian operator corresponding to each observable property of the particle.
Those corresponding to position x̂ and momentum p̂ satisfy [x̂i, p̂j] = iℏδij.
Other examples of observable properties are energy and angular momentum. The choice of
these operators may be guided by classical physics (e.g. p̂ · p̂/2m for kinetic energy and x̂ × p̂
for orbital angular momentum), but ultimately is verified by experiment (e.g. Pauli matrices
for spin-1/2 particles).
The commutation relation for x̂ and p̂ is a formal expression of Heisenberg’s uncertainty prin-
ciple.
III: Measurement of the observable associated with the operator Ω̂ will result in one of the
eigenvalues ωi of Ω̂. Immediately after the measurement the particle will be in the corresponding
eigenstate |ωi⟩.
This postulate ensures reproducibility of measurements. If the particle was not initially in the
state |ωi⟩ the result of the measurement was not predictable in advance, but for the result of
a measurement to be meaningful the result of a subsequent measurement must be predictable.
(“Immediately” reflects the fact that subsequent time evolution of the system will change the
value of ω unless it is a constant of the motion.)
IV: The probability of obtaining the result ωi in the above measurement (at time t0) is
|⟨ωi|ψ(t0)⟩|².
If a particle (or an ensemble of particles) is repeatedly prepared in the same initial state |ψ(t0)⟩
and the measurement is performed, the result each time will in general be different (assuming
this state is not an eigenstate of Ω̂; if it is, the result will be the corresponding ωi each time).
Only the distribution of results can be predicted. The postulate expressed this way has the same
content as saying that the average value of ω is given by ⟨ψ(t0)|Ω̂|ψ(t0)⟩. (Note the distinction
between repeated measurements on freshly-prepared particles, and repeated measurements on
the same particle, which will give the same ωi each subsequent time.)
Note that if we expand the state in the (orthonormal) basis {|ωi⟩}, |ψ(t0)⟩ = Σi ci|ωi⟩, the
probability of obtaining the result ωi is |ci|², and ⟨Ω̂⟩ = Σi |ci|² ωi.
V: The time evolution of the state |ψ(t)⟩ is given by iℏ (d/dt)|ψ(t)⟩ = Ĥ|ψ(t)⟩, where Ĥ is the
operator corresponding to the classical Hamiltonian.
In most cases the Hamiltonian is just the energy and is expressed as p̂·p̂/2m + V(x̂). (They
differ in some cases though; see texts on classical mechanics such as Kibble and Berkshire.) In
the presence of velocity-dependent forces such as magnetism the Hamiltonian is still equal to
the energy, but its expression in terms of p̂ is more complicated.
VI: The Hilbert space for a system of two or more subsystems (for example, several particles)
is a product space.
This is true whether the subsystems interact or not, i.e. if the states |φi⟩ span the space for one
subsystem, the states |φi⟩ ⊗ |φj⟩ will span the space for two. If they do interact, the eigenstates
of the Hamiltonian will not be simple products of that form, but will be linear superpositions
of such states.
The position operator in 3-D is x̂, and has eigenkets |r⟩. A state of a particle can therefore
be associated with a function of position, the wave function: ψ(r, t) = ⟨r|ψ(t)⟩. Note that
position and time are treated quite differently in non-relativistic quantum mechanics. There is
no operator corresponding to time, and t is just part of the label of the state. By the fourth
postulate, the probability of finding the particle in an infinitesimal volume dV at a position r,
ρ(r)dV, is given by ρ(r, t) = |⟨r|ψ(t)⟩|² = |ψ(r, t)|². Thus a measurement of position can yield
many answers, and as well as an average x-position ⟨ψ|x̂|ψ⟩ there will be an uncertainty, ∆x,
where (∆x)² = ⟨ψ|x̂²|ψ⟩ − ⟨ψ|x̂|ψ⟩².
Though the notation ⟨ψ|Â|φ⟩ is compact, to calculate it when Â is a function of position and
momentum operators we will usually immediately substitute the integral form ∫ ψ*(r) Â φ(r) d³r
(over all space), where—rather sloppily—Â is here understood as the position-representation of
the operator.
The momentum operator p̂ has the representation in the position basis of −iℏ∇. Eigenstates
of momentum, in the position representation, are just plane waves. In order that they satisfy
the following normalisations
Î = ∫ |p⟩⟨p| d³p   and   ⟨p|p′⟩ = δ(p − p′) = δ(px − p′x)δ(py − p′y)δ(pz − p′z),
we need
φp(r) ≡ ⟨r|p⟩ = (2πℏ)^(−3/2) e^(ip·r/ℏ).
From the time-evolution equation iℏ (d/dt)|ψ(t)⟩ = Ĥ|ψ(t)⟩ we obtain in the position representation
iℏ ∂ψ(r, t)/∂t = Ĥψ(r, t),
which is the Schrödinger equation. The x-representation of Ĥ is usually −ℏ²∇²/2m + V(r).
The Hamiltonian is a Hermitian operator, and so its eigenstates |En⟩ form a basis in the space.
Together with the probability density, ρ(r) = |ψ(r)|², we also have a probability flux or current
j(r) = −(iℏ/2m) (ψ*(r)∇ψ(r) − ψ(r)∇ψ*(r)).
The continuity equation ∇·j = −∂ρ/∂t, which ensures local conservation of probability density,
follows from the Schrödinger equation.
A two-particle state has a wave function which is a function of the two positions (6 coordinates),
Ψ(r1, r2). For states of non-interacting distinguishable particles where it is possible to say that
the first particle is in single-particle state |ψ⟩ and the second in |φ⟩, Ψ(r1, r2) = ψ(r1)φ(r2). But
most two-particle states are not factorisable (separable) like that one. The underlying vector
space for two particles is called a “tensor direct product space”, with separable states written
|Ψ⟩ = |ψ⟩ ⊗ |φ⟩, where “⊗” is a separator which is omitted in some texts. We will come back
to this soon.
The Schrödinger equation tells us the rate of change of the state at a given time. From that
we can deduce an operator that acts on the state at time t0 to give that at a subsequent time
t: |ψ(t)⟩ = Û(t, t0)|ψ(t0)⟩, which is called the propagator or time-evolution operator. We need
the identity
lim_{N→∞} (1 + x/N)^N = e^x
(to prove it, take the log of the L.H.S. and use the Taylor expansion for ln(1 + x) about the
point x = 0).
An infinitesimal time step Û(t+dt, t) follows immediately from the Schrödinger equation:
iℏ (d/dt)|ψ(t)⟩ = Ĥ|ψ(t)⟩ ⇒ |ψ(t+dt)⟩ − |ψ(t)⟩ = −(i/ℏ) Ĥ dt |ψ(t)⟩
⇒ |ψ(t+dt)⟩ = (1 − (i/ℏ) Ĥ dt)|ψ(t)⟩.
For a finite time interval t − t0, we break it into N small steps and take the limit N → ∞, in
which limit every step is infinitesimal and we can use the previous result N times:
|ψ(t)⟩ = lim_{N→∞} (1 − (i/ℏ) Ĥ (t−t0)/N)^N |ψ(t0)⟩ = e^{−iĤ(t−t0)/ℏ} |ψ(t0)⟩ ≡ Û(t, t0)|ψ(t0)⟩.
We note that this is a unitary operator (the exponential of i times a Hermitian operator always
is). Thus, importantly, it conserves the norm of the state; there remains a unit probability of
finding the particle somewhere!
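As a quick sanity check, here is a minimal numerical sketch (not part of the notes; a random 4 × 4 Hermitian matrix stands in for Ĥ, and we set ℏ = 1) showing that the propagator is unitary and norm-conserving:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
hbar, t = 1.0, 0.7                      # work in units where hbar = 1

# A random Hermitian "Hamiltonian" on a 4-dimensional Hilbert space
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2

U = expm(-1j * H * t / hbar)            # propagator U(t,0) = exp(-iHt/hbar)

# U is unitary, so it conserves the norm of any state
print(np.allclose(U.conj().T @ U, np.eye(4)))    # True
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)
print(np.isclose(np.linalg.norm(U @ psi), 1.0))  # True
```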
If |ψ(t0)⟩ is an eigenfunction |n⟩ of the Hamiltonian with energy En, the subsequent evolution
is just a phase: |ψ(t)⟩ = e^{−iEn(t−t0)/ℏ}|n⟩. But if it is a superposition of energy eigenstates,
each term evolves with a different phase and non-trivial time evolution takes place. Note that
this implies an alternative form for the propagator:
Û(t, t0) = Σn e^{−iEn(t−t0)/ℏ} |n⟩⟨n|.
(For a time-dependent Hamiltonian this generalises to Û(t, t0) = T exp(−(i/ℏ)∫_{t0}^{t} Ĥ(t′) dt′),
where the time-ordered exponential denoted by T exp means that in expanding the exponential,
the operators are ordered so that Ĥ(t1) always sits to the right of Ĥ(t2) (so that it acts first)
if t1 < t2. This will come up in Advanced Quantum Mechanics.)
Measurement
The flow-chart below represents an arbitrary series of measurements on a particle (or series
of identically prepared particles) in an unknown initial state. We carry out consecutive mea-
surements “immediately”, that is, quickly compared with the timescale which characterises the
evolution of the system in between measurements. We will talk of “measuring A” when we
strictly mean “measuring the physical quantity associated with the operator Â”.
[Flow chart: an initial measurement of A gives a = ±1; after the outcome a = +1, B is measured,
giving b = ±1; after the outcome b = −1, B is measured again; finally A is measured once more.]
A priori, the possible outcomes on measuring A are the eigenvalues of Â, ±1. In general the
particle will not start out in an eigenstate of Â, so either outcome is possible, with probabilities
that depend on the initial state.
If we obtain the outcome a = +1 and then measure B, what can we get? We know that the
state is now no longer |φ⟩ but |a+⟩. The possible outcomes are b = +1 with probability
|⟨b+|a+⟩|² and b = −1 with probability |⟨b−|a+⟩|². Both of these probabilities are 1/2: there
is a 50:50 chance of getting b = ±1. (Note that the difference between this and the previous
measurement of A, where we did not know the probabilities, is that now we know the state before
the measurement.)
If we obtain the outcome b = −1 and then measure B again immediately, we can only
get b = −1 again. (This is reproducibility.) The particle is in the state |b−⟩ before the
measurement, an eigenstate of B̂.
Finally we measure A again. What are the possible outcomes and their probabilities?
Propagation
First let us consider the time-evolution of this system if the Hamiltonian is Ĥ = ℏγB̂. Assume
we start the evolution at t = 0 with the system in the state |ψ(0)⟩. Then |ψ(t)⟩ = Û(t, 0)|ψ(0)⟩
with Û(t, 0) = e^{−iĤt/ℏ}. Now in general the exponentiation of an operator can’t be found in
closed form, but in this case it can, because B̂² = Î and so B̂³ = B̂. So in the power series
that defines the exponential, successive terms will be alternately proportional to Î and B̂, giving
Û(t, 0) = e^{−iγtB̂} = cos(γt) Î − i sin(γt) B̂.
So if we start, say, with |ψ(0)⟩ = |b+⟩, an eigenstate of B̂, as expected we stay in the same
state: |ψ(t)⟩ = Û(t, 0)|b+⟩ = e^{−iγt}|b+⟩. All that happens is a change of phase. But if we start
with |ψ(0)⟩ = |a+⟩,
|ψ(t)⟩ = cos γt |a+⟩ − i sin γt |a−⟩.
Of course we can rewrite this as
|ψ(t)⟩ = √(1/2) (e^{−iγt}|b+⟩ + e^{iγt}|b−⟩)
as expected. The expectation value of Â is not constant: ⟨ψ(t)|Â|ψ(t)⟩ = cos 2γt. The system
oscillates between |a+⟩ and |a−⟩ with a frequency 2γ. (This is twice as fast as you might
think—but after time π/γ the state of the system is −|a+⟩, which is not distinguishable from
|a+⟩.)
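A short numerical sketch of this two-state problem (an assumption for illustration: we represent Â by σz and B̂ by σx in the {|a+⟩, |a−⟩} basis, consistent with the eigenvalues ±1 and B̂² = Î):

```python
import numpy as np
from scipy.linalg import expm

gamma, hbar = 1.3, 1.0
A = np.array([[1, 0], [0, -1]], dtype=complex)   # stand-in for Â: eigenvalues ±1
B = np.array([[0, 1], [1, 0]], dtype=complex)    # stand-in for B̂: eigenvalues ±1, B² = I
H = hbar * gamma * B

for t in np.linspace(0.0, 2.0, 5):
    # start in |a+>, evolve with U(t,0) = exp(-iHt/hbar)
    psi = expm(-1j * H * t / hbar) @ np.array([1, 0], dtype=complex)
    expA = (psi.conj() @ A @ psi).real
    print(np.isclose(expA, np.cos(2 * gamma * t)))   # <Â> = cos 2γt: True each time
```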
The object U(r, r′; t, 0) ≡ ⟨r|Û(t, 0)|r′⟩ is the position-space matrix element of the propagator.
(Some texts call this the propagator, referring to Û only as the time-evolution operator.) This
is the amplitude for finding the particle at position r at time t, given that at time 0 it was at
r′. To calculate it for a free particle we will use the fact that momentum eigenstates |p⟩ are
eigenstates of Ĥ:
⟨r|Û(t, 0)|r′⟩ = ∫∫ ⟨r|p⟩ ⟨p|Û(t, 0)|p′⟩ ⟨p′|r′⟩ d³p d³p′
             = ∫∫ ⟨r|p⟩ ⟨p| exp(−ip̂²t/2mℏ) |p′⟩ ⟨p′|r′⟩ d³p d³p′
             = (1/(2πℏ)³) ∫∫ exp(ip·r/ℏ) exp(−ip²t/2mℏ) δ(p − p′) exp(−ip′·r′/ℏ) d³p d³p′
             = (1/(2πℏ)³) ∫ exp(−ip²t/2mℏ + ip·(r − r′)/ℏ) d³p
             = (m/(2iπℏt))^{3/2} exp(im|r − r′|²/2ℏt)
In the last stage, to do the three Gaussian integrals (dpx dpy dpz) we “completed the square”,
shifted the variables and used the standard result ∫ e^{−αx²} dx = √(π/α), which is valid even if α
is imaginary.
Suppose the initial wave function is a spherically symmetric Gaussian wave packet with width
∆:
ψ(r, 0) = N exp(−|r|²/(2∆²)) with N = (π∆²)^{−3/4}.
Then the (pretty ghastly) Gaussian integrals give
ψ(r, t) = N (m/(2iπℏt))^{3/2} ∫ exp(im|r − r′|²/2ℏt) exp(−|r′|²/2∆²) d³r′
        = N′ exp(−|r|²/(2∆²(1 + iℏt/m∆²)))
where N′ does preserve the normalisation but we do not display it. This is an odd-looking
function, but the probability density is more revealing:
P(r, t) = |ψ(r, t)|² = π^{−3/2} (∆² + (ℏt/m∆)²)^{−3/2} exp(−|r|²/(∆² + (ℏt/m∆)²));
this is a Gaussian wavepacket with width ∆(t) = √(∆² + (ℏt/m∆)²). The narrower the initial
wavepacket (in position space), the faster the subsequent spread, which makes sense as the
momentum-space wave function will be wide, built up of high-momentum components. On the
other hand for a massive particle with ∆ not too small, the spread will be slow. For m = 1 g
and ∆(0) = 1 µm, it would take longer than the age of the universe for ∆(t) to double.
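The final claim is easy to check with a line or two of arithmetic (a sketch; the age of the universe is taken as roughly 4.4 × 10¹⁷ s):

```python
import numpy as np

hbar = 1.054571817e-34   # J s
m, Delta = 1e-3, 1e-6    # 1 g in kg, 1 micron in metres

# Delta(t) = Delta * sqrt(1 + (hbar t / (m Delta^2))^2) doubles when
# hbar t / (m Delta^2) = sqrt(3):
t_double = np.sqrt(3) * m * Delta**2 / hbar
age_universe = 4.4e17    # s, roughly 13.8 Gyr
print(t_double, t_double / age_universe)   # ~1.6e19 s, i.e. tens of times the age of the universe
```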
Using iℏ (d/dt)|ψ(t)⟩ = Ĥ|ψ(t)⟩, and hence −iℏ (d/dt)⟨ψ(t)| = ⟨ψ(t)|Ĥ, and writing ⟨Ω̂⟩ ≡ ⟨ψ(t)|Ω̂|ψ(t)⟩,
we have Ehrenfest’s Theorem
d⟨Ω̂⟩/dt = (1/iℏ) ⟨[Ω̂, Ĥ]⟩ + ⟨∂Ω̂/∂t⟩.
Consider now the harmonic oscillator, with Hamiltonian
Ĥ = p̂²/2m + ½mω²x̂²,
and we are going to temporarily forget that we know what the energy levels and wave functions
are. In order to work with dimensionless quantities, we define the length x0 = √(ℏ/mω).
Then if we define
â = (1/√2)(x̂/x0 + i(x0/ℏ)p̂) and â† = (1/√2)(x̂/x0 − i(x0/ℏ)p̂),
it is straightforward to show that
• [â, â†] = 1
• Ĥ = ℏω(â†â + ½)
• [Ĥ, â] = −ℏωâ and [Ĥ, â†] = +ℏωâ†
Without any prior knowledge of this system, we can derive the spectrum and the wave functions
of the energy eigenstates. We start by assuming we know one normalised eigenstate of Ĥ, |n⟩,
with energy En. Since En = ⟨n|Ĥ|n⟩ = ℏω⟨n|â†â|n⟩ + ½ℏω,
and also ⟨n|â†â|n⟩ = ⟨ân|ân⟩ ≥ 0, we see that En ≥ ½ℏω. There must therefore be a lowest-
energy state, |0⟩ (not the null state!).
Now consider the state â|n⟩. Using the commutator [Ĥ, â] above we have
Ĥâ|n⟩ = âĤ|n⟩ − ℏωâ|n⟩ = (En − ℏω)â|n⟩,
so â|n⟩ is another eigenstate with energy En − ℏω. A similar calculation shows that â†|n⟩ is
another eigenstate with energy En + ℏω. So starting with |n⟩ it seems that we can generate an
infinite tower of states with energies higher and lower by multiples of ℏω.
However this contradicts the finding that there is a lowest energy state, |0⟩. Looking more
closely at the argument, though, we see there is a get-out: either â|n⟩ is another energy eigen-
state or it vanishes. Hence â|0⟩ = 0 (where 0 is the null state or vacuum).
The energy of this ground state is E0 = ⟨0|Ĥ|0⟩ = ½ℏω. The energy of the state |n⟩, the nth
excited state, obtained by n applications of â†, is therefore (n + ½)ℏω. Thus Ĥ|n⟩ = (n + ½)ℏω|n⟩,
and it follows that â†â is a “number operator”, with â†â|n⟩ = n|n⟩. The number in question is
the number of the excited state (n = 1—first excited state, etc.) but also the number of quanta
of energy in the oscillator.
Up to a phase, which we chose to be zero, the normalisations of the states |n⟩ are:
â|n⟩ = √n |n−1⟩ and â†|n⟩ = √(n+1) |n+1⟩.
As a result we have
|n⟩ = ((â†)ⁿ/√(n!)) |0⟩.
The operators â† and â are called “raising” and “lowering” operators, or collectively “ladder”
operators.
We can also obtain the wave functions in this approach. Writing φ0(x) ≡ ⟨x|0⟩, from ⟨x|â|0⟩ = 0
we obtain dφ0/dx = −(x/x0²)φ0 and hence
φ0 = (πx0²)^{−1/4} e^{−x²/2x0²}
(where the normalisation has to be determined separately). This is a much easier differential
equation to solve than the one which comes direct from the Schrödinger equation!
The wave function for the n-th state is
φn(x) = (1/√(2ⁿ n!)) (x/x0 − x0 d/dx)ⁿ φ0(x) = (1/√(2ⁿ n!)) Hn(x/x0) φ0(x)
where here the definition of the Hermite polynomials is Hn(z) = e^{z²/2} (z − d/dz)ⁿ e^{−z²/2}. The equiv-
alence of this formulation and the Schrödinger-equation-based approach means that Hermite
polynomials defined this way are indeed solutions of Hermite’s equation.
This framework makes many calculations almost trivial which would be very hard in the tra-
ditional framework; in particular matrix elements of powers of x̂ and p̂ between general states
can be easily found by using x̂ = (â + â†)(x0/√2) and p̂ = i(â† − â)(ℏ/(√2 x0)). For example,
⟨m|x̂ⁿ|m′⟩ and ⟨m|p̂ⁿ|m′⟩ will vanish unless |m − m′| ≤ n and |m − m′| and n are either both
even or both odd (the last condition being a manifestation of parity, since φn(x) is odd/even if
n is odd/even).
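These selection rules are easy to verify numerically; the sketch below (not in the notes) truncates the Fock space at N states and builds â as a matrix, so the top few rows of any product feel the truncation but the low-lying matrix elements are exact:

```python
import numpy as np

N = 8                                     # truncated Fock space |0> ... |N-1>
a = np.diag(np.sqrt(np.arange(1, N)), k=1)   # a|n> = sqrt(n)|n-1>
ad = a.conj().T                              # a†|n> = sqrt(n+1)|n+1>

hbar = omega = x0 = 1.0                   # convenient units
H = hbar * omega * (ad @ a + 0.5 * np.eye(N))
print(np.allclose(np.diag(H), np.arange(N) + 0.5))   # E_n = (n + 1/2)ħω: True

x = (a + ad) * x0 / np.sqrt(2)
x3 = np.linalg.matrix_power(x, 3)
# <m|x^3|m'> vanishes unless |m-m'| <= 3 with |m-m'| and 3 both odd:
print(abs(x3[0, 1]) > 0, abs(x3[0, 3]) > 0)          # True True
print(np.isclose(x3[0, 0], 0), np.isclose(x3[0, 2], 0))  # True True
```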
For a particle in a two-dimensional potential ½mωx²x² + ½mωy²y², the Hamiltonian is separable:
Ĥ = Ĥx + Ĥy. Defining x0 = √(ℏ/mωx) and y0 = √(ℏ/mωy), ladder operators âx and âx† can be
constructed from x̂ and p̂x as above, and we can construct a second set of operators ây and ây†
from ŷ and p̂y (using y0 as the scale factor) in the same way. It is clear that âx and âx† commute
with ây and ây†, and each of Ĥx and Ĥy independently has a set of eigenstates just like the ones
discussed above.
In fact the space of solutions to the two-dimensional problem can be thought of as a tensor
direct product space¹ of the x and y spaces, with energy eigenstates |nx⟩ ⊗ |ny⟩, nx and ny
being integers, the Hamiltonian properly being written Ĥ = Ĥx ⊗ Îy + Îx ⊗ Ĥy, and the
eigenvalues being (nx + ½)ℏωx + (ny + ½)ℏωy. The ground state is |0⟩ ⊗ |0⟩ and it is annihilated
by both âx (= âx ⊗ Îy) and ây (= Îx ⊗ ây).
The direct product notation is clumsy though, and we often write the states as just |nx, ny⟩.
Then for instance
âx|nx, ny⟩ = √nx |nx−1, ny⟩ and ây†|nx, ny⟩ = √(ny + 1) |nx, ny+1⟩.
¹ see section 1.7
The corresponding wave functions of the particle are given by ⟨r|nx, ny⟩ = ⟨x|nx⟩⟨y|ny⟩:
φ0,0(x, y) = (πx0y0)^{−1/2} e^{−x²/2x0²} e^{−y²/2y0²},
φnx,ny(x, y) = (1/√(2^{nx} nx!)) (1/√(2^{ny} ny!)) Hnx(x/x0) Hny(y/y0) φ0,0(x, y).
As a further example of a direct product space, consider two Qbits (two-state systems), each
with basis {|0⟩, |1⟩}; a basis for the product space is
{|A⟩ = |0⟩ ⊗ |0⟩, |B⟩ = |0⟩ ⊗ |1⟩, |C⟩ = |1⟩ ⊗ |0⟩, |D⟩ = |1⟩ ⊗ |1⟩}.
States |B⟩ and |C⟩ differ by which of the two Qbits is in which state. Note that here the
label |...⟩ tells us whether a particular state is in the product space or one of the individual
spaces, but in general we need to specify. These are all separable, as are the following states
(unnormalised, for simplicity):
|A⟩ + |B⟩ = |0⟩ ⊗ (|0⟩ + |1⟩);   |B⟩ + |D⟩ = (|0⟩ + |1⟩) ⊗ |1⟩;
|A⟩ + |B⟩ + |C⟩ + |D⟩ = (|0⟩ + |1⟩) ⊗ (|0⟩ + |1⟩).
The following states, by contrast, are not separable:
|A⟩ + |D⟩ = |0⟩ ⊗ |0⟩ + |1⟩ ⊗ |1⟩;   |B⟩ − |C⟩ = |0⟩ ⊗ |1⟩ − |1⟩ ⊗ |0⟩;
|A⟩ + |C⟩ − |D⟩ = (|0⟩ + |1⟩) ⊗ |0⟩ − |1⟩ ⊗ |1⟩ = |0⟩ ⊗ |0⟩ + |1⟩ ⊗ (|0⟩ − |1⟩).
When a measurement of the state of one Qbit is carried out, the system will collapse into those
parts of the state compatible with the result, which may or may not specify the state of the
other Qbit. Restricting ourselves to states where neither Qbit’s state is known in advance (i.e.
we might get 0 or 1 whichever of the two we measure), we have two possibilities. If the
initial state is separable,
|ψi⟩ = (α|0⟩ + β|1⟩) ⊗ (γ|0⟩ + δ|1⟩)
(normalised, so |α|² + |β|² = |γ|² + |δ|² = 1), then if we measure the state of the first Qbit
and get 0 (something that will happen with probability |α|²), then the subsequent state of the
system is
|ψf⟩ = |0⟩ ⊗ (γ|0⟩ + δ|1⟩),
and we still don’t know what result we’d get if we measured the state of the second Qbit. If on
the other hand the state is not separable, for instance
|ψi⟩ = α|0⟩ ⊗ |1⟩ + β|1⟩ ⊗ |0⟩,
then if we measure the state of the first Qbit and get 0, the subsequent state of the system is
|ψf⟩ = |0⟩ ⊗ |1⟩
—we now know that the second Qbit is in state |1⟩ and if we measured it we would be certain to
get 1. This particular case where measurement of the first determines the result of the second
is called maximal entanglement.
Here it is particularly important to be clear that we are not multiplying Ĉa and D̂b together; they
act in different spaces. Once again ⊗ should be regarded as a separator, not a multiplication.
Denoting the identity operators in each space as Îa and Îb respectively, in the product space
the identity operator is Îa ⊗ Îb. An operator in which each additive term acts in only one space,
such as Ĉa ⊗ Îb + Îa ⊗ D̂b, is called a separable operator. Ĉa ⊗ Îb and Îa ⊗ D̂b commute.
The inverse of Ĉa ⊗ D̂b is Ĉa⁻¹ ⊗ D̂b⁻¹ and the adjoint, Ĉa† ⊗ D̂b†. (The order is NOT reversed,
since each still has to act in the correct space.)
Matrix elements work as follows: (⟨p| ⊗ ⟨v|) (Ĉa ⊗ D̂b) (|q⟩ ⊗ |w⟩) = ⟨p|Ĉa|q⟩ ⟨v|D̂b|w⟩. (This
is the arithmetic product of two scalars.)
The labels a and b are redundant, since the order of the operators in the product tells us which
acts in which space. The names of the operators might also tell us. Alternatively if we keep
the labels, it is common to write Ĉa when we mean Ĉa ⊗ Îb, and ĈaD̂b (or even, since they
commute, D̂bĈa) when we mean Ĉa ⊗ D̂b.
If an operator is separable, i.e. it can be written as Ĉa ⊗ Îb + Îa ⊗ D̂b, then the eigenvectors are
|ci⟩ ⊗ |dj⟩ with eigenvalue ci + dj. As already mentioned the operator is often written Ĉa + D̂b,
where the label makes clear which space each operator acts in; similarly the eigenstates are
often written |ci, dj⟩.
Examples of separable operators include the centre-of-mass position of a two-particle system,
(m1x̂1 + m2x̂2)/(m1 + m2), which we would never think of writing in explicit direct-product
form but which, none-the-less, acts on states of two particles such as ψa(r1) ⊗ φb(r2). Another
is the total angular momentum Ĵ = L̂ + Ŝ. Here we will sometimes find it useful to write
Ĵz = L̂z ⊗ Î + Î ⊗ Ŝz to stress the fact that the operators act on different aspects of the state,
see later in the course.
Vectors and operators in finite-dimensional direct-product spaces can be represented by column
vectors and matrices, just as in any other vector space. In the two-Qbit space above, consider
the single-Qbit operator Ω̂ defined by Ω̂|0⟩ = |1⟩ and Ω̂|1⟩ = |0⟩. Taking the states {|A⟩ . . . |D⟩}
defined above as a basis, so that |A⟩ → (1, 0, 0, 0)ᵀ etc., the operator Ω̂ ⊗ Ω̂ has matrix elements
such as
⟨B|Ω̂ ⊗ Ω̂|C⟩ = (⟨0| ⊗ ⟨1|) (Ω̂ ⊗ Ω̂) (|1⟩ ⊗ |0⟩) = ⟨0|Ω̂|1⟩ ⟨1|Ω̂|0⟩ = 1 × 1 = 1
and, in full, with rows and columns ordered A, B, C, D (the (X, Y) entry being ⟨X|Ω̂ ⊗ Ω̂|Y⟩,
and rows of the matrix separated by semicolons),
Ω̂ ⊗ Ω̂ → (0 0 0 1; 0 0 1 0; 0 1 0 0; 1 0 0 0).
Since |0⟩ ± |1⟩ are eigenvectors of Ω̂ with eigenvalues ±1, the four eigenvectors of Ω̂ ⊗ Ω̂ are
(|0⟩ ± |1⟩) ⊗ (|0⟩ ± |1⟩) (where the two ± are independent), e.g.
|A⟩ + |B⟩ + |C⟩ + |D⟩ → (1, 1, 1, 1)ᵀ and |A⟩ − |B⟩ − |C⟩ + |D⟩ → (1, −1, −1, 1)ᵀ,
both with eigenvalue +1. Of course because there is degeneracy, these are not unique; for
example the simpler vectors |A⟩ ± |D⟩ and |B⟩ ± |C⟩ are also eigenvectors, with eigenvalue ±1.
It is easily checked that these are indeed eigenvectors of the corresponding matrix above. They
are orthogonal as expected.
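In matrix form all of this can be checked with numpy’s Kronecker product, which implements exactly this basis ordering (a sketch, not part of the notes):

```python
import numpy as np

Omega = np.array([[0, 1], [1, 0]])     # Ω|0> = |1>, Ω|1> = |0> in the {|0>,|1>} basis
OO = np.kron(Omega, Omega)             # Ω ⊗ Ω in the {|A>,|B>,|C>,|D>} basis
print(OO)                              # the anti-diagonal matrix above

for v in ([1, 1, 1, 1], [1, -1, -1, 1], [1, 0, 0, 1], [0, 1, 1, 0]):
    print(OO @ np.array(v))            # each returns +1 times the input vector
print(OO @ np.array([1, 0, 0, -1]))    # |A> - |D>: eigenvalue -1
```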
1.7.4 Decoupling
One final note: we often encounter systems with many degrees of freedom but ignore all but one
for simplicity. This is legitimate if the Hamiltonian does not couple them, or if the coupling
is sufficiently weak, because then the states in the space of interest will evolve independently
of the other degrees of freedom. For instance we typically ignore the centre-of-mass motion of
of the other degrees of freedom. For instance we typically ignore the centre-of-mass motion of
a two-particle system in the absence of external forces, because it is trivial (it will be at rest or
have constant velocity). Centre-of-mass and relative motions generally decouple, as is the case
in classical physics too. We often discuss the Schrödinger equation for spatial wave functions
and ignore spin, or focus purely on spin for illustrative purposes, which again is fine if spin and
spatial motion are only weakly coupled.
2 Angular momentum
2.1 General properties of angular momentum
We have previously encountered orbital angular momentum via the vector operator
L̂ = r̂ × p̂, the components of which obey the commutation relations
[L̂x, L̂y] = iℏL̂z,   [L̂y, L̂z] = iℏL̂x,   [L̂z, L̂x] = iℏL̂y,   [L̂², L̂i] = 0,
where L̂² = L̂x² + L̂y² + L̂z². We also found that L̂² is identical to the angular part of −ℏ²r²∇²
and so occurs in 3D problems; it commutes with the Hamiltonian if the potential V(r) is
independent of angle. The commutation relations imply that we can only simultaneously know
L² and one component, taken conventionally to be Lz. The common eigenfunctions of L̂² and
L̂z are the spherical harmonics, Ylm(θ, φ), and the requirements that these be single-valued and
finite everywhere restrict the eigenvalues to be ℏ²l(l + 1) and ℏm respectively, where l and m
are integers which satisfy
l = 0, 1, 2 . . . ,   m = −l, −l + 1, . . . l.
More details including the forms of the differential operators for the L̂i and of the first few
spherical harmonics can be found in section A.3.4. A recap of other aspects of angular momen-
tum from last year can be found in section A.5. In this course, we are going to re-derive the
properties of angular momentum, and discover some new ones, based only on the commutation
relations and nothing else.
In the case of the harmonic oscillator, we found that an approach which focused on operators
and abstract states rather than differential equations was extremely powerful. We are going
to do something similar with angular momentum, with the added incentive that, since we know
that orbital angular momentum is not the only possible form, we will need to include spin as
well—and that has no classical analogue or position-space description.
Consider three Hermitian operators, Ĵ1, Ĵ2 and Ĵ3, components of the vector operator Ĵ, about
which we will only assume one thing, their commutation relations:
[Ĵ1, Ĵ2] = iℏĴ3,   [Ĵ2, Ĵ3] = iℏĴ1,   [Ĵ3, Ĵ1] = iℏĴ2   (2.1)
or succinctly, [Ĵi, Ĵj] = iℏ Σk εijk Ĵk.¹ It can be shown that the orbital angular momentum
operators defined previously satisfy these rules, but we want to be more general, hence the new
name Ĵ, and the use of indices 1–3 rather than x, y, z. Note that Ĵ has the same dimensions
(units) as ℏ.
From these follows the fact that all three commute with Ĵ² = Ĵ1² + Ĵ2² + Ĵ3²:
[Ĵ², Ĵi] = 0
It follows that we will in general be able to find simultaneous eigenstates of Ĵ² and only one
of the components Ĵi. We quite arbitrarily choose Ĵ3. We denote the normalised states |λ, µ⟩
with eigenvalue ℏ²λ of Ĵ² and eigenvalue ℏµ of Ĵ3. (We’ve written these so that λ and µ
are dimensionless.) All we know about µ is that it is real, but recalling that for any state and
Hermitian operator, ⟨α|Â²|α⟩ = ⟨Âα|Âα⟩ ≥ 0, we know in addition that λ must be non-negative.
Furthermore ⟨λ, µ|Ĵ1² + Ĵ2²|λ, µ⟩ = ℏ²(λ − µ²) ≥ 0, so µ² ≤ λ. We now define the raising and
lowering operators Ĵ± = Ĵ1 ± iĴ2 (with Ĵ±† = Ĵ∓), which satisfy [Ĵ3, Ĵ±] = ±ℏĴ±.
Since Ĵ± commute with Ĵ², we see that the states Ĵ±|λ, µ⟩ are also eigenstates of Ĵ² with
eigenvalue ℏ²λ.
Why the names? Consider the state Ĵ+|λ, µ⟩:
Ĵ3(Ĵ+|λ, µ⟩) = Ĵ+Ĵ3|λ, µ⟩ + ℏĴ+|λ, µ⟩ = ℏ(µ + 1)(Ĵ+|λ, µ⟩)
So either Ĵ+|λ, µ⟩ is another eigenstate of Ĵ3 with eigenvalue ℏ(µ + 1), or it is the null vector.
Similarly either Ĵ−|λ, µ⟩ is another eigenstate of Ĵ3 with eigenvalue ℏ(µ − 1), or it is the null
vector. Leaving aside for a moment the case where the raising or lowering operator annihilates
the state, we have Ĵ+|λ, µ⟩ = Cλµ|λ, µ + 1⟩, where
|Cλµ|² = ⟨λ, µ|Ĵ+†Ĵ+|λ, µ⟩ = ⟨λ, µ|Ĵ−Ĵ+|λ, µ⟩ = ⟨λ, µ|(Ĵ² − Ĵ3² − ℏĴ3)|λ, µ⟩ = ℏ²(λ − µ² − µ).
There is an undetermined phase that we can choose to be +1, so Cλµ = ℏ√(λ − µ² − µ).
We can repeat the process to generate more states with quantum numbers µ ± 2, µ ± 3 . . . unless
we reach states that are annihilated by the raising or lowering operators. All these states are
in the λ-subspace of Ĵ².
¹ εijk is 1 if i, j, k is a cyclic permutation of 1, 2, 3, −1 if an anticyclic permutation such as 2, 1, 3, and 0 if any
two indices are the same.
However we saw above that the magnitude of the eigenvalues µ of Ĵ3 must not be greater than
√λ. So the process cannot go on indefinitely: there must be maximum and minimum values
µmax and µmin, such that Ĵ+|λ, µmax⟩ = 0 and Ĵ−|λ, µmin⟩ = 0. Furthermore by repeated action
of Ĵ−, we can get from |λ, µmax⟩ to |λ, µmin⟩ in an integer number of steps: µmax − µmin is an
integer, call it N.
Now the expectation value of Ĵ−Ĵ+ in the state |λ, µmax⟩ must also be zero, but as we saw above
that expectation value, for general µ, is |Cλµ|² = ℏ²(λ − µ² − µ). Thus
λ − µmax(µmax + 1) = 0.
Similarly, considering the expectation value of Ĵ+Ĵ− in the state |λ, µmin⟩ gives
λ − µmin(µmin − 1) = 0.
Taking these two equations together with µmin = µmax − N, we find
µmax(µmax + 1) = (µmax − N)(µmax − N − 1) ⇒ (N + 1)(2µmax − N) = 0 ⇒ µmax = N/2.
Hence µmax is either an integer or a half-integer, µmin = −µmax, and there are 2µmax + 1 possible
values of µ. Furthermore λ is restricted to values λ = (N/2)(N/2 + 1) for any integer N ≥ 0.
Let’s compare with what we found for orbital angular momentum. There we found that what
we have called λ had to have the form l(l + 1) for integer l, and what we’ve called µ was an
integer m, with −l ≤ m ≤ l. That agrees exactly with the integer case above. From now on we
will use m for µ, and j for µmax; furthermore instead of writing the state |j(j+1), m⟩ we will use
|j, m⟩. We refer to it as “a state with angular momentum j” but this is sloppy—if universally
understood; the magnitude of the angular momentum is ℏ√(j(j+1)). The component of this
along any axis, though, cannot be greater than ℏj.
But there is one big difference between the abstract case and the case of orbital angular mo-
mentum, and that is that j can be half-integer: 1/2, 3/2 . . .. If these cases are realised in Physics,
the source of the angular momentum cannot be orbital, but something without any parallel in
classical Physics.
We end this section by rewriting the relations we have already found in terms of j and m,
noting m can only take one of the 2j + 1 values −j, −j + 1, . . . , j − 1, j:
Ĵ²|j, m⟩ = ℏ²j(j + 1)|j, m⟩;   Ĵ3|j, m⟩ = ℏm |j, m⟩;
Ĵ±|j, m⟩ = ℏ√(j(j + 1) − m(m ± 1)) |j, m ± 1⟩.   (2.3)
In the diagram below, the five cones show the possible locations of the angular momentum
vector with length √6 ℏ (i.e. j = 2) and z-component mℏ. The x- and y-components are not
fixed, but must satisfy ⟨Ĵ1² + Ĵ2²⟩ = (6 − m²)ℏ² > 0 (unless j = 0).
2.2 Electron spin and the Stern-Gerlach experiment
Shankar 14.5; Griffiths 4.4; Mandl 5.2
From classical physics, we know that charged systems with angular momentum have a magnetic
moment µ, which means that they experience a torque µ × B if not aligned with an external
magnetic field B, and their interaction energy with the magnetic field is −µ·B. For an electron
in a circular orbit with angular momentum L, the classical prediction is µ = −(|e|/2m)L =
−(µB/ℏ)L, where µB = |e|ℏ/2m is called the Bohr magneton and has dimensions of a magnetic
moment.
Since the torque is perpendicular to the angular momentum the system is like a gyroscope, and
(classically) the direction of the magnetic moment precesses about B, with Lz being unchanged.
If the field is not uniform, though, there will also be a net force causing the whole atom to
move so as to reduce its energy −µ · B; taking the magnetic field along the +ve z axis the atom
will move to regions of stronger field if µz > 0 but to weaker field regions if µz < 0. If a beam
of atoms enters a region of inhomogeneous magnetic field one (classically) expects the beam to
spread out, each atom having a random value of µz and so being deflected a different amount.
The Stern-Gerlach experiment, in 1922, aimed to test whether silver atoms have a magnetic
moment, and found that they do. The figure below (from Wikipedia) shows the apparatus; the
shape of the poles of the magnet ensures that the field is stronger near the upper pole than the
lower one.
The first run just showed a smearing of the beam demonstrating that there was a magnetic
moment, but further running showed that atoms were actually deflected either up or down by
a fixed amount, indicating that µz only had two possible values relative to the magnetic field.
The deflection was what would be expected for Lz = ±ℏ. That accorded nicely with Bohr’s
planetary orbits, and was taken as a confirmation of a prediction of what we now call the “old”
quantum theory.
From a post-1926 perspective, however, l = 1 would give three spots (m = −1, 0, 1), not two—
and anyway we now know that the electrons in silver atoms have zero net orbital magnetic
moment. By that time though other considerations, particularly the so-called anomalous Zee-
man splitting of spectroscopic lines in a magnetic field, had caused first Kronig and then, in 1925,
Goudsmit and Uhlenbeck, to suggest that electrons could have a further source of angular
momentum that they called spin, which would have only two possible values (m = −1/2, +1/2) but
which couples twice as strongly to a magnetic field as orbital angular momentum (gs = 2)—
hence the Stern-Gerlach result (µ̂ = −(µB/ℏ)2Ŝ). We now know that the electron does indeed
carry an intrinsic angular momentum, called spin but not mechanical in origin, which is an
example of the j = 1/2 possibility that we deduced above.
Thus the full specification of the state of an electron has two parts, spatial and spin. The
vector space is a tensor direct product space of the space of square-integrable functions of which
the spatial state is a member, states like |ψr(t)⟩, for which ⟨r|ψr(t)⟩ = ψ(r, t), and spin space,
containing states |ψs(t)⟩, the nature of which we will explore in more detail in the next section.
While in non-relativistic QM this has to be put in by hand, it emerges naturally from the Dirac
equation, which also predicts gs = 2.
Because this product space is itself a vector space, sums of vectors are in the space and not all
states of the system are separable (that is, they do not all have the form |ψr(t)⟩ ⊗ |ψs(t)⟩). We
can also have states like |ψr(t)⟩ ⊗ |ψs(t)⟩ + |φr(t)⟩ ⊗ |φs(t)⟩. As we will see, spin-space is two-
dimensional (call the basis {|+⟩, |−⟩} just now), so including spin doubles the dimension of the
state space; as a result we never need more than two terms, and can write
|Ψ(t)⟩ = c1|ψr(t)⟩ ⊗ |+⟩ + c2|φr(t)⟩ ⊗ |−⟩.
But this still means that the electron has two spatial wave functions, one for each spin state. In
everything we’ve done so far the spin is assumed not to be affected by the dynamics, in which
case we return to a single common spatial state. But that is not general.
2.3 Spin-1/2
Shankar 14.3; Griffiths 4.4; Mandl 5.3
Whereas with orbital angular momentum we were talking about an infinite-dimensional space
which could be considered as a sum of subspaces with l = 0, 1, 2, . . ., when we talk about
intrinsic angular momentum—spin—we are confined to a single subspace with fixed j. We also
use Ŝ in place of Ĵ, but the operators Ŝi obey the same rules as the Ĵi. The simultaneous
eigenstates of Ŝ² and Ŝz are |s, m⟩, but as ALL states in the space have the same s, we often
drop it in the notation. In this case, s = 1/2, m = −1/2, +1/2, so the space is two-dimensional with
a basis variously denoted
{|1/2, 1/2⟩, |1/2, −1/2⟩} ≡ {|1/2⟩, |−1/2⟩} ≡ {|+⟩, |−⟩} ≡ {|ẑ+⟩, |ẑ−⟩}
In the last case ẑ is a unit vector in the z-direction, so we are making it clear that these are
states with spin-up (+) and spin-down (−) in the z-direction. We will also construct states
with definite spin in other directions.
In this basis, the matrices representing Ŝz (which is diagonal), Ŝ+ and Ŝ− = Ŝ+†, can be written
down directly. Recall Ĵ+|j, m⟩ = ℏ√(j(j+1) − m(m+1)) |j, m+1⟩, so
Ŝ+|1/2, −1/2⟩ = ℏ√(3/4 + 1/4) |1/2, 1/2⟩ = ℏ|1/2, 1/2⟩,   Ŝ+|1/2, 1/2⟩ = 0,
and so ⟨1/2, 1/2|Ŝ+|1/2, −1/2⟩ = ℏ is the only non-vanishing matrix element of Ŝ+. From these,
Ŝx = ½(Ŝ+ + Ŝ−) and Ŝy = −½i(Ŝ+ − Ŝ−) can be constructed:
|ẑ+⟩ → (1, 0)ᵀ,   |ẑ−⟩ → (0, 1)ᵀ,   Ŝ+ → ℏ (0 1; 0 0),   Ŝ− → ℏ (0 0; 1 0),
Ŝz → (ℏ/2) (1 0; 0 −1),   Ŝx → (ℏ/2) (0 1; 1 0),   Ŝy → (ℏ/2) (0 −i; i 0).
All these representations refer to the Sz basis. It is easily shown that the matrices representing
the Ŝi obey the required commutation relations.
The matrices
σ1 = (0 1; 1 0),   σ2 = (0 −i; i 0),   σ3 = (1 0; 0 −1)
are called the Pauli matrices. They obey σiσj = δijI + i Σk εijk σk. The three together, σ, form
a vector (a vector of matrices, as Ŝ is a vector of operators) while for some “ordinary” vector
a, a·σ is a single Hermitian matrix:
a·σ = (az, ax − iay; ax + iay, −az),
and (a·σ)(b·σ) = a·b I + i(a × b)·σ. Together with the identity matrix they form a basis (with
real coefficients) for all Hermitian 2 × 2 matrices.
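A quick numerical check of the identity (a·σ)(b·σ) = a·b I + i(a × b)·σ for random real vectors (a sketch, not in the notes):

```python
import numpy as np

sigma = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]])               # σ1, σ2, σ3

rng = np.random.default_rng(1)
a, b = rng.normal(size=3), rng.normal(size=3)

adots = np.einsum('i,ijk->jk', a, sigma)            # a·σ
bdots = np.einsum('i,ijk->jk', b, sigma)            # b·σ
lhs = adots @ bdots
rhs = np.dot(a, b) * np.eye(2) + 1j * np.einsum('i,ijk->jk', np.cross(a, b), sigma)
print(np.allclose(lhs, rhs))                        # True
```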
The component of Ŝ in an arbitrary direction defined by the unit vector n is Ŝ·n. We can
parametrise the direction of n by the polar angles θ, φ, so n = sin θ cos φ ex + sin θ sin φ ey +
cos θ ez. Then in the basis of eigenstates of Ŝz, Ŝ·n and its eigenstates are given by
Ŝ·n → (ℏ/2) (cos θ, e^{−iφ} sin θ; e^{iφ} sin θ, −cos θ);
|n+⟩ → (cos(θ/2), e^{iφ} sin(θ/2))ᵀ,   |n−⟩ → (−e^{−iφ} sin(θ/2), cos(θ/2))ᵀ.
The eigenvalues, of course, are ±ℏ/2. (There is nothing special about the z-direction.)
The most general normalised state of a spin-half system has the form
a|ẑ+⟩ + b|ẑ−⟩ → (a, b)ᵀ, where |a|² + |b|² = 1.
In fact any such state is an eigenstate of Ŝ·n for some direction n, given by tan(θ/2) = |b/a| and
φ = Arg(b) − Arg(a).
Note that (from the matrix representation, or by checking explicitly for the two states of a
basis) (2Ŝ·n/ℏ)² = Î for any n. So any function of Ŝ·n reduces to a piece proportional to Î plus a
piece linear in Ŝ·n; in particular exp(−iαŜ·n/ℏ) = cos(α/2) Î − i sin(α/2) (2Ŝ·n/ℏ).
The lack of higher powers of the Ŝi follows from the point about Hermitian operators noted above.
Some calculation in the matrix basis reveals the useful fact that ⟨n±|Ŝ|n±⟩ = ±(ℏ/2)n; that is,
the expectation value of the vector operator Ŝ is parallel or antiparallel to n.
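Again this is easy to verify numerically for an arbitrary direction (a sketch; the form of |n+⟩ used below, (cos(θ/2), e^{iφ} sin(θ/2))ᵀ, is the one given above):

```python
import numpy as np

theta, phi, hbar = 0.9, 2.1, 1.0
n = np.array([np.sin(theta) * np.cos(phi), np.sin(theta) * np.sin(phi), np.cos(theta)])
sigma = np.array([[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]])
Sn = (hbar / 2) * np.einsum('i,ijk->jk', n, sigma)    # Ŝ·n in the Sz basis

nplus = np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])
print(np.allclose(Sn @ nplus, (hbar / 2) * nplus))    # eigenvalue +ħ/2: True

# <n+|Ŝ|n+> = (ħ/2) n
S = (hbar / 2) * sigma
expS = np.array([(nplus.conj() @ S[i] @ nplus).real for i in range(3)])
print(np.allclose(expS, (hbar / 2) * n))              # True
```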
2.3.1 Spin precession
The Hamiltonian of a spin-1/2 electron in a uniform magnetic field is, with gs = 2 and charge
−|e|,
Ĥ = −µ·B = (gsµB/ℏ) Ŝ·B → µB σ·B (in the Sz basis).
Consider the case of a field in the x-direction, so that Ĥ → µB B σx, and a particle initially in
the state |ẑ+⟩. It turns out that we have already done this problem in Section 1.4.1, with the γ
of that problem being equal to µB B/ℏ. Giving the result in terms of ω = 2γ = 2µB B/ℏ, which
is the frequency corresponding to the energy splitting of the eigenstates of Ĥ, we obtained
|ψ(t)⟩ = cos(ωt/2)|ẑ+⟩ − i sin(ωt/2)|ẑ−⟩,   ⟨ψ(t)|Ŝz|ψ(t)⟩ = (ℏ/2) cos ωt.
To this we can now add ⟨ψ(t)|Ŝy|ψ(t)⟩ = −(ℏ/2) sin ωt and ⟨ψ(t)|Ŝx|ψ(t)⟩ = 0. So the ex-
pectation value of Ŝ is a vector of length ℏ/2 in the yz plane which rotates with frequency
ω = 2µB B/ℏ. This is exactly what we would get from Ehrenfest’s theorem.
Alternatively, we can take the magnetic field along ẑ so that the energy eigenstates are |ẑ±⟩
with energies ±µB B ≡ ±ℏω/2. If the initial state is spin-up in an arbitrary direction n, that
is |n+⟩, we can decompose this in terms of the energy eigenstates, each with its own energy
dependence, and obtain
|ψ(t)⟩ = cos(θ/2) e^{−i(ωt+φ)/2} |ẑ+⟩ + sin(θ/2) e^{i(ωt+φ)/2} |ẑ−⟩ = |n(t)+⟩
where n(t) is a vector which, like the original n, is oriented at an angle θ to the ẑ (i.e. B)
axis, but which rotates about that axis so that the azimuthal angle changes with time: φ(t) =
φ(0) + ωt. The expectation value ⟨Ŝ⟩ precesses likewise, following the same behaviour as a
classical magnetic moment µ = −(gsµB/ℏ)S.
2.4 Higher spins
The Particle Data Group lists spin-1 particles and spin-3/2 particles; gravitons if they exist are
spin-2, and nuclei can have much higher spins (at least 9/2 for known ground states of stable
nuclei).
Furthermore, since in many situations total angular momentum commutes with the Hamiltonian
(see later), even when orbital angular momentum is involved we are often only concerned with
a subspace of fixed j (or l or s). All such sub-spaces are finite dimensional, of dimension
N = 2j + 1, and spanned by the basis {|j, m⟩} with m = j, j − 1, . . . , −j + 1, −j. It is most
usual (though of course not obligatory) to order the states by descending m.
In this subspace, with this basis, the operators Ĵx, Ĵy, Ĵz are represented by three N × N
matrices with matrix elements e.g. (Jx)m′m = ⟨j, m′|Ĵx|j, m⟩. (Because states with different j
are orthogonal, and because the Ĵi only change m, not j, ⟨j′, m′|Ĵx|j, m⟩ = 0 if j′ ≠ j: that’s
why we can talk about non-overlapping subspaces in the first place.) The matrix representation
of Ĵz of course is diagonal, with diagonal elements j, j − 1, . . . , −j + 1, −j (times ℏ). As with
spin-1/2, it is easiest to construct Ĵ+ first, then Ĵ− as its transpose (the elements of the former
having been chosen to be real), then Ĵx = ½(Ĵ+ + Ĵ−) and Ĵy = −½i(Ĵ+ − Ĵ−).
As an example we construct the matrix representation of the operators for spin-1. The three
basis states |s, m⟩ are |1, 1⟩, |1, 0⟩ and |1, −1⟩. Recall Ĵ+|j, m⟩ = ℏ√(j(j+1) − m(m+1)) |j, m+1⟩,
so Ŝ+|1, −1⟩ = √2 ℏ |1, 0⟩, Ŝ+|1, 0⟩ = √2 ℏ |1, 1⟩ and Ŝ+|1, 1⟩ = 0, and the only non-zero
matrix elements of Ŝ+ are
⟨1, 1|Ŝ+|1, 0⟩ = ⟨1, 0|Ŝ+|1, −1⟩ = √2 ℏ.
So:
|1, 1⟩ → (1, 0, 0)ᵀ,   |1, 0⟩ → (0, 1, 0)ᵀ,   |1, −1⟩ → (0, 0, 1)ᵀ,
Ŝz → ℏ (1 0 0; 0 0 0; 0 0 −1),   Ŝ+ → √2 ℏ (0 1 0; 0 0 1; 0 0 0),   Ŝ− → √2 ℏ (0 0 0; 1 0 0; 0 1 0),
Ŝx → (ℏ/√2) (0 1 0; 1 0 1; 0 1 0),   Ŝy → (ℏ/√2) (0 −i 0; i 0 −i; 0 i 0)
(all in the Sz basis). Of course this is equally applicable to any system with j = 1, including
the l = 1 spherical harmonics.
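The construction generalises to any j, and makes a nice few-line function; the sketch below (not in the notes) builds Ĵz, Ĵx, Ĵy for given j from the matrix elements of Ĵ+ and verifies the commutation relation and the value of Ĵ² (in units where ℏ = 1):

```python
import numpy as np

def jmatrices(j, hbar=1.0):
    """Jz, Jx, Jy for angular momentum j, basis ordered m = j, j-1, ..., -j."""
    m = np.arange(j, -j - 1, -1)
    # <j,m+1|J+|j,m> = hbar*sqrt(j(j+1) - m(m+1)) sits on the superdiagonal
    Jp = np.diag(hbar * np.sqrt(j * (j + 1) - m[1:] * (m[1:] + 1)), k=1)
    Jm = Jp.T                            # elements chosen real, so transpose = adjoint
    return hbar * np.diag(m), (Jp + Jm) / 2, (Jp - Jm) / 2j

j = 1                                    # works equally for half-integer j, e.g. 1.5
Jz, Jx, Jy = jmatrices(j)
print(np.allclose(Jx @ Jy - Jy @ Jx, 1j * Jz))              # [Jx,Jy] = iħJz: True
J2 = Jx @ Jx + Jy @ Jy + Jz @ Jz
print(np.allclose(J2, j * (j + 1) * np.eye(int(2 * j + 1))))  # J² = ħ²j(j+1)I: True
```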
Once all possible values of j and m are allowed, any angular momentum operator is represented
in the {|j, m⟩} = {|0, 0⟩, |1/2, 1/2⟩, |1/2, −1/2⟩, |1, 1⟩, |1, 0⟩, |1, −1⟩, . . .} basis by a block-diagonal matrix.
The first block is a single element, zero in fact, since all components of Ĵ in the one-dimensional
space of states of j = 0 are zero. The next block is the appropriate 2×2 spin-1/2 matrix, the next
a 3 × 3 spin-1 matrix, and so on. This block-diagonal structure reflects the fact that the vector
space can be written as a direct sum of spaces with j = 0, j = 1/2, j = 1 . . .: V = V¹ ⊕ V² ⊕ V³ ⊕ . . .
(where the superscripts of course are 2j + 1).
In fact, any given physical system can only have integer or half-integer angular momentum.
So the picture would be similar, but with only odd- or even-dimensioned blocks. For orbital
angular momentum, for instance, the blocks would be 1 × 1, 3 × 3, 5 × 5 . . ..
2.5 Addition of angular momenta
Up till now, we have in general spoken rather loosely as if an electron has either orbital or
spin angular momentum—or more precisely, we’ve considered cases where only one affects the
dynamics, so we can ignore the other. But many cases are not like that. The electron in an
excited hydrogen atom may have orbital as well as spin angular momentum, and if it is placed
in a magnetic field, both will affect how the energy levels shift, and hence how the spectral
lines split. Or the deuteron (heavy hydrogen nucleus) consists of both a proton and a neutron,
and both have spin; heavier atoms and nuclei have many components all with spin and orbital
angular momentum. Only the total angular momentum of the whole system is guaranteed by
rotational symmetry to be conserved in the absence of external fields. So we need to address
the question of the addition of angular momentum.
Because the notation is clearest, we will start with the spin and orbital angular momentum of
a particle. We consider the case where l as well as s is fixed: electrons in a p-wave orbital,
for instance. These two types of angular momentum are independent and live in different
vector spaces, so this is an example of a tensor direct product space, spanned by the basis
{|l, ml⟩ ⊗ |s, ms⟩} and hence (2l + 1) × (2s + 1) dimensional.
Now angular momentum is a vector, and we expect the total angular momentum to be the
vector sum of the orbital and spin angular momenta. We can form a new vector operator in
the product space:
Ĵ = L̂ ⊗ Î + Î ⊗ Ŝ ≡ L̂ + Ŝ.
Now in calling the sum of angular momenta Ĵ, which we previously used for a generic angular
momentum, we are assuming that the Ĵi do indeed obey the defining commutation rules for
angular momentum, and this can easily be demonstrated. For instance
[Ĵx, Ĵy] = [L̂x + Ŝx, L̂y + Ŝy] = [L̂x, L̂y] + [Ŝx, Ŝy] = iℏL̂z + iℏŜz = iℏĴz,
where we have used the fact that [L̂i, Ŝj] = 0, since they act in different spaces. Hence we
expect that an alternative basis in the product space will be {|j, mj⟩}, with allowed values of
j not yet determined. The question we want to answer, then, is the connection between the
{|l, ml⟩ ⊗ |s, ms⟩} and {|j, mj⟩} bases. Both, we note, must have dimension (2l + 1) × (2s + 1).
We note some other points about the commutators: L̂z, Ŝz and Ĵz all commute; Ĵz commutes
with Ĵ² (of course) and with L̂² and with Ŝ² (because both L̂z and Ŝz do), but L̂z and Ŝz do not
commute with Ĵ². Thus we can, as implied when we wrote down the two bases, always specify
l and s, but then either ml and ms (with mj = ml + ms) or j and mj. (We will sometimes write
|l, s; j, mj⟩ instead of just |j, mj⟩, if we need a reminder of l and s in the problem.) What this
boils down to is that the states of a given j and mj will be linear superpositions of the states of
given ms and ml that add up to that mj. If there is more than one such state, there must be
more than one allowed value of j for that mj.
Let’s introduce a useful piece of jargon: the state of maximal m in a multiplet, |j, j⟩, is called
the stretched state.
We start with the state of maximal ml and ms, |l, l⟩ ⊗ |s, s⟩, which has mj = l + s. This is
clearly the maximal value of mj, and hence of j: jmax = l + s, and since the state is unique, it
must be an eigenstate of Ĵ².² If we act on this with Ĵ− = L̂− + Ŝ−, we get a new state with
two terms in it; recalling the general rule Ĵ−|j, m⟩ = ℏ√(j(j + 1) − m(m − 1)) |j, m−1⟩, where j
can stand for j or l or s, we have (using j̄ as a shorthand for jmax = l + s)
|j̄, j̄⟩ = |l, l⟩ ⊗ |s, s⟩ ⇒ Ĵ−|j̄, j̄⟩ = (L̂−|l, l⟩) ⊗ |s, s⟩ + |l, l⟩ ⊗ (Ŝ−|s, s⟩)
⇒ √(2j̄) |j̄, j̄−1⟩ = √(2l) |l, l−1⟩ ⊗ |s, s⟩ + √(2s) |l, l⟩ ⊗ |s, s−1⟩
From this state we can continue operating with Ĵ−; at the next step there will be three terms on
the R.H.S., with {ml, ms} equal to {l−2, s}, {l−1, s−1} and {l, s−2}, then four, but eventually
we will reach states which are annihilated by L̂− or Ŝ− and the number of terms will start to
shrink again, till we finally reach |j̄, −j̄⟩ = |l, −l⟩ ⊗ |s, −s⟩ after 2j̄ steps (2j̄ + 1 states in all).
² This can also be seen directly by acting with Ĵ² = Ĵ−Ĵ+ + Ĵz² + ℏĴz, since |l, l⟩ ⊗ |s, s⟩ is an eigenstate of
Ĵz with eigenvalue ℏ(l + s), and is annihilated by both L̂+ and Ŝ+, and hence by Ĵ+.
Whichever is the smaller of l or s will govern the maximum number of {ml, ms} pairs that can
add up to any given mj; for example if s is smaller, the maximum number is 2s + 1.
Now the state we found with mj = l+s−1 is not unique; there must be another orthogonal
combination of the two states with {ml, ms} equal to {l−1, s} and {l, s−1}. This cannot be
part of a multiplet with j = j̄, because we’ve “used up” the only state with mj = j̄. So it must
be the highest-mj state (the stretched state) of a multiplet with j = j̄ − 1 (i.e. l + s − 1):
|j̄−1, j̄−1⟩ = −√(s/(l+s)) |l, l−1⟩ ⊗ |s, s⟩ + √(l/(l+s)) |l, l⟩ ⊗ |s, s−1⟩
Successive operations with Ĵ− will generate the rest of the multiplet (2j̄ − 1 states in all); all the
states will be orthogonal to the states of the same mj but higher j already found.
However there will be a third linear combination of the states with {ml, ms} equal to {l−2, s},
{l−1, s−1} and {l, s−2}, which cannot have j = j̄ or j̄−1. So it must be the stretched state of
a multiplet with j = j̄−2 (2j̄ − 3 states in all).
And so it continues, generating multiplets with successively smaller values of j. However the
process will come to an end. As we saw, the maximum number of terms in any sum is whichever
is smaller of 2l + 1 or 2s + 1, so this is also the maximum number of mutually orthogonal states
of the same mj, and hence the number of different values of j. So j can be between l + s and
the larger of l + s − 2s and l + s − 2l; that is, l + s ≥ j ≥ |l − s|. The size of the {|j, mj⟩} basis
is then Σ_{j=|l−s|}^{l+s} (2j + 1), which is equal to (2l + 1)(2s + 1).
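The dimension count can be checked mechanically for integer and half-integer cases alike (a sketch using exact fractions):

```python
# Check that sum_{j=|l-s|}^{l+s} (2j+1) = (2l+1)(2s+1) for a few (l, s) pairs
from fractions import Fraction

for l, s in [(2, 1), (3, Fraction(1, 2)), (Fraction(5, 2), Fraction(3, 2))]:
    total, j = 0, abs(l - s)
    while j <= l + s:        # j runs from |l-s| to l+s in unit steps
        total += 2 * j + 1
        j += 1
    print(total == (2 * l + 1) * (2 * s + 1))   # True each time
```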
The table below illustrates the process for l = 2, s = 1; we go down a column by applying Ĵ−,
and start a new column by constructing a state orthogonal to those in the previous columns.
The three columns correspond to j = 3, j = 2 and j = 1, and there are 7 + 5 + 3 = 5 × 3 states
in total.
j = 3 multiplet:
|3, 3⟩ = |2, 2⟩⊗|1, 1⟩
|3, 2⟩ = √(2/3)|2, 1⟩⊗|1, 1⟩ + √(1/3)|2, 2⟩⊗|1, 0⟩
|3, 1⟩ = √(2/5)|2, 0⟩⊗|1, 1⟩ + √(8/15)|2, 1⟩⊗|1, 0⟩ + √(1/15)|2, 2⟩⊗|1, −1⟩
|3, 0⟩ = √(1/5)|2, −1⟩⊗|1, 1⟩ + √(3/5)|2, 0⟩⊗|1, 0⟩ + √(1/5)|2, 1⟩⊗|1, −1⟩
|3, −1⟩ = √(1/15)|2, −2⟩⊗|1, 1⟩ + √(8/15)|2, −1⟩⊗|1, 0⟩ + √(2/5)|2, 0⟩⊗|1, −1⟩
|3, −2⟩ = √(1/3)|2, −2⟩⊗|1, 0⟩ + √(2/3)|2, −1⟩⊗|1, −1⟩
|3, −3⟩ = |2, −2⟩⊗|1, −1⟩
j = 2 multiplet:
|2, 2⟩ = −√(1/3)|2, 1⟩⊗|1, 1⟩ + √(2/3)|2, 2⟩⊗|1, 0⟩
|2, 1⟩ = −√(1/2)|2, 0⟩⊗|1, 1⟩ + √(1/6)|2, 1⟩⊗|1, 0⟩ + √(1/3)|2, 2⟩⊗|1, −1⟩
|2, 0⟩ = −√(1/2)|2, −1⟩⊗|1, 1⟩ + 0 |2, 0⟩⊗|1, 0⟩ + √(1/2)|2, 1⟩⊗|1, −1⟩
|2, −1⟩ = −√(1/3)|2, −2⟩⊗|1, 1⟩ − √(1/6)|2, −1⟩⊗|1, 0⟩ + √(1/2)|2, 0⟩⊗|1, −1⟩
|2, −2⟩ = −√(2/3)|2, −2⟩⊗|1, 0⟩ + √(1/3)|2, −1⟩⊗|1, −1⟩
j = 1 multiplet:
|1, 1⟩ = √(1/10)|2, 0⟩⊗|1, 1⟩ − √(3/10)|2, 1⟩⊗|1, 0⟩ + √(3/5)|2, 2⟩⊗|1, −1⟩
|1, 0⟩ = √(3/10)|2, −1⟩⊗|1, 1⟩ − √(2/5)|2, 0⟩⊗|1, 0⟩ + √(3/10)|2, 1⟩⊗|1, −1⟩
|1, −1⟩ = √(3/5)|2, −2⟩⊗|1, 1⟩ − √(3/10)|2, −1⟩⊗|1, 0⟩ + √(1/10)|2, 0⟩⊗|1, −1⟩
The coefficients in the table are called Clebsch-Gordan coefficients. They are the inner prod-
ucts (⟨l, ml| ⊗ ⟨s, ms|)|j, mj⟩, but that is too cumbersome a notation; with a minimum modifi-
cation Shankar uses ⟨l, ml; s, ms|j, mj⟩; Mandl uses C(l, ml, s, ms; j, mj), but ⟨l, s, ml, ms|j, mj⟩
and other minor modifications, including dropping the commas, are common. They are all
totally clear when symbols are being used, but easily confused when numerical values are sub-
stituted! We use the “Condon-Shortley” phase convention, which is the most common; in this
convention Clebsch-Gordan coefficients are real, which is why we won’t write ⟨l, ml; s, ms|j, mj⟩*
in the second equation of Eq. (2.4) below.
All of this has been written for the addition of orbital and spin angular momenta. But we did
not actually assume at any point that l was an integer. So in fact the same formulae apply for
the addition of any two angular momenta of any origin: a very common example is two spin-1/2
particles. The more general form for adding two angular momenta j1 and j2, with J and M
being the quantum numbers corresponding to the total angular momentum of the system, is
|J, M⟩ = Σ_{m1,m2} ⟨j1, m1; j2, m2|J, M⟩ |j1, m1⟩ ⊗ |j2, m2⟩,
|j1, m1⟩ ⊗ |j2, m2⟩ = Σ_{J,M} ⟨j1, m1; j2, m2|J, M⟩ |J, M⟩.   (2.4)
To summarise, the states of a system with two contributions to the angular momentum, j1
and j2, can be written in a basis in which the total angular momentum J and z-component M are
specified; the values of J range from |j1−j2| to j1+j2 in unit steps. In this basis the total angular
momentum operators Ĵi and Ĵ² are cast in block-diagonal form, one (2J+1)-square block for
each value of J. The vector space, which we started by writing as a product, V^{2j1+1} ⊗ V^{2j2+1},
can instead be written as a direct sum: V^{2(j1+j2)+1} ⊕ . . . ⊕ V^{2|j1−j2|+1}. In particular for some
orbital angular momentum l and s = 1/2, V^{2l+1} ⊗ V² = V^{2l+2} ⊕ V^{2l}. The overall dimension of
the space is of course unchanged.
General formulae for the Clebsch-Gordan coefficients are not used (the already-met case of s = 1/2
is an exception). One may use the Mathematica function ClebschGordan[{l, ml}, {s, ms}, {j, mj}],
or the on-line calculator at Wolfram Alpha.
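An open-source alternative, if Mathematica is not to hand, is sympy, whose CG class takes the arguments (j1, m1, j2, m2, J, M); a sketch reproducing two coefficients of the 1 ⊗ 1/2 coupling used below:

```python
from sympy import S
from sympy.physics.quantum.cg import CG

# <1,1; 1/2,-1/2 | 3/2,1/2> and <1,0; 1/2,1/2 | 3/2,1/2>
print(CG(1, 1, S(1)/2, -S(1)/2, S(3)/2, S(1)/2).doit())   # sqrt(3)/3, i.e. sqrt(1/3)
print(CG(1, 0, S(1)/2, S(1)/2, S(3)/2, S(1)/2).doit())    # sqrt(6)/3, i.e. sqrt(2/3)
```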
2.5.1 Using tables of Clebsch-Gordan coefficients
Frequently though one consults tables, and this section gives instructions on their use.
In a system with two contributions to angular momentum j1 and j2, Clebsch-Gordan coefficients
are used to write states of good total angular momentum J and z-component M, |j1, j2; J, M⟩
or just |J, M⟩, in terms of the {m1, m2} basis |j1, m1⟩ ⊗ |j2, m2⟩:
|j1, j2; J, M⟩ = Σ_{m1,m2} ⟨j1, m1; j2, m2|J, M⟩ |j1, m1⟩ ⊗ |j2, m2⟩ and
|j1, m1⟩ ⊗ |j2, m2⟩ = Σ_{J,M} ⟨j1, m1; j2, m2|J, M⟩ |j1, j2; J, M⟩
where the numbers denoted by ⟨j1, m1; j2, m2|J, M⟩ are the Clebsch-Gordan coefficients; they
vanish unless j1 + j2 ≥ J ≥ |j1 − j2| and m1 + m2 = M. There is a conventional tabulation
which can be found in various places including the Particle Data Group site, but the notation
takes some explanation.
There is one table for each j1, j2 pair. The table consists of a series of blocks, one for each value
of M. Along the top are the possible values of J and M, and at the left the possible values of
m1, m2. Each block stands for something which could be written like this one for j1 = 1, j2 = 1/2
and M = m1 + m2 = 1/2:

           J     3/2    1/2
           M    +1/2   +1/2
m1    m2
+1   −1/2        1/3    2/3
 0   +1/2        2/3   −1/3

For compactness the numbers in the blocks are the coefficients squared times their sign; thus
−1/2 stands for −√(1/2).
As an example consider the table for coupling j1 = 1 and j2 = 1/2, to get J = 3/2 or 1/2. For clarity
we will use the notation |J, M⟩ in place of |j1, j2; J, M⟩.
[PDG-style table for 1 ⊗ 1/2, with the relevant entries highlighted in red and green.]
In red are highlighted the coefficients of |1, 1⟩ ⊗ |1/2, −1/2⟩ and |1, 0⟩ ⊗ |1/2, 1/2⟩ in |1/2, 1/2⟩:
|1/2, 1/2⟩ = √(2/3) |1, 1⟩ ⊗ |1/2, −1/2⟩ − √(1/3) |1, 0⟩ ⊗ |1/2, 1/2⟩.
In green are the components for the decomposition
|1, −1⟩ ⊗ |1/2, 1/2⟩ = √(1/3) |3/2, −1/2⟩ − √(2/3) |1/2, −1/2⟩.
Or for coupling j1 = 3/2 and j2 = 1:
[PDG-style table for 3/2 ⊗ 1.]
2.5.2 Example: Two spin-1/2 particles
Here we will call the operators Ŝ⁽¹⁾, Ŝ⁽²⁾ and Ŝ = Ŝ⁽¹⁾ + Ŝ⁽²⁾ for the individual and total spin
operators, and S and M for the total spin quantum numbers. (The use of capitals is standard
in a many-particle system.) Because both systems are spin-1/2, we will omit the label from our
states, which we will write in the {m1, m2} basis as
|1⟩ = |+⟩ ⊗ |+⟩, |2⟩ = |+⟩ ⊗ |−⟩, |3⟩ = |−⟩ ⊗ |+⟩, |4⟩ = |−⟩ ⊗ |−⟩.
Coupling two spin-1/2 systems gives S = 1 or S = 0: the triplet states |1, 1⟩ = |1⟩, |1, 0⟩ =
√(1/2)(|2⟩ + |3⟩), |1, −1⟩ = |4⟩, and the singlet state |0, 0⟩ = √(1/2)(|2⟩ − |3⟩). In the coupled
basis {|0, 0⟩; |1, 1⟩, |1, 0⟩, |1, −1⟩} the total spin operators are represented by block-diagonal
matrices, where the 1 × 1 plus 3 × 3 block-diagonal structure has been emphasised and the 3 × 3
blocks are just the spin-1 matrices we found previously.
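A numerical cross-check of the singlet/triplet decomposition (a sketch, not in the notes): build the total spin from Pauli matrices with np.kron and diagonalise Ŝ².

```python
import numpy as np

hbar = 1.0
sigma = [np.array([[0, 1], [1, 0]]), np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]])]
I2 = np.eye(2)

# Total spin components S_i = S_i^(1) ⊗ I + I ⊗ S_i^(2) in the basis |1>..|4>
Stot = [(hbar / 2) * (np.kron(s, I2) + np.kron(I2, s)) for s in sigma]
S2 = sum(S @ S for S in Stot)

evals = np.sort(np.linalg.eigvalsh(S2))
print(evals)   # [0, 2, 2, 2] (×ħ²): one singlet (S=0) and a triplet (S=1, S(S+1)=2)
```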
2.5.3 Angular Momentum of Atoms and Nuclei
Both atoms and nuclei consist of many spin-1/2 fermions, each of which has both spin and orbital
angular momentum. In the independent-particle model we think of each fermion occupying a
well-defined single-particle orbital which is an eigenstate of a central potential and hence has
well-defined orbital angular momentum l. The notation s, p, d, f, g . . . is used for orbitals of
l = 0, 1, 2, 3, 4 . . .. For each fermion there is also a total angular momentum j. All the angular
momenta of all the fermions can be added in a variety of ways, and the following quantum
numbers are defined: L for the sum of all the orbital angular momenta (that is, the eigenvalues
of L̂²tot are ℏ²L(L + 1)); S for the sum of all the spin angular momenta, and J for the total
angular momentum of the atom or nucleus from all sources. The use of capitals for the quantum
numbers shouldn’t be confused with the operators themselves.
In reality the independent-particle model is only an approximation, and only the total angular
momentum J is a conserved quantum number (only Ĵ²tot commutes with the Hamiltonian of
the whole system). For light atoms, it is a good starting point to treat L and S as if they were
conserved too, and the notation ^{2S+1}L_J is used, with L being denoted by S, P, D, F, G . . . .
This is termed LS coupling. So ³S₁ has L = 0, S = J = 1. Energy levels of light atoms are
split according to J by the spin-orbit splitting ∝ L̂·Ŝ (of which more later). However for heavy
atoms and for nuclei, it is a better approximation to sum the individual total angular momenta
j to give J as the only good quantum number (j-j coupling).
Somewhat confusingly, J is often called the spin of the atom or nucleus, even though its origin
is both spin and angular momentum. This composite origin shows up in a magnetic coupling
g which is neither 1 (pure orbital) or 2 (pure spin). For light atoms g can be calculated from
L, S and J (the Landé g-factor). For nuclei things are further complicated by the fact that
protons and neutrons are not elementary particles, and their “spin” is likewise of composite
origin, something which shows up through their g values of gp = 5.59 and gn = −3.83 rather
than 2 and 0 respectively. Using these the equivalent of the Landé g-factor can be calculated
for individual nucleon orbitals, and hence for those odd-even nuclei for which the single-particle
model works (that is, assuming that only the last unpaired nucleon contributes to the total
angular momentum). Beyond that it gets complicated.
2.6 Vector Operators
This section is not examinable. The take-home message is that vector operators such as $\hat{\mathbf x}$ and $\hat{\mathbf p}$ can change the angular momentum of the state they act on in the same way as coupling in another source of angular momentum with $l=1$. If the components of the vector operator are written in a spherical basis analogously to $\hat J_\pm$, the dependence of the matrix elements on the $m$ quantum numbers is given by Clebsch-Gordan coefficients, with the non-trivial dependence residing only in a single "reduced matrix element" for each pair $j$ and $j'$ of the angular momenta of the initial and final states. This is the Wigner-Eckart theorem of Eq. (2.6).
We have now met a number of vector operators: $\hat{\mathbf x} = (\hat x,\hat y,\hat z)$, $\hat{\mathbf p} = (\hat p_x,\hat p_y,\hat p_z)$, and of course $\hat{\mathbf L}$, $\hat{\mathbf S}$ and $\hat{\mathbf J}$. We have seen, either in lectures or examples, that they all satisfy the following relation: if $\hat{\mathbf V}$ stands for the vector operator,
$$[\hat J_i, \hat V_j] = i\hbar \sum_k \epsilon_{ijk} \hat V_k,$$
for example, $[\hat J_x, \hat y] = i\hbar\hat z$. (We could have substituted $\hat L_x$ for $\hat J_x$ here as spin and space operators commute.)
We can take this to be the definition of a vector operator: a triplet of operators makes up a vector operator if it satisfies these commutation relations.
Just as it was useful to define $\hat J_+$ and $\hat J_-$, so it is useful to define
$$\hat V_{+1} = -\sqrt{\tfrac12}\,(\hat V_1 + i\hat V_2), \qquad \hat V_{-1} = \sqrt{\tfrac12}\,(\hat V_1 - i\hat V_2), \qquad \hat V_0 = \hat V_3,$$
where the subscripts are no longer Cartesian coordinates ($1 \equiv x$ etc) but analogous to the $m$ of the spherical harmonics, and indeed
$$\mp\sqrt{\tfrac12}\,(x \pm iy) = \sqrt{\tfrac{4\pi}{3}}\, r Y_1^{\pm1}(\theta,\phi), \qquad z = \sqrt{\tfrac{4\pi}{3}}\, r Y_1^{0}(\theta,\phi).$$
Note a slight change of normalisation and sign: $\hat J_{\pm1} = \mp\sqrt{\tfrac12}\,\hat J_\pm$. In terms of these spherical components $\hat V_m$,
$$[\hat J_0, \hat V_m] = m\hbar \hat V_m, \qquad [\hat J_\pm, \hat V_m] = \hbar\sqrt{(1 \mp m)(2 \pm m)}\,\hat V_{m\pm1}.$$
This is a specific instance of the Wigner-Eckart theorem. It says that acting on a state with a vector operator is like coupling in one unit of angular momentum; only matrix elements $\langle j',m'|\hat V_q|j,m\rangle$ with $|j-1| \le j' \le j+1$ and $m' = m+q$ are non-vanishing. It also means that if one calculates one matrix element, whichever is the simplest (so long as it is non-vanishing), then the others can be written down directly.
Since $\hat{\mathbf J}$ is a vector operator, it follows that matrix elements of $\hat J_q$ can also be written in terms of a reduced matrix element $\langle j'\|\hat J\|j\rangle$, but of course this vanishes unless $j'=j$.
Writing $|j_1,j_2;J,M\rangle = \sum_{m_1 m_2} \langle J,M|j_1,m_1;j_2,m_2\rangle\,|j_1,m_1\rangle\otimes|j_2,m_2\rangle$, and using orthonormality of the states $\{|J,M\rangle\}$, allows us to show that
$$\sum_{m_1 m_2} \langle J,M|j_1,m_1;j_2,m_2\rangle \langle J',M'|j_1,m_1;j_2,m_2\rangle = \delta_{JJ'}\delta_{MM'}. \tag{2.5}$$
Noting too that a scalar product of vector operators $\hat{\mathbf P}\cdot\hat{\mathbf Q}$ can be written in spherical components as $\sum_q (-1)^q \hat P_{-q}\hat Q_q$, we can show that
$$\langle j,m|\hat{\mathbf P}\cdot\hat{\mathbf J}|j,m\rangle = \sum_{q,j',m'} (-1)^q \langle j,m|\hat P_{-q}|j',m'\rangle\langle j',m'|\hat J_q|j,m\rangle = \sum_{q,m'} \langle j,m'|\hat P_q|j,m\rangle\langle j,m'|\hat J_q|j,m\rangle = \langle j\|\hat P\|j\rangle\langle j\|\hat J\|j\rangle$$
(we insert a complete set of states at the first step, then use the Wigner-Eckart theorem and Eq. (2.5)).
Replacing $\hat{\mathbf P}$ with $\hat{\mathbf J}$ gives us $\langle j\|\hat J\|j\rangle = \sqrt{j(j+1)}$. Hence we have the extremely useful relation
$$\langle j,m'|\hat P_q|j,m\rangle = \frac{\langle j,m|\hat{\mathbf P}\cdot\hat{\mathbf J}|j,m\rangle}{\langle j,m|\hat{\mathbf J}^2|j,m\rangle}\,\langle j,m'|\hat J_q|j,m\rangle,$$
which we will use in calculating the Landé g factor in the next section.
Finally, we might guess from the way that we used a general symbol $l$ instead of 1, that there are operators which couple in 2 or more units of angular momentum. Simple examples are obtained by writing $r^l Y_l^m$ in terms of $x$, $y$ and $z$, then setting $x \to \hat x$ etc; so $(\hat x \pm i\hat y)^2$, $(\hat x \pm i\hat y)\hat z$ and $2\hat z^2 - \hat x^2 - \hat y^2$ are the $m = \pm2$, $m = \pm1$ and $m = 0$ components of an operator with $l = 2$ (a rank-two tensor operator, in the jargon). There are six independent components of $\hat x_i \hat x_j$, but $\hat x^2 + \hat y^2 + \hat z^2$ is a scalar ($l = 0$). This is an example of the tensor product of two $l = 1$ operators giving $l = 2$ and $l = 0$ operators.
3. Approximate methods I:
variational method and WKB
Exact solutions to quantum-mechanical problems are rare, and purely numerical solutions have their drawbacks:
• They don't provide much insight on which aspects of the physics are more important.
• Even supercomputers can’t solve the equations for many interacting particles exactly in
a reasonable time (where “many” may be as low as four, depending on the complexity of
the interaction) — ask a nuclear physicist or quantum chemist.
• Quantum field theories are systems with infinitely many degrees of freedom. All approaches to QFT must be approximate.
• If the system we are interested in is close to a soluble one, we might obtain more insight
from approximate methods than from numerical ones. This is the realm of perturbation
theory. The most accurate prediction ever made, for the anomalous magnetic moment of
the electron, which is good to one part in 1012 , is a 4th order perturbative calculation.
In the next chapter we will consider perturbation theory, which is probably the most widely used approximate method, and in this chapter we consider two other very useful approaches: variational methods and WKB.
3.2 Variational methods: ground state
Shankar 16.1; Mandl 8.1; Griffiths 7.1; Gasiorowicz 14.4
Suppose we know the Hamiltonian of a bound system but don’t have any idea what the energy
of the ground state is, or the wave function. The variational principle states that if we simply
guess the wave function, the expectation value of the Hamiltonian in that wave function will
be greater than the true ground-state energy:
$$\frac{\langle\Psi|\hat H|\Psi\rangle}{\langle\Psi|\Psi\rangle} \ge E_0.$$
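As a concrete illustration for the infinite square well of width $a$ (the example the following paragraphs build on), here is a minimal sympy sketch using the polynomial trial function $x(a-x)$; the bound comes out at $(10/\pi^2)E_0 \approx 1.013E_0$, just above the true ground-state energy:
```python
from sympy import symbols, integrate, diff, pi, simplify

x, a, hbar, m = symbols('x a hbar m', positive=True)
psi = x*(a - x)                       # trial wave function, vanishing at the walls

# <H>/<psi|psi> = (hbar^2/2m) * int |psi'|^2 / int |psi|^2 (after integrating by parts)
E_bound = hbar**2/(2*m) * integrate(diff(psi, x)**2, (x, 0, a)) / integrate(psi**2, (x, 0, a))

E0 = pi**2*hbar**2/(2*m*a**2)         # exact ground-state energy
print(simplify(E_bound/E0))           # 10/pi**2, about 1.013
```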
In the same way, a trial function that is antisymmetric about the mid-point of the well is automatically orthogonal to the ground state and so gives an upper bound on the first excited state; a simple choice yields 1.064$E_1$.
If we wanted a bound on E2 , we’d need a wave function which was orthogonal to both the
ground state and the first excited state. The latter is easy by symmetry, but as we don’t know
the exact ground state (or so we are pretending!) we can’t ensure the first. We can instead
form a trial wave function which is orthogonal to the best trial ground state, but we will no
longer have a strict upper bound on the energy E2 , just a guess as to its value.
In this case we can choose $\Psi(x) = x(a-x) + b\,x^2(a-x)^2$ with a new value of $b$ which gives orthogonality to the previous state, and then we get $E_2 \sim 10.3E_0$ (as opposed to the actual value of $9E_0$).
3.2.2 Variational methods: the helium atom
Griffiths 7.2; Gasiorowicz 14.2,4; Mandl 7.2, 8.8.2; Shankar 16.1
If we could switch off the interactions between the electrons, we would know what the ground state of the helium atom would be: $\Psi(\mathbf r_1,\mathbf r_2) = \phi^{Z=2}_{100}(\mathbf r_1)\,\phi^{Z=2}_{100}(\mathbf r_2)$, where $\phi^Z_{nlm}$ is a single-particle wave function of the hydrogenic atom with nuclear charge $Z$. For the ground state $n=1$ and $l=m=0$ (spherical symmetry). The energy of the two electrons would be $-2Z^2 E_{\rm Ry} = -108.8$ eV (with $E_{\rm Ry} = 13.6$ eV). But the experimental energy is only $-78.6$ eV (ie it takes 78.6 eV to fully ionise neutral helium). The difference is obviously due to the fact that the electrons repel one another.
The full Hamiltonian (ignoring the motion of the nucleus, a good approximation for the accuracy to which we will be working) is
$$-\frac{\hbar^2}{2m}\left(\nabla_1^2 + \nabla_2^2\right) - 2\hbar c\alpha\left(\frac{1}{|\mathbf r_1|} + \frac{1}{|\mathbf r_2|}\right) + \hbar c\alpha\,\frac{1}{|\mathbf r_1 - \mathbf r_2|}$$
where $\nabla_1^2$ involves differentiation with respect to the components of $\mathbf r_1$, and $\alpha = e^2/(4\pi\epsilon_0\hbar c) \approx 1/137$. (See here for a note on units in EM.)
A really simple guess at a trial wave function for this problem would just be $\Psi(\mathbf r_1,\mathbf r_2)$ as written above. The expectation value of the repulsive interaction term is $(5Z/4)E_{\rm Ry}$, giving a total energy of $-74.8$ eV. (Gasiorowicz demonstrates the integral, as do Fitzpatrick and Branson.)
It turns out we can do even better if we use the atomic number $Z$ in the wave function $\Psi$ as a variational parameter (that in the Hamiltonian, of course, must be left at 2). The best value turns out to be $Z = 27/16$, and that gives a better upper bound of $-77.5$ eV, just slightly higher than the experimental value. (Watch the sign: we get a lower bound for the ionization energy.) This effective nuclear charge of less than 2 presumably reflects the fact that to some extent each electron screens the nuclear charge from the other.
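The minimisation over $Z$ is easily reproduced. A minimal sympy sketch, using the standard expectation values behind the numbers quoted above (kinetic energy $2Z^2$, nuclear attraction $-8Z$ and repulsion $(5/4)Z$, all in Rydbergs):
```python
from sympy import symbols, Rational, diff, solve

Z = symbols('Z', positive=True)
E = 2*Z**2 - 8*Z + Rational(5, 4)*Z   # <H>(Z) in Rydbergs for nuclear charge 2

Zbest = solve(diff(E, Z), Z)[0]
print(Zbest)                          # 27/16
print(float(E.subs(Z, Zbest)) * 13.6) # about -77.5 eV
```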
3.3 The WKB approximation
The WKB approximation is named for G. Wentzel, H.A. Kramers and L. Brillouin, who independently developed the method in 1926. There are pre-quantum antecedents due to Jeffreys and Rayleigh, though.
We can always write the one-dimensional time-independent Schrödinger equation as
$$\frac{d^2\psi}{dx^2} = -k(x)^2\,\psi(x)$$
where $k(x) \equiv \sqrt{2m(E - V(x))}/\hbar$. We could think of the quantity $k(x)$ as a spatially-varying wavenumber ($k = 2\pi/\lambda$), though we anticipate that this can only make sense if it doesn't change too quickly with position; otherwise we can't identify a wavelength at all.
If the potential, and hence $k$, were constant, the plane waves $e^{\pm ikx}$ would be solutions. Let's see under what conditions a solution of the form
$$\psi(x) = A \exp\left(\pm i\int^x k(x')\,dx'\right)$$
might be a good approximate solution when the potential varies with position. Plugging this into the SE above, the LHS reads $-(k^2 \mp ik')\psi$. (Here and hereafter, primes denote differentiation wrt $x$, except when they indicate an integration variable.) So provided $|k'/k^2| \ll 1$, or equivalently $|\lambda'| \ll 1$, this is indeed a good solution as the second term can be ignored. And $|\lambda'| \ll 1$ does indeed mean that the wavelength is slowly varying. (One sometimes reads that what is needed is that the potential is slowly varying. But that is not a well defined statement, because $dV/dx$ is not dimensionless. For any smooth potential, at high-enough energy we will have $|\lambda'| \ll 1$. What is required is that the length-scale of variation of $\lambda$, or $k$, or $V$ (the scales are all approximately equal) is large compared with the de Broglie wavelength of the particle.)
An obvious problem with this form is that the probability current isn't constant: if we calculate it we get $|A|^2\hbar k(x)/m$. A better approximation is
$$\psi(x) = \frac{A}{\sqrt{k(x)}} \exp\left(\pm i\int^x k(x')\,dx'\right)$$
which gives a constant flux. (Classically, the probability of finding a particle in a small region is inversely proportional to the speed with which it passes through that region.) Furthermore one can show that if the error in the first approximation is $O(|\lambda'|)$, the residual error with the second approximation is $O(|\lambda'|^2)$. At first glance there is a problem with the second form when $k(x) = 0$, ie when $E = V(x)$. But near these points (the classical turning points) the whole approximation scheme is invalid anyway, because $\lambda \to \infty$ and so the potential cannot be "slowly varying" on the scale of $\lambda$.
For a region of constant potential, of course, there is no difference between the two approximations and both reduce to a plane wave, since $\int^x k(x')\,dx' = kx$.
For regions where $E < V(x)$, $k(x)$ will be imaginary and there is no wavelength as such. Instead we get real exponentials, which describe tunnelling. But defining $\lambda = 2\pi/|k|$ still, the WKB approximation will continue to be valid if $|\lambda'| \ll 1$ (where $\lambda$ should now be interpreted as a decay length).
Tunnelling and bound-state problems inevitably include regions where E ≈ V (x) and the WKB
approximation isn’t valid. This would seem to be a major problem. However if such regions
are short the requirement that the wave function and its derivative be continuous can help us
to “bridge the gap”.
For a well with hard walls at $x=a$ and $x=b$, requiring the WKB wave function to vanish at both walls gives the condition
$$\int_a^b k(x)\,dx = (n+1)\pi.$$
(The integral is positive-definite so $n$ cannot be negative.) Now $k(x)$ depends on $E$, and this condition will not hold for a general $E$, any more than the boundary conditions can be satisfied for an arbitrary $E$ in the infinite square well. Instead we obtain an expression for $E$ in terms of $n$, giving a spectrum of levels $E_0, E_1\ldots$.
Of course for the infinite square well $k = \sqrt{2mE}/\hbar$ is constant and the condition gives $k = (n+1)\pi/(b-a)$, which is exact. (Using $n+1$ rather than $n$ allows us to start with $n=0$; starting at $n=1$ is more usual for an infinite well.)
For a more general potential, outside the classically allowed region we will have decaying exponentials. In the vicinity of the turning points these solutions will not be valid, but if we approximate the potential as linear we can solve the Schrödinger equation exactly (in terms of Airy functions). Matching these to our WKB solutions in the vicinity of $x=a$ and $x=b$ gives the surprisingly simple result
$$\int_a^b k(x')\,dx' = (n + \tfrac12)\pi.$$
This is the quantisation condition for a finite well; it is different from the infinite well because the solution can leak into the forbidden region. (For a semi-infinite well, the condition is that the integral equal $(n + \tfrac34)\pi$. This is the appropriate form for the $l=0$ solutions of a spherically symmetric well.) Unfortunately we can't check this against the finite square well, because there the potential is definitely not slowly varying at the edges, nor can it be approximated as linear. But we can try the harmonic oscillator, for which the integral gives $E\pi/(\hbar\omega)$ and hence the quantisation condition gives $E = (n + \tfrac12)\hbar\omega$! The approximation was only valid for large $n$ (small wavelength) but in fact we've obtained the exact answer for all levels.
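The harmonic-oscillator integral quoted above can be checked symbolically; a minimal sympy sketch:
```python
from sympy import symbols, sqrt, integrate, simplify

x, E, m, w, hbar = symbols('x E m omega hbar', positive=True)
x0 = sqrt(2*E/(m*w**2))                   # classical turning points at x = +/- x0

k = sqrt(2*m*(E - m*w**2*x**2/2))/hbar
action = integrate(k, (x, -x0, x0))
print(simplify(action))                   # pi*E/(hbar*omega)
# setting this equal to (n + 1/2)*pi gives E = (n + 1/2)*hbar*omega exactly
```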
Griffiths 8.3
This subsection gives details of the matching process. The material in it is not examinable.
More about Airy functions can be found in section A.9.
If we can treat the potential as linear over a wide-enough region around the turning points that, at the edges of the region, the WKB approximation is valid, then we can match the WKB and exact solutions.
Consider a linear potential $V = \beta x$ as an approximation to the potential near the right-hand turning point $b$. We will scale $x = (\hbar^2/(2m\beta))^{1/3} z$ and $E = (\hbar^2\beta^2/(2m))^{1/3}\mu$, so the turning point is at $z = \mu$. Then the differential equation is $y''(z) - z\,y(z) + \mu\,y(z) = 0$ and the solution which decays as $z \to \infty$ is $y(z) = A\,{\rm Ai}(z - \mu)$. This has to be matched, for $z$ not too close to $\mu$, to the WKB solution. In these units, $k(x) = (\mu - z)^{1/2}$ and
$$\int_b^x k(x')\,dx' = \int_\mu^z (\mu - z')^{1/2}\,dz' = -\tfrac23(\mu - z)^{3/2},$$
so
$$\psi^{\rm WKB}_{x<b}(z) = \frac{B}{(\mu-z)^{1/4}}\cos\left(-\tfrac23(\mu-z)^{3/2} + \phi\right) \quad\text{and}\quad \psi^{\rm WKB}_{x>b}(z) = \frac{C}{(z-\mu)^{1/4}}\exp\left(-\tfrac23(z-\mu)^{3/2}\right).$$
(We chose the lower limit of integration to be $b$ in order that the constant of integration vanished; any other choice would just shift $\phi$.) Now the asymptotic forms of the Airy function are known:
$${\rm Ai}(z) \xrightarrow{z\to-\infty} \frac{\cos\left(\tfrac23|z|^{3/2} - \tfrac\pi4\right)}{\sqrt\pi\,|z|^{1/4}} \quad\text{and}\quad {\rm Ai}(z) \xrightarrow{z\to\infty} \frac{e^{-\frac23 z^{3/2}}}{2\sqrt\pi\,z^{1/4}}$$
so
$${\rm Ai}(z-\mu) \xrightarrow{z\to-\infty} \frac{\cos\left(\tfrac23(\mu-z)^{3/2} - \tfrac\pi4\right)}{\sqrt\pi\,(\mu-z)^{1/4}} \quad\text{and}\quad {\rm Ai}(z-\mu) \xrightarrow{z\to\infty} \frac{e^{-\frac23(z-\mu)^{3/2}}}{2\sqrt\pi\,(z-\mu)^{1/4}}$$
and these will match the WKB expressions exactly provided $B = 2C$ and $\phi = \pi/4$.
At the left-hand turning point $a$, the potential is $V = -\beta' x$ (with $\beta \ne \beta'$ in general) and the solution is $y(z) = A\,{\rm Ai}(-z + \nu)$. On the other hand the WKB integral is
$$\int_a^x k(x')\,dx' = \int_\nu^z (z' - \nu)^{1/2}\,dz' = \tfrac23(z - \nu)^{3/2},$$
which requires $\phi' = -\pi/4$. (Note that $\phi' \ne \phi$ because we have taken a different lower limit of the integral.)
So now we have two expressions for the solution, valid everywhere except in the vicinity of the boundaries,
$$\psi(x) = \frac{D}{\sqrt{k(x)}}\cos\left(\int_a^x k(x')\,dx' - \pi/4\right) \quad\text{and}\quad \psi(x) = \frac{B}{\sqrt{k(x)}}\cos\left(\int_b^x k(x')\,dx' + \pi/4\right),$$
which can be satisfied only if $D = \pm B$ and $\int_a^b k(x')\,dx' = (n + \tfrac12)\pi$, as required.
It is worth stressing that although, for a linear potential, the exact (Airy function) and WKB solutions match "far away" from the turning point, they do not do so close in. The $(z-\mu)^{-1/4}$ terms in the latter mean that they blow up, but the former are perfectly smooth. They are shown (for $\mu = 0$) in the figure, in red for the WKB and black for the exact functions. We can see they match very well so long as $|z - \mu| > 1$; in fact $z \to \infty$ is overkill!
So now we can be more precise about the conditions under which the matching is possible: we need the potential to be linear over the region $\Delta x \sim (\hbar^2/(2m\beta))^{1/3}$ where $\beta = dV/dx$. Linearity means that $\Delta V/\Delta x \approx dV/dx$ at the turning point, or $d^2V/dx^2\,\Delta x \ll dV/dx$ (assuming the curvature is the dominant non-linearity, as is likely if $V$ is smooth). For the harmonic oscillator, $d^2V/dx^2\,\Delta x/(dV/dx) = 2(\hbar\omega/E)^{2/3}$, which is only much less than 1 for very large $n$, making the exact result even more surprising!
3.3.2 WKB approximation for tunnelling
Shankar 16.2; Gasiorowicz Supplement 4B; Griffiths 8.2
For the WKB approximation to be applicable to tunnelling through a barrier, we need as always $|\lambda'| \ll 1$. In practice that means that the barrier function is reasonably smooth and that $E \ll V(x)$. Now it would of course be possible to do a careful calculation, writing down the WKB wave function in the three regions (left of the barrier, under the barrier and right of the barrier), linearising in the vicinity of the turning points in order to match the wave function and its derivatives at both sides. This however is a tiresomely lengthy task, and we will not attempt it. Instead, recall the result for a high, wide square barrier; the transmission coefficient in the limit $e^{-2\kappa L} \ll 1$ is given by
$$T = \frac{16 k_1 k_2 \kappa^2}{(\kappa^2 + k_1^2)(\kappa^2 + k_2^2)}\, e^{-2\kappa L},$$
where $k_1$ and $k_2$ are the wavenumbers on either side of the barrier (width $L$, height $V$) and $\kappa = \sqrt{2m(V - E)}/\hbar$. (See the revision notes.) The prefactor is not negligible, but it is weakly energy-dependent, whereas the $e^{-2\kappa L}$ term is very strongly energy-dependent. If we plot $\log T$ against energy, the form will be essentially ${\rm const} - 2\kappa(E)L$, and so we can still make predictions without worrying about the constant.
For a barrier which is not constant, the WKB approximation will yield a similar expression for the tunnelling probability:
$$T = [\text{prefactor}] \times \exp\left(-2\int_a^b \kappa(x')\,dx'\right),$$
where $\kappa(x) \equiv \sqrt{2m(V(x) - E)}/\hbar$. The WKB approximation is like treating a non-square barrier as a succession of square barriers of different heights. The need for $V(x)$ to be slowly varying is then due to the fact that we are slicing the barrier sufficiently thickly that $e^{-2\kappa\Delta L} \ll 1$ for each slice.
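For a barrier with no closed-form integral, the exponent is easily evaluated numerically. A minimal sketch with scipy; the Gaussian barrier and the units $\hbar = m = 1$ are illustrative assumptions:
```python
import numpy as np
from scipy.integrate import quad

def wkb_transmission(V, E, a, b):
    """exp(-2 * integral of kappa) between turning points a and b, with hbar = m = 1."""
    kappa = lambda x: np.sqrt(2*(V(x) - E))
    integral, _ = quad(kappa, a, b)
    return np.exp(-2*integral)

V = lambda x: 5*np.exp(-x**2)          # illustrative smooth barrier of height 5
E = 2.0
b = np.sqrt(np.log(5/E)); a = -b       # turning points where V(x) = E
print(wkb_transmission(V, E, a, b))    # the strongly energy-dependent factor in T
```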
The classic application of the WKB approach to tunnelling is alpha decay. The potential here is
a combination of an attractive short-range nuclear force and the repulsive Coulomb interaction
between the alpha particle and the daughter nucleus. Unstable states have energies greater than
zero, but they are long-lived because they are classically confined by the barrier. (It takes some
thought to see that a repulsive force can cause quasi-bound states to exist!) The semi-classical
model is of a pre-formed alpha particle of energy E bouncing back and forth many times (f )
per second, with a probability of escape each time given by the tunnelling probability, so the
decay rate is given by 1/τ = f T . Since we can’t calculate f with any reliability we would be
silly to worry about the prefactor in T , but the primary dependence of the decay rate on the
energy of the emitted particle will come from the easily-calculated exponential.
In the figure above the value of $a$ is roughly the nuclear radius $R$, and $b$ is given by $V_C(b) = E$, with the Coulomb potential $V_C(r) = zZ\hbar c\alpha/r$. ($Z$ is the atomic number of the daughter nucleus and $z = 2$ that of the alpha.) The integral in the exponent can be done (see Gasiorowicz Supplement 4B for details¹; the substitution $r = b\cos^2\theta$ is used), giving in the limit $b \gg a$
$$2\int_R^b \kappa(r)\,dr = 2\pi z Z \alpha\sqrt{\frac{mc^2}{2E}} \approx 3.96\,\frac{Z}{\sqrt{E({\rm MeV})}} \quad\Rightarrow\quad \log_{10}\tau = {\rm const} + 1.72\,\frac{Z}{\sqrt{E({\rm MeV})}}.$$
¹But note that there is a missing minus sign between the two terms in square brackets in eq. 4B-4.
This is the Geiger-Nuttall law. Data for the lifetimes of long-lived isotopes (those with low-energy alphas) fit such a functional form well, but with 1.61 rather than 1.72. In view of the fairly crude approximations made, this is a pretty good result. Note it is independent of the nuclear radius because we used $b \gg R$; we could have kept the first correction, proportional to $\sqrt{R/b}$, to improve the result. Indeed the first estimates of nuclear radii came from exactly such studies.
A version of the classic figure of the results is given below on the left.² The marked energy scale is non-linear; the linear variable on the $x$-axis is actually $-E^{-1/2}$, so that the slope is negative. Straight lines join isotopes of the same element. On the right is a more recent plot³ of $\log\tau$ against $E^{-1/2}$, showing deviations from linearity at the high-energy (left-hand) end.
²Found at https://web-docs.gsi.de/~wolle/TELEKOLLEG/KERN/LECTURE/Fraser/L14.pdf, which may not be the original source.
³Astier et al, Eur. Phys. J. A46 (2010) 165.
4. Approximate methods II:
Time-independent perturbation theory
Perturbation theory is the most widely used approximate method. It requires we have a set of
exact solutions to a Hamiltonian which is close to the realistic one.
Perturbation theory is applicable when the Hamiltonian $\hat H$ can be split into two parts, with the first part being exactly solvable and the second part being small in comparison. The first part is always written $\hat H^{(0)}$, and we will denote its eigenstates by $|n^{(0)}\rangle$ and energies by $E_n^{(0)}$ (with wave functions $\phi_n^{(0)}$). These we know, and for now assume to be non-degenerate. The eigenstates and energies of the full Hamiltonian are denoted $|n\rangle$ and $E_n$, and the aim is to find successively better approximations to these. The zeroth-order approximation is simply $|n\rangle = |n^{(0)}\rangle$ and $E_n = E_n^{(0)}$, which is just another way of saying that the perturbation is small and at a crude enough level of approximation we can ignore it entirely.
Nomenclature for the perturbing Hamiltonian $\hat H - \hat H^{(0)}$ varies. $\delta V$, $\hat H^{(1)}$ and $\lambda\hat H^{(1)}$ are all common. It usually is a perturbing potential but we won't assume so here, so we won't use the first. The second and third differ in that the third has explicitly identified a small, dimensionless parameter (eg $\alpha$ in EM), so that the residual $\hat H^{(1)}$ isn't itself small. With the last choice, our expressions for the eigenstates and energies of the full Hamiltonian will be explicitly power series in $\lambda$, so $E_n = E_n^{(0)} + \lambda E_n^{(1)} + \lambda^2 E_n^{(2)} + \ldots$ etc. With the second choice the small factor is hidden in $\hat H^{(1)}$, and is implicit in the expansion which then reads $E_n = E_n^{(0)} + E_n^{(1)} + E_n^{(2)} + \ldots$. In this case one has to remember that anything with a superscript $(1)$ is first order in this implicit small factor, or more generally the superscript $(m)$ denotes something which is $m$th order. For the derivation of the equations we will retain an explicit $\lambda$, but thereafter we will set it equal to one to revert to the other formulation. We will take $\lambda$ to be real so that $\hat H^{(1)}$ is Hermitian.
We start with the master equation
$$\left(\hat H^{(0)} + \lambda\hat H^{(1)}\right)|n\rangle = E_n|n\rangle.$$
Then we substitute in $E_n = E_n^{(0)} + \lambda E_n^{(1)} + \lambda^2 E_n^{(2)} + \ldots$ and $|n\rangle = |n^{(0)}\rangle + \lambda|n^{(1)}\rangle + \lambda^2|n^{(2)}\rangle + \ldots$ and expand. Then since $\lambda$ is a free parameter, we have to match terms on each side with the same powers of $\lambda$, to get
$$\hat H^{(0)}|n^{(0)}\rangle = E_n^{(0)}|n^{(0)}\rangle$$
$$\hat H^{(0)}|n^{(1)}\rangle + \hat H^{(1)}|n^{(0)}\rangle = E_n^{(0)}|n^{(1)}\rangle + E_n^{(1)}|n^{(0)}\rangle$$
$$\hat H^{(0)}|n^{(2)}\rangle + \hat H^{(1)}|n^{(1)}\rangle = E_n^{(0)}|n^{(2)}\rangle + E_n^{(1)}|n^{(1)}\rangle + E_n^{(2)}|n^{(0)}\rangle$$
We have to solve these sequentially. The first we assume we have already done. The second will yield $E_n^{(1)}$ and $|n^{(1)}\rangle$. Once we know these, we can use the third equation to yield $E_n^{(2)}$ and $|n^{(2)}\rangle$, and so on. The expressions for the changes in the states, $|n^{(1)}\rangle$ etc, will make use of the fact that the unperturbed states $\{|n^{(0)}\rangle\}$ form a basis, so we can write
$$|n^{(1)}\rangle = \sum_m c_m |m^{(0)}\rangle = \sum_m \langle m^{(0)}|n^{(1)}\rangle\, |m^{(0)}\rangle.$$
In each case, to solve for the energy we take the inner product with $\langle n^{(0)}|$ (i.e. the same state) whereas for the wave function, we use $\langle m^{(0)}|$ (another state). We use, of course,
$$\langle m^{(0)}|\hat H^{(0)} = E_m^{(0)}\langle m^{(0)}| \qquad\text{and}\qquad \langle m^{(0)}|n^{(0)}\rangle = \delta_{mn}.$$
At first order we get
$$E_n^{(1)} = \langle n^{(0)}|\hat H^{(1)}|n^{(0)}\rangle \tag{4.1}$$
$$\langle m^{(0)}|n^{(1)}\rangle = \frac{\langle m^{(0)}|\hat H^{(1)}|n^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}} \quad (m \ne n). \tag{4.2}$$
The second equation tells us the overlap of |n(1) i with all the other |m(0) i, but not with |n(0) i.
This is obviously not constrained by the eigenvalue equation, because we can add any amount
of |n(0) i and the equations will still be satisfied. However we need the state to continue to be
normalised, and when we expand hn|ni = 1 in powers of λ we find that hn(0) |n(1) i is required
to be imaginary. This is just like a phase rotation of the original state and we can ignore it.
(Recall that an infinitesimal change in a unit vector has to be at right angles to the original.)
Hence
$$|n^{(1)}\rangle = \sum_{m\ne n} \frac{\langle m^{(0)}|\hat H^{(1)}|n^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}}\, |m^{(0)}\rangle. \tag{4.3}$$
At second order the energy shift is
$$E_n^{(2)} = \langle n^{(0)}|\hat H^{(1)}|n^{(1)}\rangle = \sum_{m\ne n} \frac{\left|\langle m^{(0)}|\hat H^{(1)}|n^{(0)}\rangle\right|^2}{E_n^{(0)} - E_m^{(0)}}. \tag{4.4}$$
If the spectrum of $\hat H^{(0)}$ is degenerate, there may be a problem with these expressions because the denominator can vanish, making the corresponding term infinite. In fact nothing that we have done so far is directly valid in that case, and we have to use "degenerate perturbation theory" instead. For now we assume that for any two states $|m^{(0)}\rangle$ and $|n^{(0)}\rangle$, either $E_n^{(0)} - E_m^{(0)} \ne 0$ (non-degenerate) or $\langle m^{(0)}|\hat H^{(1)}|n^{(0)}\rangle = 0$ (the states are not mixed by the perturbation).
The expression for the second-order shift in the wave function |n(2) i can also be found but it
is tedious. The main reason we wanted |n(1) i was to find En(2) anyway, and we’re not planning
to find En(3) ! Note that though the expression for En(1) is generally applicable, those for |n(1) i
and En(2) would need some modification if the Hamiltonian had continuum eigenstates as well
as bound states (eg hydrogen atom). Provided the state |ni is bound, that is just a matter of
integrating rather than summing. This restriction to bound states is why Mandl calls chapter 7
“bound-state perturbation theory”. The perturbation of continuum states (eg scattering states)
is usually dealt with separately.
Note that the equations above hold whether we have identified an explicit small parameter $\lambda$ or not. So from now on we will set $\lambda$ to one, assume that $\hat H^{(1)}$ has an implicit small parameter within it, and write $E_n = E_n^{(0)} + E_n^{(1)} + E_n^{(2)} + \ldots$; the expressions above for $E_n^{(1,2)}$ and $|n^{(1)}\rangle$ are still valid.
4.1.2 Perturbed infinite square well
We treat this as a perturbation on the flat-bottomed well, so $\hat H^{(1)} = V_0$ for $a/2 < x < a$ and zero elsewhere.
The ground-state unperturbed wave function is $\psi_0^{(0)} = \sqrt{\frac2a}\sin\frac{\pi x}{a}$, with unperturbed energy $E_0^{(0)} = \pi^2\hbar^2/(2ma^2)$. A "low" step will mean $V_0 \ll E_0^{(0)}$. Then we have
$$E_0^{(1)} = \langle\psi_0^{(0)}|\hat H^{(1)}|\psi_0^{(0)}\rangle = \frac2a\int_{a/2}^a V_0 \sin^2\frac{\pi x}{a}\,dx = \frac{V_0}{2}.$$
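The integral is trivial to confirm; a minimal sympy sketch:
```python
from sympy import symbols, sin, pi, integrate, sqrt

x, a, V0 = symbols('x a V_0', positive=True)
psi0 = sqrt(2/a)*sin(pi*x/a)              # unperturbed ground state

E1 = integrate(V0*psi0**2, (x, a/2, a))   # H^(1) = V0 only on the right half
print(E1)                                 # V_0/2
```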
[Plot: the ground-state energy as a function of the step height $V_0$, comparing the exact result with first-order perturbation theory.]
We can also plot the exact wave functions for different step sizes, and see that for $V_0 = 10$ (the middle picture, well beyond the validity of first-order perturbation theory) the wave function is significantly different from a simple sinusoid.
[Plots: the exact wave functions and the stepped potential for three values of $V_0$.]
4.1.3 Perturbed harmonic oscillator
Here the unperturbed system is the harmonic oscillator, with energies $E_n^{(0)} = (n + \tfrac12)\hbar\omega$, and we take the perturbation to be $\hat H^{(1)} = \lambda\hat x^2$.
Recalling that in terms of creation and annihilation operators (see section 1.6), $\hat x = (x_0/\sqrt2)(\hat a + \hat a^\dagger)$, with $[\hat a, \hat a^\dagger] = 1$ and $x_0 = \sqrt{\hbar/(m\omega)}$, we have
$$E_n^{(1)} = \langle n^{(0)}|\hat H^{(1)}|n^{(0)}\rangle = \frac{x_0^2\lambda}{2}\langle n^{(0)}|(\hat a^\dagger)^2 + \hat a^2 + 2\hat a^\dagger\hat a + 1|n^{(0)}\rangle = \frac{\lambda}{m\omega^2}\,\hbar\omega\left(n + \tfrac12\right).$$
The first-order change in the wave function is also easy to compute, as $\langle m^{(0)}|\hat H^{(1)}|n^{(0)}\rangle = 0$ unless $m = n \pm 2$. Thus
$$|n^{(1)}\rangle = \sum_{m\ne n} \frac{\langle m^{(0)}|\hat H^{(1)}|n^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}}\,|m^{(0)}\rangle = \frac{\hbar\lambda}{2m\omega}\left(\frac{\sqrt{(n+1)(n+2)}}{-2\hbar\omega}\,|(n+2)^{(0)}\rangle + \frac{\sqrt{n(n-1)}}{2\hbar\omega}\,|(n-2)^{(0)}\rangle\right).$$
We can now also calculate the second-order shift in the energy:
$$E_n^{(2)} = \langle n^{(0)}|\hat H^{(1)}|n^{(1)}\rangle = \sum_{m\ne n} \frac{\left|\langle m^{(0)}|\hat H^{(1)}|n^{(0)}\rangle\right|^2}{E_n^{(0)} - E_m^{(0)}} = \left(\frac{\hbar\lambda}{2m\omega}\right)^2\left(\frac{(n+1)(n+2)}{-2\hbar\omega} + \frac{n(n-1)}{2\hbar\omega}\right) = -\tfrac12\left(\frac{\lambda}{m\omega^2}\right)^2\hbar\omega\left(n + \tfrac12\right).$$
We can see a pattern emerging, and of course this is actually a soluble problem, as all that the perturbation has done is change the frequency. Defining $\omega' = \omega\sqrt{1 + 2\lambda/(m\omega^2)}$, we see that the exact solution is
$$E_n = \left(n + \tfrac12\right)\hbar\omega' = \left(n + \tfrac12\right)\hbar\omega\left(1 + \frac{\lambda}{m\omega^2} - \tfrac12\left(\frac{\lambda}{m\omega^2}\right)^2 + \ldots\right)$$
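The expansion is easy to verify; a minimal sympy sketch:
```python
from sympy import symbols, sqrt, series

lam, m, w = symbols('lambda m omega', positive=True)
u = lam/(m*w**2)

# omega'/omega = sqrt(1 + 2u) = 1 + u - u**2/2 + ...
print(series(sqrt(1 + 2*u), lam, 0, 3))
# each level (n + 1/2)*hbar*omega is scaled by this factor, reproducing
# the first- and second-order shifts found above
```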
4.2 Degenerate perturbation theory
None of the formalism that we have developed so far works "out of the box" if $\hat H^{(0)}$ has degenerate eigenstates. To be precise, it is still fine for the non-degenerate states, but it fails to work in a subspace of degenerate states if $\hat H^{(1)}$ is not also diagonal in this subspace. We can see that in Eq. (4.2), where the vanishing energy denominator clearly signals a problem, but even Eq. (4.1) is wrong. The reason is simple: we assumed from the start that the shifts in the states due to the perturbation would be small. But suppose $|1^{(0)}\rangle$ and $|2^{(0)}\rangle$ are degenerate eigenstates of $\hat H^{(0)}$; then so are $\sqrt{\tfrac12}\left(|1^{(0)}\rangle \pm |2^{(0)}\rangle\right)$. Now the eigenstates of the full Hamiltonian $|1\rangle$ and $|2\rangle$ are not degenerate, but which of the possible choices for the eigenstates of $\hat H^{(0)}$ are they close to? If for example it is the latter (as is often the case in simple examples) then even a tiny perturbation $\hat H^{(1)}$ will induce a big change in the eigenstates.
Consider a potential in two dimensions which is circularly symmetric: $V(x,y) = V_0(r)$. The ground state will be non-degenerate, but all higher states, of energy $E_n$, will be doubly degenerate; a possible but not unique choice has wave functions $\psi_n(r)\sin(n\phi)$ and $\psi_n(r)\cos(n\phi)$ (where $\phi$ is the angle in the plane from the $x$-axis); the form of $\psi_n(r)$ will of course depend on $V_0(r)$. For $n = 1$, the probability-density maps of these two states look like the last two figures of
the top line of figure 4.1. Now imagine a perturbation which is not circularly symmetric, say
λx2 . The potential now rises more steeply along the x-axis than along the y-axis (the top
line of figure 4.1 shows a contour map of the—somewhat exaggeratedly—deformed potential).
This will clearly lift the degeneracy, because the first state, which vanishes along the x-axis,
will “feel” the perturbation less and have a lower energy than the second. The new energy
eigenstates will be similar to the original ones, and the first order energy shifts will indeed be
given by the usual formula: a naı̈ve application of perturbation theory is fine.
But what if the perturbation is $\tfrac12\lambda(x+y)^2$? (See the bottom line of figure 4.1 for the new
contour map.) This is symmetric between x and y, cos φ and sin φ: do we conclude that the
states remain degenerate? No: this is really the same problem as before with a more and a less
steep direction, the only difference being that the orientation is rotated by 45◦ . And so there
will again be a pair of solutions split in energy according to whether the probability of being
found along the 45◦ line is larger or smaller. We expect them to look like those shown in the
bottom line of figure 4.1. But these are not close to our original pairs of solutions; we can’t
get from one to the other by perturbation theory. Does that mean that perturbation theory is
Figure 4.1: Far left: contour plots of unperturbed potential. Left: contour plots of perturbed
potentials. Right and far-right: density plots of |ψ|2 for lower- and higher-energy eigenstates
with n = 1, for each perturbed potential. See text for explanation.
useless in this case? Surely not: effectively all we have done between the two cases is rotate the
axes by 45◦ ! The resolution comes from recognising that our initial choice of eigenstates of V0
was not unique, and with hindsight was not the appropriate starting point for this perturbation.
In fact the circular symmetry of the original potential means that for any angle $\alpha$, the alternative pair of wave functions $\psi_n(r)\cos n(\phi - \alpha)$ and $\psi_n(r)\sin n(\phi - \alpha)$ is an equally good choice of orthogonal unperturbed eigenstates. (We can rotate the $x$-axis at will.) For the perturbation $\tfrac12\lambda(x + y)^2$, the choice of $\alpha = 45°$ is the right one to ensure that the perturbed solutions are close to the unperturbed ones, and the usual formulae can be used if we choose that basis.
To sum up: the perturbations broke the symmetry of the original problem and hence took away
the original freedom we had to choose our basis. If we chose inappropriately initially, we need
to choose again once we know the form of the perturbation we are dealing with.
Sometimes the right choice of unperturbed eigenstates is immediately obvious, as in the case above, or can be deduced by considerations of symmetry. If not, then starting with our initial choice $\{|n^{(0)}\rangle\}$, we need to find a new set of states in the degenerate space, linear combinations of our initial choice, which we will call $\{|n'^{(0)}\rangle\}$, and which crucially are not mixed by the perturbation; in other words,
$$\langle m'^{(0)}|\hat H^{(1)}|n'^{(0)}\rangle = 0 \quad\text{for } m \ne n \text{ within the degenerate subspace.}$$
In the new basis $\hat H^{(1)}$ is diagonal in the degenerate subspace. That will ensure that all state and energy shifts are small, as assumed in the set-up of perturbation theory.
Thus we write down the matrix which is the representation of Ĥ (1) in the degenerate subspace
of the originally-chosen basis, and find the eigenvectors. These, being linear combinations of
the old basis states, are still eigenstates of Ĥ (0) and so are equally good new basis states. But
because they are eigenstates of Ĥ (1) in this restricted space, they are not mixed by Ĥ (1) . In the
process we will probably find the eigenvalues too, and these are the corresponding first-order
energy shifts.
We then proceed almost exactly as in the non-degenerate case, having replaced (say) $|1^{(0)}\rangle$ and $|2^{(0)}\rangle$ with the new linear combinations which we can call $|1'^{(0)}\rangle$ and $|2'^{(0)}\rangle$. The expressions for the energy and state shifts, using the new basis, are as before, Eqs. (4.1,4.2,4.3,4.4), except instead of summing over all states $m \ne n$, we sum over all states for which $E_m^{(0)} \ne E_n^{(0)}$. As already mentioned, the first-order energy shifts of the originally-degenerate states, $\langle n'^{(0)}|\hat H^{(1)}|n'^{(0)}\rangle$, are just the eigenvalues of the representation of $\hat H^{(1)}$ in the degenerate subspace.
For example suppose $\hat H^{(0)}$ has many eigenstates but two, $|1^{(0)}\rangle$ and $|2^{(0)}\rangle$, are degenerate, and that $\hat H^{(1)}|1^{(0)}\rangle = \beta|2^{(0)}\rangle + \ldots$ and $\hat H^{(1)}|2^{(0)}\rangle = \beta|1^{(0)}\rangle + \ldots$, with $\beta$ real and $\ldots$ referring to $|3^{(0)}\rangle$ and higher. Then in the degenerate subspace
$$\hat H^{(1)} \longrightarrow \begin{pmatrix} \langle 1^{(0)}|\hat H^{(1)}|1^{(0)}\rangle & \langle 1^{(0)}|\hat H^{(1)}|2^{(0)}\rangle \\ \langle 2^{(0)}|\hat H^{(1)}|1^{(0)}\rangle & \langle 2^{(0)}|\hat H^{(1)}|2^{(0)}\rangle \end{pmatrix} = \beta\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$
whose eigenvectors are $\sqrt{\tfrac12}\binom{1}{-1}$ and $\sqrt{\tfrac12}\binom{1}{1}$, with eigenvalues $\mp\beta$. With these, we define new eigenstates:
$$|1'^{(0)}\rangle = \sqrt{\tfrac12}\left(|1^{(0)}\rangle - |2^{(0)}\rangle\right), \qquad |2'^{(0)}\rangle = \sqrt{\tfrac12}\left(|1^{(0)}\rangle + |2^{(0)}\rangle\right).$$
Now that we have the appropriate basis, we can apply Eq. (4.1) to obtain
$$E_{1'}^{(1)} = \langle 1'^{(0)}|\hat H^{(1)}|1'^{(0)}\rangle = -\beta, \qquad E_{2'}^{(1)} = \langle 2'^{(0)}|\hat H^{(1)}|2'^{(0)}\rangle = \beta.$$
The expressions for the first-order state changes $|n'^{(1)}\rangle$ and second-order energy changes $E_{n'}^{(2)}$ are just given by Eqs. (4.3,4.4) but with primed states where appropriate; in these the higher states will of course enter. However since $\langle 2'^{(0)}|\hat H^{(1)}|1'^{(0)}\rangle = 0$ by construction, $|2'^{(0)}\rangle$ does not appear in the sum over states and there is no problem with vanishing denominators.
We should note that, unless the perturbation does not mix the degenerate states with the higher states, we have not solved the problem exactly. At this stage we have just found the correct approximate eigenstates and the first-order energy shifts. Of course that is often all we want.
4.2.1 Example of degenerate perturbation theory
The right-hand plot above shows the partially degenerate case discussed below.
Now we consider a case with two degenerate states, with $E_1^{(0)} = E_2^{(0)} = E_0$ and $E_3^{(0)} = 2E_0$. We note that $|1^{(0)}\rangle$ and $|2^{(0)}\rangle$ are just two of an infinite set of eigenstates with the same energy $E_1^{(0)}$, since any linear combination of them is another eigenstate. We have to make the choice which diagonalises $\hat H^{(1)}$ in this subspace:
$$\hat H^{(1)} \longrightarrow a\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},$$
whose eigenstates are $\sqrt{\tfrac12}\binom{1}{-1}$ and $\sqrt{\tfrac12}\binom{1}{1}$, with eigenvalues 0 and $2a$. So
$$|1'^{(0)}\rangle = \frac{1}{\sqrt2}\left(|1^{(0)}\rangle - |2^{(0)}\rangle\right) \qquad\text{and}\qquad |2'^{(0)}\rangle = \frac{1}{\sqrt2}\left(|1^{(0)}\rangle + |2^{(0)}\rangle\right).$$
These new states don't diagonalise the full $\hat H^{(1)}$, of course. To go further we need the matrix elements $\langle 3^{(0)}|\hat H^{(1)}|1'^{(0)}\rangle = 0$ and $\langle 3^{(0)}|\hat H^{(1)}|2'^{(0)}\rangle = \sqrt2\,a$. Then
$$E_{1'}^{(1)} = 0, \qquad E_{2'}^{(1)} = 2a, \qquad E_3^{(1)} = a,$$
$$|1'^{(1)}\rangle = 0, \qquad |2'^{(1)}\rangle = -\frac{\sqrt2\,a}{E_0}|3^{(0)}\rangle \longrightarrow -\frac{\sqrt2\,a}{E_0}\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \qquad |3^{(1)}\rangle = \frac{\sqrt2\,a}{E_0}|2'^{(0)}\rangle \longrightarrow \frac{\sqrt2\,a}{E_0}\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix},$$
$$E_{1'}^{(2)} = 0, \qquad E_{2'}^{(2)} = -\frac{2a^2}{E_0}, \qquad E_3^{(2)} = \frac{2a^2}{E_0}$$
(the column vectors are in the new basis $\{|1'^{(0)}\rangle, |2'^{(0)}\rangle, |3^{(0)}\rangle\}$).
In this particular case it is easy to show that $|1'^{(0)}\rangle$ is actually an eigenstate of $\hat H^{(1)}$, so there will be no change to any order. We can check our results against the exact eigenvalues and see that they are correct, which is left as an exercise for the reader; for that purpose it is useful to write $\hat H^{(1)}$ in the new basis ($\hat H^{(0)}$ of course being unchanged) as:
$$\hat H^{(1)} \longrightarrow a\begin{pmatrix} 0 & 0 & 0 \\ 0 & 2 & \sqrt2 \\ 0 & \sqrt2 & 1 \end{pmatrix}.$$
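The exercise can also be done numerically. In the original basis one concrete realisation consistent with the matrix elements above is $\hat H^{(1)} = a$ times the all-ones matrix; a minimal numpy sketch with illustrative values $E_0 = 1$, $a = 0.01$:
```python
import numpy as np

E0, a = 1.0, 0.01                       # illustrative values, a << E0
H0 = np.diag([E0, E0, 2*E0])
H1 = a*np.ones((3, 3))                  # mixes the two degenerate states

exact = np.sort(np.linalg.eigvalsh(H0 + H1))
pt = np.sort([E0,                       # |1'> is an exact eigenstate
              E0 + 2*a - 2*a**2/E0,     # E_2' through second order
              2*E0 + a + 2*a**2/E0])    # E_3 through second order
print(exact)
print(pt)                               # agreement to O(a^3)
```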
Note that the first-order shift of $|3^{(0)}\rangle$ can be written as
$$|3^{(1)}\rangle = \frac{\langle 1'^{(0)}|\hat H^{(1)}|3^{(0)}\rangle}{E_0}|1'^{(0)}\rangle + \frac{\langle 2'^{(0)}|\hat H^{(1)}|3^{(0)}\rangle}{E_0}|2'^{(0)}\rangle.$$
But we could equally have used the un-diagonalised states $|1^{(0)}\rangle$ and $|2^{(0)}\rangle$. This can be seen if we write
$$|3^{(1)}\rangle = \frac{1}{E_0}\left(|1'^{(0)}\rangle\langle 1'^{(0)}| + |2'^{(0)}\rangle\langle 2'^{(0)}|\right)\hat H^{(1)}|3^{(0)}\rangle$$
and spot that the term in brackets is the identity operator in the degenerate subspace, which can equally well be written $|1^{(0)}\rangle\langle 1^{(0)}| + |2^{(0)}\rangle\langle 2^{(0)}|$. Of course for a problem in higher dimensions, there would be other terms coming from the non-degenerate states $|m^{(0)}\rangle$ as well.
(Compared to the section on addition of orbital and spin angular momenta, the only difference
is that in calculating matrix elements of the states |n, l, ml i, there is a radial integral to be done
as well as the angular one.) With the pure Coulomb potential, we can use whichever basis we
like.
The diagrams above show corrections to the simple Coulomb force, which would be represented by the exchange of a single photon between the proton and the electron. The most notable effect on the spectrum of hydrogen is to lift the remaining degeneracy between the $2p_{1/2}$ and $2s_{1/2}$ states, so that the latter is higher by $4.4\times10^{-6}$ eV (the Lamb shift).
Below the various corrections to the energy levels of hydrogen are shown schematically. The
gap between the n = 1 and n = 2 shells is suppressed, and the Lamb and hyperfine shifts are
exaggerated in comparison with the fine-structure. The effect of the last two on the 2 p3/2 level
is not shown.
4.4 The Zeeman effect: hydrogen in an external magnetic field
(Shankar 14.5); Mandl 7.5; Griffiths 6.4; Gasiorowicz 12.3
(Since we will not ignore spin, this whole section is about the so-called anomalous Zeeman effect. The so-called normal Zeeman effect cannot occur for hydrogen, but is a special case which pertains in certain multi-electron atoms for which the total spin is zero.)
With an external magnetic field along the $z$-axis, the perturbing Hamiltonian is $\hat H^{(1)} = -\boldsymbol\mu\cdot\mathbf B = (\mu_B B/\hbar)(\hat L_z + 2\hat S_z)$. The factor of 2 multiplying the spin is of course the famous g-factor for spin, as predicted by the Dirac equation. Clearly this is diagonalised in the $\{|n,l,m_l,m_s\rangle\}$ basis ($s = \frac12$ suppressed in the labelling as usual). Then $E^{(1)}_{n l m_l m_s} = \mu_B B(m_l + 2m_s)$. If, for example, $l = 2$ there are 7 possible values of $m_l + 2m_s$ between $-3$ and $3$, with $-1$, $0$ and $1$ being doubly degenerate ($5\times2 = 10$ states in all).
This is the "strong-field Zeeman" or "Paschen-Back" effect. It is applicable if the magnetic field is strong enough that we can ignore the fine structure discussed in the last section. In hydrogen, that means $B \gg 10^{-4}\,{\rm eV}/\mu_B \sim 2$ T. But for a weak field, $B \ll 2$ T, the fine structure effects will be stronger, so we will consider them part of $\hat H^{(0)}$ for the Zeeman problem; our basis is then $\{|n,l;j,m_j\rangle\}$ and states of the same $j$ but different $l$ and $m_j$ are degenerate. This degeneracy however is not a problem, because the operator $(\hat L_z + 2\hat S_z)$ does not connect states of different $l$ or $m_j$ (since it commutes with $\hat{\mathbf L}^2$ and with $\hat J_z$). So we can use non-degenerate perturbation theory, with
$$E^{(1)}_{nljm_j} = \frac{\mu_B B}{\hbar}\langle n,l;j,m_j|\hat L_z + 2\hat S_z|n,l;j,m_j\rangle = \mu_B B\,m_j + \frac{\mu_B B}{\hbar}\langle n,l;j,m_j|\hat S_z|n,l;j,m_j\rangle.$$
If $\hat J_z$ is conserved but $\hat L_z$ and $\hat S_z$ are not, the expectation values of the latter two might be expected to be proportional to the first, modified by the average degree of alignment: $\langle\hat S_z\rangle = \hbar m_j\,\langle\hat{\mathbf S}\cdot\hat{\mathbf J}\rangle/\langle\hat{\mathbf J}^2\rangle$. (This falls short of a proof but is in fact correct, and follows from the Wigner-Eckart theorem as explained in section 2.6. A similar expression holds for $\hat L_z$. A semi-classical derivation may be found here.) Using $2\hat{\mathbf S}\cdot\hat{\mathbf J} = \hat{\mathbf S}^2 + \hat{\mathbf J}^2 - \hat{\mathbf L}^2$ gives
$$E^{(1)}_{nljm_j} = \mu_B B\,m_j\left(1 + \frac{j(j+1) - l(l+1) + s(s+1)}{2j(j+1)}\right) = \mu_B B\,m_j\,g_{jls}.$$
Of course for hydrogen $s(s+1) = \frac34$, but the expression above, which defines the Landé g factor, is actually more general and hence I've left it with an explicit $s$. For hydrogen, $j = l \pm \frac12$ and so $g = 1 \pm \frac{1}{2l+1}$.
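The Landé factor is simple enough to tabulate with a few lines of Python; a minimal sketch (the function name is just illustrative):
```python
from fractions import Fraction

def lande_g(j, l, s):
    """Landé g-factor: 1 + [j(j+1) - l(l+1) + s(s+1)] / [2 j(j+1)]."""
    j, l, s = map(Fraction, (j, l, s))
    return 1 + (j*(j+1) - l*(l+1) + s*(s+1)) / (2*j*(j+1))

# hydrogen: j = l +/- 1/2, s = 1/2, reproducing g = 1 +/- 1/(2l+1)
print(lande_g(Fraction(3, 2), 1, Fraction(1, 2)))  # 4/3 for p_{3/2}
print(lande_g(Fraction(1, 2), 1, Fraction(1, 2)))  # 2/3 for p_{1/2}
```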
Thus states of a given j (already no longer degenerate due to fine-structure effects) are further
split into (2j + 1) equally-spaced levels. Since spectroscopy involves observing transitions
between two states, both split but by different amounts, the number of spectral lines can be
quite large.
For heavier (but not too heavy) atoms the fine structure splitting of levels is much greater than
in hydrogen, and the weak-field Zeeman effect is the norm. The expression above can be used
for the Landé g-factor, using the quantum numbers J, L and S for the atom as a whole.
For protons and neutrons the complex internal dynamics of the quarks leads to g-factors,
and hence magnetic moments, which are quite different from 2 (for the proton) and 0 (for the
uncharged neutron); see comments in 2.5.3. The magnetic moments of nuclei with one unpaired
nucleon in an orbital of specified l and j can then be estimated analogously to the procedure
above. (These estimates give the “Schmidt limits” which will be discussed in PHYS40302.)
4.5 The Stark effect: hydrogen in an external electric field
Shankar 17.2,3; Gasiorowicz 11.3; (Griffiths problems 6.35,36)
Summary: In this case we have to get to grips with degenerate and beyond-first-order perturbation theory.
In this section we consider the energy shifts of the levels of hydrogen in an external electric field, taken to be along the $z$-axis: $\mathbf E = \mathcal E\,\mathbf e_z$ (we use $\mathcal E$ for the electric field strength to distinguish it from the energy). We will work in the strong-field limit and ignore fine structure; furthermore the dynamics are then independent of the spin so we can ignore $m_s$; the unperturbed eigenstates can be taken to be $|n,l,m_l\rangle$.
The perturbing Hamiltonian is $\hat H^{(1)} = |e|\mathcal E z$. (In this section we will not write $\hat z$ for fear of confusion with a unit vector.) Now it is immediately obvious that, for any state, $\langle n,l,m_l|z|n,l,m_l\rangle = 0$: the probability density is symmetric under reflection in the $xy$-plane, but $z$ is antisymmetric. So for the ground state, the first-order energy shift vanishes. (We will return to excited states, but think now about why we can't conclude the same for them.) This is not surprising, because an atom of hydrogen in its ground state has no electric dipole moment: there is no $-\mathbf p\cdot\mathbf E$ term to match the $-\boldsymbol\mu\cdot\mathbf B$ one.
To calculate the second-order energy shift we need $\langle n,l,m_l|z|1,0,0\rangle$. We can write $z$ as $r\cos\theta$ or $\sqrt{4\pi/3}\,rY_1^0(\theta,\phi)$. The lack of dependence on $\phi$ means that $m_l$ can't change, and in addition $l$ can only change by one unit, so $\langle n,l,m|z|1,0,0\rangle = \delta_{l1}\delta_{m0}\langle n,1,0|z|1,0,0\rangle$. However this isn't the whole story: there are also states in the continuum, which we will denote $|\mathbf k\rangle$ (though these are not plane waves, since they see the Coulomb potential). So we have
$$E^{(2)}_{100} = (e\mathcal E)^2\sum_{n>1}\frac{|\langle n,1,0|z|1,0,0\rangle|^2}{E_1^{(0)} - E_n^{(0)}} + (e\mathcal E)^2\int d^3k\,\frac{|\langle\mathbf k|z|1,0,0\rangle|^2}{E_1^{(0)} - E_k^{(0)}}$$
(We use $E_1$ for $E_{100}$.) This is a compact expression, but it would be very hard to evaluate directly. We can get a crude estimate of the size of the effect by simply replacing all the denominators by $E_1^{(0)} - E_2^{(0)}$; this overestimates the magnitude of every term but the first, for which it is exact, so it will give an upper bound on the magnitude of the shift. Then (recalling $E_1^{(2)} < 0$),
$$E_1^{(2)} > \frac{(e\mathcal E)^2}{E_1^{(0)} - E_2^{(0)}}\left(\sum_{n\ge1}\sum_{l m_l}\langle 1,0,0|z|n,l,m_l\rangle\langle n,l,m_l|z|1,0,0\rangle + \int d^3k\,\langle 1,0,0|z|\mathbf k\rangle\langle\mathbf k|z|1,0,0\rangle\right)$$
$$= \frac{(e\mathcal E)^2}{E_1^{(0)} - E_2^{(0)}}\,\langle 1,0,0|z\left(\sum_{n\ge1}\sum_{l m_l}|n,l,m_l\rangle\langle n,l,m_l| + \int d^3k\,|\mathbf k\rangle\langle\mathbf k|\right)z|1,0,0\rangle$$
$$= \frac{(e\mathcal E)^2}{E_1^{(0)} - E_2^{(0)}}\,\langle 1,0,0|z^2|1,0,0\rangle = -\frac{4(e\mathcal E a_0)^2}{3E_{\rm Ry}} = -\frac{8(e\mathcal E)^2 a_0^3}{3\hbar c\alpha},$$
where we have included $n = 1$ and other values of $l$ and $m$ in the sum because the matrix elements vanish anyway, and then used the completeness relation involving all the states, bound and unbound, of the hydrogen atom. For details of this integral and the one needed for the next part, see here, p2.
There is a trick for evaluating the exact result, which gives $9/4$ rather than $8/3$ as the constant (see Shankar), so our estimate of the magnitude is fairly good. (For comparison with other ways of writing the shift, note that $(e\mathcal E)^2/(\hbar c\alpha) = 4\pi\epsilon_0\mathcal E^2$, or, in Gaussian units, just $\mathcal E^2$.)
Having argued above that the hydrogen atom has no electric dipole, how come we are getting
a finite effect at all? The answer of course is that the field polarises the atom, and the induced
dipole can then interact with the field. We have in fact calculated the polarisability of the
hydrogen atom.
Now for the first excited state. We can't conclude that the first-order shift vanishes here, of course, because of degeneracy: there are four states and $\hat H^{(1)}$ is not diagonal in the usual basis $|2,l,m_l\rangle$ (with $l = 0,1$). In fact as we argued above it only connects $|2,0,0\rangle$ and $|2,1,0\rangle$, so the states $|2,1,\pm1\rangle$ decouple and their first-order shifts do vanish. Using $\langle 2,1,0|z|2,0,0\rangle = -3a_0$, we have in this subspace (with $|2,0,0\rangle = (1,0)^\top$ and $|2,1,0\rangle = (0,1)^\top$)
$$\hat H^{(1)} = -3a_0|e|\mathcal E\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},$$
and the eigenstates are $\sqrt{1/2}\left(|2,0,0\rangle \pm |2,1,0\rangle\right)$ with eigenvalues $\mp3a_0|e|\mathcal E$. So the degenerate quartet is split into a triplet of levels (with the unshifted one doubly degenerate).
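The matrix element $\langle 2,1,0|z|2,0,0\rangle = -3a_0$ quoted above can be checked with sympy's hydrogen radial wave functions; a minimal sketch (sympy's R_nl works in units where $a_0 = 1$):
```python
from sympy import symbols, integrate, oo, sqrt
from sympy.physics.hydrogen import R_nl

r = symbols('r', positive=True)

radial = integrate(R_nl(2, 1, r) * R_nl(2, 0, r) * r**3, (r, 0, oo))
angular = 1/sqrt(3)      # from the integral of Y_1^0* cos(theta) Y_0^0 over angles
print(radial*angular)    # -3, i.e. <2,1,0|z|2,0,0> = -3 a0
```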
In reality the degeneracy of the $n = 2$ states is lifted by the fine-structure splitting; are these results then actually relevant? They will be approximately true if the field is large; at an intermediate strength both fine-structure and Stark effects should be treated together as a perturbation on the pure Coulomb states. For very weak fields degenerate perturbation theory can be applied in the space of $j = \frac12$ states ($2s_{1/2}$ and $2p_{1/2}$), which are shifted by $\pm\sqrt3\,a_0|e|\mathcal E$. The $j = \frac32$ states though have no first-order shift.
5. Quantum Measurement
Summary: This section is about nothing less important than “the nature
of reality”!
In 1935 Einstein, along with Boris Podolsky and Nathan Rosen, published a paper entitled
“Can quantum-mechanical description of physical reality be considered complete?” By this
stage Einstein had accepted that the uncertainty principle did place fundamental restrictions
on what one could discover about a particle through measurements conducted on it. The
question however was whether the measuring process actually somehow brought the properties
into being, or whether they existed all along but without our being able to determine what
they were. If the latter was the case there would be “hidden variables” (hidden from the
experimenter) and the quantum description—the wave function—would not be a complete
description of reality. Till the EPR paper came out many people dismissed the question as
undecidable, but the EPR paper put it into much sharper focus. Then in 1964 John Bell
presented an analysis of a variant of the EPR paper which showed that the question actually
was decidable. Many experiments have been done subsequently, and they have come down
firmly in favour of a positive answer to the question posed in EPR’s title.
The original EPR paper used position and momentum as the two properties which couldn’t be
simultaneously known (but might still have hidden definite values), but subsequent discussions
have used components of spin instead, and we will do the same. But I will be quite lax about
continuing to refer to “the EPR experiment”.
There is nothing counter-intuitive or unclassical about the fact that we can produce a pair of
particles whose total spin is zero, so that if we find one to be spin-up along some axis, the other
must be spin down. All the variants of the experiment to which we will refer can be considered
like this: such a pair of electrons is created travelling back-to-back at one point, and travel to
distant measuring stations where each passes through a Stern-Gerlach apparatus (an “SG”) of
a certain orientation in the plane perpendicular to the electrons’ momentum.
As I say there is nothing odd about the fact that when the two SGs have the same orientation
the two sequences recorded at the two stations are perfectly anti-correlated (up to measurement
errors). But consider the case where they are orientated at 90° with respect to each other as below. Suppose for a particular pair of electrons, we measure number 1 to be spin up in the
z-direction and number 2 to be spin down in the x-direction. Now let’s think about what would
have happened if we had instead measured the spin in the x-direction of particle 1. Surely, say
EPR, we know the answer. Since particle 2 is spin down in the x-direction, particle 1 would
have been spin up. So now we know that before it reached the detector, particle 1 was spin up
in the z-direction (because that’s what we got when we measured it) and also spin up in the
x-direction (because it is anti-correlated with particle 2 which was spin down). We have beaten
the uncertainty principle, if only retrospectively.
But of course we know we can’t construct a wave function with these properties. So is there
more to reality than the wave function? Bell’s contribution was to show that the assumption
that the electron really has definite values for different spin components—if you like, it has
an instruction set which tells it which way to go through any conceivable SG that it might
encounter—leads to testable predictions.
For Bell’s purposes, we imagine that the two measuring stations have agreed that they will set
their SG to one of 3 possible settings. Setting A is along the z-direction, setting C is along
the x direction, and setting B is at 45◦ to both. In the ideal set-up, the setting is chosen just
before the electron arrives, sufficiently late that no possible causal influence (travelling at not
more than the speed of light) can reach the other lab before the measurements are made. The
labs record their results for a stream of electrons, and then get together to classify each pair
as, for instance, (A ↑, B ↓) or (A ↑, C ↑) or (B ↑, B ↓) (the state of electron 1 being given
first). Then they look at the number of pairs with three particular classifications: (A ↑, B ↑),
(B ↑, C ↑) and (A ↑, C ↑). Bell’s inequality says that, if the way the electrons will go through
any given orientation is set in advance,
N (A ↑, B ↑) + N (B ↑, C ↑) ≥ N (A ↑, C ↑)
This is obvious from the diagram below, in which the union of the blue and green sets fully
contains the red set.
A logical proof is as follows. Suppose every member of a population either has or does not have each of three properties A, B and C, and write e.g. $\bar A$ for the absence of A. Then
$$N(A,\bar B) = N(A,\bar B,C) + N(A,\bar B,\bar C), \quad N(B,\bar C) = N(A,B,\bar C) + N(\bar A,B,\bar C), \quad N(A,\bar C) = N(A,B,\bar C) + N(A,\bar B,\bar C),$$
and since both terms in the last expression also appear, along with other non-negative terms, in the first two, we have $N(A,\bar B) + N(B,\bar C) \ge N(A,\bar C)$.
To apply this to the spins we started with, we identify $A$ with A$\uparrow$ and $\bar A$ with A$\downarrow$. Now if an electron is A$\uparrow$B$\downarrow$ (whatever C might be) then its partner must be A$\downarrow$B$\uparrow$, and so the result of a measurement A on the first and B on the second will be (A$\uparrow$, B$\uparrow$). Hence the inequality
for the spin case is a special case of the general one. We have proved Bell’s inequality assuming,
remember, that the electrons really do have these three defined properties even if, for a single
electron, we can only measure one of them.
Now let's consider what quantum mechanics would say. A spin-0 state of two identical particles is
$$|S=0\rangle = \sqrt{\tfrac12}\left(|{\uparrow}\rangle\otimes|{\downarrow}\rangle - |{\downarrow}\rangle\otimes|{\uparrow}\rangle\right)$$
and this is true whatever the axis we have chosen to define "up" and "down". As expected, if we choose the same measurement direction at the two stations (eg both A), the first measurement selects one of the two terms and so the second measurement, on the other particle, always gives the opposite result. (Recall this is the meaning of the 2-particle wave function being non-separable or entangled.)
What about different measurement directions at the two stations (eg A and B)? Recall the relation between the spin-up and spin-down states for two directions in the $xz$-plane, where $\theta$ is the angle between the two directions:
$$|\theta,{\uparrow}\rangle = \cos\tfrac\theta2|0,{\uparrow}\rangle + \sin\tfrac\theta2|0,{\downarrow}\rangle \qquad |0,{\uparrow}\rangle = \cos\tfrac\theta2|\theta,{\uparrow}\rangle - \sin\tfrac\theta2|\theta,{\downarrow}\rangle$$
$$|\theta,{\downarrow}\rangle = -\sin\tfrac\theta2|0,{\uparrow}\rangle + \cos\tfrac\theta2|0,{\downarrow}\rangle \qquad |0,{\downarrow}\rangle = \sin\tfrac\theta2|\theta,{\uparrow}\rangle + \cos\tfrac\theta2|\theta,{\downarrow}\rangle.$$
(We previously showed this for the first axis being the $z$-axis, but, up to overall phases, it is true for any pair.) For A and B or for B and C, $\theta = 45°$; for A and C it is $90°$.
Consider randomly-oriented spin-zero pairs, with settings A, B and C equally likely. If the first SG is set to A and the second to B (which happens 1 time in 9), there is a probability of $\frac12$ of getting A$\uparrow$ at the first station. But then we know that the state of the second electron is $|A{\downarrow}\rangle$, and the probability that we will measure spin-up in the B direction is $|\langle B{\uparrow}|A{\downarrow}\rangle|^2 = \sin^2\frac\pi8$. (This probability doesn't change if we start with the second measurement.) Thus the fraction of pairs which are (A$\uparrow$, B$\uparrow$) is $\frac12\sin^2 22.5° = 0.073$, and similarly for (B$\uparrow$, C$\uparrow$). But the fraction which are (A$\uparrow$, C$\uparrow$) is $\frac12\sin^2 45° = 0.25$. So for $9N_0$ measurements quantum mechanics predicts
$$N(A{\uparrow},B{\uparrow}) + N(B{\uparrow},C{\uparrow}) = 0.073N_0 + 0.073N_0 = 0.146N_0 < 0.25N_0 = N(A{\uparrow},C{\uparrow}).$$
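These numbers are trivial to generate; a minimal Python sketch:
```python
import numpy as np

def frac_up_up(theta_deg):
    """QM fraction of pairs recorded (up, up) when the SG settings differ by theta."""
    return 0.5*np.sin(np.radians(theta_deg)/2)**2

lhs = frac_up_up(45) + frac_up_up(45)   # (A,B) and (B,C) pairs, 45 degrees apart
rhs = frac_up_up(90)                    # (A,C) pairs, 90 degrees apart
print(lhs, rhs, lhs >= rhs)             # 0.146..., 0.25, False: the inequality fails
```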
So in quantum mechanics, Bell's inequality is violated. The experiment has been done many
times, starting with the pioneering work of Alain Aspect, and every time the predictions of
quantum mechanics are upheld and Bell’s inequality is violated. (Photons rather than electrons
are often used. Early experiments fell short of the ideal in many ways, but as loopholes have
been successively closed the result has become more and more robust.)
It seems pretty inescapable that the electrons have not “decided in advance” how they will
pass through any given SG. Do we therefore have to conclude that the measurement made at
station 1 is responsible for collapsing the wave function at station 2, even if there is no time
for light to pass between the two? It is worth noting that no-one has shown any way to use
this set-up to send signals between the stations; on their own they both see a totally random
succession of results. It is only in the statistical correlation that the weirdness shows up...
In writing this section I found this document by David Harrison of the University of Toronto
very useful.
As well as the textbook references given at the start, further discussions can be found in N.
David Mermin’s book Boojums all the way through (CUP 1990) and in John S. Bell’s Speakable
and unspeakable in quantum mechanics (CUP 1987).
A. Revision and background
A.1 Index notation
A vector equation such as $\mathbf u + \mathbf v = \mathbf w$ can be written in terms of components as
$$u_i + v_i = w_i.$$
Here $i$ is a free index; it stands for 1 or 2 or 3 or $\ldots N$. It allows us to write $N$ equations in one,
and is completely equivalent in content to the vector equation. It has to occur in each additive
term on both sides of the equation. There is nothing special about the choice of i: provided we
make the substitution everywhere in the equation we can use j or k or m or n or. . .. Often more
than one free index is needed, e.g. to label the ith row and jth column of a matrix, and they
must be chosen to be different (unless one actually means the diagonal elements). Free indices
can also be used to label basis vectors, as in $\{|i\rangle\}$. If these basis vectors are orthonormal, they satisfy
$$\langle i|j\rangle = \delta_{ij} \equiv \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \ne j. \end{cases}$$
This is an example of an equation with two free indices. $\delta_{ij}$ is the Kronecker delta. (In a different context it might also be used for the general element of the unit matrix or identity operator.)
Another use of indices is to represent summation:
$$\langle u|v\rangle = u_1^* v_1 + u_2^* v_2 + u_3^* v_3 + \ldots + u_N^* v_N = \sum_{i=1}^N u_i^* v_i.$$
Here $i$ is called a dummy index, and the sum runs over all its values. This is a completely different use from free indices; we only have one equation, not a set of $N$, and we can't replace $i$ with a definite index of our choice. Again, though, we can use another symbol, say $j$, for the dummy index. (In this course we will never omit the $\sum$; we won't use the Einstein summation convention, because it causes problems in eigenvalue equations. But the limits 1 to $N$ will often be implied and omitted.)
The crucial rule is that if dummy and free indices, or more than one dummy index, occur in the same equation, they must all be different. So for instance suppose we have an orthonormal basis $\{|i\rangle\}$ in terms of which we write $|v\rangle = \sum_i v_i|i\rangle$. If we want to find the value of $\langle i|v\rangle$, where $i$ is a free index, we have to choose something different for the dummy index in the sum, say $j$:
$$\langle i|v\rangle = \langle i|\left(\sum_j v_j|j\rangle\right) = \sum_j v_j\langle i|j\rangle = \sum_j v_j\delta_{ij} = v_i.$$
In the penultimate step we still had a sum over $j$, but only the $i$th term was non-zero. Suppose instead we'd used $i$ as the dummy index; then we would end up with something like
$$\sum_i v_i\langle i|i\rangle = \sum_i v_i = v_1 + v_2 + \ldots + v_N,$$
which is wrong. Note that the Kronecker delta collapses a double sum to a single sum, and then it doesn't matter which of the two dummy indices we use in that single sum.
A.2 Vector spaces
This section is intended as revision for those who did PHYS20672 last semester. Those who
did not should consult the longer guides here and here, as well as chapter 1 of Shankar.
We may write the vectors of a basis as $\{|v_1\rangle, |v_2\rangle, \ldots, |v_N\rangle\}$ or simply as $\{|1\rangle, |2\rangle, \ldots, |N\rangle\}$. It is very useful to work with orthonormal bases, for which $\langle m|n\rangle = \delta_{mn}$.
When the vectors we are talking about are ordinary vectors in real 3-D space, we will tend not
to use Dirac notation. Cartesian unit vectors forming an orthonormal basis in that space will
be written {ex , ey , ez }. Here the inner product is the familiar scalar product.
Operators act on vectors to produce new vectors: $\hat Q|v\rangle = |w\rangle$. The matrix element of $\hat Q$ between two vectors is defined as $\langle u|\hat Q|v\rangle = \langle u|w\rangle$. The identity operator $\hat I$ leaves vectors unchanged.
The object $\hat A = |u\rangle\langle v|$ is an operator, since it can act on a vector to give another (which will always be proportional to $|u\rangle$): $\hat A|w\rangle = (\langle v|w\rangle)|u\rangle$. If the vectors $\{|n\rangle\}$ form an orthonormal basis, then
$$\sum_{n=1}^N |n\rangle\langle n| = \hat I \qquad\text{since}\qquad \Bigl(\sum_{n=1}^N |n\rangle\langle n|\Bigr)|v\rangle = \sum_{n=1}^N |n\rangle\langle n|v\rangle = \sum_{n=1}^N v_n |n\rangle = |v\rangle.$$
This is called the completeness relation.
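A quick numerical illustration of the completeness relation (my own sketch; any orthonormal basis will do, here the columns of a random unitary):

    import numpy as np

    N = 4
    rng = np.random.default_rng(1)
    # Gram-Schmidt (via QR) on a random complex matrix gives a unitary
    # matrix, whose columns are an orthonormal basis.
    U, _ = np.linalg.qr(rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N)))

    # sum_n |n><n| built from outer products equals the identity.
    total = sum(np.outer(U[:, n], U[:, n].conj()) for n in range(N))
    assert np.allclose(total, np.eye(N))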
An operator is fully defined by what it does to the vectors of a basis, since then we can find what it does to any other vector. For each basis vector $|n\rangle$, $\hat Q|n\rangle$ is a new vector which can itself be expanded in the basis: $\hat Q|n\rangle = \sum_m Q_{mn}|m\rangle$. These $N^2$ numbers $Q_{mn}$ fully define the operator, in the same way that the components of a vector fully define it (always with respect to a given basis, of course). With an orthonormal basis, we have
$$v_n = \langle n|v\rangle, \qquad Q_{mn} = \langle m|\hat Q|n\rangle \qquad\text{and}\qquad w_m = \sum_n Q_{mn} v_n.$$
The final equation is reminiscent of matrix multiplication. We can write the components of a vector as a vertical list (or column vector), and of an operator as a matrix, to give:
$$|v\rangle \longrightarrow \begin{pmatrix} v_1\\ v_2\\ \vdots\\ v_N\end{pmatrix} = \begin{pmatrix} \langle 1|v\rangle\\ \langle 2|v\rangle\\ \vdots\\ \langle N|v\rangle\end{pmatrix} \equiv \mathbf{v},$$
$$\langle v| \longrightarrow (v_1^*, v_2^*, \ldots, v_N^*) = (\langle v|1\rangle, \langle v|2\rangle, \ldots, \langle v|N\rangle) \equiv \mathbf{v}^\dagger,$$
$$\hat Q \longrightarrow \begin{pmatrix} Q_{11} & Q_{12} & \ldots & Q_{1N}\\ Q_{21} & Q_{22} & \ldots & Q_{2N}\\ \vdots & \vdots & \ddots & \vdots\\ Q_{N1} & Q_{N2} & \ldots & Q_{NN}\end{pmatrix} = \begin{pmatrix} \langle 1|\hat Q|1\rangle & \langle 1|\hat Q|2\rangle & \ldots & \langle 1|\hat Q|N\rangle\\ \langle 2|\hat Q|1\rangle & \langle 2|\hat Q|2\rangle & \ldots & \langle 2|\hat Q|N\rangle\\ \vdots & \vdots & \ddots & \vdots\\ \langle N|\hat Q|1\rangle & \langle N|\hat Q|2\rangle & \ldots & \langle N|\hat Q|N\rangle\end{pmatrix} \equiv \mathsf{Q}.$$
The symbol $\xrightarrow{\text{name}}$ means "is represented in basis `name' by", where the label will be omitted if the basis is obvious. In different bases, the components and matrix elements will be different. The corresponding column vectors and matrices are different representations of the same vector/operator. (Note though that $\langle u|\hat Q|v\rangle$ is just a number and independent of the representation.)
Note that in their own basis, the basis vectors themselves have extremely simple representations: in a 3-D space, if we use the symbol $\xrightarrow{\{1,2,3\}}$ to mean "is represented in the $\{|1\rangle, |2\rangle, |3\rangle\}$ basis by", then
$$|1\rangle \xrightarrow{\{1,2,3\}} \begin{pmatrix}1\\0\\0\end{pmatrix} \qquad |2\rangle \xrightarrow{\{1,2,3\}} \begin{pmatrix}0\\1\\0\end{pmatrix} \qquad |3\rangle \xrightarrow{\{1,2,3\}} \begin{pmatrix}0\\0\\1\end{pmatrix}.$$
If we choose a new orthonormal basis $\{|n'\rangle\}$, vectors and operators will have new components. With $v_n = \langle n|v\rangle$, $v'_n = \langle n'|v\rangle$, $Q_{mn} = \langle m|\hat Q|n\rangle$ and $Q'_{mn} = \langle m'|\hat Q|n'\rangle$, and where $\mathsf{S}$ is a unitary matrix (not one representing an operator) defined by $S_{mn} = \langle m|n'\rangle$, we have the following relations between the two representations:
$$v'_n = \sum_m S^*_{mn} v_m \;\Rightarrow\; \mathbf{v}' = \mathsf{S}^\dagger\mathbf{v}; \qquad Q'_{ij} = \sum_{k,l} S^*_{ki} Q_{kl} S_{lj} \;\Rightarrow\; \mathsf{Q}' = \mathsf{S}^\dagger\mathsf{Q}\mathsf{S}.$$
For example, in a 3-D space the vectors
$$|1'\rangle = \tfrac12|1\rangle + \tfrac{i}{\sqrt 2}|2\rangle - \tfrac12|3\rangle, \qquad |2'\rangle = \tfrac{1}{\sqrt 2}\bigl(|1\rangle + |3\rangle\bigr), \qquad |3'\rangle = \tfrac12|1\rangle - \tfrac{i}{\sqrt 2}|2\rangle - \tfrac12|3\rangle$$
are orthonormal and so also form a basis. But in this new basis, the column vectors and matrices which represent states and operators will be different. For instance if $|v\rangle = |1\rangle - |3\rangle = |1'\rangle + |3'\rangle$ we write
$$|v\rangle \xrightarrow{\{1,2,3\}} \begin{pmatrix}1\\0\\-1\end{pmatrix} \equiv \mathbf{v} \qquad\qquad |v\rangle \xrightarrow{\{1',2',3'\}} \begin{pmatrix}1\\0\\1\end{pmatrix} \equiv \mathbf{v}',$$
and
$$\begin{pmatrix}v_1\\v_2\\v_3\end{pmatrix} = \begin{pmatrix} \tfrac12 & \sqrt{\tfrac12} & \tfrac12\\[2pt] \tfrac{i}{\sqrt2} & 0 & -\tfrac{i}{\sqrt2}\\[2pt] -\tfrac12 & \sqrt{\tfrac12} & -\tfrac12 \end{pmatrix} \begin{pmatrix}v'_1\\v'_2\\v'_3\end{pmatrix}.$$
The matrix is $\mathsf{S}$ as defined above. We observe that its columns are just the representations of the new states $\{|1'\rangle, |2'\rangle, |3'\rangle\}$ in the old basis $\{|1\rangle, |2\rangle, |3\rangle\}$: $S_{23} = \langle 2|3'\rangle$ etc.
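The worked example above can be checked directly; the following Python sketch (illustrative, not part of the notes) verifies that $\mathsf{S}$ is unitary and that $\mathbf{v}' = \mathsf{S}^\dagger\mathbf{v}$ sends $(1, 0, -1)$ to $(1, 0, 1)$:

    import numpy as np

    s2 = 1 / np.sqrt(2)
    S = np.array([[ 0.5,      s2,  0.5],
                  [ 1j * s2,  0,  -1j * s2],
                  [-0.5,      s2, -0.5]])

    assert np.allclose(S.conj().T @ S, np.eye(3))   # unitary

    v = np.array([1, 0, -1])                        # |v> = |1> - |3>
    v_new = S.conj().T @ v
    assert np.allclose(v_new, [1, 0, 1])            # |v> = |1'> + |3'>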
The adjoint of an operator is defined by $\langle u|\hat Q|v\rangle = \langle v|\hat Q^\dagger|u\rangle^*$. A unitary operator satisfies $\hat Q^\dagger\hat Q = \hat I$.
A Hermitian operator is its own adjoint: $\langle u|\hat Q|v\rangle = \langle v|\hat Q|u\rangle^*$. In practice that means that $\hat Q$ can act backwards on $\langle u|$ or forwards on $|v\rangle$, whichever is more convenient. In an orthonormal basis, $\hat Q$ will be represented by a matrix which equals its adjoint (transposed complex conjugate): $Q_{mn} = Q^*_{nm}$.
Hermitian operators have real eigenvalues and orthogonal eigenvectors which span the space.
(If eigenvalues are repeated, all linear combinations of the corresponding eigenvectors are also
eigenvectors—they form a degenerate subspace—but an orthogonal subset can always be
chosen.) Thus the normalised eigenvectors of Hermitian operators are often chosen as a basis, typically labelled by the eigenvalues: $|\lambda_n\rangle$. Two Hermitian operators which commute will have a common set of eigenvectors, which might be labelled by both eigenvalues: $|\mu_m, \lambda_n\rangle$.
In its own eigenbasis, a Hermitian operator will be diagonal, with the eigenvalues as the di-
agonal elements. Hence the process of finding the eigenvalues and eigenvectors is often called
diagonalisation. The unitary matrix S whose columns are the normalised eigenvectors can
be used to transform other vectors and operators to this basis.
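Numerically, this is exactly what np.linalg.eigh does for a Hermitian matrix: it returns real eigenvalues and the unitary $\mathsf{S}$ whose columns are the normalised eigenvectors. A short illustrative check (my own sketch):

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    Q = A + A.conj().T                      # make a Hermitian matrix

    eigvals, S = np.linalg.eigh(Q)          # eigvals real, S unitary
    assert np.allclose(S.conj().T @ Q @ S, np.diag(eigvals))   # diagonalised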
Since we can add and multiply operators and multiply them by scalars, we can form power
series of an operator and hence define more general functions via their power-series expansion.
The most important function of an operator is the exponential:
$$e^{\hat Q} \equiv \sum_{n=0}^\infty \frac{\hat Q^n}{n!}.$$
Since the corresponding power series for $e^\lambda$ converges for all finite numbers, this is defined for all Hermitian operators, and its eigenvalues are $e^{\lambda_i}$. (In the eigenbasis of a Hermitian operator, any function of the operator is also represented by a diagonal matrix whose elements are the function of the eigenvalues.)
The exponential of $i$ times a Hermitian operator is a unitary operator.
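A hedged numerical check of that last statement (my own sketch): for a random Hermitian $\hat H$, $e^{i\hat H}$ is unitary, and in the eigenbasis of $\hat H$ it is diagonal with entries $e^{i\lambda}$:

    import numpy as np
    from scipy.linalg import expm

    rng = np.random.default_rng(3)
    A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    H = A + A.conj().T                              # Hermitian

    U = expm(1j * H)
    assert np.allclose(U.conj().T @ U, np.eye(3))   # unitary

    lam, S = np.linalg.eigh(H)
    assert np.allclose(S.conj().T @ U @ S, np.diag(np.exp(1j * lam)))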
In the space of normalisable functions $f(x)$, the derivative operator $\hat D$ is defined by $\hat D f = df/dx$. Since
$$\int f^* \frac{dg}{dx}\, dx = -\Bigl(\int g^* \frac{df}{dx}\, dx\Bigr)^{\!*},$$
$\hat D$ is anti-Hermitian, so $i\hat D$ is Hermitian. In QM we work with $\hat p = -i\hbar\hat D$, and we can see that $[\hat x, \hat p] = i\hbar$. In the abstract vector space, this commutation relation defines $\hat p$. In position space, these operators are represented by$^2$
$$\hat x \xrightarrow{x} x, \qquad \hat p \xrightarrow{x} -i\hbar\frac{d}{dx}.$$
We can define eigenstates of $\hat p$, $|p\rangle$,$^3$ which have the following representation in position space:$^4$
$$|p\rangle \xrightarrow{x} \langle x|p\rangle = \frac{1}{\sqrt{2\pi\hbar}}\, e^{ipx/\hbar}.$$
Up to factors of $\hbar$, $\langle p|f\rangle = \tilde f(p)$ is the Fourier transform of $f(x)$, and is an equally valid representation—in what we call momentum space—of the abstract state $|f\rangle$. The numerical equality of $\langle f|g\rangle$ calculated in the position and momentum representations is a reflection of Parseval's theorem.
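A discretised illustration of this point (an assumption-laden sketch: the FFT with a unitary normalisation stands in for the continuum transform): the norm of a wave packet comes out the same in position and momentum space.

    import numpy as np

    N, L = 2048, 40.0
    x = np.linspace(-L / 2, L / 2, N, endpoint=False)
    dx = x[1] - x[0]
    f = np.exp(-x**2) * np.exp(2j * x)            # a Gaussian wave packet

    F = np.fft.fft(f) * dx / np.sqrt(2 * np.pi)   # unitary-convention transform
    dk = 2 * np.pi / (N * dx)

    norm_x = np.sum(np.abs(f)**2) * dx
    norm_k = np.sum(np.abs(F)**2) * dk
    assert np.isclose(norm_x, norm_k)             # Parseval's theorem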
We note that the states $|n\rangle$ defined above, whose position-space representation is a Hermite polynomial times a Gaussian, are actually eigenstates of $\hat x^2 - \hat D^2$, with eigenvalues $\lambda_n = 2n + 1$. In this basis $\hat x$ and $\hat p$ are represented by infinite-dimensional matrices, and it can be shown that for both, only matrix elements where $m$ and $n$ differ by $\pm 1$ are non-zero.
We can extend the discussion to functions of three coordinates $x$, $y$ and $z$. (Our notation for a point in space in a particular Cartesian coordinate system is $\mathbf{r} = x\,\mathbf{e}_x + y\,\mathbf{e}_y + z\,\mathbf{e}_z$.) There are operators associated with each coordinate, $\hat x$, $\hat y$ and $\hat z$, which commute, and corresponding momentum operators $\hat p_x$, $\hat p_y$ and $\hat p_z$, which also commute. Between the two sets the only non-vanishing commutators are $[\hat x, \hat p_x] = [\hat y, \hat p_y] = [\hat z, \hat p_z] = i\hbar$.
The position operator, $\hat{\mathbf{x}}$, is$^5$ $\hat x\,\mathbf{e}_x + \hat y\,\mathbf{e}_y + \hat z\,\mathbf{e}_z$, and similarly $\hat{\mathbf{p}}$. Boldface-and-hat now indicates a vector operator, i.e. a triplet of operators. The eigenstate of position is $|x, y, z\rangle \equiv |\mathbf{r}\rangle$:
$$\hat{\mathbf{x}}|\mathbf{r}\rangle = \bigl(\hat x\,\mathbf{e}_x + \hat y\,\mathbf{e}_y + \hat z\,\mathbf{e}_z\bigr)|\mathbf{r}\rangle = \bigl(x\,\mathbf{e}_x + y\,\mathbf{e}_y + z\,\mathbf{e}_z\bigr)|\mathbf{r}\rangle = \mathbf{r}|\mathbf{r}\rangle,$$
$$\hat{\mathbf{p}}|\mathbf{p}\rangle = \bigl(\hat p_x\,\mathbf{e}_x + \hat p_y\,\mathbf{e}_y + \hat p_z\,\mathbf{e}_z\bigr)|\mathbf{p}\rangle = \bigl(p_x\,\mathbf{e}_x + p_y\,\mathbf{e}_y + p_z\,\mathbf{e}_z\bigr)|\mathbf{p}\rangle = \mathbf{p}|\mathbf{p}\rangle.$$
$^2$Actually, as these are operators, it is more accurate to give the matrix elements $\hat x \xrightarrow{x} \langle x'|\hat x|x\rangle = x\,\delta(x' - x)$ and $\hat p \xrightarrow{x} \langle x'|\hat p|x\rangle = -i\hbar\,\frac{d\delta(x - x')}{dx'}$, which are then integrated over $x'$ in any expression; but as this has just the net effect of setting $x'$ to $x$ we never bother with this more correct version.
$^3$The lecturer of PHYS20672 used $|e_k\rangle \xrightarrow{x} e_k(x)$ for eigenkets of $\hat K = \hat p/\hbar$. The labelling of the kets used here is more common and is more symmetric between the $x$ and $p$ representations. I don't find it causes problems in practice. There is no common convention for the name of the function $\langle x|p\rangle$ which represents a plane wave.
$^4$This can be derived from the differential equation obtained by combining $\langle x|\hat p|p\rangle = p\,\langle x|p\rangle$ with the position-space representation $\langle x|\hat p|p\rangle = -i\hbar\frac{d}{dx}\langle x|p\rangle$.
$^5$We do not use $\hat{\mathbf{r}}$ since that is reserved for the unit vector $\mathbf{r}/r$!
In position space, $\hat{\mathbf{x}} \xrightarrow{\mathbf{r}} \mathbf{r}$ and $\hat{\mathbf{p}} \xrightarrow{\mathbf{r}} -i\hbar\nabla$. Momentum eigenstates are
$$|\mathbf{p}\rangle \xrightarrow{\mathbf{r}} \langle\mathbf{r}|\mathbf{p}\rangle = \Bigl(\frac{1}{2\pi\hbar}\Bigr)^{3/2} e^{i\mathbf{p}\cdot\mathbf{r}/\hbar}.$$
A.2.3 Commutators
Let $\hat A$, $\hat B$ and $\hat C$ be arbitrary operators in some space. Then the following relations are very useful:
$$[\hat A, \hat B] = -[\hat B, \hat A], \qquad [\hat A, \hat B + \hat C] = [\hat A, \hat B] + [\hat A, \hat C],$$
$$[\hat A, \hat B\hat C] = [\hat A, \hat B]\hat C + \hat B[\hat A, \hat C], \qquad [\hat A\hat B, \hat C] = \hat A[\hat B, \hat C] + [\hat A, \hat C]\hat B.$$
A.3 Wave mechanics
A.3.1 The time-dependent Schrödinger equation
In wave mechanics,$^6$ a particle of mass $M$ moving in a potential $V(\mathbf{r})$ is described by a wave function $\Psi(\mathbf{r}, t)$.$^7$ The Born rule says that $|\Psi(\mathbf{r}, t)|^2\, dV$ is the probability of finding the particle in the volume element $dV$, and as a result the wave function must be normalised (it must be somewhere in space):$^8$
$$\int |\Psi(\mathbf{r}, t)|^2\, dV = 1,$$
where the integral here and everywhere below is over all space unless otherwise specified. The wave function is a solution of the time-dependent Schrödinger equation (TDSE)
$$-\frac{\hbar^2}{2M}\nabla^2\Psi + V(\mathbf{r})\Psi = i\hbar\frac{\partial\Psi}{\partial t} \qquad\text{or equivalently}\qquad \hat H\Psi = i\hbar\frac{\partial\Psi}{\partial t},$$
where $\hat H$ is the Hamiltonian or energy operator. As we often do when faced with a partial differential equation, we can look for solutions which are separable in space and time,
$$\Psi(\mathbf{r}, t) = \psi(\mathbf{r})\,T(t)$$
(this will not work if the potential depends on time, but for many interesting cases it does not). Substituting in the Schrödinger equation and dividing by $\Psi(\mathbf{r}, t)$ gives
$$-\frac{\hbar^2}{2M}\frac{\nabla^2\psi}{\psi} + V(\mathbf{r}) = \frac{i\hbar}{T}\frac{dT}{dt} = E,$$
using the usual argument that if two functions of different variables are equal for all values of those variables, they both must be equal to a constant, which we denote $E$. The time equation is trivial and independent of the potential, and gives
$$T(t) = e^{-iEt/\hbar};$$
$^6$These notes are designed to be read at the start of the current course, and so do not use vector space terminology. They also do not distinguish between operators and their position-space representations.
$^7$$m$ is usually used for the mass, but that may be confused with the azimuthal quantum number.
$^8$This means that the wave function is dimensioned, with units of inverse length to the power $D/2$, $D$ being the number of spatial dimensions.
this is just a phase factor with a constant frequency satisfying de Broglie's relation $\hbar\omega = E$. The allowed values of $E$, though, are not determined at this stage.
The spatial equation is the time-independent Schrödinger equation (TISE)
$$-\frac{\hbar^2}{2M}\nabla^2\psi + V(\mathbf{r})\psi = E\psi \qquad\text{or equivalently}\qquad \hat H\psi = E\psi.$$
Though the form of the equation is universal, the solutions depend on the potential. In solving
the TISE, at a minimum we need to ensure the solutions are finite everywhere—we don’t want
the probability of finding the particle in some region to be infinite. If the potential is constant,
$V(\mathbf{r}) = V_0$, the solutions are plane waves characterised by a wave vector $\mathbf{k}$:
$$\psi(\mathbf{r}) \propto e^{i\mathbf{k}\cdot\mathbf{r}}, \qquad\text{where}\qquad E = V_0 + \frac{\hbar^2|\mathbf{k}|^2}{2M}.$$
This is very suggestive: it looks like $E$ equals the potential energy plus the kinetic energy, if
the momentum is given by de Broglie’s relation p = ~k. And the only allowable solutions are
those with E > V0 , which makes sense classically. Indeed we identify E with the energy of the
system in all cases.
If the potential varies with position, we have two possible types of solution. There are those for
which the energy is greater than the potential over most of space, in which case the solutions
are not localised and the particle may be found anywhere; these are called scattering solutions.
They will look like plane waves in regions where the potential is constant. The other type of
solution may exist if the potential has a well, a region of space where the potential is lower
than its value at large distances. These solutions are called bound states; they have energies
that are not large enough for the particle to climb out of the well, and the wave function is
concentrated within the well—the probability of finding the particle large distances away from
the well vanishes.
Elementary QM is almost exclusively concerned with bound states. The extra requirement that
the wave function must vanish at infinity means that, for arbitrary values of E, no solution to
the TISE exists; only for certain discrete values can acceptable solutions be found. The energy
is quantised ; there is a single lowest (ground-state) energy state and a number of higher-energy
or excited states. If the potential grows without bound at large distances (an infinite well)
there will be infinitely-many states, but more realistically the potential will level off eventually
(a finite well); there will be a maximum energy of the bound states, and scattering states will
exist as well. (In this case it is usual to set V (|r| → ∞) = 0, with V < 0 in the well.) For
simplicity, we often concentrate on infinite wells.
In this case we have infinitely many states with energies $E_n$, $n = 1, 2, \ldots$ and solutions satisfying
$$\Psi_n(\mathbf{r}, t) = \psi_n(\mathbf{r})\,e^{-iE_n t/\hbar} \qquad\text{where}\qquad -\frac{\hbar^2}{2M}\nabla^2\psi_n(\mathbf{r}) + V(\mathbf{r})\psi_n(\mathbf{r}) = E_n\psi_n(\mathbf{r}).$$
Note that while the form of the time-dependence is the same for all solutions, the frequency $\omega_n = E_n/\hbar$ is different for each. The TISE is an eigenfunction equation, with the energies $E_n$ being the eigenvalues. The states are taken to be normalised, and it can be shown that they are also orthogonal:$^9$
$$\int \psi_m^*(\mathbf{r})\,\psi_n(\mathbf{r})\, dV = \delta_{mn}.$$
$^9$In more than one dimension there may be distinguishable states with the same energy, which are termed degenerate; in that case any superposition of degenerate solutions is also a solution, but we are always able to choose an orthogonal set.
The general solution of the TDSE is a superposition of these states:
$$\Psi(\mathbf{r}, t) = \sum_{n=1}^\infty c_n\,\psi_n(\mathbf{r})\,e^{-iE_n t/\hbar}, \qquad\text{where}\qquad \sum_{n=1}^\infty |c_n|^2 = 1,$$
the restriction on the sum of the magnitudes of the coefficients being the normalisation condition. In any specific case a particular wave function is determined by the coefficients, which (since the functions $\psi_n(\mathbf{r})$ are orthogonal) can be obtained from the initial state of the system by
$$c_n = \int \psi_n^*(\mathbf{r})\,\Psi(\mathbf{r}, 0)\, dV.$$
Thus in a very simple case, if the initial state is $\Psi(\mathbf{r}, 0) = \sqrt{\tfrac13}\,\psi_1(\mathbf{r}) + \sqrt{\tfrac23}\,\psi_3(\mathbf{r})$, the subsequent state would be
$$\Psi(\mathbf{r}, t) = \sqrt{\tfrac13}\,\psi_1(\mathbf{r})\,e^{-iE_1 t/\hbar} + \sqrt{\tfrac23}\,\psi_3(\mathbf{r})\,e^{-iE_3 t/\hbar}.$$
In practical situations (such as atomic physics) the quantisation of the energy levels shows up
primarily in the existence of discrete excitation or de-excitation energies—spectral lines.
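As a concrete illustration of these formulae, here is a sketch with my own choices (units $\hbar = M = a = 1$ and the 1D infinite-well eigenfunctions $\sqrt{2/a}\sin(n\pi x/a)$, derived in section A.3.3 below): compute the $c_n$ for a given $\Psi(\mathbf{r}, 0)$ numerically, then attach the phase factors.

    import numpy as np

    a, nmax = 1.0, 50
    x = np.linspace(0, a, 2001)
    dx = x[1] - x[0]
    psi = lambda n: np.sqrt(2 / a) * np.sin(n * np.pi * x / a)
    E = lambda n: n**2 * np.pi**2 / 2              # infinite-well energies

    Psi0 = np.sqrt(30 / a**5) * x * (a - x)        # a normalised initial state
    c = np.array([np.sum(psi(n) * Psi0) * dx for n in range(1, nmax + 1)])
    print(np.sum(np.abs(c)**2))                    # ~1: normalisation condition
    print(np.abs(c[0])**2)                         # probability of E_1, ~0.9986

    t = 0.7                                        # evolve: each term gets its own phase
    Psi_t = sum(c[n - 1] * psi(n) * np.exp(-1j * E(n) * t)
                for n in range(1, nmax + 1))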
A.3.2 Measurement
It is a postulate of QM that for every "observable" or measurable property of a quantum system there is a corresponding Hermitian operator. Denoting such an operator $\hat Q$, "Hermitian" means that if $\psi(\mathbf{r})$ and $\phi(\mathbf{r})$ are (complex) normalisable functions of position,
$$\int \psi^*(\mathbf{r})\,\hat Q\,\phi(\mathbf{r})\, dV = \Bigl(\int \phi^*(\mathbf{r})\,\hat Q\,\psi(\mathbf{r})\, dV\Bigr)^{\!*}.$$
Similarly to the case with Hermitian matrices, Hermitian operators have real eigenvalues and
orthogonal eigenfunctions which form a complete set, that is any other normalisable function
can be expressed as a superposition of them.
At the start we noted that a measurement of position (for which the operator is just the position vector, denoted $\hat{\mathbf{x}}$ but simply equal to $\mathbf{r}$ in an integral) will give an answer which is not known in advance, with the probability of the possible results being governed by the modulus-squared of the wave function. So, in general, is the case with other possible measurements. If some observable is associated with an operator $\hat Q$, the average value of a measurement of that observable for a particle with wave function $\Psi(\mathbf{r}, t)$ is given by$^{10}$
$$\langle\hat Q\rangle = \int \Psi^*(\mathbf{r}, t)\,\hat Q\,\Psi(\mathbf{r}, t)\, dV.$$
Since, as is well known, measurement in quantum mechanics changes the system, we can only talk about average results if we make repeated measurements on identically-prepared copies of the system. So this average is an ensemble average, also called the expectation value.
As well as an average value, these measurements will have a spread or uncertainty $\Delta Q$, which is given through the usual definition of the standard deviation by
$$(\Delta Q)^2 = \langle\hat Q^2\rangle - \langle\hat Q\rangle^2.$$
If we expand the wave function in the orthonormal eigenfunctions $\phi_i(\mathbf{r})$ of $\hat Q$, $\Psi = \sum_i c_i\,\phi_i$, then the only possible results of a measurement are the eigenvalues $q_i$, obtained with probability $|c_i|^2$.$^{11}$ And immediately after the measurement, the system is in state $\phi_i(\mathbf{r})$.
Some frequently-met operators are momentum, $\hat{\mathbf p} = -i\hbar\nabla$; kinetic energy, $\hat{\mathbf p}\cdot\hat{\mathbf p}/2M$; energy, $\hat H$ as given above; and angular momentum, $\hat{\mathbf L} = \mathbf{r}\times\hat{\mathbf p}$.
For energy, once we have found the eigenfunctions and eigenvalues for the relevant particular
potential, as we have already seen any wave function can be expanded
$$\Psi(\mathbf{r}, t) = \sum_{n=1}^\infty c_n\,\psi_n(\mathbf{r})\,e^{-iE_n t/\hbar}, \qquad\text{where}\qquad \sum_{n=1}^\infty |c_n|^2 = 1 \quad\text{and}\quad c_n = \int \psi_n^*(\mathbf{r})\,\Psi(\mathbf{r}, 0)\, dV.$$
We now see that $|c_n|^2$ is the probability that, if we measure the energy, we will get the value $E_n$. If only one $c_n$ is non-zero, $\Psi(\mathbf{r}, t)$ is an energy eigenstate. The following terms are synonymous: separable solution of the TDSE; solution of the TISE; eigenstate of the Hamiltonian; state of definite energy; stationary state. The last requires some explanation: for $\Psi(\mathbf{r}, t) = \psi_n(\mathbf{r})e^{-iE_n t/\hbar}$, the expectation value $\langle\hat Q\rangle$ is independent of time for any operator $\hat Q$ that does not itself depend on time.
For the case we considered above,
$$\Psi(\mathbf{r}, t) = \sqrt{\tfrac13}\,\psi_1(\mathbf{r})\,e^{-iE_1 t/\hbar} + \sqrt{\tfrac23}\,\psi_3(\mathbf{r})\,e^{-iE_3 t/\hbar},$$
a measurement of energy would yield the result E1 one third of the time and E3 two-thirds of
the time.
Given any wave function except a plane wave, the uncertainty in any component of the momentum $\Delta p_i$ will be non-zero. This is related to the Fourier transform: to make up a wave packet of finite spatial extent $\Delta x$ requires a superposition of waves with a spread of wavelengths. (Similarly, a signal of finite duration consists of a spread of frequencies.) The bandwidth theorem relates the two; the narrower the wave packet, the wider the spread of wave numbers. In QM language, this is Heisenberg's uncertainty relation:
$$\Delta x\,\Delta p_x \geq \tfrac12\hbar.$$
If we make a measurement that narrows down the region in which the particle exists, a subsequent measurement of the momentum is much more unpredictable, and vice versa. It is obvious therefore that we cannot find a state of well-defined position and momentum; position and momentum can't have common eigenfunctions. In fact if operators have common eigenfunctions, they must commute, and $\hat x$ and $\hat p_x$ do not: $[\hat x, \hat p_x] = i\hbar$.
$^{11}$If the eigenvalues are continuous, this should be interpreted as a probability density instead.
Heisenberg, with Born and Jordan, derived his formulation of QM, called matrix mechanics,
starting from this relation; Schrödinger subsequently showed that it was in fact equivalent to
his wave mechanics.
Other operators which do not commute are the three components of angular momentum, so
that in QM we can only know one of them exactly (usually but not necessarily taken to be Lz ).
In fact the commutation relations are rather interesting:
$$[\hat L_x, \hat L_y] = i\hbar\hat L_z, \qquad [\hat L_y, \hat L_z] = i\hbar\hat L_x, \qquad [\hat L_z, \hat L_x] = i\hbar\hat L_y.$$
However all three separately commute with $\hat L^2 = \hat L_x^2 + \hat L_y^2 + \hat L_z^2$, so we can know the magnitude of the angular momentum and one of its components at the same time. For more on angular momentum operators see section A.5. For more on commutators see section A.2.3.
A.3.3 Bound states
We now turn to the problem of finding the energy levels for a variety of potentials.
An unrealistic but useful model potential is one which has a constant value, taken to be zero,
over some region, but which is infinite elsewhere. Where a potential is discontinuous like this
we have to solve the TISE in each region separately, and we match them by requiring that ψ
is continuous at the boundaries (the probability density has to be unambiguous). Since the
TISE is a second-order differential equation we usually need the derivatives of ψ to match as
well. However in the unique case of an infinite potential step there is no solution possible in
the classically-forbidden region, so in this case the condition is just that ψ vanishes at the
boundaries.
In 1D, with $V(x) = 0$ for $0 < x < a$, the general solution is $Ae^{ikx} + Be^{-ikx}$ or equivalently $C\sin(kx) + D\cos(kx)$, with $\hbar k = \sqrt{2ME}$; but requiring $\psi(0) = \psi(a) = 0$ restricts the (normalised) solution to $\sqrt{\tfrac2a}\sin(k_n x)$, with $k_n = n\pi/a$ and $E_n = n^2\hbar^2\pi^2/(2Ma^2)$. The quantisation of the energies is completely analogous to the "quantisation" of the vibrational frequencies of a string fixed at both ends.
If we’d chosen the well to stretch from −a/2 to a/2 the values kn and hence the energies would
(of course) be unchanged but the wave functions would alternate cosines and sines. (I only
mention this because the finite square well is much neater with that choice, see below.)
In 2D and 3D, for a rectangular or cuboidal well, the spatial wave function is separable. The solution in 3D, with sides $L_x$ etc., is just
$$\psi_{n_x n_y n_z} = \sqrt{\frac{8}{L_x L_y L_z}}\,\sin\frac{n_x\pi x}{L_x}\,\sin\frac{n_y\pi y}{L_y}\,\sin\frac{n_z\pi z}{L_z}$$
$$\text{and}\qquad E_{n_x n_y n_z} = \frac{\hbar^2\pi^2}{2M}\Bigl(\frac{n_x^2}{L_x^2} + \frac{n_y^2}{L_y^2} + \frac{n_z^2}{L_z^2}\Bigr).$$
A cuboidal well (albeit better modelled as a finite than an infinite well) is also called a quantum dot. If all the lengths are the same, some energy levels will correspond to more than one wave function—denoting the quantum numbers of a state by $(n_x, n_y, n_z)$, the first excited energy level is in fact three states, with quantum numbers $(2, 1, 1)$, $(1, 2, 1)$ and $(1, 1, 2)$. These are said to be three-fold degenerate.
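Counting such degeneracies is easy to automate; the following sketch (illustrative only) groups triples $(n_x, n_y, n_z)$ by $n_x^2 + n_y^2 + n_z^2$, which fixes the energy for a cubic box:

    from collections import Counter
    from itertools import product

    # Key: n_x^2 + n_y^2 + n_z^2 (proportional to the energy); value: degeneracy.
    levels = Counter(nx**2 + ny**2 + nz**2
                     for nx, ny, nz in product(range(1, 6), repeat=3))
    for s in sorted(levels)[:4]:
        print(s, levels[s])   # 3 1 / 6 3 / 9 3 / 11 3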
Quantum mechanics allows us to make systems which act as if they have fewer than 3 dimensions in a way that classical physics would not. If we have a situation where one length is much shorter than the others, say $L_z \ll L_x, L_y$, then all the low-lying states will have $n_z = 1$. If we probe such a system with energies less than the energy needed to excite the first state with $n_z = 2$, it will be effectively two-dimensional; we say the third degree of freedom is frozen out. Modern so-called 2D systems are often of this nature. Similarly if $L_x \gg L_y, L_z$ we have an effectively 1D system or quantum wire.
For a 2D circular well, the radial solutions are Bessel functions with argument kr. Again k is
chosen so that the wave function vanishes at the boundary r = R, ie kR is a zero of the Bessel
function. For a 3-D spherical well we get spherical Bessel functions. More on this below.
Finite square well
A somewhat more realistic well has one region of constant low potential with a constant higher
potential elsewhere. We can either take these to be −V0 and 0, or 0 and +V0 ; the former allows
the potential at infinity to be zero but the latter is closer to the infinite well so we will go with
that.
In 1D, with $V(x) = 0$ for $-a/2 < x < a/2$ and $V(x) = V_0 > 0$ elsewhere, the solutions are again $Ae^{ikx} + Be^{-ikx}$ inside the well. But now there are solutions to the equation outside the well as well: for $x > a/2$, $\psi = Ce^{-\kappa x}$, and for $x < -a/2$, $\psi = De^{\kappa x}$, with $\hbar\kappa(k) = \sqrt{2M(V_0 - E)}$. (These were not allowable for a constant potential over all space as they blow up, but they are OK here as they hold only in restricted regions.) Matching both $\psi$ and $\psi'$ at $x = \pm a/2$ again restricts the values of $k$: either $\psi$ is symmetric (cosine inside the well) and $k$ has to satisfy $\kappa(k) = k\tan(ka/2)$, or antisymmetric (sine inside the well) with $\kappa(k) = -k\cot(ka/2)$. The graphical solution for the allowed values of $k$, and the first few solutions, are shown below; a numerical version of the same root-finding follows. A 1D well, no matter how shallow, always has at least one bound state.
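The promised numerical version of the even-parity matching condition (with $\hbar = M = 1$ and illustrative values of $a$ and $V_0$, my own choices):

    import numpy as np
    from scipy.optimize import brentq

    a, V0 = 1.0, 50.0
    kmax = np.sqrt(2 * V0)                  # bound states have k < kmax

    def even_condition(k):
        # kappa(k) - k tan(ka/2): a root is an allowed symmetric state.
        return np.sqrt(kmax**2 - k**2) - k * np.tan(k * a / 2)

    # Bracket the first root between 0 and the first divergence of tan(ka/2).
    k1 = brentq(even_condition, 1e-6, np.pi / a - 1e-6)
    E1 = k1**2 / 2
    print(k1, E1)   # ground state; E1 lies a little below the infinite-well pi^2/2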
The most important part of this analysis (apart again from the quantisation of the energy) is
that there is a finite probability of finding the particle in the classically-forbidden regions. The
higher the energy, the higher this probability is. On the other hand we note that as V0 increases,
the tails of the low-lying states get smaller and shorter, until we reach the point where simply
setting ψ = 0 at the boundary, as we did in the infinite well, is a good approximation.
There is a standard piece of iconography used in the picture on the right. We draw a well and
the energy levels. Then we take the lines representing the energy levels as the x-axis for a graph
of the corresponding wave function. These therefore have off-set y-axes, and the two y scales,
for the potential and for the wave functions, are not related.
Harmonic Oscillator
A potential which is harmonic in three directions, $\frac12 M(\omega_x^2 x^2 + \omega_y^2 y^2 + \omega_z^2 z^2)$, will have solutions which (like the square well) are just products of the 1D states, and energies which are the sum of the corresponding 1D energies. If all the "spring constants" are equal there will be a high degree of degeneracy among the excited states; for instance $(7/2)\hbar\omega$ is the energy of the states with $(n_x, n_y, n_z) = (2,0,0)$, $(0,2,0)$, $(0,0,2)$, $(1,1,0)$, $(1,0,1)$ and $(0,1,1)$.
In the symmetric case, though, the potential can be written $V(\mathbf{r}) = \frac12 M\omega^2 r^2$, and we
can work instead in spherical polars: see the next section. The energies, degeneracies, and non-
degenerate wave functions such as the ground state must turn out the same in both coordinate
systems, but the degenerate ones need only be linear combinations of one another.
Linear potential
In a region of space where the potential is linear, V (x) ∝ x, the solutions are Airy functions
(see section A.9). To form a well, this potential would have to have a hard wall somewhere, say
V = ∞ for x < 0, or it might be part of a V-shaped potential V (x) ∝ |x|. The energy levels
have to be found numerically.
A.3.4 Circular and spherical symmetry
In 2D we will use $r^2 = x^2 + y^2$ and $y/x = \tan\phi$; in 3D, $r^2 = x^2 + y^2 + z^2$, $y/x = \tan\phi$ and $z/r = \cos\theta$. The double meaning of $r$ will not cause problems so long as we don't consider 3D problems with cylindrical geometry.
In a 2D problem with a symmetric potential $V(\mathbf{r}) = V(r)$, we can write the wave function in a form which is separable in plane polar coordinates: $\psi(x, y) = R(r)\Phi(\phi)$. Skipping some detail, we find that the angular dependence is just of the form $e^{im\phi}$ where, in order for the wave function to be single-valued, we need $m$ to be an integer (not to be confused with the mass!).
Then the radial equation is
$$-\frac{\hbar^2}{2M}\Bigl(\frac{d^2R}{dr^2} + \frac1r\frac{dR}{dr}\Bigr) + \frac{\hbar^2 m^2}{2Mr^2}R + V(r)R = ER.$$
In a 3D spherically-symmetric potential $V(\mathbf{r}) = V(r)$, we can write the wave function in a form which is separable in spherical polar coordinates: $\psi(\mathbf{r}) = R(r)Y(\theta, \phi)$. Then
$$-\frac{\hbar^2}{2M}\nabla^2\psi(\mathbf{r}) + V(r)\psi(\mathbf{r}) = E\psi(\mathbf{r})$$
$$\Rightarrow\quad -\hbar^2\Bigl[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\Bigl(\sin\theta\frac{\partial Y}{\partial\theta}\Bigr) + \frac{1}{\sin^2\theta}\frac{\partial^2 Y}{\partial\phi^2}\Bigr] = \hbar^2 l(l+1)Y \tag{A.1}$$
$$\text{and}\qquad -\frac{\hbar^2}{2Mr}\frac{d^2(rR)}{dr^2} + \frac{\hbar^2 l(l+1)}{2Mr^2}R + V(r)R = ER,$$
where l(l + 1) is the constant of separation. The radial equation depends on the potential, and
so differs from problem to problem. However the angular equation is universal: its solutions
do not depend on the potential. It is further separable into an equation in θ and one in φ
with separation constant m2 ; the latter is the same as in 2D with solution eimφ for integer
m. Finally the allowable solutions of the θ equation are restricted to those which are finite
for all θ, which is only possible if l is an integer greater than or equal to |m|; the solutions
are associated Legendre polynomials Plm (cos θ). The combined angular solutions are called
spherical harmonics Ylm (θ, φ):
$$Y_0^0(\theta,\phi) = \sqrt{\frac{1}{4\pi}} \qquad\qquad Y_1^{\pm1}(\theta,\phi) = \mp\sqrt{\frac{3}{8\pi}}\,\sin\theta\, e^{\pm i\phi}$$
$$Y_1^0(\theta,\phi) = \sqrt{\frac{3}{4\pi}}\,\cos\theta \qquad\qquad Y_2^{\pm2}(\theta,\phi) = \sqrt{\frac{15}{32\pi}}\,\sin^2\theta\, e^{\pm2i\phi}$$
$$Y_2^{\pm1}(\theta,\phi) = \mp\sqrt{\frac{15}{8\pi}}\,\sin\theta\cos\theta\, e^{\pm i\phi} \qquad\qquad Y_2^0(\theta,\phi) = \sqrt{\frac{5}{16\pi}}\,(3\cos^2\theta - 1)$$
These are normalised and orthogonal:
$$\int (Y_{l'}^{m'})^*\, Y_l^m\, d\Omega = \delta_{ll'}\,\delta_{mm'} \qquad\text{where}\qquad d\Omega = \sin\theta\, d\theta\, d\phi.$$
The physical significance of the quantum numbers l and m is not clear from this approach.
However if we look at the radial equation, we see that the potential has been effectively modified
by an extra term ~2 l(l + 1)/(2M r2 ). Recalling classical mechanics, this is reminiscent of the
centrifugal potential which enters the equation for the radial motion of an orbiting particle,
where ~2 l(l + 1) is taking the place of the (conserved) square of the angular momentum. Now
we have already defined the angular momentum operator $\hat{\mathbf{L}} = -i\hbar\,\mathbf{r}\times\nabla$. If we cast this in spherical polar coordinates, we find
$$\hat L_z = -i\hbar\frac{\partial}{\partial\phi}, \qquad \hat L^2 = -\hbar^2\Bigl[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\Bigl(\sin\theta\frac{\partial}{\partial\theta}\Bigr) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\Bigr].$$
But $\hat L^2$ is just the angular part of $-\hbar^2 r^2\nabla^2$. So its eigenfunctions, the spherical harmonics, are in fact states of definite squared angular momentum, which is quantised and takes the values $\hbar^2 l(l+1)$ for integer $l$. They are also states of definite $L_z$, which takes the values $\hbar m$. It is good to see that as $|m| \leq l$, $L_z^2 \leq L^2$, with equality only if both are zero.
We have already noted that we can simultaneously know the total angular momentum and its
z component, since the corresponding operators commute. (This should be obvious as L̂2 is
independent of φ.) Since L̂2 commutes with the Hamiltonian for a system with a symmetric
potential, states may be fully classified by their energy, the square of their angular momentum,
and the z-component of angular momentum.
For completeness we note that the formulae for $\hat L_x$ and $\hat L_y$ are rather lengthy (the $z$-axis has a special role in spherical polars), but they can be expressed more succinctly via $\hat L_\pm = \hat L_x \pm i\hat L_y$, where
$$\hat L_+ = \hbar e^{i\phi}\Bigl(\frac{\partial}{\partial\theta} + i\cot\theta\frac{\partial}{\partial\phi}\Bigr), \qquad \hat L_- = \hat L_+^\dagger = \hbar e^{-i\phi}\Bigl(-\frac{\partial}{\partial\theta} + i\cot\theta\frac{\partial}{\partial\phi}\Bigr).$$
The operators L̂± both commute with L̂2 , though it is somewhat easier to show it for L̂x and
L̂y using Cartesian coordinates.
There is more on angular momentum in section A.5.
For a 2D circular infinite square well with $V = 0$ for $r < R$, the solutions are Bessel functions as mentioned above; actually there is a different Bessel function for each value of $m$, so
$$\psi(r, \phi) \propto J_m(kr)\,e^{im\phi}.$$
The ground state is circularly symmetric, $m = 0$, and there is a tower of $m = 0$ states with $k$, and hence $E$, determined by $J_0(kR) = 0$. However the (doubly-degenerate) first excited states have $m = \pm1$ and $k$ fixed by requiring $kR$ to be the first zero of $J_1$.
Similarly for a 3D spherical well, the solutions are spherical Bessel functions:
$$\psi(r, \theta, \phi) \propto j_l(kr)\,Y_l^m(\theta, \phi).$$
This time the energy levels depend on $l$ but not $m$ (as expected, since the $z$-axis is arbitrary). Similar remarks as in the previous paragraph can be made about the ordering of levels: first the $l = 0$ ground state, then the $l = 1$ first excited states, etc.
Interestingly, for 3D the radial part of $\nabla^2 R$ simplifies if we set $R(r) = u(r)/r$, becoming $u''/r$. Even in the absence of an external potential there is the centrifugal potential to stop the equation for the radial wave function being trivial, but for the special case of $l = 0$ we simply have $u(r) = \sin(kr)$. And indeed $j_0(z) = \sin z/z$. So the $l = 0$ states of a spherical square well have energies $E_n^{l=0} = n^2\pi^2\hbar^2/(2MR^2)$.
Finite circular and spherical square wells can also be solved numerically. A 2D circular well, like a 1D well, always has at least one bound state no matter how shallow it is, but in 3D a sufficiently-shallow spherical well will have no bound states. (The $l = 0$ states, having the form within the well of $\sin(kr)/r$ and outside of $e^{-\kappa r}/r$, have the same energies as the antisymmetric states of a 1D well of the same diameter; as we saw when we considered 1D finite wells above, such a state may not exist if the well is too shallow.)
If we solve the symmetric harmonic oscillator $V(r) = \frac12 M\omega^2 r^2$ in 2D or 3D, we find the lowest-energy solution to be $\propto e^{-r^2/2x_0^2}$, with energy $\hbar\omega$ or $\frac32\hbar\omega$ respectively. Higher-energy solutions take the form of the same Gaussian multiplied by polynomials in $r$ and the appropriate angular form $e^{im\phi}$ or $Y_l^m(\theta, \phi)$. The first excited states in the Cartesian basis are just $x$ or $y$ (or $z$) times the Gaussian; the relation between the two forms comes from noting that $x \pm iy = re^{\pm i\phi}$ (2D) or, up to a constant, $rY_1^{\pm1}(\theta, \phi)$ in 3D, while $z \propto rY_1^0$. Though it becomes increasingly tiresome to verify, the degeneracies and energies do all match, as they must. The three Cartesian-basis states with $E = (5/2)\hbar\omega$ all have $l = 1$. From the 6 states with $E = (7/2)\hbar\omega$ discussed above, we now find that from linear combinations we can make one state with $l = 0$ and five with $l = 2$ (and $m = 2, 1, 0, -1, -2$).
Hydrogen atom
The details of finding the radial solutions for the Coulomb potential are extremely lengthy, but
are covered in all the course textbooks. The results are quite simple though, and are given in
section A.6.
A.3.5 Tunnelling
As we saw with the finite square well, there is a finite probability of finding a particle in a
classically-forbidden region where V > E. This means that a particle incident on a barrier of
finite height and thickness has a finite chance of tunnelling through and continuing on the other
side. (See picture below.) To find how likely this is, we need to solve the Schrödinger equation
and look at the probability density beyond the barrier. For an individual particle we would need
to set up a localised incoming wave packet with, necessarily, a spread of momenta and energies,
which would be a complicated, time-dependent problem. Instead we consider the steady-state
problem of a beam of mono-energetic particles incident from the left on the barrier; there will
also be a reflected beam travelling in the opposite direction to the left of the barrier and a
transmitted beam beyond the barrier. (This is an example of a scattering problem). Without
loss of generality we can take the potential to be zero on the left; the wave function for the incident beam, with momentum $\hbar k$ and energy $E = \hbar\omega = \hbar^2 k^2/(2M)$, is then $Ie^{i(kx - \omega t)}$, the probability density is $|I|^2$, and the flux of particles in the beam is $(p/M)|I|^2$ particles per second. We are looking for a single solution to the TISE, albeit one with different functional forms in different regions, so the energy is the same everywhere and the time-dependence $e^{-i\omega t}$ is likewise universal, so we will drop it.
Beyond the barrier the potential need not be the same as on the left, so we will let it be $V_f$; the momentum of particles that tunnel through will be $p' = \sqrt{2M(E - V_f)}$; the wave function is $Te^{ik'x}$ and the transmitted flux $(p'/M)|T|^2$. Finally the reflected wave is $Re^{-ikx}$ and the reflected flux $(p/M)|R|^2$. The transmission and reflection probabilities are
$$t = \frac{p'}{p}\Bigl|\frac TI\Bigr|^2, \qquad r = \Bigl|\frac RI\Bigr|^2; \qquad t + r = 1.$$
For a rectangular barrier of height $V_b$ and width $L$, with $\hbar\kappa = \sqrt{2M(V_b - E)}$ inside the barrier, matching the wave function and its derivative at the two edges gives
$$t = \frac{4\kappa^2 kk'}{(\kappa^2 - kk')^2\sinh^2(\kappa L) + \kappa^2(k' + k)^2\cosh^2(\kappa L)}.$$
But if $\kappa L \gg 1$, $\cosh(\kappa L)$ and $\sinh(\kappa L)$ both tend to $\frac12 e^{\kappa L}$ and the expression simplifies to
$$t \approx \frac{16\kappa^2 kk'}{(\kappa^2 + k^2)(\kappa^2 + k'^2)}\,e^{-2\kappa L}.$$
This is equivalent to ignoring the exponentially-growing wave within the barrier, so that the ratio of the wave function at either end of the barrier is just $e^{-\kappa L}$.
Recall that $k$, $k'$ and $\kappa$ are all functions of energy. By far the most rapidly-varying part of the expression is the exponential, and a rough estimate of the dependence of the tunnelling probability on energy for a high, wide barrier is given simply by $t \sim \exp\bigl(-2L\sqrt{2M(V_b - E)}/\hbar\bigr)$.
In the figure below we can note the approximately exponential decay within the barrier, the
continuity and smoothness of the wave function, and the much smaller amplitude and (since
Vf is higher than the initial potential) the longer wavelength on the right.
It may be noted that there will also be some reflection even if E > Vb , or indeed from a well
rather than a barrier.
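The energy dependence is easy to explore numerically; the sketch below ($\hbar = M = 1$, illustrative barrier parameters of my own choosing) compares the exact expression with the rough exponential estimate:

    import numpy as np

    Vb, Vf, L = 10.0, 0.0, 2.0          # barrier height, final potential, width

    def t_exact(E):
        k, kp = np.sqrt(2 * E), np.sqrt(2 * (E - Vf))
        kappa = np.sqrt(2 * (Vb - E))
        num = 4 * kappa**2 * k * kp
        den = ((kappa**2 - k * kp)**2 * np.sinh(kappa * L)**2
               + kappa**2 * (k + kp)**2 * np.cosh(kappa * L)**2)
        return num / den

    for E in (2.0, 5.0, 8.0):
        # exact t, then the estimate exp(-2 L sqrt(2(Vb - E)))
        print(E, t_exact(E), np.exp(-2 * L * np.sqrt(2 * (Vb - E))))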
A.4 Series solution of Hermite's equation and the Harmonic Oscillator
Shankar 7.3
Griffiths 2.3.2
Writing $\psi(y) = h(y)\,e^{-y^2/2}$, where $y = x/x_0$ and the energy $E$ is measured in units of $\frac12\hbar\omega$, the TISE reduces to Hermite's equation, $h'' - 2yh' + (E - 1)h = 0$. Substituting the power series $h(y) = \sum_j c_j y^j$ gives
$$\sum_j \bigl[(j+1)(j+2)c_{j+2} + (E - 1 - 2j)c_j\bigr]y^j = 0,$$
where we have changed the summation index in the first sum before relabelling it $j$. The only way a polynomial can vanish for all $y$ is if all the coefficients vanish, so we have a recurrence relation:
$$(j+1)(j+2)c_{j+2} + (E - 1 - 2j)c_j = 0. \tag{A.6}$$
Given $c_0$ and $c_1$, we can construct all other coefficients from this equation, for any $E$. We obtain two independent solutions, as expected for a second-order differential equation: even solutions with $c_1 = 0$ and odd ones with $c_0 = 0$.
However, we need the wave function to be normalisable (square integrable), which means that
it tends to 0 as x → ±∞. In general an infinite polynomial times a Gaussian will not satisfy
this, and these solutions are not physically acceptable. If we look again at equation (A.6),
though, we see that if E = 1 + 2n for some integer n ≥ 0, then cn+2 , cn+4 , cn+6 . . . are all zero.
Thus for E = 1, 5, 9 . . . we have finite even polynomials, and for E = 3, 7, 11 . . . we have finite
odd polynomials. These are called the Hermite polynomials.
Rewriting (A.6) with $E = 1 + 2n$ as
$$c_{j+2} = \frac{2(j - n)}{(j+1)(j+2)}\,c_j, \tag{A.7}$$
we have for instance, for $n = 5$, $c_3 = -\frac43 c_1$, $c_5 = \frac4{15} c_1$ and $c_7 = c_9 = \ldots = 0$; choosing $c_1$ so that the leading coefficient is $2^n$ (the usual convention) gives
$$H_5(y) = 32y^5 - 160y^3 + 120y. \tag{A.8}$$
The first few Hermite polynomials are
$$H_0(y) = 1;\;\; H_1(y) = 2y;\;\; H_2(y) = 4y^2 - 2;\;\; H_3(y) = 8y^3 - 12y;\;\; H_4(y) = 16y^4 - 48y^2 + 12. \tag{A.9}$$
The corresponding solutions of the original Hamiltonian, returning to unscaled coordinates, are
$$\psi_n(x) \propto H_n(x/x_0)\,e^{-x^2/(2x_0^2)}, \qquad\text{with}\qquad E_n = (n + \tfrac12)\hbar\omega.$$
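The recurrence (A.7) can be run directly; this short sketch (my own, using the conventional leading coefficient $2^n$) reproduces the polynomials in (A.9):

    import numpy as np

    def hermite_coeffs(n):
        # c[j] is the coefficient of y^j in H_n(y).
        c = np.zeros(n + 1)
        c[n] = 2.0**n                     # conventional leading coefficient
        for j in range(n - 2, -1, -2):    # run (A.7) downwards: c_j from c_{j+2}
            c[j] = c[j + 2] * (j + 1) * (j + 2) / (2 * (j - n))
        return c

    print(hermite_coeffs(3))   # [  0. -12.   0.   8.]        i.e. 8y^3 - 12y
    print(hermite_coeffs(4))   # [ 12.   0. -48.   0.  16.]   i.e. 16y^4 - 48y^2 + 12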
A.5 Angular momentum
The commutation relations imply that we can only simultaneously know $L^2$ and one component, taken conventionally to be $L_z$. The common eigenfunctions of $\hat L^2$ and $\hat L_z$ are the spherical harmonics, $Y_l^m(\theta, \phi)$:
$$\hat L^2\, Y_l^m(\theta, \phi) = \hbar^2 l(l+1)\, Y_l^m(\theta, \phi), \qquad \hat L_z\, Y_l^m(\theta, \phi) = \hbar m\, Y_l^m(\theta, \phi).$$
From requirements that the wave function must be finite everywhere, and single-valued under $\phi \to \phi + 2\pi$, it emerges that $l$ and $m$ are integers and must satisfy
$$l = 0, 1, 2, \ldots, \qquad m = -l, -l+1, \ldots, l.$$
See the end of these notes for some explicit forms of spherical harmonics.
Spin angular momentum $\hat{\mathbf{S}}$ obeys the same algebra, with eigenvalues of $\hat S^2$ equal to $\hbar^2 s(s+1)$ and of $\hat S_z$ equal to $\hbar m_s$, but half-integer quantum numbers are now allowed:
$$s = 0, \tfrac12, 1, \tfrac32, \ldots, \qquad m_s = -s, -s+1, \ldots, s.$$
$s = \frac12$ for an electron, $s = 1$ for a photon or W boson. This means that the magnitude of the spin vector of an electron is $(\sqrt3/2)\hbar$, but we always just say "spin-$\frac12$".
If a particle has both orbital and spin angular momentum, we talk about its total angular momentum, with operator
$$\hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}}.$$
As with spin, the eigenvalues of $\hat J^2$ are $\hbar^2 j(j+1)$, with
$$j = 0, \tfrac12, 1, \tfrac32, \ldots, \qquad m_j = -j, -j+1, \ldots, j.$$
Systems composed of more than one particle (hadrons, nuclei, atoms) will have many contributions to their total angular momentum. It is sometimes useful to add up all the spins to give a total spin, and now, confusingly, we denote the quantum numbers by $S$ and $M_S$, so it is really important to distinguish operators and the corresponding quantum numbers. Then
$$\hat{\mathbf{S}}^{\rm tot} = \sum_i \hat{\mathbf{S}}^{(i)}, \qquad \hat{\mathbf{L}}^{\rm tot} = \sum_i \hat{\mathbf{L}}^{(i)}, \qquad \hat{\mathbf{J}}^{\rm tot} = \hat{\mathbf{L}}^{\rm tot} + \hat{\mathbf{S}}^{\rm tot}.$$
It follows that the eigenvalues of $(\hat L^{\rm tot})^2$, $(\hat S^{\rm tot})^2$ and $(\hat J^{\rm tot})^2$ have exactly the same form, with the same restrictions on the quantum numbers, as those for a single particle. So for instance the eigenvalues of $(\hat S^{\rm tot})^2$ are $\hbar^2 S(S+1)$, and of $\hat S_z^{\rm tot}$ are $\hbar M_S$, and
$$L = 0, 1, 2, \ldots, \qquad S = 0, \tfrac12, 1, \tfrac32, \ldots, \qquad J = 0, \tfrac12, 1, \tfrac32, \ldots,$$
$$J = |L - S|, |L - S| + 1, \ldots, L + S, \qquad M_J = M_L + M_S.$$
However since $\hat L_z$ and $\hat S_z$ do not commute with $\hat J^2$, we cannot know $J$, $M_L$ and $M_S$ simultaneously. For a single-particle system, replace $J$, $L$, and $S$ with $j$, $l$, and $s$.
More generally, for the addition of any two angular momenta with quantum numbers $J_1, M_1$ and $J_2, M_2$, the rules are
$$J = |J_1 - J_2|, |J_1 - J_2| + 1, \ldots, J_1 + J_2, \qquad M_J = M_1 + M_2.$$
The simplest example is the addition of two spin-$\frac12$ particles. For two such particles there are four states: $\uparrow\uparrow$, $\downarrow\downarrow$, $\uparrow\downarrow$ and $\downarrow\uparrow$. The first two states have $M_S = 1$ and $-1$ respectively, and we can show, using $\hat{\mathbf{S}}^{\rm tot} = \hat{\mathbf{S}}^{(1)} + \hat{\mathbf{S}}^{(2)}$, that they are also eigenstates of $(\hat S^{\rm tot})^2$ with $S = 1$. However the second two, though they have $M_S = 0$, are not eigenstates of $(\hat S^{\rm tot})^2$. To make those, we need linear combinations, tabulated below:
             S = 1                  S = 0
M = 1        ↑↑
M = 0        (↑↓ + ↓↑)/√2           (↑↓ − ↓↑)/√2
M = −1       ↓↓
The S = 1 states are symmetric under exchange of particles; the S = 0 states are antisymmetric.
For a system of N spin- 21 particles, S will be integer if N is even and half-integer if N is odd.
• If two electrons in an atom are in the same orbital (thus their spatial wave function is
symmetric under exchange of the two), they must be in an S = 0 state.
• Thus the ground state of helium has S = 0, but the excited states can have S = 0
(parahelium) or S = 1 (orthohelium).
• Two π 0 mesons must have even relative orbital angular momentum L (they are spinless,
so this is the only contribution to their wave function).
• Two ρ0 mesons (spin-1 particles) can have odd or even relative orbital angular momentum
L, but their spin state must have the same symmetry as their spatial state. (In this case,
S = 2 and 0 are even, S = 1 is odd.)
Note that in the last two, in the centre-of-momentum frame the spatial state only depends on
the relative coordinate r. So interchanging the particles is equivalent to r → −r, ie the parity
operation.
Spherical Harmonics
In spherical polar coordinates the orbital angular momentum operators are
$$\hat L_z = -i\hbar\frac{\partial}{\partial\phi}, \qquad \hat L^2 = -\hbar^2\Bigl[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\Bigl(\sin\theta\frac{\partial}{\partial\theta}\Bigr) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\Bigr].$$
The spherical harmonics, $Y_l^m(\theta, \phi)$, are eigenfunctions of $\hat L^2$ and $\hat L_z$; the first few are as follows:
$$Y_0^0(\theta,\phi) = \sqrt{\frac{1}{4\pi}} \qquad\qquad Y_1^{\pm1}(\theta,\phi) = \mp\sqrt{\frac{3}{8\pi}}\,\sin\theta\, e^{\pm i\phi}$$
$$Y_1^0(\theta,\phi) = \sqrt{\frac{3}{4\pi}}\,\cos\theta \qquad\qquad Y_2^{\pm2}(\theta,\phi) = \sqrt{\frac{15}{32\pi}}\,\sin^2\theta\, e^{\pm2i\phi}$$
$$Y_2^{\pm1}(\theta,\phi) = \mp\sqrt{\frac{15}{8\pi}}\,\sin\theta\cos\theta\, e^{\pm i\phi} \qquad\qquad Y_2^0(\theta,\phi) = \sqrt{\frac{5}{16\pi}}\,(3\cos^2\theta - 1)$$
A.6 Hydrogen wave functions
The solutions of the Schrödinger equation for the Coulomb potential $V(r) = -\hbar c\alpha/r$ have energy $E_n = -\frac{1}{n^2}E_{\rm Ry}$, where $E_{\rm Ry} = \frac12\alpha^2 mc^2 = 13.6\ \text{eV}$ (with $m$ the reduced mass of the electron-proton system). (Recall $\alpha = e^2/(4\pi\epsilon_0\hbar c) \approx 1/137$.) The spatial wavefunctions are
$$\psi_{nlm}(\mathbf{r}) = R_{n,l}(r)\,Y_l^m(\theta, \phi).$$
The radial wavefunctions are as follows, where $a_0 = \hbar c/(mc^2\alpha)$:
$$R_{1,0}(r) = \frac{2}{a_0^{3/2}}\exp\Bigl(-\frac{r}{a_0}\Bigr),$$
$$R_{2,0}(r) = \frac{2}{(2a_0)^{3/2}}\Bigl(1 - \frac{r}{2a_0}\Bigr)\exp\Bigl(-\frac{r}{2a_0}\Bigr),$$
$$R_{2,1}(r) = \frac{1}{\sqrt3\,(2a_0)^{3/2}}\,\frac{r}{a_0}\exp\Bigl(-\frac{r}{2a_0}\Bigr),$$
$$R_{3,0}(r) = \frac{2}{(3a_0)^{3/2}}\Bigl(1 - \frac{2r}{3a_0} + \frac{2r^2}{27a_0^2}\Bigr)\exp\Bigl(-\frac{r}{3a_0}\Bigr),$$
$$R_{3,1}(r) = \frac{4\sqrt2}{9\,(3a_0)^{3/2}}\,\frac{r}{a_0}\Bigl(1 - \frac{r}{6a_0}\Bigr)\exp\Bigl(-\frac{r}{3a_0}\Bigr),$$
$$R_{3,2}(r) = \frac{2\sqrt2}{27\sqrt5\,(3a_0)^{3/2}}\,\Bigl(\frac{r}{a_0}\Bigr)^2\exp\Bigl(-\frac{r}{3a_0}\Bigr).$$
They are normalised, so $\int_0^\infty (R_{n,l}(r))^2\, r^2\, dr = 1$. Radial wavefunctions of the same $l$ but different $n$ are orthogonal (the spherical harmonics take care of orthogonality for different $l$s).
The following radial integrals can be proved:
$$\langle r^2\rangle = \frac{a_0^2\, n^2}{2}\bigl[5n^2 + 1 - 3l(l+1)\bigr],$$
$$\langle r\rangle = \frac{a_0}{2}\bigl[3n^2 - l(l+1)\bigr],$$
$$\Bigl\langle\frac1r\Bigr\rangle = \frac{1}{n^2 a_0},$$
$$\Bigl\langle\frac{1}{r^2}\Bigr\rangle = \frac{1}{(l + 1/2)\, n^3 a_0^2},$$
$$\Bigl\langle\frac{1}{r^3}\Bigr\rangle = \frac{1}{l\,(l + 1/2)(l + 1)\, n^3 a_0^3}.$$
For hydrogen-like atoms (single-electron ions with nuclear charge |e| Z) the results are obtained
by substituting α → Zα (and so a0 → a0 /Z).
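These results are easy to spot-check numerically; for instance, for $R_{2,1}$ the formula gives $\langle r\rangle = (a_0/2)(3\cdot2^2 - 1\cdot2) = 5a_0$. A sketch in units where $a_0 = 1$:

    import numpy as np
    from scipy.integrate import quad

    # R_{2,1} with a0 = 1: r exp(-r/2) / sqrt(24)
    R21 = lambda r: r * np.exp(-r / 2) / np.sqrt(24)

    norm, _ = quad(lambda r: R21(r)**2 * r**2, 0, np.inf)   # normalisation
    r_avg, _ = quad(lambda r: R21(r)**2 * r**3, 0, np.inf)  # <r>
    print(norm, r_avg)   # ~1.0 and ~5.0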
A.7 Properties of δ-functions
The $\delta$-function is defined by its behaviour in integrals:
$$\int_a^b \delta(x - x_0)\, dx = 1; \qquad \int_a^b f(x)\,\delta(x - x_0)\, dx = f(x_0),$$
where the limits $a$ and $b$ satisfy $a < x_0 < b$; the integration simply has to span the point on which the $\delta$-function is centred. The second property is called the sifting property because it picks out the value of $f$ at $x = x_0$.
The following equivalences may also be proved by changing variables in the corresponding integral (an appropriate integration range is assumed for compactness of notation):
$$\delta(ax - b) = \frac{1}{|a|}\,\delta\Bigl(x - \frac ba\Bigr) \qquad\text{since}\qquad \int f(x)\,\delta(ax - b)\, dx = \frac{1}{|a|}\,f\Bigl(\frac ba\Bigr),$$
$$\delta\bigl(g(x)\bigr) = \sum_i \frac{\delta(x - x_i)}{|g'(x_i)|} \qquad\text{where the $x_i$ are the (simple) real roots of $g(x)$.}$$
Note that the dimensions of a δ-function are the inverse of those of its argument, as should be
obvious from the first equation.
Though the $\delta$-function is not well defined as a function (technically it is a distribution rather than a function), it can be considered as the limit of many well-defined functions. For instance the "top-hat" function which vanishes outside a range $a$ and has height $1/a$ tends to a $\delta$-function as $a \to 0$. Similarly a Gaussian whose width and height are inversely proportional tends to a $\delta$-function as the width tends to zero. These are shown in the first two frames below.
Two less obvious functions which tend to a $\delta$-function, shown in the next two frames, are the following:
$$\frac{1}{2\pi}\int_{-L}^{L} e^{i(k - k')x}\, dx = \frac{L}{\pi}\,\mathrm{sinc}\bigl((k - k')L\bigr) \xrightarrow{L \to \infty} \delta(k - k'),$$
$$\frac{L}{\pi}\,\mathrm{sinc}^2\bigl((k - k')L\bigr) \xrightarrow{L \to \infty} \delta(k - k').$$
The first of these does not actually vanish away from the peak, but it oscillates so rapidly
that there will be no contribution to any integral over k 0 except from the point k 0 = k. This
is the integral which gives the orthogonality of two plane waves with different wavelengths:
hk|k 0 i = δ(k − k 0 ). It also ensures that the inverse Fourier transform of a Fourier transform
recovers the original function.
That the normalisation (for integration over $k$) is correct follows from the following two integrals: $\int_{-\infty}^{\infty}\mathrm{sinc}(t)\, dt = \pi$ and $\int_{-\infty}^{\infty}\mathrm{sinc}^2(t)\, dt = \pi$. The second of these follows from the first via integration by parts. The integral $\int_{-\infty}^{\infty}\mathrm{sinc}(t)\, dt = \mathrm{Im}\, I$, where $I = \int_{-\infty}^{\infty} e^{it}/t\, dt$, may be done via the contour integral below:
As no poles are enclosed by the contour, the full contour integral is zero. By Jordan's lemma the integral round the outer circle tends to zero (as $R \to \infty$, $e^{iz}$ decays exponentially in the upper half plane). So the integral along the real axis is equal and opposite to the integral over the inner circle, namely $-\pi i$ times the residue at $z = 0$, so $I = i\pi$. So the imaginary part, the integral of $\mathrm{sinc}(t)$, is $\pi$.
A.8 Gaussian integrals
The following integrals will be useful:
$$\int_{-\infty}^{\infty} e^{-\alpha x^2}\, dx = \sqrt{\frac{\pi}{\alpha}} \qquad\text{and}\qquad \int_{-\infty}^{\infty} x^{2n}\, e^{-\alpha x^2}\, dx = (-1)^n\,\frac{d^n}{d\alpha^n}\sqrt{\frac{\pi}{\alpha}}.$$
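A quick numerical check of the second formula for $n = 2$ (my own sketch): differentiating $\sqrt{\pi/\alpha}$ twice with respect to $\alpha$ gives $\frac34\sqrt{\pi}\,\alpha^{-5/2}$.

    import numpy as np
    from scipy.integrate import quad

    alpha = 1.7
    val, _ = quad(lambda x: x**4 * np.exp(-alpha * x**2), -np.inf, np.inf)
    # (-1)^2 d^2/d(alpha)^2 sqrt(pi/alpha) = (3/4) sqrt(pi) alpha^(-5/2)
    assert np.isclose(val, 0.75 * np.sqrt(np.pi) * alpha**-2.5)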
A.9 Airy functions
Airy functions are the solutions of the differential equation
$$\frac{d^2 f}{dz^2} - zf = 0.$$
There are two solutions, $\mathrm{Ai}(z)$ and $\mathrm{Bi}(z)$; the first tends to zero as $z \to \infty$, while the second blows up. Both are oscillatory for $z < 0$. The Mathematica functions for obtaining them are AiryAi[z] and AiryBi[z].
The Schrödinger equation for a linear potential $V(x) = \beta x$ in one dimension can be cast in the following form:
$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + \beta x\psi - E\psi = 0.$$
Defining $z = x/x_0$, with $x_0 = \bigl(\hbar^2/(2m\beta)\bigr)^{1/3}$ and $E = \bigl(\hbar^2\beta^2/(2m)\bigr)^{1/3}\mu$, and with $y(z) \equiv \psi(x)$, this can be written
$$\frac{d^2 y}{dz^2} - zy + \mu y = 0.$$
The solution is
$$y(z) = C\,\mathrm{Ai}(z - \mu) + D\,\mathrm{Bi}(z - \mu) \qquad\text{or}\qquad \psi(x) = C\,\mathrm{Ai}\bigl((\beta x - E)/(\beta x_0)\bigr) + D\,\mathrm{Bi}\bigl((\beta x - E)/(\beta x_0)\bigr),$$
where $D = 0$ if the solution has to extend to $x = \infty$. The point $z = \mu$, $x = E/\beta$, is the point at which $E = V$ and the solution changes from oscillatory to decaying/growing.
The equation for a potential with a negative slope is given by substituting z → −z in the
defining equation. Hence the general solution is ψ(x) = C Ai(−x/x0 − µ) + D Bi(−x/x0 − µ),
with D = 0 if the solution has to extend to x = −∞.
The first few zeros of the Airy functions are given in Wolfram MathWorld.
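Those zeros determine, for example, the bound-state energies of $V = \beta x$ with a hard wall at $x = 0$: $\psi(0) = 0$ requires $\mathrm{Ai}(-\mu) = 0$. A sketch using scipy (the bracketing intervals are my own choices):

    import numpy as np
    from scipy.optimize import brentq
    from scipy.special import airy

    Ai = lambda z: airy(z)[0]    # airy returns (Ai, Ai', Bi, Bi')

    # Bracket the first three zeros of Ai on the negative axis.
    brackets = [(-3.0, -2.0), (-4.5, -3.5), (-6.0, -5.0)]
    mu = [-brentq(Ai, lo, hi) for lo, hi in brackets]
    print(mu)   # [2.338..., 4.087..., 5.520...]
    # The energies are then E_n = mu_n * (hbar^2 beta^2 / 2m)^(1/3).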
A.10 Units in EM
There are several systems of units in electromagnetism. We are familiar with SI units, but
Gaussian units are still very common and are used, for instance, in Shankar.
In SI units the force between two currents is used to define the unit of current, and hence the unit of charge. (Currents are much easier to calibrate and manipulate in the lab than charges.) The constant $\mu_0$ is defined as $4\pi \times 10^{-7}\ \mathrm{N\,A^{-2}}$, with the magnitude chosen so that the Ampère is a "sensible" sort of size. Then Coulomb's law reads
$$F = \frac{q_1 q_2}{4\pi\epsilon_0\,|\mathbf{r}_1 - \mathbf{r}_2|^2}$$
and $\epsilon_0$ has to be obtained from experiment. (Or, these days, as the speed of light now has a defined value, $\epsilon_0$ is obtained from $1/(\mu_0 c^2)$.)
However one could in principle equally decide to use Coulomb’s law to define charge. This is
what is done in Gaussian units, where by definition
$$F = \frac{q_1 q_2}{|\mathbf{r}_1 - \mathbf{r}_2|^2}.$$
Then there is no separate unit of charge; charges are measured in $\mathrm{N^{1/2}\,m}$ (or the non-SI equivalent): $e = 4.803 \times 10^{-10}\ \mathrm{g^{1/2}\,cm^{3/2}\,s^{-1}}$. (You should never need that!) In these units, $\mu_0 = 4\pi/c^2$. Electric and magnetic fields are also measured in different units.
The following translation table can be used:
Gauss:   e                  E                   B
SI:      e/√(4πε₀)          √(4πε₀) E           √(4π/μ₀) B
Note that $eE$ is the same in both systems of units, but $eB$ in SI units is replaced by $eB/c$ in Gaussian units. Thus the Bohr magneton $\mu_B$ is $e\hbar/2m$ in SI units, but $e\hbar/2mc$ in Gaussian units, and $\mu_B B$ has dimensions of energy in both systems.
The fine-structure constant $\alpha$ is a dimensionless combination of fundamental constants, and as such takes on the same value ($\approx 1/137$) in all systems. In SI it is defined as $\alpha = e^2/(4\pi\epsilon_0\hbar c)$, in Gaussian units as $\alpha = e^2/(\hbar c)$. In all systems, therefore, Coulomb's law between two particles of charge $z_1 e$ and $z_2 e$ can be written
$$F = \frac{z_1 z_2\,\hbar c\,\alpha}{|\mathbf{r}_1 - \mathbf{r}_2|^2}.$$