QM 2
Mathematical Structure
and
Physical Structure
Part 2
John R. Boccio
Emeritus Professor of Physics
Swarthmore College
Chapter 11
Time-Dependent Perturbation Theory
11.1 Theory
Time-independent or stationary-state perturbation theory, as we developed ear-
lier, allows us to find approximations for the energy eigenvalues and eigenvectors
in complex physical systems that are not solvable in closed form and where we
could write H in two parts as
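$$H = H_0 + V \tag{11.1}$$
$$H = H_0 + V_t \tag{11.2}$$
$$i\hbar\,\frac{\partial}{\partial t}\,|\psi_t^{(0)}\rangle = H_0\,|\psi_t^{(0)}\rangle , \qquad t \le t_0 \tag{11.3}$$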
It is a solution of the time-dependent Schrödinger equation with no perturbing
interactions before $t_0$, where
$$H = H_0 , \qquad t \le t_0 \tag{11.4}$$
$$H = H_0 + V_t , \qquad t \ge t_0 \tag{11.5}$$
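The new state of the system then satisfies
$$i\hbar\,\frac{\partial}{\partial t}\,|\psi_t\rangle = H\,|\psi_t\rangle = (H_0 + V_t)\,|\psi_t\rangle , \qquad t \ge t_0 \tag{11.6}$$
with the boundary condition (initial value)
$$|\psi_t\rangle = |\psi_t^{(0)}\rangle \quad \text{at } t = t_0 \tag{11.7}$$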
Since the effect of $H_0$ will be much greater than the effect of $V_t$, most of the time
dependence comes from $H_0$. If we could neglect $V_t$, then since $H_0$ is independent
of time, we would have the simple time dependence
$$|\psi_t\rangle = e^{-\frac{i}{\hbar}H_0 t}\,|\psi_t^{(0)}\rangle \tag{11.8}$$
Let us assume that this is still approximately true and remove this known time
dependence from the solution. This should remove the major portion of the
total time dependence from the problem. We do this by assuming a solution of
the form
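$$|\psi_t\rangle = e^{-\frac{i}{\hbar}H_0 t}\,|\psi(t)\rangle \tag{11.9}$$
and then determining and solving the equation for the new state vector $|\psi(t)\rangle$.
Substituting this assumption in our original equation, the equation for $|\psi(t)\rangle$ is
then given by
$$i\hbar\,\frac{\partial}{\partial t}\left(e^{-\frac{i}{\hbar}H_0 t}\,|\psi(t)\rangle\right) = (H_0 + V_t)\,e^{-\frac{i}{\hbar}H_0 t}\,|\psi(t)\rangle$$
$$H_0\,e^{-\frac{i}{\hbar}H_0 t}\,|\psi(t)\rangle + i\hbar\,e^{-\frac{i}{\hbar}H_0 t}\,\frac{\partial}{\partial t}|\psi(t)\rangle = H_0\,e^{-\frac{i}{\hbar}H_0 t}\,|\psi(t)\rangle + V_t\,e^{-\frac{i}{\hbar}H_0 t}\,|\psi(t)\rangle$$
$$i\hbar\,\frac{\partial}{\partial t}|\psi(t)\rangle = V(t)\,|\psi(t)\rangle \tag{11.10}$$
where
$$V(t) = e^{\frac{i}{\hbar}H_0 t}\,V_t\,e^{-\frac{i}{\hbar}H_0 t} \tag{11.11}$$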
The substitution has removed H0 from the equation and changed the time de-
pendence of the perturbing potential. We are in the so-called interaction picture
or representation where both the state vectors and the operators depend on time
as we discussed earlier in Chapter 6.
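The transformation (11.11) can be illustrated numerically. The following is a minimal sketch, assuming a two-level system with a diagonal $H_0$ and units with $\hbar = 1$; the numerical values and the function name `interaction_picture` are hypothetical choices made only for this illustration. It checks that each matrix element of $V_t$ simply acquires the phase $e^{i(\varepsilon_m-\varepsilon_n)t/\hbar}$.

```python
import numpy as np

hbar = 1.0  # assumption: units with hbar = 1

def interaction_picture(V_s, energies, t):
    """Return exp(+i H0 t/hbar) V_s exp(-i H0 t/hbar) for a diagonal H0."""
    phases = np.exp(1j * np.asarray(energies) * t / hbar)
    # For diagonal H0 both exponentials are diagonal matrices of phases.
    return (phases[:, None] * V_s) * phases.conj()[None, :]

energies = np.array([0.5, 2.0])            # hypothetical eigenvalues of H0
V_s = np.array([[0.0, 0.3], [0.3, 0.0]])   # hypothetical Schrodinger-picture V
t = 1.7

V_int = interaction_picture(V_s, energies, t)
omega_01 = (energies[0] - energies[1]) / hbar
expected_01 = V_s[0, 1] * np.exp(1j * omega_01 * t)
```

The operator stays Hermitian, and only the off-diagonal elements pick up a time dependence, which is why $H_0$ no longer appears explicitly in (11.10).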
We develop a formal solution by integrating this equation of motion for the
state vector to get
$$\int_{t_0}^{t} i\hbar\,\frac{\partial}{\partial t'}|\psi(t')\rangle\,dt' = \int_{t_0}^{t} V(t')\,|\psi(t')\rangle\,dt'$$
The 2nd order approximation is obtained by inserting the 1st order approximation
into the full equation. We get
$$|\psi(t)\rangle = |\psi(t_0)\rangle + \frac{1}{i\hbar}\int_{t_0}^{t} V(t')\,|\psi(t_0)\rangle\,dt' + \frac{1}{(i\hbar)^2}\int_{t_0}^{t} dt'\int_{t_0}^{t'} dt''\,V(t')V(t'')\,|\psi(t_0)\rangle \tag{11.16}$$
Notice that in all subsequent iterations the operators $V(t')$, $V(t'')$, ..., etc., always
occur in order of increasing time from right to left.
where
$$U(t,t_0) = I + \sum_{n=1}^{\infty}\frac{1}{(i\hbar)^n}\int_{t_0}^{t} dt_1\int_{t_0}^{t_1} dt_2 \cdots \int_{t_0}^{t_{n-1}} dt_n\,V(t_1)V(t_2)\cdots V(t_n) \tag{11.18}$$
so that
$$e^{-\frac{i}{\hbar}H_0 t}\,U(t,t_0) = \text{the total time development operator} \tag{11.20}$$
means the product of the operators where the operators are written from right
to left in order of increasing times, i.e.,
$$\left(A(t)B(t')\right)_+ = \begin{cases} A(t)B(t') & t > t' \\ B(t')A(t) & t' > t \end{cases} \tag{11.22}$$
and in general
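$$\left(\int_{t_0}^{t} V(t')\,dt'\right)^{n}_{+} = \left(\int_{t_0}^{t} V(t_1)\,dt_1\int_{t_0}^{t} V(t_2)\,dt_2 \cdots \int_{t_0}^{t} V(t_n)\,dt_n\right)_{+}$$
$$= \int_{t_0}^{t} dt_1\int_{t_0}^{t} dt_2 \cdots \int_{t_0}^{t} dt_n\,\left(V(t_1)V(t_2)\cdots V(t_n)\right)_{+}$$
$$= n!\int_{t_0}^{t} dt_1\int_{t_0}^{t_1} dt_2 \cdots \int_{t_0}^{t_{n-1}} dt_n\,V(t_1)V(t_2)\cdots V(t_n) \tag{11.24}$$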
because there are n! possible orderings of the n terms involved. This last form
is identical to the expression for U (t, t0 ) and thus we have
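$$U(t,t_0) = \sum_{n=0}^{\infty}\frac{1}{(i\hbar)^n}\frac{1}{n!}\left(\int_{t_0}^{t} V(t')\,dt'\right)^{n}_{+} = \left(\exp\left[-\frac{i}{\hbar}\int_{t_0}^{t} V(t')\,dt'\right]\right)_{+} \tag{11.25}$$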
The last expression is just a convenient shorthand for the infinite sum. In order
to verify that this is in fact a solution of
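$$i\hbar\,\frac{\partial}{\partial t}|\psi(t)\rangle = V(t)\,|\psi(t)\rangle \tag{11.26}$$
we must prove that
$$i\hbar\,\frac{\partial}{\partial t}U(t,t_0)\,|\psi(t_0)\rangle = V(t)\,U(t,t_0)\,|\psi(t_0)\rangle \tag{11.27}$$
that is,
$$i\hbar\,\frac{\partial}{\partial t}U(t,t_0) = V(t)\,U(t,t_0) \tag{11.28}$$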
Substituting, we have
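$$i\hbar\,\frac{\partial}{\partial t}U(t,t_0) = i\hbar\,\frac{\partial}{\partial t}\left(\exp\left[-\frac{i}{\hbar}\int_{t_0}^{t} V(t')\,dt'\right]\right)_{+} = \left(V(t)\exp\left[-\frac{i}{\hbar}\int_{t_0}^{t} V(t')\,dt'\right]\right)_{+} \tag{11.29}$$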
Since $t$ is certainly the latest time in the time-ordered product, all the other
operators will be on the right of $V(t)$. We can therefore pull it outside the
time-ordered product and write
$$i\hbar\,\frac{\partial}{\partial t}U(t,t_0) = V(t)\left(\exp\left[-\frac{i}{\hbar}\int_{t_0}^{t} V(t')\,dt'\right]\right)_{+} = V(t)\,U(t,t_0)$$
as required.
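The role of the time ordering can be made concrete with a short numerical sketch. It assumes a two-level interaction-picture operator $V(t)$ with off-diagonal elements $\gamma e^{\pm i\delta t}$ and $\hbar = 1$; the parameter values and the helper names (`expm_herm`, `time_ordered_U`, `naive_U`) are illustrative choices, not part of the text. The time-ordered exponential is approximated by a product of short-time propagators with later times on the left, and compared with the ordinary exponential of the integrated operator.

```python
import numpy as np

hbar, gamma, delta = 1.0, 1.0, 1.0   # illustrative values (assumptions)

def V(t):
    """Interaction-picture perturbation; V(t) and V(t') do not commute."""
    off = gamma * np.exp(1j * delta * t)
    return np.array([[0.0, off], [np.conj(off), 0.0]])

def expm_herm(H, s):
    """exp(-i s H / hbar) for Hermitian H via its eigendecomposition."""
    w, P = np.linalg.eigh(H)
    return (P * np.exp(-1j * s * w / hbar)) @ P.conj().T

def time_ordered_U(t, steps=4000):
    dt = t / steps
    U = np.eye(2, dtype=complex)
    for k in range(steps):
        # Later times multiply from the left, as in the time-ordered product.
        U = expm_herm(V((k + 0.5) * dt), dt) @ U
    return U

def naive_U(t, steps=4000):
    # Ordinary exponential of the integral, ignoring the ordering.
    dt = t / steps
    integral = sum(V((k + 0.5) * dt) for k in range(steps)) * dt
    return expm_herm(integral, 1.0)

U_ordered = time_ordered_U(2.0)
U_naive = naive_U(2.0)
p_ordered = abs(U_ordered[1, 0]) ** 2   # transition probability 1 -> 2
p_naive = abs(U_naive[1, 0]) ** 2
```

Both operators are unitary, but only the ordered product solves (11.28); for these parameters the two transition probabilities differ substantially.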
The most important question (really the only question) that is usually asked in
quantum mechanics is the following:
The probability amplitude for observing the system in the state $|m\rangle$ at time $t$
is given by
$$\langle m|\psi_t\rangle = \langle m|\,e^{-\frac{i}{\hbar}H_0 t}\,U(t,t_0)\,|\psi(t_0)\rangle = \langle m|\,e^{-\frac{i}{\hbar}H_0 t}\,U(t,t_0)\,|n\rangle \tag{11.30}$$
where
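Setting $t_0 = 0$ for simplicity and using the 1st order approximation for $U(t,0)$
and also using
$$\langle m|\,e^{-\frac{i}{\hbar}H_0 t} = \left(e^{\frac{i}{\hbar}H_0 t}\,|m\rangle\right)^{+} = \left(e^{\frac{i}{\hbar}\varepsilon_m t}\,|m\rangle\right)^{+} = \langle m|\,e^{-\frac{i}{\hbar}\varepsilon_m t} \tag{11.32}$$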
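we get
$$\langle m|\psi_t\rangle = \frac{1}{i\hbar}\,e^{-\frac{i}{\hbar}\varepsilon_m t}\int_0^t dt'\,\langle m|\,V(t')\,|n\rangle$$
$$= \frac{1}{i\hbar}\,e^{-\frac{i}{\hbar}\varepsilon_m t}\int_0^t dt'\,\langle m|\,e^{\frac{i}{\hbar}H_0 t'}\,V_{t'}\,e^{-\frac{i}{\hbar}H_0 t'}\,|n\rangle$$
$$= \frac{1}{i\hbar}\,e^{-\frac{i}{\hbar}\varepsilon_m t}\int_0^t dt'\,e^{\frac{i}{\hbar}(\varepsilon_m-\varepsilon_n)t'}\,\langle m|\,V_{t'}\,|n\rangle \tag{11.33}$$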
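$$P_{nm}(t) = \left|\langle m|\,V_{t'}\,|n\rangle\right|^2\left|\frac{1-e^{\frac{i}{\hbar}\Delta t}}{\Delta}\right|^2 = \left|\langle m|\,V_{t'}\,|n\rangle\right|^2\left(\frac{\sin\frac{\Delta t}{2\hbar}}{\Delta/2}\right)^2 \tag{11.36}$$
What this means physically is that the system has absorbed from the perturbing
field (or emitted to it) the energy difference $\Delta = \varepsilon_m - \varepsilon_n$ and therefore the system
has changed its energy.
Does the statement also mean that the state vector has changed from an initial
value $|\psi(0)\rangle = |n\rangle$ to a final value $|\psi(t)\rangle = |m\rangle$?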
We can get a better feeling for the correct answer to this question by deriving
the result in a different manner.
We have
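$$i\hbar\,\frac{\partial}{\partial t}|\psi_t\rangle = H\,|\psi_t\rangle = (H_0 + V_t)\,|\psi_t\rangle \tag{11.37}$$
and
$$H_0\,|n\rangle = \varepsilon_n\,|n\rangle \tag{11.38}$$
As in our development of time-independent perturbation theory, we let
$$V_t = g\,U_t \tag{11.39}$$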
The set of eigenvectors {|ni} is a complete set and therefore we can use it as a
basis for the space and, in particular, we can write
$$|\psi_t\rangle = \sum_n a_n(t)\,e^{-\frac{i}{\hbar}\varepsilon_n t}\,|n\rangle \tag{11.40}$$
The reason for pulling out the phase factors will be clear shortly.
The phase factors we pulled out represent the time dependence due to H0 and
this is, by assumption, the major time dependence in the system.
If $g$ is small we expect the time dependence of $a_n(t)$, which is due to the
perturbation, to be weak, or that
$$\frac{da_n(t)}{dt} \ \text{is small} \tag{11.42}$$
It is in this sense that we can propose to use perturbation theory on the system.
Applying the linear functional $\langle m|$ from the left and using the orthonormality
relation
$$\langle m|n\rangle = \delta_{mn} \tag{11.44}$$
we get
$$i\hbar\,\frac{da_m(t)}{dt} = g\sum_n \langle m|\,U_t\,|n\rangle\,e^{i\omega_{mn}t}\,a_n(t) = \sum_n V_{mn}(t)\,e^{i\omega_{mn}t}\,a_n(t) \tag{11.45}$$
where
$$\omega_{mn} = \frac{\varepsilon_m-\varepsilon_n}{\hbar} \tag{11.46}$$
This is an exact equation. It implies that the time dependence of $a_n(t)$ is due
entirely to $V_t$ (because we explicitly extracted out the dependence due to $H_0$).
This is the interaction picture that we had earlier.
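$$H_0 = \begin{pmatrix} E_1 & 0 \\ 0 & E_2 \end{pmatrix} , \qquad V(t) = \begin{pmatrix} 0 & \gamma e^{i\omega t} \\ \gamma e^{-i\omega t} & 0 \end{pmatrix} = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix} \tag{11.47}$$
$$i\hbar\,\frac{dc_m(t)}{dt} = \sum_n V_{mn}(t)\,e^{i\omega_{mn}t}\,c_n(t) , \qquad |\psi_t\rangle = \sum_n c_n(t)\,e^{-\frac{i}{\hbar}\varepsilon_n t}\,|n\rangle \tag{11.48}$$
or
$$i\hbar\,\frac{dc_1(t)}{dt} = \gamma\,e^{i\left[\omega+\frac{E_1-E_2}{\hbar}\right]t}\,c_2(t) \tag{11.49}$$
$$i\hbar\,\frac{dc_2(t)}{dt} = \gamma\,e^{-i\left[\omega+\frac{E_1-E_2}{\hbar}\right]t}\,c_1(t) \tag{11.50}$$
$$|\psi_t\rangle = c_1(t)\,e^{-\frac{i}{\hbar}E_1 t}\,|1\rangle + c_2(t)\,e^{-\frac{i}{\hbar}E_2 t}\,|2\rangle \tag{11.51}$$
$$\frac{dC(t)}{dt} = -\frac{i}{\hbar}\,\gamma\begin{pmatrix} 0 & e^{i[\omega-\omega_{21}]t} \\ e^{-i[\omega-\omega_{21}]t} & 0 \end{pmatrix} C(t) \tag{11.52}$$
where
$$C(t) = \begin{pmatrix} c_1(t) \\ c_2(t) \end{pmatrix} , \qquad \omega_{21} = \frac{E_2-E_1}{\hbar} \tag{11.53}$$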
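We can find an exact solution. With initial conditions $c_1(0) = 1$ and $c_2(0) = 0$
we get
$$|c_2(t)|^2 = \frac{\gamma^2/\hbar^2}{\frac{\gamma^2}{\hbar^2}+\frac{(\omega-\omega_{21})^2}{4}}\,\sin^2\Omega t = 1 - |c_1(t)|^2 \tag{11.54}$$
where
$$\Omega^2 = \frac{\gamma^2}{\hbar^2}+\frac{(\omega-\omega_{21})^2}{4} \tag{11.55}$$
A graph of these functions is shown in Figure 11.1 below.
At resonance, $\omega = \omega_{21}$, we have
$$\Omega = \frac{\gamma}{\hbar} , \qquad |c_1(t)|^2_{min} = 0 \tag{11.57}$$
as shown in Figure 11.2 below.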
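The closed-form result (11.54)-(11.55) can be checked by integrating the coupled equations (11.49)-(11.50) directly. The sketch below assumes $\hbar = 1$ and arbitrary illustrative values for $\gamma$, $\omega$ and $\omega_{21}$; the fourth-order Runge-Kutta integrator is a generic choice made here, not a method used in the text.

```python
import numpy as np

hbar = 1.0
gamma, omega, omega21 = 0.4, 1.3, 1.0   # illustrative (assumed) values

def rhs(t, c):
    """Right-hand side of (11.49)-(11.50) for c = (c1, c2)."""
    d = omega - omega21
    c1, c2 = c
    return np.array([-1j * gamma / hbar * np.exp(1j * d * t) * c2,
                     -1j * gamma / hbar * np.exp(-1j * d * t) * c1])

def integrate(t_final, steps=5000):
    """Classical RK4 from c1(0) = 1, c2(0) = 0."""
    c = np.array([1.0 + 0j, 0.0 + 0j])
    h = t_final / steps
    t = 0.0
    for _ in range(steps):
        k1 = rhs(t, c)
        k2 = rhs(t + h / 2, c + h / 2 * k1)
        k3 = rhs(t + h / 2, c + h / 2 * k2)
        k4 = rhs(t + h, c + h * k3)
        c = c + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return c

def p2_closed_form(t):
    """|c2(t)|^2 from (11.54) with Omega from (11.55)."""
    Omega = np.sqrt(gamma**2 / hbar**2 + (omega - omega21)**2 / 4)
    return (gamma**2 / hbar**2) / Omega**2 * np.sin(Omega * t)**2

t_final = 5.0
c_final = integrate(t_final)
p2_numeric = abs(c_final[1])**2
```

Off resonance the population of state 2 never reaches one; setting `omega = omega21` in the sketch reproduces the complete transfer described by (11.57).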
Figure 11.3: Amplitude versus
We now return to the full, general equations and look for a perturbation solution.
Now we assume (power series)
initial condition $\rightarrow a_n^{(0)}$
$a_n^{(0)} \rightarrow a_n^{(1)}$ using the 1st order equation
........
$a_n^{(r)} \rightarrow a_n^{(r+1)}$ using the $(r+1)$st order equation
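$$H = H_0 , \qquad t \le 0 \tag{11.60}$$
where
$$H_0\,|n\rangle = \varepsilon_n\,|n\rangle \tag{11.61}$$
and during the time interval $0 \le t \le T$ a perturbation $V_t$ is applied to the
system and the $a_n(t)$ change with time.
The probability that, as a result of the perturbation, the energy of the system
becomes $\varepsilon_r$, is given by
$$\left|\langle r|\psi_t\rangle\right|^2 = \left|\sum_i a_i(t)\,e^{-\frac{i}{\hbar}\varepsilon_i t}\,\langle r|i\rangle\right|^2 = |a_r(t)|^2 \tag{11.62}$$
and as $t \to \infty$ we get
$$\left|\langle r|\psi_t\rangle\right|^2 = |a_r(T)|^2 \tag{11.63}$$
Now to 1st order we have
$$i\hbar\,\frac{da_r^{(1)}}{dt} = \sum_n \langle r|\,U_t\,|n\rangle\,e^{\frac{i}{\hbar}(\varepsilon_r-\varepsilon_n)t}\,a_n^{(0)} \tag{11.64}$$
and
$$a_r(T) = a_r^{(0)}(T) + g\,a_r^{(1)}(T) \tag{11.68}$$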
which is identical to our earlier result as $t \to \infty$.
Now let us return to our question. Has the state changed also?
In the example we found that the perturbation produces a final state $|\psi_t\rangle$ for
$t \ge T$ which to 1st order is
$$|\psi_t\rangle = \sum_n a_n(t)\,e^{-\frac{i}{\hbar}\varepsilon_n t}\,|n\rangle \tag{11.69}$$
Thus, the perturbation does not cause a jump from one stationary state $|i\rangle$ of
$H_0$ to another $|r\rangle$, but instead it produces a non-stationary state.
it is correct to say
or
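An example
Suppose we perturb an oscillator with a decaying electric field of the form
$$V_t = -qE_0\,x\,e^{-t/\tau} , \qquad t \ge 0 \tag{11.71}$$
To 1st order, starting with the initial state $|n\rangle$ with energy
$$\varepsilon_n = \hbar\omega\left(n+\tfrac{1}{2}\right) \tag{11.72}$$
we have
$$|\psi(t)\rangle = |n\rangle + \frac{1}{i\hbar}\int_0^t dt'\,V(t')\,|n\rangle \tag{11.73}$$
where
$$V(t) = e^{\frac{i}{\hbar}H_0 t}\,V_t\,e^{-\frac{i}{\hbar}H_0 t} \tag{11.74}$$
We let $n = 0$ (the ground state) for this example. We then have
$$|\psi(t)\rangle = |0\rangle + \frac{1}{i\hbar}\int_0^t dt'\,e^{\frac{i}{\hbar}H_0 t'}\,(-qE_0\,x)\,e^{-t'/\tau}\,e^{-\frac{i}{\hbar}H_0 t'}\,|0\rangle \tag{11.75}$$
Using
$$e^{-\frac{i}{\hbar}H_0 t}\,|0\rangle = e^{-i\frac{\omega}{2}t}\,|0\rangle \tag{11.76}$$
$$x\,|0\rangle = \sqrt{\frac{\hbar}{2m\omega}}\,|1\rangle \tag{11.77}$$
$$e^{\frac{i}{\hbar}H_0 t}\,|1\rangle = e^{i\frac{3\omega}{2}t}\,|1\rangle \tag{11.78}$$
we get (letting $t \to \infty$)
$$|\psi(t)\rangle = |0\rangle + \frac{1}{i\hbar}\,(-qE_0)\sqrt{\frac{\hbar}{2m\omega}}\int_0^\infty dt'\,e^{i\omega t'-\frac{t'}{\tau}}\,|1\rangle \tag{11.79}$$
and finally,
$$P_{01} = \left|\langle 1|\psi(t)\rangle\right|^2 = \frac{q^2E_0^2}{2m\hbar\omega}\left|\int_0^\infty dt'\,e^{i\omega t'-\frac{t'}{\tau}}\right|^2 = \frac{q^2E_0^2}{2m\hbar\omega}\,\frac{\tau^2}{\omega^2\tau^2+1} \tag{11.80}$$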
We now return to the earlier general result (11.36) we derived for the probability,
namely,
$$P_{0n}(t) = \left|\langle n|\,V_{t'}\,|0\rangle\right|^2\left(\frac{\sin\frac{\Delta t}{2\hbar}}{\Delta/2}\right)^2 \tag{11.81}$$
In Figure 11.4 below we plot this function.
Figure 11.4: Probability(0,n) versus Delta
The height of the central peak is proportional to $t^2$ and the location of the first
zero is at
$$\Delta = \frac{2\pi\hbar}{t} \tag{11.82}$$
so that the width of the peak decreases as $1/t$.
As $t \to \infty$, however, the probability is largest for those states whose energy lies
under the sharp bump near $\Delta = 0$, that is, those states whose energy lies under
the peak around $\varepsilon_0$. Now the energy $\varepsilon_n \approx \varepsilon_0$ lies under the sharp bump when
$$|\Delta| = |\varepsilon_n-\varepsilon_0| < \frac{2\pi\hbar}{t} \tag{11.84}$$
The area under the bump is proportional to $t$ and the rest of the area oscillates
in time around zero. This latter feature means that if $\varepsilon_n \ne \varepsilon_0$, the transition
probability oscillates in time with a repetition time of
$$\frac{2\pi\hbar}{|\varepsilon_n-\varepsilon_0|} \tag{11.85}$$
The case where we are looking for a transition to a single state is, thus, only
valid in perturbation theory for very small time $t$. Otherwise the condition that
$$P_{00}(t) \approx 1 \tag{11.86}$$
will not be true and perturbation theory breaks down. We also note that the
probability cannot grow larger than one or that, after a while, the higher-order
effects of the perturbation which we have neglected so far must become important
and prevent the probability from exceeding one.
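The properties of the time-dependent factor in (11.81) quoted above (peak height proportional to $t^2$, first zero at $\Delta = 2\pi\hbar/t$) can be checked with a few lines of code. This is a sketch with $\hbar = 1$; the function name `lineshape` is a hypothetical label for the factor $\left(\sin\frac{\Delta t}{2\hbar}\big/\frac{\Delta}{2}\right)^2$.

```python
import numpy as np

hbar = 1.0  # assumption: units with hbar = 1

def lineshape(delta, t):
    """(sin(delta*t/(2*hbar)) / (delta/2))**2, finite at delta = 0."""
    x = np.asarray(delta, dtype=float) * t / (2 * hbar)
    # np.sinc(y) = sin(pi*y)/(pi*y), so sin(x)/x = np.sinc(x/pi).
    return (t / hbar)**2 * np.sinc(x / np.pi)**2

t = 3.0
peak = lineshape(0.0, t)                         # central height
first_zero = lineshape(2 * np.pi * hbar / t, t)  # value at (11.82)

d = 0.7
alt_form = abs(1 - np.exp(1j * d * t / hbar))**2 / d**2  # first form in (11.36)
```

The two algebraic forms of the factor agree because $|1-e^{ix}|^2 = 4\sin^2(x/2)$.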
Physically, a more interesting case occurs when the state |ni is one of a contin-
uum of energy states, or it lies in a group of very closely spaced levels.
$$\Gamma = \text{transition rate} = \frac{P_{0n}(t)}{t} = \text{constant as } t \to \infty \tag{11.88}$$
Quantities that we measure are related to the transition rate and this result
says that these measurements will make sense.
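To calculate this transition rate we must sum $P_{0n}$ over the group of final
states. We assume that $|\langle n|\,V_{t'}\,|0\rangle|^2$ is relatively constant over the small group
of states near $|n\rangle$ (has a weak energy dependence).
We then have
$$\sum_{n\ \mathrm{in\ group}} P_{0n}(t) = \left|\langle n|\,V_{t'}\,|0\rangle\right|^2\int_{\mathrm{group}} d\varepsilon_n\,\rho(\varepsilon_n)\left[\frac{\sin\frac{(\varepsilon_n-\varepsilon_0)t}{2\hbar}}{\frac{(\varepsilon_n-\varepsilon_0)}{2}}\right]^2 \tag{11.89}$$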
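where
$$\delta_t(\alpha) = \frac{\sin^2\alpha t}{\pi\alpha^2 t} \tag{11.91}$$
we have that
$$\delta_t(\alpha)\ \begin{cases} = \dfrac{t}{\pi} & \alpha = 0 \\ \le \dfrac{1}{\pi\alpha^2 t} & \alpha \ne 0 \end{cases} \tag{11.92}$$
and
$$\lim_{t\to\infty}\int d\alpha\,\delta_t(\alpha)\,F(\alpha) = F(0) \tag{11.93}$$
Therefore,
$$\lim_{t\to\infty}\delta_t(\alpha) = \delta(\alpha) \tag{11.94}$$
and thus
$$\Gamma = \text{transition rate} = \frac{2\pi}{\hbar}\left|\langle n|\,V\,|0\rangle\right|^2\rho(\varepsilon_n)\Big|_{\varepsilon_n=\varepsilon_0} \tag{11.96}$$
which is called Fermi's Golden Rule.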
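The delta-function limit (11.93)-(11.94) can be illustrated numerically. The sketch below evaluates $\delta_t(\alpha)$ from (11.91) on a finite grid, checks that its area is close to one, and checks that smearing a smooth test function returns approximately $F(0)$ for large $t$. The Gaussian test function, the grid, and the cutoff are arbitrary choices made for the illustration.

```python
import numpy as np

def delta_t(alpha, t):
    """sin^2(alpha*t) / (pi * alpha^2 * t), finite at alpha = 0."""
    x = np.asarray(alpha, dtype=float) * t
    # sin(x)/x written with np.sinc, which is sin(pi*y)/(pi*y).
    return (t / np.pi) * np.sinc(x / np.pi)**2

def trapezoid(y, x):
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

alpha = np.linspace(-200.0, 200.0, 400001)   # finite cutoff (assumption)

area = trapezoid(delta_t(alpha, 5.0), alpha)        # should be close to 1
peak = float(delta_t(0.0, 5.0))                     # equals t/pi by (11.92)

F = np.exp(-alpha**2)                                # test function, F(0) = 1
smeared = trapezoid(delta_t(alpha, 50.0) * F, alpha)
```

The slowly decaying $1/(\alpha^2 t)$ tails are the reason the convergence to $F(0)$ is only of order $1/t$.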
$$V_t = V e^{\eta t}\cos\omega t = \frac{V}{2}\,e^{\eta t}\left(e^{i\omega t}+e^{-i\omega t}\right) \tag{11.97}$$
We have
Z0
1 i n t
hn | (t)i = hn | 0i + e ~ dt0 hn|V (t0 ) |0i
i~
" #
i(n 0 ~) ~t t
e t
e ei(n 0 +~) ~
= + hn| V |0i (11.98)
2 0 n + ~ + i~ 0 n ~ + i~
The first term comes from the $e^{-i\omega t}$ part of $V_t$ (positive frequency) and the
second term comes from the $e^{i\omega t}$ part of $V_t$ (negative frequency). The last term
represents interference effects.
Since
$$P_{0n}(t) = \Gamma_{0n}\,t \tag{11.100}$$
we have
$$\Gamma_{0n} = \frac{dP_{0n}(t)}{dt} \tag{11.101}$$
and thus
0n (11.102)
h i
1 1
2t
e 2
(0 n +~)2 +(~)2 + (0 n ~)2 +(~)2
(1 cos 2t)
= hn| V |0i
h i
2it
4 e
+2 sin 2t (0 n +~+i~)(
0 n ~+i~)
The sine and cosine terms arise from the interference term. In the limit $\eta \to 0$
and assuming that $|n\rangle$ is in the continuum part of the spectrum, we have
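Thus, the $e^{-i\omega t}$ part produced a $\Delta E > 0$ process (absorption), while the $e^{i\omega t}$
part produced a $\Delta E < 0$ process (emission).
which is only applied for a finite time interval $0 \le t \le T$. If we start with energy
$\varepsilon_i$ in the state $|i\rangle$, then at any time $t \ge T$
$a_f^{(1)}(t)$ = 1st order amplitude for the state $|\psi(t)\rangle$
to have energy $\varepsilon_f$ (be in state $|f\rangle$ ?), $f \ne i$
is given by
$$a_f^{(1)}(T) = \frac{1}{i\hbar}\,\langle f|\,V\,|i\rangle\int_0^T e^{i(\omega_{fi}-\omega)t}\,dt + \frac{1}{i\hbar}\,\langle f|\,V^{+}\,|i\rangle\int_0^T e^{i(\omega_{fi}+\omega)t}\,dt \tag{11.105}$$
and
$$\left|a_f^{(1)}(T)\right|^2 = \text{probability that the final energy will be } \varepsilon_f \tag{11.106}$$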
As an example we consider spin resonance (we solved this problem exactly ear-
lier).
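We consider a spin = 1/2 particle in a static magnetic field $B_0$ (in the z-direction).
This says that the unperturbed Hamiltonian is
$$H_0 = -\frac{1}{2}\,\hbar\gamma B_0\,\sigma_z \tag{11.107}$$
This operator has the eigenvectors and eigenvalues
$$|+\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} , \quad |-\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} , \quad \varepsilon_\pm = \mp\frac{1}{2}\,\hbar\gamma B_0 \tag{11.108}$$
We now perturb the system with another magnetic field $B_1$, which is rotating
in the $x$-$y$ plane with angular velocity $\omega$. This implies that
$$V_t = -\frac{1}{2}\,\hbar\gamma B_1\,\vec\sigma\cdot\left(\cos\omega t\,\hat i+\sin\omega t\,\hat j\right)$$
$$= -\frac{1}{2}\,\hbar\gamma B_1\left[\sigma_x\cos\omega t+\sigma_y\sin\omega t\right] = -\frac{1}{2}\,\hbar\gamma B_1\begin{pmatrix} 0 & e^{-i\omega t} \\ e^{i\omega t} & 0 \end{pmatrix}$$
$$= -\frac{1}{2}\,\hbar\gamma B_1\left(V e^{-i\omega t}+V^{+}e^{i\omega t}\right) \tag{11.109}$$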
where
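$$V = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} , \qquad V^{+} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \tag{11.110}$$
We choose the initial state to be
where
$$\omega_0 = \gamma B_0 , \qquad \omega_1 = \gamma B_1 \tag{11.113}$$
and we have used
$$\langle -|\,V\,|+\rangle = 0 , \qquad \langle -|\,V^{+}\,|+\rangle = -\frac{1}{2}\,\hbar\gamma B_1 \tag{11.114}$$
$$\omega_{fi} = \frac{\varepsilon_f-\varepsilon_i}{\hbar} = \omega_0 \tag{11.115}$$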
When is the first order perturbation theory result valid?
If we compare the exact result with perturbation theory by expanding the exact
result in a power series, we find that the two results agree exactly if
$$\left|\frac{\omega_1}{\omega_0+\omega}\right| \ll 1 \tag{11.116}$$
The exact result is
$$|a_f(T)|^2 = \frac{\omega_1^2}{\Omega^2}\,\sin^2\frac{\Omega T}{2} \tag{11.117}$$
where $\Omega^2 = (\omega_0+\omega)^2+\omega_1^2$ and perturbation theory gives
$$\left|a_f^{(1)}(T)\right|^2 = \left(\frac{\omega_1 T}{2}\right)^2 \tag{11.118}$$
Thus, the results agree only if $|\omega_1 T| \ll 1$ or if the perturbation only acts for a
short time.
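The comparison between the exact and first-order results can be made quantitative with a short sketch. It assumes the resonant case $\omega = -\omega_0$, the generalized Rabi form $\frac{\omega_1^2}{\Omega^2}\sin^2\frac{\Omega T}{2}$ for the exact probability, and $(\omega_1 T/2)^2$ for the first-order probability; the numerical values and the function names are illustrative.

```python
import math

def p_exact(T, omega0, omega, omega1):
    """Exact transition probability with Omega^2 = (omega0 + omega)^2 + omega1^2."""
    Omega = math.sqrt((omega0 + omega)**2 + omega1**2)
    return omega1**2 / Omega**2 * math.sin(Omega * T / 2)**2

def p_first_order(T, omega1):
    """First-order perturbation theory at resonance."""
    return (omega1 * T / 2)**2

omega0, omega1 = 10.0, 1.0      # illustrative values
omega = -omega0                 # resonance condition assumed here

short = (p_exact(0.05, omega0, omega, omega1), p_first_order(0.05, omega1))
long_ = (p_exact(3.0, omega0, omega, omega1), p_first_order(3.0, omega1))
```

For $\omega_1 T = 0.05$ the two numbers agree to better than a part in a thousand, while for $\omega_1 T = 3$ the first-order expression exceeds one and is no longer a probability.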
The Hamiltonian for an electron in an atom interacting with an electromagnetic
field is
$$H = \frac{\left(\vec p_{op}-\frac{q}{c}\vec A\right)^2}{2m_e} + q\phi + U \tag{11.119}$$
where $q = -e$, $U$ = the potential energy function that binds the electrons in
the atom, and $\vec A$ and $\phi$ are the vector and scalar potentials associated with the
electromagnetic field.
$$\vec E = -\nabla\phi-\frac{1}{c}\frac{\partial\vec A}{\partial t} , \qquad \vec B = \nabla\times\vec A \tag{11.120}$$
Note that if $\vec A = 0$, then $\vec B = 0$ and
$$\phi = -\int_0^{\vec r}\vec E(\vec r,t)\cdot d\vec r \tag{11.121}$$
We rewrite $H$ as
$$H = H_0 + V \tag{11.122}$$
where
$$H_0 = \frac{\vec p_{op}^{\,2}}{2m_e} + U = \text{Hamiltonian for the atom with no electromagnetic field} \tag{11.123}$$
and
$$V = -\frac{q}{2m_e c}\left(\vec p_{op}\cdot\vec A+\vec A\cdot\vec p_{op}\right) + \frac{q^2}{2m_e c^2}\,\vec A\cdot\vec A + q\phi \tag{11.124}$$
is the perturbation due to the presence of the electromagnetic field, i.e., the
term $V$ tells us how the atom interacts with the electromagnetic field.
In Gaussian units $|\vec E| \approx |\vec B|$, but the force due to $\vec B$ $\approx (v/c)\,\times$ the force due
to $\vec E$. Thus, magnetic effects are negligible in most atoms compared to electric
effects. We therefore assume
1. $\vec E \approx$ constant over the volume of the atom
2. $\vec B$ can be neglected
This is the so-called electric dipole approximation.
$$\phi = -\int_0^{\vec r}\vec E(\vec r,t)\cdot d\vec r \tag{11.125}$$
to the Hamiltonian.
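If $\vec E(\vec r,t) = \vec E(\vec r)\,e^{-i\omega t}$, then $\vec A = c\vec E/i\omega$. Using
$$\frac{\vec p_{op}}{m} = \frac{i}{\hbar}\left[H_0,\vec r_{op}\right] \tag{11.132}$$
we get
$$V_{Ap} = -\frac{q}{\hbar\omega}\left[H_0,\vec r_{op}\right]\cdot\vec E \tag{11.133}$$
We can then calculate matrix elements
$$\langle m|\,V_{Ap}\,|n\rangle = -\frac{q}{\hbar\omega}\,\langle m|\,H_0\vec r_{op}-\vec r_{op}H_0\,|n\rangle\cdot\vec E$$
$$= -\frac{q}{\hbar\omega}\,(\varepsilon_m-\varepsilon_n)\,\langle m|\,\vec r_{op}\cdot\vec E\,|n\rangle$$
$$= \frac{\omega_{mn}}{\omega}\,\langle m|\,V_\phi\,|n\rangle \tag{11.134}$$
where
$$\omega_{mn} = \frac{\varepsilon_m-\varepsilon_n}{\hbar} \tag{11.135}$$
Thus, the matrix elements of $V_{Ap}$ (from $\vec A\cdot\vec p_{op}$) differ from the matrix elements of
$V_\phi$ (from $\phi$) by the factor $\omega_{mn}/\omega$. This implies different transition probabilities in
first order except at resonance where $\omega_{mn}/\omega = 1$. The reason for the differences
is as follows:
1. we assumed that the perturbation = 0 for $t \le 0$, $t \ge T$
2. there is no physical problem with $V_\phi = -q\,\vec r\cdot\vec E$ changing discontinuously
3. however, if $\vec A\cdot\vec p_{op}$ changes discontinuously, then the relation
$$\vec E = -\frac{1}{c}\frac{\partial\vec A}{\partial t} \tag{11.136}$$
generates spurious $\delta$-function type $\vec E$ fields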
It is clear that one must exercise great care in choosing a starting point for
perturbation theory.
This perturbing potential corresponds to monochromatic (single wavelength)
electromagnetic radiation.
For an initial state $|\psi(0)\rangle = |i\rangle$ where $H_0|i\rangle = \varepsilon_i|i\rangle$ the probability, at any time
$t \ge T$, that the atom will have a final energy $\varepsilon_f$ is
$$P_{if}(T) = \left|a_f^{(1)}(T)\right|^2 \tag{11.138}$$
(i f ~) absorption
(i f + ~) emission
These relationships between the quantum numbers of the initial and final states
that tell us whether or not a transition is allowed are called selection rules.
To determine the selection rules for one-electron atoms we only need to consider
matrix elements of the form
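$$\langle n'\ell' m_{\ell'} m_{s'}|\,\vec r\cdot\vec E_0\,|n\ell m_\ell m_s\rangle \tag{11.142}$$
$$\langle n'\ell' m_{\ell'} m_{s'}|\,x\,|n\ell m_\ell m_s\rangle , \quad \langle n'\ell' m_{\ell'} m_{s'}|\,y\,|n\ell m_\ell m_s\rangle \quad\text{and}\quad \langle n'\ell' m_{\ell'} m_{s'}|\,z\,|n\ell m_\ell m_s\rangle$$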
Now
A typical matrix element, therefore, will have a term like the following:
$$\langle s m_{s'}|s m_s\rangle\int R_{n'\ell'}(r)\,R_{n\ell}(r)\,r^3\,dr\int Y^{*}_{\ell' m_{\ell'}}\,Y_{\ell m_\ell}\,Y_{1m}\,d\Omega \tag{11.146}$$
where $m = \pm 1, 0$.
The radial integral would equal zero only by accident, implying that it is not
part of the general selection rules, which must come from the other terms.
The term $\langle s m_{s'}|s m_s\rangle = \delta_{m_{s'} m_s}$ gives us a simple selection rule (this simple
rule arises here because the interaction does not depend on spin). We have
The rest of the selection rules come from the angular integration terms
$$\int Y^{*}_{\ell' m_{\ell'}}\,Y_{\ell m_\ell}\,Y_{1m}\,d\Omega \tag{11.148}$$
This integral equals 0 unless $\ell+\ell'+1$ = an even integer. This rule follows from
parity considerations. For any angular integration over all angles to be nonzero,
the integrand must be even under the parity operation.
Now
$$Y_{\ell m} \to (-1)^{\ell}\,Y_{\ell m} \tag{11.149}$$
under the parity operation. Therefore
$$Y^{*}_{\ell' m_{\ell'}}\,Y_{\ell m_\ell}\,Y_{1m} \to (-1)^{\ell+\ell'+1}\,Y^{*}_{\ell' m_{\ell'}}\,Y_{\ell m_\ell}\,Y_{1m} \tag{11.150}$$
$$|\ell'-\ell| \le 1 \le \ell'+\ell$$
and the selection rule then follows from the orthogonality condition.
Thus, for transitions within the electric dipole approximation, as defined above,
we have the SELECTION RULES
$$\Delta m_\ell = m_{\ell'}-m_\ell = \pm 1 , 0 \tag{11.152}$$
$$\Delta\ell = \ell'-\ell = \pm 1 \tag{11.153}$$
$$\Delta m_s = 0 = \Delta s \tag{11.154}$$
The derivation is more complex for multi-electron atoms due to the complexity of
the wave function (see next chapter), but it can be shown that, in general, the
SELECTION RULES in the electric dipole approximation are
1. parity changes
2. $\Delta\left(\sum \ell_i\right) = \pm 1$
3. $\Delta S = 0$
4. $\Delta M_S = 0$
6. $\Delta M_L = \pm 1, 0$
7. $\Delta J = \pm 1, 0$
8. $\Delta M_J = \pm 1, 0$
9. $J = 0 \to J = 0$ is strictly forbidden
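The one-electron rule $\Delta\ell = \pm 1$ can be checked numerically in the simplest case. The sketch below treats only the $m_\ell = m_{\ell'} = 0$ matrix elements of $z$, for which the angular integral reduces (up to normalization) to $\int_{-1}^{1} P_{\ell'}(x)\,x\,P_\ell(x)\,dx$; this restriction and the use of Gauss-Legendre quadrature are choices made here for illustration only.

```python
import numpy as np
from numpy.polynomial import legendre as leg

# 20-point Gauss-Legendre rule: exact for polynomials up to degree 39.
nodes, weights = leg.leggauss(20)

def P(l, x):
    """Legendre polynomial P_l evaluated at x."""
    coeffs = np.zeros(l + 1)
    coeffs[l] = 1.0
    return leg.legval(x, coeffs)

def z_angular_integral(l_final, l_initial):
    """Integral of P_l'(x) * x * P_l(x) over [-1, 1] (the m = 0 case)."""
    return float(np.sum(weights * P(l_final, nodes) * nodes * P(l_initial, nodes)))

# Table of integrals for l, l' = 0..5; nonzero entries sit at |l' - l| = 1.
table = np.array([[z_angular_integral(lf, li) for li in range(6)]
                  for lf in range(6)])
```

The vanishing of the entries with $\ell+\ell'+1$ odd is the parity argument given above; the vanishing of the remaining entries with $|\ell'-\ell| > 1$ follows from orthogonality.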
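$$|i\rangle = |100\rangle \ \text{with energy} \ \varepsilon_i = \varepsilon_{100} = -\frac{e^2}{2a_0} \tag{11.155}$$
The final state of the electron in the ionization process is a free particle state (ionized
electron)
$$|f\rangle = |\vec k\rangle \ \text{with energy} \ \varepsilon_f = \varepsilon_k = \frac{\hbar^2k^2}{2m} \tag{11.156}$$
We then have
$$\vec p_{op}\,|\vec k\rangle = \hbar\vec k\,|\vec k\rangle \tag{11.157}$$
which says this is a momentum eigenstate also (for free particles $[H,\vec p_{op}] = 0$
and momentum and energy have the same eigenstates). The momentum is given
by $\vec p = \hbar\vec k$. Since $\varepsilon_f > \varepsilon_i$ this is an absorption process and thus the transition
rate is given by
$$\Gamma_{0\to\vec k} = \text{transition rate} = \lim_{t\to\infty}\frac{P_{0\vec k}(t)}{t} = \frac{2\pi}{\hbar}\left|\langle\vec k|\,V\,|100\rangle\right|^2\delta(\varepsilon_k-\varepsilon_{100}-\hbar\omega) \tag{11.158}$$
where $\omega$ = frequency of the electromagnetic radiation and
We define
For convenience we use a common trick and assume that the universe is a large
box (side = $L$, volume = $L^3$). This allows us to normalize the plane wave states
associated with the free electron. We have
$$\langle\vec r|\vec k\rangle = A\,e^{i\vec k\cdot\vec r} \tag{11.160}$$
$$\langle\vec k|\vec k\rangle = 1 = \int d^3\vec r\,\langle\vec k|\vec r\rangle\langle\vec r|\vec k\rangle = A^2\int d^3\vec r\,e^{-i\vec k\cdot\vec r}\,e^{i\vec k\cdot\vec r} = A^2\int d^3\vec r = A^2L^3 \tag{11.161}$$
or
$$A = \frac{1}{L^{3/2}} \tag{11.162}$$
Now there are
$$L^3\,\frac{d^3\vec k}{(2\pi)^3} = \frac{L^3}{(2\pi)^3}\,k^2\,dk\,d\Omega = \frac{L^3}{(2\pi)^3}\,\frac{mk}{\hbar^2}\,d\Omega\,d\varepsilon_k \tag{11.163}$$
states in the volume $d^3\vec k$ of phase space. This implies that there are
$$\frac{L^3}{(2\pi)^3}\,\frac{mk}{\hbar^2} \tag{11.164}$$
states per unit energy per unit solid angle.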
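The counting rule behind (11.163) can be checked by brute force. The sketch below assumes periodic boundary conditions in the box, so that the allowed wave vectors are $\vec k = (2\pi/L)(n_x,n_y,n_z)$ with integer $n_i$ (an assumption about how the box states are labelled), counts the states in a thin shell of $|\vec k|$, and compares with $\frac{L^3}{(2\pi)^3}\,4\pi k^2\,dk$ integrated over the shell.

```python
import numpy as np

def count_states(L_box, k_lo, k_hi):
    """Count allowed k-vectors (periodic box) with k_lo <= |k| < k_hi."""
    n_max = int(np.ceil(k_hi * L_box / (2 * np.pi))) + 1
    n = np.arange(-n_max, n_max + 1)
    nx, ny, nz = np.meshgrid(n, n, n, indexing="ij")
    k = 2 * np.pi / L_box * np.sqrt(nx**2 + ny**2 + nz**2)
    return int(np.count_nonzero((k >= k_lo) & (k < k_hi)))

def predicted_states(L_box, k_lo, k_hi):
    """Integral of L^3/(2*pi)^3 * 4*pi*k^2 dk over the shell."""
    return L_box**3 / (2 * np.pi)**3 * 4 * np.pi / 3 * (k_hi**3 - k_lo**3)

L_box = 1.0
k_lo, k_hi = 2 * np.pi * 30, 2 * np.pi * 31   # illustrative shell
counted = count_states(L_box, k_lo, k_hi)
predicted = predicted_states(L_box, k_lo, k_hi)
```

The agreement improves as the shell radius grows, which is the sense in which the sum over box states may be replaced by an integral with the density (11.164).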
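Therefore,
$$d\Gamma = \sum_{\vec k\ \mathrm{in}\ d\Omega}\Gamma_{0\to\vec k} = d\Omega\int_0^\infty d\varepsilon_k\,\frac{L^3}{(2\pi)^3}\,\frac{mk}{\hbar^2}\,\frac{\pi e^2}{2\hbar}\left|\langle\vec k|\,\vec r\cdot\vec E\,|100\rangle\right|^2\delta(\varepsilon_k-\varepsilon_{100}-\hbar\omega)$$
where
$$\varepsilon_k = \varepsilon_{100}+\hbar\omega = \frac{\hbar^2k^2}{2m} \tag{11.166}$$
$$k = \left(\frac{2m\omega}{\hbar}-\frac{1}{a_0^2}\right)^{1/2} \tag{11.167}$$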
using ~k R
~ = kz = kr cos . Now in spherical-polar coordinates we have
Z2 Z Z
2 r
= 3/2
d sin d drr3 eikr cos a0 [cos cos + sin sin cos( )]
a0 L3/2
0 0
Since
$$\int_0^{2\pi} d\phi\,\cos\phi = \int_0^{2\pi} d\phi\,\sin\phi = 0 \tag{11.173}$$
the $\phi$ integration wipes out the $\cos(\phi-\phi_E)$ term. Letting $x = \cos\theta$ we then
have
Z1 Z
D
~k ~r E~ |100i = 4E cos r
drr3 eikrx a0 xdx
3/2
(11.174)
a0 L3/2
1 0
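Now, with $\alpha = ikx+\frac{1}{a_0}$,
$$\int_0^\infty dr\,r^3\,e^{-ikrx-\frac{r}{a_0}} = \int_0^\infty dr\,r^3\,e^{-\alpha r} = \frac{3!}{\alpha^4} = \frac{6}{\left(ikx+\frac{1}{a_0}\right)^4} \tag{11.175}$$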
Therefore, we have
Z1
D
~k ~r E~ |100i = 4E cos 6x
4 dx
3/2 3/2
a0 L ikx + 1
1 a0
4E cos 16ka50
= (11.176)
3/2
a0 L3/2 (1 + k 2 a20 )3
and D
~ ~
2 4096 2 E 2 cos2 a70 k 2
k ~r E |100i = (11.177)
L3 (1 + k 2 a20 )6
where
$$k^2 = \frac{2m\omega}{\hbar}-\frac{1}{a_0^2} \tag{11.178}$$
Finally, we get
L3 mke2
Z Z D 2
ionization = d = d~k ~k ~r E~ |100i (11.179)
16 2 ~3
where
d~k = integration over the angles of ~k
(varies direction of arbitrary z - axis)
However, varying the z-direction is the same as keeping E~ fixed and integrating
over dE~ . Therefore, we have
L3 mke2 4096 2 E 2 a70 k 2
Z
ionization = dE~ cos2
16 2 ~3 L3 (1 + k 2 a20 )6
64me2 E 2 a70 k3
= (11.180)
3~3 (1 + k 2 a20 )6
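Now
$$1+k^2a_0^2 = \frac{2m\omega a_0^2}{\hbar} \tag{11.181}$$
and letting
$$\omega_0 = \frac{\hbar}{2ma_0^2} \tag{11.182}$$
we find that
$$1+k^2a_0^2 = \frac{\omega}{\omega_0} \quad\text{and}\quad k = \frac{1}{a_0}\left(\frac{\omega}{\omega_0}-1\right)^{1/2} \tag{11.183}$$
and we get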
3/2
64e2 E 2 a30 0 6
ionization = 1 (11.184)
3~ 0
Thus, there exists a threshold energy for this process, i.e., it cannot occur unless
the energy of the photon is greater than a minimum amount, which makes
physical sense. In particular, we must have $\omega \ge \omega_0$ so that
We also get the correct 6th power term in the answer which agrees with experiment.
Suppose that $H = H(g(t))$, where $g(t)$ tells us the dependence on time. This
might correspond to a variation in time of some parameters. We still have
$$i\hbar\,\frac{d}{dt}|\psi(t)\rangle = H(g(t))\,|\psi(t)\rangle \tag{11.186}$$
and at any instant of time we have
Let us assume that the instantaneous eigenvectors always form a complete set
so that we can write
$$|\psi(t)\rangle = \sum_n \alpha_n(t)\,e^{-i\beta_n(t)}\,|n(g(t))\rangle \tag{11.188}$$
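where we have generalized the phase factor
$$e^{-\frac{i}{\hbar}\varepsilon_n t} \tag{11.189}$$
Inserting this expression for the state vector $|\psi(t)\rangle$ into the time-dependent
Schrödinger equation we get
$$i\hbar\,\frac{d}{dt}\sum_n \alpha_n(t)\,e^{-i\beta_n(t)}\,|n(g(t))\rangle = H(g(t))\sum_n \alpha_n(t)\,e^{-i\beta_n(t)}\,|n(g(t))\rangle$$
$$i\hbar\sum_n \frac{d\alpha_n(t)}{dt}\,e^{-i\beta_n(t)}\,|n(g(t))\rangle + i\hbar\sum_n \alpha_n(t)\,\frac{de^{-i\beta_n(t)}}{dt}\,|n(g(t))\rangle + i\hbar\sum_n \alpha_n(t)\,e^{-i\beta_n(t)}\,\frac{d}{dt}|n(g(t))\rangle$$
$$= \sum_n \alpha_n(t)\,e^{-i\beta_n(t)}\,H(g(t))\,|n(g(t))\rangle = \sum_n \alpha_n(t)\,e^{-i\beta_n(t)}\,E_n(g(t))\,|n(g(t))\rangle$$
$$i\hbar\sum_n \frac{d\alpha_n(t)}{dt}\,e^{-i\beta_n(t)}\,|n(g(t))\rangle + i\hbar\sum_n \alpha_n(t)\,(-i)\,\frac{d\beta_n(t)}{dt}\,e^{-i\beta_n(t)}\,|n(g(t))\rangle$$
$$+\ i\hbar\sum_n \alpha_n(t)\,e^{-i\beta_n(t)}\,\frac{d}{dt}|n(g(t))\rangle = \sum_n \alpha_n(t)\,e^{-i\beta_n(t)}\,E_n(g(t))\,|n(g(t))\rangle$$
Now
$$\frac{d\beta_n(t)}{dt} = \frac{1}{\hbar}\,\frac{d}{dt}\int_0^t E_n(g(t'))\,dt' = \frac{1}{\hbar}\,E_n(g(t)) \tag{11.191}$$
Therefore, we get
$$\sum_n \frac{d\alpha_n(t)}{dt}\,e^{-i\beta_n(t)}\,|n(g(t))\rangle + \sum_n \alpha_n(t)\,e^{-i\beta_n(t)}\,\frac{d}{dt}|n(g(t))\rangle = 0 \tag{11.192}$$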
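Using
$$\langle m|n\rangle = \delta_{mn} \tag{11.195}$$
we have
$$\frac{d\alpha_m}{dt}\,e^{-i\beta_m} = -\sum_n \alpha_n\,e^{-i\beta_n}\,\langle m|\frac{d}{dt}|n\rangle \tag{11.196}$$
$$\frac{d\alpha_m}{dt} = -\sum_n \alpha_n\,e^{-i(\beta_n-\beta_m)}\,\langle m|\frac{d}{dt}|n\rangle \tag{11.197}$$
$$\frac{dH}{dt}\,|n\rangle + H\,\frac{d}{dt}|n\rangle = \frac{dE_n}{dt}\,|n\rangle + E_n\,\frac{d}{dt}|n\rangle \tag{11.198}$$
Again, applying the linear functional $\langle m|$ from the left we get
$$\langle m|\frac{dH}{dt}|n\rangle + \langle m|\,H\,\frac{d}{dt}|n\rangle = \frac{dE_n}{dt}\,\langle m|n\rangle + E_n\,\langle m|\frac{d}{dt}|n\rangle \tag{11.199}$$
For $m \ne n$, using $\langle m|\,H = \langle m|\,E_m$, we have
$$\langle m|\frac{dH}{dt}|n\rangle + E_m\,\langle m|\frac{d}{dt}|n\rangle = E_n\,\langle m|\frac{d}{dt}|n\rangle \tag{11.200}$$
$$\langle m|\frac{d}{dt}|n\rangle = \frac{\langle m|\frac{dH}{dt}|n\rangle}{E_n-E_m} \tag{11.201}$$
Thus, we finally have
$$\frac{d\alpha_m}{dt} = \sum_n \alpha_n\,e^{-i(\beta_n-\beta_m)}\,\frac{\langle m|\frac{dH}{dt}|n\rangle}{E_m-E_n} \tag{11.202}$$
$$\alpha_n(0) = 1$$
$$\alpha_m(0) = 0 \qquad m \ne n$$
$$\frac{d\alpha_m}{dt} \approx \frac{\langle m|\frac{dH}{dt}|n\rangle}{E_m-E_n}\,e^{-i(\beta_n-\beta_m)} \tag{11.204}$$
We now assume that
$$\langle m|\frac{dH}{dt}|n\rangle \quad\text{and}\quad E_m-E_n \tag{11.205}$$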
have slow time dependence and that to this order of approximation we can write
$$e^{-i(\beta_n-\beta_m)} = e^{i(E_m-E_n)\frac{t}{\hbar}} \tag{11.206}$$
We then get
$$\alpha_m(t) \approx -i\hbar\,\frac{\langle m|\frac{dH}{dt}|n\rangle}{(E_m-E_n)^2}\left[e^{i(E_m-E_n)\frac{t}{\hbar}}-1\right] \tag{11.207}$$
This implies that
$$\alpha_m(t) \ \text{remains small for } m \ne n \tag{11.208}$$
The adiabatic theorem assumes that in the case where the system starts in an
eigenstate $|n\rangle$ at $t = 0$, i.e.,
$$\alpha_m(t) = 0 \qquad m \ne n \tag{11.209}$$
and that
$$|\psi(t)\rangle = e^{-i\beta_n(t)}\,|n(g(t))\rangle \tag{11.210}$$
which says that if the system was in the eigenstate $|n\rangle$ at $t = 0$, i.e.,
then at a later time $t$, it is still in the same eigenstate $|n(g(t))\rangle$ of the new
Hamiltonian $H(g(t))$, i.e.,
This means that if we start with a particle in the ground state of a harmonic
oscillator potential
$$V = \frac{1}{2}\,k(0)\,x^2 \ \to\ \psi_0(k(0),x) \tag{11.213}$$
and assume that
$$k(0) \to k(T) \tag{11.214}$$
slowly, the particle ends up in the ground state of the harmonic oscillator
potential
$$V = \frac{1}{2}\,k(T)\,x^2 \ \to\ \psi_0(k(T),x) \tag{11.215}$$
to within a phase factor.
The opposite result comes from the so-called sudden approximation, where the
change occurs so fast that no changes of the state vector are possible.
Since the state vector does not change at all, if you are in the ground state and
a sudden change in the parameters occurs, then you remain in the ground state
for the old parameters. This is not the ground state with new parameters. It is
some linear combination of the new states.
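The contrast between the adiabatic and sudden limits can be seen in a small numerical experiment. The sketch below is an illustration under stated assumptions, not a system treated in the text: a two-level Hamiltonian $H(\theta) = \cos\theta\,\sigma_z+\sin\theta\,\sigma_x$ with $\hbar = 1$, whose parameter $\theta$ is ramped linearly from $0$ to $\pi/2$ in a time $T$. The state starts in the ground state of $H(0)$ and is propagated with a generic fourth-order Runge-Kutta step.

```python
import numpy as np

hbar = 1.0
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def H(theta):
    """Hypothetical two-level Hamiltonian with instantaneous gap 2."""
    return np.cos(theta) * sz + np.sin(theta) * sx

def ground_state(theta):
    w, v = np.linalg.eigh(H(theta))
    return v[:, 0]          # eigenvalues are returned in ascending order

def final_ground_overlap(T, steps):
    """Ramp theta from 0 to pi/2 in time T; return |<ground(T)|psi(T)>|^2."""
    psi = ground_state(0.0)
    h = T / steps
    f = lambda t, y: -1j / hbar * (H(0.5 * np.pi * t / T) @ y)
    t = 0.0
    for _ in range(steps):
        k1 = f(t, psi)
        k2 = f(t + h / 2, psi + h / 2 * k1)
        k3 = f(t + h / 2, psi + h / 2 * k2)
        k4 = f(t + h, psi + h * k3)
        psi = psi + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return abs(np.vdot(ground_state(np.pi / 2), psi))**2

slow = final_ground_overlap(T=100.0, steps=10000)   # ramp slow compared to 1/gap
fast = final_ground_overlap(T=0.01, steps=100)      # ramp fast compared to 1/gap
```

For the slow ramp the system follows the instantaneous ground state; for the fast ramp the state is essentially unchanged, so its overlap with the new ground state is just the overlap of the old and new ground states, here close to 1/2.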
We assume that the particle starts out in the $n$th eigenstate of $H_0 = H(0)$
$$|\psi(0)\rangle = |n\rangle_i \tag{11.217}$$
where the subscripts are
$i \to$ initial parameters
$f \to$ final parameters
The state vector is assumed to change in time to $|\psi(t)\rangle$.
where
$$V_{km} = {}_i\langle k|\,V\,|m\rangle_i \tag{11.219}$$
On the other hand, first order time-dependent perturbation theory implies that
$$|\psi(t)\rangle = \sum_n \alpha_n(t)\,e^{-\frac{i}{\hbar}E_n t}\,|n\rangle_i \tag{11.220}$$
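with
$$\alpha_n(t) = 1-\frac{i}{\hbar}\,V_{nn}\int_0^t f(t')\,dt' \tag{11.221}$$
$$\alpha_m(t) = -\frac{i}{\hbar}\,V_{mn}\int_0^t f(t')\,e^{i(E_m-E_n)\frac{t'}{\hbar}}\,dt' , \qquad m \ne n \tag{11.222}$$
$$\alpha_m(t) = -\frac{i}{\hbar}\,V_{mn}\int_0^t f(t')\,e^{i(E_m-E_n)\frac{t'}{\hbar}}\,dt'$$
$$= -\frac{V_{mn}}{E_m-E_n}\int_0^t f(t')\,\frac{d}{dt'}\left[e^{i(E_m-E_n)\frac{t'}{\hbar}}\right]dt'$$
$$= -\frac{V_{mn}}{E_m-E_n}\left[f(t')\,e^{i(E_m-E_n)\frac{t'}{\hbar}}\right]_0^t + \frac{V_{mn}}{E_m-E_n}\int_0^t \frac{df(t')}{dt'}\,e^{i(E_m-E_n)\frac{t'}{\hbar}}\,dt' \tag{11.223}$$
Using $f(0) = 0$ and neglecting the last term because $df/dt$ is small gives
$$|\psi(T)\rangle = \left[\left(1-i\,\frac{V_{nn}\tau}{\hbar}\right)|n\rangle_i - \sum_{q\ne n}\frac{V_{qn}}{E_q-E_n}\,|q\rangle_i\right]e^{-\frac{i}{\hbar}E_n T} \tag{11.224}$$
where
$$\tau = \int_0^T f(t)\,dt \tag{11.225}$$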
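Therefore
$${}_f\langle n|\psi(T)\rangle = \left[{}_i\langle n| + \sum_{k\ne n}\frac{V^{*}_{kn}}{E_n-E_k}\,{}_i\langle k|\right]\left[\left(1-i\,\frac{V_{nn}\tau}{\hbar}\right)|n\rangle_i - \sum_{q\ne n}\frac{V_{qn}}{E_q-E_n}\,|q\rangle_i\right]e^{-\frac{i}{\hbar}E_n T}$$
$$= \left[1-i\,\frac{V_{nn}\tau}{\hbar} + \sum_{q\ne n}\frac{|V_{qn}|^2}{(E_n-E_q)^2}\right]e^{-\frac{i}{\hbar}E_n T} \tag{11.226}$$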
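$${}_f\langle m|\psi(T)\rangle = \left[{}_i\langle m| + \sum_{k\ne m}\frac{V^{*}_{km}}{E_m-E_k}\,{}_i\langle k|\right]\left[\left(1-i\,\frac{V_{nn}\tau}{\hbar}\right)|n\rangle_i - \sum_{q\ne n}\frac{V_{qn}}{E_q-E_n}\,|q\rangle_i\right]e^{-\frac{i}{\hbar}E_n T}$$
$$= \left[-i\,\frac{\tau\,V_{nn}V^{*}_{nm}}{\hbar(E_m-E_n)} + \sum_{m\ne k\ne n}\frac{V^{*}_{nk}V^{*}_{km}}{(E_n-E_k)(E_m-E_k)}\right]e^{-\frac{i}{\hbar}E_n T} \tag{11.227}$$
Note that all the first order terms cancel in the last expression. If we only keep
terms to first order (which is consistent with the derivation) we then have
$${}_f\langle k|\psi(T)\rangle = \begin{cases} 1-i\,\dfrac{V_{nn}\tau}{\hbar} & k = n \\ 0 & k \ne n \end{cases} \tag{11.228}$$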
The transitions in that case are from one eigenvector of the unperturbed Hamil-
tonian to another eigenvector of the same unperturbed Hamiltonian.
The first derivation gives the adiabatic approximation for any size perturbation.
In the second derivation, however, we not only assumed a slow change in time,
but also assumed a small perturbation so that we could use first order
perturbation theory.
The way to handle this is to divide the time interval $(0,T)$ into $N$ subintervals
such that the perturbation $V$ is small within any subinterval. In fact, it is of
$O(V/N)$. Thus, if $N$ is large, $V/N$ is small.
However, the transition amplitude is second-order and thus the total transition
amplitude behaves like
$$N\left(\frac{V}{N}\right)^2 = \frac{V^2}{N} \to 0 \ \text{as } N \to \infty \tag{11.230}$$
An Example
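Let us consider a 1-dimensional square well where
$$V(x) = \begin{cases} 0 & |x| \le \frac{a}{2} \\ \infty & |x| > \frac{a}{2} \end{cases} \tag{11.231}$$
$$E_n = \frac{\pi^2\hbar^2n^2}{8ma^2} \qquad n = 1, 2, 3, \ldots \tag{11.233}$$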
Suppose that we change the size of the well and ask what happens to the ground
state in the sudden and adiabatic approximations.
Sudden
$$\psi_1(x) = \cos\frac{\pi x}{2a} \quad \text{before} \tag{11.234}$$
leads to
$$\psi(x) = \cos\frac{\pi x}{2a} \quad \text{after (no change in the wave function)} \tag{11.235}$$
However, after the change we have new eigenfunctions and energies
$$\psi'_n(x) = \begin{cases} \cos\dfrac{n\pi x}{4a} & n = 1, 3, 5, \ldots \\ \sin\dfrac{n\pi x}{4a} & n = 2, 4, 6, \ldots \end{cases} \tag{11.236}$$
inside the new well and zero otherwise and
$$E'_n = \frac{\pi^2\hbar^2n^2}{8ma^2} \qquad n = 1, 2, 3, \ldots \tag{11.237}$$
The state of the system is still an eigenstate of the old well and, thus, is not an
eigenstate of the new well. In fact, we have
$$\psi(x) = \cos\frac{\pi x}{2a} = \sum_n b_n\,\psi'_n(x) \tag{11.238}$$
Adiabatic
$$\psi_1(x) = \cos\frac{\pi x}{2a} \quad \text{before (ground state of old well)} \tag{11.239}$$
$$\psi'_1(x) = \cos\frac{\pi x}{4a} \quad \text{after (ground state of new well)} \tag{11.240}$$
The state of the system is an eigenstate of the new well and, thus, is not an
eigenstate of the old well any longer. In fact, we have
$$\psi'_1(x) = \cos\frac{\pi x}{4a} = \sum_n b_n\,\psi_n(x) \tag{11.241}$$
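The expansion coefficients $b_n$ of the sudden case can be computed numerically. The sketch below assumes, following the wave functions quoted above, that the old ground state $\cos(\pi x/2a)$ vanishes at $|x| = a$ and that the new eigenfunctions $\cos(n\pi x/4a)$, $\sin(n\pi x/4a)$ vanish at $|x| = 2a$; both sets are normalized here, which the formulas above leave implicit. The grid and the number of terms kept are arbitrary choices.

```python
import numpy as np

a = 1.0   # length unit (assumption)

def trapezoid(y, x):
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

x = np.linspace(-2 * a, 2 * a, 200001)

# Old ground state, normalized on |x| <= a and zero outside.
psi_old = np.where(np.abs(x) <= a, np.cos(np.pi * x / (2 * a)), 0.0) / np.sqrt(a)

def psi_new(n):
    """Normalized eigenfunction n of the wider well, |x| <= 2a."""
    f = np.cos if n % 2 == 1 else np.sin
    return f(n * np.pi * x / (4 * a)) / np.sqrt(2 * a)

# b_n = overlap of the unchanged state with the new eigenfunctions.
b = np.array([trapezoid(psi_new(n) * psi_old, x) for n in range(1, 201)])
p_ground = b[0]**2        # probability of the new ground state
total = np.sum(b**2)      # should approach 1 (completeness)
```

Under these assumptions $b_1 = 8/(3\pi)$, so the particle is found in the new ground state with probability of about 0.72, and the even-$n$ coefficients vanish by parity.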
11.5 Problems
11.5.1 Square Well Perturbed by an Electric Field
At time $t = 0$, an electron is known to be in the $n = 1$ eigenstate of a
1-dimensional infinite square well potential
$$V(x) = \begin{cases} \infty & \text{for } |x| > a/2 \\ 0 & \text{for } |x| < a/2 \end{cases}$$
At time $t = 0$, a uniform electric field of magnitude $E$ is applied in the direction
of increasing $x$. This electric field is left on for a short time $\tau$ and then removed.
Use time-dependent perturbation theory to calculate the probability that the
electron will be in the $n = 2, 3$ eigenstates at some time $t > \tau$.
11.5.2 3-Dimensional Oscillator in an electric field
A particle of mass $M$, charge $e$, and spin zero moves in an attractive potential
$$V(x,y,z) = k\left(x^2+y^2+z^2\right) \tag{11.242}$$
(a) Find the three lowest energy levels E0 , E1 , E2 and their associated degen-
eracy.
(c) In (b) suppose the particle is in the ground state at time t = 0. Find the
probability that the energy is E1 at time t.
Find the 1st order probability for the atom to be in any of the n = 2 states after
a long time.
(b) by solving the problem assuming the validity of 1st order time-dependent
perturbation theory with H as a perturbation switched on at t = 0. Under
what conditions does this calculation give the correct results?
11.5.5 A Variational Calculation of the Deuteron Ground
State Energy
Use the empirical potential energy function
$$V(r) = -A\,e^{-r/a}$$
$$\psi(r) = e^{-\alpha r/2a}$$
(a) What are the energy eigenvalues of the particle when it is connected to
both springs?
(c) One spring is suddenly cut, leaving the particle bound to only the other
one. If the particle is in the ground state before the spring is cut, what is
the probability that it is still in the ground state after the spring is cut?
11.5.8 Another perturbed oscillator
Consider a particle bound in a simple harmonic oscillator potential. Initially ($t < 0$),
it is in the ground state. At $t = 0$ a perturbation of the form
$$H'(x,t) = A\,x^2\,e^{-t/\tau}$$
is switched on. Using time-dependent perturbation theory, calculate the
probability that, after a sufficiently long time ($t \gg \tau$), the system will have made a
transition to a given excited state. Consider all final states.
11.5.11 Two-Level System
Consider a two-level system $|\psi_a\rangle$, $|\psi_b\rangle$ with energies $E_a$, $E_b$ perturbed by a
jolt $H'(t) = U\,\delta(t)$ where the operator $U$ has only off-diagonal matrix elements
(call them $U$). If the system is initially in the state $a$, find the probability $P_{a\to b}$
that a transition occurs. Use only the lowest order of perturbation theory that
gives a nonzero result.
All of the $n = 2$ states of hydrogen are degenerate in the absence of the field $\vec E$,
but certain of them mix (Stark effect) when the field is present.
(a) Which of the n = 2 states are connected (mixed) in first order via the
electric field perturbation?
(b) Find the linear combination of the n = 2 states which removes the degen-
eracy as much as possible.
(c) For a system which starts out in the 2s state at t = 0, express the wave
function at time t L/v. No perturbation theory needed.
(d) Find the probability that the emergent beam contains hydrogen in the
various n = 2 states.
Figure 11.7: Electric Field
(b) Assume that the unbound states may be approximated by free particle
states with periodic boundary conditions in a box of length $L$. Find the
normalized wave function of wave vector $k$, $\psi_k(x)$, the density of states as
a function of $k$, $D(k)$ and the density of states as a function of free-particle
energy $E_k$, $D(E_k)$.
(c) Assume that the electric field may be treated as a perturbation. Write
down the perturbation term in the Hamiltonian, $H_1$, and find the matrix
element of $H_1$ between the initial and the final state $\langle 0|\,H_1\,|k\rangle$.
(d) The probability of a transition between an initially occupied state $|I\rangle$ and
a final state $|F\rangle$ due to a weak perturbation $H_1(t)$ is given by
$$P_{I\to F}(t) = \frac{1}{\hbar^2}\left|\int_{-\infty}^{t}\langle F|\,H_1(t')\,|I\rangle\,e^{i\omega_{FI}t'}\,dt'\right|^2$$
where $\omega_{FI} = (E_F-E_I)/\hbar$. Find an expression for the probability $P(E_k)\,dE_k$
that the particle will be in an unbound state with energy between Ek and
Ek + dEk for t > .
(F0 /m)
F (t) = , < t <
2 + t2
At t = , the oscillator is known to be in the ground state. Using time-
dependent perturbation theory to 1st order, calculate the probability that the
oscillator is found in the 1st excited state at t = +.
imparted to the oscillator is always the same, that is, independent of ; yet
for >> 1/, the probability for excitation is essentially negligible. Is this
reasonable?
Consider now hydrogen including fine structure. For a given sublevel, the spon-
taneous emission rate is
$$\Gamma_{(nLJM_J)\to(n'L'J')} = \frac{4}{3\hbar}\,k^3 \sum_{M_J'} \left|\langle n'L'J'M_J'|\,\vec d\,|nLJM_J\rangle\right|^2$$
(a) Show that the spontaneous emission rate is independent of the initial MJ .
Explain this result physically.
(b) Calculate the lifetime ( = 1/) of the 2P1/2 state in seconds.
H 0 = eE x
(a) What is the new ground state energy?
(b) Assuming that the field is switched on in a time much faster than 1/,
what is the probability that the particle stays in the unperturbed ground
state?
11.5.19 The Driven Harmonic Oscillator
At t = 0 a 1dimensional harmonic oscillator with natural frequency is driven
by the perturbation
H1 (t) = F xeit
The oscillator is initially in its ground state at t = 0.
(a) Using the lowest order perturbation theory to get a nonzero result, find
the probability that the oscillator will be in the 2nd excited state n = 2
at time t > 0. Assume 6= .
(b) Now begin again and do the simpler case, = . Again, find the prob-
ability that the oscillator will be in the 2nd excited state n = 2 at time
t>0
(c) Expand the result of part (a) for small times t, compare with the results
of part (b), and interpret what you find.
In discussing the results see if you detect any parallels with the driven
classical oscillator.
(d) Given that k = 3.0 kg/s2 , what photon wavelength is required to excite
the electron from state 0 to state 1 ? Use symmetry arguments to decide
whether this is an allowed transition (explain your reasoning); you might
want to sketch 0 (x) and 1 (x) to help your explanation.
(e) Given that
a |i = | 1i , a+ |i = + 1 | + 1i
evaluate the transition matrix element h0| x |1i. (HINT: rewrite x in terms
of a and a+ ). Use your result to simplify your expression for the transition
rate.
where and (t) are real-valued parameters (with units of energy). Let (t)
be given by a step function
(
t0
(t) =
0 t>0
|(0+ )i = |(0 )i
Suppose the system is initially prepared in the ground state of the Hamiltonian
at t = 1. Use the Schrodinger equation and the sudden approximation to
compute the subsequent evolution of |(t)i and determine the function
Show that |f (t)|2 is periodic. What is the frequency? How is it related to the
Hamiltonian?
which includes a static field B0 in the z direction plus a rotating field in the
x y plane. Let the state of the particle be written
a(0) = 0 , b(0) = 1
Show that
(b1 )2
tp 2
2
|a(t)| = 2 sin2 + (b1 )2
+ (b1 )2 2
Just before each atom enters the cavity, we can assume that the joint state of
that atom and the microwave cavity is given by the factorizable pure state
(a) Suppose the Hamiltonian for the joint atom-cavity system vanishes when
the atom is not inside the cavity and when the atom is inside the cavity
the Hamiltonian is given by
HAC = ~ |ei hg| |0i h1| + 2 |1i h2| + ~ |gi he| |1i h0| + 2 |2i h1|
Show that while the atom is inside the cavity, the following joint states
are eigenstates of HAC and determine the eigenvalues:
Chapter 12
Identical Particles
We will create a model to handle these atoms that follows from the one-electron
case we just considered. These systems are very complex and all the results
that we derive will be approximations.
H = T + V (12.2)
where
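$$T = \sum_{j=1}^{N} T_j = \sum_{j=1}^{N}\left(-\frac{\hbar^2}{2m_j}\nabla_j^2\right) \quad\text{and}\quad V = V(\vec r_1, \vec r_2, \ldots, \vec r_N, t) \tag{12.3}$$
$$H\psi(1, 2, 3, 4, \ldots, N, t) = i\hbar\frac{\partial}{\partial t}\psi(1, 2, 3, 4, \ldots, N, t) \tag{12.4}$$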
The probability density is defined as
so that
two particles, then it is not possible to determine via any physical measurement
that any change was made in the system.
This says that all measurable quantities or the operators representing them must
remain unchanged by the interchange of indistinguishable particles.
In words, we say
$$H\,|\psi\rangle = E\,|\psi\rangle \tag{12.12}$$
$$H P_{ij}\,|\psi\rangle = P_{ij} H\,|\psi\rangle = E\,P_{ij}\,|\psi\rangle \tag{12.13}$$
This holds for any pair (i, j). So H and Pij share a common eigenbasis as
expected. This phenomenon is called exchange degeneracy.
For simplicity, we assume that N = 2. We then have
and
$$\left[H, P_{12}\right] = 0 \tag{12.16}$$
then
$$H\psi(2, 1) = E\psi(2, 1) \tag{12.19}$$
and these two state functions are degenerate. Then, we can write
$$H\psi_S = E\psi_S, \qquad H\psi_A = E\psi_A \tag{12.20}$$
$$P_{12}\psi_S = +\psi_S, \qquad P_{12}\psi_A = -\psi_A \tag{12.21}$$
and
This relationship between spin and wave function symmetry cannot be proved
in non-relativistic quantum mechanics. It can, however, be proved if we add
relativity and construct the relativistic wave equations for bosons and fermions.
Before proceeding to study real atoms with N electrons, let us see what we can
learn from a one-dimensional system containing either two identical bosons or
two identical fermions.
$$H = H_1 + H_2 + U(x_1 - x_2) \tag{12.22}$$
$$H_1 = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_1^2} + V(x_1) \tag{12.23}$$
$$H_2 = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_2^2} + V(x_2) \tag{12.24}$$
where
$$U(x_1 - x_2) = \text{the particle-particle interaction} \tag{12.25}$$
We will assume that $U(x_1 - x_2)$ is small enough that we can apply perturbation
theory. We then use direct product states and write
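$$H = H_0 + U \tag{12.26}$$
$$H_1\psi^{(0)}_{n_1}(x_1) = E^{(0)}_{n_1}\psi^{(0)}_{n_1}(x_1) \tag{12.27}$$
$$H_2\psi^{(0)}_{n_2}(x_2) = E^{(0)}_{n_2}\psi^{(0)}_{n_2}(x_2) \tag{12.28}$$
$$\begin{aligned}
H_0\psi^{(0)}_{n_1 n_2}(x_1, x_2) &= H_0\,\psi^{(0)}_{n_1}(x_1)\psi^{(0)}_{n_2}(x_2) \\
&= (H_1 + H_2)\,\psi^{(0)}_{n_1}(x_1)\psi^{(0)}_{n_2}(x_2) \\
&= \left(E^{(0)}_{n_1} + E^{(0)}_{n_2}\right)\psi^{(0)}_{n_1}(x_1)\psi^{(0)}_{n_2}(x_2) \\
&= E^{(0)}_{n_1 n_2}\,\psi^{(0)}_{n_1 n_2}(x_1, x_2)
\end{aligned} \tag{12.29}$$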
The simple direct product states will not work for a description of the two
particle system since the eigenfunctions of H0 must be either symmetric or
antisymmetric under particle interchange.
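The correct choice is $\psi^S$ or $\psi^A$ where
$$\psi^{(0)S}_{n_1 n_2} = \frac{1}{\sqrt 2}\left[\psi^{(0)}_{n_1}(x_1)\psi^{(0)}_{n_2}(x_2) + \psi^{(0)}_{n_1}(x_2)\psi^{(0)}_{n_2}(x_1)\right] \tag{12.30}$$
$$\psi^{(0)A}_{n_1 n_2} = \frac{1}{\sqrt 2}\left[\psi^{(0)}_{n_1}(x_1)\psi^{(0)}_{n_2}(x_2) - \psi^{(0)}_{n_1}(x_2)\psi^{(0)}_{n_2}(x_1)\right] \tag{12.31}$$
Both of these states have energy $E^{(0)}_{n_1 n_2} = E^{(0)}_{n_1} + E^{(0)}_{n_2}$.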
Now
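$$\left\langle \psi^{(0)}_1(x_1)\right|\left\langle \psi^{(0)}_1(x_2)\right|\, U(x_1 - x_2)\, \left|\psi^{(0)}_1(x_1')\right\rangle\left|\psi^{(0)}_1(x_2')\right\rangle$$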
or
$$E_{11} = 2E^{(0)}_1 + \int dx_1 \int dx_2\, \left|\psi^{(0)}_1(x_1)\right|^2\, U(x_1 - x_2)\, \left|\psi^{(0)}_1(x_2)\right|^2 \tag{12.37}$$
where
$$K_{n_1 n_2} = \int dx_1 \int dx_2\, \psi^{(0)}_{n_1}(x_1)\psi^{(0)}_{n_2}(x_1)\, U(x_1 - x_2)\, \psi^{(0)}_{n_2}(x_2)\psi^{(0)}_{n_1}(x_2) \tag{12.43}$$
Now let us look at a possible physical meaning of these direct and exchange
integrals.
We define
$$\left|\psi^{(0)}_{n_1}(x_1)\right|^2 = \rho_1 = \text{probability density for particle 1 in state } n_1 \tag{12.44}$$
and
$$\left|\psi^{(0)}_{n_2}(x_2)\right|^2 = \rho_2 = \text{probability density for particle 2 in state } n_2 \tag{12.45}$$
Therefore, the direct integrand takes the form
$$\rho_1\,\rho_2\,U(r_{12}) \tag{12.46}$$
This represents the total energy of two classical charge distributions interacting
with the potential energy U (r12 ).
The energy level diagram to first order might look like Figure 12.1 below.
The more interesting case is a two spin = 1/2 fermion system (since electrons
are spin = 1/2 fermions).
For example,
$$\left|\psi^{(0)}_{n_1}\right\rangle\,|+\rangle_1 \tag{12.50}$$
represents a fermion in the $\psi^{(0)}_{n_1}$ spatial state with spin up,
and so on, where we define the labels $\alpha(j) = |+\rangle_j$ and $\beta(j) = |-\rangle_j$.
We must choose the antisymmetric combination for the zero-order wave func-
tions. We have 4 possible direct product states given n1 and n2 , i.e.,
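$$\psi_1(1, 2) = \psi^{(0)}_{n_1}(x_1)\psi^{(0)}_{n_2}(x_2)\,\alpha(1)\alpha(2) \tag{12.52}$$
$$\psi_2(1, 2) = \psi^{(0)}_{n_1}(x_1)\psi^{(0)}_{n_2}(x_2)\,\alpha(1)\beta(2) \tag{12.53}$$
$$\psi_3(1, 2) = \psi^{(0)}_{n_1}(x_1)\psi^{(0)}_{n_2}(x_2)\,\beta(1)\alpha(2) \tag{12.54}$$
$$\psi_4(1, 2) = \psi^{(0)}_{n_1}(x_1)\psi^{(0)}_{n_2}(x_2)\,\beta(1)\beta(2) \tag{12.55}$$
$$R = \frac{1}{\sqrt 2}\left(1 - P_{12}\right) \tag{12.56}$$
$$R\,A(1, 2) = \frac{1}{\sqrt 2}\left(1 - P_{12}\right)A(1, 2) = \frac{1}{\sqrt 2}\left[A(1, 2) - A(2, 1)\right] \tag{12.57}$$
which is antisymmetric. The factor $1/\sqrt{2}$ keeps the state normalized. We
now use R to construct four antisymmetric states from the four direct product
states (12.52)-(12.55).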
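$$\begin{aligned}
\psi^{(0)}_{n_1 n_2 ++}(x_1, x_2) &= \frac{1}{\sqrt 2}\left(1 - P_{12}\right)\psi_1(1, 2) \\
&= \frac{1}{\sqrt 2}\left[\psi^{(0)}_{n_1}(x_1)\psi^{(0)}_{n_2}(x_2)\,\alpha(1)\alpha(2) - \psi^{(0)}_{n_2}(x_1)\psi^{(0)}_{n_1}(x_2)\,\alpha(1)\alpha(2)\right]
\end{aligned} \tag{12.58}$$
$$\begin{aligned}
\psi^{(0)}_{n_1 n_2 +-}(x_1, x_2) &= \frac{1}{\sqrt 2}\left(1 - P_{12}\right)\psi_2(1, 2) \\
&= \frac{1}{\sqrt 2}\left[\psi^{(0)}_{n_1}(x_1)\psi^{(0)}_{n_2}(x_2)\,\alpha(1)\beta(2) - \psi^{(0)}_{n_2}(x_1)\psi^{(0)}_{n_1}(x_2)\,\alpha(2)\beta(1)\right]
\end{aligned} \tag{12.59}$$
$$\begin{aligned}
\psi^{(0)}_{n_1 n_2 -+}(x_1, x_2) &= \frac{1}{\sqrt 2}\left(1 - P_{12}\right)\psi_3(1, 2) \\
&= \frac{1}{\sqrt 2}\left[\psi^{(0)}_{n_1}(x_1)\psi^{(0)}_{n_2}(x_2)\,\beta(1)\alpha(2) - \psi^{(0)}_{n_2}(x_1)\psi^{(0)}_{n_1}(x_2)\,\alpha(1)\beta(2)\right]
\end{aligned} \tag{12.60}$$
$$\begin{aligned}
\psi^{(0)}_{n_1 n_2 --}(x_1, x_2) &= \frac{1}{\sqrt 2}\left(1 - P_{12}\right)\psi_4(1, 2) \\
&= \frac{1}{\sqrt 2}\left[\psi^{(0)}_{n_1}(x_1)\psi^{(0)}_{n_2}(x_2)\,\beta(1)\beta(2) - \psi^{(0)}_{n_2}(x_1)\psi^{(0)}_{n_1}(x_2)\,\beta(1)\beta(2)\right]
\end{aligned} \tag{12.61}$$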
We could use these states as the zero-order wave functions to start perturbation
theory. It would be like doing the spin-orbit calculation using the $|\ell s m_\ell m_s\rangle$
basis, rather than the $|\ell s j m_j\rangle$ basis where $H_{so}$ is diagonal. It is always important
to choose zero-order wave functions, if it is not too difficult to do, that incorporate
as much of the symmetry of the system as possible. In other words, choose
zero-order wave functions that are simultaneous eigenstates of the maximal set
of commuting observables. This will hopefully produce a diagonal perturbation
matrix or at least so many zeros that it is easy to diagonalize the rest of the
matrix.
In this case, we not only have $[H, P_{12}] = 0$, which told us to choose antisymmetric
zero-order states, but we also have $[H, \vec S_{op}^2] = 0$ and $[H, S_z] = 0$ where
$$\vec S_{op} = \vec S_{1,op} + \vec S_{2,op} = \text{the total spin angular momentum}$$
$$S_z = S_{1z} + S_{2z} = \text{the } z\text{-component of the total spin angular momentum}$$
Therefore we should choose antisymmetric state functions which are also
eigenfunctions of $H_0$, $\vec S_{op}^2$ and $S_z$ as our zero-order states.
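From our earlier work we know that the possible values of the total spin are
$S = 0, 1$ and the state vectors that are eigenstates of $\vec S_{op}^2$ and $S_z$ are
$$\psi^{(0)}_{n_1 n_2 00} = \frac{1}{\sqrt 2}\left[\psi^{(0)}_{n_1}(x_1)\psi^{(0)}_{n_2}(x_2) + \psi^{(0)}_{n_1}(x_2)\psi^{(0)}_{n_2}(x_1)\right]\chi_{00} \tag{12.68}$$
$$\psi^{(0)}_{n_1 n_2 1m_s} = \frac{1}{\sqrt 2}\left[\psi^{(0)}_{n_1}(x_1)\psi^{(0)}_{n_2}(x_2) - \psi^{(0)}_{n_1}(x_2)\psi^{(0)}_{n_2}(x_1)\right]\chi_{1m_s}, \qquad m_s = \pm 1, 0 \tag{12.69}$$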
Notice that if we have identical spatial states, i.e., $n_1 = n_2$, the S = 1 states
vanish identically. This says that two fermions in an S = 1 spin state cannot be
in the same spatial state (the wavefunction vanishes). This is the first example
of a general principle we will discuss later called the Pauli Exclusion Principle.
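As a quick numerical illustration of this statement, the sketch below evaluates the antisymmetric spatial combination at a few sample points. It assumes particle-in-a-box eigenfunctions as stand-in orbitals (the text does not specify the potential $V(x)$), and the names `box_state` and `spatial_pair` are hypothetical helpers introduced only for this example.

```python
import math

def box_state(n, x, length=1.0):
    """Normalized particle-in-a-box eigenfunction (an assumed example orbital)."""
    return math.sqrt(2.0 / length) * math.sin(n * math.pi * x / length)

def spatial_pair(n1, n2, x1, x2, sign):
    """(1/sqrt 2) [psi_n1(x1) psi_n2(x2) + sign * psi_n1(x2) psi_n2(x1)].

    sign = +1 gives the symmetric (singlet) spatial function,
    sign = -1 gives the antisymmetric (triplet) spatial function.
    """
    direct = box_state(n1, x1) * box_state(n2, x2)
    swapped = box_state(n1, x2) * box_state(n2, x1)
    return (direct + sign * swapped) / math.sqrt(2.0)

# A few arbitrary sample points (x1, x2) inside the box.
points = [(0.13, 0.71), (0.40, 0.55), (0.92, 0.08)]

# Same spatial state for both particles: the antisymmetric function vanishes.
triplet_same = [spatial_pair(1, 1, x1, x2, -1) for x1, x2 in points]

# Different spatial states: the antisymmetric function is generally nonzero.
triplet_diff = [spatial_pair(1, 2, x1, x2, -1) for x1, x2 in points]

print("n1 = n2 = 1:", triplet_same)
print("n1 = 1, n2 = 2:", triplet_diff)
```

The first list is zero at every point, while the second is not, and exchanging `x1` and `x2` flips the sign of the antisymmetric function.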
where
$$a_{m_{s_1} m_{s_2}} = \langle s_1 s_2 m_{s_1} m_{s_2} \,|\, s_1 s_2 s\, m_s\rangle \tag{12.71}$$
Now
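$$\left\langle \tfrac12 \tfrac12\, m_{s_1} m_{s_2} \,\middle|\, \tfrac12 \tfrac12\, 1\, 1\right\rangle = \delta_{m_{s_1},\frac12}\,\delta_{m_{s_2},\frac12} \tag{12.72}$$
which implies that
$$\psi^{(0)}_{n_1 n_2 11} = \psi^{(0)}_{n_1 n_2 ++} \tag{12.73}$$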
as written above.
As with the two boson case, the zero-order ground state for two fermions
corresponds to $n_1 = n_2 = 1$ with zero-order energy $2E^{(0)}_1$. Since the S = 1 or triplet
states have identically zero state functions in this case (since the spatial functions
are antisymmetric), we have $\psi^{(0)}_{11,1m_s} = 0$. The unperturbed ground state must
then have $S = 0$, $S_z = 0$, i.e., it is $\psi^{(0)}_{11,00}(1, 2)$. This involves a singlet state with
$m_s = 0$ only
$$\chi_{00} = \frac{1}{\sqrt 2}\left(\alpha(1)\beta(2) - \beta(1)\alpha(2)\right) \tag{12.76}$$
In this state, the particle spins are always opposite or antiparallel.
which is the same energy as in the two boson system(we are assuming the same
Hamiltonian applies).
The spatial part of the wave function is the same also, namely,
(0) (0)
1 (1)2 (2) (12.78)
We must use a symmetric spatial wave function here because the spin vector is
antisymmetric in the ground state of two fermions. The presence of the spin
internal degrees of freedom(and the Pauli principle) has a more dramatic effect
on the first excited state for two fermions.
We can write the energy this way, i.e., we do not need to write a $4 \times 4$ matrix
$\langle U\rangle$ and diagonalize it, because the $\langle U\rangle$ matrix is already diagonal in this basis
due to the orthogonality of the spin functions and the fact that the perturbing
potential does not depend on spin. This first order energy is different for the
triplet and singlet states. If we do the integrals (they are the same as the boson
case) we get
$$E_{12} = E^{(0)}_1 + E^{(0)}_2 + J_{12} \pm K_{12} \tag{12.80}$$
where
  + : singlet, $s = 0$, $m_s = 0$
  - : triplet, $s = 1$, $m_s = \pm 1, 0$
All the triplet states have the same energy because they have the same spatial
wave function and the perturbing potential does not depend on spin.
We thus get the energy level structure shown in Figure 12.2 below.
The energies now depend on the total spin S even though the Hamiltonian H
does not explicitly depend on spin. A very dramatic effect!!
This level splitting is not due to any additional terms added to the Hamiltonian
such as Hso or HZeeman . This effect is strictly due to symmetry requirements.
The requirement of symmetry or antisymmetry forced on the spatial wave func-
tions by the symmetry or antisymmetry of the spin vectors causes this level
splitting. The entire effect is due to the invariance of the Hamiltonian under
pairwise particle interchange.
The first order energy for the singlet state is larger than for the triplet states
because the repulsive interaction is enhanced in the singlet state. This overall
effect is called spin pairing and it is a purely quantum mechanical effect.
The energy eigenstates for the N electron atom are solutions of the time inde-
pendent Schrodinger equation
$$H\psi_E = E\psi_E \tag{12.84}$$
where $\psi_E = \psi_E(1, 2, 3, 4, \ldots, N)$ and $1 = (\vec r_1, s_1)$ and so on.
where the Pij interchange all attributes of the electrons, i.e., both the spatial
and spin degrees of freedom.
$$H_0(k) = \frac{\vec p^{\,2}_{k,op}}{2m} + V(\vec r_k) \tag{12.87}$$
where
$$H_0(k)\,\psi_n(\vec r_k) = \varepsilon_n\,\psi_n(\vec r_k), \qquad n = 0, 1, 2, 3, 4, \ldots \tag{12.88}$$
Thus, any single particle sees the energy level structure as shown in Figure 12.3
below.
where
$$\psi(1, 2, 3, \ldots, N) = \psi_a(1)\psi_b(2)\cdots\psi_n(N) \tag{12.91}$$
and
$$E = \varepsilon_a + \varepsilon_b + \cdots + \varepsilon_n \tag{12.92}$$
This solution implies that
Electrons have spin = 1/2. Thus, corresponding to any single particle energy
level, say $\varepsilon_a$, there are two possible single particle states, namely,
From now on, when we write $\psi_a(1)$, the subscript a will be understood to
also include the spin information.
The simple product state solutions are not physically admissible solutions since
they are not antisymmetric under particle interchange for any two particles.
All such states with particles interchanged pairwise have the same energy. In
fact, any permutation of the indices produces a state with the same energy. We
need to construct a completely antisymmetric linear combination of all of these
solutions.
If we define
where the sum means a sum over all possible permutations or arrangements.
There are N ! such permutations.
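Examples
N = 2, N! = 2
$$\psi_S(1, 2) = \psi_1(1)\psi_2(2) + \psi_1(2)\psi_2(1)$$
N = 3, N! = 6
$$\begin{aligned}
\psi_S(1, 2, 3) ={}& \psi_1(1)\psi_2(2)\psi_3(3) + \psi_1(2)\psi_2(1)\psi_3(3) + \psi_1(3)\psi_2(2)\psi_3(1) \\
&+ \psi_1(1)\psi_2(3)\psi_3(2) + \psi_1(3)\psi_2(1)\psi_3(2) + \psi_1(2)\psi_2(3)\psi_3(1)
\end{aligned}$$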
Any such permutation operator can be written as the product of the 2-particle
interchange operators $P_{ij}$, i.e.,
and then
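Examples
N = 2, N! = 2
$$\psi_A(1, 2) = \psi_1(1)\psi_2(2) - \psi_1(2)\psi_2(1)$$
N = 3, N! = 6
$$\begin{aligned}
\psi_A(1, 2, 3) ={}& \psi_1(1)\psi_2(2)\psi_3(3) - \psi_1(2)\psi_2(1)\psi_3(3) - \psi_1(3)\psi_2(2)\psi_3(1) \\
&- \psi_1(1)\psi_2(3)\psi_3(2) + \psi_1(3)\psi_2(1)\psi_3(2) + \psi_1(2)\psi_2(3)\psi_3(1)
\end{aligned}$$
It is clear that if any two states are identical (put $\psi_2 = \psi_3$ above), then $\psi_A$ is
identically zero, as it should be for fermions.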
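The signed sum over permutations can be checked numerically. The following is a minimal sketch, not the author's construction: it uses toy single-particle functions ($1$, $x$, $x^2$) chosen only because the antisymmetric sum then reduces to a Vandermonde determinant with a known closed form. The function names are hypothetical.

```python
from itertools import permutations

def parity(perm):
    """Sign of a permutation of 0..N-1: +1 if even, -1 if odd."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        # Put the correct value at position i, counting each swap.
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign

def antisymmetrized(orbitals, coords):
    """Sum over permutations P of sign(P) * prod_k orbitals[k](coords[P(k)])."""
    total = 0.0
    for perm in permutations(range(len(coords))):
        term = 1.0
        for k, p in enumerate(perm):
            term *= orbitals[k](coords[p])
        total += parity(perm) * term
    return total

# Toy orbitals (an assumption for illustration only).
one = lambda x: 1.0
lin = lambda x: x
quad = lambda x: x * x

coords = (0.5, 1.5, 3.0)
distinct = antisymmetrized([one, lin, quad], coords)
repeated = antisymmetrized([one, lin, lin], coords)   # two identical states
swapped = antisymmetrized([one, lin, quad], (1.5, 0.5, 3.0))

print(distinct, repeated, swapped)
```

With three distinct orbitals the sum is nonzero and changes sign when two particle coordinates are exchanged; with two identical orbitals it vanishes, which is the statement made above for fermions.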
This implies that we can put at most 2 electrons in each energy level of the
potential well. The two electrons in the k th level would then have wave functions
k and k (12.98)
i.e., they must have opposite spins. This says that N spin = 1/2 fermions must
occupy at least N/2 different states in the well.
This is very different than for bosons where all the N bosons can be in any
energy level.
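Another way to write the completely antisymmetric wave function for fermions
is the so-called Slater determinant
$$\psi_A(1, 2, 3, \ldots, N) = \begin{vmatrix}
\psi_a(1) & \psi_a(2) & \cdots & \psi_a(N) \\
\psi_b(1) & \psi_b(2) & \cdots & \psi_b(N) \\
\vdots & \vdots & & \vdots \\
\psi_n(1) & \psi_n(2) & \cdots & \psi_n(N)
\end{vmatrix} \tag{12.99}$$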
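Therefore, we get
$$\langle\psi_A \,|\, \psi_A\rangle = \sum_{s_1, s_2, \ldots, s_N} \int d^3\vec r_1 \cdots d^3\vec r_N \sum_{P} |\psi_a(1)|^2 \cdots |\psi_n(N)|^2 \tag{12.100}$$
But
$$\sum_{s_k} \int d^3\vec r_k\, |\psi_j(k)|^2 = 1 \tag{12.101}$$
so we finally get
$$\langle\psi_A \,|\, \psi_A\rangle = \sum_{P} 1 = \text{number of possible permutations} = N! \tag{12.102}$$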
In a similar manner
$$\langle\psi_S \,|\, \psi_S\rangle = \frac{N!}{N_a!\cdots N_n!} \tag{12.104}$$
where
$N_k$ = the number of times the single particle state k occurs
What is the difference between the ground state of N fermions and N bosons?
For N bosons, all N particles occupy the lowest level $\varepsilon_0$ and the wavefunction
is
$$\psi_S(1, 2, 3, \ldots, N) = \psi_0(1)\psi_0(2)\psi_0(3)\cdots\psi_0(N) \tag{12.105}$$
with energy
$$E_0 = N\varepsilon_0 \tag{12.106}$$
This is true no matter how large N might be, even for macroscopic systems
where $N \approx 10^{23}$. As we shall see in later discussions, this is one of the physical
requirements for phenomena like superconductivity, superfluidity and
Bose-Einstein condensation.
2 in N 1 in N +1
2 2
This difference for systems with even or odd numbers of fermions will lead to
dramatic physical consequences later for some atomic systems.
Either of these two energies is always greater than the N boson ground state
energy.
The extra energy is called the zero point energy and it arises from particle inter-
change invariance or it arises from the Pauli Exclusion Principle which states
12.5 The Helium Atom
We now consider the simplest multielectron atom, namely, helium, which has
two electrons. The Hamiltonian is
$$H = H(1) + H(2) + V = \frac{\vec p^{\,2}_{1,op}}{2m} - \frac{Ze^2}{r_1} + \frac{\vec p^{\,2}_{2,op}}{2m} - \frac{Ze^2}{r_2} + \frac{e^2}{|\vec r_1 - \vec r_2|} \tag{12.107}$$
where
where
and
$$E^{(0)}_n = -\frac{Z^2 e^2}{2a_0 n^2} \qquad (Z = 2 \text{ for helium}) \tag{12.109}$$
We will be working out the numbers in this problem so that we can compare our
results to experiment. The zero order energies are shown in Table 12.1 below:
n1    n2     E(0)_n1n2 (Ry)    E(0)_n1n2 (eV)
1     1      -8                -108.8
1     2      -5                -68.0
1     3      -40/9             -60.4
1     ..     ..                ..
1     ∞      -4                -54.4
2     2      -2                -27.2
where
$$1\ \text{Ry (Rydberg)} = \frac{e^2}{2a_0} = 13.6\ \text{eV} \tag{12.110}$$
The ground state energy is
$$E_{gs} = E^{(0)}_{11} = 2E^{(0)}_1 = -8\ \text{Ry} \tag{12.111}$$
and the energy of the system when one electron has been ionized (no longer
bound) is
$$E_{ion} = E^{(0)}_1 + E^{(0)}_\infty = -4\ \text{Ry} \tag{12.112}$$
Therefore, it requires the addition of 4 Ry to create singly ionized helium. Notice
that the (2, 2) state has an energy greater than Eion , which implies that it is
not a bound state of the helium atom. All the states (1, n) are bound states.
The energy level spectrum looks as shown in Figure 12.4 below.
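The zero-order energies follow directly from (12.109) with both electrons treated as hydrogen-like with Z = 2. The short sketch below tabulates them; the function name `zero_order_energy_ry` is a hypothetical helper, and the conversion uses the rounded value 1 Ry = 13.6 eV quoted in the text.

```python
RYDBERG_EV = 13.6  # rounded value used in the text

def zero_order_energy_ry(n1, n2, Z=2):
    """E(n1, n2) = -Z^2 (1/n1^2 + 1/n2^2) in Rydbergs (no electron-electron repulsion)."""
    return -Z**2 * (1.0 / n1**2 + 1.0 / n2**2)

# Energy of singly ionized helium: one electron in n = 1, the other at n -> infinity.
ionization_threshold_ry = -2**2 * 1.0

for n1, n2 in [(1, 1), (1, 2), (1, 3), (2, 2)]:
    e_ry = zero_order_energy_ry(n1, n2)
    status = "bound" if e_ry < ionization_threshold_ry else "above threshold"
    print(f"({n1},{n2})  {e_ry:8.3f} Ry  {e_ry * RYDBERG_EV:8.1f} eV  {status}")
```

The (2, 2) level lies above the ionization threshold of -4 Ry, consistent with the remark that it is not a bound state of the helium atom.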
Since the particles are electrons we must antisymmetrize the wave functions.
We have two spin = 1/2 fermions. The spin functions are
$$|s, m_s\rangle = \begin{cases} |1, (\pm 1, 0)\rangle & \text{symmetric} \\ |0, 0\rangle & \text{antisymmetric} \end{cases} \tag{12.113}$$
The spatial part of the wave function must be of opposite symmetry to the spin
functions so that the product is antisymmetric. By convention we label the
states as follows:
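Parahelium
(symmetric space part) $\chi_{00}$
$$(|100\rangle\,|100\rangle)\,|00\rangle$$
$$\frac{1}{\sqrt 2}\left[|100\rangle\,|2\ell m\rangle + |2\ell m\rangle\,|100\rangle\right]\chi_{00}$$
and so on.
Orthohelium
(antisymmetric space part) $\chi_{1m_s}$
$$\frac{1}{\sqrt 2}\left[|100\rangle\,|2\ell m\rangle - |2\ell m\rangle\,|100\rangle\right]\chi_{1m}$$
and so on.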
Even though this calculation does not give a very accurate result, it is still very
instructive to learn the tricks necessary to evaluate the integrals.
where
$$\theta = \text{angle between } \vec r_1 \text{ and } \vec r_2 \tag{12.120}$$
Therefore,
$$\frac{1}{r_{12}} = \frac{1}{\left(r_1^2 + r_2^2 - 2r_1 r_2\cos\theta\right)^{1/2}} \tag{12.121}$$
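In the subsequent development, we let the larger of $r_1$, $r_2$ be called $r_>$ and the
smaller be called $r_<$. We then have
$$\begin{aligned}
\frac{1}{r_{12}} &= \frac{1}{r_>\left[1 - 2\frac{r_<}{r_>}\cos\theta + \left(\frac{r_<}{r_>}\right)^2\right]^{1/2}} \\
&= \frac{1}{r_>} - \frac{1}{2r_>}\left(\left(\frac{r_<}{r_>}\right)^2 - 2\frac{r_<}{r_>}\cos\theta\right) + \frac{3}{8r_>}\left(\left(\frac{r_<}{r_>}\right)^2 - 2\frac{r_<}{r_>}\cos\theta\right)^2 \\
&\quad - \frac{15}{48r_>}\left(\left(\frac{r_<}{r_>}\right)^2 - 2\frac{r_<}{r_>}\cos\theta\right)^3 + \ldots
\end{aligned} \tag{12.122}$$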
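or
$$\frac{1}{r_{12}} = \frac{1}{r_>}\left[1 + \frac{r_<}{r_>}\cos\theta + \left(\frac{r_<}{r_>}\right)^2\left(\frac{3}{2}\cos^2\theta - \frac{1}{2}\right) + \ldots\right] \tag{12.123}$$
$$= \frac{1}{r_>}\left[P_0(\cos\theta) + \frac{r_<}{r_>}P_1(\cos\theta) + \left(\frac{r_<}{r_>}\right)^2 P_2(\cos\theta) + \ldots\right] \tag{12.124}$$
Therefore,
$$\frac{1}{r_{12}} = \frac{1}{r_>}\sum_{\ell=0}^{\infty}\left(\frac{r_<}{r_>}\right)^{\ell} P_\ell(\cos\theta) \tag{12.125}$$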
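The Legendre expansion of $1/r_{12}$ is easy to test numerically. The sketch below compares a truncated series with the closed form (12.121) for one arbitrary geometry; the truncation order `l_max = 40` and the helper names are choices made for this example, and the Legendre polynomials are generated with the standard Bonnet recurrence.

```python
import math

def legendre(l_max, x):
    """Return [P_0(x), ..., P_lmax(x)] using (l+1) P_{l+1} = (2l+1) x P_l - l P_{l-1}."""
    p = [1.0, x]
    for l in range(1, l_max):
        p.append(((2 * l + 1) * x * p[l] - l * p[l - 1]) / (l + 1))
    return p[: l_max + 1]

def inverse_distance_series(r1, r2, cos_theta, l_max=40):
    """Truncated multipole series (1/r>) * sum_l (r</r>)^l P_l(cos theta)."""
    r_small, r_large = min(r1, r2), max(r1, r2)
    ratio = r_small / r_large
    p = legendre(l_max, cos_theta)
    return sum(ratio**l * p[l] for l in range(l_max + 1)) / r_large

def inverse_distance_exact(r1, r2, cos_theta):
    """Closed form 1 / sqrt(r1^2 + r2^2 - 2 r1 r2 cos theta)."""
    return 1.0 / math.sqrt(r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * cos_theta)

r1, r2, cos_theta = 1.0, 2.0, 0.3
series = inverse_distance_series(r1, r2, cos_theta)
exact = inverse_distance_exact(r1, r2, cos_theta)
print(series, exact, abs(series - exact))
```

The series converges geometrically in $r_</r_>$, so it is fast when the two radii differ and slow when $r_1 \approx r_2$.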
Now, the addition theorem for spherical harmonics, which is proved at the end
of this chapter, gives
$$P_\ell(\cos\theta) = \frac{4\pi}{2\ell + 1}\sum_{m=-\ell}^{\ell} Y^*_{\ell m}(\Omega_1)\,Y_{\ell m}(\Omega_2) \tag{12.126}$$
and Z Z
dYm () dYm ()Y00 () = 0 m0 (12.130)
Therefore, the only term that contributes from the sum is = m = 0 and we
get Z Z
1 1
d1 d2 = (12.131)
r12 r>
and therefore we have
3 Z Z
1 Z 2Zr1 2Zr2 1
E = 2 e 2
dr1 r12 e a0
dr2 r22 e a0
(12.132)
a0 r>
0 0
or
3 Z Zr1
1 Z 2Zr1 2Zr2 1
E = 2 e 2
dr1 r12 e a0
dr2 r22 e a0
a0 r1
0 0
3 Z Z
1 Z 2Zr1 2Zr2 1
+ 2 e 2
dr1 r12 e a0
dr2 r22 e a0
(12.133)
a0 r2
0 r1
which gives
$$\Delta E = \frac{5}{8}\frac{Ze^2}{a_0} = J_{1s,1s} = J_{10,10} = 2.5\ \text{Ry} = 34\ \text{eV} \tag{12.134}$$
for Z = 2.
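The value $\frac{5}{8}Ze^2/a_0$ can be confirmed by brute-force quadrature. The sketch below assumes the hydrogen-like 1s density $(Z^3/\pi a_0^3)\,e^{-2Zr/a_0}$ for each electron and that the angular integrations leave only the $1/r_>$ term, so the energy reduces to $16Z^6$ times a double radial integral. It works in units with $a_0 = 1$ and energies in $e^2/a_0$ (equal to 2 Ry); the grid parameters are arbitrary choices for this example.

```python
import math

def direct_integral(Z=2.0, r_max=12.0, steps=6000):
    """Trapezoid estimate of 16 Z^6 * int dr1 r1^2 w(r1) int dr2 r2^2 w(r2) / r>,
    with w(r) = exp(-2 Z r), in units of e^2/a0."""
    h = r_max / steps
    r = [i * h for i in range(steps + 1)]
    w = [math.exp(-2.0 * Z * x) for x in r]

    # inner_below[i] = int_0^{r_i} r2^2 w(r2) dr2   (region r2 < r1, where r> = r1)
    inner_below = [0.0] * (steps + 1)
    for i in range(1, steps + 1):
        inner_below[i] = inner_below[i - 1] + 0.5 * h * (
            r[i - 1] ** 2 * w[i - 1] + r[i] ** 2 * w[i])

    # inner_above[i] = int_{r_i}^{r_max} r2 w(r2) dr2   (region r2 > r1, where r> = r2)
    inner_above = [0.0] * (steps + 1)
    for i in range(steps - 1, -1, -1):
        inner_above[i] = inner_above[i + 1] + 0.5 * h * (
            r[i] * w[i] + r[i + 1] * w[i + 1])

    # Outer integrand; it vanishes at r1 = 0 because of the r1^2 factor.
    outer = [0.0] + [
        r[i] ** 2 * w[i] * (inner_below[i] / r[i] + inner_above[i])
        for i in range(1, steps + 1)
    ]
    total = sum(0.5 * h * (outer[i - 1] + outer[i]) for i in range(1, steps + 1))
    return 16.0 * Z**6 * total

delta_e = direct_integral(Z=2.0)   # in units of e^2/a0
print(delta_e, "e^2/a0 =", 2.0 * delta_e, "Ry")
```

For Z = 2 the result is close to 1.25 in units of $e^2/a_0$, i.e. 2.5 Ry, in agreement with (12.134).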
This first order result is amazingly good for this complex system!
The first order energy shifts are once again given by standard perturbation
theory since the hV i matrix is diagonal in this basis due to the orthonormality
of the spin vectors and the fact that V is independent of spin.
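We thus have
$$\Delta E^{s,t}_{n\ell} = \frac{1}{2}\int d^3\vec r_1 \int d^3\vec r_2\, \left|\psi_{100}(1)\psi_{n\ell 0}(2) \pm \psi_{100}(2)\psi_{n\ell 0}(1)\right|^2 \frac{e^2}{r_{12}} \tag{12.137}$$
where $s, t \leftrightarrow$ singlet, triplet $\leftrightarrow S = 0, 1 \leftrightarrow +, -$. As shown before, we need only
calculate the m = 0 case because $[\vec L_{op}, V] = 0$ where
$$\vec L_{op} = \vec L_{1,op} + \vec L_{2,op} = \text{total orbital angular momentum} \tag{12.138}$$
Therefore
$$\begin{aligned}
\Delta E^{s,t}_{n\ell} &= e^2 \int d^3\vec r_1 \int d^3\vec r_2\, |\psi_{100}(1)|^2\, |\psi_{n\ell 0}(2)|^2\, \frac{1}{r_{12}} \\
&\quad \pm e^2 \int d^3\vec r_1 \int d^3\vec r_2\, \psi_{100}(1)\psi_{n\ell 0}(2)\,\psi_{100}(2)\psi_{n\ell 0}(1)\, \frac{1}{r_{12}} = J_{n\ell} \pm K_{n\ell}
\end{aligned} \tag{12.139}$$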
where
and
with
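$$2\vec S_{1,op}\cdot\vec S_{2,op} = \vec S_{op}^2 - \vec S_{1,op}^2 - \vec S_{2,op}^2 = \hbar^2\left(S(S+1) - \frac{3}{2}\right) \tag{12.140}$$
$$2\vec S_{1,op}\cdot\vec S_{2,op} = \hbar^2\begin{cases} +\frac{1}{2} & \text{triplet} \\ -\frac{3}{2} & \text{singlet} \end{cases} \tag{12.141}$$
and therefore
$$\Delta E^{s,t}_{n\ell} = J_{n\ell} - \frac{1}{2}\left(1 + \frac{4}{\hbar^2}\vec S_{1,op}\cdot\vec S_{2,op}\right)K_{n\ell} \tag{12.142}$$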
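As a quick consistency check (a worked substitution, not part of the original derivation), inserting the two eigenvalues of $\vec S_{1,op}\cdot\vec S_{2,op}$ from (12.141) into (12.142) should reproduce the singlet and triplet energies $J_{n\ell} \pm K_{n\ell}$:

```latex
\begin{aligned}
\text{triplet } (S = 1):\quad
  & \vec S_{1,op}\cdot\vec S_{2,op} = +\tfrac{1}{4}\hbar^2
  \;\Rightarrow\;
  \Delta E^{t}_{n\ell} = J_{n\ell} - \tfrac{1}{2}\,(1 + 1)\,K_{n\ell} = J_{n\ell} - K_{n\ell} \\
\text{singlet } (S = 0):\quad
  & \vec S_{1,op}\cdot\vec S_{2,op} = -\tfrac{3}{4}\hbar^2
  \;\Rightarrow\;
  \Delta E^{s}_{n\ell} = J_{n\ell} - \tfrac{1}{2}\,(1 - 3)\,K_{n\ell} = J_{n\ell} + K_{n\ell}
\end{aligned}
```

The spin-dependent form is therefore only a compact way of encoding the sign of the exchange term; the Hamiltonian itself contains no spin operators.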
The calculation results (in eV) are shown in Table 12.2 below and the energy
levels are shown in Figure 12.5 below. Not bad!!
State        1s2s        1s2s        1s2p        1s2p
nℓ           10,20       10,20       10,21       10,21
             singlet     triplet     singlet     triplet
0th order    -68.0       -68.0       -68.0       -68.0
J             11.4        11.4        13.2        13.2
K              1.2         1.2         0.9         0.9
1st order    -55.4       -57.8       -53.9       -55.7
E_expt       -58.4       -59.2       -57.8       -58.0
error        3.0 = 5.1%  1.4 = 2.4%  3.9 = 6.7%  2.3 = 4.0%
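The first-order entries and the quoted errors in Table 12.2 follow from simple arithmetic: $E^{(1)} = E^{(0)} + J \pm K$, with + for the singlet and - for the triplet. The sketch below reproduces them from the tabulated J, K and experimental values; the dictionary layout and the helper `first_order` are illustrative only.

```python
# (zeroth-order energy, J, K, sign of exchange term, experimental energy), all in eV
levels = {
    "1s2s singlet": (-68.0, 11.4, 1.2, +1, -58.4),
    "1s2s triplet": (-68.0, 11.4, 1.2, -1, -59.2),
    "1s2p singlet": (-68.0, 13.2, 0.9, +1, -57.8),
    "1s2p triplet": (-68.0, 13.2, 0.9, -1, -58.0),
}

def first_order(e0, j, k, sign):
    """First-order energy E0 + J + sign * K (sign = +1 singlet, -1 triplet)."""
    return e0 + j + sign * k

results = {}
for name, (e0, j, k, sign, e_expt) in levels.items():
    e1 = first_order(e0, j, k, sign)
    err = abs(e1 - e_expt)
    pct = 100.0 * err / abs(e_expt)
    results[name] = (round(e1, 1), round(err, 1), round(pct, 1))
    print(f"{name}: E1 = {e1:.1f} eV, error = {err:.1f} eV ({pct:.1f}%)")
```

In every case the first-order energy lies above the experimental value, and the triplet lies below the corresponding singlet by 2K.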
For comparison, we will also calculate the ground state energy using the vari-
ational method. We neglect spin in this case. The simplest choice of a trial
function is the product of two hydrogen atom wave functions as in (12.143)
below, which would be an exact solution if the electron-electron repulsion was
neglected.
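$$\psi(\vec r_1, \vec r_2) = \frac{1}{\sqrt\pi}\left(\frac{Z}{a_0}\right)^{3/2} e^{-Zr_1/a_0}\; \frac{1}{\sqrt\pi}\left(\frac{Z}{a_0}\right)^{3/2} e^{-Zr_2/a_0} \tag{12.143}$$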
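We do the calculation as follows. We write
$$\begin{aligned}
H &= H_Z(1) + H_Z(2) + V = \frac{\vec p^{\,2}_{1,op}}{2m} - \frac{Ze^2}{r_1} + \frac{\vec p^{\,2}_{2,op}}{2m} - \frac{Ze^2}{r_2} + \frac{e^2}{r_{12}} \\
&= \frac{\vec p^{\,2}_{1,op}}{2m} - \frac{\zeta e^2}{r_1} + \frac{\vec p^{\,2}_{2,op}}{2m} - \frac{\zeta e^2}{r_2} + \frac{(\zeta - Z)e^2}{r_1} + \frac{(\zeta - Z)e^2}{r_2} + \frac{e^2}{r_{12}} \\
&= H_\zeta(1) + H_\zeta(2) + \frac{(\zeta - Z)e^2}{r_1} + \frac{(\zeta - Z)e^2}{r_2} + \frac{e^2}{r_{12}}
\end{aligned} \tag{12.144}$$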
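Now
$$H_\zeta(1)\,\psi_{100}(1) = E^{(0)}_{100}(\zeta)\,\psi_{100}(1) = -\zeta^2\,\psi_{100}(1) \tag{12.145}$$
$$H_\zeta(2)\,\psi_{100}(2) = E^{(0)}_{100}(\zeta)\,\psi_{100}(2) = -\zeta^2\,\psi_{100}(2) \tag{12.146}$$
in Rydbergs. Therefore
$$\begin{aligned}
f(\zeta) &= -2\zeta^2 + \langle\psi_{100}(1)|\,\frac{(\zeta - Z)e^2}{r_1}\,|\psi_{100}(1)\rangle \\
&\quad + \langle\psi_{100}(2)|\,\frac{(\zeta - Z)e^2}{r_2}\,|\psi_{100}(2)\rangle + \langle\psi(\zeta)|\,\frac{e^2}{r_{12}}\,|\psi(\zeta)\rangle
\end{aligned} \tag{12.147}$$
But
$$\langle\psi_{100}(1)|\,\frac{(\zeta - Z)e^2}{r_1}\,|\psi_{100}(1)\rangle = \langle\psi_{100}(2)|\,\frac{(\zeta - Z)e^2}{r_2}\,|\psi_{100}(2)\rangle = \langle 100|\,\frac{(\zeta - Z)e^2}{r}\,|100\rangle \tag{12.148}$$
Therefore,
$$f(\zeta) = -2\zeta^2 + 2e^2(\zeta - Z)\,\langle 100|\,\frac{1}{r}\,|100\rangle + \langle 100|\,\langle 100|\,\frac{e^2}{r_{12}}\,|100\rangle\,|100\rangle \tag{12.149}$$
Using some earlier calculations we get
$$f(\zeta) = -2\zeta^2 + 4\zeta(\zeta - Z) + \frac{5}{4}\zeta \tag{12.150}$$
Minimizing
$$\frac{df}{d\zeta} = 0 \;\Rightarrow\; 0 = -2\zeta + 2Z - \frac{5}{8} \tag{12.151}$$
or
$$\zeta = Z - \frac{5}{16} \tag{12.152}$$
and
$$E_0^{variational} = f\!\left(Z - \frac{5}{16}\right) = -2\left(Z - \frac{5}{16}\right)^2 = -2Z^2 + \frac{5}{4}Z - 2\left(\frac{5}{16}\right)^2 \tag{12.153}$$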
The first two terms are just the first order perturbation theory result. The third
term lowers the energy relative to perturbation theory.
For Z = 2, we get
Even with the simple trial function, we get a significantly better result using the
variational method. The reduction in the value of Z represents the effect of the
inner electron screening the outer electron so that it sees a smaller nuclear charge.
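The minimization can also be carried out numerically as a check on the algebra. The sketch below takes the energy function of (12.150), in Rydbergs, with the effective-charge variational parameter written here as `zeta` (a naming choice for this example), and locates its minimum by a simple ternary search. The search routine is a generic stand-in, not a method used in the text.

```python
RYDBERG_EV = 13.6  # rounded value used in the text

def variational_energy_ry(zeta, Z=2.0):
    """f(zeta) = -2 zeta^2 + 4 zeta (zeta - Z) + (5/4) zeta, in Rydbergs."""
    return -2.0 * zeta**2 + 4.0 * zeta * (zeta - Z) + 1.25 * zeta

def minimize(f, lo, hi, iterations=200):
    """Ternary search for the minimum of a convex function on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

Z = 2.0
zeta_best = minimize(lambda z: variational_energy_ry(z, Z), 0.0, 4.0)
e_variational = variational_energy_ry(zeta_best, Z)
e_first_order = -2.0 * Z**2 + 1.25 * Z   # first two terms of (12.153)

print(f"best zeta        = {zeta_best:.4f}  (Z - 5/16 = {Z - 5.0 / 16.0:.4f})")
print(f"variational E    = {e_variational:.4f} Ry = {e_variational * RYDBERG_EV:.1f} eV")
print(f"first-order PT E = {e_first_order:.4f} Ry = {e_first_order * RYDBERG_EV:.1f} eV")
```

For Z = 2 the search returns an effective charge of about 1.6875 and an energy of about -5.695 Ry, below the first-order perturbation value of -5.5 Ry by $2(5/16)^2$ Ry.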
H = E (12.154)
where = all quantum numbers needed to specify the N -electron state and the
Hamiltonian H is
$$H = \sum_{i=1}^{N}\left(-\frac{\hbar^2}{2m_e}\nabla_i^2 - \frac{Ze^2}{r_i}\right) + \sum_{i=1}^{N}\sum_{j>i}^{N}\frac{e^2}{r_{ij}} \tag{12.155}$$
This implies, since electrons are fermions, that the wave functions must be
completely antisymmetric, i.e.,
The full Hamiltonian is much too complex to solve exactly. We will approach
the solution as a series of increasingly better approximations and obtain a qual-
itative picture of the energy level structure of these complex atoms.
Since the difficulties arise from the e2 /rij terms that represent the electron-
electron repulsion, we start with a model where each electron moves indepen-
dently of all the other electrons (an independent particle model). In this model
each electron will be described by a single-particle wavefunction called an or-
bital.
This Hamiltonian is separable, i.e., we can assume that the system wavefunction
is a product of single-particle wavefunctions or orbitals.
where the subscript k represents all applicable single particle quantum numbers
for the k th electron, that is,
To solve these equations, we must know the potential energy functions Vi (~ri ). As
a first approximation within the orbital approximations, we ignore the electron-
electron repulsion so that the electrons only interact with the nucleus and we
have
$$V_i(\vec r_i) = V(r_i) = -\frac{Ze^2}{r_i} \tag{12.164}$$
ri
In this approximation, all the other electrons do not matter at all and each
electron satisfies
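$$\left(-\frac{\hbar^2}{2m_e}\nabla_i^2 - \frac{Ze^2}{r_i}\right)\psi_{n_i \ell_i m_{\ell_i} m_{s_i}}(\vec r_i) = E_{n_i \ell_i m_{\ell_i} m_{s_i}}\,\psi_{n_i \ell_i m_{\ell_i} m_{s_i}}(\vec r_i) \tag{12.165}$$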
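This is a hydrogen-like atom with nuclear charge Ze. The single-particle wavefunctions are
given by
$$\psi_{n\ell m_\ell m_s}(\vec r) = R_{n\ell}(r)\,Y_{\ell m_\ell}(\theta, \phi)\,\chi_{s m_s} = \psi(\vec r) \tag{12.166}$$
where
$$E_{n_k} = -\frac{m_e Z^2 e^4}{2\hbar^2 n_k^2}, \qquad n_k = 1, 2, 3, \ldots \tag{12.167}$$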
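The wave function corresponding to a set of orbitals $(\psi_1, \psi_2, \ldots, \psi_N)$ is then
properly antisymmetrized by writing it as
$$\psi = \frac{1}{\sqrt{N!}}\begin{vmatrix}
\psi_1(1) & \psi_2(1) & \cdots & \psi_N(1) \\
\psi_1(2) & \cdots & \cdots & \cdots \\
\vdots & & & \vdots \\
\psi_1(N) & \cdots & \cdots & \psi_N(N)
\end{vmatrix} \tag{12.168}$$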
where
E = E1 + E2 + ..... + EN (12.169)
We certainly can write down an answer in this approximation but the result,
not surprisingly, is terrible. Any real electron is dramatically affected by the
others, even when there is only one other electron as we saw in helium.
12.6.1 Screening
Any electron, on the average, if it is far from the nucleus, does not feel all of
the nuclear charge Ze and hence has a weaker Coulomb attraction than we have
assumed. This is clear in the helium variational calculation where we found that
the best value of the charge variational parameter $Z'$ was
$$Z' = Z - \frac{5}{16} \tag{12.170}$$
This implies that, on the average, the distant electrons are shielded or screened
from the nucleus by the other electrons.
What is the simplest correction that we can make to take this effect into account
for multi-electron atoms and still leave us with solvable equations?
Suppose we write
$$V_i(r_i) = -\frac{Ze^2}{r_i} + V_i^{eff}(r_i) \tag{12.171}$$
where $V_i^{eff}(r_i)$ includes the screening effects of the other $N - 1$ electrons. An
important feature of this assumption is that $V_i^{eff}(r_i)$ is independent of $(\theta, \phi)$.
This says that the angular part of the wave function is still
$$Y_{\ell_i m_{\ell_i}}(\theta, \phi) \tag{12.172}$$
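The radial function, however, now satisfies a modified equation
$$\left[\frac{1}{r_i^2}\frac{d}{dr_i}\left(r_i^2\frac{d}{dr_i}\right) - \frac{\ell_i(\ell_i + 1)}{r_i^2} + \frac{2m_e}{\hbar^2}\left(E_{n_i} - V_i(r_i)\right)\right]R_{n_i \ell_i}(r_i) = 0 \tag{12.173}$$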
This is called the central field approximation. We still have N difficult equations
to solve.
4. the procedure is continued until the final wave functions determine a self-
consistent potential, i.e., it stops changing as we iterate
Using single-particle wave functions, however, we are still neglecting the corre-
lations between the electrons. Although the simple single-particle orbital prod-
uct functions ignore antisymmetry, some effect of the Pauli exclusion principle
(PEP) can be included in the calculations by choosing the single-particle quan-
tum numbers so they do not violate the PEP.
$$E_n = -\frac{Z^2 e^2}{2a_0 n^2} \tag{12.174}$$
and each level had a degeneracy equal to $n^2$ arising from the allowed ranges
$$\ell = 0, 1, 2, \ldots, n - 1$$
$$m_\ell = -\ell, \ldots, \ell$$
We say that each n value defines a shell with energy En and within each shell
we have subshells defined by `. Thus,
n = 1    ℓ = 0          1s subshell
n = 2    ℓ = 0, 1       2s, 2p subshells
n = 3    ℓ = 0, 1, 2    3s, 3p, 3d subshells
We assume that the atom consists of shells (n) and subshells (n`). Electrons
are placed into these shells so that we do not violate the PEP, that is, since the
electrons are fermions only two electrons can be in each energy level. We define
in this model
and we have
$$E_{n\ell} = E_{n\ell'} \quad \text{(degenerate)} \tag{12.175}$$
(this is not true in the central field approximation).
For n > 1, the s-orbital has a nonzero probability near the origin r = 0
(the nucleus). This implies that it penetrates the n = 1 shell more than the
corresponding p-orbitals do. This implies that the s subshell electrons feel a
stronger nuclear charge than the p subshell electrons.
Therefore, we expect in this model that the energy levels will look like Figure
12.6 below.
Complex screening arguments of this type lead to the Aufbau principle, which
tells us how electrons fill shells.
We see how it works by figuring out the ground state of an N electron atom.
The ground state corresponds to that state where all the lowest energy levels
are filled with a maximum of two electrons per level. It is clear that this is the
state of lowest energy.
ni , `i i = 1, 2, 3, ......, N (12.176)
i.e.,
The screening arguments of the type we just discussed imply that for a given n
(a given shell) the energy order is s, p, d, ..... and generally the energy of a shell
increases with n = the principal quantum number. The closed shells correspond
to
and so on.
Electrons in the shell beyond the last closed shell are called valence electrons.
Much of the form and shape of the periodic table is determined by the Aufbau
principle. For instance
The valence electrons are the ones that participate in bonding and chemical
reactions.
As with all simple principles of this type, anomalies and breakdowns soon ap-
pear. For the Aufbau principle this occurs at the n = 3 shell.
In real atoms, when the 3d and 4s subshells are partially full, the 4s level fills
up before the 3d level. This means that
potassium → $1s^2\,2s^2\,2p^6\,3s^2\,3p^6\,4s$
and not $1s^2\,2s^2\,2p^6\,3s^2\,3p^6\,3d$
The 4s state has a larger probability of being near r = 0 than the 3d state and
hence its energy is lower.
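A common empirical summary of this filling order is the Madelung (n + ℓ) rule: subshells fill in order of increasing n + ℓ, and for equal n + ℓ in order of increasing n. This rule is not derived in the text and has known exceptions (chromium and copper, for example), so the sketch below should be read only as an illustration of how the 4s-before-3d ordering arises. The function names are hypothetical.

```python
def madelung_order(max_n=7):
    """Subshells (n, l) sorted by n + l, ties broken by smaller n (empirical rule)."""
    subshells = [(n, l) for n in range(1, max_n + 1) for l in range(n)]
    return sorted(subshells, key=lambda nl: (nl[0] + nl[1], nl[0]))

def ground_configuration(num_electrons):
    """Fill subshells in Madelung order, with at most 2(2l + 1) electrons in each."""
    letters = "spdfghi"
    remaining = num_electrons
    parts = []
    for n, l in madelung_order():
        if remaining == 0:
            break
        capacity = 2 * (2 * l + 1)   # two spin states for each m_l value
        filled = min(capacity, remaining)
        parts.append(f"{n}{letters[l]}{filled}")
        remaining -= filled
    return " ".join(parts)

print("carbon    (Z = 6): ", ground_configuration(6))
print("potassium (Z = 19):", ground_configuration(19))
print("scandium  (Z = 21):", ground_configuration(21))
```

For potassium the last electron lands in 4s rather than 3d, matching the configuration given above.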
12.7 Angular Momentum Coupling
The N electrons each have spin and orbital angular momentum and thus have
associated magnetic moments.
Therefore, the individual $\vec\ell_i$ are not conserved, but $\vec L$ is conserved.
The spin-orbit interaction leads to terms of the form $\vec\ell_i \cdot \vec s_i$ and thus couples a
particle's orbital and spin angular momentum, leading to $(j_i, m_{j_i})$ values, where
$\vec j_i = \vec\ell_i + \vec s_i$ as we saw earlier in hydrogen.
In most light atoms, the magnetic interactions are usually weaker than the
electrostatic interactions, i.e., electrostatic $\approx 1$ eV and spin-orbit $\approx 10^{-4} - 10^{-5}$ eV.
2. the spin angular momenta $\vec s_i$ couple to form a total spin angular momentum
$$\vec S_{op} = \sum_{i=1}^{N}\vec s_{i,op} \tag{12.183}$$
3. the weaker magnetic interactions then couple $\vec L$ and $\vec S$ to form the total
angular momentum of the atom
$$\vec J = \vec L + \vec S \tag{12.184}$$
This coupling scheme or order where the electrostatic interactions dominate the
magnetic interactions is called LS or Russell-Saunders coupling.
We will now investigate the energy level structure in detail for these two different
schemes.
Our discussion of helium has shown that exchange symmetry, which requires
that the wave functions are completely antisymmetric, has dramatic observable
consequences. We saw a spin-spin correlation energy that is characterized by
the rule:
12.7.1 LS Coupling
In this regime we have the observables and quantum numbers as shown in Table
12.3 below:
Operator             Quantum Number
$\vec L_{op}^2$      L
$\vec J_{op}^2$      J
$\vec S_{op}^2$      S
$L_z$                $M_L$
$S_z$                $M_S$
and
$$L_z = \hbar M_L, \qquad S_z = \hbar M_S$$
What are the possible L, S values? We can use our addition of angular momen-
tum rules to find out.
$$\ell_1 = \ell_2 = 1 \;\rightarrow\; L = 0, 1, 2$$
$$s_1 = s_2 = \frac{1}{2} \;\rightarrow\; S = 0, 1$$
and
and
$$M_J = M_L + M_S$$
$$J = |L - S|, \ldots, L + S$$
L    S    J        State(s)
0    0    0        $^1S_0$
0    1    1        $^3S_1$
1    0    1        $^1P_1$
1    1    0,1,2    $^3P_{0,1,2}$
2    0    2        $^1D_2$
2    1    1,2,3    $^3D_{1,2,3}$
For a closed shell we must have L = S = 0 or we would violate the PEP. For
example,
$$s^2: \quad \ell_1 = \ell_2 = 0 \;\rightarrow\; L = 0$$
$$s_1 = s_2 = \frac{1}{2} \;\rightarrow\; S = 0 \text{ or } 1$$
However, there is only one way to choose the m quantum numbers without
violating the PEP which is shown in Table 12.5 below.
$$L = S = J = 0 \;\rightarrow\; {}^1S_0 \text{ state} \tag{12.187}$$
For
$$p^6: \quad \ell_1 = \ell_2 = \ell_3 = \ell_4 = \ell_5 = \ell_6 = 1$$
$$s_1 = s_2 = s_3 = s_4 = s_5 = s_6 = \frac{1}{2}$$
Once again it turns out there is only one way to choose the values without vio-
lating the PEP. This is shown in Table 12.6 below.
m_ℓ1  m_ℓ2  m_ℓ3  m_ℓ4  m_ℓ5  m_ℓ6   m_s1   m_s2   m_s3   m_s4   m_s5   m_s6
 1     1     0     0    -1    -1     1/2   -1/2    1/2   -1/2    1/2   -1/2
$$L = S = J = 0 \;\rightarrow\; {}^1S_0 \text{ state} \tag{12.188}$$
In the presence of $H_{so}$ the energy levels will depend on L, S and J but not on
$M_L$, $M_S$ or $M_J$, which is why we label their atomic terms by
$$^{2S+1}L_J \tag{12.189}$$
where
$$S, P, D, F, \ldots \text{ means } L = 0, 1, 2, 3, \ldots \tag{12.190}$$
The superscript 2S + 1 is the multiplicity of the level (singlet, doublet, triplet,
etc).
If we ignore $H_{so}$ then we have (2S + 1)(2L + 1) degeneracy for a given level.
Adding $H_{so}$ splits the J states. Each term $^{2S+1}L_J$ remains 2J + 1 degenerate
(the $M_J$ values).
How do we determine the ground state for a particular atom in this scheme?
Let us consider carbon which has two equivalent (same subshell) 2p-electrons in
the unfilled shell. We have
$$2p^2: \quad \ell_1 = \ell_2 = 1 \;\rightarrow\; L = 0, 1 \text{ or } 2$$
$$s_1 = s_2 = \frac{1}{2} \;\rightarrow\; S = 0 \text{ or } 1$$
We get Table 12.7 below by applying these rules:
L    S    J        Term     Sublevels
0    0    0        $^1S$    $^1S_0$
1    0    1        $^1P$    $^1P_1$
2    0    2        $^1D$    $^1D_2$
0    1    1        $^3S$    $^3S_1$
1    1    0,1,2    $^3P$    $^3P_{0,1,2}$
2    1    1,2,3    $^3D$    $^3D_{1,2,3}$
Not all of these sublevels are allowed by the PEP, however. To see this we must
look at the individual electron quantum numbers. Table 12.8 below shows those
$m_{\ell 1}, m_{\ell 2}, m_{s1}, m_{s2}$ values allowed by the PEP (i.e., no two electrons have the
same set of quantum numbers).
Before proceeding to the table, in this case, we can use symmetry arguments
to determine the allowed levels. In the special case of only two electrons in an
unfilled shell, we can easily determine the symmetry of the spin vectors: the
S = 1 (triplet) states are symmetric and the S = 0 (singlet) state is antisymmetric
under exchange of the two electrons.
We also know the symmetry of the spatial state in general. The symmetry follows
from the symmetry of the angular part of the 2-electron wave function.
Since we have a central field approximation, the angular part of the wave function
is given by the $Y_{LM_L}$ spherical harmonics. The radial function is always
symmetric. The symmetry of the spherical harmonics is $(-1)^L$. Therefore, the
spatial states with L = 0, 2 are symmetric and those with L = 1 are antisymmetric.
The product of the spin vector and the spatial function must always be antisymmetric.
Therefore we have the combinations S = 0 with L = 0, 2 and S = 1 with L = 1.
This method is only simple to carry out for 2-electron unfilled shells. In the
case of carbon we get the allowed states
$${}^1S_0, \qquad {}^3P_{0,1,2}, \qquad {}^1D_2$$
for a total of 15 allowed states. The individual quantum numbers table corresponding
to these 15 states is
Entry   m_ℓ = -1   m_ℓ = 0   m_ℓ = +1   M_L   M_S
1       ↑↓                              -2     0
2                  ↑↓                    0     0
3                            ↑↓          2     0
4       ↑          ↑                    -1     1
5       ↑          ↓                    -1     0
6       ↓          ↑                    -1     0
7       ↓          ↓                    -1    -1
8       ↑                    ↑           0     1
9       ↑                    ↓           0     0
10      ↓                    ↑           0     0
11      ↓                    ↓           0    -1
12                 ↑         ↑           1     1
13                 ↑         ↓           1     0
14                 ↓         ↑           1     0
15                 ↓         ↓           1    -1
Any other combinations will violate the PEP. This table can be constructed just
using the PEP.
We now construct an implied terms table which tells us how many states exist
with a particular pair of (ML , MS ) values. It is shown as Table 12.9 below.
ML /MS 1 0 -1
2 0 1 0
1 1 2 1
0 1 3 1
-1 1 2 1
-2 0 1 0
We use this table to determine which atomic terms are allowed for carbon. The
steps are as follows:
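1. Consider the largest possible values of L and S, L = 2, S = 1, which
correspond to the $^3D$ terms. An L = 2, S = 1 term requires an $M_L = 2$, $M_S = 1$
entry, which does not exist in the table. Therefore the $^3D$ term is not allowed.
2. We now look at the next largest values, namely, L = 2, S = 0 or the $^1D$
term. An L = 2, S = 0 term requires an $M_L = 2$, $M_S = 0$ entry, which does exist.
Therefore the $^1D$ term and the sublevel $^1D_2$ exist. This has J = 2 and
thus 2J + 1 = 5 $M_J$ levels. This accounts for 5 of the 15 entries in the
table. Removing them leaves
ML /MS 1 0 -1
2 0 0 0
1 1 1 1
0 1 2 1
-1 1 1 1
-2 0 0 0
3. The next largest values are L = 1, S = 1 or the $^3P$ term, which requires an
$M_L = 1$, $M_S = 1$ entry. This entry exists, so the $^3P$ term and the sublevels
$^3P_{0,1,2}$ exist. They account for (2L + 1)(2S + 1) = 9 more entries. Removing
them leaves
ML /MS 1 0 -1
2 0 0 0
1 0 0 0
0 0 1 0
-1 0 0 0
-2 0 0 0
4. The single remaining entry, $M_L = 0$, $M_S = 0$, is the $^1S$ term with
sublevel $^1S_0$.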
This accounts for all the 15 entries in the table. No more states are allowed,
which means that the $^3S$ and $^1P$ atomic terms and their associated sublevels are
forbidden by the PEP. This result agrees with the allowed states we obtained
from symmetry arguments.
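The implied-terms procedure above can be checked by brute force. The following is a minimal sketch, not part of the original text: the function names `allowed_terms` and `label` are hypothetical, and spin projections are stored as twice their value so that only integers appear. It enumerates the Pauli-allowed microstates of an $\ell^n$ configuration and repeatedly removes the term with the largest remaining $M_L$ (and, for that $M_L$, the largest $M_S$).

```python
from itertools import combinations
from collections import Counter

def allowed_terms(l, n_electrons):
    """Enumerate Pauli-allowed microstates of l^n and peel off LS terms."""
    # Single-electron states: (m_l, 2*m_s) with 2*m_s = +1 or -1.
    orbitals = [(ml, ms2) for ml in range(-l, l + 1) for ms2 in (1, -1)]
    # PEP: choose n *distinct* single-electron states (order irrelevant).
    table = Counter()
    for occ in combinations(orbitals, n_electrons):
        ML = sum(ml for ml, _ in occ)
        MS2 = sum(ms2 for _, ms2 in occ)  # twice M_S
        table[(ML, MS2)] += 1
    n_states = sum(table.values())
    terms = []
    while any(table.values()):
        # Largest M_L still present, then the largest M_S for that M_L.
        L = max(ML for (ML, _), c in table.items() if c > 0)
        S2 = max(MS2 for (ML, MS2), c in table.items() if c > 0 and ML == L)
        terms.append((L, S2))
        # Remove the (2L+1)(2S+1) entries that belong to this term.
        for ML in range(-L, L + 1):
            for MS2 in range(-S2, S2 + 1, 2):
                table[(ML, MS2)] -= 1
    return n_states, terms

def label(L, S2):
    """Term symbol such as '3P' from L and 2S."""
    return f"{S2 + 1}{'SPDFGHI'[L]}"

n_states, terms = allowed_terms(1, 2)  # carbon: two equivalent p electrons
print(n_states, sorted(label(L, S2) for L, S2 in terms))
```

For `allowed_terms(1, 2)` the sketch finds 15 microstates and the terms 1D, 3P and 1S, in agreement with the tables; calling it with `(1, 6)` gives the single closed-shell 1S state.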
We always need to use the implied-terms tables in the general cases (more than
2 electrons in an unfilled shell) because the corresponding symmetry arguments
are very complex to apply.
This result is true for all atoms with 2 equivalent p-electrons outside closed
subshells.
To complete the picture, we must now determine how the allowed terms are
ordered in energy.
A set of rules exists for qualitatively ordering the levels. They are called Hund's
rules.
Hund's rules apply when we are ordering the energy levels and sublevels for
equivalent electrons in the ground state.
Rule 1 The level with the largest multiplicity 2S + 1 lies lowest in energy
This follows from the fact that same spins (unpaired spins) repel and different
spins (paired spins) attract.
High multiplicity implies a greater number of electrons with parallel spin than
low multiplicity in multielectron atoms. Since parallel spin electrons avoid each
other, the $e^2/r_{ij}$ effect decreases and the energy of high multiplicity states lies
below that of low multiplicity states.
Rule 2 Of several levels with the same multiplicity 2S + 1, the one with maximum
L lies lowest in energy
In some sense, the maximum L state implies that all electrons are orbiting in
the same direction. These electrons tend to remain separated from each other
and so have a lower energy than those orbiting in the opposite direction, which
get close to each other some of the time.
Rule 3 For a shell that is less than half filled, the sublevel with the smallest J
lies lowest in energy; for a shell that is more than half filled, the sublevel with
the largest J lies lowest
This result follows from $H_{so}$ and the fact that $e^2/r$ increases as $r \to 0$.
Applying Hund's rules to an $np^2$ configuration we get the energy level scheme
in Figure 12.7 below.
Hund's rules are not perfect since they are based on the orbital approximation.
12.7.3 JJ-Coupling
In heavy atoms, the magnetic interactions which couple the $\vec{\ell}_i$ and $\vec{s}_i$ together
into the $\vec{j}_i$ dominate over the electrostatic interactions which led to LS coupling.
The configurations are then better described by the so-called jj-coupling scheme.
$$j_i = \ell_i \pm \tfrac{1}{2}$$
$$m_{j_i} = -j_i, \ldots, j_i$$
The individual $\vec{j}_i$ then couple together to give the total $\vec{J}$.
$$J = |j_1 - j_2|, \ldots, j_1 + j_2$$
$$M_J = -J, \ldots, J$$
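Let us consider the Pb (lead) atom, which has $np^2$ valence electrons (built on
many closed shells). We have
$$\ell_1 = \ell_2 = 1$$
$$j_1 = \tfrac{1}{2}, \tfrac{3}{2} \quad\text{and}\quad j_2 = \tfrac{1}{2}, \tfrac{3}{2}$$
The possible total J values are then
$$(j_1, j_2) = \left(\tfrac{3}{2}, \tfrac{3}{2}\right): \quad J = 3, 2, 1, 0$$
$$(j_1, j_2) = \left(\tfrac{1}{2}, \tfrac{3}{2}\right): \quad J = 2, 1$$
$$(j_1, j_2) = \left(\tfrac{1}{2}, \tfrac{1}{2}\right): \quad J = 1, 0$$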
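Not all of these states are allowed by the PEP. For example,
$$J = 3,\ M_J = 3 \;\Rightarrow\; j_1 = j_2 = \tfrac{3}{2},\quad m_{j_1} = m_{j_2} = \tfrac{3}{2}$$
$$\Rightarrow\; \ell_1 = \ell_2 = 1,\quad m_{\ell_1} = m_{\ell_2} = 1$$
$$\phantom{\Rightarrow}\; s_1 = s_2 = \tfrac{1}{2},\quad m_{s_1} = m_{s_2} = \tfrac{1}{2}$$
Both electrons need to have identical quantum numbers for this state to exist.
Thus, this state is not allowed. In a similar manner,
$$J = 3, 1 \quad\text{with}\quad j_1 = j_2 = \tfrac{3}{2}$$
$$J = 1 \quad\text{with}\quad j_1 = j_2 = \tfrac{1}{2}$$
can be shown to be forbidden by the PEP. Therefore we have
$$(j_1, j_2) = \left(\tfrac{3}{2}, \tfrac{3}{2}\right): \quad J = 2, 0$$
$$(j_1, j_2) = \left(\tfrac{1}{2}, \tfrac{3}{2}\right): \quad J = 2, 1$$
$$(j_1, j_2) = \left(\tfrac{1}{2}, \tfrac{1}{2}\right): \quad J = 0$$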
Usually, the level with the lowest J for a given pair (j1 , j2 ) has the lowest energy
(this is not a strict rule).
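The PEP bookkeeping in the jj scheme can also be done by enumeration. The sketch below is an illustration only (the function name `jj_levels_p2` is hypothetical, and all angular momenta are stored as twice their value to keep them integral). It lists the six single-electron states $(j, m_j)$ available to a p electron, picks two distinct ones, and extracts the J values for each $(j_1, j_2)$ pair from the $M_J$ counts.

```python
from itertools import combinations
from collections import Counter, defaultdict

def jj_levels_p2():
    """Allowed J values for two equivalent p electrons in jj coupling.

    Keys of the result are (2*j1, 2*j2); values are lists of J.
    """
    # Single-electron states (2j, 2m_j) for l = 1: j = 1/2 and j = 3/2.
    states = [(j2, m2) for j2 in (1, 3) for m2 in range(-j2, j2 + 1, 2)]
    tables = defaultdict(Counter)
    # PEP: the two electrons must occupy two different (j, m_j) states.
    for a, b in combinations(states, 2):
        pair = tuple(sorted((a[0], b[0])))
        tables[pair][a[1] + b[1]] += 1  # key is 2*M_J
    result = {}
    for pair, table in tables.items():
        Js = []
        while any(table.values()):
            J2 = max(M for M, c in table.items() if c > 0)
            Js.append(J2 // 2)
            # Remove the 2J+1 values of M_J that belong to this J.
            for M in range(-J2, J2 + 1, 2):
                table[M] -= 1
        result[pair] = sorted(Js, reverse=True)
    return result

print(jj_levels_p2())
```

The output maps (3, 3), i.e. $j_1 = j_2 = 3/2$, to J = 2, 0; (1, 3) to J = 2, 1; and (1, 1) to J = 0, matching the list above.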
Figure 12.8: LS - jj Energy Level Connection
$$\langle H_{so} \rangle = \frac{1}{2} C \left[ J(J+1) - L(L+1) - S(S+1) \right] \tag{12.191}$$
$$E_{J+1} - E_J = \frac{1}{2} C \left[ (J+1)(J+2) - J(J+1) \right] = C(J+1) \tag{12.192}$$
This says that the spacing between consecutive levels of a fine structure multiplet
is proportional to the larger J value involved. This is the Landé interval
rule.
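As a quick numerical illustration of (12.191) and (12.192), the sketch below evaluates the spin-orbit expectation value for a $^3P$ multiplet. The function name `hso` and the choice C = 1 are assumptions made only for this example.

```python
def hso(J, L, S, C=1.0):
    """<H_so> = (C/2) [J(J+1) - L(L+1) - S(S+1)], eq. (12.191)."""
    return 0.5 * C * (J * (J + 1) - L * (L + 1) - S * (S + 1))

L, S = 1, 1  # a 3P multiplet, J = 0, 1, 2
energies = {J: hso(J, L, S) for J in range(abs(L - S), L + S + 1)}
# Interval between consecutive levels, keyed by the larger J of the pair.
intervals = {J + 1: energies[J + 1] - energies[J]
             for J in range(abs(L - S), L + S)}
print(energies)
print(intervals)
```

The intervals come out as C and 2C, i.e. C times the larger J of each pair, which is the interval rule.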
We end this discussion with an example of two electrons that are not equivalent
(in different shells). The discussion is more straightforward since we do not
have to worry about the PEP (all possibilities are allowed).
we have:
$$\ell_1 = 1,\ \ell_2 = 2 \;\Rightarrow\; L = 1, 2, 3$$
$$s_1 = s_2 = \tfrac{1}{2} \;\Rightarrow\; S = 0, 1$$
L   S   J           States
3   1   4, 3, 2     21 states
3   0   3            7 states
2   1   3, 2, 1     15 states
2   0   2            5 states
1   1   2, 1, 0      9 states
1   0   1            3 states
or
J = 4 in 1 level
J = 3 in 3 levels
J = 2 in 4 levels
J = 1 in 3 levels
J = 0 in 1 level
Thus, we have 12 total levels. The LS coupling energy level diagram is shown
in Figure 12.9
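The counting for non-equivalent electrons is simple enough to automate. The following sketch (the helper name `ls_levels` is hypothetical; spins and J are stored as twice their value) lists every (L, S, J) level for $\ell_1 = 1$, $\ell_2 = 2$ and tallies levels and states.

```python
from collections import Counter

def ls_levels(l1, l2, s1_twice=1, s2_twice=1):
    """All (L, 2S, 2J) levels for two NON-equivalent electrons (no PEP cut)."""
    levels = []
    for L in range(abs(l1 - l2), l1 + l2 + 1):
        for S2 in range(abs(s1_twice - s2_twice), s1_twice + s2_twice + 1, 2):
            for J2 in range(abs(2 * L - S2), 2 * L + S2 + 1, 2):
                levels.append((L, S2, J2))
    return levels

levels = ls_levels(1, 2)
n_states = sum(J2 + 1 for _, _, J2 in levels)        # each level has 2J+1 states
levels_per_J = Counter(J2 // 2 for _, _, J2 in levels)
print(len(levels), n_states, dict(levels_per_J))
```

This gives 12 levels and 60 states, the latter equal to $(2\ell_1+1)(2\ell_2+1)(2s_1+1)(2s_2+1)$, as it must be when nothing is excluded.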
In the jj-coupling scheme the energy level diagram is shown in Figure 12.10.
Notice that the final J values are identical, but their arrangement in energy is
very different.
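In ordinary 3-dimensional space, if we define
$$\langle \theta\phi | \vec{L}^2_{op} | \ell m \rangle = \vec{L}^2_{op} \langle \theta\phi | \ell m \rangle = \vec{L}^2_{op} Y_{\ell m}(\theta,\phi) = \hbar^2 \ell(\ell+1) \langle \theta\phi | \ell m \rangle = \hbar^2 \ell(\ell+1) Y_{\ell m}(\theta,\phi) \tag{12.198}$$
$$\langle \theta\phi | L_3 | \ell m \rangle = L_3 \langle \theta\phi | \ell m \rangle = L_3 Y_{\ell m}(\theta,\phi) = \hbar m \langle \theta\phi | \ell m \rangle = \hbar m Y_{\ell m}(\theta,\phi) \tag{12.199}$$
Some Properties
Complex Conjugate
$$Y^*_{\ell,m}(\theta,\phi) = (-1)^m Y_{\ell,-m}(\theta,\phi) \tag{12.205}$$
$$\vec{r} \to -\vec{r} \quad\text{or}\quad r \to r,\ \theta \to \pi - \theta,\ \phi \to \phi + \pi$$
Therefore,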
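Since they form a complete set, any function of $(\theta,\phi)$ can be expanded in terms
of the $Y_{\ell,m}(\theta,\phi)$ (the $Y_{\ell,m}(\theta,\phi)$ are a basis), i.e., we can write
$$f(\theta,\phi) = \sum_{\ell,m} f_{\ell m} Y_{\ell,m}(\theta,\phi) \tag{12.207}$$
where
$$f_{\ell m} = \int_0^{2\pi} d\phi \int_0^{\pi} \sin\theta\, d\theta\; Y^*_{\ell m}(\theta,\phi) f(\theta,\phi) \tag{12.208}$$
$$\int_0^{2\pi} d\phi \int_0^{\pi} \sin\theta\, d\theta\; Y^*_{\ell' m'}(\theta,\phi) Y_{\ell m}(\theta,\phi) = \delta_{\ell'\ell}\,\delta_{m'm} \tag{12.209}$$
Closure:
$$\sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{\ell} Y_{\ell m}(\theta,\phi) Y^*_{\ell m}(\theta',\phi') = \frac{\delta(\phi-\phi')\,\delta(\theta-\theta')}{\sin\theta} = \delta(\hat{r},\hat{r}') \tag{12.210}$$
i.e., the solid angle delta function is equal to zero unless the two vectors $\hat{r}(\theta,\phi)$, $\hat{r}'(\theta',\phi')$
coincide. It has the property
$$\int f(\hat{r}')\,\delta(\hat{r},\hat{r}')\,d\Omega' = f(\hat{r}) \tag{12.211}$$
Recursion:
$$L_\pm Y_{\ell m} = \hbar\left[\ell(\ell+1) - m(m\pm 1)\right]^{1/2} Y_{\ell,m\pm 1} = \hbar\left[(\ell \mp m)(\ell+1\pm m)\right]^{1/2} Y_{\ell,m\pm 1} \tag{12.212}$$
$$\cos\theta\, Y_{\ell m} = \left[\frac{(\ell+1+m)(\ell+1-m)}{(2\ell+1)(2\ell+3)}\right]^{1/2} Y_{\ell+1,m} + \left[\frac{(\ell+m)(\ell-m)}{(2\ell+1)(2\ell-1)}\right]^{1/2} Y_{\ell-1,m} \tag{12.213}$$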
12.8.2 The Addition Theorem
Consider two coordinate systems xyz and x'y'z'. The addition theorem is the
formula expressing the eigenfunction $P_\ell(\cos\theta')$ of the angular momentum $L_{z'}$
about the z' axis in terms of the eigenfunctions $Y_{\ell,m}(\theta,\phi)$ of $L_z$. See Figure
12.11 below for orientations.
The angles $\alpha$ and $\beta$ are the azimuth and the polar angles of the z' axis in
the Cartesian xyz coordinate frame. They are also the first two Euler angles
specifying the orientation of the Cartesian coordinate system x'y'z' with respect
to xyz. The third Euler angle $\gamma$ is left unspecified here and the x' and y' axes
are not shown. The projections of the z' axis and the radius vector on the xy
plane are dashed lines.
The direction of the z' axis in space is specified by its polar angle $\beta$ and its
azimuth angle $\alpha$ with respect to the xyz system.
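Since $P_\ell$ is an eigenfunction of $\vec{L}^2_{op}$, only spherical harmonics with the same
subscript $\ell$ can appear in the expansion.
where $\theta'$ = angle between the directions $(\theta,\phi)$ and $(\beta,\alpha)$.
If we combine the closure relation with the addition theorem we get the identity
$$\sum_{\ell=0}^{\infty} (2\ell+1) P_\ell(\hat{r}\cdot\hat{r}') = 4\pi\,\delta(\hat{r},\hat{r}') \tag{12.223}$$
or, in general
$$e^{i\vec{k}\cdot\vec{r}} = 4\pi \sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{\ell} i^\ell j_\ell(kr)\, Y^*_{\ell m}(\theta_{\vec{k}},\phi_{\vec{k}})\, Y_{\ell m}(\theta_{\vec{r}},\phi_{\vec{r}}) \tag{12.227}$$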
12.9 Problems
12.9.1 Two Bosons in a Well
Two identical spin-zero bosons are placed in a 1-dimensional square potential
well with infinitely high walls, i.e., V = 0 for 0 < x < L, otherwise $V = \infty$. The
normalized single particle energy eigenstates are
$$u_n(x) = \sqrt{\frac{2}{L}}\, \sin(n\pi x/L)$$
(a) Find the wavefunctions and energies for the ground state and the first two
excited states of the system.
(b) Suppose that the two bosons interact with each other through the perturbing
potential
$$H'(x_1, x_2) = -L V_0\, \delta(x_1 - x_2)$$
Compute the first-order correction to the ground state energy of the system.
12.9.2 Two Fermions in a Well
Two identical spin-1/2 fermions are placed in a 1-dimensional square potential
well with infinitely high walls, i.e., V = 0 for 0 < x < L, otherwise $V = \infty$. The
normalized single particle energy eigenstates are
$$u_n(x) = \sqrt{\frac{2}{L}}\, \sin(n\pi x/L)$$
(a) What are the allowed values of the total spin angular momentum quantum
number, J? How many possible values are there for the z-component
of the total angular momentum?
(d) What is the ground-state energy of the two-particle system, and how does
it depend on the overall spin state?
with b < a and V0 very large (assume V0 is infinite where appropriate) and
positive.
(a) Determine the normalized position-space energy eigenfunction for the ground
state. What is the spin state of the ground state? What is the degeneracy?
(b) What can you say about the energy and spin state of the first excited
state? Does your result depend on how much larger a is than b? Explain.
12.9.4 Hydrogen Atom Calculations
We discuss here some useful tricks for evaluating the expectation values of cer-
tain operators in the eigenstates of the hydrogen atom.
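$$E^{(1)} = \frac{\gamma m e^2}{n^2 \hbar^2} = \langle \gamma/r \rangle_{n\ell m}$$
Therefore we get
$$\langle 1/r \rangle_{n\ell m} = \frac{m e^2}{n^2 \hbar^2} = \frac{1}{n^2 a_0}$$
We note (for later use) that
$$E(\gamma) = E^{(0)} + E^{(1)} + \ldots = E(\gamma = 0) + \gamma \left.\frac{dE}{d\gamma}\right|_{\gamma=0} + \ldots$$
so that one way to extract $E^{(1)}$ from the exact answer is to calculate
$$\gamma \left.\frac{dE}{d\gamma}\right|_{\gamma=0}$$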
(b) Evaluate, in a manner similar to part (a), $\langle \vec{p}^{\,2}/2\mu \rangle_{n\ell m}$ by considering the
Hamiltonian
$$H = \frac{p^2}{2\mu} - \frac{Ze^2}{r} + \gamma \frac{p^2}{2\mu}$$
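(c) Consider now $\langle \gamma/r^2 \rangle_{n\ell m}$. In this case, an exact solution is possible since
the perturbation just modifies the centrifugal term as follows:
$$\frac{\hbar^2 \ell(\ell+1)}{2mr^2} + \frac{\gamma}{r^2} = \frac{\hbar^2 \ell'(\ell'+1)}{2mr^2}$$
where $\ell'$ is a function of $\gamma$. Now go back to the original hydrogen atom
solution and show that the dependence of E on $\ell'(\gamma)$ is
$$E(\ell') = -\frac{m Z^2 e^4}{2\hbar^2 (k + \ell' + 1)^2} = E(\gamma) = E^{(0)} + E^{(1)} + \ldots$$
Then show that
$$\langle \gamma/r^2 \rangle_{n\ell m} = E^{(1)} = \gamma \left.\frac{dE}{d\gamma}\right|_{\gamma=0} = \gamma \left.\frac{dE}{d\ell'}\right|_{\ell'=\ell} \left.\frac{d\ell'}{d\gamma}\right|_{\ell'=\ell} = \frac{\gamma Z^2}{n^3 a_0^2 (\ell + 1/2)}$$
or
$$\langle 1/r^2 \rangle_{n\ell m} = \frac{Z^2}{n^3 a_0^2 (\ell + 1/2)}$$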
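(d) Finally consider $\langle \gamma/r^3 \rangle_{n\ell m}$. Since there is no such term in the hydrogen
Hamiltonian, we resort to a different trick. Consider the radial momentum
operator
$$p_r = -i\hbar \left( \frac{\partial}{\partial r} + \frac{1}{r} \right)$$
Show that in terms of this operator we may write the radial part of the
Hamiltonian
$$-\frac{\hbar^2}{2m} \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial}{\partial r} \right)$$
as
$$\frac{p_r^2}{2m}$$
Now show that
$$\langle [H, p_r] \rangle = 0$$
in the energy eigenstates. Using this fact, and by explicitly evaluating the
commutator, show that
$$\langle 1/r^3 \rangle_{n\ell m} = \frac{Z}{a_0\, \ell(\ell+1)} \langle 1/r^2 \rangle_{n\ell m}$$
and hence
$$\langle 1/r^3 \rangle_{n\ell m} = \frac{Z^3}{n^3 a_0^3\, \ell(\ell+1)(\ell+1/2)}$$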
12.9.7 Magnetic moments of proton and neutron
The magnetic dipole moment of the proton is
$$\vec{\mu}_p = g_p \frac{e}{2m_p} \vec{S}_p$$
We have not studied the Dirac equation yet, but the prediction of the Dirac
equation for a point spin-1/2 particle is $g_p = 2$. We can understand the fact
that the proton gyromagnetic ratio is not two as being due to its compositeness,
i.e., in a simple quark model, the proton is made up of three quarks, two ups
(u), and a down (d). The quarks are supposed to be point spin-1/2, hence,
their gyromagnetic ratios should be $g_u = g_d = 2$ (up to higher order corrections,
as in the case of the electron). Let us see if we can make sense out of the proton
magnetic moment.
The proton magnetic moment should be the sum of the magnetic moments of
its constituents, and any moments due to their orbital motion in the proton.
The proton is the ground state baryon, so we assume that the three quarks are
bound together (by the strong interaction) in a state with no orbital angular
momentum. The Pauli principle says that the two identical up quarks must
have an overall odd wave function under interchange of all quantum numbers.
We must apply this rule with some care since we will be including color as one
of these quantum numbers.
These are different rules than for the addition of spin case because that case
uses the rotation group instead. We do not need to understand all aspects of
the SU (3) group for this problem. The essential aspect here is that there is a
singlet in the decomposition, i.e., it is possible to combine three colors in a way
as to get a color singlet state or a state with no net color charge. These turn
out to be the states of physical interest for the observed baryons according to a
postulate of the quark model.
(a) The singlet state in the decomposition above must be antisymmetric under
the interchange of any two colors. Assuming this is the case, write down
the color portion of the proton wave function.
(b) Now that you know the color wave function of the quarks in the proton,
write down the spin wave function. You must construct a
$|1/2, 1/2\rangle$ total spin angular momentum state from three spin-1/2 states
where the two up quarks must be in a symmetric state.
(c) Since the proton is uud and its partner the neutron (they are just two states
of the same particle) is ddu and $m_p \simeq m_n$, we can make the simplifying
assumption that $m_u \simeq m_d$. Given the measured value of $g_p$, what does
your model give for $m_u$? Remember that the up quark has electric charge
2/3 and the down quark has electric charge -1/3, in units of positron
charge.
(d) Finally, use your results to predict the gyromagnetic moment of the neutron
(the neutron results follow from the proton results by interchanging the u and d
labels) and compare with observation.
12.9.10 LS versus JJ coupling
Consider a multielectron atom whose electron configuration is
$$1s^2\, 2s^2\, 2p^6\, 3s^2\, 3p^6\, 3d^{10}\, 4s^2\, 4p\, 4d$$
(a) To what element does this configuration belong? Is it the ground state or
an excited state? Explain.
(b) Suppose that we apply the Russell-Saunders coupling scheme to this atom.
Draw an energy level diagram roughly to scale for the atom, beginning
with the single unperturbed configuration energy and taking into account
the various interactions one at a time in the correct order. Be sure to
label each level at each stage of your diagram with the appropriate term
designation, quantum numbers and so on.
(c) Suppose instead we apply pure jj-coupling to the atom. Starting again
from the unperturbed n = 4 level, draw a second energy level diagram.
[HINT: Assume that for a given level $(j_1, j_2)$, the state with the lowest J
lies lowest in energy]
(a) particles are not identical
(b) identical particles of spin= 0
(c) identical particles of spin= 1/2 with spins parallel
Figure 12.12: Fermi-Pauli Splittings
(b) Without the spin-orbit interaction, good quantum numbers for the angu-
lar momentum degrees of freedom are |LML SMS i. What are the good
quantum numbers with spin-orbit present?
(c) The energy level diagram including spin-orbit corrections is sketched be-
low.
Label the states with appropriate quantum numbers. NOTE: Some of the
levels are degenerate; the sublevels are not shown.
Chapter 13
Scattering Theory and Molecular Physics
A particle is a localized region in space and time that contains energy and
momentum. In quantum mechanics a good representation of a particle is a wave
packet as we saw earlier. We will assume that the potential that is responsible
for the scattering effects is of short range in space.
We also assume that the spatial extent of the wave packet is small compared to
the spatial dimensions of the laboratory, i.e., the detectors, etc., but that it is
large compared to the size of the scattering region.
The incident wavepacket is emitted by a source at time t0 in a region where the
potential is zero (or negligible) and we detect the scattered wave packet (at the
detector) in another region where the potential is zero (or negligible).
The wave packet representing the incident particle is given by the expression
$$\psi(\vec{r}, t_0) = \int \frac{d^3\vec{k}}{(2\pi)^3}\, e^{i\vec{k}\cdot\vec{r}}\, a_{\vec{k}} \tag{13.1}$$
where $a_{\vec{k}}$ has a maximum near $\vec{k}_0$. Our earlier stationary phase arguments then
say that this wave packet (region in space and time where $|\psi|^2$ is nonzero) travels
with a (group) velocity $\hbar\vec{k}_0/m$ towards the target.
1. We solve for the exact eigenstates $\psi_{\vec{k}}(\vec{r})$ of the potential $V(\vec{r})$ using the
Schrodinger equation
$$\left( \frac{\hbar^2}{2m}\nabla^2 + E_{\vec{k}} \right) \psi_{\vec{k}}(\vec{r}) = V(\vec{r})\,\psi_{\vec{k}}(\vec{r}) \tag{13.2}$$
where
$$E_{\vec{k}} = \frac{\hbar^2 k^2}{2m} \tag{13.3}$$
and all values of $E_{\vec{k}} \geq 0$ are allowed eigenvalues.
$$\psi(\vec{r}, t_0) = \int \frac{d^3\vec{k}}{(2\pi)^3}\, \psi_{\vec{k}}(\vec{r})\, b_{\vec{k}} \tag{13.4}$$
where
$$\frac{\hbar^2 k^2}{2m} \geq 0 \tag{13.5}$$
for all the $\psi_{\vec{k}}(\vec{r})$ in the expansion.
These assumptions imply that only the incident (incoming) and scattered
(outgoing) waves appear and that no bound states contribute since
they would have E < 0 and thus fall off exponentially.
3. Since we now have an expansion in energy eigenstates it is trivial to incorporate
the time evolution of $\psi(\vec{r}, t)$. We get
$$\psi(\vec{r}, t) = \int \frac{d^3\vec{k}}{(2\pi)^3}\, \psi_{\vec{k}}(\vec{r})\, b_{\vec{k}}\, e^{-i\frac{E_{\vec{k}}}{\hbar}(t - t_0)} \tag{13.6}$$
This is the formal solution to the scattering problem.
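The Green's function for the free particle Schrodinger equation is given by the
differential equation
$$\left( \frac{\hbar^2}{2m}\nabla^2 + E_{\vec{k}} \right) G(\vec{r}, k) = \delta(\vec{r}) \tag{13.7}$$
In terms of this function, the solution to the full Schrodinger equation is
$$\psi_{\vec{k}}(\vec{r}) = \phi_0(\vec{r}) + \int d^3\vec{r}\,'\, G(\vec{r} - \vec{r}\,', k)\, V(\vec{r}\,')\, \psi_{\vec{k}}(\vec{r}\,') \tag{13.8}$$
where
$$\left( \frac{\hbar^2}{2m}\nabla^2 + E_{\vec{k}} \right) \phi_0(\vec{r}) = 0 \tag{13.9}$$
and $\phi_0(\vec{r})$ is the free-particle wave function (solution of the homogeneous equation
that results when V = 0). We then have
$$\phi_0(\vec{r}) = e^{i\vec{k}\cdot\vec{r}} \tag{13.10}$$
Direct substitution shows that we have a general solution to the Schrodinger
equation.
$$\left( \frac{\hbar^2}{2m}\nabla^2 + E_{\vec{k}} \right) \psi_{\vec{k}}(\vec{r}) = \left( \frac{\hbar^2}{2m}\nabla^2 + E_{\vec{k}} \right) \phi_0(\vec{r}) + \int d^3\vec{r}\,' \left( \frac{\hbar^2}{2m}\nabla^2 + E_{\vec{k}} \right) G(\vec{r} - \vec{r}\,', k)\, V(\vec{r}\,')\, \psi_{\vec{k}}(\vec{r}\,')$$
$$= \int d^3\vec{r}\,'\, \delta(\vec{r} - \vec{r}\,')\, V(\vec{r}\,')\, \psi_{\vec{k}}(\vec{r}\,') = V(\vec{r})\,\psi_{\vec{k}}(\vec{r})$$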
In addition, we then say
The integral is then the total scattered wave (adds up all waves coming from all
parts of the target region) and is equal to the sum over all source points ~r 0 .
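For this interpretation to make physical sense $G(\vec{r} - \vec{r}\,', k)$ must generate outgoing
waves only!
We can determine the Green's function, as we did earlier, using Fourier transforms
and complex contour integration. As before, the choice of the contour
will be equivalent to choosing boundary conditions for the differential equation
and thus completing the solution of the problem.
$$G(\vec{r}, k) = \int \frac{d^3\vec{p}}{(2\pi)^3}\, e^{i\vec{p}\cdot\vec{r}}\, G(\vec{p}) \tag{13.11}$$
$$\delta(\vec{r}) = \int \frac{d^3\vec{p}}{(2\pi)^3}\, e^{i\vec{p}\cdot\vec{r}} \tag{13.12}$$
$$\left( \frac{\hbar^2}{2m}\nabla^2 + E_{\vec{k}} \right) G(\vec{r}, k) = \int \frac{d^3\vec{p}}{(2\pi)^3} \left( \frac{\hbar^2}{2m}\nabla^2 + E_{\vec{k}} \right) e^{i\vec{p}\cdot\vec{r}}\, G(\vec{p})$$
$$= \int \frac{d^3\vec{p}}{(2\pi)^3} \left( -\frac{\hbar^2 p^2}{2m} + \frac{\hbar^2 k^2}{2m} \right) e^{i\vec{p}\cdot\vec{r}}\, G(\vec{p}) = \int \frac{d^3\vec{p}}{(2\pi)^3}\, e^{i\vec{p}\cdot\vec{r}} \tag{13.13}$$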
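Therefore, we obtain
$$G(\vec{r}, k) = \int \frac{d^3\vec{p}}{(2\pi)^3}\, e^{i\vec{p}\cdot\vec{r}}\, \frac{1}{\frac{\hbar^2}{2m}(k^2 - p^2)}$$
$$= -\frac{2m}{8\pi^3\hbar^2} \int_0^{2\pi} d\phi \int_{-1}^{1} d\cos\theta \int_0^{\infty} p^2\, dp\, \frac{e^{ipr\cos\theta}}{p^2 - k^2}$$
$$= -\frac{m}{2\pi^2\hbar^2} \frac{1}{ir} \int_0^{\infty} \frac{p\, dp}{p^2 - k^2} \int_{-ipr}^{ipr} dx\, e^x$$
$$= -\frac{m}{2\pi^2\hbar^2} \frac{1}{ir} \int_0^{\infty} \frac{p}{p^2 - k^2} \left( e^{ipr} - e^{-ipr} \right) dp$$
$$= -\frac{m}{2\pi^2\hbar^2} \frac{1}{ir} \int_{-\infty}^{\infty} \frac{p\, e^{ipr}}{p^2 - k^2}\, dp \tag{13.15}$$
then we have
$$\int_{-\infty}^{\infty} \frac{p\, e^{ipr}}{p^2 - k^2}\, dp + \int_{\text{semicircle}} \frac{z\, e^{izr}}{z^2 - k^2}\, dz = 2\pi i\, \text{Residue}(+k)$$
Now
$$\int_{\text{semicircle}} \frac{z\, e^{izr}}{z^2 - k^2}\, dz \to 0 \text{ as the radius of the semicircle} \to \infty \tag{13.16}$$
due to the $e^{ipr}$ term which behaves like $e^{-\text{Imag}(p)\, r} \to 0$ if Imag(p) > 0 as it
does on contour C.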
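The residue at +k is
$$\frac{k\, e^{ikr}}{2k} = \frac{e^{ikr}}{2} \tag{13.17}$$
Therefore,
$$G(\vec{r}, k) = -\frac{m}{2\pi^2\hbar^2} \frac{1}{ir}\, 2\pi i\, \frac{e^{ikr}}{2} = -\frac{m}{2\pi\hbar^2} \frac{e^{ikr}}{r} \tag{13.18}$$
This physically represents an outgoing spherical wave as is required by the
boundary conditions of the scattering problem.
$$\psi_{\vec{k}}(\vec{r}) = e^{i\vec{k}\cdot\vec{r}} - \frac{m}{2\pi\hbar^2} \int d^3\vec{r}\,'\, \frac{e^{ik|\vec{r} - \vec{r}\,'|}}{|\vec{r} - \vec{r}\,'|}\, V(\vec{r}\,')\, \psi_{\vec{k}}(\vec{r}\,') \tag{13.19}$$
Note that since the time dependence of $\psi_{\vec{k}}(\vec{r}\,')$ is $e^{-iE_{\vec{k}}t/\hbar}$, we are adding up
terms of the form
$$\frac{e^{i\left(k|\vec{r} - \vec{r}\,'| - \frac{E_{\vec{k}}}{\hbar}t\right)}}{|\vec{r} - \vec{r}\,'|} \tag{13.20}$$
which are outgoing waves with the appropriate inverse square relationship built
in!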
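Since we are assuming that the detectors are far away from the target, we can
look at solutions where $|\vec{r}| \gg |\vec{r}\,'|$. In this case, we have
$$k|\vec{r} - \vec{r}\,'| = k\sqrt{(\vec{r} - \vec{r}\,')\cdot(\vec{r} - \vec{r}\,')} = k\left[ r^2 + r'^2 - 2\vec{r}\cdot\vec{r}\,' \right]^{1/2}$$
$$\approx k\left[ r^2 - 2\vec{r}\cdot\vec{r}\,' \right]^{1/2} = kr\left[ 1 - 2\frac{\vec{r}}{r^2}\cdot\vec{r}\,' \right]^{1/2}$$
$$\approx kr\left[ 1 - \frac{\vec{r}}{r^2}\cdot\vec{r}\,' \right] = kr - \vec{k}\,'\cdot\vec{r}\,' \tag{13.21}$$
where
$$\vec{k}\,' = k\hat{r} = k\frac{\vec{r}}{r} = \text{wave vector seen at the detector} \tag{13.22}$$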
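The quality of the far-field approximation (13.21) is easy to check numerically. The sketch below is illustrative only; the function names and the sample vectors are assumptions. It compares the exact phase $k|\vec{r} - \vec{r}\,'|$ with $kr - \vec{k}\,'\cdot\vec{r}\,'$, where $\vec{k}\,' = k\hat{r}$.

```python
import math

def exact_phase(k, r_vec, rp_vec):
    """k |r - r'| without approximation."""
    return k * math.dist(r_vec, rp_vec)

def far_field_phase(k, r_vec, rp_vec):
    """k r - k' . r' with k' = k r_hat, eq. (13.21)."""
    r = math.hypot(*r_vec)
    k_out = [k * x / r for x in r_vec]  # wave vector seen at the detector
    return k * r - sum(a * b for a, b in zip(k_out, rp_vec))

k = 2.0
detector = (0.0, 0.0, 1000.0)   # far from the target
source = (0.3, -0.2, 0.5)       # a point inside the target region
print(exact_phase(k, detector, source), far_field_phase(k, detector, source))
```

The discrepancy is of order $k r'^2 / r$, so it shrinks as the detector is moved farther away.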
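Therefore, at the detector we have
$$\psi_{\vec{k}}(\vec{r}) = e^{i\vec{k}\cdot\vec{r}} + \frac{e^{ikr}}{r} f_{\vec{k}}(\theta,\phi) \tag{13.23}$$
where $(\theta,\phi)$ = direction of $\vec{r}$ and
$$f_{\vec{k}}(\theta,\phi) = -\frac{m}{2\pi\hbar^2} \int d^3\vec{r}\,'\, e^{-i\vec{k}\,'\cdot\vec{r}\,'}\, V(\vec{r}\,')\, \psi_{\vec{k}}(\vec{r}\,') \tag{13.24}$$
= the scattering amplitude
The term
$$\frac{e^{ikr}}{r} f_{\vec{k}}(\theta,\phi) \tag{13.25}$$
implies
$$\left( \begin{array}{c} \text{amplitude that particle will reach } r \\ \text{after being scattered at the target:} \\ \text{outgoing wave + inverse } r\text{-squared effect} \end{array} \right) \times \left( \begin{array}{c} \text{amplitude that incident particle} \\ \text{will be scattered with } \vec{k} \\ \text{in the direction } (\theta,\phi) \end{array} \right) \tag{13.26}$$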
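Using the exact solution for $\psi_{\vec{k}}(\vec{r})$
$$\psi_{\vec{k}}(\vec{r}) + \frac{m}{2\pi\hbar^2} \int d^3\vec{r}\,'\, \frac{e^{ik|\vec{r} - \vec{r}\,'|}}{|\vec{r} - \vec{r}\,'|}\, V(\vec{r}\,')\, \psi_{\vec{k}}(\vec{r}\,') = e^{i\vec{k}\cdot\vec{r}} \tag{13.27}$$
we can substitute for $e^{i\vec{k}\cdot\vec{r}}$ to get
$$\psi(\vec{r}, t_0) = \int \frac{d^3\vec{k}}{(2\pi)^3} \left[ \psi_{\vec{k}}(\vec{r}) + \frac{m}{2\pi\hbar^2} \int d^3\vec{r}\,'\, \frac{e^{ik|\vec{r} - \vec{r}\,'|}}{|\vec{r} - \vec{r}\,'|}\, V(\vec{r}\,')\, \psi_{\vec{k}}(\vec{r}\,') \right] a_{\vec{k}} \tag{13.28}$$
Now we assumed that $a_{\vec{k}}$ has a maximum near $\vec{k}_0$, which implies that inside the
brackets [....] we can write $\psi_{\vec{k}}(\vec{r}\,') \approx \psi_{\vec{k}_0}(\vec{r}\,')$, and since $k_0 \gg |\vec{k} - \vec{k}_0|$ we get
$$k = \frac{\vec{k}\cdot\vec{k}}{k} \approx \frac{\vec{k}_0\cdot\vec{k}}{k_0} = \hat{k}_0\cdot\vec{k} \tag{13.29}$$
Therefore, the last term becomes
$$\int \frac{d^3\vec{k}}{(2\pi)^3}\, a_{\vec{k}}\, e^{i\vec{k}\cdot\hat{k}_0 |\vec{r} - \vec{r}\,'|}\, \psi_{\vec{k}_0}(\vec{r}\,') = \psi(\hat{k}_0 |\vec{r} - \vec{r}\,'|, t_0)\, \psi_{\vec{k}_0}(\vec{r}\,') = 0 \tag{13.30}$$
This is zero because at $t = t_0$, $\psi(\hat{k}_0 |\vec{r} - \vec{r}\,'|, t_0) = 0$ since $\hat{k}_0 |\vec{r} - \vec{r}\,'|$ is to the
right of the potential.
Therefore,
$$\psi(\vec{r}, t_0) = \int \frac{d^3\vec{k}}{(2\pi)^3}\, \psi_{\vec{k}}(\vec{r})\, a_{\vec{k}} \tag{13.31}$$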
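which implies $a_{\vec{k}} = b_{\vec{k}}$, i.e., the expansion coefficients are the same whether we
expand in plane waves or the exact energy eigenstates.
$$\psi(\vec{r}, t) = \int \frac{d^3\vec{k}}{(2\pi)^3}\, \psi_{\vec{k}}(\vec{r})\, a_{\vec{k}}\, e^{-i\frac{E_{\vec{k}}}{\hbar}(t - t_0)} \tag{13.32}$$
For $\vec{r}$ far from the target, we can use the asymptotic form for $\psi_{\vec{k}}(\vec{r})$ to get
$$\psi(\vec{r}, t) = \psi_0(\vec{r}, t) + \int \frac{d^3\vec{k}}{(2\pi)^3}\, a_{\vec{k}}\, \frac{e^{i\left(kr - \frac{E_{\vec{k}}}{\hbar}(t - t_0)\right)}}{r}\, f_{\vec{k}}(\theta,\phi) \tag{13.33}$$
where
$$\psi_0(\vec{r}, t) = \int \frac{d^3\vec{k}}{(2\pi)^3}\, a_{\vec{k}}\, e^{i\left(\vec{k}\cdot\vec{r} - \frac{E_{\vec{k}}}{\hbar}(t - t_0)\right)} \tag{13.34}$$
= wave packet at time t if V = 0
we get
$$\psi(\vec{r}, t) = \psi_0(\vec{r}, t) + \frac{f_{\vec{k}_0}(\theta,\phi)}{r}\, \psi_0(\hat{k}_0 r, t) \tag{13.36}$$
The meaning of these terms is as follows:
1. $\psi(\vec{r}, t)$ = (no scattering term) + (scattered wave)
2. the scattered term includes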
13.1.2 Cross Sections
The quantity used to connect theory to experiment in scattering experiments is
the differential scattering cross section.
It is defined as follows: if
then
$$\frac{d\sigma}{d\Omega} = \frac{dN(\theta,\phi)}{N_{in}\, d\Omega} \tag{13.37}$$
In terms of a single particle, $d\sigma/d\Omega$ is the total probability that the particle is
scattered into a unit solid angle divided by the total probability that the particle
crosses a unit area in front of the target.
The same assumptions are made as earlier, i.e., the incident packet size $\gg$ size
of the target, and the detector is far from the incident beam and the target.
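Now the total probability of being scattered into an infinitesimal solid angle
$d\Omega$ at $\vec{r}$ is the rate that probability strikes an area $r^2 d\Omega$ in the detector plane
integrated over time.
This is given by
$$\int dt\, (\text{velocity})\, (\text{area at detector})\, (\text{probability of scattering})$$
$$= \int dt \left( \frac{\hbar k_0}{m} \right) (r^2 d\Omega)\, \frac{\left| f_{\vec{k}_0}(\theta,\phi) \right|^2}{r^2} \left| \psi_0(\hat{k}_0 r, t) \right|^2 \tag{13.38}$$
$$= \frac{\hbar k_0}{m} \left| f_{\vec{k}_0}(\theta,\phi) \right|^2 d\Omega \int dt \left| \psi_0(\hat{k}_0 r, t) \right|^2 \tag{13.39}$$
The total probability that crosses a unit area at $\vec{r}_0$ in front of the target in the
incident beam is
$$\int dt\, (\text{probability flux}) = \frac{\hbar k_0}{m} \int dt \left| \psi_0(\hat{k}_0 r, t) \right|^2 \tag{13.40}$$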
If we assume that the wave packet does not spread between $\vec{r}_0$ and $\hat{k}_0 r$ (this
just says a particle remains a particle during the duration of the experiment),
then we have
$$\frac{d\sigma}{d\Omega} = \text{ratio of these last two terms} = \left| f_{\vec{k}_0}(\theta,\phi) \right|^2 \tag{13.41}$$
This result does not depend on the details of the incident wave packet.
Since experimentalists can measure $\sigma$ and $d\sigma/d\Omega$, the theorist needs to be able
to calculate them given the potential function.
This means we should only consider stationary state solutions that are also
eigenstates of the angular momentum.
In this expansion
$$P_\ell(\cos\theta) = \sqrt{\frac{4\pi}{2\ell+1}}\, Y_{\ell 0}(\theta,\phi) \tag{13.46}$$
= Legendre polynomial of order $\ell$
and
$$j_\ell(kr) = \frac{1}{2}\left( h^*_\ell(kr) + h_\ell(kr) \right) \tag{13.47}$$
where the $h_\ell(kr)$ are Hankel functions.
Since the incident plane wave is independent of $\phi$ (rotation about the direction
of $\vec{k}$), the scattered wave must be invariant under rotations about the direction
of $\vec{k}$, which implies that
$$f_{\vec{k}}(\theta,\phi) \to f_{\vec{k}}(\theta) \tag{13.48}$$
For functions of $\theta$, the Legendre polynomials are a complete set so we can expand
$f_{\vec{k}}(\theta)$ as
$$f_{\vec{k}}(\theta) = \sum_{\ell=0}^{\infty} (2\ell+1)\, f_\ell\, P_\ell(\cos\theta) \tag{13.49}$$
Each $\ell$ term is called a partial wave and $f_\ell$ is the partial wave scattering amplitude.
For similar reasons, the stationary state wave functions can be written as
$$\psi_{\vec{k}}(\vec{r}) = \sum_{\ell=0}^{\infty} i^\ell (2\ell+1)\, P_\ell(\cos\theta)\, R_\ell(r) \tag{13.50}$$
where
$$\left[ \frac{d^2}{dr^2} - \frac{\ell(\ell+1)}{r^2} + k^2 \right] rR_\ell(r) = \frac{2m}{\hbar^2} V(r)\, rR_\ell(r) \tag{13.51}$$
In the limit of large r (where the detector is located), assuming that $r^2 V(r) \to 0$,
we get
$$\left[ \frac{d^2}{dr^2} + k^2 - \frac{\ell(\ell+1)}{r^2} \right] rR_\ell(r) = 0 \tag{13.52}$$
which is Bessel's equation.
The $h^*_\ell(kr)$ term represents an incoming spherical wave and the $h_\ell(kr)$ term
represents an outgoing spherical wave. We have therefore assumed that the
effect of the potential will be to modify only the outgoing wave.
In the absence of scattering, i.e., V = 0, we have
$$R_\ell(r) = j_\ell(kr) = \frac{1}{2}\left[ h^*_\ell(kr) + h_\ell(kr) \right] \tag{13.54}$$
which implies that
$$B_\ell = \frac{1}{2} \quad\text{and}\quad S_\ell = 1 \tag{13.55}$$
When $V(r) \neq 0$ only $S_\ell$ changes.
In the Schrodinger picture, we can talk about probability densities and probability
flows or currents. In non-relativistic quantum mechanics we have a probability
interpretation for the wave function which implies that
$$\text{probability density} = P(\vec{r}, t) = |\psi(\vec{r}, t)|^2 \tag{13.56}$$
and
$$\int |\psi(\vec{r}, t)|^2\, d^3\vec{r} = 1 \tag{13.57}$$
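Substitution of (13.53) implies that we must have
$\delta_\ell$ is called the phase shift and $2\delta_\ell$ equals the difference in phase between the
outgoing parts of the wave function $\psi_{\vec{k}}(\vec{r})$ and the incident plane wave $e^{i\vec{k}\cdot\vec{r}}$.
The phase shift contains all of the effects of the potential on the wave
function.
Since the $\delta_\ell(E)$ are constants for a given scattering process (since E = constant),
we can determine their values at any point in the scattering process.
$$\psi_{\vec{k}}(\vec{r}) = \frac{1}{2}\sum_\ell i^\ell (2\ell+1) P_\ell(\cos\theta) \left[ h^*_\ell(kr) + e^{2i\delta_\ell} h_\ell(kr) \right]$$
$$= \frac{1}{2}\sum_\ell i^\ell (2\ell+1) P_\ell(\cos\theta) \left[ h^*_\ell(kr) + h_\ell(kr) + (e^{2i\delta_\ell} - 1) h_\ell(kr) \right]$$
$$= \frac{1}{2}\sum_\ell i^\ell (2\ell+1) P_\ell(\cos\theta) \left[ h^*_\ell(kr) + h_\ell(kr) \right] + \frac{1}{2}\sum_\ell i^\ell (2\ell+1) P_\ell(\cos\theta) (e^{2i\delta_\ell} - 1) h_\ell(kr)$$
$$= \sum_\ell i^\ell (2\ell+1) P_\ell(\cos\theta) j_\ell(kr) + \frac{1}{2}\sum_\ell i^\ell (2\ell+1) P_\ell(\cos\theta) (e^{2i\delta_\ell} - 1) h_\ell(kr)$$
$$= e^{i\vec{k}\cdot\vec{r}} + \frac{1}{2}\sum_\ell i^\ell (2\ell+1) P_\ell(\cos\theta) \left( e^{2i\delta_\ell} - 1 \right) h_\ell(kr) \tag{13.64}$$
$$\psi_{\vec{k}}(\vec{r}) = e^{i\vec{k}\cdot\vec{r}} + \frac{f(\theta)}{r} e^{ikr}$$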
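we get
$$f(\theta) = \frac{1}{2ik} \sum_\ell (2\ell+1) P_\ell(\cos\theta) \left( e^{2i\delta_\ell} - 1 \right) \tag{13.66}$$
$$= \frac{1}{k} \sum_\ell (2\ell+1) P_\ell(\cos\theta)\, e^{i\delta_\ell} \sin\delta_\ell \tag{13.67}$$
where
$$f_\ell = \frac{e^{2i\delta_\ell} - 1}{2ik} = \frac{e^{i\delta_\ell} \sin\delta_\ell}{k} \tag{13.68}$$
= the partial wave scattering amplitude
We obtain the total cross section using
$$\sigma = \int |f(\theta)|^2\, d\Omega \tag{13.69}$$
$$= \sum_\ell \sum_{\ell'} \frac{(2\ell+1)}{k} \frac{(2\ell'+1)}{k}\, e^{i\delta_\ell} e^{-i\delta_{\ell'}} \sin\delta_\ell \sin\delta_{\ell'} \int_0^{2\pi} d\phi \int_{-1}^{1} dx\, P_\ell(x) P_{\ell'}(x)$$
Now
$$\int_{-1}^{1} dx\, P_\ell(x) P_{\ell'}(x) = \frac{2\,\delta_{\ell\ell'}}{2\ell+1} \tag{13.70}$$
where
$$\sigma_\ell = \frac{4\pi}{k^2} (2\ell+1) \sin^2\delta_\ell \tag{13.72}$$
is the partial wave cross section for scattering particles in angular momentum
$\ell$-states.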
acts. For energy
$$E = \frac{\hbar^2 k^2}{2m} \tag{13.76}$$
the classical turning radius is
$$r_{cl} = \frac{\sqrt{\ell(\ell+1)}}{k} \tag{13.77}$$
For $r < r_{cl}$, the wave function falls off exponentially. If $r_{cl} > a$, then the particle
feels nothing from the potential, which says that the particle is scattered only
if
$$r_{cl} \leq a \quad\text{or}\quad \sqrt{\ell(\ell+1)} \approx \ell \leq ka \tag{13.78}$$
Note that there are interference effects between different partial waves in $d\sigma/d\Omega$
but not in $\sigma$.
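$$\sigma = \frac{4\pi}{k^2} \sum_\ell (2\ell+1) \sin^2\delta_\ell \tag{13.80}$$
$$\text{Imag}(f(\theta)) = \frac{1}{k} \sum_\ell (2\ell+1) P_\ell(\cos\theta) \sin^2\delta_\ell \tag{13.81}$$
$$\text{Imag}(f(0)) = \frac{1}{k} \sum_\ell (2\ell+1) P_\ell(1) \sin^2\delta_\ell = \frac{1}{k} \sum_\ell (2\ell+1) \sin^2\delta_\ell \tag{13.82}$$
or
$$\sigma = \frac{4\pi}{k}\, \text{Imag}(f(0)) \tag{13.83}$$
which is the optical theorem.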
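The optical theorem can be verified for any finite set of phase shifts. The sketch below is a minimal check, with hypothetical function names and arbitrary sample phase shifts; it builds $f(\theta)$ from the partial wave sum (13.67) and compares $\sigma$ from (13.80) with $(4\pi/k)\,\text{Imag}(f(0))$.

```python
import cmath
import math

def legendre(l, x):
    """P_l(x) from the recursion (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def amplitude(theta, k, deltas):
    """f(theta) = (1/k) sum_l (2l+1) P_l(cos theta) e^{i delta_l} sin delta_l."""
    x = math.cos(theta)
    return sum((2 * l + 1) * legendre(l, x) * cmath.exp(1j * d) * math.sin(d)
               for l, d in enumerate(deltas)) / k

def total_cross_section(k, deltas):
    """sigma = (4 pi / k^2) sum_l (2l+1) sin^2 delta_l."""
    return 4 * math.pi / k**2 * sum((2 * l + 1) * math.sin(d) ** 2
                                    for l, d in enumerate(deltas))

k = 1.3
deltas = [0.7, -0.2, 0.05]   # sample phase shifts for l = 0, 1, 2
sigma = total_cross_section(k, deltas)
print(sigma, 4 * math.pi / k * amplitude(0.0, k, deltas).imag)
```

The two printed numbers agree to rounding error; integrating $|f(\theta)|^2$ over solid angle numerically gives the same value, which also confirms that the cross terms between different partial waves drop out of $\sigma$.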
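We have the general relations
$$f_{\vec{k}}(\theta,\phi) = -\frac{m}{2\pi\hbar^2} \int d^3\vec{r}\,'\, e^{-i\vec{k}\,'\cdot\vec{r}\,'}\, V(\vec{r}\,')\, \psi_{\vec{k}}(\vec{r}\,') \tag{13.84}$$
$$\psi_{\vec{k}}(\vec{r}) = \sum_{\ell=0}^{\infty} i^\ell (2\ell+1)\, P_\ell(\cos\theta)\, R_\ell(r) \tag{13.85}$$
we get
$$f_{\vec{k}}(\theta,\phi) = -\frac{2m}{\hbar^2} \int d^3\vec{r}\,' \sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{\ell} (-i)^\ell\, Y_{\ell m}(\hat{k}')\, j_\ell(kr')\, Y^*_{\ell m}(\hat{r}')\, V(r') \sum_{\ell'=0}^{\infty} i^{\ell'} (2\ell'+1)\, P_{\ell'}(\cos\theta_{\vec{r}\,'})\, R_{\ell'}(r') \tag{13.87}$$
Now
$$\int d^3\vec{r}\,' \to \int_0^{2\pi} d\phi_{\vec{r}\,'} \int_{-1}^{1} d(\cos\theta_{\vec{r}\,'}) \int_0^{\infty} r'^2\, dr' \tag{13.88}$$
and
$\phi_{\vec{r}\,'}$ integration $\to m = 0$
$\theta_{\vec{r}\,'}$ integration $\to \ell = \ell'$
$$f(\theta) = -\frac{2m}{\hbar^2} \sum_{\ell=0}^{\infty} (2\ell+1)\, P_\ell(\cos\theta) \int_0^{\infty} r^2\, dr\, V(r)\, j_\ell(kr)\, R_\ell(r) \tag{13.89}$$
$$f(\theta) = \frac{1}{k} \sum_\ell (2\ell+1)\, P_\ell(\cos\theta)\, e^{i\delta_\ell} \sin\delta_\ell \tag{13.90}$$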
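If we suppose that the potential is weak for a particular partial wave, then we
have very small scattering effects on $R_\ell(r)$. We then have
and
$$e^{i\delta_\ell} \sin\delta_\ell \approx \delta_\ell \tag{13.93}$$
or
$$\delta_\ell = -\frac{2mk}{\hbar^2} \int_0^{\infty} r^2\, dr\, V(r)\, j_\ell^2(kr) \tag{13.94}$$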
When the potential has a small effect on all partial waves we can derive another
form of the Born approximation. In this case, we assume the scattered wave is
small compared to the incident wave. This implies that
$$\psi_{\vec{k}}(\vec{r}) \approx e^{i\vec{k}\cdot\vec{r}} \tag{13.95}$$
and
$$f(\theta) = -\frac{m}{2\pi\hbar^2} \int d^3\vec{r}\,'\, e^{i(\vec{k} - \vec{k}\,')\cdot\vec{r}\,'}\, V(\vec{r}\,') \tag{13.96}$$
$$= -\frac{m}{2\pi\hbar^2}\, V_{\vec{k}\,'\vec{k}} \tag{13.97}$$
where
$$V_{\vec{k}\,'\vec{k}} = \text{Fourier transform of } V(\vec{r}) \tag{13.98}$$
Figure 13.3: Elastic Scattering Angles
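For short range potentials, the wave function $\psi_{\vec{k}}(\vec{r})$ only contributes to the
scattered wave at small distances. We can therefore use a small distance approximation
for $\psi_{\vec{k}}(\vec{r})$
$$\psi_{\vec{k}}(\vec{r}) \approx e^{i\vec{k}\cdot\vec{r}} - \frac{m}{2\pi\hbar^2} \int d^3\vec{r}\,'\, \frac{e^{ikr'}}{r'}\, V(r')\, e^{i\vec{k}\cdot\vec{r}\,'} \approx e^{i\vec{k}\cdot\vec{r}} \tag{13.103}$$
only if
$$\frac{m}{2\pi\hbar^2} \left| \int d^3\vec{r}\,'\, \frac{e^{ikr'}}{r'}\, V(r')\, e^{i\vec{k}\cdot\vec{r}\,'} \right| = \frac{2m}{\hbar^2} \left| \int_0^{\infty} r'^2\, dr'\, \frac{e^{ikr'}}{r'}\, V(r')\, \frac{\sin kr'}{kr'} \right| \ll 1$$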
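At r = b, both the wave function and its derivative must be continuous, which
implies that the logarithmic derivative is continuous
$$\gamma_\ell = \frac{\left. \dfrac{d\psi^<_\ell(r)}{dr} \right|_{r=b}}{\psi^<_\ell(b)} = \frac{\left. \dfrac{d\psi^>_\ell(r)}{dr} \right|_{r=b}}{\psi^>_\ell(b)} \tag{13.106}$$
or using our earlier expression for $\psi^>_\ell(r)$ we have
$$\gamma_\ell = \left( \frac{\dfrac{\partial}{\partial r}\left[ h^*_\ell(kr) + e^{2i\delta_\ell} h_\ell(kr) \right]}{h^*_\ell(kr) + e^{2i\delta_\ell} h_\ell(kr)} \right)_{r=b} \tag{13.107}$$
or
$$\cot\delta_\ell = \left( \frac{\dfrac{d\eta_\ell(kr)}{dr} - \gamma_\ell\, \eta_\ell(kr)}{\dfrac{dj_\ell(kr)}{dr} - \gamma_\ell\, j_\ell(kr)} \right)_{r=b} \tag{13.109}$$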
13.1.7 Examples
Scattering from a Hard Sphere
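In this case we have
$$V(r) = \begin{cases} 0 & r > b \\ \infty & r \leq b \end{cases} \tag{13.110}$$
This is a repulsive potential. It implies that
and
$$\gamma_\ell = \infty \tag{13.112}$$
This gives
$$\cot\delta_\ell = \frac{\eta_\ell(kb)}{j_\ell(kb)} \tag{13.113}$$
For $\ell = 0$ we have
$$\cot\delta_0 = \frac{\eta_0(kb)}{j_0(kb)} = -\frac{\cos kb}{\sin kb} = -\cot kb \tag{13.114}$$
or
$$\delta_0 = -kb \tag{13.115}$$
therefore, for a repulsive potential $\delta_0 < 0$.
What about $\delta_\ell$? Consider low energy or small k. We then have for $kb \ll 1$,
$$\cot\delta_\ell = \frac{\eta_\ell(kb)}{j_\ell(kb)} \approx -\frac{\dfrac{(2\ell-1)!!}{(kb)^{\ell+1}}}{\dfrac{(kb)^\ell}{(2\ell+1)!!}} = -(kb)^{-2\ell-1}\, (2\ell-1)!!\,(2\ell+1)!! \tag{13.116}$$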
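where
$$(2\ell+1)!! = 1 \cdot 3 \cdot 5 \cdots (2\ell+1) \tag{13.117}$$
Therefore, we have
$$\sin\delta_\ell \sim (kb)^{2\ell+1} \tag{13.118}$$
Thus, we can neglect all terms where $\ell \neq 0$ in the low energy limit and write
$$f(\theta) = \frac{1}{k} P_0(\cos\theta)\, e^{i\delta_0} \sin\delta_0 = \frac{1}{k}\, e^{i\delta_0} \sin\delta_0 \tag{13.119}$$
$$\frac{d\sigma}{d\Omega} = |f(\theta)|^2 = \frac{\sin^2\delta_0}{k^2} \approx b^2 \tag{13.120}$$
$$\sigma = 4\pi \frac{\sin^2\delta_0}{k^2} \approx 4\pi b^2 \tag{13.121}$$
These results for the hard sphere show that, for low energy, the total cross
section is
$$4 \times (\text{classical scattering cross section}) \tag{13.122}$$
due to quantum interference effects and the scattering is isotropic.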
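A short numerical check of the low-energy hard-sphere results follows. This is a sketch under stated assumptions: the names `j1`, `eta1` and `hard_sphere` are hypothetical, the spherical Bessel and Neumann functions for $\ell = 1$ are written out explicitly with the sign convention $\eta_0(\rho) = -\cos\rho/\rho$, and the phase shifts come from $\cot\delta_\ell = \eta_\ell(kb)/j_\ell(kb)$.

```python
import math

def j1(rho):
    """Spherical Bessel function j_1."""
    return math.sin(rho) / rho**2 - math.cos(rho) / rho

def eta1(rho):
    """Spherical Neumann function for l = 1 (eta_0 = -cos(rho)/rho convention)."""
    return -math.cos(rho) / rho**2 - math.sin(rho) / rho

def hard_sphere(k, b):
    """s- and p-wave phase shifts and partial cross sections for a hard sphere."""
    delta0 = -k * b                                  # eq. (13.115)
    delta1 = math.atan(j1(k * b) / eta1(k * b))      # from cot(delta_1) = eta_1/j_1
    sigma0 = 4 * math.pi / k**2 * math.sin(delta0) ** 2
    sigma1 = 4 * math.pi / k**2 * 3 * math.sin(delta1) ** 2
    return delta0, delta1, sigma0, sigma1

k, b = 1.0e-3, 2.0     # kb << 1
delta0, delta1, sigma0, sigma1 = hard_sphere(k, b)
print(sigma0 / (math.pi * b**2), sigma1 / sigma0, delta1 / (k * b) ** 3)
```

For $kb \ll 1$ the first ratio approaches 4, the p-wave contribution is negligible, and $\delta_1 \approx -(kb)^3/3$, consistent with the $(kb)^{2\ell+1}$ scaling.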
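Now
$$j_0(\rho) = \frac{\sin\rho}{\rho}, \qquad j_1(\rho) = \frac{\sin\rho}{\rho^2} - \frac{\cos\rho}{\rho} \tag{13.127}$$
$$\eta_0(\rho) = -\frac{\cos\rho}{\rho}, \qquad \eta_1(\rho) = -\frac{\cos\rho}{\rho^2} - \frac{\sin\rho}{\rho} \tag{13.128}$$
Therefore,
$$\gamma_0 = q\cot qb - \frac{1}{b}$$
$$\gamma_0 b = qb\cot qb - 1$$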
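which gives
$$\cot\delta_0 = \left( \frac{\dfrac{d\eta_0(kr)}{dr} - \gamma_0\, \eta_0(kr)}{\dfrac{dj_0(kr)}{dr} - \gamma_0\, j_0(kr)} \right)_{r=b} = \frac{kb\sin kb + qb\cot qb\,\cos kb}{kb\cos kb - qb\cot qb\,\sin kb} \tag{13.129}$$
$$\tan\delta_1 = \frac{(kb)^3}{3}\, \frac{1 - \gamma_1 b}{2 + \gamma_1 b} \tag{13.132}$$
$$1 + \gamma_0 b = 0 \quad\text{or}\quad 2 + \gamma_1 b = 0 \tag{13.134}$$
$$\ell + 1 + \gamma_\ell b = 0 \tag{13.135}$$
and thus the partial wave scattering cross section takes on its maximum value,
namely,
$$\sigma_\ell(k) = \frac{4\pi(2\ell+1)}{k^2} \tag{13.137}$$
For other k values (off-resonance) we have
$$\delta_\ell \sim (kb)^{2\ell+1} \tag{13.138}$$
This says that the resonance is very sharp for large $\ell$ values.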
We can understand this effect better by looking at other features. As we saw,
for small k, $\cot\delta_\ell$ is very large since it is proportional to $k^{-2\ell-1}$. Therefore,
$$\sigma_\ell = \frac{4\pi}{k^2}(2\ell+1)\sin^2\delta_\ell = \frac{4\pi}{k^2}(2\ell+1)\frac{1}{1 + \cot^2\delta_\ell} \tag{13.139}$$
will be proportional to $k^{4\ell}$. In this case, except for the s-wave ($\ell = 0$), $\sigma_\ell$ will
be small.
and
` b` (Er ) 2` + 1 (13.142)
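After some algebra, these relations imply that
$$\cot\delta_\ell \approx -\frac{2(E - E_r)}{\Gamma_k} \tag{13.143}$$
where
$$\Gamma_k = -\frac{2k^{2\ell+1} b^{2\ell}}{\left[ (2\ell-1)!! \right]^2 \left. \dfrac{\partial\gamma_\ell}{\partial E} \right|_{E=E_r}} \tag{13.144}$$
Now $\partial\gamma_\ell/\partial E < 0$, which implies that $\Gamma_k > 0$. Therefore, near resonance, the
partial wave cross section $\sigma_\ell$ is
$$\sigma_\ell = \frac{4\pi}{k^2}(2\ell+1)\, \frac{\Gamma_k^2}{4(E - E_r)^2 + \Gamma_k^2} \tag{13.145}$$
When
$$E - E_r = \pm\frac{\Gamma_k}{2} \tag{13.146}$$
the partial wave cross section is equal to one-half its maximum value. Therefore,
$\Gamma = \Gamma_k(E_r)$ is the width of the resonance at half-maximum.
This functional form is called the Breit-Wigner formula for a scattering resonance.
The curves are shown in Figure 13.4 below.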
1054
Figure 13.4: ` versus E ; ` versus E
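The half-maximum statement is easy to verify numerically. The sketch below is illustrative only: the function names and the sample values of $E_r$, $\Gamma$, $k$ and $\ell$ are our own choices, and the peak height is taken to be the maximum value of Eq. (13.137). The phase shift is chosen on the branch $0 < \delta_\ell < \pi$, so it rises through $\pi/2$ at $E = E_r$.

```python
import math

def breit_wigner(E, E_r, gamma, k, l):
    """Partial-wave cross section near a resonance (Breit-Wigner form)."""
    peak = 4 * math.pi * (2 * l + 1) / k**2          # maximum value, Eq. (13.137)
    return peak * gamma**2 / (4 * (E - E_r)**2 + gamma**2)

def resonant_phase_shift(E, E_r, gamma):
    """delta_l in (0, pi) satisfying cot(delta_l) = -2 (E - E_r) / gamma."""
    return math.atan2(gamma, -2 * (E - E_r))

if __name__ == "__main__":
    E_r, gamma, k, l = 1.0, 0.1, 2.0, 1              # illustrative values
    for E in (0.9, 0.95, 1.0, 1.05, 1.1):
        print(E, breit_wigner(E, E_r, gamma, k, l), resonant_phase_shift(E, E_r, gamma))
```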
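$$V(\vec k - \vec k') = \frac{4\pi a}{|\vec k - \vec k'|^2 + \mu^2} \tag{13.148}$$
where
$$|\vec k - \vec k'|^2 = \left(2k\sin\frac{\theta}{2}\right)^2 \tag{13.149}$$
This gives
$$\frac{d\sigma}{d\Omega} = \frac{m^2}{(2\pi\hbar^2)^2}\left|V(\vec k-\vec k')\right|^2 = \frac{m^2}{(2\pi\hbar^2)^2}\left(\frac{4\pi a}{|\vec k-\vec k'|^2+\mu^2}\right)^2 = \frac{a^2}{\left(4E_k\sin^2\frac{\theta}{2} + \frac{\hbar^2\mu^2}{2m}\right)^2} \tag{13.150}$$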
This is an accident and higher order terms will completely change this result.
The higher order terms that we have neglected in the Born approximation can-
not be ignored for a long-range potential like the Coulomb potential.
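To see how the screened Born result behaves, the following sketch evaluates Eq. (13.150) and its unscreened limit. It is a minimal illustration: the function names are ours, and the default units $\hbar = m = 1$ are an assumption made only for convenience.

```python
import math

def yukawa_born(theta, E_k, a, mu, m=1.0, hbar=1.0):
    """Born differential cross section of Eq. (13.150) for a screened potential."""
    denom = 4 * E_k * math.sin(theta / 2)**2 + hbar**2 * mu**2 / (2 * m)
    return a**2 / denom**2

def coulomb_limit(theta, E_k, a):
    """mu -> 0 limit of Eq. (13.150): a^2 / (16 E_k^2 sin^4(theta/2))."""
    return a**2 / (16 * E_k**2 * math.sin(theta / 2)**4)

if __name__ == "__main__":
    for mu in (1.0, 0.1, 0.0):
        print(mu, yukawa_born(1.0, 2.0, 1.0, mu), coulomb_limit(1.0, 2.0, 1.0))
```

For any nonzero screening parameter the forward cross section stays finite, whereas the unscreened limit diverges as $\theta \to 0$.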
Spin Dependent Potential
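We now consider the scattering of a particle of mass $m$ and momentum $p = \hbar k$ through an angle $\theta$ by the spin-dependent potential
$$V(\vec r) = e^{-\mu r^2}\left[A + B\,\vec\sigma\cdot\vec r\right] \tag{13.156}$$
where
$$\vec\sigma = (\sigma_x, \sigma_y, \sigma_z) \tag{13.157}$$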
We assume that the initial spin is along the incident direction and we sum
over all final spins, which corresponds to measuring only the direction of the
scattered particles and not their spins. Note that this potential violates parity.
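We use the Born approximation. The Fourier transform of the potential function is given by
$$V(\vec q) = \int d^3\vec r\; V(\vec r)\,e^{i\vec q\cdot\vec r} \tag{13.158}$$
where
$$\vec q = \vec k - \vec k' \tag{13.159}$$
We get
$$\begin{aligned}
V(\vec q) &= \int d^3\vec r\; e^{-\mu r^2}\left[A + B\,\vec\sigma\cdot\vec r\right]e^{i\vec q\cdot\vec r} \\
&= \left(\frac{\pi}{\mu}\right)^{3/2} e^{-q^2/4\mu}\left[A + \frac{iB}{2\mu}\,\vec\sigma\cdot\vec q\right] \\
&= \left(\frac{\pi}{\mu}\right)^{3/2} e^{-q^2/4\mu}\left[A + \frac{iB}{2\mu}\left(q_+\sigma_- + q_-\sigma_+ + q_z\sigma_z\right)\right]
\end{aligned} \tag{13.160}$$
where
$$q_\pm = q_x \pm iq_y \quad\text{and}\quad \sigma_\pm = \frac{1}{2}(\sigma_x \pm i\sigma_y) \tag{13.161}$$
Since the initial spin is oriented along $\vec k$, which we choose to be in the z-direction, we also quantize the final spin along the same axis (use the same representation).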
4. The term $Bq_-\sigma_+$ gives a zero contribution since the initial spin cannot be raised.
$$A + \frac{i}{2\mu}Bq_z \tag{13.162}$$
$$\frac{i}{2\mu}B(q_x + iq_y) \tag{13.163}$$
To get the total contribution to $|f(\theta)|^2$ we must square each separate term (these are not indistinguishable processes) and then sum over the final states. We get
$$\left|A + \frac{i}{2\mu}Bq_z\right|^2 + \left|\frac{i}{2\mu}B(q_x + iq_y)\right|^2 = A^2 + \frac{B^2q^2}{4\mu^2} \tag{13.164}$$
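The sum over final spin states can be checked with a few lines of complex arithmetic. This is an illustrative sketch only: the function names are ours, the common prefactor of the Fourier transform is dropped, and $A$, $B$ are assumed real. The two amplitudes are the matrix elements of $A + (iB/2\mu)\,\vec\sigma\cdot\vec q$ between an initial spin-up state and the two final spin states.

```python
def spin_amplitudes(A, B, mu, q):
    """Non-flip and flip amplitudes for an initial spin up along z."""
    qx, qy, qz = q
    c = 1j * B / (2 * mu)
    non_flip = A + c * qz            # <up| ... |up>, from the sigma_z term
    flip = c * (qx + 1j * qy)        # <down| ... |up>, from the sigma_- term
    return non_flip, flip

def summed_probability(A, B, mu, q):
    """Sum of squared magnitudes over the two final spin states."""
    non_flip, flip = spin_amplitudes(A, B, mu, q)
    return abs(non_flip)**2 + abs(flip)**2

def closed_form(A, B, mu, q):
    """A^2 + B^2 q^2 / (4 mu^2), valid for real A and B."""
    q2 = sum(component**2 for component in q)
    return A**2 + B**2 * q2 / (4 * mu**2)

if __name__ == "__main__":
    args = (2.0, 0.5, 0.7, (0.3, -0.4, 1.2))
    print(summed_probability(*args), closed_form(*args))
```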
where
$$\rho_T(\vec r') = \rho_{\text{nuclear}}(\vec r') + \rho_{\text{electronic}}(\vec r') = \delta(\vec r') - \rho(\vec r')$$
We then have
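In the last integral $\vec r'$ is a constant. Therefore we can write
$$\begin{aligned}
\int \frac{e^{i\vec q\cdot(\vec r-\vec r')}}{|\vec r-\vec r'|}\,d^3\vec r &= \int \frac{e^{i\vec q\cdot\vec x}}{x}\,d^3\vec x \\
&= \lim_{\varepsilon\to 0}\int_0^{2\pi}d\phi\int_0^{\pi}\sin\theta\,d\theta\int_0^{\infty}e^{-\varepsilon x}e^{iqx\cos\theta}\,x\,dx \\
&= 2\pi\lim_{\varepsilon\to 0}\int_0^{\infty}e^{-\varepsilon x}\,x\,dx\int_{-1}^{1}d(\cos\theta)\,e^{iqx\cos\theta} \\
&= 2\pi\lim_{\varepsilon\to 0}\int_0^{\infty}e^{-\varepsilon x}\,\frac{e^{iqx}-e^{-iqx}}{iqx}\,x\,dx \\
&= \frac{4\pi}{q}\lim_{\varepsilon\to 0}\int_0^{\infty}e^{-\varepsilon x}\sin qx\,dx \\
&= \frac{4\pi}{q}\lim_{\varepsilon\to 0}\frac{q}{\varepsilon^2+q^2} = \frac{4\pi}{q^2}
\end{aligned} \tag{13.168}$$
$$\begin{aligned}
f(\theta) &= \frac{2mZe^2}{\hbar^2q^2}\int \rho_T(\vec r')\,e^{i\vec q\cdot\vec r'}\,d^3\vec r' \\
&= \frac{2mZe^2}{\hbar^2q^2}\int \left[\delta(\vec r') - \rho(\vec r')\right]e^{i\vec q\cdot\vec r'}\,d^3\vec r' \\
&= \frac{2mZe^2}{\hbar^2q^2}\left[1 - F(q)\right]
\end{aligned} \tag{13.169}$$
where
$$F(q) = \int \rho(\vec r')\,e^{i\vec q\cdot\vec r'}\,d^3\vec r' \tag{13.170}$$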
The determination of the electronic energy levels of molecules is significantly
more complicated than in atomic systems since the potential felt by the electrons
does not exhibit many of the simplifying features of the atomic case.
On the other hand, nuclei are very massive compared to the electrons
$$\frac{m_e}{m_N} \approx 10^{-3} - 10^{-5} \tag{13.171}$$
and this can be used to simplify our work.
First, we must determine to what extent we need to take nuclear motion into account in our calculations. On the average, the nuclei move more slowly than the electrons, have a small zero point energy and are well localized. We can picture the nuclei in a molecule as having classical equilibrium positions (minimum potential energy points) about which they oscillate slowly. In the meantime, the electrons are moving rapidly through the nuclear potentials. The electrons effectively see a static potential and the electronic wave functions can only be adiabatically changed by the slow nuclear vibrations. The nuclei have translational, vibrational and rotational motions that we must take into account.
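Therefore, the total vibrational energy is
$$E_{vib} = \hbar\omega \approx \left(\frac{m}{M}\right)^{1/2}E_{el} \approx 0.01 - 0.1\ \text{eV} \tag{13.173}$$
The rotational energy for an angular momentum $\ell\hbar$ is
$$E_{rot} = \frac{\hbar^2\ell(\ell+1)}{2I} \approx \frac{\hbar^2}{Ma^2} \approx \frac{m}{M}E_{el} \tag{13.174}$$
Therefore, we have
$$E_{mol} = E_{el} + E_{vib} + E_{rot} \tag{13.175}$$
where the relative sizes are given by
$$E_{el} : E_{vib} : E_{rot} = 1 : \left(\frac{m}{M}\right)^{1/2} : \frac{m}{M} \tag{13.176}$$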
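A quick numerical illustration of this hierarchy follows. The helper name and the sample inputs (an electronic energy of 5 eV and a mass ratio of $10^{-4}$) are illustrative assumptions, not values taken from the text; they are chosen only to show that the vibrational scale lands in the quoted 0.01 to 0.1 eV window.

```python
import math

def molecular_energy_scales(E_el, mass_ratio):
    """Return (E_el, E_vib, E_rot) in the proportions 1 : (m/M)^(1/2) : m/M."""
    return E_el, E_el * math.sqrt(mass_ratio), E_el * mass_ratio

if __name__ == "__main__":
    E_el, E_vib, E_rot = molecular_energy_scales(5.0, 1e-4)   # eV, illustrative
    print(f"E_el = {E_el} eV, E_vib = {E_vib} eV, E_rot = {E_rot} eV")
```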
In this model of a molecule we will use
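where
$$T_e = \sum_{\text{electrons } i}\frac{\vec p_{op,i}^{\,2}}{2m} = \text{kinetic energy of the electrons}$$
$$T_N = \sum_{\text{nuclei } \alpha}\frac{\vec p_{op,\alpha}^{\,2}}{2M_\alpha} = \text{kinetic energy of the nuclei}$$
$$(T_e + V_{ee} + V_{eN})\,\psi_n(\vec r,\vec R) = \left(\varepsilon_n(\vec R) - V_{NN}(\vec R)\right)\psi_n(\vec r,\vec R) \tag{13.179}$$
$$= E_{el}(\vec R)\,\psi_n(\vec r,\vec R) \tag{13.180}$$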
Here we have defined
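Now the wave function for the whole molecule satisfies the Schrödinger equation
$$(T_e + T_N + V_{ee} + V_{eN} + V_{NN})\,\Psi(\vec r,\vec R) = E\,\Psi(\vec r,\vec R) \tag{13.181}$$
$$\Psi(\vec r,\vec R) = \sum_n \phi_n(\vec R)\,\psi_n(\vec r,\vec R) \tag{13.182}$$
Substituting we get
$$\sum_m (T_e + T_N + V_{ee} + V_{eN} + V_{NN})\,\phi_m(\vec R)\psi_m(\vec r,\vec R) = E\sum_m \phi_m(\vec R)\psi_m(\vec r,\vec R)$$
$$\sum_m (T_N + V_{NN})\,\phi_m(\vec R)\psi_m(\vec r,\vec R) + \sum_m (T_e + V_{ee} + V_{eN})\,\phi_m(\vec R)\psi_m(\vec r,\vec R) = E\sum_m \phi_m(\vec R)\psi_m(\vec r,\vec R)$$
$$\sum_m (T_N + V_{NN})\,\phi_m(\vec R)\psi_m(\vec r,\vec R) + \sum_m \left(\varepsilon_m(\vec R) - V_{NN}(\vec R)\right)\phi_m(\vec R)\psi_m(\vec r,\vec R) = E\sum_m \phi_m(\vec R)\psi_m(\vec r,\vec R)$$
$$\sum_m \left(T_N + \varepsilon_m(\vec R)\right)\phi_m(\vec R)\psi_m(\vec r,\vec R) = E\sum_m \phi_m(\vec R)\psi_m(\vec r,\vec R)$$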
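If we multiply on the left by $\psi_n^*(\vec r,\vec R)$ and integrate over all electron positions using the orthonormality of the $\psi_m(\vec r,\vec R)$ we get
$$\sum_m\int d^3r\,\psi_n^*(\vec r,\vec R)\left(T_N + \varepsilon_m(\vec R)\right)\phi_m(\vec R)\psi_m(\vec r,\vec R) = E\sum_m\phi_m(\vec R)\int d^3r\,\psi_n^*(\vec r,\vec R)\psi_m(\vec r,\vec R)$$
$$\sum_m\int d^3r\,\psi_n^*(\vec r,\vec R)\,T_N\,\phi_m(\vec R)\psi_m(\vec r,\vec R) + \sum_m\varepsilon_m(\vec R)\phi_m(\vec R)\,\delta_{nm} = E\sum_m\phi_m(\vec R)\,\delta_{nm}$$
$$\sum_m\int d^3r\,\psi_n^*(\vec r,\vec R)\,T_N\,\phi_m(\vec R)\psi_m(\vec r,\vec R) + \varepsilon_n(\vec R)\phi_n(\vec R) = E\phi_n(\vec R)$$
Using
$$\nabla^2(\phi\psi) = (\nabla^2\phi)\psi + 2\nabla\phi\cdot\nabla\psi + \phi(\nabla^2\psi) \tag{13.183}$$
we have
$$\sum_m\int d^3r\,\psi_n^*(\vec r,\vec R)\sum_\alpha\left(-\frac{\hbar^2}{2M_\alpha}\right)\nabla^2_{\vec R_\alpha}\left[\phi_m(\vec R)\psi_m(\vec r,\vec R)\right] + \varepsilon_n(\vec R)\phi_n(\vec R) = E\phi_n(\vec R)$$
$$\sum_\alpha\left(-\frac{\hbar^2}{2M_\alpha}\right)\sum_m\int d^3r\,\psi_n^*\left[\psi_m\nabla^2\phi_m + 2\nabla\phi_m\cdot\nabla\psi_m + \phi_m\nabla^2\psi_m\right] + \varepsilon_n\phi_n = E\phi_n$$
$$\sum_\alpha\left(-\frac{\hbar^2}{2M_\alpha}\right)\sum_m\nabla^2\phi_m\int d^3r\,\psi_n^*\psi_m + \sum_\alpha\left(-\frac{\hbar^2}{2M_\alpha}\right)\sum_m\int d^3r\,\psi_n^*\left[2\nabla\phi_m\cdot\nabla\psi_m + \phi_m\nabla^2\psi_m\right] + \varepsilon_n\phi_n = E\phi_n$$
$$\sum_\alpha\left(-\frac{\hbar^2}{2M_\alpha}\right)\sum_m\nabla^2\phi_m\,\delta_{nm} + \sum_\alpha\left(-\frac{\hbar^2}{2M_\alpha}\right)\sum_m\int d^3r\,\psi_n^*\left[2\nabla\phi_m\cdot\nabla\psi_m + \phi_m\nabla^2\psi_m\right] + \varepsilon_n\phi_n = E\phi_n$$
$$\sum_\alpha\left(-\frac{\hbar^2}{2M_\alpha}\right)\nabla^2_{\vec R_\alpha}\phi_n + \sum_\alpha\left(-\frac{\hbar^2}{2M_\alpha}\right)\sum_m\int d^3r\,\psi_n^*\left[2\nabla\phi_m\cdot\nabla\psi_m + \phi_m\nabla^2\psi_m\right] + \varepsilon_n\phi_n = E\phi_n$$
$$T_N\phi_n + \sum_\alpha\left(-\frac{\hbar^2}{2M_\alpha}\right)\sum_m\int d^3r\,\psi_n^*\left[2\nabla\phi_m\cdot\nabla\psi_m + \phi_m\nabla^2\psi_m\right] + \varepsilon_n\phi_n = E\phi_n$$
or
$$\left(T_N + \varepsilon_n(\vec R)\right)\phi_n(\vec R) = E\phi_n(\vec R) - \sum_m A_{nm}\phi_m(\vec R) \tag{13.184}$$
where
$$A_{nm}\phi_m(\vec R) = \sum_\alpha\left(-\frac{\hbar^2}{2M_\alpha}\right)\int d^3r\,\psi_n^*(\vec r,\vec R)\left[2\nabla_{\vec R_\alpha}\phi_m(\vec R)\cdot\nabla_{\vec R_\alpha}\psi_m(\vec r,\vec R) + \phi_m(\vec R)\nabla^2_{\vec R_\alpha}\psi_m(\vec r,\vec R)\right]$$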
The $A_{nm}$ term mixes different $n$ and $m$, which corresponds to different electronic wave functions in $\Psi(\vec r,\vec R)$.
~ M m (R)
~ 0 M m (R)
~ R
R~ m (R) ~ R ~
~ ~
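where $\delta \approx$ typical nuclear displacement from equilibrium. But
$$\frac{1}{2}M\omega^2\delta^2 \approx \hbar\omega \tag{13.185}$$
This implies the first term in $A_{nm}$ is of order
$$\left(\frac{m}{M}\right)^{1/2}\hbar\omega \ll \text{spacing between } n\text{-levels} \tag{13.186}$$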
Therefore we can neglect all of the $A_{nm}$ terms. We thus have
$$\left(T_N + \varepsilon_n(\vec R)\right)\phi_n(\vec R) = E\phi_n(\vec R) \tag{13.187}$$
This is a simple Schrödinger equation for the $\phi_n(\vec R)$, which are the expansion coefficients, and therefore we can write down $\Psi(\vec r,\vec R)$ once we know the $\psi_n(\vec r,\vec R)$.
The effect of the electrons is to couple together the nuclei with rubber bands whose force constants depend on the electronic state. Thus, for each $\psi_n(\vec r,\vec R)$, we have an $\varepsilon_n(\vec R)$, which implies a $\phi_n(\vec R)$. To lowest order in
$$\left(\frac{m}{M}\right)^{1/2} \tag{13.188}$$
we have no mixing of different n-levels.
This toy model is a simplification that makes the solution more transparent and analytically tractable.
Review of prior derivation
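where the upper-case letters, R and P, denote the position and momentum operators of the nuclei which are indexed by Greek letters. Lower-case letters, r and p, and Latin subscripts are reserved for the electrons; $m$ is the mass of an electron, $M_\alpha$ is the mass of the $\alpha$th nucleus, $e$ is the electron charge, and $+Z_\alpha e$ is the charge on the $\alpha$th nucleus. The Schrödinger equation for the stationary states of the molecule is
$$\left[-\frac{\hbar^2}{2m}\sum_i\nabla_i^2 - \sum_{\alpha,i}\frac{Z_\alpha e^2}{|\mathbf{R}_\alpha - \mathbf{r}_i|} + \sum_{i>j}\frac{e^2}{|\mathbf{r}_i - \mathbf{r}_j|} - \sum_\alpha\frac{\hbar^2}{2M_\alpha}\nabla_\alpha^2 + \sum_{\alpha>\beta}\frac{Z_\alpha Z_\beta e^2}{|\mathbf{R}_\alpha - \mathbf{R}_\beta|}\right]\Psi(\{r\},\{R\}) = E\,\Psi(\{r\},\{R\}) \tag{13.192}$$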
The first step in the Born-Oppenheimer approach consists in asserting that the
kinetic energy term associated with the heavy particles may be neglected in the
lowest approximation and their coordinates treated as parameters (and not as
dynamical variables). We write the full Hamiltonian as H = H0 + TR where
H0 is the electronic Hamiltonian in which the nuclear coordinates appear as
parameters. The eigenfunctions and eigenvalues of H0 describe the electrons for
fixed values of the coordinates of the nuclei:
We write {R} after a semicolon to indicate that {R} appear only as parameters. The eigenvalues $\varepsilon_n(\{R\})$ depend on the particular values of {R} of the nuclei and yield energy surfaces that are labeled by $n$, the electronic quantum numbers. Because the states $\psi_n(\{r\};\{R\})$ are a complete set of orthonormal vectors, we can use them as a basis to write the full wave function as
$$\Psi(\{r\};\{R\}) = \sum_n \phi_n(\{R\})\,\psi_n(\{r\};\{R\}) \tag{13.194}$$
with the understanding that the summation includes integration over continuum states. If we substitute the expansion (13.194) into Eq. (13.192) and integrate over the electronic coordinates, we obtain a set of coupled equations
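describing the dynamics of the heavy nuclei, namely,
$$\left[-\sum_\alpha\frac{\hbar^2}{2M_\alpha}\nabla_\alpha^2 + \varepsilon_n(\{R\}) - E\right]\phi_n(\{R\}) = \sum_{\alpha,m}\frac{\hbar^2}{M_\alpha}\left[\int\psi_n^*(\{r\};\{R\})\,\nabla_\alpha\psi_m(\{r\};\{R\})\prod_i d^3r_i\right]\cdot\nabla_\alpha\phi_m(\{R\}) + \sum_{\alpha,m}\left[\int\psi_n^*(\{r\};\{R\})\,\frac{\hbar^2}{2M_\alpha}\nabla_\alpha^2\psi_m(\{r\};\{R\})\prod_i d^3r_i\right]\phi_m(\{R\}) \tag{13.195}$$
where $m \neq n$. The terms on the left-hand side describe the motion driven by the kinetic energy operator of the nuclei on a given electronic energy surface $\varepsilon_n(\{R\})$, whereas those on the right-hand side represent jumps from one energy surface to another ($n \to m$) resulting in the coupling of the different electronic states.
$$\left[-\sum_\alpha\frac{\hbar^2}{2M_\alpha}\nabla_\alpha^2 + \varepsilon_n(\{R\}) - E\right]\phi_n(\{R\}) = 0 \tag{13.196}$$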
Real molecular systems are very complicated except for simple examples such as the hydrogen molecular ion $H_2^+$ (see below). To illustrate the basic ideas in a simple setting, we discuss a one-dimensional model of the actual physical system.
The Toy Model
Imagine a system of two particles: one light (of mass $m$) with coordinate $x$ and the other heavy (mass $M$) located at $X$. The particles are confined between two impenetrable walls a distance $L$ apart, that is, $-L/2 \le x, X \le L/2$. We assume that these two particles interact with each other via an attractive Dirac delta function interaction of strength $\lambda$. The Hamiltonian of the system within the allowed range is thus
$$H = \frac{p^2}{2m} + \frac{P^2}{2M} - \lambda\,\delta(x - X) \tag{13.197}$$
where $p$ and $P$ are the momenta of the two particles. The stationary states are given by
$$\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} - \frac{\hbar^2}{2M}\frac{\partial^2}{\partial X^2} - \lambda\,\delta(x - X)\right]\Psi(x,X) = E\,\Psi(x,X) \tag{13.198}$$
We follow the Born-Oppenheimer approach and first solve for the energy levels of the light particle, $\varepsilon_n(X)$, treating $X$ as a parameter, that is,
$$\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} - \lambda\,\delta(x - X)\right]\psi(x;X) = \varepsilon\,\psi(x;X) \tag{13.199}$$
or
$$\left[\frac{d^2}{dx^2} + k^2\right]\psi(x;X) = -g\,\delta(x - X)\,\psi(x;X) \tag{13.200}$$
where $k^2 \equiv 2m\varepsilon/\hbar^2$ and $g \equiv 2m\lambda/\hbar^2$. The wave function satisfies the boundary conditions, $\psi(x = \pm L/2, X) = 0$. Thus for $x < X$ the solution is $A\sin k(x + L/2)$ while for $x > X$ we must have $B\sin k(L/2 - x)$, where $A$ and $B$ are constants. The continuity of the wave function at $x = X$ implies that
$$A\sin k\left(\frac{L}{2} + X\right) = B\sin k\left(\frac{L}{2} - X\right) \tag{13.201}$$
The first derivative of the wave function must be discontinuous so as to yield the delta function in Eq. (13.200), whose integral between $x = X - \eta$ and $x = X + \eta$ yields the condition $\psi'(x = X + \eta; X) - \psi'(x = X - \eta; X) = -g\,\psi(x = X; X)$, or
$$-Bk\cos k\left(\frac{L}{2} - X\right) - Ak\cos k\left(\frac{L}{2} + X\right) = -gA\sin k\left(\frac{L}{2} + X\right) \tag{13.202}$$
If we eliminate $A$ and $B$ from Eqs. (13.201) and (13.202), we find the eigenvalue condition for $k$ (and hence $\varepsilon = \hbar^2k^2/2m$):
$$k\sin kL = g\sin k\left(\frac{L}{2} + X\right)\sin k\left(\frac{L}{2} - X\right) \tag{13.203}$$
For a given $g$ and $L$, Eq. (13.203) is to be solved for $k$ as a function of $X$ to determine the energy surfaces (here curves as there is only one parameter
$X$). We observe that for $X = \pm L/2$ the right-hand side of the eigenvalue condition, Eq. (13.203), vanishes and $\sin kL = 0$ has the solutions $k = n\pi/L$ and $\varepsilon_n = \hbar^2n^2\pi^2/(2mL^2)$. In other words, the infinite repulsion at the walls makes the wave function of the light particle (and hence the probability of finding it) zero at $x = \pm L/2$, and hence it must disregard the presence of the heavy particle if the latter is located at either of these two places.
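The eigenvalue condition, Eq. (13.203), is transcendental, so in practice it is solved numerically. The following sketch uses plain bisection; the function names, the bracketing intervals and the choice $L = 1$ with scaled coupling `g = 2/L` are our own illustrative assumptions. It recovers $kL = \pi$ when the heavy particle sits at a wall and a smaller value of $kL$ when it sits at the centre.

```python
import math

def eigenvalue_condition(k, X, g, L=1.0):
    """Left side minus right side of the eigenvalue condition, Eq. (13.203)."""
    return k * math.sin(k * L) - g * math.sin(k * (L / 2 + X)) * math.sin(k * (L / 2 - X))

def solve_k(X, g, k_lo, k_hi, L=1.0, tol=1e-12):
    """Bisection for a root of Eq. (13.203) bracketed by [k_lo, k_hi]."""
    f_lo = eigenvalue_condition(k_lo, X, g, L)
    if f_lo * eigenvalue_condition(k_hi, X, g, L) > 0:
        raise ValueError("root is not bracketed")
    while k_hi - k_lo > tol:
        k_mid = 0.5 * (k_lo + k_hi)
        f_mid = eigenvalue_condition(k_mid, X, g, L)
        if f_lo * f_mid <= 0:
            k_hi = k_mid                  # root lies in the lower half
        else:
            k_lo, f_lo = k_mid, f_mid     # root lies in the upper half
    return 0.5 * (k_lo + k_hi)

if __name__ == "__main__":
    L, g = 1.0, 2.0                                  # g = 2/L
    print(solve_k(0.0, g, 1.0, 3.0, L))              # heavy particle at the centre
    print(solve_k(L / 2, g, 3.0, 3.3, L))            # heavy particle at the wall
```

Repeating the solve on a grid of $X$ values traces out the energy curve $\hbar^2k^2/2m$ for the lowest level.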
In the above discussion we have tacitly assumed that $\varepsilon > 0$ by taking trigonometric functions as the solutions of Eq. (13.200), which generally holds except for the lowest state (and that too only if $g$ is large enough). The condition for crossover from positive to negative $\varepsilon$ is easily derived by noting that if $\varepsilon$ becomes negative, it must pass through zero and hence we can consider the eigenvalue condition, Eq. (13.203), for small $k$ to obtain the critical potential strength
$$g = g_c = \frac{L}{\frac{L^2}{4} - X^2} \tag{13.204}$$
If $g > g_c$ for a range of $X$ values, we must use the hyperbolic solutions of the Schrödinger equation with $|\varepsilon| = -\varepsilon = \hbar^2k^2/2m$, and the eigenvalue condition becomes
$$k\sinh kL = g\sinh k\left(\frac{L}{2} + X\right)\sinh k\left(\frac{L}{2} - X\right) \tag{13.205}$$
Equation (13.205) admits only one solution for the corresponding $k$. The reason is physically transparent. Note that for $L \to \infty$, $g_c \to 0$, which means that as the walls recede to infinity, the attractive delta function potential, no matter how weak, will have a solution with a negative energy eigenvalue (as can be seen by solving the delta function potential without walls). Hence, in the presence of the repulsive walls, this energy eigenvalue can be negative provided the walls are not too close and the potential not too weak.
which yields $2kX = N\pi$ ($N = 0, \pm 1, \pm 2, \ldots$). For the lowest positive energy state, $n = 1$, there will be only one extremum at $X = 0$ (corresponding to $N = 0$), and this energy must be a minimum as can be seen from the following argument. At $X = \pm L/2$ we have $k = \pi/L$, and as we move the heavy particle away from the wall, the energy and hence $k$ must decrease because of the attractive nature of the interaction. Thus for the $n = 1$ energy surface, $X = 0$ is a minimum. For the first excited electronic state, $n = 2$, we have three extrema at $N = 0$ and $\pm 1$. The point $X = 0$ (corresponding to $N = 0$) is a maximum, because the vanishing of the wave function at $x = 0$ with $X = 0$ (the wave function is an odd function) implies that the derivative is continuous and hence the energy corresponds to $k = 2\pi/L$ (as if the light particle disregards the presence of the heavy one). As the heavy particle moves away from $X = 0$, the energy must decrease due to the onset of the nonzero attraction. Thus the $n = 2$ energy surface gives rise to a double well. Similarly, it is easy to see that for the second excited electronic state, there are three minima at $X = 0$ and $X = \pm\pi/k$ and two maxima at $X = \pm\pi/2k$.
The wave functions for the ground and first excited electronic states are shown in Fig. 13.5 with the heavy particle located at $X = L/4$ and the strength of the delta function corresponding to $g = 2/L$. We have taken $L = 1$. Attention is drawn to the discontinuity in the derivative of the wave function of the light particle at the location of the heavy particle (indicated by the arrow). The shape and symmetry of the wave functions are clearly affected by the location of the heavy particle and the strength of the delta function interaction.
Figure 13.5: Wave function for the light particle when the heavy one is located at $X = L/4$ taking $g = 2/L$
Now that we have solved for the motion of the light particle with the heavy particle at $X$, we may, in the spirit of the Born-Oppenheimer approach, turn our attention to the slower vibrational motion of the heavy particle governed by the potential surfaces, $\varepsilon_n(X)$, obtained from $\varepsilon = \hbar^2k^2/2m$, where $k$ are solutions (labeled by the quantum number $n$) of Eq. (13.203). The vibrational energies may be determined numerically given the values of the relevant parameters ($m$, $M$, $g$, $L$). However, these energies can be estimated via the harmonic approximation by using the curvature of the potential at the minima
$$\frac{d^2\varepsilon(X)}{dX^2} = \frac{\hbar^2k^2}{2m}\left[-\frac{4kg\cos 2kX}{C} + \frac{2g^2\sin 2kX\,(\sin 2kX + 2kX\cos 2kX)}{C^2}\right] \tag{13.207}$$
where the dimensionless quantity $C = kL\cos kL + gX\sin 2kX + (1 - gL/2)\sin kL$. For convenience we choose the value $g = 2/L$. For the ground state electronic configuration, the minimum of the potential $\varepsilon(X) = \varepsilon_1(X)$ is at $X = 0$ and the corresponding solution of Eq. (13.203) yields $kL = 2.33$ (and $\cos kL = -0.69$). At the minimum of the potential, the curvature is
$$\frac{d^2\varepsilon(X)}{dX^2} = -\frac{4\hbar^2k^2}{mL^2\cos kL} \tag{13.208}$$
leading to the classical angular frequency
$$\omega = \sqrt{\frac{d^2\varepsilon(X)/dX^2}{M}} \tag{13.209}$$
The ensuing spectrum is a ladder with spacing $\hbar\omega$, and accordingly, the vibrational energy levels are approximately given by
$$\hbar\omega\left(\nu + \frac{1}{2}\right) = \frac{\hbar^2}{2mL^2}\sqrt{\frac{16m}{M}\frac{k^2L^2}{|\cos kL|}}\left(\nu + \frac{1}{2}\right) \tag{13.210}$$
where $\nu$ is the vibrational quantum number and takes integer values 0, 1, 2, ... . To make contact between the toy model and the realistic situation of molecules, we take $L$ to be of the order of chemical bond lengths ($L \approx 1$ Å) and $m$ and $M$ to be the masses of the electron and the proton, respectively, $m \approx 10^{-30}$ kg and $M \approx 2000m$. The separations between the electronic levels are thus of the order of $(\hbar^2/mL^2) \approx 6$ eV, and the vibrational level spacings are $\hbar\omega \approx \hbar^2/(mL^2)\sqrt{m/M} \approx 0.1$ eV.
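These estimates can be reproduced in a few lines. The sketch below is illustrative: the function names are ours, the input $kL = 2.33$ and the mass ratio 1/2000 are the values quoted in the text, and the SI constants are standard values inserted by us. It evaluates the harmonic spacing of Eq. (13.210) in units of $\hbar^2/2mL^2$ and the rough electronic scale $\hbar^2/mL^2$ in eV.

```python
import math

HBAR = 1.054571817e-34    # J s
EV = 1.602176634e-19      # J per eV

def vibrational_spacing(kL, mass_ratio):
    """hbar*omega in units of hbar^2/(2 m L^2), for the minimum at X = 0 with g = 2/L."""
    return math.sqrt(16 * mass_ratio * kL**2 / abs(math.cos(kL)))

def electronic_scale_eV(m=1e-30, L=1e-10):
    """hbar^2 / (m L^2) in eV for the rough mass and length quoted in the text."""
    return HBAR**2 / (m * L**2) / EV

if __name__ == "__main__":
    spacing = vibrational_spacing(2.33, 1 / 2000)
    scale = electronic_scale_eV()
    print("spacing in units of hbar^2/(2 m L^2):", spacing)
    print("electronic scale (eV):", scale)
    print("vibrational scale (eV):", scale * math.sqrt(1 / 2000))
```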
In Figures 13.6 and 13.7 the potential energy surfaces (in units of $\hbar^2/2mL^2$) of the ground and first excited electronic states are plotted against $X$ (in units of $L$) for $g = 2.0/L$.
Figure 13.6: Potential energy surface (in units of $\hbar^2/2mL^2$) of the ground electronic state ($n = 1$) plotted against $X$ (in units of $L$) for $g = 2.0/L$. We have shown the vibrational bound states for $M \approx 2000m$. Note the decreasing separation between the adjacent levels, particularly for higher excitations, which reflects the anharmonicity of the potential surface.
It can be seen that for $M = 2000m$, the energy separation of the lower vibrational levels is almost 0.26 (in units of $\hbar^2/2mL^2$). Thus a quadratic fit to the potential surface of the electronic ground state is very good at least for the lower vibrational levels (the agreement is within 5%). Similarly, for the first excited electronic state ($n = 2$), we have a double well potential surface and the curvatures at both minima correspond to an energy separation of 0.56. Indeed the results of a numerical calculation reveal that the almost equidistant vibrational level structure is obtained for the low lying levels, but the energy gaps reduce with increasing vibrational energy as the anharmonicity of the potential surface becomes important. Another feature of the first excited state, which is typical of a double well potential, is the occurrence of a doublet: the degenerate pair of levels, one in each well, is split by their coupling through tunneling across the barrier. This splitting is too small to be seen in Figure 13.7 for the lower vibrational excitations, is barely discernible in the next to highest vibrational level (shown by the thickening of this line in the diagram), and is clearly evident in the highest vibrational doublet. Hence, we see that many features of molecular systems, apart from the methodology of the Born-Oppenheimer adiabatic approximation, find a simple illustration in the toy model that we have discussed.
13.2.3 Examples
Hydrogen Molecular Ion ($H_2^+$)
In the ionized $H_2$ molecule, a single electron moves in the attractive potential of two protons A and B as shown in Figure 13.8 below.
The Hamiltonian, in the Born-Oppenheimer approximation, is
$$H = -\frac{\hbar^2}{2m}\nabla^2 - \frac{e^2}{|\vec r - \vec R_A|} - \frac{e^2}{|\vec r - \vec R_B|} + \frac{e^2}{|\vec R_A - \vec R_B|} \tag{13.211}$$
where we have chosen the symmetric and antisymmetric combinations only because the potential is symmetric about the midpoint of the molecule
$$\frac{\vec R_A + \vec R_B}{2} \tag{13.213}$$
The wave functions are
$$\psi_A(\vec r) = \frac{1}{\sqrt{\pi a^3}}\,e^{-|\vec r - \vec R_A|/a} \quad\text{and}\quad \psi_B(\vec r) = \frac{1}{\sqrt{\pi a^3}}\,e^{-|\vec r - \vec R_B|/a} \tag{13.214}$$
$$= C^2\,(2 + 2S(R))$$
where
$$R = |\vec R_A - \vec R_B| \tag{13.215}$$
$$S(R) = \int d^3r\,\psi_A(\vec r)\psi_B(\vec r) = \left(1 + \frac{R}{a} + \frac{R^2}{3a^2}\right)e^{-R/a} \tag{13.216}$$
where
$$\langle A|H|A\rangle = \int d^3r\,\psi_A(\vec r)\,H\,\psi_A(\vec r) \tag{13.218}$$
and so on.
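Now
$$\begin{aligned}
\langle A|H|A\rangle &= \int d^3r\,\psi_A(\vec r)\left[-\frac{\hbar^2}{2m}\nabla^2 - \frac{e^2}{|\vec r - \vec R_A|} - \frac{e^2}{|\vec r - \vec R_B|} + \frac{e^2}{|\vec R_A - \vec R_B|}\right]\psi_A(\vec r) \\
&= \int d^3r\,\psi_A(\vec r)\left[-\frac{\hbar^2}{2m}\nabla^2 - \frac{e^2}{|\vec r - \vec R_A|}\right]\psi_A(\vec r) + \int d^3r\,\psi_A(\vec r)\frac{e^2}{|\vec R_A - \vec R_B|}\psi_A(\vec r) - \int d^3r\,\psi_A(\vec r)\frac{e^2}{|\vec r - \vec R_B|}\psi_A(\vec r) \\
&= E_1 + \frac{e^2}{R} - \frac{e^2}{R}\left[1 - \left(1 + \frac{R}{a}\right)e^{-2R/a}\right] = E_1 + \frac{e^2}{R}\left(1 + \frac{R}{a}\right)e^{-2R/a} \\
&= \langle B|H|B\rangle
\end{aligned}$$
where
In a similar way
$$\begin{aligned}
\langle A|H|B\rangle &= \int d^3r\,\psi_A(\vec r)\left[-\frac{\hbar^2}{2m}\nabla^2 - \frac{e^2}{|\vec r - \vec R_A|} - \frac{e^2}{|\vec r - \vec R_B|} + \frac{e^2}{|\vec R_A - \vec R_B|}\right]\psi_B(\vec r) \\
&= \left(E_1 + \frac{e^2}{R}\right)S(R) - \frac{e^2}{a}\left(1 + \frac{R}{a}\right)e^{-R/a}
\end{aligned}$$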
The top curve represents $\varepsilon_-(R)$ and the bottom curve is $\varepsilon_+(R)$. It is clear that $\varepsilon_+(R)$ has a minimum and $\varepsilon_-(R)$ does not.
Therefore, the symmetric wave function leads to binding between the nuclei and the antisymmetric wave function does not.
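The two curves can be generated directly from the matrix elements above. The sketch below works in units where $e^2 = a = 1$ (so $E_1 = -1/2$), and it assumes the standard variational energy $(\langle A|H|A\rangle \pm \langle A|H|B\rangle)/(1 \pm S)$ for the symmetric and antisymmetric combinations; that expression and the function names are our assumptions, since the corresponding equation is not reproduced in this excerpt.

```python
import math

E1 = -0.5   # hydrogen ground-state energy in units with e^2 = a = 1

def overlap(R):
    """Overlap integral S(R), Eq. (13.216)."""
    return (1 + R + R**2 / 3) * math.exp(-R)

def h_aa(R):
    """Diagonal matrix element <A|H|A>."""
    return E1 + (1 / R) * (1 + R) * math.exp(-2 * R)

def h_ab(R):
    """Off-diagonal matrix element <A|H|B>."""
    return (E1 + 1 / R) * overlap(R) - (1 + R) * math.exp(-R)

def energy(R, sign):
    """Assumed LCAO energy (H_AA +/- H_AB) / (1 +/- S); sign is +1 or -1."""
    return (h_aa(R) + sign * h_ab(R)) / (1 + sign * overlap(R))

def bonding_minimum(R_min=1.0, R_max=10.0, steps=9000):
    """Grid search for the separation that minimizes the symmetric-state energy."""
    grid = [R_min + i * (R_max - R_min) / steps for i in range(steps + 1)]
    return min(grid, key=lambda R: energy(R, +1))

if __name__ == "__main__":
    R0 = bonding_minimum()
    print("R0 / a =", R0, " energy =", energy(R0, +1), " (separated atoms:", E1, ")")
```

In this sketch the symmetric curve dips below the separated-atom energy $E_1$, while the antisymmetric curve stays above it at every separation sampled.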
Figure 13.9: $\varepsilon_\pm(R)$ versus $R/a$
In fact, $\psi_-(\vec r)$ is zero on the bisecting plane. This implies that $\psi_+(\vec r)$ gives the larger probability for the electron to be between the nuclei. This is the binding mechanism. In this state the electron has the greatest attraction from both protons.
The experimental results are
$$-\frac{e^2}{|\vec r_2 - \vec R_A|} - \frac{e^2}{|\vec r_2 - \vec R_B|} + \frac{e^2}{|\vec r_1 - \vec r_2|} + \frac{e^2}{|\vec R_A - \vec R_B|} \tag{13.222}$$
There are two different variational approaches to the solution of this problem.
We make the assumption that the wave function of the two electrons in the
ground state is
This wave function has a symmetric spatial part and an antisymmetric spin part so that the total wave function for the two electrons is antisymmetric. The spatial part is the product of the $H_2^+$ molecular wave functions. The spatial part has been chosen to be a binding orbital since it is maximum in between the nuclei. Therefore, we expect this spatial wave function to represent the ground state of the $H_2$ molecule. The choice of the singlet spin function (paired spins) is solely to ensure antisymmetry.
This spatial part is an antibinding orbital since it is zero in between the nuclei
and thus has a higher energy than the other wave function.
In this method we leave out the $\psi_A(1)\psi_A(2)$ and $\psi_B(1)\psi_B(2)$ terms. We choose
$$\psi_s(1,2) = \frac{1}{\sqrt{2(1+S^2)}}\left(\psi_A(1)\psi_B(2) + \psi_A(2)\psi_B(1)\right)\chi_{\text{sing}} \tag{13.225}$$
$$\psi_t(1,2) = \frac{1}{\sqrt{2(1-S^2)}}\left(\psi_A(1)\psi_B(2) - \psi_A(2)\psi_B(1)\right)\chi_{\text{trip}} \tag{13.226}$$
In this case, for large separations we have separated hydrogen atoms, which is the correct physics. These wave functions exhibit the same small separation problems as in the molecular orbital method, however.
where we have used
The $+$ sign is the singlet value and the $-$ sign is the triplet value. The matrix elements are given by
$$\langle AB|H|AB\rangle = \int d^3r_1\,d^3r_2\,\psi_A(1)\psi_B(2)\,H\,\psi_A(1)\psi_B(2) \tag{13.228}$$
$$\langle AB|H|BA\rangle = \int d^3r_1\,d^3r_2\,\psi_A(1)\psi_B(2)\,H\,\psi_A(2)\psi_B(1) \tag{13.229}$$
where
Q = total exchange energy
e2 e2 e2
Z
= d3 r1 d3 r2 A (1)B (2)A (2)B (1)
|~r1 ~r2 | ~r R ~ ~
~r2 RA
1 B
Z 2
e
= 2 d3 r1 A (1)B (1)
~ A
~r1 R
e2 e2
Z
+ d3 r1 d3 r2 A (1)B (2)A (2)B (1) + S2
|~r1 ~r2 | R
~ + S2 e2
= Vex (R) (13.234)
R
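The exchange term is a measure of the overlap of the wave functions, weighted by the potential energies. It is the result of an interplay between quantum mechanics and the Pauli exclusion principle (PEP).
$$\varepsilon_\pm = 2E_1 + \frac{V_C \pm V_{ex}}{1 \pm S^2} + \frac{e^2}{R} = 2E_1 + \frac{V_C + \frac{e^2}{R} \pm \left(V_{ex} + S^2\frac{e^2}{R}\right)}{1 \pm S^2} \tag{13.235}$$
We have
$$V_C + \frac{e^2}{R} > 0 \ \text{always} \quad\text{and}\quad V_{ex} + S^2\frac{e^2}{R} < 0 \ \text{generally} \tag{13.236}$$
which implies that $\varepsilon_+ < \varepsilon_-$. In fact, one finds that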
This implies that the molecule binds in the singlet state but not in the triplet state. One finds that the strength of the binding is proportional to the overlap of the two electron states.
The triplet wave function spatial part vanishes if both electrons are in between the protons, implying a smaller overlap and thus no binding.
If we consider a two-atom molecule like HCl, this equation gives the rotational and vibrational energy levels.
$$\left[-\frac{\hbar^2}{2m}\nabla^2 + \varepsilon(r)\right]\psi(\vec r) = E\,\psi(\vec r) \tag{13.238}$$
where
$$r = |\vec r| \quad\text{and}\quad m = \frac{M_1M_2}{M_1 + M_2} = \text{reduced mass} \tag{13.239}$$
We have used the fact that the effective potential depends only on $r$.
$$\left[-\frac{\hbar^2}{2m}\left(\frac{d^2}{dr^2} + \frac{2}{r}\frac{d}{dr}\right) + \varepsilon(r) + \frac{\hbar^2\ell(\ell+1)}{2mr^2}\right]R_{n\ell}(r) = E\,R_{n\ell}(r) \tag{13.241}$$
Defining
$$V_{eff}(r) = \varepsilon(r) + \frac{\hbar^2\ell(\ell+1)}{2mr^2} \tag{13.242}$$
and using
$$u_{n\ell}(r) = rR_{n\ell}(r) \tag{13.243}$$
we have
$$\left[-\frac{\hbar^2}{2m}\frac{d^2}{dr^2} + V_{eff}(r)\right]u_{n\ell}(r) = E\,u_{n\ell}(r) \tag{13.244}$$
For small values of $\ell$, $V_{eff}(r)$ has a minimum that depends on $\ell$ (say at $r = r_\ell$). In the neighborhood of this minimum, we can expand $V_{eff}(r)$ in a Taylor series
$$V_{eff}(r) = V_{eff}(r_\ell) + \frac{1}{2}m\omega_\ell^2(r - r_\ell)^2 + \ldots \tag{13.245}$$
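where
$$m\omega_\ell^2 = \left(\frac{d^2V_{eff}}{dr^2}\right)_{r=r_\ell} \tag{13.246}$$
$$\left[-\frac{\hbar^2}{2m}\frac{d^2}{dq^2} + \varepsilon(r_\ell) + \frac{\hbar^2\ell(\ell+1)}{2mr_\ell^2} + \frac{1}{2}m\omega_\ell^2q^2\right]u_{n\ell}(r) = E\,u_{n\ell}(r) \tag{13.247}$$
$$\varepsilon(r) \quad\text{and}\quad \frac{\hbar^2\ell(\ell+1)}{2mr^2} \tag{13.248}$$
We end up with the harmonic oscillator equation where the energy is
$$E = \varepsilon(r_\ell) + \frac{\hbar^2\ell(\ell+1)}{2I_\ell} + \hbar\omega_\ell(n + 1/2) \tag{13.249}$$
with
$$I_\ell = mr_\ell^2 = \text{effective moment of inertia} \tag{13.250}$$
The corresponding stationary states are
$$u_{n\ell} = A_nH_n\!\left(\frac{q}{q_{0\ell}}\right)\exp\left\{-\frac{1}{2}\left(\frac{q}{q_{0\ell}}\right)^2\right\} \tag{13.251}$$
where
$$q_{0\ell} = \left(\frac{\hbar}{m\omega_\ell}\right)^{1/2} \tag{13.252}$$
Although the wave functions do not satisfy the requirement $u(r = 0) = 0$, the energy eigenvalues are still reasonably accurate in this approximation, i.e.,
$$u_{n\ell}(0) \approx H_n\!\left(-\frac{r_\ell}{q_{0\ell}}\right)\exp\left\{-\frac{1}{2}\left(\frac{r_\ell}{q_{0\ell}}\right)^2\right\} \tag{13.253}$$
With $r_\ell \gg q_{0\ell}$, $u(0)$ is very small. The energy eigenvalues contain contributions from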
The rotational levels have 0.1 1 cm which corresponds to the far infrared
and microwave regions. The vibrational levels have 2 103 3 103 cm
which corresponds to the infrared region.
13.3 Problems
13.3.1 S-Wave Phase Shift
We wish to find an approximate expression for the s-wave phase shift, $\delta_0$, for scattering of low energy particles from the potential
$$V(r) = \frac{C}{r^4}, \qquad C > 0$$
(a) For low energies, $k \approx 0$, the radial Schrödinger equation for $\ell = 0$ may be approximated by (dropping the energy term):
$$\left[-\frac{1}{r^2}\frac{d}{dr}r^2\frac{d}{dr} + \frac{2mC}{\hbar^2r^4}\right]R^{\text{inside}}_{\ell=0}(r) = 0$$
By making the transformations
$$R(r) = \frac{1}{\sqrt{r}}\,\phi(r), \qquad r = \frac{i}{\hbar}\frac{\sqrt{2mC}}{x}$$
show that the radial equation may be solved in terms of Bessel functions. Find an approximate solution, taking into account behavior at $r = 0$.
(b) Using the standard procedure of matching this to $R^{\text{outside}}_{\ell=0}(r)$ at $r = a$ (where $a$ is chosen such that $\hbar a \gg \sqrt{2mC}$ and $ka \ll 1$) show that
$$\delta_0 = -k\frac{\sqrt{2mC}}{\hbar}$$
which is independent of $a$.
(c) Find the energy dependence of the differential cross section for a fixed
scattering angle.
(d) Find ` for 2g/~2 1 and show that the differential cross section is
3 g2
d
= 2 cot
d 2~ E 2
(e) For the same potential, calculate the differential cross section using the
Born approximation and compare it with the above results. Why did this
happen?
reciprocal lattice. This proof then shows that the scattering amplitude vanishes unless the momentum transfer $\vec q$ is equal to some vector of the reciprocal lattice. This is the Bragg-Von Laue scattering condition.
where $kr_0 \ll 1$. Write the radial Schrödinger equation for $\ell = 1$, and show that the solution for the p-wave scattering is of the form
$$\chi_{k1}(r) = rR_{k1}(r) = A\left[\frac{\sin(kr)}{kr} - \cos(kr) + a\left(\frac{\cos(kr)}{kr} + \sin(kr)\right)\right]$$
where $A$ and $a$ are constants. Determine $\delta_1(k)$ from the condition imposed on $\chi_{k1}(r_0)$. Show that in the limit $k \to 0$, $\delta_1(k) \approx (kr_0)^3$ and $\delta_1(k) \ll \delta_0(k)$.
Express this potential near its minimum by a harmonic oscillator potential and
determine the vibrational energies of the molecule.
How do we determine the energy operator for the ammonia molecule? If these were the energy eigenstates, they would clearly have the same energy (since we cannot distinguish them in any way). So the diagonal elements of the energy operator must be equal if we are using the ($|1\rangle$, $|2\rangle$) basis. But there is a small probability that a nitrogen atom above the plane will be found below the plane and vice versa (called tunneling). So the off-diagonal element of the energy operator must not be zero, which also reflects the fact that the above and below states are not energy eigenstates. We therefore arrive at the following matrix as representing the most general possible energy operator for the ammonia molecule system:
$$H = \begin{pmatrix} E_0 & A \\ A & E_0 \end{pmatrix}$$
(a) Find the eigenvalues and eigenvectors of the energy operator. Label them as ($|I\rangle$, $|II\rangle$).
(b) Let the initial state of the ammonia molecule be $|I\rangle$, that is, $|\psi(0)\rangle = |I\rangle$. What is $|\psi(t)\rangle$, the state of the ammonia molecule after some time $t$? What is the probability of finding the ammonia molecule in each of its energy eigenstates? What is the probability of finding the nitrogen atom above or below the plane of the hydrogen atoms?
(c) Let the initial state of the ammonia molecule be $|1\rangle$, that is, $|\psi(0)\rangle = |1\rangle$. What is $|\psi(t)\rangle$, the state of the ammonia molecule after some time $t$? What is the probability of finding the ammonia molecule in each of its energy eigenstates? What is the probability of finding the nitrogen atom above or below the plane of the hydrogen atoms?
as a symmetric rigid rotator. Call the moment of inertia about the z-axis $I_3$ and the moments about the pairs of axes perpendicular to the z-axis $I_1$.
(a) Write down the Hamiltonian of this system in terms of $\vec L$, $I_3$ and $I_1$.
(b) Show that $[H, L_z] = 0$
(c) What are the eigenstates and eigenvalues of the Hamiltonian?
(d) Suppose that at time $t = 0$ the molecule is in the state
$$|\psi(0)\rangle = \frac{1}{\sqrt{2}}|0,0\rangle + \frac{1}{\sqrt{2}}|1,1\rangle$$
What is $|\psi(t)\rangle$?
where H is the Hamiltonian and , > 0. Now define an operator R such that
R |1 i = |2 i , R |2 i = |3 i , R |3 i = |1 i
(a) Show that R commutes with H and find the eigenvalues and eigenvectors
of R.
(b) Find the eigenvalues and eigenvectors of H.
(2) Verify your expression is correct by showing that it reduces to the result
for the spherical square well when V1 = V2 , i.e., calculate separately the
spherical square well result and set V1 = V2 V0 .
(3) Plot the result of the square well(part(2)) as a function of qR2 over a suf-
ficient region to understand its behavior (i.e., qR2 0 , qR2 1 , qR2
1). Note and explain any noteworthy features.
(4) Now plot not simply versus $q$, but versus $\theta$, $0 < \theta < \pi$, for four representative values of the energy. Use atomic scales: $R_2 = 3a_B$, $V_0 = 1$ Ry. You have to decide on relevant energies to plot; it should be helpful to plot on the same graph with different line styles or colors.
(6) Finally, consider a Gaussian potential that has the same range parameter and the same volume integral as the simple square well of part (2). Calculate the differential cross-section in this case. Can you identify possible effects due to the sharp structure (discontinuity) that occurs in only one of them?
(a) Find a and b such that the proton charge equals e, the charge on the
electron.
(a) Show that the free-particle Green's function for the time-independent Schrödinger equation with energy $E$ and outgoing-wave boundary conditions is
$$G_E(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}dk\,\frac{e^{ikx}}{E - \frac{\hbar^2k^2}{2m} + i\epsilon}$$
13.3.20 Born approximation - Atomic Potential
Use the Born approximation to determine the differential cross section for the atomic potential seen by an incoming electron, which can be represented by the function
$$V(r) = -Ze^2\int\frac{\rho_T(\vec r')\,d^3\vec r'}{|\vec r - \vec r'|}$$
where
$$\rho_T(\vec r') = \rho_{\text{nuclear}}(\vec r') + \rho_{\text{electronic}}(\vec r') = \delta(\vec r') - \rho(\vec r')$$
It is given by
$$V(r) = \frac{C_{12}}{r^{12}} - \frac{C_6}{r^6}$$
(a) Near the minimum $r_0$, the potential looks harmonic. Including the first anharmonic correction, show that up to a constant term
$$V(x) = \frac{1}{2}m\omega^2x^2 + \lambda x^3$$
where $r_0 = (2C_{12}/C_6)^{1/6}$, $x = r - r_0$, $m\omega^2 = V''(r_0)$ and $\lambda = V'''(r_0)/6$.
Let us write $H = H_0 + H_1$, where $H_0 = \frac{p^2}{2m} + \frac{1}{2}m\omega^2x^2$ and $H_1 = \lambda x^3$.
(b) What is the small parameter of the perturbation expansion?
(c) Show that the first energy shift vanishes (use symmetry).
(d) Show that the second energy shift (first nonvanishing correction) is
$$E_n^{(2)} = \frac{\lambda^2}{\hbar\omega}\left(\frac{\hbar}{2m\omega}\right)^3\sum_{m\neq n}\frac{\left|\langle m|(a + a^+)^3|n\rangle\right|^2}{n - m}$$
(e) Evaluate the matrix elements to show that
(f) Consider carbon C-C bonds; take Lennard-Jones parameters $C_6 = 15.2$ eV Å$^6$ and $C_{12} = 2.4 \times 10^4$ eV Å$^{12}$. Plot the potential and the energy levels from the ground to second excited state including the anharmonic shifts.
(a) Classically, where would you put the electrons so that the nuclei are at-
tracted to one another in a bonding configuration? What configuration
maximally repels the nuclei (anti-bonding)?
(b) Consider the two-electron state of this molecule. When the nuclei are far
enough apart, we can construct this state out of atomic orbitals and spins.
Write the two possible states as products of orbital and spin states. Which
is the bonding configuration? Which is the anti-bonding?
(c) Sketch the potential energy seen by the nuclei as a function of the inter-
nuclear separation R for the two different electron configurations. Your
bonding configuration should allow for bound-states of the nuclei to one
another. This is the covalent bond.
(a) Calculate, in the Born approximation, the differential cross section for
scattering a charged particle from the hydrogen atom as modeled above
(neglect the recoil of the hydrogen atom).
(b) What is the form of the differential cross section for low energy? Compare
with the pure Coulomb cross section.
(c) Show that the differential cross section becomes more and more like a
pure Coulomb cross section as the energy of the incident particle increases.
Explain why this happens.
Chapter 14
Some Examples of Quantum Systems
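$$a\,|\alpha\rangle = \alpha\,|\alpha\rangle \tag{14.2}$$
we found
$$|\alpha\rangle = e^{-\frac{1}{2}|\alpha|^2} \sum_{m=0}^{\infty} \frac{\alpha^m}{\sqrt{m!}}\,|m\rangle \tag{14.3}$$
where
$$|\alpha|^2 = N = \langle\alpha|\, N_{op}\, |\alpha\rangle \tag{14.4}$$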
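$$\begin{aligned}
\langle\alpha|\, x^2\, |\alpha\rangle &= \frac{\hbar}{2m\omega}\,\langle\alpha|\,(a^+a^+ + aa^+ + a^+a + aa)\,|\alpha\rangle \\
&= \frac{\hbar}{2m\omega}\left(\alpha^{*2} + 2\alpha^*\alpha + \alpha^2 + 1\right) \\
&= \langle\alpha|\, x\, |\alpha\rangle^2 + \frac{\hbar}{2m\omega}
\end{aligned} \tag{14.7}$$
$$\begin{aligned}
\langle\alpha|\, p^2\, |\alpha\rangle &= -\frac{\hbar m\omega}{2}\,\langle\alpha|\,(a^+a^+ - aa^+ - a^+a + aa)\,|\alpha\rangle \\
&= -\frac{\hbar m\omega}{2}\left(\alpha^{*2} - 2\alpha^*\alpha + \alpha^2 - 1\right) \\
&= \langle\alpha|\, p\, |\alpha\rangle^2 + \frac{\hbar m\omega}{2}
\end{aligned} \tag{14.8}$$
Using these relations we have
$$(\Delta x)^2 = \langle\alpha|\, x^2\, |\alpha\rangle - \langle\alpha|\, x\, |\alpha\rangle^2 = \frac{\hbar}{2m\omega} \tag{14.9}$$
$$(\Delta p)^2 = \langle\alpha|\, p^2\, |\alpha\rangle - \langle\alpha|\, p\, |\alpha\rangle^2 = \frac{\hbar m\omega}{2} \tag{14.10}$$
Hence
$$\Delta x\,\Delta p = \frac{\hbar}{2} \tag{14.11}$$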
which says that coherent states are minimum uncertainty states.
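The minimum-uncertainty property can be checked numerically. The following is a minimal sketch, assuming units with $\hbar = m = \omega = 1$ and a Fock-space truncation `nmax` (both are choices made here for illustration, as are the function names); it builds the coefficients $e^{-|\alpha|^2/2}\alpha^n/\sqrt{n!}$ and evaluates $\Delta x\,\Delta p$ directly.

```python
import math

def coherent_coeffs(alpha, nmax=80):
    """Coefficients c_n = exp(-|alpha|^2/2) alpha^n / sqrt(n!), truncated at nmax (assumed cutoff)."""
    norm = math.exp(-abs(alpha) ** 2 / 2)
    coeffs = []
    term = complex(1.0)  # holds alpha^n / sqrt(n!), built up recursively
    for n in range(nmax):
        coeffs.append(norm * term)
        term *= alpha / math.sqrt(n + 1)
    return coeffs

def uncertainties(alpha, nmax=80):
    """Return (dx, dp) in units with hbar = m = omega = 1."""
    c = coherent_coeffs(alpha, nmax)
    # <a>, <a^2> and <a^+ a> evaluated from the number-state coefficients
    a1 = sum(c[n].conjugate() * c[n + 1] * math.sqrt(n + 1) for n in range(nmax - 1))
    a2 = sum(c[n].conjugate() * c[n + 2] * math.sqrt((n + 1) * (n + 2))
             for n in range(nmax - 2))
    num = sum(abs(c[n]) ** 2 * n for n in range(nmax))
    # x = (a + a^+)/sqrt(2), p = i (a^+ - a)/sqrt(2)
    x_mean = math.sqrt(2) * a1.real
    p_mean = math.sqrt(2) * a1.imag
    x2 = 0.5 * (2 * a2.real + 2 * num + 1)
    p2 = 0.5 * (-2 * a2.real + 2 * num + 1)
    return math.sqrt(x2 - x_mean ** 2), math.sqrt(p2 - p_mean ** 2)

dx, dp = uncertainties(1.5 + 0.7j)
print(dx * dp)  # should be very close to 1/2 in these units
```

The product stays at $1/2$ for any complex $\alpha$, as long as `nmax` is large compared with $|\alpha|^2$.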
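Now let us find the differential equation satisfied by $\langle x\,|\,\alpha\rangle$ and determine its
solution. We have
$$\begin{aligned}
\langle x|\, a\, |\alpha\rangle = \alpha\,\langle x\,|\,\alpha\rangle &= \langle x|\, \frac{1}{2}\left(\sqrt{\frac{2m\omega}{\hbar}}\, x - \frac{p}{i}\sqrt{\frac{2}{m\omega\hbar}}\right) |\alpha\rangle \\
&= \frac{1}{2}\left(\sqrt{\frac{2m\omega}{\hbar}}\, x - \frac{1}{i}\sqrt{\frac{2}{m\omega\hbar}}\left(-i\hbar\frac{d}{dx}\right)\right) \langle x\,|\,\alpha\rangle
\end{aligned}$$
or
$$\left(x + \frac{\hbar}{m\omega}\frac{d}{dx}\right)\langle x\,|\,\alpha\rangle = \sqrt{\frac{2\hbar}{m\omega}}\;\alpha\,\langle x\,|\,\alpha\rangle \tag{14.12}$$
which has the solution (check by substitution)
$$\langle x\,|\,\alpha\rangle = C \exp\left[-\frac{(x - \langle x\rangle)^2}{4(\Delta x)^2} + \frac{i}{\hbar}\langle p\rangle\, x\right] \tag{14.13}$$
$$= C' \exp\left[-\frac{m\omega}{2\hbar}\left(x - \sqrt{\frac{2\hbar}{m\omega}}\;\alpha\right)^2\right] \tag{14.14}$$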
For a fixed oscillator mode, specified by a given value of $m\omega$, the coherent states
are the manifold of those minimum uncertainty states that have definite values
of $\Delta x$ and $\Delta p$. If $m\omega = 1$, then the uncertainties in x and p are both equal to
$\sqrt{\hbar/2}$.
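where
$$\lambda = \frac{1}{2}\left(\sqrt{\frac{\omega'}{\omega}} + \sqrt{\frac{\omega}{\omega'}}\right), \quad \nu = \frac{1}{2}\left(\sqrt{\frac{\omega'}{\omega}} - \sqrt{\frac{\omega}{\omega'}}\right) \tag{14.18}$$
and therefore
$$b^+ = \lambda a^+ + \nu a \tag{14.19}$$
since $\lambda$ and $\nu$ are real. Algebra also shows that
$$\lambda^2 - \nu^2 = \frac{1}{4}\left(\frac{\omega'}{\omega} + \frac{\omega}{\omega'} + 2 - \frac{\omega'}{\omega} - \frac{\omega}{\omega'} + 2\right) = 1 \tag{14.20}$$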
We now invert the transformation. We have
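$$\lambda b - \nu b^+ = \lambda^2 a + \lambda\nu a^+ - \nu\lambda a^+ - \nu^2 a = (\lambda^2 - \nu^2)\,a = a \tag{14.21}$$
$$b\,|\beta\rangle = \beta\,|\beta\rangle \tag{14.23}$$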
These new states are also minimum uncertainty states for x and p. We want to
calculate
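$$(\Delta x)^2 = \langle x^2\rangle - \langle x\rangle^2, \quad (\Delta p)^2 = \langle p^2\rangle - \langle p\rangle^2 \tag{14.24}$$
Now we have
$$\begin{aligned}
x &= \sqrt{\frac{\hbar}{2m\omega}}\,(a + a^+) \\
&= \sqrt{\frac{\hbar}{2m\omega}}\left[(\lambda - \nu)\, b + (\lambda - \nu)\, b^+\right] \\
&= \sqrt{\frac{\hbar}{2m\omega}}\,(\lambda - \nu)\,(b + b^+)
\end{aligned} \tag{14.25}$$
and
$$\begin{aligned}
p &= i\sqrt{\frac{\hbar m\omega}{2}}\,(a^+ - a) \\
&= i\sqrt{\frac{\hbar m\omega}{2}}\left[(\lambda + \nu)\, b^+ - (\lambda + \nu)\, b\right] \\
&= i\sqrt{\frac{\hbar m\omega}{2}}\,(\lambda + \nu)\,(b^+ - b)
\end{aligned} \tag{14.26}$$
These equations imply that the earlier derivation goes through unchanged, apart
from different multiplicative factors, so that
$$(\Delta x)^2 = \frac{\hbar}{2m\omega}\,(\lambda - \nu)^2 \tag{14.27}$$
$$(\Delta p)^2 = \frac{\hbar m\omega}{2}\,(\lambda + \nu)^2 \tag{14.28}$$
and therefore
$$(\Delta x)^2 (\Delta p)^2 = \frac{\hbar^2}{4}(\lambda - \nu)^2(\lambda + \nu)^2 = \frac{\hbar^2}{4}\left(\lambda^2 - \nu^2\right)^2 = \frac{\hbar^2}{4}$$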
It turns out that the operators a and b are related by a unitary transformation,
i.e.,
$$b = U a U^+ \tag{14.29}$$
where
$$U = \exp\left[\frac{\zeta}{2}\left(a^2 - a^{+2}\right)\right] \tag{14.30}$$
and
$$e^{\zeta} = \lambda + \nu \tag{14.31}$$
Proof:
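$$U a U^+ = \exp\left[\frac{\zeta}{2}\left(a^2 - a^{+2}\right)\right] a\, \exp\left[-\frac{\zeta}{2}\left(a^2 - a^{+2}\right)\right] = e^{B} a\, e^{-B} \tag{14.32}$$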
and so on.
Therefore
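$$\begin{aligned}
U a U^+ &= a + \zeta a^+ + \frac{\zeta^2}{2}\, a + \frac{\zeta^3}{6}\, a^+ + \dots \\
&= a\left(1 + \frac{\zeta^2}{2!} + \frac{\zeta^4}{4!} + \dots\right) + a^+\left(\zeta + \frac{\zeta^3}{3!} + \frac{\zeta^5}{5!} + \dots\right)
\end{aligned} \tag{14.35}$$
If we define
$$\lambda = 1 + \frac{\zeta^2}{2!} + \frac{\zeta^4}{4!} + \dots, \quad \nu = \zeta + \frac{\zeta^3}{3!} + \frac{\zeta^5}{5!} + \dots \tag{14.36}$$
so that
$$e^{\zeta} = \lambda + \nu \tag{14.37}$$
we have
$$U a U^+ = \lambda a + \nu a^+ = b \tag{14.38}$$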
Finally, we have
a |coherenti = |coherenti
b |squeezedi = |squeezedi
U a |coherenti = U |coherenti
U aU + U |coherenti = U U + U |coherenti = U |coherenti
bU |coherenti = U |coherenti = U |coherenti
bU |coherenti = U |coherenti
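We have
$$\nabla \times \vec{A} = B\,\hat{e}_z, \quad B = \text{constant} \tag{14.39}$$
In cylindrical coordinates $(r, \phi, z)$, we can choose
$$A_r = A_z = 0, \quad A_\phi = \frac{rB}{2} \;\Rightarrow\; \vec{A} = \frac{rB}{2}\,\hat{e}_\phi \tag{14.40}$$
The Schrödinger equation for the electron is
$$\frac{1}{2m}\left(\vec{p} - \frac{e}{c}\vec{A}\right)^2 \psi = E\psi, \quad e < 0 \tag{14.41}$$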
We then let
$$\psi = \psi'\, \exp\left[\frac{ie}{\hbar c}\int^{\vec{r}} \vec{A}\cdot d\vec{r}\right] \tag{14.42}$$
which should get rid of the effect of $\vec{A}$ (equivalent to a gauge transformation).
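We have
$$\begin{aligned}
\left(\vec{p} - \frac{e}{c}\vec{A}\right)\psi &= \left(\vec{p} - \frac{e}{c}\vec{A}\right) e^{\frac{ie}{\hbar c}\int^{\vec{r}} \vec{A}\cdot d\vec{r}}\,\psi' \\
&= \frac{\hbar}{i}\nabla\left(e^{\frac{ie}{\hbar c}\int^{\vec{r}} \vec{A}\cdot d\vec{r}}\,\psi'\right) - \frac{e}{c}\vec{A}\, e^{\frac{ie}{\hbar c}\int^{\vec{r}} \vec{A}\cdot d\vec{r}}\,\psi' \\
&= e^{\frac{ie}{\hbar c}\int^{\vec{r}} \vec{A}\cdot d\vec{r}}\,\frac{\hbar}{i}\nabla\psi' + \psi'\,\frac{\hbar}{i}\,\frac{ie}{\hbar c}\vec{A}\, e^{\frac{ie}{\hbar c}\int^{\vec{r}} \vec{A}\cdot d\vec{r}} - \frac{e}{c}\vec{A}\, e^{\frac{ie}{\hbar c}\int^{\vec{r}} \vec{A}\cdot d\vec{r}}\,\psi' \\
&= e^{\frac{ie}{\hbar c}\int^{\vec{r}} \vec{A}\cdot d\vec{r}}\left(\frac{\hbar}{i}\nabla + \frac{e}{c}\vec{A} - \frac{e}{c}\vec{A}\right)\psi' \\
&= e^{\frac{ie}{\hbar c}\int^{\vec{r}} \vec{A}\cdot d\vec{r}}\,\frac{\hbar}{i}\nabla\psi' = e^{\frac{ie}{\hbar c}\int^{\vec{r}} \vec{A}\cdot d\vec{r}}\,\vec{p}\,\psi'
\end{aligned}$$
Similarly,
$$\left(\vec{p} - \frac{e}{c}\vec{A}\right)^2\psi = e^{\frac{ie}{\hbar c}\int^{\vec{r}} \vec{A}\cdot d\vec{r}}\,\vec{p}^{\,2}\,\psi' \tag{14.43}$$
so that the Schrödinger equation becomes
$$\frac{1}{2m}\,\vec{p}^{\,2}\,\psi' = E\,\psi' \tag{14.44}$$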
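Since the electron is confined to a loop of radius R, we have
$$\psi(\phi) = \psi'(\phi)\, e^{\frac{ie}{\hbar c}\int^{\vec{r}} \vec{A}\cdot d\vec{r}} = \psi'(\phi)\, e^{\frac{ie}{\hbar c}\int A_\phi R\, d\phi} = \psi'(\phi)\, e^{\frac{ie}{\hbar c} A_\phi R\,\phi} \tag{14.45}$$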
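Therefore we have
$$\frac{1}{2m}\,\vec{p}^{\,2}\,\psi'(\phi) = -\frac{\hbar^2}{2mR^2}\frac{d^2\psi'(\phi)}{d\phi^2} = E\,\psi'(\phi) = \frac{\hbar^2}{2mR^2}\,C_1^2\,\psi'(\phi) \tag{14.46}$$
which has the solution
$$\psi'(\phi) = e^{iC_1\phi} \;\Rightarrow\; \psi(\phi) = e^{iC_1\phi}\, e^{\frac{ie}{\hbar c} A_\phi R\,\phi} = e^{i\left(C_1 + \frac{eBR^2}{2\hbar c}\right)\phi} \tag{14.47}$$
Now imposing single-valuedness, we have
$$\psi(\phi) = \psi(\phi + 2\pi) \tag{14.48}$$
$$e^{i\left(C_1 + \frac{eBR^2}{2\hbar c}\right)\phi} = e^{i\left(C_1 + \frac{eBR^2}{2\hbar c}\right)\phi}\, e^{2\pi i\left(C_1 + \frac{eBR^2}{2\hbar c}\right)} \tag{14.49}$$
which says that
$$C_1 + \frac{eBR^2}{2\hbar c} = n = 0, \pm 1, \pm 2, \dots \tag{14.50}$$
or
$$C_1 = n - \frac{eBR^2}{2\hbar c} \;\Rightarrow\; E_n = \frac{\hbar^2}{2mR^2}\left(n - \frac{eBR^2}{2\hbar c}\right)^2 \tag{14.51}$$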
If we define 0 = ~c/e = unit of flux, and remembering that the flux through
the loop is = R2 B we have
2
~2
En = n+ (14.52)
2mR2 0
which says that the dependence of En on the external magnetic field B or flux
is parabolic. A plot is shown in Figure 14.2 below.
where n is the integer nearest to eBR2 /2~c or near
e
= <0 (14.54)
0 c~
Note that n < 0 since e < 0.
Now imagine that we start with the wire in its ground state in the presence of
a magnetic flux $\Phi$. If the magnetic field is turned off, determine the current in
the loop. Assume that R = 2 cm and $\Phi = 0.6$ gauss cm$^2$.
Suppose that we start with a state $E_n$ which is the ground state. n will remain
the same when the magnetic field is turned off. Therefore, $\psi(\phi) = Ce^{in\phi}$ and
the electric current is
$$\vec{J} = \frac{e\hbar}{2mi}\left(\psi^*\nabla\psi - \psi\nabla\psi^*\right) \tag{14.55}$$
which follows from
2
J~ + || = 0 (14.56)
t
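We then get
$$\vec{J} = \frac{e\hbar}{2mi}\,\frac{2}{R}\,(in)\,|\psi|^2\,\hat{e}_\phi = \frac{ne\hbar}{mR}\,|\psi|^2\,\hat{e}_\phi \tag{14.57}$$
Now, if S = the cross-section area of the thin wire, the normalization constant
is
$$\int |\psi|^2\, d\ell\, dS = 2\pi R\,|C|^2 S = 1 \;\Rightarrow\; |C|^2 = \frac{1}{2\pi RS} \tag{14.58}$$
Then,
$$I = \text{current} = \int \vec{J}\cdot d\vec{S} = \frac{ne\hbar}{mR}\,|C|^2 S = \frac{ne\hbar}{2\pi mR^2} \tag{14.59}$$
where we have assumed that $\vec{J}$ is constant throughout the thin wire.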
Since the electron is initially in the ground state, this implies that En is the
minimum energy and we have
e
n = greatest integer not greater than = (14.60)
0 c~
or
greatest integer not greater than1 (14.61)
0
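For a macroscopic system we have $|n| \gg 1$ and we can use
$$n \approx \frac{e\Phi}{2\pi c\hbar} \;\Rightarrow\; I = \frac{ne\hbar}{2\pi mR^2} \approx \frac{e^2\Phi}{4\pi^2 mcR^2} \tag{14.62}$$
For R = 2 cm and $\Phi = 0.6$ gauss cm$^2$ we have (using SI units)
$$I = \frac{(1.6\times 10^{-19})^2\,(0.6\times 10^{-4})\times 10^{-4}}{4\pi^2\,(2\times 10^{-2})^2\,(0.9\times 10^{-30})} = 1.1\times 10^{-14}\ \text{amp} \tag{14.63}$$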
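The arithmetic in the numerical estimate can be reproduced with a few lines of code. This is a sketch only: it assumes the SI form $I = e^2\Phi/(4\pi^2 m R^2)$ (the factor of c drops out in SI) and the rounded constants used in the text; the function name is chosen here for illustration.

```python
import math

E_CHARGE = 1.6e-19    # electron charge magnitude in coulombs (rounded value used in the text)
M_ELECTRON = 0.9e-30  # electron mass in kg (rounded value used in the text)

def persistent_current(flux_gauss_cm2, radius_cm):
    """Current in amperes for a flux given in gauss*cm^2 and a loop radius in cm."""
    flux_wb = flux_gauss_cm2 * 1e-4 * 1e-4  # gauss -> tesla and cm^2 -> m^2
    radius_m = radius_cm * 1e-2
    return E_CHARGE ** 2 * flux_wb / (4 * math.pi ** 2 * M_ELECTRON * radius_m ** 2)

current = persistent_current(0.6, 2.0)
print(f"{current:.2e} A")
```

The result is of order $10^{-14}$ A, in line with the value quoted above.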
14.3 Spin-Orbit Coupling in Complex Atoms
Spin-orbit coupling is strictly an internal effect arising from the interaction
between the electron spin and the effective magnetic field due to the apparent
nuclear motion. In analogy with the one-electron atom, we can write for the
N electron atom
$$H_{so} = \sum_{i=1}^{N} \xi_i(\vec{r}_i)\, \vec{L}_i\cdot\vec{S}_i \tag{14.64}$$
For the case of weak spin-orbit, let us use both classical and quantum mechanical
arguments to determine the first-order correction to the energy.
Classical Argument:
In this model, there exist two extreme orientations of these vectors for a single
electron as shown in Figure 14.3 below.
and as shown in Figure 14.4 below the $\vec{L}$ and $\vec{S}$ vectors precess about the vector
$\vec{J} = \vec{L} + \vec{S}$ at the same angular velocity.
Figure 14.4: Precessing Vectors
so that X X
~i S
L ~i = L
~ S
~ , = i i (14.68)
i i
~ S
~= ~2 ~ 2 ~ 2
Hso = L J L S (14.69)
2
as we assumed.
More formally (using the Wigner-Eckart theorem) we have
X
hLSJM | Hso |LSJM i = C (LSML0 MS0 )C(LSML MS )
0 0
ML ,MS ,ML ,MS
0 0
ML +MS =ML +MS
0 0
ML ,MS ,ML ,MS
0 0
ML +MS =ML +MS
X
~ i |LML i hSMS0 | S
i (r) hLML0 | L ~i |SMS i
i
X
= C (LSML0 MS0 )C(LSML MS )
0 0
ML ,MS ,ML ,MS
0 0
ML +MS =ML +MS
X D
ED
E
~
~
i (r) L
Li
L S
Si
S
i
~ S
hLSML0 MS0 | L ~ |LSML MS i (14.70)
where the last step involves two uses of the Wigner-Eckart theorem. We then
have
X
hLSJM | Hso |LSJM i = C (LSML0 MS0 )C(LSML MS )
0 0
ML ,MS ,ML ,MS
0 0
ML +MS =ML +MS
X
~ S
i (r)i i hLSML0 MS0 | L ~ |LSML MS i
i
X
= C (LSML0 MS0 )C(LSML MS )
0 0
ML ,MS ,ML ,MS
0 0
ML +MS =ML +MS
~ S
hLSML0 MS0 | L ~ |LSML MS i
~ S
= hLSJM | L ~ |LSJM i (14.71)
~ S.
so that effectively, Hso = L ~
so that
$$\begin{aligned}
E(L,S,J) - E(L,S,J-1) &= A(LS)\,\frac{J(J+1) - L(L+1) - S(S+1)}{2} \\
&\quad - A(LS)\,\frac{(J-1)J - L(L+1) - S(S+1)}{2} \\
&= \frac{A(LS)}{2}\left[J^2 + J - J^2 + J\right] = A(LS)\,J
\end{aligned} \tag{14.73}$$
~ B
HB = M ~ = Mz Bz (14.75)
~ + gS S
~= e ~ ~
~ =
~L +
~ S = gL L (L + 2S) (14.77)
2mc
e
gS = = 2gL (14.78)
mc
In addition, ~ Therefore,
~ is precessing rapidly about J.
J~ ~ + 2S)
B (L ~ J~ B (L~ + 2S)
~ (L~ + S)
~
~
ef f ective = = =
J ~ J ~ J
2 2 ~ ~
B L + 2S + 3L S B L2 + 2S 2 + 32 (J 2 L2 S 2 )
= =
~ J ~ J
2 2 2
B 3J L + S J 2 L2 + S 2
B
= J = J 1 + (14.79)
~ 2J 2 ~ 2J 2
or
J 2 L2 + S 2
B ~
~ ef f ective = J 1+ (14.80)
~ 2J 2
Now we use the Wigner-Eckart theorem.
e ~ ~ ~ ~ e J~ J~ + S ~ J~
GJ~ J~ = (J J + S J) G =
2mc 2mc J~ J~
~ ~ ~ ~ ~ ~
e J(J + 1) + JJ+S2SLL
=
2mc J(J + 1)
e J(J + 1) + S(S + 1) L(L + 1)
= 1+ (14.83)
2mc 2J(J + 1)
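Now
$$\begin{aligned}
\langle LSJM|\, \vec{S}\cdot\vec{J}\, |LSJM\rangle &= \langle LSJ\, \|S\|\, LSJ\rangle\, \langle LSJM|\, \vec{J}\cdot\vec{J}\, |LSJM\rangle \\
&= J(J+1)\hbar^2\, \langle LSJ\, \|S\|\, LSJ\rangle
\end{aligned} \tag{14.86}$$
or
$$\langle LSJ\, \|S\|\, LSJ\rangle = \frac{\langle LSJM|\, \vec{S}\cdot\vec{J}\, |LSJM\rangle}{J(J+1)\hbar^2} \tag{14.87}$$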
But we have
so that
Therefore, the first-order correction to the energy level E(L, S, J) due to the
perturbation HB is
eB J(J + 1) + S(S + 1) L(L + 1)
hHzeeman i = M~ 1 +
2mc 2J(J + 1)
= 0 M g(LSJ) (14.91)
where
$$g(LSJ) = 1 + \frac{J(J+1) + S(S+1) - L(L+1)}{2J(J+1)} \tag{14.92}$$
is the so-called Landé g-factor.
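As a quick illustration of the Landé formula, the sketch below evaluates $g(LSJ)$ exactly with rational arithmetic. The function name and the choice of `Fraction` inputs are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

def lande_g(L, S, J):
    """Lande g-factor g(LSJ) = 1 + [J(J+1) + S(S+1) - L(L+1)] / [2 J(J+1)]."""
    L, S, J = Fraction(L), Fraction(S), Fraction(J)
    if J == 0:
        raise ValueError("g(LSJ) is undefined for J = 0")
    return 1 + (J * (J + 1) + S * (S + 1) - L * (L + 1)) / (2 * J * (J + 1))

half = Fraction(1, 2)
print(lande_g(1, half, Fraction(3, 2)))  # a 2P3/2 level
print(lande_g(1, half, half))            # a 2P1/2 level
print(lande_g(0, half, half))            # pure spin, L = 0
```

For $L = 0$ the expression reduces to the pure-spin value $g = 2$, and for $S = 0$ it reduces to the pure-orbital value $g = 1$.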
gravitational field.
For a particular value of the angle of incidence $\theta$, called the Bragg angle, a plane
wave
$$\psi_{inc} = e^{i(\vec{p}\cdot\vec{r} - Et)/\hbar} \tag{14.93}$$
where E is the energy of the neutrons and $\vec{p}$ their momentum, is split by
the crystal into two outgoing waves which are symmetric with respect to the
direction perpendicular to the crystal, as shown in Figure 14.6 below.
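The transmitted wave and the reflected wave have complex amplitudes which
can be written respectively as
$$\alpha = \cos\chi, \quad \beta = i\sin\chi, \quad \chi\ \text{real} \tag{14.94}$$
so that
$$\psi_I = \alpha\, e^{i(\vec{p}\cdot\vec{r} - Et)/\hbar}, \quad \psi_{II} = \beta\, e^{i(\vec{p}\,'\cdot\vec{r} - Et)/\hbar} \tag{14.95}$$
where $|\vec{p}| = |\vec{p}\,'|$ since the neutrons scatter elastically on the nuclei of the crystal.
The transmission and reflection coefficients are
$$T = |\alpha|^2, \quad R = |\beta|^2 \quad\text{with}\quad T + R = 1 \tag{14.96}$$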
In this interferometer setup the incident neutron beam is horizontal. It is split
by the interferometer into a set of beams, two of which recombine and interfere
at point D. The detectors C2 and C3 count the outgoing neutron fluxes. The
neutron beam velocity corresponds to a de Broglie wavelength $\lambda = 1.445$ Å and
the neutron mass is $M = 1.675 \times 10^{-27}$ kg.
14.5.2 The Gravitational Effect
The phase difference between the beams ACD and ABD is created by rotating
the interferometer by an angle $\phi$ around the direction of the incident beam as
shown in Figure 14.7 (on the left) below.
Now let d be the distance between the silicon strips (we neglect their thickness
in this discussion). We also define L as the side of ABCD and H as its height
as shown in Figure 14.7 (on the right) above. We then have (simple geometry)
that
$$L = \frac{d}{\cos\theta}, \quad H = 2d\sin\theta, \quad \theta = \text{Bragg angle} \tag{14.102}$$
Experimentally, the values of d and $\theta$ are d = 3.6 cm and $\theta = 22.1^\circ$.
Since there is no recoil energy of the silicon atoms to be taken into account, the
neutron total energy (kinetic + potential) is a constant of the motion in all of
the process. The energies are given by
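$$E_{AC} = \frac{p^2}{2M} = E_{BD} = \frac{(p - \Delta p)^2}{2M} + MgH\sin\phi \tag{14.103}$$
so that
$$\Delta p \approx \frac{M^2 gH\sin\phi}{p} \tag{14.104}$$
The neutron velocity is
$$v = \frac{h}{M\lambda} \approx 2700\ \text{m/s} \tag{14.105}$$
The change in the velocity $\Delta v$ is therefore very small, i.e.,
$$\Delta v \approx \frac{gH}{v} \approx 2\times 10^{-4}\ \text{m/s} \quad\text{for}\quad \phi = \frac{\pi}{2} \tag{14.106}$$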
Now the gravitational potential varies in exactly the same way along AB and
CD. The neutron state in both cases is a plane wave with momentum $p = h/\lambda$
just before A or C. The same Schrödinger equation is used to determine the wave
function at the end of the segments. This implies that the phases accumulated
along the two segments AB and CD are equal.
When comparing segments AC and BD, the previous reasoning does not apply,
since the initial state of the neutron is not the same for the two segments. The
initial state is $e^{ipz/\hbar}$ for AC and $e^{i(p - \Delta p)z/\hbar}$ for BD. After traveling over a
distance L = AC = BD, the phase difference between the two paths is
$$\delta = \frac{L\,\Delta p}{\hbar} = \frac{M^2 g\lambda d^2}{\pi\hbar^2}\,\tan\theta\,\sin\phi \tag{14.107}$$
The variation with $\phi$ of the experimentally measured intensity $I_2$ in the counter
$C_2$ is shown schematically in Figure 14.8 below (the data do not display a
minimum exactly at $\phi = 0$ because of calibration difficulties).
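where
$$A = \frac{M^2\lambda d^2}{\pi\hbar^2}\,\tan\theta \tag{14.109}$$
Therefore,
$$g = \frac{\delta_2 - \delta_1}{A\,(\sin\phi_2 - \sin\phi_1)} \tag{14.110}$$
In the actual data there are 9 oscillations, i.e., $\delta_2 - \delta_1 = 18\pi$ between $\phi_1 = -32^\circ$
and $\phi_2 = +24^\circ$, which gives g = 9.8 m/s$^2$. This clearly shows that the neutron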
interference effects are directly the result of the difference in the gravitational
potential along two arms of the interferometer.
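The extraction of g from the fringe count can be sketched numerically. The snippet assumes the phase-difference coefficient $A = M^2\lambda d^2\tan\theta/(\pi\hbar^2)$, which follows from $\delta = L\Delta p/\hbar$ with $p = h/\lambda$, and uses the experimental values quoted in the text; the value of $\hbar$ and the function name are supplied here as assumptions.

```python
import math

HBAR = 1.0546e-34        # J s (assumed value of the reduced Planck constant)
M_NEUTRON = 1.675e-27    # kg, as quoted in the text

def g_from_fringes(wavelength, d, theta_deg, phi1_deg, phi2_deg, n_oscillations):
    """Estimate g from the number of fringes observed between two rotation angles."""
    theta = math.radians(theta_deg)
    # Coefficient of g*sin(phi) in the phase difference between the two paths
    coeff = M_NEUTRON ** 2 * wavelength * d ** 2 * math.tan(theta) / (math.pi * HBAR ** 2)
    phase_change = 2 * math.pi * n_oscillations
    sin_diff = math.sin(math.radians(phi2_deg)) - math.sin(math.radians(phi1_deg))
    return phase_change / (coeff * sin_diff)

g_est = g_from_fringes(1.445e-10, 0.036, 22.1, -32.0, 24.0, 9)
print(f"g is approximately {g_est:.2f} m/s^2")
```

With these inputs the estimate lands within about one percent of the quoted value of g.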
1 2
H = ~ r) + V (~r) + (1 + a) q S
p~ + q A(~ ~ B
~ (14.112)
2m m
V (~r) = = m02 (2z 2 x2 y 2 )/4 (14.113)
= electrostatic potential energy (14.114)
~ r) = B
A(~ ~ ~r/2 (14.115)
= vector potential (14.116)
~ r) = A(~
We note that p~ A(~ ~ r) p~ = L
~ B/2
~ = Lz B/2 and A
~ 2 = B 2 (x2 + y 2 )/4.
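We then get
$$H = H_z + H_t + H_s \tag{14.117}$$
$$H_z = \frac{p_z^2}{2m} + \frac{1}{2} m\omega_0^2 z^2 \tag{14.118}$$
$$H_t = \frac{p_x^2}{2m} + \frac{p_y^2}{2m} + \frac{1}{2} m\Omega^2 (x^2 + y^2) + \frac{1}{2}\omega_c L_z \tag{14.119}$$
$$H_s = (1 + a)\,\omega_c S_z \tag{14.120}$$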
In order to calculate the motion along the zaxis, we introduce the creation
and annihilation operators
1 i 1 i p
az = z + pz , a+
z = z pz , = m~0
2 ~ 2 ~
We then have
+
1 2 i i 1
az , az = [z, z] + [pz , z] [z, pz ] 2 2 [pz , pz ]
2 ~ ~ ~
1 i i
= + [i~] [i~] = 1 (14.123)
2 ~ ~
Thus, we have the same mathematical system as the harmonic oscillator so that
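$$H_z = \hbar\omega_0 (N_z + 1/2), \quad N_z = a_z^+ a_z \tag{14.124}$$
$$N_z\,|N_z\rangle = N_z\,|N_z\rangle, \quad N_z = 0, 1, 2, 3, \dots \tag{14.125}$$
$$H_z\,|N_z\rangle = E_{N_z}\,|N_z\rangle, \quad E_{N_z} = \hbar\omega_0 (N_z + 1/2) \tag{14.126}$$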
14.6.2 The Transverse Motion
We now investigate the x y motion governed by the Hamiltonian Ht . If we
define the right- and left-circular creation and annihilation operators
1 i
ar = (x iy) + (px ipy ) (14.127)
2 ~
1 i
al = (x + iy) + (px + ipy ) (14.128)
2 ~
where is a real constant, then we can show (in same way as above) that
+
ar , a+ , [ar , al ] = 0 = ar , a+
r = 1 = al , al l (14.129)
Defining
Nr = a+
r ar , Nl = a+
l al (14.130)
we have
1 1 2Lz
Nr = a+
r ar
= 2 2 2 2 2
(x + y ) + 2 2 (px + py ) 2 + (14.131)
4 ~ ~
+ 1 2 2 2 1 2 2 2L z
Nl = al al = (x + y ) + 2 2 (px + py ) 2 (14.132)
4 ~ ~
and thus
$$L_z = x p_y - y p_x = \hbar (N_r - N_l) \tag{14.133}$$
and
1 1
Nr + Nl = 2 (x2 + y 2 ) + (p 2
+ p 2
) 1 (14.134)
2 2 ~2 x y
Thus, the energy eigenvalues of H are
In liquid helium, $kT = 3.5 \times 10^{-4}$ eV and the longitudinal and magnetron level
spacings are much smaller than the thermal fluctuations. Thus, a classical
description of these two motions is appropriate. In contrast, only a few quanta of
oscillation are thermally excited for the cyclotron motion since $kT \sim \hbar\omega_c$. Now,
the electron anomaly is $a \approx 0.00116$. Therefore we can draw the relative position
of the four energy levels
$$N_z = 0;\quad N_m = 0;\quad N_c = 0, 1 \quad\text{and}\quad \sigma = \pm 1 \tag{14.142}$$
The splitting $\Delta E$ between the level $N_c = 0$, $\sigma = +1$ and the level $N_c = 1$, $\sigma = -1$
is proportional to the anomaly a. We have $\Delta E = a\hbar\omega_c = 5 \times 10^{-7}$ eV, where
we have neglected the difference between $\omega_c$ and $\omega'_c$ which is $7.9 \times 10^{-11}$ eV.
The splitting corresponds to a frequency $\nu = \Delta E/\hbar = 191$ MHz.
Suppose that a cat within a closed box would be killed by a $|\uparrow\rangle$ particle but not
by a $|\downarrow\rangle$ particle. Now consider the effect of the state $|\uparrow\rangle + |\downarrow\rangle$, which can
easily be produced by a properly oriented Stern-Gerlach device.
Suppose that a particle in the state $|\uparrow\rangle + |\downarrow\rangle$ hits the cat and that the state
of the (spin + cat) makes a transition to
What is the consequence of including the observer herself in the quantum me-
chanical description?
According to the point of view presented, the cat(together with the mechanism
for killing the cat, which was not mentioned above) is linked to other macro-
scopic objects. These are influenced differently in the two final states so that
their respective wave functions do not overlap. For everything that follows, the
macroscopic consequences are not recorded; the trace is taken over them. The
final state of the cat is described by a mixture of states corresponding to a dead
cat and a living cat: the cat is either dead or living and not in a pure state
14.7.1 Schrödinger's Cat - a more detailed presentation
The superposition principle states that if $|\psi_a\rangle$ and $|\psi_b\rangle$ are two possible states
of a quantum system, the quantum superposition
$$\frac{1}{\sqrt{2}}\left(|\psi_a\rangle + |\psi_b\rangle\right) \tag{14.145}$$
is also an allowed state for this system. This principle is essential in explaining
quantum interference phenomena. However, when it is applied to large or
macroscopic objects, it leads to paradoxical situations where a system can be
in a superposition of states which is classically self-contradictory.
The most famous example is Schrödinger's cat paradox where the cat is in a
superposition of the dead and alive states. The purpose of this discussion is to
show that such superpositions of macroscopic states are not detectable in practice.
They are extremely fragile, and very weak coupling to the environment
suffices to destroy the quantum superposition of the two states $|\psi_a\rangle$ and $|\psi_b\rangle$.
Preliminaries
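We introduce the operators
$$X = \sqrt{\frac{m\omega}{\hbar}}\; x, \quad P = \frac{p}{\sqrt{m\hbar\omega}} \tag{14.148}$$
$$a\,|n\rangle = \sqrt{n}\,|n-1\rangle, \quad a^+\,|n\rangle = \sqrt{n+1}\,|n+1\rangle \tag{14.152}$$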
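We can use these relations to derive the ground state wave function in the
position representation as follows:
$$\begin{aligned}
0 = \langle X|\, a\, |0\rangle &= \frac{1}{\sqrt{2}}\,\langle X|\,(X + iP)\,|0\rangle \\
&= \frac{1}{\sqrt{2}}\, X\,\langle X\,|\,0\rangle + \frac{i}{\sqrt{2}}\left(-i\frac{\partial}{\partial X}\right)\langle X\,|\,0\rangle
\end{aligned}$$
$$\left(X + \frac{\partial}{\partial X}\right)\langle X\,|\,0\rangle = 0 \;\Rightarrow\; \langle X\,|\,0\rangle = Ae^{-X^2/2} = \psi_0(X)$$
$$\psi_0(x) = Ae^{-m\omega x^2/2\hbar} \tag{14.153}$$
Similarly, we can derive the ground state wave function in the momentum
representation as follows:
$$\begin{aligned}
0 = \langle P|\, a\, |0\rangle &= \frac{1}{\sqrt{2}}\,\langle P|\,(X + iP)\,|0\rangle \\
&= \frac{1}{\sqrt{2}}\left(i\frac{\partial}{\partial P}\right)\langle P\,|\,0\rangle + \frac{i}{\sqrt{2}}\, P\,\langle P\,|\,0\rangle
\end{aligned}$$
$$\left(P + \frac{\partial}{\partial P}\right)\langle P\,|\,0\rangle = 0 \;\Rightarrow\; \langle P\,|\,0\rangle = Ae^{-P^2/2} = \phi_0(P)$$
$$\phi_0(p) = Ae^{-p^2/2m\hbar\omega} \tag{14.154}$$
These two wave functions are related by the Fourier transform, that is,
$$\phi_0(p) = e^{-p^2/2m\hbar\omega} \propto \int e^{-m\omega x^2/2\hbar}\, e^{-ipx/\hbar}\, dx \propto \int \psi_0(x)\, e^{-ipx/\hbar}\, dx$$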
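We now ask the question: what are the eigenstates of the lowering
operator a? We write
$$a\,|\alpha\rangle = \alpha\,|\alpha\rangle \quad\text{where}\quad \alpha = |\alpha|\, e^{i\phi} \tag{14.155}$$
Since the vectors $|n\rangle$ are eigenvectors of a Hermitian operator, they form an
orthonormal complete set and can be used as an orthonormal basis for the
vector space. We can then write
$$|\alpha\rangle = \sum_{m=0}^{\infty} b_m\,|m\rangle \tag{14.156}$$
where
$$\langle k\,|\,\alpha\rangle = \sum_{m=0}^{\infty} b_m\,\langle k\,|\,m\rangle = \sum_{m=0}^{\infty} b_m\,\delta_{km} = b_k \tag{14.157}$$
Now
$$\langle n-1|\, a\, |\alpha\rangle = \alpha\,\langle n-1\,|\,\alpha\rangle = \alpha\, b_{n-1} \tag{14.158}$$
and using
$$a^+\,|n-1\rangle = \sqrt{n}\,|n\rangle \;\Rightarrow\; \langle n-1|\, a = \sqrt{n}\,\langle n| \tag{14.159}$$
we have
$$\langle n-1|\, a\, |\alpha\rangle = \sqrt{n}\,\langle n\,|\,\alpha\rangle = \sqrt{n}\, b_n \tag{14.160}$$
or
$$b_n = \frac{\alpha}{\sqrt{n}}\, b_{n-1} \tag{14.161}$$
This says that
$$b_1 = \frac{\alpha}{\sqrt{1}}\, b_0, \quad b_2 = \frac{\alpha}{\sqrt{2}}\, b_1 = \frac{\alpha^2}{\sqrt{2!}}\, b_0 \tag{14.162}$$
or
$$b_n = \frac{\alpha^n}{\sqrt{n!}}\, b_0 \tag{14.163}$$
We thus get the final result
$$|\alpha\rangle = b_0 \sum_{m=0}^{\infty} \frac{\alpha^m}{\sqrt{m!}}\,|m\rangle \tag{14.164}$$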
Now
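We have
$$\langle n\,|\,\alpha\rangle = e^{-\frac{1}{2}|\alpha|^2} \sum_{m=0}^{\infty} \frac{\alpha^m}{\sqrt{m!}}\,\langle n\,|\,m\rangle = e^{-\frac{1}{2}|\alpha|^2}\,\frac{\alpha^n}{\sqrt{n!}} \tag{14.168}$$
$$\langle\alpha|\, a^+ a\, |\alpha\rangle = |\alpha|^2\,\langle\alpha\,|\,\alpha\rangle = |\alpha|^2 = N = \langle\alpha|\, N_{op}\, |\alpha\rangle \tag{14.169}$$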
or N = the average value or expectation value of the Nop operator in the state
|i. This type of probability distribution is called a Poisson distribution, i.e., the
state |i has the number states or energy eigenstates distributed in a Poisson
manner.
Since the states |ni are energy eigenstates, we know their time dependence, i.e.,
$$|n, t\rangle = e^{-i\frac{E_n}{\hbar}t}\,|n\rangle \tag{14.170}$$
This simple operation clearly indicates the fundamental importance of the en-
ergy eigenstates when used as a basis set.
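Now let us try to understand the physics contained in the $|\alpha\rangle$ state vector. In a
given energy eigenstate the expectation value of the position operator is given
by
$$\begin{aligned}
\langle n,t|\, x\, |n,t\rangle &= \sqrt{\frac{\hbar}{2m\omega_0}}\,\langle n,t|\,(a + a^+)\,|n,t\rangle \\
&= \sqrt{\frac{\hbar}{2m\omega_0}}\,\langle n|\, e^{i\frac{E_n}{\hbar}t}\,(a + a^+)\, e^{-i\frac{E_n}{\hbar}t}\,|n\rangle \\
&= \sqrt{\frac{\hbar}{2m\omega_0}}\,\langle n|\,(a + a^+)\,|n\rangle \\
&= \sqrt{\frac{\hbar}{2m\omega_0}}\,\langle n|\left(\sqrt{n}\,|n-1\rangle + \sqrt{n+1}\,|n+1\rangle\right) = 0
\end{aligned}$$
In the state $|\alpha,t\rangle = \sum_m b_m e^{-iE_m t/\hbar}\,|m\rangle$, on the other hand, we have
$$\langle\alpha,t|\, x\, |\alpha,t\rangle = \sqrt{\frac{\hbar}{2m\omega_0}} \sum_m\sum_k b_m^* b_k\, e^{i\frac{(E_m - E_k)}{\hbar}t}\,\langle m|\,(a + a^+)\,|k\rangle \tag{14.172}$$
Now
$$\begin{aligned}
\langle m|\,(a + a^+)\,|k\rangle &= \langle m|\left(\sqrt{k}\,|k-1\rangle + \sqrt{k+1}\,|k+1\rangle\right) \\
&= \sqrt{k}\,\delta_{m,k-1} + \sqrt{k+1}\,\delta_{m,k+1}
\end{aligned} \tag{14.173}$$
so that
$$\begin{aligned}
\langle\alpha,t|\, x\, |\alpha,t\rangle &= \sqrt{\frac{\hbar}{2m\omega_0}}\left(\sum_{k=1}^{\infty} b_{k-1}^* b_k \sqrt{k}\, e^{i\frac{(E_{k-1} - E_k)}{\hbar}t} + \sum_{k=0}^{\infty} b_{k+1}^* b_k \sqrt{k+1}\, e^{i\frac{(E_{k+1} - E_k)}{\hbar}t}\right) \\
&= \sqrt{\frac{\hbar}{2m\omega_0}}\left(\sum_{k=1}^{\infty} b_{k-1}^* b_k \sqrt{k}\, e^{-i\omega_0 t} + \sum_{k=0}^{\infty} b_{k+1}^* b_k \sqrt{k+1}\, e^{i\omega_0 t}\right) \\
&= \sqrt{\frac{\hbar}{2m\omega_0}}\left(\sum_{k=0}^{\infty} b_k^* b_{k+1} \sqrt{k+1}\, e^{-i\omega_0 t} + \sum_{k=0}^{\infty} b_{k+1}^* b_k \sqrt{k+1}\, e^{i\omega_0 t}\right) \\
&= \sqrt{\frac{\hbar}{2m\omega_0}}\; b_0^2\left(\sum_{k=0}^{\infty} \frac{\alpha^{*k}\alpha^{k+1}}{\sqrt{(k+1)!\,k!}} \sqrt{k+1}\, e^{-i\omega_0 t} + \sum_{k=0}^{\infty} \frac{\alpha^{*(k+1)}\alpha^{k}}{\sqrt{(k+1)!\,k!}} \sqrt{k+1}\, e^{i\omega_0 t}\right) \\
&= \sqrt{\frac{\hbar}{2m\omega_0}}\; b_0^2 \sum_k \frac{1}{k!}\,|\alpha|^{2k}\left(\alpha\, e^{-i\omega_0 t} + \alpha^* e^{i\omega_0 t}\right)
\end{aligned} \tag{14.174}$$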
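Now using $\alpha = |\alpha|\, e^{i\phi}$ we get
$$\begin{aligned}
\langle\alpha,t|\, x\, |\alpha,t\rangle &= \sqrt{\frac{\hbar}{2m\omega_0}}\; b_0^2\; 2|\alpha| \sum_k \frac{|\alpha|^{2k}}{k!}\,\text{Real}\left(e^{i\phi} e^{-i\omega_0 t}\right) \\
&= 2x_0\,|\alpha|\cos(\omega_0 t - \phi)\left(b_0^2 \sum_k \frac{|\alpha|^{2k}}{k!}\right) \\
&= 2x_0\,|\alpha|\cos(\omega_0 t - \phi), \quad x_0 = \sqrt{\frac{\hbar}{2m\omega_0}}
\end{aligned} \tag{14.175}$$
The expectation value in the state $|\alpha\rangle$ behaves like that of a classical oscillator.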
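The classical-oscillator behavior of the expectation value can be verified by summing the series directly. The sketch below is an illustration under assumed conventions: position is measured in units of $x_0 = \sqrt{\hbar/2m\omega_0}$, the number basis is truncated at `nmax`, and the function name is chosen here.

```python
import cmath
import math

def x_expectation(alpha, t, omega0=1.0, nmax=80):
    """<x>(t) in units of x0 = sqrt(hbar / (2 m omega0)), from the truncated number-state sum."""
    norm = math.exp(-abs(alpha) ** 2 / 2)
    b = []
    term = complex(1.0)  # alpha^n / sqrt(n!)
    for n in range(nmax):
        b.append(norm * term)
        term *= alpha / math.sqrt(n + 1)
    total = 0.0
    for k in range(nmax - 1):
        # <k| a |k+1> = sqrt(k+1); the phase comes from E_k - E_{k+1} = -hbar*omega0
        amp = b[k].conjugate() * b[k + 1] * math.sqrt(k + 1) * cmath.exp(-1j * omega0 * t)
        total += 2 * amp.real  # adds the a and a^+ contributions together
    return total

alpha = 2.0 * cmath.exp(0.3j)  # |alpha| = 2, phase phi = 0.3
t = 1.7
print(x_expectation(alpha, t), 2 * abs(alpha) * math.cos(t - 0.3))
```

The two printed numbers agree to numerical precision, which is the statement $\langle x\rangle = 2x_0|\alpha|\cos(\omega_0 t - \phi)$.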
Before proceeding with the discussion, we will repeat the derivation using an
alternate but very powerful technique.
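If we choose $|0\rangle$ to be the ground state of the oscillator, then we have for the
corresponding displaced ground-state
$$|\lambda\rangle = e^{\sqrt{\frac{m\omega}{2\hbar}}\,\lambda\,(a^+ - a)}\,|0\rangle \tag{14.178}$$
By Glauber's theorem
$$e^{(A+B)} = e^A e^B e^{-\frac{1}{2}[A,B]} \tag{14.179}$$
we have
$$\begin{aligned}
e^{\sqrt{\frac{m\omega}{2\hbar}}\,\lambda\,(a^+ - a)} &= e^{\sqrt{\frac{m\omega}{2\hbar}}\,\lambda a^+}\, e^{-\sqrt{\frac{m\omega}{2\hbar}}\,\lambda a}\, e^{\frac{1}{2}\frac{m\omega}{2\hbar}[a^+,a]\lambda^2} \\
&= e^{\sqrt{\frac{m\omega}{2\hbar}}\,\lambda a^+}\, e^{-\sqrt{\frac{m\omega}{2\hbar}}\,\lambda a}\, e^{-\frac{1}{4}\frac{m\omega}{\hbar}\lambda^2}
\end{aligned} \tag{14.180}$$
and thus
$$|\lambda\rangle = e^{\sqrt{\frac{m\omega}{2\hbar}}\,\lambda a^+}\, e^{-\sqrt{\frac{m\omega}{2\hbar}}\,\lambda a}\, e^{-\frac{1}{4}\frac{m\omega}{\hbar}\lambda^2}\,|0\rangle \tag{14.181}$$
Now
$$e^{-\sqrt{\frac{m\omega}{2\hbar}}\,\lambda a}\,|0\rangle = \left(I - \sqrt{\frac{m\omega}{2\hbar}}\,\lambda a + \frac{1}{2}\left(\sqrt{\frac{m\omega}{2\hbar}}\,\lambda\right)^2 a^2 - \dots\right)|0\rangle = |0\rangle$$
using $a\,|0\rangle = 0$. Similarly, using $(a^+)^n\,|0\rangle = \sqrt{n!}\,|n\rangle$ we have
$$\begin{aligned}
e^{\sqrt{\frac{m\omega}{2\hbar}}\,\lambda a^+}\,|0\rangle &= \left(I + \sqrt{\frac{m\omega}{2\hbar}}\,\lambda a^+ + \frac{1}{2}\left(\sqrt{\frac{m\omega}{2\hbar}}\,\lambda\right)^2 (a^+)^2 + \dots\right)|0\rangle \\
&= |0\rangle + \sqrt{\frac{m\omega}{2\hbar}}\,\lambda\,|1\rangle + \frac{1}{\sqrt{2}}\left(\sqrt{\frac{m\omega}{2\hbar}}\,\lambda\right)^2 |2\rangle + \dots \\
&= \sum_{n=0}^{\infty} \frac{\left(\sqrt{\frac{m\omega}{2\hbar}}\,\lambda\right)^n}{\sqrt{n!}}\,|n\rangle
\end{aligned} \tag{14.182}$$
or
$$|\lambda\rangle = e^{-\frac{1}{4}\frac{m\omega}{\hbar}\lambda^2} \sum_{n=0}^{\infty} \frac{\left(\sqrt{\frac{m\omega}{2\hbar}}\,\lambda\right)^n}{\sqrt{n!}}\,|n\rangle \tag{14.183}$$
Thus,
$$|\lambda\rangle = \sum_{n=0}^{\infty} b_n\,|n\rangle \tag{14.184}$$
where
$$b_n = \frac{e^{-\frac{N}{2}}\, N^{\frac{n}{2}}}{\sqrt{n!}}, \quad \frac{N}{2} = \frac{m\omega\lambda^2}{4\hbar} \tag{14.185}$$
or
$$|b_n|^2 = \frac{e^{-N} N^n}{n!} \tag{14.186}$$
which is a Poisson distribution. Thus, we obtain the coherent states once again.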
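$$a\,|\alpha\rangle = \alpha\,|\alpha\rangle \;\Rightarrow\; \langle\alpha|\, a^+ = \alpha^*\,\langle\alpha| \tag{14.187}$$
so that
$$\langle E\rangle = \langle\alpha|\, H\, |\alpha\rangle = \hbar\omega\,\langle\alpha|\,(N + 1/2)\,|\alpha\rangle = \hbar\omega\left(|\alpha|^2 + 1/2\right) \tag{14.188}$$
$$\langle x\rangle = \sqrt{\frac{\hbar}{2m\omega}}\,\langle\alpha|\,(a + a^+)\,|\alpha\rangle = \sqrt{\frac{\hbar}{2m\omega}}\,(\alpha + \alpha^*) \tag{14.189}$$
$$\langle p\rangle = -i\sqrt{\frac{m\hbar\omega}{2}}\,\langle\alpha|\,(a - a^+)\,|\alpha\rangle = -i\sqrt{\frac{m\hbar\omega}{2}}\,(\alpha - \alpha^*) \tag{14.190}$$
$$\begin{aligned}
(\Delta x)^2 &= \frac{\hbar}{2m\omega}\,\langle\alpha|\,(a + a^+)^2\,|\alpha\rangle - \langle x\rangle^2 \\
&= \frac{\hbar}{2m\omega}\left[(\alpha + \alpha^*)^2 + 1\right] - \frac{\hbar}{2m\omega}\,(\alpha + \alpha^*)^2 \\
&= \frac{\hbar}{2m\omega} \;\Rightarrow\; \Delta x = \sqrt{\frac{\hbar}{2m\omega}}
\end{aligned} \tag{14.191}$$
$$\begin{aligned}
(\Delta p)^2 &= -\frac{m\hbar\omega}{2}\,\langle\alpha|\,(a - a^+)^2\,|\alpha\rangle - \langle p\rangle^2 \\
&= -\frac{m\hbar\omega}{2}\left[(\alpha - \alpha^*)^2 - 1\right] + \frac{m\hbar\omega}{2}\,(\alpha - \alpha^*)^2 \\
&= \frac{m\hbar\omega}{2} \;\Rightarrow\; \Delta p = \sqrt{\frac{m\hbar\omega}{2}}
\end{aligned} \tag{14.192}$$
$$\Delta x\,\Delta p = \frac{\hbar}{2} \tag{14.193}$$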
We can find the wave functions corresponding to |i using the earlier method.
We have in the position representation:
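$$\begin{aligned}
\langle X|\, a\, |\alpha\rangle = \alpha\,\langle X\,|\,\alpha\rangle &= \frac{1}{\sqrt{2}}\,\langle X|\,(X + iP)\,|\alpha\rangle \\
&= \frac{1}{\sqrt{2}}\, X\,\langle X\,|\,\alpha\rangle + \frac{i}{\sqrt{2}}\left(-i\frac{\partial}{\partial X}\right)\langle X\,|\,\alpha\rangle
\end{aligned} \tag{14.194}$$
$$\frac{1}{\sqrt{2}}\left(X + \frac{\partial}{\partial X}\right)\langle X\,|\,\alpha\rangle = \alpha\,\langle X\,|\,\alpha\rangle$$
$$\langle X\,|\,\alpha\rangle = Ae^{-(X - \alpha\sqrt{2})^2/2} = \psi_\alpha(X) \tag{14.195}$$
and in the momentum representation:
$$\begin{aligned}
\langle P|\, a\, |\alpha\rangle = \alpha\,\langle P\,|\,\alpha\rangle &= \frac{1}{\sqrt{2}}\,\langle P|\,(X + iP)\,|\alpha\rangle \\
&= \frac{1}{\sqrt{2}}\left(i\frac{\partial}{\partial P}\right)\langle P\,|\,\alpha\rangle + \frac{i}{\sqrt{2}}\, P\,\langle P\,|\,\alpha\rangle
\end{aligned} \tag{14.196}$$
$$\frac{i}{\sqrt{2}}\left(P + \frac{\partial}{\partial P}\right)\langle P\,|\,\alpha\rangle = \alpha\,\langle P\,|\,\alpha\rangle$$
$$\langle P\,|\,\alpha\rangle = A' e^{-(P + i\alpha\sqrt{2})^2/2} = \phi_\alpha(P) \tag{14.197}$$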
Finally, we have
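$$\begin{aligned}
\langle\alpha,t|\, x\, |\alpha,t\rangle &= \sqrt{\frac{\hbar}{2m\omega_0}}\; b_0^2 \sum_k \frac{1}{k!}\,|\alpha|^{2k}\left(\alpha\, e^{-i\omega_0 t} + \alpha^* e^{i\omega_0 t}\right) \\
&= 2\sqrt{\frac{\hbar}{2m\omega_0}}\;|\alpha|\cos(\omega_0 t - \phi)\left(b_0^2 \sum_k \frac{|\alpha|^{2k}}{k!}\right) \\
&= x_0\cos(\omega_0 t - \phi), \quad x_0 = |\alpha|\sqrt{\frac{2\hbar}{m\omega_0}}
\end{aligned} \tag{14.199}$$
and
$$\langle\alpha,t|\, p\, |\alpha,t\rangle = -p_0\sin(\omega_0 t - \phi), \quad p_0 = |\alpha|\sqrt{2m\hbar\omega_0} \tag{14.200}$$
In addition, we have (for $|\alpha| \gg 1$)
$$\frac{\Delta x}{x_0} = \frac{1}{2|\alpha|} \ll 1, \quad \frac{\Delta p}{p_0} = \frac{1}{2|\alpha|} \ll 1 \tag{14.201}$$
This says that the position and momentum of the oscillator are quite accurately
defined at any time, since their relative uncertainties are small. Hence the name
quasi-classical state.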
Let us look at some numbers. We consider a pendulum of length 1 meter and
of mass 1 gram and assume that the state of this pendulum can be described
by a quasi-classical state. At time t = 0 we assume that the pendulum is
at hx(0)i = 1 micron from its classical equilibrium position, with zero mean
velocity.
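$$\frac{\Delta x}{x_0} = \frac{1}{2|\alpha|} = \frac{1}{2\alpha(0)} = 1.3\times 10^{-10} \tag{14.203}$$
$$T = \text{period} = \frac{2\pi}{\omega} \tag{14.204}$$
$$\alpha(T/4) = \alpha(0)\, e^{-i\omega T/4} = \alpha(0)\, e^{-i\pi/2} = -i\alpha(0) = -3.9i\times 10^{9} \tag{14.205}$$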
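The pendulum numbers can be reproduced from the stated length, mass and initial displacement. This is a rough sketch: it assumes $g = 9.81$ m/s$^2$, a value of $\hbar$ supplied here, and the relation $\langle x(0)\rangle = 2\sqrt{\hbar/2m\omega}\;\alpha(0)$ for real $\alpha(0)$ (zero mean velocity); the function name is illustrative.

```python
import math

HBAR = 1.0546e-34  # J s (assumed value)
G_ACCEL = 9.81     # m/s^2 (assumed value)

def pendulum_numbers(length=1.0, mass=1e-3, x_init=1e-6):
    """Return (omega, alpha0, relative_uncertainty, period) for a quasi-classical pendulum."""
    omega = math.sqrt(G_ACCEL / length)          # small-oscillation angular frequency
    x_zero_point = math.sqrt(HBAR / (2 * mass * omega))
    alpha0 = x_init / (2 * x_zero_point)         # from <x(0)> = 2 * x_zero_point * alpha(0)
    relative_uncertainty = 1 / (2 * alpha0)      # Delta x divided by the oscillation amplitude
    period = 2 * math.pi / omega
    return omega, alpha0, relative_uncertainty, period

omega, alpha0, rel, period = pendulum_numbers()
print(f"alpha(0) = {alpha0:.2e}, relative uncertainty = {rel:.1e}, T = {period:.2f} s")
```

The enormous value of $\alpha(0)$ is what makes the relative uncertainties negligible for a macroscopic pendulum.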
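The eigenvectors of W are $\{|n\rangle\}$ with $W\,|n\rangle = \hbar g n^2\,|n\rangle$. This implies that for
$|\psi(0)\rangle = |\alpha\rangle$
$$|\psi(T)\rangle = e^{-\frac{1}{2}|\alpha|^2} \sum_{n=0}^{\infty} \frac{\alpha^n}{\sqrt{n!}}\, e^{-ign^2 T}\,|n\rangle \tag{14.207}$$
$$\begin{aligned}
|\psi(T = \pi/2g)\rangle &= e^{-\frac{1}{2}|\alpha|^2} \sum_{n=0}^{\infty} \frac{\alpha^n}{\sqrt{n!}}\, e^{-in^2\pi/2}\,|n\rangle \\
&= e^{-\frac{1}{2}|\alpha|^2} \sum_{n=0}^{\infty} \frac{\alpha^n}{\sqrt{n!}}\,\frac{1}{2}\left[1 - i + (1 + i)(-1)^n\right]|n\rangle \\
&= e^{-\frac{1}{2}|\alpha|^2} \sum_{n=0}^{\infty} \frac{\alpha^n}{\sqrt{n!}}\,\frac{1}{\sqrt{2}}\left[e^{-i\pi/4} + e^{i\pi/4}(-1)^n\right]|n\rangle \\
&= \frac{1}{\sqrt{2}}\left[e^{-i\pi/4}\,|\alpha\rangle + e^{i\pi/4}\,|-\alpha\rangle\right]
\end{aligned} \tag{14.208}$$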
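Now, suppose that $\alpha$ is pure imaginary, that is, $\alpha = i\rho$. In this case, in the state
$|\alpha\rangle$, the oscillator has a zero mean position and a positive velocity.
$$\langle x\rangle = \sqrt{\frac{\hbar}{2m\omega}}\,(\alpha + \alpha^*) = 0$$
$$\langle p\rangle = -i\sqrt{\frac{m\hbar\omega}{2}}\,(\alpha - \alpha^*) = \rho\sqrt{2m\hbar\omega}$$
Similarly, in the state $|-\alpha\rangle$, the oscillator also has a zero mean position, but a
negative velocity.
$$Pr(X) \propto e^{-X^2}\cos^2\left(\sqrt{2}\,\rho X - \frac{\pi}{4}\right) \tag{14.209}$$
$$\begin{aligned}
Pr(P) &\propto \left|e^{-i\pi/4}\,\langle P\,|\,\alpha\rangle + e^{i\pi/4}\,\langle P\,|\,{-\alpha}\rangle\right|^2 \\
&\propto \left|e^{-i\pi/4}\, e^{-(P - \rho\sqrt{2})^2/2} + e^{i\pi/4}\, e^{-(P + \rho\sqrt{2})^2/2}\right|^2 \\
&\approx e^{-(P - \rho\sqrt{2})^2} + e^{-(P + \rho\sqrt{2})^2}
\end{aligned} \tag{14.210}$$
where in the last expression we have used the fact that, for $\rho \gg 1$, the two
Gaussians centered at $\rho\sqrt{2}$ and $-\rho\sqrt{2}$ have a negligible overlap.
These probability distributions are plotted in Figure 14.10 and Figure 14.11 below
for $\alpha = 5i$.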
Suppose that a physicist (Alice) prepares N independent systems all in the state
(14.208) and measures the momentum of each of these systems. Suppose the
measuring apparatus has a resolution $\delta p$ such that:
$$\sqrt{m\hbar\omega} \ll \delta p \ll p_0 \tag{14.211}$$
The state (14.208) represents the quantum superposition of two states which
are macroscopically different, and therefore leads to the paradoxical situations
mentioned earlier.
Another physicist (Bob) claims that the measurements done by Alice have not
been performed on N quantum systems in the state (14.208), but that Alice is
actually dealing with a nonparadoxical statistical mixture, that is, half of the N
systems are in the state $|\alpha\rangle$ and the other half in the state $|-\alpha\rangle$.
Assuming that this is true, the statistical mixture of Bob leads (after N momentum
measurements) to the same momentum distribution as that measured
by Alice: the N/2 oscillators in the state $|\alpha\rangle$ all lead to a mean momentum $+p_0$
and the N/2 oscillators in the state $|-\alpha\rangle$ all lead to a mean momentum $-p_0$.
Up to this point, there is therefore no difference and no paradoxical behavior
related to the quantum superposition (14.208).
In order to settle the matter, Alice now measures the position of each of the
N independent systems, all prepared in the state (14.208). Assuming that the
resolution $\delta x$ of the measuring apparatus is such that
$$\delta x \ll \frac{1}{|\alpha|}\sqrt{\frac{\hbar}{m\omega}}, \quad\text{i.e.,}\quad \delta X \ll \frac{1}{|\alpha|} \tag{14.212}$$
We continue with the assumption that Bob is dealing with a statistical mixture.
If Bob performs a position measurement on the N/2 systems in the state $|\alpha\rangle$,
he will find a Gaussian distribution corresponding to the probability law
$$Pr(X) \propto |\langle X\,|\,\alpha\rangle|^2 \propto e^{-X^2} \tag{14.215}$$
He will find the same distribution for the N/2 systems in the state $|-\alpha\rangle$. The sum
of his results will be a Gaussian distribution, which is quite different (see Figure
14.11) from the result expected by Alice.
From earlier
$$E(t) = \hbar\omega\left(|\alpha(t)|^2 + 1/2\right) = \hbar\omega\left(|\alpha_0|^2 e^{-2\gamma t} + 1/2\right) \tag{14.219}$$
The energy decreases with time. After a time much longer than $1/\gamma$, the oscillator
is in its ground state. This dissipation model corresponds to a zero
temperature environment. The mean energy acquired by the environment is
$$E(0) - E(t) = \hbar\omega\,|\alpha_0|^2\left(1 - e^{-2\gamma t}\right) \approx 2\hbar\omega\,|\alpha_0|^2\,\gamma t, \quad 2\gamma t \ll 1 \tag{14.220}$$
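For initial states of the Schrödinger cat type for the oscillator, the state vector
of the total system is, at t = 0,
$$|\Phi(0)\rangle = \frac{1}{\sqrt{2}}\left(e^{-i\pi/4}\,|\alpha_0\rangle + e^{i\pi/4}\,|-\alpha_0\rangle\right)\otimes|\chi_e(0)\rangle \tag{14.221}$$
and, at a later time t,
$$|\Phi(t)\rangle = \frac{1}{\sqrt{2}}\left(e^{-i\pi/4}\,|\alpha_1\rangle\otimes|\chi_e^{(+)}(t)\rangle + e^{i\pi/4}\,|-\alpha_1\rangle\otimes|\chi_e^{(-)}(t)\rangle\right) \tag{14.222}$$
The probability distribution of the position is then
$$Pr(x) = \frac{1}{2}\,|\langle x\,|\,\alpha_1\rangle|^2 + \frac{1}{2}\,|\langle x\,|\,{-\alpha_1}\rangle|^2 + \text{Real}\left(i\,\langle x\,|\,\alpha_1\rangle^*\,\langle x\,|\,{-\alpha_1}\rangle\,\langle\chi_e^{(+)}(t)\,|\,\chi_e^{(-)}(t)\rangle\right) \tag{14.223}$$
Let $\eta = \langle\chi_e^{(+)}(t)\,|\,\chi_e^{(-)}(t)\rangle$. We then have $0 \le \eta \le 1$, $\eta$ real.
This says that the probability distribution of the position keeps its Gaussian
envelope, but the contrast of the oscillations (cross term) is reduced by a factor
$\eta$.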
1 2 1 2
P r(p) = |hp | 1 i| + |hp | 1 i| + Real i hp | 1 i hp | 1 i (14.224)
2 2
Since the overlap of the two Gaussians hp | 1 i and hp | 1 i is negligible for
|1 | 1, the crossed term, which is proportional to does not contribute
significantly. We recover two peaks centered at |1 | 2m~. The distinction
between a quantum superposition and a statistical mixture can be made by
position measurements. The quantum superposition leads to a modulation of
spatial period
2 1/2
~ /(2m2 ) (14.225)
with a Gaussian envelope, whereas only the Gaussian is observed for a statistical
mixture.
In order to see this modulation, the parameter must not be too small, say
1/10.
For times shorter than 1/, the energy of the first oscillator is
2
E(t) = E(0) 2t |0 | ~ (14.228)
The total energy is conserved: the energy transferred during the time t is
2 2
E(t) = 2t |0 | ~ = ~ |((t)| (14.230)
Conclusion
Even for a system as well protected from the environment as we have assumed
for the pendulum, the quantum superpositions of macroscopic states are unob-
servable. After a very short time, all measurements one can make on a system
initially prepared in such a state coincide with those made on a statistical mix-
ture. It is therefore not possible, at present, to observe the effects related to the
paradoxical character of a macroscopic quantum superposition. However, it is
quite possible to observe mesoscopic kittens, for systems which have a limited
number of degrees of freedom and are well isolated.
We consider a beam of neutrons, which are particles with charge zero and spin
1/2, propagating along the x-axis with velocity v. We will treat the motion of
the neutrons classically as uniform linear motion. Only the evolution of their
spin states will be treated quantum mechanically.
The magnetic energy levels of the neutron in the presence of the field $\vec{B}_0$ are
$E_\pm = \mp\gamma_n\hbar B_0/2 = \pm\hbar\omega_0/2$ where $\omega_0 = -\gamma_n B_0$.
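which gives in the $|n:\pm\rangle$ basis
$$H = \frac{\hbar}{2}\begin{pmatrix} \omega_0 & \omega_1 e^{-i\omega t} \\ \omega_1 e^{i\omega t} & -\omega_0 \end{pmatrix} \tag{14.233}$$
Writing $|\psi_n(t)\rangle = \alpha_+(t)\,|n:+\rangle + \alpha_-(t)\,|n:-\rangle$, the Schrödinger equation reads
$$\begin{aligned}
H\,|\psi_n(t)\rangle &= \frac{\hbar}{2}\begin{pmatrix} \omega_0 & \omega_1 e^{-i\omega t} \\ \omega_1 e^{i\omega t} & -\omega_0 \end{pmatrix}\left[\alpha_+\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \alpha_-\begin{pmatrix} 0 \\ 1 \end{pmatrix}\right] \\
&= i\hbar\frac{d}{dt}\,|\psi_n(t)\rangle = i\hbar\frac{d\alpha_+}{dt}\begin{pmatrix} 1 \\ 0 \end{pmatrix} + i\hbar\frac{d\alpha_-}{dt}\begin{pmatrix} 0 \\ 1 \end{pmatrix}
\end{aligned} \tag{14.234}$$
or
$$i\hbar\frac{d\alpha_+}{dt} = \frac{\hbar}{2}\,\omega_0\,\alpha_+ + \frac{\hbar}{2}\,\omega_1 e^{-i\omega t}\,\alpha_- \tag{14.235}$$
$$i\hbar\frac{d\alpha_-}{dt} = \frac{\hbar}{2}\,\omega_1 e^{i\omega t}\,\alpha_+ - \frac{\hbar}{2}\,\omega_0\,\alpha_- \tag{14.236}$$
or
$$i\frac{d\alpha_+}{dt} = \frac{1}{2}\,\omega_0\,\alpha_+ + \frac{1}{2}\,\omega_1 e^{-i\omega t}\,\alpha_- \tag{14.237}$$
$$i\frac{d\alpha_-}{dt} = \frac{1}{2}\,\omega_1 e^{i\omega t}\,\alpha_+ - \frac{1}{2}\,\omega_0\,\alpha_- \tag{14.238}$$
Defining
$$\beta_\pm(t) = \alpha_\pm(t)\, e^{\pm i\omega(t - t_0)/2} \tag{14.239}$$
we get
$$i\frac{d\beta_+}{dt} = \frac{\omega_0 - \omega}{2}\,\beta_+ + \frac{\omega_1}{2}\, e^{-i\omega t_0}\,\beta_- \tag{14.240}$$
$$i\frac{d\beta_-}{dt} = \frac{\omega - \omega_0}{2}\,\beta_- + \frac{\omega_1}{2}\, e^{i\omega t_0}\,\beta_+ \tag{14.241}$$
which has constant coefficients. Neglecting the terms proportional to $(\omega_0 - \omega)$,
i.e., near resonance, these reduce to
$$i\frac{d\beta_+}{dt} = \frac{\omega_1}{2}\, e^{-i\omega t_0}\,\beta_- \tag{14.242}$$
$$i\frac{d\beta_-}{dt} = \frac{\omega_1}{2}\, e^{i\omega t_0}\,\beta_+ \tag{14.243}$$
whose solution is
$$\beta_\pm(t) = \beta_\pm(t_0)\cos\frac{\omega_1(t - t_0)}{2} - i\,\beta_\mp(t_0)\, e^{\mp i\omega t_0}\sin\frac{\omega_1(t - t_0)}{2} \tag{14.244}$$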
Defining
where
ei cos iei sin
U (t0 , t1 ) = (14.251)
iei sin ei cos
is the time evolution matrix.
The object A is a detecting atom described later. The same oscillating field
~ 1 (t), is applied in both cavities. The magnitude B1 of this field is applied so
B
~ 0 is applied throughout
as to satisfy the condition = /4. The constant field B
1135
the entire experimental setup. At the end of the setup, one measures the number
of outgoing neutrons which have flipped their spin and are in the final state
|n : +i. This is done for several values of near = 0 .
We then have
+ (t00 )
i
iei ei
e 0 + (t1 ) 1
= = (14.257)
(t00 ) 0 ei (t1 ) 2 ei ei
where
0 T
= (14.258)
2
so that the spin state at t 00 is
+ (t00 ) iei ei0 T /2
1
= (14.259)
(t00 ) 2 ei ei0 T
Now let t 01 be the time when the neutron leaves the second cavity with t01 t00 =
t1 t0 = D/v. Now 0 = (t01 + t00 )/2 is given by
t00 = t1 + T , t01 = 2t1 t0 + T
0
= (2t1 t0 + T + t1 + T )/2 = (3t1 + 2T t0 )/2
so that (for the second cavity)
i0 0
cos 0 iei sin 0
e
U 0 = U (t00 , t01 ) = i 0 0 (14.260)
ie sin 0 ei cos 0
1136
where
0 = = 1 (t1 t0 )/2 , 0 = = 1 (t1 t0 )/2 (14.261)
so only the parameter changes into 0 .
Thus the probability amplitude for detecting the neutron in state + after the
second cavity is obtained by
1. Applying U 0 to the vector
+ (t00 ) iei ei0 T /2
1
= (14.262)
(t00 ) 2 ei ei0 T
1137
Figure 14.13: Experimental Results
For
$$P(T) = \frac{1}{\tau\sqrt{2\pi}}\, e^{-(T - T_0)^2/2\tau^2} \tag{14.267}$$
we get
$$\left\langle\cos^2\frac{(\omega - \omega_0)T}{2}\right\rangle = \frac{1}{2} + \frac{1}{2}\, e^{-(\omega - \omega_0)^2\tau^2/2}\cos\left((\omega - \omega_0)T_0\right) \tag{14.268}$$
This form agrees with the observed variation with frequency $\omega$ in Figure 14.13
of the experimental signal. The central maximum which is located at $\omega/2\pi =$
748.8 kHz corresponds to $\omega = \omega_0$. For that value a constructive interference
appears whatever the neutron velocity. The lateral maxima and minima are less
peaked, however, since the position of the lateral peak is velocity dependent.
The first two lateral maxima correspond to $(\omega - \omega_0)T = \pm 2\pi$. Their amplitude
is reduced compared to the central peak by the exponential factor.
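The averaged signal can be checked by integrating $\cos^2[(\omega - \omega_0)T/2]$ against the Gaussian distribution of transit times. The sketch below is illustrative only: the sample values of the detuning, $T_0$ and $\tau$ are arbitrary, and the trapezoidal grid (`steps`, `width`) is an assumption of this example.

```python
import math

def averaged_signal(detuning, t0, tau, steps=20001, width=8.0):
    """Trapezoidal average of cos^2(detuning*T/2) over a Gaussian in T (mean t0, std tau)."""
    lo, hi = t0 - width * tau, t0 + width * tau
    h = (hi - lo) / (steps - 1)
    total = 0.0
    for i in range(steps):
        T = lo + i * h
        weight = math.exp(-(T - t0) ** 2 / (2 * tau ** 2)) / (tau * math.sqrt(2 * math.pi))
        value = weight * math.cos(detuning * T / 2) ** 2
        total += value * (0.5 if i in (0, steps - 1) else 1.0)
    return total * h

def closed_form(detuning, t0, tau):
    """1/2 + 1/2 exp(-detuning^2 tau^2 / 2) cos(detuning * t0)."""
    return 0.5 + 0.5 * math.exp(-detuning ** 2 * tau ** 2 / 2) * math.cos(detuning * t0)

print(averaged_signal(2.0, 3.0, 0.4), closed_form(2.0, 3.0, 0.4))
```

At zero detuning both expressions equal 1 regardless of $\tau$, which is the velocity-independent central maximum; away from resonance the exponential factor damps the fringes.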
Suppose that we insert between the two cavities a device which can measure the
z component of the neutron spin (how this works will be discussed shortly). We
define $P_{++}$ as the probability of detecting a neutron in the + state between the
two cavities and in the + state when it leaves the second cavity. The probability
$P_{++}$ is the product of two probabilities, namely, the probability of finding the
neutron in the state + when leaving the first cavity (p = 1/2) and, given
that it is in the + state, the probability of finding it in the + state when it leaves
the second cavity (p = 1/2). This gives $P_{++} = 1/4$. Similarly, $P_{-+} = 1/4$. The sum
$P_{++} + P_{-+} = 1/2$ does not display any interference, since one has measured
in which cavity the neutron spin has flipped. This is very similar to the electron
double slit interference experiment if one measures through which slit the
electron passes.
and
1
|a : yi = ((1 i) |a : +xi + (1 i) |a : xi) (14.271)
2
We assume that the neutron-atom interaction does not affect the neutron's
trajectory. We represent the interaction between the neutron and the atom by
a very simple model. This interaction is assumed to last a finite time $\tau$ during
which the neutron-atom interaction Hamiltonian has the form
\[
V = \frac{2A}{\hbar}\, S_{nz} \otimes S_{ax} \tag{14.272}
\]
where $A$ is a constant. We neglect the action of any external field, including $\vec{B}_0$,
during this time, i.e., we assume the atom-neutron interaction dominates for a
short period of time.
The operators $S_{nz}$ and $S_{ax}$ commute since they act on two different Hilbert
spaces. Therefore,
\[
\left[ S_{nz}, V \right] = 0 \tag{14.273}
\]
The common eigenvectors of Snz and V and the corresponding eigenvalues are
The operators Snz and V form a complete set of commuting operators as far as
the spin variables are concerned.
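From now on we assume that $A\tau = \pi/2$. Suppose that the initial state of the
system is
\[
|\psi(0)\rangle = |n : +\rangle \otimes |a : +y\rangle \tag{14.276}
\]
Expanding in terms of energy eigenstates, we get
\[
|\psi(0)\rangle = |n : +\rangle \otimes |a : +y\rangle
= \frac{1}{2}\, |n : +\rangle \otimes \Big( (1 + i)\,|a : +x\rangle + (1 - i)\,|a : -x\rangle \Big) \tag{14.277}
\]
and
\[
|\psi(t)\rangle = \frac{1}{2}\, |n : +\rangle \otimes \Big( (1 + i)\,|a : +x\rangle\, e^{-iAt/2} + (1 - i)\,|a : -x\rangle\, e^{+iAt/2} \Big)
\]
so that
\[
|\psi(\tau)\rangle = \frac{1}{2}\, |n : +\rangle \otimes \Big( (1 + i)\,|a : +x\rangle\, e^{-iA\tau/2} + (1 - i)\,|a : -x\rangle\, e^{+iA\tau/2} \Big)
\]
which for $A\tau = \pi/2$ gives
\begin{align*}
|\psi(\tau)\rangle &= \frac{1}{2}\, |n : +\rangle \otimes \Big( (1 + i)\,|a : +x\rangle\, e^{-i\pi/4} + (1 - i)\,|a : -x\rangle\, e^{+i\pi/4} \Big) \\
&= \frac{1}{2}\, |n : +\rangle \otimes \Big( \frac{1}{\sqrt{2}}(1 + i)(1 - i)\,|a : +x\rangle + \frac{1}{\sqrt{2}}(1 - i)(1 + i)\,|a : -x\rangle \Big) \\
&= \frac{1}{\sqrt{2}}\, |n : +\rangle \otimes \big( |a : +x\rangle + |a : -x\rangle \big) = |n : +\rangle \otimes |a : +\rangle \tag{14.278}
\end{align*}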
Similarly, if $|\psi(0)\rangle = |n : -\rangle \otimes |a : +y\rangle$ then, up to an overall phase factor $i$,
$|\psi(\tau)\rangle = |n : -\rangle \otimes |a : -\rangle$.
Physically, this means that the neutron's spin state does not change since it
is an eigenstate of $V$, while the atom's spin precesses around the x-axis with
angular frequency $A$. At time $\tau = \pi/2A$ it lies along the z-axis.
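The statements above can be checked numerically. The following sketch builds $V$ of (14.272) as a 4x4 matrix (neutron factor first, atom factor second), verifies that it commutes with $S_{nz}$, and evolves the two initial states for a time $\tau$ with $A\tau = \pi/2$. Units with $\hbar = 1$ and $A = 1$ are an arbitrary choice made here for illustration.

```python
import numpy as np

hbar = 1.0
A = 1.0                       # coupling constant, arbitrary units
tau = np.pi / (2 * A)         # interaction time chosen so that A*tau = pi/2

sx = 0.5 * hbar * np.array([[0, 1], [1, 0]], dtype=complex)
sz = 0.5 * hbar * np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

Snz = np.kron(sz, I2)         # neutron S_z acts on the first factor
Sax = np.kron(I2, sx)         # atom S_x acts on the second factor
V = (2 * A / hbar) * Snz @ Sax

commutator = Snz @ V - V @ Snz        # should vanish
energies = np.linalg.eigvalsh(V)      # eigenvalues of V: +/- A*hbar/2, each twice

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
plus_y = (up + 1j * down) / np.sqrt(2)

# U = exp(-i V tau / hbar), built from the eigen-decomposition of V
w, P = np.linalg.eigh(V)
U = P @ np.diag(np.exp(-1j * w * tau / hbar)) @ P.conj().T

psi_plus = U @ np.kron(up, plus_y)      # starts as |n:+>|a:+y>
psi_minus = U @ np.kron(down, plus_y)   # starts as |n:->|a:+y>

overlap_plus = np.vdot(np.kron(up, up), psi_plus)        # <n:+, a:+ | psi>
overlap_minus = np.vdot(np.kron(down, down), psi_minus)  # <n:-, a:- | psi>
print(overlap_plus, overlap_minus)
```

The second overlap comes out as $i$ rather than 1: the final state is $|n:-\rangle \otimes |a:-\rangle$ only up to that phase factor.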
After the neutron-atom interaction described above, one measures the z-component
$S_{az}$ of the atom's spin. The state after the interaction is (using linearity)
\[
|\psi(\tau)\rangle = \alpha_+\, |n : +\rangle \otimes |a : +\rangle + \alpha_-\, |n : -\rangle \otimes |a : -\rangle \tag{14.280}
\]
The measurement of the z-component of the atom's spin gives $+\hbar/2$ with
probability $|\alpha_+|^2$ and state $|n : +\rangle \otimes |a : +\rangle$ after the measurement, or $-\hbar/2$ with
probability $|\alpha_-|^2$ and state $|n : -\rangle \otimes |a : -\rangle$ after the measurement. In both
cases, after measuring the z-component of the atom's spin, the neutron spin
state is known; it is the same as that of the measured atom. It is not necessary
to let the neutron interact with another measuring apparatus in order to know
the value of $S_{nz}$.
A neutron, initially in the spin state $|n : -\rangle$, is sent into the two-cavity system.
Immediately after the first cavity, there is a detecting atom of the type described
above, prepared in the spin state $+y$. By assumption, the spin state of the atom
evolves only during the time interval $\tau$ when it interacts with the neutron.
The probability of finding the neutron in the state + at time $t_1'$ (after the 2nd
cavity) is the sum of the probabilities for finding
1. the neutron in state + and the atom in state + (i.e., the square modulus
of the coefficient of $|n : +\rangle \otimes |a : +\rangle$, which equals 1/4 in this case)
2. the neutron in state + and the atom in state $-$ (which also equals 1/4 in
this case).
We therefore get $P_+ = 1/4 + 1/4 = 1/2$; there is no interference since the
quantum path leading in the end to a spin flip of the neutron can be determined
from the state of the atom.
At time $t_1'$, Bob measures the z-component of the neutron spin and Alice
measures the y-component of the atom's spin. Assume that both measurements
give $+\hbar/2$.
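We can write
\begin{align*}
|\psi(t_1')\rangle = \frac{1}{2\sqrt{2}} \Big[ & -i\, e^{-i(\delta+\omega_0 T/2)} \big( e^{-i\chi}\, |n : +\rangle - i\, e^{i\delta'}\, |n : -\rangle \big) \otimes \big( |a : +y\rangle + |a : -y\rangle \big) \\
& + e^{i(\chi+\omega_0 T/2)} \big( -i\, e^{-i\delta'}\, |n : +\rangle + e^{i\chi}\, |n : -\rangle \big) \otimes \big( |a : +y\rangle - |a : -y\rangle \big) \Big]
\end{align*}
The probability amplitude that Bob finds $+\hbar/2$ along the z-axis while Alice
finds $+\hbar/2$ along the y-axis is the coefficient of the term $|n : +\rangle \otimes |a : +y\rangle$ in
the above state. Equivalently, the probability is found by projecting
the state onto $|n : +\rangle \otimes |a : +y\rangle$ and taking the squared modulus. We get
\begin{align*}
P\left( S_{nz} = +\hbar/2,\ S_{ay} = +\hbar/2 \right) &= \frac{1}{8} \left| -i\, e^{-i(\chi+\delta+\omega_0 T/2)} - i\, e^{i(\chi-\delta'+\omega_0 T/2)} \right|^2 \\
&= \frac{1}{2} \cos^2 \frac{(\omega - \omega_0)T}{2} \tag{14.281}
\end{align*}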
which clearly exhibits a modulation reflecting an interference phenomenon.
Similarly, one finds that
\[
P\left( S_{nz} = +\hbar/2,\ S_{ay} = -\hbar/2 \right) = \frac{1}{2} \sin^2 \frac{(\omega - \omega_0)T}{2} \tag{14.282}
\]
which is also modulated.
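The two joint probabilities can be reproduced with a small simulation of the whole sequence: first cavity, neutron-atom interaction, free flight, second cavity. This is only a sketch under stated assumptions. The cavity matrix in `cavity`, the parameter definitions ($\phi = \omega_1 d/2$, $\chi = \omega d/2$, $\delta = \omega(t_1+t_0)/2$ for a cavity crossed between $t_0$ and $t_1 = t_0 + d$) and the closed form used for the interaction operator with $A\tau = \pi/2$ are assumed here and are not quoted from the text; $T$ is taken to be the free-flight time between the cavities.

```python
import numpy as np

def cavity(phi, chi, delta):
    """Assumed cavity evolution matrix in the basis (|n:+>, |n:->)."""
    return np.array([
        [np.exp(-1j * chi) * np.cos(phi), -1j * np.exp(-1j * delta) * np.sin(phi)],
        [-1j * np.exp(1j * delta) * np.sin(phi), np.exp(1j * chi) * np.cos(phi)],
    ])

def joint_probabilities(omega, omega0, omega1, T):
    """Return P(S_nz=+, S_ay=+) and P(S_nz=+, S_ay=-) for a neutron starting in |n:->."""
    d = np.pi / (2 * omega1)                  # cavity transit time for a pi/2 pulse
    phi, chi = omega1 * d / 2, omega * d / 2
    delta = omega * d / 2                     # first cavity: t0 = 0, t1 = d
    delta_p = omega * (3 * d + 2 * T) / 2     # second cavity: t0' = d + T, t1' = 2d + T

    I2 = np.eye(2)
    sz = np.diag([1.0, -1.0])
    sx = np.array([[0.0, 1.0], [1.0, 0.0]])
    up = np.array([1, 0], dtype=complex)
    down = np.array([0, 1], dtype=complex)
    plus_y = (up + 1j * down) / np.sqrt(2)
    minus_y = (up - 1j * down) / np.sqrt(2)

    U1 = np.kron(cavity(phi, chi, delta), I2)
    # exp(-i V tau / hbar) with A*tau = pi/2 equals (1 - i sigma_z x sigma_x)/sqrt(2)
    W = (np.eye(4) - 1j * np.kron(sz, sx)) / np.sqrt(2)
    F = np.kron(np.diag(np.exp([-0.5j * omega0 * T, 0.5j * omega0 * T])), I2)
    U2 = np.kron(cavity(phi, chi, delta_p), I2)

    psi = U2 @ F @ W @ U1 @ np.kron(down, plus_y)
    p_plus = abs(np.vdot(np.kron(up, plus_y), psi)) ** 2
    p_minus = abs(np.vdot(np.kron(up, minus_y), psi)) ** 2
    return p_plus, p_minus

omega, omega0, omega1, T = 5.3, 5.0, 40.0, 2.0   # illustrative numbers
p_plus, p_minus = joint_probabilities(omega, omega0, omega1, T)
x = (omega - omega0) * T / 2
print(p_plus, 0.5 * np.cos(x) ** 2)
print(p_minus, 0.5 * np.sin(x) ** 2)
print(p_plus + p_minus)    # Bob's unsorted probability, no modulation
```

The sum of the two outcomes is 1/2 for every detuning, so Bob sees no fringes unless his events are sorted according to Alice's results.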
2. Knowing the result obtained by Alice on each event, Bob can select a
subsample of his own events which displays an interference phenomenon.
that if the statement were correct, this would imply instantaneous transmission
of information from Alice to Bob. By seeing interference reappear, Bob would
know immediately that Alice is performing an experiment, even though she may
be far away.
Statement (2) is correct. If Alice and Bob put together all their results, and if
they select the subsample of events for which Alice finds $+\hbar/2$, then the number
of events for which Bob also finds $+\hbar/2$ varies like
\[
\cos^2 \frac{(\omega - \omega_0)T}{2} \tag{14.283}
\]
Thus, they recover interference for this subset of events. In the complementary
set where Alice found $-\hbar/2$, the number of Bob's results giving $+\hbar/2$ varies like
\[
\sin^2 \frac{(\omega - \omega_0)T}{2} \tag{14.284}
\]
This search for correlation between events occurring in different detectors is a
common procedure, in particle physics for example.
Statement (3), although less precise but more picturesque than statement
(2), is nevertheless acceptable. The
\[
\cos^2 \frac{(\omega - \omega_0)T}{2} \tag{14.285}
\]
signal found earlier can be interpreted as the interference of the amplitudes
corresponding to two quantum paths for the neutron spin, which is initially in
the state $|n : -\rangle$; either its spin flips in the 1st cavity, or it flips in the 2nd cavity. If
there exists a possibility to determine which quantum path is followed by the
system, interference cannot appear. It is necessary to erase this information,
which is carried by the atom, in order to observe some interference. After Alice
has measured the atom's spin along the y-axis, she has, in some sense, restored
the initial state of the system, and this enables Bob to see some interference. It is
questionable to say that information has been erased; one may feel that, on the
contrary, extra information has been acquired. Notice that the statement does
not specify in which physical quantity the interference reappears. Notice also
that the order of the measurements made by Alice and Bob has no importance,
contrary to what this third statement seems to imply.
Chapter 15
Postulate 2 stated that every physical system has a state (or density) operator
associated with it, or, alternatively, we might say that every physical system has
an associated state vector.
This means that we must have some reproducible preparation procedure, which
we identify with the term state, that determines a probability distribution for
each dynamical variable.
the relation $|\psi_1\rangle = U(t_1, t_0)\, |\psi_0\rangle$. It is not always possible, however, to realize
$U(t_1, t_0)$ in practice (in the laboratory).
If we know the initial state, the mathematical mapping is one-to-one (pure state
to pure state) and invertible and thus reversible.
This says that even in the microscopic world where reversibility holds, we can
have, effectively, an irreversible transformation.
Now, we know that it is always possible to prepare the lowest energy state
of a system simply by waiting long enough for the system to decay into its
ground state. This follows because the decay of an atomic excited state by
spontaneous emission is due to the atom-electromagnetic field coupling and the
survival probability of an excited state, due to this interaction, decays to zero
(usually exponentially with time). This means that the probability of obtain-
ing the ground state can be made arbitrarily close to one just by waiting long
enough.
The method assumes that the energy of the excited state is radiated away to
infinity (it never returns to the system) and that the electromagnetic field is
initially in its lowest energy state since otherwise we have a nonvanishing prob-
ability for the atom to absorb energy from the field and become re-excited (no
longer guaranteed to be in the ground state).
since Ex gets smaller.
Suppose that this "just waiting" method can be used successfully to produce the
ground state for some system. It then turns out that we can prepare a
wide range of states for a spinless particle.
Suppose that we wish to prepare the state characterized by the state function
Step #1
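Construct a potential $W_1(\vec{r})$ such that
\[
-\frac{\hbar^2}{2m} \nabla^2 \psi_0 + W_1 \psi_0 = E \psi_0 \tag{15.2}
\]
where $\psi_0(\vec{r}) = R(\vec{r})$ is the ground state wave function for the system. By
definition of the ground state, $R(\vec{r})$ is then a nodeless function. We can do this
by choosing
\[
W_1(\vec{r}) = E + \frac{\hbar^2}{2m} \frac{\nabla^2 R(\vec{r})}{R(\vec{r})} \tag{15.3}
\]
Proof: Direct substitution gives
\[
-\frac{\hbar^2}{2m} \nabla^2 \psi_0 + W_1 \psi_0 = -\frac{\hbar^2}{2m} \nabla^2 \psi_0 + E \psi_0 + \frac{\hbar^2}{2m} \frac{\nabla^2 R(\vec{r})}{R(\vec{r})} \psi_0 = E \psi_0
\]
or
\[
(\nabla^2 R)\, \psi_0 - (\nabla^2 \psi_0)\, R = 0 \tag{15.4}
\]
which certainly has a solution $\psi_0(\vec{r}) = R(\vec{r})$.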
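A one-dimensional numerical illustration of Step #1 may help. In the sketch below a Gaussian amplitude $R(x)$ is chosen (an arbitrary example, as are the values of `sigma` and `E`), $W_1$ is built from (15.3), and the resulting Hamiltonian is diagonalized on a grid with a simple finite-difference Laplacian. The lowest eigenvalue should come out close to $E$ and the lowest eigenvector close to $R$.

```python
import numpy as np

hbar = m = 1.0
sigma, E = 1.3, 0.7      # width of the target amplitude and chosen ground-state energy

x = np.linspace(-10.0, 10.0, 801)
dx = x[1] - x[0]
R = np.exp(-x ** 2 / (2 * sigma ** 2))           # nodeless target amplitude R(x)

# W1 = E + (hbar^2/2m) R''/R, with R'' evaluated analytically for the Gaussian
R_xx = (x ** 2 / sigma ** 4 - 1.0 / sigma ** 2) * R
W1 = E + hbar ** 2 / (2 * m) * R_xx / R

# Finite-difference Hamiltonian  -hbar^2/(2m) d^2/dx^2 + W1(x)
kin = hbar ** 2 / (2 * m * dx ** 2)
n = len(x)
H = np.diag(2 * kin + W1) - kin * np.eye(n, k=1) - kin * np.eye(n, k=-1)

energies, states = np.linalg.eigh(H)
ground = states[:, 0]
overlap = abs(ground @ R) / np.linalg.norm(R)    # 1 if the ground state is proportional to R

print(energies[0], E)   # lowest eigenvalue vs. the chosen E
print(overlap)
```

For this Gaussian choice $W_1$ is just a harmonic oscillator potential, so the construction reproduces a familiar result.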
Step #2
Wait until the probability that the system has decayed to its ground state is
sufficiently close to one.
Step #3
Apply a pulse potential
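where
\[
\delta_\epsilon(t) = \begin{cases} 1/\epsilon & 0 < t < \epsilon \\ 0 & \text{otherwise} \end{cases} \tag{15.6}
\]
During the short time interval $0 < t < \epsilon$ we can approximate the Schrodinger
equation by
\[
i\hbar \frac{\partial \psi(\vec{r}, t)}{\partial t} = W_2(\vec{r}, t)\, \psi(\vec{r}, t) \tag{15.7}
\]
since $W_2$ overwhelms any other interactions in the limit $\epsilon \to 0$.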
Step #4
Integrate the equation with the initial condition $\psi(\vec{r}, 0) = R(\vec{r})$. We get
In general, we can usually carry out this procedure since we are only limited by
our ability to produce the potentials W1 and W2 in the laboratory.
By blocking off and thus eliminating all but one of the sub-beams, which is an
irreversible process, we can select (or filter out) a particular spin state. We will
discuss this procedure in detail shortly.
No-Cloning Theorem
Why don't we just make exact replicas, or clones, of a prototype of the state
(assuming we can find one)? We do this all the time in the world of
macrophysics, e.g., duplicating keys, copying a computer file, etc.
\[
U\, |\phi\rangle |\mu\rangle = |\phi\rangle |\phi\rangle |\mu'\rangle \tag{15.10}
\]
The dimension of the final device state vector $|\mu'\rangle$ is smaller than that of the
initial device state vector $|\mu\rangle$ because the overall dimension of the space is
conserved under unitary transformations, i.e., if $|\phi\rangle$ is a 1-particle state and
$|\mu\rangle$ is an N-particle state, then $|\mu'\rangle$ must be an (N - 1)-particle state.
\begin{align*}
U\, |\phi_1\rangle |\mu\rangle &= |\phi_1\rangle |\phi_1\rangle |\mu'\rangle \\
U\, |\phi_2\rangle |\mu\rangle &= |\phi_2\rangle |\phi_2\rangle |\mu''\rangle
\end{align*}
It is possible that the final device states $|\mu'\rangle$ and $|\mu''\rangle$ have a dependence on the
state we are attempting to clone.
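Now $U$ is a linear operator. This implies that for the superposition state
\[
|\phi_S\rangle = \frac{1}{\sqrt{2}} \left( |\phi_1\rangle + |\phi_2\rangle \right) \tag{15.11}
\]
we must have
\[
U\, |\phi_S\rangle |\mu\rangle = \frac{1}{\sqrt{2}}\, |\phi_1\rangle |\phi_1\rangle |\mu'\rangle + \frac{1}{\sqrt{2}}\, |\phi_2\rangle |\phi_2\rangle |\mu''\rangle \tag{15.12}
\]
But $|\phi_S\rangle$ is not any special state and thus, by our assumption, we should have
obtained
\[
U\, |\phi_S\rangle |\mu\rangle = |\phi_S\rangle |\phi_S\rangle |\mu'''\rangle \tag{15.13}
\]
but we do not!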
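A two-qubit toy model makes the argument concrete. In the sketch below the "device" is reduced to a single blank qubit and the copying operation is taken to be a CNOT gate; both are simplifying assumptions made for illustration, not the general N-particle device of the text. The gate copies the two basis states perfectly, and linearity then fixes what it does to their superposition.

```python
import numpy as np

# CNOT: flips the second (blank) qubit when the first qubit is |1>
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

zero = np.array([1, 0], dtype=complex)
one = np.array([0, 1], dtype=complex)

def try_clone(phi):
    """Apply the would-be cloner to |phi>|blank> with |blank> = |0>."""
    return CNOT @ np.kron(phi, zero)

phi_s = (zero + one) / np.sqrt(2)          # superposition of the two copyable states
actual = try_clone(phi_s)                  # (|00> + |11>)/sqrt(2), fixed by linearity
wanted = np.kron(phi_s, phi_s)             # (|00> + |01> + |10> + |11>)/2
fidelity = abs(np.vdot(wanted, actual)) ** 2

print(np.allclose(try_clone(zero), np.kron(zero, zero)))  # basis state copied
print(np.allclose(try_clone(one), np.kron(one, one)))     # basis state copied
print(fidelity)                                           # less than 1
```

The output for the superposition is an entangled state, not two independent copies, exactly as (15.12) and (15.13) require.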
Classical states are just special limiting cases of quantum states. Since we are
able to copy an unknown classical state, does this theorem make any sense?
We were able to prove the impossibility of cloning states because of the linearity
of the quantum mechanical time evolution process.
In any case, the set of discrete states must also be orthogonal. This follows from
the fact that the inner product between state vectors is preserved by a unitary
transformation.
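This means we must have
\begin{align*}
\langle \mu | \langle \phi_1 |\, U^{+} U\, | \phi_2 \rangle | \mu \rangle &= \langle \mu | \langle \phi_1 |\, I\, | \phi_2 \rangle | \mu \rangle \\
&= \langle \phi_1 | \phi_2 \rangle \langle \mu | \mu \rangle \\
&= \left( \langle \mu' | \langle \phi_1 | \langle \phi_1 | \right) \left( | \phi_2 \rangle | \phi_2 \rangle | \mu'' \rangle \right) \\
&= \langle \phi_1 | \phi_2 \rangle^2\, \langle \mu' | \mu'' \rangle
\end{align*}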
Now, states that are classically different will certainly be orthogonal, so the
no-cloning theorem that we proved for quantum states would not be in conflict
with the ability to copy classical states.
In the laboratory, nature usually presents us with mixed states (states that are
not pure) and it is the preparation of pure states that is difficult.
submit a system to the preparation procedure and then carry out only a single
measurement on it. To obtain any more information, we must repeat the state
preparation procedures before another measurement is carried out. We can use
the same system each time in the preparation procedure, or another identical
system each time. The results will be the same.
What sort of measurements are sufficient to determine the state operator (or
state vector) associated with the preparation procedure?
for the case of non-degenerate eigenvalues. This says that the measurement
of the probability distribution of the dynamical variable R gives the diagonal
matrix elements of the state operator in this representation.
We can formally construct a set of operators such that their probability distribu-
tions would determine all of the matrix elements of the state operator. Consider
the Hermitian operators:
and similarly
\[
\mathrm{Tr}\left( \rho B_{mn} \right) = \mathrm{Im}\left( \langle r_m |\, \rho\, | r_n \rangle \right) \tag{15.19}
\]
which can be done using an SG apparatus (see discussion later in this chapter).
For some unit vector $\hat{n} = (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta)$ we have the operator
\[
\vec{\sigma} \cdot \hat{n} = \begin{pmatrix} \cos\theta & e^{-i\phi} \sin\theta \\ e^{i\phi} \sin\theta & -\cos\theta \end{pmatrix} \tag{15.25}
\]
Only the relative magnitudes and relative phases of the components of any state
vector have physical significance (the norm and overall phase are irrelevant).
Looking at the eigenvectors, it is clear that all possible values of the relative
magnitude and relative phase can be obtained by varying $\theta$ and $\phi$, or, in other
words, the relative magnitude and relative phase of any 2-component vector
uniquely determines $\theta$ and $\phi$. This implies that any pure state vector of an
S = 1/2 system can always be associated with a spatial direction $\hat{n}$ for which it
is the $+\hbar/2$ eigenvector for that component of spin, i.e., any pure state
always represents a definite spin in some direction in this case.
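This property is easy to test numerically. The sketch below draws a random normalized 2-component vector, takes the direction $\hat{n}$ to be the expectation value of $\vec{\sigma}$ in that state (a standard construction, used here as an assumption about how to find $\hat{n}$), and checks that the state is the $+1$ eigenvector of $\vec{\sigma} \cdot \hat{n}$.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def spin_direction(psi):
    """Return n = <psi| sigma |psi> for a normalized 2-component state."""
    psi = psi / np.linalg.norm(psi)
    return np.array([np.vdot(psi, s @ psi).real for s in sigma])

rng = np.random.default_rng(0)
psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi = psi / np.linalg.norm(psi)

n = spin_direction(psi)
sigma_n = sum(n_i * s_i for n_i, s_i in zip(n, sigma))
residual = np.linalg.norm(sigma_n @ psi - psi)   # zero if psi is the +1 eigenvector

print(np.linalg.norm(n))   # unit vector for every pure state
print(residual)
```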
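\[
\rho = \rho^{+} \;\Rightarrow\; \rho_{11}, \rho_{22}, \rho_{33} \ \text{real}, \qquad \rho_{ij} = \rho_{ji}^{*} \quad (i \neq j)
\]
where
\[
\langle \vec{S} \rangle = \text{polarization, as before.} \tag{15.28}
\]
The five other parameters are obtained by measuring the average values of the
so-called quadrupole operators
\[
S_\alpha S_\beta + S_\beta S_\alpha, \qquad \alpha, \beta = x, y, z \tag{15.29}
\]
Only five of the six possible different operators are independent, namely,
\[
(\alpha, \beta) = (x, y), (z, x), (y, z), (x, x), (y, y) \tag{15.30}
\]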
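and we have
\[
\hat{n} \cdot \vec{S} = \hbar \begin{pmatrix}
\cos\theta & \frac{1}{\sqrt{2}} \sin\theta\, e^{-i\phi} & 0 \\
\frac{1}{\sqrt{2}} \sin\theta\, e^{i\phi} & 0 & \frac{1}{\sqrt{2}} \sin\theta\, e^{-i\phi} \\
0 & \frac{1}{\sqrt{2}} \sin\theta\, e^{i\phi} & -\cos\theta
\end{pmatrix} \tag{15.32}
\]
with eigenvalues/eigenvectors
\[
+\hbar:\ \begin{pmatrix} \frac{1}{2}(1 + \cos\theta)\, e^{-i\phi} \\ \frac{1}{\sqrt{2}} \sin\theta \\ \frac{1}{2}(1 - \cos\theta)\, e^{i\phi} \end{pmatrix}
\qquad
0:\ \begin{pmatrix} -\frac{1}{\sqrt{2}} \sin\theta\, e^{-i\phi} \\ \cos\theta \\ \frac{1}{\sqrt{2}} \sin\theta\, e^{i\phi} \end{pmatrix}
\qquad
-\hbar:\ \begin{pmatrix} \frac{1}{2}(1 - \cos\theta)\, e^{-i\phi} \\ -\frac{1}{\sqrt{2}} \sin\theta \\ \frac{1}{2}(1 + \cos\theta)\, e^{i\phi} \end{pmatrix}
\]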
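Unlike the spin = 1/2 case, it is no longer true that every vector must be an
eigenvector of the component of spin in some direction. This is so because it
requires four real parameters to specify the relative magnitudes and the relative
phases of the components of a general 3-component vector. The eigenvectors
above only contain two parameters $\theta$ and $\phi$, however. Therefore, the pure states
of a spin = 1 system need not be associated with a spin eigenvalue in any spatial
direction. Let us define the quantities
\begin{align*}
a_\alpha &= \frac{1}{\hbar}\, \mathrm{Tr}(\rho S_\alpha) & \alpha &= x, y, z \\
q_{\alpha\alpha} &= \frac{1}{\hbar^2}\, \mathrm{Tr}(\rho S_\alpha^2) & \alpha &= x, y \\
q_{\alpha\beta} &= \frac{1}{\hbar^2}\, \mathrm{Tr}\left[ \rho \left( S_\alpha S_\beta + S_\beta S_\alpha \right) \right] & \alpha\beta &= xy, yz, zx
\end{align*}
These are the eight real numbers we need, defined in terms of the measurable
quantities:
\begin{align*}
& \langle S_\alpha \rangle & \alpha &= x, y, z \\
& \langle S_\alpha^2 \rangle & \alpha &= x, y \\
& \langle S_\alpha S_\beta + S_\beta S_\alpha \rangle & \alpha\beta &= xy, yz, zx
\end{align*}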
Writing down all such possible equations and solving them for the matrix el-
ements of the state operator we can express the most general state operator
as
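\[
\rho = \begin{pmatrix}
1 + \frac{1}{2}(a_z - q_{xx} - q_{yy}) & \frac{1}{2\sqrt{2}}\big(a_x + q_{zx} - i(a_y + q_{yz})\big) & \frac{1}{2}(q_{xx} - q_{yy} - i q_{xy}) \\
\frac{1}{2\sqrt{2}}\big(a_x + q_{zx} + i(a_y + q_{yz})\big) & -1 + q_{xx} + q_{yy} & \frac{1}{2\sqrt{2}}\big(a_x - q_{zx} - i(a_y - q_{yz})\big) \\
\frac{1}{2}(q_{xx} - q_{yy} + i q_{xy}) & \frac{1}{2\sqrt{2}}\big(a_x - q_{zx} + i(a_y - q_{yz})\big) & 1 - \frac{1}{2}(a_z + q_{xx} + q_{yy})
\end{pmatrix}
\]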
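The parameterization can be verified numerically. The sketch below uses the standard spin-1 matrices in the basis $m = +1, 0, -1$, computes the eight parameters from a randomly generated state operator, and rebuilds the matrix element by element; the entries used in `rebuild` follow from inverting the definitions of $a_\alpha$ and $q_{\alpha\beta}$ with these matrices. The helper names are illustrative.

```python
import numpy as np

hbar = 1.0
s = 1 / np.sqrt(2)
Sx = hbar * s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Sy = hbar * s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
Sz = hbar * np.diag([1, 0, -1]).astype(complex)

def anti(A, B):
    """Symmetrized product AB + BA."""
    return A @ B + B @ A

def parameters(rho):
    """Polarization a and quadrupole q parameters of a spin-1 state operator."""
    ev = lambda op: np.trace(rho @ op).real
    a = {"x": ev(Sx) / hbar, "y": ev(Sy) / hbar, "z": ev(Sz) / hbar}
    q = {"xx": ev(Sx @ Sx) / hbar**2, "yy": ev(Sy @ Sy) / hbar**2,
         "xy": ev(anti(Sx, Sy)) / hbar**2, "yz": ev(anti(Sy, Sz)) / hbar**2,
         "zx": ev(anti(Sz, Sx)) / hbar**2}
    return a, q

def rebuild(a, q):
    """Reassemble the state operator from the eight measured parameters."""
    c = 1 / (2 * np.sqrt(2))
    r11 = 1 + 0.5 * (a["z"] - q["xx"] - q["yy"])
    r22 = -1 + q["xx"] + q["yy"]
    r33 = 1 - 0.5 * (a["z"] + q["xx"] + q["yy"])
    r12 = c * (a["x"] + q["zx"] - 1j * (a["y"] + q["yz"]))
    r23 = c * (a["x"] - q["zx"] - 1j * (a["y"] - q["yz"]))
    r13 = 0.5 * (q["xx"] - q["yy"] - 1j * q["xy"])
    return np.array([[r11, r12, r13],
                     [np.conj(r12), r22, r23],
                     [np.conj(r13), np.conj(r23), r33]])

rng = np.random.default_rng(1)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
rho = M @ M.conj().T
rho = rho / np.trace(rho).real          # random valid state operator

rho_rebuilt = rebuild(*parameters(rho))
print(np.allclose(rho_rebuilt, rho))
```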
With this parameterization of the state operator, we now ask how do we measure
the parameters?
An SG apparatus with the magnetic field gradient along the x-direction can be
used to perform measurements on an ensemble of particles that emerge from the
state preparation. We can determine the relative frequencies of the $S_x$ values
$+\hbar, 0, -\hbar$ in this way. Then we can calculate
\[
a_x = \frac{\langle S_x \rangle}{\hbar} \quad \text{and} \quad q_{xx} = \frac{\langle S_x^2 \rangle}{\hbar^2} \tag{15.33}
\]
and similarly for gradients in the y- and z-directions. In this way we can
measure $a_x, a_y, a_z, q_{xx}, q_{yy}, q_{zz}$, which provides a check since we must have
$q_{zz} = 2 - q_{xx} - q_{yy}$. What about $q_{xy}, q_{yz}, q_{zx}$? Here we take advantage of the time
evolution of the unknown state operator in a uniform magnetic field $\vec{B}$. We
have
H = ~ B ~ = S B
~ = Sz for B
~ = B z (15.34)
Now suppose that at t = 0 the state operator is $\rho$ and the magnetic field is
turned on for a time interval t, after which we measure $S_x^2$. By doing this many
times for each of several t values, we can evaluate
\[
\left. \frac{d}{dt} \left\langle S_x^2 \right\rangle \right|_{t=0} \tag{15.35}
\]
where
\[
\frac{\partial \rho(t)}{\partial t} = \frac{i}{\hbar} \left[ \rho(t), H \right] \tag{15.37}
\]
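In the Schrodinger picture we then have
\begin{align*}
\frac{d}{dt} \langle R \rangle_t &= \frac{d}{dt}\, \mathrm{Tr}\left( \rho(t) R \right) = \mathrm{Tr}\left( \frac{\partial \rho}{\partial t} R + \rho \frac{\partial R}{\partial t} \right) \\
&= \mathrm{Tr}\left( \frac{i}{\hbar} \left[ \rho, H \right] R + \rho \frac{\partial R}{\partial t} \right) \\
&= \mathrm{Tr}\left( \frac{i}{\hbar} \left( \rho H R - H \rho R \right) + \rho \frac{\partial R}{\partial t} \right) \\
&= \mathrm{Tr}\left( \frac{i}{\hbar} \left( \rho H R - \rho R H \right) + \rho \frac{\partial R}{\partial t} \right) \\
&= \mathrm{Tr}\left( \frac{i}{\hbar}\, \rho(t) \left[ H, R \right] + \rho(t) \frac{\partial R}{\partial t} \right)
\end{align*}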
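To see how this gives access to $q_{xy}$, the sketch below takes a spin-1 system with a Hamiltonian proportional to $S_z$, written as `H = c*Sz` with a hypothetical constant `c` standing in for the field-dependent coefficient. It compares a finite-difference estimate of $d\langle S_x^2\rangle/dt$ at $t = 0$ with the trace formula just derived and with $-c\,\langle S_x S_y + S_y S_x \rangle$, which is what the commutator $[S_z, S_x^2] = i\hbar (S_x S_y + S_y S_x)$ predicts.

```python
import numpy as np

hbar = 1.0
c = 0.8                                   # hypothetical coefficient in H = c*Sz
s = 1 / np.sqrt(2)
Sx = hbar * s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Sy = hbar * s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
Sz = hbar * np.diag([1, 0, -1]).astype(complex)
H = c * Sz

rng = np.random.default_rng(2)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
rho0 = M @ M.conj().T
rho0 = rho0 / np.trace(rho0).real         # random initial state operator

def rho_t(t):
    """rho(t) = U rho0 U^+ with U = exp(-i H t / hbar); H is diagonal here."""
    U = np.diag(np.exp(-1j * np.diag(H) * t / hbar))
    return U @ rho0 @ U.conj().T

def expect(rho, op):
    return np.trace(rho @ op).real

R = Sx @ Sx
dt = 1e-5
numeric = (expect(rho_t(dt), R) - expect(rho_t(-dt), R)) / (2 * dt)
formula = expect(rho0, (1j / hbar) * (H @ R - R @ H))     # Tr(rho (i/hbar)[H, R])
from_qxy = -c * expect(rho0, Sx @ Sy + Sy @ Sx)           # -c <SxSy + SySx>

print(numeric, formula, from_qxy)
```

All three numbers agree, so the measured slope of $\langle S_x^2 \rangle$ determines the anticommutator average and hence $q_{xy}$.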
In a similar manner we can determine qyz and qzx . Thus, we have managed to
determine the spin = 1 state completely. This procedure can be generalized to
higher spin states.
Now let us turn to the problem of determining the orbital state for a spin-
less particle. The orbital state of a spinless particle can be described by the
coordinate representation of the state operator
This is a function of two variables $\vec{r}$ and $\vec{r}\,'$. It is called the density matrix. Its
diagonal elements
\[
\langle \vec{r} |\, \rho\, | \vec{r} \rangle = \left| \langle \vec{r} | \psi \rangle \right|^2 \tag{15.39}
\]
yield the position probability density.
To determine the density matrix for an arbitrary state we will need the prob-
ability distributions for the position and one or more dynamical variables with
operators that do not commute with the position operator. In 1933, Pauli posed
this question: Are the position and momentum probability densities sufficient to
determine the state?
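Then we have
\begin{align*}
\left\langle R^{(1)} \right\rangle &= \sum_{m,n} \sum_{m',n'} \langle a_m b_n |\, \rho\, | a_{m'} b_{n'} \rangle \langle a_{m'} |\, R^{(1)}\, | a_m \rangle \langle b_{n'} |\, I^{(2)}\, | b_n \rangle \\
&= \sum_{m,n} \sum_{m',n'} \langle a_m b_n |\, \rho\, | a_{m'} b_{n'} \rangle \langle a_{m'} |\, R^{(1)}\, | a_m \rangle\, \delta_{nn'} \\
&= \sum_{m,m',n} \langle a_m b_n |\, \rho\, | a_{m'} b_n \rangle \langle a_{m'} |\, R^{(1)}\, | a_m \rangle \tag{15.44}
\end{align*}
where $\mathrm{Tr}^{(2)}$ means the trace over the space of component 2, then we have
\[
\langle a_m |\, \rho^{(1)}\, | a_{m'} \rangle = \sum_n \langle a_m b_n |\, \rho\, | a_{m'} b_n \rangle \tag{15.46}
\]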
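The operator $\rho^{(1)}$ is called the partial (or reduced) state operator for component
1. Similar results hold for
\[
R^{(2)} \to I^{(1)} \otimes R^{(2)}, \qquad \left\langle R^{(2)} \right\rangle = \mathrm{Tr}\left( \rho^{(2)} R^{(2)} \right) \tag{15.48}
\]
Let us now prove that $\rho^{(1)}$ and $\rho^{(2)}$ are in fact state operators. We must have
\[
\mathrm{Tr}^{(1)} \rho^{(1)} = 1, \qquad \rho^{(1)} = \rho^{(1)+}, \qquad \langle u |\, \rho^{(1)}\, | u \rangle \geq 0 \ \text{for all } |u\rangle \tag{15.49}
\]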
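Now we have
\[
\mathrm{Tr}^{(1)} \rho^{(1)} = \sum_m \langle a_m |\, \rho^{(1)}\, | a_m \rangle = \sum_{m,n} \langle a_m b_n |\, \rho\, | a_m b_n \rangle = \mathrm{Tr}\, \rho = 1
\]
\begin{align*}
\langle a_m | \left( \rho^{(1)} - \rho^{(1)+} \right) | a_{m'} \rangle &= \sum_n \langle a_m b_n | \left( \rho - \rho^{+} \right) | a_{m'} b_n \rangle \\
&= \sum_n \langle a_m b_n |\, (0)\, | a_{m'} b_n \rangle = 0 \;\Rightarrow\; \rho^{(1)} = \rho^{(1)+}
\end{align*}
To prove the last condition we assume $\langle u |\, \rho^{(1)}\, | u \rangle < 0$ and look for a
contradiction. In the space of component 1 we use the orthonormal basis $\{|u_m\rangle\}$ where
$|u_1\rangle = |u\rangle$ instead of using the basis $\{|a_m\rangle\}$. We also use the product states
$|u_m b_n\rangle = |u_m\rangle |b_n\rangle$. Then, our assumption implies that
\[
0 > \langle u_1 |\, \rho^{(1)}\, | u_1 \rangle = \sum_n \langle u_1 b_n |\, \rho\, | u_1 b_n \rangle \tag{15.50}
\]
but this is impossible since $\rho$ is non-negative. Thus, $\rho^{(1)}$ and $\rho^{(2)}$ are state
operators. $\rho^{(1)}$ is sufficient to calculate the average of any dynamical variable
that belongs exclusively to component 1 and similarly for $\rho^{(2)}$.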
They are not sufficient, in general, for determining the state of a composite sys-
tem. The reason is that they provide no information about correlations between
components 1 and 2.
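A two-spin example shows both points at once: the partial state operators are legitimate state operators, yet they do not fix the composite state. The sketch below computes the partial traces with `numpy.einsum` for the entangled state $(|{\uparrow\uparrow}\rangle + |{\downarrow\downarrow}\rangle)/\sqrt{2}$, an example state chosen here for illustration, and compares the true correlation $\langle \sigma_z \sigma_z \rangle$ with the product of the single-spin averages.

```python
import numpy as np

def partial_states(rho, d1, d2):
    """Return (Tr_2 rho, Tr_1 rho) for a composite system of dimension d1*d2."""
    r = rho.reshape(d1, d2, d1, d2)          # indices (m, n, m', n')
    rho1 = np.einsum('mnkn->mk', r)          # sum over n = n'
    rho2 = np.einsum('mnmk->nk', r)          # sum over m = m'
    return rho1, rho2

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
bell = (np.kron(up, up) + np.kron(down, down)) / np.sqrt(2)
rho = np.outer(bell, bell.conj())

rho1, rho2 = partial_states(rho, 2, 2)
sz = np.diag([1.0, -1.0])

corr = np.trace(rho @ np.kron(sz, sz)).real                   # <sz(1) sz(2)>
prod = np.trace(rho1 @ sz).real * np.trace(rho2 @ sz).real    # <sz(1)><sz(2)>

print(rho1.real)                                   # maximally mixed: I/2
print(corr, prod)                                  # 1 versus 0
print(np.allclose(np.kron(rho1, rho2), rho))       # False: rho is not rho1 x rho2
```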
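If we find
\[
\left\langle R^{(1)} R^{(2)} \right\rangle = \left\langle R^{(1)} \right\rangle \left\langle R^{(2)} \right\rangle \tag{15.51}
\]
for all $R^{(1)}$ and $R^{(2)}$, then the composite state is said to be an uncorrelated
state. In this case, we have
\begin{align*}
\left\langle R^{(1)} R^{(2)} \right\rangle &= \mathrm{Tr}\left[ \rho R^{(1)} R^{(2)} \right] \\
&= \mathrm{Tr}\left( \rho^{(1)} R^{(1)} \right) \mathrm{Tr}\left( \rho^{(2)} R^{(2)} \right) \;\Rightarrow\; \rho = \rho^{(1)} \otimes \rho^{(2)} \tag{15.52}
\end{align*}
This is the only case where the total state operator is determined by the partial
state operators of the components.
If $\rho$ is a state operator and if $\rho^{(1)} = \mathrm{Tr}^{(2)} \rho$ and $\rho^{(2)} = \mathrm{Tr}^{(1)} \rho$ and if $\rho^{(1)}$ describes
a pure state, then
\[
\rho = \rho^{(1)} \otimes \rho^{(2)} \tag{15.53}
\]
i.e., a pure partial state operator must be a factor of the total state operator.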
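\[
\rho = \sum_k \rho_k\, |\phi_k\rangle \langle \phi_k| \tag{15.54}
\]
This form is nonnegative as long as the eigenvalues $\rho_k > 0$. Now let us expand
the eigenvectors in terms of the product vectors
\[
|\phi_k\rangle = \sum_{m,n} C^k_{mn}\, |a_m b_n\rangle \tag{15.55}
\]
This implies
\[
\rho = \sum_k \rho_k \sum_{m,n} \sum_{m',n'} C^k_{mn}\, C^{k*}_{m'n'}\, |a_m b_n\rangle \langle a_{m'} b_{n'}| \tag{15.56}
\]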
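so that
\begin{align*}
\rho^{(1)} = \mathrm{Tr}^{(2)} \rho &= \sum_r \langle b_r | \left( \sum_k \rho_k \sum_{m,n} \sum_{m',n'} C^k_{mn}\, C^{k*}_{m'n'}\, |a_m b_n\rangle \langle a_{m'} b_{n'}| \right) | b_r \rangle \\
&= \sum_r \sum_k \sum_{m,n} \sum_{m',n'} \rho_k\, C^k_{mn}\, C^{k*}_{m'n'}\, \langle b_r | a_m b_n \rangle \langle a_{m'} b_{n'} | b_r \rangle
\end{align*}
Since $\rho^{(1)}$ describes a pure state, it has the form
\[
\rho^{(1)} = |\psi\rangle \langle \psi| \tag{15.58}
\]
Since the original basis $\{|a_m\rangle\}$ is arbitrary, we can choose $|a_1\rangle = |\psi\rangle$ which
implies that
\[
\rho^{(1)} = \sum_{m,m'} |a_m\rangle \langle a_{m'}| \sum_k \rho_k \sum_n C^k_{mn}\, C^{k*}_{m'n} = |a_1\rangle \langle a_1| \tag{15.59}
\]
or that
\[
\sum_k \rho_k \sum_n \left| C^k_{mn} \right|^2 = 0 \quad \text{for } m \neq 1 \tag{15.60}
\]
Therefore, for $m \neq 1$ and any $k$ such that $\rho_k \neq 0$ we must have $C^k_{mn} = 0$ since
$\rho_k \geq 0$. Thus
\begin{align*}
\rho &= \sum_k \rho_k \sum_n \sum_{n'} C^k_{1n}\, C^{k*}_{1n'}\, |a_1 b_n\rangle \langle a_1 b_{n'}| \\
&= |a_1\rangle \langle a_1| \otimes \left( \sum_k \rho_k \sum_n \sum_{n'} C^k_{1n}\, C^{k*}_{1n'}\, |b_n\rangle \langle b_{n'}| \right) \tag{15.61}
\end{align*}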
Summarizing, we have shown that partial states for the components of a system
can be defined, but the states of the components do not suffice for determining
the state of a whole composite system. The relation between the states of the
components and the composite state is very complex. The theorem helps when
one component is pure, which implies that $\rho$ factors as specified.
Now a factorization of the form $\rho = \rho^{(1)} \otimes \rho^{(2)}$ implies that there are no
correlations between the components 1 and 2. Therefore, a component described by a
pure state cannot have any correlations with the rest of the system. Does this
make sense?
\[
|\psi\rangle = |\uparrow\rangle \otimes |\uparrow\rangle \otimes |\uparrow\rangle \otimes \cdots \tag{15.64}
\]
i.e., all spins are up in the z-direction. This seems like a high degree of
correlation. Yet, we must interpret the product form of the state vector as an absence
of correlations among the particles.
We resolve this dilemma by noting that the correlation in question is defined
in terms of the quantum mechanical probability distributions. Since $|\psi\rangle$ is an
eigenvector of the z-components of the spins, there are no fluctuations in these
dynamical variables, and no variability (no fluctuations) implies the degree of
correlation is undefined.
If, instead, we consider the components of the spin in any direction other than
z, they will be subject to fluctuation and those fluctuations will indeed be
uncorrelated in the state $|\psi\rangle$.
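For two spins both up along z this can be checked directly. The sketch below computes the variance of a single-spin component and the covariance between the same component on the two spins; the helper names are illustrative.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.diag([1, -1]).astype(complex)
I2 = np.eye(2, dtype=complex)

up = np.array([1, 0], dtype=complex)
psi = np.kron(up, up)                       # product state |up>|up>

def mean(op):
    return np.vdot(psi, op @ psi).real

def variance(A):
    """Variance of the component A of spin 1."""
    A1 = np.kron(A, I2)
    return mean(A1 @ A1) - mean(A1) ** 2

def covariance(A, B):
    """<A(1) B(2)> - <A(1)><B(2)>."""
    A1, B2 = np.kron(A, I2), np.kron(I2, B)
    return mean(A1 @ B2) - mean(A1) * mean(B2)

print(variance(sz))          # 0: no fluctuations along z
print(variance(sx))          # 1: the x-component fluctuates
print(covariance(sx, sx))    # 0: fluctuations on different spins are independent
```

Along z both variances vanish, so a normalized correlation coefficient would be 0/0; along x the fluctuations are maximal but their covariance is zero.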
The source (inside box) emits pairs of particles in variable directions, but always
with opposite momentum
\[
\vec{k}_b = -\vec{k}_a, \qquad \vec{k}_{b'} = -\vec{k}_{a'} \tag{15.65}
\]
The two output ports on each side of the source box restrict each particle to
two possible directions. This says that the state of the emerging pairs is
\[
|\psi_{12}\rangle = \frac{1}{\sqrt{2}} \left( |\vec{k}_a\rangle |\vec{k}_b\rangle + |\vec{k}_{a'}\rangle |\vec{k}_{b'}\rangle \right) \tag{15.66}
\]
The momenta of the particles are correlated in this state. This means that if
particle 1 on the right has momentum $\hbar\vec{k}_a$, then particle 2 on the left must have
momentum $\hbar\vec{k}_b$ and if 1 has $\hbar\vec{k}_{a'}$, then 2 must have $\hbar\vec{k}_{b'}$.
Now by inserting mirrors in the proper places we can combine the beams a and
a0 on the right and combine beams b and b0 on the left. Looking at only one
side of the apparatus, it would appear that the amplitudes from paths a and
a0 should produce an interference pattern (like a double slit experiment) and
similarly for paths b and b0 on the left.
This, in fact, would be observed experimentally if the particles were not corre-
lated and the state was of the form
\[
|\psi_{12}\rangle = |\psi_1\rangle |\psi_2\rangle \tag{15.67}
\]
This form holds only in the regions where the beams overlap. The wave function
is zero outside the beams.
If you ignore particles on the left and place a screen to detect particles on the
right, then the detection probability for particle 1 is given by
\[
\int \left| \psi_{12}(\vec{r}_1, \vec{r}_2) \right|^2 d\vec{r}_2 \tag{15.70}
\]
This is featureless, i.e., no interference pattern exists in the single particle prob-
ability density.
we select only those particles on the right that are detected in coincidence with
particles on the left in a small volume near ~r1 , then their spatial density is
proportional to
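\[
\int \left| \psi_{12}(\vec{r}_1, \vec{r}_2) \right|^2 \delta(\vec{r}_2)\, d\vec{r}_2 = 1 + \cos\left[ (\vec{k}_a - \vec{k}_{a'}) \cdot \vec{r}_1 \right] \tag{15.71}
\]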
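A one-dimensional plane-wave model reproduces both results. In the sketch below only the transverse coordinate of each particle is kept, with $k_a = +k$, $k_{a'} = -k$ on the right and the opposite values on the left; the value of `k` and the grids are arbitrary choices made for illustration.

```python
import numpy as np

k = 2.0                                     # transverse wave number (arbitrary)
x1 = np.linspace(-3.0, 3.0, 301)            # detector positions on the right
period = np.pi / k                          # period of |psi12|^2 in x2
x2 = np.linspace(0.0, 4 * period, 400, endpoint=False)   # whole number of periods

def psi12(x1, x2):
    """(e^{i k_a x1} e^{i k_b x2} + e^{i k_a' x1} e^{i k_b' x2})/sqrt(2) with k_b = -k_a."""
    return (np.exp(1j * k * (x1 - x2)) + np.exp(-1j * k * (x1 - x2))) / np.sqrt(2)

density = np.abs(psi12(x1[:, None], x2[None, :])) ** 2

singles = density.mean(axis=1)              # particle 2 ignored: average over x2
coincidence = np.abs(psi12(x1, 0.0)) ** 2   # particle 2 detected near x2 = 0

print(singles.min(), singles.max())         # flat: no single-particle fringes
print(coincidence.min(), coincidence.max()) # fringes with full visibility
```

The fringes are present only in the coincidence rate; averaging over the unobserved partner washes them out completely.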
The magnitude and sign of the force depend on the spin state since $\vec{\mu} \propto \vec{S}$ (spin).
Therefore, different spin states (different spin components) will be deflected by
this force into sub-beams propagating in different directions. The SG apparatus
physically separates spin states in space. By blocking off and thus eliminating
all but one of the sub-beams, which is an irreversible process, we can select (or
filter out) a particular spin state. In addition, the value of the spin component
can be inferred from the location of the sub-beam or equivalently, the deflection
of the beam. A schematic of an SG apparatus is shown in Figure 15.2 below:
The velocity of the incident beam is in the y-direction and the magnetic field
is in the xz-plane, which is transverse to the beam. Some idealizations are
necessary to simplify our analysis:
1. the magnetic field vanishes outside of the gap between the poles
This means that, relative to a coordinate origin located some distance below
the magnet, the components of the magnetic field can be written as
In this case, the magnetic force is in the z-direction and the y-component of
the velocity will be constant. We can, therefore, choose a frame of reference
moving uniformly in the y-direction (with the beam velocity). In this frame
the incident particle is at rest and it experiences a time-dependent magnetic
field that is only nonzero during the time interval T that it takes for the beam
to pass through the magnetic field region. The spin Hamiltonian, $H = -\vec{\mu} \cdot \vec{B}$,
can then be written as:
\[
H(t) = \begin{cases} 0 & t < 0 \\ -c z \sigma_z & 0 < t < T \\ 0 & T < t \end{cases} \tag{15.74}
\]
Now, we must have $\nabla \cdot \vec{B} = 0$ for any magnetic field. The above magnetic field
does not, in fact, satisfy this relation. We would need to have, at least, that
\[
B_x = -x B', \qquad B_z = B_0 + z B' \tag{15.75}
\]
Now suppose that the state vector for the particles is given by
\[
|\psi_0\rangle = a\, |+\rangle + b\, |-\rangle \qquad t \leq 0 \tag{15.76}
\]
where $|a|^2 + |b|^2 = 1$ and $|\pm\rangle$ = spin up/down eigenvectors of $\sigma_z$.
The time evolution of this state (equation of motion) then gives for $t \geq 0$
\begin{align*}
|\psi(t)\rangle &= e^{-\frac{i}{\hbar} H t}\, |\psi_0\rangle = a\, e^{-\frac{i}{\hbar} H t}\, |+\rangle + b\, e^{-\frac{i}{\hbar} H t}\, |-\rangle \\
&= a\, e^{i c z t/\hbar}\, |+\rangle + b\, e^{-i c z t/\hbar}\, |-\rangle \tag{15.77}
\end{align*}
so that for $t \geq T$
This says that the effect of the interaction is to create a correlation between the
spin and the momentum of the particle. The state vector above says that
if $\sigma_z = +1$ then $P_z = +Tc$
if $\sigma_z = -1$ then $P_z = -Tc$
or that the trajectory of the particle will be deflected either up or down
according to whether $\sigma_z$ is positive or negative. Thus, by observing the deflection of
the particle we can infer the spin value.
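The momentum transfer can be made explicit by attaching a spatial wave packet to each spin component. The sketch below multiplies a Gaussian packet in z by the position-dependent phase factors $e^{\pm i c T z/\hbar}$ acquired in the magnet and evaluates $\langle P_z \rangle$ on a grid. The sign convention (spin up receiving $+Tc$), the packet width, and the values of `c` and `T` are assumptions made for this illustration.

```python
import numpy as np

hbar = 1.0
c, T = 0.5, 3.0                       # gradient strength and transit time (arbitrary)

z = np.linspace(-40.0, 40.0, 4096, endpoint=False)
dz = z[1] - z[0]
g = np.exp(-z ** 2 / (2 * 2.0 ** 2))
g = g / np.sqrt(np.sum(np.abs(g) ** 2) * dz)    # normalized packet, initially at rest

def mean_momentum(phi):
    """<P_z> = integral of phi^* (-i hbar d/dz) phi, by finite differences."""
    dphi = np.gradient(phi, dz)
    return (np.sum(np.conj(phi) * (-1j * hbar) * dphi) * dz).real

phi_up = np.exp(1j * c * T * z / hbar) * g      # spatial factor multiplying |+>
phi_down = np.exp(-1j * c * T * z / hbar) * g   # spatial factor multiplying |->

print(mean_momentum(g))          # about 0 before the magnet
print(mean_momentum(phi_up))     # about +T*c
print(mean_momentum(phi_down))   # about -T*c
```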
We assume that the apparatus (II) has an indicator variable A, and a
corresponding operator $A$ that also possesses a complete set of eigenvectors, i.e.,
where $\alpha$ is the indicator eigenvalue and m labels all the other quantum numbers
needed to specify a unique eigenvector of the apparatus.
Now introduce some interaction between (I) and (II) that produces a unique
correspondence between the value r of the dynamical variable R of (I) and the
indicator $\alpha_r$ of (II). Since the interaction causes a time development of the state
vectors, the time development operator U implicitly specifies the interaction.
We can study this process further via the time development operator U without
specifying any more details of the interaction.
Suppose that the initial state of (I) is an eigenstate of R, say $|r\rangle_I$. Then the
initial state of the combined system (I) and (II) is given by
If we require that the measurement should not change the value of the quantity
being measured, then we must have
First, there is no reason why the state of the object (I) should not change during
the interaction (in practice it usually does). It is also not necessary for the state
of the apparatus to remain an eigenvector corresponding to a unique value of
m. We can assume a much more general result of the form
\[
U\, |r\rangle_I\, |0, m\rangle_{II} = \sum_{r',m'} u^{r,m}_{r',m'}\, |r'\rangle_I\, |\alpha_r, m'\rangle_{II} = |\alpha_r ; (r, m)\rangle \tag{15.83}
\]
The labels (r, m) in the last vector are not eigenvalues but just indicate the
initial state from which the vector evolved.
The only restrictions here are that the final state vector is related to the initial
state vector by a unitary transformation, and that the particular value of r in
the initial state vector should correspond to a unique value of the indicator $\alpha_r$ in
the final state vector. This last condition is required if the apparatus is actually
to be able to carry out a measurement.
In the SG example the dynamical variable being measured is the spin component
$\sigma_z$ and the indicator variable is the momentum $P_z$. In this case, therefore, the
indicator variable is not physically separate from the object of the measurement.
It only needs to be kinematically separate. Also in this case, we look at the
deflected beams and use the position coordinates of the point where the beam
strikes a screen as the indicator variable.
Now consider a general initial state for object (I) of the form
\[
|\psi\rangle_I = \sum_r c_r\, |r\rangle_I \tag{15.84}
\]
Finally, the probability, in the final state, that the indicator variable A of the
apparatus (II) has the value $\alpha_r$ is $|c_r|^2$, which is just the same as the probability
in the initial state that the dynamical variable R of the object (I) had the value
r. This result is required if we are to have a faithful mapping from the initial
value of r to the final value $\alpha_r$.
This enables us to get a better handle on how to interpret the quantum state.
(A) is the more common interpretation in the literature, although it is not al-
ways made explicit. It assumes that because the state vector plays such an
important role in the mathematical formalism of quantum mechanics, it must
have an equally important role in the interpretation. It makes a strong corre-
spondence between the properties of the world and the properties of the state
vector.
There is no such difficulty with (B) where the state vector is simply some ab-
stract quantity that characterizes the probability distributions of the dynamical
variables of an ensemble of similarly prepared systems (each member of the en-
semble consists of an object and a measuring apparatus).
The first idea for the measurement theorem came from Schrodinger in 1935 in
his now famous Cat experiment. We consider a box containing a cat, a flask
of poison gas, a radioactive atom, and an automatic device which releases the
poison gas when the atom decays.
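If the atom were isolated, then after a time equal to one half-life, its state vector
would be
\begin{align*}
|\psi_{atom}\rangle &= \frac{1}{\sqrt{2}} \left( |\text{undecayed}\rangle + |\text{decayed}\rangle \right) \\
&= \frac{1}{\sqrt{2}} \left( |u\rangle + |d\rangle \right) \tag{15.86}
\end{align*}
Now the atom is coupled to the cat via the apparatus. Therefore, the state of
the system after one half-life is
\[
|\psi_{system}\rangle = \frac{1}{\sqrt{2}} \left( |u\rangle_{atom}\, |\text{alive}\rangle_{cat} + |d\rangle_{atom}\, |\text{dead}\rangle_{cat} \right) \tag{15.87}
\]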
This is an entangled or correlated state. It is also a superposition of macroscopi-
cally distinct states (alive cat and dead cat). This is typical of any measurement
process.
The physicists who believe interpretation (A) are now forced to introduce a new
postulate, called reduction of the state vector or collapse of the state vector.
A new process arises during a measurement so that a transition
$|\psi^f_m\rangle \to |\alpha_{r_0} ; (r_0, m)\rangle$ occurs.
The process is called reduction or collapse and the final state is called the
reduced state. Here the new final state vector is now an eigenstate of the indicator
variable A with eigenvalue $\alpha_{r_0}$ corresponding to the actual observed indicator
position.
Proposed mechanisms
1. The reduction process is caused by an unpredictable and uncontrollable
disturbance of the object by the measuring apparatus. Any interaction
between the object (I) and the apparatus (II) that might act as the cause
of this disturbance must already be included in the Hamiltonian that we
use to construct the time development operator.
2. The observer causes the reduction process when she reads the result of the
measurement from the apparatus.
This is a variation of (1) with the observer instead of the apparatus causing
the disturbance. It also makes no sense. It also leads some physicists
to speculation about whether quantum mechanics can be applied to the
consciousness of the observer, which is an arena we do not need to enter.
Here the proponents never quite make clear which part of the environment
is essential. If we include within (II) all these essential parts, then it is just
another disturbance model. We will have more to say about decoherence
later.
We can defeat the last view by re-deriving the theorem using general state
operators.
where the wm are the probabilities associated with each of the microscopic states
labeled by m, where m represents the quantum numbers(large number) of the
apparatus other than the indicator . The final state must be a mixture of
indicator eigenvalues, say of the form
X 2
X
d = |cr | vm |r ; (r, m)i hr ; (r, m)| (15.90)
r m
1170
distinct indicator position eigenvectors, which is not allowed by interpretation
A. The final state is more plausible than the reduced state, which would have
prescribed a unique measurement result $r_0$. The new conjecture is consistent
with the prediction that the result of the measurement being r has probability
$|c_r|^2$.
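This conjecture is not correct, however. The actual final state of the
measurement process is given by
\[ \rho^f = U\rho^i U^{+} = \sum_m w_m\, |\psi_m^f\rangle\langle\psi_m^f| \tag{15.91} \]
where $|\psi_m^f\rangle = U|\psi_m^i\rangle$. We then get
\[ \rho^f = \sum_{r_1}\sum_{r_2} c_{r_1} c_{r_2}^{*} \sum_m w_m\, |\alpha_{r_1};(r_1,m)\rangle\langle\alpha_{r_2};(r_2,m)| \tag{15.92} \]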
It seems that in all cases where the initial state is not an eigenstate of the dynam-
ical variable being measured, the final state will involve coherent superpositions
of macroscopically distinct indicator eigenvectors. This makes interpretation
(A) untenable.
Thus, if we attempt to maintain the idea that the statistical quantum theory
is, in principle, able to produce a complete description of an individual physical
quantum system, then we seem to always end up in a theoretical box full of im-
plausible results. If one, however, views the quantum mechanical description as
the description of an ensemble of systems, it seems possible that the theoretical
difficulties will vanish.
It turns out that taking this view seldom leads to any serious errors. This
is because the predictions of quantum mechanics that are derived from a wave
function consist of probabilities, and the operational significance of a probability
corresponds to a relative frequency. This means that one usually has to invoke
an ensemble of similar systems when one makes a comparison with experiment
independent of how one interprets the wave function.
Since so many results do not seem to depend in any critical manner on which
interpretation one makes, should we dismiss the subject of interpretation as
irrelevant?
Let us consider this interesting case. Electrons are emitted from a hot cathode
and then accelerated to form a beam to be used in interference experiments.
Using interpretation (A) we can account for the energy spread in the beam via
two different assumptions:
Let us analyze this system such that the electron beam is moving along the x-
axis (a one dimensional problem). Then, assumption (1) says that each electron
has a wave function of the form
Note that all of the time dependence cancels out, so that we have a steady state!
Now, assumption (2) says that if an individual electron is emitted at $t_0$ in a
wave packet state
\[ \psi_{t_0}(x,t) = \int A(\omega)\, e^{i(kx-\omega(t-t_0))}\, d\omega \tag{15.97} \]
then the energy distribution is given by $|A(\omega)|^2 = W(\omega)$. The state function for
the emission process is obtained by averaging over the emission time (assumed
to be uniformly distributed):
\begin{align*}
\langle x|\rho|x'\rangle &= \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2}\psi_{t_0}(x,t)\,\psi_{t_0}^{*}(x',t)\,dt_0 \\
&= \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2}\int A(\omega)\,e^{i(kx-\omega(t-t_0))}\,d\omega \int A^{*}(\omega')\,e^{-i(k'x'-\omega'(t-t_0))}\,d\omega'\,dt_0
\end{align*}
Now, integrating over $t_0$ and then taking the limit, the integral is zero unless
$\omega = \omega'$ ($k = k'$). We thus get
\[ \rho(x,x') = \int |A(\omega)|^2\, e^{ik(x-x')}\, d\omega \tag{15.98} \]
This is the same result as obtained from assumption (1). Assumptions (1)
and (2) do not lead to any observable differences and any controversy over the
supposed wave functions of individual electrons seems to be pointless.
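The agreement between the two assumptions can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes a discrete, equally spaced set of frequencies in place of the continuous integral in (15.97), units with ħ = m = 1 (so that ω = k²/2), and arbitrary illustrative amplitudes. The function names are hypothetical.

```python
import cmath
import math

# Assumed illustrative spectrum: five equally spaced frequencies, hbar = m = 1.
OMEGAS = [1.0 + 0.25 * j for j in range(5)]
KS = [math.sqrt(2.0 * w) for w in OMEGAS]               # free particle: omega = k^2 / 2
AMPS = [0.2, 0.5 * cmath.exp(0.3j), 0.7, 0.4j, 0.1]     # A(omega) with arbitrary phases


def packet(x, t, t0):
    """Discrete analogue of the wave packet (15.97) emitted at time t0."""
    return sum(a * cmath.exp(1j * (k * x - w * (t - t0)))
               for a, w, k in zip(AMPS, OMEGAS, KS))


def rho_from_packets(x, xp, t, n_steps=64):
    """Assumption (2): average psi(x,t) psi*(x',t) over uniform emission times."""
    period = 2.0 * math.pi / (OMEGAS[1] - OMEGAS[0])    # every beat term repeats after this
    total = 0j
    for j in range(n_steps):
        t0 = period * j / n_steps
        total += packet(x, t, t0) * packet(xp, t, t0).conjugate()
    return total / n_steps


def rho_from_mixture(x, xp):
    """Assumption (1): plane waves mixed with weights W = |A|^2, as in (15.98)."""
    return sum(abs(a) ** 2 * cmath.exp(1j * k * (x - xp))
               for a, k in zip(AMPS, KS))


if __name__ == "__main__":
    print(rho_from_packets(0.4, -1.1, t=0.3))
    print(rho_from_mixture(0.4, -1.1))
```

In this sketch the phases of the amplitudes and the observation time t drop out of the averaged result, which is the steady-state behaviour described above.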
We now regard the state operator as the fundamental description of the state
generated by the thermal emission process, which now yields an ensemble of
systems each of which is a single electron. In this case, we can obtain $\rho(x,x')$
without ever speculating about individual wave functions.
and thus H and $\rho$ possess a complete set of common eigenvectors. This set is the
free particle states $|\omega\rangle$, which have the coordinate representation
$\langle x|\omega\rangle = e^{ikx}$ and satisfy the eigenvalue equation $H|\omega\rangle = \hbar\omega|\omega\rangle$.
In this case the state operator has the form
\[ \rho = \int |\omega\rangle\langle\omega|\, W(\omega)\, d\omega \tag{15.101} \]
Of course we get the same result as earlier, since all of these clever approaches
must agree with the same set of experimental predictions in the end!!
This last approach seems, however, to be superior since it avoids any (probably
pointless) speculations about the form of the supposed wave function of an
individual electron.
The incident beam at A is divided into a transmitted beam AC and a diffracted
(Bragg-reflected) beam AB. Similar divisions occur at B and C, with the trans-
mitted beams leaving the apparatus (no further role in the experiment). The
diffracted beams from B and C recombine coherently at D, where a further
Bragg reflection takes place. The interference of the amplitudes of the two
beams is observable by means of two detectors, D1 and D2 .
The amplitude at D1 is the sum of the transmitted part of CD plus the diffracted
part of BD, and similarly, the amplitude at D2 is the sum of the transmitted
part of BD plus the diffracted part of CD.
We assume for simplicity that the transmission and reflection coefficients are
the same at each of the vertices A,B,C and D and that free propagation of the
plane waves takes place between these vertices.
The time evolution and propagation processes are generated by linear opera-
tors. This implies that the relation between the amplitudes of the outgoing and
incoming waves is of the form
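\[ \begin{pmatrix} a_1' \\ a_2' \end{pmatrix} = U \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \quad\text{where}\quad U = \begin{pmatrix} t & r \\ s & u \end{pmatrix} \tag{15.103} \]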
or
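\[ |\det U| = 1 \;\Rightarrow\; |tu - rs| = 1 \tag{15.105} \]
\[ U^{-1} = \frac{1}{tu - rs}\begin{pmatrix} u & -r \\ -s & t \end{pmatrix} = U^{+} = \begin{pmatrix} t^{*} & s^{*} \\ r^{*} & u^{*} \end{pmatrix} \;\Rightarrow\; |u| = |t| \text{ and } |s| = |r| \tag{15.106} \]
Now since |u| = |t| and |s| = |r|, the normalization $|t|^2 + |r|^2 = 1$ of a row of U
gives |tu| + |rs| = 1. This result is compatible with |tu − rs| = 1 only if tu and
−rs have the same complex phase, and thus rs/tu must be real and negative.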
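These unitarity relations are easy to check for a concrete matrix. The sketch below is illustrative only: the parametrisation of U by a mixing angle and three phases, the function name `beam_splitter` and the numerical values are assumptions, not taken from the experiment.

```python
import cmath
import math


def beam_splitter(theta, overall, phase_a, phase_b):
    """Return (t, r, s, u) for a general 2x2 unitary U = [[t, r], [s, u]].

    Assumed parametrisation: U = e^{i overall} [[a, b], [-b*, a*]]
    with |a|^2 + |b|^2 = 1.
    """
    a = math.cos(theta) * cmath.exp(1j * phase_a)
    b = math.sin(theta) * cmath.exp(1j * phase_b)
    g = cmath.exp(1j * overall)
    return g * a, g * b, -g * b.conjugate(), g * a.conjugate()


t, r, s, u = beam_splitter(0.7, 0.3, 1.1, -0.4)
ratio = (r * s) / (t * u)

if __name__ == "__main__":
    print("|tu - rs|    =", abs(t * u - r * s))
    print("|tu| + |rs|  =", abs(t * u) + abs(r * s))
    print("rs / tu      =", ratio)
```

For this parametrisation the ratio rs/tu equals −tan²(theta), real and negative whatever the three phases are.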
where $\phi_{AB}$ is the phase change due to the propagation through the empty space
between A and B, and similarly for $\phi_{AC}$. The amplitude that emerges toward
the detector $D_1$ is the sum of the amplitudes from paths $ABDD_1$ and $ACDD_1$
(and similarly for detector $D_2$):
\[ \Psi_{D_1} = A\,(a_2',a_1)\,e^{i\phi_{AB}}\,(a_1',a_2)\,e^{i\phi_{BD}}\,(a_2',a_1) + (a_1',a_1)\,e^{i\phi_{AC}}\,(a_2',a_1)\,e^{i\phi_{CD}}\,(a_2',a_2) \]
Any perturbation that has an unequal effect on the phases associated with the
two paths will influence the intensities of the beams reaching the detectors D1
and D2 . Since rs/tu is negative, it follows that if the interference between the
two terms in D2 is constructive, then the interference between the two terms
in D1 will be destructive and vice versa. The best way to detect such a per-
turbation is to monitor the difference between the counting rates in D1 and D2 .
An experiment of this sort in 1975 detected a quantum interference due to
gravity. The interferometer was rotated about a horizontal axis parallel to the
incident beam causing a difference in the gravitational potential on paths AC
and AB and thus a phase shift in the interference pattern.
In the spin recombination experiment, a beam of neutrons with spin polarized in
the +z-direction is incident from the left as shown in Figure 15.6 below:
We let the vectors $|+\rangle$ and $|-\rangle$ denote the spin-up and spin-down eigenvectors
of $\sigma_z$. The neutrons at point B have the spin state $|+\rangle$ and the neutrons after
the spin-flipper have the spin state $|-\rangle$. We then ask this question.
At one time, when the beams are at B and C, they are separated by several
centimeters. This means that their spatial wave functions do not overlap. One
might suppose, in this case, that all coherence is lost and that no interference
is possible, i.e., that the spin state should be an incoherent mixture of spin-up
and spin-down states of the form
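\[ \rho_{inc} = \frac{1}{2}\left(|+\rangle\langle+| + |-\rangle\langle-|\right) \tag{15.112} \]
If, on the other hand, the coherence is maintained somehow, then the spin state
will be of the form
\[ \rho_{coh} = |u\rangle\langle u| \quad\text{where}\quad |u\rangle = \frac{1}{\sqrt{2}}\left(e^{i\alpha}|+\rangle + e^{i\beta}|-\rangle\right) \tag{15.113} \]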
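Both of these state operators predict that $\langle\sigma_z\rangle = 0$, i.e.
\begin{align*}
\langle\sigma_z\rangle_{inc} &= \mathrm{Tr}\,\rho_{inc}\sigma_z = \langle+|\rho_{inc}\sigma_z|+\rangle + \langle-|\rho_{inc}\sigma_z|-\rangle \\
&= \langle+|\rho_{inc}|+\rangle - \langle-|\rho_{inc}|-\rangle \\
&= \langle+|\tfrac{1}{2}\left(|+\rangle\langle+| + |-\rangle\langle-|\right)|+\rangle - \langle-|\tfrac{1}{2}\left(|+\rangle\langle+| + |-\rangle\langle-|\right)|-\rangle \\
&= \tfrac{1}{2} - \tfrac{1}{2} = 0
\end{align*}
\begin{align*}
\langle\sigma_z\rangle_{coh} &= \mathrm{Tr}\,\rho_{coh}\sigma_z = \langle+|\rho_{coh}\sigma_z|+\rangle + \langle-|\rho_{coh}\sigma_z|-\rangle \\
&= \langle+|\rho_{coh}|+\rangle - \langle-|\rho_{coh}|-\rangle \\
&= \langle+|\left(|u\rangle\langle u|\right)|+\rangle - \langle-|\left(|u\rangle\langle u|\right)|-\rangle = \tfrac{1}{2} - \tfrac{1}{2} = 0
\end{align*}
or that the z-component of spin is equally likely to be positive or negative.
However, $\rho_{inc}$ actually predicts zero polarization in any direction, while $\rho_{coh}$
predicts that the spin is polarized in some direction in the xy-plane. We can
see this by computing
\begin{align*}
\langle\sigma_x\rangle_{coh} &= \mathrm{Tr}\,\rho_{coh}\sigma_x = \langle+|\rho_{coh}\sigma_x|+\rangle + \langle-|\rho_{coh}\sigma_x|-\rangle \\
&= \langle+|\rho_{coh}|-\rangle + \langle-|\rho_{coh}|+\rangle = \langle+|\left(|u\rangle\langle u|\right)|-\rangle + \langle-|\left(|u\rangle\langle u|\right)|+\rangle \\
&= \tfrac{1}{2}\langle+|\left(e^{i\alpha}|+\rangle + e^{i\beta}|-\rangle\right)\left(e^{-i\alpha}\langle+| + e^{-i\beta}\langle-|\right)|-\rangle \\
&\quad + \tfrac{1}{2}\langle-|\left(e^{i\alpha}|+\rangle + e^{i\beta}|-\rangle\right)\left(e^{-i\alpha}\langle+| + e^{-i\beta}\langle-|\right)|+\rangle \\
&= \tfrac{1}{2}e^{i\alpha}e^{-i\beta} + \tfrac{1}{2}e^{i\beta}e^{-i\alpha} = \cos(\alpha-\beta) \tag{15.114}
\end{align*}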
Even though the phases $\alpha$ and $\beta$ are not necessarily known in advance, their
difference can be systematically varied by placing known phase-shifters in one
of the beams. Such an experiment was done in 1982 and it found a periodic
dependence of $\langle\sigma_x\rangle$ on the phase shift and no such dependence for $\langle\sigma_z\rangle$. This
confirms that the coherent superposition is the correct state.
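The two predictions can be compared directly with 2×2 matrices. This is a minimal sketch using nested lists in the basis {|+⟩, |−⟩}; the helper names `rho_coh` and `expectation` are hypothetical.

```python
import cmath
import math

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]
RHO_INC = [[0.5, 0.0], [0.0, 0.5]]                      # incoherent mixture, Eq. (15.112)


def rho_coh(alpha, beta):
    """|u><u| with |u> = (e^{i alpha}|+> + e^{i beta}|->)/sqrt(2), Eq. (15.113)."""
    u = [cmath.exp(1j * alpha) / math.sqrt(2), cmath.exp(1j * beta) / math.sqrt(2)]
    return [[u[i] * u[j].conjugate() for j in range(2)] for i in range(2)]


def expectation(rho, op):
    """Tr(rho op) for 2x2 matrices stored as nested lists."""
    return sum(rho[i][j] * op[j][i] for i in range(2) for j in range(2))


if __name__ == "__main__":
    rho = rho_coh(0.9, 0.2)
    print("coherent   <sigma_z>, <sigma_x>:", expectation(rho, SIGMA_Z), expectation(rho, SIGMA_X))
    print("incoherent <sigma_z>, <sigma_x>:", expectation(RHO_INC, SIGMA_Z), expectation(RHO_INC, SIGMA_X))
```

Only the coherent state gives a ⟨σ_x⟩ that varies with the phase difference, which is the signature the experiment looked for.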
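If we account for both the position and spin variables, the state function (vector)
should be
\[ |\Psi\rangle = \psi_+(\vec r)\,|+\rangle + \psi_-(\vec r)\,|-\rangle \tag{15.115} \]
where the wave functions $\psi_+(\vec r)$ and $\psi_-(\vec r)$ vanish outside the beams. The spin
state operator is given by
\[ \rho = |\Psi\rangle\langle\Psi| = |\psi_+|^2\,|+\rangle\langle+| + \psi_+\psi_-^{*}\,|+\rangle\langle-| + \psi_+^{*}\psi_-\,|-\rangle\langle+| + |\psi_-|^2\,|-\rangle\langle-| \tag{15.116} \]
or
\[ \rho = \begin{pmatrix} \langle+|\rho|+\rangle & \langle+|\rho|-\rangle \\ \langle-|\rho|+\rangle & \langle-|\rho|-\rangle \end{pmatrix} = \begin{pmatrix} |\psi_+|^2 & \psi_+\psi_-^{*} \\ \psi_+^{*}\psi_- & |\psi_-|^2 \end{pmatrix} \tag{15.117} \]
Along AB, AC, and from B to the left of D, $\psi_-(\vec r) = 0$ and from the right of
the spin-flipper to the left of D, $\psi_+(\vec r) = 0$. Both components are nonzero to
the right of D.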
At the point D the off-diagonal terms are nonzero, which means that we have
a coherent superposition. In this experiment, the preservation of the coherence
over a distance of several centimeters is possible because the scatterer is cut
from a single crystal of silicon and the relative separations of the components
are stable to within the interatomic separation distance.
Suppose that the spectrometer were not such a high precision device and that
the relative separations of points A, B, C, and D are subject to random fluctu-
ations that are larger than the spatial spread of the neutron wave function.
This gives rise to random fluctuations in the phases $\alpha$ and $\beta$ and hence, in
the phases of the off-diagonal terms. Different neutrons passing through the
spectrometer at different times would experience different configurations of the
spectrometer, and to determine the observed statistical distributions we must
average over these fluctuations.
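The effect of this averaging on the off-diagonal terms can be illustrated with a small simulation. The sketch below assumes independent Gaussian fluctuations of the two phases with a common standard deviation `sigma`; the text does not specify a noise model, so this choice and the function name are assumptions made for illustration.

```python
import cmath
import math
import random


def averaged_coherence(sigma, n_samples=20000, seed=1):
    """Ensemble average of e^{i(alpha - beta)} for Gaussian phase noise.

    The off-diagonal element of the averaged spin state operator is one half
    of this number (compare the coherent state with equal-weight components).
    """
    rng = random.Random(seed)
    total = 0j
    for _ in range(n_samples):
        alpha = rng.gauss(0.0, sigma)
        beta = rng.gauss(0.0, sigma)
        total += cmath.exp(1j * (alpha - beta))
    return total / n_samples


if __name__ == "__main__":
    for sigma in (0.0, 0.3, 1.0, 10.0):
        print(sigma, abs(averaged_coherence(sigma)), math.exp(-sigma ** 2))
```

For this noise model the averaged factor is exp(−sigma²), so small fluctuations leave the coherence nearly intact while fluctuations much larger than one radian drive the off-diagonal terms to zero and leave an incoherent mixture.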
The reduced state can therefore be significant under certain conditions. It does
not seem, however, to be a fundamental object, but instead arises only due
to an effect on the system (neutron + spectrometer) from its environment (the
cause of the noise fluctuations). The separation of the system and the
environment is, however, artificial. If the reduction takes place in this manner, then it
is not a new fundamental process, and it would not have anything to do with
measurement.
Let us now include the environment not as an external effect on the system,
but as an integral part of the system. The neutrons that follow path ABD
will interact differently with the environment than those that follow path ACD.
These interactions will affect the state of the environment and therefore the
final state must now be
where $|e_1\rangle$ is the state of the environment if the neutron followed path ABD
and $|e_2\rangle$ is the state of the environment if the neutron followed path ACD. If
$|e_1\rangle = |e_2\rangle$, then this inclusion of the environment has no effect. If we recalculate
the state operator we now get
\[ \rho = \begin{pmatrix} |\psi_+|^2 & \psi_+\psi_-^{*}\langle e_2|e_1\rangle \\ \psi_+^{*}\psi_-\langle e_1|e_2\rangle & |\psi_-|^2 \end{pmatrix} \tag{15.119} \]
If the difference between the effects of taking paths ABD and ACD on the
environment is so great that $|e_1\rangle$ and $|e_2\rangle$ are orthogonal, then the state operator
reduces once again to $\rho_{inc}$.
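The role of the overlap ⟨e₂|e₁⟩ can be made concrete by tracing out the environment explicitly. The following sketch assumes a total state of the form ψ₊|+⟩|e₁⟩ + ψ₋|−⟩|e₂⟩ with the environment states given as coefficient lists in some orthonormal basis; the function name and the two-dimensional environment are illustrative assumptions.

```python
import math


def reduced_spin_state(psi_plus, psi_minus, e1, e2):
    """Spin state operator obtained by tracing out the environment.

    Assumed total state: psi_plus |+>|e1> + psi_minus |->|e2>, where e1 and e2
    list the components of the environment states in an orthonormal basis.
    """
    # amp[s][k] is the amplitude of |spin s>|environment basis state k>
    amp = [[psi_plus * c for c in e1], [psi_minus * c for c in e2]]
    dim = len(e1)
    return [[sum(amp[s][k] * amp[sp][k].conjugate() for k in range(dim))
             for sp in range(2)] for s in range(2)]


if __name__ == "__main__":
    a = 1 / math.sqrt(2)
    for chi in (0.0, 0.4, math.pi / 2):
        e2 = [math.cos(chi), math.sin(chi)]             # overlap <e2|e1> = cos(chi)
        rho = reduced_spin_state(a, a, [1.0, 0.0], e2)
        print(chi, rho[0][1])
```

The off-diagonal element is the coherent value multiplied by the overlap, so it vanishes when the two environment states are orthogonal.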
We have thus seen two possible methods of handling the influences, if any, of
the environment on the experiment. One method treats the effect of the envi-
ronment as an outside perturbation, which introduces random phases and we
lose coherence if the random phase fluctuations are large enough. In the second
method, the environment is included in the state vector of the system. It is then
the effect of the apparatus on the environment, rather than the environment on
the apparatus, that is important. Clearly, these two approaches are equivalent.
We can deal with this by computing the joint probability distribution for the
results of two or more measurements, or the probability for one measurement
conditional on both state preparation and the result of another measurement.
project onto the subspace spanned by those eigenvectors whose eigenvalues lie
in the interval $\Delta$.
We suppose that the first of these events takes place at time $t_a$ and the second at
time $t_b$. We use the Heisenberg representation and assume that the specification
of $t_a$ is implicit in the operators R and $M_R(\Delta)$ and that the specification of $t_b$
is implicit in the operators S and $M_S(\Delta)$.
Let us now extract the probability density g(r) using these results.
Discrete Spectrum
Let R have a purely discrete spectrum. We can then write
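\[ R = \sum_n r_n\, |r_n\rangle\langle r_n| \tag{15.125} \]
Therefore,
\begin{align*}
g(r) &= \frac{\partial}{\partial r}\,\mathrm{Prob}(R<r|\rho) \\
&= \frac{\partial}{\partial r}\sum_n \theta(r-r_n)\,\langle r_n|\rho|r_n\rangle \\
&= \sum_n \delta(r-r_n)\,\langle r_n|\rho|r_n\rangle \tag{15.128}
\end{align*}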
The probability that R will have the discrete value r in the virtual ensemble
characterized by $\rho$ is
which projects onto the subspace spanned by all the degenerate eigenvectors
with eigenvalue $r = r_n$. We then have
\[ \mathrm{Prob}(R = r|\rho) = \mathrm{Tr}\,\rho P(r) \tag{15.131} \]
we must have
\[ \langle r_0|\rho|r_0\rangle = 1 \tag{15.134} \]
But, any state operator must satisfy $\mathrm{Tr}\,\rho^2 \le 1$ which implies
\[ \sum_{m,n}\langle r_n|\rho|r_m\rangle\langle r_m|\rho|r_n\rangle = \sum_{m,n}|\langle r_n|\rho|r_m\rangle|^2 \le 1 \tag{15.135} \]
Since one term in the sum already accounts for an amount = 1, all of the other
diagonal and non-diagonal matrix elements of $\rho$ must vanish.
Therefore the only state for which R takes on the non-degenerate eigenvalue $r_0$
with probability = 1 is the pure state $\rho = |r_0\rangle\langle r_0|$. This is what we mean by
an eigenstate.
Continuous Spectrum
Let Q be a Hermitian operator having a purely continuous spectrum such that
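\[ Q|q'\rangle = q'|q'\rangle \tag{15.136} \]
\[ \langle q'|q''\rangle = \delta(q'-q'') \tag{15.138} \]
Now let g(q)dq be the probability that the corresponding observable Q lies
between q and q + dq. We then get
\begin{align*}
\langle\theta(q-Q)\rangle &= \int_{-\infty}^{q} g(q')\,dq' = \mathrm{Prob}(Q<q|\rho) \\
&= \mathrm{Tr}\,\rho\,\theta(q-Q) = \mathrm{Tr}\,\rho\int_{-\infty}^{\infty}\theta(q-q')\,|q'\rangle\langle q'|\,dq' \\
&= \int_{-\infty}^{q}\langle q'|\rho|q'\rangle\,dq' \tag{15.139}
\end{align*}
or
\[ g(q) = \frac{\partial}{\partial q}\,\mathrm{Prob}(Q<q|\rho) = \langle q|\rho|q\rangle \tag{15.140} \]
Once again in the pure state we get
\[ g(q) = |\langle q|\psi\rangle|^2 \tag{15.141} \]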
The expressions for the probability and the probability density always consist
of a relation between two factors: one characterizing the state (the state
function) and one characterizing a portion of the spectrum being observed (the filter
function).
In the expression
\[ \mathrm{Prob}(R = r|\rho) = \mathrm{Tr}\,\rho P(r) \tag{15.142} \]
they are $\rho$ and P(r) respectively.
In the expressions $|\langle r_n|\psi\rangle|^2$ and $|\langle q|\psi\rangle|^2$ they are the state vector and an
eigenvector belonging to the observable.
The two objects have very distinct natures, i.e., the state vector must be
normalized and therefore belongs to Hilbert space. The filter function does not
necessarily belong to Hilbert space, but rather to an extended or rigged Hilbert
space.
The sum is over all eigenvectors (even degenerate) whose eigenvalues lie in the
subset $\Delta$. For a continuous spectrum the sum becomes an integral. Then the
probability that the value of R will lie within $\Delta$ is given by
If the region $\Delta$ contains only one eigenvalue, then this reduces to the earlier result
for $\mathrm{Prob}(R = r|\rho)$ and in the continuous spectrum case it is equal to the integral
of the probability density over the region $\Delta$. This result satisfies all four of the
probability axioms in Chapter 5.
We then have
The joint probability $\mathrm{Prob}(A\wedge B|C)$ can be evaluated only if we can find (from
quantum mechanics) a projection operator for the compound event $A\wedge B$. This
is possible if the projection operators $M_R(\Delta_a)$ and $M_S(\Delta_b)$ commute. In that
case the product $M_R(\Delta_a)M_S(\Delta_b)$ is also a projection operator that projects
onto the subspace spanned by those common eigenvectors of R and S with
eigenvalues in the ranges $\Delta_a$ and $\Delta_b$, respectively. We then have
This is just the joint probability that both events A and B occur on the condition
C, or, in other words, it is the probability that the result of the measurement
of R at time $t_a$ is in the range $\Delta_a$ and the result of the measurement of S at
time $t_b$ is in the range $\Delta_b$, following the state preparation corresponding to $\rho$.
If R and S commute this calculation is possible for arbitrary ranges.
The last term $\mathrm{Prob}(B|A\wedge C)$ is
1. We can regard the preparation of the state $\rho$ and the following
measurement of R as a composite operation that corresponds to the preparation
of a new state $\rho'$.
2. We can define
\[ \mathrm{Prob}(B|A\wedge C) = \frac{\mathrm{Prob}(A\wedge B|C)}{\mathrm{Prob}(A|C)} \tag{15.148} \]
since we know how to calculate the right-hand side.
Filtering-Type Measurements
If we want to regard the initial $\rho$ state preparation followed by the R
measurement as a composite operation that results in a new state $\rho'$, then we require a
detailed description of the R measurement apparatus and a dynamical analysis
of its operation. This is only possible for particular cases and no general
treatment can be done.
Are there any types of measurements that we can treat without too much
difficulty?
Let us consider a measurement of the filtering type where the ensemble of
systems generated by the $\rho$ state preparation is separated into sub-ensembles
according to the value of the dynamical variable R (an SG apparatus is an
example of this type of measurement).
This filtering process, which has the effect of removing all values of R except
those for which $R\in\Delta_a$, can be regarded as preparing a new state that is
represented by
\[ \rho' = \frac{M_R(\Delta_a)\,\rho\,M_R(\Delta_a)}{\mathrm{Tr}\left[M_R(\Delta_a)\,\rho\,M_R(\Delta_a)\right]} \tag{15.150} \]
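so that we have
\begin{align*}
\mathrm{Prob}(B|A\wedge\rho) &= \mathrm{Prob}\left((S\in\Delta_b)\,|\,(R\in\Delta_a)\wedge\rho\right) \\
&= \mathrm{Prob}\left((S\in\Delta_b)\,|\,\rho'\right) = \mathrm{Tr}\left(\rho' M_S(\Delta_b)\right) \tag{15.151}
\end{align*}
Finally, we can calculate a joint probability for two filtering-type measurements
as:
\begin{align*}
\mathrm{Prob}(A\wedge B|C) &= \mathrm{Prob}(A|C)\,\mathrm{Prob}(B|A\wedge C) \\
&= \mathrm{Prob}(A|\rho)\,\mathrm{Prob}(B|A\wedge\rho) \\
&= \mathrm{Tr}\left(\rho M_R(\Delta_a)\right)\mathrm{Tr}\left(\rho' M_S(\Delta_b)\right) \\
&= \mathrm{Tr}\left(\rho M_R(\Delta_a)\right)\mathrm{Tr}\left(\frac{M_R(\Delta_a)\rho M_R(\Delta_a)}{\mathrm{Tr}\left[M_R(\Delta_a)\rho M_R(\Delta_a)\right]}M_S(\Delta_b)\right) \\
&= \frac{\mathrm{Tr}\left(\rho M_R(\Delta_a)\right)}{\mathrm{Tr}\left[M_R(\Delta_a)\rho M_R(\Delta_a)\right]}\mathrm{Tr}\left(M_R(\Delta_a)\rho M_R(\Delta_a)M_S(\Delta_b)\right) \\
&= \frac{\mathrm{Tr}\left(\rho M_R(\Delta_a)\right)}{\mathrm{Tr}\left[M_R(\Delta_a)\rho M_R(\Delta_a)\right]}\mathrm{Tr}\left(\rho M_R(\Delta_a)M_S(\Delta_b)M_R(\Delta_a)\right) \\
&= \frac{\mathrm{Tr}\left(\rho M_R(\Delta_a)\right)}{\mathrm{Tr}\left[\rho M_R(\Delta_a)\right]}\mathrm{Tr}\left(\rho M_R(\Delta_a)M_S(\Delta_b)M_R(\Delta_a)\right) \\
&= \mathrm{Tr}\left(\rho M_R(\Delta_a)M_S(\Delta_b)M_R(\Delta_a)\right) \tag{15.152}
\end{align*}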
In the case when $M_R(\Delta_a)$ and $M_S(\Delta_b)$ commute, this reduces to the earlier
expression
\[ \mathrm{Prob}(A\wedge B|C) = \mathrm{Tr}\left(\rho M_R(\Delta_a)M_S(\Delta_b)\right) \tag{15.153} \]
The derivation of this last relation required that R and S commute, but no such
restriction was required to derive
\[ \mathrm{Prob}(B|A\wedge\rho) = \mathrm{Tr}\left(\rho' M_S(\Delta_b)\right) \tag{15.154} \]
The latter, however, is restricted to filtering-type measurements.
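A spin-1/2 example makes the filtering formulas concrete. This is a minimal sketch with hypothetical helper names; the preparation (spin up along z) and the two filters (σ_x = +1 and σ_z = +1) are illustrative choices, picked because their projectors do not commute.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def filtered_state(rho, m_r):
    """rho' = M rho M / Tr(M rho M), the state prepared by the filter, Eq. (15.150)."""
    unnormalized = matmul(matmul(m_r, rho), m_r)
    norm = trace(unnormalized)
    return [[x / norm for x in row] for row in unnormalized]


def joint_probability(rho, m_first, m_second):
    """Tr(rho M_first M_second M_first), Eq. (15.152)."""
    return trace(matmul(rho, matmul(matmul(m_first, m_second), m_first)))


Z_UP = [[1, 0], [0, 0]]              # projector onto sigma_z = +1
X_UP = [[0.5, 0.5], [0.5, 0.5]]      # projector onto sigma_x = +1
RHO = [[1, 0], [0, 0]]               # illustrative preparation: spin up along z

if __name__ == "__main__":
    p_a = trace(matmul(RHO, X_UP))
    p_b_given_a = trace(matmul(filtered_state(RHO, X_UP), Z_UP))
    print("Prob(A) Prob(B|A)    =", p_a * p_b_given_a)
    print("Tr(rho M_R M_S M_R)  =", joint_probability(RHO, X_UP, Z_UP))
    print("filters interchanged =", joint_probability(RHO, Z_UP, X_UP))
```

With these choices the x-then-z ordering gives 1/4 while the z-then-x ordering gives 1/2, so the result depends on which filter acts first when the projectors do not commute.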
The derivation of $\mathrm{Prob}(A\wedge B|C)$ was implicitly based on the assumption that the
measurements of R and S were equivalent to, or at least compatible with, a joint
filtering according to the eigenvalues of R and S, i.e., a product of projection
operators. This is only possible if R and S commute. In this case the time order
of measurements is irrelevant, as is clear from the symmetry with respect to the
two projection operators, i.e.,
\[ \mathrm{Prob}(A\wedge B|C) = \mathrm{Tr}\left(\rho M_R(\Delta_a)M_S(\Delta_b)\right) \tag{15.156} \]
Remember these are time-dependent operators in the Heisenberg picture.
If the operators R and S do not commute, the above relation no longer holds as
the definition of the joint probability. In this case we must carefully observe
the time orderings, because it is the R measurement that serves as (part of the)
state preparation for the S measurement and not vice versa. This is clear in the
lack of symmetry in the last result
a measurement of $\sigma_z$ at time $t_1$
a measurement of $\sigma_u$ at time $t_2$
a measurement of $\sigma_x$ at time $t_3$
Seven SG machines are required to carry out this experiment. We assume that
the spin vector is a constant of the motion between the measurements.
\[ \mathrm{Prob}(\sigma_z = a, \sigma_u = b, \sigma_x = c\,|\,\rho\wedge X) \tag{15.158} \]
with a = ±1, b = ±1 and c = ±1. As indicated above, the probability is
conditional on the state preparation (denoted by $\rho$) and the configuration of the SG
machines (denoted by X). We abbreviate it $\mathrm{Prob}(a,b,c\,|\,\rho\wedge X)$ with the time
ordering assumed.
It is important to note that this is the joint probability for the results of three ac-
tual measurements, and not the joint distribution for hypothetical simultaneous
values of three noncommuting observables. Moreover, the various subbeams in
this experiment are all separated in space and no attempt is made to recombine
them. Thus, questions of relative phase and coherence are irrelevant.
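i.e., the probability for the $\sigma_z$ and $\sigma_u$ measurements independent of what happens
at the $\sigma_x$ measurement (the later $\sigma_x$ measurement should not affect them!).
We then get
\begin{align*}
P_{zu}(a,b\,|\,\rho\wedge X) &= \sum_{c=\pm1}P_{zux}(a,b,c\,|\,\rho\wedge X) \\
&= \sum_{c=\pm1}\langle\psi|M_z(a)M_u(b)M_x(c)M_u(b)M_z(a)|\psi\rangle \\
&= \langle\psi|M_z(a)M_u(b)M_z(a)|\psi\rangle
\end{align*}
as expected, since $\sum_c M_x(c) = I$ and $M_u(b)^2 = M_u(b)$.
So the presence of the $\sigma_x$ filter has no effect on the earlier measurements.
Similarly, we have
\begin{align*}
P_z(a\,|\,\rho\wedge X) &= \sum_{b=\pm1}\sum_{c=\pm1}P_{zux}(a,b,c\,|\,\rho\wedge X) \\
&= \langle\psi|M_z(a)|\psi\rangle \tag{15.167}
\end{align*}
since the presence or absence of the $\sigma_x$ and $\sigma_u$ filters has no effect on the
measurement of $\sigma_z$.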
\[ \mathrm{Prob}(B|A\wedge C) = \frac{\mathrm{Prob}(A\wedge B|C)}{\mathrm{Prob}(A|C)} \tag{15.168} \]
This is the same as the probability of obtaining $\sigma_u = +1$ conditional on the new
state being $|\psi'\rangle = |z+\rangle$, i.e., as if the first measurement collapsed or reduced
the state to correspond to the measurement. We have not, however, assumed
any reduction process. This result follows from existing quantum mechanical
rules. It does not, however, say that any collapse occurred!
Now
or
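\[ P(+1,+1,+1\,|\,\rho\wedge X) = \frac{1}{2}|\alpha|^2\cos^2\frac{\theta}{2}\,[1+\sin\theta] \tag{15.172} \]
\[ P(-1,+1,+1\,|\,\rho\wedge X) = \frac{1}{2}|\beta|^2\sin^2\frac{\theta}{2}\,[1+\sin\theta] \tag{15.173} \]
\[ P(+1,-1,+1\,|\,\rho\wedge X) = \frac{1}{2}|\alpha|^2\sin^2\frac{\theta}{2}\,[1-\sin\theta] \tag{15.174} \]
\[ P(-1,-1,+1\,|\,\rho\wedge X) = \frac{1}{2}|\beta|^2\cos^2\frac{\theta}{2}\,[1-\sin\theta] \tag{15.175} \]
and therefore carrying out the sum we get
\[ P_x(+1\,|\,\rho\wedge X) = \frac{1}{2}\left[1 + \left(|\alpha|^2 - |\beta|^2\right)\sin\theta\cos\theta\right] \tag{15.176} \]
This is the probability of obtaining the result $\sigma_x = +1$ with the $\sigma_z$ and $\sigma_u$ filters
in place, but ignoring the results of the $\sigma_z$ and $\sigma_u$ measurements. It is not equal
to the probability of obtaining $\sigma_x = +1$ with the $\sigma_z$ and $\sigma_u$ filters absent, which
is
\[ P_x(+1\,|\,\rho) = \langle\psi|M_x(+1)|\psi\rangle = \frac{1}{2}|\alpha+\beta|^2 \tag{15.177} \]
These two results differ because the particle must pass through the $\sigma_z$ and $\sigma_u$
filters before reaching the $\sigma_x$ filter and clearly the presence of the other filters
is relevant!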
Thus, we must always explicitly take the dynamical action of the apparatus into
account when developing a theory of measurement.
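The three-filter probabilities can be reproduced with explicit projectors. The sketch below assumes the initial state |ψ⟩ = α|z+⟩ + β|z−⟩ and a u-axis lying in the zx-plane at angle θ from the z-axis; both assumptions are inferred from the closed forms above, and the function names and numerical values are hypothetical.

```python
import math


def projector(angle, sign):
    """Projector onto spin = sign (+1 or -1) along an axis in the zx-plane
    tilted by `angle` from the z-axis."""
    if sign < 0:
        angle += math.pi                        # opposite direction on the same axis
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return [[c * c, c * s], [c * s, s * s]]


def apply(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def p_zux(a, b, c, theta, psi):
    """<psi|Mz(a)Mu(b)Mx(c)Mu(b)Mz(a)|psi>, computed as |Mx(c)Mu(b)Mz(a)psi|^2."""
    v = apply(projector(0.0, a), psi)
    v = apply(projector(theta, b), v)
    v = apply(projector(math.pi / 2, c), v)
    return sum(abs(x) ** 2 for x in v)


def p_x_with_filters(theta, psi):
    """Probability of sigma_x = +1, summed over the z and u results."""
    return sum(p_zux(a, b, +1, theta, psi) for a in (+1, -1) for b in (+1, -1))


def p_x_without_filters(psi):
    """Probability of sigma_x = +1 when the z and u filters are absent."""
    return sum(abs(x) ** 2 for x in apply(projector(math.pi / 2, +1), psi))


if __name__ == "__main__":
    psi = [0.6, 0.8j]                           # alpha = 0.6, beta = 0.8i (illustrative)
    print(p_x_with_filters(0.6, psi), p_x_without_filters(psi))
```

The first number depends on the filter angle θ and the second on the relative phase of α and β, which is one way to see that the two situations are physically different.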
(3) Conditioning on Both Earlier and Later Measurements
We now calculate the probability for a particular result of the intermediate $\sigma_u$
measurement, conditional on specified results for the preceding $\sigma_z$ measurement
and the following $\sigma_x$ measurement.
The later measurement cannot have any causal effect on the outcome of the
earlier measurement, but it can give relevant information because of the statistical
correlations between the results of successive measurements!
We use
\[ \mathrm{Prob}(B|A\wedge C) = \frac{\mathrm{Prob}(A\wedge B|C)}{\mathrm{Prob}(A|C)} \tag{15.178} \]
with $C = \rho\wedge X$, $A = (\sigma_z(t_1) = +1)\wedge(\sigma_x(t_3) = +1)$ and $B = (\sigma_u(t_2) = +1)$.
We get
i.e., the probability is 1 for both $\theta = 0$ and $\theta = \pi/2$, which is very different from
case (1).
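As a check on these two limiting values, the conditional probability can be written out from the joint probabilities (15.172) and (15.174). This is a sketch under the assumption that θ denotes the angle of the u-axis from the z-axis in those expressions; the common factor carrying the initial-state amplitude cancels between numerator and denominator.

```latex
\begin{align*}
\mathrm{Prob}(B|A\wedge C)
  &= \frac{P(+1,+1,+1)}{P(+1,+1,+1) + P(+1,-1,+1)} \\
  &= \frac{\cos^2\frac{\theta}{2}\,(1+\sin\theta)}
          {\cos^2\frac{\theta}{2}\,(1+\sin\theta) + \sin^2\frac{\theta}{2}\,(1-\sin\theta)}
   = \frac{\cos^2\frac{\theta}{2}\,(1+\sin\theta)}{1+\sin\theta\cos\theta}
\end{align*}
```

At θ = 0 this gives 1·1/1 = 1, and at θ = π/2 it gives (1/2)·2/1 = 1.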
This is why the probabilities in these examples have been conditional on the
apparatus configuration X.
Now the angle $\theta$ specifies the direction of the $\sigma_u$ filter and it must be included
in X, i.e., we should write $X_\theta$. By conditioning on the final result $\sigma_x = +1$ we
select a subensemble discarding those cases in which the result is −1. However,
a part of the specification of this subensemble is that its members have passed
through the $\sigma_u$ filter. Thus, the conditions that define the subensemble include
the value of the angle $\theta$, which therefore may not be changed.
But this is not possible if we specify conditional information both before and
after the measurement of interest, as in the last example.
Thus, the paradox is resolved and makes us realize that we must pay very careful
attention to the state preparation concepts.
Comparing Approaches
Let us now look at all of this theory in a more standard manner (as is done in
many textbooks) and then compare the two approaches.
4. The dynamics are given by the Schrödinger equation
\[ i\hbar\frac{\partial}{\partial t}|\psi\rangle = H|\psi\rangle \quad\text{where } H = \text{Hamiltonian or energy operator} \]
\[ A|a\rangle = a|a\rangle \]
Postulate 1
Postulate 2
We also assumed that we were in a linear vector space and hence could prove
Stone's theorem, which is equivalent to axiom (4) above.
In the discussion just completed we saw that axiom (6) seems to be incorporated
in our theory but is not generally true.
Let us see how the standard approach proceeds.
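First, both approaches imply that if a system is in the state (linear combination)
\[ |\psi\rangle = \sum_n c_n|a_n\rangle \quad\text{where}\quad A|a_n\rangle = a_n|a_n\rangle \tag{15.185} \]
\[ \rho = |\psi\rangle\langle\psi| \tag{15.188} \]
\[ \mathrm{Tr}(\rho) = \mathrm{Tr}(\rho I) = \langle\psi|I|\psi\rangle = \langle\psi|\psi\rangle = 1 \tag{15.190} \]
\[ \rho^2 = (|\psi\rangle\langle\psi|)(|\psi\rangle\langle\psi|) = |\psi\rangle\langle\psi|\psi\rangle\langle\psi| = |\psi\rangle\langle\psi| = \rho \tag{15.191} \]
If we let {|n⟩} and {|m⟩} be two different basis systems, then we have
\begin{align*}
\mathrm{Tr}(X) &= \sum_n\langle n|X|n\rangle = \sum_{n,m}\langle n|m\rangle\langle m|X|n\rangle \\
&= \sum_{n,m}\langle m|X|n\rangle\langle n|m\rangle = \sum_m\langle m|X|m\rangle
\end{align*}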
If the systems or objects under investigation are all in one and the same state
$|\psi\rangle$, we call this a pure ensemble or we say that the system is in a pure state.
In order to verify the probability predictions contained in the state vector $|\psi\rangle$
experimentally, we must, in fact, investigate an ensemble of identically prepared
objects. If
\[ |\psi\rangle = \sum_n c_n|n\rangle \tag{15.192} \]
then the eigenvalue $a_n$ will be the A measurement result $N_n$ times for an
ensemble of N objects. The larger N, the more precisely $N_n/N$ approaches the
probability $|c_n|^2$, i.e.,
\[ |c_n|^2 = \lim_{N\to\infty}\frac{N_n}{N} \tag{15.193} \]
and the expectation value correspondingly becomes
\[ \langle A\rangle = \sum_n|c_n|^2a_n = \lim_{N\to\infty}\frac{1}{N}\sum_nN_na_n \tag{15.194} \]
This expectation value can also be represented by a density matrix of the form
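\[ \rho = \sum_i p_i\,|\psi_i\rangle\langle\psi_i| \tag{15.197} \]
\[ \mathrm{Tr}(\rho) = 1 \tag{15.199} \]
\[ \rho^2 = \sum_{i,j}p_ip_j\,|\psi_i\rangle\langle\psi_i|\psi_j\rangle\langle\psi_j| \ne \rho \tag{15.200} \]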
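For each $|\psi\rangle$
\[ \langle\psi|\rho|\psi\rangle = \sum_i p_i\,|\langle\psi|\psi_i\rangle|^2 \ge 0 \quad\text{(non-negative)} \tag{15.201} \]
If
\[ \rho|m\rangle = P_m|m\rangle \;\Rightarrow\; \rho = \sum_m P_m\,|m\rangle\langle m| \tag{15.202} \]
then
\[ \langle\psi|\rho|\psi\rangle = \sum_m P_m\,|\langle m|\psi\rangle|^2 \ge 0 \;\Rightarrow\; P_m \ge 0 \tag{15.203} \]
and
\begin{align*}
\sum_m P_m &= \sum_m\langle m|\rho|m\rangle = \sum_m\langle m|\Big(\sum_i p_i\,|\psi_i\rangle\langle\psi_i|\Big)|m\rangle \\
&= \sum_i p_i\sum_m|\langle m|\psi_i\rangle|^2 = 1
\end{align*}
We also have
\[ \rho^2 = \sum_m P_m^2\,|m\rangle\langle m| \;\Rightarrow\; \mathrm{Tr}\,\rho^2 = \sum_m P_m^2 < 1 \tag{15.204} \]
Thus, the criterion for a pure state is $\mathrm{Tr}\,\rho^2 = 1$ and for a mixed state $\mathrm{Tr}\,\rho^2 < 1$.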
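The purity criterion is simple to evaluate for small examples. The following is a minimal sketch for a two-dimensional state space; the helper names and the example ensembles are illustrative assumptions.

```python
def density_matrix(weights, states):
    """rho = sum_i p_i |psi_i><psi_i| for two-component state vectors."""
    return [[sum(p * s[i] * s[j].conjugate() for p, s in zip(weights, states))
             for j in range(2)] for i in range(2)]


def purity(rho):
    """Tr(rho^2): equal to 1 for a pure state, smaller than 1 for a mixed state."""
    return sum(rho[i][j] * rho[j][i] for i in range(2) for j in range(2)).real


if __name__ == "__main__":
    pure = density_matrix([1.0], [[0.6, 0.8j]])
    mixed = density_matrix([0.5, 0.5], [[1, 0], [0, 1]])
    overlapping = density_matrix([0.5, 0.5], [[1, 0], [2 ** -0.5, 2 ** -0.5]])
    print(purity(pure), purity(mixed), purity(overlapping))
```

The third ensemble mixes two non-orthogonal states; its purity lies between the fully mixed value 1/2 and the pure-state value 1.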
Now the expectation value of the projection operator |ni hn| (discrete spectrum)
is given by
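\[ \mathrm{Tr}\left(|n\rangle\langle n|\rho\right) = \sum_i p_i\,|\langle n|\psi_i\rangle|^2 = \sum_i p_i\,\big|c_n^{(i)}\big|^2 \tag{15.205} \]
The expectation value of the projection operator |x⟩⟨x| (continuous spectrum)
is given by
\[ \mathrm{Tr}\left(|x\rangle\langle x|\rho\right) = \sum_i p_i\,|\langle x|\psi_i\rangle|^2 = \sum_i p_i\,|\psi_i(x)|^2 \tag{15.206} \]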
which is equal to the probability of obtaining the state |xi as a result of a mea-
surement.
Now let us consider a system consisting of two subsystems 1 and 2 with
orthonormal basis states {|1n⟩} and {|2n⟩}, respectively. A general pure state in
the direct product space is then
\[ |\psi\rangle = \sum_{n,m}c_{nm}\,|1n\rangle|2m\rangle \quad\text{where}\quad \sum_{n,m}|c_{nm}|^2 = 1 \tag{15.207} \]
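The corresponding density operator is
\[ \rho = |\psi\rangle\langle\psi| = \sum_{n,m}\sum_{n',m'}c_{nm}c^{*}_{n'm'}\,|1n\rangle|2m\rangle\langle1n'|\langle2m'| \tag{15.208} \]
If we carry out measurements concerning only subsystem 1, that is, if the
operators corresponding to the observables being measured only act on the states
|1n⟩, then
\begin{align*}
\langle A\rangle &= \mathrm{Tr}\,\rho A = \sum_{m'',n''}\langle2m''|\langle1n''|\,\rho A\,|1n''\rangle|2m''\rangle \\
&= \sum_{m'',n''}\langle2m''|\langle1n''|\Big(\sum_{n,m}\sum_{n',m'}c_{nm}c^{*}_{n'm'}\,|1n\rangle|2m\rangle\langle1n'|\langle2m'|\Big)A\,|1n''\rangle|2m''\rangle \\
&= \sum_{m'',n''}\sum_{n,m}\sum_{n',m'}c_{nm}c^{*}_{n'm'}\,(\langle2m''|\langle1n''|)(|1n\rangle|2m\rangle)\,\langle1n'|A|1n''\rangle\langle2m'|2m''\rangle \\
&= \sum_{m'',n''}\sum_{n,m}\sum_{n',m'}c_{nm}c^{*}_{n'm'}\,\delta_{m,m''}\delta_{m',m''}\delta_{n,n''}\,\langle1n'|A|1n''\rangle \\
&= \sum_{n,n',m}c_{nm}c^{*}_{n'm}\,\langle1n'|A|1n\rangle
\end{align*}
where
\[ \hat\rho = \sum_{n,n',m}c_{nm}c^{*}_{n'm}\,|1n\rangle\langle1n'| = \mathrm{Tr}_2\,\rho \;\rightarrow\; \text{appropriate density operator} \tag{15.209} \]
so that finally
\[ \langle A\rangle = \mathrm{Tr}_1\left[(\mathrm{Tr}_2\,\rho)A\right] \tag{15.210} \]
where $\mathrm{Tr}_k$ means trace over subsystem k. Now
\begin{align*}
\hat\rho^2 &= \sum_{n,n',m}\sum_{n_1,n_1',m_1}c_{nm}c^{*}_{n'm}c_{n_1m_1}c^{*}_{n_1'm_1}\,|1n\rangle\langle1n'|1n_1\rangle\langle1n_1'| \\
&= \sum_{n,n',m}\sum_{n_1,n_1',m_1}c_{nm}c^{*}_{n'm}c_{n_1m_1}c^{*}_{n_1'm_1}\,|1n\rangle\,\delta_{n'n_1}\,\langle1n_1'| \\
&= \sum_{n,n',n_1'}\Big(\sum_m c_{nm}c^{*}_{n'm}\Big)\Big(\sum_{m_1}c_{n'm_1}c^{*}_{n_1'm_1}\Big)|1n\rangle\langle1n_1'|
\end{align*}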
which says that, in general
\[ \hat\rho^2 \ne \hat\rho \tag{15.211} \]
If, however, the $c_{nm}$ take the form
\[ c_{nm} = b_nd_m \quad\text{with}\quad \sum_n|b_n|^2 = 1 = \sum_m|d_m|^2 \tag{15.212} \]
then we have
\begin{align*}
\hat\rho^2 &= \sum_{n,n',n_1'}\Big(\sum_m c_{nm}c^{*}_{n'm}\Big)\Big(\sum_{m_1}c_{n'm_1}c^{*}_{n_1'm_1}\Big)|1n\rangle\langle1n_1'| \\
&= \sum_{n,n',n_1'}\Big(b_nb^{*}_{n'}\sum_m|d_m|^2\Big)\Big(b_{n'}b^{*}_{n_1'}\sum_{m_1}|d_{m_1}|^2\Big)|1n\rangle\langle1n_1'| \\
&= \sum_{n,n',n_1'}\left(b_nb^{*}_{n'}\right)\left(b_{n'}b^{*}_{n_1'}\right)|1n\rangle\langle1n_1'| = \sum_{n,n_1'}b_nb^{*}_{n_1'}\,|1n\rangle\langle1n_1'|\sum_{n'}|b_{n'}|^2 \\
&= \sum_{n,n_1'}b_nb^{*}_{n_1'}\,|1n\rangle\langle1n_1'| = \hat\rho
\end{align*}
i.e., it is the direct product of two pure states of the subspaces 1 and 2.
Except for this special case, $\hat\rho$ represents the density operator of a mixed
ensemble. If the information of a subspace is ignored, then the pure state becomes
a mixed state.
Although the total system is in a pure state, the density operator $\hat\rho$, which
yields all expectation values pertaining only to subsystem 1, represents a mixed
ensemble in general.
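The distinction between product and non-product coefficients shows up directly in the purity of the reduced density operator. This is a minimal sketch; the function name `reduced_purity` and the two example states are illustrative assumptions.

```python
def reduced_purity(c):
    """Tr(rho_hat^2) for rho_hat = Tr_2 |psi><psi|, with c[n][m] the amplitudes c_nm."""
    dim1, dim2 = len(c), len(c[0])
    # rho_hat[n][k] = sum_m c_nm c*_km, the partial trace over subsystem 2
    rho_hat = [[sum(c[n][m] * c[k][m].conjugate() for m in range(dim2))
                for k in range(dim1)] for n in range(dim1)]
    return sum(rho_hat[n][k] * rho_hat[k][n]
               for n in range(dim1) for k in range(dim1)).real


if __name__ == "__main__":
    b = [0.6, 0.8j]
    d = [2 ** -0.5, 2 ** -0.5]
    product = [[bn * dm for dm in d] for bn in b]         # c_nm = b_n d_m
    entangled = [[2 ** -0.5, 0.0], [0.0, 2 ** -0.5]]      # not of product form
    print(reduced_purity(product), reduced_purity(entangled))
```

The product state leaves subsystem 1 pure (purity 1), while the second state gives purity 1/2, the smallest value possible for a two-level subsystem.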
Projection Operators
The projection operator in subspace 2, |2m⟩⟨2m|, projects onto the state |2m⟩.
Therefore,
\[ (|2m\rangle\langle2m|)\,|\psi\rangle = (|2m\rangle\langle2m|)\sum_{m',n}c_{nm'}\,|1n\rangle|2m'\rangle = \sum_n c_{nm}\,|1n\rangle|2m\rangle \]
and the state operator becomes
This implies that if $\rho$ represents a pure ensemble, then so does the projected state
operator.
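We now re-derive the equation of motion for the state operator. The Schrödinger
equation
\[ i\hbar\frac{\partial}{\partial t}|\psi_i\rangle = H|\psi_i\rangle \;\Rightarrow\; -i\hbar\frac{\partial}{\partial t}\langle\psi_i| = \langle\psi_i|H \tag{15.216} \]
gives
\begin{align*}
\frac{\partial\rho}{\partial t} &= \frac{\partial}{\partial t}\sum_i p_i\,|\psi_i\rangle\langle\psi_i| = \sum_i p_i\left[\Big(\frac{\partial}{\partial t}|\psi_i\rangle\Big)\langle\psi_i| + |\psi_i\rangle\Big(\frac{\partial}{\partial t}\langle\psi_i|\Big)\right] \\
&= \frac{1}{i\hbar}\sum_i p_i\left(H|\psi_i\rangle\langle\psi_i| - |\psi_i\rangle\langle\psi_i|H\right) = \frac{1}{i\hbar}\Big[H,\sum_i p_i\,|\psi_i\rangle\langle\psi_i|\Big] \\
&= -\frac{i}{\hbar}\left[H,\rho\right] \tag{15.217}
\end{align*}
which is the von Neumann equation. This holds for time-dependent
Hamiltonians also. It describes the time evolution of the state operator in the Schrödinger
picture.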
Now, if we start in the state $|\psi(t_0)\rangle$ at time $t_0$, then we have the formal solution
of the Schrödinger equation
where
\[ i\hbar\frac{\partial}{\partial t}U(t,t_0) = H(t)\,U(t,t_0) \tag{15.219} \]
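We then get
\begin{align*}
\frac{\partial\rho(t)}{\partial t} &= -\frac{i}{\hbar}\left[H,\rho\right] = -\frac{i}{\hbar}\Big[H,\sum_i p_i\,|\psi_i\rangle\langle\psi_i|\Big] \\
&= -\frac{i}{\hbar}\Big[H,\sum_i p_i\,U(t,t_0)|\psi_i(t_0)\rangle\langle\psi_i(t_0)|U^{+}(t,t_0)\Big] \\
&= -\frac{i}{\hbar}\sum_i p_i\,H\,U(t,t_0)|\psi_i(t_0)\rangle\langle\psi_i(t_0)|U^{+}(t,t_0) \\
&\quad + \frac{i}{\hbar}\sum_i p_i\,U(t,t_0)|\psi_i(t_0)\rangle\langle\psi_i(t_0)|U^{+}(t,t_0)\,H \\
&= \sum_i p_i\,\frac{\partial U(t,t_0)}{\partial t}|\psi_i(t_0)\rangle\langle\psi_i(t_0)|U^{+}(t,t_0) \\
&\quad + \sum_i p_i\,U(t,t_0)|\psi_i(t_0)\rangle\langle\psi_i(t_0)|\frac{\partial U^{+}(t,t_0)}{\partial t} \\
&= \frac{\partial}{\partial t}\Big(\sum_i p_i\,U(t,t_0)|\psi_i(t_0)\rangle\langle\psi_i(t_0)|U^{+}(t,t_0)\Big) \\
&= \frac{\partial}{\partial t}\Big(U(t,t_0)\,\rho(t_0)\,U^{+}(t,t_0)\Big)
\end{align*}
or
\[ \rho(t) = U(t,t_0)\,\rho(t_0)\,U^{+}(t,t_0) \tag{15.220} \]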
Theorem: The quantity $\mathrm{Tr}\,\rho^2$ is time independent. Hence, a pure (mixed) state
remains a pure (mixed) state.
Proof:
\begin{align*}
\mathrm{Tr}\,\rho^2(t) &= \mathrm{Tr}\,U\rho(t_0)U^{+}U\rho(t_0)U^{+} = \mathrm{Tr}\,U\rho(t_0)\,I\,\rho(t_0)U^{+} \\
&= \mathrm{Tr}\,U\rho(t_0)\rho(t_0)U^{+} = \mathrm{Tr}\,\rho(t_0)\rho(t_0)U^{+}U = \mathrm{Tr}\,\rho^2(t_0)
\end{align*}
Therefore, if $\mathrm{Tr}\,\rho^2(t_0) = 1$, then $\mathrm{Tr}\,\rho^2(t) = 1$ and the state remains pure and
similarly for the mixed state. The expectation value in the two pictures is given
by
\[ \langle A\rangle = \mathrm{Tr}\,\rho(t)A = \mathrm{Tr}\,\rho(t_0)A_H(t) \tag{15.221} \]
or the time dependence comes from the state operator in the Schrödinger picture
and from the operator in the Heisenberg picture.
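Both statements can be checked numerically for a two-level system. The sketch below assumes, purely for illustration, the time-independent Hamiltonian H = (ħω/2)σ_x, for which U = cos(ωt/2) I − i sin(ωt/2) σ_x; the helper names and the initial state are hypothetical.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def evolution_operator(omega, t):
    """U(t, 0) = exp(-i H t / hbar) for the assumed H = (hbar omega / 2) sigma_x."""
    c, s = math.cos(omega * t / 2), math.sin(omega * t / 2)
    return [[c, -1j * s], [-1j * s, c]]


def evolve(rho0, u):
    """rho(t) = U rho(t0) U^+, Eq. (15.220)."""
    return matmul(matmul(u, rho0), dagger(u))


if __name__ == "__main__":
    rho0 = [[0.8, 0.2], [0.2, 0.2]]                     # illustrative mixed state
    sigma_z = [[1, 0], [0, -1]]
    u = evolution_operator(1.3, 2.0)
    rho_t = evolve(rho0, u)
    a_heisenberg = matmul(matmul(dagger(u), sigma_z), u)
    print(trace(matmul(rho0, rho0)), trace(matmul(rho_t, rho_t)))
    print(trace(matmul(rho_t, sigma_z)), trace(matmul(rho0, a_heisenberg)))
```

The first line printed compares Tr ρ² before and after the evolution; the second compares the Schrödinger-picture and Heisenberg-picture forms of the same expectation value.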
are eigenstates of z , i.e., z = . Now a rotation in spin space through
an angle about an axis n is represented by the unitary operator
i
U = e 2 n = cos I + in sin (15.223)
2 2
and
i
2i n
e 2 n m e = cos I + in sin (m ) cos I in sin
2 2 2 2
= cos2 (m ) + i [n , m ] sin cos + (n ) (m ) (n ) sin2
2 2 2 2
Now
(n )(m ) = n mI + i (n m) (15.224)
or
[(n ), (m )] = 2i (n m) (15.225)
Alternatively,
[(n ), (m )] = ni mj [i , j ] = 2iijk ni mj k = 2ikij ni mj k
= 2ik (n m)k = 2i (n m) = 2im (n ) (15.226)
Therefore,
i i
e 2 n m e 2 n = cos2 (m ) + i [2i (n m)] sin cos
2 2 2
+ (n ) (n m + i (n m)) sin2
2
= (m n) (n ) sin2 + (m (n )) sin + (m ) cos2
2 2
+ i (n ) ((n m) ) sin2
2
2
= (m n) (n ) sin + (m (n )) sin + (m ) cos2
2 2
2
+ i (n (n m) + i (n (n m)) sin
2
2
= (m n) (n ) sin + (m (n )) sin + (m ) cos2
2 2
2
(n m) (n ) sin
2
2
= (m n) (n ) sin + (m (n )) sin + (m ) cos2
2 2
(m (m n) (n )) sin2
2
= (m n) (n ) sin2 + (m (n )) sin ((m (n (n ))
2
+ (m n) (n )) cos2 + (m (n (n ))) sin2
2 2
= (m n) (n ) (m (n (n ))) cos + (m (n )) sin (15.227)
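or
\[ e^{\frac{i}{2}\theta\,\hat n\cdot\vec\sigma}\,\vec\sigma\,e^{-\frac{i}{2}\theta\,\hat n\cdot\vec\sigma} = \hat n(\hat n\cdot\vec\sigma) - (\hat n\times(\hat n\times\vec\sigma))\cos\theta + (\hat n\times\vec\sigma)\sin\theta \tag{15.228} \]
\begin{align*}
e^{\frac{i}{2}\theta\sigma_x}\,(\hat t\cdot\vec\sigma)\,e^{-\frac{i}{2}\theta\sigma_x}
&= (\hat t\cdot\hat x)(\hat x\cdot\vec\sigma) - \hat t\cdot(\hat x\times(\hat x\times\vec\sigma))\cos\theta + \hat t\cdot(\hat x\times\vec\sigma)\sin\theta \\
&= t_x\sigma_x - \hat t\cdot\left(\hat x\times(-\hat y\sigma_z + \hat z\sigma_y)\right)\cos\theta + \hat t\cdot(-\hat y\sigma_z + \hat z\sigma_y)\sin\theta \\
&= t_x\sigma_x - \hat t\cdot(-\hat y\sigma_y - \hat z\sigma_z)\cos\theta + \hat t\cdot(-\hat y\sigma_z + \hat z\sigma_y)\sin\theta \\
&= t_x\sigma_x + (\sigma_yt_y + \sigma_zt_z)\cos\theta + (\sigma_yt_z - \sigma_zt_y)\sin\theta \\
&= (t_x, t_y, t_z)\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} \sigma_x \\ \sigma_y \\ \sigma_z \end{pmatrix} \tag{15.230}
\end{align*}
The eigenstates of $\hat t\cdot\vec\sigma$, i.e., the eigenstates of spin = 1/2 in the $\hat t$ direction are
given by the equation
\[ U_x(\hat t\cdot\vec\sigma)U_x^{+} = e^{\frac{i}{2}\theta\sigma_x}\,(\hat t\cdot\vec\sigma)\,e^{-\frac{i}{2}\theta\sigma_x} = \sigma_z \tag{15.231} \]
or
\[ (\hat t\cdot\vec\sigma)(U_x^{+}\chi_\pm) = \pm(U_x^{+}\chi_\pm) \tag{15.232} \]
The two eigenfunctions are
\[ U_x^{+}\chi_+ = \begin{pmatrix} \cos\frac{\theta}{2} \\ -i\sin\frac{\theta}{2} \end{pmatrix}, \quad U_x^{+}\chi_- = \begin{pmatrix} -i\sin\frac{\theta}{2} \\ \cos\frac{\theta}{2} \end{pmatrix} \tag{15.233} \]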
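The rotation formula and the eigenvectors can be verified with explicit Pauli matrices. This is a minimal sketch with hypothetical helper names; it assumes that the direction appearing in (15.231) is t = (0, −sin θ, cos θ), which is the direction the rotation matrix in (15.230) maps onto the z-axis.

```python
import math

SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def spin_along(t):
    """The operator t . sigma for a 3-vector t."""
    return [[t[0] * SX[i][j] + t[1] * SY[i][j] + t[2] * SZ[i][j]
             for j in range(2)] for i in range(2)]


def rotation_x(theta):
    """U_x = cos(theta/2) I + i sigma_x sin(theta/2), Eq. (15.223) with n along x."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, 1j * s], [1j * s, c]]


def rotated(t, theta):
    """Left-hand side of Eq. (15.230): U_x (t . sigma) U_x^+."""
    u = rotation_x(theta)
    return matmul(matmul(u, spin_along(t)), dagger(u))


def rotated_by_matrix(t, theta):
    """Right-hand side of Eq. (15.230): the rotated components contracted with sigma."""
    c, s = math.cos(theta), math.sin(theta)
    return spin_along([t[0], t[1] * c + t[2] * s, -t[1] * s + t[2] * c])


if __name__ == "__main__":
    theta = 0.8
    print(rotated([0.3, -0.5, 0.7], theta))
    print(rotated_by_matrix([0.3, -0.5, 0.7], theta))
    print(rotated([0.0, -math.sin(theta), math.cos(theta)], theta))
```

The last line printed should reproduce σ_z, which is the content of (15.231) for this choice of direction.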
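Let us now discuss the spin part of the density matrix. Imagine we are dealing
with electron beams. An electron beam of spin $\uparrow$ has the density operator
$$\rho_\uparrow = |\uparrow\rangle\langle\uparrow| \tag{15.236}$$
and a beam of spin $\downarrow$ has
$$\rho_\downarrow = |\downarrow\rangle\langle\downarrow| \tag{15.237}$$
If one mixes the two beams in a 50:50 ratio, the density operator is
$$\rho_M = \frac{1}{2}\left(|\uparrow\rangle\langle\uparrow| + |\downarrow\rangle\langle\downarrow|\right) \tag{15.238}$$
This state has unknown relative phases.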
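The difference between the two density operators shows up in expectation values
and hence has measurable consequences. For the pure state,
$$\begin{aligned}
\langle A\rangle_\alpha = \mathrm{Tr}\,\rho_\alpha A
&= \mathrm{Tr}\,\frac{1}{2}\begin{pmatrix} 1 & e^{-i\alpha} \\ e^{i\alpha} & 1 \end{pmatrix}\begin{pmatrix}\langle 1|A|1\rangle & \langle 1|A|2\rangle \\ \langle 2|A|1\rangle & \langle 2|A|2\rangle\end{pmatrix} \\
&= \frac{1}{2}\mathrm{Tr}\begin{pmatrix}\langle 1|A|1\rangle + e^{-i\alpha}\langle 2|A|1\rangle & \langle 1|A|2\rangle + e^{-i\alpha}\langle 2|A|2\rangle \\ e^{i\alpha}\langle 1|A|1\rangle + \langle 2|A|1\rangle & e^{i\alpha}\langle 1|A|2\rangle + \langle 2|A|2\rangle\end{pmatrix} \\
&= \frac{1}{2}\left(\langle 1|A|1\rangle + \langle 2|A|2\rangle + 2\,\mathrm{Re}\left(e^{i\alpha}\langle 1|A|2\rangle\right)\right)
\end{aligned}$$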
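while for the mixed state we have
$$\begin{aligned}
\langle A\rangle_M = \mathrm{Tr}\,\rho_M A
&= \mathrm{Tr}\,\frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix}\langle 1|A|1\rangle & \langle 1|A|2\rangle \\ \langle 2|A|1\rangle & \langle 2|A|2\rangle\end{pmatrix} \\
&= \frac{1}{2}\mathrm{Tr}\begin{pmatrix}\langle 1|A|1\rangle & \langle 1|A|2\rangle \\ \langle 2|A|1\rangle & \langle 2|A|2\rangle\end{pmatrix} \\
&= \frac{1}{2}\left(\langle 1|A|1\rangle + \langle 2|A|2\rangle\right)
\end{aligned}$$
Again, we see no interference (quantum) effects from the mixed state. It is clear
from these expressions that the mixed-state expectation value arises from the
pure-state expectation value by averaging over the relative phase value, i.e.,
$$\rho_M = \frac{1}{2\pi}\int_0^{2\pi} d\alpha\,\rho_\alpha \tag{15.244}$$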
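The phase-averaging statement (15.244) is easy to verify numerically. The following is a minimal sketch, assuming the pure state is the equal-weight superposition (|1> + e^{i alpha}|2>)/sqrt(2); the observable `A` and the function names are hypothetical and chosen only for illustration.

```python
import numpy as np


def rho_pure(alpha):
    """Density matrix of the pure state (|1> + e^{i alpha}|2>)/sqrt(2)."""
    psi = np.array([1.0, np.exp(1j * alpha)]) / np.sqrt(2)
    return np.outer(psi, psi.conj())


def expectation(rho, A):
    """<A> = Tr(rho A), real for Hermitian A."""
    return np.trace(rho @ A).real


def phase_averaged_rho(samples=360):
    """Average rho_pure(alpha) over a uniform grid of phases in [0, 2 pi)."""
    alphas = np.linspace(0.0, 2.0 * np.pi, samples, endpoint=False)
    return sum(rho_pure(a) for a in alphas) / samples


# A hypothetical Hermitian observable with a non-zero off-diagonal element.
A = np.array([[1.0, 0.5 - 0.2j], [0.5 + 0.2j, -1.0]])
rho_mixed = 0.5 * np.eye(2)
```

The pure-state expectation value depends on `alpha` through the interference term 2 Re(e^{i alpha} <1|A|2>), while the phase-averaged operator reproduces the 50:50 mixture and the interference term drops out.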
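This implies that $(1 \pm b)/2$ are probabilities and thus we must have $|\vec{b}| \le 1$. If
we have a pure state, then we must also have $\rho^2 = \rho$, which implies that
$$\begin{aligned}
\rho^2 &= \frac{1}{4}\left(I + 2\,\vec{b}\cdot\vec{\sigma} + (\vec{b}\cdot\vec{\sigma})(\vec{b}\cdot\vec{\sigma})\right) = \frac{1}{4}\left(I + 2\,\vec{b}\cdot\vec{\sigma} + b^2 I\right) \\
&= \frac{1}{2}\left(\frac{1 + b^2}{2}\, I + \vec{b}\cdot\vec{\sigma}\right) = \rho = \frac{1}{2}\left(I + \vec{b}\cdot\vec{\sigma}\right) \;\Rightarrow\; b^2 = 1
\end{aligned}$$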
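As a quick numerical illustration of the purity condition, the sketch below builds rho = (I + b . sigma)/2 for a given vector b and checks its eigenvalues and Tr rho^2. The helper names are hypothetical, and the sample vectors are arbitrary choices, not values from the text.

```python
import numpy as np

SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def rho_from_bloch(b):
    """rho = (I + b . sigma) / 2 for a real 3-vector b with |b| <= 1."""
    b = np.asarray(b, dtype=float)
    return 0.5 * (np.eye(2) + np.einsum("i,ijk->jk", b, SIGMA))


def purity(rho):
    """Tr(rho^2); equals (1 + |b|^2) / 2 for a spin-1/2 density matrix."""
    return np.trace(rho @ rho).real


rho_mixed = rho_from_bloch([0.3, 0.4, 0.0])   # |b| = 0.5, a mixed state
rho_pure = rho_from_bloch([0.0, 0.6, 0.8])    # |b| = 1, a pure state
```

The eigenvalues of `rho_mixed` are (1 - 0.5)/2 and (1 + 0.5)/2, while `rho_pure` satisfies rho^2 = rho to machine precision.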
(4) The Measurement Process
We have a beam of atoms or electrons moving through the inhomogeneous field
of a magnet as shown in Figure 15.8 below.
If we let the spatial wavefunction be a wave packet $f(z)$ concentrated about the
$z$-axis before entering the magnetic field at time $t = 0$, then at time $t$ it is
approximately
$$u_\pm(z, t) = f(z \pm Ct^2)\,e^{\mp i\alpha t} \tag{15.251}$$
where the constant
$$C = \frac{\mu_B B'}{2m} \tag{15.252}$$
is one half of the acceleration $\mu_B B'/m$ during the period the particle is in the
magnetic field region and
$$\alpha = \frac{\mu_B B}{\hbar} \tag{15.253}$$
This says that the particles with spin $\uparrow$ and $\downarrow$ are deflected downwards and
upwards, respectively.
The requirement that the $z$-coordinate of the particle serve as the pointer of
a measuring instrument implies that the deflections must be macroscopically
distinguishable. Formally, this implies that the overlap of the two wave packets
$f(z + Ct^2)$ and $f(z - Ct^2)$ must be negligible. We also assume that we have
calibrated the device.
$$\psi(z, 0) = \frac{1}{\sqrt{2}}\left(\chi_+ + \chi_-\right) f(z) \tag{15.255}$$
After traversing the field this becomes
$$\psi(z, t) = \frac{1}{\sqrt{2}}\left(\chi_+ f(z + Ct^2)\,e^{-i\alpha t} + \chi_- f(z - Ct^2)\,e^{i\alpha t}\right) \tag{15.256}$$
There is a unique correlation between the state of the spin and the state of
the pointer. Neither the spin nor the pointer is in an eigenstate (neither has
a definite value).
Measurement of spin observables
After the spin 1/2 particle has passed through the Stern-Gerlach apparatus,
suppose that its spin is measured. The measurement can take place in two
ways:
(i) ignoring the pointer position z
(ii) for a particular pointer position z
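In case (i), if one ignores the pointer position, then $\rho$ is equivalent to
$$\hat{\rho} = \int dz\,\langle z|\rho|z\rangle = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{15.258}$$
This result holds because there is no overlap of the pointer wave functions
$f(z + Ct^2)$ and $f(z - Ct^2)$. For an observable $F(\vec{\sigma})$, depending only on spin
operators, we then have
$$\mathrm{Tr}_{z,\sigma}\left(\rho F(\vec{\sigma})\right) = \mathrm{Tr}_{\sigma}\left(\hat{\rho} F(\vec{\sigma})\right) \tag{15.259}$$
where the density operator $\hat{\rho}$ corresponds to a mixed ensemble. Thus, the pure
ensemble $\rho$ is replaced by the mixed ensemble $\hat{\rho}$.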
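In case (ii), filtering out the pointer position, we consider only the
particles with pointer position $z$ positive, i.e., we construct
$$\mathrm{Tr}_{z>0,\sigma}\left(\rho F(\vec{\sigma})\right) = \mathrm{Tr}_{\sigma}\left(F(\vec{\sigma})\int_{z>0} dz\,\langle z|\rho|z\rangle\right), \qquad \int_{z>0} dz\,\langle z|\rho|z\rangle = \frac{1}{2}\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \tag{15.260}$$
Because of normalization, the density operator is actually
$$\hat{\rho}_\downarrow = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \tag{15.261}$$
The particles deflected up have the spin wave function $|\downarrow\rangle$.
For a measurement with the result $z$ positive (spin negative), the state goes
over to $|\downarrow\rangle$. This is consistent with the quantum mechanical postulate that
measurements are repeatable.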
The fact that the particles which have been filtered off at a particular pointer
position are in that eigenstate corresponding to the eigenvalue measured is called
the collapse or reduction of the state vector.
Going over from $\rho$ to $\hat{\rho}$ with respect to all observables related to spin can also
be regarded as a reduction of the state vector.
The density matrix $\hat{\rho}$ describes an ensemble composed of 50% spin-up and 50%
spin-down states. If N particles are subjected to this Stern-Gerlach experiment,
then as far as their spins are concerned, they are completely equivalent to N/2
particles in the state $|\uparrow\rangle$ and N/2 in the state $|\downarrow\rangle$.
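The two cases above can be illustrated with a discretized pointer coordinate. This is a sketch under stated assumptions, not the text's own calculation: the packet f is taken to be a normalized Gaussian, the deflection Ct^2 is passed in as `shift`, the accumulated phase as `phase`, and the function name and grid are hypothetical.

```python
import numpy as np


def spin_density(shift, width=1.0, phase=0.3, keep=None):
    """Integrate |psi><psi| over the pointer coordinate z.

    psi(z) = [chi_+ f(z + shift) e^{-i phase} + chi_- f(z - shift) e^{+i phase}] / sqrt(2),
    with f a normalized Gaussian whose probability density has standard
    deviation `width`. If `keep` is given, it maps the z grid to a boolean
    mask selecting the pointer positions that are retained (case (ii)).
    """
    z = np.linspace(-40.0, 40.0, 8001)
    dz = z[1] - z[0]

    def f(x):
        return (2 * np.pi * width**2) ** -0.25 * np.exp(-x**2 / (4 * width**2))

    psi = np.stack([f(z + shift) * np.exp(-1j * phase),
                    f(z - shift) * np.exp(+1j * phase)], axis=1) / np.sqrt(2)
    if keep is not None:
        psi = psi[keep(z)]
    # rho[s, s'] = sum over z of psi(z, s) psi*(z, s') dz
    return dz * psi.T @ psi.conj()


rho_no_deflection = spin_density(0.0)                  # packets coincide: spin state stays pure
rho_ignored = spin_density(10.0)                       # case (i): pointer ignored
rho_filtered = spin_density(10.0, keep=lambda z: z > 0)  # case (ii): only z > 0 kept
```

With well-separated packets the off-diagonal elements are suppressed by the overlap of the two Gaussians, so `rho_ignored` is close to half the identity, and `rho_filtered` divided by its trace is close to the projector onto the spin-down state.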
15.2.8 A General Experiment and Coupling to the Environment
We now consider a general experiment. Let O be the object and A the apparatus
including the readout. At the time t = 0, let the state of the whole system O + A
be
$$|\psi(0)\rangle = \sum_n c_n\,|O, n\rangle\,|A\rangle \tag{15.262}$$
where the $|O, n\rangle$ are object states and $|A\rangle$ is the (metastable) initial state of
the apparatus. At a later time t, after the interaction of the object with the
measuring apparatus, the state becomes
$$|\psi(t)\rangle = \sum_n c_n\,|O, n\rangle\,|A(n)\rangle \tag{15.263}$$
where the final states of the apparatus $|A(n)\rangle$, $n = 1, 2, 3, \ldots$, must be
macroscopically distinguishable.
We needed to make use of the fact that the final states of the apparatus are
macroscopically distinguishable, and thus do not overlap, i.e., that
$\langle A(n)|A(m)\rangle = \delta_{mn}$.
Thus, if we do not read off the result of the measurement, a mixture occurs with
respect to O. If, on the other hand, we read off a particular value, e.g., A(m),
the density operator is then
$$|O, m\rangle\langle O, m| \tag{15.266}$$
The probability of measuring the value A(m) is clearly $|c_m|^2$. The fact that in
a measurement with the result A(m) the density operator changes according to
$$|\psi(t)\rangle\langle\psi(t)| \rightarrow |O, m\rangle\langle O, m| \tag{15.267}$$
is known as the reduction of the wave function or state vector.
We now take into account the fact that the object and the apparatus are never
completely isolated from the environment, and take Z to be an additional vari-
able representing all further macroscopic consequences which couple to the state
A of the apparatus. The initial state is then
$$|\psi(0)\rangle = \sum_n c_n\,|O, n\rangle\,|A\rangle\,|Z\rangle \tag{15.268}$$
and after passage through the apparatus the state evolves into
$$|\psi(t)\rangle = \sum_n c_n\,|O, n\rangle\,|A(n)\rangle\,|Z(n)\rangle \tag{15.269}$$
If we do not read off Z, which always happens in practice, since we cannot keep
track of all the macroscopic consequences, the density operator of the (object
+ apparatus) is the mixture
$$\hat{\rho} = \sum_n |c_n|^2\,|A(n)\rangle\,|O, n\rangle\langle O, n|\,\langle A(n)| \tag{15.270}$$
This is a completely different density operator from that of the pure state, i.e.,
$|\psi(t)\rangle\langle\psi(t)|$.
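The step from the entangled state (15.269) to the mixture (15.270) is a partial trace over the environment. The sketch below assumes, as an idealization of "macroscopically distinguishable", that the apparatus states |A(n)> and the environment states |Z(n)> are orthonormal basis vectors; the function names and the sample amplitudes are hypothetical.

```python
import numpy as np


def trace_out_environment(c):
    """Density matrix of object + apparatus after tracing over Z.

    The state is sum_n c_n |O,n>|A(n)>|Z(n)>, with |A(n)> and |Z(n)>
    modelled as orthonormal basis vectors.
    """
    c = np.asarray(c, dtype=complex)
    N = len(c)
    psi = np.zeros((N, N, N), dtype=complex)  # indices: object, apparatus, environment
    for n in range(N):
        psi[n, n, n] = c[n]
    rho = np.einsum("oaz,pbz->oapb", psi, psi.conj())  # sum over the environment index
    return rho.reshape(N * N, N * N)


def pure_object_apparatus(c):
    """|psi(t)><psi(t)| for sum_n c_n |O,n>|A(n)> with no environment."""
    c = np.asarray(c, dtype=complex)
    N = len(c)
    psi = np.zeros((N, N), dtype=complex)
    psi[np.arange(N), np.arange(N)] = c
    v = psi.reshape(-1)
    return np.outer(v, v.conj())


amplitudes = [0.6, 0.8j]                      # hypothetical c_n with |c_1|^2 + |c_2|^2 = 1
rho_hat = trace_out_environment(amplitudes)   # diagonal mixture, no coherences
rho_pure = pure_object_apparatus(amplitudes)  # retains off-diagonal coherences
```

The mixture has only the diagonal weights |c_n|^2, whereas the pure-state operator keeps the off-diagonal terms c_n c_m^* that encode interference.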
(ii) If we read off A(n), then the probability of obtaining the particular reading
A(m) is
$$\mathrm{Tr}_{O,A}\left(|A(m)\rangle\langle A(m)|\,\hat{\rho}\right) = |c_m|^2 \tag{15.272}$$
From that point onwards, it does not matter if we disregard A. Taking the trace
over A(n) yields, for the observable O, the density operator
The key problem in the theory of measurement is this reduction of the wave
function and in particular the question of when it takes place. This problem is
illustrated quite drastically by the phenomenon of Schrödinger's Cat discussed in
Chapter 14.
Influence of an Observation on Time Evolution
In order to further analyze the measurement process and its impact, we return
to the Stern-Gerlach experiment and consider this additional setup.
After the atomic beams have traversed the Stern-Gerlach apparatus, we recom-
bine them by means of a complicated field configuration in such a way that all
of the deformation and spreading of the wave function is carefully undone, i.e.,
the state
$$f(z)\left(c_1 e^{i\varphi_+}\chi_+ + c_2 e^{i\varphi_-}\chi_-\right) \tag{15.275}$$
is formed. This is the initial wave function once again, essentially. The phase
factors are inserted to characterize any path length differences.
Now, in the region where the beams $+$ and $-$ are macroscopically separated,
we can set up a real measuring device whose pointer Z reacts to z through the
interaction $U(z - Z)$, so that positive (negative) z leads to positive (negative)
Z. We then have for the initial state
II. We turn off the coupling to the measuring device Z and obtain the final
state
$$|\psi_e\rangle = f(z)\left(c_1 e^{i\varphi_+}\chi_+ + c_2 e^{i\varphi_-}\chi_-\right)|Z = 0\rangle \tag{15.280}$$
The resulting density operators are quite different. Although the state
In situation II, both the total state and the state of the atom are pure states. In
situation I, the final state of the total system (atom + pointer) (characterized
by spin z and Z), is mixed, unless c1 or c2 vanishes.
Even if we do not read the result, we still influence the atomic system.
Let the initial state of the atoms in the atomic beam be $|a\rangle$, let $\{|e\rangle\}$ be a basis
for the final states, and let $\{|c\rangle\}$ be a basis for the intermediate states. We now
determine the probability of transition to the final state $|e\rangle$ by representing this
in terms of the transition amplitudes of $|a\rangle$ to $|c\rangle$ to $|e\rangle$.
because
$$\sum_c U^{(1)}_{ac}\,U^{(2)}_{ce} = U_{ae} \tag{15.283}$$
On the other hand, one could also say that the transition probability is the
product of the probabilities
$$\left|U^{(1)}_{ac}\right|^2\left|U^{(2)}_{ce}\right|^2 \tag{15.284}$$
summed over all intermediate states c:
$$P^{II}_{ae} = \sum_c \left|U^{(1)}_{ac}\right|^2\left|U^{(2)}_{ce}\right|^2 \tag{15.285}$$
For $P^{II}_{ae}$ there is a measurement in the intermediate region, and this introduces
unknown phase factors $e^{i\varphi_c}$ which must be averaged over.
Experiment 1: Between SG1 and SG2, the atoms remain unperturbed. There
is no coupling to the external world, and the transition probability is $P^{I}_{ae}$.
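The contrast between adding amplitudes and adding probabilities can be made concrete with a two-state example. The sketch below is an illustration only: the two evolution stages are taken to be real rotation matrices (an arbitrary choice), the intermediate basis has just two states so that a single relative phase matters, and all function names are hypothetical.

```python
import numpy as np


def rotation(theta):
    """A hypothetical 2x2 unitary standing in for one stage of the evolution."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]], dtype=complex)


def p_coherent(U1, U2, a, e):
    """No intermediate measurement: |sum_c U1[a,c] U2[c,e]|^2."""
    return float(abs((U1 @ U2)[a, e]) ** 2)


def p_measured(U1, U2, a, e):
    """Intermediate measurement: sum_c |U1[a,c]|^2 |U2[c,e]|^2, as in (15.285)."""
    return float(np.sum(np.abs(U1[a, :]) ** 2 * np.abs(U2[:, e]) ** 2))


def p_phase_averaged(U1, U2, a, e, samples=64):
    """Average the coherent probability over an unknown relative phase
    attached to the second intermediate state (two-state case only)."""
    phis = np.linspace(0.0, 2.0 * np.pi, samples, endpoint=False)
    total = 0.0
    for phi in phis:
        phases = np.diag([1.0, np.exp(1j * phi)])
        total += abs((U1 @ phases @ U2)[a, e]) ** 2
    return total / samples


U1, U2 = rotation(np.pi / 4), rotation(np.pi / 4)
```

For these two quarter-turn stages the coherent probability from state 0 to state 0 vanishes by destructive interference, while the measured (or phase-averaged) probability is one half.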
ity, in which probabilistic assignments refer to our knowledge of an objectively
existing state of affairs, for example, an expression like $\Delta A$ (the dispersion)
would be a measure of our ignorance of A which, it is supposed, does have an
actual value. Such a position is consistent with a completely realist view of the
world, in which individual objects and properties have an unequivocal existence
that is independent of any act of observation or measurement.
Interpretations of this type are naturally coupled with the assumptions that
1. quantum states refer to individual systems directly, not just to the out-
come of repeated measurements, or to any associated ensemble of systems
The reason is that the concept of an individual system possessing a value for all
its physical quantities is difficult to reconcile with the actual formalism of
quantum theory. As we have seen, one obstacle to this view is the Bell inequalities,
which were discussed earlier in Chapter 8 and will be discussed extensively in
Chapter 16.
As a consequence, nothing useful can be said about the existence of quantum
value-functions without postulating further properties for them.
We note that there are plenty of functions that satisfy the eigenvector
requirement. For example, the function
$$V_{|\psi\rangle}(A) = \langle\psi|\hat{A}|\psi\rangle \tag{15.287}$$
does so (assuming that $|\psi\rangle$ is normalized). However, as we know, this gives
the expected value of A, so it is not a good choice for a value function in any
situation in which the dispersion $\Delta A$ is non-zero. Therefore, to get a sensible
notion of a value function it is clearly necessary to go beyond the eigenvector
requirement.
The most natural condition to impose is that, for any function $F : \mathbb{R} \rightarrow \mathbb{R}$,
$$V_{|\psi\rangle}(F(A)) = F(V_{|\psi\rangle}(A)) \qquad [\text{condition A}] \tag{15.288}$$
Thus, in any quantum state $|\psi\rangle$, the value of a function of a physical quantity
is equal to the function evaluated on the value of the quantity; for example, the
value of $L_x^2$ is the square of the value of the angular momentum $L_x$.
If we are quantizing a given classical system with state space S then F(A) can
stand for $F(f_A)(s)$, where $f_A : S \rightarrow \mathbb{R}$, and where the map $F(f_A) : S \rightarrow \mathbb{R}$
is defined by $F(f_A)(s) := F(f_A(s))$ for all $s \in S$. In this case, it is viable to
regard a value function $V_{|\psi\rangle}$ as being defined on real-valued functions f on the
classical state space S.
which deals only with mathematical entities. A value-function $V_{|\psi\rangle}$ of physical
quantities can then be defined by
$$V_{|\psi\rangle}(A) := V'_{|\psi\rangle}(\hat{A}) \tag{15.290}$$
If the quantization map is one-to-one and onto, then F(A) is well-defined as the
inverse image of $F(\hat{A})$, and the $V_{|\psi\rangle}$ function constructed in this way satisfies
condition A.
Note that we are driven towards such a strategy because, on the one hand (and
unlike in the pragmatic approach), an operational definition in terms of
measurements is not appropriate in the context of the more realist interpretation
being sought. But, on the other hand, we must avoid defining F(A) to be the
physical quantity that satisfies condition A for all $|\psi\rangle \in H$, since then
condition A would be a tautology. The potential trap is that this is precisely how
a function of a physical quantity is defined in classical physics. This
discussion impinges also on the basic function-preserving requirement
$\widehat{F(A)} = F(\hat{A})$, which does become a tautology if F(A) is defined to be the
physical quantity that is associated with the operator $F(\hat{A})$.
3. Let I denote the physical quantity that corresponds to the identity operator
$\hat{I}$. Then using the result in (2) above with A := I, we see that, for
each state $|\psi\rangle$
$$V_{|\psi\rangle}(B) = V_{|\psi\rangle}(I)\,V_{|\psi\rangle}(B) \tag{15.293}$$
for all physical quantities B. Thus, for each $|\psi\rangle \in H$, we get
$$V_{|\psi\rangle}(I) = 1 \tag{15.294}$$
provided that there is at least one quantity B for which $V_{|\psi\rangle}(B) \neq 0$.
The multiplicative property of (2) above has an important implication for any
physical quantity P whose representative is a projection operator $\hat{P}$. In this
case, the property $\hat{P}^2 = \hat{P}$ implies at once that
$$\left[V_{|\psi\rangle}(P)\right]^2 = V_{|\psi\rangle}(P^2) = V_{|\psi\rangle}(P) \tag{15.295}$$
and hence
$$V_{|\psi\rangle}(P) = 0 \text{ or } 1 \tag{15.296}$$
Thus, thinking of P as a proposition, we see that a quantum value function $V_{|\psi\rangle}$
gives it a false or true assignment in the state $|\psi\rangle$.
Now consider a collection $\{\hat{P}_1, \hat{P}_2, \ldots, \hat{P}_n\}$ of projection operators that forms
a resolution of the identity, that is,
$$\hat{P}_i \hat{P}_j = 0 \text{ if } i \neq j, \qquad \hat{P}_1 + \hat{P}_2 + \cdots + \hat{P}_n = \hat{I} \tag{15.297}$$
For example, these could be the spectral projectors of a self-adjoint operator
$\hat{A}$ with discrete eigenvalues, so that $\hat{P}_i = \hat{P}_{A=a_i}$. In a realist interpretation,
the associated proposition $P_i$ asserts that A has a particular value $a_i$, which
implies that the collection of propositions $\{P_1, P_2, \ldots, P_n\}$ should be mutually
exclusive - only one can be true at any given time - and exhaustive - at least one
of them must be true at any given time. This is in accord with the commonsense
view of the nature of truth and falsity, that is, in a set of mutually exclusive
and exhaustive propositions one and only one can be true and the rest are false.
This expectation is borne out by the formalism. Indeed, the results above imply
that
$$V_{|\psi\rangle}(A + B) = V_{|\psi\rangle}(A) + V_{|\psi\rangle}(B) \tag{15.298}$$
and hence
$$V_{|\psi\rangle}\left(\sum_{i=1}^{n} P_i\right) = \sum_{i=1}^{n} V_{|\psi\rangle}(P_i) = 1 \tag{15.299}$$
However, since each $V_{|\psi\rangle}(P_i)$ has the value 0 or 1, this sum can equal 1 only if
one of the propositions is given the value 1 and the rest are given the value 0.
A special case is when $\hat{P}_i := |e_i\rangle\langle e_i|$ where $\{|e_1\rangle, |e_2\rangle, \ldots\}$ is an orthonormal
basis for the Hilbert space H. Then, according to the result just obtained
above, for any state $|\psi\rangle$ a value function $V_{|\psi\rangle}$ must assign the number 1 to one
of the projectors/one-dimensional subspaces (more precisely, to the associated
proposition) and 0 to the rest. This is not a trivial requirement, since any
given vector will belong to many different orthonormal basis sets, and the value
given to the corresponding one-dimensional subspace must be independent of
the choice of such a set.
This will ultimately defeat any idea of a realist interpretation.
As stated above, this argument must apply to all possible sets of orthogonal
projectors which provide a resolution of the identity in the Hilbert space. Since
the projectors are in one-to-one correspondence with the rays in the Hilbert
space, the above constraint means that, for every complete orthogonal basis of
unit vectors in the Hilbert space, we must be able to associate the number 1
with one vector and the number 0 with all the other vectors in the basis in a
consistent manner.
2. For every complete orthogonal set of unit vectors only one is colored red.
3. Unit vectors belonging to the same ray have the same color.
proved $T_N^{\mathrm{real}} \Rightarrow T_{N+1}^{\mathrm{real}}$.
Repeated applications of this result show that if we can prove $T_N^{\mathrm{real}}$ for any
given N, then the theorem is true for any greater value of N and, indeed, $T_\infty^{\mathrm{real}}$
will follow by a similar argument, in which, assuming the coloring is possible
in the infinite-dimensional case, we show that it would be possible in a
finite-dimensional subspace that includes a direction colored red.
since, if we could color the unit hypersphere in the Hilbert space defined over the
complex field, we could show the coloring to be possible for a real Hilbert space
of the same dimension by considering some particular complete orthogonal set
of vectors in the complex case, and generating a structure isomorphic to a real
Hilbert space by considering all the orthogonal sets obtained by real orthogonal
transformations from the initial set of orthonormal vectors.
With these results in mind, all we need to do is examine $T_N^{\mathrm{real}}$ for low values
of N. We notice that $T_2^{\mathrm{real}}$ is false (it is possible to satisfy conditions (1)-(3)).
This corresponds to coloring the unit circle in a real Euclidean plane. The
construction for doing such a coloring is shown in Figure 15.10 below.
We now show that $T_3^{\mathrm{real}}$ holds. First, we give a plausibility argument due to
Belinfante.
Consider the unit sphere in Euclidean 3-space. Suppose that the coloring had
been carried out. Then we would expect 1/3 of the surface to be colored red
and 2/3 to be colored blue, since for every orthogonal triad of directions one
is colored red and the remaining two blue. But every time we color a point P ,
say, red, then we must color the whole equator, with P as the pole, blue, since
any direction orthogonal to P gets colored blue. This is shown in Figure 15.11
below.
Figure 15.11: 3-Dimensional Construction
More specifically, we shall show that if 1 and 2 are any two points on the surface
of a sphere with center at O, and if we denote by $\theta_{12}$ the angle between the unit
vectors O1 and O2, then if
$$0 \le \theta_{12} \le \sin^{-1}\frac{1}{3} \tag{15.302}$$
the points 1 and 2 cannot be assigned opposite colors.
Figure 15.12: Kochen-Specker Diagram
This Kochen-Specker diagram has ten points on the unit sphere in 3-dimensional
Euclidean space with
$$0 \le \theta_{12} \le \sin^{-1}\frac{1}{3}$$
as we shall prove.
Suppose first that $\theta_{12}$ is any acute angle. Since 3 is orthogonal to 1 and 2, and
4 is orthogonal to 3, O4 must be in the plane defined by O1 and O2. Since
O4 is orthogonal to O2, we may choose 4 so that $\theta_{14}$ is also acute, and clearly,
$\theta_{14} = \pi/2 - \theta_{12}$ as shown in Figure 15.13 below.
Now write $O5 = \hat{i}$, $O6 = \hat{k}$ and take a unit vector $\hat{j}$ orthogonal to $\hat{i}$ and $\hat{k}$ so as
to complete a set of orthonormal vectors $\{\hat{i}, \hat{j}, \hat{k}\}$. Then O7, being orthogonal
to $\hat{i}$, may be written as
$$O7 = (\hat{j} + x\hat{k})(1 + x^2)^{-1/2} \tag{15.303}$$
Similarly,
$$O8 = (\hat{i} + y\hat{j})(1 + y^2)^{-1/2} \tag{15.304}$$
for some numbers x and y.
and
$$O10 = (y\hat{i} - \hat{j})(1 + y^2)^{-1/2} \tag{15.306}$$
But O1 is orthogonal to O7 and O8, so we must have
$$O1 = (xy\,\hat{i} - x\hat{j} + \hat{k})(1 + x^2 + x^2y^2)^{-1/2} \tag{15.307}$$
Also, O4 is orthogonal to O9 and O10, so we must have
$$O4 = (\hat{i} + y\hat{j} + xy\,\hat{k})(1 + y^2 + x^2y^2)^{-1/2} \tag{15.308}$$
But then, taking the inner product of O1 and O4, we have at once
$$\cos\theta_{14} = \frac{xy}{\left((1 + x^2 + x^2y^2)(1 + y^2 + x^2y^2)\right)^{1/2}} = \cos\left(\pi/2 - \theta_{12}\right) = \sin\theta_{12} \tag{15.309}$$
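The orthogonality relations used in this construction, and the bound of 1/3 on the right-hand side of (15.309), can be checked numerically. The sketch below is an illustration, not part of the original proof. The form used for O9 is an assumption: it is the direction fixed (up to sign) by requiring O9 to be orthogonal to O5 and O7, by analogy with (15.306). The function names and the search grid are hypothetical.

```python
import numpy as np


def ks_vectors(x, y):
    """Unit vectors of the Kochen-Specker construction for parameters x, y."""
    i, j, k = np.eye(3)

    def unit(v):
        return v / np.linalg.norm(v)

    return {
        1: unit(x * y * i - x * j + k),
        4: unit(i + y * j + x * y * k),
        5: i,
        6: k,
        7: unit(j + x * k),
        8: unit(i + y * j),
        9: unit(x * j - k),   # assumed form, fixed up to sign by O9 orthogonal to O5 and O7
        10: unit(y * i - j),
    }


# Pairs of points that the construction requires to be orthogonal.
ORTHOGONAL_PAIRS = [(5, 6), (5, 7), (5, 9), (7, 9), (6, 8), (6, 10),
                    (8, 10), (1, 7), (1, 8), (4, 9), (4, 10)]


def cos_theta14(x, y):
    """Inner product of O1 and O4, the left-hand side of (15.309)."""
    v = ks_vectors(x, y)
    return float(np.dot(v[1], v[4]))


def max_sin_theta12(points=100, upper=5.0):
    """Largest value of xy / sqrt((1+x^2+x^2y^2)(1+y^2+x^2y^2)) on a grid."""
    grid = np.linspace(0.05, upper, points)
    X, Y = np.meshgrid(grid, grid)
    values = X * Y / np.sqrt((1 + X**2 + X**2 * Y**2) * (1 + Y**2 + X**2 * Y**2))
    return float(values.max())
```

On this grid the maximum is attained near x = y = 1, where the expression equals 1/3; this is why the construction only reaches angles with sin(theta_12) no larger than 1/3.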
Now consider the color map $V : S \rightarrow \{\text{red}, \text{blue}\}$, whose domain is the surface
S of the unit sphere, and suppose that V(1) = red and V(2) = blue. Then from
the Kochen-Specker diagram above, V(1) = red $\Rightarrow$ V(7) = V(8) = V(3) = blue.
But V(2) = blue (together with V(3) = blue) implies that V(4) = red. Hence,
V(9) = V(10) = blue. But V(7) = V(9) = blue, which implies that V(5) = red, and
V(8) = V(10) = blue implies that V(6) = red. But V(5) = V(6) = red is not possible
since 5 and 6 are orthogonal directions.
We now use the lemma to prove the impossibility of the coloring considered in
$T_3^{\mathrm{real}}$.
Suppose that we take $\theta_{12} = 18^\circ < \sin^{-1}(1/3)$, then we know that V(1) = red
implies that V(2) = red. But now introduce four additional points, labelled
3, 4, 5, and 6, lying at $18^\circ$ intervals along the equator through the points 1 and
2. Then, repeating the above argument
V(1) = red $\Rightarrow$ V(2) = red $\Rightarrow$ V(3) = red $\Rightarrow$ V(4) = red $\Rightarrow$ V(5) = red $\Rightarrow$ V(6) = red
But $\theta_{16} = 5 \times 18^\circ = 90^\circ$.
So, if any point on the sphere is colored red, we have demonstrated two orthog-
onal red points, which contradicts the specification of the coloring.
But, considering any three orthogonal points on the sphere, one of them must
be colored red.
Hence, the coloring is not possible, that is, we have demonstrated T3real .
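The choice of 18 degrees and five steps in this argument follows from simple arithmetic, sketched below with hypothetical variable names: the lemma applies to angles up to arcsin(1/3), so the fewest equal steps that span a right angle while staying within that bound is the ceiling of 90 degrees divided by arcsin(1/3) in degrees.

```python
import math

# Largest angle (in degrees) for which the lemma forces equal colors.
theta_max = math.degrees(math.asin(1 / 3))

# Fewest equal steps that span 90 degrees with each step within the bound.
steps = math.ceil(90 / theta_max)
step_angle = 90 / steps

# Propagate the color of point 1 along the chain of points 1, 2, ..., steps + 1.
colors = {1: "red"}
for point in range(2, steps + 2):
    colors[point] = colors[point - 1]   # neighbours within theta_max share a color
```

The chain ends at a point orthogonal to point 1 that is forced to carry the same color, which is the contradiction used in the text.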
This argument can be put in the form of a Kochen-Specker diagram with a finite
set of vertices which cannot be colored in the specified way.
All we have to do is to choose three arbitrary points $p_0$, $q_0$, $r_0$ and insert the
appropriate fivefold repetition of the Kochen-Specker diagram above, with
$\theta_{12} = 18^\circ$, between each pair of vertices of the triangle $p_0$, $q_0$, $r_0$. The
resulting beautiful example of an inconsistent Kochen-Specker diagram, that is,
one which cannot be colored, is shown in Figure 15.14 below.
Notice that, since a is orthogonal to $r_0$ and $q_0$, as also is $p_0$, then we can choose
a to coincide with $p_0$. Similarly, we can identify b and $q_0$ and c and $r_0$. So the
total number of vertices in the diagram is made up of eight vertices in each of the
fifteen hexagons with three pairs of vertices identified, leaving $(8 \times 15) - 3 = 117$
distinct vertices.
We have shown that if a value function exists for a Hilbert space of a particular
dimension N , then it necessarily does so for any space of dimension less than
N . Similarly, if a value function exists for a complex Hilbert space, then one
exists on any real Hilbert space of the same dimension.
Then, to prove the theorem, it suffices to show that no such
function can exist in real, 3-dimensional Euclidean space.
Reducing the problem in this manner had the advantage that it could be stud-
ied in a manifestly geometrical way, in which case it looks like a certain type of
map-coloring problem as shown above.
In their original proof, Kochen and Specker found a counterexample to the
existence of a value function, in the form of a set of 117 vectors in the vector space
$\mathbb{R}^3$ that could not be consistently assigned the numbers 0 or 1 in the desired
way as we have shown above.
This remarkable result shows that a value function $V_{|\psi\rangle}$ cannot exist without
violating one of the assumptions implicit in the statement of the theorem. The
obvious candidates are:
1. The requirement that $V_{|\psi\rangle}(F(A)) = F(V_{|\psi\rangle}(A))$
albeit not an unreasonable one), then the last equation implies that measuring
$A_1$ and applying $f_1$ to the result will yield the same number as measuring
$A_2$ and applying $f_2$ to the result. However, there seems no good reason for
supposing this would be so if $[\hat{A}_1, \hat{A}_2] \neq 0$.
On the other hand, dropping the last equation has a very peculiar effect. For
example, let A1 and A2 be a pair of non-commuting, self-adjoint operators whose
spectral resolutions contain a common projector P .
Pi = ai (A) (15.312)
Yet another way of expressing these ideas is to say that the operator A does not
represent a unique physical quantity. It has a different meaning depending on
what other compatible quantities are considered at the same time. This links
up with the second possible way of saving value functions, that is, the idea that
the quantization map $A \mapsto \hat{A}$ may be many-to-one.
depend on the context in which an observable is studied. In particular, if $\hat{A}$, $\hat{B}$
and $\hat{C}$ are three operators with
$$[\hat{A}, \hat{B}] = 0 = [\hat{A}, \hat{C}], \qquad [\hat{B}, \hat{C}] \neq 0 \tag{15.313}$$
A very important example of this situation arises naturally in the EPR context
of a pair of entangled systems that have become spatially separated, and on
which measurements are then made in such a way that the measurement events
are space-like separated. Remember that two events are space-like separated if
no signal can be sent from one to another without exceeding the speed of light
and for any such pair of events there is always some inertial frame of reference
with respect to which the two events are simultaneous.
If A is associated with the first system and B and C with the second, then we
expect to be able to measure A simultaneously with either B or C and hence
$$[\hat{A}, \hat{B}] = 0 = [\hat{A}, \hat{C}] \tag{15.314}$$
Note that the probabilistic predictions of the formalism concerning the results
of measuring A do not depend on what else is measured at the same time, that
is, the probability of getting result $a_m$ when the state is $|\psi\rangle$ is always just
$$\langle\psi|\hat{P}_m|\psi\rangle \tag{15.315}$$
Note that the analogue of a value-function in a hidden variable theory would
involve the assumption that the value of a physical quantity A in a quantum
state $|\psi\rangle$ depends on these additional variables. However, this does not change
the force of the arguments above.
A central ingredient will presumably be the fact that each (normalized) state
$|\psi\rangle$ gives rise to a probability assignment
$$\mathrm{Prob}(P \,|\, |\psi\rangle) := \langle\psi|\hat{P}|\psi\rangle = \langle\psi|\hat{P}^2|\psi\rangle = \langle\psi|\hat{P}^\dagger\hat{P}|\psi\rangle = \left\|\hat{P}|\psi\rangle\right\|^2 \tag{15.318}$$
which will be interpreted in some way as the probability that the state of affairs
represented by P is realized if the state is $|\psi\rangle$.
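The chain of equalities in (15.318) relies only on the projector being Hermitian and idempotent. The following minimal sketch checks it numerically; the function names, the subspace, and the state are hypothetical choices made for illustration.

```python
import numpy as np


def projector(vectors):
    """Orthogonal projector onto the span of linearly independent vectors."""
    basis, _ = np.linalg.qr(np.array(vectors, dtype=complex).T)
    return basis @ basis.conj().T


def prob(P, psi):
    """<psi| P |psi> for a normalized state psi."""
    psi = np.asarray(psi, dtype=complex)
    return float(np.vdot(psi, P @ psi).real)


# Hypothetical example: a two-dimensional subspace of a three-dimensional space.
P = projector([[1, 0, 0], [1, 1, 0]])
psi = np.ones(3) / np.sqrt(3)
```

Here `prob(P, psi)` equals the squared norm of `P @ psi`, which is 2/3 because the state has equal weight on three basis vectors and the projector keeps two of them.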
values (possessed, measured or otherwise) of physical quantities, nothing is lost
by focussing on this case.
The probabilistic predictions of quantum theory are that if the state is $|\psi\rangle$, then
$$\mathrm{Prob}(A \in \Delta \,|\, |\psi\rangle) = \left\|\hat{P}_{A\in\Delta}|\psi\rangle\right\|^2 \tag{15.319}$$
Note that a given projection operator may represent more than one proposition
of this type. For example, two operators $\hat{A}$ and $\hat{B}$ may share a common spectral
projector so that $\hat{P}_{A\in\Delta} = \hat{P}_{B\in\Delta'}$ for some subsets $\Delta$ and $\Delta'$ of $\mathbb{R}$. In this case,
we shall say that the propositions $A \in \Delta$ and $B \in \Delta'$ are physically equivalent.
There are implications for such a definition if $[\hat{A}, \hat{B}] \neq 0$.
(2) the way in which the concepts of truth and falsity apply to such
propositions
(3) the quantum analogue of the way in which the set of propositions in classi-
cal physics acquires the structure of a Boolean algebra from the underlying
mathematical representation of states and physical quantities
A realist might want to say that the projection operator $\hat{P}_{A\in\Delta}$ represents the
physical quantity whose value is 1 if the value of A lies in $\Delta$ and is 0 otherwise.
Hence, in this case, the proposition $A \in \Delta$ is to be read as
The realist version of $A \in \Delta$:
A has a value and this value lies in the subset $\Delta$.
Of course, this leaves open the question of when such an assertion is justified.
The situation in classical physics is clear: the proposition is true in a state $s \in S$
if, and only if, $f_A : S \rightarrow \mathbb{R}$ is such that $f_A(s) \in \Delta$. However, as we saw in the
Kochen-Specker discussion, any attempt to assign possessed values for all
quantities in quantum theory can only be maintained at the expense of considering
these properties to be contextual. This means that it is not meaningful to say
that any particular proposition is either true or false without specifying what
other compatible propositions are to be considered at the same time.
On the face of it, the statement "if a measurement of A is made, then the result
will lie in $\Delta$" is applicable to a single system, as is the realist claim that "A
has a value that lies in $\Delta$". However, the instrumentalist form of the
proposition is not a statement about how things are, but rather a claim about what
would happen if a certain operation was performed. This seems to cry out
for a positivist-type verification, that is, make the measurement and see what
happens - if the result does not lie in $\Delta$, then the proposition is certainly false.
But what if the result does lie in $\Delta$? Does this mean this proposition is true?
Certainly not if, as is arguably the case, a more precise rendering of the propo-
sition is:
Put in the modal form, the proposition is manifestly counterfactual and cannot
be verified by any single, or finite set, of measurements. Indeed, it could be
argued that, in this form, the proposition does not apply at all to an individual
system but only to a collection of such on which repeated measurements are to
be made. Thus, in this reading, it is consistent to say that states and proposi-
tions apply only to such collections.
the condition $\mathrm{Prob}(T \,|\, |\psi\rangle) = 1$ implies that $\hat{P}_T|\psi\rangle = |\psi\rangle$, that is, $|\psi\rangle$ is,
in fact, an eigenvector of $\hat{P}_T$ with eigenvalue 1.
2. This result implies that the proposition $A \in \Delta$ is true in any state $|\psi\rangle$ that
can be written as a linear superposition of the eigenvectors associated with
the eigenvalues in $\Delta$. Of course, this does not mean that the probability
of getting any specific eigenvalue is equal to 1. Indeed, this number will
be strictly less than 1 if $|\psi\rangle$ has non-zero expansion coefficients for more
than one of the eigenspaces. Note that, unlike in the realist case, this
causes no problems, since the modal form of $A \in \Delta$ does not say that a
measurement of A necessarily gives any one particular element of $\Delta$.
3. Care must be taken when handling propositions in this modal form. For
example, if the proposition $A = a_m$ is interpreted as "if A is measured, then
the result will necessarily be $a_m$", then the value of $\mathrm{Prob}(A = a_m \,|\, |\psi\rangle)$
cannot be read as the probability that the proposition is true when the
state is $|\psi\rangle$. Rather, $\mathrm{Prob}(A = a_m \,|\, |\psi\rangle)$ refers to the relative frequency
with which result $a_m$ will be found on a given collection of systems.
For example, the statement "there is a 70% chance that the proposition 'if A
is measured, then the result will necessarily be $a_m$' is true" is quite different
from "if A is measured, then there is a 70% chance that the result will be $a_m$".
It is the latter that is intended in the quantum formalism by the statement
$\mathrm{Prob}(A = a_m \,|\, |\psi\rangle) = 0.7$. Note that, if desired, one might introduce an extended
class of modal propositions of the form
On the other hand, a realist may balk at the idea that the proposition $A \in \Delta$
could be said to be false if there is still a non-zero probability of finding values
in $\Delta$. When confronted with the existing formalism of quantum theory - in
particular the idea that propositions are to be associated with projection
operators - a realist may feel that the most natural assignment of truth and falsity
in a state $|\psi\rangle$ is that a proposition $A \in \Delta$ is (1) true in state $|\psi\rangle$ if
$\mathrm{Prob}(A \in \Delta \,|\, |\psi\rangle) = 1$ (that is, $|\psi\rangle \in H_{A\in\Delta}$) and (2) false if
$\mathrm{Prob}(A \in \Delta \,|\, |\psi\rangle) = 0$ (that is, $|\psi\rangle \in H_{A\in\Delta}^{\perp}$).
and $H_{A\in\Delta}^{\perp}$, and, according to these assignments, the proposition $A \in \Delta$ cannot
be said to be either true or false if both components are non-zero. This suggests
that some sort of multi-valued logic should be used. The simplest such attempt,
due to Reichenbach, is to use a three-valued logic in which a proposition can be
either true, false, or indeterminate. The typical quantum mechanical situation
in which a non-trivial superposition of eigenvectors leads to a proposition being
neither true nor false would then be assigned to the indeterminate category.
But, in either case, it is not really clear what the meaning is of a proposition
that is deemed to apply to a single system but which is then said to be neither
true nor false. Philosophers have debated this for some time without reaching
any definite conclusions.
In addition, serious worries have been expressed about the idea of using a
multi-valued logic within a mathematical framework that is, itself, based firmly
on classical two-valued logic.
The difficulties with logical connectives can be alleviated by adopting the
strategy in which a proposition $A \in \Delta$ is understood to be false in state $|\psi\rangle$ if it is
not true, that is, if $\mathrm{Prob}(A \in \Delta \,|\, |\psi\rangle) < 1$. This approach is quite natural in
instrumentalist interpretations of quantum theory, and results in a two-valued
propositional structure, albeit at the expense of introducing a certain asymme-
try between the concepts of true and false. It is still necessary to define the
logical connectives, however.
subspace of $H_U$.
In classical physics, the analogues of these two propositions are the empty
set $\emptyset$ and the whole space S respectively, and for all subsets $S_T$ of S we
have $\emptyset \subseteq S_T \subseteq S$.
3. If T is a proposition, the proposition $\neg T$ (not T) is defined by requiring
that, for all states $|\psi\rangle$, $\mathrm{Prob}(\neg T \,|\, |\psi\rangle) = 1$ if, and only if, $\mathrm{Prob}(T \,|\, |\psi\rangle) = 0$.
This is equivalent to the relation $H_{\neg T} = (H_T)^{\perp}$. This relation determines
a unique projection operator $\hat{P}_{\neg T}$ that represents $\neg T$:
$$\hat{P}_{\neg T} = \hat{I} - \hat{P}_T \tag{15.321}$$
Unlike the classical case, the quantum definition has the peculiar
property that T can be false in a state $|\psi\rangle$ (that is, T is not true, so that
$\mathrm{Prob}(T \,|\, |\psi\rangle) < 1$) without implying that $\neg T$ is true (which requires that
$\mathrm{Prob}(T \,|\, |\psi\rangle) = 0$). But, as expected, $\neg T$ true does imply that T is false!
This curious asymmetry stems directly from (1) the existence of linear
superpositions of eigenstates of the operator $\hat{P}_T$ and (2) our desire to have
a binary-valued, rather than trinary-valued, logic.
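4. Let $T$ and $U$ be a pair of propositions with associated closed subspaces
$\mathcal{H}_T$ and $\mathcal{H}_U$, and projection operators $\hat{P}_T$ and $\hat{P}_U$, respectively. Then
the proposition $T \wedge U$ (to be interpreted as some type of conjunction,
$T$ AND $U$) is defined by requiring that $\mathrm{Prob}(T \wedge U \mid |\psi\rangle) = 1$ in a state
$|\psi\rangle$ if and only if $\mathrm{Prob}(T \mid |\psi\rangle) = 1$ and $\mathrm{Prob}(U \mid |\psi\rangle) = 1$, that is, $T \wedge U$ is
true in the state $|\psi\rangle$ if and only if both $T$ AND $U$ are true. This means
$$\mathcal{H}_{T \wedge U} = \mathcal{H}_T \cap \mathcal{H}_U \tag{15.323}$$
and, when $\hat{P}_T$ and $\hat{P}_U$ commute,
$$\hat{P}_{T \wedge U} = \hat{P}_T \hat{P}_U \tag{15.324}$$
Similarly, for the disjunction $T \vee U$ ($T$ OR $U$),
$$\mathcal{H}_{T \vee U} = \mathcal{H}_T + \mathcal{H}_U \tag{15.325}$$
and, again for commuting projectors,
$$\hat{P}_{T \vee U} = \hat{P}_T + \hat{P}_U - \hat{P}_T \hat{P}_U \tag{15.326}$$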
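The negation rule and the commuting-case formulas for conjunction and disjunction can be checked numerically. The following is a minimal sketch, assuming a three-dimensional Hilbert space and NumPy; the helper names `projector` and `prob` and the particular subspaces chosen for T and U are illustrative assumptions, not taken from the text.

```python
import numpy as np

def projector(v):
    """Projector onto the one-dimensional subspace spanned by the vector v."""
    v = np.asarray(v, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

def prob(P, psi):
    """Prob(T | psi) = <psi|P|psi> for the (normalized) state psi."""
    psi = np.asarray(psi, dtype=complex)
    psi = psi / np.linalg.norm(psi)
    return float(np.real(psi.conj() @ P @ psi))

I3 = np.eye(3)
# Illustrative propositions (assumed for this sketch):
P_T = projector([1, 0, 0]) + projector([0, 1, 0])   # T: state lies in span{e1, e2}
P_U = projector([0, 1, 0]) + projector([0, 0, 1])   # U: state lies in span{e2, e3}

P_not_T = I3 - P_T                 # negation, as in (15.321)
P_and = P_T @ P_U                  # conjunction; valid here because P_T and P_U commute
P_or = P_T + P_U - P_T @ P_U       # disjunction; same caveat

psi = [1, 0, 1]                    # superposition of a "T true" and a "not-T true" state
p_T = prob(P_T, psi)
p_not_T = prob(P_not_T, psi)
print(p_T, p_not_T)                # both lie strictly between 0 and 1
```

In the state `psi` the proposition T is false in the two-valued sense (its probability is below 1), yet not-T is not true either, which is the asymmetry described in item 3.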
Some Comments
This fact provides further justification for the consistency of the above
assignments of logical connectives.
2. Care must be taken in interpreting the assignment of logical connectives.
This is particularly true of the idea of $T \wedge U$. One may interpret $\wedge$ as
"and" in the normal sense only if the two operators commute since, as
mentioned earlier, only then is it unequivocally meaningful to assert that
states of the system exist in which both properties are possessed at the
same time. In the non-commuting case, the best that can be said, instead
of
$$\hat{P}_{T \wedge U} = \hat{P}_T \hat{P}_U \tag{15.329}$$
is that
$$\hat{P}_{T \wedge U} = \lim_{k \to \infty} (\hat{P}_T \hat{P}_U)^k \tag{15.330}$$
which is sometimes read as saying that $T \wedge U$ refers to the results of an
infinite sequence of measurements of $T$ and $U$.
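3. The fact that most pairs of operators do not commute is reflected in
the crucial property of the set of quantum propositions that the algebraic
structure associated with the definitions of $\wedge$ and $\vee$ fails to be distributive.
This can be seen easily with the aid of a simple example. Let $\mathcal{H}_{|\psi_1\rangle}$ and
$\mathcal{H}_{|\psi_2\rangle}$ be one-dimensional subspaces of a two-dimensional Hilbert space
$\mathcal{H}$, spanned by (that is, all complex multiples of) the linearly-independent
vectors $|\psi_1\rangle$ and $|\psi_2\rangle$ respectively, and let $|\psi\rangle$ be any non-trivial linear
combination of $|\psi_1\rangle$ and $|\psi_2\rangle$. Then we have
$$\mathcal{H}_{|\psi\rangle} \cap \left( \mathcal{H}_{|\psi_1\rangle} + \mathcal{H}_{|\psi_2\rangle} \right) = \mathcal{H}_{|\psi\rangle} \cap \mathcal{H} = \mathcal{H}_{|\psi\rangle} \tag{15.331}$$
but if we use the standard distributive property we have
$$\mathcal{H}_{|\psi\rangle} \cap \left( \mathcal{H}_{|\psi_1\rangle} + \mathcal{H}_{|\psi_2\rangle} \right) = \left( \mathcal{H}_{|\psi\rangle} \cap \mathcal{H}_{|\psi_1\rangle} \right) + \left( \mathcal{H}_{|\psi\rangle} \cap \mathcal{H}_{|\psi_2\rangle} \right) = \mathcal{H}_{\emptyset} + \mathcal{H}_{\emptyset} = \mathcal{H}_{\emptyset} \tag{15.332}$$
where $\mathcal{H}_{\emptyset}$ denotes the null subspace. We thus have the contradiction
$\mathcal{H}_{|\psi\rangle} = \mathcal{H}_{\emptyset}$. This result contrasts sharply
with the distributive property of classical propositions, which is a direct consequence
of the Boolean nature of the algebra of intersections and unions
of subsets of a classical state space.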
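The limit formula for the conjunction and the failure of distributivity can both be illustrated with a few lines of NumPy. This is only a sketch under stated assumptions: the function names `meet` and `join` are hypothetical, the limit is approximated by a finite power `k`, the disjunction is computed from the complement rule (the orthogonal complement of a span is the intersection of the complements), and the three vectors are one illustrative choice in a two-dimensional real space.

```python
import numpy as np

def projector(v):
    """Projector onto the line spanned by the real vector v."""
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)
    return np.outer(v, v)

def meet(P, Q, k=200):
    """Approximate the projector onto the intersection by (P Q)^k, cf. the limit formula."""
    return np.linalg.matrix_power(P @ Q, k)

def join(P, Q, k=200):
    """Projector onto the span, via the complement rule: I - meet(I - P, I - Q)."""
    I = np.eye(P.shape[0])
    return I - meet(I - P, I - Q, k)

P1 = projector([1.0, 0.0])      # subspace spanned by |psi_1>
P2 = projector([0.0, 1.0])      # subspace spanned by |psi_2>
Ppsi = projector([1.0, 1.0])    # a non-trivial combination of the two

lhs = meet(Ppsi, join(P1, P2))                 # psi AND (psi_1 OR psi_2)
rhs = join(meet(Ppsi, P1), meet(Ppsi, P2))     # (psi AND psi_1) OR (psi AND psi_2)
print(np.round(lhs, 6))   # the projector onto |psi>
print(np.round(rhs, 6))   # the zero operator (null subspace)
```

The two sides differ, so the lattice operations built from subspaces do not obey the distributive law, in agreement with the two-dimensional example above.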
Most of the strange, non-classical features of quantum theory can be traced back
to this non-distributive property of the logical structure of quantum proposi-
tions.
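Gleason's Theorem
The probability rules for density matrices imply that for any proposition $P$,
with associated projector $\hat{P} \in \mathcal{L}(\mathcal{H})$, the probability associated with $P$ in the
state $\hat{\rho}$ is
$$\mathrm{Prob}(P \mid \hat{\rho}) = \mathrm{Tr}\,(\hat{\rho} \hat{P}) \tag{15.333}$$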
An interesting and important question arises at this point. Are there any other
ways of assigning probabilities to the elements of $\mathcal{L}(\mathcal{H})$, or, is the standard
quantum formalism unique? More precisely, can we find a new probabilistic
theory by starting with some Hilbert space $\mathcal{H}$, with its non-distributive algebra
$\mathcal{L}(\mathcal{H})$ of subspaces/projection operators, and then construct a probability map
$\mathrm{Prob} : \mathcal{L}(\mathcal{H}) \to \mathbb{R}$ that is not of the type given above? In particular, what is
the space of states in such a theory?
To tackle this question we must first decide what requirements should be satisfied
by a probability map $\mathrm{Prob} : \mathcal{L}(\mathcal{H}) \to \mathbb{R}$. The analogous question in classical
statistical physics is answered by defining a general probability measure to be a
real-valued function of subsets of $S$ that satisfies the three conditions
In the quantum case, one obvious condition is $0 \le \mathrm{Prob}(P) \le 1$ for all propositions
$P \in \mathcal{L}(\mathcal{H})$. Two other, essentially trivial, requirements are $\mathrm{Prob}(\emptyset) = 0$
and $\mathrm{Prob}(I) = 1$, where $\emptyset$ and $I$ correspond respectively to the propositions
that are identically false and true.
which, if $\{P_1, P_2, \ldots, P_M\}$ is any finite set of propositions that are pairwise
exclusive, generalizes at once to
$$\mathrm{Prob}(P_1 \vee P_2 \vee \cdots \vee P_M) = \sum_{i=1}^{M} \mathrm{Prob}(P_i) \tag{15.335}$$
for any finite or countably infinite set $\{\hat{P}_1, \hat{P}_2, \ldots\}$ of projection operators that
are pairwise mutually orthogonal.
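A remarkable theorem due to Gleason shows that, provided that $\dim(\mathcal{H}) > 2$,
the converse is true - the only way of satisfying
$$0 \le \mathrm{Prob}(P) \le 1 \quad \text{for all } P \in \mathcal{L}(\mathcal{H})$$
$$\mathrm{Prob}(0) = 0 , \qquad \mathrm{Prob}(I) = 1$$
$$\mathrm{Prob}\left( \sum_{i=1}^{\infty} \hat{P}_i \right) = \sum_{i=1}^{\infty} \mathrm{Prob}(\hat{P}_i)$$
is with the aid of a density operator $\hat{\rho}$ and the relation
$$\mathrm{Prob}(P) = \mathrm{Tr}\,(\hat{\rho} \hat{P}) \tag{15.338}$$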
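The easy direction of this statement, namely that the trace rule does satisfy the three listed conditions, can be checked numerically. The sketch below assumes NumPy, a randomly generated density operator in three dimensions, and a randomly generated orthonormal basis; all variable names are illustrative. It does not, of course, test Gleason's converse, which is the hard part of the theorem.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 3

# Random density operator: positive semi-definite with unit trace.
A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
rho = A @ A.conj().T
rho = rho / np.trace(rho).real

# Random orthonormal basis (columns of Q) and the associated rank-one projectors.
Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim)))
projectors = [np.outer(Q[:, i], Q[:, i].conj()) for i in range(dim)]

def prob(P, rho):
    """Trace rule: Prob(P) = Tr(rho P)."""
    return float(np.real(np.trace(rho @ P)))

probs = [prob(P, rho) for P in projectors]
additive_lhs = prob(projectors[0] + projectors[1], rho)   # Prob of a sum of orthogonal projectors
additive_rhs = probs[0] + probs[1]                        # sum of the individual probabilities
print(probs, additive_lhs, additive_rhs)
```

The probabilities lie between 0 and 1, add up to 1 over a complete orthogonal set, and are additive on mutually orthogonal projectors.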
If you look at most papers on this subject it seems that only the experts can
know the details, because only they have the knowledge of the esoteric mathe-
matics required. Hopefully we can simplify the arguments using only elementary
mathematics (especially geometric arguments) in this presentation.
We first build a mathematical model of the subatomic world based on the properties
revealed by the Stern-Gerlach experiment. We then analyze the model and
prove a theorem whose physical implications are immediately obvious. One must
follow the reasoning carefully and analyze the figures to understand the proof of
the theorem. We will find that the subatomic world cannot be completely de-
termined and that its properties cannot be definite before an experiment reveals
these properties to us.
As we saw earlier, one of the first surprising results on the behavior of small
particles was observed by Stern and Gerlach in 1922. In that experiment, a
beam of silver atoms was sent through a magnet as shown in Fig. 15.15 below.
Figure 15.15: Stern-Gerlach Setup
It was subsequently discovered that for atoms other than silver, the beam split
into three, four, or more branches, depending on the kind of atom. If the beam
of particles splits into two branches, we say the particle has spin 1/2; if it splits
into three, we say the particle has spin 1; if it splits into four, we say the particle
has spin 3/2, and so forth.
Figure 15.16: Stern-Gerlach oriented in z-direction
If the magnet is turned to face in any other direction, say the x-direction
represented by the unit vector $\hat{x}$, as in Fig. 15.17 below
the result will be the same: a beam goes to the right, another one goes to the left,
and a third beam is not deflected. The corresponding states are $|{+1}_x\rangle$,
$|{-1}_x\rangle$, and $|0_x\rangle$ as before.
From this experiment, we come to the important conclusion that the state of
a system is characterized by a state vector that is labeled by a number or an
arrow representing a direction in space. This type of characterization, as we
have seen, turns out to be a basic principle of quantum mechanics.
We can use a Stern-Gerlach device to answer the following question: Is the system
in the $|0_x\rangle$ state? The answer will be yes for the beam that does
not deflect when the magnetic field points in the x-direction. The beams that
are deflected will give us a negative answer to this question. This kind of question,
which is answered with only two alternatives (yes or no, 1 or 0), is called
a projector. Thus, the question "Is the system in the $|0_x\rangle$ state?" is
called the $x$ or $0_x$ projector (or projection operator) and is represented by the
operator symbol $P_x$ or $P_{0x}$. In the same way, we can apply a $0_y$ projector $P_{0y}$
which corresponds to the question "Is the system in the $|0_y\rangle$ state?" In
general, we can define the projector $P_{0n}$ corresponding to an arbitrary direction
represented by the unit vector $\hat{n}$.
This, as we know, is the same as the square of the scalar product (probability
amplitude) of the states representing the atom in each case, i.e., $|\langle 0_n | 0_x \rangle|^2$, which
is just the Born Rule. If the direction $\hat{n}$ is orthogonal to $\hat{x}$, then the projection
is zero. Therefore, the probability that the system is in the state $|0_n\rangle$ is zero. In this
case, the answer to the question $P_n$ will be no if the answer to $P_x$ was yes.
this looks like:
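$$|0_x\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} , \qquad |0_y\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} , \qquad |0_z\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$$
Thus,
$$\langle 0_x | 0_y \rangle = \langle 0_x | 0_z \rangle = \langle 0_y | 0_z \rangle = 0$$
In three-dimensional space, there are at most three compatible projectors. For
example, the projectors $P_{0x}$, $P_{0y}$, and $P_{0z}$ are compatible, because they do not
interfere with each other and can be implemented at the same time, meaning
that the response will be yes for one of the three and no for the other two. If
the system is in the $|0_x\rangle$ state, for instance, it cannot be in the $|0_y\rangle$ state or the
$|0_z\rangle$ state.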
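The compatibility of the three projectors and the Born rule for an arbitrary direction can be checked with a short NumPy sketch. Several things here are assumptions rather than statements of the text: the standard spin-1 matrices in the $S_z$ basis (with $\hbar = 1$), the relative minus sign in the column vector for $|0_x\rangle$ (chosen so that the three states are mutually orthogonal), and the helper names `proj` and `zero_state`.

```python
import numpy as np

# Spin-1 matrices in the S_z basis (hbar = 1); standard convention, assumed here.
s = 1 / np.sqrt(2)
Sx = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Sy = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
Sz = np.diag([1, 0, -1]).astype(complex)

ket_0x = s * np.array([1, 0, -1], dtype=complex)   # relative minus sign is an assumption
ket_0y = s * np.array([1, 0, 1], dtype=complex)
ket_0z = np.array([0, 1, 0], dtype=complex)

def proj(ket):
    """Projector |ket><ket| for a normalized ket."""
    return np.outer(ket, ket.conj())

def zero_state(n):
    """Eigenvector of n.S with eigenvalue 0, i.e. the state |0_n>."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    vals, vecs = np.linalg.eigh(n[0] * Sx + n[1] * Sy + n[2] * Sz)
    return vecs[:, np.argmin(np.abs(vals))]

# The three compatible projectors add up to the identity.
total = proj(ket_0x) + proj(ket_0y) + proj(ket_0z)

# Born rule for a direction n tilted by 30 degrees away from x, in the x-z plane.
theta = np.radians(30.0)
n = [np.cos(theta), 0.0, np.sin(theta)]
p = abs(np.vdot(zero_state(n), ket_0x)) ** 2
print(p, np.cos(theta) ** 2)   # the two numbers agree
```

For a direction orthogonal to $\hat{x}$ the same calculation gives zero, which is the statement that the answer to $P_n$ is no whenever the answer to $P_x$ is yes.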
When we throw the die, we could in principle know what number we would see if
we knew the precise value of the force applied to the die. But there are so many
elements we do not control that we accept the result as a random event, even
though in theory, we could predict the result if we analyzed the details of the
enormous variety of variables that are relevant. We use probabilities because
analyzing all those details would be too laborious. It is a question of practical
difficulty, not of principle.
What would happen in a quantum experiment? Would there be any way of de-
termining the result of an experiment through a complete understanding of the
variables? We know that when we repeat a quantum experiment we can obtain
different results, even with the same initial conditions. That does not happen in
the case of the die: if the forces were the same, the result would be the same as
well. In a quantum experiment we may need to have more profound knowledge.
Moreover, it is possible that today's technology is insufficient to discover the
subtle variables that cause the same experiment to produce different results. Is
it not plausible to think that there should be something within the system that
explains why we obtain one result and not another? Einstein, among others,
thought this way. For him, quantum mechanics, despite its achievements, seemed
to be an incomplete theory because it could not predict with certainty the re-
sult of an experiment. Even though our insufficient knowledge prevents us from
identifying it, could each object have a hidden variable that could explain its
apparently random behavior in a particular experiment? Or do we have to ad-
mit that quantum uncertainty is more radical? Is matter really intrinsically
random so that there is no possibility, even in principle, of predicting the result
of an experiment?
An alternative lies between these two pictures of the world. If the hidden vari-
ables explaining the apparently random results exist, we would have a totally
deterministic world. It is a world where there is a cause for everything that
happens. In such a world of hidden variables, the properties of a subatomic
particle, or any other quantum system, would be clearly defined. If we re-
turn to the set of projectors we have constructed, then each projector would
have a plain answer: yes (1) or no (0). As we have said, the projectors take
on only two values, 0 or 1, with a certain probability. When we say that the
projectors are defined, we mean that they take on the value 1 (or 0) with
probability 1. Therefore the expectation value of any projector in that case is
$\langle P \rangle = \mathrm{prob}(P = 1) = 0$ or $1$. In brief, definite values means $\langle P \rangle = 0$ or $1$.
If there were a definite world out there, we could ascribe the values 0 or 1 to all
the projectors, with the condition that if three projectors are mutually orthogonal,
only one of them will be ascribed the value 1. Note that the assumption
$\langle P \rangle = 0$ or $1$ is by itself contradictory to quantum mechanics because in the
latter $\mathrm{prob}(P = 1 \mid P_z = 1) = \cos^2\theta$, where $\theta$ is the angle between the direction
of $P$ and the z-axis. We do not have to wait for the conclusion
of the theorem. It is a matter of principle: a theory with definite values clashes
with quantum mechanics. So for that matter we do not need the KSB theorem.
The merit of the theorem is to show that any theory with definite values and
satisfying $\mathrm{prob}(\sum P_i) = \sum \mathrm{prob}(P_i)$ ($P_i$ mutually orthogonal) is contradictory
by itself.
But this proposition is a mathematical one and can be analyzed as such. Thus,
mathematics will tell us which is the correct picture of the world. (This power
of mathematics is possible only because we have previously constructed the pro-
jectors, a mathematical structure, superimposed on the physical world).
able. In this case, our set would be a true dispersionless state (dispersion is the
deviation from the mean). On the contrary, if there were no hidden variables,
then apparently identical systems (systems that are in the same state) would
have slight differences which would correspond to their different behavior in the
experiments; thus, they do show dispersion. Von Neumann proved that there
are no dispersionless states; therefore there cannot be hidden variables. But his
demonstration was not conclusive. The matter was not resolved until 1966-1967.
It was Bell (using a result obtained previously by Gleason) and Kochen and
Specker who proved that there cannot be pure dispersionless states, that is, it
is impossible to ascribe values 0, 1 to the projectors consistently. We present
here a simplified version, proposed by Gill and Keane (and based on the work
of Piron).
Because the projectors point in any direction of space, we can talk about projec-
tors or directions alike. Besides, let us remember that, if we consider a sphere of
unit radius, each direction has one corresponding point on it: the one where that
direction goes through the sphere. Thus we can equally well speak of projec-
tors, directions, or points on a sphere. Consequently, the existence of an outside
world, determinate and deterministic, would mean that we can ascribe values
0 or 1 to all the points on a sphere. But remember that we have a restriction:
in each orthogonal triple, we ascribe the value 1 to only one point and the 0 to
the other two.
Figure 15.19: Orthogonal Triple
To see that this statement is true, it is sufficient to note that any direction $B$ in
the great circle can make up an orthogonal triple with AP and another direction
$B'$ in the circle, perpendicular to $B$ as shown in Fig. 15.20 below.
Because an orthogonal triple can have only one 1, $B$ and $B'$ must take on the
value 0.
The great circle descent from a point A is the great circle which has A as its
most northerly point. (From now on, the great circle descent from A will be
called the great circle from A). Beginning at any point A (not at the north
pole or on the equator), we can descend along the great circle from A to an-
other point B; then we can descend along the great circle from B to another
point C; we can once more descend along the great circle from C; and so forth.
The lemma states that we can go along great circle descents from one point to
another further south. That means that we can travel from Madrid to Manila
going along a sequence of great circle descents.
The proof of this lemma is straightforward if we project the points of the north-
ern hemisphere from the center of the sphere onto the plane tangent to the
sphere at the north pole. In Fig.15.21 below we see that the projection of a
parallel is a circumference (parallels are lines of constant latitude). We also
see that a great circle descent projects onto a straight line, the tangent to the
circumference projected by the parallel that contains the summit of the great
circle. Besides, the meridians give rise to straight lines passing through the
north pole.
Figure 15.21: Parallel and great circle that are tangent at S project from sphere
center onto tangent plane at North Pole, giving rise, respectively, to a circum-
ference and a straight line tangent to it at point s. Point C of great circle from
S projects onto point c of straight line. Meridian through S gives rise to straight
line Ps, perpendicular to straight line sc.
These considerations imply that the great circle descent from S projects onto a
straight line, tangent to the circumference at the point s of the tangent plane
and perpendicular at that point to the line projected by the meridian. To prove
the lemma, we begin with the simplest case: How can we go along great cir-
cle descents from the point S to the point T , both on the same meridian (see
Fig.15.22 below)?
Figure 15.22: S and T are on the same meridian
First, let us project the points S and T onto the plane tangent to the sphere at
the north pole (see Fig.15.23 below)
Figure 15.23: The projection of the great circle from S ($D_S$) is the straight line
$d_s$, perpendicular to the projection of its meridian (the straight line $Pst$)
We will obtain the points $s$ and $t$. We also project the great circle from $S$ ($D_S$)
and obtain the straight line $d_s$, perpendicular to the projection of the meridian
$Pst$. Remember that we want to find a point $R$ such that we can descend to $R$
along the great circle from $S$, and then to $T$ along the great circle from $R$. We
are going to search for $R$ using the tangent plane, because $R$ will be the summit
of its great circle if the projections of its meridian and the great circle from it
are perpendicular. Thus we should search for a point $r$ in $d_s$ such that the angle
$Prt$ is $90^\circ$; in this way $RT$ will be an arc of the great circle from $R$, because
its projection $rt$ will be perpendicular at $r$ to the projection $Pr$ of the meridian
through $R$. To find the point $r$ it is sufficient to recall a statement of elementary
geometry: An inscribed angle spanning a semicircumference is $90^\circ$. Therefore
we draw the semicircumference whose diameter is the segment $Pt$ (see Fig. 15.24
below); the point where it intersects $d_s$ is the point $r$ for which we were looking.
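The construction can be followed numerically. The sketch below is an illustration under stated assumptions: the sphere has unit radius, the tangent plane is $z = 1$ with $P$ at its origin, the common meridian of $S$ and $T$ projects onto the positive x-axis, and the colatitudes 30 and 60 degrees as well as the function names are hypothetical choices. With $s = (a, 0)$ and $t = (b, 0)$, the line $d_s$ is $x = a$, and the circle on diameter $Pt$ meets it at height $\sqrt{a(b-a)}$.

```python
import numpy as np

def to_sphere(p):
    """Point of the northern hemisphere whose central projection onto z = 1 is p."""
    v = np.array([p[0], p[1], 1.0])
    return v / np.linalg.norm(v)

def on_great_circle_from(summit, point, tol=1e-12):
    """True if `point` lies on the great circle whose most northerly point is `summit`."""
    east = np.cross([0.0, 0.0, 1.0], summit)   # horizontal tangent direction at the summit
    normal = np.cross(summit, east)            # normal of the plane of that great circle
    return abs(np.dot(normal, point)) < tol

def intermediate_point(a, b):
    """Point r on the line x = a with a right angle P-r-t, where t = (b, 0) and P = (0, 0)."""
    if not 0 < a < b:
        raise ValueError("need 0 < a < b (T must lie south of S)")
    return np.array([a, np.sqrt(a * (b - a))])

colat_S, colat_T = np.radians(30.0), np.radians(60.0)   # illustrative colatitudes
a, b = np.tan(colat_S), np.tan(colat_T)                  # distances of s and t from P
s, t = np.array([a, 0.0]), np.array([b, 0.0])
r = intermediate_point(a, b)
S, R, T = to_sphere(s), to_sphere(r), to_sphere(t)

print(np.dot(-r, t - r))   # approximately zero: the angle Prt is a right angle
print(on_great_circle_from(S, R), on_great_circle_from(R, T))
```

The heights of the three points decrease in the order $S$, $R$, $T$, so the route really is a pair of descents.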
Figure 15.24: r is intersection of semicircumference (diameter = segment Pt)
with line $d_s$
Next we find the point R in the sphere surface whose projection is r. In Fig.
15.25 below we see the spatial construction that enables us to find the point R,
and the arcs of the great circle descents DS and DR that we travel on to reach
T from S.
Figure 15.25: Join r with center of sphere and find R on spherical surface
We have already looked at the case in which the two points that we want to
join using great circle descents are on the same meridian. Next, we examine the
procedure used when the two points are on different meridians. In particular,
we will examine the case when the two points are almost antipodes, that is, points whose
geographic longitudes differ by $180^\circ$.
Let us suppose that we intend to descend from the point $S$ to another one on
a lower latitude, which is on the other side (that is, $180^\circ$ away) on the same
meridian. First we project the point $S$ onto the plane tangent to the sphere at
the north pole (see Fig. 15.26 below).
Figure 15.26: Project point S onto plane tangent to sphere at North pole
Next, we divide the angle of $180^\circ$ that separates the points of interest in the
projection plane into, for example, eight parts. We start at $s$ and make the
construction in Fig. 15.27 below.
We call $T$ the point on the sphere surface whose projection onto the plane is
exactly the point $t$. The points $S$ and $T$ become linked by the arcs of great circle
descents corresponding to the projections perpendicular to the lines dividing
the $180^\circ$ angle. (If we divide the $180^\circ$ into smaller equal angles and make the
previous construction, we could get from $S$ to latitudes lower but almost equal
to that of $S$). To find the spherical surface points we join each of the dividing
points with the sphere's center (see Fig. 15.28 below).
Figure 15.28: Project back to find the summits of the great circle descents
The points S and T are thus linked by the arcs of the succession of great circle
descents shown in Fig. 15.29 below.
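The effect of dividing the $180^\circ$ into more and more parts can be made quantitative. In the tangent plane, each step starts at a point on one dividing line, moves perpendicular to that line, and stops at the next dividing line, so the distance from $P$ grows by a factor $1/\cos(\text{step angle})$ per step. The sketch below assumes this reading of the construction, a unit sphere, and a starting colatitude of 30 degrees; the function names are hypothetical.

```python
import numpy as np

def descent_chain(colat_start_deg, n_steps, total_angle_deg=180.0):
    """Projected summits of successive great circle descents sweeping total_angle_deg."""
    step = np.radians(total_angle_deg) / n_steps
    if step >= np.pi / 2:
        raise ValueError("each step must turn by less than 90 degrees")
    rho = np.tan(np.radians(colat_start_deg))   # distance of s from P in the plane z = 1
    points = []
    for k in range(n_steps + 1):
        points.append(rho * np.array([np.cos(k * step), np.sin(k * step)]))
        rho /= np.cos(step)                      # perpendicular segment reaches the next line
    return np.array(points)

def colatitude_deg(p):
    """Colatitude (angle from the north pole) of the sphere point projecting onto p."""
    return np.degrees(np.arctan(np.linalg.norm(p)))

for n_steps in (8, 64, 1000):
    chain = descent_chain(30.0, n_steps)
    print(n_steps, colatitude_deg(chain[-1]))
```

The final colatitude approaches the starting value from above as the number of parts grows, which is the content of the parenthetical remark: the far side of the sphere can be reached at a latitude only slightly lower than that of $S$.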
triple. First let us see which distribution of zeros and ones the lemma implies.
Let us begin with an orthogonal triple and assign 1 to one of the points and 0
to the other two. Let us place the 1 at the North Pole to facilitate the picture;
all the points on the equator will thus take on the value 0 (see the proposition).
Let us now consider the triple in Fig. 15.30, where the great circle's plane forms
an angle of more than $45^\circ$ with the equator.
Figure 15.30: Great circle A forms an angle of more than $45^\circ$ with the equator
If all the points on the spherical surface within $45^\circ$ of a 1-valued point have
to take the value 1, then the points on the great circles perpendicular to those
projectors, which occupy the central zone of the spherical surface, have to take
on the value 0. In short, from the hypothesis that the projectors have well-defined
values, we have reached the conclusion that the spherical surface must
be distributed according to Fig. 15.31 below: If we choose as the North Pole the
direction of the original 1-valued projector, the points in the $45^\circ$ domes around
the North Pole and around the South Pole are 1-valued. The remaining points
of the spherical surface are 0-valued.
Figure 15.31: Distribution of ones and zeros on the spherical surface
But the spherical surface's division into 1s and 0s that we have concluded is contradictory
(in addition to clashing with quantum mechanics, that is, for quantum
mechanics, and contrary to the result we have obtained, there is a nonzero
probability that a projector within the upper skullcap takes on the value zero).
Indeed, from the hypothesis that projectors have well-defined values, we have
proven that if a projector is 1-valued, it has to be surrounded by a $45^\circ$ dome
made up of 1s. But let us look at a projector that forms, for example, $40^\circ$ with
the North Pole ($P_{40}$ in Fig. 15.31). It also has the value 1; therefore all the
projectors until $P_{85}$ (a projector that forms $85^\circ$ with the North Pole) should be
1-valued as well. But we had concluded that $P_{85}$ was 0-valued. Consequently,
we arrive at a contradiction. Thus it is impossible to distribute 1s and 0s on
a spherical surface with the required condition. (In terms of color we see that
if 1 means red and 0 means green and in each orthogonal triple there must be
only one red point, it is impossible to color the spherical surface with these two
colors). We have proven the KSB no-go theorem: It is impossible to ascribe
values 0 or 1 to each point on a spherical surface so that we have exactly one 1
in each orthogonal triple.
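The inconsistency of the dome distribution can be seen in two concrete ways with a small NumPy sketch. The function `dome_value` encodes the distribution of Fig. 15.31 as described in the text; the helper `from_colatitude` and the particular orthogonal triple (three directions that all make the same angle of about 54.7 degrees with the polar axis) are illustrative choices, not taken from the text.

```python
import numpy as np

def dome_value(direction):
    """1 inside the 45-degree domes around either pole, 0 elsewhere (Fig. 15.31)."""
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    return 1 if abs(d[2]) > np.cos(np.radians(45.0)) else 0

def from_colatitude(deg):
    """Unit vector in the x-z plane at the given angle from the North Pole."""
    a = np.radians(deg)
    return np.array([np.sin(a), 0.0, np.cos(a)])

# (1) The argument of the text: P40 is 1-valued, P85 is 45 degrees away from it,
#     so the dome rule applied to P40 demands 1 where the distribution says 0.
p40, p85 = from_colatitude(40.0), from_colatitude(85.0)
angle_between = np.degrees(np.arccos(np.dot(p40, p85)))
print(dome_value(p40), dome_value(p85), angle_between)

# (2) An orthogonal triple whose members all lie in the 0-valued central zone,
#     so the triple contains no 1 at all.
triple = np.array([
    [ 1 / np.sqrt(2),  1 / np.sqrt(6), 1 / np.sqrt(3)],
    [-1 / np.sqrt(2),  1 / np.sqrt(6), 1 / np.sqrt(3)],
    [ 0.0,            -2 / np.sqrt(6), 1 / np.sqrt(3)],
])
values = [dome_value(v) for v in triple]
print(values)
```

Either observation alone shows that the dome distribution cannot satisfy the rule of exactly one 1 per orthogonal triple.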
Peres has said that unperformed experiments have no results, preventing us from
using counterfactual arguments.
One of the authors of the theorem, Specker, related this problem to a question
about the different kinds of God's knowledge of reality. In particular, the
KSB theorem is related to the future contingencies of the philosopher and Je-
suit Pedro Fonseca (1528-1599): those future events whose occurrence depends
on an action taken by a creature exercising free will. According to Fonseca,
God knows the outcome of unperformed experiments, or as he puts it, he knows
what would happen if He put the will of the creatures in circumstances different
from those in which He put them. Father Fonseca calls this knowledge middle
science. But the KSB theorem proves that God does not have middle science.
He cannot know what would happen if something different from what actually
happens were to happen, that is, if a different projector had been measured.
The figure is a polyhedron which is composed of three interpenetrating cubes,
each obtained from the others by $90^\circ$ rotations about lines joining the midpoints
of its opposite edges (see Fig. 15.33 below).
The 33 directions of Peres are the lines through opposite vertices, opposite edge
midpoints, and opposite face midpoints of each cube.
Figure 15.35: The directions on the two interpenetrating cubes A and B
Figure 15.38: The interpenetrating cubes A and C in perspective
In the orthogonal triple $A_S X Y$, we choose $A_S$ as the direction with the
value 1 so that the directions $X$ and $Y$ will take on the value 0. The directions
$A_L$ and $A_F$ are 0-valued too, because they are perpendicular to $A_S$. In the
triple $B_S B_F Y$, the direction $Y$ is 0-valued. From the two others, we choose $B_S$
as the one with the value 1. In the triple $C_S C_L X$, the $X$ is 0-valued. From the
two others, we choose $C_S$ as the one with the value 1. In the triple $D E A_F$,
$A_F$ is 0-valued. We can still choose the value of the two others. We choose $D$
as the one with the value 1. The directions $F$ and $G$ are perpendicular to $D$ as
well; so they will take on the value 0.
In the triple $H F Y$, $H$ has the value 1, because $F$ and $Y$ are 0-valued. $J$ takes
on the value 0 because it is perpendicular to $H$. (It is clear that $H$ and $F$ are
perpendicular because $F$ becomes $H$ when we turn the cube A $90^\circ$ around $Y$,
to generate the cube C).
to $Q$. In $S B_F R$, $S$ takes on the value 1 because it is orthogonal to $B_F$ and
$R$, which are 0-valued. $T$ takes on the value 0 because it is orthogonal to $S$.
Finally, in $X G T$, $X$ takes on the value 1 because it is orthogonal to $G$ and $T$,
which had the value 0. But $X$ was 0-valued! Thus we arrive at a contradiction.
By using only a finite number of directions, we have shown again that it is not
possible to assign 0,1 values to every direction if the condition that there is only
one 1-valued direction for each orthogonal triple of directions is fulfilled. So we
have again proved the KSB theorem.
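The finite version of the argument lends itself to an exhaustive computer check. The sketch below is hedged in several ways: the text gives the 33 directions only geometrically, so the coordinates used here (all distinct rays obtained from permutations and sign changes of $(0,0,1)$, $(0,1,1)$, $(0,1,\sqrt{2})$ and $(1,1,\sqrt{2})$) are an assumption taken from Peres's published construction, and the function names `peres_rays` and `find_coloring` are hypothetical. The search enforces the two rules used in the proof above: two perpendicular directions cannot both be 1-valued, and every orthogonal triple must contain a 1.

```python
import itertools
import math

def peres_rays():
    """Assumed coordinates of the 33 directions (opposite vectors count as one ray)."""
    r2 = math.sqrt(2.0)
    rays = set()
    for seed in [(0, 0, 1), (0, 1, 1), (0, 1, r2), (1, 1, r2)]:
        for perm in itertools.permutations(seed):
            for signs in itertools.product((1, -1), repeat=3):
                v = tuple(float(c * sg) + 0.0 for c, sg in zip(perm, signs))
                lead = next(c for c in v if c != 0)
                if lead < 0:                      # v and -v are the same direction
                    v = tuple(-c + 0.0 for c in v)
                rays.add(v)
    return sorted(rays)

def orthogonal(u, v, tol=1e-9):
    return abs(sum(a * b for a, b in zip(u, v))) < tol

def find_coloring(rays):
    """Return (colorable, triads): search for a 0/1 assignment with one 1 per triad."""
    n = len(rays)
    nbrs = [[j for j in range(n) if j != i and orthogonal(rays[i], rays[j])]
            for i in range(n)]
    triads = [t for t in itertools.combinations(range(n), 3)
              if all(orthogonal(rays[a], rays[b])
                     for a, b in itertools.combinations(t, 2))]
    value = [None] * n

    def search():
        # Pick the first triad that does not yet contain a 1.
        open_triad = next((t for t in triads
                           if not any(value[i] == 1 for i in t)), None)
        if open_triad is None:
            return True                           # every triad has its 1
        for i in open_triad:
            if value[i] is not None:              # already forced to 0 by a neighbour
                continue
            changed = [i] + [j for j in nbrs[i] if value[j] is None]
            value[i] = 1
            for j in changed[1:]:
                value[j] = 0                      # perpendicular directions become 0
            if search():
                return True
            for j in changed:
                value[j] = None                   # undo and try the next member
        return False

    return search(), triads

rays = peres_rays()
colorable, triads = find_coloring(rays)
print(len(rays), len(triads), colorable)
```

Under these assumptions the search reports that no consistent assignment exists, in agreement with the contradiction reached above.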
How far does this theorem lead us? If, as shown, projectors and quantum
observable quantities are not completely determined, we have a picture of mi-
croscopic reality with indefinite and ambiguous properties. Physical properties
do not have, in general, objective existence independent of the act of observation.
We should accept what Jordan had already said in 1934: "Observations
not only disturb what has to be measured, they produce it! In a measurement of
position, for example, the electron is forced to a decision. We compel it to assume
a definite position; previously it was, in general, neither here nor there; it
had not yet made its decision for a definite position...." And Jordan concluded:
"We ourselves produce the results of measurement." The KSB theorem proves
that we do not have an alternative picture of the subatomic world. That is its
importance.
15.6 Problems
15.6.1 Measurements in a Stern-Gerlach Apparatus
(a) A spin-1/2 particle in the state $|S_z +\rangle$ goes through a Stern-Gerlach analyzer
having orientation $\hat{n} = \cos\theta\,\hat{z} - \sin\theta\,\hat{x}$ (see figure below).
What is the probability of finding the outgoing particle in the state $|S_n +\rangle$?
(b) Now consider a Stern-Gerlach device of variable orientation as in the figure
below.
More specifically, assume that it can have the three different directions
$$\hat{n}_1 = \hat{n} = \cos\theta\,\hat{z} - \sin\theta\,\hat{x}$$
$$\hat{n}_2 = \cos\left(\theta + \tfrac{2\pi}{3}\right)\hat{z} - \sin\left(\theta + \tfrac{2\pi}{3}\right)\hat{x}$$
$$\hat{n}_3 = \cos\left(\theta + \tfrac{4\pi}{3}\right)\hat{z} - \sin\left(\theta + \tfrac{4\pi}{3}\right)\hat{x}$$
with equal probability 1/3. If a particle in the state $|S_z +\rangle$ enters the
analyzer, what is the probability that it will come out with spin eigenvalue
$+\hbar/2$?
(c) Calculate the same probability as above but now for a Stern-Gerlach an-
alyzer that can have any orientation with equal probability.
(d) A pair of particles is emitted with the particles in opposite directions in
a singlet state |0, 0i. Each particle goes through a Stern-Gerlach analyzer
of the type introduced in (c); see figure below. Calculate the probability
of finding the exiting particles with opposite spin eigenvalues.
(a) Discuss the behavior of $\psi(x_1, x_2)$ in the limit $a \to 0$.
(b) Calculate the momentum space wave function and discuss its properties
in the above limit.
(c) Consider a simultaneous measurement of the positions x1 and x2 of the
two particles when the system is in the above state. What are the ex-
pected position values? What are the values resulting from simultaneous
measurement of the momenta p1 and p2 of the two particles?
15.6.4 Measurement of a Spin-1/2 Particle
A spin-1/2 electron is sent through a solenoid with a uniform magnetic field
in the y direction and then measured with a Stern-Gerlach apparatus with field
gradient in the x direction as shown below:
The time spent inside the solenoid is such that $\Omega t = \phi$, where $\Omega = 2\mu_B B/\hbar$ is
the Larmor precession frequency.
(a) Suppose the input state is the pure state $|{\uparrow}_z\rangle$. Show that the probability
for detector $D_A$ to fire as a function of $\phi$ is
$$P_{D_A} = \frac{1}{2} \left( \cos(\phi/2) + \sin(\phi/2) \right)^2 = \frac{1}{2} (1 + \sin\phi)$$
Repeat for the state $|{\downarrow}_z\rangle$ and show that
$$P_{D_A} = \frac{1}{2} \left( \cos(\phi/2) - \sin(\phi/2) \right)^2 = \frac{1}{2} (1 - \sin\phi)$$
(b) Now suppose the input is a pure coherent superposition of these two states,
$$|{\uparrow}_x\rangle = \frac{1}{\sqrt{2}} \left( |{\uparrow}_z\rangle + |{\downarrow}_z\rangle \right)$$
Find and sketch the probability for detector $D_A$ to fire as a function of $\phi$.
(c) Now suppose the input state is the completely mixed state
$$\hat{\rho} = \frac{1}{2} \left( |{\uparrow}_z\rangle \langle{\uparrow}_z| + |{\downarrow}_z\rangle \langle{\downarrow}_z| \right)$$
Find and sketch the probability for detector $D_A$ to fire as a function of $\phi$.
Comment on the result.
Figure 15.44: Spin-Interferometer Setup
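The strength of the magnetic field is chosen so that $\Omega t = \phi$, for some phase $\phi$,
where $\Omega = 2\mu_B B/\hbar$ is the Larmor frequency and $t$ is the time spent inside the
solenoid.
(a) Derive the probability of electrons arriving at detector $D_A$ as a function
of $\phi$ for the following pure state inputs:
(i) $|{\uparrow}_z\rangle$ , (ii) $|{\downarrow}_z\rangle$ , (iii) $|{\uparrow}_x\rangle$ , (iv) $|{\downarrow}_x\rangle$
Comment on your results.
(b) Remember that for a mixed state we have
$$\hat{\rho} = \sum_i p_i |\psi_i\rangle \langle\psi_i|$$
This is a statistical mixture of the states $\{|\psi_i\rangle\}$, not a coherent superposition
of states. We should think of it classically, i.e., we have one of the
set $\{|\psi_i\rangle\}$, we just do not know which one.
Prove that
$$P_{D_A} = \mathrm{Tr}\left[ |{\uparrow}_x\rangle \langle{\uparrow}_x| \hat{\rho} \right] = \sum_i p_i \left| \langle{\uparrow}_x | \psi_i\rangle \right|^2$$
where $\left| \langle{\uparrow}_x | \psi_i\rangle \right|^2$ = the probability of detector B firing for the given input
state (we figured these out in part (a)). Repeat part (a) for the following
mixed state inputs:
(i) $\hat{\rho} = \frac{1}{2} |{\uparrow}_z\rangle \langle{\uparrow}_z| + \frac{1}{2} |{\downarrow}_z\rangle \langle{\downarrow}_z|$ , (ii) $\hat{\rho} = \frac{1}{2} |{\uparrow}_x\rangle \langle{\uparrow}_x| + \frac{1}{2} |{\downarrow}_x\rangle \langle{\downarrow}_x|$ ,
(iii) $\hat{\rho} = \frac{1}{3} |{\uparrow}_z\rangle \langle{\uparrow}_z| + \frac{2}{3} |{\downarrow}_z\rangle \langle{\downarrow}_z|$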
15.6.6 Which-path information, Entanglement, and Decoherence
If we can determine which path a particle takes in an interferometer, then we
do not observe quantum interference fringes. But how does this arise?
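Into one arm of the interferometer we place a which-way detector in the form
of another spin-1/2 particle prepared in the state $|{\uparrow}_z\rangle_W$. If the electron which
travels through the interferometer, and is ultimately detected (denoted by subscript
D), interacts with the which-way detector, the which-way electron flips
the spin $|{\uparrow}_z\rangle_W \to |{\downarrow}_z\rangle_W$, i.e., the which-way detector works such that
If $|\psi\rangle_D = |{\uparrow}_z\rangle_D$, nothing happens to $|{\uparrow}_z\rangle_W$
If $|\psi\rangle_D = |{\downarrow}_z\rangle_D$, then $|{\uparrow}_z\rangle_W \to |{\downarrow}_z\rangle_W$ (a spin flip)
Thus, as a composite system
$$|{\uparrow}_z\rangle_D |{\uparrow}_z\rangle_W \to |{\uparrow}_z\rangle_D |{\uparrow}_z\rangle_W , \qquad |{\downarrow}_z\rangle_D |{\uparrow}_z\rangle_W \to |{\downarrow}_z\rangle_D |{\downarrow}_z\rangle_W \tag{15.339}$$
(a) The electron D is initially prepared in the state $|{\uparrow}_x\rangle_D = \left( |{\uparrow}_z\rangle_D + |{\downarrow}_z\rangle_D \right) / \sqrt{2}$.
Show that before detection, the two electrons D and W are in an entangled
state
$$|\Psi_{DW}\rangle = \frac{1}{\sqrt{2}} \left( |{\uparrow}_z\rangle_D |{\uparrow}_z\rangle_W + |{\downarrow}_z\rangle_D |{\downarrow}_z\rangle_W \right)$$
(b) Only the electron D is detected. Show that its marginal state, ignoring
the electron W, is the completely mixed state,
$$\hat{\rho}_D = \frac{1}{2} |{\uparrow}_z\rangle_D \, {}_D\langle{\uparrow}_z| + \frac{1}{2} |{\downarrow}_z\rangle_D \, {}_D\langle{\downarrow}_z|$$
This can be done by calculating
$$\mathrm{Prob}(m_D) = \sum_{m_W} \left| \langle m_D, m_W | \Psi_{DW} \rangle \right|^2$$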
This state shows no interference between $|{\uparrow}_z\rangle_D$ and $|{\downarrow}_z\rangle_D$. Thus, the
which-way detector removes the coherence between states that existed
in the input.
(c) Suppose now the which-way detector does not function perfectly and does
not completely flip the spin, but rotates it by an angle $\theta$ about $x$ so that
$$|\theta\rangle_W = \cos(\theta/2) \, |{\uparrow}_z\rangle_W + \sin(\theta/2) \, |{\downarrow}_z\rangle_W$$
Show that in this case the marginal state is
1 1
D = |z iD D hz | + |z iD D hz |
2 2
+ cos (/2) |z iD D hz | + sin (/2) |z iD D hz |
Comment on the limits $\theta \to 0$ and $\theta \to \pi$.
These scientists (one of whom has a diabolical bent) decide to play a game with
Nature: Oivil (of course) stays in the lab, while Livio treks to a point a light-year
away. The light source is turned on and emits two photons, one directed
toward each scientist. Oivil soon measures the polarization of his photon; it is
left-handed. He quickly makes a note that his sister is going to see a left-handed
photon, about a year from that time.
The year has passed and finally Livio sees her photon, and measures its polarization.
She sends a message back to her brother Oivil, who learns in yet
another year what he knew all along: Livio's photon was left-handed.
Oivil then has a sneaky idea. He secretly changes the apparatus, without telling
his forlorn sister. Now the ensemble is
$$\hat{\rho} = \frac{1}{2} \left( |LL\rangle + |RR\rangle \right) \left( \langle LL| + \langle RR| \right)$$
He causes another pair of photons to be emitted with this new apparatus and
repeats the experiment. The result is identical to the first experiment.
(a) Was Oivil lucky, or will he get the right answer every time, for each ap-
paratus? Demonstrate the answer explicitly using the density matrix for-
malism.
(b) What is the probability that Livio will observe a left-handed photon, or a
right-handed photon, for each apparatus? Is there a problem with causal-
ity here? How can Oivil know what Livio is going to see, long before
she sees it? Discuss what is happening here. Feel free to modify the
experiment to illustrate any points you wish to make.
that the qubit was in state |i. Determine the optimal measurement basis
given this procedure. You can take and to be real numbers, in which case
2 2
the normalization || + || = 1 implies that you can write and as, e.g.
= sin and = cos . HINT: you will need to first construct the probability
of a correct measurement in this situation. You should convince yourselves that
this is given by
P r[qubit was |0i]P r[|i |qubit was |0i]+P r[qubit was |i]P r[ |qubit was |i]
where, e.g.
P r[|i |qubit was |0i] = | h | 0i |2
If the state you are presented with is either |i or |i with 50% probability each,
what is the probability that your measurement correctly identifies the state?
15.6.9 To be entangled....
Let $H_A = \mathrm{span}\{|0_A\rangle, |1_A\rangle\}$ and $H_B = \mathrm{span}\{|0_B\rangle, |1_B\rangle\}$ be two-dimensional
Hilbert spaces and let $|\Psi_{AB}\rangle$ be a factorizable state in the joint space $H_A \otimes H_B$.
Specify necessary and sufficient conditions on $|\Psi_{AB}\rangle$ such that $U_{AB}|\Psi_{AB}\rangle$ is an
entangled state where
respectively. Suppose that a joint state of these systems has been prepared as
the (three-way) entangled state
$|\Psi_{ABC}\rangle = \frac{1}{\sqrt{2}}\left(|0_A 0_B 0_C\rangle + |1_A 1_B 1_C\rangle\right)$
(a) What is the reduced density operator on $H_A \otimes H_B$ if we take a partial
trace over $H_C$?
(b) Suppose Charlie performs a measurement specified by the partial projectors
$1_A \otimes 1_B \otimes |0_C\rangle\langle 0_C|$ and $1_A \otimes 1_B \otimes |1_C\rangle\langle 1_C|$. Compute the probabilities
of the possible outcomes, as well as the corresponding post-measurement
states. Show that this ensemble is consistent with your answer to part (a).
(c) Suppose Charlie performs a measurement specified by the partial projectors
$1_A \otimes 1_B \otimes |x_C\rangle\langle x_C|$ and $1_A \otimes 1_B \otimes |y_C\rangle\langle y_C|$, where
$|x_C\rangle = \frac{1}{\sqrt{2}}\left(|0_C\rangle + |1_C\rangle\right)$ , $|y_C\rangle = \frac{1}{\sqrt{2}}\left(|0_C\rangle - |1_C\rangle\right)$
Again, compute the probabilities of the possible outcomes and the corresponding
post-measurement states and show that this ensemble is consistent
with your answer from part (a).
(d) Suppose Alice and Bob know that Charlie has performed one of the two
measurements from parts (b) and (c), but they do not know which (as-
sume equal probabilities) measurement he performed nor do they know
the outcome. Write down the quantum ensemble that you think Alice
and Bob should use to describe the post-measurement state on $H_A \otimes H_B$.
Is this consistent with the reduced density operator from part (a)? How
should Alice and Bob change their description of the post-measurement
state if Charlie subsequently tells them which measurement he performed
and what the outcome was?
Chapter 16
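The corresponding eigenvectors are
$|e:+\phi\rangle = \cos\frac{\phi}{2}\,|e:+\rangle + \sin\frac{\phi}{2}\,|e:-\rangle$ (16.2)
$|e:-\phi\rangle = -\sin\frac{\phi}{2}\,|e:+\rangle + \cos\frac{\phi}{2}\,|e:-\rangle$ (16.3)
If the electron is emitted in the state $|e:+\phi\rangle$, the probability $P_+(\alpha)$ of finding
the electron in the state $|e:+\alpha\rangle$ is given by
$P_+(\alpha) = |\langle e:+\alpha | e:+\phi\rangle|^2$
$= \left|\left(\cos\frac{\alpha}{2}\langle e:+| + \sin\frac{\alpha}{2}\langle e:-|\right)\left(\cos\frac{\phi}{2}|e:+\rangle + \sin\frac{\phi}{2}|e:-\rangle\right)\right|^2$
$= \left|\cos\frac{\alpha}{2}\cos\frac{\phi}{2} + \sin\frac{\alpha}{2}\sin\frac{\phi}{2}\right|^2 = \cos^2\frac{\phi-\alpha}{2}$ (16.4)
and similarly,
$P_-(\alpha) = |\langle e:-\alpha | e:+\phi\rangle|^2$
$= \left|\left(-\sin\frac{\alpha}{2}\langle e:+| + \cos\frac{\alpha}{2}\langle e:-|\right)\left(\cos\frac{\phi}{2}|e:+\rangle + \sin\frac{\phi}{2}|e:-\rangle\right)\right|^2$
$= \left|-\sin\frac{\alpha}{2}\cos\frac{\phi}{2} + \cos\frac{\alpha}{2}\sin\frac{\phi}{2}\right|^2 = \sin^2\frac{\phi-\alpha}{2}$ (16.5)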
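The two probabilities (16.4) and (16.5) can be checked numerically. The following is only a minimal sketch: the function names and the sample angles are illustrative assumptions, and the states are represented by their real components in the basis $\{|e:+\rangle, |e:-\rangle\}$ as in (16.2) and (16.3).

```python
import math

def plus_state(angle):
    """Components of |e:+angle> in the {|e:+>, |e:->} basis, as in (16.2)."""
    return (math.cos(angle / 2), math.sin(angle / 2))

def minus_state(angle):
    """Components of |e:-angle> in the same basis, as in (16.3)."""
    return (-math.sin(angle / 2), math.cos(angle / 2))

def overlap_probability(bra, ket):
    """|<bra|ket>|^2 for states with real components."""
    amplitude = sum(b * k for b, k in zip(bra, ket))
    return amplitude ** 2

# Illustrative angles (assumed values, not taken from the text).
phi, alpha = 0.7, 1.9

p_plus = overlap_probability(plus_state(alpha), plus_state(phi))
p_minus = overlap_probability(minus_state(alpha), plus_state(phi))

print(p_plus, math.cos((phi - alpha) / 2) ** 2)
print(p_minus, math.sin((phi - alpha) / 2) ** 2)
```

The two printed pairs should agree, and the two probabilities add up to one, as they must for a two-outcome measurement.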
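Therefore,
$P_+(\alpha) = \langle p:-\phi| \otimes \langle e:+\phi| \left(|e:+\alpha\rangle\langle e:+\alpha| \otimes I_p\right) |e:+\phi\rangle \otimes |p:-\phi\rangle$
$= \langle p:-\phi| I_p |p:-\phi\rangle \,\langle e:+\phi | e:+\alpha\rangle \langle e:+\alpha | e:+\phi\rangle$
$= |\langle e:+\alpha | e:+\phi\rangle|^2 = \cos^2\frac{\alpha-\phi}{2}$ (16.7)
and the state after the measurement is $|e:+\alpha\rangle \otimes |p:-\phi\rangle$. The proton spin is
not affected, because the initial state is factorized (and all probability laws are
factorized).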
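Now
$S_{e\alpha}^2 = \frac{\hbar^2}{4} I_e$ , $S_{p\beta}^2 = \frac{\hbar^2}{4} I_p$ (16.10)
and
$\langle S_{e\alpha} S_{p\beta}\rangle = \langle e:+\phi| S_{e\alpha} |e:+\phi\rangle \langle p:-\phi| S_{p\beta} |p:-\phi\rangle$
$= -\frac{\hbar^2}{4}\cos(\alpha-\phi)\cos(\beta-\phi)$ (16.11)
Thus,
$E(\alpha,\beta) = \frac{-\frac{\hbar^2}{4}\cos(\alpha-\phi)\cos(\beta-\phi) + \frac{\hbar^2}{4}\cos(\alpha-\phi)\cos(\beta-\phi)}{\hbar^2/4} = 0$
This just reflects the fact that in a factorized state, the two spin variables are
independent.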
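If we measure the component $S_{e\alpha}$ of the electron spin along the direction $u_\alpha =
\cos\alpha\, u_z + \sin\alpha\, u_x$, we find the following results and corresponding probabilities:
there are two possible values
$+\frac{\hbar}{2}$ , projector $|e:+\alpha\rangle\langle e:+\alpha| \otimes I_p$
$-\frac{\hbar}{2}$ , projector $|e:-\alpha\rangle\langle e:-\alpha| \otimes I_p$
with probabilities
$P_+(\alpha) = \frac{1}{2}\left(|\langle e:+\alpha | e:+\rangle|^2 + |\langle e:+\alpha | e:-\rangle|^2\right) = \frac{1}{2}$
$P_-(\alpha) = \frac{1}{2}\left(|\langle e:-\alpha | e:+\rangle|^2 + |\langle e:-\alpha | e:-\rangle|^2\right) = \frac{1}{2}$
This result is a consequence of the rotational invariance of the singlet state.
Now suppose the result of this measurement is $+\hbar/2$ and then later on, one
measures the component $S_{p\beta}$ of the proton spin along the direction $u_\beta =
\cos\beta\, u_z + \sin\beta\, u_x$.
Since the electron spin is measured to be $+\hbar/2$, the state after that measurement
is
$\left(|e:+\alpha\rangle\langle e:+\alpha| \otimes I_p\right)|\Psi_s\rangle$
$= \frac{1}{\sqrt{2}}\left(\langle e:+\alpha | e:+\rangle\, |e:+\alpha\rangle \otimes |p:-\rangle - \langle e:+\alpha | e:-\rangle\, |e:+\alpha\rangle \otimes |p:+\rangle\right)$
which, once normalized, is
$\cos\frac{\alpha}{2}\,|e:+\alpha\rangle \otimes |p:-\rangle - \sin\frac{\alpha}{2}\,|e:+\alpha\rangle \otimes |p:+\rangle$ (16.13)
The probabilities for the two possible results of measurement of the proton spin,
$\pm\hbar/2$, are
$P_+(\beta) = \sin^2\frac{\alpha-\beta}{2}$ , $P_-(\beta) = \cos^2\frac{\alpha-\beta}{2}$ (16.14)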
What would have happened if we had measured the proton spin first?
The fact that the measurement on the electron affects the probabilities of the
results of a measurement on the proton, although the two particles are spatially
separated, is in contradiction to Einstein's assertion or belief that the real states
of two spatially separated objects must be independent of one another. This is
the starting point of the EPR paradox. Quantum mechanics is not a local theory
as far as measurements are concerned.
Note, however, that this non-locality does not allow the instantaneous transmission
of information. From a measurement of the proton spin, one cannot
determine whether the electron spin has been previously measured. It is only
when, for a series of experiments, the results of the measurements on the electron
and the proton are later compared, that one can find this non-local character
of quantum mechanics.
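We now recalculate the expectation values $\langle S_{e\alpha}\rangle$ and $\langle S_{p\beta}\rangle$ in the singlet state.
We get (using the same process as above) $\langle S_{e\alpha}\rangle = 0 = \langle S_{p\beta}\rangle$. This is so because
one does not worry about the other variable.
Finally, we can calculate the correlation coefficient in the singlet state. We have,
since the spins are correlated now, that
$\langle S_{e\alpha} S_{p\beta}\rangle = \frac{\hbar^2}{4}\left(\sin^2\frac{\alpha-\beta}{2} - \cos^2\frac{\alpha-\beta}{2}\right) = -\frac{\hbar^2}{4}\cos(\alpha-\beta)$ (16.16)
and therefore
$E(\alpha,\beta) = \frac{-\frac{\hbar^2}{4}\cos(\alpha-\beta) + 0}{\hbar^2/4} = -\cos(\alpha-\beta)$ (16.17)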
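The result (16.17) can be reproduced by brute force. The sketch below is only an illustration under stated assumptions: it takes the singlet to be $(|e:+\rangle|p:-\rangle - |e:-\rangle|p:+\rangle)/\sqrt{2}$, works in units where $\hbar/2 = 1$ (so the denominator of $E$ equals 1), and uses hypothetical helper names.

```python
import math

def sigma(angle):
    """Spin component along cos(angle) u_z + sin(angle) u_x, in units of hbar/2."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s], [s, -c]]

def kron(a, b):
    """Kronecker product of two 2x2 matrices (electron factor first)."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def expectation(op, state):
    """<state|op|state> for a real 4-component state vector."""
    return sum(state[i] * op[i][j] * state[j]
               for i in range(4) for j in range(4))

# Basis order: |e:+,p:+>, |e:+,p:->, |e:-,p:+>, |e:-,p:->  (assumed convention)
SINGLET = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def singlet_correlation(alpha, beta):
    """E(alpha, beta) = <Se Sp> - <Se><Sp>; the normalization is 1 in these units."""
    se_sp = expectation(kron(sigma(alpha), sigma(beta)), SINGLET)
    se = expectation(kron(sigma(alpha), IDENTITY), SINGLET)
    sp = expectation(kron(IDENTITY, sigma(beta)), SINGLET)
    return se_sp - se * sp

alpha, beta = 0.4, 1.3  # illustrative angles
print(singlet_correlation(alpha, beta), -math.cos(alpha - beta))
```

The two printed numbers should coincide for any choice of angles, and the single-spin averages vanish as stated above.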
In the case of interest, a very simplified example of such a theory is the following.
We assume that, after each dissociation, the system is in a factorized state
$|e:+\phi\rangle \otimes |p:-\phi\rangle$, but that the direction $\phi$ varies from one event to another. In
this case $\phi$ is the hidden variable. We assume that all directions of $\phi$ are equally
probable, i.e., the probability density that the decay occurs with direction $\phi$ is
uniform and equal to $1/2\pi$.
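$\langle A\rangle$ is now defined to be
$\langle A\rangle = \frac{1}{2\pi}\int_0^{2\pi} \langle e:+\phi| \otimes \langle p:-\phi|\, A\, |e:+\phi\rangle \otimes |p:-\phi\rangle \, d\phi$ (16.18)
Let us now use this new definition of the expectation value to investigate the
correlation coefficient. We have from earlier
$E(\alpha,\beta) = \frac{\langle S_{e\alpha} S_{p\beta}\rangle - \langle S_{e\alpha}\rangle\langle S_{p\beta}\rangle}{\left(\langle S_{e\alpha}^2\rangle\langle S_{p\beta}^2\rangle\right)^{1/2}}$
Averaging over $\phi$ gives $\langle S_{e\alpha}\rangle = \langle S_{p\beta}\rangle = 0$, while
$\langle S_{e\alpha} S_{p\beta}\rangle = -\frac{\hbar^2}{4}\int \cos(\alpha-\phi)\cos(\beta-\phi)\,\frac{d\phi}{2\pi} = -\frac{\hbar^2}{8}\cos(\alpha-\beta)$ (16.20)
Therefore, in this simple hidden variable model
$E(\alpha,\beta) = -\frac{1}{2}\cos(\alpha-\beta)$ (16.21)
In such a model, one finds a non-vanishing correlation coefficient, which is an
interesting observation. Even more interesting is that the correlation is smaller
than the prediction of quantum mechanics by a factor of 2.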
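The average over the hidden direction can also be carried out numerically. This is a minimal sketch, assuming units where $\hbar/2 = 1$ and a simple midpoint rule over one period; the function names are illustrative only.

```python
import math

def average_over_phi(f, steps=2000):
    """Uniform average of f(phi) over [0, 2*pi) using the midpoint rule."""
    dphi = 2 * math.pi / steps
    total = sum(f((k + 0.5) * dphi) for k in range(steps))
    return total * dphi / (2 * math.pi)

def hidden_variable_correlation(alpha, beta):
    """E(alpha, beta) for the factorized state |e:+phi>|p:-phi>, phi uniform."""
    # In units of hbar/2: <Se> = cos(alpha - phi), <Sp> = -cos(beta - phi).
    se_sp = average_over_phi(lambda p: -math.cos(alpha - p) * math.cos(beta - p))
    se = average_over_phi(lambda p: math.cos(alpha - p))
    sp = average_over_phi(lambda p: -math.cos(beta - p))
    return se_sp - se * sp  # the normalization factor is 1 in these units

alpha, beta = 0.4, 1.3  # illustrative angles
print(hidden_variable_correlation(alpha, beta), -0.5 * math.cos(alpha - beta))
```

The numerical average should reproduce (16.21), i.e. half of the quantum-mechanical singlet value for the same pair of angles.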
The first precise experimental tests of hidden variable descriptions versus quantum
mechanics have been performed on correlated pairs of photons emitted in
an atomic cascade. Although we are not dealing with spin-1/2 particles in this
case (see discussion later in this chapter), the physical content is basically the
same as in this case. As an example, Figure 16.1 below presents the experimental
results of Aspect et al.
It gives the variation of $E(\alpha,\beta)$ as a function of the difference $\alpha - \beta$, which is
found experimentally to be the only relevant quantity, i.e., the results do not
depend in any way on $\alpha$ or $\beta$ separately! The circles indicate the size of
experimental errors.
The experimental points agree with the predictions of quantum mechanics and
clearly disagree, therefore, with the predictions of this particular hidden vari-
ables theory.
We can, however, show that the correlation results for hidden variable theories
are constrained by what is known as Bell's inequality, which is violated by quantum
mechanics.
Consider a hidden variable theory whose results consist of two functions $A(\lambda, u_\alpha)$
and $B(\lambda, u_\beta)$ giving respectively the results of the electron and proton spin
measurements. Each of these two functions takes only two values, $+\hbar/2$ and $-\hbar/2$.
Each depends on the value of the hidden variable $\lambda$ for the considered
electron-proton pair. The nature of the hidden variable need not be further specified
for this discussion. The result A of course depends on the axis $u_\alpha$ chosen for
the measurement of the electron spin, but it does not depend on the axis $u_\beta$.
Similarly B does not depend on $u_\alpha$. This locality hypothesis is essential for the
following discussion.
Note that we assume here that the hidden variable theory reproduces the
one-operator averages found for the singlet state:
$\langle S_{e\alpha}\rangle = \int P(\lambda)\, A(\lambda, u_\alpha)\, d\lambda = 0$ (16.22)
$\langle S_{p\beta}\rangle = \int P(\lambda)\, B(\lambda, u_\beta)\, d\lambda = 0$ (16.23)
If this were not the case, such a hidden variable theory should clearly be rejected
since it would not reproduce a well-established experimental result.
Now the two quantities $B(\lambda, u_\beta)$ and $B(\lambda, u_{\beta'})$ can take on only two values $\pm\hbar/2$.
Therefore, one has either
or
$B(\lambda, u_\beta) + B(\lambda, u_{\beta'}) = 0$ , $B(\lambda, u_{\beta'}) - B(\lambda, u_\beta) = \pm\hbar$ (16.27)
Therefore, since $|A(\lambda, u_\alpha)| = |A(\lambda, u_{\alpha'})| = \hbar/2$, we have the result
and we get
Figure 16.2: Bell Function
According to these theories, there exist hidden variables whose values prescribe
the values of all observables for any particular object, except that the hidden
variables are unknown to the experimenter, thus yielding the probabilistic char-
acter of the theory. The probabilistic character of quantum mechanics would
then be quite analogous to that of classical statistical mechanics, where one can
imagine that the motion of all the particles is, in principle, known.
An argument - sometimes referred to as a paradox - due to Einstein, Podolsky
and Rosen (EPR), played a pivotal role in the discussion of indeterminism and
the existence of hidden variables; we consider the argument as reformulated by
David Bohm.
|i |i (16.34)
and the fact that one can linearly superpose such states.
Einstein, Podolsky and Rosen gave the following argument in favor of hidden
parameters in conjunction with the EPR thought experiment.
In the EPR argument, the predictions of the quantum states
$|0, 0\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle|\downarrow\rangle - |\downarrow\rangle|\uparrow\rangle\right)$
In the remainder of our discussion, we will consider local hidden variables. These
would predetermine which value each of the components of $\vec{S}$ of particle 1 has
and likewise for particle 2. Each of the particles would carry this information
independently of the other.
Polarizer 1 with angular setting $\alpha$ only lets particle 1 through if its spin in the
direction $\hat{n}_\alpha$ has the value $+\hbar/2$ and polarizer 2 with angular setting $\beta$ only
lets particle 2 through if its spin in the direction $\hat{n}_\beta$ has the value $+\hbar/2$. The
particles are counted by detectors 1 and 2. If they respond, then the spin is
positive, otherwise it is negative.
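Using the spin projection operator
$P_{\hat{n}} = \frac{1}{2}\left(1 + \vec{\sigma}\cdot\hat{n}\right)$ (16.35)
quantum mechanics gives
$N(\alpha;\beta) = \langle 0,0|\, \frac{1}{2}(1 + \vec{\sigma}_1\cdot\hat{n}_\alpha)\, \frac{1}{2}(1 + \vec{\sigma}_2\cdot\hat{n}_\beta)\, |0,0\rangle$
$= \langle 0,0|\, \frac{1}{2}(1 + \vec{\sigma}_1\cdot\hat{n}_\alpha)\, \frac{1}{2}(1 - \vec{\sigma}_1\cdot\hat{n}_\beta)\, |0,0\rangle$
$= \frac{1}{4}\left(1 - \hat{n}_\alpha\cdot\hat{n}_\beta\right)$ (16.36)
since $\langle 0,0|\, \vec{\sigma}_1\, |0,0\rangle = 0$ in the singlet state. For coplanar detectors this reduces
to
$N(\alpha;\beta) = \frac{1}{2}\sin^2\frac{\beta-\alpha}{2}$ (16.37)
If hidden variables were really present, we could represent $N(\alpha;\beta)$ by the following
sum
$N(\alpha;\beta) = N(\alpha\gamma;\beta) + N(\alpha;\gamma\beta)$ (16.38)
Here, $N(\alpha\gamma;\beta)$ is the relative number of particle pairs in which particle 1 has
positive spin at angles $\alpha$ and $\gamma$ and negative spin at $\beta$, while $N(\alpha;\gamma\beta)$ is the
relative number of particle pairs in which particle 1 has negative spin at $\gamma$
instead. In theories with hidden variables, all of these quantities are assumed
to be known.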
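Remarks
1. In experiments one often works with the correlation defined by
$P(\alpha;\beta) = \langle 0,0|\, (\vec{\sigma}_1\cdot\hat{n}_\alpha)(\vec{\sigma}_2\cdot\hat{n}_\beta)\, |0,0\rangle = 4N(\alpha;\beta) - 1$ (16.40)
instead of $N(\alpha;\beta)$ itself. Using
$N(\alpha;\beta) = \frac{1}{2}\sin^2\frac{\beta-\alpha}{2}$ (16.41)
we get
$P(\alpha-\beta) \equiv P(\alpha;\beta) = -\cos(\alpha-\beta)$ (16.42)
and Bell's inequality becomes
$P(\alpha;\beta) - 1 \le P(\alpha;\gamma) + P(\gamma;\beta)$ (16.43)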
2. The limit prescribed by the Bell inequality can be determined as follows.
In
$N(\alpha;\beta) \le N(\alpha;\gamma) + N(\gamma;\beta)$ (16.44)
we substitute for , , the values 0, , /2 respectively to obtain
Finally, we contrast the consequences of the Bell inequality with quantum me-
chanics and compare with experiments.
The comparison of quantum mechanics and the Bell inequality is shown in Figure
16.4 below, which gives the correlation $P(\theta) = P(\theta; 0)$ according to quantum
mechanics and the Bell inequality.
Figure 16.4: Bell-EPR Experiment Theoretical Predictions
Clearly, quantum mechanics is correct. This means that any theory that has
the same probabilistic predictions as quantum mechanics must be nonlocal.
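The conflict between the quantum prediction (16.37) and the Bell inequality in the form (16.44) is easy to exhibit numerically. The sketch below is only illustrative: the function names are hypothetical, and the coplanar angles $0$, $\pi/3$, $2\pi/3$ are an assumed choice made to display a violation; they are not taken from the text.

```python
import math

def n_qm(a, b):
    """Quantum prediction (16.37) for coplanar polarizer settings a and b."""
    return 0.5 * math.sin((b - a) / 2) ** 2

def p_qm(a, b):
    """Correlation P = 4N - 1, which should equal -cos(a - b)."""
    return 4 * n_qm(a, b) - 1

def bell_satisfied(a, b, c, tol=1e-12):
    """True if N(a;b) <= N(a;c) + N(c;b) holds for the quantum N."""
    return n_qm(a, b) <= n_qm(a, c) + n_qm(c, b) + tol

# Assumed illustrative settings: alpha = 0, gamma = pi/3, beta = 2*pi/3.
alpha, gamma, beta = 0.0, math.pi / 3, 2 * math.pi / 3

print(n_qm(alpha, beta), n_qm(alpha, gamma) + n_qm(gamma, beta))
print(bell_satisfied(alpha, beta, gamma))
```

For these settings the left-hand side is $3/8$ while the right-hand side is $1/4$, so the quantum prediction lies outside the region allowed by local hidden variables.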
16.4.1 Single-Photon Interference
All good discussions on quantum mechanics present a long and interesting analysis
of the double slit experiment. The crux of the discussion comes when the
light intensity is reduced sufficiently for photons to be considered as presenting
themselves at the entry slit one by one. For a long time this point was very con-
tentious, because correlations between two successive photons cannot be ruled
out a priori.
Since 1985, however, the situation has changed. An experiment was performed
by Grangier, Roger and Aspect. It was an interference experiment with only a
single photon. They used a light source devised for an EPR experiment which
guarantees that photons arrive at the entry slit singly.
The light source is a beam of calcium atoms, excited by two focused laser beams
having wavelengths $\lambda' = 406$ nm and $\lambda'' = 581$ nm respectively. Two-photon
excitation produces a state having the quantum number J = 0. When it
decays, this state emits two monochromatic photons having the wavelengths
$\lambda_1 = 551.3$ nm and $\lambda_2 = 422.7$ nm respectively, in a cascade of two electronic
transitions from the initial J = 0 level to the final J = 0 state, passing through
an intermediate J = 1 state, as shown in Figure 16.6 below.
The mean lifetime of the intermediate state is 4.7 ns. To simplify the terminology,
we shall call the $\lambda_1 = 551.3$ nm light green, and the $\lambda_2 = 422.7$ nm light
violet.
Next we describe the experiment, exhibiting its three stages which reveal the
complications of the apparatus in progressively greater detail (Figures 16.7-16.9
below).
1. The first stage is a trivial check that the apparatus is working properly;
nevertheless it is already very instructive (Figure 16.7 below).
Figure 16.7 shows interference with a single photon (first stage). In the
sketch, solid lines are optical paths and dashed lines are electrical connec-
tions.
The purpose of this arrangement is to use a green photon in order to open
a 9 ns time window, in which to detect a violet photon emitted by the
same atom. As we shall see, there is only an extremely small probability
of detecting through the same window another violet photon emitted by
a different atom.
We will assume that a second observer is in the lab. This observer always
feels compelled to present what he thinks are simple-minded truths using
ordinary words. We will call this second observer Albert. Albert, as we
shall see, has a tendency to use, one after another, the three phrases "I
observe", "I conclude", and "I envisage". Consulted about the above experiment,
Albert states, with much confidence,
I observe that the photomultiplier $PM_A$ detects violet light when the source
S is on, and that it ceases to detect anything when the source is off. I
conclude that the violet light is emitted by S, and that it travelled from S to
$PM_A$.
I observe that energy is transferred between the light and the photomultiplier
$PM_A$ always in the same amount, which I will call a quantum.
Across the path of the violet light one places a half-silvered mirror LS,
which splits the primary beam into two secondary beams (equal intensity),
one transmitted and detected by $PM_A$, the other reflected and detected
by $PM_B$. As in the first stage, the gate is opened for 9 ns, by $PM_O$. While
it is open, one registers detection by either $PM_A$ (counted as $N_A$) or by
$PM_B$ (counted as $N_B$) or by both, which we call a coincidence (counted
as $N_C$). The experiment runs for 5 hours again and yields the following
results:
(a) The counts $N_A$ and $N_B$ are both of the order of $10^5$. By contrast,
$N_C$ is much smaller, being equal to 9.
(b) The sequence of counts from $PM_A$ is random in time, as is the sequence
of counts from $PM_B$.
(c) The very low value of $N_C$ shows that counts in $PM_A$ and $PM_B$ are
mutually exclusive (do not occur at the same time).
The experimenters analyze the value of NC in depth; their reasoning can
be outlined as follows:
(a) Suppose two different atoms each emit a violet photon, one being
transmitted to $PM_A$ and the other reflected to $PM_B$ with both arriving
during the 9 ns opening of the gate; then the circuitry records
a coincidence. In the regime under study, and for a run of 5 hours,
quantum theory predicts that the number of coincidences should be
$N_C = 9$. The fact that this number is so small means that, in practice,
any given single photon is either transmitted or reflected.
(b) If light is considered as a wave, split into two by LS and condensed
into quanta on reaching $PM_A$ and $PM_B$, then one would expect the
photon counts to be correlated in time, which would entail $N_C \gg 9$.
Classically speaking this would mean that we cannot have a transmitted
wave without a reflected wave.
(c) Experiment yields $N_C = 9$; this quantum result differs from the
classical value by 13 standard deviations; hence the discrepancy is
very firmly established, and allows us to assert that we are indeed
dealing with a source of individual photons.
Albert leaves such logical thinking to professionals. Once he notes that
$N_C$ is very small, he is quite prepared to treat it as if it were zero. He
therefore says: I observe that light travels from the source to $PM_A$ or to
$PM_B$, because detection ceases when the source is switched off.
I observe that the optical paths 1 and 2 are distinguishable, because the
experiment allows me to ascertain, for each quantum, whether it has travelled
path 1 (detection by $PM_A$) or path 2 (detection by $PM_B$).
We can easily show that with these rules, there is a 50-50 chance
of either of the detectors shown in the first figure above to click.
According to the rules given in the figure
$\psi_{1,out} = \frac{1}{\sqrt{2}}\psi_{in}$ , $\psi_{2,out} = \frac{1}{\sqrt{2}}\psi_{in}$ (16.51)
since nothing enters port #2.
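(1) Beamsplitter #1
$\psi_{1,out} = \frac{1}{\sqrt{2}}\psi_{in}$ , $\psi_{2,out} = \frac{1}{\sqrt{2}}\psi_{in}$
(2) Propagation a distance L/2 along each path means that the phase
of the wavefunction changes by $e^{ikL/2}$ so that the wavefunctions
are
$\psi_{1,at\,mirror} = \frac{1}{\sqrt{2}}e^{ikL/2}\psi_{in}$ , $\psi_{2,at\,mirror} = \frac{1}{\sqrt{2}}e^{ikL/2}\psi_{in}$
(3) After reflection at the mirrors, the wavefunctions are
$\psi_{1,after\,mirror} = \frac{1}{\sqrt{2}}e^{ikL/2}\psi_{in}$ (16.52)
$\psi_{2,after\,mirror} = \frac{1}{\sqrt{2}}e^{ikL/2}\psi_{in}$
(4) Propagation a distance L/2 along each path means that the phase
of the wavefunction changes by $e^{ikL/2}$ so that the wavefunctions
are
$\psi_{1,at\,b2} = \frac{1}{\sqrt{2}}e^{ikL}\psi_{in}$ , $\psi_{2,at\,b2} = \frac{1}{\sqrt{2}}e^{ikL}\psi_{in}$
(5) Beamsplitter #2
$\psi_{out,1} = \frac{\psi_{1,at\,b2} + \psi_{2,at\,b2}}{\sqrt{2}} = e^{ikL}\psi_{in}$
$\psi_{out,2} = \frac{\psi_{1,at\,b2} - \psi_{2,at\,b2}}{\sqrt{2}} = 0$
Therefore,
$P_{1,out} = \int |\psi_{out,1}|^2\, dx = \int |\psi_{in}|^2\, dx = 1$
$P_{2,out} = \int |\psi_{out,2}|^2\, dx = 0$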
Now we know that there is a 50-50 probability for the photon to take
the blue or green path which implies that
Also with the particle incident at b2 along the green path there is
a 50% chance of transmission and similarly for reflection of the blue
path.
Therefore,
and
$P_{D1} = \frac{1}{2}\cdot\frac{1}{2} + \frac{1}{2}\cdot\frac{1}{2} = \frac{1}{2}$
2 2 2 2 2
so that classical reasoning implies a 50-50 chance of D1 firing, that
is, it is completely random!
The quantum case is different because the two paths which lead to
detector D1 interfere. For the two paths leading to D1 we have
$\psi_{total} = \frac{\frac{1}{\sqrt{2}}e^{ikL}\psi_{in} + \frac{1}{\sqrt{2}}e^{ikL}\psi_{in}}{\sqrt{2}}$
$P_{D1} = \int |\psi_{total}|^2\, dx = \frac{1}{2}\cdot\frac{1}{2} + \frac{1}{2}\cdot\frac{1}{2} + \frac{1}{2}\cdot\frac{1}{2} + \frac{1}{2}\cdot\frac{1}{2} = 1$
where the last two terms are the so-called interference terms. Thus,
$P_{D1} = 1$. The paths that lead to detector D2 destructively interfere
so that $P_{D2} = 0$.
We now ask how would you set up the interferometer so that detector
D2 clicked with 100% probability? How about making them click at
random?
Leave the basic geometry the same, that is, do not change the direc-
tion of the beam splitters or the direction of the incident light.
We can achieve this by changing the relative phase of the two paths
by moving the mirror so that the path lengths are not the same.
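We then have
$\psi_{out,1} = \frac{\frac{1}{\sqrt{2}}e^{ik(L+\Delta L)} + \frac{1}{\sqrt{2}}e^{ikL}}{\sqrt{2}}\,\psi_{in}$
$\psi_{out,2} = \frac{\frac{1}{\sqrt{2}}e^{ik(L+\Delta L)} - \frac{1}{\sqrt{2}}e^{ikL}}{\sqrt{2}}\,\psi_{in}$
and
$P_{D1} = \int dx\, |\psi_{out,1}|^2 = \frac{1}{4}\int dx\, |\psi_{in}|^2\, |e^{ikL}|^2\, |e^{ik\Delta L} + 1|^2$
$= \frac{1}{4}\left(e^{-ik\Delta L} + 1\right)\left(e^{ik\Delta L} + 1\right)$
$= \frac{1}{4}\left(2 + e^{ik\Delta L} + e^{-ik\Delta L}\right) = \frac{1 + \cos(k\Delta L)}{2} = \cos^2\frac{k\Delta L}{2}$
Similarly we have
$P_{D2} = \int dx\, |\psi_{out,2}|^2 = \frac{1}{4}\int dx\, |\psi_{in}|^2\, |e^{ikL}|^2\, |e^{ik\Delta L} - 1|^2$
$= \frac{1}{4}\left(e^{-ik\Delta L} - 1\right)\left(e^{ik\Delta L} - 1\right)$
$= \frac{1}{4}\left(2 - e^{ik\Delta L} - e^{-ik\Delta L}\right) = \frac{1 - \cos(k\Delta L)}{2} = \sin^2\frac{k\Delta L}{2}$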
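The whole interferometer calculation can be mirrored in a few lines of code. This is a sketch under assumptions: the beam splitter is modeled by the sum/difference rule used in the steps above (sum to output 1, difference to output 2, each divided by $\sqrt{2}$), the input wavefunction is normalized to 1, and the function names and numerical values are hypothetical.

```python
import cmath
import math

def beam_splitter(a1, a2):
    """50/50 beam splitter: sum goes to output 1, difference to output 2."""
    s = 1 / math.sqrt(2)
    return s * (a1 + a2), s * (a1 - a2)

def mach_zehnder(k, length, delta=0.0):
    """Return (P_D1, P_D2) for path lengths length + delta and length."""
    path1, path2 = beam_splitter(1.0, 0.0)        # nothing enters port #2
    path1 *= cmath.exp(1j * k * (length + delta))  # longer arm
    path2 *= cmath.exp(1j * k * length)            # reference arm
    out1, out2 = beam_splitter(path1, path2)
    return abs(out1) ** 2, abs(out2) ** 2

k, length = 2.0, 5.0                      # illustrative values
print(mach_zehnder(k, length))            # equal paths: D1 always fires
print(mach_zehnder(k, length, math.pi / k))        # k*delta = pi: D2 always fires
print(mach_zehnder(k, length, math.pi / (2 * k)))  # k*delta = pi/2: 50-50
```

Choosing $k\Delta L = \pi$ makes D2 click with certainty, and $k\Delta L = \pi/2$ makes the two detectors click at random with equal probability, which answers the two questions posed above without changing the geometry.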
Figure 16.12: Mach-Zehnder Interferometer Inserted
A sweep, taking 15 sec for each standard step, yields two interference plots
corresponding, respectively, to the paths (1,2) and (1,2); the fringes
have good contrast (difference in intensity between maxima and minima),
and their visibility
was measured as 98% as shown in Figure 16.13 below which gives the two
interference plots obtained with the Mach-Zehnder interferometer. Note
that the maximum counting rates in $PM_A$ correspond to minima in $PM_B$,
indicating a relative displacement of $\lambda/2$ between the two interference
patterns.
Figure 16.13: Interference Results
If we recall that we are reasoning in terms of photons, and that the photons
are being processed individually, then we must admit that the interference
does not stem from any interaction between successive photons, but that
each photon interferes with itself.
What would Albert have to say? He seems exasperated but is still polite.
His statements are brief:
I observe that the optical paths differ in length between LS and LS , and
are then coincident over (1,2) and over (1,2).
In the present situation I want to know for each individual photon which
path it has travelled; to this end I should like to ask you to close off path
2, since this will ensure that the photons travel by path 1.
Albert is visibly displeased and now very wary. He then continues with
his analysis of the experiment:
If I were to suppose that photons travel only along 1, then this would imply
that path 2 is irrelevant, which is contrary to what I have observed. Simi-
larly, if I were to suppose that photons travel only along 2, then this would
imply that path 1 is irrelevant, which is also contrary to my observations.
I shall suppose instead that the source emits a wave; this wave splits into
two at LS , and the two secondary waves travel one along path 1 and
the other along path 2. They produce interference by mutual superposition
on LS constructively in (1,2) and destructively in (1,2). At the far
end of (1,2) or of (1,2) I envisage each of the waves condensing into
particles, which are then detected by the photomultipliers (essentially by
$PM_A$, since a contrast of 98% means that only very few photons are detected
by $PM_B$).
visage light as having two complementary forms: depending on the kind
of experiment that is being done, it can manifest itself either as a wave,
or as a particle, but never as both simultaneously and in the same place.
Thus, in the experiment where the path followed by the light cannot be
ascertained (third stage), light behaves first like a wave, producing inter-
ference phenomena; but it behaves like a particle when, afterwards, it is de-
tected through the photoelectric effect. I conclude that light behaves rather
strangely, but nevertheless I have the impression that its behavior can be
fully described once one has come to terms with the idea of wave-particle
duality.
Originally I supposed that light would behave like a wave or like a particle,
depending on the kind of experiment to which it was being subjected.
I know that both waves and particles obey the principle of causality, that
is, that cause precedes effect.
I conclude that light is neither wave nor particle; it behaves neither like
waves on the sea, nor like projectiles fired from a gun, nor like any other
kind of object that I am familiar with.
I must ask you to forget everything I have said about this experiment,
which seems to me to be thoroughly mysterious.
Albert leaves, but quickly returns with a contented smile, and his final
statement is not without a touch of malice. I observe in all cases that the
photomultipliers register quanta when I switch on the light source.
I conclude that something has travelled from the source to the detector.
This something is a quantum object, and I shall continue to call it a
photon, even though I know that it is neither a wave nor a particle.
I observe that the photon gives rise to interference when one cannot as-
certain which path it follows; and that interference disappears when it is
possible to ascertain the path.
For each detector, I observe that the quanta it detects are randomly dis-
tributed in time.
Physicists have indeed worked hard and the much desired theory has indeed
come to light, namely, quantum mechanics, as we have seen in our
discussions. As we have seen, it applies perfectly not only to photons, but
equally well to electrons, protons, neutrons, etc.; in fact, it applies to all
the particles of microscopic physics. For the last 75 years it has worked
to the general satisfaction of physicists.
Meanwhile, it has produced two very interesting problems of a philosophical
nature.
1. Chance as encountered in quantum mechanics lies in the very nature of the
coupling between the quantum object and the experimental apparatus. No
longer is it chance as a matter of ignorance or incompetence: it is chance
quintessential and unavoidable.
2. Quantum objects behave quite differently from the familiar objects of
our everyday experience: whenever, for pedagogical reasons, one allows
an analogy with macroscopic models like waves or particles, one always
fails sooner or later, because the analogy is never more than partially
valid. Accordingly, the first duty of a physicist is to force her grey cells,
that is her concepts and her language, into unreserved compliance with
quantum mechanics (as we have been attempting to do); eventually this
will lead her to view the actual behavior of microsystems as perfectly
normal. As teachers of physics, our duties are if anything more onerous
still, because we must convince the younger generations that quantum
mechanics is not a branch of mathematics, but an expression of our best
present understanding of physics on the smallest scale; and that, like all
physical theories, it is predictive.
In this context, let us review the basic formalism of quantum mechanics.
$\langle f_1 | i\rangle$ , $\langle f_2 | i\rangle$
More generally, we would write
$|\langle f | i\rangle|^2 = \sum_k |\langle f_k | i\rangle|^2$
2. If a photon emitted by the source S can take either of two paths, but it
is impossible to ascertain which path it does take (Figure 16.14(c) above),
then there are again two transition amplitudes:
$\langle$photon arriving at $PM_A$ | photon leaving S$\rangle$ along path 1
$\langle$photon arriving at $PM_B$ | photon leaving S$\rangle$ along path 2
which we symbolize simply as
$\langle f | i\rangle_1$ , $\langle f | i\rangle_2$
To allow for interference, we assert that in this case it is the amplitudes
that must be added; the total amplitude reads
$\langle f | i\rangle = \langle f | i\rangle_1 + \langle f | i\rangle_2$
The total probability is then:
$|\langle f | i\rangle_1 + \langle f | i\rangle_2|^2$
3. If one wants to analyze the propagation of the light more closely, one
can take into account its passage through the half-silvered mirror LS ,
considering this as an intermediate state (Figure 16.14(b) above). The
total amplitude for path 1 is
The four rules just given suffice to calculate the detection probability in any
possible experimental situation. They assume their present form as a result of
a long theoretical evolution; but they are best justified a posteriori, because
in 75 years they have never been found to be wrong. Accordingly, we may
consider them as the basic principles governing the observable behavior of all
microscopic objects, that is, objects whose actions on each other are of order
$h$ (Planck's constant). From these principles (they are equivalent to our earlier
postulates - just look different because we are using the amplitude instead of
the state vector as the fundamental mathematical object in the theory) one can
derive all the requisite formalism, that is, all of quantum mechanics.
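The difference between the rule for distinguishable alternatives (add probabilities) and the rule for indistinguishable paths (add amplitudes) can be made concrete with a short sketch. The amplitudes used here, each of magnitude 1/2 with an adjustable relative phase, are an assumed illustration modeled on two successive 50/50 beam splitters; the function names are hypothetical.

```python
import cmath
import math

def probability_distinguishable(amplitudes):
    """Alternatives that can be told apart: add the probabilities."""
    return sum(abs(a) ** 2 for a in amplitudes)

def probability_indistinguishable(amplitudes):
    """Paths that cannot be told apart: add the amplitudes, then square."""
    return abs(sum(amplitudes)) ** 2

def two_path_amplitudes(phase_difference):
    """Assumed amplitudes <f|i>_1 and <f|i>_2 for two equal-weight paths."""
    return [0.5, 0.5 * cmath.exp(1j * phase_difference)]

for phase in (0.0, math.pi / 2, math.pi):
    amps = two_path_amplitudes(phase)
    print(phase,
          probability_distinguishable(amps),
          probability_indistinguishable(amps))
```

With which-path information available the detection probability stays at 1/2 whatever the phase, whereas adding amplitudes gives $\cos^2(\delta/2)$, which swings between 1 and 0 as the relative phase $\delta$ is varied.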
Quantum mechanics as we have described it earlier and also above, works splen-
didly, like a well-oiled machine. It, and its basic principles, might therefore be
expected to command the assent of every physicist; yet it has evoked, and on
occasion continues to evoke, reservations both explicit and implicit. For this
there are two reasons:
Both features are revolutionary, and it is natural that they should have provoked
debate. On the opposite sides of this debate we find two great physicists, Niels
Bohr and Albert Einstein, and we will now discuss how the debate evolved from
its beginnings in 1927 to its conclusion in 1983 (that is 56 years!).
1. For the physicist who is a realist, a physical theory reflects the behavior
of real objects, whose existence is not brought into question.
2. For the physicist who is a positivist, the purpose of a physical theory is to
describe the relations between measurable quantities. The theory does not
tell one whether anything characterized by these quantities really exists,
nor even whether the question makes sense.
3. For the physicist who is a determinist, exact knowledge of the initial con-
ditions and of the interactions allows the future to be predicted exactly.
Determinism is held to be a universal characteristic of natural phenomena,
even of those about which we know, as yet, little or nothing. In this
framework, any recourse to chance merely reflects our own ignorance.
4. For the physicist who is a probabilist, chance is inherent in the very nature
of microscopic phenomena. To her, determinism is a consequence, on
the macroscopic level, of the laws of chance operating on the microscopic
level; it is appropriate to measurements of mean values of quantities whose
relative fluctuations are very weak.
From these four poles, realism, positivism, determinism, and chance, the physi-
cist chooses two, one on each axis. Though sometimes the choice is made in full
awareness of what it entails, most often it is made subconsciously. In our de-
scription of quantum mechanics, we might adopt without reservations, the point
of view of the elementary particle physicist. For a start, she believes firmly in
the existence of particles, since she spends her time in accelerating, deflecting,
focusing, and detecting them. Even though she has never seen or touched them,
to her their objective existence is not in any doubt. Next she observes that they
impinge on the detectors quite erratically, whence she has no doubts, either,
that their behavior is random. Accordingly, the elementary particle experimen-
talist has chosen realism and chance, most often without realizing that she has
made choices at all.
There are other philosophical options that can be adopted with eyes fully open:
realism and determinism are the choices of Albert Einstein; positivism and
chance are those of Niels Bohr. They are well acquainted and each thinks very
highly of the other: which is no bar to their views being incompatible, nor to
the two men representing opposite poles of the debate.
They imply that it is impossible to define exact initial conditions for a mi-
croscopic object, which automatically makes it impossible to construct, on the
microscopic scale, a deterministic theory patterned on classical mechanics. Only
a probabilistic theory is possible, and that theory is quantum mechanics.
Einstein disagrees with this point of view, and his opposition to Bohr's
theses becomes public at the Brussels conference in 1930: he adopts the role of a
dissenter who knows precisely how to press home the most difficult questions.
Deeply shocked by the retreat from determinism, he tries to show, via his thought
(gedanken) experiments, that he can contravene the Heisenberg inequalities.
For Bohr, a physical theory makes sense only as a set of relations between
observable quantities. Quantum mechanics supplies a correct and complete de-
scription of the behavior of objects at the microscopic level, which means that
the theory itself is likewise complete. The observed behavior is probabilistic,
implying that chance is inherent in the nature of the phenomena.
these fifty years blow by blow; instead, we confine attention to three decisive
stages reached respectively in 1952, 1964, and 1983. But we start with an
illustration that helps one see what the EPR paradox actually is.
An experimenter in Lyons puts them into separate envelopes which she then
seals. She is thus provided with two envelopes looking exactly alike, and she
puts both into a container. She shakes the container so as to shuffle the pack,
and the system is ready for the experiment.
At 8:00 two travellers, one from Paris and one from Nice, come to the container
(in Lyons), take one envelope each, and then return to Paris and Nice, respec-
tively. At 14:00 they are back at their starting points; each opens her envelope,
looks at the card, and telephones to Lyons reporting the color. The experiment
is repeated every day for a year, and the observer in Lyons keeps a careful record
of the results. At the end of the year the record stands as follows:
1. The reports from Paris are red or black, and the sequence of these reports
is random. The situation is exactly the same as in a game of heads or
tails, and the probability of each outcome is 1/2.
2. The reports from Nice are red or black, and the sequence of these reports
is random. Here too the probability of each outcome is 1/2.
3. When Paris reports red, Nice reports black; when Paris reports black, Nice
reports red. One sees that there is perfect (anti-)correlation between the
report from Paris and the report from Nice.
Accordingly, the experiment we have described displays two features:
1. It is unpredictable and thereby random at the level of individual observa-
tions in Paris and Nice.
2. It is predictable, by virtue of the correlation, at the level where one observes
the Paris and the Nice results simultaneously.
Einstein and Bohr might have interpreted the correlation as follows.
According to Einstein
The future of the system is decided at 8:00 when the envelopes are chosen,
because he believes that the contents of the two envelopes differ. Suppose, for
instance, that Paris has (without knowing it) drawn a red card, and Nice the
black. The colors so chosen exist in reality, even though we do not know them.
The two cards are moved, separately, by the travellers between 8:00 and 14:00,
during which time they do not influence each other in any way. The results on
opening the envelopes read red in Paris and black in Nice. Since the choice at
8:00 was made blind, the opposite outcome is equally possible, but the results
at 14:00 are always correlated (either red/black or black/red). This correlation
at 14:00 is determined by the separation of the colors at 8:00, and we say the
theory proposed by Einstein is realist, deterministic, and separable (or local), by
virtue of a hidden variable, namely, the color.
According to Bohr
There is a crucial preliminary factor, inherent in the preparation of the system.
On shaking the container with the two envelopes, one loses information regard-
ing the colors. Afterwards, one only knows that each envelope contains either a
red card (probability 1/2) or a black card (probability 1/2). We will therefore
say that a given envelope is in a brown state, which is a superposition of a red
state and of a black state having equal probabilities. At 8:00 the two envelopes
are identical: both are in a brown state, and the future of the system is still
undecided. There is no solution until the envelopes are opened at 14:00, since it
is only the action of opening them that makes the colors observable. The result
is probabilistic. There is a probability 1/2 that in Paris the envelope will be
observed to go from the brown state to the red, while the envelope in Nice is
observed to go from the brown state to the black; there is the same probability
1/2 of observing the opposite. But the results of the observations on the two
envelopes are always correlated, which means that there is a mutual influence
between them, in particular at 14:00; in fact it is better to say that, jointly,
they constitute a single and non-separable system, even though one is in Paris
and the other is in Nice. Accordingly, the theory proposed by Bohr is positivist,
probabilistic (non-deterministic) and non-separable (non-local), interrelating as
it does the colors that are actually observed.
Proceeding with impeccable logic but from different premises, both theories
predict the same experimental results. Can we decide between them? At the
level considered here it seems we cannot: for even if the envelopes were opened
prematurely while still in Lyons, one would merely obtain the same results at a
different time, and without affecting the validity of either interpretation. The
solution to the problem must be looked for at the atomic level, by studying the
true EPR set-up itself.
In 1952, David Bohm showed that the paradox could be set up not only with con-
tinuously varying quantities like position and momentum, but also with discrete
quantities like spin. This was the first step towards any realistically conceiv-
able experiment. Meanwhile, objectives have evolved, and nowadays it is more
usual to talk of the EPR scenario, meaning some sensible experiment capable
of discriminating between quantum theory and hidden-variable theories. Such
a set-up is sketched in Figure 16.17 below where we present the simplest EPR
scenario.
A particle with spin 0 decays, at S, into two particles of spin 1/2, which di-
verge from S in opposite directions. Two Stern-Gerlach type detectors A and
B measure the x-components of the spins. Two types of response are possible:
1. spin up at A, spin down at B, a result denoted by $(+1, -1)$
Bohr reasons that all the pairs produced at S are identical. Each pair con-
stitutes a non-separable system right up to the time when the photons reach
the detectors A and B. At that time we observe the response of the detectors,
which is probabilistic, admitting two outcomes $(+1, -1)$ and $(-1, +1)$. To sum
up, Einstein restricts the operation of chance to the instant of decay (at S),
whose details we ignore, but which we believe creates pairs whose hidden vari-
ables are different.
By contrast, Bohr believes that chance operates at the instant of detection, and
that it is inherent in the very nature of the detection process: this chance is
unavoidable.
In 1964, the landscape changes: John Bell, a theorist at CERN, shows that it
is possible to distinguish between the two interpretations experimentally.
The test applies to the EPR scenario; it is refined by Clauser, Horne, Shimony,
and Holt, whence it is called the BCHSH inequality after its five originators.
It is solution (1) that has eventually proved the most convenient; it has been
exploited by Alain Aspect at the Institute for Optics in Paris, in particular.
Next one needs detectors whose response can assume one of two values,
represented conventionally by $+1$ and $-1$. Such a detector might be
Our sketch of the EPR scenario can now be completed as in Figure 16.18 below
where we present the most general EPR scenario.
Figure 16.18(a) views the apparatus perpendicularly to its axis, showing the two
detectors A and B, with their polarizing directions denoted as $\vec{A}$ and $\vec{B}$.
Figure 16.18(b) views the apparatus along its axis, and shows that the analyzing
directions of the two detectors are not parallel, but inclined to each other at an
angle $\theta$.
Figure 16.18(c) is also a view along the axis of the apparatus, and shows the
actual settings chosen by Aspect: two orientations are allowed for each detector,
$\vec{A}_1$ or $\vec{A}_2$ for one, and $\vec{B}_1$ or $\vec{B}_2$ for the other.
$\beta(2) = \pm 1$ is the response of detector B when oriented along $\vec{B}_2$
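Since each detector has two possible orientations, called 1 and 2, we shall denote
their responses as $\alpha_1$, $\alpha_2$ and $\beta_1$, $\beta_2$ respectively. Now consider the quantity $\langle\gamma\rangle$
defined by
$$\langle\gamma\rangle = \langle\alpha_1\beta_1\rangle + \langle\alpha_1\beta_2\rangle + \langle\alpha_2\beta_1\rangle - \langle\alpha_2\beta_2\rangle \qquad (16.55)$$
where the symbol $\langle\ldots\rangle$ denotes the mean value over very many measured events.
We call $\langle\gamma\rangle$ the correlation function of the system.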
Bell's inequalities have the great virtue that they apply to any hidden variable
theory, irrespective of the choice of $\rho(\lambda)$.
can assume only the values $+2$ and $-2$.
To prove the theorem, one constructs a truth table for all 16 possibilities, which
shows that $+2$ and $-2$ are indeed the only possible values of $\gamma$.
$\alpha_1$   $\alpha_2$   $\beta_1$   $\beta_2$   $\gamma$
  1     1     1     1     2
  1     1     1    -1     2
  1     1    -1     1    -2
  1     1    -1    -1    -2
  1    -1     1     1     2
  1    -1     1    -1    -2
  1    -1    -1     1     2
  1    -1    -1    -1    -2
 -1     1     1     1    -2
 -1     1     1    -1     2
 -1     1    -1     1    -2
 -1     1    -1    -1     2
 -1    -1     1     1    -2
 -1    -1     1    -1    -2
 -1    -1    -1     1     2
 -1    -1    -1    -1     2
$$-2 \le \langle\gamma\rangle \le 2 \qquad (16.57)$$
This is obvious, because every value of $\gamma$ lies in this range, and so therefore must
the mean. The endpoints are included in order to allow for limiting cases. Note
that both theorems are purely mathematical; neither involves any assumptions
about physics.
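
The truth table can also be checked by brute force. The short Python sketch below assumes that $\gamma$ is the single-event combination $\alpha_1\beta_1 + \alpha_1\beta_2 + \alpha_2\beta_1 - \alpha_2\beta_2$ whose mean appears in (16.55); the function name `gamma` is ours and is introduced only for illustration.

```python
from itertools import product

def gamma(a1, a2, b1, b2):
    """Single-event value whose mean is <gamma> in (16.55) (assumed form)."""
    return a1 * b1 + a1 * b2 + a2 * b1 - a2 * b2

# Enumerate all 2^4 = 16 assignments of +1/-1 to (alpha1, alpha2, beta1, beta2).
table = {s: gamma(*s) for s in product((1, -1), repeat=4)}

for (a1, a2, b1, b2), g in table.items():
    print(f"{a1:3d} {a2:3d} {b1:3d} {b2:3d} {g:4d}")

# Collect the distinct values of gamma that occur in the table.
values = set(table.values())
print(values)
```

Since only the two values $+2$ and $-2$ occur, any average over events necessarily lies in the interval $[-2, 2]$, which is the content of (16.57).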
that the response of detector A is independent of the orientation of detector B.
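$\alpha_1$ and $\beta_1$ in the orientation $(\vec{A}_1, \vec{B}_1)$
$\alpha_2$ and $\beta_2$ in the orientation $(\vec{A}_2, \vec{B}_2)$
$\alpha_1'$ and $\beta_2'$ in the orientation $(\vec{A}_1, \vec{B}_2)$
$\alpha_2'$ and $\beta_1'$ in the orientation $(\vec{A}_2, \vec{B}_1)$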
Recall that the variables $\alpha$ and $\beta$ can only take on the values $+1$ and $-1$. It is
impossible in practice to make four measurements on one and the same pair
of photons, because each photon is absorbed in the first measurement made
on it; that is why we have spoken conditionally, that is, of what results would
be (a COUNTERFACTUAL statement). But if we believe that the photon
correlations are governed by a theory that is realist, deterministic, and
separable, then we are entitled to assume that the responses, of type $\alpha$ or type $\beta$,
depend on properties that the photons possess before the measurement, so that
the responses correspond to some objective reality. In such a framework we can
appeal to the principle of separability, which implies, for instance, that
detector A would give the same response to the orientations $(\vec{A}_1, \vec{B}_1)$ and $(\vec{A}_1, \vec{B}_2)$,
because the response of A is independent of the orientation of B.
Thus, we have shown that, for a given pair of photons, all possible responses
of the apparatus in its four chosen settings can be specified by means of only
four two-valued variables $\alpha_1$, $\alpha_2$ and $\beta_1$, $\beta_2$. This reduction from eight to four
variables depends on the principle of separability. In this way, we are led to a
situation covered by Theorem 2, and therefore $-2 \le \langle\gamma\rangle \le 2$.
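By making many measurements for each of the four settings we can determine
the four mean values $\langle\alpha_1\beta_1\rangle$, $\langle\alpha_1\beta_2\rangle$, $\langle\alpha_2\beta_1\rangle$, $\langle\alpha_2\beta_2\rangle$ and thus the mean value
of the correlation
$$\langle\gamma\rangle = \langle\alpha_1\beta_1\rangle + \langle\alpha_1\beta_2\rangle + \langle\alpha_2\beta_1\rangle - \langle\alpha_2\beta_2\rangle$$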
This leads to values well outside the interval $[-2, 2]$; for example to $\langle\gamma\rangle = 2\sqrt{2}$
when $\theta = 22.5^\circ$ and to $\langle\gamma\rangle = -2\sqrt{2}$ when $\theta = 67.5^\circ$.
Proof : The laboratory reference frame Oxyz serves to specify the orientations
of detectors and polarizers as shown in Figure 16.20 below:
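Before any measurements have been made, the photon pair a, b forms a
non-separable entity, represented by the vector
$$|\psi\rangle = \frac{1}{\sqrt{2}} \left( |x_A, x_B\rangle + |y_A, y_B\rangle \right) \qquad (16.59)$$
The act of measurement corresponds to passage to the $\theta$-basis. Hence, we
require the transition amplitudes from the two states $|x_A, x_B\rangle$, $|y_A, y_B\rangle$ to the
four states
$$|\theta_A, \theta_B\rangle , \quad |\theta_A, \theta_B + \pi/2\rangle , \quad |\theta_A + \pi/2, \theta_B\rangle , \quad |\theta_A + \pi/2, \theta_B + \pi/2\rangle \qquad (16.60)$$
In the $\theta$-basis we have
$$|\psi\rangle = \frac{1}{\sqrt{2}} \Big[ \cos(\theta_B - \theta_A) \, |\theta_A, \theta_B\rangle - \sin(\theta_B - \theta_A) \, |\theta_A, \theta_B + \pi/2\rangle$$
$$\qquad + \sin(\theta_B - \theta_A) \, |\theta_A + \pi/2, \theta_B\rangle + \cos(\theta_B - \theta_A) \, |\theta_A + \pi/2, \theta_B + \pi/2\rangle \Big] \qquad (16.61)$$
The square of each amplitude featured here represents the detection probability.
For example, the probability of simultaneously detecting photon a polarized at
the angle $\theta_A$ and the photon b polarized at the angle $\theta_B$ is
$$\left[ \frac{1}{\sqrt{2}} \cos(\theta_B - \theta_A) \right]^2 = \frac{1}{2} \cos^2(\theta_B - \theta_A) \qquad (16.62)$$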
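By convention, we write the responses of detector A to a photon in state $|\theta_A\rangle$
(respectively $|\theta_A + \pi/2\rangle$) as $\alpha = +1$ (respectively $\alpha = -1$), and similarly with $\beta$ for detector B.
$$P_{++} = \frac{1}{2} \cos^2(\theta_B - \theta_A) \qquad (16.63)$$
$$P_{+-} = \frac{1}{2} \sin^2(\theta_B - \theta_A) \qquad (16.64)$$
$$P_{-+} = \frac{1}{2} \sin^2(\theta_B - \theta_A) \qquad (16.65)$$
$$P_{--} = \frac{1}{2} \cos^2(\theta_B - \theta_A) \qquad (16.66)$$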
The settings chosen by Aspect are as shown in Figure 16.20 above. Correspond-
ing to it we have the four terms
Thus, the BCHSH test turns the EPR scenario into an arena for rational con-
frontation between the two interpretations; it remains only to progress from
thought experiments to experiments conducted in the laboratory.
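
As an illustration, the quantum prediction for $\langle\gamma\rangle$ can be evaluated directly from the probabilities (16.63)-(16.66). The sketch below is not taken from the text: the figure with Aspect's settings is not reproduced here, so we assume that the four analyzer directions form a fan $\vec{A}_2, \vec{B}_1, \vec{A}_1, \vec{B}_2$ with the same angle $\theta$ between neighbors. This assumed ordering is chosen because it reproduces the values $\pm 2\sqrt{2}$ quoted above; the function names are ours.

```python
import math

def correlation(theta_a, theta_b):
    """<alpha*beta> = P++ + P-- - P+- - P-+ from (16.63)-(16.66)."""
    d = theta_b - theta_a
    p_same = math.cos(d) ** 2      # P++ + P--
    p_diff = math.sin(d) ** 2      # P+- + P-+
    return p_same - p_diff         # equals cos(2 * d)

def gamma_qm(theta_deg):
    """<gamma> of (16.55) for the ASSUMED fan A2, B1, A1, B2 spaced by theta."""
    t = math.radians(theta_deg)
    a2, b1, a1, b2 = 0.0, t, 2 * t, 3 * t
    return (correlation(a1, b1) + correlation(a1, b2)
            + correlation(a2, b1) - correlation(a2, b2))

for theta in (0.0, 22.5, 45.0, 67.5, 90.0):
    print(f"theta = {theta:5.1f} deg   <gamma> = {gamma_qm(theta):+.4f}")
```

Under this assumption the prediction reduces to $3\cos 2\theta - \cos 6\theta$, which leaves the interval $[-2, 2]$ around $\theta = 22.5^\circ$ and $\theta = 67.5^\circ$.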
16.5.8 The Beginnings of the Experiment at Orsay (1976)
Alain Aspect's experiment studies the correlation between the polarizations
of the members of photon pairs emitted by calcium. The light source is a
beam of calcium atoms, excited by two focused laser beams having wavelengths
$\lambda' = 406$ nm and $\lambda'' = 581$ nm respectively. Two-photon excitation
produces a state having the quantum number J = 0. When it decays, this state
emits two monochromatic photons having the wavelengths $\lambda_1 = 551.3$ nm and
$\lambda_2 = 422.7$ nm respectively, in a cascade of two electronic transitions from the
initial J = 0 level to the final J = 0 state, passing through an intermediate
J = 1 state, as shown in Figure 16.21 below which shows the excitation and
decay of the calcium atom.
The mean lifetime of the intermediate state is 4.7 ns. To simplify the
terminology, we shall call the $\lambda_1 = 551.3$ nm light green, and the $\lambda_2 = 422.7$ nm light
violet.
The polarizer, which works like a Wollaston prism, is shown in Figure 16.22
below, where we can see the two-valued response of a Wollaston prism.
The Wollaston prism is made of quartz or of calcite. It splits an incident beam
of natural (unpolarized) light into two beams of equal intensity, polarized at $90^\circ$
to each other. If only a single unpolarized photon is incident, it emerges either
in the state $|x\rangle$, with probability 1/2, or in the state $|y\rangle$, with probability 1/2.
Thus, the response of the system is two-valued.
It uses a coincidence circuit which registers an event whenever two photons are
detected in cascade. In this way four separate counts are recorded
simultaneously, over some given period of time. In the EPR scenario envisaged by Bohm,
where $\theta = 0$, the only possible responses are $(+1, -1)$ or $(-1, +1)$. In the
situation realized by Aspect, the angle $\theta$ is non-zero, and four different responses
are possible.
4. $N_{--}$, the number of coincidences corresponding to $\alpha = -1$ and $\beta = -1$,
that is, to $\alpha\beta = +1$
The resolving time of the coincidence circuit is 10 ns, meaning that it reckons
two photons as coincident if they are separated in time by no more than
10 ns. The mean life of the intermediate state of the calcium atom is 4.7 ns.
Therefore, after a lapse of 10 ns, that is more than twice the mean lifetime,
almost all the atoms have decayed (actually 88%). In other words, the efficiency
of the coincidence counter is very high.
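
The 88% figure follows from the exponential decay law. The minimal Python sketch below assumes simple exponential decay of the intermediate state, with the mean life and resolving time taken from the text; the function name is ours.

```python
import math

MEAN_LIFE_NS = 4.7      # mean life of the intermediate state (from the text)
WINDOW_NS = 10.0        # resolving time of the coincidence circuit (from the text)

def fraction_decayed(t_ns, tau_ns=MEAN_LIFE_NS):
    """Fraction of atoms that have decayed after t_ns, for exponential decay."""
    return 1.0 - math.exp(-t_ns / tau_ns)

print(f"{fraction_decayed(WINDOW_NS):.1%}")
```

The result is about 88%, in agreement with the value quoted above.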
The experiment consists in counting, over some given time interval, the four
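kinds of coincidence: $N_{++}$, $N_{+-}$, $N_{-+}$ and $N_{--}$. The total number of events is
$N = N_{++} + N_{+-} + N_{-+} + N_{--}$.
$$\langle\gamma\rangle = \langle\alpha_1\beta_1\rangle + \langle\alpha_1\beta_2\rangle + \langle\alpha_2\beta_1\rangle - \langle\alpha_2\beta_2\rangle \qquad (16.74)$$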
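
To make the bookkeeping concrete, here is a minimal sketch of how a mean value $\langle\alpha\beta\rangle$ could be estimated from the four coincidence counts, using only the fact that $\alpha\beta = +1$ for the $(+,+)$ and $(-,-)$ coincidences and $-1$ for the mixed ones. The function names are ours, and the counts in the example are invented for illustration; they are not Orsay data.

```python
def mean_alpha_beta(n_pp, n_pm, n_mp, n_mm):
    """Estimate <alpha*beta> from the four coincidence counts.

    alpha*beta = +1 for the (+,+) and (-,-) coincidences and -1 for the
    mixed ones, so the mean is a signed sum divided by the total N.
    """
    total = n_pp + n_pm + n_mp + n_mm
    if total == 0:
        raise ValueError("no coincidences recorded")
    return (n_pp - n_pm - n_mp + n_mm) / total

def gamma_from_counts(c11, c12, c21, c22):
    """Combine four settings as in (16.74); each argument is a 4-tuple of counts."""
    return (mean_alpha_beta(*c11) + mean_alpha_beta(*c12)
            + mean_alpha_beta(*c21) - mean_alpha_beta(*c22))

# Hypothetical counts, invented for illustration only (not Orsay data).
example = gamma_from_counts((427, 73, 73, 427), (427, 73, 73, 427),
                            (427, 73, 73, 427), (73, 427, 427, 73))
print(round(example, 3))
```

Each of the four settings needs its own run of counts, since the pairs measured in one setting cannot be measured again in another.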
Figure 16.24: Results from First Orsay Experiment
The experimental results from 17 different values of $\theta$ are indicated on the figure
by squares, where the vertical size of the square gives plus or minus one standard
deviation (a measure of the experimental error).
Clearly, there can be no doubt that the BCHSH inequality is violated; many of
the experimental points fall outside the interval $[-2, 2]$. At the point where the
violation is maximal ($\theta = 22.5^\circ$), one finds
$$\langle\gamma\rangle = 2.70 \pm 0.015 \qquad (16.77)$$
which represents a departure of over 40 standard deviations from the extreme
value of 2. What is even more convincing is the precision with which the
experimental points lie on the curve predicted by quantum mechanics.
Quite evidently, for the EPR scenario one must conclude not only that hidden-
variable theories fail, but also that quantum mechanics is positively the right
theory for describing the observations.
According to quantum theory, before the measurement each particle pair consti-
tutes a single system extending from A to B, whose two parts are non-separable
and correlated. This interpretation corresponds to a violation of Bell's
inequality and agreement with experiment.
However, to clinch this conclusion, one must ensure that no influence is exerted
in the ordinary classical sense through some interaction propagated between the
two detectors A and B, that is, no influence which might take effect after the
decay at S, and which might be responsible for the correlation actually observed.
Let us therefore examine the Orsay apparatus in more detail as in Figure 16.25
below where we attempt to test Einsteinian non-separability.
When the detectors at A and B record a coincidence, this means that both have
been triggered within a time interval of at most 10 ns, the resolving time of the
circuit. Could it happen that, within this interval, A sends to B a signal capable
of influencing the response of B? In the most favorable case, such a signal would
travel with the speed of light in vacuum, which according to relativity theory is
the upper limit on the propagation speed of information, and thereby of energy.
To cover the distance AB, which is 12 m in the figure, such a signal would need
40 ns. This is too long by at least 30 ns, and rules out any causal links between
A and B in the sense of classical physics. One says that the interval between A
and B is space-like.
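
The timing argument is simple arithmetic, sketched below in Python. The distance and the resolving time are the values quoted in the text; the function names are ours.

```python
C_M_PER_S = 299_792_458.0   # speed of light in vacuum (exact SI value)

def light_travel_time_ns(distance_m):
    """Time in ns for a light-speed signal to cover distance_m."""
    return distance_m / C_M_PER_S * 1e9

def is_spacelike(distance_m, window_ns):
    """True if no light-speed signal can connect the two detections
    within the coincidence window."""
    return light_travel_time_ns(distance_m) > window_ns

print(f"{light_travel_time_ns(12.0):.1f} ns")
print(is_spacelike(12.0, 10.0))
```

For AB = 12 m the travel time is about 40 ns, well beyond the 10 ns window, whereas a separation of only 2 m would not satisfy the condition.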
One of the advantages of the Orsay experiment is that it uses a very strong
light-source, allowing sufficient distance between the detectors A and B while
still preserving reasonable counting rates. By increasing the distance AB step
by step, Aspect could check that the correlation persists, even when the interval
between A and B becomes space-like. This is the check that guarantees that
the two-photon system is non-separable irrespective of the distance AB.
The solution adopted at Orsay employs periodic switching every 10 ns. These
changes are governed by two independent oscillators, one for channel A and one
for channel B. The oscillators are stabilized, but however good the stabilization
it cannot eliminate small random drifts that are different in the two channels,
seeing that the oscillators are independent. This ensures that the changes of
orientation are random even though the oscillations are periodic, provided the
experiment lasts long enough (1 to 3 hours).
The key element of the second Orsay experiment is the optical switch shown in
Figure 16.26 below.
Figure 16.26: Second Orsay Experiment - Optical Switch
The fluid keeps changing from a state of perfect rest to one of maximum
agitation and back again. In the state of rest, the light beam is simply transmitted.
In the state of maximum agitation, the fluid arranges itself into a structure of
parallel and equidistant plane layers, alternately stationary (nodal planes) or
agitated (antinodal planes). Thus, one sets up a lattice of net-like diffracting
planes; the diffracted intensity is maximum at the so-called Bragg angles, just as
in scattering from a crystal lattice. Here the light beam is deviated through $10^{-2}$
radians (the angles in the figure are exaggerated for effect). The two numerical
values, 25 MHz and $10^{-2}$ radians, suffice to show the magnitude of the technical
achievement. With the acoustic power of 1 watt, the system functions as an
ideally efficient switch.
In this set-up, the photons a and b leave S without knowing whether they will
go, the first to A1 or A2 and the second to B1 or B2 .
The second experiment is less precise than the first, because the light beams
must be very highly collimated in order to ensure efficient switching.
Nevertheless, its results exhibit an unambiguous violation of Bell's inequality, reaching
5 standard deviations at the peak; moreover the results are entirely compatible
with the predictions of quantum mechanics.
Einstein             Bohr
hidden variables     quantum mechanics
realist              positivist
deterministic        probabilistic
separable            non-separable
The violation of the BCHSH inequality argues for Bohr's interpretation, all the
more so as the measured values of $\langle\gamma\rangle$ are in close agreement with the
predictions of quantum mechanics.
It remains to ask oneself just why hidden-variable theories do fail. Of the three
basic assumptions adopted by such theories, namely realism, determinism, and
separability, at least one must be abandoned. In the last resort, it is separa-
bility that seems to be the most vulnerable assumption. Indeed, one observes
experimentally that the violation of the BCHSH inequality is independent of the
distance between the two detectors A and B, even when this distance is 12 m
or more. There are still die-hard advocates of determinism, who try to explain
non-separability through non-local hidden variables. Such theories, awkward
and barely predictive, are typically ad hoc, and fit only a limited number of
phenomena. They are weakly placed to defend themselves against interpreta-
tions furnished by quantum mechanics, which have the virtues of simplicity,
elegance, efficiency, and generality, and which are invariably confirmed by ex-
periment.
in a quantum system evolving free of external
perturbations, and from well-defined initial
conditions, all parts of the system remain
correlated, even when the interval between
them is space-like
This assertion reflects the properties of the state vector of a quantum system.
For an EPR system, the state vector after the decay of the source reads
$$|\psi\rangle = \frac{1}{\sqrt{2}} \left( |x_A, x_B\rangle + |y_A, y_B\rangle \right) \qquad (16.78)$$
16.7 An Example and a Solution - Bell's Theorem with Photons
Two photons fly apart from one another, and are in oppositely oriented circularly
polarized states. One strikes a polaroid film with axis parallel to the unit vector
a, the other a polaroid with axis parallel to the unit vector b. Let $P_{++}(a, b)$ be
the joint probability that both photons are transmitted through their respective
polaroids. Similarly, $P_{--}(a, b)$ is the probability that both photons are absorbed
by their respective polaroids, $P_{+-}(a, b)$ is the probability that the photon at
the a polaroid is transmitted and the other is absorbed, and finally, $P_{-+}(a, b)$
is the probability that the photon at the a polaroid is absorbed and the other
is transmitted.
where i and j take on the values $+$ and $-$, where $\lambda$ signifies the so-called hidden
variables, and where $\rho(\lambda)$ is a weight function. This equation is called the
separable form.
where
It is required that
(a) $\rho(\lambda) \ge 0$
(b) $\int d\lambda \, \rho(\lambda) = 1$
(1) Show that the above classical realist assumptions imply that $|B| \le 2$.
(2) Show that quantum mechanics predicts that $C(a, b) = 2(a \cdot b)^2 - 1$.
(3) Show that the maximum value of the Bell coefficient is $2\sqrt{2}$, according to
quantum mechanics.
(4) Cast the quantum mechanical expression for C(a, b) into a separable form.
Which of the classical requirements, (a), (b), or (c) above is violated?
Solution
(1) With the separability assumption, we have (16.81)
$$C(a, b) = \int d\lambda \, \rho(\lambda) \, C(a, \lambda) \, C(b, \lambda)$$
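Thus, in all cases $|B| \le 2$.
$$|R\rangle = \frac{1}{\sqrt{2}} \left( |z\rangle + i |x\rangle \right) , \qquad |L\rangle = \frac{1}{\sqrt{2}} \left( |z\rangle - i |x\rangle \right) \qquad (16.88)$$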
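$$\begin{pmatrix} |z'\rangle \\ |x'\rangle \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} |z\rangle \\ |x\rangle \end{pmatrix} \qquad (16.89)$$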
$$|EPR\rangle = \frac{1}{\sqrt{2}} \left( |R_1\rangle |R_2\rangle + |L_1\rangle |L_2\rangle \right) \qquad (16.90)$$
corresponds to the more general situation in which the photons are in oppositely
oriented states of circular polarization, where the sense of this polarization is
not specified. We can write this entangled or Einstein-Podolsky-Rosen state in
the form
$$|EPR\rangle = \frac{1}{\sqrt{2}} \left( |z_1\rangle |z_2\rangle - |x_1\rangle |x_2\rangle \right) \qquad (16.91)$$
which is a superposition of states of linear polarization.
where we have used $\langle z_1 | x_1 \rangle = 0$. The probability that photon 1 is found to
have linear polarization in the direction $z$, and photon 2 in the direction $z'$ is
$$P_{++}(a, b) = |\langle EPR | z_1 z_2' \rangle|^2 = \frac{1}{2} \cos^2\theta \qquad (16.93)$$
where we have assumed that a is in the $z$ direction and b is in the $z'$ direction.
Suppose next that the linear polarization of photon 1
were measured in the $x$ direction, and that of photon 2 again in the $z'$ direction.
The probability amplitude is
$$\langle EPR | x_1 z_2' \rangle = \frac{1}{\sqrt{2}} \left( \langle z_1| \langle z_2| - \langle x_1| \langle x_2| \right) \left( |x_1\rangle (\cos\theta \, |z_2\rangle - \sin\theta \, |x_2\rangle) \right) = \frac{1}{\sqrt{2}} \sin\theta \qquad (16.94)$$
If photon 1 has polarization in the $x$ direction, then it will not be transmitted
by a polarizer in the $z$ direction - it will be absorbed. Hence,
$$P_{-+}(a, b) = |\langle EPR | x_1 z_2' \rangle|^2 = \frac{1}{2} \sin^2\theta \qquad (16.95)$$
Similarly,
$$P_{+-}(a, b) = |\langle EPR | z_1 x_2' \rangle|^2 = \frac{1}{2} \sin^2\theta \qquad (16.96)$$
$$P_{--}(a, b) = |\langle EPR | x_1 x_2' \rangle|^2 = \frac{1}{2} \cos^2\theta \qquad (16.97)$$
The correlation coefficient is then
$$C(a, b) = P_{++}(a, b) + P_{--}(a, b) - P_{+-}(a, b) - P_{-+}(a, b)$$
$$= \cos^2\theta - \sin^2\theta = 2\cos^2\theta - 1 = \cos 2\theta \qquad (16.98)$$
Since the unit vectors a and b are at an angle $\theta$ with respect to one another, it
follows that $a \cdot b = \cos\theta$ and therefore
$$C(a, b) = 2\cos^2\theta - 1 = 2(a \cdot b)^2 - 1 \qquad (16.99)$$
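
The result $C(a, b) = \cos 2\theta$ can be checked numerically. The sketch below represents the two-photon state (16.91) as a four-component vector and projects it onto the product states for polaroid a along $z$ and polaroid b along the rotated axis $z'$. The basis ordering, the function names, and the sign convention for the rotated axes (chosen to agree with (16.94)) are our assumptions for illustration.

```python
import math

# Basis order for the two-photon states: |z1 z2>, |z1 x2>, |x1 z2>, |x1 x2>.
EPR = [1 / math.sqrt(2), 0.0, 0.0, -1 / math.sqrt(2)]   # state (16.91)

def product_state(p1, p2):
    """Tensor product of two single-photon states given as (z, x) amplitudes."""
    return [p1[0] * p2[0], p1[0] * p2[1], p1[1] * p2[0], p1[1] * p2[1]]

def probabilities(theta):
    """Return (P++, P+-, P-+, P--) for polaroid a along z and b along z'."""
    z, x = (1.0, 0.0), (0.0, 1.0)
    # Rotated axes of polaroid b; sign convention assumed as in (16.94).
    z_rot = (math.cos(theta), -math.sin(theta))
    x_rot = (math.sin(theta), math.cos(theta))

    def prob(p1, p2):
        # All amplitudes are real here, so the overlap is a plain dot product.
        amp = sum(e * c for e, c in zip(EPR, product_state(p1, p2)))
        return amp ** 2

    return prob(z, z_rot), prob(z, x_rot), prob(x, z_rot), prob(x, x_rot)

def correlation(theta):
    """C(a, b) = P++ + P-- - P+- - P-+ as in (16.98)."""
    p_pp, p_pm, p_mp, p_mm = probabilities(theta)
    return p_pp + p_mm - p_pm - p_mp

for deg in (0, 30, 45, 90):
    t = math.radians(deg)
    print(deg, round(correlation(t), 6), round(math.cos(2 * t), 6))
```

The two printed columns agree for every angle, and the four probabilities sum to one, as they must.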
(3) Suppose that the angle between the vectors a and a′ is x/2, between a and
b is y/2 and between b and b′ is z/2. Then the angle between a′ and b′ is
(x + y + z)/2 and according to quantum mechanics, the Bell coefficient has the
form
$$B = \cos x + \cos y + \cos z - \cos(x + y + z) \qquad (16.100)$$
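This function has extrema when
$$\frac{\partial B}{\partial x} = -\sin x + \sin(x + y + z) = 0$$
$$\frac{\partial B}{\partial y} = -\sin y + \sin(x + y + z) = 0$$
$$\frac{\partial B}{\partial z} = -\sin z + \sin(x + y + z) = 0$$
or
$$\sin x = \sin y = \sin z = \sin(x + y + z) \qquad (16.101)$$
This has the solution
$$x = y = z \quad \text{and} \quad 3x = \pi - x \;\Rightarrow\; x = \pi/4 \qquad (16.102)$$
$$\frac{\partial^2 B}{\partial x^2} = \frac{\partial^2 B}{\partial y^2} = \frac{\partial^2 B}{\partial z^2} = -\cos\frac{\pi}{4} + \cos\frac{3\pi}{4} = -\sqrt{2} < 0$$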
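
The location of the maximum can be cross-checked numerically. The following sketch performs a coarse grid search of (16.100) over $0 \le x, y, z \le \pi$; the grid range and step are our own illustrative choices, and a grid search is a swapped-in technique, not the analytic argument used above.

```python
import math
from itertools import product

def bell(x, y, z):
    """Bell coefficient (16.100) in terms of the doubled angles x, y, z."""
    return math.cos(x) + math.cos(y) + math.cos(z) - math.cos(x + y + z)

# Coarse grid search over [0, pi]^3; step and range are illustrative choices.
steps = 40
grid = [math.pi * k / steps for k in range(steps + 1)]
best = max(product(grid, repeat=3), key=lambda p: bell(*p))

print(best, bell(*best))
print(2 * math.sqrt(2))
```

The search lands on $x = y = z = \pi/4$, where $B = 2\sqrt{2}$.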
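(4) Let the vector a be at an angle $\theta_a$ with respect to some direction in the x-z
plane, and let b be at an angle $\theta_b$ with respect to the same direction. Then
$$\rho(\lambda) = \delta(\lambda + 1) + \delta(\lambda - 1)$$
$$C(a, 1) = \cos 2\theta_a , \qquad C(a, -1) = \sin 2\theta_a$$
$$C(b, 1) = \cos 2\theta_b , \qquad C(b, -1) = \sin 2\theta_b$$
$$\rho(\lambda) \ge 0$$
$$-1 \le C(a, \lambda) , \; C(b, \lambda) \le 1 \quad \text{for } \lambda = \pm 1$$
but
$$\int d\lambda \, \rho(\lambda) = 1 + 1 = 2$$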
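
A short numerical check of this separable form is sketched below. The two delta functions are represented by a two-point weight table, and the names are ours; the check confirms that the weighted sum reproduces $\cos 2(\theta_a - \theta_b)$ while the total weight is 2 rather than 1.

```python
import math

# Two-point "hidden variable" lambda = +1, -1, each with weight 1,
# mimicking rho(lambda) = delta(lambda + 1) + delta(lambda - 1).
WEIGHTS = {+1: 1.0, -1: 1.0}

def c_single(theta, lam):
    """C(a, lambda): cos(2 theta) for lambda = +1, sin(2 theta) for lambda = -1."""
    return math.cos(2 * theta) if lam == 1 else math.sin(2 * theta)

def c_separable(theta_a, theta_b):
    """Sum over lambda of rho * C(a, lambda) * C(b, lambda)."""
    return sum(w * c_single(theta_a, lam) * c_single(theta_b, lam)
               for lam, w in WEIGHTS.items())

total_weight = sum(WEIGHTS.values())   # 2, not 1: the normalization fails

print(c_separable(0.4, 1.1), math.cos(2 * (0.4 - 1.1)))
print(total_weight)
```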
The original EPR analysis was rather complex in a technical sense and most dis-
cussions now use a simpler version due to Bohm. He considered a particle whose
decay produces two spin-1/2 particles whose total spin angular momentum is
zero. These particles move away from each other in opposite directions, and the
components of their spins along various directions are subsequently measured
by two observers, N and L, say. The constraint on the total spin means that if
both observers agree to measure the spin along a certain direction n, and if N
measures $+\hbar/2$, then L will necessarily get the result $-\hbar/2$, and if N measures
$-\hbar/2$, then L will necessarily get the result $+\hbar/2$.
There are no surprises if such correlations are analyzed in the context of clas-
sical physics. If one particle emerges from the decay with its internal angular
momentum vector pointing along some particular direction, then because of
conservation of angular momentum, the second particle is guaranteed to emerge
with its spin vector pointing in the opposite direction. Thus, the 100% anti-
correlations found in the measurements made by the two observers are simply
the result of the fact that both particles possess actual, and (anti-)correlated,
values of internal angular momentum and this is true from the time they emerge
from the decay to the time the measurements are made. There are no paradoxes
here, and everything is in accord with the simple realist view of classical physics.
The situation in quantum theory is radically different. Suppose first that the
measurements are made along the z-axes of the two observers. The spin part
of the state of the two particles can be written in terms of the associated
eigenvectors as
$$|\psi\rangle = \frac{1}{\sqrt{2}} \left( |\uparrow\rangle |\downarrow\rangle - |\downarrow\rangle |\uparrow\rangle \right) \qquad (16.105)$$
where, for example, $|\uparrow\rangle |\downarrow\rangle$ is the state in which particles 1 and 2 have spin $+\hbar/2$
and $-\hbar/2$ respectively. Thus
interpretation of the above entangled state.
The obvious question is how the information about each observer's individual
results gets to the other particle to guarantee that the result obtained by the
second observer will be the correct one.
One might be tempted to invoke the reasoning of classical physics and argue
that both particles possess the appropriate value all the time.
However, the only way in standard quantum theory of guaranteeing that a cer-
tain result will be obtained is if the state is an eigenvector of the observable
concerned. But the state $|\psi\rangle$ above is not of this type. In fact, it displays
the typical features of quantum entanglement - it is a superposition of states.
Any attempt to invoke a hidden variable resolution will have to cope with the
implications of the Kochen-Specker theorem.
There is also a question of whether this picture is compatible with special rela-
tivity. If the measurements by the two observers are space-like separated (which
can be easily arranged) then which of them makes the first measurement and
hence, in the standard interpretation, causes the state-vector reduction is clearly
reference-frame dependent.
In one sense, this new entangled state is what might have been expected, and
confirms that there is the same type of 100% anti-correlation between $S_x$
measurements as that found for the observable $S_z$. Indeed, this argument can be
generalized to show that for any unit vector n, the entangled state can be
rewritten as a sum of two anti-correlated terms containing eigenvectors of the
projection $n \cdot S$ of the spin operator along n. Thus, if one adopts the classical
type argument, one is obliged to conclude that both particles possessed exact
values of spin along any axis from the moment they left the decay. This might
not be easy to reconcile with the uncertainty relations associated with the an-
gular momentum commutators.
EPR considered these issues, and concluded that the difficulties could be re-
solved in one of only two ways:
1. When N makes her measurement, the result communicates itself at once
in some way to particle 2, and converts its state into the appropriate
eigenvector.
or
In contemplating the first possibility it must be appreciated that the two par-
ticles may have moved a vast distance apart before the first measurement is
made and, therefore, any "at once" mode of communication would be in violent
contradiction with the spirit (if not the law) of special relativity. It is not
surprising that Einstein was not very keen on this alternative! An additional
objection involves the lack within quantum theory itself of any idea about how
this non-local effect is supposed to take place, so in this sense the theory would
be incomplete anyway.
EPR came to the conclusion that the theory is indeed incomplete, although they
left open the correct way in which to complete it. One natural path is to suppose that
there exist hidden variables whose values are not accessible to measurement in
the normal way but which determine the actual values of what we normally
regard as observables - in the same way as do the microstates in classical sta-
tistical physics.
However, as we discussed earlier and will review here again, a very famous result
of John Bell shows that this is not possible, that is, any hidden-variable theory
that exactly replicates the results of quantum theory will necessarily possess
striking non-local features.
16.8.1 The Bell Inequalities
As with the Kochen-Specker result, the non-locality property we are about to
discuss is not just a feature of hidden variables theories. It applies to any realist
interpretation of quantum theory in which it is deemed meaningful to say that
an individual system possesses values for its physical quantities in a way that
is analogous to that in classical physics.
The considerations of EPR were concerned with two observers who make mea-
surements along the same axis. Bell found his famous inequalities by asking
what happens if the observers measure the spin of the particles along different
axes. In particular, we consider a pair of unit vectors a and a′ for one observer
and another pair b and b′ for the other observer.
The key ingredient in the derivation of the Bell inequalities is the correlation be-
tween measurements made by the two observers along these different directions.
For directions a and b this is defined by
$$C(a, b) := \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} a_n b_n \tag{16.108}$$
and similarly for the other directions. Note that if the results are always totally
correlated (spins always in the same direction) then C(a, b) = +1, whereas if they
are totally anti-correlated (spins always in opposite directions) we get C(a, b) = −1.
For any member n of the collection, each term in this sum will take on the value
+1 or −1. Furthermore, the fourth term on the right hand side is equal to the
product of the first three (because $(a_n)^2 = 1 = (b_n)^2$). Then thinking about the
various possibilities shows that $g_n$ can take on only the values ±2. Therefore,
the right hand side of the expression
$$\frac{1}{N} \sum_{n=1}^{N} g_n = \frac{1}{N} \sum_{n=1}^{N} a_n b_n + \frac{1}{N} \sum_{n=1}^{N} a_n b'_n + \frac{1}{N} \sum_{n=1}^{N} a'_n b_n - \frac{1}{N} \sum_{n=1}^{N} a'_n b'_n$$
It is important to emphasize the only assumptions that have gone into proving
this inequality are:
1. For each particle it is meaningful to talk about the actual values of the
projection of the spin along any direction.
2. There is locality in the sense that the value of any physical quantity is not
changed by altering the position of a remote piece of measuring equipment.
This means that both occurrences of $a_n$ in the expression for the average value
of $g_n$ have the same value, that is, they do not depend on the direction (b or
b′) along which the other observer chooses to measure the spin of particle 2. In
particular, we are ruling out the type of context-dependent values that arose in
our discussion of the Kochen-Specker theorem.
We will now show that the predictions of quantum theory violate this inequality
over a range of directions for the spin measurements. The quantum mechanical
prediction for the correlation between the spin measurements along axes a and b
is
$$C(a, b) := \left(\frac{2}{\hbar}\right)^2 \langle\psi|\, a \cdot S^{(1)} \otimes b \cdot S^{(2)} \,|\psi\rangle \tag{16.111}$$
where S(1) and S(2) are the spin operators for particles 1 and 2 respectively,
and the tensor product is as we defined earlier in this chapter. Since the total
angular momentum of the entangled vector $|\psi\rangle$ is zero, it is invariant under the
unitary operators which generate rotations of the coordinate systems.
This means that C(a, b) is a function of $a \cdot b = \cos\theta_{ab}$ only and, hence, there is
no loss of generality in assuming that a points along the z-axis and that b lies
in the x-z plane. Then the expression for C(a, b) becomes
Now we restrict our attention to the special case in which (1) the four vectors
a, a′, b, b′ are coplanar, (2) a and b are parallel and (3) $\theta_{ab'} = \theta_{a'b} = \phi$, say.
Then the Bell inequality will be satisfied provided that
$$|1 + 2\cos\phi - \cos 2\phi| \le 2$$
This is violated for all values of φ between 0° and 90°. This means that if the
predictions of quantum theory are experimentally valid in this region then any
idea of systems possessing individual values for observables must necessarily in-
volve an essential non-locality. This applies in particular to any hidden variable
theory that is completely consistent with the results of quantum theory. Thus,
the important questions are:
1. Are the Bell inequalities empirically violated?
2. If so, are such violations in accord with the predictions of quantum theory?
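The violation claimed above for the coplanar configuration can be checked numerically. The following is a minimal sketch, assuming the standard singlet-state result C(a, b) = −cos θ_ab for the quantum correlation (that explicit expression does not survive in the text here, so it is an assumption of the sketch); the function names are purely illustrative.

```python
import math

def singlet_correlation(angle_1, angle_2):
    """Assumed quantum prediction for the singlet state: C = -cos(theta)."""
    return -math.cos(angle_1 - angle_2)

def bell_combination(a, a_prime, b, b_prime, corr=singlet_correlation):
    """C(a,b) + C(a,b') + C(a',b) - C(a',b') for coplanar directions (angles)."""
    return (corr(a, b) + corr(a, b_prime)
            + corr(a_prime, b) - corr(a_prime, b_prime))

def special_case(phi):
    """Coplanar vectors with a parallel to b and theta_ab' = theta_a'b = phi."""
    a = b = 0.0
    b_prime = phi
    a_prime = -phi
    return bell_combination(a, a_prime, b, b_prime)

if __name__ == "__main__":
    for degrees in (0, 15, 30, 45, 60, 75, 90):
        value = abs(special_case(math.radians(degrees)))
        print(f"phi = {degrees:2d} deg  |combination| = {value:.4f}  "
              f"violates bound of 2: {value > 2 + 1e-12}")
```

Under this assumption the magnitude equals 2 at the endpoints and exceeds 2 everywhere in between, with the largest excess at φ = 60°.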
In many experiments over the last two decades, the overwhelming conclusion
is that the predictions of quantum theory are vindicated and so we are obliged
either to stick with a pragmatic approach or a strict instrumentalist interpreta-
tion or else to accept the existence of a strange non-locality that seems hard to
reconcile with our normal concepts of spatial separation between independent
entities.
How many tests are needed to make my realist friend feel uncomfortable?
The problem is not whether the validity of a Bell inequality can be salvaged
by invoking clever loopholes, as some realists try to trick us into, but whether
there can be any local realistic theory that reproduces the experimental results.
To simplify the discussion, I will assume that there are ideal detectors and that
the rate at which particles are produced by the apparatus is perfectly known.
$$P(B|A \wedge C) = P(A|B \wedge C)\, \frac{P(B|C)}{P(A|C)} \tag{16.115}$$
$$\frac{p'_r}{p'_q} = 100 \tag{16.116}$$
The question is: how many experimental tests are needed to change my friend's
opinion to
$$\frac{p''_r}{p''_q} = 0.01 \tag{16.117}$$
say, before he is driven to bankruptcy. This is a reversal (in belief) by a factor
of $10^4$.
In this case, $P(r|\{m, n\} \wedge I) = p''_r$ is the new prior probability for my friend
after the experiments are finished and similarly we have $P(q|\{m, n\} \wedge I) = p''_q$.
If we define
which are just the probabilities of the experimentally found result (the actual
data - m successes in n trials) according to the two theories.
These follow from the binomial theorem
$$E_r = \frac{n!}{m!(n-m)!}\, r^m (1-r)^{n-m}, \qquad E_q = \frac{n!}{m!(n-m)!}\, q^m (1-q)^{n-m} \tag{16.119}$$
or that
as the confidence depressing factor for the hypothesis LR with respect to the
hypothesis QM.
I, the theorist, will assume that the coin is unbiased and that therefore q = 0.5
and m = n/2 (assuming that I am correct). We then have
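$$D = \left(\frac{1}{2r}\right)^{n/2} \left(\frac{1}{2(1-r)}\right)^{n/2} = \left(\frac{1}{2}\right)^{n} \left(\frac{1}{r(1-r)}\right)^{n/2} \tag{16.124}$$
So that it would take only 16 coin flips to reverse my untrusting friend's belief.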
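The growth of the confidence depressing factor with the number of flips can be explored with a short script. This is only a sketch of equation (16.124) for q = 1/2 and m = n/2; the bias r = 0.1 used below is a hypothetical value chosen for illustration (it is not the value behind the count quoted above), and the function names are our own.

```python
def confidence_depressing_factor(r, n):
    """D of equation (16.124): q = 1/2, m = n/2 successes in n (even) trials."""
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie strictly between 0 and 1")
    # (1/2)^n * (1/(r(1-r)))^(n/2) = (1 / (4 r (1-r)))^(n/2)
    return (1.0 / (4.0 * r * (1.0 - r))) ** (n / 2)

def flips_needed(r, target=1e4):
    """Smallest even n for which D reaches the target reversal factor."""
    if r == 0.5:
        raise ValueError("r = 0.5 coincides with q, so D never grows")
    n = 0
    while confidence_depressing_factor(r, n) < target:
        n += 2
    return n

if __name__ == "__main__":
    r_assumed = 0.1  # hypothetical bias believed by the skeptical friend
    n = flips_needed(r_assumed)
    print(f"r = {r_assumed}: n = {n}, D = {confidence_depressing_factor(r_assumed, n):.3e}")
```

The closer r is to 1/2, the more flips are required, since the two hypotheses then make nearly the same predictions.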
Figure 16.28: Three Detector Experiment
A run of the experiment consists of setting the switch on each detector to one of
its two positions (labeled 1 or 2), pressing a button at the source (to release a trio
of particles, one aimed at each detector), and recording the color subsequently
flashed at each detector.
We consider only the data acquired for four of the eight possible switch settings,
namely, those in which the number of detectors set to 1 is odd.
will lead to similar results (1 ↔ 2). As shown in Figure 16.28 we call the detectors
A, B, and C, and specify pertinent facts about them by listing three pieces
of information (switch settings or colors flashed) in that order.
If we run the experiment many times, then the observational results are the
following. If just one detector is set to 1 (and the others to 2), then an odd
number of red lights always flash, that is, either all three detectors flash red or
there is one red flash and two green ones.
If all three detectors are set to 1, then an odd number of red lights is never
observed to flash - either two of the three flash red or all three flash green.
All four outcomes are equally likely in each case (this particular detail is not
important).
We will discuss a real, physical system that exhibits this behavior later.
Let us set aside, for the moment, the 111 case and consider the 122, 212, and
221 cases in which just one detector is set to 1. Because an odd number of red
lights always flash in any of these three cases, whenever the switches are so set
we can predict with certainty what one of the three detectors will do in a run,
merely by noting what happens to the other two. For should the other two flash
the same color (RR or GG), then the third will have to flash red, but should
the other two flash different colors (RG or GR), then the third will have to flash
green.
Now we follow the path set out by EPR to draw an inference that will seem
inescapable. Along the way we will use the so-called EPR reality criterion.
Since there are no direct connections between the detectors, their behavior can
only be coordinated due to the fact that all three are triggered by particles that
came from a common source. This fact and this fact alone must contain the
explanation for why we can learn in advance what color will flash at a given
detector, say A, from measurements made far away at B and C. Information
telling the detector at A what color to flash in order to maintain the observed
consistency with the colors flashed at B and C must somehow be encoded in the
particle that triggers A. Since that particle could indeed have been coordinated
with the particles that triggered B and C when all three were back at their
common source, this explanation seems both inevitable and entirely reasonable.
We can apply this reasoning to any one of the three detectors (by moving it
farther from the source so that before it flashes we have had the opportunity
to observe what colors flash at the other two). We conclude that in each run
of the experiment each particle must be carrying to its detector instructions on
what color to flash, and that an odd number of the particles must specify red.
Thus, for a given choice of the switch settings (say 122) the particles heading for
detectors A, B, and C must respectively be carrying instructions RRR, RGG,
GRG, or GGR, but never GRR, RGR, RRG, or GGG.
Which of the four allowed groups of instructions they collectively carry is re-
vealed only when the lights flash. All of the above reasoning applies equally
well, of course, to 212 and 221 runs.
In the absence of connections between the detectors and the source, a particle
has no information about how the switch of its detector will be set until it ar-
rives there. Since in each run any detector might turn out to be the one set to
1 or one of the ones set to 2, to preserve the perfect record of always having
an odd number of red flashes in 122, 212, and 221 runs, it would seem to be
essential for each particle to be carrying instructions for how its detector should
flash for either of the two possible switch settings it might find upon arrival.
Since each of the three possible switch settings results in an odd number of red
flashes, this is indeed a legal set of instructions.
Since there are eight ways the lights can flash, namely,
It is not hard to enumerate all the legal (odd number of red flashes) instruction
sets.
First note that three of the six positions in a legal instruction set corresponding
to any one of the three choices 122, 212, or 221 for the switch settings, must
contain an odd number of Rs, since that particular setting might be encountered
in any run, and since only odd numbers of red flashes are ever observed. Thus,
the only possible entries for the positions corresponding to the switch settings
122 are (leaving blank the entries not relevant to those three settings):
so that 122 gives RRR, RGG, GRG, or GGR independent of the other entries.
We can next count the ways to fill in the blanks in these four forms so as to
produce the correct data for switch settings 221. Since each of the four already
specifies the color flashed at detector B for setting 2, namely, R G R G, to
ensure that any 221 run produces an odd number of red flashes there are only
two choices available for the other two (A and C) unspecified 221 entries for
each of the four forms: RR or GG if the specified entry is R and RG or GR if
the specified entry is G so that we have
This raises the number of possible forms to eight, each of which leaves only the
entry for setting 1 at the detector B unspecified. But that entry is now entirely
determined by the entries at settings 2 for detectors A and C (having to be R,
if the latter two entries are the same color and G, if they are different).
     ABC   ABC   ABC   ABC
1    RRR   RGG   GRG   GGR
2    RRR   RGG   GRG   GGR
                                 (2)
     ABC   ABC   ABC   ABC
1    RGG   RRR   GGR   GRG
2    GRR   GGG   RRG   RGR
They are arranged in the same horizontal order as the forms in (1), with the
two possibilities for each form placed directly above one another. It is easy to
check explicitly that each instruction set (2) does indeed give an odd number of
red flashes when a single detector is set to 1.
Now, finally, we consider the fourth type of run, in which all three detectors are
set to 1, and an odd number of red flashes is never observed.
The above instruction sets must determine the outcomes of these runs as well.
For who is to prevent somebody from flipping the two switches set to 2 over to
1, just before the particles arrive?
An inspection of the upper rows in (2) reveals that every one of the eight allowed
instruction sets results in an odd number of red flashes when all three switches
are set to 1.
If the instruction sets existed, then 111 runs would always have to produce an
odd number of red flashes. But they never do.
Thus, a single 111 run suffices all by itself to give data inconsistent with the
otherwise compelling inference of instruction sets.
Here the instruction sets (realistic theory) require an odd number of red flashes
in every 111 run, but quantum mechanics (experiment) prohibits an odd number
of red flashes in every 111 run.
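The counting argument above can be verified by brute force. The sketch below enumerates every conceivable instruction set (one color for each of the two switch settings at each of the three detectors), keeps those that give an odd number of red flashes for the 122, 212 and 221 settings, and then inspects what the survivors do in a 111 run. The data layout and function names are illustrative choices, not part of the original argument.

```python
from itertools import product

COLORS = "RG"
SINGLE_ONE_SETTINGS = [(1, 2, 2), (2, 1, 2), (2, 2, 1)]

def flashes(instruction_set, settings):
    """Colors flashed at detectors A, B, C for the given switch settings.

    instruction_set[d] is a pair (color at setting 1, color at setting 2)
    carried by the particle heading for detector d.
    """
    return [instruction_set[d][s - 1] for d, s in enumerate(settings)]

def red_count(instruction_set, settings):
    return flashes(instruction_set, settings).count("R")

def legal_instruction_sets():
    """All instruction sets giving an odd number of reds in 122, 212, 221 runs."""
    per_particle = list(product(COLORS, repeat=2))
    return [
        inst for inst in product(per_particle, repeat=3)
        if all(red_count(inst, s) % 2 == 1 for s in SINGLE_ONE_SETTINGS)
    ]

if __name__ == "__main__":
    legal = legal_instruction_sets()
    print("number of legal instruction sets:", len(legal))
    for inst in legal:
        row_1 = "".join(p[0] for p in inst)  # colors for setting 1
        row_2 = "".join(p[1] for p in inst)  # colors for setting 2
        print(row_1, row_2, "reds in a 111 run:", red_count(inst, (1, 1, 1)))
```

Each surviving set is printed as its setting-1 row followed by its setting-2 row, so the output can be compared directly with the eight instruction sets listed in the text.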
Something is wrong with the EPR idea of instruction sets or EPR reality.
Let us describe a spin state that produces the remarkable correlations (GHZ-
state) described earlier.
We measure angular momentum for each particle in units of ħ/2 so that the spin
operators for each particle can be taken to be the Pauli matrices. Now consider
the three commuting Hermitian operators
$$\sigma_x^a \sigma_y^b \sigma_y^c, \quad \sigma_y^a \sigma_x^b \sigma_y^c, \quad \sigma_y^a \sigma_y^b \sigma_x^c \tag{16.130}$$
They commute because all pairs of the six spin operators out of which they are
constructed commute, except for those associated with the x and y components
of the spin of a single particle, which anticommute. This does not cause any
trouble, however, because converting the product in one order to the product
in the other order always involves an even number of such anticommuting
exchanges.
Being commuting and Hermitian, the three operators above can be provided
with simultaneous eigenvectors. Since the square of each operator is the identity,
the eigenvalues of each can only be ±1.
The actual spin state that produces the remarkable correlations (the
Greenberger-Horne-Zeilinger or GHZ-state) is described by
$$|GHZ\rangle = \frac{1}{\sqrt{2}} \left( |1, 1, 1\rangle - |-1, -1, -1\rangle \right) \tag{16.131}$$
where ±1 specifies spin-up or spin-down along the appropriate z-axis.
For simplicity in the following argument, here we pick the state with all three
eigenvalues equal to +1, which preserves the symmetry among the particles. The
argument works for any such symmetric state and for any linear combination of
such states as in the above state.
Since the components of the spin vectors of different particles commute, we can
simultaneously measure the x component for one particle and the y components
for the other two. Because the spin state is an eigenvector of all three of the
operators
$$\sigma_x^a \sigma_y^b \sigma_y^c, \quad \sigma_y^a \sigma_x^b \sigma_y^c, \quad \sigma_y^a \sigma_y^b \sigma_x^c$$
with eigenvalue +1, the product of the results of each of the three single spin
measurements has to be +1, regardless of which particle we pick for the x spin
measurement. Since +1 flashes red and −1 flashes green, there must indeed be
an even number of green flashes and thus an odd number of red flashes.
What about the result of three x-spin measurements, declared earlier never to
result in an odd number of red flashes? Translating this into spin language tells
us that the product of the three results must always be −1. The Hermitian
operator corresponding to that product is
$$\sigma_x^a \sigma_x^b \sigma_x^c \tag{16.132}$$
so for the declaration to be correct, it must be that the eigenvector of the first
three operators with eigenvalue +1 is also an eigenvector of the last operator
(above) with eigenvalue −1.
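This is easily confirmed. Indeed, the last operator is just minus the product of
the other three operators
$$\sigma_x^a \sigma_x^b \sigma_x^c = -\left( \sigma_x^a \sigma_y^b \sigma_y^c \right) \left( \sigma_y^a \sigma_x^b \sigma_y^c \right) \left( \sigma_y^a \sigma_y^b \sigma_x^c \right) \tag{16.133}$$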
The consequence of the EPR reality criterion specified earlier, if translated into
quantum theoretic terminology, would also assert that the state was an eigenvector
of the operator $\sigma_x^a \sigma_x^b \sigma_x^c$, but with the wrong eigenvalue. In this sense, the
GHZ experiment provides the strongest possible contradiction between quan-
tum mechanics and the EPR reality criterion.
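The eigenvalue statements above can be confirmed with a few lines of code. The sketch below applies a product of single-particle Pauli operators to the GHZ state written in the z basis, using the standard actions σ_x|s⟩ = |−s⟩ and σ_y|s⟩ = i s|−s⟩ for s = ±1. The dictionary representation of the state and the function names are illustrative assumptions.

```python
import math

def apply_pauli_string(paulis, state):
    """Apply a tensor product of Pauli operators, e.g. "xyy", to a state.

    state maps basis tuples of +1/-1 (z spin up/down for particles a, b, c)
    to complex amplitudes.
    """
    result = {}
    for basis, amplitude in state.items():
        new_basis = list(basis)
        phase = 1
        for k, (pauli, s) in enumerate(zip(paulis, basis)):
            if pauli == "x":      # sigma_x |s> = |-s>
                new_basis[k] = -s
            elif pauli == "y":    # sigma_y |s> = i s |-s>
                new_basis[k] = -s
                phase *= 1j * s
            elif pauli == "z":    # sigma_z |s> = s |s>
                phase *= s
        key = tuple(new_basis)
        result[key] = result.get(key, 0) + phase * amplitude
    return result

def pauli_string_eigenvalue(paulis, state, tol=1e-12):
    """Return the eigenvalue if state is an eigenvector, otherwise None."""
    image = apply_pauli_string(paulis, state)
    reference = next(iter(state))
    eigenvalue = image.get(reference, 0) / state[reference]
    for key in set(state) | set(image):
        if abs(image.get(key, 0) - eigenvalue * state.get(key, 0)) > tol:
            return None
    return eigenvalue

GHZ = {(1, 1, 1): 1 / math.sqrt(2), (-1, -1, -1): -1 / math.sqrt(2)}

if __name__ == "__main__":
    for paulis in ("xyy", "yxy", "yyx", "xxx"):
        print(paulis, pauli_string_eigenvalue(paulis, GHZ))
```

The first three operators return +1 and the last returns −1, which is the combination that no set of pre-assigned ±1 values can reproduce.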
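Alternatively, we can say it this way. We may measure, on each particle, either
$\sigma_x$ or $\sigma_y$, without disturbing the other particles. The results of these
measurements will be called $m_x$ or $m_y$, respectively. From
$$\sigma_x^a \sigma_y^b \sigma_y^c |111\rangle = |111\rangle$$
$$\sigma_y^a \sigma_x^b \sigma_y^c |111\rangle = |111\rangle$$
$$\sigma_y^a \sigma_y^b \sigma_x^c |111\rangle = |111\rangle$$
and
$$\sigma_x^a \sigma_x^b \sigma_x^c |111\rangle = -|111\rangle$$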
we can predict with certainty that, if the three $\sigma_x$ are measured, the results
satisfy
$$m_x^a m_x^b m_x^c = -1 \tag{16.134}$$
Therefore, each of the operators $\sigma_x^a$, $\sigma_x^b$ and $\sigma_x^c$ corresponds to an EPR element
of reality, because its value can be predicted with certainty by performing mea-
surements on the two other, distant particles.
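Similarly, from
$$\sigma_x^a \sigma_y^b \sigma_y^c |111\rangle = |111\rangle$$
$$\sigma_y^a \sigma_x^b \sigma_y^c |111\rangle = |111\rangle$$
$$\sigma_y^a \sigma_y^b \sigma_x^c |111\rangle = |111\rangle$$
we can predict with certainty that
$$m_x^a m_y^b m_y^c = +1 \tag{16.135}$$
$$m_y^a m_x^b m_y^c = +1 \tag{16.136}$$
and
$$m_y^a m_y^b m_x^c = +1 \tag{16.137}$$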
The product of the last four results gives
$$m_x^a m_x^b m_x^c \; m_x^a m_y^b m_y^c \; m_y^a m_x^b m_y^c \; m_y^a m_y^b m_x^c = -1 \tag{16.138}$$
$$(m_x^a)^2 (m_x^b)^2 (m_x^c)^2 (m_y^b)^2 (m_y^c)^2 (m_y^a)^2 = -1 \tag{16.139}$$
But,
$$(m_x^j)^2 = 1 \tag{16.140}$$
so that we get a contradiction.
There is a tacit assumption in the above argument, that $m_x^a$ in $m_x^a m_x^b m_x^c = -1$
is the same as $m_x^a$ in $m_x^a m_y^b m_y^c = +1$, in spite of the fact that these two
ways of obtaining $m_x^a$ involve mutually exclusive experiments - measuring $\sigma_x^b$
and $\sigma_x^c$ or measuring $\sigma_y^b$ and $\sigma_y^c$.
which is totally destructive of the possibility of these instruction sets, comes
from the fact that in working out the identity it is necessary to interchange the
anticommuting operators $\sigma_x^b$ and $\sigma_y^b$ in order to get rid of all the y components
(using $(\sigma_y^i)^2 = 1$) and be left with a product of three x components. It is
only that one instance of uncompensated anticommutation that produces the
conclusion so devastating to the hypothesis of instruction sets.
This is extremely pleasing, for it is just the fact that the x and y components of the
spin of a single particle do not commute, which leads the well-educated quantum
mechanician to reject from the start the inference of instruction sets (which have
to specify the value of both of these non-commuting observables), making it
necessary for me to disguise what was going on earlier so that you would not
have dismissed this discussion as rubbish before reaching the interesting part.
Let us return now to using Bayesian ideas to convince our realist friends about
the validity of quantum mechanics within the context of the Bell inequalities.
We have three distant observers examine the three subsystems. The first observer
has the choice of two tests. The first test can give two different results
that we label a = ±1, and likewise the other test yields a′ = ±1. Symbols b, b′,
c and c′ are similarly defined for the other two observers. Any possible values
of their results satisfy
$$a'bc + ab'c + abc' - a'b'c' = \pm 2 \tag{16.142}$$
Mermin has then shown that we have the inequality
$$-2 \le \langle a'bc + ab'c + abc' - a'b'c' \rangle \le +2 \tag{16.143}$$
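The bound in Mermin's inequality can be checked by exhaustive enumeration. The sketch below runs over all 64 assignments of definite values ±1 to the six quantities a, a′, b, b′, c, c′ and collects the values taken by the combination a′bc + ab′c + abc′ − a′b′c′; the function name and argument order are illustrative.

```python
from itertools import product

def mermin_combination(a, a_prime, b, b_prime, c, c_prime):
    """a'bc + ab'c + abc' - a'b'c' for definite values of all six results."""
    return (a_prime * b * c + a * b_prime * c
            + a * b * c_prime - a_prime * b_prime * c_prime)

def possible_values():
    """Set of values of the combination over every +1/-1 assignment."""
    return {mermin_combination(*values)
            for values in product((+1, -1), repeat=6)}

if __name__ == "__main__":
    print("values for pre-assigned results:", sorted(possible_values()))
```

Since every assignment gives either +2 or −2, any average over runs with pre-assigned values must lie between −2 and +2.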
As we saw above quantum mechanics makes a very simple prediction for the
GHZ state: there are well chosen tests that give with certainty
$$a'bc = ab'c = abc' = -a'b'c' = +1 \tag{16.144}$$
It is important to remember that performing any such test can verify the value
1 for only one of these products (at a time) since each product corresponds to
a different experimental setup.
If, however, we take all these results together, they manifestly conflict with
$a'bc + ab'c + abc' - a'b'c' = \pm 2$.
Many physicists have erroneously, at this point, stated that a single experi-
ment is sufficient to invalidate local realism. This is sheer nonsense: a single
experiment can only verify one occurrence of one of the terms in
$$a'bc = ab'c = abc' = -a'b'c' = +1 \tag{16.145}$$
What does our realist friend think?
He believes that, in each experimental run, each term in the above result has a
definite value even if that term is not actually measured in that run.
We ask him to propose a rule giving the average values of the products in
$$a'bc + ab'c + abc' - a'b'c' = \pm 2 \tag{16.146}$$
Suppose that he assumes
$$\langle a'bc \rangle = \langle ab'c \rangle = \langle abc' \rangle = -\langle a'b'c' \rangle = 0.5 \tag{16.147}$$
This clearly attains the right hand side of Mermin's inequality. This assumption
then leads to the prediction that if we measure a′bc we shall find the result
1 (that is, yes) in 75% of the cases and the opposite result in 25% and likewise
for the other tests. This simply corresponds to the averages proposed above.
16.10 Problems
16.10.1 Bell Inequality with Stern-Gerlach
A pair of spin-1/2 particles is produced by a source. The spin state of each
particle can be measured using a Stern-Gerlach apparatus (see diagram below).
$$\sigma^{(1)} = \frac{2}{\hbar}\, \hat{n}_1 \cdot \vec{S}_1, \qquad \sigma^{(2)} = \frac{2}{\hbar}\, \hat{n}_2 \cdot \vec{S}_2$$
corresponding to the spin component of each particle along the direction
of the Stern-Gerlach apparatus associated with it. What are the possible
values resulting from measurement of these observables and what are the
corresponding eigenstates?
(b) Consider the observable $\sigma^{(12)} = \sigma^{(1)} \otimes \sigma^{(2)}$ and write down its eigenvectors
and eigenvalues. Assume that the pair of particles is produced in the
singlet state
(c) Make the assumption that the spin of a particle has a meaningful value
even when it is not being measured. Assume also that the only possible
results of the measurement of a spin component are ±ħ/2. Then show that
the probability of finding the spins pointing in two given directions will be
proportional to the overlap of the hemispheres that these two directions
define. Quantify this criterion and calculate the expectation value of $\sigma^{(12)}$.
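(d) Assume the spin variables depend on a hidden variable λ. The expectation
value of the spin observable $\sigma^{(12)}$ is determined in terms of the normalized
distribution function f(λ):
$$\left\langle \sigma^{(12)} \right\rangle = \frac{4}{\hbar^2} \int d\lambda\, f(\lambda)\, S_z^{(1)}(\lambda)\, S_\phi^{(2)}(\lambda)$$
Prove Bell's inequality
$$\left| \left\langle \sigma^{(12)}(\phi) \right\rangle - \left\langle \sigma^{(12)}(\phi') \right\rangle \right| \le 1 + \left\langle \sigma^{(12)}(\phi - \phi') \right\rangle$$
(e) Consider Bell's inequality for φ′ = 2φ and show that it is not true when
applied in the context of quantum mechanics.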
where i and j take on the values + and −, where λ signifies the so-called hidden
variables, and where ρ(λ) is a weight function. This equation is called the
separable form.
where
It is required that
(a) $\rho(\lambda) \ge 0$
(b) $\int d\lambda\, \rho(\lambda) = 1$
(1) Show that the above classical realist assumptions imply that |B| ≤ 2
(2) Show that quantum mechanics predicts that $C(a, b) = 2(a \cdot b)^2 - 1$
(3) Show that the maximum value of the Bell coefficient is $2\sqrt{2}$ according to
quantum mechanics
(4) Cast the quantum mechanical expression for C(a, b) into a separable form.
Which of the classical requirements, (a), (b), or (c) above is violated?
(a) Calculate the relative frequencies of the coincidences R(up, up), R(up, down),
R(down, up) and R(down, down), as a function of θ, the angle between a
and b.
(c) Given two possible directions, a and a′, for one measurement, and two
possible directions, b and b′, for the other, deduce the maximum possible
value of the Bell coefficient, defined by
(d) Show that this prediction of quantum mechanics is inconsistent with clas-
sical local realism.
16.10.4 Greenberger-Horne-Zeilinger State
The Greenberger-Horne-Zeilinger (GHZ) state of three identical spin-1/2 particles
is defined by
$$|GHZ\rangle = \frac{1}{\sqrt{2}} \left( |z_a+\rangle |z_b+\rangle |z_c+\rangle - |z_a-\rangle |z_b-\rangle |z_c-\rangle \right)$$
where $|z_a+\rangle$ is the eigenvector of the z-component of the spin operator of particle
a belonging to eigenvalue +ħ/2 (z-spin up), $|z_a-\rangle$ is the eigenvector of the
z-component of the spin operator of particle a belonging to eigenvalue −ħ/2
(z-spin down), and similarly for b and c. Show that, if spin measurements are
made on the three particles in the x- or y-directions,
(a) the product of three spins in the x-direction is always −ħ³/8
(b) the product of two spins in the y-direction and one spin in the x-direction
is always +ħ³/8
(c) Consider a prize game for a team of three players, A, B, and C. The players
are told that they will be separated from one another and that each will
be asked one of two questions, say X or Y, to which each must give one of
two allowed answers, namely, +1 or −1. Moreover, either
(a) all players will be asked the same question X
or
(b) one of the three players will be asked X and the other two Y
After having been asked X or Y, no player may communicate with the
others until after all three players have given their answers, +1 or −1. To
win the game, the players must give answers such that, in case (a) the
product of the three answers is −1, whereas in case (b) the product of the
three answers is +1.
(a) Show that no classical strategy gives certainty of a win for the team
(b) Show that a quantum strategy, in which each player may take one of
the GHZ particles with her, exists for which a win is certain
Chapter 17
In 1933, Dirac made the observation that the action plays a central role in
classical mechanics (he considered the Lagrangian formulation of classical me-
chanics to be more fundamental than the Hamiltonian formulation), but that
it seemed to have no important role in quantum mechanics as it was known at
the time. He speculated on how the situation might be rectified, and he arrived
at the conclusion that (in more modern language) the propagator in quantum
mechanics corresponds to
$$\exp\left( i \frac{S}{\hbar} \right) \tag{17.1}$$
where S is the classical action evaluated along the classical path.
17.2 Motivation
What do we learn from path integrals? Path integrals give us no dramatic new
results in the quantum mechanics of a single particle. In fact, most, if not all,
calculations in quantum mechanics which can be done by path integrals can be
done with considerably greater ease using the standard formulation of quantum
mechanics. So why all the fuss?
It turns out that path integrals are considerably more useful in more complicated
situations, such as quantum field theory. Even if this were not the case, however,
path integrals give a very worthwhile contribution to our understanding of quan-
tum mechanics.
First, path integrals provide a physically extremely appealing and intuitive way
of viewing quantum mechanics: anyone who can understand Young's double slit
experiment in optics should be able to understand the underlying ideas behind
path integrals.
It is easy to get a formal expression for this amplitude in the usual Schrödinger
formulation of quantum mechanics. Let us introduce the eigenstates of the
position operator $\hat{q}$, which form a complete orthonormal set:
$$\hat{q}\, |q\rangle = q\, |q\rangle, \qquad \langle q' | q \rangle = \delta(q' - q), \qquad \int dq\, |q\rangle \langle q| = 1 \tag{17.3}$$
(except where otherwise noted, ħ will be set to 1). This object, for obvious
reasons, is known as the propagator from the initial space-time point (q, 0) to
the final point (q′, T). Clearly, the propagator is independent of the origin of
time:
Let us separate the time evolution in the above amplitude into two smaller time
evolutions, writing
$$A = \int dq_1\, \langle q'| e^{-iH(T - t_1)} |q_1\rangle \langle q_1| e^{-iHt_1} |q\rangle = \int dq_1\, K(q', T; q_1, t_1)\, K(q_1, t_1; q, 0) \tag{17.9}$$
This formula is none other than an expression of the quantum mechanical rule
for combining amplitudes. If a process can occur a number of ways, the amplitudes
for each of these ways add. A particle, in propagating from q to q′,
must be somewhere at an intermediate time $t_1$. Labelling that intermediate
position $q_1$, we compute the amplitude for propagation via the point $q_1$ (this
is the product of two propagators as in equation (17.9)) and integrate over all
possible intermediate positions. This result is reminiscent of Young's double
slit experiment where the amplitudes for passing through each of the two slits
combine and interfere. We will look at this example in more detail later.
We can repeat the division of the time interval T. Let us divide it up into a
large number N of time intervals of duration ε = T/N. Then we can write the
propagator as
$$A = \langle q'| \left( e^{-iH\epsilon} \right)^N |q\rangle = \langle q'| \underbrace{e^{-iH\epsilon} e^{-iH\epsilon} \cdots e^{-iH\epsilon}}_{N \text{ times}} |q\rangle \tag{17.10}$$
We can again insert a complete set of states (identity operator) between each
exponential, which gives
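$$A = \langle q'| e^{-iH\epsilon} \int dq_{N-1}\, |q_{N-1}\rangle \langle q_{N-1}| e^{-iH\epsilon} \int dq_{N-2}\, |q_{N-2}\rangle \langle q_{N-2}| \cdots \int dq_2\, |q_2\rangle \langle q_2| e^{-iH\epsilon} \int dq_1\, |q_1\rangle \langle q_1| e^{-iH\epsilon} |q\rangle$$
$$= \int dq_1 \cdots dq_{N-1}\, \langle q'| e^{-iH\epsilon} |q_{N-1}\rangle \langle q_{N-1}| e^{-iH\epsilon} |q_{N-2}\rangle \cdots \langle q_1| e^{-iH\epsilon} |q\rangle$$
$$= \int dq_1 \cdots dq_{N-1}\, K_{q_N, q_{N-1}} K_{q_{N-1}, q_{N-2}} \cdots K_{q_2, q_1} K_{q_1, q_0}$$
Apart from the mathematical details concerning the limit N → ∞, this is clearly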
going to become a sum over all possible paths of the amplitude for each path:
$$A = \sum_{\text{paths}} A_{\text{path}} \tag{17.11}$$
where
$$\sum_{\text{paths}} = \int dq_1 \cdots dq_{N-1}, \qquad A_{\text{path}} = K_{q_N, q_{N-1}} K_{q_{N-1}, q_{N-2}} \cdots K_{q_2, q_1} K_{q_1, q_0} \tag{17.12}$$
Let us look at this expression in detail.
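$$-i\epsilon\, \langle q_{j+1}| H |q_j\rangle = -i\epsilon\, \langle q_{j+1}| \left( \frac{\hat{p}^2}{2m} + V(\hat{q}) \right) \int \frac{dp_j}{2\pi}\, |p_j\rangle \langle p_j | q_j \rangle$$
$$= -i\epsilon \int \frac{dp_j}{2\pi}\, \langle q_{j+1}| \left( \frac{\hat{p}^2}{2m} + V(\hat{q}) \right) |p_j\rangle \langle p_j | q_j \rangle$$
$$= -i\epsilon \int \frac{dp_j}{2\pi}\, \langle q_{j+1}| \left( \frac{p_j^2}{2m} + V(q_{j+1}) \right) |p_j\rangle \langle p_j | q_j \rangle$$
$$= -i\epsilon \int \frac{dp_j}{2\pi} \left( \frac{p_j^2}{2m} + V(q_{j+1}) \right) \langle q_{j+1} | p_j \rangle \langle p_j | q_j \rangle$$
$$= -i\epsilon \int \frac{dp_j}{2\pi} \left( \frac{p_j^2}{2m} + V(q_{j+1}) \right) e^{i p_j (q_{j+1} - q_j)} \tag{17.16}$$
where we used $\langle q | p \rangle = \exp(ipq)$. Note that the operator $\hat{p}$ acted to the right
and the operator $V(\hat{q})$ acted to the left.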
is
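$$K_{q_{j+1}, q_j} = \int \frac{dp_j}{2\pi}\, e^{i p_j (q_{j+1} - q_j)} \left( 1 - i\epsilon \left( \frac{p_j^2}{2m} + V(q_j) \right) + O(\epsilon^2) \right)$$
$$= \int \frac{dp_j}{2\pi}\, e^{i p_j (q_{j+1} - q_j)}\, e^{-i\epsilon H(p_j, q_j)} \left( 1 + O(\epsilon^2) \right) \tag{17.17}$$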
There are N such factors in the amplitude. Combining them, and writing
$\dot{q}_j = (q_{j+1} - q_j)/\epsilon$, we get
$$A_{\text{path}} = \int \prod_{j=0}^{N-1} \frac{dp_j}{2\pi}\, \exp\left[ i\epsilon \sum_{j=0}^{N-1} \left( p_j \dot{q}_j - H(p_j, q_j) \right) \right] \tag{17.18}$$
where we have neglected a multiplicative factor of the form $(1 + O(\epsilon^2))^N$ which
will tend toward one in the continuum limit. Then the propagator becomes
$$K = \int dq_1 \cdots dq_{N-1}\, A_{\text{path}} = \int \prod_{j=1}^{N-1} dq_j \int \prod_{j=0}^{N-1} \frac{dp_j}{2\pi}\, \exp\left[ i\epsilon \sum_{j=0}^{N-1} \left( p_j \dot{q}_j - H(p_j, q_j) \right) \right] \tag{17.19}$$
Note that there is one momentum integral for each interval (N total), while
there is one position integral for each intermediate position (N − 1 total).
This result is known as the phase-space path integral. The integral is viewed as
over all functions p(t) and over all functions q(t) where q(0) = q, q(T) = q′. But
to actually perform an explicit calculation, equation (17.20) should be viewed as
a shorthand notation for the expression equation (17.19) in the limit N → ∞.
If, as is often the case (and we have assumed in deriving the above expression),
the Hamiltonian is of the standard form, namely
$$H = \frac{p^2}{2m} + V(q) \tag{17.21}$$
we can actually carry out the momentum integrals in equation(17.19). We can
rewrite this expression as
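$$K = \int \prod_{j=1}^{N-1} dq_j\, \exp\left[ -i\epsilon \sum_{j=0}^{N-1} V(q_j) \right] \int \prod_{j=0}^{N-1} \frac{dp_j}{2\pi}\, \exp\left[ i\epsilon \sum_{j=0}^{N-1} \left( p_j \dot{q}_j - \frac{p_j^2}{2m} \right) \right] \tag{17.22}$$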
The p integrals are all Gaussian, and they are uncoupled. One such integral is
$$\int \frac{dp}{2\pi}\, e^{i\epsilon (p\dot{q} - p^2/2m)} = \sqrt{\frac{m}{2\pi i \epsilon}}\; e^{i\epsilon m \dot{q}^2/2} \tag{17.23}$$
The careful reader may be worried about the convergence of this integral. If so,
a factor $\exp(-\delta p^2)$ can be introduced and the limit δ → 0 taken at the end (see
for example Chapter 1 - page 21).
This is our final result and is known as the configuration space path integral.
Again, equation(17.25) should be viewed as a notation for the more precise
expression equation (17.24), as N → ∞.
17.3.2 Examples
To solidify the above notions, let us consider a few explicit examples. As a
first example, we will compute the free particle propagator first using ordinary
quantum mechanics and then via PI. We will then mention some generalizations
which can be done in a similar manner.
Free Particle
Let us compute the propagator K(q′, T; q, 0) for a free particle, described by the
Hamiltonian
$$H = \frac{p^2}{2m} \tag{17.26}$$
The propagator can be computed straightforwardly using ordinary quantum
mechanics. To this end, we write
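$$K = \langle q'| e^{-iHT} |q\rangle = \langle q'| e^{-i\hat{p}^2 T/2m} \int \frac{dp}{2\pi}\, |p\rangle \langle p | q \rangle$$
$$= \int \frac{dp}{2\pi}\, e^{-ip^2 T/2m}\, \langle q' | p \rangle \langle p | q \rangle = \int \frac{dp}{2\pi}\, e^{-ip^2 T/2m + i(q' - q)p} \tag{17.27}$$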
The integral is a Gaussian. We obtain
$$K = \left( \frac{m}{2\pi i T} \right)^{1/2} e^{im(q' - q)^2/2T} \tag{17.28}$$
Let us see how the same result can be obtained using PIs. The configuration
space PI (equation(17.25)) is
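$$K = \lim_{N \to \infty} \left( \frac{m}{2\pi i \epsilon} \right)^{N/2} \int \prod_{j=1}^{N-1} dq_j\, \exp\left[ i\epsilon\, \frac{m}{2} \sum_{j=0}^{N-1} \left( \frac{q_{j+1} - q_j}{\epsilon} \right)^2 \right]$$
$$= \lim_{N \to \infty} \left( \frac{m}{2\pi i \epsilon} \right)^{N/2} \int \prod_{j=1}^{N-1} dq_j\, \exp\left[ i\, \frac{m}{2\epsilon} \left( (q_N - q_{N-1})^2 + (q_{N-1} - q_{N-2})^2 + \cdots + (q_2 - q_1)^2 + (q_1 - q_0)^2 \right) \right] \tag{17.29}$$
where $q_0 = q$ and $q_N = q'$ are the initial and final points. The integrals are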
Gaussian and can be evaluated exactly, although the fact that they are coupled
complicates matters significantly. The result is
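$$K = \lim_{N \to \infty} \left( \frac{m}{2\pi i \epsilon} \right)^{N/2} \frac{1}{\sqrt{N}} \left( \frac{2\pi i \epsilon}{m} \right)^{(N-1)/2} e^{im(q' - q)^2/2N\epsilon}$$
$$= \lim_{N \to \infty} \left( \frac{m}{2\pi i N \epsilon} \right)^{1/2} e^{im(q' - q)^2/2N\epsilon} \tag{17.30}$$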
A couple of remarks are in order. First, we can write the argument of the
exponential as
\[ \frac{1}{2} \, T m \left( \frac{q' - q}{T} \right)^2 \tag{17.32} \]
which is just the action $S(q_{classical})$ for a particle moving along the classical
path (a straight line in this case) between the initial and final points.
This result typifies a couple of important features of calculations in this subject,
which we will see repeatedly as we continue the discussion. First, the propagator
separates into two factors, one of which is the phase $e^{iS(q_c)/\hbar}$. Second, calculations
in the PI formalism are typically quite a bit lengthier than those using the standard
techniques of quantum mechanics.
Harmonic Oscillator
As a second example of the computation of a PI, let us show how to compute
the propagator for the harmonic oscillator using this method.
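\[ S(q_c(t) + y(t)) = \int_0^T dt \left( \frac{1}{2} m \dot{q}_c^2 - \frac{1}{2} m\omega^2 q_c^2 \right) + \underbrace{(\text{linear in } y)}_{=0} + \int_0^T dt \left( \frac{1}{2} m \dot{y}^2 - \frac{1}{2} m\omega^2 y^2 \right) \tag{17.37} \]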
We substitute this into equation(17.25) and obtain
\[ K(q', T; q, 0) = e^{iS(q_c(t))} \int \mathcal{D}y(t) \, e^{iS(y(t))} \tag{17.39} \]
As mentioned above, the paths y(t) over which we integrate go from y(0) = 0
to y(T ) = 0. The only appearance of the initial and final positions is in the
classical path, i.e., in the classical action. Once again, the PI separates into
two factors. The first is written in terms of the action of the classical path and
the second is a PI over deviations from this classical path. The second factor is
independent of the initial and final points.
This separation into a factor depending on the action of the classical path and
a second one, a PI which is independent of the details of the classical path is a
recurring theme and an important one. Indeed, it is often the first factor which
contains most of the useful information contained in the propagator and it can
be deduced without even performing a PI. It can be said that much of the work
in the game of path integrals consists in avoiding having to compute one!
This innocent looking expression tells us something which is at first glance un-
believable and at second glance really unbelievable. The first glance observation
is that a particle, in going from one position to another, takes all possible paths
between these two positions. This is, if not actually unbelievable, at the very
least counter-intuitive. We could, however, argue away much of what makes us
feel uneasy if we could convince ourselves that while all paths contribute, the
classical path is the dominant one.
How are we to reconcile this really unbelievable conclusion with the fact that a
ball thrown in the air has a more-or-less parabolic motion?
The key, not surprisingly, is in how the different paths interfere with one an-
other. By considering the case where the rough scale of classical action is much
bigger than the quantum of action, ~, we will see the emergence of the Principle
of Least Action.
Consider two neighboring paths $q(t)$ and $q'(t)$ which contribute to the PI (see
Figure 17.2 below).
Let $q'(t) = q(t) + \eta(t)$, with $\eta(t)$ small. Then we can write the action as a
functional Taylor expansion about the classical path.
If you are not familiar with the manipulation of functionals (functions of functions)
do not despair - the only rule needed beyond standard calculus is the
functional derivative
\[ \frac{\delta q(t)}{\delta q(t')} = \delta(t - t') \tag{17.42} \]
where the last function is the Dirac delta function.
We have
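\[ S(q') = S(q + \eta) = S(q) + \int dt \, \eta(t) \frac{\delta S(q)}{\delta q(t)} + O(\eta^2) \tag{17.43} \]
The two paths contribute $\exp(iS(q(t))/\hbar)$ and $\exp(iS(q'(t))/\hbar)$ to the PI. The combined
contribution is
\[ A \simeq e^{iS(q)/\hbar} \left( 1 + \exp\left\{ \frac{i}{\hbar} \int dt \, \eta(t) \frac{\delta S(q)}{\delta q(t)} \right\} \right) \tag{17.44} \]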
This argument must be rethought, however, for one exceptional path - the path
which extremizes the action, i.e., the classical path qc (t). For this path
Thus, the classical path and a very close neighbor will have actions which differ
by much less than two randomly chosen but equally close paths (see Figure 17.3
below).
This means that for fixed closeness of two paths and for fixed ~, paths near the
classical path will on average interfere constructively (small phase difference)
whereas for random paths the interference will be on average destructive.
Thus, heuristically, we conclude that if the problem is classical (action $\gg \hbar$),
the most important contribution to the PI comes from the region around the
path which extremizes the action. In other words, the particle's motion is governed
by the principle that the action is stationary. This, of course, is none other
than the Principle of Least Action from which the Euler-Lagrange equations of
classical mechanics are derived.
Figure 17.4: Aharonov-Bohm effect. Magnetic flux is confined within the shaded
area. Particles are excluded from this area by a perfect shield.
Aharonov and Bohm proposed that such an experiment be performed with
charged particles. The setup had an added twist. A magnetic field from which
the particles are perfectly shielded exists in between the two slits. If we per-
form the experiment first with no magnetic flux and then with a nonzero and
arbitrary flux passing through the shielded region, the interference pattern will
change, in spite of the fact that the particles are perfectly shielded from the mag-
netic field and feel no electric or magnetic force whatsoever.
Consider first two representative paths q1 (t) and q2 (t) (in two dimensions) pass-
ing through slits 1 and 2 respectively, and which arrive at the same spot on the
screen as shown in Figure 17.5 below.
Figure 17.5: Two representative paths contributing to the amplitude for a given
point on the screen
Before turning on the magnetic field, let us suppose that the actions for these
paths are S(q1 ) and S(q2 ). Then the interference of the amplitudes is deter-
mined by
\[ e^{iS(q_1)/\hbar} + e^{iS(q_2)/\hbar} = e^{iS(q_1)/\hbar} \left( 1 + e^{i(S(q_2) - S(q_1))/\hbar} \right) \tag{17.47} \]
\[ L(\dot{\vec{q}}, \vec{q}) \to L'(\dot{\vec{q}}, \vec{q}) = L(\dot{\vec{q}}, \vec{q}) - \frac{e}{c} \, \vec{v} \cdot \vec{A}(\vec{q}) \tag{17.49} \]
Thus, the action changes by
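\[ -\frac{e}{c} \int dt \, \vec{v} \cdot \vec{A}(\vec{q}) = -\frac{e}{c} \int dt \, \frac{d\vec{q}(t)}{dt} \cdot \vec{A}(\vec{q}) = -\frac{e}{c} \int d\vec{q}(t) \cdot \vec{A}(\vec{q}) \]
The integral
\[ \int d\vec{q}(t) \cdot \vec{A}(\vec{q}) \tag{17.50} \]
is the line integral of $\vec{A}$ along the path taken by the particle. So including
the effect of the magnetic field, the action of the first path is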
\[ S'(\vec{q}_1) = S(\vec{q}_1) - \frac{e}{c} \int_{\vec{q}_1} d\vec{q}(t) \cdot \vec{A}(\vec{q}) \tag{17.51} \]
Let us now look at the interference between the two paths, including the mag-
netic field.
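\[ e^{iS'(\vec{q}_1)/\hbar} + e^{iS'(\vec{q}_2)/\hbar} = e^{iS'(\vec{q}_1)/\hbar} \left( 1 + e^{i(S'(\vec{q}_2) - S'(\vec{q}_1))/\hbar} \right) = e^{iS'(\vec{q}_1)/\hbar} \left( 1 + e^{i\phi'_{12}} \right) \tag{17.52} \]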
where the new relative phase is
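\[ \phi'_{12} = \phi_{12} - \frac{e}{\hbar c} \left[ \int_{\vec{q}_2} d\vec{q}(t) \cdot \vec{A}(\vec{q}) - \int_{\vec{q}_1} d\vec{q}(t) \cdot \vec{A}(\vec{q}) \right] \tag{17.53} \]
The difference of the two line integrals is the line integral of $\vec{A}$ around the closed
loop formed by the two paths, which equals $\Phi$,
where $\Phi$ is the flux inside the closed loop bounded by the two paths. So we can
write
\[ \phi'_{12} = \phi_{12} - \frac{e\Phi}{\hbar c} \tag{17.55} \]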
It is important to note that the change of relative phase due to the magnetic field
is independent of the details of the two paths, as long as each passes through
the corresponding slit. This means that the PI expression for the amplitude
for the particle to reach a given point on the screen is affected by the magnetic
field in a particularly clean way. Before the magnetic field is turned on, we may
write A = A1 + A2 , where
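\[ A_1 = \int_{\text{slit 1}} \mathcal{D}\vec{q} \; e^{iS(\vec{q})/\hbar} \tag{17.56} \]
and similarly for $A_2$. Including the magnetic field,
\[ A'_1 = \int_{\text{slit 1}} \mathcal{D}\vec{q} \; e^{i\left( S(\vec{q}) - \frac{e}{c} \int d\vec{q} \cdot \vec{A} \right)/\hbar} = e^{-\frac{ie}{\hbar c} \int_1 d\vec{q} \cdot \vec{A}} \, A_1 \tag{17.57} \]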
where we have pulled the line integral out of the PI since it is the same for all
paths passing through slit 1 arriving at the point on the screen under consider-
ation. So the amplitude is
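\[ A = e^{-\frac{ie}{\hbar c} \int_1 d\vec{q} \cdot \vec{A}} A_1 + e^{-\frac{ie}{\hbar c} \int_2 d\vec{q} \cdot \vec{A}} A_2 = e^{-\frac{ie}{\hbar c} \int_1 d\vec{q} \cdot \vec{A}} \left( A_1 + e^{-\frac{ie}{\hbar c} \oint d\vec{q} \cdot \vec{A}} A_2 \right) \]
\[ = e^{-\frac{ie}{\hbar c} \int_1 d\vec{q} \cdot \vec{A}} \left( A_1 + e^{-ie\Phi/\hbar c} A_2 \right) \tag{17.58} \]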
The overall phase is irrelevant and the interference pattern is influenced directly
by the phase $e\Phi/\hbar c$. If we vary the phase continuously (by varying the magnetic
flux), we can detect a shift in the interference pattern. For example, if
$e\Phi/\hbar c = \pi$, then a spot on the screen which formerly corresponded to constructive
interference will now be destructive and vice versa.
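The dependence of the interference on the flux phase can be illustrated with a minimal sketch. It assumes, purely for illustration, that the two slit amplitudes have unit magnitude, so that only the relative phase matters; the function name `two_slit_intensity` and its arguments are hypothetical and not part of the text.

```python
import cmath
import math


def two_slit_intensity(phi12, flux_phase=0.0):
    """Relative intensity |A1 + A2|^2 for two unit-magnitude amplitudes.

    phi12      : relative phase without the field, (S(q2) - S(q1)) / hbar
    flux_phase : e * Phi / (hbar * c); with the field on, the relative
                 phase becomes phi12 - flux_phase
    Assumption: |A1| = |A2| = 1, chosen only to keep the sketch simple.
    """
    return abs(1.0 + cmath.exp(1j * (phi12 - flux_phase))) ** 2


# A spot with constructive interference (phi12 = 0) ...
bright = two_slit_intensity(0.0)
# ... becomes destructive when the flux phase equals pi.
dark = two_slit_intensity(0.0, flux_phase=math.pi)
```

Sweeping `flux_phase` from 0 to 2π moves the intensity through one full cycle, which is the shift of the pattern described above.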
\[ t_{j+1} - t_j = \frac{t_N - t_0}{N} = \epsilon_N \tag{17.60} \]
The discretization in time leads to a discretization of the paths x(t) which will
be represented through the series of space-time points
Even though the time instances are fixed, we note that the $x_j$ values are not.
They can be anywhere in the allowed volume, which we will choose to be the
interval $[-\infty, +\infty]$. In passing from one space-time instance $(x_j, t_j)$ to the next
$(x_{j+1}, t_{j+1})$ we assume that kinetic energy and potential energy are constant,
namely,
\[ \frac{1}{2} m \frac{(x_{j+1} - x_j)^2}{\epsilon_N^2} \quad \text{and} \quad U(x_j) \tag{17.62} \]
respectively. These assumptions lead then to the following Riemann form for the
action integral
\[ S[x(t)] = \lim_{N\to\infty} \epsilon_N \sum_{j=0}^{N-1} \left[ \frac{1}{2} m \frac{(x_{j+1} - x_j)^2}{\epsilon_N^2} - U(x_j) \right] \tag{17.63} \]
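The Riemann form of the action can be evaluated directly for a sampled path. The following is a minimal sketch, not the text's own procedure: the helper names `discretized_action` and `straight_line_path` are hypothetical, and the default potential $U = 0$ (free particle) is an assumption chosen so the result can be compared with the known classical action $m(x_N - x_0)^2 / 2(t_N - t_0)$.

```python
def discretized_action(xs, eps, m=1.0, potential=lambda x: 0.0):
    """Riemann sum eps * sum_j [ m/2 * ((x_{j+1}-x_j)/eps)^2 - U(x_j) ].

    xs  : positions x_0, ..., x_N sampled at equally spaced times
    eps : time step epsilon_N = (t_N - t_0) / N
    """
    total = 0.0
    for j in range(len(xs) - 1):
        kinetic = 0.5 * m * ((xs[j + 1] - xs[j]) / eps) ** 2
        total += eps * (kinetic - potential(xs[j]))
    return total


def straight_line_path(x0, xN, n_steps):
    """Equally spaced points on the straight line from x0 to xN."""
    return [x0 + (xN - x0) * j / n_steps for j in range(n_steps + 1)]


# Free particle going from x = 0 to x = 2 in total time 4, with N = 8 steps.
path = straight_line_path(0.0, 2.0, 8)
action_classical = discretized_action(path, eps=4.0 / 8)

# Moving one interior point off the straight line raises the action.
bumped = list(path)
bumped[4] += 0.3
action_bumped = discretized_action(bumped, eps=4.0 / 8)
```

For the free particle the straight-line value already equals $m(x_N - x_0)^2 / 2(t_N - t_0)$ at finite $N$, and any deviation with fixed end points gives a larger value.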
The main idea is that one can replace the path integral now by a multiple
integral over x1 , x2 , etc. This allows us to write the propagator or evolution
operator as
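\[ K(x_N, t_N; x_0, t_0) = \lim_{N\to\infty} C_N \int dx_1 \int dx_2 \cdots \int dx_{N-1} \, \exp\left\{ \frac{i}{\hbar} \epsilon_N \sum_{j=0}^{N-1} \left[ \frac{1}{2} m \frac{(x_{j+1} - x_j)^2}{\epsilon_N^2} - U(x_j) \right] \right\} \tag{17.64} \]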
Also we use the fact that the velocity of the classical path
\[ \dot{x}_{cl} = \frac{x_N - x_0}{t_N - t_0} \tag{17.69} \]
is constant. The action integral S[x(t)|x(t0 ) = x0 , x(tN ) = xN ] for any path
x(t) can be expressed through an action integral over the path y(t) relative to
the classical path(note explicit new notation). We get
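\[ S[x(t)|x(t_0) = x_0, x(t_N) = x_N] = \int_{t_0}^{t_N} dt \, \frac{1}{2} m \left( \dot{x}_{cl}^2 + 2\dot{x}_{cl}\dot{y} + \dot{y}^2 \right) \]
\[ = \frac{1}{2} m \int_{t_0}^{t_N} dt \, \dot{x}_{cl}^2 + m \dot{x}_{cl} \int_{t_0}^{t_N} dt \, \dot{y} + \frac{1}{2} m \int_{t_0}^{t_N} dt \, \dot{y}^2 \tag{17.70} \]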
The condition (17.68) implies for the second term on the RHS
\[ \int_{t_0}^{t_N} dt \, \dot{y} = y(t_N) - y(t_0) = 0 \tag{17.71} \]
i.e., due to (17.68), can be expressed through a path integral with endpoints
x(t0 ) = 0, x(tN ) = 0. The resulting expression for S[x(t)|x(t0 ) = x0 , x(tN ) =
xN ] is
\[ S[x(t)|x(t_0) = x_0, x(t_N) = x_N] = \frac{1}{2} m \frac{(x_N - x_0)^2}{t_N - t_0} + S[x(t)|x(t_0) = 0, x(t_N) = 0] \tag{17.74} \]
Inserting into the expression for the propagator we have
\[ K(x_N, t_N; x_0, t_0) = \exp\left[ \frac{im}{2\hbar} \frac{(x_N - x_0)^2}{t_N - t_0} \right] \int_{x(t_0)=0}^{x(t_N)=0} D[x(t)] \, \exp\left\{ \frac{i}{\hbar} S[x(t)] \right\} \tag{17.75} \]
which can be written as
\[ K(x_N, t_N; x_0, t_0) = \exp\left[ \frac{im}{2\hbar} \frac{(x_N - x_0)^2}{t_N - t_0} \right] K(0, t_N; 0, t_0) \tag{17.76} \]
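\[ K(0, t_N; 0, t_0) = \lim_{N\to\infty} \left( \frac{m}{2\pi i \hbar \epsilon_N} \right)^{N/2} \int dy_1 \int dy_2 \cdots \int dy_{N-1} \, \exp\left[ \frac{i}{\hbar} \epsilon_N \sum_{j=0}^{N-1} \frac{1}{2} m \frac{(y_{j+1} - y_j)^2}{\epsilon_N^2} \right] \tag{17.77} \]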
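\[ E = \frac{im}{2\hbar\epsilon_N} \left[ 2y_1^2 - y_1 y_2 - y_2 y_1 + 2y_2^2 - y_2 y_3 - y_3 y_2 + 2y_3^2 - \dots - y_{N-2} y_{N-1} - y_{N-1} y_{N-2} + 2y_{N-1}^2 \right] = i \sum_{j,k=1}^{N-1} y_j a_{jk} y_k \tag{17.78} \]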
\[ I = \int dy_1 \int dy_2 \cdots \int dy_{N-1} \, \exp\left( i \sum_{j,k=1}^{N-1} y_j a_{jk} y_k \right) \tag{17.80} \]
must now be determined. We will exploit the fact that for any real, symmetric
matrix there exists a similarity transformation such that
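\[ S^{-1} a S = \begin{pmatrix} \tilde{a}_{11} & 0 & 0 & \cdots & 0 & 0 \\ 0 & \tilde{a}_{22} & 0 & \cdots & 0 & 0 \\ 0 & 0 & \tilde{a}_{33} & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \tilde{a}_{n-1,n-1} & 0 \\ 0 & 0 & 0 & \cdots & 0 & \tilde{a}_{nn} \end{pmatrix} \tag{17.81} \]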
where S can be chosen as an orthonormal transformation, i.e.,
\[ S^T S = I , \qquad S^T = S^{-1} \tag{17.82} \]
The $\tilde{a}_{kk}$ are the eigenvalues of the a matrix and are real. This property allows us
to simplify the bilinear form $\sum_{j,k}^{n} y_j a_{jk} y_k$ by introducing new integration variables
\[ \tilde{y}_j = \sum_{k}^{n} (S^{-1})_{jk} \, y_k , \qquad y_j = \sum_{k}^{n} S_{jk} \, \tilde{y}_k \tag{17.83} \]
as well as
\[ \det(\tilde{a}) = \det(S^{-1} a S) = \det(S^{-1}) \det(a) \det(S) = \det(a) = \prod_{j=1}^{n} \tilde{a}_{jj} \tag{17.86} \]
with elements
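\[ J_{js} = \frac{\partial y_j}{\partial \tilde{y}_s} \tag{17.90} \]
According to equation (17.84), $J = S$ and hence $\det(J) = 1$. We then have (with $n = N - 1$)
\[ I = \int d\tilde{y}_1 \int d\tilde{y}_2 \cdots \int d\tilde{y}_n \, \exp\left( i \sum_{k}^{n} \tilde{a}_{kk} \tilde{y}_k^2 \right) \]
\[ = \int d\tilde{y}_1 \exp\left( i \tilde{a}_{11} \tilde{y}_1^2 \right) \cdots \int d\tilde{y}_n \exp\left( i \tilde{a}_{nn} \tilde{y}_n^2 \right) = \prod_{k=1}^{n} \int d\tilde{y}_k \exp\left( i \tilde{a}_{kk} \tilde{y}_k^2 \right) \tag{17.91} \]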
where $c \neq 0$. We first consider the case $c > 0$. One can relate the integral
(17.92) to the standard Gaussian integral
\[ \int_{-\infty}^{+\infty} dx \, e^{-cx^2} = \sqrt{\frac{\pi}{c}} , \qquad c > 0 \tag{17.93} \]
The contour integral vanishes since $e^{icz^2}$ is an analytic function, i.e., the integrand
does not have any singularities anywhere inside the contour. The contour
integral (17.94) can be written as the sum of the following path integrals
\[ J = J_1 + J_2 + J_3 + J_4 , \qquad J_k = \int_{k} dz \, e^{icz^2} \tag{17.95} \]
for x, p <.
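We now show that the two integrals $J_2$ and $J_4$ vanish for $p \to +\infty$. This follows
from the calculation below.
\[ \lim_{p\to+\infty} |J_{2 \text{ or } 4}| = \lim_{p\to+\infty} \left| \int_0^p i \, dx \, e^{ic(ix+p)^2} \right| \le \lim_{p\to+\infty} |i| \int_0^p dx \left| e^{ic(p^2 - x^2)} \right| \left| e^{-2cxp} \right| \]
\[ \lim_{p\to+\infty} |J_{2 \text{ or } 4}| \le \lim_{p\to+\infty} \int_0^p dx \, e^{-2cxp} = \lim_{p\to+\infty} \frac{1 - e^{-2cp^2}}{2cp} = 0 \]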
One can derive the same result for c < 0, if one chooses the same contour but
with a path that is reflected at the real axis. This gives
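\[ J = \int_{-\infty}^{+\infty} dx \, e^{icx^2} - \sqrt{-i} \int_{-\infty}^{+\infty} dx \, e^{cx^2} = 0 , \qquad c < 0 \]
\[ \int_{-\infty}^{+\infty} dx \, e^{icx^2} = \sqrt{\frac{-i\pi}{|c|}} = \sqrt{\frac{i\pi}{c}} \tag{17.102} \]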
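The oscillatory Gaussian integral can be checked numerically if a small damping factor is included, in the spirit of the convergence factor mentioned for the momentum integrals. The sketch below is only an illustration under that assumption: it compares a trapezoid sum of $\int dx \, e^{(ic - \delta)x^2}$ with the closed form $\sqrt{\pi/(\delta - ic)}$, which reduces to $\sqrt{i\pi/c}$ as $\delta \to 0$. The names `regulated_fresnel` and `closed_form`, the damping parameter `delta`, and the grid parameters are all hypothetical choices.

```python
import cmath
import math


def regulated_fresnel(c, delta, x_max=25.0, n=200_001):
    """Trapezoid estimate of the integral of exp((i*c - delta) * x^2)
    over [-x_max, x_max]; delta > 0 damps the integrand at large |x|."""
    h = 2.0 * x_max / (n - 1)
    total = 0j
    for j in range(n):
        x = -x_max + j * h
        weight = 0.5 if j in (0, n - 1) else 1.0
        total += weight * cmath.exp((1j * c - delta) * x * x)
    return total * h


def closed_form(c, delta=0.0):
    """sqrt(pi / (delta - i*c)); equals sqrt(i*pi/c) when delta = 0."""
    return cmath.sqrt(math.pi / (delta - 1j * c))


numeric = regulated_fresnel(1.0, 0.05)
exact = closed_form(1.0, 0.05)
```

With `delta = 0` the closed form gives $e^{i\pi/4}\sqrt{\pi/c}$ for $c > 0$ and $e^{-i\pi/4}\sqrt{\pi/|c|}$ for $c < 0$, matching the two cases treated above.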
This last result holds for a d-dimensional real, symmetric matrix $(a_{jk})$ with
$\det(a_{jk}) \neq 0$. In order to complete the evaluation of the propagator in
equation (17.64) we split off the factor $m/2\hbar\epsilon_N$ in the definition (17.79) of $(a_{jk})$,
defining a new matrix $(A_{jk})$ through
\[ a_{jk} = \frac{m}{2\hbar\epsilon_N} A_{jk} \tag{17.104} \]
Using
\[ \det(a_{jk}) = \left( \frac{m}{2\hbar\epsilon_N} \right)^{N-1} \det(A_{jk}) \tag{17.105} \]
which is a general property of determinants, we get
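\[ K(0, t_N; 0, t_0) = \lim_{N\to\infty} \left( \frac{m}{2\pi i \hbar \epsilon_N} \right)^{N/2} \left( \frac{2\pi i \hbar \epsilon_N}{m} \right)^{(N-1)/2} \frac{1}{\sqrt{\det(A_{jk})}} \tag{17.106} \]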
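In order to determine $\det(A_{jk})$ we consider the dimension n of $(A_{jk})$, presently
$N - 1$, as a variable and let $n = 1, 2, \dots$. We seek to evaluate the determinant
of the $n \times n$ matrix
\[ D_n = \det \begin{pmatrix} 2 & -1 & 0 & \cdots & 0 & 0 \\ -1 & 2 & -1 & \cdots & 0 & 0 \\ 0 & -1 & 2 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 2 & -1 \\ 0 & 0 & 0 & \cdots & -1 & 2 \end{pmatrix} \tag{17.107} \]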
For this purpose, we expand (17.107) in terms of subdeterminants along the
last column. One can readily verify that this procedure leads to the following
recursion equation for the determinants
\[ D_n = 2D_{n-1} - D_{n-2} \tag{17.108} \]
To solve this three term recursion relationship one needs two starting values.
Using
\[ D_1 = \det(2) = 2 , \qquad D_2 = \det \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix} = 3 \tag{17.109} \]
we get
Dn = n + 1 (17.110)
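The recursion and its solution can be checked with a short sketch. The function names are hypothetical; `tridiag_det` iterates the three-term recursion from the starting values $D_1 = 2$, $D_2 = 3$ (equivalently the convention $D_0 = 1$, which is an assumption of this sketch), while `brute_force_det` computes the same determinant independently by exact elimination.

```python
from fractions import Fraction


def tridiag_det(n):
    """D_n from the recursion D_n = 2*D_{n-1} - D_{n-2}, with D_0 = 1, D_1 = 2."""
    d_prev, d_curr = 1, 2
    if n == 0:
        return d_prev
    for _ in range(n - 1):
        d_prev, d_curr = d_curr, 2 * d_curr - d_prev
    return d_curr


def brute_force_det(n):
    """Determinant of the n x n matrix with 2 on the diagonal and -1 on the
    two neighbouring diagonals, by Gaussian elimination in exact fractions."""
    a = [[Fraction(2 if i == j else -1 if abs(i - j) == 1 else 0)
          for j in range(n)] for i in range(n)]
    det = Fraction(1)
    for i in range(n):
        pivot = a[i][i]  # nonzero for this matrix, so no row swaps are needed
        det *= pivot
        for r in range(i + 1, n):
            factor = a[r][i] / pivot
            for col in range(i, n):
                a[r][col] -= factor * a[i][col]
    return det


values = [tridiag_det(n) for n in range(1, 8)]
```

Both routes give $D_n = n + 1$ for every size tried, so the matrix of dimension $N - 1$ has determinant $N$.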
Therefore, we get
det(Ajk ) = N (17.111)
which gives
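\[ K(0, t_N; 0, t_0) = \lim_{N\to\infty} \left( \frac{m}{2\pi i \hbar \epsilon_N N} \right)^{1/2} = \left( \frac{m}{2\pi i \hbar (t_N - t_0)} \right)^{1/2} \tag{17.112} \]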
The propagator allows us to predict the time evolution of any state function
$\psi(x, t)$ of a free particle. The result can be generalized to three dimensions as
\[ K(\vec{r}, t; \vec{r}_0, t_0) = \left( \frac{m}{2\pi i \hbar (t - t_0)} \right)^{3/2} \exp\left[ \frac{im}{2\hbar} \frac{(\vec{r} - \vec{r}_0)^2}{t - t_0} \right] \tag{17.114} \]
for an arbitrary path x(t) with end points x(t0 ) = x0 and x(tN ) = xN . In order
to simplify this task we again define a new path y(t)
which describes the deviation from the classical path xcl (t) with end points
x(t0 ) = x0 and x(tN ) = xN so that
This gives
where
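\[ L(x_{cl}, \dot{x}_{cl}, t) = \frac{1}{2} m \dot{x}_{cl}^2 - \frac{1}{2} c(t) x_{cl}^2 - e(t) x_{cl} \tag{17.121} \]
\[ L'(y, \dot{y}(t), t) = \frac{1}{2} m \dot{y}^2 - \frac{1}{2} c(t) y^2 \tag{17.122} \]
\[ \delta L = m \dot{x}_{cl} \dot{y}(t) - c(t) x_{cl} y - e(t) y \tag{17.123} \]
We now show that the contribution of $\delta L$ to the action integral (17.117) vanishes.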
For this purpose we use
\[ \dot{x}_{cl} \dot{y} = \frac{d}{dt} (\dot{x}_{cl} y) - \ddot{x}_{cl} y \tag{17.124} \]
and get
\[ \int_{t_0}^{t_N} dt \, \delta L = m (\dot{x}_{cl} y) \Big|_{t_0}^{t_N} - \int_{t_0}^{t_N} dt \, [m \ddot{x}_{cl}(t) + c(t) x_{cl}(t) + e(t)] \, y(t) \tag{17.125} \]
According to (17.119) the first term on the RHS vanishes. Getting the Euler-
Lagrange equations from the Lagrangian (17.116) we have for the classical path
and hence, the second term on the RHS also vanishes. Thus,
\[ \int_{t_0}^{t_N} dt \, \delta L = 0 \tag{17.127} \]
We then have
\[ K(x_N, t_N; x_0, t_0) = \exp\left\{ \frac{i}{\hbar} S[x_{cl}(t)] \right\} K(0, t_N; 0, t_0) \tag{17.128} \]
where
\[ K(0, t_N; 0, t_0) = \int_{y(t_0)=0}^{y(t_N)=0} D[y(t)] \, \exp\left\{ \frac{i}{\hbar} \int_{t_0}^{t_N} dt \, L'(y, \dot{y}, t) \right\} \tag{17.129} \]
Therefore we need to evaluate the function
\[ f(t_0, t_N) = \lim_{N\to\infty} \epsilon_N \left[ \left( \frac{2\hbar\epsilon_N}{m} \right)^{N-1} \det(a) \right] \tag{17.134} \]
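\[ f(t_0, t_0) = \epsilon_N D_0 = 0 \tag{17.140} \]
\[ \left. \frac{df(t_0, t)}{dt} \right|_{t=t_0} = \epsilon_N \frac{D_1 - D_0}{\epsilon_N} = 2 - \frac{\epsilon_N^2}{m} c_1 - 1 = 1 - \frac{\epsilon_N^2}{m} c_1 = 1 \tag{17.141} \]
where the limit $\epsilon_N \to 0$ is understood.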
We finally get for the propagator (17.115)
\[ K(x, t; x_0, t_0) = \left( \frac{m}{2\pi i \hbar f(t_0, t)} \right)^{1/2} \exp\left\{ \frac{i}{\hbar} S[x_{cl}(t)] \right\} \tag{17.142} \]
where $f(t_0, t)$ is the solution of (17.139)-(17.141) and where $S[x_{cl}(t)]$ is determined
by first solving the Euler-Lagrange equations for the Lagrangian (17.116)
to obtain the classical path $x_{cl}(t)$ with end points $x_{cl}(t_0) = x_0$ and $x_{cl}(t_N) = x_N$
and then evaluating (17.117) for this path. Note that the required solution
$x_{cl}(t)$ involves a solution of the Euler-Lagrange equations for boundary conditions
which are different from those conventionally encountered in Classical
Mechanics, where we usually find a solution for initial conditions $x_{cl}(t_0) = x_0$
and $\dot{x}_{cl}(t_0) = v_0$.
We now determine the action integral associated with the path ((17.148),(17.149))
\[ S[x_{cl}(t)] = \int_{t_0}^{t} d\tau \left( \frac{1}{2} m \dot{x}_{cl}^2(\tau) - \frac{1}{2} m\omega^2 x_{cl}^2(\tau) \right) \tag{17.150} \]
We assume that t0 = 0. From (17.148) the velocity along the classical path is
\[ \dot{x}_{cl}(\tau) = \omega \frac{x - x_0 c}{s} \cos\omega\tau - \omega x_0 \sin\omega\tau \tag{17.151} \]
and for the kinetic energy we get
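\[ \frac{m}{2} \dot{x}_{cl}^2(\tau) = \frac{m\omega^2}{2} \frac{(x - x_0 c)^2}{s^2} \cos^2\omega\tau - m\omega^2 x_0 \frac{x - x_0 c}{s} \cos\omega\tau \sin\omega\tau + \frac{m\omega^2 x_0^2}{2} \sin^2\omega\tau \tag{17.152} \]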
Similarly, we get the potential energy
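\[ \frac{m\omega^2}{2} x_{cl}^2(\tau) = \frac{m\omega^2}{2} \frac{(x - x_0 c)^2}{s^2} \sin^2\omega\tau + m\omega^2 x_0 \frac{x - x_0 c}{s} \cos\omega\tau \sin\omega\tau + \frac{m\omega^2 x_0^2}{2} \cos^2\omega\tau \tag{17.153} \]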
Using
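\[ \cos^2\omega\tau = \frac{1}{2} + \frac{1}{2} \cos 2\omega\tau , \qquad \sin^2\omega\tau = \frac{1}{2} - \frac{1}{2} \cos 2\omega\tau \tag{17.154} \]
\[ \cos\omega\tau \sin\omega\tau = \frac{1}{2} \sin 2\omega\tau \tag{17.155} \]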
the Lagrangian, considered as a function of , reads
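\[ g(\tau) = \frac{1}{2} m \dot{x}_{cl}^2(\tau) - \frac{1}{2} m\omega^2 x_{cl}^2(\tau) = \frac{m\omega^2}{2} \left( \frac{(x - x_0 c)^2}{s^2} - x_0^2 \right) \cos 2\omega\tau - m\omega^2 x_0 \frac{x - x_0 c}{s} \sin 2\omega\tau \tag{17.156} \]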
where we have used the definitions (17.149). We finally get
\[ S[x_{cl}(t)] = \frac{m\omega}{2 \sin\omega(t - t_0)} \left[ (x_0^2 + x^2) \cos\omega(t - t_0) - 2x_0 x \right] \tag{17.159} \]
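The closed form for the classical action can be compared with a direct numerical integration of the Lagrangian along the classical path. The sketch below assumes $t_0 = 0$ and takes the classical path to be $x_{cl}(\tau) = x_0 \cos\omega\tau + \frac{x - x_0 c}{s} \sin\omega\tau$ with $c = \cos\omega t$, $s = \sin\omega t$; this form is an assumption consistent with the kinetic and potential energies above, and the function names and the Simpson-rule step count are hypothetical.

```python
import math


def classical_action_formula(x0, x, omega, t, m=1.0):
    """Closed form m*w / (2 sin wt) * [(x0^2 + x^2) cos wt - 2 x0 x], t0 = 0."""
    return (m * omega / (2.0 * math.sin(omega * t))
            * ((x0 ** 2 + x ** 2) * math.cos(omega * t) - 2.0 * x0 * x))


def classical_action_numeric(x0, x, omega, t, m=1.0, n=2000):
    """Simpson-rule integral of the Lagrangian along the assumed classical path.

    n must be even; the path runs from x0 at time 0 to x at time t.
    """
    c, s = math.cos(omega * t), math.sin(omega * t)
    amp = (x - x0 * c) / s

    def lagrangian(tau):
        pos = x0 * math.cos(omega * tau) + amp * math.sin(omega * tau)
        vel = omega * (amp * math.cos(omega * tau) - x0 * math.sin(omega * tau))
        return 0.5 * m * vel ** 2 - 0.5 * m * omega ** 2 * pos ** 2

    h = t / n
    total = lagrangian(0.0) + lagrangian(t)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * lagrangian(j * h)
    return total * h / 3.0


s_formula = classical_action_formula(0.3, 1.1, 2.0, 0.7)
s_numeric = classical_action_numeric(0.3, 1.1, 2.0, 0.7)
```

The two values agree to within the quadrature error, and for small $\omega$ the formula approaches the free-particle action $m(x - x_0)^2 / 2t$.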
The expression for the propagator of the harmonic oscillator is then given by
K(x, t; x0 , t0 ) =
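\[ K(x, t; x_0, t_0) = \left[ \frac{m\omega}{2\pi i \hbar \sin\omega(t - t_0)} \right]^{1/2} \exp\left\{ \frac{im\omega}{2\hbar \sin\omega(t - t_0)} \left[ (x_0^2 + x^2) \cos\omega(t - t_0) - 2x_0 x \right] \right\} \tag{17.160} \]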
17.7 Problems
17.7.1 Path integral for a charged particle moving on a
plane in the presence of a perpendicular magnetic
field
Consider a particle of mass m and charge e moving on a plane in the presence of
an external uniform magnetic field perpendicular to the plane and with strength
B. Let ~r = (x1 , x2 ) and p~ = (p1 , p2 ) represent the components of the coordinate
~r and of the momentum p~ of the particle. The Lagrangian for the particle is
\[ L = \frac{1}{2} m \left( \frac{d\vec{r}}{dt} \right)^2 + \frac{e}{c} \frac{d\vec{r}}{dt} \cdot \vec{A}(\vec{r}) \]
1. Find the relation between the momentum p~ and the coordinate ~r and
explain how the momentum is related to the velocity ~v = d~r/dt in this
case.
\[ H(q, p) = \frac{1}{2m} \left( \vec{p} - \frac{e}{c} \vec{A}(\vec{r}) \right)^2 \]
where $\vec{A}(\vec{r})$ is the vector potential for a uniform magnetic field, normal to
the plane, and of magnitude B. In what follows, we will always write the
vector potential in the gauge $\nabla \cdot \vec{A}(\vec{r}) = 0$, where it is given by
\[ A_1(\vec{r}) = -\frac{B}{2} x_2 , \qquad A_2(\vec{r}) = \frac{B}{2} x_1 \]
4. Derive the form of the path integral, as a sum over the histories of the
position ~r(t) of the particle, for the transition amplitude of the process in
which the particle returns to its initial location $\vec{r}_0$ at time $t_f$ having left
that point at $t_i$, i.e.,
\[ \langle \vec{r}_0, t_f | \vec{r}_0, t_i \rangle \]
where $\vec{r}_0$ is an arbitrary point of the plane and $|t_f - t_i| \to \infty$. What is the
form of the action? What initial and final conditions should be satisfied
by the histories $\vec{r}(t)$?
\[ \langle \vec{r}_f = 0, t_f | \vec{r}_i = 0, t_i \rangle \]
\[ \langle q = 0, t_f | q = 0, t_i \rangle \]
for $t_i \to -\infty$ and $t_f \to +\infty$.
17.7.4 Green's Function for a Free Particle
The Green's function for the single-particle Schrodinger equation is defined as
the solution of the equation
\[ \left[ i\hbar \partial_t - H \right] G(\vec{r}, t; \vec{r}', t') = i\hbar \, \delta(t - t') \, \delta(\vec{r} - \vec{r}') \]
\[ K(x, t + 0; x_i, t_i) = \delta(x - x_i) \]
We have derived the following path integral expression for the propagator:
\[ K(x, t; x_i, t_i) = N \int [Dx(t)] \exp\left\{ \frac{i}{\hbar} S[x(t)] \right\} \]
Using the above definition of the path integral, calculate explicitly the propaga-
tor for a free particle in one spatial dimension. Compare your result with that
of Problem 17.7.4.
Chapter 18
The nuclear part of the problem involves several types of motion. For a solid
with N nuclei and hence 3N nuclear degrees of freedom we have
To study the electron motions in the crystal, we ignore the lattice vibrations
and the motion of the crystal as a whole. We assume that the nuclei are fixed at
their equilibrium positions (the crystal ). The electronic Schrodinger equation is
then
\[ H^0 \psi_e(\vec{r}_i; \vec{R}) = E_e \psi_e(\vec{r}_i; \vec{R}) \tag{18.1} \]
where
Although we can formally write these equations, we have a problem. The solid
contains a very large number of electrons (and nuclei), i.e., for a solid 1 cm$^3$
in volume we have on the order of $10^{23}$ electrons, which leads to an enormous
number of coupled equations. We cannot solve these equations.
Translational Symmetry
A crystal(ideal) has translational symmetry. This means that we can pick it
up and move it through a specified vector and we cannot distinguish the final
crystal from the original. For example, we consider a one-dimensional crystal
as shown in Figure 18.1 below.
We say that two points in the crystal are equivalent if the surrounding crystal looks
the same independent of which of the two points we stand on. Clearly, any two
such points must be connected by a lattice translation vector. The points do
not have to be lattice points.
All of these vectors are defined with respect to some (arbitrary) origin. This is
illustrated for a 2-dimensional crystal in Figure 18.2 below.
Figure 18.2: 2-dimensional crystal
In Figure 18.3 below, translational invariance means that the interactions felt
by an electron at position ~r + T~ are identical to those felt at position ~r.
The choice of the three vectors ~a, ~b and ~c is not unique. In Figure 18.4 below
we illustrate several choices (all equally good).
The vector pairs $\vec{a}, \vec{b}$ and $\vec{a}', \vec{b}'$ are primitive lattice translation vectors (not
unique). The appellation primitive is attached because, using the pair as a basis,
we can reach all lattice points. On the other hand, the pair $\vec{a}'', \vec{b}''$ is not
primitive since a point like P cannot be reached.
The lattice, which is the set of all mathematical points that are equivalent
under translation by $\vec{T} = n_1\vec{a} + n_2\vec{b} + n_3\vec{c}$, where $\vec{a}$, $\vec{b}$ and $\vec{c}$ are primitive lattice
translation vectors, defines the structure of the crystal. The crystal consists of
lattice + basis, where a basis is one or more atoms that are attached to all the
lattice points.
A more detailed definition goes as follows: A volume of the crystal that when
translated through all translation vectors, just fills all of space without either
overlapping itself or leaving any gaps is called a primitive cell or primitive unit
cell. It must contain exactly one lattice point. In Figure 18.4 above, $\vec{a}, \vec{b}$
and $\vec{a}', \vec{b}'$ generate primitive unit cells, but $\vec{a}'', \vec{b}''$ does not (it contains two
atoms).
In addition, every primitive cell has the same area (volume). In Figure 18.5
below we have drawn several unit cells for a rectangular-centered lattice. The
cells labelled (a), (b) , and (d) are primitive while (c) is not.
Point Symmetry
There is another group of symmetry operations for a crystal which leave one
point fixed in space. They are called point symmetry operations. The three most
important point symmetry operations are rotation (R), reflection (F) and inversion (V).
Not all crystal structures are characterized by all of these operations.
In fact, they can be used to classify lattices and crystals.
Digression to Group Theory
A group is a set of elements A, B, C, .... such that a group multiplication is
defined which associates a third element with any ordered pair of two elements.
The multiplication must satisfy these requirements:
1. The product of any two elements in the set is in the set also, i.e., the set
is closed under group multiplication.
4. There exists an inverse $A^{-1}$ for every element in the group, i.e., $AA^{-1} =
A^{-1}A = E$.
An example is the set of four elements (group of order 4) with the multiplication
table below
E A B C
E E A B C
A A E C B
B B C E A
C C B A E
E A B C D F
E E A B C D F
A A E D F B C
B B F E D C A
C C D F E A B
D D C A B F E
F F B C A E D
If the group multiplication operation were ordinary matrix multiplication, then
a representation of this group (an explicit set of elements) is given by
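\[ E = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} , \qquad A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]
\[ B = \frac{1}{2} \begin{pmatrix} -1 & \sqrt{3} \\ \sqrt{3} & 1 \end{pmatrix} , \qquad C = \frac{1}{2} \begin{pmatrix} -1 & -\sqrt{3} \\ -\sqrt{3} & 1 \end{pmatrix} \]
\[ D = \frac{1}{2} \begin{pmatrix} -1 & \sqrt{3} \\ -\sqrt{3} & -1 \end{pmatrix} , \qquad F = \frac{1}{2} \begin{pmatrix} -1 & -\sqrt{3} \\ \sqrt{3} & -1 \end{pmatrix} \]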
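Whether a set of matrices reproduces a multiplication table can be checked mechanically. The sketch below is an illustration under stated assumptions: the sign pattern of the six matrices is one choice consistent with the order-6 table above (the overall sign of $\sqrt{3}$ could be flipped, which exchanges B with C and D with F), the table entry in row X and column Y is read as the product XY, and the names `REP`, `TABLE`, `matmul`, `identify` and `table_matches` are hypothetical.

```python
import math

R3 = math.sqrt(3.0)

# One sign choice for the 2x2 matrices (an assumption, see the lead-in).
REP = {
    "E": ((1.0, 0.0), (0.0, 1.0)),
    "A": ((1.0, 0.0), (0.0, -1.0)),
    "B": ((-0.5, R3 / 2), (R3 / 2, 0.5)),
    "C": ((-0.5, -R3 / 2), (-R3 / 2, 0.5)),
    "D": ((-0.5, R3 / 2), (-R3 / 2, -0.5)),
    "F": ((-0.5, -R3 / 2), (R3 / 2, -0.5)),
}

ORDER = "EABCDF"
# Rows of the order-6 multiplication table; entry = (row element)(column element).
TABLE = {"E": "EABCDF", "A": "AEDFBC", "B": "BFEDCA",
         "C": "CDFEAB", "D": "DCABFE", "F": "FBCAED"}


def matmul(p, q):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(p[i][k] * q[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


def identify(mat, tol=1e-9):
    """Name of the group element whose matrix equals mat (within tol)."""
    for name, rep in REP.items():
        if all(abs(mat[i][j] - rep[i][j]) < tol
               for i in range(2) for j in range(2)):
            return name
    raise ValueError("product is not one of the six matrices")


def table_matches():
    """True if every matrix product reproduces the tabulated entry."""
    return all(identify(matmul(REP[r], REP[c])) == TABLE[r][ORDER.index(c)]
               for r in ORDER for c in ORDER)


closed_and_consistent = table_matches()
```

Because `identify` raises an error for any product outside the set, a successful run also confirms closure under multiplication.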
Notice that each row/column only contains each element once. This is an ex-
ample of the rearrangement theorem, which says
E F
E E F
F F E
1. The identity operation.
The only symmetry operations for this lattice are twofold rotations about axes
perpendicular to the plane of the crystal and passing through each lattice point
and about axes halfway between each pair of lattice points as shown above. The
primitive lattice translation vectors ~a and ~b have no simple relationship to each
other.
Each of the other four Bravais lattices is a special case of the oblique lattice.
Each has more symmetry operations than the oblique lattice and, in each, the
primitive lattice translation vectors ~a and ~b have some definite relationships.
The primitive rectangular lattice shown in Figure 18.7 below has the same
twofold rotation axes as the oblique lattice and in addition has several mirror
lines as shown. Reflection through mirror lines is a symmetry operation of
the lattice. The primitive lattice translation vectors ~a and ~b are orthogonal to
each other and their lengths are unrelated.
This lattice has the twofold rotation axes, mirror lines and glide lines as shown.
A glide line is a symmetry element corresponding to a combined operation: to
get from one point in the crystal to an equivalent point via a glide line, we first
reflect across the glide line, and then translate along the glide line by 1/2 the
repeat distance of the crystal. The angle between the lattice translation vectors
is as shown and their lengths are unrelated.
The square lattice is a special case of the primitive rectangular lattice where
$|\vec{a}| = |\vec{b}|$. It is shown in Figure 18.9 below. In addition to the twofold rotation
axes, mirror lines and glide lines of the primitive rectangular lattice, the square
lattice has the fourfold rotation axes as shown in the diagram. The primitive
lattice translation vectors are orthogonal and equal in length.
The hexagonal lattice is derived from the centered rectangular lattice by setting
$|\vec{a}| = |\vec{b}|$ as shown in Figure 18.10 below. This lattice has twofold, threefold
and sixfold axes of rotation as shown. It has a large number of mirror lines and
glide lines as shown. The primitive lattice translation vectors are of equal length
and make an angle of 60° with respect to each other.
Figure 18.10: Hexagonal Lattice
Up to this point we have dealt exclusively with the classification of lattices ac-
cording to their point symmetry. As we stated earlier, however, a crystal is
composed of a lattice plus a basis at each lattice point. Because of the basis, the
point symmetry of the crystal may not be the same as that of the lattice, i.e.,
the symmetry of the crystal may be lower than that of the lattice. The lattice
may possess some symmetry elements not possessed by the crystal. This is a
very important point. Let us illustrate it with an example.
Reflection through any of these mirror lines yields a lattice that is indistinguish-
able from the original lattice.
Now associating with each lattice point of this square lattice one of the two
different bases as shown in Figure 18.12 below
It is clear that the crystal on the left possesses all of the same mirror lines as the
original square lattice while the crystal on the right does not. The symmetry of
the crystal on the right is lower than that of its lattice.
Three-Dimensional Crystals
Our discussions will mostly involve one- and two-dimensional crystals. Three-
dimensional crystals are significantly more difficult to study and we will only
say some general things about them at the end.
In three dimensions there are 14 Bravais lattices. Three examples are shown in
Figure 18.14 below.
From each Bravais lattice we can construct crystals by associating a basis with
each lattice point.
Miller Indices
In order to discuss three dimensional crystals we need to be able to specify
particular directions (axes) and particular planes in the crystal.
A convention has been developed for dealing with this problem. Axes and planes
in a crystal are described by Miller indices.
The Miller indices of the axis are then given by the set of integers
\[ \left[ \frac{n_1}{N}, \frac{n_2}{N}, \frac{n_3}{N} \right] = [h, k, l] \tag{18.7} \]
where negative indices are denoted by a bar over the integer.
An example is shown below for the body diagonal of a cubic lattice where
\[ \vec{T} = \vec{a} + \vec{b} + \vec{c} \;\to\; N = 1 \;\to\; [1, 1, 1] \tag{18.8} \]
reciprocals by a common factor that reduces them to the three smallest integers.
These are the Miller indices of the plane, written (h, k, l). Some examples are
shown in Figure 18.16 below.
A slightly more complicated example is one where the intercepts are (3, 4, 12)
so that
\[ (a, b, c) = (3, 4, 12) \;\to\; \left( \frac{1}{3}, \frac{1}{4}, \frac{1}{12} \right) \times 12 = (4, 3, 1) = (h, k, l) \tag{18.9} \]
The Miller indices for an axis represent all equivalent axes (all parallel). The
Miller indices for a plane represent all equivalent planes (all parallel).
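The recipe for the Miller indices of a plane (take reciprocals of the intercepts, then clear fractions to the smallest integers) can be written as a short sketch. The function name `miller_indices` and the convention that `None` marks a plane parallel to an axis (intercept at infinity, index 0) are assumptions made for this illustration; intercepts are taken as nonzero integers in units of the lattice constants.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def miller_indices(intercepts):
    """Miller indices (h, k, l) from axis intercepts in units of a, b, c.

    intercepts : three nonzero integers, or None for an axis that the plane
                 never crosses (reciprocal taken as 0).
    """
    recips = [Fraction(0) if n is None else Fraction(1, n) for n in intercepts]
    # Multiply by the least common multiple of the denominators ...
    lcm = reduce(lambda a, b: a * b // gcd(a, b),
                 (r.denominator for r in recips))
    ints = [int(r * lcm) for r in recips]
    # ... and divide out any remaining common factor.
    common = reduce(gcd, (abs(i) for i in ints))
    return tuple(i // common for i in ints)


example = miller_indices((3, 4, 12))
```

For the intercepts (3, 4, 12) this reproduces (4, 3, 1); a negative intercept gives a negative index, which the text writes with a bar over the integer.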
This concludes our short introduction to crystal structure and symmetry. The
symmetry of the crystal is important beyond what it tells us about the clas-
sification of the crystal. It reflects the fundamental properties of the crystal
Hamiltonian and in particular, the crystal potential energy. We will use these
properties to deduce facts about the energy eigenfunctions for the electrons in
the crystal.
The electronic Schrodinger equation for a single electron in the crystal is given
by
\[ H \psi_E(\vec{r}) = E \psi_E(\vec{r}) \tag{18.10} \]
In the orbital approximation, the potential energy of the electron consists of
terms describing its interaction with the static array of nuclei and average po-
tential energy terms that take into account interactions with the other electrons
in the crystal. The interaction potential with the nuclei clearly has the periodic-
ity of the crystal and we assume that the average potential energy is constructed
so that it also has the periodicity of the crystal. This means that the total po-
tential energy is invariant under translation, i.e.,
for any lattice translation vector $\vec{T}$. Since the kinetic energy operator is also
translation invariant we then have
Each lattice translation vector $\vec{T} = n_1\vec{a} + n_2\vec{b} + n_3\vec{c}$ has a corresponding translation
operator $T_{op} = T_{op}(n_1, n_2, n_3)$.
\[ T_{op}(n_1, n_2, n_3) \, T'_{op}(n'_1, n'_2, n'_3) = T''_{op}(n_1 + n'_1, n_2 + n'_2, n_3 + n'_3) \tag{18.15} \]
\[ \vec{T}'' = \vec{T} + \vec{T}' = (n_1 + n'_1)\vec{a} + (n_2 + n'_2)\vec{b} + (n_3 + n'_3)\vec{c} \tag{18.16} \]
for any $T_{op}$. In addition, from the definition we can see that
\[ T'_{op} T_{op} = T_{op} T'_{op} \tag{18.18} \]
18.2.2 Derivation of Bloch's Theorem
All the energy eigenfunctions are normalized so that
\[ \int |\psi_E(\vec{r})|^2 \, d^3r = 1 \tag{18.21} \]
It tells us that we can choose solutions of the Schrodinger equation such that
for every Hamiltonian eigenfunction $\psi_E(\vec{r})$ there exists a vector $\vec{k}$, called a wave
vector, such that (18.27) holds for any translation operator $T_{op}$, that is, that
$\psi_E(\vec{r} + \vec{T})$ differs from $\psi_E(\vec{r})$ only by a multiplicative phase factor $e^{i\vec{k}\cdot\vec{T}}$. Any
function that satisfies Bloch's theorem is called a Bloch function or Bloch wave.
The Wave Vector
The proportionality constant $\vec{k}$ has units of inverse length. For each Hamiltonian
eigenfunction $\psi_E(\vec{r})$, there is a wave vector $\vec{k}$. The relationship of $\vec{k}$ to the
wave function is given by Bloch's theorem. This means that $\vec{k}$ can be viewed as
a quantum number for the function $\psi_E(\vec{r})$ and we can write $\psi_{E,\vec{k}}(\vec{r})$.
It is clear that Bloch's theorem is satisfied for an infinite set of k-values given
by
2
k =+ m , m = 0, 1, 2, ...... (18.30)
d
where we have used the fact that $e^{i2\pi m} = 1$. This suggests that we cannot
uniquely define the wave vector $\vec{k}$, i.e., there are an infinite number of wave
vectors that are equivalent to each other in the way that they are related to
the wave function $\psi_E(x)$. This result generalizes to three dimensions. For each
single-particle Hamiltonian eigenfunction $\psi_E(\vec{r})$ we can find a set of different
wave vectors $\vec{k}$ that impose a similar restriction on $\psi_E(\vec{r})$, i.e., for which
\[ \psi_E(\vec{r} + \vec{T}) = e^{i\vec{k}\cdot\vec{T}} \psi_E(\vec{r}) \tag{18.31} \]
In general, we choose the smallest of these wave vectors to label the eigenfunction
$\psi_{E,\vec{k}}(\vec{r})$.
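The non-uniqueness of the wave vector can be illustrated in one dimension with a minimal sketch. It assumes, for illustration only, a function of the form $e^{ikx} u(x)$ where $u(x)$ is a short Fourier series with the lattice period $d$; the function names and the particular Fourier coefficients are hypothetical.

```python
import cmath
import math


def bloch_function(x, k, d, harmonics=((0, 1.0), (1, 0.5), (-2, 0.25j))):
    """psi(x) = exp(i k x) * u(x), with u(x) = sum_n c_n exp(i 2 pi n x / d).

    u(x) has period d by construction; the coefficients c_n are arbitrary
    illustrative values, not taken from the text.
    """
    u = sum(c * cmath.exp(2j * math.pi * n * x / d) for n, c in harmonics)
    return cmath.exp(1j * k * x) * u


def translation_phase(k, d):
    """Phase factor exp(i k d) picked up under translation by one period."""
    return cmath.exp(1j * k * d)


x, k, d = 0.37, 1.3, 2.0
lhs = bloch_function(x + d, k, d)
rhs = translation_phase(k, d) * bloch_function(x, k, d)

# k and k + 2*pi/d give the same phase factor, so they label the same behaviour.
same_phase = abs(translation_phase(k, d)
                 - translation_phase(k + 2 * math.pi / d, d))
```

Here `lhs` and `rhs` agree, and `same_phase` vanishes to rounding error, which is the one-dimensional statement that wave vectors differing by a multiple of $2\pi/d$ are equivalent.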
This says that the new function $u_{E,\vec{k}}(\vec{r})$ is a periodic function invariant under
translation by $\vec{T}$, i.e., it has the periodicity of the crystal. Using this result we
rewrite the Hamiltonian eigenfunction as
\[ \psi_{E,\vec{k}}(\vec{r}) = e^{i\vec{k}\cdot\vec{r}} u_{E,\vec{k}}(\vec{r}) \tag{18.35} \]
which is the second form of Bloch's theorem. This form shows that the electronic
eigenfunction of the crystal can be chosen to have the form of a periodic function
$u_{E,\vec{k}}(\vec{r})$ modulated by a plane-wave envelope $e^{i\vec{k}\cdot\vec{r}}$.
Generally, this envelope varies more slowly with $\vec{r}$ than does $u_{E,\vec{k}}(\vec{r})$. The real
part of $\psi_{E,\vec{k}}(\vec{r})$ is shown in Figure 18.17 below, where it has been assumed that
$u_{E,\vec{k}}(\vec{r})$ is a real function.
Up to this point we have not restricted the magnitude of $\vec{k}$ in any way. Thus,
there exists a continuum of values of $\vec{k}$. We call this continuum $\vec{k}$-space or
reciprocal space.
Each crystal lattice in direct space has an associated reciprocal lattice in $\vec{k}$-space.
Blochs theorem says that for a particular energy eigenfunction of the electronic
Hamiltonian, there exists a vector ~k (actually an infinity of such vectors) such
that
~ ~ ~ ~ ~
Top E,~k (~r) = Top eik~r uE,~k (~r) = eik(~r+T ) uE,~k (~r) = eikT E,~k (~r) (18.37)
~ is a special wave vector such that u ~ (~r) is invariant under any transla-
i.e., G E,k
tion Top . The condition implies
~ ~ ~ T~ ) + i sin(G
~ T~ ) = 1 G
~ T~ = 2n
eiGT = cos(G , n = integer (18.39)
or that
~ T~ = n1 G
G ~ ~b + n3 G
~ ~a + n2 G ~ ~c = 2n (18.40)
The integers n1 , n2 and n3 which define a particular T~ , can take on any values.
For each set {n1 , n2 ,n3 } there exists some integer n for which the above equality
holds. Thus, n depends on n1 , n2 and n3 .
since the relation must be valid for any value of n1 . n1 is another integer given
by n = na n1 . Similarly, we must have
~ ~b = 2nb and G
G ~ ~c = 2nc (18.42)
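These three relations allow us to write a general form for $\vec{G}$
$$\vec{G} = n_a\vec{A} + n_b\vec{B} + n_c\vec{C} \tag{18.43}$$
Although this looks like the expression for $\vec{T}$, remember, however, that $\vec{G}$ is a
vector in $\vec{k}$-space, i.e., that it is a vector such that
$$e^{i\vec{G}\cdot\vec{T}} = 1 \tag{18.44}$$
Since $e^{i\vec{G}\cdot\vec{T}} = 1$, all vectors $\vec{k}' = \vec{k} + \vec{G}$ are equivalent in $\vec{k}$-space, i.e.,
$$e^{i\vec{k}'\cdot\vec{T}} = e^{i(\vec{k}+\vec{G})\cdot\vec{T}} = e^{i\vec{k}\cdot\vec{T}}e^{i\vec{G}\cdot\vec{T}} = e^{i\vec{k}\cdot\vec{T}} \tag{18.49}$$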
Figure 18.18: Square direct lattice
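where $|\vec{a}| = |\vec{b}|$. We have
$$\vec{a} = d\,\hat{x} \quad\text{and}\quad \vec{b} = d\,\hat{y} \tag{18.51}$$
The vector defining the reciprocal lattice in two dimensions is
$$\vec{G} = n_a\vec{A} + n_b\vec{B} \tag{18.52}$$
We then have
so that
$$\vec{A} = \frac{2\pi}{d}\,\hat{x} , \quad \vec{B} = \frac{2\pi}{d}\,\hat{y} \tag{18.54}$$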
This result shows that the reciprocal lattice of a square direct lattice is also a
square lattice with the spacing of adjacent lattice points in the x and y directions
given by $2\pi/d$.
This is, in fact, a general property of two-dimensional Bravais lattices. They are
self-reciprocal, i.e., the reciprocal lattice of a particular two-dimensional Bravais
lattice is another lattice of the same Bravais type.
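As a numerical check of this construction, here is a minimal Python sketch that builds a two-dimensional reciprocal basis from the conditions $\vec{A}\cdot\vec{a} = \vec{B}\cdot\vec{b} = 2\pi$ and $\vec{A}\cdot\vec{b} = \vec{B}\cdot\vec{a} = 0$. The function name reciprocal_2d and the spacing d = 1.5 are illustrative assumptions, not part of the text.

```python
import math

def reciprocal_2d(a, b):
    """Return (A, B) with A.a = B.b = 2*pi and A.b = B.a = 0."""
    det = a[0] * b[1] - a[1] * b[0]      # signed area of the primitive cell
    if det == 0:
        raise ValueError("a and b must not be parallel")
    scale = 2 * math.pi / det
    A = (scale * b[1], -scale * b[0])    # perpendicular to b
    B = (-scale * a[1], scale * a[0])    # perpendicular to a
    return A, B

# Square direct lattice of (assumed) spacing d
d = 1.5
A, B = reciprocal_2d((d, 0.0), (0.0, d))
print(A, B)   # both have length 2*pi/d, along x and y
```

For the square lattice the result is again a square lattice of spacing $2\pi/d$; passing other basis vectors gives the reciprocal basis of other two-dimensional lattices.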
Figure 18.19: Reciprocal Lattice for Square direct lattice
We choose to label the wave functions by the smallest or most significant vector
in each of the sets of equivalent vectors.
We now ask the following question: Is there a particular region of the reciprocal
lattice that contains all of the significant $\vec{k}$ vectors for a particular crystal?
Consider the two different wave vectors $\vec{k}_1$ and $\vec{k}_2$ for a particular two-dimensional
crystal shown in Figure 18.20 and Figure 18.21 below. We also show the vectors
$$\vec{k}_1' = \vec{k}_1 - \vec{G} \quad\text{and}\quad \vec{k}_2' = \vec{k}_2 - \vec{G} \tag{18.56}$$
Figure 18.20: $|\vec{k}_1'| > |\vec{k}_1|$
In Figure 18.20 above, $|\vec{k}_1'| > |\vec{k}_1|$, and in Figure 18.21 below, $|\vec{k}_2'| < |\vec{k}_2|$.
Figure 18.21: $|\vec{k}_2'| < |\vec{k}_2|$
Thus, $\vec{k}_1$ may be the smallest vector in its set, but $\vec{k}_2$ is certainly not the
smallest vector in its set.
One difference between $\vec{k}_1$ and $\vec{k}_2$ is that the tip of $\vec{k}_1$ lies to the left of the
perpendicular bisector of $\vec{G}$, whereas the tip of $\vec{k}_2$ lies to the right of the bisector.
It turns out that any wave vector whose tip lies on the side of the bisector away
from the origin, is equivalent to at least one smaller wave vector, which can be
generated by subtracting the lattice vector that was bisected. All the significant
wave vectors (i.e., those not equivalent to some smaller wave vector), lie on the
side of the perpendicular bisector closest to the origin.
Now, suppose that instead of bisecting one reciprocal lattice vector $\vec{G}$, we bisect
all such vectors from the origin to nearby points. Then these bisectors enclose
a particular region of reciprocal space that includes the origin. Figure 18.22
below gives this construction for the square lattice (in reciprocal space).
The enclosed region (shaded) is called the first Brillouin zone. The solid arrows
are lattice vectors and the dotted lines are their bisectors.
Each vector in the first Brillouin zone is the smallest vector in the set of
equivalent vectors since it lies on the side toward the origin of every perpendicular
bisector of $\vec{G}$. Any vector whose tip lies outside the first zone is equivalent to
some smaller vector whose tip is inside the first zone. The first Brillouin zone
contains all $\vec{k}$-values closer to the origin than any other point of the reciprocal
lattice.
The $\vec{k}$-values inside the first Brillouin zone have special significance. We will
generally restrict our attention to these values and ignore the infinity of other
vectors outside this zone. Each vector outside the first Brillouin zone is, of
course, equivalent to a smaller vector inside the first zone.
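The reduction of an arbitrary wave vector to its equivalent in the first zone can be sketched in a few lines of Python. This is a minimal illustration for the square lattice only, where the first zone is the square $|k_x|, |k_y| \le \pi/d$; the function name fold_to_first_zone and the sample vector are assumptions made for the example.

```python
import math

def fold_to_first_zone(k, d):
    """Fold a 2D wave vector into the first zone of a square lattice of spacing d.

    Assumes the square reciprocal lattice with basis vectors of length 2*pi/d
    along x and y: each component is shifted by the nearest multiple of 2*pi/d.
    """
    g = 2 * math.pi / d
    return tuple(ki - g * round(ki / g) for ki in k)

d = 1.0
g = 2 * math.pi / d
k_outside = (0.9 * g, -1.3 * g)              # tip lies outside the first zone
k_inside = fold_to_first_zone(k_outside, d)  # equivalent, smaller vector
print(k_inside, math.hypot(*k_inside) < math.hypot(*k_outside))
```

The folded vector differs from the original by a reciprocal lattice vector, so both label the same electronic state. Vectors lying exactly on a zone boundary are ambiguous and are not treated specially here.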
We will, however, have to refer to these other vectors in certain situations and
it is useful to introduce a terminology for describing them. Suppose we wish,
for example, to discuss the second-smallest $\vec{k}$ vector for a particular electronic
state, i.e., the $\vec{k}$ vector that is smaller than all but one of the vectors in the set
of equivalent $\vec{k}$ vectors for this state. Where is this vector located in reciprocal
space? In Figure 18.23 below we have added several more bisectors to the Figure
18.22 that we used to describe the first zone. These extra bisectors also enclose
various regions of $\vec{k}$-space.
Consider the shaded region in Figure 18.23. We reach this region from the origin
by crossing one boundary of the first zone. Each $\vec{k}$ vector in this region is inside
all Brillouin zone boundaries (the bisectors) except one - the one we crossed.
Therefore, for each vector in this region there is one and only one smaller $\vec{k}$
vector, namely, $\vec{k} - \vec{G}$, where $\vec{G}$ is the reciprocal lattice translation vector whose
bisector we crossed. This region is part of the second Brillouin zone.
The second Brillouin zone is defined as that region of $\vec{k}$-space containing each
wave vector that is smaller than all but one of the vectors equivalent to it.
Looking at Figure 18.23, there are four parts of the second zone (labelled by II
in Figure 18.23). Crossing one more boundary - one of the boundaries of region
II - we enter a region of $\vec{k}$-space where each $\vec{k}$ vector is smaller than all but two
equivalent $\vec{k}$ vectors. This region is the third Brillouin zone (regions labelled
by III in Figure 18.23). In a similar manner the fourth zone can be identified
(labelled by IV in Figure 18.23).
Central to the construction of the first Brillouin zone was the choice of a point
as the origin. The selection of the origin point, however, is arbitrary. We could
have chosen any point of the reciprocal lattice as the origin and constructed a
zone identical to the first Brillouin zone.
Another property of Brillouin zones that are higher than the first zone is that,
although they consist of several disjoint segments, using an appropriate choice
of reciprocal lattice translation vectors $\vec{G}$, we can translate each segment of
any higher zone into the first zone as shown in Figure 18.25 below. When
all segments have been translated, they will fully cover the first zone with no
overlap, i.e., the area (volume in 3 dimensions) of each zone is identical.
Figure 18.25: Translating higher zone to first zone
$$\vec{A} = \frac{2\pi}{d}\,\hat{x} , \quad \vec{B} = \frac{2\pi}{d}\,\hat{y} , \quad \vec{C} = \frac{2\pi}{d}\,\hat{z} \tag{18.58}$$
To construct the first Brillouin zone, we choose a point of the reciprocal lattice
as the origin and draw reciprocal lattice translation vectors to all adjacent points
of the lattice.
the first Brillouin zone. The bisectors, in this case (see figure below), are six
planes perpendicular to the axes and intersecting these axes at $\pm\pi/d$. The first
zone, as shown in Figure 18.26 below, forms a cube.
The second Brillouin zone of the simple cubic lattice is a little more difficult
to visualize because it involves more intersecting planes. It is defined by these
boundaries:
1. the 6 planes that enclose the first Brillouin zone
2. the 12 planes that are orthogonal to the (110) axes and intersect these
axes at $\sqrt{2}\,(\pi/d)$, as shown in Figure 18.27 below.
Figure 18.27: Second Brillouin zone of Simple Cubic Lattice
Figure 18.28 below shows three typical segments of the second Brillouin zone.
The second zone is a dodecahedron with a cube (the first zone) removed from its
center.
Electronic States in a Brillouin Zone
Keep in mind that the significance of each vector in $\vec{k}$-space resides in its effect
on the electronic state of the crystal $\psi_{E,\vec{k}}(\vec{r})$. In order to determine an analytic
form for this function we must solve the Schrödinger equation for the crystal.
Whatever the form of the wave function, however, we know that it satisfies
Bloch's theorem
$$\psi_{E,\vec{k}}(\vec{r} + \vec{T}) = e^{i\vec{k}\cdot\vec{T}}\,\psi_{E,\vec{k}}(\vec{r}) \tag{18.60}$$
Each different ~k vector in the first Brillouin zone corresponds to a different elec-
tron state. We now ask the question: How many distinct electronic states are
represented by wave vectors in the first Brillouin zone?
If the crystal were actually infinite, then there would be an infinite number of
different wave vectors in the first zone and hence the number of electronic states
would also be infinite. However, no real crystal is truly infinite. We now see how
to take this fact into account.
We denote by N the number of primitive unit cells in the crystal. We will now
prove that there are exactly N electronic states in the first Brillouin zone, i.e., N
distinct $\vec{k}$ vectors in that zone. Since each electronic state can be occupied by
at most two electrons, one with spin up and one with spin down, the electronic
states in the first zone can be occupied by at most 2N electrons.
We do the proof in one dimension (it can easily be extended to 2 and 3 dimensions).
We consider a 1-dimensional crystal of length L with lattice spacing d (so that
L = Nd since there are N primitive cells). We associate a basis with each lattice
point. We are only interested in bulk properties so we neglect edge effects
and assume that L, the length of the crystal, is chosen so that we have periodic
boundary conditions
$$\psi_{E,\vec{k}}(0) = \psi_{E,\vec{k}}(L) \tag{18.61}$$
i.e., the crystal repeats itself forever!
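or that
$$e^{i\vec{k}\cdot\vec{L}} = 1 \;\Rightarrow\; \vec{k}\cdot\vec{L} = kL = kNd = 2\pi n , \quad n = \text{integer} \tag{18.63}$$
Thus,
$$k = \frac{2\pi n}{L} = \frac{2\pi}{d}\,\frac{n}{N} , \quad n = 0, \pm 1, \pm 2, \ldots, \pm\frac{N}{2} \tag{18.64}$$
The reciprocal lattice of this direct lattice is a lattice in $\vec{k}$-space of equally spaced
points with separation of adjacent points equal to $2\pi/d$.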
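The first Brillouin zone extends from $-\pi/d$ to $+\pi/d$, such that
$$k = -\pi/d \;\Rightarrow\; n = -N/2 \tag{18.65}$$
$$k = +\pi/d \;\Rightarrow\; n = +N/2 \tag{18.66}$$
Thus, for the values of $\vec{k}$ in the first zone, the integer n can assume all integral
values in the range $[-N/2, +N/2]$. The two endpoints differ by the reciprocal lattice
vector $2\pi/d$ and so label the same state; counting them once leaves N distinct
integers in this range.
This result says that there are N electronic states in each Brillouin zone.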
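The counting argument can be checked directly. The short Python sketch below enumerates the wave vectors allowed by periodic boundary conditions and keeps those with $-\pi/d < k \le +\pi/d$, so that the two equivalent zone-boundary points are counted once; the function name and the sample values of N are assumptions made for the illustration.

```python
import math

def first_zone_wave_vectors(N, d):
    """Allowed k = 2*pi*n/(N*d) with -pi/d < k <= +pi/d (periodic boundary conditions)."""
    # The integer test -N < 2n <= N avoids floating-point comparisons at the boundary.
    return [2 * math.pi * n / (N * d) for n in range(-N, N + 1) if -N < 2 * n <= N]

N, d = 8, 1.0
ks = first_zone_wave_vectors(N, d)
print(len(ks))   # number of distinct states in the first zone
```

For both even and odd N the list contains exactly N wave vectors, in agreement with the result above.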
All of these ideas form the basis of the detailed discussion of the electronic states
and energies in a crystal, which we now begin.
First, we will use the so-called free-electron approximation in which the crystal
potential energy function is replaced by a constant. This is a very simple and
a very crude approximation. It reflects only the fact that the electron is con-
strained to move within the crystal.
We will not attempt to find explicit forms for the electronic wave functions,
but instead concentrate on determining the energy eigenvalues. In particular,
we shall only attempt a determination of graphs for the energy versus the wave
vector and the density of states as a function of the energy. These features
will enable us to get a clear picture of what is happening in crystals due to
translational symmetry.
on the electron is to keep it confined inside the crystal. If this is the case,
then we can neglect the periodicity and the singularities in a free-electron
approximation and determine what we might call a zero-order picture of the
energy as a function of the wave vector. This picture will then be modified in a
weak-binding approximation by reimposing periodicity and the singularities so
that the basic structure of the energy versus wave vector curve remains essentially
unchanged except where the added constraints are dominant. That is the reason
for the name weak-binding.
The dashed line in the figure represents V (x) in the free-electron approximation.
It is just a finite square well in one dimension. It is equivalent to replacing the
periodic potential with its average value.
At this point, no limit has been imposed on $\vec{k}'$. It is called an extended-zone-scheme
wave vector. In this case, $\vec{k}'$ is a redundant label because $E$ and $\vec{k}'$
are simply related by (18.69).
In the reduced zone scheme we can also write the wave function as a Bloch wave
using
$$\psi_{E,\vec{k}}(\vec{r}) = Ae^{i\vec{G}\cdot\vec{r}}\,e^{i\vec{k}\cdot\vec{r}} \;\Rightarrow\; u_{E,\vec{k}}(\vec{r}) = Ae^{i\vec{G}\cdot\vec{r}} \tag{18.71}$$
In the reduced zone scheme, both the energy and $\vec{k}$ labels are required since these
two quantities are no longer simply related as in the extended zone scheme.
On the horizontal axis - the k-axis - we find values of k from every Brillouin
zone of the lattice and corresponding to each k value there is an E value.
The plot of E versus k above is in the extended zone scheme, i.e., it contains
values of k from all Brillouin zones. The labeled k values correspond to the
Brillouin zone boundaries.
Our earlier discussions imply that as far as Bloch's theorem is concerned, the
only significant values of k are those in the first Brillouin zone.
As we saw earlier, for a 1-dimensional crystal with N primitive unit cells, the
allowed values of k in the first zone are given by
$$k = \frac{2\pi}{d}\,\frac{m}{N} , \quad m = 0, \pm 1, \pm 2, \ldots, \pm\frac{N}{2} \tag{18.72}$$
Since N is a very large number, in general, the values of k are very closely
spaced.
We can restrict our attention to the first Brillouin zone by plotting E versus k
in the reduced zone scheme. To do so we translate all $\vec{k}$ vectors in the higher
zones on the plot in the extended zone scheme (Figure 18.30 above) into the
first zone given by
$$-\frac{\pi}{d} \le k \le +\frac{\pi}{d} \tag{18.73}$$
That means we add the appropriate reciprocal lattice translation vector to $\vec{k}'$
to generate a wave vector in this range,
$$k = k' - \frac{2\pi}{d}\,n' , \quad n' = \text{integer} \tag{18.74}$$
In Figure 18.31 below we show explicitly how to generate the reduced zone plot.
The arrows labeled a correspond to $n' = 1$ and the arrows labeled b correspond
to $n' = 2$.
The final plot in the reduced zone scheme is shown in Figure 18.32 below.
We note that in the plot for the reduced zone scheme an infinite number of
energies correspond to each value of k in the first zone. There is an electronic
wave function corresponding to each energy.
For each k, we can specify a particular wave function by giving the energy,
i.e., $\psi_{E,k}$, or we can introduce a new quantum number to distinguish different
electronic states with the same value of k. This new quantum number, n,
is shown in Figure 18.32 above. We say that each of the ranges of energy
corresponding to k in the first Brillouin zone is a band and n labels the bands
and is called the band index. We write $\psi_{n,k}(x)$. Figure 18.32 above shows the
first four bands and part of the fifth band of the 1-dimensional crystal.
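The passage from the extended to the reduced zone scheme can be sketched as follows for the 1-dimensional free-electron crystal. The function below is a minimal illustration, not a general band-structure routine: it assumes the free-electron energy $\hbar^2k'^2/2m_e$, measures energies in units of $\pi^2\hbar^2/2m_ed^2$, and assumes the wave vector does not sit exactly on a zone boundary. The function name is hypothetical.

```python
import math

def reduce_free_electron_state(k_ext, d):
    """Map an extended-zone wave vector to (band index, reduced k, energy).

    Energy is returned in units of pi^2 hbar^2 / (2 m_e d^2); k_ext is assumed
    not to lie exactly on a Brillouin zone boundary.
    """
    g = 2 * math.pi / d
    band = int(abs(k_ext) * d / math.pi) + 1   # zone number = band index n
    k_red = k_ext - g * round(k_ext / g)       # translate by a reciprocal lattice vector
    energy = (k_ext * d / math.pi) ** 2
    return band, k_red, energy

for k_ext in (0.3 * math.pi, 1.5 * math.pi, 2.5 * math.pi):
    print(reduce_free_electron_state(k_ext, d=1.0))
```

Each extended-zone wave vector lands on a reduced wave vector in the first zone together with a band index, which is the pair of labels $(n, k)$ used above.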
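The wave function for an electron in this crystal with energy $E \le \hbar^2\pi^2/2m_ed^2$
is
$$\psi_{1,k}(x) = Ae^{ikx} , \quad A^2L = \int_0^L \psi_{1,k}^*(x)\,\psi_{1,k}(x)\,dx = 1 \;\Rightarrow\; A = 1/\sqrt{L} \tag{18.75}$$
For an energy in the second band, we have $\psi_{2,k}(x) = Ae^{ik'x}$ where
$$k' = k + \frac{2\pi}{d} , \; k<0 , \qquad k' = k - \frac{2\pi}{d} , \; k>0 \tag{18.76}$$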
and so on for higher bands.
Figure 18.33: Density of states D(E)
Figure 18.33 above shows that the density of states is largest at small energies
and falls off as E increases. Note that we are not restricting k to the first
Brillouin zone in this discussion, although we are taking it to be positive.
The dashed line corresponds to $T = 0$ and the solid line corresponds to $k_BT \ll \mu$.
The product $D(E)f(E)$ is equal to the density of occupied electronic states. As
can be seen from Figure 18.34 above, at $T = 0$ we find that there is an energy
$E_F$ such that all states with $E < E_F$ are occupied but no states with $E > E_F$
are occupied. $E_F$ is called the Fermi energy. At $T = 0$ the Fermi energy is the
maximum energy of the electrons in the crystal and is equal to the constant $\mu$
in the distribution function. We can define a wave vector $k_F$ corresponding to
the Fermi energy (the Fermi wave vector).
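Using
$$w(k) = \frac{Nd}{\pi} \tag{18.86}$$
we get
$$k_F = \frac{\eta\pi}{2d} \tag{18.87}$$
where $\eta$ is the number of electrons per primitive unit cell.
The corresponding Fermi energy in the free-electron approximation is then
$$E_F = \frac{\hbar^2}{2m_e}\,k_F^2 = \frac{\eta^2\pi^2\hbar^2}{8m_ed^2} \tag{18.88}$$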
In the figures below we show the location of the Fermi energy on plots of E
versus k in the extended and reduced zone schemes and on a plot of D(E)
versus E. We also show the Fermi energies (and Fermi wave vectors) for various
values of the number of electrons per primitive unit cell $\eta$.
The arrows label the Fermi energies and Fermi wave vectors in the figures above
and below.
Notice that as the number of electrons per primitive unit cell increases, the
Fermi energy, and hence the number of occupied states, increases. The Fermi
energy enables us to describe the occupation of electronic states in the crystal
(at T = 0), and in the 1-dimensional free-electron approximation it can be
calculated knowing only the repeat distance d and the number of electrons per
primitive unit cell $\eta$.
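As a small worked example, the Fermi wave vector and free-electron Fermi energy of the 1-dimensional crystal can be evaluated numerically. The sketch below assumes the relation $k_F = \pi \times (\text{electrons per cell})/2d$ together with $E_F = \hbar^2k_F^2/2m_e$; the repeat distance of 3 angstroms and the function name fermi_1d are illustrative assumptions.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J s
M_E = 9.1093837015e-31      # electron mass, kg
EV = 1.602176634e-19        # joules per electron volt

def fermi_1d(electrons_per_cell, d):
    """Free-electron Fermi wave vector (1/m) and Fermi energy (J) in one dimension."""
    k_f = electrons_per_cell * math.pi / (2 * d)
    e_f = HBAR ** 2 * k_f ** 2 / (2 * M_E)
    return k_f, e_f

d = 3.0e-10   # assumed repeat distance (3 angstroms)
for electrons_per_cell in (1, 2, 3):
    k_f, e_f = fermi_1d(electrons_per_cell, d)
    print(electrons_per_cell, k_f, e_f / EV)   # Fermi energy in eV
```

With two electrons per cell the Fermi wave vector reaches the zone boundary $\pi/d$, and the Fermi energy scales as the square of the number of electrons per cell.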
The shaded region shows the occupied states for a crystal with $\eta = 2$.
18.4 Introduction to the Weak-Binding Approximation
We must now introduce the periodicity of the crystal potential energy. We write
H = T + V (x) (18.89)
where V(x) has the periodicity of the crystal. We assume that the interaction of
the electrons with the potential energy is weak, so that they still move easily within
the crystal, although they are now influenced by the periodic potential. This is
the so-called weak-binding approximation.
We still cannot solve the exact Schrödinger equation. How can we introduce the
weak periodic potential and also use all the results of the free-electron
approximation?
One possible approach is to use perturbation theory, where the free-electron
approximation is used as the zeroth-order solution and V(x) as the perturbation
Hamiltonian. For example, using
$$\psi^{(0)}_{E,k}(x) = \frac{1}{\sqrt{L}}\,e^{ikx} \tag{18.90}$$
as the zeroth-order wave functions, the first-order perturbed wave function can
be written as
$$\psi_{E,k}(x) = \psi^{(0)}_{E,k}(x) + \sum_{k' \ne k} \frac{\langle k'|V|k\rangle}{E^{(0)}(k) - E^{(0)}(k')}\,\psi^{(0)}_{E,k'}(x)$$
$$= \frac{1}{\sqrt{L}}\,e^{ikx} + \sum_{k' \ne k} \frac{V_{k'k}}{(\hbar^2k^2/2m_e) - (\hbar^2k'^2/2m_e)}\,\frac{1}{\sqrt{L}}\,e^{ik'x} \tag{18.91}$$
where
$$V_{k'k} = \langle k'|V|k\rangle = \frac{1}{L}\int_{-L/2}^{+L/2} e^{i(k-k')x}\,V(x)\,dx \tag{18.92}$$
This expression represents a mixing of the functions $e^{ik'x}$ with the zeroth-order
function $e^{ikx}$ to form the first-order approximate wave function.
If perturbation theory is valid in this case, then the above result is a good
approximation to the electronic wave function.
Let us assume that this approximation is valid and investigate the summation
term. The summation is over all allowed wave vectors so it represents a large
number of terms. We ask the question - are there any terms where $V_{k'k} = 0$? If
so, then these functions do not mix with
$$\psi^{(0)}_{E,k}(x) = \frac{1}{\sqrt{L}}\,e^{ikx}$$
We can answer this question by applying the translation operator $T_{op}$ on the
integrand of $V_{k'k}$. This leaves $V_{k'k}$ unchanged since it simply corresponds to
redefining the zero of the dummy variable of integration, x, and we may consider
the integral to extend from $-\infty$ to $+\infty$ if $nd \ll L$. For a 1-dimensional crystal,
any lattice translation vector can be written as nd so we have
$$V_{k'k} = \frac{1}{L}\int_{-L/2}^{+L/2} T_{op}\,e^{i(k-k')x}\,V(x)\,dx = \frac{1}{L}\int_{-L/2}^{+L/2} e^{i(k-k')(x+nd)}\,V(x+nd)\,dx \tag{18.93}$$
$$V_{k'k} = \frac{1}{L}\int_{-L/2}^{+L/2} e^{i(k-k')(x+nd)}\,V(x)\,dx = e^{i(k-k')nd}\,V_{k'k} \tag{18.94}$$
Since both V(x) and $e^{-iGx}$ are periodic, $V_G$ can be written as the integral over
one primitive unit cell
$$V_G = N\,\frac{1}{L}\int_{-d/2}^{+d/2} e^{-iGx}\,V(x)\,dx = \frac{1}{d}\int_{-d/2}^{+d/2} e^{-iGx}\,V(x)\,dx \tag{18.98}$$
This important result - that the only free-electron wave functions that mix with
$\psi^{(0)}_{E,k}(x)$ are those with wave vectors $k'$ satisfying $k' - k = G$ for some reciprocal
lattice vector G - can be shown to hold in any order of perturbation theory. It
is true, in fact, even if perturbation theory breaks down!
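The selection rule can be verified numerically for a simple periodic potential. The Python sketch below evaluates $V_{k'k} = \frac{1}{L}\int e^{i(k-k')x}V(x)\,dx$ by a midpoint sum for wave vectors $k = 2\pi n/L$ allowed by periodic boundary conditions. The model potential $V(x) = V_1\cos(2\pi x/d)$, the values N = 6, d = 1, $V_1 = 0.3$, and the function name are all assumptions chosen for illustration.

```python
import cmath
import math

def matrix_element(n_k, n_kp, N, d, V, samples_per_cell=64):
    """Midpoint-rule estimate of (1/L) * integral of exp(i(k - k')x) V(x) over the crystal.

    k = 2*pi*n_k/L and k' = 2*pi*n_kp/L with L = N*d (periodic boundary conditions).
    """
    L = N * d
    M = N * samples_per_cell
    dk = 2 * math.pi * (n_k - n_kp) / L
    total = 0j
    for j in range(M):
        x = -L / 2 + (j + 0.5) * L / M
        total += cmath.exp(1j * dk * x) * V(x)
    return total / M

N, d, V1 = 6, 1.0, 0.3
V = lambda x: V1 * math.cos(2 * math.pi * x / d)   # assumed model potential

print(abs(matrix_element(0, -N, N, d, V)))   # k - k' = 2*pi/d: a reciprocal lattice vector
print(abs(matrix_element(0, -1, N, d, V)))   # k - k' = 2*pi/L: not a reciprocal lattice vector
```

Only differences $k - k'$ equal to $\pm 2\pi/d$ give a nonzero element for this potential (of magnitude $V_1/2$); all other allowed wave vectors give zero to numerical precision.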
The energy to second order can be written as
$$E_k \approx \frac{\hbar^2k^2}{2m_e} + \sum_{G \ne 0} \frac{|V_G|^2}{(\hbar^2k^2/2m_e) - (\hbar^2(k-G)^2/2m_e)} \tag{18.99}$$
where k is the wave vector in the extended zone scheme and hence $\hbar^2k^2/2m_e$ is
the energy in zeroth order.
This analysis depends on the question: Is perturbation theory valid for this
problem?
The first-order expression for the wave function is a good approximation to the
true electronic wave function provided that none of the states in the summation
over G mixes strongly with $\psi^{(0)}_{E,k}(x)$, i.e., provided that
$$\left|\frac{V_G}{(\hbar^2k^2/2m_e) - (\hbar^2(k-G)^2/2m_e)}\right| \ll 1 \tag{18.100}$$
for all $G \ne 0$. If there are states that mix strongly, i.e., for which the above
inequality is violated, then these states cannot be handled by perturbation theory.
We can rewrite the inequality as
$$|V_G| \ll \frac{\hbar^2}{2m_e}\,|G(2k - G)| \tag{18.101}$$
Since this inequality is violated for any state with $k \approx G/2$, there exist states for
which we cannot use perturbation theory (if $V_G \ne 0$). So, as long as $V_G$ is small
enough (V(x) is weak enough), there exist some states, i.e., those for which k is
not nearly equal to any G/2, for which the perturbation expansion can be used.
For these states the above expressions for the first-order wave function and the
second-order energy are valid and show that the wave functions and energies
differ from their zeroth-order counterparts, although not very much.
On the other hand, for states with $k \approx G/2$, perturbation theory cannot be
used since the function $e^{i(k-G)x}$ mixes strongly with $e^{ikx}$. We must treat these
states some other way.
It is precisely these cases for which the energy will differ most from the free-
electron energy in the dependence of E on k when we move from the free-electron
approximation to the weak-binding approximation.
Strongly Mixed States
Let us look back at the free-electron E versus k plots we developed earlier
and locate the states that are strongly mixed. We have found that states
with wave vector near $k = G/2$ will strongly mix with states with wave vector
$k' = k - G \approx -G/2$, i.e., the strongly mixed states are those with wave vectors
near opposite Brillouin zone boundaries. The plots below show (in both the
extended zone scheme and the reduced zone scheme) the states that are strongly
mixed.
The strongly mixed states are connected by an arrow (see Figures 18.38 and
18.39) and numbered for convenience (triangles indicate the state energies).
Because of the restriction $k' - k = G$, there are at most two electronic states that
are close to one another in energy and strongly mixed.
The strongly mixed states are
$$1 \leftrightarrow 2 , \quad 3 \leftrightarrow 4 , \quad 5 \leftrightarrow 6 \tag{18.102}$$
Notice that states 1 and 6 are not strongly mixed even though their wave vectors
are near zone boundaries and related by a reciprocal lattice translation vector.
These states are widely separated in energy, so the denominator in the pertur-
bation expansion term corresponding to these two states is large, and the states
are weakly mixed.
In this case, we can expand $\psi_E(x)$ in the complete set of free-particle states,
summing only over equivalent wave vectors, and keep only the terms with large
expansion coefficients for the energy of interest
$$\psi_E(x) \approx a\,\frac{1}{\sqrt{L}}\,e^{ikx} + b\,\frac{1}{\sqrt{L}}\,e^{ik'x} \tag{18.105}$$
Note that the two free-particle states (see the reduced zone energy plot above)
kept in this expansion are nearly degenerate.
We solve this problem by substituting the two state expansion for the wave
function into the Schrodinger equation
where
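$$H_{kk} = \langle k|H|k\rangle = \frac{1}{L}\int_{-L/2}^{L/2} e^{-ikx}\left(-\frac{\hbar^2}{2m_e}\frac{d^2}{dx^2} + V(x)\right)e^{ikx}\,dx = \frac{\hbar^2k^2}{2m_e} \tag{18.110}$$
$$H_{k'k'} = \langle k'|H|k'\rangle = \frac{1}{L}\int_{-L/2}^{L/2} e^{-ik'x}\left(-\frac{\hbar^2}{2m_e}\frac{d^2}{dx^2} + V(x)\right)e^{ik'x}\,dx = \frac{\hbar^2k'^2}{2m_e} \tag{18.111}$$
$$H_{kk'} = \langle k|H|k'\rangle = \frac{1}{L}\int_{-L/2}^{L/2} e^{-ikx}\left(-\frac{\hbar^2}{2m_e}\frac{d^2}{dx^2} + V(x)\right)e^{ik'x}\,dx$$
$$= \frac{1}{L}\int_{-L/2}^{L/2} e^{i(k'-k)x}\,V(x)\,dx = V_{-G} = V_G^* \tag{18.112}$$
$$H_{k'k} = \langle k'|H|k\rangle = \frac{1}{L}\int_{-L/2}^{L/2} e^{-ik'x}\left(-\frac{\hbar^2}{2m_e}\frac{d^2}{dx^2} + V(x)\right)e^{ikx}\,dx$$
$$= \frac{1}{L}\int_{-L/2}^{L/2} e^{i(k-k')x}\,V(x)\,dx = V_G \tag{18.113}$$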
by an energy separation of $2|V_G|$, which is called a band gap. As we shall see
shortly, it separates two nearly free-electron bands.
This equation shows that as we move away from a zone boundary, and in-
creases, the splitting between the two states with wave vectors k and k 0 increases.
In addition, the average energy of the two states increases.
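The behavior of the two strongly mixed states can be illustrated with a short numerical sketch. It assumes the standard result for a two-state Hamiltonian with diagonal elements $H_{kk}$, $H_{k'k'}$ and off-diagonal element $V_G$, namely $E_\pm = \frac{1}{2}(H_{kk} + H_{k'k'}) \pm \sqrt{\frac{1}{4}(H_{kk} - H_{k'k'})^2 + |V_G|^2}$, with $k' = k - G$ and units in which $\hbar^2/2m_e = 1$. The function name and the numerical values are illustrative assumptions.

```python
import math

def two_state_energies(k, G, V_G, hbar2_over_2m=1.0):
    """Eigenvalues of the 2x2 Hamiltonian coupling the free-electron states k and k' = k - G."""
    e_k = hbar2_over_2m * k ** 2
    e_kp = hbar2_over_2m * (k - G) ** 2
    mean = 0.5 * (e_k + e_kp)
    half_diff = 0.5 * (e_k - e_kp)
    root = math.sqrt(half_diff ** 2 + abs(V_G) ** 2)
    return mean - root, mean + root

G, V_G = 2 * math.pi, 0.2          # first reciprocal lattice vector for d = 1
for k in (math.pi, 0.95 * math.pi, 0.9 * math.pi):
    lower, upper = two_state_energies(k, G, V_G)
    print(k / math.pi, upper - lower)   # splitting; equals 2|V_G| at the zone boundary
```

At the zone boundary $k = G/2$ the splitting is exactly $2|V_G|$, and it grows as k moves away from the boundary.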
As we get further away from the zone boundary, we eventually leave the region
of strongly mixed states.
Once far enough from the boundary so that perturbation theory is valid and
can be used to calculate the energy shift, we know that the energies of the
electronic states in the weak-binding approximation are nearly the same as the
corresponding energies in the free-electron approximation.
Figures 18.40 and 18.41 below show the effect of the periodic potential energy on
the first two bands of the free-electron E versus k plot of the one-dimensional
crystal in the reduced and extended zone schemes. The dashed lines are the
free-electron result.
Figure 18.41: Band Gaps in Extended Zone Scheme
We can see from the figures that in the weak-binding approximation the energy
splitting calculated using the results above leads to a gap between the first and
second bands. The magnitude of the gap, $2|V_G|$, depends on the size of the
matrix element $V_G$. This means that the magnitude of the gap between bands
2 and 3 will differ from that of the gap between 1 and 2 and so on.
Bands and band gaps for the first three bands are shown in both the extended
and reduced zone schemes in Figures 18.42 and 18.43 below. The dashed lines
are the free-electron result.
Figure 18.43: Band Gaps in Extended Zone Scheme
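$$\psi_{E,c}(x) = \frac{1}{\sqrt{2}}\left(\psi^{(0)}_{E,+\pi/d}(x) + \psi^{(0)}_{E,-\pi/d}(x)\right) = \sqrt{\frac{2}{L}}\,\cos\frac{\pi x}{d} \tag{18.125}$$
$$\psi_{E,s}(x) = \frac{1}{\sqrt{2}}\left(\psi^{(0)}_{E,+\pi/d}(x) - \psi^{(0)}_{E,-\pi/d}(x)\right) = i\sqrt{\frac{2}{L}}\,\sin\frac{\pi x}{d} \tag{18.126}$$
The probability densities $|\psi_{E,c}(x)|^2$ and $|\psi_{E,s}(x)|^2$ for wave vectors at the first
Brillouin zone boundary in the case of one atom per primitive unit cell (with
the origin at one of the atoms) are shown in Figure 18.44 below.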
Figure 18.44: Probability Densities
We see that $\psi_{E,c}(x)$ has a maximum of probability density near each singularity
of the strongly attractive potential energy of the crystal. On the other hand,
$\psi_{E,s}(x)$ has a minimum of probability density near each singularity.
Consequently, we expect to see a splitting between the energies for these two wave
functions. This splitting is just the band gap $2|V_G|$.
$$2\int_0^{k_F} w(k)\,dk = \eta N \tag{18.128}$$
and does not use the assumption of free-electron behavior (it does assume a
monotonic increase of E with k in the extended zone scheme). So, in the
weak-binding approximation, we still have
$$k_F = \frac{\eta\pi}{2d} \tag{18.129}$$
where $\eta$ is the number of electrons per primitive unit cell. The expression for
the Fermi energy cannot be carried over into the weak-binding approximation
because we used the fact that the energy E was quadratic in the wave vector k
in the derivation.
Metallic and Nonmetallic Behavior
Now we consider a 1-dimensional crystal with one electron per primitive unit
cell ($\eta = 1$). The Fermi wave vector is $k_F = \pi/2d$. For this wave vector the
Fermi energy is near the middle of the first band (see the earlier figure). For
$\eta = 1$, the first band is half-filled. Since the Fermi energy is the maximum
energy for the electrons at absolute zero, no states with E > EF are occupied.
This means that there exist empty states infinitesimally close to the energy of
the highest occupied state and an infinitesimal increment of energy can excite an
electron to one of these empty states (turn on a small electric field). Thus, for
infinitesimal electric fields we get infinitesimal currents (extra kinetic energy).
In contrast, consider a crystal with two electrons per primitive unit cell. In
this case, $k_F = \pi/d$, which corresponds to the edge of the band - the first band
is full, i.e., all the electronic states in this band are occupied at absolute zero.
Since the nearest empty state is separated from the first band by an energy
gap of magnitude $2|V_G|$, at least this much energy is required in order to excite
an electron into one of the empty states.
Adjacent wells are separated by a distance s, the width of each barrier. The
repeat distance for this crystal - the length of the primitive unit cell - is d. The
only restrictions we place on the well parameters at this time are
where uE,k (x) is a periodic function invariant under translation by any lattice
translation vector T = nd, i.e.,
From the graph (Figure 18.45) of the potential function, we see that there are
two regions of potential energy:
1. ranges of x for which $V = 0$: $nd \le x \le nd + (d - s)$
2. ranges of x for which $V = V_0$: $nd + (d - s) \le x \le (n + 1)d$
where $n = 0, \pm 1, \pm 2, \ldots$ The form of the wave function is different in these
two regions.
with solutions
$$\psi_{E,k}(x) = Ce^{\beta x} + De^{-\beta x} , \quad \beta = \frac{1}{\hbar}\sqrt{2m_e(V_0 - E)} \tag{18.137}$$
There are four unknowns in these solutions A, B, C and D, but there are also
four connecting equations - two for the periodicity conditions and two for the
continuity of u and du/dx.
These four equations have nontrivial solutions only if the determinant of the
matrix of the coefficients of A, B, C and D vanishes
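$$\begin{vmatrix}
1 & 1 & -e^{-ikd}e^{\beta d} & -e^{-ikd}e^{-\beta d} \\
i\alpha & -i\alpha & -\beta e^{-ikd}e^{\beta d} & \beta e^{-ikd}e^{-\beta d} \\
e^{i\alpha(d-s)} & e^{-i\alpha(d-s)} & -e^{\beta(d-s)} & -e^{-\beta(d-s)} \\
i\alpha e^{i\alpha(d-s)} & -i\alpha e^{-i\alpha(d-s)} & -\beta e^{\beta(d-s)} & \beta e^{-\beta(d-s)}
\end{vmatrix} = 0 \tag{18.147}$$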
cos kd = cos (d s) cosh s + sin (d s) sinh s (18.148)
2
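The eigenvalue condition (18.148) is easy to explore numerically. The following Python sketch evaluates its right-hand side for $0 < E < V_0$ and tests whether a real wave vector k exists. It assumes units in which $\hbar^2/2m_e = 1$, so that $\alpha = \sqrt{E}$ and $\beta = \sqrt{V_0 - E}$; the function names and sample parameters are hypothetical.

```python
import math

def kp_rhs(E, V0, d, s):
    """Right-hand side of the eigenvalue condition (18.148) for 0 < E < V0.

    Units are chosen so that hbar^2 / (2 m_e) = 1 (an assumption of this sketch).
    """
    alpha = math.sqrt(E)
    beta = math.sqrt(V0 - E)
    w = d - s   # width of each well
    return (math.cos(alpha * w) * math.cosh(beta * s)
            + (beta ** 2 - alpha ** 2) / (2 * alpha * beta)
            * math.sin(alpha * w) * math.sinh(beta * s))

def is_allowed(E, V0, d, s):
    """An energy lies in a band when cos(kd) = rhs has a real solution k."""
    return abs(kp_rhs(E, V0, d, s)) <= 1.0

V0, d, s = 10.0, 1.0, 0.2
for E in (0.01, 2.0, 5.0, 9.0):
    print(E, kp_rhs(E, V0, d, s), is_allowed(E, V0, d, s))
```

Scanning E over a fine grid and recording where is_allowed changes value gives the band edges for any choice of $V_0$, d and s.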
Limiting Cases
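1. In the limit $V_0 \to \infty$, the particle will be confined to one of the square wells
in the model crystal. In this case the energy eigenvalue equation becomes
$$\frac{\cos kd}{\cosh\beta s} = \cos\alpha(d - s) + \frac{\beta^2 - \alpha^2}{2\alpha\beta}\,\sin\alpha(d - s)\,\tanh\beta s$$
$$0 = \cos\alpha(d - s) + \frac{\beta}{2\alpha}\,\sin\alpha(d - s)$$
$$\sin\alpha(d - s) = 0 \;\Rightarrow\; \alpha(d - s) = n\pi$$
$$\alpha^2 = \frac{n^2\pi^2}{(d - s)^2} = \frac{2m_eE}{\hbar^2}$$
$$E = \frac{n^2\pi^2\hbar^2}{2m_e(d - s)^2} \tag{18.149}$$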
2. Now we keep $V_0$ finite, fix the width of each well, and let the well separation
$s \to \infty$. In this case, the wells will be completely isolated from one another
and we expect the eigenvalue equation to reduce to the equation for a
particle in a single finite square well. We find
$$0 = \cos\alpha(d - s) + \frac{\beta^2 - \alpha^2}{2\alpha\beta}\,\sin\alpha(d - s)$$
$$\cot\alpha(d - s) = \frac{\alpha^2 - \beta^2}{2\alpha\beta} \tag{18.150}$$
which is the standard transcendental equation of the finite square well.
3. Let us remove the periodicity of V(x). There are three ways to do this:
$$V_0 \to 0 \quad\text{or}\quad s \to 0 \quad\text{or}\quad s \to d \tag{18.151}$$
In this case we have
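$$\cos kd = \cos\alpha d + \frac{\beta}{2\alpha}\,\sin\alpha d\,\sinh\beta s \tag{18.154}$$
$$\cos kd = \cos\alpha d + \frac{\frac{1}{\hbar}\sqrt{2m_eV_0}}{2\alpha}\,\sin\alpha d\,\sinh\left(\frac{1}{\hbar}\sqrt{2m_eV_0}\,s\right) \tag{18.155}$$
$$\cos kd = \cos\alpha d + \frac{\frac{1}{\hbar}\sqrt{2m_eV_0}}{2\alpha}\,\sin\alpha d\;\frac{1}{\hbar}\sqrt{2m_eV_0}\,s \tag{18.156}$$
$$\cos kd = \cos\alpha d + \frac{m_eV_0sd}{\hbar^2}\,\frac{\sin\alpha d}{\alpha d} \tag{18.157}$$
If we choose
$$\frac{m_eV_0sd}{\hbar^2} = 1 \tag{18.158}$$
we have
$$\cos kd = \cos\alpha d + \frac{\sin\alpha d}{\alpha d} \tag{18.159}$$
This result clearly shows the band structure, that is, since the value of the left
hand side of this equation is between $-1$ and $+1$, solutions only exist for energies
E such that the magnitude of the right hand side is $\le 1$.
In Figure 18.46 below we plot the right hand side versus $\alpha d$ for $\alpha d \le 3\pi$.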
We find that the lowest allowed energy of the particle in this Kronig-Penney
crystal is
$$E = 0.174\,\frac{\pi^2\hbar^2}{2m_ed^2} \tag{18.160}$$
Looking at the regions lying between $+1$ and $-1$ we see the structure of energy
bands and band gaps clearly revealed. In particular, the particle can have any
energy in the following ranges (in units of $\pi^2\hbar^2/2m_ed^2$)
$0.174 \le E \le 1$ band 1
$1.370 \le E \le 4$ band 2
$4.490 \le E \le 9$ band 3
States of the system exist for any energy in one of these bands. No solutions
exist if
$1 < E < 1.37$ gap 1
$4 < E < 4.49$ gap 2
These results are very different from those obtained by solution of the Schrodinger
equation for an atom or molecule. Instead of discrete bound levels, we find con-
tinuous bands of allowed energies and intervening gaps of forbidden energies.
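The band edges of the condition $\cos kd = \cos\alpha d + \sin\alpha d/\alpha d$ can be located numerically by finding where the right-hand side equals $\pm 1$. The sketch below does this by bisection for the bottoms of the first two bands, writing $x = \alpha d$ and expressing energies as $(x/\pi)^2$, i.e., in units of $\pi^2\hbar^2/2m_ed^2$. The bracketing intervals, tolerance and function names are assumptions of this illustration, and the computed values should be compared with the rounded values quoted in the text rather than taken as replacements for them.

```python
import math

def rhs(x, P=1.0):
    """Right-hand side cos(x) + P*sin(x)/x with x = alpha*d; P = 1 is assumed here."""
    return math.cos(x) + P * math.sin(x) / x

def bisect(g, lo, hi, tol=1e-12):
    """Root of g in [lo, hi] by bisection; g(lo) and g(hi) must differ in sign."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (g(mid) > 0) == (g_lo > 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Bottom of band 1: rhs falls through +1 between x = 0 and x = pi
x1 = bisect(lambda x: rhs(x) - 1.0, 0.1, math.pi)
# Bottom of band 2: rhs rises through -1 between x = pi and x = 2*pi
x2 = bisect(lambda x: rhs(x) + 1.0, math.pi + 1e-6, 2 * math.pi)

band1_bottom = (x1 / math.pi) ** 2
band2_bottom = (x2 / math.pi) ** 2
print(band1_bottom, band2_bottom)
```

The upper edges of the bands fall at $x = n\pi$, where $\sin\alpha d = 0$ and the right-hand side equals $\pm 1$ exactly, which is why the bands end at 1, 4 and 9 in these units.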
The dashed line shows the free-electron approximation, i.e., treating the Kronig-
Penney model as a giant square well.
Weak-Binding Limit
In this limit we let the well separation s get small. In particular, we assume that
s is sufficiently small that the barriers can be treated as a small perturbation
on a constant potential energy. In this way, we can approach the weak-binding
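limit through the exact energies. If s is small we assume that
$$E = E_0 + \varepsilon = \frac{\hbar^2k^2}{2m_e} + \varepsilon \tag{18.162}$$
We want to expand the exact eigenvalue equation in the small parameter $\varepsilon$. We
write
$$\alpha = \alpha_0 + \alpha_1 = \frac{1}{\hbar}\sqrt{2m_eE_0} + \alpha_1 = \frac{1}{\hbar}\sqrt{2m_eE} \tag{18.163}$$
$$\beta = \beta_0 + \beta_1 = \frac{1}{\hbar}\sqrt{2m_e(V_0 - E_0)} + \beta_1 = \frac{1}{\hbar}\sqrt{2m_e(V_0 - E)} \tag{18.164}$$
We then have
$$\alpha_1 = \frac{1}{\hbar}\sqrt{2m_e(E_0 + \varepsilon)} - \frac{1}{\hbar}\sqrt{2m_eE_0} = \frac{1}{\hbar}\sqrt{2m_eE_0}\left(\sqrt{1 + \varepsilon/E_0} - 1\right)$$
$$= \frac{1}{\hbar}\sqrt{2m_eE_0}\left((1 + \varepsilon/2E_0 + \ldots) - 1\right) = \frac{m_e\varepsilon}{\hbar^2k} - \frac{m_e^2\varepsilon^2}{2\hbar^4k^3} \tag{18.165}$$
and
$$\beta_1 = \frac{1}{\hbar}\sqrt{2m_e(V_0 - E)} - \frac{1}{\hbar}\sqrt{2m_e(V_0 - E_0)}$$
$$= -\frac{\beta_0\,\varepsilon}{2(V_0 - E_0)} - \frac{\beta_0\,\varepsilon^2}{8(V_0 - E_0)^2} \tag{18.166}$$
Using trigonometric identities for $\sin(a \pm b)$, $\cos(a \pm b)$ and Taylor series
expansions of the cos, sin, cosh and sinh functions, the eigenvalue equation becomes
$$\cos kd = \cos\alpha d + \frac{\alpha^2 + \beta^2}{2\alpha}\,s\,\sin\alpha d \tag{18.167}$$
We can also write
$$\frac{\alpha^2 + \beta^2}{2\alpha} = \frac{\alpha_0^2 + \beta_0^2}{2\alpha_0} - \frac{\alpha_0^2 + \beta_0^2}{2\alpha_0^2}\,\alpha_1 \tag{18.168}$$
This only retains terms to order $\varepsilon$, but when multiplied by $s\sin\alpha d$ the second
order terms are obtained.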
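Substitution gives
$$\left[-\frac{1}{2}\alpha_1^2 d^2 + \frac{\alpha_0^2 + \beta_0^2}{2\alpha_0}\, s\,\alpha_1 d\right]\cos kd + \left[-\alpha_1 d + \frac{\alpha_0^2 + \beta_0^2}{2\alpha_0}\, s - \frac{\alpha_0^2 + \beta_0^2}{2\alpha_0^2}\,\alpha_1 s\right]\sin kd = 0 \tag{18.169}$$
Using this expression we can solve explicitly for the energy for wave vectors at the Brillouin zone boundaries and determine the band gaps. At the zone boundaries
$$kd = n\pi\ , \quad n = \text{integer} \tag{18.170}$$
Therefore, at zone boundaries the second term is zero (since $\sin kd = 0$) and the first term gives two conditions for $\alpha_1$
$$\alpha_1 d = 0 \quad \text{or} \quad \alpha_1 d = \frac{\alpha_0^2 + \beta_0^2}{\alpha_0}\, s \tag{18.172}$$
The first choice gives $\epsilon = 0$ and the second choice gives $\epsilon = 2V_0 s/d$ (using $\alpha_1 \approx m_e\epsilon/\hbar^2 k$ and $\alpha_0^2 + \beta_0^2 = 2m_e V_0/\hbar^2$), and thus we get a band gap of energy $\Delta E = 2V_0 s/d$ at the zone boundary in the weak-binding approximation.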
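As a numerical check of the weak-binding gap, the following sketch locates the two band edges at $kd = \pi$ from an exact Kronig-Penney condition and compares their separation with $2V_0 s/d$. It assumes the standard form of the condition for $E < V_0$ with barriers of width $s$ and period $d$ (that equation is not reproduced in this section), units $\hbar = m_e = 1$, and purely illustrative parameter values; the function names are hypothetical.

```python
import math

def kp_rhs(E, V0, s, d):
    """Right-hand side of cos(kd) = rhs(E) for 0 < E < V0 (hbar = m_e = 1).

    Assumed geometry: barriers of height V0 and width s, lattice period d.
    """
    alpha = math.sqrt(2.0 * E)
    beta = math.sqrt(2.0 * (V0 - E))
    w = d - s  # well width
    return (math.cosh(beta * s) * math.cos(alpha * w)
            + (beta**2 - alpha**2) / (2.0 * alpha * beta)
            * math.sinh(beta * s) * math.sin(alpha * w))

def band_edges_near(E_center, V0, s, d, half_width, n_grid=4000):
    """Energies near E_center where rhs(E) = -1, i.e. band edges at kd = pi."""
    f = lambda E: kp_rhs(E, V0, s, d) + 1.0
    lo = E_center - half_width
    step = 2.0 * half_width / n_grid
    roots = []
    for i in range(n_grid):
        a, b = lo + i * step, lo + (i + 1) * step
        fa, fb = f(a), f(b)
        if fa * fb < 0.0:            # sign change: refine by bisection
            for _ in range(80):
                mid = 0.5 * (a + b)
                fm = f(mid)
                if fa * fm <= 0.0:
                    b = mid
                else:
                    a, fa = mid, fm
            roots.append(0.5 * (a + b))
    return roots

# Illustrative values (assumptions): thin, fairly high barriers.
V0, s, d = 20.0, 0.01, 1.0
E0 = math.pi**2 / 2.0                 # free-particle energy at k = pi/d
edges = band_edges_near(E0, V0, s, d, half_width=1.0)
gap_exact = edges[1] - edges[0]
gap_weak = 2.0 * V0 * s / d           # weak-binding estimate
print(edges, gap_exact, gap_weak)
```

For thin barriers the two gap values should agree closely; the residual difference is of higher order in $V_0 s$.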
The study of electron band theory in one dimension was comparatively simple
for several reasons. First, there is only one Bravais lattice in one dimension;
thus the only difference between various one-dimensional problems arises from
such factors as the number of atoms per primitive unit cell, the strength of
the potential, and the separation of the lattice points. This situation contrasts
strikingly with that in two and three dimensions, where there are 5 and 14
Bravais lattices, respectively, each of which poses its own special problem.
18.6.1 The Free-Electron Approximation
Let us consider a two-dimensional crystal in the free-electron approximation,
that is, we shall neglect the periodic part of the crystal potential energy V (x, y)
and assume that the electrons are free to move about within the confines of the
crystal. In this approximation the equation relating the energy of an electronic
state to the wave vector is
$$E = \frac{\hbar^2 k^2}{2m_e} \tag{18.173}$$
where the two-component wave vector can be expressed as
$$\vec k = k_x\hat x + k_y\hat y \tag{18.174}$$
In order to see the basic structure of such a crystal, we must examine plots of E
versus k. Since k is a two-component vector, E(k) is a paraboloid of revolution,
as shown in Figure 18.48 below in the extended zone scheme.
Eventually we will want to use the reduced zone scheme and identify E versus
k graphs so as to incorporate the periodic nature of $V(\vec r)$. It is clear that each
of these steps is rather difficult if we use the three-dimensional graph in Figure
18.48.
Moreover, even if we could draw the resulting diagrams, their complexity might
obscure some of the information they contain.
For this reason, we will now introduce two other ways of representing the energy
of the electronic states of a crystal.
path. We will represent the distance along the path by the integral
$$\int_c dk \tag{18.175}$$
For example, suppose that we choose the path shown in Figure 18.49 below.
Using the quadratic relation between E and k for the energy in the free-electron
approximation, we can sketch E as a function of distance along this path, ob-
taining the curve shown in Figure 18.50 below (in the extended zone scheme).
As E increases, the radius of the circle increases. Thus we obtain a series of
concentric circles; several constant energy contours are shown in Figure 18.51
below, where the extended zone scheme is used.
Now, in going to the reduced zone scheme, we must specify a particular type
of Bravais lattice. Recall from our earlier discussion that there are five Bravais
lattices in two dimensions: the oblique lattice, the primitive rectangular lattice,
the centered rectangular lattice, the square lattice, and the hexagonal lattice.
To illustrate the preparation of graphs of E versus path distance and constant
energy contours, we choose the square lattice. Let d denote the spacing of adja-
cent lattice points in the x or y direction in the square lattice. Then the spacing
of adjacent points in the x or y direction in the reciprocal lattice is $2\pi/d$.
The reciprocal of the square lattice is square and possesses fourfold symmetry.
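The relation between the direct and reciprocal square lattices can be checked with a short sketch. It constructs two-dimensional reciprocal vectors from the defining condition $\vec a_i\cdot\vec b_j = 2\pi\delta_{ij}$; the function name and the choice $d = 1$ are assumptions made only for illustration.

```python
import math

def reciprocal_2d(a1, a2):
    """Return (b1, b2) satisfying a_i . b_j = 2*pi*delta_ij in two dimensions."""
    area = a1[0] * a2[1] - a1[1] * a2[0]   # signed area of the primitive cell
    b1 = (2 * math.pi * a2[1] / area, -2 * math.pi * a2[0] / area)
    b2 = (-2 * math.pi * a1[1] / area, 2 * math.pi * a1[0] / area)
    return b1, b2

d = 1.0                                    # lattice spacing (illustrative)
b1, b2 = reciprocal_2d((d, 0.0), (0.0, d))
print(b1, b2)                              # spacing 2*pi/d along x and y
```

The same function applies to the other two-dimensional Bravais lattices once their primitive vectors are supplied.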
A path that passes through all points of high symmetry in the first zone but
that does not contain segments equivalent to one another under the symmetry
operations of the square lattice is shown in Figure 18.52 below.
We will now compute the value of E at points on this path and on equivalent
paths in the surrounding regions of the reciprocal lattice. In Figure 18.53 below,
we have translated the first Brillouin zone and the chosen path throughout a
portion of the reciprocal lattice.
The resulting squares are numbered (1) through (9) for purposes of identifica-
tion. We have also indicated the value of E in the free-electron approximation
at several points in k space - specifically, at the corners of each path (units of
energy are $\hbar^2\pi^2/2m_e d^2$).
A typical point, say the 13 along the right edge in the (9) box has
$$\vec k = \frac{3\pi}{d}\hat x + \frac{2\pi}{d}\hat y \tag{18.177}$$
so that its energy, in the free-electron approximation, is
$$E = \frac{\hbar^2 k^2}{2m_e} = 13\,\frac{\hbar^2\pi^2}{2m_e d^2} \tag{18.178}$$
The origin is at the center of the (0) square.
The reason for considering values of $\vec k$ outside the first zone is that we wish
to take into account energies greater than $\hbar^2\pi^2/2m_e d^2$, the maximum value of
energy that corresponds to points all of which lie within the first zone. Notice
that each path in a region other than the first zone is equivalent to the path in
the first zone. The wave vectors $\vec k'$ of these equivalent paths will be related to
wave vectors in the first zone by some reciprocal lattice translation vector
$$\vec k' = \vec k + \vec G \tag{18.179}$$
Several different methods exist to generate data like that shown in the above
figure - and hence the resulting free-electron energy bands. The following steps
will enable us to obtain the necessary information in a fairly organized way:
(1) Draw all paths with $|\vec k'|$ less than some maximum value corresponding to
an upper limit on the energy for which the resulting bands will be accurate (we
choose $3\pi/d$ for the upper limit on $|\vec k'|$ in this example).
We will guarantee an accurate sketch of $E$ versus $\int_c dk$ only for energies below
$\hbar^2 k_{max}^2/2m_e$, where $k_{max}$ is the maximum value of $|\vec k'|$. Above this energy, we
may not have included all the necessary path segments.
for each important point on the path. (In this case, the three corners of the
triangular paths are the important points, for it is there that the path changes
direction and the relationship of $E$ to $\int_c dk$ is altered.)
We now use the data in Figure 18.53 to plot the energy in the reduced zone
scheme for each path. These energies are then connected by segments of a
parabola. This process yields the desired plot of E versus path distance.
For this example, the plot is shown in Figures 18.54-18.60 below. Notice that
in the free-electron approximation portions of some of the paths are degenerate
in energy. Most of this degeneracy will be lifted in the weak-binding approximation
(degenerate curves in this figure have more than one number associated
with them). We have plotted a sequence of graphs showing each path being
added.
Figure 18.56: Step #3
Figure 18.59: Step #6
Above, we have shown the energy versus path distance for the nine paths in the
free-electron approximation. The energy is plotted in reduced units. The little
numbers beside each path segment correspond to the path indices (1) through
(9) of the earlier figure. The arrow indicates the upper energy limit corresponding
to $|\vec k'| = 3\pi/d$.
Notice carefully that the figure does not show the energy bands explicitly. How-
ever, we can extract them from this sketch.
The curve of E versus path distance that is everywhere lowest in energy corre-
sponds to the first band, the curve that is second lowest in energy corresponds
to the second band, and so on.
Thus we need only examine the figure in each region of $(d/\pi)(k_x, k_y)$, from (0, 0)
to (1, 0), (1, 0) to (1, 1), and (1, 1) to (0, 0), and select the first, second, ...
lowest curves to obtain plots of E versus path distance for the first, second, ...
bands. We have extracted the first nine bands in Figure 18.61 below.
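The band-extraction rule above (at each $\vec k$, the n-th lowest free-electron energy belongs to the n-th band) can be sketched numerically. The snippet below is a minimal illustration, assuming a square lattice, reduced energy units $\hbar^2\pi^2/2m_e d^2$, wave vectors written as $(\pi/d)(k_x, k_y)$, and a hypothetical cutoff `g_max` on the reciprocal lattice vectors $\vec G = (2\pi/d)(m, n)$.

```python
def free_electron_bands(kx, ky, n_bands=9, g_max=3):
    """Lowest free-electron energies at k = (pi/d)(kx, ky) in the reduced zone.

    Energies are (d/pi)^2 |k + G|^2, i.e. in units of hbar^2 pi^2 / (2 m_e d^2),
    with G = (2*pi/d)(m, n) and |m|, |n| <= g_max (a truncation chosen here).
    """
    energies = [
        (kx + 2 * m) ** 2 + (ky + 2 * n) ** 2
        for m in range(-g_max, g_max + 1)
        for n in range(-g_max, g_max + 1)
    ]
    return sorted(energies)[:n_bands]

# Corners of the triangular path: (0,0), (1,0) and (1,1) in units of pi/d.
gamma = free_electron_bands(0, 0)
x_pt = free_electron_bands(1, 0)
m_pt = free_electron_bands(1, 1)
print(gamma)
print(x_pt)
print(m_pt)
```

Evaluating the same function at intermediate points along each leg of the path gives the sorted curves from which the bands are read off; repeated values signal the free-electron degeneracies mentioned above.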
The section of this graph between (0,0) and (1,0) shows the first nine bands in
the reduced zone scheme for the free-electron approximation.
We see from the figure that paths 1, 2, and 3 each correspond to a single energy
band, a fortuitous consequence of the triangular path chosen.
However, path 4 does not correspond everywhere to the fourth lowest energy.
Part of path 5 falls below path 4, and so the fourth energy band is made up of
contributions from paths 4 and 5.
In general, all bands higher than the first (lowest) have a complicated structure,
with contributions from parts of several paths. (This situation occurs because
a general path will cross at least one Brillouin zone boundary and so cannot
correspond to a single energy band).
The objective of this method is to display the variation of energy with k for
each band in the reduced zone scheme.
We will begin in the extended zone scheme, with several Brillouin zones, and
draw constant energy contours. These contours can be translated into the first
zone by appropriate reciprocal lattice translation vectors. Such translation of
segments of higher zones into the first zone was discussed earlier.
The first four Brillouin zones of the square lattice were shown earlier in Figure
18.23.
In Figure 18.62 below, we have extended this sketch to show the first seven
zones and have also drawn contours of constant energy that lie entirely within
these zones in the free-electron approximation for the square lattice.
In Figure 18.63 below we show the portions of these contours that lie in the
first zone, together with the free-electron energies (in reduced units) at several
points of the first zone. These are constant energy contours for the first band
of energy.
Figure 18.63: Energy contours in first band
To obtain contours for the second band, we translate the segments of the second
Brillouin zone from the figure above into the first zone, carrying along the
appropriate arcs.
The sequence of steps and the resulting reduced zone scheme constant energy
contours are shown in Figures 18.64-18.66 below.
Figure 18.65: Translating contours to second band regions
Notice the behavior of the energy for each band. For example, the second band
has a minimum in energy at the center of the edge of the Brillouin zone. As we
approach a corner of the zone, the energy increases monotonically to 2; as we
go toward the center of the zone, the energy increases to its maximum value of
4.
tended zone scheme). Therefore we can view the points $\pm k_F$ as defining a
one-dimensional surface enclosing values of $k$ that correspond to occupied states.
Similarly, we can define a Fermi surface in two dimensions corresponding to the
Fermi energy EF . In the free-electron approximation this energy corresponds
to one of the constant energy circles introduced earlier (in the extended zone
scheme). The radius of this circle is kF , the Fermi wave vector. Again, each state
with an energy less than EF is occupied at absolute zero. The corresponding
constant energy contour will lie within the Fermi surface. Let us derive ex-
pressions for kF and EF in the free-electron approximation and then locate the
Fermi surfaces for the square lattice. The derivations will be similar to their
one-dimensional counterparts we carried out earlier. We denote by q the area
in k space of the first Brillouin zone. The number of states in this zone is 2N ,
where $N$ is the number of primitive unit cells in the crystal. Thus $w(\vec k)$ is equal
to $2N/q$, a result completely equivalent to our earlier result. The total number
of electrons in the crystal is [from earlier]
$$\int_0^{k_F} w(\vec k)\,d\vec k = \frac{2N}{q}\,\pi k_F^2 \tag{18.181}$$
$\eta$     $k_F$ ($\pi/d$)     $E_F$ ($\hbar^2\pi^2/2m_e d^2$)
1     0.798     0.637
2     1.128     1.273
3     1.382     1.910
4     1.596     2.546
5     1.784     3.183
6     1.954     3.820
Table 18.4 above shows values of $k_F$ and $E_F$ for $\eta = 1$ to $\eta = 6$ (electrons per
primitive unit cell) for the two-dimensional square lattice.
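The entries of Table 18.4 can be reproduced with a few lines. The sketch below assumes that (18.181) is set equal to $\eta N$, with $\eta$ the number of electrons per primitive unit cell (a symbol chosen here for illustration), and that $q = (2\pi/d)^2$ for the square lattice, which gives $k_F = \sqrt{2\pi\eta}/d$. In the reduced units of the table this is $k_F = \sqrt{2\eta/\pi}$ and $E_F = 2\eta/\pi$.

```python
import math

def fermi_2d_square(eta):
    """Free-electron k_F (units of pi/d) and E_F (units of hbar^2 pi^2 / 2 m_e d^2).

    Assumes eta*N = (2N/q) * pi * k_F**2 with q = (2*pi/d)**2.
    """
    e_f = 2.0 * eta / math.pi     # reduced E_F equals the reduced k_F squared
    k_f = math.sqrt(e_f)
    return k_f, e_f

for eta in range(1, 7):
    k_f, e_f = fermi_2d_square(eta)
    print(f"{eta}  {k_f:.3f}  {e_f:.3f}")
```

Under these assumptions the formula gives $k_F \approx 1.382$ for $\eta = 3$. Since the largest energy in the first band is 2 (at the zone corner), the values also show directly that $E_F$ exceeds the top of the first band once $\eta \ge 4$.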
The corresponding Fermi surfaces are shown in the extended zone scheme in
Figure 18.67 below.
The circle with the smallest radius corresponds to $\eta = 1$; the one with the
largest radius to $\eta = 6$. In Figure 18.68 below these Fermi surfaces are translated
into the first Brillouin zone. The shaded parts correspond to occupied
states at absolute zero (wave vectors with energies less than $E_F$).
Figure 18.68: Fermi surfaces translated to first zone
Let us look at these sketches. For $\eta = 1$, the entire Fermi surface lies within
the first Brillouin zone. Thus all occupied states have energies within the first
band. Notice that there are states with energies in the first band that are not
occupied. This fact is also reflected by the E versus path distance graph for
the first four bands. This graph is reproduced in Figure 18.69 below, where we
show the location of the Fermi energies.
Figure 18.69: First 4 bands with location of Fermi surfaces
For crystals with 2 or 3 electrons per primitive unit cell, we see from (b) in
Figure 18.68 and Figure 18.69 above that there are occupied states in the first
and second bands; neither of these bands is completely filled. In contrast, the
first band is filled for crystals with $\eta \ge 4$.
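As earlier, we can determine which states are strongly mixed by examining the
matrix elements of the crystal potential energy. This matrix element
is defined, as before, by
$$V_{\vec k'\vec k} = \left\langle \vec k' \right| V \left| \vec k \right\rangle = \frac{1}{A}\int d\vec r\, e^{-i\vec k'\cdot\vec r}\, V(\vec r)\, e^{i\vec k\cdot\vec r} \tag{18.188}$$
where
$$\left\langle \vec r \,\middle|\, \vec k \right\rangle = \frac{1}{\sqrt{A}}\, e^{i\vec k\cdot\vec r}\ , \quad \left\langle \vec r \,\middle|\, \vec k' \right\rangle = \frac{1}{\sqrt{A}}\, e^{i\vec k'\cdot\vec r} \tag{18.189}$$
are normalized free-electron wave functions of energy
$$\frac{\hbar^2 k^2}{2m_e}\ , \quad \frac{\hbar^2 k'^2}{2m_e} \tag{18.190}$$
respectively, and where $\vec r = x\hat x + y\hat y$ is a vector in the two-dimensional coordinate
space of the direct lattice. Recall from earlier that these matrix elements also
appear in the calculation of the energy eigenvalues due to mixing of the strongly
mixed states.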
or in terms of the integral over one primitive unit cell (PUC) of area $Q$ as
$$V_{\vec G} = \frac{1}{Q}\int_{PUC} e^{i\vec G\cdot\vec r}\, V(\vec r)\, d\vec r \tag{18.192}$$
The strongly mixed states satisfying $\vec k - \vec k' = \vec G$ are those with wave vectors
near Brillouin zone boundaries. All other states can be treated by perturbation
theory. Let us be precise about what this statement means. If we introduce the
periodic potential energy as a perturbation, we obtain an expression for the first-order
perturbed wave function in terms of the $\psi^{(0)}_{E,\vec k}(\vec r)$ and a sum over all other
free-electron wave functions with wave vectors $\vec k' = \vec k - \vec G$ for every reciprocal
lattice translation vector $\vec G$ (as we found earlier). If $\vec k$ is not near a Brillouin
zone boundary, all the other free-electron states mix weakly with $\psi^{(0)}_{E,\vec k}(\vec r)$ and
will not greatly affect the energy or wave function. If $\vec k$ is near a zone boundary,
one or more other states will mix strongly with $\psi^{(0)}_{E,\vec k}(\vec r)$.
These states satisfy the selection rule $\vec k - \vec k' = \vec G$ and have energies very close
to each other ($|\vec k| \approx |\vec k'|$), so that the energy denominator in the perturbation
theory summation is small.
Thus, at or near a single Brillouin zone boundary, two states are strongly mixed.
At or near the intersection of two or more zone boundaries, more than two
states can be strongly mixed. For example, if $\vec k$ is near the intersection of two
boundaries, then we can associate two other equivalent wave vectors with $\vec k$,
each of which corresponds to a state nearly degenerate with $\psi^{(0)}_{E,\vec k}(\vec r)$. (Notice
that this situation did not arise in one dimension.) For example, in Figure 18.70
below, we show the location of several strongly mixed states for the first four
Brillouin zones of the square lattice.
Figure 18.70: Strongly mixing states
In one case, $\vec k$ is near a single zone boundary, and there is only one state that
mixes strongly with $\psi^{(0)}_{E,\vec k}(\vec r)$. In the other case shown, $\vec k$ is near the intersection
of three zone boundaries, and we must contend with four strongly mixed states.
In Figure 18.70, the first four Brillouin zones of the square lattice in the reduced
zone scheme are shown. The arrows connect the $\vec k$ values for strongly mixed
states. Two cases are considered, one near the edge of one zone boundary (two
states) and one near the intersection of three boundaries (four states).
Let us focus on one of these cases in the square lattice and calculate the energy
gaps that arise from this strong mixing. In particular, suppose that we choose
$\vec k$ at the corner of the first Brillouin zone, $\vec k = (\pi/d)(1, 1)$. At this point,
three Brillouin zone boundaries intersect, so four free-electron states are strongly
mixed. These states correspond to the $\vec k$ vectors
$$\vec G = (2\pi/d)(m, n) \tag{18.194}$$
The weak-binding wave function $\psi_E^{weak}$ can be written as a linear combination
of the strongly mixed free-electron wave functions
$$\psi_E^{weak}(\vec r) = \sum_{i=1}^{4} a_i\, \psi^{(0)}_{E,i}(\vec r) \tag{18.195}$$
where the $\psi^{(0)}_{E,i}(\vec r)$ are the four free-electron wave functions corresponding to
the wave vectors above. (Remember that these four free-electron states are
degenerate.) Substituting this expansion into the Schrodinger equation
$$H\psi_E = E\psi_E \tag{18.196}$$
we obtain
$$H\sum_{i=1}^{4} a_i\, \psi^{(0)}_{E,i}(\vec r) = E\sum_{i=1}^{4} a_i\, \psi^{(0)}_{E,i}(\vec r) \tag{18.197}$$
The corresponding eigenvalue equation is obtained by multiplying this equation
by $\psi^{(0)*}_{E,j}(\vec r)$ and integrating over $d\vec r = dx\,dy$ (using orthonormality of the free-electron
wave functions)
$$\sum_{i=1}^{4} H_{ji}\, a_i = E a_j\ , \quad j = 1, 2, 3, 4 \tag{18.198}$$
Clearly, we must know more about the form of V (x, y) in order to further
evaluate this matrix element. Since the crystal potential energy has fourfold
rotational symmetry, it satisfies
the second integral in the expression for Vmn is zero, and we conclude that
This equation is a quartic equation, so it has four roots. The eigenvalues are
$$E = \begin{cases} \dfrac{\hbar^2 k^2}{2m_e} - V_2 & \text{twice} \\[2ex] \dfrac{\hbar^2 k^2}{2m_e} + V_2 \pm V_1 \end{cases} \tag{18.211}$$
These are the energies of the four weak-binding states whose wave functions are
given by the linear combinations above. They arise from the mixing of four free-electron
states at the corner $\vec k = (\pi/d)(1, 1)$ of the first Brillouin zone. Notice
that a twofold degeneracy remains in the weak-binding approximation.
This result induces gaps in the E versus path distance plot for the square lattice
(remember that our earlier plot had no gaps). The modified plot of weak-binding
energy versus path distance for paths of the first four bands of the square lattice
(energy is shown in reduced units) is shown in Figure 18.71 below. The bands
are labeled.
Here splittings appear between every band for all values of $\vec k$ except at the zone
corner. But values of E do not exist for which no electronic states are allowed,
that is, there are no actual band gaps. In fact, the bands overlap in this example.
These features are characteristic of two- (and three-) dimensional systems.
Figure 18.72: Constant energy contours bending toward zone boundaries
Returning to the square lattice, we show in Figure 18.73 below how to modify
the free-electron constant energy contours. Contours are shown in the weak-
binding approximation for the first four bands in the reduced zone scheme.
Fermi Surfaces
The Fermi surfaces are simply particular constant energy contours corresponding
to the Fermi energies. Fermi surfaces for several values of $\eta$ are shown in Figure
18.74 below. These curves are obtained by modifying the surfaces we drew
earlier, using the bending at the boundary method shown above.
Figure 18.74: Fermi surfaces for $\eta = 1$
For a crystal with a square lattice and $\eta = 1$, the Fermi surface in the weak-
binding approximation is a circle whose area is half that of the first zone. Since
this circle is far from the zone boundaries, it is unaltered from its free-electron
behavior. This feature is not true of other values of $\eta$, as the figure shows.
18.7.1 Introduction
We will now study cold atoms (lattice trap) in periodic potentials formed by
standing wave laser beams.
18.7.2 Lattice Hamiltonian
A. One-body Hamiltonian and Wannier states
The Hamiltonian is block-diagonal in the basis of Wannier states, and the coupling
of Wannier states at different locations is given by
$$\left\langle w_j^{(m)} \right| H_{lat} \left| w_k^{(n)} \right\rangle = -\delta_{m,n}\, J^{(n)}_{|j-k|} \tag{18.217}$$
where
$$J_k^{(n)} = -\frac{1}{2\pi}\int_{-\pi}^{+\pi} dq\, e^{ikq} E_q^{(n)} \tag{18.218}$$
This shows that $J_k^{(n)}$ is the Fourier transform of the energy bands as a function
of $q$ and the dispersion relations can be written as
$$E_q^{(n)} = -\sum_{k=-\infty}^{\infty} J_k^{(n)} e^{ikq} = -J_0^{(n)} - 2\sum_{k=1}^{\infty} J_k^{(n)}\cos(kq) \tag{18.219}$$
For deep potentials the energy bands are relatively flat, and the higher order
cosine terms are suppressed. This justifies the tight binding approximation in
which one retains only the nearest lattice site coupling, and in the following we
will suppress the band index (n), and focus on the lowest band described by the
tight binding Hamiltonian
$$H_{TB} = -J_1\sum_{k=-\infty}^{\infty}\left\{ |w_{k-1}\rangle\langle w_k| + |w_{k+1}\rangle\langle w_k| \right\}$$
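The nearest-neighbor Hamiltonian can be checked against the cosine dispersion with a small sketch. It assumes the sign convention $H_{TB} = -J_1\sum_k\{|w_{k-1}\rangle\langle w_k| + |w_{k+1}\rangle\langle w_k|\}$ (so that the band minimum lies at $q = 0$), replaces the infinite lattice by a ring of `n_sites` Wannier states with periodic boundary conditions, and verifies that plane waves $e^{ijq}$ are eigenvectors with eigenvalue $-2J_1\cos q$. The function names and parameter values are illustrative assumptions.

```python
import cmath
import math

def apply_tight_binding(psi, J1):
    """Apply H_TB to amplitudes psi[j] = <w_j|psi> on a ring (periodic boundaries)."""
    n = len(psi)
    return [-J1 * (psi[(j + 1) % n] + psi[(j - 1) % n]) for j in range(n)]

def max_dispersion_error(n_sites=12, J1=1.0):
    """Largest deviation of H_TB|q> from -2*J1*cos(q)|q> over all allowed q."""
    worst = 0.0
    for m in range(n_sites):
        q = 2.0 * math.pi * m / n_sites          # quasi-momenta allowed on the ring
        psi = [cmath.exp(1j * j * q) / math.sqrt(n_sites) for j in range(n_sites)]
        h_psi = apply_tight_binding(psi, J1)
        energy = -2.0 * J1 * math.cos(q)         # expected band energy
        worst = max(worst, max(abs(h - energy * p) for h, p in zip(h_psi, psi)))
    return worst

print(max_dispersion_error())
```

Keeping only $J_1$ in the cosine series is what makes the band a pure cosine; retaining further couplings $J_2, J_3, \ldots$ would add higher harmonics to the dispersion.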
B. Harmonic confinement
$$H = \frac{H_{TB}}{4J_1} + \frac{kW^2}{4J_1} = \kappa W^2 - \frac{\cos(P)}{2} \tag{18.222}$$
Similar to the usual relationship between continuous position and momentum
operators, the discrete position operator $W$ acts as a differentiation in the continuous
quasi-momentum representation
$$\langle q |\, W \,| \psi \rangle = i\frac{\partial}{\partial q}\langle q \,|\, \psi \rangle \tag{18.223}$$
which is easily derived by inserting a resolution of the identity in Wannier states
and using the overlap formula (18.216). Therefore, we arrive at the quasi-momentum
expression of the single particle Hamiltonian
$$\langle q |\, H \,| \psi \rangle = \left( -\kappa\frac{\partial^2}{\partial q^2} - \frac{\cos(q)}{2} \right)\langle q \,|\, \psi \rangle \tag{18.224}$$
At this point we make the curious observation that, after having restricted the
Hilbert space to the lowest energy band and having added a quasi-harmonic
confinement, the Hamiltonian in momentum space (18.224) has the same form
as the original optical lattice Hamiltonian (18.214) in position space. In both
cases, the Schrodinger equation takes the form of the Mathieu equation, but
instead of looking for eigenstates of (18.214) with any quasi-momentum, we will
only look for periodic eigenstates for (18.224), i.e. with zero quasi-position.
C. Interacting particles
Using the relation (18.216), we can calculate the matrix elements of the effective
interaction operator in quasi-momentum space
$$\langle q_1; q_2 |\, U_{int}^{eff} \,| q_3; q_4 \rangle = \frac{G}{2\pi}\,\delta(q_3 + q_4 - q_1 - q_2) \tag{18.228}$$
which shows that the interaction conserves the total quasi-momentum and is
independent of its value.
where we have defined new operators by their action on quasi-momentum eigenstates,
$$e^{\pm iQ_+/2}\, |q_1; q_2\rangle \equiv e^{\pm i(q_1 + q_2)/2}\, |q_1; q_2\rangle \tag{18.231}$$
The introduction of these operators suggests reparametrizing the quasi-momentum
basis states $|q_1; q_2\rangle$ in terms of their sum and difference:
$$q_\pm = q_2 \pm q_1 \tag{18.232}$$
The quasi-momentum eigenstates are defined for pairs of $q_1$ and $q_2$ in the
set
$$S_{12} = [-\pi; \pi] \times [-\pi; \pi] \tag{18.233}$$
corresponding to a diamond shaped area in the coordinate plane of $q_\pm$ as shown
in figure 18.75 below.
Figure 18.75: Quasi-momentum of the two particles vs. relative and center-of-
mass quasi-momentum. Left: The first Brillouin zone $S_{12}$ in the $(q_1, q_2)$-plane is
emphasized and repeated in each direction. The color coding indicates the values
of a function that is periodic in both variables with period $2\pi$ and illustrates the
required periodicity. The set $S_\pm$ which contains exactly one representative of
each point from $S_{12}$ is shown by the gray rectangle. Right: The same function
is shown but in the $(q_+, q_-)$-coordinate system. The set $S_\pm$ is emphasized and
repeated, but with a different tiling than for $S_{12}$ in the left panel.
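If we choose the values of $(q_+, q_-)$ in the set
$$S_\pm = [-\pi; \pi] \times [-2\pi; 2\pi] \tag{18.234}$$
then each point from $S_{12}$ is represented exactly once as is evident from figure
18.75. This means that we can reparametrize the quasi-momentum eigenstates
as
$$|q_+, q_-\rangle = \frac{1}{\sqrt{2}}\, \left| (q_+ - q_-)/2\, ; (q_+ + q_-)/2 \right\rangle\ , \qquad |q_1; q_2\rangle = \sqrt{2}\, |q_1 + q_2, q_2 - q_1\rangle \tag{18.235}$$
where the front factor is chosen to preserve orthonormality, such that we have
the resolution of identity
$$1 = \int_{-\pi}^{+\pi} dq_+ \int_{-2\pi}^{+2\pi} dq_-\, |q_+, q_-\rangle\langle q_+, q_-| \tag{18.236}$$
The operators
$$W_\pm \equiv \frac{W_2 \pm W_1}{2} \tag{18.237}$$
act in the following way
$$\langle q_+, q_- |\, W_\pm \,| \psi \rangle = i\frac{\partial}{\partial q_\pm}\langle q_+, q_- \,|\, \psi \rangle \tag{18.238}$$
and the interaction operator $U$ has the following matrix elements in terms of
the relative and center-of-mass quasi-momentum states
$$\langle q_+, q_- |\, U \,| \psi \rangle = \gamma\int_{-2\pi}^{+2\pi} dq_-'\, \langle q_+, q_-' \,|\, \psi \rangle \tag{18.239}$$
with $\gamma = G/(16\pi J_1)$.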
The Schrodinger equation with the Hamiltonian (18.240) can be solved accu-
rately for a wide range of parameters as in 18.241 below. The resulting eigenen-
ergies and the wave functions (18.246 below) will be used as reference for our
analysis by the Born-Oppenheimer separation of the motional degrees of free-
dom.
Solving the equations numerically
The stationary Schrodinger equation with the Hamiltonian (18.240) yields the
equation for the expansion coefficients
where $p$ can assume the values $\pm 1$. In addition, since we are dealing with two
identical bosons, only symmetrized wave functions are physically meaningful,
which implies that we have the symmetry
18.7.4 Born-Oppenheimer Separation
A. Derivation
$$H = H_- + 2\kappa W_+^2 \tag{18.247}$$
where
$$H_- = 2\kappa W_-^2 - \cos\left(\frac{Q_+}{2}\right)\cos\left(\frac{Q_-}{2}\right) + U \tag{18.248}$$
by |q+ , ni:
H |q+ , ni = n (q+ ) |q+ , ni (18.249)
eiQ+ /2 |q+ , ni = eiq+ /2 |q+ , ni (18.250)
with the following orthogonality relations
0 0 0
q+ , n q+ , n = n,n0 (q+ q+ ) (18.251)
0
The expansion coefficients C (n) (q+ ) are found by applying the Hamiltonian
(18.247) to the expanded wave function (18.254) and using (18.248)
Z + X
0 0 0
0
H |i = dq+ C (n) (q+ ){n (q+ ) + 2W+2 } q+ ,n (18.255)
n
In the (q+ , q? )-representation for the state vector, the eigenvalue equation takes
the form of coupled differential equations
X
0
E An(q+ ) (q )C (n) (q+ )
n
X 2
0 0
= n (q+ ) 2 2 A(q
n
+)
(q )C (n) (q+ ) (18.256)
n
q+
1464
The goal of the following analysis is to find an approximation for the eigenstates,
which is easier to apply numerically and which offers insights into their
internal structure and dynamics. To this end, we assume that the states $|q_+, n\rangle$
described by $q_-$ wave functions $A_n^{(q_+)}(q_-)$ depend only weakly on the argument
$q_+$. Eliminating thus the partial derivatives of $A_n^{(q_+)}(q_-)$ with respect to $q_+$ in
the evaluation of the right hand side of (18.256), and using the orthogonality
of the $A_n^{(q_+)}(q_-)$ functions, we arrive at the following approximate equation for
the expansion coefficients
$$\varepsilon_n(q_+)\, C^{(n)}(q_+) - 2\kappa\frac{\partial^2 C^{(n)}(q_+)}{\partial q_+^2} = E\, C^{(n)}(q_+) \tag{18.257}$$
This has the form of a Schrodinger equation for a single particle in the potential
$\varepsilon_n(q_+)$. For each energy potential we can find discrete eigenenergies $E_m^{(n)}$
and associated eigenfunctions $C_m^{(n)}$ that solve (18.257) and yield approximate
eigenstates $|\psi_m^{(n)}\rangle$ for the full Hamiltonian (18.247)
$$\langle q_+, q_- \,|\, \psi_m^{(n)} \rangle = C_m^{(n)}(q_+)\, A_n^{(q_+)}(q_-) \tag{18.258}$$
Note the formal similarity of this reduction of the problem with the use of the
Born-Oppenheimer approximation in chemistry. In the latter, the wave function
is expanded as a product of wave functions in nuclear and electronic coordinates,
and due to the large difference in mass and hence in energy and time scales,
the electronic wave functions are supposed to follow changes in the slow nuclear
coordinates adiabatically.
In our case, the two particles have identical masses, and in the absence of mutual
interaction, the relative and center-of-mass motion occur on similar time scales,
and the Born-Oppenheimer approximation should not be valid. But, as we
increase the attractive interaction between the atoms, bound states are formed,
and the relative position develops a new, faster time scale given by the binding
energy. Our separation is carried out and motivated in the quasi-momentum
picture, where a further observation may be in order: a strongly bound state in
the relative position coordinate corresponds to a very extended wave function
in the relative momentum, while the center-of-mass momentum may be well
defined. This supports the assumption that the dominant contribution to the
second derivative in (18.256) stems from the $q_+$ wave function $C_m^{(n)}(q_+)$, and
hence that the derivative of $A_n^{(q_+)}$ with respect to $q_+$ may be neglected.
B. Application
and in the basis of relative and center-of-mass quasi-momenta.
$$|\psi\rangle = \begin{cases} \displaystyle\int_{-\pi}^{+\pi} dq_1 \int_{-\pi}^{+\pi} dq_2\, \psi(q_1, q_2)\, |q_1; q_2\rangle \\[2ex] \displaystyle\int_{-\pi}^{+\pi} dq_+ \int_{-2\pi}^{+2\pi} dq_-\, \psi_\pm(q_+, q_-)\, |q_+, q_-\rangle \end{cases} \tag{18.259}$$
While $|q_1, q_2\rangle$ and $|q_+, q_-\rangle$ are defined for $(q_1, q_2) \in S_{12}$ and $(q_+, q_-) \in S_\pm$,
respectively, we can look for functions defined on the entire $R^2$ and restrict the
solution afterwards. The function $\psi$ is periodic in both variables with period
$2\pi$, and this forces $\psi_\pm$ to obey the symmetry
$$\psi_\pm(q_+ + 2\pi, q_- \pm 2\pi) = \psi_\pm(q_+, q_-) \tag{18.260}$$
cf. the tiling of $R^2$ with replicas of $S_\pm$ in the right panel of figure 18.75. Thus, a
necessary - but not sufficient - condition is that $\psi_\pm$ is periodic in both $q_+$ and $q_-$
with periodicity $4\pi$. We are considering bosons and the state must be symmetric
under the exchange of the two particles, $(q_+, q_-) \mapsto (q_+, -q_-)$, which implies
the further constraint
$$\psi_\pm(q_+, q_-) = \psi_\pm(q_+, -q_-) \tag{18.261}$$
Using these arguments on (18.258) we conclude that we are looking for solutions
such that $A_n^{(q_+)}(q_-)$ is even and periodic in $q_-$ with period $4\pi$, and such that
the product of $C_m^{(n)}(q_+)$ and $A_n^{(q_+)}(q_-)$ is periodic in $q_+$ with the same period.
Furthermore, the product must satisfy the relation (18.260).
When solving (18.262) we are looking for solutions to a Schrodinger-like equation
with a cosine potential with period $4\pi$. Solutions which are periodic with the
same period - referred to as zero quasi-momentum states for periodic problems
in position space - can be chosen to be real-valued. The front factor $F(q_+)$ of
the cosine potential is itself a cosine function of $q_+$ leading to two observations:
1. $F(q_+)$ is an even function of $q_+$ so equation 18.262 is unaltered under the
transformation $q_+ \mapsto -q_+$. Thus the solutions must be identical up to a
complex factor, and since they are real-valued we can choose the solutions
as
$$A_n^{(q_+)}(q_-) = A_n^{(-q_+)}(q_-) \tag{18.263}$$
We could not have chosen a minus sign, since this would have made $A_n^{(q_+)}$
vanish for $q_+ = 0$.
Applying the relations (18.263) and (18.265) for $q_+ = \pi$ we get the relation
$$A_n^{(+\pi)}(q_-) = \sigma_n\, A_n^{(+\pi)}(q_- - 2\pi) \tag{18.266}$$
so we can determine $\sigma_n$ from the translational symmetries of $A_n^{(+\pi)}$.
Eq. (18.262) yields the potential $\varepsilon_n(q_+)$ which is periodic with period $2\pi$, and
we are looking for functions $C_m^{(n)}(q_+)$ that are periodic in $q_+$ with period $4\pi$.
Therefore, Bloch's theorem tells us that we can choose a complete set of solutions
as
$$C_m^{(n)}(q_+) = e^{i\nu_n q_+/2}\, D_m^{(n)}(q_+) \tag{18.267}$$
where $D_m^{(n)}$ is periodic with period $2\pi$, and $\nu_n = 0, 1$. For $\nu_n = 0$ the solution
$C_m^{(n)}(q_+)$ is thus periodic with period $2\pi$, whereas for $\nu_n = 1$, it is antiperiodic.
We require that the product of $C_m^{(n)}(q_+)$ and $A_n^{(q_+)}(q_-)$ satisfies the symmetry
(18.260), and if we combine this with (18.265), we get the relation
$$C_m^{(n)}(q_+)\, A_n^{(q_+)}(q_-) = \sigma_n\, C_m^{(n)}(q_+ + 2\pi)\, A_n^{(q_+)}(q_-) \tag{18.268}$$
from which we conclude that $C_m^{(n)}(q_+)$ must fulfill the symmetry
$$C_m^{(n)}(q_+ + 2\pi) = \sigma_n\, C_m^{(n)}(q_+) \tag{18.269}$$
Comparing to (18.267) we see that this implies that for $\sigma_n = -1$ we must choose
$\nu_n = 1$ and for $\sigma_n = +1$ we can use $\nu_n = 0$. We can solve (18.257) by a Fourier
expansion of $D_m^{(n)}$ and $\varepsilon_n$ (as shown below in 18.270).
The solutions of (18.262) are functions $A_n^{(q_+)}(q_-)$ which are periodic in $q_-$ with
period $4\pi$. Therefore, for each value of $q_+$ we make the expansion
$$A_n^{(q_+)}(q_-) = \frac{1}{\sqrt{4\pi}}\sum_j \alpha_{j,n}^{(q_+)}\, e^{ijq_-/2} \tag{18.270}$$
We solve the second Born-Oppenheimer equation using the results from the first
Born-Oppenheimer equation. First, coefficients in the expansion
$$\varepsilon_n(q_+) = \sum_k K_k^n\, e^{ikq_+} \tag{18.273}$$
in (18.257) together with the expansion (18.273) which yields the following equation
for the $\beta$-coefficients
$$\sum_k K_k^n\, \beta_{l-k}^{m,n} + 2\kappa\left( l + \frac{\nu_n}{2} \right)^2 \beta_l^{m,n} = E_m^{(n)}\, \beta_l^{m,n} \tag{18.275}$$
Since the potential energy curves $\varepsilon_n(q_+)$ are even functions, the solutions can
be chosen to be either even or odd, and the coefficients then fulfill $\beta_l^{m,n} = \pm\beta_{-l}^{m,n}$.
It suffices to only consider coefficients with $m \ge 0$ and solve the recurrence
equations.
18.7.5 Exact and Born-Oppenheimer Approximate Solu-
tions
A. Wave functions
Figure 18.76: Energies and eigenfunctions found by solving the two Born-
Oppenheimer equations for $\kappa = 0.5$ and $\gamma = 0.5$. Left panel: The six lowest
potential curves $\varepsilon_n(q_+)$ found from the first Born-Oppenheimer equation.
Upper panels: Magnification of four of the potential curves in the left panel.
Lower panels: Eigenfunctions $A_n^{(q_+)}(q_-)$ for the first Born-Oppenheimer equation
shown for all values of $q_+$ for the corresponding n-values. In the upper
panels are shown (horizontal dashed blue/red lines) the two lowest energies
$E_j^{(n)}$ for $j = 1, 2$ found from solving the second Born-Oppenheimer equation
in the potential $\varepsilon_n(q_+)$ and the corresponding wave functions (solid blue/red
lines).
The lowest potential curve is well separated from the higher ones, which lie
closer together. Each of the potential curves has an energy variation which is
typically small compared to the energy distance between the bands. In the upper
panels, a magnified view of the curves is shown. In the lower panels, eigenfunctions
$A_n^{(q_+)}(q_-)$ of the first Born-Oppenheimer equation are shown for four different
values of n.
The second Born-Oppenheimer equation uses the energies n (q+ ) as potential
functions in a Schrodinger like equation, and each of the upper panels in Figure
18.76 show the energy levels of the two lowest eigenstates (m = 1, 2) in these
(n)
potentials along with their eigenfunctions Cm (q+ ). As we saw in the previous
(n)
section, the function Cm (q+ ) should be chosen periodic or anti-periodic de-
(q )
pending on the value of n . By studying the behavior of An + (q ) at q+ =
(q )
one can see if n is +1 or 1 depending on whether the wave function An + (q )
(n)
changes sign when translated by or not. For n = 0, 2 the solutions Cm (q+ ) to
the second Born-Oppenheimer equation must be periodic with period 2, while
for n = 1, 5 the solutions $C_m^{(n)}(q_+)$ must be chosen antiperiodic.
B. Energies
In figure 18.78 both the exact and the approximate energies are shown for fixed
as functions of the scaled interaction strength . Except in the region where
is numerically small, there is reasonable agreement between the exact and
the approximated energy levels. For negative there is a clear grouping of the
energy levels in two groups: Those that are nearly constant as a function of and
those that depend linearly on . Comparing to the approximate energies found
by the Born-Oppenheimer approximation we see that the linear dependence
comes from the fact that the position of the lowest potential curve varies linearly
with .
C. Approximate solution of the first Born-Oppenheimer equation
The term 4 contributes only for k = 0 since all other plane waves integrate
to zero in the second line of Eq. (18.256). Even when the omission of (18.276)
(q )
is not valid, the integral still becomes substantial if An + (q ) has no nodes,
(q )
whereas it is suppressed when there are sign changes in An + (q ).
Even though figure 18.76 is obtained with moderate values of and , we see
(q )
the similarity between the numerically determined An + (q ) and plane waves,
while the energy levels n (q+ ) are clearly not constant. This is due to the term
(18.276), the effect of which we will approximate using non-degenerate pertur-
bation theory. Due to the orthogonality between the cosine functions, there are
no first-order corrections. The second order corrections, on the other hand, give
contributions of the form
k (q+ ) = ak F (q+ )2
(18.279)
where the amplitude ak can be calculated
1
8 k = 0
1
ak = 216 k = 1 (18.280)
1
otherwise
(4k2 1)
as shown below.
The second order perturbation terms for the potential curves k (q+ ) are found
by calculating the matrix elements of the terms (18.276) between pairs of un-
perturbed eigenfunctions which are plane waves:
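$$I_{l,k} = \frac{F(q_+)}{4\pi}\int_{-2\pi}^{+2\pi} dq_-\, e^{i(k-l)q_-/2}\cos\left(\frac{q_-}{2}\right) \qquad (18.281)$$
Using the orthogonality of the cosine functions we see that only coefficients with
neighboring values of l and k are coupled
$$I_{l,k} = \frac{F(q_+)}{2}\left[\delta_{k,l+1} + \delta_{k,l-1}\right] \qquad (18.282)$$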
The resulting perturbative correction then takes the form
X |Ilk |2
k (q+ ) = = ak F (q+ )2 (18.283)
k m
l6=k
D. Approximate solution of the second Born-Oppenheimer equation
(n)
To analyze in more detail how the eigenenergies Em are distributed we must
take a closer look at the second Born-Oppenheimer equation which has the form
of a Schrodinger equation for a particle of mass ~2 /4k in the potential n (q+ ).
When the above perturbative treatment is valid, this potential is a cosine with
amplitude |ak |/2, so in order to estimate the eigenstates and energies, we must
compare and |ak |. In the limit where we can neglect the q+ -dependence
of
k (q+ ), the solutions can be well approximated by plane waves eimq+ / 2 with
box potential -energies
(k) 2
Em = k + 4k,0 + 2m2 (18.286)
2
which depend quadratically on m. In the opposite limit where k (q+ ) is a deep
potential in (18.257), we can approximate the cosine potential by a quadratic ex-
pansion around its minimum. The resulting equation is a Schrodinger equation
for a particle with mass ~2 /4k in a harmonic oscillator of frequency
1p
k = 2|ak | (18.287)
~
For the lower part of the energy spectrum the solutions are then well approxi-
mated by the usual harmonic oscillator eigenstate wave functions and the ener-
gies are equidistantly spaced with spacing ~k :
(k) ak 1 p
Em = k 2 + 4k,0 + + m+ 2|ak | (18.288)
2 2 2
Figure 18.79 illustrates the transition between the particle in a box and the
harmonic oscillator regimes by showing the exact and approximate energies
(0)
Em as functions of for fixed negative . Since the harmonic oscillator ap-
proximation is valid when the potential in (18.257) is deep, it requires that
|a0 | = | 8|1 , so to capture the whole transition, the -axis is loga-
rithmic. The energies are plotted after subtracting the ground state energy E0
and scaling by the energy difference $E_1 - E_0$ between the first excited state and
the ground state. For 1 the harmonic oscillator spectrum is then revealed
as levels with unit spacing. For 1, on the other hand, the curves become
constant at 1, 4, 9, ... showing the quadratic dependence on m. We note that
there is a perfect agreement between the exact and approximate energies shown
in the figure. In the transition from the harmonic oscillator regime to the par-
ticle in a box regime the energy levels group in pairs, which has the following
explanation: For a deep potential curve n there is a significant energy differ-
ence between the first excited even and odd states, but when the potential curve
is nearly constant, the even and odd solutions with a given wave number have
almost the same energy. This is exemplified in the eigenfunctions in the upper
panels of figure 18.76.
Figure 18.79: Exact and approximate energies as a function of for = 10.
The black dashed curves show the exact energies En , while the solid red curves
(0)
show the approximate energies Em found from the lowest potential-curve in
the second Born-Oppenheimer equation. The green curve shows the position
of the maximum of the lowest potential curve 0 (q+ ) within the perturbative
approximation. The exact ground state energy E0 , which varies with , has
been subtracted from all energies, and afterward the energies are scaled by the
energy difference $E_1 - E_0$ between the two lowest exact energy levels.
No matter how deep the potential curve k (q+ ) is, the harmonic approximation
is not perfect, and above some energy the spectrum is ill-described by a harmonic
oscillator spectrum. A simple estimate suggests that the description is good for
eigenstates whose energies lie below the maximum of the potential curve, which
is approximated by the unperturbed energies plus a term depending on the sign
of ak
ak + |ak |
tk (, ) = k 2 + 4k,0 + (18.289)
2 2
In figure 18.79 this (solid green) curve is shown for k = 0 and agrees
systematically with the border where the harmonic oscillator energy spectrum is
significantly altered.
18.7.6 Conclusions
In the present section we have considered two identical bosons on an infinite,
discrete lattice with an additional harmonic confinement. In the tight-binding
approximation, the single-particle physics in terms of quasi-momenta is
described by the same equation as a single particle in a continuous cosine
potential, namely the Mathieu equation. Adding a contact interaction yields a
Hamiltonian which does not separate in relative and center-of-mass coordinates,
even though the two-body interaction problem separates in both a homogeneous
discrete lattice Hamiltonian and in a continuous harmonic oscillator.
In the solution of both the first and second Born-Oppenheimer equations we can
identify the excitation degrees of freedom in the system. This provides physi-
cally motivated quantum numbers valid also for the exact eigenstates together
with rules for which quantum numbers are allowed by symmetry considerations.
Finally, from the good agreement between the exact and approximate solutions
we conclude that the Born-Oppenheimer approximation is well justified when the
energy scales for the relative and the center-of-mass motion of the two-particle
quantum state are well separated. We imagine that a similar separation may be
useful for approximate first-principles calculations on many other cold-atom
systems, e.g., with more particles and possibly with mixtures of different species.
18.8 Problems
18.8.1 Piecewise Constant Potential Energy
One Atom per Primitive Cell
Consider a one-dimensional crystal whose potential energy is a piecewise con-
stant function of x. Assume that there is one atom per primitive unit cell - that
is, we are using the Kronig-Penney model as shown below.
Figure 18.80: Piecewise Constant Potential - 1 Atom per Primitive Cell
(a) Let s = d/2 (same spacing as in the text) and explain why no energy
gap occurs at the second Brillouin zone boundary in the weak-binding
limit, using a physical argument based on sketches of the electron
probability density.
(b) For s = d/3, what are the magnitudes of the lowest six band gaps in the
weak binding limit?
(a) Using the weak-binding approximation, determine the band gap for an
arbitrary Brillouin zone boundary.
(b) Use the results of part (a) to obtain an expression for the band gaps for
w d and zone boundaries corresponding to small values of G. Are any
of these band gaps zero? Use physical arguments to explain why or why
not.
(c) Use the results of part (a) to determine the magnitude of the lowest eight
band gaps for w = d/4. Are any of these band gaps zero? Use physical
arguments to explain why or why not.
(d) In the weak-binding approximation, the energies for wave vectors k that
are far from the Brillouin zone boundaries are given by the free-electron
energies E = ~2 k 2 /2me . In relation to the zero of V (x) above, from
what value of the energy are the free-electron energies measured? Does
anything unusual happen when the energies exceed zero - the beginning
of the continuum for the isolated atoms? Determine how many band
gaps occur below E = 0. Answer these questions using the weak-binding
approximation.
(b) Sketch free-electron constant energy contours in the reduced zone scheme
for the lowest four bands.
(c) Sketch the free-electron density of states, D(n) (E), for each of the four
lowest bands individually. Sketch D(E) for the total of the four lowest
bands.
(d) Sketch the free-electron Fermi surfaces in the reduced zone scheme for
= 1 to 6. Indicate the positions of the various Fermi energies on the
density-of-states graphs of part (c). Use quantitatively correct values for
kF and EF in this part.
18.8.4 Weak-Binding Energy Bands for a Crystal with a
Hexagonal Bravais Lattice
Consider a two-dimensional crystal with a hexagonal Bravais lattice
oriented so that two nearest lattice points can lie along the y-axis but not
along the x-axis.
(a) Using the reduced zone scheme, sketch the energy versus distance in $\vec{k}$
space (starting at $\vec{k} = 0$) along the path in the first Brillouin zone shown
in the figure below. Do this for the six lowest bands in both the free-electron
and weak-binding approximations (assuming (incorrectly) that
all degeneracies are absent in the latter case).
(b) Sketch constant energy contours in the reduced zone scheme for the lowest
six bands. Do so in both the free-electron and weak-binding approximations.
Indicate the location in $\vec{k}$ space of all distinct maxima, minima and
saddle points (only one of a set that are equivalent by symmetry need be
shown).
(c) Sketch the Fermi surface in the reduced zone scheme for = 1 to 7.
Do so in both the free-electron and weak-binding approximations. Use
quantitatively correct values for kF in the free-electron sketches.
(d) Sketch the density of states, $D^{(n)}(E)$, for each of the five lowest bands
individually and D(E) for the total of the five lowest bands. Do so in
both the free-electron and weak-binding approximations. Assume all
degeneracies are absent in the latter case and make reasonable assumptions
about the sense of the energy shifts from the free-electron values at the
singular points.
(e) For which integral value of would insulating properties be most likely to
first occur as the strength of the periodic potential energy is increased?
Why?
18.8.5 A Weak-Binding Calculation #1
Consider a two-dimensional crystal with a primitive rectangular Bravais lattice
and two identical atoms per primitive unit cell. Take the structure to be as
shown below with a : b : c :: 4 : 2 : 1. Take the potential energy to be the sum of
the potential energies for the individual atoms located at the atom sites given
in the figure. Use the weak-binding approximation.
(a) Find expressions for the matrix elements VG that describe the band gaps
in the weak-binding limit. Under what circumstances, if any, is VG = 0?
(b) Use the results of part (a) to draw qualitatively correct constant energy
contours in the reduced zone scheme for the lowest three bands.
(c) Sketch qualitatively correct individual band densities of states for the low-
est three bands.
(a) For the case of one atom per primitive cell, obtain a general expression for
the energy difference between adjacent bands at a Brillouin zone bound-
ary where they would be degenerate in the free-electron approximation
(ignoring the intersections of two or more boundaries). How does this
result depend on the Bravais lattice (assuming the area of a primitive cell
is the same for each different case)?
(b) For a crystal with a square lattice and one atom per primitive unit cell,
what are the energies of the lowest four bands at $\vec{k} = (\pi/d)(1, 1)$? Explain
your result in a physical and qualitative way.
(c) For a crystal with a centered rectangular Bravais lattice and two different
delta-function atoms per primitive unit cell as shown in the figure below,
evaluate the energy splittings between the bands for all zone boundaries in
the extended zone scheme for the five lowest bands (ignore all intersections
of two or more boundaries).
(c) Evaluate the limits of Eq. (18.304) by expressing $x_{tot}^2$ in terms of boson
operators and taking the expectation value with respect to the ground
state of $H^{SB}_{coll}$.
(b) Which value should be chosen for uk in order for the Bogoliubov trans-
formation to yield the diagonal Hamiltonian of Eq. (18.295)? [Answer:
tanh (2uk ) = Bk /Ak ]
Chapter 19
Second Quantization
To see how to proceed in these cases, we will step back, revisit the subject
of identical particles and look for an alternative way of thinking about such
systems.
19.1.1 Indistinguishability
As we saw earlier, in quantum mechanics, the state of a system with identical
particles can be described by a set of quantum numbers corresponding to the
eigenvalues of a commuting set of single-particle operators representing single-
particle observables.
When we specify a state vector we designate how many particles have certain
sets of quantum numbers, i.e.,
saw in atoms, this indistinguishability has measurable effects on energy levels
arising from particle exchange symmetries.
We make the fundamental postulate that the set of all Ni forms a complete set
of commuting Hermitian operators for any system of identical particles.
We now construct the state vector space appropriate for the many particle
system by generalizing one-particle quantum mechanics and building in indis-
tinguishability from the start.
Our postulates imply that the state vector space for the many particle system
(called Fock space) has the basis vectors
In this vector space, we define
These one-particle states span the one-particle subspace of the much larger state
space of the many particle system.
Most of the quantum mechanics we have developed so far applies to these one-
particle states.
To answer this question yes, we must show that a consistent framework exists
that makes predictions in agreement with experiment.
Another appearance of operators of this type with similar properties will occur
when we study the interaction of radiation with matter in Chapter 20. We will
be able to introduce photon annihilation/creation operators which remove/add a
single photon with particular quantum numbers $(\vec{k}, \vec{\epsilon})$ corresponding to photon
momentum and polarization. The photon operators will have the same mathematical
structure (commutators) as the $a$, $a^+$ operators of the harmonic oscillator
system.
In the photon case, as we will see, the states of the system will be the photon
number states given by
$$\left| N_{\vec{k}_1\vec{\epsilon}_1}, N_{\vec{k}_2\vec{\epsilon}_2}, N_{\vec{k}_3\vec{\epsilon}_3}, \ldots \right\rangle \qquad (19.5)$$
We will base our generalization on these examples. The generalization will al-
low us eventually to define the most general Fock space for any many particle
system.
Suppose that we have a potential well V (~r) with single particle energy eigen-
states given by
We assume that all n particles are in the lowest level (ground state) $\phi_0(\vec{r})$ of
the well. We label this state by the symbol $|n\rangle$ where n = 0, 1, 2, 3, ..., i.e., $|0\rangle$
is the state with no particles in the lowest level.
By definition, these operators relate states of the n-boson system with all n
particles in $\phi_0(\vec{r})$ to those states of an $n \pm 1$ particle system with all particles
in $\phi_0(\vec{r})$. In this sense, we say that
These operators, by construction, have the same algebra as the harmonic oscil-
lator operators, i.e.,
$$[a_0, a_0^+] = 1 , \quad [a_0, a_0] = 0 = [a_0^+, a_0^+] \qquad (19.8)$$
and
$$|n\rangle = \frac{(a_0^+)^n}{\sqrt{n!}}\,|0\rangle , \quad a_0|0\rangle = 0 \ \text{(no particles to be annihilated)} \qquad (19.9)$$
This says that
Acting to the left (instead of to the right) these operators reverse their roles,
i.e.,
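$$\langle n|\,a_0 = \sqrt{n+1}\,\langle n+1| \quad \text{adds particles in the state } \phi_0(\vec{r})$$
$$\langle n|\,a_0^+ = \sqrt{n}\,\langle n-1| \quad \text{removes particles in the state } \phi_0(\vec{r})$$
The operator
$$N_0 = a_0^+ a_0 \qquad (19.10)$$
measures the number of particles in a state since
$$N_0|n\rangle = a_0^+ a_0|n\rangle = n|n\rangle \qquad (19.11)$$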
This is a very appealing picture, but really all we have done is rewrite the
harmonic oscillator story using a lot of new words. We do not have any new
physics yet!
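The single-level boson algebra above can be checked numerically. The following is a minimal sketch, assuming we truncate the occupation-number basis to the states |0>, ..., |dim-1> and represent the operators as real matrices (the function names are illustrative, not from the text). The truncation is an artifact of the sketch: the commutator equals 1 on every state except the highest one kept.

```python
import math

def matmul(x, y):
    """Plain nested-list matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def boson_matrices(dim):
    """Return (a, a_plus) in the truncated basis |0>, ..., |dim-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)  # a|n> = sqrt(n)|n-1>
    # a_plus is the transpose of a (all entries are real)
    a_plus = [[a[c][r] for c in range(dim)] for r in range(dim)]
    return a, a_plus

def number_operator(dim):
    """N = a_plus a, diagonal with entries 0, 1, ..., dim-1."""
    a, a_plus = boson_matrices(dim)
    return matmul(a_plus, a)

def commutator(dim):
    """[a, a_plus]; equals the identity except in the last (truncated) entry."""
    a, a_plus = boson_matrices(dim)
    ab, ba = matmul(a, a_plus), matmul(a_plus, a)
    return [[ab[i][j] - ba[i][j] for j in range(dim)] for i in range(dim)]
```

For example, `number_operator(5)` has the occupation numbers 0 through 4 on its diagonal, which is the matrix form of N|n> = n|n>.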
Before introducing the new physical ideas, we carry out this same discussion for
fermions.
We can figure out the operator algebra by using a special representation of these
states and operators. We let the two states (only allowed states) be a basis and
select the 2-dimensional representation
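$$|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \qquad (19.12)$$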
In this representation we have
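$$a_0 = |0\rangle\langle 1| , \quad a_0^+ = |1\rangle\langle 0| \qquad (19.13)$$
$$a_0 a_0^+ = |0\rangle\langle 0| = \text{projection operator on the } |0\rangle \text{ state}$$
$$a_0^+ a_0 = |1\rangle\langle 1| = \text{projection operator on the } |1\rangle \text{ state}$$
$$a_0 a_0^+ + a_0^+ a_0 = |0\rangle\langle 0| + |1\rangle\langle 1| = I$$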
This last relation was derived, in general, earlier and is just the sum over all
projection operators.
Therefore we get
$$\{a_0, a_0^+\} = a_0 a_0^+ + a_0^+ a_0 = I \qquad (19.14)$$
Thus, the algebra involves anticommutators instead of commutators. That is
the only change we need to make!!
We also have
$$\{a_0, a_0\} = 0 = \{a_0^+, a_0^+\} \qquad (19.15)$$
These anticommutators imply that
$$(a_0)^2 = 0 \quad \text{we cannot remove two fermions from the same state (maximum of one allowed)}$$
$$(a_0^+)^2 = 0 \quad \text{we cannot put two fermions into the same state (maximum of one allowed)}$$
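The two-dimensional representation above is small enough to verify by direct matrix multiplication. A minimal sketch, assuming the basis ordering |0> = (1, 0), |1> = (0, 1) used in the text; the helper names are illustrative.

```python
def matmul(x, y):
    """Plain nested-list matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def add(x, y):
    """Entry-wise matrix sum."""
    return [[x[i][j] + y[i][j] for j in range(len(x[0]))] for i in range(len(x))]

def anticommutator(x, y):
    """{x, y} = xy + yx."""
    return add(matmul(x, y), matmul(y, x))

# a = |0><1| and a_plus = |1><0| in the basis |0> = (1, 0), |1> = (0, 1)
A = [[0, 1],
     [0, 0]]
A_PLUS = [[0, 0],
          [1, 0]]
IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
```

With these definitions `anticommutator(A, A_PLUS)` reproduces the identity, while `matmul(A, A)` and `matmul(A_PLUS, A_PLUS)` both vanish, which is the matrix statement that a level holds at most one fermion.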
Summary:
For one single particle level in a potential well, we can define annihilation/creation
operators such that:
bosons have
$$[a_0, a_0^+] = I , \quad [a_0, a_0] = 0 = [a_0^+, a_0^+] \qquad (19.16)$$
and
fermions have
$$\{a_0, a_0^+\} = I , \quad \{a_0, a_0\} = 0 = \{a_0^+, a_0^+\} \qquad (19.17)$$
We now expand our view and consider the case where particles can occupy two
levels of the potential well, say $\phi_0(\vec{r})$ and $\phi_1(\vec{r})$.
$$|n_0, n_1\rangle \qquad (19.18)$$
which implies
In the same manner as in the one level case, we must have the commutator
algebra
$$[a_0, a_0^+] = 1 , \quad [a_0, a_0] = 0 = [a_0^+, a_0^+] \qquad (19.20)$$
and
$$[a_1, a_1^+] = 1 , \quad [a_1, a_1] = 0 = [a_1^+, a_1^+] \qquad (19.21)$$
For bosons, the order in which we create or annihilate particles in a state does
not matter, i.e.,
$$a_0 a_1|n_0, n_1\rangle = a_1 a_0|n_0, n_1\rangle \qquad (19.22)$$
which says
$$[a_0, a_1] = 0 \qquad (19.23)$$
In a similar way all the other mixed commutators are also zero
$$[a_0, a_1^+] = 0 = [a_1, a_0^+] = [a_1^+, a_0^+] \qquad (19.24)$$
All allowed states can be constructed from the vacuum state |0, 0i by using
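$$|n_0, n_1\rangle = \frac{(a_1^+)^{n_1}}{\sqrt{n_1!}}\,\frac{(a_0^+)^{n_0}}{\sqrt{n_0!}}\,|0, 0\rangle \qquad (19.25)$$
Finally,
$$N_0 = a_0^+ a_0 = \text{the number of particles in state } \phi_0(\vec{r})$$
$$N_1 = a_1^+ a_1 = \text{the number of particles in state } \phi_1(\vec{r})$$
and
$$N = N_0 + N_1 = \text{the total particle number operator} \qquad (19.26)$$
Therefore,
$$N_0|n_0, n_1\rangle = n_0|n_0, n_1\rangle$$
$$N_1|n_0, n_1\rangle = n_1|n_0, n_1\rangle$$
$$N|n_0, n_1\rangle = (n_0 + n_1)|n_0, n_1\rangle$$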
Thus, for bosons we are able to just glue two single level many particle systems
together to create a two-level many particle system. We are really constructing
direct product states.
For fermions, however, there are some extra complications that we have to deal
with.
We start off by following a similar procedure. For the two-level system we have
only four possible fermion states, namely,
We define $a_1$, $a_1^+$ by
$$a_1|0, 0\rangle = 0 , \quad a_1|1, 0\rangle = 0$$
$$a_1|0, 1\rangle = |0, 0\rangle , \quad a_1|1, 1\rangle = |1, 0\rangle$$
$$a_1^+|0, 0\rangle = |0, 1\rangle , \quad a_1^+|1, 0\rangle = |1, 1\rangle$$
$$a_1^+|0, 1\rangle = 0 , \quad a_1^+|1, 1\rangle = 0$$
For $a_0$, $a_0^+$ we can freely define the operation on states with no particles in level
$\phi_1(\vec{r})$.
We must take care, however, in the fermion case when a particle exists in level
$\phi_1(\vec{r})$, i.e., for the operations
$$a_0|0, 1\rangle , \quad a_0|1, 1\rangle$$
$$a_0^+|0, 1\rangle , \quad a_0^+|1, 1\rangle$$
The reason we must worry about these cases is connected with our earlier
discussion of a totally antisymmetric state vector for fermions, i.e., if we
interchange any two identical fermions we must get a minus sign.
In this formalism, how do we interchange two fermions in the state $|1, 1\rangle$? Using
only the defined relations we have:
Step 1: $|1, 1\rangle \rightarrow |1, 0\rangle = a_1|1, 1\rangle$ : remove a particle from state $\phi_1(\vec{r})$
using $a_1$
Step 2: $|1, 0\rangle \rightarrow |0, 1\rangle = a_1^+ a_0|1, 0\rangle$ : transfer the particle from state
$\phi_0(\vec{r})$ to $\phi_1(\vec{r})$ by applying $a_1^+ a_0$ ($a_0$ followed by $a_1^+$)
Step 3: $|0, 1\rangle \rightarrow a_0^+|0, 1\rangle$ : put the removed particle back, now into state
$\phi_0(\vec{r})$, using $a_0^+$
However, all we have done is switch the particles in the original state $|1, 1\rangle$,
which means that a minus sign must appear, or
$$a_0^+|0, 1\rangle = -|1, 1\rangle \qquad (19.29)$$
In a similar manner, the other relations that complete the definition of the
annihilation and creation operators are
$$a_0^+|0, 1\rangle = -|1, 1\rangle , \quad a_0 a_0^+|0, 1\rangle = -a_0|1, 1\rangle = |0, 1\rangle \qquad (19.31)$$
These definitions, which are now consistent with complete antisymmetry,
correspond to the anticommutation relations
$$\{a_0, a_0^+\} = 1 , \quad \{a_0, a_0\} = 0 , \quad \{a_0^+, a_0^+\} = 0$$
$$\{a_1, a_1^+\} = 1 , \quad \{a_1, a_1\} = 0 , \quad \{a_1^+, a_1^+\} = 0$$
$$\{a_0, a_1\} = 0 , \quad \{a_0, a_1^+\} = 0 , \quad \{a_0^+, a_1\} = 0 , \quad \{a_0^+, a_1^+\} = 0$$
We get anticommutators instead of commutators because of the complete
antisymmetry under particle interchange (see argument below).
The connection between the minus sign and the anticommutators is now clear,
i.e.,
$$a_0|1, 1\rangle = a_0 a_1^+ a_0^+|0, 0\rangle = -a_1^+ a_0 a_0^+|0, 0\rangle$$
$$= -a_1^+\left[1 - a_0^+ a_0\right]|0, 0\rangle = -a_1^+|0, 0\rangle$$
$$= -|0, 1\rangle$$
So we could have assumed the anticommutators and derived the state operations
instead of going the other way.
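The sign bookkeeping for the two-level fermion system can be checked by brute force on the four basis states. The sketch below assumes the convention used above: operators on level 1 act without a sign, while operators on level 0 pick up a factor of -1 whenever level 1 is occupied. States are stored as dictionaries mapping (n0, n1) to an amplitude; all names are illustrative.

```python
from itertools import product

BASIS = list(product((0, 1), repeat=2))  # occupation tuples (n0, n1)

def apply(op, state):
    """Apply op = ('a' or 'a+', level) to a state {(n0, n1): amplitude}."""
    kind, level = op
    out = {}
    for occ, amp in state.items():
        n = occ[level]
        if (kind == "a" and n == 0) or (kind == "a+" and n == 1):
            continue  # nothing to remove, or level already full: result is 0
        # level-0 operators carry a minus sign when level 1 is occupied
        sign = -1 if (level == 0 and occ[1] == 1) else 1
        new = list(occ)
        new[level] = 1 - n
        key = tuple(new)
        out[key] = out.get(key, 0) + sign * amp
    return out

def apply_product(ops, state):
    """Apply ops[0] ops[1] ... to the state (rightmost operator acts first)."""
    for op in reversed(ops):
        state = apply(op, state)
    return state

def anticommutator_on(op1, op2, state):
    """Return {op1, op2}|state>, dropping zero amplitudes."""
    first = apply_product([op1, op2], state)
    second = apply_product([op2, op1], state)
    total = {k: first.get(k, 0) + second.get(k, 0) for k in set(first) | set(second)}
    return {k: v for k, v in total.items() if v != 0}
```

For instance, `apply(("a+", 0), {(0, 1): 1})` returns `{(1, 1): -1}`, matching (19.29), and `anticommutator_on(("a", 0), ("a", 1), {b: 1})` is empty for every basis state `b`.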
The generalization to the many particle system where particles can occupy all
of the levels is now straightforward.
In generalizing, we will not only let the particles occupy all of the levels, but
also have all spin orientations.
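where now
$$[a_i, a_j^+] = \delta_{ij} , \quad [a_i, a_j] = 0 = [a_i^+, a_j^+] \qquad (19.34)$$
and
$$|n_0, n_1, n_2, \ldots\rangle = \cdots \frac{(a_2^+)^{n_2}}{\sqrt{n_2!}}\,\frac{(a_1^+)^{n_1}}{\sqrt{n_1!}}\,\frac{(a_0^+)^{n_0}}{\sqrt{n_0!}}\,|0\rangle \qquad (19.35)$$
where
$$|0\rangle = |0, 0, 0, 0, \ldots\rangle = \text{the vacuum state} \qquad (19.36)$$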
To within numerical factors, as we will see, these are the same relations we will
find in Chapter 20 for photons. Therefore, photons must be bosons !!
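$$\{a_i, a_j^+\} = \delta_{ij} , \quad \{a_i, a_j\} = 0 , \quad \{a_i^+, a_j^+\} = 0 \qquad (19.37)$$
and
$$|n_0, n_1, n_2, \ldots\rangle = \cdots (a_2^+)^{n_2}(a_1^+)^{n_1}(a_0^+)^{n_0}\,|0\rangle , \quad n_i = 0, 1 \text{ only} \qquad (19.38)$$
In both cases
$$N = \sum_i N_i = \sum_i a_i^+ a_i \qquad (19.39)$$
and
$$[N_i, N_j] = 0 \qquad (19.40)$$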
19.2.2 An Example
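We now consider the complete set of plane wave states in a box using periodic
boundary conditions. We have
$$H = \frac{\vec{p}_{op}^{\,2}}{2m} \qquad (19.41)$$
Since
$$[H, \vec{p}_{op}] = 0 \qquad (19.42)$$
we have a common eigenbasis that we will label by $|\vec{p}\rangle$. We then have
$$\vec{p}_{op}|\vec{p}\rangle = \vec{p}\,|\vec{p}\rangle , \quad H|\vec{p}\rangle = \frac{\vec{p}^{\,2}}{2m}|\vec{p}\rangle = E|\vec{p}\rangle \qquad (19.43)$$
so that
$$E = \frac{\vec{p}^{\,2}}{2m} \qquad (19.44)$$
The corresponding wave functions are
$$\phi_{\vec{p}}(\vec{r}) = \langle\vec{r}\,|\,\vec{p}\rangle = \frac{e^{i\vec{k}\cdot\vec{r}}}{\sqrt{V}} \qquad (19.45)$$
where $\vec{p} = \hbar\vec{k}$ and V = volume of the box. The factor $1/\sqrt{V}$ normalizes the
wave function in the box.
$$\phi_{\vec{p}}(0, y, z) = \phi_{\vec{p}}(L_x, y, z)$$
$$\phi_{\vec{p}}(x, 0, z) = \phi_{\vec{p}}(x, L_y, z)$$
$$\phi_{\vec{p}}(x, y, 0) = \phi_{\vec{p}}(x, y, L_z)$$
which imply that
$$e^{ik_x L_x} = 1 \ \Rightarrow \ k_x = \frac{2\pi n_x}{L_x} , \quad n_x = 0, \pm 1, \pm 2, \ldots$$
$$e^{ik_y L_y} = 1 \ \Rightarrow \ k_y = \frac{2\pi n_y}{L_y} , \quad n_y = 0, \pm 1, \pm 2, \ldots$$
$$e^{ik_z L_z} = 1 \ \Rightarrow \ k_z = \frac{2\pi n_z}{L_z} , \quad n_z = 0, \pm 1, \pm 2, \ldots$$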
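We now define
$a^+_{\vec{p}s}$ = the operator that creates (adds) a particle of
momentum $\vec{p}$ and spin orientation s in(to) the box
$a_{\vec{p}s}$ = the operator that annihilates (removes) a particle of
momentum $\vec{p}$ and spin orientation s in (from) the box
$$\langle\vec{r}\,'|\sum_{\vec{p}}\frac{e^{-i\vec{k}\cdot\vec{r}}}{\sqrt{V}}\,a^+_{\vec{p}s}|0\rangle = \sum_{\vec{p}}\frac{e^{-i\vec{k}\cdot\vec{r}}}{\sqrt{V}}\,\langle\vec{r}\,'|\,\vec{p}\rangle = \sum_{\vec{p}}\frac{e^{-i\vec{k}\cdot\vec{r}}}{\sqrt{V}}\frac{e^{i\vec{k}\cdot\vec{r}\,'}}{\sqrt{V}} = \delta(\vec{r} - \vec{r}\,') \qquad (19.49)$$
This says that the operator $\psi_s^+(\vec{r})$ adds all the amplitude at the position $\vec{r}$ or
we say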
In a similar manner, the operator
$$\psi_s(\vec{r}) = \left(\psi_s^+(\vec{r})\right)^+ = \sum_{\vec{p}}\frac{e^{i\vec{k}\cdot\vec{r}}}{\sqrt{V}}\,a_{\vec{p}s} \qquad (19.50)$$
In this new formalism, position and momentum are once again just numbers,
but the wave functions are now operators. Hence the name second quantization.
and
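$$= \sum_{\vec{p}\vec{p}\,'}\frac{e^{i\vec{k}\cdot\vec{r}}\,e^{-i\vec{k}'\cdot\vec{r}\,'}}{V}\,\delta_{\vec{p}\vec{p}\,'}\,\delta_{ss'} = \delta(\vec{r} - \vec{r}\,')\,\delta_{ss'} \qquad (19.53)$$
$$= \sum_{\vec{p}\vec{p}\,'}\frac{e^{i\vec{k}\cdot\vec{r}}\,e^{-i\vec{k}'\cdot\vec{r}\,'}}{V}\,\delta_{\vec{p}\vec{p}\,'}\,\delta_{ss'} = \delta(\vec{r} - \vec{r}\,')\,\delta_{ss'} \qquad (19.54)$$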
These relations imply that creating particles commutes (bosons) or
anticommutes (fermions) with annihilating particles unless the two operations
occur at the same point in space.
$$|\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_n\rangle = \frac{1}{\sqrt{n!}}\,\psi^+(\vec{r}_n)\cdots\psi^+(\vec{r}_2)\,\psi^+(\vec{r}_1)\,|0\rangle \qquad (19.57)$$
represents the state with one particle at $\vec{r}_1$, one particle at $\vec{r}_2$, and so on.
We will use these states as a basis for the many particle, many level system.
The states have the properties:
1. for bosons
$$|\vec{r}_2, \vec{r}_1, \ldots, \vec{r}_n\rangle = |\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_n\rangle \qquad (19.58)$$
due to the commutation relations which imply
2. for fermions
$$|\vec{r}_2, \vec{r}_1, \ldots, \vec{r}_n\rangle = -|\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_n\rangle \qquad (19.60)$$
due to the anticommutation relations which imply
Since
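$$\psi^+(\vec{r})\,|\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_n\rangle = \frac{\sqrt{n+1}}{\sqrt{(n+1)!}}\,\psi^+(\vec{r})\,\psi^+(\vec{r}_n)\cdots\psi^+(\vec{r}_2)\,\psi^+(\vec{r}_1)\,|0\rangle$$
$$= \sqrt{n+1}\,|\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_n, \vec{r}\rangle \qquad (19.62)$$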
great advantages of the annihilation/creation operator formalism. We can show
this important property this way:
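$$\psi(\vec{r})\,|\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_n\rangle = \frac{1}{\sqrt{n!}}\,\psi(\vec{r})\,\psi^+(\vec{r}_n)\cdots\psi^+(\vec{r}_2)\,\psi^+(\vec{r}_1)\,|0\rangle \qquad (19.63)$$
$$= \frac{1}{\sqrt{n!}}\left[\delta(\vec{r} - \vec{r}_n) \pm \psi^+(\vec{r}_n)\,\psi(\vec{r})\right]\psi^+(\vec{r}_{n-1})\cdots\psi^+(\vec{r}_2)\,\psi^+(\vec{r}_1)\,|0\rangle$$
where
$$\pm = \begin{cases} + & \text{bosons} \\ - & \text{fermions} \end{cases} \qquad (19.64)$$
We now continue commuting $\psi(\vec{r})$ with the $\psi^+$'s to the right until we have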
Note the reversal in the order of the operators.
where
$$\sum_P = \text{sum over all permutations of the coordinates} \qquad (19.70)$$
and
$$(\pm 1)^P = \begin{cases} +1 & \text{bosons} \\ +1 & \text{fermions - even permutation} \\ -1 & \text{fermions - odd permutation} \end{cases} \qquad (19.71)$$
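What is the state $|\Phi\rangle$ where the particles have a wave function $\phi(\vec{r}_1, \ldots, \vec{r}_n)$?
We must have
$$|\Phi\rangle = \int d^3\vec{r}_1{}'\,d^3\vec{r}_2{}' \cdots d^3\vec{r}_n{}'\ \phi(\vec{r}_1{}', \ldots, \vec{r}_n{}')\,|\vec{r}_1{}', \vec{r}_2{}', \ldots, \vec{r}_n{}'\rangle \qquad (19.73)$$
$$\langle\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_n\,|\,\Phi\rangle = \frac{1}{n!}\sum_P (\pm 1)^P P\,\phi(\vec{r}_1, \ldots, \vec{r}_n) \qquad (19.74)$$
This result is true even if $\phi(\vec{r}_1, \ldots, \vec{r}_n)$ is not already properly symmetrized.
When it is already properly symmetrized, then all n! terms are identical and
We must have
$$\langle\Phi\,|\,\Phi\rangle = 1 \qquad (19.76)$$
if $\phi(\vec{r}_1, \ldots, \vec{r}_n)$ is symmetrized and
$$1 = \int d^3\vec{r}_1\,d^3\vec{r}_2 \cdots d^3\vec{r}_n\ \phi^*(\vec{r}_1, \ldots, \vec{r}_n)\,\phi(\vec{r}_1, \ldots, \vec{r}_n) \qquad (19.77)$$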
i.e.,
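$$\langle\Phi\,|\,\Phi\rangle = \int d^3\vec{r}_1 \cdots d^3\vec{r}_n\ \phi^*(\vec{r}_1, \ldots, \vec{r}_n)\,\langle\vec{r}_1, \ldots, \vec{r}_n|\int d^3\vec{r}_1{}' \cdots d^3\vec{r}_n{}'\ \phi(\vec{r}_1{}', \ldots, \vec{r}_n{}')\,|\vec{r}_1{}', \ldots, \vec{r}_n{}'\rangle$$
$$= \int d^3\vec{r}_1 \cdots d^3\vec{r}_n\,d^3\vec{r}_1{}' \cdots d^3\vec{r}_n{}'\ \phi^*(\vec{r}_1, \ldots, \vec{r}_n)\,\phi(\vec{r}_1{}', \ldots, \vec{r}_n{}')\ \frac{1}{n!}\sum_P (\pm 1)^P P\left[\delta(\vec{r}_1 - \vec{r}_1{}')\,\delta(\vec{r}_2 - \vec{r}_2{}')\cdots\delta(\vec{r}_n - \vec{r}_n{}')\right]$$
$$= \int d^3\vec{r}_1 \cdots d^3\vec{r}_n\ \phi^*(\vec{r}_1, \ldots, \vec{r}_n)\,\phi(\vec{r}_1, \ldots, \vec{r}_n) = 1$$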
Now $\langle\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_n\,|\,\Phi\rangle$ is the amplitude for observing particles at $\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_n$.
It implies that
$$|\Phi\rangle = \int d^3\vec{r}_1\,d^3\vec{r}_2 \cdots d^3\vec{r}_n\ |\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_n\rangle\langle\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_n\,|\,\Phi\rangle \qquad (19.78)$$
as it should for a complete set within the n-particle subspace, i.e., it is $I_n$ only
when operating on properly symmetrized n-particle states.
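where $|\Phi\rangle$ and $|\Phi'\rangle$ are n-particle states. We obtain
$$\langle\Phi'|\,\rho(\vec{r})\,|\Phi\rangle = \langle\Phi'|\,\psi^+(\vec{r})\,\psi(\vec{r})\,|\Phi\rangle = \langle\Phi'|\,\psi^+(\vec{r})\,I\,\psi(\vec{r})\,|\Phi\rangle$$
$$= \langle\Phi'|\,\psi^+(\vec{r})\left(|0\rangle\langle 0| + \sum_{n'=1}^{\infty} I_{n'}\right)\psi(\vec{r})\,|\Phi\rangle \qquad (19.84)$$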
We thus obtain
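Now, since the $\langle\vec{r}_1, \ldots, \vec{r}_n\,|\,\Phi\rangle$ are completely symmetrized (or antisymmetrized)
this can be written as
$$\langle\Phi'|\,\rho(\vec{r})\,|\Phi\rangle = \int d^3\vec{r}_1 \cdots d^3\vec{r}_n\ \langle\Phi'\,|\,\vec{r}_1, \ldots, \vec{r}_n\rangle\,\sum_{i=1}^{n}\delta(\vec{r} - \vec{r}_i)\,\langle\vec{r}_1, \ldots, \vec{r}_n\,|\,\Phi\rangle$$
$$= \langle\Phi'|\,\sum_{i=1}^{n}\delta(\vec{r} - \vec{r}_i)\,|\Phi\rangle \qquad (19.89)$$
Since all the matrix elements of these two objects are identical, they must
be equal. Therefore,
$$\rho(\vec{r}) = \sum_{i=1}^{n}\delta(\vec{r} - \vec{r}_i) \qquad (19.90)$$
or
$\rho(\vec{r})$ = a representation of the density operator in this formalism
The way to think about this operator is as follows: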
If the particles have spin, the density operator for particles at ~r with spin s is
and
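$$\rho(\vec{r}) = \sum_s \psi_s^+(\vec{r})\,\psi_s(\vec{r}) = \text{the total density operator} \qquad (19.92)$$
$$N = \int d^3\vec{r}\ \rho(\vec{r}) = \sum_s \int d^3\vec{r}\ \psi_s^+(\vec{r})\,\psi_s(\vec{r}) = \text{total number operator} \qquad (19.93)$$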
Any operator that is given by the relation $f(\vec{p}_{op})$ is easily written down in
this formalism, i.e., such an operator is given by the number of times $\vec{p}$ occurs
($N_{\vec{p}s} = a^+_{\vec{p}s}\,a_{\vec{p}s}$) times the value of the operator ($f(\vec{p})$) summed over ($\vec{p}$, s) or
$$f(\vec{p}_{op}) = \sum_{\vec{p}s} f(\vec{p})\,a^+_{\vec{p}s}\,a_{\vec{p}s} \qquad (19.94)$$
We can rewrite this expression in a form involving the field operators that will
then lead to a prescription for writing any operator in this formalism.
and so on for the second equation of the pair, which is exactly what we have
been assuming all along!
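$$= \frac{\hbar^2}{2m}\frac{1}{V}\sum_{\vec{p}s}\int d^3\vec{r}\,d^3\vec{r}\,'\ \left(\nabla_{\vec{r}}\,e^{i\vec{k}\cdot\vec{r}}\right)\cdot\left(\nabla_{\vec{r}\,'}\,e^{-i\vec{k}\cdot\vec{r}\,'}\right)\psi_s^+(\vec{r})\,\psi_s(\vec{r}\,')$$
$$T = \frac{\hbar^2}{2m}\frac{1}{V}\sum_{\vec{p}s}\int d^3\vec{r}\,d^3\vec{r}\,'\ e^{i\vec{k}\cdot\vec{r}}\,e^{-i\vec{k}\cdot\vec{r}\,'}\left(\nabla_{\vec{r}}\,\psi_s^+(\vec{r})\right)\cdot\left(\nabla_{\vec{r}\,'}\,\psi_s(\vec{r}\,')\right) \qquad (19.97)$$
Using
$$\frac{1}{V}\sum_{\vec{p}} e^{i\vec{k}\cdot\vec{r}}\,e^{-i\vec{k}\cdot\vec{r}\,'} = \delta(\vec{r} - \vec{r}\,') \qquad (19.98)$$
$$T = \frac{\hbar^2}{2m}\sum_s\int d^3\vec{r}\ \nabla\psi_s^+(\vec{r})\cdot\nabla\psi_s(\vec{r}) \qquad (19.99)$$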
This is very similar to the expectation value of the kinetic energy operator for
a single particle, i.e.,
$$\langle T\rangle = \frac{\hbar^2}{2m}\int d^3\vec{r}\ \nabla\phi^*(\vec{r})\cdot\nabla\phi(\vec{r}) \qquad (19.100)$$
Similarly, the density operator resembles the probability density $\phi^*(\vec{r})\phi(\vec{r})$ for
finding a single particle with wave function $\phi$ at the point $\vec{r}$.
We can now exploit this similarity to write down other operators, for example,
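the current density operator is given by
$$\vec{j}(\vec{r}) = \frac{1}{2im}\left[\psi^+(\vec{r})\left(\nabla\psi(\vec{r})\right) - \left(\nabla\psi^+(\vec{r})\right)\psi(\vec{r})\right] \qquad (19.101)$$
the operator for the density of spin at point $\vec{r}$ is given by
$$\vec{S}(\vec{r}) = \frac{1}{2}\sum_{ss'}\psi_s^+(\vec{r})\,\vec{\sigma}_{ss'}\,\psi_{s'}(\vec{r}) \qquad (19.102)$$
where
$$\vec{\sigma} = (\sigma_x, \sigma_y, \sigma_z) \qquad (19.103)$$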
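In the new formalism, we describe the ground state via a set of occupation
numbers
$$n_{\vec{p}\uparrow} = n_{\vec{p}\downarrow} = \langle\Phi_0|\,a^+_{\vec{p}s}\,a_{\vec{p}s}\,|\Phi_0\rangle = \begin{cases} 1 & |\vec{p}| \le p_F \\ 0 & |\vec{p}| > p_F \end{cases} \qquad (19.104)$$
Now
$$N = \sum_{\vec{p}s} n_{\vec{p}s} = 2\sum_{|\vec{p}| \le p_F} 1 \qquad (19.105)$$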
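and, thus,
$$\sum_{\vec{p}} \rightarrow V\int\frac{d^3\vec{p}}{(2\pi\hbar)^3} \qquad (19.108)$$
We then get
$$p_F = \hbar\left(3\pi^2 n\right)^{1/3} \qquad (19.109)$$
where
$$n = \frac{N}{V} = \text{average particle density} \qquad (19.110)$$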
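As a quick numerical illustration of the relation between the Fermi momentum and the particle density for spin-1/2 fermions, the sketch below evaluates p_F = hbar (3 pi^2 n)^(1/3) and its inverse. The function names are illustrative, and the density used in the usage note is only an order-of-magnitude figure for conduction electrons in a metal, assumed here for the example.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s

def fermi_momentum(n, hbar=HBAR):
    """Fermi momentum for spin-1/2 fermions at number density n (per m^3)."""
    return hbar * (3.0 * math.pi ** 2 * n) ** (1.0 / 3.0)

def density_from_fermi_momentum(p_f, hbar=HBAR):
    """Inverse relation: n = p_F^3 / (3 pi^2 hbar^3)."""
    return p_f ** 3 / (3.0 * math.pi ** 2 * hbar ** 3)

def fermi_wave_number(n):
    """k_F = p_F / hbar, independent of the value of hbar."""
    return (3.0 * math.pi ** 2 * n) ** (1.0 / 3.0)
```

For a density of order 8.5e28 per cubic meter, `fermi_wave_number` gives a value slightly above 1e10 inverse meters, i.e. a Fermi wavelength comparable to the interparticle spacing.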
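The expectation value of the density operator is
$$\langle\rho(\vec{r})\rangle = \sum_s \langle\Phi_0|\,\psi_s^+(\vec{r})\,\psi_s(\vec{r})\,|\Phi_0\rangle = \sum_{s\vec{p}\vec{p}\,'}\frac{e^{-i\vec{k}\cdot\vec{r}}\,e^{i\vec{k}'\cdot\vec{r}}}{V}\,\langle\Phi_0|\,a^+_{\vec{p}s}\,a_{\vec{p}\,'s}\,|\Phi_0\rangle \qquad (19.111)$$
Now
$$\langle\Phi_0|\,a^+_{\vec{p}s}\,a_{\vec{p}\,'s}\,|\Phi_0\rangle = \delta_{\vec{p}\vec{p}\,'}\,n_{\vec{p}s} \qquad (19.112)$$
since if we remove a particle of momentum $\vec{p}\,'$ from the ground state, we can only
get the ground state back if we add a particle of the same momentum. Here
we have used
$$n_{\vec{p}s} = \langle\Phi_0|\,a^+_{\vec{p}s}\,a_{\vec{p}s}\,|\Phi_0\rangle \qquad (19.113)$$
Therefore,
$$\langle\rho(\vec{r})\rangle = \frac{1}{V}\sum_{\vec{p}s} n_{\vec{p}s} = n = \text{constant} \qquad (19.114)$$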
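We have
$$G_s(\vec{r} - \vec{r}\,') = \sum_{\vec{p}\vec{p}\,'}\frac{e^{-i\vec{k}\cdot\vec{r}}\,e^{i\vec{k}'\cdot\vec{r}\,'}}{V}\,\langle\Phi_0|\,a^+_{\vec{p}s}\,a_{\vec{p}\,'s}\,|\Phi_0\rangle$$
$$= \frac{1}{V}\sum_{\vec{p}\vec{p}\,'} e^{-i\vec{k}\cdot\vec{r}}\,e^{i\vec{k}'\cdot\vec{r}\,'}\,\delta_{\vec{p}\vec{p}\,'}\,n_{\vec{p}s} = \frac{1}{V}\sum_{\vec{p}} e^{-i\vec{k}\cdot(\vec{r} - \vec{r}\,')}\,n_{\vec{p}s} \qquad (19.116)$$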
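Changing to an integral we obtain
$$
\begin{aligned}
G_s(\vec r - \vec r') &= \int^{p_F}\frac{d^3\vec p}{(2\pi\hbar)^3}\, e^{-i\vec k\cdot(\vec r - \vec r')}
= \frac{1}{4\pi^2\hbar^3}\int_0^{p_F} p^2\,dp\int_{-1}^{1} d\eta\; e^{ik|\vec r - \vec r'|\eta} \\
&= \frac{1}{4\pi^2\hbar^3}\int_0^{p_F} p^2\,dp\;
\frac{e^{ik|\vec r - \vec r'|} - e^{-ik|\vec r - \vec r'|}}{ik|\vec r - \vec r'|} \\
&= \frac{3n}{2}\,\frac{\sin x - x\cos x}{x^3}
\end{aligned} \tag{19.117}
$$
where
$$
x = |\vec r - \vec r'|\,\frac{p_F}{\hbar}\ , \qquad n = \frac{1}{3\pi^2}\frac{p_F^3}{\hbar^3} \tag{19.118}
$$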
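As a function of x this looks as shown in Figure 19.1 below (for n = 1000).
For small x,
$$
G_s(\vec r - \vec r') \approx \frac{3n}{2x^3}\left(\frac{1}{3}x^3 - \frac{1}{30}x^5\right)
= \frac{n}{2}\left(1 - \frac{x^2}{10}\right)
= \frac{n}{2}\left[1 - \frac{1}{10}\left(\frac{p_F|\vec r - \vec r'|}{\hbar}\right)^2\right] \tag{19.119}
$$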
$G_s(\vec r - \vec r')$ is called the one-particle density matrix.
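2. Calculate the density distribution of the particles with spin $s'$ in the new
state. The density is
$$
\begin{aligned}
&\langle\Phi_0(\vec r, s)|\,\psi^{+}_{s'}(\vec r')\psi_{s'}(\vec r')\,|\Phi_0(\vec r, s)\rangle \\
&\quad = \langle\Phi_0|\,\psi_s^{+}(\vec r)\psi^{+}_{s'}(\vec r')\psi_{s'}(\vec r')\psi_s(\vec r)\,|\Phi_0\rangle
= \left(\frac{n}{2}\right)^2 g_{ss'}(\vec r - \vec r')
\end{aligned} \tag{19.121}
$$
where $g_{ss'}(\vec r - \vec r')$ is the pair correlation function.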
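We can evaluate $g_{ss'}(\vec r - \vec r')$ by shifting to the creation/annihilation operator
formalism. We get
$$
\left(\frac{n}{2}\right)^2 g_{ss'}(\vec r - \vec r') = \frac{1}{V^2}\sum_{\vec p\,\vec p'\,\vec q\,\vec q'}
e^{-i(\vec p - \vec p')\cdot\vec r/\hbar}\, e^{-i(\vec q - \vec q')\cdot\vec r'/\hbar}\,
\langle\Phi_0|\, a^{+}_{\vec p s}a^{+}_{\vec q s'}a_{\vec q' s'}a_{\vec p' s}\, |\Phi_0\rangle \tag{19.124}
$$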
Now
$$
\langle\Phi_0|\, a^{+}_{\vec p s}a^{+}_{\vec q s'}a_{\vec q' s'}a_{\vec p' s}\, |\Phi_0\rangle = 0 \tag{19.125}
$$
unless we put back particles with the same spin and same momentum that we
remove, i.e.,
$$
\text{if } s \neq s' , \text{ then } \vec p' = \vec p \text{ and } \vec q' = \vec q \tag{19.126}
$$
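This implies that
$$
\langle\Phi_0|\, a^{+}_{\vec p s}a^{+}_{\vec q s'}a_{\vec q' s'}a_{\vec p' s}\, |\Phi_0\rangle
= \langle\Phi_0|\, a^{+}_{\vec p s}a_{\vec p s}a^{+}_{\vec q s'}a_{\vec q s'}\, |\Phi_0\rangle
= n_{\vec p s}\, n_{\vec q s'} \tag{19.127}
$$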
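We then get
$$
\left(\frac{n}{2}\right)^2 g_{ss'}(\vec r - \vec r') = \frac{1}{V^2}\sum_{\vec p\,\vec q} n_{\vec p s}\, n_{\vec q s'}
= n_s n_{s'} = \left(\frac{n}{2}\right)^2 \tag{19.128}
$$
or
$$
g_{ss'}(\vec r - \vec r') = 1 \quad \text{for } s \neq s' \tag{19.129}
$$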
This implies that the relative probability for finding particles at $\vec r$ and $\vec r'$ for
different spins is independent of $|\vec r - \vec r'|$. This is the same result as one obtains
in a classical non-interacting gas.
The PEP does not influence particles of different (opposite in this case) spins.
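On the other hand, if the spins are the same, $s = s'$, then we have two
possibilities, namely, $\vec p' = \vec p,\ \vec q' = \vec q$ or $\vec p' = \vec q,\ \vec q' = \vec p$.
The case $\vec q' = \vec p'$ does not contribute, since
$$
a_{\vec q' s}\, a_{\vec p' s} \to (a_{\vec p' s})^2 = 0 \quad \text{(for fermions)} \tag{19.130}
$$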
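Therefore, we have
$$
\begin{aligned}
\langle\Phi_0|\, a^{+}_{\vec p s}a^{+}_{\vec q s}a_{\vec q' s}a_{\vec p' s}\, |\Phi_0\rangle
&= \delta_{\vec p\vec p'}\delta_{\vec q\vec q'}\,\langle\Phi_0|\, a^{+}_{\vec p s}a^{+}_{\vec q s}a_{\vec q s}a_{\vec p s}\, |\Phi_0\rangle \\
&\quad + \delta_{\vec p\vec q'}\delta_{\vec q\vec p'}\,\langle\Phi_0|\, a^{+}_{\vec p s}a^{+}_{\vec q s}a_{\vec p s}a_{\vec q s}\, |\Phi_0\rangle
\end{aligned} \tag{19.131}
$$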
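This becomes
$$
\begin{aligned}
\langle\Phi_0|\, a^{+}_{\vec p s}a^{+}_{\vec q s}a_{\vec q' s}a_{\vec p' s}\, |\Phi_0\rangle
&= (\delta_{\vec p\vec p'}\delta_{\vec q\vec q'} - \delta_{\vec p\vec q'}\delta_{\vec q\vec p'})\,
\langle\Phi_0|\, a^{+}_{\vec p s}a_{\vec p s}a^{+}_{\vec q s}a_{\vec q s}\, |\Phi_0\rangle \\
&= (\delta_{\vec p\vec p'}\delta_{\vec q\vec q'} - \delta_{\vec p\vec q'}\delta_{\vec q\vec p'})\, n_{\vec p s}\, n_{\vec q s}
\end{aligned} \tag{19.132}
$$
$$
\vec q = \vec p \ \to\ \langle\ldots\rangle = 0
$$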
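We finally obtain
$$
\begin{aligned}
\left(\frac{n}{2}\right)^2 g_{ss'}(\vec r - \vec r')
&= \frac{1}{V^2}\sum_{\vec p\,\vec q}\left[1 - e^{-i(\vec p - \vec q)\cdot(\vec r - \vec r')/\hbar}\right] n_{\vec p s}\, n_{\vec q s} \\
&= \left(\frac{n}{2}\right)^2 - \left[G_s(\vec r - \vec r')\right]^2
\end{aligned} \tag{19.133}
$$
where $G_s(\vec r - \vec r')$ is the single particle density function. This then becomes
$$
g_{ss'}(\vec r - \vec r') = 1 - \frac{9}{x^6}\left(\sin x - x\cos x\right)^2 \tag{19.134}
$$
where
$$
x = \frac{p_F}{\hbar}\,|\vec r - \vec r'| \tag{19.135}
$$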
As a function of x this looks as shown in Figure 19.2 below.
This result implies a substantial reduction in the probability for finding two
fermions of the same spin at distances less than $\hbar/p_F$.
The PEP causes large correlations in the motion of the particles with the same
spin. It seems like fermions of the same spin repel each other at short distances.
This effective repulsion is due to the exchange symmetry (PEP) of the wave
function and not from any real additional potentials.
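The behavior of the same-spin pair correlation function can be checked numerically. The sketch below, with function names chosen here purely for illustration, evaluates Eq. (19.134) and the normalized one-particle density matrix of Eq. (19.117), using a short series near $x = 0$ to avoid cancellation errors.

```python
import math


def sin_minus_xcos(x):
    """Return sin x - x cos x, using a Taylor series for small |x|."""
    if abs(x) < 0.1:
        return x ** 3 / 3.0 - x ** 5 / 30.0 + x ** 7 / 840.0
    return math.sin(x) - x * math.cos(x)


def density_matrix_ratio(x):
    """Return G_s / (n/2) = 3 (sin x - x cos x) / x^3, from Eq. (19.117)."""
    if x == 0.0:
        return 1.0
    return 3.0 * sin_minus_xcos(x) / x ** 3


def g_same_spin(x):
    """Same-spin pair correlation function of Eq. (19.134)."""
    return 1.0 - density_matrix_ratio(x) ** 2


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 2.0, 4.0, 8.0):
        print(f"x = {x:4.1f}   g(x) = {g_same_spin(x):.6f}")
```

The printed values rise from zero at $x = 0$ toward one for $x$ of a few units, which is the exchange hole described in the text.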
Figure 19.2: g(x) - Two Particle Correlation Function - Fermions
The calculation of the pair correlation function is the same as for fermions up
to this point.
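$$
n^2 g(\vec r - \vec r') = \frac{1}{V^2}\sum_{\vec p\,\vec p'\,\vec q\,\vec q'}
e^{-i(\vec p - \vec p')\cdot\vec r/\hbar}\, e^{-i(\vec q - \vec q')\cdot\vec r'/\hbar}\,
\langle\Phi_0|\, a^{+}_{\vec p}a^{+}_{\vec q}a_{\vec q'}a_{\vec p'}\, |\Phi_0\rangle \tag{19.138}
$$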
In this case,
$$
\langle\Phi_0|\, a^{+}_{\vec p}a^{+}_{\vec q}a_{\vec q'}a_{\vec p'}\, |\Phi_0\rangle \neq 0 \tag{19.139}
$$
only if
$$
\vec p = \vec p'\ ,\ \vec q = \vec q' \quad \text{or} \quad \vec p = \vec q'\ ,\ \vec q = \vec p'
$$
These two cases are not distinct if $\vec p = \vec q$.
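$$
\begin{aligned}
\langle\Phi_0|\, a^{+}_{\vec p}a^{+}_{\vec q}a_{\vec q'}a_{\vec p'}\, |\Phi_0\rangle
&= (1 - \delta_{\vec p\vec q})\left[\delta_{\vec p\vec p'}\delta_{\vec q\vec q'}\,
\langle\Phi_0|\, a^{+}_{\vec p}a^{+}_{\vec q}a_{\vec q}a_{\vec p}\, |\Phi_0\rangle
+ \delta_{\vec p\vec q'}\delta_{\vec q\vec p'}\,
\langle\Phi_0|\, a^{+}_{\vec p}a^{+}_{\vec q}a_{\vec p}a_{\vec q}\, |\Phi_0\rangle\right] \\
&\quad + \delta_{\vec p\vec q}\,\delta_{\vec p\vec p'}\delta_{\vec q\vec q'}\,
\langle\Phi_0|\, a^{+}_{\vec p}a^{+}_{\vec p}a_{\vec p}a_{\vec p}\, |\Phi_0\rangle \\
&= (1 - \delta_{\vec p\vec q})\,(\delta_{\vec p\vec p'}\delta_{\vec q\vec q'} + \delta_{\vec p\vec q'}\delta_{\vec q\vec p'})\, n_{\vec p}\, n_{\vec q}
+ \delta_{\vec p\vec q}\,\delta_{\vec p\vec p'}\delta_{\vec q\vec q'}\, n_{\vec p}(n_{\vec p} - 1)
\end{aligned} \tag{19.140}
$$
where we have used $\left[a_{\vec p}, a^{+}_{\vec p}\right] = 1$ in the last term. Therefore, we obtain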
If we let V become large, with N/V = n fixed, then the third term is of order 1/V
smaller than the first two terms and we can neglect it.
The exponential term in $(\vec r - \vec r')^2$ is due to exchange symmetry. In this case,
exchange symmetry increases the probability that two bosons will be found at
small separations.
In fact,
Figure 19.4: Hanbury-Brown and Twiss Boson Clumping Experiment
The half-silvered mirror splits the beam into two identical beams. The
amplitude for a photon to be transmitted/reflected by the mirror is $1/\sqrt{2}$ (the
probability is 1/2).
and then they averaged the quantity $I_1(t)I_2(t + \tau)$ over $t$, keeping $\tau$ fixed.
This is the same as measuring the relative probability of observing two photons
separated by a distance $c\tau$ in the beams (c = speed of light).
The experimental result is just $g(\vec r - \vec r')$ for $|\vec r - \vec r'| = c\tau$. This confirmed the
theory, or so one thought at that time.
Let us see how. Consider the setup shown in Figure 19.5 below:
Figure 19.5: Classical or Quantum?
We have two sources of photons, A and B. A emits coherent light with amplitude
$\alpha$ and wave number $k$. B emits coherent light with amplitude $\beta$ and wave number
$k'$. We assume that the relative phase of the two coherent beams is random and
that they have the same polarization.
The amplitude for light (A $\to$ 1) is $\alpha e^{ikr_1}$ and the amplitude for light
(B $\to$ 1) is $\beta e^{ik'r_1'}$. Therefore, the superposition principle implies that
and that
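$$
I_1 = \text{intensity at detector 1}
= \left|\alpha e^{ikr_1} + \beta e^{ik'r_1'}\right|^2
= |\alpha|^2 + |\beta|^2 + 2\,\mathrm{Re}\left[\alpha^{*}\beta\, e^{i(k'r_1' - kr_1)}\right] \tag{19.146}
$$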
Therefore,
Similarly,
and that
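$$
I_2 = \text{intensity at detector 2}
= \left|\alpha e^{ikr_2} + \beta e^{ik'r_2'}\right|^2
= |\alpha|^2 + |\beta|^2 + 2\,\mathrm{Re}\left[\alpha^{*}\beta\, e^{i(k'r_2' - kr_2)}\right] \tag{19.149}
$$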
Therefore,
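The product of the intensities, however, behaves very differently, i.e.,
$$
I_1 I_2 = |a_1 a_2|^2
= \left|\alpha^2 e^{ik(r_1 + r_2)} + \beta^2 e^{ik'(r_1' + r_2')}
+ \alpha\beta\left(e^{ikr_1}e^{ik'r_2'} + e^{ik'r_1'}e^{ikr_2}\right)\right|^2 \tag{19.151}
$$
Averaging over the random relative phase of the two sources gives
$$
\overline{I_1 I_2} = \bar I_1\,\bar I_2 + 2\,|\alpha|^2|\beta|^2\cos\left[k'(r_1' - r_2') - k(r_1 - r_2)\right] \tag{19.152}
$$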
Therefore, the correlated intensities have a term that depends on the detector
separation.
This term is a maximum when the detectors are at the same point.
If we average over all the different k and k 0 present in the beam (using a Gaus-
sian distribution) we get the same form as the quantum result.
This seems to imply that the photon bunching effect seen in the HBT experi-
ment is a consequence of the superposition principle applied to light from noisy
sources.
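The classical two-source argument can be verified numerically. The following sketch is an illustration only: it assumes real magnitudes for the two source amplitudes, arbitrary path lengths, and an average over the relative phase taken on a uniform grid (which averages a low-order trigonometric polynomial exactly, so no random numbers are needed). All names and parameter values are hypothetical.

```python
import cmath
import math


def intensity(alpha, beta, k, kp, r, rp):
    """Intensity |alpha e^{ikr} + beta e^{ik'r'}|^2 at one detector."""
    amplitude = alpha * cmath.exp(1j * k * r) + beta * cmath.exp(1j * kp * rp)
    return abs(amplitude) ** 2


def averaged_product(abs_alpha, abs_beta, k, kp, r1, r1p, r2, r2p, n_phase=64):
    """Average I1 * I2 over the relative phase of source B with respect to A."""
    total = 0.0
    for j in range(n_phase):
        beta = abs_beta * cmath.exp(2j * math.pi * j / n_phase)
        total += (intensity(abs_alpha, beta, k, kp, r1, r1p)
                  * intensity(abs_alpha, beta, k, kp, r2, r2p))
    return total / n_phase


def predicted_product(abs_alpha, abs_beta, k, kp, r1, r1p, r2, r2p):
    """Phase-averaged product: mean intensities squared plus the cosine term."""
    mean_intensity = abs_alpha ** 2 + abs_beta ** 2
    phase = kp * (r1p - r2p) - k * (r1 - r2)
    return mean_intensity ** 2 + 2.0 * abs_alpha ** 2 * abs_beta ** 2 * math.cos(phase)


if __name__ == "__main__":
    args = (1.0, 0.7, 2.0, 2.3, 0.4, 1.1, 0.9, 0.2)  # illustrative values
    print(averaged_product(*args), predicted_product(*args))
```

When the two detectors coincide ($r_1 = r_2$, $r_1' = r_2'$) the cosine equals one and the correlation is largest, in line with the statement above.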
These two ways are indistinguishable and the interference between them gives
the cosine term.
So, we corroborate superposition and bunching, but we do not seem to need the
quantum concept of a photon to do it. When one studies this problem in more
detail one can prove that a photon with quantum properties must exist.
Suppose the particles interact via a two-particle potential $V(\vec r - \vec r')$. The
interaction energy operator then becomes
$$
\mathcal{V} = \frac{1}{2}\sum_{ss'}\int d^3\vec r\, d^3\vec r'\; V(\vec r - \vec r')\,
\psi_s^{+}(\vec r)\psi_{s'}^{+}(\vec r')\psi_{s'}(\vec r')\psi_s(\vec r) \tag{19.154}
$$
The order of the operators in this expression is very important. This form for
$\mathcal{V}$ can be confirmed by comparing its matrix elements to the matrix elements in
the standard formalism.
We now calculate the ground state energy of our gas of spin-1/2 fermions.
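The first order energy correction $E^{(1)}$ is the expectation value of $\mathcal{V}$ in the
unperturbed ground state. We get
$$
\begin{aligned}
E^{(1)} &= \frac{1}{2}\sum_{ss'}\int d^3\vec r\, d^3\vec r'\; V(\vec r - \vec r')\,
\langle\Phi_0|\,\psi_s^{+}(\vec r)\psi_{s'}^{+}(\vec r')\psi_{s'}(\vec r')\psi_s(\vec r)\,|\Phi_0\rangle \\
&= \frac{1}{2}\int d^3\vec r\, d^3\vec r'\; V(\vec r - \vec r')\sum_{ss'}\left(\frac{n}{2}\right)^2 g_{ss'}(\vec r - \vec r') \\
&= \frac{1}{2}\int d^3\vec r\, d^3\vec r'\; V(\vec r - \vec r')\left[n^2 - \sum_s G_s(\vec r - \vec r')^2\right]
\end{aligned} \tag{19.157}
$$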
If we let
$$
v_0 = \int d^3\vec r\; V(\vec r) \tag{19.158}
$$
then we have
$$
\frac{1}{2}\int d^3\vec r\, d^3\vec r'\; V(\vec r - \vec r')\, n^2 = \frac{N n v_0}{2} \tag{19.159}
$$
This is the average interaction of a uniform density of particles with itself (no
correlations). It is called the direct or Hartree energy.
This term takes account of the tendency of particles of the same spin to stay
apart. The effects of the short-range part of V (~r ~r 0 ) are overcorrected in the
direct energy and fixed up in exchange energy.
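We have
$$
\frac{E_{ex}}{N} = -\frac{9n}{4}\int d^3r\;
\frac{\left[\sin\left(\frac{p_F r}{\hbar}\right) - \frac{p_F r}{\hbar}\cos\left(\frac{p_F r}{\hbar}\right)\right]^2}{\left(\frac{p_F r}{\hbar}\right)^6}\; V(r) \tag{19.161}
$$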
19.6.1 An Example
Consider a gas of electrons of average density n interacting via a Coulomb
potential
$$
V(\vec r - \vec r') = \frac{e^2}{|\vec r - \vec r'|} \tag{19.163}
$$
The conduction electrons in a metal form such a gas.
We note that in a real physical system of this type, we never have an isolated
electron gas. There always exist enough positive charges to make the overall
system electrically neutral.
$$
\frac{1}{2}\int d^3r\, d^3r'\; \frac{e^2 n^2}{|\vec r - \vec r'|} \tag{19.164}
$$
plus the average electrostatic interaction between the positive background and
the electrons
$$
-\int d^3r\, d^3r'\; \frac{e^2 n^2}{|\vec r - \vec r'|} \tag{19.165}
$$
exactly cancels the Hartree energy as it must because the electrostatic energy of
a neutral system can only be proportional to the volume for a large system (not
a higher power of the volume!).
Therefore, the net interaction energy of the electron gas (to first order) is
$$
\frac{E_{ex}}{N} = -\frac{9\pi n e^2\hbar^2}{p_F^2}\int_0^{\infty}\frac{dx}{x^5}\left(\sin x - x\cos x\right)^2 \tag{19.166}
$$
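The dimensionless integral appearing in the exchange energy can be evaluated numerically. The sketch below uses a composite Simpson rule on a truncated interval; the cutoff and step count are arbitrary choices made here, and the neglected tail is of order $1/(4x_{max}^2)$. The result is close to 1/4, which corresponds to the familiar exchange energy per particle $-3e^2p_F/(4\pi\hbar)$.

```python
import math


def sin_minus_xcos(x):
    """Return sin x - x cos x, using a Taylor series for small |x|."""
    if abs(x) < 0.1:
        return x ** 3 / 3.0 - x ** 5 / 30.0 + x ** 7 / 840.0
    return math.sin(x) - x * math.cos(x)


def integrand(x):
    """Integrand (sin x - x cos x)^2 / x^5; it behaves like x/9 near x = 0."""
    if x == 0.0:
        return 0.0
    return sin_minus_xcos(x) ** 2 / x ** 5


def simpson(func, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def exchange_integral(x_max=200.0, n=40000):
    """Approximate the integral from 0 to infinity by truncating at x_max."""
    return simpson(integrand, 0.0, x_max, n)


if __name__ == "__main__":
    print(f"integral ~ {exchange_integral():.6f}")
```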
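where
$$
V_{\vec k} = \int d^3r\; e^{-i\vec k\cdot\vec r}\, V(\vec r) \tag{19.172}
$$
so that
$$
H = \sum_{\vec k}\frac{\hbar^2k^2}{2m}\, a^{+}_{\vec k}a_{\vec k}
+ \frac{1}{2V}\sum_{\vec k\,\vec p\,\vec q} V_{\vec q}\; a^{+}_{\vec k + \vec q}\, a^{+}_{\vec p - \vec q}\, a_{\vec p}\, a_{\vec k} \tag{19.174}
$$
where we have used the delta function (which corresponds to momentum
conservation). This corresponds to the process
$$
\vec k + \vec p \to (\vec k + \vec q) + (\vec p - \vec q) \tag{19.175}
$$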
At low temperatures, a Bose-Einstein condensation takes place in the $\vec k = 0$
mode, i.e., the $\vec k = 0$ mode is macroscopically occupied or
$$
N_0 = \langle\Phi_0|\, a_0^{+}a_0\, |\Phi_0\rangle \approx N \ , \quad N - N_0 = \text{\# excited particles} \ll N_0 \tag{19.176}
$$
This means that we can neglect the interaction of excited particles with one
another and restrict our attention to the interaction of the excited particles
with the condensed particles.
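This gives
$$
\begin{aligned}
H &= \sum_{\vec k}\frac{\hbar^2k^2}{2m}\, a^{+}_{\vec k}a_{\vec k} + \frac{1}{2V}V_0\, a_0^{+}a_0^{+}a_0a_0 \\
&\quad + \frac{1}{V}\sum_{\vec k\neq 0}(V_0 + V_{\vec k})\, a_0^{+}a_0\, a^{+}_{\vec k}a_{\vec k}
+ \frac{1}{2V}\sum_{\vec k\neq 0} V_{\vec k}\left(a^{+}_{\vec k}a^{+}_{-\vec k}a_0a_0 + a_0^{+}a_0^{+}a_{\vec k}a_{-\vec k}\right) + \ldots
\end{aligned} \tag{19.177}
$$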
The effect of $a_0^{+}$ and $a_0$ on the state with $N_0$ particles in the condensate is
$$
a_0\,|\ldots, N_0, \ldots\rangle = \sqrt{N_0}\;|\ldots, N_0 - 1, \ldots\rangle \tag{19.178}
$$
$$
a_0^{+}\,|\ldots, N_0, \ldots\rangle = \sqrt{N_0 + 1}\;|\ldots, N_0 + 1, \ldots\rangle \tag{19.179}
$$
Since $N_0$ is a very large number ($\sim 10^{23}$), both of these relations correspond to
multiplication by $\sqrt{N_0}$. It is physically clear that the removal or addition of one
particle from the condensate will make no difference to the physical properties
of the system.
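We then have
$$
\begin{aligned}
H &= \sum_{\vec k\neq 0}\frac{\hbar^2k^2}{2m}\, a^{+}_{\vec k}a_{\vec k} + \frac{1}{2V}N_0^2 V_0 \\
&\quad + \frac{N_0}{V}\sum_{\vec k\neq 0}(V_0 + V_{\vec k})\, a^{+}_{\vec k}a_{\vec k}
+ \frac{N_0}{2V}\sum_{\vec k\neq 0} V_{\vec k}\left(a^{+}_{\vec k}a^{+}_{-\vec k} + a_{\vec k}a_{-\vec k}\right)
\end{aligned} \tag{19.180}
$$
We can write
$$
N = N_0 + \sum_{\vec k\neq 0} a^{+}_{\vec k}a_{\vec k} \tag{19.181}
$$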
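The Hamiltonian then becomes
$$
H = \sum_{\vec k\neq 0}\frac{\hbar^2k^2}{2m}\, a^{+}_{\vec k}a_{\vec k}
+ \frac{N}{V}\sum_{\vec k\neq 0} V_{\vec k}\, a^{+}_{\vec k}a_{\vec k}
+ \frac{N^2}{2V}V_0
+ \frac{N}{2V}\sum_{\vec k\neq 0} V_{\vec k}\left(a^{+}_{\vec k}a^{+}_{-\vec k} + a_{\vec k}a_{-\vec k}\right) \tag{19.183}
$$
with real coefficients. We then require that the operators $\alpha_{\vec k}$ satisfy Bose
commutation relations
$$
\left[\alpha_{\vec k}, \alpha_{\vec k'}\right] = \left[\alpha^{+}_{\vec k}, \alpha^{+}_{\vec k'}\right] = 0 \ , \quad
\left[\alpha_{\vec k}, \alpha^{+}_{\vec k'}\right] = \delta_{\vec k\vec k'} \tag{19.186}
$$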
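In order for the non-diagonal terms to vanish, we must have
$$
\left(\frac{\hbar^2k^2}{2m} + nV_{\vec k}\right)u_{\vec k}v_{\vec k} + \frac{N}{2V}V_{\vec k}\left(u_{\vec k}^2 + v_{\vec k}^2\right) = 0 \tag{19.189}
$$
This equation together with $u_{\vec k}^2 - v_{\vec k}^2 = 1$ is sufficient to determine $u_{\vec k}^2$ and $v_{\vec k}^2$. If
we define
$$
\hbar\omega_{\vec k} = \left[\left(\frac{\hbar^2k^2}{2m} + nV_{\vec k}\right)^2 - \left(nV_{\vec k}\right)^2\right]^{1/2}
= \left[\left(\frac{\hbar^2k^2}{2m}\right)^2 + \frac{n\hbar^2k^2V_{\vec k}}{m}\right]^{1/2} \tag{19.190}
$$
then we get
$$
u_{\vec k}^2 = \frac{\hbar\omega_{\vec k} + \frac{\hbar^2k^2}{2m} + nV_{\vec k}}{2\hbar\omega_{\vec k}} \ , \quad
v_{\vec k}^2 = \frac{-\hbar\omega_{\vec k} + \frac{\hbar^2k^2}{2m} + nV_{\vec k}}{2\hbar\omega_{\vec k}} \tag{19.191}
$$
and
$$
u_{\vec k}v_{\vec k} = -\frac{nV_{\vec k}}{2\hbar\omega_{\vec k}} \ , \quad
v_{\vec k}^2 = \frac{\left(nV_{\vec k}\right)^2}{2\hbar\omega_{\vec k}\left(\hbar\omega_{\vec k} + \frac{\hbar^2k^2}{2m} + nV_{\vec k}\right)} \tag{19.192}
$$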
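Finally, the Hamiltonian becomes
$$
H = \frac{1}{2V}N^2V_0 - \frac{1}{2}\sum_{\vec k\neq 0}\left(\frac{\hbar^2k^2}{2m} + nV_{\vec k} - \hbar\omega_{\vec k}\right)
+ \sum_{\vec k\neq 0}\hbar\omega_{\vec k}\,\alpha^{+}_{\vec k}\alpha_{\vec k} \tag{19.193}
$$
where
$$
\frac{1}{2V}N^2V_0 - \frac{1}{2}\sum_{\vec k\neq 0}\left(\frac{\hbar^2k^2}{2m} + nV_{\vec k} - \hbar\omega_{\vec k}\right) = \text{ground state energy } E_0 \tag{19.194}
$$
$$
\sum_{\vec k\neq 0}\hbar\omega_{\vec k}\,\alpha^{+}_{\vec k}\alpha_{\vec k} = \text{sum of oscillators or excitations} \tag{19.195}
$$
The excitations (oscillators) or quanta that are created by the $\alpha^{+}_{\vec k}$ are called
quasiparticles.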
Quasiparticles appear in all kinds of physical systems at all energy scales.
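The ground state of the system $|0\rangle$ is fixed by the condition that no quasiparticles
are excited,
$$
\alpha_{\vec k}\,|0\rangle = 0 \quad \text{for all } \vec k \tag{19.197}
$$
The number of particles outside the condensate (the ground state) is given by
$$
N' = \langle 0|\sum_{\vec k\neq 0} a^{+}_{\vec k}a_{\vec k}\,|0\rangle
= \sum_{\vec k\neq 0} v_{\vec k}^2\,\langle 0|\,\alpha_{\vec k}\alpha^{+}_{\vec k}\,|0\rangle
= \sum_{\vec k\neq 0} v_{\vec k}^2 \tag{19.198}
$$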
For large $\vec k$ we have
$$
E_{\vec k} = \hbar\omega_{\vec k} = \frac{\hbar^2k^2}{2m} + nV_{\vec k} \tag{19.201}
$$
This corresponds to the dispersion relation for free particles whose energy is
shifted by a mean potential $nV_{\vec k}$.
The cluster can lose energy (experience friction) only by causing excitations in
the fluid. For T > 0, there are already excitations present in the fluid at which
the cluster may scatter and thus lose energy, but at T = 0, this is not the
case. Let the initial momentum of the cluster be $\hbar\vec q$ and the momentum of the
excitations be $\hbar\vec k$. In a scattering event of the cluster with the fluid, energy and
momentum are conserved.
$$
\hbar\vec q = \hbar\vec q' + \hbar\vec k \ , \quad \frac{\hbar^2q^2}{2m} = \frac{\hbar^2q'^2}{2m} + E(k) \tag{19.202}
$$
where $\hbar\vec q'$ is the momentum of the cluster after the scattering. The elementary
excitations of the fluid (quasiparticles) are given by
$$
E_{\vec k} = \hbar\omega_{\vec k} = \left[\left(\frac{\hbar^2k^2}{2m}\right)^2 + \frac{n\hbar^2k^2V_{\vec k}}{m}\right]^{1/2} \tag{19.203}
$$
Consider the energy conservation equation. We have
$$
\frac{\hbar^2q^2}{2m} = \frac{\hbar^2(\vec q - \vec k)^2}{2m} + E(k) \tag{19.204}
$$
or
$$
0 = -\frac{\hbar^2}{m}\,\vec q\cdot\vec k + \frac{\hbar^2k^2}{2m} + E(k) \tag{19.205}
$$
This says that (let $\theta$ be the angle between $\vec q$ and $\vec k$)
$$
\cos\theta = \frac{1}{v}\left[\frac{E(k)}{\hbar k} + \frac{\hbar k}{2m}\right] \tag{19.206}
$$
where
$$
v = \frac{\hbar q}{m} = \text{initial velocity of the cluster} \tag{19.207}
$$
Now for the quasiparticle excitations
$$
\frac{E(k)}{\hbar k} > c_s \tag{19.208}
$$
therefore, for the excitation (emission) of a quasiparticle the cluster velocity must
be larger than $c_s$, i.e., $v > c_s$. This follows from the above relation as $k \to 0$;
otherwise $\cos\theta > 1$ and the angle $\theta$ becomes imaginary!
A cluster moving with $v < v_{critical} = c_s$ (in this model) cannot lose energy to
the fluid. Thus, there is no friction and one has superfluidity. For liquid helium
$v_{critical} \ll c_s$ and the physics is even more dramatic.
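The critical-velocity argument can be illustrated with a short numerical sketch. It assumes a contact interaction, so that $V_{\vec k} = V_0$ is independent of $k$, units with $\hbar = m = 1$ by default, and the sound speed $c_s = \sqrt{nV_0/m}$; these choices and all names below are assumptions made for the example.

```python
import math


def bogoliubov_energy(k, n_v, m=1.0, hbar=1.0):
    """E(k) = sqrt((hbar^2 k^2 / 2m)^2 + hbar^2 k^2 n V / m) for constant n V."""
    eps = (hbar * k) ** 2 / (2.0 * m)
    return math.sqrt(eps ** 2 + 2.0 * eps * n_v)


def sound_speed(n_v, m=1.0):
    """Assumed sound speed c_s = sqrt(n V / m) of the model."""
    return math.sqrt(n_v / m)


def critical_velocity(n_v, m=1.0, hbar=1.0, k_min=1e-4, k_max=10.0, steps=2000):
    """Landau criterion: minimum of E(k) / (hbar k) on a grid of k > 0."""
    best = float("inf")
    for i in range(steps + 1):
        k = k_min + (k_max - k_min) * i / steps
        best = min(best, bogoliubov_energy(k, n_v, m, hbar) / (hbar * k))
    return best


if __name__ == "__main__":
    n_v = 2.0  # illustrative value of n * V_0
    print(f"c_s        = {sound_speed(n_v):.6f}")
    print(f"v_critical = {critical_velocity(n_v):.6f}")
```

The minimum of $E(k)/\hbar k$ is reached as $k \to 0$ and equals $c_s$, so in this model no excitation can be emitted by a cluster moving more slowly than sound.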
The general idea behind spontaneous symmetry breaking is easily formulated:
as a collection of quantum particles becomes larger, the symmetry of the system
as a whole becomes more unstable against small perturbations. In the limit of
an infinite system an infinitesimal perturbation is enough to cause the system
to break the underlying symmetry of the Hamiltonian. The fact that the sym-
metry breaking can happen spontaneously is signaled by a set of noncommuting
limits: In the complete absence of perturbations even a macroscopic system
should conform to the symmetry of the Hamiltonian. However, in the presence
of an infinitesimal perturbation a macroscopic system will be able to break the
symmetry and end up in a classical state. This intuitive picture of spontaneous
symmetry breaking is not always easy to demonstrate in an equally clear math-
ematical description of the process.
where $j$ labels the $N$ atoms in the lattice, which have mass $m$, momentum $p_j$,
and position $x_j$. We consider here only a one-dimensional chain of atoms, but
all of the following can be straightforwardly generalized to higher dimensions.
The parameter $\kappa$ gives the strength of the harmonic potential between
neighboring atoms. The results on spontaneous symmetry breaking that follow are
equally valid for anharmonic potentials.
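whole. The momentum and position operators can be expressed in terms of
boson operators as
$$
p_j = iC\sqrt{\frac{\hbar}{2}}\left(b_j^{+} - b_j\right) \ , \quad
x_j = \frac{1}{C}\sqrt{\frac{\hbar}{2}}\left(b_j^{+} + b_j\right) \tag{19.210}
$$
so that the commutation relation $[x_j, p_k] = i\hbar\delta_{jk}$ is satisfied. We choose
$C^2 = \sqrt{2m\kappa}$ so that the Hamiltonian reduces to
$$
H = \frac{\hbar}{4}\sqrt{\frac{2\kappa}{m}}\sum_j\left[2\left(b_j^{+}b_j + b_jb_j^{+}\right)
- \left(b_j^{+} + b_j\right)\left(b_{j+1}^{+} + b_{j+1}\right)\right] \tag{19.211}
$$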
where $A_k = 2 - \cos(ka)$, $B_k = -\cos(ka)$, and $a$ is the lattice constant. This
Hamiltonian is still not diagonal, because the terms $b_k^{+}b_{-k}^{+}$ and $b_kb_{-k}$ create
and annihilate two bosons at the same time. We get rid of these terms by
introducing Bogoliubov transformed operators $\beta_k = \cosh(u_k)\,b_{-k} + \sinh(u_k)\,b_k^{+}$
and choosing $u_k$ such that the resulting Hamiltonian is diagonal. After this
Bogoliubov transformation, the Hamiltonian in terms of transformed bosons is
given by
$$
\begin{aligned}
H &= \hbar\sqrt{\frac{\kappa}{m}}\sum_k\left[2\sin\left|\frac{ka}{2}\right|\left(\beta_k^{+}\beta_k + \frac{1}{2}\right) + \frac{\sqrt{2}}{4}\cos(ka)\right] \\
&= 2\hbar\sqrt{\frac{\kappa}{m}}\sum_k\sin\left|\frac{ka}{2}\right|\left[n_k + \frac{1}{2}\right]
\end{aligned} \tag{19.213}
$$
because $\sum_k\cos k = (N/2\pi)\int dk\,\cos k = 0$.
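The phonon dispersion in Eq. (19.213) can be cross-checked against the standard Bogoliubov result for a quadratic boson Hamiltonian. The sketch below assumes that the momentum-space Hamiltonian (not shown in the text above) has the form $\hbar\sqrt{\kappa/2m}\sum_k\left[A_k b_k^{+}b_k + \tfrac{1}{2}B_k(b_k^{+}b_{-k}^{+} + b_kb_{-k}) + 1\right]$, for which the mode energy is $\hbar\sqrt{\kappa/2m}\sqrt{A_k^2 - B_k^2}$. The function names are illustrative.

```python
import math


def bogoliubov_frequency(k, kappa=1.0, m=1.0, a=1.0):
    """sqrt(kappa / 2m) * sqrt(A_k^2 - B_k^2) with A_k = 2 - cos(ka), B_k = -cos(ka)."""
    a_k = 2.0 - math.cos(k * a)
    b_k = -math.cos(k * a)
    return math.sqrt(kappa / (2.0 * m)) * math.sqrt(max(a_k ** 2 - b_k ** 2, 0.0))


def phonon_frequency(k, kappa=1.0, m=1.0, a=1.0):
    """2 sqrt(kappa / m) |sin(ka / 2)|, the dispersion appearing in Eq. (19.213)."""
    return 2.0 * math.sqrt(kappa / m) * abs(math.sin(k * a / 2.0))


def cosine_sum(n_sites):
    """Sum of cos(k) over the allowed k = 2 pi j / N of a periodic chain."""
    return sum(math.cos(2.0 * math.pi * j / n_sites) for j in range(n_sites))


if __name__ == "__main__":
    for k in (0.0, 0.5, 1.0, 2.0, math.pi):
        print(f"k = {k:.3f}  {bogoliubov_frequency(k):.6f}  {phonon_frequency(k):.6f}")
    print(f"sum_k cos k = {cosine_sum(16):.2e}")
```

The two columns agree for every $k$, and the cosine sum vanishes, which is why the constant term drops out of the final form of the Hamiltonian.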
where $p_{tot} = \sqrt{N}\,p_{k=0}$ is the total momentum of the system. It can easily be
checked that this part of the Hamiltonian, which describes the external
dynamics of the crystal as a whole, commutes with the rest of the Hamiltonian,
which describes the internal dynamics of the phonon modes inside the crystal.
We therefore focus on the collective part of the Hamiltonian and disregard the
phonon spectrum given by Eq. (19.213).
Notice that by considering only the collective part of the Hamiltonian, we have
effectively reduced the problem to a single-particle problem. The single particle
in Eq. (19.214) with mass $Nm$ and momentum $p_{tot}$ describes the dynamics of the
crystal as a whole. Its momentum and position are the center of mass
momentum and position of the crystal. In contrast, Eq. (19.213) describes the internal
degrees of freedom of the crystal and includes all many-body effects that arise
from the coupling of the $N$ individual atoms.
The relevant eigenstates of the collective Hamiltonian Eq. (19.214) are very low in
energy: their excitation energies scale as $1/N$, where $N$ is the number of atoms
in the crystal. In the thermodynamic limit all of these states thus become
nearly degenerate. Because of this property a combination of these states that
breaks the symmetry of the Hamiltonian can be spontaneously formed in the
thermodynamic limit. At the same time, these collective eigenstates are so few
in number and of such low energy that their contribution to the free energy
completely disappears in the thermodynamic limit. This vanishing contribution
can be seen by looking at their contribution to the partition function.
There are actually several different types of partition functions, each correspond-
ing to different types of statistical ensemble (or, equivalently, different types of
free energy.) The canonical partition function applies to a canonical ensem-
ble, in which the system is allowed to exchange heat with the environment
at fixed temperature, volume, and number of particles. The grand canonical
partition function applies to a grand canonical ensemble, in which the system
can exchange both heat and particles with the environment, at fixed
temperature, volume, and chemical potential. Other types of partition functions can
be defined for different circumstances.
As a beginning assumption, assume that a thermodynamically large system is in
constant thermal contact with the environment, with a temperature T , and both
the volume of the system and the number of constituent particles fixed. This
kind of system is called a canonical ensemble. Let us label with s (s = 1, 2, 3, ...)
the exact states (microstates) that the system can occupy, and denote the total
energy of the system when it is in microstate $s$ as $E_s$. Generally, these
microstates can be regarded as discrete quantum states of the system.
$$
\beta = \frac{1}{k_BT}
$$
$$
Z = \mathrm{Tr}\; e^{-\beta H}
$$
It may not be obvious why the partition function, as we have defined it above, is
an important quantity. Firstly, let us consider what goes into it. The partition
function is a function of the temperature T and the microstate energies E1 , E2 ,
E3 , etc. The microstate energies are determined by other thermodynamic vari-
ables, such as the number of particles and the volume, as well as microscopic
quantities like the mass of the constituent particles. This dependence on mi-
croscopic variables is the central point of statistical mechanics. With a model
of the microscopic constituents of a system, one can calculate the microstate
energies, and thus the partition function, which will then allow us to calculate
all the other thermodynamic properties of the system.
The partition function thus plays the role of a normalizing constant (note that
it does not depend on s), ensuring that the probabilities sum up to one:
$$
\sum_s P_s = \frac{1}{Z}\sum_s e^{-\beta E_s} = \frac{1}{Z}\,Z = 1
$$
This is the reason for calling Z the partition function: it encodes how the
probabilities are partitioned among the different microstates, based on their
individual energies.
or, equivalently,
$$
\langle E\rangle = k_BT^2\,\frac{\partial\ln Z}{\partial T}
$$
Incidentally, one should note that if the microstate energies depend on a
parameter $\lambda$ in the manner
This provides us with a method for calculating the expected values of many
microscopic quantities. We add the quantity artificially to the microstate energies
(or, in the language of quantum mechanics, to the Hamiltonian), calculate the
new partition function and expected value, and then set $\lambda$ to zero in the final
expression.
We now state the relationships between the partition function and the various
thermodynamic parameters of the system. These results can be derived using
the method of the previous section and the various thermodynamic relations.
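As we have already seen, the thermodynamic energy is
$$
\langle E\rangle = -\frac{\partial\ln Z}{\partial\beta}
$$
The variance in the energy (or energy fluctuation) is
$$
\langle(\Delta E)^2\rangle \equiv \langle(E - \langle E\rangle)^2\rangle = \frac{\partial^2\ln Z}{\partial\beta^2}
$$
The heat capacity is
$$
C_v = \frac{\partial\langle E\rangle}{\partial T} = \frac{1}{k_BT^2}\,\langle(\Delta E)^2\rangle
$$
The entropy is
$$
S \equiv -k_B\sum_s P_s\ln P_s = k_B\left(\ln Z + \beta\langle E\rangle\right)
= \frac{\partial}{\partial T}\left(k_BT\ln Z\right) = -\frac{\partial F}{\partial T}
$$
where $F$ is the free energy,
$$
F = \langle E\rangle - TS = -k_BT\ln Z
$$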
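The relations above are easy to test on a small model. The sketch below uses a hypothetical four-level system with arbitrary energies (in units where $k_B = 1$) and compares the ensemble averages with numerical derivatives of $\ln Z$; the level energies and helper names are assumptions made only for this illustration.

```python
import math


def log_partition(beta, energies):
    """ln Z = ln sum_s exp(-beta E_s), shifted by the lowest energy for stability."""
    e0 = min(energies)
    return -beta * e0 + math.log(sum(math.exp(-beta * (e - e0)) for e in energies))


def probabilities(beta, energies):
    """Boltzmann probabilities P_s = exp(-beta E_s) / Z."""
    ln_z = log_partition(beta, energies)
    return [math.exp(-beta * e - ln_z) for e in energies]


def mean_energy(beta, energies):
    """Ensemble average <E> = sum_s P_s E_s."""
    return sum(p * e for p, e in zip(probabilities(beta, energies), energies))


def energy_variance(beta, energies):
    """Energy fluctuation <(E - <E>)^2>."""
    probs = probabilities(beta, energies)
    mean = sum(p * e for p, e in zip(probs, energies))
    return sum(p * (e - mean) ** 2 for p, e in zip(probs, energies))


def entropy(beta, energies):
    """S / k_B = -sum_s P_s ln P_s."""
    return -sum(p * math.log(p) for p in probabilities(beta, energies))


def derivative(func, x, h=1e-5):
    """Central finite-difference derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


if __name__ == "__main__":
    levels = [0.0, 1.0, 1.0, 2.5]  # illustrative microstate energies
    beta = 0.7
    print(mean_energy(beta, levels), -derivative(lambda b: log_partition(b, levels), beta))
    print(energy_variance(beta, levels), -derivative(lambda b: mean_energy(b, levels), beta))
```

Each printed pair agrees to the accuracy of the finite difference, and the free energy $-\ln Z/\beta$ equals $\langle E\rangle - TS$ for the same data.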
The free energy of the total system is an extensive quantity, so that $F_{thin}/F_{tot} \propto
\ln(N)/N$ disappears in the limit $N \to \infty$, which is the so-called thermodynamic
limit. The states of this part of the spectrum are thus invisible in
thermodynamically measurable quantities such as for instance the specific heat of macroscopic
crystals, and it is consequently called the thin spectrum of the quantum crystal.
To see how the states in the thin spectrum can conspire to break the translational
symmetry, we need to add a small symmetry-breaking field to the Hamiltonian:
$$
H^{SB}_{coll} = \frac{p_{tot}^2}{2Nm} + \frac{B}{2}\,x_{tot}^2 \tag{19.216}
$$
Here the symmetry-breaking field B is introduced as a mathematical tool and
need not actually exist. We will send the value of B to zero at the end of the
calculation. The Hamiltonian Eq. (19.216) is the standard form of the Hamiltonian
for a quantum harmonic oscillator, and its eigenstates are well known. The
ground state wavefunction can be written as
$$
\psi_0(x_{tot}) = \left(\frac{m\omega N}{\pi\hbar}\right)^{1/4} e^{-(m\omega N/2\hbar)\,x_{tot}^2} \tag{19.217}
$$
where $\omega = \sqrt{B/mN}$. This ground state is a wavepacket of the total momentum
states that make up the thin spectrum. Apart from the ground state
configuration there are also collective eigenstates that are described by the excitations of
the harmonic oscillator Eq. (19.216). These excitations describe the collective motion
of the crystal as a whole. As $N$ becomes larger, the ground state wavepacket
becomes more and more localized at the position $x_{tot} = 0$, until it is completely
localized as $N \to \infty$. That this localization can occur spontaneously without
the existence of a physical symmetry-breaking field $B$ can be seen by considering
the noncommuting limits
If we do not include any symmetry-breaking field, then the crystal is always com-
pletely delocalized and respects the symmetry of the Hamiltonian. If we do allow
for a symmetry-breaking field, then it turns out that in the limit of having in-
finitely many constituent particles, an infinitesimally small symmetry-breaking
field is enough to completely localize the crystal at a single position. This math-
ematical instability implies that the symmetry breaking happens spontaneously
in the thermodynamic limit (spontaneous symmetry breaking).
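The role of the order of limits can be made concrete with the ground state of Eq. (19.217). For a Gaussian of this form the position variance is $\langle x_{tot}^2\rangle = \hbar/(2m\omega N) = \hbar/(2\sqrt{BmN})$. The sketch below, in units with $\hbar = m = 1$ and with arbitrary illustrative values of $N$ and $B$, shows that the variance diverges when $B \to 0$ at fixed $N$ but vanishes when $N \to \infty$ at any fixed $B > 0$. This is a reading of the noncommuting limits under these assumptions, not a reproduction of the omitted equations.

```python
import math


def oscillator_frequency(n_atoms, field_b, m=1.0):
    """omega = sqrt(B / (m N)) for the collective oscillator of Eq. (19.216)."""
    return math.sqrt(field_b / (m * n_atoms))


def position_variance(n_atoms, field_b, m=1.0, hbar=1.0):
    """<x_tot^2> = hbar / (2 m omega N) for the Gaussian ground state."""
    return hbar / (2.0 * m * oscillator_frequency(n_atoms, field_b, m) * n_atoms)


if __name__ == "__main__":
    # Field removed first at fixed N: the crystal delocalizes.
    for b in (1e-2, 1e-10, 1e-30):
        print(f"N = 10,    B = {b:.0e}:  <x^2> = {position_variance(10, b):.3e}")
    # Size sent to infinity first at fixed small B: the crystal localizes.
    for n in (1e10, 1e30, 1e50):
        print(f"N = {n:.0e}, B = 1e-20:  <x^2> = {position_variance(n, 1e-20):.3e}")
```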
Notice that once the crystal has been localized at a specific position and the
unphysical symmetry-breaking field has been sent to zero, the delocalization of
the crystal due to the spreading of its wavefunction will take a time proportional
to N and can thus never be observed.
19.8.3 Subtleties
In the derivation of the spontaneous symmetry breaking of a harmonic crystal
we have been somewhat sloppy in the definition of the symmetry-breaking field.
After all, the collective model of Eq. (19.214) was only the k = 0 part of the full-blown
Hamiltonian in Eq. (19.209), but we did not consider the symmetry-breaking field
to be only the k = 0 part of some other field acting on all atoms individually.
It would therefore be better to start with a microscopic model, which already
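includes a symmetry-breaking field such as
$$
H^{SB} = \sum_j\left[\frac{p_j^2}{2m} + \frac{\kappa}{2}\left(x_j - x_{j+1}\right)^2 + B\left(1 - \cos(x_j)\right)\right] \tag{19.222}
$$
$$
H^{SB}_{coll} \approx \frac{p_{tot}^2}{2Nm} + \frac{B}{2N}\,x_{tot}^2 \tag{19.223}
$$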
In Eq. (19.223) we again consider only the k = 0 part of the Hamiltonian and
have expanded the cosine to quadratic order. The fact that the symmetry-breaking
field now scales as 1/N is a direct consequence of our definition of
the microscopic symmetry-breaking field. The factor 1/N cannot be avoided
if we insist that the microscopic Hamiltonian be extensive. This factor might
seem to imply an end to the localization of the total wavefunction $\psi_0(x_{tot})$,
but spontaneous symmetry breaking is still possible as long as we consider the
correct order parameter. Even though the wavefunction itself does not reduce to
a Dirac delta function anymore, the spatial fluctuations of the crystal compared
to its size still become negligible in the thermodynamic limit if an infinitesimal
symmetry-breaking field is included:
This digression into extensivity and the correct choice for the symmetry-breaking
field seems unnecessary for understanding the essential ingredients of sponta-
neous symmetry breaking, and therefore we have ignored these subtleties in our
main treatment of quantum spontaneous symmetry breaking. In the application
of this procedure to other systems, such as antiferromagnets and superconductors,
these issues do not arise because we are forced to consider extensive models
from the outset. In these cases, however, the mathematics of diagonalizing the
collective Hamiltonian is a bit more involved.
states that make up the symmetry broken wavefunction. As a mathematical
tool necessary to be able to see the symmetry breaking explicitly, we introduced
the symmetry-breaking field B. If we look at the new ground state wavefunction
or at a suitably defined order parameter for the system, we see that in the ther-
modynamic limit an infinitesimally small field B is enough to completely break
the symmetry of the underlying Hamiltonian. It is thus argued that symmetry
breaking can happen spontaneously in the limit $N \to \infty$.
19.9 Problems
19.9.1 Bogoliubov Transformations
Consider a Hamiltonian for Bosonic operators $b_k^{+}$, $b_k$ of the form
$$
H = E(k)\,b_k^{+}b_k + A(k)\left[b_k^{+}b_{-k}^{+} + b_kb_{-k}\right]
$$
(a) Assume that E(k), A(k), K are all even functions of k and find the form
of sinh 2k as a function of A(k), E(k) so that
H = (k)k+ k + F (k)
and find (k) and F (k).
(a) Separate the terms of order $N_0^2$, $N_0\sqrt{N_0}$, $N_0$ in the interaction term, show
that these are quadratic in $b, b^{+}$, and show that terms of order $\sqrt{N_0}$ and
1 are cubic and quartic in $b, b^{+}$. Neglect the terms of $O(\sqrt{N_0})$ and $O(1)$,
which is the Bogoliubov approximation, and write down $K$ only keeping
terms up to quadratic order in $b, b^{+}$.
where bk = bk ei . Use
$$
e^{A}Be^{-A} = B + [A, B] + \frac{1}{2!}\,[A, [A, B]] + \ldots
$$
to show that
ck = U ()bk U 1 ()
where U () is the unitary operator
k (b+ +
k bk bk bk )
P
U () = e k>0
(b) Show that the ground state
of K in the Bogoliubov approximation is
|GSi = U () 0 where 0 is the vacuum of the operators bk : bk 0 = 0
for all k. Argue that |GSi is a linear superposition of states with a pair
of momenta ~k, ~k respectively. This is a squeezed quantum state (see
Chapter 14 example). These states are ubiquitous in quantum optics and
quantum controlled nanoscale systems.
E
=0
(~x)
(b) Define new operators (~x) (~x) + (~x) where (~x) is the solution
to the Gross-Pitaevskii equation and write K up to quadratic order in
(~x), + (~x). Show that terms linear in (~x), + (~x) are cancelled by (~x)
being a solution to the Gross-Pitaevskii equation.
19.9.5 Weakly Interacting Bose Gas
Consider a homogeneous, weakly interacting Bose gas with Hamiltonian
$$
H = \int d^3x\;\psi^{+}(\vec x)\left(-\frac{\hbar^2\nabla^2}{2m}\right)\psi(\vec x)
+ \int d^3x\int d^3y\;\psi^{+}(\vec x)\psi^{+}(\vec y)\,V(\vec x - \vec y)\,\psi(\vec y)\psi(\vec x)
$$
(a) Consider $V(\vec x - \vec y) = -|V_0|\,\delta^3(\vec x - \vec y)$ and assume a condensate with $N_0$
particles. Obtain the operator for $K = H - \mu N$ in the Bogoliubov
approximation. By a canonical Bogoliubov transformation bring it to the
form
$$
K = \sum_k\hbar\omega(k)\,c_k^{+}c_k + K_0
$$
Show that $\omega(k)$ becomes imaginary for some values of $0 < k < k_{max}$.
(b) What is $k_{max}$? What is the physical reason for this imaginary value and
what does it mean?
where a+
k, are creation operators of an electron of spin up or down and mo-
mentum ~k and
g X0
= hGS| ak ak |GSi
V 0
k
P 0
|GSi is the ground state of K and k0 is a sum over states with
~2 k 2
0 (k 0 ) ~m and (k 0 ) = (k) =
2m
(a) Diagonalize K by a Bogoliubov transformation, i.e., introduce new oper-
ators
Ak = uk ak vk a+ +
k , Bk = vk ak + uk ak
and their respective Hermitian conjugates with uk , vk and even in k. Show
that the transformation is canonical, namely, that A, B obey the usual
commutation relations if u2k + vk2 = 1. It is convenient to write
1/2 1/2
1 1
uk = (1 + k ) , vk = (1 k )
2 2
(d) Show that E(0) = = gap and evaluate the resulting integral in part
(b) to give the gap as a function of gN (0) for gN (0) 1.
where un are the displacements of atoms from their equilibrium positions,
and pn are the corresponding conjugate momenta.
Suppose electrons are also present on the same chain of atoms. Suppose
that the electrons can make transitions between neighboring lattice sites
with the probability amplitude tn so that
X
+
Hel = tn n+1 n + h.c.
n=
$t_n$ as a function of $(u_n - u_{n+1})$ to the first order: $t_n = t + (u_n - u_{n+1})\,t'$.
When substituted into the electron Hamiltonian, the second term gives
the following Hamiltonian:
X
Helph = t0 +
(un un+1 )n+1 n + h.c.
n=
(c) Evaluate the limits of Eq. (19.221) by expressing $x_{tot}^2$ in terms of
boson operators and taking the expectation value with respect to the
ground state of $H^{SB}_{coll}$.
(3) Work out the Bogoliubov transformation of Eqs.(19.212) and (19.213) ex-
plicitly.
Chapter 20
Relativistic Wave Equations
which, as shown, leaves the quadratic form E = p2 /2m invariant, i.e., if E =
p~2 /2m, then E 0 = p~ 02 /2m0 and we derive the transformation rules for E and p~
from that condition.
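The non-relativistic Schrödinger equation for the free particle then follows from
the standard identifications
$$
E \to i\hbar\frac{\partial}{\partial t} \ , \quad \vec p \to \frac{\hbar}{i}\nabla
$$
$$
H\psi = E\psi \ , \quad H = \frac{p^2}{2m}
$$
which gives
$$
i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi \tag{20.5}
$$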
It is clear from the form of the Schrödinger equation for a free particle that the
equation cannot be invariant under Lorentz transformation (Lorentz covariant),
i.e., the time derivative is first order and the space derivatives are second order.
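Starting from $x^{\mu}(s) = (ct, \vec x) = (x^0, \vec x)$, the contravariant 4-vector
representation of the worldline as a function of the proper time $s$, we first obtain the
4-velocity, i.e.,
$$
\dot x^{\mu}(s) = \frac{dx^{\mu}(s)}{ds} = \frac{dx^0}{ds}\,\frac{dx^{\mu}(s)}{dx^0}
= \gamma\left(\frac{dx^0}{dx^0}, \frac{d\vec x}{dx^0}\right) = \gamma\,(1, \vec v/c) \tag{20.6}
$$
where
$$
\gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} \ , \quad \vec v = \frac{d\vec x}{dt} = c\,\frac{d\vec x}{dx^0} \tag{20.7}
$$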
The metric tensor is defined by
$$
g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{20.10}
$$
This fact, in itself, is not a valid reason for rejecting the equation.
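There are, however, strong physical reasons for rejecting this equation. The
equation says that the momentum space amplitude
$$
\psi_{\vec p}(t) = \int d^3r\; e^{-i\vec p\cdot\vec r/\hbar}\,\psi(\vec r, t) \tag{20.15}
$$
satisfies
$$
i\hbar\frac{\partial\psi_{\vec p}(t)}{\partial t} = \left(p^2c^2 + m^2c^4\right)^{1/2}\psi_{\vec p}(t) \tag{20.16}
$$
If we Fourier transform both sides back to position space we get
$$
i\hbar\frac{\partial\psi(\vec r, t)}{\partial t} = \int d^3r'\; K(\vec r - \vec r')\,\psi(\vec r', t) \tag{20.17}
$$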
where
$$
K(\vec r - \vec r') = \int\frac{d^3p}{(2\pi\hbar)^3}\; e^{i\vec p\cdot(\vec r - \vec r')/\hbar}\left(p^2c^2 + m^2c^4\right)^{1/2} \tag{20.18}
$$
This equation for $\psi(\vec r, t)$ is nonlocal, which means that the value of the integral
at $\vec r$ depends on the value of $\psi$ at the other points $\vec r'$. The function $K(\vec r - \vec r')$
is large as long as $\vec r'$ is within a distance
$$
\frac{\hbar}{mc} = \text{Compton wavelength} \tag{20.19}
$$
from $\vec r$. As a consequence of the nonlocality, the rate of change in time of $\psi$ at
the spacetime point $(\vec r, t)$ depends on the values of $\psi$ at points $(\vec r', t)$ outside
the light cone centered on $(\vec r, t)$.
The Klein-Gordon equation has several unusual features.
First, it is second-order in time (space and time derivatives are now the same
order). This means we need to specify twice as much initial information (the
function and its derivative) at one time to specify the relativistic solution as
compared to the nonrelativistic solution which only required specification of the
function at one time.
This will mean that the equation has an extra degree of freedom. We will see
shortly that this extra degree of freedom corresponds to specifying the charge
of the particle and that the Klein-Gordon equation actually describes both a
particle and its antiparticle together.
$$\psi = e^{i(\vec p\cdot\vec r - Et)/\hbar} \tag{20.25}$$
The Klein-Gordon equation has negative energy solutions for a free particle!
For these solutions when we increase the magnitude of the momentum $\vec p$, then
the energy of the particle decreases! As we will see later, these negative energy
solutions are real and will correspond to antiparticles, while the positive energy
solutions will be particles.
changes in time and thus, we cannot interpret $\psi^*(\vec r,t)\psi(\vec r,t)$ as being the probability of finding a particle at $\vec r$ at time $t$.
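and
$$\psi\left[\Box + \left(\frac{mc}{\hbar}\right)^2\right]\psi^* = 0 \tag{20.29}$$
which give (subtracting)
$$\psi^*\Box\psi - \psi\Box\psi^* = 0 \tag{20.30}$$
$$\partial_\mu\left(\psi^*\partial^\mu\psi - \psi\,\partial^\mu\psi^*\right) = 0 \tag{20.31}$$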
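Expanding these expressions, we have the continuity equation
$$\frac{\partial\rho}{\partial t} + \nabla\cdot\vec j = 0 \tag{20.32}$$
where
$$\rho(\vec r,t) = \frac{i\hbar}{2mc^2}\left(\psi^*\frac{\partial\psi}{\partial t} - \psi\frac{\partial\psi^*}{\partial t}\right) \tag{20.33}$$
$$\vec j(\vec r,t) = \frac{\hbar}{2im}\left(\psi^*\nabla\psi - \psi\nabla\psi^*\right) \tag{20.34}$$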
We have inserted a multiplicative constant so that the current density vector $\vec j(\vec r,t)$ is identical to the nonrelativistic case. Because this density $\rho(\vec r,t)$ satisfies a continuity equation, its integral over all space does not change in time. Clearly, however, it is not necessarily positive. In particular, $\rho < 0$ for a negative energy free particle eigenstate.
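The sign of the density for free-particle eigenstates can be checked numerically. The following is a minimal sketch, assuming natural units ($\hbar = c = m = 1$) and a one-dimensional plane wave; the function names `kg_density` and `plane_wave_density` are illustrative and not part of the text. It evaluates the density (20.33) for $e^{i(px - Et)/\hbar}$ with $E = \pm\sqrt{p^2c^2 + m^2c^4}$ and shows that $\rho = E/mc^2$, so the density is negative for the negative energy solution.

```python
import numpy as np

# Natural units hbar = c = m = 1 are an illustrative assumption.
HBAR, C, M = 1.0, 1.0, 1.0

def kg_density(psi, dpsi_dt):
    """Klein-Gordon density of Eq. (20.33) from psi and its time derivative."""
    rho = 1j * HBAR / (2 * M * C**2) * (np.conj(psi) * dpsi_dt - psi * np.conj(dpsi_dt))
    return float(np.real(rho))

def plane_wave_density(p, sign, x=0.3, t=0.7):
    """Density for exp(i(p x - E t)/hbar) with E = sign * sqrt(p^2 c^2 + m^2 c^4)."""
    energy = sign * np.sqrt(p**2 * C**2 + M**2 * C**4)
    psi = np.exp(1j * (p * x - energy * t) / HBAR)
    dpsi_dt = (-1j * energy / HBAR) * psi  # analytic time derivative of the plane wave
    return kg_density(psi, dpsi_dt), energy

rho_pos, e_pos = plane_wave_density(p=0.5, sign=+1)
rho_neg, e_neg = plane_wave_density(p=0.5, sign=-1)
print("positive energy: rho =", rho_pos, " E/mc^2 =", e_pos / (M * C**2))
print("negative energy: rho =", rho_neg, " E/mc^2 =", e_neg / (M * C**2))
```

The density is independent of the point $(x, t)$ chosen, as expected for a plane wave.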
This means that we cannot interpret this new $\rho(\vec r)$ as being the particle (probability) density at $\vec r$ and we cannot interpret $\vec j(\vec r)$ as a particle current.
The interpretation that will eventually emerge is that for charged particles $e\rho(\vec r)$ represents the charge density at $\vec r$, which can have either sign, and $e\vec j(\vec r)$ represents the electric current at $\vec r$.
Consider a free particle at rest, i.e., $\vec p = 0$. The wave function for the positive energy solution is
$$\psi(\vec r,t) = e^{-imc^2t/\hbar} \tag{20.35}$$
where the energy of a particle at rest is $E = mc^2$. The density for this state is $\rho(\vec r,t) = +1$.
This result follows because
p x 0 = Lorentz scalar
= mc2 t in the rest frame
= p~ ~r 0 Et 0 in the moving frame
where
$$\vec v = \frac{\vec p c^2}{E_{\vec p}} \tag{20.41}$$
We see that $\rho(\vec r,t)$ transforms like $E_{\vec p}$ or as the time component of a 4-vector, which makes physical sense. Since a unit volume in the rest frame appears smaller by a factor
$$1/\gamma = \sqrt{1 - v^2/c^2} \tag{20.42}$$
when observed from the moving frame, a unit density in the rest frame will appear as a density
$$\frac{1}{\sqrt{1 - v^2/c^2}} = \frac{E_{\vec p}}{mc^2} \tag{20.43}$$
in a frame in which the particle is moving.
What about the negative energy solutions? For a particle at rest we have, in this case,
$$\psi(\vec r,t) = e^{+imc^2t/\hbar} \tag{20.44}$$
where the energy of this particle at rest is $E = -mc^2$.
It turns out that one way to interpret a state with a negative particle density is to say that it is a state with a positive density of antiparticles.
We will make the interpretation that a particle at rest with energy $E = -mc^2$ is actually an antiparticle with positive energy $E = mc^2$. As we shall see, this interpretation of negative energy states will lead to a consistent theoretical picture that is confirmed experimentally.
where
$$\text{momentum} = \vec p = m\gamma\vec v \quad\text{and}\quad \text{energy} = E = m\gamma c^2 \tag{20.46}$$
In this new frame the particle has velocity $\vec v$, momentum $\vec p$ and energy $E_{\vec p}$. The wave function, however, describes a particle of energy $-E_{\vec p}$ and momentum $-\vec p$.
Ep~
(~r 0 , t0 ) = (20.47)
mc2
and the current is
2
~j(~r 0 , t0 ) = p~ (~r 0 , t0 ) = p~c (~r 0 , t0 ) (20.48)
m Ep~
For a charged particle $e\rho(\vec r,t)$ is the charge density. It is positive for a free particle with $e > 0$ and negative for a free antiparticle, which has opposite charge to the particle.
The quantity $e\vec j(\vec r,t)$ is the electric current of the state $\psi$. For a particle the electric current is in the direction of the particle velocity. For the antiparticle with $e < 0$, the electric current is opposite to the velocity.
This says that the interpretation of the negative energy solutions as antiparticles is consistent with the interpretation of the density $\rho$ as a charge density and $\vec j$ as an electric current.
Is this interpretation consistent with the way charged particles interact with the
electromagnetic field?
Taking the complex conjugate we have
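$$\frac{1}{c^2}\left(i\hbar\frac{\partial}{\partial t} + e\Phi(\vec r,t)\right)^2\psi^*(\vec r,t) = \left[\left(\frac{\hbar}{i}\nabla + \frac{e}{c}\vec A(\vec r,t)\right)^2 + m^2c^2\right]\psi^*(\vec r,t) \tag{20.50}$$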
These equations say that if $\psi(\vec r,t)$ is a solution to the Klein-Gordon equation with a certain sign of the charge, then $\psi^*(\vec r,t)$ is a solution of the Klein-Gordon equation with the opposite sign of the charge and the same mass.
Thus, the relativistic theory of a spin zero particle predicts the existence of its antiparticle with the opposite charge and same mass, i.e., the theory contains solutions for both particles and antiparticles.
The solutions are normalized by the requirement that the total associated charge
equals 1 unit, i.e.,
Z Z
d3 r(~r, t) = +1 = d3 rc (~r, t) (20.52)
We define
ie
0 (~r, t) = + (~r, t) (~r, t) (1st - order equation #1) (20.53)
t ~
We then have
2
ie ie
+ (~r, t) 0 (~r, t) = + (~r, t) (~r, t) (20.54)
t ~ t ~
Now using the Klein-Gordon equation we have
2 2 !
m2 c4
ie ie ~
+ (~r, t) (~r, t) = c2 + A(~r, t) 2 (~r, t) (20.55)
t ~ ~c ~
so we get
ie
+ (~r, t) 0 (~r, t)
t ~
2 !
2 4
ie ~ r, t) m c
= c2 + A(~ (~r, t) (1st - order equation #2)
~c ~2
(20.56)
These two new first-order equations involve the two functions 0 (~r, t) and
(~r, t).
The two symmetric equations can then be combined into the single equation
" 2 #
~ e~ 2
i~ = A (3 + i2 ) + mc 3 + e (20.62)
t i c
The internal degree of freedom represented by these two components is the charge of the particle (one component represents the particle and the other the antiparticle).
~j(~r, t) = ~ + 3 (3 + i2 ) (+ )3 (3 + i2 )
2im
eA~
+ 3 (3 + i2 ) (20.65)
mc
The normalization condition becomes
Z
d3 r + 3 = 1 (20.66)
The scalar product between two such wave functions $\Psi$ and $\Psi'$ is defined by
$$\langle\Psi\,|\,\Psi'\rangle = \int d^3r\,\Psi^+(\vec r,t)\,\tau_3\,\Psi'(\vec r,t) \tag{20.67}$$
which is the form of the charge conjugation operation in two-component lan-
guage.
What can we say about the two-component solutions for free particles and
antiparticles?
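where
$$E_{\vec p} = \sqrt{p^2c^2 + m^2c^4} \tag{20.80}$$
Using
$$\psi^0(\vec r,t) = \left(\frac{\partial}{\partial t} + \frac{ie}{\hbar}\Phi(\vec r,t)\right)\psi(\vec r,t) \tag{20.81}$$
and
$$\varphi = \frac{1}{2}\left(\psi + \frac{i\hbar}{mc^2}\psi^0\right) , \quad \chi = \frac{1}{2}\left(\psi - \frac{i\hbar}{mc^2}\psi^0\right) \tag{20.82}$$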
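we find (in two-component language)
$$\Psi^{(+)}_{\vec p}(\vec r,t) = \chi^{(+)}_{\vec p}\,e^{i(\vec p\cdot\vec r - E_{\vec p}t)/\hbar} \tag{20.83}$$
where the two-component vector $\chi^{(+)}_{\vec p}$ is given by
$$\chi^{(+)}_{\vec p} = \frac{1}{2\sqrt{E_{\vec p}\,mc^2}}\begin{pmatrix} mc^2 + E_{\vec p} \\ mc^2 - E_{\vec p} \end{pmatrix} \tag{20.84}$$
In a similar manner, we can write for the negative energy solutions (free antiparticles)
$$\psi^{(-)}_{\vec p}(\vec r,t) = \sqrt{\frac{mc^2}{E_{\vec p}}}\,e^{-i(\vec p\cdot\vec r - E_{\vec p}t)/\hbar} \tag{20.85}$$
$$\Psi^{(-)}_{\vec p}(\vec r,t) = \chi^{(-)}_{\vec p}\,e^{-i(\vec p\cdot\vec r - E_{\vec p}t)/\hbar} \tag{20.86}$$
$$\chi^{(-)}_{\vec p} = \frac{1}{2\sqrt{E_{\vec p}\,mc^2}}\begin{pmatrix} mc^2 - E_{\vec p} \\ mc^2 + E_{\vec p} \end{pmatrix} = \tau_1\chi^{(+)}_{\vec p} \tag{20.87}$$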
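We note that in the nonrelativistic limit
$$E_{\vec p} = \sqrt{p^2c^2 + m^2c^4} = mc^2\left(1 + \frac{p^2}{m^2c^2}\right)^{1/2} \approx mc^2\left(1 + \frac{p^2}{2m^2c^2}\right) \tag{20.88}$$
$$mc^2 \pm E_{\vec p} \approx \begin{cases} 2mc^2 \\ -p^2/2m \end{cases} \tag{20.89}$$
so that
$$\chi^{(+)}_{\vec p} \approx \begin{pmatrix} 1 \\ -v^2/4c^2 \end{pmatrix} , \quad \chi^{(-)}_{\vec p} \approx \begin{pmatrix} -v^2/4c^2 \\ 1 \end{pmatrix} \tag{20.90}$$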
This shows that in the nonrelativistic limit
The particle and antiparticle solutions are orthogonal in the sense that
$$\chi^{(+)+}_{\vec p}\,\tau_3\,\chi^{(-)}_{\vec p} = 0 = \chi^{(-)+}_{\vec p}\,\tau_3\,\chi^{(+)}_{\vec p} \tag{20.92}$$
which should be the case since they represent different energy eigenstates of the same Hamiltonian.
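The orthogonality and normalization of the two-component vectors with respect to $\tau_3$ can be verified directly. The sketch below assumes units with $m = c = 1$ and uses the component forms $(mc^2 \pm E_{\vec p},\, mc^2 \mp E_{\vec p})/2\sqrt{E_{\vec p}mc^2}$ of (20.84) and (20.87); the helper names are illustrative only.

```python
import numpy as np

TAU3 = np.diag([1.0, -1.0])

def energy(p, m=1.0, c=1.0):
    """E_p = sqrt(p^2 c^2 + m^2 c^4)."""
    return np.sqrt(p**2 * c**2 + m**2 * c**4)

def chi_plus(p, m=1.0, c=1.0):
    """Positive energy two-component vector, Eq. (20.84)."""
    e, mc2 = energy(p, m, c), m * c**2
    return np.array([mc2 + e, mc2 - e]) / (2 * np.sqrt(e * mc2))

def chi_minus(p, m=1.0, c=1.0):
    """Negative energy two-component vector, Eq. (20.87)."""
    e, mc2 = energy(p, m, c), m * c**2
    return np.array([mc2 - e, mc2 + e]) / (2 * np.sqrt(e * mc2))

def tau3_product(a, b):
    """Indefinite scalar product a^+ tau_3 b used for the two-component vectors."""
    return float(np.conj(a) @ TAU3 @ b)

p = 1.7
plus_plus = tau3_product(chi_plus(p), chi_plus(p))
minus_minus = tau3_product(chi_minus(p), chi_minus(p))
plus_minus = tau3_product(chi_plus(p), chi_minus(p))
print(plus_plus, minus_minus, plus_minus)
```

The particle vector has norm $+1$, the antiparticle vector has norm $-1$, and the mixed product vanishes, which is the content of (20.92).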
The free particle solutions form a complete set since any wave function can be
expanded as a linear combination of the free particle and antiparticle solutions.
d3 p i~p~r/~ p~
Z
(~r, t) = e (20.93)
(2~)3 p~
(+) ()
Since the two vectors p~ and p~ are linearly independent, we can write
p~ (+) ()
p~ (t) = = up~ (t)p~ + v~p (t)~
p (20.94)
p~
Substituting we get
d3 p i~p~r/~ h
Z i
(+) ()
(~r, t) = e u p
~ (t) p
~ + v ~p (t) ~p
(2~)3
Z 3
d p h
(+) i~p~
r /~ () i~ p~
r /~
i
= u p
~ (t) p
~ e + v p
~ (t) p
~ e (20.95)
(2~)3
where a change of variables was made in the second term. From the form of this result, $u_{\vec p}(t)$ is the amplitude for a particle in the state $\Psi$ to have momentum $\vec p$ and positive charge and $v_{\vec p}(t)$ is the amplitude for a particle in the state $\Psi$ to have momentum $\vec p$ and negative charge.
()
Using the orthonormality of p~ we get
Z
(+)+
up~ (t) = d3 rp~ ei~p~r/~ 3 (~r, t) (20.96)
Z
()+
vp~ (t) = d3 rp~ ei~p~r/~ 3 (~r, t) (20.97)
$$\int\frac{d^3p}{(2\pi\hbar)^3}\left(|u_{\vec p}|^2 - |v_{\vec p}|^2\right) = 1 \tag{20.98}$$
This says that there is no restriction on the magnitude of either $u_{\vec p}$ or $v_{\vec p}$. Only the integral of the difference (above) is fixed.
Physically, we can then say that one can have a state with an arbitrarily large
amplitude for finding a particle with a certain momentum, which is the first
indication that we are dealing with bosons or that spin zero particles must be
bosons.
We can write some expectation values in this formalism, i.e.,
p2
H0 = (3 + i2 )2 + mc2 3 = Kineticenergy (20.99)
2m
d3 p
Z Z
+ 3 2 2
(~r)3 H0 (~r)d r = E p
~ |u p
~ | + |v p
~ | (20.100)
(2~)3
and
~
p~ = = momentum (20.101)
i
d3 p
Z Z
+ ~ 3 2 2
(~r)3 (~r)d r = p
~ |u p
~ | + |v p
~ | (20.102)
i (2~)3
d3 p
Z
(+)
(+) (~r, t) = up~ ei(~p~rEp~ t)/~ p~ (20.103)
(2~)3
Let us assume that $u_{\vec p}$ is peaked about $\vec p = \vec p_0$. Then, using arguments similar to our earlier discussions on stationary phase, the center of the wave packet moves with a group velocity
$$\vec v_g = \left(\nabla_{\vec p}E_{\vec p}\right)_{\vec p = \vec p_0} = \frac{\vec p_0c^2}{E_{\vec p_0}} \tag{20.104}$$
and similarly for a free wave packet made of the negative energy solutions for antiparticles.
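As a quick numerical check of (20.104), the gradient of $E_{\vec p}$ can be compared with $\vec p_0c^2/E_{\vec p_0}$. This is only a sketch, assuming units with $m = c = 1$ and a central finite difference with an arbitrarily chosen step; the function names are illustrative.

```python
import numpy as np

def energy(p, m=1.0, c=1.0):
    """Relativistic energy E_p = sqrt(p.p c^2 + m^2 c^4) for a 3-vector p."""
    p = np.asarray(p, dtype=float)
    return np.sqrt(np.dot(p, p) * c**2 + m**2 * c**4)

def group_velocity_numeric(p0, m=1.0, c=1.0, h=1e-6):
    """Central-difference estimate of grad_p E_p at p = p0."""
    p0 = np.asarray(p0, dtype=float)
    grad = np.zeros(3)
    for i in range(3):
        step = np.zeros(3)
        step[i] = h
        grad[i] = (energy(p0 + step, m, c) - energy(p0 - step, m, c)) / (2 * h)
    return grad

def group_velocity_exact(p0, m=1.0, c=1.0):
    """Closed form p0 c^2 / E_p0 from Eq. (20.104)."""
    p0 = np.asarray(p0, dtype=float)
    return p0 * c**2 / energy(p0, m, c)

p0 = [0.3, -1.2, 2.5]
v_num = group_velocity_numeric(p0)
v_exact = group_velocity_exact(p0)
print(v_num, v_exact, np.linalg.norm(v_exact))
```

The magnitude of the group velocity stays below $c$ for any finite momentum, as it must for a massive particle.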
Can we construct a free particle wave packet perfectly localized at the origin? It would have the form
$$\Psi(\vec r) = \begin{pmatrix} a \\ b \end{pmatrix}\delta(\vec r) \tag{20.105}$$
We then have
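$$\begin{aligned}
u_{\vec p} &= \int d^3r\,\chi^{(+)+}_{\vec p}\,e^{-i\vec p\cdot\vec r/\hbar}\,\tau_3\Psi(\vec r) = \int d^3r\,\chi^{(+)+}_{\vec p}\,e^{-i\vec p\cdot\vec r/\hbar}\,\tau_3\begin{pmatrix} a \\ b \end{pmatrix}\delta(\vec r) = \chi^{(+)+}_{\vec p}\,\tau_3\begin{pmatrix} a \\ b \end{pmatrix} \\
&= \frac{1}{2\sqrt{E_{\vec p}\,mc^2}}\begin{pmatrix} mc^2 + E_{\vec p} & mc^2 - E_{\vec p} \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} \\
&= \frac{E_{\vec p}(a + b) + mc^2(a - b)}{2\sqrt{E_{\vec p}\,mc^2}}
\end{aligned} \tag{20.106}$$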
and similarly
$$v_{\vec p} = -\chi^{(-)+}_{\vec p}\,\tau_3\begin{pmatrix} a \\ b \end{pmatrix} = \frac{E_{\vec p}(a + b) - mc^2(a - b)}{2\sqrt{E_{\vec p}\,mc^2}} \tag{20.107}$$
Looking at these results we can see that, independent of the choice of $a$ and $b$, the wave packet will always have both particle and antiparticle components. This means that it is impossible to construct a perfectly localized wave packet from positive energy solutions alone.
Suppose that we take a general wave packet made up of positive energy solutions and try to squeeze it (make it more localized) with real-world devices such as collimators. To see what might happen we multiply the wave packet by the position operator $\vec r$. We then have
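$$\begin{aligned}
\vec r\,\Psi^{(+)}(\vec r,t) &= \int\frac{d^3p}{(2\pi\hbar)^3}\,u_{\vec p}(t)\,\chi^{(+)}_{\vec p}\,\vec r\,e^{i\vec p\cdot\vec r/\hbar} \\
&= \int\frac{d^3p}{(2\pi\hbar)^3}\,u_{\vec p}(t)\,\chi^{(+)}_{\vec p}\,\frac{\hbar}{i}\nabla_{\vec p}\,e^{i\vec p\cdot\vec r/\hbar}
\end{aligned} \tag{20.108}$$
$$\begin{aligned}
\vec r\,\Psi^{(+)}(\vec r,t) &= \int\frac{d^3p}{(2\pi\hbar)^3}\left(i\hbar\nabla_{\vec p}u_{\vec p}(t)\right)\chi^{(+)}_{\vec p}\,e^{i\vec p\cdot\vec r/\hbar} \\
&\quad + \int\frac{d^3p}{(2\pi\hbar)^3}\,u_{\vec p}(t)\left(i\hbar\nabla_{\vec p}\chi^{(+)}_{\vec p}\right)e^{i\vec p\cdot\vec r/\hbar}
\end{aligned} \tag{20.109}$$
Using
$$\nabla_{\vec p}\,\chi^{(\pm)}_{\vec p} = -\frac{\vec pc^2}{2E_{\vec p}^2}\,\chi^{(\mp)}_{\vec p} \tag{20.110}$$
we get
$$\vec r\,\Psi^{(+)}(\vec r,t) = \vec r_+\Psi^{(+)}(\vec r,t) + \vec r_-\Psi^{(+)}(\vec r,t) \tag{20.111}$$
where
$$\vec r_+\Psi^{(+)}(\vec r,t) = \int\frac{d^3p}{(2\pi\hbar)^3}\left(i\hbar\nabla_{\vec p}u_{\vec p}(t)\right)\chi^{(+)}_{\vec p}\,e^{i\vec p\cdot\vec r/\hbar} \tag{20.112}$$
$$\vec r_-\Psi^{(+)}(\vec r,t) = \int\frac{d^3p}{(2\pi\hbar)^3}\,u_{\vec p}(t)\left(-\frac{i\hbar\vec pc^2}{2E_{\vec p}^2}\right)\chi^{(-)}_{\vec p}\,e^{i\vec p\cdot\vec r/\hbar} \tag{20.113}$$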
This says that multiplying a wave packet of positive energy states by the position
operator mixes in negative energy solutions, i.e.,
The same result occurs for any function of the position operator.
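Suppose that $\vec r_+\Psi^{(+)}(\vec r) = \vec r_0\Psi^{(+)}(\vec r)$, i.e., it is an eigenstate of $\vec r_+$ with eigenvalue $\vec r_0$. This says that
$$i\hbar\nabla_{\vec p}u_{\vec p} = \vec r_0u_{\vec p} \;\Rightarrow\; u_{\vec p} = e^{-i\vec p\cdot\vec r_0/\hbar} \tag{20.114}$$
and the state
$$\Psi^{(+)}_{\vec r_0}(\vec r) = \int\frac{d^3p}{(2\pi\hbar)^3}\,e^{i\vec p\cdot(\vec r - \vec r_0)/\hbar}\,\chi^{(+)}_{\vec p} \tag{20.115}$$
is an eigenstate of $\vec r_+$.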
The presence of the $\vec r_-$ part in the position operator says that putting a wave packet made from positive energy solutions (a particle) through a potential $\Phi(\vec r)$ (which multiplies by functions of $\vec r$) causes the creation of antiparticles and, because charge must be conserved, creates new particles also.
Thus, the relativistic spin-zero theory of the Klein-Gordon equation has built into it the mechanism of particle-antiparticle production by external potentials.
The solution follows the same lines as the nonrelativistic problem. For $x < 0$ we have
$$\psi(x) = ae^{ipx/\hbar} + be^{-ipx/\hbar} , \quad \text{Energy} = E_p \tag{20.116}$$
This corresponds to incident and reflected waves. For $x > 0$, the Klein-Gordon equation is
$$(E_p - V)^2\psi(x) = -\hbar^2c^2\frac{\partial^2\psi(x)}{\partial x^2} + m^2c^4\psi(x) , \quad V = e\Phi \tag{20.117}$$
The solution takes the form $\psi(x) = de^{ikx}$ where substitution gives
$$(E_p - V)^2 = \hbar^2c^2k^2 + m^2c^4 \tag{20.118}$$
is not continuous at x = 0.
We obtain
$$b = \frac{p - \hbar k}{p + \hbar k}\,a , \quad d = \frac{2p}{p + \hbar k}\,a \tag{20.121}$$
We consider three cases:
1. If $E_p > V + mc^2$, then the particle can pass over the barrier and the results are identical to the nonrelativistic case, i.e., part of the wave is reflected and part is transmitted.
2. If we have a stronger potential such that $E_p + mc^2 > V > E_p - mc^2$, then $k$ must be imaginary so that the wave function goes to zero as $x \to \infty$. We then have
$$k = i\kappa , \quad \kappa = \frac{\sqrt{m^2c^4 - (E_p - V)^2}}{\hbar c} \tag{20.122}$$
and the wave is totally reflected at the barrier. The charge density on the right ($x > 0$) is given by
$$\rho(x) = \frac{E_p - V}{mc^2}\,|d|^2e^{-2\kappa x} \tag{20.123}$$
For $E_p > V$, there exists a positive, exponentially decaying charge density to the right of the barrier. For $E_p < V$, however, the density is negative (remember it is a beam of positive particles). We reflect positively charged particles from the barrier and find negative particles inside the barrier.
$$v_g = \frac{\partial E_p}{\partial(\hbar k)} \tag{20.124}$$
Using $(E_p - V)^2 = \hbar^2c^2k^2 + m^2c^4$ we get
$$(E_p - V)\frac{\partial E_p}{\partial k} = \hbar^2c^2k \;\Rightarrow\; v_g = \frac{\hbar c^2k}{E_p - V} \tag{20.125}$$
This says that the magnitude of the reflection amplitude $b/a$ is greater than one, i.e., more wave is reflected than is incident! In addition, the charge density on the right is
$$\rho(x) = \frac{E_p - V}{mc^2}\,|d|^2 < 0 \tag{20.126}$$
and the current on the right is negative.
One possible explanation is to say that the incident particle induces the creation of particle-antiparticle pairs at the barrier. The created antiparticles, having the opposite charge, find $x > 0$ a region of attractive potential and thus travel towards the right, which explains the negative current on the right. The created particles travel to the left and, together with the incident wave, which is totally reflected, they add up to an outgoing current on the left that is larger than the incident current.
The total outgoing current on the left and right equals the incident current since total charge must be conserved.
This pair creation solution does not violate conservation of energy. The energy of a created particle on the left is $E_p$. The energy of a created antiparticle on the right is $\sqrt{\hbar^2c^2k^2 + m^2c^4} - V$ since the electrostatic potential energy has the opposite sign for a particle of opposite charge. Adding the two energies we get $E_p + \sqrt{\hbar^2c^2k^2 + m^2c^4} - V = 0$, i.e., it takes zero energy to create a particle-antiparticle pair. This happens because the potential $V$ is so large that the energy of the antiparticle on the right is not only less than $mc^2$ but is negative.
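The three regimes of the potential step can be explored numerically from (20.118) and (20.121). The sketch below assumes units with $\hbar = c = m = 1$; the function `klein_step` and the regime labels are illustrative. In the strong-potential regime the root with negative $k$ is selected so that the group velocity (20.125) on the right is positive, which is the choice that leads to $|b/a| > 1$.

```python
import numpy as np

def klein_step(E, V, m=1.0, c=1.0, hbar=1.0):
    """Return (regime, k, b/a, d/a) for a Klein-Gordon particle hitting a step V."""
    mc2 = m * c**2
    p = np.sqrt(E**2 - mc2**2) / c                    # incident momentum
    k_sq = ((E - V)**2 - mc2**2) / (hbar * c)**2      # from Eq. (20.118)
    if k_sq < 0:
        regime = "total reflection"
        k = 1j * np.sqrt(-k_sq)                       # k = i kappa, decaying wave
    elif E > V:
        regime = "partial transmission"
        k = np.sqrt(k_sq) + 0j
    else:
        regime = "pair creation"
        k = -np.sqrt(k_sq) + 0j                       # v_g = hbar c^2 k/(E - V) > 0
    b_over_a = (p - hbar * k) / (p + hbar * k)        # Eq. (20.121)
    d_over_a = 2 * p / (p + hbar * k)
    return regime, k, b_over_a, d_over_a

for potential in (0.5, 2.0, 5.0):
    regime, k, b, d = klein_step(E=2.0, V=potential)
    print(f"V = {potential}: {regime}, |b/a| = {abs(b):.4f}")
```

For real $k$ the amplitudes satisfy $p\,(1 - |b/a|^2) = \hbar k\,|d/a|^2$, so a negative $k$ on the right forces $|b/a| > 1$. Note that the sketch does not handle the special value of $V$ at which $p + \hbar k = 0$.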
This says that in regions where $E > e\Phi(\vec r)$, which includes classically accessible regions, the charge density is positive. But, in regions where $E < e\Phi(\vec r)$, the charge density is negative. The way to think about this is to say the particle in the potential is a linear combination of free particle and free antiparticle states.
This interaction cannot be taken into account in the present one-particle relativistic theory (it requires quantum field theory).
We now turn to the problem of a spin zero particle bound in a Coulomb potential. An example is a $\pi^-$ bound to a nucleus. We have
$$e\Phi(\vec r) = -\frac{Ze^2}{r} \tag{20.129}$$
which leads to the Klein-Gordon equation
$$\left[\left(E + \frac{Ze^2}{r}\right)^2 + \hbar^2c^2\nabla^2 - m^2c^4\right]\psi(\vec r) = 0 \tag{20.130}$$
Since this is a central potential, we can assume that the eigenstates have definite
values of total orbital angular momentum. We then have
2
1 2 `(` + 1) (Z)2 2Ze2 E
E 2 2 2
m c + ~ r + (r) = 0
c2 r r2 r2 r c
(20.131)
or
1 2 `(` + 1) (Z)2
2
E m2 c4
2ZE
r+ (r) = 0 (20.132)
r r2 r2 ~cr ~2 c2
where
$$\alpha = \frac{e^2}{\hbar c} = \text{fine structure constant} \tag{20.133}$$
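Now we define
$$\gamma = Z\alpha , \quad \ell'(\ell' + 1) = \ell(\ell + 1) - \gamma^2$$
$$\lambda = \frac{2E\gamma}{\hbar c\kappa} , \quad \kappa^2 = \frac{4(m^2c^4 - E^2)}{\hbar^2c^2} , \quad \rho = \kappa r$$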
and we get
d2 ` 0 (` 0 + 1)
2
+ 1 + () = 0 (20.134)
d(/2)2 /2 (/2)2
which is identical to the radial equation for the nonrelativistic Coulomb problem
for the function u = (). The difference is that ` 0 is not necessarily an
integer (remember that it is an integer in the nonrelativistic problem), which
causes the orbits of the relativistic Coulomb (Kepler) problem to no longer be
closed, i.e., the orbits precess. This also means that the extra degeneracy of
the nonrelativistic problem which causes the energy to be independent of ` is
broken in the relativistic problem. We now solve this equation in the standard
way. For
0
0 () `
() e/2
Therefore, we guess a solution of the form
`0 +1
u = () = e/2 w(/2) (20.135)
2
The solution method is identical to the nonrelativistic hydrogen atom. We get a power series which must terminate (so that the solution is normalizable) when
$$\lambda = N + \ell' + 1 , \quad N = 0, 1, 2, 3, \ldots$$
$$E = mc^2\left(1 + \frac{\gamma^2}{\lambda^2}\right)^{-1/2} \;\Rightarrow\; E = \frac{mc^2}{\sqrt{1 + \dfrac{\gamma^2}{\left[N + \frac{1}{2} + \sqrt{(\ell + \frac{1}{2})^2 - \gamma^2}\right]^2}}} \tag{20.136}$$
The principal quantum number has the possible values $n = 1, 2, 3, \ldots$ For a given $n$ the possible values of the total orbital angular momentum are $\ell = 0, 1, 2, 3, \ldots, n - 1$.
The degeneracy that was present in the nonrelativistic theory with respect to
orbital angular momentum ` is clearly removed.
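The removal of the $\ell$ degeneracy can be seen by evaluating (20.136) directly. The sketch below makes three assumptions that are labeled in the code: the relation $n = N + \ell + 1$ between the principal quantum number and $N$ (as in the nonrelativistic problem), $Ry = \frac{1}{2}mc^2\gamma^2$ with $\gamma = Z\alpha$, and energies measured in units of $mc^2$. It also compares the exact formula with the expansion through the first relativistic correction.

```python
import numpy as np

ALPHA = 1 / 137.035999  # approximate value of the fine structure constant

def kg_energy(n, l, Z=1, mc2=1.0):
    """Exact Klein-Gordon Coulomb energy, Eq. (20.136), assuming n = N + l + 1."""
    gamma = Z * ALPHA
    N = n - l - 1
    lam = N + 0.5 + np.sqrt((l + 0.5)**2 - gamma**2)
    return mc2 / np.sqrt(1 + gamma**2 / lam**2)

def kg_energy_expanded(n, l, Z=1, mc2=1.0):
    """Rest energy + Rydberg term + first relativistic correction (assumes Ry = mc^2 gamma^2 / 2)."""
    gamma = Z * ALPHA
    ry = 0.5 * mc2 * gamma**2
    return mc2 - ry / n**2 - ry * gamma**2 / n**3 * (1 / (l + 0.5) - 3 / (4 * n))

for n, l in [(1, 0), (2, 0), (2, 1)]:
    print(n, l, kg_energy(n, l) - 1.0, kg_energy_expanded(n, l) - 1.0)

splitting = kg_energy(2, 0) - kg_energy(2, 1)
print("E(2,0) - E(2,1) =", splitting)
```

The $\ell = 0$ level lies below the $\ell = n - 1$ level for the same $n$, which is the degeneracy breaking described above.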
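If we expand the energy in a power series in the fine structure constant $\alpha$ (or $\gamma$) we get
$$E_{n\ell} = mc^2 - \frac{Ry}{n^2} - \frac{\alpha^2Ry}{n^3}\left(\frac{1}{\ell + \frac{1}{2}} - \frac{3}{4n}\right) + O(Ry\,\alpha^4) \tag{20.138}$$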
The first term is the rest energy. The second term is the nonrelativistic Rydberg formula. The third term is the relativistic correction due to using the relativistic form of the kinetic energy, which as we saw earlier in Chapter 10 took the form
$$H_{rel} = -\frac{p^4}{8m^3c^2} \tag{20.139}$$
It is this correction that removes the degeneracy in $\ell$, i.e.,
$$E_{n,\ell=0} - E_{n,\ell=n-1} = -\frac{4\alpha^2Ry}{n^3}\,\frac{n - 1}{2n - 1} \tag{20.140}$$
As we shall see later when we derive the Dirac equation, there are more corrections to this formula due to the fact that the electron has spin = 1/2.
Remember that in the nonrelativistic limit the dominant term in the energy will
be mc2 so that we expect the zeroth order equation for to be
i~ = mc2 (20.145)
t
which then implies in the next approximation that
2
1 ~ e~
= 2 2 A (20.146)
4m c i c
The operator on the right-hand side is just the kinetic energy operator
$$\sqrt{m^2c^4 + c^2\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\vec A\right)^2} \tag{20.148}$$
expanded to second order in $1/mc^2$. This agrees with our earlier result that the first relativistic correction for a spinless particle is entirely due to the relativistic modification of the kinetic energy.
For a weak magnetic field B ~ this becomes (to order (v/c)3 ) after much algebra
~2 2 ~2 2 ~2 2
2 e ~ ~
i~ = 1+ + (mc + e) BL 1+
t 2m 2m2 c2 2mc 2m2 c2
(20.149)
~
where L is the orbital angular momentum of the particle. The term
~2 2 p2
1+ 1 (20.150)
2m2 c2 2m2 c2
represents the relativistic correction to the magnetic moment.
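Under the action of a Lorentz transformation along the z-axis with velocity $v = \beta c$, a 4-vector (any type), since it is a first-rank tensor, transforms as
$$V'^\mu = \Lambda^\mu{}_\nu V^\nu \tag{20.154}$$
where
$$(\Lambda^\mu{}_\nu) = \begin{pmatrix} \gamma & 0 & 0 & -\beta\gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\beta\gamma & 0 & 0 & \gamma \end{pmatrix} , \quad \gamma = \frac{1}{\sqrt{1 - \beta^2}} , \quad \beta = \frac{v}{c} \tag{20.155}$$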
This corresponds to the standard transformation relations for the position and
momentum 4-vectors
$$\vec L = \vec r\times\vec p \;\Rightarrow\; L_i = \varepsilon_{ijk}x_jp_k \tag{20.156}$$
In fact, $\vec L$ is the product of two vectors and therefore should have the transformation properties of a second-rank tensor, i.e., as
Q0 = Q (20.158)
as Maxwell's equations
F 4
= J
x c
F F F
+ + =0
x x x
where the current density 4-vector is
$$J^\mu = (c\rho, J_x, J_y, J_z) \tag{20.160}$$
F 0 01 = 01 = F = 0 1 F = 0 11 F 1 = 0 F 1
= 00 F 01 + 03 F 31 = 1 B2 = (1 ((~v /c) B) ~ 1)
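or
$$\mathcal{E}'_1 = \gamma\left(\mathcal{E}_1 + ((\vec v/c)\times\vec B)_1\right) \quad\text{where } \vec v = v\hat e_z \tag{20.162}$$
Similarly, we find
$$\mathcal{E}'_2 = \gamma\left(\mathcal{E}_2 + ((\vec v/c)\times\vec B)_2\right) , \quad \mathcal{E}'_3 = \mathcal{E}_3$$
$$B'_1 = \gamma\left(B_1 - ((\vec v/c)\times\vec{\mathcal{E}})_1\right) , \quad B'_2 = \gamma\left(B_2 - ((\vec v/c)\times\vec{\mathcal{E}})_2\right) , \quad B'_3 = B_3$$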
Thus, a pure magnetic field in one frame is a mixture of magnetic and electric
fields in the new frame.
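The mixing of electric and magnetic fields under a boost can be illustrated by transforming the field tensor numerically. The sketch below assumes Gaussian units and the conventions $F^{0i} = -\mathcal{E}_i$, $F^{ij} = -\varepsilon_{ijk}B_k$, with a boost along the z-axis of the form $\Lambda^0{}_0 = \Lambda^3{}_3 = \gamma$, $\Lambda^0{}_3 = \Lambda^3{}_0 = -\beta\gamma$; these index and sign conventions are assumptions chosen to reproduce the standard field transformation laws, and the function names are illustrative.

```python
import numpy as np

def field_tensor(E, B):
    """Contravariant F^{mu nu} with F^{0i} = -E_i and F^{ij} = -eps_{ijk} B_k (assumed convention)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return np.array([[0.0, -Ex, -Ey, -Ez],
                     [Ex, 0.0, -Bz, By],
                     [Ey, Bz, 0.0, -Bx],
                     [Ez, -By, Bx, 0.0]])

def boost_z(beta):
    """Lorentz transformation matrix for a boost along z with velocity beta*c."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[3, 3] = gamma
    L[0, 3] = L[3, 0] = -beta * gamma
    return L

def transform_fields(E, B, beta):
    """Return (E', B') from F' = Lambda F Lambda^T."""
    L = boost_z(beta)
    Fp = L @ field_tensor(E, B) @ L.T
    E_new = -Fp[0, 1:]
    B_new = np.array([Fp[3, 2], Fp[1, 3], Fp[2, 1]])
    return E_new, B_new

# A pure magnetic field along x, seen from a frame moving along z.
E_new, B_new = transform_fields(E=(0.0, 0.0, 0.0), B=(1.0, 0.0, 0.0), beta=0.6)
print("E' =", E_new)
print("B' =", B_new)
```

With $\beta = 0.6$ the new frame sees an electric field along y in addition to an enhanced magnetic field along x, while the invariant $B^2 - \mathcal{E}^2$ is unchanged.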
~ ~ , ~ B
B ~
~r ~r , p~ ~ ~ L
pL ~
Since spin,
$$\vec S = \frac{1}{2}\hbar\vec\sigma \tag{20.163}$$
must transform as an angular momentum, which transforms like a magnetic field, and the magnetic field is part of a second-rank tensor together with the electric field, we must conclude that there exists another set of dynamical variables generated by the internal degrees of freedom of the particle that will be analogous to the electric field. Do not think of the operator $\vec\sigma$ as the standard 2 × 2 Pauli matrices; we shall see later that $\vec\sigma$ will need to be represented by 4 × 4 matrices relativistically.
We must now investigate the dynamical properties of the new variables $\vec\alpha$ and also ask this question: where have these objects been hiding in all of our previous discussions?
~.
Since
~ behaves like a vector under spatial rotations (it is like the electric field
vector), it must have the standard commutation relations with S ~
i j = iijk k + ij (20.166)
which must be true in all Lorentz frames, i.e., since 12 = 1, we must have
102 = 1.
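For a Lorentz transformation along the z-direction we have
$$\sigma'_x = \gamma(\sigma_x + iv\alpha_y/c)$$
$$\sigma'_y = \gamma(\sigma_y - iv\alpha_x/c)$$
Squaring $\sigma'_x$ we get
$$\sigma'^2_x = 1 = \gamma^2\left[\sigma_x^2 + iv(\sigma_x\alpha_y + \alpha_y\sigma_x)/c - (v/c)^2\alpha_y^2\right]$$
Since this must be true for all $v$, the coefficient of $v/c$ must vanish. Thus,
$$\sigma_x\alpha_y + \alpha_y\sigma_x = 0 \tag{20.167}$$
We then have
$$1 = \gamma^2\left[1 - (v/c)^2\alpha_y^2\right] \tag{20.168}$$
Since
$$1 = \gamma^2\left[1 - (v/c)^2\right] \tag{20.169}$$
we must also have
$$\alpha_y^2 = 1 \tag{20.170}$$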
These results generalize to the following:
i 6= j i j = j i (20.171)
i=j [i , i ] = 0 (20.172)
Multiplying y0 by x0 we get
Multiplying x0 by y0 we get
Adding, we have
(y x + x y ) + (v/c)2 (x y + y x ) = 0
(x y + y x ) = 0 (i j + j i ) = 0 i 6= j
Continuing, we find these other relations
y x = iz = x y i j j i = 2iijk k (20.173)
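or summarizing we have
$$\{\sigma_i, \sigma_j\} = 2\delta_{ij} , \quad [\sigma_i, \sigma_j] = 2i\varepsilon_{ijk}\sigma_k \tag{20.174}$$
$$[\alpha_i, \alpha_j] = 2i\varepsilon_{ijk}\sigma_k , \quad \{\alpha_i, \alpha_j\} = 0 , \quad i \neq j \tag{20.175}$$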
So $\vec\alpha$ obeys exactly the same algebraic relations as $\vec\sigma$. How do we know that $\vec\alpha$ is not equal to $\vec\sigma$? If we apply a parity transformation, we find that $\vec\sigma \to \vec\sigma$ since angular momentum is unchanged by spatial inversion, i.e., the space-space components of a second-rank tensor do not change sign under parity. On the other hand, the time-space components such as the electric field or $i\vec\alpha$ do change sign, i.e., $\vec\alpha \to -\vec\alpha$. So they cannot be the same operator!
In the first case, the eigenvalues of are 1 and in the second case i. We
choose 2 = +1 1 = . The properties under parity become
1~ = ~ ~ = ~ (20.176)
1
~ = ~
~
= ~
(20.177)
we get
det( 1 i ) = det( 1 i ) = det(i ) (20.180)
Putting these results together we get
All 2 × 2 matrices can be constructed from the set $\{I, \vec\sigma\}$ and $[\beta, \vec\sigma] = 0$. This means that $\beta$ would have to commute with all 2 × 2 matrices. Since $\vec\alpha$ would then have to commute with $\beta$, we would then violate the relation $\beta\vec\alpha = -\vec\alpha\beta$.
This means N must be at least as large as 4. This says that a relativistic spin
1/2 particle would have 4 internal states (the nonrelativistic case has 2). This is
similar to the Klein-Gordon case and it will turn out here also that this doubling
signals the appearance of antiparticles.
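where the last three matrices are the standard Pauli matrices. It is given by
$$\vec\sigma = \begin{pmatrix} \vec\sigma & 0 \\ 0 & \vec\sigma \end{pmatrix} , \quad \vec\alpha = \begin{pmatrix} 0 & \vec\sigma \\ \vec\sigma & 0 \end{pmatrix} , \quad \beta = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \tag{20.182}$$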
Note that the trace of each of these matrices is zero, which is a general property
of matrices that obey anticommutation relations.
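The algebraic relations and the vanishing traces can be confirmed numerically for the block matrices of (20.182). The following is a minimal sketch; the names `SPIN`, `ALPHA`, `BETA` and the helper functions are illustrative choices, with `SPIN` standing for the 4 × 4 spin matrices built from two copies of the Pauli matrices.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
PAULI = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

# 4 x 4 matrices in the standard representation, Eq. (20.182)
SPIN = [np.block([[s, Z2], [Z2, s]]) for s in PAULI]
ALPHA = [np.block([[Z2, s], [s, Z2]]) for s in PAULI]
BETA = np.block([[I2, Z2], [Z2, -I2]])

def anticommutator(a, b):
    return a @ b + b @ a

def commutator(a, b):
    return a @ b - b @ a

I4 = np.eye(4)
alpha_ok = all(np.allclose(anticommutator(ALPHA[i], ALPHA[j]), 2 * (i == j) * I4)
               for i in range(3) for j in range(3))
beta_ok = all(np.allclose(anticommutator(a, BETA), 0) for a in ALPHA)
spin_ok = np.allclose(commutator(SPIN[0], SPIN[1]), 2j * SPIN[2])
mixed_ok = np.allclose(commutator(ALPHA[0], ALPHA[1]), 2j * SPIN[2])
traces = [abs(np.trace(m)) for m in SPIN + ALPHA + [BETA]]
print(alpha_ok, beta_ok, spin_ok, mixed_ok, max(traces))
```

All checks hold in this representation, including $\beta^2 = 1$ and $[\alpha_x, \alpha_y] = 2i\sigma_z$ from (20.175).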
It follows from earlier discussions that the space-space components, i.e., the
spin operators j , generate (in the spin degrees of freedom) a rotation of the
coordinate system.
~ 0 = R~ R
1
, ~ 0 = R
1
~ R , 0 = R R
1
(20.183)
where
R = ei~n/2 (20.184)
and
~ 0 = Lv ~ L1 ~ 0 = Lv
v , ~ L1 0 1
v , = Lv Lv (20.185)
where
Lv = ei(i~)~/2 = e~ ~/2 L1
v =e
~
~
/2
(20.186)
and
~ = vector in direction of velocity of primed frame
with respect to unprimed frame and of magnitude
v
tanh() =
c
Proof : First,
Second,
0
= L L1 = e|| /2 e|| /2 = e|| /2 e|| /2 = e|| = e~ ~
(20.189)
where we have used
{i , j } = 0 , i 6= j || , = 0 (20.190)
)2 = 1 we get
~ = , = 1 and (~
Now using
e~ ~ = cosh +
~ sinh (20.191)
Therefore,
0
= e~ ~ = cosh [1 +
~ tanh ] = [1 +
~ tanh ] (20.193)
~v
~v /c) = i
(~ ~ (20.196)
c
so that
0 ~v
= + i
~ (20.197)
c
which agrees with our earlier result. Thus, ~ transforms correctly. A similar
calculation shows that
~ transforms correctly also and, thus, our interpretation
is correct.
0 = Lv L1
v =e
~ ~
(20.198)
0 = [ (~v /c) ~
] (20.199)
0
0
= L L1 = (20.200)
0 ||0 = || (v/c)
(20.201)
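Therefore, $(\beta, \beta\vec\alpha)$ does transform like a 4-vector.
Some properties
$$(\gamma^0)^2 = 1 , \quad (\gamma^i)^2 = -1 , \quad i = 1, 2, 3 \tag{20.203}$$
$$\{\gamma^\mu, \gamma^\nu\} = 0 , \quad \mu \neq \nu \tag{20.204}$$
$$\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu} \tag{20.205}$$
We also have
$$\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu] \tag{20.206}$$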
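In fact, any 4 × 4 matrix can be written as a unique linear combination of products of the $\gamma^\mu$. The set of 16 matrices
$$I , \quad \gamma^\mu , \quad \sigma^{\mu\nu} , \quad \gamma^5\gamma^\mu , \quad \gamma^5 , \quad\text{where } \gamma^5 = \gamma^0\gamma^1\gamma^2\gamma^3 \tag{20.207}$$
are linearly independent and complete. All are traceless except for the identity matrix.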
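The completeness of the 16 matrices can be checked by flattening each one into a 16-component vector and computing the rank of the resulting 16 × 16 array. The sketch below assumes the standard representation with $\gamma^0 = \beta$ and $\gamma^i = \beta\alpha_i$, and uses $\gamma^5 = \gamma^0\gamma^1\gamma^2\gamma^3$ as in (20.207); the variable names are illustrative.

```python
import numpy as np
from itertools import combinations

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
PAULI = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
BETA = np.block([[I2, Z2], [Z2, -I2]])
ALPHA = [np.block([[Z2, s], [s, Z2]]) for s in PAULI]

# gamma^0 = beta, gamma^i = beta alpha_i (standard representation, assumed here)
GAMMA = [BETA] + [BETA @ a for a in ALPHA]
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])
GAMMA5 = GAMMA[0] @ GAMMA[1] @ GAMMA[2] @ GAMMA[3]
SIGMA_MUNU = [0.5j * (GAMMA[m] @ GAMMA[n] - GAMMA[n] @ GAMMA[m])
              for m, n in combinations(range(4), 2)]

I4 = np.eye(4, dtype=complex)
BASIS = [I4] + GAMMA + SIGMA_MUNU + [GAMMA5 @ g for g in GAMMA] + [GAMMA5]

clifford_ok = all(np.allclose(GAMMA[m] @ GAMMA[n] + GAMMA[n] @ GAMMA[m],
                              2 * METRIC[m, n] * I4)
                  for m in range(4) for n in range(4))
rank = np.linalg.matrix_rank(np.array([b.flatten() for b in BASIS]))
traces = [abs(np.trace(b)) for b in BASIS[1:]]
print(len(BASIS), rank, clifford_ok, max(traces))
```

A rank of 16 means the set spans all 4 × 4 complex matrices, so the expansion of any such matrix in this set is unique.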
The new operator $\gamma^5$ commutes with the $\sigma^{\mu\nu}$. This implies that it commutes with $\alpha_i = \gamma^0\gamma^i$ and is invariant under a Lorentz transformation. It is not a scalar, however, since under parity
$$\beta\gamma^5\beta^{-1} = -\gamma^5 \tag{20.208}$$
= (, ~
) , p = (E/c, p~) (20.209)
This says that s 2
E E
~
p~ = p2 = mc (20.212)
c c
The sign depends on the sign we choose for . If we had interpreted 2 = 1 to
mean = 1 instead of +1, which is equivalent to choosing the parity operator
as , no physics would have changed. This means we are free to choose the
sign. We choose
$$\frac{E}{c} - \vec\alpha\cdot\vec p = +\beta mc \tag{20.213}$$
or
$$E - c\vec\alpha\cdot\vec p = \beta mc^2 \tag{20.214}$$
This operator equation involves 4 × 4 matrices, which implies that any physical state vectors must be 4-component spinors.
The form of the result says that the Hamiltonian of a relativistic spin 1/2 particle is
$$H = c\vec\alpha\cdot\vec p + \beta mc^2 \tag{20.218}$$
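In the presence of an electromagnetic field we use minimal coupling to get
$$\left(i\hbar\frac{\partial}{\partial t} - e\Phi\right)\psi(\vec r,t) = \left[c\vec\alpha\cdot\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\vec A\right) + \beta mc^2\right]\psi(\vec r,t) \tag{20.219}$$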
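to get
$$(E - e\Phi)\psi = \left[c\vec\alpha\cdot\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\vec A\right) + \beta mc^2\right]\psi \tag{20.221}$$
We then write
$$\psi = \begin{pmatrix} \psi_A \\ \psi_B \end{pmatrix} \tag{20.222}$$
where $\psi_A$ and $\psi_B$ are still two-component functions and use the explicit Dirac matrices to obtain
$$(E - e\Phi)\begin{pmatrix} \psi_A \\ \psi_B \end{pmatrix} = \left[\begin{pmatrix} 0 & \vec\sigma \\ \vec\sigma & 0 \end{pmatrix}\cdot\left(c\vec p - e\vec A\right) + mc^2\begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}\right]\begin{pmatrix} \psi_A \\ \psi_B \end{pmatrix} \tag{20.223}$$
This is equivalent to two coupled equations
$$\vec\sigma\cdot\left(c\vec p - e\vec A\right)\psi_B + mc^2\psi_A = (E - e\Phi)\psi_A \tag{20.224}$$
$$\vec\sigma\cdot\left(c\vec p - e\vec A\right)\psi_A - mc^2\psi_B = (E - e\Phi)\psi_B \tag{20.225}$$
These last two equations are exact and very useful substitutes for the Dirac equation.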
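Now, earlier we derived the identity
$$(\vec\sigma\cdot\vec a)(\vec\sigma\cdot\vec b) = \vec a\cdot\vec b + i\vec\sigma\cdot\left(\vec a\times\vec b\right) \tag{20.230}$$
We then have
$$\left[\vec\sigma\cdot\left(c\vec p - e\vec A\right)\right]^2 = \left(c\vec p - e\vec A\right)^2 + i\vec\sigma\cdot\left[\left(c\vec p - e\vec A\right)\times\left(c\vec p - e\vec A\right)\right] \tag{20.231}$$
Now
$$\left(c\vec p - e\vec A\right)\cdot\left(c\vec p - e\vec A\right) = \left(c\vec p - e\vec A\right)^2 \tag{20.232}$$
and
$$\left(c\vec p - e\vec A\right)\times\left(c\vec p - e\vec A\right) = -ec\left(\vec p\times\vec A + \vec A\times\vec p\right) = +ie\hbar c\left(\nabla\times\vec A + \vec A\times\nabla\right) \tag{20.233}$$
Now
$$\begin{aligned}
\left(\nabla\times\vec A + \vec A\times\nabla\right)_i\psi_A &= \varepsilon_{ijk}\left[\partial_j(A_k\psi_A) - A_k\partial_j\psi_A\right] \\
&= \varepsilon_{ijk}\left[(\partial_jA_k)\psi_A + A_k(\partial_j\psi_A) - A_k(\partial_j\psi_A)\right] \\
&= \varepsilon_{ijk}(\partial_jA_k)\psi_A = (\nabla\times\vec A)_i\psi_A = B_i\psi_A
\end{aligned} \tag{20.234}$$
Putting everything together we get
$$\left[\frac{1}{2m}\left(\vec p - \frac{e}{c}\vec A\right)^2 - \frac{e}{mc}\frac{\hbar\vec\sigma}{2}\cdot\vec B + e\Phi\right]\psi_A = E'\psi_A \tag{20.235}$$
This is the Pauli equation. The term involving the magnetic field has the form of a magnetic dipole interaction energy
$$-\frac{e}{mc}\vec S\cdot\vec B \tag{20.236}$$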
with a gyromagnetic ratio
e
2 g=2 (20.237)
mc
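The full time-dependent form of the nonrelativistic limit is given by
$$\left[\frac{1}{2m}\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\vec A\right)^2 - \frac{e}{mc}\frac{\hbar\vec\sigma}{2}\cdot\vec B + (e\Phi + mc^2)\right]\psi_A = i\hbar\frac{\partial\psi_A}{\partial t} \tag{20.238}$$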
Note that the Hermitian conjugate operation reverses matrix order. Now multiplying the first equation by $\psi^+(\vec r,t)$ on the left and the second equation by $\psi(\vec r,t)$ on the right and subtracting, we get the continuity-type equation
$$\frac{\partial(\psi^+\psi)}{\partial t} + \nabla\cdot(\psi^+c\vec\alpha\psi) = 0 \tag{20.241}$$
This says that the quantity $\psi^+\psi$ is a positive conserved quantity that can be interpreted as a probability density and then
$$\vec j = \psi^+c\vec\alpha\psi \tag{20.242}$$
is the corresponding probability current. The operator $c\vec\alpha$ corresponds to the velocity operator, which is the derivative of the Hamiltonian with respect to $\vec p$.
It turns out, however, that a more convenient equation to use in the new frame
~ matrices, i.e., 0 and
is one that still involves the old and ~ 0 are represented
by the same matrices as and ~ . We can find this other equation as follows.
We have
(~r 0 , t0 ) ~
i~ 0 = 0 c~ 0 0 (~r 0 , t0 ) + mc2 (~r 0 , t0 )
t0 i
0 0
(~r , t ) ~ 0
i~Lv L1
v 0
= Lv c~ L1v (~ r0 , t0 ) + mc2 (~r 0 , t0 )
t i
(~r 0 , t0 ) ~ 0
i~L1
v = c~ L1
v (~ r 0 , t0 ) + mc2 L1 v (~r 0 , t0 )
t0 i
If we define
0 (~r 0 , t0 ) = L1 r 0 , t0 ) = L1
v (~ v (~
r, t) (20.247)
we have the equation
0 (~r 0 , t0 ) ~
i~ 0
0 0 (~r 0 , t0 ) + mc2 0 (~r 0 , t0 )
= c~ (20.248)
t i
This form of the equation has the same matrices $\beta$ and $\vec\alpha$ in all frames with the wave function in the new frame related to the wave function in the old frame by the Lorentz transformation.
Alternatively, we can write the Dirac equation in covariant form. The Dirac equation is
$$i\hbar\frac{\partial\psi}{\partial t} = \left[c\vec\alpha\cdot\frac{\hbar}{i}\nabla + \beta mc^2\right]\psi \tag{20.249}$$
which we can rewrite as
We choose to write four linearly independent solutions to the free particle Dirac equation as:
$$u^{(+)}_{0\uparrow} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} , \quad u^{(+)}_{0\downarrow} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} , \quad u^{(-)}_{0\downarrow} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} , \quad u^{(-)}_{0\uparrow} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}$$
where the upper index $(\pm)$ denotes the eigenvalue of $\beta$, the 0 denotes that the particle is at rest, $\vec p = 0$, and the arrow denotes the value of the spin associated physically with these states.
The spinors $u^{(+)}_{0\uparrow}$ and $u^{(-)}_{0\downarrow}$ are eigenstates of $\sigma_z$ with eigenvalue $+1$ and $u^{(+)}_{0\downarrow}$ and $u^{(-)}_{0\uparrow}$ are eigenstates of $\sigma_z$ with eigenvalue $-1$.
We are saying here that while $u^{(-)}_{0\downarrow}$ is the spinor of a negative energy particle with spin up, we will associate it with a positive energy antiparticle with spin down.
The states with $\beta = +1$ vary in time as $e^{-imc^2t/\hbar}$ and those with $\beta = -1$ vary in time as $e^{+imc^2t/\hbar}$. The positive and negative states have opposite (intrinsic) parity.
We can now construct states for a particle with momentum $\vec p$ by starting with the particle at rest and applying a Lorentz transformation to take us to a frame moving with velocity
$$\vec v = \frac{\vec pc^2}{E_p} \quad\text{where}\quad E_p = +\sqrt{p^2c^2 + m^2c^4} \tag{20.254}$$
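as the wave function for nonzero momentum. The new spinors are given by
$$u^{(\pm)}_{\vec p,\sigma} = e^{-\vec\alpha\cdot\vec\omega/2}\,u^{(\pm)}_{0,\sigma} = \left[\cosh\frac{\omega}{2} - \vec\alpha\cdot\hat v\,\sinh\frac{\omega}{2}\right]u^{(\pm)}_{0,\sigma} \tag{20.258}$$
Using
$$\vec v = \frac{\vec pc^2}{E_p} \tag{20.259}$$
we get
$$\cosh\frac{\omega}{2} = \sqrt{\frac{E_p + mc^2}{2mc^2}} , \quad \hat v\tanh\frac{\omega}{2} = -\frac{\vec pc}{E_p + mc^2} \tag{20.260}$$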
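so that
$$u^{(\pm)}_{\vec p,\sigma} = \sqrt{\frac{E_p + mc^2}{2mc^2}}\left[1 + \frac{c\,\vec\alpha\cdot\vec p}{E_p + mc^2}\right]u^{(\pm)}_{0,\sigma} \tag{20.261}$$
We then have (in the standard representation)
$$u^{(+)}_{\vec p,\uparrow} = \sqrt{\frac{E_p + mc^2}{2mc^2}}\left[1 + \frac{c}{E_p + mc^2}\,\vec\alpha\cdot\vec p\right]u^{(+)}_{0,\uparrow} \tag{20.262}$$
Now
$$\vec\alpha\cdot\vec p = p_x\begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} + p_y\begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \end{pmatrix} + p_z\begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix} , \quad u^{(+)}_{0,\uparrow} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}$$
and we get
$$u^{(+)}_{\vec p,\uparrow} = \sqrt{\frac{E_p + mc^2}{2mc^2}}\left[\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} + \frac{c}{E_p + mc^2}\left(p_x\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} + p_y\begin{pmatrix} 0 \\ 0 \\ 0 \\ i \end{pmatrix} + p_z\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}\right)\right] = \sqrt{\frac{E_p + mc^2}{2mc^2}}\begin{pmatrix} 1 \\ 0 \\ \frac{cp_z}{E_p + mc^2} \\ \frac{c(p_x + ip_y)}{E_p + mc^2} \end{pmatrix} \tag{20.263}$$
and similarly
$$u^{(+)}_{\vec p,\downarrow} = \sqrt{\frac{E_p + mc^2}{2mc^2}}\begin{pmatrix} 0 \\ 1 \\ \frac{c(p_x - ip_y)}{E_p + mc^2} \\ -\frac{cp_z}{E_p + mc^2} \end{pmatrix} \tag{20.264}$$
$$u^{(-)}_{\vec p,\downarrow} = \sqrt{\frac{E_p + mc^2}{2mc^2}}\begin{pmatrix} \frac{cp_z}{E_p + mc^2} \\ \frac{c(p_x + ip_y)}{E_p + mc^2} \\ 1 \\ 0 \end{pmatrix} \tag{20.265}$$
$$u^{(-)}_{\vec p,\uparrow} = \sqrt{\frac{E_p + mc^2}{2mc^2}}\begin{pmatrix} \frac{c(p_x - ip_y)}{E_p + mc^2} \\ -\frac{cp_z}{E_p + mc^2} \\ 0 \\ 1 \end{pmatrix} \tag{20.266}$$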
Remember that the arrow refers to the spin associated with the state in the rest frame, which is minus the $\sigma_z$ eigenvalue for the $(-)$ spinors. We see that a particle in a $\sigma_z$ eigenstate in its rest frame appears to be in a $\sigma_z$ eigenstate to an observer moving with respect to the particle only if the observer is moving along the z-direction, i.e., if $p_x = p_y = 0$ we have
$$u^{(+)}_{\vec p,\uparrow} = \sqrt{\frac{E_p + mc^2}{2mc^2}}\begin{pmatrix} 1 \\ 0 \\ \frac{cp_z}{E_p + mc^2} \\ 0 \end{pmatrix} = \sqrt{\frac{E_p + mc^2}{2mc^2}}\,u^{(+)}_{0,\uparrow} + \sqrt{\frac{E_p + mc^2}{2mc^2}}\,\frac{cp_z}{E_p + mc^2}\,u^{(-)}_{0,\downarrow} \tag{20.267}$$
which is a sum of a particle and an antiparticle where both have spin up!
The positive energy solutions $u^{(+)}_{\vec p\sigma}e^{i(\vec p\cdot\vec r - E_pt)/\hbar}$ correspond to particles with momentum $\vec p$, energy $E_p$ and spin orientation $\sigma$. The negative energy solutions $u^{(-)}_{\vec p\sigma}e^{-i(\vec p\cdot\vec r - E_pt)/\hbar}$ correspond to particles with momentum $-\vec p$, energy $-E_p$ and spin orientation $-\sigma$, which we will soon associate with antiparticles with momentum $\vec p$, energy $E_p$ and spin orientation $\sigma$.
The nonzero momentum spinors are orthogonal but not normalized to one (as is the case with the zero momentum spinors). Since $L^+ \neq L^{-1}$, in general, the Lorentz transformations are not represented by a unitary operator and hence the lengths of vectors or normalizations change. In particular, the normalization is given by
$$u^{(\pm)+}_{\vec p\sigma}\,u^{(\pm)}_{\vec p\sigma} = \frac{E_p}{mc^2} \tag{20.268}$$
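The boosted spinors can be checked numerically against the free Hamiltonian $H = c\vec\alpha\cdot\vec p + \beta mc^2$. The sketch below assumes units with $m = c = 1$ and builds each spinor as $\sqrt{(E_p + mc^2)/2mc^2}\,[1 + c\vec\alpha\cdot\vec p/(E_p + mc^2)]$ acting on a rest-frame spinor; the function names are illustrative. Because the negative energy spinors accompany the plane wave $e^{-i(\vec p\cdot\vec r - E_pt)/\hbar}$, they are tested as eigenvectors of the Hamiltonian evaluated at $-\vec p$ with eigenvalue $-E_p$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
PAULI = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
ALPHA = [np.block([[Z2, s], [s, Z2]]) for s in PAULI]
BETA = np.block([[I2, Z2], [Z2, -I2]])

def alpha_dot(p):
    """Matrix alpha . p for a 3-vector p."""
    return sum(pi * ai for pi, ai in zip(p, ALPHA))

def dirac_hamiltonian(p, m=1.0, c=1.0):
    """Free Hamiltonian H = c alpha.p + beta m c^2, Eq. (20.218)."""
    return c * alpha_dot(p) + BETA * m * c**2

def boosted_spinor(p, u0, m=1.0, c=1.0):
    """Spinor of Eq. (20.261) built from the rest-frame spinor u0; also returns E_p."""
    p = np.asarray(p, dtype=float)
    mc2 = m * c**2
    e_p = np.sqrt(p @ p * c**2 + mc2**2)
    boost = np.eye(4) + c * alpha_dot(p) / (e_p + mc2)
    return np.sqrt((e_p + mc2) / (2 * mc2)) * boost @ np.asarray(u0, dtype=complex), e_p

p = np.array([0.4, -0.7, 1.1])
u_plus, e_p = boosted_spinor(p, [1, 0, 0, 0])
u_minus, _ = boosted_spinor(p, [0, 0, 1, 0])

residual_plus = np.linalg.norm(dirac_hamiltonian(p) @ u_plus - e_p * u_plus)
residual_minus = np.linalg.norm(dirac_hamiltonian(-p) @ u_minus + e_p * u_minus)
norm_plus = np.vdot(u_plus, u_plus).real
norm_minus = np.vdot(u_minus, u_minus).real
print(residual_plus, residual_minus, norm_plus, norm_minus, e_p)
```

Both norms equal $E_p/mc^2$ rather than one, which reflects the non-unitary character of the boost.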
Since $\vec\alpha$ is Hermitian, we have $L^+ = L$.
This means that the product u1 u2 of any two spinors is a Lorentz invariant, i.e.,
The spinors $u^{(\pm)}_{\vec p\sigma}$ obey the completeness relation that says that the 4 × 4 identity matrix can be written as the sum of the outer products of the four spinors, i.e.,
X (b) (b)
bup~ up~ = 1 (20.274)
b,
()
The spinors up~ obey
() ()
p~)up~ = mc2 up~
(Ep c~ (20.275)
and
()+ ()+
up~ p~) = mc2 up~
(Ep c~ (20.276)
Multiplying the last equation on the right by we have the equation satisfied
()
by up~
() ()
p~) = mc2 up~
up~ (Ep c~ (20.277)
This says that the product (~r, t)(~r, t) transforms like a Lorentz scalar.
transforms like a 4-vector under a Lorentz transformation. It is the particle
4-current multiplied by 1/c. In the same manner,
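The positive density $\rho(\vec r,t) = \psi^+(\vec r,t)\psi(\vec r,t)$ and the current $\vec j(\vec r,t) = c\,\psi^+(\vec r,t)\,\vec\alpha\,\psi(\vec r,t)$ satisfy the continuity equation
$$\frac{\partial\rho}{\partial t} + \nabla\cdot\vec j = 0 \tag{20.281}$$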
Z
(~r, t)d3 r (20.282)
One important consequence in the spin zero case was that it is impossible for a particle to make a transition from a state normalized to $+1$ to a state normalized to $-1$ since the normalization remains constant in time. We associated the positively normalized states with particles and the negatively normalized states with antiparticles. We then see that the impossibility of a transition between positive and negative energy states just corresponds to charge conservation.
In the spin-1/2 case, however, both positive and negative energy states have positive normalization so that there is nothing in the theory (so far) that prevents a particle in a positive energy state from making a transition to a negative energy state, radiating away several high energy photons in the process. This is a difficulty in the theory that we must return to later!
Let us say some more about the position and velocity operators in the Dirac
theory.
The position operator has strange features similar to those of the Klein-Gordon
theory. If we apply the position operator to a wave packet made up of positive
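energy free particle states we get
\[
\begin{aligned}
\vec r\,\psi^{(+)}(\vec r) &= \vec r \sum_\sigma \int \frac{d^3p}{(2\pi\hbar)^3}\, a_{\vec p\sigma}\, u^{(+)}_{\vec p\sigma}\, e^{i\vec p\cdot\vec r/\hbar}
= \sum_\sigma \int \frac{d^3p}{(2\pi\hbar)^3}\, a_{\vec p\sigma}\, u^{(+)}_{\vec p\sigma} \left(\frac{\hbar}{i}\nabla_{\vec p}\right) e^{i\vec p\cdot\vec r/\hbar} \\
&= \sum_\sigma \int \frac{d^3p}{(2\pi\hbar)^3}\, (i\hbar\nabla_{\vec p}\, a_{\vec p\sigma})\, u^{(+)}_{\vec p\sigma}\, e^{i\vec p\cdot\vec r/\hbar}
+ \sum_\sigma \int \frac{d^3p}{(2\pi\hbar)^3}\, a_{\vec p\sigma}\, (i\hbar\nabla_{\vec p}\, u^{(+)}_{\vec p\sigma})\, e^{i\vec p\cdot\vec r/\hbar}
\end{aligned}
\]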
where we have integrated by parts to get the last two terms. The first term contains only positive energy components. The second term, however, contains the factor $i\hbar\nabla_{\vec p}\, u^{(+)}_{\vec p\sigma}$, which generates both positive and negative energy components (explicitly do the derivatives on the column vectors we derived earlier). If we define, as before
\[
\vec r = \vec r^{\,(+)} + \vec r^{\,(-)} \tag{20.283}
\]
then, as before, the even part $\vec r^{\,(+)}$ acting on the wave packet of positive energy free particle states produces only positive energy free particle states and acting on the wave packet of negative energy free particle states produces only negative energy free particle states, while the odd part $\vec r^{\,(-)}$ turns positive energy states into negative energy states and vice versa.
As in the Klein-Gordon case, both positive and negative energy free particle
solutions are needed to produce a localized wave packet.
\[
\frac{d\vec r}{dt} = c\vec\alpha \tag{20.285}
\]
This means that the velocity operator is not simply related to the momentum operator relativistically. The eigenstates of any component of $\vec\alpha$ are linear combinations of positive and negative energy free particle states and thus cannot be realized in any physical situation! For any arbitrary state the expectation value of $c\vec\alpha$ has a magnitude between 0 and $c$.
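A small check makes the statement about the eigenstates of $\vec\alpha$ concrete. The sketch below assumes the standard Dirac representation, in which $\alpha_z$ has $\sigma_z$ in its off-diagonal blocks (the helper names are hypothetical). It verifies that $\alpha_z^2 = 1$, so the eigenvalues of $c\alpha_z$ are $\pm c$, and that the eigenvectors carry equal weight in the upper and lower components, i.e., they mix the rest-frame positive and negative energy spinors.

```python
# alpha_z in the Dirac representation: sigma_z in the off-diagonal 2x2 blocks.
ALPHA_Z = [
    [0, 0, 1, 0],
    [0, 0, 0, -1],
    [1, 0, 0, 0],
    [0, -1, 0, 0],
]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def apply(matrix, vector):
    """Matrix-vector product for 4x4 nested lists."""
    return [sum(matrix[i][j] * vector[j] for j in range(4)) for i in range(4)]

def square(matrix):
    """Matrix squared for 4x4 nested lists."""
    return [[sum(matrix[i][k] * matrix[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# Unnormalized eigenvectors: equal-weight mixtures of upper and lower components.
plus = [1, 0, 1, 0]    # eigenvalue +1, velocity +c
minus = [1, 0, -1, 0]  # eigenvalue -1, velocity -c

print(square(ALPHA_Z) == IDENTITY)
print(apply(ALPHA_Z, plus) == plus)
print(apply(ALPHA_Z, minus) == [-x for x in minus])
```

Each line prints True. Any state built only from positive energy components therefore cannot be an eigenstate of $c\alpha_z$.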
20.6.5 Non-relativistic Limit
We now derive corrections to the Pauli equation. Earlier we had
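\[
\vec\sigma\cdot(c\vec p - e\vec A)\,\psi_B + mc^2\psi_A = (E - e\Phi)\,\psi_A \tag{20.286}
\]
\[
\vec\sigma\cdot(c\vec p - e\vec A)\,\psi_A - mc^2\psi_B = (E - e\Phi)\,\psi_B \tag{20.287}
\]
or
\[
\vec\sigma\cdot\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\vec A\right)\psi_B + mc\,\psi_A = \frac{1}{c}\left(i\hbar\frac{\partial}{\partial t} - e\Phi\right)\psi_A \tag{20.288}
\]
\[
\vec\sigma\cdot\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\vec A\right)\psi_A - mc\,\psi_B = \frac{1}{c}\left(i\hbar\frac{\partial}{\partial t} - e\Phi\right)\psi_B \tag{20.289}
\]
The second equation of the above pair gives (an exact equation)
\[
\psi_B = \frac{1}{2mc}\,\vec\sigma\cdot\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\vec A\right)\psi_A - \frac{1}{2mc^2}\left(i\hbar\frac{\partial}{\partial t} - mc^2 - e\Phi\right)\psi_B \tag{20.290}
\]
Now the $\psi_A$ term is much larger than the $\psi_B$ term on the right. Thus, we get the first correction by iterating once, i.e.
\[
\psi_B = \frac{1}{2mc}\,\vec\sigma\cdot\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\vec A\right)\psi_A
- \frac{1}{4m^2c^3}\left(i\hbar\frac{\partial}{\partial t} - mc^2 - e\Phi\right)\vec\sigma\cdot\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\vec A\right)\psi_A \tag{20.291}
\]
Substituting this expression into the first equation of the pair we get the first relativistic correction term to the Pauli equation
\[
-\frac{1}{4m^2c^3}\,\vec\sigma\cdot\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\vec A\right)\left(i\hbar\frac{\partial}{\partial t} - mc^2 - e\Phi\right)\vec\sigma\cdot\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\vec A\right)\psi_A \tag{20.292}
\]
which is $(v/c)^2$ smaller than the kinetic energy term $p^2/2m$.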
Using this relation with the identity
\[
(\vec\sigma\cdot\vec a)(\vec\sigma\cdot\vec b) = \vec a\cdot\vec b + i\vec\sigma\cdot(\vec a\times\vec b) \tag{20.295}
\]
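The identity is easy to verify numerically for ordinary (commuting) vectors $\vec a$ and $\vec b$. The sketch below is a minimal check using explicit Pauli matrices; the helper names are hypothetical, and it does not cover the case where the components of $\vec a$ and $\vec b$ are operators that fail to commute with each other.

```python
# Pauli matrices as nested lists of complex numbers.
SIGMA = (
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
)

def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a 3-vector v of ordinary numbers."""
    return [[sum(v[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def identity_residual(a, b):
    """Largest element of (sigma.a)(sigma.b) - [a.b + i sigma.(a x b)]."""
    lhs = matmul(sigma_dot(a), sigma_dot(b))
    s_cross = sigma_dot(cross(a, b))
    rhs = [[dot(a, b) * (i == j) + 1j * s_cross[i][j] for j in range(2)]
           for i in range(2)]
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

# The residual should vanish up to floating-point rounding.
print(identity_residual([1.0, 2.0, 3.0], [-0.5, 4.0, 0.25]))
```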
The first term is the relativistic correction to the kinetic energy. The second
term is the spin-orbit coupling. The third term is new and is not even Hermi-
tian!
The reason for this non-Hermitian term is that we are only working to order $(v/c)^2$. Such a non-Hermitian term in the wave equation means that the normalization integral
\[
\int \psi_A^\dagger \psi_A\, d^3r
\]
can change in time. Now the full Dirac equation obeys the normalization condition
\[
\int \psi^\dagger\psi\, d^3r = \int [\psi_A^\dagger\psi_A + \psi_B^\dagger\psi_B]\, d^3r = 1 \tag{20.297}
\]
to order $(v/c)^2$, that remains constant and equal to 1. This implies that the correct nonrelativistic limit of the Dirac wave function (the limit whose normalization remains constant in time) is
\[
\Psi(\vec r,t) = \left(1 + \frac{p^2}{8m^2c^2}\right)\psi_A(\vec r,t)\ ,\qquad \int \Psi^\dagger\Psi\, d^3r = 1 \tag{20.299}
\]
The equation for this form of the wave function will not have any non-Hermitian
terms. A large amount of algebra gives the equation for (~r, t) as
p4
1 e ~ 2
i~ = mc2 + p~ A (20.300)
t 2m c 8m3 c2
2
e~ ~ + e~ ~ (~ p~) + e + ~ (2 e)
~ B
2mc 4m2 c2 8m2 c2
This is the correct nonrelativistic limit of the Dirac equation. All terms are
Hermitian.
e~ e~
2 2
~ (~ p~) = 2 2 ~ ( p~) (20.301)
4m c 4m c
If we assume the potential is spherically symmetric, then
1 d
= ~r (20.302)
r dr
and we get
e~ e~ d e d ~ ~
2 2
~ (~ p~) = 2 2 ~ (~r p~) = 2 2 SL (20.303)
4m c 4m c r dr 2m c r dr
which is the spin-orbit energy. It correctly contains the Thomas precession
correction! We do not have to add any terms in an ad hoc manner!
Correction to the Potential - This is called the Darwin term. Now, from Poisson's equation we have
\[
\frac{\hbar^2}{8m^2c^2}\nabla^2(e\Phi) = \frac{\pi\hbar^2}{2m^2c^2}\, Ze^2\,\delta(\vec r) \tag{20.304}
\]
This term tends to raise the energy of s-states since they do not vanish at the
origin.
where we have substituted ~ for ~ . The potential function is e = Ze2 /r and
~ = 0. Writing
we let A
u1 u3
A = , B = (20.307)
u2 u4
we get
Ze2
i 2 u4 u4 u3
E+ mc u1 + i + =0 (20.308)
~c r x y z
Ze2
i u3 u3 u4
E+ mc2 u2 + +i =0 (20.309)
~c r x y z
Ze2
i u2 u2 u2
E+ + mc2 u3 + i + =0 (20.310)
~c r x y z
Ze2
i u1 u1 u2
E+ + mc2 u4 + +i =0 (20.311)
~c r x y z
We now use another clever trick I learned from Professor Hans Bethe at Cornell
University to find a solution.
If we consider only large components, i.e., set the small components to zero, then $[\vec L, H]$, which is proportional to $\vec\alpha\times\vec p$, will be zero, since $\vec\alpha$ connects the small and large components. This means that $\psi_A$ will be an eigenfunction of $\vec L$. In addition, it must contain one spin component with spin up and another with spin down.
Of course, ~j and jz are constants of the motion. Hence, for j = ` + 1/2 we can
set
s
` + m + 21 m 12
u1 = g(r) Y` () (20.312)
2` + 1
s
` m + 21 m+ 21
u2 = g(r) Y` () (20.313)
2` + 1
where the unknown function g(r) will be the solution of some relativistic radial
equation.
have a different `.
Corresponding to $j = \ell + 1/2$ the only other possible value of the orbital angular momentum is $\ell' = \ell + 1$. Therefore, we set (remembering the appropriate Clebsch-Gordan coefficients)
s
` m + 23 m 21
u3 = if (r) Y`+1 () (20.315)
2` + 3
s
` + m + 23 m+ 12
u4 = if (r) Y`+1 () (20.316)
2` + 3
where the unknown function f (r) will be the solution of some relativistic radial
equation. Inserting these solution guesses into the 4 coupled equations we find
that for j = ` + 1/2 the connection between f and g is given by
Ze2
1 dg g
E+ + mc2 f = ` (20.317)
~c r dr r
2
1 Ze df f
E+ mc2 g = (` + 2) (20.318)
~c r dr r
and
Ze2
1 dg g
E+ + mc2 f = + (` + 1) (20.323)
~c r dr r
2
1 Ze df f
E+ mc2 g = + (` 1) (20.324)
~c r dr r
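We now define
\[
k = \begin{cases} -(\ell + 1) & \text{if } j = \ell + 1/2 \\ \ell & \text{if } j = \ell - 1/2 \end{cases} \tag{20.325}
\]
i.e.,
\[
k = \begin{cases} -1, -2, \ldots & \text{if } j = \ell + 1/2 \\ 1, 2, \ldots & \text{if } j = \ell - 1/2 \end{cases} \tag{20.326}
\]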
We can then combine the 4 equations for f and g into 2 equations as
Ze2
1 2 dg g
E+ + mc f + (1 + k) =0 (20.327)
~c r dr r
Ze2
1 df f
E+ mc2 g + + (1 k) =0 (20.328)
~c r dr r
Setting
F = rf , G = rg
mc2 + E mc2 E
1 = , 2 =
~c ~c
Ze2
= (1 2 )1/2 , = , = r
~c
we get
d k 1
+ G + F =0 (20.329)
d
d k 2
F G=0 (20.330)
d
We now solve these coupled equations using the standard series method to obtain
the positive energy bound state solutions.
We substitute
F = ()e , F = ()e (20.331)
and obtain
0 k 1
+ + =0 (20.332)
k 2
0 =0 (20.333)
We now substitute the series
X
X
= s am m , a0 6= 0 , = s bm m , b0 6= 0 (20.334)
m=0 m=0
This makes sure that s 6= . Substituting the series and equating coefficients
of the same power of we get the recursion relations
1
(s + + k)b b1 a a1 = 0 (20.336)
2
(s + k)a a1 + b b1 = 0 (20.337)
For = 0 we get
(s + k)b0 a0 = 0 = (s k)a0 + b0 (20.338)
These equations have a nontrivial solution if and only if
s = (k 2 2 )1/2 (20.339)
First we look at the negative root. For small the integrand for the integrated
probability density is 2s and we must have 2s > 1 or (k 2 2 )1/2 > 1/2.
The minimum s occurs when k 2 = 1. This corresponds to Z 109. For k 2 > 1,
no value of Z will permit the negative root.
The recursion relations lead to function of the order e2 (the probability density
integral would diverge) unless the series terminate. Suppose the series terminate
for = n0 , i.e., an0 +1 = bn0 +1 = 0. We then have from the recursion relations
that
1 an0 = bn0 , n0 = 0, 1, 2, ..... (20.340)
We now multiply the first recursion relation by and the second by 1 and
subtract them to get
b [(s + + k) 1 ] = a [1 (s + k) + ] (20.341)
Inserting = n0 and using 1 an0 = bn0 we get
2E
2(s + n0 ) = (1 2 ) = (20.342)
~c
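Putting everything together we get
\[
E = mc^2\left[1 + \frac{\gamma^2}{(s + n')^2}\right]^{-1/2} = mc^2\left[1 + \frac{\gamma^2}{\left(n' + \sqrt{k^2 - \gamma^2}\right)^2}\right]^{-1/2} \tag{20.343}
\]
Since $|k| = j + \tfrac12$ we get
\[
E = mc^2\left[1 + \frac{\gamma^2}{\left(n' + \sqrt{(j + \tfrac12)^2 - \gamma^2}\right)^2}\right]^{-1/2}\ ,\quad n' = 0, 1, 2, \ldots\ ,\quad j + \tfrac12 = 1, 2, 3, \ldots \tag{20.344}
\]
where $\gamma = Ze^2/\hbar c$.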
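To get a feel for the numbers, the energy formula can be evaluated directly. The sketch below is illustrative only. It assumes the usual identification of the principal quantum number as $n = n' + j + \tfrac12$, uses rounded values of the fine structure constant and the electron rest energy, and compares the exact expression with the standard expansion through order $(Z\alpha)^4$; all names are hypothetical.

```python
import math

ALPHA = 1 / 137.035999   # fine structure constant (rounded, assumed value)
MC2_EV = 510998.95       # electron rest energy in eV (rounded, assumed value)

def dirac_energy(n, j, Z=1):
    """Exact Dirac-Coulomb energy in eV, with n' = n - (j + 1/2)."""
    gamma = Z * ALPHA
    j_plus_half = j + 0.5
    n_radial = n - j_plus_half
    if 0 > n_radial:
        raise ValueError("need j + 1/2 no larger than n")
    s = math.sqrt(j_plus_half**2 - gamma**2)
    return MC2_EV / math.sqrt(1 + gamma**2 / (n_radial + s) ** 2)

def expanded_energy(n, j, Z=1):
    """Expansion of the same energy through order (Z alpha)^4."""
    za = Z * ALPHA
    return MC2_EV * (1 - za**2 / (2 * n**2)
                     - za**4 / (2 * n**3) * (1 / (j + 0.5) - 3 / (4 * n)))

for n, j in ((1, 0.5), (2, 0.5), (2, 1.5)):
    binding = dirac_energy(n, j) - MC2_EV
    print(n, j, binding, expanded_energy(n, j) - MC2_EV)

# Fine structure splitting of the n = 2 levels (j = 3/2 minus j = 1/2), in eV.
print(dirac_energy(2, 1.5) - dirac_energy(2, 0.5))
```

The n = 2 splitting comes out near $4.5\times10^{-5}$ eV, and the two j = 1/2 states of n = 2 share the same energy because the formula depends only on n and j.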
Before looking at the physics in this result let us investigate an alternative ap-
proach involving a second-order Dirac Equation. The first-order Dirac equation
is
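\[
\left[i\hbar\frac{\partial}{\partial t} - H\right]\psi(\vec r,t) = 0\ ,\qquad H = c\,\vec\alpha\cdot\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\vec A\right) + \beta mc^2 + e\Phi \tag{20.345}
\]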
and
~ e~
A, i~ e = i~e~ (20.349)
i c t
The new second-order equation is just the Klein-Gordon equation with an additional term proportional to $\vec\sigma\cdot\vec B - i\vec\alpha\cdot\vec{\mathcal E}$, which represents the direct coupling of the electromagnetic fields to the magnetic (and electric) moments of the particle.
Every solution of the Dirac equation is a solution of this new second-order equation, but every solution of the second-order equation is not necessarily a solution of the Dirac equation.
This says that P acts as a projection operator, which reduces solutions of the
second-order equation to solutions of the first-order Dirac equation.
Let us now use the second-order equation to find the energy levels of the Dirac
hydrogen atom (Glauber, et al PR 109,1307(1958)). For a stationary state of
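energy $E$ in the Coulomb potential the second-order equation becomes
\[
\left[\frac{1}{c^2}\left(E + \frac{Ze^2}{r}\right)^2 - \left(\frac{\hbar}{i}\nabla\right)^2 - m^2c^2 + \frac{i\hbar Ze^2}{r^2 c}\,\alpha_r\right]\psi = 0 \tag{20.352}
\]
where $\alpha_r = \vec\alpha\cdot\hat r$. We now write
\[
\left(\frac{\hbar}{i}\nabla\right)^2 = -\frac{\hbar^2}{r^2}\frac{\partial}{\partial r}\, r^2\frac{\partial}{\partial r} + \frac{L^2}{r^2} \tag{20.353}
\]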
(20.354)
We now use a few tricks to change this equation, which is almost in the same
form as the Klein-Gordon equation for the Coulomb potential, into exactly the
same form.
\[
H = c\,\vec\alpha\cdot\vec p + \beta mc^2 - \frac{Ze^2}{r} \tag{20.358}
\]
for the relativistic hydrogen atom.
This says that K is a constant of the motion and since it also commutes with the
total angular momentum we can label the common eigenstates or energy levels
of the hydrogen atom by the eigenvalues of K, J2 and Jz . K is a constant of the
motion for any spherically symmetric, spin-independent potential and physically
it measures the degree to which the spin and the orbital angular momentum of
the particle are aligned.
\[
\vec L\times\vec L = i\hbar\vec L\ ,\qquad \vec J = \vec L + \vec S\ ,\qquad \vec S = \frac{\hbar}{2}\vec\sigma
\]
Therefore, we have
\[
k^2 = j(j + 1) + \frac14 = (j + \tfrac12)^2 \tag{20.360}
\]
Now, since $\{K, \gamma_5\} = 0$ we find that, if $k$ is an eigenvalue of $K$, i.e.,
\[
K|k\rangle = k|k\rangle \tag{20.361}
\]
then
\[
K\gamma_5|k\rangle = -\gamma_5 K|k\rangle = -k\gamma_5|k\rangle \tag{20.362}
\]
which says that $-k$ is also an eigenvalue of $K$. The eigenvalues are then
\[
k = \pm 1,\ \pm 2,\ \pm 3,\ \ldots \tag{20.363}
\]
since $j = 1/2, 3/2, 5/2, \ldots$ Note that zero is not an eigenvalue of $K$. In addition, an eigenstate of $K$ with eigenvalue $k$ is an eigenstate of $J^2$ with eigenvalue $j = |k| - 1/2$.
Ze2
= K i r (20.364)
~c
with the properties
2
Ze2
h i h i
, K = 0 , , J~ = 0 , 2
=K 2
(20.365)
~c
which is the operator in the last term of the second-order equation. We can
then write
" #
E 2 m2 c4 2EZe2 ~2 2 2 ~2 ( + 1)
+ + 2 2r =0 (20.367)
c2 rc2 r r r2
This is exactly the same form as the Klein-Gordon equation except that
`0 (`0 + 1) ( + 1) (20.368)
\[
E = \frac{mc^2}{\left[1 + \left(\dfrac{Ze^2}{\hbar c\, n'}\right)^2\right]^{1/2}}\ ,\quad n' = \ell' + 1 + \nu\ ,\quad \nu = 0, 1, 2, \ldots \tag{20.369}
\]
does not commute with H however. This means that the solutions we have
found for the second order equation cannot directly be eigenfunctions of H.
Instead, since
H(P ) = E(P ) (20.370)
i.e., the energy eigenvalues from the second-order equation are also the eigen-
values of H, we can find eigenfunctions of H by using the projection operator
P . Since P and do not commute, the eigenfunction of H, namely P , will
generally be a linear combination of different eigenfunctions.
or
2 !1/2
Ze2
2
= k (20.372)
~c
(1) = , `0 =
(2) = , `0 = 1
\[
E = \frac{mc^2}{\left[1 + \left(\dfrac{Ze^2}{\hbar c\, n'}\right)^2\right]^{1/2}}\ ,\quad n' = \ell' + 1 + \nu\ ,\quad \nu = 0, 1, 2, \ldots \tag{20.374}
\]
The quantum number n is just the principal quantum number of the hydrogen
atom.
Figure 20.2: Dirac hydrogen energy level structure
Some Features
The energy levels for the spin 1/2 particle are the same as those found for the spin 0 particle with $\ell \to j$.
The Dirac theory leads to an accidental degeneracy in $\ell$, i.e., states with the same $j$ but different $\ell$ have the same energy. This degeneracy is removed by the Lamb shift, which is due to the interaction of the electron with its own field. As we shall see later, for $j = 1/2$, the effect is one order of magnitude smaller than the fine structure splitting. For $j \geq 3/2$, it is two orders of magnitude smaller.
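\[
\alpha = \frac{e^2}{\hbar c} = \text{fine structure constant} \tag{20.378}
\]
looks like
\[
E_{n,j} = mc^2\left[1 - \frac{(Z\alpha)^2}{2n^2} - \frac{(Z\alpha)^4}{2n^3}\left(\frac{1}{j + \tfrac12} - \frac{3}{4n}\right) + O((Z\alpha)^6)\right] \tag{20.379}
\]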
To get a handle on how to proceed we look at the nonrelativistic limit where
We expect the solutions of the second order equation with one sign of to
correspond to solutions of the first-order equation with the opposite sign of K.
This means that
It turns out to be convenient to still label the solutions by the $\ell$ value that they would have in the nonrelativistic limit. To find this $\ell$ value we use
\[
K^2 = 1 + \frac{L^2}{\hbar^2} + \frac{\vec\sigma\cdot\vec L}{\hbar} = \beta K + \frac{L^2}{\hbar^2}
\]
\[
K(K - \beta) = \frac{L^2}{\hbar^2}
\]
In the nonrelativistic limit, $\beta \to 1$ and we have
\[
K(K - 1) = k(k - 1) = \frac{L^2}{\hbar^2} = \ell(\ell + 1) \tag{20.381}
\]
so that $\ell$ becomes the total orbital angular momentum quantum number in the nonrelativistic limit. Solving for $\ell$ in terms of $k$ we get
\[
\ell = \begin{cases} k - 1 = j - 1/2 & \text{for } k > 0 \\ |k| = j + 1/2 & \text{for } k < 0 \end{cases} \tag{20.382}
\]
Now $K$ measures the alignment of the spin and the orbital angular momentum. The above results say that for $k > 0$, they are essentially parallel and so $j = \ell + 1/2$ and for $k < 0$ they are essentially antiparallel so $j = \ell - 1/2$.
A detailed calculation of the wave functions shows that the upper two components of the wave function (the large components) are eigenstates of total orbital angular momentum with eigenvalue $\ell$, while the lower two components (the small components) are eigenstates of total orbital angular momentum with eigenvalue $\ell + 1$ for $k > 0$ and with $\ell - 1$ for $k < 0$.
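The bookkeeping between $k$, $j$ and $\ell$ can be tabulated mechanically. The sketch below is a small illustration with hypothetical function names. It uses the sign convention of this paragraph ($k > 0$ for $j = \ell + 1/2$), together with $j = |k| - 1/2$ and the usual restriction $\ell \leq n - 1$, to list the spectroscopic labels for a given $n$.

```python
from fractions import Fraction

ORBITAL_LETTERS = "SPDFGHI"

def quantum_numbers_from_k(k):
    """Return (j, ell of large components, ell of small components)."""
    if k == 0:
        raise ValueError("k = 0 is not an eigenvalue of K")
    j = Fraction(2 * abs(k) - 1, 2)
    ell_large = k - 1 if k > 0 else -k
    ell_small = ell_large + 1 if k > 0 else ell_large - 1
    return j, ell_large, ell_small

def allowed_k(n):
    """k values compatible with ell no larger than n - 1."""
    return [k for k in range(-(n - 1), n + 1) if k != 0]

def level_label(n, k):
    """Spectroscopic label such as '2P3/2' for principal quantum number n."""
    j, ell, _ = quantum_numbers_from_k(k)
    if ell > n - 1:
        raise ValueError("need ell no larger than n - 1")
    return f"{n}{ORBITAL_LETTERS[ell]}{j.numerator}/{j.denominator}"

for n in (1, 2, 3):
    print(n, [(k, level_label(n, k)) for k in allowed_k(n)])
```

In this listing the value $k = n$ never has a partner $-n$, which picks out the unpaired levels 1S1/2, 2P3/2 and 3D5/2.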
The complete energy level scheme for the relativistic hydrogen atom for n = 1,
2, and 3 looks like Figure 20.3 below.
Figure 20.3: Energy level structure for relativistic hydrogen
All levels except $1S_{1/2}$, $2P_{3/2}$, $3D_{5/2}$, etc., are 2-fold degenerate because they are the eigenstates of $K$ with opposite eigenvalues, e.g., $3P_{3/2} \to k = 2$, $3D_{3/2} \to k = -2$.
Hyperfine Structure
There are two corrections that modify the energy level results from the Dirac equation. The two-fold degeneracy is removed by the interaction of the electron with vacuum fluctuations of the electromagnetic radiation field. This effect is called the Lamb shift. In addition, there is also a hyperfine interaction which splits every level into two. It is due to the interaction of the electron with the magnetic moment of the proton. We consider hyperfine splitting first.
(assuming the proton is fixed at the origin) is given by the relations
~ ~ 1
A(~r) = Mp = vector potential (20.385)
r
~ r) = A(~
B(~ ~ r) (20.386)
~ r) = (~p ) gp p
B(~ (20.387)
2r
This gives
1
H 0 = gp B p~ (~p )
2r
1
= gp B p~ (~p ( ) (~p ))
2r
1
= gp B p ((~ ~p )( ) (~ )(~p )) (20.388)
2r
The first-order shift of the level is
Z
D E
2 1
H 0 = gp B p d3 r |(~r)| (h~ ~p i ( ) h(~ )(~p )i)
2r
(20.389)
where the brackets h.....i denote the expectation value in the relative spin state
of the electron and proton and (~r) is the nonrelativistic wave function of the
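level. If we only consider s-states, which are spherically symmetric, then
\[
\langle(\vec\sigma\cdot\nabla)(\vec\sigma_p\cdot\nabla)\rangle = \frac13\langle\vec\sigma\cdot\vec\sigma_p\rangle\,\nabla^2 \tag{20.390}
\]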
and we get
Z
D E
0 1 3 2 2 1
H = gp B p d r |(~r)| h(~ ~p )i
3 2r
Z
4 2
= gp B p h(~ ~p )i d3 r |(~r)| (~r)
3
4 2
= gp B p h(~ ~p )i |(0)| (20.391)
3
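where we have used
\[
\nabla^2\frac{1}{r} = -4\pi\delta(\vec r) \tag{20.392}
\]
For the hydrogen atom s-state
\[
|\psi(0)|^2 = \frac{1}{\pi(na_0)^3} \tag{20.393}
\]
and we get
\[
\langle H'\rangle = \frac23\, g_p\,\langle\vec\sigma\cdot\vec\sigma_p\rangle\,\frac{e^2}{2a_0}\,\frac{m}{m_p}\,\frac{\alpha^2}{n^3} \tag{20.394}
\]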
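We then have
\[
\vec F = \vec S + \vec I = \text{total spin} \tag{20.395}
\]
For $S = \tfrac12$, $I = \tfrac12$, we have $F = 0$ (singlet), $1$ (triplet). But
\[
\vec F = \vec S + \vec I \;\Rightarrow\; \vec F^2 = \vec S^2 + \vec I^2 + 2\vec S\cdot\vec I \tag{20.396}
\]
\[
\vec S\cdot\vec I = \frac{\hbar^2}{4}\,\vec\sigma\cdot\vec\sigma_p = \frac{\hbar^2}{2}\left(F(F + 1) - \frac32\right) \tag{20.397}
\]
We then have for a relative triplet state $\langle\vec\sigma\cdot\vec\sigma_p\rangle = 1$ and for a relative singlet state $\langle\vec\sigma\cdot\vec\sigma_p\rangle = -3$. This says that the singlet state lies lower than the triplet.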
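The triplet and singlet values of $\vec\sigma\cdot\vec\sigma_p$ can also be confirmed by brute force. The sketch below is a minimal illustration with hypothetical helper names. It builds the $4\times4$ matrix $\sum_i \sigma_i\otimes\sigma_i$ in the basis (up-up, up-down, down-up, down-down) and evaluates its expectation value in one triplet state and in the singlet state.

```python
import math

SIGMA = (
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
)

def kron(a, b):
    """Kronecker product of two 2x2 matrices as a 4x4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def sigma_dot_sigma_p():
    """Matrix of sigma . sigma_p on the electron-proton spin space."""
    total = [[0] * 4 for _ in range(4)]
    for s in SIGMA:
        term = kron(s, s)
        for r in range(4):
            for c in range(4):
                total[r][c] += term[r][c]
    return total

def expectation(op, state):
    """Expectation value of op in a normalized 4-component state."""
    applied = [sum(op[r][c] * state[c] for c in range(4)) for r in range(4)]
    return sum(state[r].conjugate() * applied[r] for r in range(4))

ROOT_HALF = 1 / math.sqrt(2)
triplet_m0 = [0.0, ROOT_HALF, ROOT_HALF, 0.0]
singlet = [0.0, ROOT_HALF, -ROOT_HALF, 0.0]

op = sigma_dot_sigma_p()
print(expectation(op, triplet_m0))  # close to +1
print(expectation(op, singlet))     # close to -3
```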
\[
\Delta E = \frac83\, g_p\,\frac{e^2}{2a_0}\,\frac{m}{m_p}\,\alpha^2 \tag{20.398}
\]
between the triplet and singlet. The transition between these two levels generates radiation with a frequency of 1420 MHz and a wavelength of 21.1 cm.
This radiation is very important in astronomy. From its intensity, Doppler
broadening, and Doppler shift, one obtains information concerning the density,
temperature, and motion of interstellar and intergalactic hydrogen clouds.
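As a rough numerical check of the ground-state splitting $\Delta E = \frac83 g_p \frac{e^2}{2a_0}\frac{m}{m_p}\alpha^2$, one can insert standard constants. The sketch below is illustrative: the constant values are rounded reference values supplied here as assumptions, the names are hypothetical, and small corrections (reduced mass, the electron's anomalous moment) are ignored, so the result lands close to but not exactly on the measured line.

```python
ALPHA = 1 / 137.035999        # fine structure constant (rounded)
RYDBERG_EV = 13.605693        # e^2 / 2 a_0 in eV (rounded)
G_PROTON = 5.585695           # proton g-factor (rounded)
MASS_RATIO = 1 / 1836.15267   # m / m_p (rounded)
PLANCK_EV_S = 4.135667696e-15 # Planck constant h in eV s
LIGHT_SPEED = 299792458.0     # c in m/s

def hyperfine_splitting_ev(n=1):
    """Triplet-singlet splitting of the n s-state, using the 1/n^3 scaling."""
    return (8 / 3) * G_PROTON * RYDBERG_EV * MASS_RATIO * ALPHA**2 / n**3

delta_e = hyperfine_splitting_ev()
frequency_hz = delta_e / PLANCK_EV_S
wavelength_m = LIGHT_SPEED / frequency_hz

print(delta_e)       # energy splitting in eV
print(frequency_hz)  # transition frequency in Hz
print(wavelength_m)  # wavelength in m
```

The estimate gives a frequency of about 1.42 GHz and a wavelength of about 0.21 m, consistent with the familiar 21 cm line.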
We consider an electron in the state $|n\rangle$ with energy $\varepsilon_n$. Because of the above interaction (see last part of this chapter) the electron is able to spontaneously emit a photon thereby going to some state $|n'\rangle$. This produces a second-order shift in the energy given by
\[
\Delta E_n = \sum_{n'}\sum_{\vec k\vec\lambda}\frac{\left|\langle n', \vec k\vec\lambda|\, H_{int}\, |n, 0\rangle\right|^2}{\varepsilon_n - \varepsilon_{n'} - \hbar ck} \tag{20.400}
\]
where $|n, 0\rangle$ is the initial state with the electron in $|n\rangle$ with no photons present, and $|n', \vec k\vec\lambda\rangle$ is the intermediate state with an electron in $|n'\rangle$ and one photon of
From the quantum theory of electromagnetic radiation (see end of this chapter)
we have that
s
D e 2~2 c2 0 ~ ~
n0 , ~k~ Hint |n, 0i = hn | j~k |ni (20.401)
c k V
P 0 2
hn | ~j~k ~ |ni
d3 k 2~2 e2 X
Z
~
En =
(2~)3 ck 0
n n0 ck
n
2
d hn0 | ~j~k ~ |ni
R P
k 2 dk e2 X
Z
~
= (20.402)
4 2 ~ ck 0 n n0 ck
n
where the factor 2/3 comes from the fact that there are only 2 independent
polarizations for each ~k value. This gives
Z X |hn0 | p~ |ni|2
2e2
En = d , = ck (20.404)
3~c3 m2 0
n n0
0 n
The first problem we encounter is that the integral diverges!! This means that
the interaction with the radiation field produces an infinite shift downward in
the energy of the electron.
This result presented theoretical physics with a great difficulty for many years.
In the late 1940s it was resolved due to the work of Feynman, Schwinger and
Tomonaga in producing new calculation rules within the context of quantum
electrodynamics and by Bethe and Weisskopf who actually carried out the cal-
culation using the new rules and got a finite number agreeing with experiment.
If we do a similar calculation for a free electron, then one gets an infinite result
again. In the dipole approximation, we can evaluate the energy shift for a free electron in a momentum state $|\vec p\rangle$. We get
Z 2
2e2 X |h~q| p~ |~
pi|
Ep~ = d (20.405)
3~c3 m2 q p
0 q
~
which is infinite. What Bethe and Weisskopf noticed was that this expression is proportional to $p^2$. In their development of quantum electrodynamics, Feynman, Schwinger and Tomonaga had similar problems which they were able to deal with by redefining the electron parameters that appeared in the theory (like mass and charge). The process is called renormalization. In this process all infinite expressions are consistently incorporated into the mass or charge parameters and then these are defined to have the known experimental values.
In our case, we can interpret the infinite result as redefining the mass, i.e., as
representing a shift of the mass of the electron. In terms of the mathematics,
this means the following. If we say that m0 is the mass and p2 /2m0 is the kinetic
energy of a free electron of momentum p~ neglecting the electromagnetic interac-
tions, then the energy including the effects of the electromagnetic interactions
is given by
Z
p2 1 2e 2 2
p = 1 p
2
+ Ep~ = d (20.407)
2m0 m0 3~c3 m2 2 m 2
0
i.e., we have renormalized the electron mass. The so-called electromagnetic self-
energy of the electron can thus be interpreted as giving a shift of the mass of the
electron from its bare (no electromagnetic interactions) value m0 to its observed
(measured in the laboratory where all interactions are present) value m.
We then argue that the reason the interacting electron has an infinite energy
shift is that it includes the infinite energy change that we already have counted
once when we use the observed mass m rather than the bare mass in the cal-
culation and, thus, we are double counting. In other words, we should really
start out with the Hamiltonian for the hydrogen atom in the presence of the
radiation field given by
p2 e2
H = + Hint (20.408)
2m0 r
Then using the corrected expression for m we get
Z
p2 e2 2e2
H = + Hint + d (20.409)
2m r 3~c3 m2
0
This means that if we write the observed free particle mass in the kinetic energy
(which we always do) we should not count that part of Hint that produces the
infinite mass shift, i.e., we should regard
Z
2e2
Hint + d (20.410)
3~c3 m2
0
so that
2e2 n0 n
Z
2
X
En0 = 3 2
|hn 0
| p |ni| d (20.413)
3~c m 0
n n0
n 0
The integral is still divergent but only logarithmically and, in fact, not at all
in more sophisticated relativistic calculations. We can imagine that the correct
calculation would yield a similar result but with a convergent integral. We can
simulate this result by integrating to some cutoff value (and not to infinity) say at $\hbar\omega = mc^2$. We then get
2e2 mc2
2
X
En0 = |hn0
| p |ni| ( n 0 n )`n (20.414)
3~c3 m2 0 n0 n
n
Bethe evaluated this result numerically and obtained En0 = +1040 megacycles
(the 2P1/2 level turns out to be shifted downward) and the observed value equals
+1057 megacycles, which is remarkable agreement!
Taking into account both the Lamb shift and the hyperfine splitting we have
the level scheme shown in Figure 20.4 below for n = 2:
Figure 20.4: n = 2 Energy level structure for relativistic hydrogen
Finally, we tackle the problem of the negative energy states in the Dirac theory.
The properties of the positive energy states show remarkable agreement with
experiment. Can we simply ignore the negative energy states? The answer is no
because an arbitrary wave packet, as we saw earlier will always contain negative
energy components via interactions even if we start off only with positive energy
components.
Dirac proposed a clever way out of this dilemma: since spin 1/2 particles obey the exclusion principle, all one needs to do to ensure stability is to say that the
negative energy states are completely filled. Then a particle cannot make a tran-
sition from a positive to a negative energy state for this would put two particles
into the same (negative energy) state. The vacuum state in this picture consists
of an infinite sea of particles in negative energy states. The particle and charge
density at every point is infinite. This is not a problem for the physical theory
since Dirac contended that we only measure deviations from the vacuum. In the
absence of any potential, the charge density of the negative sea is uniform and
Dirac argued that this charge density can produce no forces, since by isotropy,
the forces have no special direction to point!
Now this theory has a very useful special property. Suppose that we remove a negative energy electron from the vacuum. What is left behind is a hole in the negative energy sea. Measured with respect to the vacuum, the hole would appear to have positive charge and positive energy, since it is the absence of negative charge and negative energy. Dirac interpreted it as a positron, which is the electron antiparticle.
Let me say that again..... an excited state of the vacuum arises as shown in the figure. A negative energy electron is excited into a positive energy state, leaving behind a hole with charge $-(-e) = +e$ and the same mass as the electron, which is the antiparticle. It looks like a positive charge since if we apply an electric field the infinite sea of electrons translates opposite to the field direction, which is unobservable since the sea is infinite. However, the hole seems to be traveling in the direction of the field like a positive charge!
In this way, antiparticles appear in the Dirac theory as unoccupied negative
energy states, which is very different from the way they appear in the spin zero
theory.
This Dirac hole theory gives a simple description for pair production. Suppose
that a photon of energy $\hbar\omega > 2mc^2$ traveling through the vacuum is absorbed by
a negative energy electron and the negative energy electron gets excited to a
positive energy state. What remains, as we have said, is a hole in the negative
energy sea, i.e., a positive energy positron and a positive energy electron. This
says that pair production is simply the excitation of a particle from a negative
to a positive energy state.
Since we could exchange the roles of positrons and electrons in the entire Dirac
theory, electrons would appear as holes in a positron sea. This forces us to
conclude that negative energy seas cannot have any physical reality. The hole
theory is simply a mathematical model that allows us to do the correct book-
keeping within the framework of a single-particle Dirac theory.
With a filled negative energy sea, the Dirac theory would become a many-
particle theory in which we are unable to take into account the interactions
between these particles. The Dirac theory gives valid results only when these
interactions can be neglected. For example, in the hydrogen atom, the mod-
ification of the Coulomb potential by vacuum polarization accounts for about
2.5% of the Lamb shift.
If we second-quantize the Dirac theory, we can treat both particles and antiparticles on the same basis.
The full relativistic quantum field theory of the electrons and positrons and
their interactions with photons was carried out by Feynman, et al in a theory
which is beyond the scope of these volumes.
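\[
\Phi(\vec r,t) = 0\ ,\qquad \nabla\cdot\vec A(\vec r,t) = 0 \tag{20.415}
\]
The electric and magnetic fields are given in terms of the vector potential (in this gauge) by
\[
\vec{\mathcal E}(\vec r,t) = -\frac{1}{c}\frac{\partial\vec A(\vec r,t)}{\partial t}\ ,\qquad \vec B(\vec r,t) = \nabla\times\vec A(\vec r,t) \tag{20.416}
\]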
The electromagnetic energy is given by
\[
E = \int d^3\vec r\ \frac{\mathcal E^2(\vec r,t) + B^2(\vec r,t)}{8\pi} \tag{20.417}
\]
and the rate and direction of energy transport is given by the Poynting vector
\[
\vec S(\vec r,t) = \frac{c}{4\pi}\,\vec{\mathcal E}(\vec r,t)\times\vec B(\vec r,t) \tag{20.418}
\]
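The radiation field generated by a classical current $\vec j(\vec r,t)$ is given by
\[
\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\vec A(\vec r,t) = -\frac{4\pi}{c}\,\vec j_\perp(\vec r,t) \tag{20.419}
\]
where $\perp$ means the transverse/divergence-free part.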
$\omega = ck$
$\vec\lambda$ = polarization vector with $|\vec\lambda|^2 = 1$
$A_{\vec k\vec\lambda}$ = amplitude = constant
where the sum is over all allowed $\vec k$ values and over the two orthogonal $\vec\lambda$ polarizations for each $\vec k$ such that $\vec\lambda\cdot\vec k = 0$ and we have assumed that the universe is a very large box of volume $V$. The total energy in this wave solution is
\[
E = \sum_{\vec k\vec\lambda}\frac{\omega^2}{2\pi c^2}\,|A_{\vec k\vec\lambda}|^2 \tag{20.425}
\]
How does this classical electromagnetic field interact with a quantum mechanical
particle?
The wave functions or state vectors differ by a phase factor that depends on
space and time and thus, the invariance is LOCAL rather than GLOBAL (a
phase factor independent of space and time).
(whose expectation value IS gauge invariant), that represents a measurable
quantity.
20.7.3 Interactions
We now write
H = H0 + Hint (20.432)
where
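\[
H_0 = \frac{\vec p^{\,2}}{2m} + V(\vec r,t) \tag{20.433}
\]
is the Hamiltonian in the absence of electromagnetic fields and
\[
H_{int} = -\frac{e}{2mc}\left[\vec p\cdot\vec A(\vec r,t) + \vec A(\vec r,t)\cdot\vec p\right] + \frac{e^2}{2mc^2}\,\vec A^{\,2}(\vec r,t) + e\Phi(\vec r,t) \tag{20.434}
\]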
is the operator giving the interaction between matter and radiation.
and
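\[
H_{int} = \sum_{i=1}^{N}\left\{-\frac{e}{2mc}\left[\vec p_i\cdot\vec A(\vec r_i,t) + \vec A(\vec r_i,t)\cdot\vec p_i\right] + \frac{e^2}{2mc^2}\,\vec A^{\,2}(\vec r_i,t) + e\Phi(\vec r_i,t)\right\} \tag{20.438}
\]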
We now define a particle number density
\[
\rho(\vec r) = \sum_i \delta(\vec r - \vec r_i) \tag{20.439}
\]
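Finally, we have
\[
\sum_{i=1}^{N}\frac{e}{2mc}\left\{\vec p_i\cdot\vec A(\vec r_i,t) + \vec A(\vec r_i,t)\cdot\vec p_i\right\} = \frac{e}{c}\int d^3\vec r\ \vec j(\vec r)\cdot\vec A(\vec r,t) \tag{20.443}
\]
Since
\[
\vec v_i = \frac{\vec p_i}{m} - \frac{e}{mc}\vec A \tag{20.444}
\]
when an electromagnetic field is present, the true current operator is
\[
\vec J(\vec r) = \vec j(\vec r) - \frac{e}{mc}\vec A(\vec r,t)\rho(\vec r) = (\text{paramagnetic} + \text{diamagnetic})\ \text{currents} \tag{20.445}
\]
and therefore,
\[
H_{int} = \int d^3\vec r\left[-\frac{e}{c}\,\vec j(\vec r)\cdot\vec A(\vec r,t) + \frac{e^2}{2mc^2}\,\rho(\vec r)\vec A^{\,2}(\vec r,t) + e\Phi(\vec r,t)\rho(\vec r)\right] \tag{20.446}
\]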
~ as a linear superposition of monochromatic plane waves we then have
For A
Z !
e 3 1 X p~i p~i
Hint = d ~r (~r ~ri ) + (~r ~ri )
c 2 i m m
" #
i~ r it
k~ i~
k~
r +it
X e e
A~k~~ + A~k~~
~ ~
V V
k
~i i~ ~i i~
!
e X X A~k~~ pm e k~ri it + A~k~~ pm e k~ri +it
= ~ ~
2c V ~ ~ i +A~k~~eik~ri it pm
~i
+ A~k~~ eik~ri +it pm
~i
k
e Xh i
= A~k~~j~k ~eit + A~k~~j~k ~ eit (20.448)
c V ~~
k
where
Z
~j~ = 1 p~i i~k~ri ~ p~i ~
X
k e + eik~ri = d3~reik~r~j(~r) (20.449)
2 i m m
2 e2 2 2
abs = (n 0 ~) 2 A~k~ hn| ~j~k ~ |0i (20.450)
0n;~
k~ ~ Vc
To find the total rate of transition we must sum over ~k and ~ (2 polarizations
for each ~k) to get
2 X e2 2 2
abs
0n = (n 0 ~) 2 A~k~ hn| ~j~k ~ |0i (20.451)
~V c
~
k~
so that
2e2 2
Z 2
A~ ~ 2 hn| ~j~ ~ |0i
X
abs
0n = 2 2 d k k (20.453)
~ c 2c3
~
where
n 0
= (from the function) (20.454)
~
If the incident light beam subtends a solid angle $d\Omega$ and it is polarized with polarization vector $\vec\lambda$, then the total rate of energy transport in the beam is the time average of the Poynting vector which is given by
1 X 2 4
Z
2 2
A~k~ = d d 4
A~k~ (20.455)
V 2c (2c)
~
k
Now
4 A~ ~ 2
I() = d k
(2c)4
= intensity of the incident beam per unit frequency (20.456)
In a similar way
4 2 e2 ~ ~ 2
ind
n0
emis
= I() j~k |0i (20.457)
~2 c 2
hn|
Since
hn| ~j~k ~ |0i = hn| ~j~k ~ |0i (20.458)
we have
abs ind emis
0n = n0 (20.459)
(this is the origin of the Einstein A and B coefficients).
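Now a photon of frequency $\omega$ has energy $\hbar\omega$ and therefore, the total energy in the incident beam is
\[
E = \sum_{\vec k\vec\lambda}\hbar\omega\, N_{\vec k\vec\lambda} \tag{20.460}
\]
where $N_{\vec k\vec\lambda}$ = the number of photons in the $(\vec k, \vec\lambda)$ mode in the beam. But we already have
\[
E = \sum_{\vec k\vec\lambda}\frac{\omega^2}{2\pi c^2}\,|A_{\vec k\vec\lambda}|^2 \tag{20.461}
\]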
20.7.5 Quantized Radiation Field and Spontaneous Emission
Up to this point we have been treating the electromagnetic field classically as a
wave. We have mentioned the idea of photons, but have not created any formal
quantum mechanical structure to describe them, i.e., we have been considering
what happens to the atom and ignoring what is happening to the EM field
during these processes.
To bring out the structure of the theory in terms of photons, we must now
describe these processes in terms of state vectors, such that, in the absorption
process the atom makes a transition from $|0\rangle \to |n\rangle$ while the electromagnetic
field makes a transition from an initial state to a state with one less photon (it
has been absorbed).
All of our development so far has involved what is physically called an incoherent
beam of light.
We related $A_{\vec k\vec\lambda}$ and $N_{\vec k\vec\lambda}$ so that knowledge of the $N_{\vec k\vec\lambda}$ clearly does not imply any information about the relative phases of the $A_{\vec k\vec\lambda}$ which is the meaning of the term incoherent.
where, as before, the $N_{\vec k\vec\lambda}$ = the number of photons in the mode $(\vec k, \vec\lambda)$.
Any two of these states are orthogonal if they differ in the number of photons
in any mode.
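The final state of the electromagnetic field after absorption of a photon in the mode $(\vec k, \vec\lambda)$ is
\[
|N_{\vec k_1\vec\lambda_1}, N_{\vec k_2\vec\lambda_2}, \ldots, N_{\vec k\vec\lambda} - 1, \ldots\rangle \tag{20.465}
\]
We assume that there exists some $H_{int}$ that causes both transitions (atom and electromagnetic field) as it couples the electromagnetic field to matter. We define
\[
\text{initial state} = |0\rangle\,|N_{\vec k_1\vec\lambda_1}, N_{\vec k_2\vec\lambda_2}, \ldots, N_{\vec k\vec\lambda}, \ldots\rangle \tag{20.466}
\]
\[
\text{final state} = |n\rangle\,|N_{\vec k_1\vec\lambda_1}, N_{\vec k_2\vec\lambda_2}, \ldots, N_{\vec k\vec\lambda} - 1, \ldots\rangle \tag{20.467}
\]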
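so that
\[
E_{initial} = \varepsilon_0 + \sum_{\vec k'\vec\lambda'}\hbar ck'\, N_{\vec k'\vec\lambda'} \tag{20.468}
\]
\[
E_{final} = \varepsilon_n + \sum_{\vec k'\vec\lambda'}\hbar ck'\, N_{\vec k'\vec\lambda'} - \hbar ck \tag{20.469}
\]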
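The transition rate between the two states is given by Fermi's golden rule as
\[
\frac{2\pi}{\hbar}\,\delta(\varepsilon_n - \varepsilon_0 - \hbar\omega)\,\left|\langle final|\, H_{int}\, |initial\rangle\right|^2 \tag{20.470}
\]
This must be the same as our earlier result (20.463) which implies that we must have
\[
\left|\langle final|\, H_{int}\, |initial\rangle\right|^2 = \frac{e^2}{Vc^2}\,|A_{\vec k\vec\lambda}|^2\,\left|\langle n|\,\vec j_{\vec k}\cdot\vec\lambda\,|0\rangle\right|^2
= \frac{e^2}{Vc^2}\,\frac{2\pi\hbar c^2}{\omega}\, N_{\vec k\vec\lambda}\,\left|\langle n|\,\vec j_{\vec k}\cdot\vec\lambda\,|0\rangle\right|^2 \tag{20.471}
\]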
This implies that the as yet undetermined operator $H_{int}$ must have the following
properties:
1. it must include a part $\vec j_{-\vec k}\cdot\vec\lambda$ that acts on the atom
2. it must have a part that decreases the number of photons in the $(\vec k, \vec\lambda)$
mode by 1
3. it must be Hermitian
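One way of doing this is to write
$$ H_{int} = -\frac{e}{c\sqrt V}\sum_{\vec k'\vec\lambda'}\left[\vec j_{-\vec k'}\cdot\vec\lambda'\,A^{(op)}_{\vec k'\vec\lambda'} + \vec j_{\vec k'}\cdot\vec\lambda'^{*}\,A^{(op)+}_{\vec k'\vec\lambda'}\right] \qquad (20.472) $$
where $A^{(op)}_{\vec k\vec\lambda}$ reduces the number of photons in the $(\vec k, \vec\lambda)$ mode by 1. It is an
annihilation operator for a photon in the mode $(\vec k, \vec\lambda)$.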
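The second term is required to make $H_{int}$ Hermitian. Using this model we then
have
$$ \langle final|\,H_{int}\,|initial\rangle = -\frac{e}{c\sqrt V}\,\langle n|\,\vec j_{-\vec k}\cdot\vec\lambda\,|0\rangle\,\langle N_{\vec k_1\vec\lambda_1}, \ldots, N_{\vec k\vec\lambda} - 1, \ldots|\,A^{(op)}_{\vec k\vec\lambda}\,|N_{\vec k_1\vec\lambda_1}, \ldots, N_{\vec k\vec\lambda}, \ldots\rangle \qquad (20.473) $$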
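For agreement with the earlier result we must have
$$ \langle N_{\vec k_1\vec\lambda_1}, \ldots, N_{\vec k\vec\lambda} - 1, \ldots|\,A^{(op)}_{\vec k\vec\lambda}\,|N_{\vec k_1\vec\lambda_1}, \ldots, N_{\vec k\vec\lambda}, \ldots\rangle = \sqrt{\frac{2\pi\hbar c^2}{\omega}}\,\sqrt{N_{\vec k\vec\lambda}} \qquad (20.474) $$
This matrix element of $A^{(op)}_{\vec k\vec\lambda}$ corresponds to the $A_{\vec k\vec\lambda}$ term in the classical field
picture.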
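we have
$$ H_{int} = \int d^3\vec r\,\left[-\frac{e}{c}\,\vec j(\vec r)\cdot\vec A^{(op)}(\vec r) + \frac{e^2}{2mc^2}\,\rho(\vec r)\,\left(\vec A^{(op)}(\vec r)\right)^2\right] \qquad (20.479) $$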
In the interaction representation $\vec A^{(op)}(\vec r, t)$ has the time dependence
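The similarity of the operator algebra to the $a$ and $a^+$ problem then allows us to write
$$ \left[A^{(op)}_{\vec k\vec\lambda}, A^{(op)+}_{\vec k'\vec\lambda'}\right] = \frac{2\pi\hbar c^2}{\omega}\,\delta_{\vec k\vec k'}\,\delta_{\vec\lambda\vec\lambda'} \quad , \quad \left[A^{(op)}_{\vec k\vec\lambda}, A^{(op)}_{\vec k'\vec\lambda'}\right] = 0 \qquad (20.482) $$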
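and
$$ |N_{\vec k_1\vec\lambda_1}, N_{\vec k_2\vec\lambda_2}, \ldots, N_{\vec k\vec\lambda}, \ldots\rangle = \frac{1}{\sqrt{N_{\vec k\vec\lambda}!}}\left(A^{(op)+}_{\vec k\vec\lambda}\right)^{N_{\vec k\vec\lambda}}|N_{\vec k_1\vec\lambda_1}, N_{\vec k_2\vec\lambda_2}, \ldots, 0, \ldots\rangle \qquad (20.483) $$
and $H = H_0 + H_{em} + H_{int}$ where $H_0$ = Hamiltonian for the electrons.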
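We then have
$$ e^{\frac{i}{\hbar}H_{em}t}A^{(op)}_{\vec k\vec\lambda}e^{-\frac{i}{\hbar}H_{em}t}\,|\ldots, N_{\vec k\vec\lambda}, \ldots\rangle $$
$$ = e^{\frac{i}{\hbar}H_{em}t}A^{(op)}_{\vec k\vec\lambda}\,e^{-i\omega(N_{\vec k\vec\lambda}+\frac12)t}\,|\ldots, N_{\vec k\vec\lambda}, \ldots\rangle $$
$$ = e^{\frac{i}{\hbar}H_{em}t}\,e^{-i\omega(N_{\vec k\vec\lambda}+\frac12)t}\sqrt{N_{\vec k\vec\lambda}}\,|\ldots, N_{\vec k\vec\lambda}-1, \ldots\rangle $$
$$ = e^{i\omega(N_{\vec k\vec\lambda}-1+\frac12)t}\,e^{-i\omega(N_{\vec k\vec\lambda}+\frac12)t}\sqrt{N_{\vec k\vec\lambda}}\,|\ldots, N_{\vec k\vec\lambda}-1, \ldots\rangle $$
$$ = e^{-ickt}A^{(op)}_{\vec k\vec\lambda}\,|\ldots, N_{\vec k\vec\lambda}, \ldots\rangle \qquad (20.484) $$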
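or
$$ e^{\frac{i}{\hbar}H_{em}t}A^{(op)}_{\vec k\vec\lambda}e^{-\frac{i}{\hbar}H_{em}t} = e^{-ickt}A^{(op)}_{\vec k\vec\lambda} \qquad (20.485) $$
and similarly
$$ e^{\frac{i}{\hbar}H_{em}t}A^{(op)+}_{\vec k\vec\lambda}e^{-\frac{i}{\hbar}H_{em}t} = e^{ickt}A^{(op)+}_{\vec k\vec\lambda} \qquad (20.486) $$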
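The time dependence in (20.485) and (20.486) can be checked numerically for a single mode. The sketch below is only an illustration under stated assumptions: it works in a truncated number basis, sets hbar = 1, and uses the dimensionless oscillator operator a (which is proportional to the field operator for that mode) together with H_em = omega (N + 1/2). The names dim, omega and t are illustrative choices, not part of the text.

```python
import numpy as np

# Truncated single-mode Fock space |0>, |1>, ..., |dim-1> (dim is an illustrative choice).
dim = 6
omega = 1.3   # mode frequency omega = c*k, arbitrary units with hbar = 1
t = 0.7       # arbitrary time

# Dimensionless annihilation operator: a|N> = sqrt(N)|N-1>.
a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
a_dag = a.conj().T

# H_em = omega*(N + 1/2) is diagonal in the number basis, so exp(-i H_em t) is diagonal too.
energies = omega * (np.arange(dim) + 0.5)
U = np.diag(np.exp(-1j * energies * t))        # exp(-i H_em t)

a_t = U.conj().T @ a @ U                       # exp(+i H_em t) a exp(-i H_em t)
a_dag_t = U.conj().T @ a_dag @ U

annihilation_ok = np.allclose(a_t, np.exp(-1j * omega * t) * a)
creation_ok = np.allclose(a_dag_t, np.exp(+1j * omega * t) * a_dag)
print(annihilation_ok, creation_ok)
```

Because only neighboring number states are connected by a, the phase difference is always one quantum of energy, which is why the truncation of the basis does not affect the check.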
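Putting this all together we have
$$ \vec A^{(op)}(\vec r, t) = \sum_{\vec k\vec\lambda}\left[A^{(op)}_{\vec k\vec\lambda}\,\vec\lambda\,\frac{e^{i\vec k\cdot\vec r - i\omega t}}{\sqrt V} + A^{(op)+}_{\vec k\vec\lambda}\,\vec\lambda^{*}\,\frac{e^{-i\vec k\cdot\vec r + i\omega t}}{\sqrt V}\right] \qquad (20.487) $$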
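We now apply the formalism to the emission process. This corresponds to the
transition between the states
$$ \text{initial state} = |0\rangle\,|N_{\vec k_1\vec\lambda_1}, N_{\vec k_2\vec\lambda_2}, \ldots, N_{\vec k\vec\lambda}, \ldots\rangle \qquad (20.488) $$
$$ \text{final state} = |n\rangle\,|N_{\vec k_1\vec\lambda_1}, N_{\vec k_2\vec\lambda_2}, \ldots, N_{\vec k\vec\lambda} + 1, \ldots\rangle \qquad (20.489) $$
so that
$$ E_{initial} = \varepsilon_0 + \sum_{\vec k'\vec\lambda'} \hbar c k'\,N_{\vec k'\vec\lambda'} \qquad (20.490) $$
$$ E_{final} = \varepsilon_n + \sum_{\vec k'\vec\lambda'} \hbar c k'\,N_{\vec k'\vec\lambda'} + \hbar c k \qquad (20.491) $$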
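We get
$$ \Gamma^{emis}_{n\to 0;\vec k\vec\lambda} = \frac{4\pi^2 e^2}{\omega V}\,\delta(\varepsilon_n - \varepsilon_0 - \hbar c k)\,\left|\langle 0|\,\vec j_{\vec k}\cdot\vec\lambda^{*}\,|n\rangle\right|^2\left(N_{\vec k\vec\lambda} + 1\right) \neq \Gamma^{abs}_{0\to n;\vec k\vec\lambda} \qquad (20.494) $$
which disagrees with the classical field result but agrees with experiment.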
The extra $+1$ implies that there is an emission process that can take place even if
there is no external field present, i.e., even when $N_{\vec k\vec\lambda} = 0$.
This process is called spontaneous emission. It is a clear victory for the quantum
approach.
20.8 Problems
20.8.1 Dirac Spinors
p
The Dirac spinors are (with E = p~2 + m2 )
p
/+m p/+m 0
s
u(p, s) = , v(p, s) =
E+m 0 E + m s
(a) Show that the Dirac equation is invariant in form, i.e., i 0 m 0 (x0 ) =
0, provided
S 1 () S() =
(b) Find the representation of 5 = 0 1 , 5 and = 21 i [ , ]. Are
they independent? Define a minimal set of matrices which form a complete
basis.
(c) Find the plane wave solutions + (x) = u(p1 )eipx and (x) = v(p1 )eipx
in 1 + 1 dimensions, normalized to uu = vv = 2m (where u = u+ 0 ).
(a) T r( ) = 0
(b) T r( ) = 4g
(c) T r( ) = 0
(d) T r( ) = 4g g 4g g + 4g g
1 1
R (x) = (1 + 5 )(x) , L (x) = (1 5 )(x)
2 2
In the case of a massless particle (m=0):
(a) Show that the Dirac equation (i/ eA)/ = 0 does not couple R (x) to
L (x), i.e., they satisfy independent equations. Specifically, show that in
the chiral representation of the Dirac matrices
0 I 0
0 = , =
I 0 0
we have
R ipx
= e
L
(b) For the free Dirac equation (A = 0) show that R and L are eigen-
states of the helicity operator 21 p with positive and negative helicity,
respectively, for plane wave states with p0 > 0.
20.8.6 Gyromagnetic Ratio for the Electron
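(a) Reduce the Dirac equation $(i\not\!\partial - e\not\!\!A - m)\psi = 0$ by multiplying it with
$(i\not\!\partial - e\not\!\!A + m)$ to the form
$$ \left[(i\partial - eA)^2 - \frac{e}{2}\,\sigma^{\mu\nu}F_{\mu\nu} - m^2\right]\psi = 0 $$
where $\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu]$ and the field strength $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$.
(b) Show that the dependence on the magnetic field $\vec B = \nabla\times\vec A$ in the spin-dependent
term $\sigma^{\mu\nu}F_{\mu\nu}$ is of the form $-(ge/2m)\,\frac12\vec\Sigma\cdot\vec B$ when the kinetic
energy is normalized to $-\nabla^2/2m$ ($\vec\Sigma = \gamma^5\gamma^0\vec\gamma$ is the spin matrix).
Determine the value of the gyromagnetic ratio $g$ for the electron.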
(2) Show explicitly that the solutions to the Dirac equation are eigenvectors
of the helicity operator: h i
~ p =
20.8.11 Gyromagnetic Ratio
Show that in the non-relativistic limit the motion of a spin 1/2 fermion of charge
$e$ in the presence of an electromagnetic field $A^\mu = (A^0, \vec A)$ is described by
$$ \left[\frac{(\vec p - e\vec A)^2}{2m} - \frac{e}{2m}\,\vec\sigma\cdot\vec B + eA^0\right]\chi = E\chi $$
where $\vec B$ is the magnetic field, $\sigma_i$ are the Pauli matrices and $E = p^0 - m$. Identify
the g-factor of the fermion and show that the Dirac equation predicts the correct
gyromagnetic ratio for the fermion. To write down the Dirac equation in the
presence of an electromagnetic field substitute: $p^\mu \to p^\mu - eA^\mu$.
20.8.12 Properties of 5
Show that:
(a) 5 is a pseudoscalar.
(a) 5
(b) 5 5
(c) 5
(d) 5 5
(e)
20.8.14 A Commutator
Explicitly evaluate the commutator of the Dirac Hamiltonian with the orbital
angular momentum operator L for a free particle.
20.8.16 Matrix Representation of Dirac Matrices
The Dirac matrices must satisfy the anti-commutator relationships:
$$ \{\alpha_i, \alpha_j\} = 2\delta_{ij} \quad , \quad \{\alpha_i, \beta\} = 0 \quad \text{with } \beta^2 = 1 $$
(1) Show that the $\alpha_i$, $\beta$ are Hermitian, traceless matrices with eigenvalues $\pm 1$
and even dimensionality.
(2) Show that, as long as the mass term $m$ is not zero and the matrix $\beta$ is
needed, there is no 2 × 2 set of matrices that satisfy all the above relationships.
Hence the Dirac matrices must be of dimension 4 or higher.
First show that the set of matrices $\{I, \vec\sigma\}$ can be used to express any 2 × 2
matrix, i.e., the coefficients $c_0$, $c_i$ always exist such that any 2 × 2 matrix
can be written as:
$$ \begin{pmatrix} A & B \\ C & D \end{pmatrix} = c_0 I + c_i\sigma_i $$
Having shown this, you can pick an intelligent choice for the $\alpha_i$ in terms
of the Pauli matrices, for example $\alpha_i = \sigma_i$, which automatically obeys
$\{\alpha_i, \alpha_j\} = 2\delta_{ij}$, and express $\beta$ in terms of $\{I, \vec\sigma\}$ using the relation above.
Show then that there is no 2 × 2 matrix $\beta$ that satisfies $\{\alpha_i, \beta\} = 0$.
satisfy all the Dirac conditions of Problem 20.16. Hence, they form just
another representation of the Dirac matrices, the Weyl representation,
which is different than the standard Pauli-Dirac representation.
(2) Show that the Dirac matrices in the Weyl representation are
0 ~ 0 I
~ = , 0 =
~ 0 I 0
0 1 2 3 I 0
(3) Show that in the Weyl representation 5 = i =
0 I
p~ + m] = E in the particle rest frame
(4) Solve the Dirac equation [~
using the Weyl representation.
(5) Compute the result of the chirality operators
1 5
2
when they are acting on the Dirac solutions in the Weyl representation.
20.8.18 Total Angular Momentum
Use the Dirac Hamiltonian in the standard Pauli-Dirac representation
~ p~ + m
H=
to compute [H, L] and [H, ] and show that they are zero. Use the results to
show that:
[H, L + /2] = 0
where the components of the angular momentum operator are given by:
Li = ijk xj pk
|i
= cx px + cy py + cz pz + mc2 |i
i~
t
Find all solutions and discuss their meaning. The identity
$$ (\vec\sigma\cdot\vec A)(\vec\sigma\cdot\vec B) = \vec A\cdot\vec B + i\,\vec\sigma\cdot(\vec A\times\vec B) $$
will be useful.
Chapter 21
We assume the existence of an infinite dimensional Hilbert space and that for
every possible physical state there is a vector in the space. We choose the vectors
$|\vec p\rangle$ corresponding to a particle with momentum $\vec p$ as basis vectors of length
1. The Hilbert space basis is thus:
$|0\rangle$ corresponding to the vacuum
$|\vec p_1\rangle$ corresponding to a particle of momentum $\vec p_1$
$|\vec p_2\rangle$ corresponding to a particle of momentum $\vec p_2$
.....................................................................
$|\vec p_1, \vec q_1\rangle$ corresponding to two particles, one of momentum $\vec p_1$, the other $\vec q_1$
and so on
The basis vectors are orthonormal, i.e.,
$$ \langle\vec p\,|\,\vec q\rangle = \delta_{\vec p\vec q} \qquad (21.1) $$
Since the range of momenta is infinite the subspace of the single particles is
infinite dimensional.
For mathematical convenience, we will assume that the universe is a cube with
volume V and we allow only those wave functions whose value on the boundary
is the same as the value on the opposite boundary (called periodic boundary
conditions). This means that we are working with an infinite, but denumerable,
dimensional Hilbert space, where the momenta are restricted to the discrete
values ($L$ = length of a side of the cube) given by
$$ p_1 = \frac{2\pi n_1}{L} \ , \ p_2 = \frac{2\pi n_2}{L} \ , \ p_3 = \frac{2\pi n_3}{L} \quad , \quad n_1, n_2, n_3 = 0, \pm 1, \pm 2, \ldots \qquad (21.2) $$
At the end of any calculations, we take the limit $V \to \infty$, thus returning to the
true continuous case. In the limit any sum over momenta becomes an integral
by the rule
$$ \sum_{\vec p} \to \frac{V}{(2\pi)^3}\int d^3p \qquad (21.3) $$
The factor
$$ \frac{V}{(2\pi)^3} \qquad (21.4) $$
arises because in a little box $d^3p$ there will be that many possible states (possible
momentum values in phase space).
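The replacement of a momentum sum by an integral with weight V/(2π)^3 can be illustrated numerically. The sketch below is an illustration only: the Gaussian test function f(p) = exp(-p^2/2), the box length L = 20, and the cutoff on the integers n are all assumptions chosen for convenience. It compares the discrete sum over the allowed momenta p_i = 2π n_i / L with V/(2π)^3 times the exact integral, which is (2π)^{3/2} for this f.

```python
import numpy as np

L = 20.0                      # side of the cube (illustrative value)
V = L**3

# Allowed momentum components p = 2*pi*n/L with n = 0, +-1, +-2, ...
n = np.arange(-200, 201)      # cutoff large enough that exp(-p^2/2) is negligible beyond it
p = 2.0 * np.pi * n / L

# f(p) = exp(-|p|^2/2) factorizes, so the triple sum is the cube of a single sum.
one_dim_sum = np.sum(np.exp(-p**2 / 2.0))
discrete_sum = one_dim_sum**3

# Continuum rule: sum over p  ->  V/(2*pi)^3 * integral d^3p f(p),
# where the integral of exp(-|p|^2/2) over all space equals (2*pi)^(3/2).
continuum = V / (2.0 * np.pi)**3 * (2.0 * np.pi)**1.5

relative_difference = abs(discrete_sum - continuum) / continuum
print(discrete_sum, continuum, relative_difference)
```

For a smooth, rapidly decaying test function the two numbers agree very closely already at moderate L, which is the sense in which the sum "becomes" the integral as V grows.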
In addition we have
$$ \lim_{V\to\infty}\frac{V}{(2\pi)^3}\,\delta_{\vec p_i\vec p_j} = \delta^{(3)}(\vec p_i - \vec p_j) \qquad (21.5) $$
What we are doing here is of course quite horrible for any relativistic theory.
We are violating Lorentz invariance.
Since quantum mechanics will form the basis of our derivation of QFT and the
Feynman rules, a review(from a different point of view) of the basic concepts is
in order.
where
$$ E_p = \text{non-relativistic kinetic energy} = \frac12 m\vec v^{\,2} = \frac{\vec p^{\,2}}{2m} \qquad (21.7) $$
and the normalization is such that the total probability (integrating over the
whole universe of volume $V$) is 1. This wave function describes one particle in
the universe and it is a solution of Schrödinger's equation
$$ -\frac{1}{2m}\nabla^2\psi = i\frac{\partial\psi}{\partial t} \qquad (21.8) $$
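The relativistic generalization of the wave function is simple:
$$ \psi(x) = \frac{1}{\sqrt V}\,e^{-i(px)} = \frac{1}{\sqrt V}\,e^{i(\vec p\cdot\vec x - Et)} \qquad (21.9) $$
where
$$ E = \text{relativistic energy} = \sqrt{\vec p^{\,2} + m^2} \qquad (21.10) $$
and the wave function is now a solution of the Klein-Gordon equation:
$$ \left(\frac{\partial^2}{\partial t^2} - \nabla^2 + m^2\right)\psi(x) = -\left(E^2 - \vec p^{\,2} - m^2\right)\psi(x) = 0 \qquad (21.11) $$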
In making the relativistic transition, the definition of probability needs some
revision. In the non-relativistic theory the probability to find a particle
at some point $x$ within a small box $d^3x$ is given by
$$ |\psi(x)|^2\,d^3x \qquad (21.12) $$
where the integral of this expression over the whole volume must be equal to
1. However, since the volume is not a Lorentz invariant, we cannot maintain
this definition of probability density. Probability density does not need to be
a Lorentz invariant, only the total probability must be Lorentz invariant (and
equal to 1). This is very much like electric charge. In fact, if we assume that, like
electric charge density and current, the probability density ($P_0$) and its associated
probability current ($\vec P = (P_1, P_2, P_3)$) satisfy a conservation (continuity)
equation, then we have (pushing on the analogy)
$$ \partial_\mu P^\mu = 0 \quad \leftrightarrow \quad \partial_\mu j^\mu = 0 \qquad (21.13) $$
It then follows that the total probability (like total electric charge) is given by
$$ P = \int d^3x\,P_0(x) \qquad (21.14) $$
In the same manner that we can prove total charge is constant, we can then
prove that $P$ is a constant or $\partial P/\partial t = 0$ (this assumes that $\vec P$ is zero on the
boundary surface that appears when the divergence is converted to a surface integral).
For consistency, we must have that the charge density is proportional to the
probability density.
The formal definition of $P_\mu$ is
$$ P_\mu(x) = i\,\psi^*(x)\frac{\partial\psi(x)}{\partial x^\mu} - i\,\psi(x)\frac{\partial\psi^*(x)}{\partial x^\mu} \qquad (21.15) $$
Since particles do have a location in space, we must now consider states that
are not pure plane waves. A particle that we know is precisely at the point $\vec x$
at time $x_0$ (we will take $x_0 = 0$) is described at time 0 by a $\delta$ function
which we generalize to
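$$ \psi(\vec x, t) = \psi(x) = \frac{C}{(2\pi)^3}\int d^3p\,e^{-i(px)} \qquad (21.19) $$
where
$$ px = p_0x_0 - \vec p\cdot\vec x = Et - \vec p\cdot\vec x \quad , \quad E^2 = \vec p^{\,2} + m^2 \qquad (21.20) $$
so that it satisfies the Klein-Gordon equation.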
If the location is not a $\delta$ function but more smeared out, then we have, in
general:
$$ \psi(x) = \frac{C}{(2\pi)^3}\int d^3p\,f(p)\,e^{-i(px)} \quad , \quad p_0^2 = \vec p^{\,2} + m^2 \qquad (21.21) $$
For example, if $f(p) = 1$, we have a $\delta$ function in space and if $f(p) = \delta^{(3)}(\vec p - \vec q)$
we have a state with sharply defined momentum $\vec q$. Therefore, the vector in
Hilbert space that corresponds to a sharply defined location at some time will
be a superposition of sharp momentum states with equal weight:
$$ |\vec x, x_0\rangle = \sum_{\vec p} C\,e^{i(px)}\,|\vec p\rangle \qquad (21.22) $$
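$p'$ where $Lp = p'$. It is a 4 × 4 matrix in a 4-dimensional space. In Hilbert space,
it must transform the arguments of all of the different ket vectors, i.e.,
$$ |\vec p_1\rangle \to |\vec p_1\,'\rangle $$
$$ |\vec p_2\rangle \to |\vec p_2\,'\rangle $$
$$ |\vec p_1, \vec q_1\rangle \to |\vec p_1\,', \vec q_1\,'\rangle $$
and so on.
and
$$ \sum_{\vec p} e^{i(px)}\,|\vec p\rangle \to \sum_{\vec p} e^{i(px) + i(pb)}\,|\vec p\rangle \qquad (21.24) $$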
General Rule
To every Lorentz transformation there corresponds a transformation in Hilbert
space.
$$ L_1 \to X_1 $$
$$ L_2 \to X_2 $$
$$ L_3 = L_1L_2 \to X_1X_2 = X_3 $$
$\vec k$, which corresponds to the basis vector $|\vec p, \vec k\rangle$ in Hilbert space, and the final state
may then contain an electron of momentum $\vec p\,'$ and a proton of momentum
$\vec k\,'$, corresponding to the vector $|\vec p\,', \vec k\,'\rangle$ in Hilbert space. It seems
like we might think of this as if the physical system corresponds to a vector in
Hilbert space that rotates as a function of time from $|\vec p, \vec k\rangle$ to $|\vec p\,', \vec k\,'\rangle$.
We cannot, however, describe things in this way. The reason is that scattering
clearly involves interaction between particles, and we have set up our Hilbert
space for free particles only. We must rethink our procedures if we want to
introduce interactions.
Right now it suffices to say that physical quantities will correspond to the
elements of a certain matrix defined in Hilbert space. What we need is basic
building blocks, in some sense matrices like the Pauli spin matrices that can be
used to describe the full set of 2 × 2 matrices. The matrices that we need will
fulfill certain basic requirements, in particular the requirement of locality. This
is the requirement that physical processes cannot influence each other if they
are outside each other's light cone, i.e., if speeds larger than that of light are
needed to connect the events.
This we hope will be achieved by insisting that the operator (matrix) describing
a process at the space-time point $x$ will commute with a similar operator for
the space-time point $y$ if $x$ and $y$ are outside each other's light cone.
What would be the most elementary building blocks that can be used to build
up any 3 × 3 matrix? We can do this with two matrices:
$$ a = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & \sqrt2 \\ 0 & 0 & 0 \end{pmatrix} \quad , \quad a^\dagger = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & \sqrt2 & 0 \end{pmatrix} \qquad (21.25) $$
This can be seen by working out the various products involving $a$ and $a^\dagger$ and
showing that any 3 × 3 matrix can be obtained as a linear combination of these
matrices and associated products. The factor $\sqrt2$ was introduced for reasons
that will become transparent later.
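The above example can be used for the subspace referring to a definite momentum
$\vec p$. A matrix $a$ as above can be constructed in the subspace of states:
$$ |0\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \quad , \quad |\vec p\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \quad , \quad |\vec p, \vec p\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \qquad (21.26) $$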
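These matrices represent a universe with $n$ particles all with the same momentum
$\vec p$ where $0 \le n \le 2$ as can be seen from the relations
$$ a\,|0\rangle = 0 \ , \quad a\,|\vec p\rangle = |0\rangle \ , \quad a\,|\vec p, \vec p\rangle = \sqrt2\,|\vec p\rangle $$
$$ a^\dagger|0\rangle = |\vec p\rangle \ , \quad a^\dagger|\vec p\rangle = \sqrt2\,|\vec p, \vec p\rangle \ , \quad a^\dagger|\vec p, \vec p\rangle = 0 $$
$$ a^\dagger a\,|0\rangle = 0 \ , \quad a^\dagger a\,|\vec p\rangle = 1\,|\vec p\rangle \ , \quad a^\dagger a\,|\vec p, \vec p\rangle = 2\,|\vec p, \vec p\rangle $$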
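The relations above, and the claim that a and its transpose generate every 3 × 3 matrix, can be verified directly. The following is a small sketch, not part of the original derivation: the variable names are illustrative, and the spanning check simply collects all products of up to four factors of a and its transpose (an assumed, sufficient word length) and computes the rank of the resulting set.

```python
import itertools
import numpy as np

s2 = np.sqrt(2.0)
a = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, s2],
              [0.0, 0.0, 0.0]])
a_dag = a.T                      # a is real, so the adjoint is just the transpose

# Basis states |0>, |p>, |p,p> as the rows of the identity matrix.
vac, one, two = np.eye(3)

checks = {
    "a|0> = 0":               np.allclose(a @ vac, 0.0),
    "a|p> = |0>":             np.allclose(a @ one, vac),
    "a|p,p> = sqrt2 |p>":     np.allclose(a @ two, s2 * one),
    "a+|0> = |p>":            np.allclose(a_dag @ vac, one),
    "a+|p> = sqrt2 |p,p>":    np.allclose(a_dag @ one, s2 * two),
    "a+|p,p> = 0":            np.allclose(a_dag @ two, 0.0),
}

# a+ a counts the particles: eigenvalues 0, 1, 2 on |0>, |p>, |p,p>.
number_op = a_dag @ a

# Spanning check: all products of 1 to 4 factors drawn from {a, a+}.
words = []
for length in range(1, 5):
    for factors in itertools.product([a, a_dag], repeat=length):
        words.append(np.linalg.multi_dot(factors) if length > 1 else factors[0])
rank = np.linalg.matrix_rank(np.array([w.ravel() for w in words]))

print(checks)
print(np.diag(number_op), rank)
```

A rank of 9 means the products span the full space of 3 × 3 matrices, so any such matrix is a linear combination of them.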
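The above example can be used for the subspace referring to a definite momentum
$\vec p$. A matrix $a$ as above can be constructed in the subspace of states
$$ |0\rangle \ , \ |\vec p\rangle \ , \ |\vec p, \vec p\rangle \ , \ \text{etc} \qquad (21.28) $$
Examples
Consider
$$ a\,|\vec p, \vec p, \vec p\rangle = \sqrt3\,|\vec p, \vec p\rangle \quad , \quad a\,|0\rangle = 0 \qquad (21.29) $$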
This particular matrix operating in the $\vec p$ subspace will be denoted by $a(\vec p)$.
Similarly, we will have matrices for any other subspace, and we then have an
infinite set of matrices $a(\vec p)$ and $a^+(\vec p)$ (with $a^+ = (a^T)^*$, which is the same as $a^T$
since $a$ is real here).
Having done this, we do not need to introduce new matrices for that part of
Hilbert space where we have particles of different momenta, such as the state
$|\vec p, \vec p, \vec q, \vec q, \vec q\rangle$ with $\vec p \neq \vec q$. The matrices $a(\vec p)$ and $a(\vec q)$ will by definition act on
these states as if the other particles are not there. Thus (with $\vec p \neq \vec q$):
$$ a(\vec p)\,|\vec q, \vec q, \vec q, \vec q, \vec q\rangle = 0 $$
$$ a(\vec p)\,|\vec p, \vec p, \vec q, \vec q, \vec q\rangle = \sqrt2\,|\vec p, \vec q, \vec q, \vec q\rangle $$
$$ a(\vec q)\,|\vec p, \vec p, \vec q, \vec q, \vec q\rangle = \sqrt3\,|\vec p, \vec p, \vec q, \vec q\rangle $$
and so on.
We now have a set of matrices $a(\vec p)$ and $a^+(\vec p)$, called annihilation and creation
operators respectively, defined over the whole Hilbert space, that can be used
to build up any other matrix (operator). Note that by construction $a(\vec p)a(\vec q) = a(\vec q)a(\vec p)$
if $\vec p \neq \vec q$ and of course also if $\vec p = \vec q$. They do not interfere with each
other.
The matrix
$$ H = \sum_{\vec p} p_0\,a^+(\vec p)\,a(\vec p) \quad , \quad p_0 = \sqrt{\vec p^{\,2} + m^2} \qquad (21.30) $$
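is diagonal (as seen from the relations in the above example) and it is the energy
operator, that is:
$$ H\,|\psi\rangle = E\,|\psi\rangle \qquad (21.31) $$
where $E$ is the total energy of the state $|\psi\rangle$. The state may contain any number of
particles of any momentum. Explicitly, we have
$$ H\,|0\rangle = 0 \ , \quad H\,|\vec q\rangle = q_0\,|\vec q\rangle \ , $$
$$ H\,|\vec p, \vec p\rangle = 2p_0\,|\vec p, \vec p\rangle \ , \quad H\,|\vec p, \vec q\rangle = (p_0 + q_0)\,|\vec p, \vec q\rangle $$
and so on.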
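A two-momentum toy version of this construction can be written with Kronecker products. This is a sketch under stated assumptions, not the text's own construction: each momentum subspace is truncated to at most two particles, only two momenta p and q are kept, and the numerical values of m, |p| and |q| are arbitrary. The helper name state is hypothetical.

```python
import numpy as np

dim = 3                                           # occupation numbers 0, 1, 2 per momentum
a = np.diag(np.sqrt(np.arange(1, dim)), k=1)      # single-subspace annihilation matrix
identity = np.eye(dim)

# Operators on the combined space: each acts on its own subspace and ignores the other.
a_p = np.kron(a, identity)
a_q = np.kron(identity, a)

m, p_abs, q_abs = 1.0, 0.5, 2.0                   # illustrative mass and momentum magnitudes
p0 = np.hypot(p_abs, m)                           # sqrt(p^2 + m^2)
q0 = np.hypot(q_abs, m)

# Energy operator restricted to the two momenta.
H = p0 * a_p.T @ a_p + q0 * a_q.T @ a_q

def state(n_p, n_q):
    """Basis vector with n_p particles of momentum p and n_q of momentum q."""
    return np.kron(np.eye(dim)[n_p], np.eye(dim)[n_q])

commute = np.allclose(a_p @ a_q, a_q @ a_p)
diagonal = np.allclose(H, np.diag(np.diag(H)))
pq_ok = np.allclose(H @ state(1, 1), (p0 + q0) * state(1, 1))
pp_ok = np.allclose(H @ state(2, 0), 2.0 * p0 * state(2, 0))
print(commute, diagonal, pq_ok, pp_ok)
```

The Kronecker-product structure is what makes the operators for different momenta commute: each one is the identity on the other subspace.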
21.1.2 Fields
The Fourier transforms of the matrices $a$ and $a^+$ are called fields. To be precise
we have the field $A(x)$:
$$ A(x) = \sum_{\vec p}\frac{1}{\sqrt{2Vp_0}}\,a(\vec p)\,e^{-ipx} \qquad (21.32) $$
Similarly
$$ A^+(x) = \sum_{\vec p}\frac{1}{\sqrt{2Vp_0}}\,a^+(\vec p)\,e^{ipx} \qquad (21.33) $$
Notice that A(x) is no longer a real matrix due to the complex factors exp(ipx).
The matrices A(x) and A+ (x) can be taken as basic building blocks also since
a can be recovered from A. For example, a(~q) can be obtained from A(x) by
another Fourier transformation:
Z
1
d3 x eiqx A(x) = a(~q) (21.34)
2V p0
The Hermitian combination
$$ \phi(x) = A(x) + A^+(x) \qquad (21.35) $$
is called the field corresponding to the particles considered. This field has a
number of properties that make it very useful for the construction of physical
quantities. The main property is that it is local.
The commutator
$$ [\phi(x), \phi(y)] = \phi(x)\phi(y) - \phi(y)\phi(x) \qquad (21.36) $$
is zero if $x$ and $y$ are outside of each other's light cone. To see this we compute
things step by step:
Altogether we find:
$$ [\phi(x), \phi(y)] = \sum_{\vec p}\frac{1}{2Vp_0}\left(e^{-ip(x-y)} - e^{ip(x-y)}\right) \qquad (21.40) $$
We now show that the right hand side is zero if $x$ and $y$ are outside each other's
light cone. First we take the continuum limit, $V \to \infty$:
$$ \sum_{\vec p} \to \frac{V}{(2\pi)^3}\int d^3p $$
Calling the right hand side of the above commutator the function $\Delta_c(x - y)$ we
have,
$$ \Delta_c(x - y) = \frac{1}{(2\pi)^3}\int d^3p\,\frac{1}{2p_0}\left(e^{-ip(x-y)} - e^{ip(x-y)}\right) \qquad (21.41) $$
To prove that this is zero if the four vector $z = x - y$ is outside the light cone
(which means $(zz) < 0$ with the convention $(px) = p_0x_0 - \vec p\cdot\vec x$ used here) we proceed
in two steps. First we will show that $\Delta_c$ is Lorentz invariant. Then we will show that
$\Delta_c(z)$ is zero for any $z$ with $z_0 = 0$. Since any $z$ outside the light cone can be brought
to a vector with $z_0 = 0$ by a Lorentz transformation (that leaves $\Delta_c$ unchanged), we will
have proven the required result. The argument is illustrated in Figure 21.1 below.
Figure 21.1: Light Cone Arguments
If the function is zero at the location of the cross it is zero in the whole
shell(formed from curve shown) going through that point, because all the points
in the shell can be obtained from the cross point by means of a Lorentz transfor-
mation, and the function is Lorentz invariant. If the function is zero along the
whole horizontal axis (the equal time line) then the function is zero everywhere
outside the light cone.
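$$ \frac{1}{2p_0} = \int_{-\infty}^{\infty} d\kappa\,\delta(\kappa^2 - \vec p^{\,2} - m^2)\,\theta(\kappa) \qquad (21.42) $$
where
$$ \theta(\kappa) = \begin{cases} 1 & \text{if } \kappa > 0 \\ 0 & \text{if } \kappa < 0 \end{cases} \qquad (21.43) $$
We may write
$$ \delta(\kappa^2 - \vec p^{\,2} - m^2) = \delta\left((\kappa - \sqrt{\vec p^{\,2} + m^2})(\kappa + \sqrt{\vec p^{\,2} + m^2})\right) \qquad (21.45) $$
Now
$$ \delta(ab) = \frac{1}{|a|}\,\delta(b) + \frac{1}{|b|}\,\delta(a) \qquad (21.46) $$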
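Furthermore, the negative root $-\sqrt{\vec p^{\,2} + m^2}$ will not give anything because the
$\theta$ function restricts us to positive values of the integration variable $\kappa$. Thus, we get
$$ \frac{1}{2p_0} = \int_{-\infty}^{\infty} d\kappa\,\frac{1}{\kappa + \sqrt{\vec p^{\,2} + m^2}}\,\delta(\kappa - \sqrt{\vec p^{\,2} + m^2})\,\theta(\kappa) = \frac{1}{2\sqrt{\vec p^{\,2} + m^2}} \qquad (21.47) $$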
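The result that integrating the mass-shell delta function over positive values gives 1/(2 p_0) can be checked numerically by replacing the delta function with a narrow normalized Gaussian. This is only an illustrative sketch: the Gaussian width eps, the grid spacing, and the values of |p| and m are assumptions chosen so that the sum converges well.

```python
import numpy as np

p_abs, m = 1.2, 0.5                      # illustrative momentum magnitude and mass
energy = np.hypot(p_abs, m)              # sqrt(p^2 + m^2) = 1.3

eps = 1e-3                               # width of the Gaussian standing in for the delta function

def delta_eps(x):
    """Normalized Gaussian of width eps, approximating delta(x)."""
    return np.exp(-x**2 / (2.0 * eps**2)) / (eps * np.sqrt(2.0 * np.pi))

# The theta function restricts the integration to positive values, so integrate over (0, 3].
step = 2e-6
kappa = np.arange(step, 3.0, step)
integral = np.sum(delta_eps(kappa**2 - energy**2)) * step

expected = 1.0 / (2.0 * energy)
print(integral, expected)
```

Only the positive root contributes, and the factor 1/(2 p_0) appears because the argument of the delta function has slope 2 p_0 at that root.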
Although it might cause some confusion we are going to use the name $p_0$ for the
integration variable, and the above result is
$$ \frac{1}{2p_0} = \int_{-\infty}^{\infty} dp_0\,\delta(p_0^2 - \vec p^{\,2} - m^2)\,\theta(p_0) \qquad (21.48) $$
We now show that this integral is Lorentz invariant. Thus, we will let $z = Lz'$,
where $L$ is some Lorentz transformation, and we will show that
$$ \Delta_c(z) = \Delta_c(z') \qquad (21.50) $$
We have
$$ \Delta_c(z) = \Delta_c(Lz') = \frac{1}{(2\pi)^3}\int d^4p\,\delta(p^2 - m^2)\left(e^{-ipLz'} - e^{ipLz'}\right)\theta(p_0) \qquad (21.51) $$
Now introduce four new variables $q_1, q_2, q_3, q_4$ related to the $p$ by $p = Lq$. This is
as if we did a Lorentz transformation on $p$, but it is really a change of integration
variables. The integration volume element $d^4p$ becomes $d^4q$ times the Jacobian
of the transformation
$$ d^4p = \det(L)\,d^4q \qquad (21.52) $$
Since $\det(L) = 1$ that gives no change. Furthermore:
Finally, what happens with $\theta(p_0)$? This is more subtle and requires a detailed
investigation of the action of a Lorentz transformation on the vector $p$. This
vector is restricted to values inside the upper light cone, because we must have
$p^2 = m^2$ ($p$ is said to be on the mass-shell) and the $\theta$ function restricts us to
the upper light cone. Since any four-vector in the upper light cone transforms
into another vector in the upper light cone, $q$ will also be in the upper light cone.
Therefore, $\theta(q_0)$ will also be non-zero if $\theta(p_0)$ was nonzero and zero if $\theta(p_0)$ was
zero. In other words, $\theta(p_0) = \theta(q_0)$ if $p = Lq$, but for this the $\delta$ function is
crucial, because otherwise there would not be the restriction to the upper light
cone. The result of all this is:
$$ \Delta_c(z) = \frac{1}{(2\pi)^3}\int d^4q\,\delta(q^2 - m^2)\left(e^{-iqz'} - e^{iqz'}\right)\theta(q_0) = \Delta_c(z') \qquad (21.53) $$
The last step follows since this differs from the original expression only by a
different notation for the integration variables.
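The three ingredients of this argument (unit Jacobian, invariance of the mass-shell condition, and preservation of the upper light cone) can be illustrated for a single boost. The sketch below is an example under stated assumptions: it uses a boost along the x axis with velocity beta = 0.9, the metric diag(1, -1, -1, -1), and an arbitrary on-shell momentum; the function name boost_x is hypothetical.

```python
import numpy as np

def boost_x(beta):
    """4x4 Lorentz boost along the x axis acting on (p0, p1, p2, p3)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    boost = np.eye(4)
    boost[0, 0] = boost[1, 1] = gamma
    boost[0, 1] = boost[1, 0] = -gamma * beta
    return boost

metric = np.diag([1.0, -1.0, -1.0, -1.0])

m = 1.0
p_vec = np.array([0.3, -1.1, 2.0])
# On-shell four-vector in the upper light cone: p0 = +sqrt(p^2 + m^2).
p = np.concatenate(([np.sqrt(p_vec @ p_vec + m**2)], p_vec))

L = boost_x(0.9)
q = np.linalg.solve(L, p)                     # change of variables p = L q

det_ok = np.isclose(np.linalg.det(L), 1.0)    # Jacobian of d^4p = det(L) d^4q
mass_shell_ok = np.isclose(q @ metric @ q, m**2)
upper_cone_ok = q[0] > 0.0                    # theta(q0) = theta(p0) = 1
print(det_ok, mass_shell_ok, upper_cone_ok)
```

A single boost is of course not a proof, but it shows concretely why each factor in the integrand keeps its form under the change of variables.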
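The $\theta$ factor restricts the $q_0$ integration but not the $\vec q$ integration. The $\vec q$ integral
goes from $-\infty$ to $+\infty$ for every component. Therefore
$$ \int d^3q\,f(\vec q) = \int d^3q\,f(-\vec q) \qquad (21.55) $$
$$ \Delta_c(z)\big|_{z_0 = 0} = 0 \qquad (21.56) $$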
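$$ \left(\frac{\partial^2}{\partial t^2} - \nabla^2 + m^2\right)\phi(x) = 0 \qquad (21.58) $$
due to the mass-shell relation for $p_0$. Finally, operating with $\phi(x)$ on the vacuum
state we obtain the state for a particle located at the point $\vec x$ at time $x_0 = 0$:
$$ \phi(x)\,|0\rangle = \sum_{\vec p}\frac{1}{\sqrt{2Vp_0}}\left(a(\vec p)\,e^{-ipx} + a^+(\vec p)\,e^{ipx}\right)|0\rangle = \sum_{\vec p}\frac{1}{\sqrt{2Vp_0}}\,e^{ipx}\,|\vec p\rangle \qquad (21.59) $$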
We can also derive commutation rules for equal times.
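where $|1\rangle$, $|2\rangle$, ..., etc, represent the first, second, etc, basis vectors. Then the
scalar-product of $|\psi\rangle$ with itself is given by
$$ \langle\psi\,|\,\psi\rangle = \psi_1^*\psi_1 + \psi_2^*\psi_2 + \ldots \qquad (21.63) $$
Similarly,
$$ \langle\phi\,|\,\psi\rangle = \phi_1^*\psi_1 + \phi_2^*\psi_2 + \ldots = \sum_i \phi_i^*\psi_i \qquad (21.64) $$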
Now, if the system is in the state described by $|c\rangle$ and one tries to measure
whether the system is in state $|a\rangle$ or $|b\rangle$ one will find:
$$ \text{probability to find system in state } |a\rangle = |\alpha|^2 \qquad (21.65) $$
$$ \text{probability to find system in state } |b\rangle = |\beta|^2 \qquad (21.66) $$
Here "in" means that the system is measured to have the same properties as state
$a$ or state $b$.
More generally, if a system is in a state $|c\rangle$, then the probability to observe the
state $|a\rangle$ is given by
$$ |\langle a\,|\,c\rangle|^2 \qquad (21.67) $$
This is the fundamental connection between Hilbert space and physical measurements.
Since $\langle c\,|\,c\rangle = 1$ we must have
$$ |\alpha|^2 + |\beta|^2 = 1 \qquad (21.68) $$
The two probabilities add up to 1, as should, of course, be the case.
By itself this is not new: even in ordinary space there are restrictions, and mo-
menta of particles are physical only if their scalar-product with themselves is
positive (corresponding to real mass). Particles with momenta such that (pp) is
negative have not been seen.
So, much the same as in ordinary space where some domains seem to be ex-
cluded for physics, the same might happen in Hilbert space. In Hilbert space, we
would be truly in difficulty if we had to allow for physical systems with negative
probability. No consistent theory can then be constructed, because probability
is positive by its very definition.
For scalar or pseudoscalar particles there is not yet any problem; the transformations
in Hilbert space are rather trivial (as for example the Hilbert space
vector $|\vec p\rangle$ transforming to the vector $|\vec q\rangle$ where $\vec q$ is the Lorentz transform of $\vec p$).
In dealing with particles with spin, such as electrons and photons, the transformation
of the states in Hilbert space becomes more complicated and we leave
that for a more advanced text.
For definiteness, we assume the existence of two kinds of particles called $\pi$ and $\sigma$.
Both are spinless, since including that complication is not needed to understand
the derivation of Feynman rules. We assume masses $M$ and $m$ for the $\pi$ and $\sigma$
respectively. This model is a simplified version of electrodynamics of electrons
and photons. The $\sigma$ will play the role of the photon.
To begin we will focus on a specific problem, namely scattering. The physical
process is shown in Figure 21.2 below.
Two pions with momenta $p$ and $q$ (with $p^2 = q^2 = M^2$) meet and scatter, and
we are interested in the probability that a final configuration of two pions with
momenta $p'$ and $q'$ (with $p'^2 = q'^2 = M^2$) is produced. This probability, when
multiplied with the appropriate flux factors, will be the differential cross section
for this process.
A physical state is simply a possible physical system, with particles moving here
and there, with collisions, with dogs chasing cats, with people living and dying,
with all kinds of things happening.
Often people make the mistake of identifying a physical state with the system
at a given moment, one picture from a movie, but that is not what we call the
physical state. The system at some moment may be seen as a boundary condi-
tion, that is, if one knows the whole system at some moment, and one knows
the laws of nature, then in principle we can deduce the rest!
Conveniently, especially for scattering processes one may use the time points
$\pm\infty$. Thus the above process corresponds to a vector in Hilbert space, and we
can denote that vector by
$$ |p, q\rangle_{in} \qquad (21.69) $$
By this we mean: that physical system that has two pions of momenta $p$ and $q$ at
time $t = -\infty$ (the in configuration). It must be understood that $|p, q\rangle_{in}$ contains
everything, including how the system looks at other times. For example, we
could define the state $|p, q\rangle_0$ as that physical system which has at time $t = 0$
exactly two pions of momenta $p$ and $q$. The above described state, $|p, q\rangle_{in}$,
has two such pions at $t = -\infty$, but it may well be that they scatter before
$t = 0$, and, thus, the probability that we still have two pions with that same
momentum at $t = 0$ is smaller than one.
Let $|p, q\rangle_{in}$ and $|p, q\rangle_0$ be the systems as described above. They are different
systems. Let the system be in the state $|p, q\rangle_{in}$ (two pions at $t = -\infty$). The
probability of having two pions with momenta $p$ and $q$ at $t = 0$ is the square of
the absolute value of the scalar-product between the states:
$$ |{}_0\langle p, q\,|\,p, q\rangle_{in}|^2 \qquad (21.70) $$
If some collision took place before $t = 0$ we may actually still have two pions
but with different momenta, say $k$ and $r$. A state with two pions with momenta
$k$ and $r$ at time $t = 0$ is denoted by $|k, r\rangle_0$. If the system is in the state $|p, q\rangle_{in}$
then the probability of observing two pions of momenta $k$ and $r$ at $t = 0$ is given
by:
$$ |{}_0\langle k, r\,|\,p, q\rangle_{in}|^2 \qquad (21.71) $$
Thus the state $|p, q\rangle_{in}$ when viewed at time $t = -\infty$ contains two pions of
momenta $p$ and $q$, but if we look at it at time $t = 0$ we see with some probability
two pions that may or may not have the momenta $p$ and $q$. More generally, new
particles may be produced in a collision, so we may also see three, four, etc.
pion configurations. For example,
$$ |{}_0\langle k, r, s\,|\,p, q\rangle_{in}|^2 \qquad (21.72) $$
Clearly, we will have to modify this if we want to consider stable bound states.
No matter how far back we go in time, the electron and proton in a hydrogen
atom do not separate. To describe such systems properly we must enlarge
Hilbert space and allow states containing hydrogen atoms. Of course, such
atoms again can just be considered as a new kind of particle, and the Hilbert
space becomes then effectively the free Hilbert space of three (in this case)
particles (electrons, protons and hydrogen atoms). We will consider, however,
only simple particle states.
Now that we are clear about the meaning of states and their representation in
Hilbert space we can proceed and postulate equations that will describe particles
in interactions. Experiment must then decide which equations describe nature.
Of course, whatever we postulate, it will be within the framework of Lorentz
invariant quantum mechanics. Only a limited degree of freedom is left.
U-matrix, S-matrix
In describing scattering problems, where one typically considers initial and final
configurations of widely separated particles, the description in terms of in and
out states is advantageous. A vector $|a\rangle_{in}$ in Hilbert space corresponds to a
physical system characterized by the configuration $a$ at time $t = -\infty$. Similarly
a vector $|b\rangle_{out}$ corresponds to a system characterized by the configuration $b$ at
time $t = +\infty$. The states $|a\rangle_{in}$ can be counted in the same way as the free
particle states:
$|0\rangle_{in}$ = vacuum at $t = -\infty$
$|\vec p_1\rangle_{in}$ = one particle (pion) with momentum $\vec p_1$ at $t = -\infty$
..............................................................
$|\vec p_1, \vec p_2, \ldots, \vec q_1, \vec q_2, \ldots\rangle_{in}$ = pions with momenta $\vec p_1, \vec p_2, \ldots$ and
$\sigma$ particles with momenta $\vec q_1, \vec q_2, \ldots$ at $t = -\infty$
A remark needs to be made here.
Since for free particles the energy is known if the three-momentum is known
the state is characterized by the three-momenta only. That is why we used the
three-vector as argument in |~p1 i rather than |p1 i. In the following we will often
drop the arrow, assuming that the reader is aware that the particles indicated
are on mass shell(p2 = m2 ). Note that for finite times, when the particles are
not necessarily far apart, the energy is not simply given by the usual mass shell
relation.
Of course, in general if there is interaction the in states are different from the
out states, although both are in the same Hilbert space. Thus
$$ |\vec p_1, \vec p_2\rangle_{in} \neq |\vec p_1, \vec p_2\rangle_{out} \qquad (21.73) $$
because a system that at $t = -\infty$ has two pions with momenta $\vec p_1$ and $\vec p_2$ is
unlikely to still have two pions with momenta $\vec p_1$ and $\vec p_2$ at $t = +\infty$. There is
some probability that the pions do not scatter; it is given by the absolute value
squared of the scalar-product between the two states, that is,
$$ |{}_{out}\langle\vec p_1, \vec p_2\,|\,\vec p_1, \vec p_2\rangle_{in}|^2 \qquad (21.74) $$
is the probability that, starting with two pions with momenta $\vec p_1$ and $\vec p_2$ at
$t = -\infty$ we will still find two pions with momenta $\vec p_1$ and $\vec p_2$ at $t = +\infty$.
Similarly,
$$ |{}_{out}\langle\vec p\,', \vec q\,'\,|\,\vec p, \vec q\rangle_{in}|^2 \qquad (21.75) $$
is the probability that when measuring on a system characterized by there being
two particles of momenta $\vec p$ and $\vec q$ at $t = -\infty$ we will find two particles of
momenta $\vec p\,'$ and $\vec q\,'$ at $t = +\infty$.
We thus have two sets of basis vectors in the same Hilbert space, namely the in
basis ($|0\rangle_{in}$, $|\vec p_1\rangle_{in}$, .....) and the out basis ($|0\rangle_{out}$, $|\vec p_1\rangle_{out}$, .....). Since a system
without any particles at $t = -\infty$ will still not have any particles at $t = +\infty$, we
have $|0\rangle_{in} = |0\rangle_{out}$. Similarly for one particle states, $|\vec p_1\rangle_{in} = |\vec p_1\rangle_{out}$. But for
two or more particle states this is not true if there is any interaction.
Since physical states correspond to vectors of unit length both the in and out
bases are orthonormal. Therefore, there must exist a matrix that transforms
the in basis into the out basis:
$$ S = I + iT \qquad (21.78) $$
$$ i(T - T^+) = -T^+T \qquad (21.79) $$
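The relation between T and its adjoint is what unitarity of S (a transformation between two orthonormal bases) implies once S is written as I + iT. A quick numerical sketch follows, with assumptions stated plainly: the unitary matrix is generated from a random Hermitian matrix K as S = exp(iK) purely for illustration, and the dimension 4 and the random seed are arbitrary; none of this is a physical S-matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

# Random Hermitian generator K (illustrative only).
M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
K = (M + M.conj().T) / 2.0

# S = exp(iK), built from the eigen-decomposition of K, is unitary.
eigenvalues, eigenvectors = np.linalg.eigh(K)
S = eigenvectors @ np.diag(np.exp(1j * eigenvalues)) @ eigenvectors.conj().T

# Define T through S = I + iT.
T = (S - np.eye(dim)) / 1j
T_dag = T.conj().T

unitary_ok = np.allclose(S.conj().T @ S, np.eye(dim))
relation_ok = np.allclose(1j * (T - T_dag), -T_dag @ T)
print(unitary_ok, relation_ok)
```

Expanding (I - iT^+)(I + iT) = I term by term gives the same relation algebraically; the code only confirms it for one example.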
Exactly the same as in the case of free particle states, we may define matrices
$a$ and $a^+$ in Hilbert space. We can do that on both in and out bases. Thus,
$a_{in}(\vec p)$ is a matrix such that
$$ a_{in}(\vec p)\,|\underbrace{\vec p, \vec p, \ldots}_{n\ \vec p\,'s}, \vec q, \ldots\rangle_{in} = \sqrt n\,|\underbrace{\vec p, \vec p, \ldots}_{n-1\ \vec p\,'s}, \vec q, \ldots\rangle_{in} \qquad (21.80) $$
Similarly $a_{out}(\vec p)$ is defined by its action on the unit vectors of the out basis.
Note that at this point we have no idea what happens when we apply $a_{in}(\vec p)$ on
some out basis vector.
Since the S-matrix transforms the in basis into the out basis it must also transform
$a_{in}(\vec p)$ into $a_{out}(\vec p)$:
$$ S\,a_{out}(\vec p)\,S^+\,|\vec p, \vec p, \ldots\rangle_{in} = S\,a_{out}(\vec p)\,|\vec p, \vec p, \ldots\rangle_{out} = \sqrt n\,S\,|\vec p, \ldots\rangle_{out} = \sqrt n\,|\vec p, \ldots\rangle_{in} = a_{in}(\vec p)\,|\vec p, \vec p, \ldots\rangle_{in} $$
$$ a_{in}(\vec p) = S\,a_{out}(\vec p)\,S^+ \qquad (21.82) $$
Similarly
$$ \phi_{in}(x) = S\,\phi_{out}(x)\,S^+ \qquad (21.83) $$
Both fields, $\phi_{in}$ and $\phi_{out}$, satisfy the Klein-Gordon equation.
It must be understood that $\phi_{in}$ and $\phi_{out}$ are well defined for all space and time.
Thus $\phi_{in}(x)$ for example is perfectly well defined and non-zero for $x_0 = +\infty$.
The assumption that for $t = \pm\infty$ any physical system becomes a system of free
particles allows a mapping of all possible physical systems (by how they are
at $-\infty$) on all possible free particle systems. Then we can use the formalism
developed for those systems, and build fields. The fields so constructed are the
in fields. Similarly, the out fields are related to labeling physical systems by how they
are at $+\infty$. This then exhibits the role of assumptions on asymptotic behavior,
which are clearly of fundamental importance.
Only empty movies (the vacuum) or movies containing just one actor (one par-
ticle of some momentum) are likely to have identical in and out scenes.
A physical state in Hilbert space is like a movie in a can. It is the whole movie,
not just the opening scene, even if the can is labeled that way. Seeing things
this way it hopefully becomes clear that a progressing physical system is not a
vector in Hilbert space (such as $|\vec p, \vec q\rangle_{in}$) rotating to another state ($|\vec p, \vec q\rangle_{out}$) in
the course of time.
A vector in Hilbert space has no time dependence, but, like in a movie, all action
is contained in that state.
In that sense the S-matrix is a cross-index register, showing the relation between
two labeling systems. Given the beginning scene of a movie, the S-matrix tells
us what the final scene is.
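We assume the existence of a field $\phi(x)$ (no in or out index) that is equal to
$\phi_{in}(x)$ if $x_0$ (the time) is $-\infty$ and is equal to $\phi_{out}(x)$ at $x_0 = +\infty$. Again,
$\phi_{in}(x)$ is well defined at $x_0 = +\infty$, but then it will be very different from $\phi(x)$.
Thus, we have
$$\lim_{x_0\to-\infty}\phi(x) = \phi_{in}(x), \qquad \lim_{x_0\to+\infty}\phi(x) = \phi_{out}(x) \tag{21.84}$$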
Figure 21.3 below shows some attempt to visualize the system.
Note that $\phi_{in}(x)$ and $\phi_{out}(x)$ are well defined for all times. Now, $\phi(x)$ will also
satisfy an equation of motion, but it will not be a simple Klein-Gordon equation.
We therefore write:
$$\left(-\frac{\partial^2}{\partial t^2} + \nabla^2 - M^2\right)\phi(x) = -j(x) \tag{21.85}$$
The minus sign is part of the definition. The quantity $j(x)$ is called a current,
and $j(x)$ is such that if we solve this equation with the boundary condition
$$x_0 = -\infty: \quad \phi(x) = \phi_{in}(x) \tag{21.86}$$
then for
$$x_0 = +\infty \text{ we will have } \phi(x) = \phi_{out}(x) \tag{21.87}$$
If we know $j(x)$, and can solve this equation, then we can find $\phi_{out}$ from $\phi_{in}$,
and thus also determine the S-matrix since $S$ relates in- and out-fields.
Basically all this is one big assumption. The system is really very complicated,
even for the simplest cases as we will see. Only for those simple cases can the
above equation be solved, and even then only in terms of successive iterations
(perturbation theory). Thus the scheme developed below is to a large extent
determined simply by the requirement that we can solve it. Fortunately these
methods give rise to results that agree very well with experimental observations.
One truly may be thankful to nature for this, that is, limiting itself to something
that we can compute!
Let us now write down a simple expression for j(x) and solve the equation.
Since we want the pions to interact with the $\sigma$ particles we will also include
$\sigma$ fields in $j(x)$. We assume
$$j_\pi(x) = 2g\,\pi(x)\sigma(x) \tag{21.88}$$
The constant $g$ is called the coupling constant. We will now write $\pi(x)$ instead
of $\phi(x)$ to exhibit more clearly the fact that this particular field is associated
with pions.
The above choice is really the simplest non-trivial form for $j_\pi(x)$ since if we chose
only $\pi(x)$ or $\sigma(x)$ we could, with some reshuffling of the equation, make $j(x)$
zero and, therefore, this would not correspond to any interactions.
Intuitively, this is simple to see. If $j_\pi(x)$ determines how pions interact with $\sigma$s
then evidently this fixes also the interaction of $\sigma$s with $\pi$s. In fact, one can
determine the S-matrix from either $j_\pi(x)$ or $j_\sigma(x)$ above, and this better be the
same S-matrix! We will see later that the choice $j_\pi = 2g\pi\sigma$ implies $j_\sigma = g\pi^2$.
It should be stressed here that the field $\sigma(x)$ in $j_\pi(x)$ is not $\sigma_{in}$ or $\sigma_{out}$ but also
an interpolating field (interpolating between $\sigma_{in}$ and $\sigma_{out}$) just as $\pi(x)$ is an
interpolating field (interpolating between $\pi_{in}$ and $\pi_{out}$).
From the fact that $\pi(x)$ satisfies an equation of motion we should be able to
deduce an equation of motion for $U(x_0)$. Here we need to be careful because all
the objects that we are dealing with are big, generally non-commuting, matrices.
Basic equations, derived earlier, that we will use are:
$$[\phi(x), \phi(y)]_{x_0 = y_0} = 0, \qquad \left[\frac{\partial\phi(x)}{\partial x_0}, \phi(y)\right]_{x_0 = y_0} = -i\,\delta^{(3)}(\vec x - \vec y) \tag{21.91}$$
We will now show that, if $U(x_0)$ satisfies the following differential equation:
$$\frac{\partial U(x_0)}{\partial x_0} = ig\int d^3y\,\pi_{in}^2(y)\,\sigma_{in}(y)\,U(x_0) \quad\text{with } y_0 = x_0 \tag{21.92}$$
then $\pi(x)$ as defined above satisfies
$$\left(-\frac{\partial^2}{\partial t^2} + \nabla^2 - M^2\right)\pi(x) = -2g\,\pi(x)\sigma(x) \tag{21.93}$$
The proof of this statement is not particularly difficult, just a little cumbersome.
It is important to remember that $U(x_0)$ is time dependent, not space dependent.
In other words
$$\frac{\partial U(x_0)}{\partial x_\mu} = 0, \qquad \mu = 1, 2, 3 \tag{21.94}$$
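First we introduce a notation:
$$H(x_0) = g\int d^3y\,\pi^2(y)\,\sigma(y) \quad\text{with } y_0 = x_0 \tag{21.95}$$
and
$$H_{in}(x_0) = g\int d^3y\,\pi_{in}^2(y)\,\sigma_{in}(y) \quad\text{with } y_0 = x_0 \tag{21.96}$$
It follows that
$$H(x_0) = U^{-1} H_{in}(x_0)\,U \tag{21.97}$$
The equation to be solved is now:
$$\frac{\partial U(x_0)}{\partial x_0} = iH_{in}(x_0)\,U(x_0) \tag{21.98}$$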
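The time derivative of $\pi(x)$ and $\sigma(x)$ can be computed. Remember, in general:
$$\partial(I) = 0 = \partial(U^{-1}U) = \partial(U^{-1})\,U + U^{-1}\,\partial(U)$$
$$\rightarrow\quad \partial(U^{-1}) = -U^{-1}\,\partial(U)\,U^{-1} \tag{21.99}$$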
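The rule (21.99) for the derivative of an inverse matrix is easy to test numerically. The sketch below compares a central finite difference of $U^{-1}(t)$ with $-U^{-1}(\partial U)U^{-1}$; the matrix function `U(t)` is an arbitrary invertible example chosen for illustration, not the $U(x_0)$ of the text.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mul2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def U(t):
    """Hypothetical time-dependent matrix (illustrative only)."""
    return [[1.0 + t, t * t], [math.sin(t), 2.0 + math.cos(t)]]

def dU(t):
    """Analytic time derivative of U(t)."""
    return [[1.0, 2.0 * t], [math.cos(t), -math.sin(t)]]

def central_diff(f, t, h=1e-5):
    """Entry-wise central finite difference of a matrix-valued function."""
    fp, fm = f(t + h), f(t - h)
    return [[(fp[i][j] - fm[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]

def inverse_derivative_error(t):
    """Max entry of d(U^-1)/dt + U^-1 (dU/dt) U^-1, which should vanish."""
    lhs = central_diff(lambda s: inv2(U(s)), t)
    u_inv = inv2(U(t))
    rhs = mul2(mul2(u_inv, dU(t)), u_inv)
    return max(abs(lhs[i][j] + rhs[i][j]) for i in range(2) for j in range(2))

print(inverse_derivative_error(0.4))
```

The order of the factors matters: for non-commuting matrices $U^{-1}(\partial U)U^{-1}$ cannot be simplified to $(\partial U)U^{-2}$.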
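We then find:
$$\pi = U^{-1}\pi_{in}U \tag{21.100}$$
and
$$\frac{\partial\pi}{\partial x_0} = \frac{\partial(U^{-1}\pi_{in}U)}{\partial x_0} = \frac{\partial(U^{-1})}{\partial x_0}\pi_{in}U + U^{-1}\frac{\partial(\pi_{in})}{\partial x_0}U + U^{-1}\pi_{in}\frac{\partial(U)}{\partial x_0}$$
$$= -U^{-1}\frac{\partial(U)}{\partial x_0}U^{-1}\pi_{in}U + U^{-1}\frac{\partial(\pi_{in})}{\partial x_0}U + U^{-1}\pi_{in}\,iH_{in}U$$
$$= -U^{-1}\,iH_{in}U\,U^{-1}\pi_{in}U + U^{-1}\frac{\partial(\pi_{in})}{\partial x_0}U + U^{-1}\pi_{in}\,iH_{in}U$$
$$= U^{-1}\,i\,[\pi_{in}, H_{in}]\,U + U^{-1}\frac{\partial(\pi_{in})}{\partial x_0}U \tag{21.101}$$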
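We can see a general rule here. The second time derivative of the $\pi$ field
becomes:
$$\frac{\partial^2\pi}{\partial x_0^2} = -U^{-1}\,[[\pi_{in}, H_{in}], H_{in}]\,U + U^{-1}\,i\,\frac{\partial}{\partial x_0}[\pi_{in}, H_{in}]\,U + U^{-1}\,i\left[\frac{\partial\pi_{in}}{\partial x_0}, H_{in}\right]U + U^{-1}\frac{\partial^2\pi_{in}}{\partial x_0^2}U \tag{21.102}$$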
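As noted before the spatial derivatives of $U$ vanish, so that $\nabla^2$ and $M^2$ pass
through $U^{-1}$ and $U$. Considering now the Klein-Gordon equation for $\pi(x)$ we find
$$\left(-\frac{\partial^2}{\partial t^2} + \nabla^2 - M^2\right)\pi(x) = \left(-\frac{\partial^2}{\partial t^2} + \nabla^2 - M^2\right)U^{-1}\pi_{in}U$$
$$= -\frac{\partial^2}{\partial t^2}\left(U^{-1}\pi_{in}U\right) + U^{-1}\left[\left(\nabla^2 - M^2\right)\pi_{in}\right]U \tag{21.103}$$
and finally, using the second time derivative computed above,
$$\left(-\frac{\partial^2}{\partial t^2} + \nabla^2 - M^2\right)\pi(x) = U^{-1}\left[\left(-\frac{\partial^2}{\partial t^2} + \nabla^2 - M^2\right)\pi_{in}\right]U + U^{-1}\,[[\pi_{in}, H_{in}], H_{in}]\,U$$
$$- U^{-1}\,i\,\frac{\partial}{\partial x_0}[\pi_{in}, H_{in}]\,U - U^{-1}\,i\left[\frac{\partial\pi_{in}}{\partial x_0}, H_{in}\right]U \tag{21.104}$$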
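Since the field $\pi_{in}$ satisfies the free Klein-Gordon equation, the first term on the
right hand side vanishes. Also, the second and third terms vanish because $\pi_{in}(x)$
and $\pi_{in}(y)$ commute for $x_0 = y_0$ (as far as $\pi_{in}(x)$ and $\sigma_{in}(x)$ are concerned, they
commute always because the matrices $a$ and $a^\dagger$ for the $\pi$ and $\sigma$ fields commute).
The last term gives
$$i\left[\frac{\partial\pi_{in}(x)}{\partial x_0}, H_{in}\right] = ig\int d^3y\,\sigma_{in}(y)\left[\frac{\partial\pi_{in}(x)}{\partial x_0}, \pi_{in}^2(y)\right]_{x_0 = y_0}$$
$$= 2ig\int d^3y\,\pi_{in}(y)\left(-i\,\delta^{(3)}(\vec x - \vec y)\right)\sigma_{in}(y)\Big|_{x_0 = y_0} = 2g\,\pi_{in}(x)\sigma_{in}(x)$$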
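The final result is:
$$\left(-\frac{\partial^2}{\partial t^2} + \nabla^2 - M^2\right)\pi(x) = -2g\,U^{-1}\pi_{in}(x)\sigma_{in}(x)U = -2g\,\pi(x)\sigma(x) \tag{21.106}$$
because
$$U^{-1}\pi_{in}\sigma_{in}U = U^{-1}\pi_{in}U\,U^{-1}\sigma_{in}U = \pi\sigma \tag{21.107}$$
In the same way one finds for the $\sigma$ field
$$\left(-\frac{\partial^2}{\partial t^2} + \nabla^2 - m^2\right)\sigma(x) = -g\,\pi^2(x) \tag{21.108}$$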
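This is the moment to consider the question of the connection between $j_\pi$ and
$j_\sigma$. It is clear from the above derivation that they follow from the same $U$, i.e.,
from the same $H$. As a matter of fact one notes the formal rules:
$$j_\pi(x) = \frac{\partial\mathcal{H}(x)}{\partial\pi(x)} \quad\text{and}\quad j_\sigma(x) = \frac{\partial\mathcal{H}(x)}{\partial\sigma(x)} \tag{21.109}$$
with
$$H(x_0) = \int d^3y\,\mathcal{H}(y), \qquad \mathcal{H}(y) = g\,\pi^2(y)\sigma(y) \tag{21.110}$$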
The quantities $H$ and $\mathcal{H}$ are called the interaction Hamiltonian and the interaction
Hamiltonian density respectively. It is clear that it is better to start from
a Hamiltonian and then to derive the equations of motion to avoid inconsistencies.
This is what we will do in general. In fact, at this point we do not really
need the equations of motion for the fields any more. We will simply take some
Hamiltonian, and we then know that the $U$ matrix satisfies the equation
$$\frac{\partial U(x_0)}{\partial x_0} = iH_{in}(x_0)\,U(x_0) \tag{21.111}$$
and solve $U$ from that. Once we have $U$ we have the S-matrix, namely, $S = U(\infty)$.
It should be clearly understood what we have here. The solution of the equation
for $U$ will give us the matrix $U$ as a function of $\pi_{in}$ and $\sigma_{in}$. Thus we will obtain
$S$ as a function of $\pi_{in}$ and $\sigma_{in}$. This is precisely what we need. As noted before,
the probability to find the configuration $c$ at time $t = +\infty$ if at time $t = -\infty$
we have the configuration $b$, is given by
$$\left|{}_{out}\langle c\,|\,b\rangle_{in}\right|^2 = \left|{}_{in}\langle c|\,S\,|b\rangle_{in}\right|^2 \tag{21.112}$$
Thus, in principle there is no problem here. However, the actual calculation of
matrix elements ${}_{in}\langle c|\,S\,|b\rangle_{in}$ remains a complicated matter.
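The equation for $U$ is of the form $dX/dt = A(t)X$. If $A(t)$ is an ordinary function,
the solution is $X = c\,\exp\left(\int^t A(t')\,dt'\right)$,
where $c$ is an arbitrary constant. If $A(t)$ is a matrix such that $A(t_1)$ and $A(t_2)$ do
not necessarily commute, then the solution is more complicated and essentially
can be given only in terms of a series expansion that looks very much like an
expansion of the above exponential, but not exactly (see Chapter 11).
$$X = 1 + \sum_{n=1}^{\infty}\frac{1}{n!}\int^t dt_1\int^t dt_2\cdots\int^t dt_n\,T\left(A(t_1)A(t_2)\cdots A(t_n)\right) \tag{21.116}$$
Here $T$ orders the factors in descending order of time.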
We now proceed to show the correctness of the above solution. It can of course
be verified directly by putting the above solution into the equation, but we will
use iteration instead. Suppose we want to find $X$ as a power series in $A$. Let us
find the lowest order term. We write $X = 1 + \Delta_1$, where $\Delta_1$ is of first order in
$A$. Neglecting terms of order $A^2$ such as $A\Delta_1$ the equation for $X$ becomes
$$\frac{d\Delta_1}{dt} = A \quad\rightarrow\quad \Delta_1 = \int^t A(t_1)\,dt_1 \tag{21.118}$$
To obtain the next iteration we write:
$$X = 1 + \int^t A(t_1)\,dt_1 + \Delta_2 \tag{21.119}$$
where $\Delta_2$ is of second order in $A$. The equation for $X$ becomes (to second order
in $A$)
$$\frac{d\Delta_2}{dt} = A(t)\int^t A(t_1)\,dt_1 \tag{21.120}$$
The solution is
$$\Delta_2 = \int^t dt_2\,A(t_2)\int^{t_2} dt_1\,A(t_1) \tag{21.121}$$
Notice that $t_2 > t_1$, that is, the matrices $A$ appear in descending order of time.
To further proceed with this integral we claim that
$$\Delta_2 = \int^t dt_2\int_{t_2}^{t} dt_1\,A(t_1)A(t_2) \tag{21.122}$$
also. Note the order of the $A$ and the integration limits of the second integral
where again the $A$ appear in descending order. This may be verified either
by direct insertion into the equation for $d\Delta_2/dt$ or by transforming the integral.
This becomes very easy by considering Figure 21.4 below showing the integration
domains.
Taking $c$ as lower integration limit, with $c$ some number, the first integral corresponds
to domain I in the figure and the second to domain II. It is clear that the
two domains can be obtained from each other by exchanging $t_1$ and $t_2$. Since
indeed the integrands have $t_1$ and $t_2$ interchanged we see that the two integrals
are equal. We may therefore take also for $\Delta_2$ half the sum of both expressions
and thus obtain
$$\Delta_2 = \frac{1}{2}\int^t dt_1\int^t dt_2\,T\left(A(t_1)A(t_2)\right) \tag{21.123}$$
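For the sake of clarity we will show directly that the above is the correct solution.
First, writing the time-ordered product with step functions,
$$T\left(A(t_1)A(t_2)\right) = \theta(t_1 - t_2)A(t_1)A(t_2) + \theta(t_2 - t_1)A(t_2)A(t_1)$$
we have
$$\frac{d\Delta_2}{dt} = \frac{1}{2}\int^t dt_2\left(\theta(t - t_2)A(t)A(t_2) + \theta(t_2 - t)A(t_2)A(t)\right)$$
$$+ \frac{1}{2}\int^t dt_1\left(\theta(t_1 - t)A(t_1)A(t) + \theta(t - t_1)A(t)A(t_1)\right) \tag{21.125}$$
The second term is zero unless $t_2 > t$, which is never true, so that term is zero.
The same holds for the last two terms, which really differ from the first two only in that
the integration variable is called $t_1$ instead of $t_2$. Together we get the desired
result
$$\frac{d\Delta_2}{dt} = A(t)\int^t A(t_1)\,dt_1$$
Similarly, for the term of order $n$,
$$\Delta_n = \frac{1}{n!}\int^t dt_1\int^t dt_2\cdots\int^t dt_n\,T\left(A(t_1)A(t_2)\cdots A(t_n)\right) \tag{21.127}$$
In this case one must consider $n!$ domains of integration, all obtained from each
other by some permutation of $t_1, t_2, \ldots, t_n$, but there is no essential difference from
the case of just two variables.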
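The equality of the nested form (21.121) and the time-ordered form (21.123) can be illustrated numerically. The following is a rough sketch on a midpoint grid; the matrix `A(t)`, the lower limit 0 and the upper limit 1 are hypothetical choices made only so that $A(t_1)$ and $A(t_2)$ do not commute.

```python
def A(t):
    """Illustrative matrix with A(t1) A(t2) != A(t2) A(t1) for t1 != t2."""
    return [[0.0, t], [1.0, 0.0]]

def mul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add_scaled(acc, m, w):
    """In-place acc += w * m for 2x2 matrices."""
    for i in range(2):
        for j in range(2):
            acc[i][j] += w * m[i][j]

def second_order_terms(t_max=1.0, n=200):
    """Return (nested, time_ordered, naive) second-order terms on [0, t_max]."""
    h = t_max / n
    ts = [(i + 0.5) * h for i in range(n)]
    nested = [[0.0, 0.0], [0.0, 0.0]]    # int dt2 A(t2) int^{t2} dt1 A(t1)
    ordered = [[0.0, 0.0], [0.0, 0.0]]   # (1/2) int int T(A(t1) A(t2))
    integral = [[0.0, 0.0], [0.0, 0.0]]  # int A(t) dt
    for i, t2 in enumerate(ts):
        add_scaled(integral, A(t2), h)
        for j, t1 in enumerate(ts):
            later, earlier = max(t1, t2), min(t1, t2)
            add_scaled(ordered, mul(A(later), A(earlier)), 0.5 * h * h)
            if j < i:
                add_scaled(nested, mul(A(t2), A(t1)), h * h)
            elif j == i:
                # diagonal cells are split evenly between the two domains
                add_scaled(nested, mul(A(t2), A(t1)), 0.5 * h * h)
    # what one would get by ignoring the ordering: (1/2) (int A dt)^2
    naive = [[0.5 * v for v in row] for row in mul(integral, integral)]
    return nested, ordered, naive

nested, ordered, naive = second_order_terms()
print(nested)
print(ordered)
print(naive)
```

For this choice of `A(t)` the exact nested integral has diagonal entries 1/3 and 1/6, while the unordered expression gives 1/4 for both, which shows why the $T$ symbol cannot be dropped when the matrices do not commute.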
21.4 Interacting Fields - Part 3
21.4.1 Feynman Rules
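We will now work out the lowest non-vanishing order of the S-matrix for the
case of the $\pi$ and $\sigma$ fields given before. We have
$$\frac{\partial U(x_0)}{\partial x_0} = iH_{in}(x_0)\,U(x_0) \tag{21.128}$$
or
$$\frac{\partial U(x_0)}{\partial x_0} = i\int d^3y\,\mathcal{H}_{in}(y)\,U(x_0)\Big|_{x_0 = y_0} \tag{21.129}$$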
where $a_{in}(k)$ transforms a state with $m$ pions into a state with $m - 1$ pions, and
gives zero if no pions are present, while $a^\dagger_{in}(k)$ gives the opposite result.
Let us first consider the lowest order term of $S$. As we have now exclusively in
type objects we will drop this subscript. We have:
$$\langle p', q'|\,S\,|p, q\rangle = \langle p', q'\,|\,p, q\rangle + i\int d^4y\,\langle p', q'|\,\mathcal{H}(y)\,|p, q\rangle \tag{21.136}$$
plus terms of higher order in $\mathcal{H}$. Now, if $p', q'$ is different from $p, q$ then the
first term is zero (orthogonal vectors). The second term contains one $\mathcal{H}$ and
therefore only one $\sigma$ field. This applied to a state without $\sigma$ particles gives zero
(for the $a(k)$ part) or a state containing a $\sigma$ particle. But the dot product of
such a state with the state $|p', q'\rangle$, containing no $\sigma$ particle, is zero. Therefore
also the second term is zero.
Generally, any product of an odd number of $\mathcal{H}$s gives zero between states without
$\sigma$ particles, by similar arguments.
Next the appropriate $a(k)$ term in the $\sigma$ field transforms this state into the state $|0\rangle$.
Finally, selecting the terms with $a^\dagger(p')$ and $a^\dagger(q')$ in the last two pion fields
transforms the state $|0\rangle$ into the state $|p', q'\rangle$. The scalar product of this state
with $|p', q'\rangle$ is non-zero; in fact it is one.
This may be graphically depicted in the following way. Particles are described
by lines, and the action of $\pi$ and $\sigma$ fields is to either end or start a line. The
action of $\mathcal{H}$ is thus to start or end two $\pi$ lines and one $\sigma$ line. The above example,
drawn in the opposite direction (i.e. with $y'$ left of $y$) is shown in Figure
21.5 below.
$\mathcal{H}(y')$ ends two $\pi$ lines and starts a $\sigma$ line, corresponding to $\pi^2\sigma\,|p, q\rangle = |k\rangle$, and
$\mathcal{H}(y)$ ends a $\sigma$ line and starts two $\pi$ lines.
We now can draw pictures corresponding to all possibilities. They are shown in
Figure 21.6 below.
+ further permutations
We have drawn the vertices as visible dots, to avoid confusion with crossing
lines. The last case shown differs from the first only by the interchange of $y$ and
$y'$. Since the whole is symmetric in $y$ and $y'$ it follows that both cases give the
same result. Also the first four diagrams all correspond to the same expression.
Figure 21.7: Different Feynman diagrams for scattering
Generally, in higher orders one gets the same result for all permutations of
$y, y', y'', \ldots$, which gives a factor $n!$ for the $n$th order. This cancels against the
factor $1/n!$ in front.
So far we have not worried about the various factors going with the $a$ and $a^\dagger$.
This is not very difficult.
We find:
$$\frac{1}{\sqrt{2Vp_0}}e^{ipy'}\,\frac{1}{\sqrt{2Vq_0}}e^{iqy'}\,\frac{1}{\sqrt{2Vk_0}}e^{-iky'}\,\frac{1}{\sqrt{2Vk_0}}e^{iky}\left(\frac{1}{\sqrt{2Vp_0'}}e^{-ip'y}\,\frac{1}{\sqrt{2Vq_0'}}e^{-iq'y}\right) \tag{21.141}$$
This is for $y_0 > y_0'$. For $y_0' > y_0$ the order of the $\mathcal{H}$ is reversed, which is of no
consequence to the $\pi$ part, but now the $\sigma$ starts in the point $y$ and ends in $y'$.
All together we get:
$$\frac{(ig)^2}{4V^2\sqrt{p_0q_0p_0'q_0'}}\int d^4y\,d^4y'\,e^{i(p+q)y'}e^{-i(p'+q')y}\left(\theta(y_0 - y_0')\sum_k\frac{1}{2Vk_0}e^{ik(y-y')} + \theta(y_0' - y_0)\sum_k\frac{1}{2Vk_0}e^{ik(y'-y)}\right) \tag{21.142}$$
Let us now first work out the expression in brackets. The sum over $k$ may be
written as an integral over $d^3k$, and by methods as described before we may
rewrite the whole in terms of a 4-dimensional integral.
$$\sum_k\frac{1}{2Vk_0}e^{ik(y-y')} \rightarrow \frac{1}{(2\pi)^3}\int d^4k\,e^{ik(y-y')}\,\theta(k_0)\,\delta(k^2 + m^2) \tag{21.143}$$
This combination, denoted $\Delta_F(y - y')$,
is called the propagator of the $\sigma$ field. It can be worked out easily using a
Fourier expression for the $\theta$ function. One has
$$\theta(z) = \frac{1}{2\pi i}\int_{-\infty}^{\infty} d\tau\,\frac{e^{i\tau z}}{\tau - i\epsilon}, \qquad \lim \epsilon \to 0,\ \epsilon > 0 \tag{21.148}$$
You can confirm this equation by considering the poles of the integrand in the
complex $\tau$ plane. Add an integral over a large half circle to make a closed
contour; take this circle either in the upper or lower plane depending on the
sign of $z$ such that the exponential becomes very small on the circle.
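The trick is to get $\tau$ out of the exponential. This may be achieved by a change
of variable for the $k_0$ integration. We take
$$k_0 = k_0' + \tau \tag{21.150}$$
The $\theta$ functions select the $+$ root for the first term and the $-$ root for the
second. The argument of the $\delta$ function can be rewritten
$$\vec k^2 - (k_0 + \tau)^2 + m^2 = \left(\sqrt{\vec k^2 + m^2} - (k_0 + \tau)\right)\left(\sqrt{\vec k^2 + m^2} + (k_0 + \tau)\right) \tag{21.153}$$
Remember again
$$\delta(ab) = \frac{1}{|a|}\delta(b) + \frac{1}{|b|}\delta(a) \tag{21.154}$$
We find
$$\Delta_F(z) = \frac{1}{(2\pi)^4 i}\int d^4k\,e^{ikz}\,\frac{1}{2\sqrt{\vec k^2 + m^2}}\left(\frac{1}{-k_0 + \sqrt{\vec k^2 + m^2} - i\epsilon} + \frac{1}{k_0 + \sqrt{\vec k^2 + m^2} - i\epsilon}\right) \tag{21.155}$$
Combining the two fractions gives $1/(k^2 + m^2 - i\epsilon)$, with $k^2 = \vec k^2 - k_0^2$ and $\epsilon$
rescaled by the positive factor $2\sqrt{\vec k^2 + m^2}$.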
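The step from the two-pole form in (21.155) to the single denominator $k^2 + m^2 - i\epsilon$ can be checked with a few lines of arithmetic. The sketch below assumes the metric convention $k^2 = \vec k^2 - k_0^2$ used in $\delta(k^2 + m^2)$ above; the function names and sample momenta are illustrative.

```python
import math

def omega(kvec, m):
    """On-shell energy sqrt(|k|^2 + m^2)."""
    return math.sqrt(sum(c * c for c in kvec) + m * m)

def two_pole_form(k0, kvec, m, eps):
    """Bracket of (21.155) including the 1/(2 omega) prefactor."""
    w = omega(kvec, m)
    return (1.0 / (2.0 * w)) * (1.0 / (-k0 + w - 1j * eps)
                                + 1.0 / (k0 + w - 1j * eps))

def covariant_form(k0, kvec, m, eps):
    """1/(k^2 + m^2 - i eps') with eps' = 2 omega eps and k^2 = |k|^2 - k0^2."""
    w = omega(kvec, m)
    k_sq = sum(c * c for c in kvec) - k0 * k0
    return 1.0 / (k_sq + m * m - 2j * w * eps)

# An off-shell sample point (hypothetical numbers, natural units).
k0, kvec, m, eps = 0.3, (0.4, 0.1, -0.2), 1.0, 1e-9
print(two_pole_form(k0, kvec, m, eps))
print(covariant_form(k0, kvec, m, eps))
```

The two expressions agree up to terms of order $\epsilon$, which is all that matters in the limit $\epsilon \to 0$.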
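The complete expression for the diagram considered is
$$\frac{(ig)^2}{(2\pi)^4\,i\,V^2\sqrt{16\,p_0q_0p_0'q_0'}}\int d^4k\int d^4y\,d^4y'\,\frac{e^{i(p+q)y'}\,e^{-i(p'+q')y}\,e^{ik(y-y')}}{k^2 + m^2 - i\epsilon} \tag{21.156}$$
Both $y$ and $y'$ occur only in the exponents, and the integrals can be done using
$$\delta(a) = \frac{1}{2\pi}\int dx\,e^{iax} \tag{21.157}$$
and we find
$$\frac{-g^2}{(2\pi)^4\,i\,V^2\sqrt{16\,p_0q_0p_0'q_0'}}\,(2\pi)^8\int d^4k\,\delta^{(4)}(p + q - k)\,\delta^{(4)}(k - p' - q')\,\frac{1}{k^2 + m^2 - i\epsilon} \tag{21.158}$$
The integral over $k$ can be done
$$\frac{-g^2\,(2\pi)^8\,\delta^{(4)}(p + q - p' - q')}{(2\pi)^4\,i\,V^2\sqrt{16\,p_0q_0p_0'q_0'}\,\left((p + q)^2 + m^2 - i\epsilon\right)} \tag{21.159}$$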
From the above calculation we can see how things go in general. Write down
all possible diagrams, and then for any diagram write down the correct factors.
As much as possible factors relating to permutations should be absorbed into
some easy rules. This is not always possible, but in most cases that one meets
there is really not much of a problem.
First, the combinatorial factor relating to there being two pion lines in a vertex
that can be interchanged is easily taken care of by including a factor of 2 in
the vertex. The factor of two relating to the symmetry in $y, y'$ interchange
cancels against the factor $1/2!$ in front of the second order term of the S-matrix
expansion. Now we have three essentially different diagrams left:
The contribution due to the first diagram has been computed above, and from it we
can read off the rules for the theory that we are considering here. Here are the
Feynman rules for the case $\mathcal{H}(x) = g\,\pi^2(x)\sigma(x)$:
2. To every vertex corresponds a factor $2i(2\pi)^4 g\,\delta^{(4)}(\ldots)$. Note: 2 for two
pion lines, $i$ from the original equation for the S-matrix, $(2\pi)^4$ from the
integral giving the $\delta$ function, and $g$ as found in the interaction Hamiltonian.
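As a consistency check on the vertex rule, one can multiply two vertex factors by one propagator factor and compare with the coefficient $4ig^2(2\pi)^4/(k^2 + m^2)$ that appears in the amplitude. This is only a sketch: it assumes the propagator rule is the factor $1/\left((2\pi)^4 i\,(k^2 + m^2 - i\epsilon)\right)$ read off from (21.155), and it leaves out the $\delta$ functions and the external-line normalization factors.

```python
import math

TWO_PI_4 = (2.0 * math.pi) ** 4

def vertex_factor(g):
    """Vertex rule 2 i (2 pi)^4 g, without the delta function."""
    return 2j * TWO_PI_4 * g

def propagator_factor(k_sq, m, eps=0.0):
    """Assumed propagator rule 1 / ((2 pi)^4 i (k^2 + m^2 - i eps))."""
    return 1.0 / (TWO_PI_4 * 1j * (k_sq + m * m - 1j * eps))

def tree_coefficient(g, k_sq, m):
    """Two vertices joined by one internal sigma line."""
    return vertex_factor(g) ** 2 * propagator_factor(k_sq, m)

# Hypothetical numbers in natural units.
g, k_sq, m = 0.5, 2.0, 1.0
print(tree_coefficient(g, k_sq, m))
print(4j * g * g * TWO_PI_4 / (k_sq + m * m))
```

Both printed numbers should coincide, reflecting $(2i)^2/i = 4i$.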
Many of the propagator integrals can usually be done, thereby getting rid of
the $\delta$ functions due to the vertices. The general rule is that one $\delta^{(4)}(\ldots)$
remains, assuring that the sum of incoming momenta equals the total of the
outgoing momenta, thus guaranteeing conservation of energy and momentum
in any process. In the first non-trivial order (as we are considering here) no
momentum integral remains. In the next order one four-dimensional integral
remains non-trivial, and in every next order there is one more four-integral.
This is what makes it so hard to do higher order calculations.
If the time $y_0$ is larger than the time $y_0'$ then this propagator equals a function
containing plane waves for a particle of positive energy on mass shell. We can
literally say that if the time $y_0 > y_0'$ then the Feynman propagator represents
a physical particle moving from the space-time point $y'$ to the space-time point
$y$. In fact, the exponential is nothing else but the wave function for a plane
wave for a particle leaving $y'$ multiplied by the wave function for a particle of
the same mass and momentum arriving at $y$. This product is the overlap of
these functions, something that relates to the probability for this to happen. The
total propagator is obtained when integrating over all possible physical momenta
(positive energy, on mass shell). If the time $y_0' > y_0$, then the particle moves in
the opposite direction.
There is a causality idea in there: energy moves from the earlier point to the
later. There is another feature: the probability for this to happen must not be
negative, which is embodied in the sign of the $\Delta_F$. Indeed, having a theory with
$\Delta_F$ as above but with a $-$ sign in front would give rise to negative probabilities.
This then is the physical content of the Feynman propagator.
21.5.2 Scattering Cross Section
We now introduce a new particle in addition to the $\pi$ and $\sigma$, and we will call it
$P$. Apart from the spin, which one usually neglects in first approximation, this
$P$ is to play the role of the proton.
It interacts with the $\sigma$ in the same way as the $\pi$ (the electron). Thus the interaction
Hamiltonian becomes:
$$\mathcal{H} = g\,\pi^2\sigma - g\,P^2\sigma \tag{21.163}$$
The minus sign reflects the fact that the proton charge is opposite to the electron
charge. The equations of motion are
$$\left(-\frac{\partial^2}{\partial t^2} + \nabla^2 - M^2\right)\pi(x) = -2g\,\pi\sigma \tag{21.164}$$
$$\left(-\frac{\partial^2}{\partial t^2} + \nabla^2 - m^2\right)\sigma(x) = -g\,\pi^2 + g\,P^2 \tag{21.165}$$
$$\left(-\frac{\partial^2}{\partial t^2} + \nabla^2 - M_P^2\right)P(x) = 2g\,P\sigma \tag{21.166}$$
These follow from previous results, as well as the fact that the $P$ field commutes
with all $\pi$ and $\sigma$ fields, including time derivatives of these fields. The Feynman
rules are as before, except we now have an extra particle, the $P$, to be denoted
by a broken line. There is also a new kind of vertex, showing the $\sigma P$ coupling
as shown in Figure 21.10 below.
Figure 21.11: $\pi P$ Scattering
In this case, to second order, we wind up with only one diagram as shown in
Figure 21.12 below.
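To obtain a cross section we must take the absolute value squared of this expression,
which is not immediately clear because of the $\delta$ function. To get around
this we first go back to finite volume $V$, which amounts to the replacement
$$\delta^{(3)}(\vec p + \vec q - \vec p\,' - \vec q\,') \rightarrow \frac{V}{(2\pi)^3}\,\delta_{\vec p + \vec q,\ \vec p\,' + \vec q\,'} \tag{21.168}$$
Thus,
$$\left(\delta^{(3)}(\ldots)\right)^2 \rightarrow \frac{V^2}{(2\pi)^6}\,\delta_{\vec p + \vec q,\ \vec p\,' + \vec q\,'} \tag{21.169}$$
since there is nothing difficult about squaring a Kronecker $\delta$, but we now have
$V^2$ instead of $V$. Recombining one $V$ factor with the $\delta$ we have
$$\left(\delta^{(3)}(\ldots)\right)^2 \rightarrow \frac{V}{(2\pi)^3}\,\delta^{(3)}(\ldots) \tag{21.170}$$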
Now what about the fourth $\delta$ function (relating to energy conservation)? Here
we must introduce a time interval $T$. Since with plane waves as we consider
here there is really no beginning and end to the scattering process we limit our
observations to a time interval $T$, and will compute the transition probability per
unit of time. Essentially, now things are entirely the same for time and space,
and squaring the fourth $\delta$ function gives us a factor $T/2\pi$. The transition
probability is therefore
$$|\langle S\rangle|^2 = \left|\frac{4ig^2(2\pi)^4}{V^2\sqrt{16\,p_0q_0p_0'q_0'}}\,\frac{1}{(p - p')^2 + m^2 - i\epsilon}\right|^2\frac{VT}{(2\pi)^4}\,\delta^{(4)}(\ldots) \tag{21.171}$$
The integral over $q'$ can be done, using up three of the $\delta$ functions. We get
$$\sigma_{tot} = \frac{16g^4}{(2\pi)^2}\int d^3p'\,\frac{1}{4p_0q_0}\,\frac{1}{4p_0'q_0'}\,\frac{p_0}{|\vec p|}\left|\frac{1}{(p - p')^2 + m^2 - i\epsilon}\right|^2\delta(p_0 + q_0 - p_0' - q_0') \tag{21.176}$$
where $q_0' = \sqrt{\vec q\,'^2 + M_P^2}$ with $\vec q\,' = \vec p + \vec q - \vec p\,'$.
We now make the non-relativistic approximation and also the no-recoil approximation,
which is the approximation that the $P$ mass $M_P$ is much larger than
the mass $M$ of the $\pi$.
where $y = |\vec p\,'|$. Conservation of momentum tells us that $q_0'$, the energy of the
outgoing $P$, is given by
$$q_0' = \sqrt{\vec Q^2 + M_P^2} \tag{21.178}$$
$$\vec Q = \vec p + \vec q - \vec p\,' = \vec p - \vec p\,' \tag{21.179}$$
because $\vec q = 0$, as the initial proton is at rest. The quantity $\vec Q$ is called the
momentum transfer. It is the amount of momentum given by the $\pi$ to the $P$.
Expanding the square root for large $M_P$,
$$q_0' = M_P + \frac{\vec Q^2}{2M_P} + \ldots \tag{21.180}$$
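To get a feeling for the size of the second term in the expansion (21.180), the exact energy of the outgoing $P$ can be compared with the expanded form. The numbers below (a proton-like mass of 938.272 MeV and a momentum transfer of 10 MeV) are illustrative assumptions, not values taken from the text.

```python
import math

def outgoing_energy(q_abs, m_p):
    """Exact energy sqrt(Q^2 + M_P^2) of the outgoing P."""
    return math.sqrt(q_abs ** 2 + m_p ** 2)

def recoil_term(q_abs, m_p):
    """Leading correction Q^2 / (2 M_P) in the expansion of the energy."""
    return q_abs ** 2 / (2.0 * m_p)

M_P = 938.272   # MeV, assumed proton-like mass
Q = 10.0        # MeV, assumed momentum transfer

exact_shift = outgoing_energy(Q, M_P) - M_P
print(exact_shift, recoil_term(Q, M_P), recoil_term(Q, M_P) / M_P)
```

For these values the recoil energy is a tiny fraction of $M_P$, which is the regime in which the term may be dropped.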
The no-recoil approximation is to neglect the term
$$\frac{\vec Q^2}{2M_P} \tag{21.181}$$
with respect to $M_P$. Thus, to this approximation the proton remains at rest,
$q_0' = M_P$. The expression for $\sigma_{tot}$ now becomes
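$$\sigma_{tot} = \frac{g^4}{4\pi^2}\int d\Omega\int y^2\,dy\,\frac{1}{p_0q_0}\,\frac{1}{p_0'q_0'}\,\frac{p_0}{|\vec p|}\left|\frac{1}{Q^2 + m^2 - i\epsilon}\right|^2\delta(p_0 - p_0') \tag{21.182}$$
Now $y = |\vec p\,'|$, thus $p_0' = \sqrt{y^2 + M^2}$. It then follows that
$$\frac{dp_0'}{dy} = \frac{1}{2}\,\frac{1}{\sqrt{y^2 + M^2}}\,2y = \frac{y}{p_0'} \tag{21.183}$$
Using the fact that $y = |\vec p| = |\vec p\,'|$ and doing the now trivial $p_0'$ integration we
get
$$\sigma_{tot} = \frac{g^4}{4\pi^2}\int d\Omega\,\frac{1}{p_0q_0}\,\frac{p_0}{q_0'}\left|\frac{1}{Q^2 + m^2 - i\epsilon}\right|^2 \tag{21.186}$$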
It should be noted that in the no-recoil approximation $Q_0 \ll |\vec Q|$. This follows
because $Q_0$ is the difference between the initial and final $P$ energy
$$Q_0 = q_0' - q_0 = \sqrt{\vec Q^2 + M_P^2} - M_P \approx \frac{\vec Q^2}{2M_P} \tag{21.187}$$
Therefore $Q^2 = \vec Q^2$ to a good approximation. Replacing, nonrelativistically, $q_0$
by $M_P$ and $p_0$ by $M$ we have the final result
$$\sigma_{tot} = \frac{g^4}{4\pi^2M_P^2}\int d\Omega\left|\frac{1}{\vec Q^2 + m^2}\right|^2 \tag{21.188}$$
We have omitted the $i\epsilon$ in the propagator, because both $\vec Q^2$ and $m^2$ are positive,
so the infinitesimal $\epsilon$ is of no relevance here. If we take the $\sigma$ mass to be zero
we have
$$\sigma_{tot} = \frac{g^4}{4\pi^2M_P^2}\int d\Omega\,\frac{1}{\vec Q^4} \tag{21.189}$$
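Now $\vec Q = \vec p - \vec p\,'$ and
$$\vec Q^2 = 2|\vec p|^2(1 - \cos\theta) = 4|\vec p|^2\sin^2\frac{\theta}{2} \tag{21.190}$$
where we used $|\vec p| = |\vec p\,'|$ and $\theta$ is the angle that the outgoing $\pi$ makes with the
z-axis (the direction of the incoming $\pi$). Thus, writing $|\vec p| = Mv$ with $v$ the
velocity of the incoming $\pi$,
$$\sigma_{tot} = \frac{g^4}{4\pi^2M_P^2}\int d\Omega\,\frac{1}{16|\vec p|^4\sin^4\frac{\theta}{2}} = \frac{g^4}{64\pi^2M_P^2M^4v^4}\int d\Omega\,\frac{1}{\sin^4\frac{\theta}{2}} = \frac{2\pi g^4}{64\pi^2M_P^2M^4v^4}\int\frac{\sin\theta\,d\theta}{\sin^4\frac{\theta}{2}} \tag{21.191}$$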
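The angular dependence in (21.191) can be explored with a short numerical sketch. It evaluates the integrand of (21.189) using the momentum transfer (21.190) and compares it with the closed form in $\sin^4(\theta/2)$; the coupling, masses and velocity below are hypothetical inputs in natural units (MeV), chosen only for illustration.

```python
import math

def momentum_transfer_sq(p_abs, theta):
    """Q^2 = 2 |p|^2 (1 - cos theta) for elastic scattering with |p| = |p'|."""
    return 2.0 * p_abs ** 2 * (1.0 - math.cos(theta))

def dsigma_domega(g, m_p, m_pi, v, theta):
    """Integrand of (21.189): g^4 / (4 pi^2 M_P^2 Q^4), massless exchange."""
    q_sq = momentum_transfer_sq(m_pi * v, theta)
    return g ** 4 / (4.0 * math.pi ** 2 * m_p ** 2 * q_sq ** 2)

def dsigma_domega_closed(g, m_p, m_pi, v, theta):
    """Integrand of (21.191): g^4 / (64 pi^2 M_P^2 M^4 v^4 sin^4(theta/2))."""
    s = math.sin(theta / 2.0)
    return g ** 4 / (64.0 * math.pi ** 2 * m_p ** 2 * m_pi ** 4 * v ** 4 * s ** 4)

# Hypothetical inputs: g in MeV, masses in MeV, v in units of c.
g, M_P, M, v = 1.0, 938.272, 139.57, 0.05
for theta in (0.5, 1.0, 2.0, math.pi):
    print(theta, dsigma_domega(g, M_P, M, v, theta))
```

The integrand grows like $1/\theta^4$ at small angles, so the integral over all angles diverges at $\theta \to 0$ for a massless exchanged particle; this is the familiar behavior of the Rutherford formula.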
21.5.3 Lifetime
In this last section we will consider another application of the theory developed
so far, namely the calculation of a decay rate, or a lifetime for an unstable
particle.
The process of interest has initially a $\sigma$ and finally two pions. We therefore
must consider
$$\langle p, q|\,S\,|k\rangle \tag{21.192}$$
where $k$ denotes the momentum of the initial $\sigma$ and $p$ and $q$ are the momenta
of the final pions.
Figure 21.14: Feynman diagram for $\sigma$ decay
with $p_0 = m/2$ and $|\vec p| = \sqrt{p_0^2 - M^2}$. Thus
$$\Gamma(\sigma \to 2\pi) = \frac{g^2}{8\pi m^2}\sqrt{m^2 - 4M^2} \tag{21.199}$$
The lifetime is the inverse of this, $\tau = 1/\Gamma$.
Numerical Evaluation
Generally one wants a cross section in terms of cm$^2$ and a lifetime in seconds.
We have used $\hbar = c = 1$ and will express everything else in MeV. The cross
section will have the dimension of (MeV)$^{-2}$, the decay rate is of dimension
MeV and the lifetime (MeV)$^{-1}$. To go to cm$^2$ the cross section must be multiplied
by $(\hbar c)^2 = (1.97327 \times 10^{-11}\ \text{MeV cm})^2 \approx 3.8938 \times 10^{-22}\ \text{MeV}^2\,\text{cm}^2$. To go from MeV
to sec$^{-1}$ the decay rate must be divided by $\hbar = 6.582122 \times 10^{-22}$ MeV sec, and the
lifetime is thus $\hbar/\Gamma$. Note that in the examples above the coupling constant $g$ has the
dimension MeV.
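These conversions are simple enough to wrap in a few helper functions. The sketch below uses the two constants quoted above; the function names are illustrative and the inputs are assumed to be in natural units based on MeV.

```python
HBAR_C_MEV_CM = 1.97327e-11   # hbar * c in MeV cm, as quoted in the text
HBAR_MEV_S = 6.582122e-22     # hbar in MeV s, as quoted in the text

def cross_section_cm2(sigma_mev_inv2):
    """Convert a cross section from MeV^-2 to cm^2."""
    return sigma_mev_inv2 * HBAR_C_MEV_CM ** 2

def decay_rate_per_second(gamma_mev):
    """Convert a decay rate from MeV to s^-1."""
    return gamma_mev / HBAR_MEV_S

def lifetime_seconds(gamma_mev):
    """Lifetime hbar / Gamma in seconds for a decay rate given in MeV."""
    return HBAR_MEV_S / gamma_mev

# Example: a cross section of 1 MeV^-2 and a width of 1 MeV.
print(cross_section_cm2(1.0))
print(decay_rate_per_second(1.0))
print(lifetime_seconds(1.0))
```

Keeping the constants in one place makes it easy to switch to other units (for example barns or GeV) by changing only the two numbers at the top.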