
University of Southampton

School of Physics and Astronomy

PHYS6003

Advanced Quantum Physics

Academic year 2022-23

Pasquale Di Bari
(e-mail: P.Di-Bari@soton.ac.uk)
Contents

Introduction

1 Vector spaces
  1.1 Definition of vector spaces
  1.2 Linear dependence and linear independence
  1.3 Dimension and basis of a vector space
  1.4 Inner product
    1.4.1 Orthogonal vectors and orthonormal basis
    1.4.2 Inner product in terms of the components
  1.5 Dual space
  1.6 Gram-Schmidt procedure
  1.7 Useful inequalities
  1.8 Linear operators
  1.9 Adjoint of an operator
  1.10 Hermitian and Unitary operators
  1.11 Matrix representation of operators
  1.12 Outer product
  1.13 Projection operators
  1.14 Completeness relation
  1.15 Unitary transformations
  1.16 Active and passive transformations
  1.17 Trace and determinant of an operator
  1.18 The eigenvalue problem
    1.18.1 The characteristic equation
  1.19 The eigenvalue problem for Hermitian operators
    1.19.1 Non-degenerate case
    1.19.2 Degenerate case
    1.19.3 Eigenvectors and eigenvalues of unitary operators
    1.19.4 Diagonalization of Hermitian matrices
    1.19.5 Simultaneous diagonalization of two Hermitian operators
  1.20 Generalization to infinite dimensions
    1.20.1 A useful discrete-continuous dictionary
    1.20.2 Operators
  1.21 Tensor product of vector spaces

2 The principles of quantum mechanics
  2.1 The principles of Quantum Mechanics
  2.2 Remarks
  2.3 How to test quantum mechanics
  2.4 Expectation value and uncertainty of the measurement
  2.5 Compatible and incompatible operators
    2.5.1 Compatible operators: probability of a measurement outcome
  2.6 The Schrödinger equation
    2.6.1 The time-evolution operator
    2.6.2 Derivation of the Schrödinger equation
    2.6.3 Stationary states and time dependence of the state vector
    2.6.4 Evolution of the expectation value of time-independent observables
    2.6.5 The Schrödinger picture and the Heisenberg picture
    2.6.6 Schrödinger equation for the wave function in position basis

3 Harmonic Oscillator in Quantum Mechanics
  3.1 Hamiltonian operator and dimensionless variables
  3.2 Ladder operators
  3.3 Matrix representations
  3.4 Eigenstate wave functions in the x̂-basis

4 Angular momentum
  4.1 Definition
  4.2 Ladder operators for angular momentum
  4.3 Eigenvalues of Ĵ² and Ĵ_z
  4.4 Matrix elements and representations of angular momentum operators
  4.5 Orbital angular momentum
  4.6 Spin
    4.6.1 General properties
    4.6.2 Elementary particles and their spin
    4.6.3 Matrix representations of spin 1/2 particles
  4.7 Addition of angular momenta
  4.8 Clebsch-Gordan coefficients
  4.9 Simplest application: two spin 1/2 particles
    4.9.1 Entangled states

5 Relativistic Quantum Mechanics
  5.1 The Klein-Gordon equation
  5.2 The Dirac equation

6 The second quantum revolution
  6.1 Bell inequalities
    6.1.1 Spin correlation measurements
    6.1.2 EPR paradox and Bell's inequality
  6.2 Quantum computing
  6.3 Quantum cryptography
  6.4 Quantum decoherence
    6.4.1 Density matrix
Introduction

The name of this module could be intimidating, since the word Advanced might suggest that the level of difficulty of the subjects treated is much higher than in the 2nd year Quantum Physics module. In fact, in some respects, it is maybe even lower! It is probably more correct to say that in this module the technical tools are of a different kind: in Quantum Physics you mainly had to solve the Schrödinger equation applied to different problems, and in most cases the solution required a knowledge of calculus with boundary conditions and special functions. In our case, quantum physics is discussed at a more formal (axiomatic) and fundamental level and, as we will see, we will mainly have to deal with symbolic operations, chiefly algebraic transformations in vector spaces. In our discussion, the description of a quantum state in terms of the wave function, for example ψ(x, t) for a one-dimensional system, will be replaced by a state vector defined in an infinite-dimensional complex vector space. In the Dirac notation, this will be denoted by |ψ(t)⟩, an object called a ket that, in general, also depends on time (though we will often leave the time dependence implicit). It will then be important, at the beginning, to get familiar with the notion of vector spaces.
The most familiar vector space you know is certainly the three-dimensional Euclidean space, where a generic vector v⃗ (e.g., position or velocity) can be expressed in terms of basis vectors i, j, k simply as

v⃗ = v_1 i + v_2 j + v_3 k ,  (1)

where v_1, v_2 and v_3 are the components of v⃗ in the basis {i, j, k} and are real numbers.
Another example of vectors, though of quite a different kind, is given by complex numbers. As you know, a complex number z = x + iy can be represented in the complex plane, or Argand plane, where the real part x takes values on the horizontal axis and the imaginary part on the vertical axis. Operations between complex numbers can then be expressed vectorially.

In these two examples the basis is discrete, since the basis vectors and the components of a generic vector can be labelled by a discrete index; for example, in the Euclidean case we could re-express i = x̂_1, j = x̂_2, k = x̂_3. Therefore, they are examples of discrete vector spaces.
In general, however, one can also have a continuous basis, where basis vectors and components are labelled by a continuous parameter.
The familiar examples we mentioned can be further generalised to vector spaces with an arbitrary number of dimensions and with components given by complex numbers. In quantum mechanics (QM) the state of a physical system, which we will refer to as the quantum state, is described by a ket |ψ⟩, and this can be regarded as a state vector in an infinite-dimensional complex space satisfying certain conditions (it needs to be complete and endowed with an inner product), mathematically what is known as a Hilbert space.
The basis vectors correspond to the eigenvectors of Hermitian operators. Hermitian operators have real eigenvalues and are associated with physical observables. A measurement can be regarded as an operation ⟨ψ|Ô|ψ⟩, where Ô is a Hermitian operator. The spectrum of a Hermitian operator can be discrete, continuous, or both; therefore, the basis vectors for quantum states will have, in general, both continuous and discrete indices. The description of a physical system in QM in terms of state vectors is more fundamental and general than a description in terms of wave functions. A wave function corresponds to components with continuous indices (the argument of the wave function). The wave function you are typically used to, and the simplest one, is the wave function in one-dimensional coordinate space, ψ(x, t). As we will discuss, in the non-relativistic limit one can simply derive the usual wave function from the quantum state (as we will see, it corresponds to the operation ψ(x, t) = ⟨x|ψ(t)⟩). Therefore, the formalism you are already familiar with can be fully recovered. However, if one tries to extend the use of a wave function to the relativistic regime, various fundamental issues are encountered.
The easiest way to understand this is to consider the simple case of a particle that at time t = 0 is located inside a small box centred at the origin of the reference frame. The solution for the wave function is in this case well known. Suppose now that the box is opened and the particle is left to diffuse freely. One would find that at a time t the wave function is non-vanishing at arbitrarily large distances, even for |r⃗| > ct, where c is the speed of light. This implies a non-vanishing probability of measuring the particle outside the light cone of the particle initially placed in the box. Obviously, this would violate the principle that there is a maximum speed for causal signals, corresponding to the speed of light and translating into Lorentz invariance. It shows an intrinsic obstacle to generalising the concept of wave function to the relativistic case: in other words, there is a clash between a wave-function formulation of QM and special relativity. This becomes even more evident in particle physics, in processes where colliding particles are destroyed and new particles are created.
The solution to these problems is provided by Quantum Field Theories, which ultimately represent the consistent way to formulate a relativistic extension of QM to describe Nature (barring gravity!), but this goes well beyond the scope of the module. However, a formulation of QM in terms of quantum states described by kets provides the correct fundamental framework for quantum field theories and, therefore, represents the first necessary step toward a relativistic formulation of QM. This formalism should thus be regarded (to our current knowledge!) as the fundamental underlying description of the state of a system. It is indeed so fundamental that any attempt to replace it with a more fundamental one has failed so far.¹ We will see an example with the attempt of hidden-variable theories, which have been experimentally disproved.
For this reason the formalism of quantum states is a necessity. Finally, let us mention that, since the components of a state vector are complex, there is a fundamental difference with respect to Classical Mechanics.
Let us mention an important feature of QM that clearly distinguishes it from Classical Mechanics. In QM, if one has quantum states corresponding to two different outcomes of a certain observable, say A and B, the generic quantum state describing the system will be given by a superposition of the two basis vectors (the eigenstates of the operator associated with the observable) corresponding to the two different outcomes, each with its own component α and β, explicitly

|ψ⟩ = α|A⟩ + β|B⟩ ,  (2)

where α and β are complex numbers (the components of the state vector in this case). If the state is multiplied by an overall complex number, the quantum state does not change.² The interesting thing is that a measurement will still give as a result either A or B, but the outcome cannot be predicted for a single measurement: one can only know that the probability of obtaining A is given by |α|²/(|α|² + |β|²) and the probability of obtaining B by |β|²/(|α|² + |β|²). This is very different from Classical Mechanics, where the combination of two physical states results in general in a new mixed state with possible outcomes other than A and B, and where the outcome is, at least in principle, predictable.

¹ Notice that even string theory relies on such a framework: it is not a theory from which one can, for example, think of deriving the principles of QM. There is a clear statement on this point made by E. Witten during a conversation about the great synthesis with Nobel laureate Abdus Salam, which took place in 1990 at the ICTP (Trieste) [1].
² Indeed, as we will see, a quantum state is described by an infinite set of state vectors called a ray. It is as if the direction of the state vector defined the quantum state, while its length is irrelevant. We will come back to this point in more detail.
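As a quick numerical illustration of these probability rules, here is a minimal sketch (not part of the original notes; the values of α and β are arbitrary):

```python
import numpy as np

# Arbitrary (unnormalised) components of |psi> = alpha|A> + beta|B>
alpha, beta = 1 + 2j, 3 - 1j

norm2 = abs(alpha)**2 + abs(beta)**2
p_A = abs(alpha)**2 / norm2   # probability of measuring outcome A
p_B = abs(beta)**2 / norm2    # probability of measuring outcome B
print(p_A, p_B, p_A + p_B)    # the two probabilities sum to 1

# Multiplying the state by an overall complex number leaves them unchanged
scale = 2 - 5j
norm2s = abs(scale*alpha)**2 + abs(scale*beta)**2
assert np.isclose(abs(scale*alpha)**2 / norm2s, p_A)
```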
Bibliography

[1] See video at https://www.youtube.com/watch?v=AmUI2qf9uyo.

Chapter 1

Vector spaces

Before we discuss how the formalism of vector spaces allows us to formulate the principles of QM in a rigorous way, to solve brilliantly some difficult problems and to describe important physical systems, we first need to familiarise ourselves with vector spaces, discussing the operations one can perform with state vectors in a vector space. This (long) chapter will then provide the necessary mathematical tools to discuss QM in the next chapters suitably equipped.

1.1 Definition of vector spaces

Let us denote a vector space by V. The objects in this space are called vectors. Following Dirac's notation, we will indicate vectors in a vector space by kets, such as |u⟩, |v⟩, |w⟩. Two operations can be performed with vectors in a vector space, vector sum and multiplication of a vector by a scalar, defined in the following way:

• Given two generic vectors |u⟩ and |v⟩ ∈ V, one can form a vector sum, denoted by |u⟩ + |v⟩;

• Given a generic vector |u⟩ ∈ V and a generic number α, called a scalar, there is a multiplication, denoted by α|u⟩. Vectors can be multiplied either by real or by complex scalars.¹ In the first case one has a real vector space, in the second a complex vector space. Scalars form the field over which the vector space is defined.

¹ The vectors themselves are neither real nor complex; 'real' or 'complex' refers only to the scalars.

The operation of vector sum must satisfy the following linear properties:

• |u⟩ + |v⟩ is still a vector in V (closure);
• |u⟩ + |v⟩ = |v⟩ + |u⟩ (commutativity);
• (|u⟩ + |v⟩) + |w⟩ = |u⟩ + (|v⟩ + |w⟩) (associativity);
• There is a null vector |0⟩ such that |u⟩ + |0⟩ = |u⟩;
• For any generic vector |u⟩ there is an inverse vector |−u⟩ such that |u⟩ + |−u⟩ = |0⟩.

The operation of multiplication of a vector by a scalar must satisfy the following properties:

• If |u⟩ is a vector in V, then α|u⟩ is also a vector in V (closure);
• (α + β)|u⟩ = α|u⟩ + β|u⟩ (distributivity);
• α(|u⟩ + |v⟩) = α|u⟩ + α|v⟩ (distributivity);
• α(β|u⟩) = (αβ)|u⟩ (associativity);
• 1|u⟩ = |u⟩ (unit law).

From these axioms it follows that 0|u⟩ = |0⟩ and |−u⟩ = −|u⟩ (problem sheet n. 1).
Note: in the following, we will denote the null vector simply by 0, i.e., replacing |0⟩ → 0.²
In order to simplify the notation, it will prove convenient in the following to denote the product of a vector |u⟩ by a scalar α simply by |αu⟩ ≡ α|u⟩. It will also prove convenient to denote the sum of two generic vectors |u⟩ and |v⟩ by |u + v⟩ ≡ |u⟩ + |v⟩.
The Euclidean space is a particular example of a three-dimensional real vector space. In this case, given an (orthonormal) basis (reference frame) with unit basis vectors i, j and k, a generic vector |v⟩, which in the specific case of the Euclidean space is commonly denoted by v⃗, can be expressed as

|v⟩ = v⃗ = v_1 i + v_2 j + v_3 k ,  (1.1)

where the components v_1, v_2 and v_3 are real numbers.

² This is because we keep the notation |0⟩ in store to denote the ground state (the state with minimum energy) of systems such as the harmonic oscillator, as is usual in physics books such as those by Dirac or Sakurai (see references). The notation |0⟩ for the null vector is encountered in more formally, mathematically oriented books such as [2]. The ground state of the fundamental theory is called the vacuum state, and its identification and properties have very important physical consequences. Currently it is unknown, since we do not yet know the fundamental theory of Nature.


The space of row vectors containing ordered lists of n real or complex
numbers is also an example of n-dimensional vector space, like also the space
of column vectors. For example, a generic column vector can be written as
       
v1 1 0 0
 v2 
 = v1  0  + v2  1  + · · · + vn  0  , (1.2)
     
|vi = 
 ...   ...   ...   ... 
vn 0 0 1

where we have expressed it in terms of unit column vectors that provide the
simple basis for column vectors.
We need to generalise to our V the definition of basis and components.

1.2 Linear dependence and linear independence

Definition. A set of vectors {|i⟩} ≡ {|1⟩, |2⟩, …, |n⟩} ∈ V, labelled by a positive integer i = 1, …, n and not containing the null vector 0, is said to be linearly independent if the only linear combination satisfying

Σ_{i=1}^n a_i |i⟩ = 0  (1.3)

is the trivial one with all a_i = 0. On the other hand, if it is possible to find a non-trivial choice of scalars a_1, a_2, …, a_n, then the set is said to be linearly dependent.
Notice that, if the vectors are linearly dependent, then there are at least two nonzero a_i's. Suppose that a_j ≠ 0 for some j; in this case we can write

|j⟩ = − Σ_{i=1, i≠j}^n (a_i / a_j) |i⟩ .  (1.4)
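In a concrete column-vector representation, linear (in)dependence can be tested numerically through the rank of the matrix whose columns are the given vectors. A minimal sketch with numpy (the example vectors are arbitrary):

```python
import numpy as np

# Three vectors in C^3, stored as the columns of a matrix
v1 = np.array([1, 0, 1j])
v2 = np.array([0, 1, 0])
v3 = v1 + 2*v2            # deliberately a linear combination of v1 and v2

M = np.column_stack([v1, v2, v3])

# rank < number of columns  <=>  the set is linearly dependent
print(np.linalg.matrix_rank(M))   # 2, so {v1, v2, v3} is linearly dependent
```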

1.3 Dimension and basis of a vector space

Definition. A vector space V has dimension n if it can contain no more than n linearly independent vectors.
We will sometimes explicitly indicate the dimensionality n of a vector space by writing V^n.
Definition. A basis is a set of n linearly independent vectors, called basis vectors, in an n-dimensional vector space.
Theorem. Any generic vector |u⟩ in an n-dimensional vector space V can be written as a linear combination of the basis vectors |1⟩, |2⟩, …, |n⟩, explicitly

|u⟩ = Σ_{i=1}^n u_i |i⟩ .  (1.5)

The coefficients u_i are the components of |u⟩ in the given basis and they are unique.³
Proof. Let us prove the first statement by contradiction, supposing that there is some vector |v⟩ which cannot be expressed as a linear combination of the basis vectors. In that case we would have found a set of n + 1 linearly independent vectors, the n basis vectors plus |v⟩, but this would contradict the definition of an n-dimensional vector space.
Moreover, we can also prove by contradiction that the components are unique in a given basis. If they were not, then one could write

|u⟩ = Σ_{i=1}^n u'_i |i⟩ ,  (1.6)

where u'_i ≠ u_i for at least one particular value of i. However, subtracting Eq. (1.6) from Eq. (1.5), one would find

Σ_{i=1}^n (u_i − u'_i) |i⟩ = 0 ,  (1.7)

and this implies u_i = u'_i for all i, since otherwise the basis vectors would be linearly dependent.

³ The components are scalars and, therefore, in a complex vector space they are complex, and in a real vector space, such as the Euclidean space, they are real.

Notice that, even though the components of the vector |u⟩ are unique for a given basis, they change under a basis transformation. Explicitly, if we make a change of basis {|i⟩} → {|i'⟩}, then u_i → u'_i. However, a ket |u⟩ remains the same in any basis, since it has to be understood as an abstract vector, not depending on the specific basis. Therefore, a relation among kets, such as, for example,

|w⟩ = |u⟩ + |v⟩ ,  (1.8)

is also independent of the basis. When the basis is specified, relations among kets can be translated into relations among the components. For example, the relation (1.8) translates, in the basis {|i⟩}, into an analogous relation among the components,

w_i = u_i + v_i ,  (1.9)

implying that the components of the sum of two vectors are given by the sums of their components. In a new basis {|i'⟩}, the components would change in such a way that Eq. (1.9) would simply transform into

w'_i = u'_i + v'_i .  (1.10)

Definition of subspace. Given a vector space V, a subset of its elements that form a vector space among themselves⁴ is called a subspace of V.
The dimensionality of a subspace of V is strictly lower than the dimensionality of V itself⁵, barring the trivial case when the subspace coincides with the space itself.

⁴ This implies that the same addition and multiplication are defined as in V and that they are closed in the subspace.
⁵ Example: a plane in the Euclidean space is a subspace with dimensionality m = 2.

1.4 Inner product

We now want to generalise the notion of the scalar (or dot) product of two vectors in the Euclidean space to our vector space V. In the Euclidean space, given two vectors a⃗ and b⃗, their scalar product is defined in terms of their lengths and the angle θ between them: a⃗ · b⃗ = |a⃗| |b⃗| cos θ. However, in our case we do not yet have a definition of the length of a vector or of the angle between two vectors. We therefore have to proceed in an axiomatic way.
We define the inner product ⟨u|v⟩ of two vectors |u⟩ and |v⟩ in V by the following axioms:

(i) ⟨u|v⟩ = ⟨v|u⟩* (implying ⟨u|u⟩ = ⟨u|u⟩*, i.e., ⟨u|u⟩ is real);

(ii) ⟨u|u⟩ ≥ 0, and ⟨u|u⟩ = 0 if and only if |u⟩ = 0;

(iii) ⟨u|αv + βw⟩ ≡ α⟨u|v⟩ + β⟨u|w⟩, meaning that the inner product is linear in the second factor.

What if the first vector is a superposition? How can we calculate the inner product ⟨αu + βv|w⟩? The three axioms suffice, since indeed one has:

⟨αu + βv|w⟩ = ⟨w|αu + βv⟩*  (by axiom (i))
            = (α⟨w|u⟩ + β⟨w|v⟩)*  (by axiom (iii))
            = α*⟨w|u⟩* + β*⟨w|v⟩*
            = α*⟨u|w⟩ + β*⟨v|w⟩  (by axiom (i)) .  (1.11)

This result means that the inner product is anti-linear in the first factor.
Definition. Having defined an inner product on our vector space V, it is now promoted to the rank of an inner product space.
Notice that so far we do not yet have a way to calculate the inner product but, as we will see, this too will stem from the three axioms.
Definition. Having defined the inner product of two vectors, we can also define the norm (or generalised length) of a vector |u⟩ simply as

||u|| ≡ √⟨u|u⟩ .  (1.12)

Notice that axiom (ii) implies that the norm is a real non-negative number. A vector |u⟩ is a normalised vector if it has unit norm, i.e., ||u|| = 1.

1.4.1 Orthogonal vectors and orthonormal basis

Definition. Two distinct vectors |u⟩ and |v⟩ are orthogonal if their inner product vanishes, i.e., ⟨u|v⟩ = 0.
A set of mutually orthogonal vectors is one in which the inner product of every distinct pair of vectors vanishes.
Theorem. A set of mutually orthogonal vectors is necessarily also linearly independent.
Proof. Let us impose

Σ_{i=1}^n a_i |i⟩ = 0 ,  (1.13)

for some choice of the coefficients a_i. Taking the inner product of each term with a vector |j⟩, where j is a particular value of i, and considering that from axiom (iii) one has ⟨j|0⟩ = 0, one obtains

Σ_{i=1}^n a_i ⟨j|i⟩ = 0 .  (1.14)

Since the vectors are mutually orthogonal, we have ⟨j|i⟩ = 0 for any i ≠ j, obtaining a_j = 0. Since the procedure can be iterated for any j, one arrives at the conclusion that all a_i = 0 and, therefore, that the set {|i⟩} is necessarily linearly independent.
However, notice that the converse is not necessarily true: vectors in a linearly independent set are not necessarily mutually orthogonal.⁶
Definition. A set of basis vectors, all with unit norm, which are pairwise orthogonal, is called an orthonormal basis.
This implies that if the vectors |1⟩, |2⟩, …, |n⟩ form an orthonormal basis, then one has (for i, j = 1, …, n)⁷

⟨i|j⟩ = δ_ij .  (1.15)

1.4.2 Inner product in terms of the components

We can now derive an explicit formula for the inner product in terms of the components of the two vectors. Given two kets |u⟩ and |v⟩ and expanding them in a generic, not necessarily orthonormal, basis {|i⟩},

|u⟩ = Σ_{i=1}^n u_i |i⟩  (1.16)

and

|v⟩ = Σ_{i=1}^n v_i |i⟩ ,  (1.17)

their inner product ⟨u|v⟩ will be given by⁸

⟨u|v⟩ = ⟨ Σ_{i=1}^n u_i i | Σ_{j=1}^n v_j j ⟩ .  (1.18)

Using the linearity of the inner product in the second vector (axiom (iii)) and the anti-linearity in the first vector (see Eq. (1.11)), one arrives at the expression

⟨u|v⟩ = Σ_{i=1}^n Σ_{j=1}^n u_i* v_j ⟨i|j⟩ .  (1.19)

Notice that the inner products between the basis vectors, the ⟨i|j⟩'s, depend on the basis. In particular, for an orthonormal basis one can use Eq. (1.15): under the action of δ_ij the double sum collapses into a single sum, and one simply obtains

⟨u|v⟩ = Σ_{i=1}^n u_i* v_i .  (1.20)

⁶ It is easy to convince oneself of this point by thinking of vectors in the Euclidean space: one can clearly have three linearly independent vectors that are not mutually orthogonal.
⁷ The symbol δ_ij is the Kronecker delta, defined by δ_ij = 1 for i = j and δ_ij = 0 for i ≠ j.
⁸ Notice that in Eq. (1.18) we replaced the dummy index i with j in the second expansion, since it is better not to use the same dummy index twice in the same formula: this can easily lead to incorrect results.

This clearly shows the convenience of working in an orthonormal basis, as we will always do from now on. Notice that ⟨u|v⟩ is in general a complex number. However, when one considers the norm of a vector |u⟩, Eq. (1.20) specialises into

||u||² = Σ_{i=1}^n |u_i|² ,  (1.21)

an expression consistent with axiom (ii), implying that the norm of a vector is a non-negative real number.
In a given basis a vector is specified by its components and, therefore, one can represent a ket |u⟩ in terms of the column vector formed by its components:⁹

|u⟩ → u ≡ (u_1, u_2, …, u_n)^T .  (1.22)

The inner product ⟨u|v⟩ of two kets |u⟩ and |v⟩ can then be obtained as the matrix product of the conjugate transpose of the column vector representing |u⟩ with the column vector representing |v⟩, explicitly:

⟨u|v⟩ = Σ_{i=1}^n u_i* v_i = (u_1* u_2* … u_n*) (v_1, v_2, …, v_n)^T .  (1.23)

Denoting the conjugate transpose, or adjoint, of u by u† ≡ (u_1* u_2* … u_n*), we can also express the inner product, more compactly, as a matrix product:

⟨u|v⟩ = Σ_{i=1}^n u_i* v_i = u† v .  (1.24)

⁹ In the following, the (short) arrow symbol → will always stand for 'is represented in the given basis by'.
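In numpy, Eq. (1.24) is exactly what np.vdot computes: it conjugates its first argument before summing. A minimal sketch (the vectors are chosen arbitrarily):

```python
import numpy as np

u = np.array([1 + 1j, 2, 3j])
v = np.array([0, 1 - 1j, 2])

ip = np.vdot(u, v)                    # <u|v> = sum_i u_i^* v_i
ip_explicit = (u.conj() * v).sum()
assert np.isclose(ip, ip_explicit)

# axiom (i): <v|u> = <u|v>^*
assert np.isclose(np.vdot(v, u), ip.conjugate())

norm_u = np.sqrt(np.vdot(u, u).real)  # ||u|| is real and non-negative
print(ip, norm_u)
```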

1.5 Dual space

We have introduced the inner product as an operation between any two kets in the vector space, such that the order of the two vectors, i.e., which of the two has to be understood as the first vector and which as the second, has to be specified. This way of interpreting the inner product is perfectly fine and we could continue with it without any contradiction.
However, there is a second, perfectly equivalent, way that has various advantages, including a simplified notation, and that is somehow inspired by Eq. (1.23). In the same way as an abstract ket |u⟩ is represented by a column vector, from which one can obtain an associated row vector by transposing and complex conjugating (the adjoint), one can say that the row vector represents a different kind of abstract vector, called a bra¹⁰, denoted by the symbol ⟨u| and associated with |u⟩.¹¹ In this way one does not have to distinguish a first and a second vector in the inner product ⟨u|v⟩: one can simply say that it is obtained as the product of a ket |v⟩ with a bra ⟨u|, and there is no ambiguity, since the bra is always the first vector and the ket is always the second.
There are several useful observations and implications to be noticed:

• The set of all bras forms the so-called dual space of V, denoted by V*;

• In the same way as the conjugate transpose of a column produces its adjoint, which is a row, and vice-versa, a bra ⟨u| can be regarded as the adjoint of the ket |u⟩ and vice-versa;

• Any basis of kets {|i⟩} has its adjoint basis of bras {⟨i|} and vice-versa;

• If one has a ket |αu⟩, where α is a scalar, then its adjoint bra is given by¹²

⟨αu| = ⟨u| α* ,  (1.25)

since, if one has a column with components {α u_i}, then its adjoint will be a row with components {α* u_i*}, explicitly:

|αu⟩ → (α u_1, α u_2, …, α u_n)^T  ←adjoint→  (α* u_1* α* u_2* … α* u_n*) ← ⟨αu| .  (1.26)

• From any relation among kets, such as

α|u⟩ = β|v⟩ + γ|w⟩ ,  (1.27)

one can obtain the adjoint expression by replacing each term (a scalar multiplying a ket) with its adjoint (a bra multiplied by the complex conjugate of the scalar) and vice-versa. Explicitly, in the previous example, one obtains:

⟨u|α* = ⟨v|β* + ⟨w|γ* .  (1.28)

This can be expressed by saying that, to take the adjoint of an equation relating kets (bras), one needs to replace every ket (bra) with the corresponding bra (ket), taking the complex conjugate of all coefficients.

• If the expansion in components of a ket |u⟩ in the given basis is

|u⟩ = Σ_{i=1}^n u_i |i⟩ ,  (1.29)

then taking the adjoint of this relation gives us the expansion of the corresponding bra in the adjoint basis:

⟨u| = Σ_{i=1}^n ⟨i| u_i* .  (1.30)

• Take the inner product of the expansion (1.29) with a basis bra ⟨j|. Using the orthonormality of the basis, implying ⟨j|i⟩ = δ_ij, one immediately finds that the components can also be written as

u_j = ⟨j|u⟩ ,  (1.31)

so that one can also write the expansion of a ket |u⟩ as

|u⟩ = Σ_{i=1}^n ⟨i|u⟩ |i⟩ .  (1.32)

Taking the adjoint of this expression, one immediately finds the corresponding expression for the bra,

⟨u| = Σ_{i=1}^n ⟨i| ⟨u|i⟩ ,  (1.33)

having used ⟨i|u⟩* = ⟨u|i⟩. If one compares this expansion with (1.30), one consistently obtains u_i* = ⟨u|i⟩.

¹⁰ One can now finally appreciate the origin of the name 'ket': taking the inner product with a bra corresponds to forming a bra(c)ket!
¹¹ It is the analogue of forming a contravariant vector from a covariant vector in relativity. However, notice that in relativity the squared length A_μ A^μ of a generic 4-vector A^μ can also be negative!
¹² Notice that one has ⟨u|α* = α*⟨u|. Usually there is some convenience in writing the scalar on the right of a bra, analogously to what must be done with operators, since, as we will see, multiplication by a scalar α can be regarded as applying the operator αÎ, where Î is the identity operator, to the ket. However, as we will see, there are also situations where it is more convenient to write the scalar on the right of the ket and on the left of the bra.

1.6 Gram-Schmidt procedure

The Gram-Schmidt procedure is an iterative procedure that allows one to build an orthonormal basis starting from a generic non-orthonormal basis, which we denote by

{|e_i⟩} ≡ |e_1⟩, |e_2⟩, …, |e_n⟩ .  (1.34)

These are the steps of the Gram-Schmidt procedure that converts {|e_i⟩} into an orthonormal basis {|i⟩}:

1) The first step consists in normalising |e_1⟩, easily obtaining |1⟩:

|e_1⟩ → |1⟩ = |e_1⟩ / ||e_1|| .  (1.35)

2) One can then obtain |2⟩ from |e_2⟩ using the |1⟩ obtained in the previous step. This step can be split into two sub-steps:

a) In a first sub-step, one subtracts from |e_2⟩ its component parallel to |1⟩, obtaining an intermediate vector |2'⟩, explicitly:

|e_2⟩ → |2'⟩ = |e_2⟩ − |1⟩⟨1|e_2⟩ .  (1.36)

One can immediately verify that ⟨1|2'⟩ = 0.

b) In the second sub-step one normalises |2'⟩, obtaining the unit-norm vector |2⟩, explicitly:

|2'⟩ → |2⟩ = |2'⟩ / ||2'|| .  (1.37)

3) In the third step one transforms |e_3⟩ → |3⟩, using the |1⟩ and |2⟩ obtained in the previous two steps. It can again be decomposed into the following two sub-steps:

a) One first subtracts from |e_3⟩ its components parallel to |1⟩ and |2⟩, obtaining an intermediate vector |3'⟩, explicitly:

|e_3⟩ → |3'⟩ = |e_3⟩ − |1⟩⟨1|e_3⟩ − |2⟩⟨2|e_3⟩ ;  (1.38)

b) One then normalises |3'⟩, obtaining the unit-norm vector |3⟩, explicitly:

|3'⟩ → |3⟩ = |3'⟩ / ||3'|| .  (1.39)

…

n) Finally, with the last, n-th step one transforms |e_n⟩ → |n⟩ using the mutually orthogonal unit-norm kets |1⟩, …, |n−1⟩ obtained in the previous steps. It can again be decomposed into two sub-steps:

a) One first subtracts from |e_n⟩ all the components parallel to |1⟩, …, |n−1⟩, obtaining an intermediate vector |n'⟩, explicitly:

|e_n⟩ → |n'⟩ = |e_n⟩ − Σ_{i=1}^{n−1} |i⟩⟨i|e_n⟩ ;  (1.40)

b) Finally, one normalises |n'⟩, obtaining the unit-norm vector |n⟩, explicitly:

|n'⟩ → |n⟩ = |n'⟩ / ||n'|| .  (1.41)

Notice that, since the initial basis {|e_i⟩} is, by definition of basis, a linearly independent set, one has |i'⟩ ≠ 0 for all i = 1, …, n. Indeed, if by subtracting all the components parallel to |1⟩, …, |m−1⟩ one obtained 0 for some m ≤ n, then the vectors would not be linearly independent, since this would imply

|e_m⟩ = Σ_{i=1}^{m−1} |i⟩⟨i|e_m⟩ .  (1.42)

The Gram-Schmidt procedure clearly shows that the dimension of the space also corresponds to the maximum possible number of mutually orthogonal vectors one can construct, and this is of course consistent with the theorem we proved in 1.4.1, stating that a set of mutually orthogonal vectors is necessarily also linearly independent. A numerical sketch of the procedure is given below.
Problem. What happens if one tries to go further (step n + 1), building an |n + 1⟩ vector that is orthonormal to |1⟩, …, |n⟩ (see problem sheet n. 2)?
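The procedure translates directly into code. A minimal numerical sketch with numpy (the starting basis is arbitrary; a production code would typically use np.linalg.qr instead):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalise a linearly independent list of vectors."""
    basis = []
    for e in vectors:
        # subtract the components parallel to the kets found so far
        for q in basis:
            e = e - q * np.vdot(q, e)       # |e'> = |e> - |q><q|e>
        basis.append(e / np.linalg.norm(e))  # normalise
    return basis

e1 = np.array([1.0, 1.0, 0.0])
e2 = np.array([1.0, 0.0, 1.0])
e3 = np.array([0.0, 1.0, 1.0])

q1, q2, q3 = gram_schmidt([e1, e2, e3])

# check orthonormality: <i|j> = delta_ij
Q = np.column_stack([q1, q2, q3])
print(np.round(Q.conj().T @ Q, 10))   # identity matrix
```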

1.7 Useful inequalities

From the three axioms defining the inner product, one can derive some useful inequalities that generalise well-known, more intuitive inequalities for vectors in the Euclidean space.
Theorem. Given two generic vectors |u⟩ and |v⟩, one has

2 Re(⟨u|v⟩) ≤ ⟨u|u⟩ + ⟨v|v⟩ .  (1.43)

Proof. It is quite easy to prove. From axiom (ii) we can impose

⟨u − v|u − v⟩ ≥ 0 ,  (1.44)

and then, decomposing the left-hand side using the properties of the inner product, one obtains

⟨u|u⟩ + ⟨v|v⟩ − ⟨v|u⟩ − ⟨u|v⟩ ≥ 0 .  (1.45)

Since ⟨v|u⟩ = ⟨u|v⟩* and considering that ⟨u|v⟩ + ⟨u|v⟩* = 2 Re(⟨u|v⟩), one easily obtains the thesis, Eq. (1.43).
Theorem: Cauchy-Schwarz inequality. The absolute value of the inner product of two vectors cannot exceed the product of their norms, explicitly:

|⟨u|v⟩| ≤ ||u|| ||v|| .  (1.46)

Proof. If we define

|w⟩ = |v⟩ − (⟨u|v⟩/⟨u|u⟩) |u⟩ ,  (1.47)

imposing ⟨w|w⟩ ≥ 0 and considering that

⟨w| = ⟨v| − ⟨u| (⟨v|u⟩/⟨u|u⟩) ,  (1.48)

one finds

⟨v|v⟩ − ⟨u|v⟩*⟨u|v⟩/||u||² − ⟨u|v⟩⟨v|u⟩/||u||² + ⟨v|u⟩⟨u|v⟩⟨u|u⟩/||u||⁴ ≥ 0 ,  (1.49)

and from this, using ⟨u|u⟩ = ||u||², so that the last two terms cancel out, and considering that ⟨u|v⟩*⟨u|v⟩ = |⟨u|v⟩|², a few easy algebraic steps immediately give the Cauchy-Schwarz inequality, Eq. (1.46).
Theorem: Triangle inequalities. The first states that the norm of the sum of two vectors cannot exceed the sum of their norms, explicitly:

||u + v|| ≤ ||u|| + ||v|| .  (1.50)

The second states that the norm of the sum of two vectors cannot be lower than the absolute value of the difference of their norms, explicitly:

||u + v|| ≥ | ||u|| − ||v|| | .  (1.51)

The proofs are quite simple and are left as problems in problem sheet n. 2. A quick numerical check of all these inequalities is sketched below.
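A numerical check of the inequalities on random complex vectors (a sketch assuming numpy; it verifies but of course does not prove them):

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    u = rng.normal(size=4) + 1j*rng.normal(size=4)
    v = rng.normal(size=4) + 1j*rng.normal(size=4)
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    ip = np.vdot(u, v)
    assert 2*ip.real <= nu**2 + nv**2 + 1e-12           # Eq. (1.43)
    assert abs(ip) <= nu*nv + 1e-12                     # Cauchy-Schwarz (1.46)
    assert np.linalg.norm(u + v) <= nu + nv + 1e-12     # triangle (1.50)
    assert np.linalg.norm(u + v) >= abs(nu - nv) - 1e-12  # triangle (1.51)
print("all inequalities verified")
```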

1.8 Linear operators

An operator Ô transforms any given vector (either a ket or a bra) into another vector (a ket or a bra, respectively). Explicitly, in the case of kets, one can write

|v⟩ → |v'⟩ = Ô|v⟩ ≡ |Ov⟩ ,  (1.52)

where we introduced the notation |Ov⟩ to denote the transformed vector |v'⟩. In the case of bras, on the other hand, Ô acts from the right. Explicitly, for a generic bra ⟨v|, one has

⟨v| → ⟨ṽ'| = ⟨v|Ô ,  (1.53)

where, notice, ⟨ṽ'| does not in general coincide with the adjoint of |v'⟩, which is why we did not denote it by ⟨v'|. We will see in the next section how to obtain the bra adjoint of |v'⟩ = |Ov⟩.
We will only deal with linear operators, defined by the following rules:

Ô α|u⟩ = α Ô|u⟩ ,  (1.54)
Ô (α|u⟩ + β|v⟩) = α Ô|u⟩ + β Ô|v⟩ ,  (1.55)
⟨u|α Ô = ⟨u|Ô α ,  (1.56)
(⟨u|α + ⟨v|β) Ô = ⟨u|Ô α + ⟨v|Ô β .  (1.57)

Definition. The simplest operator is the identity operator, denoted by Î, transforming any vector into itself, so that explicitly for every ket |u⟩ one has

Î|u⟩ = |u⟩ ,  (1.58)

and for every bra ⟨u|

⟨u|Î = ⟨u| .  (1.59)

Linear operators have the advantage that, once their action on the basis vectors is known, one can immediately determine their action on any vector. This is simple to show. Suppose one has, for each basis vector |i⟩ in a basis {|i⟩} ≡ {|1⟩, |2⟩, …, |n⟩},

Ô|i⟩ = |i'⟩ ;  (1.60)

then, applying the operator to the expansion of a generic ket |u⟩ in the basis {|i⟩} (see Eq. (1.29)) and using the linearity of Ô, one immediately finds

Ô|u⟩ = Ô Σ_{i=1}^n u_i |i⟩  (1.61)
     = Σ_{i=1}^n u_i Ô|i⟩  (1.62)
     = Σ_{i=1}^n u_i |i'⟩ .  (1.63)

The product Ô₂Ô₁ of two operators Ô₁ and Ô₂ is also an operator. Its action on a generic vector |u⟩ is simply given by applying the two operators sequentially, in such a way that one has

Ô₂Ô₁|u⟩ = Ô₂(Ô₁|u⟩) = Ô₂|O₁u⟩ = Ô₂|u'⟩ = |u''⟩ ,  (1.64)

which can also be expressed more explicitly as the double transformation

|u⟩ →(Ô₁) |u'⟩ = Ô₁|u⟩ ≡ |O₁u⟩ →(Ô₂) |u''⟩ = Ô₂|u'⟩ = Ô₂Ô₁|u⟩ ≡ |O₂O₁u⟩ .  (1.65)

An important point is that in general the order of the two operators matters: defining the commutator of the operator Ô with P̂ as

[Ô, P̂] ≡ Ô P̂ − P̂ Ô ,  (1.66)

this is, in general, nonzero. Notice that the commutator is antisymmetric under the exchange of the two operators:

[Ô, P̂] = −[P̂, Ô] .  (1.67)

If the commutator vanishes, the two operators are said to commute with each other.
If one considers three operators Ô, P̂ and Q̂, then one has the following two useful identities (checked numerically in the sketch below):

[Ô, P̂ Q̂] = P̂ [Ô, Q̂] + [Ô, P̂] Q̂ ,  (1.68)
[Ô P̂, Q̂] = Ô [P̂, Q̂] + [Ô, Q̂] P̂ .  (1.69)
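A numerical check of these identities on random matrices (a sketch assuming numpy; operators are represented by matrices, as discussed in Section 1.11):

```python
import numpy as np

def comm(A, B):
    """Commutator [A, B] = AB - BA of two matrices."""
    return A @ B - B @ A

rng = np.random.default_rng(1)
O, P, Q = (rng.normal(size=(3, 3)) + 1j*rng.normal(size=(3, 3))
           for _ in range(3))

# [O, PQ] = P[O, Q] + [O, P]Q   (Eq. (1.68))
assert np.allclose(comm(O, P @ Q), P @ comm(O, Q) + comm(O, P) @ Q)

# [OP, Q] = O[P, Q] + [O, Q]P   (Eq. (1.69))
assert np.allclose(comm(O @ P, Q), O @ comm(P, Q) + comm(O, Q) @ P)

# antisymmetry: [O, P] = -[P, O]   (Eq. (1.67))
assert np.allclose(comm(O, P), -comm(P, O))
```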

Definition. Given an operator Ô, its inverse Ô⁻¹ is defined as the operator such that

Ô⁻¹ Ô = Ô Ô⁻¹ = Î .  (1.70)

Notice that not every operator has an inverse. If two operators Ô and P̂ both possess an inverse, then the inverse of their product is given by the product of their inverses in reverse order, explicitly:

(Ô P̂)⁻¹ = P̂⁻¹ Ô⁻¹ .  (1.71)

This can be easily verified: (Ô P̂)⁻¹ Ô P̂ = P̂⁻¹ Ô⁻¹ Ô P̂ = Î, and likewise one finds Ô P̂ (Ô P̂)⁻¹ = Î.

1.9 Adjoint of an operator

When we defined a generic operator Ô and wrote the transformed ket (see Eq. (1.52)) and the transformed bra (see Eq. (1.53)), we noted that ⟨u|Ô is not the adjoint of Ô|u⟩. How can we then obtain the bra adjoint of Ô|u⟩ ≡ |Ou⟩? The answer leads to the definition of the adjoint of an operator.
Definition. Given an operator Ô, its adjoint, denoted by Ô†, is defined by

⟨u|Ô† = ⟨Ou| ,  (1.72)

where ⟨Ou| is the adjoint of |Ou⟩ ≡ Ô|u⟩.¹³ Notice that in general Ô† ≠ Ô, and this implies that the inner product ⟨u|Ô†|v⟩ ≠ ⟨u|Ô|v⟩.
The correct expression for ⟨u|Ô†|v⟩ can be easily derived:

⟨u|Ô†|v⟩ = ⟨Ou|v⟩ = ⟨v|Ou⟩* = ⟨v|Ô|u⟩* ,  (1.73)

where in the second step we used axiom (i) defining the inner product.
Theorem. Given two operators Ô and P̂, the adjoint of their product is given by the product of their adjoints in reverse order, explicitly:

(Ô P̂)† = P̂† Ô† .  (1.74)

Proof. We need to derive an expression for ⟨OPu| given a generic bra ⟨u|. Let us first regard Ô P̂ as one operator; in that case we can write, by definition of adjoint,

⟨OPu| = ⟨u|(Ô P̂)† .  (1.75)

On the other hand, if we now write P̂|u⟩ = |P̂u⟩ = |u'⟩, we can also write, by definition of the adjoint of Ô,

⟨ÔP̂u| = ⟨Ôu'| = ⟨u'|Ô† .  (1.76)

However, by definition of the adjoint of P̂, we can also write ⟨u'| = ⟨P̂u| = ⟨u|P̂†, finding

⟨ÔP̂u| = ⟨u|P̂† Ô† .  (1.77)

Comparing this last expression with Eq. (1.75), we then find (Ô P̂)† = P̂† Ô†, proving the theorem.
Example. We can now generalise the example we gave of how, starting from Eq. (1.27), one obtains its adjoint relation, Eq. (1.28). For example, we can now start from the much more general relation

α₁|v₁⟩ = α₂|v₂⟩ + α₃|v₃⟩⟨v₄|v₅⟩ + α₄ Ô|v₆⟩ + α₅ P̂ Q̂|v₇⟩ ,  (1.78)

and from this, using all the prescriptions and results discussed, one obtains the adjoint relation

⟨v₁|α₁* = ⟨v₂|α₂* + ⟨v₅|v₄⟩⟨v₃|α₃* + ⟨v₆|Ô†α₄* + ⟨v₇|Q̂†P̂†α₅* .  (1.79)

Here again let us notice that the scalars α_i* might equally well be placed to the left of the bras, but it should now be clear that placing them to the right of the bras highlights their nature of special operators, resembling the way adjoint operators are written.
We can summarise the rules that take one relation to its adjoint by saying that one has to make the following substitutions, also holding in reverse order:

(i) Ô ←→ Ô† ;

(ii) |…⟩ ←→ ⟨…| ;

(iii) α_i ←→ α_i* .

¹³ We have already seen an example of an adjoint operator. The multiplication of a ket by a scalar α can be seen as the action of the operator Ô = αÎ on |u⟩. Therefore, since the adjoint of |αu⟩ is ⟨u|α* = ⟨u|Îα*, we can see that in this specific case (αÎ)† = Îα*. This also explains why in this case scalars should formally be placed to the right of the bra, for consistency with the operators acting on bras. However, in practice, a scalar can be freely moved from the right to the left of a bra, while an operator cannot!

1.10 Hermitian and Unitary operators

Definition. An operator Ô is Hermitian, or self-adjoint, if Ô† = Ô.
An obvious example of a Hermitian operator is the identity operator, since Î† = Î.
Definition. An operator Û is unitary if

Û Û† = Î .  (1.80)

Since Û and Û† are inverses of each other, then, necessarily from Eq. (1.70), one also has Û†Û = Î. A unitary operator is the analogue of a complex number with unit modulus, e.g., of the kind z = e^{iφ}, satisfying z z* = 1.
A very important property of unitary operators is the following.
Theorem. Unitary operators preserve the inner products of vectors.
Proof. Consider a unitary operator Û and two generic kets |u⟩ and |v⟩ that, under the action of Û, get transformed respectively into

|u'⟩ = Û|u⟩ and |v'⟩ = Û|v⟩ .  (1.81)

If we now take the inner product of the transformed vectors, we have

⟨u'|v'⟩ = ⟨Uu|Uv⟩ = ⟨u|Û†Û|v⟩ = ⟨u|v⟩ .  (1.82)

In particular, of course, unitary operators preserve the norms of vectors as well. Unitary operators should be regarded as the generalisation of the rotation operators in the Euclidean space, which preserve the scalar product between vectors. A numerical check is sketched below.
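A minimal numerical check that a unitary matrix preserves inner products (a sketch assuming numpy; the unitary is built from the QR decomposition of a random complex matrix):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3)) + 1j*rng.normal(size=(3, 3))
U, _ = np.linalg.qr(A)          # the Q factor of a QR decomposition is unitary

assert np.allclose(U.conj().T @ U, np.eye(3))   # U†U = I, Eq. (1.80)

u = rng.normal(size=3) + 1j*rng.normal(size=3)
v = rng.normal(size=3) + 1j*rng.normal(size=3)

# <u'|v'> = <Uu|Uv> = <u|v>, Eq. (1.82)
assert np.isclose(np.vdot(U @ u, U @ v), np.vdot(u, v))
```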

1.11 Matrix representation of operators

We have seen that, in a given basis, kets are represented by column vectors containing their components and, similarly, bras can be represented in terms of rows that are the adjoints of the column vectors. In this way one switches from abstract vectors to components given by numbers.
We can also represent operators in terms of sets of n² numbers that can be arranged in the form of matrices which, in the case of complex spaces, are complex. These numbers are the matrix elements of the operator in the given basis.
Definition. Given an operator Ô and a basis {|i⟩}, the matrix elements of Ô in the given basis are defined as (i, j = 1, …, n)

O_ij ≡ ⟨i|Ô|j⟩ .  (1.83)

Just as the column vector formed by the components of a ket in a given basis provides a representation of the ket, the matrix formed by the matrix elements of an operator provides a matrix representation of the operator Ô, explicitly

Ô → O ≡
( O_11  O_12  …  O_1n
  O_21  O_22  …  O_2n
   ⋮      ⋮    ⋱   ⋮
  O_n1  O_n2  …  O_nn ) ,  (1.84)

and of course, like the components of a vector, the matrix elements are also basis-dependent.¹⁴ As we will show explicitly in a moment, the j-th column of O is formed by the components, in the given basis, of the ket |j'⟩ = Ô|j⟩, the transformed basis vector |j⟩.
Let us see the importance of knowing the matrix elements in concrete calculations. We have seen that, once we know how an operator transforms the basis vectors, we can easily derive the transformation rule for a generic vector, and we obtained the result

|u'⟩ = Ô|u⟩ = Σ_{j=1}^n u_j Ô|j⟩ = Σ_{j=1}^n u_j |j'⟩ .  (1.85)

However, we have not yet provided a way to calculate explicitly the components u'_i. In order to do that, we proceed in the same way as when we derived Eq. (1.31), writing the components of a vector as inner products of the ket with basis bras. We thus need to take the inner product of |u'⟩ with a basis vector. If we want, for example, the component u'_i, for some value of i ∈ {1, …, n}, we obtain

u'_i = ⟨i|u'⟩ = ⟨i| Σ_{j=1}^n u_j Ô|j⟩ = Σ_{j=1}^n u_j ⟨i|Ô|j⟩ .  (1.86)

Therefore, the components of the transformed vector |u'⟩, under the action of an operator Ô on a ket |u⟩, can be calculated, knowing the matrix elements of Ô defined in Eq. (1.83), as

u'_i = Σ_{j=1}^n O_ij u_j .  (1.87)

Notice that in the sum we deliberately wrote O_ij u_j instead of u_j O_ij, since one can immediately recognise that Eq. (1.87) is equivalent to the matrix relation

(u'_1, u'_2, …, u'_n)^T = O (u_1, u_2, …, u_n)^T ,  (1.88)

or, more compactly, u' = O u. Notice that this relation is nothing other than the representation, in the given basis, of the abstract relation |u'⟩ = Ô|u⟩ defining the operator Ô.

¹⁴ On the other hand, it should be clear that abstract operators such as Ô, like abstract vectors, are not basis-dependent.
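A minimal sketch of Eq. (1.88) in numpy, also checking that the j-th column of O is the image of the j-th basis vector (the matrix and vector are arbitrary):

```python
import numpy as np

O = np.array([[1, 2j, 0],
              [0, 1, 1j],
              [3, 0, 1]])
u = np.array([1j, 2, -1])

u_prime = O @ u                      # u'_i = sum_j O_ij u_j, Eq. (1.87)

e2 = np.array([0, 1, 0])             # basis ket |2> (second basis vector)
assert np.allclose(O @ e2, O[:, 1])  # transformed basis ket = 2nd column of O
print(u_prime)
```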
Example. If one considers a basis vector |j⟩, represented by the column with a 1 in the j-th entry and 0 elsewhere, the matrix product in Eq. (1.88) returns as transformed vector the column

(O_1j, O_2j, …, O_nj)^T ,  (1.89)

showing that the elements of the j-th column of the matrix are indeed the components of the transformed basis ket |j'⟩ = Ô|j⟩.

Let us now give some important examples of matrix representations of specific operators:

• Identity operator Î. This is of course the simplest example, since very straightforwardly one has

I_ij = ⟨i|Î|j⟩ = ⟨i|j⟩ = δ_ij ,  (1.90)

where the I_ij are clearly the matrix elements of the identity matrix I, so that the matrix representation of the identity operator is simply the matrix with 1's along the diagonal and 0's elsewhere:

Î → I = diag(1, 1, …, 1) .  (1.91)

• Permutation operators. The permutation operator P̂_{i₁i₂…i_n} permutes the basis vectors in such a way that |1⟩, |2⟩, …, |n⟩ → |i₁⟩, …, |i_n⟩, where each i_j is a value between 1 and n and they are all different from each other, so that there are n! permutation operators (one for each possible permutation). For example, for n = 3 there are 6 permutation operators. Of course, the permutation operator P̂_{12…n} coincides with the identity operator.

• Adjoint operator Ô†. Given an operator Ô with matrix elements O_ij, it is easy to calculate from these the matrix elements of the adjoint operator:

(O†)_ij = ⟨i|Ô†|j⟩  (1.92)
        = ⟨Ôi|j⟩  (1.93)
        = ⟨j|Ôi⟩*  (1.94)
        = ⟨j|Ô|i⟩*  (1.95)
        = O_ji* .  (1.96)

We have in this way obtained the matrix representation of Ô†, explicitly:

Ô† → O† =
( O_11*  O_21*  …  O_n1*
  O_12*  O_22*  …  O_n2*
   ⋮       ⋮     ⋱   ⋮
  O_1n*  O_2n*  …  O_nn* ) .  (1.97)

This result was somewhat expected: the matrix representing the adjoint of Ô is the adjoint matrix (the transpose complex conjugate) of the matrix representing Ô.
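In a matrix representation the adjoint is therefore simply the conjugate transpose, and the defining property (1.73) can be checked directly. A sketch assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(3)
O = rng.normal(size=(3, 3)) + 1j*rng.normal(size=(3, 3))
O_dag = O.conj().T                   # matrix of the adjoint operator

u = rng.normal(size=3) + 1j*rng.normal(size=3)
v = rng.normal(size=3) + 1j*rng.normal(size=3)

# <u|O†|v> = <v|O|u>^*   (Eq. (1.73))
assert np.isclose(np.vdot(u, O_dag @ v), np.vdot(v, O @ u).conjugate())

# (OP)† = P†O†   (Eq. (1.74))
P = rng.normal(size=(3, 3)) + 1j*rng.normal(size=(3, 3))
assert np.allclose((O @ P).conj().T, P.conj().T @ O.conj().T)
```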

1.12 Outer product

We have seen that, given a ket |u⟩ and a bra ⟨v|, we can form the inner product

⟨v|u⟩ = (v_1* v_2* … v_n*) (u_1, u_2, …, u_n)^T ,  (1.98)

which is a complex number and, correctly, its calculation in a given basis is given by the product of a row, that is a (1, n) matrix, with a column, that is an (n, 1) matrix.
However, by reversing the order, one could also take the product of the (n, 1) column vector representing the ket |u⟩ with the (1, n) row representing the bra ⟨v|, obtaining in this case an (n, n) matrix, explicitly:

(u_1, u_2, …, u_n)^T (v_1* v_2* … v_n*) =
( u_1 v_1*  u_1 v_2*  …  u_1 v_n*
  u_2 v_1*  u_2 v_2*  …  u_2 v_n*
    ⋮         ⋮        ⋱    ⋮
  u_n v_1*  u_n v_2*  …  u_n v_n* ) .  (1.99)

As we know, an (n, n) matrix represents an operator, the corresponding abstract object. In this case an operator can be formed as the product of a ket with a bra but, differently from the inner product, the ket has to precede the bra. Explicitly, we can give the following definition.
Definition. Given a ket |u⟩ and a bra ⟨v|, the outer product, denoted by

|u⟩⟨v| ,  (1.100)

is defined as the operator that transforms a generic ket |w⟩ into

|w'⟩ = |u⟩⟨v|w⟩ .  (1.101)

Notice that if either |u⟩ = 0 or ⟨v| = 0, then one obtains the null operator.


1.13 Projection operators

Definition. Given a generic vector |v⟩, the operator

P̂_v ≡ |v⟩⟨v| / ⟨v|v⟩  (1.102)

is the projection operator onto the vector |v⟩. If the vector |v⟩ is normalised, then simply P̂_v ≡ |v⟩⟨v|.
The projection operator acts on a generic ket |w⟩ so as to return as outcome

|w⟩ → |w'⟩ = P̂_v |w⟩ = |v⟩⟨v|w⟩ / ⟨v|v⟩ .  (1.103)

In the Euclidean space this would be the component of the vector w⃗ parallel to v⃗ and, therefore, we can now generalise this concept and refer to |w'⟩ as the component of |w⟩ parallel to |v⟩. Notice that we have already made extensive use of this projection operation in the Gram-Schmidt procedure. It is easy to see that P̂_v is Hermitian. Indeed, if we take the adjoint bra of P̂_v|w⟩, one has

⟨P_v w| = ⟨w|v⟩⟨v| / ⟨v|v⟩ = ⟨w|P̂_v ,  (1.104)

and from the definition of the adjoint of an operator this implies P̂_v = P̂_v†.
A particularly important case is the projection operator onto a basis vector |i⟩:

P̂_i = |i⟩⟨i| .  (1.105)

The relation (1.103) for the transformed vector |w'⟩, under the action of P̂_i on a generic vector |w⟩, then specialises into

|w⟩ → |w'⟩ = P̂_i |w⟩ = |i⟩⟨i|w⟩ = w_i |i⟩ ,  (1.106)

and for the corresponding transformed bra

⟨w| → ⟨w'| ≡ ⟨w|P̂_i = ⟨w|i⟩⟨i| = w_i* ⟨i| .  (1.107)

The outcomes were expected: we obtained, respectively, the components of |w⟩ and ⟨w| along |i⟩ and ⟨i|.
Taking the product of two projection operators onto basis kets |i⟩ and |j⟩, one has:

P̂_i P̂_j = |i⟩⟨i|j⟩⟨j|  (1.108)
         = δ_ij |i⟩⟨j| .  (1.109)

This implies that P̂_i P̂_j = 0 if i ≠ j, while for i = j one obtains P̂_i² = P̂_i. This result was expected since, if one projects a second time onto the same vector, the result is necessarily the same, the vector having already been projected. On the other hand, if one projects a second time onto an orthogonal vector, then the component parallel to this orthogonal vector necessarily vanishes and, therefore, one obtains the zero vector.
If we calculate the matrix elements of P̂_i, we obtain

(P_i)_kl = ⟨k|P̂_i|l⟩ = ⟨k|i⟩⟨i|l⟩ = δ_ki δ_il .  (1.110)

In this way we find that the matrix representing |i⟩⟨i| has just one non-vanishing entry, equal to unity and lying on the diagonal in the position (i, i), explicitly:

P̂_i = |i⟩⟨i| → P_i = diag(0, …, 0, 1, 0, …, 0) ,  (1.111)

with the single nonzero element in the i-th place along the diagonal. These projector properties are checked numerically in the sketch below.
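A numerical check of the properties P̂² = P̂, P̂ = P̂† and P̂_i P̂_j = 0 for i ≠ j (a sketch assuming numpy; the vector |v⟩ is arbitrary):

```python
import numpy as np

def projector(v):
    """P_v = |v><v| / <v|v> for a generic, not necessarily normalised, vector."""
    return np.outer(v, v.conj()) / np.vdot(v, v)

v = np.array([1 + 1j, 2, -1j])
P = projector(v)

assert np.allclose(P @ P, P)        # idempotent: projecting twice changes nothing
assert np.allclose(P, P.conj().T)   # Hermitian

# projectors onto orthogonal basis vectors annihilate each other
e1, e2 = np.eye(3)[0], np.eye(3)[1]
P1, P2 = projector(e1), projector(e2)
assert np.allclose(P1 @ P2, np.zeros((3, 3)))
```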

1.14 Completeness relation

Theorem. The identity operator can be written in terms of the outer products |i⟩⟨i| (i.e., the projection operators P̂_i) through the following completeness relation:

Î = Σ_{i=1}^n |i⟩⟨i| = Σ_{i=1}^n P̂_i .  (1.112)

Proof. Consider the expansion of a generic ket |u⟩,

|u⟩ = Σ_{i=1}^n u_i |i⟩ .  (1.113)

We have seen that the components u_i can be written in terms of inner products as u_i = ⟨i|u⟩. Therefore, the expansion (1.113) can also be written as

|u⟩ = Σ_{i=1}^n |i⟩⟨i|u⟩ = ( Σ_{i=1}^n P̂_i ) |u⟩ .  (1.114)

This expression tells us that the operator in brackets leaves the ket |u⟩ unchanged; but since |u⟩ is generic, and the only operator that leaves any vector unchanged is the identity, the theorem is proved. A numerical check is sketched below.
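A minimal numerical illustration of the completeness relation: summing the projectors onto the vectors of any orthonormal basis gives the identity matrix (a sketch assuming numpy; the basis is obtained from a QR decomposition):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(4, 4)) + 1j*rng.normal(size=(4, 4))
Q, _ = np.linalg.qr(A)        # the columns of Q form an orthonormal basis

# sum_i |i><i| over the basis kets (the columns of Q), Eq. (1.112)
completeness = sum(np.outer(Q[:, i], Q[:, i].conj()) for i in range(4))
assert np.allclose(completeness, np.eye(4))
```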
The completeness relations can be used to obtain different useful results
since it can be inserted inside an abstract relation to get its matrix form. We
give here below two important applications.
Theorem. Given an orthonormal basis {|i⟩}, where i = 1, …, n, any generic operator Ô can be decomposed in terms of the outer products |i⟩⟨j| as¹⁵

Ô = Σ_{i,j}^{1,n} O_ij |i⟩⟨j| .  (1.115)

Proof. One can either simply verify that taking the matrix elements of both sides indeed yields O_ij = O_ij or, alternatively, use the completeness relation twice, in the following way:

Ô = Î Ô Î  (1.116)
  = ( Σ_{i=1}^{n} |i⟩⟨i| ) Ô ( Σ_{j=1}^{n} |j⟩⟨j| )  (1.117)
  = Σ_{i,j}^{1,n} |i⟩⟨i| Ô |j⟩⟨j|  (1.118)
  = Σ_{i,j}^{1,n} O_ij |i⟩⟨j| .  (1.119)

¹⁵ We denote the double sum by the compact notation Σ_{i,j}^{1,n} ≡ Σ_{i=1}^{n} Σ_{j=1}^{n}.
This expression is, for operators, the equivalent of the vector expansion. In this respect the outer products |i⟩⟨j| can be regarded as a basis for the operators, and the O_ij's as the components of Ô. In matrix form, notice that

|i⟩⟨j| → E_ij ,  (1.120)

where E_ij denotes the n × n matrix whose only nonzero entry is a 1 in the (i, j) position. Therefore, the relation (1.116) in matrix form reads

( O_kl ) = Σ_{i,j}^{1,n} O_ij E_ij ,  (1.121)

i.e., the full matrix of elements O_kl is the linear combination, with coefficients O_ij, of the matrices E_ij.
By using the completeness relation, we can also find an expression for the matrix elements of a product of two generic operators Ô and P̂ in terms of products of matrix elements of the two operators:

(OP)_ij = ⟨i| Ô P̂ |j⟩  (1.122)
        = ⟨i| Ô ( Σ_{k=1}^{n} |k⟩⟨k| ) P̂ |j⟩  (1.123)
        = Σ_{k=1}^{n} ⟨i|Ô|k⟩⟨k|P̂|j⟩  (1.124)
        = Σ_{k=1}^{n} O_ik P_kj .  (1.125)

As one could expect, the matrix elements of the product of two operators are given by the sum of the products of the matrix elements of the two operators, according to the standard rows-by-columns product of two matrices. That means that the matrix representing the product of two operators is the product of the matrices representing the two operators.
1.15 Unitary transformations
We have already seen that a transformation induced by a unitary operator,
called unitary transformation, preserves the inner products between any two
vectors. Now we show that such transformations have a special meaning.
Theorem. A change of orthonormal basis {|i⟩} → {|i′⟩} can be regarded as induced by the action of a unitary operator Û on each basis vector, in a way that |i⟩ → |i′⟩ = Û|i⟩ for all i = 1, …, n.
Proof. First of all notice that we can always introduce an operator Û such that |i′⟩ = Û|i⟩ for each i. We need to prove that this operator is necessarily unitary. Let us first expand a generic new basis ket |i′⟩ in the old basis, writing

|i′⟩ = Σ_{k=1}^{n} |k⟩⟨k|i′⟩  (1.126)
     = Σ_{k=1}^{n} |k⟩⟨k|Û|i⟩  (1.127)
     = Σ_{k=1}^{n} |k⟩ U_ki .  (1.128)

We can then similarly expand a generic new basis bra ⟨j′|, finding

⟨j′| = Σ_{ℓ=1}^{n} ⟨j′|ℓ⟩⟨ℓ|  (1.129)
     = Σ_{ℓ=1}^{n} ⟨j|Û†|ℓ⟩⟨ℓ|  (1.130)
     = Σ_{ℓ=1}^{n} U†_jℓ ⟨ℓ| .  (1.131)
Since we are assuming that the new basis is orthonormal, one can impose

δ_ji = ⟨j′|i′⟩  (1.132)
     = Σ_{k,ℓ}^{1,n} U†_jℓ U_ki ⟨ℓ|k⟩  (1.133)
     = Σ_{k=1}^{n} U†_jk U_ki ,  (1.134)

where in the second step we replaced the two expansions written before. The last relation is nothing else than the matrix form of the abstract relation Û†Û = Î, so that we can conclude that Û is a unitary operator.
Therefore, we conclude that the operator Û that induces the transforma-
tion of the basis vectors from one orthonormal basis to another, is necessarily
unitary. Conversely notice that the same proof also shows that given a uni-
tary operator, this can be used to generate a new orthonormal basis starting
from an initial one.
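A minimal sketch of this two-way statement (assumptions for illustration: dimension n = 4, a random unitary built via QR decomposition):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
U, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
assert np.allclose(U.conj().T @ U, np.eye(n))  # U is unitary: U^dagger U = I

old_basis = np.eye(n)                          # columns are the kets |i>
new_basis = U @ old_basis                      # |i'> = U |i>
gram = new_basis.conj().T @ new_basis          # entries <j'|i'>, Eq. (1.132)
assert np.allclose(gram, np.eye(n))            # the new basis is again orthonormal
```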
1.16 Active and passive transformations
Consider a unitary transformation induced by the unitary operator Û. In particular, the basis vectors also get transformed, as |i′⟩ = Û|i⟩. The matrix elements of a generic operator Ô will transform as

O_ij −(Û)→ O′_ij = ⟨i′|Ô|j′⟩ = ⟨i|Û†ÔÛ|j⟩ ,  (1.135)

which can also be written as

O′_ij = ( U† O U )_ij .  (1.136)
A unitary transformation acting directly on the vectors is called an active transformation.
However, one can also interpret the transformation of the matrix elements O_ij → O′_ij as due to a transformation of the operator itself induced by Û, explicitly:

Ô → Û† Ô Û .  (1.137)

A unitary transformation acting on the operator is called a passive transformation. In this case the vectors do not get transformed, but the matrix elements
of Ô transform in the same way as in the case of the active transformation. A unitary transformation can therefore be interpreted in both ways.
An interesting application is the following. Suppose we perform simultaneously both an active transformation induced by a unitary operator Û and a passive transformation induced by Û†. In this case it is easy to see that the two transformations cancel out, leaving unchanged any generic ⟨u|Ô|v⟩, since one has:¹⁶

⟨u|Ô|v⟩ → ⟨Ûu| Û Ô Û† |Ûv⟩ = ⟨u| Û† Û Ô Û† Û |v⟩ = ⟨u|Ô|v⟩ .  (1.139)

In particular, this is true also in the case u = v, so that the quantities ⟨v|Ô|v⟩ that, as we will see, in QM play the role of the expectation values of physical observables in a given quantum state, are also unchanged. This double application of an active and a passive transformation is equivalent, in the Euclidean space, to a joint rotation of a physical system and of the reference frame (think of the room of a laboratory that is rotated, with the walls of the room used as a reference frame).
1.17 Trace and determinant of an operator
Definition. The trace of an operator Ô, denoted by Tr(Ô), is given by the trace of the matrix representing the operator, i.e., by the sum of the diagonal matrix elements, in any given basis {|i⟩}, explicitly:

Tr(Ô) ≡ Tr(O) = Σ_{i=1}^{n} ⟨i|Ô|i⟩ .  (1.140)

This definition is perfectly consistent with the fact that the operator is an abstract, basis-independent object. Using the completeness relation, one can indeed easily prove that Tr(Ô) is a basis-invariant quantity (PS n.4), so that Tr(O′) = Σ_{i=1}^{n} ⟨i′|Ô|i′⟩ = Σ_{i=1}^{n} ⟨i|Ô|i⟩ = Tr(O). One
¹⁶ In general, if the active transformation is induced by Û and the passive transformation by some second generic unitary operator Ŵ, one would have

⟨u|Ô|v⟩ −(Û,Ŵ)→ ⟨Ûu| Ŵ† Ô Ŵ |Ûv⟩ = ⟨u| Û† Ŵ† Ô Ŵ Û |v⟩ .  (1.138)

In our specific case we have Ŵ = Û†.
can also prove that Tr(ÔP̂) = Tr(P̂Ô). From this second result one can also prove that, given a unitary operator Û and a generic operator Ô, one has Tr(Û†ÔÛ) = Tr(Ô). This provides another way to show that Tr(Ô) is basis-invariant, since we have seen that a basis transformation is realised by a unitary transformation.
Definition. The determinant of an operator Ô, denoted by det(Ô), is given by the determinant of the matrix representing the operator in any given basis {|i⟩}. As in the case of the trace, the determinant of an operator is basis-invariant and, therefore, can be calculated in any basis. The proof is analogous to the proof that the trace of an operator is invariant, since one has det(O′) = det(U†OU) = det(O), where O′ is the matrix representation of Ô in a different basis {|i′⟩}.
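These invariance properties are easy to check numerically; a sketch (the random matrices and seed are toy assumptions standing in for generic operators):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
O = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
P = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
U, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))

O_prime = U.conj().T @ O @ U                   # change of basis O -> U^dagger O U
assert np.isclose(np.trace(O_prime), np.trace(O))          # Tr is basis-invariant
assert np.isclose(np.linalg.det(O_prime), np.linalg.det(O))  # det is basis-invariant
assert np.isclose(np.trace(O @ P), np.trace(P @ O))        # Tr(OP) = Tr(PO)
```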
1.18 The eigenvalue problem
Consider an operator Ô acting on a generic ket |u⟩ as

|u⟩ → |u′⟩ = Ô|u⟩ ;  (1.141)

in general |u′⟩ ≠ ω|u⟩, where ω can be any scalar. However, for some special |u⟩ and ω, it can happen that one has

Ô|u⟩ = ω|u⟩ .  (1.142)

In this case, barring the trivial case |u⟩ = 0, the ket |u⟩ is called an eigenket of Ô with eigenvalue ω. The set of eigenvalues of an operator Ô is also referred to as its spectrum, and its determination as the eigenvalue problem for the operator Ô. Notice that, given an eigenket |u⟩ with eigenvalue ω, any other ket α|u⟩, where α is a scalar, is also an eigenket with the same eigenvalue.
Example. Consider the identity operator Î. In this case one has, for any generic ket |u⟩,

Î|u⟩ = 1 |u⟩ ,  (1.143)

implying that any ket is an eigenket of Î with eigenvalue ω = 1.
Example. Consider the projection operator P̂_v = |v⟩⟨v| on a generic normalised ket |v⟩.
• Any ket |αv⟩ is an eigenket of P̂_v with eigenvalue ω = 1:

P̂_v |αv⟩ = |v⟩⟨v|αv⟩ = |αv⟩⟨v|v⟩ = |αv⟩ .  (1.144)
• Any ket |v⊥⟩ orthogonal to |v⟩, i.e., such that ⟨v|v⊥⟩ = 0, is an eigenket of P̂_v with eigenvalue ω = 0:

P̂_v |v⊥⟩ = |v⟩⟨v|v⊥⟩ = 0 = 0 |v⊥⟩ .  (1.145)

• Any ket that does not fall into one of the two previous classes, i.e., any linear combination α|v⟩ + β|v⊥⟩ with both α and β ≠ 0, is not an eigenket of P̂_v.
Since these three classes include all kets in the vector space, we found all
eigenkets and eigenvalues of P̂v .
In general, the identification of the eigenvalues and eigenkets of a generic
operator does not proceed so simply and one has to follow a more involved
procedure that we are going to discuss.
1.18.1 The characteristic equation
Eq. (1.142), defining an eigenket and its eigenvalue, is an abstract relation, i.e., holding in any basis. However, in order to find all eigenkets and eigenvalues of an operator Ô, we need to specify a basis where we can calculate them using matrix theory. To this extent we can project Eq. (1.142) onto some basis {|i⟩},¹⁷ obtaining:

⟨i|Ô|u⟩ = ω ⟨i|u⟩ .  (1.146)

From this, using the completeness relation, we obtain

Σ_j ⟨i|Ô|j⟩⟨j|u⟩ = ω u_i ,  (1.147)

which can also be written as:

Σ_j O_ij u_j = ω u_i .  (1.148)

In this way we obtained the representation of Eq. (1.142) in the given basis.
In matrix form, Eq. (1.148) can then be written as

O u = ω u ,  (1.149)

where O is the n × n matrix of elements O_ij and u the column vector of components u_i.

¹⁷ In practice this means taking the inner product, side by side, with some basis vector.
This can also be rewritten as:

(O − ω I) u = 0 .  (1.150)

If the matrix (O − ω I) were invertible, then one would obtain

u = (O − ω I)⁻¹ 0 = 0 ,  (1.151)

which is absurd, since we are requiring that u is not the null vector. This implies that the solutions of Eq. (1.150) have necessarily to be found by imposing that the matrix (O − ω I) is non-invertible; from matrix theory, this implies that it has to be singular, i.e., it must have a vanishing determinant, explicitly:

det(O − ω I) = 0 .  (1.152)
Notice that the determinant of a matrix representing an operator is an invariant, i.e., basis-independent. This is in agreement with our initial observation that the spectrum of eigenvalues of an operator is also invariant. The condition (1.152) is called the characteristic equation and can also be put in the polynomial form

P⁽ⁿ⁾(ω) = 0 ,  (1.153)

where

P⁽ⁿ⁾(ω) = Σ_{m=0}^{n} c_m ω^m  (1.154)

is the characteristic polynomial.¹⁸ The n, in general complex, roots of the characteristic equation finally give the spectrum of eigenvalues of the operator. It can happen that some roots coincide; in that case one says that there is a degeneracy of the corresponding eigenvalue, and the eigenvalue, the spectrum and the operator itself are said to be degenerate.¹⁹ If the operator is not degenerate, then it is always possible to find n eigenvectors, one for each eigenvalue. On the other hand, if some eigenvalue is degenerate, then it is not always possible to find n eigenvectors. However, in our case, we are particularly interested in Hermitian operators and then, even in the case of a degenerate spectrum, one can always find n eigenvectors.
¹⁸ Notice that one has necessarily c_n = (−1)ⁿ.
¹⁹ Not to be confused with the mathematical definition of degenerate matrix, which in physics is usually referred to as a singular matrix, i.e., a matrix whose determinant vanishes.
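As a sketch of the procedure (the 3 × 3 matrix is a toy assumption), the roots of the characteristic polynomial det(ωI − O) = 0 agree with a direct eigenvalue computation:

```python
import numpy as np

O = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])               # toy matrix with eigenvalues 2, 3, 5

coeffs = np.poly(O)                           # characteristic polynomial coefficients
roots = np.sort_complex(np.roots(coeffs))     # roots of the characteristic equation
eigs = np.sort_complex(np.linalg.eigvals(O))  # direct diagonalisation
assert np.allclose(roots, eigs)               # same spectrum either way
```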
1.19 The eigenvalue problem for Hermitian operators
Hermitian operators play a special role in QM since they are associated to physical observables. This association relies on the following two fundamental properties and their implications.
Theorem. Hermitian operators have real eigenvalues.
Proof. It is quite straightforward. Let us assume that the operator Ô is Hermitian and that |ω⟩ is an eigenket of Ô with eigenvalue ω, so that we can write by definition:

Ô|ω⟩ = ω|ω⟩ .  (1.155)

If we now take the inner product with ⟨ω|, we obtain first the expression

⟨ω|Ô|ω⟩ = ω ⟨ω|ω⟩  (1.156)

and then, taking its adjoint, we also obtain

⟨ω|Ô†|ω⟩ = ω⋆ ⟨ω|ω⟩ .  (1.157)

Since we are assuming Ô† = Ô, subtracting the last two equations side by side leads immediately to ω = ω⋆.
Another important property of Hermitian operators is the following theorem.
Theorem. The normalised eigenvectors of a Hermitian operator are mutually orthogonal and form an orthonormal basis, called the eigenbasis. This basis is unique if the operator is non-degenerate; otherwise there is an infinite number of choices.
1.19.1 Non-degenerate case
The theorem is valid in general. Here we give the proof for the non-degenerate case, i.e., assuming that all eigenvalues are different from each other.
Proof. We just need to proceed along the same lines as for the proof of the previous theorem, but this time considering any pair of two distinct eigenvalues ω_i and ω_j and their respective eigenvectors |ω_i⟩ and |ω_j⟩. Therefore, by definition, we can now write

Ô|ω_i⟩ = ω_i |ω_i⟩  (1.158)

and

Ô|ω_j⟩ = ω_j |ω_j⟩ .  (1.159)

We can now take the inner product with ⟨ω_j| and ⟨ω_i| respectively, obtaining

⟨ω_j|Ô|ω_i⟩ = ω_i ⟨ω_j|ω_i⟩  (1.160)

and

⟨ω_i|Ô|ω_j⟩ = ω_j ⟨ω_i|ω_j⟩ .  (1.161)

We can now take the adjoint of the latter and, using again Ô† = Ô, we obtain

⟨ω_j|Ô|ω_i⟩ = ω_j ⟨ω_j|ω_i⟩ ,  (1.162)

where we used, from the previous theorem, that ω_j⋆ = ω_j. We can now subtract the expression (1.160) from this last equation side by side, obtaining

0 = (ω_j − ω_i) ⟨ω_j|ω_i⟩ .  (1.163)

Since we are assuming ω_j ≠ ω_i, this implies ⟨ω_j|ω_i⟩ = 0, i.e., |ω_i⟩ and |ω_j⟩ are mutually orthogonal. Since this is valid for any generic pair |ω_i⟩ and |ω_j⟩, all eigenvectors are mutually orthogonal.
1.19.2 Degenerate case
The proof we gave is valid only for non-degenerate operators, but it can also be extended to the case when the spectrum is degenerate.²⁰ Here, however, we want to point out an important difference in the case of a degenerate spectrum: the orthonormal basis exists but it is not unique; in fact, there is an infinite number of different choices. Let us clarify the reason.
Suppose that ω₁ = ω₂ = ω. We have then two orthonormal eigenkets obeying

Ô|ω₁⟩ = ω |ω₁⟩ ,  (1.164)
Ô|ω₂⟩ = ω |ω₂⟩ .  (1.165)

Any linear combination of the two eigenkets is clearly still an eigenket, since

Ô(α|ω₁⟩ + β|ω₂⟩) = α ω |ω₁⟩ + β ω |ω₂⟩ = ω (α|ω₁⟩ + β|ω₂⟩) ,  (1.166)

²⁰ For a general proof, see [2].

for any α and β. Varying α and β continuously, the linear combination fills the whole two-dimensional subspace spanned by |ω₁⟩ and |ω₂⟩, whose elements are all eigenkets with eigenvalue ω. This subspace is referred to as an eigenspace of Ô with eigenvalue ω. Obviously there exists an infinity of choices of orthonormal pairs |ω₁′⟩ and |ω₂′⟩ that can be obtained from |ω₁⟩ and |ω₂⟩ by a rigid rotation of the kind

|ω₁′⟩ = cos θ |ω₁⟩ + sin θ |ω₂⟩ ,  (1.167)
|ω₂′⟩ = − sin θ |ω₁⟩ + cos θ |ω₂⟩ .
In this specific case the subspace is of course two-dimensional. In general, if the eigenvalue occurs m_i times, i.e., the characteristic equation has m_i roots equal to some ω_i, there will be an m_i-dimensional eigenspace V^{m_i}_{ω_i}, with m_i ≤ n, from which we may choose any m_i orthonormal vectors to form a basis for the subspace.
Clearly, in such a situation we need a second label α = 1, …, m_i to distinguish the eigenvectors corresponding to the degenerate eigenvalue ω_i, which will then be denoted as |ω_i, α⟩. In our previous simple example we had m₁ = 2, and the two eigenvectors |ω₁⟩ and |ω₂⟩ can then be denoted more rigorously as |ω, 1⟩ and |ω, 2⟩.
1.19.3 Eigenvectors and eigenvalues of unitary operators
There are analogous theorems also for unitary operators. We just state them
without giving the proof.
Theorem. The eigenvalues of a unitary operator are complex numbers of unit modulus (they can be written in the form ω_i = e^{iφ_i}).
Theorem. The eigenvectors of a unitary operator are mutually orthogonal.
The second is essentially the same as in the case of Hermitian operators.
The theorem can indeed be proved both for Hermitian and unitary operators.
Definition. An operator N̂ is normal if

[N̂, N̂†] = 0 .  (1.168)

It is clear that this condition is satisfied both by Hermitian (N̂ = N̂†) and by unitary (N̂N̂† = N̂†N̂ = Î) operators: both are, therefore, subclasses of the wider class of normal operators.
1.19.4 Diagonalization of Hermitian matrices
Consider a Hermitian operator Ô represented by a matrix O in some generic orthonormal basis {|i⟩}. We have seen that the eigenvectors of Ô form an orthonormal basis called the eigenbasis. Considering that by definition one has

Ô|ω_i⟩ = ω_i |ω_i⟩ ,  (1.169)

in this basis Ô is represented by a diagonal matrix, since its matrix elements are given by

⟨ω_j|Ô|ω_i⟩ = ω_i δ_ji ,  (1.170)

so that explicitly in matrix form one has

Ô → O = diag(ω₁, ω₂, …, ω_n) .  (1.171)

We have seen that any operator can be expanded in a generic basis in the form (1.115). In the eigenbasis this expansion becomes particularly simple:

Ô = Σ_{i=1}^{n} ω_i |ω_i⟩⟨ω_i| .  (1.172)
This expression is called the spectral decomposition of the Hermitian operator Ô. If all eigenvalues are nonzero, then there exists an inverse Ô⁻¹ and it is easy to verify that its spectral decomposition is given by

Ô⁻¹ = Σ_{i=1}^{n} ω_i⁻¹ |ω_i⟩⟨ω_i| .  (1.173)
In Section 1.15 we showed that a change from an orthonormal basis to a new orthonormal basis can always be described in terms of a unitary operator acting on the basis vectors. Therefore, we can always find a unitary operator Û that induces the transformation of the basis vectors |i⟩ to the eigenvectors |ω_i⟩ of Ô, explicitly:

|ω_i⟩ = Û|i⟩ .  (1.174)
Clearly in the new basis the matrix representation will be simply given by
a diagonal matrix with all the eigenvalues on the diagonal but we have seen
that in the new basis the matrix elements are given by the matrix elements of the operator Û†ÔÛ. Therefore, this result is equivalent to saying that for any Hermitian operator Ô, represented in some orthonormal basis by a matrix O, there always exists a unitary transformation Û, represented by a matrix U, such that (U†OU)_ij = ω_i δ_ij. This is equivalent to saying that the matrix representing the operator Û†ÔÛ is diagonal. In terms of a passive transformation one can say that there exists a unitary operator Û diagonalising Ô, i.e., that transforms Ô to Ô′ = Û†ÔÛ in such a way that the matrix O′ = U†OU is diagonal. In both cases, whether Û is interpreted as an active or a passive transformation, this implies that any Hermitian matrix O can be diagonalised by some unitary matrix U, and the problem of finding the matrix that diagonalises O is equivalent to solving its eigenvalue problem since, from Eq. (1.174), we can see that the matrix elements of U are given by

U_ij = ⟨i|ω_j⟩ ,  (1.175)

corresponding to the components of the eigenkets in the initial basis, in such a way that the j-th column of the matrix representing Û is given by the column vector representing the eigenket |ω_j⟩. Explicitly, if we denote the generic i-th component of the j-th eigenket by ω_i^{(j)} = ⟨i|ω_j⟩, then the diagonalising unitary matrix is the one whose j-th column is the vector (ω₁^{(j)}, ω₂^{(j)}, …, ω_n^{(j)})ᵀ, explicitly U = ( ω_i^{(j)} ).  (1.176)
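A minimal numerical sketch of this diagonalisation (the 2 × 2 Hermitian matrix is a toy assumption): `np.linalg.eigh` returns the eigenvalues and a unitary matrix whose columns are the eigenkets, so that U†OU is diagonal and O is recovered from its spectral decomposition (1.172):

```python
import numpy as np

O = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, 3.0]])             # a Hermitian matrix

w, U = np.linalg.eigh(O)                      # eigenvalues w[i], eigenkets U[:, i]
assert np.allclose(U.conj().T @ O @ U, np.diag(w))   # diagonal in the eigenbasis

O_rebuilt = sum(w[i] * np.outer(U[:, i], U[:, i].conj()) for i in range(2))
assert np.allclose(O_rebuilt, O)              # spectral decomposition, Eq. (1.172)
```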
1.19.5 Simultaneous diagonalization of two Hermitian operators
Theorem. If two Hermitian operators Ô and P̂ commute with each other, explicitly

[Ô, P̂] = 0 ,  (1.177)

then they have common eigenvectors.
Proof. We prove it in the case of a non-degenerate operator P̂, but it is valid in general.
Suppose that the operator P̂ has eigenkets |λ_i⟩, each with eigenvalue λ_i. We have then

P̂ ( Ô|λ_i⟩ ) = Ô ( P̂|λ_i⟩ ) = λ_i Ô|λ_i⟩ .  (1.178)

This implies that Ô|λ_i⟩ is an eigenket of P̂ with eigenvalue λ_i; but then, since by assumption the only eigenket of P̂ with eigenvalue λ_i is |λ_i⟩, necessarily Ô|λ_i⟩ has to be given by |λ_i⟩ multiplied by some scalar ω_i, implying that |λ_i⟩ is also an eigenket of Ô with eigenvalue ω_i, explicitly:

Ô|λ_i⟩ = ω_i |λ_i⟩ .  (1.179)

The common eigenvectors can then be denoted by |ω_i, λ_i⟩. In the case that some eigenvalue of Ô is degenerate, for example ω₁ = ω₂ = ω, then, since λ₁ ≠ λ₂, one can unambiguously denote the two common eigenkets as |ω, λ₁⟩ and |ω, λ₂⟩, and they are unique. In this case one says that the operator P̂ breaks the degeneracy of the eigenvalue ω of Ô.
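A sketch of the theorem (assumptions for illustration: a random Hermitian O, and P built as a polynomial in O so that [O, P] = 0 holds by construction):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
O = A + A.conj().T                            # a Hermitian matrix
P = O @ O + 2.0 * O                           # Hermitian and commuting with O
assert np.allclose(O @ P - P @ O, 0)          # [O, P] = 0, Eq. (1.177)

_, U = np.linalg.eigh(O)                      # eigenbasis of O
P_in_eigenbasis = U.conj().T @ P @ U          # P is diagonal in the same basis
assert np.allclose(P_in_eigenbasis, np.diag(np.diag(P_in_eigenbasis)))
```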
1.20 Generalization to infinite dimensions
We have so far discussed vector spaces with a finite dimensionality n and, therefore, with a discrete basis. In this way basis vectors were labelled by a discrete index i = 1, 2, …, n. However, as discussed in the introduction, in QM one has to learn to deal with observables, described by Hermitian operators, that in general have both a discrete and a continuous spectrum of eigenvalues (think for example of the energy spectrum of a particle in a potential well). Therefore, we need to learn how to generalise all our formalism to the continuous case, where basis vectors are labelled by a continuous variable that can therefore take an infinite number of values, in such a way that the vector space is in this case necessarily infinite-dimensional.

1.20.1 A useful discrete-continuous dictionary

In the following we establish a useful dictionary that allows us to translate, from the discrete to the continuous case and vice versa, all the quantities and definitions we introduced.
• Basis vectors. In the discrete case the basis vectors are labelled by a discrete index and are denoted by |i⟩. In the continuous case the discrete index i is replaced by a continuous real variable x ∈ [a, b] and basis vectors are denoted by |x⟩:

|i⟩ ⟷ |x⟩ .  (1.180)
• Vector expansion. We have seen how, given a basis, a generic state vector |u⟩ can be expanded in its components u_i = ⟨i|u⟩, summing over all basis vectors. In the continuous limit we denote the generic vector as |ψ⟩. The sum has to be replaced by an integral over x of all contributions from the basis vectors |x⟩, and the components are replaced by the wave function ψ(x) ≡ ⟨x|ψ⟩ associated to the state vector |ψ⟩ in the basis of the variable x:²¹

|u⟩ = Σ_{i=1}^{n} u_i |i⟩ ⟷ |ψ⟩ = ∫_a^b dx ψ(x) |x⟩ .  (1.181)

Notice that, if in the discrete case one can represent a ket with a column vector with n components, now one can think of the wave function as a column vector with infinitely many elements (do not try to write it down!).
• Inner product. Similarly to the vector expansion, now the sum over products of the components of the two vectors is replaced by an integral over x:

⟨v|u⟩ = Σ_{i=1}^{n} v_i⋆ u_i ⟷ ⟨φ|ψ⟩ = ∫_a^b dx φ⋆(x) ψ(x) .  (1.182)

• Orthonormality of basis vectors. The Kronecker delta is replaced by the Dirac delta:

⟨i|j⟩ = δ_ij ⟷ ⟨x|x′⟩ = δ(x − x′) .  (1.183)
Proof. Notice that in the discrete case the orthonormality of the basis vectors is such that, if one starts from the expansion of a generic vector |u⟩,

|u⟩ = Σ_{i=1}^{n} ⟨i|u⟩ |i⟩ ,  (1.184)

²¹ The variable x is not necessarily denoting position.

then taking the projection ⟨j|u⟩ one correctly selects the component u_j = ⟨j|u⟩:

⟨j|u⟩ = Σ_{i=1}^{n} ⟨i|u⟩ ⟨j|i⟩ = u_j .  (1.185)

In the continuous case the latter expression is equivalent to writing:

ψ(x) = ⟨x|ψ⟩ = ∫_a^b dx′ ⟨x|x′⟩ ⟨x′|ψ⟩  (1.186)
     = ∫_a^b dx′ ⟨x|x′⟩ ψ(x′) .  (1.187)

Therefore the quantity ⟨x|x′⟩ has the property of sampling the value of the wave function ψ(x′) at the specific point x, and this is exactly the definition of the Dirac delta function, so that one can indeed make the identification ⟨x|x′⟩ = δ(x − x′).
• Completeness relation ⟷ resolution of the identity:

Î = Σ_{i=1}^{n} |i⟩⟨i| ⟷ Î = ∫_a^b dx |x⟩⟨x| .  (1.188)

The resolution of the identity in the continuous case can be derived analogously to how we derived the completeness relation in the discrete case. Starting from Eq. (1.186) we can write

⟨x|ψ⟩ = ⟨x| ( ∫_a^b dx′ |x′⟩⟨x′| ) |ψ⟩ ,  (1.189)

and from this one immediately derives the resolution of the identity, Eq. (1.188).
1.20.2 Operators

We now have to see how operators change in the continuous case. As in the discrete case, they transform a vector into another vector,

|ψ⟩ → |ψ′⟩ ≡ |Ôψ⟩ = Ô|ψ⟩ .  (1.190)

Given a basis {|x⟩}, one can again calculate the matrix elements of the operator Ô:

O_xx′ = ⟨x|Ô|x′⟩ ,  (1.191)
and one can still think of a representation matrix where the entries are given
by the matrix elements and that now will have an infinite number of rows
and columns.
The simplest operator one can introduce in the basis {|x⟩} is the operator x̂, defined as the Hermitian operator whose eigenkets are the basis vectors |x⟩ themselves, with x as eigenvalue,

x̂|x⟩ = x|x⟩ .  (1.192)

The matrix elements of x̂ in this basis can be calculated very straightforwardly using (1.183):

⟨x′|x̂|x⟩ = x δ(x′ − x) .  (1.193)

Let us now see how the wave function transforms under the action of x̂, i.e., we need to calculate:

ψ′(x) = ⟨x|x̂ψ⟩ = ⟨x|x̂|ψ⟩ .  (1.194)

Using the resolution of the identity, Eq. (1.188), we find (omitting the integration limits in the integral)

⟨x|x̂|ψ⟩ = ⟨x|x̂ Î|ψ⟩  (1.195)
        = ⟨x|x̂ ( ∫ dx′ |x′⟩⟨x′| ) |ψ⟩  (1.196)
        = ∫ dx′ ⟨x|x̂|x′⟩⟨x′|ψ⟩  (1.197)
        = ∫ dx′ x′ δ(x − x′) ⟨x′|ψ⟩  (1.198)
        = x ψ(x) .  (1.199)
The result is then simply that the wave function is multiplied by x. This result is expressed by saying that in the x̂-basis the operator x̂ is represented by x, i.e., x̂ → x.
Definition. Another important example of operator is the differential operator D̂, defined as the operator such that²²

⟨x|D̂|ψ⟩ ≡ dψ(x)/dx ,  (1.200)

²² This can also be expressed by saying that in the x̂-basis D̂ is represented by d/dx, or D̂ → d/dx.

so that we can also define the symbol

|dψ/dx⟩ ≡ D̂|ψ⟩ .  (1.201)
Let us now calculate the matrix elements of the differential operator, ⟨x|D̂|x′⟩. Starting from the definition (1.200) and applying the resolution of the identity (1.188), we can write

dψ/dx ≡ ⟨x|D̂|ψ⟩ = ⟨x|D̂ Î|ψ⟩ = ⟨x|D̂ ( ∫_a^b dx′ |x′⟩⟨x′| ) |ψ⟩ = ∫ dx′ ⟨x|D̂|x′⟩⟨x′|ψ⟩ .  (1.202)

Comparing now this expression with the definition of the delta function that, as we noticed, samples a function at a specific point,

∫_a^b dx′ δ(x − x′) ψ(x′) = ψ(x) ,  (1.203)

we find

⟨x|D̂|x′⟩ ψ(x′) = δ(x − x′) dψ(x′)/dx′ ,  (1.204)

and since ψ is generic, one finds for the matrix elements of D̂:

⟨x|D̂|x′⟩ = δ(x − x′) d/dx′ .  (1.205)
Theorem. The operator D̂ is not Hermitian, but the operator

k̂ ≡ −i D̂  (1.206)

is Hermitian.
Proof. In order to prove it we need to show that

⟨φ|k̂|ψ⟩ = ⟨ψ|k̂|φ⟩⋆ .  (1.207)

Using the definition of the inner product in the infinite-dimensional case, Eq. (1.182), this is equivalent to:

∫_a^b dx φ⋆(x) ( −i dψ(x)/dx ) = [ ∫_a^b dx ψ⋆(x) ( −i dφ(x)/dx ) ]⋆ .  (1.208)
Integrating the left-hand side by parts and conjugating the integral on the right-hand side, one then obtains

−i [φ⋆(x) ψ(x)]_a^b + i ∫_a^b dx (dφ⋆(x)/dx) ψ(x) = i ∫_a^b dx (dφ⋆(x)/dx) ψ(x) ,  (1.209)

and from this one finally finds that k̂ is Hermitian if the surface term vanishes, i.e., if

[φ⋆(x) ψ(x)]_a^b = 0 .  (1.210)
This is interesting, since it shows that in the infinite-dimensional case the condition for k̂ to be Hermitian is stronger than in the discrete case, since it also involves the behaviour of the wave functions at the end points a and b. In QM we will be interested in the case a = −∞ and b = +∞, and in this case the condition (1.210) is known as square integrability. This is a condition that wave functions need to satisfy to be physical, i.e., to describe physical quantum states. In the following we will then assume square integrability, so that k̂ can be considered a Hermitian operator.
Let us now solve the eigenvalue problem for k̂. We need to find the eigenvectors |k⟩ satisfying

k̂|k⟩ = k|k⟩ .  (1.211)

This can be done in the basis {|x⟩}. Following the usual procedure, we can project Eq. (1.211) on the x̂ eigenvectors, finding

⟨x|k̂|k⟩ = k ⟨x|k⟩  (1.212)

and, using once more the resolution of the identity in the left-hand side, one finds

∫_{−∞}^{+∞} dx′ ⟨x|k̂|x′⟩ ψ_k(x′) = k ψ_k(x) ,  (1.213)

where we defined ψ_k(x) ≡ ⟨x|k⟩. From Eq. (1.205) we can then simply write

⟨x|k̂|x′⟩ = δ(x − x′) ( −i d/dx′ ) ,  (1.214)

obtaining

−i dψ_k(x)/dx = k ψ_k(x) .  (1.215)
The solution of this differential equation is straightforward and one obtains for the eigenfunctions

ψ_k(x) = A e^{ikx} ,  (1.216)

where the eigenvalue k can be any real number. The free parameter A can be fixed by imposing a normalisation condition. To this extent, let us calculate the inner products

⟨k|k′⟩ = ∫_{−∞}^{+∞} dx ⟨k|x⟩⟨x|k′⟩ = A² ∫_{−∞}^{+∞} dx e^{−i(k−k′)x} = 2π A² δ(k − k′) ,  (1.217)

where we used a well-known representation of the delta function. The most common and sensible choice is then A = 1/√(2π), so that finally the eigenkets of k̂ in the basis of x̂ eigenkets are represented by

|k⟩ → ψ_k(x) = (1/√(2π)) e^{ikx} .  (1.218)

The eigenvectors of k̂ provide a complete basis in the physical Hilbert space, the space of interest in QM. This is the space of state vectors that can be normalised either to unity or to the Dirac delta function, the only ones that can indeed describe physical states in QM.
Since the eigenvectors |k⟩ form an orthonormal basis, a generic vector |ψ⟩ can also be expanded in the k̂ basis in the same way as in the x̂ basis (see Eq. (1.181)), explicitly:

|ψ⟩ = ∫_{−∞}^{+∞} dk ψ̃(k) |k⟩ ,  (1.219)

where ψ̃(k) is the wave function associated to the vector |ψ⟩ in the k̂ basis. It is now interesting to see how the two wave functions, ψ̃(k) and ψ(x), are related to each other. An expression of ψ̃(k) in terms of ψ(x) is easily derived:
ψ̃(k) ≡ ⟨k|ψ⟩ = ∫_{−∞}^{+∞} dx ⟨k|x⟩⟨x|ψ⟩  (1.220)
     = (1/√(2π)) ∫_{−∞}^{+∞} dx e^{−ikx} ψ(x) .  (1.221)
Of course one can also take the inverse path, obtaining ψ(x) from ψ̃(k):

ψ(x) = ⟨x|ψ⟩ = ∫_{−∞}^{+∞} dk ⟨x|k⟩⟨k|ψ⟩  (1.222)
     = (1/√(2π)) ∫_{−∞}^{+∞} dk e^{ikx} ψ̃(k) .  (1.223)
What we have found is that the familiar Fourier transform provides the way to switch from the x̂ basis to the k̂ basis, and vice versa with the inverse Fourier transform. Both bases can be used to expand state vectors in the Hilbert space.²³
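A numerical sketch of the change of basis (1.221) (assumptions for illustration: a finite grid, an unnormalised Gaussian test function, and a plain Riemann-sum discretisation of the integral); for a Gaussian the transform is again a Gaussian:

```python
import numpy as np

x = np.linspace(-20.0, 20.0, 4001)
dx = x[1] - x[0]
psi = np.exp(-x**2 / 2.0)                     # psi(x), a Gaussian test function

k = np.linspace(-5.0, 5.0, 201)
# psi~(k) = (1/sqrt(2 pi)) \int dx e^{-ikx} psi(x), discretised as a sum
psi_k = (np.exp(-1j * np.outer(k, x)) @ psi) * dx / np.sqrt(2.0 * np.pi)

assert np.allclose(psi_k, np.exp(-k**2 / 2.0), atol=1e-8)   # known analytic result
```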
We can also easily calculate the matrix elements of k̂ in the k̂ basis itself:

⟨k|k̂|k′⟩ = k′ ⟨k|k′⟩ = k′ δ(k − k′) .  (1.224)
As expected, we obtain exactly the same result we obtained for ⟨x|x̂|x′⟩ (cf. Eq. (1.193)), with x simply replaced by k. In the same way we can see that, if we calculate the matrix elements of x̂ in the k̂ basis, we obtain the reciprocal of the result obtained for the matrix elements of k̂ in the x̂ basis in Eq. (1.214). We can this time first calculate the action of x̂ on the wave function in the k̂ basis:

⟨k|x̂|ψ⟩ = ∫ dx ⟨k|x⟩ ⟨x|x̂|ψ⟩  (1.225)
        = (1/√(2π)) ∫ dx e^{−ikx} x ⟨x|ψ⟩  (1.226)
        = (i/√(2π)) (d/dk) ∫ dx e^{−ikx} ⟨x|ψ⟩  (1.227)
        = i (d/dk) ∫ dx ⟨k|x⟩ ⟨x|ψ⟩  (1.228)
        = i (d/dk) ⟨k|ψ⟩  (1.229)
        = i dψ̃(k)/dk .  (1.230)
From this result one can then also find easily, proceeding analogously to what we have done in (1.202), the matrix elements of x̂ in the k̂ basis:

⟨k|x̂|k′⟩ = δ(k − k′) ( i d/dk′ ) .  (1.231)

²³ We imply ‘physical’ when we refer to the Hilbert space from now on.
In summary, we found that in the x̂ basis the operator x̂ acts simply as x and k̂ as −i d/dx on the wave function ψ(x). On the other hand, in the k̂ basis, k̂ acts as k and x̂ as i d/dk: operators like x̂ and k̂, for which such an interrelationship holds, are said to be conjugate to each other.
Another interesting result is that, if one calculates the commutator, one finds²⁴

[x̂, k̂] = i .  (1.232)

Notice that here x̂ and k̂ are two generic operators. In QM we will identify x̂ with the position operator and p̂ ≡ ℏ k̂ with the momentum operator. It is now time to show how all the formalism of vector spaces we discussed in this chapter can be applied to QM, obtaining a much more elegant and powerful formulation than the one based on wave functions.
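One can sketch the commutation relation (1.232) numerically (assumptions: finite-difference derivatives via `np.gradient` as a stand-in for −i d/dx, a smooth decaying test function, and grid edges excluded; `k_op` is a helper name introduced here):

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 2001)
psi = np.exp(-x**2 / 2.0)                     # a smooth, decaying test function

def k_op(f):                                  # k = -i d/dx, via central differences
    return -1j * np.gradient(f, x)

comm = x * k_op(psi) - k_op(x * psi)          # [x, k] acting on psi
inner = slice(100, -100)                      # ignore discretisation at the edges
assert np.allclose(comm[inner], 1j * psi[inner], atol=1e-3)   # [x, k] psi = i psi
```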
1.21 Tensor product of vector spaces
Given two vector spaces V₁ and V₂, with dimensionality n₁ and n₂ respectively, there is a way to combine them to obtain a vector space of higher dimensionality.²⁵ Here the discussion can refer both to discrete and to continuous spaces.²⁶

²⁴ Of course the identity operator on the right-hand side is implied; remember that the multiplication by a scalar can always be regarded as the application of the identity operator times the scalar. We will also imply it in most of the expressions in the next chapters, unless there is a specific reason to indicate it explicitly. However, we will signal it in footnotes in some cases, just to warn the reader.
²⁵ Here we are particularly interested in the tensor product space, since this has great importance in QM. However, another possible way to combine two vector spaces is to consider their sum, defined in the following way:
Definition. The sum of two vector spaces V₁^{n₁} and V₂^{n₂}, denoted by V₁^{n₁} ⊕ V₂^{n₂}, contains:
i) all vectors of V₁^{n₁};
ii) all vectors of V₂^{n₂};
iii) all linear combinations of the above vectors.
The sum of the two vector spaces is still a vector space, with dimensionality n₁ + n₂.
Example. Consider two one-dimensional real vector spaces containing all vectors along two different orthogonal directions in Euclidean space, for example along the axes i and j. The sum of the two one-dimensional spaces will be the two-dimensional vector space V²_{i−j} = V¹_i ⊕ V¹_j, containing all vectors lying on the i−j plane.
²⁶ In the latter case, even though n₁ and/or n₂ can be infinite, it still makes sense to define mathematically the sum n₁ + n₂ and the product n₁ n₂.
Definition. Consider two vector spaces V₁ and V₂. Let us denote a generic vector in V₁ by |…⟩₁ and a generic vector in V₂ by |…⟩₂, where the dots stand for some symbol denoting the vectors. The direct product of two generic vectors |…⟩₁ and |…⟩₂, denoted by |…⟩₁ ⊗ |…⟩₂, is then defined as a linear operation, in such a way that

(α|ψ⟩₁ + β|ψ′⟩₁) ⊗ (γ|φ⟩₂) = αγ |ψ⟩₁ ⊗ |φ⟩₂ + βγ |ψ′⟩₁ ⊗ |φ⟩₂ .  (1.233)

In particular, if {|i⟩₁} is a basis for V₁ and {|j⟩₂} is a basis for V₂, one can form the direct products of basis vectors |i⟩₁ ⊗ |j⟩₂, where i = 1, 2, …, n₁ and j = 1, 2, …, n₂.
Definition. The tensor product of two vector spaces V₁ and V₂, with dimensionality n₁ and n₂ respectively, is denoted by V₁⊗₂. It is the vector space formed by all vectors |ψ⟩₁₂ that are linear combinations of the n₁n₂ direct products |i⟩₁ ⊗ |j⟩₂, explicitly

|ψ⟩₁₂ = Σ_{i,j} a_ij ( |i⟩₁ ⊗ |j⟩₂ ) .  (1.234)

In this way the direct products |i⟩₁ ⊗ |j⟩₂ form a basis for V₁⊗₂, called the product basis, and the dimensionality of V₁⊗₂ is, therefore, given by n₁n₂, i.e., the product of the dimensionalities of V₁ and V₂.
Notice that the direct product |ψ⟩₁ ⊗ |ψ⟩₂ of two generic vectors belongs to V₁⊗₂, but V₁⊗₂ also contains linear combinations of direct products that cannot be expressed in the form |ψ⟩₁ ⊗ |ψ⟩₂. Indeed, if we expand |ψ⟩₁ and |ψ⟩₂ in their respective bases {|i⟩₁} and {|j⟩₂},

|ψ⟩₁ = Σ_i a_i^{(1)} |i⟩₁  and  |ψ⟩₂ = Σ_j a_j^{(2)} |j⟩₂ ,  (1.235)

we obtain for their tensor product

|ψ⟩₁ ⊗ |ψ⟩₂ = Σ_{i,j} a_i^{(1)} a_j^{(2)} ( |i⟩₁ ⊗ |j⟩₂ ) .  (1.236)

Comparing with the expression (1.234), one can see that in this case one simply has a_ij = a_i^{(1)} a_j^{(2)}, i.e., the components of the direct product can be expressed as the product of the components of the two distinct vectors, and the vector is said to be separable. However, this is not possible for all states |ψ⟩₁₂, and some of them simply cannot be expressed in that way: these states are called entangled states.
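A sketch of the distinction for two qubits (toy assumption; `schmidt_rank` is a helper introduced here): the rank of the coefficient matrix a_ij is 1 for a separable state and larger than 1 for an entangled one, such as a Bell state:

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

separable = np.kron(ket0, ket1)                        # |0> (x) |1>
bell = (np.kron(ket0, ket0) + np.kron(ket1, ket1)) / np.sqrt(2.0)

def schmidt_rank(state, n1=2, n2=2):
    """Rank of a_ij when |psi> = sum_ij a_ij |i>(x)|j>."""
    return np.linalg.matrix_rank(state.reshape(n1, n2))

assert schmidt_rank(separable) == 1                    # product (separable) state
assert schmidt_rank(bell) == 2                         # entangled state
```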
We will see that in QM a tensor product space describes a system obtained by combining together different subsystems, like for example two-particle states. We will see that the implications are quite astonishing.
Bibliography

[1] R. Shankar, Principles of Quantum Mechanics, Springer, August 1994.
Chapter 2

The principles of quantum mechanics
We can now finally show in this chapter how vector spaces provide the correct
mathematical formalism to formulate QM.
2.1 The principles of Quantum Mechanics
These are the principles of QM:
(I) The state of a physical system in QM (referred to as the quantum state) is represented by a ray, or direction, in the (physical) Hilbert space. This is the set of all vectors α|ψ⟩ obtained multiplying a specific vector |ψ⟩ by an arbitrary scalar α. Any vector of the ray provides a fully equivalent description of the quantum state.¹ A specific choice of vector |ψ⟩ of the ray is usually referred to as a state vector.
Any state vector in the Hilbert space describes a possible quantum state. This implies that if one has two state vectors |ψ₁⟩ and |ψ₂⟩ describing two possible different quantum states of the system, then any vector |ψ⟩ = α|ψ₁⟩ + β|ψ₂⟩, with α and β two generic complex numbers, is also a state vector describing a possible quantum state of the system (superposition principle).²

¹ Notice that even choosing, as is usually done conventionally, state vectors of the ray normalised to unity or to a Dirac delta, one still has an infinite number of possible choices of state vectors describing the same quantum state, obtained multiplying |ψ⟩ by a phase factor e^{iφ}.
(II) Physical observables are described by Hermitian operators. For example, in the case of an elementary particle moving in one dimension, position is described by the Hermitian operator x̂ and momentum, properly defined, by the Hermitian operator p̂ = −iℏD̂, where D̂ is the differential operator associated to x̂ defined in the previous chapter.
(III) If the system is in a generic state |ψ⟩, the measurement of the observable corresponding to a Hermitian operator Ω̂ yields as a result one of the eigenvalues ω_i of Ω̂ with probability (here we refer to the discrete and non-degenerate case for definiteness)³

P(ω_i) ≡ |⟨ω_i|ψ⟩|² / Σ_j |⟨ω_j|ψ⟩|² = |⟨ω_i|ψ⟩|² / Σ_j ⟨ψ|P̂_{ω_j}|ψ⟩ = |⟨ω_i|ψ⟩|² / ⟨ψ|ψ⟩ ,  (2.1)

where P̂_{ω_j} = |ω_j⟩⟨ω_j| and each |ω_j⟩ is a normalised eigenvector with eigenvalue ω_j. The quantum state will ‘jump’ from |ψ⟩ → |ω_i⟩ as a result of the measurement (collapse of the quantum state).
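As a minimal sketch of the Born rule (2.1) (the observable and state below are toy assumptions for illustration):

```python
import numpy as np

Omega = np.array([[1.0, 1.0 - 1.0j],
                  [1.0 + 1.0j, 2.0]])         # a Hermitian observable
psi = np.array([1.0, 1.0j])                   # a (not yet normalised) state

w, V = np.linalg.eigh(Omega)                  # eigenvalues w[i], eigenkets V[:, i]
amps = V.conj().T @ psi                       # components <w_i|psi>
probs = np.abs(amps)**2 / np.vdot(psi, psi).real   # Born rule, Eq. (2.1)
assert np.isclose(probs.sum(), 1.0)           # probabilities sum to unity
print(dict(zip(np.round(w, 3), np.round(probs, 3))))
```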
2.2 Remarks
• Suppose the system is in a quantum state described by |ψ⟩ and that a measurement of the observable associated to the Hermitian operator Ω̂ is made. One can then determine the probability to find ω_i as a result of the measurement and that, correspondingly, the state vector collapses into the eigenket |ω_i⟩, in the following way:

i) Solve the eigenvalue problem for Ω̂, finding its eigenvalues ω_i and eigenvectors |ω_i⟩;
ii) Expand |ψ⟩ in the eigenbasis of Ω̂:

|ψ⟩ = Σ_i |ω_i⟩⟨ω_i|ψ⟩ ;  (2.2)

² Again notice that any state vector obtained from |ψ⟩ by multiplying it by some scalar describes the same quantum state, since this is determined by the ray and not by a particular state vector of the same ray.
³ It should be appreciated how this expression can be written only for Hermitian operators, since their eigenkets, as we proved, form a basis.
(iii) The probability will then be given by Eq. (2.1). Notice that if one has a normalised vector, then ⟨ψ|ψ⟩ = 1 and simply

P(ω_i) = |⟨ω_i|ψ⟩|² .  (2.3)

Notice that one can also write:

P(ω_i) = |⟨ω_i|ψ⟩|² = ⟨ψ|P̂_{ω_i}|ψ⟩ = ⟨P̂_{ω_i}ψ|P̂_{ω_i}ψ⟩ = ||P̂_{ω_i}ψ||² .  (2.4)

As a particular example, consider the case when |ψ⟩ is the normalised superposition of just two eigenkets:

|ψ⟩ = (α|ω₁⟩ + β|ω₂⟩) / (|α|² + |β|²)^{1/2} .  (2.5)

In this case any measurement can only yield as a result either ω₁ or ω₂, with probabilities respectively

P(ω₁) = |α|² / (|α|² + |β|²)  (2.6)

and

P(ω₂) = |β|² / (|α|² + |β|²) .  (2.7)

There is no analogue in classical physics.
• If the operator Ω̂ has degenerate eigenvalues ω = ω₁ = ω₂, then

P(ω) = |⟨ω, 1|ψ⟩|² + |⟨ω, 2|ψ⟩|² .  (2.8)

If we define the projector on the eigenspace corresponding to the eigenvalue ω,

P̂_ω ≡ |ω, 1⟩⟨ω, 1| + |ω, 2⟩⟨ω, 2| ,  (2.9)

then one can also write

P(ω) = ⟨ψ|P̂_ω|ψ⟩ = ⟨P̂_ω ψ|P̂_ω ψ⟩ .  (2.10)
• In the continuous case the expressions generalise in the following way. The expansion of the state vector in the basis of eigenkets is now given by the integral

|ψ⟩ = ∫ dω |ω⟩⟨ω|ψ⟩ .  (2.11)

The (infinitesimal) probability to measure a value in the infinitesimal interval [ω, ω + dω] will be given by

dP(ω) = |⟨ω|ψ⟩|² dω ,  (2.12)

while for the (finite) probability to obtain a result in the finite interval [ω₁, ω₂] one clearly has to integrate between ω₁ and ω₂:

P(ω₁ < ω < ω₂) = ∫_{ω₁}^{ω₂} dP(ω) = ∫_{ω₁}^{ω₂} dω |⟨ω|ψ⟩|² .  (2.13)
• The first principle states that the quantum state is described by a ray α|ψ⟩ in Hilbert space, meaning that, given a state vector |ψ⟩ describing the quantum state, any other state vector obtained from |ψ⟩ by multiplying it by a scalar α provides an equivalent description of the quantum state. Moreover, a superposition |ψ⟩ = α|ψ₁⟩ + β|ψ₂⟩ of two state vectors |ψ₁⟩ and |ψ₂⟩ describing two quantum states also describes a new possible quantum state. The new quantum state is also described by a ray in Hilbert space and, therefore, does not change if the state vector is multiplied by an overall scalar. Therefore, we can factorise α and write

|ψ⟩ = α ( |ψ₁⟩ + (β/α)|ψ₂⟩ ) ,  (2.14)

showing that the new quantum state is determined by the ratio β/α rather than by an independent choice of α and β.
• Collapse of the state vector. The third principle states that, as a result of the measurement of the observable Ô, the state vector will change from |ψ⟩ to some eigenket |ω_i⟩ of Ô, with probability P(ω_i) = |⟨ω_i|ψ⟩|², also corresponding to the probability of obtaining ω_i as result of the measurement (Born's rule). This is equivalent to saying that the process of measurement has acted on the state describing the system as the projection operator P̂_{ω_i}. Therefore, if we expand the state vector before the measurement in the eigenbasis, writing

|ψ⟩ = Σ_j |ω_j⟩⟨ω_j|ψ⟩ ,  (2.15)

the measurement will act as a process inducing the collapse of the state vector in such a way that

|ψ⟩ → |ψ′⟩ = P̂_{ω_i}|ψ⟩ = |ω_i⟩⟨ω_i|ψ⟩ .  (2.16)
Notice that if one starts from a normalised state, so that ⟨ψ|ψ⟩ = 1, and if the eigenkets are also normalised, so that ⟨ω_i|ω_i⟩ = 1, then |ψ′⟩ is not normalised in general, since ⟨ψ′|ψ′⟩ = |⟨ω_i|ψ⟩|² ≤ 1. However, as a matter of convenience, one can always normalise the final state to unity as well and redefine:

|ψ′⟩ → |ψ′⟩ = P̂_{ω_i}|ψ⟩ / ⟨P̂_{ω_i}ψ|P̂_{ω_i}ψ⟩^{1/2} .  (2.17)

Notice that, as expected, this state indeed coincides with |ω_i⟩ e^{iφ_i}; the overall phase φ_i = Arg(⟨ω_i|ψ⟩) is of course unphysical and can simply be reabsorbed redefining |ω_i⟩. Therefore, the collapse of the quantum state can indeed be expressed as a projection of the initial state onto the eigenket |ω_i⟩. We can also consider again the case when the measured eigenvalue ω is degenerate. In this case we can still write an expression similar to (2.17),

|ψ⟩ → |ψ′⟩ = P̂_ω|ψ⟩ / ⟨P̂_ω ψ|P̂_ω ψ⟩^{1/2} .  (2.18)

This time P̂_ω is the projector on the eigenspace of the eigenvalue ω, and |ψ′⟩ is not in general given by some specific eigenket |ω_i⟩ but by some linear combination of those that have been chosen as basis vectors for the eigenspace with eigenvalue ω. However, notice that such a linear combination is also an eigenket with eigenvalue ω, as we know (a numerical sketch of this projection-and-renormalisation step follows at the end of these remarks).
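The sketch announced above (the projector and state are toy assumptions; the eigenspace of ω is spanned by the first two basis vectors, as in Eq. (2.9)):

```python
import numpy as np

e1, e2, e3 = np.eye(3)                        # eigenbasis |w,1>, |w,2>, |w3>
P_w = np.outer(e1, e1) + np.outer(e2, e2)     # projector on the eigenspace of w

psi = np.array([1.0, 2.0j, 3.0])
psi /= np.linalg.norm(psi)                    # normalised initial state

P_of_w = np.vdot(psi, P_w @ psi).real         # probability to measure w, Eq. (2.10)
psi_after = P_w @ psi / np.sqrt(P_of_w)       # collapsed, renormalised, Eq. (2.18)
assert np.isclose(np.linalg.norm(psi_after), 1.0)
```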
2.3 How to test quantum mechanics
Suppose that a physical system is prepared in a state |ω⟩, for example as the effect of a measurement process of an observable Ô. Suppose that a second measurement, of an observable Λ̂, is performed on the system and that |ω⟩ can be expressed simply as a linear combination of two eigenstates |λ₁⟩ and |λ₂⟩ of Λ̂, explicitly:

|ω⟩ = (α|λ₁⟩ + β|λ₂⟩) / (|α|² + |β|²)^{1/2} .  (2.19)

QM predicts that the probabilities to measure the system in |λ₁⟩ and |λ₂⟩ are given respectively by

P(λ₁) = |α|² / (|α|² + |β|²)  and  P(λ₂) = |β|² / (|α|² + |β|²) .  (2.20)
These can be considered as predictions of QM that are, therefore, of a statistical nature: QM does not predict in this case the outcome of the measurement but the probability that a measurement will give a certain outcome. In order to test such statistical predictions, one needs to be able to perform the measurement a high number of times, and this implies the possibility of having a number of replicas of the system all initially prepared in the same quantum state |ω⟩: such a set of copies of the system is referred to as a quantum ensemble.
Notice that in the case of a classical ensemble one would always have the same outcome, in the limit where the experimental error can be neglected. In QM this would happen only if |ω⟩ coincides with either |λ₁⟩, for β = 0, or |λ₂⟩, for α = 0. This means that if a second measurement is repeated on the same copy of the ensemble quickly enough that the system had no time to evolve, the outcome will be the same as in the first measurement. This kind of behaviour has been observed in realistic experiments that have proven QM with incredible precision and accuracy.
Moreover, after the measurement that has induced the collapse of the quantum state, the subsequent evolution of the state can be predicted by solving the Schrödinger equation, and this gives further ways to test the theory. Notice how, in this way, the measurement process allows one to determine the initial quantum state of the system.
2.4 Expectation value and uncertainty of the measurement
Consider a system in some generic quantum state |ψ⟩. Having introduced the quantum ensemble, we can determine the average value of an observable Ô over the ensemble, that is, the average value one would obtain by repeating the measurement on all the copies of the system in the ensemble. In the limit where the number N of copies in the ensemble goes to infinity, the average value tends to its expectation value, which is determined just by the state |ψ⟩. One can derive an important result starting from the statistical
definition of the mean value:

⟨Ô⟩_ψ = Σ_i P_ψ(ω_i) ω_i  (2.21)
      = Σ_i |⟨ω_i|ψ⟩|² ω_i  (2.22)
      = Σ_i ⟨ψ|ω_i⟩⟨ω_i|ψ⟩ ω_i  (2.23)
      = Σ_i ⟨ψ|Ô|ω_i⟩⟨ω_i|ψ⟩  (2.24)
      = ⟨ψ|Ô ( Σ_i |ω_i⟩⟨ω_i| ) |ψ⟩  (2.25)
      = ⟨ψ|Ô|ψ⟩ ,  (2.26)

where we have used, once more, the completeness relation. This expression is very interesting since it shows that the expectation value can be calculated without knowing the eigenvalues and eigenstates of Ô. Notice that if the state of the system coincides with one eigenstate |ω_i⟩, then simply ⟨Ô⟩ = ω_i. Notice that we derived this result in the discrete case, but the derivation can also be easily extended to the continuous case.
We can also write an expression for the standard deviation, statistically the average fluctuation around the mean value,

ΔÔ = √⟨( Ô − ⟨Ô⟩ )²⟩ = √( ⟨Ô²⟩ − ⟨Ô⟩² ) ,  (2.27)

which in QM is referred to as the uncertainty in the observable Ô. Notice that if |ψ⟩ = |ω_i⟩, then ΔÔ = 0. In classical mechanics this is always the case, since a variable can ideally be measured with arbitrary accuracy and precision in the limit when the experimental error is made completely absent.
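A minimal sketch of Eqs. (2.26) and (2.27) (the observable and state below are toy assumptions):

```python
import numpy as np

O = np.array([[0.0, 1.0],
              [1.0, 0.0]])                    # a Hermitian observable
psi = np.array([np.cos(0.3), np.sin(0.3)])    # a normalised state

mean = np.vdot(psi, O @ psi).real             # <O>_psi = <psi|O|psi>, Eq. (2.26)
mean_sq = np.vdot(psi, O @ O @ psi).real      # <O^2>_psi
delta = np.sqrt(mean_sq - mean**2)            # uncertainty Delta O, Eq. (2.27)
print(mean, delta)
```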
2.5 Compatible and incompatible operators
Suppose that the physical system is initially in a quantum state |ψ⟩ and that a first measurement of the observable Ω̂ is performed, giving as a result the (non-degenerate) eigenvalue ω_i. From the third principle this also induces the
collapse of the state vector into an eigenstate |ω_i⟩:

|ψ⟩ −(first measurement)→ |ω_i⟩ = P̂_{ω_i}|ψ⟩ / ⟨P̂_{ω_i}ψ|P̂_{ω_i}ψ⟩^{1/2} .  (2.28)

One can then perform a second measurement, of another observable Λ̂, in such a way that one has a second collapse of the state, this time into an eigenstate |λ_i⟩ of Λ̂:

|ω_i⟩ −(second measurement)→ |λ_i⟩ = P̂_{λ_i}|ω_i⟩ / ⟨P̂_{λ_i}ω_i|P̂_{λ_i}ω_i⟩^{1/2} .  (2.29)
The eigenstate |λ_i⟩ does not in general coincide with the eigenstate |ω_i⟩ of Ω̂ and, therefore, in the Ω̂-eigenbasis it can be expanded as

|λ_i⟩ = Σ_j |ω_j⟩⟨ω_j|λ_i⟩ .  (2.30)

This implies that if one performs a third measurement, again of the observable Ω̂, the Λ̂ eigenstate |λ_i⟩ will now collapse into some eigenstate |ω_{i′}⟩ that, in general, does not coincide with |ω_i⟩, explicitly:

|λ_i⟩ −(third measurement)→ |ω_{i′}⟩ = P̂_{ω_{i′}}|λ_i⟩ / ⟨P̂_{ω_{i′}}λ_i|P̂_{ω_{i′}}λ_i⟩^{1/2} .  (2.31)
It is instructive to calculate the probability, before performing the three measurements, of obtaining ω_i, λ_i and ω_{i′} as the result of each measurement respectively. This is given by the product of three probabilities:

P(ω_i, λ_i, ω_{i′}) = |⟨ω_{i′}|λ_i⟩|² |⟨λ_i|ω_i⟩|² |⟨ω_i|ψ⟩|² .  (2.32)

A special situation occurs when the state produced by the first measurement, |ω_i⟩, is unaffected by the second measurement: this can only happen if |λ_i⟩ is also an eigenstate of Ω̂, in such a way to have a simultaneous eigenstate, denoted by |ω_i, λ_i⟩, of Ω̂ and Λ̂. If this happens, then one has at the same time:

Ω̂|ω_i, λ_i⟩ = ω_i |ω_i, λ_i⟩ ,  (2.33)
Λ̂|ω_i, λ_i⟩ = λ_i |ω_i, λ_i⟩ .  (2.34)
We can then operate again with Λ̂ on the first equation and with Ω̂ on the second equation, obtaining the same result:

Λ̂Ω̂|ω_i, λ_i⟩ = ω_i λ_i |ω_i, λ_i⟩ ,  (2.35)
Ω̂Λ̂|ω_i, λ_i⟩ = λ_i ω_i |ω_i, λ_i⟩ ,  (2.36)

and subtracting the two equations side by side one obtains the condition:

[Ω̂, Λ̂] |ω_i, λ_i⟩ = 0 .  (2.37)
This means that inverting the order of the two measurements the result will
be the same since in any case the measurement process on the state does not
alter the state itself.
We know already that two operators have common eigenbases when they commute with each other. Therefore, the condition [Ω̂, Λ̂] = 0 certainly guarantees that they have simultaneous eigenkets, and in the experiment we discussed one would simply have P(ω_i, λ_i, ω_{i′}) = |⟨ω_i|ψ⟩|² if i′ = i and P(ω_{i′}) = 0 if i′ ≠ i. In this situation one says that the two operators are compatible.
On the other hand, if [Ω̂, Λ̂]|ψ⟩ ≠ 0 for any non-trivial |ψ⟩, implying that there is no common eigenket and that any measurement of Λ̂ made on an eigenket of Ω̂ will not give a well-defined value (i.e., the uncertainty is always non-vanishing) and vice versa, then the two operators are said to be incompatible. An important example is the case of position x̂ and momentum p̂ of an elementary particle, since in this case, identifying p̂ = ℏk̂, where k̂ is the conjugate operator of x̂, one has from Eq. (1.232)

[x̂, p̂] = iℏ .  (2.38)
There is also a third case, when only some states are simultaneous eigenkets of both operators. In that case the condition (2.37) is satisfied only for these states but not in general; for this reason it is not correct to write [Ω̂, Λ̂] = 0, since the two operators do not commute with each other in general. Notice that one cannot find a full basis of simultaneous eigenstates, since otherwise this would necessarily imply that the condition (2.37) is satisfied for all states, and this would fall within the case of compatible operators.
2.5.1 Compatible operators: probability of a measurement outcome
Non-degenerate case. Let us consider again the case when we have two compatible operators Ω̂ and Λ̂, first in the case of no degeneracy. Suppose that the system is in an initial state |ψ⟩. The probability that measuring Ω̂ one obtains as a result ω_i is given by P(ω_i) = |⟨ω_i, λ_i|ψ⟩|². If we now perform the second measurement, of the observable Λ̂, since the two operators are compatible and have common eigenbases with eigenkets |ω_i, λ_i⟩, the probability to obtain λ_i, having measured ω_i in the first measurement, is simply unity: it is certain that the second measurement will yield λ_i as a result. In summary, the probability to obtain ω_i and λ_i, before performing the two measurements, is given by

P(ω_i, λ_i) = 1 · |⟨ω_i|ψ⟩|² = |⟨ω_i|ψ⟩|² .  (2.39)
It is clear that if we invert the order of the two measurements we will obtain exactly the same result, so that we can simply write:

P(λ_i, ω_i) = P(ω_i, λ_i) .  (2.40)

The situation is quite simple in this non-degenerate case.⁴ The first measurement induces the collapse of the quantum state, described by a ray in Hilbert space, along one of the rays corresponding to one of the common eigenstates of the two operators, with a probability given by the square of the absolute value of the component of the initial state along that direction. The collapse, as we have seen, is equivalent to projecting the initial state along one of the eigenstates. Therefore, once the projection has occurred, further measurements of Ω̂ and/or Λ̂ will not change the quantum state since, as we know, P̂_{ω_i}ⁿ = P̂_{ω_i}.
The degenerate case is just slightly more involved. A concrete example helps understanding, and it is quite easy to generalise. Consider the three-dimensional real vector space (the Euclidean space). Suppose that the operator Ω̂ has three different eigenvalues ω₁, ω₂, ω₃, while a second operator Λ̂, commuting with Ω̂ and, therefore, compatible, has one eigenvalue λ₃ with eigenket |ω₃, λ₃⟩, also an eigenket of Ω̂ with eigenvalue ω₃. This eigenket identifies an axis in the Euclidean space (think of the z axis).
The other eigenvalue of Λ̂ is degenerate, so that λ = λ₁ = λ₂, but the degeneracy is broken by Ω̂ and we can choose as basis of the eigenspace

⁴ If the operators are incompatible, then in general one has P(ω_i, λ_i) ≠ P(λ_i, ω_i). Of course in this case, moreover, the final quantum state does not have well-defined values of ω_i and λ_i, since after the second measurement one is in an eigenstate of the second operator with the measured eigenvalue, but this does not coincide with an eigenstate of the first operator.
associated to λ, which can be identified with the plane orthogonal to |ω₃, λ₃⟩, the two eigenkets |ω₁, λ⟩ and |ω₂, λ⟩, which can be identified for example with the x and y axes. Suppose now that the initial quantum state corresponds to some direction in the Euclidean space and is described by a normalised state vector with expansion in the eigenbasis of Ω̂ and Λ̂

|ψ⟩ = α|ω₃, λ₃⟩ + β|ω₁, λ⟩ + γ|ω₂, λ⟩ ,  (2.41)

with α, β and γ real. Suppose that a measurement of Ω̂ gives as a result ω₁, so that the measurement induces a projection along the x axis. Since this axis also lies in the subspace of eigenkets of Λ̂, a second measurement of Λ̂ will give λ as outcome and will not modify the quantum state, which will continue to be described by the ray along the x axis. Therefore, the probability to obtain first ω₁ and then λ is again simply given by

P(ω₁, λ) = 1 · |⟨ω₁, λ|ψ⟩|² = β² .  (2.42)

If we now first measure Λ̂, the probability to obtain λ is given by the proba-
bility that initial state collapses on the λ subspace, and this is given by

P (λ) = β 2 + γ 2 . (2.43)

As a result of the measurement, the state vector | ψ i collapses into a new


state vector | ψ 0 i given by the projection of | ψ i onto the eigenspace with
eigenvalue λ. When this is normalised, one obtains using Eq. (2.18):

| P̂λ ψ i β| ω1 , λ i + γ | ω2 , λ i
| ψ0 i = = , (2.44)
h P̂λ ψ | P̂λ ψ i1/2 (β 2 + γ 2 )1/2

If we then measure Ω̂ obtaining ω1 , the state vector is further projected along


| ω1 , λ i and the probability to obtain ω1 after having measured λ is given by
β 2 /(β 2 + γ 2 ). Therefore, the probability to measure first λ and then ω1 is
simply given by the product and we obtain

β2
P (λ, ω1 ) = (β 2 + γ 2 ) · = β2 , (2.45)
β2 + γ2
so that also in the degenerate case there is no difference in performing the
two measurements in different order. There is, however, a substantial differ-
ence: in the non-degenerate case the second measurement does not change
the quantum state obtained after the first measurement, so that the initial
74 CHAPTER 2. THE PRINCIPLES OF QUANTUM MECHANICS

quantum state directly collapses into the final quantum state, let us say
| ω1 , λ1 i. On the other hand, in the degenerate case, if one first measures
the degenerate eigenvalue the system goes through an intermediate quantum
state | ψ 0 i given by (2.44) that will then collapse into the final quantum state
after the second measurement. Of course for the special case γ = 0, this in-
termediate quantum state also coincides with the final one but, in general,
it does not.
Notice that in general, also Ω̂ might be degenerate and in this case one
could have a common eigenspace of quantum states corresponding to | ω, λ i
vectors. In this case one has to look for a third observable Γ̂ breaking the
degeneracy so that one can distinguish the states as | ω, λ, γi i, where the γi ’s
are the eigenvalue of Γ̂. Of course there might be necessity even of more
observable to fully break degeneracies and ultimately one obtains a set of
compatible operators providing a complete set of commuting observables.

2.6 The Schrödinger equation


We want now finally to discuss the dynamics of quantum states in QM, i.e.,
we want to derive the time dependence of quantum states. First of all, let
us stress that time in QM is not an operator but a parameter.

2.6.1 The time-evolution operator


We are then interested to understand how, given the state vector denoted
by | ψ(t0 ) i describing a physical system at some initial time t0 , this evolves
into some new state vector | ψ(t) i at some generic time t > t0 . We can see
this time evolution as the result of the action of some time-evolution operator
Û (t, t0 ), in a way that

time evolution
| ψ(t0 ) i −−−−−−−−−−−→ | ψ(t) i = Û (t, t0 ) | ψ(t0 ) i . (2.46)

As we have seen a generic state vector can be always expanded in the eigen-
basis {| ωi i} of some Hermitian operator Ω̂. Assuming a normalised vector,
the quantities |h ωi | ψ i|2 give the probabilities P (ωi ) to get the ωi ’s as result
of a measurement. The sum of all probabilities have to be equal to unity. At
the generic time t > t0 , this fundamental property has also to hold, so that
we can write X X
|h ωi | ψ(t0 i|2 = |h ωi | ψ(t i|2 = 1 . (2.47)
i i
2.6. THE SCHRÖDINGER EQUATION 75

It is easy then to see (PS n.7) that this straightforwardly translates into the
result that the time-evolution operator has to be unitary:

Û † (t, t0 ) Û (t, t0 ) = Iˆ . (2.48)

Notice that this also implies that the state vector remains normalised at all
later times. Notice that if the state first evolve from t0 to some time t1 and
then from t1 to some time t2 , i.e.,

Û (t1 ,t0 ) Û (t2 ,t1 )


| ψ(t0 ) i −−−−→ | ψ(t1 ) i = Û (t1 , t0 ) | ψ(t0 ) i −−−−→ | ψ(t2 ) i = Û (t2 , t1 ) | ψ(t1 ) i ,
(2.49)
then one has also to impose the composition property

Û (t2 , t0 ) = Û (t2 , t1 ) Û (t1 , t0 ) . (2.50)

If we now consider the case when the state vector evolves from t0 to a time
t0 + dt, where dt is an infinitesimal interval of time, having assumed that
time is a continuous variable, the unitarity operator has also to satisfy the
property
lim Û (t0 + dt, t0 ) = Iˆ . (2.51)
dt→0

These three properties, unitarity, composition and continuity, imply that the
time-evolution operator for such infinitesimal time displacement has to be
necessarily of the form (see PS n.7)


Û (t0 + dt, t0 ) = Iˆ − i dt , (2.52)
~

where the operator Ĥ is called the Hamiltonian operator and it has to be


Hermitian. The constant ~ is there in a way that, in the case where the
classical limit exists, the Hamiltonian operator matrix elements reduce to
the Hamiltonian in classical mechanics. Notice that the fact that the time-
evolution operator can be expressed in the form given by the Eq. (2.52) can
be also expressed saying that the Hamiltonian operator is the generator of
time evolution and this is consistent with the fact that the Hamiltonian in
Classical Mechanics is also the generator of time evolution.5
5
More generally whenever there is a continuous transformation, i.e., such that it can
be expressed as a function of some continuous parameter α in a way that when α → 0 the
transformation tends to the identity, and the operator Û (α) associated to the transfor-
mation also has to satisfy the properties of unitarity and composition, then the generator
76 CHAPTER 2. THE PRINCIPLES OF QUANTUM MECHANICS

2.6.2 Derivation of the Schrödinger equation


If we now apply the composition property Eq. (2.50) where t1 = t is some
generic time and t2 = t1 + dt, where dt is again an infinitesimal interval of
time, we can then write
!

Û (t + dt, t0 ) = Û (t + dt, t) Û (t, t0 ) = Iˆ − i dt Û (t, t0 ) . (2.53)
~

From this relation we can then write


Û (t + dt, t0 ) − Û (t, t0 ) = −i dt Û (t, t0 ) , (2.54)
~
that, in differential form, gives the Schrödinger equation for the time-evolution
operator

i~ Û (t, t0 ) = Ĥ Û (t, t0 ) . (2.55)
∂t
When this is applied to the state vector | ψ(t0 ) i, one obtains the Schrödinger
equation for the time evolution of the state vector:

i~ | ψ(t) i = Ĥ| ψ(t) i . (2.56)
∂t
The Schrödinger equation is a first order differential equation in time and,
therefore, it is sufficient to know the initial state vector to determine its time
evolution.

2.6.3 Stationary states and time dependence of the


state vector
As we said, the Hamiltonian operator Ĥ corresponds to the Hamiltonian
in classical mechanics that depends on classical physical variables: the gen-
eralised coordinates and momenta of the system. In the cases then when
of the transformation Q̂ is defined as Q̂ = const i[U (α) − I]/dα and is Hermitian. The
constant of proportionality is usually unity for the generator properly said but still in our
case one says that the Hamiltonian is the generator of time displacement since indeed in
natural units one would have ~ = 1. In this case the Planck constant appears in a way
that one has the right correspondence with the classical Hamiltonian that has the dimen-
sionality of an energy in the usual systems of measurements: it is a transformation from
the microscopic world units of measurements to the macroscopic world ones.
2.6. THE SCHRÖDINGER EQUATION 77

such a classical correspondence exists, the Hamiltonian operator is obtained


replacing generalised coordinates and momenta by their corresponding Her-
mitian operators. We will see soon a few examples. Since the Hamiltonian
operator is an Hermitian operator, it has real eigenvalues, that we denote
by Ei . The set of eigenvalues can be identified with the energy spectrum of
the system and the corresponding eigenkets, denoted by | Ei i, describe the
energy eigenstates. We can then write the eigenvalue equation for Ĥ as
Ĥ | Ei i = Ei | Ei i . (2.57)
Since Ĥ is Hermitian, the energy eigenstates provide a basis for the Hilbert
space, the Hamiltonian operator eigenbasis or simply the energy eigenbasis.
We can then expand the state | ψ(t) i in this Ĥ eigenbasis, writing6
X
| ψ(t) i = | Ei i h Ei | ψ(t) i (2.58)
i
X
= ai (t) | Ei i ,
i

where we defined ai (t) ≡ h Ei | ψ(t) i, the components of the state in the


Hamiltonian eigenbasis. Notice that if the Hamiltonian is independent of
time, certainly true for an isolated system, then its eigenvalues and eigen-
states are also independent of time and the time dependence is encoded in
the components ai (t). In this case the equation (2.57) is called the time-
independent Schrödinger equation.
In this case, inserting the expansion for | ψ(t) i, Eq, (2.58), into the
Schrödinger equation (2.56), we obtain
X dai (t) X
i~ | Ei i = Ei ai (t) | Ei i , (2.59)
i
dt i

that can be also written as:


X  dai (t) 
i~ − Ei ai (t) | Ei i = 0 . (2.60)
i
dt

Since the | Ei i’s are linearly independent, then necessarily each component
has to satisfy
dai (t) i Ei
=− ai (t) , (2.61)
dt ~
6
For definiteness we write we are referring to the discrete case but all expressions can
be extended to the continuous case as usual.
78 CHAPTER 2. THE PRINCIPLES OF QUANTUM MECHANICS

with solution i Ei
ai (t) = ai (0) e− ~
t
, (2.62)
where we have taken t0 = 0. The solution of the Schrödinger equation is
then given by X i Ei
| ψ(t) i = ai (0) e− ~ t | Ei i . (2.63)
i

It is remarkable that if the initial state coincides with one of the energy
eigenstates, let us say | ψ(0) i = | Ej i, then
i Ej
| ψ(t) i = e− ~
t
| Ej i , (2.64)

meaning that the system will remain in the same initial quantum state. It is
also easy to see that in this case the probability distribution P (ωi ) for any
generic observable Ω̂ is time-independent:

P (ωi , t) = |h ωi | ψ(t) i|2 (2.65)


i Ej 2
= h ωi | Ej i e− ~
t

= |h ωi | Ej i|2
= P (ωi , 0) .

This implies that in this case the expectation values and the standard devi-
ations of any generic observable Ω̂ is independent of time. For these reasons
the energy eigenstates are also called stationary states.

2.6.4 Evolution of the expectation value of time-independent


observables
Here we want now to generalise how expectation values of operators evolve
for a generic initial state. We first consider a particular (but significant) case
that will be then useful to derive more general results.

The norm of the state vector is time invariant


We have seen that the time-evolution operator is unitary and this implies
that if the initial state is normalised to unity then it remains normalised to
unity to any later time. More generally, if the state vector is not normalised,
the unitarity of Û (t) implies that the norm of the state vector does not change
2.6. THE SCHRÖDINGER EQUATION 79

with time. We can now indeed verify, as an exercise, that the Schrödinger
equation preserves the norm of the state. This can be proved quite easily
directly calculating the time derivative of a state obeying the Schrödinger
equation. Let us then start writing:

d ∂ψ(t) ∂ψ(t)
h ψ(t) | ψ(t) i = hψ(t) | i+h | ψ(t) i , (2.66)
dt ∂t ∂t
where | ∂ψ(t)/∂t i ≡ ∂| ψ(t) i/∂t and h ∂ψ(t)/∂t | is its adjoint.7 The Schrödinger
equation can be recast as

∂ψ(t) i
= − Ĥ | ψ(t) i . (2.67)
∂t ~

Taking the adjoint of the Schrödinger equation one has



∂ψ(t) i
= h ψ(t) | Ĥ . (2.68)
∂t ~

When these equations are inserted into Eq. (2.66), one immediately finds, as
expected, that the right-hand side vanishes, so that

d
h ψ(t) | ψ(t) i = 0 . (2.69)
dt
In particular, if the state is initially normalised to unity, it will remain nor-
malised to unity. As we said this result was expected, since it trivially follows
from the unitarity of the time-evolution operator. However, it is a useful pre-
liminary exercise for the proof of a more general result.

Generalised Ehrenfest’s theorem


Let us now consider a generic operator Q̂ which we assume to be time-
independent. However, its expectation value hQ̂i is in general depending on
time since the quantum state depends on time in general. We can then write

dhQ̂i d dψ(t) dψ(t)


= h ψ(t) |Q̂| ψ(t) i = h ψ(t) | Q̂ | i+h | Q̂ | ψ(t) i , (2.70)
dt dt dt dt
7
If one replaces | P
ψ(t) i with its column vector representation with components ai (t),
then h ψ(t) | ψ(t) i = i a?i (t) ai (t). The adjoint h ∂ψ(t)/∂t | is unambiguously defined and
it will be represented by a row with components ∂a?i /∂t.
80 CHAPTER 2. THE PRINCIPLES OF QUANTUM MECHANICS

and using again the Eqs.(2.67) and (2.68), one immediately finds Ehrenfest’s
theorem:
dhQ̂i i Dh iE
=− Q̂, Ĥ . (2.71)
dt ~
This equation is interesting, since it says that if the operator commutes
with the energy, then its expectation value is conserved. It is clear that if
Q̂ = Ĥ itself, then one finds immediately dhĤi/dt = 0. This shows that for
a time-independent Hamiltonian its expectation value, that can be identified
with the energy of the system, is a constant of motion: this result is clearly
the classical analogue of energy conservation for an isolated system. More
generally, it is interesting that from Ehrenfest’s theorem one finds that the
laws of classical mechanics are recovered for the expectation values of the
QM operators. Finally, notice that the conservation of the norm of the state
vector can be regarded as a special case of Ehrenfest’s theorem for the simple
ˆ
choice Q̂ = I.

Evolution of the expectation value of time-independent observables


We can finally derive the time dependence of the expectation value of an
observable Q̂ again assumed to be time independent. We have seen that
if | ψ(0) i coincide with a stationary state then the probability distribution
is independent of time and, therefore, expectation values of observables are
constant of motion. In general, however, we have that | ψ(0) i is a linear
combination of energy eigenstates
n
X
| ψ(0) i = | Ei i h Ei | ψ(0) i .
i=1

and we know that at the time t the state vector evolves into ψ(t) given by
Eq. (2.63). Given an observable Q̂, we can then calculate the time evolution
of its expectation value,

hQ̂i(t) = h ψ(t) |Q̂| ψ(t) i . (2.72)

In order to highlight the main feature of the time-evolution of hQ̂i, let us con-
sider the simple case when n = 2, this can be easily generalised to arbitrary
n. We have then
E1 E2
| ψ(t) i = e−i ~
t
a1 (0) | E1 i + e−i ~
t
a2 (0) | E2 i , (2.73)
2.6. THE SCHRÖDINGER EQUATION 81

where we remind that ai (0) = h Ei | ψ(0) i. If we now insert this expansion


into Eq. (2.72), we find that hQ̂i(t) is given by the sum of four terms:

hQ̂i(t) = |a1 (0)|2 Q11 + |a2 (0)|2 Q22 (2.74)


(E −E ) (E −E )
−i 2 ~ 1 +i 2 ~ 1
+a?1 (0) a2 (0) Q12 e t
+ a1 (0) a?2 (0) Q?12 e t
,

where Qij ≡ h Ei |Q̂| Ej i are of course the matrix elements of Q̂ in the energy
eigenbasis and we used Q21 = Q?12 . The last two terms in the right-hand side
are complex conjugate of each other so that we can also write
h (E2 −E1 )
i
hQ̂i(t) = |a1 (0)|2 Q11 + |a2 (0)|2 Q22 + 2Re a?1 (0) a2 (0) Q12 e−i ~ t .
(2.75)
Considering that given two complex numbers z1 and z2 one has Re(z1 z2 ) =
Re(z1 ) Re(z2 )−Im(z1 ) Im(z2 ), we can rewrite the last term in a way to obtain

hQ̂i(t) = |a1 (0)|2 Q11 + |a2 (0)|2 Q22 + 2Re [a?1 (0) a2 (0) Q12 ] cos α(t)
−2Im [a?1 (0) a2 (0) Q12 ] sin α(t) , (2.76)

where we defined
E2 − E1
α(t) ≡ t. (2.77)
~
Notice that at the time t = 0 one has

hQ̂i(0) = |a1 (0)|2 Q11 + |a2 (0)|2 Q22 + 2Re [a?1 (0) a2 (0) Q12 ] . (2.78)

With some simple algebraic steps, one can then re-express hQ̂i(t) as

α(t)
hQ̂i(t) = hQ̂i(0)−4Re[a?1 (0) a2 (0) Q12 ] sin2 −2Im[a?1 (0) a2 (0) Q12 ] sin α(t) ,
2
(2.79)
2
where we used the simple trigonometric formula cos α(t)−1 = −2 sin [α(t)/2].
As one can see the expression exhibits an oscillatory behaviour around the
initial value. It simplifies even further if a1 (0), a2 (0) and Q12 are real, since
in this case the last term on the right-hand side containing the imaginary
part vanishes and simply

α(t)
hQ̂i(t) = hQ̂i(0) − 4 a1 (0)a2 (0)Q12 sin2 . (2.80)
2
82 CHAPTER 2. THE PRINCIPLES OF QUANTUM MECHANICS

This expression has an important application to particle physics, more pre-


cisely, to neutrino oscillations. This is the only established process8 indi-
cating that the standard model of particle physics, containing massless neu-
trinos, needs to be extended by a new more general model incorporating
neutrino masses and mixing. More precisely, specialising in a proper way
Eq. (2.79), one can calculate the probability that electron neutrinos, pro-
duced in the centre of the sun, are detected on the Earth as muon and tauon
neutrinos.
Let us see how it works and how to establish a connection with the result
for hQ̂i(t) we just derived. In the case of neutrinos, the energy eigenstates
| E1 i and | E2 i have to be identified the neutrino mass eigenstates, | ν1 i, with
mass m1 , and | ν2 i, with mass m2 . Since neutrinos are in the ultra-relativistic
regime (corresponding to have cp  mi c2 (i = 1, 2) where p is their momen-
tum and c is the speed of light), the masses m1 and m2 correspond to energies
given respectively by:
m 2 c4
q
E1 = c2 p2 + m21 c4 ' cp + 1 , (2.81)
2cp
m 2 c4
q
E2 = c2 p2 + m22 c4 ' cp + 2 . (2.82)
2cp
On the other hand, neutrinos are produced and detected in flavour eigen-
states. In particular, electron neutrinos are produced in the centre of the
sun. These can be described by a state vector denoted by | νe i. Therefore,
making correspondence with our general formalism, we have that the initial
state vector is | ψ(0) i = | νe i. This can be expressed as a linear combination
of the two mass eigenstates by means of the solar neutrino mixing angle θ12 ,
explicitly:
| νe i = cos θ12 | ν1 i − sin θ12 | ν2 i . (2.83)
The other two neutrino flavours that can be detected in Earth laboratories
together with electron neutrinos are muon and tauon neutrinos. They can be
jointly described by a superposition state vector that we denote by | νµ+τ i.
This can also be expressed as a linear combination, orthogonal to | νe i, of
the two mass eigenstates, in a way that
| νµ+τ i = sin θ12 | ν1 i + cos θ12 | ν2 i . (2.84)
8
Its discovery in solar neutrinos and atmospheric neutrinos was made respectively by
the Canadian SNO experiment and by the Japanese SuperKamiokande experiment. The
spokesmen of the two collaborations, respectively Arthur Mc Donald and Takaaki Kajita,
had been awarded the Nobel Prize for physics in 2015.
2.6. THE SCHRÖDINGER EQUATION 83

Using the expression (2.80) and making the proper identifications, one finds
(see PS. 8) that the probability of an electron neutrino to be produced in the
centre of the Sun and be detected as either a muon or tauon neutrino on the
Earth is given by9

c4 ∆m212
 
2 2
Pνe →νµ+τ (t) = sin (2θ12 ) sin t , (2.86)
4cp~

where we defined ∆m212 ≡ m22 − m12 . The time t is the time that takes
to neutrinos to travel to us from the Sun and it is approximately given by
t ' `/c, where ` ' 1 A.U. is the baseline length and is of course given,
in the case of solar neutrinos, by the Sun-Earth distance. This formula,
with a proper replacement of the mixing angle, mass squared difference and
baseline length, applies also to a few other neutrino oscillation experimental
setups, in particular to atmospheric neutrinos. In this case muon neutrinos
are produced in the outer layer of the atmosphere and can be detected as
tauon neutrinos on the Earth surface.

2.6.5 The Schrödinger picture and the Heisenberg pic-


ture
The solution (2.63) for | ψ(t) i can be also written as
!
X iE
− ~i t
| ψ(t) i = e | Ei i h Ei | | ψ(0) i . (2.87)
i

This expression shows that the time-evolution operator, in the case of time-
independent Hamiltonian operator, is given by
X Ei
Û (t) = e−i ~
t
| Ei i h Ei | . (2.88)
i

9
Notice that the probability that the neutrino state is detected as an electron neutrino
state, like at the production, is of course simply given by

Pνe →νe (t) = 1 − Pνe →νµ+τ (t) , (2.85)

and is the so-called the (electron neutrino) survival probability, while Pνe →νµ+τ (t) is the
so-called (muon-tauon neutrino) appearance probability.
84 CHAPTER 2. THE PRINCIPLES OF QUANTUM MECHANICS

As an exercise, one can verify that this expression for Û (t) respects unitarity:
X (Ej −Ei )
Û † (t) Û (t) = | Ei i h Ei | Ej i h Ej | e−i ~ t (2.89)
i,j
X
= | Ei i h Ei |
i

= Iˆ .
We have so far assumed that the state vector describing the system also
describes its time evolution. However, there is an alternative description of
time evolution in QM considering that all physical observables are encoded
in matrix elements of operators of the kind h ψi (t) |Ô| ψj (t) i. If we redefine
state vectors and operators according to the following transformations
| ψ(t) iH = Û † | ψ(t) iS = | ψ(0) iS , (2.90)
ÔH (t) = Û † (t) ÔS Û (t) , (2.91)
where we denoted by the subscript ‘S’ quantities in the Schrödinger pic-
ture that we discussed so far, and with the subscript ‘H’ quantities in the
Heisenberg picture, it is straightforward to check that matrix elements are
invariant:
H h ψi |ÔH (t)| ψj iH = S h ψi (t) |ÔS (t)| ψj (t) iS . (2.92)
In this way, while in the Schrödinger picture state vectors evolve with time
and operators are usually time-independent, in the Heisenberg picture state
vectors are time independent and all operators are necessarily time depen-
dent. Notice that these two simultaneous unitary transformations of states
and operators, bringing from the Schrödinger to the Heisenberg picture,
are a particular example of the general double transformation performed in
Eq (1.139) combining an active (transforming states) and a passive (trans-
forming operators) transformation that leave invariant matrix elements. In
the following we will continue to work in the Schrödinger picture dropping
the subscript ‘S’.

2.6.6 Schrödinger equation for the wave function in


position basis
We want now to show how the familiar Schrödinger equation for the wave
function in position basis for a one-dimensional system can be easily recovered
from the general Schrödinger equation (2.56).
2.6. THE SCHRÖDINGER EQUATION 85

We can first of all take the inner products of both sides of (2.56) with an
x̂ basis eigenbra h x |, obtaining

d
i~ h x | ψ(t) i = h x |Ĥ| ψ(t) i . (2.93)
dt

We can then insert the resolution of identity to the right of Ĥ obtaining


Z
d
i~ h x | ψ(t) i = dx0 h x |Ĥ| x0 ih x0 | ψ(t) i . (2.94)
dt

Similarly to the results obtained for the matrix elements of x̂ and k̂, respec-
tively Eq. (1.193) and Eq. (1.214), the matrix elements of Ĥ in the x̂-basis
can be written as

h x |Ĥ| x0 i = δ(x − x0 ) H(x0 , p(x0 )) , (2.95)

so that one immediately obtains the Schrödinger equation for the wave func-
tion in position basis:

∂ψ(x, t)
i~ = H(x, p(x))ψ(x, t) . (2.96)
∂t
86 CHAPTER 2. THE PRINCIPLES OF QUANTUM MECHANICS
Bibliography

[1] P.A.M. Dirac, The principles of Quantum Mechanics, 1930, Oxford


Press.

[2] R. Shankar, Principles of Quantum Mechanics, August 1994, Springer.

87
88 BIBLIOGRAPHY
Chapter 3

Harmonic Oscillator in
Quantum Mechanics

In this chapter we finally study a first application of the formalism discussed


in the previous chapters studying the harmonic oscillator. Our discussion
will partially overlap with what you have already seen in PHYS2003 but
now all results will be expressed within the more general formalism of state
vectors and no integral is needed to be written explicitly. Even though we
will start from the classical Hamiltonian valid for a non-relativistic harmonic
oscillator, the results we obtain are more general and are actually valid even in
the relativistic case. The harmonic oscillator represents a very fundamental
building block in quantum field theories, describing essentially all elementary
particles and also excitations in solids (phonons) and in many other contexts.

3.1 Hamiltonian operator and dimensionless


variables
The classical Hamiltonian of a one-dimensional non-relativisitc harmonic os-
cillator can be expressed as

p2 1
H(x, p) = + m ω 2 x2 , (3.1)
2m 2

where m is the mass of a particle oscillating with an angular frequency ω.


The corresponding Hamiltonian operator is obtained straightforwardly sim-

89
90CHAPTER 3. HARMONIC OSCILLATOR IN QUANTUM MECHANICS

ply replacing x → x̂ and p → p̂:

p̂2 1
Ĥ(x̂, p̂) = + m ω x̂2 . (3.2)
2m 2
It is convenient to introduce the dimensionless operators1
r r
mω 1
X̂ = x̂ and P̂ = p̂ . (3.3)
~ m~ω

The Hamiltonian operator is then easily re-expressed in terms of X̂ and P̂


as
1  2 
Ĥ = X̂ + P̂ 2 ~ ω . (3.4)
2
These obey the commutation relation
h i
X̂, P̂ = i , (3.5)

that is a straightforward consequence of the fundamental quantum condition


(also called fundamental commutation relation)

[x̂, p̂] = i ~ . (3.6)

We have seen that this commutation relation for position and momentum
operators can be derived by identifying p̂ = ~k̂ = −i ~D̂, where k̂ is the
conjugate of x̂. Is this identification to be meant as an additional postulate
to the four we enunciated? This is one possibility, the other possibility is
to assume the commutation relation (3.6) as a postulate and in this case
one can show that from this one can derive p̂ = ~ k̂. However, even more
fundamentally, one can show that the identification p̂ = ~k̂ is a consequence
of identifying momentum, as in classical mechanics, with the generator of
translations and, if space is homogeneous its averaged value is a conserved
quantity of motion, analogously to what happens in classical mechanics where
from Nöther theorem one derives that momentum is a conserved quantity.
From this identification of momentum with the generator of translation it
follows, in our one-dimensional case, then necessarily p̂ = C D̂. The constant
of proportionality C needs to be purely imaginary in a way for the momentum
to be Hermitian, so that we can write p̂ = −i ~ D̂ where ~ is the Planck
constant and has the dimensionality of angular momentum. The Planck
1
Also called scaled position operator and scaled momentum operator respectively.
3.2. LADDER OPERATORS 91

constant is determined experimentally and has a value ~ ' 6.626 × 10−34 J s,


first estimated by Planck himself from the study of the black body radiation.
He proposed in 1901 described this as a collection of quantum harmonic
oscillators to solve the so-called ultraviolet catastrophe, a conceptual problem
rising from a classical description of black body radiation.

3.2 Ladder operators


Let us now introduce the operator

1  
â = √ X̂ + i P̂ (3.7)
2

and its adjoint


† 1  
â = √ X̂ − i P̂ . (3.8)
2
These are together called ladder operators and, specifically, respectively the
lowering and raising operator for reasons that we will soon become clear.2
Notice that â and ↠are not Hermitian since â 6= ↠. However, as we are
going to discuss, they are useful auxiliary operators for the solution of the
eigenvalue problem for Ĥ and, as we will discover, they also have an impor-
tant physical meaning.
The operators â and ↠do not commute and from Eq. (3.5) one can derive
 †
â, â = 1 . (3.9)

Therefore, one has ↠â 6= â ↠and explicitly one can write

† 1  
â â = X̂ + i P̂ (X̂ − i P̂ ) (3.10)
2
1 
2 2
 1
= X̂ + P̂ − i [X̂, P̂ ] (3.11)
2 2
Ĥ 1
= + (3.12)
~ω 2
2
The hat on these operators is often omitted, especially in historical papers and text-
books. We are here keeping it, following more recent literature, to be utterly unambiguous.
92CHAPTER 3. HARMONIC OSCILLATOR IN QUANTUM MECHANICS

and
1  
↠â =
X̂ − i P̂ (X̂ + i P̂ ) (3.13)
2
1 2  1
= X̂ + P̂ 2 + i [X̂, P̂ ] (3.14)
2 2
Ĥ 1
= − . (3.15)
~ω 2
Subtracting the two, one can verify again that â, ↠= 1. In terms of
 

the dimensionless number operator N̂ ≡ ↠â, the Hamiltonian can then be
expressed as3  
1
Ĥ = ~ ω N̂ + . (3.16)
2
The Hamiltonian eigenstates now can be labelled in terms of eigenvalues of
N̂ denoted by n, so that the eigenvalue equation for N̂ can be written as
N̂ | n i = n | n i , (3.17)
where the eigenstates | n i are normalised to unity, i.e., h n | n i = 1. The
eigenstate | n i is clearly also an eigenstate of Ĥ with eigenvalue En = (n +
1/2) ~ω. The eigenvalues n of N̂ have to be necessarily not only real, since
N̂ is Hermitian, but also nonnegative since
n = h n |N̂ | n i = h n |↠â| n i = h â n | â n i = ||â n||2 ≥ 0 . (3.18)
We want now to understand the action of the ladder operators on the eigen-
states | n i and the result will justify their names. In this respect, it is useful
to notice that
[N̂ , ↠] = [↠â, ↠] (3.19)
= ↠â ↠− ↠↠â
= ↠[â, ↠]
= â†
and
[N̂ , â] = [↠â, â] (3.20)
= ↠â â − â ↠â
= [↠, â] â
= − â .
3
Notice that N̂ is also often found to be written without the hat symbol in the literature.
3.2. LADDER OPERATORS 93

With these results we can now easily solve the eigenvalue problem for N̂
understanding the action of â and ↠. If we first act with the number operator
on ↠| n i, we find
 
† † †
N̂ â | n i = â N̂ + [N̂ , â ] | n i (3.21)
= ↠(N̂ + 1) | n i (3.22)
= (n + 1) ↠| n i , (3.23)

showing that ↠| n i is an eigenstate of N̂ with eigenvalue n + 1, unless ↠| n i


is the null vector. However, it is easy to see that ||a† n||2 = n + 1 > 0, so
that a† | n i can never be the null vector. We can then write

↠| n i = Cn+ | n + 1 i , (3.24)

where Cn+ is a constant that can be found


√ imposing h n | n i = 1 for all n and
+
choosing it real. The result is Cn = n + 1 (PS 8), so that finally one has

↠| n i = n + 1 | n + 1 i . (3.25)

Notice that this result now justifies why ↠is referred to as the raising oper-
ator.4
We can repeat the same procedure applying now the number operator to
the kets â | n i, finding:
 
N̂ â | n i = â N̂ + [N̂ , â] | n i (3.26)
= â (N̂ − 1) | n i (3.27)
= (n − 1) â | n i . (3.28)

This shows that, unless â| n i is the null vector, â | n i is an eigenstate of N̂


with eigenvalue n − 1 and, therefore, in this case we can write

â | n i = Cn− | n − 1 i , (3.29)

where Cn− is a constant to be determined again imposing h n√| n i = 1 for all


n and choosing it real. In this way one finds (PS 8) Cn− = n, so that one
finally has √
â | n i = n | n − 1 i . (3.30)
4
It is also called the creation operator, since in quantum field theory an elementary
particle of a given type is created by applying the associated creation operator to an
initial quantum state.
94CHAPTER 3. HARMONIC OSCILLATOR IN QUANTUM MECHANICS

This result now justifies why â is referred to as the lowering operator.5


However, this time it is not excluded that â | n i can be the null vector
since we know that n ≥ 0 and, therefore there must be a minimum value
n0 with 0 ≤ n0 < 1, such that necessarily â| n0 i = 0. The eigenstate | n0 i
is called the ground state. Since N̂ | n0 i = 0 = 0 | n0 i, it necessarily follows
that n0 = 0 and, therefore, the eigenvalues n must be nonnegative integers
(i.e., n = 0, 1, 2, . . . ). Correspondingly, the energy levels (the eigenvalues of
Ĥ) are given by  
1 1
En = n + ~ ω ≥ E0 ≡ ~ ω . (3.31)
2 2
Notice that we will denote the ground state by | 0 i. This has not to be
confused with the null vector simply denoted by 0. Therefore, summarising
we found6

↠| n i = n + 1|n + 1i, (3.32)

â | n i = n | n − 1 i (n ≥ 1) , (3.33)

while â | 0 i = 0 and of course â 0 = 0.


Notice that from Eq. (3.32), swapping left-hand and right-hand sides, one
can express all | n i’s in terms of the ground state | 0 i:

(↠)2 | 0 i (↠)3 | 0 i (↠)n | 0 i


| 1 i = ↠| 0 i , | 2 i =
√ , |3i = √ ,... ,|ni = √ .
2 2·3 n!
(3.34)
A legitimate question is to ask whether these are the only eigenvalues
and eigenvectors. For example, one could wonder whether there could be a
degenerate ground state | 00 i such that also â | 00 i = 0. However, it is possi-
ble to show that there cannot be degenerate eigenvalues in one-dimensional
systems.
A useful exercise is to calculate the expectation value of the potential
energy in the energy eigenstates

1
Vn ≡ h n |V̂ | n i = ~ω h n |X̂ 2 | n i . (3.35)
2
5
It is also called the destruction operator, since in quantum field theory an elementary
particle of a given type is destroyed by applying the associated destruction operator to an
initial quantum state.
6
Mnemonical note: the larger quantum number labelling the kets appears in the square
root.
3.3. MATRIX REPRESENTATIONS 95

From the definition of lowering and raising operators Eqs. (3.7) and (3.8),
one can simply write
â + â†
X̂ = √ (3.36)
2
so that
2 â â + â ↠+ ↠â + ↠â†
X̂ = . (3.37)
2
Using the orthonormality of the eigenstates | n i, one has simply h n |↠↠| n i =
h n |â â| n i = 0. Using then â ↠= [â, ↠] + ↠â = 1 + N̂ , one easily obtains
 
1 1 1
Vn = ~ω n + = En . (3.38)
2 2 2

3.3 Matrix representations


It is instructive to show what is the matrix representation of the eigenstates,
ladder operators and number operator in the N basis. The vector space is
discrete but infinite dimensional. The eigenstates are represented by infinite-
dimensional column vectors:
       
0 1 0 0
 0   0   1   0 
0 →  0  , | 0 i →  0  , | 1 i →  0  , | 2 i →  1  , . . . (3.39)
       
       
.. .. .. ..
. . . .

Notice the difference between the null vector 0 and the ground state | 0 i.
The easiest matrix representation is clearly the one of the number opera-
tor N̂ since its matrix elements Nn0 n = h n0 |N̂ | n i = n δn0 n so that one simply
has:  
0 0 0 0 ...
 0 1 0 0 
 
 0 0 2 0
N̂ →  . (3.40)

 0 0 0 3 
 
.. ..
. .
Clearly the matrix representing Ĥ is also diagonal, with energy levels En =
~ ω (n + 1/2) on the diagonal.
The matrix elements of the raising operator are given by (n0 , n = 0, 1, 2, . . . )
√ √
a†n0 n = h n0 |↠| n i = n + 1h n0 | n + 1 i = n + 1 δn0 ,n+1 , (3.41)
96CHAPTER 3. HARMONIC OSCILLATOR IN QUANTUM MECHANICS

in a way that one has (notice that the first row and the first column are the
zero-th row and the zero-th column)
 
0 0 0 0 ...
 11/2 0 0 0 ... 
 
†  0 21/2 0 0 ...
â →  . (3.42)

 0
 0 31/2 0 ... 

.. ...
.

Analogously the matrix elements of the lowering operator are given by (n0 , n =
0, 1, 2, . . . )
√ √
an0 n = h n0 |â| n i = nh n0 | n − 1 i = n δn0 ,n−1 . (3.43)

in a way that one has the following matrix representation:


 
0 11/2 0 0 ...
 0 0 21/2 0 . . . 
â →  0 0 . (3.44)
 
1/2
 0 3 . . . 
.. ..
. .

3.4 Eigenstate wave functions in the x̂-basis


The wave functions of the eigenstates in the x̂-basis are given by:

ψn (x) ≡ h x | n i . (3.45)

In particular, the wave function of the ground state ψ0 (x) ≡ h x | 0 i. It is


again convenient to work with the dimensionless operators X̂ and P̂ and,
therefore, to switch from the x̂-basis to the X̂-basis. In this way in the X̂-
basis simply X̂ → X. Moreover from the definitions of X̂ and P̂ in Eq. (3.3)
we have seen that X̂ and P̂ satisfy the commutation relation (3.5). This is
exactly the same commutation relation we found in general to be satisfied by
conjugate operators (see Eq. (1.232)) so that we can conclude that P̂ is the
conjugate operator of X̂. Therefore, one simply has that P̂ acts on a state
in a way that the transformed wave function

ψ 0 (X) ≡ h X |P̂ | ψ i = −i ψ(X) . (3.46)
∂X
Therefore, in the X̂-basis, P̂ is represented by −i∂/∂X, (i.e., P̂ → −i∂/∂X).
3.4. EIGENSTATE WAVE FUNCTIONS IN THE X̂-BASIS 97

We can now calculate the ground state wave function ψ0 (X) considering
that, since â| 0 i = 0, then necessarily h X |â| 0 i = 0. This means that the
wave function of the null vector vanishes for all values of X. From the
definition of â (see Eq. (3.7)), we can then recast this simple result as

h X |X̂| 0 i + i h X |P̂ | 0 i = 0 . (3.47)

From the representations of X̂ and P̂ in the X̂ basis, we than find the simple
differential equation for the ground state wave function

X ψ0 (X) + ψ0 (X) = 0 , (3.48)
∂X
with solution
1 2
ψ0 (X) = C e− 2 X , (3.49)
where C is an arbitrary constant. In the x̂-basis, this simply translates into
1 m ω x2
ψ0 (x) = c e− 2 ~ω . (3.50)

The constants C and c, as usual, can be conveniently chosen real and fixed
by normalising ψ0 (x) to unity, i.e., imposing
Z +∞
dX ψ0? (X) ψ0 (X) = 1 , (3.51)
−∞

finding
 14
mω 2

− 41
C=π and c = . (3.52)
π~ω
We can now calculate the normalised wave function of the first excited state
| 1 i = ↠| 0 i in the X̂-basis:

ψ1 (X) = h X |↠| 0 i . (3.53)

From the definition of ↠in Eq. (3.8), we can then write:


1 i
ψ1 (X) = √ h X |X̂| 0 i − √ h X |P̂ | 0 i (3.54)
2 2
 
1 ∂
= √ X− ψ0 (X) (3.55)
2 ∂X
 
C ∂ X2
= √ X− e− 2 . (3.56)
2 ∂X
98CHAPTER 3. HARMONIC OSCILLATOR IN QUANTUM MECHANICS
p
In the x̂-basis one simply has to replace C → c and X = mω 2 /(~ω) x. The
same procedure can be repeated √ to derive the wave function of the generic
† n
excited state | n i = (â ) | 0 i/ n!, finding:
 n
C ∂ X2
ψn (X) = √ X− e− 2 . (3.57)
2n n! ∂X

This can be also expressed in terms of Hermite polynomials


C X2
ψn (X) = √ Hn (X) e− 2 , (3.58)
2n n!
given by  n
X2 ∂ X2
Hn (X) = e 2 X− e− 2 . (3.59)
∂X
For example, one finds:

H0 (X) = 1, H1 (X) = 2 X, H2 (X) = 4 X 2 −2, H3 (X) = 8 X 3 −12 X . (3.60)

It is interesting to notice that the probability density function |ψ0 (x)|2


for the ground state of a quantum harmonic oscillator peaks at x = 0, while
classically this has a minimum at x = 0, since the oscillator is faster at x = 0
than everywhere else and, therefore, spends around 0 less time. However for
n → ∞ the probability density |ψn (x)|2 tends to the classical probability
distribution even though |ψn (x)|2 has nodes and anti-nodes. However, for
large n these are so close together that in the limit n → ∞ they become
experimentally undetectable.
Bibliography

[1] P.A.M. Dirac, The principles of Quantum Mechanics, 1930, Oxford


Press.

[2] R. Shankar, Principles of Quantum Mechanics, August 1994, Springer.

99
100 BIBLIOGRAPHY
Chapter 4

Angular momentum

In this Chapter we discuss another important physical quantity that, like


momentum, is also classically defined: angular momentum. However, as we
will see, there will be a twist and we will discover that there is, in general, a
component of angular momentum that does not have a classical counterpart:
this is the so-called spin angular momentum or simply spin.

4.1 Definition
Angular momentum can be defined in classical mechanics as the generator of
rotations and, for an isolated system, it is a conserved quantity of motion if
one assumes isotropy of space.1 Analogously to what we have seen in the case
of the momentum, one can then generalise the same definition to QM saying
that the angular momentum vector operator is the generator of rotations
(times the Planck constant ~). This definition is analogous to the definition
of Hamiltonian operator as generator of time-evolution. Let us consider a
rotation of the physical system by an angle θ around an axis along the unit
vector n. This will induce a transformation of the state vector | ψ i described
by an operator of rotation D̂(n, θ) 2 into a rotated state vector | ψ 0 (n, θ) i,
explicitly:
D̂(n,θ)
| ψ i −−−−→ | ψ 0 (n, θ) i = D̂(n, θ) | ψ i . (4.1)
1
More precisely, the angular momentum L · θ is the generator of a rotation of an angle
θ about an axis parallel to the unit vector n.
2
This is a generalization of the rotation operator R̂n (θ) acting on vectors in the Eu-
clidean space seen in problem sheets.

101
102 CHAPTER 4. ANGULAR MOMENTUM

If we consider a rotation of an infinitesimal angle dθ, proceeding along the


same lines we discussed for the time-evolution operator, then we can write
the rotation operator in the form:

Ĵ · n
D̂(n, dθ) = Iˆ − i dθ . (4.2)
~
The vector operator
Ĵ = Jˆx i + Jˆy j + Jˆz k (4.3)
is the generator of rotations and is called angular momentum (vector) op-
erator. Notice that, in the same way as the unitarity of the time-evolution
operator implies that the Hamiltonian operator has to be Hermitian, now
the unitarity of the rotation operator implies that Ĵ has to be Hermitian.
The crucial difference with classical mechanics is that in the case of QM
the angular momentum is the sum of two components, namely

Ĵ = L̂ + Ŝ . (4.4)

• The orbital angular momentum vector operator L̂ has a classical corre-


spondence.

• The spin angular momentum operator Ŝ does not have a classical cor-
respondence.
Defining Jˆ1 = Jˆx , Jˆ2 = Jˆy , Jˆ3 = Jˆz , from the general definition of angular mo-
mentum as generator of rotations, one can derive the following fundamental
commutation relations of angular momentum (i, j, k = 1, 2, 3):
h i
Jˆi , Jˆj = i ~ εijk Jˆk , (4.5)

where εijk is the Levi-Civita tensor defined by:


(i) ε123 = 1 ;

(ii) It is fully antisymmetric under the exchange of any two indexes (for
example: εjik = −εijk ).
This implies that all 21 entries where at least two indexes are equal vanish,
while ε123 = ε312 = ε231 = 1 and ε213 = ε321 = ε132 = −1.
For example, one has [Jˆx , Jˆy ] = i ~ Jˆz . We will not prove the relations
Eqs. (4.5) in full generality, but we will see that they hold in the specific case
of the orbital angular momentum of a particle.
4.2. LADDER OPERATORS FOR ANGULAR MOMENTUM 103

Analogously to the expression found for the time-evolution operator in


terms of the Hamiltonian, the expression of the rotation operator for a finite
angle θ is found to be:
Ĵ·n
D̂(n, θ) = e−i ~
θ
. (4.6)

4.2 Ladder operators for angular momentum


Let us introduce the squared angular momentum operator

Jˆ2 = Jˆx2 + Jˆy2 + Jˆz2 . (4.7)

This commutes with every one of Jˆk :

[Jˆ2 , Jˆk ] = 0 , (k = 1, 2, 3) . (4.8)

Let us see this specifically for the case k = 3. One has:

[Jˆ2 , Jˆz ] = [Jˆx2 , Jˆz ] + [Jˆy2 , Jˆz ] + [Jˆz2 , Jˆz ] (4.9)


= Jˆx [Jˆx , Jˆz ] + [Jˆx , Jˆz ] Jˆx + Jˆy [Jˆy , Jˆz ] + [Jˆy , Jˆz ] Jˆy
= −i~ Jˆx Jˆy − i ~Jˆy Jˆx + i~ Jˆy Jˆx + i ~Jˆx Jˆy = 0 .

This implies that one can always find a common eigenbasis for Jˆ2 and one of
Jˆk , conventionally Jˆz . If we indicate with λ ~2 the eigenvalues of Jˆ2 and with
m ~ the eigenvalues of Jˆz , then their common normalised eigenstates can be
denoted by | λ, m i and, therefore, we can write:

Jˆ2 | λ, m i = λ ~2 | λ, m i , Jˆz | λ, m i = m ~ | λ, m i . (4.10)

At this stage λ and m can be any real numbers but we now show that the
fundamental commutation relations enforce stringent limitations. First of all
we can introduce the ladder operators (for angular momentum) defined as:

Jˆ± ≡ Jˆx ± i Jˆy , (4.11)

where notice that Jˆ±† = Jˆ∓ . They satisfy the commutation relations
h i
Jˆz , Jˆ± = ±~ Jˆ± , [Jˆ+ , Jˆ− ] = 2~ Jˆz and [Jˆ2 , Jˆ± ] = 0 , (4.12)
104 CHAPTER 4. ANGULAR MOMENTUM

which can be easily obtained from the fundamental commutation relations


Eqs. (4.5). For example, let us show the first one:
[Jˆz , Jˆ± ] = [Jˆz , Jˆx ± i Jˆy ] (4.13)
= [Jˆz , Jˆx ] ± i [Jˆz , Jˆy ]
= i ~ Jˆy ± ~ Jˆx
= ± ~ (Jˆx ± i Jˆy )
= ± ~ Jˆ± .
Using these commutation relations, we can easily find the physical meaning
of Jˆ± | λ, m i, proceeding in a similar way to what we did for the harmonic
oscillator. We can first of all see how Jˆz acts on | λ, m i:
 
Jˆz (Jˆ± | λ, m i) = [Jˆz , Jˆ± ] + Jˆ± Jˆz | λ, m i (4.14)
= (m ± 1) ~ (Jˆ± | λ, m i) . (4.15)
We can see then that Jˆ+ and Jˆ− transform an eigenstate of Jˆz in another
eigenstate of Jˆz but respectively with m increased by one unit and m de-
creased by one unit. Therefore, Jˆ± allow to make one step up or down on the
ladder of Jˆz eigenvalues and for this reason they are called ladder operators
for the angular momentum.
However, it is straightforward to see that Jˆ± do not change the eigenvalue
of Jˆ2 since they commute with each other:
Jˆ2 (Jˆ± | λ, m i) = Jˆ± Jˆ2 | λ, m i = λ ~2 (Jˆ± | λ, m i) . (4.16)
In conclusion, we can write:
Jˆ± | λ, m i = Cλm
±
~ | λ, m ± 1 i , (4.17)
±
where the coefficients Cλm are dimensionless constants that can be deter-
mined by normalising all eigenkets | λ, m i for all values of λ and m, similarly
to how we determined the constants Cn± for the harmonic oscillator ladder
±
operators. Let us show the calculation of Cλm in detail. From Eq. (4.17),
taking the squared norms of both right-hand side and left-hand side kets and
using the orthonormality of the eigenkets, one obtains
|Cλm | ~ = h λ, m |Jˆ∓ Jˆ± | λ, m i
± 2 2
(4.18)
= h λ, m |Jˆ2 − Jˆ2 ∓ ~ Jˆz | λ, m i
z
2
= ~ [λ − m(m ± 1)] .
4.3. EIGENVALUES OF Jˆ2 AND JˆZ 105

±
Choosing conventionally Cλm real and positive, i.e., setting a phase factor
±
p
equal to unity, we find Cλm = λ − m(m ± 1) so that we can finally write

Jˆ± | λ, m i = [λ − m(m ± 1)]1/2 ~ | λ, m ± 1 i . (4.19)

4.3 Eigenvalues of Jˆ2 and Jˆz


From the definitions of ladder operators, one easily obtains:

Jˆ− Jˆ+ = Jˆ2 − Jˆz2 − ~ Jˆz and Jˆ+ Jˆ− = Jˆ2 − Jˆz2 + ~ Jˆz , (4.20)

showing that
1ˆ ˆ 
Jˆ2 − Jˆz2 = J− J+ + Jˆ+ Jˆ− . (4.21)
2
Using that Jˆ− and Jˆ+ are the adjoint of each other, one has
1  
h λ, m |Jˆ2 − Jˆz2 | λ, m i = h λ, m | Jˆ− Jˆ+ + Jˆ+ Jˆ− | λ, m i (4.22)
2
1  ˆ 
= h J+ λm | Jˆ+ λm i + h Jˆ− λm | Jˆ− λm i ≥ 0 ,
2
where, as usual, we defined | Jˆ± λm i ≡ Jˆ± | λ, m i. At the same time, from
the eigenvalue equations (4.10), one obtains

h λ, m |Jˆ2 − Jˆz2 | λ, m i = ~2 (λ − m2 ) . (4.23)

In this way we find the result

0 ≤ m2 ≤ λ , (4.24)

showing that λ ≥ 0 and that m2 has an upper bound, implying that there
exists both a maximum value mmax , such that m ≤ mmax , and a minimum
value mmin , such that m ≥ mmin . This implies both

Jˆ+ | λ, mmax i = 0 (4.25)

and
Jˆ− | λ, mmin i = 0 . (4.26)
Let us start to see the implications of the first of these two conditions,
Eq. (4.25). It also implies

Jˆ− Jˆ+ | λ, mmax i = 0 , (4.27)


106 CHAPTER 4. ANGULAR MOMENTUM

and from Eq. (4.20) we can rewrite it as

(Jˆ2 − Jˆz2 − ~ Jˆz ) | λ, mmax i = 0 , (4.28)

that immediately yields the condition

λ = mmax (mmax + 1) . (4.29)

Similarly, we can proceed from the condition in Eq. (4.114) and write

Jˆ+ Jˆ− | λ, mmin i = 0 . (4.30)

Using the second equation in (4.20), this gives

(Jˆ2 − Jˆz2 + ~ Jˆz ) | λ, mmin i = 0 , (4.31)

that yields now a condition on mmin :

λ = mmin (mmin − 1) . (4.32)

A comparison of Eq. (4.29) with Eq. (4.32), immediately shows that mmin =
−mmax . The value mmax is usually simply denoted by j, in a way that
Eq. (4.29) can also be written as λ = j (j + 1) and one has

−j ≤ m ≤ j . (4.33)

Notice that j 2 < λ = j(j + 1). This means that the absolute value of the
angular momentum along the z direction can never be equal to the total
angular momentum, as in classical mechanics, but is always lower. This is
because if j 2 = λ, then the x and y components would vanish but in such a
situation all components of the angular momentum would be simultaneously
determined and this would contradict the commutation relations.
Clearly, applying successively Jˆ+ to | λ, −j i, one obtains | λ, j i in n = 2 j
steps, where n is some integer. Therefore, one obtains j = n/2, meaning that
j can be either an integer number, if n is even, or a half-integer number, if n
is odd. This implies that if j is an integer, then all values of m are integers
and if j is half-integer, then all values of m are half-integers. For a given j,
one has then 2j + 1 allowed values of m given by:

m = −j, −j + 1, . . . , j − 1, j . (4.34)

These are some typical examples:


4.4. MATRIX ELEMENTS AND REPRESENTATIONS OF ANGULAR MOMENTUM OPERATORS

• j = 0 ⇒ m = 0;

• j = 1/2 ⇒ m = −1/2, +1/2;

• j = 1 ⇒ m = −1, 0, +1;

• j = −3/2 ⇒ m = −3/2, −1/2, +1/2, +3/2.


Instead of λ, it is customary to use j as a label for the eigenvectors, in this
way we can rewrite the eigenvalue equations for Jˆ2 and Jˆz as

Jˆ2 | j, m i = j (j + 1) ~2 | j, m i , (4.35)
Jˆz | j, m i = m~ | j, m i . (4.36)

It should be noticed how all results have been obtained in great generality just
from the fundamental commutation relations (4.5) and these just follow from
the general definition of angular momentum vector operator as the generator
of rotations.

4.4 Matrix elements and representations of


angular momentum operators
As we did for the harmonic oscillator, we want now to calculate the matrix
elements of different momentum operators and show their matrix representa-
tion in the simultaneous eigenbasis of Jˆ2 and Jˆz with normalised eigenvectors
| j, m i. Let us first of all start from Jˆ2 , one finds in a straightforward way

h j 0 , m0 |Jˆ2 | j, m i = j (j + 1) ~2 δj 0 j δm0 m , (4.37)

confirming of course that Jˆ2 is diagonal. If we fix j = 1, the three eigenstates


| 1, 1 i, | 1, 0 i and | 1, −1 i provide an eigenbasis for the subspace j = 1. We
will order them in a way that | 1, 1 i → (1, 0, 0)> , | 0, 0 i → (0, 0, 0)T and
| 1, 0 i → (0, 1, 0)T . The matrix representation of Jˆ2 in this subspace is
particularly simple:  
1 0 0
Jˆ2 → 2 ~2  0 1 0  . (4.38)
0 0 1
Clearly Jˆz is also diagonal and one has

h j 0 , m0 |Jˆz | j, m i = m ~ δj 0 j δm0 m . (4.39)


108 CHAPTER 4. ANGULAR MOMENTUM

The matrix representation of Jˆz in the subspace j = 1 is also very simple:


 
1 0 0
ˆ
Jz → ~  0 0 0 . (4.40)
0 0 −1

We can finally draw our attention to the ladder operators. First of all let us
rewrite Eq. (4.19) replacing λ → j(j + 1):

Jˆ± | j, m i = [j(j + 1) − m(m ± 1)]1/2 ~ | j, m ± 1 i . (4.41)

The matrix elements can then be written as

h j 0 , m0 |Jˆ± | j, m i = ~ [j(j + 1) − m(m ± 1)]1/2 δj 0 j δm0 ,m±1 . (4.42)

In the subspace j = 1, with the convention we choose for the eigenvectors,


the matrix representations of Jˆ+ and Jˆ− are then given by:
 √   
0 2 √0 √0 0 0
Jˆ+ → ~  0 0 2  , Jˆ− → ~  2 √0 0  . (4.43)
0 0 0 0 2 0

Let us now consider the operators Jˆx and Jˆy . These are straightforwardly
expressed in terms of the ladder operators as:

Jˆ+ + Jˆ− Jˆ+ − Jˆ−


Jˆx = and Jˆy = . (4.44)
2 2i
The matrix elements are then easily calculated from the matrix elements of
the ladder operators. In the subspace j = 1, their matrix representation is
given by (m0 , m = 1, 0 − 1):
   
0 1 0 0 −i 0
~ ~
Jˆx → √  1 0 1  and Jˆy → √  i 0 −i  . (4.45)
2 0 1 0 2 0 i 0

We can now make a useful exercise solving the eigenvalue problem for Jˆx
(analogous results can be easily obtained for Jˆy ). First of all one can easily
check that Jˆx (and likewise Jˆy ) has the same eigenvalues as Jˆz , i.e., ~, 0 and
−~. This is expected from space isotropy, since all directions are equivalent.
4.5. ORBITAL ANGULAR MOMENTUM 109

If we indicate the (normalised) eigenkets with | 1, 1x i, | 1, 0x i and | 1, −1x i,


one finds for their column representations:
     
√ 1 1 1

1 1 1
| 1, 1x i →  2  , | 1, 0x i → √  0  , | 1, −1x i →  − 2  .
2 2 2
1 −1 1
(4.46)
Let us now discuss quite an instructive problem. Let us suppose that a
system is prepared in initially in the Jˆz eigenstate described by the eigenket
| 1, 1 i → (1, 0, 0)T . This can be done simply measuring Jˆz and selecting those
copies of the ensemble that give as a result ~. Let us suppose now to measure
Jˆx . Which the probabilities to obtain as outcome one of the three eigenvalues
(~, 0 and −~)? To answer this question we need of course to calculate the
three probabilities P (mx ) = |h 1, mx | 1, 1 i|2 , where mx = 1, 0, −1. This can
be easily done and, for example, for mx = 1, one finds:
  2
1
1 √ 1
P (mx = 1) = (1 2 1)  0  = . (4.47)
2 4
0

Analogously, one finds P (mx = 0) = 1/2 and P (mx = −1) = 1/4. After the
measurement of the x component one could again measure the z component
of the angular momentum. This time the initial state is in one of the L̂x eigen-
states. One can now again calculate the probability to obtain as outcome one
of the three eigenvalues of L̂z . Suppose that the outcome of the x component
measurement was mx = 1, this time one has P (m) = |h 1, m | 1, 1x i|2 . For
example the probability to obtain m = 1 would be P (1) = 1/4 and not 1 as
one could naively expect.

4.5 Orbital angular momentum


The classical angular momentum of a particle can be written as L = r × p,3
where ~r = (x, y, z) and p~ = (px , py , pz ). Correspondingly, the operator for
the orbital angular momentum can be written as

L̂ = r̂ × p̂ (4.48)
3
Notice that using the Levi-Civita Ptensor one can also write the components of a generic
vectorial product ~c = ~a × ~b as ci = j,k εijk aj bk .
110 CHAPTER 4. ANGULAR MOMENTUM

or, explicitly in components:

L̂x = ŷ p̂z − ẑ p̂y , (4.49)


L̂y = ẑ p̂x − x̂ p̂z , (4.50)
L̂z = x̂ p̂y − ŷ p̂x . (4.51)

In the three-dimensional case the representation of the vector momentum


operator generalises into p̂ → −i~∇. The commutation relations for coordi-
nates and momenta of a particle, that generalise the commutation relation
[x̂, p̂] = i ~ in the simple simple one-dimensional case, are given by the fol-
lowing canonical commutation relations:

[x̂i , p̂j ] = i ~ δij (4.52)

(where x̂1 = x̂, x̂2 = ŷ, x̂3 = ẑ and correspondingly for the three momenta).
This is a compact expression implying that the commutators for the same
components are given by
[x̂i , p̂i ] = i ~ , (4.53)
while the commutators [x̂i , p̂j ] = 0 for i 6= j. These commutation rela-
tions can be easily derived similarly to the one-dimensional case. From these
commutation relations for positions and momenta, one can then derive the
commutation relations for the operators of the orbital angular momentum
components: h i
L̂i , L̂j = i ~ εijk L̂k , (4.54)

that, when spin is also included, are generalised by the Eqs. (4.5), as we
pointed out.
Proof. Let us prove the commutation relations (4.54) in the case i = 1
and j = 2. Using the commutation relations (4.52), one has:

[L̂x , L̂y ] = [ŷ p̂z − ẑ p̂y , ẑ p̂x − x̂ p̂z ] (4.55)


= [ŷ p̂z , ẑ p̂x ] + [ẑ p̂y , x̂ p̂z ]
= ŷ p̂x [p̂z , ẑ] + p̂y x̂ [ẑ, p̂z ]
= i ~ (x̂ p̂y − ŷ p̂x )
= i ~ L̂z .

Similar derivations can be repeated for the other two non-vanishing commu-
tators, confirming Eq. (4.54).
4.5. ORBITAL ANGULAR MOMENTUM 111

We are now interested to see how the orbital angular momentum oper-
ator acts on on the wave function in the position basis, this time in three
dimensions, that of course will be given by

ψ(~r) ≡ h ~r | ψ i . (4.56)

To this extent it is more convenient to use spherical coordinates (r, θ, φ)


rather than Cartesian coordinates (x, y, z), simply related by

x = r sin θ cos φ , (4.57)


y = r sin θ sin φ , (4.58)
z = r cos θ . (4.59)

The reason is that orbital angular momentum is the generator of rotations


and, therefore, one expects that it does not act on the radial dependence
of the wave function. Indeed it is possible to show that the action of its
components is given by:
 
∂ ∂
h ~r |L̂x | ψ i = i ~ sin φ + cot θ cos φ ψ(~r) , (4.60)
∂θ ∂φ
 
∂ ∂
h ~r |L̂y | ψ i = i ~ cos φ + cot θ sin φ ψ(~r) , (4.61)
∂θ ∂φ

h ~r |L̂z | ψ i = −i ~ ψ(~r) . (4.62)
∂φ
Notice that this action can be also more briefly expressed saying that the
three angular momentum components representations in position basis are
given by:
 
∂ ∂
L̂x → i ~ sin φ + cot θ cos φ , (4.63)
∂θ ∂φ
 
∂ ∂
L̂y → i ~ cos φ + cot θ sin φ , (4.64)
∂θ ∂φ

L̂z → −i ~ . (4.65)
∂φ
These results confirm that the angular momentum vector operator does not
involve the radial dependence of the wave function. We want now to deter-
mine, at least partially, the wave functions corresponding to the eigenstates
| l, m i. The angular momentum quantum numbers l and m are not sufficient
112 CHAPTER 4. ANGULAR MOMENTUM

to fully determine the wave function: they are not a complete set of opera-
tors. This is because one needs another operator, usually the Hamiltonian,
to determine also the radial part. We can denote generically by n the addi-
tional quantum number that is needed to characterise the radial part so that
the full quantum states can be denoted by kets | n, l, m i and the associated
wave functions will be given by ψnlm (~r) = h ~r | n, l, m i. If the system has a
spherical symmetry. This means that the particle moves in a potential with
spherical symmetry that depends only on r but not on θ and φ, then the
solutions will factorise in a radial part Rnl (r) and in an angular part given
by the spherical harmonics Ylm (θ, φ), explicitly:

ψnlm (~r) = Rnl (r) Ylm (θ, φ) . (4.66)

The radial part has to be determined by solving the eigenvalue problem


for the operator whose eigenstates are labelled by the quantum number n,
typically the Hamiltonian, giving rise to a radial equation for Rnl (r). Let us
now consider the action of L̂z on ψnlm (~r), since it does not involve the radial
coordinate r, one has


h r, θ, φ |L̂z | n, l, m i = −i ~ Rnl (r) Ylm (θ, φ) . (4.67)
∂φ

At the same time, since | n, l, m i is an eigenket of L̂z , one also has

h r, θ, φ |L̂z | n, l, m i = m ~ Rnl (r) Ylm (θ, φ) . (4.68)

One then obtains a differential equation from which the radial part drops
out:

Ylm (θ, φ) = i m Ylm (θ, φ) . (4.69)
∂φ
This has as a solution:

Ylm (θ, φ) = flm (θ) ei m φ . (4.70)

The wave function in position basis needs to be invariant under a rotation of


2π [3], meaning that it has to be single-valued. This imposes the condition
ei m (φ+2π) = ei m φ , implying ei 2 m π = 1 and, therefore, m and l integers. In
this way half-integer values are excluded for orbital angular momentum. We
will see however, that half-integer values are allowed for spin, and, therefore,
also for total angular momentum.
4.6. SPIN 113

4.6 Spin
In this section we now discuss the spin angular momentum, first its general
properties and then some specific applications.

4.6.1 General properties


In classical physics a point-like particle at rest in the laboratory reference
frame has no angular momentum, no rotation can be generated in this case.
In QM the situation is different, since even for a point-like particle at rest
there can be an associated angular momentum. If the particle is described
in position basis by just one wave function ψ(~r), this would be constant in
time. However, there is also the possibility that a particle, at a give point, is
described not by just one component but by more components, for example
by two components that we can denote by ψ+ (~r) and ψ− (~r). In this situation
the system, though does not have an orbital angular momentum, remember
we are assuming it is at rest, it can still have an angular momentum, since the
two components in a given point can be in general reshuffled by a rotation.
The component of angular momentum describing the way how the multiple
components of the state vector in a point are reshuffled by a rotation, is
called spin. The name derives from the fact that historically it was thought
that such angular momentum was associated to the existence of a small scale
structure of the involved particles that could then spin around some axis. It
was first observed in the famous Stern-Gerlach experiments, using a beam of
silver atoms moving along the central axis of a magnetic field with cylindrical
symmetry. Despite the orbital angular momentum along this axis vanishes
and classically one would not expect any deviation of the beam from the
central axis, a deviation was still observed. This had to be explained in
terms of an intrinsic magnetic moment possessed by the atoms generated by
some spin. The spin of course is possible if the particle has some finite size
structure and it is not just a point-like particle. In the case of silver atoms
this seemed to be a plausible interpretation. However, similar behaviour was
also observed with electrons that today we know to be elementary (i.e., point-
like) particles and any model of a finite size electron generating classically a
spin fails in describing the experimental findings. Therefore, the spin angular
momentum has to be interpreted as in intrinsic property of a particle, like its
mass or its charge, describing its behaviour under rotations centred on the
particle itself, in the absence of any orbital motion.
Since the fundamental commutation relation are valid for the components
114 CHAPTER 4. ANGULAR MOMENTUM

of the total angular momentum vector operator Ĵ (see Eqs. (4.5)), then nec-
essarily the same commutation relations have to be also satisfied separately
by the components of the orbital angular momentum vector operator L̂ and
by the components of the spin Ŝ that is also a vector operator and we can,
therefore, write: h i
Ŝi , Ŝj = i ~ εijk Ŝk . (4.71)

In the case of the orbital angular momentum, we saw that these are consistent
with its definition coming from the classical correspondent quantity and from
the canonical commutation relations (4.52). In the case of spin there is no
such classical analogy but the same commutation relations need to hold since
it is a component of the total angular momentum and, in the case that the
orbital angular momentum vanishes, like in the Stern-Gerlach experiment, it
coincides with it.
The quantum numbers of spin are denoted ms , for the z component Ŝz ,
and its maximum value by s, while the corresponding eigenvalue of the square
of the spin operator, Ŝ 2 , is given by ~2 s (s + 1). Of course all results we ob-
tained for the angular momentum Ĵ also apply to Ŝ, that is a specific case
when orbital angular momentum can be neglected. This is because spin obeys
the same commutation relations and, as we have seen, all results we obtained
for angular momentum were derived just from the fundamental commutation
relations (4.5). In particular, for each value of s, there are 2s + 1 possible
values of ms . The clear difference between orbital angular momentum and
spin is that while in the case of spin both half-integer and integer values are
possible, in the case of orbital angular momentum only integer values are
possible, as we discussed. Moreover, spin does not have a classical counter-
part and its quantum numbers are not related to any kinematic or dynamical
variable describing the motion of the system, they are indeed an example of
internal quantum numbers).
Spin operators act on spin state vectors. They reside in a vector space
Vs that is distinct from the Hilbert space Vo describing the orbital degrees
of freedom of the system, related to its motion and position in space. The
Hilbert space Vo is infinite-dimensional and we can denote the generic state
vector by ψo . The eigenstates can be denoted by | n, l, m i, where l and m are
the quantum numbers of angular momentum and n are some quantum num-
bers describing other dynamical quantities of the system such as energy. The
vector space Vs has dimension 2s + 1 and it is, therefore, finite-dimensional
and we can generically denote a state vector in Vs by | ψs i. The full Hilbert
space of the system will be then given by the direct product (also called tensor
4.6. SPIN 115

product or Kronecker product) of spin and orbital vector spaces and denoted
by
V = Vo ⊗ Vs . (4.72)
If the orbital and spin degrees of freedom evolve independently of each other,
and this happens if the Hamiltonian operator is separable and can be written
as the sum of an orbital contribution Ĥo and a spin contribution Ĥs , explicitly
Ĥ = Ĥs + Ĥo , (4.73)
then consequently a generic state vector | ψ i factorises into
| ψ i = | ψo (t) i ⊗ | ψs (t) i , (4.74)
where the evolution of | ψo i is described by Ĥo and the evolution of | ψs i is
described by Ĥs .

4.6.2 Elementary particles and their spin


Notice that there is another important difference between spin and orbital
angular momentum: while the total orbital angular momentum l can change
during the time evolution, think for example of an initially moving free par-
ticle with vanishing orbital angular momentum that at some point enters a
region with a magnetic field, the total spin of a particle is an intrinsic prop-
erty, like mass and charge, that cannot change dynamically. Here we want
then to give an overview of the spin of known elementary particles.
These are beautifully described by the Standard model of particle physics.
They can be classified in terms of their spin and charge (see Table). We have:
six leptons, three charged leptons (electron e, muon µ and tauon τ ) and three
neutrinos (electron neutrino νe , muon neutrino νµ and tauon neutrino ντ ); six
flavours of quarks (up quark u, down quark d, strange quark s, charm quark
c, top quark t and bottom quark b); four gauge bosons (photon γ, neutral Z
boson Z, charged W boson W ± and gluon g); and, last to be discovered in
2012, the Higgs boson h0 .
The Higgs boson is the only known elementary particle with spin s = 0
and is, therefore, a scalar since it is invariant under rotations. All quarks
and leptons have spin s = 1/2. Finally, the gauge boson are vector bosons
and have spin s = 1. There are no known elementary particles with higher
spin.4
4
The graviton would be the gauge boson mediating the gravity force in quantum gravity
theories and has spin s = 2 but of course this has yet to be discovered.
116 CHAPTER 4. ANGULAR MOMENTUM

4.6.3 Matrix representations of spin 1/2 particles


In the specific case s = 1/2, the Hilbert space V is the direct product of an
infinite-dimensional space V0 and a two-dimensional space Vs . The generic
state | ψ i of a spin 1/2 particle, like the electron, is then represented in the
~r, Ŝz basis by a two-component wave function called a spinor:
     
ψ+ (~r) 1 0
| ψ i → ψ(~r) = = ψ+ (~r) + ψ− (~r) . (4.75)
ψ− (~r) 0 1
As discussed in general in 4.6.1, the state vector can be factorised into an
orbital part and into a spinorial part and, accordingly, this can be easily done
also for the generic spinor ψ(~r) in Eq. (4.75), writing
ψ(~r) = ψo (~r) χ(~r) , (4.76)
p
where ψo (~r) = |ψ+ (~r)|2 + |ψ− (~r)|2 ei α and
 + 
ψs (~r)
ψs (~r) = , (4.77)
ψs− (~r)
where ψs+ = ψ+ /ψo and ψs− = ψ− /ψo , in a way that |ψs+ (~r)|2 + |ψs− (~r)|2 = 1.
The phase α is arbitrary and can be conveniently chosen. As we discussed in
4.6.1, if the Hamiltonian operator is separable, then ψo and ψs time evolve
independently of each other. A spin operator will only act on the spin part
ψs leaving untouched the orbital part ψo .
The spin operator squared Ŝ 2 is represented by the simple 2 × 2 matrix:
 
2 2 3 2 1 0
Ŝ → S = ~ , (4.78)
4 0 1

confirming, also in this special case, the general result that Jˆ2 is propor-
tional to the identity operator (see Eq. (4.37)). Therefore, its action on a
generic state vector | ψ i is simply given by Ŝ 2 | ψ i = (3/4)~2 | ψ i and the
wave function transforms accordingly:
  +   + 
2 3 2 1 0 ψs 3 2 ψs 3
h ~r |Ŝ | ψ i = ~ ψo (~r) − = ~ ψo (~r) − = ~2 ψ(~r) .
4 0 1 ψs 4 ψs 4
(4.79)
One can also easily find the matrix representations for the spin com-
ponents that can be conveniently expressed in terms of the Pauli matrices
~σ = (σx , σy , σz ) as
~ = ~ ~σ ,
S (4.80)
2
4.6. SPIN 117

where:
     
0 1 0 −i 1 0
σx = , σy = , σz = . (4.81)
1 0 i 0 0 −1

Since the matrix S 2 ∝ I, the commutation relations [S 2 , Sk ] = 0 are clearly


respected. In addition one can check that the three matrices Sk also satisfy
the fundamental commutation relations (4.71). The Pauli matrices obey
similar commutation relations

[σi , σj ] = 2 i εijk σk . (4.82)

However, it is easy to check that the Pauli matrices also satisfy the anti-
commutation relations

{σi , σj } ≡ σi σj + σj σi = 0 , (4.83)

so that the commutation relations (4.82) can be also recast as:

σi σj = i εijk σk . (4.84)

We can now solve the eigenvalue problems for the spin components. For all
three spin components the eigenvalues, as expected, are the same: +~/2 and
−~/2. One then finds the following eigenvectors:

• For Ŝz , one has very simply:


   
1 1 1 1 1 0
| , i ≡ |+i → and | , − i ≡ | − i → . (4.85)
2 2 0 2 2 1

• For Ŝx :
   
1 1 1 1 1 1 1 1
| , i ≡ | +x i → √ and | , − ix ≡ | −x i → √ ;
2 2x 2 1 2 2 2 −1
(4.86)

• Finally, for Ŝy :


   
1 1 1 1 1 1 1 1
| , i ≡ | +y i → √ and | , − i ≡ | −y i → √ .
2 2y 2 i 2 2y 2 −i
(4.87)
118 CHAPTER 4. ANGULAR MOMENTUM

These results can be also generalised to the case of the spin component Ŝn̂
along an arbitrary direction n̂ = (θ, φ). This can be expressed as:

Ŝn̂ = Ŝ · n̂ = nx Ŝx + ny Ŝy + nz Ŝz , (4.88)

where in Cartesian coordinates n̂ = (sin θ cos φ, sin θ sin φ, cos θ). Also in this
case the eigenvalues are ±~/2 (they are invariant under a rotation!). Using
the useful short notation already adopted for the Cartesian components,
1 1
| ±n̂ i ≡ | , ± i , (4.89)
2 2 n̂
we can then write the eigenvalue equation as
~
Ŝn̂ | ±n̂ i = ± | ±n̂ i , (4.90)
2

finding for the eigenvectors the following representations in the Ŝz -basis (see
problemsheets):
 
cos(θ/2)
| +n̂ i → , (4.91)
sin(θ/2) e+iφ
 
sin(θ/2)
| −n̂ i → . (4.92)
− cos(θ/2) e+iφ

These expressions show clearly that a spinor is not single-valued, since per-
forming a 2π rotation around the axis n̂, the spinor changes sign.
Finally, from the general results found for the generic angular momentum
operators and in particular for the matrix elements of the ladder operators of
angular momentum in Eqs. (4.19), one can easily find for the representations
of the spin ladder operators (j = s = 1/2):
   
0 1 0 0
S+ = ~ and S− = ~ . (4.93)
0 0 1 0

4.7 Addition of angular momenta


Let us consider the general problem of adding the angular momenta of two
systems in states described by quantum numbers j1 , m1 and j2 , m2 respec-
tively and with angular momentum vector operators acting on the states
given by Ĵ(1) and Ĵ(2) . These could refer, for example, to the momenta of
4.7. ADDITION OF ANGULAR MOMENTA 119

two distinct particles bound in a system or to the orbital angular momen-


tum and spin of the same particle. Assuming that the dynamics of the two
(1)
individual systems are independent of each other, then any component Jˆi
(2)
will commute with any component Jˆj , i.e.,

(1) (2)
[Jˆi , Jˆj ] = 0 ∀ i, j = 1, 2, 3 . (4.94)

On the other hand, if one takes two components of the same angular momen-
tum, then they satisfy the fundamental commutation relations (4.5) that get
specialised into
(1) (1) (1) (2) (2) (2)
[Jˆi , Jˆj ] = i~ εijk Jˆk , and [Jˆi , Jˆj ] = i~ εijk Jˆk . (4.95)

and, as we have seen, these imply

[(Jˆ(1) )2 , Jˆz(1) ] = [(Jˆ(2) )2 , Jˆz(2) ] = 0 . (4.96)

If we build two particle states given by the direct products | j1 , m1 i⊗| j2 , m2 i


and that we denote more simply by | j1 m1 , j2 m2 i, they form a basis for the
space V1⊗2 = V1 ⊗ V2 , the direct product of the two spaces V1 and V2 with
basis vectors | j1 , m1 i and | j2 , m2 i respectively.
Let us now define the total angular momentum vector operator5

Ĵ = Ĵ(1) + Ĵ(2) . (4.97)

Its components also satisfy the fundamental commutation relations (4.5), i.e.,
(1) (2) (1) (2)
X (1) (2)
[Jˆi + Jˆi , Jˆj + Jˆj ] = i ~ εijk (Jˆk + Jˆk ) , (4.98)
k

since the commutators of components of different individual angular momen-


tum vanish.
We want now to show that it is possible to switch from the basis of
eigenstates | j1 m1 , j2 m2 i to a new basis of eigenstates | j m, j1 j2 i, where j
and m are of course the quantum numbers of the total angular momentum
operator. We will refer to the first basis as the product basis (sometimes also
5
Here the operators Ĵ(1) and Ĵ(2) should be actually meant as Ĵ(1) ⊗ Iˆ(2) and Iˆ(1) ⊗ Ĵ(2) ,
where Ĵ(1) and Ĵ(2) act on one-particle states in the spaces V1 and V2 respectively and
where Iˆ(1) and Iˆ(2) are the respective identity operators in these spaces. However, the two
identity operators are usually implied to simplify the notation.
120 CHAPTER 4. ANGULAR MOMENTUM

called the uncoupled basis) and to the second as the total-j basis (sometimes
also called the coupled basis).
The total-j basis is of course a simultaneous eigenbasis of the four op-
(1)
erators Jˆ2 , Jˆz , (Jˆ(1) )2 , (Jˆ(2) )2 , where essentially Jˆ2 and Jˆz replace Jz and
(2)
Jz . This is a very useful change of basis in order to solve problems where
the Hamiltonian operator depends on terms where the two angular momenta
couple together, such as terms ∝ Ĵ(1) · Ĵ(2) .
Notice that all four operators Jˆ2 , Jˆz , (Jˆ(1) )2 , (Jˆ(2) )2 do commute with each
other and, therefore, they admit a common eigenbasis. We already know
that [Jˆ2 , Jˆz ] = [J (1) )2 , (Jˆ(2) )2 ] = 0. We are then left to show that also
[Jˆ2 , (Jˆ(1) )2 ] = [Jˆ2 , (Jˆ(2) )2 ] = 0. First of all notice that we can write:

Jˆ2 = Ĵ· Ĵ = (Ĵ(1) + Ĵ(2) )·(Ĵ(1) + Ĵ(2) ) = (Jˆ(1) )2 +(Jˆ(2) )2 +2 Ĵ(1) · Ĵ(2) , (4.99)

where in the last terms we used the fact that Ĵ(1) · Ĵ(2) = Ĵ(2) · Ĵ(1) . Notice
now that Jˆ2 does indeed commute with each of the three terms on the right-
hand side and notice also that this expression clearly shows the convenience
of using the total-j basis instead of the product basis when the Hamiltonian
contains terms proportional to Ĵ(1) · Ĵ(2) .
Therefore, the common eigenvectors | j m, j1 j2 i also form a basis for the
states of angular momentum of the combined system. Let us explicitly write
the eigenvalue equation for each operator:

Jˆ2 | j m, j1 j2 i = j(j + 1) ~2 | j m, j1 j2 i , (4.100)


Jˆz | j m, j1 j2 i = m ~ | j m, j1 j2 i ,
(Jˆ(1) )2 | j m, j1 j2 i = j1 (j1 + 1) ~2 | j m, j1 j2 i ,
(Jˆ(2) )2 | j m, j1 j2 i = j2 (j2 + 1) ~2 | j m, j1 j2 i .

If one fixes the values of j1 and j2 , in common with the two bases, this
defines a subspace of V1⊗2 . Since j1 and j2 are fixed, one usually simplifies
the notation writing:

| j, m i ≡ | jm, j1 j2 i , (4.101)
| m1 , m2 i ≡ | j1 m1 , j2 m2 i . (4.102)

It is easy to calculate the dimension of the subspace V1⊗2 counting the number
of distinct eigenvectors | m1 , m2 i. Since m1 and m2 can take (2j1 + 1) and
(2j2 + 1) values respectively, this is clearly given by (2j1 + 1) (2j2 + 1). This
has necessarily to coincide with the number of distinct eigenvectors | j, m i.
4.8. CLEBSCH-GORDAN COEFFICIENTS 121

(1) (2)
First of all since Jˆz = Jˆz + Jˆz , then each value of m has to be simply
related to m1 and m2 by m = m1 +m2 .6 The maximum value of j, denoted by
jmax , is necessarily given by j1 +j2 , the sum of the two maximum values of m1
and m2 . Notice that for j = j1 +j2 one can realise all values of m ranging from
−j to j. Indeed the case m = −j corresponds to m1 = −j1 and m2 = −j2 .
We can then lower j by one unit obtaining states with j = j1 + j2 − 1 and so
on. The minimum value jmin can be found imposing that the total-j basis has
a number of eigenvectors equal to the number of eigenvectors in the product
basis. As we have seen, this is given by (2j1 + 1)(2j2 + 1). Considering
moreover that for each allowed value of j, there are 2j + 1 eigenvectors
corresponding to a different value of m in the range −j ≤ m ≤ j, we can
impose:
jmax
X
(2j + 1) = (2j1 + 1) (2j2 + 1) . (4.103)
jmin

The sum on the left-hand side can be easily evaluated, obtaining


jmax
X
(2j + 1) = jmax (jmax + 1) − jmin (jmin − 1) + jmax − (jmin − 1) , (4.104)
jmin

and since jmax = j1 + j2 , with easy algebraic steps, one finds jmin = |j2 − j1 |.
Therefore, summarising, we have just proved the important result

|j2 − j1 | ≤ j ≤ j1 + j2 . (4.105)

4.8 Clebsch-Gordan coefficients


The eigenstates | j1 m1 , j2 m2 i form a complete basis and, therefore, the total
j-basis eigenstates can be expressed as their linear combination:
X
| j m, j1 j2 i = | j1 m1 , j2 m2 i h j1 m1 , j2 m2 | j m, j1 j2 i , (4.106)
m1 ,m2

6
This can be understood considering that the product basis eigenstates | m1 , m2 i are
also eigenstates of Jˆz with eigenvalue m1 + m2 . Therefore, the subspace for a fixed m,
must also correspond to the subspace with basis given by all eigenstates | m1 , m2 i with
m1 + m2 = m. The simplest case is when m1 = j1 and m2 = j2 since in this case there
is only the direct product basis eigenstate | j1 , j2 i and, therefore, the subspace is one-
dimensional. For this reason, one necessarily has | j, j i = | j1 , j2 i. The same it is true
when m1 = −j1 and m2 = −j2 . In this case one has | j, −j i = | − j1 , −j2 i. We will see
this algebraically when we discuss the Clebsch-Gordan coefficients.
122 CHAPTER 4. ANGULAR MOMENTUM

where in the double sum one has m1 = −j1 , −j1 + 1, . . . , j1 − 1, j1 and m2 =


−j2 , −j2 +1, . . . , j2 −1, j2 with m1 +m2 = m. The coefficients of the expansion
are denoted in an abbreviated way by
h j1 m1 , j2 m2 | j, m i ≡ h j1 m1 , j2 m2 | j m, j1 j2 i , (4.107)
and are called Clebsch-Gordan coefficients. From our results in the previous
section, they must respect the following properties:
• h j1 m1 , j2 m2 | jm i =
6 0 only if |j2 − j1 | ≤ j ≤ j1 + j2 ;
• h j1 m1 , j2 m2 | jm i =
6 0 only if m1 + m2 = m.7
Moreover, one usually imposes the Condon-Shortley phase convention, such
that h j1 j1 , j2 (j − j1 ) | j, j i is real and positive. Finally, one has the relation
h j1 (−m1 ), j2 (−m2 ) | j, (−m) i = (−)j1 +j2 −j h j1 m1 , j2 m2 | j, m i . (4.108)
This relation halves the number of coefficients that need to be calculated,
considerably reducing the task.
If one assembles the Clebsch-Gordan coefficients into a matrix, this is
correctly found to be orthogonal (i.e., unitary and real). This is not coming
with a surprise, since we know, from our general discussion on vector spaces,
that two orthonormal bases have to be related by a unitary transformation.

4.9 Simplest application: two spin 1/2 parti-


cles
Let us determine the Clebsch-Gordan coefficients for the simplest case of two
spin 1/2 particles, so that we have j1 = s1 = 1/2 and j2 = s2 = 1/2.
7
It is easy to see directly that this must be true. By definition of Jˆz , one has Jˆz −
(1) (2)
Jˆz − Jˆz = 0. Therefore, we can write

Jˆz − Jˆz(1) − Jˆz(2) | j, m i = 0 .

If we now take the inner product with h j1 m1 , j2 m2 | and act with Jˆz on the ket and with
(1) (2)
−Jˆz − Jˆz on the bra, we immediately obtain

(m − m1 − m2 ) h j1 m1 , j2 m2 | jm i = 0 ,

showing that for h j1 m1 , j2 m2 | jm i neq0, necessarily m = m1 + m2 . Notice that this is an


algebraic proof that each | j, m i can be a linear combination only of vectors | j1 m1 , j2 m2 i
with m = m1 + m2 .
4.9. SIMPLEST APPLICATION: TWO SPIN 1/2 PARTICLES 123

In the specific case j1 = j2 = 1/2 it is customary to introduce the compact


notation
| sign(m1 ) sign(m2 ) i ≡ | m1 m2 i . (4.109)
The range of values for j that in general is given by Eq. (4.105), now simply
reduces to just two values: j = 0 and j = 1. We have then just four | j, m i
eigenstates:
| 1, 1 i , | 1, 0 i , | 1, −1 i , | 0, 0 i . (4.110)
The first three, with s = 1, are together referred to as spin triplet, while the
s = 0 state is referred to as spin singlet. These have to be expressed in terms
of the (likewise) four direct product basis eigenstates
| + +i,| + −i| − +i| − −i. (4.111)
Let us start from the eigenstate | 1, 1 i. Since there is only one eigenstate
| m1 m2 i with m1 + m2 = m (this is true in general!), in our case one neces-
sarily simply has
|11i = | + +i, (4.112)
so that the Clebsch-Gordan coefficient h + + | 1, 1 i = 1 while all the others
with j = m = 1 vanish. We can then obtain the state | 1, 0 i from | 1, 1 i
(1) (2)
using the lowering operator Jˆ− = Jˆ− + Jˆ− :
(1) (2)
Jˆ− | 1, 1 i = (Jˆ− + Jˆ− ) | + + i . (4.113)

We have now to recall the general result Eq. (4.19) for Jˆ−
Jˆ− | j, m i = [j(j + 1) − m(m − 1)]1/2 ~ | j, m − 1 i , (4.114)
that, applied to our case, gives:

Jˆ− | 1, 1 i = 2 ~ | 1, 0 i . (4.115)
At the same time the right-hand side of Eq. (4.113), using (4.114) with
j = 1/2 and m = 1/2, gives
(Jˆ(1) + Jˆ(2) ) | + + i = Jˆ(1) | + + i + Jˆ(2) | + + i (4.116)
p p
= ~ ( 3/4 + 1/4 | − + i + 3/4 + 1/4 | + − i)
= ~ (| − + i + | + − i) , (4.117)
obtaining
| − +i + | + −i
| 1, 0 i = √ , (4.118)
2
124 CHAPTER 4. ANGULAR MOMENTUM

implying that the two Clebsch-Gordan coefficients are given by


1
h ± ∓ | 1, 0 i = √ . (4.119)
2
The eigenstate | 1, −1 i can again be obtained very easily since one can have
m = −1 only for m1 = m2 = −1/2 and, therefore, very simply one has:
| 1, −1 i = | − − i . (4.120)
It can be verified that applying the lowering operator to both sides of (4.118)
one obtains the same result.
Finally the eigenstate | 0, 0 i can be obtained very simply noticing that
again, as | 1, 0 i, this needs to be a linear combination of | + − i and | − + i
but orthogonal to | 1, 0 i (i.e., such that h 1, 0 | 0, 0 i = 0). It is then very
simple to check that
| + −i − | − +i
| 0, 0 i = √ , (4.121)
2
is indeed orthogonal to | 1, 0 i and so finally the last two Clebsch-Gordan
coefficients are given by
1
h ± ∓ | 0, 0 i = ± √ . (4.122)
2
Notice we could also have chosen the same linear combination multiplied
by a minus sign but one can check that (4.122) is the one respecting the
Condon-Shortley convention.

4.9.1 Entangled states


We have seen that the states | 1, 1 i and | 1, −1 i can be expressed as the direct
product of two ‘one-particle’ states, one describing particle 1 in the vector
space V1 and one describing particle 2 in vector space V2 (see Eqs. (4.112)
and (4.120) respectively). On the other hand the states | 0, 0 i in Eq. (4.121)
and | 1, 0 i in Eq. (4.118) cannot be written as the direct product of two one-
particle states: they are two examples of entangled states, defined in general
in Section 1.21.
In the one-particle spaces the column vector representation of the two
basis vectors is simply given by:
   
1 0
| s1 = 1/2, m1 = 1/2 i → , and | s1 = 1/2, m1 = −1/2 i → .
0 1
(4.123)
4.9. SIMPLEST APPLICATION: TWO SPIN 1/2 PARTICLES 125

In the product space V1⊗2 , the direct product basis vectors will be represented
by four components column vectors:
       
1 0 0 0
 0   1   0   0 
| ++ i →  0  , | +− i →  0  , | −+ i →  1  , | −− i →  0  .
      

0 0 0 1
(4.124)
If we now consider the representations of triplet states and of the singlet, the
total-j basis vectors, we have:
       
1 0 0 0
 0  1  1
  1  1 

 0 .
 
 0  , | 1, 0 i → √2  1
| 1, 1 i →  | 0, 0 i → √  | −1 i →
 , , 1,
 2  −1   0 
0 0 0 1
(4.125)
It is clear that the components of the two states | 1, 0 i and | 0, 0 i cannot be
written as the product of the components of two one particle states, simply
because the first and fourth component vanish and this implies for the first
(1) (2)
component that either a1 = 0 or a1 = 0 and similarly for the second
(1) (2)
component, either a2 = 0 or a2 = 0. However, then this necessarily
implies that also the second and third component of the two particle states
should vanish and one would get the null vectors. This confirms that these
two states are entangled states and cannot be written as the product of two
one-particle states.
126 CHAPTER 4. ANGULAR MOMENTUM
Bibliography

[1] J.J Sakurai, Jim Napolitano, Modern Quantum Mechanics, Cambridge


University Press 2017.

127
128 BIBLIOGRAPHY
Chapter 5

Relativistic Quantum
Mechanics

An important question, that we have eluded so far, is whether QM is consis-


tent with special relativity. Let us consider the Schrödinger equation for the
time-evolution of the quantum state,

i~ | ψ(t) i = Ĥ| ψ(t) i . (5.1)
∂t
By itself this is neither relativistic invariant nor relativistic non-invariant
since until we do not specify the Hamiltonian operator Ĥ, the equation is
not fully specified. Therefore, the problem is whether there is a way to write
an expression for the Hamiltonian operator in a way to obtain a manifestly
relativistic invariant equation, what is also referred to as a covariant equation.
More explicitly, this means an equation whose both right-hand and left-hand
sides transform in the same way under a Lorentz transformation. To this
extent both sides have to be relativistic tensors of the same rank.
We have seen a particular example of Hamiltonian operator in the case of
a one-dimensional non-relativistic harmonic oscillator. More generally, for a
particle moving non-relativistically in a three-dimensional potential, we can
write
p̂2
Ĥ(x̂, p̂) = + V (x̂) . (5.2)
2m
The simplest case is of course to consider V (x̂) = 0, obtaining the Schrödinger
equation for a free particle:
∂ p̂2
i~ | ψ(t) i = | ψ(t) i . (5.3)
∂t 2m

129
130 CHAPTER 5. RELATIVISTIC QUANTUM MECHANICS

If we now project on the x̂-basis, we obtain an equation for the wave function
ψ(x, t) that is clearly relativistically non invariant :

∂ ~2 2
i~ ψ(x, t) = − ∇ ψ(x, t) . (5.4)
∂t 2m
Indeed one can see that time and position are not equally treated, as one
would expect in the case of a covariant equation. Of course this is not a
surprise considering that we started from an Hamiltonian operator corre-
sponding to a non-relativistic Hamiltonian.
There are two different solutions to the problem of getting a relativistic
invariant equation: one leads to the Klein-Gordon equation and one leads to
the Dirac equation. Here we just focus on the problem of getting a relativistic
invariant equation for the wave function in position basis. We will then
comment on the fact that both equations suffer other problems that can be
solved only definitively giving up the idea of a wave function formulation of
QM, simply because it is intrinsically at odd with special relativity and it
can only provide an approximation valid if certain conditions hold.

5.1 The Klein-Gordon equation


We have seen that the classical analogy suggests to build the Hamiltonian
operator starting from the classical Hamiltonian and replacing coordinated
and moments with their corresponding operators. For this reason, non-
relativistically, one writes Ĥ = p̂2 /(2m) in the case of a free particle. Rela-
tivistically, the Hamiltonian is given by the well known relation
p
H(p) = c2 p2 + m2 c4 . (5.5)

It seems straightforward the to write a relativistic Schrödinger equation for


a free particle as


q
i~ | ψ(t) i = c2 p̂2 + m2 c4 | ψ(t) i . (5.6)
∂t
However, this equation is not yet relativistically invariant, since the left-hand
side is linear in time derivative while on the right-hand side we have a square
root. When this is projected in position basis one would obtain a square root
of a second order space derivative plus a constant that is equivalent to have
a series of terms containing space derivatives to all orders. This means that
5.2. THE DIRAC EQUATION 131

the equation would not only be covariant, but it would not even be local, i.e.,
the value of the wave function in one point would depend on the value of the
wave function in an arbitrarily far point: this clearly would violate causality.
In order to circumvent this problem, the solution seems to be to get rid of
the square root and to this extent there are two ways to proceed. The first one
is simply to apply the time derivative on both sides and since this commutes
with the Hamiltonian operator of a free particle that is time-independent,
this is equivalent to square the operators on both sides in Eq. (5.1) obtaining
:
∂2
−~2 2 | ψ(t) i = c2 p̂2 + m2 c4 | ψ(t) i .

(5.7)
∂t
If we now project on the position basis, divide both sides by (~c)2 , bring the
left-hand side on the right-hand side and change the overall sign, we obtain
the Klein-Gordon equation:
"  2 2 #
2
1 ∂ mc
2 2
− ∇2 + ψ(x, t) = 0 . (5.8)
c ∂t ~c
Using natural units (c = ~ = 1) and introducing the D’Alambertian (or box)
operator  ≡ ∂ 2 /∂t2 − ∇2 , it can be more compactly rewritten as
( + m2 ) ψ(x, t) = 0 . (5.9)
Here notice that the wave function is a scalar, i.e. a one-component wave
function. Therefore, the Klein-Gordon equation cannot describe spin 1/2
particles such as electrons, simply because, as we have seen, these have to be
described by spinors, i.e., multicomponent wave functions. There is, however,
a second solution to the problem of getting a relativistic invariant equation
for free particles and this is suitable to describe spin 1/2 particles such as
electrons.

5.2 The Dirac equation


Instead of ‘squaring’ the Schrödinger equation, there is a second route fol-
lowed by Dirac. He insisted on having a linear equation both in time and
space derivatives. The trick is to consider that even if though the Hamilto-
nian Eq. (5.5) contains a square root, switching to the Hamiltonian operator
does not necessarily implies the presence of a square root and one can try to
write this in a linear way as
Ĥ(p̂) = α̂ · p̂ + β̂m , (5.10)
132 CHAPTER 5. RELATIVISTIC QUANTUM MECHANICS

where the four quantities α̂ ≡ (α̂1 , α̂2 , α̂3 ) and β̂ should be meant as operators
acting in some spinorial space with a dimension yet to be determined. A
priori one could think this can be just one, then the α̂i ’s and β̂ would be just
scalars and one would go back to the case of a scalar equation. However, as
we will see in a moment, this will be not the case. We now impose that when
we square both sides of (5.10), the left-hand side still respect the classical
(relativistic) analogue equation:

Ĥ 2 (p̂) = p̂2 + m2 . (5.11)

In this way one has to impose that α̂ and β̂ have to be such to satisfy the
condition
p̂2 + m2 = (α̂ · p̂ + β̂m)2 . (5.12)
This condition has no solution if α̂ and β̂ were just scalars. They necessarily
have to be operators satisfying the following anti-commutation conditions:

α̂12 = α̂22 = α̂32 = β̂ 2 = Iˆ , (5.13)

{α̂i , α̂j } = 0 (i, j = 1, 2, 3) , (5.14)


n o
α̂i , β̂ = 0 (i = 1, 2, 3) . (5.15)
These anticommutation relations can be satisfied only if the minimum di-
mension of the spinorial space is four. Since this is the case corresponding to
the physically interesting case of spin 1/2 particles (why four and not two?
We will answer soon this question), we can assume this minimum value so
that the three α̂i and β̂ can be represented by 4 × 4 matrices. Therefore, we
have now that the Schrödinger equation becomes:

i | ψ(t) i = (α̂ · p̂ + β̂ m)| ψ(t) i , (5.16)
∂t
where now | ψ(t) i is a spinor with four components, i.e., a four-spinor. If
we now project on the position basis and we choose some spinorial basis we
obtain  
∂ ~
i + i α · ∇ − β m ψ(x, t) = 0 , (5.17)
∂t
where now ψ(x, t) is a four component wave function and the αi ’s and β are
4 × 4 matrices in some representation. We can now formally define the four
γ matrices
γ 0 = β , γ i = β αi , (5.18)
5.2. THE DIRAC EQUATION 133

and multiplying by β we obtain finally the Dirac equation (µ = 0, 1, 2, 3)


(iγ µ ∂µ − m) ψ(x) = 0 , (5.19)
where x = (x0 , x) is the four-coordinate vector with x0 ≡ ct and ∂µ ≡ ∂/∂xµ .
It is also customary1 to introduce the symbol ∂/ ≡ γ µ ∂µ in a way that the
Dirac equation is written even in a more compact way as:

i∂/ − m ψ(x) = 0 . (5.20)
Notice that it looks like as if the equation is manifestly covariant but actually
the γ µ do not transform as a covariant vector under a Lorentz transforma-
tion, actually they do not change at all. They only change under a basis
transformation in the spinorial space. It is actually the 4-spinor ψ(x) that
has to transform in such a way that the Dirac equation does not change
under a Lorentz transformation.2
The Dirac equation allows to describe spin 1/2 particles and the spin, as
it should be appreciated, arises only as a consequence of the condition (5.16),
imposing that states must live in a discrete spinorial space of dimension four.
Of course there is question: what are the two additional components, com-
pared to the Pauli bi-spinors we discussed, physically describing? In a wave
function formulation, as we discussed, these would describe negative energy
states. Despite this can sound weird, it would be possible to make sense of
such solutions and even to make calculations to some extent. The problem is
that a wave function formulation cannot properly describe processes where
the number of particles change. This forces a further advance in the theoret-
ical description of elementary particles called second quantization where the
wave function becomes an operator itself depending on space and time, what
is called a quantum field. Strictly speaking the wave function of a relativistic
particle cannot be even built and leads to unphysical results as we noticed in
the introduction. The Dirac equation then has to be interpreted as an equa-
tion for spin 1/2 quantum fields (describing fermions) rather then for the
wave function and in this way it becomes finally relativistically consistent.
A similar second quantization procedure can be also applied to the Klein-
Gordon equation and this also cures all the issues of a wave function formu-
lation of relativistic quantum mechanics. In this case of course the Klein-
Gordon equation is an equation for spin 0 quantum fields (describing scalar
1
It is a notation introduced by R. Feynman.
2
The way how the Dirac spinor ψ(x) transforms under Lorentz transformations goes
beyond the scope of this module. It is usually shown in a first year post-graduate module
on quantum field theories.
134 CHAPTER 5. RELATIVISTIC QUANTUM MECHANICS

bosons such as the Higgs boson). In conclusion a consistent relativistic quan-


tum mechanical theory has necessarily to be formulated as a quantum field
theory. The Standard Model is a quantum field theory and it successfully
describes all microscopic processes we observe in nature (except neutrino os-
cillations as we have seen!). Its foundations, however, are firmly relying on
the QM postulates we discussed. The only current limitation of quantum
field theories is the inability to describe gravity. A quantum theory grav-
ity might perhaps require an even more drastic breakthrough than quantum
field theories but, as we noticed at the beginning of the course, even a theory
such as string theory, a leading candidate for a quantum gravity theory, still
obeys the postulates of QM.
Chapter 6

The second quantum revolution

In this Chapter1 we discuss four topics that all contributed, and still con-
tribute today, to shape what can be called the second quantum revolution.2
We start first discussing Bell inequalities applied to the entangled singlet
state we obtained in the last chapter in the case of two spin 1/2 particles.
We will see how these provided the theoretical tool to definitely establish QM
description as a complete theory without the need of introducing so-called
hidden variables. We discuss then how the study of entangled states has
inspired important applications. We first discuss quantum cryptography and
then quantum computing. Finally, we discuss the important idea of quantum
decoherence as a way to understand why macroscopic systems do not exhibit
quantum behaviour such as quantum superposition and this will provide a
solution to the famous Schrödinger’s cat paradox.

6.1 Bell inequalities


In this section we discuss some implications deriving from the entangled
states that we found as eigenstates of total angular momentum of the system
made by two spin 1/2 particles. We will see these results seem apparently
to clash with the postulates of special relativity and thus suggesting that
QM is an incomplete theory, as originally suggested in 1935 by the famous
Einstein-Podolsky-Rosen paradox [2]. This ambiguous situation persisted
1
All content of this chapter is non-examinable in the final exam paper, only in PS 10
(mini-dissertation).
2
For an introduction and an overview of the topics discussed in this Chapter, see
Introduction to the book Speakable and Unspeakable in Quantum Mechanics by J.S. Bell
[1] by Alain Aspect titled John Bell and the second quantum revolution.

135
136 CHAPTER 6. THE SECOND QUANTUM REVOLUTION

until 1964, when J.S. Bell derived an inequality (now bearing his name)
constraining probabilities of observing certain spin correlations that allows
to test whether QM should be regarded as an incomplete theory or not. In
the first case one should introduce so-called hidden variables able to reconcile
the experimental findings with Einstein’s locality principle and such that the
physical reality of the two components of the system is maintained even in
entangled states. With the advent of laser technology, experiments testing
the inequality could gradually be performed with increasing precision and
accuracy. All experimental results have unambiguously ruled out alternative
models of QM, respecting locality and physical reality hypotheses.

6.1.1 Spin correlation measurements


Let us consider the simple system of two spin 1/2 particles, for example
electrons, studied in the previous chapter and in particular let us consider
more closely the physical implications when the system is in the singlet state
| 0, 0 i, where the subscript highlights that the second quantum number is
the z component of the total spin. In terms of the product basis eigenstates
we have seen that one has
1
| 0, 0 i = √ (| + − i + | − + i) , (6.1)
2

where we recall that the first sign refers to particle 1 and the second sign
to particle 2. The two product eigenstates | + − i and | − + i would have
a clear classical interpretation, since each particle would have a well defined
spin. However, their superposition is now telling us that when we describe
the system as a whole we need to abandon the idea that each particle has
a definite spin, we simply cannot know which particle has spin up and spin
down in such a state unless we make a measurement. This would have as an
effect that the singlet state would collapse in one of the two product states,
either | + − i or | − + i, with equal probability. In this case, if one measures
the first particle in the spin-up state, then automatically any subsequent
measurement of the state of the second particle would necessarily give spin-
down as outcome and vice versa. If the system is in one of the two product
states, the two particles gain their distinguishability. However, in the singlet
state they would be entangled together and it makes sense just to talk of
state of the whole system. For example, the two electrons in the Helium
atom can be in the singlet state and be indistinguishable. At first sight this
6.1. BELL INEQUALITIES 137

situation seems easy to accept and that does not lead to any real paradox or
inconsistency.

However, the existence of such entangled states have given risen to a very
intense debate that ultimately stimulated the developments of new quantum
technologies based on their properties. The point is that the entanglement
and the consequent correlation in the measurement of the two spins persists
even if the two particles are produced by the decay of a parent particle in the
singlet state, flying apart and becoming well separated ceasing to interact.

This situation is realised in the decay of the η meson into a muon-


antimuon pair: η → µ− + µ+ . However, since this decay has a very small
branching ratio, it cannot be used to realise an experimental test of spin cor-
relation. A more feasible experiment can be realised in elastic proton-proton
scatterings where the two protons after collision have necessarily to be emit-
ted in a singlet state | 0, 0 i. The correlation between the measurements of
their spin along the same axis has then to be observed even after they get
separated by a macroscopic distance. Suppose that at two opposite sides of
the collision point, along the emission line and at some macroscopic distance,
there are two observers, Alice and Bob, disposing of two detectors able to
detect the spin of the two protons, moving in opposite directions, along differ-
ent axes, that for definiteness we can assume to be Sz and Sx . Suppose that
Alice measures particle 1 with spin up along one of the two axes, then she can
predict that necessarily Bob would measure particle 2 in the spin-down state
along the same axis. On the other hand, if Alice makes no measurement,
then Bob has equal probability, 50-50, to measure particle 2 either with spin
down or spin up. A situation like this can be also reproduced in a classical
analogy, certainly not requiring QM, where Alice and Bob were delivered,
from two messengers coming from some point in between, two sealed (non
transparent) envelopes, one containing a black paper and the second a white
paper. They know that the two letters contain both just one paper but with
different colour so that if, for example, Alice first opens the letter with the
white paper, she will instantaneously know that Bob will necessarily find a
black paper inside the envelope. However, this analogy is too simplistic and
cannot completely account for the complexity of the physical implications of
the entanglement of the singlet state.

Let us indeed now suppose that Alice can either measure either Sz or Sx
of the first particle, while Bob always measures Sx of the second particle. Let
us remind that from Eq. (4.86) one has that the Ŝx one-particle eigenstates
138 CHAPTER 6. THE SECOND QUANTUM REVOLUTION

are given in terms of the Ŝz one-particle eigenstates by:


1
| ±x i = √ (| + i ± | − i) . (6.2)
2
This implies that independently whether one measures first Sx either with
spin up or with spin down, a second measurement of Sz has always 50-50
chance to measure Sz either with spin up or with spin down. One can also
invert these relations and write the Ŝz one-particle eigenstates in terms of
the Ŝx ones, obtaining clearly the same relation where just one swaps x with
z:
1
| ± i = √ (| +x i ± | −x i) . (6.3)
2
The singlet state can also be expressed in terms of the product states of Ŝx
one-particle states | ±x ∓x i obtaining a result, not surprisingly because of
the singlet state isotropy, that is exactly the same expression as in the case
of Ŝz measurement, simply with the replacement z → x:
1
| 0, 0x i = √ (| +x −x i + | −x +x i) . (6.4)
2
Suppose Alice first measures Sz of particle 1 to be positive. Bob has then 50-
50 chance for measuring Sx either positive or negative. This means that even
though Sz would be known to be negative with certainty, Sx would remain
undetermined. On the other hand, if Alice measures Sx of particle 1 to be
positive, then Bob will necessarily measure Sx of particle 2 to be negative.
Finally, if Alice makes no measurement, then Bob will have 50-50 chance to
measure Sx either positive or negative.
If also Bob can measure either Sz or Sx , then there are twelve possible
combinations of measurements and results that are summarised in the Table.

The QM mechanics interpretation of these results is very simple, once


its postulates are accepted: a measurement of Sz of particle 1 induces the
collapse of the singlet state either in | + − i or in | − + i. This means that,
independently of how distant are the two particles at the time of one of the
first measurement, this has actually to be regarded as a measurement made
on the system as a whole.

6.1.2 EPR paradox and Bell’s inequality


The spin correlation measurements are the simplest experimental set up that
can test the physical implications of entangled states. Originally, in 1935,
6.1. BELL INEQUALITIES 139

Table 6.1: The twelve different combinations of measurements and results


for spin-correlation measurements of Sz and Sx .

Spin measurement Alice’s Spin component Bob’s


by Alice result (sign) measured by Bob result (sign)
z + z -
z - z +
x + x -
x - x +
z + x +
z + x -
z - x +
z - x -
x + z +
x + z -
x - z +
x - z -

A. Einstein, B. Podolsky and N. Rosen proposed an experimental setup deal-


ing with x and p but with similar conclusion [2]: in QM one has to accept
that there are correlations between the two entangled systems that violate
the locality hypothesis, formulated by J. Bell, for which there seems to be
an instantaneous influence, i.e. travelling faster than light, and thus not
obeying the principle that causal signals cannot travel faster than light. The
consequence of this observation is that either QM cannot be considered as a
complete theory or that quantum correlations cannot be regarded as causal
signals. The debate on which one of these two options was the correct one
was for long time relegated to the realm of metaphysics rather than physics,
until it was pointed out by J.S. Bell that actually alternative theories to QM
respecting the locality hypothesis predict testable inequalities among the ob-
servables involved in a specific experiment and that these are violated by QM
predictions.
At the time when J.S. Bell proposed his theorem, in 1964, there was no
experiment that could perform a quantitative test. In 1969 a version of Bell’s
inequality was adapted to a real experiment that, however, was lacking suf-
ficient accuracy. In the seventies pioneering experiments with light photon
pairs started to be performed but only in the early eighties a new genera-
140 CHAPTER 6. THE SECOND QUANTUM REVOLUTION

tion of experiments obtained clear results violating Bell’s inequalities and,


therefore, in agreement with QM. A third generation of experiments, since
the beginning of the 1990s, confirmed the QM picture: a pair of entangled
photons must be regarded as a single inseparable system described by a state
that cannot be factorised into single-photon states [4].
Here we want to discuss that kind of Bell inequality that one can derive
within our experiment on spin correlation measurements in the case of two
spin 1/2 particles such as protons, as we mentioned.3 The Bell inequality
is derived within a simple model conceived by E.P. Wigner. First of all
Wigner’s model respects locality hypothesis, and also the physical reality
hypothesis for which each particles carries intrinsic values of spin along any
direction independently of the result of the measurement performed on the
other particle. More specifically, the model assumes that, given a single
two particle system, if Sz of one of the two particles is measured, then only
one well determined sign for the spin can be found as outcome, let us say
the positive, while if Sx is measured, then the opposite sign is obtained, in
our example let us say the negative sign. This does not mean that the two
spins can be measured simultaneously, but only that, after deciding whether
to measure Sz or Sx , a well defined outcome is obtained.4 The quantum
mechanical predictions are not valid for a single particle but only for a system
containing a huge number of particles with equal number of particles with
one sign or the other for both spins (Sz or Sx ). For such a system the model
reproduces the same predictions of QM when a measurement on a single
particle is performed.
Let us now see how the model can even account for the spin-correlation
measurements made on two-particle systems in the singlet state, i.e., with
vanishing total and z component angular momentum. The results can be
explained if the two particles, each characterised by one value for Sz and one
value for Sx , are produced with equal (25%) probability in one of these four
two-particle states (we do not use the bra-ket notation since these are not
QM states):
• Particle 1 with (Sz , Sx ) = (~/2) (+1, −1) and particle 2 with (Sz , Sx ) =
3
In this form the inequality was found in [5].
4
In our analogy with the envelopes it is like if now together with a black or white paper
there is also a second paper that can be either blue or red. If one decides to measure
the colour of one paper, then the colour of the second paper cannot be simultaneously
measured. For example, one can think that the unmeasured paper gets automatically de-
stroyed after the first measurement and the information on the second spin gets irreversibly
lost.
6.1. BELL INEQUALITIES 141

(~/2) (−1, +1);

• Particle 1 with (Sz , Sx ) = (~/2) (+1, +1) and particle 2 with (Sz , Sx ) =
(~/2) (−1, −1);

• Particle 1 with (Sz , Sx ) = (~/2) (−1, +1) and particle 2 with (Sz , Sx ) =
(~/2) (+1, −1);

• Particle 1 with (Sz , Sx ) = (~/2) (−1, −1) and particle 2 with (Sz , Sx ) =
(~/2) (+1, +1) .

One can notice that for all four states, the measurement of one of the two
spins in particle two would give as outcome an opposite value, as in the
quantum-mechanical case, since this is enforced by angular momentum con-
servation. In this model, if Bob decides to measure Sz of particle 2, he will
get a result that is independent of whether Alice decides to measure Sx or
Sz . In this way the locality principle is incorporated in the model: Alice
and Bob results are pre-determined, independently of what the other has
measured or not measured, prior to her/his measurement. The model also
respects physical reality: each particle has an intrinsic well defined value
of one of the two spins independently whether a measurement is performed
or not and independently of the spin of the other particle. However, at the
same time, the statistical results of QM are reproduced by this model, simply
since Alice and Bob perform many measurements and there is a well deviced
statistical distribution, in this case a homogeneous one, of how the different
four configurations are produced at the source. Notice here we do have an
ensemble but it is not made of many copies of the same system, rather it is
an ensemble containing copies of four different two-particle systems that are
produced, one after the other, at the source.
This model seems really to establish an intrinsic ambiguity in the inter-
pretation of the results and it seems to confirm that it is indeed possible
to conceive a model respecting locality and physical reality reproducing the
same QM spin correlation measurement results. However, if one considers
more complicated situations, Wigner’s model predict different outcomes from
QM and one can perform a crucial test to unambiguously understand which
model is correct. As we have seen, one can consider spin projection along
142 CHAPTER 6. THE SECOND QUANTUM REVOLUTION

any arbitrary direction (see Eq. (4.88)). Let us consider then three arbi-
trary directions n̂1 , n̂2 and n̂3 . In QM we can perform measurements along
these three directions, in Wigner’s model we assign to each particle a value
Sni = ~/2 or −~/2 for each direction n̂i . Clearly we have this time eight
possible different combinations of values (Sn1 , Sn2 , Sn3 ) let us say for particle
1. Again we can consider a singlet state where particle 2 will have values
(−Sn1 , −Sn2 , −Sn3 ), i.e., opposite to those of particle 1. Again, at the pro-
duction the two-particle systems are produced in the singlet state in one of
these eight possible combinations with a certain distribution of probabilities.
Despite the arbitrariness in these distributions it is possible to derive the
following Bell’s inequality for the probabilities P (Sni , Snj ) with i, j = 1, 2, 3
and i 6= j that Alice measures Sni and Bob measures Snj :
     
~ ~ ~ ~ ~ ~
P Sn1 = , Sn2 = ≤ P Sn1 = , Sn3 = + P Sn3 = , Sn2 = .
2 2 2 2 2 2
(6.5)
This is the prediction from Wigner’s model respecting the locality and phys-
ical reality hypotheses.
On the other hand in QM it is possible to derive the following equalities:
   
~ ~ 1 2 θij
P Sni = , Snj = = sin , (6.6)
2 2 2 2

where θij is the angle between n̂i and n̂j , i.e., such that n̂i · n̂j = cos θij . If
we now plug Eq. (6.6) in Bell’s inequality (6.5), the latter specialises into:
     
2 θ12 2 θ13 2 θ32
sin ≤ sin + sin . (6.7)
2 2 2

However, it is simple to see that this inequality cannot be satisfied for all
possible directions n̂1 , n̂2 and n̂3 , implying that there are choices of the three
angles θ12 , θ13 and θ32 for which the inequality is clearly not holding. For
example, taking θ12 = 2 θ and θ13 = θ32 = θ, the inequality is violated
for 0 < θ < π/2.5 Therefore, the QM predictions are not compatible with
Bell’s inequality. This shows that there is an intrinsic diffrence between
QM and any model one can conceive where locality and physical reality
hypotheses are satisfied. This however does not mean that QM violates
5
For example, choosing θ = π/4, one obtains from Eq. (6.7) the absurd result 0.5 ≤
0.292.
6.2. QUANTUM COMPUTING 143

causality and, for example, one cannot use spin correlation measurements to
transmit information between two points faster than the speed of light.
Alice, though after the measurement of Sz knows what would Bob would
obtain as outcome if he also measures Sz , cannot pre-determine, before her
measurement, what Bob will obtain in a way to send him a signal. Bob, upon
repeated measurements of Sz , will obtain a random sequence of positive and
negative values, without even knowing whether Alice has measured or not Sz
prior to his measurements. Of course they can collect the data, reconvene,
compare each other findings and verify that there is indeed a strong corre-
lation. The same conclusion would hold even if Alice, during the series of
measurements, changes her mind and start to measure Sz after having agreed
with Bob that both would have measured Sx : there would be no way how
Bob could understand that Alice has violated their agreement, though now
the two series of measurements would become uncorrelated since the time
Alice decides to measure Sz instead of Sx . In conclusion, quantum entan-
glement, cannot be used to use a kind of super-luminal transmitter, as in
science-fiction novels.

6.2 Quantum computing


The unit for measuring information is called binary digit, or more briefly,
bit. This is the information associated with a binary combination, like a
padlock with just one wheel with only two digits. It can be described by a
variable that can take just two values, usually chosen as 0 and 1.6 With two
bits one can of course form four possible combinations, each can be made
corresponding to a number: 00 → 0, 01 → 1, 10 → 2, 11 → 3. Of course
with a string of n bits, that would correspond to a binary number, one can
store 2n integer numbers, for example from 0 to 2n − 1.7 We have seen that
spin 1/2 particles also can have just two values of their spin component along
a certain direction: up or down. In the case of a quantum state described
by a quantum number that can take only two values, like ms for spin 1/2
particles, we can use the Dirac notation and denote the two (orthonormal)

⁶The bit can be mathematically regarded as the generic element of the finite field
GF(2).
⁷A string {ni} of n bits, with ni = 0, 1 and i = 0, . . . , n − 1, corresponds to an
integer number in decimal notation simply given by N = Σ_{i=0}^{n−1} ni 2^i. For
example: 111 → 7, or 1111 → 15. You can find a converter on the web.

As we know, however, in the case of quantum states we can also form new
quantum states by taking superpositions of them. The set of all such
superpositions forms a two-dimensional vector space, where | 0 ⟩ and | 1 ⟩
are the basis vectors; they are usually referred to as the computational basis.
Since the physical states correspond to rays and not to a single state
vector, we can always choose to normalise the states to unity. A quantum
bit or qubit is then defined as the generic state vector

| ψ ⟩ = α | 0 ⟩ + β | 1 ⟩ ,   (6.8)

with |α|² + |β|² = 1.
In the same way as we can combine more than one bit, we can also combine
more than one qubit by taking their direct (or tensor) product. A set of n
qubits forms a quantum register of size n and thus requires a Hilbert space
of dimension 2^n. Let us for example consider a register of size 2. The
basis vectors (the product basis) can be denoted by

| 0 ⟩ ≡ | 00 ⟩ , | 1 ⟩ ≡ | 01 ⟩ , | 2 ⟩ ≡ | 10 ⟩ , | 3 ⟩ ≡ | 11 ⟩ .   (6.9)
However, as we know, we also have entangled states that cannot be written
as the direct product of two single qubit states:
(1/√2) (| 1 ⟩ ± | 2 ⟩) ≡ (1/√2) (| 01 ⟩ ± | 10 ⟩) .   (6.10)

If both qubits are in the superposition state (| 0 ⟩ + | 1 ⟩)/√2, one obtains
the two-qubit state

(1/2) (| 0 ⟩ + | 1 ⟩) ⊗ (| 0 ⟩ + | 1 ⟩) = (1/2) (| 0 ⟩ + | 1 ⟩ + | 2 ⟩ + | 3 ⟩) ,   (6.11)
which is a superposition of all four state vectors.
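
These manipulations are straightforward to reproduce numerically. Below is a
small illustrative Python sketch (our own, using plain numpy and the basis
ordering of Eq. (6.9)) that builds the product state of Eq. (6.11) and checks
that the entangled state (| 01 ⟩ − | 10 ⟩)/√2 of Eq. (6.10) cannot be
factorised, by computing the rank (the Schmidt rank) of its 2×2 coefficient
matrix:

    import numpy as np

    ket0 = np.array([1.0, 0.0])               # | 0 >
    ket1 = np.array([0.0, 1.0])               # | 1 >
    plus = (ket0 + ket1) / np.sqrt(2)         # (| 0 > + | 1 >)/sqrt(2)

    # Product state of Eq. (6.11): the tensor product is np.kron,
    # with the basis ordering | 00 >, | 01 >, | 10 >, | 11 > of Eq. (6.9).
    product = np.kron(plus, plus)
    print(product)                            # [0.5 0.5 0.5 0.5]

    # Entangled state of Eq. (6.10): (| 01 > - | 10 >)/sqrt(2).
    singlet = (np.kron(ket0, ket1) - np.kron(ket1, ket0)) / np.sqrt(2)

    # A two-qubit state factorises iff the 2x2 matrix of its coefficients
    # has rank 1, i.e. a single non-vanishing Schmidt coefficient.
    def schmidt_rank(state):
        return np.linalg.matrix_rank(state.reshape(2, 2))

    print(schmidt_rank(product))              # 1 -> direct product of single-qubit states
    print(schmidt_rank(singlet))              # 2 -> entangled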
Of course one can also consider a quantum register of size 3 and in this
case one obtains a vector space of dimension 8 with product basis vectors:
| 0 ⟩ ≡ | 000 ⟩ , | 1 ⟩ ≡ | 001 ⟩ , | 2 ⟩ ≡ | 010 ⟩ , . . . , | 7 ⟩ ≡ | 111 ⟩ .   (6.12)
⁸Not to be confused with the null vector!
⁹So far we have used a notation such as | ω1 ⟩ and | ω2 ⟩, where ω1 and ω2 are the
eigenvalues of some physical observable. Here we adopt the typical notation used in
information theory, but the physical realisation of these two states can correspond to
any of the physical observables we discussed, such as the spin of a spin 1/2 particle.
The specific physical realisation is not important here; we are interested in the general
way in which information can be stored and manipulated through these states.

Qubits are the building blocks of quantum computing. Quantum computers can
carry out certain calculations much faster than classical computers. This is
because, while a classical computer can execute only one path at a time, i.e.
a sequential operation on each bit of a string of bits such as 01000 . . . 001,
a quantum computer exploits the ability of quantum states to exist in
superpositions of classical bit strings, executing multiple computational
paths at once: this is called quantum parallelism. Operations on classical
bits are realised with logic gates, and operations on qubits are realised
with quantum gates [8].
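
As a concrete (and standard) example of a quantum gate, the sketch below
applies a Hadamard gate to each qubit of a register initialised in
| 0 . . . 0 ⟩; a single application produces an equal superposition of all
2^n computational basis states, which is the starting point of quantum
parallelism. The code is an illustration of ours, not a prescription from
the text:

    import numpy as np

    H = np.array([[1.0,  1.0],
                  [1.0, -1.0]]) / np.sqrt(2)  # Hadamard gate on a single qubit

    n = 3                                     # register of size 3: dimension 2^3 = 8
    state = np.zeros(2 ** n)
    state[0] = 1.0                            # | 00...0 >

    # Apply H to every qubit: the n-fold tensor (Kronecker) product of H.
    Hn = np.array([[1.0]])
    for _ in range(n):
        Hn = np.kron(Hn, H)
    state = Hn @ state

    print(state)   # all 8 amplitudes equal to 1/sqrt(8): superposition of | 0 > ... | 7 >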
The goal of demonstrating that certain operations can indeed be carried out
much faster on a quantum computer is called quantum supremacy. However, there
are many obstacles and limitations to quantum computing; the toughest is
probably quantum decoherence, i.e., the fact that the interaction of
entangled qubits with the external environment makes them collapse into
factorised states, equivalent to classical sequences of bits. On 23 October
2019, Google announced that its Sycamore processor, with 53 qubits, had
achieved quantum supremacy, performing in 200 seconds a task that a classical
computer would have taken 10,000 years to complete [9]. Although IBM, which
possessed (at that time) the fastest classical supercomputer, claimed that
with a suitable algorithm this would have taken only 2.5 days, quantum
supremacy now seems to be a reality.

6.3 Quantum cryptography


Cryptography is the science (and art!) of encoding and/or transmitting a
secret message between two parties, again referred to as Alice and Bob,
without it being read or understood by a third party (the eavesdropper, to
whom we refer as Eve) [6]. It is part of the broader field of cryptology,
which also includes cryptanalysis, the science of code breaking. Classical
cryptography aims at guaranteeing the secure transmission of encoded secret
messages on a public channel. The development of ever more refined encoding
methods, involving sophisticated algorithms, is of course motivated by the
parallel evolution of equally clever methods of code breaking.
The security of a transmission in classical cryptography can only rely on the
hypothesis that any eavesdropper trying to break the code has neither more
advanced mathematics nor more powerful computers than the sender and the
receiver.
In classical cryptography, encryption is achieved by combining a message with
some additional information, known as the key, producing a cryptogram.
The cryptosystem is secure when it is impossible to unlock the cryptogram
without the key. In practice this demand is in most cases softened, so that
the system is just extremely difficult to crack; in this case it is important
that the message cannot be unlocked for as long as the information it
contains is valuable. Cryptosystems can be classified into two main
categories:

• Asymmetrical (or public-key) cryptosystems;

• Symmetrical (or secret-key) cryptosystems.

Asymmetrical cryptosystems. In the first category one has only classical
cryptosystems. They involve different keys for encryption and decryption and
are commonly known as public-key cryptosystems. The first implementation is
the so-called RSA, named after the initials of its authors [7]. It was
developed in 1978 but is still widely used.¹⁰ Public-key cryptosystems are
convenient and thus have become very popular over the last 20 years; the
security of the internet, for example, is partially based on such systems.
However, the existence of secure asymmetric cryptosystems has never been
proven. This is clearly an intolerable threat to these cryptosystems, posing
social and economic risks; to limit them there is no alternative but to turn
to symmetrical cryptosystems, where quantum cryptography can play an
important role.
¹⁰The British Government claims that public-key cryptography was originally invented
at the Government Communications Headquarters in Cheltenham as early as 1973.

Symmetrical (secret-key) cryptosystems. They require a single key for both
encryption and decryption. The message is locked by Alice together with a
key; Bob in turn uses a copy of this key to unlock the message. The one-time
pad is a symmetrical system first proposed in 1926 by G. Vernam of AT&T.
Alice encrypts the message using a randomly generated key: she simply adds
each bit of the message to the corresponding bit of the key (which must be at
least as long as the message) to obtain a scrambled message. The key is then
sent to Bob, who decrypts the scrambled message by subtracting the key. The
key is used only once, hence the name one-time pad: if it were used more than
once, Eve (the eavesdropper) could record all of the scrambled messages and
reconstruct both the plain texts and the key. The transmission of the key
from Alice to Bob has to be done by some trusted means or through a personal
meeting between Alice and Bob; this procedure can be expensive and complex
and can be regarded as a drawback of the system. Because of the expense and
complexity of distributing long keys, the one-time pad is currently used only
for the most critical applications. Standard cryptosystems of this kind are
DES, IDEA and AES.
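
The bitwise mechanics of the one-time pad is easy to illustrate. Here is a
minimal Python sketch (the message and function names are our own, purely
illustrative): encryption adds each bit of the message to the corresponding
key bit modulo 2 (an XOR), and decryption is the same operation applied
again:

    import secrets

    def xor_bytes(data: bytes, key: bytes) -> bytes:
        # Bitwise addition modulo 2 (XOR) of message and key, byte by byte.
        return bytes(d ^ k for d, k in zip(data, key))

    message = b"MEET AT DAWN"                    # toy plaintext
    key = secrets.token_bytes(len(message))      # random key, as long as the message

    cryptogram = xor_bytes(message, key)         # Alice encrypts
    recovered = xor_bytes(cryptogram, key)       # Bob decrypts with his copy of the key

    print(recovered == message)                  # True
    # Reusing the key would let Eve combine two cryptograms:
    # c1 XOR c2 = m1 XOR m2, which leaks information on both plain texts.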
The critical stage in the one-time pad method is the distribution of the two
copies of the key: it involves secret channels that might be intercepted by
an Eve with technologies more advanced than those of the sender and intended
receiver. Here is where the basic idea of quantum cryptography comes to the
rescue: the security of the transmission rests on the postulates of QM. Since
in QM every measurement perturbs the system in some way, it is possible to
detect Eve's action by identifying the trace it necessarily leaves on the
transmitted key. In the absence of such a trace one can be certain that the
message has passed without having been read by Eve. This simple idea can be
summarised as:
No perturbation ⇒ No measurement ⇒ No eavesdropping
The first protocol of quantum cryptography was proposed in 1984 by
C.H. Bennett and G. Brassard, hence the name BB84. A variation of the BB84
protocol is the EPR protocol (also called the E91 protocol, since it was
proposed by Artur Ekert in 1991). The idea consists in having a quantum
channel carrying the key in the form of entangled particles, sent not from
Alice to Bob but one particle to Alice and one to Bob from a common source,
similarly to what happens in the EPR paradox setup with spin-correlation
measurements, as we discussed in the previous chapter. If there is no
eavesdropping during the transmission, the entangled particles arrive at
Alice and Bob who, performing a series of measurements, will find random but
perfectly correlated results. As long as they have not performed the
measurements, the results are not predictable and the key simply does not
exist. Comparing their measurements, they can test whether the data violate
Bell's inequality, as entangled states do, or whether during the transmission
Eve has intercepted the secret key, making the entangled state collapse into
product states.
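
A toy numerical illustration of the E91 idea, under idealised assumptions of
ours (perfect singlet statistics, no noise and no eavesdropper): Alice and
Bob measure each pair along randomly chosen directions; whenever the
directions coincide, the outcomes are perfectly anti-correlated and yield a
shared key bit.

    import numpy as np

    rng = np.random.default_rng(0)
    angles = [0.0, np.pi / 4, np.pi / 2]    # measurement directions available to both

    alice_key, bob_key = [], []
    for _ in range(10000):
        a, b = rng.choice(angles, size=2)   # independent random choices of direction
        out_a = rng.choice([+1, -1])        # Alice's outcome: +-1 with equal probability
        # Singlet statistics: Bob obtains the SAME sign with probability sin^2((b-a)/2).
        same = rng.random() < np.sin((b - a) / 2) ** 2
        out_b = out_a if same else -out_a
        if a == b:                          # matching directions: perfect anti-correlation
            alice_key.append(out_a)
            bob_key.append(-out_b)          # Bob flips his outcome to recover Alice's bit

    print(np.mean(np.array(alice_key) == np.array(bob_key)))   # 1.0: identical keys

An eavesdropper measuring the particles in flight would break the
entanglement, and the data taken with mismatched directions would then no
longer violate Bell's inequality, which is precisely what Alice and Bob test.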

6.4 Quantum decoherence


In this section we discuss how the interaction of a system with the external
environment results in a loss of coherence of a pure state, which becomes a
mixed state. This process is at the basis of the classical behaviour of
macroscopic systems, which can never be in a superposition of states. Quantum
decoherence provides a solution to the famous Schrödinger's cat paradox and
is also a way to understand the measurement process and the collapse of the
state without postulating it. From this point of view, any interaction of the
system with the environment can be regarded as a measurement process. Before
discussing quantum decoherence we introduce an important tool: the density
operator and the associated density matrix.

6.4.1 Density matrix


We have so far implicitly assumed that the system is isolated and not
interacting with the external environment. In this case the system can be
assumed to be in a pure state and is described by a state vector | ψ ⟩.
Moreover, we have seen that, given an observable described by an operator Ô,
its expectation value can be calculated as

⟨Ô⟩ = ⟨ ψ |Ô| ψ ⟩ .   (6.13)

The time evolution of the state vector is described by the Schrödinger
equation. An alternative description of the state of the system, introduced
by von Neumann and independently by Landau in 1927, is given by the density
operator associated with the normalised state vector | ψ ⟩, defined as:

ρ̂ ≡ | ψ ⟩⟨ ψ | .   (6.14)

Notice that this is also the definition of the projection operator on the
state | ψ ⟩ given in Eq. (1.102); therefore, the density operator is the
projection operator on the state. In a given basis | i ⟩ we can of course
also introduce the associated density matrix ρ, with matrix elements simply
given by

ρij = ⟨ i |ρ̂| j ⟩ = ⟨ i | ψ ⟩⟨ ψ | j ⟩ = ui u*j ,   (6.15)

where, as usual, ui ≡ ⟨ i | ψ ⟩ are the components of | ψ ⟩ in the given basis.


If the state vector | ψ ⟩ coincides with one of the basis vectors, then the
density matrix is very simple: it has a single entry on the diagonal equal to
unity, while all other entries are zero.
From the Schrödinger equation it is possible to show that the dynamics of the
density operator is described by the Liouville-von Neumann equation:

iℏ ∂ρ̂/∂t = [Ĥ, ρ̂] .   (6.16)
The density operator is clearly Hermitian (ρ̂† = ρ̂) and for the pure states
we are considering one also has ρ̂² = ρ̂ (it is idempotent). It is moreover
easy to see from (6.15) that Tr[ρ] = Σi ρii = 1. Another interesting
property of the density matrix ρ is that the expectation value of an operator
Ô can be calculated as

⟨Ô⟩ = Tr[ρ O] ,   (6.17)

where O is the matrix representation of Ô in the given basis.
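
These properties are easy to verify numerically. A minimal sketch (the state
and the observable below are random, chosen purely for illustration):

    import numpy as np

    rng = np.random.default_rng(1)

    # A random normalised state vector |psi> in a 3-dimensional space.
    psi = rng.normal(size=3) + 1j * rng.normal(size=3)
    psi /= np.linalg.norm(psi)

    rho = np.outer(psi, psi.conj())             # rho = |psi><psi|, Eq. (6.14)

    print(np.allclose(rho, rho.conj().T))       # Hermitian: True
    print(np.isclose(np.trace(rho).real, 1.0))  # Tr[rho] = 1: True
    print(np.allclose(rho @ rho, rho))          # idempotent (pure state): True

    # Expectation value of a random Hermitian observable: Eq. (6.17) vs Eq. (6.13).
    A = rng.normal(size=(3, 3))
    O = A + A.T                                 # real symmetric, hence Hermitian
    print(np.isclose(np.trace(rho @ O).real, (psi.conj() @ O @ psi).real))   # True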


So far we have seen how the density operator just provides an alternative way
to describe the state of a system, fully equivalent to the description in
terms of the state vector. However, the density operator actually also gives
the opportunity to extend the description of a physical system to the case
when the system is not isolated but interacts with the environment, i.e., it
is a subsystem. In such a situation, the state of a subsystem cannot be
described in terms of a state vector. We have seen that also in the case of
the EPR paradox the entangled particles produced by the decay of a parent
particle cannot individually be described in terms of state vectors. The same
is true if particles are produced from a thermal source (an 'oven'): in that
case the spin component of a particle along any direction is fully random,
and the beam of particles that emerges is said to be unpolarised, since there
is no preferred direction for the spin orientation. Notice that one can also
have intermediate situations, where one starts from a gas of polarised
particles which, interacting with the environment (for example with the walls
inside a cavity), receive random kicks, so that the gas becomes gradually
unpolarised. In situations like this, it is said that the subsystem is in a
mixed state, and a statistical description has to be introduced.
For example, in the case of a gas of spin 1/2 particles we can only say what
is the probability of measuring the spin along a certain direction, and this
is given by the fractional population with that particular spin. For example,
we can say that a fraction w+ of the particles in the gas has positive spin
along a certain direction and the remaining fraction w− = 1 − w+ has negative
spin. This implies that, measuring the spin of one particle, one has a
probability w+ of finding positive spin and a probability w− of finding
negative spin. Notice that this probability derives from having a physical
ensemble of distinct particles; it is not the quantum ensemble made of copies
of the system in the same quantum state. Effectively, the gas of particles
populates different one-particle quantum states, each with a certain
probability. We cannot in this case build a state vector describing the
situation but, quite interestingly, we can build a density operator
associated with such mixed states. Suppose that the subsystem has a certain
distribution of weights wk to populate certain
states | ψk ⟩; the density operator for such a mixed state is then given by

ρ̂ = Σk wk | ψk ⟩⟨ ψk |   (6.18)

and the corresponding density matrix by

ρij = ⟨ i |ρ̂| j ⟩ = Σk wk ⟨ i | ψk ⟩⟨ ψk | j ⟩ = Σk wk ui(k) (uj(k))* .   (6.19)

Notice that here the sum over k indicates an incoherent sum over the
different state vectors, while the indices i and j label the basis vectors
and the components of the | ψk ⟩ on these basis vectors. Some of the
properties of the density matrix valid for pure states are still valid for
mixed states: the density matrix is still Hermitian (ρ̂† = ρ̂), the trace of
the density matrix is still unity, Tr[ρ] = 1, its dynamics is still described
by the Liouville-von Neumann equation (6.16) and, importantly, the
expectation value of an operator can still be calculated using (6.17).
However, for mixed states one now has ρ̂² ≠ ρ̂, as can easily be seen from
the matrix elements in Eq. (6.19).
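
A quick numerical check of Eq. (6.18) and of the loss of idempotency (the
weights below are an arbitrary illustrative choice):

    import numpy as np

    up = np.array([1.0, 0.0])      # |+z>
    down = np.array([0.0, 1.0])    # |-z>

    # Mixed state, Eq. (6.18): a fraction w+ of spin up and w- = 1 - w+ of spin down.
    w_plus = 0.7
    rho = w_plus * np.outer(up, up) + (1 - w_plus) * np.outer(down, down)

    print(np.isclose(np.trace(rho), 1.0))   # the trace is still unity: True
    print(np.allclose(rho @ rho, rho))      # no longer idempotent: False
    print(np.trace(rho @ rho))              # purity Tr[rho^2] = 0.58 < 1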
As an example one can build the density matrix for electrons in an eigenstate
of Ŝx. In particular, one can easily see that the eigenstate with positive
spin, Sx = ℏ/2, represented in the Ŝz basis (see Eq. (4.86)), corresponds to
the density matrix

ρ = ( 1/2  1/2 )
    ( 1/2  1/2 ) ,   (6.20)

and one can verify that ρ² = ρ. On the other hand, in the Ŝx basis, it would
simply be

ρ = ( 1  0 )
    ( 0  0 ) .   (6.21)
This situation of course describes a beam of electrons fully polarised along
the x axis. Suppose now that the electrons interact with some walls or with
an inhomogeneous magnetic field that tends to randomise the spins. The beam
will then gradually become unpolarised and the density matrix will gradually
make the transition

ρ = ( 1/2  1/2 )   −→   ( 1/2   0  )
    ( 1/2  1/2 )        (  0   1/2 ) ,   (6.22)

where the asymptotic limit on the right-hand side corresponds to the density
matrix of a fully unpolarised beam. Indeed it is proportional to the identity
and it is invariant under rotations, meaning that it would look the same in
any basis. This means that, independently of which spin component is
measured, one would always find that half of the electrons have spin up and
half have spin down, so that the polarisation, i.e. the average spin,
vanishes.
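
Both statements can be verified directly; in the following sketch the
spin-1/2 rotation operator U = cos(t/2) 1 − i sin(t/2) n̂·σ is our standard
choice for implementing a rotation (illustrative code, not from the text):

    import numpy as np

    # Eigenstate of S_x with eigenvalue +hbar/2, written in the S_z basis.
    psi = np.array([1.0, 1.0]) / np.sqrt(2)
    rho_pure = np.outer(psi, psi)               # Eq. (6.20): all entries equal to 1/2
    print(np.allclose(rho_pure @ rho_pure, rho_pure))   # pure state: True

    # Fully unpolarised beam: rho = identity/2, the right-hand side of Eq. (6.22).
    rho_mix = np.eye(2) / 2

    # Spin-1/2 rotation about the unit axis n by an angle t.
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    n = np.array([1.0, 2.0, 2.0]) / 3.0         # an arbitrary unit vector
    t = 0.9
    U = np.cos(t / 2) * np.eye(2) - 1j * np.sin(t / 2) * (n[0]*sx + n[1]*sy + n[2]*sz)

    print(np.allclose(U @ rho_mix @ U.conj().T, rho_mix))    # rotation-invariant: True
    print(np.allclose(U @ rho_pure @ U.conj().T, rho_pure))  # polarised state: False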
The behaviour in Eq. (6.22) is called decoherence and it is a very general
process, happening whenever a microscopic system interacts with a macroscopic
one. The transition can be described by the following Lindblad equation [10]

iℏ ∂ρ/∂t = [H, ρ] − iℏ D ( 0    ρ01 )
                         ( ρ10   0  ) ,   (6.23)

where D is the decoherence rate, whose inverse τdec = 1/D is the time of
relaxation to decoherence of the system. The off-diagonal terms are
exponentially damped by the decoherence term and, therefore, the density
matrix asymptotically becomes diagonal. For macroscopic systems τdec is tiny,
and this is the reason why macroscopic systems are effectively
instantaneously projected onto some eigenstate and are never observed in a
superposition state.
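
The exponential damping of the off-diagonal elements can be made visible with
a crude Euler integration of Eq. (6.23); in this toy sketch of ours we set
H = 0, so that only the decoherence term acts:

    import numpy as np

    D = 1.0                                   # decoherence rate, in units of 1/time
    dt = 1e-3                                 # Euler time step
    rho = np.array([[0.5, 0.5],
                    [0.5, 0.5]], dtype=complex)   # polarised state, Eq. (6.22) left

    off = np.array([[0.0, 1.0],
                    [1.0, 0.0]])              # mask selecting the off-diagonal entries

    # With H = 0, Eq. (6.23) reduces to d(rho)/dt = -D * (off-diagonal part of rho).
    for _ in range(3000):                     # integrate up to t = 3/D
        rho = rho - dt * D * (off * rho)      # elementwise product damps rho_01, rho_10

    print(np.round(rho.real, 3))
    # [[0.5   0.025]
    #  [0.025 0.5  ]]  -> off-diagonals ~ 0.5*exp(-3); rho approaches diag(1/2, 1/2)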
This decoherence effect solves the famous Schrödinger’s cat paradox: “A
cat is placed in a steel chamber, together with the following hellish contrap-
tion . . . In a Geiger counter there is a tiny amount of radioactive substance,
so tiny that maybe within an hour one of the atoms decays, but equally
probably none of them decays. If one decays then the counter triggers and
via a relay activates a little hammer which breaks a container of cyanide. If
one has left this entire system for an hour, then one would say the cat is
living if no atom had decayed. The first decay would have poisoned it. The
wave function of the entire system would express this by containing equal
parts of the living and dead cat.”
Therefore, after an hour the state would be described by a state vector of
the schematic form

| ψ ⟩ = (1/√2) (| ψalive ⟩ + | ψdead ⟩) .   (6.24)
The cat would then be in a superposition of dead and alive states, and
opening the box would correspond to a measurement process, with a 50-50
chance of finding it dead or alive! Obviously this was a burlesque way to
pose the question of why macroscopic systems are never observed in
superposition states. We now know the answer: because the environment itself,
independently of whether there is an observer or not, continuously makes the
state of a macroscopic system collapse, so quickly that it can never actually
be observed in a superposition state, and repeated measurements of the same
state will always give the same outcome. In this way a classical behaviour is
recovered. Schrödinger's cat, by the way, still lived a long life!
Bibliography

[1] J.S. Bell, Speakable and Unspeakable in Quantum Mechanics, Cambridge
University Press 2004.

[2] A. Einstein, B. Podolsky and N. Rosen, Can quantum mechanical description
of physical reality be considered complete?, Phys. Rev. 47 (1935), 777-780.

[3] J.J. Sakurai and J. Napolitano, Modern Quantum Mechanics, Cambridge
University Press 2017.

[4] For a review on the experiments that have been performed to test Bell’s
inequalities see A. Aspect, Bell’s inequality test: more ideal than ever,
Nature 398, 189–190 (1999), https://doi.org/10.1038/18296.

[5] J. F. Clauser, M. A. Horne, A. Shimony and R. A. Holt, Proposed
experiment to test local hidden variable theories, Phys. Rev. Lett. 23
(1969), 880-884.

[6] For a review on quantum cryptography see: N. Gisin, G. Ribordy, W. Tittel
and H. Zbinden, Quantum cryptography, Rev. Mod. Phys. 74 (2002), 145-195,
doi:10.1103/RevModPhys.74.145 [arXiv:quant-ph/0101098 [quant-ph]].

[7] R. L. Rivest, A. Shamir and L. Adleman, A Method for Obtaining Digital
Signatures and Public Key Cryptosystems (formerly On Digital Signatures and
Public Key Cryptosystems, MIT/LCS/TM-82), Communications of the ACM 21
(1978), 120-126.

[8] See the a.y. 2019-20 lecture notes by Andrew Akeroyd.

[9] F. Arute, K. Arya, R. Babbush, D. Bacon, J. C. Bardin, R. Barends,
R. Biswas, S. Boixo, F. G. S. L. Brandao, D. A. Buell et al., Quantum
supremacy using a programmable superconducting processor, Nature 574 (2019)
no.7779, 505-510 [arXiv:1910.11333 [quant-ph]].

[10] M. Schlosshauer, Quantum Decoherence, Phys. Rept. 831 (2019), 1-57
[arXiv:1911.06282 [quant-ph]].
