Book 12
Quantum Mechanics of
Many-Particle Systems:
Atoms, Molecules and More
Roy McWeeny
Looking ahead
In Book 4, when you started on Physics, we said "Physics is a big subject and you'll need
more than one book". Here is another one! Book 4 was mainly about particles, the
ways they move when forces act on them, and how the same laws of motion still hold
good for all objects built up from particles, however big they may be. In Book 11 we
moved from Classical Physics to Quantum Physics and again started with the study of
a single moving particle and the laws that govern its behaviour. Now, in Book 12, we
move on and begin to build up the mathematical machinery for dealing with systems
composed of very many particles: for example atoms, where up to about 100 electrons
move in the electric field of one nucleus, or molecules, where the electrons move in the
field provided by several nuclei.
Chapter 1 reviews the principles formulated in Book 11, along with the concepts
of vector space, in which a state vector is associated with the state of motion of
a particle, and in which an operator may be used to define a change of state. This
chapter uses Schrödinger's form of quantum mechanics, in which the state vectors
are represented by wave functions $\Psi = \Psi(x, y, z)$ (functions of the position of the
particle in space) and the operators are typically differential operators. The chapter
starts from the ideas of observables and measurement, and shows how measurement
of a physical quantity can be described in terms of operations in a vector
space. It follows with a brief reminder of the main way of calculating approximate
wave functions, first for one electron, and then for more general systems.
In Chapter 2 you take the first step by going from one electron to two: the
Hamiltonian operator is then $H(1, 2) = h(1) + h(2) + g(1, 2)$, where only $g$, the
interaction operator, depends on the coordinates of both particles. With neglect
of interaction the wave function can be taken as a product $\Psi(1, 2) = \psi_a(1)\psi_b(2)$,
which indicates Particle 1 in state $\psi_a$ and Particle 2 in state $\psi_b$. This is a first
example of the Independent Particle Model and can give an approximate wave
function for a 2-particle system. The calculation of the ground state electronic
energy of the Helium atom is completed with an approximate wave function of
product form (two electrons in an orbital of 1s type) and followed by a study of the
excited states that result when one electron is promoted into the 2s orbital. This
raises interesting problems about the symmetry of the wave function. There are, it
seems, two series of possible states: in one the function is unchanged if you swap the
electrons (it is symmetric) but in the other it changes in sign (it is antisymmetric).
Which must we choose for two electrons?
At this point we note that electron spin has not yet been taken into account.
The rest of the chapter brings in the spin functions $\alpha(s)$ and $\beta(s)$ to describe an
electron in an up-spin or a down-spin state. When these spin factors are included
in the wave functions an orbital (r) (r standing for the three spatial variables
Chapter 3 starts from the Antisymmetry Principle and shows how it can be
included generally in the Independent Particle Model for an N-electron system.
Slater's rules are derived as a basis for calculating the total energy of such a system
in its ground state, where only the lowest-energy spin-orbitals are occupied
by electrons. In this case, neglecting tiny spin-dependent effects, expressions for
the ground-state energies of the first few many-electron atoms (He, Li, Be, ...) are
easily derived.
So far, we have not considered the analytical forms of the orbitals themselves,
assuming that the atomic orbitals (AOs) for a 1-electron system (obtained in Book
11) will give a reasonable first approximation. In actual fact that is not so and
the whole of this difficult Chapter 4 is devoted to the Hartree-Fock method of
optimizing orbital forms in order to admit the effects of inter-electron repulsion.
By defining two new one-electron operators, the Coulomb operator $J$ and the
Exchange operator $K$, it is possible to set up an effective 1-electron Hamiltonian
$F$ (the Fock operator) whose eigenfunctions will be best possible approximations
to the orbitals in an IPM wave function, and whose corresponding eigenvalues give
a fairly realistic picture of the distribution of the total electronic energy $E$ among
the individual electrons. In fact, the eigenvalue $\epsilon_k$ represents the amount of energy
belonging to an electron in orbital $\psi_k$; and this can be measured experimentally by
observing how much energy is needed to knock the electron out. This gives a firm
basis for the much-used energy-level diagrams. The rest of Chapter 4 deals with
practical details, showing how the Hartree-Fock equation $F\psi = \epsilon\psi$ can be written
(by expanding $\psi$ in terms of a set of known functions) in the finite basis form
$\mathbf{F}\mathbf{c} = \epsilon\mathbf{c}$, where $\mathbf{F}$ is a square matrix representing the Fock operator and $\mathbf{c}$ is a
column of expansion coefficients.
At last, in Chapter 5, we come to the first of the main themes of Book 12: Atoms,
the building blocks of matter. In all atoms, the electrons move in the field of a
central nucleus, of charge $Ze$, and the spherical symmetry of the field allows us to
use the theory of angular momentum (Chapter 5 of Book 11) in classifying the
possible stationary states. By assigning the $Z$ electrons to the 1-electron states (i.e.
orbitals) of lowest energy we obtain the electron configuration of the electronic
ground state; and by coupling the orbital angular momentum of individual electrons,
in s, p, d, ... states with quantum numbers $l = 0, 1, 2, ...$, it is possible to set
up many-electron states with quantum numbers $L = 0, 1, 2, ...$ These are called
S, P, D, ... states and correspond to total angular momentum of 0, 1, 2, ... units:
a state of given $L$ is always degenerate, with $2L+1$ component states in which the
angular momentum component (along a fixed z-axis) goes down in unit steps from
$M = L$ to $M = -L$. Finally, the spin angular momentum must be included.
The next step is to calculate the total electronic energy of the various many-electron
states in IPM approximation, using Slater's Rules. All this is done in detail, using
worked examples, for the Carbon atom (Section 5.2). Once you have found wave
functions for the stationary states, in which the expectation values of observables
do not change in time, you'll want to know how to make an atom jump from one
state to another. Remember from Book 10 that radiation consists of a rapidly
varying electromagnetic field, carried by photons of energy $\epsilon = h\nu$, where $h$ is
Planck's constant and $\nu$ is the radiation frequency. When radiation falls on an
atom it produces a small oscillating perturbation and if you add this to the
free-atom Hamiltonian you can show that it may produce transitions between states
of different energy. When this energy difference matches the photon energy $h\nu$ a
photon will be absorbed by, or emitted from, the atom. And that is the basis of
all kinds of spectroscopy, the main experimental tool for investigating atomic
The main theoretical tool for visualizing what goes on in atoms and molecules is
provided by certain electron density functions, which give a classical picture of
how the electric charge, or the electron spin, is spread out in space. These densities,
which you first met in Chapter 4, are essentially components of the density matrix.
The properties of atoms, as atomic number (i.e. nuclear charge, Z) increases, are
usually displayed in a Periodic Table, which makes a clear connection between
electronic and chemical properties of the elements. Here you find a brief description
of the distribution of electrons among the AOs of the first 36 atoms.
This chapter ends with a brief look at the effects of small terms in the Hamiltonian,
so far neglected, which arise from the magnetic dipoles associated with electron
spins. The electronic states discussed so far are eigenstates of the Hamiltonian $H$,
the total angular momentum (squared) $L^2$, and one component $L_z$. But when spin
is included we must also admit the total spin with operators $S^2$ and $S_z$, formed by
coupling individual spins; the total angular momentum will then have components
with operators $J_x = L_x + S_x$ etc. The magnetic interactions between orbital and
spin dipoles then lead to the fine structure of the energy levels found so far. The
experimentally observed fine structure is fairly well accounted for, even with IPM
wave functions.
Atoms first started coming together, to form the simplest molecules, in the very
early Universe. In Chapter 6, "Molecules: the first steps", you go back to the
Big Bang, when all the particles in the present Universe were contained in a small
ball which exploded, the interactions between them driving them apart to form
the Expanding Universe we still have around us today. The first part of the
chapter tells the story, as best we know it, from the time when there was nothing
but an unbelievably hot sea (nowadays called a plasma) of electrons, neutrons and
protons, which began to come together in Hydrogen atoms (1 proton + 1 electron).
Then, when another proton is added, you get a hydrogen molecule ion H$_2^+$, and
so it goes on!
In Section 6.2 you do a simple quantum mechanical calculation on H$_2^+$, combining
two hydrogen-like atomic orbitals to form two approximate eigenfunctions for one
electron in the field of two stationary protons. This is your first molecular orbital
(MO) calculation, using linear combination of atomic orbitals to obtain LCAO
approximations to the first two MOs: the lower energy MO is a Bonding Orbital,
the higher energy MO is Antibonding.
The next two sections deal with the interpretation of the chemical bond: where does
it come from? There are two related interpretations and both can be generalized at
once to the case of many-electron molecules. The first is based on an approximate
calculation of the total electronic energy, which is strongly negative (describing the
attraction of the electrons to the positive nuclei): this is balanced at a certain
distance by the positive repulsive energy between the nuclei. When the total energy
reaches a minimum value for some configuration of the nuclei we say the system is
bonded. The second interpretation arises from an analysis of the forces acting on
the nuclei: these can be found by calculating the energy change when a nucleus
is displaced through an infinitesimal distance. The force-concept interpretation is
attractive because it gives a clear physical picture in terms of the electron density
function: if the density is high between two nuclei it will exert forces bringing them
AO energy levels you know something about already: the order of the MO levels
depends on simple qualitative ideas about how the AOs overlap, which depends
in turn on their sizes and shapes. So even without doing a big SCF calculation it
is often possible to make progress using only pictorial arguments. Once you have
an idea of the probable order of the MO energies, you can start filling them with
the available valence electrons and when you've done that you can think about the
resultant electron density! Very often a full SCF calculation serves only to confirm
what you have already guessed.
In Section 7.3 we turn to some simple polyatomic molecules, extending the ideas
used in dealing with diatomics to molecules whose experimentally known shapes
suggest where localized bonds are likely to be found. Here the most important
concept is that of hybridization: the mixing of s and p orbitals on the same centre,
to produce hybrids that can point in any direction. It soon turns out that hybrids of
given form can appear in sets of two, three, or four; and these are commonly found
in linear molecules, trigonal molecules (three bonds in a plane, at 120° to each
other) and tetrahedral molecules (four bonds pointing to the corners of a regular
tetrahedron). Some systems of roughly tetrahedral form are shown in Figure 7.7.
It seems amazing that polyatomic molecules can often be well represented in terms
of localized MOs similar to those found in diatomics. In Section 7.4 this mystery is
resolved in a rigorous way by showing that the non-localized MOs that arise from a
general SCF calculation can be mixed by making a unitary transformation without
changing the form of the total electron density in any way! This is another example
of the fact that only the density itself (e.g. $|\psi|^2$, not $\psi$) can have a physical meaning.
Section 7.5 turns towards bigger molecules, particularly those important for Organic
Chemistry and the Life Sciences, with fully worked examples. Many big molecules,
often built largely from Carbon atoms, have properties connected with loosely bound
electrons occupying $\pi$-type MOs that extend over the whole system.
Such molecules were a favourite target for calculations in the early days of Quantum
Chemistry (before the computer age) because the $\pi$ electrons could be considered
by themselves, moving in the field of a $\sigma$ framework, and the results could easily
be compared with experiment. Many molecules of this kind belong to the class
of alternant systems and show certain general properties. They are considered in
Section 7.6, along with first attempts to discuss chemical reactivity.
To end this long chapter, Section 7.7 summarizes and extends the bridges established between Theory and Experiment, emphasizing the pictorial value of density
functions such as the electron density, the spin density, the current density and so
Chapter 1 The problem and how to deal with it
1.1 From one particle to many
1.2 The eigenvalue equation as a variational condition
1.3 The linear variation method
1.4 Going all the way! Is there a limit?
1.5 Complete set expansions
Chapter 2 Some two-electron systems
2.1 Going from one particle to two
2.2 The Helium atom
2.3 But what happened to the spin?
2.4 The antisymmetry principle
Chapter 3 Electronic structure: the independent particle model
3.1 The basic antisymmetric spin-orbital products
3.2 Getting the total energy
Chapter 4 The Hartree-Fock method
4.1 Getting the best possible orbitals: Step 1
4.2 Getting the best possible orbitals: Step 2
4.3 The self-consistent field
4.4 Finite-basis approximations
Chapter 5 Atoms: the building blocks of matter
5.1 Electron configurations and electronic states
5.2 Calculation of the total electronic energy
5.3 Spectroscopy: a bridge between experiment and theory
5.4 First-order response to a perturbation
5.5 An interlude: the Periodic Table
5.6 Effect of small terms in the Hamiltonian
Chapter 6 Molecules: first steps
6.1 When did molecules first start to form?
6.2 The first diatomic systems
6.3 Interpretation of the chemical bond
6.4 The total electronic energy in terms of density functions
The force concept in Chemistry
Chapter 1
The problem
and how to deal with it
Book 11, on the principles of quantum mechanics, laid the foundations on which we hope
to build a rather complete theory of the structure and properties of all the matter around
us; but how can we do it? So far, the most complicated system we have studied has
been one atom of Hydrogen, in which a single electron moves in the central field of a
heavy nucleus (considered to be at rest). And even that was mathematically difficult:
the Schrödinger equation which determines the allowed stationary states, in which the
energy does not change with time, took the form of a partial differential equation in three
position variables $x, y, z$, of the electron, relative to the nucleus. If a second electron is
added and its interaction with the first is included, the corresponding Schrödinger equation
cannot be solved in closed form (i.e. in terms of known mathematical functions). But
Chemistry recognizes more than 100 kinds of atom, in which the nucleus has a positive charge
$Ze$ and is surrounded by $Z$ electrons each with negative charge $-e$.
Furthermore, matter is not composed only of free atoms: most of the atoms stick together
in more elaborate structures called molecules, as will be remembered from Book 5. From
a few atoms of the most common chemical elements, an enormous number of molecules
may be constructed, including the molecules of life, which may contain many thousands
of atoms arranged in a way that allows them to carry the genetic code from one generation
to the next (the subject of Book 9). At first sight it would seem impossible to achieve
any understanding of the material world, at the level of the particles out of which it is
composed. To make any progress at all, we have to stop looking for mathematically exact
solutions of the Schrödinger equation and see how far we can get with good approximate
wave functions, often starting from simplified models of the systems we are studying. The
next few Sections will show how this can be done, without trying to be too complete
(many whole books have been written in this field) and skipping proofs whenever the
mathematics becomes too difficult.
The first three chapters of Book 11 introduced most of the essential ideas of Quantum
Mechanics, together with the mathematical tools for getting you started on the further
applications of the theory. You'll know, for example, that a single particle moving
somewhere in 3-dimensional space may be described by a wave function $\psi(x, y, z)$ (a function
of the three coordinates of its position) and that this is just one special way of representing
a state vector. If we want to talk about some observable property of the particle, such
as its energy $E$ or a momentum component $p_x$, which we'll denote here by $X$ (whatever it
may stand for), we first have to set up an associated operator$^1$ $\mathsf{X}$. You'll also know that
an operator like $\mathsf{X}$ works in an abstract vector space, simply by sending one vector into
another. In Chapter 2 of Book 11 you first learnt how such operators could be defined
and used to predict the average or expectation value $\bar{X}$ that would be obtained from a
large number of observations on a particle in a state described by the state vector $\Psi$.
In Schrödinger's form of quantum mechanics (Chapter 3) the vectors are replaced by
functions but we often use the same terminology: the scalar product of two functions
being defined (with Dirac's angle-bracket notation) as
\[ \langle\psi_1|\psi_2\rangle = \int \psi_1^*(x, y, z)\,\psi_2(x, y, z)\,dx\,dy\,dz. \]
With this notation we often write the expectation value $\bar{X}$ as
\[ \bar{X} = \langle X\rangle = \langle\Psi|\mathsf{X}\Psi\rangle, \qquad (1.1) \]
which is a Hermitian scalar product of the bra-vector $\langle\Psi|$ and the ket-vector $|\mathsf{X}\Psi\rangle$
obtained by letting the operator $\mathsf{X}$ work on the $\Psi$ that stands on the right in the scalar
product. Here it is assumed that the state vector is normalized to unity: $\langle\Psi|\Psi\rangle = 1$.
Remember also that the same scalar product may be written with the adjoint operator,
$\mathsf{X}^\dagger$, working on the left-hand $\Psi$. Thus
\[ \bar{X} = \langle X\rangle = \langle\mathsf{X}^\dagger\Psi|\Psi\rangle. \]
This is the property of Hermitian symmetry. The operators associated with observables
are self-adjoint, or Hermitian, so that $\mathsf{X}^\dagger = \mathsf{X}$.
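In Schrödinger's form of quantum mechanics (Chapter 3 of Book 11) $\mathsf{X}$ is usually
represented as a partial differential operator, built up from the coordinates $x, y, z$ and the
differential operators
\[ \mathsf{p}_x = \frac{\hbar}{i}\frac{\partial}{\partial x}, \qquad \mathsf{p}_y = \frac{\hbar}{i}\frac{\partial}{\partial y}, \qquad \mathsf{p}_z = \frac{\hbar}{i}\frac{\partial}{\partial z}, \]
which work on the wave function $\psi(x, y, z)$.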
As we've given up on the idea of calculating wave functions and energy levels accurately,
by directly solving Schrödinger's equation $\mathsf{H}\Psi = E\Psi$, we have to start thinking about
possible ways of getting fair approximations. To this end, let's go back to first principles,
as we did in the early chapters of Book 11.
(Footnote 1: Remember that a special typeface has been used for operators, vectors and other
non-numerical quantities.)
The expectation value given in (1.1) would be obtained experimentally by repeating the
measurement of $X$ a large number of times, say $N$, always starting from the system in state
$\Psi$, and recording the actual results $X_1, X_2, ...$ etc. which may be found $n_1$ times, $n_2$
times, and so on, all scattered around their average value $\bar{X}$. The fraction $n_i/N$ gives the
probability $p_i$ of getting the result $X_i$; and in terms of probabilities it follows that
\[ \bar{X} = \langle X\rangle = p_1X_1 + p_2X_2 + \dots + p_iX_i + \dots + p_NX_N = \sum_i p_iX_i. \]
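The step from counts to probabilities can be made concrete with a few lines of Python. This is only an illustrative sketch: the results and counts below are made-up numbers, not data from any experiment.

```python
# Hypothetical measurement record: result X_i was found n_i times.
results = [1.0, 2.0, 3.0]      # possible results X_i (made-up values)
counts = [2, 5, 3]             # how often each was found, n_i (made-up values)

N = sum(counts)                              # total number of measurements
probabilities = [n / N for n in counts]      # p_i = n_i / N
mean_X = sum(p * X for p, X in zip(probabilities, results))

# The same number, obtained as an ordinary average over all N recorded results.
direct_average = sum(n * X for n, X in zip(counts, results)) / N

print(probabilities, mean_X, direct_average)
```

Both routes give the same average, which is all the formula $\bar{X} = \sum_i p_iX_i$ says.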
Now it's much easier to calculate an expectation value, using (1.1), than it is to solve
an enormous partial differential equation; so we look for some kind of condition on $\Psi$,
involving only an expectation value, that will be satisfied when $\Psi$ is a solution of the
equation $\mathsf{H}\Psi = E\Psi$.
The obvious choice is to take $\mathsf{X} = \mathsf{H} - E\mathsf{I}$, where $\mathsf{I}$ is the identity operator which leaves
any operand unchanged, for in that case
\[ \mathsf{X}\Psi = \mathsf{H}\Psi - E\Psi \]
and the state vector $\mathsf{X}\Psi$ is zero only when the Schrödinger equation is satisfied. The test
for this is simply that the vector has zero length:
\[ \langle\mathsf{X}\Psi|\mathsf{X}\Psi\rangle = 0. \]
In that case, $\Psi$ may be one of the eigenvectors of $\mathsf{H}$, e.g. $\Psi_i$ with eigenvalue $E_i$, and the
last equation gives $\mathsf{H}\Psi_i = E_i\Psi_i$. On taking the scalar product with $\Psi_i$, from the left, it
follows that $\langle\Psi_i|\mathsf{H}|\Psi_i\rangle = E_i\langle\Psi_i|\Psi_i\rangle$ and for eigenvectors normalized to unity the energy
expectation value coincides with the definite eigenvalue.
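Let's move on to the case where $\Psi$ is not an eigenvector of $\mathsf{H}$ but rather an arbitrary
vector, which can be expressed as a mixture of a complete set of all the eigenvectors
$\{\Psi_i\}$ (generally infinite), with numerical expansion coefficients $c_1, c_2, ... c_i, ...$. Keeping
$\Psi$ (without subscript) to denote the arbitrary vector, we put
\[ \Psi = c_1\Psi_1 + c_2\Psi_2 + \dots = \sum_i c_i\Psi_i \qquad (1.7) \]
and use the general properties of eigenstates (Section 3.6 of Book 11) to obtain a general
expression for the expectation value of the energy in state (1.7), which may be normalized
so that $\langle\Psi|\Psi\rangle = 1$.
Thus, substitution of (1.7) gives
\[ \bar{E} = \langle\Psi|\mathsf{H}|\Psi\rangle = \Big\langle\Big(\sum_i c_i\Psi_i\Big)\Big|\mathsf{H}\Big|\Big(\sum_j c_j\Psi_j\Big)\Big\rangle = \sum_{i,j} c_i^*c_j\langle\Psi_i|\mathsf{H}|\Psi_j\rangle \]
\[ \phantom{\bar{E}} = |c_1|^2E_1 + |c_2|^2E_2 + \dots = \sum_i |c_i|^2E_i, \]
where the last step uses $\mathsf{H}\Psi_j = E_j\Psi_j$ and the orthonormality of the eigenvectors, so that
$\langle\Psi_i|\mathsf{H}|\Psi_j\rangle$ is $E_i$ when $j = i$ and zero otherwise.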
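Now suppose we are interested in the state of lowest energy, the ground state, with $E_1$
less than any of the others. In that case it follows from the last two equations that
\[ \langle\Psi|\mathsf{H}|\Psi\rangle - E_1 = |c_1|^2E_1 + |c_2|^2E_2 + \dots - \big(|c_1|^2E_1 + |c_2|^2E_1 + \dots\big) = 0 + |c_2|^2(E_2 - E_1) + \dots \]
Here $E_1$ has been written as $E_1(|c_1|^2 + |c_2|^2 + \dots)$, which is allowed because the
normalization $\langle\Psi|\Psi\rangle = 1$ means that the $|c_i|^2$ add up to 1.
None of the quantities on the right-hand side can be negative: $|c_i|^2 \ge 0$ for all $i$ and
$E_i - E_1 > 0$ because $E_1$ is the smallest of all the eigenvalues. It follows that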
Given an arbitrary state vector $\Psi$, which may be
chosen so that $\langle\Psi|\Psi\rangle = 1$, the energy expectation value
\[ \bar{E} = \langle\Psi|\mathsf{H}|\Psi\rangle / \langle\Psi|\Psi\rangle \qquad (1.10) \]
must be greater than or equal to the lowest eigenvalue, $E_1$,
of the Hamiltonian operator $\mathsf{H}$.
Here the normalization factor $\langle\Psi|\Psi\rangle$ has been left in the denominator of $\bar{E}$ and the result
then remains valid even when $\Psi$ is not normalized (check it!). This is a famous theorem
and provides a basis for the variation method of calculating approximate eigenstates.
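A quick numerical experiment makes the theorem plausible. The sketch below is only an illustration under stated assumptions: the eigenvalues are made-up numbers, the helper name `energy_expectation` is hypothetical, and the trial vectors are random, unnormalized mixtures of the eigenvectors.

```python
import random

def energy_expectation(coeffs, energies):
    """Return sum |c_i|^2 E_i / sum |c_i|^2 for expansion coefficients c_i."""
    weights = [abs(c) ** 2 for c in coeffs]
    return sum(w * E for w, E in zip(weights, energies)) / sum(weights)

# Made-up eigenvalues E_1 < E_2 < ... (illustrative only).
energies = [-0.5, -0.125, -0.0556, 0.2]

random.seed(0)
trial_energies = []
for _ in range(1000):
    # Arbitrary complex coefficients: the trial vector is NOT normalized.
    coeffs = [complex(random.uniform(-1, 1), random.uniform(-1, 1))
              for _ in energies]
    trial_energies.append(energy_expectation(coeffs, energies))

lowest_trial = min(trial_energies)
print(lowest_trial, energies[0])                    # lowest trial energy vs E_1
print(energy_expectation([1, 0, 0, 0], energies))   # pure ground state
```

However the coefficients are chosen, the energy expectation value never falls below $E_1$, and it equals $E_1$ only when the trial vector is the ground state itself.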
In Schrödinger's formulation of quantum mechanics, where $\Psi$ is represented by a wave
function such as $\psi(x, y, z)$, one can start from any trial function that looks roughly
right and contains adjustable parameters. By calculating a variational energy $\langle\psi|\mathsf{H}|\psi\rangle$
and varying the parameters until you can't find a lower value of this quantity you will
know you have found the best approximation you can get to the ground-state energy $E_1$
and corresponding wave function. To do better you'll have to use a trial $\psi$ of different
functional form.
As a first example of using the variation method we'll get an approximate wave function
for the ground state of the hydrogen atom. In Book 11 (Section 6.2) we got the energy and
wave function for the ground state of an electron in a hydrogen-like atom, with nuclear
charge $Ze$, placed at the origin. They were, using atomic units,
\[ E_{1s} = -\tfrac{1}{2}Z^2, \qquad \phi_{1s} = N_{1s}e^{-Zr}, \]
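\[ \int_{-\infty}^{+\infty}\exp(-ps^2 - qs)\,ds = \sqrt{\pi/p}\,\exp(q^2/4p), \]
which holds for any values (real or complex) of the constants $p, q$, provided the real part of $p$
is positive so that the integral converges. Since the function we're integrating is
symmetrical about $r = 0$ and is needed only for $q = 0$ we'll use the basic integral
\[ I_0 = \int_0^\infty e^{-pr^2}\,dr = \tfrac{1}{2}\sqrt{\pi}\,p^{-1/2}. \qquad (B) \]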
Now let's differentiate both sides of equation (B) with respect to the parameter $p$, just as if it were an
ordinary variable (even though it is inside the integrand and really one should prove that this is OK).
On the left we get (look back at Book 3 if you need to)
\[ -\int_0^\infty r^2e^{-pr^2}\,dr = -I_1, \]
where we've called the new integral $I_1$ as we got it from $I_0$ by doing one differentiation. On differentiating
the right-hand side of (B) we get
\[ \frac{d}{dp}\Big(\tfrac{1}{2}\sqrt{\pi}\,p^{-1/2}\Big) = \tfrac{1}{2}\sqrt{\pi}\,\big(-\tfrac{1}{2}p^{-3/2}\big) = -\tfrac{1}{4}\sqrt{\pi}/(p\sqrt{p}). \]
But the two results must be equal (if two functions of $p$ are identically equal their slopes will be equal at
all points) and therefore
\[ I_1 = \int_0^\infty r^2e^{-pr^2}\,dr = \tfrac{1}{2}\sqrt{\pi}\,\big(\tfrac{1}{2}p^{-3/2}\big) = \tfrac{1}{4}\sqrt{\pi}/(p\sqrt{p}), \]
where the integral $I_1$ on the left is the one we need as it appears in (A) above. On using this result in
(A) and remembering that $p = 2\alpha$ it follows that $N^2 = (p/\pi)^{3/2} = (2\alpha/\pi)^{3/2}$.
N =
\[ \nabla^2 \equiv \frac{2}{r}\frac{d}{dr} + \frac{d^2}{dr^2}. \]
We'll evaluate the three terms on the right in the next two Examples:
Example 1.2 Expectation value of the potential energy
We require $\bar{V} = \langle\psi|\mathsf{V}|\psi\rangle$ with $\mathsf{V} = -Z/r$, which involves $N^2\,4\pi\int_0^\infty (1/r)\,r^2e^{-pr^2}\,dr$ and so
looks like the integral at A in Example 1.1 except for the factor $(1/r)$. The new integral we
need is $4\pi I_0'$, where
\[ I_0' = \int_0^\infty r\,e^{-pr^2}\,dr \qquad (p = 2\alpha) \]
and the factor $r$ spoils everything: we can no longer get $I_0'$ from $I_0$ by differentiating, as in Example 1.1,
for that would bring down a factor $r^2$. However, we can use another of the tricks you learnt in Chapter 4
of Book 3. (If you've forgotten all that you'd better read it again!) It comes from changing the variable
by putting $r^2 = u$ and expressing $I_0'$ in terms of $u$. In that case $du = 2r\,dr$ and
\[ I_0' = \tfrac{1}{2}\int_0^\infty e^{-pu}\,du. \]
The integral is a simple standard integral and when the limits are put in it gives (check it!)
\[ I_0' = \tfrac{1}{2}\big[-e^{-pu}/p\big]_0^\infty = \tfrac{1}{2}(1/p). \]
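If you have a computer to hand, the three integrals used so far can be checked numerically. The sketch below is one way of doing it, assuming an arbitrary test value $\alpha = 0.3$ and a simple midpoint rule (the helper `integrate` is a hypothetical name, not something from the text); it also confirms that $N^2 = (p/\pi)^{3/2}$ normalizes the gaussian.

```python
import math

def integrate(f, upper, steps=100000):
    """Composite midpoint rule for the integral of f from 0 to `upper`."""
    h = upper / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))

p = 2.0 * 0.3          # p = 2*alpha, with an arbitrary test value alpha = 0.3
upper = 20.0           # the integrands are negligible beyond this radius

I0 = integrate(lambda r: math.exp(-p * r * r), upper)
I1 = integrate(lambda r: r**2 * math.exp(-p * r * r), upper)
I0_prime = integrate(lambda r: r * math.exp(-p * r * r), upper)

print(I0, 0.5 * math.sqrt(math.pi / p))                     # equation (B)
print(I1, 0.25 * math.sqrt(math.pi) / (p * math.sqrt(p)))   # one differentiation
print(I0_prime, 0.5 / p)                                    # substitution r^2 = u

# Normalization check: N^2 * 4*pi * I1 should equal 1.
N_squared = (p / math.pi) ** 1.5
print(N_squared * 4.0 * math.pi * I1)
```

Each printed pair should agree to many decimal places, and the last line should be very close to 1.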
And now you know how to do the integrations you should be able to get the remaining
terms in the expectation value of the Hamiltonian $\mathsf{h}$. They come from the kinetic energy
operator $\mathsf{T} = -\tfrac{1}{2}\nabla^2$, as in the next example.
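Example 1.3 Expectation value of the kinetic energy
We require $\bar{T} = \langle\psi|\mathsf{T}|\psi\rangle$ and from (1.12) this is seen to be the sum of two terms. The first one involves
the first derivative of $\psi$, which becomes (on putting $r^2 = u$ in $\psi = Ne^{-\alpha r^2}$)
\[ \frac{d\psi}{dr} = \frac{d\psi}{du}\frac{du}{dr} = -2\alpha r\,\psi = -pr\,\psi \qquad (p = 2\alpha). \]
Differentiating once more gives $d^2\psi/dr^2 = (p^2r^2 - p)\psi$, so the second-derivative term is
\[ \langle\psi|{-\tfrac{1}{2}}\frac{d^2}{dr^2}|\psi\rangle = \tfrac{1}{2}N^2\,4\pi p\int_0^\infty r^2e^{-pr^2}\,dr - \tfrac{1}{2}N^2\,4\pi p^2\int_0^\infty r^4e^{-pr^2}\,dr = 2\pi N^2(pI_1 - p^2I_2), \]
where $I_2 = \int_0^\infty r^4e^{-pr^2}\,dr = \tfrac{3}{8}\sqrt{\pi}\,p^{-5/2}$ follows from $I_1$ by one more differentiation with respect to $p$.
When the first-derivative term is added, namely $4\pi N^2pI_1$, we obtain the expectation value of the kinetic
energy as
\[ 4\pi N^2pI_1 + 2\pi N^2(pI_1 - p^2I_2) = 2\pi N^2(3pI_1 - p^2I_2). \]
The two integrals in the final parentheses give
\[ 2\pi N^2p^2I_2 = 2\pi N^2\,(3/8)\sqrt{\pi/p}, \qquad 2\pi N^2pI_1 = 2\pi N^2\,(1/4)\sqrt{\pi/p}, \]
and remembering that $p = 2\alpha$ and that $N^2$ is given in Example 1.1, substitution gives the result
\[ \bar{T} = 2\pi N^2\,(3/8)\sqrt{\pi/2\alpha} = 3\alpha/2. \]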
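Finally, the expectation energy with a trial wave function of the form $\psi = Ne^{-\alpha r^2}$ becomes,
on adding the PE term from (1.13), $-2\pi ZN^2(1/2\alpha)$,
\[ \bar{E} = \frac{3\alpha}{2} - 2Z\sqrt{\frac{2}{\pi}}\;\alpha^{1/2}. \]
There is only one variable parameter $\alpha$ and to get the best approximate ground state
function of Gaussian form we must adjust $\alpha$ until $\bar{E}$ reaches a minimum value. The value
of $\alpha$ that does this follows from $d\bar{E}/d\alpha = 0$ on writing $\bar{E} = A\alpha - B\alpha^{1/2}$
($A = 3/2$, $B = 2Z\sqrt{2/\pi}$): it gives $\alpha^{1/2} = B/2A$, with minimum energy $-B^2/4A$ and
kinetic energy $A\alpha = B^2/4A$.
The fact that the minimum energy is exactly $-1 \times$ the kinetic energy is no accident: it is a consequence
of the virial theorem, about which you'll hear more later. For the moment, we note that for a hydrogen-like
atom the 1-term gaussian wave function gives a best approximate energy
\[ \bar{E}_{\min} = -\tfrac{1}{2}\big(2Z\sqrt{2/\pi}\big)^2/3 = -4Z^2/3\pi. \]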
Example 1.4 gives the result $-0.42442\,Z^2$, where all energies are in units of $e_H$.
For the hydrogen atom, with $Z = 1$, the exact ground state energy is $-\tfrac{1}{2}e_H$, as we know
from Book 11. In summary then, the conclusion from the Example is that a gaussian
function gives a very poor approximation to the hydrogen atom ground state, the estimate
$-0.42442\,e_H$ being in error by about 15%. The next Figure shows why:
Figure 1.1 Comparison of exponential and gaussian functions
The gaussian $\psi(r)$ fails to describe the sharp cusp when $r \to 0$ and also goes to zero much too rapidly
when $r$ is large.
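The one-term gaussian result can be reproduced in a few lines. The following is a minimal sketch in atomic units, using the kinetic energy $3\alpha/2$ and potential energy $-2Z\sqrt{2\alpha/\pi}$ that follow from the integrals in the Examples; the function name `gaussian_energy` and the scan step are arbitrary choices made for this illustration.

```python
import math

def gaussian_energy(alpha, Z=1.0):
    """Energy expectation value (atomic units) for the trial function exp(-alpha r^2)."""
    kinetic = 1.5 * alpha
    potential = -2.0 * Z * math.sqrt(2.0 * alpha / math.pi)
    return kinetic + potential

Z = 1.0
# dE/d(alpha) = 0 gives sqrt(alpha) = B/(2A) with A = 3/2, B = 2Z*sqrt(2/pi).
alpha_best = 8.0 * Z**2 / (9.0 * math.pi)
E_min = gaussian_energy(alpha_best, Z)

# Crude numerical scan over alpha as an independent check of the minimum.
scan = min(gaussian_energy(0.001 * k, Z) for k in range(1, 2000))

print(alpha_best, E_min, scan)
print("kinetic energy at the minimum:", 1.5 * alpha_best)
print("fractional error against the exact -1/2:", (E_min + 0.5) / 0.5)
```

The scan and the analytic minimum agree, the kinetic energy at the minimum equals $-\bar{E}_{\min}$ (the virial relation noted earlier), and the fractional error is about 0.15.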
Of course we could get the accurate energy $E_1 = -\tfrac{1}{2}e_H$ and the corresponding wave
function $\psi_1$, by using a trial function of exponential form $\exp(-ar)$ and varying the parameter
$a$ until the approximate energy reaches a minimum value. But here we'll try another
approach, taking a mixture of two gaussian functions, one falling rapidly to zero as $r$
increases and the other falling more slowly: in that way we can hope to correct the main
defects in the 1-term approximation.
Example 1.5 A 2-term gaussian approximation
With a trial function of the form $\psi = A\exp(-ar^2) + B\exp(-br^2)$ there are three parameters that can
be independently varied, $a$, $b$ and the ratio $c = B/A$, a fourth parameter not being necessary if we're
looking for a normalized function (can you say why?). So we'll use instead a 2-term function
$\psi = \exp(-ar^2) + c\exp(-br^2)$.
From the previous Examples 1.1-1.3, it's clear how you can evaluate all the integrals you need in
calculating $\langle\psi|\psi\rangle$ and the expectation values $\langle\psi|\mathsf{V}|\psi\rangle$, $\langle\psi|\mathsf{T}|\psi\rangle$; all you'll need to change will be the parameter
values in the integrals.
Try to work through this by yourself, without doing the variation of all three values to find the minimum
value of $\bar{E}$. (Until you've learnt to use a computer that's much too long a job!) But you may like
to know the result: the best values of $a, b, c$ are $a = 1.32965$, $b = 0.20146$, $c = 0.72542$ and the
best approximation to $E_{1s}$ then comes out as $\bar{E} = -0.4858\,Z^2e_H$. This compares with the one-term
approximation $\bar{E} = -0.4244\,Z^2e_H$; the error is now reduced from about 15% to less than 3%.
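The quoted two-term energy can be checked directly once the integrals are known. The sketch below assumes the standard closed forms for s-type gaussians centred on the nucleus (they generalize the integrals of Examples 1.1-1.3 to two different exponents and are stated here without derivation); the function names are hypothetical, and $Z = 1$ is taken since the quoted exponents refer to hydrogen.

```python
import math

def overlap(a, b):
    """Integral of exp(-a r^2) * exp(-b r^2) over all space."""
    return (math.pi / (a + b)) ** 1.5

def kinetic(a, b):
    """Matrix element of T = -(1/2) del^2 between the two gaussians."""
    return 3.0 * a * b / (a + b) * overlap(a, b)

def potential(a, b, Z=1.0):
    """Matrix element of V = -Z/r between the two gaussians."""
    return -2.0 * math.pi * Z / (a + b)

def two_term_energy(a, b, c, Z=1.0):
    """Energy expectation value for psi = exp(-a r^2) + c exp(-b r^2)."""
    exps, coef = [a, b], [1.0, c]
    num = den = 0.0
    for i in range(2):
        for j in range(2):
            w = coef[i] * coef[j]
            num += w * (kinetic(exps[i], exps[j]) + potential(exps[i], exps[j], Z))
            den += w * overlap(exps[i], exps[j])
    return num / den

# Parameter values quoted in Example 1.5.
print(two_term_energy(1.32965, 0.20146, 0.72542))
```

With the parameters quoted in the Example the result should come out close to $-0.4858$; setting $c = 0$ recovers the one-term expression $3a/2 - 2\sqrt{2a/\pi}$.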
The approximate wave function obtained in Example 1.5 is plotted in Figure 1.2 and again
compared with the exact 1s function. (The functions are not normalized, being shifted
vertically to show how well the cusp behaviour is corrected. Normalization improves the
agreement in the middle range.)
Instead of building a variational approximation to the wave function out of only two
terms we may use as many as we please, taking in general
\[ \Psi = c_1\chi_1 + c_2\chi_2 + \dots + c_N\chi_N, \]
where (with the usual notation) the functions $\{\chi_i\ (i = 1, 2, ... N)\}$ are fixed and we vary
only the coefficients $c_i$ in the linear combination: this is called a linear variation function
and it lies at the root of nearly all methods of constructing atomic and molecular wave functions.
From the variation theorem (1.10) we need to calculate the expectation energy $\bar{E} =
\langle\Psi|\mathsf{H}|\Psi\rangle/\langle\Psi|\Psi\rangle$, which we know will give an upper bound to the lowest exact eigenvalue
$E_1$ of the operator $\mathsf{H}$. We start by putting this expression in a convenient matrix form:
you used matrices a lot in Book 11, representing the operator $\mathsf{H}$ by a square array of
numbers $\mathbf{H}$ with $H_{ij} = \langle\chi_i|\mathsf{H}|\chi_j\rangle$ (called a matrix element) standing at the intersection
of the $i$th row and $j$th column; and collecting the coefficients $c_i$ in a single column $\mathbf{c}$. A
matrix element $H_{ij}$ with $j = i$ lies on the diagonal of the array and gives the expectation
energy $\bar{E}_i$ when the system is in the particular state $\Psi = \chi_i$. (Look back at Book 11
Chapter 7 if you need reminding of the rules for using matrices.)
In matrix notation the more general expectation energy becomes
\[ \bar{E} = \frac{\mathbf{c}^\dagger\mathbf{H}\mathbf{c}}{\mathbf{c}^\dagger\mathbf{M}\mathbf{c}}, \]
where $\mathbf{c}^\dagger$ (the Hermitian transpose of $\mathbf{c}$) denotes the row of coefficients $(c_1^*\ c_2^*\ ...\ c_N^*)$ and
$\mathbf{M}$ (the metric matrix) looks like $\mathbf{H}$ except that $H_{ij}$ is replaced by $M_{ij} = \langle\chi_i|\chi_j\rangle$, the
scalar product or overlap of the two functions. This allows us to use sets of functions
that are neither normalized to unity nor orthogonal, with no additional complication.
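The best approximate state function (1.11) we can get is obtained by minimizing $\bar{E}$ to
make it as close as possible to the (unknown!) ground state energy $E_1$, and to do this we
look at the effect of a small variation $\mathbf{c} \to \mathbf{c} + \delta\mathbf{c}$: if we have reached the minimum, $\bar{E}$
will be stationary, with the corresponding change $\delta\bar{E} = 0$.
In the variation $\mathbf{c} \to \mathbf{c} + \delta\mathbf{c}$, $\bar{E}$ becomes
\[ \bar{E} + \delta\bar{E} = \frac{\mathbf{c}^\dagger\mathbf{H}\mathbf{c} + \mathbf{c}^\dagger\mathbf{H}\,\delta\mathbf{c} + \delta\mathbf{c}^\dagger\mathbf{H}\mathbf{c} + \dots}{\mathbf{c}^\dagger\mathbf{M}\mathbf{c} + \mathbf{c}^\dagger\mathbf{M}\,\delta\mathbf{c} + \delta\mathbf{c}^\dagger\mathbf{M}\mathbf{c} + \dots}, \]
where second-order terms that involve products of $\delta$-quantities have been dropped
(vanishing in the limit $\delta\mathbf{c} \to 0$).
The denominator in this expression can be re-written, since $\mathbf{c}^\dagger\mathbf{M}\mathbf{c}$ is just a number, as
\[ \mathbf{c}^\dagger\mathbf{M}\mathbf{c}\,\big[1 + (\mathbf{c}^\dagger\mathbf{M}\mathbf{c})^{-1}(\mathbf{c}^\dagger\mathbf{M}\,\delta\mathbf{c} + \delta\mathbf{c}^\dagger\mathbf{M}\mathbf{c})\big] \]
and the part in square brackets has an inverse (to first order in small quantities)
\[ 1 - (\mathbf{c}^\dagger\mathbf{M}\mathbf{c})^{-1}(\mathbf{c}^\dagger\mathbf{M}\,\delta\mathbf{c} + \delta\mathbf{c}^\dagger\mathbf{M}\mathbf{c}). \]
On putting this result in the expression for $\bar{E} + \delta\bar{E}$ and re-arranging a bit (do it!) you'll find
\[ \bar{E} + \delta\bar{E} = \bar{E} + (\mathbf{c}^\dagger\mathbf{M}\mathbf{c})^{-1}\big[(\mathbf{c}^\dagger\mathbf{H}\,\delta\mathbf{c} + \delta\mathbf{c}^\dagger\mathbf{H}\mathbf{c}) - \bar{E}(\mathbf{c}^\dagger\mathbf{M}\,\delta\mathbf{c} + \delta\mathbf{c}^\dagger\mathbf{M}\mathbf{c})\big]. \]
It follows that the first-order variation is given by
\[ \delta\bar{E} = (\mathbf{c}^\dagger\mathbf{M}\mathbf{c})^{-1}\big[(\mathbf{c}^\dagger\mathbf{H} - \bar{E}\mathbf{c}^\dagger\mathbf{M})\,\delta\mathbf{c} + \delta\mathbf{c}^\dagger(\mathbf{H}\mathbf{c} - \bar{E}\mathbf{M}\mathbf{c})\big]. \qquad (1.18) \]
The two terms in (1.18) are complex conjugate, giving a real result which will vanish only
when each is zero.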
The condition for a stationary value thus reduces to a matrix eigenvalue equation
Hc = EMc.
To get the minimum value of $\bar{E}$ we therefore take the lowest eigenvalue; and the corresponding 'best approximation' to the wave function $\Psi_1$ will follow on solving the
simultaneous equations equivalent to (1.19), namely
$$\sum_j H_{ij}c_j = \bar{E}\sum_j M_{ij}c_j \quad (\text{all } i).$$
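The stationary property can be checked numerically. The following Python sketch is only an illustration: the 2×2 matrices H and M are made-up numbers (not taken from the text), chosen real and symmetric, with M a non-orthogonal metric. It finds the lowest root of the determinant condition, builds a coefficient column from the first of the simultaneous equations, and compares the expectation energy before and after a small change in the coefficients.

```python
import math

def rayleigh(H, M, c):
    """Expectation energy c^T H c / c^T M c for real 2x2 matrices."""
    num = sum(c[i] * H[i][j] * c[j] for i in range(2) for j in range(2))
    den = sum(c[i] * M[i][j] * c[j] for i in range(2) for j in range(2))
    return num / den

def lowest_root(H, M):
    """Lowest root of det(H - E M) = 0 and a matching coefficient column."""
    # The determinant is the quadratic a*E^2 + b*E + k in E.
    a = M[0][0] * M[1][1] - M[0][1] ** 2
    b = -(H[0][0] * M[1][1] + H[1][1] * M[0][0] - 2 * H[0][1] * M[0][1])
    k = H[0][0] * H[1][1] - H[0][1] ** 2
    E = (-b - math.sqrt(b * b - 4 * a * k)) / (2 * a)
    # First equation: (H11 - E M11) c1 + (H12 - E M12) c2 = 0
    c = [-(H[0][1] - E * M[0][1]), H[0][0] - E * M[0][0]]
    return E, c

# Hypothetical matrix elements, for illustration only.
H = [[-1.0, -0.4], [-0.4, -0.5]]
M = [[1.0, 0.3], [0.3, 1.0]]

E, c = lowest_root(H, M)
shift = 1e-4
E_shifted = rayleigh(H, M, [c[0] + shift, c[1] - shift])
print(E, E_shifted)
```

The two printed numbers should agree to many figures: the change in the coefficients is of first order, but the change in the energy is only of second order, and it can only be upward because the lowest root is a minimum.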
This is essentially what we did in Example 1.2, where the linear coefficients $c_1$, $c_2$ gave a
best approximation when they satisfied the two simultaneous equations
$$\begin{aligned}
(H_{11} - \bar{E}M_{11})c_1 + (H_{12} - \bar{E}M_{12})c_2 &= 0,\\
(H_{21} - \bar{E}M_{21})c_1 + (H_{22} - \bar{E}M_{22})c_2 &= 0,
\end{aligned}$$
the other parameters being fixed. Now we want to do the same thing generally, using a
large basis of N expansion functions $\{\Phi_i\}$, and to make the calculation easier it's best to
use an orthonormal set. For the case N = 2, $M_{11} = M_{22} = 1$ and $M_{12} = M_{21} = 0$, the
equations then become
$$\begin{aligned}
(H_{11} - \bar{E})c_1 &= -H_{12}c_2,\\
H_{21}c_1 &= -(H_{22} - \bar{E})c_2.
\end{aligned}$$
Here there are three unknowns, $\bar{E}$, $c_1$, $c_2$. However, by dividing each side of the first
equation by the corresponding side of the second, we can eliminate two of them, leaving
$$\frac{(H_{11} - \bar{E})}{H_{21}} = \frac{H_{12}}{(H_{22} - \bar{E})}.$$
This is quadratic in $\bar{E}$ and has two possible solutions. On cross-multiplying it follows
that $(H_{11} - \bar{E})(H_{22} - \bar{E}) = H_{12}H_{21}$, with two roots $\bar{E}_1$
and $\bar{E}_2$. After substituting either value back in the original equations, we can solve to get
the ratio of the expansion coefficients. Normalization to make $c_1^2 + c_2^2 = 1$ then results in
approximations to the first two wave functions, $\Psi_1$ (the ground state) and $\Psi_2$ (a state of
higher energy).
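As a rough illustration of the steps just described, here is a short Python sketch for the orthonormal N = 2 case with a real symmetric matrix. The function name and the numerical matrix elements are hypothetical, and the sketch assumes $H_{12} \neq 0$ (otherwise the two equations are already uncoupled).

```python
import math

def solve_two_by_two(H11, H12, H22):
    """Roots of (H11 - E)(H22 - E) = H12*H21 for a real symmetric H,
    each with a normalized coefficient pair (c1, c2). Assumes H12 != 0."""
    mean = 0.5 * (H11 + H22)
    half_gap = 0.5 * (H11 - H22)
    root = math.sqrt(half_gap ** 2 + H12 ** 2)
    results = []
    for E in (mean - root, mean + root):
        # First equation: (H11 - E) c1 = -H12 c2, so c1 : c2 = -H12 : (H11 - E)
        c1, c2 = -H12, H11 - E
        norm = math.hypot(c1, c2)
        results.append((E, c1 / norm, c2 / norm))
    return results

# Hypothetical matrix elements, for illustration only.
(E1, a1, a2), (E2, b1, b2) = solve_two_by_two(-1.0, -0.2, -0.5)
print(E1, (a1, a2))
print(E2, (b1, b2))
```

With these numbers the lower root falls below the smaller diagonal element and the upper root rises above the larger one; the two coefficient columns also come out orthogonal to each other.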
Suppose we want a really good approximation and use a basis containing hundreds of
functions $\Phi_i$. The set of simultaneous equations to be solved will then be enormous; but
we can see how to continue by looking at the case N = 3, where they become
$$\begin{aligned}
(H_{11} - \bar{E}M_{11})c_1 + (H_{12} - \bar{E}M_{12})c_2 + (H_{13} - \bar{E}M_{13})c_3 &= 0,\\
(H_{21} - \bar{E}M_{21})c_1 + (H_{22} - \bar{E}M_{22})c_2 + (H_{23} - \bar{E}M_{23})c_3 &= 0,\\
(H_{31} - \bar{E}M_{31})c_1 + (H_{32} - \bar{E}M_{32})c_2 + (H_{33} - \bar{E}M_{33})c_3 &= 0.
\end{aligned}$$
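We'll again take an orthonormal set, to simplify things. In that case the equations reduce
to (in matrix form)
$$\begin{pmatrix} H_{11} - \bar{E} & H_{12} & H_{13}\\ H_{21} & H_{22} - \bar{E} & H_{23}\\ H_{31} & H_{32} & H_{33} - \bar{E} \end{pmatrix}\begin{pmatrix} c_1\\ c_2\\ c_3 \end{pmatrix} = \begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix}.$$
When there were only two expansion functions we had similar equations, but with only
two rows and columns in the matrices:
$$\begin{pmatrix} H_{11} - \bar{E} & H_{12}\\ H_{21} & H_{22} - \bar{E} \end{pmatrix}\begin{pmatrix} c_1\\ c_2 \end{pmatrix} = \begin{pmatrix} 0\\ 0 \end{pmatrix}.$$
And we got a solution by 'cross-multiplying' in the square matrix, which gave
$$(H_{11} - \bar{E})(H_{22} - \bar{E}) - H_{21}H_{12} = 0.$$
This is called a compatibility condition: it determines the only values of $\bar{E}$ for which
the equations are compatible (i.e. can both be solved at the same time).
In the general case, there are N simultaneous equations and the condition involves the
determinant of the square array: thus for N = 3 it becomes
$$\begin{vmatrix} H_{11} - \bar{E} & H_{12} & H_{13}\\ H_{21} & H_{22} - \bar{E} & H_{23}\\ H_{31} & H_{32} & H_{33} - \bar{E} \end{vmatrix} = 0.$$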
There are many books on algebra, where you can find whole chapters on the theory of
determinants, but nowadays equations like (1.16) can be solved easily with the help of
a small computer. All the theory you really need was explained long ago in Book 2
(Section 6.12). So here a reminder should be enough:
Given a square matrix A, with three rows and columns, its determinant can be evaluated as follows. You
can start from the 11-element $A_{11}$ and then get the determinant of the 2×2 matrix that is left when you
take away the first row and first column:
$$\begin{vmatrix} A_{22} & A_{23}\\ A_{32} & A_{33} \end{vmatrix} = A_{22}A_{33} - A_{32}A_{23},$$
as follows from what you did just before (1.16). What you have evaluated is called the co-factor of
$A_{11}$ and is denoted by $A^{(11)}$.
Then move to the next element in the first row, namely $A_{12}$, and do the same sort of thing: take away
the first row and second column and then get the determinant of the 2×2 matrix that is left. This would
seem to be the co-factor of $A_{12}$; but in fact, whenever you move from one element in the row to the next,
you have to attach a minus sign. Writing $A^{(12)}$ for the 2×2 determinant you have just found, its contribution to the expansion is therefore $-A_{12}A^{(12)}$.
When you've finished the row you can put together the three contributions to get
$$|A| = A_{11}A^{(11)} - A_{12}A^{(12)} + A_{13}A^{(13)}$$
and you've evaluated the 3×3 determinant!
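Since the text says a small computer can do this much better than we can, here is a minimal Python sketch of the first-row expansion. The function names are made up for illustration, and `secular` simply applies the same expansion to the matrix with the energy subtracted from each diagonal element (orthonormal basis assumed).

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 array [[a, b], [c, d]]."""
    return a * d - c * b

def det3(A):
    """3x3 determinant by expansion along the first row."""
    m11 = det2(A[1][1], A[1][2], A[2][1], A[2][2])  # strike row 1, column 1
    m12 = det2(A[1][0], A[1][2], A[2][0], A[2][2])  # strike row 1, column 2
    m13 = det2(A[1][0], A[1][1], A[2][0], A[2][1])  # strike row 1, column 3
    # The signs alternate on moving along the row.
    return A[0][0] * m11 - A[0][1] * m12 + A[0][2] * m13

def secular(H, E):
    """Value of the determinant of (H - E*1) for a 3x3 matrix H."""
    shifted = [[H[i][j] - (E if i == j else 0.0) for j in range(3)]
               for i in range(3)]
    return det3(shifted)

print(det3([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))
```

Calling `secular` for a range of trial energies gives the values of the determinant as a function of the energy, which is all that is needed to look for the values at which it vanishes.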
The only reason for reminding you of all that (since a small computer can do such things
much better than we can) was to show that the determinant in (1.21) will give you a
polynomial of degree 3 in the energy $\bar{E}$. (That is clear if you take $\mathbf{A} = \mathbf{H} - \bar{E}\mathbf{1}$, make the
expansion, and look at the terms that arise from the product of elements on the principal
diagonal, namely $(H_{11} - \bar{E}) \times (H_{22} - \bar{E}) \times (H_{33} - \bar{E})$. These include $-\bar{E}^3$.) Generally, as
you can see, the expansion of a determinant like (1.16), but with N rows and columns,
will contain a term of highest degree in $\bar{E}$ of the form $(-1)^N\bar{E}^N$. This leads to conclusions
of very great importance as you're just about to see.
The first time you learnt anything about eigenfunctions and how they could be used
was in Book 3 (Section 6.3). Before starting the present Section 1.4 of Book 12, you
should read again what was done there. You were studying a simple differential equation,
the one that describes standing waves on a vibrating string, and the solutions were sine
functions (very much like the eigenfunctions coming from Schrödinger's equation for a
particle in a box, discussed in Book 11). By putting together a large number of such
functions, corresponding to increasing values of the vibration frequency, you were able to
get approximations to the instantaneous shape of the string for any kind of vibration.
That was a first example of an eigenfunction expansion. Here we're going to use such
expansions in constructing approximate wave functions for atoms and molecules; and
we've taken the first steps by starting from linear variation functions. What we must do
now is to ask how a function of the form (1.16) can approach more and more closely an
exact eigenfunction of the Hamiltonian H as N is increased.
In Section 1.3 it was shown that an N-term variation function (1.16) could give an optimum approximation to the ground state wave function $\Psi_1$, provided the expansion
coefficients $c_i$ were chosen so as to satisfy a set of linear equations: for N = 3 these took
the form
$$\begin{aligned}
(H_{11} - \bar{E}M_{11})c_1 + (H_{12} - \bar{E}M_{12})c_2 + (H_{13} - \bar{E}M_{13})c_3 &= 0,\\
(H_{21} - \bar{E}M_{21})c_1 + (H_{22} - \bar{E}M_{22})c_2 + (H_{23} - \bar{E}M_{23})c_3 &= 0,\\
(H_{31} - \bar{E}M_{31})c_1 + (H_{32} - \bar{E}M_{32})c_2 + (H_{33} - \bar{E}M_{33})c_3 &= 0,
\end{aligned}$$
and were compatible only when the variational energy $\bar{E}$ satisfied the condition (1.16).
There are only three values of $\bar{E}$ which do so. We know that $\bar{E}_1$ is an upper bound to the
accurate lowest-energy eigenvalue $E_1$ but what about the other two?
In general, equations of this kind are called secular equations and a condition like (1.16)
is called a secular determinant. If we plot the value, $\Delta$ say, of the determinant (having
worked it out for any chosen value of $\bar{E}$) against $\bar{E}$, we'll get a curve something like the
one in Figure 1.3; and whenever the curve crosses the horizontal axis we'll have $\Delta = 0$,
the compatibility condition will be satisfied and that value of $\bar{E}$ will allow you to solve
the secular equations. For other values you just can't do it!
On the far left in Fig. 1.3, $\Delta$ will become indefinitely large and positive because its expansion is a polynomial dominated by the term $-\bar{E}^3$ and $\bar{E}$ is negative. On the other
side, where $\bar{E}$ is positive, the curve on the far right will go off to large negative values. In
between there will be three crossing points, showing the acceptable energy values.
Now let's look at the effect of increasing the number of basis functions by adding another,
$\Phi_4$. The value of the secular determinant then changes and, since expansion gives a
polynomial of degree 4, it will go towards $+\infty$ for large values of $\bar{E}$. Figure 1.3 shows that
there are now four crossing points on the x-axis and therefore four acceptable solutions
of the secular equations. The corresponding energy levels for N = 3 and N = 4 are
compared in Figure 1.4, where the first three are seen to go down, while one new level
($\bar{E}_4$) appears at higher energy. The levels for N = 4 fall in between the levels above and
below for N = 3 and this result is often called the separation theorem: it can be proved
properly by studying the values of the determinant $\Delta_N(\bar{E})$ for values of $\bar{E}$ at the crossing
points of $\Delta_{N-1}(\bar{E})$.
The conclusion is that, as more and more basis functions are added, the roots of the
secular determinant go steadily (or monotonically) down and will therefore approach
limiting values. The first of these, $\bar{E}_1$, is known to be an upper bound to the exact lowest
eigenvalue of H (i.e. the ground state energy of the system) and it now appears that the higher
roots will give upper bounds to the higher 'excited' states. For this conclusion to be true
it is necessary that the chosen basis functions form a complete set.
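The way the levels for N = 4 fall between those for N = 3 can be seen in a small numerical experiment. The Python sketch below is only an illustration under stated assumptions: the 4×4 matrix is made up (real, symmetric, orthonormal basis), the N = 3 problem is taken as its leading 3×3 block, and the crossing points of the secular determinant are located by a simple scan followed by bisection rather than by any library routine.

```python
def det(A):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in A]
    n, value = len(A), 1.0
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(A[r][k]))
        if A[p][k] == 0.0:
            return 0.0
        if p != k:
            A[k], A[p] = A[p], A[k]
            value = -value          # a row swap changes the sign
        value *= A[k][k]
        for r in range(k + 1, n):
            f = A[r][k] / A[k][k]
            for col in range(k, n):
                A[r][col] -= f * A[k][col]
    return value

def secular(H, E):
    """Secular determinant of (H - E*1)."""
    n = len(H)
    return det([[H[i][j] - (E if i == j else 0.0) for j in range(n)]
                for i in range(n)])

def roots(H, lo, hi, steps=2000):
    """Crossing points of the secular determinant: scan, then bisect."""
    found = []
    width = (hi - lo) / steps
    for s in range(steps):
        a, b = lo + s * width, lo + (s + 1) * width
        fa, fb = secular(H, a), secular(H, b)
        if fa * fb < 0.0:
            for _ in range(60):
                mid = 0.5 * (a + b)
                fm = secular(H, mid)
                if fa * fm <= 0.0:
                    b = mid
                else:
                    a, fa = mid, fm
            found.append(0.5 * (a + b))
    return found

# Hypothetical matrix elements, for illustration only.
H4 = [[1.0, 0.5, 0.2, 0.1],
      [0.5, 2.0, 0.3, 0.2],
      [0.2, 0.3, 3.0, 0.4],
      [0.1, 0.2, 0.4, 4.0]]
H3 = [row[:3] for row in H4[:3]]

levels3 = roots(H3, -2.0, 7.0)
levels4 = roots(H4, -2.0, 7.0)
print(levels3)
print(levels4)
```

Each of the three N = 3 levels should lie between two neighbouring N = 4 levels, so the first three roots move down (or stay put) and one new root appears at the top.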
So far, in the last section, we've been thinking of linear variation functions in general,
without saying much about the forms of the expansion functions and how they can be
constructed; but for atoms and molecules they may be functions of many variables (e.g.
coordinates $x_1, y_1, z_1, x_2, y_2, z_2, x_3, \ldots z_N$ for N particles, even without including spins!).
From now on we'll be dealing mainly with wave functions built up from one-particle
functions, which from now on we'll denote by lower-case letters $\{\phi_k(\mathbf{r}_i)\}$ with the index i
labelling 'Particle i' and $\mathbf{r}_i$ standing for all three variables needed to indicate its position
in space (spin will be put in later); as usual the subscript on the function will just indicate
which one of the whole set (k = 1, 2, ... n) we mean. (It's a pity so many labels are needed,
and that sometimes we have to change their names, but by now you must be getting used
to the fact that you're playing a difficult game: once you're clear about what the symbols
stand for the rest will be easy!)
Let's start by thinking again of the simplest case; one particle, moving in one dimension,
so the particle label i is not needed and $\mathbf{r}$ can be replaced by just one variable, x. Instead
of $\phi_k(\mathbf{r}_i)$ we can then use $\phi_k(x)$. We want to represent any function f(x) as a linear
combination of these basis functions and we'll write
$$f^{(n)}(x) = c_1\phi_1(x) + c_2\phi_2(x) + \ldots + c_n\phi_n(x)$$
So instead let's measure the difference by $|f(x) - f^{(n)}(x)|^2$, at any point, and the total
difference by
$$D = \int_a^b \Delta^2(x)\,dx = \int_a^b |f(x) - f^{(n)}(x)|^2\,dx.$$
The integral gives the sum of the areas of all the strips between x = a and x = b of
height $\Delta^2$ and width dx. This quantity will measure the error when the whole curve is
approximated by $f^{(n)}(x)$ and we'll only get a really good fit, over the whole range of x,
when D is close to zero.
The coefficients $c_k$ should be chosen to give D its lowest possible value and you know
how to do that: for a function of one variable you find a minimum value by first seeking
a 'turning point' where $(df/dx) = 0$; and then check that it really is a minimum, by
verifying that $(d^2f/dx^2)$ is positive. It's just the same here, except that we look at
the variables one at a time, keeping the others constant. Remember too that it's the
coefficients $c_k$ that we're going to vary, not x.
Now let's put (1.17) into (1.18) and try to evaluate D. You first get (dropping the usual
variable x and the limits a, b when they are obvious)
$$D = \int |f - f^{(n)}|^2\,dx = \int f^2\,dx + \int (f^{(n)})^2\,dx - 2\int f f^{(n)}\,dx.$$
So there are three terms to differentiate (only the last two really, because the first
doesn't contain any $c_k$ and so will disappear when you start differentiating). These two
terms are very easy to deal with if you make use of the supposed orthonormality of the
expansion functions: for real functions $\int\phi_k^2\,dx = 1$, $\int\phi_k\phi_l\,dx = 0$ ($k \neq l$). Using these
two properties, we can go back to (1.19) and differentiate the last two terms, with respect
to each $c_k$ (one at a time, holding the others fixed): the first of the two terms leads to
$$\frac{\partial}{\partial c_k}\int (f^{(n)})^2\,dx = \frac{\partial}{\partial c_k}\,c_k^2\int\phi_k(x)^2\,dx = 2c_k;$$
while the second gives
$$-2\frac{\partial}{\partial c_k}\int f f^{(n)}\,dx = -2\frac{\partial}{\partial c_k}\,c_k\int f(x)\phi_k(x)\,dx = -2\langle f|\phi_k\rangle,$$
where Dirac notation (see Chapter 9 of Book 11) has been used for the integral $\int f(x)\phi_k(x)\,dx$,
which is the scalar product of the two functions f(x) and $\phi_k(x)$:
$$\langle f|\phi_k\rangle = \int f(x)\phi_k(x)\,dx.$$
We can now do the differentiation of the whole difference function D in (1.18). The result is
$$\frac{\partial D}{\partial c_k} = 2c_k - 2\langle f|\phi_k\rangle$$
and this tells us immediately how to choose the coefficients in the n-term approximation
(1.17) so as to get the best possible fit to the given function f(x): setting all the derivatives
equal to zero gives
$$c_k = \langle f|\phi_k\rangle \quad (\text{for all } k).$$
So it's really very simple: you just have to evaluate one integral to get any coefficient
you want. And once you've got it, there's never any need to change it in getting a better
approximation. You can make the expansion as long as you like by adding more terms,
but the coefficients of the ones you've already done are final. Moreover, the results are
quite general: if you use basis functions that are no longer real you only need change the
definition of the scalar product, taking instead the Hermitian scalar product as in (1.1).
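A small numerical experiment shows both points: each coefficient is just one integral, and adding more terms never changes the coefficients already found. The Python sketch below is an illustration under assumptions that are not in the text: the function fitted is $f(x) = x(1 - x)$ on the interval (0, 1), the orthonormal basis is the set of sine functions $\sqrt{2}\sin(k\pi x)$, and the integrals are done with Simpson's rule.

```python
import math

def integrate(g, a, b, n=2000):
    """Composite Simpson rule with n (even) intervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def phi(k):
    """Orthonormal sine functions on (0, 1): an assumed example basis."""
    return lambda x: math.sqrt(2.0) * math.sin(k * math.pi * x)

def f(x):
    """The function to be fitted (an arbitrary example)."""
    return x * (1.0 - x)

def coefficients(n):
    """c_k = <f|phi_k> for k = 1 .. n, one integral per coefficient."""
    return [integrate(lambda x, k=k: f(x) * phi(k)(x), 0.0, 1.0)
            for k in range(1, n + 1)]

def total_difference(c):
    """D = integral of |f - f_n|^2 for the expansion with coefficients c."""
    def squared_error(x):
        fit = sum(ck * phi(k + 1)(x) for k, ck in enumerate(c))
        return (f(x) - fit) ** 2
    return integrate(squared_error, 0.0, 1.0)

c3, c5 = coefficients(3), coefficients(5)
D3, D5 = total_difference(c3), total_difference(c5)
print(c3)
print(c5)
print(D3, D5)
```

The first three entries of the five-term list repeat the three-term list, and the total difference D gets smaller when the extra terms are included.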
In studying atoms and molecules we'll have to deal with functions of very many variables,
not just one. But some of the examples we met in Book 11 suggest possible ways of
proceeding. Thus, in going from the harmonic oscillator in one dimension (Example 4.3),
with eigenfunctions $\phi_k(x)$, to the 3-dimensional oscillator (Example 4.4) it was possible
to find eigenfunctions of product form, each of the three factors being of 1-dimensional
form. The same was true for a particle in a rectangular box; and also for a free particle.
To explore such possibilities more generally we first ask if a function of two variables, $x$
and $x'$, defined for $x$ in the interval $(a, b)$ and $x'$ in $(a', b')$, can be expanded in products
of the form $\phi_i(x)\phi_j(x')$. Suppose we write (hopefully!)
$$f(x, x') = \sum_{i,j} c_{ij}\,\phi_i(x)\phi_j(x')$$
where the set $\{\phi_i(x)\}$ is complete for functions of $x$ defined in $(a, b)$, while $\{\phi_i(x')\}$ is
complete for functions of $x'$ defined in $(a', b')$. Can we justify (1.26)? A simple argument
suggests that we can.
For any given value of the variable $x'$ we may safely take (if $\{\phi_i(x)\}$ is indeed complete)
$$f(x, x') = c_1\phi_1(x) + c_2\phi_2(x) + \ldots + c_i\phi_i(x) + \ldots$$
where the coefficients must depend on the chosen value of $x'$. But then, because $\{\phi_i(x')\}$
is also supposed to be complete, for functions of $x'$ in the interval $(a', b')$, we may express
the general coefficient $c_i$ in the previous expansion as
$$c_i = c_{i1}\phi_1(x') + c_{i2}\phi_2(x') + \ldots + c_{ij}\phi_j(x') + \ldots$$
On putting this expression for $c_i$ in the first expansion we get the double summation postulated in (1.26) (as you should verify!). If the variables $x$, $x'$ are interpreted as Cartesian
coordinates the expansion may be expected to hold good within the rectangle bounded
by the limits of the two intervals.
Of course, this argument would not satisfy any pure mathematician; but the further
generalizations it suggests have been found satisfactory in a wide range of applications in
where the same set of orbitals $\{\phi_i\}$ is used for each of the identical particles, the two
factors in the product being functions of the different particle variables $\mathbf{r}_1$, $\mathbf{r}_2$. Here a
boldface letter $\mathbf{r}$ stands for the set of three variables (e.g. Cartesian coordinates) defining
the position of a particle at point $\mathbf{r}$. The labels i and j run over all the orbitals of the (in
principle) complete set, or (in practice) over all values 1, 2, 3, ... n, in the finite set used
in constructing an approximate wave function.
In Chapter 2 you will find applications to 2-electron atoms and molecules where the wave
functions are built up from one-centre orbitals of the kind studied in Book 11. (You can
find pictures of atomic orbitals there, in Chapter 3.)
Chapter 2
Some two-electron systems
For two electrons moving in the field provided by one or more positively charged nuclei
(supposedly fixed in space), the Hamiltonian takes the form
$$H(1, 2) = h(1) + h(2) + g(1, 2)$$
where H(1, 2) operates on the variables of both particles, while h(i) operates on those of
Particle i alone. (Don't get mixed up with names of the indices: here i = 1, 2 label the
two electrons.) The one-electron Hamiltonian h(i) has the usual form (see Book 11)
$$h(i) = -\tfrac{1}{2}\nabla^2(i) + V(i),$$
the first term being the kinetic energy (KE) operator and the second being the potential
energy (PE) of Electron i in the given field. The operator g(1, 2) in (2.1) is simply the
interaction potential, $e^2/\kappa_0 r_{ij}$, expressed in 'atomic units' (see Book 11).¹ So in (2.1) we take
$$g(1, 2) = g(2, 1) = \frac{1}{r_{12}},$$
$r_{12}$ being the inter-electron distance. To get a very rough estimate of the total energy E,
we may neglect this term altogether and use an approximate Hamiltonian
$$H_0(1, 2) = h(1) + h(2),$$
which describes an Independent Particle Model of the system. The resultant IPM
approximation is fundamental to all that will be done in Book 12.
¹ A fully consistent set of units on an 'atomic' scale is obtained by taking the mass and charge of
the electron (m, e) to have unit values, along with the action $\hbar = h/2\pi$. Other units are $\kappa_0 = 4\pi\epsilon_0$ ($\epsilon_0$
being the electric permittivity of free space); length $a_0 = \hbar^2\kappa_0/me^2$ and energy $e_H = me^4/\kappa_0^2\hbar^2$.
These quantities may be set equal to unity wherever they appear, leading to a great simplification of all
equations. If the result of an energy calculation is the number x this just means that $E = x e_H$; similarly
a distance calculation would give $L = x a_0$.
With a Hamiltonian of this IPM form we can look for a solution of product form and
use the 'separation method' (as in Chapter 4 of Book 11). We therefore look for a wave
function $\Psi(\mathbf{r}_1, \mathbf{r}_2) = \psi_m(\mathbf{r}_1)\psi_n(\mathbf{r}_2)$. Here each factor is a function of the position variables
of only one of the two electrons, indicated by $\mathbf{r}_1$ or $\mathbf{r}_2$, and (to be general!) Electron 1 is
described by a wave function $\psi_m$ while Electron 2 is described by $\psi_n$.
On substituting this product in the eigenvalue equation $H_0\Psi = E\Psi$ and dividing throughout by $\Psi$ you get (do it!)
$$\frac{h(1)\psi_m(\mathbf{r}_1)}{\psi_m(\mathbf{r}_1)} + \frac{h(2)\psi_n(\mathbf{r}_2)}{\psi_n(\mathbf{r}_2)} = E.$$
Now the two terms on the left-hand side are quite independent, involving different sets of
variables, and their sum can be a constant E, only if each term is separately a constant.
Calling the two constants $\epsilon_m$ and $\epsilon_n$, the product $\Psi_{mn}(\mathbf{r}_1, \mathbf{r}_2) = \psi_m(\mathbf{r}_1)\psi_n(\mathbf{r}_2)$ will satisfy
the eigenvalue equation provided
$$h(1)\psi_m(\mathbf{r}_1) = \epsilon_m\psi_m(\mathbf{r}_1), \qquad h(2)\psi_n(\mathbf{r}_2) = \epsilon_n\psi_n(\mathbf{r}_2).$$
The total energy will then be
$$E = \epsilon_m + \epsilon_n.$$
This means that the orbital product is an eigenfunction of the IPM Hamiltonian provided $\psi_m$ and $\psi_n$ are any solutions of the one-electron eigenvalue equation
$$h\psi(\mathbf{r}) = \epsilon\psi(\mathbf{r}).$$
Note especially that the names given to the electrons, and to the corresponding variables
$\mathbf{r}_1$ and $\mathbf{r}_2$, don't matter at all. The same equation applies to each electron and $\psi = \psi(\mathbf{r})$
is a function of position for whichever electron we're thinking of: that's why the labels 1
and 2 have been dropped in the one-electron equation (2.6). Each electron has its own
orbital energy, depending on which solution we choose to describe it, and since $H_0$ in
(2.4) does not contain any interaction energy it is not surprising that their sum gives the
total energy E. We often say that the electron is 'in' or 'occupies' the orbital chosen to
describe it. If Electron 1 is in $\psi_m$ and Electron 2 is in $\psi_n$, then the two-electron function
$$\Psi_{mn}(\mathbf{r}_1, \mathbf{r}_2) = \psi_m(\mathbf{r}_1)\psi_n(\mathbf{r}_2)$$
will be an exact eigenfunction of the IPM Hamiltonian (2.4), with eigenvalue (2.5).
For example, putting both electrons in the lowest energy orbital, $\psi_1$ say, gives a wave
function $\Psi_{11}(\mathbf{r}_1, \mathbf{r}_2) = \psi_1(\mathbf{r}_1)\psi_1(\mathbf{r}_2)$ corresponding to total energy $E = 2\epsilon_1$. This is the
(strictly!) IPM description of the ground state of the system. To improve on this
approximation, which is very crude, we must allow for electron interaction: the next
approximation is to use the full Hamiltonian (2.1) to calculate the energy expectation
value for the IPM function (no longer an eigenfunction of H). Thus, for $\Psi_{11}(\mathbf{r}_1, \mathbf{r}_2) = \psi_1(\mathbf{r}_1)\psi_1(\mathbf{r}_2)$,
$$\bar{E} = \langle\Psi_{11}|H|\Psi_{11}\rangle = 2\langle\psi_1|h|\psi_1\rangle + \langle\psi_1\psi_1|g|\psi_1\psi_1\rangle,$$
where the first term on the right is simply twice the energy of one electron in orbital $\psi_1$,
namely $2\epsilon_1$. The second term involves the two-electron operator given in (2.3) and has
explicit form
$$\langle\psi_1\psi_1|g|\psi_1\psi_1\rangle = \int \psi_1^*(\mathbf{r}_1)\psi_1^*(\mathbf{r}_2)\,g(1, 2)\,\psi_1(\mathbf{r}_1)\psi_1(\mathbf{r}_2)\,d\mathbf{r}_1 d\mathbf{r}_2.$$
Here the variables in the bra and the ket will always be labelled in the order 1, 2 and the
volume element $d\mathbf{r}_1$, for example, will refer to integration over all particle variables (e.g.
in Cartesian coordinates it is $dx_1 dy_1 dz_1$). (Remember also that, in bra-ket notation, the
functions that come from the bra should in general carry the star (complex conjugate);
and even when the functions are real it is useful to keep the star.)
To evaluate the integral we need to know the form of the 1-electron wave function $\psi_1$, but
the expression (2.9) is a valid first approximation to the electron repulsion energy in the
ground state of any 2-electron system.
Let's start with the Helium atom, with just two electrons moving in the field of a nucleus
of charge Z = 2.
The function (2.7) is clearly normalized when, as we suppose, the orbitals themselves
(which are now atomic orbitals) are normalized; for
$$\langle\psi_1\psi_1|\psi_1\psi_1\rangle = \int \psi_1^*(\mathbf{r}_1)\psi_1^*(\mathbf{r}_2)\psi_1(\mathbf{r}_1)\psi_1(\mathbf{r}_2)\,d\mathbf{r}_1 d\mathbf{r}_2 = \langle\psi_1|\psi_1\rangle\langle\psi_1|\psi_1\rangle = 1 \times 1.$$
The approximate energy (2.8) is then
$$\bar{E} = 2\epsilon_1 + \langle\psi_1\psi_1|g|\psi_1\psi_1\rangle = 2\epsilon_1 + J_{11}.$$
Here $\epsilon_1$ is the orbital energy of an electron, by itself, in orbital $\psi_1$ in the field of the nucleus;
the 2-electron term $J_{11}$ is often called a 'Coulomb integral' because it corresponds to the
Coulombic repulsion energy (see Book 10) of two distributions of electric charge, each
of density $|\psi_1(\mathbf{r})|^2$ per unit volume. For a hydrogen-like atom, with atomic number Z,
we know that $\epsilon_1 = -\tfrac{1}{2}Z^2 e_H$. When the Coulomb integral is evaluated it turns out to
be $J_{11} = (5/8)Z e_H$ and the approximate energy thus becomes $\bar{E} = -Z^2 + (5/8)Z$ in
'atomic' units of $e_H$. With Z = 2 this gives a first estimate of the electronic energy of the
Helium atom in its ground state: $\bar{E} = -2.75\,e_H$, compared with an experimental value
$-2.90374\,e_H$.
To improve the ground state wave function we may use the variation method as in Section
1.2 by choosing a new function $\psi_1' = N' e^{-Z'r}$, where $Z'$ takes the place of the actual nuclear
charge and is to be treated as an adjustable parameter. This allows the electron to 'feel'
an 'effective nuclear charge' a bit different from the actual Z = 2. The corresponding
normalizing factor $N'$ will have to be chosen so that
$$\langle\psi_1'|\psi_1'\rangle = N'^2\int_0^\infty \exp(-2Z'r)\,(4\pi r^2)\,dr = 1$$
and this gives (prove it!) $N'^2 = Z'^3/\pi$.
The energy expectation value still has the form (2.8) and the terms can be evaluated
Example 2.1 Evaluation of the one-electron term
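The first 1-electron operator has an expectation value $\langle\Psi_{11}|h(1)|\Psi_{11}\rangle = \langle\psi_1'|h|\psi_1'\rangle\langle\psi_1'|\psi_1'\rangle$, a matrix element of the operator h times the scalar product $\langle\psi_1'|\psi_1'\rangle$. In full, this is
$$\langle\Psi_{11}|h(1)|\Psi_{11}\rangle = N'^2\int_0^\infty e^{-Z'r}\,h\,e^{-Z'r}\,4\pi r^2\,dr \times N'^2\int_0^\infty e^{-Z'r}e^{-Z'r}\,4\pi r^2\,dr,$$
where h stands for $(-\tfrac{1}{2}\nabla^2 - Z/r)$.
We can spare ourselves some work by noting that if we put $Z = Z'$ the function $\psi_1' = N' e^{-Z'r}$ becomes
an eigenfunction of $(-\tfrac{1}{2}\nabla^2 - Z'/r)$ with eigenvalue $\epsilon' = -\tfrac{1}{2}Z'^2$ ($Z'$ being a 'pretend' value of Z). So
$$h = -\tfrac{1}{2}\nabla^2 - Z/r = (-\tfrac{1}{2}\nabla^2 - Z'/r) + (Z' - Z)/r,$$
where the operator in parentheses is easy to handle: when it works on $\psi_1'$ it simply multiplies it by the
eigenvalue $-\tfrac{1}{2}Z'^2$. Thus
$$h(N' e^{-Z'r}) = \left[-\tfrac{1}{2}Z'^2 + \frac{(Z' - Z)}{r}\right]N' e^{-Z'r}.$$
The one-electron part of (2.8) can now be written as (two equal terms, say why!) $2\langle\Psi_{11}|h(1)|\Psi_{11}\rangle$ where
$$\begin{aligned}
\langle\Psi_{11}|h(1)|\Psi_{11}\rangle &= \langle\psi_1'|h|\psi_1'\rangle\langle\psi_1'|\psi_1'\rangle\\
&= N'^2\int_0^\infty e^{-Z'r}\,h\,e^{-Z'r}\,4\pi r^2\,dr \times N'^2\int_0^\infty e^{-2Z'r}\,4\pi r^2\,dr\\
&= N'^2\int_0^\infty e^{-Z'r}\left[-\tfrac{1}{2}Z'^2 + \frac{(Z' - Z)}{r}\right]e^{-Z'r}\,4\pi r^2\,dr.
\end{aligned}$$
Here the last integral on the second line is unity (normalization) and leaves only the one before it. This
remaining integration gives (check it out!) $\langle\Psi_{11}|h(1)|\Psi_{11}\rangle = -\tfrac{1}{2}Z'^2 + 4\pi(Z' - Z)N'^2\int_0^\infty (r e^{-2Z'r})\,dr$ and
from the simple definite integral $\int_0^\infty x e^{-ax}\,dx = (1/a^2)$ it follows that
$$\langle\Psi_{11}|h(1)|\Psi_{11}\rangle = -\tfrac{1}{2}Z'^2 + 4\pi(Z' - Z)N'^2(1/2Z')^2 = -\tfrac{1}{2}Z'^2 + Z'(Z' - Z).$$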
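Example 2.1 has given the expectation value of the h(1) term in (2.8), but h(2) must give
an identical result since the only difference is a change of electron label from 1 to 2. With the Coulomb
integral taken as $J = (5/8)Z'$ for an orbital with exponent $Z'$, the total energy becomes
$\bar{E}(Z') = Z'^2 - 2ZZ' + (5/8)Z'$. This is lowest when $Z' = Z - 5/16$; and for Helium (Z = 2)
it gives $\bar{E} = -(27/16)^2 \approx -2.848\,e_H$,
a value which compares with $-2.75\,e_H$ before the variation of $Z'$ and is much closer to
the 'exact' value of $-2.90374\,e_H$ obtained using a very elaborate variation function.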
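The variation of the effective nuclear charge is easy to try out numerically. The Python sketch below is a minimal illustration under two stated assumptions: the one-electron term is taken from the end of Example 2.1 in the form $-\tfrac{1}{2}Z'^2 + Z'(Z' - Z)$ for each electron, and the Coulomb integral is assumed to scale as $(5/8)Z'$ when the orbital exponent is $Z'$. The function names are made up, and energies are in units of $e_H$.

```python
def helium_energy(z_eff, z_nuc=2.0):
    """Energy of the orbital product with exponent z_eff, in units of e_H."""
    # One-electron term for each electron: -Z'^2/2 + Z'(Z' - Z)
    one_electron = 0.5 * z_eff ** 2 - z_nuc * z_eff
    # Coulomb integral, assumed to be (5/8) Z' for exponent Z'
    coulomb = 0.625 * z_eff
    return 2.0 * one_electron + coulomb

def best_exponent(z_nuc=2.0):
    """Stationary point of the quadratic: dE/dZ' = 2Z' - 2Z + 5/8 = 0."""
    return z_nuc - 5.0 / 16.0

z_best = best_exponent()
e_best = helium_energy(z_best)
e_unscaled = helium_energy(2.0)
print(z_best, e_best, e_unscaled)
```

Under these assumptions the best exponent is a little smaller than the true nuclear charge, as if each electron partly screens the nucleus from the other, and the energy drops below the value found with $Z' = Z = 2$.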
Before moving on, we should make sure that the value used for the Coulomb integral
$J = (5/8)Z e_H$ is correct.² This is our first example of a 2-electron integral: for two
electrons in the same orbital $\psi$ it has the form (2.9), namely (dropping the orbital label '1')
$$J = \int \psi^*(\mathbf{r}_1)\psi^*(\mathbf{r}_2)\,g(1, 2)\,\psi(\mathbf{r}_1)\psi(\mathbf{r}_2)\,d\mathbf{r}_1 d\mathbf{r}_2.$$
To evaluate it, we start from Born's interpretation of the wave function, $|\psi(\mathbf{r})|^2 = \psi(\mathbf{r})\psi^*(\mathbf{r})$
(the star allowing the function to be complex), as a probability density. It is the
probability per unit volume of finding the electron in a small element of volume $d\mathbf{r}$ at
Point $\mathbf{r}$ and will be denoted by $\rho(\mathbf{r}) = \psi(\mathbf{r})\psi^*(\mathbf{r})$. As you know from Book 11, this
interpretation is justified by countless experimental observations.
We now go a step further: the average value of any quantity $f(\mathbf{r})$ that depends only on the
instantaneous position of the moving electron will be given by $\bar{f} = \int f(\mathbf{r})\rho(\mathbf{r})\,d\mathbf{r}$ where,
as usual, the integration is over all space (i.e. all values of the electronic variables). Now
the electron carries a charge $-e$ and produces a potential field $V_{\mathbf{r}'}$ at any chosen 'field
point' $\mathbf{r}'$.
² If you find the proof too difficult, just take the result on trust and keep moving!
It's convenient to use $\mathbf{r}_1$ for the position of the electron (instead of $\mathbf{r}$) and to use $\mathbf{r}_2$ for
the second point, at which we want to get the potential $V_{\mathbf{r}_2}$. This will have the value
$V_{\mathbf{r}_2} = -e/\kappa_0|\mathbf{r}_{21}|$, where $|\mathbf{r}_{21}| = |\mathbf{r}_{12}| = r_{12}$ is the distance between the electron at $\mathbf{r}_1$ and
the field point $\mathbf{r}_2$.
When the electron moves around, its position being described by the probability density
$\rho(\mathbf{r}_1)$, the electric potential it produces at any point $\mathbf{r}_2$ will then have an average value
$$\bar{V}(\mathbf{r}_2) = \int \frac{-e}{\kappa_0|\mathbf{r}_{21}|}\,\rho(\mathbf{r}_1)\,d\mathbf{r}_1.$$
In words, this means that
The average electric field at point $\mathbf{r}_2$, produced by an
electron at point $\mathbf{r}_1$ with probability density $\rho(\mathbf{r}_1)$, can
then be calculated just as if the 'point' electron were
'smeared out' in space, with a charge density $-e\rho(\mathbf{r}_1)$.
The statement (2.13) provides the charge cloud picture of the probability density. It
allows us to visualize very clearly, as will be seen later, the origin of many properties of
atoms and molecules. As a first application let's look at the Coulomb integral J.
Example 2.2 Interpretation of the electron interaction.
The integral J can now be viewed as the interaction energy of two distributions of electric charge, both
of density $-e\rho(\mathbf{r})$ and of spherical form (one on top of the other). (If that seems like nonsense remember
this is only a mathematical interpretation!)
The two densities are in this case $\rho_1(\mathbf{r}_1) = N^2\exp(-2Zr_1)$ and $\rho_2(\mathbf{r}_2) = N^2\exp(-2Zr_2)$; and the integral
we need follows on putting the interaction potential $V(r_1, r_2) = 1/r_{12}$ between the two and integrating
over all positions of both points. Thus, giving $e$ and $\kappa_0$ their unit values, J becomes the double integral
$$J = N^4\int \exp(-2Zr_1)\,\frac{1}{r_{12}}\,\exp(-2Zr_2)\,d\mathbf{r}_1 d\mathbf{r}_2,$$
where $(1/r_{12})$ is simply the inverse distance between the two integration points. On the other hand,
$d\mathbf{r}_1$ and $d\mathbf{r}_2$ are 3-dimensional elements of volume; and when the charge distributions are spherically
symmetrical functions of distance ($r_1$, $r_2$) from the origin (the nucleus), they may be divided into spherical
shells of charge. The density is then constant within each shell, of thickness dr; and each holds a total
charge $4\pi r^2 dr \times \rho(r)$, the density being a function of radial distance (r) alone.
Now comes a nice connection with Electrostatics, which you should read about again in Book 10, Section
1.4. Before going on you should pause and study Figure 2.2, to have a clear picture of what we must do.
Example 2.2 perhaps gave you an idea of how difficult it can be to deal with 2-electron
integrals. The diagram below will be helpful if you want to actually evaluate J, the
simplest one we've come across.
The integral J gives the electrostatic potential energy of two spherical charge distributions.
Each could be built up from spherical shells (like an onion): these are shown in blue, one
for Electron 1 having radius $r_1$ and another for Electron 2 with radius $r_2$. The distance
between the two shells is shown with label $r_{12}$ and this determines their potential energy
as the product of the total charges they contain ($4\pi r_1^2 dr_1$ and $4\pi r_2^2 dr_2$) times the inverse
distance ($r_{12}^{-1}$). The total potential energy is obtained by summing (integrating) over all
shells, but you need a trick! At any distance r from the nucleus, the potential due to an
inner shell ($r_1 < r$) is constant until $r_1$ reaches r and changes form; so the first integration
breaks into two parts, giving a result which depends only on where you put r (indicated
by the broken line).
Example 2.3 Evaluation of the electron interaction integral, J
To summarize, J arises as the interaction energy of all pairs of spherical shells of charge, shown (blue) in
Figure 2.2, and this will come from integration over all shells. We take one pair at a time.
You know from Book 10 that the electrostatic potential at distance r from the origin (call it V(r)) due
to a spherical shell of charge, of radius $r_1$, is given by
$$V(r) = Q_{r_1} \times \frac{1}{r_1} \ \text{ for } r < r_1, \qquad V(r) = Q_{r_1} \times \frac{1}{r} \ \text{ for } r > r_1,$$
where $Q_{r_1} = 4\pi r_1^2 dr_1 \times \rho(r_1)$ is the total charge contained in the shell of radius $r_1$ and thickness $dr_1$. The
potential is thus constant within the first shell; but outside has a value corresponding to all the charge
being put at the origin.
We can now do the integration over the variable $r_1$ as it goes from 0 to $\infty$. For $r_1 < r$ the sum of the
contributions to J from the shells within a sphere of radius r will be
$$A = \frac{1}{r}\int_0^r N^2\exp(-2Zr_1)\,4\pi r_1^2\,dr_1,$$
while for $r_1 > r$ the rest of the $r_1$ integration will give the sum of contributions from shells of radius
greater than r, namely
$$B = \int_r^\infty N^2\exp(-2Zr_1)\,(1/r_1)\,4\pi r_1^2\,dr_1.$$
You've met integrals a bit like these in Chapter 1, so you know how to do them and can show (do it!)
that the sum of A and B is the potential function
$$V(r) = \frac{Z}{r}\,[2 - e^{-r}(2 + r)],$$
where, to keep the formula short, r on the right-hand side is measured in units of 1/(2Z) (so it stands for 2Zr).
This is a function of r alone, the radius of the imaginary sphere that we used to separate the integration
over $r_1$ into two parts, so now we can put $r = r_2$ and multiply by the charge $(4\pi r_2^2 dr_2)N^2 e^{-2Zr_2}$ of a shell
(which is $\tfrac{1}{2}r_2^2 e^{-r_2}dr_2$ in the same scaled variable, with $N^2 = Z^3/\pi$) to obtain the energy
of one shell of the second charge distribution in the field generated by the first.
After that it's all plain sailing: the integration over all the outer shells ($r_2$) now goes from 0 to $\infty$ and
you're home and dry! Integration over $r_2$, for all shells from $r_2 = 0$ to $\infty$, will then give (check it out!)
$$J = \frac{Z}{2}\int_0^\infty [2 - e^{-r_2}(2 + r_2)]\,e^{-r_2}\,r_2\,dr_2 = (5/8)Z.$$
Example 2.3 gave you a small taste of how difficult it can be to actually evaluate the
2-electron integrals that are needed in describing electron interactions.
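If the algebra of Example 2.3 seems heavy, the final result can at least be checked by a direct numerical integration over shells. The Python sketch below is an illustration only, under stated assumptions: it uses the density $\rho(r) = (Z^3/\pi)e^{-2Zr}$, the potential of one such charge cloud in the unscaled closed form $V(r) = (1/r)[1 - e^{-2Zr}(1 + Zr)]$ (atomic units), and Simpson's rule out to a large cut-off radius. The function name and the cut-off are made up for the example.

```python
import math

def coulomb_integral(z, r_max=40.0, n=4000):
    """J for two 1s charge clouds with exponent z, by Simpson's rule."""
    def integrand(r):
        if r == 0.0:
            return 0.0  # the shell charge vanishes at the origin
        # Potential at radius r produced by the first charge cloud
        potential = (1.0 - math.exp(-2.0 * z * r) * (1.0 + z * r)) / r
        # Charge per unit thickness of a shell of the second cloud
        shell_charge = (4.0 * math.pi * r * r
                        * (z ** 3 / math.pi) * math.exp(-2.0 * z * r))
        return potential * shell_charge
    upper = r_max / z   # far enough out for the density to be negligible
    h = upper / n
    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

print(coulomb_integral(1.0), coulomb_integral(2.0))
```

The two printed values should come out close to 5/8 and 5/4, that is (5/8)Z for Z = 1 and Z = 2.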
Now you know how to get a decent wave function for two electrons moving in the field
of a single nucleus (the helium atom) and how the approximation can be improved as
much as you wish by using the variation method with more elaborate trial functions. But
following that path leads into difficult mathematics; so instead let's move on and take a
quick look at some excited states.
First excited states of He
In Book 11 we studied central-field systems, including many-electron atoms, in order to
illustrate the general principles of quantum mechanics. In particular, we looked for sets of
commuting operators associated with observable quantities such as angular momentum,
finding that the angular momentum operators for motion in a central field commuted with
the Hamiltonian H (see Chapter 6 of Book 11) and could therefore take simultaneously
definite eigenvalues, along with the energy. For such a system, the energy eigenstates could
be grouped into series, according to values of the angular momentum quantum numbers
L and M which determine the angular momentum and one of its three components.
But here we are dealing with systems of at most two electrons and the general theory is
not needed: a 2-electron wave function is represented approximately as a product of 1-electron orbitals. And for the Helium atom we are dealing with spherically symmetrical
wave functions, which involve only 's-type' orbitals, with zero angular momentum.
As a first example of an excited state we suppose one of the two electrons in the ground
state, with wave function $\Psi_{11}(\mathbf{r}_1, \mathbf{r}_2) = \psi_1(\mathbf{r}_1)\psi_1(\mathbf{r}_2)$, is 'promoted' into the next higher
orbital $\psi_2$ of the s series. According to equation (6.10) of Book 11 Chapter 6, this AO
corresponds to energy $E_2 = -\tfrac{1}{2}(Z^2/4)$, the whole series being depicted in Figure 13.
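The secular equations for the coefficients $c_2, c_3$ of the functions $\Phi_2, \Phi_3$ are

$$(H_{22} - E)c_2 + H_{23}c_3 = 0,$$
$$H_{32}c_2 + (H_{33} - E)c_3 = 0,$$

and they have a non-trivial solution only when $(H_{22} - E)(H_{33} - E) - H_{23}H_{32} = 0$.
Now we know that $H_{22} = H_{33}$ (say why!) and $H_{32} = H_{23}$ (real matrix elements) and if we call these
quantities $\bar{\alpha}$ and $\bar{\beta}$ the equation becomes $(\bar{\alpha} - E)^2 = \bar{\beta}^2$. The two roots are $(\bar{\alpha} - E) = \pm\bar{\beta}$ and give two
approximate excited-state energies: $E^{(+)} = \bar{\alpha} + \bar{\beta}$ and $E^{(-)} = \bar{\alpha} - \bar{\beta}$.
To end this example let's get the energies of these states, just as we did for the ground state, where we
found $E = 2\epsilon_1 + J_{11}$ in terms of orbital energy $\epsilon_1$ and Coulomb interaction $J_{11}$. (You should read again,
from equation (2.7) to equation (2.8), to remind yourself of how we did it.)
The excited states are linear combinations of the functions $\Phi_2, \Phi_3$, which belong to the configuration
1s2s. Thus $\Phi_2^{(+)}$ for the 'plus combination', with energy $E^{(+)}$, is obtained by putting $E^{(+)} = \bar{\alpha} + \bar{\beta}$
back into the second equation, which shows that $c_3 = c_2$. This state therefore has the (normalized) form
$\Phi_2^{(+)} = (\Phi_2 + \Phi_3)/\sqrt{2}$ and $\Phi_2^{(-)}$ will be similar, with the plus changed to a minus.
The energy expectation value in the state $\Phi_2^{(+)}$ will be

$$E_2^{(+)} = \langle\Phi_2^{(+)}|H|\Phi_2^{(+)}\rangle = [\epsilon_1 + \epsilon_2 + J_{12}] + K_{12},$$

while $E_2^{(-)} = \langle\Phi_2^{(-)}|H|\Phi_2^{(-)}\rangle = [\epsilon_1 + \epsilon_2 + J_{12}] - K_{12}$. (Here $K_{12} = \langle\Phi_2|g|\Phi_3\rangle$,
the 'ket' part of the matrix element $\langle\Phi_2|g|\Phi_3\rangle$ containing the orbitals after exchange of the electron
labels. It's no surprise that $K_{12}$ is called an 'exchange integral'!)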
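The 2 x 2 secular problem above is small enough to check by direct substitution. The sketch below is only an illustration: the function names and the numerical values of the two matrix elements are assumptions, not values taken from the text.

```python
import math


def solve_symmetric_2x2(alpha, beta):
    """Roots and normalized coefficients (c2, c3) of
    (alpha - E) c2 + beta c3 = 0,  beta c2 + (alpha - E) c3 = 0."""
    norm = 1.0 / math.sqrt(2.0)
    return {
        "plus": (alpha + beta, (norm, norm)),     # c3 = +c2
        "minus": (alpha - beta, (norm, -norm)),   # c3 = -c2
    }


def residual(alpha, beta, energy, coeffs):
    """Left-hand sides of the two secular equations; both vanish for a solution."""
    c2, c3 = coeffs
    return ((alpha - energy) * c2 + beta * c3,
            beta * c2 + (alpha - energy) * c3)


# Hypothetical matrix elements (atomic units), chosen only for illustration
alpha, beta = -2.10, 0.04
states = solve_symmetric_2x2(alpha, beta)
for name, (energy, coeffs) in states.items():
    print(name, energy, residual(alpha, beta, energy, coeffs))
```

With a positive off-diagonal element the 'minus' combination comes out lower in energy, which is the ordering used later in the chapter.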
Example 2.4 was tough, but was done in detail because it leads us to tremendously
important conclusions, as you'll see presently. (If you didn't manage to get through it
yourself, don't worry: you can move on and come back to it later.) What matters here is
mainly the way the two wave functions $\Phi_2^{(+)}$ and $\Phi_2^{(-)}$ behave under symmetry operations
that make no apparent change to the system. The two terms in $\Phi_2^{(+)}$ differ only by an
interchange of electronic variables $\mathbf{r}_1, \mathbf{r}_2$ (as you can check from the definitions) and their
sum does not change at all under such an operation: we say the wave function $\Phi_2^{(+)}$ is
symmetric under exchange of the electrons. On the other hand the other state,
with energy $E_2^{(-)} = \bar{\alpha} - \bar{\beta}$, has a wave function $\Phi_2^{(-)} = (\Phi_2 - \Phi_3)/\sqrt{2}$, which changes sign
on exchanging the electrons and is said to be antisymmetric.
We started Book 11, on the basic principles of quantum mechanics, by talking about
the Stern-Gerlach experiment, which showed that a moving electron was not fully described
by giving its position variables x, y, z: it needed also a spin variable s with only two
observable values. But it seems as if we've completely forgotten about spin, using wave
functions that depend only on the position of the electron in space. The reason is simple:
the spin (identified in Book 11 as some kind of internal angular momentum) has such a
small effect on energy levels that it's hardly observable! You can solve the Schrödinger
equation, and get meaningful results, because the usual Hamiltonian operator contains
no spin operators and acts only on the position variables in the wave function. But in
dealing with many-particle systems it's absolutely essential to label states according to
their spin properties: as you will see presently, without spin you and I would not exist;
there would be no Chemistry!
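It's easy to put the spin back into our equations: just as the product function $\Phi_{mn}(\mathbf{r}_1, \mathbf{r}_2) =
\phi_m(\mathbf{r}_1)\phi_n(\mathbf{r}_2)$ was used to describe two independent particles, in states $\phi_m$ and $\phi_n$, so can
we use a product $\phi(\mathbf{r})\theta(s)$ to describe a particle in orbital state $\phi$ and in spin state $\theta$. If
$\phi$ is an eigenstate of the spinless operator $\mathsf{h}$ (with eigenvalue $\epsilon$) and $\theta$ is an eigenstate
of $\mathsf{S}_z$ (with spin component $s = S_z$ along the z-axis), then the product is a simultaneous
eigenstate of both operators:

$$\mathsf{h}[\phi\theta] = (\mathsf{h}\phi)\theta = (\epsilon\phi)\theta = \epsilon[\phi\theta]$$

since the operator $\mathsf{h}$ doesn't touch the $\theta$-factor; and similarly

$$\mathsf{S}_z[\phi\theta] = \phi(\mathsf{S}_z\theta) = \phi(S_z\theta) = S_z[\phi\theta]$$

since the operator $\mathsf{S}_z$ doesn't touch the $\phi$-factor.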
Now the spin-space is only two-dimensional, with basis vectors denoted by $\alpha$ and $\beta$
corresponding to $s = +\frac{1}{2}$ and $s = -\frac{1}{2}$ (in units of $\hbar$), respectively. So for any given orbital
state $\phi$ there will be two alternative possibilities $\phi\alpha$ and $\phi\beta$ when spin is taken into
account. Products of this kind are called spin-orbitals. From now on let's agree to use
Greek letters ($\psi$, $\Psi$) for states with the spin description included, leaving $\phi$, $\Phi$ for 'orbital'
states (as used so far) which don't contain any spin factors. The lower-case (small) letters
will be used for one-electron states, upper-case (capital) letters for many-electron states.
As long as we deal only with a two-electron system, the state vector (or corresponding wave
function) can be expressed as a product of space and spin factors: $\Psi(1, 2) = \Phi(1, 2)\Theta(1, 2)$,
where the electron labels are used to indicate spatial or spin variables for electrons 1 and
2. When we want to be more explicit we'll use a fuller notation, as below.

$$\Psi(\mathbf{x}_1, \mathbf{x}_2) = \Phi(\mathbf{r}_1, \mathbf{r}_2)\Theta(s_1, s_2). \qquad (2.14)$$

Here $\mathbf{x}$ stands for both space and spin variables together, so $\mathbf{x} \equiv \mathbf{r}, s$. This is a neat way
of saying that $\Psi(\mathbf{x}_1, \mathbf{x}_2)$ in (2.14) really means $\Phi(x_1, y_1, z_1, x_2, y_2, z_2)\Theta(s_1, s_2)$!
In the following Example we shall be looking for a simultaneous eigenstate of all commuting operators, which will normally include $H$, $S^2$, $S_z$. We suppose $\Phi(1, 2)$ is an eigenstate
(exact or approximate) of the usual spinless Hamiltonian $H(1, 2)$ and take $\Theta(1, 2)$ as an
eigenstate of total spin of the two particles, i.e. of the operators $S^2$, $S_z$.
Before continuing you should turn back to Section 2.2 of Book 11 and make sure you understand the
properties of the total spin operators $S_x = S_x(1) + S_x(2)$, $S_y = S_y(1) + S_y(2)$, $S_z = S_z(1) + S_z(2)$.
Remember, they follow the same commutation rules as for a single particle and you can define step-up and step-down operators $S^{\pm} = (S_x \pm iS_y)$ in the same way; from them you can set up the operator $S^2$
and show that it has eigenvalues of the form $S(S + 1)$ (in units of $\hbar^2$), where S = 1 ('parallel-coupled'
spins) or S = 0 ('paired' spins). Study especially Example 2.2, which gives the spin eigenstates for a
2-electron system. Apart from normalizing factors, they are:

(1, 1)   $\Theta_{1,1} = \alpha(1)\alpha(2)$
(1, 0)   $\Theta_{1,0} = \alpha(1)\beta(2) + \beta(1)\alpha(2)$
(1, -1)  $\Theta_{1,-1} = \beta(1)\beta(2)$
(0, 0)   $\Theta_{0,0} = \alpha(1)\beta(2) - \beta(1)\alpha(2)$

(Here the S- and M- quantum numbers are shown in parentheses and the state symbol $\Theta$ has been used
to denote a two-electron spin state.)
It's important to know how these eigenstates change under a symmetry operation which has no
observable effect on the system. In this case, all electrons are identical (we can't tell one from another)
so exchanging the labels 1 and 2 (call it $P_{12}$) should be a symmetry operation ($P_{12}\,\alpha(1)\beta(2) = \alpha(2)\beta(1)$
means that Electron 1 goes into the 'down-spin' state, previously occupied by Electron 2, while Electron
2 goes into an 'up-spin' state, but the change is not observable).
If you examine all the spin states listed above you'll see at once that all the states with S = 1 are
unchanged: they are symmetric under the exchange; but the single state with S = 0 changes sign: it is
antisymmetric under exchange, being multiplied by $-1$.
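The exchange symmetry of the four two-electron spin states can be checked mechanically. In the sketch below each state is stored as a dictionary from the pair of one-electron spin labels to a coefficient; that representation, and the labels "a" and "b" for up-spin and down-spin, are assumptions made for this illustration.

```python
import math


def exchange(state):
    """Apply P12: interchange the spin labels of electrons 1 and 2."""
    return {(s2, s1): c for (s1, s2), c in state.items()}


def symmetry(state, tol=1e-12):
    """Classify a two-electron spin state under P12."""
    swapped = exchange(state)
    keys = set(state) | set(swapped)
    if all(math.isclose(swapped.get(k, 0.0), state.get(k, 0.0), abs_tol=tol)
           for k in keys):
        return "symmetric"
    if all(math.isclose(swapped.get(k, 0.0), -state.get(k, 0.0), abs_tol=tol)
           for k in keys):
        return "antisymmetric"
    return "neither"


r = 1.0 / math.sqrt(2.0)
spin_states = {                      # keys are (S, M)
    (1, 1): {("a", "a"): 1.0},
    (1, 0): {("a", "b"): r, ("b", "a"): r},
    (1, -1): {("b", "b"): 1.0},
    (0, 0): {("a", "b"): r, ("b", "a"): -r},
}
for (S, M), state in spin_states.items():
    print(S, M, symmetry(state))
```

A simple product such as a(1)b(2) on its own is classified as "neither", which is why the M = 0 states have to be taken as sum and difference combinations.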
We're now ready to go back and look again at the excited states of the Helium atom,
but with spin included. The complete wave function will now be a 'space-spin' product of
the form $\Psi(1, 2) = \Phi(1, 2)\Theta(1, 2)$, where the two factors are now re-named as agreed in
the run-up to (2.16). Possible choices for the orbital factor are then $\Phi_1$, for the ground
state, with both electrons in the first (lowest-energy) AO $\phi_1$; and $\Phi_2^{(+)}$ or $\Phi_2^{(-)}$, for the
excited states with one electron in the AO $\phi_1$ and the other in the next AO $\phi_2$, with
a 'plus' combination or a 'minus' combination of $\Phi_2, \Phi_3$. The available energy states for
the two-electron atom, without spin, would seem to be:
Ground state. Energy = $E_1$, wave function $\Phi_1$
1st excited state. Energy = $E_2^{(-)}$, wave function $(\Phi_2 - \Phi_3)/\sqrt{2}$ (normalized 'minus' combination)
2nd excited state. Energy = $E_2^{(+)}$, wave function $(\Phi_2 + \Phi_3)/\sqrt{2}$ (normalized 'plus' combination)
What happens when spin is taken into account? When the two electrons are interchanged,
both space and spin variables change:

$$\mathbf{r}_1, \mathbf{r}_2 \rightarrow \mathbf{r}_2, \mathbf{r}_1 \quad\text{and}\quad s_1, s_2 \rightarrow s_2, s_1.$$

But the energy levels are determined essentially by the $\Phi$ factor; so let's take the states
as listed above and ask what symmetry each state will have when spin is included.
The space-spin product function $\Psi = \Phi\Theta$ for the ground state will have $\Phi = \Phi_1$ which is
symmetric under electron exchange, but may take possible spin factors:

$$\Theta = \Theta_{1,1}, \quad\text{or}\quad \Theta_{1,0}, \quad\text{or}\quad \Theta_{1,-1},$$

which are all symmetric under spin exchange. So three possible products can be found;
all are 'totally' symmetric and correspond to the same energy, suggesting a 'three-fold
degenerate triplet' ground state.
On the other hand, $\Phi_1$ might have been combined with $\Theta_{0,0} = \alpha(1)\beta(2) - \beta(1)\alpha(2)$ and
that would have given a totally antisymmetric space-spin product: a 'non-degenerate
singlet' ground state.
The results we're going to get can be summarized very easily in a diagram showing the first
few energy levels you might expect to find for any two-electron system. The alternatives
we've just found for the ground state correspond to the lowest levels in (a) and (b) of
Figure 2.7:
What about the excited state with energy $E_2^{(-)}$? The antisymmetric space factor $\Phi_2^{(-)}$
could be associated with any of the three symmetric spin factors, to give three antisymmetric space-spin products. But it could equally well be attached to the antisymmetric
spin factor $\Theta_{0,0} = \alpha(1)\beta(2) - \beta(1)\alpha(2)$ to give a single totally symmetric $\Psi$-product.
Finally, the excited state with energy $E_2^{(+)}$ and symmetric space factor $\Phi_2^{(+)}$ could be
associated with the antisymmetric spin factor $\Theta_{0,0}$ to give an antisymmetric space-spin
$\Psi$-product; or equally well combined with any one of the three symmetric spin factors
$\Theta_{1,1}, \Theta_{1,0}, \Theta_{1,-1}$, to give a three-fold degenerate $\Psi$, all products being totally symmetric.
That was quite a lot of work, but the results indicated in Figure 2.7 are rather general
and apply to any two-electron system. As long as there are no spin operators in the
Hamiltonian, the electronic energy depends only on the spatial wave function $\Phi$. But the
nature of any state (whether it is degenerate or non-degenerate and whether or not it
corresponds to definite values of the total spin) depends on the overall symmetry of the
space-spin function $\Psi$. Remember that a state of total spin S has 2S + 1 degenerate
components (labelled by the quantum number $M_S$) and that this is the multiplicity of
the state.
The remarkable fact is that the experimentally observed states correspond only to those
shown in Figure 2.7(b), where the ground state is a singlet and the first excited state is a
triplet. But wait a minute! How can we be sure the state we're calling the 'first excited
state' really is the lowest excited state? If you look back at Example 2.4 you'll see that
the first excited state, going up in energy, was taken to be the one with wave function
$\Phi_2^{(-)}$, namely the 'minus' combination of $\Phi_2$ and $\Phi_3$; and that is the one with energy

$$E_2^{(-)} = [\epsilon_1 + \epsilon_2 + J_{12}] - K_{12}.$$

The 'plus' combination has energy

$$E_2^{(+)} = [\epsilon_1 + \epsilon_2 + J_{12}] + K_{12}$$

and since $K_{12}$ is an essentially positive quantity this energy lies above that of the first
excited state. So we got it right! The energy levels on the right-hand side in Figure 2.7
are in complete agreement with experiment, while those on the left simply do not appear!
Overall antisymmetry of an electronic wave function seems to be an intrinsic property of
the electrons themselves, or of the 'wave field' with which they are described. In fact
this conclusion is perfectly general: it applies not just to two-electron systems but to all
the electrons in the universe! And it is confirmed by countless experiments.
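The bookkeeping behind Figure 2.7 can be written out in a few lines: combine each space factor with each spin factor, keep only the totally antisymmetric products, and record the multiplicity 2S + 1. This is only a sketch; the string labels and function names are assumptions introduced for the illustration.

```python
def multiplicity(total_spin):
    """Number of degenerate components of a state with total spin S."""
    return 2 * total_spin + 1


def product_symmetry(space, spin):
    """Two factors of like symmetry give a symmetric product,
    two of unlike symmetry an antisymmetric one."""
    return "symmetric" if space == spin else "antisymmetric"


space_factors = {
    "Phi_1": "symmetric",          # ground state
    "Phi_2(-)": "antisymmetric",   # 'minus' combination
    "Phi_2(+)": "symmetric",       # 'plus' combination
}
spin_factors = {0: "antisymmetric", 1: "symmetric"}   # S = 0 and S = 1

allowed = [
    (name, S, multiplicity(S))
    for name, space in space_factors.items()
    for S, spin in spin_factors.items()
    if product_symmetry(space, spin) == "antisymmetric"
]
for name, S, mult in allowed:
    print(name, "S =", S, "multiplicity", mult)
```

The surviving combinations are a singlet ground state, a triplet built on the 'minus' space factor and a singlet built on the 'plus' space factor, as in part (b) of the figure.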
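This brings us to the last general principle of quantum mechanics that we're going to need
in Book 12. It wasn't included in Book 11 because in formulating the basic principles
we were thinking mainly of one-particle systems; but the antisymmetry of many-electron
wave functions is just as important as anything we've discovered so far. So let's state the
antisymmetry principle in the general form which applies to systems of any number
of electrons:

   The wave function $\Psi(\mathbf{x}_1, \mathbf{x}_2, ..., \mathbf{x}_N)$ describing any state of an
   N-electron system is antisymmetric for any permutation P of the electrons:
   $P\Psi = \epsilon_P\Psi$, where the parity factor $\epsilon_P = \pm 1$ for permutations
   of even or odd parity, respectively.                                   (2.15)
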
Here P is a general permutation, which acts on the numbered electronic variables
$\mathbf{x}_1, \mathbf{x}_2, ..., \mathbf{x}_N$ and changes them into $\mathbf{x}_{1'}, \mathbf{x}_{2'}, ..., \mathbf{x}_{N'}$, where the new numbers 1', 2', ..., N'
are the old ones written in a different order. This permutation can be achieved by making
a series of transpositions (1, 1')(2, 2')...(N, N'), where each (i, i') interchanges one pair
of numbers, one 'old' and one 'new': thus (1,3)(4,2) will send 1 2 3 4 into 3 4 1 2. Any
permutation is equivalent to a number of transpositions: when the number is odd the
parity of the permutation is said to be 'odd'; when it is even, the parity is 'even'. (Note
that, in counting, (i, i) (where a number is interchanged with itself) is not included, not
being a true transposition.)
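The parity of a permutation is easy to compute by machine. The sketch below applies a list of transpositions to a sequence and finds the parity from the cycle structure (a cycle of length L is equivalent to L - 1 transpositions); both function names are assumptions made for this illustration.

```python
def apply_transpositions(sequence, transpositions):
    """Apply each transposition (a, b) in turn, interchanging the numbers a and b."""
    seq = list(sequence)
    for a, b in transpositions:
        ia, ib = seq.index(a), seq.index(b)
        seq[ia], seq[ib] = seq[ib], seq[ia]
    return seq


def parity(perm):
    """Return +1 (even) or -1 (odd) for a permutation of 1..N given as a list."""
    seen = set()
    sign = 1
    for start in range(1, len(perm) + 1):
        length = 0
        j = start
        while j not in seen:          # walk round the cycle containing 'start'
            seen.add(j)
            j = perm[j - 1]
            length += 1
        if length and length % 2 == 0:
            sign = -sign              # an even-length cycle is an odd permutation
    return sign


# The example from the text: (1,3)(4,2) acting on 1 2 3 4
result = apply_transpositions([1, 2, 3, 4], [(1, 3), (4, 2)])
print(result, parity(result))
```

Two transpositions were used to reach 3 4 1 2, so its parity is even; a single interchange such as 2 1 3 4 is odd.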
Section 2.4 opened with the amazing claim that "without spin you and I would not exist;
there would be no Chemistry!" To end this chapter we must ask how this can be so,
and how the Antisymmetry Principle comes into the picture.
During the early development of quantum theory, before Schrödinger's introduction of the
wave function, the electrons in an atom were assigned to 'states' on a basis of experimental evidence. Atomic spectroscopy had shown that the emission and absorption of light
could be associated with 'quantum jumps' of single electrons between energy levels with
characteristic 'quantum numbers'. (See Book 11 for spectral series and energy level diagrams.) A key postulate in accounting for the electronic structures of atoms was Pauli's
Exclusion Principle, which stated that no two electrons could be in states with the
same set of quantum numbers.
The Antisymmetry Principle is simply the modern and more general form of Pauli's Exclusion Principle [3]. To see how antisymmetry of the wave function contains the idea of 'exclusion' it's enough to go one step beyond the two-electron systems studied in the present
chapter. In an IPM description the first two spin-orbitals might be $\psi_1 = \phi_1\alpha$, $\psi_2 = \phi_1\beta$,
with both electrons in the same orbital $\phi_1$, but with opposite spins. The corresponding
antisymmetric 2-electron state, found in Section 2.4, is then seen to be (before normalization) $\psi_1\psi_2 - \psi_2\psi_1$, which is called an antisymmetrized spin-orbital product. It can
be derived from the leading term, a 'parent product', $\psi_1\psi_2$, by subtracting the product
obtained after making an electron interchange. The operator $A = (1/2)(I - P_{12})$ ($I$ being
the usual identity operator and $P_{12}$ being the interchange of variables for electrons 1 and
2) is called an anti-symmetrizer. There is a more general form of this operator, which
we'll need in Chapter 3, namely

$$A = \frac{1}{N!}\sum_{P} \epsilon_P P. \qquad (2.16)$$

[3] Over the years, starting from Pauli himself, there has been much argument about the fundamental
status of the two principles, but that can be found in books on the philosophy of quantum mechanics
when you're ready!
Suppose we try a 3-electron parent product in which one spin-orbital is used twice, $\psi_1\psi_1\psi_2$,
and apply $A$ to it. Think about the effect of the first permutation (the order doesn't matter as
the sum is over all N! permutations), taking it to be one that interchanges the first two
variables $\mathbf{x}_1, \mathbf{x}_2$. This will leave the product unchanged, and as the parity factor for a
single interchange is $-1$ the resultant term in the sum will be $-[\psi_1\psi_1\psi_2]$. But the identity
permutation, included in the summation, leaves the parent product unchanged and the
net result is thus exactly zero! In fact, what we have shown for three electrons is true
for any number (think about it, noting that if $P_{12}$ leaves the parent function unchanged,
then any permutation can be expressed as $P = P'P_{12}$ where $P'$ acts on all the variables
except $\mathbf{x}_1, \mathbf{x}_2$).
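The cancellation just described can be watched term by term. In the sketch below a spin-orbital product is stored as a tuple of labels, one for each electron, and the antisymmetrizer is applied by summing over all permutations with their parity factors. The representation and the labels are assumptions chosen for this illustration.

```python
from fractions import Fraction
from itertools import permutations


def parity(perm):
    """+1 for an even permutation of 0..N-1, -1 for an odd one (counts swaps)."""
    work = list(perm)
    sign = 1
    for i in range(len(work)):
        while work[i] != i:
            j = work[i]
            work[i], work[j] = work[j], work[i]
            sign = -sign
    return sign


def antisymmetrize(labels):
    """Apply A = (1/N!) sum_P eps_P P to the product labels[0](x1) labels[1](x2) ...

    Returns a dictionary {product of labels: coefficient}, with zero terms removed.
    """
    perms = list(permutations(range(len(labels))))
    result = {}
    for p in perms:
        term = tuple(labels[k] for k in p)          # permuted product
        coeff = Fraction(parity(p), len(perms))
        result[term] = result.get(term, Fraction(0)) + coeff
    return {t: c for t, c in result.items() if c != 0}


print(antisymmetrize(("psi1", "psi2")))           # two distinct spin-orbitals
print(antisymmetrize(("psi1", "psi1", "psi2")))   # a repeated spin-orbital
```

With two distinct spin-orbitals the result is the familiar difference of two products; with a repeated spin-orbital every term is cancelled by a partner of opposite parity and nothing survives.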
To summarize,

   The antisymmetrized product function
   $\Psi(\mathbf{x}_1, \mathbf{x}_2, ..., \mathbf{x}_N) = A\,\psi_1(\mathbf{x}_1)\psi_2(\mathbf{x}_2) ... \psi_N(\mathbf{x}_N)$,
   representing an IPM approximation to the state of an N-electron system,
   can contain no repetitions of any given spin-orbital: every electron must
   have its own distinct spin-orbital. A given spatial orbital can hold not
   more than two electrons, one with spin factor $\alpha$, the other with $\beta$.

This is the quantum mechanical equivalent of Pauli's Exclusion Principle: it excludes the
possibility of finding more than two electrons in the same spatial orbital; and when two
are present they must have opposite spins $\pm\frac{1}{2}$. It is less general than the Antisymmetry
Principle, because it applies only to approximate wave functions of particular form: but
it is very simple to apply and leads directly to conclusions that provide the basis for all
modern theories of the electronic structure of atoms and molecules. The example with
which we introduced it explains at once why the 3-electron Lithium atom does not have
all three electrons in the lowest-energy 1s orbital: because the Helium-like configuration
$(1s)^2$ is already 'full' and a third electron must then 'overflow' into the higher-energy 2s
orbital, giving the configuration Li[$(1s)^2(2s)$]. Thus, there are two electrons in an inner
shell, tightly localized around the nucleus, and one electron by itself, in a more diffuse
2s orbital. And that is the beginning of Chemistry, and of Life in all forms! Without
antisymmetry and the exclusion property to which it leads, all matter would collapse:
every nucleus would take all the electrons it could hold, becoming an uncharged and
unreactive system, like no atom in the world we know.
Chapter 3
Electronic structure: the
independent particle model
By the end of Chapter 2 it was already clear that a general antisymmetric wave function
could be built up from products of spin-orbitals $\psi_i(\mathbf{x}) = \phi_i(\mathbf{r})\theta(s)$, where $\phi_i(\mathbf{r})$ is a
particular orbital factor and $\theta(s)$ is a spin factor ($\alpha$ or $\beta$) indicating the spin state
of the electron ('up-spin' or 'down-spin', respectively); and that this could be a difficult
matter. Only for a 2-electron system was it possible to factorize an eigenfunction $\Psi$,
corresponding to a state of definite energy and spin, into a product $\Phi \times \Theta$. However, as
Example xxx showed, a space-spin function could be expressed in terms of antisymmetrized
spin-orbital products. This discovery, by the physicist J.C. Slater (1929), provided a basis
for nearly all electronic structure calculations in the years that followed.
From now on, we'll be dealing with many-electron systems: so we need to generalize what
was done in Section 2.1, starting from the definition of the Hamiltonian operator. Instead
of (2.1) we'll use

$$H = \sum_i h(i) + \tfrac{1}{2}{\sum_{i,j}}' g(i, j), \qquad (3.1)$$

where h(i) is the 1-electron Hamiltonian of (3.1), in which the nuclei are treated as if
fixed in space and simply determine the potential field in which the electrons move (the
'clamped nucleus' approximation); and g(i, j) is the 2-electron operator of (2.3), which
simply multiplies the wave function by the (classical) interaction energy $1/r_{ij}$ between
electrons i and j separated by a distance $r_{ij}$ (remember we normally use 'atomic units').
The prime on the summation sign indicates that there is no term when i = j; and the $\frac{1}{2}$
is put in to avoid counting every interaction twice (which would happen in summing over
both i and j). As in the case of two electrons, leaving out the electron interaction leads to
an IPM approximation in which the wave function is represented as a single spin-orbital
product. (Read the rest of Section 2.2 if you need to.)
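The role of the factor 1/2 and of the prime on the summation sign can be confirmed numerically: half the sum over all ordered pairs with i and j different equals the sum over distinct pairs. The electron positions below are arbitrary illustrative values (atomic units), not data from the text.

```python
import math
from itertools import combinations

# Hypothetical electron positions, for illustration only
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]


def g(ri, rj):
    """Classical interaction energy 1/r_ij in atomic units."""
    return 1.0 / math.dist(ri, rj)


n = len(positions)

# Primed double sum: all ordered pairs (i, j) with i != j, then halved
primed_sum = 0.5 * sum(
    g(positions[i], positions[j])
    for i in range(n) for j in range(n) if i != j
)

# Each distinct pair counted once
pair_sum = sum(g(a, b) for a, b in combinations(positions, 2))

print(primed_sum, pair_sum)
```

The two numbers agree to rounding error, since every pair appears twice in the ordered sum, once as (i, j) and once as (j, i).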
From the general principle (2.15) we must be sure that any approximate wave function we
may construct is properly antisymmetric. And we already know how this can be done:
by making use of the antisymmetrizer (2.16). So we start with this operator, already
used in Section 2, and show its basic property.
The name of the permutation P is not important when we're going to sum over all permutations of the N variables, so (2.16) can be written in two equivalent ways:

$$A = \frac{1}{N!}\sum_{P} \epsilon_P P = \frac{1}{N!}\sum_{Q} \epsilon_Q Q,$$

where there are N! terms in each sum. The product of the two operators is thus

$$AA = \left(\frac{1}{N!}\right)^2 \sum_{P,Q} \epsilon_P\epsilon_Q\, PQ.$$

But PQ = R, which is just another permutation that's been given a different name (with
parity factor $\epsilon_R = \epsilon_P\epsilon_Q$), and the last result can thus be written

$$A^2 = AA = \frac{1}{N!}\sum_{R} \epsilon_R R,$$

where for each choice of one permutation (Q say) there are the same N! product permutations R = PQ, appearing in some different order. And this fact has let us cancel one
factor (1/N!) in the previous expression. The remarkable result then is that

$$A^2 = AA = \frac{1}{N!}\sum_{R} \epsilon_R R = A.$$

Operators with this property are said to be idempotent and you first met them long
ago in Book 1 (Chapter 6)! (The word comes from Latin and means 'the same power':
all powers of A are the same.) You met such operators also in Geometry (Chapter 7 of
Book 2), where they applied to the projection of a vector on some axis in space (if you
do it twice you get the same result as doing it once!).
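The idempotency of the antisymmetrizer can be verified directly for small N by treating A as a table of coefficients, one for each permutation, and multiplying two such tables together. The sketch below does this with exact fractions; the dictionary representation and the function names are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import permutations


def parity(perm):
    """+1 for an even permutation of 0..N-1, -1 for an odd one (counts swaps)."""
    work = list(perm)
    sign = 1
    for i in range(len(work)):
        while work[i] != i:
            j = work[i]
            work[i], work[j] = work[j], work[i]
            sign = -sign
    return sign


def compose(p, q):
    """The permutation R = PQ (apply q first, then p)."""
    return tuple(p[q[k]] for k in range(len(q)))


def antisymmetrizer(n):
    """A = (1/N!) sum_P eps_P P as a dictionary {P: coefficient}."""
    perms = list(permutations(range(n)))
    return {p: Fraction(parity(p), len(perms)) for p in perms}


def multiply(a, b):
    """Product of two operators written as sums over permutations."""
    out = {}
    for p, cp in a.items():
        for q, cq in b.items():
            r = compose(p, q)
            out[r] = out.get(r, Fraction(0)) + cp * cq
    return out


A = antisymmetrizer(3)
print(multiply(A, A) == A)
```

Each permutation R is reached from N! different pairs (P, Q), which is where the cancelled factor 1/N! comes from.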
An immediate result is that A applied to a product of N orthogonal spin-orbitals gives
a wave function which, besides being antisymmetric, can easily be normalized. Let's call
the basic spin-orbital product

$$\Omega(\mathbf{x}_1, \mathbf{x}_2, ... \mathbf{x}_N) = \psi_1(\mathbf{x}_1)\psi_2(\mathbf{x}_2) ... \psi_N(\mathbf{x}_N), \qquad (3.3)$$

where all the spin-orbital factors are orthonormal (i.e. individually normalized and
mutually orthogonal) and go ahead as follows.
Thinking about how the normalization integral arises is a good exercise. To make it easier we'll use
1, 2, 3, ... N to stand for the variables $\mathbf{x}_1, \mathbf{x}_2, ... \mathbf{x}_N$.
To get $\langle F|F\rangle$ you have to integrate, over all space-spin variables, the product of two sums, each containing
N! spin-orbital products. Typical terms are
$\epsilon_P P[\psi_1(1)\psi_2(2) ... \psi_N(N)]$, from the 'bra', and
$\epsilon_Q Q[\psi_1(1)\psi_2(2) ... \psi_N(N)]$ from the 'ket'.
After making the permutations P and Q, which put the variables in a different order, you may find P has
sent the 'bra' product into
$\psi_1(1')\psi_2(2') ... \psi_N(N')$,
while Q has sent the 'ket' product into
$\psi_1(1'')\psi_2(2'') ... \psi_N(N'')$.
And then you have to do the integrations, which seems like an impossible task! (Even for the Carbon
atom, with only six electrons, 6! = 720 and gives you 518400 distinct pairs of products to look at before
doing anything else.) But in fact the whole thing is very easy because the spin-orbitals are orthonormal.
This means that in every pair of products the variables must be in exactly the same order ($i' = i''$
for all i) and the integration will always give unity ($\langle\psi_i|\psi_i\rangle = 1$). So you've done: for all matching
pairs of products the result will be unity, and there are N! of them. Thus the normalization integral
$\langle F|F\rangle = N!$.
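The counting argument can be tested by brute force for small N: loop over every pair of permutations (P, Q), multiply the two parity factors, and include the overlap of the two products, which for orthonormal spin-orbitals is 1 only when every variable sits in the same spin-orbital on both sides. This is a sketch under that orthonormality assumption; the function names are introduced here for illustration.

```python
import math
from itertools import permutations


def parity(perm):
    """+1 for an even permutation of 0..N-1, -1 for an odd one (counts swaps)."""
    work = list(perm)
    sign = 1
    for i in range(len(work)):
        while work[i] != i:
            j = work[i]
            work[i], work[j] = work[j], work[i]
            sign = -sign
    return sign


def product_overlap(p, q):
    """Overlap of two permuted products of orthonormal spin-orbitals.

    Spin-orbital number i carries variable p[i] in the bra and q[i] in the ket;
    integrating over variable v gives 1 only if it sits in the same spin-orbital.
    """
    value = 1
    for v in range(len(p)):
        value *= 1 if p.index(v) == q.index(v) else 0
    return value


def norm_integral(n):
    """Sum over all pairs (P, Q) of eps_P eps_Q times the product overlap."""
    perms = list(permutations(range(n)))
    return sum(parity(p) * parity(q) * product_overlap(p, q)
               for p in perms for q in perms)


for n in range(1, 5):
    print(n, norm_integral(n), math.factorial(n))
```

Only the N! pairs with P = Q contribute, and each contributes +1 because the two equal parity factors multiply to +1.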
Example 3.1 has shown how we can produce a normalized wave function from the spin-orbital product in (3.3): the result is

$$\Psi(\mathbf{x}_1, \mathbf{x}_2, ... \mathbf{x}_N) = \frac{1}{\sqrt{N!}}\sum_{P} \epsilon_P P\,\psi_1(\mathbf{x}_1)\psi_2(\mathbf{x}_2) ... \psi_N(\mathbf{x}_N). \qquad (3.4)$$

Let's take first the 1-electron sum $\sum_j h(j)$ and focus on $\langle\Psi|\sum_j h(j)|\Psi\rangle$, getting it in the
next example in much the same way as we got the normalization integral. As everything
is symmetrical in the electrons, their 'names' don't matter and we can make things look
easier by taking j = 1 and writing the corresponding operator $h(1) = h_1$ so as not to mix
it up with the other labels. The expectation value for the operator sum will then be N
times that for the single term.
Example 3.2 Getting a 1-electron expectation value
To evaluate $\langle\Psi|h_1|\Psi\rangle$, with $\Psi$ defined in (3.4), we note that a typical term will be the 'bra-ket' with
$h_1$ between two spin-orbital products:
$\langle\psi_1(1')\psi_2(2') ... \psi_N(N')|h_1|\psi_1(1'')\psi_2(2'') ... \psi_N(N'')\rangle$,
where the primed variables result from permutation P and the double-primed from permutation Q.
Now, as in Example 3.1, every such term will be zero unless $i' = i''$, because otherwise the two spin-orbital products, $\psi_1(1')\psi_2(2') ... \psi_N(N')$ and $\psi_1(1'')\psi_2(2'') ... \psi_N(N'')$, would lead to zero overlap factors,
$\langle\psi_i(i')|\psi_i(i'')\rangle = 0$ for $i' \neq i''$.
The variables in the N spin-orbitals must therefore match exactly and the only non-zero terms in the
last expression will be of the form
$\langle\psi_1(1')\psi_2(2') ... \psi_N(N')|h_1|\psi_1(1')\psi_2(2') ... \psi_N(N')\rangle$.
Note that only the $i'$ ('integer-primed') variables are involved in the permutations and that $h_1$ works
only on the factor with $i' = 1$, namely $\psi_i$ in position i where the integer 1 has 'landed' after the
permutation. You can see that from the list of permuted products: $\psi_1(1')\psi_2(2')\psi_3(3') ...$ (e.g. if $3'$,
after a permutation, has been replaced by 1 it still refers to spin-orbital $\psi_3$). Putting $i' = 1$ fixes one
non-zero factor as $\langle\psi_i|h_1|\psi_i\rangle$, but this will result from all permutations of the remaining N - 1 variables.
So there are N ways of choosing the spin-orbital with $i' = 1$ and (N - 1)! ways of choosing the other matching
pairs of overlap integrals. That's all for one term $h_1 = h(1)$ in the sum h(1) + h(2) + ... h(N); the same is true
for every other term, so in the complete sum each $\langle\psi_i|h|\psi_i\rangle$ will appear
N x (N - 1)! = N! times. Thus the sum of all the 1-electron operators will have an expectation value
$\langle\Psi|\sum_j h(j)|\Psi\rangle = \sum_j \langle\psi_j|h(j)|\psi_j\rangle$, where the normalizing factor 1/N! is conveniently cancelled.
In case you didn't follow the argument in Example 3.2, run through it with just 3 electrons
instead of N. With electrons 1,2,3 in spin-orbitals $\psi_1, \psi_2, \psi_3$, the basic spin-orbital product
will then be $\Omega(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3) = \psi_1(\mathbf{x}_1)\psi_2(\mathbf{x}_2)\psi_3(\mathbf{x}_3)$ or, for short, $\Omega(1, 2, 3) = \psi_1(1)\psi_2(2)\psi_3(3)$,
where again the integer i will stand for the variable $\mathbf{x}_i$.
To antisymmetrize the products we need to apply the permutation operators, which give
$P\Omega(1, 2, 3) = \psi_1(1')\psi_2(2')\psi_3(3')$ and $Q\Omega(1, 2, 3) = \psi_1(1'')\psi_2(2'')\psi_3(3'')$, and then put the
results together with parity factors $\pm 1$, remembering that $i'' = i'$ for all i (= 1, 2, 3).
The six permuted variables are (1 2 3), (1 3 2), (2 1 3), (2 3 1), (3 1 2), (3 2 1) and the
expectation value contributions are thus, on putting these indices in place of 1' 2' 3' and
choosing a typical operator $h_1 = h(j)$ with j = 1:

$\langle 1\,2\,3|h_1|1\,2\,3\rangle = \langle\psi_1|h_1|\psi_1\rangle\langle\psi_2|\psi_2\rangle\langle\psi_3|\psi_3\rangle = h_{11}$,
$\langle 1\,3\,2|h_1|1\,3\,2\rangle = \langle\psi_1|h_1|\psi_1\rangle\langle\psi_2|\psi_2\rangle\langle\psi_3|\psi_3\rangle = h_{11}$,
$\langle 2\,1\,3|h_1|2\,1\,3\rangle = \langle\psi_1|\psi_1\rangle\langle\psi_2|h_1|\psi_2\rangle\langle\psi_3|\psi_3\rangle = h_{22}$,
$\langle 2\,3\,1|h_1|2\,3\,1\rangle = \langle\psi_1|\psi_1\rangle\langle\psi_2|\psi_2\rangle\langle\psi_3|h_1|\psi_3\rangle = h_{33}$,
$\langle 3\,1\,2|h_1|3\,1\,2\rangle = \langle\psi_1|\psi_1\rangle\langle\psi_2|h_1|\psi_2\rangle\langle\psi_3|\psi_3\rangle = h_{22}$,
$\langle 3\,2\,1|h_1|3\,2\,1\rangle = \langle\psi_1|\psi_1\rangle\langle\psi_2|\psi_2\rangle\langle\psi_3|h_1|\psi_3\rangle = h_{33}$.

Note especially that the labelled $\psi$-factors do not change their positions: only their arguments
(the electronic variables, not shown) are affected by the permutations. For example, the third
permutation puts $2' = 1$ in the second position, showing that $h_1$ operates on $\psi_2$.
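The six-line table can be generated automatically, which also makes the counting explicit. In the sketch below the diagonal elements are arbitrary illustrative fractions (assumptions, not values from the text); the loop simply finds, for each permutation, the spin-orbital in which variable 1 has landed.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

# Hypothetical diagonal elements h_ii of the 1-electron operator, for illustration only
h_diag = {1: Fraction(-2), 2: Fraction(-1, 2), 3: Fraction(-1, 5)}
n = len(h_diag)

landing = {}
for perm in permutations(range(1, n + 1)):
    # perm = (1', 2', 3'): spin-orbital number i carries variable perm[i - 1].
    # h1 acts on the spin-orbital whose variable is 1.
    landing[perm] = perm.index(1) + 1

for perm, i in landing.items():
    print(perm, "gives h%d%d" % (i, i))

# Expectation value of h1 alone, including the normalizing factor 1/N!
expect_h1 = sum(h_diag[i] for i in landing.values()) / factorial(n)

# All N one-electron operators give the same value, so multiply by N
expect_total = n * expect_h1
print(expect_total, sum(h_diag.values()))
```

Each diagonal element turns up (N - 1)! = 2 times in the table, so the total over all three operators reduces to the plain sum of the three diagonal elements.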
To summarize the conclusion from Example 3.2, in a strictly IPM approximation the expectation
value of the total energy is simply the sum of the individual orbital energies, derived using the
1-electron operator h (which no longer carries the label '1'). Thus

$$E = \sum_i \epsilon_i, \qquad (\epsilon_i = \langle\psi_i|h|\psi_i\rangle).$$

To evaluate $\langle\Psi|\frac{1}{2}{\sum_{j,k}}' g(j, k)|\Psi\rangle$, with $\Psi$ defined in (3.4), we note that the expectation value will
contain typical terms
$\langle\psi_1(1')\psi_2(2') ... \psi_N(N')|g(j, k)|\psi_1(1'')\psi_2(2'') ... \psi_N(N'')\rangle$,
where the primed variables result from permutation P and the double-primed from permutation Q. (The
prime on the summation symbol is used to indicate that terms with j = k will be excluded: they would
refer to only one electron and there is no self-interaction!)
As in Example 3.2, we first suppose the variables in the two spin-orbital products must match exactly
($i' = i''$ for all $i \neq j, k$) to avoid zero overlap factors. In that case, the only non-zero terms in the last
expression will be of the form
$\langle\psi_1(1')\psi_2(2') ... \psi_N(N')|g(j, k)|\psi_1(1')\psi_2(2') ... \psi_N(N')\rangle$.
Note that only the $i'$ ('integer-primed') variables are involved in the permutations and that g(j, k) works
on the factors with $i' = j$ or $i' = k$, namely $\psi_j$, $\psi_k$, the j-th and k-th spin-orbitals in the standard order
1, 2, ... N.
On making this choice, the contribution to the expectation value will contain the 2-electron integral
$\langle\psi_j\psi_k|g(j, k)|\psi_j\psi_k\rangle$, multiplied by N - 2 unit overlap factors, coming from all other matching pairs of
spin-orbitals. And the same result will be obtained on making all permutations of the remaining N - 2
variables. So there are N ways of choosing $i' = j$, N - 1 ways of choosing another $i' = k$ and (N - 2)! ways
of choosing the remaining matching pairs of overlap integrals. That's all for one term g(j, k) in the
sum ${\sum_{j,k}}' g(j, k)$ and every term will thus appear N x (N - 1) x (N - 2)! = N! times. On removing the
normalizing factor 1/N!, $\langle\Psi|\frac{1}{2}{\sum_{j,k}}' g(j, k)|\Psi\rangle = \frac{1}{2}{\sum_{j,k}}' \langle\psi_j\psi_k|g(j, k)|\psi_j\psi_k\rangle$. This is the quantity we met in
Section 2.2 (Example 2.4) and called a Coulomb integral because it represents the Coulomb interaction
of two distributions of electric charge, of density $|\psi_j|^2$ and $|\psi_k|^2$ respectively. (Look back at (3.1) if you
don't see where the factor $\frac{1}{2}$ comes from.)
That all seems fine, but have we included everything? We started by saying that the permutations P
in the bra and Q in the ket must put the variables in matching order, as any mis-match would lead to
zero overlap integrals. But with 2-electron operators like g(j, k) it is clear that non-zero contributions to
the expectation value can arise as long as the N - 2 'matching pairs' (for $i \neq j, k$) are not changed by
the permutations. So after getting all the non-zero contributions $\langle\psi_j\psi_k|g(j, k)|\psi_j\psi_k\rangle$ we must still allow
new permutations, which differ from those already made by a transposition of the indices j, k. When two
indices are swapped, the term just found will be accompanied by another, $\langle\psi_j\psi_k|g(j, k)|\psi_k\psi_j\rangle$, which is
called an exchange integral. But, in summing over all permutations, those which lead to an exchange
term are of different parity from those that lead to the corresponding Coulomb term; and when they are
included they must be given a minus sign. Consequently, the expectation value of the 2-electron energy
term, namely $\langle\Psi|\frac{1}{2}{\sum_{j,k}}' g(j, k)|\Psi\rangle$, must now include 'exchange terms', becoming

$$\langle\Psi|\tfrac{1}{2}{\sum_{j,k}}' g(j, k)|\Psi\rangle = \tfrac{1}{2}{\sum_{j,k}}' [\langle\psi_j\psi_k|g(j, k)|\psi_j\psi_k\rangle - \langle\psi_j\psi_k|g(j, k)|\psi_k\psi_j\rangle].$$

If you still have difficulty with such a long and abstract argument, try repeating it with
just three electrons (1,2,3) in spin-orbitals $\psi_1, \psi_2, \psi_3$, as we did after Example 3.2, but
replacing $h_1$ by the 2-electron operator $g_{12} = g(1, 2)$. Note that $g_{12}$ acts on two spin-orbitals; thus, for example,

$$\langle 2\,1\,3|g_{12}|2\,1\,3\rangle = \langle\psi_1\psi_2|g_{12}|\psi_1\psi_2\rangle\langle\psi_3|\psi_3\rangle = \langle\psi_1\psi_2|g|\psi_1\psi_2\rangle.$$

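We can now summarize the conclusions from Examples 3.2 and 3.3 for a state $\Psi$, represented by a single antisymmetrized spin-orbital product and normalized to unity in (3.4):

   Given $\Psi(\mathbf{x}_1, \mathbf{x}_2, ..., \mathbf{x}_N) = (1/N!)^{1/2}\sum_P \epsilon_P P\,\psi_1(\mathbf{x}_1)\psi_2(\mathbf{x}_2) ... \psi_N(\mathbf{x}_N)$,
   the 1- and 2-electron contributions to $E = \langle\Psi|H|\Psi\rangle$ are:

   $$\langle\Psi|\sum_i h(i)|\Psi\rangle = \sum_i \langle\psi_i|h|\psi_i\rangle,$$
   $$\langle\Psi|\tfrac{1}{2}{\sum_{i,j}}' g(i, j)|\Psi\rangle = \tfrac{1}{2}{\sum_{i,j}}' [\langle\psi_i\psi_j|g|\psi_i\psi_j\rangle - \langle\psi_i\psi_j|g|\psi_j\psi_i\rangle]. \qquad (3.7)$$

These results, 'Slater's rules', will be used throughout the rest of Book 12, so if you had
trouble in getting them just take them on trust: applying them is much easier! (Note
that the summation indices in the 2-electron sum have been changed back to i, j, as used
originally in (3.1), now there's no longer any risk of confusion.)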
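Applying Slater's rules is mostly a matter of bookkeeping once the integrals are known. The sketch below assumes the integrals are supplied as plain tables indexed by spin-orbital; the function name is hypothetical, and the numbers are the hydrogen-like 1s values for Z = 2 (orbital energy $-Z^2/2$ and Coulomb integral $5Z/8$, in atomic units), used here only as a simple check against the ground-state form $E = 2\epsilon_1 + J_{11}$ quoted in Chapter 2. The exchange integral between spin-orbitals of opposite spin is set to zero, anticipating the spin integration carried out later in this chapter.

```python
def slater_energy(h_diag, coulomb, exchange):
    """Energy of a single antisymmetrized spin-orbital product.

    h_diag[i]     : 1-electron integral for spin-orbital i
    coulomb[i][j] : Coulomb integral between spin-orbitals i and j
    exchange[i][j]: exchange integral between spin-orbitals i and j
    """
    n = len(h_diag)
    one_electron = sum(h_diag)
    # Primed double sum: terms with i == j are left out, factor 1/2 avoids double counting
    two_electron = 0.5 * sum(
        coulomb[i][j] - exchange[i][j]
        for i in range(n) for j in range(n) if i != j
    )
    return one_electron + two_electron


# Assumed illustrative values for a Helium-like (1s)^2 configuration, Z = 2
Z = 2.0
eps_1s = -Z * Z / 2.0      # 1-electron energy of a hydrogen-like 1s orbital
J_11 = 5.0 * Z / 8.0       # Coulomb integral for two electrons in that orbital

h_diag = [eps_1s, eps_1s]              # spin-orbitals: 1s-alpha, 1s-beta
coulomb = [[0.0, J_11], [J_11, 0.0]]
exchange = [[0.0, 0.0], [0.0, 0.0]]    # opposite spins: no exchange term

energy = slater_energy(h_diag, coulomb, exchange)
print(energy, 2 * eps_1s + J_11)
```

The diagonal entries of the two tables are never read, because the primed sum skips i = j.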
Now that we know how to get the expectation energy for a wave function of the form
(3.4) we'll be wanting to get the best possible approximation of this kind. In Chapter 2
this was done by the variation method, in which the forms of the orbitals were varied
until E reached a stationary minimum value.
For a many-electron ground state we can go ahead in the same way; but the details will be
a bit more complicated. Apart from the fact that we now have to use spin-orbitals, N of
them for an N-electron system, the orbital factors may not be simple functions, containing
a few adjustable parameters; they may be complicated functions of electronic positions
($\mathbf{r}_i$) and we'll be looking for a 1-electron eigenvalue equation to determine the orbitals and
corresponding orbital energies. That's the problem we face in the next section: here we
have to start by getting an expression for the total electronic energy of the system.
First of all, as long as there are no spin operators in the Hamiltonian (and this first
approximation is the one usually accepted) we can get rid of all the spin factors ($\alpha$, $\beta$)
and spin variables s by doing the spin integrations before anything else in evaluating
the expectation value $\bar{E}$. Remember that in general, where $\Psi = \Psi(\mathbf{x}_1, \mathbf{x}_2, ... \mathbf{x}_N)$ and
$\bar{E} = \langle\Psi|H|\Psi\rangle$, this involves integrating over all variables in the wave function.
Let's start from the single antisymmetrized spin-orbital product in (3.7) and do the spin
integrations to get a 'spin-free' expression for $\bar{E} = \langle E\rangle$. In terms of spin-orbitals, we
already know

$$\langle E\rangle = \langle\Psi|H|\Psi\rangle = \sum_i \langle\psi_i|h|\psi_i\rangle + \tfrac{1}{2}{\sum_{i,j}}' [\langle\psi_i\psi_j|g|\psi_i\psi_j\rangle - \langle\psi_i\psi_j|g|\psi_j\psi_i\rangle], \qquad (3.8)$$

so now we only have to substitute $\psi_i(\mathbf{x}) = \phi_i(\mathbf{r})\alpha(s)$, or $\phi_i(\mathbf{r})\beta(s)$ in this expression and
complete the spin integrations.
Example 3.4 Getting rid of the spins!
In Chapter 2 we found that quantum mechanics was not complete until we allowed for particles with
spin: otherwise it was not possible to describe the fact that electrons are identical particles of a very
special kind: their wave functions must be antisymmetric under exchange of any two particles (an
operation that can make no observable difference to the system). So why should we want to 'get rid of
spin'? The simple reason is that the observable effects of spin (e.g. on the energy levels of a system) are
tiny and, in good approximation, can often be neglected. That being so, it's a nuisance to keep them in
the theory for any longer than necessary.
The 1-electron part of the energy in (3.8) depends on the spin-orbitals only through the term $\langle\psi_i|h|\psi_i\rangle =
\int \psi_i^*(\mathbf{x}_1)h(1)\psi_i(\mathbf{x}_1)d\mathbf{x}_1$, in which $\psi_i$ is occupied by the electron we're calling '1', with space-spin coordinates $\mathbf{x}_1$, and $d\mathbf{x}_1 = d\mathbf{r}_1 ds_1$. When the Hamiltonian h(1) does not contain spin operators, it works on a
spin-orbital $\psi_i(\mathbf{x}_1) = \phi_i(\mathbf{r}_1)\alpha(s_1)$ to give $[h(1)\phi_i(\mathbf{r}_1)]\alpha(s_1)$, without touching the spin factor $\alpha(s_1)$. Thus

$$\int \psi_i^*(\mathbf{x}_1)h(1)\psi_i(\mathbf{x}_1)d\mathbf{x}_1 = \int \alpha^*(s_1)\alpha(s_1)ds_1 \int \phi_i^*(\mathbf{r}_1)[h(1)\phi_i(\mathbf{r}_1)]d\mathbf{r}_1 = \langle\alpha|\alpha\rangle\langle\phi_i|h|\phi_i\rangle = \langle\phi_i|h|\phi_i\rangle.$$

The spin integration just takes away the spin factors, leaving $\langle\phi_i|h|\phi_i\rangle$ in place of $\langle\psi_i|h|\psi_i\rangle$, and this will
clearly be true also for a spin-orbital with a $\beta$ spin factor. (Integration limits not shown when obvious.)
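What about the 2-electron term in (3.8)? This is $\tfrac{1}{2}\sum_{i,j}[\langle\psi_i\psi_j|g|\psi_i\psi_j\rangle - \langle\psi_i\psi_j|g|\psi_j\psi_i\rangle]$ and is a bit more difficult, so let's take the Coulomb and exchange parts separately. If we take $\psi_i(x_1) = \phi_i(r_1)\alpha(s_1)$ and $\psi_j(x_2) = \phi_j(r_2)\alpha(s_2)$, then a single Coulomb term becomes
$$\langle\psi_i\psi_j|g|\psi_i\psi_j\rangle = \int\psi_i^*(x_1)\psi_j^*(x_2)\,g(1,2)\,\psi_i(x_1)\psi_j(x_2)\,dx_1dx_2 = \langle\alpha|\alpha\rangle\langle\alpha|\alpha\rangle\langle\phi_i\phi_j|g|\phi_i\phi_j\rangle = \langle\phi_i\phi_j|g|\phi_i\phi_j\rangle.$$
The corresponding exchange term reduces in the same way,
$$\langle\psi_i\psi_j|g|\psi_j\psi_i\rangle = \int\psi_i^*(x_1)\psi_j^*(x_2)\,g(1,2)\,\psi_j(x_1)\psi_i(x_2)\,dx_1dx_2 = \langle\phi_i\phi_j|g|\phi_j\phi_i\rangle,$$
and could be obtained from the Coulomb term simply by exchanging the two orbitals (no spins!) in the ket.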
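(Note that you don't always have to show everything in such detail, with the variables and integral signs. A shorter way is to write the spin-orbitals $\psi_i = \phi_i\alpha$, $\psi_j = \phi_j\alpha$, so
$$\langle(\phi_i\alpha)(\phi_j\alpha)|g|(\phi_j\alpha)(\phi_i\alpha)\rangle = \langle\alpha|\alpha\rangle_1\langle\alpha|\alpha\rangle_2\langle\phi_i\phi_j|g|\phi_j\phi_i\rangle,$$
where the first spin scalar product comes from the first spin-orbital and the next one from the second (it's enough just to keep the order). As the spin states are normalized both factors are 1 and the short cut gives the same result: $\langle\psi_i\psi_j|g|\psi_j\psi_i\rangle = \langle\phi_i\phi_j|g|\phi_j\phi_i\rangle$.)
Now suppose that $\psi_i$ and $\psi_j$ have different spins: $\psi_i = \phi_i\alpha$, $\psi_j = \phi_j\beta$. In this case we get, using the short cut, an exchange term $\langle(\phi_i\alpha)(\phi_j\beta)|g|(\phi_j\beta)(\phi_i\alpha)\rangle = \langle\alpha|\beta\rangle_1\langle\beta|\alpha\rangle_2\langle\phi_i\phi_j|g|\phi_j\phi_i\rangle$. Here, because the different spin states are orthogonal, there are two factors of 0 and the exchange term is $\langle\psi_i\psi_j|g|\psi_j\psi_i\rangle = 0 \times \langle\phi_i\phi_j|g|\phi_j\phi_i\rangle$. The Coulomb term, on the other hand, again reduces to $\langle\phi_i\phi_j|g|\phi_i\phi_j\rangle$, because the spin factors are both 1 (check it out!).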
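In summary, Example 3.4 showed how a system whose Hamiltonian contains no spin operators can be dealt with in terms of orbitals alone, without the spin factors $\alpha$ and $\beta$:
Given $\Psi(x_1, x_2, ..., x_N) = \sqrt{N!}\,A\,\psi_1(x_1)\psi_2(x_2) ... \psi_N(x_N)$, the 1- and 2-electron energy terms reduce as follows.
When $\psi_i = \phi_i\alpha$:
$$\langle\psi_i|h|\psi_i\rangle \rightarrow \langle\phi_i|h|\phi_i\rangle$$
and when $\psi_i = \phi_i\alpha$, $\psi_j = \phi_j\alpha$:
$$[\langle\psi_i\psi_j|g|\psi_i\psi_j\rangle - \langle\psi_i\psi_j|g|\psi_j\psi_i\rangle] \rightarrow [\langle\phi_i\phi_j|g|\phi_i\phi_j\rangle - \langle\phi_i\phi_j|g|\phi_j\phi_i\rangle].$$
But when $\psi_i = \phi_i\alpha$, $\psi_j = \phi_j\beta$ there is no exchange term:
$$\langle\psi_i\psi_j|g|\psi_i\psi_j\rangle \rightarrow \langle\phi_i\phi_j|g|\phi_i\phi_j\rangle. \qquad (3.9)$$
Of course, there are similar results if you interchange $\alpha$ and $\beta$ throughout. The Coulomb integrals in terms of $\psi_i$, $\psi_j$ give results of the same form in terms of the orbital factors $\phi_i$, $\phi_j$ when both spins are the same ($\alpha,\alpha$ or $\beta,\beta$), or different ($\alpha,\beta$ or $\beta,\alpha$): but this is so for the exchange integrals only when both spins are the same, the exchange integrals reducing to zero when the spins are different.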
The results listed in (3.7) and (3.9) may be used to obtain energy expectation values, in
IPM approximation, for any kind of many-electron system. They apply equally to atoms,
where the occupied orbitals are AOs (centered on a single nucleus), and to molecules,
where the molecular orbitals (MOs) extend over several nuclei.
Here we start by thinking about atoms, whose AOs have been studied in detail in Chapter 6 of Book 11. You'll remember something about atoms from Book 5 (Sections 1.1 and 1.2, which you might like to read again). In particular, the atomic number Z gives the number of electrons in the electrically neutral atom and allows us to list all the known chemical elements in increasing order of atomic mass and electronic complexity. The first 10 (lightest) atoms in the list are of special importance: they are Hydrogen (H), Helium (He), Lithium (Li), Beryllium (Be), Boron (B), Carbon (C), Nitrogen (N), Oxygen (O), Fluorine (F) and Neon (Ne). Together they make up most of the world we live in, including the water of the oceans, the main gases of the Earth's atmosphere and even about 99% of our human bodies, so no wonder they are important! In Book 12 we'll be trying to understand some of the properties of these few atoms and the ways they can be put together to form molecules and other structures. The main tool for doing this is provided by quantum mechanics; and by now you know enough about this to get started.
In the next two examples we'll get approximate energy expressions for the atoms of Lithium (Z = 3) and Beryllium (Z = 4) in their lowest-energy ground states.
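Example 3.5 Energy expression for the Lithium atom
With the electron configuration $(1s)^2(2s)$ the occupied spin-orbitals are
$$\psi_1 = \phi_{1s}\alpha, \qquad \psi_2 = \phi_{1s}\beta, \qquad \psi_3 = \phi_{2s}\alpha,$$
and the energy follows from
$$\bar{E} = \sum_i\langle\psi_i|h|\psi_i\rangle + \tfrac{1}{2}\sum_{i,j}\langle\psi_i\psi_j||\psi_i\psi_j\rangle,$$
where $\langle\psi_i\psi_j||\psi_i\psi_j\rangle$ stands for the Coulomb term minus the exchange term, as in (3.8). The spin-free reductions in (3.9) then give
$$\bar{E} = 2\epsilon_{1s} + \epsilon_{2s} + J_{1s,1s} + 2J_{1s,2s} - K_{1s,2s}$$
for the expectation energy, in the ground state, of a 3-electron system (the Lithium atom), in terms of the orbital energies
$$\epsilon_{1s} = \langle\phi_{1s}|h|\phi_{1s}\rangle, \qquad \epsilon_{2s} = \langle\phi_{2s}|h|\phi_{2s}\rangle,$$
and the Coulomb and exchange integrals. For any two orbitals $\phi_1$, $\phi_2$ the Coulomb integral is
$$J_{\phi_1,\phi_2} = \int\phi_1^*(r_1)\phi_2^*(r_2)\,g\,\phi_1(r_1)\phi_2(r_2)\,dr_1dr_2 = \int\rho_1(r_1)\,g\,\rho_2(r_2)\,dr_1dr_2,$$
where $\rho_1(r_1) = \phi_1^*(r_1)\phi_1(r_1)$ is a real quantity and so is $\rho_2(r_2)$. This interaction integral, between real charge densities, is often denoted by $(\phi_1^*\phi_1, \phi_2^*\phi_2)$ and has a purely classical interpretation; $J_{\phi_1,\phi_2} = (\phi_1^*\phi_1, \phi_2^*\phi_2)$. The corresponding exchange integral does not have a classical interpretation: it is $K(\phi_1,\phi_2) = (\phi_1^*\phi_2, \phi_2^*\phi_1)$ where the 'charge densities' are, in general, complex quantities and have their origin in the region of overlap of the two orbitals.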
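The next atom Be, with Z = 4, will contain two doubly occupied orbitals, giving it the electron configuration $(1s)^2(2s)^2$. It is the model for all atoms that contain n 'closed shells' of doubly occupied orbitals and leads to an important generalization.
Example 3.6 Energy expression for the Beryllium atom
Again suppose the electrons are added, one at a time, to the bare nucleus, now with charge Z = 4 (atomic units). The first two go into the AO $\phi_{1s}$ and the other two into $\phi_{2s}$, giving the electron configuration $(1s)^2(2s)^2$ in which both orbitals are doubly occupied and can accept no more electrons. The atom has a closed-shell ground state in which the singly occupied spin-orbitals are
$$\psi_1 = \phi_{1s}\alpha, \qquad \psi_2 = \phi_{1s}\beta, \qquad \psi_3 = \phi_{2s}\alpha, \qquad \psi_4 = \phi_{2s}\beta. \qquad (A)$$
Let $\Sigma_1$ stand for the 1-electron sum and $\Sigma_2$ for the 2-electron sum in (3.8). With the spin-orbitals listed above in (A), $\Sigma_1$ becomes (making use of (3.9))
$$\Sigma_1 = 2\langle\phi_{1s}|h|\phi_{1s}\rangle + 2\langle\phi_{2s}|h|\phi_{2s}\rangle$$
and similarly, taking the spin-orbital pairs (1,2), (1,3), (1,4), (2,3), (2,4), (3,4) in turn,
$$\Sigma_2 = \langle\phi_{1s}\phi_{1s}|g|\phi_{1s}\phi_{1s}\rangle' + [\langle\phi_{1s}\phi_{2s}|g|\phi_{1s}\phi_{2s}\rangle - \langle\phi_{1s}\phi_{2s}|g|\phi_{2s}\phi_{1s}\rangle] + \langle\phi_{1s}\phi_{2s}|g|\phi_{1s}\phi_{2s}\rangle' + \langle\phi_{1s}\phi_{2s}|g|\phi_{1s}\phi_{2s}\rangle' + [\langle\phi_{1s}\phi_{2s}|g|\phi_{1s}\phi_{2s}\rangle - \langle\phi_{1s}\phi_{2s}|g|\phi_{2s}\phi_{1s}\rangle] + \langle\phi_{2s}\phi_{2s}|g|\phi_{2s}\phi_{2s}\rangle'.$$
Again the terms that have been given a 'prime' are the ones that come from spin-orbitals of different spin and therefore include no exchange term.
On using the J and K notation for the Coulomb and exchange integrals, the last result becomes (showing the terms in the same order) $\Sigma_2 = J_{1s,1s} + (J_{1s,2s} - K_{1s,2s}) + (J_{1s,2s}) + (J_{1s,2s}) + (J_{1s,2s} - K_{1s,2s}) + (J_{2s,2s})$. Thus $\Sigma_2 = J_{1s,1s} + J_{2s,2s} + 4J_{1s,2s} - 2K_{1s,2s}$, where the first two terms give the Coulomb repulsion energy within the two doubly occupied AOs while the remainder give the four Coulomb repulsions between the two electron pairs, $(1s^2)$ and $(2s^2)$, together with the two exchange terms from the electrons with the same spin.
The total electronic energy of the Beryllium atom, in IPM approximation, thus has the expectation value
$$\bar{E} = 2\epsilon_{1s} + 2\epsilon_{2s} + J_{1s,1s} + J_{2s,2s} + 4J_{1s,2s} - 2K_{1s,2s}.$$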
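The bookkeeping in Example 3.6 can be checked with a few lines of Python. This is only a sketch: the function name and the representation of a spin-orbital as an (orbital, spin) pair are choices made here for illustration, not notation from the text. It applies the rules in (3.9) to every distinct pair of spin-orbitals and counts how often each Coulomb and exchange integral appears.

```python
from collections import Counter
from itertools import combinations

def count_two_electron_terms(spin_orbitals):
    """Count J and K integrals in the 2-electron sum, using the rules (3.9).

    Each spin-orbital is an (orbital, spin) pair, e.g. ("1s", "alpha").
    Every distinct pair contributes one Coulomb integral J; an exchange
    integral K (entering with a minus sign) survives only for equal spins.
    """
    coulomb, exchange = Counter(), Counter()
    for (orb_i, spin_i), (orb_j, spin_j) in combinations(spin_orbitals, 2):
        pair = tuple(sorted((orb_i, orb_j)))
        coulomb[pair] += 1
        if spin_i == spin_j:
            exchange[pair] += 1
    return coulomb, exchange

# Beryllium ground state, configuration (1s)^2 (2s)^2
beryllium = [("1s", "alpha"), ("1s", "beta"), ("2s", "alpha"), ("2s", "beta")]
J_count, K_count = count_two_electron_terms(beryllium)
print(dict(J_count))
print(dict(K_count))
```

For Beryllium the counts reproduce the coefficients found above: one each of $J_{1s,1s}$ and $J_{2s,2s}$, four of $J_{1s,2s}$ and two of $K_{1s,2s}$.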
Example 3.6 has given an expression for the total energy of a system consisting of two doubly occupied AOs, namely
$$E = 2\epsilon_{1s} + 2\epsilon_{2s} + J_{1s,1s} + J_{2s,2s} + 4J_{1s,2s} - 2K_{1s,2s}.$$
The beauty of this result is that it can be generalized (with no more work!) and will then hold good for any atom for which the IPM provides a decent approximate wave function. It was derived for two doubly occupied AOs, $\phi_{1s}$ and $\phi_{2s}$, but for a system with n such orbitals, which we can call simply $\phi_1, \phi_2, ... \phi_i, ... \phi_n$, the derivation will be just the same (think about it!). The n orbitals can hold N = 2n electrons and the general energy expression will be (summation limits, not shown, are normally i = 1, n)
$$E = 2\sum_i\epsilon_i + \sum_i J_{i,i} + 4\sum_{i<j}J_{i,j} - 2\sum_{i<j}K_{i,j}, \qquad (3.12)$$
where the indices now label the orbitals in ascending energy order. The terms being summed have the same meaning as for only two orbitals: the first is the energy of two electrons in orbital $\phi_i$; the next is their Coulomb repulsion energy; and then there is the repulsion between each electron of the pair in $\phi_i$ and each in $\phi_j$; the last is the exchange energy between the two pairs that have the same spin.
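As a numerical illustration, the closed-shell energy expression can be evaluated directly from tables of orbital energies and integrals. The sketch below assumes the integrals are supplied as nested lists; the function names and the numbers are hypothetical and are not computed atomic integrals. The second function uses an equivalent form with unrestricted double sums, which relies on $K_{i,i} = J_{i,i}$ and on the symmetry of J and K in their indices.

```python
def closed_shell_energy(eps, J, K):
    """E = 2*sum_i eps_i + sum_i J_ii + 4*sum_{i<j} J_ij - 2*sum_{i<j} K_ij."""
    n = len(eps)
    energy = 2 * sum(eps) + sum(J[i][i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            energy += 4 * J[i][j] - 2 * K[i][j]
    return energy

def closed_shell_energy_double_sum(eps, J, K):
    """Same energy with unrestricted sums: 2*sum_i eps_i + sum_{i,j} (2 J_ij - K_ij)."""
    n = len(eps)
    return 2 * sum(eps) + sum(
        2 * J[i][j] - K[i][j] for i in range(n) for j in range(n)
    )

# Hypothetical values in atomic units, chosen only so that K_ii = J_ii
eps = [-8.0, -2.0]
J = [[2.5, 0.5], [0.5, 0.25]]
K = [[2.5, 0.125], [0.125, 0.25]]

E_restricted = closed_shell_energy(eps, J, K)
E_double_sum = closed_shell_energy_double_sum(eps, J, K)
print(E_restricted, E_double_sum)
```

Both functions return the same number for any input with $K_{i,i} = J_{i,i}$ and symmetric tables, which is a quick way of checking the summation limits.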
At this point we begin to think about how the orbitals might be improved; for we know that using the AOs obtained for one electron alone, moving in the field of the nucleus, will give a very poor approximate wave function. Even with only the two electrons of the Helium atom (Example 2.1) the exponential factor in the 1s orbital is changed quite a lot by the presence of a second electron: instead of corresponding to nuclear charge Z = 2 a more realistic value turned out to be $Z_{\rm eff} = Z - (5/8)$. This is an 'effective' nuclear charge, reduced by the screening constant (5/8) which allows for some of the repulsion between the electrons.
Clearly, the 2s AO in the Lithium atom would be much better represented by giving it an exponent closer to 1 instead of the actual value Z = 3, to allow for the fact that the $1s^2$ inner shell holds two charges of $-e$ close to the nucleus. Of course we can find a better value of the effective nuclear charge, which determines the sizes of the outer AOs, by minimizing the expectation value $\bar{E}$; but we really want to find the best possible IPM wave function and that means allowing the AOs to take arbitrary (not just hydrogen-like) forms. That's a much more difficult job.
Chapter 4
The Hartree-Fock method
Note to the reader. The next sections contain difficult material and you may need to be reminded about summations. You've been summing numbered terms ever since Book 1: if there are n of them, $t_1, t_2, ... t_n$ say, you may write their sum as $T = \sum_{i=1}^{i=n} t_i$ or, when the limits are clear, just as $\sum_i t_i$; but if the terms are labeled by two indices, i, j, you may need to add conditions such as $i \neq j$ or $i < j$ to exclude some of the terms. Thus, with n = 3, $\sum_{i<j} t_{i,j}$ will give you $T = t_{1,2} + t_{1,3} + t_{2,3}$; and if you want to sum over one index only you can use parentheses to exclude the one you don't want to sum over, using for example $\sum_{i(\neq 2)}$ to keep j = 2 fixed. Think carefully about what you want to do!
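The two kinds of restricted summation mentioned in the note can be tried out in a few lines of Python. This is just an illustration: the terms $t_{i,j}$ are given made-up values (10i + j) so that each one is easy to recognize in the total.

```python
from itertools import combinations

n = 3
# Hypothetical terms t_ij, labeled by two indices running from 1 to n
t = {(i, j): 10 * i + j for i in range(1, n + 1) for j in range(1, n + 1)}

# Sum over i < j: picks out t_12 + t_13 + t_23
pairs = list(combinations(range(1, n + 1), 2))
T_pairs = sum(t[p] for p in pairs)

# Sum over i only, excluding i = 2, with j = 2 kept fixed: t_12 + t_32
T_fixed_j = sum(t[(i, 2)] for i in range(1, n + 1) if i != 2)

print(pairs, T_pairs, T_fixed_j)
```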
When the IPM approximation was first introduced in Chapter 2, it was taken for granted that the 'best' 1-electron wave functions would describe accurately a single electron moving in some kind of 'effective field'. That means they would be eigenfunctions of an eigenvalue equation $h_{\rm eff}\phi = \epsilon\phi$, with $h_{\rm eff} = h + V$. Here we'll suppose the spin variables have been eliminated, as in Examples 3.5 and 3.6, and start from the energy expression (3.12), namely
$$E = 2\sum_i\epsilon_i + \sum_i J_{ii} + 4\sum_{i<j}J_{ij} - 2\sum_{i<j}K_{ij}.$$
We first rewrite this with unrestricted double sums, as
$$E = 2\sum_i\epsilon_i + 2\sum_{i,j}J_{ij} - \sum_{i,j}K_{ij}, \qquad (4.1)$$
where
$$\epsilon_i = \langle\phi_i|h|\phi_i\rangle, \qquad J_{ij} = \langle\phi_i\phi_j|g|\phi_i\phi_j\rangle, \qquad K_{ij} = \langle\phi_i\phi_j|g|\phi_j\phi_i\rangle$$
(checking that the summations come out right!) and then vary the orbitals one at a time.
Suppose then that $\phi_k \rightarrow \phi_k + \delta\phi_k$, where k is any chosen ('fixed') index, for the orbital we're going to vary. The corresponding small change in the 1-electron part of E will be easy, since $\epsilon_i = \langle\phi_i|h|\phi_i\rangle$ and changes only when we take the term with i = k in the bra or in the ket. The change in the sum is thus $\langle\delta\phi_k|h|\phi_k\rangle$ + (c.c.) where (c.c.) stands for the complex conjugate of the term before it. But the interaction terms are more difficult: we'll deal with them in the next two examples.
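The check that the summations 'come out right' in passing from (3.12) to (4.1) can be written out as a short worked step. It uses only two facts, both of which follow from the definitions of the integrals: $J_{ij} = J_{ji}$ and $K_{ij} = K_{ji}$, and $K_{ii} = J_{ii}$ (the Coulomb and exchange integrals coincide when both orbitals are the same).

```latex
\begin{align*}
2\sum_{i,j} J_{ij} - \sum_{i,j} K_{ij}
  &= \sum_i \left(2J_{ii} - K_{ii}\right)
     + \sum_{i \neq j} \left(2J_{ij} - K_{ij}\right) \\
  &= \sum_i J_{ii} + 2\sum_{i<j} \left(2J_{ij} - K_{ij}\right) \\
  &= \sum_i J_{ii} + 4\sum_{i<j} J_{ij} - 2\sum_{i<j} K_{ij}.
\end{align*}
```

The factor 2 in the second line appears because each pair with $i \neq j$ occurs twice in the unrestricted sum, once as (i, j) and once as (j, i).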
Example 4.1 The Coulomb operator
It would be nice to write the J- and K-terms as expectation values of 1-electron operators, for then we could deal with them in the same way as $\epsilon_i$. A single Coulomb integral is
$$J_{ij} = \langle\phi_i\phi_j|g|\phi_i\phi_j\rangle = \int(1/r_{12})\phi_i^*(r_1)\phi_j^*(r_2)\phi_i(r_1)\phi_j(r_2)\,dr_1dr_2,$$
since g is just a multiplying factor and can be put anywhere in the integrand. We'd like to get one integration out of the way first, the one that involves the $r_2$ variable, and we can do it by defining an operator
$$J_j(1) = \int(1/r_{12})\phi_j^*(r_2)\phi_j(r_2)\,dr_2$$
that works on any function of $r_1$, multiplying it by the factor that comes from the integration and obviously depends on orbital $\phi_j$.
With Born's interpretation of the wave function (see Example 2.3), $\phi_j^*(r_2)\phi_j(r_2) = P_j(r_2)$ is the probability density of finding an electron in orbital $\phi_j$ at point $r_2$. And the integral $\int(1/r_{12})P_j(r_2)\,dr_2$ is the electric potential at point $r_1$ due to an electron in orbital $\phi_j$, treating $P_j$ as the density (in electrons/unit volume) of a 'smeared out' distribution of charge.
Example 4.1 has given the expression (putting the volume element $dr_2$ just after the integration sign that goes with it, so as not to get mixed up)
$$J_j(1) = \int dr_2\,(1/r_{12})\phi_j^*(r_2)\phi_j(r_2) \qquad (4.2)$$
for the Coulomb operator associated with an electron in orbital $\phi_j$, being the electrostatic potential at point $r_1$ arising from its 'charge cloud'.
And with this definition we can write the Coulomb term as the double integral
$$J_{ij} = \int dr_1\int dr_2\,(1/r_{12})\phi_i^*(r_1)\phi_j^*(r_2)\phi_i(r_1)\phi_j(r_2) = \int dr_1\,\phi_i^*(r_1)\,(J_j(1))\,\phi_i(r_1) = \langle\phi_i|J_j(1)|\phi_i\rangle, \qquad (4.3)$$
which is an expectation value, just as we wished, of the 1-electron operator $J_j(1)$ that gives the 'effective field' provided by an electron in orbital $\phi_j$. Now we want to do something similar for the exchange term $K_{ij}$.
Example 4.2 The exchange operator
The exchange integral is
$$K_{ij} = \langle\phi_i\phi_j|g|\phi_j\phi_i\rangle = \int dr_1\int dr_2\,(1/r_{12})\phi_i^*(r_1)\phi_j^*(r_2)\phi_j(r_1)\phi_i(r_2)$$
and the interchange of labels in the ket spoils everything. We'll have to invent a new operator!
If you compare the expression for $K_{ij}$ with that for $J_{ij}$ in (4.3) you'll see where they disagree. Since the order of the factors doesn't matter, we can keep the variables in the standard order, swapping the labels instead. The $J_{ij}$ integral is
$$J_{ij} = \int dr_1\int dr_2\,(1/r_{12})\phi_i^*(r_1)\phi_j^*(r_2)\phi_i(r_1)\phi_j(r_2),$$
while $K_{ij}$, with its interchange of labels in the ket, is
$$K_{ij} = \int dr_1\int dr_2\,(1/r_{12})\phi_i^*(r_1)\phi_j^*(r_2)\phi_j(r_1)\phi_i(r_2).$$
The Coulomb operator (4.2) could be defined that way (as a multiplier) because the integration over $r_2$ could be completed first, leaving behind a function of $r_1$, before doing the final integration over $r_1$ to get $J_{ij}$ as the expectation value in (4.3). We'd like to do something similar for the exchange integral $K_{ij}$, but the best we can do is to introduce $K_j(1)$, whose effect on any function $\phi$ of $r_1$ will be to give
$$K_j(1)\phi(r_1) = \int dr_2\,(1/r_{12})\phi_j(r_1)\phi_j^*(r_2)\phi(r_2).$$
This looks very strange because, operating on $\phi(r_1)$, it first has to change the variable to $r_2$ and then do an integration which finally leaves behind a new function of $r_1$. To put that in symbols we could say
$$K_j(1)\phi(r_1) = \int dr_2\,(1/r_{12})\phi_j(r_1)\phi_j^*(r_2)\,(r_1 \rightarrow r_2)\,\phi(r_1),$$
where the operator $(r_1 \rightarrow r_2)$ means 'replace $r_1$ by $r_2$ in any function that follows it'.
Let's test it on the function $\phi_i(r_1)$ by writing the final factor in the expression for $K_{ij}$ as $(r_1 \rightarrow r_2)\phi_i(r_1)$ and noting that the integration over $r_2$ is already present. We then find
$$K_{ij} = \int dr_1\,\phi_i^*(r_1)\left[\int dr_2\,(1/r_{12})\phi_j(r_1)\phi_j^*(r_2)\,(r_1 \rightarrow r_2)\,\phi_i(r_1)\right] = \int dr_1\,\phi_i^*(r_1)K_j(1)\phi_i(r_1) = \langle\phi_i|K_j(1)|\phi_i\rangle.$$
Example 4.2 has given an expression for the exchange integral $\langle\phi_i\phi_j|g|\phi_j\phi_i\rangle$, similar to that in (4.3) but with the exchange operator
$$K_j(1) = \int dr_2\,(1/r_{12})\phi_j(r_1)\phi_j^*(r_2)\,(r_1 \rightarrow r_2) \qquad (4.4)$$
in place of the Coulomb operator. Both operators describe the effect on Electron 1, in any orbital $\phi$, of another electron (2) in orbital $\phi_j$, but while the Coulomb operator has a simple classical interpretation (giving the energy of 1 in the field produced by the smeared-out charge density associated with 2) the exchange operator is more mysterious.
There is, however, another way of describing an operator like $K_j(1)$. An operator in a function space is simply a 'recipe' for going from one function to another, e.g. from f(x) to g(x). You've used differential operators a lot, but another way of getting from f(x) to g(x) is to use an integral operator, k say, defined by means of a 'kernel' $k(x, x')$ which includes a second variable $x'$: the kernel determines the effect of the operator and $kf(x) = \int k(x, x')f(x')\,dx'$ becomes a new function g(x) = kf(x). Clearly, $K_j(1)$ in (4.4) is an operator of this kind: it contains two electronic variables, $r_1$, $r_2$, and an integration over the second one ($r_2$).
Here we'll define the kernel of the operator $K_j$ as the function of two variables
$$K_j(r_1, r_2) = (1/r_{12})\phi_j(r_1)\phi_j^*(r_2).$$
(Note: From now on we'll no longer indicate that the 1-electron operators act on functions of $r_1$, by writing h(1) etc., as that will always be clear from the function they act on.)
The relationship between the operator and its kernel is usually written $K_j \rightarrow K_j(r_1, r_2)$. Notice that the operator $J_j$, defined in (4.2), has a similar integrand, except that the variable $r_1$ has been replaced by $r_2$ and the integration $\int dr_2$ has been completed before going on to get (4.3).
We're now nearly ready to go back to (4.1), writing the Coulomb and exchange integrals in terms of the newly defined operators $J_j$ and $K_j$, given in (4.2) and (4.6). Remember that $\epsilon_i = \langle\phi_i|h|\phi_i\rangle$ in terms of the 1-electron Hamiltonian h; and we now know how to express $J_{ij} = \langle\phi_i\phi_j|g|\phi_i\phi_j\rangle$ and $K_{ij} = \langle\phi_i\phi_j|g|\phi_j\phi_i\rangle$ in similar form.
Thus, (4.1) becomes
$$E = 2\sum_i\epsilon_i + 2\sum_{i,j}J_{ij} - \sum_{i,j}K_{ij} = 2\sum_i\langle\phi_i|h|\phi_i\rangle + 2\sum_{i,j}J_{ij} - \sum_{i,j}K_{ij}.$$
The 1-electron energy (called $\epsilon_i$ only with complete neglect of electron interaction) has now been written explicitly as the expectation value of the 'bare nuclear' Hamiltonian h.
The summations over j can now be done, after putting $J_{ij} = \langle\phi_i|J_j|\phi_i\rangle$, $K_{ij} = \langle\phi_i|K_j|\phi_i\rangle$ and defining total Coulomb and exchange operators as $J = 2\sum_j J_j$, $K = 2\sum_j K_j$. (Remember that $J_j$ and $K_j$ are operators for one electron in orbital $\phi_j$, but here we have doubly-occupied orbitals.) Thus, on putting $J - \tfrac{1}{2}K = G$, we find
$$E = 2\sum_i\langle\phi_i|h|\phi_i\rangle + \sum_i\langle\phi_i|J|\phi_i\rangle - \tfrac{1}{2}\sum_i\langle\phi_i|K|\phi_i\rangle = 2\sum_i\langle\phi_i|(h + \tfrac{1}{2}G)|\phi_i\rangle \qquad (G = J - \tfrac{1}{2}K). \qquad (4.7)$$
Having found a neat expression for the expectation value of the total energy, the next step will be to find its variation when the orbitals are changed.
To find the stationary value of the energy we vary the orbitals one at a time, supposing $\phi_k \rightarrow \phi_k + \delta\phi_k$ and working to the first order of small quantities. The part of E, given in (4.7), that depends on the single orbital $\phi_k$ (the one which is going to be varied) is
$$E^{(k)} = 2\epsilon_k + J_{kk} + 4\sum_{j(\neq k)}J_{kj} - 2\sum_{j(\neq k)}K_{kj} = 2\langle\phi_k|h|\phi_k\rangle + J_{kk} + 4\sum_{j(\neq k)}J_{kj} - 2\sum_{j(\neq k)}K_{kj}. \qquad (4.8)$$
Here the 1-electron energy (called $\epsilon_k$ only with complete neglect of electron interaction) has again been written explicitly as the expectation value of the 'bare nuclear' Hamiltonian h.
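On making the change $\phi_k \rightarrow \phi_k + \delta\phi_k$, the corresponding first-order change in (4.8) is
$$\delta E^{(k)} = 2\langle\delta\phi_k|h|\phi_k\rangle + (c.c.) + 2\left(\sum_j\langle\delta\phi_k\phi_j|g|\phi_k\phi_j\rangle + \sum_i\langle\phi_i\delta\phi_k|g|\phi_i\phi_k\rangle\right) + (c.c.) - \left(\sum_j\langle\delta\phi_k\phi_j|g|\phi_j\phi_k\rangle + \sum_i\langle\phi_i\delta\phi_k|g|\phi_k\phi_i\rangle\right) + (c.c.),$$
where each (c.c.) is the complex conjugate of the term before it.
(Note that the two sums in the parentheses are identical because, for example, the term $\langle\phi_i\phi_j|g|\phi_i\phi_j\rangle$ in the expression for E could just as well have been written $\langle\phi_j\phi_i|g|\phi_j\phi_i\rangle$ and then, calling the second factor $\phi_k$, the change $\phi_k \rightarrow \phi_k + \delta\phi_k$ would have made the second sum into $\sum_j\langle\phi_j\delta\phi_k|g|\phi_j\phi_k\rangle$, the same as the first sum if you interchange the summation indices i, j.)
Thus, noting that the same argument applies on the last line of the equation above, the first-order change in energy becomes
$$\delta E^{(k)} = 2\langle\delta\phi_k|h|\phi_k\rangle + 4\sum_j\langle\delta\phi_k\phi_j|g|\phi_k\phi_j\rangle - 2\sum_j\langle\delta\phi_k\phi_j|g|\phi_j\phi_k\rangle + (c.c.).$$
But $\langle\delta\phi_k\phi_j|g|\phi_k\phi_j\rangle = \langle\delta\phi_k|J_j|\phi_k\rangle$ and $\langle\delta\phi_k\phi_j|g|\phi_j\phi_k\rangle = \langle\delta\phi_k|K_j|\phi_k\rangle$; and the last expression can therefore be written (doing the summations over j, with $J = 2\sum_j J_j$ and $K = 2\sum_j K_j$)
$$\delta E^{(k)} = (2\langle\delta\phi_k|h|\phi_k\rangle + 2\langle\delta\phi_k|J|\phi_k\rangle - \langle\delta\phi_k|K|\phi_k\rangle) + (c.c.) = 2\langle\delta\phi_k|[h + J - \tfrac{1}{2}K]|\phi_k\rangle + (c.c.).$$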
For the energy to be stationary, in IPM approximation, against infinitesimal variation of any orbital $\phi_k$, subject to the usual normalization condition, this first-order change must vanish. Since this variation is otherwise arbitrary, it follows (see Section 1.2 of Chapter 1) that a solution is obtained when $[h + J - \tfrac{1}{2}K]\phi_k$ is a multiple of $\phi_k$. The operator in square brackets is often denoted by F and called the Fock operator, after the Russian physicist who first used it. The condition for finding the best orbital $\phi_k$ is therefore that it be an eigenfunction of F:
$$F\phi_k = \epsilon_k\phi_k \qquad (F = h + J - \tfrac{1}{2}K = h + G). \qquad (4.10)$$
There's no simple way of solving the eigenvalue equation found in the last section, because
the Fock operator depends on the forms of all the occupied orbitals which determine
the electron density and consequently the effective field in which all the electrons move!
The best that can be done is to go step by step, using a very rough first approximation
to the orbitals to estimate the J- and K-operators and then using them to set up, and
solve, a revised eigenvalue equation. The new orbitals which come out will usually be a
bit different from those that went in but will hopefully give an improved estimate of the
Fock operator. This allows us to go ahead by iteration until, after several cycles, no
further improvement is needed: at that stage the effective field stops changing and the
output orbitals agree with the ones used in setting up the eigenvalue equation. This is
the self-consistent field method, invented by Hartree (without the exchange operator)
and Fock (including exchange). It has been employed, in one form or another, ever since
the 1930s in calculations on atoms, molecules and more extended systems, and will serve
us well in the rest of Book 12.
The first thing we need to do is to relate the J- and K-operators to the electron density functions, and we already know how to do that. From (4.3) it follows that the total Coulomb operator, for the whole system with two electrons in every orbital, is
$$J = 2\sum_j J_j = 2\int dr_2\,(1/r_{12})\sum_j\phi_j(r_2)\phi_j^*(r_2).$$
Here the sum of all orbital contributions to the electron density at point $r_2$ is $P(r_2) = 2\sum_j\phi_j(r_2)\phi_j^*(r_2)$, and this determines the Coulomb interaction between an electron at $r_1$ and the whole electron distribution. The exchange interaction is similar, but depends on the same density function evaluated at two points, for which we use the notation $P(r_1; r_2)$. The usual density function $P(r_1)$ then arises on putting $r_2 = r_1$: $P(r_1) = P(r_1; r_1)$.
To summarize, the effect of J and K on any function $\phi$ of $r_1$ is given as follows:
The Coulomb and exchange operators for any closed-shell system are defined by their effect on any 1-electron function $\phi(r_1)$:
$$J\phi(r_1) = \left[\int dr_2\,(1/r_{12})P(r_2; r_2)\right]\phi(r_1),$$
$$K\phi(r_1) = \int dr_2\,(1/r_{12})P(r_1; r_2)\,\phi(r_2),$$
where
$$P(r_1; r_2) = 2\sum_j[\phi_j(r_1)\phi_j^*(r_2)]. \qquad (4.11)$$
With $G = J - \tfrac{1}{2}K$ now depending on the density function P, the Fock operator may be written
$$F = h + G(P).$$
It is important to note that F is a Hermitian operator and that its eigenfunctions may therefore be taken as forming an orthogonal set, as we supposed in Section 3.1; but this is not the case unless the exchange operator is included.
All very well, you might say, but if the Hartree-Fock equations are so difficult to handle
why do we spend so much time on them? And if we do manage to get rough approximations to orbitals and orbital energies, do they really exist and allow us to get useful
information? The true answer is that orbitals and their energies exist only in our minds,
as solutions to the mathematical equations we have formulated. In the rare cases where
accurate solutions can be found, they are much more complicated than the simple approximate functions set up in Chapter 2. Nevertheless, by going ahead we can usually
find simple concepts that help us to understand the relationships among the quantities
we can observe and measure. In Chapter 6 of Book 11 you saw how the idea of orbital
The energy of an X-ray photon is big enough to knock an electron out of an atomic inner
shell, leaving behind an ion with an inner-shell hole. The ejected electron has a kinetic
energy which can be measured and related to the energy of the orbital from which it came.
The whole process can be pictured as in Figure 4.1, which shows the various energy levels.
[Figure 4.1: energy-level diagram for the photoelectron experiment, with the photon energy $h\nu$ marked.]
The energy-conservation principle (which you've been using ever since Book 4) then allows you to write
$$h\nu = BE + W + KE.$$
In this approximation the binding energy $BE = -\epsilon_k$ when the electron comes from orbital $\phi_k$ ($\epsilon_k$ being negative for bound states), while W (called the 'work function') is the extra work that has to be done to get the electron from the level labelled 'escape energy from atom' to the one labelled 'escape energy from solid'. At that point the electron really is free to travel through empty space until it reaches a collector, in which its KE can be measured. (Experiments like this are always made in high vacuum, so the electron released has nothing to collide with.) The work function W can also be measured, by doing the experiment with a clean surface (no adsorbed atoms or molecules) and a much smaller photon energy, so the electron collected can only have come from the energy levels in the solid.
The last equation can now be rearranged to give an experimental value of $BE = -\epsilon_k$ in terms of the observed KE of the electron reaching the collector:
$$-\epsilon_k = h\nu - W - KE.$$
So even if orbitals don't really exist you can measure experimentally the energies of the electrons they describe! Similar experiments can be done with lower photon energies: if you use ultraviolet radiation instead of X-rays you'll be using Ultraviolet Photoelectron Spectroscopy (U-PS) and will be able to get information about the upper energy levels of the adsorbed atoms and molecules. Nowadays, such techniques are widely used not only to find what atoms are present in any given sample (their inner-shell orbital energies being their 'footprints') but also to find how many of them there are in each adsorbed molecule. For this reason X-PS is often known as Electron Spectroscopy for Chemical Analysis (ESCA).
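The energy balance can be turned into a one-line calculation. The sketch below assumes all three energies are given in the same units and uses made-up numbers in electron volts purely to show the arithmetic; the function name is hypothetical.

```python
def orbital_energy_from_photoelectron(photon_energy, work_function, kinetic_energy):
    """Estimate epsilon_k from h*nu = BE + W + KE, taking BE = -epsilon_k.

    All arguments must be in the same energy units (here electron volts).
    """
    binding_energy = photon_energy - work_function - kinetic_energy
    return -binding_energy

# Hypothetical measurement, in eV: not data from a real experiment
eps_k = orbital_energy_from_photoelectron(
    photon_energy=1486.6, work_function=4.5, kinetic_energy=1197.1
)
print(round(eps_k, 1))
```

The result is negative, as expected for a bound electron: the more tightly the electron is bound, the smaller the kinetic energy left over after it escapes.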
It's now time to ask how the Hartree-Fock equations can be solved with enough accuracy to allow us to make meaningful comparisons between theory and experiment.
Finite-basis approximations
In earlier sections we've often built up approximate 1-electron wave functions as linear combinations of some given set of functions. This is one form of the more general procedure for building 1-electron wave functions from a finite basis of functions which, from now on, we'll denote by
$$\chi_1, \chi_2, ... \chi_r, ... \chi_m.$$
Here we suppose there are m linearly independent functions, labelled by a general index r, out of which we're going to construct the n occupied orbitals. Usually the functions will be supposed orthonormal, with Hermitian scalar products $\langle\chi_r|\chi_s\rangle = 1$ for r = s; = 0 (otherwise). Very often the scalar product will be denoted by $S_{rs}$ and called an overlap integral. The basis functions are normally set out in a row, as a 'row matrix', and denoted by $\boldsymbol{\chi}$. With this convention, a linear combination of basis functions can be written
$$\phi = c_1\chi_1 + c_2\chi_2 + ... + c_r\chi_r + ... + c_m\chi_m \qquad (4.13)$$
or, in matrix form, as the row-column product
$$\phi = (\chi_1\ \chi_2\ ...\ \chi_m)\begin{pmatrix}c_1\\ c_2\\ \vdots\\ c_m\end{pmatrix} = \boldsymbol{\chi}\mathbf{c}, \qquad (4.14)$$
where $\mathbf{c}$ stands for the whole column of expansion coefficients and $\boldsymbol{\chi}$ for the row of basis functions. Sometimes it is useful to write such equations with the summation conventions, so that $\phi = \sum_r c_r\chi_r$. (Look back at Section 3.1, or further back to Chapter 7 of Book 11, if you need reminding of the rules for using matrices.)
In dealing with molecules, (4.14) is used to express a molecular orbital (MO) as a linear combination of atomic orbitals (AOs) and forms the basis of the LCAO method. The equation $\mathsf{F}\phi = \epsilon\phi$ is easily put in finite-basis form by noting that $\mathsf{F}\phi = \sum_s\mathsf{F}\chi_s c_s$ and, on taking a scalar product from the left with $\chi_r$, the r-component of the new vector $\mathsf{F}\phi$ becomes
$$\langle\chi_r|\mathsf{F}|\phi\rangle = \sum_s\langle\chi_r|\mathsf{F}|\chi_s\rangle c_s.$$
The quantity $\langle\chi_r|\mathsf{F}|\chi_s\rangle$ is the rs-element of the square matrix $\mathbf{F}$ which 'represents' the operator $\mathsf{F}$ in the $\chi$-basis. The next example will remind you of what you need to know before going on.
Example 4.3 Matrix representations
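Suppose an operator $\mathsf{A}$ works on a function $\phi = \sum_s\chi_s c_s$ to give a new function $\phi' = \mathsf{A}\phi$, with a new set of expansion coefficients, $c'_r$ say. Thus $\phi' = \sum_s\chi_s c'_s$ and to find the r-component of the new function we form the scalar product $\langle\chi_r|\phi'\rangle$, getting (with an orthonormal basis)
$$c'_r = \langle\chi_r|\phi'\rangle = \langle\chi_r|\mathsf{A}\phi\rangle = \langle\chi_r|\mathsf{A}|\sum_s\chi_s c_s\rangle = \sum_s\langle\chi_r|\mathsf{A}|\chi_s\rangle c_s.$$
So the operator equation $\phi' = \mathsf{A}\phi$ is 'echoed' in the algebraic equation $c'_r = \sum_s A_{rs}c_s$ and this in turn can be written as a simple matrix equation $\mathbf{c}' = \mathbf{A}\mathbf{c}$. (Remember the typeface convention: $\mathsf{A}$ (sans serif) stands for an operator; $\mathbf{A}$ (boldface) for a matrix representing it; and $A_{rs}$ (lightface italic) for a single number, such as a matrix element.)
In the same way, you can show (do it!) that when a second operator $\mathsf{B}$ works on $\mathsf{A}\phi$, giving $\phi'' = \mathsf{B}\mathsf{A}\phi = \mathsf{C}\phi$, the product of operators $\mathsf{C} = \mathsf{B}\mathsf{A}$ is represented by the matrix product $\mathbf{C} = \mathbf{B}\mathbf{A}$.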
To summarize: When a basis set is defined, along with linear combinations of the basis functions of the type (4.13), or (4.14) in matrix form, the operator equality $\phi' = \mathsf{A}\phi$ allows us to say $\mathbf{c}' = \mathbf{A}\mathbf{c}$. In this case we write
$$\phi' = \mathsf{A}\phi \rightarrow \mathbf{c}' = \mathbf{A}\mathbf{c}$$
and say the equality on the left implies the one on the right. But this doesn't have to be true the other way round! Each implies the other only when the basis set is complete (see Section 3.1) and in that case we write
$$\phi' = \mathsf{A}\phi \leftrightarrow \mathbf{c}' = \mathbf{A}\mathbf{c}.$$
In both cases we speak of a matrix representation of the operator equation, but only in the second case can we call it 'faithful' (or 'one-to-one'). In the same way, the product of two operators applied in succession ($\mathsf{C} = \mathsf{B}\mathsf{A}$) is represented by the matrix product $\mathbf{B}\mathbf{A}$ and we write $\mathsf{B}\mathsf{A} \rightarrow \mathbf{B}\mathbf{A}$; but the 'double-headed' arrow applies only when the basis is complete. (Examples can be found in Book 11.)
It's important to remember that the representations used in the applications of quantum mechanics are hardly ever faithful. That's why we usually have to settle for approximate solutions of eigenvalue equations.
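When the eigenvalue equation $\mathsf{F}\phi = \epsilon\phi$ is written in matrix form it becomes $\mathbf{F}\mathbf{c} = \epsilon\mathbf{c}$, the equality holding only in the limit where the basis is complete and the matrices are infinite. With only three basis functions, for example, the matrix eigenvalue equation is
$$\begin{pmatrix}F_{11} & F_{12} & F_{13}\\ F_{21} & F_{22} & F_{23}\\ F_{31} & F_{32} & F_{33}\end{pmatrix}\begin{pmatrix}c_1\\ c_2\\ c_3\end{pmatrix} = \epsilon\begin{pmatrix}c_1\\ c_2\\ c_3\end{pmatrix}.$$
To find the full matrix $\mathbf{F}$ associated with the operator given in (??) we need to look at the separate terms $\mathsf{h}$, $\mathsf{J}$, $\mathsf{K}$. The first one is easy: the matrix $\mathbf{h}$ has an rs-element $h_{rs} = \langle\chi_r|\mathsf{h}|\chi_s\rangle = \int\chi_r^*(r_1)\,\mathsf{h}\,\chi_s(r_1)\,dr_1$, but the others are more difficult and are found as in the following examples.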
Example 4.4 The electron density function
Suppose we want the rs-element of the matrix representing $\mathsf{J}$, defined in (4.11), using the $\chi$-basis. When $\mathsf{J}$ acts on any function of $r_1$ it simply multiplies it by $\int g(1,2)P(r_2; r_2)\,dr_2$, and our first job is to find the matrix $\mathbf{P}$ representing the density function in the $\chi$-basis. This density is twice the sum over all the doubly-occupied orbitals, which we'll now denote by $\phi_K$ (using capital letters as their labels, so as not to mix them up with the basis functions $\chi_r$, $\chi_s$ etc.). So the total density becomes
$$P(r_2; r_2) = 2\sum_K\phi_K(r_2)\phi_K^*(r_2) = 2\sum_K\left(\sum_t c_t^K\chi_t(r_2)\right)\left(\sum_u c_u^K\chi_u(r_2)\right)^* = \sum_{t,u}P_{tu}\,\chi_t(r_2)\chi_u^*(r_2),$$
where $P_{tu} = 2\sum_K c_t^K c_u^{K*}$. (Note that the summation indices have been re-named t, u as r, s are already in use. Also, when there's no room for K as a subscript you can always put it at the top: it's only a label!)
The electron density function $P(r_1; r_2)$, first used in (4.11), generally contains two independent variables, the ordinary density of electric charge (in electrons/unit volume) arising on putting $r_2 = r_1$. When the function is written in finite basis form as
$$P(r_1; r_2) = 2\sum_K\phi_K(r_1)\phi_K^*(r_2) = 2\sum_K\left(\sum_t c_t^K\chi_t(r_1)\right)\left(\sum_u c_u^K\chi_u(r_2)\right)^* = \sum_{t,u}P_{tu}\,\chi_t(r_1)\chi_u^*(r_2),$$
the square array $\mathbf{P}$, here with elements $P_{tu}$, is an example of a density matrix. You will find how important density matrices can be when you begin to study the physical properties of molecules. Here we're going to use them simply in defining the Coulomb and exchange operators.
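A density matrix of this kind is easy to build from the expansion coefficients. The following Python sketch assumes an orthonormal basis and stores each occupied orbital as a list of (possibly complex) coefficients; the function name and the sample coefficients are hypothetical.

```python
def density_matrix(occupied_orbitals):
    """Return P with elements P[t][u] = 2 * sum_K c_t^K * conj(c_u^K).

    occupied_orbitals is a list of coefficient columns, one for each
    doubly occupied orbital, all of the same length m.
    """
    m = len(occupied_orbitals[0])
    P = [[0j] * m for _ in range(m)]
    for c in occupied_orbitals:
        for t in range(m):
            for u in range(m):
                P[t][u] += 2 * c[t] * c[u].conjugate()
    return P

# One doubly occupied, normalized orbital in a 2-function basis (made-up numbers)
c_1 = [0.6, 0.8j]
P = density_matrix([c_1])
trace = sum(P[t][t] for t in range(2)).real
print(P, trace)
```

With an orthonormal basis the trace of P counts the electrons (two here), and P is Hermitian, so that $P_{tu} = P_{ut}^*$.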
Example 4.5 The Coulomb operator
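To get $\langle\chi_r|\mathsf{J}|\chi_s\rangle$ we first express the operator $\mathsf{J}$, which is just a multiplying factor, in terms of the $\chi$-basis:
$$\mathsf{J} = \int dr_2\,(1/r_{12})P(r_2; r_2) = \int g(1,2)\sum_{t,u}P_{tu}\,\chi_t(r_2)\chi_u^*(r_2)\,dr_2.$$
Multiplying from the left by $\chi_r^*(r_1)$ and from the right by $\chi_s(r_1)$, and integrating over $r_1$, then gives
$$\langle\chi_r|\mathsf{J}|\chi_s\rangle = \sum_{t,u}P_{tu}\langle\chi_r\chi_u|g|\chi_s\chi_t\rangle,$$
where the first indices on the two sides of the operator come from the $r_1$ integration, while the second indices (u, t) come from the $r_2$.
From Example 4.5, the Coulomb operator in (4.11) is represented in the finite $\chi$-basis by a matrix $\mathbf{J}(\mathbf{P})$, i.e. as a 'function' of the electron density matrix, with elements
$$J_{rs} = \sum_{t,u}P_{tu}\langle\chi_r\chi_u|g|\chi_s\chi_t\rangle.$$
The matrix defined in this way allows one to calculate the expectation value of the energy of an electron in orbital $\phi_K = \boldsymbol{\chi}\mathbf{c}_K$, arising from its Coulomb interaction with the whole electron distribution.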
Example 4.6 The exchange operator
To get $\langle\chi_r|\mathsf{K}|\chi_s\rangle$ we first express the operator $\mathsf{K}$, which is an integral operator, in terms of the $\chi$-basis: from (4.11), taking the operand $\phi(r_1)$ to be $\chi_s(r_1)$, we get
$$\mathsf{K}\chi_s(r_1) = \int dr_2\,(1/r_{12})P(r_1; r_2)\,\chi_s(r_2),$$
where the integration over $r_2$ is included in this first step. The next step in getting the matrix element $\langle\chi_r|\mathsf{K}|\chi_s\rangle$ is to multiply from the left by $\chi_r^*(r_1)$ (complex conjugate in the bra factor) and then do the remaining integration over $r_1$. The result is
$$\langle\chi_r|\mathsf{K}|\chi_s\rangle = \int dr_1\int dr_2\,(1/r_{12})\chi_r^*(r_1)\sum_{t,u}P_{tu}\,\chi_t(r_1)\chi_u^*(r_2)\chi_s(r_2) = \sum_{t,u}P_{tu}\langle\chi_r\chi_u|g|\chi_t\chi_s\rangle.$$
Here the first indices (r, t) on the two sides of the operator come from the $r_1$ integration, while the second indices (u, s) come from the $r_2$; but note the exchange of indices in the ket.
From Example 4.6, the exchange operator in (4.11) is represented in the finite $\chi$-basis by a matrix $\mathbf{K}(\mathbf{P})$, again as a 'function' of the electron density matrix, but now with elements
$$K_{rs} = \sum_{t,u}P_{tu}\langle\chi_r\chi_u|g|\chi_t\chi_s\rangle.$$
This result allows one to calculate the expectation value of the energy of an electron in orbital $\phi_K = \boldsymbol{\chi}\mathbf{c}_K$, arising from its exchange interaction with the whole electron distribution.
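We've finished!! We can now go back to the operator forms of the Hartree-Fock equations and re-write them in the modern matrix forms, which are ideal for offering to an electronic computer. Equation (4.7), which gave the expectation value of the total electronic energy in the form
$$E = 2\sum_i\langle\phi_i|\mathsf{h}|\phi_i\rangle + \sum_i\langle\phi_i|\mathsf{J}|\phi_i\rangle - \tfrac{1}{2}\sum_i\langle\phi_i|\mathsf{K}|\phi_i\rangle = 2\sum_i\langle\phi_i|(\mathsf{h} + \tfrac{1}{2}\mathsf{G})|\phi_i\rangle \qquad (\mathsf{G} = \mathsf{J} - \tfrac{1}{2}\mathsf{K})$$
now becomes (dropping the orbital label k to the subscript position now there are no others, and remembering that the 'dagger' conveniently makes the column $\mathbf{c}_k$ into a row and adds the star to every element)
$$E = 2\sum_k\mathbf{c}_k^\dagger\mathbf{h}\mathbf{c}_k + \sum_k\mathbf{c}_k^\dagger\mathbf{J}\mathbf{c}_k - \tfrac{1}{2}\sum_k\mathbf{c}_k^\dagger\mathbf{K}\mathbf{c}_k = 2\sum_k\mathbf{c}_k^\dagger(\mathbf{h} + \tfrac{1}{2}\mathbf{G})\mathbf{c}_k \qquad (\mathbf{G} = \mathbf{J} - \tfrac{1}{2}\mathbf{K}).$$
The operator eigenvalue equation (4.10) for getting the best possible orbitals, which was
$$\mathsf{F}\phi_k = \epsilon_k\phi_k \qquad (\mathsf{F} = \mathsf{h} + \mathsf{G}),$$
now becomes the matrix eigenvalue equation
$$\mathbf{F}\mathbf{c}_k = \epsilon_k\mathbf{c}_k \qquad (\mathbf{F} = \mathbf{h} + \mathbf{G}).$$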
The last two equations represent the prototype approach in applying quantum mechanics
to the real many-electron systems we meet in Physics and Chemistry. Besides providing
a solid platform on which to build all the applications that follow in Book 12, they
provide the underlying pattern for most current developments which aim to go beyond
the Independent-Particle Model.
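The matrix formulas for J, K and E can be tried out numerically. The following is only a minimal sketch in Python with NumPy, not a full Hartree-Fock program: the function names and the storage convention g[r, u, s, t] for the 2-electron integrals are assumptions made for illustration, and the integrals themselves would have to come from elsewhere.

```python
import numpy as np


def coulomb_exchange(P, g):
    """Build the matrices J(P) and K(P) from a density matrix P.

    Assumed (hypothetical) storage convention for the 2-electron integrals:
        g[r, u, s, t] = <chi_r chi_u | g | chi_s chi_t>
    so that
        J_rs = sum_tu P_tu <chi_r chi_u | g | chi_s chi_t>
        K_rs = sum_tu P_tu <chi_r chi_u | g | chi_t chi_s>
    """
    J = np.einsum("tu,rust->rs", P, g)
    K = np.einsum("tu,ruts->rs", P, g)
    return J, K


def closed_shell_energy(C, h, J, K):
    """E = 2 sum_k c_k^dagger (h + G/2) c_k with G = J - K/2.

    The columns of C are the coefficient columns c_k of the occupied orbitals.
    """
    G = J - 0.5 * K
    return 2.0 * np.real(np.einsum("rk,rs,sk->", C.conj(), h + 0.5 * G, C))
```

Note that J and K differ only in the order of the last two indices of the integral, which is the exchange of indices in the ket noted in Example 4.6.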
Chapter 5
Atoms: the building blocks of matter
Chapter 6 of Book 11 dealt with the simplest of all atoms, Hydrogen, in which one electron moves in the field of a positive nucleus of atomic number Z = 1. The eigenstates of the Hamiltonian H were also eigenstates of the angular momentum operators, $L^2$ and one component of angular momentum, chosen as $L_z$. The definite values of the energy and angular momentum operators were then $E_n = -\tfrac12(Z^2/n^2)$, $L(L+1)$ and $M$ (all in atomic units of $e_H$, $\hbar^2$ and $\hbar$, respectively), where n, L, M are quantum numbers. But here we're dealing with a very different situation, where there are in general many electrons.
Fortunately, the angular momentum operators $L_x$, $L_y$, $L_z$, and similar operators for spin, all follow the same commutation rules for any number of electrons. This means we don't have to do all the work again when we go from Hydrogen to, say, Calcium with 20 electrons: the same rules still serve and very little needs changing. (You may want to read again the parts of Chapter 5 (Book 11) that deal with angular momentum.)
Here we'll start from the commutation rules for (orbital) angular momentum in a 1-electron system. The operators $L_x$, $L_y$, $L_z$ satisfy the equations
$$(L_xL_y - L_yL_x) = iL_z, \qquad (L_yL_z - L_zL_y) = iL_x, \qquad (L_zL_x - L_xL_z) = iL_y, \qquad (5.1)$$
which followed directly from the rules for position and linear momentum operators (see Example 5.4 in Book 11). For a many-electron system the components of total angular momentum will be
$$L_x = \sum_i L_x(i), \qquad L_y = \sum_i L_y(i), \qquad L_z = \sum_i L_z(i),$$
where $L_x(i)$ for example is an angular momentum operator for Particle i, while the un-numbered operators $L_x$ etc. refer to components of total angular momentum. We want to show that these operators satisfy exactly the same equations (5.1).
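$$L_xL_y - L_yL_x = \Big(\sum_i L_x(i)\Big)\Big(\sum_j L_y(j)\Big) - \Big(\sum_j L_y(j)\Big)\Big(\sum_i L_x(i)\Big) = \sum_{i,j}\big[L_x(i)L_y(j) - L_y(j)L_x(i)\big] = \sum_j iL_z(j) = iL_z .$$
Note that the double sum with $i \neq j$ is zero because the operators commute when they refer to different particles, but satisfy the equations (5.1) when i = j, giving the single sum which is $iL_z$. (And don't confuse i, the imaginary unit, with i as a summation index!)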
The other equations in (5.1) arise simply on changing the names of the indices.
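A quick numerical check of this argument is possible for two particles with l = 1. The sketch below (Python with NumPy, angular momentum in units of ħ) uses the standard 3×3 matrices for l = 1; the names `total` and `commutator` are illustrative choices, not anything defined in the text.

```python
import numpy as np

# One-particle l = 1 matrices in the basis (m = +1, 0, -1), in units of hbar.
LZ1 = np.diag([1.0, 0.0, -1.0]).astype(complex)
LPLUS1 = np.sqrt(2.0) * np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]], dtype=complex)
LMINUS1 = LPLUS1.conj().T
LX1 = 0.5 * (LPLUS1 + LMINUS1)
LY1 = -0.5j * (LPLUS1 - LMINUS1)
I3 = np.eye(3, dtype=complex)


def total(op):
    """Total operator op(1) + op(2) acting on the two-particle product space."""
    return np.kron(op, I3) + np.kron(I3, op)


def commutator(a, b):
    return a @ b - b @ a


# Operators on different particles commute, so only the i = j terms survive.
cross_term = commutator(np.kron(LX1, I3), np.kron(I3, LY1))
same_rules = np.allclose(commutator(total(LX1), total(LY1)), 1j * total(LZ1))
```

Here `cross_term` is the zero matrix and `same_rules` is true, which is the content of the double-sum argument for the case of two particles.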
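Book 11 also introduced the step-up and step-down operators $L^{\pm} = L_x \pm iL_y$, which act on an eigenstate $\Psi_{L,M}$ according to
$$L^{\pm}\Psi_{L,M} = \sqrt{(L \mp M)(L \pm M + 1)}\;\Psi_{L,M\pm1},$$
where the numerical multipliers ensure that the shifted states, $\Psi_{L,M\pm1}$, will also be normalized to unity, $\langle\Psi_{L,M\pm1}|\Psi_{L,M\pm1}\rangle = 1$. These operators change only the eigenstates of $L_z$, leaving a state vector which is still an eigenstate of H and $L^2$ with the same energy and total angular momentum. And, from what has been said already, they may be used without change for systems containing any number of electrons. So we can now start thinking about real atoms of any kind!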
The electronic structures of the first four chemical elements are pictured, in IPM approximation, as the result of filling the two lowest-energy atomic orbitals, called 1s and 2s. (You should read again the relevant parts of Chapter 6, Book 11.) The electron configurations of the first ten elements, in increasing order of atomic number, are

Hydrogen [$1s^1$]   Helium [$1s^2$]   Lithium [$1s^2 2s^1$]   Beryllium [$1s^2 2s^2$]

in which the first two s-type AOs are filling (each with up to two electrons of opposite spin component, $\pm\tfrac12$), followed by six more, in which the p-type AOs ($p_x$, $p_y$, $p_z$) are filling with up to two electrons in each:

Boron [$1s^2 2s^2 2p^1$]   Carbon [$1s^2 2s^2 2p^2$]   Nitrogen [$1s^2 2s^2 2p^3$]
Oxygen [$1s^2 2s^2 2p^4$]   Fluorine [$1s^2 2s^2 2p^5$]   Neon [$1s^2 2s^2 2p^6$]
Here the names of the AOs are the ones shown in Figure 15 of Book 11, the leading integer being the principal quantum number and the letter being the orbital type (s, p, d, f, ...). Remember the letters just stand for the types of series (sharp, principal, diffuse, fundamental) found in the atomic spectra of the elements, arising from particular electronic transitions: in fact they correspond to values 0, 1, 2, 3, ... of the quantum number L. (The energy-level diagram below (Figure 5.1) will remind you of all that.)
[Figure 5.1: energy-level diagram (energy axis labelled in terms of $\tfrac12 Z^2 e_H$)]
Note especially that the energy levels differ slightly from those for a strictly Coulombic
central field: the levels of given principal quantum number n normally lie in the energy
order En (s)< En (p)< En (d)<... because in a real atom the orbitals with angle-dependent
wave functions are on the average further from the nucleus and the electrons they hold are
therefore not as tightly bound to it. As the number of electrons (Z) increases this effect
becomes bigger, owing to the screening produced by the electrons in the more tightly
bound inner orbitals. Thus, the upward trend in the series of levels such as 3s, 3p, 3d
becomes more marked in the series 4s, 4p, 4d, 4f.
The first few atoms, in order of increasing atomic number Z, have been listed above along with the ways in which their electrons can be assigned to the available atomic orbitals in ascending order of energy. The elements whose atoms have atomic numbers going from Z = 3 up to Z = 10 are said to form a Period, in which the corresponding quantum shells fill with up to two electrons in every orbital. This is the first short period. Chemists generally extend this list, to include all the 92 naturally occurring atoms and a few more (produced artificially), by arranging them in a Periodic Table which shows how similar chemical properties may be related to similar electronic structures. More about that later.
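The filling rule used for the first ten elements (1s, then 2s, then 2p, with at most two electrons per orbital) is simple enough to write down as a short routine. This is only an illustrative sketch in Python; the names `SUBSHELLS` and `configuration` are hypothetical, and the list deliberately stops at the 2p shell, so it says nothing about heavier atoms.

```python
# Subshells in the filling order used in the text, with their capacities
# (two electrons per orbital: one s-orbital, three p-orbitals).
SUBSHELLS = [("1s", 2), ("2s", 2), ("2p", 6)]

ELEMENTS = ["Hydrogen", "Helium", "Lithium", "Beryllium", "Boron",
            "Carbon", "Nitrogen", "Oxygen", "Fluorine", "Neon"]


def configuration(z):
    """Return the IPM ground configuration for atomic number z (1 to 10)."""
    if not 1 <= z <= 10:
        raise ValueError("this sketch only covers Z = 1 to 10")
    parts = []
    remaining = z
    for name, capacity in SUBSHELLS:
        if remaining == 0:
            break
        n = min(capacity, remaining)
        parts.append(f"{name}{n}")
        remaining -= n
    return " ".join(parts)


# One line per element, e.g. "Carbon [1s2 2s2 2p2]".
table = [f"{name} [{configuration(z)}]" for z, name in enumerate(ELEMENTS, start=1)]
```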
Now that we have a picture of the probable electron configurations of the first few atoms, we have to start thinking about the wave functions of the corresponding electronic states of a configuration. For the atoms up to Beryllium, with its filled 1s and 2s orbitals, the ground states were non-degenerate with only one IPM wave function. But in Boron, with one electron in the next (2p) energy level, there may be several states as there are three degenerate 2p-type wave functions, usually taken as $2p_x$, $2p_y$, $2p_z$, or as $2p_{+1}$, $2p_0$, $2p_{-1}$, where the second choice is made when the unit angular momentum is quantized so that $\langle L_z\rangle = +1, 0, -1$, respectively. The next element, Carbon, is even more interesting as there are now two electrons to put in the three degenerate p-orbitals. We'll study it in some detail, partly because of its importance in chemistry and partly because it gives you the key to setting up the many-electron state functions for atoms in general. (Before starting, you should read again Section 2.2 of Book 11, where we met a similar problem in dealing with spin angular momentum and how the spins of two or more particles could be coupled to give a whole range of total spins.)
First we note that the IPM states of the Carbon $(2p)^2$ configuration can all be built up from spin-orbital products with six factors of the type $\psi(l_i, m_i, s_i|x_i)$ (i = 1, 2, ... 6). Here the 1-electron orbital angular momentum quantum numbers are denoted by lower-case letters l, m, leaving capitals (L, M) for total angular momentum; and $s_i$ is used for the 1-electron up- or down-spin eigenvalue, always $\pm\tfrac12$. For example $\psi(1, -1, +\tfrac12|x_5)$ means that Electron 5, with space-spin coordinates $x_5$, occupies a $2p_{-1}$ orbital with spin factor $\alpha$.
Next, it is clear that we don't have to worry about antisymmetrizing in looking for the angular momentum eigenfunctions: if a single product is an eigenfunction so will be the antisymmetrized product (every term simply containing re-named electron labels). So we can drop the electronic variables $x_i$, taking the factors to be in the standard order i = 1, 2, ..., 6, and with this understanding, the typical spin-orbital product for the Carbon $(2p)^2$ configuration can be indicated as
$$(l_1, m_1, s_1)(l_2, m_2, s_2)\ldots(l_5, m_5, s_5)(l_6, m_6, s_6).$$
The first four factors refer to a closed shell in which the first two orbitals, 1s and 2s, both correspond to zero angular momentum ($l_1 = l_2 = 0$) and are each filled with two electrons of opposite spin. With the notation you're used to, they could be written as $(1s\alpha)(1s\beta)(2s\alpha)(2s\beta)$ and define the closed-shell core. Wave functions for the quantized electronic states of this configuration are constructed in the following examples.
Example 5.2 The Carbon (2p)2 configuration
The leading spin-orbital product, to which the six electrons are assigned in standard order, can be denoted by Product $= (1s\alpha)(1s\beta)(2s\alpha)(2s\beta)\,(l_5, m_5, s_5)(l_6, m_6, s_6)$. We start from the top state, in which the angular momentum quantum numbers have their maximum values $l_5 = l_6 = 1$, $m_5 = m_6 = 1$ for the 2p-orbital with highest z-component, and $s_5 = s_6 = \tfrac12$ for the up-spin states. You can check that this product has a total angular momentum quantum number M = 2 for the orbital operator $L_z = L_z(1) + L_z(2) + \ldots + L_z(6)$ by noting that the first four 1-electron operators all multiply their corresponding orbital factors by zero, the eigenvalue for an s-type function; while the last two operators each give the same product, multiplied by 1. Thus, the operator sum has the effect $L_z$(Product) = 2 × (Product). In the same way, the total spin angular momentum operator $S_z = S_z(1) + S_z(2) + \ldots + S_z(6)$ will act on Product to multiply it by $\tfrac12 + \tfrac12 = 1$, the only non-zero contribution to the z-component eigenvalue coming from the last two spin-orbitals, which are each multiplied by $\tfrac12$ (the spin contributions from the closed-shell core cancel in pairs).
In short, in dealing with angular momentum, we can completely ignore the spin-orbitals of a closed-shell core and work only on the spin-orbital product of the open shell that follows it. We can also re-name the two electrons they hold, calling them 1 and 2 instead of 5 and 6, and similarly for the operators that work on them: it can't make any difference! And now we can get down to the business of constructing all the eigenstates.
Let's denote the general state, with quantum numbers L, M and S, $M_S$, by $\Psi_{L,M;S,M_S}$ or the ket $|L, M; S, M_S\rangle$. So the top state will be $|L, L; S, S\rangle$; and we know from above that in terms of spin-orbitals this is $(l_1, l_1; \tfrac12, \tfrac12)(l_2, l_2; \tfrac12, \tfrac12) = (2p_{+1}\alpha)(2p_{+1}\alpha)$, showing only the open-shell AOs. Here we've put M = L for the top orbital angular momentum and $m_s = s = \tfrac12$ for the up-spin state.
First concentrate on the orbital quantum numbers, letting those for the spin sleep (we needn't even show them). All the theory we need has been done in Chapter 6 of Book 11, where we found that
$$L^-\Psi_{L,M} = \sqrt{(L + M)(L - M + 1)}\;\Psi_{L,M-1}.$$
To use this result for the two electrons of the open shell we write $L^- = L^-(1) + L^-(2)$ and apply $L^-(1)$ and $L^-(2)$ to the individual factors in the orbital product $(2p_{+1})(2p_{+1})$. We'll do that in the next example.
Example 5.3 The orbital eigenstates
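The many-electron eigenstates of the total orbital angular momentum operators, $L^2$ and $L_z$, can all be derived from the top state $\Psi_{L,M}$ with quantum numbers L = 2 and M = 2. From now on, we'll use $p_{+1}$, $p_0$, $p_{-1}$ to denote the three 2p-functions, with l = 1 and m = +1, 0, −1, so as not to confuse numbers and names!
The 1-electron step-down operator $L^-(i)$ (any i) acts as follows:
$$L^-(i)p_{+1}(i) = \sqrt2\,p_0(i), \qquad L^-(i)p_0(i) = \sqrt2\,p_{-1}(i), \qquad L^-(i)p_{-1}(i) = 0 \times p_{-1}(i).$$
Thus, to get $\Psi_{2,1}$ from $\Psi_{2,2}$ we use (5.2) and find $L^-\Psi_{2,2} = \sqrt{4 \times 1}\,\Psi_{2,1}$; so $\Psi_{2,1} = \tfrac12 L^-\Psi_{2,2}$. To put this result in terms of orbital products, we note that $L^- = L^-(1) + L^-(2)$ for the two electrons of the open shell and, repeating the step for each lower value of M, obtain

(2, +2)  $\Psi_{2,+2} = p_{+1}p_{+1}$
(2, +1)  $\Psi_{2,+1} = (p_0p_{+1} + p_{+1}p_0)/\sqrt2$
(2, 0)   $\Psi_{2,0} = (p_{-1}p_{+1} + 2p_0p_0 + p_{+1}p_{-1})/\sqrt6$
(2, −1)  $\Psi_{2,-1} = (p_{-1}p_0 + p_0p_{-1})/\sqrt2$
(2, −2)  $\Psi_{2,-2} = p_{-1}p_{-1}$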
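The repeated use of the step-down operator in Example 5.3 can be followed numerically. In the sketch below (Python with NumPy) each 2p-function is a unit vector in the assumed basis order $(p_{+1}, p_0, p_{-1})$ and a two-electron orbital product $p_a(1)p_b(2)$ is a Kronecker product; the function name `ladder_down` is an illustrative choice.

```python
import numpy as np

# One-electron step-down operator for l = 1 in the basis (p+1, p0, p-1):
# L- p+1 = sqrt(2) p0,  L- p0 = sqrt(2) p-1,  L- p-1 = 0.
LMINUS1 = np.sqrt(2.0) * np.array([[0.0, 0.0, 0.0],
                                   [1.0, 0.0, 0.0],
                                   [0.0, 1.0, 0.0]])
I3 = np.eye(3)

# Two-electron operator L- = L-(1) + L-(2) on the 9-dimensional product space.
LMINUS = np.kron(LMINUS1, I3) + np.kron(I3, LMINUS1)


def ladder_down(L=2):
    """Return {M: state vector} for M = L, ..., -L, starting from p+1 p+1."""
    p_plus = np.array([1.0, 0.0, 0.0])
    state = np.kron(p_plus, p_plus)          # top state, M = L = 2
    states = {L: state}
    for M in range(L, -L, -1):
        factor = np.sqrt((L + M) * (L - M + 1))
        state = LMINUS @ state / factor      # normalized state with M - 1
        states[M - 1] = state
    return states


states = ladder_down()
```

The component of a state along the product $p_a(1)p_b(2)$ sits at position 3a + b, where a, b = 0, 1, 2 label $p_{+1}, p_0, p_{-1}$; for example `states[0]` has the values 1/√6, 2/√6, 1/√6 at positions 2, 4 and 6.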
The five angular momentum eigenstates obtained in Example 5.3, all with the same total angular momentum quantum number L = 2, have M values going down from +2 to −2 in unit steps. Remember, however, that they arise from two electrons, each in a p-state with l = 1 and possible m-values +1, 0, −1. This is an example of angular momentum coupling, which we first met in Chapter 6 of Book 11 in dealing with electron spins. There is a convenient vector model for picturing such coupling in a classical way. The unit angular momentum of an electron in a p-type orbital is represented by an arrow of unit length l = 1 and its components m = +1, 0, −1 correspond to different orientations of the arrow: parallel coupling of two such angular momenta is shown by putting their arrows in line to give a resultant angular momentum of 2 units. This angular momentum vector, with quantum number L = 2, may also be pictured as an arrow but its allowed (i.e. observable) values may now go from M = L, the top state, down to M = −L. Again, this picture suggests that the angular momentum vector can only be found with 2L + 1 allowed orientations in space; but remember that such ideas are not to be taken seriously: they only remind us of how we started the journey from classical physics into quantum mechanics, dealt with in detail in Book 11.
What we have found is summarized in the vector diagrams of Figure 5.2.
[Figure 5.2. Vector diagrams for angular momentum: (a) l = 1; (b) L = 2]
Figure 5.2(a) indicates with an arrow of unit length the angular momentum vector for one electron in a p-orbital (quantum number l = 1). The allowed values of the z-component of the vector are m = 0, ±1 and the eigenstates are indicated as bold dots at m = +1 (arrow up), m = −1 (arrow down), and m = 0 (arrow perpendicular to vertical axis, zero z-component).
Figure 5.2(b) indicates with an arrow of length 2 units the resultant angular momentum of the two p-electrons with their unit vectors in line (parallel coupled). The broken line shows the projection of the L = 2 vector on the vertical axis, the bold dot corresponding to the eigenstate with L = 2, M = +1.
But are there other states, obtained by coupling the two unit vectors in different ways? Example 2.2 in Book 11, where we were dealing with spin angular momentum, suggests that there may be and suggests also how we might find them. The eigenstate indicated by the bold dot at M = +1 in Figure 5.2(b) was found to be $\Psi_{2,1} = (p_0p_{+1} + p_{+1}p_0)/\sqrt2$ and both terms are eigenstates of the operator $L_z = L_z(1) + L_z(2)$. So any other linear combination will also be an eigenstate with M = +1. But we are looking for the simultaneous eigenstates of the commuting operators $L^2$ and $L_z$; and we know that two such states must be orthogonal when they have different eigenvalues. It follows that the state $\Psi = (p_0p_{+1} - p_{+1}p_0)/\sqrt2$, which is clearly orthogonal to $\Psi_{2,1}$, will be the eigenstate we are looking for with eigenvalues (L = 1, M = 1), i.e. the top state of another series. It is also normalized (check it, remembering that the shift operators were chosen to conserve normalization of the eigenstates they work on) and so we can give $\Psi$ the subscripts 1, 1.
From $\Psi_{1,1} = (p_0p_{+1} - p_{+1}p_0)/\sqrt2$, we can start all over again, using the step-down operator to get first $\Psi_{1,0}$ and then $\Psi_{1,-1}$.
Finally, we can look for an eigenstate with M = 0 orthogonal to both $\Psi_{2,0}$ and $\Psi_{1,0}$. This must be a simultaneous eigenstate with a different value of the L quantum number: it can only be the missing $\Psi_{0,0}$.
Now we have found all the simultaneous eigenstates of the orbital angular momentum operators we can display them all in the diagram below, one row for each value of M:

M = +2:   |2, +2⟩
M = +1:   |2, +1⟩   |1, +1⟩
M =  0:   |2, 0⟩    |1, 0⟩    |0, 0⟩
M = −1:   |2, −1⟩   |1, −1⟩
M = −2:   |2, −2⟩
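The count of states in the diagram (five with L = 2, three with L = 1, one with L = 0) can be confirmed by diagonalizing $L^2$ in the nine-dimensional space of orbital products. The sketch below (Python with NumPy, units of ħ) uses the identity $L^2 = L^-L^+ + L_z^2 + L_z$; the basis order $(p_{+1}, p_0, p_{-1})$ and the function name `multiplicities` are assumptions made for illustration.

```python
from collections import Counter

import numpy as np

# One-electron l = 1 matrices in the basis (p+1, p0, p-1), in units of hbar.
LZ1 = np.diag([1.0, 0.0, -1.0])
LMINUS1 = np.sqrt(2.0) * np.array([[0.0, 0.0, 0.0],
                                   [1.0, 0.0, 0.0],
                                   [0.0, 1.0, 0.0]])
I3 = np.eye(3)


def total(op):
    """op(1) + op(2) on the two-electron product space."""
    return np.kron(op, I3) + np.kron(I3, op)


LZ = total(LZ1)
LMINUS = total(LMINUS1)
LPLUS = LMINUS.T                      # real matrices, so dagger = transpose
L2 = LMINUS @ LPLUS + LZ @ LZ + LZ    # L^2 = L- L+ + Lz^2 + Lz


def multiplicities():
    """Map each eigenvalue L(L+1) of L^2 to the number of states having it."""
    values = np.linalg.eigvalsh(L2)
    return dict(Counter(int(round(v)) for v in values))


# The state orthogonal to both |2,0> and |1,0>, written out explicitly:
# (p+1 p-1 - p0 p0 + p-1 p+1)/sqrt(3), at positions 2, 4, 6 of the product basis.
psi_00 = np.zeros(9)
psi_00[[2, 4, 6]] = np.array([1.0, -1.0, 1.0]) / np.sqrt(3.0)
```

Applying `L2` to `psi_00` gives the zero vector, so this combination is the $\Psi_{0,0}$ found by the orthogonality argument.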
In Chapter 3 we used Slater's rules (3.7) to derive an IPM approximation to the energy expectation value for a wave function expressed as an antisymmetrized spin-orbital product
$$\Psi = (1/N!)^{1/2}\sum_P \epsilon_P\,P[\psi_1\psi_2\ldots\psi_N]. \qquad (5.4)$$
For the Carbon atom, the basic spin-orbital product for this state would seem to have the explicit form
$$\psi_1\psi_2\ldots\psi_6 = (1s\alpha)(1s\beta)(2s\alpha)(2s\beta)(2p_{+1}\alpha)(2p_{+1}\alpha),$$
but now we have to recognise the degeneracy and the need to couple the angular momenta of the electrons in the p-orbitals. The last section has shown how to do this: we start from the top state, with maximum z-component ($L_z = 2$, $S_z = 1$ in atomic units) and set up a whole range of states by applying the shift operators $L^-$, $S^-$ to obtain other simultaneous eigenfunctions with lower quantum numbers (see Fig. 5.3).
The top state, before antisymmetrizing as in (5.4), will now have the associated product
$$\psi_1\psi_2\ldots\psi_6 = (1s\alpha)(1s\beta)(2s\alpha)(2s\beta)(2p_{+1}\alpha)(2p_{+1}\alpha)$$
and the states with lower values of M have been found in Example 5.3. The next one down will be derived by antisymmetrizing the product
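$$\Psi_1 = \sqrt{N!}\,A[(1s\alpha)(1s\beta)(2s\alpha)(2s\beta)(p_0\alpha)(p_{+1}\alpha)].$$
Given $\Psi = \sqrt{N!}\,A[\psi_1\psi_2\ldots\psi_N]$, the diagonal matrix element $\langle\Psi|H|\Psi\rangle$ is given by
$$\langle\Psi|H|\Psi\rangle = \sum_R\langle\psi_R|h|\psi_R\rangle + \tfrac12\sum_{R,S}\big[\langle\psi_R\psi_S|g|\psi_R\psi_S\rangle - \langle\psi_R\psi_S|g|\psi_S\psi_R\rangle\big],$$
with similar rules for off-diagonal elements between products that differ in one or two spin-orbitals.
The open-shell part of the wave function contains the factor
$$(1/\sqrt2)\,[p_0(\mathbf{r}_1)p_{+1}(\mathbf{r}_2) - p_{+1}(\mathbf{r}_1)p_0(\mathbf{r}_2)]\,\alpha(s_1)\alpha(s_2),$$
which is a linear combination $F = (F_1 - F_2)/\sqrt2$ of the two spin-orbital products. The energy expectation value therefore also involves an off-diagonal element $\langle\Psi_1|H|\Psi_2\rangle$ between the corresponding antisymmetrized functions, whose 2-electron part is built from integrals of the types $\langle\psi_1\psi_2|g|\psi_2\psi_1\rangle$ and $\langle\psi_1\psi_2|g|\psi_1\psi_2\rangle$.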
When the Hamiltonian contains no spin operators (the usual first approximation) the diagonal 1-electron integrals each give the energy $\epsilon_{2p}$ of a 2p-electron in the field of the $1s^22s^2$ core, but off-diagonal elements are zero because they are between different eigenstates. The 2-electron terms reduce to Coulomb and exchange integrals, similar to those used in Chapter 3, involving different 2p-orbitals. So it's a long and complicated story, but the rules in (5.7) provide all that's needed (apart from a bit of patience!).
The Carbon ground state in Example 5.2 is described as $^3P$ (triplet-P) because it has spin quantum number S = 1 and therefore 3 components, with $M_S = 0, \pm1$. But it is also degenerate owing to the three possible z-components of the orbital angular momentum, with $M_L = 0, \pm1$, for L = 1. As we shall see shortly, this degeneracy is removed or broken when small terms are included in the Hamiltonian. First, there is a term describing the interaction between the magnetic field arising from orbital motion of the electron (see Book 10) and the magnetic dipole associated with electron spin. This gives rise to a fine structure of the energy levels, which are separated but remain threefold degenerate for different values of $M_S$; only when an external magnetic field is applied, to fix a definite axis in space, is this remaining degeneracy broken, an effect called Zeeman splitting of the energy levels.
The energy-level structure of the lowest electronic states of the Carbon atom is indicated later in Figure 5.4, which shows the positions of the first few levels as determined experimentally by Spectroscopy.
There are other states belonging to the electron configuration $2p^2$, whose energies have not so far been considered. They are singlet states, labelled in Fig. 5.4 as $^1D$ and $^1S$; why have we not yet found them? The reason is simply that we started the energy calculation using a wave function with only spin-up electrons outside the closed shell $1s^22s^2$ and got the other functions by applying only the orbital step-down operator $L^-$: this leaves unchanged the spin factor $\alpha(s_1)\alpha(s_2)$ which represents a triplet state with S = 1. In fact, the Pauli Principle tells us at once that only the $^3P$ state is then physically acceptable: it has an orbital factor which is antisymmetric under exchange of electronic variables and can therefore be combined with the symmetric spin factor to give a wave function which is antisymmetric under electron exchange. The next example explains the results, which are indicated in Figure 5.4.
On applying the spin step-down operator $S^-$ to the state in which the triplet spin factor $\alpha(s_1)\alpha(s_2)$ is attached to the orbital eigenstate $|2, +2\rangle$ in Fig. 5.3 the result is $p_{+1}(\mathbf{r}_1)p_{+1}(\mathbf{r}_2)\,[\alpha(s_1)\beta(s_2) + \beta(s_1)\alpha(s_2)]/\sqrt2$. This is a linear combination of two spin-orbital products, namely $F = (F_1 + F_2)/\sqrt2$; but it still cannot give a wave function that satisfies the Pauli Principle, being totally symmetric under electron exchange. If we antisymmetrize F it just disappears!
Remember, however, that the step-down operator $L^-$ changed the quantum number M in a state $|L, M\rangle$ but not (see Fig. 5.3) the value of L. To change L we had to find a second combination of the component states in $|L, M\rangle$, orthogonal to the first. It's just the same for the spin eigenfunctions; and the orthogonal partner of $(F_1 + F_2)/\sqrt2$ is clearly $(F_1 - F_2)/\sqrt2$, which has a singlet spin factor with $S = M_S = 0$.
All we have to do, then, to get the singlet D-states is to use the original orbital eigenfunctions but attaching the spin factor $[\alpha(s_1)\beta(s_2) - \beta(s_1)\alpha(s_2)]/\sqrt2$ in place of the triplet factor $\alpha(s_1)\alpha(s_2)$. As the five states are degenerate it's enough to calculate the electronic energy for any one of them, e.g. the top state, with (after antisymmetrizing) the wave function $\Psi_{(L=2;S=0)}$. This is the linear combination $(\Psi_1 - \Psi_2)/\sqrt2$ of the antisymmetrized products
$$\Psi_1 = \sqrt2\,A[p_{+1}(\mathbf{r}_1)\alpha(s_1)\,p_{+1}(\mathbf{r}_2)\beta(s_2)], \qquad \Psi_2 = \sqrt2\,A[p_{+1}(\mathbf{r}_1)\beta(s_1)\,p_{+1}(\mathbf{r}_2)\alpha(s_2)].$$
The calculation continues along the lines of Example 5.2: the energy of the open-shell electrons in the field of the core will now be
You must have been wondering what makes a system jump from one quantum state to another. We met this question even in Book 10 when we were first thinking about electromagnetic radiation and its absorption or emission by a material system; and again in Book 11 when we first studied the energy levels of a 1-electron atom and the spectral series arising from transitions between the corresponding states. The interaction between radiation and matter is a very difficult field to study in depth; but it's time to make at least a start, using a very simple model.
stationary and remain so until you disturb the system in some way. Such a disturbance is a perturbation and depends on the time at which it is applied.
Suppose the system we're considering has a complete set of stationary-state eigenfunctions of its Hamiltonian H. As we know from Book 11, these satisfy the Schrödinger equation including the time,
$$H\Psi = -\frac{\hbar}{i}\frac{\partial\Psi}{\partial t}, \qquad (5.8)$$
even when H itself does not depend on t. The eigenfunctions may thus develop in time through a time-dependent phase factor, taking the general form $\Psi_n\exp[-(i/\hbar)E_nt]$, where $E_n$ is the nth energy eigenvalue. (You can verify that this is a solution of (5.8), provided $E_n$ satisfies the time-independent equation $H\Psi = E\Psi$.)
Now suppose $H = H_0 + V(t)$, where V(t) describes a small time-dependent perturbation applied to the unperturbed system whose Hamiltonian we now call $H_0$. And let's expand the eigenfunctions of H in terms of those of $H_0$, putting
$$\Psi(t) = \sum_n c_n(t)\,\Psi_n\exp[-(i/\hbar)E_nt],$$
where the expansion coefficient $c_n(t)$ changes slowly with time (the exponential factor usually oscillates very rapidly). On substituting this trial function in (5.8) it follows that
$$-\frac{\hbar}{i}\sum_n\Big(\frac{dc_n}{dt} - \frac{i}{\hbar}E_nc_n\Big)\Psi_n\exp[-(i/\hbar)E_nt] = \sum_n c_n\,H\Psi_n\exp[-(i/\hbar)E_nt],$$
and on taking the scalar product from the left with the eigenvector $\Psi_m$ we get (only the term with n = m remains on the left, owing to the factor $\langle\Psi_m|\Psi_n\rangle$)
$$\Big(-\frac{\hbar}{i}\frac{dc_m}{dt} + E_mc_m\Big)\exp[-(i/\hbar)E_mt] = \sum_n c_n\big[\langle\Psi_m|H_0|\Psi_n\rangle + \langle\Psi_m|V(t)|\Psi_n\rangle\big]\exp[-(i/\hbar)E_nt]. \qquad (5.9)$$
We start by supposing that the perturbation V depends on time only through being switched on at time t = 0 and finally switched off at a later time t, staying constant, and very small, between the two end points. Note that V may still be an operator (not just a numerical constant). If the system is initially known to be in an eigenstate $\Psi_n$ with n = i, then the initial values of all $c_m$ will be $c_m(0) = 0$ for $m \neq i$, while $c_i(0) = 1$. In (5.9) the $H_0$ terms on the right give just $E_mc_m\exp[-(i/\hbar)E_mt]$, since $\langle\Psi_m|H_0|\Psi_n\rangle = E_m\delta_{mn}$, and this cancels the same term on the left. Then, putting all $c_n$'s in the remaining sum equal to zero except the one with n = i, we find a single differential equation to determine the initial rate of change of all $c_m(t)$: it will be
$$\frac{dc_m}{dt} = -\frac{i}{\hbar}\,V_{mi}(t)\exp[(i/\hbar)(E_m - E_i)t], \qquad V_{mi} = \langle\Psi_m|V|\Psi_i\rangle,$$
which is a key equation for the first-order change in the coefficients (and means approximating all coefficients on the right in (5.9) by their initial values).
When the operator V is time-independent, the initial value $c_i(0) = 1$ will have changed after time t to
$$c_i(t) = 1 - \frac{i}{\hbar}V_{ii}\,t,$$
while the other coefficients, initially zero, will follow from (5.9) with $m \neq i$:
$$c_m(t) = -\frac{i}{\hbar}\int_0^t V_{mi}\exp[(i/\hbar)(E_m - E_i)t']\,dt' = -\frac{i}{\hbar}V_{mi}\left[\frac{\exp[(i/\hbar)(E_m - E_i)t']}{(i/\hbar)(E_m - E_i)}\right]_0^t = V_{mi}\,\frac{1 - \exp[(i/\hbar)(E_m - E_i)t]}{E_m - E_i}. \qquad (5.10)$$
Now you know (see Book 11, Chapter 3) that $|c_m(t)|^2$ will give the probability of observing the system in state $\Psi_m$, with energy $E_m$, at time t after starting in the initial state $\Psi_i$ at t = 0. Thus,
$$|c_m(t)|^2 = \frac{|V_{mi}|^2}{(E_m - E_i)^2}\,\big[1 - \exp[(i/\hbar)(E_m - E_i)t]\big]\big[1 - \exp[-(i/\hbar)(E_m - E_i)t]\big] = \frac{4|V_{mi}|^2}{(E_m - E_i)^2}\,\sin^2\!\Big(\frac{(E_m - E_i)}{2\hbar}\,t\Big),$$
where in the second step you had to do a bit of trigonometry (Book 2, Chapter 3): the product of the two brackets is $2 - 2\cos\theta = 4\sin^2(\theta/2)$, with $\theta = (E_m - E_i)t/\hbar$.
On setting the energy difference $(E_m - E_i) = x$, this result becomes
$$P(i \to m) = |c_m(t)|^2 = 4\,\frac{|V_{mi}|^2}{x^2}\,\sin^2\!\Big(\frac{xt}{2\hbar}\Big) \qquad (5.11)$$
and, if you think of this as a function of x, it shows a very sharp peak at x = 0 (which means $E_m \approx E_i$). The form of the peak is like that shown below in Figure 5.5.
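The closed form for $c_m(t)$ and the probability (5.11) can be checked against a direct numerical integration of the first-order equation. The following is a minimal sketch in Python with NumPy, assuming units in which ħ = 1 and a constant numerical value for $V_{mi}$; the function names are illustrative only.

```python
import numpy as np

HBAR = 1.0  # assumption: units in which hbar = 1


def amplitude_closed_form(V_mi, dE, t):
    """c_m(t) = V_mi [1 - exp(i dE t / hbar)] / dE, with dE = E_m - E_i (nonzero)."""
    return V_mi * (1.0 - np.exp(1j * dE * t / HBAR)) / dE


def amplitude_by_quadrature(V_mi, dE, t, n=20001):
    """c_m(t) = -(i/hbar) * integral_0^t V_mi exp(i dE t'/hbar) dt' (trapezoid rule)."""
    tau = np.linspace(0.0, t, n)
    f = V_mi * np.exp(1j * dE * tau / HBAR)
    integral = np.sum(0.5 * (f[1:] + f[:-1])) * (tau[1] - tau[0])
    return -1j / HBAR * integral


def probability(V_mi, dE, t):
    """P(i -> m) = 4 |V_mi|^2 sin^2(dE t / 2 hbar) / dE^2."""
    return 4.0 * abs(V_mi) ** 2 * np.sin(dE * t / (2.0 * HBAR)) ** 2 / dE ** 2
```

For any small $V_{mi}$ and nonzero energy difference the two amplitudes agree to within the accuracy of the quadrature, and the squared modulus of either reproduces `probability`.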
By treating x as a continuous variable we can easily get the total probability, $\sum_m P(i \to m)$, that the system will go into any state close to a final state of given energy $E_f$. For this purpose, we suppose the states are distributed with density $\rho(E_m)$ per unit range around one with energy $E_f$. In that case (5.11) will yield a total probability of transition from initial state i to a final state with energy close to $E_f$, namely
$$W(i \to f) = \sum_m P(i \to m) \to \int P(i \to m)\,\rho(E_m)\,dE_m.$$
This quantity can be evaluated by using a definite integral well known to Mathematicians:
$$\int_{-\infty}^{+\infty}\frac{\sin^2\alpha x}{\alpha x^2}\,F(x)\,dx = \pi F(0), \qquad (5.12)$$
where F(x) is any well-behaved function of x and the result holds in the limit of large $\alpha$. This means that the delta function
$$\delta(0; x) = (\pi\alpha)^{-1}\,\frac{\sin^2\alpha x}{x^2}, \qquad (5.13)$$
when included in the integrand of $\int F(x)\,dx$, simply picks out the value of the function that corresponds to x = 0 and cancels the integration. It serves as the kernel of an integral operator, already defined in Book 11 Section 9, and is a particular representation of the Dirac delta function.
On using (5.12) and (5.13), with $\alpha = (t/2\hbar)$, in the expression for $W(i \to f)$, we find (do the substitution, remembering that $x = E_m - E_i$)
$$W(i \to f) = \int P(i \to m)\,\rho(E_m)\,dE_m = 4|V_{mi}|^2\,\frac{\pi t}{2\hbar}\int\rho(E_m)\,\delta(0; x)\,dE_m = \frac{2\pi t}{\hbar}\,|V_{mi}|^2\,\rho(E_f), \qquad (5.14)$$
where the delta function ensures that $x = E_m - E_i = 0$ and consequently that transitions may occur only when the initial and final states have the same energy, $E_m \approx E_f = E_i$. In other words, since $E_m \approx E_i$, the Energy Conservation Principle remains valid in quantum physics, within the limits implied by the Uncertainty Principle.
For short times (still long on an atomic scale) this quantity is proportional to t and allows us to define a transition rate, a probability per unit time, as
$$w(i \to f) = \frac{2\pi}{\hbar}\,|V_{fi}|^2\,\rho(E_f). \qquad (5.15)$$
This formula has very many applications in quantum physics and is generally known as Fermi's Golden Rule. In this first application, to a perturbation not depending on time, the energy of the system is conserved.
The form of the transition probability $P(i \to f)$, from which (5.15) was derived, is shown below in Figure 5.5 and is indeed sharp. The half-width of the peak is in fact h/t and thus diminishes with time, being always consistent with what is allowed by Heisenberg's uncertainty principle for the energy of the states.
[Figure 5.5. Probability of a transition (see text); the plot shows $P(i \to f)$ with its peak marked at $E_f = E_i$]
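The step from (5.11) to (5.14) rests on the area under the peak growing in proportion to t. That can be checked by brute-force integration; the sketch below (Python with NumPy) assumes ħ = 1, a constant density of states and a constant matrix element, and the cut-off `half_range` and grid size are arbitrary choices that limit the accuracy to roughly one part in a thousand.

```python
import numpy as np

HBAR = 1.0  # assumption: units in which hbar = 1


def transition_probability(x, t, V2=1.0):
    """P = 4 |V|^2 sin^2(x t / 2 hbar) / x^2, written with sinc so x = 0 is safe.

    np.sinc(y) is sin(pi y)/(pi y), hence the factor 2 pi in the argument.
    """
    return V2 * (t / HBAR) ** 2 * np.sinc(x * t / (2.0 * np.pi * HBAR)) ** 2


def total_probability(t, V2=1.0, rho=1.0, half_range=2000.0, n=800001):
    """Integrate P(x) * rho over x = E_m - E_i with the trapezoid rule."""
    x = np.linspace(-half_range, half_range, n)
    p = transition_probability(x, t, V2) * rho
    return np.sum(0.5 * (p[1:] + p[:-1])) * (x[1] - x[0])


def golden_rule_total(t, V2=1.0, rho=1.0):
    """W = (2 pi t / hbar) |V|^2 rho, the right-hand side of (5.14)."""
    return 2.0 * np.pi * t / HBAR * V2 * rho
```

Comparing `total_probability(t)` with `golden_rule_total(t)` for a few values of t shows the linear growth with time that makes a constant transition rate meaningful.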
As a second example, let's think about the absorption and emission of radiation, which lie at the heart of all forms of Spectroscopy. The quanta of energy are carried by photons, but in Book 10 radiation was described in terms of the electromagnetic field in which the electric and magnetic field vectors, E and B, oscillate at a certain frequency, depending on the type of radiation involved (very low frequency for radio waves, much higher for visible light ranging from red up to blue, and much higher still for X-rays and cosmic rays).
That was the classical picture of light as a wave motion. But in quantum physics, a ray of light is pictured as a stream of photons; and much of Book 11 was devoted to getting an understanding of this wave-particle duality. The picture that finally came out was that a quantum of radiant energy could best be visualized as a highly concentrated packet of waves, sharing the properties of classical fields and quantum particles. (Read Chapter 5 of Book 11 again if you're still mystified!)
So now we'll try to describe the interaction between an electronic system (consisting of real particles like electrons and nuclei) and a photon field in which each photon carries energy $\epsilon = h\nu$, where $\nu$ is the frequency of the radiation and h is Planck's constant. This is the semi-classical picture, which is completely satisfactory in most applications and allows us to go ahead without needing more difficult books on quantum field theory.
The first step is to think about the effect of an oscillating perturbation of the form
$$V(t) = Ve^{i\omega t} + V^\dagger e^{-i\omega t} \qquad (\omega > 0),$$
the operator V being small and time-independent, applied to a system with Hamiltonian $H_0$ and eigenstates $\Psi_n\exp[-(i/\hbar)E_nt]$, with energy $E_n$.
Transitions may occur, just as they did in the case where there was no oscillating field and the frequency-dependent factors were absent. But now there are two terms in the perturbation and each will have its own effect. The argument follows closely the one used when the $\omega$-terms were missing, but the equation for the time-dependent coefficient $c_m(t)$ will now be a sum of two parts:
$$c_m(t) = V_{mi}\,\frac{1 - \exp[(i/\hbar)(\hbar\omega + E_m - E_i)t]}{(\hbar\omega + E_m - E_i)} - V^\dagger_{mi}\,\frac{1 - \exp[-(i/\hbar)(\hbar\omega - E_m + E_i)t]}{(\hbar\omega - E_m + E_i)}.$$
On putting $\omega = 0$, the first term reduces to the result given in (5.10), for a single constant perturbation: this was large only when $E_m \approx E_i$, but now it is large only when $E_m - E_i + \hbar\omega \approx 0$. The first term can therefore produce a transition from state $\Psi_i$ to $\Psi_m$ only when the radiation frequency $\nu$ $(= \omega/2\pi)$ is such that $\hbar\omega = (h/2\pi)(2\pi\nu) \approx (E_i - E_m)$. The transition will thus occur only when $h\nu \approx E_i - E_m$. This corresponds to emission of a photon, leaving the system in a state with lower energy $E_m$.
In fact, the transition energy will not be exactly $E_i - E_m$ but rather $E_i - E_f$, where $E_f$ will be an average energy of the group of states into which the emitted electron lands. The calculation is completed, as in the case of a constant perturbation, by assuming a density-of-states function $\rho(E_f)$ for the final state. In the case of emission, the probability of a transition into state $\Psi_m$ at time t will be
$$P(i \to m) = |c_m(t)|^2 = \frac{4|V_{mi}|^2}{(h\nu + E_m - E_i)^2}\,\sin^2\!\Big(\frac{(\hbar\omega + E_m - E_i)}{2\hbar}\,t\Big),$$
which you can get using the same argument that follows equation (5.10). Finally, following the same steps (do it!) that led to (5.14), you'll get the transition rate: the probability per unit time for emission of a photon of energy $h\nu$ $(= \hbar\omega)$ is
$$w(i \to f) = (2\pi/\hbar)\,|V_{fi}|^2\,\rho(E_f)\,\delta(h\nu + E_f - E_i).$$
The absorption of a photon of energy $h\nu$ is brought about by the second term in (5.16) and the calculation of the transition rate runs along exactly parallel lines. Thus we find

Emission of a quantum of energy $h\nu$:
$$w(i \to f) = (2\pi/\hbar)\,|V_{fi}|^2\,\rho(E_f)\,\delta(h\nu + E_f - E_i)$$
Absorption of a quantum of energy $h\nu$:
$$w(i \to f) = (2\pi/\hbar)\,|V_{fi}|^2\,\rho(E_f)\,\delta(h\nu - E_f + E_i)$$

As the photon energy is a positive quantity, the final state in absorption will have higher energy than that in the initial state; and, as you can see, this is nicely taken care of by the delta-function. In (5.20) the Initial and Final states are labelled i and f and the delta-function has the form shown in Figure 5.5. Note that the delta-function peak for emission of a photon is exactly like that shown in the Figure, but the photon frequency is given by putting $h\nu + E_f - E_i = 0$: this means that $h\nu = E_i - E_f$, so the final state has lower energy than the one the electron comes from; and that corresponds to the peak being displaced from energy $E_f \approx E_i$ in Figure 5.5 to $E_f \approx E_i - h\nu$ in the emission process. In the same way, the absorption of a photon of energy $h\nu$ would be shown by displacing the peak at energy $E_f$ to one at $E_f \approx E_i + h\nu$.
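The resonance conditions for emission and absorption can be seen directly by integrating the first-order equation for an oscillating perturbation and scanning the frequency. The sketch below (Python with NumPy) is a simplified illustration: it assumes ħ = 1 and takes both matrix elements, $V_{mi}$ and $V^\dagger_{mi}$, to be the same small real number, which is a hypothetical choice made only to keep the example short.

```python
import numpy as np

HBAR = 1.0  # assumption: units in which hbar = 1


def first_order_amplitude(omega, dE, t, V=0.01, n=20001):
    """c_m(t) = -(i/hbar) int_0^t [V e^{i w t'} + V* e^{-i w t'}] e^{i dE t'/hbar} dt'.

    dE = E_m - E_i; the integral is done with the trapezoid rule.
    """
    tau = np.linspace(0.0, t, n)
    v_t = V * np.exp(1j * omega * tau) + np.conj(V) * np.exp(-1j * omega * tau)
    f = v_t * np.exp(1j * dE * tau / HBAR)
    integral = np.sum(0.5 * (f[1:] + f[:-1])) * (tau[1] - tau[0])
    return -1j / HBAR * integral


def resonant_omega(dE, t=50.0):
    """Scan omega and return the value that maximizes |c_m(t)|^2."""
    omegas = np.linspace(0.5, 1.5, 201)
    probs = [abs(first_order_amplitude(w, dE, t)) ** 2 for w in omegas]
    return omegas[int(np.argmax(probs))]


omega_emission = resonant_omega(dE=-1.0)    # final state lower:  hbar*omega ~ E_i - E_m
omega_absorption = resonant_omega(dE=+1.0)  # final state higher: hbar*omega ~ E_m - E_i
```

In both cases the scan peaks close to ħω = |E_m − E_i|, the first term of the perturbation being responsible when the final state is lower and the second when it is higher.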
It may seem that this chapter, with all its difficult theory, has not taken us far beyond the IPM picture we started from, where electrons were supposed independent and assigned to the AOs obtained by solving a 1-electron Schrödinger equation. But in fact we've come a very long way: we're now talking about a real many-electron system (and not only an atom!) and are already finding how far it's possible to go from the basic principles of quantum mechanics (Book 11) towards an understanding of the physical world. We haven't even needed a pocket calculator and we're already able to explain what goes on in Spectroscopy! Of course, we haven't been able to fill in all the details, which will depend on being able to calculate matrix elements like $V_{if}$ (that require approximate wave functions for initial and final states). But we've made a good start.
In Section 5.3 we were dealing with the effect of a small change in the Hamiltonian of a system, from $H_0$ to $H = H_0 + V$, where the operator $V$ was simply switched on at time $t = 0$ and switched off at time $t$. Now we'll ask what difference the presence of a time-independent $V$ will make to the eigenstates $\Psi_n$ of $H_0$, which we'll call the unperturbed system. All the states will be stationary states and the time-dependent phase factors $\exp[-(i/\hbar)E_n t]$ may be dropped, having no effect on the expectation values of any time-independent quantities (look back at Section 5.3 of Book 11 if you need to). So we'll be dealing with the Schrödinger equation without the time, $H\Psi = E\Psi$.
We'll also want to get some picture of what the perturbation is doing to the system; and that will be provided by various density functions, which can show how the electron distribution is responding. The probability density $P(\mathbf{r})$, the probability per unit volume of finding an electron at point $\mathbf{r}$, is well known to you for a 1-electron system, as the squared modulus $|\psi(\mathbf{r})|^2$ of the wave function. But now we need to generalize the idea to a many-electron system and to include the spin variables $s_1, s_2, \ldots$, so we still have work to do.
As in Chapter 1, let's suppose we have a complete set of functions $\Phi_1, \Phi_2, \ldots \Phi_k, \ldots$,
in terms of which any wave function of the particle coordinates of the system can be
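and if we start from the matrix form $\mathbf{H}\mathbf{c} = E\mathbf{c}$ it is clear that all the off-diagonal elements of $\mathbf{H}$ will be small, containing only the perturbation operator. As a first approximation, the diagonal part that remains on neglecting them altogether has elements $H_{kk} = E_k^0 + H'_{kk}$, so that
$$\delta^{(1)}E_k = H'_{kk} = \langle \Phi_k|H'|\Phi_k\rangle,$$
where $\delta^{(1)}E_k$ means first-order change in $E_k$. On writing out in full the matrix element this becomes
$$\delta^{(1)}E_k = \int \Phi_k^*(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,H'\,\Phi_k(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,d\mathbf{x}_1 d\mathbf{x}_2 \ldots d\mathbf{x}_i \ldots d\mathbf{x}_N.$$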
This is a very general result. The first-order change in the energy $E_k$ of a state $\Phi_k$, produced by a perturbation $H'$, can be approximated as the expectation value of the perturbation in the unperturbed state $\Phi_k$. (Here, for simplicity, the state is taken to be
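To see how good the first-order estimate can be, here is a minimal numerical sketch. It assumes a made-up 4-state system: the unperturbed energies in `E0` and the small symmetric matrix `H1` (standing for the perturbation in the unperturbed basis) are hypothetical numbers chosen only for illustration. The diagonal elements of `H1` give the first-order shifts, which are then compared with the exact eigenvalues of the full matrix.

```python
import numpy as np

# Hypothetical unperturbed energies E_k^0 (H0 is diagonal in its own eigenbasis)
E0 = np.array([0.0, 1.0, 2.5, 4.0])

# Hypothetical small symmetric perturbation matrix H'_kl in the same basis
H1 = 0.01 * np.array([[1.0,  2.0, 0.0,  1.0],
                      [2.0, -1.0, 1.0,  0.0],
                      [0.0,  1.0, 2.0,  1.0],
                      [1.0,  0.0, 1.0, -2.0]])

# First-order estimate: E_k is approximately E_k^0 + H'_kk
first_order = E0 + np.diag(H1)

# Exact eigenvalues of H = H0 + H' (returned in ascending order)
exact = np.linalg.eigvalsh(np.diag(E0) + H1)

# Largest difference between the first-order estimate and the exact result
max_error = np.max(np.abs(np.sort(first_order) - exact))
print(first_order, exact, max_error)
```

The remaining error comes from the neglected off-diagonal elements and is of second order in the perturbation, so it shrinks much faster than the first-order shift when `H1` is scaled down.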
How to interpret this result: the electron density function
First we have to think about evaluating the matrix element of $H'$, the change in the Hamiltonian, and that brings us back to the old problem of how to go from one particle to many. We start from the $N$-electron Hamiltonian $H = \sum_i h(i) + \tfrac{1}{2}\sum_{i \neq j} g(i,j)$ and add a perturbation $H'$. The simplest kind of change is just a change of the field in which the electrons move, which changes the potential energy function $V(i)$ for every electron $i = 1, 2, \ldots N$. Thus $H_0$ becomes $H = H_0 + H'$ with $H' = \sum_i \delta h(i) = \sum_i \delta V(i)$, since the KE operator is not changed in any way. And the matrix element in (5.23) therefore becomes
$$\langle \Phi_k|H'|\Phi_k\rangle = \int \Phi_k^*(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,H'\,\Phi_k(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,d\mathbf{x}_1 d\mathbf{x}_2 \ldots d\mathbf{x}_i \ldots d\mathbf{x}_N$$
$$= \int \Phi_k^*(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,\sum_i \delta V(i)\,\Phi_k(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,d\mathbf{x}_1 d\mathbf{x}_2 \ldots d\mathbf{x}_i \ldots d\mathbf{x}_N.$$
Remember that a typical integration variable $\mathbf{x}_i$ really stands for the three components of the position vector $\mathbf{r}_i$ of Electron $i$, together with its spin variable $s_i$, so the volume element $d\mathbf{x}_i$ means $d\mathbf{r}_i\,ds_i$. Remember also that
$$\Phi_k(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,\Phi_k^*(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,d\mathbf{x}_1 d\mathbf{x}_2 \ldots d\mathbf{x}_i \ldots d\mathbf{x}_N$$
gives the probability of finding electrons labelled $1, 2, \ldots i, \ldots N$ simultaneously in the corresponding volume elements. This is the basic interpretation of the Schrödinger wave function (see Book 11 Chapter 3) extended to a system of many particles.
If we had only two particles, described by a wave function $\Psi(\mathbf{x}_1, \mathbf{x}_2)$, the probability of finding Electron 1 in volume element $d\mathbf{x}_1$, and 2 at the same time in $d\mathbf{x}_2$, would be $\Psi(\mathbf{x}_1, \mathbf{x}_2)\Psi^*(\mathbf{x}_1, \mathbf{x}_2)\,d\mathbf{x}_1 d\mathbf{x}_2$, the probabilities being per unit volume. But the probability of finding Electron 1 in $d\mathbf{x}_1$ and Electron 2 just anywhere would be obtained by summing (in this case integrating) over all possible positions of the second box $d\mathbf{x}_2$, i.e.
$$d\mathbf{x}_1 \int \Psi(\mathbf{x}_1, \mathbf{x}_2)\Psi^*(\mathbf{x}_1, \mathbf{x}_2)\,d\mathbf{x}_2.$$
Owing to the antisymmetry principle, (2.15) in Chapter 2, the same result would follow if we wanted the probability of finding Electron 2 in box $d\mathbf{x}_1$, and Electron 1 just anywhere. (You can prove this by interchanging 1 and 2 in the wave function and noting that $\Psi\Psi^*$ will be unchanged.) So, with two electrons, the integration only has to be done once and the result then multiplied by 2. The probability of finding an electron, no matter which, in $d\mathbf{x}_1$ can thus be denoted by $\rho(\mathbf{x}_1)\,d\mathbf{x}_1$, where $\rho(\mathbf{x}_1) = 2\int \Psi(\mathbf{x}_1, \mathbf{x}_2)\Psi^*(\mathbf{x}_1, \mathbf{x}_2)\,d\mathbf{x}_2$, and $\rho$ is called the one-electron probability density.
For $N$ electrons a similar result will follow when you think of Electron 1 in box $d\mathbf{x}_1$ and don't care where all the $(N-1)$ other electrons are: you get the probability of finding it there by integrating over all positions of the remaining volume elements. And as you'll get the same result for whichever electron you assign to box $d\mathbf{x}_1$ you can define
$$\rho(\mathbf{x}_1) = N \int \Psi(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,\Psi^*(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,d\mathbf{x}_2 \ldots d\mathbf{x}_i \ldots d\mathbf{x}_N$$
as the probability per unit volume of finding an electron (no matter which) at point $\mathbf{x}_1$.
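The definition is easy to try out numerically. The sketch below is only an illustration under stated assumptions: the continuous space-spin variable is replaced by `M` discrete points (so integrals become sums), and the two "spin-orbitals" `a` and `b` are random orthonormal vectors, not real atomic orbitals. It builds an antisymmetric 2-electron function and checks that integrating over either electron gives the same density, which sums to the number of electrons.

```python
import numpy as np

M = 6                                   # assumed number of discrete "points" x
rng = np.random.default_rng(1)

# Two orthonormal model spin-orbitals (columns of Q), purely illustrative
Q, _ = np.linalg.qr(rng.normal(size=(M, 2)))
a, b = Q[:, 0], Q[:, 1]

# Antisymmetrized product: Psi[x1, x2] = (a(x1)b(x2) - b(x1)a(x2)) / sqrt(2)
Psi = (np.outer(a, b) - np.outer(b, a)) / np.sqrt(2.0)

norm = np.sum(Psi**2)                   # normalization of the wave function

# rho(x1) = N * sum over x2 of Psi * Psi^*, with N = 2 (real Psi here)
rho_from_1 = 2.0 * np.sum(Psi**2, axis=1)   # Electron 1 fixed, Electron 2 anywhere
rho_from_2 = 2.0 * np.sum(Psi**2, axis=0)   # Electron 2 fixed, Electron 1 anywhere

n_electrons = np.sum(rho_from_1)
print(norm, n_electrons)
```

For this antisymmetrized product the density also comes out as `a**2 + b**2`, a sum of contributions from the two occupied spin-orbitals.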
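Now we can come back to the Physics. The expectation value of $H'$ in state $\Psi = \Phi_k$ will be the sum of $N$ identical terms, coming from the 1-electron quantities $\delta V(i)$. It will thus be
$$\langle \Phi_k|H'|\Phi_k\rangle = \int \Phi_k^*(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,H'\,\Phi_k(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,d\mathbf{x}_1 d\mathbf{x}_2 \ldots d\mathbf{x}_i \ldots d\mathbf{x}_N$$
$$= N \int \Phi_k^*(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,\delta V(1)\,\Phi_k(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,d\mathbf{x}_1 d\mathbf{x}_2 \ldots d\mathbf{x}_i \ldots d\mathbf{x}_N.$$
This can be nicely expressed in terms of the 1-electron density defined in (5.24) and gives for the first-order energy change (5.23)
$$\delta^{(1)}E_k = \langle \Phi_k|H'|\Phi_k\rangle = \int \delta V(1)\,\rho(\mathbf{x}_1)\,d\mathbf{x}_1,$$
all expressed in terms of the space-spin coordinates of a single electron, just as if we were dealing with a one-electron system!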
A generalization: the density matrix
Although (5.25) is a very useful result, as you'll see presently, you may want to know what happens if $H' = \sum_i \delta h(i)$ where $\delta h(i)$ is a true operator, not just multiplication by a function $\delta V(i)$. In that case it seems that the reduction to (5.25) is not possible because the operator stands between $\Psi^*$ and $\Psi$ and will work only on the function that stands to its right. In the step before (5.25) we were able to bring the wave function and its complex conjugate together, to get the probability density, because
$$\Phi_k^*(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,\delta V(1)\,\Phi_k(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N) = \delta V(1)\,\Phi_k(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,\Phi_k^*(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N):$$
the order of the factors doesn't matter when they are just multipliers. But you can't do that if $\delta h(1)$ contains differential operators: $\partial/\partial z_1$, for example, will differentiate everything that stands to its right and contains coordinates of Electron 1. Here we want the operator to work only on the $\Psi$ factor, which contains $\mathbf{x}_1$, and not on $\Psi^*$. So we have to trick the operator by writing $\Phi_k^*(\mathbf{x}_1', \mathbf{x}_2, \ldots \mathbf{x}_N)$, where $\mathbf{x}_1'$ is a new variable, instead of $\Phi_k^*(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)$, changing it back to $\mathbf{x}_1$ after the operation.
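That makes very little difference: the definition of the 1-electron density in (5.24) is replaced by that of a 1-electron density matrix, containing the two variables $(\mathbf{x}_1, \mathbf{x}_1')$:
$$\rho(\mathbf{x}_1; \mathbf{x}_1') = N \int \Psi(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,\Psi^*(\mathbf{x}_1', \mathbf{x}_2, \ldots \mathbf{x}_N)\,d\mathbf{x}_2 \ldots d\mathbf{x}_i \ldots d\mathbf{x}_N.$$
The expectation value of $H' = \sum_i \delta h(i)$ is then given by
$$\langle \Psi|H'|\Psi\rangle = \int [\delta h(1)\,\rho(\mathbf{x}_1; \mathbf{x}_1')]_{(\mathbf{x}_1' = \mathbf{x}_1)}\,d\mathbf{x}_1,$$
where the integration is done after applying the operator and identifying the variables. So the generalization needed is very easy to make (you've done it before in Section 3.5).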
As a first application, however, let's think of applying a uniform electric field to an atom and asking how it will change the energy $E$ of any stationary state $\Psi$. (We'll drop the state label $k$ as it's no longer needed.)
Example 5.5 Application of external field F to an atom
(Note that we'll be in trouble if we use $E$ for the electric field, as $E$ is used everywhere for energy; so now we'll change to $F$ when talking about the electric field. You may also need to refer back to Book 10.)
The components $F_x, F_y, F_z$ of the field vector arise as the gradients of an electric potential $\varphi(x, y, z)$ along the three axes in space,
$$F_x = -\frac{\partial \varphi}{\partial x}, \quad F_y = -\frac{\partial \varphi}{\partial y}, \quad F_z = -\frac{\partial \varphi}{\partial z},$$
and an obvious solution is $\varphi(x, y, z) = -xF_x - yF_y - zF_z$. (Check it by doing the partial differentiations.)
The change in potential energy of an electron (charge $-e$) at field point $(x_i, y_i, z_i)$, due to the applied electric field, is thus $\delta V(i) = e(x_i F_x + y_i F_y + z_i F_z)$. We can use this in (5.25) to obtain the first-order change in energy of the quantum state $\Psi = \Phi_k$ produced by the field:
$$\delta^{(1)}E = e \int (x_1 F_x + y_1 F_y + z_1 F_z)\,\rho(\mathbf{x}_1)\,d\mathbf{x}_1.$$
Now the integration $\int (\ldots)\,d\mathbf{x}_1$ is over both space and spin variables, $d\mathbf{x}_1 \equiv d\mathbf{r}_1\,ds_1$, but in this example $\delta h$ contains no spin operators; and even when the wave function contains spin-orbitals, with spin factors $\alpha(s_i)$ and $\beta(s_i)$, the spin dependence will disappear when the spin integrations are done. A spinless density function can therefore be defined as $P(\mathbf{r}_1) = \int \rho(\mathbf{x}_1)\,ds_1$ and the last equation re-written as
$$\delta^{(1)}E = \int \delta V(\mathbf{r}_1)\,P(\mathbf{r}_1)\,d\mathbf{r}_1 = e \int (x_1 F_x + y_1 F_y + z_1 F_z)\,P(\mathbf{r}_1)\,d\mathbf{r}_1,$$
where the spinless density $P(\mathbf{r}_1)$ depends only on the spatial coordinates of a single point in ordinary 3-space.
Example 5.5 has given a very transparent expression for the first-order energy change that goes with a modification of the potential field in which the $N$ electrons of a system move. If the potential energy of a single electron at point $\mathbf{r}_1$ is changed by $\delta V(\mathbf{r}_1)$, the first-order energy change of the whole electron distribution will be (dropping the label 1 on the integration variable)
$$\delta^{(1)}E = \int \delta V(\mathbf{r})\,P(\mathbf{r})\,d\mathbf{r},$$
just as if the distribution were a smeared out charge, with $P(\mathbf{r})$ electrons per unit volume at point $\mathbf{r}$.
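You can try the smeared-out charge picture on a computer. The following is a rough sketch, not a production calculation: it assumes atomic units (with $e = 1$), takes the hydrogen 1s density $P(\mathbf{r}) = e^{-2r}/\pi$ as the electron distribution, and uses a simple cubic grid whose size and spacing are arbitrary illustrative choices. The field strength `Fz` is also a made-up number.

```python
import numpy as np

# Cubic grid in atomic units; box size and spacing are illustrative choices
axis = np.linspace(-8.0, 8.0, 81)
h = axis[1] - axis[0]
X, Y, Z = np.meshgrid(axis, axis, axis, indexing="ij")
dV = h**3                              # volume element for the grid sum

def density_1s(z0=0.0):
    """Spinless density exp(-2|r - r0|)/pi of a hydrogen 1s orbital centred at (0, 0, z0)."""
    r = np.sqrt(X**2 + Y**2 + (Z - z0)**2)
    return np.exp(-2.0 * r) / np.pi

Fz = 0.01                              # hypothetical uniform field along the z axis
delta_V = Fz * Z                       # delta V = e(x Fx + y Fy + z Fz) with e = 1, Fx = Fy = 0

# Density centred on the origin: the first-order energy change vanishes by symmetry
P = density_1s()
n_electrons = np.sum(P) * dV
dE_centred = np.sum(delta_V * P) * dV

# The same density displaced by 0.4 along z: the energy change is Fz * 0.4 per electron
P_shifted = density_1s(z0=0.4)
dE_shifted = np.sum(delta_V * P_shifted) * dV
print(n_electrons, dE_centred, dE_shifted)
```

A spherical density centred on the origin feels no first-order effect from a uniform field, because contributions from $+z$ and $-z$ cancel; displacing the charge cloud gives an energy change proportional to the displacement.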
This is an example of the Hellmann-Feynman theorem which, as we'll see in the next chapter, is enormously important in leading to a simple picture of the origin of the forces that hold atoms together in molecules. The result is accurate if the wave function is exact or has been obtained by certain types of variational method, but its main value lies in providing clear pictorial interpretations of difficult theory. It can also lead to simple expressions for many quantities that are easily measured experimentally. Thus, in Example 5.5,
(1) E = e
x1 x1
where the prime is removed before doing the integration. We note in passing that this general result reduces to the one given in (3.7) for an IPM wave function with occupied spin-orbitals $\psi_1, \psi_2, \ldots \psi_i, \ldots \psi_N$.
Thus, on putting
exactly as in (3.7).
Example 5.6 has verified the expression for the density function (5.24) and has given its form in terms of the occupied spin-orbitals in an IPM wave function. Thus
$$\rho(\mathbf{x}_1; \mathbf{x}_1') = \sum_i \psi_i(\mathbf{x}_1)\,\psi_i^*(\mathbf{x}_1')$$
is the 1-electron density matrix, the two variables corresponding to row and column indices in a matrix representation of a density operator $\rho$. The primed and unprimed variables are shown here in a corresponding standard order.
We haven't forgotten about the spin! If you write $\mathbf{x}_1 = \mathbf{r}_1, s_1$ and aren't interested in whether the spin is up ($s = +\tfrac{1}{2}$) or down ($s = -\tfrac{1}{2}$), then you can sum over both possibilities to obtain a spinless density function $P(\mathbf{r}_1) = \int \rho(\mathbf{r}_1, s_1)\,ds_1$. This is the probability/unit volume of finding an electron, of either spin, at point $\mathbf{r}_1$ in ordinary 3-space.
The terms in (5.29) depend on spin-orbitals $\psi_i(\mathbf{x}) = \phi_i(\mathbf{r})\theta(s)$ where the spin factor $\theta$ may be $\alpha$ or $\beta$; and $\rho(\mathbf{x}_1)$ may therefore be written as a sum of the form
$$\rho(\mathbf{x}_1) = P_\alpha(\mathbf{r}_1)\,\alpha(s_1)\alpha^*(s_1) + P_\beta(\mathbf{r}_1)\,\beta(s_1)\beta^*(s_1),$$
in which the $\alpha$- and $\beta$-terms are
$$P_\alpha(\mathbf{r}_1) = \sum_{i\,(\alpha)} \phi_i(\mathbf{r}_1)\phi_i^*(\mathbf{r}_1), \qquad P_\beta(\mathbf{r}_1) = \sum_{i\,(\beta)} \phi_i(\mathbf{r}_1)\phi_i^*(\mathbf{r}_1).$$
Here the first sum comes from occupied spin-orbitals with spin factors $\alpha$ and the second from those with $\beta$ factors. The density matrix may clearly be written in a similar form, but with an extra variable coming from the starred spin-orbital and carrying the prime.
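The results (5.29) to (5.32) all followed from Example 5.6 and the definition
$$\rho(\mathbf{x}; \mathbf{x}') = \sum_i \psi_i(\mathbf{x})\,\psi_i^*(\mathbf{x}'),$$
which gave
$$\langle \Psi|\sum_i h(i)|\Psi\rangle = \int [h\,\rho(\mathbf{x}; \mathbf{x}')]_{(\mathbf{x}' = \mathbf{x})}\,d\mathbf{x}.$$
These are the IPM forms of the 1-electron density functions and their main properties. When electron interaction is admitted we shall need corresponding results for 2-electron densities: these are derived in the next example.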
Example 5.7 The 2-electron density matrix
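The derivation follows closely that in Example 5.6: Probability of Electron 1 in $d\mathbf{x}_1$ and Electron 2 simultaneously in $d\mathbf{x}_2$
$$= d\mathbf{x}_1 d\mathbf{x}_2 \int \Psi(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,\Psi^*(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,d\mathbf{x}_3 \ldots d\mathbf{x}_i \ldots d\mathbf{x}_N.$$
Here the $\Psi^*$ function has been written on the right, ready for defining the density matrix, and the first two volume elements are kept fixed. Again, Electrons $i$ and $j$ would be found in $d\mathbf{x}_1$ and $d\mathbf{x}_2$, respectively, with exactly the same probability. But the pair could be chosen from the $N$ electrons in $N(N-1)$ different ways (with one already chosen there are only $N-1$ others to choose from). Adding all these identical probabilities means that the total probability of finding any two electrons in $d\mathbf{x}_1$ and $d\mathbf{x}_2$ will be
$$d\mathbf{x}_1 d\mathbf{x}_2\,N(N-1) \int \Psi(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,\Psi^*(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,d\mathbf{x}_3 \ldots d\mathbf{x}_i \ldots d\mathbf{x}_N.$$
Let's denote this pair probability by $d\mathbf{x}_1 d\mathbf{x}_2\,\pi(\mathbf{x}_1, \mathbf{x}_2)$ ($\pi$ being the Greek letter p), so that on dropping the volume elements the pair density is $\pi(\mathbf{x}_1, \mathbf{x}_2) = N(N-1)\int \Psi(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,\Psi^*(\mathbf{x}_1, \mathbf{x}_2, \ldots \mathbf{x}_N)\,d\mathbf{x}_3 \ldots d\mathbf{x}_i \ldots d\mathbf{x}_N$.
The corresponding 2-electron density matrix follows when we put primes on the arguments $\mathbf{x}_1$ and $\mathbf{x}_2$ in the function $\Psi^*$ on the right; the result is denoted by $\pi(\mathbf{x}_1, \mathbf{x}_2; \mathbf{x}_1', \mathbf{x}_2')$ and the pair probability results on identifying the primed and unprimed variables. Thus $\pi(\mathbf{x}_1, \mathbf{x}_2) = \pi(\mathbf{x}_1, \mathbf{x}_2; \mathbf{x}_1, \mathbf{x}_2)$.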
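The counting argument can be checked with a small numerical sketch. As an illustrative assumption, space-spin points are replaced by `M` discrete points and three random orthonormal vectors stand in for the occupied spin-orbitals of a 3-electron system; none of these numbers describes a real atom. The pair density is built directly from its definition and compared with the 1-electron density.

```python
import numpy as np
from itertools import permutations

M, N = 5, 3                             # assumed: 5 discrete points, 3 electrons
rng = np.random.default_rng(2)
orbs, _ = np.linalg.qr(rng.normal(size=(M, N)))   # orthonormal model spin-orbitals

def parity(p):
    """Sign (+1 or -1) of a permutation given as a tuple of 0..n-1."""
    p, sign = list(p), 1
    for i in range(len(p)):
        while p[i] != i:                # put element i in place by swaps
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign

# Antisymmetrized product of the three spin-orbitals, Psi[x1, x2, x3]
Psi = np.zeros((M, M, M))
for p in permutations(range(N)):
    Psi += parity(p) * np.einsum("a,b,c->abc",
                                 orbs[:, p[0]], orbs[:, p[1]], orbs[:, p[2]])
Psi /= np.sqrt(6.0)                     # 3! permutations

rho = N * np.sum(Psi**2, axis=(1, 2))          # 1-electron density
pair = N * (N - 1) * np.sum(Psi**2, axis=2)    # pair density pi(x1, x2)
print(np.sum(rho), np.sum(pair))
```

Summing the pair density over the second point gives $(N-1)$ times the 1-electron density, and the diagonal `pair[x, x]` vanishes because an antisymmetric function is zero whenever two electrons sit at the same space-spin point.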
As in Example 5.6, the 2-electron density matrix for a system with an IPM wave function can be written down by inspection of the results obtained in Chapter 3. Thus, from (3.7) the expectation value of the
where $g(1, 2)$ is the 2-electron operator acting on functions of $\mathbf{x}_1$ and $\mathbf{x}_2$. (Labels are needed to indicate two space-spin variables.) Note that the second matrix element has the spin-orbital labels exchanged in the ket factor, giving the exchange term.
The first matrix element can be written
$$\langle \psi_i \psi_j|g(1, 2)|\psi_i \psi_j\rangle = \int \psi_i^*(\mathbf{x}_1)\psi_j^*(\mathbf{x}_2)\,g(1, 2)\,\psi_i(\mathbf{x}_1)\psi_j(\mathbf{x}_2)\,d\mathbf{x}_1 d\mathbf{x}_2,$$
while the second, with a minus sign, is similar but with labels exchanged in the ket factor.
as the 2-electron density matrix. With this definition the many-electron expectation value becomes
$$\langle \Psi|\tfrac{1}{2}\sum_{i \neq j} g(i, j)|\Psi\rangle = \tfrac{1}{2}\int [g(1, 2)\,\pi(\mathbf{x}_1, \mathbf{x}_2; \mathbf{x}_1', \mathbf{x}_2')]_{(\mathbf{x}_1' = \mathbf{x}_1,\,\mathbf{x}_2' = \mathbf{x}_2)}\,d\mathbf{x}_1 d\mathbf{x}_2,$$
and when the operator $g(i, j)$ does not touch the spin variables the integrations over spin can be done
$$\langle \Psi|\tfrac{1}{2}\sum_{i \neq j} g(i, j)|\Psi\rangle = \tfrac{1}{2}\int [g(1, 2)\,\Pi(\mathbf{r}_1, \mathbf{r}_2; \mathbf{r}_1', \mathbf{r}_2')]_{(\mathbf{r}_1' = \mathbf{r}_1,\,\mathbf{r}_2' = \mathbf{r}_2)}\,d\mathbf{r}_1 d\mathbf{r}_2,$$
where the upper-case Greek letter $\Pi$ is used to denote the spinless 2-electron density matrix. (Remember that upper-case rho, which is P in the Greek alphabet, was used for the spinless 1-electron density
that way you won't get mixed up.)
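The conclusions from Examples 5.6 and 5.7 for a state $\Psi$, represented by a single antisymmetrized spin-orbital product and normalized to unity as in (3.4), are collected below:
$$\langle \Psi|\sum_i h(i)|\Psi\rangle = \sum_i \langle \psi_i|h|\psi_i\rangle = \int [h\,\rho(\mathbf{x}; \mathbf{x}')]_{(\mathbf{x}' = \mathbf{x})}\,d\mathbf{x},$$
where $\rho(\mathbf{x}; \mathbf{x}') = \sum_i \psi_i(\mathbf{x})\,\psi_i^*(\mathbf{x}')$, and
$$\langle \Psi|\tfrac{1}{2}\sum_{i \neq j} g(i, j)|\Psi\rangle = \tfrac{1}{2}\int [g(1, 2)\,\pi(\mathbf{x}_1, \mathbf{x}_2; \mathbf{x}_1', \mathbf{x}_2')]_{(\mathbf{x}_1' = \mathbf{x}_1,\,\mathbf{x}_2' = \mathbf{x}_2)}\,d\mathbf{x}_1 d\mathbf{x}_2.$$
Note that the arguments in the density functions no longer serve to label the electrons: they simply indicate space-spin points at which electrons may be found. Now, in the next Example, we'll see how things work out in practice.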
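Before going on, the 1-electron result can be checked in a finite matrix representation. This is a sketch under stated assumptions: the operator $h$ is represented by a random symmetric matrix and the occupied spin-orbitals by the orthonormal columns of `C`; both are made-up numbers. In such a representation "apply the operator, identify the variables, and integrate" becomes the trace of the matrix product of `h` and `rho`.

```python
import numpy as np

M, N = 6, 3                              # assumed basis size and number of electrons
rng = np.random.default_rng(3)

# Hypothetical symmetric matrix standing for the 1-electron operator h
A = rng.normal(size=(M, M))
h = (A + A.T) / 2.0

# Orthonormal columns of C play the role of the occupied spin-orbitals psi_i
C, _ = np.linalg.qr(rng.normal(size=(M, N)))

# IPM density matrix: rho(x; x') = sum_i psi_i(x) psi_i^*(x')
rho = C @ C.T

# Sum of orbital expectation values, sum_i <psi_i|h|psi_i>
sum_over_orbitals = sum(C[:, i] @ h @ C[:, i] for i in range(N))

# Density-matrix form: apply h, set x' = x, and sum over x
trace_form = np.trace(h @ rho)

n_electrons = np.trace(rho)              # the density itself sums to N
print(sum_over_orbitals, trace_form, n_electrons)
```

The two numbers agree to rounding error, and the diagonal of `rho` (the density) adds up to the number of electrons.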
Example 5.8 Density functions for some atoms
At the beginning of Chapter 5, in Section 5.1, we listed the electron configurations of the first ten atoms of the Periodic Table. The first four involved only the two lowest-energy AOs, $\phi_{1s}$ and $\phi_{2s}$, which were singly or doubly occupied by electrons. A doubly occupied orbital appeared once with spin factor $\alpha$ and once with spin factor $\beta$, describing electrons with up-spin and down-spin, respectively. The corresponding spin-orbitals were denoted by $\psi_1 = \phi_{1s}\alpha$, $\psi_2 = \phi_{1s}\beta$, $\psi_3 = \phi_{2s}\alpha$, $\psi_4 = \phi_{2s}\beta$ and, on putting in the space and spin variables, the spin-orbital $\phi_{1s}(\mathbf{r})\alpha(s)$ will describe an electron at point $\mathbf{r}$ in 3-space, with spin $s$. Remember that $\mathbf{r} = x, y, z$ (using Cartesian coordinates), while $s$ is a discrete variable with only two values, $s = \tfrac{1}{2}$ for an up-spin electron or $-\tfrac{1}{2}$ for down-spin. Now we can begin.
The Hydrogen atom (H) has one electron in a doubly degenerate ground state, described by spin-orbital $\phi_{1s}\alpha$ or $\phi_{1s}\beta$. The 1-electron density function for the up-spin state will therefore be
$$\rho(\mathbf{x}) = \psi_1(\mathbf{x})\psi_1^*(\mathbf{x}) = \phi_{1s}(\mathbf{r})\phi_{1s}^*(\mathbf{r})\,\alpha(s)\alpha^*(s)$$
and the spinless density matrix $P(\mathbf{r}; \mathbf{r}') = \phi_{1s}(\mathbf{r})\phi_{1s}^*(\mathbf{r}')$, just as if the wave function contained orbitals with no spin factors.
The Helium atom (He) has a non-degenerate ground state, with two electrons in the 1s AO, but to satisfy the Pauli principle its wave function must be an antisymmetrized spin-orbital product (3.4) and we must therefore use (5.29) and (5.30). For the ground state, the results are
$$\rho(\mathbf{x}) = \phi_{1s}(\mathbf{r})\phi_{1s}^*(\mathbf{r})\,\alpha(s)\alpha^*(s) + \phi_{1s}(\mathbf{r})\phi_{1s}^*(\mathbf{r})\,\beta(s)\beta^*(s),$$
$$\rho(\mathbf{x}; \mathbf{x}') = \phi_{1s}(\mathbf{r})\phi_{1s}^*(\mathbf{r}')\,\alpha(s)\alpha^*(s') + \phi_{1s}(\mathbf{r})\phi_{1s}^*(\mathbf{r}')\,\beta(s)\beta^*(s').$$
The densities of up-spin and down-spin electrons are clearly, from (5.32),
$$P_\alpha(\mathbf{r}) = \phi_{1s}(\mathbf{r})\phi_{1s}^*(\mathbf{r}), \qquad P_\beta(\mathbf{r}) = \phi_{1s}(\mathbf{r})\phi_{1s}^*(\mathbf{r}).$$
The up-spin and down-spin components of the total electron density are equal whenever the spin-orbitals are doubly occupied: Total density $= P_\alpha(\mathbf{r}) + P_\beta(\mathbf{r})$. But the difference of the densities is also an important quantity: it is called the spin density and is usually defined as $Q(\mathbf{r}) = \tfrac{1}{2}(P_\alpha(\mathbf{r}) - P_\beta(\mathbf{r}))$. (The $\tfrac{1}{2}$ is the spin angular momentum in units of $\hbar$, so it is sensible to include it, remembering that the electron charge density $-eP(\mathbf{r})$ is measured in units of charge, with $e = 1$.)
The Lithium atom (Li) has a degenerate ground state, the third electron being in the 2s orbital with up-spin or down-spin. The electron density function for the up-spin state follows from (5.29) as
$$\rho(\mathbf{x}) = \phi_{1s}(\mathbf{r})\phi_{1s}^*(\mathbf{r})\,\alpha(s)\alpha^*(s) + \phi_{1s}(\mathbf{r})\phi_{1s}^*(\mathbf{r})\,\beta(s)\beta^*(s) + \phi_{2s}(\mathbf{r})\phi_{2s}^*(\mathbf{r})\,\alpha(s)\alpha^*(s).$$
You can do the rest yourself. The new features of this atom are (i) an inner shell of two electrons, with equal but opposite spins, in a tightly bound 1s orbital, and (ii) a valence shell holding one electron, in a diffuse and more weakly bound 2s orbital, with no other electron of opposite spin. This atom has a resultant spin density, when in the up-spin state, $Q(\mathbf{r}) = \tfrac{1}{2}\phi_{2s}(\mathbf{r})\phi_{2s}^*(\mathbf{r})$ and this free spin density, almost entirely confined to the valence shell, is what gives the system its chemical properties.
Beryllium (Be) is another closed-shell system, with only doubly-occupied orbitals, and like Helium
shows little chemical activity.
Boron (B), with one more electron, must start filling the higher-energy p-type AOs such as $2p_x, 2p_y, 2p_z$ and the next few atoms bring in important new ideas.
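The Lithium densities are easy to tabulate on a radial grid. The sketch below makes a crude assumption that is not part of the Example: the 1s and 2s AOs are taken to be hydrogen-like functions with the bare nuclear charge `Z_eff = 3` (screening is ignored), and each is normalized numerically. It is meant only to show how $P_\alpha$, $P_\beta$ and the spin density $Q$ are put together.

```python
import numpy as np

Z_eff = 3.0                              # assumed: bare nuclear charge, no screening
r = np.linspace(1e-6, 30.0, 60001)       # radial grid in atomic units
dr = r[1] - r[0]

def radial_integral(f):
    """Integrate a spherically symmetric function f(r) over all space (simple grid sum)."""
    return np.sum(4.0 * np.pi * r**2 * f) * dr

# Hydrogen-like 1s and 2s radial forms (unnormalized), then normalized numerically
phi_1s = np.exp(-Z_eff * r)
phi_2s = (2.0 - Z_eff * r) * np.exp(-Z_eff * r / 2.0)
phi_1s /= np.sqrt(radial_integral(phi_1s**2))
phi_2s /= np.sqrt(radial_integral(phi_2s**2))

# Up-spin state of Li: 1s(alpha), 1s(beta), 2s(alpha) are occupied
P_alpha = phi_1s**2 + phi_2s**2
P_beta = phi_1s**2
P_total = P_alpha + P_beta
Q = 0.5 * (P_alpha - P_beta)             # spin density

print(radial_integral(P_total), radial_integral(Q))
```

The total density integrates to 3 electrons, while the spin density integrates to 1/2 and comes entirely from the singly occupied 2s orbital.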
In Section 5.2 we listed the first ten chemical elements, in order of increasing atomic
number, together with their electron configurations; and in the following sections we
have developed in detail the methods for constructing IPM approximations to the wave
functions that describe their electronic structures. These methods are rather general and
in principle serve as a basis for dealing in a similar way with atoms of atomic number
Z > 10. Many years ago Mendeleev and other Chemists of the day showed (on purely
empirical grounds) how the elements of all the known atoms could be arranged in a
Table, in such a way as to expose various regularities in their chemical behaviour as
the atomic number Z increased. In particular, the elements show a periodicity in which
certain groups of atoms possess very similar properties even when their Z-values are very
different. As more and more elements were discovered it became important to classify
their properties and show how they could be related to our increasing understanding of
electronic structure. Parts of the resultant Periodic Table, in its modern form, are given below.
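First we indicate the Short Periods, along with the electron configurations of the atoms they include (atomic numbers being attached as superscripts to their chemical symbols):

First short period        Second short period
$^{5}$B    $2s^2 2p^1$        $^{13}$Al   $3s^2 3p^1$
$^{6}$C    $2s^2 2p^2$        $^{14}$Si   $3s^2 3p^2$
$^{7}$N    $2s^2 2p^3$        $^{15}$P    $3s^2 3p^3$
$^{8}$O    $2s^2 2p^4$        $^{16}$S    $3s^2 3p^4$
$^{9}$F    $2s^2 2p^5$        $^{17}$Cl   $3s^2 3p^5$
$^{10}$Ne  $2s^2 2p^6$        $^{18}$Ar   $3s^2 3p^6$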
In these periods the order in which the available orbitals are filled is exactly as suggested
by the first and second columns of Figure 5.1. The lowest energy AO is occupied by one
electron in Hydrogen and two electrons in Helium two atoms not usually counted as
forming a Period. The next two AOs, in ascending energy order, come from the quantum
shell with principal quantum number n = 2 and account for the electron configurations of
all the atoms in the first short period. Lithium and Beryllium hold only electrons in an
orbital of 2s type; but the next AO is of 2p type and is three-fold degenerate, so Carbon,
for example, will have the configuration with 2 electrons in the 2s AO and 2 electrons to
be distributed among the 2p AOs (no matter which). When spin is taken into account,
the ground states and low-lying excited states of the atoms in the short periods may be
set up by angular momentum coupling methods, following the pattern of Example 5.2, to
give all the resultant states of the configuration.
Things become more complicated in the longer periods because, as Figure 5.1 suggests, the AO energies of orbitals in quantum shells with $n \geq 3$ may be so close together that it is not easy to guess the order in which they will be filled. The quantum shell with principal quantum number $n = 4$ starts with the atoms Potassium (K) and Calcium (Ca), with the expected configurations $4s^1$ and $4s^2$ (outside the filled shells with $n = 1, 2, 3$), and continues with the first long period (shown below).
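$^{21}$Sc  $3d^1 4s^2$
$^{22}$Ti  $3d^2 4s^2$
$^{23}$V   $3d^3 4s^2$
and continuing:
$^{24}$Cr  $3d^5 4s^1$
$^{25}$Mn  $3d^5 4s^2$
$^{26}$Fe  $3d^6 4s^2$
$^{27}$Co  $3d^7 4s^2$
$^{28}$Ni  $3d^8 4s^2$
$^{29}$Cu  $3d^{10} 4s^1$
$^{30}$Zn  $3d^{10} 4s^2$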
If you look at that, along with Figure 5.1, you'll see that the 3d AOs have started to fill before the 4s because their orbital energies are in this case slightly lower. The atom of Zinc (Zn), with electron configuration $3d^{10} 4s^2$, has a complete shell with all 3d orbitals full; the next atom is Gallium (Ga), which starts taking on electrons in the 4p orbitals on top of the filled 4s-3d shell (shown as a dash). The atoms from Gallium up to Krypton (Kr) have configurations similar to those in the short periods, in which the three p orbitals are filling. The chemical properties of the six resultant atoms resemble those of the atoms in the two short periods shown above, ending with another inert gas (Kr) like Neon (Ne) and Argon (Ar). In fact, such properties depend little on the inner-shell electrons which simply provide an effective field for the electrons occupying the outer-shell orbitals.
The role of the atoms in Chemistry, which we begin to study in the next chapter, depends mainly on their outermost orbitals and that's why inner shells are often not shown in the Periodic Table as listed above, where the Argon-like filled orbitals are shown only as a dash.
The whole Periodic Table, including over a hundred known chemical elements, is of such fundamental importance in Chemistry that it is nowadays displayed in schools and universities all over the world. Here you've seen how it relates to the electronic structures of the building blocks from which all matter is constructed. More of that in later chapters, but first a bit more quantum mechanics.
Most atoms do not have closed-shell ground states and, as we saw in the last section, that makes them much more interesting. In particular, electron configurations with degenerate AOs that are incompletely filled can show a rich variety of electronic states. Even when the separation of atomic energy levels is very small it is easy to observe experimentally with present-day techniques: these usually require the application of strong magnetic fields which allow one to see the effects of coupling between the applied field and any free spins which carry magnetic dipoles (see Book 10). The spin-field (Zeeman) interaction gives rise to a perturbation of the form
$$H_Z = g\beta \sum_i \mathbf{B} \cdot \mathbf{S}(i),$$
where $\beta = e\hbar/2m$ is called the Bohr magneton (don't confuse it with a spin eigenstate), $\mathbf{B}$ is the flux density of the magnetic field, and $g$ is a number very close to 2 (which indicates that spin is twice as effective as orbital motion of a charge in producing a magnetic dipole).
The normal interaction between the field and an electron with orbital angular momentum $\mathbf{L}(i)$ gives a perturbation
$$H_{mag} = \beta \sum_i \mathbf{B} \cdot \mathbf{L}(i),$$
which represents a classical field-dipole interaction. In both cases the summation is over all electrons.
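To get a feeling for the size of these terms, here is a small numerical sketch for a single free spin. The field strength of 1 tesla and the value $g = 2.0023$ are illustrative assumptions; the physical constants are standard SI values, and the spin component along the field is measured in units of $\hbar$, as in the formula above.

```python
# Physical constants in SI units
e = 1.602176634e-19        # elementary charge / C
hbar = 1.054571817e-34     # reduced Planck constant / J s
m_e = 9.1093837015e-31     # electron mass / kg

beta = e * hbar / (2.0 * m_e)   # Bohr magneton / J per tesla

g = 2.0023                 # assumed g value, "very close to 2"
B = 1.0                    # hypothetical flux density in tesla, field along z

def zeeman_energy(M_S):
    """First-order energy g * beta * B * M_S of a spin state with z-component M_S."""
    return g * beta * B * M_S

E_up = zeeman_energy(+0.5)
E_down = zeeman_energy(-0.5)
splitting = E_up - E_down       # separation of the two Zeeman levels / J
splitting_eV = splitting / e    # the same separation in electron volts

print(beta, splitting, splitting_eV)
```

The splitting comes out around $10^{-4}$ eV for a field of 1 tesla, tiny compared with the electron volts that separate atomic energy levels, which is why these effects count as fine details of the spectrum.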
There are many other interaction terms, which you don't even need to know about, but for a free atom there are some simplifications and it's fairly easy to see how the fine structure of the energy levels can arise and how the states can be classified. So we'll end this chapter by using what we already know about spin and orbital angular momenta. The unperturbed states of a Carbon $2p^2$ configuration, with energy levels represented in Figure 5.4, were constructed as linear combinations of antisymmetrized products of spin-orbitals so as to be simultaneous eigenstates of the commuting operators $H, L^2, L_z, S^2, S_z$ (all in IPM approximation). But the fine structure of the triplet P level, indicated in Column (c), was not accounted for, though it was put down to spin-orbit coupling, which could be admitted as a perturbation. Classically, the interaction energy between two magnetic dipoles $\mathbf{m}_1, \mathbf{m}_2$ is usually taken to be proportional to their scalar product $\mathbf{m}_1 \cdot \mathbf{m}_2$, so it will be no surprise to find that in quantum mechanics the spin-orbit perturbation operator, arising from the spin dipole and the orbital dipole, takes the approximate form (main term only)
$$H_{SL} = \sum_i f(r_i)\,\mathbf{S}(i) \cdot \mathbf{L}(i),$$
where the factor $f(r_i)$ depends on distance $r_i$ of Electron $i$ from the nucleus, but is also proportional to nuclear charge $Z$ and therefore important for heavy atoms.
To understand the effect of such terms on the levels shown in Fig. 5.4, we remember that eigenstates of the operators $L^2, L_z$ and $S^2, S_z$ can be coupled to give eigenstates of total angular momentum, represented by the operators $J_x, J_y, J_z$, defined as
$$J_x = L_x + S_x, \qquad J_y = L_y + S_y, \qquad J_z = L_z + S_z,$$
and that these operators have exactly the same commutation properties as all angular momentum operators (reviewed in Section 5.1). Thus, it should be possible to find simultaneous eigenstates of the operators $J^2 = J_x^2 + J_y^2 + J_z^2$, and $J_z$, with quantum numbers $(J, M_J)$, and also the shift operators $J^+ = J_x + iJ_y$ and $J^- = J_x - iJ_y$. To check that this really is possible, let's start from the orbital and spin eigenstates (already found) with quantum numbers $(L, M_L)$ and $(S, M_S)$, calling them $\Phi_{L,M_L}$ and $\Theta_{S,M_S}$, respectively. The product of the top states, with $M_L = L$ and $M_S = S$, is clearly an eigenstate of $J_z = L_z + S_z$ because each operator works only on its own eigenfunction (orbital or spin), giving $J_z(\Phi_{L,M_L=L}\Theta_{S,M_S=S}) = (L + S)(\Phi_{L,M_L=L}\Theta_{S,M_S=S})$, and this means the product function is an eigenfunction of $J_z$ with the maximum available quantum number $M_J = L + S$, which implies that $J = L + S$ is the quantum number for the corresponding eigenstate of $J^2$. This really is the top state because it can't be stepped up ($J^+ = L^+ + S^+$ and the product will be annihilated by one or other of the two operators). On the other hand, $(\Phi_{L,L}\Theta_{S,S})$ can be stepped down by using ($J^- = L^- + S^-$). This will give a function with $L$ and $S$ unchanged, which is a combination of $\Phi_{L,L-1}\Theta_{S,S}$ and $\Phi_{L,L}\Theta_{S,S-1}$ with $J$ unchanged but $M_J$ reduced by 1.
You've done all this before! There will be another combination, orthogonal to the first and still with the $J_z$ quantum number reduced to $M_J - 1$, and this must be the top state of a new series with $J = L + S - 1$. If you do the same operations all over again you can reduce the $M_J$-value to $L + S - 2$ and then, by finding an orthogonal combination, arrive at the top state of a new series with $J = M_J = L + S - 2$. As you can see, this gets terribly tedious. But it can be done and the conclusion is easy enough to visualize: you add vectors by adding their corresponding components. In adding orbital and spin angular momentum vectors you start with the vectors in line, so $J = M_J = M_L + M_S$, only the quantized z-components being significant; and then you step down by using the $J^-$ operator to get all the $2J + 1$ states of the series with the same $J = L + S$. Then you move to the series with $J = L + S - 1$ and $M_J$ going down from $J$ to $-J$ in integer steps, corresponding to the allowed projections of an arrow of length $J$ on the z-axis. By carrying on in that way you find all the vector-coupled states with
$$J = L + S, \quad L + S - 1, \quad L + S - 2, \quad \ldots, \quad |L - S|.$$
Since $J$ cannot be negative, the process must stop when the next step would violate this condition; that's why the last state has a $J$ value which is the magnitude of the difference in lengths of the L- and S-vectors.
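The counting in this vector-coupling rule is easy to check with a few lines of code. The helper names below (`allowed_J`, `count_states`) are hypothetical, introduced only for this sketch; exact fractions are used so that half-integer spins work too.

```python
from fractions import Fraction

def allowed_J(L, S):
    """List the J values L+S, L+S-1, ..., |L-S| allowed by vector coupling."""
    L, S = Fraction(L), Fraction(S)
    J = L + S
    values = []
    while J >= abs(L - S):
        values.append(J)
        J -= 1
    return values

def count_states(L, S):
    """Total number of coupled states, adding 2J+1 for every allowed J."""
    return sum(2 * J + 1 for J in allowed_J(L, S))

# Example: L = 1, S = 1 gives J = 2, 1, 0 and 5 + 3 + 1 = 9 states
print(allowed_J(1, 1), count_states(1, 1))

# The coupled states must be as numerous as the (2L+1)(2S+1) uncoupled products
print(count_states(2, Fraction(1, 2)), (2 * 2 + 1) * 2)
```

Coupling only re-arranges the $(2L+1)(2S+1)$ product functions into new combinations, so the total number of states never changes.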
We can now come back to Figure 5.4 and the splitting of the energy levels in Column (c). In principle we could estimate the effect of the perturbation terms (5.34), (5.35) and (5.36) by getting their matrix elements relative to the unperturbed functions and then solving a system of secular equations, along the lines of Section 5.4; but it's much nicer, if you don't want any numerical detail, to use the fact that the $2L + 1$ orbital eigenstates of $L^2$ and the $2S + 1$ spin eigenstates of $S^2$ may in general be mixed by the perturbation to produce eigenstates of the operators $J^2$ and $J_z$, which also commute with the Hamiltonian. We've just found how the vector-coupled states that result can be labelled in terms of the eigenvalues $J$ and $M_J$; and we know that states with different sets of eigenvalues will in general have different energies.
The levels in Figure 5.4 result from the unperturbed 2-electron states with quantum numbers $L = 1$, $M_L = 1, 0, -1$ and $S = 1$, $M_S = 1, 0, -1$ and for each choice of $L$ and $S$ we can obtain all the allowed spin-coupled states of given $J$ and $M_J$. Moreover, the unperturbed states have been constructed from antisymmetrized spin-orbital products and the Pauli Principle is thus taken care of from the start. Let's take the possible states one at a time:
$L = 2$, $S = 1$
In Example 5.3 this case was ruled out, being completely symmetric under electron exchange, so $J = L + S = 3$ is excluded. But with $S = 0$ we pass to the next case.
$L = 2$, $S = 0$, $J = 2$
$L = 2$ means this is a D state (2 units of orbital angular momentum) and $S = 0$ means this is a spin singlet, so the full state label is $^1D$ as shown in Fig. 5.4.
$L = 1$, $S = 1$, $J = 2$
$L = 1$ means this is a P state (1 unit of orbital angular momentum) and $S = 1$ means this is a spin triplet, so the full state label is $^3P$ as shown in Fig. 5.4 with some fine structure resulting from spin-orbit coupling. When $J = 2$ there are $2J + 1 = 5$ states of different $M_J$: these are the Zeeman states, which are degenerate in the absence of an external magnetic field. But the top state ($J = M_J = 2$) can be stepped down to give a series with $J = 1$, still $^3P$ states, $J$ being $J = L + S - 1$ with $L = 1$ and $S = 1$. Another step down gives $J = L + S - 2 = 0$, a single state with the L- and S-vectors anti-parallel coupled. To label these component states, of which there are 9 (= 5 + 3 + 1), it is usual to add a subscript to the term symbols shown in Fig. 5.4, giving the value of $J$. The states of the $^3P$ multiplet are then labelled $^3P_2$, $^3P_1$, $^3P_0$, in descending order of energy. The highest-energy state of the multiplet is the one in which the magnetic dipoles point in the same direction; the lowest is that in which their arrows are opposed, just as in Classical Physics.
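The labelling scheme can be written down as a tiny program. This is a sketch only: the function names `term_levels` and `n_component_states` are hypothetical, and labels are printed as plain text (so `3P2` stands for $^3P_2$), with the usual letters S, P, D, F, ... for $L = 0, 1, 2, 3, \ldots$

```python
from fractions import Fraction

LETTERS = "SPDFGHI"        # spectroscopic letters for L = 0, 1, 2, ...

def term_levels(L, S):
    """Return labels such as '3P2' for every J from L+S down to |L-S| (L an integer)."""
    S = Fraction(S)
    multiplicity = int(2 * S + 1)
    J = L + S
    labels = []
    while J >= abs(L - S):
        labels.append(f"{multiplicity}{LETTERS[L]}{J}")
        J -= 1
    return labels

def n_component_states(L, S):
    """Count the Zeeman states: 2J+1 for each level of the multiplet."""
    S = Fraction(S)
    total, J = 0, L + S
    while J >= abs(L - S):
        total += int(2 * J + 1)
        J -= 1
    return total

print(term_levels(1, 1), n_component_states(1, 1))   # the triplet P multiplet
print(term_levels(2, 0), n_component_states(2, 0))   # the singlet D level
```

For $L = 1$, $S = 1$ this reproduces the three levels of the triplet P multiplet and its 9 component states; for $L = 2$, $S = 0$ there is a single level with 5 components.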
Of course, we're still using an IPM picture, which is only a poor approximation, but it's amazing how much understanding we can get from it even without any numerical calculations. The tiny shifts of the energy levels, brought about by the small terms in the Hamiltonian, are described as fine structure. When observed spectroscopically they give important information about the electronic structure of the atoms: first of all they tell us what atom we are looking at (no two atoms give exactly the same finger prints) and secondly they tell us whether or not there are singly occupied orbitals, containing un-paired spins that are free to couple with the spins of other atoms. So they are useful both for chemical analysis and for understanding chemical reactivity, so much so that most of our big hospitals have expensive equipment for detecting the presence of unpaired spins in the atoms of the cells in our bodies!
Chapter 6
Molecules: first steps
You first started learning about how atoms could combine, to form molecules, in Chapter
1 of Book 5. Since then, in Book 6, you've learnt more about matter in general and the
history of our planet Earth as a member of the Solar System. You must have been struck
by the time-scale on which things happen and the traces they leave behind in the rocks,
like the fossil remains of creatures that lived here many millions of years ago. And in
Books 7-9 you learnt about the evolution of all those creatures (including ourselves!),
starting from the earliest and simplest forms of life. Before studying molecules in more
detail you may be wondering where the atoms themselves came from; and that takes us
back to the beginning of the Universe. We'll have to tell the story in the light of what we
know now (or at least think we know, on the basis of all the evidence we have).
About 14 billion years ago, all the particles in the present Universe were very close together
in a ball of unbelievably dense matter. This ball exploded as a result of the interactions
that drove the particles apart: we now call that event the Big Bang. The particles
spread out in empty space, at great speed, to form an Expanding Universe which is
still getting bigger and bigger. As they interacted, the particles eventually began to form
atoms: first of all those of Hydrogen, the lightest known atom, consisting of one proton
and one electron. So at one stage the Universe could be pictured as a dense cloud of
Hydrogen. But it didn't stay that way.
What happened in the early Universe?
The atomic nuclei (protons) could also come together in pairs to form new nuclei, those
of the Helium ion He$^{2+}$ (the 2+ indicating that the neutral Helium atom has lost two
electrons to give a bare nucleus with two units of positive charge). This process is called
nuclear fusion and was first mentioned in Book 4 Section 8.3 (which you should read
again before going on). When two protons fuse in this way the total mass of the system
is reduced by a fraction of about $0.7 \times 10^{-2}$ and since a proton has a mass $1.66 \times 10^{-27}$ kg
the mass lost by the pair will be $(0.7 \times 10^{-2})(2 \times 1.66 \times 10^{-27}\ \text{kg}) = 2.324 \times 10^{-29}$ kg.
Now in Section 8.3 of Book 4, you learnt that mass is a form of energy and that the two
things are related by Einstein's famous formula $E = mc^2$, where c is the speed with which
light travels ($\approx 3 \times 10^8$ m s$^{-1}$). The mass lost when two protons fuse is thus equivalent to
an energy
$$E = mc^2 = (2.324 \times 10^{-29}\ \text{kg}) \times (9 \times 10^{16}\ \text{m}^2\,\text{s}^{-2}) = 20.916 \times 10^{-13}\ \text{kg m}^2\,\text{s}^{-2}.$$
But the energy unit here is the Joule: 1 J = 1 kg m$^2$ s$^{-2}$. That may not seem much, but
if you remember that the chemical unit of quantity is the mole this must be multiplied
by the Avogadro number L, the number of systems it contains. The fusion energy of 1
mole of proton pairs thus comes out as
$$(0.602 \times 10^{24}) \times (20.916 \times 10^{-13})\ \text{J} = 12.59 \times 10^{11}\ \text{J} = 12.59 \times 10^{8}\ \text{kJ}.$$
Let's compare that with the energy released in burning hydrogen gas (often
used as a rocket fuel). In that case (read section 3.2 of Book 5) the reactants are 2 moles
of hydrogen molecules (H$_2$) plus 1 mole of Oxygen molecules (O$_2$); and the products
are 2 moles of water molecules (H$_2$O). The energy change when the reactants go to the
products is $\Delta H = H_P - H_R$, where H stands for Heat content. On putting in
the experimental values of these quantities for Hydrogen, Oxygen and Water, the result is
$-571.6$ kJ, the minus sign meaning that the total heat content goes down and the energy
released by burning 2 moles of Hydrogen is 571.6 kJ (about 286 kJ per mole).
That should be compared with the energy released in the fusion of 1 mole of proton pairs,
which we found to be $1.259 \times 10^9$ kJ, over a thousand million kJ. So in the early Universe
there was no shortage of energy; its gaseous contents must have existed at an unbelievably
high temperature!
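The arithmetic behind this comparison is easy to repeat. The sketch below uses the same rounded constants as the text and assumes that the fractional mass loss applies to the mass of the whole proton pair, which is what reproduces the figure of $2.324 \times 10^{-29}$ kg; the variable names are chosen here purely for illustration.

```python
# Rounded constants, as used in the text
fraction_lost = 0.7e-2     # fractional mass loss on fusion
proton_mass = 1.66e-27     # kg
c_squared = 9e16           # m^2 s^-2, from c = 3e8 m/s
avogadro = 0.602e24        # number of systems in one mole

# Assumption: the fraction applies to the mass of the proton pair
mass_lost = fraction_lost * 2 * proton_mass           # kg per pair
energy_per_pair = mass_lost * c_squared               # J per pair
energy_per_mole_kj = avogadro * energy_per_pair / 1000.0

combustion_kj = 571.6      # kJ, the combustion figure quoted in the text
ratio = energy_per_mole_kj / combustion_kj

print(f"mass lost per pair : {mass_lost:.4g} kg")
print(f"energy per pair    : {energy_per_pair:.4g} J")
print(f"energy per mole    : {energy_per_mole_kj:.4g} kJ")
print(f"fusion / combustion: {ratio:.3g}")
```

The ratio comes out at a few million: mole for mole, fusion releases millions of times more energy than burning.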
What happened in the very early stages?
At the beginning of the first 10 billion years after the Big Bang, as it began to cool,
the Universe contained a mish-mash of particles with strange names like quarks and
gluons (given to them by the people who discovered them), forming a continuous sea
called a plasma. That phase lasted only up to about one second after the BB and was
followed by the appearance of electrons and of heavier particles, mainly protons and neutrons,
collectively known as baryons and composed of quarks glued together by the gluons.
(It's almost impossible to observe the quarks and gluons because if ever they get out of a baryon they
have a vanishingly short lifetime and seem to just disappear; until recently nobody knew that a proton
was composed of three quarks, held together by gluons! To find what was inside a proton or a neutron
you had to smash it open by firing other particles at it and observing what came out; and to give these
projectiles enough energy to do that they had to be accelerated to speeds close to that of light. Particle
accelerators are nowadays being built, at great expense, to do that job.)
Then, between about 3 and 20 minutes after the BB, when the temperature and density of
the plasma had fallen to a low enough level, the baryons started coming together to form
other nuclei, such as He$^{2+}$, by the fusion reaction described above.
Much later, between about 200 and 400 thousand years after BB, the positively charged
nuclei began to capture electrons from the plasma to form stable neutral particles, mainly
neutrons, H-atoms and He-atoms together with a few of the other light atoms, like Carbon,
that you've already met. These are the atoms needed in building simple molecules, which
we'll study in detail in the rest of this chapter. (You might like to read a preview of them
in Book 5.)
From that point on there followed a long period, still going on, of structure formation.
First the atoms came together in small groups, which attracted other groups and became
much bigger (think of a snowball rolling down a hill and picking up more snow on the
way until it becomes a giant snowball): after billions of years these gigantic structures
became the first stars; and started coming together in star-clusters or galaxies. The
galaxy we see in the night sky and call the Milky Way was formed in this way between
7 and 10 billion years ago and one of the stars in this galaxy is our Sun. The whole Solar
System, the Sun and the Planets that move in orbits around it, came into existence about
8 or 9 billion years after the Big Bang; so planet Earth, the part of the Universe we feel
we know best, is about 4½ billion years old!
But how do we know all that?
We see the stars in the night sky because they shine: they emit radiation in the form
of photons, which travel through space at the enormous speed of $3 \times 10^8$ m s$^{-1}$ (three
hundred million metres per second!) and the light we observe using ordinary (optical)
telescopes consists only of photons in a very narrow range of frequencies (as you'll remember from Book 10, Section 6.5). Most of the light that reaches us is invisible but it can
all be 'seen' by the instruments available to us nowadays and it all carries information
about where it came from. We also have radio telescopes, for example, that pick up
the radiation from distant stars. All this radiation can be analysed by spectrometers,
which give detailed information about the electronic origins of the light they take in (as
you learnt in Section 5.3 of the present Book 12).
If you really think about all this you'll come to some amazing conclusions. First of all
the distances between stars are so large that it's most convenient to measure them in
light years; 1 light year is the distance travelled by a photon in 1 year and is about
$9.5 \times 10^{12}$ km. The nearest stars to our own Sun are about 4 light years away; so the light
that we see coming from them started in processes that happened 4 years ago. But more
distant stars in the Milky Way galaxy were formed as long as 13 billion years ago and
any radiation that comes from them must therefore have been on the way for no less than
about 13 billion years.
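The size of a light year follows directly from the speed of light. Here is a quick check, taking the rounded speed used in the text and assuming a year of 365.25 days (the names are illustrative only):

```python
speed_of_light = 3e8                     # m/s, rounded value used in the text
seconds_per_year = 365.25 * 24 * 3600    # assumption: a year of 365.25 days

light_year_km = speed_of_light * seconds_per_year / 1000.0
nearest_star_km = 4 * light_year_km      # stars about 4 light years away

print(f"1 light year  = {light_year_km:.3g} km")
print(f"4 light years = {nearest_star_km:.3g} km")
```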
The light that reaches us here on the Earth, from the Milky Way, is very dim and its
spectrum is foggy showing little sign of the sharp lines found in atomic spectra observed
in the laboratory. But against this background there is always one extremely faint line
at a wavelength of 21.106 cm in the microwave region of the spectrum. Where could it
come from?
When the first atoms began to form, so long ago, they were almost exclusively Hydrogen
(one proton plus one electron). And, as you know from Section 5.3, when one of them
makes a transition from one electronic state to another, of lower energy, a photon of
frequency $\nu$ is emitted with $h\nu = E_{\text{initial}} - E_{\text{final}}$. The lowest electronic state is a $^2S$
doublet, the two 1s levels differing in spin ($\pm\tfrac{1}{2}$), but now we must remember that the
proton is also a spin-$\tfrac{1}{2}$ particle and that the two spins ($S = \tfrac{1}{2}$ for the electron and $I = \tfrac{1}{2}$
for the proton) can couple to give a total spin angular momentum with quantum number
F, say, with possible values $F = \tfrac{1}{2} + \tfrac{1}{2} = 1$ and $F = \tfrac{1}{2} - \tfrac{1}{2} = 0$. As a result of this nuclear
hyperfine coupling the lowest energy level of the H-atom becomes a doublet with a
minute energy separation, confirmed here and now in the laboratory, of $5.874 \times 10^{-6}$ eV.
This is the energy of a quantum of radiation of wavelength 21.106 cm.
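The link between the wavelength and the energy separation is just $E = h\nu = hc/\lambda$. A short check, using the SI defining values of the constants (the variable names are chosen here for illustration):

```python
planck = 6.62607015e-34        # Planck constant h, J s
light_speed = 2.99792458e8     # speed of light c, m/s
joule_per_ev = 1.602176634e-19 # 1 eV expressed in joules

wavelength = 0.21106           # m, the 21.106 cm line
frequency = light_speed / wavelength            # Hz
energy_ev = planck * frequency / joule_per_ev   # photon energy in eV

print(f"frequency = {frequency:.5g} Hz")
print(f"energy    = {energy_ev:.4g} eV")
```

The frequency falls in the microwave region, a little above 1.4 thousand million cycles per second, and the energy agrees with the laboratory value of the hyperfine splitting quoted above.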
What does all this mean? When we say 'here and now' we mean here on Earth and
now at the time of making the experimental measurement. But the event we were talking about (the emission of a photon from an atom in a distant part of the Universe) took
place about 13 billion light years away, which means 13 billion years before our laboratory
experiments! The predicted energy separation comes from calculations that depend on all
the laws of everyday Physics (from Classical Mechanics (Book 4) to Electromagnetism
(Book 10) and Quantum Mechanics (Book 11)), as long as extremely high energies or
relativistic velocities are not involved. We can hardly escape the remarkable conclusion:
The Laws of Physics are invariant against changes of position or
time of the system to which they are applied; and that must have
been true for at least 13 billion years.
Many details remain to be filled in: for example, theory shows that the 21 cm transition is
in fact forbidden and would probably take place not more than once in 10 million years!
But the number of H atoms in the Milky Way is so enormous that the total probability
of a transition is enough to account for the observed spectral line.
In summary: the fundamental laws of physics are OK and any variations in the behaviour
of matter are normally due to changes in external conditions such as temperature and
density (which may both reach unimaginable values). Now we're all set to start thinking
about the next step in the evolution of the Universe: what makes the atoms stick together
to form molecules?
As you've learnt from Section 6.1, the early Universe once consisted of a hot plasma of
electrons, neutrons, and protons (H$^+$) that had not yet picked up electrons to become
neutral Hydrogen atoms (H), together with a trace of Helium nuclei (He$^{2+}$) already
formed by proton fusion.
Let's imagine what can happen when a proton meets a Hydrogen atom. There will
then be a composite system, with two protons sharing one electron, namely a hydrogen
molecule ion.
As usual we apply quantum mechanics to this system by first of all setting up the Hamiltonian operator. We should really suppose all three particles are moving, but we'll use
an approximation that allows for the fact that a proton has a mass almost 2000 times
that of an electron. The rapidly moving electron then 'sees' the nuclei at any instant as
if they were at rest in fixed positions. The three-body problem then becomes in effect a
one-electron problem with Hamiltonian
$$h = -\tfrac{1}{2}\nabla^2 - \left(\frac{1}{r_a} + \frac{1}{r_b}\right), \qquad (6.1)$$
where $r_a$ and $r_b$ denote distances of the electron from nuclei a and b and atomic units
are used throughout. The $\nabla^2$-operator works on the electronic coordinates, to be denoted
by $\mathbf{r}$, and will have a form depending on the coordinate system chosen. The energy levels
of the electron are then found by solving the eigenvalue equation
$$h\phi = \epsilon\phi.$$
The energy of the whole system in this 'fixed nucleus' approximation will then be
$$E = 1/R_{ab} + \epsilon,$$
where $\epsilon$ denotes the electronic energy eigenvalue and $R_{ab}$ is the internuclear distance.
(Note that in atomic units the proton charges are $Z_a = Z_b = 1$ and that the first term
in E is their classical Coulomb repulsion energy.) This procedure is called the Born-Oppenheimer separation of electronic and nuclear motion. Heavy particles (like nuclei)
move in good approximation according to classical physics with E, calculated in this way,
serving as a potential energy function.
But then we meet the next big problem. For an atom we had ready-made atomic orbitals,
with the well-known forms (1s, 2s, 2p, 3s, 3p, 3d, etc.) first discussed in Book 11, but
here we know nothing about the forms of the molecular orbitals that will be needed
in building corresponding approximations to the molecular wave functions. First of all,
then, we need to find how to describe the one-electron system that remains when one
electron is taken away. This system is experimentally well-known: it is the Hydrogen
molecule ion, H$_2^+$.
How can we get a reasonable first approximation to the lowest-energy molecular orbital
(MO)? When the electron is close to Nucleus a, the term $1/r_a$ will be so big that $1/r_b$
may be neglected in (6.1). The MO will then 'shrink' into an atomic orbital (AO) for a
single hydrogen atom. We'll denote this AO by $\chi_a(\mathbf{r})$, as we're going to use AOs as basis
functions out of which more general wave functions, such as MOs, can be constructed.
In this process a general MO, call it $\phi$, must change according to $\phi(\mathbf{r}) \to c_a\chi_a(\mathbf{r})$, since
this will satisfy the same single-atom eigenvalue equation for any value of the numerical
factor $c_a$. Similarly, when $\mathbf{r}$ is close to the second nucleus $\phi(\mathbf{r})$ will approach a numerical
multiple of the AO $\chi_b(\mathbf{r})$. It follows that an electron in the field of both nuclei may be
fairly well represented by an MO of the form
$$\phi(\mathbf{r}) = c_a\chi_a(\mathbf{r}) + c_b\chi_b(\mathbf{r}),$$
where the constants $c_a$, $c_b$ are still to be chosen (e.g. by taking them as variable parameters
and using the variation method of Section 1.3) to find the MO of minimum energy. This
should give at least a rough description of the ground state.
In fact, however, no calculation is needed because the molecule ion is symmetrical across
a plane perpendicular to the molecular axis, cutting the system into two equal halves.
There is no reason to expect the electron to be found with different probability on the
two sides of the symmetry plane and this implies that the values of the coefficients $c_a$, $c_b$
can differ, at most, in sign: $c_b = \pm c_a$. Two acceptable approximate MOs are thus, putting
$c_b = c_a = N_B$ in one MO and $c_a = -c_b = N_A$ in the other,
$$\phi_B(\mathbf{r}) = N_B[\chi_a(\mathbf{r}) + \chi_b(\mathbf{r})], \qquad \phi_A(\mathbf{r}) = N_A[\chi_a(\mathbf{r}) - \chi_b(\mathbf{r})].$$
This case arises only for homonuclear diatomic molecules in which the two nuclei are
identical. It is important because very many common diatomic molecules, such as H$_2$,
N$_2$, O$_2$, are of this type.
The solutions just found are typical Bonding and Antibonding MOs; so called for
reasons that will soon become clear. The constants $N_A$, $N_B$ are normalizing factors, chosen
to give unit probability of finding the electron somewhere in space. For normalization we require
$$\langle\phi_B|\phi_B\rangle = N_B^2(2 + 2S_{ab}) = 1,$$
where $S_{ab} = \langle\chi_a|\chi_b\rangle$ is the overlap integral between the two AOs. In this way we find
$$\phi_B(\mathbf{r}) = \frac{\chi_a(\mathbf{r}) + \chi_b(\mathbf{r})}{\sqrt{2 + 2S_{ab}}}$$
for the Bonding MO, and
$$\phi_A(\mathbf{r}) = \frac{\chi_a(\mathbf{r}) - \chi_b(\mathbf{r})}{\sqrt{2 - 2S_{ab}}}$$
for the Antibonding MO. The following Figure 6.1 gives a very schematic picture of the
two MOs.
Figure 6.1 Schematic representation of the two lowest-energy MOs for H$_2^+$ (panels: Bonding MO; Antibonding MO, with its nodal plane).
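Here, for the ion, H = h, the 1-electron Hamiltonian, and the distinct quantities to be
calculated are (using a common notation and supposing the AOs are normalized)
$$\alpha_a = \langle\chi_a|h|\chi_a\rangle, \quad \alpha_b = \langle\chi_b|h|\chi_b\rangle, \quad \beta_{ab} = \langle\chi_a|h|\chi_b\rangle, \quad S_{ab} = \langle\chi_a|\chi_b\rangle.$$
As in Section 1.3 of Chapter 1, the conditions for a stationary value then reduce to
$$(\alpha_a - \bar{E})c_a = -(\beta_{ab} - \bar{E}S_{ab})c_b,$$
$$(\beta_{ab} - \bar{E}S_{ab})c_a = -(\alpha_b - \bar{E})c_b.$$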
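But when the system is symmetrical, as already noted, we know that $c_b = \pm c_a$ and in that
case just one equation is enough to give us both eigenvalues. Thus, putting $\alpha_a = \alpha_b = \alpha$
(and writing $\beta$, S for short in place of $\beta_{ab}$, $S_{ab}$)
and choosing $c_b = c_a$, the first equation reduces to $(\alpha + \beta) - \bar{E}(1 + S) = 0$; while on
choosing $c_b = -c_a$ it reduces to $(\alpha - \beta) - \bar{E}(1 - S) = 0$. The two roots are
$$E_B = \frac{\alpha(1 + S) + \beta - \alpha S}{1 + S}, \qquad E_A = \frac{\alpha(1 - S) - \beta + \alpha S}{1 - S},$$
where the numerators have been re-arranged so as to separate out the leading terms. In
this way we find
$$E_A = \alpha - \frac{\beta - \alpha S}{1 - S}, \qquad E_B = \alpha + \frac{\beta - \alpha S}{1 + S}.$$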
Since $\alpha$ is the energy expectation value of an electron very close to one nucleus alone
and (like $\beta$) has a negative value, it follows that the Bonding MO $\phi_B$ has a lower energy
($E_B$) than the free-atom AO, while the Antibonding MO $\phi_A$ has a higher energy ($E_A$).
Note, however, that the upward displacement of the free-atom energy level in going to
the antibonding level is greater than the downward displacement in going to the bonding
level, owing to the overlap term. All this is shown very nicely in a correlation diagram
which shows how the energies of the AOs on two identical atoms are related to those of
the MOs which result when the atoms are combined to form a homonuclear diatomic
molecule (a 'homonuclear diatomic', for short).
Such a diagram, describing the formation of H$_2$, is shown in Figure 6.2, energy levels for
the separate atoms being indicated on the left and right with the MO energies in the centre.
Figure 6.2 Energies of orbitals in a homonuclear diatomic. AO levels shown left and right; MO levels shown in the centre.
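To see the unequal splitting in numbers, the two roots of the symmetric problem, $(\alpha + \beta)/(1 + S)$ and $(\alpha - \beta)/(1 - S)$, can be evaluated for any chosen parameter values. The values below are purely illustrative (hypothetical numbers in atomic units, not results taken from the text), and the function name is made up for this sketch.

```python
def homonuclear_levels(alpha, beta, S):
    """Bonding and antibonding MO energies for two identical AOs."""
    e_bond = (alpha + beta) / (1 + S)
    e_anti = (alpha - beta) / (1 - S)
    return e_bond, e_anti

# Illustrative (hypothetical) parameter values, in atomic units
alpha, beta, S = -0.89, -0.52, 0.46

e_bond, e_anti = homonuclear_levels(alpha, beta, S)
shift_down = alpha - e_bond   # how far the bonding level lies below alpha
shift_up = e_anti - alpha     # how far the antibonding level lies above alpha

print(f"E_B = {e_bond:.4f}, pushed down by {shift_down:.4f}")
print(f"E_A = {e_anti:.4f}, pushed up by   {shift_up:.4f}")
```

With a positive overlap S the divisor 1 - S is smaller than 1 + S, so the upward shift is always the larger of the two, as the correlation diagram indicates.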
Remember that we're still talking about a one-electron system, the hydrogen molecule
positive ion, and that this is homonuclear. But before going to the neutral molecule, with
two electrons, we may want to think also about other 2-electron systems such as HeH$^+$,
with one of the Hydrogens (H) replaced by a Helium atom (He) and one of the three
electrons taken away, giving you the Helium Hydride positive ion. In that case we'll
have a heteronuclear system in which the two nuclei are different and the forms of the
acceptable MOs must also be changed. Helium hydride is a very rare species, though it
was important in the very early stages of the developing Universe, when there weren't
many atoms around; only the very lightest ones like hydrogen and helium had already
formed. But it gives us a general pattern or prototype for the study of heteronuclear
diatomic systems, which are present in great abundance in today's world. So it's worth
looking at the system briefly, in the example that follows:
Example 6.1 A heteronuclear system: HeH$^+$.
HeH$^+$ is a system with two electrons moving in the field of two nuclei, but it differs from the hydrogen
molecule in having a Helium nucleus (with charge Z = 2) in place of one of the protons. Let's take it
as Nucleus a in our study of H$_2$ and ask what MOs can be formed from the AOs $\chi_a$ and $\chi_b$ when the
different atoms come together. We first take one electron away, leaving the doubly-positive ion HeH$^{2+}$
for which the MOs may be determined. The 1-electron Hamiltonian then looks much the same as in the
case of H$_2^+$, given in (6.1), except that the $(1/r_a)$-term will have Z = 2 in the numerator instead of Z = 1.
But this is enough to destroy the symmetry and the acceptable MOs will no longer have the simple forms
(6.4). Instead we must go back to the stationary value conditions to determine the mixing coefficients
$c_a$, $c_b$.
Again, using $\beta$ and S for short (in place of $\beta_{ab}$, $S_{ab}$), the coefficients may be eliminated by division to
give the single equation
$$(\alpha_a - \bar{E})(\alpha_b - \bar{E}) - (\beta - S\bar{E})^2 = 0.$$
This can be solved by the method you first used in Book 1 (Section 5.3), to give two approximate eigenvalues $\bar{E}_B$ (lower energy) and $\bar{E}_A$ (upper energy). These correspond to the Bonding and Antibonding
levels for a homonuclear system (see Figure 6.2), but solving the quadratic equation by the standard
method doesn't give a simple result comparable with (6.4).
Instead, we use a simple approximation which shows directly how much the AO energies for the free
atoms (roughly $\alpha_a$ and $\alpha_b$) are respectively pushed down, to give $\bar{E}_B$, and pushed up, to give $\bar{E}_A$. The
interaction, which does this, is caused by the term $(\beta - S\bar{E})^2$. If we neglect this term, $\bar{E} \approx \alpha_a$, the
lower of the two AO energies (corresponding to Z = 2), so let's use this approximation to estimate the
effect of the other terms: the last equation is then replaced by
$$(\alpha_a - \bar{E})(\alpha_b - \alpha_a) - (\beta - \alpha_a S)^2 = 0,$$
which gives (check it!)
$$\bar{E} - \alpha_a = -\frac{(\beta - \alpha_a S)^2}{\alpha_b - \alpha_a}.$$
This is the approximation to the lowest root of the quadratic equation, which we called $\bar{E}_B$, the energy
of the Bonding MO.
A similar argument (you should try it) shows that the higher AO energy $\alpha_b$ is pushed up as a result of
the mixing, giving an approximation to the energy of the Antibonding MO.
The results from Example 6.1 may be summarized as follows. The Bonding and Antibonding MOs used in describing the interaction of two different atoms to yield a heteronuclear
diatomic molecule have corresponding MO energies
$$E_B = \alpha_a - \frac{(\beta - \alpha_a S)^2}{\alpha_b - \alpha_a}, \qquad E_A = \alpha_b + \frac{(\beta - \alpha_b S)^2}{\alpha_b - \alpha_a}.$$
These results should be compared with those in (6.5) and (6.6), which apply to a homonuclear molecule. In particular
• the lower of the two energy levels, in this case $\alpha_a$, is pushed down to give the Bonding
level $E_B$. But whereas the shift for a homonuclear molecule was roughly $\beta$ it is now
roughly proportional to the square of $\beta$ (neglecting the small overlap term $\alpha_a S$),
divided by the difference of the free-atom energies $\alpha_b - \alpha_a$;
• the upper free-atom level is raised by a similar amount, to give the energy $E_A$ of
the Antibonding MO;
• these effects are both much smaller than in the case of a homonuclear system,
unless the free-atom energies are close together. They are of second order in the
interaction term $\beta$.
The correlation diagram in Figure 6.2 is now replaced by the one shown below:
Figure 6.3 Energies of orbitals in a heteronuclear diatomic. AO levels shown left and right; MO levels shown in the centre.
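It is instructive to compare the approximate second-order formulas for the heteronuclear levels with the exact roots of the quadratic equation $(\alpha_a - E)(\alpha_b - E) - (\beta - SE)^2 = 0$. The sketch below does this for one set of illustrative (hypothetical) parameter values in atomic units; expanding the quadratic gives $(1 - S^2)E^2 - (\alpha_a + \alpha_b - 2\beta S)E + (\alpha_a\alpha_b - \beta^2) = 0$, which is what the function solves.

```python
import math

def exact_levels(alpha_a, alpha_b, beta, S):
    """Both roots of (alpha_a - E)(alpha_b - E) - (beta - S*E)**2 = 0."""
    a = 1 - S * S
    b = -(alpha_a + alpha_b - 2 * beta * S)
    c = alpha_a * alpha_b - beta * beta
    root = math.sqrt(b * b - 4 * a * c)
    return (-b - root) / (2 * a), (-b + root) / (2 * a)

def approximate_levels(alpha_a, alpha_b, beta, S):
    """Second-order estimates, valid when alpha_a lies well below alpha_b."""
    gap = alpha_b - alpha_a
    e_bond = alpha_a - (beta - alpha_a * S) ** 2 / gap
    e_anti = alpha_b + (beta - alpha_b * S) ** 2 / gap
    return e_bond, e_anti

# Illustrative (hypothetical) values, in atomic units
alpha_a, alpha_b, beta, S = -2.0, -0.5, -0.3, 0.2

exact_b, exact_a = exact_levels(alpha_a, alpha_b, beta, S)
approx_b, approx_a = approximate_levels(alpha_a, alpha_b, beta, S)

print(f"bonding    : exact {exact_b:.5f}, approximate {approx_b:.5f}")
print(f"antibonding: exact {exact_a:.5f}, approximate {approx_a:.5f}")
```

For well-separated free-atom energies, as here, the two sets of numbers agree closely and both shifts are small compared with the gap $\alpha_b - \alpha_a$. If the gap is made smaller the approximation gets worse, as expected for a second-order estimate.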
It's time to say why we're talking about 'bonding' and 'antibonding' orbitals. You'll
remember from Book 5 that sometimes atoms 'stick together' to form molecules and other
structures: gases, liquids, solids and so on. Until the early part of the last century this had
to be accepted as a general property of matter and further details had to be investigated
experimentally. Only now, following the development of Quantum Mechanics, are we in
a position to say why atoms behave like that. This property is called valency: when an
atom usually sticks to only one other atom it is said to be mono-valent. But some atoms,
such as Carbon, often combine with one, two, three or more others; they have a variable
valency, making them poly-valent and giving them the possibility of forming a very rich
variety of molecular structures.
The chemical bond
In Book 5, where you first met molecules, they were often represented in terms of ball
and stick models: the 'balls' represented the atoms, while the 'sticks' that connected
them stood for the chemical bonds that held them together. This is still a widely used
way of picturing molecules of all kinds, ranging from simple diatomics to the gigantic
structures studied in the Life Sciences (see Book 9), where the molecules may contain
many thousands of atoms arranged in long chains and carrying the genetic code.
Here, however, we are concerned with the sticks that join the different atoms: what are
they and how do they work? At bottom, they must be associated with the electrons and
nuclei that carry negative and positive electric charge and with their interaction energy.
And we have just seen how it is possible for even the single electron of a Hydrogen atom
to enter a state of lower energy by bringing up a second proton, so that the electron is
attracted to two positive nuclei instead of one. In that case we are imagining the formation
of a molecular ion H$_2^+$, in which the electron occupies a Bonding MO.
Let's examine this case in more detail. In equation (6.9) we have an expression for the
energy of an electron in the Bonding MO $\phi_B$, as a function of the parameters $\alpha$, $\beta$, and
S (the Coulomb, resonance, and overlap integrals). These parameters depend on the
geometry of the system (i.e. the positions of the two nuclei) and are not too difficult
to calculate in terms of the internuclear separation R. When this is done, the electronic
energy of the system can be plotted against R and is found to increase steadily, going
towards the energy of a free hydrogen atom, namely $-\tfrac{1}{2}e_H$, in the long-range limit $R \to \infty$.
This is shown in the curve labelled 'Electronic energy' in Figure 6.4 (below); but this
has no minimum which would indicate a stable diatomic system. So what's wrong?
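[Figure 6.4: energy plotted against internuclear distance R, with curves labelled Nuclear repulsion energy, Electronic energy and Resultant energy; the long-range limit $E = -\tfrac{1}{2}e_H$ is marked.]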
The fact is simply that we haven't yet included the energy of repulsion between the two
nuclei: this is $E_{\text{nuc}} = 1/R$ (in atomic units) and goes from a large positive value, when the nuclei are close
together, to zero when $R \to \infty$. We didn't even include this term in the Hamiltonian (6.1)
as it didn't depend on the electronic variables. Strictly it should be included (the protons
are part of the system!); but then the expectation value $E = \langle\Psi|H|\Psi\rangle$, for any normalized
state with wave function $\Psi(\mathbf{r})$, would contain an additive constant $E_{\text{nuc}}$, which can be put
in at the end of the calculation. When this is done, the total energy of the system becomes
the sum of two terms; the repulsion energy $E_{\text{nuc}}$ and $E_B$, the energy of the electron in the
Bonding MO. The two terms are in competition, one falling as R increases, the other
rising; and together they give a total energy E(R) which shows a shallow minimum at a
certain value $R = R_0$. This means there is a chemical bond between the two atoms, with
bond length $R_0$, say. The variation of all three energy terms, as functions of internuclear
distance, is shown in Figure 6.4; and the total energy that results behaves as in the curve
labelled 'Resultant energy'.
Of course, this is not for the normal hydrogen molecule but rather the molecule ion
that remains when one electron is taken away. However, the energy of H$_2$ behaves in
a very similar way: the electronic energy expression has just the same form as for any
2-electron system, as given in (2.8). The big difference is that the 1-electron terms,
$\langle\Psi|h(1)|\Psi\rangle$ and $\langle\Psi|h(2)|\Psi\rangle$, and the 2-electron term $\langle\Psi|g(1, 2)|\Psi\rangle$, are much more difficult
to evaluate. Remember that the wave function we're going to use is a product $\Psi(\mathbf{r}_1, \mathbf{r}_2) =
\phi_B(\mathbf{r}_1)\phi_B(\mathbf{r}_2)$, where both electrons are shown in the Bonding MO $\phi_B$, which describes the
state of lowest energy $2E_B$ when the interelectronic repulsion energy, $J = \langle\Psi|g(1, 2)|\Psi\rangle$,
is neglected. Since J is positive the total electronic energy will now have a lowest possible
expectation value
$$E = 2E_B + J,$$
corresponding to the molecular ground state. This has the same form as that for the
2-electron atom; but the 1-electron part, $2E_B$, will now depend on the attraction of the
electron to both nuclei and therefore on their separation R, which determines their
positions in space. Apart from this weak dependence on R, the total electronic energy of
the system will behave in much the same way as for the ion H$_2^+$, while the internuclear
repulsion energy remains unchanged as $E_{\text{nuc}}$.
The relevant energy curves for both the normal molecule and its positive ion are therefore
rather similar in form. Those for the ion are shown above. The value of R at which the
energy has its minimum is usually called the equilibrium bond length and is denoted
by $R_e$, while the energy difference between that at the minimum and that for $R \to \infty$
is called the dissociation energy, denoted by $D_e$. The term 'equilibrium' is of course
not quite correct: the nuclei are in fact moving and it is an approximation to do the
calculation as if they were at rest, for a series of fixed values of R. But it is usually a decent
first approximation which can later be refined to take account of vibration and rotation
of the system around its equilibrium configuration; and anyway we've made more serious
approximations already in using such a simple form of the electronic wave function.
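The competition between electronic energy and nuclear repulsion can be followed numerically for the molecule ion. The sketch below is not the author's calculation: it assumes that the AOs are hydrogen 1s functions with unit exponent, for which the overlap and the two nuclear-attraction integrals have well-known closed forms (quoted in the function docstrings, in atomic units). It then scans R on a grid to locate the minimum of the total energy of the Bonding MO.

```python
import math

def overlap(R):
    """S = <chi_a|chi_b> for two 1s AOs a distance R apart."""
    return math.exp(-R) * (1 + R + R * R / 3)

def coulomb_attraction(R):
    """<chi_a| 1/r_b |chi_a>: attraction of the charge on a to nucleus b."""
    return 1 / R - math.exp(-2 * R) * (1 + 1 / R)

def exchange_attraction(R):
    """<chi_a| 1/r_a |chi_b>: attraction of the overlap charge to a nucleus."""
    return math.exp(-R) * (1 + R)

def bonding_energies(R):
    """Electronic, nuclear and total energy for the Bonding MO at distance R."""
    S = overlap(R)
    alpha = -0.5 - coulomb_attraction(R)
    beta = -0.5 * S - exchange_attraction(R)
    electronic = (alpha + beta) / (1 + S)
    nuclear = 1 / R
    return electronic, nuclear, electronic + nuclear

grid = [0.5 + 0.01 * k for k in range(1, 951)]   # R from 0.51 to 10 bohr
totals = [bonding_energies(R)[2] for R in grid]
E_min = min(totals)
R_e = grid[totals.index(E_min)]
D_e = -0.5 - E_min        # depth of the well below the free-atom energy

print(f"R_e = {R_e:.2f} bohr, E_min = {E_min:.4f} hartree, D_e = {D_e:.4f} hartree")
```

Within this simple model the minimum falls near R = 2.5 bohr with a well depth of roughly 0.065 hartree. The electronic term alone rises steadily with R and the nuclear term falls, exactly as described above; only their sum shows the shallow minimum.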
Figure 6.4 showed the existence of a minimum energy when the two nuclei of a diatomic
molecule were at a certain distance $R_e$, which we called the equilibrium bond length, but
offered no explanation of how the bond originates: where does it come from? But another
way of saying that the system is in equilibrium is to say that the distribution of electrons
must produce forces, acting on the nuclei, that balance the force of repulsion between
their positive charges. And we know already that the electron density function P (r),
defined in Chapter 5 for a general many-electron system, will give us a way of calculating
the energy of interaction between the nuclei and the electron distribution.
The charges on the two nuclei produce an electric field and the potential energy function
for unit charge at point $\mathbf{r}$ in that field will be $V(\mathbf{r})$; so the electron/nuclear interaction
energy for one electron will be $-eV(\mathbf{r})$. When the electronic charge is, in effect, 'smeared
out' with a density $P(\mathbf{r})$ electrons/unit volume, the total interaction energy will be
$$V_{en} = \int [-eV(\mathbf{r})]\,P(\mathbf{r})\,d\mathbf{r}.$$
We now want to know how the contributions to $V_{en}$ can arise from different parts of the
electron distribution. We start with a very simple example: one electron in an MO, which
may be of bonding or anti-bonding type.
Example 6.2 Analysis of the electron density.
Let's think of an electron in a molecular orbital built up from two atomic orbitals, $\chi_1$, $\chi_2$, as the linear
combination $\phi = c_1\chi_1 + c_2\chi_2$. The electron density function will then be (using for simplicity normalized
and real MO functions)
$$P(\mathbf{r}) = c_1^2\chi_1(\mathbf{r})^2 + 2c_1c_2\chi_1(\mathbf{r})\chi_2(\mathbf{r}) + c_2^2\chi_2(\mathbf{r})^2.$$
There are three parts to the density, two orbital densities and one overlap density;
$$d_1(\mathbf{r}) = \chi_1(\mathbf{r})^2, \quad d_2(\mathbf{r}) = \chi_2(\mathbf{r})^2, \quad d_{12}(\mathbf{r}) = \chi_1(\mathbf{r})\chi_2(\mathbf{r})/S_{12},$$
where $S_{12} = \langle\chi_1|\chi_2\rangle$ and all three terms are therefore normalized to unity. On writing $c_1^2 = P_{11}$, $c_2^2 =
P_{22}$, $c_1c_2 = P_{12}$, the electron density takes the form
$$P(\mathbf{r}) = q_1 d_1(\mathbf{r}) + q_2 d_2(\mathbf{r}) + q_{12} d_{12}(\mathbf{r}).$$
Here $q_1 = P_{11}$, $q_2 = P_{22}$, $q_{12} = 2S_{12}P_{12}$ are the quantities of charge, in electron units, associated with the
orbital and overlap densities. Provided the MO is correctly normalized, the sum of the q's must be 1
electron: $q_1 + q_2 + q_{12} = 1$. The individual q's indicate in a formal way the electron populations of the
various regions in space.
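The populations of Example 6.2 are easily worked out for the two symmetric MOs of a homonuclear system, where $c_1 = c_2 = 1/\sqrt{2 + 2S_{12}}$ (bonding) or $c_1 = -c_2 = 1/\sqrt{2 - 2S_{12}}$ (antibonding). In the sketch below the overlap value is an illustrative (hypothetical) number and the function name is made up for the purpose.

```python
import math

def populations(c1, c2, S12):
    """Orbital populations q1, q2 and overlap population q12 for one electron."""
    q1 = c1 * c1
    q2 = c2 * c2
    q12 = 2 * S12 * c1 * c2
    return q1, q2, q12

S12 = 0.46                            # illustrative (hypothetical) overlap
n_bond = 1 / math.sqrt(2 + 2 * S12)   # normalizing factor, Bonding MO
n_anti = 1 / math.sqrt(2 - 2 * S12)   # normalizing factor, Antibonding MO

bonding = populations(n_bond, n_bond, S12)
antibonding = populations(n_anti, -n_anti, S12)

for name, q in (("bonding", bonding), ("antibonding", antibonding)):
    print(f"{name:12s} q1 = {q[0]:.4f}  q2 = {q[1]:.4f}  q12 = {q[2]:+.4f}  sum = {sum(q):.4f}")
```

In both cases the populations add up to one electron, but the overlap population is positive for the Bonding MO (charge is piled up between the nuclei) and negative for the Antibonding MO (charge is removed from that region).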
The following Figure 6.5 gives a very schematic picture of the electron distribution in the
H$_2^+$ ion, according to the LCAO approximation, for the two states in which the electron
occupies the Bonding MO (left) or the Anti-bonding MO (right).
and the restriction of the double summation to terms with i < j makes sure that each
overlap is counted only once.
This conclusion is valid for any N-electron wave function expressed in finite basis form
with any number of basis functions $\chi_i$, which need not even be AOs (though we often
continue to use the term in, for example, the LCAO approximation). Nowadays the
charges $q_i$ and $q_{ij}$ are usually called orbital and overlap populations of the regions
defined by the functions $\chi_i$ and their products $\chi_i\chi_j$; and this way of describing the results
of electronic structure calculations is called electron population analysis. It will be used
often when we study particular molecules.
In Chapter 5 we obtained a general expression for the electron density function $\rho(\mathbf{x}_1)$ for
an N-electron system of particles with spin, using $\mathbf{x}_i$ to denote the space-spin variables
of Particle i. The probability of finding Particle 1 in volume element $d\mathbf{x}_1 = d\mathbf{r}_1 ds_1$ was
$$d\mathbf{x}_1 \int \Psi(\mathbf{x}_1, \mathbf{x}_2, ... \mathbf{x}_N)\,\Psi^*(\mathbf{x}_1, \mathbf{x}_2, ... \mathbf{x}_N)\,d\mathbf{x}_2 ... d\mathbf{x}_N,$$
obtained by integrating over all positions of the other particles. And, since the same
result will be obtained for whichever electron is found in volume element $d\mathbf{x}_1$ at point
$\mathbf{x}_1$, multiplication by N will give equation (5.24). Thus, the probability/unit volume of
finding a particle, no matter which, at point $\mathbf{x}_1$ will be
$$\rho(\mathbf{x}_1) = N\int \Psi(\mathbf{x}_1, \mathbf{x}_2, ... \mathbf{x}_N)\,\Psi^*(\mathbf{x}_1, \mathbf{x}_2, ... \mathbf{x}_N)\,d\mathbf{x}_2 ... d\mathbf{x}_N.$$
Remember that this is the probability density with spin variable included. If we're not
interested in spin it's enough to sum over both spin possibilities by integrating over the
spin variable $s_1$ to obtain a spinless density function $P(\mathbf{r}_1) = \int\rho(\mathbf{x}_1)\,ds_1$. The result
is the probability density for finding a particle, of either spin, in a volume element $d\mathbf{r}_1$
(e.g. $dx_1 dy_1 dz_1$) at point $\mathbf{r}_1$ in ordinary 3-space. If you look back at Examples 5.6 and
5.7 in the last chapter you'll find that you've done all this before for atoms, thinking
mainly of IPM-type wave functions. But the results apply to any kind of wave function
(approximate or exact and for any kind of system). So now we're ready to deal with
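The 1- and 2-electron density matrices, including dependence on spin variables, are
$$\rho(\mathbf{x}_1; \mathbf{x}_1') \quad \text{and} \quad \pi(\mathbf{x}_1, \mathbf{x}_2; \mathbf{x}_1', \mathbf{x}_2').$$
They determine the expectation values of operator sums, such as $\sum_i h(i)$ and $\frac12\sum_{i \ne j} g(i,j)$,
in any state $\Psi$. For example
$$\langle\Psi|\sum_i h(i)|\Psi\rangle = \int_{\mathbf{x}_1' = \mathbf{x}_1} h(1)\,\rho(\mathbf{x}_1; \mathbf{x}_1')\,d\mathbf{x}_1.$$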
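From now on, to simplify the text, let's remember that the primes are only needed when
an operator works on a density matrix, being removed immediately after the operation,
so we'll stop showing them, writing the expectation values as
$$\langle\Psi|\sum_i h(i)|\Psi\rangle = \int h(1)\,\rho(\mathbf{x}_1)\,d\mathbf{x}_1, \qquad \langle\Psi|\tfrac12\sum_{i \ne j} g(i,j)|\Psi\rangle = \tfrac12\int g(1,2)\,\pi(\mathbf{x}_1, \mathbf{x}_2)\,d\mathbf{x}_1 d\mathbf{x}_2, \qquad (6.16)$$
and remembering what the short forms mean.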
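When tiny spin-dependent terms are neglected these results may be reduced in terms of
the spinless density matrices $P(\mathbf{r}_1; \mathbf{r}_1')$ and $\Pi(\mathbf{r}_1, \mathbf{r}_2; \mathbf{r}_1', \mathbf{r}_2')$. The counterparts of (6.16)
apply when the operators do not touch the spin variables; they become instead
$$\langle\Psi|\sum_i h(i)|\Psi\rangle = \int h(1)\,P(\mathbf{r}_1)\,d\mathbf{r}_1, \qquad \langle\Psi|\tfrac12\sum_{i \ne j} g(i,j)|\Psi\rangle = \tfrac12\int g(1,2)\,\Pi(\mathbf{r}_1, \mathbf{r}_2)\,d\mathbf{r}_1 d\mathbf{r}_2, \qquad (6.17)$$
and involve only the position variables of typical particles.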
In what follows we'll use the reduced forms in (6.17), which apply when relativistic corrections are ignored. The total electronic energy of any system of $N$ electrons, moving around
a set of fixed nuclei, can then be expressed in a very simple and transparent form. The 1-electron operator for an electron at point $\mathbf{r}_1$ in ordinary 3-space is $h(1) = -\frac12\nabla^2(1) + V(\mathbf{r}_1)$
(kinetic energy plus potential energy in field of the nuclei), while the 2-electron operator
for electrons at points $\mathbf{r}_1$ and $\mathbf{r}_2$ is simply the Coulomb repulsion energy, $g(1,2) = 1/r_{12}$ (in
atomic units), $r_{12}$ being the interelectronic distance (the length of the vector separation
$\mathbf{r}_2 - \mathbf{r}_1$). On putting these terms in the energy expectation value formula $E = \langle\Psi|H|\Psi\rangle$,
we find (do it!)
$$E = -\tfrac12\int \nabla^2(1)\,P(\mathbf{r}_1)\,d\mathbf{r}_1 + \int V(1)\,P(\mathbf{r}_1)\,d\mathbf{r}_1 + \tfrac12\int g(1,2)\,\Pi(\mathbf{r}_1, \mathbf{r}_2)\,d\mathbf{r}_1 d\mathbf{r}_2. \qquad (6.18)$$
Here the three terms are, respectively, $T$, the total kinetic energy; $V_{en}$, the potential energy
of a smeared out electronic charge in the field of the nuclei; and the average potential
energy $V_{ee}$ due to pairwise repulsions described by the 2-electron density $\Pi(\mathbf{r}_1, \mathbf{r}_2)$.
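the first-order changes to be considered result from $\delta h$ and from changes in the density functions $P(1)$ and
$\Pi(1,2)$. The total first-order energy change will therefore be (noting that the operators
$\nabla^2(1)$ and $g(1,2)$ remain unchanged)
$$\delta E = \int \delta h(1)\,P(\mathbf{r}_1)\,d\mathbf{r}_1 + \int h(1)\,\delta P(\mathbf{r}_1)\,d\mathbf{r}_1 + \tfrac12\int g(1,2)\,\delta\Pi(\mathbf{r}_1, \mathbf{r}_2)\,d\mathbf{r}_1 d\mathbf{r}_2 \qquad (6.19)$$
and the stationary value condition will follow on equating this quantity to zero.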
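Now suppose that the density functions have been fully optimized by varying the energy
in the absence of any perturbation term $\delta h(1)$. In that case only the last two terms remain
in (6.19) and their sum must be equated to zero. Thus, when the perturbation is applied,
$$\delta E = \int \delta h(1)\,P(\mathbf{r}_1)\,d\mathbf{r}_1. \qquad (6.20)$$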
This is usually called the Hellmann-Feynman theorem in its general form. It was
discovered by Hellmann (1937) for the special case where the perturbation was due to a
change of nuclear position and independently, two years later, by Feynman. In thinking
about the forces that hold the nuclei together in a molecule we first have to define them: if
we move one nucleus, nucleus $n$ say, through a distance $\delta X_n$ in the direction of the x-axis,
we'll change the total energy of the molecule by an amount $\delta E$ given in (6.20). And in
this case $\delta h(1) = \delta V_n(\mathbf{r}_1)$, the corresponding change in potential energy of an electron at
point $\mathbf{r}_1$ in the field of the nucleus.
Now the rate of decrease of this potential energy is the limiting value of $-\delta V_n(\mathbf{r}_1)/\delta X_n$
as $\delta X_n \to 0$ and measures the x-component of the force acting between Nucleus $n$ and an
electron at point $\mathbf{r}_1$. Thus we may write
$$-\frac{\partial V_n(\mathbf{r}_1)}{\partial X_n} = F_{nx}(\mathbf{r}_1),$$
while
$$-\frac{\partial E}{\partial X_n} = F_{nx}$$
defines the x-component of total force on Nucleus $n$ due to interaction with the whole
electron distribution.
Having defined the forces, in terms of energy derivatives, we return to (6.20). Here,
putting $\delta h(1) = \delta V_n(\mathbf{r}_1)$, dividing by $\delta X_n$ and going to the limit $\delta X_n \to 0$, we find
$$F_{nx} = \int F_{nx}(\mathbf{r}_1)\,P(\mathbf{r}_1)\,d\mathbf{r}_1. \qquad (6.21)$$
In words, the x-component of the total force on any nucleus (n) may be computed by
adding (integrating) the contributions arising from all elements of the charge cloud. This
is true for any component and therefore the force vector acting on any nucleus in the
molecule can be calculated in exactly the same way: once the electron density has been
computed by quantum mechanics the forces holding the nuclei together may be given
an entirely classical interpretation. When the molecule is in equilibrium it is because
the forces exerted on the nuclei by the electron distribution are exactly balanced by the
repulsions between the nuclei as they were in Figure 6.4.
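The classical picture of the force can be tried out numerically. The sketch below is an illustration under stated assumptions, not a molecular calculation: the electron density is a normalized Gaussian cloud (exponent `alpha`, an assumed model) holding one electron and centred a distance `d` along the z axis from a nucleus of charge `Z` at the origin, and the force integral is done by a simple midpoint sum on a cubic grid. For a spherical cloud the result can be compared with elementary electrostatics: only the charge lying closer to the cloud centre than the nucleus contributes.

```python
import math
import numpy as np

def force_on_nucleus(alpha=0.5, d=1.0, Z=1.0, half_width=8.0, n=80):
    """z-component of the force (atomic units) on a nucleus at the origin,
    obtained by summing the attractions of all elements of the charge cloud."""
    h = 2.0 * half_width / n
    pts = -half_width + h * (np.arange(n) + 0.5)    # cell midpoints, never exactly 0
    x, y, z = np.meshgrid(pts, pts, pts, indexing="ij")
    # Model density P(r): one electron in a Gaussian cloud centred at (0, 0, d)
    density = (alpha / math.pi) ** 1.5 * np.exp(-alpha * (x**2 + y**2 + (z - d) ** 2))
    # Force on the nucleus from unit electronic charge at r: F_z(r) = Z z / r^3
    fz = Z * z / (x**2 + y**2 + z**2) ** 1.5
    return float(np.sum(fz * density) * h**3)

def exact_force(alpha=0.5, d=1.0, Z=1.0):
    """Closed form: Z times the charge within distance d of the cloud centre, over d^2."""
    t = math.sqrt(alpha) * d
    enclosed = math.erf(t) - 2.0 * t / math.sqrt(math.pi) * math.exp(-t * t)
    return Z * enclosed / d**2

numeric = force_on_nucleus()
exact = exact_force()
print(f"grid integration: {numeric:.4f}   electrostatics: {exact:.4f}")
```

The positive sign means the nucleus is pulled towards the charge cloud, as expected for the attraction between unlike charges.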
This beautiful result seems too good to be true! Apparently only the electron density
function $P(\mathbf{r}_1)$ is needed and the 2-electron function $\Pi(\mathbf{r}_1, \mathbf{r}_2)$, which is vastly more difficult
to calculate, doesn't come into the picture. So what have we overlooked?
In deriving (6.21) we simply took for granted that the variational wave function was
fully optimized, against all the parameters it may contain. But in practice that is hardly
ever possible. Think, for example, of an LCAO approximation, in which the atomic
orbitals contain size parameters (orbital exponents) and the coordinates of the points
around which they are centred: in principle such parameters should be varied in the
optimization, allowing the orbitals to expand or contract or to 'float' away from the
nuclei on which they are located. In practice, however, that is seldom feasible and the
Hellmann-Feynman theorem remains an idealization, though one which is immensely
useful as a qualitative tool for understanding molecular structure even at a simple level.
Chapter 7
Molecules: Basic Molecular Orbital
We start this chapter by going back to the simple theory used in Chapter 6, to see how
well it works in accounting for the main features of molecules formed from the elements
in the first row of the Periodic Table.
In Section 6.2 we studied the simplest possible diatomic system, the Hydrogen molecule
positive ion H$_2^+$, formed when a proton approaches a neutral Hydrogen atom. And even
in Chapter 5 we had a glimpse of the Periodic Table of all the elements: the first ten
atoms, with their electron configurations, are
Hydrogen[$1s^1$] Helium[$1s^2$] Lithium[$1s^2 2s^1$] Beryllium[$1s^2 2s^2$]
in which the first two s-type AOs are filling (each with up to two electrons
of opposite spin component, $\pm\frac12$), followed by six more,
Boron[$1s^2 2s^2 2p^1$] Carbon[$1s^2 2s^2 2p^2$] Nitrogen[$1s^2 2s^2 2p^3$]
Oxygen[$1s^2 2s^2 2p^4$] Fluorine[$1s^2 2s^2 2p^5$] Neon[$1s^2 2s^2 2p^6$]
in which the p-type AOs ($p_x$, $p_y$, $p_z$) are filling with up to two electrons in each.
Helium, with two electrons in the 1s shell, doesn't easily react with anything; it is the
first Inert Gas atom. So let's turn to Lithium, with one 2s electron outside its $(1s)^2$ inner
shell, and ask if this would react with an approaching atom of Hydrogen. We could, for
example, try to calculate the total electronic energy $E$ using the Self-Consistent Field
method (see Chapter 4) and then adding the nuclear repulsion energy, as we did for the
molecule H$_2$ in Section 6.2. Again, as we don't have any ready-made molecular orbitals
we have to build them out of a set of basis functions, $\chi_1, \chi_2, \ldots \chi_i \ldots$ and it seems most
reasonable to choose these as the atomic orbitals of the atoms we are dealing with, writing
the MO $\phi$ as
$$\phi = c_1\chi_1 + c_2\chi_2 + \ldots + c_m\chi_m \qquad (7.1)$$
for a basis of $m$ functions. This is the famous LCAO (linear combination of atomic
orbitals) approximation, which is the one most widely used in molecular structure calculations. In principle, if the basis set is large enough, this could be a fairly accurate approximation.
As you learnt in Chapter 4 (you should go back there for the details) the MOs should
really be determined by solving the operator equation
$$\mathsf{F}\phi = \epsilon\phi \qquad \text{[the Hartree-Fock equation]}$$
but the best we can do, in LCAO approximation, is to choose the expansion coefficients
so as to minimize the calculated value of the total electronic energy $E$. The best approximate MOs of the form (7.1), along with their corresponding orbital energies ($\epsilon$), are then
determined by solving the secular equations
$$\mathbf{F}\mathbf{c} = \epsilon\mathbf{c},$$
where $\mathbf{c}$ is the column of expansion coefficients in (7.1) and $\mathbf{F}$ is the square matrix representing the effective Hamiltonian $\mathsf{F}$, which has elements $F_{ij} = \langle\chi_i|\mathsf{F}|\chi_j\rangle$.
(Note that this simple form of the secular equations depends on using orthogonal basis functions; but
if overlap is not small enough to be neglected it may be removed by choosing new combinations, work
which is easily done by modern computers.)
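As an illustration of how the secular equations are handled in practice, here is a minimal sketch using NumPy. The 2 x 2 matrix elements (`alpha`, `beta`, `s`) are hypothetical numbers chosen only for the demonstration, and the way overlap is removed (symmetric orthogonalization with $\mathbf{S}^{-1/2}$) is one common choice of "new combinations", assumed here rather than taken from the text.

```python
import numpy as np

def solve_secular(F, S=None):
    """Solve F c = eps c, or F c = eps S c when an overlap matrix S is given.
    Returns orbital energies (ascending) and coefficient columns."""
    F = np.asarray(F, dtype=float)
    if S is None:                       # orthogonal basis: ordinary eigenvalue problem
        return np.linalg.eigh(F)
    S = np.asarray(S, dtype=float)
    s_vals, U = np.linalg.eigh(S)
    X = U @ np.diag(s_vals ** -0.5) @ U.T       # S^(-1/2): orthogonalized combinations
    eps, C_prime = np.linalg.eigh(X @ F @ X)    # secular problem in the new basis
    return eps, X @ C_prime                     # back to the original basis functions

alpha, beta, s = -0.5, -0.3, 0.4        # hypothetical matrix elements and overlap
F = np.array([[alpha, beta], [beta, alpha]])
S = np.array([[1.0, s], [s, 1.0]])

eps_orth, C_orth = solve_secular(F)     # overlap neglected
eps, C = solve_secular(F, S)            # overlap included

print("overlap neglected:", eps_orth)
print("overlap included: ", eps)
```

For this symmetric two-function case the energies with overlap are $(\alpha+\beta)/(1+s)$ and $(\alpha-\beta)/(1-s)$, which gives a quick hand check on the output.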
The electron configuration of the molecule will then be, with four electrons, LiH[$1\sigma^2\,2\sigma^2$].
This indicates a Lithium inner shell, similar to that in the free atom, and a bonding MO $2\sigma$ containing
2 electrons. But the bonding MO is not formed from one 2s AO on the Lithium atom, overlapping with
the Hydrogen 1s AO; instead, it contains two AOs on the Lithium atom. If we want to keep the simple
picture of the bond, as resulting from the overlap of two AOs, one on each atom, we must accept that
the AOs need not be 'pure' but may be mixtures of AOs on a single centre. Ransil's calculation shows
that a much clearer description of LiH is obtained by rewriting his MO in the form
$$\phi_{2\sigma} \approx 0.397\,\chi_{\mathrm{hyb}} + 0.685\,\chi_{\mathrm{H}},$$
where $\chi_{\mathrm{hyb}} = 0.813\,\chi_{2s} + 0.582\,\chi_{2p}$ is called a hybrid orbital and this kind of mixing is called hybridization.
The general form of this Lithium hybrid AO is indicated below in Figure 7.1, in which the contour lines
correspond to given values of the function $\chi_{\mathrm{hyb}}$. The broken line marks the nodal surface separating
negative and positive values of $\chi_{\mathrm{hyb}}$. The energy is lowered by hybridization, which increases the strength
of the bonding by putting more electron density (i.e. negative charge) between the positive nuclei; but
this is offset by the energy $\epsilon_{2p} - \epsilon_{2s}$ needed to 'promote' an electron from a 2s state to a 2p state. So
hybridization is favoured for AOs of similar energy and resisted when their energy difference is large.
Figure 7.1 Contour map for the s-p hybrid orbital $\chi_{\mathrm{hyb}}$
The capacity of an atom to form chemical bonds with other atoms is known as valency
and is often measured by the number of bonds it can form. Lithium in LiH is mono-valent, but Oxygen in H$_2$O is di-valent and Carbon in CH$_4$ is quadri-valent. But many
atoms show variable valency, depending on the nature of the atoms they combine with
and on the degree of hybridization involved. In Example 7.1 the Lithium atom is said to
be in a valence state, depending on the degree of 2s-2p mixing, and this may usefully
be described in terms of the electron populations introduced in Section 6.3. If the hybrid
orbital is written as the mixture $\chi_{\mathrm{hyb}} = a\chi_{2s} + b\chi_{2p}$, an electron in $\chi_{\mathrm{hyb}}$ gives a probability
density $P_{\mathrm{hyb}} = a^2\chi_{2s}^2 + b^2\chi_{2p}^2 + 2ab\,\chi_{2s}\chi_{2p}$. Integration over all space gives unity (1 electron),
with $a^2$ coming from the 2s density, $b^2$ from the 2p and nothing from the last term (the
AOs being orthogonal). We can then say that the 2s and 2p AOs have electron populations
$a^2$ and $b^2$, respectively, in the molecule. The electron configuration of the Lithium atom,
in the molecule, could thus be written Li[$1s^2\,2s^{0.661}\,2p^{0.339}$] (according to Example 7.1),
the numbers being the values of $a^2$ and $b^2$ for an electron in the valence orbital $\chi_{\mathrm{hyb}}$.
The atom never actually passes through a valence state; but the concept is none the
less valuable. For example, the idea that a fraction of an electron has been promoted
from a 2s orbital to an empty 2p shows why hybridization, to produce strong bonds, is
most common for elements on the left side of the Periodic Table, where the 2s-2p energy
separation is small.
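The valence-state populations quoted for Lithium follow directly from the hybrid coefficients in Example 7.1. A minimal sketch of the arithmetic (the function name `hybrid_populations` is just an illustrative label):

```python
def hybrid_populations(a, b):
    """Populations of the 2s and 2p AOs for one electron in a*chi_2s + b*chi_2p,
    assuming the two AOs are normalized and orthogonal."""
    return a * a, b * b

a, b = 0.813, 0.582                      # hybrid coefficients from Example 7.1
pop_2s, pop_2p = hybrid_populations(a, b)
print(f"Li[1s2 2s^{pop_2s:.3f} 2p^{pop_2p:.3f}]   total = {pop_2s + pop_2p:.3f}")
```

The total comes out very slightly below 1 only because the published coefficients are rounded to three figures.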
Now let's try something a bit more complicated. If we replace Lithium by Carbon we
shall have four electrons outside the tightly-bound 1s shell, two of them in the next-higher
energy 2s orbital and two more in the slightly-higher energy 2p orbitals ($2p_x$, $2p_y$, $2p_z$).
These four are not too strongly bound to prevent them taking part in bonding with other
atoms, so they are all available as valence electrons. And if we go two places further
along the First Row we come to Oxygen, which has six electrons outside its $1s^2$ inner shell,
all available, to some degree, for bonding to other atoms. The energy difference between
the 2s and 2p orbitals increases, however, with increasing nuclear charge; and as a result
the elements C and O have rather different valence properties. In the next example we'll
try to understand what can happen when these two atoms come together and begin to
share their valence electrons.
Example 7.2 The Carbon monoxide molecule: CO.
The $1s^2$ inner shells, or cores, of both atoms are so strongly bound to their nuclei that the main effect
they have is to screen the positive charges ($Ze$, with $Z = 6$ for the carbon atom and $Z = 8$ for oxygen); the
effective nuclear charges are then closer to $Z_{\mathrm{eff}} = 4$, for C, and 6 for O. We'll therefore think about
only the valence electrons, asking first what MOs can be formed to hold them.
We already know that the AOs on two different atoms tend to combine in pairs, giving one bonding
MO along with an anti-bonding partner; and that this effect is more marked the more strongly the AOs
overlap. Think of the 2s AOs as spheres and the 2p AOs as dumbbells,
Here the + and $-$ parts indicate regions in which the wave function is positive or negative. Unlike
an s-type AO, one of p-type is associated with a definite direction in space, indicated by the arrow. For
the CO molecule, the 2s AOs on the two centres will not overlap strongly as they come together, owing
to their size difference (the oxygen 2s being smaller; can you say why?). They might give a weakly
bonding MO, consisting mainly of the oxygen 2s AO (which we'll call $1\sigma$, as it would be the first MO
with rotational symmetry around the 2-centre axis). On the other hand, the oxygen $2p_z$ AO pointing
along the axis towards the carbon would have a fairly good overlap with the carbon 2s AO. In that case
we might expect, as usual, two MOs; one bonding (call it $2\sigma$) and the other anti-bonding ($2\sigma^*$).
However, there are three 2p AOs on each centre: besides $2p_z$ there are the $2p_x$ and $2p_y$, both transverse to the bond axis (along
which we've directed the $2p_z$ AO). They are of $\pi$-type symmetry and, when pairs of the same type come
close together, they will have a good side-to-side or lateral overlap:
In summary, the orbitals available for holding the 10 valence electrons would seem to be
$1\sigma$: the first MO of $\sigma$ type, mainly Oxygen 2s
$2\sigma$: a bonding MO, formed from Carbon 2s and Oxygen $2p_z$
$3\sigma$: an anti-bonding MO, the partner of $2\sigma$
$1\pi_x$: a $\pi$-bonding MO, formed from $2p_x$ AOs on C and O
$1\pi_y$: a $\pi$-bonding MO, formed from $2p_y$ AOs on C and O
To assign the electrons to these MOs we look at the probable correlation diagram.
The correlation diagram for the CO molecule, with neglect of hybridization on one or
both atoms, would seem to be that shown below in Figure 7.2:
p orbital energies being fairly close together), the correlation diagram in Figure 7.2 must
be re-drawn. The result is that shown in Figure 7.3, where the 2s and 2p orbital energies
are again indicated on the extreme left (for Carbon) and extreme right (for Oxygen). But
now, when these AOs are allowed to mix, forming hybrids, the (2s) AO of lower energy
must be raised in energy owing to the inclusion of a bit of 2p character; while the energy
of an electron in the upper AO must be lowered, by inclusion of 2s character. The effects
of s-p hybridization are indicated by the broken lines connecting the hybrid energies with
the energies of their parent AOs.
The probable orbital energies of the MOs in the CO molecule are shown in the centre of
the diagram. The broken lines now show how the MO energy levels arise from the hybrid
levels to which they are connected. The energies of the $\pi$-type MOs are not affected by
the hybridization (containing only $2p_x$ and $2p_y$ AOs) and remain as in Figure 7.2, with
the $1\pi$ level (not shown) lying between the $2\sigma$ and $3\sigma$ MO energies.
When we assign the 10 valence electrons to these MOs we find
a pair of electrons in the $1\sigma$ MO, which is mainly Oxygen 2s;
a pair in the $2\sigma$ MO, the bonding combination of strongly overlapping hybrids, pointing
towards each other along the bond axis;
two pairs in the bonding $1\pi$-type MOs, transverse to the bond axis; and
a pair in the $3\sigma$ MO, which is mainly a Carbon hybrid and is too high in energy to mix
with $\sigma$-type AOs on the other atom.
Now let's look at the electron density (density of negative charge), which is given, as a
function of position in space, by $|\phi|^2$ for an electron in the MO $\phi$. (If you're not yet ready
to follow all the details you can skip the following part, in small type, and come back to
it later.)
The first MO (above) gives a density $|\phi|^2$ roughly spherical and strongly bound to the Oxygen $1s^2$ core,
but leaning slightly away from the Carbon (can you say why?)
The second MO gives a density concentrated on and around the CO axis, between the two atoms,
providing a strong $\sigma$ bond
The third MO is degenerate, with density contributions $|\phi_x|^2$ and $|\phi_y|^2$ where, for example, $\phi_x = c_a\chi^a_{2p_x} + c_b\chi^b_{2p_x}$,
a linear combination of $2p_x$ AOs on the two atomic centres. At a general point P$(x, y, z)$, a
$2p_x$ AO has the form $x f(r)$, where distances are measured from the nucleus and the function $f(r)$ is
spherically symmetrical. If you rotate a $2p_x$ AO around the z axis it will change only through the factor
$x$; and the same will be true of the combination $\phi_x$.
The whole electron density function will thus change only through a factor $x^2 + y^2 = r_z^2$, where $r_z$ is the
distance of Point P from the CO bond axis. A 'slice' of density, of thickness $dz$, will be a circular disk of
charge with a hole in the middle because $r_z$ falls to zero on the bond axis. The two $\pi$ bonds together
therefore form a hollow 'sleeve' of electron density, with the $\sigma$ distribution inside along the axis.
The $3\sigma$ HOMO now comes below the $4\sigma$ anti-bonding MO and does not diminish the strong $\sigma$ bond in
any way. It provides essentially a lone-pair electron density, localized mainly on the Carbon. Moreover,
this density will point away from the CO $\sigma$-bond because $h_2$ and $h_1$ stand for orthogonal orbitals and
$h_1$ points into the bond.
The CO molecule should have a triple bond, a strong $\sigma$ bond supported by two weaker
$\pi$ bonds; and the Carbon should have a region of lone-pair electron density on the side
away from the C$\equiv$O triple bond, all in complete agreement with its observed chemical properties.
The CO molecule has 10 (=4+6) valence electrons outside the $1s^2$ cores and is therefore
isoelectronic with the Nitrogen molecule, N$_2$, which is homonuclear and therefore has
a symmetrical correlation diagram. The molecules Nitrogen (N$_2$), Oxygen (O$_2$) and
Fluorine (F$_2$), with 10, 12 and 14 valence electrons, respectively, all have similar energy-level diagrams; but differ in the way the levels are filled as electrons are added. This is all
part of the so-called aufbau approach (aufbau being the German word for 'building
up') in which electrons are added one at a time to the available orbitals, in ascending
order of orbital energy. The First Row atoms use only 1s, 2s and 2p AOs, in which only
the 2s and 2p AOs take part in molecule building (see for example Figure 7.2). But in
homonuclear diatomics the two atoms are identical and the correlation diagram is simpler
because orbitals of identical energy interact very strongly and hybridization may often be
neglected. For First Row atoms a typical diagram is shown in Figure 7.4, below.
$$1\sigma \to 1\sigma^* \to 2\sigma_z \to (1\pi_x, 1\pi_y) \to (1\pi_x^*, 1\pi_y^*) \to 2\sigma_z^*$$
where the arrows indicate increasing order of orbital energies and the subscript $z$ refers to
the bond axis, while $x$ and $y$ label the transverse axes. The $\pi$-type MOs are degenerate,
$x$ and $y$ components having identical energies.
The electron configurations of most of the homonuclear diatomics in the First Row conform to the above order of their MO energy levels. Let's take them one at a time, starting
again with Nitrogen.
Following the aufbau procedure, the first two of the ten valence electrons should go into
the $1\sigma$ MO with opposite spins; the next two will go into its antibonding partner $1\sigma^*$,
more or less undoing the bonding effect of the first pair; two more go into the $2\sigma_z$ MO
which is strongly bonding, being formed from $2p_z$ AOs pointing towards each other. But
then there are four MOs, all of $\pi$ type, formed from the $2p_x$ and $2p_y$ AOs on the two
atoms, which are perpendicular to the $\sigma$ bond: they are $1\pi_x$, $1\pi_y$ and their anti-bonding
partners ($1\pi_x^*$, $1\pi_y^*$), all before we come to $2\sigma_z^*$, which is well separated from $2\sigma_z$ owing
to the strong overlap of the component $2p_z$ AOs. The remaining four of the 10 valence
electrons nicely fill the two bonding MOs and give two bonds of $\pi$ type.
The end result will be that N$_2$ has a triple bond and the electron configuration
$$1\sigma^2\,1\sigma^{*2}\,2\sigma_z^2\,1\pi_x^2\,1\pi_y^2.$$
The next First Row diatomic will be Oxygen, O$_2$.
Here there are 12 valence electrons, two more than in N$_2$, and they must start filling
the anti-bonding $\pi$-type MOs. But we know that when two orbitals are degenerate
electrons tend to occupy them singly: so $1\pi_x^{*1}\,1\pi_y^{*1}$ is more likely than, say, $1\pi_x^{*2}$. And
each antibonding electron will cancel half the effect of a bond pair.
The probable result is that O$_2$ will have a double bond and an electron configuration such as
$$1\sigma^2\,1\sigma^{*2}\,2\sigma_z^2\,1\pi_x^2\,1\pi_y^2\,1\pi_x^{*1}\,1\pi_y^{*1}.$$
Moreover, the electrons in the singly-occupied MOs may have their spins parallel-coupled
giving a triplet ground state (S = 1). This means that Oxygen may be a paramagnetic
molecule, attracted towards a magnetic field. All this is in accord with experiments in the
laboratory. Of course, the theory we are using here is much too simple to predict things
like spin coupling effects (we haven't even included electron interaction!) but experiment
confirms that the last two electrons do indeed have their spins parallel-coupled to give a
triplet state.
The electron configuration for the molecule F$_2$ is obtained by adding two more valence
electrons. This will complete the filling of the $\pi$-type anti-bonding MOs, to give the configuration
$$1\sigma^2\,1\sigma^{*2}\,2\sigma_z^2\,1\pi_x^2\,1\pi_y^2\,1\pi_x^{*2}\,1\pi_y^{*2}.$$
The pairs of electrons in the $1\pi_x^*$ and $1\pi_y^*$ MOs then take away the effect of those in the
corresponding bonding MOs, removing altogether the $\pi$ bonding to leave a single $\sigma$ bond.
The molecule Ne$_2$ does not exist! Neon is an inert gas, like Helium, its atoms not forming
covalent bonds with anything. The reason is simply that, on adding two more electrons,
every bonding MO has an anti-bonding partner that is also doubly occupied. Every Row
of the Periodic Table that ends with the filling of a shell of s- and p-type AOs has a
last atom of inert-gas type: the inert-gas atoms are Helium (He), Neon (Ne), Argon (Ar),
Krypton (Kr), Xenon (Xe), Radon (Rn), with values of the principal quantum number $n$
going from $n = 1$ up to $n = 6$.
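The aufbau bookkeeping for N$_2$, O$_2$, F$_2$ and Ne$_2$ can be summarized in a few lines of code. This is a minimal sketch, assuming the level order described above for First Row homonuclear diatomics, counting half a bond for each excess bonding electron, and letting electrons occupy degenerate MOs singly before pairing; the level labels and the function name `aufbau` are illustrative choices.

```python
# (label, number of degenerate MOs, +1 for bonding / -1 for anti-bonding),
# listed in increasing order of orbital energy
LEVELS = [
    ("1sigma", 1, +1),
    ("1sigma*", 1, -1),
    ("2sigma_z", 1, +1),
    ("1pi", 2, +1),
    ("1pi*", 2, -1),
    ("2sigma_z*", 1, -1),
]

def aufbau(n_valence):
    """Fill the levels in order; return (occupations, bond order, unpaired electrons)."""
    remaining = n_valence
    occupation, bonding, antibonding, unpaired = {}, 0, 0, 0
    for label, degeneracy, kind in LEVELS:
        n = min(remaining, 2 * degeneracy)      # at most two electrons per MO
        remaining -= n
        occupation[label] = n
        if kind > 0:
            bonding += n
        else:
            antibonding += n
        # degenerate MOs are occupied singly first, then the electrons pair up
        unpaired += n if n <= degeneracy else 2 * degeneracy - n
    if remaining:
        raise ValueError("more electrons than these levels can hold")
    return occupation, (bonding - antibonding) / 2, unpaired

for name, n in [("N2", 10), ("O2", 12), ("F2", 14), ("Ne2", 16)]:
    occ, order, spins = aufbau(n)
    print(f"{name}: bond order {order}, unpaired electrons {spins}, {occ}")
```

The output reproduces the triple, double and single bonds of N$_2$, O$_2$ and F$_2$, the two unpaired electrons of O$_2$, and a bond order of zero for Ne$_2$.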
Here we are dealing only with the First Row, that ends with Neon and contains only the
first 10 elements, but we started from Nitrogen (atomic number Z = 7) and continued in
order of increasing Z. The atom before that is Carbon, the most important of the ones
we left out. So let's do it now.
Carbon has only 4 valence electrons outside its $1s^2$ core, so if a C$_2$ molecule exists we
should have to assign 8 electrons to energy levels like the ones shown in Figure 7.4
corresponding to the MOs
$$1\sigma, \quad 2\sigma_z, \quad (1\pi_x, 1\pi_y), \quad (1\pi_x^*, 1\pi_y^*), \quad 2\sigma_z^*.$$
Before we start, however, remember that the s- and p-type energy levels get closer together
as the effective nuclear charge ($Z_{\mathrm{eff}} \approx Z - 2$) gets smaller; and this means that the 2s and
$2p_z$ AOs must be allowed to mix, or hybridize, as in Figure 7.3, where the mixing gives
rise to hybrids $h_1$ and $h_2$. $h_1$ is largely 2s, but with some $2p_z$ which makes it lean into
the $\sigma$ bond; $h_2$, being orthogonal to $h_1$, will be largely $2p_z$, but pointing away from the
bond. This will be so for both Carbons. The correlation diagram should then have the
where the $h_1$ and $h_2$ levels are now relatively close together and the order of the MO
levels they lead to is no longer 'standard'. The order in which they are filled up, in the
$$2\sigma_z, \quad 1\sigma, \quad (1\pi_x, 1\pi_y), \quad (1\pi_x^*, 1\pi_y^*), \quad 2\sigma_z^*,$$
Of course, the orbitals in the new molecule HCCH, which is called Acetylene, are
not quite the same as in C$_2$: the lone-pair orbitals ($h_2$), which we imagined as sticking
out at the back of each Carbon atom, now have protons embedded in them and describe
two CH bonds. Here, in dealing with our first polyatomic molecule, we meet a new
problem: acetylene apparently has two CH single bonds and one CC triple bond. We
are thinking about them as if they were independently localized in different regions of
space; but in MO theory the bonding is described by non-localized orbitals, built up as
linear combinations of much more localized AOs. All the experimental evidence points
towards the existence of localized bonds with characteristic properties. For example,
the bond energies associated with CC and CH links are roughly additive and lead to
molecular heats of formation within a few per cent of those measured experimentally.
Thus, for acetylene, taking the bond energies of CH and CC as 411 and 835 kJ mol$^{-1}$,
respectively, gives an estimated heat of formation of 1657 kJ mol$^{-1}$, roughly the observed
value. (If you've forgotten your Chemistry you'd better go back to Book 5; Science is all
Next we'll ask if similar ideas can be applied in dealing with other polyatomic molecules.
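The additivity estimate quoted for acetylene is simple enough to reproduce in a few lines. The sketch below uses only the two bond energies given in the text; the dictionary keys and the function name `additive_energy` are illustrative labels, and the same counting could be tried for other molecules once their bond energies are known.

```python
# Bond energies quoted in the text, in kJ/mol
BOND_ENERGY = {"CH": 411.0, "CC triple": 835.0}

def additive_energy(bond_counts):
    """Sum of bond energies, assuming each bond contributes independently."""
    return sum(BOND_ENERGY[bond] * count for bond, count in bond_counts.items())

acetylene = {"CH": 2, "CC triple": 1}    # two CH single bonds, one CC triple bond
estimate = additive_energy(acetylene)
print(f"additive estimate for acetylene: {estimate:.0f} kJ/mol")
```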
The discussion of HCCH can easily be put in pictorial form as follows. Each Carbon
atom can be imagined as if it were in a valence state, with two of its four valence
electrons in hybrid orbitals sticking out in opposite directions along the z axis and the
other two in its $2p_x$ and $2p_y$ AOs. This state can be depicted as
where the Carbon is shown as the bold dot in the centre, while the bold arrows stand for
the hybrids $h_1$ and $h_2$, pointing left and right. The empty circles with a dot in the middle
indicate they are singly occupied. The arrows labelled $x$ and $y$ stand for the $2p_x$ and
$2p_y$ AOs, pointing in the positive directions ($-$ to $+$), and the circles each contain a dot
to stand for single occupation.
The electronic structure of the whole molecule HCCH can now be visualized as
where the upper diagram represents the two Carbon atoms in their valence states ($\pi$-type
MOs not indicated); while the lower diagram shows, in very schematic form, the electronic
structure of the molecule HCCH that results when they are brought together and
two Hydrogens are added at the ends. The CC triple bond arises from the $\sigma$-type single
bond, together with the $\pi_x$- and $\pi_y$-type bonds (not shown) formed from side-by-side
overlap of the $2p_x$ and $2p_y$ AOs. The two dots indicate the pair of electrons occupying
each localized MO.
Acetylene is a linear molecule, with all four atoms lying on the same straight line. But
exactly the same principles apply to two- and three-dimensional systems. The Methyl
radical contains four atoms, lying in a plane, Carbon with three Hydrogens attached.
Methane contains five atoms, Carbon with four attached Hydrogens. The geometrical
forms of these systems are experimentally well known. The radical (so-called because it
is not a stable molecule and usually has a very short lifetime) has its Hydrogens at the
corners of an equilateral triangle, with Carbon at the centre; it has been found recently
in distant parts of the Universe, by astronomical observation, and suggests that Life may
exist elsewhere. Methane, on the other hand, is a stable gas that can be stored in cylinders
and is much used in stoves for cooking; its molecules have four Hydrogens at the corners
of a regular tetrahedron, attached to a Carbon at the middle. These shapes are indicated
in Figure 7.6 below.
Figure 7.6 Shapes of the Methyl radical and the Methane molecule
In the Figure the large black dots indicate Carbon atoms, while the smaller ones show
the attached Hydrogens. In the Methyl radical (left) the Hydrogens are at the corners
of a flat equilateral triangle. In Methane (right) they are at the corners of a regular
tetrahedron, whose edges are shown by the solid lines. The tetrahedron fits nicely
inside a cube, which conveniently tells you the coordinates of the four Hydrogens: using
$\mathbf{e}_x$, $\mathbf{e}_y$, $\mathbf{e}_z$ to denote unit vectors parallel to the cube edges, with Carbon as the origin, unit
steps along each in turn will take you to H4 (top corner facing you) so its coordinates will
be $(1, 1, 1)$. Similarly, if you reverse the directions of two of the steps (along $\mathbf{e}_x$ and $\mathbf{e}_y$,
say) you'll arrive at H3, the back corner on the top face, with coordinates $(-1, -1, 1)$.
And if you reverse the steps along $\mathbf{e}_y$ and $\mathbf{e}_z$ you'll get to H1 (left corner of bottom face),
while reversing those along $\mathbf{e}_x$ and $\mathbf{e}_z$ will get you to H2 (right corner of bottom face).
That's all a bit hard to imagine, but it helps if you make a better drawing, with $\mathbf{e}_z$ as
the positive z axis coming out at the centre of the top face, $\mathbf{e}_x$ as the x axis coming out
at the centre of the left-hand face, and $\mathbf{e}_y$ as the y axis coming out at the centre of the
right-hand face. Keep in mind the definition of a right-handed system; rotating the x axis
towards the y axis would move a corkscrew along the z axis.
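In fact, however, it's easiest to remember the coordinates of the atoms themselves: they
will be H4$(+1, +1, +1)$, top corner facing you; H3$(-1, -1, +1)$, top corner behind it;
H2$(+1, -1, -1)$, bottom corner right; H1$(-1, +1, -1)$, bottom corner left; and that means
their position vectors are, respectively,
$$\mathbf{h}_4 = \mathbf{e}_x + \mathbf{e}_y + \mathbf{e}_z,$$
$$\mathbf{h}_3 = -\mathbf{e}_x - \mathbf{e}_y + \mathbf{e}_z,$$
$$\mathbf{h}_2 = \mathbf{e}_x - \mathbf{e}_y - \mathbf{e}_z,$$
$$\mathbf{h}_1 = -\mathbf{e}_x + \mathbf{e}_y - \mathbf{e}_z.$$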
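A quick numerical check confirms that these four vectors really do point to the corners of a regular tetrahedron. The sketch below (with the Carbon at the origin and the helper name `angle_deg` chosen for illustration) computes the angle between every pair of position vectors; each dot product is $-1$ and each squared length is 3, so every angle has cosine $-1/3$.

```python
import itertools
import math
import numpy as np

# Hydrogen coordinates as listed in the text, Carbon at the origin
H = {
    "H1": (-1, +1, -1),
    "H2": (+1, -1, -1),
    "H3": (-1, -1, +1),
    "H4": (+1, +1, +1),
}
vectors = {name: np.array(xyz, dtype=float) for name, xyz in H.items()}

def angle_deg(u, v):
    """Angle between two vectors, in degrees."""
    cos = float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
    return math.degrees(math.acos(cos))

angles = {
    (a, b): angle_deg(vectors[a], vectors[b])
    for a, b in itertools.combinations(vectors, 2)
}
for pair, angle in angles.items():
    print(pair, f"{angle:.2f} degrees")

centroid = sum(vectors.values()) / len(vectors)   # should coincide with the Carbon
print("centroid of the Hydrogens:", centroid)
```

All six H-C-H angles come out equal, at about 109.47 degrees, and the four vectors sum to zero.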
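x axis horizontal
and y axis vertical) the Hydrogens have coordinates H1$(1, 0)$,
H2$(-\frac12, \frac12\sqrt3)$, H3$(-\frac12, -\frac12\sqrt3)$. Their position vectors are thus (given that $\sqrt3 = 1.73205$)
$$\mathbf{h}_1 = \mathbf{e}_x,$$
$$\mathbf{h}_2 = -0.5\,\mathbf{e}_x + 0.8660\,\mathbf{e}_y,$$
$$\mathbf{h}_3 = -0.5\,\mathbf{e}_x - 0.8660\,\mathbf{e}_y.$$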
For a 3-dimensional array (e.g. Methane) the same procedure will give
where l, m, n are the direction cosines of the vector pointing to any attached atom.
Now that we know how to make hybrid orbitals that point in any direction we only need
to normalize them. That's easy because the s- and p-type orbitals are supposed to be normalized and
orthogonal ($\langle s|s\rangle = \langle p_x|p_x\rangle = 1$, $\langle s|p_x\rangle = 0$), so for a hybrid of the form $h_1 = N(s + \lambda p_1)$ the squared length (in function space!) is
$\langle h_1|h_1\rangle = N^2(1 + \lambda^2)$. And it follows that $N^2 = 1/(1 + \lambda^2)$.
The amount of s character in a hybrid will be the square of its coefficient, namely $1/(1 + \lambda^2)$,
while the amount of p character will be $\lambda^2/(1 + \lambda^2)$; and these fractions will be the same
for every hybrid of an equivalent set. The total s content will be related to the number
of hybrids in the set: if there are only two, as in Acetylene, the single s orbital must be
equally shared by the two hybrids, giving $2/(1 + \lambda^2) = 1$ and so $\lambda^2 = 1$. With $p_1$ directed
along the positive z axis and $p_2$ along the negative, the two normalized hybrids are thus
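$$h_1 = \tfrac{1}{\sqrt2}(s + p_1), \qquad h_2 = \tfrac{1}{\sqrt2}(s + p_2).$$
With three equivalent hybrids (trigonal hybridization) the same argument gives $3/(1 + \lambda^2) = 1$, so $\lambda^2 = 2$ and
$$h_1 = \tfrac{1}{\sqrt3}(s + \sqrt2\,p_1), \qquad h_2 = \tfrac{1}{\sqrt3}(s + \sqrt2\,p_2), \qquad h_3 = \tfrac{1}{\sqrt3}(s + \sqrt2\,p_3).$$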
Finally, with four equivalent hybrids (the case of tetrahedral hybridization), we get
in the same way (check it!)
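The suggested check can also be done numerically. The sketch below is a minimal illustration, assuming each hybrid has the form $N(s + \lambda p_i)$ with $p_i$ the p-type orbital pointing along the $i$th bond direction and the AOs $(s, p_x, p_y, p_z)$ orthonormal; sharing one s orbital equally among $n$ hybrids then requires $n/(1 + \lambda^2) = 1$, so $\lambda^2 = 3$ in the tetrahedral case. The function name and the chosen directions are illustrative.

```python
import math
import numpy as np

def equivalent_hybrids(directions):
    """Rows are hybrids, columns are coefficients of (s, px, py, pz).
    Each hybrid is N (s + lam * p_i), with p_i along a unit bond direction."""
    dirs = np.array(directions, dtype=float)
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)   # direction cosines l, m, n
    n = len(dirs)
    lam = math.sqrt(n - 1)               # from n / (1 + lam^2) = 1
    norm = 1.0 / math.sqrt(1.0 + lam**2)
    return norm * np.hstack([np.ones((n, 1)), lam * dirs])

r3 = math.sqrt(3.0)
digonal = equivalent_hybrids([(0, 0, 1), (0, 0, -1)])
trigonal = equivalent_hybrids([(1, 0, 0), (-0.5, r3 / 2, 0), (-0.5, -r3 / 2, 0)])
tetrahedral = equivalent_hybrids([(1, 1, 1), (-1, -1, 1), (1, -1, -1), (-1, 1, -1)])

for name, hybrids in [("digonal", digonal), ("trigonal", trigonal),
                      ("tetrahedral", tetrahedral)]:
    overlap = hybrids @ hybrids.T        # <h_i|h_j> in an orthonormal AO basis
    print(name, "orthonormal:", np.allclose(overlap, np.eye(len(hybrids))))

print("first tetrahedral hybrid (s, px, py, pz):", tetrahedral[0])
```

Each set comes out orthonormal, and the first tetrahedral hybrid has all four coefficients equal to 1/2, i.e. $\frac12(s + \sqrt3\,p_1)$ with $p_1$ pointing along the cube diagonal.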
or where some may even be missing). The following are typical examples, all making
use of roughly tetrahedral hybrids: CH$_4$, NH$_3$, H$_2$O, NH$_4^+$. Figure 7.7 gives a rough
schematic picture of the electronic structure and shape of each of these systems.
the molecule. But we started from a much more complete picture in the general theory of
Chapter 4, where every orbital was constructed, in principle, from a set of AOs centred on
all the nuclei in the molecule. In that case the MOs of an IPM approximation of LCAO
form would extend over the whole system: they would come out of the SCF calculation
as completely nonlocalized MOs. We must try to resolve this conflict.
That's a good question, because Chapter 4 (on the Hartree-Fock method) made it seem
that a full quantum mechanical calculation of molecular electronic structure would be
almost impossible to do, even with the help of big modern computers. And yet, starting
from a 2-electron system and using very primitive ideas and approximations, we've been
able to get a general picture of the charge distribution in a many-electron molecule and
of how it holds the component atoms together.
So let's end this section by showing how 'simple' MO theory, based on localized orbitals,
can come out from the quantum mechanics of many-electron systems. We'll start from
the Hartree-Fock equation (4.12) which determines the best possible MOs of LCAO
form, remembering that this arises in IPM approximation from a single antisymmetrized
product of spin-orbitals:
and that for a closed-shell ground state the spin dependence can be removed by integration
to give the ordinary electron density
$$P(r_1) = 2[\psi_1(r_1)\psi_1^*(r_1) + \psi_2(r_1)\psi_2^*(r_1) + \dots + \psi_5(r_1)\psi_5^*(r_1)],$$
a sum of orbital densities, times 2 as up-spin and down-spin functions give the same
contribution.
The spinless density matrix (see Chapter 5; and (5.33) for a summary) is very similar:
$$P(r_1; r_1') = 2[\psi_1(r_1)\psi_1^*(r_1') + \psi_2(r_1)\psi_2^*(r_1') + \dots + \psi_{N/2}(r_1)\psi_{N/2}^*(r_1')]$$
and gives the ordinary electron density on identifying the two variables, $P(r_1) = P(r_1; r_1)$.
These density functions allow us to define the effective Hamiltonian F used in Hartree-Fock
theory and also give us, in principle, all we need to know about chemical bonding
and a wide range of molecular properties.
The question is now whether, by setting up new mixtures of the spatial orbitals, we can
obtain alternative forms of the same densities, without disturbing their basic property of
determining the best possible one-determinant wave function. To answer the question,
we collect the equations in (4.12), for all the orbitals of a closed-shell system, into a single
matrix equation
$$F\psi = \psi\epsilon,$$
where the orbitals are contained in the row matrix
$$\psi = (\psi_1\ \ \psi_2\ \dots\ \psi_{N/2})$$
and $\epsilon$ is a square matrix with the orbital energies $\epsilon_1, \epsilon_2, \dots, \epsilon_{N/2}$ as its diagonal elements,
all others being zeros. (Check this out for a simple example with 3 orbitals!)
Now let's set up new linear combinations of the orbitals $\psi_1, \psi_2, \dots, \psi_{N/2}$, and collect them
in the row matrix
$$\bar\psi = (\bar\psi_1\ \ \bar\psi_2\ \dots\ \bar\psi_{N/2}) = \psi U, \qquad (7.11)$$
where the square matrix U has elements $U_{rs}$ which are the mixing coefficients giving
$\bar\psi_s = \sum_r \psi_r U_{rs}$. The new density matrix can be expressed as the row-column matrix product
$$\bar P(r_1; r_1') = 2\,\bar\psi(r_1)\,\bar\psi^\dagger(r_1')$$
and is then related to that before the transformation, using (7.11), by
$$\bar P(r_1; r_1') = 2\,\psi(r_1)U\,(\psi(r_1')U)^\dagger = 2\,\psi(r_1)\,UU^\dagger\,\psi^\dagger(r_1') = P(r_1; r_1') \qquad (\text{provided } UU^\dagger = 1).$$
Here we've noted that $(AB)^\dagger = B^\dagger A^\dagger$, and the condition on the last line means that U
is a unitary matrix.
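The invariance of the density matrix under a unitary mixing of the occupied orbitals can be seen in a small numerical experiment. The sketch below is only an illustration: the two "orbitals" are made-up real numbers sampled at four points, the mixing matrix is a plain 2×2 rotation (a real unitary matrix), and the helper names `density_matrix` and `mix` are assumptions introduced here.

```python
import math

def density_matrix(orbitals):
    """P[i][j] = 2 * sum_k psi_k(r_i) * psi_k(r_j) for real orbitals on a grid of points."""
    n_points = len(orbitals[0])
    return [[2.0 * sum(psi[i] * psi[j] for psi in orbitals)
             for j in range(n_points)] for i in range(n_points)]

def mix(orbitals, U):
    """New orbitals psi_bar_s = sum_r psi_r * U[r][s] (the row matrix psi times U)."""
    n_orb = len(orbitals)
    n_points = len(orbitals[0])
    return [[sum(orbitals[r][i] * U[r][s] for r in range(n_orb))
             for i in range(n_points)] for s in range(n_orb)]

# Two made-up "orbitals" sampled at four points (illustrative numbers only).
psi = [[0.1, 0.5, 0.7, 0.2],
       [0.6, -0.3, 0.1, -0.4]]

theta = 0.37  # any angle gives a real unitary (orthogonal) 2x2 matrix
U = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

P_old = density_matrix(psi)
P_new = density_matrix(mix(psi, U))
max_diff = max(abs(a - b)
               for row_a, row_b in zip(P_old, P_new)
               for a, b in zip(row_a, row_b))
print("largest change in any element of P:", max_diff)
```

The difference should vanish to rounding error for any rotation angle; replacing U by a matrix that is not unitary changes P.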
That was quite a lot of heavy mathematics, but if you found it tough go to a real application in the next Example, where we relate the descriptions of the water molecule based
on localized and non-localized MOs. You should find it much easier.
$\psi_2 = a h_2 + b H_2$.
Here the bonds are equivalent, so the mixing coefficients a, b must be the same for both of them. The
remaining 4 of the 8 valence electrons represent two lone pairs and were assigned to the next two hybrids
$h_3$ and $h_4$, which we may now denote by
$\psi_3 = h_3$ and $\psi_4 = h_4$.
What about the non-localized MOs? These will be put in the row matrix $\bar\psi = (\bar\psi_1\ \bar\psi_2\ \bar\psi_3\ \bar\psi_4)$ and should
serve as approximations to the MOs that come from a full valence-electron SCF calculation. There are
only four occupied SCF orbitals, holding the 8 valence electrons, and for a symmetrical system like H2O
they have simple symmetry properties. The simplest would be symmetric under the rotation $C_2$, through
180° around the z-axis, and also under the reflections $\sigma_1$, $\sigma_2$ across the xz- and yz-planes. How can we express
such orbitals in terms of the localized set $\psi$? Clearly $\psi_1$ and $\psi_2$ are both symmetric under the reflection $\sigma_1$
across the plane of the molecule, but they change places under the rotation $C_2$ and also under $\sigma_2$, which
interchanges the H atoms. For such operations they are neither symmetric nor anti-symmetric; and the
same is true of $\psi_3$ and $\psi_4$. However, the combination $\psi_1 + \psi_2$ clearly will be fully symmetric. Reflection
sends the positive combination into itself, so $\psi_1 + \psi_2$ is symmetric, but $\psi_1 - \psi_2$ becomes $\psi_2 - \psi_1$ and is
therefore anti-symmetric under $C_2$ and $\sigma_2$. Moreover, the symmetric and anti-symmetric combinations
are both delocalized over both bonds and can be used as delocalized MOs. With the normalizing factor
$1/\sqrt{2}$, and the same treatment of the lone pairs, the transformation matrix is
$$U = \begin{pmatrix} x & x & 0 & 0 \\ x & \bar{x} & 0 & 0 \\ 0 & 0 & x & x \\ 0 & 0 & x & \bar{x} \end{pmatrix}$$
($x$ and $\bar{x}$ standing for $\tfrac{1}{2}\sqrt{2}$ and $-\tfrac{1}{2}\sqrt{2}$.) It is easy to confirm that this matrix is unitary. Each column
contains the coefficients of a nonlocalized MO in terms of the four localized MOs; so the first expresses $\bar\psi_1$
as the combination found above, namely $(1/\sqrt{2})(\psi_1 + \psi_2)$, while the fourth gives $\bar\psi_4 = (1/\sqrt{2})(\psi_3 - \psi_4)$.
In each case the sum of the coefficients squared is unity (normalization); and for any two different columns the sum of
corresponding products is zero (orthogonality).
are delocalized combinations of localized bond orbitals, behaving correctly under symmetry operations on the molecule and giving exactly the same description of the electron
distribution. The same is true of the lone pair orbitals: they may be taken in localized
form as $\psi_3$ and $\psi_4$, which are clearly localized on different sides of a symmetry plane, or
they may be combined into the delocalized mixtures $\bar\psi_3$ and $\bar\psi_4$ by using the matrix
$$U = \begin{pmatrix} x & x & 0 & 0 \\ x & \bar{x} & 0 & 0 \\ 0 & 0 & x & x \\ 0 & 0 & x & \bar{x} \end{pmatrix},$$
where $x$ and $\bar{x}$ stand for the numerical coefficients $\tfrac{1}{2}\sqrt{2}$ and $-\tfrac{1}{2}\sqrt{2}$. Thus, for example, the
localized lone pairs are $\psi_3 = h_3$ and $\psi_4 = h_4$, and their contribution to the total electron
density P is $2|h_3|^2 + 2|h_4|^2$ (two electrons in each orbital).
After transformation to the delocalized combinations, given in (7.14), the density contribution
of the lone pairs is expressed as follows. (Note that the square modulus $|\dots|^2$ is used as the
electron density P is a real quantity, while the functions may be complex.)
$$2|\bar\psi_3|^2 + 2|\bar\psi_4|^2 = |(h_3 + h_4)|^2 + |(h_3 - h_4)|^2$$
$$= (|h_3|^2 + |h_4|^2 + 2|h_3 h_4|) + (|h_3|^2 + |h_4|^2 - 2|h_3 h_4|)$$
$$= 2|h_3|^2 + 2|h_4|^2,$$
exactly as it was before the change to non-localized MOs.
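Both claims made for the water molecule (that U is unitary, and that the lone-pair density is unchanged) can be checked in a few lines. This is a sketch under stated assumptions: the numbers given to `h3` and `h4` are arbitrary values of the two real hybrids at a single point in space, chosen only for illustration, and `is_unitary` is a helper defined here for a real matrix.

```python
import math

x = 0.5 * math.sqrt(2.0)   # the coefficient called x in the text; x-bar is -x
U = [[x,  x, 0.0, 0.0],
     [x, -x, 0.0, 0.0],
     [0.0, 0.0, x,  x],
     [0.0, 0.0, x, -x]]

def is_unitary(M, tol=1e-12):
    """For a real matrix: every column has unit length and different columns are orthogonal."""
    n = len(M)
    for a in range(n):
        for b in range(n):
            dot = sum(M[r][a] * M[r][b] for r in range(n))
            expected = 1.0 if a == b else 0.0
            if abs(dot - expected) > tol:
                return False
    return True

# Arbitrary (hypothetical) values of the hybrids h3 and h4 at one point in space.
h3, h4 = 0.8, -0.3

# Columns 3 and 4 of U give the delocalized lone-pair combinations.
psi3_bar = x * (h3 + h4)
psi4_bar = x * (h3 - h4)

localized = 2 * h3 ** 2 + 2 * h4 ** 2
delocalized = 2 * psi3_bar ** 2 + 2 * psi4_bar ** 2
print("U unitary:", is_unitary(U))
print("lone-pair density, localized vs delocalized:", localized, delocalized)
```

Because the cross terms enter with opposite signs in the two delocalized orbitals, the two density values agree whatever numbers are chosen for `h3` and `h4`.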
You can write these results in terms of the usual s, $p_x$, $p_y$, $p_z$ AOs (you should try it!),
$$\bar\psi_3 = \sqrt{2}(s + p_z), \qquad \bar\psi_4 = \sqrt{2}(p_x + p_y).$$
Evidently, $|\bar\psi_3|^2$ describes a lone-pair density lying along the symmetry axis of the molecule
(sticking out above the Oxygen) while $|\bar\psi_4|^2$ lies in the plane of the molecule and describes
a halo of negative charge around the O atom.
The water molecule provided a very simple example, but (7.14) and all that follows
from it are quite general. Usually the transformation is used to pass from SCF MOs,
obtained by solving the Hartree-Fock equations, to localized MOs, which give a much
clearer picture of molecular electronic structure. In that case (7.11) must be used, with
some suitable prescription to define the matrix U that will give maximum localization
of the transformed orbitals. Many such prescriptions exist and may be applied even when
there is no symmetry to guide us, as was the case in Example 7.5: they provide a useful
link between Quantum Mechanics and Chemistry.
At IPM level, we've already explored the use of Molecular Orbital (MO) theory in
trying to understand the electronic structures of some simple molecules formed from
atoms of the First Row of the Periodic Table, which starts with Lithium (3 electrons) and
ends with Neon (10 electrons).
Going along the Row, from left to right, and filling the available AOs (with up to two
electrons in each) we obtain a complete shell. We made a lot of progress for diatomic
molecules (homonuclear when both atoms are the same, heteronuclear when they are
different) and even for a few bigger molecules, containing 3,4, or more atoms. After finding
the forms of rough approximations to the first few MOs we were able to make pictures
of the molecular electronic structures formed by adding electrons, up to two at a time,
to the empty MOs. And, remember, these should really be solutions of the Schrödinger
equation for one electron in the field provided by the nuclei and all other electrons: they
are not buckets for holding electrons! They are mathematical functions with sizes and
shapes, like the AOs used in Section ... to describe the regions of space in which the
electron is most likely to be found.
In the approach used so far, the MOs that describe the possible stationary states of an
electron were approximated as Linear Combinations of Atomic Orbitals on the
separate atoms of the molecule (the LCAO approximation). On writing
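$$\psi = c_1\phi_1 + c_2\phi_2 + \dots + c_n\phi_n = \sum_i c_i\phi_i,$$
the best approximations we can get are determined by solving a set of secular equations.
In the simple case n = 3 these have the form (see equation (4.15) of Section 4)
$$\begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \epsilon \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix}.$$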
In Section 4 we were thinking about a much more refined many-electron approach, with
as many basis functions as we wished, and an effective Hamiltonian F in place of the
bare nuclear operator h. The Fock operator F includes terms which represent interaction
with all the other electrons, but here we use a strictly 1-electron model which contains
only interaction with the nuclei. The matrix elements $h_{ij}$ are then usually treated as
disposable parameters, whose values are chosen by fitting the results to get agreement
with any available experimental data. And the overlap integrals $S_{ij} = \langle\phi_i|\phi_j\rangle$ are often
neglected for $i \neq j$. This is the basis of semi-empirical MO theory, which we explore
further in this section.
Let's start by looking at some simple hydrocarbons, molecules that contain only Carbon
and Hydrogen atoms, beginning with Acetylene (C2H2), the linear molecule H-C≡C-H
studied in Example 7.3, where the simplest picture of the electronic structure was
found to be
That means, you'll remember, that two electrons occupy the MO $\psi_{CH}$ localized around
the left-hand CH bond; another two occupy $\psi_{CCz}$, a $\sigma$-type MO localized around the CC
(z) axis; two more occupy a $\pi$-type MO $\psi_{CCx}$ formed from $2p_x$ AOs; two more occupy
a similar MO $\psi_{CCy}$ formed from $2p_y$ AOs; and finally there are two electrons in the
right-hand CH bond. That accounts for all 10 valence electrons (2 from the Hydrogens
and 2×4 from the Carbons)! And in this case the localized MOs are constructed from the
same number of AOs.
Now suppose the MOs came out from an approximate SCF calculation as general linear
combinations of the 10 AOs, obtained by solving secular equations like those for n = 3 above
but with 10 rows and columns. What form would the equations have? The matrix elements $h_{ij}$ for pairs
of AOs $\phi_i$, $\phi_j$ would take values $h_{ii} = \alpha_i$, say, along the diagonal (j = i); and this would
represent the expectation value of the effective Hamiltonian h for an electron sitting in $\phi_i$.
(This used to be called a Coulomb integral, arising from the electrostatic interaction
with all the nuclei.) The off-diagonal elements $h_{ij}$ ($j \neq i$) would arise jointly from the
way $\phi_i$ and $\phi_j$ overlap (not the overlap integral, which we have supposed negligible).
Such an element is usually denoted by $\beta_{ij}$ and is often referred to as a resonance integral because it
determines how easily the electron can resonate between one AO and the other. In
semi-empirical work the $\alpha$s and $\beta$s are looked at as the disposable parameters referred
to above.
In dealing with hydrocarbons the $\alpha$s may be given a common value $\alpha_C$ for a Carbon
valence AO and $\alpha_H$ for a Hydrogen AO. The $\beta$s are given values which are large for AOs with
a heavy overlap (e.g. hybrids pointing directly towards each other), but are otherwise
neglected (i.e. given the value zero). This is the nearest-neighbour approximation.
To see how it works out let's take again the case of Acetylene.
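Example 7.6 Acetylene with 10 AOs
Choose the AOs as the hybrids used in Example 7.3. Those with $\sigma$ symmetry around the (z) axis of the
molecule are:
$\phi_1$ = left-hand Hydrogen 1s AO
$\phi_2$ = first-Carbon hybrid pointing towards Hydrogen ($\phi_1$)
$\phi_3$ = first-Carbon hybrid pointing towards second Carbon ($\phi_4$)
$\phi_4$ = second-Carbon hybrid pointing towards first Carbon ($\phi_3$)
$\phi_5$ = second-Carbon hybrid pointing towards Hydrogen ($\phi_6$)
$\phi_6$ = right-hand Hydrogen 1s AO
The other Carbon hybrids are of x-type, formed by including a $2p_x$ component, and y-type, formed by
including a $2p_y$ component. They are
$\phi_7$ = x-type hybrid on first Carbon, pointing towards second
$\phi_8$ = x-type hybrid on second Carbon, pointing towards first
$\phi_9$ = y-type hybrid on first Carbon, pointing towards second
$\phi_{10}$ = y-type hybrid on second Carbon, pointing towards first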
To determine the form of the secular equations we have to decide which AOs are nearest
neighbours, so let's make a very simple diagram in which the AOs $\phi_1, \dots, \phi_{10}$ are indicated
by short arrows showing the way they point (usually being hybrids). As the molecule is
linear, the arrows will be arranged on a straight line as in Figure 7.8 below:
[Figure 7.8: the AOs of Acetylene shown as short arrows along the molecular axis]
The off-diagonal elements $h_{ij} = \langle\phi_i|h|\phi_j\rangle$ will be neglected, in nearest-neighbour
approximation, except for i-j pairs that point towards each other. The pairs (1,2) and (5,6)
may be given a common value denoted by $\beta_{CH}$, while (3,4), (7,8), (9,10) may be given a
common value $\beta_{CC}$. For short, we'll use just $\beta$ for the C-C resonance integral and $\beta'$ for
the one that links C to H.
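Since i and j are row- and column-labels of elements in the matrix h, it follows that the
approximate form of h for the Acetylene molecule is block-diagonal: the only non-zero
elements lie in five 2×2 blocks along the diagonal, one for each of the pairs (1,2), (3,4),
(5,6), (7,8), (9,10), and every other element is zero. Writing $B_{ij}$ for the block belonging
to the pair (i,j),
$$h = \begin{pmatrix} B_{12} & & & & \\ & B_{34} & & & \\ & & B_{56} & & \\ & & & B_{78} & \\ & & & & B_{9\,10} \end{pmatrix}, \qquad (7.18)$$
with
$$B_{12} = \begin{pmatrix} \alpha_H & \beta' \\ \beta' & \alpha_C \end{pmatrix}, \qquad B_{56} = \begin{pmatrix} \alpha_C & \beta' \\ \beta' & \alpha_H \end{pmatrix}, \qquad B_{34} = B_{78} = B_{9\,10} = \begin{pmatrix} \alpha_C & \beta \\ \beta & \alpha_C \end{pmatrix}.$$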
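The secular equations contained in the matrix equation $hc = \epsilon c$ then break up into pairs,
corresponding to the 2×2 blocks along the diagonal of (7.18). The first pair, for example,
could be written (with $\alpha_H$, $\alpha_C$ for the Hydrogen and Carbon $\alpha$ values and $\beta'$ for the C-H
resonance integral)
$$(\alpha_H - \epsilon)c_1 + \beta' c_2 = 0, \qquad \beta' c_1 + (\alpha_C - \epsilon)c_2 = 0$$
and the solution is easy (you've done it many times before!): by cross-multiplying you
eliminate the coefficients and get
$$(\alpha_H - \epsilon)(\alpha_C - \epsilon) - (\beta')^2 = 0,$$
which is a simple quadratic equation to determine the two values of $\epsilon$ for which the two
equations can be solved (are compatible). These values, the roots, are (look back at
Section 5.3 of Book 1 if you need to!)
$$\epsilon = \tfrac{1}{2}(\alpha_H + \alpha_C) \pm \tfrac{1}{2}\sqrt{(\alpha_H - \alpha_C)^2 + 4(\beta')^2}.$$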
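The roots of a 2×2 secular problem are simple enough to code directly. The sketch below assumes a block with diagonal elements `alpha_1`, `alpha_2` and off-diagonal element `beta`; the numerical values are invented for illustration and are not fitted parameters.

```python
import math

def two_by_two_levels(alpha_1, alpha_2, beta):
    """Roots of (alpha_1 - e)(alpha_2 - e) - beta**2 = 0, returned as (lower, upper)."""
    mean = 0.5 * (alpha_1 + alpha_2)
    half_gap = 0.5 * math.sqrt((alpha_1 - alpha_2) ** 2 + 4.0 * beta ** 2)
    return mean - half_gap, mean + half_gap

# Invented parameter values (arbitrary energy units), all negative as in the text.
alpha_H, alpha_C, beta_CH = -13.0, -11.0, -3.0

lower, upper = two_by_two_levels(alpha_H, alpha_C, beta_CH)
for e in (lower, upper):
    # Each root must make the 2x2 secular determinant vanish.
    residual = (alpha_H - e) * (alpha_C - e) - beta_CH ** 2
    print(f"level {e:.4f}, determinant {residual:.2e}")

# With equal alphas the roots reduce to alpha + beta and alpha - beta.
print(two_by_two_levels(-10.0, -10.0, -2.0))
```

When the two diagonal elements are equal the square root collapses to 2|β| and the levels are α ± β, the case of a bond between identical AOs.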
$$\text{Bonding:}\ \ \psi_1 = (\phi_1 + \phi_2)/\sqrt{2}, \qquad \text{Antibonding:}\ \ \psi_2 = (\phi_1 - \phi_2)/\sqrt{2},$$
just as for a homonuclear diatomic molecule (Section 7.2). In other words the IPM
picture, with nearest-neighbour approximations, views the molecule as a superposition
of independent 2-electron bonds, each one consisting of two electrons in a localized MO
extending over only two centres. In this extreme form of the IPM approximation, the
total electronic energy is represented as a sum
$$E \approx \sum_{(rs)} E_{(rs)},$$
where $E_{(rs)} = 2\epsilon_{(rs)}$ is the energy of two electrons in the bonding MO formed from
an overlapping pair of AOs $\phi_r$, $\phi_s$.
The total electronic energy of the Acetylene molecule would thus be
$$E_{total} \approx 2E_{CH} + 3E_{CC},$$
corresponding to 2 CH bonds and 3 CC bonds (taken as being equivalent).
From Acetylene to Ethane and Ethylene
In Acetylene the Carbons, which are each able to form bonds with up to four other atoms
(as in Methane, CH4, shown in Figure 7.7), are each bonded with only one other atom.
The CC triple bond seems a bit strange, with each Carbon using 3 of its 4 valencies to
connect it only with another Carbon! And the triple bond is apparently quite different
from that in the diatomic Nitrogen molecule N≡N, described in Section 7.2 as one
bond of type $\sigma$ with two somewhat weaker $\pi$-type bonds. Instead we've described it in
terms of hybrid AOs, one pair ($\phi_3$, $\phi_4$), pointing directly towards each other, and two
pairs pointing away from the molecular axis and able to form only bent bonds. In fact,
however, both descriptions are acceptable when we remember that the hybrid AOs are
simply linear combinations of 2s and 2p AOs and the three pairs of localized MOs formed
by overlapping the hybrids are just alternative combinations. In Section 7.4 we found
how two alternative sets of orbitals could lead to exactly the same total electron density,
provided the mixing coefficients were chosen correctly so as to preserve normalization and
orthogonality conditions. So don't worry! You can use either type of MOs and get more
or less the same overall description of the electronic structure. Any small differences will
arise from using one set of standard s-p hybrids in a whole series of slightly different
molecular situations (as in Figure 7.7).
Ethane and Ethylene illustrate two main categories of carbon compounds: in Ethane
the Carbon forms four $\sigma$-type bonds with other atoms (we say all carbon valencies are
saturated), leading to saturated molecules; but in Ethylene only three carbon valencies
are used in that way, the fourth being involved in somewhat weaker $\pi$-type bonding, and
Ethylene is described as an unsaturated molecule. Let's deal first with Ethane and
similar molecules.
The Ethane molecule
Ethane has the formula C2 H6 , and looks like two CH3 groups with a bond between the
two Carbons. Its geometry is indicated in Figure 7.9 below:
with 7-9 Carbons and kerosene with 10-16 Carbons) and finally solid paraffin. Obviously
they are very important commercially.
The Ethylene molecule
In this molecule, with the formula C2H4, each Carbon is connected to only two Hydrogens
and the geometry of the molecule is indicated in Figure 7.10. Here two CH2 groups lie in the
same plane (that of the paper) and are connected by a C-C sigma bond. Each Carbon
has three valencies engaged in sigma-bonding, the fourth involving the remaining 2p AO,
sticking up normal to the plane of the molecule and able to take part in $\pi$ bonding.
Figure 7.10 Geometry of Ethylene
Molecules with Carbons of this kind are said to be conjugated, and conjugated molecules
form an important part of carbon chemistry. The Ethylene molecule is flat, with Carbon
2p orbitals normal to the plane and overlapping to give a $\pi$ bond; there is thus a CC double
bond, which keeps the molecule flat, because twisting it around the C-C bond reduces
the lateral overlap of the 2p orbitals and hence the degree of $\pi$ bonding. Unsaturated
hydrocarbons like Ethylene are generally planar for the same reason. They can all be
built up by replacing one of the Hydrogens by a trivalent Carbon and saturating
two of the extra valencies by adding two more Hydrogens. From Ethylene we obtain in
this way C3H5 (the Allyl radical), which has 3 $\pi$ electrons and is a highly reactive free
radical.
The Butadiene molecule
On replacing the right-most Hydrogen of Allyl by another CH2 group, we obtain the
molecule pictured in Figure 7.11, which is called Butadiene:
[Figure 7.11: the Butadiene molecule]
that matters is the pattern of connections due to lateral overlap of 2p AOs on adjacent
Carbons. In the early applications of quantum mechanics to molecules this approximation
turned up an amazing number of chemically important results. So let's use Butadiene to
test our theoretical approach:
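Example 7.7 Butadiene: electronic structure of the $\pi$-electron system
In the present approximation we think only of the C-C-C-C chain and set up the secular equations
for a $\pi$ electron in the field of the $\sigma$-bonded framework. The effective Hamiltonian has diagonal matrix
elements $\alpha$ for every Carbon and off-diagonal elements $\beta$ for every pair of nearest neighbours, the rest
being neglected. The equations we need are therefore (check it out!)
$$\begin{pmatrix} \alpha-\epsilon & \beta & 0 & 0 \\ \beta & \alpha-\epsilon & \beta & 0 \\ 0 & \beta & \alpha-\epsilon & \beta \\ 0 & 0 & \beta & \alpha-\epsilon \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}.$$
There are solutions only for values of $\epsilon$ which make the determinant of the square matrix zero. How can
we find them?
The first line of the matrix equation above reads as
$$(\alpha - \epsilon)c_1 + \beta c_2 = 0$$
and it would look better if you could get rid of the $\alpha$ and $\beta$, which are just parameters that can take
any values you choose. So why not divide all terms by $\beta$, which doesn't change anything, and denote
$(\epsilon - \alpha)/\beta$ by x? The first equation then becomes $-xc_1 + c_2 = 0$ and the whole matrix equation becomes
$$\begin{pmatrix} -x & 1 & 0 & 0 \\ 1 & -x & 1 & 0 \\ 0 & 1 & -x & 1 \\ 0 & 0 & 1 & -x \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}. \qquad (7.22)$$
There are solutions only for values of x which make the determinant of the square matrix zero; and if
you know how to solve for x you can get all the energy levels for any values of the adjustable parameters
$\alpha$, $\beta$.
The condition determining the possible values of x is thus
$$\begin{vmatrix} -x & 1 & 0 & 0 \\ 1 & -x & 1 & 0 \\ 0 & 1 & -x & 1 \\ 0 & 0 & 1 & -x \end{vmatrix} = 0.$$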
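You may remember the rule for evaluating a determinant (it was given first just after
equation (6.10) in Book 2). Here we'll use it to evaluate the 4×4 determinant of the
square matrix on the left in (7.22). Working along the first row and denoting the value
of the 4×4 determinant by $\Delta_4(x)$ (it's a function of x) we get in the first step
$$\Delta_4(x) = \begin{vmatrix} -x & 1 & 0 & 0 \\ 1 & -x & 1 & 0 \\ 0 & 1 & -x & 1 \\ 0 & 0 & 1 & -x \end{vmatrix} = (-x)\begin{vmatrix} -x & 1 & 0 \\ 1 & -x & 1 \\ 0 & 1 & -x \end{vmatrix} - (1)\begin{vmatrix} 1 & 1 & 0 \\ 0 & -x & 1 \\ 0 & 1 & -x \end{vmatrix} + \text{etc.}$$
The next step is to use the same rule to evaluate each of the 3×3 determinants. You'll
need only the first two as the others are multiplied by zero. The first one is
$$\begin{vmatrix} -x & 1 & 0 \\ 1 & -x & 1 \\ 0 & 1 & -x \end{vmatrix} = (-x)\begin{vmatrix} -x & 1 \\ 1 & -x \end{vmatrix} - (1)\begin{vmatrix} 1 & 1 \\ 0 & -x \end{vmatrix} + (0)\begin{vmatrix} 1 & -x \\ 0 & 1 \end{vmatrix} = -x^3 + 2x.$$
The second one is
$$\begin{vmatrix} 1 & 1 & 0 \\ 0 & -x & 1 \\ 0 & 1 & -x \end{vmatrix} = (1)\begin{vmatrix} -x & 1 \\ 1 & -x \end{vmatrix} = x^2 - 1,$$
as follows from the rule you're using (check it) and it's therefore easy to work back
from this point and so evaluate the original 4×4 determinant $\Delta_4(x)$. The final result is
(check it!) $\Delta_4(x) = x^4 - 3x^2 + 1$ and depends only on the square of x. That shows at
once that the set of energy levels will be symmetrical around x = 0; and if we put $x^2 = y$
the consistency condition $\Delta_4(x) = 0$ becomes $y^2 - 3y + 1 = 0$. This simple
equation has roots $y = (3 \pm \sqrt{5})/2$, that is y = 2.618 or y = 0.382, which lead to
$x = \pm\sqrt{2.618} = \pm 1.618$, or $x = \pm\sqrt{0.382} = \pm 0.618$; and therefore to energy levels
$$\epsilon = \alpha + \beta x = \alpha \pm 1.618\beta, \ \text{or}\ \alpha \pm 0.618\beta.$$
Since $\alpha$ and $\beta$ are both negative quantities a level with the plus sign is below the datum
($\epsilon = \alpha$) and corresponds to a bonding state, while that with the negative sign lies symmetrically
above the datum and corresponds to an antibonding state.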
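The determinant expansion can be checked without any algebra by noting that expanding along the first row gives a simple recurrence between determinants of successive sizes. The sketch below assumes the tridiagonal form used in this section (−x on the diagonal, 1 on either side); the function name `chain_determinant` is an illustrative choice.

```python
import math

def chain_determinant(n, x):
    """Determinant of the n x n tridiagonal matrix with -x on the diagonal and 1 beside it.

    Expanding along the first row gives D_n = (-x) * D_{n-1} - D_{n-2},
    starting from D_0 = 1 and D_1 = -x.
    """
    d_prev, d_curr = 1.0, -x
    for _ in range(n - 1):
        d_prev, d_curr = d_curr, -x * d_curr - d_prev
    return d_curr

# Roots of y^2 - 3y + 1 = 0 with y = x^2, as in the text.
y_values = [(3.0 + math.sqrt(5.0)) / 2.0, (3.0 - math.sqrt(5.0)) / 2.0]
roots = sorted(sign * math.sqrt(y) for y in y_values for sign in (1.0, -1.0))

for x in roots:
    quartic = x ** 4 - 3.0 * x ** 2 + 1.0
    print(f"x = {x:+.4f}: recurrence gives {chain_determinant(4, x):+.1e}, "
          f"quartic gives {quartic:+.1e}")
```

The same function works for a chain of any length, which is handy for checking the six-Carbon case before attempting it by hand.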
The calculation above, for a chain of four Carbons, could be repeated for a chain of six
Carbons (called Hexatriene) but would involve dealing with a 6×6 determinant; and
with N Carbons we would have to deal with an N×N determinant: quite a lot of
work! Sometimes, however, it is simpler to solve the simultaneous equations directly: the
method is shown in Example 7.8 that follows.
Example 7.8 Butadiene: a simpler and more general method
Again we calculate the electronic structure of the $\pi$-electron system of Butadiene, but this time we work
directly from the secular equations, which follow from (7.22) as
$$\begin{aligned} (\alpha-\epsilon)c_1 + \beta c_2 + 0c_3 + 0c_4 &= 0, \\ \beta c_1 + (\alpha-\epsilon)c_2 + \beta c_3 + 0c_4 &= 0, \\ 0c_1 + \beta c_2 + (\alpha-\epsilon)c_3 + \beta c_4 &= 0, \\ 0c_1 + 0c_2 + \beta c_3 + (\alpha-\epsilon)c_4 &= 0, \end{aligned}$$
where the sum of terms on the left of the equality must vanish for every line. In short,
$$\begin{aligned} (\alpha-\epsilon)c_1 + \beta c_2 &= 0, \\ \beta c_1 + (\alpha-\epsilon)c_2 + \beta c_3 &= 0, \\ \beta c_2 + (\alpha-\epsilon)c_3 + \beta c_4 &= 0, \\ \beta c_3 + (\alpha-\epsilon)c_4 &= 0. \end{aligned}$$
Let's now write $c_m$ for the mth coefficient (in the order 1,2,3,4), divide each equation by $\beta$, and again
put $(\epsilon - \alpha)/\beta = x$. The whole set of equations can then be written as a single one:
$$c_{m-1} - xc_m + c_{m+1} = 0,$$
where m takes the values 1,2,3 and 4 in turn and we exclude any values such as 0 or 5 that lie outside
that range. These are boundary conditions which tell us there are no coefficients below m=1 or above
m=4. The number of atoms in the chain (call it N) is not important as long as we insist that $c_0$ and
$c_{N+1}$ should be zero. So now we can deal with polyene chains of any length!
To complete the calculation we can guess that the coefficients will follow the up-and-down pattern of
waves on a string, like the wave functions of an electron in a 1-dimensional box, behaving like $\sin m\theta$ or
$\cos m\theta$ or a combination of the two. It's convenient to use the complex forms $\exp(\pm im\theta)$, and on putting
$c_m = \exp(im\theta)$ in the key equation above we get the condition
$$\exp i(m-1)\theta - x\exp im\theta + \exp i(m+1)\theta = 0.$$
Taking out a common factor of $\exp im\theta$, this gives $e^{i\theta} + e^{-i\theta} - x = 0$, so the wavelength must be
related to the energy x by $x = 2\cos\theta$.
Since changing the sign of m gives a solution of the same energy, a more general solution will be
$$c_m = A\exp(im\theta) + B\exp(-im\theta),$$
where A and B are arbitrary constants, which must be chosen to satisfy the boundary conditions: the
first of these, $c_0 = A + B = 0$, gives
$$c_m = A[\exp(im\theta) - \exp(-im\theta)] = C\sin(m\theta) \qquad (C = 2iA).$$
From Example 7.8, the MOs $\psi_k$ for a polyene chain of N Carbons are $\psi_k = \sum_m c_m\phi_m$,
where the mth AO coefficient is (after normalizing: check it!)
$$c_m = C_k\sin\frac{\pi mk}{N+1} \qquad (C_k = \sqrt{2/(N+1)}).$$
The corresponding orbital energies are $\epsilon_k = \alpha + \beta x_k$, with
$$x_k = 2\cos\frac{\pi k}{N+1} \qquad (k = 1, 2, \dots, N).$$
The energy levels are thus symmetrically disposed around $\epsilon = \alpha$, which may be taken as
a reference level. As $N \to \infty$ the levels become closer and closer, eventually forming a
continuous band.
[Figure: energy-level diagram for N = 1, 2, 3, 4 and $N \to \infty$, with Bonding levels below and Anti-bonding levels above the reference level]
Figure 7.12 MO energy levels in a chain of N Carbon atoms
The reference level (N = 1) in Figure 7.12 has $\epsilon = \alpha$. As $N \to \infty$ the levels become very
close, forming an energy band extending from $\epsilon = \alpha + 2\beta$ up to $\alpha - 2\beta$. (Remember
$\alpha$ and $\beta$ are both negative.) It should be noted that when the number of Carbons is odd
there is always a Non-bonding MO: it is very important, giving the molecule its free-radical
character, the highest occupied orbital leading to a highly reactive system with
a very short lifetime.
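The closed formulas for the chain can be tested against the equation $c_{m-1} - xc_m + c_{m+1} = 0$ they are meant to solve. The sketch below is a check under the conventions of this section (x = 2cos θ with θ = πk/(N+1), and $c_0 = c_{N+1} = 0$); the helper names `chain_mo` and `residuals` are introduced here for illustration.

```python
import math

def chain_mo(n_atoms, k):
    """Return (x_k, coefficients) for the k-th MO of a chain of n_atoms Carbons."""
    theta = math.pi * k / (n_atoms + 1)
    x_k = 2.0 * math.cos(theta)
    norm = math.sqrt(2.0 / (n_atoms + 1))
    coeffs = [norm * math.sin(m * theta) for m in range(1, n_atoms + 1)]
    return x_k, coeffs

def residuals(x, coeffs):
    """Left-hand sides of c_{m-1} - x c_m + c_{m+1} = 0, with c_0 = c_{N+1} = 0."""
    padded = [0.0] + list(coeffs) + [0.0]
    return [padded[m - 1] - x * padded[m] + padded[m + 1]
            for m in range(1, len(coeffs) + 1)]

for n_atoms in (3, 4):
    for k in range(1, n_atoms + 1):
        x_k, coeffs = chain_mo(n_atoms, k)
        worst = max(abs(r) for r in residuals(x_k, coeffs))
        norm_sq = sum(c * c for c in coeffs)
        print(f"N = {n_atoms}, k = {k}: x = {x_k:+.4f}, "
              f"sum of squares = {norm_sq:.4f}, worst residual = {worst:.1e}")
```

For N = 4 the values of x reproduce ±1.618 and ±0.618, and for N = 3 (the Allyl chain) the middle level has x = 0, the non-bonding MO mentioned above.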
It is also interesting to ask what happens if you join the ends of a chain molecule to form
a ring: a cyclic molecule. In Example 7.9 we find the question is easily answered by
making a simple change of the boundary conditions used in Example 7.8.
Example 7.9 Making rings: cyclic polyenes
If we join the ends of a chain of N Carbons, keeping the system flat with adjacent atoms connected by
strong $\sigma$ bonds, we obtain a ring molecule in which every Carbon provides one $\pi$ electron in a 2p AO
normal to the plane. In nearest neighbour approximation, the equations to determine the MO coefficients
are unchanged except that the AOs $\phi_1$ and $\phi_N$ will become neighbours, so there will be a new non-zero
element in the first and last rows of the matrix h. For a 6-Carbon ring, called Benzene, $h_{16}$ and $h_{61}$ will
both have the value $\beta$ instead of zero. The secular equations will then become
$$\begin{aligned} (\alpha-\epsilon)c_1 + \beta c_2 + \beta c_6 &= 0, \\ \beta c_1 + (\alpha-\epsilon)c_2 + \beta c_3 &= 0, \\ \beta c_2 + (\alpha-\epsilon)c_3 + \beta c_4 &= 0, \\ \beta c_3 + (\alpha-\epsilon)c_4 + \beta c_5 &= 0, \\ \beta c_4 + (\alpha-\epsilon)c_5 + \beta c_6 &= 0, \\ \beta c_1 + \beta c_5 + (\alpha-\epsilon)c_6 &= 0, \end{aligned}$$
where the terms at the end of the first line and the beginning of the last line are new: they arise because
now the Carbon with AO coefficient $c_1$ is connected to that with AO coefficient $c_6$.
On putting $(\epsilon - \alpha)/\beta = x$, as before, and dividing throughout by $\beta$, the first and last of the secular
equations become, respectively,
$$-xc_1 + c_2 + c_6 = 0, \qquad c_1 + c_5 - xc_6 = 0.$$
To summarize what came out from Example 7.9, joining the ends of a long polyene chain to
form a ring leaves the formula for the energy levels, namely (7.24), more or less unchanged,
$$\epsilon_k = \alpha + 2\beta\cos\frac{2\pi k}{N},$$
with $2\pi k/N$ in place of $\pi k/(N+1)$, but gives a complex MO with AO coefficients
$$c_m = A_k\exp\frac{2\pi imk}{N} \qquad (A_k = 1/\sqrt{N}).$$
However, changing the sign of k in (7.24) makes no difference to the energy, so the solutions
in (7.26) can be combined in pairs to give real MOs with AO coefficients
$$a_m = C_k\sin\frac{2\pi mk}{N}, \qquad b_m = C_k\cos\frac{2\pi mk}{N},$$
where $C_k$ is again chosen to normalize the function. On putting N = 6, for example, the
three bonding $\pi$-type MOs for the Benzene molecule can be written as
$$\psi_1 = (1/\sqrt{6})(\phi_1 + \phi_2 + \phi_3 + \phi_4 + \phi_5 + \phi_6),$$
$$\psi_2 = (1/2\sqrt{3})(\phi_1 + 2\phi_2 + \phi_3 - \phi_4 - 2\phi_5 - \phi_6),$$
$$\psi_3 = (1/2)(\phi_1 - \phi_3 - \phi_4 + \phi_6).$$
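The three bonding MOs of Benzene can be checked for normalization, orthogonality and energy in a few lines. The sketch below assumes the ring equation $c_{m-1} - xc_m + c_{m+1} = 0$ with atom 6 joined to atom 1, so that $\epsilon = \alpha + \beta x$ with x = 2cos(2πk/6); the coefficient lists use the real forms in which $\psi_2$ and $\psi_3$ form a degenerate pair, and the helper names are illustrative.

```python
import math

# AO coefficients on atoms 1..6 for the three bonding pi MOs (real forms).
mos = {
    "psi1": [c / math.sqrt(6.0) for c in (1, 1, 1, 1, 1, 1)],
    "psi2": [c / (2.0 * math.sqrt(3.0)) for c in (1, 2, 1, -1, -2, -1)],
    "psi3": [c / 2.0 for c in (1, 0, -1, -1, 0, 1)],
}

def overlap(a, b):
    """Sum of products of AO coefficients (AO overlap neglected, as in the text)."""
    return sum(p * q for p, q in zip(a, b))

def ring_x(coeffs):
    """x = sum_m c_m (c_{m-1} + c_{m+1}) for a normalized MO; indices wrap round the ring."""
    n = len(coeffs)
    return sum(coeffs[m] * (coeffs[m - 1] + coeffs[(m + 1) % n]) for m in range(n))

for name, coeffs in mos.items():
    print(f"{name}: norm = {overlap(coeffs, coeffs):.4f}, x = {ring_x(coeffs):.4f}")

print("overlap psi2-psi3:", overlap(mos["psi2"], mos["psi3"]))
print("formula values 2cos(2*pi*k/6), k = 0, 1:",
      [round(2.0 * math.cos(2.0 * math.pi * k / 6.0), 4) for k in (0, 1)])
```

The lowest MO has x = 2 and the other two share x = 1, so with six electrons the total $\pi$ energy in this model is $6\alpha + 8\beta$.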
The molecule forms a sweet-smelling liquid of great importance in the chemical industry.
It is used in the manufacture of drugs, dyes, plastics and even explosives and is the first in
a whole family of molecules called polyacenes, formed by joining benzene rings together
with one side in common and the loss of corresponding H atoms. All such molecules have
numerous derivatives, formed by replacing one or more of the attached Hydrogens by
other chemical groups such as CH3 (methyl) or OH (the hydroxyl group).
The next two members of the polyacene family are Naphthalene and Anthracene, as shown
where s(r-s) under the summation sign means for atoms s connected with atom r,
and when x = 0 the sum of AO coefficients over all atoms s connected with r must
vanish. In the Allyl radical, for example, we could mark the end Carbons (1 and 3, say)
with a star and say that, as they are neighbours of Carbon 2, the NBMO must have
$c_1 + c_3 = 0$. The normalized NBMO would then be (with the usual neglect of overlap)
$$\psi_{NBMO} = (\phi_1 - \phi_3)/\sqrt{2}.$$
A more interesting example is the Benzyl radical where a seventh conjugated Carbon
is attached to a Benzene ring, the starred positions and corresponding AO coefficients
are as shown below
Figure 7.14 The Benzyl radical: starring of positions, and AO coefficients in the
To summarize: in the NBMO of an alternant hydrocarbon, the starring of alternate
centres divides the Carbons into two sets, starred and unstarred. On taking the AO
coefficients in the unstarred set to be zero, the sum of those on the starred atoms to
which any unstarred atom is connected must also be zero. Choosing the AO coefficients
in this way satisfies the condition (8.13) whenever x = 0 and this lets you write down
at least one NBMO just by inspection! The MO is normalized by making the sum of
the squared coefficients equal to 1 and this means that an electron in the NBMO of the
Benzyl radical will be found on the terminal Carbon with a probability of 4/7, compared
with only 1/7 on the other starred centres. The presence of this odd unpaired electron
accounts for many of the chemical and physical properties of alternant hydrocarbons with
one or more NBMOs. Such electrons easily take part in bonding with other atoms or
chemical groups with singly occupied orbitals; and they are also easily seen in electron
spin resonance (ESR) experiments, where the magnetic moment of the spin couples with
an applied magnetic field. The Benzyl radical, with its single loosely bound electron, is
also easily converted into a Benzyl radical anion by accepting another electron, or into a
cation by loss of the electron in the NBMO. The corresponding starred centres then get
a net negative or positive charge, which determines the course of further reactions.
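The starring rule for the Benzyl radical is easy to verify by machine. The sketch below makes its own assumptions about numbering, which are not taken from the figure: ring Carbons are labelled 1 to 6, the extra (terminal) Carbon is 7 and is attached to ring atom 1, and the starred atoms are 7, 2, 4 and 6 with relative NBMO coefficients 2, -1, 1, -1.

```python
import math

# Assumed numbering: atoms 1..6 form the ring, atom 7 is the terminal Carbon on atom 1.
neighbours = {1: [2, 6, 7], 2: [1, 3], 3: [2, 4], 4: [3, 5],
              5: [4, 6], 6: [5, 1], 7: [1]}

# Relative NBMO coefficients on the starred atoms; unstarred atoms get zero.
raw = {7: 2, 2: -1, 4: 1, 6: -1}

norm = math.sqrt(sum(value * value for value in raw.values()))
coeff = {atom: raw.get(atom, 0) / norm for atom in neighbours}

# With x = 0 the coefficients on the neighbours of every atom must sum to zero.
neighbour_sums = {atom: sum(coeff[s] for s in neighbours[atom]) for atom in neighbours}

# Probability of finding the NBMO electron on each Carbon.
probabilities = {atom: coeff[atom] ** 2 for atom in neighbours}

print("neighbour sums:", {a: round(s, 12) for a, s in neighbour_sums.items()})
print("probabilities:", {a: round(p, 4) for a, p in probabilities.items()})
```

The terminal Carbon comes out with probability 4/7 and each of the other starred centres with 1/7, as stated above.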
To show how easy it is to play with such simple methods you could try to find the NBMO
for the residual molecule which results when you take away one CH fragment from the
Naphthalene molecule shown in Figure 7.14. The system that remains when you choose
the top Carbon on the right is
where the starred positions have been chosen as shown. You should try to attach the
non-zero AO coefficients in the NBMO.
The NBMO is important in the discussion of chemical reactions. The taking away
of the CH group in this example actually happens (we think!) when an NO2+ group
comes close to the Carbon: it is thirsty for electrons and localizes two electrons in the
Carbon 2p AO, changing the hybridization so that they go into a tetrahedral hybrid and
leave only 8 electrons in the 9-centre conjugated system of the residual molecule. The
NO2 group then bonds to the Carbon in this activated complex, which carries a positive
charge (still lacking 1 electron of the original 10): an electron is then removed from the
attached Hydrogen, which finally departs as a bare proton! Just before that final step, the
energy ($E'$) of the residual molecule is higher than the energy (E) of the original molecule
and the difference $A = E' - E$ is called the Activation Energy of the nitration reaction.
Any change you can make in the original molecule (e.g. replacing another Carbon by a
Nitrogen) that lowers the Activation Energy will make the reaction go more easily; and
that's the kind of challenge you meet in Chemistry.
(Notice that we've been talking always about total $\pi$-electron energies, estimated as sums
of orbital energies, and we're supposing that there are no big changes in the energies of
the $\sigma$ bonds. These are approximations that seem to work! But without them there
would be little hope of applying quantum mechanics in such a difficult field.)
It's time to move on: this is not a Chemistry book! But before doing so let's remember
that nearly all the molecules we've been dealing with in this section have been built up
from only two kinds of atom: Hydrogen, with just one electron, and Carbon, with six.
And yet Carbon chemistry is so important in our daily life that we can't do without it:
hydrocarbons give us the fuels we need for driving all kinds of machines (in our factories)
and vehicles (from scooters to heavy transport); also for heating and cooking; and for
preparing countless other materials (from drugs to plastics and fabrics such as nylon).
Remember also that our own bodies are built up almost entirely from elements near the
beginning of the Periodic Table, Carbon and Hydrogen in long chain molecules, along with
small attached groups containing Nitrogen and Oxygen, and of course the Hydrogen and
Oxygen in the water molecules (which make up over 50% of body mass). When Calcium
and Phosphorus are added to the list (in much smaller quantities) these six elements
account for about 99% of body mass!
The main bridge between so much abstract theory and the things we can measure in the
laboratory, the observables, is provided by a number of electron density functions. In
Chapter 4 we introduced a density matrix, in the usual finite-basis representation, where
it was used to define the Coulomb and exchange operators of self-consistent field (SCF)
theory. But because we were dealing with closed-shell systems, where the occupied
orbitals occurred in pairs (one with spin factor α and a partner with spin factor β) we
were able to get rid of spin by integrating over spin variables. Then, in studying the
electronic structure and some of the properties of atoms (in Chapter 5), we took the idea
of density functions a bit further and began to see how useful they could be in dealing
with electronic properties. You should look again at the ideas developed in Examples 5.6
and 5.7 and summarized in the box (5.33). Finally, in Chapter 6, we were able to extend
the same ideas to molecules; so here you'll find nothing very new.
It will be enough to remind ourselves of the definitions and fill in some details. The spinless
electron density function, for a system with an N-electron wave function
$\Psi(\mathbf{x}_1, \mathbf{x}_2, ... \mathbf{x}_N)$, is
$$P(\mathbf{r}_1) = \int \rho(\mathbf{x}_1)\,ds_1,$$
where $\rho(\mathbf{x}_1)$ is the density with spin included, as defined in (5.24), and arises from the
product $\Psi\Psi^*$ on integrating over all variables except $\mathbf{x}_1$. Although the variable has been
called $\mathbf{x}_1$ that's only because we chose the first of the N variables to keep fixed in
integrating over all the others: the electrons are indistinguishable and we get the same
density function whatever choice we make, so from now on we'll often drop the subscript
in one-electron functions, using just $P(\mathbf{r})$ or $\rho(\mathbf{x})$. The function $P(\mathbf{r})$ is often called, for
short, the charge density since it gives us a clear picture of how the total electronic
charge is spread out in space.
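As a small numerical illustration (not taken from the book), the sketch below builds the spinless density for a toy closed-shell system: electrons in a one-dimensional box, two in each of the lowest orbitals. It assumes the IPM closed-shell form P(x) = 2 Σ |φ_n(x)|², and the function names closed_shell_density and integrate are invented here. Integrating P over the box should give back the number of electrons.

```python
import numpy as np

def closed_shell_density(n_occ, box_length=1.0, n_grid=2001):
    """Density P(x) for n_occ doubly occupied particle-in-a-box orbitals (toy model)."""
    x = np.linspace(0.0, box_length, n_grid)
    density = np.zeros_like(x)
    for n in range(1, n_occ + 1):
        # normalized box orbital phi_n(x) = sqrt(2/L) sin(n pi x / L)
        phi = np.sqrt(2.0 / box_length) * np.sin(n * np.pi * x / box_length)
        density += 2.0 * phi**2  # factor 2: one electron of each spin
    return x, density

def integrate(x, y):
    """Trapezoidal rule on a uniform grid."""
    dx = x[1] - x[0]
    return float(np.sum(0.5 * (y[1:] + y[:-1])) * dx)

x, P = closed_shell_density(n_occ=3)
n_electrons = integrate(x, P)
print(n_electrons)  # should be close to 6, the number of electrons
```

The density is never negative and piles up where the occupied orbitals are large, which is the "picture of how the charge is spread out" referred to in the text.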
We'll continue to use the N-electron Hamiltonian
$$H = \sum_i h(i) + \tfrac{1}{2}\sum_{i \neq j} g(i,j),$$
where h(i) and g(i, j) are defined in Chapter 2, through equations (2.2) and (2.3), and
the 1-electron operator h(i) contains a term V(i) for the potential energy of electron i in
the field of the nuclei. The potential energy of the whole system follows in terms of the
state function Ψ as
$$\langle\Psi|\sum_i V(i)|\Psi\rangle = N\langle\Psi|V(1)|\Psi\rangle
= \int V(1)\rho(\mathbf{x}_1)\,d\mathbf{x}_1
= \int V(1)P(\mathbf{r}_1)\,d\mathbf{r}_1,$$
where the first step expresses the expectation value as N times the result for Electron 1;
the next step puts it in terms of the density $\rho(\mathbf{x}_1)$ with spin included; and finally, since
V(1) is spin-independent, the spin integrations can be done immediately and introduce
the spinless density $P(\mathbf{r}_1)$ defined in (7.14).
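The spinless density matrix is defined similarly:
$$P(\mathbf{r}_1;\mathbf{r}_1') = \int_{s_1'=s_1} \rho(\mathbf{x}_1;\mathbf{x}_1')\,ds_1,$$
where
$$\rho(\mathbf{x}_1;\mathbf{x}_1') = N\int \Psi(\mathbf{x}_1,\mathbf{x}_2, ... \mathbf{x}_N)\,\Psi^*(\mathbf{x}_1',\mathbf{x}_2, ... \mathbf{x}_N)\,d\mathbf{x}_2 ... d\mathbf{x}_N$$
is the density matrix with spin included, as defined in (5.26), the prime being used to protect
the variable in Ψ* from the action of any operator. To express the expectation value of an operator sum
you can make similar steps (you should do it!). Thus, for the kinetic energy with operator
T(i) for electron i,
$$\langle\Psi|\sum_i T(i)|\Psi\rangle
= N\int \Psi^*(\mathbf{x}_1,\mathbf{x}_2, ... \mathbf{x}_N)\,T(1)\,\Psi(\mathbf{x}_1,\mathbf{x}_2, ... \mathbf{x}_N)\,d\mathbf{x}_1 d\mathbf{x}_2 ... d\mathbf{x}_N
= \int_{\mathbf{x}_1'=\mathbf{x}_1} T(1)\rho(\mathbf{x}_1;\mathbf{x}_1')\,d\mathbf{x}_1
= \int_{\mathbf{r}_1'=\mathbf{r}_1} T(1)P(\mathbf{r}_1;\mathbf{r}_1')\,d\mathbf{r}_1.$$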
Those are one-electron density functions, but in Example 5.7 we found it was possible
to generalize to two- and many-electron densities in a closely similar way. Thus a pair
density (spin included) is defined as
$$\pi(\mathbf{x}_1,\mathbf{x}_2) = N(N-1)\int \Psi(\mathbf{x}_1,\mathbf{x}_2 ... \mathbf{x}_N)\,\Psi^*(\mathbf{x}_1,\mathbf{x}_2 ... \mathbf{x}_N)\,d\mathbf{x}_3 ... d\mathbf{x}_N$$
and the density matrix follows on putting primes on the variables $\mathbf{x}_1, \mathbf{x}_2$ in the Ψ* factor.
With this definition, the expectation value of the electron interaction term in the
Hamiltonian becomes (remember, the prime on the Σ means no term with j = i)
$$\langle\Psi|\tfrac{1}{2}{\sum_{i,j}}' g(i,j)|\Psi\rangle
= \tfrac{1}{2}\int [g(1,2)\,\pi(\mathbf{x}_1,\mathbf{x}_2;\mathbf{x}_1',\mathbf{x}_2')]_{(\mathbf{x}_1'=\mathbf{x}_1,\,\mathbf{x}_2'=\mathbf{x}_2)}\,d\mathbf{x}_1 d\mathbf{x}_2.$$
As g(1, 2) is just an inverse-distance electron repulsion, without spin dependence, the spin
integrations can be performed immediately and the primes can be removed. The result is
$$\langle\Psi|\tfrac{1}{2}{\sum_{i,j}}' g(i,j)|\Psi\rangle = \tfrac{1}{2}\int g(1,2)\,\Pi(\mathbf{r}_1,\mathbf{r}_2)\,d\mathbf{r}_1 d\mathbf{r}_2.$$
(The notation is consistent: Greek letters ρ and π are used for the density functions with
spin included; corresponding capitals, P and Π, for their spin-free counterparts.)
In summary,
$$\pi(\mathbf{x}_1,\mathbf{x}_2) = N(N-1)\int \Psi(\mathbf{x}_1,\mathbf{x}_2 ... \mathbf{x}_N)\,\Psi^*(\mathbf{x}_1,\mathbf{x}_2 ... \mathbf{x}_N)\,d\mathbf{x}_3 ... d\mathbf{x}_N$$
and
$$\Pi(\mathbf{r}_1,\mathbf{r}_2) = \int \pi(\mathbf{x}_1,\mathbf{x}_2)\,ds_1 ds_2$$
is a 2-electron probability density: it gives the probability of two electrons (any two)
being found simultaneously at points $\mathbf{r}_1$ and $\mathbf{r}_2$ in ordinary 3-space, with all the others
anywhere. (Remember this function is a density, so to get the actual probability of finding
two electrons in tiny volume elements at points $\mathbf{r}_1$ and $\mathbf{r}_2$ you must multiply it by the
volume factor $d\mathbf{r}_1 d\mathbf{r}_2$.)
The function $\Pi(\mathbf{r}_1,\mathbf{r}_2)$ describes the correlation between the motions of two electrons,
and in IPM approximation that correlation turns out to be non-zero only when they have the same spin.
This is one of the main challenges to the calculation of accurate many-electron wave
functions. Fortunately we can go a long way without meeting it!
Some applications
So far we've been thinking mainly of an isolated system, which can stay in a
definite energy eigenstate 'forever', such states being stationary. To make
the system change you must do something to it; you must disturb it and a
small disturbance of this kind is called a perturbation. The properties
of the system are measured by the way it reacts to such changes.
Response to an applied electric field
The simplest properties of molecules are the ones that depend directly on
the charge density, described by the function P (r) defined in (7.29). And
the simplest perturbation you can make is the one due to an electric field
applied from outside the molecule. This will change the potential energy of
Electron i in the Hamiltonian H, so that (using x, y, z for the components
of an electron's position vector r),
$$V(i) \rightarrow V(i) + \delta V(i).$$
When the external field is uniform and is in the z-direction, it arises
from the electric potential φ as $F_z = -\partial\phi/\partial z$; and we may thus choose
zP (r)dr.
atomic and molecular structure involve the application of strong magnetic fields. One thinks in particular of Nuclear Magnetic Resonance
(NMR) and Electron Spin Resonance (ESR), which bring in the spins
of both electrons and nuclei. So we must start by thinking of how a system
responds to the application of a magnetic field.
Response to an applied magnetic field
Again let's take the simplest case of a uniform field. Whereas an electric
field, a vector quantity F with scalar components $F_x, F_y, F_z$ in a Cartesian system,
can be defined as the gradient of a scalar potential function
φ, that is not possible for a magnetic field. If you look at Chapter 4 of
Book 10 you'll see why: briefly, div B = 0 at every point in free space;
but if B were the gradient of some scalar potential $\phi_{mag}$ that wouldn't be
possible in general. On the other hand, B could be the curl of some vector
quantity, A, say. (If you've forgotten about operators such as grad, div
and curl, you'll need Book 10.)
Now we're ready to show how the motion of a particle of charge q is modified by the
application of a magnetic field. First of all, remember how the
kinetic energy T is defined: $T = (1/2m)\sum_i p_i^2$, where the index i runs over
components x, y, z and $p_x$, for example, is the x-component of momentum,
$p_x = mv_x = m\dot{x}$, with $\dot{x}$ being short for the time-derivative dx/dt. Also, when
there is a potential energy V = V(x, y, z) = qφ(x, y, z) the total energy of
the particle is the Hamiltonian function
$$E = H(x, y, z, p_x, p_y, p_z) = T + V,$$
but the Lagrangian function
$$L(x, y, z, \dot{x}, \dot{y}, \dot{z}) = T - V,$$
named after the French mathematician Lagrange, is equally important.
Either can be used in setting up the same equations of motion, but here
we'll use Lagrange's approach.
The Lagrangian for a single particle in a static electric field is thus
$$L = \tfrac{1}{2}mv^2 - q\phi,$$
in terms of the speed v of the particle. In terms of L, the momentum
components can be expressed as $p_x = (\partial L/\partial\dot{x}) = (\partial T/\partial\dot{x})$,
since φ is
$$m\mathbf{v} = \mathbf{p} - q\mathbf{A}.$$
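The time derivative of p follows from the Lagrangian equations of motion
(at the beginning of this Example), namely (for the x-component),
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{x}}\right) = \frac{\partial L}{\partial x}.$$
Thus, $(\partial L/\partial\dot{x})$, which is the generalized momentum x-component, has
a time derivative $\dot{p}_x$ and this is equated to $(\partial L/\partial x)$. When the magnetic
field is included it follows that
$$\dot{p}_x = (\partial L/\partial x) = -q(\partial\phi/\partial x) + q\,\mathbf{v}\cdot(\partial\mathbf{A}/\partial x).$$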
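The second term in the expression for $\mathbf{F} = m\dot{\mathbf{v}}$ is $-q\dot{\mathbf{A}}$ and as we have taken
$\mathbf{A} = \tfrac{1}{2}\mathbf{B}\times\mathbf{r}$, we can easily calculate its time rate of change. On taking the
components one at a time and remembering that the position vector r has
components x, y, z, we obtain
$$\frac{dA_x}{dt} = \frac{\partial A_x}{\partial t} + \dot{x}\frac{\partial A_x}{\partial x} + \dot{y}\frac{\partial A_x}{\partial y} + \dot{z}\frac{\partial A_x}{\partial z},$$
with similar expressions for the y- and z-components. Note that the first
term on the right will be zero because A has no explicit dependence on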
time.
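On substituting both terms into the force equation $\mathbf{F} = m\dot{\mathbf{v}} = \dot{\mathbf{p}} - q\dot{\mathbf{A}}$ the
x-component follows as
$$F_x = -q\frac{\partial\phi}{\partial x} + q\left[\dot{y}\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right) - \dot{z}\left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right)\right].$$
The two terms in round brackets can be recognised as, respectively, the z-
and y-components of the vector curl A; and when the coefficients $\dot{y}$ and $\dot{z}$
are attached the result in square brackets is seen to be the x-component of
the vector product v × curl A.
Finally, then, in vector notation F = qE + q(v × B) where the electric field
vector is here denoted by E, while the other term $\mathbf{F}_{mag}$ = q(v × curl A) is
the Lorentz force.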
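A quick numerical check of the last step can be reassuring. The sketch below is only an illustration with made-up numbers: it takes A = ½ B × r for a uniform field, forms curl A by central differences, and confirms that it reproduces B, so that q(v × curl A) is the same as q(v × B). The function names vector_potential and numerical_curl, and all the numerical values, are assumptions introduced here.

```python
import numpy as np

def vector_potential(B, r):
    """A = (1/2) B x r, the choice used in the text for a uniform field B."""
    return 0.5 * np.cross(B, r)

def numerical_curl(field, r, h=1e-5):
    """Curl of a vector field at point r, by central differences."""
    J = np.zeros((3, 3))  # J[i, j] = d field_i / d r_j
    for j in range(3):
        step = np.zeros(3)
        step[j] = h
        J[:, j] = (field(r + step) - field(r - step)) / (2.0 * h)
    return np.array([J[2, 1] - J[1, 2],   # x-component
                     J[0, 2] - J[2, 0],   # y-component
                     J[1, 0] - J[0, 1]])  # z-component

# Arbitrary illustrative values (any units)
B = np.array([0.3, -1.2, 2.0])   # uniform magnetic field
r = np.array([0.7, 0.1, -0.4])   # position of the particle
v = np.array([1.0, 2.0, 3.0])    # velocity of the particle
q = -1.0                         # charge

curl_A = numerical_curl(lambda p: vector_potential(B, p), r)
F_mag = q * np.cross(v, curl_A)  # q (v x curl A)

print(curl_A)  # should agree with B
print(F_mag)   # should agree with q (v x B)
```

Because A is linear in r, the finite-difference derivatives are exact apart from rounding, whatever point r is chosen.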
Molecules in magnetic fields
In Section 6 of Chapter 5 we noted that whenever a system contained
unpaired electrons there would be a tiny interaction between the electron
spin, with its resultant magnetic dipole, and any external magnetic field.
A free spin interacts with a magnetic field B through a coupling term
$g\beta\,\mathbf{B}\cdot\mathbf{S}$, where the g-value is very close to 2 and $\beta = e\hbar/2m$ (the Bohr
magneton). So there will be a small perturbation of the many-electron
Hamiltonian of the form
$$H_Z = g\beta\sum_i \mathbf{B}\cdot\mathbf{S}(i),$$
the summation being over all electrons. This is the spin-field Zeeman
There will also be an interaction between the spin dipole and the magnetic
field produced by motion of the electrons, which will depend on the velocity with which they are moving. In the case of an atom, the spatial motion
$$H_{mag} = \beta\sum_i \mathbf{B}\cdot\mathbf{L}(i).$$
The spin operator in (7.40) will multiply α(s) by ½, but β(s) by −½ and
then, removing any remaining primes, (7.40) will become (check it out!)
$$S_z(\mathbf{r}) = \tfrac{1}{2}[P_\alpha(\mathbf{r}) - P_\beta(\mathbf{r})].$$
Similar densities, Qx , Qy , may also be defined, but in practice the applied magnetic field (e.g. in NMR
and ESR experiments) is usually chosen to fix the z-direction.
(This is still not absolutely satisfactory if one wants to know how much
KE comes from a finite part of space, when integrating to get the total
expectation value, for it contains terms depending on the surface bounding
the chosen region. But for all normal purposes, which involve integration
over all space, it may be used.)
In the presence of a magnetic field the operator p is replaced by the generalized
momentum operator π, defined in (7.42), and the natural generalization of the KE density is
$$P_T(\mathbf{r}) = (1/2m)[\boldsymbol{\pi}\cdot\boldsymbol{\pi}'^{*}\,P(\mathbf{r};\mathbf{r}')]_{\mathbf{r}'=\mathbf{r}}.$$
This gives a KE density which is everywhere real and positive and gives
the correct expectation value on integrating over all space.
Finite basis approximations
Of course, if we want to actually calculate a molecular electronic property
we have to use an approximation in which the orbitals used (e.g. the MOs)
are expressed as linear combinations of basis functions (e.g. AOs centred
on the various nuclei). This finite basis approximation was first introduced
in Chapter 4 (Section 4.4) and allows us to convert all equations into matrix
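equations. For example any MO
$$\phi_K = c_{K1}\chi_1 + c_{K2}\chi_2 + ... + c_{Kr}\chi_r + ... + c_{Km}\chi_m$$
can be expressed in matrix form, as the row-column product
$$\phi_K = (\chi_1\ \chi_2\ ...\ \chi_m)\,\mathbf{c}_K = \boldsymbol{\chi}\mathbf{c}_K,$$
where $\mathbf{c}_K$ stands for the whole column of expansion coefficients and $\boldsymbol{\chi}$ for
the row of basis functions. So the operator X will be represented in the basis by the
matrix $\mathbf{X}$, with elements $X_{rs} = \langle\chi_r|X|\chi_s\rangle$, and its expectation
value for an electron in $\phi_K$ will be
$$\langle\phi_K|X|\phi_K\rangle = \sum_{r,s} c_{Kr}^*\langle\chi_r|X|\chi_s\rangle c_{Ks} = \mathbf{c}_K^\dagger\mathbf{X}\mathbf{c}_K.$$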
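The matrix form of the expectation value is easy to try out. The following sketch is an illustration only: the 2 × 2 matrix X and the coefficient column c_K are invented numbers, the basis is assumed orthonormal (so that the MO is normalized when c†c = 1), and expectation_value is a name made up here. It evaluates c†Xc and compares it with the explicit double sum over r and s.

```python
import numpy as np

def expectation_value(c, X):
    """Return c^dagger X c for a coefficient column c and operator matrix X."""
    c = np.asarray(c, dtype=complex)
    X = np.asarray(X, dtype=complex)
    return np.vdot(c, X @ c)  # np.vdot conjugates its first argument

# Hypothetical Hermitian operator matrix in a two-function orthonormal basis
X = np.array([[1.0, 0.5j],
              [-0.5j, 2.0]])
# Hypothetical normalized MO coefficients (complex, to show the conjugation)
c_K = np.array([1.0, 1.0j]) / np.sqrt(2.0)

value = expectation_value(c_K, X)

# The same number from the explicit double sum over r and s
double_sum = sum(np.conj(c_K[r]) * X[r, s] * c_K[s]
                 for r in range(2) for s in range(2))

print(value, double_sum)
```

Since X is Hermitian the result is real, as an expectation value must be.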
$$\phi_k = A_k\sum_n \chi_n\exp(2\pi i nk/N) \qquad (A_k = 1/\sqrt{N})$$
could be found. The MO of lowest energy is $\phi_0 = A_0(\chi_1 + \chi_2 + ... + \chi_6)$ and, being real, will give
the velocity a zero expectation value. But the MOs with k = ±1 form a degenerate pair, whose wave functions
are complex conjugate. The expectation value in state k will be
$\langle\phi_k|v|\phi_k\rangle = \sum_{n,n'} c_n^*\langle\chi_n|v|\chi_{n'}\rangle c_{n'}$
and to evaluate this quantity, which measures the expected electron probability current, we need only
the matrix elements $\langle\chi_n|v|\chi_{n'}\rangle$ and the AO coefficients $c_n$ (given above for any chosen k). If we were
doing an energy calculation, with Hückel approximations, we'd have the 1-electron Hamiltonian h in place
of v; and the n-n' matrix element would be given an empirically chosen value $\beta_{nn'}$ for nearest neighbour
atoms, zero otherwise. But here the nearest neighbours of Atom n would have n' = n + 1 (for positive
direction along the chain) and n' = n − 1 (for negative direction); and, as the operator v points in the
direction of increasing n, the n → n + 1 matrix element would have a substantial (but imaginary) value
(iγ, say). With this choice, the most suitable approximations would seem to be $\langle\chi_n|v|\chi_{n+1}\rangle = i\gamma$ and
$\langle\chi_n|v|\chi_{n-1}\rangle = -i\gamma$, other matrix elements being considered negligible.
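On using this very crude model, the expectation value of the velocity component for an electron in MO
$\phi_k$, for an N-atom chain, would be
$$\langle\phi_k|v|\phi_k\rangle = |A_k|^2\sum_n\left[i\gamma\,e^{2\pi ik/N} - i\gamma\,e^{-2\pi ik/N}\right].$$
This reduces to (Check it out! noting that the summation contains N terms)
$$\langle\phi_k|v|\phi_k\rangle = -2\gamma\sin(2\pi k/N).$$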
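If you do have a computer to hand, the "check it out" takes only a few lines. The sketch below is an illustration under the crude model just described: matrix elements iγ for n → n + 1 and −iγ for n → n − 1 on a ring of N atoms, with AO coefficients exp(2πink/N)/√N. With those elements the direct sum comes out as −2γ sin(2πk/N). The function name velocity_expectation and the value γ = 1 are assumptions made here.

```python
import numpy as np

def velocity_expectation(k, N, gamma):
    """<phi_k|v|phi_k> for a ring of N atoms in the nearest-neighbour model."""
    n = np.arange(N)
    c = np.exp(2j * np.pi * n * k / N) / np.sqrt(N)  # AO coefficients of MO k
    V = np.zeros((N, N), dtype=complex)
    for a in range(N):
        V[a, (a + 1) % N] = 1j * gamma    # <chi_n|v|chi_{n+1}> = i gamma
        V[a, (a - 1) % N] = -1j * gamma   # <chi_n|v|chi_{n-1}> = -i gamma
    return np.vdot(c, V @ c)              # c^dagger V c

N, gamma = 6, 1.0   # six-membered ring; gamma is an arbitrary illustrative value
for k in (-1, 0, 1, 2):
    numeric = velocity_expectation(k, N, gamma)
    analytic = -2.0 * gamma * np.sin(2.0 * np.pi * k / N)
    print(k, numeric.real, analytic)
```

The states k and −k give equal and opposite values, while k = 0 gives zero, in line with the remarks about the real MO of lowest energy.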
The example confirms that, even without using a computer (or even a simple calculator!),
it's often possible to obtain a good understanding of what goes on in a complicated
many-electron system. Here we've found how an IPM approach with the simplest possible
approximations can reveal factors that govern the flow of charge density along a carbon
chain: a parameter γ (which depends on overlap of adjacent AOs) should be large and the
flow will be faster in quantum states with higher values of a quantum number k. Pairs of
states with equal but opposite values of k correspond to opposite directions of circulation
round the ring; and the circulating current produces a magnetic dipole, normal to the
plane of the ring. In cyclic hydrocarbons such effects are experimentally observable; and
when the momentum operator p is replaced by the gauge invariant operator
π (which contains the vector potential of an applied magnetic field) it is possible to
calculate a resultant induced magnetic dipole, again experimentally observable. In
fact, the quantum number k in a ring current calculation is the analogue of an angular
momentum quantum number in an atomic calculation. Chapter 7 has set out most of
the mathematical tools necessary for an in-depth study of molecular electronic structure
and properties, even if only at IPM level. But, for now, that's enough!
In Chapter 8, we'll start looking at more extended systems where there may be many
thousands of atoms. Incredibly, we'll find it is still possible to make enormous progress.
Chapter 8
Some extended structures in 1-, 2- and 3-dimensions
In earlier chapters of this book we've often talked about symmetry properties of a
system; these have been, for example, the exchange of two or more identical particles, or
a geometrical operation such as a rotation which sends every particle into a new position.
Such operations may form a symmetry group when they satisfy certain conditions.
We have met Permutation Groups in introducing the Pauli Principle (Section 2.4 of
Chapter 2); and Point Groups, which contain geometrical operations that leave one
point in the system unmoved, in studying molecules (e.g. in Example 7.5 of this book).
But when we move on to the study of extended structures such as crystals, in which
certain structural units may be repeated indefinitely (over and over again) as we go
through the crystal, we must admit new symmetry operations called translations. So
this is a good point at which to review the old and start on the new.
Table 1
It is then easily shown (do it!) that the law of combination for such operations is
$$(R|\mathbf{t})(S|\mathbf{t}') = (RS|\mathbf{t} + R\mathbf{t}')$$
and that the set of all such operations then forms a Space Group.
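The law of combination can be checked numerically in a few lines. The sketch below assumes the usual convention that (R|t) sends a point r into Rr + t, so that in a product the right-hand operation acts first; the names apply and combine, and the particular rotations and translations, are chosen here just for illustration.

```python
import numpy as np

def apply(op, r):
    """Action of the operation (R|t) on a point r: r -> R r + t (assumed convention)."""
    R, t = op
    return R @ r + t

def combine(left, right):
    """Product (R|t)(S|t') = (RS | t + R t'); the right-hand factor acts first."""
    R, t = left
    S, t_prime = right
    return (R @ S, t + R @ t_prime)

C4 = np.array([[0.0, -1.0],
               [1.0,  0.0]])     # rotation through 90 degrees in the plane
C2 = -np.eye(2)                  # rotation through 180 degrees

op_a = (C4, np.array([2.0, 1.0]))   # (C4 | t)
op_b = (C2, np.array([1.0, 0.0]))   # (C2 | t')

r = np.array([3.0, -2.0])
one_step = apply(combine(op_a, op_b), r)   # combined operation applied once
two_steps = apply(op_a, apply(op_b, r))    # the two operations in succession

print(one_step, two_steps)  # the two results should coincide
```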
In the following example we show how just two primitive translations,
call them a1 and a2 , can be used to generate a Crystal Lattice in two
dimensions. On adding a third primitive translation a3 it is just as easy to
generate a lattice for a real three-dimensional crystal.
If you allow n1 , n2 to take all positive and negative values, from zero to infinity, you will generate an
infinite square lattice in the xy-plane; the bold dots will then show all the lattice points.
Example 8.1 brings out one very important conclusion: when translations are combined
with point group operations we have to ask which rotations or reflections are allowed.
The combination (R|t) may not always be a symmetry operation and in that case the
operations will not be acceptable as members of a space group. Looking at the picture it
is clear that if t is a translation leading from point (1,0) to (3,1) it can be combined with
a rotation C4 , and then leads to another lattice point; but it cannot be combined with
C3 or C6 because (R|t) would not lead to a lattice point for either choice of the rotation
and could not therefore belong to any space group. To derive all the possible space
groups, when symmetrical objects are placed in an empty lattice of points, is a very long
and difficult story (there are 230 of them!) but it's time to move on.
A general lattice point will then have the position vector (in 3 dimensions)
$$\mathbf{r} = n_1\mathbf{a}_1 + n_2\mathbf{a}_2 + n_3\mathbf{a}_3,$$
where $n_1, n_2, n_3$ are integers (positive, negative, or zero). The scalar product that gives
the square of the length of the vector (i.e. of the distance
from the origin to the lattice point) is then
$$\mathbf{r}\cdot\mathbf{r} = n_1^2(\mathbf{a}_1\cdot\mathbf{a}_1) + n_2^2(\mathbf{a}_2\cdot\mathbf{a}_2) + n_3^2(\mathbf{a}_3\cdot\mathbf{a}_3)
+ 2n_1n_2(\mathbf{a}_1\cdot\mathbf{a}_2) + 2n_1n_3(\mathbf{a}_1\cdot\mathbf{a}_3) + 2n_2n_3(\mathbf{a}_2\cdot\mathbf{a}_3)$$
$$= \sum_i n_i^2 S_{ii} + \sum_{i\neq j} n_i n_j S_{ij} \qquad (S_{ij} = \mathbf{a}_i\cdot\mathbf{a}_j),$$
where the $S_{ij}$ are elements of the usual metric matrix S and i, j go from
1 to 3 (each pair of different indices appears twice in the second sum, giving the factors 2).
When the vectors for the primitive translations are orthogonal and
of equal length S is a multiple of the 3 × 3 unit matrix and the translations generate a
simple cubic lattice, in which (distance)² has the usual
(Cartesian) form as a sum of squares of the vector components.
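The role of the metric matrix is easily seen in a small calculation. The sketch below uses three invented oblique primitive translations (the numbers are arbitrary, not those of any real crystal) and compares the squared length of a lattice vector found directly with the value obtained from the metric, written both as a matrix product and as the two sums in the text.

```python
import numpy as np

# Rows are hypothetical oblique primitive translations a1, a2, a3 (Cartesian components)
a = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.9, 0.0],
              [0.2, 0.3, 1.1]])

S = a @ a.T                   # metric matrix, S[i, j] = a_i . a_j
n = np.array([2, -1, 3])      # integers n1, n2, n3 labelling a lattice point

r = n @ a                     # r = n1 a1 + n2 a2 + n3 a3
length_sq_direct = r @ r      # Cartesian sum of squares
length_sq_metric = n @ S @ n  # same quantity from the metric

# The two sums of the text: diagonal terms, then all pairs with i != j
diagonal = sum(n[i]**2 * S[i, i] for i in range(3))
cross = sum(n[i] * n[j] * S[i, j] for i in range(3) for j in range(3) if i != j)

print(length_sq_direct, length_sq_metric, diagonal + cross)
```

For orthogonal translations of equal length the off-diagonal elements of S vanish and only the sum of squares survives.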
Using (8.3), with the oblique axes shown in Figure 8.2, the scalar product
does not have that simple form; but we can get it back by setting up a
new basis of reciprocal vectors (not a good name), denoted by b1 , b2 ,
in which a general vector v is expressed as v = v1 b1 + v2 b2 and choosing
b1 orthogonal to a2 , but with length reciprocal to that of a1 , and similarly
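for $\mathbf{b}_2$. This makes $\mathbf{a}_1\cdot\mathbf{b}_1 = \mathbf{a}_2\cdot\mathbf{b}_2 = 1$, but $\mathbf{a}_1\cdot\mathbf{b}_2 = \mathbf{a}_2\cdot\mathbf{b}_1 = 0$, and a scalar product
$(r_1\mathbf{a}_1 + r_2\mathbf{a}_2)\cdot(v_1\mathbf{b}_1 + v_2\mathbf{b}_2)$ will then take the usual form
$$\mathbf{r}\cdot\mathbf{v} = r_1v_1(\mathbf{a}_1\cdot\mathbf{b}_1) + r_2v_2(\mathbf{a}_2\cdot\mathbf{b}_2) + r_1v_2(\mathbf{a}_1\cdot\mathbf{b}_2) + r_2v_1(\mathbf{a}_2\cdot\mathbf{b}_1) = r_1v_1 + r_2v_2,$$
just as it would be for two general vectors in a (Cartesian) 2-dimensional space.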
The same construction can be made in 3-space, with the primitive translations described by the vectors a1 , a2 , a3 ; and with basis vectors b1 , b2 , b3
defining the reciprocal space. But in this case the relationship between
the two bases is not so direct: the b vectors must be defined as
$$\mathbf{b}_1 = \frac{\mathbf{a}_2\times\mathbf{a}_3}{[\mathbf{a}_1\,\mathbf{a}_2\,\mathbf{a}_3]}$$
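Here is a short sketch of the 3-space construction, offered only as an illustration. It assumes that b2 and b3 follow from b1 by cyclic permutation of the indices (the standard choice), with [a1 a2 a3] the triple product a1 · (a2 × a3); the function name reciprocal_basis and the oblique translations are invented for the example. The check is that a_i · b_j is 1 when i = j and 0 otherwise.

```python
import numpy as np

def reciprocal_basis(a1, a2, a3):
    """Reciprocal vectors b1, b2, b3 for primitive translations a1, a2, a3."""
    triple = np.dot(a1, np.cross(a2, a3))   # triple product [a1 a2 a3]
    b1 = np.cross(a2, a3) / triple
    b2 = np.cross(a3, a1) / triple          # cyclic permutation (assumed)
    b3 = np.cross(a1, a2) / triple          # cyclic permutation (assumed)
    return b1, b2, b3

# Hypothetical oblique primitive translations
a1 = np.array([1.0, 0.0, 0.0])
a2 = np.array([0.5, 0.9, 0.0])
a3 = np.array([0.2, 0.3, 1.1])

b1, b2, b3 = reciprocal_basis(a1, a2, a3)

A = np.array([a1, a2, a3])
B = np.array([b1, b2, b3])
overlap = A @ B.T   # element (i, j) is a_i . b_j

print(np.round(overlap, 12))  # should be the 3 x 3 unit matrix
```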
Here k is called a vector of k-space and the basis vectors are now taken as $2\pi\mathbf{b}_1, 2\pi\mathbf{b}_2, 2\pi\mathbf{b}_3$.
The corresponding 1-electron wave function will then be (adding a normalising factor M)
$$\phi(\mathbf{r}) = M\exp(i\mathbf{r}\cdot\mathbf{p}/\hbar) = M\exp(i\mathbf{k}\cdot\mathbf{r}),$$
with quantized values of the k-vector components. (Remember that vector components,
being sets of numbers, have usually been denoted by bold letters r, while the abstract
vectors they represent are shown in sans serif type as r.) The energy of the 1-electron
state (8.6) can still be written in the free-electron form $\epsilon_{\mathbf{k}} = (\hbar^2/2m)|\mathbf{k}|^2$, but when the
axes are oblique this does not become a simple sum-of-squares (you have to do some
trigonometry to find the squared length of the k-vector!).
Of course an empty box, even with suitable boundary conditions, is not a good model
for any real crystal; but it gives a good start by showing that the fundamental volume
containing G³ lattice cells allows us to set up that number of quantized 1-electron states,
represented by points in a certain central zone of k-space. Each state can hold two
electrons, of opposite spin, and on adding the electrons we can set up an IPM description
of the whole electronic structure of the crystal.
Crystal orbitals
In the empty-lattice approximation, we have used free-electron wave functions of the form
(8.6) to describe an electron moving with definite momentum vector $\mathbf{p} = \hbar\mathbf{k}$, quantized
according to the size and shape of the fundamental volume.
Now we want to recognize the fact that in reality there is an internal structure due to the
presence of atomic nuclei, repeated within every unit cell of the crystal. We're going to
find that the 1-electron functions are now replaced by crystal orbitals of very similar
form,
$$\phi(\mathbf{r}) = M\exp(i\mathbf{k}\cdot\mathbf{r})f_{\mathbf{k}}(\mathbf{r}),$$
where $f_{\mathbf{k}}(\mathbf{r})$ is a function with the periodicity of the lattice, having the same value at
equivalent points in all the unit cells. This result was first established by the German
physicist Bloch and the functions are also known as Bloch functions.
To obtain (8.7) most easily we start from the periodicity of the potential function V(r):
if we look at the point with position vector r + R, where R is the displacement
$$\mathbf{R} = m_1\mathbf{a}_1 + m_2\mathbf{a}_2 + m_3\mathbf{a}_3$$
and $m_1, m_2, m_3$ are integers, the potential must have the same value as at point r. And
let's define an operator $T_R$ such that $T_R\phi(\mathbf{r}) = \phi(\mathbf{r}+\mathbf{R})$. Applied to the potential function
V(r) it produces $T_R V(\mathbf{r}) = V(\mathbf{r}+\mathbf{R})$, but this must have exactly the same value V(r) as
before the shift: the potential function is invariant against any displacement with integral
m's. The same is true for the kinetic energy operator T and for the sum h = T + V, which
is the 1-electron Hamiltonian in IPM approximation. Thus
$$T_R(h\phi) = hT_R\phi,$$
h being unchanged in the shift; and in other words the operators h and $T_R$ must commute.
If we use $T_1, T_2, T_3$ to denote the operators that give shifts
$\mathbf{r}\rightarrow\mathbf{r}+\mathbf{a}_1$, $\mathbf{r}\rightarrow\mathbf{r}+\mathbf{a}_2$, $\mathbf{r}\rightarrow\mathbf{r}+\mathbf{a}_3$
(for the primitive translations) then we have four commuting operators $(h, T_1, T_2, T_3)$ and
should be able to find simultaneous eigenfunctions φ, such that
$$h\phi = \epsilon\phi, \qquad T_1\phi = \lambda_1\phi, \qquad T_2\phi = \lambda_2\phi, \qquad T_3\phi = \lambda_3\phi.$$
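Now let's apply $T_1$ G times to φ(r), G being the number of unit cells in each direction
in the fundamental volume, obtaining $(T_1)^G\phi = \lambda_1^G\phi$. The boundary conditions on the
fundamental volume require this to coincide with φ itself, so $\lambda_1^G = 1$. If we put
$\lambda_1 = e^{i\theta_1}$ this means that
$G\theta_1$ must be an integral multiple of 2π, so we can write $\theta_1 = (\kappa_1/G)(2\pi)$, where $\kappa_1$ is
a positive or negative integer or zero. This is true also for $\theta_2$ and $\theta_3$; and it follows that
in a general lattice displacement, $\mathbf{R} = m_1\mathbf{a}_1 + m_2\mathbf{a}_2 + m_3\mathbf{a}_3$,
$$\phi(\mathbf{r}+\mathbf{R}) = T_R\phi(\mathbf{r}) = T_1^{m_1}T_2^{m_2}T_3^{m_3}\phi(\mathbf{r})
= e^{i(m_1\theta_1+m_2\theta_2+m_3\theta_3)}\phi(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{R}}\phi(\mathbf{r}),$$
the last form following when k has components $\kappa_1/G, \kappa_2/G, \kappa_3/G$ relative to the k-space
basis $2\pi\mathbf{b}_1, 2\pi\mathbf{b}_2, 2\pi\mathbf{b}_3$.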
To show that a function φ(r) with this property can be written in the form (8.2) it is
enough to apply the last result to the function $e^{i\mathbf{k}\cdot\mathbf{r}}f(\mathbf{r})$, where f(r) is arbitrary: thus
$$e^{i\mathbf{k}\cdot\mathbf{R}}e^{i\mathbf{k}\cdot\mathbf{r}}f(\mathbf{r}+\mathbf{R}) = e^{i\mathbf{k}\cdot\mathbf{R}}e^{i\mathbf{k}\cdot\mathbf{r}}f(\mathbf{r}).$$
In other words we must have f(r + R) = f(r) and in that case the most general crystal
orbital will have the form
$$\phi_{\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}}f_{\mathbf{k}}(\mathbf{r}).$$
Here the subscript k has been added because the components of the k-vector are essentially
quantum numbers labelling the states. There is thus a one-to-one correspondence between
Bloch functions and free-electron wave functions, though the energy no longer depends in
a simple way on the components of the k-vector. The simplest approximation to a crystal
orbital is a linear combination of AOs on the atomic centres: thus, for a 1-dimensional
array of lattice cells, with one AO in each, this has the general form
$$\phi = \sum_n c_n\chi_n,$$
where the χ's are AOs on the numbered centres. We imagine the whole 1-dimensional
crystal is built up by repeating the fundamental volume of G unit cells in both directions,
periodicity requiring that $c_{n+G} = c_n$.
In the following example we assume zero overlap of AOs on different centres and use a
nearest-neighbour approximation for matrix elements of the 1-electron Hamiltonian h.
Thus, introducing the so-called coulomb and resonance integrals
$$\langle\chi_n|h|\chi_n\rangle = \alpha, \qquad \langle\chi_n|h|\chi_{n+1}\rangle = \beta \qquad \mbox{(all n)}$$
and are easily solved by supposing $c_n = e^{in\theta}$ and substituting. On taking out a common factor the
condition becomes, remembering that $e^{i\theta} + e^{-i\theta} = 2\cos\theta$, $(\alpha - \epsilon) + 2\beta\cos\theta = 0$, which fixes ε in terms
of θ.
To determine θ itself we use the periodicity condition $c_{n+G} = c_n$, which gives $e^{iG\theta} = 1$. Thus Gθ must be
an integral multiple of 2π and we can put θ = 2πκ/G, where κ is a positive or negative integer or zero.
Finally, the allowed energy levels and AO coefficients in (8.11) can be labelled by κ:
$$\epsilon_\kappa = \alpha + 2\beta\cos(2\pi\kappa/G), \qquad c_n = \exp(2\pi i\kappa n/G).$$
The energy levels for a 1-dimensional chain of atoms, in LCAO approximation, should
therefore form an energy band of width 4|β|, where β is the interaction integral $\langle\chi_n|h|\chi_{n+1}\rangle$
between neighbouring atoms. Figure 8.4, below, indicates these results for a chain of
Hydrogen atoms, where every χ is taken to be a 1s orbital.
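The band formula can be confirmed by brute force. The sketch below, which is only an illustration, sets up the G × G matrix of h for a cyclic chain with zero overlap and nearest-neighbour interaction, diagonalizes it numerically, and compares the eigenvalues with α + 2β cos(2πκ/G). The values of α and β are arbitrary negative numbers chosen here, and chain_levels is an invented name.

```python
import numpy as np

def chain_levels(G, alpha, beta):
    """Eigenvalues of the nearest-neighbour Hamiltonian for a cyclic chain of G atoms."""
    h = np.zeros((G, G))
    for n in range(G):
        h[n, n] = alpha             # coulomb integral on each centre
        h[n, (n + 1) % G] = beta    # resonance integral to the next centre
        h[(n + 1) % G, n] = beta    # and back again (h is symmetric)
    return np.linalg.eigvalsh(h)    # returned in ascending order

G, alpha, beta = 8, -1.0, -0.25     # illustrative values only (G > 2)
numeric = chain_levels(G, alpha, beta)

kappa = np.arange(G)
analytic = np.sort(alpha + 2.0 * beta * np.cos(2.0 * np.pi * kappa / G))

band_width = numeric.max() - numeric.min()
print(numeric)
print(analytic)
print(band_width)   # equals 4|beta| when G is even
```

For odd G the top of the band is not quite reached with a finite chain, so the computed width then falls slightly short of 4|β| and approaches it as G grows.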
It's time to look at some real systems and there's no shortage of them: even plastic
bags are made up from long chains of atoms, mainly of Carbon and Hydrogen atoms, all
tangled together; and so are the DNA molecules that carry the instructions for building
a human being from one generation to the next! All are examples of polymers.
In Example 8.3 we found crystal orbitals for the π-electrons of a carbon chain, using a
nearest-neighbour approximation and taking the chain to be straight. In reality, however,
carbon chains are never straight, and the C-C sigma bonds are best described in terms of
hybrid AOs, inclined at 120° to each other. Polyene chains are therefore usually zig-zag
in form, even when the Carbon atoms lie in the same plane as in the case shown below:
$$\phi_{B,\mathbf{k}}(\mathbf{r}) = \frac{1}{\sqrt{G^3}}\sum_n \exp(i\mathbf{k}\cdot\mathbf{R}_{n,B})\,\chi_{B,\mathbf{R}_{n,B}}(\mathbf{r}).$$
These functions will behave correctly when we go from the unit cell at the origin to any
other lattice cell and, provided all χ's are orthonormal, they are also normalized over the
whole fundamental volume. The k-vector specifies the symmetry species of a function,
under translations, and only functions with the same k can be mixed. Just as we can
express a π-type MO between the two atoms of the unit cell as a linear combination
$c_A\chi_A + c_B\chi_B$, we can write a π-type crystal orbital as
$$\phi_{\mathbf{k}} = c_{A,\mathbf{k}}\phi_{A,\mathbf{k}} + c_{B,\mathbf{k}}\phi_{B,\mathbf{k}},$$
where the mixing coefficients now depend on the k-vector and must be found by solving
a secular problem, as usual.
To complete the calculation, we need approximations to the matrix elements of the
1-electron Hamiltonian between the two Bloch functions in (8.13): these depend on the
corresponding elements between the AOs in all lattice cells and the simplest approximation
is to take, as in (8.12), $\langle\chi_{A,\mathbf{R}_{m,A}}|h|\chi_{A,\mathbf{R}_{m,A}}\rangle = \langle\chi_{B,\mathbf{R}_{n,B}}|h|\chi_{B,\mathbf{R}_{n,B}}\rangle = \alpha_C$ (the same value for
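[Figure 8.6: schematic energy bands (a), (b) and (c) for the carbon chain, with the reference level ε = α_C marked.]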
The top band (a) refers to the most loosely bound electrons, the reference level at $\epsilon = \alpha_C$
being the energy of an electron in a single Carbon 2pπ AO. For a fundamental volume
containing G unit cells the band arising from this AO will contain G levels, but as each
Carbon provides only one π electron only ½G of the crystal orbitals will be filled (2
electrons in each, with opposite spins). That means that electrons will be easily excited,
from the ground state into the nearby empty orbitals; so a carbon chain of this kind
should be able to conduct electricity. Polyacetylene is an example of an unsaturated chain
molecule: such molecules are of industrial importance owing to the electrical properties
of materials derived from them.
Electrons in the lower bands, such as (b) and (c), are more strongly bound with crystal
orbitals consisting mainly of σ-type AOs, which lie at much lower energy. Figure 8.6 is
very schematic; the same reference level $\alpha_C$ is shown for the hybrid AOs involved in the
C-H bonds and the C-C σ bonds (which should lie much lower); and the band widths are
shown equal in all three cases, whereas the resonance integrals (β) are much greater in
magnitude for the AOs that overlap more heavily. So in fact such bands are much wider
and may even overlap. On carefully counting the number of energy levels they contain
and the number of atomic valence electrons available it appears that the crystal orbitals
in these lower energy bands are all likely to be doubly-occupied. In that case, as we know
from Section 7.4, it is always possible to replace the completely delocalized crystal orbitals
by unitary mixtures, without changing in any way the total electron density they give rise
to, the mixtures being strongly localized in the regions corresponding to the traditional
chemical bonds.
There will also be empty bands at much higher energy than those shown in Figure 8.6,
but these will arise from anti-bonding combinations of the AOs and are usually of little
Other types of polymer chains
The polyacetylene chain (Figure 8.5) is the simplest example of an unsaturated polymer:
the Carbons in the backbone all have only three saturated valences, the fourth valence
electron occupying a 2pπ orbital and providing the partial π bonds which tend to keep
the molecule flat. This valence electron may, however, take part in a 2-electron bond with
another atom, in which case all four Carbon valences are saturated and the nature of the
bonding with its neighbours is completely changed. The simplest saturated polymers are
found in the paraffin series, which starts with methane (CH4 ), ethane (C2 H6 ), propane
(C3 H8 ), and continues with the addition of any number of CH2 groups. Nowadays, the
paraffins are usually called alkanes.
Instead of the flat polyacetylene chains, which are extended by adding CH groups, the
alkanes are extended by adding CH2 groups. The backbone is still a zig-zag chain of
Carbons, but the C-C links are now single bonds (with no partial double-bond character)
around which rotation easily takes place: as a result the long chains become tangled,
leading to more rigid materials. If the chain is kept straight, as a 1-dimensional lattice,
the unit cell contains the repeating group indicated below in Figure 8.7 (where the unit
cell contents are shown within the broken-line circle).
Figure 8.7 Repeating group (C2 H4 )
Individual CH2 groups are perpendicular to
the plane of the Carbon chain (above it and
below it)
(Carbons shown in black, Hydrogens in grey)
The first few alkanes, with few C2 H4 groups and thus low molecular weight, occur as
gases; these are followed by liquid paraffins and then by solid waxes. But the chain
lengths can become enormous, including millions of groups. The resultant high-density
materials are used in making everything from buckets to machine parts, while the lower
density products are ideal for packaging and conserving food. World production of this
low-cost material runs to billions of tons every year!
In Section 8.1 we introduced the idea of a crystal lattice, in one, two and three dimensions, along with the simplest model in which an empty lattice was just thought of as
a box containing free electrons. Then, in Section 8.2, we improved the model by defining
the crystal orbitals, as a generalization of the MOs used in Chapter 7 for discussing
simple molecules. Finally, in Section 8.3, we began the study of some real systems by
looking at some types of 1-dimensional crystal, namely polymer chains built mainly from
atoms of Hydrogen and Carbon. These simple chain molecules form the basis for most
kinds of plastics that within the last century have changed the lives of most of us.
Most common crystals, however, are 3-dimensional and bring in new ideas which we are
now ready to deal with. The simplest of all (after solid Hydrogen) is metallic Lithium,
a metal consisting of Lithium atoms, each with one valence electron outside a Helium-like
closed shell. The atoms form a body-centred cubic lattice, with the unit cell indicated
lattice cell with origin at Rm will then be Rm + rA and similarly for B. Bloch functions
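can be formed for each atom, just as in (8.13), which we repeat here:
\[ \phi_{A,\mathbf{k}}(\mathbf{r}) = \frac{1}{\sqrt{G^3}} \sum_m \exp[i\mathbf{k}\cdot(\mathbf{R}_m + \mathbf{r}_A)]\,\chi_{A,\mathbf{R}_m}(\mathbf{r}), \]
\[ \phi_{B,\mathbf{k}}(\mathbf{r}) = \frac{1}{\sqrt{G^3}} \sum_n \exp[i\mathbf{k}\cdot(\mathbf{R}_n + \mathbf{r}_B)]\,\chi_{B,\mathbf{R}_n}(\mathbf{r}). \]
These functions are normalized, over the fundamental volume containing $G^3$ cells, provided all AOs (namely the $\chi$s) are normalized and orthogonal.
(Remember that $\chi_{A,\mathbf{R}_m}(\mathbf{r})$ is an A-type AO centred on point $\mathbf{r}_A$ in the lattice cell with origin at $\mathbf{R}_m$ and similarly for $\chi_{B,\mathbf{R}_n}(\mathbf{r})$. Remember also that the wave vector $\mathbf{k}$ is defined in terms of the reciprocal lattice as $\mathbf{k} = k_1(2\pi\mathbf{b}_1) + k_2(2\pi\mathbf{b}_2) + k_3(2\pi\mathbf{b}_3)$.)
The matrix element of the 1-electron Hamiltonian $h$ between the two Bloch functions is then
\[ h_{AB} = \frac{1}{G^3} \sum_{m,n} \exp(i\mathbf{k}\cdot\boldsymbol{\rho}_{nm}) \int \chi_{A,\mathbf{R}_m}\, h\, \chi_{B,\mathbf{R}_n}\, d\mathbf{r}, \]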
where $\boldsymbol{\rho}_{nm} = (\mathbf{R}_n - \mathbf{R}_m + \mathbf{r}_B - \mathbf{r}_A)$ is the vector distance from an atom of A-type, in lattice cell at $\mathbf{R}_m$, to one of B-type in a cell at $\mathbf{R}_n$. The double summation is over all B-neighbours of any A atom, so taking A in the unit cell at the origin and summing over nearest neighbours will give a contribution $\sum_n \exp(i\mathbf{k}\cdot\boldsymbol{\rho}_{nm})\langle\chi_A|h|\chi_B\rangle$. This result will be the same for any choice of the cell at $\mathbf{R}_m$, again cancelling the normalizing factor on summation. On denoting the structure sum by $\sigma_{AB}(\mathbf{k})$, the final result will thus be $h_{AB} = \beta_{AB}\,\sigma_{AB}(\mathbf{k})$, where $\beta_{AB}$ is the usual resonance integral for the nearest-neighbour pairs.
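Example 8.4 has given for the matrix elements of the 1-electron Hamiltonian, between
Bloch functions $\phi_{A,\mathbf{k}}$ and $\phi_{B,\mathbf{k}}$,
\[ h_{AA} = \alpha_A, \qquad h_{BB} = \alpha_B, \qquad h_{AB} = \beta_{AB}\,\sigma_{AB}(\mathbf{k}). \]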
Here, for generality, we allow the atoms or orbitals at rA and rB to be different; so later
we can deal with mixed crystals as well as the Lithium metal used in the present section.
The secular determinant is thus
\[ \begin{vmatrix} \alpha_A - \epsilon & \beta_{AB}\,\sigma_{AB}(\mathbf{k}) \\ \beta_{AB}\,\sigma_{AB}^{*}(\mathbf{k}) & \alpha_B - \epsilon \end{vmatrix} = 0, \]
where the star on the second sigma arises because $h_{BA}$ is the complex conjugate of $h_{AB}$
while the AOs are taken as real functions. This quadratic equation for $\epsilon$ has roots, for
atoms of the same kind ($\alpha_A = \alpha_B = \alpha$),
\[ \epsilon_{\mathbf{k}} = \alpha \pm \beta_{AB}\,|\sigma_{AB}(\mathbf{k})|. \]
Since $\alpha$ and $\beta_{AB}$ are negative quantities, the states of lowest energy are obtained by
taking the upper sign. There will be $G^3$ states of this kind, resulting from the solution
of (8.17) at all points in k-space, i.e. for all values of $k_1, k_2, k_3$ in the wave vector $\mathbf{k} = k_1(2\pi\mathbf{b}_1) + k_2(2\pi\mathbf{b}_2) + k_3(2\pi\mathbf{b}_3)$. And there will be another $G^3$ states, of higher energy,
which arise on taking the lower sign. The present approximation thus predicts two energy
bands, of the kind displayed in Figure 8.6 for a 1-dimensional crystal (polyacetylene). We
now look for a pictorial way of relating the energy levels $\epsilon_{\mathbf{k}}$ within a band to the k-vector
of the corresponding crystal orbitals.
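The two-band result can be checked numerically. The following is only a minimal sketch: the values of alpha and beta and the complex number standing in for the structure sum are hypothetical, chosen for illustration, and the function name is not from the text.

```python
import cmath
import math

def band_energies(alpha_a, alpha_b, beta_ab, sigma):
    """Roots of the 2x2 secular determinant
         | alpha_a - e            beta_ab * sigma |
         | beta_ab * conj(sigma)  alpha_b - e     |  = 0,
    returned as (lower, upper)."""
    mean = 0.5 * (alpha_a + alpha_b)
    half_diff = 0.5 * (alpha_a - alpha_b)
    root = math.sqrt(half_diff ** 2 + (beta_ab * abs(sigma)) ** 2)
    return mean - root, mean + root

# Hypothetical parameters in arbitrary energy units (both negative).
alpha, beta = -5.0, -1.0
# An arbitrary complex number standing in for the structure sum at some k.
sigma = cmath.exp(0.3j) + cmath.exp(-1.1j)

lower, upper = band_energies(alpha, alpha, beta, sigma)
# For identical atoms the roots reduce to alpha +/- beta*|sigma|;
# with beta negative the '+' sign gives the lower band.
print(lower, alpha + beta * abs(sigma))
print(upper, alpha - beta * abs(sigma))
```

With beta negative, the printed pairs agree: the upper sign in the formula gives the lower of the two bands.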
Each factor $c_n = e^{in\theta}$ will be periodic within the fundamental volume of $G$ lattice cells in each direction
when $\theta = 2\pi\kappa/G$, $\kappa$ being an integer. So the general AO coefficient will be
\[ c^{\kappa_1\kappa_2\kappa_3}_{n_1 n_2 n_3} = \exp[2\pi i(n_1\kappa_1 + n_2\kappa_2 + n_3\kappa_3)/G], \]
where the three quantum numbers $\kappa_1, \kappa_2, \kappa_3$ determine the state; and the energy follows, as in Example
8.3, from the difference equation (in nearest neighbour approximation). Thus, the Bloch orbital energy
becomes a sum of three terms, one for each dimension:
\[ \epsilon_{\kappa_1\kappa_2\kappa_3} = \alpha + 2\beta_1\cos(2\pi\kappa_1/G) + 2\beta_2\cos(2\pi\kappa_2/G) + 2\beta_3\cos(2\pi\kappa_3/G). \]
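As a quick check on the periodicity and on the energy formula, here is a small sketch for one lattice direction. The lattice size G = 8, the parameter values and the function names are all assumptions made for illustration.

```python
import cmath
import math

def ao_coefficient(n, kappa, G):
    """c_n = exp(2*pi*i*n*kappa/G) for one lattice direction."""
    return cmath.exp(2j * math.pi * n * kappa / G)

def bloch_energy(kappas, alpha, betas, G):
    """alpha + sum over the three directions of 2*beta_j*cos(2*pi*kappa_j/G)."""
    return alpha + sum(2 * b * math.cos(2 * math.pi * k / G)
                       for k, b in zip(kappas, betas))

G = 8                                   # hypothetical number of cells per direction
alpha, betas = -5.0, (-1.0, -1.0, -1.0)  # hypothetical parameters

# Moving on by G cells brings the coefficient back to its starting value.
c_start = ao_coefficient(3, 2, G)
c_wrapped = ao_coefficient(3 + G, 2, G)

# Lowest and highest energies when all betas are negative.
e_bottom = bloch_energy((0, 0, 0), alpha, betas, G)
e_top = bloch_energy((G // 2, G // 2, G // 2), alpha, betas, G)
print(abs(c_start - c_wrapped), e_bottom, e_top)
```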
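In terms of the wave vector $\mathbf{k}$ and its components in reciprocal space, the 3-dimensional
Bloch function and its corresponding $\epsilon_{\mathbf{k}}$ can now be written, assuming all atoms have the
same $\alpha$ and all nearest-neighbour pairs have the same $\beta$,
\[ \phi_{\mathbf{k}} = \sum_n \exp(i\mathbf{k}\cdot\mathbf{R}_n)\,\chi_n, \qquad \epsilon_{\mathbf{k}} = \alpha + 2\beta\cos 2\pi k_1 + 2\beta\cos 2\pi k_2 + 2\beta\cos 2\pi k_3, \]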
The formula $\epsilon_{\mathbf{k}} = \alpha + 4\beta\cos\pi(k_1 + k_2)\cos\pi(k_1 - k_2)$, obtained from (8.18) in the 2-dimensional case, then shows that the energy rises from a minimum $\alpha + 4\beta$ at the zone
centre (where $k_1 = k_2 = 0$) to a maximum $\alpha - 4\beta$ at the zone corners. The top and
bottom states thus define an energy band of width $8|\beta|$.
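The band limits can be confirmed by scanning the zone numerically. This is a minimal sketch with hypothetical values of alpha and beta; the grid size and the function names are chosen only for illustration.

```python
import math

def energy_2d(k1, k2, alpha, beta):
    """Sum form: alpha + 2*beta*cos(2*pi*k1) + 2*beta*cos(2*pi*k2)."""
    return (alpha + 2 * beta * math.cos(2 * math.pi * k1)
                  + 2 * beta * math.cos(2 * math.pi * k2))

def energy_2d_product(k1, k2, alpha, beta):
    """Product form: alpha + 4*beta*cos(pi*(k1+k2))*cos(pi*(k1-k2))."""
    return alpha + 4 * beta * math.cos(math.pi * (k1 + k2)) * math.cos(math.pi * (k1 - k2))

alpha, beta = -5.0, -1.0                      # hypothetical, arbitrary units
N = 20
grid = [i / N - 0.5 for i in range(N + 1)]    # k values from -1/2 to +1/2

energies = [energy_2d(k1, k2, alpha, beta) for k1 in grid for k2 in grid]
e_min, e_max = min(energies), max(energies)
band_width = e_max - e_min

# The two forms of the energy formula agree at every grid point.
max_mismatch = max(abs(energy_2d(k1, k2, alpha, beta)
                       - energy_2d_product(k1, k2, alpha, beta))
                   for k1 in grid for k2 in grid)
print(e_min, e_max, band_width, max_mismatch)
```

The minimum is found at the zone centre and the maximum at the corners, with a width of 8|beta|.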
Near the bottom of the band ($\beta$ being negative), $k_1$ and $k_2$ are small and expanding the
cosines in (8.18) gives the approximation (get it!)
\[ \epsilon_{\mathbf{k}} = \alpha + 4\beta - 4\pi^2\beta\,(k_1^2 + k_2^2) + \dots \]
which is constant when $k_1^2 + k_2^2$ = constant. The energy contours in k-space are
thus circular near the origin where $k_1 = k_2 = 0$. Remember that, in a free electron
approximation, $\hbar\mathbf{k}$ represents the momentum vector and that $\epsilon_{\mathbf{k}} = (1/2m)\hbar^2|\mathbf{k}|^2$: if we
compare this with the k-dependent part of (8.20) it is clear that $\hbar^2/2m$ must be replaced
by $-4\pi^2\beta$, suggesting that the electron in this crystal orbital behaves as if it had an
effective mass
\[ m_e = -\frac{\hbar^2}{8\pi^2\beta}, \]
which follows on solving $\hbar^2/2m_e = -4\pi^2\beta$ for $m_e$.
This can be confirmed by asking how a wave packet, formed by combining functions $\phi_{\mathbf{k}}$
with k-values close to $k_1, k_2$, travels through the lattice (e.g. when an electric field is
applied). (You may need to read again about wave packets in Book 11.) The result is
also consistent with what we know already (e.g. that tightly-bound inner-shell electrons
are described by wave functions that overlap very little, giving very small (and negative)
$\beta$ values: (8.20) shows they will have a very high effective mass and thus almost zero
On the other hand, near the corners of a Brillouin zone, where $k_1, k_2 = \pm\tfrac{1}{2}$, things are
very different. On putting $k_1 = \tfrac{1}{2} + \delta_1$, $k_2 = \tfrac{1}{2} + \delta_2$, (8.19) gives an energy dependence of
the form (check it!)
\[ \epsilon_{\mathbf{k}} = A + B(\delta_1^2 + \delta_2^2) + \dots \]
showing that the energy contours are again circular, but now around the corner points
with $\epsilon_{\mathbf{k}} = \alpha - 4\beta$. Such states have energies at the top of the band; and the sign of $B$,
as you can show, is negative. This indicates a negative effective mass and shows that a
wave packet formed from states near the top of the band may go 'the wrong way'. In
other words if we accelerate the packet it will be reflected back by the lattice! (Of course
it couldn't go beyond the boundary of the Brillouin zone, because that is a forbidden
The forms of the energy contours are sketched below:
[Figure: energy contours in k-space, labelled $\epsilon = \alpha - 4\beta$ and $\epsilon = \alpha + 4\beta$]
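Both quadratic approximations, near the zone centre and near a zone corner, can be compared with the exact 2-dimensional energy formula. The sketch below uses hypothetical values of alpha and beta and a small displacement d; the function names are illustrative only. The constants used for the corner expansion, A = alpha - 4*beta and B = 4*pi^2*beta, come from expanding the cosines about k1 = k2 = 1/2.

```python
import math

def energy_2d(k1, k2, alpha, beta):
    """Exact nearest-neighbour energy for the 2-dimensional lattice."""
    return alpha + 2 * beta * (math.cos(2 * math.pi * k1) + math.cos(2 * math.pi * k2))

def bottom_approx(k1, k2, alpha, beta):
    """Quadratic expansion about the zone centre."""
    return alpha + 4 * beta - 4 * math.pi ** 2 * beta * (k1 ** 2 + k2 ** 2)

def corner_approx(d1, d2, alpha, beta):
    """Quadratic expansion about the corner: A = alpha - 4*beta, B = 4*pi^2*beta."""
    return alpha - 4 * beta + 4 * math.pi ** 2 * beta * (d1 ** 2 + d2 ** 2)

alpha, beta = -5.0, -1.0   # hypothetical, arbitrary units
d = 0.01                   # small displacement in k-space

err_bottom = abs(energy_2d(d, d, alpha, beta) - bottom_approx(d, d, alpha, beta))
err_corner = abs(energy_2d(0.5 + d, 0.5 + d, alpha, beta) - corner_approx(d, d, alpha, beta))

# Curvatures: positive at the bottom of the band, negative at the top.
curvature_bottom = -4 * math.pi ** 2 * beta
curvature_corner = 4 * math.pi ** 2 * beta
print(err_bottom, err_corner, curvature_bottom, curvature_corner)
```

The opposite signs of the two curvatures are what lead to a positive effective mass at the bottom of the band and a negative one at the top.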
Many of the physical properties of real 3-dimensional crystals, such as the way they
conduct heat and electricity, depend strongly on the highest occupied electronic states;
so it is important to know how the available states are filled. Every crystal orbital can
hold only two electrons, of different spin (Pauli Principle), so with only one monovalent
atom per lattice cell there would be $2G^3$ states available for the $G^3$ valence electrons: the
lowest energy band would be only half full and the next band would be completely empty.
The crystal should be a good conductor of electricity, with electrons easily excited into
upper orbitals of the lower band; and the same would be true with two monovalent atoms
per cell ($4G^3$ states and $2G^3$ electrons). On the other hand, with two divalent atoms per
cell there would be $4G^3$ valence electrons available and these would fill the lower energy
band: in that case conduction would depend on electrons being given enough energy to
jump the band gap.
Some mixed crystals
Even simpler than metallic Lithium is Lithium Hydride, LiH, but the molecule does
not crystallize easily, forming a white powder which reacts violently with water: all very
different from the soft silvery metal! On the other hand, Lithium Fluoride forms nice
regular crystals with the same structure as common salt (Sodium Chloride, NaCl); they
have the face-centred cubic structure, similar to that of the metal itself except that the
Fluorine atoms lie at the centres of the cube faces instead of at the cube centre.
Salts of this kind are formed when the two atoms involved (e.g. Li and F; or Na and Cl)
are found on opposite sides of the Periodic Table, which means their electrons are weakly
bound (left side) or strongly bound (right side). You will remember from Section 6.2 that
when the $\alpha$-values of the corresponding AOs differ greatly the energy-level diagram for a
diatomic molecule looks very different from that in the homonuclear case where the two
atoms are the same: in LiF for example, using A and B to denote Fluorine and Lithium,
the lowest-energy MO (Figure 6.3) has $\epsilon_1 \approx \alpha_A$ while its antibonding partner has the much
higher energy $\epsilon_2 \approx \alpha_B$. The corresponding diatomic MOs, in the same approximation, are
$\phi_1 \approx \chi_A$ and $\phi_2 \approx \chi_B$, as you can confirm (do it!) by estimating the mixing coefficients
in $\phi \approx c_A\chi_A + c_B\chi_B$. In other words, the lowest-energy MO is roughly the same as the
AO on the atom of greater electron affinity, meaning the one with the greater need to attract
electrons. When the MOs are filled with electrons (2 in each MO) the Fluorine will grab
two of the valence electrons, leaving the Lithium with none. The bonding between the
two atoms is then said to be ionic, the Fluorine being pictured as the negative ion F$^-$
and the Lithium as the positive ion Li$^+$. In that way both atoms achieve a closed-shell
electronic structure in which their valence orbitals are all doubly occupied. The Fluorine,
in particular, looks more like the inert gas Neon, at the end of this row in the Periodic Table.
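The estimate of the mixing coefficients can be made concrete with a two-orbital calculation. The following is a sketch only: it neglects overlap, and the values chosen for alpha_A, alpha_B and beta are hypothetical numbers picked to be far apart, not data for LiF.

```python
import math

def two_level(alpha_a, alpha_b, beta):
    """Solve the 2x2 secular problem (overlap neglected) and return
    (e1, e2, c_b/c_a in the lower MO, c_a/c_b in the upper MO)."""
    mean = 0.5 * (alpha_a + alpha_b)
    root = math.sqrt((0.5 * (alpha_a - alpha_b)) ** 2 + beta ** 2)
    e1, e2 = mean - root, mean + root
    # From (alpha_a - e) c_a + beta c_b = 0:
    ratio_lower = (e1 - alpha_a) / beta
    # From beta c_a + (alpha_b - e) c_b = 0:
    ratio_upper = (e2 - alpha_b) / beta
    return e1, e2, ratio_lower, ratio_upper

# Hypothetical parameters (arbitrary units): A strongly bound, B weakly bound.
alpha_a, alpha_b, beta = -18.0, -5.0, -1.0
e1, e2, ratio_lower, ratio_upper = two_level(alpha_a, alpha_b, beta)
print(e1, e2, ratio_lower, ratio_upper)
```

With these numbers the lower MO lies just below alpha_A and contains only a small admixture of the B orbital, while the upper MO is almost pure B with a small admixture of A of opposite sign.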
When the salts form crystals similar considerations apply: the electronic structure of
the crystal may be described by filling the available crystal orbitals, written as linear
combinations of Bloch functions, and the mixing coefficients could be calculated by solving
a set of secular equations at every point in k-space. But in the case of ionic crystals such
difficult calculations can be avoided: looking ahead, we can guess that the Fluorine AO
coefficients in the crystal orbitals will come out big enough to justify a picture in which
the Fluorine has gained an electron, becoming F$^-$, while the Lithium has in effect lost
its valence electron to become Li$^+$. In this way we come back to the classical picture of
ionic crystals, put forward long before the development of quantum mechanics!
The unit cell in the LiF crystal, well established experimentally by X-ray crystallography,
has the form shown below.
repulsion between two neighbouring ions has the form $E_{\rm rep} = B\exp(-r/\rho)$ where $B$ and
$\rho$ are constants and $r$ is the distance between the ion centres. Usually the constants are
given empirical values so as to reproduce experimental data such as the unit cell distances
and the total energy of the crystal. Even then the calculations are not simple, because
the crystal contains millions of ions and care must be taken with convergence as more
and more ions are included; but they are by now standard and give a good account of
crystal properties. So let's now look at something really new!
New materials
In 2010 the Nobel Prize in Physics was awarded jointly to two Russian-born physicists,
Andre Geim and Konstantin Novoselov, for their groundbreaking experimental work on
the two-dimensional material graphene. Since then, thousands of scientific papers on
this material and its remarkable properties have been published in all the world's leading
journals. Graphene seems likely to cause a far bigger revolution in Science and Technology
than that made by the discovery of plastics, and yet all the underlying theory was known
more than 50 years ago and can be understood on the basis of what you've done so far.
A crystal of solid graphite, which contains only Carbon atoms lying on a 3-dimensional
lattice, consists of 2-dimensional sheets or layers, lying one on top of another. Each
layer contains Carbons that are strongly bonded together, lying at the corners of hexagons
as in the benzene molecule, while the bonding between different layers is comparatively
weak. Such a single layer forms the 2-dimensional crystal graphene, whose unit cell is
shown in Figure 8.13 (left) along with that for the corresponding k-space lattice (right).
Because graphene is so important it's worth showing how easy it is to construct all we
need from very first principles.
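Example 8.6 A bit of geometry: the hexagonal lattice
Of course you've been using simple vector algebra ever since Book 2, usually with a Cartesian basis in
which a vector $\mathbf{v} = v_x\mathbf{i} + v_y\mathbf{j} + v_z\mathbf{k}$ is expressed in terms of its components relative to orthogonal unit
vectors $\mathbf{i}, \mathbf{j}, \mathbf{k}$. So this is the first time you meet something new: the basis vectors we need in dealing with
the graphene lattice are oblique, though they can be expressed in terms of Cartesian unit vectors. Thus,
in crystal space, Figure 8.13 (left), we can choose $\mathbf{i}, \mathbf{j}$ as unit vectors pointing along AB and perpendicular
to it (upwards). We then have
\[ \mathbf{a}_1 = \tfrac{1}{2}\sqrt{3}\,\mathbf{i} - \tfrac{1}{2}\mathbf{j}, \qquad \mathbf{a}_2 = \tfrac{1}{2}\sqrt{3}\,\mathbf{i} + \tfrac{1}{2}\mathbf{j}. \]
In reciprocal space (i.e. without the $2\pi$ factors), Figure 8.13 (right), we can define
\[ \mathbf{b}_2 = \tfrac{1}{2}\mathbf{i} + \tfrac{1}{2}\sqrt{3}\,\mathbf{j}, \qquad \mathbf{b}_1 = \tfrac{1}{2}\mathbf{i} - \tfrac{1}{2}\sqrt{3}\,\mathbf{j}, \]
where $\mathbf{b}_2$ and $\mathbf{b}_1$ are respectively (note the order) perpendicular to $\mathbf{a}_1$ and $\mathbf{a}_2$. Thus, $\mathbf{a}_1\cdot\mathbf{b}_2 = \mathbf{a}_2\cdot\mathbf{b}_1 = 0$.
On the other hand $\mathbf{a}_1\cdot\mathbf{b}_1 = \tfrac{1}{4}\sqrt{3} + \tfrac{1}{4}\sqrt{3} = \tfrac{1}{2}\sqrt{3}$ and $\mathbf{a}_2\cdot\mathbf{b}_2$ has the same value. As a result, any pair of
vectors $\mathbf{u} = u_1\mathbf{a}_1 + u_2\mathbf{a}_2$ (in a-space) and $\mathbf{v} = v_1\mathbf{b}_1 + v_2\mathbf{b}_2$ (in b-space) will have a scalar product
\[ \mathbf{u}\cdot\mathbf{v} = \tfrac{1}{2}\sqrt{3}\,(u_1 v_1 + u_2 v_2), \]
since the other terms are zero. We'd like to have a simpler result, like that for two vectors in an
ordinary (rectangular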
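Let's collect the two sets of basis vectors obtained in Example 8.6: the a-set define the
real (crystal) space, while the b-vectors define the reciprocal space, which is set up
only for mathematical convenience!
\[ \mathbf{a}_1 = \tfrac{1}{2}\sqrt{3}\,\mathbf{i} - \tfrac{1}{2}\mathbf{j}, \qquad \mathbf{a}_2 = \tfrac{1}{2}\sqrt{3}\,\mathbf{i} + \tfrac{1}{2}\mathbf{j}, \]
\[ \mathbf{b}_1 = (\sqrt{3})^{-1}\mathbf{i} - \mathbf{j}, \qquad \mathbf{b}_2 = (\sqrt{3})^{-1}\mathbf{i} + \mathbf{j}. \]
These give
\[ \mathbf{a}_1\cdot\mathbf{b}_1 = \mathbf{a}_2\cdot\mathbf{b}_2 = 1 \]
but we still have $\mathbf{a}_1\cdot\mathbf{b}_2 = \mathbf{a}_2\cdot\mathbf{b}_1 = 0$. So for any two vectors, $\mathbf{u}, \mathbf{v}$, the first expressed in
crystal space and the second in reciprocal space, we have
\[ (u_1\mathbf{a}_1 + u_2\mathbf{a}_2)\cdot(v_1\mathbf{b}_1 + v_2\mathbf{b}_2) = (u_1 v_1 + u_2 v_2) \]
just as if the two vectors belonged to an ordinary Cartesian space.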
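These properties of the two sets of basis vectors are easy to verify numerically. The sketch below stores each vector as a pair of Cartesian components; the helper names and the coefficients chosen for u and v are arbitrary illustrations.

```python
import math

S3 = math.sqrt(3.0)

# Cartesian (i, j) components of the crystal-space and reciprocal-space bases.
a1, a2 = (S3 / 2, -0.5), (S3 / 2, 0.5)
b1, b2 = (1 / S3, -1.0), (1 / S3, 1.0)

def dot(u, v):
    """Ordinary scalar product of two 2-component vectors."""
    return u[0] * v[0] + u[1] * v[1]

def combine(c1, v1, c2, v2):
    """Linear combination c1*v1 + c2*v2."""
    return (c1 * v1[0] + c2 * v2[0], c1 * v1[1] + c2 * v2[1])

# Arbitrary test vectors: u in crystal space, v in reciprocal space.
u1, u2, v1, v2 = 0.3, -1.2, 2.0, 0.7
u = combine(u1, a1, u2, a2)
v = combine(v1, b1, v2, b2)

print(dot(a1, b1), dot(a2, b2), dot(a1, b2), dot(a2, b1))
print(dot(u, v), u1 * v1 + u2 * v2)
```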
Now we know that the vectors set up in (8.22) have the properties we need, we can look
again at Figure 8.13, which shows how they appear in the graphene crystal space and
corresponding k-space lattices:
Figure 8.13 Crystal lattices and some unit cells (see text)
The left-hand side of Figure 8.13 shows part of the lattice in crystal space; one cell, the
unit cell, contains Carbon atoms at A and B and is lightly shaded. The basis vectors a1
and a2 are shown as bold arrows. The right-hand side shows part of the corresponding
lattice in k-space: the basis vectors $2\pi\mathbf{b}_1$ and $2\pi\mathbf{b}_2$ are each perpendicular (respectively)
to a2 , a1 and define a unit cell (lightly shaded) in k-space. The central zone in k-space
is hexagonal (shown in darker shading) and is made up from 12 triangular pieces, one of
which is shown, all equivalent under symmetry operations. You can imagine the 12 pieces
come from the unit cell by cutting it into parts and sliding them into new positions to
fill the hexagon.
What we want to do next is to calculate the energy as a function of the coordinates
$(k_1, k_2)$ in k-space; then we'll be able to sketch the energy contours within the unit cell
or the equivalent central zone.
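The matrix elements are between Bloch functions, namely $\phi_{A,\mathbf{k}}$, $\phi_{B,\mathbf{k}}$, where
for example
\[ \phi_{A,\mathbf{k}} = \frac{1}{G} \sum_m \exp[i\mathbf{k}\cdot(\mathbf{r}_A + \mathbf{R}_m)]\,\chi_{A,\mathbf{R}_m} \]
with an identical result for $h_{BB}$. (Note that $\langle\chi_A|h|\chi_A\rangle$ is for the A-atom in
the unit cell at the origin and that the summation is over $G^2$ equal lattice
cells.) These diagonal matrix elements are independent of $\mathbf{k}$:
\[ h_{AA} = \langle\chi_A|h|\chi_A\rangle, \qquad h_{BB} = \langle\chi_B|h|\chi_B\rangle. \]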
The summation in the last equation can be broken into terms for A-atoms
and B-atoms in the same or adjacent cells (nearest neighbours) and then in
more distant cells (second and higher order neighbours). Equation (8.26)
may thus be written
\[ h_{AB}(\mathbf{k}) = h_1\sigma_1(\mathbf{k}) + h_2\sigma_2(\mathbf{k}) + \dots, \]
where the terms rapidly get smaller as the A- and B-atoms become more
distant. Here we'll deal only with the first approximation, evaluating
$h_1\sigma_1(\mathbf{k})$ for nearest-neighbour contributions to $\sigma_1(\mathbf{k})$. We imagine atom A
fixed and sum over the nearest B-type atoms; these will be B, in the same
cell as A, and atoms at points B$'$ and B$''$ in adjacent cells to the left, one
lower for B$'$ and one higher for B$''$. (Study carefully Figure 8.13, where you
will see B$'$ is at the lower right corner of the next hexagon, while B$''$ is
at its upper right corner.) The vector positions of the three B-atoms are
given in terms of the Cartesian unit vectors $\mathbf{i}, \mathbf{j}$, by
\[ \mathbf{r}_B = (2l)\,\mathbf{i} + 0\,\mathbf{j}, \qquad \mathbf{r}_{B'} = \tfrac{1}{2}l\,\mathbf{i} - \tfrac{1}{2}\sqrt{3}\,l\,\mathbf{j}, \qquad \mathbf{r}_{B''} = \tfrac{1}{2}l\,\mathbf{i} + \tfrac{1}{2}\sqrt{3}\,l\,\mathbf{j}, \]
rB = 12 l i 21 j,
rB = 21 li + 21 j,
B = (1/3)a1 (2/3)a2 ,
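where the upper sign gives the lower-energy solution (that of a bonding
orbital), since $\beta$ is a negative quantity.
The squared modulus of the structure sum $\sigma(\mathbf{k})$ in the energy expression
(8.29) has the form
\[ \sigma(\mathbf{k})\,\sigma^{*}(\mathbf{k}) = (e^{i\theta_1} + e^{i\theta_2} + e^{i\theta_3})\,(e^{-i\theta_1} + e^{-i\theta_2} + e^{-i\theta_3}), \]
where
\[ \theta_1 = (2\pi/3)(k_1 + k_2), \qquad \theta_2 = (2\pi/3)(k_1 - 2k_2), \qquad \theta_3 = (2\pi/3)(-2k_1 + k_2), \]
and hence
\[ \sigma(\mathbf{k})\,\sigma^{*}(\mathbf{k}) = 3 + 2\cos 2\pi k_1 + 2\cos 2\pi k_2 + 2\cos 2\pi(k_1 - k_2). \]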
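The three phases can be checked against the geometry. The sketch below assumes that the vectors from atom A to its three nearest B neighbours are (a1 + a2)/3, (a1 - 2 a2)/3 and (-2 a1 + a2)/3, with the basis vectors taken from (8.22); these neighbour vectors are an assumption made here, chosen because they reproduce the phases quoted above. It also compares the squared modulus with the closed form obtained by multiplying out the two brackets. All function names are illustrative.

```python
import cmath
import math

S3 = math.sqrt(3.0)
A1, A2 = (S3 / 2, -0.5), (S3 / 2, 0.5)     # crystal-space basis
B1, B2 = (1 / S3, -1.0), (1 / S3, 1.0)     # reciprocal-space basis

def lin(c1, v1, c2, v2):
    return (c1 * v1[0] + c2 * v2[0], c1 * v1[1] + c2 * v2[1])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

# Assumed vectors from atom A to its three nearest B neighbours.
RHO = [lin(1 / 3, A1, 1 / 3, A2), lin(1 / 3, A1, -2 / 3, A2), lin(-2 / 3, A1, 1 / 3, A2)]

def phases_from_vectors(k1, k2):
    """theta_j = k . rho_j with k = 2*pi*(k1*b1 + k2*b2)."""
    k = lin(2 * math.pi * k1, B1, 2 * math.pi * k2, B2)
    return [dot(k, rho) for rho in RHO]

def phases_from_formula(k1, k2):
    c = 2 * math.pi / 3
    return [c * (k1 + k2), c * (k1 - 2 * k2), c * (-2 * k1 + k2)]

def sigma_squared(k1, k2):
    """|sigma|^2 evaluated directly from the three phase factors."""
    return abs(sum(cmath.exp(1j * t) for t in phases_from_formula(k1, k2))) ** 2

def sigma_squared_closed(k1, k2):
    two_pi = 2 * math.pi
    return (3 + 2 * math.cos(two_pi * k1) + 2 * math.cos(two_pi * k2)
              + 2 * math.cos(two_pi * (k1 - k2)))

k1, k2 = 0.17, -0.42   # an arbitrary point in k-space
print(phases_from_vectors(k1, k2), phases_from_formula(k1, k2))
print(sigma_squared(k1, k2), sigma_squared_closed(k1, k2))
```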
To get the coordinates (k1 , k2 ) of points in k-space we first draw the hexagonal Brillouin zone, indicating the basis vectors b1 , b2 . Note that k1 , k2 are
the coefficients of b1 , b2 in the k-vector. The result is shown in Figure 8.14
below (next page), where the end points of some particular k-vectors are
marked with bold dots. The other diamond-shaped areas are the adjacent
cells in k-space.
The higher of the two bold dots is a K-point (corner point of the hexagonal filled zone), while the lower is an M-point (mid-point of a side). To
calculate the corresponding energies we must change to reciprocal-space
coordinates, k1 , k2 , which go with the basis vectors b1 , b2 .
In fact, the coefficients of $\mathbf{b}_1$ and $\mathbf{b}_2$ are, apart from the missing $2\pi$ factor,
the components of the properly scaled k-vector, denoted by $k_1$ and $k_2$.
The following picture shows the central zone in k-space and indicates, with
bold dots, two of the most important points (a K-point and an M-point).
on the hexagon having the same energy as at the M-point. After that, the
energy approaches the value $\epsilon = \alpha$, the highest energy level in the filled
zone, but this is found only at the K-points and close to them where the
contours again become roughly circular. At these points, something very
unusual happens: the energy surface is just touched by the lowest-energy
points of the next energy surface, whose orbitals have energies going from
$\epsilon = \alpha$ up to the maximum $\epsilon = \alpha - 3\beta$. At all other points there is a gap
between the lower and upper surfaces.
It is this strange fact that gives graphene its unique properties.
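The touching of the two energy surfaces can be seen by evaluating the nearest-neighbour band energies, taken here as alpha plus or minus beta times the modulus of the structure sum, at a few special points. This is a sketch under stated assumptions: alpha is set to zero and beta to -1 for convenience, and the coordinates (1/2, 0) for an M-point and (1/3, -1/3) for a K-point are choices made here from the geometry of the hexagonal zone, not values quoted in the text.

```python
import cmath
import math

def sigma_abs(k1, k2):
    """Modulus of the nearest-neighbour structure sum at (k1, k2)."""
    c = 2 * math.pi / 3
    thetas = (c * (k1 + k2), c * (k1 - 2 * k2), c * (-2 * k1 + k2))
    return abs(sum(cmath.exp(1j * t) for t in thetas))

def band_pair(k1, k2, alpha, beta):
    """(lower, upper) band energies when beta is negative."""
    s = sigma_abs(k1, k2)
    return alpha + beta * s, alpha - beta * s

alpha, beta = 0.0, -1.0   # hypothetical: energies measured from alpha, in units of |beta|

# Assumed coordinates of the zone centre, an M-point and a K-point.
points = {"Gamma": (0.0, 0.0), "M": (0.5, 0.0), "K": (1 / 3, -1 / 3)}

gaps = {}
for name, (k1, k2) in points.items():
    lower, upper = band_pair(k1, k2, alpha, beta)
    gaps[name] = upper - lower
    print(name, lower, upper, gaps[name])
```

The gap is 6|beta| at the zone centre, 2|beta| at the M-point, and vanishes (to rounding error) at the K-point.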
The $\pi$ electrons serve mainly to stiffen the sigma-bonded framework of
Carbon atoms and to give the material the unique properties that arise
from the touching of the two energy surfaces. The Carbon-Carbon bonds in
general can be difficult to break, especially when they form a 3-dimensional
network that can't be pulled apart in any direction without breaking very
many bonds. This is the case in crystals of diamond, where every Carbon
forms tetrahedral bonds with its four nearest neighbours, as in Figure 7.7.
(Remember that diamonds, which contain only Carbon atoms, are used
in cutting steel!) But in graphite the Carbons have the rare property of
forming separate layers, held together only by very weak bonds which
allow them to slide over one another, or to be peeled off as sheets of graphene.
The great strength of graphene sheets is often called the 'Cat's Cradle'
property, because a single sheet (one atom thick and weighing almost
nothing!) would support the weight of a sleeping cat!
More useful properties arise from the limited contact (in k-space) between
the filled and empty electronic bands. When the lowest-energy band is
filled and separated from the next band above it by an energy gap greater
than $(3/2)kT$ (which is the average energy of a particle in equilibrium with
its surroundings at temperature $T$ K, as you may remember from Book
5) an electron with energy at the top of the filled band is unable to jump
across the gap into an orbital of the empty band, where it will be able
to move freely. But at the K-points in graphene the energy gap is zero
and some electrons will be found in the conduction band where they are
free to conduct electricity. In fact, graphene is a perfect semiconductor,