Solomon: Algebra 2 Notes

This document provides notes for an abstract algebra course covering linear algebra, group theory, and Galois theory. The notes begin with a review of basic linear algebra concepts like vector spaces, subspaces, bases, and isomorphisms between finite-dimensional vector spaces and Rn. Upcoming chapters will cover linear operators, inner products, permutation groups, the symmetry groups of Platonic solids, counting formulas, finite 3D rotation groups, Galois fields, symmetric polynomials, and Galois' theory of equations.


MATH 4581:

ABSTRACT ALGEBRA II

NOTES by PROFESSOR RONALD SOLOMON

Copyright May 2013



Table of Contents

Chapter 0: Introduction
Chapter 1: Review of Linear Algebra
Chapter 2: Linear Operators
Chapter 3: Inner Products and Orthogonal Matrices
Chapter 4: Permutations, Orbits, and Lagrange’s Theorem
Chapter 5: The Platonic Solids and their Symmetries
Chapter 6: The Orbit Counting Formula
Chapter 7: Finite Subgroups of SO(3)
Chapter 8: Imaginaries and Galois Fields
Chapter 9: Symmetric Polynomials & the Fundamental Theorem of Algebra
Chapter 10: The Cubic and Quartic Equations Revisited
Chapter 11: Galois’ Theory of Equations
Chapter 12: The Galois Correspondence
Index

Chapter 0: Introduction

Welcome back! This semester we will go deeper in our investigation of group
theory and its applications to geometry and to the theory of equations.

The first high point of the course will be the discussion of the Platonic solids and
their symmetry groups, culminating in the classification of all finite groups of
rotations of $\mathbb{R}^3$, a theorem of great importance to crystallographers, indeed
essentially discovered by the French crystallographer, Auguste Bravais, around 1837.

The second high point of the course will be the theory of symmetric polynomials
and its application to give Euler’s proof of the Fundamental Theorem of Algebra.

The third high point of the course will be the description of Galois’ fundamental
work on the theory of equations, including the famous Galois Correspondence
Theorem, discovered by Évariste Galois in 1832.

Before reaching these goals we need to review and reinforce our knowledge of
some basic linear algebra, especially the theory of linear isometries of $\mathbb{R}^3$. This will
be covered in the first three chapters of the notes.

Next, we will need to review and expand our understanding of permutation
groups, including Cauchy’s efficient methods for computing with permutations.
This material will culminate with a proof of Lagrange’s Orbit-Stabilizer Theorem,
the first and still one of the most important theorems in all of group theory. We will
apply this theorem to give Cauchy’s proof of the Orbit Counting Formula,
sometimes incorrectly called Burnside’s Counting Formula. This is the most important
tool for using symmetry to facilitate difficult counting problems. Its applications
were extensively developed by Pólya in the early 20th century. This material, along
with an introduction to the Platonic solids and their symmetry groups, will occupy
Chapters 4 through 6. We will then be equipped to do the classification of the finite
3-dimensional rotation groups, in Chapter 7.

Next we will shift to the theory of equations, beginning in Chapter 8. In Chapter
9, we will develop the theory of symmetric functions and prove the Fundamental
Theorem of Algebra. Then, after reviewing the theory of the cubic and quartic
polynomials, we will move on to a discussion of Galois’ theory of equations in
Chapters 11 and 12.

If time permits during the semester (or if not, after the semester ends), you should
study Chapter 13, which takes a deeper look at the arithmetic of the Gaussian
integers and uses this to prove Fermat’s Two Squares Theorem.

Chapter 1: Review of Linear Algebra


Linear algebra is the most extensively applied area in all of algebra, and indeed
perhaps in all of advanced mathematics. Its only competitor is differential
equations. In the next three chapters we review and extend much of the basic linear
algebra which you learned in earlier math courses. This first chapter mostly
reproduces material from Chapter 14 of the Math 4580 text. But this is so important,
it bears repetition and review. Also, note that Lemma 1.13 and Theorem 1.14 are
new.
In subsequent chapters, we shall stick to real vector spaces, but for later use in
our study of field extensions, we do this preliminary material over an arbitrary field
of scalars.
Definition 1.1. Let $F$ be a field. An $F$-vector space is a set $V$ of objects, called
vectors, for which two operations are defined:

Vector Addition: To each pair $(v, w)$ of vectors in $V$, there is a vector $v + w \in V$.

and

Scalar Multiplication: To each pair $(a, v)$, $a \in F$, $v \in V$, there is a vector $a \cdot v \in V$.

The following rules are satisfied:

(1) $v + w = w + v$ for all $v, w \in V$;
(2) $u + (v + w) = (u + v) + w$ for all $u, v, w \in V$;
(3) There exists a zero vector $0$ such that $v + 0 = v$ for all $v \in V$;
(4) For each vector $v \in V$, there exists a vector $-v \in V$ such that $v + (-v) = 0$;
(5) $a \cdot (v + w) = a \cdot v + a \cdot w$ for all $a \in F$, $v, w \in V$;
(6) $(a + b) \cdot v = a \cdot v + b \cdot v$ for all $a, b \in F$, $v \in V$;
(7) $(ab) \cdot v = a \cdot (b \cdot v)$ for all $a, b \in F$, $v \in V$; and
(8) $1 \cdot v = v$ for all $v \in V$.

Examples of Vector Spaces

(1) $F^n$ is the vector space of all $n$-tuples $(a_1, \dots, a_n)$ with $a_i \in F$, with the
operations of position-wise addition and scalar multiplication. This is just
the obvious generalization of $\mathbb{R}^n$, considered algebraically;
(2) $\mathcal{F}$ is the set of all real-valued functions $f : \mathbb{R} \to \mathbb{R}$, with pointwise addition
and scalar multiplication, i.e.,
$$(f + g)(x) = f(x) + g(x) \quad \text{for all } x \in \mathbb{R},$$
and
$$(a \cdot f)(x) = a \cdot f(x) \quad \text{for all } a \in \mathbb{R},\ x \in \mathbb{R};$$
(3) $\mathcal{C}$ is the set of all everywhere continuous functions $f : \mathbb{R} \to \mathbb{R}$;
(4) $\mathcal{D}$ is the set of all everywhere differentiable functions $f : \mathbb{R} \to \mathbb{R}$;
(5) $\mathcal{P}$ is the set of all polynomial functions $f : \mathbb{R} \to \mathbb{R}$.

A subset $W$ of an $F$-vector space $V$ which is itself an $F$-vector space under
the same operations is called a subspace of $V$. In the roster of examples (2)–(5)
above, each space is a subspace of the one that precedes it. Since the larger space
$V$ satisfies all of the listed axioms, it suffices, in order to prove that a non-empty
subset $W$ is a subspace, to verify:
(1) $w + w_1 \in W$ for all $w, w_1 \in W$; and
(2) $a \cdot w \in W$ for all $a \in F$ and $w \in W$.
All the examples given above, except for Example 1, are infinite-dimensional
vector spaces. We shall be interested primarily in finite-dimensional vector spaces.
Definition 1.2. Let $\{v_1, v_2, \dots, v_n\}$ be a set of vectors in $V$. Any vector
$v = c_1 \cdot v_1 + \dots + c_n \cdot v_n$, with $c_i \in F$ for all $i$, is called a linear combination of the
vectors $v_1, \dots, v_n$. The set of all linear combinations of these vectors is called the
span of (or the space spanned by) $\{v_1, \dots, v_n\}$.
Lemma 1.3. $\mathrm{Span}(\{v_1, \dots, v_n\})$ is a subspace of $V$.
Proof. We have
$$(a_1 \cdot v_1 + \dots + a_n \cdot v_n) + (b_1 \cdot v_1 + \dots + b_n \cdot v_n) = (a_1 + b_1) \cdot v_1 + \dots + (a_n + b_n) \cdot v_n$$
and
$$c \cdot (a_1 \cdot v_1 + \dots + a_n \cdot v_n) = (ca_1) \cdot v_1 + \dots + (ca_n) \cdot v_n.$$
Thus the span is closed under vector addition and scalar multiplication.

Definition 1.4. We say that $V$ is a finite-dimensional vector space if there exists
a finite set $\{v_1, \dots, v_n\}$ of vectors which spans $V$.
Lemma 1.5. Suppose $V$ is a finite-dimensional vector space. Let $B = \{v_1, \dots, v_n\}$
be a minimal spanning set for $V$. Then every vector $v \in V$ is uniquely
expressible as a linear combination of the vectors in $B$.
Proof. Let $v \in V$. By definition of a spanning set, $v$ is a linear combination of the
vectors in $B$. Suppose the expression is not unique, i.e.,
$$v = a_1 \cdot v_1 + \dots + a_n \cdot v_n = b_1 \cdot v_1 + \dots + b_n \cdot v_n,$$
with some $a_i \neq b_i$. Rearranging, we may assume that $a_n \neq b_n$. Then
$$v_n = \frac{1}{b_n - a_n}\bigl((a_1 - b_1) \cdot v_1 + \dots + (a_{n-1} - b_{n-1}) \cdot v_{n-1}\bigr).$$
But then $\{v_1, \dots, v_{n-1}\}$ is a spanning set for $V$, properly contained in $B$, contradicting
the minimal choice of $B$. Hence the expression for each vector $v$ is unique.

We call any minimal spanning set for $V$ a basis for $V$.
This is the important point: If $B = \{v_1, \dots, v_n\}$ is a basis for $V$, then every
vector $v$ has a unique set $(a_1, a_2, \dots, a_n)$ of coordinates with respect to the basis
$B$.

Definition 1.6. Two $F$-vector spaces $V$ and $W$ are isomorphic if there is a
bijective function $f : V \to W$ such that
(1) $f(v + v') = f(v) + f(v')$ for all $v, v' \in V$; and
(2) $f(a \cdot v) = a \cdot f(v)$ for all $a \in F$, $v \in V$.
In other words, $f : V \to W$ is an invertible linear transformation. We write
$V \cong W$.
Theorem 1.7. Let $V$ be a finite-dimensional $F$-vector space with a basis
$B = \{v_1, \dots, v_n\}$. Then $V \cong F^n$.
Proof. For each $v \in V$, let
$$v = a_1 \cdot v_1 + \dots + a_n \cdot v_n$$
be the unique expression for $v$ as a linear combination of the vectors in $B$. Define
$f : F^n \to V$ by
$$f(a_1, \dots, a_n) = a_1 \cdot v_1 + \dots + a_n \cdot v_n.$$
Uniqueness of expression implies that this function is one-to-one. Since $B$ is a basis
for $V$, the map is surjective. Hence $f$ is a bijective function. It is easy to check
that $f$ is a linear transformation.

Thus, every finite-dimensional vector space can be coordinatized and thereby
identified with $F^n$. The identification is not at all “natural”. Each choice of basis
gives a different coordinatization. However, as we shall see, the dimension of $V$ is
a fixed number, independent of the choice of coordinatization.
Example 1.8: Let $\mathcal{P}_n$ be the set of all polynomials (with real coefficients) of
degree at most $n$. Clearly, each polynomial $p(x)$ is uniquely expressible as
$$p(x) = a_0 \cdot 1 + a_1 \cdot x + \dots + a_n \cdot x^n,$$
for some $a_0, a_1, \dots, a_n \in \mathbb{R}$. In other words, $\{1, x, x^2, \dots, x^n\}$ is a basis for $\mathcal{P}_n$, and
the map
$$p(x) \mapsto (a_0, a_1, \dots, a_n)$$
defines an isomorphism between $\mathcal{P}_n$ and $\mathbb{R}^{n+1}$.
But, again, the basis $\{1, x, \dots, x^n\}$, though an obvious one, is by no means the
only one. Indeed the whole subject of orthogonal polynomials investigates other
choices of basis (such as Legendre polynomials) which are more useful for certain
applications. We will not pursue this theme here.
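
The coordinatization in Example 1.8 can be tried out concretely. The following is a minimal Python sketch, not part of the original notes: a polynomial in $\mathcal{P}_n$ is stored as its coordinate vector with respect to $\{1, x, \dots, x^n\}$, and the helper names `evaluate`, `add` and `scale` are hypothetical. It checks at a few sample points that adding and scaling coordinate vectors agrees with adding and scaling the polynomial functions.

```python
from fractions import Fraction

def evaluate(coeffs, x):
    """Evaluate a0 + a1*x + ... + an*x^n from its coordinate vector."""
    return sum(a * x**k for k, a in enumerate(coeffs))

def add(p, q):
    """Position-wise sum of two coordinate vectors of the same length."""
    return [a + b for a, b in zip(p, q)]

def scale(c, p):
    """Scalar multiple of a coordinate vector."""
    return [c * a for a in p]

p = [1, 0, 3]     # the polynomial 1 + 3x^2 in P_2
q = [2, -1, 4]    # the polynomial 2 - x + 4x^2 in P_2
s = add(p, q)     # coordinates of p + q

# The coordinate map respects both operations at every sample point.
for x in [Fraction(1, 2), 2, -3]:
    assert evaluate(s, x) == evaluate(p, x) + evaluate(q, x)
    assert evaluate(scale(5, p), x) == 5 * evaluate(p, x)
```

Exact rational arithmetic is used here only so that the equalities can be tested without rounding concerns.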
Definition 1.9. A set $S = \{v_1, \dots, v_k\}$ of vectors is linearly independent if the
only linear combination of $v_1, \dots, v_k$ which equals the zero vector is the “trivial” one:
$$0 \cdot v_1 + \dots + 0 \cdot v_k = 0.$$
In other words, $0$ is uniquely expressible as a linear combination of the vectors in
$S$.

Lemma 1.10. If $S$ is a linearly independent set of vectors in $F^m$, then $|S| \le m$.
Proof. Let $\{v_1, \dots, v_n\}$ be a subset of $F^m$. Write
$$v_i = (a_{1i}, a_{2i}, \dots, a_{mi})$$
for each $i$. Then $c_1 v_1 + \dots + c_n v_n = 0$ if and only if $(c_1, c_2, \dots, c_n)$ is a solution of
the homogeneous system of simultaneous linear equations $Ax = 0$, where
$$A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}.$$
If $n > m$, there is a non-zero solution $(c_1, c_2, \dots, c_n)$, and so $\{v_1, \dots, v_n\}$ is not linearly
independent.
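
The key step in the proof of Lemma 1.10 is that a homogeneous system with more unknowns than equations has a non-zero solution. The sketch below, which is an illustration and not the method of the notes, finds such a solution by Gauss-Jordan elimination over the rationals. The function names `rref` and `nontrivial_solution` are hypothetical, and the columns of `A` play the role of the vectors $v_1, \dots, v_n$.

```python
from fractions import Fraction

def rref(A):
    """Return (R, pivots): the reduced row echelon form of A and its pivot columns."""
    R = [[Fraction(x) for x in row] for row in A]
    m, n = len(R), len(R[0])
    pivots = []
    r = 0
    for c in range(n):
        # Look for a row at or below r with a non-zero entry in column c.
        p = next((i for i in range(r, m) if R[i][c] != 0), None)
        if p is None:
            continue
        R[r], R[p] = R[p], R[r]
        piv = R[r][c]
        R[r] = [x / piv for x in R[r]]
        for i in range(m):
            if i != r and R[i][c] != 0:
                f = R[i][c]
                R[i] = [x - f * y for x, y in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    return R, pivots

def nontrivial_solution(A):
    """For an m x n matrix with n > m, return a non-zero x with A x = 0."""
    R, pivots = rref(A)
    n = len(A[0])
    # With n > m there are at most m pivots, so some column is free.
    free = next(c for c in range(n) if c not in pivots)
    x = [Fraction(0)] * n
    x[free] = Fraction(1)
    for row, c in enumerate(pivots):
        x[c] = -R[row][free]
    return x

# Three vectors in F^2, written as the columns of A: (1,2), (3,4), (5,6).
A = [[1, 3, 5],
     [2, 4, 6]]
x = nontrivial_solution(A)
assert any(c != 0 for c in x)
assert all(sum(a * c for a, c in zip(row, x)) == 0 for row in A)
```

The returned coefficients give an explicit non-trivial linear combination of the three columns equal to zero, which is exactly the linear dependence the lemma asserts.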

Lemma 1.11. Let $f : V \to W$ be an isomorphism of vector spaces. Then $f$ maps
linearly independent sets to linearly independent sets, and $f$ maps spanning sets to
spanning sets.
Proof. Let $\{v_1, \dots, v_m\}$ be a linearly independent subset of $V$. If
$$c_1 \cdot f(v_1) + \dots + c_m \cdot f(v_m) = 0,$$
then
$$f(c_1 \cdot v_1 + \dots + c_m \cdot v_m) = 0,$$
and then, since $f$ is injective,
$$c_1 \cdot v_1 + \dots + c_m \cdot v_m = 0.$$
Hence $c_1 = \dots = c_m = 0$, and so $\{f(v_1), \dots, f(v_m)\}$ is a linearly independent
subset of $W$.
Suppose $\{v_1, \dots, v_m\}$ spans $V$. Let $w \in W$. Since $f$ is surjective, there exists
$v \in V$ with $f(v) = w$. Write
$$v = a_1 \cdot v_1 + \dots + a_m \cdot v_m.$$
Then
$$w = f(v) = a_1 \cdot f(v_1) + \dots + a_m \cdot f(v_m).$$
Hence $\{f(v_1), \dots, f(v_m)\}$ is a spanning set for $W$.

Theorem 1.12. The following conclusions hold:
(1) If $F^n \cong F^m$, then $m = n$;
(2) If $V$ is a finite-dimensional vector space, then every basis $B$ of $V$ has the
same cardinality $n$. (We call this number the dimension of $V$, $\dim(V)$.);
and
(3) If $\dim(V) = n$, then every linearly independent subset of $V$ has cardinality
at most $n$.

Proof. Suppose that $m < n$ and $f : F^n \to F^m$ is an isomorphism. Clearly, the
standard basis $B = \{e_1 = (1, 0, \dots, 0), \dots, e_n = (0, 0, \dots, 1)\}$ is a linearly
independent set of vectors in $F^n$. Then, by Lemma 1.11, $f(B) = \{f(e_1), \dots, f(e_n)\}$ is a
linearly independent subset of $F^m$ of cardinality $n > m$, contrary to Lemma 1.10.
This proves (1).
Now if $V$ is a finite-dimensional vector space with a basis $B$ of cardinality $n$,
then $V \cong F^n$. Since isomorphism is a transitive relation, it follows by (1) that $n$ is
uniquely determined, proving (2). Finally (3) follows by the same argument as in
(1): If $S$ is a linearly independent subset of $V$ and $f : V \to F^n$ is an isomorphism,
then $f(S)$ is a linearly independent subset of $F^n$, whence $|S| \le n$.

We have seen that every minimal spanning set for $V$ is a basis. Here is another
way to recognize a set as a basis for $V$.
Lemma 1.13. Let $V$ be a finite-dimensional vector space. If $B$ is a maximal
linearly independent subset of $V$, then $B$ is a basis for $V$.
Proof. Let $B = \{v_1, \dots, v_n\}$. Let $v \in V \setminus B$. Then the set $B \cup \{v\}$ is not linearly
independent. Hence there is a set of scalars, not all 0, such that
$$c_1 \cdot v_1 + \dots + c_n \cdot v_n + c \cdot v = 0.$$
If $c = 0$, then $B$ is not a linearly independent set, contrary to assumption. Hence
$c \neq 0$ and we can solve for $v$:
$$v = -\frac{1}{c}(c_1 \cdot v_1 + \dots + c_n \cdot v_n).$$
Hence $B$ is a spanning set for $V$. Suppose that
$$a_1 \cdot v_1 + \dots + a_n \cdot v_n = b_1 \cdot v_1 + \dots + b_n \cdot v_n.$$
Then
$$(a_1 - b_1) \cdot v_1 + \dots + (a_n - b_n) \cdot v_n = 0.$$
Since $B$ is a linearly independent set, $a_i = b_i$ for all $i$. Hence each vector in $V$ is
uniquely expressible as a linear combination of the vectors in $B$, i.e., $B$ is a basis
for $V$, as claimed.

From this, we get the following very important extendibility result.
Theorem 1.14. Let $V$ be a finite-dimensional vector space. Let $U$ be a subspace
of $V$. Then $U$ is also finite-dimensional, with $\dim(U) \le \dim(V)$ and with equality
only if $U = V$. Moreover, if $B$ is any basis for $U$, then it is extendible to a basis
$B^*$ for $V$.
Note. By convention, if $U = \{0\}$, then the empty set is a basis for $U$, and
$\dim(U) = 0$.
Proof. Suppose $\dim(V) = n$. By the Note, we may assume that $U$ contains a
non-zero vector $u$. Then $\{u\}$ is a linearly independent subset of $U$. Since any
subset of $U$ containing more than $n$ vectors is linearly dependent, there must be a
maximal linearly independent subset $B$ of $U$ with $|B| \le n$. Then $B$ is a basis for
$U$ and $\dim(U) = |B| \le n$. Extend $B$ to a maximal linearly independent subset $B^*$
of $V$, of cardinality $n$. Then $B^*$ is a basis for $V$. If $\dim(U) = n$, then $B = B^*$ and
so $U = V$.
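
The extension step in the proof of Theorem 1.14 can be carried out mechanically in $F^n$. The following Python sketch is one possible greedy procedure, offered as an illustration under stated assumptions rather than as the argument of the notes: it assumes the input list is already linearly independent, works over the rationals, and tries the standard basis vectors one at a time, keeping each one that raises the rank. The names `rank` and `extend_to_basis` are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    R = [[Fraction(x) for x in row] for row in rows]
    m, n = len(R), len(R[0])
    r = 0
    for c in range(n):
        p = next((i for i in range(r, m) if R[i][c] != 0), None)
        if p is None:
            continue
        R[r], R[p] = R[p], R[r]
        for i in range(r + 1, m):
            f = R[i][c] / R[r][c]
            R[i] = [x - f * y for x, y in zip(R[i], R[r])]
        r += 1
        if r == m:
            break
    return r

def extend_to_basis(independent, n):
    """Extend a linearly independent list of vectors in F^n to a basis of F^n."""
    basis = [list(v) for v in independent]
    for k in range(n):
        e = [1 if j == k else 0 for j in range(n)]
        # Keep e only if it is not already in the span of the current list.
        if rank(basis + [e]) > len(basis):
            basis.append(e)
    return basis

B = [(1, 1, 0), (1, -1, 0)]          # a basis for the xy-plane in Q^3
B_star = extend_to_basis(B, 3)
assert len(B_star) == 3 and rank(B_star) == 3
```

In this example the first two standard basis vectors already lie in the span of `B`, so only the third is added.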

Exercises
1. Let $X$ be any non-empty set. Let $\mathcal{F}(X, \mathbb{R}^n)$ be the set of all functions with
domain $X$ and co-domain $\mathbb{R}^n$. (The actual range of any one of these functions may
be a proper subset of $\mathbb{R}^n$. Indeed, it may be a single point in $\mathbb{R}^n$.) Define addition
and scalar multiplication pointwise, i.e., if $f$ and $g$ are functions in $\mathcal{F}(X, \mathbb{R}^n)$, and
if $f(x) = (a_1, a_2, \dots, a_n)$, $g(x) = (b_1, b_2, \dots, b_n)$, and if $c$ is a real number, then
$$(f + g)(x) = f(x) + g(x) = (a_1 + b_1, a_2 + b_2, \dots, a_n + b_n),$$
and
$$(c \cdot f)(x) = c \cdot f(x) = (ca_1, ca_2, \dots, ca_n).$$
Prove: $\mathcal{F}(X, \mathbb{R}^n)$ is a real vector space.

2. Verify that $\mathcal{C}$, $\mathcal{D}$ and $\mathcal{P}$ are vector subspaces of $\mathcal{F}$.
3. Verify that the function $f : F^n \to V$ defined in Theorem 1.7 is an isomorphism
of vector spaces.
4. Let $\Pi$ be the plane in $\mathbb{R}^3$ whose equation is: $x - y + z = 0$.
(a) Verify that $\Pi$ is a vector subspace of $\mathbb{R}^3$.
(b) Give a basis $B$ for $\Pi$.
(c) Extend the basis $B$ to a basis $B^*$ for $\mathbb{R}^3$.
5. Let $\Lambda$ be the line in $\mathbb{R}^3$ given parametrically by $(x, y, z) = (3t, t, 2t)$.
(a) Verify that $\Lambda$ is a vector subspace of $\mathbb{R}^3$.
(b) Give an equation for the plane $\Pi$ passing through $(0, 0, 0)$ which is
perpendicular to $\Lambda$.
(c) Give an orthonormal basis $B = \{u, v, w\}$ for $\mathbb{R}^3$ such that $\{u\}$ is a basis for
$\Lambda$. [$u$, $v$, and $w$ are mutually perpendicular unit vectors.]
6. Let $A \cdot x = 0$ be a system of linear equations, where
$$A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}.$$
Prove: The set of all solutions of this system of equations is a vector subspace of
$\mathbb{R}^n$.
7. Let $y'''(t) + a\,y''(t) + b\,y(t) = 0$ be a linear differential equation, for some
$a, b \in \mathbb{R}$. Prove: The set of all solutions of this differential equation is a vector
subspace of the space $\mathcal{D}$ of all everywhere differentiable functions $f : \mathbb{R} \to \mathbb{R}$.
[Note: There is nothing special about three derivatives. This is just an example.
The same statement would be true for arbitrary $n$-th order linear ODEs.]
8. Let $V$ be a vector space. Let $U$ and $W$ be subspaces of $V$.
(a) Prove: $U \cap W$ is a subspace of $V$.
(b) Prove: $U \cup W$ is a subspace of $V$ if and only if $U \subseteq W$ or $W \subseteq U$.
(c) Let $U + W := \{v = u + w \in V : u \in U \text{ and } w \in W\}$. Prove that $U + W$ is a
subspace of $V$.
(d) Prove: Suppose $U \cap W = \{0\}$. Suppose that $B$ is a finite basis for $U$ and $B_1$
is a finite basis for $W$. Then $B \cup B_1$ is a finite basis for $U + W$. (It is then common
to denote $U + W$ as $U \oplus W$, and call it the direct sum of $U$ and $W$.)
(e) Prove: Suppose $U + W$ is finite-dimensional. It is possible to choose a basis
$B$ for $U$ and a basis $B_1$ for $W$ such that $B \cap B_1$ is a basis for $U \cap W$.
(f) Show by example that, in general, if $B$ is a basis for $U$ and $B_1$ is a basis for
$W$, then $B \cap B_1$ is NOT a basis for $U \cap W$.
9. Let $V$ be a real vector space. We say that a (possibly infinite) subset $B$ of
$V$ is a basis for $V$ if and only if every vector $v \in V$ is uniquely expressible
as a finite linear combination of vectors in $B$. (Thus, $B$ is a linearly independent
spanning set for $V$.) Suppose that $B$ is a basis for $V$.
Prove: $V$ is isomorphic to the subspace $\mathcal{F}_0$ of the vector space $\mathcal{F}(B, \mathbb{R})$ defined
by: $f \in \mathcal{F}_0$ if and only if $f(b) = 0$ for all but finitely many $b \in B$.
[Hint: Define $\varphi : V \to \mathcal{F}(B, \mathbb{R})$ as follows. If $v = c_1 \cdot b_1 + \dots + c_n \cdot b_n$ for some
$b_1, \dots, b_n$ in $B$ and some scalars $c_1, \dots, c_n$, let $\varphi(v)$ be the function $f_v : B \to \mathbb{R}$ given by:
$f_v(b_i) = c_i$, $1 \le i \le n$, $f_v(b) = 0$ for all $b \in B \setminus \{b_1, \dots, b_n\}$.]
Note: Using the Axiom of Choice, it is possible to prove that every vector space
has a basis.

Chapter 2: Linear Operators


Most of the material in this chapter remains valid if $\mathbb{R}$ is replaced by an
arbitrary field $F$, but we restrict our attention to operators on real vector spaces for
application to Euclidean geometry.

The most elementary non-trivial class of functions from $\mathbb{R}^n$ to $\mathbb{R}^n$ are the linear
operators:
$$f(x_1, x_2, \dots, x_n) = (a_{11} x_1 + a_{12} x_2 + \dots + a_{1n} x_n,\ \dots,\ a_{n1} x_1 + a_{n2} x_2 + \dots + a_{nn} x_n).$$
The value of linear operators is threefold:
(1) We can understand them better than more complicated functions.
(2) We can approximate nice smooth functions $f$ in the neighborhood of each
point $P$ by the Jacobian matrix $\mathrm{Jac}_f(P)$ of partial derivatives, which is a
linear function approximating $f$ near the point $P$. (This is the generalization
of approximating a smooth curve in $\mathbb{R}^2$ by its tangent line at the point $P$.)
(3) Many important functions are linear (or affine) operators, including
isometries and projection maps.
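It is convenient to use matrix notation for linear operators, and for this, it is
standard to write vectors as column vectors:
$$v = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}.$$
Then, if
$$A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{pmatrix},$$
we have
$$f(v) = A \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}.$$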

The following is a basis-free definition of linear operators.

Definition 2.1. Let $V$ be a vector space and let $f : V \to V$ be a function satisfying:
(1) $f(v + w) = f(v) + f(w)$ for all $v, w \in V$; and
(2) $f(a \cdot v) = a \cdot f(v)$ for all $a \in \mathbb{R}$, $v \in V$.
Then we call $f$ a linear operator on $V$.

In the finite-dimensional case, the use of a basis and coordinates affords the
relationship between the two definitions.

Lemma 2.2. Let $V$ be a finite-dimensional vector space with a basis $B = \{v_1, \dots, v_n\}$.
Let $f : V \to V$ be a linear operator. Identify $V$ with $\mathbb{R}^n$ via the coordinatization:
$$v = a_1 \cdot v_1 + \dots + a_n \cdot v_n \mapsto \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}.$$
Then $f$, written in $B$-coordinates, is the function:
$$f\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$$
where
$$f(v_i) = \begin{pmatrix} a_{1i} \\ a_{2i} \\ \vdots \\ a_{ni} \end{pmatrix}.$$
Conversely, given any $n \times n$ matrix $A$, the function $f_A : V \to V$ defined, with
respect to the coordinatization given by $B$, by:
$$f_A\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = A \cdot \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$$
is a linear operator on $V$.
Proof. The rules of matrix multiplication show that any function $f(x) = A \cdot x$ is a
linear operator. Moreover, any linear operator $f$ is uniquely determined by the set
of values $\{f(v_1), \dots, f(v_n)\}$ where $\{v_1, \dots, v_n\}$ is any basis for $V$. If $A$ is the $n \times n$
matrix whose Column $i$ is the column vector $f(v_i)$, then $f(v_i) = A \cdot v_i$. Hence, the
linear operator $f$ agrees with the function $x \mapsto A \cdot x$.
The discussion above also shows:
Lemma 2.3. Let $V$ be a vector space with a basis $B$. Let $f : B \to V$ be any
function. Then there is a unique linear operator $f^* : V \to V$ such that $f^*$ extends
$f$. Moreover $f^*$ is an invertible linear operator if and only if $f(B)$ is a basis for $V$
if and only if the associated matrix $A$ is invertible.

Examples of Linear Operators


1. Let $\mathcal{P}$ be the vector space of all polynomials. Define $D : \mathcal{P} \to \mathcal{P}$ by $D(p) = \frac{dp}{dx}$.
Then $D$ is a linear operator on $\mathcal{P}$. Note that $D^2 := D \circ D$ is the second derivative
operator. Indeed, if $p(x) = a_n x^n + \dots + a_1 x + a_0$ is any polynomial, then
$$p(D) := a_n \cdot D^n + \dots + a_1 \cdot D + a_0 \cdot I$$
is a linear differential operator on $\mathcal{P}$, and more generally, on the space of all infinitely
differentiable functions of a real variable.
2. Let $\mathcal{C}$ be the vector space of all continuous real-valued functions. Define
$J : \mathcal{C} \to \mathcal{C}$ by $J(f) = \int_0^x f(t)\,dt$. Then $J$ is a linear operator.
The theory of differential operators and integral operators has played an
important role in the study of differential equations and physics over the past century.
3. Let $\rho : \mathbb{R}^2 \to \mathbb{R}^2$ be the rotation about $(0, 0)$ counterclockwise through the
angle $\theta$. Then $\rho$ is a linear operator with associated matrix
$$\begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}.$$
4. Let $r : \mathbb{R}^2 \to \mathbb{R}^2$ be the reflection of $\mathbb{R}^2$ across the line $y = mx$, where
$m = \tan(\frac{\theta}{2})$. Then $r$ is a linear operator with associated matrix
$$\begin{pmatrix} \cos(\theta) & \sin(\theta) \\ \sin(\theta) & -\cos(\theta) \end{pmatrix}.$$
Projection maps onto lines $y = mx$ are also linear operators on $\mathbb{R}^2$.
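
The rotation and reflection matrices of Examples 3 and 4 are easy to test numerically. The following Python sketch is an illustration only: the function names are hypothetical, the slope $m = 2$ is an arbitrary choice, and floating-point results are compared within a small tolerance. It checks that rotation through $\pi/2$ sends $(1, 0)$ to $(0, 1)$, and that the reflection across $y = mx$ fixes $(1, m)$ and negates the perpendicular vector $(-m, 1)$.

```python
import math

def rotation(theta):
    """Matrix of the counterclockwise rotation of R^2 through the angle theta."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

def reflection(m):
    """Matrix of the reflection of R^2 across the line y = m x, with m = tan(theta / 2)."""
    theta = 2 * math.atan(m)
    return [[math.cos(theta),  math.sin(theta)],
            [math.sin(theta), -math.cos(theta)]]

def apply(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def close(u, v, tol=1e-12):
    """Componentwise comparison up to rounding error."""
    return all(abs(a - b) < tol for a, b in zip(u, v))

assert close(apply(rotation(math.pi / 2), [1, 0]), [0, 1])

m = 2
r = reflection(m)
assert close(apply(r, [1, m]), [1, m])        # vectors on the line are fixed
assert close(apply(r, [-m, 1]), [m, -1])      # perpendicular vectors are negated
```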


Two subspaces naturally associated with a linear operator $T : V \to V$ are the
kernel of $T$,
$$\mathrm{Ker}(T) = \{v \in V : T(v) = 0\},$$
and the range of $T$, $T(V)$,
$$T(V) = \{T(v) : v \in V\}.$$
The following theorem is fundamental.
Theorem 2.4. Let $T : V \to V$ be a linear operator on the finite-dimensional vector
space $V$. Then
$$\dim(\mathrm{Ker}(T)) + \dim(T(V)) = \dim(V).$$
Proof. Let $B_0$ be a basis for $\mathrm{Ker}(T)$. Extend $B_0$ to a basis $B = B_0 \cup B_1$ for $V$.
Let $V_1 = \mathrm{Span}(B_1)$ and let $T_1 : V_1 \to T(V)$ be the restriction of $T$ to $V_1$. Since
$B_0 \cap B_1 = \emptyset$, we have that
$$\mathrm{Ker}(T) \cap V_1 = \{0\}.$$
We claim that $T_1$ is an isomorphism of vector spaces.
Note that $V = \mathrm{Ker}(T) + V_1$. Let $v \in V$. Write $v = u + v_1$ with $u \in \mathrm{Ker}(T)$,
$v_1 \in V_1$. Then
$$T(v) = T(u + v_1) = T(u) + T(v_1) = 0 + T(v_1) = T(v_1) = T_1(v_1).$$
Hence $T_1(V_1) = T(V_1) = T(V)$, i.e., $T_1$ is surjective. Suppose that $v \in \mathrm{Ker}(T_1)$.
Then $v \in \mathrm{Ker}(T) \cap V_1 = \{0\}$. Hence $T_1$ is injective. Thus $T_1$ is an isomorphism,
as claimed. Hence
$$\dim(T(V)) = \dim(V_1) = |B_1| = |B| - |B_0| = \dim(V) - \dim(\mathrm{Ker}(T)),$$
proving the theorem.
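
Theorem 2.4 can be checked on a small example. In the Python sketch below, which is an illustration and not part of the notes, the dimension of the range of $T(x) = A \cdot x$ is computed as the rank of $A$ by Gaussian elimination, while the kernel vector $(1, 1, -1)$ was found by hand from the equations and is only verified by the code. The matrix `A` and the function name `rank` are assumptions made for the example.

```python
from fractions import Fraction

def rank(A):
    """Rank of A by Gaussian elimination over the rationals."""
    R = [[Fraction(x) for x in row] for row in A]
    m, n = len(R), len(R[0])
    r = 0
    for c in range(n):
        p = next((i for i in range(r, m) if R[i][c] != 0), None)
        if p is None:
            continue
        R[r], R[p] = R[p], R[r]
        for i in range(r + 1, m):
            f = R[i][c] / R[r][c]
            R[i] = [x - f * y for x, y in zip(R[i], R[r])]
        r += 1
        if r == m:
            break
    return r

A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]

# Solving x + 2y + 3z = 0 and x + z = 0 by hand gives the line spanned by (1, 1, -1).
kernel_basis = [(1, 1, -1)]
for k in kernel_basis:
    assert all(sum(a * x for a, x in zip(row, k)) == 0 for row in A)

dim_range = rank(A)                       # dimension of T(V), the column space of A
assert len(kernel_basis) + dim_range == 3
```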



For the purpose of studying linear operators on higher dimensional vector spaces,
it is convenient whenever possible to break the problem down into “bite-sized” pieces,
by finding smaller invariant subspaces.
Definition 2.5. Let $T : V \to V$ be a linear operator. A subspace $W$ of $V$ is said
to be $T$-invariant if $T(w) \in W$ for all $w \in W$, i.e. by restriction of domain, $T$
defines a linear operator $T_W : W \to W$.
Example. Let $D : \mathcal{P} \to \mathcal{P}$ be the differentiation operator. Let $\mathcal{P}_n$ be the subspace
of polynomials of degree at most $n$. Then, for all $n$, $\mathcal{P}_n$ is an $(n + 1)$-dimensional
$D$-invariant subspace of $\mathcal{P}$. [Note: In fact, $D(\mathcal{P}_n) = \mathcal{P}_{n-1} \subseteq \mathcal{P}_n$.]
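
To make the restriction of $D$ to $\mathcal{P}_n$ concrete, one can write down its matrix with respect to the basis $\{1, x, \dots, x^n\}$. The Python sketch below is an illustration with hypothetical helper names: column $k$ of the matrix holds the coordinates of $D(x^k) = k \cdot x^{k-1}$. The zero bottom row reflects the fact that $D(\mathcal{P}_n) \subseteq \mathcal{P}_{n-1}$, and applying $D$ a total of $n + 1$ times kills every polynomial in $\mathcal{P}_n$.

```python
def derivative_matrix(n):
    """(n+1) x (n+1) matrix of D on P_n in the basis 1, x, ..., x^n."""
    # Entry (i, j) is j when i == j - 1, since D(x^j) = j * x^(j-1).
    return [[j if j == i + 1 else 0 for j in range(n + 1)] for i in range(n + 1)]

def apply(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

D = derivative_matrix(3)
p = [5, 1, 0, 2]            # coordinates of 5 + x + 2x^3
dp = apply(D, p)            # coordinates of the derivative 1 + 6x^2

q = p
for _ in range(4):          # four derivatives of a cubic give the zero polynomial
    q = apply(D, q)
```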
Other than the 0-subspace, the smallest possible $T$-invariant subspaces are
1-dimensional. If $W = \mathrm{Span}(\{w\})$ is a 1-dimensional $T$-invariant subspace (line),
then
$$T(w) = \lambda \cdot w$$
for some scalar $\lambda$. We then call $w$ an eigenvector for $T$ with eigenvalue $\lambda$.

Examples
1. Let $D : \mathcal{P} \to \mathcal{P}$ be the differentiation operator. Since $D(f)$ is of lower degree
than $f$, the only possible way that $D(f) = \lambda \cdot f$ is if $\lambda = 0$, i.e., $D(f) = 0$. Thus
the only eigenvectors for $D$ are non-zero constant functions and the only eigenvalue
is 0. (On the other hand, if we enlarge our space from $\mathcal{P}$ to $C^\infty(\mathbb{R})$, the space of
infinitely differentiable real-valued functions, then $D : C^\infty(\mathbb{R}) \to C^\infty(\mathbb{R})$ is still a
linear operator, and now the function $f(x) = e^{\lambda x}$ is an eigenfunction for $D$ with
eigenvalue $\lambda$, for any real number $\lambda$.)
2. Let $r : \mathbb{R}^2 \to \mathbb{R}^2$ be the reflection across the line $y = mx$. Then the line
$y = mx$ is an $r$-invariant 1-dimensional subspace of $\mathbb{R}^2$, and so $(1, m)$ is an eigenvector for
$r$ with eigenvalue 1. Also, if $m \neq 0$, the orthogonal line $y = -\frac{1}{m} x$ is $r$-invariant, but
each vector on this line is mapped to its negative. Hence $(-m, 1)$ is an eigenvector
for $r$ with eigenvalue $-1$. [If $R : \mathbb{R}^2 \to \mathbb{R}^2$ is the reflection across the $x$-axis, $y = 0$,
then the $y$-axis is also $R$-invariant, and $(0, 1)$ is an eigenvector with eigenvalue $-1$.]
3. If $\rho = \rho_\theta : \mathbb{R}^2 \to \mathbb{R}^2$ is a non-identity rotation, then no 1-dimensional subspace
(line through $(0, 0)$) is mapped to itself, unless $\theta = \pi$, in which case each vector
is mapped to its negative, and so every non-zero vector is an eigenvector for $\rho_\pi$
with eigenvalue $-1$. (By convention, the zero vector is never considered to be an
eigenvector.)
There is a lovely strategy for finding eigenvectors. Let $T : V \to V$ be a linear
operator. Fix a basis $B$ and a coordinatization for $V$ relative to $B$. Then
$T(x) = A \cdot x$ for some matrix $A$. We wish to solve the matrix equation:
$$A \cdot x = \lambda \cdot x.$$
Bringing everything to the left side of the equation, this is equivalent to solving
$$(A - \lambda I) \cdot x = 0.$$

Definition 2.6. Let $A$ be an $n \times n$ matrix and let $\lambda \in \mathbb{R}$. The $\lambda$-eigenspace for
$A$ is
$$\{v \in \mathbb{R}^n : A \cdot v = \lambda \cdot v\},$$
i.e. it is the set of all eigenvectors for $A$ with eigenvalue $\lambda$ plus the vector 0.

Our brief remarks above show that

Lemma 2.7. The $\lambda$-eigenspace for $A$ is the null space for $A - \lambda I$, i.e.
$$\{v \in \mathbb{R}^n : (A - \lambda \cdot I) \cdot v = 0\}.$$

Now, for any given number $\lambda$, the problem of finding the null space for $A - \lambda I$
is the problem of solving a certain homogeneous system of $n$ linear equations in
$n$ unknowns. But, for almost all $\lambda$, this system will have $(0, 0, \dots, 0)$ as its only
solution. The Eigenvalue Problem is:

Determine those (few) values of $\lambda$ for which the system has a non-zero solution.

This will be true if and only if the matrix $A - \lambda I$ is singular, i.e. not invertible.
And this property can be determined by the determinant $\det(A - \lambda I)$. There is
a very interesting general theory of the determinant. We will restrict our attention
to the case of $n \times n$ matrices with $n = 2$ or $n = 3$. Even then, we will just sketch
the ideas, from a geometric viewpoint.
Definition 2.8. Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Then the determinant of $A$ is
$$\det(A) = ad - bc.$$
It is easy to see that the homogeneous system $A \cdot x = 0$ has a non-zero solution
if and only if the rows $(a, b)$ and $(c, d)$ are proportional if and only if $ad - bc = 0$.
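
For a $2 \times 2$ matrix the Eigenvalue Problem can be solved by hand: expanding $\det(A - \lambda I) = (a - \lambda)(d - \lambda) - bc$ gives the quadratic $\lambda^2 - (a + d)\lambda + (ad - bc)$, whose real roots are the eigenvalues. This expansion is a standard computation that the notes do not carry out at this point, so the Python sketch below, with hypothetical function names and floating-point arithmetic, should be read as an illustration under that assumption.

```python
import math

def det2(A):
    """Determinant ad - bc of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def real_eigenvalues(A):
    """Real roots of det(A - t I) = t^2 - (a + d) t + (ad - bc), in increasing order."""
    tr = A[0][0] + A[1][1]
    disc = tr * tr - 4 * det2(A)
    if disc < 0:
        return []                      # no real eigenvalue
    s = math.sqrt(disc)
    return sorted({(tr - s) / 2, (tr + s) / 2})

symmetric = [[2, 1], [1, 2]]
quarter_turn = [[0, -1], [1, 0]]       # rotation through pi / 2
flip = [[1, 0], [0, -1]]               # reflection across the x-axis

eigs = real_eigenvalues(symmetric)
# For each eigenvalue t, the matrix A - t I must be singular.
for t in eigs:
    shifted = [[symmetric[0][0] - t, symmetric[0][1]],
               [symmetric[1][0], symmetric[1][1] - t]]
    assert abs(det2(shifted)) < 1e-12
```

The quarter turn has no real eigenvalue, in line with Example 3 above, while the reflection across the $x$-axis has eigenvalues $1$ and $-1$, in line with Example 2.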

Now we proceed to the 3-dimensional case. We shall restrict our discussion of
linear algebra henceforth mostly to the 3-dimensional case, although almost every
statement extends to the general $n$-dimensional case.

Definition 2.9. Let
$$A = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & m \end{pmatrix}.$$
Then
$$\det(A) = a \cdot \det\begin{pmatrix} e & f \\ h & m \end{pmatrix} - b \cdot \det\begin{pmatrix} d & f \\ g & m \end{pmatrix} + c \cdot \det\begin{pmatrix} d & e \\ g & h \end{pmatrix}.$$

We recall the definition of the cross product of two vectors in $\mathbb{R}^3$. We use the
standard notation $i = (1, 0, 0)$, $j = (0, 1, 0)$, $k = (0, 0, 1)$.

Definition 2.10. Let $u = ai + bj + ck$ and $v = di + ej + fk$ be two vectors in $\mathbb{R}^3$.
Then the cross product of $u$ and $v$ is
$$u \times v = \det\begin{pmatrix} i & j & k \\ a & b & c \\ d & e & f \end{pmatrix}.$$
It is straightforward to check that $u \times v$ is perpendicular to the vectors $u$ and $v$.
Moreover, if $u$ and $v$ are non-zero vectors, then $\|u \times v\|$ is equal to the area of the
parallelogram determined by the vectors $u$ and $v$. The next lemma is clear from
the definition of dot product.
Lemma 2.11. Let $u = (a, b, c)$, $v = (d, e, f)$, and $w = (x, y, z)$ be three vectors in
$\mathbb{R}^3$. Then
$$w \cdot (u \times v) = \det\begin{pmatrix} x & y & z \\ a & b & c \\ d & e & f \end{pmatrix}.$$
It follows that
Lemma 2.12. $\det(A) = 0$ if and only if the rows of $A$ form a linearly dependent
set of vectors if and only if the null space of $A$ contains a non-zero vector.
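
Definitions 2.8 through 2.10 and Lemmas 2.11 and 2.12 translate directly into a few lines of code. The following Python sketch is an illustration with hypothetical function names and arbitrarily chosen integer vectors. It computes the $3 \times 3$ determinant by expansion along the first row, builds the cross product from the same $2 \times 2$ determinants, and checks perpendicularity, the triple product identity, and the vanishing of the determinant for linearly dependent rows.

```python
def det2(a, b, c, d):
    """Determinant of the 2 x 2 matrix with rows (a, b) and (c, d)."""
    return a * d - b * c

def det3(A):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, m) = A
    return a * det2(e, f, h, m) - b * det2(d, f, g, m) + c * det2(d, e, g, h)

def cross(u, v):
    """Cross product: the formal determinant with first row i, j, k."""
    a, b, c = u
    d, e, f = v
    return (det2(b, c, e, f), -det2(a, c, d, f), det2(a, b, d, e))

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

u, v, w = (1, 2, 3), (4, 5, 6), (7, 8, 10)
n = cross(u, v)

assert dot(n, u) == 0 and dot(n, v) == 0      # u x v is perpendicular to u and v
assert dot(w, n) == det3([w, u, v])           # Lemma 2.11
assert det3([u, v, (5, 7, 9)]) == 0           # third row is u + v, so rows are dependent
```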
We will need a few additional facts about determinants, the first two of which
are somewhat difficult to prove. We refer students to other books for their proof.
Determinant Theorems 2.13. Let $A$ and $B$ be $n \times n$ matrices. The following
properties hold:
(1) $\det(AB) = \det(A)\det(B)$;
(2) $\det(A^T) = \det(A)$;
(3) If $\det(A) \neq 0$, then $A^{-1}$ exists and $\det(A^{-1}) = \frac{1}{\det(A)}$.

Returning to the eigenvalue problem, we obtain the following important result.

Theorem 2.14. Let $A$ be a $3 \times 3$ matrix. Let $\lambda$ be a real number. The following
are equivalent:
(1) The $\lambda$-eigenspace of $A$ is non-zero;
(2) The null space of $A - \lambda \cdot I$ is non-zero;
(3) $\lambda$ is a root of the characteristic polynomial of $A$, $c_A(x) := \det(x \cdot I - A)$.

Note that, by the Intermediate Value Theorem, every real cubic polynomial
crosses the $x$-axis at least one time. Indeed, $c_A(x)$ has either one real root and a
pair of complex conjugate roots, or $c_A(x)$ has three real roots. Thus $A$ has at least
one real eigenvalue. Of course, a real root may occur with multiplicity 1, 2, or 3.
Computing $\det(x \cdot I - A)$ is a bit tedious, but we can make two easy and important
observations.
Lemma 2.15. Let $A$ be a $3 \times 3$ matrix and let $c_A(x)$ be the characteristic polynomial
of $A$. Write
$$c_A(x) = x^3 - c_2 x^2 + c_1 x - c_0.$$
Then $c_2 = \mathrm{Tr}(A)$ is the trace of the matrix $A$, and $c_0 = \det(A)$ is the determinant
of the matrix $A$.
Proof. Considering the matrix
$$x \cdot I - A = \begin{pmatrix} x - a_{11} & -a_{12} & -a_{13} \\ -a_{21} & x - a_{22} & -a_{23} \\ -a_{31} & -a_{32} & x - a_{33} \end{pmatrix},$$
we see that the cubic and quadratic terms of $c_A(x)$ all come from the product:
$$(x - a_{11})(x - a_{22})(x - a_{33}) = x^3 - (a_{11} + a_{22} + a_{33})x^2 + \dots = x^3 - \mathrm{Tr}(A)x^2 + \dots.$$
Thus $c_2 = \mathrm{Tr}(A)$. Now
$$-c_0 = c_A(0) = \det(0 - A) = \det(-A) = (-1)^3 \det(A) = -\det(A).$$
Hence $c_0 = \det(A)$, completing the proof.
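
Lemma 2.15 can be confirmed numerically without expanding $\det(x \cdot I - A)$ symbolically. Writing $c_A(x) = x^3 - c_2 x^2 + c_1 x - c_0$, one has $c_A(0) = -c_0$ and $c_A(1) + c_A(-1) = -2c_2 - 2c_0$, so both coefficients can be recovered from three values of the polynomial. The Python sketch below is an illustration; the matrix `A` and the function names are arbitrary choices made for the example.

```python
def det3(A):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, m) = A
    return a * (e * m - f * h) - b * (d * m - f * g) + c * (d * h - e * g)

def char_poly_at(A, x):
    """Value of the characteristic polynomial c_A(x) = det(x I - A)."""
    M = [[(x if i == j else 0) - A[i][j] for j in range(3)] for i in range(3)]
    return det3(M)

A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]

c0 = -char_poly_at(A, 0)
# c_A(1) + c_A(-1) = -2*c2 - 2*c0, so c2 can be solved for exactly.
c2 = (-(char_poly_at(A, 1) + char_poly_at(A, -1)) - 2 * c0) // 2

trace = sum(A[i][i] for i in range(3))
assert c0 == det3(A)
assert c2 == trace
```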

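Clearly, if $v$ is an eigenvector for $A$ with eigenvalue $\lambda$, then every non-zero vector $c \cdot v$
collinear with $v$ is an eigenvector with the same eigenvalue $\lambda$. Thus if $v$ and $w$ are
eigenvectors for $A$ with distinct eigenvalues, they cannot be collinear. The following
stronger statement is true.
Lemma 2.16. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be a linear operator with matrix $A$, with respect
to the standard basis for $\mathbb{R}^3$. Suppose that $u$, $v$, and $w$ are eigenvectors for $A$ with
distinct eigenvalues $\lambda$, $\mu$, and $\nu$. Then $\{u, v, w\}$ is a basis for $\mathbb{R}^3$.
Proof. It suffices to show that $\{u, v, w\}$ is linearly independent, since a linearly
independent set of three vectors in $\mathbb{R}^3$ is maximal by Theorem 1.12(3), hence a basis
by Lemma 1.13. Suppose, to the contrary, that

(1) $c_1 \cdot u + c_2 \cdot v + c_3 \cdot w = 0$

with the $c_i$ not all 0. Since no two of the vectors $u$, $v$, $w$ are collinear, $c_i \neq 0$ for all $i$.
Apply $T$ to both sides of this equation, getting:

(2) $c_1 \lambda \cdot u + c_2 \mu \cdot v + c_3 \nu \cdot w = 0$.

Multiply equation (1) by $\lambda$, getting:

(3) $c_1 \lambda \cdot u + c_2 \lambda \cdot v + c_3 \lambda \cdot w = 0$.

Now subtract equation (3) from equation (2), getting:

(4) $c_2 (\mu - \lambda) \cdot v + c_3 (\nu - \lambda) \cdot w = 0$.

But now, since $c_3 \neq 0$ and $\nu \neq \lambda$, we may solve for $w$ and get:
$$w = \frac{c_2 (\lambda - \mu)}{c_3 (\nu - \lambda)} \cdot v,$$
whence $v$ and $w$ are collinear, a contradiction proving the lemma.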

Suppose that $u$, $v$ and $w$ are eigenvectors for $A$ with eigenvalues $\lambda$, $\mu$, $\nu$, not
necessarily distinct. Suppose that $\{u, v, w\}$ forms a basis for $\mathbb{R}^3$. Form the matrix
$C$ whose columns are the coordinate vectors for $u$, $v$, and $w$ with respect to the
standard basis for $\mathbb{R}^3$. Then the matrix $AC$ has columns $Au = \lambda \cdot u$, $Av = \mu \cdot v$,
and $Aw = \nu \cdot w$. Hence
\[ AC = CD := C \begin{pmatrix} \lambda & 0 & 0 \\ 0 & \mu & 0 \\ 0 & 0 & \nu \end{pmatrix}. \]
Since $\{u, v, w\}$ is a basis for $\mathbb{R}^3$, $C$ is an invertible matrix and
\[ C^{-1} A C = D = \begin{pmatrix} \lambda & 0 & 0 \\ 0 & \mu & 0 \\ 0 & 0 & \nu \end{pmatrix}. \]
We say that $A$ is similar to the diagonal matrix $D$. We also say that $A$ is diagonalizable.
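To make the identity $AC = CD$ concrete, here is a small Python sketch with one hand-picked symmetric matrix. The matrix `A`, its eigenvectors, and the helper `matmul` are our own illustrative assumptions; any diagonalizable matrix with a known eigenbasis would serve equally well.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


# Sample matrix (an assumption). Its eigenvectors are
#   u = (1, -1, 0) with eigenvalue 1,
#   v = (1,  1, 0) with eigenvalue 3,
#   w = (0,  0, 1) with eigenvalue 5.
A = [[2, 1, 0],
     [1, 2, 0],
     [0, 0, 5]]

# C has the eigenvectors u, v, w as its columns.
C = [[1, 1, 0],
     [-1, 1, 0],
     [0, 0, 1]]

# D is the diagonal matrix of the corresponding eigenvalues.
D = [[1, 0, 0],
     [0, 3, 0],
     [0, 0, 5]]

AC = matmul(A, C)
CD = matmul(C, D)
print(AC == CD)  # the columns of AC are the scaled eigenvectors
```

Checking $AC = CD$ rather than $C^{-1}AC = D$ avoids computing an inverse and keeps the arithmetic in the integers.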
Definition 2.17. Two $n \times n$ matrices $A$ and $D$ are said to be similar if there
exists an invertible $n \times n$ matrix $C$ such that $C^{-1} A C = D$.
It is left as an exercise to show that the relation of similarity is an equivalence
relation on the set of n ⇥ n matrices. The equivalence classes are called similarity
classes.
Now is a good time to recall some definitions from Math 4580. Indeed, this
would be a good time for you to start reviewing the material in Chapters 9 through
12 of the Math 4580 notes.
Definition 2.18. A nonempty set G of invertible functions is a group of func-
tions if G is closed under composition of functions and under taking inverses. [See
Definition 10.3 on page 84 of the Math 4580 notes.]
Definition 2.19. Two functions $f$ and $f_1$ in a group $G$ are called conjugate if
there is a function $g \in G$ such that $f_1 = g \circ f \circ g^{-1}$. Conjugacy is an equivalence
relation on the set $G$, and the equivalence classes are called conjugacy classes.
[See page 78 of the Math 4580 notes.]
Notice that the intersection of a similarity class of $n \times n$ matrices with the group
$GL(n, \mathbb{R})$ of all invertible $n \times n$ matrices is either empty or a conjugacy class in this group. [See
Exercise 8.]
The following definition will be needed in Exercise 8.
Definition 2.20. Let $G$ be a group (of functions) and $H$ a subgroup of $G$. We say
that $H$ is a normal subgroup of $G$ if
\[ H = gHg^{-1} = \{ghg^{-1} : h \in H\} \]
for all $g \in G$. (Equivalently, $gH = Hg$ for all $g \in G$, where $gH$ and $Hg$ are cosets
of $H$.) [See Exercise 7 in Chapter 11 of the Math 4580 notes.]

There are two interpretations for a pair of similar matrices, $A$ and $D$, in terms
of linear operators. Holding one basis, $B$, fixed, $A$ and $D$ are the matrices for two
different linear operators with respect to this fixed choice of basis. On the other
hand, we may regard the invertible matrix $C$ as a change of basis matrix, and then
we may interpret $A$ and $D$ as two different matrix representations for the same
linear operator $T : \mathbb{R}^n \to \mathbb{R}^n$ with respect to two different bases. Thus, in the
diagonalizable case, it is convenient to think as follows:
$T : \mathbb{R}^3 \to \mathbb{R}^3$ is a linear operator whose matrix is $A$ with respect to the standard
basis $\{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$ for $\mathbb{R}^3$. On the other hand, there is a better choice
of basis for the purpose of understanding $T$ geometrically. Namely, there are three
eigenlines $\mathbb{R} \cdot u$, $\mathbb{R} \cdot v$, $\mathbb{R} \cdot w$, such that $T$ "stretches" (or maybe shrinks) these lines
by scaling factors $\lambda$, $\mu$, and $\nu$, respectively. With respect to the basis $\{u, v, w\}$, $T$
is represented by the diagonal matrix $D$. So, $A$ and $D$ represent the same linear
operator $T$ with respect to different choices of basis.

Exercises
1. Let $f : V \to V$ be a linear operator on the finite-dimensional real vector space
$V$. Let $B$ be a basis for $V$. Let $A$ be the matrix which represents $f$ with respect
to the basis $B$.
(a) Prove: $f$ is invertible if and only if $f(B)$ is a basis for $V$.
(b) Prove: $f$ is invertible if and only if $A$ is an invertible matrix.
2. Prove: The differentiation operator $D : P \to P$ is a linear operator.
3. Prove: The integration operator $J : C \to C$ is a linear operator.
4. Prove: $D \circ J : P \to P$ is the identity operator on $P$, but $J \circ D : P \to P$ is
not the identity operator.
5. Let $T : V \to V$ be a linear operator.
(a) Prove: The range $T(V)$ is a subspace of $V$.
(b) Let $V_\lambda(T) := \{v \in V : T(v) = \lambda \cdot v\}$. Prove: $V_\lambda(T)$ is a $T$-invariant subspace
of $V$.
(c) Prove: $\mathrm{Ker}(T) = V_0(T)$.
(d) Prove: $V_\lambda(T) = \mathrm{Ker}(T - \lambda \cdot I)$.
6. Let $W$ be a subspace of the real vector space $V$. We say that a linear operator
$P : V \to V$ is a projection operator onto $W$ if $P(V) = W$ and $P(w) = w$ for all
$w \in W$.
Prove: A linear operator $P : V \to V$ is a projection operator if and only if
$P^2 := P \circ P = P$. [Hint: If $P^2 = P$, write every vector $v \in V$ as $v = P(v) + (v - P(v))$.
Conclude that $V = V_0(P) + V_1(P)$ and $P$ is a projection operator onto $V_1(P)$.]
7. Let $u$ and $v$ be vectors in $\mathbb{R}^3$.
(a) Verify that $u \times v$ is perpendicular both to $u$ and to $v$.
(b) Verify that $u \times v = 0$ if and only if $\{u, v\}$ is a linearly dependent set, i.e. $u$
and $v$ are collinear.
(c) Using the scalar triple product interpretation of the $3 \times 3$ determinant, verify
that $\det(A) = 0$ if and only if the rows of $A$ form a linearly dependent set of vectors.
8. Use the Determinant Theorem 2.13 for this exercise. Let $GL(\mathbb{R}^n)$ be the
set of all invertible linear operators on $\mathbb{R}^n$. To each linear operator $T \in GL(\mathbb{R}^n)$,
associate the $n \times n$ matrix for $T$ with respect to the standard basis for $\mathbb{R}^n$. Let
$GL(n, \mathbb{R})$ denote the set of associated matrices.
(a) Prove: $GL(n, \mathbb{R})$ is the set of all $n \times n$ matrices $A$ such that $\det(A) \neq 0$.
(b) Prove: If $A, B \in GL(n, \mathbb{R})$, then $AB \in GL(n, \mathbb{R})$ and $A^{-1} \in GL(n, \mathbb{R})$.
Conclude that $GL(n, \mathbb{R})$ is a group under the operation of matrix multiplication.
(c) Let $SL(n, \mathbb{R})$ be the set of all $n \times n$ matrices $A$ with $\det(A) = 1$. Prove:
$SL(n, \mathbb{R})$ is a normal subgroup of $GL(n, \mathbb{R})$. [You must prove that it is a subgroup,
and also that it is normal.]
(d) Let $\Phi : GL(\mathbb{R}^n) \to GL(n, \mathbb{R})$ be the function which associates to each invertible
linear operator $T$ the matrix representing $T$ with respect to the standard basis
for $\mathbb{R}^n$. Prove: $\Phi$ is an isomorphism of groups.
9. (a) Prove: The relation of similarity is an equivalence relation on the set of all
$n \times n$ matrices.
(b) Prove: A subgroup $H$ of a group $G$ is normal if and only if $H$ is a union of
conjugacy classes of $G$.
10. Let $\Pi$ be a plane passing through $(0, 0, 0)$ in $\mathbb{R}^3$ with equation:
\[ ax + by + cz = 0. \]
Let $R_\Pi : \mathbb{R}^3 \to \mathbb{R}^3$ be the reflection map across the plane $\Pi$. Let $P_n : \mathbb{R}^3 \to \mathbb{R}^3$ be
the orthogonal projection map onto the line through the normal vector $n = (a, b, c)$
to the plane $\Pi$. Let $P_\Pi : \mathbb{R}^3 \to \mathbb{R}^3$ be the orthogonal projection map onto the plane
$\Pi$.
(a) Prove: For all $(x, y, z) \in \mathbb{R}^3$,
\[ P_n(x, y, z) = \frac{ax + by + cz}{a^2 + b^2 + c^2} \cdot n. \]
(b) Deduce that $P_n$ is a linear operator on $\mathbb{R}^3$.
(c) If $n = (1, 0, 1)$, write the matrix for the operator $P_n$ with respect to the
standard basis for $\mathbb{R}^3$.
(d) Argue geometrically what the eigenvalues and eigenspaces for $P_n$ are.
(e) Verify your claims in (d) for the example $n = (1, 0, 1)$ by explicit matrix
calculation using the matrix found in part (c).
(f) Prove: $P_\Pi = I - P_n$.
(g) If $n = (1, 0, 1)$, write the matrix for the operator $P_\Pi$.
(h) Argue both geometrically and by matrix calculation what the eigenvalues
and eigenspaces for $P_\Pi$ are.
(i) Prove: $P_n^2 = P_n$ and $P_\Pi^2 = P_\Pi$. Hence both are projection operators.
(j) Prove by a geometrical argument: $R_\Pi = 2P_\Pi - I$.
(k) Deduce what the eigenvalues and eigenspaces for $R_\Pi$ are.
(l) For $n = (1, 0, 1)$, find the matrix for $R_\Pi$ and check your claims concerning
the eigenvalues and eigenspaces.

Chapter 3: Inner Products and Orthogonal Matrices


The space $\mathbb{R}^n$ is not simply a vector space. It is a metric space with a geometric
structure given by the dot product: If $u = (u_1, \dots, u_n)$ and $v = (v_1, \dots, v_n)$, then
\[ u \cdot v = u_1 v_1 + \cdots + u_n v_n. \]
In particular, $u \cdot u$ is the square of the Euclidean distance from $(u_1, \dots, u_n)$ to
$(0, \dots, 0)$. Moreover, if $u$ and $v$ are vectors of unit length, then $u \cdot v$ is the cosine
of the angle between them. Indeed, for any vectors $u$ and $v$,

$u \cdot v = 0$ if and only if $u$ and $v$ are orthogonal (perpendicular to each other).


It will be convenient to write vectors as column vectors:
\[ u = \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix}, \qquad v = \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}. \]
Then, the (matrix) transpose of $u$, $u^T$, is
\[ u^T := (u_1, \dots, u_n), \]
and the dot product $u \cdot v$ is the matrix product $u^T v$.
Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear operator. Let $A$ be the matrix representing $T$ with
respect to the standard basis. Then the dot product $T(u) \cdot T(v)$ may be computed
as the matrix product:
\[ (Au)^T (Av) = u^T A^T A v. \]
In particular, if $T : \mathbb{R}^n \to \mathbb{R}^n$ is a linear isometry, then $T$ preserves distances
and angles, whence, for all $u$, $v$:
\[ u \cdot v = T(u) \cdot T(v), \quad \text{i.e.,} \quad u^T v = (Au)^T (Av) = u^T (A^T A) v. \]
We leave it as an exercise to show that $u^T v = u^T (A^T A) v$ for all $u$, $v$ if and only if
\[ A^T A = I. \]
Thus we have the following fact.
Lemma 3.1. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear operator. Let $A$ be the matrix representing
$T$ with respect to the standard basis for $\mathbb{R}^n$. Then $T$ is an isometry if and
only if $A^T A = I$.
We call a square matrix $A$ orthogonal if and only if $A^T A = I$, i.e. the columns
of $A$ form an orthonormal basis for $\mathbb{R}^n$ (and so do the rows, since $A A^T = I$ as
well). Likewise, we call a linear operator an orthogonal operator if the associated
matrix with respect to the standard basis for $\mathbb{R}^n$ is an orthogonal matrix.
From the properties of determinants, we see that
Lemma 3.2. If $A$ is an orthogonal matrix, then $\det(A) = \pm 1$.
Definition 3.3. $O(n)$ is the set of all orthogonal $n \times n$ matrices. $SO(n)$ is the set
of all orthogonal $n \times n$ matrices of determinant 1.
We leave it as an exercise to show
Theorem 3.4. $O(n)$ is a subgroup of $GL(n, \mathbb{R})$. $SO(n)$ is a normal subgroup of
index 2 in $O(n)$.
We can now complete our discussion of isometries of Rn by proving the following
analogue of Theorem 11.9 in the Math 4580 notes.
Theorem 3.5. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be an isometry. Let $f(0, 0, \dots, 0) = a =
(a_1, a_2, \dots, a_n)$. Then $f = T_a \circ R$, where $T_a$ is the translation operator by the
vector $a$, and $R$ is an orthogonal operator.
To prove Theorem 3.5, we need the following lemma, which is analogous to
Lemma 11.1 in the Math 4580 notes.
Lemma 3.6. (a) Let $a = (a_1, a_2, \dots, a_n)$ and $b = (b_1, b_2, \dots, b_n)$ be two points in
$\mathbb{R}^n$ which are equidistant from each of the points
\[ e_0 = (0, 0, \dots, 0),\ e_1 = (1, 0, \dots, 0),\ e_2 = (0, 1, 0, \dots, 0),\ \dots,\ e_n = (0, 0, \dots, 0, 1). \]
Then $a = b$.
(b) Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be an isometry fixing each of the points
\[ e_0 = (0, 0, \dots, 0),\ e_1 = (1, 0, \dots, 0),\ e_2 = (0, 1, 0, \dots, 0),\ \dots,\ e_n = (0, 0, \dots, 0, 1). \]
Then $f = I$, the identity map.


Proof. This is Homework Exercise 11.
Proof of Theorem 3.5. Let $g = T_{-a} \circ f = T_a^{-1} \circ f$. Then $g(e_0) = e_0$. Let
\[ g(e_1) = a_1,\ g(e_2) = a_2,\ \dots,\ g(e_n) = a_n. \]
Since $g$ is an isometry fixing $(0, 0, \dots, 0)$, and since $B := \{e_1, e_2, \dots, e_n\}$ is the
standard orthonormal basis for $\mathbb{R}^n$, we have that $\{g(e_1), g(e_2), \dots, g(e_n)\}$ is likewise
an orthonormal basis for $\mathbb{R}^n$. Let $R : \mathbb{R}^n \to \mathbb{R}^n$ be the linear operator whose matrix
$A$ with respect to the basis $B$ has columns $g(e_1), g(e_2), \dots, g(e_n)$. Then $A$ is an
orthogonal matrix, and so $R$ is an orthogonal operator. Moreover $g(e_i) = R(e_i)$
for all $i$, $0 \le i \le n$. Hence, $R^{-1} \circ g$ is an isometry of $\mathbb{R}^n$ fixing $e_i$ for all $i$. Thus,
by Lemma 3.6(b), $R^{-1} \circ g = I$, and so $g = R$ is an orthogonal operator. Since
$g = T_a^{-1} \circ f$, we conclude that $f = T_a \circ R$, as claimed.
We have the following immediate corollary to Theorem 3.5.
Corollary 3.7. Every isometry of Rn is an invertible function.
There is another important subgroup of O(n).

Definition 3.8. We call a matrix $P$ a permutation matrix if each row and each
column has exactly one entry equal to 1 and every other entry equal to 0. Equivalently,
every entry of $P$ is either 0 or 1, and the columns of $P$ form an orthonormal
basis for $\mathbb{R}^n$ (as do the rows). We let $P_n$ denote the set of all $n \times n$ permutation
matrices.
Theorem 3.9. Let $\{e_1 = (1, 0, \dots, 0), e_2 = (0, 1, 0, \dots, 0), \dots, e_n = (0, 0, \dots, 0, 1)\}$
be the standard orthonormal basis for $\mathbb{R}^n$. For each permutation $\sigma \in S_n$, let $\hat{\sigma}$ be
the unique linear operator on $\mathbb{R}^n$ such that
\[ \hat{\sigma}(e_i) = e_{\sigma(i)} \]
for all $i$, $1 \le i \le n$. Let $\tilde{\sigma}$ be the matrix representing the linear operator $\hat{\sigma}$ with
respect to the standard basis for $\mathbb{R}^n$. Then $\tilde{\sigma}$ is a permutation matrix, and the
function $\Theta : S_n \to P_n$ by
\[ \Theta(\sigma) = \tilde{\sigma} \]
is an isomorphism of groups.
Proof. Clearly $\Theta$ is a bijection. Also
\[ \Theta(\sigma \circ \tau)(e_i) = e_{(\sigma \circ \tau)(i)} = e_{\sigma(\tau(i))}, \]
while
\[ (\Theta(\sigma) \cdot \Theta(\tau))(e_i) = \Theta(\sigma)(e_{\tau(i)}) = e_{\sigma(\tau(i))} = \Theta(\sigma \circ \tau)(e_i). \]
Hence $\Theta$ is an isomorphism of groups.
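The correspondence in Theorem 3.9 can be tested by brute force for small $n$. In the sketch below, permutations are stored as tuples acting on $\{0, 1, \dots, n-1\}$ (a 0-indexed convention we adopt for convenience), and `perm_matrix`, `compose` and `matmul` are hypothetical helper names of our own. The check covers only $n = 3$.

```python
from itertools import permutations


def perm_matrix(sigma):
    """Matrix whose column c is the standard basis vector e_{sigma(c)}."""
    n = len(sigma)
    return [[1 if r == sigma[c] else 0 for c in range(n)] for r in range(n)]


def compose(sigma, tau):
    """The permutation sigma o tau (apply tau first, then sigma)."""
    return tuple(sigma[tau[i]] for i in range(len(tau)))


def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


S3 = list(permutations(range(3)))

# Theta(sigma o tau) should equal Theta(sigma) * Theta(tau) for every pair.
homomorphism_ok = all(
    perm_matrix(compose(s, t)) == matmul(perm_matrix(s), perm_matrix(t))
    for s in S3 for t in S3
)

# Distinct permutations should give distinct matrices (injectivity).
distinct_images = len({tuple(map(tuple, perm_matrix(s))) for s in S3})

print(homomorphism_ok, distinct_images)
```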
There is a larger interesting subgroup of $O(n)$, the Weyl group $W(n)$.
Definition 3.10. We call a matrix $M$ a signed permutation matrix if each row
and each column has exactly one entry equal either to $1$ or to $-1$, and all other
entries equal to 0. Equivalently, every entry of $M$ is either $0$, $1$, or $-1$, and the
columns of $M$ form an orthonormal basis for $\mathbb{R}^n$ (as do the rows). We let $W(n)$
denote the set of all $n \times n$ signed permutation matrices.
Theorem 3.11. $W(n)$ is a subgroup of $O(n)$ of cardinality $2^n \cdot n!$.
Proof. Clearly, for each permutation matrix $\tilde{\sigma}$, there are $2^n$ signed permutation
matrices whose non-zero entries are in the same location as the 1 entries of $\tilde{\sigma}$.
Hence the set $W(n)$ has cardinality $2^n \cdot n!$, and clearly, by definition, $W(n) \subseteq O(n)$.
Consider the set $Z$ of all matrices in $O(n)$ with integer entries. Clearly, the
product of any two such matrices is another such matrix. Also, if $A \in O(n)$, then
$A^{-1} = A^T$. Hence if $A$ has integer entries, then so does $A^{-1}$. As $I$ has integer
entries, $Z$ is a subgroup of $O(n)$. But now let
\[ \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} \]
be a column of some matrix $A$ in $Z$. Then this column vector is a unit vector. So
\[ a_1^2 + a_2^2 + \cdots + a_n^2 = 1. \]
Since each $a_i \in \mathbb{Z}$, exactly one $a_i = \pm 1$ and the other $a_j$'s are all 0. Moreover
the columns of $A$ form an orthonormal set. So $Z = W(n)$, proving that $W(n)$ is a
subgroup of $O(n)$.
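For small $n$, the count $2^n \cdot n!$ and the closure of $W(n)$ under multiplication can be confirmed by enumeration. The following Python sketch does this for $n = 3$; the function names `signed_perm_matrices` and `matmul` are our own, and matrices are stored as tuples of rows so that they can be placed in a set.

```python
from itertools import permutations, product


def signed_perm_matrices(n):
    """All n x n signed permutation matrices, as tuples of row tuples."""
    mats = set()
    for sigma in permutations(range(n)):
        for signs in product((1, -1), repeat=n):
            # Column c has its single non-zero entry, signs[c], in row sigma[c].
            mats.add(tuple(
                tuple(signs[c] if r == sigma[c] else 0 for c in range(n))
                for r in range(n)
            ))
    return mats


def matmul(X, Y):
    """Product of two square matrices stored as tuples of rows."""
    n = len(X)
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )


W3 = signed_perm_matrices(3)

# Closure: the product of any two elements of W(3) lies in W(3) again.
closed = all(matmul(X, Y) in W3 for X in W3 for Y in W3)

print(len(W3), closed)
```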
We will use permutation matrices and signed permutation matrices in the next
chapter when we study the symmetry group of the regular tetrahedron. But now
we return to general properties of orthogonal matrices. There are very limited
possibilities for the real eigenvalues of orthogonal matrices.
Lemma 3.12. Let $A \in O(n)$ and let $\lambda$ be a (real) eigenvalue of $A$. Then $\lambda = \pm 1$.
Proof. Let $u$ be an eigenvector for $A$ with eigenvalue $\lambda$. Then
\[ u^T u = (Au)^T (Au) = (\lambda \cdot u^T)(\lambda \cdot u) = \lambda^2 \cdot (u^T u). \]
Since $u \neq 0$, $u^T u \neq 0$ and so $\lambda^2 = 1$. Hence, $\lambda = \pm 1$, as claimed.
Once again, we restrict attention to the 3-dimensional case. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be
a linear isometry. Since the characteristic polynomial $c_T(x)$ is a cubic polynomial,
$c_T(x)$ has at least one real root $\epsilon$, and by Lemma 3.12, $\epsilon = \pm 1$. Let $u$ be a unit
eigenvector for $T$ with eigenvalue $\epsilon$. Since $T$ is an isometry,
\[ u^\perp := \{v \in \mathbb{R}^3 : u \cdot v = 0\} \]
is a $T$-invariant plane in $\mathbb{R}^3$ (with normal vector $u$). Since $T : u^\perp \to u^\perp$ is an
isometry, $T$ acts as either a rotation or a reflection on the plane $u^\perp$. Hence there is
an orthonormal basis $\{v, w\}$ for the plane $u^\perp$ such that the matrix for the $T$-action
on $u^\perp$ with respect to this basis is:
\[ \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix} \]
for some angle $\theta$, $0 \le \theta < 2\pi$.
Thus the matrix for $T$ with respect to the orthonormal basis $\{u, v, w\}$ for $\mathbb{R}^3$ is:
\[ \begin{pmatrix} \epsilon & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} \epsilon & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & \sin\theta & -\cos\theta \end{pmatrix}. \]
In the case where $T$ acts as a reflection on the plane $u^\perp$, we can make a special
choice of orthonormal basis for $u^\perp$: Take $v'$ to be a unit vector along the reflecting
mirror for $T$, and take $w'$ to be a unit vector perpendicular to $v'$. Then
\[ T(v') = v' \quad \text{and} \quad T(w') = -w'. \]
Hence, with respect to the orthonormal basis $\{u, v', w'\}$ for $\mathbb{R}^3$, the matrix for $T$ is:
\[ \begin{pmatrix} \epsilon & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}. \]
If $T \in SO(3)$ and the restriction of $T$ to $u^\perp$ is a reflection, then $1 = \det(T) = -\epsilon$.
Hence $\epsilon = -1$ and we see that $T$ is a $180^\circ$ rotation of $\mathbb{R}^3$ around the axis determined
by the eigenvector $v'$. On the other hand, if $T \in SO(3)$ and the restriction of $T$ to
$u^\perp$ is a rotation, then $1 = \det(T) = \epsilon$, and $T$ is a rotation of $\mathbb{R}^3$ through an angle $\theta$
about the axis determined by the eigenvector $u$. Thus we have the following result.
Theorem 3.13. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be a linear isometry with $\det(T) = 1$. Then $T$
is a rotation of $\mathbb{R}^3$ about an axis $\mathbb{R} \cdot u$, and there is an angle $\theta$ and an orthonormal
basis for $\mathbb{R}^3$ such that the matrix for $T$ with respect to this basis is
\[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}. \]
Moreover, if $T$ is not the identity operator, then the line $\mathbb{R} \cdot u$ is the unique line of
eigenvectors with eigenvalue 1 for $T$, i.e. if $v \in \mathbb{R}^3$ with $T(v) = v$, then $v = c \cdot u$
for some scalar $c$.
Note that if $T \in SO(3)$ and $T \neq I$, then $\mathbb{R} \cdot u$ is the unique eigenline for $T$, unless
$T$ induces a $180^\circ$ rotation of $u^\perp$, in which case $u^\perp$ is the $(-1)$-eigenspace for $T$.
We now have a rather complete description of linear operators T 2 SO(3). In
Chapter 7, we will use this knowledge in combination with some basic facts about
permutation groups, including the lovely Orbit Counting Formula from Chapter 6,
to describe the finite subgroups of SO(3).
In Math 4507, you study the group of isometries of Spherical Geometry. This is
precisely the group O(3) of all isometries of the 2-sphere S 2 in R3 , with the metric
induced from R3 . You also study the group of isometries of Hyperbolic Geometry.
This group is isomorphic to
0 1 0 1
1 0 0 1 0 0
H := {A 2 GL(3, R) : AT · @ 0 1 0 A · A = @ 0 1 0 A .}
0 0 1 0 0 1

Exercises
1. Prove: Let $A$ be an $n \times n$ matrix such that $u^T A v = u^T v$ for all vectors $u$,
$v \in \mathbb{R}^n$. Then $A = I$, the identity matrix. [Hint: Let $e_i$ be the unit vector with
0 in the $j$-th entry for all $j \neq i$ and with 1 in the $i$-th entry. Compute $e_i^T A e_j$ and
compare with $e_i^T e_j$.]
2. In this exercise, you may use basic properties of the dot product in $\mathbb{R}^n$. Let
$u$ and $v$ be vectors in $\mathbb{R}^n$.
(a) Prove: $u \cdot v = \frac{1}{2}(\|u + v\|^2 - \|u\|^2 - \|v\|^2)$.
(b) Prove: Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear isometry. Then $T(u) \cdot T(v) = u \cdot v$ for
all $u, v \in \mathbb{R}^n$.
(c) Prove: Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be a linear isometry. Suppose that $u \in \mathbb{R}^3$ is an
eigenvector for $T$. Then the plane
\[ u^\perp := \{v \in \mathbb{R}^3 : u \cdot v = 0\} \]
is a $T$-invariant subspace of $\mathbb{R}^3$.
3. Using Determinant Theorems 2.12, prove: If $A$ is an orthogonal matrix, then
$\det(A) = \pm 1$.

4. Prove Theorem 3.4: $O(n)$ is a subgroup of $GL(n, \mathbb{R})$, and $SO(n)$ is a normal
subgroup of index 2 in $O(n)$.
5. Let $D(n)$ be the set of all $n \times n$ diagonal matrices (i.e., every entry off the
main diagonal is 0) such that every diagonal entry is $\pm 1$.
(a) Prove: $W(n) = D(n) \cdot P_n = \{DP : D \in D(n), P \in P_n\}$.
(b) Prove: $D(n)$ is a normal subgroup of $W(n)$.
6. Prove: Let $R : \mathbb{R}^3 \to \mathbb{R}^3$ be a linear operator. Then $R$ is a reflection across a
plane $\Pi$ through $(0, 0, 0)$ if and only if the following three conditions hold:
(1) $R$ has a 1-dimensional eigenspace $U$ with eigenvalue $-1$;
(2) $R$ has a 2-dimensional eigenspace $W$ with eigenvalue $1$; and
(3) $U \perp W$.

7. (a) Give an example to show that not every linear isometry of $\mathbb{R}^3$ is a rotation
or a reflection.
(b) Prove: If $T : \mathbb{R}^3 \to \mathbb{R}^3$ is a linear isometry, then either $T$ is a rotation or
$T = R \circ \rho$, where $R$ is a reflection and $\rho$ is a rotation (possibly the identity rotation).
8. Prove the following theorem of Euler: Let $f : \mathbb{R}^3 \to \mathbb{R}^3$ be an isometry (not
necessarily linear). Then either $f = T_v \circ \rho$ or $f = T_v \circ R \circ \rho$, where $T_v : \mathbb{R}^3 \to \mathbb{R}^3$
is the translation by the vector $v \in \mathbb{R}^3$, $\rho$ is a rotation about $(0, 0, 0)$, and $R$ is a
reflection across a plane passing through $(0, 0, 0)$.
9. Prove the following theorems of Euler.
(a) Let $f = T_v \circ \rho$ be an isometry, using the notation of Exercise 8, with $\rho$ not
the identity rotation. Let $u$ be an eigenvector for $\rho$ with eigenvalue 1. Then $f$ is a
rotation about some point in $\mathbb{R}^3$ if and only if $u \cdot v = 0$, i.e., $v$ lies in the plane $u^\perp$.
Moreover, if $f$ is a rotation, then the axis of rotation is parallel to the vector $u$.
(b) Let $f = T_v \circ \rho$ be as in (a). Write $v = u_1 + w_1$, where $u_1$ is the orthogonal
projection of $v$ onto the line through $u$, and $w_1$ is the orthogonal projection of $v$
into the plane $u^\perp$. Then
\[ f = T_{u_1} \circ (T_{w_1} \circ \rho) = T_{u_1} \circ \rho_1, \]
where $\rho_1 := T_{w_1} \circ \rho$ is a rotation about an axis parallel to the vector $u$, and $T_{u_1}$
is a translation by the vector $u_1$, which is also parallel to the vector $u$. [Note: If
$u_1 \neq 0$, then $f$ is a screw motion along the axis of the rotation $\rho_1$.]
10. Prove: The set $H$ defined at the end of this chapter is a subgroup of $GL(3, \mathbb{R})$.

Chapter 4: Permutations, Orbits, and Lagrange’s Theorem


We now return to the topic of group theory, in particular to the study of groups of
permutations and groups of isometries. Armed with the tools of linear algebra, we
shall be equipped to study the isometries of R3 , and in particular, the symmetries
of the Platonic solids. First, however, we need to study the theory of permutation
groups more deeply. The highlight of this chapter will be the proof of Lagrange’s
Theorem, which most people would point to as the beginning of the history of group
theory.
Consider the equilateral triangle $\triangle ABC$ and the rotation $\rho$ counterclockwise
about its center through an angle of $120^\circ$. We can think of $\rho$ as a permutation of
the vertex set $\{A, B, C\}$:
\[ \rho(A) = B,\ \rho(B) = C,\ \rho(C) = A. \]
If, instead of an equilateral triangle, the figure were a regular octagon and $\rho$ were a
$45^\circ$ rotation about its center, then defining $\rho$ as a function on the vertex set would
require an enumeration of eight function values:
\[ \rho(A) = B,\ \rho(B) = C,\ \rho(C) = D,\ \rho(D) = E,\ \rho(E) = F,\ \rho(F) = G,\ \rho(G) = H,\ \rho(H) = A. \]

This can become tedious. Cauchy devised a somewhat more efficient notation for
these functions, called cycle notation. It is based on the following principle.
Definition 4.1. Let $H$ be a group of permutations of a set $X$. The $H$-orbit
containing the point $x$ is the set $x^H := \{h(x) : h \in H\}$.
Definition 4.2. Let $\sigma$ be a permutation of the set $X$. Recall that the cyclic group
generated by $\sigma$ is the set $\langle \sigma \rangle := \{\sigma^i : i \in \mathbb{Z}\}$. We write $x^\sigma$ for $x^{\langle \sigma \rangle}$, and call it
the $\sigma$-orbit containing the point $x$.
We shall be particularly interested in the case where $\sigma$ has finite order.

Examples
1. If $r$ is a reflection across a line in $\mathbb{R}^2$, then $r$ has order 2, since $r^2 = r \circ r = I$,
but $r \neq I$.
2. If $\rho$ is a $120^\circ$ rotation of $\mathbb{R}^2$ about the point $P$, then $\rho$ has order 3, since
$\rho^3 = I$, but $\rho \neq I \neq \rho^2$.
3. If $\rho$ is a $45^\circ$ rotation of $\mathbb{R}^2$ about the point $P$, then $\rho$ has order 8.


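Lemma 4.3. Let $\sigma \in G$ be an element of order $m$. Let $n \in \mathbb{Z}$. Write
\[ n = q \cdot m + r \]
with $q, r \in \mathbb{Z}$, and $0 \le r < m$, as given by the Division Algorithm. Then $\sigma^n = \sigma^r$.
Hence
\[ \langle \sigma \rangle = \{I = \sigma^0, \sigma, \sigma^2, \dots, \sigma^{m-1}\}. \]
Proof.
\[ \sigma^n = \sigma^{q \cdot m + r} = (\sigma^m)^q \circ \sigma^r = I^q \circ \sigma^r = \sigma^r. \]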


Now suppose that $\sigma$ is a permutation of the set $X$ and $\sigma$ has order $m$. Let $x$ be
a point of $X$. Then we can enumerate the elements of $x^\sigma$ as:
\[ x^\sigma = \{x, \sigma(x), \sigma^2(x), \dots, \sigma^{m-1}(x)\}. \]
You might be tempted to guess that the $\sigma$-orbit containing $x$ always has cardinality
$m$, where $m$ is the order of the permutation $\sigma$. But it is easy to see that this
is not the case. For example, if $X = \mathbb{R}^2$ and $\rho$ is a $120^\circ$ rotation about the point $P$,
then $\rho$ is a permutation of the set $X$ and $\rho$ has order 3, but the $\rho$-orbit containing
the point $P$ is simply $\{P\}$, since $\rho(P) = \rho^2(P) = P$. So there can be repetitions in
the set
\[ \{x, \sigma(x), \sigma^2(x), \dots, \sigma^{m-1}(x)\}. \]
A more interesting example is the following one:
Let $X = \{1, 2, 3, 4, 5\}$. Let $\sigma : X \to X$ be the permutation defined by:
\[ \sigma(1) = 2,\ \sigma(2) = 3,\ \sigma(3) = 1,\ \sigma(4) = 5,\ \sigma(5) = 4. \]
It is easy to check that $\sigma^i(1) = 1$ if and only if $i$ is a multiple of 3, and $\sigma^i(4) = 4$
if and only if $i$ is even. It is then not hard to conclude that $\sigma$ has order 6. But
\[ 1^\sigma = \{1, \sigma(1), \sigma^2(1), \sigma^3(1), \sigma^4(1), \sigma^5(1)\} = \{1, 2, 3, 1, 2, 3\} = \{1, 2, 3\}. \]
We cycle through the same numbers twice. Similarly,
\[ 4^\sigma = \{4, \sigma(4), \sigma^2(4), \sigma^3(4), \sigma^4(4), \sigma^5(4)\} = \{4, 5, 4, 5, 4, 5\} = \{4, 5\}. \]
This time we cycle through the same numbers three times. So the $\sigma$-orbits on $X$
have cardinality 3 and 2, but $\sigma$ has order 6.
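The orbit and order computations in this example are easy to mechanize. The Python sketch below stores the permutation as a dictionary and uses two small helper functions, `orbit` and `order`, whose names and design are our own choices rather than anything defined in these notes.

```python
# The permutation of {1, ..., 5} from the example, stored as a dictionary.
sigma = {1: 2, 2: 3, 3: 1, 4: 5, 5: 4}


def orbit(perm, x):
    """List x, perm(x), perm^2(x), ... until the sequence returns to x."""
    seen = [x]
    y = perm[x]
    while y != x:
        seen.append(y)
        y = perm[y]
    return seen


def order(perm):
    """Smallest m >= 1 such that perm^m is the identity."""
    identity = {x: x for x in perm}
    power, m = dict(perm), 1
    while power != identity:
        power = {x: perm[power[x]] for x in perm}  # multiply by perm once more
        m += 1
    return m


print(orbit(sigma, 1))  # the orbit containing 1
print(orbit(sigma, 4))  # the orbit containing 4
print(order(sigma))     # the order of sigma
```

Both orbit sizes divide the order, as the discussion that follows predicts.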
In general, if $O$ is a $\sigma$-orbit on the set $X$, we may define the permutation $\sigma_O :
O \to O$ to be the function $\sigma$ with its domain restricted to $O$. If
\[ O = \{x, \sigma(x), \sigma^2(x), \dots, \sigma^{k-1}(x)\} \]
with $\sigma^k(x) = x$, but $\sigma^i(x) \neq x$ for all $i$, $1 \le i < k$, then clearly $\sigma_O^i \neq I$ for $1 \le i < k$,
but $\sigma_O^k = I$, since
\[ \sigma_O^k(\sigma^i(x)) = \sigma^k(\sigma^i(x)) = \sigma^i(\sigma^k(x)) = \sigma^i(x) \]
for all $i$. So, $\sigma_O$ has order $k = |O|$. Since $\sigma_O^m = I$ (because $\sigma^m = I$), it follows, by Exercise 3b
of Chapter 5, that

If $O$ is a $\sigma$-orbit on $X$, then $|O|$ divides $m = |\langle \sigma \rangle|$.

This is a special case of the famous Theorem of Lagrange, which we will prove
later in this chapter.
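Going back to the example $\sigma : \{1, 2, 3, 4, 5\} \to \{1, 2, 3, 4, 5\}$ defined by
\[ \sigma(1) = 2,\ \sigma(2) = 3,\ \sigma(3) = 1,\ \sigma(4) = 5,\ \sigma(5) = 4, \]
we see that
\[ 1^\sigma = 2^\sigma = 3^\sigma, \quad \text{and} \quad 4^\sigma = 5^\sigma. \]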

Lemma 4.4. Let $H$ be a group of permutations of the set $X$. Define a relation on
$X$ by
\[ x \equiv_H y \text{ if and only if } x \in y^H. \]
Then $\equiv_H$ is an equivalence relation. The equivalence classes are the $H$-orbits on
$X$.
Proof. We must check the three properties of an equivalence relation.
Reflexivity: Let $x \in X$. Since $I \in H$, $x = I(x) \in x^H$. So $x \equiv_H x$.
Symmetry: Suppose $x \equiv_H y$. Then $x \in y^H$. Hence, there exists $h \in H$ with
$h(y) = x$. Then $h^{-1} \in H$ and $h^{-1}(x) = y$. So $y \in x^H$, i.e., $y \equiv_H x$.
Transitivity: Suppose $x \equiv_H y$ and $y \equiv_H z$. Then there exist $h, h' \in H$ with
$h(y) = x$ and $h'(z) = y$. Then $h \circ h' \in H$ and
\[ (h \circ h')(z) = h(h'(z)) = h(y) = x. \]
Hence $x \in z^H$, i.e., $x \equiv_H z$.

From the basic properties of equivalence relations and partitions, we obtain the
following corollary.
Corollary 4.5. If $O$ and $O_1$ are two $H$-orbits on $X$, then either $O = O_1$ or
$O \cap O_1 = \emptyset$.
Rather than defining Cauchy cycle notation, we shall illustrate it by some ex-
amples.
1. Let $\rho$ be the $45^\circ$ rotation of the regular octagon mentioned before, regarded
as the following function on the set of vertices:
\[ \rho(A) = B,\ \rho(B) = C,\ \rho(C) = D,\ \rho(D) = E,\ \rho(E) = F,\ \rho(F) = G,\ \rho(G) = H,\ \rho(H) = A. \]
In Cauchy notation, we write:
\[ \rho = (A, B, C, D, E, F, G, H). \]
This signifies that $\rho$ maps the first entry in the 8-tuple, $A$, to the second entry, $B$;
the second entry $B$ to the third entry $C$, . . . , and finally, $\rho$ maps the last entry
$H$ back to the first entry $A$. The 8-tuple $(A, B, C, D, E, F, G, H)$ is called a cycle.
The symbols in a cycle comprise all of the elements in some $\rho$-orbit. In this case,
there is only one $\rho$-orbit.
We can compute powers of a permutation by leap-frogging in the cycle. For
example, suppose we wish to compute the cycle structure of the $135^\circ$ rotation
$\rho^3$. We have $\rho(A) = B$, $\rho^2(A) = \rho(B) = C$, and $\rho^3(A) = \rho(\rho^2(A)) = \rho(C) = D$. So we
leapfrog over two symbols in the cycle:
\[ \rho^3 = (A, D, G, B, E, H, C, F). \]
Similarly we can compute the $90^\circ$ rotation $\rho^2$ by leapfrogging over one symbol each
time. But now there are two $\rho^2$-orbits, not just one:
\[ \rho^2 = (A, C, E, G)(B, D, F, H). \]
To compute $\rho^{-1}$, we may simply write the cycle for $\rho$ backwards:
\[ \rho^{-1} = (H, G, F, E, D, C, B, A), \]
or, if you prefer to start with $A$:
\[ \rho^{-1} = (A, H, G, F, E, D, C, B). \]
This illustrates the important fact that there are many different Cauchy cycle structures
representing exactly the same permutation.
We can compute $\rho^4$ by leapfrogging over three symbols at a time in the cycle for
$\rho$, but we can also compute it by squaring each cycle in $\rho^2$:
\[ \rho^4 = (A, E)(C, G)(B, F)(D, H). \]
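The leapfrogging computations for the octagon rotation can be reproduced mechanically. In this Python sketch the rotation is a dictionary on the vertex labels, and `power` and `cycles` are helper names we introduce for illustration. Since a permutation has many equivalent cycle notations, the program's output may list the cycles in a different order than the text does.

```python
labels = "ABCDEFGH"
# The 45-degree rotation: each vertex goes to the next one, H goes back to A.
rho = {labels[i]: labels[(i + 1) % 8] for i in range(8)}


def power(perm, k):
    """The k-fold composite perm^k for k >= 0."""
    result = {x: x for x in perm}
    for _ in range(k):
        result = {x: perm[result[x]] for x in perm}
    return result


def cycles(perm):
    """Cycle decomposition, each cycle started at its first unseen symbol."""
    seen, out = set(), []
    for x in perm:
        if x not in seen:
            cyc = [x]
            seen.add(x)
            y = perm[x]
            while y != x:
                cyc.append(y)
                seen.add(y)
                y = perm[y]
            out.append(tuple(cyc))
    return out


print(cycles(power(rho, 3)))  # one 8-cycle, leapfrogging over two symbols
print(cycles(power(rho, 2)))  # two 4-cycles
print(cycles(power(rho, 4)))  # four 2-cycles
print(cycles(power(rho, 7)))  # rho^7 = rho^(-1), the cycle written backwards
```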

2. Now let $\sigma : \{1, 2, 3, 4, 5\} \to \{1, 2, 3, 4, 5\}$ be the function discussed above:
\[ \sigma(1) = 2,\ \sigma(2) = 3,\ \sigma(3) = 1,\ \sigma(4) = 5,\ \sigma(5) = 4. \]
In Cauchy cycle notation,
\[ \sigma = (1, 2, 3)(4, 5). \]
Then
\[ \sigma^2 = (1, 3, 2)(4)(5), \]
\[ \sigma^3 = (1)(2)(3)(4, 5), \]
\[ \sigma^4 = (1, 2, 3)(4)(5), \]
\[ \sigma^5 = (1, 3, 2)(4, 5), \]
\[ \sigma^6 = I = (1)(2)(3)(4)(5). \]
As a further simplification of Cauchy notation, it is common, when the domain $X$ is
unambiguous, to omit cycles (orbits) of length 1. But, for the identity permutation,
it is customary to write $(1)$, rather than $\emptyset$. Hence, in this example, we would write:
\[ \sigma = (1, 2, 3)(4, 5), \]
\[ \sigma^2 = (1, 3, 2), \]
\[ \sigma^3 = (4, 5), \]
\[ \sigma^4 = (1, 2, 3), \]
\[ \sigma^5 = (1, 3, 2)(4, 5), \]
\[ \sigma^6 = (1). \]
This clearly verifies the assertion made earlier that $\sigma$ has order 6. Indeed, it is easy
to see that the following assertion is true.

Lemma 4.6. If the permutation $\sigma$ is a single cycle of length $k$, then the order of
$\sigma$ is $k$. If the permutation $\sigma$ is the product of disjoint cycles of lengths $k_1, k_2, \dots, k_m$,
then the order of $\sigma$ is the least common multiple of $\{k_1, k_2, \dots, k_m\}$.
The fact that Cauchy cycle notation gives a valid representation of a permutation
$\sigma$ is a consequence of the fact that the $\sigma$-orbits on $X$ form a disjoint partition of
the set $X$. Hence each element of the domain $X$ appears in one and only one cycle.
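Lemma 4.6 can be checked exhaustively on a small symmetric group. The sketch below compares the least common multiple of the cycle lengths with the order found by repeated multiplication, for every permutation of a 5-element set. Permutations are tuples on $\{0, \dots, 4\}$ (our 0-indexed convention), and all function names are our own.

```python
from functools import reduce
from itertools import permutations
from math import gcd


def cycle_lengths(perm):
    """Lengths of the disjoint cycles of a permutation of {0, ..., n-1}."""
    seen, lengths = set(), []
    for x in range(len(perm)):
        if x not in seen:
            length, y = 0, x
            while y not in seen:
                seen.add(y)
                y = perm[y]
                length += 1
            lengths.append(length)
    return lengths


def lcm(a, b):
    return a * b // gcd(a, b)


def order_by_lcm(perm):
    """Order predicted by Lemma 4.6."""
    return reduce(lcm, cycle_lengths(perm), 1)


def order_by_powers(perm):
    """Order found directly, by multiplying until the identity appears."""
    identity = tuple(range(len(perm)))
    power, m = perm, 1
    while power != identity:
        power = tuple(perm[power[i]] for i in range(len(perm)))
        m += 1
    return m


agree = all(order_by_lcm(p) == order_by_powers(p)
            for p in permutations(range(5)))
print(agree)
```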
With some care, we can use the Cauchy notation to multiply permutations.
Thus, suppose we have the permutations $\rho = (1, 2, 3)(4, 5)$ and $\sigma = (1, 3, 4, 2)$ of
$X = \{1, 2, 3, 4, 5\}$. To compute $\rho \circ \sigma$, we work from right to left in the cycles:
\[ \rho \circ \sigma = (1, 2, 3)(4, 5)(1, 3, 4, 2). \]
Thus, in the rightmost cycle, 1 goes to 3. Then moving to the left, we see that 3
goes to 1. So the net effect is that 1 stays fixed. Next, in the rightmost cycle, 2
goes to 1. Then in the leftmost cycle, 1 goes to 2. Again, the net effect is that 2 is
fixed. Next, in the rightmost cycle, 3 goes to 4. Then in the next cycle to the left,
4 goes to 5. So, 3 goes to 5. Likewise, 4 goes to 2, which goes to 3. So 4 goes to 3.
Finally, 5 goes to 4. Hence we conclude that
\[ \rho \circ \sigma = (1)(2)(3, 5, 4) = (3, 5, 4). \]
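The right-to-left bookkeeping above is exactly function composition, so it can be checked in a few lines of Python. Here `from_cycles` and `compose` are hypothetical helper names of our own; `from_cycles` assumes the cycles it is given are disjoint.

```python
def from_cycles(cycle_list, domain):
    """Build a permutation (as a dict) from a list of disjoint cycles."""
    perm = {x: x for x in domain}
    for cyc in cycle_list:
        for i, x in enumerate(cyc):
            perm[x] = cyc[(i + 1) % len(cyc)]  # each entry maps to the next
    return perm


def compose(f, g):
    """The composite f o g: apply g first, then f."""
    return {x: f[g[x]] for x in g}


X = range(1, 6)
rho = from_cycles([(1, 2, 3), (4, 5)], X)
sigma = from_cycles([(1, 3, 4, 2)], X)

rho_sigma = compose(rho, sigma)
print(rho_sigma)
print(rho_sigma == from_cycles([(3, 5, 4)], X))
```

Note that `compose(sigma, rho)` generally gives a different permutation, since composition is not commutative.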

In the exercises, you will be asked to practice your skills at the “calculus of
permutations”. Practice makes perfect.
The multiset of numbers listing the lengths of the disjoint cycles (orbits) for a
permutation $\sigma$ is called the cycle structure of $\sigma$. For example, the cycle structure
of the permutation $(1, 2, 3)(4, 5)(6, 7)(8)(9)(10) = (1, 2, 3)(4, 5)(6, 7)$ in $S_{10}$ is
$\{3, 2, 2, 1, 1, 1\}$. The following fact is fundamental.
Lemma 4.7. Let $H$ be a subgroup of $\mathrm{Sym}(X)$ and let $\tau$ be a permutation of the
set $X$. Let $O$ be an $H$-orbit on $X$. Then $\tau(O)$ is a $\tau \circ H \circ \tau^{-1}$-orbit on $X$ of the
same length as $O$. In particular, if $H = \tau \circ H \circ \tau^{-1}$, then $O$ and $\tau(O)$ are two
$H$-orbits of the same length. They may or may not be the same orbit.
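Proof. Let $x$ and $y$ be in $\tau(O)$. Then there exist $a$ and $b$ in $O$ with $\tau(a) = x$,
$\tau(b) = y$. Since $a$ and $b$ are in $O$, $b = \sigma(a)$ for some $\sigma \in H$. Then
\[ (\tau \circ \sigma \circ \tau^{-1})(x) = (\tau \circ \sigma)(a) = \tau(b) = y. \]
Hence $x$ and $y$ are in the same $\tau \circ H \circ \tau^{-1}$-orbit.
On the other hand, let $a \in O$ and $x = \tau(a)$. Suppose that $y$ is in the same
$\tau \circ H \circ \tau^{-1}$-orbit as $x$. Then, for some $\sigma \in H$,
\[ y = (\tau \circ \sigma \circ \tau^{-1})(x) = \tau(\sigma(a)). \]
Since $\sigma(a) \in O$, $y \in \tau(O)$.
Hence $\tau(O)$ is the $\tau \circ H \circ \tau^{-1}$-orbit containing $x$. Since $\tau$ is a bijective map on
$X$, $|O| = |\tau(O)|$.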

As a corollary we obtain the following important fact.
Corollary 4.8. Let $\sigma$ and $\sigma_1$ be two permutations in $S_n$. Then $\sigma$ and $\sigma_1$ are in
the same $S_n$-conjugacy class if and only if they have the same cycle structure. Thus
the $S_n$-conjugacy classes are in one-to-one correspondence with the partitions of
$n$, i.e. the decompositions:
\[ n = n_1 + n_2 + \cdots + n_r, \]
with $n_1 \ge n_2 \ge \cdots \ge n_r > 0$, and all $n_i \in \mathbb{N}$.
Proof. Let $O_1, \dots, O_r$ be the $\sigma$-orbits on $\{1, 2, \dots, n\}$. If $\sigma_1 = \tau \circ \sigma \circ \tau^{-1}$, then by
Lemma 4.7, $\tau(O_1), \dots, \tau(O_r)$ are the $\sigma_1$-orbits on $\{1, 2, \dots, n\}$, and $|O_k| = |\tau(O_k)|$
for all $k$. Hence $\sigma$ and $\sigma_1$ have the same cycle structure.
Now suppose that $\sigma$ and $\sigma_1$ have the same cycle structure. Let $(a_1, a_2, \dots, a_t)$
be a cycle of $\sigma$ and $(b_1, b_2, \dots, b_t)$ be a cycle of $\sigma_1$. Let $\tau$ be a permutation in $S_n$
with $\tau(a_i) = b_i$, $1 \le i \le t$. Then
\[ (\tau \circ \sigma \circ \tau^{-1})(b_i) = (\tau \circ \sigma)(a_i) = \tau(a_{i+1}) = b_{i+1}. \]
(Here we understand $t + 1$ to be 1.) Hence $\tau \circ \sigma \circ \tau^{-1}$ and $\sigma_1$ both contain the cycle
$(b_1, b_2, \dots, b_t)$. Since cycles are disjoint and since $\sigma$ and $\sigma_1$ have the same cycle
structure, we can define $\tau$ cycle by cycle, so that $\tau \circ \sigma \circ \tau^{-1} = \sigma_1$, i.e., $\sigma$ and $\sigma_1$
are in the same $S_n$-conjugacy class.
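For a small case, Corollary 4.8 can be confirmed by computing the conjugacy classes of $S_4$ directly and comparing them with cycle structures. The Python sketch below uses tuples on $\{0, 1, 2, 3\}$ (a 0-indexed convention of ours) and helper names of our own choosing. It is a brute-force check for $n = 4$ only.

```python
from itertools import permutations


def compose(f, g):
    """The composite f o g (apply g first, then f)."""
    return tuple(f[g[i]] for i in range(len(g)))


def inverse(f):
    inv = [0] * len(f)
    for i, fi in enumerate(f):
        inv[fi] = i
    return tuple(inv)


def cycle_type(perm):
    """Cycle lengths in decreasing order, i.e. a partition of n."""
    seen, lengths = set(), []
    for x in range(len(perm)):
        if x not in seen:
            length, y = 0, x
            while y not in seen:
                seen.add(y)
                y = perm[y]
                length += 1
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


S4 = list(permutations(range(4)))


def conj_class(s):
    """The set of all tau o s o tau^(-1) with tau in S4."""
    return frozenset(compose(compose(t, s), inverse(t)) for t in S4)


classes = {conj_class(s) for s in S4}

# Every class should consist of permutations of a single cycle type ...
types_constant = all(len({cycle_type(s) for s in cls}) == 1 for cls in classes)
# ... and different classes should have different cycle types.
types_of_classes = {cycle_type(next(iter(cls))) for cls in classes}

print(len(classes), sorted(types_of_classes))
```

The five cycle types found are the five partitions of 4.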

We conclude this chapter with a proof of the most important basic theorem in the
theory of groups. It is also historically the first theorem in the theory of groups. It
was stated by Lagrange in 1771, almost 60 years before the word group was coined.
The striking nature of the result convinced mathematicians of the importance of
using groups to organize their thinking about permutations.
Lagrange's Orbit-Stabilizer Theorem. Let $G$ be a finite group of permutations
of the set $X$. Let $O$ be a $G$-orbit on the set $X$ containing the point $x$. Let
\[ G_x := \{g \in G : g(x) = x\}. \]
Then
\[ |G| = |O| \cdot |G_x|. \]
We call $G_x$ the stabilizer in $G$ of the point $x$. In order to prove Lagrange's
Theorem, we need a few remarks. The first is a special case of Theorem 10.11(1)
in the Math 4580 text.
Lemma 4.9. For any $x \in X$, $G_x$ is a subgroup of $G$.
Next comes the crucial observation.
Lemma 4.10. Let $g \in G$ and let $y = g(x)$. Then $y \in O$ and

$$g \circ G_x := \{g \circ h : h \in G_x\} = \{g' \in G : g'(x) = y\}.$$

Proof. By definition of orbits, if $y = g(x)$, then $y \in O$. Let $g \circ h \in g \circ G_x$. Then

$$(g \circ h)(x) = g(h(x)) = g(x) = y.$$

Hence

$$g \circ G_x \subseteq \{g' \in G : g'(x) = y\}.$$

Now let $g' \in G$ with $g'(x) = y = g(x)$. Then $(g^{-1} \circ g')(x) = x$. Hence $g^{-1} \circ g' = h$
for some $h \in G_x$. So $g' = g \circ h \in g \circ G_x$. Hence

$$\{g' \in G : g'(x) = y\} \subseteq g \circ G_x.$$

Hence

$$g \circ G_x = \{g' \in G : g'(x) = y\},$$

completing the proof.

Our next remark is essentially the same as Lemma 12.2 in the Math 4580 text,
but now for left cosets.
Lemma 4.11. Let $g \in G$. Then the function $\phi : G_x \to g \circ G_x$, defined by

$$\phi(h) = g \circ h$$

for all $h \in G_x$, is a bijection of sets. Hence $|G_x| = |g \circ G_x|$ for all $g \in G$.
Proof. Clearly, by definition of $g \circ G_x$, $\phi$ is surjective. Suppose that $\phi(h) = \phi(h')$.
Then $g \circ h = g \circ h'$, and so, by the Cancellation Law, $h = h'$. Hence $\phi$ is also
injective. So $\phi$ is bijective. Now the equality of cardinalities is immediate.

Now we can proceed to a proof of Lagrange’s Theorem.
Proof of Lagrange’s Theorem. Since $O = \{g(x) : g \in G\}$, $|O| \le |G| < \infty$. Let

$$O = \{x = x_1, x_2, \dots, x_m\}.$$

Thus $|O| = m$.
For every $g \in G$, $g(x) = x_i$ for a unique $i$, $1 \le i \le m$. Hence if we set

$$G_i = \{g \in G : g(x) = x_i\},$$

then

$$G = G_1 \cup G_2 \cup \cdots \cup G_m,$$

and

$$G_i \cap G_j = \emptyset \quad \text{for all } i \ne j.$$

Hence

$$|G| = |G_1| + |G_2| + \cdots + |G_m|.$$

By Lemma 4.10, $G_i = g_i \circ G_x$, where $g_i \in G$ and $g_i(x) = x_i$. Moreover, by Lemma
4.11,

$$|G_i| = |g_i \circ G_x| = |G_x| \quad \text{for all } i.$$

Hence

$$|G| = |G_x| + |G_x| + \cdots + |G_x| = m \cdot |G_x| = |O| \cdot |G_x|,$$

proving Lagrange’s Theorem.
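
The counting in this proof can be replayed on a small example. The sketch below is an illustration under stated assumptions, not the notes' own method: the groups are generated from hand-picked generators (a rotation and a reflection of the square, with vertices numbered 0 to 3), and the function names are hypothetical. It sorts the group into the sets $G_i$ and reports their sizes.

```python
def compose(p, q):
    """Return p∘q (apply q first, then p); p[i] is the image of i."""
    return tuple(p[q[i]] for i in range(len(p)))

def generate_group(generators):
    """Close the generators under composition; in a finite setting this is a group."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def check_orbit_stabilizer(group, x):
    """Return (|G|, |O|, |G_x|, set of sizes of the sets G_i = {g : g(x) = x_i})."""
    orbit = {g[x] for g in group}
    stabilizer = [g for g in group if g[x] == x]
    fibres = {y: [g for g in group if g[x] == y] for y in orbit}
    return len(group), len(orbit), len(stabilizer), {len(f) for f in fibres.values()}

# Symmetries of the square: rotation 0->1->2->3->0 and the reflection swapping 1 and 3.
d4 = generate_group([(1, 2, 3, 0), (0, 3, 2, 1)])
# A non-transitive group on five points: a 3-cycle on {0,1,2} and a swap of 3 and 4.
h = generate_group([(1, 2, 0, 3, 4), (0, 1, 2, 4, 3)])

print(check_orbit_stabilizer(d4, 0))
print(check_orbit_stabilizer(h, 0), check_orbit_stabilizer(h, 3))
```

In every case all of the sets $G_i$ have the same size $|G_x|$, and $|G| = |O| \cdot |G_x|$.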



You may wonder how Lagrange could even STATE Lagrange’s Theorem, given
that the concept of a group had not yet been defined. In fact, Lagrange considered
only one type of group, Sn , the group of all permutations of the set {1, 2, . . . , n}.
His version of Lagrange’s Theorem was more like the following:
Theorem 4.12. Let $S_n$ act as a group of permutations of the set $X$. Then the
cardinality of every $S_n$-orbit on $X$ is a divisor of $n!$.
If you think of $X = \{1, 2, \dots, n\}$, then this is the trivial statement:

$$n \text{ is a divisor of } n!$$

However, $S_n$ acts “naturally” on other sets as well. For example, let

$$X = \{(i, j) : 1 \le i, j \le n\}.$$

Then $|X| = n^2$ and $S_n$ acts naturally on $X$ via:

$$\sigma(i, j) = (\sigma(i), \sigma(j)).$$

In this case, $S_n$ has two orbits on $X$:

$$\Delta := \{(i, i) : 1 \le i \le n\} \quad \text{and} \quad X - \Delta.$$

As $|\Delta| = n$ and $|X - \Delta| = n^2 - n = n(n - 1)$, it is easy to verify Lagrange’s Theorem
in this case as well. Lagrange was actually interested in the action of $S_n$ on the
infinite set $P_n$ of all multi-variable polynomials $p(x_1, x_2, \dots, x_n)$, under the action

$$\sigma(p(x_1, x_2, \dots, x_n)) = p(x_{\sigma(1)}, x_{\sigma(2)}, \dots, x_{\sigma(n)}).$$

Although $P_n$ is an infinite set, of course all of the $S_n$-orbits have finite length,
and Lagrange’s Theorem tells us that in fact this length must always divide $n!$.
We shall return to this setting for Lagrange’s Theorem in Chapter 9, when we
study polynomials and their roots. Now we shall return to geometric considerations
and see how we can use Lagrange’s Theorem to understand the structure of the
symmetry groups of the Platonic solids. But first we review some linear algebra.
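
The action of $S_n$ on ordered pairs described above is easy to experiment with. The following sketch (illustrative only; the points are numbered from 0, and `orbits_on_pairs` is a hypothetical helper name) lists the orbit sizes for small $n$ and checks that each divides $n!$.

```python
from itertools import permutations, product
from math import factorial

def orbits_on_pairs(n):
    """Orbits of S_n on pairs (i, j) under sigma(i, j) = (sigma(i), sigma(j))."""
    group = list(permutations(range(n)))
    unvisited = set(product(range(n), repeat=2))
    orbits = []
    while unvisited:
        i, j = next(iter(unvisited))
        orbit = {(s[i], s[j]) for s in group}
        orbits.append(orbit)
        unvisited -= orbit
    return orbits

for n in range(2, 6):
    sizes = sorted(len(o) for o in orbits_on_pairs(n))
    print(n, sizes, all(factorial(n) % s == 0 for s in sizes))
```

For each $n$ from 2 to 5 there are exactly two orbits, of sizes $n$ and $n(n-1)$.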

Exercises
1. List all of the partitions of 6. For each partition $\pi$, give a permutation $\sigma_\pi$
in $S_6$ whose cycle structure is given by that partition. For each $\sigma_\pi$, list all of the
powers of $\sigma_\pi$ and indicate the order of $\sigma_\pi$.
2. Repeat Exercise 1 for 8 in place of 6.
3. Let $\rho = (1, 2, 3, 4)$, $\sigma = (1, 2, 3)(4, 5)$ and $\tau = (2, 4, 5)$ in $S_5$. In each case
below, write your answer in Cauchy cycle notation.
(a) Compute $\rho \circ \sigma$.
(b) Compute $\sigma \circ \rho$.
(c) Compute $\sigma \circ \tau$.
(d) Compute $\tau \circ \sigma$.
(e) Compute $\rho \circ \tau$.
(f) Compute $\tau \circ \rho$.
(g) Compute $\rho \circ \sigma \circ \tau$.
(h) Compute $\rho \circ \tau \circ \sigma$.
(i) Compute $\sigma \circ \rho \circ \tau$.
(j) Compute $\sigma \circ \tau \circ \rho$.
(k) Compute $\tau \circ \rho \circ \sigma$.
(l) Compute $\tau \circ \sigma \circ \rho$.
4. Let $\rho : \mathbb{R}^2 \to \mathbb{R}^2$ be a nonidentity rotation about the point $P$. Describe
geometrically the $\rho$-orbits on the set $\mathcal{P}$ of all points of $\mathbb{R}^2$. Does this explain why
orbits are called orbits?
5. Let $G$ be a group. Let $N$ be a normal subgroup of $G$. Let $H$ be any subgroup
of $G$. Let

$$N \circ H = \{n \circ h : n \in N \text{ and } h \in H\}.$$

(a) Prove: $N \circ H$ is a subgroup of $G$.
(b) Suppose that $N \cap H = \{I\}$. Let $h, h' \in H$. Prove: $N \circ h = N \circ h'$ if and
only if $h = h'$. Conclude that $|N \circ H| = |N||H|$.
6. Let $G$ be a group. Let $g \in G$ and let $H$ be a subgroup of $G$. Define the
function $c_g : H \to g \circ H \circ g^{-1}$ by

$$c_g(h) = g \circ h \circ g^{-1} \quad \text{for all } h \in H.$$

Prove: $c_g$ is an isomorphism of groups.
7. Consider the group $S_4$.
(a) Prove that $\{(1, 2)(3, 4), (1, 3)(2, 4), (1, 4)(2, 3)\}$ is an $S_4$-conjugacy class.
(b) Let $V := \{(1), (1, 2)(3, 4), (1, 3)(2, 4), (1, 4)(2, 3)\}$. Prove that $V$ is a normal
abelian subgroup of $S_4$ isomorphic to the Klein 4-group $V_4$.
(c) Let $H = \{(1), (1, 2, 3), (1, 3, 2)\}$. Prove: $V \circ H$ is a normal subgroup of $S_4$ of
cardinality 12, which is the union of three $S_4$-conjugacy classes.
(d) Prove that $S_4$ contains exactly three subgroups of cardinality 8, and that
every element of $S_4$ of order 4 is contained in exactly one of these subgroups.
Conclude that these subgroups are $S_4$-conjugate, and that each is isomorphic to
$D_4$, the symmetry group of the square.
In the next two exercises we prove another version of Lagrange’s Theorem, using
an argument very similar to that for Lagrange’s Orbit-Stabilizer Theorem.
8. Let $G$ be a group and let $H$ be a subgroup of $G$. Define a relation on $G$ by:

$$g \equiv_H g_1 \quad \text{if and only if} \quad g^{-1} \circ g_1 \in H.$$

(a) Prove: $\equiv_H$ is an equivalence relation on $G$.
(b) Prove: The $\equiv_H$-equivalence classes are the left cosets $g \circ H$ of $H$ in $G$.
(c) Prove: $|g \circ H| = |H|$ for all $g \in G$.
9. Prove Lagrange’s Theorem: Let $G$ be a finite group and $H$ a subgroup of $G$.
Then $|H|$ divides $|G|$. [Hint: Use Exercise 8. Show that if $m$ is the number of left
cosets of $H$ in $G$, then $|G| = m \cdot |H|$.]
10. Prove: Let $p$ be a prime. Then every finite group of cardinality $p$ is cyclic.
11. Prove: Let $G$ be a group and $H$ a subgroup of $G$. Then $H$ is a normal
subgroup of $G$ if and only if $g \circ H = H \circ g$ for all $g \in G$.
Definition. Let $G$ be a (not necessarily finite) group and let $H$ be a subgroup of
$G$. If there are only finitely many right cosets of $H$ in $G$, then the (right) index
of $H$ in $G$ is the number $m$ of right cosets of $H$ in $G$. It is denoted $(G : H)$.

12. Let $G$ be a (not necessarily finite) group and $H$ a subgroup of $G$ with

$$(G : H) = m < \infty.$$

(a) Prove: The number of left cosets of $H$ in $G$ is also finite and is equal to $m$.
[Hint: Prove that the map $H \circ g \mapsto (H \circ g)^{-1} = g^{-1} \circ H$ is a bijection between the
set of right cosets of $H$ in $G$ and the set of left cosets of $H$ in $G$.] Thus we may
speak simply of the index of $H$ in $G$.
(b) Prove: If $(G : H) = 2$, then $H$ is a normal subgroup of $G$.
Definition. Let $G$ be a group of permutations of the set $X$. We say that $G$ is
transitive on $X$, or that $G$ acts transitively on $X$, if $X$ is a single $G$-orbit.

13. Prove: Let $G$ be a finite group of permutations of the set $X$. Suppose $G$
acts transitively on $X$. Then $X$ is a finite set, and $|X|$ divides $|G|$.
14. Prove: Let $H$ be a group of permutations of the set $X$. Let $K$ be a normal
subgroup of $H$ having a unique fixed point $y$ on $X$, i.e.,

$$\{y\} = \{x \in X : k(x) = x \ \ \forall k \in K\}.$$

Then $h(y) = y$ for all $h \in H$.
15. Let $G$ be a group with $|G| = 4$. Define the map $\Lambda : G \to \mathrm{Sym}(G)$ by

$$\Lambda(g)(x) = g \circ x$$

for all $g$ in the group $G$ and all $x$ in the set $G$.
(a) Prove: $\Lambda$ defines an isomorphism between $G$ and the subgroup $\Lambda(G)$ of
$\mathrm{Sym}(G)$.
(b) Prove: Either $G$ is cyclic or $G$ is isomorphic to $V_4$.
16. Let $G$ be a group with $|G| = 6$.
(a) Prove: $G$ contains elements of order 2 and 3. [Hint: Use Exercises 9 and 10
from Chapter 12 of the Math 4580 text.]
(b) Prove: If $G$ is not cyclic, then $G$ has exactly three elements of order 2: $\tau_1$,
$\tau_2$, $\tau_3$. [Hint: Suppose not. Then $G$ has a unique element $\tau$ of order 2. Let $g \in G$
of order 3. Prove that $g \circ \tau = \tau \circ g$. Conclude that $g \circ \tau$ has order 6.]
(c) Prove: If $G$ is not cyclic, then $G \cong S_3$. [Hint: Define the map $c : G \to
\mathrm{Sym}(\{\tau_1, \tau_2, \tau_3\})$ by:

$$c(g)(\tau_i) = g \circ \tau_i \circ g^{-1}.$$

Prove that $c$ is an isomorphism of groups.]
17. In this exercise, we determine all of the subgroups of $S_4$.
(a) Prove: $S_4$ has exactly one subgroup of cardinality 12. (We call this subgroup
the alternating group on four letters and denote it $A_4$.)
(b) Prove: If $H$ is a subgroup of $S_4$ containing two distinct cyclic subgroups of
cardinality 3, then $H$ acts transitively on $\{1, 2, 3, 4\}$, and hence $H = A_4$ or $H = S_4$.
(c) Prove: Suppose $H$ is a subgroup of $S_4$ containing the cyclic subgroup $K =
\langle (a, b, c) \rangle$. Suppose that $H \ne A_4$ and $H \ne S_4$. Then either $H = K$ or $H = (S_4)_d$,
the stabilizer in $S_4$ of the point $d$. In the latter case, $H \cong S_3$. [Hint: Apply Exercise
14. You must justify that $K$ is a normal subgroup of $H$.]
(d) Prove: $D_4$ contains two subgroups isomorphic to $V_4$ and one subgroup
isomorphic to $C_4$. All other subgroups have order 1, 2, or 8.
(e) Prove: Every subgroup of $S_4$ is either cyclic of order 1, 2, 3, or 4, or is
isomorphic to $V_4$, $S_3$, $A_4$, or $S_4$.
18. (a) Prove: Let $G$ be a finite group and let $K$ be a $G$-conjugacy class. Then
$|K|$ divides $|G|$.
[Hint: Use Lagrange’s Orbit-Stabilizer Theorem.]
(b) Prove: Let $G$ be a group. Then $Z(G)$ is the union of all $G$-conjugacy classes
$K$ such that $|K| = 1$.
(c) Prove: Let $p$ be a prime, and let $G$ be a finite group of cardinality $p^n$ for
some $n \in \mathbb{N}$. Then $Z(G)$ contains a non-identity element of $G$.
(d) Prove: Let $p$ be a prime and let $G$ be a finite group of cardinality $p^2$. Then
$G$ is an abelian group. [Hint: By (c), $Z(G) \ne \{I\}$. If $g \in G - Z(G)$, argue that
every element of $G$ is equal to $z \circ g^i$ for some $z \in Z(G)$ and $i \in \mathbb{N}$. Conclude that
$G$ is abelian.]
19. (Bonus) Let $p$ be a prime and let $G$ be a finite group with $|G| = 2p$. Prove:
Either $G$ is cyclic or $G \cong D_p$. [Hint: Suppose $G$ is not cyclic. As in Exercise
16, prove that $G$ has exactly $p$ elements of order 2, and a normal cyclic subgroup
$K = \langle x \rangle$ of cardinality $p$. Let $t \in G - K$. Show that $x \circ t = t \circ x^{-1}$. Using this,
prove that the map $\phi : G \to D_p$ defined by

$$\phi(x^i \circ t^j) = \rho^i \circ R^j$$

for $0 \le i \le p - 1$, $0 \le j \le 1$, where $\rho = \rho_{2\pi/p}$ and $R$ is reflection across the $x$-axis, is
an isomorphism of groups.]

Chapter 5: The Platonic Solids and their Symmetries


It is quite remarkable that, although there are infinitely many regular polygons,
there are, up to similarity, only five regular polyhedra: the Platonic solids.
Definition 5.1. A regular polyhedron is the surface $S$ formed by a finite set
of congruent regular polygons, called the faces of $S$, completely enclosing a region
in $\mathbb{R}^3$, such that any two faces are either disjoint or have exactly one vertex in
common or have exactly one edge in common.
Since $S$ encloses a region, at least 3 faces must meet at any vertex of $S$, and the
convexity of $S$ forces the sum of the angles at any vertex to be less than $360^\circ$. Hence
the polygons have interior angle less than $120^\circ$, i.e. they are triangles, squares, or
pentagons. Moreover, in the case of squares and pentagons, exactly three polygons
meet at a vertex, while in the case of triangles, at most five meet at a vertex.
Let $V$, $E$, and $F$ denote the number of vertices, edges, and faces of $S$, respectively.
We shall use Descartes’ Formula:

$$V - E + F = 2,$$

in conjunction with the local data:

$$2E = rV = mF,$$

where $r$ is the number of edges meeting at a vertex $v$, and each face is a regular
$m$-gon. Now we can easily compute, case-by-case. For example, if $m = r = 3$, then

$$\frac{2}{3}E - E + \frac{2}{3}E = 2,$$

whence $E = 6$, and then $V = F = 4$.
Thus we get the following table:

m    r    V     E     F
3    3    4     6     4
3    4    6     12    8
3    5    12    30    20
4    3    8     12    6
5    3    20    30    12
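
The table can be regenerated mechanically from the two relations $V - E + F = 2$ and $2E = rV = mF$. The sketch below is one way to do it, with exact rational arithmetic; the function name `platonic_data` and the search range 3 to 6 for $m$ and $r$ are illustrative assumptions (larger values give no positive solutions).

```python
from fractions import Fraction

def platonic_data():
    """Solve V - E + F = 2 with V = 2E/r and F = 2E/m for small m and r."""
    rows = []
    for m in range(3, 7):          # sides per face
        for r in range(3, 7):      # edges meeting at a vertex
            # Substituting gives E * (2/r + 2/m - 1) = 2.
            coeff = Fraction(2, r) + Fraction(2, m) - 1
            if coeff <= 0:
                continue           # no positive solution for E
            E = Fraction(2) / coeff
            V, F = 2 * E / r, 2 * E / m
            if E.denominator == V.denominator == F.denominator == 1:
                rows.append((m, r, int(V), int(E), int(F)))
    return rows

for row in platonic_data():
    print(row)
```

Only five pairs $(m, r)$ survive, reproducing the rows of the table in the same order.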
This is the data for the tetrahedron, the octahedron, the icosahedron, the cube,
and the dodecahedron, respectively. It can be verified that if $S$ is a Platonic solid
and if one marks points at the center of each face of $S$ and then joins points at
a minimal distance apart by edges, the resulting surface is another Platonic solid,
$S^*$, said to be dual to $S$. This is easy to visualize in the case of the cube $S$, and
the resulting figure is a regular octahedron. Less obviously, the icosahedron and
the dodecahedron are dual figures.
Dual Platonic solids have the identical group of symmetries. We shall only
consider the group of rotational symmetries of the tetrahedron, the octahedron,
and the icosahedron. These are called the tetrahedral group $T$, the octahedral
group $O$, and the icosahedral group $I$.
Lagrange’s Theorem enables us easily to give an upper bound for the sizes of
these groups.

Lemma 5.2. The following upper bounds hold:

(1) $|T| \le 12$;
(2) $|O| \le 24$; and
(3) $|I| \le 60$.

Proof. First consider $T$. Let $v$ be a vertex of the tetrahedron $S(4)$. If $\rho$ is a rotation
in $T$ fixing the vertex $v$, then $\rho$ must also fix the center $c$ of the opposite face and
hence, $\rho$ is either the identity map or $\rho$ is a rotation through $120^\circ$ (either clockwise
or counterclockwise) about the axis determined by $v$ and $c$. Hence the stabilizer
$T_v$ satisfies: $|T_v| \le 3$. Moreover, every rotation of $S(4)$ moves $v$ to some vertex of
$S(4)$. Hence the orbit $v^T$ satisfies: $|v^T| \le 4$. Hence by Lagrange’s Theorem,

$$|T| = |v^T| \cdot |T_v| \le 4 \cdot 3 = 12.$$

For the octahedron and the icosahedron, each triangular face $f$ is opposite an
antipodal triangular face $-f$. Any non-identity rotation $\rho$ fixing $f$ (as a set) must
fix the centers of both $f$ and $-f$ and induce a $120^\circ$ rotation about the axis joining
those centers. Hence $|O_f| \le 3$ and $|I_f| \le 3$. As the octahedron has 8 faces and the
icosahedron has 20 faces, we conclude, using Lagrange’s Theorem, that

$$|O| = |f^O| \cdot |O_f| \le 8 \cdot 3 = 24,$$

and

$$|I| = |f^I| \cdot |I_f| \le 20 \cdot 3 = 60,$$

completing the proof of the lemma.
Now we wish to argue that these upper bounds are actually achieved. We will
only do this carefully in the first two cases.

The Symmetries of the Tetrahedron


There is a clever trick for describing a tetrahedron in a way that makes it easy to
compute its symmetries. Inside $\mathbb{R}^4$, we take the four unit vectors $v_1 = (1, 0, 0, 0)$,
$v_2 = (0, 1, 0, 0)$, $v_3 = (0, 0, 1, 0)$, and $v_4 = (0, 0, 0, 1)$. Each of these points is
$\sqrt{2}$ units from each of the others. So these points are the vertices of a regular
tetrahedron in $\mathbb{R}^4$. Indeed, the solid tetrahedron is the convex hull of these four
points:

$$\{t_1 v_1 + t_2 v_2 + t_3 v_3 + t_4 v_4 : t_1 + t_2 + t_3 + t_4 = 1,\ 0 \le t_i \le 1 \ \forall i\}.$$

This tetrahedron is contained in the 3-dimensional space defined by the equation:

$$x + y + z + w = 1.$$

Now consider the subgroup $P_4$ of the orthogonal group $O(4)$ consisting of all
$4 \times 4$ permutation matrices. Each permutation $\sigma$ in $P_4$ permutes the four vertices
$v_1, v_2, v_3, v_4$, and hence is a symmetry of the tetrahedron $S(4)$. Moreover, $\sigma$ leaves
invariant

$$\hat{\mathbb{R}}^3 := \{(x, y, z, w) \in \mathbb{R}^4 : x + y + z + w = 1\},$$

and, since $\sigma$ is an isometry of $\mathbb{R}^4$, $\sigma$ induces an isometry of $\hat{\mathbb{R}}^3$ which is a symmetry
of $S(4)$.
Since any isometry of $S(4)$ is completely determined by its action on
the four vertices of $S(4)$, and since $P_4 \cong S_4$, it follows that $P_4 = \mathrm{Isom}(S(4))$.
Thus we have:
Theorem 5.3. If $S(4)$ is a tetrahedron, then $\mathrm{Isom}(S(4)) \cong S_4$, and the group $T$
of rotational symmetries of $S(4)$ is isomorphic to a subgroup of $S_4$ of cardinality
12.
Proof. The argument above shows that $\mathrm{Isom}(S(4)) \cong S_4$. Since $O(3) = SO(3) \times
\{\pm I\}$, either all the symmetries in $\mathrm{Isom}(S(4))$ are rotations or half are. Since
we have shown that $|T| \le 12$, we conclude that exactly half the symmetries are
rotations and $|T| = 12$. Thus, $T$ is isomorphic to a subgroup of $S_4$ of cardinality
12.
In Chapter 4, Exercise 17a, you have shown that there is only one subgroup of
$S_4$ of cardinality 12. It is called the alternating group of degree 4, and is denoted
$A_4$. It contains eight elements of order 3: for each vertex $v$ of $S(4)$, there are two
$120^\circ$ rotations about an axis passing through $v$ and through the center of the
face of $S(4)$ opposite $v$. The remaining three non-identity rotational symmetries
are obtained as follows: The six edges of $S(4)$ subdivide into three pairs $\{e, e'\}$,
where $e$ and $e'$ have no vertex in common. If $L_{e,e'}$ is the line through the midpoints
of $e$ and $e'$, then the $180^\circ$ rotation of $\mathbb{R}^3$ about this line maps $e$ to $e$ and $e'$ to $e'$,
and hence is a symmetry of $S(4)$ of order 2.
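
The element counts in $A_4$ can be confirmed directly. The sketch below is illustrative and rests on one assumption stated here: the even permutations of four points form a subgroup of cardinality 12, so by the uniqueness statement above they are the subgroup $A_4$. Vertices are numbered 0 to 3 and the helper names are hypothetical.

```python
from collections import Counter
from itertools import permutations

def sign(p):
    """+1 for an even permutation, -1 for an odd one (parity of inversions)."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def order(p):
    """Smallest k >= 1 with p^k equal to the identity."""
    identity = tuple(range(len(p)))
    power, k = p, 1
    while power != identity:
        power = tuple(p[i] for i in power)   # p ∘ power
        k += 1
    return k

even = [p for p in permutations(range(4)) if sign(p) == 1]
orders = Counter(order(p) for p in even)
print(len(even), dict(orders))
```

The tally shows one identity, three elements of order 2 (the edge-pair rotations) and eight elements of order 3 (the vertex rotations).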

The Rotational Symmetries of the Octahedron


We may construct the octahedron in $\mathbb{R}^3$ by taking as vertices the six points
$(\pm 1, 0, 0)$, $(0, \pm 1, 0)$, $(0, 0, \pm 1)$. Notice that the four points lying in the $x,y$-plane
form the vertices of a square of side $\sqrt{2}$, and each of these points is at distance
$\sqrt{2}$ from both $(0, 0, 1)$ and $(0, 0, -1)$. Hence, joining $(0, 0, 1)$ to the vertices at the
endpoints of an edge of the square in the $x,y$-plane gives an equilateral triangle,
and similarly for $(0, 0, -1)$. These are the eight triangular faces of the octahedron
$S(8)$.
Let $\sigma$ be any permutation of $\{1, 2, 3\}$. Let $\epsilon_i = 1$ or $-1$, for each $i$, $1 \le i \le 3$.
Setting $e_1 = (1, 0, 0)$, $e_2 = (0, 1, 0)$, and $e_3 = (0, 0, 1)$, we consider the unique linear
operator $\hat{\sigma}$ which is the extension to $\mathbb{R}^3$ of the function:

$$\hat{\sigma}(e_1) = \epsilon_1 \cdot e_{\sigma(1)},$$

$$\hat{\sigma}(e_2) = \epsilon_2 \cdot e_{\sigma(2)},$$

$$\hat{\sigma}(e_3) = \epsilon_3 \cdot e_{\sigma(3)}.$$

For each choice of the $\epsilon_i$, $\hat{\sigma}$ is an isometry of $\mathbb{R}^3$ permuting the vertices of the
octahedron $S(8)$. Hence each $\hat{\sigma}$ is a symmetry of $S(8)$. The matrix representing
$\hat{\sigma}$ with respect to the standard basis $\{e_1, e_2, e_3\}$ for $\mathbb{R}^3$ is a signed permutation
matrix.
Thus the Weyl group $W(3)$ is a subgroup of $\mathrm{Isom}(S(8))$. As we have seen,
$|W(3)| = 2^3 \cdot 3! = 48$. Thus $|\mathrm{Isom}(S(8))| \ge 48$. On the other hand, we have
computed, using Lagrange’s Theorem, that $|O| \le 24$, and we know that
$|O| \ge \frac{1}{2}|\mathrm{Isom}(S(8))|$. The only possible conclusion is that $W(3) = \mathrm{Isom}(S(8))$ and $O$ is
a subgroup of $W(3)$ of index 2, with $|O| = 24$.
As remarked before, the centers of the eight faces of the octahedron $S(8)$ are the
vertices of a cube $C$ having the same group $O$ of rotational symmetries. Let $\Delta$ be
the set of four long diagonals joining opposite vertices of $C$. If $\rho$ is a symmetry of $C$
which maps each long diagonal to itself, then $\rho = I$ or $\rho = -I$. (Exercise.) Hence
$I$ is the only rotational symmetry in $O$ fixing every element of $\Delta$. Thus if

$$\phi : O \to \mathrm{Sym}(\Delta)$$

is the function which maps each rotation in $O$ to the permutation it induces on $\Delta$,
then $\phi$ is a homomorphism with trivial kernel, and hence an injective map. But
$|O| = 24 = |\mathrm{Sym}(\Delta)|$. Hence $O \cong \mathrm{Sym}(\Delta) \cong S_4$.
Thus we have proved the following theorem.
Theorem 5.4. The group $O$ of rotational symmetries of the octahedron (or the
cube) is isomorphic to the symmetric group $S_4$.
We can enumerate the 24 symmetries in $O$ as follows. First, regarding them as
symmetries of the octahedron, we have:
(1) 6 rotations through $90^\circ$ (clockwise or counterclockwise) about the axes joining opposite vertices;
(2) 3 rotations through $180^\circ$ about the axes joining opposite vertices;
(3) 8 rotations through $120^\circ$ (clockwise or counterclockwise) about the axes joining opposite faces;
(4) 6 rotations through $180^\circ$ about the axes joining opposite edges; and
(5) 1 identity rotation.
Regarded as symmetries of the cube $C$, we have:
(1) 6 rotations through $90^\circ$ (clockwise or counterclockwise) about the axes joining opposite faces;
(2) 3 rotations through $180^\circ$ about the axes joining opposite faces;
(3) 8 rotations through $120^\circ$ (clockwise or counterclockwise) about the axes joining opposite vertices;
(4) 6 rotations through $180^\circ$ about the axes joining opposite edges; and
(5) 1 identity rotation.
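
This enumeration can be cross-checked against the signed permutation matrices. The sketch below is illustrative, not the notes' own derivation: it builds all 48 signed permutation matrices, keeps those of determinant $+1$, and classifies each by its trace, using the standard fact that a rotation of $\mathbb{R}^3$ through angle $\theta$ has trace $1 + 2\cos\theta$. All names are hypothetical.

```python
from collections import Counter
from itertools import permutations, product

def signed_permutation_matrices():
    """All 3x3 matrices sending e_i to eps_i * e_{sigma(i)}."""
    matrices = []
    for sigma in permutations(range(3)):
        for eps in product((1, -1), repeat=3):
            m = [[0] * 3 for _ in range(3)]
            for i in range(3):
                m[sigma[i]][i] = eps[i]   # column i is eps_i * e_{sigma(i)}
            matrices.append(m)
    return matrices

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def trace(m):
    return m[0][0] + m[1][1] + m[2][2]

# trace = 1 + 2*cos(angle): 3 -> 0, 1 -> 90, 0 -> 120, -1 -> 180 degrees.
ANGLE_BY_TRACE = {3: 0, 1: 90, 0: 120, -1: 180}

weyl = signed_permutation_matrices()
rotations = [m for m in weyl if det3(m) == 1]
angles = Counter(ANGLE_BY_TRACE[trace(m)] for m in rotations)
print(len(weyl), len(rotations), dict(angles))
```

The nine rotations through $180^\circ$ are the 3 about vertex axes together with the 6 about edge axes of the octahedron; the trace alone does not separate these two types.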

The Rotational Symmetries of the Icosahedron


We shall not attempt a rigorous determination of the group Icos of rotational
symmetries of the icosahedron. We shall simply state the following facts:
Theorem 5.5. The group Icos of rotational symmetries of the icosahedron (and
the dodecahedron) is isomorphic to the alternating group of degree 5, the unique
subgroup of cardinality 60 in the symmetric group $S_5$.
We can enumerate the 60 symmetries in Icos as follows. First, regarding them
as symmetries of the icosahedron, we have:
(1) 12 rotations through $72^\circ$ (clockwise or counterclockwise) about the axes joining opposite vertices;
(2) 12 rotations through $144^\circ$ (clockwise or counterclockwise) about the axes joining opposite vertices;
(3) 20 rotations through $120^\circ$ (clockwise or counterclockwise) about the axes joining opposite triangular faces;
(4) 15 rotations through $180^\circ$ about the axes joining opposite edges; and
(5) 1 identity rotation.

Regarded as symmetries of the dodecahedron, we have:
(1) 12 rotations through $72^\circ$ (clockwise or counterclockwise) about the axes joining opposite pentagonal faces;
(2) 12 rotations through $144^\circ$ (clockwise or counterclockwise) about the axes joining opposite pentagonal faces;
(3) 20 rotations through $120^\circ$ (clockwise or counterclockwise) about the axes joining opposite vertices;
(4) 15 rotations through $180^\circ$ about the axes joining opposite edges; and
(5) 1 identity rotation.

Exercises
1. Consider the pattern $F$ in $\mathbb{R}^2$ which is the tiling of the plane by unit squares,
whose vertices have coordinates $(a, b)$ with $a, b \in \mathbb{Z}$. Let $G = \mathrm{Isom}(F)$. Prove that
$G = T \cdot G_0$, where

$$T := \{T_{(a,b)} : a, b \in \mathbb{Z}\}$$

is the normal abelian subgroup of translational symmetries of $F$, and $G_0$ is the
stabilizer in $G$ of the point $(0, 0)$, with $G_0 \cong D_4$, the group of all symmetries of the
square.
2. (a) Consider the pattern $F_1$ in $\mathbb{R}^2$ which is the tiling of the plane by congruent
equilateral triangles with sides of unit length, including the triangle $T$ having
vertices at $(0, 0)$, $(1, 0)$, and $(\frac{1}{2}, \frac{\sqrt{3}}{2})$. Prove: $\mathrm{Isom}(F_1) = T_1 \cdot H$, where

$$T_1 = \{a \cdot T_{(1,0)} + b \cdot T_{(\frac{1}{2}, \frac{\sqrt{3}}{2})} : a, b \in \mathbb{Z}\}$$

is the normal abelian subgroup of translational symmetries of $F_1$, and

$$H = \mathrm{Isom}(F_1)_0 \cong D_6,$$

the group of all symmetries of the regular hexagon. [Note: There are six triangles
meeting at the point $(0, 0)$, comprising the six wedges of a regular hexagon centered
at $(0, 0)$.]
(b) Let $P$ be the center of the triangle $T$ from (a). Find the coordinates of $P$.
Prove: The stabilizer, $\mathrm{Isom}(F_1)_P$, of the point $P$ is isomorphic to $D_3$, the group
of all symmetries of the equilateral triangle.
3. Let $C$ be a cube in $\mathbb{R}^3$ centered at $(0, 0, 0)$. Prove: If $f \in \mathrm{Isom}(C)$ and $f$
maps each long diagonal of $C$ to itself (but not necessarily pointwise), then $f = I$
or $f = -I$.

4. In this exercise, you may assume that the icosahedral group Icos transitively
permutes the sets $V$, $F$, and $E$ of all vertices, faces, and edges of the icosahedron,
respectively.
(a) Prove: The set of 15 elements of order 2 in Icos forms a single Icos-conjugacy
class.
(b) Prove: Icos contains 10 subgroups of cardinality 3, and Icos permutes these
subgroups transitively under conjugation.
(c) Prove: Icos contains 6 subgroups of cardinality 5 and Icos permutes these
subgroups transitively under conjugation.
(d) Prove: The only normal subgroups of Icos are the identity subgroup $\{I\}$ and
the full group Icos. [Hint: Recall that if $N$ is a subgroup of Icos, then $|N|$ divides
$60 = |\mathrm{Icos}|$. Moreover, if $N$ is a normal subgroup of Icos and $N$ contains the subgroup
$H$ of Icos, then $N$ contains $g \circ H \circ g^{-1}$ for all $g \in \mathrm{Icos}$. Now use (a), (b), and (c).]
Definition. We say that a group $G$ is a simple group if $\{I\}$ and $G$ are the only
normal subgroups of $G$.

5. Prove: If $G$ is an abelian group, then $G$ is a simple group if and only if $|G| = p$
for some prime $p$.
The group Icos is the smallest non-abelian simple group.

Chapter 6: The Orbit Counting Formula


We now can derive a very useful counting formula. It was apparently first
discovered by Cauchy in 1845. It was rediscovered by Frobenius in 1887, and was
included by Burnside in his textbook on group theory. Sometimes it is misnamed
the Burnside Counting Formula.
The Orbit Counting Formula. Let $G$ be a finite group of permutations of a
finite set $X$. The number of $G$-orbits on $X$ equals the average number of fixed
points on $X$ of the elements of $G$. In other words, let $f : G \to \mathbb{N} \cup \{0\}$ be the
function defined by

$$f(g) = |\{x \in X : g(x) = x\}|.$$

Let $r$ denote the number of $G$-orbits on $X$. Then

$$r = \frac{1}{|G|} \sum_{g \in G} f(g).$$

The keys to the proof are Lagrange’s Orbit-Stabilizer Theorem
and the following elementary but useful counting observation:
Lemma 6.1. Let $G$ be a finite group of permutations of a finite set $X$. Then

$$\sum_{g \in G} f(g) = \sum_{x \in X} |G_x|.$$

Proof. The easiest way to visualize why this is true is to imagine a rectangular
array (a matrix, if you will) whose rows are labeled by the elements of $G$ and whose
columns are labeled by the points in the set $X$. The $(g, x)$ entry of this array is 1 if
$g(x) = x$, and is 0 if $g(x) \ne x$. Now, add up all of the entries in the matrix. Adding
one row at a time gives the answer $\sum_{g \in G} f(g)$. Adding one column at a time gives
the answer $\sum_{x \in X} |G_x|$, noting that the $(g, x)$ entry is 1 if and only if $g \in G_x$. This
proves the lemma.
Next we make the following observation.
Lemma 6.2. Let $G$ be a finite group of permutations of a finite set $X$. Let $O$ be
any $G$-orbit on $X$. Then

$$\sum_{x \in O} |G_x| = |G|.$$

Thus

$$\frac{1}{|G|} \sum_{x \in O} |G_x| = 1,$$

for every $G$-orbit $O$ on $X$.

Proof. Lagrange’s Orbit-Stabilizer Theorem tells us that, for any $x \in O$,

$$|G| = |G_x| \cdot |O|.$$

In particular, this means that, for all $x \in O$,

$$|G_x| = \frac{|G|}{|O|},$$

independent of the choice of $x$. But then

$$\sum_{x \in O} |G_x| = |O| \cdot |G_x| = |G|,$$

proving the lemma.


Now we can quickly complete the proof of the Orbit Counting Formula. By
Lemma 6.1,

$$\frac{1}{|G|} \sum_{g \in G} f(g) = \frac{1}{|G|} \sum_{x \in X} |G_x|.$$

Let $O_1, O_2, \dots, O_r$ be the $G$-orbits on $X$. Then, counting orbit by orbit, and using
Lemma 6.2, we have

$$\frac{1}{|G|} \sum_{x \in X} |G_x| = \sum_{i=1}^{r} \Bigl( \frac{1}{|G|} \sum_{x \in O_i} |G_x| \Bigr) = \sum_{i=1}^{r} 1 = r.$$

Thus

$$\frac{1}{|G|} \sum_{g \in G} f(g) = r,$$

completing the proof of the Orbit Counting Formula.
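
Both Lemma 6.1 and the formula itself can be tested on a small action. The sketch below is an illustration under chosen assumptions: the group is $S_3$ acting on ordered pairs from $\{0, 1, 2\}$ via $\sigma(i, j) = (\sigma(i), \sigma(j))$, and the names `act` and `count_orbits` are hypothetical. It sums the 0/1 array of the proof by rows and by columns, and counts the orbits directly.

```python
from itertools import permutations, product

group = list(permutations(range(3)))        # S_3 acting on {0, 1, 2}
X = list(product(range(3), repeat=2))       # ordered pairs (i, j)

def act(g, point):
    i, j = point
    return (g[i], g[j])

def count_orbits(group, X):
    """Count orbits directly by peeling them off one at a time."""
    unvisited, orbits = set(X), 0
    while unvisited:
        x = next(iter(unvisited))
        unvisited -= {act(g, x) for g in group}
        orbits += 1
    return orbits

# Row sums: fixed points of each g.  Column sums: stabilizer size of each x.
sum_fixed = sum(sum(1 for x in X if act(g, x) == x) for g in group)
sum_stabilizers = sum(sum(1 for g in group if act(g, x) == x) for x in X)
r = count_orbits(group, X)
print(sum_fixed, sum_stabilizers, r)
```

Here both sums equal 12, and $12 / |S_3| = 2$ agrees with the two orbits found directly.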


You may well object that you have never counted orbits in your life and can’t
imagine a situation where it would be useful to do so. Here are a couple of examples
where orbits arise in a natural way.
Orbit Example 1. Let’s call an organic hexad a ring-shaped molecule consisting
of six atoms, each of which is either a carbon atom (C) or a hydrogen atom (H). How
many different organic hexads are possible? [Assume that all bonds are isomorphic.]
What makes this an orbit problem is the fact that each molecule can be rotated
into six different positions, and can be flipped across six axes. So each molecule
can be thought of as an orbit of the symmetry group $D_6$ of the regular hexagon on
the set $S$ of all possible labelings of each vertex with a letter C or H. Since each
vertex has two possible labelings and the choices are independent, there are $2^6 = 64$
different labelings, i.e. $|S| = 64$. But we want to count the number of orbits of $D_6$
on $S$.
Let’s count fixed points instead. Suppose $\rho$ is a $60^\circ$ rotation in either direction.
If $L$ is a labeling with $\rho(L) = L$, then every vertex must have the same label
as its adjacent vertex, i.e. the labeling $L$ must be “monochromatic”, i.e., $L =
\{C, C, C, C, C, C\}$ or $\{H, H, H, H, H, H\}$. So

$$f(\rho) = 2.$$

On the other hand, if $\rho^2$ is a $120^\circ$ rotation, then adjacent vertices may be labeled
either the same or differently, but every other vertex must have the same label.
Hence there are two additional labelings fixed by $\rho^2$:

$$\{C, H, C, H, C, H\} \quad \text{and} \quad \{H, C, H, C, H, C\}.$$

Thus

$$f(\rho^2) = 4.$$

Likewise, labelings fixed by a $180^\circ$ rotation must have opposite vertices labeled the
same. So there are eight labelings fixed by $\rho^3$:

$$f(\rho^3) = 8.$$

There are two different kinds of reflections. If $r_v$ is a reflection fixing two opposite
vertices, then these two vertices may be labeled any way, but mirror-image vertices
must have the same label. So

$$f(r_v) = 2 \times 2 \times 2 \times 2 = 16.$$

If $r_e$ is a reflection fixing no vertices, then there are three mirror-image pairs and

$$f(r_e) = 2 \times 2 \times 2 = 8.$$

Next we add up the number of fixed points, keeping track of the number of
symmetries of each type:

$$\sum_{g \in D_6} f(g) = f(I) + 2f(\rho) + 2f(\rho^2) + f(\rho^3) + 3f(r_v) + 3f(r_e) = 64 + 4 + 8 + 8 + 48 + 24 = 156.$$

Finally we divide by $|D_6|$ to get that the number of distinct organic hexads is
$\frac{156}{12} = 13$.
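
The count of 13 can be confirmed by machine in two independent ways. The sketch below is illustrative: it assumes the hexagon's vertices are numbered 0 to 5, models $D_6$ as the maps $i \mapsto i + k$ and $i \mapsto k - i$ modulo 6, and uses hypothetical helper names. One function applies the Orbit Counting Formula; the other counts orbits by picking a canonical representative of each.

```python
from itertools import product

def dihedral_group(n):
    """Symmetries of the regular n-gon as permutations of the vertices 0..n-1."""
    rotations = [tuple((i + k) % n for i in range(n)) for k in range(n)]
    reflections = [tuple((k - i) % n for i in range(n)) for k in range(n)]
    return rotations + reflections

def fixed_count(g, labelings):
    """Number of labelings L with L[g(i)] == L[i] for every vertex i."""
    return sum(1 for L in labelings if all(L[g[i]] == L[i] for i in range(len(g))))

def orbit_count_by_formula(group, labelings):
    total = sum(fixed_count(g, labelings) for g in group)
    assert total % len(group) == 0
    return total // len(group)

def orbit_count_by_brute_force(group, labelings):
    """Represent each orbit by its lexicographically smallest relabeling."""
    canonical = {
        min(tuple(L[g[i]] for i in range(len(g))) for g in group)
        for L in labelings
    }
    return len(canonical)

d6 = dihedral_group(6)
labelings = list(product("CH", repeat=6))
total_fixed = sum(fixed_count(g, labelings) for g in d6)
print(total_fixed)
print(orbit_count_by_formula(d6, labelings), orbit_count_by_brute_force(d6, labelings))
```

Both methods agree, and the total number of fixed points matches the hand computation of 156.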
Note: Don’t tell your chemistry professor about “organic hexads”. They have
no basis in chemical reality. However similar arguments can be used in genuine
chemistry problems.
Before proceeding to the next example, we make a useful observation, which is
implicit in the last calculation.
Lemma 6.3. Let $G$ be a group of permutations of a finite set $X$ and let $f(g)$ denote
the number of fixed points of the element $g \in G$. If $h \circ g \circ h^{-1}$ is any conjugate of
$g$ in $G$, then

$$f(g) = f(h \circ g \circ h^{-1}).$$

Proof. For any element $g \in G$, let

$$F(g) = \{x \in X : g(x) = x\}.$$

Then $f(g) = |F(g)|$. We shall show that

$$F(h \circ g \circ h^{-1}) = h(F(g)).$$

Then, since $h$ is a bijective mapping on $X$,

$$|F(h \circ g \circ h^{-1})| = |F(g)|,$$

as claimed.
First let $x \in F(g)$. Then

$$(h \circ g \circ h^{-1})(h(x)) = (h \circ g)(h^{-1}(h(x))) = h(g(x)) = h(x).$$

Thus $h(x) \in F(h \circ g \circ h^{-1})$ whenever $x \in F(g)$, i.e.

$$h(F(g)) \subseteq F(h \circ g \circ h^{-1}).$$

Secondly, let $y \in F(h \circ g \circ h^{-1})$. We wish to show that $h^{-1}(y) \in F(g)$. Now

$$y = (h \circ g)(h^{-1}(y)).$$

So, applying $h^{-1}$ to both sides, we get

$$h^{-1}(y) = h^{-1}((h \circ g)(h^{-1}(y))) = (h^{-1} \circ h)(g(h^{-1}(y))) = g(h^{-1}(y)).$$

So $h^{-1}(y) \in F(g)$, as desired. Thus

$$F(h \circ g \circ h^{-1}) \subseteq h(F(g)),$$

and so

$$F(h \circ g \circ h^{-1}) = h(F(g)),$$

as claimed.
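
The set identity $F(h \circ g \circ h^{-1}) = h(F(g))$ at the heart of this proof can be tested exhaustively in a small case. The sketch below is illustrative and takes $G = S_4$ acting on $\{0, 1, 2, 3\}$; the helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first, then p); p[i] is the image of i."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def fixed_points(p):
    return {i for i in range(len(p)) if p[i] == i}

group = list(permutations(range(4)))
lemma_holds = all(
    fixed_points(compose(compose(h, g), inverse(h))) == {h[x] for x in fixed_points(g)}
    for g in group
    for h in group
)
print(lemma_holds)
```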
Thus, in order to perform the calculations required for the Orbit Counting
Formula, we only need to compute $f(g)$ for one representative of each conjugacy class
of $G$, and we need to know the size of each conjugacy class. This is in fact what we
did in Example 1, where the “types” of rotations and reflections were actually the
different conjugacy classes of the group $D_6$.
Example 2. Let’s call a crystal tetrad a crystalline molecule in the shape of a
tetrahedron with each vertex containing either a silicon atom (Si), an oxygen atom
(O), or a hydrogen atom (H). How many different crystal tetrads are possible?
This problem is very similar to the previous one, only now we are considering
the orbits of the symmetry group of the tetrahedron on the set $S$ of all labelings of
the vertices of the tetrahedron with the label Si, O or H. Thus there are $3^4 = 81$
possible labelings, i.e. $|S| = 81$.
We have seen that the full symmetry group of the tetrahedron is isomorphic to $S_4$, acting as the group
of all possible permutations of the four vertices of the tetrahedron. We have also
seen that two permutations in $S_n$ are conjugate if and only if they have the same
cycle structure. Here we must compute fixed points for five different permutations:
$(1)$, $(1, 2)$, $(1, 2, 3)$, $(1, 2, 3, 4)$, and $(1, 2)(3, 4)$.
If a labeling $L$ is fixed by the isometry $g$, then two vertices of the tetrahedron which are in the
same $g$-orbit must have the same label, while vertices in different $g$-orbits may be
labeled independently. Thus we easily establish the following table:

Type(g)         |Type(g)|    f(g)
(1)             1            81
(1, 2)          6            27
(1, 2, 3)       8            9
(1, 2, 3, 4)    6            3
(1, 2)(3, 4)    3            9

Thus we conclude that the number of crystal tetrads is

$$\frac{(1 \times 81) + (6 \times 27) + (8 \times 9) + (6 \times 3) + (3 \times 9)}{24} = \frac{27 + 54 + 24 + 6 + 9}{8} = \frac{120}{8} = 15.$$
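
The table reflects the fact that a labeling fixed by $g$ is constant on each cycle of $g$, so $f(g) = 3^{c(g)}$ where $c(g)$ is the number of cycles of $g$ (fixed points included). A short sketch along these lines follows; it is illustrative only, with vertices numbered 0 to 3 and hypothetical function names. As a side computation, not part of the example above, it also evaluates the count for the 12 even permutations alone.

```python
from itertools import permutations

def cycle_count(p):
    """Number of cycles of p, counting fixed points as cycles of length 1."""
    seen, cycles = set(), 0
    for start in range(len(p)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = p[j]
    return cycles

def count_labelings(group, colors):
    """Orbit Counting Formula with f(g) = colors ** (number of cycles of g)."""
    total = sum(colors ** cycle_count(g) for g in group)
    assert total % len(group) == 0
    return total // len(group)

s4 = list(permutations(range(4)))
# A permutation of n points with c cycles is even exactly when n - c is even.
a4 = [g for g in s4 if (4 - cycle_count(g)) % 2 == 0]
print(count_labelings(s4, 3), count_labelings(a4, 3))
```

Both groups give 15 here, so in this particular example the answer does not depend on whether reflections of the tetrahedron are allowed.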

Here is a somewhat more complicated example, taken from “Abstract Algebra”
by Ted Shifrin.
Example 3. Suppose we are going to paint two faces of a cube red, two white, and
two blue. How many di↵erently colored cubes are possible?
First we need to count the size of the set C of possible colorings. We have to
choose two faces out of six to color red. We can do this in 6⇥5 2 = 15 ways. Then
we must choose two of the remaining four faces to color white. We can do this in
4⇥3
2 = 6 ways. So there are 15 ⇥ 6 = 90 colorings.
Next we must count the number of orbits of the octahedral group O on C. Again,
we do this by counting fixed points for each type of rotation in O. We recall the
types:
(1) 6 90o rotations (clockwise or counterclockwise) about the axes joining op-
posite faces;
(2) 3 180o rotations (clockwise or counterclockwise) about the axes joining op-
posite faces;
(3) 8 120o rotations (clockwise or counterclockwise) about the axes joining op-
posite vertices;
(4) 6 180o rotations (clockwise or counterclockwise) about the axes joining op-
posite edges; and
(5) 1 identity rotation.
If $\rho$ is a 90° rotation and $c$ is a coloring fixed by $\rho$, then four faces of $c$ must have
the same color, which is impossible. So $f(\rho) = 0$. Similarly, if $\rho$ is a 120° rotation,
then the three faces having a common vertex must be colored the same. So, again
$f(\rho) = 0$.
If $\rho$ is a 180° rotation about the center of two opposite faces, then there are
$3 \times 2 \times 1$ colorings fixed by $\rho$: the rotated faces must be colored in pairs. So the
fixed faces must share the same third color. Thus $f(\rho) = 6$.
If $\rho$ is a 180° rotation about the axis joining midpoints of opposite edges, then
pairs of interchanged faces must share the same color. Again, $f(\rho) = 6$.
Of course, $f(I) = 90$, and so we have that the answer is
\[
\frac{3 \times 6 + 6 \times 6 + 1 \times 90}{24} = \frac{144}{24} = 6.
\]
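
The count can be confirmed by brute force. The sketch below is one possible check, not part of the original argument: it assumes the faces are identified with the outward unit normals $\pm e_1, \pm e_2, \pm e_3$ and builds the rotations of the cube as the signed permutation matrices of determinant 1. The names `FACES`, `cube_rotations` and `count_orbits` are illustrative.

```python
from itertools import permutations, product

# The six faces of the cube, identified with their outward unit normals.
FACES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def cube_rotations():
    """Return the 24 rotations of the cube as permutations of face indices."""
    rotations = []
    for perm in permutations(range(3)):
        inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if perm[i] > perm[j])
        for signs in product((1, -1), repeat=3):
            # The signed permutation matrix sends e_i to signs[i] * e_perm[i];
            # its determinant is sign(perm) * signs[0] * signs[1] * signs[2].
            if (-1) ** inversions * signs[0] * signs[1] * signs[2] != 1:
                continue
            images = []
            for face in FACES:
                image = [0, 0, 0]
                for i in range(3):
                    image[perm[i]] = signs[i] * face[i]
                images.append(FACES.index(tuple(image)))
            rotations.append(tuple(images))
    return rotations

def count_orbits(colorings, group):
    """Orbit Counting Formula: average number of colorings fixed by each g."""
    total_fixed = sum(
        sum(1 for c in colorings if all(c[g[i]] == c[i] for i in range(len(g))))
        for g in group
    )
    return total_fixed // len(group)

rotations = cube_rotations()
colorings = set(permutations("RRWWBB"))  # two red, two white, two blue faces
print(len(rotations), len(colorings), count_orbits(colorings, rotations))
```

The same two functions can be reused for the exercises on colored cubes by changing the set of colorings.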

We conclude this section with an amusing example.



Example 3. Let $p$ be a prime number. Imagine a pinwheel with $p$ identically shaped
pins. Suppose that there are $n$ different colors of pins to choose from. How many
different pinwheels are possible?
Since the pins are different, front to back, only rotations are possible, not
reflections. Thus we are counting the number of orbits of the cyclic group of $p$ rotations
on the set $S$ of $n^p$ different colorings. For any non-identity rotation $\rho$, there is only
one $\rho$-orbit on the $p$ pins. Hence the only colorings fixed by $\rho$ are the monochromatic
colorings. Thus
\[
f(\rho) = n
\]
for all non-identity $\rho$, of which there are $p - 1$. Also, clearly
\[
f(I) = n^p.
\]
Hence the number of different pinwheels is
\[
\frac{(1 \times n^p) + ((p - 1) \times n)}{p} = n + \frac{n^p - n}{p}.
\]
Since the number of different pinwheels must be an integer, we obtain the following
corollary.
Fermat’s Little Theorem. Let $p$ be a prime and $n$ be any natural number. Then
\[
n^p \equiv n \pmod{p}.
\]
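
As a sanity check on the pinwheel formula, the following sketch counts rotation classes of colorings directly for a few small primes and compares the result with $n + \frac{n^p - n}{p}$. The function name `count_pinwheels` and the chosen ranges of $p$ and $n$ are illustrative assumptions.

```python
from itertools import product

def count_pinwheels(p, n):
    """Count orbits of the cyclic group of p rotations on n-colorings of p pins."""
    seen = set()
    orbits = 0
    for coloring in product(range(n), repeat=p):
        if coloring in seen:
            continue
        orbits += 1
        # Mark every rotation of this coloring as already counted.
        for k in range(p):
            seen.add(coloring[k:] + coloring[:k])
    return orbits

for p in (2, 3, 5, 7):
    for n in (1, 2, 3, 4):
        # Fermat's Little Theorem guarantees that the division below is exact.
        assert (n ** p - n) % p == 0
        assert count_pinwheels(p, n) == n + (n ** p - n) // p
```

The direct count `count_pinwheels` makes sense for any number of pins, but the closed formula relies on $p$ being prime, which is the point of Exercises 3 and 4 below.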

Exercises
1. How many different colored tetrahedra are there in which each face is colored
either red or white or blue?
2. How many different colored cubes are there in which each face is colored
either red or white or blue?
3. How many different pinwheels with 6 identically shaped pins are there, if each
pin can be colored one of 4 different colors?
4. How many different pinwheels with 8 identically shaped pins are there, if each
pin can be colored one of 4 different colors?
5. Instead of pinwheels, consider bracelets with 8 identically sized spherical beads,
each of one of 4 different colors. How many different bracelets are there? [Now
reflectional symmetries must be considered, in addition to rotational symmetries.]
6. A toy pyramid in the shape of a regular tetrahedron is built out of six pegs.
Count the number of different designs if there are
(a) two each of red, white, and blue pegs; or
(b) three each of red and white pegs.
7. The skeleton of a cube is made out of twelve pegs. How many distinguishable
such cubes can be made from:
(a) seven blue and five white pegs?
(b) six blue, two white, and four red pegs?
8. A soccer ball is more or less a regular dodecahedron in which the vertices have
been replaced by regular hexagons. An artistic soccer ball manufacturer wants to
make soccer balls in which the hexagons are all black, but four of the pentagons
are colored silver, four gold, and four scarlet. How many differently colored soccer
balls can he manufacture?
9. How many different square patchwork quilts, four patches by four patches,
can be made from six red, four white, and six blue squares, assuming that the quilts
(a) cannot be turned over?
(b) can be turned over?
10. Prove: Let $G$ be a finite group of permutations acting transitively on the
finite set $X$, where $|X| > 1$. There exists at least one element $g \in G$ having no
fixed point on $X$, i.e., such that $g(x) \neq x$ for all $x \in X$.

7. Finite Subgroups of SO(3)


In this concluding chapter on symmetries and isometries, we shall combine all of
the ideas we have developed so far to prove a very beautiful theorem of mid-19th
century mathematics.
Theorem 7.1. Let $G$ be a finite group of rotations of $\mathbb{R}^3$. Then one of the following
conclusions holds:
(1) $G$ is a finite cyclic group of rotations, all about the same axis $L$; or
(2) $G$ fixes a line $L$ as a set and a point $P$ on $L$, and $G$ induces on the plane
$L^\perp$ perpendicular to $L$ and passing through $P$ the full dihedral symmetry
group of some regular polygon; or
(3) $|G| = 12$ and $G$ induces the full group of rotational symmetries of some
tetrahedron; or
(4) $|G| = 24$ and $G$ induces the full group of rotational symmetries of some
octahedron; or
(5) $|G| = 60$ and $G$ induces the full group of rotational symmetries of some
icosahedron.

This is a very remarkable theorem, obviously closely related to the theorem that
there are only five Platonic solids. It challenges our intuition. Although there
are infinitely many different 2-dimensional rotation groups, one for each regular
polygon, there are only three different essentially 3-dimensional rotation groups.
Somehow, the extra dimension provides less freedom, not more.
In fact, we shall prove an even stronger statement.
Definition 7.2. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be a function. We call $f$ an affine
transformation if there exists a vector $v \in \mathbb{R}^n$ and a linear operator $g : \mathbb{R}^n \to \mathbb{R}^n$ such
that $f = T_v \circ g$, where $T_v : \mathbb{R}^n \to \mathbb{R}^n$ is translation by the vector $v$. We denote by
$\mathrm{Aff}(\mathbb{R}^n)$ the group of all invertible affine transformations of $\mathbb{R}^n$.
First we prove the following striking fact:
Theorem 7.3. Let $G$ be a finite subgroup of $\mathrm{Aff}(\mathbb{R}^n)$. Then $G$ fixes a point, i.e.
there exists a point $P \in \mathbb{R}^n$ such that $g(P) = P$ for all $g \in G$.
This statement is certainly false for many infinite subgroups of $\mathrm{Aff}(\mathbb{R}^n)$, e.g.
the translation group $T_n$. So we have to use the only two facts we know:
(1) $G$ is finite; and
(2) Every $g \in G$ acts as an affine transformation, i.e. for some vector $v \in \mathbb{R}^n$,
\[
g = T_v \circ f,
\]
where $T_v$ is translation by $v$ and $f : \mathbb{R}^n \to \mathbb{R}^n$ is a linear transformation.


We use statement (2) to understand affine transformations a little better:
Lemma 7.4. Let $g = T_v \circ f$ be an affine transformation of $\mathbb{R}^n$, where $f$ is a linear
transformation and $T_v$ is a translation map. Let $v_1, v_2, \dots, v_m$ be vectors in $\mathbb{R}^n$ and
let $c$ be a real scalar. Then the following formulas hold:
(1) $g(v_1 + v_2 + \cdots + v_m) = g(v_1) + g(v_2) + \cdots + g(v_m) - (m - 1)v$; and
(2) $g(c \cdot v_1) = c \cdot g(v_1) + (1 - c) \cdot v$.

Proof. (1) We have
\[
g(v_1 + \cdots + v_m) = f(v_1 + \cdots + v_m) + v = f(v_1) + \cdots + f(v_m) + v.
\]
On the other hand,
\[
g(v_1) + \cdots + g(v_m) - (m - 1)v = (f(v_1) + v) + \cdots + (f(v_m) + v) - (m - 1)v
\]
\[
= f(v_1) + \cdots + f(v_m) + mv - (m - 1)v = f(v_1) + \cdots + f(v_m) + v = g(v_1 + \cdots + v_m).
\]
(2) We have
\[
g(c \cdot v_1) = f(c \cdot v_1) + v = c \cdot f(v_1) + v = c \cdot (g(v_1) - v) + v = c \cdot g(v_1) + (1 - c) \cdot v.
\]


Now we can describe the averaging trick which will permit us to find a fixed
point for $G$. Let’s imagine first that $G$ is the cyclic group generated by a rotation
$\rho$ about a point $P$ through an angle of $\frac{2\pi}{n}$. Hence $\rho^n = I$. Now suppose we didn’t
know $P$. We could pick a random point $Q$ and look at the $\rho$-orbit of $Q$:
\[
Q, \rho(Q), \rho^2(Q), \dots, \rho^{n-1}(Q).
\]
The next step, $\rho^n(Q)$, would take us back to $Q$. What we would see is $n$ points
evenly spaced on the circumference of a circle, and we would realize that $P$ must be
the center of that circle. If we change the choice of $Q$, we change the set of points
and we probably even change the circle, but the center is always $P$!
This is the magic that we will exploit.
First we need the following remark.
Lemma 7.5. Let $G$ be a group and let $h$ be an element of $G$. Consider the function
$\lambda_h : G \to G$ defined by
\[
\lambda_h(g) = h \circ g \quad \text{for all } g \in G.
\]
This is a bijective function.
Proof. Since $G$ is a group, $h \circ g \in G$ for all $g \in G$. So $\lambda_h$ is indeed a function from
$G$ into $G$. Suppose $h \circ g = h \circ g_1$. Then by the Left Cancellation Law, $g = g_1$. Thus
$\lambda_h$ is an injective map.
For any $g \in G$, $h^{-1} \circ g \in G$, and
\[
\lambda_h(h^{-1} \circ g) = h \circ (h^{-1} \circ g) = (h \circ h^{-1}) \circ g = g.
\]
Thus $\lambda_h$ is also a surjective map, hence a bijective map.

Now we can prove Theorem 7.3.
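Proof of Theorem 7.3. Let $v$ be any point in $\mathbb{R}^n$. Let
\[
\bar{v} = \frac{1}{|G|} \cdot \sum_{g \in G} g(v).
\]
Let $h \in G$. We claim that $h(\bar{v}) = \bar{v}$. Write $h = T_w \circ f$, where $T_w$ is translation by
the vector $w$ and $f \in GL(\mathbb{R}^n)$. Then by Lemma 7.4,
\[
h(\bar{v}) = \frac{1}{|G|} \cdot h\Big(\sum_{g \in G} g(v)\Big) + \Big(1 - \frac{1}{|G|}\Big) \cdot w
= \frac{1}{|G|} \Big(\sum_{g \in G} (h \circ g)(v) - (|G| - 1) \cdot w\Big) + \Big(1 - \frac{1}{|G|}\Big) \cdot w.
\]
Now by Lemma 7.5, $\sum_{g \in G} (h \circ g)(v) = \sum_{g \in G} g(v)$, since as $g$ runs through the
elements of $G$ once each, so does $h \circ g$. Hence
\[
h(\bar{v}) = \frac{1}{|G|} \Big(\sum_{g \in G} g(v) - (|G| - 1) \cdot w\Big) + \Big(1 - \frac{1}{|G|}\Big) \cdot w
= \bar{v} - \frac{|G| - 1}{|G|} \cdot w + \Big(1 - \frac{1}{|G|}\Big) \cdot w = \bar{v},
\]
completing the proof of Theorem 7.3.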
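
The averaging trick is easy to try numerically. The sketch below treats the plane as the complex numbers and takes $G$ to be the cyclic group generated by a rotation about a "hidden" center; the names `rotation_about` and `orbit_average`, the center $2 - i$ and the choice $n = 7$ are all illustrative assumptions. The average of any orbit should recover the center, up to floating-point error.

```python
import cmath
import random

def rotation_about(center, n):
    """Rotation of the plane (viewed as C) through the angle 2*pi/n about `center`."""
    omega = cmath.exp(2j * cmath.pi / n)
    return lambda z: center + omega * (z - center)

def orbit_average(rho, n, start):
    """Average of start, rho(start), ..., rho^(n-1)(start)."""
    total = 0
    z = start
    for _ in range(n):
        total += z
        z = rho(z)
    return total / n

center = 2 - 1j                      # the fixed point, "unknown" to the observer
rho = rotation_about(center, 7)
start = complex(random.uniform(-10, 10), random.uniform(-10, 10))
average = orbit_average(rho, 7, start)
print(abs(average - center) < 1e-9)
```

Here $\rho = T_w \circ f$ with $f$ multiplication by $\omega = e^{2\pi i/n}$ and $w = (1 - \omega) \cdot \text{center}$, so this is a special case of the computation in the proof.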

Corollary 7.6. Let $G$ be a finite subgroup of $\mathrm{Isom}(\mathbb{R}^3)$. Then $G$ is conjugate to
a subgroup of $O(3)$.
Proof. By Theorem 7.3, there is a point $v \in \mathbb{R}^3$ such that $g(v) = v$ for all $g \in G$.
Then
\[
(T_{-v} \circ g \circ T_v)(0, 0, 0) = T_{-v}(g(v)) = T_{-v}(v) = (0, 0, 0).
\]
Hence $T_v^{-1} \circ G \circ T_v \subseteq O(3)$, as claimed.

We now study the finite subgroups of $SO(3)$. It is a bit more complicated to
describe all of the finite subgroups of $O(3)$. You will explore this in the exercises.
Let $G$ be a finite subgroup of $SO(3)$ with $G \neq \{I\}$. Recall that every
non-identity element $\rho$ of $G$ is a non-identity rotation about an axis passing through
$(0, 0, 0)$, and the only points of $\mathbb{R}^3$ fixed by $\rho$ lie on this axis of rotation. We let $S^2$
denote the unit sphere in $\mathbb{R}^3$:
\[
S^2 = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1\}.
\]
Then the axis of rotation of $\rho$ intersects $S^2$ in exactly two antipodal points $(a, b, c)$
and $(-a, -b, -c)$. We think of $(a, b, c)$ and $(-a, -b, -c)$ as the north and south
poles for the rotation $\rho$. Let
\[
\mathcal{P} = \{P = (a, b, c) \in S^2 : g(P) = P \text{ for some } g \in G - \{I\}\}.
\]
We call $\mathcal{P}$ the set of poles for the group $G$. Notice that each non-identity element of
$G$ contributes two poles to the set $\mathcal{P}$. However, two different non-identity elements
may contribute the same pair of poles. In any case, we see that:

Lemma 7.7. We have
\[
2 \le |\mathcal{P}| \le 2(|G| - 1).
\]
Thus $\mathcal{P}$ is a finite set. Even better, we have the following fact.


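Lemma 7.8. $G$ acts as a group of permutations of the set $\mathcal{P}$, i.e., if $P \in \mathcal{P}$ and
$g \in G$, then $g(P) \in \mathcal{P}$.
Proof. Let $P \in \mathcal{P}$. Then by definition of $\mathcal{P}$, there exists some $h \in G$ with $h \neq I$
and with $h(P) = P$. Now let $g$ be any element of $G$. Then
\[
(g \circ h \circ g^{-1})(g(P)) = (g \circ h)(g^{-1}(g(P))) = (g \circ h)(P) = g(h(P)) = g(P).
\]
Also, if $g \circ h \circ g^{-1} = I$, then
\[
h = g^{-1} \circ (g \circ h \circ g^{-1}) \circ g = g^{-1} \circ I \circ g = I,
\]
contrary to the fact that $h \neq I$. Hence $g \circ h \circ g^{-1} \neq I$ and $(g \circ h \circ g^{-1})(g(P)) = g(P)$.
So $g(P) \in \mathcal{P}$ for all $P \in \mathcal{P}$ and all $g \in G$, as claimed.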

Remarkably, we can now count the number of $G$-orbits on $\mathcal{P}$. Except for one
trivial case, there must be exactly three orbits.
Lemma 7.9. The following conclusions hold:
(1) $G$ has either two or three orbits on $\mathcal{P}$.
(2) If $G$ has only two orbits on $\mathcal{P}$, then $\mathcal{P} = \{P, -P\}$ is a pair of antipodal
points on $S^2$, and $G$ is a cyclic group of rotations about the axis $L$ through
$P$ and $-P$.
(3) If $G$ has three orbits on $\mathcal{P}$, then $|\mathcal{P}| = |G| + 2$.

Proof. Using the notation of the Orbit Counting Formula, $f(g) = 2$ for all
$g \in G - \{I\}$, while $f(I) = |\mathcal{P}|$. Let $m$ be the number of $G$-orbits on $\mathcal{P}$. Then, by the Orbit Counting
Formula,
\[
m = \frac{(|G| - 1) \cdot 2 + |\mathcal{P}|}{|G|} = 2 + \frac{|\mathcal{P}| - 2}{|G|}.
\]
Thus $m \ge 2$ and, if $m = 2$, then $|\mathcal{P}| = 2$, while if $m = 3$, then $|\mathcal{P}| = |G| + 2$.
Since $|\mathcal{P}| \le 2(|G| - 1)$ by Lemma 7.7, we have that
\[
m = 2 + \frac{|\mathcal{P}| - 2}{|G|} \le 2 + \frac{2|G| - 4}{|G|} = 2 + 2 - \frac{4}{|G|} < 4.
\]
Hence $m = 2$ or $3$.
Finally, suppose that $m = |\mathcal{P}| = 2$. If $P$ is a pole of $G$, then so is the antipodal
point $-P$. Hence $\mathcal{P} = \{P, -P\}$ and these points are fixed by every element of $G$.
Thus the line $L$ through $P$ and $-P$ is held pointwise fixed by every element of $G$,
and so using Theorem 12.12 in the Math 4580 text, we see that $G$ acts as a finite
cyclic group of rotations of the plane $L^\perp$ through $(0, 0, 0)$ perpendicular to $L$.

For the remainder of this chapter, we shall assume that $m = 3$, and let $O_1$, $O_2$,
and $O_3$ be the three $G$-orbits on $\mathcal{P}$. Choose notation so that
\[
|O_1| \ge |O_2| \ge |O_3|.
\]
Let $P$ be a point in $O_1$, $Q$ a point in $O_2$, and $R$ a point in $O_3$. Recall the
notation:
\[
G_P = \{g \in G : g(P) = P\}.
\]
By Lagrange’s Orbit Stabilizer Theorem,
\[
|G| = |O_1| \cdot |G_P| = |O_2| \cdot |G_Q| = |O_3| \cdot |G_R|.
\]
Let $|G_P| = p$, $|G_Q| = q$, and $|G_R| = r$. By the ordering of the orbits, we have
\[
p \le q \le r.
\]
Also, since every pole is fixed both by $I$ and by at least one non-identity element
of $G$, we have
\[
2 \le p \le q \le r.
\]

We now obtain the key formula, which may be interpreted as a statement in
spherical geometry:

Lemma 7.10.
\[
\frac{1}{p} + \frac{1}{q} + \frac{1}{r} = 1 + \frac{2}{|G|} > 1.
\]
Proof. By Lemma 7.9(3),
\[
|G| + 2 = |\mathcal{P}| = |O_1| + |O_2| + |O_3|.
\]
Hence, dividing by $|G|$ and using Lagrange’s formula, we have
\[
1 + \frac{2}{|G|} = \frac{|O_1|}{|G|} + \frac{|O_2|}{|G|} + \frac{|O_3|}{|G|} = \frac{1}{p} + \frac{1}{q} + \frac{1}{r}.
\]


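From this, we immediately get that the following are the only possibilities for $p$,
$q$, and $r$.

Lemma 7.11. One of the following possibilities holds:
(1) $p = q = 2$ and $r = \frac{|G|}{2}$; or
(2) $p = 2$, $q = r = 3$, and $|G| = 12$; or
(3) $p = 2$, $q = 3$, $r = 4$, and $|G| = 24$; or
(4) $p = 2$, $q = 3$, $r = 5$, and $|G| = 60$.

Proof. If $p > 2$, then
\[
\frac{1}{p} + \frac{1}{q} + \frac{1}{r} \le \frac{1}{3} + \frac{1}{3} + \frac{1}{3} = 1 < 1 + \frac{2}{|G|},
\]
contrary to Lemma 7.10. Hence $p = 2$.
If $p = q = 2$, then by Lemma 7.10, $\frac{1}{r} = \frac{2}{|G|}$. So $r = \frac{|G|}{2}$, as claimed in (1).
Suppose now that $p = 2$ and $q > 3$. Then
\[
\frac{1}{p} + \frac{1}{q} + \frac{1}{r} \le \frac{1}{2} + \frac{1}{4} + \frac{1}{4} = 1,
\]
again contrary to Lemma 7.10. Hence we may assume that $p = 2$ and $q = 3$. If
$r > 5$, then
\[
\frac{1}{p} + \frac{1}{q} + \frac{1}{r} \le \frac{1}{2} + \frac{1}{3} + \frac{1}{6} = 1,
\]
again contrary to Lemma 7.10. Hence if $q = 3$, then $r \in \{3, 4, 5\}$. In each of these
three cases, Lemma 7.10 determines $|G|$: the sum $\frac{1}{2} + \frac{1}{3} + \frac{1}{r}$ equals $\frac{7}{6}$, $\frac{13}{12}$, or $\frac{31}{30}$,
so $\frac{2}{|G|}$ equals $\frac{1}{6}$, $\frac{1}{12}$, or $\frac{1}{30}$, and $|G| = 12$, $24$, or $60$, completing the
proof.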
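
The case analysis in Lemma 7.11 can be double-checked by a finite search. The sketch below uses exact rational arithmetic to list the triples $2 \le p \le q \le r$ with $q \ge 3$ that satisfy the inequality of Lemma 7.10, together with the value of $|G|$ forced by that lemma. The function name `sporadic_triples` and the search bound 30 are assumptions made for illustration; the proof above shows that no larger values can occur.

```python
from fractions import Fraction

def sporadic_triples(bound=30):
    """Triples (p, q, r), 2 <= p <= q <= r <= bound, q >= 3, with 1/p + 1/q + 1/r > 1.

    Each triple is mapped to the group order |G| determined by
    1/p + 1/q + 1/r = 1 + 2/|G|.
    """
    found = {}
    for p in range(2, bound + 1):
        for q in range(max(p, 3), bound + 1):
            for r in range(q, bound + 1):
                excess = Fraction(1, p) + Fraction(1, q) + Fraction(1, r) - 1
                if excess > 0:
                    found[(p, q, r)] = 2 / excess
    return found

for triple, order in sorted(sporadic_triples().items()):
    print(triple, order)
```

The excluded family $p = q = 2$ is case (1): there the excess is $\frac{1}{r}$ for every $r$, so $|G| = 2r$.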

We notice that in the last three cases, $G$ has the cardinality of the tetrahedral
group, the octahedral group or the icosahedral group. But first, let’s try to
understand Case 1.
Lemma 7.12. If $p = q = 2$, then $O_3 = \{R, -R\}$. Let $L$ be the line through $R$ and
$-R$, and let $\Pi$ be the plane through $(0, 0, 0)$ perpendicular to $L$. Then $G$ acts as a
dihedral group of isometries of the plane $\Pi$, containing $r$ rotations and $r$ reflections.
Proof. As $r = |G_R| = \frac{|G|}{2}$, $G_R$ is a normal subgroup of $G$ with
\[
G = G_R \cup f G_R,
\]
for some $f \in G - G_R$. Every element of $G_R$ fixes both $R$ and $-R$, and so $G_R$ acts
as a cyclic group of rotations of the plane $\Pi$ about the point $(0, 0, 0)$. Let $g \in G_R$
with $g \neq I$. Then, since $G_R$ is a normal subgroup of $G$, $g_1 := f^{-1} \circ g \circ f$ is also an
element of $G_R$. Hence $g \circ f = f \circ g_1$, and so
\[
g(f(R)) = (g \circ f)(R) = (f \circ g_1)(R) = f(g_1(R)) = f(R).
\]
Hence $f(R)$ is a pole of $g$. But the poles of $g$ are $R$ and $-R$. Since $f$ is not in $G_R$,
$f(R) \neq R$. So $f(R) = -R$ and $f(-R) = R$.
In any case, $f$ fixes setwise both the line $L$ and the plane $\Pi$ perpendicular to $L$.
Hence every element of $G$ fixes setwise both the line $L$ and the plane $\Pi$.
Also, notice that, since $f(R) = -R$, the two poles, $S$ and $-S$, of $f$ lie in the plane
$\Pi$. Now $f^2(R) = R$, and so $f^2$ fixes the non-collinear points $R$, $S$, and $(0, 0, 0)$. So
$f^2 = I$ for all $f \in G - G_R$. Hence $f$ is a 180° rotation about the line $M$ through $S$
and $-S$. So $f$ induces a reflection of the plane $\Pi$ across the mirror-line $M$.
Thus $G$ acts as a dihedral group of isometries of the plane $\Pi$ with the $r$ elements
of $G_R$ acting as rotations of $\Pi$ about $(0, 0, 0)$, and with the $r$ elements of $G - G_R$
acting as reflections across lines in $\Pi$ passing through $(0, 0, 0)$.

Next we consider the case
\[
p = 2, \quad q = r = 3.
\]
Here $|G| = 12$, and we write $T$ for $G$. Consider the orbit $O_2$ of size 4. Since no
non-identity rotation fixes four poles, the map $T \to \mathrm{Sym}(O_2)$ given by restriction
of domain is an injective map of $T$ into
$\mathrm{Sym}(O_2) \cong S_4$. Since $|T| = 12$, $T$ is isomorphic to the unique subgroup of $S_4$ of
cardinality 12, namely, the alternating group $A_4$. As you will show in Exercise 1
below, $A_4$ transitively permutes the set of unordered pairs $\{\{i, j\} : 1 \le i < j \le 4\}$.
Translating this into geometry: $T$ transitively permutes the set of six edges joining
pairs of vertices in $O_2$. As $T$ is a group of isometries, all edges have the same
length, i.e. the figure $S(4)$ formed in this way is a regular tetrahedron, and $T$ is the group
of rotational symmetries of $S(4)$. If we were to use the other orbit of length 4, $O_3$,
we would have constructed the dual tetrahedron, $S(4)^*$.
Next, we consider the case
\[
p = 2, \quad q = 3, \quad r = 4.
\]
Now we have that $|G| = 24$ and so the orbit $O_3$ has length 6. Suppose that $X$ and
$-X$ are two antipodal poles. Then $G_X = G_{-X}$. In particular, $X$ and $-X$ lie in
orbits of the same length, namely, $\frac{|G|}{|G_X|}$. Since no two orbits have the same length
in this case (in contrast to the tetrahedral case), we see that $X$ and $-X$ lie in the
same orbit. In particular
\[
O_3 = \{R, -R, S, -S, T, -T\}
\]
for some poles $R$, $S$, $T$ and their antipodes.
The group $G_R$ acts as a group of rotations of the latitudinal planes perpendicular
to the axis passing through $R$ and $-R$. Hence, as we have seen from studying the
2-dimensional case, $G_R$ is a cyclic group of cardinality $4 = \frac{24}{6}$. If $\rho$ is a cyclic
generator of $G_R$, then $\rho$ is a 90° rotation about the $\{R, -R\}$ axis. Since $\rho^2$ fixes
only the poles $R$ and $-R$, the set $\{S, -S, T, -T\}$ is a $\rho$-orbit on $\mathcal{P}$. As an exercise,
you are asked to show that this is possible if and only if $S, T, -S, -T$ lie at a set
of compass points on the equatorial plane relative to the poles $R$ and $-R$. Thus
$S, T, -S, -T$ determine a square in the equatorial plane, and $\{\pm R, \pm S, \pm T\}$ is the
vertex set of an octahedron $C^*$ with $G$ as its group of rotational symmetries.
The orbit $O_2$ contains four pairs of antipodal poles, which may be obtained as
follows: Draw the lines through the centers of opposite faces of the octahedron $C^*$.
Each such line $L$ gives a pair $\{Q_L, -Q_L\}$ of antipodal points on the unit sphere $S^2$,
which are poles for the rotational symmetries $\rho_F$ and $\rho_F^2$ of order 3 fixing the face
$F$. This set of eight points on $S^2$ is the set of points in the orbit $O_2$. Clearly, it
may be identified with the vertices of a cube which is a dilation of the cube $C$ dual
to the octahedron $C^*$.
Finally we say a few words about the most difficult case
\[
p = 2, \quad q = 3, \quad r = 5.
\]
Now $|G| = 60$ and the orbit $O_3$ has length 12. Again, because no two orbits have
the same length, antipodal poles lie in the same orbit. In particular, $G_R$ is a cyclic
group of cardinality 5, fixing the points $R$ and $-R$, while permuting the remaining
10 points of $O_3$ in two orbits of length 5.
Suppose one $G_R$-orbit consists of five points in the equatorial plane relative to $R$
and $-R$. Then, since these points are the vertices of a regular pentagon, they do not
contain antipodal pairs. Hence the other $G_R$-orbit must consist of their antipodal
points, also lying in the equatorial plane. Let $C$ be the great circle formed by the
intersection of this equatorial plane with the unit sphere $S^2$. Then, clearly $C$ is the
only great circle on $S^2$ containing 10 points from $O_3$. But then since $G$ acts on the
orbit $O_3$, $G$ fixes the circle $C$ (as a set), contrary to the fact that $O_3$ is a $G$-orbit
and $R$ does not lie on $C$.
Hence there are two latitudinal circles, $C_N$ and $C_S$, each containing the points
in one $G_R$-orbit of length 5 on $O_3$, placed at the vertices of a regular pentagon.
Moreover the regular pentagon on $C_N$ is antipodal to the regular pentagon on $C_S$.
Using some further symmetry arguments, it can be shown that $O_3$ is the set
of vertices of an icosahedron inscribed in $S^2$. And, then, in a similar way to the
octahedral case, it may be argued that $O_2$ is the set of vertices of an inscribed
dodecahedron, which is the dilation of the dodecahedron dual to the inscribed
icosahedron.
Finally, as $|G| = 60$, it follows that $G \cong I$, the group of all rotational symmetries
of the icosahedron.

Exercises
1. Let $G = A_4$, the alternating group on 4 letters. Let
\[
X = \{\{i, j\} : 1 \le i < j \le 4\}.
\]
Let $G$ act on $X$ via
\[
\sigma(\{i, j\}) = \{\sigma(i), \sigma(j)\}.
\]
Prove: $G$ acts transitively on the set $X$, i.e., for all $\{i, j\} \in X$, there exists $\sigma \in G$
with
\[
\sigma(\{1, 2\}) = \{i, j\}.
\]
2.(a) Justify the statement in the discussion of the octahedral case: “Since $\rho^2$
fixes only the poles $R$ and $-R$, the set $\{S, -S, T, -T\}$ is a $\rho$-orbit on $\mathcal{P}$.”
(b) In the same context as (2a), prove: $S, T, -S, -T$ lie at a set of compass
points on the equatorial plane relative to the poles $R$ and $-R$.
3. The Direct Product: Let $G$ be a group having subgroups $H$ and $K$ satisfying
the following two conditions:
(a) $h \circ k = k \circ h$ for all $h \in H$ and $k \in K$; and
(b) $H \cap K = \{I\}$.
Prove: The subset $HK$ of $G$ is a subgroup of $G$ which is isomorphic to the following
formal group, called the direct product of $H$ and $K$:
\[
H \times K := \{(h, k) : h \in H \text{ and } k \in K\},
\]
with the operation
\[
(h, k) \circ (h', k') = (h \circ h', k \circ k').
\]
4. Let $G = HK \cong H \times K$ be a group. Let $\pi_H : G \to H$ be the projection map
defined by:
\[
\pi_H(hk) = h \quad \text{for all } g = hk \in G.
\]
Define $\pi_K : G \to K$ analogously.
(a) Prove: Let $M$ be any subgroup of $G$. Then $\pi_H(M)$ is a subgroup of $H$ and
$\pi_K(M)$ is a subgroup of $K$.
(b) Prove: $M$ is a subgroup of the group $\pi_H(M) \cdot \pi_K(M) \cong \pi_H(M) \times \pi_K(M)$.
5.(a) Prove: $O(3) \cong SO(3) \times \{I, -I\} \cong SO(3) \times C_2$.
(b) Conclude that every finite subgroup of $O(3)$ is isomorphic to a subgroup of
$H \times K$, where $H$ is a finite subgroup of $SO(3)$ and $|K| = 1$ or $2$.
6. Let $S$ be a regular tetrahedron in $\mathbb{R}^3$. We have proved that $\mathrm{Sym}(S) \cong S_4$.
On the other hand, the tetrahedral group $T$ of all rotational symmetries of $S$ is
isomorphic to $A_4$. Verify that $S_4$ is not isomorphic to a subgroup of $T \times \{I, -I\}$.
Explain how this could be true.
7. If $S$ is either a regular octahedron or a regular icosahedron centered at $(0, 0, 0)$,
then the antipodal map $-I$ is a symmetry of $S$. Using this fact, prove that the
symmetry group of $S$ is $\mathrm{Sym}(S) = O \times \{\pm I\}$ if $S$ is an octahedron, and $\mathrm{Sym}(S) =
\mathrm{Icos} \times \{\pm I\}$ if $S$ is an icosahedron.
8. Let $D_3$ be the group of all diagonal matrices in $O(3)$. Prove: $D_3 \cong C_2 \times C_2 \times
C_2$, an abelian group of cardinality 8, all of whose non-identity elements have order
2.
9.(a) Prove: Let $p$ be a prime and let $G$ be a finite group of cardinality $p^2$. Then
either $G$ is cyclic or $G \cong C_p \times C_p$.
(b) Exhibit a subgroup $P$ of $S_6$ with $|P| = 9$. Verify that $P \cong C_3 \times C_3$.
10. Inner Product Spaces: A function $\langle \cdot, \cdot \rangle : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}$ is an inner product
on $\mathbb{R}^3$ if the following properties hold:
(i) $\langle u, v \rangle = \langle v, u \rangle$ for all $u, v \in \mathbb{R}^3$;
(ii) $\langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle$ for all $u, v, w \in \mathbb{R}^3$;
(iii) $\langle cu, v \rangle = c \cdot \langle u, v \rangle$ for all $u, v \in \mathbb{R}^3$, $c \in \mathbb{R}$; and
(iv) $\langle u, u \rangle \ge 0$ for all $u \in \mathbb{R}^3$, with equality if and only if $u = 0$.

(a) Prove: If $\langle \cdot, \cdot \rangle$ is an inner product on $\mathbb{R}^3$, there is an orthonormal basis for
$\mathbb{R}^3$ with respect to $\langle \cdot, \cdot \rangle$, i.e. there exist vectors $e_1, e_2, e_3$ such that
\[
\langle e_i, e_j \rangle = 0 \text{ for all } i \neq j, \text{ and } \langle e_i, e_i \rangle = 1 \text{ for all } i.
\]
(b) Let $G^*$ be the set of all linear isometries of $\mathbb{R}^3$ with respect to $\langle \cdot, \cdot \rangle$, i.e.,
\[
G^* := \{T \in GL(\mathbb{R}^3) : \langle u, v \rangle = \langle T(u), T(v) \rangle \ \text{for all } u, v \in \mathbb{R}^3\}.
\]
Prove: $G^*$ is a subgroup of $GL(\mathbb{R}^3)$ and $G^* \cong O(3)$.
[Hint: With respect to the standard basis for $\mathbb{R}^3$,
\[
O(3) \cong \{A \in GL(3, \mathbb{R}) : A^T A = I\}.
\]
Show that the same is true for $G^*$ with respect to a suitable choice of basis for $\mathbb{R}^3$.]
11. Another Averaging Trick. Prove E. H. Moore’s Theorem: Let $G$ be a finite
subgroup of $GL(3, \mathbb{R})$. Then $G$ is isomorphic to a subgroup of $O(3)$. [Hint: Define
a function on $\mathbb{R}^3 \times \mathbb{R}^3$ by
\[
\langle u, v \rangle = \frac{1}{|G|} \sum_{g \in G} g(u) \cdot g(v)
\]
for all $u, v \in \mathbb{R}^3$. Prove that $\langle \cdot, \cdot \rangle$ is an inner product on $\mathbb{R}^3$. Then, prove that $G$
is a group of linear isometries with respect to $\langle \cdot, \cdot \rangle$. Now use Exercise 10(b).]
12. Prove: Let $G$ be a finite subgroup of $\mathrm{Aff}(\mathbb{R}^3)$. Then $G$ is isomorphic to a
subgroup of $O(3)$.
[Hint: Use Theorem 7.3 and the proof of Corollary 7.6 to show that $G$ is conjugate
to a subgroup $G_1$ of $GL(3, \mathbb{R})$. Now use Exercise 11.]
13. Give an example of a bijective function $f : \mathbb{R} \to \mathbb{R}$ of finite order, i.e., $f^n = I$
for some $n \in \mathbb{N}$, such that $f$ fixes no point of $\mathbb{R}$, i.e., $f(x) \neq x$ for all $x \in \mathbb{R}$.
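14.(a) Consider the following set of $2 \times 2$ matrices with complex entries:
\[
Q_8 := \left\{ \pm I, \ \pm \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \ \pm \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \ \pm \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} \right\}.
\]
Prove: $Q_8$ is a subgroup of the group of all invertible $2 \times 2$ matrices with complex entries.
(b) Let $m \in \mathbb{N}$ and let $\zeta_m = \cos(\frac{2\pi}{m}) + i \sin(\frac{2\pi}{m})$. Using De Moivre’s Formula,
prove that $\zeta_m^m = 1$, but $\zeta_m^d \neq 1$ for all $d < m$, $d \in \mathbb{N}$.
(c) Let $p$ and $q$ be distinct prime numbers. Let
\[
Z_{pq} = \begin{pmatrix} \zeta_p & 0 \\ 0 & \zeta_q \end{pmatrix}.
\]
Prove: $Z_{pq}$ generates a
cyclic subgroup of cardinality $pq$ in $GL(2, \mathbb{C})$, the group of $2 \times 2$ invertible matrices
with complex entries.
(d) Let $p$ be a prime number. Let $a, b \in \mathbb{N}$ with $a \le b$. Let $H$ be the subgroup
of $GL(2, \mathbb{C})$ generated by the two matrices:
\[
\begin{pmatrix} \zeta_{p^a} & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 0 \\ 0 & \zeta_{p^b} \end{pmatrix}.
\]
Prove: $H \cong C_{p^a} \times C_{p^b}$.
(e) Prove: Let $L$ be the subgroup of $GL(2, \mathbb{C})$ generated by the two matrices:
\[
\begin{pmatrix} \zeta_3 & 0 \\ 0 & \zeta_3^{-1} \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.
\]
Then $L$ is a nonabelian group of cardinality 12 with $|Z(L)| = 2$ and with only one
element of order 2.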

Remark. We have now constructed examples of all finite groups $G$ with $|G| \le 15$:
(1) $C_n$, $n \le 15$;
(2) $V_4 = D_2 \cong C_2 \times C_2$;
(3) $D_3 \cong S_3$;
(4) $D_4$;
(5) $C_4 \times C_2 \cong \left\langle \begin{pmatrix} i & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \right\rangle$;
(6) $C_2 \times C_2 \times C_2$, the group of all diagonal matrices in $O(3)$;
(7) $Q_8 \subseteq GL(2, \mathbb{C})$;
(8) $C_3 \times C_3 \subseteq S_6$;
(9) $D_5$;
(10) $C_6 \times C_2 \subseteq GL(2, \mathbb{C})$;
(11) $D_6$;
(12) $A_4$;
(13) $L \subseteq GL(2, \mathbb{C})$;
(14) $D_7$.

15. Justify the statement that no two of the 28 groups listed above are isomorphic.
It is not terribly difficult, but it is a bit beyond the scope of this course to prove
that every group $G$ with $|G| \le 15$ is isomorphic to one of these 28 groups.

8. Imaginaries and Galois fields

“... and gives to airy nothing a local habitation and a name ...”
– Wm. Shakespeare
We now change topic, returning to the theme of polynomial equations and their
roots. De Moivre’s Formula, which you studied last semester, demonstrated the
existence of exactly $n$ complex $n$th roots for any non-zero number, i.e. a full complement
of $n$ solutions to the equation $x^n - \alpha = 0$ could be found among the complex
numbers, for any given non-zero complex number $\alpha$. This lent support to the idea that every
polynomial equation of degree $n$ with complex coefficients should have a full set of $n$
solutions (counting multiplicity) in the field $\mathbb{C}$. A somewhat cryptic version of this
statement was first made (without proof) by the French mathematician Girard as
early as 1629 (before Descartes published his Factor Theorem), along with formulas
expressing the coefficients as symmetric functions of the roots.
Nevertheless, as late as the early 1700s, Leibniz thought he had a counterex-
ample. D’Alembert published a somewhat incomplete proof of this “Fundamental
Theorem of Algebra” using the methods of calculus, in 1746. Euler attempted a
more algebraic proof in 1749, which was improved by de Foncenex and then La-
grange in 1772.
There was one big problem with Euler’s proof, and this was pointed out by
Gauss, who proposed his own calculus-based proof in 1799, and a second more
algebraic proof in 1816. The problem detected by Gauss was:
Euler’s proof assumed the existence somewhere of a set of n roots of an nth
degree polynomial p(x) with real coefficients. The proof then proceeded to show
that these roots were in fact complex numbers. But, said Gauss, this misses the
entire point. Why do roots of p(x) exist anywhere??
If Euler were still alive when Gauss wrote this, he might have responded: What
do you mean exist somewhere?? We can simply invent them, as needed. After
all, the imaginary number $i$ was invented to provide a root for the polynomial
$f(x) = x^2 + 1 \in \mathbb{R}[x]$. By adding $i$ to $\mathbb{R}$, we were able to create a larger field, $\mathbb{C}$,
containing the roots of all quadratic equations. Just keep on doing this, as needed.
Well, Gauss had a point. One can certainly invent symbols, but can one construct
an algebraic structure containing these symbols and having all of the usual nice
properties that one needs to carry out Euler’s proof? In modern language, given
a field $F$ and a polynomial $f(x) \in F[x]$, can one always construct a larger field $E$
containing $F$ and also containing a complete set of roots for this polynomial? Galois
was perhaps the first person to demonstrate that this is always possible. In 1831,
he wrote a note on fields of numbers, hinting at the construction we shall describe
in this section. This was later clarified and elaborated by Leopold Kronecker.
Let’s construct $\mathbb{C}$ a different way: Let $\mathbb{R}[x]$ be the domain of all polynomials
with real coefficients. Let $(x^2 + 1)$ denote the principal ideal in $\mathbb{R}[x]$ generated by
the polynomial $q(x) = x^2 + 1$. Form the quotient ring $C := \mathbb{R}[x]/(x^2 + 1)$. By the
Division Algorithm, we see that every element of $C$ has the form
\[
(a + bx) + (x^2 + 1)
\]
for some $a, b \in \mathbb{R}$. If we define the symbol $i$ to denote the coset $x + (x^2 + 1)$, then
we see that
\[
i^2 = x^2 + (x^2 + 1) = -1 + (x^2 + 1).
\]
Thus we have created a commutative ring whose objects are the symbols $a + bi$ for
$a$ and $b$ real numbers, satisfying the condition $i^2 = -1$, i.e. we have recreated the
complex numbers.
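
To see concretely that multiplication of cosets in $\mathbb{R}[x]/(x^2 + 1)$ is complex multiplication, here is a small sketch. It represents the coset $(a + bx) + (x^2 + 1)$ by the pair `(a, b)`; the function name `multiply_mod_x2_plus_1` is illustrative only.

```python
def multiply_mod_x2_plus_1(u, v):
    """Multiply the cosets of a + b*x and c + d*x in R[x]/(x^2 + 1)."""
    a, b = u
    c, d = v
    # (a + b x)(c + d x) = ac + (ad + bc) x + bd x^2, and x^2 is congruent to -1.
    return (a * c - b * d, a * d + b * c)

i = (0, 1)                                   # the coset of x
print(multiply_mod_x2_plus_1(i, i))          # the coset of -1
print(multiply_mod_x2_plus_1((2, 3), (-1, 4)), complex(2, 3) * complex(-1, 4))
```

The reduction step uses only the relation $x^2 \equiv -1$, which is exactly the statement $i^2 = -1$ derived above.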
Clearly, we can imitate this construction in great generality:
Let $F$ be a field and let $f(x) \in F[x]$. Take the principal ideal $(f(x))$ in $F[x]$ and
form the quotient ring $F[x]/(f(x))$.
It is easy to see that if we set
\[
j = x + (f(x)) \in F[x]/(f(x)),
\]
then $f(j) = 0$. So we have created a ring in which $f(x)$ has a root.
Experimentation shows that in general the ring we have created is not a field.
However, if $f(x)$ is an irreducible polynomial in $F[x]$, then indeed Euclid’s Lemma
for Polynomials will guarantee that every non-zero element of $F[x]/(f(x))$ has a
multiplicative inverse in $F[x]/(f(x))$, i.e. $F[x]/(f(x))$ is indeed a field.
Theorem 8.1. Let $F$ be a field and let $f(x) \in F[x]$ be a (non-constant) irreducible
polynomial. Then the ring $E := F[x]/(f(x))$ is a field. Moreover, $F$ is isomorphic
to the following subfield of $E$:
\[
F_0 := \{\alpha + (f(x)) : \alpha \in F\}.
\]
(Here we are identifying the number $\alpha$ with the constant polynomial $c(x) = \alpha$.)
Proof. For any $p(x) \in F[x]$, let $[p(x)] := p(x) + (f(x)) \in E$. We need to prove that
if $[g(x)]$ is a non-zero element of $E$, then there is a polynomial $h(x) \in F[x]$ such
that $[g(x)] \cdot [h(x)] = [1]$ in $E$, i.e.,
\[
(g(x) + (f(x)))(h(x) + (f(x))) = 1 + (f(x)) \in F[x]/(f(x)).
\]
This is true if and only if there exists a polynomial $k(x) \in F[x]$ such that:
\[
g(x)h(x) = 1 + k(x)f(x).
\]
Rewriting this as
\[
g(x)h(x) - k(x)f(x) = 1,
\]
reminds us of Euclid’s Lemma. Indeed, since $f(x)$ is irreducible and $g(x)$ is not a
multiple of $f(x)$, we do indeed have that $\gcd(g(x), f(x)) = 1$. Hence there exist
polynomials $h(x)$ and $m(x)$ such that
\[
h(x)g(x) + m(x)f(x) = 1.
\]
Taking $k(x) = -m(x)$, we are done. Thus $[g(x)] \cdot [h(x)] = [1]$ in $F[x]/(f(x))$. So
this ring is indeed a field.
Now let $\varphi : F \to F_0$ be the function
\[
\varphi(\alpha) = \alpha + (f(x)) \quad \text{for all } \alpha \in F.
\]
Then clearly $\varphi$ is a homomorphism of $F$ onto $F_0$. If $\varphi(\alpha) = 0 + (f(x))$, then the
constant polynomial $\alpha$ is in the ideal $(f(x))$, i.e., $\alpha$ is a multiple of $f(x)$. Since $f(x)$
is not a constant polynomial, this is possible only if $\alpha = 0$. Hence $\varphi : F \to F_0$ is an
isomorphism of fields, as claimed. This completes the proof of the theorem.
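
The inverse constructed in the proof of Theorem 8.1 can be computed explicitly with the Euclidean Algorithm. The following is a minimal sketch for $F = \mathbb{Q}$, under the assumption that polynomials are stored as lists of `Fraction` coefficients, lowest degree first; all function names (`poly`, `trim`, `add`, `scale`, `mul`, `divmod_poly`, `inverse_mod`) are illustrative. It tracks only the coefficient $h(x)$ of $g(x)$ in $h(x)g(x) + m(x)f(x) = 1$, since that is all the inverse requires.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (coefficients are listed lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly(*coeffs):
    return trim([Fraction(c) for c in coeffs])

def add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def scale(p, c):
    return trim([c * a for a in p])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(p, d):
    """Division Algorithm in Q[x] for a non-zero divisor d: (quotient, remainder)."""
    rem, d = trim(p), trim(d)
    quo = [Fraction(0)] * max(len(rem) - len(d) + 1, 1)
    while len(rem) >= len(d):
        shift = len(rem) - len(d)
        c = rem[-1] / d[-1]
        quo[shift] = c
        for i, b in enumerate(d):
            rem[i + shift] -= c * b
        rem = trim(rem)
    return trim(quo), rem

def inverse_mod(g, f):
    """Return h with g*h congruent to 1 modulo f, assuming gcd(g, f) = 1."""
    # Invariant: r0 = s0*g and r1 = s1*g modulo f.
    r0, r1 = trim(f), divmod_poly(g, f)[1]
    s0, s1 = [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, add(s0, scale(mul(q, s1), Fraction(-1)))
    if len(r0) != 1:  # the gcd is not a non-zero constant
        raise ValueError("g is not invertible modulo f")
    return divmod_poly(scale(s0, 1 / r0[0]), f)[1]

f = poly(1, 0, 1)                 # x^2 + 1
h = inverse_mod(poly(3, 4), f)    # inverse of the coset of 3 + 4x
print(h, divmod_poly(mul(poly(3, 4), h), f)[1])
```

With $f(x) = x^2 + 1$ this reproduces the familiar formula $(a + bi)^{-1} = \frac{a - bi}{a^2 + b^2}$; if $f(x)$ is reducible and shares a factor with $g(x)$, the function raises an error, matching the remark that the quotient ring is then not a field.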

It is easy to see that the element $j := x + (f(x))$ in $E := F[x]/(f(x))$ is a root
of the polynomial $f(x) \in E[x]$, where one regards $F$ as a subfield of $E$, and hence
$F[x]$ as a subdomain of $E[x]$.
We are not done however. In order to fully answer Gauss’ objection, we have to
construct an extension field of $F$ which contains all of the roots of $f(x)$. Consider
for example the irreducible polynomial $f(x) = x^4 - 2 \in \mathbb{Q}[x]$. If $\alpha$ is the positive
real fourth root of 2, then the roots of $f(x)$ in $\mathbb{C}$ are $\alpha$, $-\alpha$, $i \cdot \alpha$, and $-i \cdot \alpha$. Thus
$\mathbb{Q}(\alpha, i)$ contains all the roots of $f(x)$ in $\mathbb{C}$, but clearly $\mathbb{Q}(\alpha)$, being a subfield of $\mathbb{R}$,
does not. With some further thought, we can see that $\mathbb{Q}(i \cdot \alpha)$ also does not contain
all of the roots of $f(x)$.
We solve this problem by repeating the process.
Theorem 8.2. Let $F$ be a field and let $p(x) \in F[x]$ be a polynomial of degree $n \ge 1$.
There exists a field $E$ containing a subfield $F_0$ isomorphic to $F$ and such that $p(x)$
factors as a product of $n$ linear factors in $E[x]$.
Proof. We proceed by complete mathematical induction on the degree $n$ of $p(x)$. If
$n = 1$, then $p(x)$ itself is linear and there is nothing to prove. Suppose then that
the theorem is true for all polynomials of degree less than $n$.
Suppose first that $p(x) = g(x)h(x)$ with $g(x)$ and $h(x)$ non-constant polynomials
in $F[x]$. By induction, there is a field $E_1$ containing a subfield $F_1$ isomorphic to $F$
and such that $g(x)$ factors into linear factors in $E_1[x]$. Identifying $F$ and $F_1$, we may
assume that $h(x) \in E_1[x]$. Then by induction, there exists a field $E$ containing a
subfield $E_0$ isomorphic to $E_1$ and such that $h(x)$ factors into linear factors in $E[x]$.
Now, since $E_0$ is isomorphic to $E_1$, $E_0$ contains a subfield $F_0$ isomorphic to $F$.
Thus, after suitable identifications, we have $p(x) = g(x)h(x) \in E[x]$ with both $g(x)$
and $h(x)$ factoring into linear factors in $E[x]$. Hence $p(x)$ factors into linear factors
in $E[x]$, and we are done.
Thus we may assume that $p(x) \in F[x]$ is irreducible. Then by Theorem 8.1,
there exists a field $E_1$ containing a subfield $F_1$ isomorphic to $F$, and such that $p(x)$
has a root $\alpha \in E_1$. Thus, by Descartes' Factor Theorem, there exists $g(x) \in E_1[x]$
such that

\[ p(x) = (x - \alpha)g(x) \in E_1[x]. \]

Since $g(x)$ has degree $n - 1$, we may apply induction as before to conclude that
there exists a field $E$ containing a subfield $E_0$ isomorphic to $E_1$, and such that $g(x)$
factors into linear factors in $E[x]$. As before, $E$ contains a subfield $F_0$ isomorphic
to $F$, and $p(x)$ factors into linear factors in $E[x]$, as desired.

Definition 8.3. Let $F$ be a field and let $p(x) \in F[x]$ be a polynomial. Suppose
that $E$ is a field which contains a subfield $F_0$ isomorphic to $F$, and such that $p(x)$
factors into a product of linear factors in $E[x]$. Let $r_1, r_2, \ldots, r_n$ be the roots of
$p(x)$ in $E$, and let $E_0 = F(r_1, r_2, \ldots, r_n) \subseteq E$. Then we call $E_0$ a splitting field
for $p(x)$ over $F$.
It is very useful to be able to measure the "relative sizes" of fields $F$ and $E$, where
$E$ is an extension field of $F$, i.e., $F$ is a subfield of $E$. Since both fields are, in
general, infinite, cardinality is not a good measuring stick. Fortunately we have an
alternative, suggested by the following observation.
Lemma 8.4. Let $E$ be an extension field of $F$. Then the usual operations of
addition and multiplication in $E$ make $E$ an $F$-vector space.
We leave the proof as an exercise. Note that, since $E$ is a field, we know that
$(E, +)$ is an abelian group. The axioms for scalar multiplication:
(1) $\alpha \cdot (u + v) = \alpha \cdot u + \alpha \cdot v$, for $\alpha \in F$, $u, v \in E$;
(2) $(\alpha + \beta) \cdot u = \alpha \cdot u + \beta \cdot u$ for $\alpha, \beta \in F$, $u \in E$;
(3) $(\alpha\beta) \cdot u = \alpha \cdot (\beta \cdot u)$ for $\alpha, \beta \in F$, $u \in E$; and
(4) $1 \cdot v = v$ for all $v \in E$
follow easily from the properties of the field $E$.
Since $E$ is an $F$-vector space, we can measure its size relative to $F$ by the
dimension of $E$ as an $F$-vector space. Note that, if $E = F$, then $\{1\}$ is a basis for
$E$ as an $F$-space, and so $\dim_F(F) = 1$.
Definition 8.5. We write (E : F ) and speak of the degree of E over F , to denote
the dimension of E as an F -vector space.
This degree is most useful when it is finite. This will be the case in the situations
of interest to us. We need the following remark, whose proof we leave as an exercise.
Lemma 8.6. Let $F$ be a field and $E$ an extension field of $F$ containing a root $\alpha$
of some non-zero polynomial in $F[x]$. Then the set

\[ K(\alpha) = \{ f(x) \in F[x] : f(\alpha) = 0 \} \]

is a principal ideal in $F[x]$ generated by a monic irreducible polynomial $m(x)$. In
particular, if $p(x)$ is any irreducible polynomial in $F[x]$ with $p(\alpha) = 0$, then $p(x) =
c \cdot m(x)$ for some $c \in F$.
We call $m(x)$ the minimum polynomial of $\alpha$ in $F[x]$. Note that $m(x)$ depends
heavily both on $\alpha$ and on $F$.
Theorem 8.7. Let $F$ be a field and let $p(x) \in F[x]$ be a polynomial of degree $n \ge 1$.
Let $E = F[x]/(p(x))$. Then $(E : F) = n$.
Proof. If $f(x) \in F[x]$, we let $[f(x)] := f(x) + (p(x)) \in E$. By the Division Algorithm,
there exist polynomials $q(x)$ and $r(x) \in F[x]$, with either $r(x) = 0$ or $r(x)$
of degree smaller than $n$, such that

\[ f(x) = q(x)p(x) + r(x). \]

Thus

\[ [f(x)] = [q(x)p(x) + r(x)] = [q(x)][p(x)] + [r(x)] = [q(x)][0] + [r(x)] = [r(x)]. \]

Set

\[ r(x) = a_0 + a_1 x + \cdots + a_{n-1}x^{n-1} \in F[x]. \]

Being a bit sloppy, we shall assume that $F \subseteq E$ and denote by $a$ the coset $a +
(p(x)) \in E$ for any $a \in F$. Thus

\[ [r(x)] = a_0 \cdot [1] + a_1 \cdot [x] + \cdots + a_{n-1} \cdot [x^{n-1}] \in E. \]

In other words, the set

\[ B := \{ [1], [x], \ldots, [x^{n-1}] \} \]

is a spanning set for $E$ as an $F$-vector space. We claim that $B$ is also an $F$-linearly
independent set. For, suppose that

\[ c_0 \cdot [1] + c_1 \cdot [x] + \cdots + c_{n-1} \cdot [x^{n-1}] = [0] \in E, \]

for some $c_0, c_1, \ldots, c_{n-1} \in F$. Then

\[ [c_0 + c_1 x + \cdots + c_{n-1}x^{n-1}] = [0]. \]

Setting $h(x) = c_0 + c_1 x + \cdots + c_{n-1}x^{n-1} \in F[x]$, we conclude that $h(x)$ is a multiple
of $p(x)$. However, either $h(x)$ is the zero polynomial or $\deg(h(x)) \le n - 1 < n =
\deg(p(x))$. Hence $h(x) \equiv 0$, i.e.

\[ c_0 = c_1 = \cdots = c_{n-1} = 0. \]

Thus $B$ is indeed an $F$-linearly independent set. So $B$ is an $F$-basis for $E$, whence
$(E : F) = n$, as claimed.
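
The proof above says that every coset in $F[x]/(p(x))$ has a unique representative of degree less than $n$, obtained by taking a remainder. As a minimal sketch of that bookkeeping, assuming $F = \mathbb{Q}$ and polynomials stored as coefficient lists with the lowest degree first (the function names `poly_mod` and `coset_mul` are illustrative choices, not notation from these notes):

```python
from fractions import Fraction

def poly_mod(f, p):
    """Remainder of f on division by p (coefficient lists, lowest degree first).

    Assumes p has degree n >= 1 with non-zero leading coefficient.
    The result always has exactly n entries: coordinates in the basis
    [1], [x], ..., [x^(n-1)].
    """
    f = [Fraction(c) for c in f]
    n = len(p) - 1
    lead = Fraction(p[-1])
    while len(f) > n:
        factor = f[-1] / lead
        shift = len(f) - 1 - n
        for i, c in enumerate(p):
            f[shift + i] -= factor * c   # cancels the top coefficient of f
        f.pop()                          # the top coefficient is now zero
    return f + [Fraction(0)] * (n - len(f))

def coset_mul(u, v, p):
    """Multiply the cosets [u] and [v] in Q[x]/(p(x))."""
    prod = [Fraction(0)] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            prod[i + j] += Fraction(a) * Fraction(b)
    return poly_mod(prod, p)

# Example with p(x) = x^3 - 2: the coset [x^2] squared is [x^4] = [2x].
p = [-2, 0, 0, 1]
square = coset_mul([0, 0, 1], [0, 0, 1], p)
```

Every product lands back in a list of length $n = 3$, which is the computational face of the statement $(E : F) = n$.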

As a corollary, we obtain the following important fact.
Corollary 8.8. Let $E$ be a field and let $F$ be a subfield of $E$. Let $\alpha \in E$ and
suppose that $m(x) \in F[x]$ is the minimum polynomial of $\alpha$ in $F[x]$. If the degree of
$m(x)$ is $n$, then $(F(\alpha) : F) = n$.
Proof. $F(\alpha) \cong F[x]/(m(x))$.

Note that not every number has a minimum polynomial. For example, if $F =
\mathbb{Q}$ and $E = \mathbb{R}$, then $\pi \in \mathbb{R}$ and $\pi$ is not the root of any polynomial equation
with rational coefficients. [This is a fairly difficult theorem to prove. It was first
proved by Lindemann.] We say that a number $\alpha$ is algebraic over $F$ if $\alpha$ is the
root of a non-zero polynomial with coefficients in $F$. Otherwise, we say that $\alpha$ is
transcendental over $F$. We shall restrict our attention to algebraic numbers. We
have the following converse to Corollary 8.8.

Theorem 8.9. Let $E$ be an extension field of $F$ with $(E : F) = n < \infty$. Then
every element of $E$ is algebraic over $F$.
Proof. Let $\alpha \in E$. Consider the list

\[ S := (1, \alpha, \alpha^2, \ldots, \alpha^n). \]

Since $S$ has $n + 1$ entries and $\dim_F(E) = n$, $S$ is linearly dependent. Hence there
exist numbers $c_0, c_1, \ldots, c_n \in F$, not all 0, such that

\[ c_0 + c_1\alpha + \cdots + c_n\alpha^n = 0. \]

Let $p(x) = c_n x^n + \cdots + c_1 x + c_0 \in F[x]$. Then $p(x) \ne 0$ and $p(\alpha) = 0$. Hence $\alpha$ is algebraic over
$F$, as claimed.

Exercises
1. Describe the multiplication in the ring $F[x]/(x^2)$. Is this a field? What type
of element is $[x]$?
2. Describe the multiplication in the ring $\mathbb{Q}[x]/(x^2 - x)$. Is this a field? What
type of element is $[x]$?
3a. Let $p(x) = g(x)h(x) \in \mathbb{Q}[x]$ with $g(x)$ and $h(x)$ non-constant irreducible
polynomials with $\gcd(g(x), h(x)) = 1$. Prove: $\mathbb{Q}[x]/(p(x)) \cong F_1 \times F_2$, with $F_1$ and
$F_2$ extension fields of $\mathbb{Q}$. [Hint: Use the Chinese Remainder Theorem from last
semester.]
b. Let $p(x) = g(x)h(x) \in \mathbb{Q}[x]$ with $g(x)$ and $h(x)$ non-constant irreducible
polynomials. Prove: $\mathbb{Q}[x]/(p(x))$ is not a field, but also it has no non-zero nilpotent
elements.
c. (Bonus) Give necessary and sufficient conditions on a polynomial $p(x) \in \mathbb{Q}[x]$
for the ring $\mathbb{Q}[x]/(p(x))$ to contain non-zero nilpotent elements.
4a. Describe the multiplication in the ring $\mathbb{Q}[x]/(x^2 + x + 1)$. Is this a field?
What is the multiplicative inverse of $[x]$?
b. Let $\omega = -\frac{1}{2} + \frac{\sqrt{3}}{2} i \in \mathbb{C}$. Let

\[ E = \{ a + b\omega + c\omega^2 : a, b, c \in \mathbb{Q} \} \subseteq \mathbb{C}. \]

Prove: $E$ is closed under addition, subtraction, multiplication, and division (by
non-zero elements).
c. Let $E$ be as in (b). Prove: $E \cong \mathbb{Q}[x]/(x^2 + x + 1)$.
5. Prove Lemma 8.4.
6. Prove Lemma 8.6.
7. Prove: Let F be a field. Let F0 be the intersection of all subfields of F . Then
F0 is a subfield of F . [Hence F0 is the unique smallest subfield of F .]
The next few exercises relate to finite fields. These were first described by Galois,
and are sometimes called Galois fields.

8a. Prove: Let $E$ be a finite field. Let $F$ be the smallest subfield of $E$, i.e.,

\[ F = \{ 0_E, 1_E, 1_E + 1_E, \ldots \}. \]

Then $F \cong \mathbb{Z}/p\mathbb{Z}$ for some prime $p$. [Recall: $p$ is the characteristic of the field $E$.]
b. Prove: Let $E$ and $F$ be as in (a). Then $E$ is a finite-dimensional $F$-vector
space. In particular, $|E| = p^n$ for some $n \in \mathbb{N}$.
c. Prove: Let $E$ be a finite field with $|E| = p^n$. Then $x^{p^n} = x$ for all $x \in E$.
[Hint: $E - \{0\}$ is a finite group under multiplication. Apply Lagrange's Theorem.]
d. Prove: Let $E$ be a field of characteristic $p$. Let $a, b \in E$. Then

\[ (a + b)^{p^n} = a^{p^n} + b^{p^n} \]

for all $n \in \mathbb{N}$.
9. Let $F = \mathbb{Z}/p\mathbb{Z}$. Let $f(x) = x^{p^n} - x \in F[x]$. Let $E$ be a splitting field for $f(x)$
over $F$.
a. Prove: $f(x)$ has $p^n$ distinct roots in $E$. [Hint: Suppose on the contrary
that $f(x) = (x - a)^2 \cdot g(x) \in E[x]$. Compute $f'(x)$ in two different ways to get a
contradiction.]
b. Let $f(x)$ be as above. Let $S := \{ a \in E : f(a) = 0 \}$. Prove: $S$ is closed under
addition, subtraction, multiplication, and division (by non-zero elements), i.e., $S$ is
a subfield of $E$.
c. Let $S$ be as in (b). Prove: $S = E$, i.e. $|E| = p^n$.
10. Prove: If $E$ and $E_1$ are two fields with $|E| = p^n = |E_1|$, then $E \cong E_1$. [Thus,
up to isomorphism, there is one and only one field of cardinality $p^n$ for each prime
$p$ and each $n \in \mathbb{N}$.]
11. Prove: Let $E$ be a field of characteristic $p$. Define $f : E \to E$ by $f(x) = x^p$
for all $x \in E$. Then $f$ is an injective ring homomorphism. In particular, if $|E| = p^n$
for some $n \in \mathbb{N}$, then $f$ is an automorphism of $E$.

9. Symmetric Polynomials and the Fundamental Theorem of Algebra
Symmetry is of course a ubiquitous topic in Euclidean geometry. But geometric
symmetry did not lead to the development of group theory. Instead, that had
to wait 2000 years until the problem of finding the roots of polynomial equations
pushed mathematicians to develop the theory of symmetric polynomials. In this
chapter we shall develop some basic properties of symmetric polynomials and apply
them to give Euler’s proof of the Fundamental Theorem of Algebra.
Definition 9.1. A polynomial in $n$ commuting variables $f(r_1, r_2, \ldots, r_n)$ is called
a symmetric polynomial in these variables if, for every $\sigma \in S_n$,

\[ f(r_{\sigma(1)}, r_{\sigma(2)}, \ldots, r_{\sigma(n)}) = f(r_1, r_2, \ldots, r_n). \]

The concept of a symmetric polynomial is not interesting when $n = 1$. For $n = 2$,
some examples are:

\[ r_1 + r_2, \qquad r_1 r_2, \qquad r_1^2 + r_2^2 + r_1 r_2, \qquad r_1^3 + r_2^3. \]

For $n = 3$, another type of example is:

\[ r_1 r_2 + r_1 r_3 + r_2 r_3. \]
We shall restrict our attention to symmetric polynomials with integer coefficients.
We leave it as an exercise to prove the following theorem.
Theorem 9.2. Let S be the set of all symmetric polynomials in the variables
r1 , r2 , . . . , rn with integer coefficients. Then S is a subring of Z[r1 , r2 , . . . , rn ], i.e.
S contains 0 and 1, and S is closed under addition, subtraction, and multiplication.
Now we are really interested in polynomials $p(x)$ in one variable. The context
in which these multivariable polynomials arise is the following:
Suppose that $p(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 \in F[x]$ is a monic polynomial
having roots $r_1, r_2, \ldots, r_n$ in some splitting field $E$ containing $F$. Then

\[ p(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = (x - r_1)(x - r_2) \cdots (x - r_n) \in E[x]. \]

Equating coefficients, we get $n$ formulas of the type:

\[
\begin{aligned}
-a_{n-1} &= r_1 + r_2 + \cdots + r_n, \\
a_{n-2} &= r_1 r_2 + r_1 r_3 + \cdots + r_{n-1} r_n, \\
&\;\;\vdots \\
(-1)^n a_0 &= r_1 r_2 \cdots r_n.
\end{aligned}
\]
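
These coefficient formulas are easy to sanity-check by machine. The following sketch expands $(x - r_1)\cdots(x - r_n)$ for a sample list of integer roots and compares each coefficient with the corresponding sum of products of roots; the helper names `expand_monic` and `elementary_symmetric` and the sample roots are illustrative assumptions.

```python
from itertools import combinations
from math import prod

def expand_monic(roots):
    """Coefficients of prod (x - r) over the given roots, lowest degree first."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        new = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i + 1] += c
            new[i] -= r * c
        coeffs = new
    return coeffs

def elementary_symmetric(roots, k):
    """Sum of all products of k distinct entries of roots."""
    return sum(prod(c) for c in combinations(roots, k))

roots = [2, -1, 3, 5]
coeffs = expand_monic(roots)
n = len(roots)
for k in range(1, n + 1):
    # a_{n-k} equals (-1)^k times the k-th sum of products of roots
    assert coeffs[n - k] == (-1) ** k * elementary_symmetric(roots, k)
```

The general pattern is $a_{n-k} = (-1)^k \cdot (\text{sum of all products of } k \text{ distinct roots})$, which specializes to the first and last formulas displayed above.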

Each of the expressions on the right hand side can be thought of as a polynomial
in the ring $\mathbb{Z}[r_1, r_2, \ldots, r_n]$. In fact, each lies in the subring of symmetric polynomials,
since obviously $p(x)$ is unchanged by any permutation in the ordering of the
linear factors. Indeed, these $n$ polynomials are called the elementary symmetric
polynomials in $\mathbb{Z}[r_1, r_2, \ldots, r_n]$:

\[
\begin{aligned}
s_1 &= r_1 + r_2 + \cdots + r_n, \\
s_2 &= \sum_{i < j} r_i r_j, \\
s_3 &= \sum_{i < j < k} r_i r_j r_k, \\
&\;\;\vdots \\
s_n &= r_1 r_2 \cdots r_n.
\end{aligned}
\]

(Each sum runs over sets of distinct indices, so that every product of distinct variables appears exactly once.)
The following fundamental theorem was probably known to Isaac Newton, but
was first explicitly proved somewhat later by Edward Waring.
Waring’s Theorem. Let S be the subring of Z[r1 , r2 , . . . , rn ] consisting of all
symmetric polynomials in the variables r1 , r2 , . . . , rn with integer coefficients. Then

S = Z[s1 , s2 , . . . , sn ].

The proof of Waring’s Theorem amounts to an algorithm for the following:


Given a symmetric polynomial f (r1 , r2 , . . . , rn ), rewrite this polynomial as

f (r1 , r2 , . . . , rn ) = F (s1 , s2 , . . . , sn ),
for some polynomial F with integer coefficients, depending of course on f .
Rather than describe the algorithm in full gory generality, let’s look at an illus-
trative example in three variables. To save ink, let’s call the variables r, s, and t,
instead of r1 , r2 , r3 .
If you want to cook up an example of a symmetric polynomial in three variables,
you can symmetrize any monomial by adding up all of its possible permutations.
For example, starting with the monomial

\[ m(r, s, t) = r^2 s, \]

we get the symmetrized polynomial

\[ p(r, s, t) = r^2 s + r^2 t + s^2 r + s^2 t + t^2 r + t^2 s. \]
An important feature of any algorithm is to have a way of measuring whether
you are making steady progress in the correct direction, or just going around in
circles. To do this, we choose a way to say that a symmetric polynomial p(r, s, t)
is bigger than some other symmetric polynomial q(r, s, t). Then we will look for an
algorithm that makes our polynomial smaller and smaller.
For a monomial $m(r, s, t) = a r^i s^j t^k$ (with $a \in \mathbb{Z}$), we call $(i, j, k)$ its degree vector.
Thus, the monomial $m(r, s, t) = r^2 s$ has degree vector $(2, 1, 0)$. We order
the degree vectors lexicographically reading from left to right. Thus

\[ (3, 0, 0) > (2, 1, 1) > (2, 1, 0) > (2, 0, 7) > (0, 3, 5), \]

for example, i.e.,

\[ r^3 > r^2 s t > r^2 s > r^2 t^7 > s^3 t^5. \]
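
Python happens to compare tuples in exactly this left-to-right lexicographic way, so the ordering above can be reproduced directly. This is only an illustration of the ordering, with the five degree vectors taken from the example:

```python
# Degree vectors from the example, listed in scrambled order.
degree_vectors = [(2, 1, 0), (0, 3, 5), (3, 0, 0), (2, 0, 7), (2, 1, 1)]

# Tuples are compared entry by entry from the left, i.e. lexicographically.
ordered = sorted(degree_vectors, reverse=True)

# The "highest term" of a polynomial is the one with the largest degree vector.
highest = max(degree_vectors)
```

Note that this order is not the same as ordering by total degree: $(2, 0, 7)$ has total degree 9 but sits below $(2, 1, 0)$.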


Here is a crucial point: If $f(r, s, t)$ is a symmetric polynomial containing a monomial
with degree vector $(i, j, k)$, then by symmetry, $f$ must also contain monomials
whose degree vectors are all the possible permutations of $(i, j, k)$. In one of these, the
entries are in non-increasing order. In particular, the highest term of any symmetric polynomial
$f(r, s, t)$ has degree vector $(a, b, c)$ with $a \ge b \ge c$.
Here is another crucial point: The highest terms of the elementary symmetric
polynomials in $r, s, t$ have degree vectors as follows:

If $s_1 = r + s + t$, degree vector is $(1, 0, 0)$.
If $s_2 = rs + rt + st$, degree vector is $(1, 1, 0)$.
If $s_3 = rst$, degree vector is $(1, 1, 1)$.

When we multiply monomials, we add their degree vectors. Hence if $i \ge j \ge k$,
then

\[ s_1^{\,i-j}\, s_2^{\,j-k}\, s_3^{\,k} \text{ has degree vector } (i, j, k). \]

For example, if we want a symmetric polynomial with degree vector $(2, 1, 0)$, then
we note that

\[ (2, 1, 0) = (1, 0, 0) + (1, 1, 0), \]

and so $s_1 s_2$ should do the job. Let's check:

\[ s_1 s_2 = (r + s + t)(rs + rt + st) = r^2 s + r^2 t + r t^2 + s^2 t + r s^2 + s t^2 + 3rst. \]

Thus, indeed, the highest monomial term of $s_1 s_2$ is $r^2 s$, which has degree vector
$(2, 1, 0)$.
Now let's go back to our polynomial

\[ p(r, s, t) = r^2 s + r^2 t + s^2 r + s^2 t + t^2 r + t^2 s. \]

If we let $q(r, s, t) = p(r, s, t) - s_1 s_2$, then we have succeeded in canceling the term
of highest degree out of $p(r, s, t)$. Thus $q(r, s, t) = -3rst$ has highest degree vector
$(1, 1, 1) < (2, 1, 0)$. So we are making progress. In fact, in this case we are done,
since $q(r, s, t) = -3s_3$. So we have

\[ p(r, s, t) = s_1 s_2 + q(r, s, t) = s_1 s_2 - 3s_3 \in \mathbb{Z}[s_1, s_2, s_3], \]

as desired.
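
The reduction just carried out by hand can be sketched as a short program. This is a minimal illustration under stated assumptions, not the notes' own formulation: a polynomial in $n$ variables is stored as a dictionary mapping degree vectors to integer coefficients, and the names `poly_mul`, `elementary`, and `to_elementary` are hypothetical helpers. At each step the highest term with degree vector $(i, j, k, \ldots)$ is cancelled by subtracting the matching product of powers of the $s_k$.

```python
from itertools import combinations

def poly_mul(f, g):
    """Product of two polynomials stored as {degree_vector: coefficient}."""
    out = {}
    for ef, cf in f.items():
        for eg, cg in g.items():
            e = tuple(a + b for a, b in zip(ef, eg))
            out[e] = out.get(e, 0) + cf * cg
    return {e: c for e, c in out.items() if c != 0}

def elementary(n, k):
    """The k-th elementary symmetric polynomial s_k in n variables."""
    return {tuple(1 if i in idx else 0 for i in range(n)): 1
            for idx in combinations(range(n), k)}

def to_elementary(f, n):
    """Rewrite a symmetric f as {exponents of (s_1, ..., s_n): coefficient}."""
    f = dict(f)
    result = {}
    while f:
        lead = max(f)                       # lexicographically highest term
        coeff = f[lead]
        # Exponents of s_1, ..., s_n are the successive differences of lead.
        powers = tuple(lead[t] - (lead[t + 1] if t + 1 < n else 0)
                       for t in range(n))
        if min(powers) < 0:
            raise ValueError("input is not a symmetric polynomial")
        result[powers] = coeff
        # Build coeff * s_1^powers[0] * ... * s_n^powers[n-1] and subtract it.
        term = {(0,) * n: coeff}
        for k, e in enumerate(powers, start=1):
            for _ in range(e):
                term = poly_mul(term, elementary(n, k))
        for e, c in term.items():
            f[e] = f.get(e, 0) - c
            if f[e] == 0:
                del f[e]
    return result

# p = r^2 s + r^2 t + s^2 r + s^2 t + t^2 r + t^2 s from the text.
p = {(2, 1, 0): 1, (2, 0, 1): 1, (1, 2, 0): 1,
     (0, 2, 1): 1, (1, 0, 2): 1, (0, 1, 2): 1}
answer = to_elementary(p, 3)
```

For the polynomial $p$ above the result encodes $s_1 s_2 - 3 s_3$: the key `(1, 1, 0)` stands for $s_1 s_2$ and `(0, 0, 1)` for $s_3$. The loop terminates because each subtraction replaces the highest term by strictly lower terms of the same total degree.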
Of course, in general, the procedure takes much longer, but clearly we keep sim-
plifying our problem. So like the Euclidean algorithm or the Gaussian elimination
algorithm, we must eventually succeed. You will be asked to try a few harder
examples in the homework exercises.
Here is the way that we will apply Waring’s Theorem to prove the Fundamental
Theorem of Algebra.

Lemma 9.3. Let $p(x) \in \mathbb{R}[x]$ be a monic polynomial of degree $n$. Let $E$ be a
splitting field for $p(x)$ containing $\mathbb{R}$. Denote by $r_1, r_2, \ldots, r_n$ the roots of $p(x)$ in $E$.
Let $c \in \mathbb{N}$ and consider the $\binom{n}{2}$ numbers $a_{ij} := r_i + r_j + c r_i r_j$ where $1 \le i < j \le n$.
Let $f(x) \in E[x]$ be the polynomial

\[ f(x) = \prod_{1 \le i < j \le n} (x - a_{ij}) \]

of degree $\binom{n}{2}$. Then $f(x) \in \mathbb{R}[x]$, i.e., all of the coefficients of $f(x)$ are real numbers.
Proof. Let $A = \{ a_{ij} : 1 \le i, j \le n,\ i \ne j \}$, where $a_{ji} = a_{ij}$. Let $S_n$ act on $A$ by:

\[ \sigma(a_{ij}) = a_{\sigma(i)\sigma(j)} \quad \text{for all } \sigma \in S_n. \]

Then $S_n$ permutes the elements of $A$. Hence each of the $\binom{n}{2}$ elementary symmetric
polynomials in the elements of $A$:

\[
\begin{aligned}
E_1 &:= a_{12} + a_{13} + \cdots + a_{n-1,n}, \\
E_2 &:= \sum a_{ij} a_{kl} \quad \text{(summed over all unordered pairs of distinct elements of } A\text{)}, \\
&\;\;\vdots \\
E_{\binom{n}{2}} &:= a_{12} \cdot a_{13} \cdots a_{n-1,n},
\end{aligned}
\]

is fixed by every permutation in $S_n$. Thus, if we view each $E_i$ as a polynomial in
$\mathbb{Z}[r_1, r_2, \ldots, r_n]$, then in fact each $E_i$ lies in the subring $S$ of all symmetric polynomials
in the variables $r_1, r_2, \ldots, r_n$. By Waring's Theorem, $S = \mathbb{Z}[s_1, s_2, \ldots, s_n]$,
where

\[
\begin{aligned}
s_1 &= r_1 + r_2 + \cdots + r_n, \\
s_2 &= \sum_{i < j} r_i r_j, \\
&\;\;\vdots \\
s_n &= r_1 r_2 \cdots r_n.
\end{aligned}
\]

But then, for each $i$, either $s_i$ or $-s_i$ is a coefficient of the polynomial $p(x) \in \mathbb{R}[x]$,
i.e., each $s_i$ is a real number. Hence $\mathbb{Z}[s_1, s_2, \ldots, s_n]$ is a subring of $\mathbb{R}$. It follows
that each $E_i$ is a real number. But

\[ f(x) = x^{\binom{n}{2}} - E_1 x^{\binom{n}{2} - 1} + \cdots \pm E_{\binom{n}{2}}. \]

Hence $f(x) \in \mathbb{R}[x]$, as claimed.
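
A small numerical experiment makes the lemma concrete. The sketch below takes the real polynomial $(x^2 + 1)(x - 2)$, whose roots $i, -i, 2$ are not all real, forms the three numbers $a_{ij}$ with $c = 1$, and expands $\prod (x - a_{ij})$. The sample polynomial, the value of $c$, and the helper names are illustrative assumptions.

```python
from itertools import combinations

def expand_monic(roots):
    """Coefficients of prod (x - r) over the given roots, lowest degree first."""
    coeffs = [1]
    for r in roots:
        new = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i + 1] += c
            new[i] -= r * c
        coeffs = new
    return coeffs

def pair_polynomial(roots, c):
    """Coefficients of prod_{i<j} (x - (r_i + r_j + c r_i r_j))."""
    a = [ri + rj + c * ri * rj for ri, rj in combinations(roots, 2)]
    return expand_monic(a)

roots = [1j, -1j, 2]            # roots of (x^2 + 1)(x - 2), a real polynomial
coeffs = pair_polynomial(roots, 1)
imaginary_parts = [complex(z).imag for z in coeffs]
```

Here the $a_{ij}$ are $1$, $2 + 3i$, and $2 - 3i$, so individually they are not all real, yet every entry of `imaginary_parts` vanishes: the product is $x^3 - 5x^2 + 17x - 13$.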



We are now ready to attack the Fundamental Theorem of Algebra in the manner
of Euler.

Fundamental Theorem of Algebra. Let $p(x) \in \mathbb{C}[x]$ be a polynomial of degree
$n \ge 1$. Then there exist complex numbers $c, r_1, r_2, \ldots, r_n$ such that

\[ p(x) = c(x - r_1)(x - r_2) \cdots (x - r_n) \in \mathbb{C}[x]. \]

We begin with a few easy reductions. We shall refer to the Fundamental Theorem
of Algebra as FTA.

Lemma 9.4. FTA is true provided that the following statement is true:

(*) Let $f(x)$ be any monic polynomial in $\mathbb{C}[x]$ of degree $n \ge 1$. Then there is at
least one complex number $r$ with $f(r) = 0$.

Proof. Assuming the statement above, we shall prove FTA by induction on the
degree $n$ of $p(x)$. If $n = 1$, then $p(x) = ax + b$ for some $a, b \in \mathbb{C}$ with $a \ne 0$, and so

\[ p(x) = a\left(x - \left(-\frac{b}{a}\right)\right). \]

Thus we are done, taking $c = a$ and $r_1 = -\frac{b}{a}$.
Now suppose FTA is true for polynomials of degree $n$, and let $p(x) \in \mathbb{C}[x]$ have
degree $n + 1$. Write

\[ p(x) = a_{n+1}x^{n+1} + a_n x^n + \cdots + a_1 x + a_0. \]

Let $b_i = \frac{a_i}{a_{n+1}}$ for $0 \le i \le n$. Then $p(x) = a_{n+1} f(x)$, where $f(x) \in \mathbb{C}[x]$ is the
monic polynomial

\[ f(x) = x^{n+1} + b_n x^n + \cdots + b_1 x + b_0. \]

By assertion (*), there exists a complex number $r$ with $f(r) = 0$. Then by Descartes'
Factor Theorem,

\[ f(x) = (x - r)g(x), \]

where $g(x)$ is a monic polynomial in $\mathbb{C}[x]$ of degree $n$. By induction, there exist
complex numbers $r_1, r_2, \ldots, r_n$ with

\[ g(x) = (x - r_1)(x - r_2) \cdots (x - r_n). \]

Thus

\[ p(x) = a_{n+1} f(x) = a_{n+1}(x - r)g(x) = a_{n+1}(x - r)(x - r_1)(x - r_2) \cdots (x - r_n). \]

Thus $p(x)$ has a factorization as claimed in FTA. Hence FTA is true, provided that
(*) is true.


The next step is to reduce (*) to the case when f (x) has real coefficients.

Lemma 9.5. (*) is true if and only if the following statement is true:

(**) Let $f(x)$ be any monic polynomial in $\mathbb{R}[x]$ of degree $n \ge 1$. Then there is at
least one complex number $r$ with $f(r) = 0$.

Proof. It is clear that (*) implies (**). Suppose now that (**) is true, and let $f(x)$
be a monic polynomial in $\mathbb{C}[x]$ of degree $n \ge 1$. Write

\[ f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0. \]

Define the conjugate polynomial $\overline{f}(x)$ by

\[ \overline{f}(x) = x^n + \overline{a_{n-1}}\,x^{n-1} + \cdots + \overline{a_1}\,x + \overline{a_0}. \]

Let $g(x) = f(x) \cdot \overline{f}(x)$. Then

\[ \overline{g}(x) = \overline{f \cdot \overline{f}}\,(x) = \overline{f}(x) \cdot \overline{\overline{f}}(x) = \overline{f}(x) \cdot f(x) = g(x). \]

Thus $g(x) \in \mathbb{R}[x]$. Hence by (**), there exists $r \in \mathbb{C}$ with $g(r) = 0$. Thus

\[ 0 = g(r) = f(r) \cdot \overline{f}(r). \]

Since $\mathbb{C}$ is a domain, either $f(r) = 0$ or $\overline{f}(r) = 0$. If $f(r) = 0$, then (*) holds and
we are done. Suppose, then, that $\overline{f}(r) = 0$, i.e.

\[ r^n + \overline{a_{n-1}}\,r^{n-1} + \cdots + \overline{a_1}\,r + \overline{a_0} = 0. \]

Taking complex conjugates of both sides, we get

\[ \overline{r}^{\,n} + a_{n-1}\overline{r}^{\,n-1} + \cdots + a_1 \overline{r} + a_0 = 0. \]

Hence $f(\overline{r}) = 0$, and again (*) holds, and we are done in this case as well.
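
The key step of this proof, that $f \cdot \overline{f}$ has real coefficients, can be checked on an example. The sketch below multiplies a sample complex quadratic by its conjugate polynomial; the particular $f$ and the helper names `poly_mul` and `conjugate_poly` are illustrative assumptions.

```python
def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def conjugate_poly(f):
    """Conjugate every coefficient of f."""
    return [complex(a).conjugate() for a in f]

# f(x) = x^2 + 3i x + (2 - i), a monic polynomial with non-real coefficients.
f = [2 - 1j, 3j, 1]
g = poly_mul(f, conjugate_poly(f))
```

For this $f$ the product works out to $g(x) = x^4 + 13x^2 - 6x + 5$, which has degree $2n = 4$ and real coefficients, so statement (**) applies to it.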


Finally, we have reached the heart of the problem. We must show that every
non-constant monic polynomial with real coefficients has at least one complex root.

It is impossible to give a purely algebraic proof of the Fundamental Theorem
of Algebra, because the real number field is not a purely algebraic object. Its
construction depends on taking limits of Cauchy sequences or finding least upper
bounds of infinite sets, both of which are analytic constructions. Euler uses calculus
in his proof via the Intermediate Value Theorem, of which we state (without proof)
the following special case.

Intermediate Value Theorem. Let $f(x) \in \mathbb{R}[x]$ be a polynomial. Suppose that
there exist real numbers $a$ and $b$ such that $f(a) \le 0$ and $f(b) \ge 0$. Then there exists
a real number $c$ (between $a$ and $b$) such that $f(c) = 0$.

The application of the Intermediate Value Theorem which we need is the following
corollary.

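Corollary 9.6. Let $f(x)$ be a monic polynomial in $\mathbb{R}[x]$ of odd degree. Then there
is at least one real number $r$ with $f(r) = 0$.
Idea of Proof. Let $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 \in \mathbb{R}[x]$. Then, for $x \ne 0$,

\[ f(x) = x^n \cdot \left( 1 + a_{n-1}\frac{1}{x} + \cdots + a_1 \frac{1}{x^{n-1}} + a_0 \frac{1}{x^n} \right). \]

Let $M = \max_{0 \le i < n} |a_i|$ and choose $x$ such that $|x| > \max(1, nM)$. Then $|x^k| \ge
|x| > nM$ for all $k \ge 1$. So

\[ \frac{|a_{n-k}|}{|x^k|} \le \frac{M}{|x|} < \frac{1}{n} \quad \text{for } 1 \le k \le n. \]

Hence

\[ \left| a_{n-1}\frac{1}{x} + \cdots + a_1 \frac{1}{x^{n-1}} + a_0 \frac{1}{x^n} \right| \le \frac{|a_{n-1}|}{|x|} + \cdots + \frac{|a_1|}{|x^{n-1}|} + \frac{|a_0|}{|x^n|} < \frac{1}{n} + \cdots + \frac{1}{n} + \frac{1}{n} = 1. \]

Hence

\[ 1 + a_{n-1}\frac{1}{x} + \cdots + a_1 \frac{1}{x^{n-1}} + a_0 \frac{1}{x^n} > 0. \]

Thus $f(x)$ has the same sign as $x^n$ whenever $|x| > \max(1, nM)$. Since $n$ is odd,
$f(x) > 0$ for all $x > \max(1, nM)$, and $f(x) < 0$ for all $x < \min(-1, -nM)$.
It follows from the Intermediate Value Theorem that there exists at least one real
number $r$ with $f(r) = 0$, as claimed.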
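
The bound $\max(1, nM)$ from this argument is explicit, so it gives a starting interval for a numerical root search. The following is a rough sketch, assuming floating-point coefficients and using bisection in place of the Intermediate Value Theorem; the names `evaluate` and `real_root_odd` and the fixed iteration count are illustrative choices.

```python
def evaluate(coeffs, x):
    """Value at x of the monic polynomial x^n + a_{n-1} x^{n-1} + ... + a_0.

    coeffs = [a_0, a_1, ..., a_{n-1}]; evaluation uses Horner's rule.
    """
    value = 1.0
    for a in reversed(coeffs):
        value = value * x + a
    return value

def real_root_odd(coeffs, iterations=200):
    """Approximate a real root of a monic polynomial of odd degree n."""
    n = len(coeffs)
    if n % 2 == 0:
        raise ValueError("the degree must be odd")
    M = max(abs(a) for a in coeffs)
    bound = max(1.0, n * M) + 1.0    # strictly beyond max(1, nM)
    lo, hi = -bound, bound           # f(lo) < 0 < f(hi) by the estimate above
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if evaluate(coeffs, mid) <= 0:
            lo = mid                 # keep f(lo) <= 0
        else:
            hi = mid                 # keep f(hi) > 0
    return (lo + hi) / 2

# x^3 - 2x - 5 has a single real root near 2.0946.
root = real_root_odd([-5.0, -2.0, 0.0])
```

Bisection only approximates the root; the existence of an exact real root is what the Intermediate Value Theorem supplies.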

Now finally we arrive at Euler’s brilliant idea. He would like to prove the Funda-
mental Theorem of Algebra by induction. But rather than trying to use the obvious
induction on the degree of f (x), he uses induction on the 2-part of the degree of
f (x).
Theorem 9.7. Let $f(x)$ be a monic polynomial in $\mathbb{R}[x]$ of degree $n > 0$. Then
there is at least one complex number $r$ with $f(r) = 0$.
Euler's Proof. Write $n = 2^m \cdot n_1$, where $n_1$ is odd. The proof is by mathematical
induction on $m$ for $m \ge 0$. If $m = 0$, then $n = n_1$ is odd, and we are done by
Corollary 9.6. Hence we may assume that $m > 0$ and that the theorem is true for
all monic polynomials in $\mathbb{R}[x]$ of degree $k = 2^{m-1} \cdot k_1$ with $k_1$ odd.
Let $E$ be a splitting field for $f(x)$ containing $\mathbb{C}$. Let $r_1, r_2, \ldots, r_n$ be the roots of
$f(x)$ in $E$. Our goal is to show that at least one of these roots is in the subfield $\mathbb{C}$.
We construct an infinite family of new polynomials $g_1(x), g_2(x), \ldots$, one for each
natural number, by the following rule:

\[ g_c(x) = \prod_{1 \le i < j \le n} (x - (r_i + r_j + c r_i r_j)). \]

By Lemma 9.3, $g_c(x) \in \mathbb{R}[x]$ for all $c \in \mathbb{N}$. Moreover, the degree of $g_c(x)$ is

\[ \binom{n}{2} = \frac{n(n-1)}{2} = \frac{2^m \cdot n_1 \cdot (n-1)}{2} = 2^{m-1} \cdot n_1 (n-1), \]

with $n_1(n-1)$ odd, since $n$ is even. Hence, by the inductive hypothesis, each $g_c(x)$ has at least
one complex root. In other words, for each $c \in \mathbb{N}$, there exists a pair $\{i, j\}$ with
$1 \le i \ne j \le n$, such that

\[ r_i + r_j + c r_i r_j \in \mathbb{C}. \]

Since there are infinitely many natural numbers but only finitely many pairs $\{i, j\}$, it follows by the Pigeonhole
Principle that there exist two different natural numbers $c$ and $d$ for which the same
choice of $\{i, j\}$ yields a complex number. In other words, both

\[ a := r_i + r_j + c r_i r_j \in \mathbb{C} \]

and

\[ b := r_i + r_j + d r_i r_j \in \mathbb{C}. \]

Then

\[ r_i r_j = \frac{a - b}{c - d} =: C \in \mathbb{C}, \]

and

\[ r_i + r_j = a - c r_i r_j = a - c \cdot \frac{a - b}{c - d} =: -B \in \mathbb{C}. \]

But then

\[ q(x) := (x - r_i)(x - r_j) = x^2 - (r_i + r_j)x + r_i r_j = x^2 + Bx + C \in \mathbb{C}[x]. \]

Now the Quadratic Formula is valid for polynomials with complex coefficients. Let
$\delta$ denote one of the two complex square roots of $B^2 - 4C$, as given by DeMoivre's
Formula. Then the roots of $q(x)$ are:

\[ r_i = \frac{-B + \delta}{2} \in \mathbb{C} \quad \text{and} \quad r_j = \frac{-B - \delta}{2} \in \mathbb{C}. \]

Thus $r_i$ and $r_j$ are two complex numbers which are roots of the polynomial
$f(x)$, completing the proof.
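
The induction in Euler's proof rests on the degree count: if $n$ is even with 2-part $2^m$, then $\binom{n}{2}$ has 2-part $2^{m-1}$. A quick check of this arithmetic over a range of even $n$ is sketched below; the helper name `two_part` is an illustrative choice.

```python
from math import comb

def two_part(n):
    """Exponent m of the largest power 2^m dividing the positive integer n."""
    m = 0
    while n % 2 == 0:
        n //= 2
        m += 1
    return m

# For every even n, the degree C(n, 2) of g_c has exactly one fewer factor of 2.
checked = []
for n in range(2, 200, 2):
    assert two_part(comb(n, 2)) == two_part(n) - 1
    checked.append(n)
```

For example, $n = 12 = 2^2 \cdot 3$ gives $\binom{12}{2} = 66 = 2^1 \cdot 33$, so the inductive hypothesis applies to each $g_c(x)$ even though its degree is much larger than $n$.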

And indeed, by combining Lemmas 9.4 and 9.5, Corollary 9.6, and Theorem 9.7,
we have completed Euler’s beautiful proof of the Fundamental Theorem of Algebra.

Exercises
1. Prove Theorem 9.2. [Hint: For $f = f(r_1, r_2, \ldots, r_n) \in \mathbb{Z}[r_1, r_2, \ldots, r_n]$ and for
$\sigma \in S_n$, let $\sigma(f) := f(r_{\sigma(1)}, r_{\sigma(2)}, \ldots, r_{\sigma(n)})$. You may use the facts that $\sigma(f + g) =
\sigma(f) + \sigma(g)$ and $\sigma(f \cdot g) = \sigma(f) \cdot \sigma(g)$.]
2a. Express $r^2 + s^2 + t^2$ as a polynomial in the elementary symmetric polynomials
$s_1, s_2, s_3$.
b. Do the same for $r^3 + s^3 + t^3$.
3. Express $r_1^2 + r_2^2 + r_3^2 + r_4^2$ as a polynomial in the four elementary symmetric
polynomials $s_i(r_1, r_2, r_3, r_4)$, $1 \le i \le 4$.
4a. Using the Fundamental Theorem of Algebra, prove that every polynomial in
R[x] can be factored into a product of polynomials of degree 1 or 2, each in R[x].
b. Give necessary and sufficient conditions for a polynomial $p(x) = a_n x^n + \cdots +
a_1 x + a_0 \in \mathbb{R}[x]$ to be irreducible.
5. Give an example of a quadratic polynomial in Q[x] which is irreducible in
Q[x], but is not irreducible in R[x].
6a. Prove: Counting multiplicity, a polynomial of even degree in R[x] has an
even number of real roots.
b. Give an example of a quartic polynomial p(x) in R[x] having no real roots.
Factor p(x) as a product of two quadratic polynomials in R[x].
7. Let $f(r_1, r_2, \ldots, r_n) \in \mathbb{Q}[r_1, r_2, \ldots, r_n]$. We call $f(r_1, r_2, \ldots, r_n)$ an alternating
polynomial if the $S_n$-orbit containing $f(r_1, r_2, \ldots, r_n)$ has cardinality 2, i.e.,
there exists a polynomial $g(r_1, r_2, \ldots, r_n) \in \mathbb{Q}[r_1, r_2, \ldots, r_n]$, different from $f$, such that, for every
$\sigma \in S_n$, either

\[ f(r_{\sigma(1)}, r_{\sigma(2)}, \ldots, r_{\sigma(n)}) = f(r_1, r_2, \ldots, r_n), \]

or

\[ f(r_{\sigma(1)}, r_{\sigma(2)}, \ldots, r_{\sigma(n)}) = g(r_1, r_2, \ldots, r_n), \]

with the second case occurring for at least one $\sigma$.
Define the discriminant polynomial $\Delta(r_1, r_2, \ldots, r_n)$ by

\[ \Delta(r_1, r_2, \ldots, r_n) = \prod_{i < j} (r_i - r_j). \]

a. Prove: $\Delta(r_1, r_2, \ldots, r_n)$ is an alternating polynomial.
b. Prove: $\Delta^2(r_1, r_2, \ldots, r_n)$ is a symmetric polynomial.
c. Let $f(x) = x^2 + bx + c = (x - r_1)(x - r_2)$. Express $\Delta^2(r_1, r_2)$ in terms of $b$
and $c$.

8. Let $\Delta(r_1, r_2, \ldots, r_n)$ be the discriminant polynomial. We call a permutation
$\tau \in S_n$ a transposition if $\tau = (i, j)$ for some $i, j$ with $1 \le i < j \le n$.
a. Express the $m$-cycle $(1, 2, \ldots, m) \in S_n$ as a product of transpositions.
b. Explain why every permutation in $S_n$ can be written as a product of transpositions.
c. Prove: If $\tau \in S_n$ is a transposition, then

\[ \tau(\Delta(r_1, r_2, \ldots, r_n)) = -\Delta(r_1, r_2, \ldots, r_n). \]

d. Prove: Suppose $\sigma \in S_n$ and

\[ \sigma = \tau_1 \cdot \tau_2 \cdots \tau_m = t_1 \cdot t_2 \cdots t_r, \]

where each $\tau_i$ and each $t_j$ is a transposition. Then $m$ is even if and only if $r$ is even.
[Hint: Use (c).]
e. Let $A_n$ be the set of all permutations $\sigma \in S_n$ such that $\sigma$ is expressible as a
product of an even number of transpositions. Prove: $A_n$ is a subgroup of $S_n$. [$A_n$
is called the alternating group on $n$ letters.]
f. Prove: $|A_n| = \frac{n!}{2}$.

10. The Cubic and Quartic Equations Revisited


No one was more deeply influenced by the work of Euler than his young con-
temporary, Joseph Louis Lagrange. During the period 1770–1772, while serving as
court mathematician to Frederick the Great in Berlin, Lagrange undertook a deep
study of the known methods for solving equations.
Let's begin with the very easy case of the quadratic equation:

\[ x^2 + bx + c = 0. \]

If we call the roots $r$ and $s$, then we have

\[ x^2 + bx + c = (x - r)(x - s) = x^2 - (r + s)x + rs. \]

So, we are given the symmetric numbers $r + s = -b$ and $r \cdot s = c$, and we want
to find the asymmetric numbers $r$ and $s$. Since the set of all symmetric numbers
is a ring, we can't escape from it by doing addition, subtraction or multiplication.
The one thing which we are allowed to do which decreases symmetry in a controlled
fashion is to take square roots, cube roots, etc. This is a first clue.
Lagrange talks about how many "values" ("valeurs" in French) an expression
takes when you permute the variables. Thus the expression

\[ f(r, s) = r \]

gets permuted to the expression

\[ g(r, s) = s \]

by the permutation

\[ \tau = (r, s) \in \mathrm{Sym}(\{r, s\}). \]

Since there are only two permutations in $\mathrm{Sym}(\{r, s\})$, namely $\tau = (r, s)$ and $I$, the identity
permutation, the total number of values taken by $f(r, s) = r$ is 2.
In modern language we would say the following: The group $\mathrm{Sym}(\{r, s\})$ acts
on the infinite set of all polynomials in $\mathbb{C}[r, s]$ by permuting the variables. Each
polynomial $p(r, s)$ is contained in some $\mathrm{Sym}(\{r, s\})$-orbit, which must have size
either 1 or 2. The symmetric polynomials are precisely the polynomials which are
in an orbit of size 1.
Now an interesting example of a polynomial in an orbit of size 2 is:

\[ \Delta(r, s) = r - s. \]

This polynomial is called the discriminant polynomial. It has the interesting
property that:

\[ \tau(\Delta(r, s)) = \tau(r - s) = s - r = -(r - s) = -\Delta(r, s). \]

So, the $\mathrm{Sym}(\{r, s\})$-orbit containing $\Delta(r, s)$ is:

\[ \{ \Delta(r, s), -\Delta(r, s) \}. \]

Since $\tau(\Delta) = -\Delta$, we have

\[ \tau(\Delta^2) = (-\Delta)^2 = \Delta^2, \]

i.e., $\Delta^2$ is a symmetric polynomial in $r$ and $s$. Hence, by Waring's Theorem, $\Delta^2$ is
expressible as a polynomial in the elementary symmetric polynomials $r + s = -b$ and
$rs = c$. Of course, it is easy to compute explicitly:

\[ \Delta^2 = (r - s)^2 = (r + s)^2 - 4rs = (-b)^2 - 4c = b^2 - 4c. \]

Hence

\[ \Delta = \sqrt{b^2 - 4c} \]

for a suitable choice of square root. Of course, we can now find $r$ and $s$ by solving
the system of linear equations:
(1) $r + s = -b$
(2) $r - s = \sqrt{b^2 - 4c}$
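
Adding and subtracting equations (1) and (2) gives $r$ and $s$ directly. A minimal sketch of that last step, using Python's `cmath` so that the square root exists even when $b^2 - 4c$ is negative or non-real (the function name `quadratic_roots` is an illustrative choice):

```python
import cmath

def quadratic_roots(b, c):
    """Roots r, s of x^2 + b x + c, from r + s = -b and r - s = delta."""
    delta = cmath.sqrt(b * b - 4 * c)   # one choice of square root of b^2 - 4c
    r = (-b + delta) / 2                # half the sum of equations (1) and (2)
    s = (-b - delta) / 2                # half the difference of (1) and (2)
    return r, s

r, s = quadratic_roots(-3, 2)           # x^2 - 3x + 2 = (x - 2)(x - 1)
```

Choosing the other square root of $b^2 - 4c$ simply interchanges $r$ and $s$, which mirrors the action of the transposition $\tau$.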
Thus we recover the Quadratic Formula. Now let's try to apply the same reasoning
to the cubic equation:

\[ x^3 + px - q = 0. \]

Again set

\[ x^3 + px - q = (x - r)(x - s)(x - t). \]

Here the process is a bit more complicated. Lagrange's key observation is that, in
the quadratic case, $r + s$ and $r - s$ should be thought of as $r + 1 \cdot s$ and $r + (-1) \cdot s$,
where $1$ and $-1$ are the two square roots of one. Therefore the analogous objects
of study in the cubic case should be of the form

\[ \lambda = \lambda(r, s, t) := r + \omega s + \omega^2 t, \]

where $1$, $\omega = -\frac{1}{2} + \frac{\sqrt{3}}{2} i$ and $\omega^2 = -\frac{1}{2} - \frac{\sqrt{3}}{2} i$ are the three cube roots of 1, and $r$,
$s$, $t$ are the three roots of $f(x) := x^3 + px - q$. An expression of this form is called
a Lagrange resolvent for the cubic $f(x)$.
Now, $\lambda(r, s, t)$ takes six values under the action of $\mathrm{Sym}(\{r, s, t\})$, suggesting why
Cardano ended up with an equation of degree 6 in attempting to solve the cubic.
If we set

\[ \mu = \mu(r, s, t) := \omega r + s + \omega^2 t, \]

then the six values taken by $\lambda$ are

\[ \lambda, \quad \omega \cdot \lambda, \quad \omega^2 \cdot \lambda, \quad \mu, \quad \omega \cdot \mu, \quad \omega^2 \cdot \mu. \]

Since $\omega^3 = (\omega^2)^3 = 1$, the function $\lambda^3$ takes only two values under permutation:

\[ \lambda^3 \text{ and } \mu^3. \]

Thus, the set $\{ \lambda^3, \mu^3 \}$ is a $\mathrm{Sym}(\{r, s, t\})$-orbit of size 2 on the set $\mathbb{C}[r, s, t]$. We
leave the following lemma as an exercise:

Lemma 10.1. Let $\sigma$ be a permutation in $\mathrm{Sym}(\{r, s, t\})$. If $\sigma$ has order 1 or 3,
then

\[ \sigma(\lambda^3) = \lambda^3 \quad \text{and} \quad \sigma(\mu^3) = \mu^3. \]

If $\sigma$ has order 2, then $\sigma$ interchanges $\lambda^3$ and $\mu^3$.
In any case, it follows that every permutation in $\mathrm{Sym}(\{r, s, t\})$ fixes both

\[ \lambda^3 + \mu^3 \quad \text{and} \quad \lambda^3 \cdot \mu^3, \]

i.e. these are symmetric polynomials in $r, s, t$. As a somewhat tedious exercise, you
will be asked to write out $\lambda^3 + \mu^3$ explicitly as a polynomial in $r, s, t$, and then to
express it as a polynomial in the three elementary symmetric functions in $r, s, t$.
Note that, in this case,
(1) $s_1 = r + s + t = 0$,
(2) $s_2 = rs + rt + st = p$, and
(3) $s_3 = rst = q$.
Now $\lambda^3$ and $\mu^3$ are the two roots of the quadratic polynomial

\[ q(x) := x^2 - (\lambda^3 + \mu^3)x + \lambda^3 \cdot \mu^3. \]

Hence, using the Quadratic Formula, we could explicitly solve for $\lambda^3$ and $\mu^3$ in
terms of $p$ and $q$. Then, by taking cube roots, we could find $\lambda$ and $\mu$. Finally, we
end up with a system of three linear equations in the three unknowns $r$, $s$, and $t$:

\[
\begin{aligned}
r + s + t &= 0 \\
r + \omega s + \omega^2 t &= \lambda \\
\omega r + s + \omega^2 t &= \mu
\end{aligned}
\]

to be solved in order to find $r$, $s$, and $t$. We leave as an exercise for you to verify
that the coefficient matrix

\[ \begin{pmatrix} 1 & 1 & 1 \\ 1 & \omega & \omega^2 \\ \omega & 1 & \omega^2 \end{pmatrix} \]

is invertible, and hence the system has a unique solution.
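
The two claims in this passage that lend themselves to a numerical experiment are that $\lambda^3$ takes only two values under permutation of the roots, and that $r, s, t$ can be recovered from $\lambda$ and $\mu$ through the linear system. The sketch below checks both for one sample cubic with roots $1, 2, -3$ (so $s_1 = 0$); it solves the system by Cramer's rule, and all function names are illustrative assumptions. It does not derive the formulas for $\lambda^3 + \mu^3$ and $\lambda^3 \mu^3$, which are left to the exercises.

```python
from itertools import permutations

OMEGA = complex(-0.5, 3 ** 0.5 / 2)     # primitive cube root of 1

def resolvents(r, s, t):
    """The two expressions lambda and mu defined in the text."""
    lam = r + OMEGA * s + OMEGA ** 2 * t
    mu = OMEGA * r + s + OMEGA ** 2 * t
    return lam, mu

def distinct_cubes(r, s, t, tol=1e-9):
    """Distinct values of lambda^3 over all six orderings of the roots."""
    values = []
    for a, b, c in permutations((r, s, t)):
        cube = resolvents(a, b, c)[0] ** 3
        if all(abs(cube - v) > tol for v in values):
            values.append(cube)
    return values

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def recover_roots(lam, mu):
    """Solve the 3x3 linear system for (r, s, t) by Cramer's rule."""
    A = [[1, 1, 1], [1, OMEGA, OMEGA ** 2], [OMEGA, 1, OMEGA ** 2]]
    rhs = [0, lam, mu]
    d = det3(A)
    solution = []
    for col in range(3):
        m = [row[:] for row in A]
        for k in range(3):
            m[k][col] = rhs[k]          # replace one column by the right side
        solution.append(det3(m) / d)
    return solution

lam, mu = resolvents(1, 2, -3)
cubes = distinct_cubes(1, 2, -3)
recovered = recover_roots(lam, mu)
```

With these sample roots, `cubes` has exactly two entries and `recovered` reproduces $(1, 2, -3)$ up to rounding error.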
Lagrange further extended these ideas to explain the solution of the quartic
equation. We give a brief description in the general spirit of his work. Consider the
quartic

\[ f(x) = x^4 + ax^2 + bx + c. \]

Let the roots of $f(x)$ be $r_1$, $r_2$, $r_3$, and $r_4$. Consider the elements

\[
\begin{aligned}
\theta_1 &= (r_1 + r_2)(r_3 + r_4) \\
\theta_2 &= (r_1 + r_3)(r_2 + r_4) \\
\theta_3 &= (r_1 + r_4)(r_2 + r_3)
\end{aligned}
\]

in $\mathbb{Z}[r_1, r_2, r_3, r_4]$. We leave it as an exercise to verify that the set

\[ T := \{ \theta_1, \theta_2, \theta_3 \} \]

is an $S := \mathrm{Sym}(\{r_1, r_2, r_3, r_4\})$-orbit on $\mathbb{Z}[r_1, r_2, r_3, r_4]$. The kernel of the action of
$S$ on this orbit is a normal Klein 4-subgroup $V$ of $S$. It is very important, as we
shall see later, that $V$ is a normal subgroup of $S$.
Since $S$ permutes the set $T$, it follows that the elementary symmetric functions
in the $\theta_i$'s are fixed by all of the elements of $S$, and hence are expressible in terms
of the elementary symmetric functions in the roots $r_1, r_2, r_3, r_4$, i.e., in terms of the
coefficients $a, b, c$ of $f(x)$. In fact, computation shows that

\[
\begin{aligned}
\theta_1 + \theta_2 + \theta_3 &= 2a \\
\theta_1\theta_2 + \theta_1\theta_3 + \theta_2\theta_3 &= a^2 - 4c \\
\theta_1\theta_2\theta_3 &= -b^2.
\end{aligned}
\]

It follows that $\theta_1, \theta_2, \theta_3$ are the roots of the resolvent cubic

\[ h(x) = x^3 - 2ax^2 + (a^2 - 4c)x + b^2. \]
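
The three identities behind the resolvent cubic can be spot-checked on an example. The sketch below starts from four sample roots with sum zero (so that the quartic has no cubic term), reads off $a$, $b$, $c$ by expanding the product, and compares both sides; the sample roots and helper names are illustrative assumptions.

```python
def expand_monic(roots):
    """Coefficients of prod (x - r) over the given roots, lowest degree first."""
    coeffs = [1]
    for r in roots:
        new = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i + 1] += c
            new[i] -= r * c
        coeffs = new
    return coeffs

def theta_values(r1, r2, r3, r4):
    """The three elements theta_1, theta_2, theta_3 defined in the text."""
    return ((r1 + r2) * (r3 + r4),
            (r1 + r3) * (r2 + r4),
            (r1 + r4) * (r2 + r3))

roots = [1, 2, 3, -6]                          # sum is 0, so no x^3 term
c, b, a, cubic_term, _ = expand_monic(roots)   # x^4 + a x^2 + b x + c
t1, t2, t3 = theta_values(*roots)

assert cubic_term == 0
assert t1 + t2 + t3 == 2 * a
assert t1 * t2 + t1 * t3 + t2 * t3 == a * a - 4 * c
assert t1 * t2 * t3 == -b * b
```

For these roots the quartic is $x^4 - 25x^2 + 60x - 36$ and $(\theta_1, \theta_2, \theta_3) = (-9, -16, -25)$. One example is of course not a proof; the identities themselves follow from expanding in terms of the elementary symmetric functions.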


Now, assuming that we have found $\theta_1$, $\theta_2$, and $\theta_3$, we can easily solve for the roots
$r_1, r_2, r_3, r_4$ of $f(x)$. For example, since $f(x)$ has no cubic term, we have

$$(r_1 + r_2) + (r_3 + r_4) = 0 \quad\text{and}\quad (r_1 + r_2)(r_3 + r_4) = \theta_1.$$

So $r_1 + r_2$ and $r_3 + r_4$ are the two roots of the quadratic equation

$$q(x) := x^2 - 0x + \theta_1 = x^2 + \theta_1.$$
Hence $r_1 + r_2$ and $r_3 + r_4$ are the two square roots of $-\theta_1$. Similarly, $r_1 + r_3$ and
$r_2 + r_4$ are the two square roots of $-\theta_2$; and $r_1 + r_4$ and $r_2 + r_3$ are the two square
roots of $-\theta_3$. Finally, as in the cubic case, one can solve a system of linear equations
to find the roots of $f(x)$. For example,
$$r_1 = \frac{1}{2}\left(\sqrt{-\theta_1} + \sqrt{-\theta_2} + \sqrt{-\theta_3}\right),$$
where the square roots are taken to be $r_1 + r_2$, $r_1 + r_3$, and $r_1 + r_4$, respectively.
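The recovery of the roots from the $\theta_i$ can be sketched on the same kind of example. The data below (the quartic $x^4 - 25x^2 + 60x - 36$ and its resolvent roots $-9, -16, -25$) are hypothetical sample values chosen so that every square root is an integer. The sign condition used to select compatible square roots, $(r_1+r_2)(r_1+r_3)(r_1+r_4) = -b$, is not stated in the text; it follows from $r_1 + r_2 + r_3 + r_4 = 0$ by expanding the product.

```python
from itertools import product
from math import isqrt

# Hypothetical data: f(x) = x^4 - 25x^2 + 60x - 36 = (x-1)(x-2)(x-3)(x+6)
a, b, c = -25, 60, -36
thetas = (-9, -16, -25)  # roots of the resolvent cubic for this f

def f(x):
    return x**4 + a * x**2 + b * x + c

# Each -theta_i is a perfect square here, so integer square roots suffice.
square_roots = [isqrt(-t) for t in thetas]

roots = set()
for signs in product((1, -1), repeat=3):
    u, v, w = (s * q for s, q in zip(signs, square_roots))
    # u, v, w play the roles of r1+r2, r1+r3, r1+r4; their product must be -b.
    if u * v * w == -b:
        roots.add((u + v + w) // 2)   # u + v + w = 2*r1 (even in this example)

print(sorted(roots))
print([f(r) for r in sorted(roots)])
```

Each admissible choice of signs yields one of the four roots, which matches the remark that the linear system has to be solved with a consistent choice of square roots.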
Lagrange found himself unable to extend his methods to the case of the quintic
equation. There was a good reason for this failure, but it would only be clearly
elucidated 60 years later by Evariste Galois. However, Lagrange’s work was a failure
only in the sense that Columbus’ voyages were failures. Lagrange had touched upon
a new world: the world of groups.
Here is Lagrange’s formulation of the great theorem which has come to bear his
name.
Lagrange’s Theorem. Let f (r1 , r2 , . . . , rn ) be a polynomial in n commuting vari-
ables. The number of values taken by f under permutation of the variables must be
a divisor of n!.
In the exercises you will be asked to reformulate this theorem in modern language
and to explain why it is a corollary of Lagrange’s Orbit-Stabilizer Theorem, as
stated and proved in Chapter 4.
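Lagrange’s statement can be explored by brute force for small $n$. In the sketch below a polynomial is stored as a dictionary from exponent tuples to coefficients (an illustrative representation, not one used in the notes), every permutation of the variables is applied, and the number of distinct results is compared with $n!$.

```python
from itertools import permutations
from math import factorial

def act(perm, poly):
    """Apply a permutation of the variables to a polynomial.

    A polynomial is a dict mapping exponent tuples to coefficients;
    perm[i] = j sends the i-th variable to the j-th variable (0-based).
    """
    image = {}
    for exps, coeff in poly.items():
        new_exps = [0] * len(exps)
        for i, k in enumerate(exps):
            new_exps[perm[i]] = k
        image[tuple(new_exps)] = coeff
    return image

def number_of_values(poly, n):
    """Count the distinct polynomials obtained by permuting the n variables."""
    images = {frozenset(act(p, poly).items()) for p in permutations(range(n))}
    return len(images)

# Hypothetical examples in four variables r1, r2, r3, r4.
theta1 = {(1, 0, 1, 0): 1, (1, 0, 0, 1): 1, (0, 1, 1, 0): 1, (0, 1, 0, 1): 1}  # (r1+r2)(r3+r4)
r1 = {(1, 0, 0, 0): 1}                                                          # r1
e1 = {(1, 0, 0, 0): 1, (0, 1, 0, 0): 1, (0, 0, 1, 0): 1, (0, 0, 0, 1): 1}      # r1+r2+r3+r4

for name, poly in [("theta1", theta1), ("r1", r1), ("e1", e1)]:
    count = number_of_values(poly, 4)
    print(name, count, factorial(4) % count == 0)
```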

Exercises
1. Prove Lemma 10.1.
2a. Write out $\lambda^3 + \mu^3$ explicitly as a polynomial in $r, s, t$.
b. Express $\lambda^3 + \mu^3$ as a polynomial in the three elementary symmetric functions
in $r, s, t$.
3. Prove that the matrix
$$\begin{pmatrix} 1 & 1 & 1\\ 1 & \omega & \omega^2\\ \omega & 1 & \omega^2 \end{pmatrix}$$
is invertible.
4. Using the notation from the discussion of the quartic equation, prove that
the set $T = \{\theta_1, \theta_2, \theta_3\}$ is a $\mathrm{Sym}(\{r_1, r_2, r_3, r_4\})$-orbit on $\mathbb{Z}[r_1, r_2, r_3, r_4]$, and prove
that the kernel of the action of $\mathrm{Sym}(\{r_1, r_2, r_3, r_4\})$ on $T$ is the group

$$V_4 = \{(1), (r_1, r_2)(r_3, r_4), (r_1, r_3)(r_2, r_4), (r_1, r_4)(r_2, r_3)\}.$$

5a. Reformulate Lagrange’s Theorem as stated in this chapter in more modern
language, but stick to the same context in which Lagrange stated it. In other words,
your reformulated theorem should not be any more general than the one Lagrange
stated, but it should be in the language of a certain group acting on a certain set,
and it should not use the term “values”.
b. Prove that the theorem you stated in (a) is a corollary of Lagrange’s Orbit-
Stabilizer Theorem, as stated and proved in Chapter 4.
6a. For each divisor $d$ of 24, give a polynomial $f_d(r_1, r_2, r_3, r_4)$ such that $f_d$ takes
exactly $d$ distinct “values” under permutation of the variables.
b. For each polynomial $f_d$ from (a), explicitly give the subgroup

$$G_d = \{\sigma \in S_4 : \sigma(f_d) = f_d\}.$$

7a. Prove: If $H$ is a subgroup of $S_4$ with $|H| = 6$, then $H$ contains a transposition.
b. Prove: If $H$ is a subgroup of $S_4$ with $|H| = 8$, then $H$ contains a transposition.
8a. Prove: For all $n \ge 2$, $S_n$ is generated by the transpositions

$$(1, 2), (2, 3), \ldots, (n - 1, n).$$

[Hint: Let $H$ be the subgroup generated by these transpositions. Prove that
$(1, 2, \ldots, n) \in H$. Now use induction and Lagrange’s Orbit-Stabilizer Theorem.]
b. Prove: For all $n \ge 2$, $S_n$ is generated by $(1, 2)$ and $(1, 2, \ldots, n)$. [Hint: Let
$\sigma = (1, 2, \ldots, n)$. Compute $\sigma^k (1, 2) \sigma^{-k}$ for $1 \le k \le n$.]
c. Prove: Let $p$ be a prime. Let $\tau = (i, j) \in S_p$, and let $\sigma$ be a $p$-cycle in $S_p$.
Then $S_p$ is generated by $\tau$ and $\sigma$. [Hint: Renumber so that $\tau = (1, 2)$. Argue that
for some $k$, $\sigma^k$ is a $p$-cycle with $\sigma^k(1) = 2$. Renumber so that $\sigma^k(i) = i + 1$ for all
$i$, $1 \le i < p$.]
9. Let $G$ be a group and $H$ a subgroup of $G$ with $(G : H) = n$. Let

$$X = \{g_1 H = H, g_2 H, \ldots, g_n H\}$$
be the set of all left cosets of $H$ in $G$. For each $g \in G$, define the function
$\sigma_g : X \to X$ by

$$\sigma_g(g_i H) = g g_i H \text{ for all } g_i H \in X.$$

a. Prove: $\sigma_g : X \to X$ is a permutation of $X$, i.e., a bijective function.
b. Prove: The order of $\sigma_g$ as an element of $\mathrm{Sym}(X)$ is a divisor of the order of
$g$ in $G$.
c. Prove: $\sigma_g(H) = H$ if and only if $g \in H$.
10. Let H be a subgroup of S5 with |H| = 30 or 40.
a. Prove: $H$ contains a 5-cycle $\sigma$. [Hint: Use Exercise 9.]
b. Let H5 := {h 2 H : h(5) = 5}. Prove: H5 contains a transposition of S5 .
[Hint: Use Exercise 7.]
c. Conclude that S5 has no subgroup H with |H| = 30 or 40. [Hint: Use Exercise
8.]
d. Prove: If f (r1 , r2 , r3 , r4 , r5 ) takes at most 4 values under permutation of the
variables, then f is either an alternating or a symmetric function.

11. Galois’ Theory of Equations


Let’s try to imagine the thought processes of the young genius Evariste Galois
as he contemplated the work of his predecessors on the theory of equations.
On the one hand, there was the great paper of Lagrange, in which Lagrange
examined the work of his predecessors and attempted to extract a universal guiding
principle. The principle Lagrange discovered was that of symmetries of the roots
of the polynomial p(x). He let the symmetric group Sn act on the roots and found
auxiliary equations (Lagrange resolvents) whose solution would lead to a solution
of p(x) = 0 itself. However, Lagrange’s paper was finally pessimistic. He concluded
(although he did not prove) that these methods would never give a formula for
solving equations of degree greater than 4.
On the other hand, there was the work of Gauss, which you studied in Math
4580, in which Gauss showed how to solve the polynomial equation $x^{17} - 1 = 0$
by successive extraction of square roots, and indeed gave a general analysis of the
equations $x^n - 1 = 0$. Here the key role was played by a much smaller group,
$\mathrm{Aut}(\mathbb{Q}(e^{2\pi i/n}))$, isomorphic to the group $U_n$ of invertible elements of $\mathbb{Z}/n\mathbb{Z}$.

Galois realized that Gauss was on the right track. When considering a specific
polynomial $p(x)$, one should not treat its roots as “indeterminates” ($r_1, r_2, \ldots, r_n$)
and indiscriminately apply every possible permutation in $S_n$. Rather, one should
remember the algebraic relationships among the roots and apply only those permu-
tations which respect those relationships. [Of course, Lagrange was right too. He
was looking for a general formula valid for all equations, not a specific formula for
a specific equation.]

Note: Throughout our discussion of Galois Theory, we shall assume without
further comment, as did Galois, that we are working with subfields of the complex
numbers. Most of this theory can be extended, with minor changes, to much more
general contexts. Recall that, by the Fundamental Theorem of Algebra, if p(x) is
any polynomial with coefficients in any subfield F of C, then a splitting field, E,
for p(x) over F can be found as a subfield of C. We have the following fundamental
definition.

Definition 11.1. Let $F$ be a subfield of $\mathbb{C}$ and let $E$ be the splitting field over $F$ of
the polynomial $p(x) \in F[x]$. The Galois group of $p(x)$ over $F$, $\mathrm{Gal}(E/F)$, is the
group of all $\sigma \in \mathrm{Aut}(E)$ such that $\sigma(x) = x$ for all $x \in F$. We call $E/F$ a Galois
extension. As noted, we shall assume without further comment that $F \subseteq E \subseteq \mathbb{C}$.

Thus, Gal(E/F ) = Aut(E/F ) for the special case when E is a splitting field over
F . In particular, in the important case when F = Q, we simply have Gal(E/Q) =
Aut(E), since every automorphism of E fixes every rational number. We leave the
proof of the following fact as an exercise.

Theorem 11.2. Let $E/F$ be a Galois extension. Suppose $a, b \in E$ and
$\sigma \in \mathrm{Gal}(E/F)$ with $\sigma(a) = b$. Then $a$ and $b$ have the same minimum polynomial in
$F[x]$.
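As a small illustration of Theorem 11.2 (an example chosen here, not taken from the notes), consider $E = \mathbb{Q}(\sqrt{2})$, the splitting field of $x^2 - 2$ over $\mathbb{Q}$. The sketch below models elements $a + b\sqrt{2}$ with exact rational arithmetic; the class name `QSqrt2` and the function `sigma` are hypothetical names. It checks on sample elements that conjugation respects addition and multiplication and sends $\sqrt{2}$ to another root of the same minimum polynomial.

```python
from fractions import Fraction

class QSqrt2:
    """An element a + b*sqrt(2) of Q(sqrt 2), with a and b rational."""

    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    def __add__(self, other):
        return QSqrt2(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2
        return QSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + self.b * other.a)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

def sigma(x):
    """Conjugation a + b√2 -> a - b√2, which fixes every rational number."""
    return QSqrt2(x.a, -x.b)

def p(x):
    """p(x) = x^2 - 2, the minimum polynomial of sqrt(2) over Q."""
    return x * x + QSqrt2(-2)

root2 = QSqrt2(0, 1)
x, y = QSqrt2(1, 2), QSqrt2(Fraction(3, 5), -1)   # arbitrary sample elements

print(sigma(x * y) == sigma(x) * sigma(y))   # sigma respects multiplication
print(sigma(x + y) == sigma(x) + sigma(y))   # sigma respects addition
print(p(root2) == QSqrt2(0), p(sigma(root2)) == QSqrt2(0))
```

Checking two sample elements is of course not a proof that `sigma` is an automorphism; it only illustrates the statement.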

Other than the identity function, it is not obvious that there are any Galois
automorphisms of E/F . The remarkable fact is that there are quite a few. This is
the content of the following converse of Theorem 11.2.

Theorem 11.3. Let $E/F$ be a Galois extension. Let $a \in E$ with minimum polynomial
$p(x) \in F[x]$. Let $b$ be any root of $p(x)$. Then $b \in E$ and there exists
$\sigma \in \mathrm{Gal}(E/F)$ with $\sigma(a) = b$.

We shall proceed via a sequence of intermediate results, arriving finally at a
stronger theorem which will imply Theorem 11.3.

Theorem 11.4. Let $F$ and $F'$ be subfields of $\mathbb{C}$ and let $h : F \to F'$ be an isomorphism
of fields. Extend $h$ to an isomorphism $\tilde{h} : F[x] \to F'[x]$ via:

$$\tilde{h}(a_n x^n + \cdots + a_1 x + a_0) = h(a_n)x^n + \cdots + h(a_1)x + h(a_0).$$

Let $a$ be a root of the irreducible polynomial $p(x) \in F[x]$, and let $b$ be a root of the
irreducible polynomial $\tilde{h}(p(x)) \in F'[x]$. Then there is an isomorphism $h^* : F(a) \to F'(b)$
such that
(1) $h^*(c) = h(c)$ for all $c \in F$; and
(2) $h^*(a) = b$.

Proof. Set $p'(x) = \tilde{h}(p(x)) \in F'[x]$. By the construction of extension fields, there
are isomorphisms:

$$f : F[x]/(p(x)) \to F(a)$$

and
$$g : F'[x]/(p'(x)) \to F'(b),$$

given by

$$f(c + (p(x))) = c \text{ for all } c \in F, \text{ and } f(x + (p(x))) = a,$$

and
$$g(c' + (p'(x))) = c' \text{ for all } c' \in F', \text{ and } g(x + (p'(x))) = b.$$

The ring isomorphism $\tilde{h} : F[x] \to F'[x]$ maps the principal ideal $(p(x))$ to the
principal ideal $(p'(x))$, and so there is an induced isomorphism

$$\hat{h} : F[x]/(p(x)) \to F'[x]/(p'(x))$$

such that

$$\hat{h}(c + (p(x))) = h(c) + (p'(x)) \text{ for all } c \in F, \text{ and } \hat{h}(x + (p(x))) = x + (p'(x)).$$

Now define $h^* = g \circ \hat{h} \circ f^{-1} : F(a) \to F'(b)$. As $h^*$ is a composition of isomorphisms,
$h^*$ is an isomorphism. Moreover, direct computation shows that $h^*(c) = h(c)$ for
all $c \in F$ and $h^*(a) = b$, as claimed.


Theorem 11.5. Let $E$ be the splitting field over $F$ of the polynomial $p(x) \in F[x]$.
Let $E'$ be another subfield of $\mathbb{C}$ containing $F$, and let $h : E \to E'$ be an isomorphism
of fields satisfying:

$$h(x) = x \text{ for all } x \in F.$$

Then the following statements are true:
(1) $E' = E$,
(2) $h$ is a Galois automorphism of $E/F$, and
(3) $h$ permutes the roots of $p(x)$.

Proof. Let $p(x) = a_n x^n + \cdots + a_1 x + a_0 \in F[x]$. Let

$$\{\alpha_1, \alpha_2, \ldots, \alpha_n\}$$
be the set of all roots of $p(x)$. (Note: $p(x)$ may have repeated roots. So there may
be redundancies on this list.) Then, for all $i$,

$$a_n \cdot \alpha_i^n + \cdots + a_1 \cdot \alpha_i + a_0 = 0.$$
Since $\alpha_i \in E$ for all $i$, we may apply $h$ to this equation, yielding:

$$a_n \cdot h(\alpha_i)^n + \cdots + a_1 \cdot h(\alpha_i) + a_0 = 0.$$
Thus $h(\alpha_i)$ is also a root of $p(x)$, for all $i$. In other words, since $h$ is an injective
function, $h$ permutes the roots of $p(x)$, as claimed. In particular, $h(\alpha_i) \in E$ for all
$i$. However, since $E$ is the splitting field for $p(x)$ over $F$, $E = F(\alpha_1, \alpha_2, \ldots, \alpha_n)$.
Hence, since $h(F) = F$, we have that $h(E) \subseteq E$, i.e., $E' \subseteq E$. Note: Since $E'$ is
an infinite set, this does not immediately guarantee that $E' = E$. However, in this
case, $h : E \to E'$ is, in particular, an isomorphism of $F$-vector spaces. Since $E$ is
finite-dimensional as an $F$-vector space, it now follows that $E' = E$, as desired.
Now we are ready to prove the Main Theorem on Galois automorphisms.
Theorem 11.6. Let $E/F$ be a Galois extension. Let $L$ be any subfield of $E$ and
let $L'$ be another subfield of $\mathbb{C}$ containing $F$. Suppose that $g : L \to L'$ is an
isomorphism of fields satisfying:

$$g(x) = x \text{ for all } x \in F.$$

Then there exists $\sigma \in \mathrm{Gal}(E/F)$ extending $g$, i.e.,

$$\sigma(y) = g(y) \text{ for all } y \in L.$$

In particular, $L'$ is a subfield of $E$.
Proof. We proceed by complete mathematical induction on $n := (E : L)$. Notice
that the case $n = 1$ is precisely Theorem 11.5. Now assume that the theorem is
true for all subfields $K$ of $E$ with $(E : K) < n$. Since $n > 1$, we may choose
$a \in E \setminus L$. Let $p(x) \in L[x]$ be the minimum polynomial for $a$ over $L$. As before,
we may extend $g$ to an isomorphism $\tilde{g} : L[x] \to L'[x]$. Let $b$ be any root of the
polynomial $p^*(x) = \tilde{g}(p(x)) \in L'[x]$. By Theorem 11.4, $g$ extends to an isomorphism
$g^* : L(a) \to L'(b)$. Since $(E : L(a)) < n = (E : L)$, our inductive hypothesis implies
that $g^*$ extends to a Galois automorphism $\sigma \in \mathrm{Gal}(E/F)$. Clearly, $\sigma$ is the desired
extension of $g$ and we are done.

As a corollary we have Theorem 11.3, which we repeat now for emphasis.
Theorem 11.3. Let $E/F$ be a Galois extension. Let $a \in E$ with minimum polynomial
$p(x) \in F[x]$. Let $b$ be any root of $p(x)$. Then $b \in E$ and there exists
$\sigma \in \mathrm{Gal}(E/F)$ with $\sigma(a) = b$.
Proof. We may take $L := F(a)$ and $L' := F(b)$. By Theorem 11.4, taking $F = F'$
and $h$ to be the identity map on $F$, there is an isomorphism $g : L \to L'$ with
$g(x) = x$ for all $x \in F$, and with $g(a) = b$. Hence, by Theorem 11.6, there exists
$\sigma \in \mathrm{Gal}(E/F)$ with $\sigma(a) = b$, as claimed.

Here is an application to constructible numbers.
Theorem 11.7. Let $\alpha \in \mathbb{C}$ be a constructible number. Let $p(x) \in \mathbb{Q}[x]$ be the
minimum polynomial for $\alpha$. Then every root of $p(x)$ is a constructible number.
Proof. Since $\alpha$ is constructible, for some $n \in \mathbb{N}$, there is a finite tower of fields:

$$\mathbb{Q} = K_0 \subseteq K_1 \subseteq \cdots \subseteq K_n$$
such that $\alpha \in K_n$ and $(K_{i+1} : K_i) = 2$ for all $i$. Now $p(x) \in K_n[x]$. Let $E$ be a
splitting field for $p(x)$ over $K_n$. Let $\beta$ be any root of $p(x)$ in $E$. Then $\mathbb{Q}(\alpha) \cong \mathbb{Q}(\beta)$
and so, by Theorem 11.3, there exists $\sigma \in \mathrm{Gal}(E/\mathbb{Q}) = \mathrm{Aut}(E)$ such that $\sigma(\alpha) = \beta$.
Since $K_i \subseteq E$ for all $i$, we may apply $\sigma$ to the tower above to get a new tower of
fields:

$$\mathbb{Q} = L_0 \subseteq L_1 = \sigma(K_1) \subseteq \cdots \subseteq L_n = \sigma(K_n)$$
such that $\beta = \sigma(\alpha) \in L_n$ and $(L_{i+1} : L_i) = 2$ for all $i$. Hence $\beta$ is constructible, as
claimed.

Exercises
1. Prove Theorem 11.2.
2. Let $E$ be the splitting field of $p(x) = (x^2 - 2)(x^3 - 1)$ over $\mathbb{Q}$.
a. Prove: $(E : \mathbb{Q}) = 4$.
b. Prove that $q(x) = x^2 - 2$ remains irreducible over $\mathbb{Q}(\omega)$, the splitting field of
$x^3 - 1$ over $\mathbb{Q}$.
c. Prove that $\mathrm{Gal}(E/\mathbb{Q})$ is a noncyclic group of cardinality 4.
d. Give three different subfields of $E$, each of degree 2 over $\mathbb{Q}$.
3. Let $E$ be as in Exercise 2. Prove that $E$ is also the splitting field of $f(x) =
(x^2 + 3)(x^2 - 4x + 2)$ over $\mathbb{Q}$.
4. Find the splitting field and Galois group of $g(x) = x^3 - 5$ over $\mathbb{Q}$.
5. Find the splitting field and Galois group for $h(x) = x^4 - 2x^2 + 9$ over $\mathbb{Q}$.
6. Let $L$ be the splitting field of $k(x) = x^4 - 2$ over $\mathbb{Q}$.
a. Prove: $(L : \mathbb{Q}) = 8$.
b. Prove that $\mathrm{Gal}(L/\mathbb{Q})$ is a subgroup of $S_4$ isomorphic to $D_4$.

12. The Galois Correspondence


We are finally ready to state and prove the amazing Galois Correspondence The-
orem, relating the subfield structure of the Galois extension E/F to the subgroup
structure of its Galois group Gal(E/F ). One amazing aspect of this theorem is that
it describes the internal structure of an infinite, albeit finite-dimensional, object E
in terms of the internal structure of its finite group of automorphisms. Thus, in
particular, we shall see that, although E has infinitely many subspaces as a vector
space over F , E has only finitely many subfields containing the field F . This is, in
fact, an easy consequence of our previous results. First note that if E is the splitting
field of the polynomial p(x) 2 F [x], and if K is any intermediate field between F
and E, then E is also the splitting field of the same polynomial p(x) regarded as a
polynomial in K[x]. Thus, it makes sense to speak of the Galois group Gal(E/K).
By definition,

$$\mathrm{Gal}(E/K) = \{\sigma \in \mathrm{Aut}(E) : \sigma(x) = x \text{ for all } x \in K\}.$$

Since $F \subseteq K$, any element $\sigma$ of $\mathrm{Gal}(E/K)$ satisfies:

$$\sigma(y) = y \text{ for all } y \in F.$$


Thus Gal(E/K) is a subgroup of Gal(E/F ). This easy remark has the following
important consequence.
Theorem 12.1. Let $E/F$ be a Galois extension and let $K$ be any intermediate field
between $F$ and $E$. For any $\alpha \in E \setminus K$, there exists $\sigma \in \mathrm{Gal}(E/K)$ with $\sigma(\alpha) \ne \alpha$.
Proof. Since $\alpha \notin K$, the minimum polynomial $f(x) \in K[x]$ for $\alpha$ has degree at
least 2. By an earlier result, since $f(x) \in K[x]$ is irreducible, $f(x)$ does not have a
repeated root. Hence there is a root $\beta$ of $f(x)$ with $\beta \ne \alpha$. Then by Theorem 11.3,
there exists $\sigma \in \mathrm{Gal}(E/K)$ with $\sigma(\alpha) = \beta \ne \alpha$, as claimed.

Now we can define the fundamental Galois Correspondence. We fix a Galois
extension E/F and let G = Gal(E/F ). We let

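$$\mathcal{F} = \{K : F \subseteq K \subseteq E\}$$
be the set of all subfields of $E$ containing the field $F$. We let

$$\mathcal{G} = \{H : H \subseteq G\}$$
be the set of all subgroups of $G$. Recall that if $H$ is a subgroup of $G$, we define

$$E^H := \{x \in E : h(x) = x \text{ for all } h \in H\}.$$

We define two functions:

$$\Gamma : \mathcal{F} \to \mathcal{G} \text{ via } \Gamma(K) = \mathrm{Gal}(E/K) \text{ for all } K \in \mathcal{F},$$

and
$$\Theta : \mathcal{G} \to \mathcal{F} \text{ via } \Theta(H) = E^H \text{ for } H \in \mathcal{G}.$$

Our goal is to show that these two functions are inverses of each other and define
a one-to-one correspondence between the fields in $\mathcal{F}$ and the groups in $\mathcal{G}$.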
We leave the following theorem as an exercise. It is an easy corollary of Theorem
12.1.
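Theorem 12.2. For all $K \in \mathcal{F}$, $K = \Theta(\Gamma(K))$, i.e.,

$$K = E^{\mathrm{Gal}(E/K)}.$$
In particular, $\Theta : \mathcal{G} \to \mathcal{F}$ is a surjective map, and so

$$|\mathcal{F}| \le |\mathcal{G}| < \infty.$$

Thus, there are only finitely many subfields of $E$ containing $F$.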
Already, we have achieved the surprising result, announced above, that there
are only a finite number of fields lying between F and E, even though there are
infinitely many F -vector spaces lying between F and E.
To complete the proof of the Galois Correspondence Theorem, we need to know
that for all subgroups $H$ of $G$, we have

$$H = \mathrm{Gal}(E/E^H).$$
Note that

$$\mathrm{Gal}(E/E^H) = \{\sigma \in G : \sigma(x) = x \text{ whenever } h(x) = x \text{ for all } h \in H\}.$$

Clearly, by the definition,

$$H \subseteq \mathrm{Gal}(E/E^H) \subseteq G.$$
We need to verify that $\mathrm{Gal}(E/E^H)$ is not bigger than it “should be”. This will
follow from the fundamental Primitive Element Theorem of Galois.
Primitive Element Theorem. Let $E/F$ be a Galois extension. There exists
$\alpha \in E$ such that $E = F(\alpha)$.
We call $\alpha$ a primitive element of $E/F$. The Primitive Element Theorem is an
immediate corollary of the following linear algebra fact.
Theorem 12.3. Let $V$ be a finite-dimensional vector space over an infinite field
$F$. Then $V$ is not the union of any finite collection of proper $F$-subspaces of $V$.
This is intuitively obvious. No finite set of lines completely covers $\mathbb{R}^2$. No finite
set of planes completely covers $\mathbb{R}^3$.
Proof. Let $\dim_F(V) = n$. We shall prove the theorem by induction on $n$. If $n = 1$,
then the only proper subspace of $V$ is $\{0\}$, and the theorem is obvious. Henceforth
assume $n \ge 2$.
We call a subspace $H$ of $V$ a hyperplane if $\dim_F(H) = n - 1$. First we argue
that $V$ contains infinitely many hyperplanes. Let $B = \{e_1, e_2, \ldots, e_{n-1}, e_n\}$ be an
$F$-basis for $V$. For $\alpha \in F$, let $H_\alpha$ be the subspace of $V$ spanned by

$$B_\alpha := \{e_1, e_2, \ldots, e_{n-1} + \alpha \cdot e_n\}.$$

Clearly $H_\alpha$ is a hyperplane of $V$ for all choices of $\alpha$. Suppose $H_\alpha = H_\beta$. Then
there exist $c_i \in F$, $1 \le i \le n - 1$, with

$$e_{n-1} + \beta \cdot e_n = c_1 \cdot e_1 + c_2 \cdot e_2 + \cdots + c_{n-1} \cdot (e_{n-1} + \alpha \cdot e_n).$$

Thus

$$c_1 \cdot e_1 + c_2 \cdot e_2 + \cdots + (c_{n-1} - 1) \cdot e_{n-1} + (c_{n-1}\alpha - \beta) \cdot e_n = 0.$$

Since $B$ is a linearly independent set, it follows first that $c_{n-1} = 1$ and then that
$\alpha = \beta$. Thus

$$H_\alpha = H_\beta \text{ if and only if } \alpha = \beta.$$

Since $F$ is an infinite set, we conclude that $V$ contains infinitely many hyperplanes,
as claimed.
Now suppose that the theorem is true whenever $\dim_F(W) = n - 1$, but suppose
that the theorem is false for $V$. Then there exists a finite set

$$\mathcal{H} = \{H_1, H_2, \ldots, H_r\}$$

of hyperplanes of $V$ such that

$$V = H_1 \cup H_2 \cup \cdots \cup H_r.$$

Since $V$ contains infinitely many hyperplanes, we may choose a hyperplane $H$ of $V$
such that $H \ne H_i$ for any $i$, $1 \le i \le r$.
If $H = H \cap H_i$ for some $i$, then $H \subseteq H_i$. But then $H = H_i$, contrary to the
choice of $H$. Hence $H \cap H_i$ is a proper subspace of $H$ for all $i$. But

$$H = (H \cap H_1) \cup (H \cap H_2) \cup \cdots \cup (H \cap H_r),$$

contrary to the inductive hypothesis. This completes the proof.
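The hypothesis in Theorem 12.3 that $F$ is infinite cannot be dropped. As an illustration (an example supplied here, not taken from the notes), the plane over the two-element field is the union of its three lines through the origin. The sketch below enumerates them directly.

```python
from itertools import product

p = 2  # the finite field F_2, chosen only for illustration
V = set(product(range(p), repeat=2))          # all vectors of the plane over F_2

def line(v):
    """The 1-dimensional subspace spanned by a nonzero vector v."""
    return frozenset(tuple((c * x) % p for x in v) for c in range(p))

lines = {line(v) for v in V if v != (0, 0)}   # all proper nonzero subspaces
union = set().union(*lines)

print(len(lines), union == V)
```

Over an infinite field the same enumeration is impossible, which is exactly where the count of hyperplanes $H_\alpha$ in the proof is used.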



We now restate and prove the Primitive Element Theorem.
Primitive Element Theorem. Let $E/F$ be a Galois extension. There exists
$\alpha \in E$ such that $E = F(\alpha)$.
Proof. Suppose that $\alpha$ is not a primitive element of $E$ for any $\alpha \in E$. Then, for
every $\alpha \in E$, $F(\alpha)$ is a proper subfield of $E$ containing $F$. But $|\mathcal{F}| < \infty$, i.e., there
are only finitely many proper subfields of $E$ containing $F$, and $E$ is the union of
these finitely many proper subspaces, contrary to Theorem 12.3. This completes
the proof.

Now we can complete the fundamental Galois Correspondence Theorem after
two corollaries.

Corollary 12.4. $|\mathrm{Gal}(E/F)| = (E : F)$.

Proof. Let $E = F(\alpha)$ by the Primitive Element Theorem. Then the minimum
polynomial $p(x) \in F[x]$ for $\alpha$ has degree $n := (E : F)$. Let $\sigma$ be any Galois
automorphism of $E/F$. Then $\sigma$ is completely determined by $\sigma(\alpha)$. Moreover, $\sigma(\alpha)$
is one of the $n$ roots of $p(x)$, and $\sigma(\alpha) \in E$ by Theorem 11.3. Hence

$$|\mathrm{Gal}(E/F)| \le n = (E : F).$$
On the other hand, by the Main Theorem on Galois Automorphisms, for every
root $\beta$ of $p(x)$, there exists one (and only one) Galois automorphism $\sigma \in \mathrm{Gal}(E/F)$
with $\sigma(\alpha) = \beta$. Thus

$$|\mathrm{Gal}(E/F)| \ge n = (E : F).$$
Hence equality holds, as claimed.

Corollary 12.5. Let $H$ be any subgroup of $G := \mathrm{Gal}(E/F)$. Then $H = \mathrm{Gal}(E/E^H)$.
Proof. Let $K = E^H$ and let

$$H^* = \mathrm{Gal}(E/K) = \{\sigma \in G : \sigma(x) = x \text{ for all } x \in K\}.$$

Then $H \subseteq H^*$. By Corollary 12.4,

$$|H^*| = (E : K).$$
We must show that $(E : K) \le |H|$.
Let $H = \{h_1, h_2, \ldots, h_m\}$, with $h_1$ the identity automorphism. Let $\alpha$ be a
primitive element of $E/K$. Set

$$g(x) = (x - h_1(\alpha)) \cdot (x - h_2(\alpha)) \cdots (x - h_m(\alpha)).$$

The coefficients of $g(x)$ are, up to sign, the elementary symmetric polynomials in

$$\{h_1(\alpha), h_2(\alpha), \ldots, h_m(\alpha)\},$$

and so they are fixed by every automorphism in $H$, i.e., if $c_i$ is a coefficient of $g(x)$,
then

$$h(c_i) = c_i \text{ for all } h \in H.$$

Hence $c_i \in E^H = K$ for all coefficients $c_i$ of $g(x)$, i.e., $g(x) \in K[x]$. Let $f(x) \in K[x]$
be the minimum polynomial of $\alpha$ over $K$. Since $g(\alpha) = 0$, $f(x)$ divides $g(x)$. Since
$E = K(\alpha)$, we have

$$|H^*| = (E : K) = (K(\alpha) : K) = \deg(f(x)) \le \deg(g(x)) = m = |H|,$$

as claimed. Since $H \subseteq H^*$, we conclude that $H = H^* = \mathrm{Gal}(E/E^H)$, as claimed.

Now, in the notation established at the beginning of this section, we have
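
$$(\Gamma \circ \Theta)(H) = \Gamma(E^H) = \mathrm{Gal}(E/E^H) = H \text{ for all subgroups } H \text{ of } G.$$

Earlier we established that

$$(\Theta \circ \Gamma)(K) = \Theta(\mathrm{Gal}(E/K)) = E^{\mathrm{Gal}(E/K)} = K \text{ for all fields } K \text{ with } F \subseteq K \subseteq E.$$

Thus $\Gamma$ and $\Theta$ are inverses of each other. Hence both $\Gamma$ and $\Theta$ are bijections
between the sets $\mathcal{F}$ and $\mathcal{G}$. This completes the proof of the Fundamental Theorem
of Galois Theory.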
The Fundamental Theorem of Galois Theory. Let $F \subseteq E \subseteq \mathbb{C}$ with $E/F$ a
Galois extension of fields. Then the correspondence

$$K = E^H \iff H = \mathrm{Gal}(E/K)$$
defines a one-to-one inclusion-reversing correspondence between the subfields of $E$
containing $F$ and the subgroups of $\mathrm{Gal}(E/F)$. Moreover,

$$|H| = (E : E^H)$$
for every subgroup $H$ of $\mathrm{Gal}(E/F)$.
We apply this theorem to obtain a necessary and sufficient condition for a com-
plex number to be constructible. First we need a general fact about finite p-groups.
Theorem 12.6. Let $p$ be a prime number and let $G$ be a finite group with $|G| = p^n$
for some $n \in \mathbb{N}$. Then there is a tower of subgroups

$$G = G_0 \supseteq G_1 \supseteq G_2 \supseteq \cdots \supseteq G_n = \{e\},$$
where $(G_i : G_{i+1}) = p$ for all $i$, and $e$ is the identity element of $G$.
The proof of Theorem 12.6 will require us to develop a little more basic group
theory. Back in Math 4580, we defined the relation of conjugacy on a group G. We
recall this definition now.
Definition 12.7. Let $G$ be a group. We say that two elements $x$ and $y$ of $G$ are
conjugate if there exists an element $g \in G$ such that $y = gxg^{-1}$. We shall write
$x \sim y$ to denote the fact that $x$ is conjugate to $y$.
We leave the following fact as an exercise.
Lemma 12.8. The relation of conjugacy is an equivalence relation on a group G.
We refer to the equivalence classes under this relation as conjugacy classes.
Thus the conjugacy classes of G define a partition of G into disjoint subsets. We
can think of this in a more sophisticated way in terms of group actions.
Lemma 12.9. Let $G$ be a group. Then $G$ acts as a group of functions on the set
$G$ via:

$$g(x) = gxg^{-1} \text{ for all } x \in G.$$

Proof. We must verify that multiplication in $G$ corresponds to composition of
functions, i.e., that

$$(gh)(x) = g(h(x))$$
for all $g$ and $h$ in the group $G$ and all $x$ in the set $G$:
$$(gh)(x) = (gh)x(gh)^{-1} = g(hxh^{-1})g^{-1} = g(h(x)),$$
as claimed.

Notice that the conjugacy classes of G are the G-orbits on the set G under the
conjugation action. We introduce the following notation.
Definition 12.10. Let G be a group and let x be an element of G. Then the
centralizer in G of x is the set

$$C_G(x) := \{g \in G : g \cdot x = x \cdot g\} = \{g \in G : g(x) = x\}.$$


Thus CG (x) is the stabilizer in G of the “point” x under the conjugation action. In
particular, CG (x) is a subgroup of G.
In consequence of Lemma 12.9, we may apply Lagrange’s Orbit-Stabilizer The-
orem to the conjugation action of the group G on the set G to obtain the following
result, whose proof we leave as an exercise.
Theorem 12.11. Let $G$ be a finite group. Let $C_1, C_2, \ldots, C_n$ be the conjugacy
classes of $G$. For each $i$, let $x_i$ be an element in $C_i$. Then the following conclusions
hold:
(1) $|G| = \sum_{i=1}^{n} |C_i|$;
(2) For each $i$, $|G| = |C_i| \cdot |C_G(x_i)|$;
(3) $C_i = \{x_i\}$ if and only if $x_i \in Z(G)$.
(4) Suppose that $C_1, \ldots, C_r$ are the conjugacy classes of $G$ such that $|C_i| > 1$.
Then
$$|G| = |Z(G)| + \sum_{i=1}^{r} |C_i|.$$

Note that (2) implies, in particular, that $|C_i|$ divides $|G|$ for all $i$.
The displayed equation in (4) is usually referred to as Cauchy’s Class Equation. It
has the following important corollary.
Theorem 12.12. Let $p$ be a prime and let $G$ be a finite group with $|G| = p^n$ for
some $n \in \mathbb{N}$. Then $Z(G) \ne \{1\}$. In particular, $G$ has a normal subgroup $N$ with
$|N| = p$.
Proof. By Theorem 12.11, if $C_i$ is a conjugacy class of $G$, then $|C_i|$ divides $p^n$.
Hence either $|C_i| = 1$ or $p$ divides $|C_i|$. Of course, $p$ divides $|G|$. Thus in the
notation of Cauchy’s Class Equation, $p$ divides $|C_i|$ for all $i$ with $1 \le i \le r$. Hence
$p$ divides

$$|Z(G)| = |G| - (|C_1| + \cdots + |C_r|).$$

Since $1 \in Z(G)$, we have $|Z(G)| > 0$. Hence $|Z(G)| \ge p$.
Now let $x \in Z(G)$ with $x \ne 1$. By Lagrange’s Theorem, the order of $x$ divides
$|G| = p^n$. Hence the order of $x$ is $p^a$ for some $a \ge 1$. Set
$$z = x^{p^{a-1}}.$$
Then $z$ is an element of $Z(G)$ of order $p$. Let $N = \langle z \rangle$ be the cyclic subgroup of
$Z(G)$ generated by $z$. Then $|N| = p$. Indeed,

$$N = \{1, z, z^2, \ldots, z^{p-1}\}.$$
Let $g \in G$. Then, since $N \subseteq Z(G)$,

$$g \cdot z^i \cdot g^{-1} = g \cdot g^{-1} \cdot z^i = 1 \cdot z^i = z^i \in N,$$
for all $i$. Thus $N$ is a normal subgroup of $G$ with $|N| = p$.
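The Class Equation and the conclusion $Z(G) \ne \{1\}$ can be seen concretely in a group of order $2^3$. The sketch below (an illustration under assumptions made here) realizes the symmetry group of a square as permutations of the vertices $0, 1, 2, 3$, computes its conjugacy classes and center by brute force, and checks the equation. The helper names are hypothetical.

```python
def compose(g, h):
    """Composition g∘h of permutations given as tuples: (g∘h)(i) = g[h[i]]."""
    return tuple(g[h[i]] for i in range(len(h)))

def inverse(g):
    inv = [0] * len(g)
    for i, j in enumerate(g):
        inv[j] = i
    return tuple(inv)

def generate(gens):
    """Close a list of permutations under composition (finite, so a group)."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

# Symmetries of a square with vertices 0,1,2,3: a rotation and a reflection.
G = generate([(1, 2, 3, 0), (1, 0, 3, 2)])

classes = {frozenset(compose(compose(g, x), inverse(g)) for g in G) for x in G}
center = {x for x in G if all(compose(g, x) == compose(x, g) for g in G)}
big = [len(c) for c in classes if len(c) > 1]   # sizes of the non-central classes

print(len(G), len(center), sorted(big))
print(len(G) == len(center) + sum(big))
```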

We would like to produce a tower of subgroups

$$1 = N_0 \subseteq N = N_1 \subseteq \cdots \subseteq N_n = G$$
with $|N_i| = p^i$ for all $i$. This will follow easily by induction once we generalize the
quotient group construction.
Definition 12.13. Let $G$ be a group and let $N$ be a normal subgroup of $G$. We
define a quotient group $G/N$ as follows. The set $G/N$ is the set of all cosets
$g \cdot N$ for $g \in G$. Multiplication in $G/N$ is defined by the rule:

$$(g \cdot N) \cdot (g_1 \cdot N) = (g \cdot g_1) \cdot N$$
for all $g, g_1 \in G$.
Note that we do not have to specify left or right cosets, since the condition that
$N$ is a normal subgroup of $G$ is equivalent to the assertion

$$g \cdot N = N \cdot g \text{ for all } g \in G.$$
As usual, we must verify that the multiplication operation is well-defined. This
follows directly from the equation above and the Associative Law:

$$(g \cdot N) \cdot (g_1 \cdot N) = g \cdot (N \cdot g_1) \cdot N = g \cdot (g_1 \cdot N) \cdot N = (g \cdot g_1) \cdot (N \cdot N) = (g \cdot g_1) \cdot N,$$

the last equality holding because $N$ is a group. We leave as an exercise to verify
the following facts.
Theorem 12.14. Let $G$ be a group and let $N$ be a normal subgroup of $G$. Then
(1) $G/N$ is a group; and
(2) If $|G| < \infty$, then $|G/N| = \frac{|G|}{|N|}$.
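The coset multiplication of Definition 12.13 and the count in Theorem 12.14 can be checked by brute force in a small case. The sketch below takes $G = S_3$ and $N = A_3$ (a choice made here for illustration), forms the cosets $g \cdot N$, and verifies that the set product of two cosets is again the single coset $(g \cdot g_1) \cdot N$.

```python
from itertools import permutations

def compose(g, h):
    """(g∘h)(i) = g[h[i]] for permutations stored as tuples."""
    return tuple(g[h[i]] for i in range(len(h)))

def is_even(g):
    """A permutation is even when it has an even number of inversions."""
    n = len(g)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if g[i] > g[j])
    return inversions % 2 == 0

G = list(permutations(range(3)))           # S3
N = frozenset(g for g in G if is_even(g))  # A3, a normal subgroup of S3

def coset(g):
    """The left coset g·N."""
    return frozenset(compose(g, n) for n in N)

cosets = {coset(g) for g in G}

# The product of two cosets, taken as sets, should be the single coset (g·g1)·N.
well_defined = all(
    frozenset(compose(x, y) for x in coset(g) for y in coset(g1)) == coset(compose(g, g1))
    for g in G for g1 in G
)

print(len(cosets), len(G) // len(N), well_defined)
```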

We need one further fact.



Lemma 12.15. Let $G$ be a group and let $N$ be a normal subgroup of $G$. Define
the function $f : G \to G/N$ by

$$f(g) = g \cdot N \text{ for all } g \in G.$$

Then $f$ is a surjective group homomorphism. Moreover if $H/N$ is a subgroup of
$G/N$ and if

$$H := \{h \in G : f(h) \in H/N\},$$
then $H$ is a subgroup of $G$. Moreover if $|G| < \infty$, then $|H| = |N| \cdot |H/N|$.
Proof. We leave as an exercise to show that $f$ is a surjective group homomorphism.
Suppose that $H/N$ is a subgroup of $G/N$ and $H$ is defined as above. Since $H/N$ is
a group, the identity coset $1 \cdot N$ is in $H/N$. Hence $1 \in H$. Suppose that $h, h_1 \in H$.
Then $h \cdot N \in H/N$ and $h_1 \cdot N \in H/N$. Hence $h^{-1} \cdot N = (h \cdot N)^{-1} \in H/N$ and

$$(h \cdot h_1) \cdot N = (h \cdot N) \cdot (h_1 \cdot N) \in H/N.$$
Hence $h^{-1} \in H$ and $h \cdot h_1 \in H$. Thus $H$ is indeed a subgroup of $G$. Moreover,
if $G$ is finite, then $H$ is the union of $|H/N|$ cosets of $N$. Each of these cosets has
cardinality $|N|$. Hence

$$|H| = |H/N| \cdot |N|,$$
as claimed.
We can now easily establish the existence of the desired tower of subgroups in a
finite p-group. The following result is a bit stronger than what we need for Theorem
12.17, but we will use it again in Corollary 12.18.
Theorem 12.16. Let $p$ be a prime and let $G$ be a finite group with $|G| = p^n$ for
some integer $n \ge 0$. (We call such a group a finite $p$-group.) Let $H$ be a subgroup
of $G$ with $|H| = p^m$. Then there is a tower of subgroups

$$H = H_0 \subseteq H_1 \subseteq \cdots \subseteq H_{n-m} = G$$
with $|H_i| = p^{m+i}$ for all $i$, $0 \le i \le n - m$.
Proof. We proceed by induction on $n + k$, where $k := n - m$. The result is trivial if
$k = 0$. Suppose then that $k \ge 1$, that $|G| = p^n$ and $|H| = p^m$, and that the result
is true in all cases with a smaller value of $n + k$. By Theorem 12.12, $G$ has a normal
subgroup $N$ with $|N| = p$. If $N$ is not contained in $H$, then $H \subseteq NH$ with
$|NH| = p^{m+1}$. Since $n - (m + 1) = k - 1$, it follows by induction that there is a
tower of subgroups

$$NH = H_1 \subseteq \cdots \subseteq H_{n-m} = G$$
with $|H_i| = p^{m+i}$. Then taking $H_0 := H$, we are done.
Hence, we may assume that $N \subseteq H$. Let $\bar{G} = G/N$ and $\bar{H} = H/N$. Then
$|\bar{G}| = p^{n-1}$ and $(\bar{G} : \bar{H}) = p^{n-m}$, so by induction there is a tower of subgroups

$$\bar{H}_0 = \bar{H} \subseteq \bar{H}_1 \subseteq \cdots \subseteq \bar{H}_{n-m} = \bar{G}$$
with $|\bar{H}_i| = p^{m+i-1}$ for all $i$, $0 \le i \le n - m$. For each $i$, let $H_i$ be the pre-image in $G$
of $\bar{H}_i$ under the homomorphism $f : G \to G/N$ via $f(g) = g \cdot N$ for all $g \in G$. Then
by Lemma 12.15, $H_i$ is a subgroup of $G$ for all $i$, $0 \le i \le n - m$, and

$$|H_i| = |\bar{H}_i| \cdot |N| = p^{m+i-1} \cdot p = p^{m+i},$$

for all $i$, $0 \le i \le n - m$. Clearly, $H_0 = H$ and $H_i \subseteq H_{i+1}$ for all $i$. Hence these groups
provide the desired tower of subgroups of $G$.

We can now complete our characterization of constructible numbers.
Theorem 12.17. Let $\alpha \in \mathbb{C}$. Let $f(x) \in \mathbb{Q}[x]$ be the minimum polynomial of $\alpha$
over $\mathbb{Q}$. Let $E$ be the splitting field of $f(x)$ over $\mathbb{Q}$, and let $G = \mathrm{Gal}(E/\mathbb{Q})$. Then
$\alpha$ is constructible if and only if $|G|$ is a power of 2.
Proof. Suppose first that $\alpha$ is constructible. We have seen earlier that then every
root of $f(x)$ is constructible. Hence by combining towers of fields, we can achieve
a tower

$$\mathbb{Q} = K_0 \subseteq K_1 \subseteq K_2 \subseteq \cdots \subseteq K_n$$

such that $E \subseteq K_n$ and $(K_{i+1} : K_i) = 2$ for all $i$. Hence $(K_n : \mathbb{Q}) = 2^n$ by Theorem
14.14 in the Math 4580 text, and since $E \subseteq K_n$, $(E : \mathbb{Q})$ is also a power of 2. Then
$|G| = (E : \mathbb{Q})$ is a power of 2, as claimed. Note that this extends Theorem 14.16 in
the Math 4580 text, which guarantees that the degree of $f(x)$ is a power of 2.
Next suppose that $|G|$ is a power of 2. We claim that there is a tower of fields

$$\mathbb{Q} = E_0 \subseteq E_1 \subseteq E_2 \subseteq \cdots \subseteq E_n = E$$

with $(E_{i+1} : E_i) = 2$ for all $i$. By the Galois Correspondence Theorem, this is true
if and only if there is a tower of subgroups

$$G = G_0 \supseteq G_1 \supseteq G_2 \supseteq \cdots \supseteq G_n = \{I\}$$

with $(G_i : G_{i+1}) = 2$. Since $|G|$ is a power of the prime 2, this is immediate from
Theorem 12.16.

Using Theorems 12.16 and 12.17, we can obtain the following sharpened purely
field-theoretic criterion for a number to be constructible.
Corollary 12.18. Let $\alpha \in \mathbb{C}$ with $(\mathbb{Q}(\alpha) : \mathbb{Q}) = 2^n$. Then $\alpha$ is constructible if and
only if there exists a tower of subfields of $\mathbb{C}$:

$$\mathbb{Q} = F_0 \subseteq F_1 \subseteq \cdots \subseteq F_n = \mathbb{Q}(\alpha)$$

with $(F_{i+1} : F_i) = 2$ for all $i$.

Proof. Clearly if the tower exists, then this demonstrates that $\alpha$ is constructible,
by Theorem 13.16 in the Math 4580 text. Now assume that $\alpha$ is constructible.
Let $f(x) \in \mathbb{Q}[x]$ be the minimum polynomial for $\alpha$ and let $E$ be the splitting field
for $f(x)$ over $\mathbb{Q}$. By Theorem 12.17, the Galois group $G := \mathrm{Gal}(E/\mathbb{Q})$ is a 2-group.
Let $H = \mathrm{Gal}(E/\mathbb{Q}(\alpha)) \subseteq G$. Then by the Galois Correspondence Theorem,
$(G : H) = 2^n$. By Theorem 12.16, there exists a tower of subgroups

$$H = H_0 \subseteq H_1 \subseteq \cdots \subseteq H_n = G$$
with $(G : H_i) = 2^{n-i}$. Let $K_i := E^{H_i}$. Again, by the Galois Correspondence
Theorem, we get a tower of fields

$$K_n = E^G = \mathbb{Q} \subseteq K_{n-1} \subseteq \cdots \subseteq K_1 \subseteq K_0 = E^H = \mathbb{Q}(\alpha)$$
with $(K_i : K_{i+1}) = 2$ for all $i$. Setting $F_i := K_{n-i}$ gives the desired tower, as claimed.

Corollary 12.18 does not seem very different from Theorem 13.16 in the Math
4580 text. However, without Galois Theory, it is not clear that Corollary 12.18
can be deduced directly from the earlier Theorem 13.16, even in the following very
elementary case. Suppose that ↵ is a constructible number which is the root of a
quartic irreducible polynomial p(x) 2 Q[x]. Let F = Q(↵) and suppose that E is
the splitting field for p(x) over Q with (E : Q) = 8. Suppose you know that there
is a tower of fields:

$$\mathbb{Q} = E_0 \subseteq E_1 \subseteq E_2 \subseteq E_3 = E$$
with $(E_i : \mathbb{Q}) = 2^i$. Show that there exists a subfield $F_1$ of $F$ with $(F_1 : \mathbb{Q}) = 2$. I
don’t know how to do this without using Galois Theory.
We now discuss a way to show that there exist nonconstructible numbers whose
minimum polynomial over Q has degree a power of 2, specifically 4. Going back to
Lagrange’s analysis of the quartic polynomial, we recall that if

$$p(x) = x^4 + cx^2 + dx + e = (x - r_1)(x - r_2)(x - r_3)(x - r_4)$$

with $c, d, e \in \mathbb{Q}$, then the numbers

$$t_1 := (r_1 + r_2)(r_3 + r_4), \quad t_2 := (r_1 + r_3)(r_2 + r_4), \quad\text{and}\quad t_3 := (r_1 + r_4)(r_2 + r_3)$$

are roots of the Lagrange resolvent cubic polynomial

$$R(x) := x^3 - 2cx^2 + (c^2 - 4e)x + d^2.$$


If this cubic polynomial is irreducible, then the Galois group of p(x) over Q is
isomorphic to the symmetric group S4 or to the alternating group A4 . In either
case, it is not a 2-group, and hence the ri ’s are not constructible numbers. In the
exercises, you will be asked to work out an explicit example of this, namely when
p(x) = x4 + x + 1.
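As a rough computational companion to the resolvent cubic above, the following Python sketch builds the coefficients of $R(x)$ from $c, d, e$ and searches for rational roots using the Rational Root Theorem. The function names (`resolvent_cubic`, `rational_roots`, and so on) are illustrative choices, not notation from the notes. It relies on the fact that a cubic in $\mathbb{Q}[x]$ with no rational root is irreducible over $\mathbb{Q}$, since any nontrivial factorization of a cubic includes a linear factor.

```python
from fractions import Fraction


def resolvent_cubic(c, d, e):
    """Coefficients [1, -2c, c^2 - 4e, d^2] of R(x), highest degree first."""
    return [1, -2 * c, c * c - 4 * e, d * d]


def evaluate(coeffs, x):
    """Evaluate a polynomial (highest degree first) at x by Horner's rule."""
    value = 0
    for coefficient in coeffs:
        value = value * x + coefficient
    return value


def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [k for k in range(1, n + 1) if n % k == 0]


def rational_roots(coeffs):
    """Rational roots of an integer polynomial with nonzero constant term."""
    lead, const = coeffs[0], coeffs[-1]
    candidates = set()
    for num in divisors(const):
        for den in divisors(lead):
            candidates.add(Fraction(num, den))
            candidates.add(Fraction(-num, den))
    return sorted(r for r in candidates if evaluate(coeffs, r) == 0)


# p(x) = x^4 + x + 1 corresponds to c = 0, d = 1, e = 1.
R = resolvent_cubic(0, 1, 1)
print(R)                  # coefficients of R(x), highest degree first
print(rational_roots(R))  # an empty list means R(x) has no rational root
```

An empty list from `rational_roots` is only a finite check of the candidates allowed by the Rational Root Theorem; it does not replace the written proof requested in the exercises.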
We thank Professor S. K. Wong for the following easier argument for
$p(x) = x^4 + x + 1 \in \mathbb{Q}[x]$, avoiding the use of Lagrange resolvents, but using
Corollary 12.18. We leave some details to the exercises. Let $\theta \in \mathbb{C}$ be a
root of $p(x)$. By the Rational Root Theorem from Math 4580 (Chapter 6, Exercise 9a),
since $p(1) = 3$ and $p(-1) = 1$, $p(x)$ has no rational root. Hence
$(K : \mathbb{Q}) = 2$ or $4$, where $K = \mathbb{Q}(\theta)$.
We outline a proof that $K$ has no subfield $F$ with $(K : F) = 2$. It will follow then
that $(K : \mathbb{Q}) = 4$, and then, by Corollary 12.18, that $\theta$ is not constructible.
Suppose then that $K$ has a subfield $F$ with $(K : F) = 2$. Since $K = F(\theta)$, there is
a monic quadratic polynomial $q(x) \in F[x]$ with $q(\theta) = 0$. Since $p(x) \in F[x]$ with
$p(\theta) = 0$, we see that $q(x)$ divides $p(x)$. Write $q(x) = x^2 + ax + b \in F[x]$ and
factor

$$p(x) = x^4 + 0 \cdot x^3 + 0 \cdot x^2 + x + 1 = (x^2 + ax + b)(x^2 + cx + d) \in F[x].$$

Equating coefficients, we obtain

(1) $a + c = 0$;
(2) $ac + b + d = 0$;
(3) $ad + bc = 1$; and
(4) $bd = 1$.
Since $c = -a$, we see that $a \neq 0$ by equation (3), and then from (2) and (3) we
get
(5) $a^2 = d + b$; and
(6) $\frac{1}{a} = d - b$.
Adding and subtracting these, we get formulas for $2d$ and $2b$, which we can multiply to get

$$\left(a^2 + \frac{1}{a}\right)\left(a^2 - \frac{1}{a}\right) = 4bd = 4.$$

We conclude that $a^2$ is a root of the polynomial $r(x) = x^3 - 4x - 1 \in \mathbb{Q}[x]$.
Since $r(x)$ is irreducible in $\mathbb{Q}[x]$, it follows that
$(\mathbb{Q}(a^2) : \mathbb{Q}) = 3$. However $a \in F$ and so $\mathbb{Q}(a^2) \subseteq F$
with $(F : \mathbb{Q}) \le 2$, a contradiction.
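The coefficient bookkeeping in this argument can be sanity-checked numerically. The sketch below is only a floating-point check over the real numbers, not part of the proof: it finds the real root $u$ of $r(x) = x^3 - 4x - 1$ between 2 and 3 by bisection, sets $a = \sqrt{u}$, recovers $b$, $c$, $d$ from the relations $c = -a$, $a^2 = d + b$, $\frac{1}{a} = d - b$, and multiplies the two quadratics back together. The helper names `bisect_root` and `multiply` are hypothetical.

```python
import math


def r(x):
    """The cubic r(x) = x^3 - 4x - 1 from the argument above."""
    return x**3 - 4 * x - 1


def bisect_root(f, lo, hi, steps=200):
    """Locate a root of f in [lo, hi] by bisection; assumes a sign change."""
    assert f(lo) * f(hi) < 0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


def multiply(p, q):
    """Multiply two polynomials given as coefficient lists, highest degree first."""
    product = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            product[i + j] += pi * qj
    return product


u = bisect_root(r, 2.0, 3.0)   # r(2) = -1 < 0 < 14 = r(3)
a = math.sqrt(u)               # so that a^2 = u is a root of r
d = (a * a + 1 / a) / 2        # from a^2 = d + b and 1/a = d - b
b = (a * a - 1 / a) / 2
c = -a                         # from a + c = 0

factorization = multiply([1.0, a, b], [1.0, c, d])
print([round(coefficient, 9) for coefficient in factorization])
```

The printed coefficients should agree with those of $x^4 + x + 1$, namely 1, 0, 0, 1, 1, up to rounding. This confirms the algebra relating $a$, $b$, $c$, $d$ to $r(x)$; it says nothing about whether such a factorization exists over a field $F$ with $(F : \mathbb{Q}) \le 2$, which is the point of the contradiction.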
The concept of a normal subgroup was one of Galois’ great contributions to the
emerging theory of groups. It plays the following crucial role in the theory of
equations, extending the earlier observation of Abel.
Theorem 12.19. Let $E/F$ be a Galois extension of fields with Galois group $G$.
Suppose that $N$ is a normal subgroup of $G$. Then $E^N/F$ is a Galois extension of
fields with Galois group isomorphic to $G/N$. Conversely, if $H$ is a subgroup of $G$
such that $E^H/F$ is a Galois extension of fields, then $H$ is a normal subgroup of $G$.
Proof. Let $\alpha \in E^N$, let $g \in G$, and let $h \in N$. Since $N$ is normal in $G$,
there exists $h_1 \in N$ with

$$h \cdot g = g \cdot h_1.$$
Then

$$h(g(\alpha)) = (h \cdot g)(\alpha) = (g \cdot h_1)(\alpha) = g(h_1(\alpha)) = g(\alpha).$$

Since this is true for every $h \in N$, it follows that

$$g(\alpha) \in E^N \text{ for all } \alpha \in E^N \text{ and all } g \in G.$$

Now let $\alpha_1, \ldots, \alpha_r$ be a set of elements of $E^N$ such that

$$E^N = F(\alpha_1, \ldots, \alpha_r),$$

and let $m_i(x) \in F[x]$ be the minimum polynomial of $\alpha_i$ over $F$. If $\beta_i$ is any
root of $m_i(x)$, then $\beta_i \in E$ and there exists $g_i \in G$ with
$g_i(\alpha_i) = \beta_i$. But $g_i(\alpha_i) \in E^N$. Hence $\beta_i \in E^N$. In other words,
$E^N$ contains all of the roots of $m_i(x)$ for all $i$, $1 \le i \le r$. Thus $E^N$ is the
splitting field of $m(x) = m_1(x) \cdots m_r(x) \in F[x]$, i.e., $E^N/F$ is a Galois
extension of fields.
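The normality condition used in this proof, namely that $h \cdot g = g \cdot h_1$ with $h_1 = g^{-1} h g$ again in the subgroup, can be tested by brute force in a small group. The sketch below does this in $S_3$, with permutations stored as tuples on $\{0, 1, 2\}$; the encoding and the function names are assumptions made for illustration only.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Return the inverse permutation of p."""
    result = [0] * len(p)
    for i, image in enumerate(p):
        result[image] = i
    return tuple(result)


def is_normal(subgroup, group):
    """Check that h1 = g^{-1} h g lies in the subgroup for all h and all g."""
    return all(
        compose(inverse(g), compose(h, g)) in subgroup
        for g in group
        for h in subgroup
    )


S3 = set(permutations(range(3)))
identity = (0, 1, 2)
A3 = {identity, (1, 2, 0), (2, 0, 1)}   # the identity and the two 3-cycles
H = {identity, (1, 0, 2)}               # generated by a single transposition

print(is_normal(A3, S3))  # True
print(is_normal(H, S3))   # False
```

This matches Theorem 12.19 in a familiar case: for the splitting field of $x^3 - 2$ over $\mathbb{Q}$, whose Galois group is isomorphic to $S_3$, the fixed field of $A_3$ is the quadratic field $\mathbb{Q}(\sqrt{-3})$, which is Galois over $\mathbb{Q}$, while the fixed field of a subgroup of order 2 is generated by a single cube root of 2 and is not Galois over $\mathbb{Q}$.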
Galois’ work finally clarified the question of when a polynomial equation can
be solved by a process involving only addition, subtraction, multiplication, division,
and extraction of roots. The fact that this was impossible for the general polynomial
equation of degree $n \ge 5$ had been established a bit earlier by Ruffini and Abel.
Speaking informally, what Galois showed was the following: Suppose $p(x)$ is a
polynomial with rational coefficients, having splitting field $E/\mathbb{Q}$. The problem of
finding the roots of this polynomial can be reduced to the problem of finding roots of
polynomials of lower degree if and only if there exists a Galois extension $F/\mathbb{Q}$ with
$F$ a proper subfield of $E$. If there is such a subfield, then one can first try to solve
the polynomial equation $f(x) = 0$ for which $F/\mathbb{Q}$ is the splitting field. Next one can
try to solve the polynomial equation $g(x) = 0$, with $g(x) \in F[x]$, for which $E$ is the
splitting field. Since $(F : \mathbb{Q}) < (E : \mathbb{Q})$ and $(E : F) < (E : \mathbb{Q})$, the
problem has been reduced to two smaller problems. By the fundamental Galois
Correspondence Theorem, this is possible if and only if the Galois group
$G := \mathrm{Gal}(E/\mathbb{Q})$ contains a proper normal subgroup, i.e., a normal subgroup
$N$ with $N \neq \{1\}$ and $N \neq G$. Then one can choose $F$ to be $E^N$.
A polynomial equation is solvable by radicals, i.e. its solutions can be found
using only addition, subtraction, multiplication, division, and extraction of roots,
if and only if this process can continue to be refined until one finally reaches field
extensions $F_{i+1}/F_i$, all of prime degree, as Gauss did in his reduction of the
cyclotomic polynomials. Galois’ fundamental result in this context requires the following
definition.
Definition 12.20. A group $G$ is solvable if there is a tower of normal subgroups

$$G = G_0 \supseteq G_1 \supseteq G_2 \supseteq \cdots \supseteq G_n = \{1\}$$
such that each quotient group $G_i/G_{i+1}$ is an abelian group.
Theorem. Let $p(x) \in \mathbb{Q}[x]$. Then $p(x)$ can be solved by a process involving only
addition, subtraction, multiplication, division, and extraction of roots if and only if
the Galois group of $p(x)$ is a solvable group.
It is possible to show that most polynomial equations of degree $n$ have Galois
group $S_n$, the full symmetric group on $n$ letters. Since $S_n$ is not solvable for $n \ge 5$,
it follows that most polynomials of degree at least 5 are not “solvable by radicals”.
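For small $n$, the solvability of $S_n$ can be explored by machine. The sketch below does not use Definition 12.20 directly. It assumes the standard equivalent criterion that a finite group is solvable if and only if its derived series (repeatedly taking the subgroup generated by all commutators) reaches the trivial group. Permutations are encoded as tuples, and all function names are illustrative.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Return the inverse permutation of p."""
    result = [0] * len(p)
    for i, image in enumerate(p):
        result[image] = i
    return tuple(result)


def generated_subgroup(generators, n):
    """Closure of the generators under composition (a subgroup, by finiteness)."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def derived_subgroup(group, n):
    """Subgroup generated by all commutators x y x^{-1} y^{-1}."""
    commutators = {
        compose(compose(x, y), compose(inverse(x), inverse(y)))
        for x in group
        for y in group
    }
    return generated_subgroup(commutators, n)


def derived_series_orders(n):
    """Orders along the derived series of S_n until it stops shrinking."""
    group = set(permutations(range(n)))
    orders = [len(group)]
    while True:
        smaller = derived_subgroup(group, n)
        if len(smaller) == len(group):
            return orders
        group = smaller
        orders.append(len(group))


for n in (3, 4, 5):
    print(n, derived_series_orders(n))
```

The series for $S_3$ and $S_4$ end at order 1, with orders 6, 3, 1 and 24, 12, 4, 1 respectively, while the series for $S_5$ stalls at order 60 (the alternating group $A_5$), which is consistent with $S_5$ not being solvable.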
Galois’ work to a large extent closed the book on the subject of finding algebraic
algorithms for solving general polynomial equations of degree greater than 4.
However, other mathematicians, notably Leopold Kronecker, pursued this algorithmic
question much more deeply, in the case of polynomials with solvable Galois group.
Even more importantly, Galois’ work opened the book of group theory and more
generally, in conjunction with the great work of his predecessors such as Lagrange
and Gauss, opened a vast and fascinating book of abstract mathematical structures
(like groups, rings, and fields) and the “Galois correspondences” which link them.
Although Galois’ work appeared almost impenetrable to his contemporaries, it
was clarified by Camille Jordan in his book Traité des substitutions et des équations
algébriques, published in 1870. Two young men who came to Paris at that
time – Felix Klein and Sophus Lie – learned Galois’ theory from Jordan and were
profoundly influenced by it. Klein was led to articulate his Erlanger Programm
describing all geometries in terms of the action of groups of isometries on spaces.
Lie was motivated to search for a Galois correspondence for differential equations,
which led him to the important concepts of a Lie group and a Lie algebra. This in
turn had a profound impact on modern physics.
And so mathematics evolves.

Exercises
1. Prove Theorem 12.2.
2. Let $G$ be a group. Define the conjugacy relation $x \sim y$ on $G$ by:

$x \sim y$ if and only if $y = gxg^{-1}$ for some $g \in G$.
Prove: The conjugacy relation is an equivalence relation on $G$.
3. Prove Theorem 12.11.
4. Let $G$ be a group and let $N$ be a normal subgroup of $G$.
a. Prove: $G/N$ is a group (as defined in Definition 12.13).
b. Prove: If $|G| < \infty$, then $|G/N| = \frac{|G|}{|N|}$.

5. Let $H$ be a finite group with $|H| = p^a \cdot m$, with $p$ a prime and with
$\gcd(p, m) = 1$. Suppose that $H$ has a normal subgroup $P$ with $|P| = p^a$.
a. Prove: If $\alpha \in \mathrm{Aut}(H)$, then $\alpha(P) = P$.
b. Prove: If $H$ is a normal subgroup of a (larger) group $G$, then $P$ is also a
normal subgroup of $G$.
6. Verify that the function f defined in Lemma 12.15 is a surjective group
homomorphism.
7a. Prove: If G is a finite group with |G| even, then G contains an element g of
order 2.
b. Prove: Suppose $A$ is an abelian group with $|A| = 2^a \cdot p_1 \cdot p_2 \cdots p_r$, where
the $p_i$ are distinct odd primes. Then $A$ has a subgroup $B$ with either $|B| = 2^a$ or
with $|B| = p_i$ for some $i$.
c. Recall from Corollary 4.8 that two elements of $S_5$ are conjugate if and only if
they have the same cycle structure. Use this to list all the conjugacy classes of $S_5$
and their sizes.
d. Prove: $S_5$ is not a solvable group. [Hint: Suppose that $S_5$ is a solvable group.
Use (b) to argue that $S_5$ must have a normal subgroup $B$ with $|B| = 2, 3, 4, 5$, or
$8$. Now use (c) to derive a contradiction.]
8. For each of the equations listed below, determine the Galois group over $\mathbb{Q}$ of
the splitting field of the equation. List all of the subgroups of the Galois group.
List all of the subfields of the splitting field of the equation, and draw a diagram
illustrating the Galois correspondence between subgroups and subfields for each
example.
a. $(x^2 + 1)(x^2 - 2)$
b. $(x^2 - 2)(x^2 - 3)(x^2 + 1)$ (Note: You must prove by explicit calculation that
$\sqrt{3}$ is not contained in $\mathbb{Q}[\sqrt{2}]$.)
c. $x^3 - 2$
d. $x^7 - 1$
e. $x^4 - 3$
f. $x^{11} - 1$
9. For each finite group $G$ with $|G| \le 7$, give an example of an equation whose
Galois group over $\mathbb{Q}$ is isomorphic to $G$.
10. Let $p(x) = x^4 + x + 1$. Let $E$ be the splitting field for $p(x)$ over $\mathbb{Q}$.
a. Find the resolvent cubic $R(x)$.
b. Prove that $R(x)$ is irreducible over $\mathbb{Q}$.
c. Prove that $(E : \mathbb{Q}) = 12$ or $24$.
d. Prove: $\mathrm{Gal}(E/\mathbb{Q}) \cong A_4$ or $S_4$.
e. If $p(x) = (x^2 + ax + b)(x^2 + cx + d)$, verify the calculations on page 100 which
show that $a^2$ is a root of the cubic polynomial $r(x) = x^3 - 4x - 1$.
f. Prove: $r(x) = x^3 - 4x - 1$ is irreducible in $\mathbb{Q}[x]$.
g. Explain why $(\mathbb{Q}(a^2) : \mathbb{Q}) = 3$ and $(F : \mathbb{Q}) \le 2$ combine to give a
contradiction to the assumed existence of the field $F$.

INDEX

affine group, $\mathrm{Aff}(\mathbb{R}^n)$ 52
affine transformation 53
algebraic number 67
alternating group, $A_n$ 38, 79
alternating polynomial 78
averaging trick 53
basis 5
Cauchy’s Class Equation 96
centralizer of a group element, $C_G(x)$ 96
characteristic polynomial 16
conjugate elements, conjugacy class 18, 95
coordinates 5
cross product 16
cycle notation 28
cycle structure 30–31
cyclic group 28
degree of a field extension, (E : F ) 66
determinant 15–16
diagonalizable matrix 18
dimension 6–7
direct product 59
direct sum 10
discriminant 78, 80
dodecahedron 39
eigenspace 15
eigenvalue 14
eigenvector 14
elementary symmetric polynomial 71
extension field 65
face of a polyhedron 39
Fermat’s Little Theorem 50
Fundamental Theorem of Algebra (FTA) 74
Fundamental Theorem of Galois Theory 95
Galois Correspondence 91
Galois extension 86
Galois field 68–69
Galois group 86
$GL(n, \mathbb{R})$, $GL(\mathbb{R}^n)$ 20
group 18
hyperplane 92
icosahedron 42–43
inner product 60
Intermediate Value Theorem 75
invariant subspace 14
isomorphism (of vector spaces) 5
Lagrange’s Orbit-Stabilizer Theorem 33
Lagrange resolvent 81
Lagrange’s Theorem 37, 82
linear combination 5
linear independence 6
linear operator 11
linear transformation 6
Main Theorem on Galois Automorphisms 88
minimum polynomial of a complex number 66
normal subgroup 18, 101
octahedron 41–42
orbit 28
Orbit Counting Formula 45
orthogonal basis 22
orthogonal group, O(n) 23
orthogonal matrix 22
orthogonal operator 22
orthogonal vectors 22
permutation matrix 24
p-group (finite) 98
Primitive Element Theorem 92–93
projection operator 19
quotient group 97
regular polyhedron 39
resolvent cubic 83
scalars 4
similar matrices 18
simple group 44
singular matrix 15
SL(n, R) 20
solvability by radicals 102
solvable group 102
spanning set 5
special orthogonal group, SO(n) 23
splitting field 66
subspace 5
symmetric polynomial 70
tetrahedron 39–41
transcendental number 67
transitive group action 37
vector space 4
Waring’s Theorem 71
