Solomon: Algebra 2 Notes
ABSTRACT ALGEBRA II
Table of Contents
Chapter 0: Introduction
Chapter 1: Review of Linear Algebra
Chapter 2: Linear Operators
Chapter 3: Inner Products and Orthogonal Matrices
Chapter 4: Permutations, Orbits, and Lagrange’s Theorem
Chapter 5: The Platonic Solids and their Symmetries
Chapter 6: The Orbit Counting Formula
Chapter 7: Finite Subgroups of SO(3)
Chapter 8: Imaginaries and Galois Fields
Chapter 9: Symmetric Polynomials & the Fundamental Theorem of Algebra
Chapter 10: The Cubic and Quartic Equations Revisited
Chapter 11: Galois’ Theory of Equations
Chapter 12: The Galois Correspondence
Index
Chapter 0: Introduction
(1) F^n is the vector space of all n-tuples (a1, . . . , an) with ai ∈ F, with the
operations of position-wise addition and scalar multiplication. This is just
the obvious generalization of R^n, considered algebraically;
(2) F is the set of all real-valued functions f : R → R, with pointwise addition
and scalar multiplication, i.e.,
(f + g)(x) = f(x) + g(x)
and
(a · f)(x) = a · f(x) for all a ∈ R, x ∈ R.
and
c · (a1 · v1 + . . . an · vn ) = (ca1 ) · v1 + · · · + (can ) · vn .
⇤
Definition 1.4. We say that V is a finite-dimensional vector space if there exists
a finite set {v1 , . . . , vn } of vectors which spans V .
Lemma 1.5. Suppose V is a finite-dimensional vector space. Let B = {v1 , . . . , vn }
be a minimal spanning set for V . Then every vector v 2 V is uniquely express-
ible as a linear combination of the vectors in B.
Proof. Let v ∈ V. By definition of a spanning set, v is a linear combination of the
vectors in B. Suppose the expression is not unique, i.e.,
v = a1 · v1 + · · · + an · vn = b1 · v1 + · · · + bn · vn,
with some ai ≠ bi. Rearranging, we may assume that an ≠ bn. Then
vn = (1/(bn − an)) · ((a1 − b1) · v1 + · · · + (an−1 − bn−1) · vn−1).
But then {v1, . . . , vn−1} is a spanning set for V, properly contained in B, contradicting
the minimal choice of B. Hence the expression for each vector v is unique. □
We call any minimal spanning set for V a basis for V .
This is the important point: If B = {v1 , . . . , vn } is a basis for V , then every
vector v has a unique set (a1 , a2 , . . . , an ) of coordinates with respect to the basis
B.
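This coordinatization is easy to carry out numerically. The following minimal sketch (my own example, not part of the notes) uses numpy to find the coordinates of a vector v with respect to a basis of R^3 by solving the linear system B·a = v, where the columns of B are the basis vectors.

import numpy as np

# Basis vectors of R^3, chosen arbitrarily for illustration; they form the columns of B.
v1, v2, v3 = [1, 0, 0], [1, 1, 0], [1, 1, 1]
B = np.column_stack([v1, v2, v3]).astype(float)

v = np.array([2.0, 3.0, 5.0])
a = np.linalg.solve(B, v)          # the unique coordinates of v, since B is invertible
print(a)                           # [-1. -2.  5.], i.e. v = -1*v1 - 2*v2 + 5*v3
print(np.allclose(B @ a, v))       # True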
Definition 1.6. Two real vector spaces V and W are isomorphic if there is a
bijective function f : V → W such that
(1) f(v + v′) = f(v) + f(v′) for all v, v′ ∈ V ; and
(2) f(a · v) = a · f(v) for all a ∈ F, v ∈ V.
In other words, f : V → W is an invertible linear transformation. We write
V ≅ W.
Theorem 1.7. Let V be a finite-dimensional real vector space with a basis B =
{v1, . . . , vn}. Then V ≅ F^n.
Proof. For each v ∈ V, let
v = a1 · v1 + · · · + an · vn
be the unique expression for v as a linear combination of the vectors in B. Define
f : F^n → V by
f(a1, . . . , an) = a1 · v1 + · · · + an · vn.
Uniqueness of expression implies that this function is one-to-one. Since B is a basis
for V, the map is surjective. Hence f is a bijective function. It is easy to check
that f is a linear transformation. □
Thus, every finite-dimensional vector space can be coordinatized and thereby
identified with F^n. The identification is not at all “natural”. Each choice of basis
gives a different coordinatization. However, as we shall see, the dimension of V is
a fixed number, independent of the choice of coordinatization.
Example 1.8: Let Pn be the set of all polynomials (with real coefficients) of
degree at most n. Clearly, each polynomial p(x) is uniquely expressible as
p(x) = a0 · 1 + a1 · x + · · · + an · xn ,
for some a0 , a1 , . . . , an 2 R. In other words, {1, x, x2 , . . . , xn } is a basis for Pn , and
the map
p(x) ! (a0 , a1 , . . . , an )
defines an isomorphism between Pn and Rn+1 .
But, again, the basis {1, x, . . . , xn }, though an obvious one is by no means the
only one. Indeed the whole subject of orthogonal polynomials investigates other
choices of basis (such as Legendre polynomials) which are more useful for certain
applications. We will not pursue this theme here.
Definition 1.9. A set S = {v1 , . . . , vk } of vectors is linearly independent if the
only linear combination of v1 , . . . , vk which equals the zero vector is the “trivial”one:
0 · v1 + · · · + 0 · vk = 0.
In other words, 0 is uniquely expressible as a linear combination of the vectors in
S.
c1 · f (v1 ) + · · · + cm · f (vm ) = 0,
then
f (c1 · v1 + · · · + cm · vm ) = 0,
and then , since f is injective,
c1 · v1 + · · · + cm · vm = 0.
v = a1 · v1 + · · · + am · v m .
Then
w = f (v) = a1 · f (v1 ) + · · · + am · f (vm ).
Hence {f (v1 ), . . . , f (vm )} is a spanning set for W .
⇤
Theorem 1.12. The following conclusions hold:
(1) If F^n ≅ F^m, then m = n;
(2) If V is a finite-dimensional vector space, then every basis B of V has the
same cardinality n. (We call this number the dimension of V, dim(V ).);
and
(3) If dim(V ) = n, then every linearly independent subset of V has cardinality
at most n.
c1 · v1 + · · · + cn · vn + c · v = 0.
If c = 0, then B is not a linearly independent set, contrary to assumption. Hence
c ≠ 0 and we can solve for v:
v = −(1/c) · (c1 · v1 + · · · + cn · vn).
Hence B is a spanning set for V. Suppose that
a1 · v1 + · · · + an · vn = b1 · v1 + · · · + bn · vn.
Then
(a1 − b1) · v1 + · · · + (an − bn) · vn = 0.
Since B is a linearly independent set, ai = bi for all i. Hence each vector in V is
uniquely expressible as a linear combination of the vectors in B, i.e., B is a basis
for V, as claimed. □
From this, we get the following very important extendibility result.
Theorem 1.14. Let V be a finite-dimensional vector space. Let U be a subspace
of V. Then U is also finite-dimensional, with dim(U) ≤ dim(V ) and with equality
only if U = V. Moreover, if B is any basis for U, then it is extendible to a basis
B* for V.
Note. By convention, if U = {0}, then the empty set is a basis for U , and
dim(U ) = 0.
Proof. Suppose dim(V ) = n. By the remark, we may assume that U contains a
non-zero vector u. Then {u} is a linearly independent subset of U. Since any
subset of U containing more than n vectors is linearly dependent, there must be a
maximal linearly independent subset B of U with |B| ≤ n. Then B is a basis for
Exercises
1. Let X be any non-empty set. Let F(X, Rn ) be the set of all functions with
domain X and co-domain Rn , (The actual range of any one of these functions may
be a proper subset of Rn . Indeed, it may be a single point in Rn .) Define addition
and scalar multiplication pointwise, i.e., if f and g are functions in F(X, Rn ), and
if f (x) = (a1 , a2 , . . . , an ), g(x) = (b1 , b2 , . . . , bn ), and if c is a real number, then
Prove: The set of all solutions of this system of equations is a vector subspace of
Rn .
7. Let y′′′(t) + a · y′′(t) + b · y(t) = 0 be a linear differential equation, for some
a, b ∈ R. Prove: The set of all solutions of this differential equation is a vector
subspace of the space D of all everywhere differentiable functions f : R → R.
[Note: There is nothing special about three derivatives. This is just an example.
The same statement would be true for arbitrary n-th order linear ODEs.]
8. Let V be a vector space. Let U and W be subspaces of V .
(a) Prove: U ∩ W is a subspace of V.
(b) Prove: U ∪ W is a subspace of V if and only if U ⊆ W or W ⊆ U.
(c) Let U + W := {v = u + w ∈ V : u ∈ U and w ∈ W }. Prove that U + W is a
subspace of V.
(d) Prove: Suppose U ∩ W = {0}. Suppose that B is a finite basis for U and B1
is a finite basis for W. Then B ∪ B1 is a finite basis for U + W. (It is then common
to denote U + W as U ⊕ W, and call it the direct sum of U and W.)
(e) Prove: Suppose U + W is finite-dimensional. It is possible to choose a basis
B for U and a basis B1 for W such that B ∩ B1 is a basis for U ∩ W.
(f) Show by example that, in general, if B is a basis for U and B1 is a basis for
W, then B ∩ B1 is NOT a basis for U ∩ W.
9. Let V be a real vector space. We say that a (possibly infinite) subset B of
V is a basis for V if and only if every vector v 2 V is uniquely expressible
as a finite linear combination of vectors in B. (Thus, B is a linearly independent
spanning set for V .) Suppose that B is a basis for V .
Prove: V is isomorphic to the subspace F0 of the vector space F(B, R) defined
by: f 2 F0 if and only if f (b) = 0 for all but finitely many b 2 B.
[Hint: Define : V ! F(B, R) as follows. If v = c1 · b1 + · · · + cn · bn for some
b1 , . . . , bn in B and some scalars c1 , . . . , cn , let (v) be the function fv : B ! R by:
fv (bi ) = ci , 1 i n, fv (b) = 0 for all b 2 B {b1 , . . . , bn }.]
Note: Using the Axiom of Choice, it is possible to prove that every vector space
has a basis.
The most elementary non-trivial class of functions from Rn to Rn are the linear
operators:
f (x1 , x2 , . . . , xn ) = (a11 ·x1 +a12 ·x2 +· · ·+a1n ·xn , . . . an1 ·x1 +an2 ·x2 +· · ·+ann ·xn ).
Then, if

A = ( a11  a12  . . .  a1n
      a21  a22  . . .  a2n
      . . .
      an1  an2  . . .  ann )

we have

f(v) = A · x, where x = (x1, x2, . . . , xn)^T is regarded as a column vector.

In the finite-dimensional case, the use of a basis and coordinates affords the
relationship between the two definitions.
p(D) := an · D^n + · · · + a1 · D + a0 · I
is a linear differential operator on P, and more generally, on the space of all infinitely
differentiable functions of a real variable.
T (V ) = {T (v) : v 2 V }.
The following theorem is fundamental.
Theorem 2.4. Let T : V ! V be a linear operator on the finite-dimensional vector
space V . Then
dim(Ker(T )) + dim(T (V )) = dim(V ).
Ker(T ) \ V1 = {0}.
We claim that T1 is an isomorphism of vector spaces.
Note that V = Ker(T ) + V1 . Let v 2 V . Write v = u + v1 with u 2 Ker(T ),
v1 2 V1 . Then
T(w) = λ · w
for some scalar λ. We then call w an eigenvector for T with eigenvalue λ.
Examples
1. Let D : P → P be the differentiation operator. Since D(f) is of lower degree
than f, the only possible way that D(f) = λ · f is if λ = 0, i.e., D(f) = 0. Thus
the only eigenvectors for D are the non-zero constant functions and the only eigenvalue
is 0. (On the other hand, if we enlarge our space from P to C^∞(R), the space of
infinitely differentiable real-valued functions, then D : C^∞(R) → C^∞(R) is still a
linear operator, and now the function f(x) = e^{λx} is an eigenfunction for D with
eigenvalue λ, for any real number λ.)
2. Let r : R^2 → R^2 be the reflection across the line y = mx. Then the line y =
mx is an r-invariant 1-dimensional subspace of R^2, and so (1, m) is an eigenvector for
r with eigenvalue 1. Also, if m ≠ 0, the orthogonal line y = −(1/m)·x is r-invariant, but
each vector on this line is mapped to its negative. Hence (m, −1) is an eigenvector
for r with eigenvalue −1. [If R : R^2 → R^2 is the reflection across the x-axis, y = 0,
then the y-axis is also R-invariant, and (0, 1) is an eigenvector with eigenvalue −1.]
3. If ρ = ρ_θ : R^2 → R^2 is a non-identity rotation, then no 1-dimensional subspace
(line through (0, 0)) is mapped to itself, unless θ = π, in which case each vector
is mapped to its negative, and so every non-zero vector is an eigenvector for ρ_π
with eigenvalue −1. (By convention, the zero vector is never considered to be an
eigenvector.)
There is a lovely strategy for finding eigenvectors. Let T : V → V be a linear
operator. Fix a basis B and a coordinatization for V relative to B. Then T(x) =
A · x for some matrix A. We wish to solve the matrix equation:
A · x = λ · x.
Bringing everything to the left side of the equation, this is equivalent to solving
(A − λ·I) · x = 0.
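In practice, for a concrete matrix the eigenvalue problem can be solved numerically. Here is a small illustration (the matrix is my own example, not from the notes) using numpy.

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigenvalues, eigenvectors = np.linalg.eig(A)   # columns of `eigenvectors` are eigenvectors
print(eigenvalues)                             # 3.0 and 1.0
for lam, x in zip(eigenvalues, eigenvectors.T):
    print(np.allclose(A @ x, lam * x))         # True for each eigenpair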
{v ∈ R^n : A · v = λ · v},
i.e. it is the set of all eigenvectors for A with eigenvalue λ, plus the vector 0.
Lemma 2.7. The λ-eigenspace for A is the null space of A − λ·I, i.e.
{v ∈ R^n : (A − λ·I) · v = 0}.
Now, for any given number λ, the problem of finding the null space of A − λ·I
is the problem of solving a certain homogeneous system of n linear equations in
n unknowns. But, for almost all λ, this system will have (0, 0, . . . , 0) as its only
solution. The Eigenvalue Problem is:
Determine those (few) values of λ for which the system has a non-zero solution.
This will be true if and only if the matrix A − λ·I is singular, i.e. not invertible.
And this property can be determined by the determinant det(A − λ·I). There is
a very interesting general theory of the determinant. We will restrict our attention
to the case of n × n matrices with n = 2 or n = 3. Even then, we will just sketch
the ideas, from a geometric viewpoint.
Definition 2.8. Let A = ( a  b
                          c  d ).
Then the determinant of A is
det(A) = ad − bc.
Now let A be the 3 × 3 matrix
A = ( a  b  c
      d  e  f
      g  h  m ).
Then
det(A) = a · det( e  f )  −  b · det( d  f )  +  c · det( d  e )
                ( h  m )             ( g  m )             ( g  h ).
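As a quick numerical sanity check (the example matrix is mine), the cofactor expansion above agrees with numpy's built-in determinant:

import numpy as np

def det2(a, b, c, d):
    # determinant of the 2 x 2 matrix (a b; c d)
    return a * d - b * c

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])
(a, b, c), (d, e, f), (g, h, m) = A
expansion = a * det2(e, f, h, m) - b * det2(d, f, g, m) + c * det2(d, e, g, h)
print(expansion, np.linalg.det(A))   # both equal -3 (up to rounding)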
We recall the definition of the cross product of two vectors in R3 . We use the
standard notation i = (1, 0, 0), j = (0, 1, 0), k = (0, 0, 1).
It follows that
Lemma 2.12. det(A) = 0 if and only if the rows of A form a linearly dependent
set of vectors if and only if the null space of A contains a non-zero vector.
We will need a few additional facts about determinants, the first two of which
are somewhat difficult to prove. We refer students to other books for their proof.
Determinant Theorems 2.13. Let A and B be n × n matrices. The following
properties hold:
(1) det(AB) = det(A) det(B);
(2) det(A^T) = det(A);
(3) If det(A) ≠ 0, then A^{-1} exists and det(A^{-1}) = 1/det(A).
Note that, by the Intermediate Value Theorem, every real cubic polynomial
crosses the x-axis at least one time. Indeed, cA (x) has either one real root and a
pair of complex conjugate roots, or cA (x) has three real roots. Thus A has at least
one real eigenvalue. Of course, a real root may occur with multiplicity 1, 2, or 3.
Computing det(x·I − A) is a bit tedious, but we can make two easy and important
observations.
Lemma 2.15. Let A be a 3 × 3 matrix and let c_A(x) be the characteristic polynomial
of A. Write
c_A(x) = x^3 − c2·x^2 + c1·x − c0.
Then c2 = Tr(A) is the trace of the matrix A, and c0 = det(A) is the determinant
of the matrix A.
Proof. Considering the matrix

x·I − A = ( x − a11    −a12       −a13
            −a21       x − a22    −a23
            −a31       −a32       x − a33 ),

we see that the cubic and quadratic terms of c_A(x) all come from the product:
(x − a11)(x − a22)(x − a33) = x^3 − (a11 + a22 + a33)·x^2 + · · · = x^3 − Tr(A)·x^2 + · · · .
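To see Lemma 2.15 in action, the relations c2 = Tr(A) and c0 = det(A) can be checked symbolically; the following sketch (matrix chosen by me) uses sympy.

import sympy as sp

x = sp.symbols('x')
A = sp.Matrix([[2, 1, 0],
               [0, 3, 1],
               [1, 0, 1]])
cA = (x * sp.eye(3) - A).det().expand()
print(cA)                        # x**3 - 6*x**2 + 11*x - 7
print(A.trace(), A.det())        # 6 and 7, matching c2 and c0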
(1) c1 · u + c2 · v + c3 · w = 0.
Since no two of the vectors u, v, w are collinear, ci ≠ 0 for all i. Apply T to both
sides of this equation, getting:
(2) c1·λ · u + c2·µ · v + c3·ν · w = 0.
Multiplying equation (1) by λ gives:
(3) c1·λ · u + c2·λ · v + c3·λ · w = 0.
Subtracting (3) from (2), we get:
(4) c2·(µ − λ) · v + c3·(ν − λ) · w = 0.
Since µ ≠ λ and ν ≠ λ, we can solve for w:
w = (c2·(λ − µ) / (c3·(ν − λ))) · v,
whence v and w are collinear, a contradiction proving the lemma. □
Suppose that u, v and w are eigenvectors for A with eigenvalues λ, µ, ν, not
necessarily distinct. Suppose that {u, v, w} forms a basis for R^3. Form the matrix
C whose columns are the coordinate vectors for u, v, and w with respect to the
standard basis for R^3. Then the matrix AC has columns Au = λ · u, Av = µ · v,
and Aw = ν · w. Hence

AC = CD := C · ( λ  0  0
                 0  µ  0
                 0  0  ν ).

Since {u, v, w} is a basis for R^3, C is an invertible matrix and

C^{-1} A C = D = ( λ  0  0
                   0  µ  0
                   0  0  ν ).

We say that A is similar to the diagonal matrix D. We also say that A is diagonalizable.
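Numerically, the diagonalization C^{-1} A C = D can be carried out as follows (a sketch with a matrix of my own choosing; numpy returns unit eigenvectors as the columns of C).

import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
eigenvalues, C = np.linalg.eig(A)      # columns of C are eigenvectors of A
D = np.linalg.inv(C) @ A @ C
print(np.round(D, 10))                 # diagonal matrix with the eigenvalues 3, 1, 5 (in some order)
print(eigenvalues)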
Definition 2.17. Two n × n matrices A and D are said to be similar if there
exists an invertible n × n matrix C such that C^{-1} A C = D.
It is left as an exercise to show that the relation of similarity is an equivalence
relation on the set of n ⇥ n matrices. The equivalence classes are called similarity
classes.
Now is a good time to recall some definitions from Math 4580. Indeed, this
would be a good time for you to start reviewing the material in Chapters 9 through
12 of the Math 4580 notes.
Definition 2.18. A nonempty set G of invertible functions is a group of func-
tions if G is closed under composition of functions and under taking inverses. [See
Definition 10.3 on page 84 of the Math 4580 notes.]
Definition 2.19. Two functions f and f1 in a group G are called conjugate if
there is a function g ∈ G such that f1 = g ∘ f ∘ g^{-1}. Conjugacy is an equivalence
relation on the set G, and the equivalence classes are called conjugacy classes.
[See page 78 of the Math 4580 notes.]
Notice that the intersection of a similarity class of n ⇥ n matrices with the group
GL(n, R) of all invertible n ⇥ n matrices is a conjugacy class in this group. [See
Exercise 8.]
The following definition will be needed in Exercise 8.
Definition 2.20. Let G be a group (of functions) and H a subgroup of G. We say
that H is a normal subgroup of G if
H = g H g^{-1} = {g h g^{-1} : h ∈ H}
There are two interpretations for a pair of similar matrices, A and D, in terms
of linear operators. Holding one basis, B, fixed, A and D are the matrices for two
di↵erent linear operators with respect to this fixed choice of basis. On the other
hand, we may regard the invertible matrix C as a change of basis matrix, and then
we may interpret A and D as two di↵erent matrix representations for the same
linear operator T : Rn ! Rn with respect to two di↵erent bases. Thus, in the
diagonalizable case, it is convenient to think as follows:
T : R3 ! R3 is a linear operator whose matrix is A with respect to the standard
basis {(1, 0, 0), (0, 1, 0), (0, 0, 1)} for R3 . On the other hand, there is a better choice
of basis for the purpose of understanding T geometrically. Namely, there are three
eigenlines R·u, R·v, R·w, such that T “stretches” (or maybe shrinks) these lines
by scaling factors λ, µ, and ν, respectively. With respect to the basis {u, v, w}, T
is represented by the diagonal matrix D. So, A and D represent the same linear
operator T with respect to di↵erent choices of basis.
Exercises
1. Let f : V ! V be a linear operator on the finite-dimensional real vector space
V . Let B be a basis for V . Let A be the matrix which represents f with respect
to the basis B.
(a) Prove: f is invertible if and only if f (B) is a basis for V .
(b) Prove: f is invertible if and only if A is an invertible matrix.
2. Prove: The di↵erentiation operator D : P ! P is a linear operator.
3. Prove: The integration operator J : C ! C is a linear operator.
4. Prove: D J : P ! P is the identity operator on P, but J D : P ! P is
not the identity operator.
5. Let T : V ! V be a linear operator.
(a) Prove: The range T (V ) is a subspace of V .
(b) Let V_λ(T) := {v ∈ V : T(v) = λ · v}. Prove: V_λ(T) is a T-invariant subspace
of V.
(c) Prove: Ker(T) = V_0(T).
(d) Prove: V_λ(T) = Ker(T − λ · I).
6. Let W be a subspace of the real vector space V . We say that a linear operator
P : V ! V is a projection operator onto W if P (V ) = W and P (w) = w for all
w 2 W.
Prove: A linear operator P : V ! V is a projection operator if and only if P
P := P 2 = P . [Hint: If P 2 = P , write every vector v 2 V as v = P (v) + (v P (v)).
Conclude that V = V0 (P ) + V1 (P ) and P is a projection operator onto V1 (P ).]
7. Let u and v be vectors in R3 .
ax + by + cz = 0.
Let R⇧ : R3 ! R3 be the reflection map across the plane ⇧. Let Pn : R3 ! R3 be
the orthogonal projection map onto the line through the normal vector n = (a, b, c)
to the plane ⇧. Let P⇧ : R3 ! R3 be the orthogonal projection map onto the plane
⇧.
(a) Prove: For all (x, y, z) ∈ R^3,
P_n(x, y, z) = ((ax + by + cz)/(a^2 + b^2 + c^2)) · n.
u · v = u1·v1 + · · · + un·vn.
In particular, u · u is the square of the Euclidean distance from (u1 , . . . , un ) to
(0, . . . , 0). Moreover, if u and v are vectors of unit length, then u · v is the cosine
of the angle between them. Indeed, for any vectors u and v,
AT A = I.
Thus we have the following fact.
Lemma 3.1. Let T : R^n → R^n be a linear operator. Let A be the matrix representing
T with respect to the standard basis for R^n. Then T is an isometry if and
only if A^T A = I.
We call a square matrix A orthogonal if and only if A^T A = I, i.e. the columns
of A form an orthonormal basis for R^n (and so do the rows, since A A^T = I as
well). Likewise, we call a linear operator an orthogonal operator if the associated
matrix with respect to the standard basis for R^n is an orthogonal matrix.
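As a concrete check (my own example), a 2 × 2 rotation matrix is orthogonal: its transpose times itself is the identity, and its columns (and rows) are orthonormal.

import numpy as np

theta = np.pi / 6
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
print(np.allclose(A.T @ A, np.eye(2)))   # True: the columns are orthonormal
print(np.allclose(A @ A.T, np.eye(2)))   # True: the rows are orthonormal as well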
From the properties of determinants, we see that
Then a = b.
(b) Let f : Rn ! Rn be an isometry fixing each of the points
Definition 3.8. We call a matrix P a permutation matrix if each row and each
column has exactly one entry equal to 1 and every other entry equal to 0. Equiva-
lently, every entry of P is either 0 or 1, and the columns of P form an orthonormal
basis for Rn , (as do the rows). We let Pn denote the set of all n ⇥ n permutation
matrices.
Theorem 3.9. Let {e1 = (1, 0, . . . , 0), e2 = (0, 1, 0, . . . , 0), . . . , en = (0, 0, . . . , 0, 1)}
be the standard orthonormal basis for R^n. For each permutation σ ∈ S_n, let σ̂ be
the unique linear operator on R^n such that
σ̂(ei) = e_{σ(i)}
for all i, 1 ≤ i ≤ n. Let σ̃ be the matrix representing the linear operator σ̂ with
respect to the standard basis for R^n. Then σ̃ is a permutation matrix, and the
function Θ : S_n → P_n given by
Θ(σ) = σ̃
is an isomorphism of groups.
Proof. Clearly Θ is a bijection. Also,
(Θ(σ) ∘ Θ(τ))(ei) = Θ(σ)(e_{τ(i)}) = e_{σ(τ(i))} = Θ(σ ∘ τ)(ei).
Hence Θ is an isomorphism of groups.
There is a larger interesting subgroup of O(n), the Weyl group W(n).
Definition 3.10. We call a matrix M a signed permutation matrix if each row
and each column has exactly one entry equal either to 1 or to −1, and all other
entries equal to 0. Equivalently, every entry of M is either 0, 1, or −1, and the
columns of M form an orthonormal basis for R^n (as do the rows). We let W(n)
denote the set of all n × n signed permutation matrices.
Theorem 3.11. W(n) is a subgroup of O(n) of cardinality 2^n · n!.
Proof. Clearly, for each permutation matrix σ̃, there are 2^n signed permutation
matrices whose non-zero entries are in the same locations as the 1 entries of σ̃.
Hence the set W(n) has cardinality 2^n · n!, and clearly, by definition, W(n) ⊆ O(n).
Consider the set Z of all matrices in O(n) with integer entries. Clearly, the
product of any two such matrices is another such matrix. Also, if A ∈ O(n), then
A^{-1} = A^T. Hence if A has integer entries, then so does A^{-1}. As I has integer
entries, Z is a subgroup of O(n). But now let
(a1, a2, . . . , an)^T
be a column of some matrix A in Z. Then this column vector is a unit vector. So
a1^2 + a2^2 + · · · + an^2 = 1.
Since each ai ∈ Z, exactly one ai = ±1 and the other aj’s are all 0. Moreover
the columns of A form an orthonormal set. So Z = W(n), proving that W(n) is a
subgroup of O(n).
We will use permutation matrices and signed permutation matrices in the next
chapter when we study the symmetry group of the regular tetrahedron. But now
we return to general properties of orthogonal matrices. There are very limited
possibilities for the real eigenvalues of orthogonal matrices.
Lemma 3.12. Let A ∈ O(n) and let λ be a (real) eigenvalue of A. Then λ = ±1.
Proof. Let u be an eigenvector for A with eigenvalue λ. Then
u^T u = (Au)^T (Au) = (λ · u^T)(λ · u) = λ^2 · (u^T u).
Since u ≠ 0, u^T u ≠ 0 and so λ^2 = 1. Hence λ = ±1, as claimed.
Once again, we restrict attention to the 3-dimensional case. Let T : R^3 → R^3 be
a linear isometry. Since the characteristic polynomial c_T(x) is a cubic polynomial,
c_T(x) has at least one real root ε, and by Lemma 3.12, ε = ±1. Let u be a unit
eigenvector for T with eigenvalue ε. Since T is an isometry,
u⊥ := {v ∈ R^3 : u · v = 0}
is a T-invariant plane in R^3 (with normal vector u). Since T : u⊥ → u⊥ is an
isometry, T acts as either a rotation or a reflection on the plane u⊥. Hence there is
an orthonormal basis {v, w} for the plane u⊥ such that the matrix for the T-action
on u⊥ with respect to this basis is:

( cos(θ)  −sin(θ)        or      ( cos(θ)   sin(θ)
  sin(θ)   cos(θ) )                sin(θ)  −cos(θ) )

for some angle θ, 0 ≤ θ < 2π.
Thus the matrix for T with respect to the orthonormal basis {u, v, w} for R^3 is:

( ε  0        0                  ( ε  0        0
  0  cos(θ)  −sin(θ)      or       0  cos(θ)   sin(θ)
  0  sin(θ)   cos(θ) )             0  sin(θ)  −cos(θ) ).

In the case where T acts as a reflection on the plane u⊥, we can make a special
choice of orthonormal basis for u⊥: Take v′ to be a unit vector along the reflecting
mirror for T, and take w′ to be a unit vector perpendicular to v′. Then
T(v′) = v′ and T(w′) = −w′.
Hence, with respect to the orthonormal basis {u, v′, w′} for R^3, the matrix for T is:

( ε  0   0
  0  1   0
  0  0  −1 ).

If T ∈ SO(3) and the restriction of T to u⊥ is a reflection, then 1 = det(T) = −ε.
Hence ε = −1 and we see that T is a 180° rotation of R^3 around the axis determined
by the eigenvector v′. On the other hand, if T ∈ SO(3) and the restriction of T to
u⊥ is a rotation, then 1 = det(T) = ε, and T is a rotation of R^3 through an angle θ
about the axis determined by the eigenvector u. Thus we have the following result.
Exercises
1. Prove: Let A be an n × n matrix such that u^T A v = u^T v for all vectors u,
v ∈ R^n. Then A = I, the identity matrix. [Hint: Let e_i be the unit vector with
0 in the j-th entry for all j ≠ i and with 1 in the i-th entry. Compute e_i^T A e_j and
compare with e_i^T e_j.]
2. In this exercise, you may use basic properties of the dot product in Rn . Let
u and v be vectors in Rn .
(a) Prove: u · v = (1/2)·(||u + v||^2 − ||u||^2 − ||v||^2).
(b) Prove: Let T : Rn ! Rn be a linear isometry. Then T (u) · T (v) = u · v for
all u, v 2 Rn .
(c) Prove: Let T : R3 ! R3 be a linear isometry. Suppose that u 2 R3 is an
eigenvector for T . Then the plane
u? := {v 2 R3 : u · v = 0}
is a T -invariant subspace of R3 .
3. Using Determinant Theorems 2.13, prove: If A is an orthogonal matrix, then
det(A) = ±1.
4. Prove Theorem 3.4: O(n) is a subgroup of GL(n, R), and SO(n) is a normal
subgroup of index 2 in O(n).
5. Let D(n) be the set of all n ⇥ n diagonal matrices (i.e., every entry o↵ the
main diagonal is 0) such that every diagonal entry is ±1.
(a) Prove: W (n) = D(n) · Pn = {DP : D 2 D(n), P 2 Pn }.
(b) Prove: D(n) is a normal subgroup of W (n).
6. Prove: Let R : R^3 → R^3 be a linear operator. Then R is a reflection across a
plane Π through (0, 0, 0) if and only if the following three conditions hold:
(1) R has a 1-dimensional eigenspace U with eigenvalue −1;
(2) R has a 2-dimensional eigenspace W with eigenvalue 1; and
(3) U ⊥ W.
7.(a) Give an example to show that not every linear isometry of R3 is a rotation
or a reflection.
(b) Prove: If T : R3 ! R3 is a linear isometry, then either T is a rotation or
T = R ⇢, where R is a reflection and ⇢ is a rotation (possibly the identity rotation).
8. Prove the following theorem of Euler: Let f : R3 ! R3 be an isometry (not
necessarily linear). Then either f = Tv ⇢ or f = Tv R ⇢, where Tv : R3 ! R3
is the translation by the vector v 2 R3 , ⇢ is a rotation about (0, 0, 0), and R is a
reflection across a plane passing through (0, 0, 0).
9. Prove the following theorems of Euler.
(a) Let f = Tv ⇢ be an isometry, using the notation of Exercise 8, with ⇢ not
the identity rotation. Let u be an eigenvector for ⇢ with eigenvalue 1. Then f is a
rotation about some point in R3 if and only if u · v = 0, i.e., v lies in the plane u? .
Moreover, if f is a rotation, then the axis of rotation is parallel to the vector u.
(b) Let f = Tv ⇢ be as in (a). Write v = u1 + w1 , where u1 is the orthogonal
projection of v onto the line through u, and w1 is the orthogonal projection of v
into the plane u? . Then
f = T_{u1} ∘ (T_{w1} ∘ ρ) = T_{u1} ∘ ρ1,
where ρ1 := T_{w1} ∘ ρ is a rotation about an axis parallel to the vector u, and T_{u1}
is a translation by the vector u1, which is also parallel to the vector u. [Note: If
u1 ≠ 0, then f is a screw motion along the axis of the rotation ρ.]
10. Prove: The set H defined at the end of this chapter is a subgroup of GL(3, R).
This can become tedious. Cauchy devised a somewhat more efficient notation for
these functions, called cycle notation. It is based on the following principle.
Definition 4.1. Let H be a group of permutations of a set X. The H-orbit
containing the point x is the set x^H := {h(x) : h ∈ H}.
Definition 4.2. Let σ be a permutation of the set X. Recall that the cyclic group
generated by σ is the set ⟨σ⟩ := {σ^i : i ∈ Z}. We write x^σ for x^{⟨σ⟩}, and call it
the σ-orbit containing the point x.
We shall be particularly interested in the case where σ has finite order.
Examples
1. If r is a reflection across a line in R^2, then r has order 2, since r^2 = r ∘ r = I,
but r ≠ I.
2. If ρ is a 120° rotation of R^2 about the point P, then ρ has order 3, since
ρ^3 = I, but ρ ≠ I ≠ ρ^2.
n = q · m + r
with q, r ∈ Z, and 0 ≤ r < m, as given by the Division Algorithm. Then σ^n = σ^r.
Hence
⟨σ⟩ = {I = σ^0, σ, σ^2, . . . , σ^{m−1}}.
Proof.
σ^n = σ^{q·m+r} = (σ^m)^q ∘ σ^r = I^q ∘ σ^r = σ^r. □
Now suppose that σ is a permutation of the set X and σ has order m. Let x be
a point of X. Then we can enumerate the elements of x^σ as:
x^σ = {x, σ(x), σ^2(x), . . . , σ^{m−1}(x)}.
You might be tempted to guess that the σ-orbit containing x always has cardinality
m, where m is the order of the permutation σ. But it is easy to see that this
is not the case. For example, if X = R^2 and ρ is a 120° rotation about the point P,
then ρ is a permutation of the set X and ρ has order 3, but the ρ-orbit containing
the point P is simply {P}, since ρ(P) = ρ^2(P) = P. So there can be repetitions in
the set
{x, σ(x), σ^2(x), . . . , σ^{m−1}(x)}.
A more interesting example is the following one:
Let X = {1, 2, 3, 4, 5}. Let σ : X → X be the permutation defined by:
σ(1) = 2, σ(2) = 3, σ(3) = 1, σ(4) = 5, σ(5) = 4.
Then
1^σ = {1, σ(1), σ^2(1), σ^3(1), σ^4(1), σ^5(1)} = {1, 2, 3, 1, 2, 3} = {1, 2, 3}.
We cycle through the same numbers twice. Similarly,
4^σ = {4, σ(4), σ^2(4), σ^3(4), σ^4(4), σ^5(4)} = {4, 5, 4, 5, 4, 5} = {4, 5}.
This time we cycle through the same numbers three times. So the σ-orbits on X
have cardinality 2 and 3, but σ has order 6.
In general, if O is a σ-orbit on the set X, we may define the permutation σ_O :
O → O to be the function σ with its domain restricted to O. If
O = {x, σ(x), σ^2(x), . . . , σ^{k−1}(x)}
with σ^k(x) = x, but σ^i(x) ≠ x for all i, 1 ≤ i < k, then clearly σ_O^i ≠ I for 1 ≤ i < k,
but σ_O^k = I, since
σ_O^k(σ^i(x)) = σ^k(σ^i(x)) = σ^i(σ^k(x)) = σ^i(x)
for all i. So, σ_O has order k = |O|. Since σ_O^m = σ^m = I, it follows, by Exercise 3b
of Chapter 5, that
x ≡_H y if and only if x ∈ y^H.
Then ≡_H is an equivalence relation. The equivalence classes are the H-orbits on
X.
Proof. We must check the three properties of an equivalence relation.
Reflexivity: Let x ∈ X. Since I ∈ H, x = I(x) ∈ x^H. So x ≡_H x.
Symmetry: Suppose x ≡_H y. Then x ∈ y^H. Hence, there exists h ∈ H with
h(y) = x. Then h^{-1} ∈ H and h^{-1}(x) = y. So y ∈ x^H, i.e., y ≡_H x.
Transitivity: Suppose x ≡_H y and y ≡_H z. Then there exist h, h′ ∈ H with
h(y) = x and h′(z) = y. Then h ∘ h′ ∈ H and (h ∘ h′)(z) = h(h′(z)) = h(y) = x.
So x ≡_H z. □
⇢ = (A, B, C, D, E, F, G, H).
This signifies that ⇢ maps the first entry in the 8-tuple, A, to the second entry, B;
the second entry B to the third entry C, . . . , and finally, ⇢ maps the last entry
H back to the first entry A. The 8-tuple (A, B, C, D, E, F, G, H) is called a cycle.
The symbols in a cycle comprise all of the elements in some ⇢-orbit. In this case,
there is only one ⇢-orbit.
We can compute powers of a permutation by leap-frogging in the cycle. For
example, suppose we wish to compute the cycle structure of the 135o rotation
⇢3 . Since ⇢(A) = B, ⇢2 (A) = ⇢(B) = C, and ⇢3 (A) = ⇢(⇢2 (A)) = ⇢(C) = D. So we
leapfrog over two symbols in the cycle:
⇢3 = (A, D, G, B, E, H, C, F ).
Similarly we can compute the 90° rotation ρ^2 by leapfrogging over one symbol each
time. But now there are two ρ^2-orbits, not just one:
ρ^2 = (A, C, E, G)(B, D, F, H).
In the same way, for the permutation σ of {1, 2, 3, 4, 5} considered above, we compute:
σ   = (1, 2, 3)(4, 5)
σ^2 = (1, 3, 2)
σ^3 = (4, 5)
σ^4 = (1, 2, 3)
σ^5 = (1, 3, 2)(4, 5)
σ^6 = (1).
This clearly verifies the assertion made earlier that σ has order 6. Indeed, it is easy
to see that the following assertion is true.
Lemma 4.6. If the permutation σ is a single cycle of length k, then the order of σ
is k. If the permutation σ is the product of disjoint cycles of lengths k1, k2, . . . , km,
then the order of σ is the least common multiple of {k1, k2, . . . , km}.
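A small computational sketch of Lemma 4.6 (the helper name is mine): the order of a permutation written in disjoint-cycle form is the least common multiple of its cycle lengths.

from math import lcm

def order_from_cycles(cycles):
    # order of a permutation given as a list of disjoint cycles
    result = 1
    for cycle in cycles:
        result = lcm(result, len(cycle))
    return result

sigma = [(1, 2, 3), (4, 5)]          # sigma = (1,2,3)(4,5)
print(order_from_cycles(sigma))      # 6 = lcm(3, 2)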
The fact that Cauchy cycle notation gives a valid representation of a permutation
σ is a consequence of the fact that the σ-orbits on X form a disjoint partition of
the set X. Hence each element of the domain X appears in one and only one cycle.
With some care, we can use the Cauchy notation to multiply permutations.
Thus, suppose we have the permutations ρ = (1, 2, 3)(4, 5) and σ = (1, 3, 4, 2) of
X = {1, 2, 3, 4, 5}. To compute ρ ∘ σ, we work from right to left in the cycles:
In the exercises, you will be asked to practice your skills at the “calculus of
permutations”. Practice makes perfect.
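Since the right-to-left multiplication is easy to get backwards, here is a short sketch (the dictionaries and helper name are mine) that composes ρ = (1,2,3)(4,5) and σ = (1,3,4,2) exactly as described above; it gives ρ ∘ σ = (3,5,4).

def compose(rho, sigma):
    # (rho o sigma)(x) = rho(sigma(x)), i.e. apply sigma first, then rho
    return {x: rho[sigma[x]] for x in sigma}

rho   = {1: 2, 2: 3, 3: 1, 4: 5, 5: 4}    # the permutation (1,2,3)(4,5)
sigma = {1: 3, 3: 4, 4: 2, 2: 1, 5: 5}    # the permutation (1,3,4,2)
print(compose(rho, sigma))                # 1->1, 2->2, 3->5, 4->3, 5->4, i.e. (3,5,4)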
The multiset of numbers listing the lengths of the disjoint cycles (orbits) for a
permutation σ is called the cycle structure of σ. For example, the cycle structure
of the permutation (1, 2, 3)(4, 5)(6, 7)(8)(9)(10) = (1, 2, 3)(4, 5)(6, 7) in S10 is
{3, 2, 2, 1, 1, 1}. The following fact is fundamental.
Lemma 4.7. Let H be a subgroup of Sym(X) and let τ be a permutation of the
set X. Let O be an H-orbit on X. Then τ(O) is a τ ∘ H ∘ τ^{-1}-orbit on X of the
same length as |O|. In particular, if H = τ ∘ H ∘ τ^{-1}, then O and τ(O) are two
H-orbits of the same length. They may or may not be the same orbit.
Proof. Let x and y be in τ(O). Then there exist a and b in O with τ(a) = x,
τ(b) = y. Since a and b are in O, b = σ(a) for some σ ∈ H. Then
(τ ∘ σ ∘ τ^{-1})(x) = (τ ∘ σ)(a) = τ(b) = y.
Hence x and y are in the same τ ∘ H ∘ τ^{-1}-orbit.
On the other hand, let a ∈ O and x = τ(a). Suppose that y is in the same
τ ∘ H ∘ τ^{-1}-orbit as x. Then, for some σ ∈ H,
y = (τ ∘ σ ∘ τ^{-1})(x) = τ(σ(a)).
Since σ(a) ∈ O, y ∈ τ(O).
Hence τ(O) is the τ ∘ H ∘ τ^{-1}-orbit containing x. Since τ is a bijective map on
X, |O| = |τ(O)|. □
As a corollary we obtain the following important fact.
n = n1 + n2 + · · · + nr,
with n1 ≥ n2 ≥ · · · ≥ nr > 0, and all ni ∈ N.
Proof. Let O1, . . . , Or be the σ-orbits on {1, 2, . . . , n}. If σ1 = τ ∘ σ ∘ τ^{-1}, then by
Lemma 4.7, τ(O1), . . . , τ(Or) are the σ1-orbits on {1, 2, . . . , n}, and |Ok| = |τ(Ok)|
for all k. Hence σ and σ1 have the same cycle structure.
Now suppose that σ and σ1 have the same cycle structure. Let (a1, a2, . . . , at)
be a cycle of σ and (b1, b2, . . . , bt) be a cycle of σ1. Let τ be a permutation in Sn
with τ(ai) = bi, 1 ≤ i ≤ t. Then
(τ ∘ σ ∘ τ^{-1})(bi) = (τ ∘ σ)(ai) = τ(a_{i+1}) = b_{i+1}.
(Here we understand t + 1 to be 1.) Hence τ ∘ σ ∘ τ^{-1} and σ1 both contain the cycle
(b1, b2, . . . , bt). Since cycles are disjoint and since σ and σ1 have the same cycle
structure, we can define τ cycle by cycle, so that τ ∘ σ ∘ τ^{-1} = σ1, i.e., σ and σ1
are in the same Sn-conjugacy class. □
We conclude this chapter with a proof of the most important basic theorem in the
theory of groups. It is also historically the first theorem in the theory of groups. It
was stated by Lagrange in 1771, almost 60 years before the word group was coined.
The striking nature of the result convinced mathematicians of the importance of
using groups to organize their thinking about permutations.
Lagrange’s Orbit-Stabilizer Theorem. Let G be a finite group of permutations
of the set X. Let O be a G-orbit on the set X containing the point x. Let
Gx := {g 2 G : g(x) = x}.
Then
|G| = |O| · |Gx |.
g ∘ Gx := {g ∘ h : h ∈ Gx} = {g′ ∈ G : g′(x) = y}.
If h ∈ Gx, then (g ∘ h)(x) = g(h(x)) = g(x) = y. Hence
g ∘ Gx ⊆ {g′ ∈ G : g′(x) = y}.
Now let g′ ∈ G with g′(x) = y = g(x). Then (g^{-1} ∘ g′)(x) = x. Hence g^{-1} ∘ g′ = h
for some h ∈ Gx. So g′ = g ∘ h ∈ g ∘ Gx. Hence
{g′ ∈ G : g′(x) = y} ⊆ g ∘ Gx.
Hence
g ∘ Gx = {g′ ∈ G : g′(x) = y},
completing the proof. □
Our next remark is essentially the same as Lemma 12.2 in the Math 4580 text,
but now for left cosets.
Lemma 4.11. Let g ∈ G. Then the function φ : Gx → g ∘ Gx, defined by
φ(h) = g ∘ h
for all h ∈ Gx, is a bijection of sets. Hence |Gx| = |g ∘ Gx| for all g ∈ G.
Proof. Clearly, by definition of g ∘ Gx, φ is surjective. Suppose that φ(h) = φ(h′).
Then g ∘ h = g ∘ h′, and so, by the Cancellation Law, h = h′. Hence φ is also
injective. So φ is bijective. Now the equality of cardinalities is immediate. □
Now we can proceed to a proof of Lagrange’s Theorem.
Proof of Lagrange’s Theorem. Since O = {g(x) : g ∈ G}, |O| ≤ |G| < ∞. Let
O = {x = x1, x2, . . . , xm}.
Thus |O| = m.
For every g ∈ G, g(x) = xi for a unique i, 1 ≤ i ≤ m. Hence if we set
Gi = {g ∈ G : g(x) = xi},
then
G = G1 ∪ G2 ∪ · · · ∪ Gm,
and
Gi ∩ Gj = ∅ for all i ≠ j.
Hence
|G| = |G1| + |G2| + · · · + |Gm|.
By Lemma 4.10, Gi = gi ∘ Gx, where gi ∈ G and gi(x) = xi. Moreover, by Lemma
4.11,
n is a divisor of n!
However, Sn acts “naturally” on other sets as well. For example, let
X = {(i, j) : 1 ≤ i, j ≤ n}.
Then |X| = n^2 and Sn acts naturally on X via:
σ((i, j)) = (σ(i), σ(j)).
The two Sn-orbits on X are the diagonal Δ := {(i, i) : 1 ≤ i ≤ n} and X − Δ.
As |Δ| = n and |X − Δ| = n^2 − n = n(n − 1), it is easy to verify Lagrange’s Theorem
in this case as well. Lagrange was actually interested in the action of Sn on the
infinite set P^n of all multi-variable polynomials p(x1, x2, . . . , xn), under the action
σ · p(x1, x2, . . . , xn) = p(x_{σ(1)}, x_{σ(2)}, . . . , x_{σ(n)}).
Although P^n is an infinite set, of course all of the Sn-orbits have finite length,
and Lagrange’s Theorem tells us that in fact this length must always divide n!.
We shall return to this setting for Lagrange’s Theorem in Chapter 9, when we
study polynomials and their roots. Now we shall return to geometric considerations
and see how we can use Lagrange’s Theorem to understand the structure of the
symmetry groups of the Platonic solids. But first we review some linear algebra.
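Before the exercises, here is a small brute-force verification of the Orbit-Stabilizer Theorem (the setup is mine): D4, realized as the permutations of the vertices {0, 1, 2, 3} of a square that preserve adjacency, acting on a single vertex.

from itertools import permutations

EDGES = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3), (3, 0)]}

def is_square_symmetry(p):
    # p is a tuple: vertex i is sent to p[i]; it must map edges to edges
    return all(frozenset((p[a], p[b])) in EDGES for a, b in EDGES)

G = [p for p in permutations(range(4)) if is_square_symmetry(p)]   # the 8 symmetries
x = 0
orbit = {g[x] for g in G}                     # all four vertices
stabilizer = [g for g in G if g[x] == x]      # identity and one reflection
print(len(G), len(orbit) * len(stabilizer))   # 8 8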
Exercises
1. List all of the partitions of 6. For each partition ⇡, give a permutation ⇡
in S6 whose cycle structure is given by that partition. For each ⇡ , list all of the
powers of ⇡ and indicate the order of ⇡ .
2. Repeat Exercise 1 for 8 in place of 6.
3. Let ρ = (1, 2, 3, 4), σ = (1, 2, 3)(4, 5) and τ = (2, 4, 5) in S5. In each case
below, write your answer in Cauchy cycle notation.
(a) Compute ρ ∘ σ.
(b) Compute σ ∘ ρ.
(c) Compute σ ∘ τ.
(d) Compute τ ∘ σ.
(e) Compute ρ ∘ τ.
(f) Compute τ ∘ ρ.
(g) Compute ρ ∘ σ ∘ τ.
(h) Compute ρ ∘ τ ∘ σ.
(i) Compute σ ∘ ρ ∘ τ.
(j) Compute σ ∘ τ ∘ ρ.
(k) Compute τ ∘ ρ ∘ σ.
(l) Compute τ ∘ σ ∘ ρ.
4. Let ⇢ : R2 ! R2 be a nonidentity rotation about the point P . Describe
geometrically the ⇢-orbits on the set P of all points of R2 . Does this explain why
orbits are called orbits?
5. Let G be a group. Let N be a normal subgroup of G. Let H be any subgroup
of G. Let
N H = {n h : n 2 N and h 2 H}.
(a) Prove: N H is a subgroup of G.
(b) Suppose that N \ H = {I}. Let h, h0 2 H. Prove: N h=N h0 if and
only if h = h0 . Conclude that |N H| = |N ||H|.
6. Let G be a group. Let g ∈ G and let H be a subgroup of G. Define the
function c_g : H → g ∘ H ∘ g^{-1} by
c_g(h) = g ∘ h ∘ g^{-1} for all h ∈ H.
Prove: c_g is an isomorphism of groups.
7. Consider the group S4 .
(a) Prove that {(1, 2)(3, 4), (1, 3)(2, 4), (1, 4)(2, 3)} is an S4 -conjugacy class.
(b) Let V := {(1), (1, 2)(3, 4), (1, 3)(2, 4), (1, 4)(2, 3)}. Prove that V is a normal
abelian subgroup of S4 isomorphic to the Klein 4-subgroup V4 .
(c) Let H = {(1), (1, 2, 3), (1, 3, 2)}. Prove: V H is a normal subgroup of S4 of
cardinality 12, which is the union of three S4 -conjugacy classes.
(d) Prove that S4 contains exactly three subgroups of cardinality 8, and that
every element of S4 of order 4 is contained in exactly one of these subgroups.
Conclude that these subgroups are S4 -conjugate, and that each is isomorphic to
D4 , the symmetry group of the square.
In the next two exercises we prove another version of Lagrange’s Theorem, using
an argument very similar to that for Lagrange’s Orbit-Stabilizer Theorem.
8. Let G be a group and let H be a subgroup of G. Define a relation on G by:
g ≡_H g1 if and only if g^{-1} ∘ g1 ∈ H.
Λ(g)(x) = g ∘ x
for all g in the group G and all x in the set G.
(a) Prove: Λ defines an isomorphism between G and the subgroup Λ(G) of
Sym(G).
(b) Prove: Either G is cyclic or G is isomorphic to V4 .
16. Let G be a group with |G| = 6.
(a) Prove: G contains elements of order 2 and 3. [Hint: Use Exercises 9 and 10
from Chapter 12 of the Math 4580 text.]
(b) Prove: If G is not cyclic, then G has exactly three elements of order 2: ⌧1 ,
⌧2 , ⌧3 . [Hint: Suppose not. Then G has a unique element ⌧ of order 2. Let g 2 G
of order 3. Prove that g ⌧ = ⌧ g. Conclude that g ⌧ has order 6.]
(c) Prove: If G is not cyclic, then G ⇠= S3 . [Hint: Define the map c : G !
Sym({⌧1 , ⌧2 , ⌧3 }) by:
1
c(g)(⌧i ) = g ⌧i g .
Prove that c is an isomorphism of groups.]
17. In this exercise, we determine all of the subgroups of S4 .
(a) Prove: S4 has exactly one subgroup of cardinality 12. (We call this subgroup
the alternating group on four letters and denote it A4 .)
(b) Prove: If H is a subgroup of S4 containing two distinct cyclic subgroups of
cardinality 3, then H acts transitively on {1, 2, 3, 4}, and hence H = A4 or H = S4 .
(c) Prove: Suppose H is a subgroup of S4 containing the cyclic subgroup K =
h(a, b, c)i. Suppose that H 6= A4 and H 6= S4 . Then either H = K or H = (S4 )d ,
the stabilizer in S4 of the point d. In the latter case, H ⇠
= S3 . [Hint: Apply Exercise
14. You must justify that K is a normal subgroup of H.]
(d) Prove: D4 contains two subgroups isomorphic to V4 and one subgroup iso-
morphic to C4 . All other subgroups have order 1, 2, or 8.
(e) Prove: Every subgroup of S4 is either cyclic of order 1, 2, 3, or 4, or is
isomorphic to V4 , S3 , A4 , or S4 .
18.(a) Prove: Let G be a finite group and let K be a G-conjugacy class. Then
|K| divides |G|.
[Hint: Use Lagrange’s Orbit-Stabilizer Theorem.]
(b) Prove: Let G be a group. Then Z(G) is the union of all G-conjugacy classes
K such that |K| = 1.
(c) Prove: Let p be a prime, and let G be a finite group of cardinality p^n for
some n ∈ N. Then Z(G) contains a non-identity element of G.
(d) Prove: Let p be a prime and let G be a finite group of cardinality p^2. Then
G is an abelian group. [Hint: By (c), Z(G) ≠ {I}. If g ∈ G − Z(G), argue that
every element of G is equal to z ∘ g^i for some z ∈ Z(G) and i ∈ N. Conclude that
G is abelian.]
19. (Bonus) Let p be a prime and let G be a finite group with |G| = 2p. Prove:
Either G is cyclic or G ≅ Dp. [Hint: Suppose G is not cyclic. As in Exercise
16, prove that G has exactly p elements of order 2, and a normal cyclic subgroup
K = ⟨x⟩ of cardinality p. Let t ∈ G − K. Show that x ∘ t = t ∘ x^{-1}. Using this,
prove that the map Φ : G → Dp defined by
Φ(x^i ∘ t^j) = ρ^i ∘ R^j
for 0 ≤ i ≤ p − 1, 0 ≤ j ≤ 1, where ρ = ρ_{2π/p} and R is reflection across the x-axis, is
an isomorphism of groups.]
V − E + F = 2,
in conjunction with the local data:
2E = rV = mF,
where r is the number of edges meeting at a vertex v, and each face is a regular
m-gon. Now we can easily compute, case-by-case. For example, if m = r = 3, then
(2/3)·E − E + (2/3)·E = 2,
whence E = 6, and then V = F = 4.
Thus we get the following table:
m r V E F
3 3 4 6 4
3 4 6 12 8
3 5 12 30 20
4 3 8 12 6
5 3 20 30 12
This is the data for the tetrahedron, the octahedron, the icosahedron, the cube,
and the dodecahedron, respectively. It can be verified that if S is a Platonic solid
and if one marks points at the center of each face of S and then joins points at
a minimal distance apart by edges, the resulting surface is another Platonic solid,
S ⇤ , said to be dual to S. This is easy to visualize in the case of the cube S, and
the resulting figure is a regular octahedron. Less obviously, the icosahedron and
the dodecahedron are dual figures.
Dual Platonic solids have the identical group of symmetries. We shall only
consider the group of rotational symmetries of the tetrahedron, the octahedron,
and the icosahedron. These are called the tetrahedral group T , the octahedral
group, O, and the icosahedral group I.
Lagrange’s Theorem enables us easily to give an upper bound for the sizes of
these groups.
|T| = |v^T| · |T_v| ≤ 4 · 3 = 12.
For the octahedron and the icosahedron, each triangular face f is opposite an
antipodal triangular face −f. Any non-identity rotation ρ fixing f (as a set) must
fix the centers of both f and −f and induce a 120° rotation about the axis joining
those centers. Hence |O_f| ≤ 3 and |I_f| ≤ 3. As the octahedron has 8 faces and the
icosahedron has 20 faces, we conclude, using Lagrange’s Theorem, that
{t1·v1 + t2·v2 + t3·v3 + t4·v4 : t1 + t2 + t3 + t4 = 1, 0 ≤ ti ≤ 1 for all i}.
This tetrahedron is contained in the 3-dimensional space defined by the equation:
x + y + z + w = 1.
Now consider the subgroup P4 of the orthogonal group O(4) consisting of all
4 × 4 permutation matrices. Each permutation matrix σ̂ in P4 permutes the four vertices
v1, v2, v3, v4, and hence is a symmetry of the tetrahedron S(4). Moreover, σ̂ leaves
invariant
σ̂(e1) = ε1 · e_{σ(1)},
σ̂(e2) = ε2 · e_{σ(2)},
σ̂(e3) = ε3 · e_{σ(3)}.
(1/2)·|Isom(S(8))|. The only possible conclusion is that W(3) = Isom(S(8)) and O is
a subgroup of W(3) of index 2, with |O| = 24.
As remarked before, the centers of the eight faces of the octahedron S(8) are the
vertices of a cube C having the same group O of rotational symmetries. Let Δ be
the set of four long diagonals joining opposite vertices of C. If ρ is a symmetry of C
which maps each long diagonal to itself, then ρ = I or ρ = −I. (Exercise.) Hence
I is the only rotational symmetry of O (and C) fixing every element of Δ. Thus if
Φ : O → Sym(Δ)
is the function which maps each rotation in O to the permutation it induces on Δ,
then Φ is an injective map. But |O| = 24 = |Sym(Δ)|. Hence O ≅ Sym(Δ) ≅ S4.
Thus we have proved the following theorem.
We can enumerate the 24 symmetries in O as follows. First, regarding them as
symmetries of the octahedron, we have:
(1) 6 90o rotations (clockwise or counterclockwise) about the axes joining op-
posite vertices;
(2) 3 180o rotations (clockwise or counterclockwise) about the axes joining op-
posite vertices;
(3) 8 120o rotations (clockwise or counterclockwise) about the axes joining op-
posite faces;
(4) 6 180o rotations (clockwise or counterclockwise) about the axes joining op-
posite edges; and
(5) 1 identity rotation.
Regarded as symmetries of the cube C, we have:
(1) 6 90o rotations (clockwise or counterclockwise) about the axes joining op-
posite faces;
(2) 3 180o rotations (clockwise or counterclockwise) about the axes joining op-
posite faces;
(3) 8 120o rotations (clockwise or counterclockwise) about the axes joining op-
posite vertices;
(4) 6 180o rotations (clockwise or counterclockwise) about the axes joining op-
posite edges; and
(5) 1 identity rotation.
Exercises
1. Consider the pattern F in R^2 which is the tiling of the plane by unit squares,
whose vertices have coordinates (a, b) with a, b 2 Z. Let G = Isom(F ). Prove that
G = T · G0 , where
T := {T(a,b) : a, b 2 Z}
is the normal abelian subgroup of translational symmetries of F , and G0 is the
stabilizer in G of the point (0, 0), with G0 ⇠
= D4 , the group of all symmetries of the
square.
2. (a) Consider the pattern F1 in R^2 which is the tiling of the plane by congruent
equilateral triangles with sides of unit length, including the triangle T having
vertices at (0, 0), (1, 0), and (1/2, √3/2). Prove: Isom(F1) = T1 · H, where
T1 = {a · T_{(1,0)} + b · T_{(1/2, √3/2)} : a, b ∈ Z}
and
H = Isom(F1)_0 ≅ D6,
the group of all symmetries of the regular hexagon. [Note: There are six triangles
meeting at the point (0, 0), comprising the six wedges of a regular hexagon centered
at (0, 0).]
(b) Let P be the center of the triangle T from (a). Find the coordinates of P .
Prove: The stabilizer, Isom(F1 )P , of the point P is isomorphic to D3 , the group
of all symmetries of the equilateral triangle.
3. Let C be a cube in R^3 centered at (0, 0, 0). Prove: If f ∈ Isom(C) and f
maps each long diagonal of C to itself (but not necessarily pointwise), then f = I
or f = −I.
4. In this exercise, you may assume that the icosahedral group Icos transitively
permutes the sets V , F , and E of all vertices, faces, and edges of the icosahedron,
respectively.
(a) Prove: The set of 15 elements of order 2 in Icos forms a single Icos-conjugacy
class.
(b) Prove: Icos contains 10 subgroups of cardinality 3, and Icos permutes these
subgroups transitively under conjugation.
(c) Prove: Icos contains 6 subgroups of cardinality 5 and Icos permutes these
subgroups transitively under conjugation.
(d) Prove: The only normal subgroups of Icos are the identity subgroup {I} and
the full group Icos. [Hint: Recall that if N is a subgroup of Icos, then |N| divides
60 = |Icos|. Moreover, if N is a normal subgroup of Icos and N contains the subgroup
H of Icos, then N contains g ∘ H ∘ g^{-1} for all g ∈ Icos. Now use (a), (b), and (c).]
Definition. We say that a group G is a simple group if {I} and G are the only
normal subgroups of G.
Proof. The easiest way to visualize why this is true is to imagine a rectangular
array (a matrix, if you will) whose rows are labeled by the elements of G and whose
columns are labeled by the points in the set X. The (g, x) entry of this array is 1 if
g(x) = x, and is 0 if g(x) ≠ x. Now add up all of the entries in the matrix. Adding
one row at a time gives the answer Σ_{g∈G} f(g). Adding one column at a time gives
the answer Σ_{x∈X} |Gx|, noting that the (g, x) entry is 1 if and only if g ∈ Gx. This
proves the lemma.
Next we make the following observation.
Lemma 6.2. Let G be a finite group of permutations of a finite set X. Let O be
any G-orbit on X. Then
Σ_{x∈O} |Gx| = |G|.
Thus
(1/|G|) Σ_{x∈O} |Gx| = 1.
Proof. By Lagrange’s Orbit-Stabilizer Theorem,
|Gx| = |G| / |O|,
independent of the choice of x. But then
Σ_{x∈O} |Gx| = |O| · |Gx| = |G|,
as claimed. □
Thus
(1/|G|) Σ_{g∈G} f(g) = r,
the number of G-orbits on X (this is the Orbit Counting Formula).
f(ρ) = 2.
On the other hand, if ρ^2 is a 120° rotation, then adjacent vertices may be labeled
either the same or differently, but every other vertex must have the same label.
Hence there are two additional labelings fixed by ρ^2:
f(ρ^3) = 8.
There are two different kinds of reflections. If r_v is a reflection fixing two opposite
vertices, then these two vertices may be labeled any way, but mirror-image vertices
must have the same label. So
f(r_v) = 2 × 2 × 2 × 2 = 16.
If r_e is a reflection fixing no vertices, then there are three mirror-image pairs and
f(r_e) = 2 × 2 × 2 = 8.
Next we add up the number of fixed points, keeping track of the number of symmetries
of each type:
Σ_{g∈D6} f(g) = f(I) + 2f(ρ) + 2f(ρ^2) + f(ρ^3) + 3f(r_v) + 3f(r_e) = 64 + 4 + 8 + 8 + 48 + 24 = 156.
Finally we divide by |D6| to get that the number of distinct organic hexads is
156/12 = 13.
Note: Don’t tell your chemistry professor about “organic hexads”. They have
no basis in chemical reality. However similar arguments can be used in genuine
chemistry problems.
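The count of 13 can be double-checked by brute force (the script is mine, with the two labels written as 'C' and 'N' purely for concreteness): list all 2^6 labelings of the hexagon's vertices and count the D6-orbits directly.

from itertools import product

n = 6
rotations   = [tuple((i + k) % n for i in range(n)) for k in range(n)]
reflections = [tuple((k - i) % n for i in range(n)) for k in range(n)]
D6 = rotations + reflections                      # the 12 symmetries of the hexagon

labelings = set(product('CN', repeat=n))          # the 64 labelings of the vertices
orbits = 0
while labelings:
    x = labelings.pop()
    orbit = {tuple(x[g[i]] for i in range(n)) for g in D6}
    labelings -= orbit
    orbits += 1
print(orbits)                                     # 13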
Before proceeding to the next example, we make a useful observation, which is
implicit in the last calculation.
Lemma 6.3. Let G be a group of permutations of a finite set X and let f(g) denote
the number of fixed points of the element g ∈ G. If h ∘ g ∘ h^{-1} is any conjugate of
g in G, then
f(g) = f(h ∘ g ∘ h^{-1}).
Indeed, if F(g) denotes the set of fixed points of g, then
F(h ∘ g ∘ h^{-1}) = h(F(g)).
Then, since h is a bijective mapping on X,
|F(h ∘ g ∘ h^{-1})| = |F(g)|,
as claimed.
First let x ∈ F(g). Then
(h ∘ g ∘ h^{-1})(h(x)) = (h ∘ g)(h^{-1}(h(x))) = h(g(x)) = h(x).
Thus h(x) ∈ F(h ∘ g ∘ h^{-1}) whenever x ∈ F(g), i.e.
h(F(g)) ⊆ F(h ∘ g ∘ h^{-1}).
Secondly, let y ∈ F(h ∘ g ∘ h^{-1}). We wish to show that h^{-1}(y) ∈ F(g). Now
y = (h ∘ g)(h^{-1}(y)).
So, applying h^{-1} to both sides, we get
h^{-1}(y) = h^{-1}((h ∘ g)(h^{-1}(y))) = (h^{-1} ∘ h)(g(h^{-1}(y))) = g(h^{-1}(y)).
So h^{-1}(y) ∈ F(g), as desired. Thus
F(h ∘ g ∘ h^{-1}) ⊆ h(F(g)),
and so
F(h ∘ g ∘ h^{-1}) = h(F(g)),
as claimed.
Thus, in order to perform the calculations required for the Orbit Counting For-
mula, we only need to compute f (g) for one representative of each conjugacy class
of G, and we need to know the size of each conjugacy class. This is in fact what we
did in Example 1, where the “types”of rotations and reflections were actually the
di↵erent conjugacy classes of the group D6 .
Example 2. Let’s call a crystal tetrad a crystalline molecule in the shape of a
tetrahedron with each vertex containing either a silicon atom (Si), an oxygen atom
(O), or a hydrogen atom (H). How many di↵erent crystal tetrads are possible?
This problem is very similar to the previous one, only now we are considering
the orbits of the symmetry group of the tetrahedron on the set S of all labelings of
the vertices of the tetrahedron with the label Si, O or H. Thus there are 34 = 81
possible labelings, i.e. |S| = 81.
We have seen that the tetrahedral group T is isomorphic to S4 acting as the group
of all possible permutations of the four vertices of the tetrahedron. We have also
seen that two permutations in Sn are conjugate if and only if they have the same
cycle structure. Here we must compute fixed points for five di↵erent permutations:
(1), (1, 2), (1, 2, 3), (1, 2, 3, 4), and (1, 2)(3, 4).
If a labeling L is fixed by the isometry g, then two vertices of T which are in the
same g-orbit must have the same label, while vertices in di↵erent g-orbits may be
labeled independently. Thus we easily establish the following table:
f(ρ) = n
for all non-identity ρ, of which there are p − 1. Also, clearly
f(I) = n^p.
Hence the number of different pinwheels is
((1 × n^p) + ((p − 1) × n)) / p = n + (n^p − n)/p.
Since the number of different pinwheels must be an integer, we obtain the following
corollary.
Fermat’s Little Theorem. Let p be a prime and n be any natural number. Then
n^p ≡ n (mod p).
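A quick numerical check of the corollary (the test values are my own):

for p in (2, 3, 5, 7, 11, 13):
    for n in range(1, 50):
        assert pow(n, p, p) == n % p    # n^p is congruent to n modulo p
print("n^p ≡ n (mod p) holds for all tested cases")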
Exercises
1. How many di↵erent colored tetrahedra are there in which each face is colored
either red or white or blue?
2. How many di↵erent colored cubes are there in which each face is colored
either red or white or blue?
3. How many di↵erent pinwheels with 6 identically shaped pins are there, if each
pin can be colored one of 4 di↵erent colors?
4. How many di↵erent pinwheels with 8 identically shaped pins are there, if each
pin can be colored one of 4 di↵erent colors?
5. Instead of pinwheels, consider bracelets with 8 identical size spherical beads,
each of one of 4 di↵erent colors? How many di↵erent bracelets are there? [Now
reflectional symmetries must be considered, in addition to rotational symmetries.]
6. A toy pyramid in the shape of a regular tetrahedron is built out of six pegs.
Count the number of di↵erent designs if there are
(a) two each of red, white, and blue pegs; or
(b) three each of red and white pegs.
7. The skeleton of a cube is made out of twelve pegs. How many distinguishable
such cube can be made from:
This is a very remarkable theorem, obviously closely related to the theorem that
there are only five Platonic solids. It challenges our intuition. Although there
are infinitely many di↵erent 2-dimensional rotation groups, one for each regular
polygon, there are only three di↵erent essentially 3-dimensional rotation groups.
Somehow, the extra dimension provides less freedom, not more.
In fact, we shall prove an even stronger statement.
Definition 7.2. Let f : R^n → R^n be a function. We call f an affine transformation
if there exists a vector v ∈ R^n and a linear operator g : R^n → R^n such
that f = T_v ∘ g, where T_v : R^n → R^n is translation by the vector v. We denote by
Aff(R^n) the group of all invertible affine transformations of R^n.
First we prove the following striking fact:
Theorem 7.3. Let G be a finite subgroup of Aff(R^n). Then G fixes a point, i.e.
there exists a point P ∈ R^n such that g(P) = P for all g ∈ G.
This statement is certainly false for many infinite subgroups of Aff(R^n), e.g.
the translation group T_n. So we have to use the only two facts we know:
(1) G is finite; and
(2) Every g ∈ G acts as an affine transformation, i.e. for some vector v ∈ R^n,
g = T_v ∘ f,
⇤
Now we can describe the averaging trick which will permit us to find a fixed
point for G. Let’s imagine first that G is the cyclic group generated by a rotation
ρ about a point P through an angle of 2π/n. Hence ρ^n = I. Now suppose we didn’t
know P. We could pick a random point Q and look at the ρ-orbit of Q:
Q, ρ(Q), ρ^2(Q), . . . , ρ^{n−1}(Q).
The next step, ρ^n(Q), would take us back to Q. What we would see is n points
evenly spaced on the circumference of a circle, and we would realize that P must be
the center of that circle. If we change the choice of Q, we change the set of points
and we probably even change the circle, but the center is always P!
This is the magic that we will exploit.
First we need the following remark.
Lemma 7.5. Let G be a group and let h be an element of G. Consider the function
λ_h : G → G defined by λ_h(g) = h ∘ g for all g ∈ G. For any g ∈ G,
λ_h(h^{-1} ∘ g) = h ∘ (h^{-1} ∘ g) = (h ∘ h^{-1}) ∘ g = g.
Thus λ_h is also a surjective map, hence a bijective map. □
Now we can prove Theorem 7.3.
Proof of Theorem 7.3. Let v be any point in R^n. Let
v̄ = (1/|G|) · Σ_{g∈G} g(v).
Let h ∈ G, and write h = T_w ∘ f with f a linear operator and w ∈ R^n. Then
h(v̄) = (1/|G|) · ( Σ_{g∈G} (h ∘ g)(v) − (|G| − 1) · w ) + (1 − 1/|G|) · w.
Now by Lemma 7.5, Σ_{g∈G} (h ∘ g)(v) = Σ_{g∈G} g(v), since as g runs through the
elements of G once each, so does h ∘ g. Hence
h(v̄) = (1/|G|) · ( Σ_{g∈G} g(v) − (|G| − 1) · w ) + (1 − 1/|G|) · w
     = v̄ − ((|G| − 1)/|G|) · w + (1 − 1/|G|) · w = v̄,
completing the proof of Theorem 7.3. □
Corollary 7.6. Let G be a finite subgroup of Isom(R^3). Then G is conjugate to
a subgroup of O(3).
Proof. By Theorem 7.3, there is a point v ∈ R^3 such that g(v) = v for all g ∈ G.
Then
S^2 = {(x, y, z) ∈ R^3 : x^2 + y^2 + z^2 = 1}.
Then the axis of rotation of ρ intersects S^2 in exactly two antipodal points (a, b, c)
and (−a, −b, −c). We think of (a, b, c) and (−a, −b, −c) as the north and south
poles for the rotation ρ. Let
(g ∘ h ∘ g^{-1})(g(P)) = (g ∘ h)(g^{-1}(g(P))) = (g ∘ h)(P) = g(h(P)) = g(P).
Also, if g ∘ h ∘ g^{-1} = I, then
h = g^{-1} ∘ (g ∘ h ∘ g^{-1}) ∘ g = g^{-1} ∘ I ∘ g = I,
contrary to the fact that h ≠ I. Hence g ∘ h ∘ g^{-1} ≠ I and (g ∘ h ∘ g^{-1})(g(P)) = g(P).
So g(P) ∈ P for all P ∈ P and all g ∈ G, as claimed. □
Remarkably, we can now count the number of G-orbits on P. Except for one
trivial case, there must be exactly three orbits.
Lemma 7.9. The following conclusions hold:
(1) G has either two or three orbits on P.
(2) If G has only two orbits on P, then P = {P, P } is a pair of antipodal
points on S 2 , and G is a cyclic group of rotations about the axis L through
P and P .
(3) If G has three orbits on P, then |P| = |G| + 2.
Proof. Using the notation of the Orbit Counting Formula, f (g) = 2 for all g 2
G {I}. Let m be the number of G-orbits on P. Then, by the Orbit Counting
Formula,
|P| 2 2|G| 4 4
m=2+ 2+ =2+2+ < 4.
|G| |G| |G|
Hence m = 2 or 3.
Finally, suppose that m = |P| = 2. If P is a pole of G, then so is the antipodal
point P . Hence P = {P, P } and these points are fixed by every element of G.
Thus the line L through P and P is held pointwise fixed by every element of G,
and so using Theorem 12.12 in the Math 4580 text, we see that G acts as a finite
cyclic group of rotations of the plane L? through (0, 0, 0) perpendicular to L.
⇤
56
For the remainder of this chapter, we shall assume that m = 3, and let O1 , O2 ,
and O3 be the three G-orbits on P. Choose notation so that
GP = {g 2 G : g(P ) = P }.
Let |GP | = p, |GQ | = q, and |GR | = r. By the ordering of the orbits, we have
p q r.
Also, since every pole is fixed both by I and by at least one non-identity element
of G, we have
2 p q r.
Lemma 7.10.
1 1 1 2
+ + =1+ > 1.
p q r |G|
⇤
From this, we immediately get that the following are the only possibilities for p,
q, and r.
G = GR [ f GR ,
for some f = G GR . Every element of GR fixes both R and R, and so GR acts
as a cyclic group of rotations of the plane ⇧ about the point (0, 0, 0). Let g 2 GR
with g 6= I. Then, since GR is a normal subgroup of G, f 1 g f := g1 is also an
element of GR . Hence g f = f g1 , and so
p = 2, q = r = 3.
Consider the orbit O2 of size 4. Since no non-identity rotation fixes four poles,
the map : T ! Sym(O2 ) by restriction of domain is an injective map of T into
Sym(O2 ) ⇠ = S4 . Since |T | = 12, T is isomorphic to the unique subgroup of S4 of
cardinality 12, namely, the alternating group A4 . As you will show in Exercise 1
below, A4 transitively permutes the set of unordered pairs {{i, j} : 1 i < j 4}.
Translating this into geometry: T transitively permutes the set of six edges joining
pairs of vertices in O2 . As T is a group of isometries, all edges have the same
length, i.e. the figure S(4) formed in this way is a tetrahedron, and T is the group
of rotational symmetries of S(4). If we were to use the other orbit of length 4, O3 ,
we would have constructed the dual tetrahedron, S(4)⇤ .
Next. we consider the case
p = 2, q = 3, r = 4.
Now we have that |G| = 24 and so the orbit O3 has length 6. Suppose that X and
X are two antipodal poles. Then GX = G X . In particular, X and X lie in
|G|
orbits of the same length, namely, |G X|
. Since no two orbits have the same length
in this case (in contrast to the tetrahedral case), we see that X and X lie in the
same orbit. In particular
O3 = {R, R, S, S, T, T }
for some poles R, S, T and their antipodes.
The group GR acts as a group of rotations of the latitudinal planes perpendicular
to the axis passing through R and R. Hence, as we have seen from studying the
2-dimensional case, GR is a cyclic group of cardinality 4 = 24 6 . If ⇢ is a cyclic
generator of GR , then ⇢ is a 90o rotation about the {R, R} axis. Since ⇢2 fixes
only the poles R and R, the set {S, S, T, T } is a ⇢-orbit on P. As an exercise,
you are asked to show that this is possible if and only if S, T, S, T lie at a set
of compass points on the equatorial plane relative to the poles R and R. Thus
S, T, S, T determine a square in the equatorial plane, and {±R, ±S, ±T } is the
vertex set of an octahedron C ⇤ with G as its group of rotational symmetries.
The orbit O2 contains four pairs of antipodal poles, which may be obtained as
follows: Draw the lines through the centers of opposite faces of the octahedron C ⇤ .
Each such line L gives a pair {QL , QL } of antipodal points on the unit sphere S 2 ,
which are poles for the rotational symmetries ⇢F and ⇢2F of order 3 fixing the face
F . This set of eight points on S 2 is the set of points in the orbit O2 . Clearly, it
may be identified with the vertices of a cube which is a dilation of the cube C dual
to the octahedron C ⇤ .
Finally we say a few words about the most difficult case
p = 2, q = 3, r = 5.
Now |G| = 60 and the orbit O3 has length 12. Again, because no two orbits have
the same length, antipodal poles lie in the same orbit. In particular, GR is a cyclic
59
group of cardinality 5, fixing the points R and R, while permuting the remaining
10 points of O3 in two orbits of length 5.
Suppose one GR -orbit consists of five points in the equatorial plane relative to R
and R. Then, since these points are the vertices of a regular pentagon, they do not
contain antipodal pairs. Hence the other GR -orbit must consist of their antipodal
points, also lying in the equatorial plane. Let C be the great circle formed by the
intersection of this equatorial plane with the unit sphere S 2 . Then, clearly C is the
only great circle on S 2 containing 10 points from O3 . But then since G acts on the
orbit O3 , G fixes the circle C (as a set), contrary to the fact that O3 is a G-orbit
and R does not lie on C.
Hence there are two latitudinal circles, CN and CS , each containing the points
in one GR -orbit of length 5 on O3 , placed at the vertices of a regular pentagon.
Moreover the regular pentagon on CN is antipodal to the regular pentagon on CS .
Using some further symmetry arguments, it can be shown that O3 is the set
of vertices of an icosahedron inscribed in S 2 . And, then, in a similar way to the
octahedral case, it may be argued that O2 is the set of vertices of an inscribed
dodecahedron, which is the dilation of the dodecahedron dual to the inscribed
icosahedron.
Finally, as |G| = 60, it follows that G ⇠
= I, the group of all rotational symmetries
of the icosahedron.
Exercises
1. Let G = A4 , the alternating group on 4 letters. Let
2.(a) Justify the statement in the discussion of the octahedral case: “Since ⇢2
fixes only the poles R and R, the set {S, S, T, T } is a ⇢-orbit on P.
(b) In the same context as (2a), prove: S, T, S, T lie at a set of compass
points on the equatorial plane relative to the poles R and R.
3. The Direct Product: Let G be a group having subgroups H and K satisfying
the following two conditions:
(a) h k = k h for all h 2 H and k 2 K; and
(b) H \ K = {I}.
Prove: The subset HK of G is a subgroup of G which is isomorphic to the following
formal group. called the direct product of H and K:
(h, k) (h0 , k 0 ) = (h h0 , k k 0 ).
4. Let G = HK ⇠
= H ⇥ K be a group. Let ⇡H : G ! H be the projection map
defined by:
⇡H ((hk)) = h 8 g = hk 2 G.
Define ⇡K : G ! K analogously.
(a) Prove: Let M be any subgroup of G. Then ⇡H (M ) is a subgroup of H and
⇡K (M ) is a subgroup of K.
(b) Prove: M is a subgroup of the group ⇡H (M ) · ⇡K (M ) ⇠
= ⇡H (M ) ⇥ ⇡K (M ).
5.(a) Prove: O(3) ⇠
= SO(3) ⇥ {I, I} ⇠
= SO(3) ⇥ C2 .
(b) Conclude that every finite subgroup of O(3) is isomorphic to a subgroup of
H ⇥ K, where H is a finite subgroup of SO(3) and |K| = 1 or 2.
6. Let S be a regular tetrahedron in R3 . We have proved that Sym(S) ⇠ = S4 .
On the other hand, the tetrahedral group T of all rotational symmetries of S is
isomorphic to A4 . Verify that S4 is not isomorphic to a subgroup of T ⇥ {I, I}.
Explain how this could be true.
7. If S is either a regular octahedron or a regular icosahedron centered at (0, 0, 0),
then the antipodal map I is a symmetry of S. Using this fact, prove that the
symmetry group of S is Sym(S) = O ⇥ {±I} is S is an octahedron, and Sym(S) =
Icos ⇥ {±I} if S is an icosahedron.
8. Let D3 be the group of all diagonal matrices in O(3). Prove: D3 ⇠ = C2 ⇥ C2 ⇥
C2 , an abelian group of cardinality 8, all of whose non-identity elements have order
2.
9.(a) Prove: Let p be a prime and let G be a finite group of cardinality p2 . Then
either G is cyclic or G ⇠
= Cp ⇥ Cp .
(b) Exhibit a subgroup P of S6 with |P | = 9. Verify that P ⇠
= C3 ⇥ C3 .
10. Inner Product Spaces: A function h., .i : R3 ⇥ R3 ! R is an inner product
on R3 if the following properties hold:
(i) hu, vi = hv, ui for all u, v 2 R3 ;
(ii) hu + v, wi = hu, wi + hv, wi for all u, v, w 2 R3 ;
(iii) hcu, vi = c · hu, vi for all u, v 2 R3 , c 2 R; and
(iv) hu, ui 0 for all u 2 R3 , with equality if and only if u = 0.
(b) Let G⇤ be the set of all linear isometries of R3 with respect to h., .i, i.e.,
O(3) ⇠
= {A 2 GL(3, R) : AT A = I}.
Show that the same is true for G⇤ with respect to a suitable choice of basis for R3 .]
11. Another Averaging Trick. Prove E. H. Moore’s Theorem: Let G be a finite
subgroup of GL(3, R). Then G is isomorphic to a subgroup of O(3). [Hint: Define
a function on R3 ⇥ R3 by
1 X
hu, vi = g(u) · g(v)
|G|
g2G
for all u, v 2 R . Prove that h., .i is an inner product on R3 . Then, prove that G
3
is a group of linear isometries with respect to h., .i. Now use Exercise 8b.]
12. Prove: Let G be a finite subgroup of Af f (R3 ). Then G is isomorphic to a
subgroup of O(3).
[Hint: Use Theorem 7.3 and the proof of Corollary 7.6 to show that G is conjugate
to a subgroup G1 of GL(3, R). Now use Exercise 11.]
13. Give an example of a bijective function f : R ! R of finite order, i.e., f n = I
for some n 2 BbbN , such that f fixes no point of R, i.e., f (x) 6= x for all x 2 BbbR.
14(a) Consider the following set of 2 ⇥ 2 matrices with complex entries:
✓ ◆ ✓ ◆ ✓ ◆
i 0 0 1 0 i
Q8 := {±I, ± ,± ,± }.
0 i 1 0 i 0
Prove: Q8 is a subgroup of the group of all 2 ⇥ 2 matrices with complex entries.
(b) Let m 2 N and let ⇣m = cos( 2⇡ 2⇡
m ) + isin( m ). Using DeMoivre’s Formula,
prove that ⇣m = 1, but ⇣m 6= 1 for all d < m, d 2 N.
m d
✓ ◆
⇣p 0
(c) Let p and q be prime numbers. Let Zpq = . Prove: Zpq generates a
0 ⇣q
cyclic subgroup of cardinality pq in GL(2, C), the group of 2 ⇥ 2 invertible matrices
with complex entries.
(d) Let p be a prime number. Let a, b 2 N with a b. Let H be the subgroup
of GL(2, C) generated by the two matrices:
✓ ◆ ✓ ◆
⇣pa 0 1 0
and .
0 1 0 ⇣pb
Prove: Hp ⇠
= Cpa ⇥ Cpb .
(e) Prove: Let L be the subgroup of GL(2, C) generated by the two matrices:
✓ ◆ ✓ ◆
⇣3 0 0 1
and .
0 ⇣3 1 1 0
Then L is a nonabelian group of cardinality 12 with |Z(L)| = 2 and with only one
element of order 2.
62
Remark:. We have now constructed examples of all finite groups G with |G| 15:
(1) Cn , n 15;
(2) V4 = D 2 ⇠ = C2 ⇥ C2 ;
(3) D3 ⇠ S
= 3 ;
(4) D4 ; ✓ ◆ ✓ ◆
⇠ i 0 1 0
(5) C4 ⇥ C2 = h , i;
0 1 0 1
(6) C2 ⇥ C2 ⇥ C2 , the group of all diagonal matrices in O(3);
(7) Q8 ✓ GL(2, C);
(8) C 3 ⇥ C 3 ✓ S6 ;
(9) D5 ;
(10) C6 ⇥ C2 ✓ GL(2, C);
(11) D6 ;
(12) A4 ;
(13) L ✓ GL(2, C).
15. Justify the statement that no two of the 27 groups listed above are isomor-
phic.
It is not terribly difficult, but it is a bit beyond the scope of this course to prove
that every group G with |G| 15 is isomorphic to one of these 27 groups.
63
“... and gives to airy nothing a local habitation and a name ...”
– Wm. Shakespeare
We now change topic, returning to the theme of polynomial equations and their
roots. De Moivre’s Formula, which you studied last semester, demonstrated the
existence of exactly n complex nth roots for any number, i.e. a full complement
of n solutions to the equation xn ↵ = 0 could be found among the complex
numbers, for any given complex number ↵. This lent support to the idea that every
polynomial equation of degree n with complex coefficients should have a full set of n
solutions (counting multiplicity) in the field C. A somewhat cryptic version of this
statement was first made (without proof) by the French mathematician Girard as
early as 1629 (before Descartes published his Factor Theorem), along with formulas
expressing the coefficients as symmetric functions of the roots.
Nevertheless, as late as the early 1700s, Leibniz thought he had a counterex-
ample. D’Alembert published a somewhat incomplete proof of this “Fundamental
Theorem of Algebra” using the methods of calculus, in 1746. Euler attempted a
more algebraic proof in 1749, which was improved by de Foncenex and then La-
grange in 1772.
There was one big problem with Euler’s proof, and this was pointed out by
Gauss, who proposed his own calculus-based proof in 1799, and a second more
algebraic proof in 1816. The problem detected by Gauss was:
Euler’s proof assumed the existence somewhere of a set of n roots of an nth
degree polynomial p(x) with real coefficients. The proof then proceeded to show
that these roots were in fact complex numbers. But, said Gauss, this misses the
entire point. Why do roots of p(x) exist anywhere??
If Euler were still alive when Gauss wrote this, he might have responded: What
do you mean exist somewhere?? We can simply invent them, as needed. After
all, the imaginary number i was invented to provide a root for the polynomial
f (x) = x2 + 1 2 R[x]. By adding i to R, we were able to create a larger field, C,
containing the roots of all quadratic equations. Just keep on doing this, as needed.
Well, Gauss had a point. One can certainly invent symbols, but can one construct
an algebraic structure containing these symbols and having all of the usual nice
properties that one needs to carry out Euler’s proof? In modern language, given
a field F and a polynomial f (x) 2 F [x], can one always construct a larger field E
containing F and also containing a complete set of roots for this polynomial? Galois
was perhaps the first person to demonstrate that this is always possible. In 1831,
he wrote a note on fields of numbers, hinting at the construction we shall describe
in this section. This was later clarified and elaborated by Leopold Kronecker.
Let’s construct C a di↵erent way: Let R[x] be the domain of all polynomials
with real coefficients. Let (x2 + 1) denote the principal ideal in R[x] generated by
the polynomial q(x) = x2 + 1. Form the quotient ring C := R[x]/(x2 + 1). By the
Division Algorithm, we see that every element of C has the form
(a + bx) + (x2 + 1)
64
for some a, b 2 R. If we define the symbol i to denote the coset x + (x2 + 1), then
we see that
F0 := {↵ + (f (x)) : ↵ 2 F }.
(Here we are identifying the number ↵ with the constant polynomial c(x) = ↵.)
Proof. For any p(x) 2 F [x], let [p(x)] := p(x) + (f (x)) 2 E. We need to prove that
if [g(x)] is a non-zero element of E, then there is a polynomial h(x) 2 F [x] such
that [g(x)] · [h(x)] = [1] in E, i.e.,
Definition 8.3. Let F be a field and let p(x) 2 F [x] be a polynomial. Suppose
that E is a field which contains a subfield F0 isomorphic to F , and such that p(x)
factors into a product of linear factors in E[x]. Let r1 , r2 , . . . , rn be the roots of
p(x) in E, and let E0 = F (r1 , r2 , . . . , rn ) ✓ E. Then we call E0 a splitting field
for p(x) over F .
It is very useful to be able to measure the “relative sizes”of fields F and E, where
E is an extension field of F , i.e.,F is a subfield of E. Since both fields are, in
general, infinite, cardinality is not a good measuring stick. Fortunately we have an
alternative, suggested by the following observation.
Lemma 8.4. Let E be an extension field of F . Then the usual operations of
addition and multiplication in E make E an F -vector space.
We leave the proof as an exercise. Note that, since E is a field, we know that
(E, +) is an abelian group. The axioms for scalar multiplication:
(1) ↵ · (u + v) = ↵ · u + ↵ · v, for ↵ 2 F , u, v 2 E;
(2) (↵ + ) · u = ↵ · u + · u for ↵, 2 F , u 2 E;
(3) (↵ ) · u = ↵ · ( · u) for ↵, 2 F , u 2 E; and
(4) 1 · v = v for all v 2 E.
follow easily from the properties of the field E.
Since E is an F -vector space, we can measure its size relative to F by the
dimension of E as an F -vector space. Note that, if E = F , then {1} is a basis for
E as an F -space, and so dimF (F ) = 1.
Definition 8.5. We write (E : F ) and speak of the degree of E over F , to denote
the dimension of E as an F -vector space.
This degree is most useful when it is finite. This will be the case in the situations
of interest to us. We need the following remark, whose proof we leave as an exercise.
Lemma 8.6. Let F be a field and E an extension field of F containing a root ↵
of some non-zero polynomial in F [x]. Then the set
Set
n 1
r(x) = a0 + a1 x + · · · + an 1x 2 F [x].
Being a bit sloppy, we shall assume that F ✓ E and denote by a the coset a +
(p(x)) 2 E for any a 2 F . Thus
n 1
[c0 + c1 x + · · · + cn 1x ] = [0].
Setting h(x) = c0 + c1 x + . . . cn 1 xn 1 2 F [x], we conclude that h(x) is a multiple
of f (x). However, either h(x) is the zero polynomial or deg(h(x)) n 1 < n =
deg(f (x)). Hence h(x) ⌘ 0, i.e.
c0 = c1 = · · · = cn 1 = 0.
Thus B is indeed an F -linearly independent set. So B is an F -basis for E, whence
(E : F ) = n, as claimed.
⇤
As a corollary, we obtain the following important fact.
Corollary 8.8. Let E be a field and let F be a subfield of E. Let ↵ 2 E and
suppose that m(x) 2 F [x] is the minimum polynomial of ↵ in F [x]. If the degree of
m(x) is n, then (F (↵) : F ) = n.
Proof. F (↵) ⇠
= F [x]/(m(x)).
⇤
Note that not every number has a minimum polynomial. For example if F =
Q and E = R, then ⇡ 2 R and ⇡ is not the root of any polynomial equation
with rational coefficients. [This is a fairly difficult theorem to prove. It was first
proved by Lindemann.] We say that a number ↵ is algebraic over F if ↵ is the
root of a polynomial equation with coefficients in F . Otherwise, we say that ↵ is
transcendental over F . We shall restrict our attention to algebraic numbers. We
have the following converse to Corollary 8.8.
68
S := {1, ↵, ↵2 , . . . , ↵n }.
Since |S| = n + 1 and dimF (E) = n, S is a linearly dependent set. Hence there
exist numbers c0 , c1 , . . . , cn 2 F , not all 0, such that
c0 + c1 ↵ + · · · + cn ↵n = 0.
Let p(x) = cn xn + · · · + c1 x + c0 2 F [x]. Then p(↵) = 0. Hence ↵ is algebraic over
F , as claimed.
⇤
Exercises
1. Describe the multiplication in the ring F [x]/(x2 ). Is this a field? What type
of element is [x]?
2. Describe the multiplication in the ring Q[x]/(x2 x). Is this a field? What
type of element is [x]?
3a. Let p(x) = g(x)h(x) 2 Q[x] with g(x) and h(x) non-constant irreducible
polynomials with gcd(g(x), h(x)) = 1. Prove: Q[x]/(p(x)) ⇠
= F1 F2 , with F1 and
F2 extension fields of Q. [Hint: Use the Chinese Remainder Theorem from last
semester.]
b. Let p(x) = g(x)h(x) 2 Q[x] with g(x) and h(x) non-constant irreducible
polynomials. Prove: Q[x]/(p(x)) is not a field, but also it has no non-zero nilpotent
elements.
c. (Bonus) Give necessary and sufficient conditions on a polynomial p(x) 2 Q[x]
for the ring Q[x]/(p(x)) to contain non-zero nilpotent elements.
4a. Describe the multiplication in the ring Q[x]/(x2 + x + 1). Is this a field?
What is the multiplicative inverse of [x]?
p
3
b. Let ! = 1
2 + 2 i 2 C. Let
E = {a + b! + c! 2 : a, b, c 2 Q} ✓ C.
Prove: E is closed under addition, subtraction, multiplication, and division (by
non-zero elements).
c. Let E be as in (b). Prove: E ⇠
= Q[x]/(x2 + x + 1).
5. Prove Lemma 8.4.
6. Prove Lemma 8.6.
7. Prove: Let F be a field. Let F0 be the intersection of all subfields of F . Then
F0 is a subfield of F . [Hence F0 is the unique smallest subfield of F .]
The next few exercises relate to finite fields. These were first described by Galois,
and are sometimes called Galois fields.
69
8a. Prove: Let E be a finite field. Let F be the smallest subfield of E, i.e.,
F = {0E , 1E , 1E + 1E , ...}.
Then F ⇠
= Z/pZ for some prime p. [Recall: p is the characteristic of the field E.]
b. Prove: Let E and F be as in (a). Then E is a finite-dimensional F -vector
space. In particular, |E| = pn for some n 2 N.
n
c. Prove: Let E be a finite field with |E| = pn . Then xp = x for all x 2 E.
[Hint: E {0} is a finite group under multiplication. Apply Lagrange’s Theorem.]
d. Prove: Let E be a field of characteristic p. Let a, b 2 E. Then
n n n
(a + b)p = ap + bp
for all n 2 N.
n
9. Let F = Z/pZ. Let f (x) = xp x 2 F [x]. Let E be a splitting field for f (x)
over F .
a. Prove: f (x) has pn distinct roots in E. [Hint: Suppose on the contrary
that f (x) = (x a)2 · g(x) 2 E[x]. Compute f 0 (x) in two di↵erent ways to get a
contradiction.]
b. Let f (x) be as above. Let S := {a 2 E : f (a) = 0}. Prove: S is closed under
addition, subtraction, multiplication, and division (by non-zero elements), i.e., S is
a subfield of E.
c. Let S be as in (b). Prove: S = E, i.e. |E| = pn .
10. Prove: If E and E1 are two fields with |E| = pn = |E1 |, then E ⇠
= E1 . [Thus,
up to isomorphism, there is one and only one field of cardinality pn for each prime
p and each n 2 N.]
11. Prove: Let E be a field of characteristic p. Define f : E ! E by f (x) = xp
for all x 2 E. Then f is an injective ring homomorphism. In particular, if |E| = pn
for some n 2 N, then f is an automorphism of E.
70
r1 + r2 ,
r1 r 2 ,
r12 + r22 + r1 r2 ,
r13 + r23 .
For n = 3, another type of example is:
r 1 r 2 + r1 r3 + r 2 r 3 .
We shall restrict our attention to symmetric polynomials with integer coefficients.
We leave it as an exercise to prove the following theorem.
Theorem 9.2. Let S be the set of all symmetric polynomials in the variables
r1 , r2 , . . . , rn with integer coefficients. Then S is a subring of Z[r1 , r2 , . . . , rn ], i.e.
S contains 0 and 1, and S is closed under addition, subtraction, and multiplication.
Now we are really interested in polynomials p(x) in one variable. The context
in which these multivariable polynomials arises is the following:
Suppose that p(x) = xn +an 1 xn 1 +· · ·+a1 x+a0 2 F [x] is a monic polynomial
having roots r1 , r2 , . . . , rn in some splitting field E containing F . Then
p(x) = xn + an 1x
n 1
+ · · · + a1 x + a0 = (x r1 )(x r2 ) . . . (x rn ) 2 E[x].
an 1 = r 1 + r2 + · · · + rn ,
an 2 = r 1 r 2 + r1 r3 + · · · + rn 1 rn ,
...,
n
( 1) a0 = r1 r2 . . . rn .
71
Each of the expressions on the right hand side can be thought of as a polynomial
in the ring Z[r1 , r2 , . . . , rn ]. In fact, each lies in the subring of symmetric polyno-
mials, since obviously p(x) is unchanged by any permutation in the ordering of the
linear factors. Indeed, these n polynomials are called the elementary symmetric
polynomials in Z[r1 , r2 , . . . , rn ]:
s 1 = r 1 + r2 + · · · + rn ,
X
s2 = ri r j ,
i6=j
X
s3 = r i r j rk ,
i6=j6=k6=i
...
s n = r 1 r 2 . . . rn .
The following fundamental theorem was probably known to Isaac Newton, but
was first explicitly proved somewhat later by Edward Waring.
Waring’s Theorem. Let S be the subring of Z[r1 , r2 , . . . , rn ] consisting of all
symmetric polynomials in the variables r1 , r2 , . . . , rn with integer coefficients. Then
S = Z[s1 , s2 , . . . , sn ].
f (r1 , r2 , . . . , rn ) = F (s1 , s2 , . . . , sn ),
for some polynomial F with integer coefficients, depending of course on f .
Rather than describe the algorithm in full gory generality, let’s look at an illus-
trative example in three variables. To save ink, let’s call the variables r, s, and t,
instead of r1 , r2 , r3 .
If you want to cook up an example of a symmetric polynomial in three variables,
you can symmetrize any monomial by adding up all of its possible permutations.
For example, starting with the monomial
m(r, s, t) = r2 s,
we get the symmetrized polynomial
p(r, s, t) = r2 s + r2 t + s2 r + s2 t + t2 r + t2 s.
An important feature of any algorithm is to have a way of measuring whether
you are making steady progress in the correct direction, or just going around in
circles. To do this, we choose a way to say that a symmetric polynomial p(r, s, t)
is bigger than some other symmetric polynomial q(r, s, t). Then we will look for an
algorithm that makes our polynomial smaller and smaller.
For a monomial m(r, s, t) = ari sj tk (with a 2 Z), we call its degree vector
(i, j, k). Thus, the monomial m(r, s, t) = r2 s has degree vector (2, 1, 0). We order
the degree vectors lexicographically reading from left to right. Thus
72
(3, 0, 0) > (2, 1, 1) > (2, 1, 0) > (2, 0, 7) > (0, 3, 5),
for example, i.e.,
si1 j sj2 k k
s3 has degree vector (i, j, k).
For example, if we want a symmetric polynomial with degree vector (2, 1, 0), then
we note that
p(r, s, t) = r2 s + r2 t + s2 r + s2 t + t2 r + t2 s.
If we let q(r, s, t) = p(r, s, t) s1 s2 , then we have succeeded in canceling the term
of highest degree out of p(r, s, t). Thus q(r, s, t) = 3rst has highest degree vector
(1, 1, 1) < (2, 1, 0). So we are making progress. In fact, in this case we are done,
since q(r, s, t) = 3s3 . So we have
n
of degree 2 . Then f (x) 2 R[x], i.e., all of the coefficients of f (x) are real numbers.
Proof. Let A = {aij : 1 i, j n, i 6= j}. Let Sn act on A by:
X
E2 := aij akl ,
(ij)6=(kl)
...
s 1 = r 1 + r2 + · · · + rn ,
X
s2 = ri r j ,
i6=j
...
s n = r 1 r 2 . . . rn .
But then, for each i, either si or si is a coefficient of the polynomial p(x) 2 R[x],
i.e., each si is a real number. Hence Z[s1 , s2 , . . . , sn ] is a subring of R. It follows
that each Ei is a real number. But
n n
f (x) = x( 2 ) E1 x ( 2 ) 1
+ · · · ± E( n ) .
2
We begin with a few easy reductions. We shall refer to the Fundamental Theorem
of Algebra as FTA.
Lemma 9.4. FTA is true provided that the following statement is true:
(*) Let f (x) be any monic polynomial in C[x] of degree n 1. Then there is at
least one complex number r with f (r) = 0.
Proof. Assuming the statement above, we shall prove FTA by induction on the
degree n of p(x). If n = 1, then p(x) = ax + b for some a, b 2 C, and so
b
p(x) = a(x ( )).
a
Thus we are done, taking c = a and r1 = ab .
Now suppose FTA is true for polynomials of degree n, and let p(x) 2 C[x] have
degree n + 1. Write
f (x) = xn+1 + bn xn + · · · + b1 x + b0 .
By assertion (*), there exists a complex number r with f (r) = 0. Then by Descartes’
Factor Theorem,
f (x) = (x r)g(x),
g(x) = (x r1 )(x r2 ) . . . (x rn ).
Thus
Thus p(x) has a factorization as claimed in FTA. Hence FTA is true, provided that
(*) is true.
⇤
The next step is to reduce (*) to the case when f (x) has real coefficients.
75
Lemma 9.5. (*) is true if and only if the following statement is true:
(**) Let f (x) be any monic polynomial in R[x] of degree n 1. Then there is at
least one complex number r with f (r) = 0.
Proof. It is clear that (*) implies (**). Suppose now that (**) is true, and let f (x)
be a monic polynomial in C[x] of degree n 1. Write
f (x) = xn + an 1x
n 1
+ · · · + a1 x + a0 .
f (x) = xn + an 1x
n 1
+ · · · + a1 x + a0 .
Thus g(x) 2 R[x]. Hence by (**), there exists r 2 C with g(r) = 0. Thus
Since C is a domain, either f (r) = 0 or f (r) = 0. If f (r) = 0, then (*) holds and
we are done. Suppose, then, that f (r) = 0, i.e.
r n + an 1r
n 1
+ · · · + a1 r + a0 = 0.
r n + an 1r
n 1
+ · · · + a1 r + a0 = 0.
Hence f (r) = 0, and again (*) holds, and we are done in this case as well.
⇤
Finally, we have reached the heart of the problem. We must show that every
non-constant monic polynomial with real coefficients has at least one complex root.
The application of the Intermediate Value Theorem which we need is the follow-
ing corollary.
76
Corollary 9.6. Let f (x) be a monic polynomial in R[x] of odd degree. Then there
is at least one real number r with f (r) = 0.
Idea of Proof. Let f (x) = xn + an 1x
n 1
+ · · · + a1 x + a0 2 R[x]. Then
1 1 1
f (x) = xn · (1 + an 1 + · · · + a1 n 1
+ a0 ).
x x xn
Let M = max0i<n |ai | and choose x such that |x| > max(1, nM ). Then |xk |
|x| > nM for all k 1. So
|an k | M 1
< = for 0 k < n.
|xk | nM n
Hence
Hence
1 1 1
1 + an 1 + · · · + a1 n 1
+ a0 > 0.
x x xn
Hence f (x) > 0 for all x > max(1, nM ), and f (x) < 0 for all x < min( 1, nM ).
It follows from the Intermediate Value Theorem that there exists at least one real
number r with f (r) = 0, as claimed.
⇤
Now finally we arrive at Euler’s brilliant idea. He would like to prove the Funda-
mental Theorem of Algebra by induction. But rather than trying to use the obvious
induction on the degree of f (x), he uses induction on the 2-part of the degree of
f (x).
Theorem 9.7. Let f (x) be a monic polynomial in R[x] of degree n > 0. Then
there is at least one complex number r with f (r) = 0.
Euler’s Proof. Write n = 2m · n1 , where n1 is odd. The proof is by mathematical
induction on m for m 0. If m = 0, then n = n1 is odd, and we are done by
Corollary 9.6. Hence we may assume that m > 0 and that the theorem is true for
all monic polynomials in R[x] of degree k = 2m 1 · k1 with k1 odd.
Let E be a splitting field for f (x) containing C. Let r1 , r2 , . . . , rn be the roots of
f (x) in E. Our goal is to show that at least one of these roots is in the subfield C.
We construct an infinite family of new polynomials g1 (x), g2 (x), . . . , one for each
natural number, by the following rule:
Y
gc (x) = (x (ri + rj + cri rj )).
1i<jn
By Lemma 9.3, gc (x) 2 R[x] for all c 2 N. Moreover, the degree of gc (x) is
✓ ◆
n n(n 1) 2m · n1 · (n 1)
= = = 2m 1 · n1 (n 1),
2 2 2
77
with n1 (n 1) odd. Hence, by the inductive hypothesis, each gc (x) has at least
one complex root. In other words, for each c 2 N, there exists a pair {i, j} with
1 i 6= j n, such that
ri + rj + cri rj 2 C.
Since there are infinitely many natural numbers, it follows by the Pigeonhole Prin-
ciple that there exist two di↵erent natural numbers c and d for which the same
choice of {i, j} yields a complex number. In other words, both
a := ri + rj + cri rj 2 C
and
b := ri + rj + dri rj 2 C.
Then
a b
ri r j = := C 2 C,
c d
and
a b
r i + rj = a cri rj = a c := B 2 C.
c d
But then
Now the Quadratic Formula is valid for polynomials with complex coefficients. Let
denote one of the two complex square roots of B 2 4C, as given by DeMoivre’s
Formula. Then the roots of q(x) are:
B+ B
ri = 2 C and rj = 2 C.
2 2
Thus ri and rj are two conjugate complex numbers which are roots of the polyno-
mial f (x), completing the proof.
⇤
And indeed, by combining Lemmas 9.4 and 9.5, Corollary 9.6, and Theorem 9.7,
we have completed Euler’s beautiful proof of the Fundamental Theorem of Algebra.
78
Exercises
1. Prove Theorem 9.2. [Hint: For f = f (r1 , r2 , . . . , rn ) 2 Z[r1 , r2 , . . . , rn ] and for
2 Sn , let (f ) := f (r (1) , r (2) , . . . , r (n) ). You may use the facts that (f + g) =
(f ) + (g) and (f · g) = (f ) · (g).]
2a. Express r2 +s2 +t2 as a polynomial in the elementary symmetric polynomials
s 1 , s 2 , s3 .
b. Do the same for r3 + s3 + t3 .
3. Express r12 + r22 + r32 + r42 as a polynomial in the four elementary symmetric
polynomials si (r, s, t, u), 1 i 4.
4a. Using the Fundamental Theorem of Algebra, prove that every polynomial in
R[x] can be factored into a product of polynomials of degree 1 or 2, each in R[x].
b. Give necessary and sufficient conditions for a polynomial p(x) = an xn + · · · +
a1 x + a0 2 R[x] to be irreducible.
5. Give an example of a quadratic polynomial in Q[x] which is irreducible in
Q[x], but is not irreducible in R[x].
6a. Prove: Counting multiplicity, a polynomial of even degree in R[x] has an
even number of real roots.
b. Give an example of a quartic polynomial p(x) in R[x] having no real roots.
Factor p(x) as a product of two quadratic polynomials in R[x].
7. Let f (r1 , r2 , . . . , rn ) 2 Q[r1 , r2 , . . . , rn ]. We call f (r1 , r2 , . . . , rn ) an alternat-
ing polynomial if the Sn -orbit containing f (r1 , r2 , . . . , rn ) has cardinality 2, i.e.,
there exists a polynomial g(r1 , r2 , . . . , rn ) 2 Q[r1 , r2 , . . . , rn ] such that, for every
2 Sn , either
⌧ ( (r1 , r2 , . . . , rn )) = (r1 , r2 , . . . , rn ).
= ⌧ 1 · ⌧ 2 · . . . · ⌧ m = t1 · t 2 · . . . · t r ,
where each ⌧i and each tj is a transposition. Then m is even if and only if r is even.
[Hint: Use (c).]
e. Let An be the set of all permutations 2 Sn such that is expressible as a
product of an even number of transpositions. Prove: An is a subgroup of Sn . [An
is called the alternating group on n letters.]
n!
f. Prove: |An | = 2 .
80
x2 + bx + c = 0.
If we call the roots r and s, then we have
f (r, s) = r
gets permuted to the expression
g(r, s) = s
by the permutation
(r, s) = r s.
This polynomial is called the discriminant polynomial. It has the interesting
property that:
⌧ ( 2) = ( )2 = 2
,
i.e., 2 is a symmetric polynomial in r and s. Hence, by Waring’s Theorem, 2 is
expressible as a polynomial in the elementary symmetric polynomials b = r + s and
c = rs. Of course, it is easy to compute explicitly:
2
= (r s)2 = (r + s)2 4rs = ( b)2 4c = b2 4c.
Hence
p
= b2 4c
for a suitable choice of square root. Of course, we can now find r and s by solving
the system of linear equations:
(1) r + s = pb
(2) r s = b2 4c
Thus we recover the Quadratic Formula. Now let’s try to apply the same rea-
soning to the cubic equation:
x3 + px q = 0.
Again set
= (r, s, t) := r + !s + ! 2 t,
p p
where 1, ! = 12 + 23 i and ! 2 = 12 3
2 i are the three cube roots of 1, and r,
3
s, t are the three roots of f (x) := x + px q. An expression of this form is called
a Lagrange resolvent for the cubic f (x).
Now, (r, s, t) takes six values under the action of Sym({r, s, t}), suggesting why
Cardano ended up with an equation of degree 6 in attempting to solve the cubic.
If we set
µ = µ(r, s, t) := !r + s + ! 2 t,
then the six values taken by are
, ! · , ! 2 · , µ, ! · µ, ! 2 · µ.
Since ! 3 = (! 2 )3 = 1, the function 3
takes only two values under permutation:
3
and µ3 .
Thus, the set { 3 , µ3 } is a Sym({r, s, t}) orbit of size 2 on the set C[r, s, t]. We
leave as an exercise, the following corollary:
82
3 3
( )= and (µ3 ) = µ3 .
3
If has order 2, then interchanges and µ3 .
In any case, it follows that every permutation in Sym({r, s, t}) fixes both
3
+ µ3 and 3
· µ3 ,
i.e. these are symmetric polynomials in r, s, t. As a somewhat tedious exercise, you
will be asked to write out 3 + µ3 explicitly as a polynomial in r, s, t, and then to
express it as a polynomial in the three elementary symmetric functions in r, s, t.
Note that, in this case,
(1) s1 = r + s + t = 0,
(2) s2 = rs + rt + st = p, and
(3) s3 = rst = q.
3
Now and µ3 are the two roots of the quadratic polynomial
q(x) := x2 ( 3
+ µ3 )x + 3
· µ3 .
Hence, using the Quadratic Formula, we could explicitly solve for 3 and µ3 in
terms of p and q. Then, by taking cube roots, we could find and µ. Finally, we
end up with a system of three linear equations in the three unknowns r, s, and t:
r+s+t=0
r + !s + ! 2 t =
!r + s + ! 2 t = µ
to be solved in order to find r, s, and t. We leave as an exercise for you to verify
that the coefficient matrix
0 1
1 1 1
@1 ! !2 A
! 1 !2
is invertible, and hence the system has a unique solution.
Lagrange further extended these ideas to explain the solution of the quartic
equation. We give a brief description in the general spirit of his work. Consider the
quartic
f (x) = x4 + ax2 + bx + c.
Let the roots of f (x) be r1 , r2 , r3 , and r4 . Consider the elements
✓1 = (r1 + r2 )(r3 + r4 )
✓2 = (r1 + r3 )(r2 + r4 )
✓3 = (r1 + r4 )(r2 + r3 )
83
T := {✓1 , ✓2 , ✓3 }
is a S := Sym({r1 , r2 , r3 , r4 }) orbit on Z[r1 , r2 , r3 , r4 ]. The kernel of the action of
S on this orbit is a normal Klein 4-subgroup V of S. It is very important, as we
shall see later, that V is a normal subgroup of S.
Since S permutes the set T , it follows that the elementary symmetric functions
in the ✓i ’s are fixed by all of the elements of S, and hence are expressible in terms
of the elementary symmetric functions in the roots r1 , r2 , r3 , r4 , i.e., in terms of the
coefficients a, b, c of f (x). In fact, computation shows that
✓1 + ✓2 + ✓3 = 2a
✓1 ✓2 + ✓1 ✓3 + ✓2 ✓3 = a2 4c
2
✓1 ✓2 ✓3 = b .
It follows that ✓1 , ✓2 , ✓3 are the roots of the resolvent cubic
q(x) := x2 0x + ✓1 = x2 + ✓1 .
Hence r1 + r2 and r3 + r4 are the two square roots of ✓1 . Similarly, r1 + r3 and
r2 + r4 are the two square roots of ✓2 ; and r1 + r4 and r2 + r3 are the two square
roots of ✓3 . Finally, as in the cubic case, one can solve a system of linear equations
to find the roots of f (x). For example,
1 p p p
r1 = ( ✓1 + ✓2 + ✓3 ).
2
Lagrange found himself unable to extend his methods to the case of the quintic
equation. There was a good reason for this failure, but it would only be clearly
elucidated 60 years later by Evariste Galois. However, Lagrange’s work was a failure
only in the sense that Columbus’ voyages were failures. Lagrange had touched upon
a new world: the world of groups.
Here is Lagrange’s formulation of the great theorem which has come to bear his
name.
Lagrange’s Theorem. Let f (r1 , r2 , . . . , rn ) be a polynomial in n commuting vari-
ables. The number of values taken by f under permutation of the variables must be
a divisor of n!.
In the exercises you will be asked to reformulate this theorem in modern language
and to explain why it is a corollary of Lagrange’s Orbit-Stabilizer Theorem, as
stated and proved in Chapter 4.
84
Exercises
1. Prove Lemma 10.1.
3
2a. Write out + µ3 explicitly as a polynomial in r, s, t.
3
b. Express + µ3 as a polynomial in the three elementary symmetric functions
in r, s, t.
3. Prove that the matrix
0 1
1 1 1
@1 ! !2 A
! 1 !2
is invertible.
4. Using the notation from the discussion of the quartic equation, prove that
the set T = {✓1 , ✓2 , ✓3 } is a Sym({r1 , r2 , r3 , r4 })-orbit on Z[r1 , r2 , r3 , r4 ], and prove
that the kernel of the action of Sym({r1 , r2 , r3 , r4 }) on T is the group
Gd = { 2 S4 : (fd ) = fd }.
k k k
for some k, is a p-cycle with (1) = 2. Renumber so that (i) = i + 1 for all
i, 1 i < p.]
9. Let G be a group and H a subgroup of G with (G : H) = n. Let
X = {g1 H = H, g2 H, . . . , gn H}
be the set of all left cosets of H in G. For each g 2 G, define the function g :X!
X by
Galois realized that Gauss was on the right track . When considering a specific
polynomial p(x), one should not treat its roots as “indeterminates”– r1 , r2 , . . . , rn
– and indiscriminantly apply every possible permutation in Sn . Rather, one should
remember the algebraic relationships among the roots and apply only those permu-
tations which respect those relationships. [Of course, Lagrange was right too. He
was looking for a general formula valid for all equations, not a specific formula for
a specific equation.]
Definition 11.1. Let F be a subfield of C and let E be the splitting field over F of
the polynomial p(x) 2 F [x]. The Galois group of p(x) over F , Gal(E/F ), is the
group of all 2 Aut(E) such that (x) = x for all x 2 F . We call E/F a Galois
extension. As noted, we shall assume without further comment that F ✓ E ✓ C.
Thus, Gal(E/F ) = Aut(E/F ) for the special case when E is a splitting field over
F . In particular, in the important case when F = Q, we simply have Gal(E/Q) =
Aut(E), since every automorphism of E fixes every rational number. We leave the
proof of the following fact as an exercise.
Other than the identity function, it is not obvious that there are any Galois
automorphisms of E/F . The remarkable fact is that there are quite a few. This is
the content of the following converse of Theorem 11.2.
87
Theorem 11.3. Let E/F be a Galois extension. Let a 2 E with minimum poly-
nomial p(x) 2 F [x]. Let b be any root of p(x). Then b 2 E and there exists
2 Gal(E/F ) with (a) = b.
Let a be a root of the irreducible polynomial p(x) 2 F [x], and let b be a root of the
irreducible polynomial h̃(p(x)) 2 F 0 [x]. Then there is an isomorphism h⇤ : F (a) !
F 0 (b) such that
(1) h⇤ (c) = h(c) for all c 2 F ; and
(2) h⇤ (a) = b.
Proof. Set p0 (x) = h̃(p(x)) 2 F 0 [x]. By the construction of extension fields, there
are isomorphisms:
f : F [x]/(p(x)) ! F (a)
and
g : F 0 [x]/(p0 (x)) ! F (b),
given by
and
g(c0 + (p0 (x))) = c0 for all c0 2 F 0 , and g(x + (p0 (x))) = b.
The ring isomorphism h̃ : F [x] ! F 0 [x] maps the principal ideal (p(x)) to the
principal ideal (p0 (x)), and so there is an induced isomorphism
such that
ĥ(c + (p(x))) = h(c) + (p0 (x)) for all c 2 F , and ĥ(x + p(x)) = x + (p0 (x)).
⇤
88
Theorem 11.5. Let E be the splitting field over F of the polynomial p(x) 2 F [x].
Let E 0 be another subfield of C containing F , and let h : E ! E 0 be an isomorphism
of fields satisfying:
{↵1 , ↵2 , . . . , ↵n }
be the set of all roots of p(x). (Note: p(x) may have repeated roots. So there may
be redundancies on this list.) Then, for all i,
an · ↵in + · · · + a1 · ↵i + a0 = 0.
Since ↵i 2 E for all i, we may apply h to this equation, yielding:
an · h(↵i )n + · · · + a1 · h(↵i ) + a0 = 0.
Thus h(↵i ) is also a root of p(x), for all i. In other words, since h is an injective
function, h permutes the roots of p(x), as claimed. In particular, h(↵i ) 2 E for all
i. However, since E is the splitting field for p(x) over F , E = F (↵1 , ↵2 , . . . , ↵n ).
Hence, since h(F ) = F , we have that h(E) ✓ E, i.e., E 0 ✓ E. Note: Since E 0 is
an infinite set, this does not immediately guarantee that E 0 = E. However, in this
case, h : E ! E 0 is, in particular, an isomorphism of F -vector spaces. Since E is
finite-dimensional as an F -vector space, it now follows that E 0 = E, as desired.
Now we are ready to prove the Main Theorem on Galois automorphisms.
Theorem 11.6. Let E/F be a Galois extension. Let L be any subfield of E and
let L0 be another subfield of C containing F . Suppose that g : L ! L0 is an
isomorphism of fields satisfying:
Q = K 0 ✓ K 1 ✓ · · · ✓ Kn
such that ↵ 2 Kn and (Ki+1 : Ki ) = 2 for all i. Now p(x) 2 Kn [x]. Let E be a
splitting field for p(x) over Kn . Let be any root of p(x) in E. Then Q(↵) ⇠
= Q( )
and so, by Theorem 11.3, there exists 2 Gal(E/Q) = Aut(E) such that (↵) = .
Since Ki ✓ E for all i, we may apply to the tower above to get a new tower of
fields:
Q = L0 ✓ L1 = (K1 ) ✓ · · · ✓ Ln = (Kn )
such that = (↵) 2 Ln and (Li+1 : Li ) = 2 for all i. Hence is constructible, as
claimed.
⇤
Exercises
1. Prove Theorem 11.2.
2. Let E be the splitting field of p(x) = (x2 2)(x3 1) over Q.
a. Prove: (E : Q) = 4.
b. Prove that q(x) = x2 2 remains irreducible over Q(!), the splitting field of
x3
1 over Q.
c. Prove that Gal(E/Q) is a noncyclic group of cardinality 4.
d. Give three di↵erent subfields of E, each of degree 2 over Q.
3. Let E be as in Exercise 2. Prove that E is also the splitting field of f (x) =
(x2 + 3)(x2 4x + 2) over Q.
4. Find the splitting field and Galois group of g(x) = x3 5 over Q.
90
5. Find the splitting field and Galois group for h(x) = x4 2x2 + 9 over Q.
6. Let L be the splitting field of k(x) = x4 2 over Q.
a. Prove: (L : Q) = 8.
b. Prove that Gal(L/Q) is a subgroup of S4 isomorphic to D4 .
91
F = {K : F ✓ K ✓ E}
be the set of all subfields of E containing the field F . We let
G = {H : H ✓ G}
be the set of all subgroups of G. Recall that if H is a subgroup of G, we define
Our goal is to show that these two functions are inverses of each other and define
a one-to-one correspondence between the fields in F and the groups in G.
We leave the following theorem as an exercise. It is an easy corollary of Theorem
12.1.
Theorem 12.2. For all K 2 F, K = ⇥( (K)), i.e.,
K = E Gal(E/K) .
In particular, ⇥ : G ! F is a surjective map, and so
H = Gal(E/E H ).
Note that
H ✓ Gal(E/E H ) ✓ G.
We need to verify that Gal(E/E H ) is not bigger than it “should be”. This will
follows from the fundamental Primitive Element Theorem of Galois.
Primitive Element Theorem. Let E/F be a Galois extension. There exists
↵ 2 E such that E = F (↵).
We call ↵ a primitive element of E/F . The Primitive Element Theorem is an
immediate corollary of the following linear algebra fact.
Theorem 12.3. Let V be a finite-dimensional vector space over an infinite field
F . Then V is not the union of any finite collection of proper F -subspaces of V .
This is intuitively obvious. No finite set of lines completely covers R2 . No finite
set of planes completely covers R3 .
Proof. Let dimF (V ) = n. We shall prove the theorem by induction on n. If n = 1,
then the only proper subspace of V is {0}, and the theorem is obvious. Henceforth
assume n 2.
We call a subspace H of V a hyperplane if dimF (H) = n 1. First we argue
that V contains infinitely many hyperplanes. Let B = {e1 , e2 , . . . en 1 , en } be an
F -basis for V . For ↵ 2 F , let H↵ be the subspace of V spanned by
93
B↵ := {e1 , e2 , . . . , en 1 + ↵ · en }.
en 1 + · en = c1 · e1 + c2 · e2 + · · · + cn 1 · (en 1 + ↵ · en ).
Thus
c1 · e1 + c2 · e2 + · · · + (cn 1 1) · en 1 + (cn 1↵ ) · en = 0.
Since B is a linearly independent set, it follows first that cn 1 = 1 and then that
↵ = . Thus
H↵ = H if and only if ↵ = .
H = {H1 , H2 , . . . , Hr }
V = H1 [ H2 [ · · · [ Hr .
H = (H \ H1 ) [ (H \ H2 ) [ · · · [ (H \ Hr ),
|Gal(E/F )| n = (E : F ).
On the other hand, by the Main Theorem on Galois Automorphisms, for every
root of p(x), there exists one (and only one) Galois automorphism 2 Gal(E/F )
with (↵) = . Thus
|Gal(E/F )| n = (E : F ).
Hence equality holds, as claimed.
⇤
Corollary 12.5. Let H be any subgroup of G := Gal(E/F ). Then H = Gal(E/E H ).
Proof. Let K = E H and let
|H ⇤ | = (E : K).
We must show that (E : K) |H|.
Let H = {h1 , h2 , . . . , hm }, with h1 the identity automorphism. Let ↵ be a
primitive element of E/K. Set
Thus and ⇥ are inverses of each other. Hence both and ⇥ are bijections
between the sets F and G. This completes the proof of the Fundamental Theorem
of Galois Theory.
The Fundamental Theorem of Galois Theory. Let F ✓ E ✓ C with E/F a
Galois extension of fields. Then the correspondence
K = E H () H = Gal(E/K)
defines a one-to-one inclusion-reversing correspondence between the subfields of E
containing F and the subgroups of Gal(E/F ). Moreover,
|H| = (E : E H )
for every subgroup H of Gal(E/F ).
We apply this theorem to obtain a necessary and sufficient condition for a com-
plex number to be constructible. First we need a general fact about finite p-groups.
Theorem 12.6. Let p be a prime number and let G be a finite group with |G| = pn
for some n 2 N. Then there is a tower of subgroups
G = G0 ◆ G1 ◆ G2 ◆ · · · ◆ Gn = {e},
where (Gi : Gi+1 ) = p for all i, and e is the identity element of G.
The proof of Theorem 12.6 will require us to develop a little more basic group
theory. Back in Math 4580, we defined the relation of conjugacy on a group G. We
recall this definition now.
Definition 12.7. Let G be a group. We say that two elements x and y of G are
conjugate if there exists an element g 2 G such that y = gxg 1 . We shall write
x ⇠ y to denote the fact that x is conjugate to y.
We leave the following fact as an exercise.
Lemma 12.8. The relation of conjugacy is an equivalence relation on a group G.
We refer to the equivalence classes under this relation as conjugacy classes.
Thus the conjugacy classes of G define a partition of G into disjoint subsets. We
can think of this in a more sophisticated way in terms of group actions.
Lemma 12.9. Let G be a group. Then G acts as a group of functions on the set
G via:
1
g(x) = gxg for all x 2 G.
96
(gh)(x) = g(h(x))
for all g and h in the group G and all x in the set G:
1 1 1
(gh)(x) = (gh)x(gh) = g(hxh )g = g(h(x)),
as claimed.
⇤
Notice that the conjugacy classes of G are the G-orbits on the set G under the
conjugation action. We introduce the following notation.
Definition 12.10. Let G be a group and let x be an element of G. Then the
centralizer in G of x is the set
Note that (b) implies, in particular that |Ci | divides |G| for all i.
The displayed equation is usually referred to as Cauchy’s Class Equation. It
has the following important corollary.
Theorem 12.12. Let p be a prime and let G be a finite group with |G| = pn for
some n 2 N. Then Z(G) 6= {1}. In particular, G has a normal subgroup N with
|N | = p.
Proof. By Theorem 12.11, if Ci is a conjugacy class of G, then |Ci | divides pn .
Hence either |Ci | = 1 or p divides |Ci |. Of course, p divides |G|. Thus in the
notation of Cauchy’s Class Equation, p divides |Ci | for all i with 1 i r. Hence
p divides
N = {1, z, z 2 , . . . , z p 1
}.
Let g 2 G. Then, since N ✓ Z(G),
g · zi · g 1
=g·g 1
· z i = 1 · z i = z i 2 N,
for all i. Thus N is a normal subgroup of G with |N | = p.
⇤
We would like to produce a tower of subgroups
1 = N0 ✓ N = N1 ✓ · · · ✓ Nn = G
with |Ni | = pi for all i. This will follow easily by induction once we generalize the
quotient group construction.
Definition 12.13. Let G be a group and let N be a normal subgroup of G. We
define a quotient group G/N as follows. The set G/N is the set of all cosets
g · N for g 2 G. Multiplication in G/N is defined by the rule:
(g · N ) · (g1 · N ) = (g · g1 ) · N
for all g, g1 2 G.
Note that we do not have to specify left or right cosets, since the condition that
N is a normal subgroup of G is equivalent to the assertion
g · N = N · g for all g 2 G.
As usual, we must verify that the multiplication operation is well-defined. This
follows directly from the equation above and the Associative Law:
(g · N ) · (g1 · N ) = g · (N · g1 ) · N = g · (g1 · N ) · N = (g · g1 ) · (N · N ) = (g · g1 ) · N,
H := {h 2 G : f (h) 2 H/N },
then H is a subgroup of G. Moreover if |G| < 1, then |H| = |N | · |H/N |.
Proof. We leave as an exercise to show that f is a surjective group homomorphism.
Suppose that H/N is a subgroup of G/N and H is defined as above. Since H/N is
a group, the identity coset 1 · N is in H/N . Hence 1 2 H. Suppose that h, h1 2 H.
Then h · N 2 H/N and h1 · N 2 H/N . Hence h 1 · N = (h · N ) 1 2 H/N and
(h · h1 ) · N = (h · N ) · (h1 · N ) 2 H/N.
Hence h 1 2 H and h · h1 2 H. Thus H is indeed a subgroup of G. Moreover,
if G is finite, then H is the union of |H/N | cosets of N . Each of these cosets has
cardinality |N |. Hence
|H| = |H/N | · |N |,
as claimed.
⇤
We can now easily establish the existence of the desired tower of subgroups in a
finite p-group. The following result is a bit stronger than what we need for Theorem
12.17, but we will use it again in Corollary 12.18.
Theorem 12.16. Let p be a prime and let G be a finite group with |G| = pn for
some integer n 0. (We call such a group a finite p-group.) Let H be a subgroup
of G with |H| = pm . Then there is a tower of subgroups
H = H0 ✓ H1 ✓ · · · ✓ Hn m =G
with |Hi | = pm+i for all i, 0 i n.
Proof. We proceed by induction on k := n m. The result is trivial if n m = 0.
Suppose then that the result is true for k = n m 1, and that |G| = pn and
|H| = pm . By Theorem 12.12, G has a normal subgroup N with |N | = p. If N is
not contained in H, then H ✓ N H with |N H| = pm+1 . Since by induction, the
result is true for k = n m 1 = n (m + 1), it follows that there is a tower of
subgroups
N H = H 1 ✓ · · · ✓ Hn m =G
m+1
with |Hi | = p . Then taking H := H0 , we are done.
Hence, we may assume that N ✓ H. Let G = G/N . By induction, since
(G : H) = pn m 1 , there is a tower of subgroups
H0 = H ✓ H 1 ✓ · · · ✓ H n m =G
99
|Hi | = |H i | · |N | = pi 1
· p = pi ,
for all i, 0 i n. Clearly, H0 = H and Hi ✓ Hi+1 for all i. Hence these groups
provide the desired tower of subgroups of G.
⇤
We can now complete our characterization of constructible numbers.
Theorem 12.17. Let ↵ 2 C. Let f (x) 2 Q[x] be the minimum polynomial of ↵
over Q. Let E be the splitting field of f (x) over Q, and let G = Gal(E/Q). Then
↵ is constructible if and only if |G| is a power of 2.
Proof. Suppose first that ↵ is constructible. We have seen earlier that then every
root of f (x) is constructible. Hence by combining towers of fields, we can achieve
a tower
Q = K0 ✓ K1 ✓ K 2 ✓ · · · ✓ Kn
Q = E0 ✓ E1 ✓ E 2 ✓ · · · ✓ En = E
with (Ei+1 : Ei ) = 2 for all i. By the Galois Correspondence Theorem, this is true
if and only if there is a tower of subgroups
G = G0 ◆ G1 ◆ G2 ◆ · · · ◆ Gn = {I}
with (Gi : Gi+1 ) = 2. Since |G| is a power of the prime 2, this is immediate from
Theorem 12.16.
⇤
Using Theorems 12.16 and 12.17, we can obtain the following sharpened purely
field-theoretic criterion for a number to be constructible.
Corollary 12.18. Let ↵ 2 C with (Q(↵) : Q) = 2n . Then ↵ is constructible if and
only if there exists a tower of subfields of C:
Q = F0 ✓ F1 ✓ · · · ✓ Fn = Q(↵)
H = H0 ✓ H1 ✓ · · · ✓ Hn = G
with (G : Hi ) = 2 . Let Fi := E Hi . Again, by the Galois Correspondence
n i
Fn = E G = Q ✓ Fn 1 ✓ · · · ✓ F1 ✓ F0 = E H = Q(↵)
with (Fi+1 : Fi ) = 2 for all i, as claimed.
⇤
Corollary 12.18 does not seem very di↵erent from Theorem 13.16 in the Math
4580 text. However, without Galois Theory, it is not clear that Corollary 12.18
can be deduced directly from the earlier Theorem 13.16, even in the following very
elementary case. Suppose that ↵ is a constructible number which is the root of a
quartic irreducible polynomial p(x) 2 Q[x]. Let F = Q(↵) and suppose that E is
the splitting field for p(x) over Q with (E : Q) = 8. Suppose you know that there
is a tower of fields:
Q = E0 ✓ E 1 ✓ E 2 ✓ E 3 = E
with (Ei : Q) = 2i . Show that there exists a subfield F1 of F with (F1 : Q) = 2. I
don’t know how to do this without using Galois Theory.
We now discuss a way to show that there exist nonconstructible numbers whose
minimum polynomial over Q has degree a power of 2, specifically 4. Going back to
Lagrange’s analysis of the quartic polynomial, we recall that if
h · g = g · h1 .
Then
E N = F (↵1 , . . . , ↵r ),
102
and let mi (x) 2 F [x] be the minimum polynomial of ↵i over F . If i is any root
of mi (x), then i 2 E and there exists gi 2 G with gi (↵i ) = i . But gi (↵i ) 2 E H .
Hence i 2 E H . In other words, E H contains all of the roots of mi (x) for all i,
1 i r. Thus E H is the splitting field of m(x) = m1 (x) · . . . mr (x) 2 F [x], i.e.,
E H /F is a Galois extension of fields.
Galois’ work finally clarified the question of when a polynomial equation can
be solved by a process involving only addition, subtraction, multiplication division,
and extraction of roots. The fact that this was impossible for the general polynomial
equation of degree n 5 had been established a bit earlier by Ruffini and Abel.
Speaking informally, what Galois showed was the following: Suppose p(x) is a
polynomial with rational coefficients, having splitting field E/Q. The problem of
finding the roots of this polynomial can be reduced to the problem of finding roots of
polynomials of lower degree if and only if there exists a Galois extension F/Q with
F a proper subfield of E. If there is such a subfield, then one can first try to solve
the polynomial equation f (x) for which F/Q is the splitting field. Next one can
try to solve the polynomial equation g(x) 2 F [x] for which E is the splitting field.
Since (F : Q) < (E : Q) and (E : F ) < (E : Q), the problem has been reduced to
two smaller problems. By the fundamental Galois Correspondence Theorem, this is
possible if and only if the Galois group G := Gal(E/Q) contains a proper normal
subgroup, i.e., a normal subgroup N with N 6= {1} and N 6= G. Then one can
choose F to be E N .
A polynomial equation is solvable by radicals, i.e. its solutions can be found
using only addition, subtraction, multiplication, division, and extraction of roots,
if and only if this process can continue to be refined until one finally reaches field
extensions Fi+1 /Fi , all of prime degree, as Gauss did in his reduction of the cyclo-
tomic polynomials. Galois’ fundamental result in this context requires the following
definition.
Definition 12.20. A group G is solvable if there is a tower of normal subgroups
G = G0 ◆ G1 ◆ G2 ◆ · · · ◆ Gn = {1}
such that each quotient group Gi /Gi+1 is an abelian group.
Theorem. Let p(x) 2 Q[x]. Then p(x) can be solved by a process involving only
addition, subtraction, multiplication, division, and extraction of roots if and only if
the Galois group of p(x) is a solvable group.
It is possible to show that most polynomial equations of degree n have Galois
group Sn , the full symmetric group on n letters. Since Sn is not solvable for n 5,
it follows that most polynomials of degree at least 5 are not “solvable by radicals”.
Galois’ work to a large extent closed the book on the subject of finding algebraic
algorithms for solving general polynomial equations of degree greater than 4. How-
ever, other mathematicians, notably Leopold Kronecker, pursued this algorithmic
question much more deeply, in the case of polynomials with solvable Galois group.
Even more importantly, Galois’ work opened the book of group theory and more
generally, in conjunction with the great work of his predecessors such as Lagrange
and Gauss, opened a vast and fascinating book of abstract mathematical structures
(like groups, rings, and fields) and the “Galois correspondences”which link them.
103
Exercises
1. Prove Theorem 12.2.
2. Let G be a group. Define the conjugacy relation xỹ on G by:
1
xỹ if and only if y = gxg for some g 2 G.
Prove: The conjugacy relation is an equivalence relation on G.
3. Prove Theorem 12.11.
4. Let G be a group and let N be a normal subgroup of G. a. Prove: G/N is a
group (as defined in Definition 12.13).
|G|
b. Prove: If |G| < 1, then |G/N | = |N | .
5. Let H be a finite group with |H| = pa · m, with p and prime and with
gcd(p, m) = 1. Suppose that H has a normal subgroup P with |P | = pa . a. Prove:
If ↵ 2 Aut(H), then ↵(P ) = P .
b. Prove: If H is a normal subgroup of a (larger) group G, then P is also a
normal subgroup of G.
6. Verify that the function f defined in Lemma 12.15 is a surjective group
homomorphism.
7a. Prove: If G is a finite group with |G| even, then G contains an element g of
order 2.
b. Prove: Suppose A is an abelian group with |A| = 2a · p1 · p2 · . . . · pr , where
the pi are distinct odd primes. Then A has a subgroup B with either |B| = 2a or
with |B| = pi for some i.
c. Recall from Corollary 4.8 that two elements of S5 are conjugate if and only if
they have the same cycle structure. Use this to list all the conjugacy classes of S5
and their sizes.
d. Prove: S5 is not a solvable group. [Hint: Suppose that S5 is a solvable group.
Use (b) to argue that S5 must have a normal subgroup B with |B| = 2, 3, 4, 5, or
8. Now use (c) to derive a contradiction.
8. For each of the equations listed below, determine the Galois group over Q of
the splitting field of the equation. List all of the subgroups of the Galois group.
104
List all of the subfields of the splitting field of the equation, and draw a diagram
illustrating the Galois correspondence between subgroups and subfields for each
example.
a. (x2 + 1)(x2 2)
2
p b. (x 2)(x2 3)(x2 p+ 1) (Note: You must prove by explicit calculation that
3 is not contained in Q[ 2].)
c. x3 2
d. x7 1
4
e. x 3
f. x11 1
9. For each finite group G with |G| 7, give an example of an equation whose
Galois group over Q is isomorphic to G.
10. Let p(x) = x4 + x + 1. Let E be the splitting field for p(x) over Q. a. Find
the resolvent cubic R(x).
b. Prove that R(x) is irreducible over Q.
c. Prove that (E : Q) = 12 or 24.
d. Prove: Gal(E/Q) ⇠
= A4 or S4 .
e. If p(x) = (x2 + ax + b)(x2 + cx + d), verify the calculations on page 100 which
show that a2 is a root of the cubic polynomial r(x) = x3 4x 1.
f. Prove: r(x) = x3 4x 1 is irreducible in Q[x].
g. Explain why (Q(a2 ) : Q) = 3 and (F : Q) 2 combine to give a contradiction
to the assumed existence of the field F .
105
INDEX
Lagrange resolvent 81
Lagrange’s Theorem 37, 82
linear combination 5
linear independence 6
linear operator 11
linear transformation 6
Main Theorem on Galois Automorphisms 88
minimum polynomial of a complex number 66
normal subgroup 18, 101
octahedron 41–42
orbit 28
Orbit Counting Formula 45
orthogonal basis 22
orthogonal group, O(n) 23
orthogonal matrix 22
orthogonal operator 22
orthogonal vectors 22
permutation matrix 24
p-group (finite) 98
Primitive Element Theorem 92–93
projection operator 19
quotient group 97
regular polyhedron 39
resolvent cubic 83
scalars 4
similar matrices 18
simple group 44
singular matrix 15
SL(n, R) 20
solvability by radicals 102
solvable group 102
spanning set 5
special orthogonal group, SO(n) 23
splitting field 66
subspace 5
symmetric polynomial 70
tetrahedron 39–41
transcendental number 67
transitive group action 37
vector space 4
Waring’s Theorem 71