

An Introduction to Algebraic Number Theory through

Olympiad Problems

Elias Caeiro
Foreword

Here is a quick summary of how this book came to be. In July 2019, I attended a class by Gabriel
Dospinescu where he presented a bit of algebraic number theory (roughly chapter 1 of this book).
Amazed by what I had seen, I started reading a bit of Ireland-Rosen [15] and in late 2019, the thought
of writing a handout inspired by the content of Gabriel’s class for the website https://mathraining.be
crossed my mind. I submitted a first version in July 2020. Then, in March 2021, I had to make a
few corrections. At that time, I knew significantly more than when I first wrote it, so I realised while
doing these corrections that there was so much more that I wanted to add but couldn’t (due to lack of
space). This became the present book, which I wrote during the summer of 2021, the summer of my last year of high
school.

This book is intended to serve as a transition from olympiads to higher mathematics, for high
school students who are interested in learning more advanced theory but find regular textbooks too
different from olympiads. I stress that this book is not an efficient way to prepare for
mathematical olympiads.¹ As such, there is hardly any prerequisite², apart from some amount
of (olympiad) mathematical maturity. Accordingly, there is an appendix providing background on
polynomials at the end of the book. Most of the content of the first section should be familiar to the
reader, but I still recommend skimming through it quickly to have a firm footing on the technicalities
(e.g. a polynomial is not a polynomial function).³ The second section of this appendix is dedicated
to introducing notions of abstract algebra: no theory will be introduced there, it serves both as a way
to explain what morphisms are and as a reference for the definitions of the algebraic structures which
will be used throughout this book (you should not try to remember the actual algebraic structures,
only the intuitive concept of a morphism).

I was aware of some excellent books on algebraic number theory (and related subjects) which helped
me visualise how I wanted this book to be: Andreescu and Dospinescu’s Problems from The Book [1]
(PFTB) and Straight from The Book (SFTB) [2], Ireland and Rosen’s A Classical Introduction to
Modern Number Theory [15], and Esmonde-Murty’s Problems in Algebraic Number Theory [27]. Here
is a small summary of these books: PFTB presents miscellaneous mathematical gems in an (advanced)
olympiad style, and SFTB has solutions to the first 12 chapters and amazing expositions of advanced
topics in addenda. Ireland-Rosen discusses a wide variety of number theoretic topics with an algebraic
flavor, and Esmonde-Murty is a classical first semester course in algebraic number theory but written
from a problem-solving oriented approach. The problems in Ireland-Rosen are generally easier than
the ones in PFTB, SFTB, and Esmonde-Murty.

As a consequence, I have tried to limit the intersection of the present book with these ones, since
the exposition there was already extraordinary. Therefore, I strongly encourage the reader to have a
look at them too. I particularly recommend the addenda 3B, 7A and 9B⁴ of SFTB, chapters 9 and
13 on algebraic number theory and the geometry of numbers of PFTB as well as the chapters 8 and
9 on Gauss and Jacobi sums and on cubic and biquadratic reciprocity of Ireland-Rosen as they are
particularly similar to the topics of this book, but of course one should read all of the chapters if
possible.
¹ Except maybe chapters 3 and 5 on cyclotomic polynomials and polynomial number theory.
² If I had to state them, maybe the Chinese remainder theorem, Fermat’s little theorem, modular arithmetic, complex
numbers, and the binomial expansion?
³ It doesn’t hurt to read it quickly even if you think you know everything: at best you learn something new, at worst
you lose a few minutes (and there are cool exercises!).
⁴ I have personally found 9A to be too dense for me (before reading Esmonde-Murty).


One could, for instance, read this book along with the addenda from SFTB, and then start
with Esmonde-Murty and Ireland-Rosen as they are a bit more abstract (although Ireland-Rosen starts
very gently). This choice is also motivated by the fact that Esmonde-Murty (and some chapters of
PFTB/SFTB) makes fair use of linear algebra. I have included an appendix on linear algebra for this reason,
and also because it is useful in a fair number of exercises. In a sense, this book can be thought
of as a prequel to Esmonde-Murty, and one should have (almost⁵) all the necessary prerequisites after
finishing it. In particular, at no point⁶ do I mention ideals, even though they are fundamental in
algebraic number theory. As a consequence, some problems which are solved by tricky uses of the
fundamental theorem of symmetric polynomials can be solved more easily with ideal theory. I hope
this will not affect the reader once they learn ideal theory.

I will now talk more about the book itself. Chapter 1, which roughly corresponds to the Mathraining
version of the book, starts with general definitions and properties of algebraic numbers. It also roughly
corresponds to the chapter of PFTB⁷. I don’t actually have much more to say briefly about the
chapters than what the table of contents does, so I will focus on the last two appendices, on symmetric
polynomials and linear algebra. Symmetric polynomials, and above all the fundamental theorem of
symmetric polynomials, are used everywhere in the book. The knowledge of the proofs of these results
however is not strictly required to progress through the book. I have thus arranged them in an appendix
which the reader can read when they want (personally I recommend doing so after reading the first
chapter).

Regarding the appendix on linear algebra, I would recommend reading it after chapter 1 too, but
since it is rather long, one can, say, read one section after each chapter. The first section on vector
spaces and bases is fundamental and is necessary for chapter 6 on field theory. I would also recommend
reading it before chapter 4 on finite fields since it is used to give a quick proof of the fact that the
cardinality of a finite field is a power of its characteristic. Section 2 on linear maps is less important,
but it cannot be skipped because this is where matrices are defined and where some properties of
matrices are established. Section 3 on determinants is extremely important and rather long; I suggest
first looking at applications of the determinant and then having a more careful look at its construction.
Section 4 uses the results of sections 2 and 3 to derive the formula for linear recurrences. Since many
exercises are about linear recurrences, this has to be seen in the beginning, even if one does not read
the proof. Here is a diagram of chapter dependencies. Dashed lines indicate weak dependencies (some
facts from the previous chapter might be used, or the previous chapter might provide some additional
motivation, but it is still understandable without it). There is a weak dependency between 6 and 7
because some notations and results of chapter 6 will be used, but only in the last section, Section 7.4.
Also, note that there is one forward dependency: in Section 7.4, at the end of the proof of the main
result, one result from Chapter 8 is used. (This is intentional. Readers should not read Chapter 7
before Chapter 6. More generally, I think the best course of action is to read this book in order.)

⁵ There might still be a few things which need a bit of googling, but they should not take too long to grasp. There
is also some real analysis and geometry involved, mainly for the geometry of numbers part, but I trust the reader will
manage (PFTB has a great chapter on the geometry of numbers which can be used to introduce the subject).
⁶ Except in some remarks or footnotes.
⁷ Along with section 1 of chapter 4 on the Frobenius morphism.

[Figure: diagram of chapter dependencies, with nodes A, B, 1, 2, 3, 4, 5, 6, 7, 8 and C.1.]

Finally, here are some miscellaneous remarks about the layout of the book. The layout of the
theorems etc. comes from Mathraining. Esmonde-Murty has also inspired me a lot: I have followed its
style of leaving parts of the theory as exercises for the reader. These exercises will be written in purple
and in smaller font. This serves two purposes at once: it keeps the exposition neater (for instance it
is easier to see the main ideas of a proof) and keeps the reader active in the learning process. Some
of these purple exercises will have a star near them; this means that they are part of the theory. In
that case, they cannot be skipped. Otherwise, it is usually an additional remark about an object
that will not be important for the rest of the book but is still good to do. Purple exercises are generally
easy. They are all corrected at the end of the book to avoid the reader getting stuck at an early stage
due to a misunderstanding⁸. The solutions are deliberately not linked to the exercises to encourage
the reader to try them and not read the solution directly. However, the reader is encouraged to read
the solutions to the exercises they had trouble with, and particularly so for the vague ones such as
the ones about motivation. They are also encouraged to read them when they feel their solution is a
bit "dodgy", or when they feel it is "horrible". (Some exercises may require computations, but the
computations are never terrible.)

Similarly, a star after a proposition or corollary means that it’s an important result.

Now, here are a few remarks about the supplementary exercises at the end of the chapters, i.e. black
exercises. Some of these are pretty hard, so it is fine to move on to another chapter without having
solved them all (or even hardly any of them, it is not a problem⁹) and come back later. A dagger
at the right of an exercise indicates that it is particularly instructive, beautiful, or interesting. These
are all corrected at the end of the book. The exercises can also be seen as a companion to the theory:
many exercises are theorems or classical results. If an exercise doesn’t seem nice enough to attempt,
but nice enough that you want to know the solution, it’s fine¹⁰ to read it without trying the problem. Also,
I strongly encourage the reader to read the solutions at the end after solving an exercise: multiple
solutions are often given so you may still learn something new.

I have decided to include all the objects which are defined in the book in the index at the end
of it. This means that I cannot put all the occurrences of words like "algebraic number" which are
used almost everywhere. For such words, I chose to include the first occurrence of the word where it’s
defined, as well as some more exotic occurrences (e.g. embeddings arise everywhere in chapter 6 on
field theory but are pretty rare in the other chapters so I have included all the later occurrences). If
words can appear in two indices, I chose to put them in both. For instance, "quadratic unit" appears
⁸ Being implemented.
⁹ Of course, I still recommend trying them.
¹⁰ But don’t do that too much! This is for exercises like the fact that a Galois extension L/K is solvable if and only if
its Galois group is: it’s a very nice result, but it has very little to do with number theory so it’s understandable if you
feel lazy.

both in the index for "quadratic" and the one for "unit". Another initiative I have taken is that I do
not include words appearing in the solutions in the index unless they do not appear near the original
exercise.

There are a few notations or abbreviations which are not completely standard that I haven’t defined
in the book. The reader shall find a table with the notations of this book in the next section, but here
they are for the sake of convenience. I use LHS and RHS to denote "left-hand side" and "right-hand
side", [n] to denote the set of integers from 1 to n and := to define an object. When S is a set
and a an element of some ring, I use aS to denote {as | s ∈ S}. Similarly, U + V and UV mean
{u + v | u ∈ U, v ∈ V } and {uv | u ∈ U, v ∈ V }.

Now comes the time of the acknowledgements. As I said in the beginning of the foreword, this
book could not have existed without the classes of Gabriel Dospinescu and Bodo Lass at the Club de
Mathématiques Discrètes in Lyon. I want to thank Nicolas Radu as well, the creator of Mathraining,
for his very valuable comments on the Mathraining version. I also thank everyone involved in the
French Olympiad Mathematics Preparation as well as in the website https://mathraining.be, for
making me discover (olympiad) mathematics. Lastly, many thanks to Lucas Nistor for patching up
solutions that should work but don’t as well as to Vladimir Ivanov and Alexis Miller for their very
careful proofreading¹¹.

Finally, I am still very inexperienced so I apologise in advance for all the poor expositions and
mistakes in this book! In particular, I would be very grateful if you could email all the mistakes and
typos you find as well as any suggestion you have (for instance, a very nice alternative solution to
an exercise, or a better way to present the motivation of a solution) to caeiro.elias11@gmail.com
(or pm me on AoPS or Discord depending on where you found this book). The Dropbox link should
always be (mostly) up to date. The advantage is that you will always have the latest version, but the
drawback is that the numbering of theorems, results, etc. may change with time, because I add
content thematically.
Paris Elias Caeiro
October 2

¹¹ If you still see many mistakes, it means that they haven’t finished proofreading the whole book yet, or that there
were so many mistakes that they couldn’t catch them all. The latter is probably true in all cases.
Notations

Sets
• $\llbracket a, b \rrbracket$: the set of integers $[a, b] \cap \mathbb{Z}$ between $a$ and $b$.

• $[n]$: the set of integers $\llbracket 1, n \rrbracket$ between $1$ and $n$.

• $\mathbb{N}$: the set of natural integers $\{0, 1, 2, \ldots\}$.

• $\mathbb{N}^*$: the set of positive integers $\{1, 2, 3, \ldots\}$.

• $\mathbb{Z}$: the ring of rational integers.

• $\mathbb{Q}$: the field of rational numbers.

• $\overline{\mathbb{Z}}$: the ring of algebraic integers.

• $\overline{\mathbb{Q}}$: the field of algebraic numbers.

• $\mathbb{H}$: the skew field of quaternions.

• $H$: the ring of Hurwitz integers.

• $\mathbb{F}_q$: the field with $q$ elements.

• $\mathbb{Z}_p$: the ring of $p$-adic integers.

• $\mathbb{Q}_p$: the field of $p$-adic numbers.

• $\mathbb{Z}/n\mathbb{Z}$: $\mathbb{Z}$ modulo $n$.

• $R[\alpha_1, \ldots, \alpha_n]$: the ring of polynomial expressions in $\alpha_1, \ldots, \alpha_n$ with coefficients in $R$.

• $K(\alpha_1, \ldots, \alpha_n)$: the field of rational expressions in $\alpha_1, \ldots, \alpha_n$ (with non-zero denominator) with
coefficients in $K$.

• $\mathcal{O}_K$: the ring of integers $K \cap \overline{\mathbb{Z}}$ of a number field $K$.

• $C(D)$: the class group of primitive integral binary quadratic forms of discriminant $D$.

• $S_n$: the symmetric group of permutations of $[n]$.

• $R^{m \times n}$: the set of $m \times n$ matrices with coefficients in a commutative ring $R$. When $n = 1$ we
simply write $R^m$.


Polynomials
• $\Phi_n$: the $n$th cyclotomic polynomial.

• $\Psi_n$: the minimal polynomial of $2\cos\left(\frac{2\pi}{n}\right)$.

• $e_k$: the $k$th elementary symmetric polynomial.

• $p_k$: the $k$th power sum polynomial.

• $h_k$: the $k$th complete homogeneous polynomial.

• $\pi_\alpha$: the minimal polynomial of an algebraic number $\alpha$.

• $\pi_\varphi$: the minimal polynomial of a linear map (or matrix) $\varphi$.

• $\pi_{\varphi,x}$: the minimal polynomial of a linear map (or matrix) at some vector $x$.

• $\chi_\varphi$: the characteristic polynomial of a linear map (or matrix) $\varphi$, with the convention that it is
monic.

• $f'$: the (formal) derivative of a rational function $f$.

• $f^*$: the primitive part $f/c(f)$ of $f \in \mathbb{Q}[X]$.

Sequences and Functions


• $F_n$: the $n$th Fibonacci number.

• $L_n$: the $n$th Lucas number.

• $P(n)$: the greatest prime factor of a non-zero rational integer $n \in \mathbb{Z}$.

• $\mu(\cdot)$: the Möbius function defined by $\mu(n) = (-1)^r$ when $n$ is squarefree and has $r$ distinct prime
factors, and $\mu(n) = 0$ otherwise.

• $\operatorname{rad}(\cdot)$: the squarefree part of the prime factorisation of an element in a UFD. For $\mathbb{Z}$ it is taken
to be positive, for $\mathbb{Q}[X]$ monic and for $\mathbb{Z}[X]$ primitive with positive leading coefficient.

• $c(f)$: the content of a polynomial $f \in \mathbb{Z}[X]$.

• $N(\alpha)$: the absolute norm of an algebraic number $\alpha \in \overline{\mathbb{Q}}$, i.e. the product of its conjugates.

• $N_{L/K}(\alpha)$: the norm of $\alpha$ in the extension $L/K$.

• $\overline{\alpha}$: the quadratic conjugate of an element $\alpha$ in a quadratic extension $L/K$. Without context it is
the complex conjugate ($L = \mathbb{C}$ and $K = \mathbb{R}$).

• $\lfloor x \rfloor$: the floor of a real number $x$, i.e. the greatest integer $n \leq x$.

• $\lceil x \rceil$: the ceiling of a real number $x$, i.e. the smallest integer $n \geq x$.

• $\operatorname{sgn}(x)$: the sign of a real number $x$, i.e. $0$ if $x = 0$, $1$ if $x > 0$ and $-1$ if $x < 0$.

• $\Re(z)$: the real part of a complex number $z \in \mathbb{C}$.

• $\Im(z)$: the imaginary part of a complex number $z \in \mathbb{C}$.

• $\binom{n}{k}$: $n$ choose $k$, the number of ways to select a subset of $k$ elements from a set of $n$ elements,
i.e. $\frac{n!}{k!(n-k)!}$.

• $v_p(r)$: the $p$-adic valuation of a rational number $r$, i.e. the exponent of $p$ in its prime factorisation.

• $|x|_p$: the $p$-adic absolute value of a $p$-adic number $x$.

• $\left(\frac{a}{p}\right)$: the Legendre symbol (or the Jacobi or Kronecker symbols when $p$ isn’t prime).

• $\left(\frac{a,b}{p}\right)$: the Hilbert symbol of $a, b \in \mathbb{Q}_p^\times$ which is $1$ if $Z^2 - aX^2 - bY^2$ represents $0$ over $\mathbb{Q}_p$ and
$-1$ otherwise.

• $c_p(f)$: the Hasse invariant of a non-zero quadratic form $f$ over $\mathbb{Q}_p$.

• $d(f)$: the determinant of a quadratic form $f$.

• $\Delta(f)$: the discriminant of a polynomial $f$ in one variable or of a homogeneous polynomial $f$ in
two variables.

• $h(D)$: the class number $|C(D)|$.

Algebra
• $\mid_R$: divides in $R$.
• $\lceil$: left-divisibility.
• $\rceil$: right-divisibility.
• $R^\times$: the multiplicative group of units of $R$.
• $\operatorname{Frob}_R$: the Frobenius morphism of $R$.
• $\operatorname{Emb}_K(L)$: the set of $K$-embeddings of $L$.
• $\operatorname{Gal}(L/K)$: the Galois group of $L/K$.
• $\operatorname{Aut}_K(L) = \operatorname{Aut}(L/K)$: the group of automorphisms of $L/K$.
• $L^H$: the fixed field of $H$.
• $\operatorname{Res}(f, g)$: the resultant of two polynomials $f$ and $g$.
• $\ker$: the kernel of a morphism.
• $\operatorname{im}$: the image of a morphism.
• $\det$: the determinant of a matrix or a linear map.
• $\operatorname{Tr}$: the trace of a matrix or a linear map.
• $\oplus$: direct sum, i.e. $W = U \oplus V$ means that $W = U + V$ and any element of $W$ can be written
in exactly one way as $u + v$ with $(u, v) \in U \times V$.
• $\widehat{\oplus}$: orthogonal direct sum, i.e. in $U \mathbin{\widehat{\oplus}} V$, the sum is direct and $u \cdot v = 0$ for any $(u, v) \in U \times V$.

Miscellaneous
• ":=": a definition.
• LHS: left-hand side.
• RHS: right-hand side.
• $f^n$: the $n$th iterate of a function $f$ unless otherwise specified.
• $U \star V$: the set $\{u \star v \mid (u, v) \in U \times V\}$ for some sets $U, V$ and an operation $\star$ on $(U \cup V)^2$ (e.g.
addition or multiplication on the complex numbers). When $U = \{a\}$ we also write $a \star V$ for $\{a\} \star V$
(and $U \star a$ for $U \star \{a\}$ when $\star$ is not commutative).
• $]a, b[$: the open interval $\{x \mid a < x < b\}$. The intervals $[a, b[$ and $]a, b]$ are defined similarly.
• $B_{<\varepsilon}(x)$: the open ball $\{y \mid d(x, y) < \varepsilon\}$ with radius $\varepsilon$ and center $x$ in a metric space $(M, d)$.
• $B_{\leq\varepsilon}(x)$: the closed ball $\{y \mid d(x, y) \leq \varepsilon\}$ with radius $\varepsilon$ and center $x$ in a metric space $(M, d)$.
Contents

Foreword

Notations

Theory
1 Algebraic Numbers and Integers
1.1 Definition
1.2 Minimal Polynomial
1.3 Symmetric Polynomials
1.4 Worked Examples
1.5 Exercises

2 Quadratic Integers
2.1 General Definitions
2.2 Unique Factorisation
2.3 Gaussian Integers
2.4 Eisenstein Integers
2.5 Hurwitz Integers
2.6 Exercises

3 Cyclotomic Polynomials
3.1 Definition
3.2 Irreducibility
3.3 Orders
3.4 Zsigmondy’s Theorem
3.5 Exercises

4 Finite Fields
4.1 Frobenius Morphism
4.2 Existence and Uniqueness
4.3 Properties
4.4 Cyclotomic Polynomials
4.5 Quadratic Reciprocity
4.6 Exercises

5 Polynomial Number Theory
5.1 Factorisation of Polynomials
5.2 Prime Divisors of Polynomials
5.3 Hensel’s Lemma
5.4 Bézout’s Lemma
5.5 Exercises

6 The Primitive Element Theorem and Galois Theory
6.1 General Definitions
6.2 The Primitive Element Theorem and Field Theory
6.3 Galois Theory
6.4 Splitting of Polynomials
6.5 Exercises

7 Units in Quadratic Fields and Pell’s Equation
7.1 Fundamental Unit
7.2 Pell-Type Equations
7.3 Størmer’s Theorem
7.4 Units in Complex Cubic Fields and Thue’s Equation
7.5 Exercises

8 p-adic Analysis
8.1 p-adic Integers and Numbers
8.2 p-adic Absolute Value
8.3 Binomial Series
8.4 The Skolem-Mahler-Lech Theorem
8.5 Strassmann’s Theorem
8.6 Exercises

A Polynomials
A.1 Fields and Polynomials
A.2 Algebraic Structures and Morphisms
A.3 Exercises

B Symmetric Polynomials
B.1 The Fundamental Theorem of Symmetric Polynomials
B.2 Newton’s Formulas
B.3 The Fundamental Theorem of Algebra
B.4 Exercises

C Linear Algebra
C.1 Vector Spaces
C.2 Linear Maps and Matrices
C.3 Determinants
C.4 Linear Recurrences
C.5 Exercises

Solutions
1 Algebraic Numbers and Integers
1.1 Definition
1.2 Minimal Polynomial
1.3 Symmetric Polynomials
1.4 Worked Examples
1.5 Exercises

2 Quadratic Integers
2.1 General Definitions
2.2 Unique Factorisation
2.3 Gaussian Integers
2.4 Eisenstein Integers
2.5 Hurwitz Integers
2.6 Exercises

3 Cyclotomic Polynomials
3.1 Definition
3.2 Irreducibility
3.3 Orders
3.4 Zsigmondy’s Theorem
3.5 Exercises

4 Finite Fields
4.1 Frobenius Morphism
4.2 Existence and Uniqueness
4.3 Properties
4.4 Cyclotomic Polynomials
4.5 Quadratic Reciprocity
4.6 Exercises

5 Polynomial Number Theory
5.1 Factorisation of Polynomials
5.2 Prime Divisors of Polynomials
5.3 Hensel’s Lemma
5.4 Bézout’s Lemma
5.5 Exercises

6 The Primitive Element Theorem and Galois Theory
6.1 General Definitions
6.2 The Primitive Element Theorem and Field Theory
6.3 Galois Theory
6.4 Splitting of Polynomials
6.5 Exercises

7 Units in Quadratic Fields and Pell’s Equation
7.1 Fundamental Unit
7.2 Pell-Type Equations
7.3 Størmer’s Theorem
7.4 Units in Complex Cubic Fields and Kobayashi’s Theorem
7.5 Exercises

8 p-adic Analysis
8.1 p-adic Integers and Numbers
8.2 p-adic Absolute Value
8.3 Binomial Series
8.4 The Skolem-Mahler-Lech Theorem
8.5 Strassmann’s Theorem
8.6 Exercises
8.7 Exercises

A Polynomials
A.1 Fields and Polynomials
A.2 Algebraic Structures and Morphisms
A.3 Exercises

B Symmetric Polynomials
B.1 The Fundamental Theorem of Symmetric Polynomials
B.2 Newton’s Formulas
B.3 The Fundamental Theorem of Algebra
B.4 Exercises

C Linear Algebra
C.1 Vector Spaces
C.2 Linear Maps and Matrices
C.3 Determinants
C.4 Linear Recurrences
C.5 Exercises

Further Reading

Bibliography

Index
Theory

Chapter 1

Algebraic Numbers and Integers

Prerequisites for this chapter: Section A.1.

1.1 Definition
First of all, what is an algebraic number?

Definition 1.1.1 (Algebraic Numbers and Algebraic Integers)

Let α ∈ C be a complex number. We say α is an algebraic number if it is a root of a monic
polynomial with rational coefficients. Further, if this polynomial has integer coefficients, we say
α is an algebraic integer.

The set of algebraic numbers will be denoted by $\overline{\mathbb{Q}}$, and the set of algebraic integers by $\overline{\mathbb{Z}}$.

Note that the "monic" part is very important: otherwise there would be no difference between
algebraic numbers and algebraic integers, since a number is a root of a non-zero polynomial with integer coefficients
if and only if it is a root of a non-zero polynomial with rational coefficients (multiply by a common denominator).

Note also that every integer n is an algebraic integer since it’s a root of X − n, and every rational
number q is an algebraic number since it’s a root of X − q. This partly explains the notations we chose.

We also say a complex number is transcendental if it isn’t algebraic, but this won’t be relevant in
this book as we will only discuss properties of algebraic numbers.

Here are some examples of algebraic numbers:

• $1$ is an algebraic integer (root of $X - 1$).

• $i$ is an algebraic integer (root of $X^2 + 1$).

• $2 + \sqrt[4]{3}$ is an algebraic integer (root of $(X - 2)^4 - 3$).

• $\frac{1}{2}$ is an algebraic number (root of $X - \frac{1}{2}$). However, it is not an algebraic integer. This is a
consequence of the following proposition.

Proposition 1.1.1 (Rational Algebraic Integers)*

The only rational algebraic integers are regular integers. In other words, $\overline{\mathbb{Z}} \cap \mathbb{Q} = \mathbb{Z}$.


Proof

Firstly, it is clear that regular integers are algebraic integers since $n \in \mathbb{Z}$ is a root of $X - n \in \mathbb{Z}[X]$.
Let $f = \sum_{i=0}^n a_i X^i$ be a monic degree $n$ polynomial with integer coefficients, and assume $\frac{u}{v}$ is a
rational root of $f$, where $u, v$ are coprime integers.

Then,
$$\sum_{i=0}^n a_i \left(\frac{u}{v}\right)^i = 0$$
is equivalent, after multiplication by $v^n$, to
$$\sum_{i=0}^n a_i u^i v^{n-i} = 0.$$

Modulo $v$, we get $a_n u^n \equiv 0$, i.e. $u^n \equiv 0$ since $f$ is monic. Since $u$ and $v$ are coprime by
assumption, this must mean that $v = \pm 1$. Finally, this means that the root $\frac{u}{v}$ we started with
was in fact an integer.


Exercise 1.1.1. Is $\frac{i}{2}$ an algebraic integer?

Exercise 1.1.2 (Rational Root Theorem). Let f ∈ Z[X] be a polynomial. Suppose that u/v is a rational root
of f , written in irreducible form. Prove that u divides the constant coefficient of f and v divides its leading
coefficient. (This is a generalisation of Proposition 1.1.1.)
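To make Proposition 1.1.1 and the rational root theorem concrete, here is a small Python sketch. It is only an illustration: the helper names `divisors`, `evaluate` and `rational_roots` are hypothetical choices made for this example, and the code assumes that the constant and leading coefficients are non-zero. It lists the rational roots of an integer polynomial by testing the candidates $u/v$ with $u \mid a_0$ and $v \mid a_n$.

```python
from fractions import Fraction


def divisors(n):
    """Positive divisors of a non-zero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def evaluate(coeffs, x):
    """Evaluate a_0 + a_1 x + ... + a_n x^n by Horner's rule (coeffs = [a_0, ..., a_n])."""
    value = Fraction(0)
    for a in reversed(coeffs):
        value = value * x + a
    return value


def rational_roots(coeffs):
    """Rational roots of an integer polynomial with non-zero constant and leading terms.

    Candidates are the fractions u/v with u | a_0 and v | a_n (rational root theorem).
    """
    a0, an = coeffs[0], coeffs[-1]
    candidates = {Fraction(sign * u, v)
                  for u in divisors(a0) for v in divisors(an) for sign in (1, -1)}
    return sorted(r for r in candidates if evaluate(coeffs, r) == 0)


# Monic example: X^3 - 2X^2 - X + 2 = (X - 1)(X + 1)(X - 2).
monic_roots = rational_roots([2, -1, -2, 1])
# Non-monic example: 2X^2 - 3X + 1 = (2X - 1)(X - 1).
non_monic_roots = rational_roots([1, -3, 2])
print(monic_roots, non_monic_roots)
```

In the monic example every rational root is an integer, as Proposition 1.1.1 predicts, while the non-monic example has the non-integer root $\frac{1}{2}$.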

To distinguish algebraic integers from regular integers, we will call the latter rational integers since
they are precisely the algebraic integers which are rational.
A deep fact about algebraic numbers and algebraic integers is that they’re closed under addition
and multiplication. This will be proven in Section 1.3, but we will already give an application of these
results to a seemingly unrelated problem in order to showcase their power.

Problem 1.1.1

Let q be a rational number. Which rational values can cos(qπ) take?

Solution

The key point is that the numbers of the form $\cos(q\pi)$ are precisely the real parts of roots of
unity. Indeed, any root of unity has its real part of this form, and if $q = \frac{a}{b}$ then $\cos(q\pi)$ is the
real part of the $2b$th root of unity $\exp\left(\frac{2a\pi i}{2b}\right)$.

Thus, let $\omega = \exp(q\pi i)$ be a root of unity. Twice its real part is $\omega + \overline{\omega}$, where $\overline{\omega} = \frac{1}{\omega}$ is the complex
conjugate of $\omega$. Thus, $2\Re(\omega)$ is a sum of two roots of unity and hence of algebraic integers, which
means it’s an algebraic integer itself.

Finally, we conclude that if $2\cos(q\pi) = 2\Re(\omega)$ is rational it must be a rational integer. Since
$2\cos(q\pi) \in [-2, 2]$ we must have
$$\cos(q\pi) \in \left\{0, \pm\frac{1}{2}, \pm 1\right\}$$
which, conversely, are all easily seen to work.
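One can also exhibit an explicit monic integer polynomial vanishing at $2\cos(q\pi)$. The following Python sketch is an illustration under my own choices (the function names and the sample value $q = \frac{3}{7}$ are hypothetical). It builds the polynomials $P_n$ satisfying $P_n(x + \frac{1}{x}) = x^n + \frac{1}{x^n}$ through the recurrence $P_{k+1} = X P_k - P_{k-1}$, and checks numerically that $P_{2b}(2\cos(\frac{a\pi}{b})) = 2$.

```python
import math


def power_sum_polynomial(n):
    """Coefficients (lowest degree first) of P_n, where P_n(x + 1/x) = x^n + 1/x^n.

    Uses P_0 = 2, P_1 = X and P_{k+1} = X * P_k - P_{k-1}.
    """
    previous, current = [2], [0, 1]
    if n == 0:
        return previous
    for _ in range(n - 1):
        shifted = [0] + current  # coefficients of X * P_k
        padded = previous + [0] * (len(shifted) - len(previous))
        previous, current = current, [s - p for s, p in zip(shifted, padded)]
    return current


def evaluate(coeffs, x):
    """Horner evaluation of a polynomial given lowest degree first."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * x + c
    return value


a, b = 3, 7  # sample value q = a/b (an arbitrary choice for illustration)
t = 2 * math.cos(a * math.pi / b)
poly = power_sum_polynomial(2 * b)  # monic with integer coefficients
residual = evaluate(poly, t) - 2    # should vanish up to rounding errors
print(poly, residual)
```

Since $P_{2b} - 2$ is monic with integer coefficients, Proposition 1.1.1 applies directly to any rational value of $2\cos(q\pi)$.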




We may now define divisibility and congruences in algebraic integers, exactly like it is done in Z.

Definition 1.1.2 (Divisibility in $\overline{\mathbb{Z}}$)

Let α and β be algebraic integers. We say α divides β and write α | β if there exists an algebraic
integer γ such that β = αγ.

Definition 1.1.3 (Congruences in $\overline{\mathbb{Z}}$)

Let α, β, γ be algebraic integers. We say α is congruent to β modulo γ, and write α ≡ β (mod γ),
if γ | α − β.

Like in Z, α ≡ β (mod 0) is just equivalent to α = β since 0 only divides 0.

There is one thing that makes it very nice to work with congruences in algebraic integers: it is the
fact that if $a, b, n$ are rational integers, then $a \equiv b \pmod{n}$ in rational integers is the same thing as
$a \equiv b \pmod{n}$ in algebraic integers (which is why we use the same notation). Assume $n$ is non-zero,
otherwise it is obvious by the previous remark. What does the former mean? It means that $\frac{a-b}{n}$ is a
rational integer. What does the latter mean? It means that $\frac{a-b}{n}$ is an algebraic integer. Since we are
given that $\frac{a-b}{n}$ is rational, it being a rational integer is equivalent to it being an algebraic integer by
Proposition 1.1.1.

As stated before, in Section 1.3, we will prove that the algebraic integers are closed under addition
and multiplication (thus forming a ring) which means that we can manipulate these congruences like
we would in Z.

1.2 Minimal Polynomial


The goal of this section is to provide an abstract framework to manipulate algebraic numbers better.
Most of the results will not have any direct application but will help simplify proofs and provide a
more conceptual way of thinking about algebraic numbers.

In the previous section we saw that $2 + \sqrt[4]{3}$ was a root of $(X - 2)^4 - 3$, but is it the smallest polynomial
having this property? It is natural to ask oneself, given an algebraic number $\alpha$, what is the least degree
non-zero polynomial (with integer coefficients) vanishing at $\alpha$.

Definition 1.2.1 (Minimal Polynomial)

Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number. We say a least degree monic polynomial with rational
coefficients vanishing at $\alpha$ is a minimal polynomial of $\alpha$. We also say $\alpha$ is an algebraic number
of degree $n$, where $n$ is the degree of any of its minimal polynomials.

The following proposition shows that the minimal polynomial is unique.

Proposition 1.2.1*

Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number and $\pi_\alpha$ be one of its minimal polynomials. Then, for any
polynomial $f \in \mathbb{Q}[X]$, $f(\alpha) = 0$ if and only if $\pi_\alpha \mid f$. In particular, $\pi_\alpha$ is unique.

Proof

Clearly, if πα | f ∈ Q[X], then f vanishes at α. For the converse, assume that f ∈ Q[X] vanishes
at α. Then, perform the Euclidean division of f by πα : f = gπα + h with deg h < deg πα . If h
is non-zero, then, after dividing it by its leading coefficient, we are left with a monic polynomial
with rational coefficients vanishing at α of degree less than deg πα , a contradiction.

Therefore, $\pi_\alpha \mid f$. Now, if $\pi'_\alpha$ is another minimal polynomial of $\alpha$, we get $\pi_\alpha \mid \pi'_\alpha$ and $\pi'_\alpha \mid \pi_\alpha$ so
$\pi_\alpha = \pi'_\alpha$ since they are both monic.
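The Euclidean division used in this proof is easy to carry out by hand or by machine. Here is a minimal Python sketch (an illustration only: `poly_divmod` is a hypothetical helper written for this example, with polynomials stored as lists of coefficients, lowest degree first). It divides two polynomials by $\pi_{\sqrt{2}} = X^2 - 2$ and inspects the remainders.

```python
from fractions import Fraction


def poly_divmod(f, g):
    """Euclidean division in Q[X]: return (q, r) with f = q*g + r and deg r < deg g.

    Polynomials are lists of coefficients, lowest degree first; g must be non-zero
    with a non-zero leading coefficient.
    """
    g = [Fraction(c) for c in g]
    remainder = [Fraction(c) for c in f]
    quotient = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    while len(remainder) >= len(g) and any(remainder):
        shift = len(remainder) - len(g)
        factor = remainder[-1] / g[-1]
        quotient[shift] = factor
        for i, c in enumerate(g):
            remainder[shift + i] -= factor * c
        while remainder and remainder[-1] == 0:  # drop cancelled leading terms
            remainder.pop()
    return quotient, remainder


minimal = [-2, 0, 1]  # X^2 - 2, the minimal polynomial of sqrt(2)

# X^4 - 4 vanishes at sqrt(2), so the remainder is zero (empty list).
q1, r1 = poly_divmod([-4, 0, 0, 0, 1], minimal)
# X^3 - 2 does not vanish at sqrt(2): the remainder 2X - 2 is non-zero.
q2, r2 = poly_divmod([-2, 0, 0, 1], minimal)
print(q1, r1, q2, r2)
```

The first division has zero remainder, in line with $X^4 - 4 = (X^2 - 2)(X^2 + 2)$, while the second leaves $2X - 2$, whose value at $\sqrt{2}$ is exactly the non-zero number $(\sqrt{2})^3 - 2$.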


We will thus use $\pi_\alpha$ to denote the minimal polynomial of an algebraic number $\alpha$. Notice that a
minimal polynomial is always irreducible in $\mathbb{Q}[X]$, and, conversely, a monic irreducible polynomial
in $\mathbb{Q}[X]$ is always the minimal polynomial of its roots.

Exercise 1.2.1∗ . Prove that the minimal polynomial of an algebraic number is irreducible and that a
monic irreducible polynomial in $\mathbb{Q}[X]$ is always the minimal polynomial of its roots.

We can now answer our original question. The minimal polynomial of $2 + \sqrt[4]{3}$ is in fact $(X - 2)^4 - 3$
as $Y^4 - 3$ is easily seen to be irreducible in $\mathbb{Q}[Y]$.

Exercise 1.2.2. Prove that $Y^4 - 3$ is irreducible in $\mathbb{Q}[Y]$.

Given an algebraic number $\alpha$, it is often particularly useful to look at the other roots of its minimal
polynomial; these are called the conjugates of $\alpha$. This is because, by Proposition 1.2.1, $\alpha$ and its
conjugates play symmetric roles: if $\alpha$ satisfies a certain polynomial equation with rational coefficients, then
so do its conjugates.

Definition 1.2.2 (Conjugates)

Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number. Its conjugates are defined as the roots of its minimal
polynomial: $\alpha_1, \ldots, \alpha_n$ with $n = \deg \pi_\alpha$ (we include $\alpha$).

For instance, the conjugates of $\sqrt{d}$, where $d$ is a rational number which is not a perfect square, are $\sqrt{d}$ and $-\sqrt{d}$.
A more elaborate example is the one of a primitive $p$th root of unity, i.e. a $p$th root of unity $\omega \neq 1$
(where $p$ is some prime number). By Theorem 3.2.1 or Corollary 5.1.5, $\frac{X^p - 1}{X - 1}$ is irreducible so its
conjugates are all the primitive $p$th roots.1

Note that an algebraic number of degree n always has n distinct conjugates, because an irreducible
polynomial always has distinct roots.

Exercise 1.2.3∗ . Prove that any algebraic number of degree n has n distinct conjugates.

Notice also that $\overline{\alpha}$, the complex conjugate of $\alpha$, is always a conjugate of $\alpha$ (see Appendix A). It
is of interest to discuss a bit more the link between the conjugates we just defined and the complex
conjugate of a number.

Imagine that, instead of being interested in the field of rational numbers, we were interested
in the field of real numbers and we wanted to do algebraic number theory with it. Thus, we define
algebraic numbers as roots of polynomials with real coefficients, etc. Now, every minimal polynomial
has degree 1 or 2, because any irreducible polynomial in $\mathbb{R}[X]$ has degree 1 or 2 (see Appendix A).
Thus, the conjugates of $\alpha$ are either $\{\alpha\} = \{\alpha, \overline{\alpha}\}$ in the first case, or $\{\alpha, \overline{\alpha}\}$ in the second. In fact,
this can be generalised a lot, see Chapter 6.
1 I know forward references are annoying, but I need this fact for one of the worked examples. It is also useful to

know as roots of unity are absolutely fundamental in algebraic number theory. For now, you can just take my word on
it until you reach Chapter 3.

Finally, we focus a bit on the algebraic integers. Can we say anything about the minimal polynomial
of an algebraic integer? We know that the minimal polynomial of an algebraic number which isn’t an
algebraic integer can’t have only integer coefficients, but what about the converse? The answer is yes,
as proven by the following proposition.

Proposition 1.2.2

Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number. Then, $\pi_\alpha \in \mathbb{Z}[X]$ if and only if $\alpha \in \overline{\mathbb{Z}}$.

Proof

It is clear that if $\pi_\alpha \in \mathbb{Z}[X]$, $\alpha$ is an algebraic integer. Thus, assume $\alpha$ is an algebraic integer
for the reverse implication. We will make the assumption that $\overline{\mathbb{Z}}$ is closed under addition and
multiplication, see Section 1.3 for a proof.

Let $\alpha_1, \ldots, \alpha_n$ be the conjugates of $\alpha$. By Vieta’s formulas A.1.4, the coefficient of $X^k$ of $\pi_\alpha$ is
$$(-1)^{n-k} \sum_{i_1 < \ldots < i_{n-k}} \alpha_{i_1} \cdot \ldots \cdot \alpha_{i_{n-k}}.$$

This is an algebraic integer by Theorem 1.3.2 and Exercise 1.2.4∗ but by assumption it is also
rational. Therefore, it is a rational integer and $\pi_\alpha \in \mathbb{Z}[X]$ as wanted.


Exercise 1.2.4∗ . Prove that the conjugates of an algebraic integer are also algebraic integers.

Exercise 1.2.5. We call an algebraic number of degree 2 a quadratic number . Characterise quadratic integers.

1.3 Symmetric Polynomials


Given a commutative ring R (in our case we will consider Z and Q) and an integer n ≥ 0, we can
consider the symmetric polynomials in n variables with coefficients in R. These are defined as the
polynomials in n variables invariant under all permutations of these variables.

Definition 1.3.1 (Symmetric Polynomials)

We say a polynomial $f \in R[X_1, \ldots, X_n]$ is symmetric if $f(X_1, \ldots, X_n) = f(X_{\sigma(1)}, \ldots, X_{\sigma(n)})$ for
any permutation $\sigma$ of $[n]$.

As an example, $f = X^2 Y + X Y^2 + X^2 + Y^2$ is a symmetric polynomial in two variables, and
$$g = X^2 Y Z + X Y^2 Z + X Y Z^2 + X Y^2 + X^2 Y + X Z^2 + X^2 Z + Y Z^2 + Y^2 Z$$
is a symmetric polynomial in three variables.

Definition 1.3.2 (Elementary Symmetric Polynomials)

The $k$th elementary symmetric polynomial for $k \geq 0$, $e_k \in R[X_1, \ldots, X_n]$, is defined by
$$e_k = \sum_{1 \leq i_1 < \ldots < i_k \leq n} X_{i_1} \cdot \ldots \cdot X_{i_k}.$$

Further, if $k > n$ then $e_k = 0$ (the empty sum) and if $k = 0$ then $e_0 = 1$ (the sum of the empty
product).

The two-variable elementary symmetric polynomials are thus simply $e_1 = X + Y$ and $e_2 = XY$. The
three-variable ones are $e_1 = X + Y + Z$, $e_2 = XY + YZ + ZX$ and $e_3 = XYZ$.

We now state the fundamental theorem of symmetric polynomials. See Appendix B for a proof.

Theorem 1.3.1 (Fundamental Theorem of Symmetric Polynomials)

Suppose f ∈ R[X1 , . . . , Xn ] is a symmetric polynomial. Then f ∈ R[e1 , . . . , en ]. In other words,


there is a polynomial g ∈ R[X1 , . . . , Xn ] such that

f (X1 , . . . , Xn ) = g(e1 , . . . , en ).

This theorem explains why we called $e_k$ "elementary symmetric polynomials": because they
generate all symmetric polynomials.
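As a quick sanity check of what the theorem says, here is a short Python sketch (purely illustrative: the function name `elementary_symmetric` and the sample values are my own choices). It evaluates $e_1, e_2, e_3$ at three integers and verifies two classical rewritings of symmetric polynomials in terms of them, namely $X^2 + Y^2 + Z^2 = e_1^2 - 2e_2$ and $X^3 + Y^3 + Z^3 = e_1^3 - 3e_1 e_2 + 3e_3$.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(values, k):
    """e_k evaluated at the given values: sum of all products of k distinct entries."""
    return sum(prod(subset) for subset in combinations(values, k))


x, y, z = 2, -5, 7  # arbitrary sample values
e1, e2, e3 = (elementary_symmetric((x, y, z), k) for k in (1, 2, 3))

# Two symmetric polynomials rewritten as polynomials in e_1, e_2, e_3.
sum_of_squares = x**2 + y**2 + z**2
sum_of_cubes = x**3 + y**3 + z**3
print(sum_of_squares == e1**2 - 2 * e2)
print(sum_of_cubes == e1**3 - 3 * e1 * e2 + 3 * e3)
```

Note that the same function returns $1$ for $k = 0$ (one empty product) and $0$ for $k > n$ (an empty sum), matching the conventions of Definition 1.3.2.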
Exercise 1.3.1. Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number with conjugates $\alpha_1, \ldots, \alpha_n$ and $f \in \mathbb{Q}[X_1, \ldots, X_n]$ be a
symmetric polynomial. Show that $f(\alpha_1, \ldots, \alpha_n)$ is rational. Further, prove that if $\alpha$ is an algebraic integer
and $f$ has integer coefficients, $f(\alpha_1, \ldots, \alpha_n)$ is in fact a rational integer.

We can now prove that algebraic integers are closed under addition and multiplication.

Theorem 1.3.2

Let α and β be two algebraic integers. Then, αβ and α + β are also algebraic integers.

Proof

The idea is to construct a monic polynomial whose coefficients are symmetric in both the
conjugates of $\alpha$ and the conjugates of $\beta$. Intuitively, by Exercise 1.3.1, they will thus be rational integers,
which will imply that $\alpha + \beta$ is an algebraic integer.

We thus consider the conjugates $\alpha_1, \ldots, \alpha_m$ of $\alpha$ and $\beta_1, \ldots, \beta_n$ of $\beta$ and let
$$f(X) = \prod_{i,j} (X - (\alpha_i + \beta_j)) = \prod_i \prod_j ((X - \alpha_i) - \beta_j) = \prod_i \pi_\beta(X - \alpha_i).$$

If we define
$$g(X, X_1, \ldots, X_m) = \prod_i \pi_\beta(X - X_i),$$
it is symmetric as a polynomial in $X_1, \ldots, X_m$ (over the ring $R = \mathbb{Z}[X]$). We can thus write
$$g = h(X, e_1, \ldots, e_m)$$
for some $h \in \mathbb{Z}[X, X_1, \ldots, X_m]$ by the fundamental theorem of symmetric polynomials. Finally,
our original polynomial $f$ is
$$f = h(X, e_1(\alpha_1, \ldots, \alpha_m), \ldots, e_m(\alpha_1, \ldots, \alpha_m)).$$

But, by Vieta’s formulas A.1.4, $e_k(\alpha_1, \ldots, \alpha_m)$ is an integer as it is $\pm$ the coefficient of $X^{m-k}$
of $\pi_\alpha$! We thus conclude that $f$ has integer coefficients which means that $\alpha + \beta$ is an algebraic
integer.

The $\alpha\beta \in \overline{\mathbb{Z}}$ part is handled similarly and we thus omit it.
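To see the construction of $f$ at work on a concrete case, here is a small numerical Python sketch for $\alpha = \sqrt{2}$ and $\beta = \sqrt{3}$ (an illustration only: it uses floating-point arithmetic, and the helper `multiply_by_linear_factor` is a hypothetical name chosen for this example). It expands $\prod_{i,j} (X - (\alpha_i + \beta_j))$ and rounds the coefficients.

```python
import math
from itertools import product


def multiply_by_linear_factor(poly, root):
    """Multiply poly (coefficients, lowest degree first) by (X - root)."""
    result = [0.0] * (len(poly) + 1)
    for i, c in enumerate(poly):
        result[i + 1] += c
        result[i] -= root * c
    return result


alpha_conjugates = [math.sqrt(2), -math.sqrt(2)]  # roots of X^2 - 2
beta_conjugates = [math.sqrt(3), -math.sqrt(3)]   # roots of X^2 - 3

f = [1.0]
for a, b in product(alpha_conjugates, beta_conjugates):
    f = multiply_by_linear_factor(f, a + b)

rounded = [round(c) for c in f]
print(rounded)  # coefficients of X^4 - 10X^2 + 1, lowest degree first
```

The coefficients come out as integers (up to rounding errors), giving the monic polynomial $X^4 - 10X^2 + 1$ which vanishes at $\sqrt{2} + \sqrt{3}$.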




Exercise 1.3.2∗ . Prove that $\overline{\mathbb{Z}}$ is closed under multiplication.



Remark 1.3.1
There is a slightly better but also slightly more complicated proof of this result. The idea is a
direct generalisation of Exercise 1.3.1: we prove that any polynomial $s \in R[A_1, \ldots, A_m, B_1, \ldots, B_n]$
which is symmetric in $A_1, \ldots, A_m$ as well as symmetric in $B_1, \ldots, B_n$ is a polynomial in $e_1^A, \ldots, e_m^A$
and $e_1^B, \ldots, e_n^B$, where $e_i^A$ and $e_i^B$ represent the $i$th elementary symmetric polynomial in $A_1, \ldots, A_m$
and $B_1, \ldots, B_n$ respectively. Then, when we evaluate it at
$$(\alpha_1, \ldots, \alpha_m, \beta_1, \ldots, \beta_n)$$
we get that the result lies in $R$ by Vieta’s formulas. In particular, for
$$s = \prod_{i,j} (X - (A_i + B_j))$$
and $R = \mathbb{Z}[X]$, we get that $f \in \mathbb{Z}[X]$ as wanted.

To prove our first claim, we simply use the fundamental theorem of symmetric polynomials twice.
First, $s$ is symmetric as a polynomial in $B_1, \ldots, B_n$ with coefficients in $R[A_1, \ldots, A_m]$ so
$$s \in R[A_1, \ldots, A_m][e_1^B, \ldots, e_n^B] = R[A_1, \ldots, A_m, e_1^B, \ldots, e_n^B].$$

Then, $s$ is symmetric as a polynomial in $A_1, \ldots, A_m$ with coefficients in $R[e_1^B, \ldots, e_n^B]$ (by the
first step) so
$$s \in R[e_1^B, \ldots, e_n^B][e_1^A, \ldots, e_m^A] = R[e_1^A, \ldots, e_m^A, e_1^B, \ldots, e_n^B]$$
as claimed.

Remark 1.3.2
Our proof also shows that the conjugates of α+β and αβ are among αi +βj and αi βj respectively.

The following straightforward consequence of the fundamental theorem of symmetric polynomials
1.3.1 is sometimes useful.

Proposition 1.3.1
Let $f = a \cdot \prod_{k=1}^n (X - \alpha_k)$ and $g = b \cdot \prod_{k=1}^n (X - \beta_k)$ be two polynomials with integer coefficients,
and let $m \in \mathbb{Z}$ be a rational integer which is coprime with $a$ and $b$. Suppose that $f \equiv g \pmod{m}$.
Then,
$$S(\alpha_1, \ldots, \alpha_n) \equiv S(\beta_1, \ldots, \beta_n) \pmod{m}$$
for any symmetric $S \in \mathbb{Z}[X_1, \ldots, X_n]$.
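Here is a small Python sketch checking this statement on one example, with $S$ ranging over the power sums $X_1^k + \ldots + X_n^k$. It is only an illustration under my own choices: the polynomials $g = (X-1)(X-2)(X-3)$ and $f = g + 7(X^2 - X + 1)$, the modulus $m = 7$, and the helper name `power_sums` are all hypothetical. The power sums of the roots are computed from the coefficients by Newton's identities, so no complex arithmetic is needed.

```python
def power_sums(monic_coeffs, count):
    """Power sums p_1, ..., p_count of the complex roots of a monic integer polynomial.

    monic_coeffs = [1, c_1, ..., c_n] describes X^n + c_1 X^(n-1) + ... + c_n.
    Newton's identities give p_k + c_1 p_(k-1) + ... + c_(k-1) p_1 + k c_k = 0
    for k <= n, and p_k + c_1 p_(k-1) + ... + c_n p_(k-n) = 0 for k > n.
    """
    n = len(monic_coeffs) - 1
    c = monic_coeffs
    p = []  # p[j] stores p_(j+1)
    for k in range(1, count + 1):
        total = k * c[k] if k <= n else 0
        total += sum(c[i] * p[k - i - 1] for i in range(1, min(k - 1, n) + 1))
        p.append(-total)
    return p


m = 7
g = [1, -6, 11, -6]  # (X - 1)(X - 2)(X - 3)
f = [1, 1, 4, 1]     # g + 7 * (X^2 - X + 1), so f is congruent to g modulo 7

sums_f = power_sums(f, 6)
sums_g = power_sums(g, 6)
print(sums_f, sums_g)
print(all((a - b) % m == 0 for a, b in zip(sums_f, sums_g)))
```

The roots of $f$ are not rational, yet their power sums are integers congruent modulo $7$ to $1^k + 2^k + 3^k$.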

Exercise 1.3.3∗ . Prove Proposition 1.3.1.


Here is why this proposition is interesting. It lets us use a local-global principle. Suppose we have
a polynomial $f \in \mathbb{Z}[X]$ and we know a bunch of information about its roots modulo prime numbers
$p$. Then, Proposition 1.3.1 lets us deduce information about symmetric sums of the complex roots of
$f$, modulo $p$. If we let $p$ vary, we can thus get information about symmetric sums of the roots of $f$,
and hence information about the roots of $f$ themselves.
We illustrate this by an example. Problem 1.4.1 and Exercise 3.5.33† provide more elaborate
applications.

Question

Let f ∈ Z[X] be a polynomial. Suppose that f has a double root in Fp for infinitely many primes
p. Must it follow that f has a complex double root?

Answer

We prove that it does. Clearly, $f$ needs to have degree at least 2 for it to have a double root
modulo some prime so we may assume that it does. Suppose that $f$ has a double root $\beta \in \mathbb{Z}$
modulo a rational prime $p$. Consider the polynomial
$$g(X) = f(X) - (X - \beta) f'(\beta) - f(\beta).$$

This may seem unmotivated, but this is just a polynomial congruent to $f$ modulo $p$ (by
assumption) which now has $\beta$ as a complex double root.

We may now consider the complex roots $\alpha_1, \ldots, \alpha_n$ of $f$ and $\beta = \beta_1, \ldots, \beta_n = \beta$ of $g$. By
Proposition 1.3.1, we thus have
$$\prod_{i \neq j} (\alpha_i - \alpha_j) \equiv \prod_{i \neq j} (\beta_i - \beta_j) \pmod{p}.$$

Since $g$ has a double root, the RHS is zero. Thus, $p$ divides the LHS. Since this is true for
infinitely many primes $p$, we deduce that the LHS is also zero: $f$ has a complex double root.


Remark 1.3.3
The number
$$\Delta = (-1)^{\frac{n(n-1)}{2}} a^{2n-2} \cdot \prod_{i \neq j} (\alpha_i - \alpha_j) = \left(a^{n-1} \prod_{i < j} (\alpha_i - \alpha_j)\right)^2$$
is called the discriminant of $f$ (here $a$ denotes the leading coefficient of $f$). The fundamental theorem
of symmetric polynomials, together with Vieta’s formulas, shows that this is a polynomial in the coefficients of $f$. This is also why the
factor of $a^{2n-2}$ is there, since Vieta’s formulas yield that elementary symmetric sums in the roots
are coefficients of $f$ divided by some power of $a$. We can easily check that it agrees with the usual
definition when $n = 2$. We have thus shown that if $f$ has a double root mod $p$ then $p \mid \Delta$.
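The last sentence can be observed experimentally. The Python sketch below is an illustration under my own choices: the polynomial $f = X^3 - X - 1$, the search bound $200$, and the helper names are hypothetical, and the discriminant is computed with the classical formula $-4a^3 - 27b^2$ for a cubic $X^3 + aX + b$ (in this formula $a$ and $b$ are the coefficients of the cubic, not a leading coefficient). It searches by brute force for the primes $p$ modulo which $f$ has a double root in $\mathbb{F}_p$.

```python
def is_prime(n):
    """Trial division, sufficient for the small range used here."""
    return n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))


def f(x):
    return x**3 - x - 1  # monic cubic X^3 + aX + b with a = b = -1


def df(x):
    return 3 * x**2 - 1  # derivative of f


def double_roots_mod_p(p):
    """Residues x in F_p at which both f and its derivative vanish modulo p."""
    return [x for x in range(p) if f(x) % p == 0 and df(x) % p == 0]


a, b = -1, -1
discriminant = -4 * a**3 - 27 * b**2  # discriminant of X^3 + aX + b

primes_with_double_root = [p for p in range(2, 200)
                           if is_prime(p) and double_roots_mod_p(p)]
print(discriminant, primes_with_double_root)  # -23 and [23]
```

The only prime found is $23$, which is precisely the prime dividing $\Delta = -23$; modulo $23$ the double root is $10$.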

Remark 1.3.4
We could have also done this with Bézout’s lemma: if $f$ has no complex double root, $f$ and $f'$ are
coprime so that $uf + vf' = 1$ for some $u, v \in \mathbb{Q}[X]$. Multiplying by the common denominator $N$
of the coefficients of $u$ and $v$, we get that any prime $p$ for which $f$ has a double root in $\mathbb{F}_p$ divides
$N$. Thus, there are finitely many such primes.

This idea will be explored in more detail in Chapter 5. It is linked to our approach by
Exercise B.4.4† (which also shows that the discriminant $\Delta(f)$ is indeed a polynomial in the coefficients
of $f$).

Remark 1.3.5
We stated Proposition 1.3.1 that way because we have not yet developed the theory of finite fields.
However, after Chapter 4, we will no longer use this version. Instead we will use the following result:
given a polynomial $f \in \mathbb{Z}[X]$ with complex roots $\alpha_1, \ldots, \alpha_n \in \overline{\mathbb{Q}}$ and a prime $p$ which doesn’t
divide the leading coefficient of $f$, for any symmetric $S \in \mathbb{Z}[X_1, \ldots, X_n]$, we have
$$S(\alpha_1, \ldots, \alpha_n) \equiv S(\beta_1, \ldots, \beta_n) \pmod{p},$$
where $\beta_1, \ldots, \beta_n \in \overline{\mathbb{F}_p}$ are the roots of $f$ in $\overline{\mathbb{F}_p}$ this time. The proof is exactly the same as the
one for Proposition 1.3.1. This statement makes obvious the fact that this is a local-global
principle of some kind: we link global information (the roots in $\overline{\mathbb{Q}}$) to local information (the roots
in $\overline{\mathbb{F}_p}$).

With this formalism, the solution to our previous example is nicer: there is no need to make the
transformation $g = f - (X - \beta) f'(\beta) - f(\beta)$. We also get a converse of Remark 1.3.4 (for primes
which do not divide the leading coefficient of $f$): if $f$ has a double root modulo $p$, then $p \mid \Delta$.

1.4 Worked Examples


In this section, we present two complicated problems where our previous results come to the spotlight.
However, exercises in Section 1.5 will show that the theory we have developed applies to a wide variety
of situations. Although the full power of our results is often not needed, they provide a more conceptual
framework to solve these problems.

Problem 1.4.1 (AMM 10748)

Let $q$ be a prime number and $r$ be a positive integer such that $q \nmid r$. Suppose that $p > r^{q-1}$ is a prime
number congruent to 1 modulo $q$ and $a_1, \ldots, a_r$ are rational integers such that
$$p \mid \sum_{i=1}^r a_i^{\frac{p-1}{q}}.$$

Prove that $p$ divides one of the $a_i$.

Solution
Suppose for the sake of contradiction that none of the $a_i$ are zero modulo $p$. Notice that $a_i^{\frac{p-1}{q}}$ is
a $q$th root of unity modulo $p$. Let $z$ be an element of order $q$ modulo $p$; there must exist one
otherwise
$$r \equiv \sum_{i=1}^r a_i^{\frac{p-1}{q}} \equiv 0 \pmod{p}$$
which is impossible as $p > r^{q-1}$. (In fact, there must always exist one if $p \equiv 1 \pmod{q}$ but this
is proven in Chapter 3. It also turns out that, as you will see in Chapter 4, even if there did not
exist one the argument would still work.)

Then, as $\mathbb{F}_p$ is a field, the roots of $X^q - 1$ are $1, z, \ldots, z^{q-1}$ as these are all roots and this
polynomial has at most $q$ roots. Thus, consider $k_i$ such that $z^{k_i} \equiv a_i^{\frac{p-1}{q}}$.

Let $f$ be the polynomial $\sum_{i=1}^r X^{k_i}$. By assumption, $p \mid f(z)$ so
$$\prod_{k=1}^{q-1} f(z^k) \equiv 0.$$

Also, by Proposition 1.3.1 we know this is congruent to
$$\prod_{k=1}^{q-1} f(\omega^k)$$
modulo $p$ where $\omega \neq 1$ is a complex $q$th root of unity. However, by the triangle inequality,
$$\left|\prod_{k=1}^{q-1} f(\omega^k)\right| \leq \prod_{k=1}^{q-1} (|1| + \ldots + |1|) = r^{q-1}.$$

Since $p > r^{q-1}$, this means that this product must be zero.

To conclude, as mentioned after Definition 1.2.2, we know that the minimal polynomial of $\omega$ is
$\frac{X^q - 1}{X - 1}$ by Theorem 3.2.1 or Corollary 5.1.5. Thus, by Proposition 1.2.1, this means that
$$X^{q-1} + \ldots + 1 \mid f.$$

Hence, we have
$$f = (X^{q-1} + \ldots + 1) g$$
for some $g \in \mathbb{Z}[X]$ as $X^{q-1} + \ldots + 1$ is monic. Finally, this means that $q \mid f(1) = r$ which is a
contradiction.
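The statement of Problem 1.4.1 can be tested by brute force on small cases. The following Python sketch is only an experiment, not a proof: the function name `counterexamples` and the tested triples $(q, r, p)$ are my own choices. It lists the tuples of units $(a_1, \ldots, a_r)$ modulo $p$ for which $p$ divides the sum of the $a_i^{\frac{p-1}{q}}$, and the problem predicts that no such tuple exists.

```python
from itertools import product


def counterexamples(q, r, p):
    """Tuples (a_1, ..., a_r) of units mod p with p | sum of a_i^((p-1)/q).

    Problem 1.4.1 asserts that the list is empty when q is prime, q does not
    divide r, p is a prime congruent to 1 modulo q, and p > r^(q-1).
    Primality of q and p is assumed, not checked.
    """
    assert r % q != 0 and p % q == 1 and p > r ** (q - 1)
    exponent = (p - 1) // q
    return [a for a in product(range(1, p), repeat=r)
            if sum(pow(x, exponent, p) for x in a) % p == 0]


cases = [(3, 2, 7), (3, 2, 13), (5, 2, 31), (3, 4, 19)]
results = {case: counterexamples(*case) for case in cases}
print(results)
```

For instance, for $(q, r, p) = (3, 2, 13)$ the cube roots of unity modulo $13$ are $1, 3, 9$, and no two of them sum to a multiple of $13$.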


Problem 1.4.2 (Problems from the Book)


Let $a_1, \ldots, a_m \in \mathbb{R}$ be positive real numbers such that $\sqrt[n]{a_1} + \ldots + \sqrt[n]{a_m}$ is rational for any
integer $n \geq 1$. Prove that $a_1 = \ldots = a_m = 1$.

Before delving into the solution, for the convenience of the reader, we recall a special case of
Newton’s formulas (see Corollary B.2.1) that will be used in the proof.

Proposition 1.4.1

Let $\alpha_1, \ldots, \alpha_m \in \mathbb{C}$ be complex numbers. If
$$p_k(\alpha_1, \ldots, \alpha_m) := \sum_{i=1}^m \alpha_i^k$$
is rational for $k = 1, \ldots, m$, then so are
$$e_1(\alpha_1, \ldots, \alpha_m), \ldots, e_m(\alpha_1, \ldots, \alpha_m).$$
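The reason behind this proposition is that Newton's formulas express $k e_k$ through $p_1, \ldots, p_k$ and $e_1, \ldots, e_{k-1}$ using only additions and multiplications. Here is a minimal Python sketch of that recursion (an illustration: the function name `elementary_from_power_sums` is hypothetical, and the formula used is the standard form $k e_k = \sum_{i=1}^{k} (-1)^{i-1} e_{k-i} p_i$ with $e_0 = 1$).

```python
from fractions import Fraction


def elementary_from_power_sums(p):
    """Recover e_1, ..., e_m from p_1, ..., p_m via Newton's formulas.

    k * e_k = sum_{i=1}^{k} (-1)^(i-1) * e_(k-i) * p_i, with e_0 = 1.
    Only additions, multiplications and divisions by k occur, so rational
    power sums give rational elementary symmetric sums.
    """
    e = [Fraction(1)]  # e[0] = e_0
    for k in range(1, len(p) + 1):
        total = sum((-1) ** (i - 1) * e[k - i] * Fraction(p[i - 1])
                    for i in range(1, k + 1))
        e.append(total / k)
    return e[1:]


# Power sums of the numbers 1, 2, 3: p_1 = 6, p_2 = 14, p_3 = 36.
print(elementary_from_power_sums([6, 14, 36]))  # e_1 = 6, e_2 = 11, e_3 = 6
```

The output matches $(X - 1)(X - 2)(X - 3) = X^3 - 6X^2 + 11X - 6$.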

Solution

We will proceed in two steps. First, we show that $a_1, \ldots, a_m$ are all algebraic numbers. Let
$b_i = \sqrt[m!]{a_i}$. Then, by assumption, for $k \in [m]$,
$$p_k(b_1, \ldots, b_m) = \sum_{i=1}^m b_i^k$$
is a rational number. Thus, by the above proposition, the elementary symmetric polynomials
evaluated at $b_1, \ldots, b_m$ are all rational: $b_1, \ldots, b_m$ are algebraic. Our claim follows: $a_i = b_i^{m!}$ is also
algebraic.

Finally, let $N$ be a positive rational integer such that $N a_1, \ldots, N a_m$ are all algebraic integers.
There exists one by Exercise 1.4.1∗ . Notice that
$$N(\sqrt[n]{a_1} + \ldots + \sqrt[n]{a_m}) = \sqrt[n]{N^{n-1}} \left(\sqrt[n]{N a_1} + \ldots + \sqrt[n]{N a_m}\right)$$
is an algebraic integer. Since by assumption it is rational, by Proposition 1.1.1 it is a rational
integer. Call it $u_n$. Since $\sqrt[n]{a_i} \to 1$, $(u_n)$ converges to $Nm$. As it is a sequence of integers, it
must be eventually constant. Take a sufficiently large $n$ so that $u_n = Nm = u_{2n}$.

By the Cauchy-Schwarz inequality we have
$$m = \sqrt{1^2 + \ldots + 1^2} \sqrt{\sum_{i=1}^m \sqrt[n]{a_i}} \geq \sum_{i=1}^m \sqrt[2n]{a_i} = m$$
with equality if and only if all $a_i$ are the same. This is the conclusion that we wanted: since
$\sum_{i=1}^m \sqrt[n]{a_i} = m$, they must all be equal to 1.


Exercise 1.4.1∗ . Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number. Prove that there exists a rational integer $N \neq 0$ such
that $N\alpha$ is an algebraic integer.

Remark 1.4.1

We could also have finished with Newton’s formulas again: if we set $c_i = \sqrt[n!]{a_i}$ for a sufficiently
large $n$, we get that $p_k(c_1, \ldots, c_m) = m$ for $k = 1, \ldots, m$, which implies that $c_1 = \ldots = c_m = 1$
since $(X - c_1) \cdot \ldots \cdot (X - c_m) = (X - 1)^m$ by Corollary B.2.1. Thus, $a_i = c_i^{n!} = 1$ as wanted.

1.5 Exercises
Elementary-Looking Problems
Exercise 1.5.1† . Find all non-zero rational integers $a, b, c \in \mathbb{Z}$ such that $\frac{a}{b} + \frac{b}{c} + \frac{c}{a}$ and $\frac{b}{a} + \frac{c}{b} + \frac{a}{c}$ are
integers.2

Exercise 1.5.2† (USAMO 2009). Let $(a_n)_{n \geq 0}$ and $(b_n)_{n \geq 0}$ be two non-constant sequences of rational
numbers such that $(a_i - a_j)(b_i - b_j) \in \mathbb{Z}$ for any $i, j$. Prove that there exists a non-zero rational number
$r$ such that $r(a_i - a_j)$ and $\frac{b_i - b_j}{r}$ are integers for any $i, j$.

Exercise 1.5.3 (AMM E 2998). Let $x \neq y \in \mathbb{C}$ be complex numbers such that $\frac{x^n - y^n}{x - y}$ is a rational
integer for 4 consecutive values of $n$. Prove that it is always an integer for $n \geq 0$.

Exercise 1.5.4† (Adapted from Irish Mathematical Olympiad 1998). Let $x \in \mathbb{R}$ be a real number
such that both $x^2 - x$ and $x^n - x$ for some $n \geq 3$ are rational. Prove that $x$ is rational.

Exercise 1.5.5. Suppose that $a_1, \ldots, a_m \in \mathbb{Z}$ are positive rational integers such that
$$\sum_{i=1}^m \sqrt[n_i]{a_i}$$
also is a rational integer. Prove that $\sqrt[n_i]{a_i}$ is a rational integer for any $i = 1, \ldots, m$.

Exercise 1.5.6. Find the least $n$ such that $\cos\frac{\pi}{n}$ cannot be written in the form $a + \sqrt{b} + \sqrt[3]{c}$ for
some rational numbers $a, b, c$. (More generally, all such $n$ will be determined in Chapter 3.)

Exercise 1.5.7 (Miklós Schweitzer Competition 2015). Let f, g ∈ C[X] be such that

$$f \circ g = X^n + X^{n-1} + \ldots + X + 2016$$

for some integer n ≥ 4. Prove that one of them must have degree 1.
2 The point of this problem is to do it specifically with algebraic numbers. You can of course solve it elementarily,

but that will just amount to reproving one of the results we showed. The reason why this is an exercise is to enable the
reader to recognise certain situations which are immediately solved with algebraic numbers, without needing to redo all
the work each time.

Exercise 1.5.8† . Let $|x| < 1$ be a complex number. Define
$$S_n = \sum_{k=0}^{\infty} k^n x^k.$$

Suppose that there is an integer $N \geq 0$ such that $S_N, S_{N+1}, \ldots$ are all rational integers. Prove that
$S_n$ is a rational integer for any integer $n \geq 0$.
Exercise 1.5.9† . Let $n \geq 3$ be an integer. Suppose that there exists a regular $n$-gon with integer
coordinates. Prove that $n = 4$.
Exercise 1.5.10† . Let P be a polygon with rational sidelengths for which there exists a real number
α ∈ R such that all its angles are rational multiples of α, except possibly one. Prove that cos α is
algebraic.
Exercise 1.5.11 (Adapted from USA TST 2007). Let $0 < \theta < \frac{\pi}{2}$ be a real number and $m, n$ two
coprime rational integers. Suppose that $\cos\theta$ is irrational but $\cos(m\theta)$ and $\cos(n\theta)$ are both rational.
Prove that $\theta = \frac{\pi}{6}$.
Exercise 1.5.12 (IMC 2001). Let $k$ and $n$ be positive integers and let $f$ be a polynomial of degree $n$
with coefficients in $\{-1, 0, 1\}$. Suppose that $(X - 1)^k \mid f$ and that
$$\frac{p}{\log p} < \frac{k}{\log(n + 1)}$$
for some rational prime $p$. Prove that all complex roots of unity of order $p$ are roots of $f$.
Exercise 1.5.13 (IZHO 2021). Let f ∈ Q[X] be an irreducible polynomial of degree n. Prove that
there are at most n polynomials g ∈ Q[X] of degree less than n such that f | f ◦ g.
Exercise 1.5.14† . Let $\omega_1, \ldots, \omega_m$ be $n$th roots of unity. Prove that $|\omega_1 + \ldots + \omega_m|$ is either zero or
greater than $m^{-n}$.
Exercise 1.5.15† . Let $n \geq 1$ and $n_1, \ldots, n_k$ be integers. Prove that
$$\left|\cos\left(\frac{2\pi n_1}{n}\right) + \ldots + \cos\left(\frac{2\pi n_k}{n}\right)\right|$$
is either zero or greater than $\frac{1}{2(2k)^{n/2}}$.

Exercise 1.5.16† (USA TST 2014). Let $N$ be an integer. Prove that there exists a rational prime $p$
and an element $\alpha \in \mathbb{F}_p^\times$ such that the orbit $\{1, \alpha, \alpha^2, \ldots\}$ has cardinality at least $N$ and is sum-free,
meaning that $\alpha^i + \alpha^j \neq \alpha^k$ for any $i, j, k$. (You may assume that, for any $n$, there exist infinitely many
primes for which there is an element of order $n$ in $\mathbb{F}_p$. This will be proven in Chapter 3.)

Properties of Algebraic Numbers


Exercise 1.5.17. Which of the following are algebraic integers?
p5
√ p
17

• 1+ 33− 4 − 7 2.

5+1
• 2 .

3+1
• 2 .
7
• 12 .
√3 √
7−i 4 5
• 6 .
√ i+2
• 2· 2 .

Exercise 1.5.18. Prove that $\overline{\mathbb{Q}}$ is a field.



Exercise 1.5.19† . Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number with conjugates $\alpha_1, \ldots, \alpha_n$ and let $f \in \mathbb{Q}[X]$
be a polynomial. Prove that the $m$ conjugates of $f(\alpha)$ are each represented exactly $\frac{n}{m}$ times among
$f(\alpha_1), \ldots, f(\alpha_n)$.
Exercise 1.5.20† . Let $\alpha_1, \ldots, \alpha_m \in \overline{\mathbb{Q}}$ be algebraic numbers and $f \in \mathbb{Q}[X_1, \ldots, X_m]$ a polynomial.
Denote the conjugates of $\alpha_k$ by $\alpha_k^{(1)}, \ldots, \alpha_k^{(n_k)}$. Prove that the conjugates of $f(\alpha_1, \ldots, \alpha_m)$ are among3
$$\{f(\alpha_1^{(i_1)}, \ldots, \alpha_m^{(i_m)}) \mid i_k = 1, \ldots, n_k\}.$$

Exercise 1.5.21† . Let $f \in \overline{\mathbb{Z}}[X]$ be a monic polynomial and $\alpha$ be one of its roots. Prove that $\alpha$ is an
algebraic integer.
Exercise 1.5.22† . We say an algebraic integer $\alpha \in \overline{\mathbb{Z}}$ is a unit if there exists an algebraic integer
$\alpha' \in \overline{\mathbb{Z}}$ such that $\alpha\alpha' = 1$. Characterise all units.
Exercise 1.5.23† . Let $m$ be a rational integer. We say an algebraic integer $\alpha \in \overline{\mathbb{Z}}$ is a unit mod $m$ if
there exists an algebraic integer $\alpha' \in \overline{\mathbb{Z}}$ such that $\alpha\alpha' \equiv 1 \pmod{m}$. Characterise all units mod $m$.
Exercise 1.5.24. Let $\alpha \in \overline{\mathbb{Z}}$ be an algebraic integer which is not a unit. Prove that the set of residues
of algebraic integers modulo $\alpha$, denoted by $\overline{\mathbb{Z}}/\alpha\overline{\mathbb{Z}}$, is infinite.

Exercise 1.5.25† . Let $\alpha \in \overline{\mathbb{Z}}$ be a non-rational algebraic integer. Prove that there are finitely
many rational integers $m$ such that $\alpha$ is congruent to a rational integer mod $m$.
Exercise 1.5.26† (Kronecker’s Theorem). Let $\alpha \in \overline{\mathbb{Z}}$ be a non-zero algebraic integer such that all its
conjugates have modulus at most 1. Prove that it is a root of unity.

Exercise 1.5.27† . Determine all non-zero algebraic integers $\alpha \in \overline{\mathbb{Z}}$ such that all its conjugates are
real and have modulus at most 2.
Exercise 1.5.28† . Suppose that ω is a root of unity whose real part is an algebraic integer. Prove
that ω 4 = 1.

Exercise 1.5.29† . Let $\omega_1, \ldots, \omega_n$ be roots of unity. Suppose that $\frac{1}{n}(\omega_1 + \ldots + \omega_n)$ is a non-zero
algebraic integer. Prove that $\omega_1 = \ldots = \omega_n$.4
Exercise 1.5.30† . Let $\alpha \in \overline{\mathbb{Z}}$ be an algebraic integer and let $p$ be a rational prime. Must it follow
that $\alpha^n \equiv 0 \pmod{p}$ or $\alpha^n \equiv 1 \pmod{p}$ for some $n \in \mathbb{N}$?5
Exercise 1.5.31† (Lindemann-Weierstrass).
a) Given a polynomial $f \in \mathbb{C}[X]$, we write $I(\cdot, f)$ for the function $t \mapsto \int_0^t e^{t-u} f(u)\, du$. Prove that
$$I(t, f) = e^t \sum_{k=0}^m f^{(k)}(0) - \sum_{k=0}^m f^{(k)}(t)$$
where $m = \deg f$.

b) By looking at $\sum_{k=0}^n a_k I(k, f)$ with $f = X^{p-1} (X - 1)^p \cdot \ldots \cdot (X - n)^p$ for some prime number $p$
and some rational integers $a_0, \ldots, a_n$, prove Hermite’s theorem: $e$ is transcendental.

c) Prove the Lindemann-Weierstrass theorem: if $\alpha_1, \ldots, \alpha_n \in \overline{\mathbb{Q}}$ are distinct algebraic numbers,
$e^{\alpha_1}, \ldots, e^{\alpha_n}$ are linearly independent (over $\overline{\mathbb{Q}}$)6 . Deduce that $\pi$ is transcendental.

3 This time, unlike Exercise 1.5.19† , these are usually not all conjugates. For instance, for $f = X + Y$, $\alpha = \sqrt{2}$ and
$\beta = 1 - \sqrt{2}$, $1$ is not a conjugate of $2\sqrt{2} - 1$.
4 In fact, any algebraic integer that can be written as a linear combination of roots of unity with rational coefficients

can also be written as a linear combination of roots of unity with integer coefficients. However, this is a difficult result
to prove (see Exercise 3.5.26† for a special case).
5 In Chapter 4, we prove that the answer is positive for sufficiently large p.
6 Equivalently, if $\alpha_1, \ldots, \alpha_n \in \overline{\mathbb{Q}}$ are linearly independent (over $\mathbb{Q}$), $e^{\alpha_1}, \ldots, e^{\alpha_n}$ are algebraically independent.
Chapter 2

Quadratic Integers

Prerequisites for this chapter: Chapter 1 and Section A.2.

It is best to start with an example. Suppose we want to solve the equation $x^2 + 1 = y^3$. Write this
equation as
$$(x + i)(x - i) = y^3.$$
Imagine that we could conclude $x + i = (a + bi)^3$ (this is analogous to the rational integers case: if a
product of two coprime integers is a cube, then each factor is a cube1 ). Thus, after expanding this, we
find $x = a(a^2 - 3b^2)$ and $1 = -b(b^2 - 3a^2)$. This is now very easy to solve: $b = \pm 1$ since it divides 1,
so $3a^2 = 1 \pm 1$ since $b^2 - 3a^2$ also divides 1. Thus, this implies $a = 0$ which finally means $x = 0$. We
conclude that the only solution is $(x, y) = (0, 1)$.
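Before formalising the argument, one can at least check the conclusion experimentally. The Python sketch below is a brute-force search, not a proof: the function name `solutions` and the bound $10^4$ are arbitrary choices for illustration. It looks for integer solutions of $x^2 + 1 = y^3$ with $1 \leq y \leq 10^4$.

```python
from math import isqrt


def solutions(bound):
    """Integer pairs (x, y) with x^2 + 1 = y^3 and 1 <= y <= bound."""
    found = []
    for y in range(1, bound + 1):  # y^3 = x^2 + 1 >= 1 forces y >= 1
        x = isqrt(y**3 - 1)
        if x * x + 1 == y**3:
            found.extend(sorted({(x, y), (-x, y)}))
    return found


print(solutions(10_000))  # only (0, 1) appears in this range
```

The search returns only $(0, 1)$, in agreement with the heuristic computation above.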
It is remarkable to see that we have solved a problem about rational integers by introducing a
certain class of non-rational algebraic integers. The following sections aim to formalize this approach.
Exercise 2.0.1. Why is the "naive" approach of factorising the equation as $x^2 = (y - 1)(y^2 + y + 1)$ difficult
to conclude with? Why does our solution not work as well for the equation $x^2 - 1 = y^3$?

2.1 General Definitions


Given a quadratic integer α (meaning an algebraic integer of degree 2), we define the set
Z[α] := Z + αZ = {a + bα | a, b ∈ Z}.
Given a quadratic number α (meaning an algebraic number of degree 2), we define the set
Q(α) := Q + αQ = {a + bα | a, b ∈ Q}.
The former is in fact a ring, while the latter is a field (called a quadratic field ).

Remark 2.1.1
Normally, Z[α] is defined as the smallest ring containing Z and α, i.e. the ring of all polynomials
in α with integer coefficients. Similarly, Q(α) is defined as the smallest field containing Q and α,
i.e. the field of all rational functions in α with rational coefficients. We have chosen the previous
definition for the sake of clarity. See Chapter 6 for the general definition.

It might be confusing to see square brackets used for Z[α] while round brackets are used for Q(α).
In fact, Q[α] also exists: this is the smallest ring containing α as well as Q (so polynomials in α with
rational coefficients). It turns out that for algebraic numbers α, Q[α] = Q(α) (Exercise 6.1.2∗ ).
Thus, while it is technically correct to use square brackets, we have used round brackets to
emphasise the fact that it is a field.

1 However it remains to prove that x + i and x − i are indeed coprime, for a suitable definition of coprime. Or, if it’s
not the case, incorporate the gcd in the argument, again, for a suitable definition of gcd.

Exercise 2.1.1∗ . Prove that Z + αZ is a ring for any quadratic integer α. This amounts to checking that it
is closed under addition, subtraction, and multiplication. What happens if α is a quadratic number which is
not an integer?

Exercise 2.1.2∗ . Prove that Q + αQ is a field for any quadratic number α. This amounts to checking that it
is closed under addition, subtraction, multiplication, and division by non-zero elements.

Exercise 2.1.3∗ . Let α be a quadratic number and β ∈ Q(α). Show that β has degree 1 or 2.


Exercise 2.1.4∗ . Prove that a quadratic field $K$ is equal to $\mathbb{Q}(\sqrt{d})$ for some squarefree rational integer
$d \neq 1$. Moreover, prove that such fields are pairwise non-isomorphic (and in particular distinct), meaning
that, for distinct squarefree $a, b \neq 1$, there does not exist a bijective function $f : \mathbb{Q}(\sqrt{a}) \to \mathbb{Q}(\sqrt{b})$ such that
$f(x + y) = f(x) + f(y)$ and $f(xy) = f(x)f(y)$ for any $x, y \in \mathbb{Q}(\sqrt{a})$.

We have seen that algebraic integers have very important properties in algebraic number theory.

Definition 2.1.1 (Ring of Integers of Quadratic Fields)

Let α be a quadratic number. We define the ring of integers of K = Q(α) to be the ring

$$\mathcal{O}_{\mathbb{Q}(\alpha)} := \mathbb{Q}(\alpha) \cap \overline{\mathbb{Z}}$$

consisting of the elements of Q(α) which are also algebraic integers.

The following proposition characterises the ring of integers of a quadratic field since by Exercise 2.1.4∗
any quadratic field is of the form $\mathbb{Q}(\sqrt{d})$ for some $d$.

Proposition 2.1.1*

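Let $d \in \mathbb{Z}$ be a squarefree rational integer. We have
$$\mathcal{O}_{\mathbb{Q}(\sqrt{d})} = \mathbb{Z}[\sqrt{d}]$$
if $d \not\equiv 1 \pmod 4$, and
$$\mathcal{O}_{\mathbb{Q}(\sqrt{d})} = \mathbb{Z}\left[\frac{1 + \sqrt{d}}{2}\right]$$
if $d \equiv 1 \pmod 4$.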

Remark 2.1.2

There is no ambiguity in writing $\mathbb{Q}(\sqrt{d})$ for negative $d$ as the square root of $d$ we choose doesn’t
change that field. For instance, $\mathbb{Z}[i] = \mathbb{Z}[-i]$ so we can write $\mathbb{Z}[\sqrt{-1}]$.

Proof

This follows from Exercise 1.2.5, which we reproduce here for the sake of completeness. Let
$x = a + b\sqrt{d}$ be an element of $\mathbb{Q}(\sqrt{d})$. If $x$ is rational then it is an integer if and only if it is a rational integer, which is indeed
in $\mathbb{Z}[\sqrt{d}]$ and $\mathbb{Z}\left[\frac{1 + \sqrt{d}}{2}\right]$. Otherwise, $x$ is an integer if and only if its minimal polynomial
$$X^2 - 2aX + (a^2 - db^2)$$
has integral coefficients, by Proposition 1.2.2. This means that $2a \in \mathbb{Z}$ and $a^2 - db^2 \in \mathbb{Z}$. Thus,
$4b^2 d \in \mathbb{Z}$ so $2b \in \mathbb{Z}$ since $d$ is a squarefree rational integer. Let $a = \frac{a'}{2}$ and $b = \frac{b'}{2}$ for some
$a', b' \in \mathbb{Z}$. We see that $x$ is an integer if and only if
$$a^2 - db^2 = \frac{(a')^2 - d(b')^2}{4} \in \mathbb{Z}.$$
This is now an easy exercise in congruences: if one of $a', b'$ is odd then the other one must be
too since $4 \nmid d$. However, an odd perfect square is always congruent to $1$ modulo $4$, thus if $d \equiv 1 \pmod 4$,
$(a', b')$ works if and only if they have the same parity, otherwise they must both be even.
This is exactly what we wanted to prove.
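The congruence step can be checked by brute force. In this sketch (the function names are illustrative, not from the text), `is_integer` encodes the criterion from the proof, namely that $\frac{a' + b'\sqrt{d}}{2}$ is an algebraic integer exactly when $(a')^2 - d(b')^2$ is divisible by $4$, and `predicted` encodes the parity conditions stated in the conclusion.

```python
def is_integer(ap, bp, d):
    """Criterion from the proof for (ap + bp*sqrt(d))/2: the norm (ap^2 - d*bp^2)/4 must be integral.

    The trace ap is automatically integral, so only the norm needs checking.
    """
    return (ap * ap - d * bp * bp) % 4 == 0


def predicted(ap, bp, d):
    """Parity condition claimed by the proposition for a squarefree d."""
    if d % 4 == 1:  # Python's % returns a value in {0, 1, 2, 3}, even for negative d
        return (ap - bp) % 2 == 0
    return ap % 2 == 0 and bp % 2 == 0


def check(ds, bound):
    """Compare both conditions for every d in ds and all |ap|, |bp| <= bound."""
    return all(
        is_integer(ap, bp, d) == predicted(ap, bp, d)
        for d in ds
        for ap in range(-bound, bound + 1)
        for bp in range(-bound, bound + 1)
    )
```

For instance, `check([-7, -3, -1, 2, 5, 13], 6)` compares the two conditions on a small grid of values for a few squarefree $d$.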


Since a quadratic number has exactly one conjugate distinct from itself, we will call it the conjugate.

Definition 2.1.2 (Conjugation in a Quadratic Field)


Let $d \neq 1$ be a squarefree rational integer, and let $\alpha = a + b\sqrt{d} \in \mathbb{Q}(\sqrt{d})$. The conjugate of $\alpha$,
denoted $\overline{\alpha}$, is $a - b\sqrt{d}$.

In particular, this conjugate is also defined for rational numbers, in which case we have $\overline{\alpha} = \alpha$.
It is true that this is the same notation as the complex conjugate, but the context will make it clear
what is meant. (Note that, when $d < 0$, this conjugate is exactly the complex conjugate, but this is
the only time this happens.)
Exercise 2.1.5∗ . Prove that the conjugate is well defined.

Exercise 2.1.6∗ . Let $d \neq 1$ be a rational squarefree number. Prove that the conjugation satisfies $\overline{\alpha + \beta} = \overline{\alpha} + \overline{\beta}$
and $\overline{\alpha\beta} = \overline{\alpha}\,\overline{\beta}$ for all $\alpha, \beta \in \mathbb{Q}(\sqrt{d})$. Such a function is called an automorphism of $\mathbb{Q}(\sqrt{d})$ if it is also bijective.

Exercise 2.1.7. Let $d \neq 1$ be a rational squarefree number. Prove that the only automorphisms of $\mathbb{Q}(\sqrt{d})$
are the identity and conjugation.
We now define a very important map. See Chapter 6 for more.

Definition 2.1.3 (Absolute Norm)

Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number. Its absolute norm $N(\alpha)$ is defined as the product of its
conjugates.

In other words, the norm of $\alpha$ is $(-1)^n$ times the constant coefficient of its minimal polynomial, where $n$ is the degree of $\alpha$, by
Vieta’s formulas A.1.4. This norm however isn’t convenient to work with in specific fields because it
is not homogeneous: $N(2) = 2^1 N(1)$ but $N(2\sqrt{2}) = 2^2 N(\sqrt{2})$. This is because $\sqrt{2}$ has two conjugates
while $1$ has only one conjugate. We thus define

Definition 2.1.4 (Norm in Quadratic Fields)



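Let $d \neq 1$ be a squarefree rational integer. We define the norm map $N_{\mathbb{Q}(\sqrt{d})} : \mathbb{Q}(\sqrt{d}) \to \mathbb{Q}$ as
$$N_{\mathbb{Q}(\sqrt{d})}(\alpha) = \alpha\overline{\alpha}.$$
When the context makes it clear what the base field is, we will drop the $\mathbb{Q}(\sqrt{d})$ and simply write
$N$.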

This norm is now homogeneous, and even multiplicative! It corresponds to the absolute norm for
quadratic integers, and to the square of the absolute norm for rational integers.
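To make the last remark concrete, here is a small sketch of arithmetic in $\mathbb{Q}(\sqrt{d})$, with an element $a + b\sqrt{d}$ stored as the pair `(a, b)`. The representation and the function names are choices made for this illustration only.

```python
from fractions import Fraction


def mul(x, y, d):
    """Product of x = (a, b) and y = (c, e), standing for a + b*sqrt(d) and c + e*sqrt(d)."""
    a, b = x
    c, e = y
    return (a * c + d * b * e, a * e + b * c)


def conj(x):
    """Conjugate a - b*sqrt(d) of a + b*sqrt(d)."""
    a, b = x
    return (a, -b)


def norm(x, d):
    """Norm in Q(sqrt(d)): x times its conjugate, which equals a^2 - d*b^2."""
    a, b = x
    return a * a - d * b * b
```

With $d = 2$, `norm((2, 0), 2)` is $4 = 2^2 N(1)$ and `norm((0, 2), 2)` is $-8 = 2^2 N(\sqrt{2})$, so scaling by $2$ now multiplies the norm by $2^2$ in both cases. Coordinates may also be `Fraction` values for non-integral elements.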


Exercise 2.1.8∗ . Let $d \neq 1$ be a squarefree rational integer, and $\alpha, \beta \in \mathbb{Q}(\sqrt{d})$. Prove that $N(\alpha\beta) =
N(\alpha)N(\beta)$.

Exercise 2.1.9. Prove Exercise 2.1.8∗ without any computations using Exercise 2.1.6∗ .

Exercise 2.1.10. Let $d < 0$ be a squarefree integer. Prove that the conjugate of an element of $\mathbb{Q}(\sqrt{d})$ is the
same as its complex conjugate. In particular, the norm over $\mathbb{Q}(\sqrt{d})$ is the modulus squared.

2.2 Unique Factorisation


This section will be a bit of abstract nonsense. I hope the reader doesn’t get too confused.
Our goal is to have an analogue of the fundamental theorem of arithmetic in quadratic rings
of integers. This, however, will not hold over every such ring (in fact it hasn’t even been proven
that it holds for infinitely many ones!) but it will still yield substantial applications such as the
diophantine equation we "solved" in the beginning of the chapter. First we have to define what
"unique factorisation" means. It’s not just "each element can be written in a unique way as a product
of primes", because in Z we need to add a sign for negatives. This is because Z has two units (1 and
−1), while N only has one (1).

Definition 2.2.1 (Unit)

We say an element α of a ring R is a unit if it’s invertible: i.e., there exists some β ∈ R such that
αβ = βα = 1.

Exercise 2.2.1∗ . Let $d \neq 1$ be a squarefree rational integer. Prove that the product of two units of $\mathcal{O}_{\mathbb{Q}(\sqrt{d})}$
is still a unit, and that the conjugate of a unit is also a unit.

Exercise 2.2.2∗ . Let $d \neq 1$ be a squarefree rational integer. Prove that $\alpha \in \mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ is a unit if and only if
$|N(\alpha)| = 1$.

Exercise 2.2.3∗ . Determine the units of the ring Z[i].

Definition 2.2.2 (Unique Factorisation Domain)

We say an integral domain R has unique factorisation and is a unique factorisation domain
(UFD) if there exists a set of elements of R called primes such that each non-zero element α ∈ R
can be written in a unique way as a product of primes

$$\alpha = p_1 \cdot \ldots \cdot p_n$$

up to permutation and multiplication by units.

Indeed, the factorisation 6 = (−2)(−3) doesn’t bring anything new to the factorisation 6 = 2 · 3.
In Z, there is a canonical way to say which of −2 and 2 is prime, but in general there isn’t (and it
is actually more useful to say they are both prime). Thus, we say two primes p and q are associates
if there is a unit u such that q = up. Having unique factorisation then means that it is unique up to
permutations and associates.
We now discuss some ways to prove an integral domain is a UFD. Recall how unique factorisation
is proven in Z. We define prime numbers as usual, then prove Bézout’s lemma (if a and b are coprime
there are x and y such that ax + by = 1) and from this deduce the fundamental Euclid lemma: if
some prime divides a product, it divides one of the factors. Finally, we induct on the natural integers
to show that a prime factorisation always exists and, using Euclid’s lemma, that it’s unique up to
permutation.

We wish to imitate this process. It suggests that the fundamental fact about prime numbers is the
Euclid lemma, and not that they can’t be written as a non-trivial product. It also suggests that Bézout’s
lemma is the fundamental step. We thus make the following definitions.

We first take care of our "objection" about primes: they should be defined as having the Euclid
property instead of not being writable as a non-trivial product.

Definition 2.2.3 (Prime Element)

We say a non-unit p ∈ R is a prime element if it is non-zero and, for all a, b ∈ R, p | ab implies
p | a or p | b. (Divisibility is defined as usual: α | β if there exists a γ ∈ R such that β = αγ.)

Definition 2.2.4 (Irreducible Element)

We say a non-unit x ∈ R is an irreducible element if it is non-zero and x = αβ implies that α is
a unit or β is one.

The usual definition of a prime in Z is thus as an irreducible element, instead of as a prime one.
To further distinguish prime elements from prime numbers, we will thus call the latter rational primes
(because they are rational integers). This is somewhat contradictory as the primes of Z are ±p, but
by "rational prime" we will always mean a prime of N, i.e. a positive prime number.
Exercise 2.2.4∗ . Prove that an associate of a prime is also prime.

Exercise 2.2.5∗ . Prove that the conjugate of a prime is also a prime.

Exercise 2.2.6∗ . Prove that primes are irreducible.

Exercise 2.2.7∗ . Let $d \neq 1$ be a squarefree rational integer and let $x \in \mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ be a quadratic integer.
Suppose that $|N(x)|$ is a rational prime. Prove $x$ is irreducible.

Exercise 2.2.8∗ . Suppose a prime p divides another prime q. Prove that p and q are associates.

Exercise 2.2.9∗ . Prove that p is a prime element of R if and only if it is non-zero and R (mod p) is an
integral domain (this means that the product of two non-zero elements is still non-zero). In particular, if R
(mod p) is a field (this means that elements which are not divisible by p have an inverse mod p), p is prime.

Exercise 2.2.10. Let $d \neq 1$ be a squarefree rational integer and let $p \in \mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ be a prime. Prove that $p$
divides exactly one rational prime $q \in \mathbb{Z}$.

Exercise 2.2.11. Prove that $2$ is irreducible in $\mathbb{Z}[\sqrt{-5}] = \mathcal{O}_{\mathbb{Q}(\sqrt{-5})}$ but not prime.

Exercise 2.2.12. Show that the primes of Definition 2.2.2 must all be prime elements, and that there is at
least one associate of each prime element in that set. (Conversely, if we have unique factorisation, any such
set of primes works. This explains why we consider all primes defined in Definition 2.2.3.)

We now define formally the "Bézout property".

Definition 2.2.5 (Bézout Domain)

We say an integral domain R is a Bézout domain if, for any α, β ∈ R, there exists γ ∈ R such that

αR + βR = γR.

We say such a γ is a greatest common divisor (gcd) of α and β.



Exercise 2.2.13∗ . Prove that a greatest common divisor γ of α and β really is a greatest common divisor of
α and β, in the sense that γ | α, β and if δ | α, β then δ | γ.

Exercise 2.2.14∗ . Prove that an associate of a greatest common divisor is also a greatest common divisor, and
that the greatest common divisor of two elements is unique up to association.
Now that we have defined rings where Bézout’s lemma holds, let’s see how we can prove that the
rings we are interested in have this property. Here is the usual proof that Z is a Bézout Domain.

Proof that Z is a Bézout Domain

Let a, b ∈ Z be rational integers, without loss of generality positive. Let c be the minimal positive
element of aZ + bZ. We wish to show that aZ + bZ = cZ. Suppose that it is not the case: there
exists some d ∈ aZ + bZ with c ∤ d. Perform the Euclidean division of d by c: d = cq + r where 0 < r < c.
Thus, we have a positive element r = d − cq ∈ aZ + bZ smaller than c, a contradiction since we assumed
c was the smallest.
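The proof above is not constructive, but repeating the same Euclidean division yields explicit Bézout coefficients. The following sketch uses the standard extended Euclidean algorithm, which is swapped in here for illustration and is not the argument of the proof; the name `bezout` is hypothetical.

```python
def bezout(a, b):
    """Return (g, x, y) with a*x + b*y == g, where g generates aZ + bZ.

    Invariant of the loop: a == a0*x0 + b0*y0 and b == a0*x1 + b0*y1,
    where (a0, b0) are the original inputs.
    """
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q, r = divmod(a, b)  # Euclidean division: a = q*b + r
        a, b = b, r
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0
```

For example, `bezout(240, 46)` returns a triple `(g, x, y)` with `g == 2` and `240*x + 46*y == 2`.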


Aha! What we need is a Euclidean division! More specifically, we need to be able to write α = ρβ+τ
for some τ which is "smaller" in some sense than β. This yields the following definition.

Definition 2.2.6 (Euclidean Domain)

We say an integral domain R is Euclidean if there exists a function f : R → N such that for any
α, β ∈ R with β ≠ 0 there exist ρ, τ ∈ R such that α = ρβ + τ and f (τ ) < f (β). Such a function
f will be called a Euclidean function.

Remark 2.2.1
The remainder doesn’t have to be unique.

Exercise 2.2.15. Let R be a Euclidean domain with Euclidean function f . Show that, if f (α) = 0, then
α = 0, and if f (α) = 1, then α is a unit or zero.
The reason why we introduced a function $f : R \to \mathbb{N}$ is to get a measure of the size of an element of
$R$. This is the role of $f(n) = n$ over $\mathbb{N}$, and $f(n) = |n|$ over $\mathbb{Z}$ (if one wishes to prove directly that
unique factorisation holds there). Over quadratic rings of integers, this function will usually be the
absolute value of the norm. If $\mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ is Euclidean for the absolute value of the norm, we say it is
norm-Euclidean. By abuse of terminology we will also sometimes say $\mathbb{Q}(\sqrt{d})$ is norm-Euclidean.
The same proof as the proof that $\mathbb{Z}$ is a Bézout domain shows the following very important proposition.

Proposition 2.2.1*

Any Euclidean domain is a Bézout domain.

Exercise 2.2.16∗ . Prove that a Euclidean domain is a Bézout domain.

Exercise 2.2.17∗ . Prove that irreducible elements are prime in a Bézout domain.
We can now state our main theorem. It might seem a bit restrictive but it works over any ring of
integers, and we invite the reader to prove it after reading Chapter 6.

Theorem 2.2.1

Any quadratic ring of integers which is a Bézout domain is a UFD.



Proof

Let O be that ring of integers. We proceed exactly like in Z. First, we prove the existence of
a prime factorisation. Suppose that a non-zero element α ∈ O has no prime factorisation and
choose its norm N (α) to be the smallest in absolute value. Clearly, α isn’t a unit since units are
their own prime factorisation (the empty factorisation) and isn’t prime either. Since irreducible
elements are primes by Exercise 2.2.17∗ , α is not irreducible so α = βγ for some non-units
β, γ. Since N (α) = N (β)N (γ), we have |N (β)|, |N (γ)| < |N (α)| because |N (β)|, |N (γ)| ≠ 1 by
Exercise 2.2.2∗ . Thus, since α was the smallest element with no prime factorisation, β and γ
have one: this means that βγ = α also has one, a contradiction.

It remains to prove the uniqueness of this factorisation. Suppose an element $\alpha$ has two different
prime factorisations
$$p_1 \cdot \ldots \cdot p_n = q_1 \cdot \ldots \cdot q_m$$
and take $m + n$ to be minimal. Since $p_n$ is prime, by definition, it must divide one of the $q_i$, say,
$q_m$. By Exercise 2.2.8∗, $up_n = q_m$ for some unit $u$. Finally, this means that
$$p_1 \cdot \ldots \cdot p_{n-1} = q_1 \cdot \ldots \cdot (uq_{m-1})$$
so we have two different factorisations of smaller length for the same element. This is a
contradiction since we assumed $m + n$ was minimal.


Combining this with Proposition 2.2.1, we have proven that it suffices to find a Euclidean function
to show that a quadratic ring of integers has unique factorisation. By abuse of notation, we will also
say that $\mathbb{Q}(\sqrt{d})$ has unique factorisation if $\mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ does.

Remark 2.2.2
Most quadratic rings of integers are not Euclidean domains, Bézout domains, or even UFDs. In
fact, it has only been conjectured that there are infinitely many squarefree $d \neq 1$ in $\mathbb{Z}$ such that
$\mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ is a UFD! For negative $d$ there is a complete list

$$\{-1, -2, -3, -7, -11, -19, -43, -67, -163\}$$

but the problem is still open for positive $d$.

Similarly, $\mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ is norm-Euclidean only for

$$d \in \{-11, -7, -3, -2, -1, 2, 3, 5, 6, 7, 11, 13, 17, 19, 21, 29, 33, 37, 41, 57, 73\}.$$

On the other hand, it has been conjectured that, for positive $d$, if $\mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ is a UFD then it is
Euclidean for some exotic Euclidean function! This has been proven recently for $d = 14$ and
$d = 69$. Note that these are not part of the previous list. For negative $d$, $\mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ is Euclidean if
and only if $d \in \{-1, -2, -3, -7, -11\}$.

2.3 Gaussian Integers


Time for applications! We go back to the Gaussian integers $\mathbb{Z}[i]$ which we used at the beginning. The
norm in $\mathbb{Q}(i)$ is $N(a + bi) = a^2 + b^2$.

Proposition 2.3.1*

Z[i] is norm-Euclidean.

Proof

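Let $\alpha = a + bi \in \mathbb{Z}[i]$ and $\beta = c + di \in \mathbb{Z}[i]$ with $\beta \neq 0$. Consider the number $x + yi = \frac{\alpha}{\beta}$. Choose rational
integers $m$ and $n$ such that $|x - m| \leq \frac{1}{2}$ and $|y - n| \leq \frac{1}{2}$. Thus,
$$|N(x + yi - (m + ni))| \leq \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^2 = \frac{1}{2}.$$
Hence,
$$|N(\alpha - \beta(m + ni))| = |N(\beta)| \cdot |N(x + yi - (m + ni))| \leq \frac{|N(\beta)|}{2}$$
which means that the remainder $\tau = \alpha - \beta(m + ni)$ works since it has norm less than $|N(\beta)|$.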
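The rounding argument translates directly into an algorithm. Below is a minimal sketch using exact integer arithmetic, with a Gaussian integer $a + bi$ stored as the pair `(a, b)`; this representation and all function names are assumptions of the illustration.

```python
def g_mul(u, v):
    """Product of the Gaussian integers u = (a, b) and v = (c, d)."""
    return (u[0] * v[0] - u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def g_norm(u):
    """Norm a^2 + b^2 of a + bi."""
    return u[0] * u[0] + u[1] * u[1]


def nearest(num, den):
    """An integer m with |num/den - m| <= 1/2, for den > 0 (ties are rounded up)."""
    return (2 * num + den) // (2 * den)


def g_divmod(alpha, beta):
    """Return (q, tau) with alpha = q*beta + tau and N(tau) <= N(beta)/2, for beta != 0."""
    a, b = alpha
    c, d = beta
    n = g_norm(beta)
    # alpha / beta = alpha * conj(beta) / N(beta) = ((ac + bd) + (bc - ad)i) / n
    q = (nearest(a * c + b * d, n), nearest(b * c - a * d, n))
    qb = g_mul(q, beta)
    return q, (a - qb[0], b - qb[1])


def g_gcd(alpha, beta):
    """A gcd of alpha and beta, obtained by repeated Euclidean division."""
    while beta != (0, 0):
        alpha, beta = beta, g_divmod(alpha, beta)[1]
    return alpha
```

As Remark 2.2.1 points out, the remainder is not unique: a different tie-breaking rule in `nearest` would give another valid quotient and remainder.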


Corollary 2.3.1*

Z[i] has unique factorisation.

We shall now analyse the prime elements of $\mathbb{Z}[i]$. Suppose $\alpha \in \mathbb{Z}[i]$ is prime. Then $N(\alpha) =
\alpha\overline{\alpha}$ must have at most two rational prime factors since it has exactly two prime factors in $\mathbb{Z}[i]$ (by
Exercise 2.2.5∗). Moreover, if it has exactly two rational prime factors, then $\alpha$ is an associate of one
of them and we may assume without loss of generality that it is a rational prime.
The problem of finding the primes of $\mathbb{Z}[i]$ is therefore reduced to finding when a rational prime
$p \in \mathbb{Z}$ stays prime in $\mathbb{Z}[i]$, and when it splits as a product of two Gaussian primes $\alpha\overline{\alpha}$. Indeed, if
$N(\alpha) = -p$ then $N(i\alpha) = p$ so we may assume $p$ is positive.

Theorem 2.3.1 (Gaussian Primes)

The primes of $\mathbb{Z}[i]$ are, up to multiplication by a unit,

• $1 - i$.
• $a + bi$ and $a - bi$ where $a^2 + b^2 = p$ for some (positive) rational prime $p \equiv 1 \pmod 4$.
• $p$ where $p \equiv -1 \pmod 4$ is some (positive) rational prime.

In algebraic number theory terminology, we say

• $2$ ramifies because it becomes non-squarefree ($2 = i(1 - i)^2$),
• $p \equiv 1 \pmod 4$ splits, and
• $p \equiv -1 \pmod 4$ stays inert because it stays prime.

Proof

First, we see that $2 = (1 + i)(1 - i) = i(1 - i)^2$ and that $N(1 - i) = 2$ so these are primes by
Exercise 2.2.7∗.
Suppose an odd rational prime $p \in \mathbb{Z}$ does not stay inert in $\mathbb{Z}[i]$. Then, by the previous discussion,
there is an $\alpha = a + bi \in \mathbb{Z}[i]$ such that $a^2 + b^2 = N(\alpha) = p$. The numbers $a$ and $b$ are clearly not
divisible by $p$, so $(a \cdot b^{-1})^2 + 1 \equiv 0 \pmod p$. By Exercise 2.3.1∗, we must have $p \equiv 1 \pmod 4$.
Thus, $p \equiv 3 \pmod 4$ stays inert. It remains to prove that $p \equiv 1 \pmod 4$ splits. This follows
from Exercise 2.3.2∗: let $n$ be an integer such that $p \mid n^2 + 1$. Then,
$$p \mid (n + i)(n - i)$$
but $p \nmid n + i, n - i$ so $p$ isn’t prime in $\mathbb{Z}[i]$ as wanted. To show that it doesn’t ramify, write $p = \pi\overline{\pi}$
for some Gaussian prime $\pi$ and notice that the gcd of $a + bi = \pi$ and $a - bi = \overline{\pi}$ divides $2a$ and
$2b$ so divides $2$. However, for $p \neq 2$, $\pi$ doesn’t divide $2$ so they have gcd $1$, i.e. $\pi$ and $\overline{\pi}$ are not
associates.


Exercise 2.3.1∗ . Let $n \in \mathbb{Z}$ be a rational integer and $p$ an odd rational prime. If $n^2 \equiv -1 \pmod p$, prove
that $p \equiv 1 \pmod 4$.

Exercise 2.3.2∗ . Let $p \equiv 1 \pmod 4$ be a rational prime. Prove that there exists a rational integer $n$ such
that $n^2 \equiv -1 \pmod p$. (Hint: Consider $(p - 1)!$.)

As a corollary, we get

Corollary 2.3.2 (Fermat’s Two-Square Theorem)

Any rational prime congruent to 1 modulo 4 is a sum of two squares of rational integers.
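The proof of Theorem 2.3.1 suggests a way to actually compute such a representation: if $p \mid n^2 + 1$, then a gcd of $p$ and $n + i$ in $\mathbb{Z}[i]$ is neither a unit nor an associate of $p$, so it has norm $p$. The sketch below follows this idea under stated assumptions: Gaussian integers are stored as pairs, $n$ is found by a naive search (not the method hinted at in Exercise 2.3.2∗), and all names are illustrative.

```python
def nearest(num, den):
    """An integer m with |num/den - m| <= 1/2, for den > 0."""
    return (2 * num + den) // (2 * den)


def g_rem(alpha, beta):
    """A remainder of alpha modulo beta in Z[i], of norm at most N(beta)/2."""
    a, b = alpha
    c, d = beta
    n = c * c + d * d
    m, k = nearest(a * c + b * d, n), nearest(b * c - a * d, n)
    # (m + ki)(c + di) = (mc - kd) + (md + kc)i
    return (a - (m * c - k * d), b - (m * d + k * c))


def g_gcd(alpha, beta):
    """A gcd of alpha and beta in Z[i], by the Euclidean algorithm."""
    while beta != (0, 0):
        alpha, beta = beta, g_rem(alpha, beta)
    return alpha


def two_squares(p):
    """For a rational prime p congruent to 1 mod 4, return (a, b) with a^2 + b^2 == p.

    Raises StopIteration if no n with n^2 + 1 divisible by p exists.
    """
    n = next(n for n in range(1, p) if (n * n + 1) % p == 0)
    a, b = g_gcd((p, 0), (n, 1))
    return abs(a), abs(b)
```

For $p = 13$ the search finds $n = 5$, and the gcd of $13$ and $5 + i$ is an associate of $2 + 3i$, giving $13 = 2^2 + 3^2$.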

Exercise 2.3.3. Which rational integers can be written as a sum of two squares of rational integers?

Exercise 2.3.4∗ . Find all rational integer solutions to the equation $x^2 + 1 = y^3$. (This is the example we
considered in the beginning of the chapter.)

2.4 Eisenstein Integers



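In this section we look at the field of Eisenstein numbers $\mathbb{Q}(j)$ where $j = \exp\left(\frac{2i\pi}{3}\right) = \frac{-1 + i\sqrt{3}}{2}$ satisfies
$$0 = \frac{j^3 - 1}{j - 1} = j^2 + j + 1.$$
By Proposition 2.1.1, we have $\mathcal{O}_{\mathbb{Q}(j)} = \mathbb{Z}[j]$ since $-3 \equiv 1 \pmod 4$.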

Remark 2.4.1
A small word of warning: the notations we use for a primitive third root of unity j, along with
the notation for a primitive fourth root of unity i are the same as the ones we usually use for
indexing sums, sets, etc. Which notation is being used should be clear from the context, and,
most of the time (but not always), we shall also redefine j before using it, as it is less standard than
i.

Exercise 2.4.1∗ . Prove that the norm of $a + bj$ is $a^2 - ab + b^2$. (Bonus: do it without any computations using
cyclotomic polynomials from Chapter 3.)

Exercise 2.4.2∗ . Determine the units of Z[j].

Exercise 2.4.3∗ . Prove that Z[j] is norm-Euclidean.

Exercise 2.4.4. Characterise the primes of $\mathbb{Z}[j]$. Conclude that when $p \equiv 1 \pmod 3$ there exist rational
integers $a$ and $b$ such that $p = a^2 - ab + b^2$. (You may assume that there is an $x \in \mathbb{Z}$ such that $x^2 + x + 1 \equiv 0 \pmod p$
if $p \equiv 1 \pmod 3$. This will be proven in Chapter 3, as a corollary of Theorem 3.3.1.)

We now look at a very interesting application of Eisenstein integers: Fermat’s last theorem for
n = 3. In fact, we will even show the following stronger result.

Theorem 2.4.1

There do not exist non-zero Eisenstein integers $\alpha, \beta, \gamma \in \mathbb{Z}[j]$ such that $\alpha^3 + \beta^3 + \gamma^3 = 0$.

Let $\lambda = 1 - j$. Since $3 = N(1 - j) = (1 - j)(1 - j^2) = \lambda^2(1 + j)$ we see that $\lambda$ is prime and that
$\lambda^2$ is the prime factorisation of $3$ (up to a unit) because $1 + j = -j^2$ is a unit.

Exercise 2.4.5∗ . Let $\theta \in \mathbb{Z}[j]$ be an Eisenstein integer. Prove that, if $\lambda \nmid \theta$, then $\theta \equiv \pm 1 \pmod \lambda$. In that
case, prove that we also have $\theta^3 \equiv \pm 1 \pmod{\lambda^4}$.

Proof

We will in fact prove that the equation
$$\alpha^3 + \beta^3 + \varepsilon\gamma^3 = 0$$
where $\varepsilon$ is a unit does not have non-zero solutions in $\mathbb{Z}[j]$ where $\lambda \nmid \alpha, \beta$. This will imply that
$\alpha^3 + \beta^3 + \gamma^3 = 0$ does not have non-zero solutions either. Indeed, suppose $(\alpha, \beta, \gamma)$ is a solution of
the latter. Without loss of generality, suppose they are pairwise coprime. Then, either $\lambda \nmid \alpha, \beta, \gamma$
in which case it is also a solution to the former, or we can suppose $\lambda \mid \gamma$ by symmetry which
again makes it a solution of the former as $\lambda$ can’t divide $\alpha$ or $\beta$.
Thus, suppose for the sake of contradiction that $(\alpha, \beta, \gamma)$ is a solution of $\alpha^3 + \beta^3 + \varepsilon\gamma^3 = 0$ for
some unit $\varepsilon$ and where $\lambda \nmid \alpha, \beta$. Without loss of generality, assume they are pairwise coprime.
Suppose also $v_\lambda(\gamma)$ is minimal among the solutions. If it is zero, by Exercise 2.4.5∗,
$$\alpha^3 + \beta^3 + \varepsilon\gamma^3 \in \{\pm\varepsilon, \pm 2 \pm \varepsilon\} \pmod{\lambda^4}$$
and we can check that this is never divisible by $\lambda^4$: the norm of the former is in $\{1, 3, 9, 7\}$ while
the norm of $\lambda^4$ is $3^4 = 81$. Thus, we already reach a contradiction: $\alpha^3 + \beta^3 + \varepsilon\gamma^3$ can’t be zero
if $\lambda \nmid \alpha, \beta, \gamma$.
Now, suppose $\gamma = \lambda^n\delta$ for some $\lambda \nmid \delta$ and $n \geq 1$. Write
$$\alpha^3 + \beta^3 = (\alpha + \beta)(\alpha + \beta j)(\alpha + \beta j^2) = -\varepsilon\lambda^{3n}\delta^3.$$
By Exercise 2.4.6∗, the gcd of each pair of factors is $\lambda$. By replacing $\beta$ by $\beta j^k$ for a suitable $k$,
we may assume that $v_\lambda(\alpha + j^\ell\beta) = 1$ for $\ell \in \{1, 2\}$. Then, by unique factorisation, there exist
units $u, v, w$ and Eisenstein integers $x, y, z \in \mathbb{Z}[j]$ not divisible by $\lambda$ such that
$$\begin{cases} \alpha + \beta = w\lambda^{3n-2}z^3 \\ \alpha + \beta j = u\lambda x^3 \\ \alpha + \beta j^2 = v\lambda y^3 \end{cases} \iff \begin{cases} \alpha + \beta = w\lambda^{3n-2}z^3 =: w'\lambda^{3n-2}z^3 \\ \alpha j + \beta j^2 = uj\lambda x^3 =: u'\lambda x^3 \\ \alpha j^2 + \beta j = vj^2\lambda y^3 =: v'\lambda y^3. \end{cases}$$
To conclude, we build a smaller solution out of $x, y, z$: by summing the three lines on the right
we get
$$u'\lambda x^3 + v'\lambda y^3 + w'\lambda^{3n-2}z^3 = 0$$
for the units $u', v', w'$ since $j^2 + j + 1 = 0$.
Now, divide everything by $u'\lambda$ to get
$$x^3 + \mu y^3 + \eta\lambda^{3(n-1)}z^3 = 0$$
for units $\mu, \eta$. If $n = 1$, we get, modulo $\lambda^4$, $\pm 1 \pm \mu \pm \eta \equiv 0$ which is easily seen to be impossible.
Thus $n - 1 \geq 1$. Modulo $\lambda^3$, we get $\pm 1 \pm \mu \equiv 0$ so $\mu$ must be $\pm 1$. Finally, we get
$x^3 + (\pm y)^3 + \eta\lambda^{3m}z^3 = 0$ with $m = n - 1$, so that $(x, \pm y, \lambda^m z)$ is a solution with $1 \leq m < n$, which contradicts the minimality of $n$. In other words,
there are no solutions.
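Two of the computational claims in the proof can be checked mechanically: the norms of $\pm\varepsilon$ and $\pm 2 \pm \varepsilon$, and the congruence $\theta^3 \equiv \pm 1 \pmod{\lambda^4}$ of Exercise 2.4.5∗ on a finite range. The sketch below stores $a + bj$ as the pair `(a, b)` and relies on two facts that are assumptions of this illustration rather than statements from the text: $\lambda \mid a + bj$ exactly when $3 \mid a + b$, and, since $\lambda^4 = 9j^2$, divisibility by $\lambda^4$ amounts to both coordinates being divisible by $9$. All names are illustrative.

```python
# The six units of Z[j]: 1, -1, j, -j, j^2 = -1 - j and -j^2 = 1 + j.
UNITS = [(1, 0), (-1, 0), (0, 1), (0, -1), (-1, -1), (1, 1)]


def e_mul(u, v):
    """Product (a + bj)(c + dj), using j^2 = -1 - j."""
    a, b = u
    c, d = v
    return (a * c - b * d, a * d + b * c - b * d)


def e_norm(u):
    """Norm a^2 - ab + b^2 of a + bj."""
    a, b = u
    return a * a - a * b + b * b


def residues():
    """All values ±1 ± 1 ± ε for a unit ε, i.e. ±ε and ±2 ± ε."""
    out = set()
    for s in (1, -1):
        for t in (1, -1):
            for a, b in UNITS:
                out.add((s + t + a, b))
    return out


def cube_is_pm1_mod_lambda4(theta):
    """Check that theta^3 is congruent to 1 or -1 modulo 9, i.e. modulo lambda^4."""
    c, d = e_mul(e_mul(theta, theta), theta)
    return d % 9 == 0 and ((c - 1) % 9 == 0 or (c + 1) % 9 == 0)
```

The set `{e_norm(r) for r in residues()}` contains only $1, 3, 7, 9$, none of which is divisible by $81$. This is a finite check and does not replace the proof asked for in the exercises.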


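Exercise 2.4.6∗ . Let $\alpha, \beta \in \mathbb{Z}[j]$ be coprime Eisenstein integers non-divisible by $\lambda$. Prove that, if

$$\lambda \mid \alpha^3 + \beta^3 = (\alpha + \beta)(\alpha + \beta j)(\alpha + \beta j^2),$$

each pair of factors has gcd $\lambda$.

Exercise 2.4.7. Check the computational details: $\pm 1 \pm \mu \pm \eta$ is never zero mod $\lambda^4$ for units $\mu, \eta$ and $\pm 1 \pm \mu \equiv 0 \pmod{\lambda^3}$
implies $\mu = \pm 1$.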

Remark 2.4.2
The reason why Eisenstein integers turned out to be so useful to solve Fermat’s last theorem for
$n = 3$ is that $a^3 + b^3$ factorises completely there. See Exercise 3.5.30† for more cases.

Remark 2.4.3
The part where we looked at the equation modulo $\lambda^4$ is completely analogous to the proof that
$a^3 + b^3 + c^3 = 0$ does not have rational integer solutions where $3 \nmid a, b, c$ by looking at the equation
modulo $9$. In fact it is exactly the same as $\lambda^4$ is a unit times $9$.

2.5 Hurwitz Integers


In this section we discuss the Hurwitz integers. These are not algebraic numbers as they are not even
complex numbers, but they fit perfectly in this chapter as the reader will quickly see. They will allow
us to prove the four square theorem, stating that any positive integer is a sum of four squares, in
a similar manner as our proof of the two square theorem. First, we define the quaternion numbers,
which were introduced by Hamilton. Recall that a skew field is like a field but where multiplication is
not necessarily commutative (see Definition A.2.8).

Definition 2.5.1 (Quaternions)

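The skew field of the quaternion numbers $\mathbb{H}$ is defined as the algebra $\mathbb{R}[\mathbf{i}, \mathbf{j}, \mathbf{k}] := \mathbb{R} + \mathbf{i}\mathbb{R} + \mathbf{j}\mathbb{R} + \mathbf{k}\mathbb{R}$
where $\mathbf{i}, \mathbf{j}, \mathbf{k}$ satisfy the following multiplication rules:
$$\begin{cases} \mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 = -1 \\ \mathbf{i}\mathbf{j} = \mathbf{k} = -\mathbf{j}\mathbf{i} \\ \mathbf{j}\mathbf{k} = \mathbf{i} = -\mathbf{k}\mathbf{j} \\ \mathbf{k}\mathbf{i} = \mathbf{j} = -\mathbf{i}\mathbf{k}. \end{cases}$$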

Remark 2.5.1
One usually sees the quaternions defined by the equations $\mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 = \mathbf{i}\mathbf{j}\mathbf{k} = -1$.

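Exercise 2.5.1∗ . Prove that $\mathbf{i}\mathbf{j} = \mathbf{k} = -\mathbf{j}\mathbf{i}$, $\mathbf{j}\mathbf{k} = \mathbf{i} = -\mathbf{k}\mathbf{j}$ and $\mathbf{k}\mathbf{i} = \mathbf{j} = -\mathbf{i}\mathbf{k}$ follow from $\mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 =
\mathbf{i}\mathbf{j}\mathbf{k} = -1$ and associativity of the multiplication.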

Remark 2.5.2
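One may also represent quaternions by the algebra of two by two complex matrices of the form
$$\begin{pmatrix} a + di & b + ci \\ -b + ci & a - di \end{pmatrix} = a\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + b\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} + c\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} + d\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} := a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}.$$

It is then an easy exercise to check that $\mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 = \mathbf{i}\mathbf{j}\mathbf{k} = -1$.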
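The matrix identities can be verified by direct multiplication. The following sketch represents each two by two complex matrix as a tuple of rows and uses Python's built-in complex numbers (`1j` is the complex number $i$); the constant and function names are chosen for this illustration.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(A[r][t] * B[t][c] for t in range(2)) for c in range(2))
        for r in range(2)
    )


def neg(A):
    """Entrywise negation of a 2x2 matrix."""
    return tuple(tuple(-entry for entry in row) for row in A)


ONE = ((1, 0), (0, 1))
I = ((0, 1), (-1, 0))     # matrix standing for the quaternion i
J = ((0, 1j), (1j, 0))    # matrix standing for the quaternion j
K = ((1j, 0), (0, -1j))   # matrix standing for the quaternion k
```

With these definitions, `matmul(I, J)` equals `K` while `matmul(J, I)` equals `neg(K)`, which shows concretely that the multiplication is not commutative.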



Exercise 2.5.2. Prove that i, j, k are distinct.

In particular, we deduce from Exercise 2.5.1∗ that multiplication is not commutative in $\mathbb{H}$! This
is also why $X^2 + 1$ has 3 distinct roots when its degree is only 2: almost all the theory developed in
Appendix A, and in particular Corollary A.1.1, fails when multiplication is not commutative.
Exercise 2.5.3∗ . Let $\alpha, \beta, \gamma \in \mathbb{H}$ be quaternions. Prove that $(\alpha\beta)\gamma = \alpha(\beta\gamma)$. (We say multiplication is
associative. This is why we can write $\alpha\beta\gamma$ without ambiguity.)

Exercise 2.5.4. Prove that there are infinitely many square roots of $-1$ in $\mathbb{H}$.

Definition 2.5.2 (Quaternion Conjugate)

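Let $\alpha = a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k} \in \mathbb{H}$. The conjugate of $\alpha$, denoted $\overline{\alpha}$, is $a - b\mathbf{i} - c\mathbf{j} - d\mathbf{k}$.

Exercise 2.5.5∗ . Prove that, for any $\alpha, \beta \in \mathbb{H}$, $\overline{\alpha + \beta} = \overline{\alpha} + \overline{\beta}$ and $\overline{\alpha\beta} = \overline{\beta}\,\overline{\alpha}$ (the order of the factors is reversed because multiplication is
not commutative anymore).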

Definition 2.5.3 (Quaternion Norm)

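Let $\alpha = a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k} \in \mathbb{H}$. The norm of $\alpha$, $N(\alpha)$, is

$$\alpha\overline{\alpha} = a^2 + b^2 + c^2 + d^2.$$

Exercise 2.5.6∗ . Check that $(a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k})(a - b\mathbf{i} - c\mathbf{j} - d\mathbf{k})$ is indeed $a^2 + b^2 + c^2 + d^2$.

Exercise 2.5.7∗ . Prove that $\mathbb{H}$ is a skew field. This amounts to checking that non-zero elements have multiplicative
inverses (i.e. for any $\alpha \neq 0$ there is a $\beta$ such that $\alpha\beta = \beta\alpha = 1$).

Exercise 2.5.8∗ . Prove that the norm is multiplicative: for any $\alpha, \beta \in \mathbb{H}$, $N(\alpha\beta) = N(\alpha)N(\beta)$.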

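Our object of study will be the ring of Hurwitz integers²
$$H = \mathbb{Z}\left[\mathbf{i}, \mathbf{j}, \mathbf{k}, \frac{1 + \mathbf{i} + \mathbf{j} + \mathbf{k}}{2}\right] := \mathbb{Z} + \mathbf{i}\mathbb{Z} + \mathbf{j}\mathbb{Z} + \mathbf{k}\mathbb{Z} + \frac{1 + \mathbf{i} + \mathbf{j} + \mathbf{k}}{2}\mathbb{Z}$$
as a subring of the skew field
$$\mathbb{Q}(\mathbf{i}, \mathbf{j}, \mathbf{k}) := \mathbb{Q} + \mathbf{i}\mathbb{Q} + \mathbf{j}\mathbb{Q} + \mathbf{k}\mathbb{Q}.$$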

Exercise 2.5.9∗ . Prove that $H = \left\{ \frac{a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}}{2} \mid a \equiv b \equiv c \equiv d \pmod 2 \right\}$. Deduce that the elements of $H$
have integral norms.

Exercise 2.5.10∗ . Determine the units of H.

Although multiplication is not commutative anymore, this does not mean we lose all the theory
built previously. We can still define divisibility, associates, Euclidean domains and Bézout domains.
We just need to incorporate "left" or "right" in the definition to indicate from which side we multiply.
The definitions for irreducible elements and units do not change as the first one did not use the
commutativity of multiplication while for the second one left and right units are the same since left
and right inverses are the same.
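2 One might wonder why we defined $H$ as $\mathbb{Z}\left[\mathbf{i}, \mathbf{j}, \mathbf{k}, \frac{1 + \mathbf{i} + \mathbf{j} + \mathbf{k}}{2}\right]$ instead of simply $\mathbb{Z}[\mathbf{i}, \mathbf{j}, \mathbf{k}]$. This is because the Hurwitz integers form
a maximal order while $\mathbb{Z}[\mathbf{i}, \mathbf{j}, \mathbf{k}]$ doesn’t. More concretely, we will see in Exercise 2.5.14 that $\mathbb{Z}\left[\mathbf{i}, \mathbf{j}, \mathbf{k}, \frac{1 + \mathbf{i} + \mathbf{j} + \mathbf{k}}{2}\right]$ has a
Euclidean division while $\mathbb{Z}[\mathbf{i}, \mathbf{j}, \mathbf{k}]$ does not.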

Definition 2.5.4 (Left and Right Divisibility)

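Let $R$ be a ring and $\alpha, \beta \in R$. We say $\alpha$ left-divides $\beta$ and write $\alpha \lceil \beta$ if there exists a $\gamma \in R$
such that $\beta = \alpha\gamma$. Similarly, if there exists a $\gamma \in R$ such that $\beta = \gamma\alpha$, we say $\alpha$ right-divides $\beta$
and write $\alpha \rceil \beta$.

Remark 2.5.3
The notations $\lceil$ and $\rceil$ for divisibility are non-standard.

Exercise 2.5.11∗ . Let $\alpha, \beta, \gamma \in H$. Prove that $\alpha \lceil \beta$ implies $\alpha \lceil \beta\gamma$ but does not always imply $\alpha \lceil \gamma\beta$.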

Definition 2.5.5 (Left and Right Associates)

Let R be a ring and α, β ∈ R. We say α is a left-associate (resp. right-associate) of β if there
exists a unit ε such that α = βε (resp. α = εβ).

Exercise 2.5.12∗ . Prove that being left-associate is an equivalence relation, i.e., for any α, β, γ, α is a left-
associate of itself, α is a left-associate of β if and only if β is a left-associate of α, and if α is a left-associate
of β and β is a left associate of γ then α is a left-associate of γ.

Definition 2.5.6 (Left and Right Euclidean Domains)

We say a domain R is left-Euclidean (resp. right-Euclidean) if there exists a function f : R → N
such that for any α, β ∈ R with β ≠ 0 there exist ρ, τ ∈ R such that α = βρ + τ (resp.
α = ρβ + τ ) and f (τ ) < f (β). Such a function f will be called a left-Euclidean (resp. right-Euclidean)
function.

Definition 2.5.7 (Left and Right Bézout Domains)

We say a domain R is left-Bézout (resp. right-Bézout) if, for any α, β ∈ R, there exists a γ ∈ R
such that αR + βR = γR (resp. Rα + Rβ = Rγ). Such a γ will be called a left-gcd (resp.
right-gcd ) of α and β.

Left and right definitions are completely symmetric so we will focus primarily on left ones.
Exercise 2.5.13∗ . Prove that a left-gcd $\gamma$ of $\alpha$ and $\beta$ satisfies the following property: $\gamma \lceil \alpha, \beta$ and if $\delta \lceil \alpha, \beta$
then $\delta \lceil \gamma$.

Exercise 2.5.14. Prove that $1 + \mathbf{i}$ and $1 - \mathbf{j}$ do not have a left-gcd in $\mathbb{Z}[\mathbf{i}, \mathbf{j}, \mathbf{k}]$. In particular, it is not left-Bézout
and thus not left-Euclidean either (and the same holds for being right-Bézout and right-Euclidean by symmetry).

As before, we have the following proposition.

Proposition 2.5.1*

A left-Euclidean (resp. right-Euclidean) domain R is a left-Bézout (resp. right-Bézout) domain.

Exercise 2.5.15∗ . Prove Proposition 2.5.1.

However, being left or right Euclidean does not guarantee unique factorisation anymore. That said,
some of our results will still hold for rational primes which stay prime in H because rational numbers
commute with every quaternion (we will paradoxically use this to show that they do not exist).

We first prove that H is norm-Euclidean.

Proposition 2.5.2*

H is both left and right norm-Euclidean.

Proof

Let α, β ∈ H, with β ≠ 0. Consider the quotient $\gamma = \beta^{-1}\alpha = a + bi + cj + dk$. Choose x, y, z, t ∈ Z
such that

$$|a - x|,\ |b - y|,\ |c - z|,\ |d - t| \le \frac{1}{2}$$

and let δ = x + yi + zj + tk. Then,

$$N(\gamma - \delta) \le \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^2 = 1$$

with equality if and only if |a − x| = |b − y| = |c − z| = |d − t| = 1/2. If the inequality is strict,
then N(α − βδ) = N(β)N(γ − δ) < N(β); otherwise all four coordinates of γ are half-integers, so γ ∈ H
and N(α − βγ) = 0 < N(β) as wanted.

The right-Euclidean proof is exactly the same with the order of factors reversed by symmetry.
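
The rounding argument of this proof can be tried out numerically. The following is a minimal Python sketch, not part of the original text: quaternions are stored as 4-tuples (a, b, c, d) standing for a + bi + cj + dk, and the helper names `qmul`, `qnorm` and `left_divide` are illustrative choices.

```python
from fractions import Fraction
from math import floor

def qmul(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qnorm(q):
    """Norm of a quaternion: the sum of the squares of its coordinates."""
    return sum(x * x for x in q)

def left_divide(alpha, beta):
    """Return (delta, tau) with alpha = beta*delta + tau and N(tau) < N(beta)."""
    alpha = tuple(Fraction(x) for x in alpha)
    beta = tuple(Fraction(x) for x in beta)
    a, b, c, d = beta
    # beta^{-1} is the conjugate of beta divided by its norm.
    inverse = tuple(x / qnorm(beta) for x in (a, -b, -c, -d))
    gamma = qmul(inverse, alpha)
    # Round every coordinate of gamma to a nearest integer.
    delta = tuple(Fraction(floor(x + Fraction(1, 2))) for x in gamma)
    # Equality case of the proof: gamma has four half-integer coordinates,
    # so it is itself a Hurwitz integer and the remainder is zero.
    if all(abs(g - x) == Fraction(1, 2) for g, x in zip(gamma, delta)):
        delta = gamma
    tau = tuple(x - y for x, y in zip(alpha, qmul(beta, delta)))
    return delta, tau
```

For example, `left_divide((7, 3, -2, 5), (2, 1, 0, 1))` returns a quotient and a remainder whose norm is smaller than N(2 + i + k) = 6.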


As a corollary, we get that H is left and right Bézout. Now, we show that any Hurwitz integer has
a factorisation in irreducible Hurwitz integers.

Proposition 2.5.3

Any non-zero Hurwitz integer has a factorisation in irreducible Hurwitz integers. (When it is a
unit it is the empty factorisation.)

Exercise 2.5.16∗ . Prove Proposition 2.5.3.

Exercise 2.5.17. Prove that there is an irreducible Hurwitz integer x ∈ H for which there exist α and β such
that x d αβ but x left-divides neither α nor β.

We can now prove our main result: the Lagrange four square theorem.

Theorem 2.5.1 (Lagrange’s Four-Square Theorem)

Any non-negative rational integer is a sum of four squares of rational integers.

Proof

This is equivalent to showing that any non-negative integer arises as the norm of an element of Z[i, j, k]. Consider
the prime case first. We wish to find a non-trivial factorisation p = αβ in Hurwitz integers.
Suppose for the sake of contradiction that p is irreducible, which implies that it is odd as 2 = (1 + i)(1 − i).

Then, p is in fact also prime because p commutes with any quaternion since it’s real, which means
left and right divisibility by p are the same. Indeed, suppose that p | αβ and p ∤ β. Since H is
left-Bézout, there are some γ, δ ∈ H such that

pγ + βδ = 1.

Thus, β is right-invertible modulo p which means that

p | αβδ = α − pαγ

so p | α. The p ∤ α case is handled similarly. (This could be phrased more efficiently using
modular arithmetic, but we did it that way to emphasise how the commutativity of p and H
made this possible.)

However, by Exercise 2.5.18∗ , there exist rational integers a and b such that

$$p \mid 1 + a^2 + b^2 = (1 + ai + bj)(1 - ai - bj)$$

but p ∤ 1 + ai + bj, 1 − ai − bj as it is odd. Thus it can’t be prime, and therefore it can’t be irreducible either.

This means that there exist non-unit Hurwitz integers α and β such that p = αβ. By taking the
norm we get $p^2 = N(\alpha)N(\beta)$. Since neither α nor β is a unit, N(α) and N(β) are both different
from 1, so they must both be equal to p.

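We are almost done: we have represented p as the norm of a Hurwitz integer α and we just want
to have α ∈ Z[i, j, k]. Suppose that it is not the case. Consider the unit $\varepsilon = \frac{\pm 1 \pm i \pm j \pm k}{2}$ where the
± signs are chosen so that ρ := α − ε ∈ 2Z[i, j, k]. Then,

$$p = \alpha\overline{\alpha} = (\varepsilon + \rho)\,\overline{\varepsilon}\varepsilon\,(\overline{\varepsilon} + \overline{\rho}) = (1 + \rho\overline{\varepsilon})(1 + \varepsilon\overline{\rho}) =: \alpha'\overline{\alpha'}$$

where $\alpha'$ now has rational integer coordinates since the coordinates of ρ are even so $\rho\overline{\varepsilon} \in \mathbb{Z}[i, j, k]$.

The general case follows from the multiplicativity of the norm: if $p_i = N(\alpha_i)$ and $n = \prod_{i=1}^{k} p_i^{m_i}$,
then

$$n = N\left(\prod_{i=1}^{k} \alpha_i^{m_i}\right).$$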
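
The statement of the theorem can be checked on small cases by exhaustive search. The following Python sketch is only a sanity check under that brute-force assumption and does not follow the quaternion argument above; the function name `four_squares` is an illustrative choice.

```python
from math import isqrt

def four_squares(n):
    """Return some (a, b, c, d) with a^2 + b^2 + c^2 + d^2 == n, or None if the search fails."""
    for a in range(isqrt(n) + 1):
        for b in range(a, isqrt(n - a * a) + 1):
            for c in range(b, isqrt(n - a * a - b * b) + 1):
                rest = n - a * a - b * b - c * c
                d = isqrt(rest)
                if d * d == rest:
                    return (a, b, c, d)
    return None
```

Calling `four_squares(n)` for every n below a few hundred never returns `None`, in agreement with the theorem.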

Exercise 2.5.18∗ . Let p be a rational prime. Prove that there exist rational integers a and b such that
$p \mid 1 + a^2 + b^2$.

2.6 Exercises
Diophantine Equations
Exercise 2.6.1. Solve the equation $x^2 + 4 = y^3$ over Z.

Exercise 2.6.2† . Prove that $\mathcal{O}_{\mathbb{Q}(\sqrt{2})}$ and $\mathcal{O}_{\mathbb{Q}(\sqrt{-2})}$ are Euclidean.

Exercise 2.6.3. Solve the equations $x^2 + 2 = y^3$ and $x^2 + 8 = y^3$ over Z.

Exercise 2.6.4† . Prove that $\mathcal{O}_{\mathbb{Q}(\sqrt{-7})}$ is Euclidean.

Exercise 2.6.5. Solve the equation $x^2 + x + 2 = y^3$ over Z.

Exercise 2.6.6† . Solve the equation $x^2 + 11 = y^3$ over Z.

Exercise 2.6.7. Let a, b, c ∈ Z be rational integers. Prove that $a^2 + b^2 = c^3$ if and only if there exist
rational integers m and n such that $a = m^3 - 3mn^2$, $b = -n^3 + 3m^2n$ and $c = m^2 + n^2$. More generally,
if k ≥ 1 is an integer, find all the solutions a, b, c ∈ Z to the equation $a^2 + b^2 = c^k$.

Exercise 2.6.8† . Let n be a non-negative rational integer. In how many ways can n be written as a
sum of two squares of rational integers? (Two ways are considered different if the ordering is different,
for instance $2 = 1^2 + (-1)^2$ and $2 = (-1)^2 + 1^2$ are different.)

Exercise 2.6.9. Which rational integers can be written in the form $a^2 + 2b^2$ for some rational integers
a and b? What about $a^2 - 2b^2$? In how many ways? (You may assume that, for an odd rational prime
p, there exists a rational integer x such that $x^2 \equiv 2 \pmod{p}$ if and only if p ≡ ±1 (mod 8), and there
exists a rational integer x such that $x^2 \equiv -2 \pmod{p}$ if and only if p ≡ 1 (mod 8) or p ≡ 3 (mod 8).
This will be proven in Chapter 4, as a corollary of the quadratic reciprocity law 4.5.2.)

Exercise 2.6.10 (Saint-Petersbourg Mathematical Olympiad 2013). Find all rational primes p and q
such that 2p − 1, 2q − 1, and 2pq − 1 are all perfect squares.

Exercise 2.6.11† (Euler). Let n ≥ 3 be an integer. Prove that there exist unique positive odd
rational integers x and y such that $2^n = x^2 + 7y^2$.

Exercise 2.6.12† (Fermat’s Last Theorem for n = 4). Show that the equations $\alpha^4 + \beta^4 = \gamma^2$ and
$\alpha^4 - \beta^4 = \gamma^2$ have no non-zero solution α, β, γ ∈ Z[i].

Exercise 2.6.13 (Chinese Mathematical Olympiad 2006). Positive integers k, m, n satisfy
$mn = k^2 + k + 3$. Prove that at least one of the equations

$$x^2 + 11y^2 = 4m$$

and

$$x^2 + 11y^2 = 4n$$

has a solution in odd rational integers.

Exercise 2.6.14† . Prove that $\mathcal{O}_{\mathbb{Q}(\sqrt{5})}$ is Euclidean.

Hurwitz Integers and Jacobi’s Four Square Theorem3


Exercise 2.6.15† . Let α ∈ H be a primitive Hurwitz integer, meaning that there does not exist an
m ∈ Z with |m| > 1 such that α/m ∈ H, and let $N(\alpha) = p_1 \cdots p_n$ be its prime factorisation. Then, the
factorisation $\alpha = \pi_1 \cdots \pi_n$ of α into irreducible elements $\pi_i$ of norm $p_i$ is unique up to unit-migration,
meaning that if $\tau_1 \cdots \tau_k$ is another such factorisation, then k = n and

$$\begin{cases}
\tau_1 = \pi_1 u_1 \\
\tau_2 = u_1^{-1}\pi_2 u_2 \\
\quad\vdots \\
\tau_{n-1} = u_{n-2}^{-1}\pi_{n-1}u_{n-1} \\
\tau_n = u_{n-1}^{-1}\pi_n
\end{cases}$$

for some units $u_1, \ldots, u_{n-1}$. Deduce that α is irreducible if and only if its norm is a rational prime.

Exercise 2.6.16† . Prove that (1 + i)H = H(1 + i).⁴ Set $\omega = \frac{1+i+j+k}{2}$. We say a Hurwitz integer
α ∈ H is primary if it is congruent to 1 or 1 + 2ω modulo 2 + 2i.⁵ Prove that, for any Hurwitz integer
α of odd norm, exactly one of its right-associates is primary.

Exercise 2.6.17† . Let m ∈ Z be an odd integer. Prove that the Hurwitz integers modulo m, H/mH,
are isomorphic to the algebra of two by two matrices modulo m, $(\mathbb{Z}/m\mathbb{Z})^{2\times 2}$. In addition, prove that
the determinant of the image is the norm of the quaternion.

Exercise 2.6.18† . Let m be an odd integer. We say a Hurwitz integer α = a + bi + cj + dk is primitive
modulo m if gcd(2a, 2b, 2c, 2d, m) = 1. Compute the number ψ(m) of primitive Hurwitz integers modulo
m with norm zero (modulo m).
3 The following series of exercises comes from the work of Hurwitz, but our presentation follows the PhD thesis of

Nikolaos Tsopanidis, see [47].


4 This means that we can manipulate congruences modulo 1 + i normally. Note that the choice of i is not arbitrary

at all, since 1 − i = −i(1 + i) and 1 − j = (1 − ω)(1 + i) are associates. By α ≡ β (mod γ), we mean that γ divides α − β
from the left and from the right.
5 Note that a primary Hurwitz integer is always in Z[i, j, k].

Exercise 2.6.19† . Let p be an odd prime. Prove that any non-zero α ∈ H/pH of zero norm modulo p
has a representative of the form ρπ, where π is a primary element of norm p and ρ ∈ H, and that this
π is unique. Conversely, let π ∈ H have norm p. Prove that the equation ρπ ≡ 0 (mod p) has exactly
$p^2$ solutions ρ ∈ H/pH. Deduce that there are exactly p + 1 primary irreducible Hurwitz integers with
norm p.

Exercise 2.6.20† (Jacobi’s Four Square Theorem). Let n be a positive rational integer. In how many
ways can n be written as a sum of four squares of rational integers? (Two ways are considered different
if the ordering is different, for instance $2 = 1^2 + 0^2 + 0^2 + (-1)^2$ and $2 = (-1)^2 + 0^2 + 0^2 + 1^2$ are
different.)

Domains
Exercise 2.6.21. Prove that there are finitely many rational integers d ≡ 2 (mod 4) or d ≡ 3 (mod 4)
such that $\mathbb{Q}(\sqrt{d})$ is norm-Euclidean.

Exercise 2.6.22. Let R be an integral domain such that for any set S ⊆ R there exists a β ∈ R such
that

$$\sum_{\alpha \in S} \alpha R = \beta R.$$

(In other words, any ideal is principal.) Such a ring is called a principal ideal domain (PID). Prove that
it is a UFD. (The sum $\sum_{\alpha \in S} \alpha R$ is defined as the union of $\sum_{\alpha \in S'} \alpha R$ over all finite subsets $S' \subseteq S$.)
Exercise 2.6.23. Let R be a Euclidean domain. Prove that it is a PID (and thus a UFD as well).

Exercise 2.6.24. Let R = Z+XQ[X] be the ring of polynomials with rational coefficients and integral
constant coefficient. Prove that R is a Bézout domain but not a UFD, and hence not a PID either.

Miscellaneous
Exercise 2.6.25† . Let $(F_n)_{n\in\mathbb{Z}}$ be the Fibonacci sequence defined by $F_0 = 0$, $F_1 = 1$, and
$F_{n+2} = F_{n+1} + F_n$ for any integer n. Prove that, for any integers m and n, $\gcd(F_m, F_n) = F_{\gcd(m,n)}$.

Exercise 2.6.26. Let $(L_n)_{n\in\mathbb{Z}}$ be the Lucas sequence defined by $L_0 = 2$, $L_1 = 1$, and
$L_{n+2} = L_{n+1} + L_n$ for any integer n. Given two integers m and n, find a formula for $\gcd(L_m, L_n)$ analogous
to Exercise 2.6.25† .

Exercise 2.6.27† . Let n be a rational integer. Prove that $(1 + \sqrt{2})^n$ is a unit of $\mathbb{Z}[\sqrt{2}]$. Moreover,
prove that any unit of $\mathbb{Z}[\sqrt{2}]$ has that form, up to sign.
Exercise 2.6.28† (IMO 2001). Let a > b > c > d be positive rational integers. Suppose that

ac + bd = (b + d + a − c)(b + d − a + c).

Prove that ab + cd is not prime.


Exercise 2.6.29† . Let x ∈ R be a non-zero real number and m, n ≥ 1 coprime integers. Suppose that
$x^m + \frac{1}{x^m}$ and $x^n + \frac{1}{x^n}$ are both rational integers. Prove that $x + \frac{1}{x}$ is also one.

Exercise 2.6.30. Find all automorphisms of the quaternions H, i.e. additive and multiplicative
bijections ϕ : H → H.
Chapter 3

Cyclotomic Polynomials

Prerequisites for this chapter: Chapter 1.

Quadratic numbers and roots of unity are very important in algebraic number theory; they were
among the first objects studied in detail. We studied the former a little in Chapter 2; here we will
look at the minimal polynomials of the latter and their properties.

3.1 Definition
We say an nth root of unity ω is a primitive nth root if its order is n, i.e. $\omega^k \ne 1$ for k = 1, 2, . . . , n − 1.
Note that, if $\omega = \exp\left(\frac{2ki\pi}{n}\right)$, ω is a primitive nth root if and only if gcd(k, n) = 1.

We may now define cyclotomic polynomials. These are the polynomials whose roots are the primitive nth
roots of unity for some n.

Definition 3.1.1 (Cyclotomic Polynomials)

Let n ≥ 1 be an integer. The nth cyclotomic polynomial, $\Phi_n$, is the polynomial of degree ϕ(n)

$$\prod_{\omega \text{ primitive } n\text{th root}} (X - \omega) = \prod_{\gcd(k,n)=1,\ k\in[n]} \left(X - \exp\left(\frac{2ki\pi}{n}\right)\right).$$

For instance, $\Phi_1 = X - 1$, $\Phi_2 = X - (-1)$ and $\Phi_4 = (X - i)(X + i) = X^2 + 1$. Below are the first
few cyclotomic polynomials.

• $\Phi_1 = X - 1$.

• $\Phi_2 = X + 1$.

• $\Phi_3 = X^2 + X + 1$.

• $\Phi_4 = X^2 + 1$.

• $\Phi_5 = X^4 + X^3 + X^2 + X + 1$.

• $\Phi_6 = X^2 - X + 1$.

There is one striking thing about these polynomials: they all have integer coefficients!1 In fact, this
is true for any n, despite the fact that our definition involved complex numbers. This is a consequence
of the following fundamental proposition.
1 Which makes sense, since we said they were the minimal polynomials of roots of unity.


Proposition 3.1.1*

Let n ≥ 1 be an integer. Then,

$$X^n - 1 = \prod_{d \mid n} \Phi_d.$$

Remark 3.1.1
In general, unless otherwise specified, when we index something (e.g. a sum or a product) by
d | n we mean that the indexing is done over the non-negative divisors of n.

Proof

This is a simple root counting exercise. We have to show that any nth root of unity is a primitive
dth root for exactly one d | n, which is clearly true as this d is the order of the root. Conversely
it is clear that any primitive dth root for some d | n is an nth root of unity.
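
Proposition 3.1.1 already gives a way to compute cyclotomic polynomials recursively: divide $X^n - 1$ by all $\Phi_d$ with d a proper divisor of n. Below is a minimal Python sketch of this idea. It assumes polynomials are stored as lists of integer coefficients, lowest degree first; the names `divide_exact` and `cyclotomic` are illustrative and not taken from the text.

```python
from functools import lru_cache

def divide_exact(num, den):
    """Quotient num / den of integer polynomials (lowest degree first), den monic."""
    num = list(num)
    quotient = [0] * (len(num) - len(den) + 1)
    for i in reversed(range(len(quotient))):
        # den is monic, so the leading coefficient of the current remainder
        # is exactly the next coefficient of the quotient.
        quotient[i] = num[i + len(den) - 1]
        for j, c in enumerate(den):
            num[i + j] -= quotient[i] * c
    assert not any(num), "the division is expected to be exact"
    return quotient

@lru_cache(maxsize=None)
def cyclotomic(n):
    """Coefficients of Phi_n, obtained from X^n - 1 = prod_{d | n} Phi_d."""
    poly = [-1] + [0] * (n - 1) + [1]          # X^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = divide_exact(poly, cyclotomic(d))
    return tuple(poly)
```

For instance, `cyclotomic(6)` returns `(1, -1, 1)`, i.e. $X^2 - X + 1$. The fact that every division is exact over the integers is precisely the content of Corollary 3.1.1.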


Exercise 3.1.1∗ . Let ω be an nth root of unity. Prove that its order divides n.

Exercise 3.1.2∗ . Let p be a rational prime. Prove that $\Phi_p = X^{p-1} + \ldots + 1$.

Exercise 3.1.3∗ . Let n ≥ 1 be an integer. Prove that Φn (0) = −1 if n = 1 and 1 otherwise.

Exercise 3.1.4. Let n > 1 be an integer. Prove that Φn (1) = p if n is a power of a prime p, and Φn (1) = 1
otherwise.

Corollary 3.1.1*

Cyclotomic polynomials have integer coefficients.

Exercise 3.1.5∗ . Prove Corollary 3.1.1 by induction.

By looking at the degrees of both sides of Proposition 3.1.1, we also get the following corollary.

Corollary 3.1.2

For any integer n ≥ 1, we have

$$\sum_{d \mid n} \varphi(d) = n.$$

Let us examine more closely why Proposition 3.1.1 is amazing. It gives us a very good factorisation
of $X^n - 1$, so much better than $(X - 1)(X^{n-1} + \ldots + 1)$. We can also get a factorisation for $a^n - b^n$
by rewriting it as $b^n((a/b)^n - 1)$. Indeed, define the two-variable homogeneous polynomial
$\Phi_n(a, b) := b^{\varphi(n)}\Phi_n(a/b)$. Then,

$$a^n - b^n = \prod_{d \mid n} \Phi_d(a, b).$$

Exercise 3.1.6∗ . Prove that $\Phi_n(1/X) = \Phi_n(X)/X^{\varphi(n)}$ for n > 1.

Exercise 3.1.7∗ . Prove that, for n > 1, $\Phi_n(X, Y)$ is a two-variable polynomial with integer coefficients which
is symmetric and homogeneous, i.e. all its monomials have the same degree.

Exercise 3.1.8∗ . Prove that

$$\Phi_n(X, Y) = \prod_{\omega \text{ primitive } n\text{th root}} (X - \omega Y).$$

We can already use this on a problem.

Problem 3.1.1
Let n ≥ 0 be an integer. Prove that the number $2^{2^{n+1}} + 2^{2^n} + 1$ has at least n + 1 prime factors
counted with multiplicity.

Solution
Let $x = 2^{2^n}$. The number $2^{2^{n+1}} + 2^{2^n} + 1$ then becomes

$$x^2 + x + 1 = \frac{x^3 - 1}{x - 1} = \frac{2^{3\cdot 2^n} - 1}{2^{2^n} - 1}.$$

We factorise the numerator and the denominator using Proposition 3.1.1:

$$x^2 + x + 1 = \frac{\prod_{d \mid 3\cdot 2^n} \Phi_d(2)}{\prod_{d \mid 2^n} \Phi_d(2)} = \prod_{d \mid 3\cdot 2^n,\ d \nmid 2^n} \Phi_d(2).$$

Notice that the divisors of $3 \cdot 2^n$ that do not divide $2^n$ are precisely the divisors of the form 3d
where $d \mid 2^n$, i.e. of the form $3 \cdot 2^k$ for some 0 ≤ k ≤ n. Thus,

$$2^{2^{n+1}} + 2^{2^n} + 1 = \prod_{k=0}^{n} \Phi_{3\cdot 2^k}(2).$$

We have found our n + 1 divisors! It remains, however, to check that they are non-trivial, i.e.
greater than 1. For this, we return to the definition of cyclotomic polynomials:

$$|\Phi_n(2)| = \prod_{\omega \text{ primitive } n\text{th root}} |2 - \omega| \ge \prod_{\omega \text{ primitive } n\text{th root}} 1 = 1$$

since |2 − ω| ≥ 2 − |ω| = 1 for any |ω| = 1 by the triangle inequality. In addition, this inequality
is strict if n ≠ 1, 2.
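
The factorisation obtained in this solution can be confirmed numerically for small n. The sketch below is an illustration only: it evaluates $\Phi_m(a)$ at an integer a ≥ 2 through the relation $a^m - 1 = \prod_{d \mid m} \Phi_d(a)$ of Proposition 3.1.1, and the helper names `phi_value` and `factors` are hypothetical.

```python
from functools import lru_cache
from math import prod

@lru_cache(maxsize=None)
def phi_value(m, a):
    """Phi_m(a) for an integer a >= 2, using a^m - 1 = prod_{d | m} Phi_d(a)."""
    value = a**m - 1
    for d in range(1, m):
        if m % d == 0:
            # Exact division: value is still a product of remaining Phi_d(a).
            value //= phi_value(d, a)
    return value

def factors(n):
    """The n + 1 factors Phi_{3 * 2^k}(2), k = 0..n, found in the solution."""
    return [phi_value(3 * 2**k, 2) for k in range(n + 1)]
```

For example, `factors(1)` gives `[7, 3]`, whose product is 21 = 16 + 4 + 1. The restriction a ≥ 2 in `phi_value` is there to avoid dividing by $\Phi_1(1) = 0$.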


Finally, we give a formula to compute cyclotomic polynomials, a lot more efficient than just using
Proposition 3.1.1.

Proposition 3.1.2*

Let p be a prime number and n ≥ 1 an integer. If p | n then $\Phi_{pn}(X) = \Phi_n(X^p)$, otherwise

$$\Phi_{pn}(X) = \frac{\Phi_n(X^p)}{\Phi_n(X)}.$$

Proof

This is again a simple root counting exercise. Note first that both sides have the same degree: if
p | n the RHS has degree pϕ(n) = ϕ(pn) and if p ∤ n the RHS has degree
pϕ(n) − ϕ(n) = (p − 1)ϕ(n) = ϕ(pn).

Note also that the quotient makes sense. Indeed, for any primitive nth root ω, $\omega^p$ is also a
primitive nth root of unity iff gcd(n, p) = 1, i.e. p ∤ n. Thus, it suffices to show that each root of
the LHS is a root of the RHS.

This is very easy: let ω be a primitive pnth root of unity. Then, $\omega^p$ is a primitive nth root as
wanted (and ω isn’t so the denominator is non-zero).
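
Proposition 3.1.2 translates directly into a procedure that builds $\Phi_n$ one prime factor of n at a time, starting from $\Phi_1 = X - 1$. The following Python sketch is one possible way to organise this, under the assumption that polynomials are lists of integer coefficients with the lowest degree first; the names `substitute_power`, `divide_exact` and `cyclotomic` are illustrative.

```python
def substitute_power(poly, p):
    """Turn poly(X) into poly(X^p)."""
    out = [0] * ((len(poly) - 1) * p + 1)
    for i, c in enumerate(poly):
        out[i * p] = c
    return out

def divide_exact(num, den):
    """Quotient num / den of integer polynomials (lowest degree first), den monic."""
    num = list(num)
    quotient = [0] * (len(num) - len(den) + 1)
    for i in reversed(range(len(quotient))):
        quotient[i] = num[i + len(den) - 1]
        for j, c in enumerate(den):
            num[i + j] -= quotient[i] * c
    assert not any(num), "the division is expected to be exact"
    return quotient

def cyclotomic(n):
    """Coefficients of Phi_n, built prime by prime with Proposition 3.1.2."""
    poly = [-1, 1]                      # Phi_1 = X - 1
    p = 2
    while n > 1:
        if n % p == 0:
            # p does not divide the current index m: Phi_{pm} = Phi_m(X^p) / Phi_m(X).
            poly = divide_exact(substitute_power(poly, p), poly)
            n //= p
            while n % p == 0:
                # p now divides the index m: Phi_{pm} = Phi_m(X^p).
                poly = substitute_power(poly, p)
                n //= p
        p += 1
    return poly
```

For instance, `cyclotomic(12)` returns `[1, 0, -1, 0, 1]`, i.e. $X^4 - X^2 + 1$. Only one polynomial division per distinct prime factor of n is needed, which is why this approach is cheaper than dividing $X^n - 1$ by every $\Phi_d$ with d | n.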


As a corollary, we get

Corollary 3.1.3

Let n > 1 be an odd integer. Then, Φ2n (X) = Φn (−X).

Exercise 3.1.9∗ . Prove that, for odd n > 1, $\Phi_n(X)\Phi_n(-X) = \Phi_n(X^2)$ and deduce Corollary 3.1.3.

Exercise 3.1.10. Prove that, for any polynomial f, f(X)f(−X) is a polynomial in $X^2$.

Exercise 3.1.11∗ . Let p be a prime number and n ≥ 1 an integer. Prove that if p | n then
$\Phi_{pn}(X, Y) = \Phi_n(X^p, Y^p)$, and that $\Phi_{pn}(X, Y) = \frac{\Phi_n(X^p, Y^p)}{\Phi_n(X, Y)}$ otherwise.

Exercise 3.1.12∗ . Let k ≥ 1 be an integer. Prove that $\Phi_{2^k} = X^{2^{k-1}} + 1$.

3.2 Irreducibility
In fact, the factorisation we got for $X^n - 1$ is not only very good, it is the best possible: cyclotomic
polynomials are irreducible! In algebraic-number-theoretic terminology, the conjugates of a primitive
nth root of unity are all primitive nth roots of unity. It is a notoriously hard problem to prove certain
polynomials are irreducible, so such a result is remarkable.

Theorem 3.2.1

For any integer n ≥ 1, Φn is irreducible in Q[X].

We present a proof using algebraic number theory, and leave another one as an exercise.

Proof

Let ω be a primitive nth root of unity with minimal polynomial π. We will show that, for any
rational prime p ∤ n, $\omega^p$ is also a root of π. Thus, $\omega^k$ will also be a root of π for any gcd(n, k) = 1.
Since all primitive nth roots have this form by Exercise 3.2.1∗ , we have $\pi = \Phi_n$ as wanted. The
key point for this is the congruence $\pi(\omega^p) \equiv \pi(\omega)^p \equiv 0 \pmod{p}$, given by Exercise 3.2.3∗ .

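Let p ∤ n be a rational prime. Suppose for the sake of contradiction that $\pi(\omega^p) \ne 0$. Then, π
divides (in $\overline{\mathbb{Z}}[X]$, as π is monic)

$$\frac{X^n - 1}{X - \omega^p} = \prod_{k \ne p} \left(X - \omega^k\right).$$

Thus, $\pi(\omega^p)$ divides $\prod_{k \ne p} \left(\omega^p - \omega^k\right)$, which is just

$$n(\omega^p)^{n-1}$$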

by Exercise 3.2.2∗ . Since ω is invertible in $\overline{\mathbb{Z}}$ (we say it’s a unit) as it is a root of unity, we have
$\pi(\omega^p) \mid n$. Finally, since p divides $\pi(\omega^p)$, we also have p | n: this is a contradiction since we
assumed p ∤ n.


Exercise 3.2.1∗ . Let n ≥ 1 be an integer and ω be a primitive nth root of unity. Prove that any primitive
nth root can be written in the form $\omega^k$ for some gcd(k, n) = 1.

Exercise 3.2.2∗ . Let $f = \prod_{i=1}^{n} (X - \alpha_i)$ be a polynomial. Prove that, for any k = 1, . . . , n,
$f'(\alpha_k) = \prod_{i \ne k} (\alpha_k - \alpha_i)$.

Exercise 3.2.3∗ (Frobenius Morphism). Prove the following special case of Proposition 4.1.1: for any rational
prime p and any polynomial f ∈ Z[X], $f(X^p) \equiv f(X)^p \pmod{p}$.

Exercise 3.2.4 (Alternative Proof of Theorem 3.2.1). Let ω be a primitive nth root of unity with minimal
polynomial π and let p ∤ n be a rational prime. Suppose τ is the minimal polynomial of $\pi(\omega^p)$. Prove that
p | τ(0) and that τ(0) is bounded when p varies. Deduce that $\omega^p$ is a root of π for sufficiently large p, and thus
that $\omega^k$ is a root of π for any gcd(n, k) = 1.

An interesting corollary of this theorem is that we can know the conjugates of $\cos\left(\frac{2k\pi}{n}\right)$ for any
gcd(k, n) = 1: they are precisely the numbers $\cos\left(\frac{2k'\pi}{n}\right)$ for gcd(k′, n) = 1.

However, unlike the primitive nth roots of unity which have degree ϕ(n), they have degree 1 for
n = 1, 2 and degree $\frac{\varphi(n)}{2}$ for n ≥ 3 as $\cos\left(\frac{2k\pi}{n}\right) = \cos\left(\frac{-2k\pi}{n}\right)$.

In particular, this gives an alternative proof of Problem 1.1.1: $\cos\left(\frac{2k\pi}{n}\right)$ is rational iff $\frac{\varphi(n)}{2} = 1$ or
n = 1, 2, i.e. n = 1, 2, 3, 4, 6 and we can easily check that $\cos\left(\frac{2k\pi}{n}\right) = 0, \pm 1, \pm\frac{1}{2}$ for these n.

Exercise 3.2.5. Let k and n ≥ 1 be coprime integers. Prove that the conjugates of $\cos\left(\frac{2k\pi}{n}\right)$ are the numbers
$\cos\left(\frac{2k'\pi}{n}\right)$ for gcd(k′, n) = 1 and that they have degree ϕ(n)/2. What about $\sin\left(\frac{2k\pi}{n}\right)$: what are its conjugates
and what is its degree?

Exercise 3.2.6. Find all quadratic cosines.

3.3 Orders
We will now see very important arithmetic properties of cyclotomic polynomials. This is the funda-
mental result.

Theorem 3.3.1

Let p be a rational prime and a a rational integer. Then, p divides $\Phi_n(a)$ if and only if the order
of a modulo p is $\frac{n}{p^{v_p(n)}}$.

For instance, it is easy to see that p | a − 1 means a has order 1 mod p, p | a + 1 means a has order
2 mod p unless p = 2, and $p \mid a^2 + 1$ means a has order 4 mod p unless p = 2.

In fact, this theorem is perhaps not so surprising if one recalls the local-global principle from
Proposition 1.3.1. Over the complex numbers C, Φn (a) is zero if and only if a has order n (by
definition). Thus, one can expect that the same holds over the integers mod p, which is exactly what
this theorem says.

Proof

We do the case where p ∤ n first. The general case will follow from Exercise 3.3.1∗ by induction
on $v_p(n)$.

Note that the statement makes sense as $p \mid \Phi_n(a)$ implies $p \mid a^n - 1$ so p ∤ a (which means that
the order of a is well-defined). Thus, suppose p ∤ a. Let k be the order of a modulo p. We first
prove the existence of an n with p ∤ n such that $p \mid \Phi_n(a)$. Since

$$0 \equiv a^k - 1 = \prod_{d \mid k} \Phi_d(a),$$

there must exist an n | k, in particular with p ∤ n since k | p − 1, such that $\Phi_n(a) \equiv 0$.

We show that this n is unique. Suppose that m ≠ n with p ∤ m satisfies $\Phi_m(a) \equiv 0$ too. Then,

$$X^{mn} - 1 = \prod_{d \mid mn} \Phi_d$$

has a double root at a modulo p. Thus, by Proposition A.1.3, the derivative $mnX^{mn-1}$ is zero at a: this
is impossible as p ∤ a, m, n.

Finally, notice that such an n must be the order of a modulo p. By construction, n divides the
order of a. If it was distinct from it, then

$$\frac{a^k - 1}{a^n - 1} = \prod_{d \mid k,\ d \nmid n} \Phi_d(a)$$

would be zero modulo p, thus there would be some m ≠ n with p ∤ m such that $\Phi_m(a) \equiv 0$, which is impossible.
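
Theorem 3.3.1 is easy to test on small values. The sketch below is a numerical check only, not a proof: `phi_value` evaluates $\Phi_n(a)$ for an integer a ≥ 2 via $a^n - 1 = \prod_{d \mid n} \Phi_d(a)$, `order` computes the multiplicative order modulo p by brute force, and `theorem_holds` compares the two sides of the equivalence. All three names are illustrative.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def phi_value(n, a):
    """Phi_n(a) for an integer a >= 2, using a^n - 1 = prod_{d | n} Phi_d(a)."""
    value = a**n - 1
    for d in range(1, n):
        if n % d == 0:
            value //= phi_value(d, a)
    return value

def order(a, p):
    """Multiplicative order of a modulo the prime p (a must not be divisible by p)."""
    k, x = 1, a % p
    while x != 1:
        x = x * a % p
        k += 1
    return k

def theorem_holds(p, a, n):
    """Check that p | Phi_n(a) exactly when the order of a mod p is n / p^{v_p(n)}."""
    reduced = n
    while reduced % p == 0:
        reduced //= p
    return (phi_value(n, a) % p == 0) == (order(a, p) == reduced)
```

For instance, 3 divides $\Phi_{18}(2) = 57$, and indeed 2 has order 2 = 18/9 modulo 3, so `theorem_holds(3, 2, 18)` is `True`.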


Exercise 3.3.1∗ . Let p be a rational prime and a a rational integer. Prove that, for any n ≥ 1, p | Φn (a) if
and only if p | Φpn (a).

Exercise 3.3.2∗ . Let p be a rational prime. Prove that there always exists a primitive root or generator
modulo p, i.e. an integer g such that the powers $g^k$ generate all integers m with p ∤ m modulo p.

Exercise 3.3.3∗ . Let p be a rational prime and a, b two rational integers. Prove that $p \mid \Phi_n(a, b)$ if and only
if p | a, b or $\frac{n}{p^{v_p(n)}}$ is the order of $ab^{-1}$ modulo p.

From this we get the following very important corollary.

Corollary 3.3.1*

Let p be a rational prime and a a rational integer. Suppose that p | Φn (a). Then, p ≡ 1 (mod n)
or p is the greatest prime factor of n.

Proof

If p ∤ n, then n is the order of a modulo p by Theorem 3.3.1 so n | p − 1. Otherwise, $\frac{n}{p^{v_p(n)}} \mid p - 1$
so all prime factors of the former are smaller than p. But the prime factors of $\frac{n}{p^{v_p(n)}}$ are exactly
the prime factors of n distinct from p!
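
As an illustration of Corollary 3.3.1, one can factor a few values $\Phi_n(a)$ by trial division and inspect their prime factors. The Python sketch below does this for small n and a only, since trial division is slow; the helper names are hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def phi_value(n, a):
    """Phi_n(a) for an integer a >= 2, using a^n - 1 = prod_{d | n} Phi_d(a)."""
    value = a**n - 1
    for d in range(1, n):
        if n % d == 0:
            value //= phi_value(d, a)
    return value

def prime_factors(m):
    """Distinct prime factors of m >= 1 in increasing order, by trial division."""
    out, p = [], 2
    while p * p <= m:
        if m % p == 0:
            out.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        out.append(m)
    return out

def corollary_holds(n, a):
    """Every prime p | Phi_n(a) is 1 mod n or is the greatest prime factor of n."""
    largest = max(prime_factors(n), default=None)
    return all((p - 1) % n == 0 or p == largest
               for p in prime_factors(phi_value(n, a)))
```

For example, $\Phi_{10}(4) = 205 = 5 \cdot 41$: here 5 is the greatest prime factor of 10 and 41 ≡ 1 (mod 10).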


Exercise 3.3.4∗ . Let p be a rational prime and a an integer of order n modulo p. Prove that $a^k \equiv 1 \pmod{p}$
if and only if n | k. Deduce that n divides p − 1.²

Exercise 3.3.5∗ . Let p be a rational prime and a, b two rational integers. Suppose that $p \mid \Phi_n(a, b)$. Prove
that, if p does not divide both a and b, either p ≡ 1 (mod n) or p is the greatest prime factor of n.

Exercise 3.3.6∗ . Let p be a rational prime and a an integer. Suppose $p \mid \Phi_n(a), \Phi_m(a)$ and n ≠ m. Prove
that $\frac{m}{n}$ is a power of p.

Exercise 3.3.7. Prove the following strengthening of Problem 3.1.1: for any integer n ≥ 0, the number
$2^{2^{n+1}} + 2^{2^n} + 1$ has at least n + 1 distinct prime factors.

We also get the following result. It is a special case of the celebrated theorem of Dirichlet on
arithmetic progressions which asserts that, for any gcd(m, n) = 1, there are infinitely many rational
primes p ≡ m (mod n). Its proof is significantly more involved.

Corollary 3.3.2*

For any integer n ≥ 1, there are infinitely many rational primes p ≡ 1 (mod n).

Exercise 3.3.8∗ . Let n ≥ 1 be an integer. Prove that there exist infinitely many rational primes p ≡ 1
(mod n).

Here is an example of a problem that follows from the first corollary.

Problem 3.3.1 (ISL 2006 N5)

Prove that there do not exist integers x ≠ 1 and y such that

$$\frac{x^7 - 1}{x - 1} = y^5 - 1.$$

Solution

Suppose for the sake of contradiction that (x, y) is a solution. We rewrite the equation using
cyclotomic polynomials:

$$\Phi_7(x) = \Phi_1(y)\Phi_5(y).$$

By Corollary 3.3.1, a prime factor p of the LHS is either 7 or 1 mod 7. Suppose that $7 \nmid \Phi_7(x)$.
Since $\Phi_5(y) = \frac{y^5 - 1}{y - 1}$ and $\Phi_7(x) = \frac{x^7 - 1}{x - 1}$ are positive, $\Phi_1(y)$ must be as well. Then, the previous
remark yields

$$\Phi_1(y),\ \Phi_5(y) \equiv 1 \pmod{7}.$$

Thus, from $\Phi_1(y) \equiv 1 \pmod{7}$ we get y ≡ 2 (mod 7). This means that

$$\Phi_5(y) \equiv \frac{2^5 - 1}{2 - 1} \equiv 3 \pmod{7},$$

a contradiction.
Hence, we must have 7 | Φ7 (x). Since 7 is distinct from 5 and not congruent to 1 mod 5, it can’t
divide Φ5 (y) which means it must divide Φ1 (y). Thus, y ≡ 1 (mod 7). This implies that
Φ5 (y) ≡ 1 + 1 + 1 + 1 + 1 ≡ 5 (mod 7)

2 This is the mod p version of Exercise 3.1.1∗ . In fact the proof should be the same as it works in any group (see

Section A.2 and Theorem 6.3.2).



which is again a contradiction.




3.4 Zsigmondy’s Theorem


In this section, we formulate and prove the powerful Zsigmondy theorem.

Definition 3.4.1

Let $(u_n)_{n \ge 1}$ be a sequence of rational integers. We say a prime p is a primitive prime factor of
$u_n$ if $p \mid u_n$ but $p \nmid u_1, \ldots, u_{n-1}$.

In other words, a primitive prime factor is a new prime factor.

Theorem 3.4.1 (Zsigmondy)

Let a > b be coprime positive integers. In the sequence $(a^n - b^n)_{n \ge 1}$, the term $a^n - b^n$ always has a
primitive prime factor for n ≥ 2, except in the following cases: n = 2 and a + b is ± a power of
2, and n = 6 and (a, b) = (2, 1).

Exercise 3.4.1∗ . Check that the exceptions stated in Theorem 3.4.1 are indeed exceptions.

Exercise 3.4.2∗ . Prove that $a^2 - b^2$ has no primitive prime factor if and only if a + b is ± a power of 2.

Here is how we will prove this theorem. The numbers $a^n - b^n$ have many prime factors in common.
However, we have seen that the numbers $\Phi_n(a, b)$ have strong restrictions on their prime factors, and
thus don’t have many common prime factors (see e.g. Exercise 3.3.6∗ ). Notice now that, since

$$a^n - b^n = \prod_{d \mid n} \Phi_d(a, b),$$

finding a primitive prime factor of $a^n - b^n$ reduces to finding a primitive prime factor of $\Phi_n(a, b)$!

Before delving into the proof, we need a lemma called the "lifting the exponent lemma" or "LTE".

Theorem 3.4.2 (Lifting the Exponent Lemma)

Let p | n be an odd rational prime, where n ≥ 1 is an integer. Then, for any rational integers
a, b with p ∤ a, b, $v_p(\Phi_n(a, b)) \le 1$. Moreover, for p = 2, if 4 | n then $v_p(\Phi_n(a, b)) \le 1$.

Proof

Notice that

$$\Phi_n(a, b) \;\Big|\; \frac{a^n - b^n}{a^{n/p} - b^{n/p}}.$$

If $a^{n/p} \not\equiv b^{n/p} \pmod{p}$ then $a^n \not\equiv b^n \pmod{p}$ so $v_p(\Phi_n(a, b)) = 0$.
Thus, it suffices to show that for any distinct rational integers u ≡ v (mod p), $p^2 \nmid \frac{u^p - v^p}{u - v}$. Write

v = u + mp. Then,

$$\frac{u^p - v^p}{u - v} = \sum_{k=0}^{p-1} u^{p-1-k}v^k = \sum_{k=0}^{p-1} u^{p-1-k}(u + mp)^k \equiv \sum_{k=0}^{p-1} u^{p-1-k}\left(u^k + kmpu^{k-1}\right) \pmod{p^2}$$

by the binomial expansion. However, this sum is just

$$\sum_{k=0}^{p-1} \left(u^{p-1} + pmu^{p-2}k\right) = pu^{p-1} + pmu^{p-2} \cdot \frac{p(p-1)}{2} \equiv pu^{p-1} \pmod{p^2}$$

which is indeed non-zero modulo $p^2$ for odd p since p ∤ u.

For p = 2, we have $v_2\left(\frac{a^n - b^n}{a^{n/2} - b^{n/2}}\right) = v_2\left(a^{n/2} + b^{n/2}\right) \le 1$ when 2 | n/2, i.e. 4 | n.


Some might be more familiar with this version of the lemma:

Theorem 3.4.3 (Lifting the Exponent Lemma)

Let p be an odd rational prime and u ≡ v ≢ 0 (mod p). Then,

$$v_p(u^n - v^n) = v_p(u - v) + v_p(n).$$

Moreover, for p = 2, we have $v_2(u^n - v^n) = v_2(u - v)$ for odd n and
$v_2(u^n - v^n) = v_2(u^2 - v^2) + v_2(n) - 1$ for even n.

Proof

Rewrite this as

$$v_p\left(\frac{u^n - v^n}{u - v}\right) = v_p(n).$$

Then, this follows from the following equality:

$$\frac{u^n - v^n}{u - v} = \prod_{d \mid n,\ d > 1} \Phi_d(u, v).$$

By Exercise 3.3.3∗ , $p \mid \Phi_d(u, v)$ if and only if $\frac{d}{p^{v_p(d)}}$ is the order of $u \cdot v^{-1}$. Since u ≡ v, the order
of $u \cdot v^{-1}$ is just 1. Thus, $p \mid \Phi_d(u, v)$ if and only if d is a power of p. By our version of the
lemma 3.4.2, each such factor adds 1 to the p-adic valuation since a power of p distinct from 1 is
divisible by p. Finally, the p-adic valuation of $\frac{u^n - v^n}{u - v}$ is just the number of powers of p distinct
from 1 dividing n: i.e. $v_p(n)$. For p = 2, we also need to take into account the contribution of $\Phi_2$
so we get $v_2(u^n - v^n) = v_2(n) - 1 + v_2(u^2 - v^2)$ for 2 | n (and the case 2 ∤ n is the same as for p
odd).
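
Both forms of the formula in Theorem 3.4.3 can be checked numerically. The Python sketch below is a sanity check on small positive integers, where the first formula is applied to odd p; the helper names `vp` and `lte_holds` are illustrative.

```python
def vp(x, p):
    """p-adic valuation of a non-zero integer x."""
    x = abs(x)
    v = 0
    while x % p == 0:
        x //= p
        v += 1
    return v

def lte_holds(u, v, n, p):
    """Check the lifting the exponent formula for distinct positive u = v != 0 (mod p)."""
    assert u > 0 and v > 0 and u != v
    assert (u - v) % p == 0 and u % p != 0
    lhs = vp(u**n - v**n, p)
    if p != 2:
        return lhs == vp(u - v, p) + vp(n, p)
    if n % 2 == 1:
        return lhs == vp(u - v, 2)
    return lhs == vp(u * u - v * v, 2) + vp(n, 2) - 1
```

For instance, with p = 2, u = 3, v = 1 and n = 4 we have $v_2(3^4 - 1) = v_2(80) = 4$, and the right-hand side is $v_2(8) + v_2(4) - 1 = 3 + 2 - 1 = 4$.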


We can now start proving Zsigmondy’s theorem.

Beginning of the Proof of Zsigmondy’s Theorem 3.4.1

Suppose that $\Phi_n(a, b)$ does not have any primitive prime factor for some n ≥ 3; the case n = 2
was done in Exercise 3.4.2∗ . Then, let $p \mid \Phi_n(a, b)$ be a non-primitive prime factor, say that it also

divides $\Phi_m(a, b)$ for some m < n. Since a and b are coprime integers, p cannot divide both of
them so it divides neither and the order of $ab^{-1}$ mod p is both $\frac{n}{p^{v_p(n)}}$ and $\frac{m}{p^{v_p(m)}}$ by Exercise 3.3.3∗ .

Thus, n and m differ multiplicatively by a power of p. In particular, p | n so p is the greatest
prime factor of n by Exercise 3.3.5∗ and hence is unique!

Moreover, p can’t be equal to 2, otherwise n would be a power of 2 but $\Phi_{2^k}(a, b) = a^{2^{k-1}} + b^{2^{k-1}}$
is a sum of two coprime squares hence not divisible by 4 but clearly at least 4 which means it
can’t be a power of 2. Thus, by the LTE lemma 3.4.2, $v_p(\Phi_n(a, b)) \le 1$. We have reduced the
problem to showing $|\Phi_n(a, b)|$ is not equal to the greatest prime factor of n!


If |Φn (a, b)| were equal to the greatest prime factor of n, it would in particular be at most n.
Intuitively, this should not be the case as cyclotomic polynomials are exponential in n. We are thus
led to find bounds on them. This is achieved in the following proposition.

Proposition 3.4.1*

Let |a| > |b| be two real numbers and n ≥ 1 an integer. Then,

$$(|a| - |b|)^{\varphi(n)} \le |\Phi_n(a, b)| \le (|a| + |b|)^{\varphi(n)}$$

with equality on either side only if n = 1 or n = 2. In addition, if n > 2,

$$|b|^{\varphi(n)} \le \Phi_n(a, b)$$

with equality only if |a| = |b|.

Proof

The first part of this follows from the triangle inequality exactly like we did for Problem 3.1.1:

$$|\Phi_n(a, b)| = \prod_{\omega \text{ primitive } n\text{th root of unity}} |a - b\omega|$$

by Exercise 3.1.8∗ and each factor is between |a| − |b| and |a| + |b|. The equality cases are easy to
work out: |a − bω| = |a| ± |b| implies ω is real so n = 1 or n = 2.

For the $|b|^{\varphi(n)}$ part, after dividing by it, it reduces to $|\Phi_n(a/b)| > 1$, thus to showing $|\Phi_n(x)| > 1$
for all |x| > 1. Notice that, for any |ω| = 1, |x − ω| is a strictly decreasing function in x if x ≤ −1
and is a strictly increasing function in x if x ≥ 1. Hence,

$$|\Phi_n(x)| = \prod_{\omega \text{ primitive } n\text{th root of unity}} |x - \omega|$$

is at least either $|\Phi_n(1)|$ or $|\Phi_n(-1)|$. But these are both non-zero integers so in both cases it is
at least 1 and, by strict monotonicity, equality would force |x| = 1, i.e. |a| = |b|.


Exercise 3.4.3. Let n ≥ 3 be an integer. Prove that Φn is positive on R.



Back to the Proof of Zsigmondy’s Theorem 3.4.1

Suppose that $\Phi_n(a, b) = p$ where p is a prime factor of n. Then, by Proposition 3.4.1,

$$b^{\varphi(n)},\ (a - b)^{\varphi(n)} \le \Phi_n(a, b) = p.$$

In particular, since p | n,

$$b^{p-1},\ (a - b)^{p-1} \le p.$$

Exercise 3.4.4 therefore implies that b = 1 and a − b = 1 since p ≠ 2, i.e. b = 1 and a = 2.

We now use Proposition 3.1.2:

$$\Phi_n(a) \ge \frac{\Phi_{n/p}(a^p)}{\Phi_{n/p}(a)} \ge \left(\frac{2^p - 1}{3}\right)^{\varphi(n/p)}.$$

By Exercise 3.4.4, since $\frac{2^p - 1}{3} \le p$, we must have p = 3. Since we also had $\left(\frac{2^p - 1}{3}\right)^{\varphi(n/p)} \le p$ this
means that ϕ(n/p) = 1 so n = p or 2p.

Since $\Phi_1(2, 1) = 1$ and $\Phi_2(2, 1) = 3$, $\Phi_3(2, 1) = 7$ has a primitive prime factor, which means that n = 6 and
we have finally found our exception!
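
The exceptions of Zsigmondy's theorem can be observed experimentally. The following Python sketch is a brute-force illustration only: it strips from $a^n - b^n$ every prime that already divides some $a^m - b^m$ with m < n and reports whether anything is left. The names `has_primitive_prime_factor` and `exceptions` are hypothetical.

```python
from math import gcd

def has_primitive_prime_factor(a, b, n):
    """True if a^n - b^n has a prime factor dividing no a^m - b^m with m < n."""
    rest = a**n - b**n
    for m in range(1, n):
        # Remove from rest every prime it shares with a^m - b^m.
        g = gcd(rest, a**m - b**m)
        while g > 1:
            rest //= g
            g = gcd(rest, g)
    return rest > 1

def exceptions(a, b, n_max):
    """All 2 <= n <= n_max for which a^n - b^n has no primitive prime factor."""
    return [n for n in range(2, n_max + 1)
            if not has_primitive_prime_factor(a, b, n)]
```

For (a, b) = (2, 1) and n up to 30, the only value returned is n = 6, where $2^6 - 1 = 63 = 3^2 \cdot 7$ and both 3 and 7 already divide $2^2 - 1$ and $2^3 - 1$ respectively.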


Exercise 3.4.4. Prove that $2^{m-1} > m$ for any integer m ≥ 3 and $2^m - 1 > 3m$ for any integer m ≥ 4.

Remark 3.4.1
As Exercise 3.5.40† shows, we can still get an exponential bound for Φn (2) and make that case
similar to the others, but this is technical so we preferred this approach.

3.5 Exercises
Diophantine Equations
Exercise 3.5.1. Find all rational integers x and y such that $x^2 + 9 = y^3$.

Exercise 3.5.2† (USA TST 2008). Let n be a rational integer. Prove that $n^7 + 7$ is not a perfect
square.

Exercise 3.5.3. Solve the equation

$$x^3 = y^{16} + y^{15} + \ldots + y + 9$$

over Z.

Exercise 3.5.4 (Japanese Mathematical Olympiad 2011). Find all positive integers a, p, q, r, s such
that

$$a^s - 1 = (a^p - 1)(a^q - 1)(a^r - 1).$$

Exercise 3.5.5† (French TST 1 2017). Determine all positive integers a for which there exist positive
integers m and n as well as positive integers $k_1, \ldots, k_m, \ell_1, \ldots, \ell_n$ such that

$$(a^{k_1} - 1) \cdot \ldots \cdot (a^{k_m} - 1) = (a^{\ell_1} + 1) \cdot \ldots \cdot (a^{\ell_n} + 1).$$



Divisibility Relations
Exercise 3.5.6 (IMO 2000). Does there exist a rational integer n such that n has exactly 2000 distinct
prime factors and n divides $2^n + 1$?

Exercise 3.5.7† . Find all coprime positive integers a and b for which there exist infinitely many
integers n ≥ 1 such that

$$n^2 \mid a^n + b^n.$$

Exercise 3.5.8. Prove that there exist infinitely many positive integers n such that

$$n^3 \mid 2^{n^2} + 1.$$

Exercise 3.5.9 (Iran TST 2013). Prove that there do not exist positive rational integers a, b, c such
that $3(ab + bc + ca) \mid a^2 + b^2 + c^2$.

Exercise 3.5.10 (ISL 1998). Determine all positive integers n for which there is an m ∈ Z such that
$2^n - 1 \mid m^2 + 9$.

Prime Factors
Exercise 3.5.11† (ISL 2002). Let $p_1, \ldots, p_n > 3$ be distinct rational primes. Prove that the number
$$2^{p_1 \cdot \ldots \cdot p_n} + 1$$
has at least $2^{2^n}$ distinct divisors.
Exercise 3.5.12† (Problems from the Book). Let a ≥ 2 be a rational integer. Prove that there exist
infinitely many integers n ≥ 1 such that the greatest prime factor of $a^n - 1$ is greater than $n \log_a n$.
Exercise 3.5.13† (Inspired by IMO 2003). Let m ≥ 1 be an integer. Prove that there is some rational
prime p such that $p \nmid n^m - m$ for any rational integer n.
Exercise 3.5.14† . Prove that ϕ(n)/n can get arbitrarily small. Deduce that π(n)/n → 0, where π(n)
denotes the number of primes at most n.
Exercise 3.5.15† . Let P (n) denote the greatest prime factor of any rational integer n ≥ 1 (P (1) = 0).
Let ε > 0 be a real number. Prove that there exist infinitely many rational integers n ≥ 2 such that
$P(n-1), P(n), P(n+1) < n^{\varepsilon}$.
Exercise 3.5.16† (Brazilian Mathematical Olympiad 1995). Let P (n) denote the greatest prime factor
of any rational integer n ≥ 1. Prove that there exist infinitely many rational integers n ≥ 2 such that
P (n − 1) < P (n) < P (n + 1).

Exercise 3.5.17. Let $a, b \in \mathbb{Z}[\sqrt5]$ be quadratic integers such that $a \equiv b \pmod{\sqrt5}$ and $n \ge 1$ an
integer. Prove that
$$v_{\sqrt5}(a^n - b^n) = v_{\sqrt5}(a - b) + v_{\sqrt5}(n)$$
where $v_{\sqrt5}(x)$ denotes the greatest integer v such that $(\sqrt5)^v \mid x$ but $(\sqrt5)^{v+1} \nmid x$. Deduce that
$$v_5(F_n) = v_5(n)$$
for any n ≥ 1, where $(F_n)_{n \ge 0}$ is the Fibonacci sequence defined by $F_0 = 0$, $F_1 = 1$ and $F_{n+2} = F_{n+1} + F_n$ for n ≥ 0.
Exercise 3.5.18† (Structure of units of Z/nZ). Let p be an odd rational prime and n ≥ 1 an integer.
Prove that there is a primitive root modulo $p^n$, i.e. a number g which generates all the numbers coprime
with p modulo $p^n$. Moreover, show that there doesn’t exist a primitive root mod $2^n$ for n ≥ 3, but
that, in that case, there exist a rational integer g and a rational integer a such that each odd rational
integer is congruent to either $g^k$ or $ag^k$ modulo $2^n$ for some k.³
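³ In group-theoretic terms, this says that $(\mathbb{Z}/p^n\mathbb{Z})^\times \simeq \mathbb{Z}/\varphi(p^n)\mathbb{Z}$ and that $(\mathbb{Z}/2^n\mathbb{Z})^\times \simeq (\mathbb{Z}/2\mathbb{Z}) \times (\mathbb{Z}/2^{n-2}\mathbb{Z})$ for n ≥ 2.
The Chinese remainder theorem then yields
$$(\mathbb{Z}/2^n p_1^{n_1} \cdots p_m^{n_m}\mathbb{Z})^\times \simeq (\mathbb{Z}/2\mathbb{Z}) \times (\mathbb{Z}/2^{n-2}\mathbb{Z}) \times (\mathbb{Z}/\varphi(p_1^{n_1})\mathbb{Z}) \times \ldots \times (\mathbb{Z}/\varphi(p_m^{n_m})\mathbb{Z}).$$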

Coefficients of Cyclotomic Polynomials


Exercise 3.5.19. Define the Möbius function $\mu : \mathbb{N}^* \to \{-1, 0, 1\}$ by $\mu(n) = (-1)^k$ where k is the
number of prime factors of n if n is squarefree, and $\mu(n) = 0$ otherwise. Prove that
$$\Phi_n = \prod_{d \mid n} (X^d - 1)^{\mu(n/d)}.$$

Exercise 3.5.20†. Let m ≥ 0 be an integer. Prove that the coefficient of $X^m$ of $\Phi_n$ is bounded when
n varies.
Exercise 3.5.21†. Let $\psi(x) = \sum_{p^\alpha \le x} \log p$. By noticing that
$$\exp(\psi(2n+1)) \int_0^1 x^n (1-x)^n \, dx \le \frac{\exp(\psi(2n+1))}{4^n},$$
prove that π(n), the number of primes at most n, is greater than $Cn/\log n$ for some constant C > 0.
Exercise 3.5.22†. Let m ≥ 3 be an odd integer and suppose that $p_1 < \ldots < p_m = p$ are rational
primes such that $p_1 + p_2 > p_m$ and let $n = p_1 \cdot \ldots \cdot p_m$. What are the coefficients of $X^p$ and $X^{p-2}$ of
$\Phi_n$? Deduce that any rational integer arises as a coefficient of a cyclotomic polynomial.⁴
Exercise 3.5.23†. Let p and q be two rational primes. Prove that the coefficients of $\Phi_{pq}$ are in
{−1, 0, 1}.

Cyclotomic Fields and Fermat’s Last Theorem


Exercise 3.5.24† (Sophie-Germain’s Theorem). Let p be a Sophie-Germain prime, i.e. a rational
prime such that 2p + 1 is also prime. Prove that the equation $a^p + b^p = c^p$ does not have rational
integer solutions with $p \nmid abc$.
Exercise 3.5.25†. Let ω be an nth root of unity. Define $\mathbb{Q}(\omega)$ as $\mathbb{Q} + \omega\mathbb{Q} + \ldots + \omega^{n-1}\mathbb{Q}$. Prove that
$$\mathbb{Q}(\omega) \cap \mathbb{R} = \mathbb{Q}(\omega + \omega^{-1})$$
where $\mathbb{Q}(\omega + \omega^{-1}) = \mathbb{Q} + (\omega + \omega^{-1})\mathbb{Q} + \ldots + (\omega + \omega^{-1})^{n-1}\mathbb{Q}$.
Exercise 3.5.26†. Let ω be a primitive pth root of unity, where p is prime. Prove that the ring of
integers of $\mathbb{Q}(\omega)$, $\mathcal{O}_{\mathbb{Q}(\omega)} := \mathbb{Q}(\omega) \cap \overline{\mathbb{Z}}$, is

$$\mathbb{Z}[\omega] := \mathbb{Z} + \omega\mathbb{Z} + \ldots + \omega^{p-1}\mathbb{Z}.$$
(In fact this holds for any nth root of unity but it is harder to prove.)
Exercise 3.5.27†. Let ω be a primitive pth root of unity, where p is prime. Prove that $p = u(1-\omega)^{p-1}$,
where $u \in \overline{\mathbb{Z}}$ is a unit of $\overline{\mathbb{Z}}$, i.e. 1/u is also an algebraic integer. Deduce that 1 − ω is prime in Q(ω).
Exercise 3.5.28† (Kummer). Let ω be a root of unity of odd prime order p and suppose ε is a unit
of Q(ω). Prove that $\varepsilon = \eta\omega^n$ for some n ∈ Z and η ∈ R.
Exercise 3.5.29†. Let α ∈ Z[ω], where ω is a primitive pth root of unity. Prove that $\alpha^p$ is congruent
to a rational integer modulo p.
Exercise 3.5.30† (Kummer). Let p be an odd prime and ω a primitive pth root of unity. Suppose
that Z[ω] is a UFD.⁵ Prove that there do not exist non-zero rational integers a, b, c ∈ Z such that
$$a^p + b^p + c^p = 0.$$
(You may assume that, if a unit of Z[ω] is congruent to a rational integer modulo p, it is a pth power
of a unit. This is known as "Kummer’s lemma". See Borevich-Shafarevich [7, Chapter 5, Section 6]
or Conrad [12] for a (1 − ω)-adic proof of this.)
⁴ This may come off as a bit surprising considering that all the cyclotomic polynomials we saw had only ±1 and 0
coefficients.
⁵ Sadly, it has been proven that Z[ω] is only a UFD when p ∈ {3, 5, 7, 11, 13, 17, 19} (for p = 23 the class number is already 3). This approach works however
almost verbatim when the class number h of Q(ω) is not divisible by p. The case h = 1 corresponds to Z[ω] being a
UFD. That said, it has not been proven that there exist infinitely many p such that $p \nmid h$ (but it has been conjectured
to be the case), while it has been proven that there exist infinitely many p such that $p \mid h$.
Exercise 3.5.31† (Fleck’s Congruences). Let n ≥ 1 be an integer, p a rational prime and $q = \left\lfloor \frac{n-1}{p-1} \right\rfloor$.
Prove that, for any rational integer m,
$$p^q \;\Big|\; \sum_{k \equiv m \pmod p} (-1)^k \binom{n}{k}.$$

Miscellaneous
Exercise 3.5.32. Prove a version of Zsigmondy where a and b are coprime rational integers (not
necessarily positive).
Exercise 3.5.33† (Korea Winter Program Practice Test 1 2019). Find all non-zero polynomials f ∈
Z[X] such that, for any prime number p and any integer n, if $p \nmid n, f(n)$, the order of f(n) modulo p
is at most the order of n modulo p.
Exercise 3.5.34† (Korea Mathematical Olympiad Final Round 2019). Show that there exist infinitely
many positive integers k such that the sequence $(a_n)_{n \ge 0}$ defined by $a_0 = 1$, $a_1 = k + 1$ and

$$a_{n+2} = k a_{n+1} - a_n$$

for n ≥ 0 contains no prime number.


Exercise 3.5.35† (Iran Mathematical Olympiad 3rd round 2018). Let a and b be rational integers
distinct from ±1, 0. Prove that there are infinitely many rational primes p such that a and b have the same
order modulo p. (You may assume Dirichlet’s theorem ??.)

Exercise 3.5.36 (All-Russian Mathematical Olympiad 2008). Let S be a finite set of rational primes.
Prove that there exists a positive rational integer n which can be written in the form $a^p + b^p$ for some
a, b ∈ Z if and only if p ∈ S.
Exercise 3.5.37† (IMC 2010). Let f : R → R be a function and a < b two real numbers. Suppose
that f is zero on [a, b], and
$$\sum_{k=0}^{p-1} f\left(x + \frac{k}{p}\right) = 0$$

for any x ∈ R and any rational prime p. Prove that f is zero everywhere.

Exercise 3.5.38. What is the discriminant of Φn ?


Exercise 3.5.39. Let k and n ≥ 1 be coprime integers. Find the minimal polynomial of $\tan\left(\frac{2k\pi}{n}\right)$.
Deduce that tan(qπ) takes only the rational values 0 and ±1 for rational q.
Exercise 3.5.40†. Let n ≥ 1 be an integer. Prove that, for all x ∈ R, $\Phi_n(x) \ge (x-1)x^{\varphi(n)-1}$ with
equality if and only if n = 1.⁶

⁶ In particular, $\Phi_n(2) \ge 2^{\varphi(n)-1}$.


Chapter 4

Finite Fields

Prerequisites for this chapter: Chapters 1 and 3 and Section A.2.

The start of this chapter will be a bit technical; I hope the reader will bear with it.
Recall what a field is. We say that (K, +, ·) is a field (we usually just write "K is a field" when the
addition and multiplication are clear from the context) if + and · have nice properties: commutativity,
associativity, existence of an additive identity (0), existence of a multiplicative identity (1), distributivity of multiplication
over addition, existence of additive inverses, and, most importantly, existence of multiplicative
inverses (except for 0). There is no need to remember all of these: a field is an integral domain where
each non-zero element has a multiplicative inverse. This might seem a bit complicated, but just think
of Q when you have to use fields.
Exercise 4.0.1. Suppose K is a field of characteristic 0, i.e.

$$\underbrace{1 + \ldots + 1}_{n \text{ times}}$$

(where 1 is the multiplicative identity) is never zero for any n ≥ 1. Prove that K contains (up to relabelling
of the elements) Q.¹

First, we discuss the simplest case of finite fields: the integers modulo p, Z/pZ. We will call this
field $\mathbb{F}_p$ for "field with p elements". It is very important to understand that the elements of $\mathbb{F}_p$ are
not rational integers! p = 0 in $\mathbb{F}_p$ (note that this is an equality and not a congruence: congruences
are for rational integers while equality is just equality but in another field) while $p \ne 0$ in Z. Thus,
while we use the same notations for the elements of $\mathbb{F}_p$ and elements of Z, they are not the same.
Exercise 4.0.2∗ . Let p be a rational prime. Prove that there exists a unique field with p elements (it’s Z/pZ).

Now we can discuss what finite fields are. Their name is quite explicit: they are the fields which
are also finite. However, there is a much nicer characterisation of them: they are the finite extensions of
some $\mathbb{F}_p$, i.e. $\mathbb{F}_p$ with some elements algebraic over $\mathbb{F}_p$ adjoined. This is analogous to the construction
of the complex numbers: you adjoin an imaginary number i such that $i^2 = -1$ to the real numbers
R. You can do exactly the same thing for $\mathbb{F}_3$: the polynomial $X^2 + 1$ doesn’t have a root there so
you can adjoin an imaginary (formal) number $i_3$ such that $i_3^2 = -1$ (in $\mathbb{F}_3$), thus getting a field with 9
elements.
Exercise 4.0.3∗. Prove that $\mathbb{F}_3(i_3) := \mathbb{F}_3 + i_3\mathbb{F}_3$ is a field (with 9 elements). (The hard part is to prove that
each element has an inverse.)
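
To get a feeling for this field before proving anything, here is a minimal Python sketch which simply brute-forces the inverses. It is only an experiment, not a proof: elements a + b·i are stored as pairs (a, b), and the names `mul`, `inverse` and `inverses` are hypothetical helpers introduced for this illustration.

```python
from itertools import product

P = 3  # we work in F_3(i) with i^2 = -1

def mul(x, y):
    """Multiply a + b*i and c + d*i, stored as pairs (a, b) and (c, d), modulo 3."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)

elements = list(product(range(P), repeat=2))    # the 9 elements a + b*i
nonzero = [x for x in elements if x != (0, 0)]

def inverse(x):
    """Brute-force search for the multiplicative inverse of a non-zero element."""
    return next(y for y in nonzero if mul(x, y) == (1, 0))

# Every non-zero element gets an inverse; the search would raise StopIteration otherwise.
inverses = {x: inverse(x) for x in nonzero}
```

For instance `inverse((0, 1))` finds the inverse of i, which is −i, i.e. the pair `(0, 2)`.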

Why are we interested in finite fields other than Fp ? Well, for the same reason we are interested
in algebraic numbers. It is nice to have polynomials factorise completely (we say they split), thus we
create new fields by adjoining roots of polynomials to Fp .
1 Technically, it will usually not contain Q because Q is a very specific object. Indeed, the definition of a field is

extremely sensitive: if you change the set K (relabel its elements) but keep everything else the same you get a different
field. In that case we say the new field is isomorphic to the old one. So you must prove that K contains a field isomorphic
to Q, i.e. Q up to relabeling of its elements.


Let’s explain a bit what we mean by that. Given an irreducible polynomial $f \in \mathbb{F}_p[X]$ of degree
n, we let α be a formal object satisfying f(α) = 0 and then consider the field generated by α (and $\mathbb{F}_p$).
It contains α so it must also contain $\alpha^2, \ldots, \alpha^{n-1}$ and hence all the linear combinations of these
elements. Conversely, Exercise 4.2.1∗ shows that $\mathbb{F}_p + \alpha\mathbb{F}_p + \ldots + \alpha^{n-1}\mathbb{F}_p$ is a field (has multiplicative
inverses) and this is therefore what we mean by "adjoining a root of f to $\mathbb{F}_p$". We denote this field by
$\mathbb{F}_p(\alpha)$. Iterating this process shows that, given any polynomial (not necessarily irreducible) f, we can
construct a field containing $\mathbb{F}_p$ where f splits: these are exactly the finite fields we are interested in.

Finally, we come back to our earlier remark about elements of $\mathbb{F}_p$ not being integers. Here, the same
is true for $\mathbb{F}_3(i_3)$: its elements are not Gaussian integers. You may protest and claim that $\mathbb{F}_3(i_3)$ is just
Z[i]/3Z[i], i.e. the Gaussian integers modulo 3. And that is true (up to relabelling of the elements).
However imagine that we were working with $\mathbb{F}_5$ instead. Then, any $i_5$ satisfying $i_5^2 = -1$ must already
be in $\mathbb{F}_5$ as $X^2 + 1 = (X + 2)(X - 2)$. Thus, Z[i]/5Z[i] is very different from $\mathbb{F}_5(i_5)$ as the former has
25 elements while the latter only 5. (The former is not a field because the polynomial $X^2 + 1$ has 4
distinct roots. Equivalently, i + 2 has no inverse.)
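
The claims about Z[i]/5Z[i] are easy to check by brute force. The following Python sketch (the helper `gaussian_mod` and the pair representation (a, b) for a + b·i are assumptions made for this illustration) lists the roots of $X^2 + 1$ and the invertible elements.

```python
from itertools import product

def gaussian_mod(p):
    """Return the elements of Z[i]/pZ[i] as pairs (a, b) meaning a + b*i, and its multiplication."""
    def mul(x, y):
        (a, b), (c, d) = x, y
        return ((a * c - b * d) % p, (a * d + b * c) % p)
    return list(product(range(p), repeat=2)), mul

elements, mul = gaussian_mod(5)

# Roots of X^2 + 1 in Z[i]/5Z[i]: elements x with x*x = -1, i.e. the pair (4, 0).
roots = [x for x in elements if mul(x, x) == (4, 0)]

# Elements having a multiplicative inverse.
units = [x for x in elements if any(mul(x, y) == (1, 0) for y in elements)]
```

Here `roots` contains the four classes of ±2 and ±i, and the pair `(2, 1)` (that is, i + 2) is missing from `units`, so this ring with 25 elements is indeed not a field.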

You might argue that we need to distinguish the cases where the polynomial f already has a root
in Fp and when it does not. Having to distinguish these cases is not only ugly and artificial, but even
false! What we want is for p to stay prime in Q(α) where α is a root of f which, up to a finite number
of exceptions, is equivalent to f staying irreducible in $\mathbb{F}_p[X]$ (see Exercise 6.5.35).²

As a last remark, I hope you are now convinced that a field with, say, $p^2$ elements is very different
from $\mathbb{Z}/p^2\mathbb{Z}$! The latter is not even a field since p does not have an inverse!

4.1 Frobenius Morphism


Before constructing finite fields, we need to discuss some things about Fp itself.

Definition 4.1.1 (Frobenius)

Let p be a rational prime and R a commutative ring of characteristic p, meaning that p = 0 in
R. The Frobenius morphism of R is $\mathrm{Frob}_R : x \mapsto x^p$.

p = 0 means that
$$\underbrace{1 + \ldots + 1}_{p \text{ times}} = 0;$$

this is for instance the case in $\mathbb{F}_p$ or Z modulo p. The word "morphism" in this context means that
it’s an additive map. Indeed,
$$(x + y)^p = \sum_{k=0}^{p} \binom{p}{k} x^k y^{p-k} = x^p + y^p$$

as
$$\binom{p}{k} = \frac{p!}{k!(p-k)!} = 0$$
for k = 1, . . . , p − 1 (p divides the top but not the bottom).
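
As a quick sanity check of this computation, here is a small Python sketch. The function names are hypothetical; the first tests whether n divides all the middle binomial coefficients, the second tests directly whether x ↦ xⁿ is additive on Z/nZ.

```python
from math import comb

def middle_binomials_vanish(n):
    """True if n divides C(n, k) for every 0 < k < n."""
    return all(comb(n, k) % n == 0 for k in range(1, n))

def frobenius_is_additive(n):
    """True if (x + y)^n = x^n + y^n for all x, y in Z/nZ."""
    return all(pow(x + y, n, n) == (pow(x, n, n) + pow(y, n, n)) % n
               for x in range(n) for y in range(n))
```

Both functions return `True` for small primes such as 2, 3, 5, 7, while for n = 4 both fail: $\binom{4}{2} = 6$ is not divisible by 4 and $(1 + 1)^4 \ne 1^4 + 1^4$ modulo 4.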

Proposition 4.1.1*

The Frobenius morphism is indeed a morphism.

² However, the intuition that finite fields are represented by algebraic integers is not completely wrong, but instead
of rational primes, we need to look at $\mathcal{O}_{\mathbb{Q}(\alpha)}$ modulo a prime ideal $\mathfrak{p}$ (the definition of a prime ideal being an ideal such
that $\mathcal{O}_{\mathbb{Q}(\alpha)}$ (mod $\mathfrak{p}$) is a field). The ideal point of view is very rich and is actually the best point of view (compared to
tricky uses of the fundamental theorem of symmetric polynomials), but we do not discuss this in this book. See [27].

Exercise 4.1.1. Why is commutativity (of R) needed?

A direct corollary is the following.

Corollary 4.1.1*
The nth iterate of the Frobenius, $\mathrm{Frob}_R^n$, is also a morphism, i.e. $(x + y)^{p^n} = x^{p^n} + y^{p^n}$ for any
x, y ∈ R.

Let’s see a quick application of this result.

Problem 4.1.1

Let $(a_n)_{n \ge 0}$ be the sequence defined by $a_0 = 3$, $a_1 = 0$, $a_2 = 2$ and $a_{n+3} = a_{n+1} + a_n$ for n ≥ 0.
Prove that $p \mid a_p$ for any prime number p.

Solution

A quick computation shows that $a_n = \alpha^n + \beta^n + \gamma^n$ where $\alpha, \beta, \gamma \in \overline{\mathbb{Z}}$ are the roots of
the characteristic polynomial $X^3 - X - 1$ (see Theorem C.4.1).

Thus, by Proposition 4.1.1,

$$a_p = \alpha^p + \beta^p + \gamma^p \equiv (\alpha + \beta + \gamma)^p = a_1^p = 0 \pmod p$$

as wanted.
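
The statement of Problem 4.1.1 can be tested numerically. The following Python sketch (with hypothetical helper names `perrin` and `is_prime`) computes the sequence directly from its recurrence and checks the divisibility for all primes below 200.

```python
def perrin(n):
    """Return a_n for a_0 = 3, a_1 = 0, a_2 = 2, a_{n+3} = a_{n+1} + a_n."""
    a, b, c = 3, 0, 2           # (a_k, a_{k+1}, a_{k+2}) starting at k = 0
    for _ in range(n):
        a, b, c = b, c, a + b   # shift the window by one index
    return a

def is_prime(n):
    """Trial division; fine for the small range used here."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

primes = [p for p in range(2, 200) if is_prime(p)]
divisible = all(perrin(p) % p == 0 for p in primes)
```

Of course this only confirms the result on a finite range; the proof above is what covers all primes.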


Exercise 4.1.2∗. Prove that $a_n = \alpha^n + \beta^n + \gamma^n$.

4.2 Existence and Uniqueness


Here, we show how to construct all finite fields and prove that there is a unique one of cardinality q for
each prime power $q \ne 1$ ($\mathbb{F}_1$ doesn’t exist because the definition of a field specifies that the multiplicative
and additive identities are different), and none when q isn’t a prime power. Although we can construct
a field with $p^n$ elements by adjoining a root of an irreducible polynomial $f \in \mathbb{F}_p[X]$ of degree n, we do
not do it that way because it is not obvious that such a polynomial exists (Exercise 4.6.3† provides a
proof but itself uses finite fields), but we use this to show the field is unique (surprisingly). In fact, it is
actually very surprising that it’s the same for any irreducible f of degree n. Over Q for instance, this
is completely false: $\mathbb{Q}(\sqrt2) \ne \mathbb{Q}(\sqrt3)$ (Exercise 2.1.2∗ shows that each $\mathbb{Q}(\sqrt d)$ for integral squarefree d
is distinct from the others).

It is convenient to think of elements algebraic over Fp similarly to Q, as part of one big field and
hence compatible with each other even if that is not trivial (for algebraic numbers it is as Q is already
part of C, but how can we define roots of polynomials in $\mathbb{F}_p[X]$ which don’t exist in $\mathbb{F}_p$?). This will be
made precise in Definition 4.3.1.

Proposition 4.2.1

For any rational prime p and integer n ≥ 1, there exists a field with $p^n$ elements.

Proof
Consider the polynomial $X^{p^n} - X$ over $\mathbb{F}_p$. Factorise it as $f_1 \cdot \ldots \cdot f_k$ where $f_1, \ldots, f_k$ are
irreducible in $\mathbb{F}_p[X]$ (not necessarily distinct).

We can construct a field F where $f_1, \ldots, f_k$ split (have all their roots in F) inductively using
Exercise 4.2.1∗. Indeed, one can adjoin a root of $X^{p^n} - X$ (hence a root of one of the $f_i$) which
is not already in $\mathbb{F}_p$ to get a new field F′ and repeat this process inductively until all $f_i$ have
all their roots in F. This must terminate as $X^{p^n} - X$ has at most $p^n$ roots in any field.

We claim that this field has exactly $p^n$ elements. Notice that the derivative of $X^{p^n} - X$ is −1
which is coprime with $X^{p^n} - X$ so all its roots are distinct. Denote them by $\alpha_1, \ldots, \alpha_{p^n}$. Since
they all lie in F, it has at least $p^n$ elements.

To conclude, we prove that any element of F is a root of $X^{p^n} - X$. The roots of $X^{p^n} - X$ are clearly
closed under multiplication, multiplicative inverse and additive inverse, and Corollary 4.1.1 shows
that they are also stable under addition. Since any element of F is obtained from these roots by such operations, we
conclude that they are all roots of $X^{p^n} - X$ and thus there are at most $p^n$ elements of F.


Exercise 4.2.1∗ . Let K be a field and f ∈ K[X] an irreducible polynomial of degree n. Prove that

$$K(\alpha) := K + \alpha K + \ldots + \alpha^{n-1} K$$

is a field, where α is defined as a formal root of f , i.e. an object satisfying f (α) = 0.

Before proving the uniqueness of this field up to isomorphism, we need an analogue of Fermat’s
little theorem. Here Fq denotes a field with q elements.

Definition 4.2.1

For any ring R, we write R× for the multiplicative group of R, i.e. the units of R.

When R = K is a field, we thus have R× = K \ {0}.

Theorem 4.2.1 (Fermat’s Little Theorem in Finite Fields)

For any non-zero $\alpha \in \mathbb{F}_q^\times$, we have $\alpha^{q-1} = 1$. Equivalently, $\alpha^q = \alpha$ for any $\alpha \in \mathbb{F}_q$.

Proof

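Let $\alpha \in \mathbb{F}_q^\times$ be a non-zero element. The function $\beta \mapsto \alpha\beta$ is a bijection from $\mathbb{F}_q^\times$ to $\mathbb{F}_q^\times$ since α is
invertible. Thus,
$$\prod_{\beta \in \mathbb{F}_q^\times} \beta = \prod_{\beta \in \mathbb{F}_q^\times} \alpha\beta = \alpha^{q-1} \prod_{\beta \in \mathbb{F}_q^\times} \beta.$$

Since $\prod_{\beta \in \mathbb{F}_q^\times} \beta \ne 0$, we conclude that $\alpha^{q-1} = 1$.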
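
Theorem 4.2.1 can be observed experimentally in small fields. The sketch below is only an illustration under stated assumptions: it models a field with $p^n$ elements as $\mathbb{F}_p[X]/(f)$ for a monic irreducible f of degree n (the construction of Exercise 4.2.1∗), stores elements as coefficient tuples, and the names `make_field` and `satisfies_fermat` are hypothetical.

```python
from itertools import product

def make_field(p, f):
    """Arithmetic in F_p[X]/(f) for a monic f given by its coefficients, lowest degree first.

    Elements are tuples of length deg f; the quotient is a field when f is irreducible.
    """
    n = len(f) - 1

    def mul(x, y):
        # Schoolbook product, then reduce using X^n = -(f_0 + f_1 X + ... + f_{n-1} X^{n-1}).
        prod = [0] * (2 * n - 1)
        for i, xi in enumerate(x):
            for j, yj in enumerate(y):
                prod[i + j] = (prod[i + j] + xi * yj) % p
        for k in range(2 * n - 2, n - 1, -1):
            c, prod[k] = prod[k], 0
            for j in range(n):
                prod[k - n + j] = (prod[k - n + j] - c * f[j]) % p
        return tuple(prod[:n])

    def power(x, e):
        result = (1,) + (0,) * (n - 1)
        for _ in range(e):
            result = mul(result, x)
        return result

    return list(product(range(p), repeat=n)), power

def satisfies_fermat(p, f):
    """Check alpha^q = alpha for every alpha, where q = p^(deg f)."""
    elements, power = make_field(p, f)
    q = len(elements)
    return all(power(x, q) == x for x in elements)
```

For example `satisfies_fermat(2, [1, 1, 0, 1])` tests the field with 8 elements built from $X^3 + X + 1$, and `satisfies_fermat(3, [1, 0, 1])` the field with 9 elements built from $X^2 + 1$. With the non-irreducible choice $f = X^2$ over $\mathbb{F}_2$ the check fails, since the quotient is then not a field.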

Remark 4.2.1
For the reader knowing a bit of group theory, this can also be seen to be Lagrange’s theorem
applied to the multiplicative group $\mathbb{F}_q^\times$.

Before proving that finite fields are unique (up to isomorphism), we will present an application of
Theorem 4.2.1 to the period of linear recurrences modulo p to further motivate the interest of finite
fields. Note that we already have the uniqueness of finite fields in a special situation: the one where
the fields can be compared, i.e. are both subfields of a larger field K. In that case, if $F, F' \subseteq K$
both have cardinality q, their elements are the roots of $X^q - X$ which means that they are in fact
exactly the same. For us, this larger field will be an algebraic closure of $\mathbb{F}_p$, which we will introduce
in Section 4.3.

We shall prove that the sequence $(a_n)_{n \ge 0}$ in Problem 4.1.1 has period (dividing) $p^6 - 1$ modulo
p for any prime p. Try to find a proof for this seemingly elementary fact without appealing to finite
fields!

Proof

$X^3 - X - 1$ factorises (in $\mathbb{F}_p[X]$!) either as three linear factors, one linear and one quadratic, or
one cubic. In these cases the roots $\alpha_p, \beta_p, \gamma_p$ are respectively all in $\mathbb{F}_p$, one in $\mathbb{F}_p$ and two in $\mathbb{F}_{p^2}$,
or all in $\mathbb{F}_{p^3}$. ($\alpha_p, \beta_p, \gamma_p$ are not algebraic numbers, they are (formal) algebraic elements over
$\mathbb{F}_p$.)

We hence always have
$$\alpha_p^{p^6 - 1} = \beta_p^{p^6 - 1} = \gamma_p^{p^6 - 1} = 1$$
(as $p^2 - 1$ and $p^3 - 1$ divide $p^6 - 1$).

Finally, we conclude that
$$a_{n + p^6 - 1} \equiv \alpha_p^n \cdot \alpha_p^{p^6 - 1} + \beta_p^n \cdot \beta_p^{p^6 - 1} + \gamma_p^n \cdot \gamma_p^{p^6 - 1} = \alpha_p^n + \beta_p^n + \gamma_p^n \equiv a_n \pmod p$$

by Theorem 4.2.1 since $\alpha_p, \beta_p, \gamma_p \ne 0$.

In fact, we even get that the period divides $p^2 - 1$ or $p^3 - 1$.
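
This periodicity is easy to observe by computer. The Python sketch below (the function names are hypothetical) finds the exact period of the sequence modulo p by iterating the state $(a_n, a_{n+1}, a_{n+2})$ until it returns to its starting value, and then checks the divisibility claimed above.

```python
def perrin_period(p):
    """Smallest T > 0 with a_{n+T} = a_n (mod p) for the sequence of Problem 4.1.1."""
    start = (3 % p, 0, 2 % p)
    state, T = start, 0
    while True:
        a, b, c = state
        state = (b, c, (a + b) % p)   # a_{n+3} = a_{n+1} + a_n
        T += 1
        if state == start:
            return T

def period_divides_bound(p):
    """True if the period modulo p divides p^2 - 1 or p^3 - 1."""
    T = perrin_period(p)
    return (p**2 - 1) % T == 0 or (p**3 - 1) % T == 0
```

For instance the period modulo 2 is $7 = 2^3 - 1$ and the period modulo 3 is 13, a divisor of $3^3 - 1 = 26$.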




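By a similar argument, the Fibonacci sequence $(F_n)$ has period dividing $p^2 - 1$ modulo any rational
prime $p \ne 5$ (for p = 5, the characteristic polynomial $X^2 - X - 1$ has a double root, the formula for $F_n$ in terms of its roots breaks down, and the period is 20).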
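
Here is the analogous experiment for the Fibonacci sequence, as a short Python sketch (the name `pisano_period` is my own choice for this illustration). It also shows why the prime 5 has to be set aside: modulo 5 the period is 20, which does not divide $5^2 - 1 = 24$.

```python
def pisano_period(m):
    """Smallest T > 0 with F_{n+T} = F_n (mod m) for all n, for m >= 2."""
    a, b, T = 0, 1, 0
    while True:
        a, b = b, (a + b) % m   # move from (F_n, F_{n+1}) to (F_{n+1}, F_{n+2})
        T += 1
        if (a, b) == (0, 1):
            return T

small_primes = [2, 3, 7, 11, 13, 17, 19, 23, 29, 31]   # 5 deliberately left out
all_divide = all((p * p - 1) % pisano_period(p) == 0 for p in small_primes)
```

The periods modulo 2, 3, 7 and 11 are 3, 8, 16 and 10 respectively.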

Remark 4.2.2
This can actually be proven elementarily for $\mathbb{F}_{p^2}$. Indeed, any element of $\mathbb{F}_{p^2}$ can be written as
$a + b\sqrt d$ where $d \in \mathbb{F}_p$ is not a square modulo p. Since $\mathrm{Frob}_p^2$ is a morphism, it suffices to prove
that $\mathrm{Frob}_p^2(\sqrt d) = \sqrt d$ to conclude that it fixes all of $\mathbb{F}_{p^2}$. But this is easy for odd p:
$$(\sqrt d)^{p^2 - 1} = \left(d^{\frac{p+1}{2}}\right)^{p-1} = 1.$$

Note that this argument can be written without appealing to finite fields to prove that $(F_n)$ has
period dividing $p^2 - 1$, but things already become more messy as we need to treat separately
the cases where p ∈ {2, 5} (because of the denominator or the square root). Also, this does not
generalise to recurrences of order ≥ 3 as algebraic numbers of degree 3 are not simply of the form
$a + b\sqrt[3]{d}$ (but over $\mathbb{F}_p$ they are by Theorem 4.2.2).

We now prove that finite fields are unique. For this, we shall need the fact that finite fields have
prime characteristic and thus contain (a copy of) $\mathbb{F}_p$ where p is the characteristic. Indeed, if the
characteristic of the finite field F is c = char F = ab, then we must have a = 0³ or b = 0 in F since
ab = 0 in F and a field is an integral domain. By minimality of the characteristic, this means that
a = c or b = c, i.e. c is prime (since $c \ne 1$: the trivial ring is not a field!). F then naturally contains
the copy of $\mathbb{F}_p$ where we send $a \in \mathbb{F}_p$ to⁴

$$\underbrace{1 + \ldots + 1}_{a \text{ times}}.$$

That way, we can consider finite fields as field extensions of some $\mathbb{F}_p$, as stated in the beginning of the
chapter.⁵

³ Here we abusively mean $\underbrace{1 + \ldots + 1}_{a \text{ times}}$.

Theorem 4.2.2

Let q be an integer. If $q \ne 1$ is a power of a prime then there exists a unique (up to isomorphism)
field with q elements, otherwise there is none.

This proof is not super instructive and slightly technical so it can be skipped upon a first reading.
(However once understood, one sees that it only consists of more or less trivial technicalities. The key
point is Fermat’s little theorem.) In fact, as mentioned before, it is not even useful to us (since we
already know that they are unique in the context we are interested in, see also Remark 4.2.4), but it
is nonetheless a nice result.

Proof

The fact that if F is a finite field of cardinality q then $q \ne 1$ is a prime power follows from
Proposition C.1.4.

We now show that finite fields of cardinality $p^n$ are unique; existence was proven in Proposition 4.2.1.
We proceed by induction on n. It is clearly true for n = 1. Since the cardinality is a
power of the characteristic, as proven in Proposition C.1.4, any such field must have characteristic
p.

Suppose $\mathbb{F}_{p^m}$ is unique for m < n and let F, F′ be two fields with $p^n$ elements. Since $p + p^2 + \ldots + p^{n-1} < p^n$,
there is some element α of F which is not in any of the previous $\mathbb{F}_{p^m}$. (By this
we mean that $\alpha^{p^m} \ne \alpha$ for any m < n. Indeed, this is the property that defines elements of $\mathbb{F}_{p^m}$.
We do not actually need the inductive hypothesis: the polynomial $(X^p - X) \cdot \ldots \cdot (X^{p^{n-1}} - X)$
has fewer than $p^n$ roots so it doesn’t vanish on all of $\mathbb{F}_{p^n}$.)

Let k be the degree of α so that $\mathbb{F}_p(\alpha)$ is a field with $p^k$ elements. If k < n, by the inductive
hypothesis this must be $\mathbb{F}_{p^k}$ so $\alpha \in \mathbb{F}_{p^k}$ which is not the case. (Again, this means that $\alpha^{p^k} = \alpha$.)
Conversely, if k > n, then $\mathbb{F}_p(\alpha)$ has $p^k > p^n$ elements which is impossible since it is contained
in F, a field of cardinality $p^n$. Thus α has degree n (coincidentally this shows that there always
exists an irreducible polynomial in $\mathbb{F}_p[X]$ of degree n). (We can also consider a primitive root of
$\mathbb{F}_{p^n}$ for α, there exists one by the same argument as Exercise 3.3.2∗.)

Accordingly, we conclude that $F = \mathbb{F}_p(\alpha)$. Let f be the minimal polynomial of α. By Theorem 4.2.1, we know that $f \mid X^{p^n} - X$. Again,
by Theorem 4.2.1, we know that $X^{p^n} - X$ splits in F′, so f must also split in F′.

To conclude, let β be a root of f in F′. Then, we again have $F' = \mathbb{F}_p(\beta)$. Since α and β have
the same minimal polynomial, F and F′ are the same except that α has been relabelled as β.
Indeed, just relabel g(α) ∈ F as g(β) ∈ F′.

This gives us an isomorphism (a relabelling conserving the structure) between F and F′: it


is clear that it is additive and multiplicative (hence same field structure) so we just need to

⁴ By this we mean that we take a representative A ∈ N of a and then add 1 A times. This is well defined since $\mathbb{F}_p$ and
F have the same characteristic.
⁵ In reality, the reason why we care about finite fields is not because they are finite but because they are finite field
extensions of prime fields of the form $\mathbb{F}_p$, meaning that they have the form $\mathbb{F}_p$ adjoined elements which are algebraic
over $\mathbb{F}_p$. (For instance, it wouldn’t really change anything for us if there was a finite field of cardinality 6, we would
simply ignore it.)

check that it is well-defined. This follows from the fact that α and β have the same minimal
polynomial: if g(α) = h(α) then g ≡ h (mod f) so g(β) = h(β).


Remark 4.2.3
Some readers might recognise that the proofs of uniqueness and existence are just saying that
$\mathbb{F}_{p^n}$ is the splitting field of $X^{p^n} - X$.

Remark 4.2.4
It turns out that the fact that, for any field K, there is at most one subfield of K of cardinality
q (see the paragraph after Remark 4.2.1) already implies that there is at most one field of cardinality
q up to isomorphism (and is clearly stronger: a field may contain two distinct isomorphic
subfields). For this, we suppose that A and B both have cardinality q and we construct a field K
containing subfields $A' \simeq A$ and $B' \simeq B$. Our assumption then implies $A' = B'$ and thus $A \simeq B$.

Consider the tensor product $C = A \otimes B$, consisting of finite sums of formal products $a \otimes b$ with
a ∈ A and b ∈ B together with the assumption that this product is bilinear, i.e.

$$(a + a') \otimes (b + b') = a \otimes b + a' \otimes b + a \otimes b' + a' \otimes b'$$

for all a, a′ ∈ A and b, b′ ∈ B (this construction applies more generally to vector spaces over a
common field, here it is $\mathbb{F}_p$ with p = char A = char B). This becomes a commutative ring when
we endow it with the multiplication $(a \otimes b) \cdot (a' \otimes b') = (aa') \otimes (bb')$. It also contains the fields
$A' = A \otimes 1 \simeq A$ and $B' = 1 \otimes B \simeq B$. However, C may not necessarily be a field. To fix this,
we consider a maximal ideal $\mathfrak{m}$ of C, i.e. a strict ideal of C which is contained in no other strict
ideal (just pick a strict ideal of maximal cardinality since C is finite). Then, by Exercise A.3.23†,
$K = C/\mathfrak{m}$ is a field. Moreover, we have an obvious reduction modulo $\mathfrak{m}$ morphism ϕ : C → K:
we just reduce an element c to c (mod $\mathfrak{m}$). Since field morphisms are injective by Exercise A.2.11,
we conclude that the restriction of ϕ to A′ or B′ is an isomorphism from A′ to ϕ(A′) and B′ to
ϕ(B′). We have constructed the field and the subfields we wanted:

$$A \simeq \varphi(A'), \quad B \simeq \varphi(B') \subseteq K.$$

As a final note, we remark that this argument shows more generally that, given any two fields
A, B, we can find a common field extension K in the sense defined above. Indeed, an argument
similar to the one in Remark C.1.4 proves that Zorn’s lemma implies that every commutative
ring has a maximal ideal (and is in fact equivalent to it).

Exercise 4.2.2. Let R be a commutative ring. Prove, using Zorn’s lemma, that R has a maximal ideal. More
generally, prove that any strict ideal of R is contained in a maximal ideal. (You really don’t need to do this.)

Note that our proof also yields the following corollary.

Corollary 4.2.1

Any finite field $\mathbb{F}_{p^n}$ has the form $\mathbb{F}_p(\alpha)$ for some α, i.e. is generated by one element.

4.3 Properties
From the uniqueness of finite fields and Theorem 4.2.1 we can deduce a few fundamental corollaries.

Corollary 4.3.1*

The nth iterate of the Frobenius, $\mathrm{Frob}_p^n$, fixes exactly $\mathbb{F}_{p^n}$.

Proof

Theorem 4.2.1 says that $\mathbb{F}_{p^n}$ is fixed by $\mathrm{Frob}_p^n$. Conversely, the polynomial $X^{p^n} - X$ can have at most $p^n$
roots, so any element satisfying $\alpha^{p^n} = \alpha$ must lie in $\mathbb{F}_{p^n}$.


This might seem trivial but is in fact very useful as it allows us to compare elements of different
finite fields. For instance, α is in $\mathbb{F}_p$ if and only if $\alpha^p = \alpha$ (we will use this in Proposition 4.4.2).

Corollary 4.3.2*

Let m and n be positive integers. We have the inclusion $\mathbb{F}_{p^m} \subseteq \mathbb{F}_{p^n}$ if and only if m | n.

By this, we mean that $\mathbb{F}_{p^n}$ has a subfield isomorphic to $\mathbb{F}_{p^m}$ if and only if m | n.

Proof
This amounts to saying that the roots of $X^{p^m} - X$ are all roots of $X^{p^n} - X$, i.e. that $X^{p^m - 1} - 1 \mid X^{p^n - 1} - 1$
since they are distinct. By Exercise 4.3.1∗ this means that $p^m - 1 \mid p^n - 1$. By the
same exercise, this is equivalent to m | n.
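
The integer half of this divisibility criterion (stated in Exercise 4.3.1∗) is easy to test numerically before proving it. A minimal Python sketch, with a hypothetical function name:

```python
def check_divisibility_criterion(x, bound):
    """Check that x^a - 1 divides x^b - 1 exactly when a divides b, for 1 <= a, b <= bound."""
    return all(((x**b - 1) % (x**a - 1) == 0) == (b % a == 0)
               for a in range(1, bound + 1) for b in range(1, bound + 1))
```

For example `check_divisibility_criterion(2, 12)` compares $2^a - 1 \mid 2^b - 1$ with $a \mid b$ for all $1 \le a, b \le 12$; the base x must be at least 2 so that $x^a - 1$ is non-zero.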




Exercise 4.3.1∗. Let a and b be positive integers and K a field. Prove that $X^a - 1$ divides $X^b - 1$ in K[X] if
and only if a | b. Similarly, if x ≥ 2 is a rational integer, prove that $x^a - 1$ divides $x^b - 1$ in Z if and only if
a | b.

Corollary 4.3.3*

Let α be a root of an irreducible polynomial $f \in \mathbb{F}_p[X]$ of degree m. Then, $\alpha \in \mathbb{F}_{p^n}$ if and only
if m | n.

Again, this means that α is contained in some field of cardinality $p^n$ if and only if m | n.

Proof

$\mathbb{F}_p(\alpha)$ is a field with $p^m$ elements so it is $\mathbb{F}_{p^m}$. Thus, $\alpha \in \mathbb{F}_{p^n}$ if and only if $\mathbb{F}_{p^m} \subseteq \mathbb{F}_{p^n}$ which is
equivalent to m | n by Corollary 4.3.2.


In particular, from the uniqueness of finite fields, we deduce that any polynomial in $\mathbb{F}_p[X]$ of degree
2 splits in $\mathbb{F}_{p^2}$ which was initially not obvious at all. Over Q this is again completely false: $\sqrt2 + \sqrt3$
has degree $4 \ne 2$.

Remark 4.3.1
The uniqueness of $\mathbb{F}_{p^2}$ can actually be seen quite easily: if a and b are quadratic non-residues in
$\mathbb{F}_p$ then there is some $c \in \mathbb{F}_p$ such that $a = c^2 b$ (so $\sqrt a = c\sqrt b$) (see Section 4.5). The degree 2
case in general is a bit pathological because a polynomial is either irreducible or splits. If this
example didn’t convince you, you can think about the fact that each polynomial of degree 3 has
all its roots in $\mathbb{F}_{p^6}$ which is not obvious at all. Why would approximately $3p^3$ elements generate a
field of cardinality only $p^6$ when one element (of degree 7) is sufficient to generate $p^7 - 1$ others?

Exercise 4.3.2∗. Let $f \in \mathbb{F}_p[X]$ be a polynomial of degree n. Prove that f splits over $\mathbb{F}_{p^{n!}}$.
With this we can define the algebraic closure of $\mathbb{F}_p$, consisting of the elements algebraic over $\mathbb{F}_p$
(roots of polynomials with coefficients in $\mathbb{F}_p$). Here is how this is done: we pick a field with p elements
$\mathbb{F}_p$, a field with $p^2$ elements $\mathbb{F}_{p^2}$ which contains $\mathbb{F}_p$ (we can do this by relabelling the elements), a field
with $p^6$ elements $\mathbb{F}_{p^6}$ which contains $\mathbb{F}_{p^2}$, etc. We thus get a chain of finite fields
$$\mathbb{F}_p \subseteq \mathbb{F}_{p^2} \subseteq \mathbb{F}_{p^6} \subseteq \ldots \subseteq \mathbb{F}_{p^{n!}} \subseteq \ldots$$
the union of which contains all finite fields since any n divides n! so $\mathbb{F}_{p^n} \subseteq \mathbb{F}_{p^{n!}}$. Thus, any polynomial
$f \in \mathbb{F}_p[X]$ has a root in this union, and one can show that in fact all the roots must lie there, for
instance using Exercise 4.3.2∗. This is the algebraic closure of $\mathbb{F}_p$.

Definition 4.3.1

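The algebraic closure of $\mathbb{F}_p$, $\overline{\mathbb{F}_p}$, is defined as the elements algebraic over $\mathbb{F}_p$, i.e. the union

$$\bigcup_{n=1}^{\infty} \mathbb{F}_{p^{n!}} = \bigcup_{n=1}^{\infty} \mathbb{F}_{p^n}.$$

Note that this union makes sense as if $\alpha \in \mathbb{F}_{p^n}$ and $\beta \in \mathbb{F}_{p^m}$ then α and β are both in $\mathbb{F}_{p^{mn}}$ so
their sum and product are well defined.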

Remark 4.3.2
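To be completely formal, $\overline{\mathbb{F}_p}$ should be defined as the inductive limit

$$\varinjlim \mathbb{F}_{p^n}$$

of $(\mathbb{F}_{p^n})_{n \ge 1}$ (no need for factorials this time!). In other words, an element of $\overline{\mathbb{F}_p}$ is an element
of the set-theoretic union $\bigcup_{n=1}^{\infty} \mathbb{F}_{p^n}$ (we have not chosen specific copies of $\mathbb{F}_{p^n}$ here), where two
elements $\alpha \in \mathbb{F}_{p^m}$ and $\beta \in \mathbb{F}_{p^n}$ are identified if m | n and the image of α through the canonical
map $\mathbb{F}_{p^m} \to \mathbb{F}_{p^n}$ given by Corollary 4.3.2 is β.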

Remark 4.3.3
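As said at the beginning of the chapter, $\overline{\mathbb{F}_p}$ is not $\overline{\mathbb{Z}}/p\overline{\mathbb{Z}}$, the ring of algebraic integers modulo p!
Indeed, the latter is not a field since it’s not an integral domain: $\sqrt p \cdot \sqrt p \equiv 0$ but $\sqrt p \not\equiv 0$ (as
$\frac{\sqrt p}{p} = \frac{1}{\sqrt p} \notin \overline{\mathbb{Z}}$)! It can even be shown that any polynomial $f \in \overline{\mathbb{Z}}/p\overline{\mathbb{Z}}[X]$ has infinitely many roots
in $\overline{\mathbb{Z}}/p\overline{\mathbb{Z}}$ (see Exercise 4.6.18).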

Sometimes, when we want to evaluate a symmetric expression of algebraic numbers modulo p,
it can be useful to replace these algebraic numbers by the corresponding elements of $\overline{\mathbb{F}_p}$ with the
fundamental theorem of symmetric polynomials (analogous to Proposition 1.3.1) to use finite field
theory. (Section 6.2 of Chapter 6 will show that any expression of algebraic numbers which is rational
can be written as a symmetric expression of some algebraic numbers and their conjugates, implying
that this replacement from $\overline{\mathbb{Q}}$ to $\overline{\mathbb{F}_p}$ can always be made (when p doesn’t divide the denominator of
the expression).)

Finally, we have one last result that again highlights how much better the situation is over Fp
compared to Q.

Theorem 4.3.1

Let $f \in \mathbb{F}_p[X]$ be an irreducible polynomial (in $\mathbb{F}_p[X]$) of degree n. Suppose $\alpha \in \mathbb{F}_{p^n}$ is one of its roots (which exists by Corollary 4.3.3). Then, all its roots are $\alpha, \alpha^p, \ldots, \alpha^{p^{n-1}}$.

In other words, if we know a root of f , we know all of its roots and they are generated by the
Frobenius morphism! This is completely false over Q!

Proof
By Proposition 4.1.1, $f(X^{p^k}) = f(X)^{p^k}$ so $\alpha^{p^k}$ is always a root of f. In addition, these are all distinct as $\alpha^{p^i} = \alpha^{p^j}$ for some i > j implies that

$$\alpha^{p^{i-j}} = \alpha$$

so α is fixed by $\mathrm{Frob}_p^k$ where k = i − j < n. Thus, α would be in $\mathbb{F}_{p^k}$ but this is impossible as $\mathbb{F}_p(\alpha) := \mathbb{F}_p + \alpha\mathbb{F}_p + \ldots + \alpha^{n-1}\mathbb{F}_p$ has $p^n$ elements while $\mathbb{F}_{p^k}$ has $p^k < p^n$ elements.
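
To see Theorem 4.3.1 on a concrete case, here is a minimal Python sketch. It is an illustration under stated assumptions, not part of the proof: we model $\mathbb{F}_{p^n}$ as $\mathbb{F}_p[x]/(f)$ with elements stored as coefficient lists, and the helper names (`mul`, `power`, `evaluate`, `frobenius_orbit`) are ours. The polynomial f is assumed monic and irreducible of degree n ≥ 2.

```python
# Elements of F_p[x]/(f) are lists [c_0, ..., c_{n-1}] (lowest degree first).
# f is given as [f_0, ..., f_{n-1}, 1]; it is ASSUMED monic and irreducible.

def mul(a, b, f, p):
    """Product of a and b in F_p[x]/(f)."""
    n = len(f) - 1
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    # Reduce with x^n = -(f_0 + f_1 x + ... + f_{n-1} x^{n-1}), highest degree first.
    for d in range(2 * n - 2, n - 1, -1):
        c = prod[d]
        if c:
            prod[d] = 0
            for k in range(n):
                prod[d - n + k] = (prod[d - n + k] - c * f[k]) % p
    return prod[:n]

def power(a, e, f, p):
    """a^e in F_p[x]/(f) by square-and-multiply (e >= 0)."""
    n = len(f) - 1
    result = [1] + [0] * (n - 1)
    while e:
        if e & 1:
            result = mul(result, a, f, p)
        a = mul(a, a, f, p)
        e >>= 1
    return result

def evaluate(g, a, f, p):
    """Value g(a) for a polynomial g with coefficients in F_p (Horner's rule)."""
    n = len(f) - 1
    acc = [0] * n
    for c in reversed(g):
        acc = mul(acc, a, f, p)
        acc[0] = (acc[0] + c) % p
    return acc

def frobenius_orbit(f, p):
    """alpha, alpha^p, ..., alpha^(p^(n-1)) where alpha is the class of x."""
    n = len(f) - 1
    alpha = [0, 1] + [0] * (n - 2)
    return [power(alpha, p ** k, f, p) for k in range(n)]

# Example: f = X^3 + X + 1 over F_2 (no root in F_2, degree 3, so irreducible).
f = [1, 1, 0, 1]
orbit = frobenius_orbit(f, 2)
print(orbit)                                   # the three conjugates of alpha
print([evaluate(f, r, f, 2) for r in orbit])   # each entry should be zero
```

The three elements of the orbit are pairwise distinct and each of them is a root of f, as the theorem predicts.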


This theorem will in particular allow us to determine how cyclotomic polynomials factorise over $\mathbb{F}_p$.

4.4 Cyclotomic Polynomials


Recall Theorem 3.3.1: if $a \in \mathbb{F}_p$ and $\Phi_n(a) = 0$, the order of a is $\frac{n}{p^{v_p(n)}}$. This holds true over arbitrary finite fields too. We define the order of a non-zero element $\alpha \in \overline{\mathbb{F}_p}$ to be the smallest k > 0 such that $\alpha^k = 1$. Primitive mth roots over $\mathbb{F}_p$ are defined as elements of order m. This time however, there are no primitive mth roots when p | m: writing m = m′p, the equality $\alpha^{m'p} = 1$ implies $(\alpha^{m'} - 1)^p = 0$ so $\alpha^{m'} = 1$. However, when p ∤ m, primitive mth roots always exist because $X^m - 1$ has distinct roots (its derivative $mX^{m-1}$ is non-zero and thus coprime with $X^m - 1$). Theorem 3.3.1 thus takes the following form.

Proposition 4.4.1

Let $n = p^k m \geq 1$ be an integer where $k = v_p(n)$. Then, over $\mathbb{F}_p$,

$$\Phi_n = \Phi_m^{\varphi(p^k)} = \prod_{\omega \in \overline{\mathbb{F}_p} \text{ primitive } m\text{th root}} (X - \omega)^{\varphi(p^k)}.$$

Exercise 4.4.1∗ . Prove Proposition 4.4.1. (This proof is independent from the one in Chapter 3.)
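
Before proving Proposition 4.4.1, one can at least test it numerically. The following Python sketch is a sanity check and not a proof: it computes $\Phi_n$ over Z from $X^n - 1 = \prod_{d \mid n} \Phi_d$ and compares $\Phi_n$ with $\Phi_m^{\varphi(p^k)}$ coefficientwise modulo p. All function names are ours.

```python
from math import gcd

def poly_mul(a, b):
    """Product of two integer polynomials (coefficient lists, lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_divexact(a, b):
    """Quotient a / b for a monic integer polynomial b that divides a."""
    a = a[:]
    q = [0] * (len(a) - len(b) + 1)
    for d in range(len(q) - 1, -1, -1):
        c = a[d + len(b) - 1]
        q[d] = c
        for k, bk in enumerate(b):
            a[d + k] -= c * bk
    assert not any(a), "division was not exact"
    return q

def cyclotomic(n):
    """Coefficients of Phi_n over Z, via X^n - 1 = prod_{d | n} Phi_d."""
    num = [-1] + [0] * (n - 1) + [1]          # X^n - 1
    for d in range(1, n):
        if n % d == 0:
            num = poly_divexact(num, cyclotomic(d))
    return num

def euler_phi(n):
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def check_proposition(n, p):
    """True if Phi_n and Phi_m^{phi(p^k)} agree mod p, where n = p^k m, p not dividing m."""
    m, k = n, 0
    while m % p == 0:
        m //= p
        k += 1
    rhs = [1]
    for _ in range(euler_phi(p ** k)):
        rhs = poly_mul(rhs, cyclotomic(m))
    return [c % p for c in cyclotomic(n)] == [c % p for c in rhs]

print(check_proposition(12, 2), check_proposition(12, 3), check_proposition(9, 3))
```

The recursion in `cyclotomic` recomputes smaller cyclotomic polynomials many times; this is fine for small n, which is all a sanity check needs.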

In particular, there always exists a primitive root of $\mathbb{F}_{p^n}$, i.e. an element g of order $p^n - 1$ (which thus generates all the other ones): they are the roots of $\Phi_{p^n-1}$ (and these are all in $\mathbb{F}_{p^n}$ as $\Phi_{p^n-1} \mid X^{p^n} - X = \prod_{\alpha \in \mathbb{F}_{p^n}} (X - \alpha)$ by Theorem 4.2.1).

Exercise 4.4.2∗ . Let p ∤ m be a positive integer. Prove that Φm has a root in $\mathbb{F}_{p^n}$ if and only if $m \mid p^n - 1$.
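
For n = 1 the existence of primitive roots is easy to observe by brute force. The sketch below (our own helper names, prime field $\mathbb{F}_p$ only) lists the elements of order p − 1; since they are the roots of $\Phi_{p-1}$, there should be exactly φ(p − 1) of them.

```python
def element_order(a, p):
    """Multiplicative order of a modulo the prime p (a not divisible by p)."""
    k, x = 1, a % p
    while x != 1:
        x = x * a % p
        k += 1
    return k

def primitive_roots(p):
    """All elements of order p - 1 in F_p, found by exhaustive search."""
    return [g for g in range(1, p) if element_order(g, p) == p - 1]

print(primitive_roots(7))
print(primitive_roots(11))
```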

Here is an application of Proposition 4.4.1.



Problem 4.4.1 (Brazilian Mathematical Olympiad 2017 Problem 6)

Let $3 \neq p \mid a^3 - 3a + 1$ be a rational prime where a is some rational integer. Prove that p ≡ ±1 (mod 9).

Solution

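Perform the substitution $a = \alpha + \frac{1}{\alpha}$ where $\alpha \in \mathbb{F}_{p^2}$. This is possible as the polynomial $X^2 - aX + 1$ has degree two and thus has its roots in $\mathbb{F}_{p^2}$. Then,

$$a^3 - 3a + 1 = \left(\alpha + \frac{1}{\alpha}\right)^3 - 3\left(\alpha + \frac{1}{\alpha}\right) + 1 = \alpha^3 + \frac{1}{\alpha^3} + 1.$$

Since the left-hand side is zero in $\mathbb{F}_p$, multiplying by $\alpha^3$ gives

$$\Phi_9(\alpha) = \alpha^6 + \alpha^3 + 1 = 0.$$

We conclude that Φ9 has a root in $\mathbb{F}_{p^2}$ which means that $9 \mid p^2 - 1$ by Exercise 4.4.2∗ . This is exactly equivalent to p ≡ ±1 (mod 9)!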
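
Both ingredients of this solution can be checked by machine. The sketch below (hypothetical helper names, exact rational arithmetic) first verifies the identity behind the substitution at a few rational values of α, and then confirms the conclusion on the prime factors of $a^3 - 3a + 1$ for small a. It is only a numerical check of the statement.

```python
from fractions import Fraction

def identity_holds(alpha):
    """Check a^3 - 3a + 1 = alpha^3 + alpha^-3 + 1 = Phi_9(alpha) / alpha^3 for a = alpha + 1/alpha."""
    a = alpha + 1 / alpha
    lhs = a**3 - 3 * a + 1
    return (lhs == alpha**3 + 1 / alpha**3 + 1
            and lhs * alpha**3 == alpha**6 + alpha**3 + 1)

def prime_factors(n):
    """Set of prime factors of a non-zero integer, by trial division."""
    n, out, d = abs(n), set(), 2
    while d * d <= n:
        while n % d == 0:
            out.add(d)
            n //= d
        d += 1
    if n > 1:
        out.add(n)
    return out

def conclusion_holds(bound):
    """Every prime p != 3 dividing a^3 - 3a + 1 with |a| <= bound is +-1 mod 9."""
    return all(p % 9 in (1, 8)
               for a in range(-bound, bound + 1)
               for p in prime_factors(a**3 - 3 * a + 1) if p != 3)

print(all(identity_holds(Fraction(n, d)) for n in range(1, 6) for d in range(1, 6)))
print(conclusion_holds(100))
```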


Exercise 4.4.3. Prove that $p^2 \equiv 1 \pmod 9$ if and only if p ≡ ±1 (mod 9).

Remark 4.4.1
This seems a bit miraculous and I don’t really have a good explanation for the motivation apart from "p ≡ ±1 (mod 9) is the same as $9 \mid p^2 - 1$ which makes us think of Φ9 in $\mathbb{F}_{p^2}$" or "$a = \alpha + \frac{1}{\alpha}$ makes things cancel well".

We can in fact prove that the converse also holds: if p ≡ ±1 (mod 9), $X^3 - 3X + 1$ has a root in $\mathbb{F}_p$. We have seen that the roots of this polynomial have the form $\omega + \frac{1}{\omega}$ where $\omega \in \overline{\mathbb{F}_p}$ is a primitive 9th root of unity. It remains to check that this is indeed an element of $\mathbb{F}_p$: by Corollary 4.3.1 this amounts to checking that

$$\left(\omega + \frac{1}{\omega}\right)^p = \omega + \frac{1}{\omega}.$$

Since p ≡ ±1 (mod 9) and $\omega^9 = 1$, the LHS is

$$\omega^p + \frac{1}{\omega^p} = \omega^{\pm 1} + \omega^{\mp 1}$$

which is indeed equal to $\omega + \frac{1}{\omega}$.

In fact, we can generalise this problem to find, for any n, a polynomial Ψn which has a root in $\mathbb{F}_p$ if and only if p ≡ ±1 (mod n) (with some possible exceptions if p | n). By adapting the previous solution, we wish to have

$$\Psi_n\left(X + \frac{1}{X}\right) = \frac{\Phi_n(X)}{X^{\varphi(n)/2}}$$

(for n ≥ 3, the other cases are trivial). Such a polynomial indeed exists: the key point is that $\Phi_n(X)/X^{\varphi(n)/2}$ is symmetric in X and 1/X (Exercise 3.1.6∗ ). Hence, by the fundamental theorem of symmetric polynomials, it is a polynomial in $X + \frac{1}{X}$ and $X \cdot \frac{1}{X}$, i.e. a polynomial in $X + \frac{1}{X}$.

Remark 4.4.2
One can also prove that there exists a polynomial Tn such that Tn (X + 1/X) = X n + 1/X n by
induction on n. This polynomial is called the nth Chebyshev polynomial .
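
The induction in Remark 4.4.2 can be made explicit: multiplying $X^n + 1/X^n$ by $X + 1/X$ gives $T_{n+1}(s) = s\,T_n(s) - T_{n-1}(s)$ with $T_0 = 2$ and $T_1 = s$. Here is a short Python sketch of that recurrence (the function names are ours), checked with exact rational arithmetic.

```python
from fractions import Fraction

def T(n):
    """Coefficients (lowest degree first) of T_n, where T_n(X + 1/X) = X^n + 1/X^n."""
    prev, cur = [2], [0, 1]            # T_0 = 2, T_1 = X
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = [0] + cur                # X * T_k
        for i, c in enumerate(prev):
            nxt[i] -= c                # ... minus T_{k-1}
        prev, cur = cur, nxt
    return cur

def evaluate(coeffs, x):
    return sum(c * x**i for i, c in enumerate(coeffs))

x = Fraction(3, 2)
print(T(2), T(3))
print(all(evaluate(T(n), x + 1 / x) == x**n + 1 / x**n for n in range(9)))
```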

Remark 4.4.3
The polynomial Ψn we have constructed is in fact the minimal polynomial of $2\cos\left(\frac{2\pi}{n}\right)$, see Exercise 3.2.5. For the sake of consistency we can thus also artificially define Ψ1 = X − 2 and Ψ2 = X + 2 (they also satisfy Proposition 4.4.2).

Exercise 4.4.4. Compute Ψ1 , . . . , Ψ8 .

Proposition 4.4.2

Let p ∤ n be a prime number. Then, Ψn has a root in $\mathbb{F}_p$ if and only if p ≡ ±1 (mod n).

Proof

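By definition, the roots of Ψn are $\omega + \frac{1}{\omega}$ where ω is a root of Φn , i.e. an element of order n. Let’s see when this is in $\mathbb{F}_p$. We have

$$\left(\omega + \frac{1}{\omega}\right)^p = \omega^p + \frac{1}{\omega^p}.$$

Note that

$$X + 1/X - (\omega + 1/\omega) = (X - \omega)(X - 1/\omega)/X.$$

Thus, $\omega^p + \frac{1}{\omega^p} = \omega + \frac{1}{\omega}$ if and only if $\omega^p = \omega^{\pm 1}$. This is exactly equivalent to p ≡ ±1 (mod n).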
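
As a quick experiment, the proposition can be tested on the two instances of Ψn that appear explicitly in this chapter, namely $\Psi_9 = X^3 - 3X + 1$ (from Problem 4.4.1) and $\Psi_8 = X^2 - 2$. The following Python sketch (our own names, exhaustive root search) compares "Ψn has a root modulo p" with "p ≡ ±1 (mod n)" for small primes p ∤ n.

```python
# Coefficient lists, lowest degree first; only the two cases stated in the text.
PSI = {8: [-2, 0, 1], 9: [1, -3, 0, 1]}

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))

def has_root_mod(coeffs, p):
    """True if the polynomial has a root in F_p (brute force over all residues)."""
    return any(sum(c * x**i for i, c in enumerate(coeffs)) % p == 0 for x in range(p))

def proposition_holds(n, bound):
    return all(has_root_mod(PSI[n], p) == (p % n in (1, n - 1))
               for p in range(2, bound) if is_prime(p) and n % p != 0)

print(proposition_holds(8, 300), proposition_holds(9, 300))
```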


From this we get the following corollary, similar to Exercise 3.3.8∗ .

Theorem 4.4.1

For any positive rational integer n, there exist infinitely many rational primes congruent to −1
modulo n.

Proof

It suffices to prove the existence of one such prime since primes congruent to −1 modulo kn are
also congruent to −1 modulo n, which means that we can pick a sufficiently large k to get a new
prime congruent to −1 modulo n.

Thus, suppose for the sake of contradiction that there are no such primes. Let a be the constant coefficient of Ψn . We shall consider the polynomial f = Ψn (anX)/a, which now has constant coefficient 1 (without loss of generality, suppose that n ≠ 4 so that cos(2π/n) ≠ 0 and thus a ≠ 0). Consider f (m) for some rational integer m. By construction, it is congruent to 1 modulo n and by assumption, all its prime factors are congruent to 1 modulo n. This implies that f (m) is positive, as otherwise it would be congruent to −1 modulo n!

The key point is that the complex roots of Ψn are all real and distinct as they are $2\cos\left(\frac{2k\pi}{n}\right)$ for gcd(k, n) = 1. In particular, there is some interval [a, b] where f is negative. Unfortunately, this interval is very small and may not contain any integer. To fix this, we shall consider, with some restrictions, m ∈ Q instead of m ∈ Z! More precisely, we let m be in Z[1/p] where p is some prime congruent to 1 modulo n (there exists one by Corollary 3.3.2), i.e. $m = a/p^k$ for some a, k.

Finally, let m be an element of Z[1/p] in [a, b]. A prime factor q of the numerator of f (m) is either equal to p, or congruent to 1 modulo n by Proposition 4.4.2 (since m then makes sense modulo q). In addition, the prime factors of the denominator are congruent to 1 modulo n as well since p is the only such prime. Thus, the prime factors of both the numerator and denominator of f (m) are congruent to 1 modulo n, so that f (m) ≡ −1 (mod n) since it is negative. This contradicts the fact that f (m) ≡ 1 (mod n) by construction.


Exercise 4.4.5∗ . Let p ≠ 0, ±1 be an integer. Prove that Z[1/p], i.e. the set of numbers $m/p^k$ with m ∈ Z and k ∈ N, is dense in R.

Exercise 4.4.6∗ . Prove that the leading coefficient of Ψn is 1.


Finally, we discuss the factorisation of cyclotomic polynomials in Fp [X]. While they are irreducible
in Q[X], over Fp the situation is quite different.

Proposition 4.4.3

Let n ≥ 1 be an integer not divisible by p. Over $\mathbb{F}_p$, the nth cyclotomic polynomial Φn factorises as a product of $\frac{\varphi(n)}{k}$ irreducible polynomials of degree k, where k is the order of p modulo n. In particular, it stays irreducible if and only if p is a primitive root modulo n.

Proof

It suffices to show that each irreducible factor has degree k. By Theorem 4.3.1, this is equivalent to k being the smallest positive integer ℓ such that $\omega^{p^\ell} = \omega$, i.e. $\omega^{p^\ell - 1} = 1$, for any element ω of order n. This is very easy to show: $\omega^{p^\ell - 1} = 1$ if and only if $p^\ell \equiv 1 \pmod n$ since ω has order n by definition. Thus, k is the smallest positive integer ℓ such that $p^\ell \equiv 1 \pmod n$ which is, by definition, the order of p modulo n.


As a perhaps surprising corollary, $\Phi_8 = X^4 + 1$ is irreducible in Q[X] but reducible in any $\mathbb{F}_p[X]$ as there is no primitive root modulo 8.
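
Proposition 4.4.3 turns the factorisation pattern of Φn modulo p into a computation of the order of p modulo n. The Python sketch below (our own function names) computes that pattern and, for $\Phi_8 = X^4 + 1$, also searches by brute force for an explicit splitting into two quadratics, which exists modulo every prime including p = 2.

```python
from math import gcd

def order_mod(p, n):
    """Multiplicative order of p modulo n (requires gcd(p, n) = 1)."""
    k, x = 1, p % n
    while x != 1 % n:                  # "1 % n" also handles n = 1
        x = x * p % n
        k += 1
    return k

def euler_phi(n):
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def factor_pattern(n, p):
    """(number of irreducible factors, their common degree) of Phi_n over F_p, p not dividing n."""
    k = order_mod(p, n)
    return euler_phi(n) // k, k

def split_x4_plus_1(p):
    """Find (a, b), (c, d) with X^4 + 1 = (X^2 + aX + b)(X^2 + cX + d) over F_p."""
    for a in range(p):
        c = -a % p                     # the X^3 coefficient a + c must vanish
        for b in range(p):
            for d in range(p):
                if ((b * d) % p == 1
                        and (b + d + a * c) % p == 0
                        and (a * d + b * c) % p == 0):
                    return (a, b), (c, d)
    return None

print(factor_pattern(8, 3), factor_pattern(8, 17), factor_pattern(5, 2))
print(split_x4_plus_1(2), split_x4_plus_1(3))
```

The triple loop is cubic in p, so this is only meant for small primes.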

4.5 Quadratic Reciprocity


We are interested in knowing when an element a ∈ Fp is a perfect square, i.e. when there exists a
b ∈ Fp such that a = b2 . Equivalently, we want to know how the polynomial X 2 − a factorises in
Fp [X]. We thus make the following definitions.

Definition 4.5.1 (Quadratic Residues and Non-Residues)

Given a non-zero $a \in \mathbb{F}_p$, we say a is a quadratic residue if it is a square in $\mathbb{F}_p$. Otherwise, we say it is a quadratic non-residue.

Note that 0 is neither a quadratic residue nor a non-residue (it’s "zero"). The reason for this definition
will become clear shortly. Using primitive roots, one can easily prove the following criterion.

Proposition 4.5.1 (Euler’s Criterion)

An element $a \in \mathbb{F}_p$ is a quadratic residue if and only if $a^{\frac{p-1}{2}} = 1$. Similarly, a is a quadratic non-residue if and only if $a^{\frac{p-1}{2}} = -1$.

Exercise 4.5.1∗ . Prove Proposition 4.5.1.
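
Euler's criterion is also a practical test, since $a^{(p-1)/2}$ can be computed quickly by modular exponentiation. Here is a minimal Python sketch (the name `euler_symbol` is ours), compared against an explicit list of squares; it assumes p is an odd prime.

```python
def euler_symbol(a, p):
    """Value in {-1, 0, 1} congruent to a^((p-1)/2) modulo the odd prime p."""
    r = pow(a, (p - 1) // 2, p)
    return r - p if r == p - 1 else r      # represent p - 1 as -1

def nonzero_squares(p):
    return {x * x % p for x in range(1, p)}

p = 13
print(sorted(nonzero_squares(p)))
print([a for a in range(1, p) if euler_symbol(a, p) == 1])
```

Both printed lists should coincide, and each contains (p − 1)/2 elements.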


Based on this result, we make the following definition.

Definition 4.5.2 (Legendre Symbol)

Let p be an odd rational prime. Given an $a \in \mathbb{F}_p$, we define the Legendre symbol of a, $\left(\frac{a}{p}\right)$, to be the integer among {−1, 0, 1} which is congruent to $a^{\frac{p-1}{2}}$. We also define $\left(\frac{1}{2}\right) = 1$ and $\left(\frac{0}{2}\right) = 0$.

We could have also defined the Legendre symbol before stating Euler’s criterion (0 if a = 0, 1 if a is a quadratic residue, −1 otherwise) but one very nice property of this object is that it’s multiplicative (by Euler’s criterion).

Another way of thinking about the Legendre symbol is that $1 + \left(\frac{a}{p}\right)$ counts (without multiplicity) the number of roots of $X^2 - a$ in $\mathbb{F}_p$: it’s 1 + 0 = 1 when a = 0, 1 + 1 = 2 when a is a quadratic residue, and 1 − 1 = 0 when a is a quadratic non-residue.

Let’s first analyze $\left(\frac{-1}{p}\right)$.

Theorem 4.5.1 (First Supplement of the Quadratic Reciprocity Law)

Let p be an odd prime. Then, $\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}}$.

Proof

The polynomial X 2 − (−1) = Φ4 has a root in Fp if and only if 4 | p − 1 which is exactly what
we wanted to show.


We now state the quadratic reciprocity law. So far, we have only studied relations between finite
fields of fixed characteristic p. This result provides a very beautiful link between the structure of Fp
and Fq for distinct primes p and q.

Theorem 4.5.2 (Quadratic Reciprocity Law)

Let p and q be distinct odd rational primes. Then,

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}.$$

Remark 4.5.1
Technically, this statement doesn’t make sense because $\left(\frac{p}{q}\right)$ is defined for $p \in \mathbb{F}_q$ and not p ∈ Z, and $\left(\frac{q}{p}\right)$ is defined for $q \in \mathbb{F}_p$ and not q ∈ Z. This is of course very easy to fix: we define $\left(\frac{a}{p}\right)$ for a ∈ Z as $\left(\frac{a \pmod p}{p}\right)$. We will make many such identifications throughout this book.

Theorem 4.5.3 (Second Supplement of the Quadratic Reciprocity Law)

Let p be an odd prime. Then, $\left(\frac{2}{p}\right) = (-1)^{\frac{p^2-1}{8}}$.

Combined with its second supplement, the quadratic reciprocity law allows us to compute (more or less efficiently) arbitrary Legendre symbols since the Legendre symbol is multiplicative. Indeed, to compute $\left(\frac{a}{p}\right)$ we can suppose a ∈ [p], then consider its prime factorisation $a = q_1 \cdot \ldots \cdot q_n$ and use the quadratic reciprocity law and its second supplement to reduce the computation of $\left(\frac{a}{p}\right)$ to that of the $\left(\frac{p}{q_k}\right)$ where $q_k < p$, and repeat the process sufficiently many times.
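
The procedure just described can be written as a short recursive function. The sketch below is one possible Python rendering and not an optimised algorithm: it factors a by trial division and uses only multiplicativity, the reciprocity law and the second supplement. The name `legendre_qr` is ours, and p is assumed to be an odd prime.

```python
def legendre_qr(a, p):
    """Legendre symbol (a/p) for an odd prime p, computed via quadratic reciprocity."""
    a %= p                                  # suppose a in {0, ..., p - 1}
    if a == 0:
        return 0
    result = 1
    q = 2
    while a > 1:
        if q * q > a:
            q = a                           # what remains of a is prime
        if a % q == 0:
            e = 0
            while a % q == 0:
                a //= q
                e += 1
            if e % 2 == 1:                  # even powers contribute 1
                if q == 2:
                    s = 1 if p % 8 in (1, 7) else -1      # second supplement
                else:
                    s = legendre_qr(p, q)                 # flip with reciprocity
                    if p % 4 == 3 and q % 4 == 3:
                        s = -s
                result *= s
        q += 1
    return result

print([legendre_qr(a, 7) for a in range(7)])
```

The recursion terminates because every flipped symbol has a strictly smaller prime at the bottom. Note that −1 is handled as p − 1, so the first supplement is not needed explicitly.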

Exercise 4.5.2. Compute $\left(\frac{77}{101}\right)$.

In fact, we have already proven the second supplement with our Proposition 4.4.2.

Proof of the Second Supplement

Notice that $\Psi_8 = X^2 - 2$. But, by Proposition 4.4.2, this polynomial has a root in $\mathbb{F}_p$ if and only if p ≡ ±1 (mod 8) which is exactly equivalent to $(-1)^{\frac{p^2-1}{8}} = 1$.


Exercise 4.5.3. Prove that $\Psi_8 = X^2 - 2$ and that $(-1)^{\frac{p^2-1}{8}} = 1$ if and only if p ≡ ±1 (mod 8).

Proof of the Quadratic Reciprocity Law

We shall make an ingenious use of the Frobenius morphism of $\overline{\mathbb{Z}}$ (mod p). Let $\omega \in \overline{\mathbb{Z}}$ be a primitive qth root of unity.

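Define the ℓth quadratic Gauss sum

$$g_\ell = \sum_{k \in \mathbb{F}_q} \left(\frac{k}{q}\right)\omega^{k\ell} = \left(\frac{\ell}{q}\right) g$$

for $\ell \in \mathbb{F}_q$ where $g = g_1$ (the second equality is Exercise 4.5.4∗ ). We will prove that $g^2 = \left(\frac{-1}{q}\right) q$. Note that we can already know, prior to the computation, that $g^2$ is a rational integer by Exercise 4.5.5∗ .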

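Since we wish to compute $g^2$, we expand $g^2$:

$$g^2 = \left(\sum_{i \in \mathbb{F}_q} \left(\frac{i}{q}\right)\omega^i\right)\left(\sum_{j \in \mathbb{F}_q} \left(\frac{j}{q}\right)\omega^j\right) = \sum_{i,j \in \mathbb{F}_q} \left(\frac{ij}{q}\right)\omega^{i+j}.$$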

Now we use a well known trick: the unity root filter we encountered in Exercise A.3.9† . The idea is that, when we sum $\omega^n$ for some fixed n over the other qth roots of unity raised to the nth power, i.e. consider $\sum_{k \in \mathbb{F}_q} \omega^{kn}$, we get massive simplification. Hence, consider the sum $\delta(n) := \sum_{k \in \mathbb{F}_q} \omega^{kn}$ for $n \in \mathbb{F}_q$. When n = 0 (in $\mathbb{F}_q$), this sum is q, otherwise it’s

$$\frac{\omega^{qn} - 1}{\omega^n - 1} = 0.$$

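We can now finish our computation of $g^2$:

$$\begin{aligned}
(q-1)g^2 &= \sum_{\ell \in \mathbb{F}_q} g_\ell^2 \\
&= \sum_{\ell \in \mathbb{F}_q} \sum_{i,j \in \mathbb{F}_q} \left(\frac{ij}{q}\right)\omega^{(i+j)\ell} \\
&= \sum_{i,j \in \mathbb{F}_q} \left(\frac{ij}{q}\right) \sum_{\ell \in \mathbb{F}_q} \omega^{(i+j)\ell} \\
&= \sum_{i,j \in \mathbb{F}_q} \left(\frac{ij}{q}\right)\delta(i+j) \\
&= q\sum_{i \in \mathbb{F}_q} \left(\frac{-i^2}{q}\right) \\
&= q(q-1)\left(\frac{-1}{q}\right).
\end{aligned}$$

Hence, $g^2 = q\left(\frac{-1}{q}\right)$ as wanted.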

On the one hand,

$$g^p = g \cdot q^{\frac{p-1}{2}} (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \equiv g\left(\frac{q}{p}\right)(-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \pmod p$$

by the previous computation. On the other hand, $g^p \equiv g_p = g\left(\frac{p}{q}\right) \pmod p$ by Frobenius. Therefore,

$$g\left(\frac{q}{p}\right)(-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \equiv g\left(\frac{p}{q}\right) \pmod p.$$

We want to divide both sides by g but we don’t know if g is invertible (actually we do from Exercise 1.5.23† ). Instead, we multiply both sides by g to transform g into $g^2 = q\left(\frac{-1}{q}\right)$ which is indeed invertible modulo p as p ≠ q. Finally, we get

$$(-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}\left(\frac{q}{p}\right) \equiv \left(\frac{p}{q}\right) \pmod p.$$

Since both sides are ±1, they must be equal which is exactly what we wanted to prove.
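
The central identity $g^2 = \left(\frac{-1}{q}\right) q$ is easy to observe numerically by taking ω to be the complex number $e^{2\pi i/q}$. The following Python sketch does so in floating point (hence the tolerance); the helper names are ours and the check is no substitute for the proof above.

```python
import cmath

def legendre(a, q):
    """Legendre symbol via Euler's criterion, q an odd prime."""
    r = pow(a, (q - 1) // 2, q)
    return r - q if r == q - 1 else r

def gauss_sum(q):
    """g = sum over k in F_q of (k/q) * omega^k with omega = exp(2*pi*i/q)."""
    omega = cmath.exp(2j * cmath.pi / q)
    return sum(legendre(k, q) * omega**k for k in range(q))

for q in (3, 5, 7, 11, 13):
    g = gauss_sum(q)
    print(q, g * g, legendre(-1, q) * q)   # the last two columns should agree up to rounding
```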


Exercise 4.5.4∗ . Prove that, for any $\ell \in \mathbb{F}_q$, $g_\ell = \left(\frac{\ell}{q}\right) g$.

Exercise 4.5.5∗ . Prove without computing g 2 that g has exactly 2 conjugates, i.e. is a quadratic number.

4.6 Exercises
Dirichlet Convolutions
Exercise 4.6.1† (Dirichlet Convolution). A function f from N∗ to C is said to be an arithmetic function. Define the Dirichlet convolution 6 f ∗ g of two arithmetic functions f and g as

$$n \mapsto \sum_{d \mid n} f(d)g(n/d) = \sum_{ab=n} f(a)g(b).$$

Prove that the Dirichlet convolution is associative. In addition, prove that if f and g are multiplicative 7 , meaning that f (mn) = f (m)f (n) and g(mn) = g(m)g(n) for all coprime m, n ∈ N, then so is f ∗ g.

6 The Dirichlet convolution appears naturally in the study of Dirichlet series: the product of two Dirichlet series $\sum_{n=1}^{\infty} \frac{f(n)}{n^s}$ and $\sum_{n=1}^{\infty} \frac{g(n)}{n^s}$ is the Dirichlet series corresponding to the convolution of the coefficients, $\sum_{n=1}^{\infty} \frac{(f * g)(n)}{n^s}$.

Exercise 4.6.2† (Möbius Inversion). Define the Möbius function µ : N∗ → {−1, 0, 1} by $\mu(n) = (-1)^k$ where k is the number of prime factors of n if n is squarefree, and µ(n) = 0 otherwise. Define also δ as the function mapping 1 to 1 and everything else to 0. Prove that δ is the identity element for the Dirichlet convolution: f ∗ δ = δ ∗ f = f for all arithmetic functions f . In addition, prove that µ is the inverse of 1 for the Dirichlet convolution, meaning that µ ∗ 1 = 1 ∗ µ = δ where 1 is the function n ↦ 1. 8
Exercise 4.6.3† (Prime Number Theorem in Function Fields). Prove that the number of irreducible polynomials in $\mathbb{F}_p[X]$ of degree n is

$$N_n = \frac{1}{n} \sum_{d \mid n} \mu\left(\frac{n}{d}\right) p^d$$

and show that this is asymptotically equivalent to $\frac{p^n}{\log_p(p^n)}$.

Linear Recurrences
Exercise 4.6.4† (China TST 2008). Define the sequence $(x_n)_{n \geq 1}$ by $x_1 = 2$, $x_2 = 12$ and $x_{n+2} = 6x_{n+1} - x_n$ for n ≥ 1. Suppose p and q are rational primes such that $q \mid x_p$. Prove that, if q ≠ 2, 3, then q ≥ 2p − 1.
Exercise 4.6.5 (Korean Mathematical Olympiad 2013 Final Round). Let a and b be two coprime positive rational integers. Define the sequences $(a_n)_{n \geq 0}$ and $(b_n)_{n \geq 0}$ by

$$(a + b\sqrt{2})^{2n} = a_n + b_n\sqrt{2}$$

for n ≥ 0. Find all rational primes p for which there is some positive rational integer n ≤ p such that $p \mid b_n$.
 
Exercise 4.6.6† . Let p ≠ 2, 5 be a prime number. Prove that $p \mid F_{p-\varepsilon}$ where $\varepsilon = \left(\frac{p}{5}\right)$.

Exercise 4.6.7† . Let p ≠ 2, 5 be a rational prime. Prove that $p \mid F_p - \left(\frac{5}{p}\right)$.

Exercise 4.6.8† . Let m ≥ 1 be an integer and p a rational prime. Find the maximal possible period
modulo p ≥ m of a sequence satisfying a linear recurrence of order m.
Exercise 4.6.9† . Let f ∈ Z[X] be a polynomial and $(a_n)_{n \geq 0}$ be a linear recurrence of rational integers. Suppose that $f(n) \mid a_n$ for any rational integer n ≥ 0. Prove that $\left(\frac{a_n}{f(n)}\right)_{n \geq 0}$ is also a linear recurrence. 9

Polynomials and Elements of Fp


Exercise 4.6.10. Suppose f ∈ Fp [X] is such that f | X n − 1 implies n > pdeg f . Prove that f is
irreducible in Fp [X].
Exercise 4.6.11† . Let $a \in \mathbb{F}_p$ be non-zero. Prove that $X^{p^n} - X - a$ is irreducible over $\mathbb{F}_p$ if and only if n = 1, or n = p = 2.
Exercise 4.6.12† (ISL 2003). Let $(a_n)_{n \geq 0}$ be a sequence of rational integers such that $a_{n+1} = a_n^2 - 2$. Suppose an odd rational prime p divides $a_n$. Prove that $p \equiv \pm 1 \pmod{2^{n+2}}$.
Exercise 4.6.13. Let f ∈ Fp [X] be a polynomial. Prove that f has a double root in Fp if and only if
its discriminant is zero.
7 This terminology has conflicting meanings: in algebra, it means that f (xy) = f (x)f (y) for all x, y, while for arithmetic functions, it only means that f (xy) = f (x)f (y) for coprime x, y.
8 This also explains how we found the formula for Φn from Exercise 3.5.19.
9 In fact, the Hadamard quotient theorem states that if a linear recurrence $b_n$ always divides another linear recurrence $a_n$ then $\left(\frac{a_n}{b_n}\right)_{n \geq 0}$ is also a linear recurrence.

Exercise 4.6.14† . Let f ∈ Fp [X] be an irreducible polynomial of odd degree. Prove that its discriminant is a square in Fp .
Exercise 4.6.15† (Chevalley-Warning Theorem). Let f1 , . . . , fm ∈ Fpk [X1 , . . . , Xn ] be polynomials
such that d1 + . . . + dm < n, where di is the degree of fi . Prove that, if f1 , . . . , fm have a common
root in Fpk , then they have another one.
Exercise 4.6.16† (USA TST 2016). Define $\Psi : \mathbb{F}_p[X] \to \mathbb{F}_p[X]$ by

$$\Psi\left(\sum_{i=0}^{n} a_i X^i\right) = \sum_{i=0}^{n} a_i X^{p^i}.$$

Prove that, for any $f, g \in \mathbb{F}_p[X]$, gcd(Ψ(f ), Ψ(g)) = Ψ(gcd(f, g)).


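Exercise 4.6.17. Prove that

$$\overline{\mathbb{F}_p} = \mathbb{F}_p(\{\omega \in \overline{\mathbb{F}_p} \mid \omega \text{ has prime order}\}).$$

Exercise 4.6.18. Prove that any polynomial $f \in \overline{\mathbb{Z}}/p\overline{\mathbb{Z}}[X]$ has infinitely many roots in $\overline{\mathbb{Z}}/p\overline{\mathbb{Z}}$.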
Exercise 4.6.19 (Miklós Schweitzer 2018). Suppose X 4 + X 3 + 2X 2 − 4X + 3 has a root in Fp . Prove
that p is a fourth power modulo 13.

Squares and the Law of Quadratic Reciprocity


Exercise 4.6.20† . Let q be a prime power, $a \in \mathbb{F}_q^\times$ and m ≥ 1 an integer. Prove that a is an mth power in $\mathbb{F}_q$ if and only if $a^{\frac{q-1}{\gcd(q-1,m)}} = 1$.
Exercise 4.6.21† . Let a be a rational integer. Suppose a is a quadratic residue modulo every rational prime p ∤ a. Prove that a is a perfect square.
Exercise 4.6.22† . Prove that 16 is an eighth power modulo every prime but not an eighth power in
Q.
Exercise 4.6.23† . Prove that, if a polynomial f ∈ Z[X] of degree 2 has a root in Fp for every rational prime p, then it has a rational root. However, show that there exist polynomials of degree 5 and 6 that have a root in Fp for every prime p but no rational root.10
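Exercise 4.6.24† (Jacobi Reciprocity). Define the Jacobi symbol $\left(\frac{\cdot}{n}\right)$ of an odd positive integer n as the product

$$\left(\frac{\cdot}{p_1}\right)^{n_1} \cdot \ldots \cdot \left(\frac{\cdot}{p_k}\right)^{n_k}$$

where $n = p_1^{n_1} \cdot \ldots \cdot p_k^{n_k}$ is the prime factorisation of n. Prove the following statements: for any odd positive integers m, n (coprime in the first statement)
• $\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = (-1)^{\frac{m-1}{2} \cdot \frac{n-1}{2}}$.
• $\left(\frac{-1}{m}\right) = (-1)^{\frac{m-1}{2}}$.
• $\left(\frac{2}{m}\right) = (-1)^{\frac{m^2-1}{8}}$.

(The Jacobi symbol $\left(\frac{m}{n}\right)$ is 1 if m is a quadratic residue modulo n but may also be 1 if m isn’t.)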
Exercise 4.6.25† . Suppose $a_1, \ldots, a_n$ are distinct squarefree rational integers such that

$$\sum_{i=1}^{n} b_i\sqrt{a_i} = 0$$

for some rational numbers $b_1, \ldots, b_n$. Prove that $b_1 = \ldots = b_n = 0$.


10 The Chebotarev density theorem implies that such a polynomial must be reducible. In fact it even characterises

polynomials which have a root in Fp for every rational prime p based on the Galois groups of their splitting field (see
Chapter 6). In particular, it shows that 5 and 6 are minimal.

Exercise 4.6.26† . Let n ≥ 2 be an integer and p a prime factor of $2^{2^n} + 1$. Prove that $p \equiv 1 \pmod{2^{n+2}}$.
Exercise 4.6.27† (USA TST 2014). Find all functions f : Z → Z such that (m − n)(f (m) − f (n)) is a perfect square for all m, n ∈ Z.
Exercise 4.6.28† . Suppose that positive integers a and b are such that $2^a - 1 \mid 3^b - 1$. Prove that, either a = 1, or b is even.

Sums and Products


Exercise 4.6.29† (Tuymaada 2012). Let p be an odd prime. Prove that

$$\frac{1}{0^2+1} + \frac{1}{1^2+1} + \ldots + \frac{1}{(p-1)^2+1} \equiv \frac{(-1)^{\frac{p+1}{2}}}{2} \pmod p$$

where the sum is taken over the k for which $k^2 + 1 \not\equiv 0$.


Exercise 4.6.30. How many pairs (x, y) of elements of Fp are there such that x2 + y 2 = 1?
Exercise 4.6.31 (USAMO 2020). What is the product of the elements a of Fp such that both a and
4 − a are quadratic non-residues?
Exercise 4.6.32† . Let n ≥ 1 be an integer. Prove that, for any rational prime p,

$$\prod_{k=1}^{p-1} \Phi_n(k) \equiv \Phi_{n/\gcd(n,p-1)}(1)^{\frac{\varphi(n)}{\varphi(n/\gcd(n,p-1))}} \pmod p.$$

Miscellaneous
Exercise 4.6.33. Compute Ψn (0) for n ≥ 1.
Exercise 4.6.34† (Lucas’s Theorem). Let p be a prime number and

$$n = p^m n_m + \ldots + p n_1 + n_0$$

and

$$k = p^m k_m + \ldots + p k_1 + k_0$$

be the base p expansion of rational integers k, n ≥ 0 ($n_i$ and $k_i$ can be zero). Prove that

$$\binom{n}{k} \equiv \prod_{i=0}^{m} \binom{n_i}{k_i} \pmod p.$$

Exercise 4.6.35† (Carmichael’s Theorem). Let a, b be two coprime integers such that $a^2 - 4b > 0$, and let $(u_n)_{n \geq 0}$ denote the linear recurrence defined by $u_0 = 0$, $u_1 = 1$, and

$$u_{n+2} = au_{n+1} - bu_n.$$

Prove that for n ≠ 1, 2, 6, $u_n$ always has a primitive prime factor, except when n = 12 and a = ±1, b = −1 (corresponding to the Fibonacci sequence).11
Exercise 4.6.36† . Suppose p ≡ 2 or p ≡ 5 (mod 9) is a rational prime. Prove that the equation

$$\alpha^3 + \beta^3 + \varepsilon a\gamma^3 = 0$$

where ε ∈ Z[j] is a unit and $2 \neq a \in \{p, p^2\}$ does not have solutions in Z[j].
11 Although Schinzel [37] proved in 1993 that, for any non-zero algebraic integers α, β such that α/β is not a root of unity, $\alpha^n - \beta^n$ has a primitive prime ideal factor for all sufficiently large n, the quest for determining all exceptions in the conjugate quadratic case continued until 1999, when it was finally settled by Bilu, Hanrot and Voutier [5]. In particular, there is a primitive factor for any n > 30.

Exercise 4.6.37† (Class Equation of a Group Action and Wedderburn’s Theorem). Let G be a finite group, S a finite set, and · a group action of G on S.12 Given an element s ∈ S, let Stab(s) and Fix(G) denote the set of elements of G fixing s and the set of elements of S fixed by all of G respectively. Finally, let $O_i = Gs_i$ be the (disjoint) orbits of the action. Prove the class equation:

$$|S| = |\mathrm{Fix}(G)| + \sum_{|O_i| > 1} \frac{|G|}{|\mathrm{Stab}(s_i)|}.$$

By applying this to the conjugation action (S = G and $g \cdot h = ghg^{-1}$), deduce Wedderburn’s theorem: any finite skew field is a field.
Exercise 4.6.38† (Finite Projective Planes). We say a pair (Π, Λ) of sets of cardinality at least 2, together with a relation of "a point lying on a line", where Π is a set of "points" and Λ a set of "lines", is a projective plane if
1. any two distinct points lie on exactly one line, and
2. any two distinct lines meet in exactly one point.

Prove that, for any finite projective plane, there exists an integer n, called the order of the plane, such that any line contains exactly n + 1 points and any point lies on exactly n + 1 lines. With this setting, prove also that there are $N = n^2 + n + 1$ points and N lines. Finally, prove that, for any prime power q ≠ 1, there is a projective plane of order q.13
Exercise 4.6.39 (USA TSTST 2016). Does there exist a non-constant polynomial f ∈ Z[X] such
that, for any rational integer n > 2,

f (Z/nZ) := {f (0), . . . , f (n − 1)} (mod n)

has cardinality at most 0.499n?

12 In other words, a map · : G × S → S such that e · s = s and (gh) · s = g · (h · s) for any g, h ∈ G and s ∈ S. See also

Exercise A.3.21† .
13 These have been conjectured to be the only such integers, but this remains unproven so far. See ?? for a necessary

condition for there to be a projective plane of order n (which excludes in particular n = 6).
Chapter 5

Polynomial Number Theory

Prerequisites for this chapter: Section A.1.

Algebraic number theory is deeply linked with polynomials (already by definition!). Here we study
some arithmetic properties of polynomials with rational coefficients.

5.1 Factorisation of Polynomials


We have already mentioned factorisation of polynomials as a unique product of irreducible polynomials
in Chapter 2 (in an abstract context) and Chapter 4 (for Fp [X]) but we restate the main results here
since they are fundamental.

Theorem 5.1.1 (Factorisation in Irreducible Polynomials in Q [X])

Any non-zero polynomial f ∈ Q[X] has a unique factorisation as a constant times a product of
monic irreducible polynomials.

Proof

Q is a field so Q[X] is Euclidean (for the degree map) (see Proposition A.1.1) which means it’s a
UFD by Proposition 2.2.1 and Theorem 2.2.1. To finish, any irreducible polynomial has a unique
monic associate so just use them in the factorisation and collect the leading coefficient in the
beginning.


Since we deal with arithmetic properties of polynomials, we are interested in factorising polynomials over Z[X]. However Z is not a field anymore, so is Z[X] really a UFD? Gauss’s lemma shows that as long as R is a UFD, R[X] also is one. Before proving this however, we expand a bit on irreducible polynomials in Z[X]. The polynomial 2X is irreducible in Q[X] (if 2X = f g then either f or g is constant and non-zero, i.e. a unit of Q[X]) but not anymore in Z[X]. Indeed, it factorises as 2 · X and 2 is not a unit anymore (1/2 ∉ Z[X]).

We are thus led to make the following definition.

Definition 5.1.1 (Primitive Polynomial)

We say a polynomial f ∈ Z[X] is primitive if the gcd of its coefficients is 1.


For instance, the only constant primitive polynomials are 1 and −1.

Theorem 5.1.2 (Gauss’s Lemma)

The product of two primitive polynomials is primitive.

Proof

Suppose f and g are primitive but f g isn’t. Let p be a prime dividing all coefficients of f g, i.e.
f g ≡ 0 (mod p). Since Fp [X] is an integral domain, this means f ≡ 0 (mod p) or g ≡ 0 (mod p)
which is impossible as they are primitive.


We can also state Gauss’s lemma with the notion of content. The primitive polynomials are
polynomials of content 1.

Definition 5.1.2 (Content)

The content of a polynomial f ∈ Z[X], c(f ), is defined as the gcd of the coefficients of f . For f ∈ Q[X], it is c(N f )/|N | where N ≠ 0 is an integer such that N f ∈ Z[X].

Exercise 5.1.1∗ . Prove that the content is well-defined: c(N f )/|N | = c(M f )/|M | for any non-zero M, N ∈ Z such that N f, M f ∈ Z[X].

Proposition 5.1.1 (Equivalent Form of Gauss’s Lemma)*

The content is completely multiplicative, i.e. c(f g) = c(f )c(g) for any f, g ∈ Q[X].

Proof

Without loss of generality, we may assume f, g ∈ Z[X] as c(N f ) = |N |c(f ) for any N ∈ Z (Exercise 5.1.1∗ ). Then, f /c(f ) and g/c(g) are primitive so $\frac{fg}{c(f)c(g)}$ is too by Theorem 5.1.2. Accordingly,

$$c(fg) = c(f)c(g)\,c\left(\frac{fg}{c(f)c(g)}\right) = c(f)c(g).$$
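
The definition of the content for rational polynomials translates directly into code, which gives a quick way to test Proposition 5.1.1 on examples. Below is a minimal Python sketch using exact fractions; the function names are ours, and N is taken to be the least common multiple of the denominators (any admissible N gives the same value by Exercise 5.1.1∗ ).

```python
from fractions import Fraction
from math import gcd

def content(f):
    """Content c(f) of a non-zero polynomial with rational coefficients (list, lowest degree first)."""
    f = [Fraction(c) for c in f]
    N = 1
    for c in f:
        N = N * c.denominator // gcd(N, c.denominator)   # lcm of the denominators
    g = 0
    for c in f:
        g = gcd(g, int(c * N))                           # gcd of the coefficients of N*f
    return Fraction(g, N)

def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += Fraction(ai) * Fraction(bj)
    return out

f = [Fraction(1, 2), Fraction(3, 2)]      # (1 + 3X)/2
g = [4, 6, 10]                            # 2(2 + 3X + 5X^2)
print(content(f), content(g), content(poly_mul(f, g)))
```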

Corollary 5.1.1 (Irreducible Polynomials in Z [X])*

A polynomial f ∈ Z[X] is irreducible in Z[X] if and only if it is primitive and irreducible in Q[X].

Proof

Clearly, if f is primitive but reducible in Z[X], it is reducible in Q[X]. Thus, it suffices to show that a primitive polynomial which is reducible in Q[X] also is reducible in Z[X]. Suppose f = gh with g, h ∈ Q[X] non-constant. By multiplicativity of the content, c(g)c(h) = c(f ) = 1 so we also have

f = (g/c(g))(h/c(h))

which is a factorisation in Z[X] as wanted (by Exercise 5.1.2∗ ).




Exercise 5.1.2∗ . Suppose f ∈ Q[X] has integral content. Prove that f has integer coefficients.

In fact, we even have the following more general result.

Proposition 5.1.2

Suppose f, g ∈ Z[X] are polynomials such that f divides g in Q[X]. Then, f ∗ divides g in Z[X],
where f ∗ = f /c(f ) is the primitive part of f .

Exercise 5.1.3∗ . Prove Proposition 5.1.2.

We finally get our factorisation in Z[X].

Corollary 5.1.2 (Factorisation in Irreducible Polynomials in Z [X])*

Any non-zero polynomial f ∈ Z[X] has a unique factorisation as a constant times a product
of non-constant primitive irreducible polynomials with positive leading coefficient. Equivalently,
Z[X] is a UFD.

Exercise 5.1.4∗ . Prove Corollary 5.1.2.

As another corollary of Proposition 5.1.2, we get a new proof of Proposition 1.2.2, asserting that
the minimal polynomial of an algebraic integer has integer coefficients, which uses neither the fact that
rational integers are the only rational algebraic integers nor the fact that Z is closed under addition
and multiplication.

Corollary 5.1.3

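Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number. Then, $\pi_\alpha \in \mathbb{Z}[X]$ if and only if $\alpha \in \overline{\mathbb{Z}}$.

Proof

It is clear that if $\pi_\alpha \in \mathbb{Z}[X]$ then $\alpha \in \overline{\mathbb{Z}}$, thus suppose that $\alpha \in \overline{\mathbb{Z}}$. Let f ∈ Z[X] be a monic polynomial vanishing at α. Then, $\pi_\alpha^*$ divides f in Z[X] by Proposition 5.1.2 so the leading coefficient of $\pi_\alpha^*$ is ±1 since it divides the leading coefficient of f which is 1. Finally, we have $\pi_\alpha^* = \pm\pi_\alpha$ which means that $\pi_\alpha$ has integer coefficients as wanted.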


Because of these results, from now on we will say "irreducible" to mean "irreducible in Q[X]" and
"primitive and irreducible" to mean "irreducible in Z[X]", unless otherwise specified. By default, f | g
means that f divides g in Q[X] and we will specify if it’s true in Z[X] too when needed. If necessary,
we will use |Q[X] for divisibility in Q[X] and |Z[X] for divisibility in Z[X].

Before discussing another result, we will say one last thing on Gauss’s lemma. One can see that
its proof only uses the fact that Z is a UFD (see Chapter 2). Indeed, we did not use the fact that Fp
is a field, only that it is an integral domain, and this is true for any prime p in any ring. In fact, it
is the definition: p is prime in R if and only if R/pR is integral (see Exercise 2.2.9∗ ). Thus, we could
restate it in the following form.

Proposition 5.1.3 (Gauss’s Lemma)

Suppose that a ring R is a UFD. Then, R[X] is also one.

It can also be seen from our proof of Corollary 5.1.2 that the primes of R[X] are either primes of
R or primitive polynomials which are irreducible over Frac R. A very important consequence is that,
by induction, R[X1 , . . . , Xn ] is a UFD when R is, and in particular K[X1 , . . . , Xn ] is a UFD for every
field K. In other words, we still have factorisation in irreducible polynomials with more variables.

Corollary 5.1.4

For any field K and any integer n ≥ 1, K[X1 , . . . , Xn ] is a UFD.

Remark 5.1.1
It is however not Bézout anymore for n ≥ 2! What we can still do, though, is get rid of one
variable by considering a Bézout relation in K(X1 , . . . , Xn−1 )[Xn ] since K(X1 , . . . , Xn−1 ) is a
field: we get uf + vg ∈ K(X1 , . . . , Xn−1 ) for some u, v ∈ K(X1 , . . . , Xn−1 )[Xn ], and by clearing
the denominators we get rf + sg ∈ K[X1 , . . . , Xn−1 ] for some r, s ∈ K[X1 , . . . , Xn ].

We end this section with a classical criterion for proving certain polynomials are irreducible.

Proposition 5.1.4 (Eisenstein’s Criterion)

Suppose p is a rational prime and $f = a_n X^n + \ldots + a_0 \in \mathbb{Z}[X]$ is a polynomial such that $p \nmid a_n$, $p \mid a_{n-1}, \ldots, a_0$ and $p^2 \nmid a_0$. Then f is irreducible (in Q[X]).

Proof

Suppose f = gh where g, h ∈ Z[X] are non-constant. Then, modulo p, $gh \equiv a_n X^n$ so $g \equiv bX^i$ and $h \equiv cX^j$ for some integers i, j ≥ 0 and some constants b, c which are non-zero modulo p. Moreover, we must have deg(g (mod p)) = deg g, and likewise for h, as otherwise the leading coefficient of g or h is divisible by p which is impossible as $p \nmid a_n$.

Thus, i, j ≥ 1. This must mean that p | g(0), h(0) so $p^2 \mid g(0)h(0) = a_0$, a contradiction. Hence by Gauss’s lemma f is irreducible in Q[X].
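
The hypotheses of Eisenstein's criterion are purely arithmetic conditions on the coefficients, so they are trivial to test by machine. Here is a small Python sketch; the function name is ours, the polynomial is assumed to have degree at least 1, and p is assumed to be prime (this is not checked).

```python
def eisenstein_applies(coeffs, p):
    """coeffs = [a_0, ..., a_n] (lowest degree first). True if p does not divide a_n,
    p divides a_0, ..., a_{n-1}, and p^2 does not divide a_0."""
    *lower, lead = coeffs
    return (lead % p != 0
            and all(c % p == 0 for c in lower)
            and coeffs[0] % (p * p) != 0)

print(eisenstein_applies([-2, 0, 0, 1], 2))   # X^3 - 2 with p = 2
print(eisenstein_applies([4, 0, 1], 2))       # X^2 + 4 with p = 2: 4 divides a_0
```

A `False` result says nothing about irreducibility: the criterion is sufficient, not necessary, and may apply for another prime or after a substitution such as X ↦ X + 1.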


Corollary 5.1.5

The pth cyclotomic polynomial Φp = X p−1 + . . . + X + 1 is irreducible for any rational prime p.

Proof

Apply the Eisenstein criterion to

$$\Phi_p(X+1) = \frac{(X+1)^p - 1}{(X+1) - 1} = \binom{p}{p}X^{p-1} + \binom{p}{p-1}X^{p-2} + \ldots + \binom{p}{2}X + \binom{p}{1}.$$
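
Concretely, the coefficient of $X^{k-1}$ in $\Phi_p(X+1)$ is $\binom{p}{k}$, so the three Eisenstein conditions become statements about binomial coefficients. The Python sketch below (our own function names) lists these coefficients and checks the conditions for a few primes.

```python
from math import comb

def phi_p_shifted(p):
    """Coefficients (lowest degree first) of Phi_p(X + 1) = ((X + 1)^p - 1) / X."""
    return [comb(p, k) for k in range(1, p + 1)]

def satisfies_eisenstein_at_p(p):
    coeffs = phi_p_shifted(p)
    return (coeffs[-1] % p != 0                        # leading coefficient is 1
            and all(c % p == 0 for c in coeffs[:-1])   # p divides C(p, k) for 0 < k < p
            and coeffs[0] % (p * p) != 0)              # constant term is exactly p

print(phi_p_shifted(5))
print(all(satisfies_eisenstein_at_p(p) for p in (2, 3, 5, 7, 11, 13)))
```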

Remark 5.1.2
If one finds this transformation a bit unnatural, one can also reprove Eisenstein for this polynomial: $\Phi_p = \frac{X^p - 1}{X - 1} \equiv (X - 1)^{p-1}$ by Proposition 4.1.1 so if it were reducible $p^2$ would divide Φp (1).

More generally, there are two basic principles to prove a polynomial is irreducible in Q[X]: find
some (impossible) information about a hypothetical factorisation modulo some prime p, or find
some bounds on its roots to get a contradiction if it were reducible (for instance if a monic
polynomial’s constant coefficient is prime and it were reducible, one of the factors must have
constant coefficient ±1 and hence a root of absolute value less than 1). We do not explore the
second idea in this book, see chapter 17 of PFTB [1] for an account of this. Note that, even if f
is irreducible in Q[X], it is not always possible to find a rational prime p for which f (mod p) is
irreducible in $\mathbb{F}_p[X]$, as $\Phi_8$ shows (Proposition 4.4.3).

Exercise 5.1.5. Prove that $\Phi_{p^n}$ is irreducible with Eisenstein’s criterion.

5.2 Prime Divisors of Polynomials


Given a polynomial f ∈ Z[X], we are interested in knowing which rational primes p are such that f
has a root in Fp . In fact we can already see that it is deeply linked to algebraic number theory: if
f = α(X − α1 ) · . . . · (X − αn ), we want to know whether the prime p divides the product

α(a − α1 ) · . . . · (a − αn )

for some a ∈ Z. We will not answer this question however, as it goes beyond the scope of this book
(see the chapter on density theorems of [27]). Instead, we will only prove that there are infinitely many
such primes (we call them "prime divisors of the polynomial" by abuse of terminology, as they divide
one of the values taken by it).

Theorem 5.2.1

For any non-constant polynomial f ∈ Z[X], there exist infinitely many rational primes p such
that p | f (a) for some a.

Proof

Suppose there were only finitely many such primes, $p_1, \ldots, p_m$. Clearly, $f(0) \neq 0$ as otherwise all primes divide $f(0)$ (or $f(p)$ if you prefer large numbers). Thus, let

$$N = p_1^{v_{p_1}(f(0))+1} \cdot \ldots \cdot p_m^{v_{p_m}(f(0))+1}.$$

Consider the numbers $f(kN)$ for $k \in \mathbb{Z}$: they are congruent to $f(0)$ modulo $p_i^{v_{p_i}(f(0))+1}$ so their $v_{p_i}$ is $v_{p_i}(f(0))$. By assumption, their only prime factors are the $p_i$: we conclude that

$$f(kN) = \pm p_1^{v_{p_1}(f(0))} \cdot \ldots \cdot p_m^{v_{p_m}(f(0))} = \pm f(0).$$

Finally, the polynomial $f(X)^2 - f(0)^2$ has infinitely many roots so is zero which means that $f$ is constant, a contradiction.
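As a small experiment, one can list the primes below a bound for which a given polynomial has a root modulo $p$. The sketch below is a brute-force search; the names `evaluate_mod`, `is_prime` and `prime_divisors` are hypothetical helpers, and coefficients are stored constant term first.

```python
def evaluate_mod(coeffs, x, mod):
    """Horner evaluation of sum(coeffs[i] * x**i) modulo mod."""
    result = 0
    for a in reversed(coeffs):
        result = (result * x + a) % mod
    return result


def is_prime(n):
    """Trial division, adequate for the small bounds used here."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def prime_divisors(coeffs, bound):
    """Primes p <= bound such that f has a root modulo p."""
    return [
        p
        for p in range(2, bound + 1)
        if is_prime(p) and any(evaluate_mod(coeffs, a, p) == 0 for a in range(p))
    ]
```

For $X^2 + 1$, `prime_divisors([1, 0, 1], 50)` returns 2 together with the primes congruent to 1 modulo 4 in that range. The search only illustrates the theorem and proves nothing about infinitude.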


Remark 5.2.1
Perhaps a simpler proof is to consider the polynomial f (aX)/a, where a = f (0) is the constant
coefficient of f , to avoid problems with its constant coefficient (which is now 1). We have presented
the other proof first because we find it to be more instructive (but the alternative one is instructive
too). This is what we did in the proof of Theorem 4.4.1.

Remark 5.2.2
We can also prove a much stronger result analytically: if $(a_n)_{n \geq 0}$ is an increasing sequence of positive integers bounded by a polynomial, $a_n \leq f(n)$ for some $f \in \mathbb{R}[X]$, then there are infinitely many primes which divide at least one term of the sequence. Indeed, if there were only $p_1, \ldots, p_m$, then, on the one hand

$$\sum_{n=1}^{\infty} \frac{1}{a_n^{1/\deg f}}$$

would diverge since it grows faster than a constant times the harmonic series. On the other hand, by assumption,

$$\sum_{n=1}^{\infty} \frac{1}{a_n^{1/\deg f}} \leq \prod_{k=1}^{m} \sum_{n=0}^{\infty} \frac{1}{p_k^{n/\deg f}} = \prod_{k=1}^{m} \frac{1}{1 - \frac{1}{p_k^{1/\deg f}}} < \infty.$$

An interesting corollary is that, if $f \in \mathbb{Z}[X]$ is a polynomial and $S \subseteq \mathbb{N}$ is a set of non-zero density (that is, $\frac{|S \cap [n]|}{n} \not\to 0$), then there are infinitely many primes $p$ such that $p \mid f(s)$ for some $s \in S$.

If we define P(f ) to be the set of primes p such that f has a root modulo p, this result becomes
the fact that P(f ) is infinite when f is non-constant.

Here is an application of this result.

Problem 5.2.1 (APMO 2021 Problem 2)

Find all polynomials f ∈ Z[X] such that, for any n, there are at most 2021 pairs of rational
integers 0 < a < b ≤ n for which |f (a)| ≡ |f (b)| (mod n).

Solution

We shall show that if f has degree at least 2, one value of f will be reached arbitrarily
many times modulo n for suitable n. Since f (m) is always positive or always negative for large m, we may assume by
translating f that its sign is constant on positive numbers.

Thus, we want to estimate the number of (a, b) ∈ (Z/nZ)2 such that f (a) ≡ f (b) (mod n). By
Theorem 5.2.1, there are infinitely many prime divisors of f (X + 1) − f ; let p1 , . . . , pm be such
primes.

Thus, for $n = p_i$, there is one value $f(a)$ which is reached twice modulo $n$. Hence, for $n = p_1 \cdot \ldots \cdot p_m$, by the Chinese remainder theorem (CRT), there is a value which is reached $2^m$ times modulo $n$. This indeed grows unbounded.

We conclude that f must have degree 1 (constant f clearly doesn’t work), i.e. f = uX + v for
some u, v ∈ Z and now we need to take into account the absolute values. We show that u = ±1.
Suppose for the sake of contradiction that |u| ≥ 2. Notice that the sign of f (n) is constant for
n ≥ |v|.

Modulo $u^n$, $f(a) \equiv f(b)$ if and only if $a \equiv b \pmod{u^{n-1}}$. For each residue modulo $u^{n-1}$, there are $|u|$ residues modulo $u^n$. Thus, there are $|u|^{n-1} \cdot \binom{|u|}{2}$ pairs of $0 < a < b \leq |u|^n$ such that $f(a) \equiv f(b)$. Now subtract the contribution of the residues where the sign is potentially not the same to get at least

$$(|u|^{n-1} - |v|) \binom{|u|}{2}$$

pairs which indeed grows unbounded.

Finally, f = ±X + v. It is easy to see that when v and the leading coefficient have the same
sign there is no pair working since the sign of f (a) is constant. When v has the opposite sign, f
works if and only if |v| ≤ 2022.
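The CRT step of the degree at least 2 case can be observed numerically. The sketch below counts, for a polynomial $f$ and a modulus $n$, the largest number of residues $a$ modulo $n$ sharing the same value $f(a)$ modulo $n$. The function name `max_multiplicity` and the sample polynomial $X^2 + X$ are assumptions made for this illustration.

```python
from collections import Counter


def max_multiplicity(coeffs, n):
    """Largest number of residues a mod n with the same value f(a) mod n.

    coeffs = [a_0, a_1, ...] are the integer coefficients, constant term first.
    """
    def f(x):
        result = 0
        for c in reversed(coeffs):
            result = (result * x + c) % n
        return result

    return max(Counter(f(a) for a in range(n)).values())
```

For $f = X^2 + X$, every value is reached at most twice modulo an odd prime, and the count doubles with each new prime factor: `max_multiplicity([0, 1, 1], 3 * 5 * 7)` is 8.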


Exercise 5.2.1∗. Why does CRT imply that there is a value reached $2^m$ times modulo $p_1 \cdot \ldots \cdot p_m$?

Exercise 5.2.2. Prove that $X + v$ works iff $v \geq -2022$, and $-X + v$ works iff $v \leq 2022$.

5.3 Hensel’s Lemma


We have found (some results about) when a polynomial $f$ has a root modulo $p$. Now suppose we want to know when $f$ has a root modulo $n = p_1^{n_1} \cdot \ldots \cdot p_m^{n_m}$. By the Chinese remainder theorem, this is equivalent to knowing when $f$ has a root modulo $p_i^{n_i}$ for each $i$. Indeed, $f(a) \equiv 0 \pmod{n}$ if and only if $f(a) \equiv 0 \pmod{p_i^{n_i}}$ for each $i$, i.e. $a$ is congruent modulo $p_i^{n_i}$ to a root $a_i$ of $f \pmod{p_i^{n_i}}$. A partial result is provided by Hensel’s lemma.

Theorem 5.3.1 (Hensel’s Lemma)

Let $f \in \mathbb{Z}[X]$ be a polynomial and $p$ a rational prime. If $p \mid f(a)$ for some $a \in \mathbb{Z}$ and $p \nmid f'(a)$, then, for any positive integer $k$, there is a unique $b \in \mathbb{Z}/p^k\mathbb{Z}$ such that $p^k \mid f(b)$ and $b \equiv a \pmod{p}$.

Remark 5.3.1
This tells us that, in many cases, if we have a root modulo $p$ we have one modulo powers of $p$ too, but what about roots modulo composite numbers? This follows from CRT: if $n_i$ is a root modulo $p_i^{k_i}$, then $n \equiv n_i \pmod{p_i^{k_i}}$ is a root modulo $\prod_i p_i^{k_i}$. Thus, CRT combined with Hensel’s lemma usually lets us reduce the study of roots of $f$ modulo integers to roots of $f$ modulo primes (which is a lot nicer, since $\mathbb{F}_p$ is a field).

Before proving this result, we need a lemma which is in fact even more important than Hensel.
Rather than only remembering the statement of Hensel’s lemma, the reader should also learn the
method of proving it which can be useful in a larger variety of situations.

Proposition 5.3.1 (Taylor’s Formula)*

Let $f \in \mathbb{Q}[X]$ be a polynomial of degree $n$. For any $h \in \mathbb{Q}$, we have

$$f(X + h) = f + hf' + h^2 \cdot \frac{f''}{2} + \ldots + h^n \cdot \frac{f^{(n)}}{n!}.$$

Remark 5.3.2
One can also write

$$f(X + h) = \sum_{k=0}^{\infty} h^k \cdot \frac{f^{(k)}}{k!}$$

as all terms after $k = n$ vanish (since $f$ has degree $n$).

Proof
Consider $h$ as a formal variable and expand $f(X + h)$ as $\sum_{i=0}^{n} a_i h^i$ where $a_i \in \mathbb{Q}[X]$. We wish to prove that $a_i = \frac{f^{(i)}}{i!}$ for all $i$. For this, consider some $0 \leq k \leq n$ and differentiate $f(X + h)$ as a polynomial in $h$ $k$ times. This gives

$$f^{(k)}(X + h) = \sum_{i=0}^{n-k} a_{i+k} h^i (i + 1) \cdots (i + k).$$

If we evaluate this at $h = 0$, we get $k! \, a_k = f^{(k)}$ as wanted.
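Taylor's formula is an identity between polynomials, so it can be checked coefficient by coefficient in exact arithmetic. The following sketch compares the binomial expansion of $f(X + h)$ with $\sum_k h^k f^{(k)}/k!$; the helper names `derivative`, `shift` and `taylor_rhs` are hypothetical, and coefficients are stored constant term first.

```python
from fractions import Fraction
from math import comb, factorial


def derivative(coeffs):
    """Coefficients of f' given those of f."""
    return [i * a for i, a in enumerate(coeffs)][1:]


def shift(coeffs, h):
    """Coefficients of f(X + h), expanding each (X + h)^i by the binomial theorem."""
    out = [Fraction(0)] * len(coeffs)
    for i, a in enumerate(coeffs):
        for j in range(i + 1):
            out[j] += a * comb(i, j) * Fraction(h) ** (i - j)
    return out


def taylor_rhs(coeffs, h):
    """Coefficients of the sum over k of h^k * f^(k) / k!."""
    out = [Fraction(0)] * len(coeffs)
    d, k = list(coeffs), 0
    while d:  # d holds the coefficients of the k-th derivative
        for j, a in enumerate(d):
            out[j] += Fraction(h) ** k * a / factorial(k)
        d, k = derivative(d), k + 1
    return out
```

With `f = [3, -1, 0, 2, 5]` and `h = Fraction(2, 3)`, the two lists returned by `shift(f, h)` and `taylor_rhs(f, h)` coincide. This is a sanity check on examples, not a substitute for the proof.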




Exercise 5.3.1. Give a direct proof of Taylor’s formula by expanding the RHS.

Corollary 5.3.1*

Let $p$ be a rational prime, $k$ a positive integer and $f \in \mathbb{Z}[X]$ a polynomial. For any rational integer $h$ divisible by $p^k$, we have

$$f(X + h) \equiv f + hf' \pmod{p^{k+1}}.$$

Proof
It suffices to prove that each $\frac{f^{(j)}}{j!}$ has integer coefficients by Proposition 5.3.1 (we evaluate both sides modulo $p^{k+1}$, where $h^j \equiv 0$ for $j \geq 2$). But we have already shown that, if $f = \sum_i a_i X^i$, we have

$$\frac{f^{(j)}}{j!} = \sum_i \binom{i}{j} a_i X^{i-j}.$$
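The congruence $f(x + h) \equiv f(x) + hf'(x) \pmod{p^{k+1}}$ for $p^k \mid h$ is easy to test on integer inputs. The sketch below does so; `linearisation_holds` and its helpers are hypothetical names, and `t` is the cofactor in $h = tp^k$.

```python
def evaluate(coeffs, x):
    """Horner evaluation of sum(coeffs[i] * x**i) over the integers."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def derivative(coeffs):
    """Coefficients of f' given those of f (constant term first)."""
    return [i * a for i, a in enumerate(coeffs)][1:]


def linearisation_holds(coeffs, p, k, x, t):
    """Check f(x + h) = f(x) + h f'(x) modulo p^(k+1), where h = t * p^k and k >= 1."""
    h = t * p ** k
    lhs = evaluate(coeffs, x + h)
    rhs = evaluate(coeffs, x) + h * evaluate(derivative(coeffs), x)
    return (lhs - rhs) % p ** (k + 1) == 0
```

The check succeeds for every integer `x` and `t` because the neglected terms are all multiples of $h^2$, which is divisible by $p^{2k}$ and hence by $p^{k+1}$.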

Here’s an application of this result, before proving Hensel’s lemma.



Problem 5.3.1 (USA TST 2010 Problem 1)

Let $f \in \mathbb{Z}[X]$ be a non-constant polynomial such that $\gcd(f(0), f(1), f(2), \ldots) = 1$ and $f(0) = 0$. Prove that there exist infinitely many rational integers $n \in \mathbb{Z}$ such that

$$\gcd(f(n) - f(0), f(n + 1) - f(1), \ldots) = n.$$

Solution

We take $n = p$ a rational prime. Suppose that a rational prime $q \neq p$ divides $f(p + k) - f(k)$ for all $k$. Then, $f(mp) \equiv f(0) \pmod{q}$ for any $m$ by induction. But, since $q \neq p$, $mp \pmod{q}$ goes through every element of $\mathbb{F}_q$ which means that $f$ is constant modulo $q$. Since $f(0) = 0$, this means $q \mid \gcd(f(0), f(1), f(2), \ldots)$ which is impossible.

Hence, we know that $\gcd(f(p) - f(0), f(p + 1) - f(1), \ldots)$ is a power of $p$. It is clearly divisible by $p$; thus it remains to prove that it’s not divisible by $p^2$. Since $f(p + k) - f(k) \equiv pf'(k) \pmod{p^2}$ by Corollary 5.3.1, this is equivalent to $p$ not dividing at least one number of the form $f'(k)$.

This is very easy to have: $f$ has degree at least 1, so $f'$ is non-zero. Now, just pick a $k$ such that $f'(k) \neq 0$ and any rational prime $p \nmid f'(k)$ (there are clearly infinitely many such primes).


Proof of Hensel’s Lemma

We proceed by induction on $k$. For $k = 1$, the result is clear. Now, suppose there is a unique $b_k \in \mathbb{Z}/p^k\mathbb{Z}$ such that $f(b_k) \equiv 0 \pmod{p^k}$ and $b_k \equiv a \pmod{p}$. This means that any $b_{k+1}$ satisfying $f(b_{k+1}) \equiv 0 \pmod{p^{k+1}}$ and $b_{k+1} \equiv a \pmod{p}$ must be congruent to $b_k$ modulo $p^k$. We show that a unique $b_{k+1} \equiv b_k \pmod{p^k}$ which is a root of $f$ modulo $p^{k+1}$ exists (modulo $p^{k+1}$).

Write $b_{k+1} = b_k + up^k$. By Corollary 5.3.1, we have

$$f(b_{k+1}) \equiv f(b_k) + up^k f'(b_k) \equiv f(b_k) + up^k f'(a) \pmod{p^{k+1}},$$

since $f'(b_k) \equiv f'(a) \pmod{p}$. Accordingly, $b_{k+1}$ is a root of $f$ modulo $p^{k+1}$ if and only if $u \equiv -(f'(a))^{-1}(f(b_k)/p^k) \pmod{p}$, where $f'(a)$ is invertible modulo $p$ by assumption. This exists and is unique modulo $p$, hence $b_{k+1} = b_k + up^k$ indeed exists and is unique modulo $p^{k+1}$.
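The induction in this proof is effectively an algorithm: at each step one solves a linear congruence for $u$ modulo $p$. Here is a minimal sketch of that lifting procedure; the function name `hensel_lift` is hypothetical, coefficients are stored constant term first, and the modular inverse uses `pow(x, -1, p)`, which assumes Python 3.8 or later.

```python
def evaluate_mod(coeffs, x, mod):
    """Horner evaluation of sum(coeffs[i] * x**i) modulo mod."""
    result = 0
    for a in reversed(coeffs):
        result = (result * x + a) % mod
    return result


def hensel_lift(coeffs, a, p, k):
    """Return the root b of f modulo p^k with b = a (mod p).

    Requires p | f(a) and p not dividing f'(a), as in Hensel's lemma.
    """
    dcoeffs = [i * c for i, c in enumerate(coeffs)][1:]
    if evaluate_mod(coeffs, a, p) != 0 or evaluate_mod(dcoeffs, a, p) == 0:
        raise ValueError("the hypotheses of Hensel's lemma are not satisfied")
    inverse = pow(evaluate_mod(dcoeffs, a, p), -1, p)  # f'(a)^(-1) modulo p
    b = a % p
    for j in range(1, k):
        # b is a root modulo p^j; solve f(b) + u p^j f'(a) = 0 modulo p^(j+1).
        quotient = evaluate_mod(coeffs, b, p ** (j + 1)) // p ** j
        u = (-quotient * inverse) % p
        b += u * p ** j
    return b
```

For example, `hensel_lift([-2, 0, 1], 3, 7, 5)` lifts the root 3 of $X^2 - 2$ modulo 7 to a square root of 2 modulo $7^5$.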


Here’s an application of Hensel itself.

Problem 5.3.2 (ISL 1995 N1)

Prove that, for any positive integer $n$, there exists a rational integer $k$ such that $k \cdot 2^n - 7$ is a perfect square.

Solution

This is equivalent to $-7$ being a perfect square modulo $2^n$. This makes us think of applying Hensel’s lemma to the polynomial $X^2 + 7$. Unfortunately, its derivative $2X$ is zero modulo 2, thus the hypotheses can never be satisfied.

Nevertheless, we already know a root $a$ of $X^2 + 7$ modulo $2^n$ must be odd. Thus, we can make the substitution $X = 2Y + 1$ to get

$$X^2 + 7 = (2Y + 1)^2 + 7 = 4(Y^2 + Y + 2).$$

We can now use Hensel’s lemma on $Y^2 + Y + 2$: its derivative is $2Y + 1 \equiv 1 \pmod{2}$ which is never zero, hence, for $n \geq 3$, we can lift the root 1 of $Y^2 + Y + 2$ modulo 2 to a (unique) root $a$ modulo $2^{n-2}$ (the cases $n \leq 2$ are immediate). Such an $a$ will satisfy

$$(2a + 1)^2 \equiv -7 \pmod{2^n}$$

by what we have shown.
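The substitution trick can be followed step by step in code: lift a root of $Y^2 + Y + 2$ one binary digit at a time, then return $2y + 1$. The sketch below is one possible implementation; the function name `sqrt_of_minus_seven` is hypothetical.

```python
def sqrt_of_minus_seven(n):
    """Return an integer x with x^2 = -7 modulo 2^n, for any n >= 1."""
    y = 1  # root of Y^2 + Y + 2 modulo 2
    for j in range(1, n - 2):
        # Invariant: y is a root of Y^2 + Y + 2 modulo 2^j.
        value = (y * y + y + 2) % 2 ** (j + 1)  # equals 0 or 2^j
        # The derivative 2y + 1 is odd, so adding u * 2^j changes the value by u * 2^j.
        u = (value >> j) % 2
        y += u << j
    return 2 * y + 1
```

For instance, `sqrt_of_minus_seven(5)` returns 11, and indeed $11^2 + 7 = 128$ is divisible by $2^5$.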


Exercise 5.3.2∗. Let $p$ be an odd prime and $a$ a quadratic residue modulo $p$. Prove that $a$ is a quadratic residue modulo $p^n$, i.e. a square modulo $p^n$ (coprime with $p^n$), for any positive integer $n$.

Exercise 5.3.3∗. Prove that an odd rational integer $a \in \mathbb{Z}$ is a quadratic residue modulo $2^n$ for $n \geq 3$ if and only if $a \equiv 1 \pmod{8}$.

5.4 Bézout’s Lemma


We shall now see how irreducible polynomials really shine in polynomial number theory, with a few
worked examples. Recall Bézout’s lemma for Q[X] (Q[X] is Euclidean and hence a Bézout domain
too).

Proposition 5.4.1 (Bézout’s lemma for Q [X])

For any coprime polynomials f, g ∈ Q[X], there exist polynomials u, v ∈ Q[X] such that

f u + gv = 1.

Of course, this also holds for multiple polynomials: if f1 , . . . , fn are coprime then some linear
combination (with coefficients in Q[X]) of them is 1 (just induct on n using Proposition 5.4.1).

Corollary 5.4.1*

For any coprime polynomials f, g ∈ Z[X], there exist polynomials u, v ∈ Z[X] and a non-zero
constant N ∈ Z such that
f u + gv = N.

Exercise 5.4.1∗ . Prove Corollary 5.4.1.

Corollary 5.4.2

Suppose f, g ∈ Z[X] are such that, for any sufficiently large rational prime p, p | f (n) implies
p | g(n) for any n ∈ Z. Then, rad f | rad g, i.e. every irreducible factor of f in Q[X] divides g.

Proof

Suppose that π is a non-constant primitive irreducible factor of f which doesn’t divide g.

Since π is irreducible, it is then coprime with g, so by Bézout’s lemma there exist polynomials
u, v ∈ Z[X] and a non-zero integer N ∈ Z such that

πu + gv = N.

In particular, any common prime factor of $\pi(n)$ and $g(n)$ must also divide $N$. Thus, if $p > |N|$ is a sufficiently large prime factor of $\pi(n) \mid f(n)$ (there exists one by Theorem 5.2.1), then $p \nmid N$ so $p \nmid g(n)$ which is a contradiction.
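A concrete instance may help: for the coprime pair $\pi = X^2 + 1$ and $g = X + 1$ (chosen here purely as an illustration), the relation $1 \cdot (X^2 + 1) + (1 - X)(X + 1) = 2$ holds in $\mathbb{Z}[X]$, so $N = 2$ and any common factor of $\pi(n)$ and $g(n)$ divides 2. The sketch below checks this numerically; the function name is hypothetical.

```python
from math import gcd


def common_factor_divides_constant(n):
    """Check that gcd(n^2 + 1, n + 1) divides the Bezout constant N = 2."""
    pi_n, g_n = n * n + 1, n + 1
    # Bezout relation over Z[X]: 1 * (X^2 + 1) + (1 - X) * (X + 1) = 2.
    assert 1 * pi_n + (1 - n) * g_n == 2
    return 2 % gcd(pi_n, g_n) == 0
```

The function returns `True` for every integer `n`, which matches the statement that only the prime factors of $N$ can divide both values.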


In other words, the prime divisors of a polynomial are controlled by the prime divisors of its
irreducible factors, thanks to Bézout’s lemma. Here is a more elaborate example, not involving
irreducible polynomials in the statement.

Problem 5.4.1

Suppose f ∈ Q[X] is a polynomial which takes only perfect square values (in Q). Prove that it
is the square of a polynomial with rational coefficients.

Solution

By multiplying f by an appropriate integral perfect square, we may assume f has integer coeffi-
cients. Without loss of generality, we may assume f is squarefree. We shall show that f must be
constant, since this clearly implies that it is the square of a polynomial with integer coefficients.

Consider its factorisation in non-constant primitive irreducible polynomials $f = a\pi_1 \cdot \ldots \cdot \pi_m$.

Suppose for the sake of contradiction that $m \geq 1$. First, we wish to distinguish the prime divisors of $\pi_1$ from the prime divisors of the other $\pi_i$, so that, when $p \mid \pi_1(n)$, $v_p(\pi_1(n)) = v_p(f(n))$ (which must be even by assumption). By Bézout’s lemma, since $\pi_1$ and $\pi_2\pi_3 \cdot \ldots \cdot \pi_m$ are coprime, there exist polynomials $u, v \in \mathbb{Z}[X]$ and a non-zero integer $N$ such that

$$u\pi_1 + v\pi_2 \cdot \ldots \cdot \pi_m = N.$$

Now, consider a rational prime $p > |a|, |N|$ and a rational integer $n \in \mathbb{Z}$ such that $p \mid \pi_1(n)$; there exists one by Theorem 5.2.1. By assumption, $p \nmid aN$, thus $p \nmid a\pi_2(n) \cdot \ldots \cdot \pi_m(n)$ which implies $v_p(f(n)) = v_p(\pi_1(n))$.

In particular, since $v_p(\pi_1(n))$ and $v_p(\pi_1(n + p))$ are even and positive, we must have

$$p^2 \mid \pi_1(n), \pi_1(n + p).$$

But, by Corollary 5.3.1,

$$\pi_1(n + p) \equiv \pi_1(n) + p\pi_1'(n) \equiv p\pi_1'(n) \pmod{p^2}$$

which means $p$ must divide $\pi_1'(n)$.

To conclude, $\pi_1$ and $\pi_1'$ are coprime (since $\pi_1$ is irreducible and $\deg \pi_1 > \deg \pi_1'$) so, by Bézout’s lemma, there are some $r, s \in \mathbb{Z}[X]$ and a non-zero $M \in \mathbb{Z}$ such that $r\pi_1 + s\pi_1' = M$. Then, for $p > |a|, |M|, |N|$, the previous remark is impossible and we must have $v_p(\pi_1(n)) = 1$ or $v_p(\pi_1(n + p)) = 1$ which is a contradiction.


Remark 5.4.1
We could have directly used Bézout on $\pi_1$ and $\pi_1'\pi_2 \cdot \ldots \cdot \pi_m$ but we presented it that way to highlight the motivation. In fact, what we have proven with this is that, if $\pi$ and $\pi_1, \ldots, \pi_k$ are distinct primitive irreducible polynomials, there exist infinitely many primes $p$ for which there is an $n$ such that $v_p(\pi(n)) = 1$ and $v_p(\pi_i(n)) = 0$ for $i = 1, \ldots, k$.

Remark 5.4.2
In fact, the problem of determining which polynomials reach infinitely many square values has been completely settled with deep arithmetic geometry results. We can also approach this analytically (and elementarily) to get results stronger than what we proved, but worse than the complete characterisation. The idea is that if the leading coefficient of $f \in \mathbb{Z}[X]$ is a square, we can find a polynomial $g \in \mathbb{Z}[X]$ such that

$$g(x)^2 \leq f(x) < (g(x) + 1)^2$$

for any sufficiently large $|x|$, which forces $f = g^2$ if $f$ takes infinitely many square values. When the leading coefficient is not a square, we can still transform it into a square in some cases. For instance, if $f(2^n)$ is a square for all $n$, then the leading coefficient of $f(X)f(2^mX)$ is a square for even $m$, and this polynomial takes infinitely many square values, so must be a square. It is not hard to see (e.g. by looking at the roots: for sufficiently large $m$, the only possible common root of $f$ and $f(2^mX)$ is 0) that this implies that $f$ is a square or $X$ times a square, and the latter doesn’t work.

We conclude this chapter with two additional remarks. When dealing with problems about polynomials modulo some prime $p$, it is very important to keep in mind a polynomial of degree $n$ has at most $n$ roots modulo $p$ (since $\mathbb{F}_p$ is a field). Also, when dealing with exponential functions and polynomials at the same time, say $f(n)$ and $a^n$, modulo $p$ one can choose the value of $a^n$ and $f(n)$ independently as the first one has period $p - 1$ while the latter has period $p$. We illustrate this by an example.

Problem 5.4.2 (Polish Mathematical Olympiad 2003 Problem 3)

Find all polynomials $f \in \mathbb{Z}[X]$ such that $f(n) \mid 2^n - 1$ for any positive rational integer $n$.

Solution

Suppose some prime $p$ divides $f(n)$ for some rational integer $n$. Choose a positive rational integer $n'$ satisfying $n' \equiv n \pmod{p}$ and $n' \equiv 1 \pmod{p - 1}$ by CRT. Then,

$$p \mid f(n') \mid 2^{n'} - 1 \equiv 1 \pmod{p}$$

which is impossible. Thus, we conclude $f(n) = \pm 1$ for all $n$ which means $f = \pm 1$. These are indeed solutions.
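The independence of the two periods can be made explicit with a small search. The sketch below finds a positive $n'$ with $n' \equiv n \pmod{p}$ and $n' \equiv 1 \pmod{p - 1}$, the choice that makes $2^{n'} - 1 \equiv 1 \pmod{p}$ by Fermat's little theorem. The function name `independent_choice` is hypothetical and `p` is assumed to be prime.

```python
def independent_choice(n, p):
    """Smallest positive n2 with n2 = n (mod p) and n2 = 1 (mod p - 1).

    A solution exists in [1, p * (p - 1)] by CRT since gcd(p, p - 1) = 1.
    """
    for n2 in range(1, p * (p - 1) + 1):
        if (n2 - n) % p == 0 and (n2 - 1) % (p - 1) == 0:
            return n2
    raise AssertionError("unreachable when p is prime")
```

For `n = 4` and `p = 7` this returns 25: the residue of $f(n')$ modulo 7 is that of $f(4)$, while $2^{25} - 1 \equiv 1 \pmod{7}$, so 7 cannot divide both.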


5.5 Exercises
Algebraic Results
Exercise 5.5.1†. Suppose $f, g \in \mathbb{Z}[X]$ are polynomials such that $f(n) \mid g(n)$ for infinitely many rational integers $n \in \mathbb{Z}$. Prove that $f \mid g$. In addition, generalise the previous statement to $f, g \in \mathbb{Z}[X_1, \ldots, X_m]$ such that $f(x) \mid g(x)$ for $x \in S_1 \times \ldots \times S_m$, where $S_1, \ldots, S_m \subseteq \mathbb{Z}$ are infinite sets.

Exercise 5.5.2† . Let f ∈ Q[X] be a polynomial. Suppose that f always takes values which are mth
powers in Q. Prove that f is the mth power of a polynomial with rational coefficients. More generally,
find all polynomials f ∈ Q[X1 , . . . , Xm ] such that f (x1 , . . . , xm ) is a (non-trivial) perfect power for
any (x1 , . . . , xm ) ∈ Zm .
Exercise 5.5.3† . Suppose f, g ∈ Z[X] are polynomials such that f (a) − f (b) | g(a) − g(b) for any
rational integers a, b ∈ Z. Prove that there exists a polynomial h ∈ Z[X] such that g = h ◦ f .
Exercise 5.5.4 (RMM SL 2016). Let $p$ be a prime number. Prove that there are only finitely many primes $q$ such that

$$q \mid \sum_{k=1}^{\lfloor q/p \rfloor} k^{p-1}.$$

Exercise 5.5.5. Let $x$ and $y$ be positive rational integers. Suppose $f, g \in \mathbb{Z}[X]$ are polynomials such that $f(ab) \mid g(a^x + b^y)$ for any $a, b \in \mathbb{Z}$. Prove that $f$ is constant.
Exercise 5.5.6 (ISL 2019). Suppose $a$ and $b$ are positive rational integers such that

$$an + 1 \mid \binom{an}{b} + 1$$

for any rational integer $n \geq b$. Prove that $b + 1$ is prime.


Exercise 5.5.7. Suppose $f \in \mathbb{Z}[X_1, \ldots, X_n]$ is a polynomial such that, for each tuple of rational primes $(p_1, \ldots, p_n)$, there is some $i$ for which $p_i \mid f(p_1, \ldots, p_n)$. Prove that $X_i \mid f$ for some $i$. (You may assume Dirichlet’s theorem ??.)
Exercise 5.5.8 (Inspired by Iran TST 2019). Suppose f1 , . . . , fk ∈ Q[X] are polynomials such that,
whenever n is a perfect power, one of f1 , . . . , fk is too. Prove that one of f1 , . . . , fk is a non-trivial
power of a polynomial or X.

Polynomials over Fp
Exercise 5.5.9† (Generalised Eisenstein’s Criterion). Let $f = a_nX^n + \ldots + a_0 \in \mathbb{Z}[X]$ be a polynomial and let $p$ be a rational prime. Suppose that $p \nmid a_n$, $p \mid a_0, \ldots, a_{n-1}$, and $p^2 \nmid a_k$ for some $k < n$. Then any factorisation $f = gh$ in $\mathbb{Q}[X]$ satisfies $\min(\deg g, \deg h) \leq k$.
Exercise 5.5.10† (China TST 2008). Let $f \in \mathbb{Z}[X]$ be a (non-zero) polynomial with coefficients in $\{-1, 1\}$. Suppose that $(X - 1)^{2^k}$ divides $f$. Prove that $\deg f \geq 2^{k+1} - 1$.
Exercise 5.5.11† (Romania TST 2002). Let $f, g \in \mathbb{Z}[X]$ be polynomials with coefficients in $\{1, 2002\}$. Suppose that $f \mid g$. Prove that $\deg f + 1 \mid \deg g + 1$.
Exercise 5.5.12† (USAMO 2006). Find all polynomials $f \in \mathbb{Z}[X]$ such that the sequence $(P(f(n^2)) - 2n)_{n \geq 0}$ is bounded above, where $P$ is the greatest prime factor function. (In particular, since $P(0) = +\infty$, we have $f(n^2) \neq 0$ for any $n \in \mathbb{Z}$.)
Exercise 5.5.13 (Iran TST 2011). Suppose a polynomial $f \in \mathbb{Z}[X]$ is such that $p^k \mid f(n)$ for all $n \in \mathbb{Z}$, for some $k \leq p$. Prove that there exist polynomials $g_0, \ldots, g_k \in \mathbb{Z}[X]$ such that

$$f = \sum_{i=0}^{k} (X^p - X)^i p^{k-i} g_i.$$

In addition, prove that this becomes false when $k > p$.


Exercise 5.5.14† (China TST 2021). Suppose the polynomials f, g ∈ Z[X] are such that, for any
sufficiently large rational prime p, there is an element rp ∈ Fp such that f ≡ g(X + rp ) (mod p). Prove
that there exists a rational number r ∈ Q such that f = g(X + r).
Exercise 5.5.15 (IMO 1993). Let $n \geq 2$ be an integer. Prove that the polynomial $X^n + 5X^{n-1} + 3$ is irreducible.

Iterates
Exercise 5.5.16†. Let $f \in \mathbb{Z}[X]$ be a polynomial. Show that the sequence $(f^n(0))_{n \geq 0}$ is a Mersenne sequence, i.e.

$$\gcd(f^i(0), f^j(0)) = f^{\gcd(i,j)}(0)$$

for any $i, j \geq 0$.
Exercise 5.5.17†. Suppose the non-constant polynomial

$$f = a_dX^d + \ldots + a_2X^2 + a_0 \in \mathbb{Z}[X]$$

has positive coefficients and satisfies $f'(0) = 0$. Prove that the sequence $(f^n(0))_{n \geq 1}$ always has a primitive prime factor.
Exercise 5.5.18† (Tuymaada 2003). Let $f \in \mathbb{Z}[X]$ be a polynomial and $a \in \mathbb{Z}$ a rational integer. Suppose $|f^n(a)| \to \infty$. Prove that there are infinitely many primes $p$ such that $p \mid f^n(a)$ for some $n \geq 0$ unless $f = AX^d$ for some $A, d$.
Exercise 5.5.19† (USA TST 2020). Find all rational integers $n \geq 2$ for which there exist a rational integer $m > 1$ and a polynomial $f \in \mathbb{Z}[X]$ such that $\gcd(m, n) = 1$ and $n \mid f^k(0) \iff m \mid k$ for any positive rational integer $k$.
Exercise 5.5.20†. Let $f \in \mathbb{Q}[X]$ be a polynomial of degree $k$. Prove that there is a constant $h > 0$ such that the denominator of $f(x)$ is greater than $h$ times the denominator of $x^k$.
Exercise 5.5.21†. Let $f \in \mathbb{Q}[X]$ be a polynomial of degree at least 2. Prove that

$$\bigcap_{k=0}^{\infty} f^k(\mathbb{Q})$$

is finite.
Exercise 5.5.22† (Iran TST 2004). Let $f \in \mathbb{Z}[X]$ be a polynomial such that $f(n) > n$ for any positive rational integer $n$. Suppose that, for any $N \in \mathbb{Z}$, there is some positive rational integer $n$ such that

$$N \mid f^n(1).$$

Prove that $f = X + 1$.

Divisibility Relations
Exercise 5.5.23†. Find all polynomials $f \in \mathbb{Z}[X]$ such that $f(n) \mid n^{n-1} - 1$ for sufficiently large $n$.
Exercise 5.5.24. Find all polynomials $f \in \mathbb{Z}[X]$ such that $\gcd(f(a), f(b)) = 1$ whenever $\gcd(a, b) = 1$.
Exercise 5.5.25† (ISL 2012 Generalised). Find all polynomials $f \in \mathbb{Z}[X]$ such that $\operatorname{rad} f(n) \mid \operatorname{rad} f(n^{\operatorname{rad} n})$ for all $n \in \mathbb{Z}$. (You may assume Dirichlet’s theorem ??.)
Exercise 5.5.26 (ISL 2011). Suppose $f, g \in \mathbb{Z}[X]$ are coprime polynomials such that $f(n)$ and $g(n)$ are positive for any positive rational integer $n$. Suppose that

$$2^{f(n)} - 1 \mid 3^{g(n)} - 1$$

for any positive rational integer $n$. Prove that $f$ is constant.
Exercise 5.5.27†. Find all polynomials $f \in \mathbb{Z}[X]$ such that $f(p) \mid 2^p - 2$ for any prime $p$. (You may assume Dirichlet’s theorem ??.)
Exercise 5.5.28 (Iran Mathematical Olympiad 3rd Round 2016). We say a function $g : \mathbb{N} \to \mathbb{N}$ is special if it has the form $g(n) = a^{f(n)}$ where $f \in \mathbb{Z}[X]$ is a polynomial such that $f(n)$ is positive when $n$ is a positive rational integer and $a$ is a rational integer. We also say the sum, difference, and product of two special functions is special. Prove that there does not exist a non-zero special function $g$ and a non-constant polynomial $f \in \mathbb{Z}[X]$ such that

$$f(n) \mid g(n)$$

for any positive rational integer $n$.

Miscellaneous
Exercise 5.5.29† (Generalised Hensel’s Lemma). Let $f \in \mathbb{Z}[X]$ be a polynomial and $a \in \mathbb{Z}$ an integer. Let $m = v_p(f'(a))$. If $p^{2m+1} \mid f(a)$, prove that $f$ has exactly one root $b$ modulo $p^k$ which is congruent to $a$ modulo $p^{m+1}$ for all $k \geq 2m + 1$.
Exercise 5.5.30† . Let f ∈ Z[X] be a non-constant polynomial. Is it possible that f (n) is prime for
any n ∈ Z?
Exercise 5.5.31† . Find all polynomials f ∈ Q[X] which are surjective onto Q.

Exercise 5.5.32 (Inspired by USA TST 2008). Let n be a positive rational integer. How many
sequences of n elements of Z/nZ have the form

(f (0), . . . , f (n − 1))

for some f ∈ Z/nZ[X]?


Exercise 5.5.33 (ELMO SL 2018). We say a subset S of Z/nZ is d-coverable if there exists a poly-
nomial f ∈ Z/nZ[X] of degree at most d such that

S = {f (0), . . . , f (n − 1)}.

Find all rational integers n such that all subsets of Z/nZ are d-coverable for some d, and find the
minimum possible d for these n.
Exercise 5.5.34 (Iran TST 2015). Let $(a_n)_{n \geq 0}$ denote the sequence of rational integers which are sums of two squares: $0, 1, 2, 4, 5, 8, \ldots$. Let $m \in \mathbb{Z}$ be a positive rational integer. Prove that there are infinitely many integers $n$ such that $a_{n+1} - a_n = m$.

Exercise 5.5.35† (ISL 2005). Let f ∈ Z[X] be a non-constant polynomial with positive leading
coefficient. Prove that there are infinitely many positive rational integers n such that f (n!) is composite.
Chapter 6

The Primitive Element Theorem and Galois Theory

Prerequisites for this chapter: Chapters 1, 3 and 4 and Sections A.2 and C.1 for the whole chapter
and Chapter 5 for Section 6.4. Chapter 2 is recommended.

6.1 General Definitions


Let’s start with some general definitions with which you should be somewhat familiar by now
(from Chapters 2 and 4 and the exercises).

Definition 6.1.1

The smallest ring containing a commutative ring R and elements α1 , . . . , αn is denoted

R[α1 , . . . , αn ].

It consists of polynomial expressions in α1 , . . . , αn :

R[α1 , . . . , αn ] = {f (α1 , . . . , αn ) | f ∈ R[X1 , . . . , Xn ]}.

Definition 6.1.2

The smallest field containing a commutative field K and elements α1 , . . . , αn is denoted

K(α1 , . . . , αn ).

It consists of rational expressions in α1 , . . . , αn :

K(α1 , . . . , αn ) = {f (α1 , . . . , αn ) | f ∈ K(X1 , . . . , Xn )}.

Exercise 6.1.1∗ . Prove that R[α1 , . . . , αn ] is indeed the smallest ring containing R and α1 , . . . , αn , in the
sense that any other such ring must contain R[α1 , . . . , αn ]. Similarly, prove that any field containing K and
α1 , . . . , αn contains K(α1 , . . . , αn ).

Exercise 6.1.2∗. Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number. Prove that $\mathbb{Q}(\alpha) = \mathbb{Q}[\alpha]$.

Remark 6.1.1
Of course, we assumed the multiplication and addition of R and K were compatible with the αi ,
and in the second definition, that K[α1 , . . . , αn ] is an integral domain, otherwise the definitions


do not make sense. Indeed, if $\alpha_1\alpha_2 = 0$ but $\alpha_1, \alpha_2 \neq 0$ then no field can contain $\alpha_1$ and $\alpha_2$.

We now generalise the quadratic fields from Chapter 2 to arbitrary fields of the form $\mathbb{Q}(\alpha_1, \ldots, \alpha_n)$ for some algebraic numbers $\alpha_1, \ldots, \alpha_n \in \overline{\mathbb{Q}}$. These are called number fields. However, to get more number-theoretic information on number fields, we must do algebraic number theory over number fields too instead of only over $\mathbb{Q}$.

Here is what this means: we defined algebraic numbers as roots of polynomials with rational coefficients. We can then define algebraic numbers over a number field $K$ as the roots of polynomials with coefficients in $K$. These turn out to be the same as the regular algebraic numbers by the fundamental theorem of symmetric polynomials, but what’s different is their minimal polynomial. Indeed, the minimal polynomial of $\frac{(i+1)\sqrt{2}}{2}$ over $\mathbb{Q}$ is $X^4 + 1$ but over $\mathbb{Q}(i)$ it is just $X^2 - i$.

Exercise 6.1.3∗. Prove that the minimal polynomial of $\frac{(i+1)\sqrt{2}}{2}$ over $\mathbb{Q}(i)$ is $X^2 - i$.

We thus make the following definitions. One of the reasons we do it in so much generality is to do
number theory with other base fields than Q (but which are still number fields), but another one is
also to revisit the theory of finite fields a bit, since these are also about algebraic elements.

Definition 6.1.3 (Field Extensions)

We say two fields K ⊆ L form a field extension denoted L/K (big field over small field).

We will usually only say "extension" for "field extension". We also make analogous definitions for elements algebraic over $K$, their minimal polynomial, their degree, their conjugates, etc. The field of elements algebraic over $L$ is denoted $\overline{L}$ and called the algebraic closure of $L$ (this won’t be needed in this book as $\overline{L} = \overline{\mathbb{Q}}$ for any number field $L$).

Definition 6.1.4 (Degree of a Field Extension)

The degree [L : K] of an extension L/K is the dimension of L as a K-vector space.

Exercise 6.1.4∗ . Check that L is a K-vector space.

What the degree does is that it measures the "size" of the extension. This definition might seem somewhat complicated at first, but it is in fact very simple (in the cases we’re interested in): when $L = K(\alpha)$ for some $\alpha$ algebraic over $K$ of degree $n$, the degree of $L/K$ is also just $n$. Indeed, by definition the elements $1, \alpha, \ldots, \alpha^{n-1}$ are $K$-linearly independent (otherwise the minimal polynomial of $\alpha$ has degree less than $n$) while $1, \alpha, \ldots, \alpha^n$ aren’t (since some polynomial of degree $n$ vanishes at $\alpha$). For our purposes, all extensions of number fields have the form $L = K(\alpha)$: this is the primitive element theorem 6.2.1. You might thus wonder why we state things with linear algebra terminology: it’s simply because linear algebra and bases are convenient to work with, as the following example as well as the tower law 6.1.1 show.

Proof that algebraic numbers are closed under addition and multiplication

Let $\alpha, \beta \in \overline{\mathbb{Q}}$ be algebraic numbers of respective degrees $m$ and $n$. Then, $\mathbb{Q}(\alpha, \beta)$ is a $\mathbb{Q}$-vector space with dimension at most $mn$, since it’s generated by $\alpha^i\beta^j$, $i = 0, \ldots, m - 1$, $j = 0, \ldots, n - 1$.

As a consequence, $1, \alpha + \beta, (\alpha + \beta)^2, \ldots, (\alpha + \beta)^{mn}$ are linearly dependent which means that there is a non-zero polynomial with rational coefficients of degree at most $mn$ vanishing at $\alpha + \beta$, in particular it’s algebraic.

Similarly, $1, \alpha\beta, (\alpha\beta)^2, \ldots, (\alpha\beta)^{mn}$ are linearly dependent so $\alpha\beta$ is algebraic.
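To see the linear dependency in a concrete case, take $\alpha = \sqrt{2}$ and $\beta = \sqrt{3}$ (an example chosen for this illustration, with $m = n = 2$). Elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ can be stored as coordinate vectors in the basis $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$, and the five powers $1, \gamma, \ldots, \gamma^4$ of $\gamma = \sqrt{2} + \sqrt{3}$ must be dependent in this 4-dimensional space. The sketch below verifies the dependency $\gamma^4 - 10\gamma^2 + 1 = 0$ exactly; the relation is supplied by hand rather than computed, and the function names are hypothetical.

```python
def multiply(x, y):
    """Product in Q(sqrt2, sqrt3); (a, b, c, d) stands for a + b*sqrt2 + c*sqrt3 + d*sqrt6."""
    a, b, c, d = x
    e, f, g, h = y
    return (
        a * e + 2 * b * f + 3 * c * g + 6 * d * h,  # rational part
        a * f + b * e + 3 * (c * h + d * g),        # sqrt2 coordinate
        a * g + c * e + 2 * (b * h + d * f),        # sqrt3 coordinate
        a * h + d * e + b * g + c * f,              # sqrt6 coordinate
    )


def powers(x, count):
    """The list [1, x, x^2, ..., x^(count - 1)] as coordinate vectors."""
    result = [(1, 0, 0, 0)]
    for _ in range(count - 1):
        result.append(multiply(result[-1], x))
    return result


def combination(coefficients, vectors):
    """Linear combination of coordinate vectors with the given coefficients."""
    return tuple(
        sum(c * v[i] for c, v in zip(coefficients, vectors)) for i in range(4)
    )
```

With `gamma = (0, 1, 1, 0)`, the combination of `powers(gamma, 5)` with coefficients `[1, 0, -10, 0, 1]` is the zero vector, so $X^4 - 10X^2 + 1$ vanishes at $\sqrt{2} + \sqrt{3}$.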




Note that this proof does not work to show that $\overline{\mathbb{Z}}$ is closed under addition and multiplication, as $\overline{\mathbb{Z}}$ is not a $\mathbb{Q}$-vector space anymore so bases don’t work nicely. However, a proof using linear algebra still exists, see Section C.3.

Proposition 6.1.1 (Tower Law)

Suppose M/L/K is a tower of extensions (meaning K ⊆ L ⊆ M ). Then,

[M : K] = [M : L][L : K].

In other words, the degree is multiplicative in towers of extensions.

Proof

Let $m = [M : L]$ and $n = [L : K]$. Let $u_1, \ldots, u_m$ be an $L$-basis of $M$, and $v_1, \ldots, v_n$ be a $K$-basis of $L$. Then, $(u_iv_j)_{i \in [m], j \in [n]}$ is a $K$-basis of $M$. Since this basis has cardinality $mn$, $[M : K] = mn$.


Exercise 6.1.5∗ . Prove that (ui vj )i∈[m],j∈[n] is a K-basis of M .

Exercise 6.1.6∗ . Let M/L/K be a tower of extensions and α ∈ M . Prove that the minimal polynomial
of α over L divides the minimal polynomial of α over K. In other words, its L-conjugates are among its
K-conjugates.

Before we make more definitions, here is an application of why we care about extensions of number
fields (where $K \neq \mathbb{Q}$), and why the tower law is useful.

Problem 6.1.1 (IMC 2012 Problem 5)

Let $a \in \mathbb{Q}$ be a rational number, and $n \geq 1$ be an integer. Prove that the polynomial

$$(X^2 + aX)^{2^n} + 1$$

is irreducible in $\mathbb{Q}[X]$.

Before solving this problem, we need a lemma which follows from the tower law.

Lemma 6.1.1

Let $f, g \in \mathbb{Q}[X]$ be polynomials. Then, $f \circ g$ is irreducible in $\mathbb{Q}[X]$ if and only if $f$ is irreducible in $\mathbb{Q}[X]$ and $g - \alpha$ is irreducible in $\mathbb{Q}(\alpha)[X]$, where $\alpha$ is a root of $f$.

Proof

Let m = deg f and n = deg g. Consider a root α of f and a root β of g − α. Then, β is a root of
f ◦ g and we have
[Q(β) : Q] = [Q(β) : Q(α)][Q(α) : Q].
$f \circ g$ is irreducible if and only if $[\mathbb{Q}(\beta) : \mathbb{Q}] = \deg f \circ g = mn$. Also, $[\mathbb{Q}(\alpha) : \mathbb{Q}] \leq m$ since $\alpha$ is a root of $f$, with equality iff $f$ is irreducible in $\mathbb{Q}[X]$. Similarly, $[\mathbb{Q}(\beta) : \mathbb{Q}(\alpha)] \leq n$ since $\beta$ is a root of $g - \alpha$, with equality iff $g - \alpha$ is irreducible in $\mathbb{Q}(\alpha)[X]$. To conclude,

$$[\mathbb{Q}(\beta) : \mathbb{Q}] = [\mathbb{Q}(\beta) : \mathbb{Q}(\alpha)][\mathbb{Q}(\alpha) : \mathbb{Q}] \leq mn$$

with equality if and only if $f$ is irreducible in $\mathbb{Q}[X]$ and $g - \alpha$ is irreducible in $\mathbb{Q}(\alpha)[X]$, as wanted.


Solution
Using the lemma, we wish to show that $f = X^{2^n} + 1 = \Phi_{2^{n+1}}$ is irreducible in Q[X], which is
true by Theorem 3.2.1 (or Eisenstein’s criterion), and that $g - \omega = X^2 + aX - \omega$ is irreducible
in Q(ω)[X], where ω is a primitive $2^{n+1}$th root of unity. Here is why this is easier to manipulate: a
polynomial of degree two is reducible if and only if it has a root. Thus, suppose for the sake of
contradiction that $X^2 + aX - \omega$ has a root in Q(ω), i.e. $h(\omega)^2 + ah(\omega) = \omega$ for some h ∈ Q[X].

We complete the square and take the norm: $(h(\omega) + b)^2 = \omega + b^2$, where b = a/2, and

$$\prod_{k \text{ odd}} (h(\omega^k) + b)^2 = \prod_{k \text{ odd}} (\omega^k + b^2).$$

The LHS is a perfect square by the fundamental theorem of symmetric polynomials, while the
RHS is

$$\prod_{k \text{ odd}} (b^2 + \omega^k) = \prod_{k \text{ odd}} (b^2 - \omega^k) = \Phi_{2^{n+1}}(b^2) = b^{2^{n+1}} + 1$$

since $\varphi(2^{n+1}) = 2^n$ is even (the $2^n$ factors all change sign, and $\Phi_{2^{n+1}}(-b^2) = \Phi_{2^{n+1}}(b^2)$).
Now, we use the fact that the diophantine equation $x^4 + y^4 = z^2$ has
no solution in non-zero rational integers. This is a classical result which was proven by Fermat.
See Exercise 2.6.12† for a proof.

Hence, the diophantine equation $b^{2^{n+1}} + 1 = c^2$ has only the rational solution b = 0, which means
a = 0. But then $(X^2 + 0X)^{2^n} + 1 = \Phi_{2^{n+2}}$ is clearly irreducible, so we reach a contradiction in
all cases.
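
The norm computation in this solution can be sanity-checked numerically. The following is only a sketch under our own naming (the function `norm_of_rhs` is not from the text): it multiplies $\omega^k + b^2$ over odd k with floating-point complex numbers and compares the result with $b^{2^{n+1}} + 1$.

```python
import cmath

def norm_of_rhs(b: float, n: int) -> complex:
    """Product of (omega^k + b^2) over odd k, with omega = exp(2*pi*i / 2^(n+1))."""
    order = 2 ** (n + 1)
    product = 1 + 0j
    for k in range(1, order, 2):  # odd k index the primitive 2^(n+1)-th roots of unity
        product *= cmath.exp(2j * cmath.pi * k / order) + b * b
    return product

for n in (1, 2, 3):
    for b in (0.5, 1.0, 1.5):
        expected = b ** (2 ** (n + 1)) + 1
        # Floating-point check only; this illustrates the identity, it does not prove it.
        assert abs(norm_of_rhs(b, n) - expected) < 1e-9 * max(1.0, expected)
```

Such a check is of course no substitute for the argument with $\Phi_{2^{n+1}}$, but it is a quick way to catch a sign error.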


Finally, we make three more definitions, the last two having been encountered a few times already
in special cases.

Definition 6.1.5 (Finite Extension)

We say an extension L/K is finite if its degree [L : K] is finite.

Definition 6.1.6 (Number Field)

A finite extension of Q is called a number field .

Exercise 6.1.7∗ . Prove that finite extensions of K are exactly the fields of the form K(α1 , . . . , αn ) for
α1 , . . . , αn algebraic elements over K, using Proposition 6.1.1.

Definition 6.1.7 (Ring of Integers)

Let K be a number field. Its ring of integers, $\mathcal{O}_K$, is the ring of algebraic integers of K: $K \cap \overline{\mathbb{Z}}$.

6.2 The Primitive Element Theorem and Field Theory


Our main result is the following: every number field is generated by one element. This is extremely
nice, as you will see with all the applications, as all one has to do is to take into account the minimal
polynomial of the generator to deduce the structure of the field. For instance, as mentioned before,
one can easily compute the degree of K = Q(α) (if one knows α). Compare this to, say, Q(α, β): you
not only have to take into account the contribution of α (its degree over Q), but also what β adds to
the contribution of α (its degree over Q(α))!

Theorem 6.2.1 (Primitive Element Theorem)

Let K ⊆ C be a field, and $\alpha, \beta \in \overline{K}$ be algebraic elements over K. Then, there exists a γ ∈ K(α, β)
such that
K(α, β) = K(γ).

Since, by Exercise 6.1.7∗ , every number field is finitely generated, repeated applications of the
primitive element theorem lead to number fields being generated by one element (by induction).

Proof

We take γ = α + tβ, for some t ∈ K that will be chosen later. We will find two polynomials in
K(γ)[X] whose gcd is X − α: since the gcd can be obtained by the Euclidean algorithm, this
means that α ∈ K(γ) and thus also β = (γ − α)/t ∈ K(γ).

For these polynomials, we choose

$$f = \pi_\alpha = \prod_i (X - \alpha_i)$$

and

$$g = \prod_j (X - (\gamma - t\beta_j))$$

where $\alpha_i$ and $\beta_j$ are the conjugates of α and β respectively (over K). By the fundamental
theorem of symmetric polynomials, they both have coefficients in K(γ). It remains to see that
their gcd is X − α: since they have distinct roots this is equivalent to α being the only common
root. Suppose $\alpha_i = \gamma - t\beta_j = \alpha + t(\beta - \beta_j)$ is another common root: this yields

$$t = \frac{\alpha_i - \alpha}{\beta - \beta_j}$$

($\beta \neq \beta_j$ as $\alpha \neq \alpha_i$). There is clearly a finite number of such values, thus any sufficiently large
t works.
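
To make the choice of t concrete, here is a small numerical sketch for K = Q, α = √2 and β = √3. The helper names `forbidden_t` and `common_roots` are ours, and the conjugates are approximated by floats, so this only illustrates the proof.

```python
import math

def forbidden_t(alpha_conjs, beta_conjs):
    """Values t = (alpha_i - alpha) / (beta - beta_j) excluded in the proof.
    The first entry of each list is alpha (resp. beta) itself."""
    alpha, beta = alpha_conjs[0], beta_conjs[0]
    return [(a_i - alpha) / (beta - b_j)
            for a_i in alpha_conjs[1:] for b_j in beta_conjs[1:]]

def common_roots(alpha_conjs, beta_conjs, t, tol=1e-9):
    """Common roots of f = prod (X - alpha_i) and g = prod (X - (gamma - t*beta_j))."""
    gamma = alpha_conjs[0] + t * beta_conjs[0]
    g_roots = [gamma - t * b_j for b_j in beta_conjs]
    return [a for a in alpha_conjs if any(abs(a - r) < tol for r in g_roots)]

alphas = [math.sqrt(2), -math.sqrt(2)]   # conjugates of sqrt(2), alpha first
betas = [math.sqrt(3), -math.sqrt(3)]    # conjugates of sqrt(3), beta first

bad = forbidden_t(alphas, betas)         # a single value, -sqrt(2/3), which is irrational
print(bad)
print(common_roots(alphas, betas, t=1))       # only alpha is a common root
print(common_roots(alphas, betas, t=bad[0]))  # both conjugates of alpha are common roots
```

Here the only excluded value is irrational, so every non-zero rational t works; in particular Q(√2, √3) = Q(√2 + √3).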


If you read this proof carefully, you might notice that it almost works for any infinite field K. In
fact, it does show that if L/K is finite and K is infinite, then L is generated by one element, under
the assumption of separability. This means that the conjugates of an element are always distinct.
Indeed, if this assumption is not satisfied, then the gcd of the two polynomials we constructed could
be divisible by (X − α)2 and thus not equal to X − α. It seems obvious that irreducible polynomials
have no repeated roots, and that’s because it is, but only in characteristic 0. In characteristic p, things
become weird, but one can check that it still holds for finite fields (it can only fail if the derivative
of the polynomial is zero). Since these are the cases we are interested in, we will assume that all our
extensions are separable, so that we can use the primitive element theorem. See Exercise 6.5.39 for an
example of a non-separable extension.

Note also that we previously proved the primitive element theorem for finite fields, in
Corollary 4.2.1. In other words, the primitive element theorem holds whenever L/K is finite and separable
(without the assumption that K is infinite).

Definition 6.2.1 (Separable Extension)

An algebraic (i.e. $L \subseteq \overline{K}$) extension L/K is said to be separable if the minimal polynomial of
any element of L has distinct roots.

Remark 6.2.1
In fact, our proof of the primitive element theorem shows that we only need L to be generated
by separable elements of $\overline{K}$. However, as Exercise 6.3.10 shows, this turns out to be equivalent
to our definition of separability.

With this, we can establish a number of field-theoretic results for number fields. In particular, we
shall generalise the norm $N_{\mathbb{Q}(\sqrt{d})}$ of quadratic fields (Definition 2.1.4).

Definition 6.2.2 (Embeddings)

Let L/K be a finite extension. The K-embeddings of L are functions $f : L \to \overline{L}$ which are additive,
multiplicative, and the identity on K. In other words, functions satisfying f (k) = k for k ∈ K,
and f (x + y) = f (x) + f (y) as well as f (xy) = f (x)f (y) for x, y ∈ L. The set of K-embeddings
of L is denoted $\mathrm{Emb}_K(L)$.

Since we are usually concerned with the case K = Q, we will just say "embedding" for "Q-
embedding" as well as write Emb for EmbQ . Note that, as Exercise 6.2.2∗ shows, K-embeddings (or
K-morphisms) are nothing spectacular, their axioms are precisely the ones which make them commute
with polynomials!

Remark 6.2.2
We say "embedding" because a normal embedding of S into U is an injective morphism ϕ : S → U.
Notice that, for U = L, this corresponds to the Q-embeddings of Q/L, which we just call
embeddings. Why do we call such a morphism an "embedding"? Because you "embed" S into
U by associating it with its isomorphic image ϕ(S): you get a copy of S in U .

Exercise 6.2.1. Let K be a number field. Prove that the embeddings of K are the non-zero functions
f : K → C which are both multiplicative and additive.

Exercise 6.2.2∗ . Prove that ϕ ∈ EmbK (L) if and only if ϕ commutes with polynomials, i.e.

ϕ(f (x1 , . . . , xn )) = f (ϕ(x1 ), . . . , ϕ(xn ))

for any f ∈ K[X1 , . . . , Xn ] and any x1 , . . . , xn ∈ L.

Exercise 6.2.3∗ . Let α ∈ L be an element and σ ∈ EmbK (L) be an embedding. Prove that σ(α) is a
conjugate of α.

Exercise 6.2.4∗ . Prove that an embedding is injective.

Why do we care about embeddings? Well, because they are precisely the morphisms obtained by
conjugation (in the case where L is generated by one element, which can be assumed thanks to the
primitive element theorem)!

Proposition 6.2.1 (Embeddings)*

Let K(α) = L/K be a finite separable extension. The K-embeddings of K(α) are precisely the
functions ϕ : f (α) ↦ f (β) where β is some conjugate of α and f ∈ K(X). In particular, there
are exactly [L : K] of them.

Let’s check that this statement makes sense: ϕ is clearly additive and multiplicative, and it indeed
fixes K since if f = k is a constant polynomial then k(α) = k = k(β). Why doesn’t it work for any
β then? That’s because we need to check it’s well defined: an element of K(α) has multiple ways of
being written (e.g. $0 = \pi_\alpha(\alpha)$; here $\pi_\alpha$ means the minimal polynomial with coefficients in K). This
is very easy to show: if f (α) = g(α) then

$$f \equiv g \pmod{\pi_\alpha} \iff f \equiv g \pmod{\pi_\beta}$$

(as $\pi_\alpha = \pi_\beta$), so f (β) = g(β), which shows that it’s well defined. In fact, we have already proven the
proposition like this, since by Exercise 6.2.3∗ β = ϕ(α) is a conjugate of α and by Exercise 6.2.2∗ ϕ : f (α) ↦ f (β).
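
As a tiny illustration of this proposition, here is a sketch (with our own encoding, not the book's) of the two embeddings of Q(√2): an element a + b√2 is stored as a pair of exact fractions, and the non-trivial embedding replaces √2 by its conjugate −√2.

```python
from fractions import Fraction

D = 2  # we work in Q(sqrt(D)); the pair (a, b) stands for a + b*sqrt(D)

def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def mul(x, y):
    (a, b), (c, d) = x, y
    # (a + b*sqrt(D)) * (c + d*sqrt(D)) = (ac + D*bd) + (ad + bc)*sqrt(D)
    return (a * c + D * b * d, a * d + b * c)

def conj(x):
    """The non-trivial embedding, sending sqrt(D) to its conjugate -sqrt(D)."""
    return (x[0], -x[1])

x = (Fraction(1, 2), Fraction(3))
y = (Fraction(-2), Fraction(5, 7))
assert conj(add(x, y)) == add(conj(x), conj(y))   # additive
assert conj(mul(x, y)) == mul(conj(x), conj(y))   # multiplicative
assert conj((Fraction(4), Fraction(0))) == (Fraction(4), Fraction(0))  # fixes Q
```

The three assertions are exactly the three axioms of a Q-embedding, checked on sample elements only.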

Remark 6.2.3
If L/K is separable but not finite, there are many embeddings and the fundamental theorem of
Galois theory (Theorem 6.3.1) does not hold anymore (the way it’s currently stated). For L/K
not algebraic, there are even more embeddings! For instance, if L = K(α) with α transcendental
over K, then the embeddings of L/K are $\sigma_\beta : f(\alpha) \mapsto f(\beta)$ for any transcendental β. (In other
words, transcendental elements are all conjugates in some sense.)

Embeddings give us a systematic way to manipulate conjugation. For instance, the embeddings
of C/R are the identity and complex conjugation, and using the conjugation embedding we proved
Proposition A.1.2. We will illustrate this by an application soon, but we need to build up a few results
on embeddings first, namely an equivalent version of Exercise 1.5.19† . Here is how to prove it with
the formalism of embeddings (note that it’s the same proof, but with more comfortable objects). Note
that this result is completely obvious: everything is symmetric between conjugates, so of course they
are reached the same number of times. We are just expressing this symmetry more formally.

Proposition 6.2.2*

Let f ∈ K[X]. The m conjugates of f (α) are f (ϕ(α)), and each is represented n/m times where
n = [K(α) : K] is the degree of α.

Proof

Note that conjugates are reached at least once: this follows from the fundamental theorem of
symmetric polynomials: $\prod_i (X - f(\alpha_i))$ has coefficients in K, where $\alpha_i$ are the conjugates of α.
Moreover, by Exercise 6.2.3∗ $f(\alpha_i)$ is always a conjugate of f (α).

It remains to see that they all appear the same number of times. Suppose
$f(\varphi_1(\alpha)) = f(\varphi_2(\alpha)) = \ldots = f(\varphi_k(\alpha))$ is reached exactly k times, where $\varphi_1 = \mathrm{id}$.

Consider one of its conjugates, $f(\psi(\alpha)) = \psi(f(\varphi_1(\alpha)))$. Since ψ is injective by Exercise 6.2.4∗ ,
we conclude that f (ψ(α)) is also reached exactly k times, for

$$f(\psi(\alpha)) = \psi(f(\varphi_i(\alpha))) = f(\psi\varphi_i(\alpha))$$

and if $f(\psi(\alpha)) = f(\psi'(\alpha))$ then $f(\alpha) = f(\psi^{-1}\psi'(\alpha))$, so $\psi^{-1}\psi' \in \{\varphi_1, \ldots, \varphi_k\}$ as wanted.




Remark 6.2.4
Here is how this could be proven without field theory. In fact, in this case it is even a lot
quicker. However, we have chosen this approach because in general it is nicer to think in terms
of embeddings, and this will be particularly important for Section 6.3. Since all roots of
$\pi = \prod_i (X - f(\alpha_i))$ are roots of $\pi_{f(\alpha)}$, the factorisation in irreducible polynomials must be
$\pi_{f(\alpha)}^k$ for some k. This means exactly that all conjugates are reached the same number of times.

Exercise 6.2.5. Let $\alpha_1, \ldots, \alpha_n \in L$ be all the K-conjugates of some α ∈ L, and let $\sigma \in \mathrm{Emb}_K(L)$ be an
embedding. Prove that σ permutes $\alpha_1, \ldots, \alpha_n$.

This result can be reformulated to show that, if M/L/K is a tower of extensions such that M/K
is separable (since we only defined embeddings for separable extensions), every K-embedding of L
extends to exactly [M : K]/[L : K] = [M : L] K-embeddings of M .

Proposition 6.2.3*

Any K-embedding of L extends to exactly [M : L] K-embeddings of M . In particular, every
embedding is reached.

Proof

By Exercise 6.2.3∗ , embeddings of M/K restrict to embeddings of L/K, and by Proposition 6.2.2
each embedding is reached the same number of times, i.e. [M : K]/[L : K] = [M : L].


With these results, we can now define the norm for any finite extension! But first, we present an
example that shows the conceptual power of embeddings (which, again, only provide a more comfortable
way to solve the problem, the essence stays the same).

Problem 6.2.1

Suppose $\alpha_1, \ldots, \alpha_n \in \overline{\mathbb{Q}}$ are positive real algebraic numbers such that $\alpha_i$ is maximal out of the
absolute values of its conjugates for each i. Prove that, if $\sum_{i=1}^n \alpha_i$ is rational, then $\alpha_i \in \mathbb{Q}$ for
each i.

Proof
Consider the embeddings of $K = \mathbb{Q}(\alpha_1, \ldots, \alpha_n)$. If $\sum_{i=1}^n \alpha_i \in \mathbb{Q}$ is rational, it is fixed by every
embedding σ of K. But, the absolute value of

$$\sum_{i=1}^n \alpha_i = \sigma\left(\sum_{i=1}^n \alpha_i\right) = \sum_{i=1}^n \sigma(\alpha_i)$$

is at most $\sum_{i=1}^n |\sigma(\alpha_i)| \le \sum_{i=1}^n \alpha_i$. Hence, since we have equality in the triangle inequality,
we must have $\sigma(\alpha_i) = u\alpha_i$ for some |u| = 1. But then,

$$\sum_{i=1}^n \alpha_i = \sum_{i=1}^n \sigma(\alpha_i) = u \sum_{i=1}^n \alpha_i$$

so u = 1. Finally, this means that $\alpha_i$ is fixed by any σ ∈ Emb(K), which means that it’s fixed by
any $\sigma \in \mathrm{Emb}(\mathbb{Q}(\alpha_i))$ by Proposition 6.2.3. We conclude that its only conjugate is itself: $\alpha_i \in \mathbb{Q}$.


Exercise 6.2.6. Solve Problem 6.2.1 without field theory, i.e. using only the content of Chapter 1.
As a notable corollary, we get that $\sum_{i=1}^n \sqrt[k_i]{a_i}$ for positive $a_i$ is rational iff $a_i$ is a perfect $k_i$th power
for each i, which was Exercise 1.5.5. To conclude this section, we define the norm in arbitrary finite
extensions.

Definition 6.2.3 (Norm)

Let L/K be a finite separable extension. The norm $N_{L/K}$ is defined as

$$N_{L/K}(\alpha) = \prod_{\sigma \in \mathrm{Emb}_K(L)} \sigma(\alpha).$$

Notice in particular that the norm is obviously multiplicative because the embeddings are! As an
example, the norm in C/R is the square of the modulus: $N(a + bi) = a^2 + b^2 = |a + bi|^2$. Here is a bad
application of the multiplicativity of the norm, which nonetheless appeared in a USA TST.

Problem 6.2.2 (USA TST 2012)

Do there exist arbitrarily large rational integers a, b, c such that $a^3 + 2b^3 + 4c^3 = 6abc + 1$?

Solution

One can check that $N_{\mathbb{Q}(\sqrt[3]{2})}(a + b\sqrt[3]{2} + c\sqrt[3]{4}) = a^3 + 2b^3 + 4c^3 - 6abc$. Thus we want to find
elements of norm 1 in this field. Notice that $N_{\mathbb{Q}(\sqrt[3]{2})}(1 + \sqrt[3]{2} + \sqrt[3]{4}) = 1$, so

$$a + b\sqrt[3]{2} + c\sqrt[3]{4} = (1 + \sqrt[3]{2} + \sqrt[3]{4})^n$$

will also have norm 1 for any n. Pick an n sufficiently large and we are done.
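
The solution can be followed with exact integer arithmetic. In this sketch (the triple encoding and the function names are ours) the triple (a, b, c) stands for $a + b\sqrt[3]{2} + c\sqrt[3]{4}$, and we check that the powers of (1, 1, 1) keep norm 1 while their coefficients grow.

```python
def mul(x, y):
    """Multiply a + b*t + c*t^2 by d + e*t + f*t^2, where t^3 = 2."""
    a, b, c = x
    d, e, f = y
    return (a * d + 2 * (b * f + c * e),
            a * e + b * d + 2 * c * f,
            a * f + b * e + c * d)

def norm(x):
    """The norm form a^3 + 2b^3 + 4c^3 - 6abc from the solution."""
    a, b, c = x
    return a**3 + 2 * b**3 + 4 * c**3 - 6 * a * b * c

def power(x, n):
    result = (1, 0, 0)  # the element 1
    for _ in range(n):
        result = mul(result, x)
    return result

unit = (1, 1, 1)
for n in range(1, 6):
    triple = power(unit, n)
    print(n, triple, norm(triple))
```

Each printed triple is a solution of $a^3 + 2b^3 + 4c^3 = 6abc + 1$ in positive integers.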


Exercise 6.2.7∗ . Check that $N_{\mathbb{Q}(\sqrt[3]{2})}(a + b\sqrt[3]{2} + c\sqrt[3]{4}) = a^3 + 2b^3 + 4c^3 - 6abc$.

Norms in different extensions are linked by the following proposition. This is left as an exercise in
the next section, as we need to define Galois groups to prove it.

Proposition 6.2.4

Let M/L/K be a tower of finite separable field extensions. Then $N_{M/K} = N_{L/K} \circ N_{M/L}$.

6.3 Galois Theory


Galois theory studies Galois extensions, i.e. extensions closed under embeddings. An algebraic
extension L/K means that every element of L is algebraic over K.

Definition 6.3.1 (Galois Extension)

A Galois extension L/K is a separable algebraic extension closed under embeddings, meaning
that for any α ∈ L, all the conjugates of α also lie in L.

The simplest examples of Galois extensions are the quadratic fields we saw in Chapter 2 and the
cyclotomic fields Q(ω) where ω is a root of unity. Indeed, all the conjugates of ω have the form $\omega^k \in \mathbb{Q}(\omega)$.
It is easy to construct Galois extensions: one starts with any extension K(α) and then gets the Galois
closure $L = K(\alpha_1, \ldots, \alpha_n)$ by adding the conjugates $\alpha_1, \ldots, \alpha_n$ of α.
Exercise 6.3.1∗ . Check that K(α1 , . . . , αn )/K is Galois and prove that any Galois extension has this form.

Galois extensions are particularly interesting because EmbK (L) becomes a group, meaning that
the composition of two embeddings is still an embedding since embeddings are now L → L. We thus
denote EmbK (L) as AutK (L) (also written Aut(L/K)) when L is Galois, because its embeddings are
now automorphisms, that is, isomorphisms from a field to itself.

Definition 6.3.2 (Galois Group)

The Galois group Gal(L/K) of an extension L/K is its group of K-embeddings: AutK (L).

Exercise 6.3.2∗ . Can you express the Galois group of a quadratic extension L/K in a way that doesn’t
depend on L or K? (More specifically, show that the Galois groups of quadratic extensions are all isomorphic.)

Exercise 6.3.3∗ . Check that the Galois group is a group under composition.

Exercise 6.3.4. Let L/K be Galois and K ⊆ M ⊆ L be an intermediate field. Prove that $\mathrm{Emb}_K(M)$ is a
system of representatives of Gal(L/K)/ Gal(L/M ), where the quotient means Gal(L/K) modulo Gal(L/M ),
i.e. we say $\sigma' \equiv \sigma$ if $\sigma^{-1} \circ \sigma' \in \mathrm{Gal}(L/M)$. (Our quotient A/B is more commonly thought of as the set
of right-cosets of B in A, i.e. the sets Ba for a ∈ A (which we just wrote as a in our case).) (See also
Exercise A.3.15† .)

Exercise 6.3.5. Prove Proposition 6.2.4 using Exercise 6.3.4. (This is a bit technical.)

Again, when K = Q, we may drop the K in the notation. We will also sometimes write G(L/K).
Here is the most important Galois group: $\mathrm{Gal}(\mathbb{Q}(\omega_n)/\mathbb{Q})$ where $\omega_n$ is a primitive nth root of unity. By
Theorem 3.2.1, its elements are $\sigma_k : \omega_n \mapsto \omega_n^k$ for gcd(n, k) = 1. Note that $\sigma_i \circ \sigma_j$ maps $\omega_n$ to
$(\omega_n^j)^i = \omega_n^{ij}$. This means that $\sigma_i \circ \sigma_j = \sigma_{ij}$. It is thus isomorphic to $(\mathbb{Z}/n\mathbb{Z})^\times$: just label $\sigma_k$ as
k (mod n) and this makes sense by the previous consideration on composition (which becomes
multiplication in $(\mathbb{Z}/n\mathbb{Z})^\times$). In particular, it is abelian, which means that ab = ba (composition commutes).
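
The labelling of $\sigma_k$ by k (mod n) can be played with directly. The following sketch (function names are ours) only manipulates the labels, using the rule $\sigma_i \circ \sigma_j = \sigma_{ij}$ derived above, and checks closure and commutativity for n = 12.

```python
from math import gcd

def units(n):
    """Labels k of the automorphisms sigma_k: omega_n -> omega_n^k, gcd(k, n) = 1."""
    return [k for k in range(1, n + 1) if gcd(k, n) == 1]

def compose(i, j, n):
    """Label of sigma_i o sigma_j, which maps omega_n to (omega_n^j)^i = omega_n^(ij)."""
    return (i * j) % n

n = 12
G = units(n)
# Closed under composition, and composition commutes (the group is abelian).
assert all(compose(i, j, n) in G for i in G for j in G)
assert all(compose(i, j, n) == compose(j, i, n) for i in G for j in G)
print(G)
```

Nothing here depends on complex numbers: once the labels are fixed, the Galois group of a cyclotomic field is pure modular arithmetic.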

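Another particularly simple Galois group is $\mathrm{Gal}(\mathbb{F}_{p^n}/\mathbb{F}_p)$: by Theorem 4.3.1, its elements are

$$\mathrm{id}, \varphi_p, \varphi_p^2, \ldots, \varphi_p^{n-1}.$$

In particular, it is generated by only one element: if we relabel $\varphi_p^k$ as k (mod n) we get
$\mathrm{Gal}(\mathbb{F}_{p^n}/\mathbb{F}_p) \simeq \mathbb{Z}/n\mathbb{Z}$!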

Before we state and prove the fundamental theorem of Galois theory, here is a quick application of
the Galois group.

Problem 6.3.1

Is $\sqrt[3]{2}$ a sum of roots of unity?

Solution

Suppose that $\sqrt[3]{2} \in \mathbb{Q}(\omega)$ for some root of unity ω. $X^3 - 2$ is irreducible in Q[X], so the
conjugates of $\sqrt[3]{2}$ are $\zeta^i \sqrt[3]{2}$ where ζ is a primitive third root of unity. Since Q(ω) is Galois,
$K = \mathbb{Q}(\sqrt[3]{2}, \zeta) \subseteq \mathbb{Q}(\omega)$. By Proposition 6.2.3, Gal(Q(ω)/Q) restricts to (multiple copies of)
$G = \mathrm{Gal}(\mathbb{Q}(\sqrt[3]{2}, \zeta)/\mathbb{Q})$.

The key point is that Gal(Q(ω)/Q) is abelian (commutes), while G isn’t so this is impossible.
We have already shown that the former is abelian, so it remains to show that G is not. We claim
that the embeddings of K are

$$\sigma_{(a,b)} : \begin{cases} \sqrt[3]{2} \mapsto \zeta^a \sqrt[3]{2} \\ \zeta \mapsto \zeta^b \end{cases}$$

for (independent) a ∈ Z/3Z and $b \in (\mathbb{Z}/3\mathbb{Z})^\times$.

Clearly, these are all the possible embeddings of K, so it remains to check that they are indeed
embeddings, i.e. that [K : Q] = 3 · 2 = 6. Since

[K : Q] = [K : Q(ζ)][Q(ζ) : Q] = 2[K : Q(ζ)],

it remains to prove that [K : Q(ζ)] = 3, i.e. that $X^3 - 2$ is irreducible over Q(ζ). This is very
easy: if it wasn’t the case it would have a root in Q(ζ), so Q(ζ) would contain an element of
degree 3 which is impossible as Q(ζ) has degree 2.

Finally, one can see that G is not abelian as $\sigma_{(0,-1)} \circ \sigma_{(1,1)} = \sigma_{(-1,-1)}$, but $\sigma_{(1,1)} \circ \sigma_{(0,-1)} = \sigma_{(1,-1)}$.

Therefore, $\sqrt[3]{2}$ is not a sum of roots of unity.
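
The two compositions at the end of this solution can be computed mechanically. In the sketch below (our own encoding) the pair (a, b) stands for $\sigma_{(a,b)}$, with a taken modulo 3 and b ∈ {1, 2}, so that −1 is written as 2. The composition rule is obtained by tracking the images of $\sqrt[3]{2}$ and ζ.

```python
from itertools import product

def compose(s, t):
    """Label of sigma_s o sigma_t (sigma_t is applied first)."""
    (a, b), (c, d) = s, t
    # sigma_t sends cbrt(2) to zeta^c * cbrt(2) and zeta to zeta^d.
    # Applying sigma_s afterwards gives zeta^(b*c) * zeta^a * cbrt(2) and zeta^(b*d).
    return ((a + b * c) % 3, (b * d) % 3)

G = list(product(range(3), (1, 2)))   # a in Z/3Z, b in (Z/3Z)^x = {1, 2}

left = compose((0, 2), (1, 1))    # sigma_(0,-1) o sigma_(1,1)
right = compose((1, 1), (0, 2))   # sigma_(1,1) o sigma_(0,-1)
print(left, right)                # two different labels: G is not abelian
```

The six labels are closed under this rule, which is consistent with G being a group of order 6 that is not commutative.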


Remark 6.3.1
In fact, the Kronecker-Weber theorem asserts the converse: if Gal(K/Q) is abelian, K is contained
in a cyclotomic field Q(ω) for some root of unity ω.

Exercise 6.3.6∗ . Compute σ(0,−1) ◦ σ(1,1) and σ(1,1) ◦ σ(0,−1) .

Here is the main reason why Galois groups are interesting.

Theorem 6.3.1 (Fundamental Theorem of Galois Theory)

Let L/K be a finite Galois extension. There is a one-to-one correspondence, called the Galois
correspondence, between subgroups H of Gal(L/K) (subsets closed under composition and
inversion) and the intermediate fields L/M/K. This correspondence is given by

$$H \mapsto L^H,$$

where $L^H$ is the fixed field of H, i.e. the elements of L which are fixed by all of H. The reverse
direction is given by

$$M \mapsto \mathrm{Gal}(L/M).$$

Note that, since Galois groups of finite extensions are finite, this immediately yields the following
non-obvious corollary.

Corollary 6.3.1

Let L/K be a finite separable extension. Then, there are only finitely many intermediate fields
L/M/K.

Exercise 6.3.7∗ . Prove Corollary 6.3.1. (You may assume Exercise 6.3.10.¹)

Exercise 6.3.8. Prove that the primitive element theorem follows from Corollary 6.3.1 by considering the
fields $K_t = K(\alpha + t\beta)$, where $\alpha, \beta \in \overline{K}$ are given elements.

Proof of Theorem 6.3.1

Note that $\mathrm{Gal}(L/M) = \mathrm{Emb}_M(L)$. In particular, the elements fixed by Gal(L/M ) are those
which have only one M -conjugate: they are thus in M . This shows that $L^{\mathrm{Gal}(L/M)} = M$.

It remains to prove that $\mathrm{Gal}(L/L^H) = H$. Clearly, $H \subseteq \mathrm{Gal}(L/L^H)$ since H fixes $L^H$ by
definition. Write $L = L^H(\alpha)$: the cardinality of $\mathrm{Gal}(L/L^H)$ is the degree of α (over $L^H$).
However, note that

$$e_i(\sigma_1(\alpha), \ldots, \sigma_k(\alpha))$$

is fixed by H for any i, where $\sigma_1, \ldots, \sigma_k$ are the elements of H (this is because σH = H for
any σ ∈ H). Thus, α has at most k conjugates over $L^H$. To conclude, we have $H \subseteq \mathrm{Gal}(L/L^H)$ and
$|\mathrm{Gal}(L/L^H)| \le |H|$, which implies $H = \mathrm{Gal}(L/L^H)$.


Exercise 6.3.9∗ . Prove that ei (σ1 (α), . . . , σk (α)) is fixed by H for any i.

Remark 6.3.2
There is an explicit way of choosing a primitive element of $L^H$ from a primitive element of L. If
L = K(α), set $f_H(X) = \prod_{\sigma \in H} (X - \sigma(\alpha))$. For sufficiently large n ∈ Z, $f_H(n)$ is fixed only by H:
since $\sigma f_H$ and $f_H$ are distinct polynomials for σ ∉ H, they take the same value at only finitely many points.
Finally, if β is fixed only by H, then Gal(L/K(β)) = H by definition, which means $K(\beta) = L^H$.

Exercise 6.3.10. Let α, β be separable over K (i.e. their minimal polynomials have distinct roots). Prove
that K(α, β) is separable over K.

For $L/K = \mathbb{F}_{p^n}/\mathbb{F}_p$ for instance, this is Corollary 4.3.1. Indeed, the (additive, so closed under
addition) subgroups of Z/nZ are simply Z/dZ for d | n and the fixed field $\mathbb{F}_{p^n}^{\mathbb{Z}/d\mathbb{Z}}$ is $\mathbb{F}_{p^{n/d}}$, the fixed
field of $\varphi_p^{n/d}$ (as Z/dZ is generated by $\varphi_p^{n/d}$). We now present a quick application of the fundamental
theorem of Galois theory in the case of cyclotomic fields.

Problem 6.3.2

Let ωm be a primitive mth root of unity, and ωn a primitive nth root of unity. What are
Q(ωm , ωn ) and Q(ωm ) ∩ Q(ωn )?

Solution

Let ω be a primitive mnth root of unity and $\sigma_k$ be the embedding $\omega \mapsto \omega^k$.

Let $\omega_d$ be a primitive dth root of unity where d | mn. Notice that $\mathbb{Q}(\omega_d)$ is the fixed field of

$$H_d = \{\sigma_k \mid k \cdot mn/d \equiv_{mn} mn/d \iff k \equiv_d 1\}$$

since these are exactly the automorphisms such that $\sigma_k(\omega_d) = \omega_d$, as $\omega_d = \omega^{mn/d}$. In particular,

$$H_m \cap H_n = \{\sigma_k \mid k \equiv_m 1, k \equiv_n 1\} = \{\sigma_k \mid k \equiv_{\mathrm{lcm}(m,n)} 1\} = H_{\mathrm{lcm}(m,n)}.$$

This means that $\mathbb{Q}(\omega_m, \omega_n)$ is $\mathbb{Q}(\omega_{\mathrm{lcm}(m,n)})$ by Exercise 6.3.11∗ . Similarly, the group generated
by $H_m$ and $H_n$ is

$$\langle H_m, H_n \rangle = \{ab \mid a \equiv_m 1, b \equiv_n 1\} = H_{\gcd(m,n)}$$

since $(1 + am)(1 + bn) \equiv_{mn} 1 + (am + bn)$ goes through every residue which is 1 modulo gcd(m, n)
by Bézout’s lemma. Thus, $\mathbb{Q}(\omega_m) \cap \mathbb{Q}(\omega_n)$ is $\mathbb{Q}(\omega_{\gcd(m,n)})$.

¹ Sadly, our proof of Exercise 6.3.10 uses the primitive element theorem. However, it is usually proved from
the theory of embeddings in any finite extension (not necessarily separable), which explains why we relied on
the primitive element theorem, as we have not delved into the theory of inseparable extensions at all. See
Conrad [13] or Lang [18].


Exercise 6.3.11∗ . Given two subfields A and B of a field L, define their compositum or composite field AB as
the smallest subfield of L containing both A and B (in other words, the field generated by A and B). Let L/K
be a finite Galois extension and A, B be intermediate fields. Prove that Gal(L/AB) = Gal(L/A) ∩ Gal(L/B).

Exercise 6.3.12∗ . Given two subgroups $H_1, H_2$ of a group H, define the subgroup they generate, $\langle H_1, H_2 \rangle$,
as the smallest subgroup containing both $H_1$ and $H_2$. Let L/K be a finite Galois extension and A, B be
intermediate fields. Prove that $\mathrm{Gal}(L/A \cap B) = \langle \mathrm{Gal}(L/A), \mathrm{Gal}(L/B) \rangle$.

Remark 6.3.3
In fact, it is possible to give a very direct proof for $\mathbb{Q}(\omega_m, \omega_n) = \mathbb{Q}(\omega_{\mathrm{lcm}(m,n)})$: one inclusion is
trivial, and for the other we have

$$\exp\left(\frac{2i\pi}{m}\right)^b \exp\left(\frac{2i\pi}{n}\right)^a = \exp\left(\frac{2i\pi}{\mathrm{lcm}(m,n)}\right)$$

where am + bn = gcd(m, n). However, such a proof does not work anymore for $\mathbb{Q}(\omega_m) \cap \mathbb{Q}(\omega_n)$
because we do not have access directly to this field. For instance, $K(\omega_m, \omega_n) = K(\omega_{\mathrm{lcm}(m,n)})$ is
always true, but $K(\omega_m) \cap K(\omega_n) = K(\omega_{\gcd(m,n)})$ isn’t always! As an example, if $K = \mathbb{Q}(\sqrt{3})$,
then $K(i) = \mathbb{Q}(i, \sqrt{3}) = K(j)$ where j is a primitive cube root of unity. Thus, $K(i) \cap K(j) \neq K$.
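
The group-theoretic side of Problem 6.3.2 is easy to check by brute force over Q. The sketch below uses our own names: `H(d, N)` lists the labels k of the $\sigma_k$ fixing $\omega_d$ inside $(\mathbb{Z}/N\mathbb{Z})^\times$ with N = mn, and `generated` computes the subgroup generated by two subgroups as a set of products, which is enough because the group is abelian.

```python
from math import gcd

def H(d, N):
    """Labels k in (Z/NZ)^x with k = 1 (mod d): the subgroup fixing Q(omega_d)."""
    return {k for k in range(1, N + 1) if gcd(k, N) == 1 and k % d == 1 % d}

def generated(A, B, N):
    """Subgroup generated by subgroups A and B; products a*b suffice in an abelian group."""
    return {(a * b) % N for a in A for b in B}

def lcm(a, b):
    return a * b // gcd(a, b)

for m, n in [(4, 6), (6, 10), (8, 12), (3, 5)]:
    N = m * n
    assert H(m, N) & H(n, N) == H(lcm(m, n), N)                 # compositum
    assert generated(H(m, N), H(n, N), N) == H(gcd(m, n), N)    # intersection
```

The two assertions mirror $H_m \cap H_n = H_{\mathrm{lcm}(m,n)}$ and $\langle H_m, H_n \rangle = H_{\gcd(m,n)}$ for a few sample pairs.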

Here are a few additional properties of the Galois correspondence.

Proposition 6.3.1

We have $[L^H : K] = |G|/|H|$ where G = Gal(L/K).

Proof

$|H| = |\mathrm{Aut}(L/L^H)| = [L : L^H]$ and |G| = [L : K] so

$$[L^H : K] = \frac{[L : K]}{[L : L^H]} = \frac{|G|}{|H|}.$$

Proposition 6.3.2

The Galois correspondence is inclusion reversing: $H_1 \subseteq H_2 \iff L^{H_1} \supseteq L^{H_2}$.

Exercise 6.3.13∗ . Prove Proposition 6.3.2.

Here is another application of the fundamental theorem of Galois theory, generalising Problem 6.3.1.

Problem 6.3.3

When is $\sqrt[n]{2}$ a sum of roots of unity?

Solution

Suppose that $\mathbb{Q}(\sqrt[n]{2}) \subseteq \mathbb{Q}(\omega)$ for some root of unity ω. By Exercise 6.3.14∗ , any subfield of Q(ω)
is Galois over Q, so $\mathbb{Q}(\sqrt[n]{2})$ also is. Note that, since $X^n - 2$ is irreducible by Eisenstein’s criterion,
the conjugates of $\sqrt[n]{2}$ are $\zeta^k \sqrt[n]{2}$ where ζ is a primitive nth root of unity. In particular, we must
have $\mathbb{Q}(\zeta) \subseteq \mathbb{Q}(\sqrt[n]{2}) \subseteq \mathbb{R}$, which implies n = 1 or n = 2. Conversely, these work: 2 = 1 + 1 and
$\sqrt{2} = \pm(\omega + 1/\omega)$ for any primitive eighth root of unity ω. Indeed,

$$\left(\omega + \frac{1}{\omega}\right)^2 = \omega^2 + \frac{1}{\omega^2} + 2 = 2$$

as $\omega^4 = -1$ (this is a Gauss sum).
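
The final identity is easy to confirm numerically. This is a floating-point sketch with a helper name of our choosing (`sqrt2_candidate`), evaluated at each primitive eighth root of unity.

```python
import cmath

def sqrt2_candidate(k):
    """omega + 1/omega for omega = exp(2*pi*i*k/8), a primitive eighth root when k is odd."""
    omega = cmath.exp(2j * cmath.pi * k / 8)
    return omega + 1 / omega

for k in (1, 3, 5, 7):
    s = sqrt2_candidate(k)
    assert abs(s * s - 2) < 1e-12      # the square is 2, up to rounding
    print(k, round(s.real, 6))         # the sign depends on which root is chosen
```

The sign ± in the solution is visible in the output: two of the four roots give √2 and the other two give −√2.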




Exercise 6.3.14∗ . Let L/K be a finite Galois extension and let M be an intermediate field. Prove that, for
any σ ∈ Gal(L/K), $\mathrm{Gal}(L/\sigma M) = \sigma\, \mathrm{Gal}(L/M)\, \sigma^{-1}$. Deduce that the intermediate fields which are also Galois
(over K) are $M = L^H$ where H is a normal subgroup of G = Gal(L/K), meaning that $\sigma H \sigma^{-1} = H$ for any
σ ∈ G. In particular, if L/K is abelian, meaning that its Galois group is, any intermediate field is Galois over
K.

This fundamental theorem of Galois theory lets us get deeper insight into the Gauss sums of
Section 4.5. Indeed, the Galois group of Q(ω) where ω is a primitive qth root of unity is (isomorphic to)
$(\mathbb{Z}/q\mathbb{Z})^\times$. By Galois theory (and a bit of group theory), we already know that this field contains a
unique field of degree 2: indeed, by Proposition 6.3.1, it’s $L^H$ where |G|/|H| = 2. It can be seen easily
that the unique such subgroup is the subgroup of quadratic residues. Let $\sigma_k$ denote the embedding
$\omega \mapsto \omega^k$. Hence, we get

$$\sum_{\left(\frac{k}{q}\right) = 1} \omega^k \in L^H$$

and

$$\sum_{\left(\frac{k}{q}\right) = -1} \omega^k \in L^H$$

(they are fixed by the embeddings of H) and then it’s just a matter of computing the value of these
sums to deduce what the quadratic field is. To simplify things a bit we can consider our Gauss sum,
since when we square it, it’s fixed by all automorphisms, which means it’s rational.
Once we know that this quadratic field is $\mathbb{Q}(\sqrt{q^*})$ where $q^* = (-1)^{\frac{q-1}{2}} q$, we directly² get the law
of quadratic reciprocity (without using the Gauss sum): if $\sqrt{q^*} = f(\omega) \in L^H$ for some f ∈ Z[X], then

$$(\sqrt{q^*})^p \equiv \sigma_p(\sqrt{q^*}) \pmod{p}$$

by Frobenius and this is equal to $\sqrt{q^*}$ iff $\sigma_p \in H$, i.e. p is a quadratic residue modulo q. Otherwise, it’s
its other conjugate $-\sqrt{q^*}$. The rest of the proof is the same as before.

To finish with this proof of the quadratic reciprocity law, we said that we didn’t need to use Gauss
sums, but then how do we show that $\sqrt{q^*} \in \mathbb{Q}(\omega)$ without computing them? One way of doing this is
to notice that, on the one hand,

$$\prod_{k=1}^{q-1} (1 - \omega^k) = \Phi_q(1) = q.$$

On the other hand,

$$\prod_{k=1}^{q-1} (1 - \omega^k) = \prod_{k=1}^{\frac{q-1}{2}} (1 - \omega^k)(1 - \omega^{-k}) = \prod_{k=1}^{\frac{q-1}{2}} -\omega^{-k}(1 - \omega^k)^2 = (-1)^{\frac{q-1}{2}} \omega^{\ell} \left(\prod_{k=1}^{\frac{q-1}{2}} (1 - \omega^k)\right)^2$$

so $(-1)^{\frac{q-1}{2}} q$ is a square ($\omega^\ell$ is a square: just choose $\ell' \equiv \ell/2 \pmod{q}$ to get $\omega^\ell = (\omega^{\ell'})^2$) thus
concluding our new proof of the quadratic reciprocity law.

² Actually, we need to know that the denominators of the coefficients of f are not divisible by p, so that f makes
sense over $\mathbb{F}_p$ and we can use the Frobenius morphism. This follows for instance from Exercise 3.5.26† .

Exercise 6.3.15∗ . Fill in the details of this proof of the quadratic reciprocity law.
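
Before filling in the details, one can check the two product formulas numerically. In this sketch the function name `check` is ours, and the exponent ℓ is taken to be $-\sum_{k=1}^{(q-1)/2} k$, which is our reading of where $\omega^\ell$ comes from (the factors $-\omega^{-k}$).

```python
import cmath

def check(q):
    """Return both expressions for prod_{k=1}^{q-1} (1 - omega^k), for an odd prime q."""
    omega = cmath.exp(2j * cmath.pi / q)
    half = (q - 1) // 2
    full = 1 + 0j
    for k in range(1, q):
        full *= 1 - omega ** k
    half_prod = 1 + 0j
    for k in range(1, half + 1):
        half_prod *= 1 - omega ** k
    ell = -sum(range(1, half + 1))   # exponent collected from the factors -omega^(-k)
    rhs = (-1) ** half * omega ** ell * half_prod ** 2
    return full, rhs

for q in (3, 5, 7, 11, 13):
    full, rhs = check(q)
    # Both sides should equal Phi_q(1) = q, up to floating-point error.
    assert abs(full - q) < 1e-9 and abs(rhs - q) < 1e-9
```

This only confirms the identity for small primes in floating point; the exercise still asks for the actual proof.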

As a final remark on quadratic reciprocity, we note that, perhaps more intuitively, by looking
at the Frobenius morphism (which coincides with $\sigma_p$ modulo p) in $\mathbb{Q}(\omega_{4a})$ where $\omega_{4a}$ is a primitive
4ath root of unity, we get that $\left(\frac{a}{p}\right)$ depends only on $\sigma_p$, i.e. only on p (mod 4a). Indeed, $\sqrt{a} \in \mathbb{Q}(\omega_{4a})$
follows from the fact that $i \in \mathbb{Q}(\omega_{4a})$ so that $\sqrt{p} \in \mathbb{Q}(\omega_{4a})$ for any p | a, which means that

$$\left(\frac{a}{p}\right) \equiv \sigma_p(\sqrt{a})/\sqrt{a} \pmod{p}$$

depends only on p (mod 4a) as claimed. This is also enough to deduce the quadratic reciprocity law.
   
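Exercise 6.3.16. Prove that, for any positive integer a and primes p, q ∤ a, we have
$\left(\frac{a}{p}\right) = \left(\frac{a}{q}\right)$ whenever p ≡ ±q (mod 4a). Deduce the quadratic reciprocity law: for any odd primes p, q,

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}.$$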
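
Both statements of this exercise can be tested on small primes before attempting a proof. The sketch below uses Euler's criterion to compute Legendre symbols; the function names are ours, and the checks only cover the finite ranges shown, for odd primes and distinct p, q in the reciprocity law.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, computed with Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def odd_primes(limit):
    """Odd primes below limit, by trial division (fine for small limits)."""
    return [p for p in range(3, limit, 2)
            if all(p % d for d in range(3, int(p ** 0.5) + 1, 2))]

def depends_only_on_residue(a, primes):
    """Check (a/p) = (a/q) whenever p = +-q (mod 4a), for primes not dividing a."""
    good = [p for p in primes if a % p != 0]
    return all(legendre(a, p) == legendre(a, q)
               for p in good for q in good
               if (p - q) % (4 * a) == 0 or (p + q) % (4 * a) == 0)

def reciprocity_holds(primes):
    """Check (p/q)(q/p) = (-1)^((p-1)/2 * (q-1)/2) for distinct odd primes."""
    return all(legendre(p, q) * legendre(q, p)
               == (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
               for p in primes for q in primes if p != q)

primes = odd_primes(200)
print(all(depends_only_on_residue(a, primes) for a in range(1, 30)))
print(reciprocity_holds(primes))
```

Experiments like this one do not replace the argument through $\mathbb{Q}(\omega_{4a})$, but they make the role of the modulus 4a tangible.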
We worked hard to get all of this, so here is a concrete application of the fundamental theorem of
Galois theory, which generalises Proposition 4.4.2.

Problem 6.3.4 (Schur)

Let H be a subgroup of $(\mathbb{Z}/n\mathbb{Z})^\times$ (i.e. a subset closed under multiplication and inversion,
or equivalently just closed under multiplication by little Fermat). Prove that there exists a
polynomial f ∈ Z[X] such that, for any rational prime p ∤ n, f (mod p) has all its roots in $\mathbb{F}_p$
when p (mod n) ∈ H, and no roots in $\mathbb{F}_p$ otherwise, up to a finite number of exceptions.

Proof

If one tries to copy the construction of Ψ in Proposition 4.4.2, one might choose f to be the
minimal polynomial of $\sum_{h \in H} \omega^h$ where ω is a complex primitive nth root of unity.
However, when we need to show that

$$\left(\sum_{h \in H} \zeta^h\right)^p = \sum_{h \in H} \zeta^{ph}$$

is equal to $\sum_{h \in H} \zeta^h$ (where ζ is now a primitive nth root of unity in $\overline{\mathbb{F}}_p$) iff p (mod n) ∈ H we
run into trouble. First, notice that this is equivalent to

$$\sum_{h \in H} \omega^{ph} = \sum_{h \in H} \omega^h$$

for sufficiently large p by the fundamental theorem of symmetric polynomials. Indeed, suppose
there are infinitely many primes p ≡ k (mod n) such that f has a root in $\mathbb{F}_p$. Consider the
(absolute) norm of $\sum_{h \in H} \omega^{hk} - \sum_{h \in H} \omega^h$: if $\sum_{h \in H} \zeta^{hk} = \sum_{h \in H} \zeta^h$ this norm is divisible by p
so if it’s true infinitely many times it must be zero (which we want to show is false if k ∉ H).
Conversely, clearly if p (mod n) ∈ H, f has all its roots $\zeta^h$ in $\mathbb{F}_p$.

Unfortunately, it is not always true that $\sum_{h \in H} \omega^h$ is distinct from its conjugates $\sum_{h \in H} \omega^{kh}$ for
k ∉ H. Indeed, let n = 12 and H = {1, 7}: we get $\omega + \omega^7 = \omega(1 + \omega^6) = 0$ which is absolutely
not what we want.

However, if we think a bit about what we want, we realise that we wish that $\sigma_k(r) = r$ if and
only if k ∈ H where r ∈ Q(ω) is a root of f (since $\varphi_p(r) = \sigma_p(r)$). This is exactly what it means
for r to be in $\mathbb{Q}(\omega)^H$! Thus, we are done: just choose r to be a generator of $\mathbb{Q}(\omega)^H$.
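
The failing example n = 12, H = {1, 7} can be explored numerically. In the sketch below (names ours, floating-point only), `period(c)` is the H-invariant sum $\sum_{h \in H} \omega^{ch}$ and `stabiliser(c)` lists the labels k for which $\sigma_k$ fixes it. The choice c = 2 is just one candidate we try, not a construction from the text.

```python
import cmath
from math import gcd

n = 12
H = {1, 7}
units = [k for k in range(1, n) if gcd(k, n) == 1]
omega = cmath.exp(2j * cmath.pi / n)

def period(c):
    """The H-invariant sum r_c = sum over h in H of omega^(c*h)."""
    return sum(omega ** (c * h) for h in H)

def stabiliser(c, tol=1e-9):
    """Labels k such that sigma_k (omega -> omega^k) fixes r_c, since sigma_k(r_c) = r_(k*c)."""
    return {k for k in units if abs(period(k * c) - period(c)) < tol}

print(abs(period(1)))    # numerically zero: omega + omega^7 vanishes
print(stabiliser(1))     # every sigma_k fixes 0, so this sum is useless
print(stabiliser(2))     # r_2 = 2*omega^2 is fixed exactly by H
```

So in this example $r = \omega^2 + \omega^{14} = 2\omega^2$ is fixed by exactly the $\sigma_k$ with k ∈ H, and can serve as the generator r of $\mathbb{Q}(\omega)^H$.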


Exercise 6.3.17∗ . Convince yourself of this solution and fill in the details.

We now show how Galois theory implies the fundamental theorem of symmetric polynomials for
fields. First, we consider a corollary of the fundamental theorem of symmetric polynomials instead
of itself. We have seen numerous times that, if $\alpha_1, \ldots, \alpha_n$ are conjugate algebraic numbers and
$f \in \mathbb{Q}[X_1, \ldots, X_n]$ is symmetric, then $f(\alpha_1, \ldots, \alpha_n)$ is rational. This also follows from Galois theory: since embeddings
are injective, any σ ∈ Gal(K/Q) permutes $\alpha_1, \ldots, \alpha_n$ (since it also sends them to their conjugates),
where $K = \mathbb{Q}(\alpha_1, \ldots, \alpha_n)$. Thus,

σ(f (α1 , . . . , αn )) = f (σ(α1 ), . . . , σ(αn )) = f (α1 , . . . , αn )

as f is symmetric: f (α1 , . . . , αn ) is fixed by all Gal(K/Q) so is in Q.

The general case is actually almost the same, except that we replace $\alpha_1, \ldots, \alpha_n$ by variables
$X_1, \ldots, X_n$ and Q by $K = F(e_1, \ldots, e_n)$ for some base field F . Set $L = F(X_1, \ldots, X_n)$. Then,
L/K is finite and Galois as it is separable by Exercise 6.3.10³ and $X_1, \ldots, X_n$ are the roots of

$$\pi = T^n - e_1 T^{n-1} + \ldots + (-1)^n e_n.$$

Now, note that any σ ∈ Gal(L/K) acts as a permutation on $\{X_1, \ldots, X_n\}$ and is uniquely determined
from this action. Conversely, any $\sigma \in S_n$ extends to a unique element of Gal(L/K) given by

$$\sigma : f(X_1, \ldots, X_n) \mapsto f(X_{\sigma(1)}, \ldots, X_{\sigma(n)}).$$

Thus, $\mathrm{Gal}(L/K) \simeq S_n$, which means that $K = L^{S_n}$. However, $L^{S_n}$ is the field of rational functions
fixed by all permutations, i.e. the field of symmetric rational functions! Hence, we get that any
symmetric rational function $f \in L^{S_n}$ can be written as a rational function in $e_1, \ldots, e_n$ since
$K = F(e_1, \ldots, e_n)$. Only one thing remains to be done: we must prove that

$$F(e_1, \ldots, e_n) \cap F[X_1, \ldots, X_n] = F[e_1, \ldots, e_n].$$

This follows from an analogue of Proposition 1.1.1 (which works in any UFD): the elements of
$F[X_1, \ldots, X_n]$ are integral over $F[e_1, \ldots, e_n]$ and the only integral elements of $F(e_1, \ldots, e_n)$ are the
ones in $F[e_1, \ldots, e_n]$.

Since Galois theory is very much related to group theory (via the Galois group), we end the section
with two fundamental group theory results: the Lagrange and Cauchy theorems. The former generalises
Fermat’s little theorem. It shows that a subgroup of a finite group is just a subset closed under
multiplication (or addition depending on what your operation is), without needing the assumption
that it’s closed under inversion too. Keep in mind that a group is not necessarily abelian, so our proof
of Theorem 4.2.1 does not work for this. Also, remember that the identity e of a group G is an element
such that ge = eg = g for any g ∈ G. The latter constitutes a converse of Lagrange’s theorem when
the order is prime: it shows that, as long as p divides |G|, there is an element of order p.
Exercise 6.3.18∗ . Prove that the identity of a group is unique.

3 Again, Exercise 6.3.10 uses the primitive element theorem which uses the fundamental theorem of symmetric polynomials. This time however, this is not a problem because our use of the fundamental theorem of symmetric polynomials to prove the primitive element theorem also follows from Galois theory, as outlined above.

Theorem 6.3.2 (Lagrange’s Theorem)

Let $G$ be a group of cardinality $n$ (we also say $G$ has order $n$) with the operation $\cdot$. Then, for any $g \in G$, the order of $g$, meaning the smallest $k > 0$ such that $g^k = e$, divides $n$.

Proof

The proof will be combinatorial. Let $m$ be the order of $g$. Partition $G$ into orbits of the form $O_h = \{h, hg, hg^2, \ldots\}$. We claim that this is indeed a partition: if $O_h \cap O_{h'} \neq \emptyset$ then $O_h = O_{h'}$.

Indeed, if $hg^i = h'g^j$ for some $i, j$ then $hg^k = h'g^{j-i+k}$ for any $k$ so $O_h = O_{h'}$. Since each orbit has cardinality $m$, we conclude that $m \mid n$.
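
To make the orbit argument concrete, here is a small sketch that checks it in the symmetric group on four letters. The choice of group and all function names (`compose`, `order`, `orbits`) are ours and only serve as an illustration.

```python
from itertools import permutations

def compose(s, t):
    """Composition of permutations given as tuples: (s o t)(i) = s[t[i]]."""
    return tuple(s[t[i]] for i in range(len(t)))

def order(g):
    """Smallest k > 0 such that g^k is the identity."""
    e = tuple(range(len(g)))
    k, power = 1, g
    while power != e:
        power = compose(power, g)
        k += 1
    return k

G = list(permutations(range(4)))  # the 24 elements of S_4

def orbits(g):
    """Partition G into the orbits O_h = {h, hg, hg^2, ...} of the proof."""
    seen, parts = set(), []
    for h in G:
        if h in seen:
            continue
        orb, x = set(), h
        while x not in orb:
            orb.add(x)
            x = compose(x, g)  # multiply by g on the right
        parts.append(orb)
        seen |= orb
    return parts

g = (1, 2, 3, 0)  # a 4-cycle
print(order(g), [len(o) for o in orbits(g)])
```

For the 4-cycle above, the 24 elements split into 6 orbits of size 4, which is exactly the statement m | n.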


Exercise 6.3.19∗ . Prove the following refinement of Theorem 2.5.1: if G is a finite group and H a subgroup
of G, |H| divides |G|. Why does it imply Theorem 2.5.1?

Theorem 6.3.3 (Cauchy’s Theorem)

Let G be a finite group. If p | |G| is a rational prime, then, G has an element of order p.

Proof

The proof is again combinatorial (group theory is very combinatorial). Consider the set $S$ of $(g_1, \ldots, g_p) \in G^p$ such that $g_1 \cdot \ldots \cdot g_p = e$, the identity of $G$. There are $|G|^{p-1}$ such tuples (the first $p-1$ entries are arbitrary and determine the last one), and $p \mid |G|^{p-1}$. Now group them by circular permutations: consider the orbits

$$\{(g_1, \ldots, g_p), (g_2, \ldots, g_1), \ldots, (g_p, \ldots, g_{p-1})\}.$$

Each orbit has size 1 or $p$: indeed, if $\sigma$ denotes the circular permutation $(x_1, \ldots, x_p) \mapsto (x_2, \ldots, x_1)$, we have $\sigma^p = \mathrm{id}$ so the order of $\sigma$ divides $p$ by Lagrange's theorem (in the group of permutations of $(x_1, \ldots, x_p)$, which might not be $S_p$ for fixed $x_i$). (This can also be seen directly: if $\sigma^k(g_1, \ldots, g_p) = (g_1, \ldots, g_p)$, then $g_{i+kn} = g_i$ for every $n$, with indices taken modulo $p$, so if $k$ is invertible modulo $p$, $g_i = g_j$ for any $i \neq j$ which means the orbit has size 1, and otherwise the orbit has size $p$.)

Thus, modulo $p$, the cardinality of $S$ is congruent to the number of orbits of size 1. However, $(g, \ldots, g)$ is in $S$ iff $g^p = e$, i.e. $g$ has order 1 or $p$. Thus, if we let $n_p$ denote the number of elements of order $p$, we get
$$1 + n_p \equiv |S| \equiv 0 \pmod{p}$$
since $p \mid |S| = |G|^{p-1}$, which implies that $n_p$ is non-zero as wanted.
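
The counting in this proof can be replayed on a tiny example. The sketch below takes G to be the symmetric group on three letters and p = 3; this choice, and the names `S`, `fixed` and `n_p` in the code, are ours for illustration.

```python
from itertools import permutations, product

def compose(s, t):
    """Composition of permutations given as tuples: (s o t)(i) = s[t[i]]."""
    return tuple(s[t[i]] for i in range(len(t)))

G = list(permutations(range(3)))  # S_3, of order 6
e = tuple(range(3))
p = 3

def product_of(tup):
    """The product g_1 * ... * g_p of a tuple of group elements."""
    result = e
    for g in tup:
        result = compose(result, g)
    return result

# The set S of the proof: p-tuples whose product is the identity.
S = [tup for tup in product(G, repeat=p) if product_of(tup) == e]

# Orbits of size 1 under cyclic shifts are the constant tuples (g, ..., g).
fixed = [tup for tup in S if len(set(tup)) == 1]
n_p = len(fixed) - 1  # remove the tuple (e, ..., e)

print(len(S), len(fixed), n_p)
```

Here |S| = 6^2 = 36 and there are 3 constant tuples, so 1 + n_p = 3 is divisible by 3 and the two elements of order 3 are the 3-cycles.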


6.4 Splitting of Polynomials


We shall now discuss an application of the primitive element theorem, other than that it lets us build
Galois theory quickly. We have already seen in Section 5.2 that, when f ∈ Z[X] is non-constant, the
set P(f ) of rational primes p such that f (mod p) has a root in Fp is infinite. Here, we show a stronger
result, namely that Psplit (f ), the set of rational primes which don’t divide the leading coefficient of f
such that f is split modulo p, meaning that f has as many roots in Fp as its degree, is infinite.

Theorem 6.4.1

For any non-constant polynomial $f \in \mathbb{Z}[X]$ of leading coefficient $a$, there are infinitely many rational primes $p \nmid a$ such that $f \pmod{p}$ is split in $\mathbb{F}_p$, meaning that its roots in $\overline{\mathbb{F}_p}$ are all in $\mathbb{F}_p$.

Proof

Without loss of generality, we can assume $f \in \mathbb{Q}[X]$ is monic; the condition $p \nmid a$ becomes that $p$ doesn't divide the denominators of the coefficients. Let $\alpha_1, \ldots, \alpha_n$ be the roots of $f$, and $K = \mathbb{Q}(\alpha_1, \ldots, \alpha_n)$. By the primitive element theorem, there is a $\beta$ such that $K = \mathbb{Q}(\beta)$. Consider the minimal polynomial $\pi$ of $\beta$: we show that whenever $\pi \pmod{p}$ has a root in $\mathbb{F}_p$ and $p$ is sufficiently large, $f \pmod{p}$ is split in $\mathbb{F}_p$. By Theorem 5.2.1, there are infinitely many such primes.

Indeed, we know that $\beta$ generates the roots $\alpha_i$ of $f$ in $\mathbb{Q}$, so we might expect the same to hold in $\mathbb{F}_p$. It is in fact quite easy to show that this intuition holds true. Let $g_1, \ldots, g_n \in \mathbb{Q}[X]$ be such that $g_i(\beta) = \alpha_i$.

Let $p$ be greater than the denominators of the $g_i$ so that $g_i \pmod{p}$ makes sense and suppose $\beta_p \in \mathbb{F}_p$ is a root of $\pi \pmod{p}$. We shall show that
$$f \equiv \prod_{k=1}^{n} \left(X - g_k(\beta_p)\right) \pmod{p}.$$

Consider the coefficient of $X^{n-i}$ on the right-hand side: it is $\pm e_i(g_1, \ldots, g_n)$ evaluated at $\beta_p$. By assumption,

$$e_i(g_1, \ldots, g_n)(\beta) = a_i \in \mathbb{Q}$$

so that $\pi \mid e_i(g_1, \ldots, g_n) - a_i$. Using Gauss's lemma, this divisibility becomes a divisibility in $\mathbb{F}_p[X]$ (as $p$ doesn't divide the denominators of the $g_i$) which means that we also have

$$e_i(g_1, \ldots, g_n)(\beta_p) \equiv a_i.$$

This concludes the proof.
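
As an illustration of the difference between having a root and being split modulo p, the following brute-force sketch lists small primes for the sample polynomial X^3 − 2. The polynomial, the bound 200 and the helper names are our own choices, not taken from the text.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def count_roots(coeffs, p):
    """Number of distinct roots in F_p of a polynomial given by its
    coefficients, highest degree first (evaluated with Horner's rule)."""
    def value(x):
        acc = 0
        for c in coeffs:
            acc = (acc * x + c) % p
        return acc
    return sum(1 for x in range(p) if value(x) == 0)

f = [1, 0, 0, -2]  # X^3 - 2
degree = len(f) - 1

has_root = [p for p in primes_up_to(200) if count_roots(f, p) >= 1]
splits = [p for p in primes_up_to(200) if count_roots(f, p) == degree]

print("primes with a root:", has_root)
print("primes where f splits:", splits)
```

The split primes form a much sparser subset of the primes with a root; Theorem 6.4.1 says this subset is nevertheless infinite.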




As an important corollary, we get that any finite set of non-constant polynomials has infinitely many common prime divisors.

Corollary 6.4.1

For any non-constant polynomials f1 , . . . , fn ∈ Z[X], Psplit (f1 ) ∩ . . . ∩ Psplit (fn ) is infinite.

Proof

Apply Theorem 6.4.1 to f = f1 · . . . · fn .




From this, we can deduce the following very non-trivial result.

Corollary 6.4.2

Let n ≥ 1 be a rational integer. Any non-constant polynomial f ∈ Z[X] has infinitely many
prime factors p ≡ 1 (mod n).

Proof

Apply Corollary 6.4.1 to f and Φn .




Exercise 6.4.1. Does there exist an $a \not\equiv 1 \pmod{n}$ such that any non-constant $f \in \mathbb{Z}[X]$ has infinitely many prime factors congruent to $a$ modulo $n$?

6.5 Exercises
Field and Galois Theory
Exercise 6.5.1† . Let L/K be a finite separable extension of prime degree p. If f ∈ K[X] has prime
degree q and is irreducible over K but reducible over L, then p = q.

Exercise 6.5.2†. Let $L/K$ be a finite Galois extension and let $M/K$ be a finite extension. Prove that $\mathrm{Gal}(LM/M) \simeq \mathrm{Gal}(L/L \cap M)$. In particular, $[LM : M] = [L : L \cap M]$. Conclude that, if $L/K$ and $M/K$ are Galois, we have
$$[LM : K][L \cap M : K] = [L : K][M : K].$$

Exercise 6.5.3† (Artin). Let L be a field, and G ⊆ Aut(L) a finite subgroup of automorphisms of L.
Prove that L/LG is Galois with Galois group G.

Exercise 6.5.4†. Prove that, for any $n$, there is a finite Galois extension $K/\mathbb{Q}$ such that $\mathrm{Gal}(K/\mathbb{Q}) \simeq \mathbb{Z}/n\mathbb{Z}$.

Exercise 6.5.5† (Cayley’s Theorem). Let $G$ be a finite group. Prove that it is a subgroup of $S_n$ for some $n$. Conclude that there is a finite Galois extension $L/K$ of number fields such that $G \simeq \mathrm{Gal}(L/K)$.
(This is part of the inverse Galois problem. So far, it has only been conjectured that we can choose
K = Q.)

Exercise 6.5.6† (Dedekind’s Lemma). Let L/K be a finite separable extension in characteristic 0.
Prove that the K-embeddings of L are linearly independent.

Exercise 6.5.7† (Hilbert’s Theorem 90). Suppose $L/K$ is a cyclic extension in characteristic 0, meaning its Galois group $\mathrm{Gal}(L/K) \simeq (\mathbb{Z}/n\mathbb{Z}, +)$ for some $n$ (like $\mathrm{Gal}(\mathbb{F}_{p^n}/\mathbb{F}_p)$ or $\mathrm{Gal}(\mathbb{Q}(\exp(2i\pi/p))/\mathbb{Q})$). Prove that $\alpha \in L$ has norm 1 if and only if it can be written as $\beta/\sigma(\beta)$ for some $\beta \in L$, where $\sigma$ is a generator of the Galois group (element of order $n$).

Exercise 6.5.8. When are two number fields isomorphic?

Exercise 6.5.9† (Lüroth’s Theorem). Let K be a field and L a field between K and K(T ). Prove
that there exists a rational function f ∈ K(T ) such that L = K(f ).

nth Roots
Exercise 6.5.10†. Let $K$ be a field, $p$ a prime number, and $\alpha$ an element of $K$. Prove that $X^p - \alpha$ is irreducible over $K$ if and only if it has no root.

Exercise 6.5.11†. Let $f \in K[X]$ be a monic irreducible polynomial and $p$ a rational prime. Suppose that $(-1)^{\deg f} f(0)$ is not a $p$th power in $K$. Prove that $f(X^p)$ is also irreducible.

Exercise 6.5.12† (Vahlen, Capelli, Redei). Let K be a field and α ∈ K. When is X n − α irreducible
over K?

Exercise 6.5.13. Let $n \geq 1$ be an integer and $\zeta$ a primitive $n$th root of unity. What is the Galois group of $\mathbb{Q}(\sqrt[n]{2}, \zeta)$ over $\mathbb{Q}$?

Exercise 6.5.14†. Let $n \geq 1$ be an integer and $p_1, \ldots, p_m$ distinct rational primes. Prove that

$$[\mathbb{Q}(\sqrt[n]{p_1}, \ldots, \sqrt[n]{p_m}) : \mathbb{Q}] = n^m.$$

(This is a generalisation of Exercise 4.6.25†.)


Exercise 6.5.15† (Kummer Theory). Let $L/K$ be a finite Galois extension in characteristic 0. Suppose that $\mathrm{Gal}(L/K) \simeq \mathbb{Z}/n\mathbb{Z}$. If $K$ contains a primitive $n$th root of unity, prove that $L = K(\alpha)$ for some $\alpha$ with $\alpha^n \in K$.
Exercise 6.5.16† (Artin-Schreier Theorem). Let L/K be a finite extension such that L is algebraically
closed. Prove that [L : K] ≤ 2.

Constructibility and Solvability


Exercise 6.5.17. Given two points, you are allowed to draw the line between them, as well as the circle centred at one of the points and passing through the other. Initially, you may start with the points $(0, 0)$ and $(0, 1)$ and define additional points that way. We say a real number $r$ is constructible if the point $(0, r)$ is constructible. Prove that, if $x$ and $y$ are constructible, so are $x + y$, $xy$, $-x$, $\sqrt{|x|}$, and $\frac{1}{x}$ if $x \neq 0$.
Exercise 6.5.18† . Prove that a real number is constructible if and only if it is algebraic and the
degree of its splitting field, meaning the field generated by its conjugates, is a power of 2. Deduce that,
using only a straightedge (a non-graded ruler) and a compass,
1. A regular n-gon is constructible if and only if ϕ(n) is a power of 2.
2. It is not always possible to trisect an angle.
3. It is not possible to construct the side of a cube of volume 2.
4. It is not possible to square the circle, i.e. construct a square with the same area as the unit
circle. (You may assume that π is transcendental. This follows from Exercise 1.5.31† .)
Exercise 6.5.19† . We say a finite Galois extension L/K in characteristic 0 is solvable by radicals if
there is a tower of extensions
K = K0 ⊂ K1 ⊂ . . . ⊂ Km ⊇ L
such that Ki+1 is obtained from Ki by adjoining an nth root of some element of Ki to Ki , for some
n. We also say a group G is solvable if there is a chain 0 = G0 ⊂ G1 ⊂ . . . ⊂ Gm = G such that Gi is
normal in Gi+1 (see Exercise 6.3.14∗ ) and Gi+1 /Gi is cyclic. Prove that L/K is solvable by radicals if
and only if its Galois group is. (When L is the field generated by the roots of a polynomial f ∈ K[X],
L/K being solvable by radicals means that the roots of f can be written with radicals, which explains
the name.)
Exercise 6.5.20† . Let n ≥ 1 be an integer. Prove that Sn is not solvable for n ≥ 5. Conclude from
Exercise 6.5.22† that some polynomial equations are not solvable by radicals.4 (This is quite technical.)
Exercise 6.5.21† . We say a finite Galois extension L/K of real fields, i.e. L ⊆ R, is solvable by real
radicals if there is a tower of extensions

K = K0 ⊂ K1 ⊂ . . . ⊂ Km ⊇ L

such that Ki+1 is obtained from Ki by adjoining the nth root of some positive element of Ki to Ki .
Prove that L/K is solvable by real radicals if and only if [L : K] is a power of 2.
Exercise 6.5.22† . Let p be a prime number and G ⊆ Sp a subgroup containing a transposition τ
(see the paragraph after Definition C.3.2) and an element γ of order p. Prove that G = Sp . Deduce
that, if f ∈ Q[X] is an irreducible polynomial of degree p with precisely two non-real complex roots,
then the Galois group of the field generated by its roots (called its splitting field , because it is a field
where it splits) over Q is Sp .
4 If one only wants to show that there is no general formula, one doesn’t need to do the first part since the general polynomial $\prod_{i=1}^{n} (X - A_i) \in \mathbb{Q}(A_1, \ldots, A_n)[X]$ already has Galois group $S_n$ over $\mathbb{Q}(A_1, \ldots, A_n)$ (where $A_1, \ldots, A_n$ are formal variables).

Exercise 6.5.23†. Let $n$ be a positive integer. Prove that there is a number field $K$, Galois over $\mathbb{Q}$, such that $\mathrm{Gal}(K/\mathbb{Q}) \simeq S_n$. (You may assume the following result of Dedekind: if $f \in \mathbb{Z}[X]$ is a polynomial, for any prime number $p$ not dividing the discriminant $\Delta$ of $f$, the Galois group of $f$ over $\mathbb{F}_p$ is a subgroup of the Galois group of $f$ over $\mathbb{Q}$; see footnote 5.)

Cyclotomic Fields
Exercise 6.5.24† . Let ω be a primitive nth root of unity. When is Φm irreducible over Q(ω)?

Exercise 6.5.25†. Let $n$ be an integer and $m \in \mathbb{Z}/n\mathbb{Z}$ be such that $m^2 \equiv 1 \pmod{n}$. Prove that there exist infinitely many primes congruent to $m$ modulo $n$, provided that there exists at least one which is greater than $n^2$. (It is also true that our Euclidean approach to special cases of Dirichlet’s theorem only works for $m^2 \equiv 1 \pmod{n}$, see [29].)

Exercise 6.5.26† (Mann). Suppose that $\omega_1, \ldots, \omega_n$ are roots of unity such that $\sum_{i=1}^{n} a_i \omega_i = 0$ for some $a_i \in \mathbb{Q}$ and $\sum_{i \in I} a_i \omega_i \neq 0$ for any non-empty strict subset $I \subseteq [n]$. Prove that $\omega_i^m = \omega_j^m$ for any $i, j \in [n]$ where $m$ is the product of primes at most $n$.

Exercise 6.5.27. Which quadratic subfields does a cyclotomic field contain?

Exercise 6.5.28†. Prove the Gauss and Lucas formulas: given an odd squarefree integer $n > 1$, there exist polynomials $A_n, B_n, C_n, D_n \in \mathbb{Z}[X]$ such that
$$4\Phi_n = A_n^2 - (-1)^{\frac{n-1}{2}} n B_n^2 = C_n^2 - (-1)^{\frac{n-1}{2}} n X D_n^2.$$

Deduce that, given any non-zero rational number $r$, there are infinitely many pairs of distinct rational primes $(p, q)$ such that $r$ has the same order modulo $p$ and modulo $q$.

Exercise 6.5.29 (Inspired by USAMO 2007). Let $p$ be an odd prime and $n \geq 1$ an integer. Prove that the number
$$p^{2p^{n}} - 1$$
has at least $3n$ prime factors (counted with multiplicity).

Miscellaneous
Exercise 6.5.30†. Let $f \in \mathbb{Q}[X]$ be an irreducible polynomial of degree at least 2 with exactly one real root. Prove that the real parts of its non-real roots are all irrational.

Exercise 6.5.31† . Let K be a number field of degree n. Prove that there are elements α1 , . . . , αn of
K such that
OK ⊆ α1 Z + . . . + αn Z.
By showing that any submodule of a Z-module generated by n elements is also generated by n elements,
deduce that OK has an integral basis, i.e. elements β1 , . . . , βn such that

OK = β1 Z + . . . + βn Z.

Exercise 6.5.32† . Let f ∈ Q[X] be an irreducible polynomial of prime degree p and denote its roots
by α0 , . . . , αp−1 . Suppose that
λ0 α0 + . . . + λp−1 αp−1 ∈ Q
for some rational λi . Prove that λ0 = . . . = λp−1 .

Exercise 6.5.33† (TFJM 2019). Let $N$ be an odd integer. Prove that there exist infinitely many rational primes $p \equiv 1 \pmod{N}$ such that $x \mapsto x^{n+1} + x$ is a bijection of $\mathbb{F}_p$, where $n = \frac{p-1}{N}$.

Exercise 6.5.34† . Let f ∈ C(X) be a rational function, and suppose f sends rational integers algebraic
integers to algebraic integers. Prove that f is a polynomial.
5 The Galois group of a polynomial $f$ over a field $F$ is defined as the Galois group of its splitting field over $F$, i.e. as $\mathrm{Gal}(F(\alpha_1, \ldots, \alpha_k)/F)$, where $\alpha_1, \ldots, \alpha_k$ are the roots of $f$.



Exercise 6.5.35. Let $\alpha \in \overline{\mathbb{Z}}$ be an algebraic integer with minimal polynomial $f$. Prove that a rational prime $p$ not dividing the discriminant $\Delta$ of $f$ stays prime in $\mathcal{O}_{\mathbb{Q}(\alpha)}$ if and only if $f$ stays irreducible in $\mathbb{F}_p[X]$.

Exercise 6.5.36. Suppose $f \in \mathbb{R}[X]$ is positive on $\mathbb{R}$. Prove that there exist polynomials $g, h \in \mathbb{R}[X]$ such that $f = g(X)^2 + h(X)^2$.

Exercise 6.5.37. Suppose $f \in \mathbb{R}[X]$ is positive on $\mathbb{R}_{>0}$. Prove that there exist polynomials $g, h \in \mathbb{R}[X]$ such that $f = g(X)^2 + Xh(X)^2$.

Exercise 6.5.38 (Inspired by ISL 2020). Let $p$ be a prime and $f \in \mathbb{Z}[X]$ a polynomial. Alice and Bob play a game: Bob chooses two initial elements $\alpha, \beta \in \mathbb{F}_p$ and Alice iteratively replaces $\alpha$ by $f(\alpha)$ or by some $\alpha'$ such that $\alpha = f(\alpha')$. She wins if she can reach $\beta$, otherwise Bob wins. Prove that there exist infinitely many primes $p$ such that Bob is able to win.

Exercise 6.5.39. Prove that $\mathbb{F}_p(U, T)/\mathbb{F}_p(U^p, T^p)$ has no primitive element.
Chapter 7

Units in Quadratic Fields and Pell’s Equation

Prerequisites for this chapter: Chapter 2 for the whole chapter and Chapter 6 for Section 7.4.

7.1 Fundamental Unit


Recall that a unit α ∈ OK is an invertible element (in OK ), i.e. an element of norm ±1. By abuse of
terminology, we shall also call α a unit of K, even though all non-zero elements are units in K since
it’s a field.
Exercise 7.1.1∗ . Prove that α is invertible if and only if its norm is ±1.

We are interested in characterising units in quadratic fields. Notice that $a + b\sqrt{d} \in \mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ is a unit if and only if $\pm 1 = N(a + b\sqrt{d}) = a^2 - db^2$ so units in quadratic fields are deeply linked with the so-called Pell equation.

Note also that units are closed under multiplications since the norm is multiplicative. In particular,
if K has a unit which is not a root of unity, it has infinitely many units.

We shall prove that there always exists such a unit, but first we turn to the complex case, i.e. $\mathbb{Q}(\sqrt{d})$ for $d < 0$ (such a field is called a complex quadratic field), for which the situation is a lot simpler. Indeed, the norm of $a + b\sqrt{d}$ is $a^2 - db^2 \geq a^2 + b^2$ since $d < 0$. We thus get the following characterisation of units in complex quadratic fields.

Proposition 7.1.1

Let $d < 0$ be a squarefree rational integer. The units of $\mathbb{Q}(\sqrt{d})$ are $\{1, -1, i, -i\}$ for $d = -1$, $\{1, -1, j, -j, j^2, -j^2\}$ for $d = -3$, and $\{1, -1\}$ for other $d$.

Exercise 7.1.2∗ . Prove Proposition 7.1.1.



In the real case, however, the situation is completely different ($\mathbb{Q}(\sqrt{d})$ for $d > 0$ is called a real quadratic field). Indeed, there always exist infinitely many units. Before we prove this, let us talk about fundamental units. Notice that since $\mathbb{Q}(\sqrt{d}) \subseteq \mathbb{R}$, the only roots of unity it has are $\pm 1$.

Definition 7.1.1 (Fundamental Unit)

Let $K$ be a real quadratic field. A unit $\theta$ of $K$ is said to be a fundamental unit if it generates all other units of $K$, i.e. any unit has the form $\pm\theta^n$ for some $n \in \mathbb{Z}$.


We now show that, if there is a non-trivial unit, there always exists a fundamental unit greater
than 1. We will refer to this unit when we say "the fundamental unit".

Proposition 7.1.2

Any real quadratic field has a fundamental unit θ (unique with the additional condition θ > 1).

Proof of Proposition 7.1.2 assuming there is a non-trivial unit

The uniqueness is obvious: if $\alpha = \pm\beta^u$ and $\beta = \pm\alpha^v$, then $\alpha = \pm(\pm\alpha^v)^u$ so either $uv = 1$, i.e. $u = v = \pm 1$ and $\alpha = \pm\beta^{\pm 1}$ as wanted, or $\alpha$ is a root of unity (which means the only units are $\pm 1$ but we assumed there was a non-trivial unit).

Notice that, if $K$ has a unit $\alpha = a + b\sqrt{d} \neq \pm 1$, then it has a unit $\beta = |a| + |b|\sqrt{d} > 1$. Let $\theta$ be the smallest unit which is greater than 1; there exists one since there are only finitely many units in any interval $[a, b]$ for positive $a, b$.

Indeed, if $\theta \in [a, b]$ is a unit, then its conjugate $\overline{\theta} = \pm 1/\theta$ satisfies $|\overline{\theta}| \in [1/b, 1/a]$, so the minimal polynomial of $\theta$ has bounded coefficients which shows that there are a finite number of such $\theta$.

Now, we prove that all units are generated by $\theta$. Suppose for the sake of contradiction that $\varepsilon > 1$ is the smallest unit which is not a power of $\theta$ (we can do that for the same reasons as before). Since $\theta$ is the minimal unit greater than 1, we must have $\varepsilon > \theta$; but then $1 < \varepsilon/\theta < \varepsilon$ is a smaller unit which is not a power of $\theta$ and that is a contradiction.


It remains to prove that the units of a real quadratic field are not all trivial. We follow the proof
of Lagrange. First, we need a lemma.

Lemma 7.1.1 (Dirichlet’s Approximation Theorem)

Let $\alpha \in \mathbb{R}$ be a real number. For any rational integer $N > 0$, there are rational integers $p, q$ such that $0 < q \leq N$ and
$$|q\alpha - p| < \frac{1}{N}.$$

In particular, there are infinitely many pairs of rational integers $(p, q)$ such that $\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^2}$. This will prove to be very useful for finding units.

Proof

Consider the fractional parts of the numbers $0, \alpha, 2\alpha, \ldots, N\alpha$. They all lie in the intervals
$$\left[0, \frac{1}{N}\right), \left[\frac{1}{N}, \frac{2}{N}\right), \ldots, \left[\frac{N-1}{N}, 1\right)$$
so, by the pigeonhole principle, two of them must lie in the same interval. Thus, their difference has absolute value less than $\frac{1}{N}$, which is exactly what we were looking for:

$$|\alpha q - p| = |(\alpha q_1 - p_1) - (\alpha q_2 - p_2)| < \frac{1}{N}$$
where $q = |q_1 - q_2|$, $p = \pm(p_1 - p_2)$ and $p_1 = \lfloor \alpha q_1 \rfloor$, $p_2 = \lfloor \alpha q_2 \rfloor$.
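
The pigeonhole argument is constructive, and a direct transcription might look as follows. This is a floating-point sketch, so it is only reliable when no fractional part sits extremely close to an interval boundary; the function name `dirichlet_pair` is ours.

```python
from math import floor, sqrt

def dirichlet_pair(alpha, N):
    """Return (p, q) with 0 < q <= N and |q*alpha - p| < 1/N.

    Follows the proof: place the fractional parts of 0, alpha, ..., N*alpha
    into N intervals of length 1/N and wait for the first collision.
    """
    first_in_bin = {}
    for q in range(N + 1):
        frac = q * alpha - floor(q * alpha)
        index = min(int(frac * N), N - 1)  # which interval frac falls in
        if index in first_in_bin:
            q1 = first_in_bin[index]
            # q > q1, so the difference of the two multipliers is positive.
            return floor(q * alpha) - floor(q1 * alpha), q - q1
        first_in_bin[index] = q
    raise AssertionError("unreachable: N + 1 numbers in N intervals")

p, q = dirichlet_pair(sqrt(2), 10)
print(p, q, abs(q * sqrt(2) - p))
```

Applied to α = √d, pairs of this kind are exactly the ones fed into the proof of the existence of a non-trivial unit.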


Finally, we prove the existence of a non-trivial unit.

Proof of the existence of a non-trivial unit


Take $\alpha = \sqrt{d}$ in the Dirichlet approximation theorem. Suppose $|a - b\sqrt{d}| < \frac{1}{b}$ with $b > 0$. Then,
$$|a^2 - db^2| < \frac{a + b\sqrt{d}}{b} \leq \frac{2b\sqrt{d} + 1}{b} \leq 2\sqrt{d} + 1$$
as $a \leq b\sqrt{d} + \frac{1}{b}$.

In particular, since $a^2 - db^2$ is a bounded integer for infinitely many pairs $(a, b)$, some value $M$ must be reached infinitely many times by $a^2 - db^2$. Moreover, again by the pigeonhole principle, some pair $(a, b) \pmod{M}$ must be repeated infinitely many times. If $(a, b) \equiv (a', b') \pmod{M}$ and $a^2 - db^2 = M = a'^2 - db'^2$ then
$$\frac{a + b\sqrt{d}}{a' + b'\sqrt{d}} = \frac{(a + b\sqrt{d})(a' - b'\sqrt{d})}{M}$$
is an algebraic integer as $(a + b\sqrt{d})(a' - b'\sqrt{d}) \equiv (a + b\sqrt{d})(a - b\sqrt{d}) \equiv 0 \pmod{M}$ and has norm 1. We have found a non-trivial unit.


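Here is a table of the fundamental units for small $d$.

• $\theta_2 = 1 + \sqrt{2}$ (norm −1).

• $\theta_3 = 2 + \sqrt{3}$ (norm 1).

• $\theta_5 = \frac{1 + \sqrt{5}}{2}$ (norm −1).

• $\theta_6 = 5 + 2\sqrt{6}$ (norm 1).

• $\theta_7 = 8 + 3\sqrt{7}$ (norm 1).

• $\theta_{10} = 3 + \sqrt{10}$ (norm −1).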
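
The table can be reproduced by a naive search. The sketch below assumes d > 1 is squarefree and scans the coefficient b of √d upward, trying the smaller candidate for a first; the function name and the output convention (a, b, den, norm) are ours, and the search is far slower than continued-fraction methods for large d.

```python
from math import isqrt

def fundamental_unit(d):
    """Return (a, b, den, norm) describing the unit (a + b*sqrt(d))/den > 1
    with the smallest b (and, for that b, the smallest a).

    Assumes d > 1 is squarefree. den is 2 only when d = 1 (mod 4) and the
    unit genuinely has half-integer coordinates.
    """
    den = 2 if d % 4 == 1 else 1
    b = 1
    while True:
        # a^2 - d*b^2 = norm * den^2; norm -1 gives the smaller a, so try it first.
        for norm in (-1, 1):
            a2 = d * b * b + norm * den * den
            a = isqrt(a2) if a2 > 0 else 0
            if a > 0 and a * a == a2:
                if den == 2 and a % 2 == 0 and b % 2 == 0:
                    return a // 2, b // 2, 1, norm
                return a, b, den, norm
        b += 1

for d in (2, 3, 5, 6, 7, 10):
    print(d, fundamental_unit(d))
```

The first hit is the fundamental unit because the coefficients of the powers of a unit greater than 1 never decrease, which is why no larger power can be found before the generator itself.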

7.2 Pell-Type Equations


Notice that the fundamental unit may have norm 1 or norm −1. To have norm −1, a necessary
condition is that −1 is a quadratic residue modulo d. However, as Exercise 7.5.8† shows, it is not
sufficient and one cannot really predict which sign the norm of the fundamental unit will have. That
said, this condition is sufficient when d is a prime number.

Proposition 7.2.1

Let $p$ be a rational prime. The fundamental unit of $\mathbb{Q}(\sqrt{p})$ has norm −1 if and only if $\left(\frac{-1}{p}\right) = 1$, i.e. $p \equiv 1 \pmod{4}$ or $p = 2$.

Proof
 
If the fundamental unit has norm −1, then $\left(\frac{-1}{p}\right) = 1$ so it suffices to prove that the converse also holds. When $p = 2$, the fundamental unit indeed has norm −1. Now suppose $p \equiv 1 \pmod{4}$, and let $a + b\sqrt{p} > 1$ be the minimal unit of norm 1 of $\mathbb{Q}(\sqrt{p})$ with $a, b \in \mathbb{Z}$ (so not necessarily the fundamental unit). We have $a^2 - pb^2 = 1$ so, modulo 4, we get that $a$ is odd and $b$ is even (otherwise $a^2 - pb^2 \equiv -1$). Since

$$(a - 1)(a + 1) = pb^2$$

and $\gcd(a - 1, a + 1) = 2$, we must have $a \pm 1 = 2x^2$ and $a \mp 1 = 2py^2$ for some $x, y \in \mathbb{Z}$. If $a + 1 = 2x^2$, we get

$$2x^2 - 2py^2 = (a + 1) - (a - 1) = 2$$

which is impossible as $a + b\sqrt{p}$ was already the smallest unit with norm 1. Thus,

$$2x^2 - 2py^2 = (a - 1) - (a + 1) = -2$$

as wanted.
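
Proposition 7.2.1 is easy to test on small primes. The following brute-force sketch looks for a solution of x^2 − py^2 = −1 with y below a search bound; the bound `y_max` and the function name are our own choices, and a `None` result only means that nothing was found below the bound.

```python
from math import isqrt

def negative_pell(p, y_max=1000):
    """Return the solution (x, y) of x^2 - p*y^2 = -1 with the smallest
    y in [1, y_max], or None if there is none in that range."""
    for y in range(1, y_max + 1):
        x2 = p * y * y - 1
        x = isqrt(x2)
        if x * x == x2:
            return x, y
    return None

for p in (2, 3, 5, 7, 11, 13, 17, 29, 37, 41):
    print(p, p % 4, negative_pell(p))
```

A solution shows up exactly for p = 2 and the primes congruent to 1 modulo 4 in this list, as the proposition predicts.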


Note that when we proved the existence of a unit, we did not use the fact that $d$ was squarefree anywhere. Thus, we in fact get that the Pell equation $x^2 - dy^2 = 1$ has non-trivial integral solutions for any positive non-square $d$, and that all solutions are generated by the minimal one. However, we will also prove this from the existence of a fundamental unit.
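
Here is a small sketch of the statement that all solutions come from the minimal one, for the non-squarefree value d = 8. It assumes d is a positive non-square (otherwise the search below does not terminate); the function names and the choice d = 8 are ours.

```python
from math import isqrt

def minimal_solution(d):
    """Smallest positive solution (x, y) of x^2 - d*y^2 = 1, by brute force."""
    y = 1
    while True:
        x2 = d * y * y + 1
        x = isqrt(x2)
        if x * x == x2:
            return x, y
        y += 1

def generated_solutions(d, count):
    """First `count` positive solutions, obtained as powers of the minimal
    one: (x + y*sqrt(d)) * (x1 + y1*sqrt(d))."""
    x1, y1 = minimal_solution(d)
    x, y = x1, y1
    found = []
    for _ in range(count):
        found.append((x, y))
        x, y = x1 * x + d * y1 * y, x1 * y + y1 * x
    return found

def brute_force_solutions(d, y_max):
    """All positive solutions with y <= y_max, found independently."""
    found = []
    for y in range(1, y_max + 1):
        x = isqrt(d * y * y + 1)
        if x * x == d * y * y + 1:
            found.append((x, y))
    return found

print(generated_solutions(8, 4))
print(brute_force_solutions(8, 250))
```

Both lists agree, which illustrates (but of course does not prove) that no solution is missed by taking powers.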

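Write $d = uv^2$ where $u$ is the squarefree part of $d$, and let $\alpha$ be the fundamental unit of $\mathbb{Q}(\sqrt{u})$. Then, the (positive) solutions of $x^2 - uy^2 = 1$ have the form

$$x + y\sqrt{u} = \alpha^n.$$

Thus, we get
$$y = \frac{\alpha^n - \overline{\alpha}^n}{2\sqrt{u}}.$$
We want to know when $y$ is divisible by $v$, i.e. when

$$2v\sqrt{u} \mid \alpha^n - \overline{\alpha}^n = \alpha^n - \alpha^{-n}$$

which is equivalent to $2v\sqrt{u} \mid \alpha^{2n} - 1$ as $\alpha$ is a unit.

In fact, we can find when any non-zero $\beta \in \mathcal{O}_{\mathbb{Q}(\sqrt{u})}$ divides $\alpha^{2n} - 1$ exactly like we would in $\mathbb{Z}$. Indeed, $\mathcal{O}_{\mathbb{Q}(\sqrt{u})}$ modulo $\beta$ has a finite number of elements so $\alpha^{2n}$ cycles modulo $\beta$:

$$\alpha^{2i} \equiv \alpha^{2j} \iff \beta \mid \alpha^{2(i-j)} - 1$$

($\alpha$ is a unit so we can divide by it). Then, we can define the order of $\alpha^2$ modulo $\beta$ to be the smallest $m > 0$ such that $\alpha^{2m} \equiv 1$ to get that $\alpha^{2n} \equiv 1 \iff m \mid n$ which means that the solutions to $\beta \mid \alpha^{2n} - 1$ are generated by $\alpha^m$, the minimal solution, as wanted.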

Exercise 7.2.1∗. Prove that $\mathcal{O}_{\mathbb{Q}(\sqrt{u})}/\beta\mathcal{O}_{\mathbb{Q}(\sqrt{u})}$ is finite if $\beta \neq 0$.

Here is how our previous discussion translates. Actually, our statement is a bit more general because we also allow rings of the form $\mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right]$ when $d \equiv 1 \pmod{4}$, but these still have a non-trivial unit since we have shown that $\mathbb{Z}[\sqrt{d}]$ does (and the same proof as Proposition 7.1.2 shows that the minimal unit greater than 1 is fundamental).

Definition 7.2.1 (Fundamental Unit)

Let $\delta$ be a quadratic integer. A unit $\theta$ of $\mathbb{Z}[\delta]$ is said to be a fundamental unit if it generates all other units of $\mathbb{Z}[\delta]$, i.e. any unit has the form $\pm\theta^n$ for some $n \in \mathbb{Z}$.

Proposition 7.2.2

For any quadratic integer δ, Z[δ] always has a unique fundamental unit greater than 1.

Note that this differs from our previous definition (although the proof is the same as before) because $\mathbb{Z}[\delta]$ may not be $\mathcal{O}_{\mathbb{Q}(\sqrt{d})}$. We will also call $(x, y)$ the fundamental solution of $x^2 - dy^2 = 1$ if $x + y\sqrt{d}$ is the fundamental unit of $\mathbb{Z}[\delta]$, where $\mathbb{Z}[\delta] = \mathbb{Z}[\sqrt{d}]$ or $\mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right]$ ($d$ is not necessarily squarefree anymore).

We now discuss equations of the form $x^2 - dy^2 = k$ for some fixed $k$. As we have seen earlier with $k = -1$, it is very hard to determine when this equation has a solution, so we will instead show that all solutions are generated by the fundamental solution of $x^2 - dy^2 = 1$ and a finite number of pairs $(x_i, y_i)$ such that $x_i^2 - dy_i^2 = k$.

Proposition 7.2.3

Let $k \in \mathbb{Z}$ be a non-zero rational integer which is not a perfect square and $\theta$ the fundamental unit of $\mathbb{Z}[\delta]$. There exist elements $\alpha_1, \ldots, \alpha_n \in \mathbb{Z}[\delta]$ of norm $k$ such that the elements of $\mathbb{Z}[\delta]$ of norm $k$ are exactly those of the form $\pm\theta^i \alpha_j$.

Proof

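The proof is the same as before. For each $(a, b) \in (\mathbb{Z}/2k\mathbb{Z})^2$, pick an element $\alpha_{(a,b)} = \frac{a' + b'\sqrt{d}}{2} \in \mathbb{Z}[\delta]$ with $(a', b') \equiv (a, b) \pmod{k}$ of norm $k$ if there exists one. Then, if $\alpha = \frac{a + b\sqrt{d}}{2}$ has norm $k$,
$$\frac{\alpha}{\alpha_{(a,b) \pmod{k}}}$$
is a unit of $\mathbb{Z}[\delta]$ so has the form $\pm\theta^i$ for some $i$.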




Remark 7.2.1
Note that we can solve this equation in finite time, since it suffices to find elements of norm k
between 1 and the fundamental unit θ, as any solution greater than θ reduces to one smaller after
a division by a suitable power of θ.

To conclude this section, we consider the equation $ax^2 - by^2 = k$. Again, we will not determine when this has non-trivial solutions since the case $b = -1$ reduces to $y^2 - ax^2 = -1$. We shall get a characterisation of the solutions to these equations, albeit non-explicit and slightly cumbersome. Nevertheless, for any given values of $a, b, k$ one can compute all solutions explicitly with this.

Proposition 7.2.4

Let $a$ and $b$ be non-zero rational integers of the same sign such that $ab$ is not a square and let $\theta$ be the fundamental unit of $\mathbb{Z}[\sqrt{ab}]$. Further, let $k \neq 0$ be a rational integer. Then, there exist elements $\alpha_1, \ldots, \alpha_n \in \mathbb{Z}[\sqrt{ab}]$ of norm $k$ and rational integers $u_1, \ldots, u_n, m_1, \ldots, m_n$ such that the integral solutions of $ax^2 - by^2 = k$ are exactly the $x, y$ for which

$$x + y\sqrt{ab} \in \{\pm\theta^{u_i + jm_i}\alpha_i \mid i \in [n], j \in \mathbb{Z}\}.$$

Proof

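Solving $ax^2 - by^2 = k$ is equivalent to solving $(ax)^2 - aby^2 = ak$. We already know that the solutions of $x^2 - aby^2 = ak$ have the form $x + y\sqrt{ab} = \pm\alpha_i\theta^n$. We wish to know when $a$ divides
$$x = \frac{\alpha_i\theta^n + \overline{\alpha_i}\,\overline{\theta}^n}{2},$$
i.e. when $2a$ divides $\alpha_i\theta^{2n} + \overline{\alpha_i}$. Let $m_i > 0$ be the smallest integer such that $\alpha_i(\theta^{2m_i} - 1) \equiv 0$.

Either there is no solution to $\alpha_i\theta^{2n} \equiv -\overline{\alpha_i}$ or $u_i$ is a solution and all solutions are given by $n \equiv u_i \pmod{m_i}$. Indeed, $\alpha_i\theta^{2n} \equiv -\overline{\alpha_i}$ is then equivalent to

$$\alpha_i\theta^{2n} \equiv \alpha_i\theta^{2u_i} \iff \alpha_i(\theta^{2(n-u_i)} - 1) \equiv 0$$

since $\theta$ is a unit. Thus, it remains to prove that $\alpha_i\theta^{2m} \equiv \alpha_i$ if and only if $m_i \mid m$.

We proceed like we would when $\alpha_i = 1$. Suppose, for the sake of contradiction, that $\alpha_i\theta^{2m} \equiv \alpha_i$ and $m_i \nmid m$. Write the Euclidean division $m = qm_i + r$ with $0 < r < m_i$. Then,

$$\alpha_i \equiv \alpha_i\theta^{2m} = \alpha_i\theta^{2qm_i}\theta^{2r} \equiv \alpha_i\theta^{2r}$$

which is a contradiction since $m_i$ was assumed to be minimal.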




Remark 7.2.2
It is no coincidence that proving that

$$\alpha_i\theta^{2m} \equiv \alpha_i \pmod{2a}$$

iff $m_i \mid m$ was so similar to proving that $\theta^{2m} \equiv 1$ iff the order of $\theta^2$ divides $m$. This is because it is in fact the same result, but modulo $2a/\gcd(2a, \alpha_i)$. Since we have not defined the gcd in non-Bézout domains, we could not use this approach (the gcd is usually not a number but an ideal!).

7.3 Størmer’s Theorem


In this section, we focus on a very nice application of Pell equations, regarding consecutive S-units.

Definition 7.3.1 (S-Units)

Let $S$ be a finite set of rational primes. A rational number $r \in \mathbb{Q}$ is said to be an $S$-unit if the prime factors of its numerator and denominator are in $S$. Given a non-zero rational integer $s \in \mathbb{Z}$, we also say $r$ is an $s$-unit if all prime factors of the numerator and denominator of $r$ are prime factors of $s$.

Theorem 7.3.1 (Størmer’s Theorem)

For any set of rational primes $S$ of cardinality $n$, the equation $u - v = 1$ has at most $3^n$ solutions in positive integral $S$-units.

Here is how we will approach this theorem: if $v$ and $u = v + 1$ are $S$-units, then one of them is even so $2 \in S$. Thus, $4v(v + 1) = (2v + 1)^2 - 1$ is also an $S$-unit. Let $x = 2v + 1$. Write
$$x^2 - 1 = \prod_{p \in S} p^{k_p},$$
and for each $k_p \neq 0$, choose $d_p \in \{1, 2\}$ such that $d_p \equiv k_p \pmod{2}$ (otherwise set $d_p = 0$). Then, $\prod_{p \in S} p^{k_p - d_p}$ is a perfect square, say $y^2$. Letting $d = \prod_{p \in S} p^{d_p}$ we get the Pell equation

$$x^2 - dy^2 = 1.$$

Also, and this is the key point, note that $y$ is a $d$-unit by construction. We shall prove that the only possible solution to the Pell equation $x^2 - dy^2 = 1$ where $y$ is a $d$-unit is the fundamental solution. Thus, for each such Pell equation there is at most one corresponding pair of consecutive $S$-units. Since $d_p \in \{0, 1, 2\}$ for each $p$, there are $3^n$ equations to consider which yields the result.
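
To get a feeling for the bound, one can list the consecutive S-units below some limit by brute force. The sketch below takes S = {2, 3, 5} and a search limit of 10^5, both our own choices; a finite search can of course only give a lower bound on the number of pairs, while the theorem guarantees at most 3^3 = 27.

```python
def is_s_unit(m, S):
    """True if every prime factor of the positive integer m lies in S."""
    for p in S:
        while m % p == 0:
            m //= p
    return m == 1

S = [2, 3, 5]
limit = 10 ** 5  # arbitrary search bound, not part of the theorem

# Pairs (v, v + 1) of consecutive positive integral S-units below the limit.
pairs = [(v, v + 1) for v in range(1, limit)
         if is_s_unit(v, S) and is_s_unit(v + 1, S)]

print(len(pairs), pairs)
print("Stormer bound:", 3 ** len(S))
```

The search finds 10 pairs, the largest being (80, 81), comfortably below the bound of 27.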

Hence, we just need to prove the following proposition.

Proposition 7.3.1

For any positive rational integer d which is not a perfect square, the only possible positive solution
to the Pell equation x2 − dy 2 = 1 where y is a d-unit is the fundamental solution.

Proof
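Let $x + y\sqrt{d}$ be the fundamental unit of $\mathbb{Z}[\sqrt{d}]$. Since we are interested in positive solutions of the Pell equation, by Proposition 7.2.2, these are the pairs $(x_n, y \cdot y_n)$ with $x_n + y \cdot y_n\sqrt{d} = (x + y\sqrt{d})^n$, where
$$y_n = \frac{(x + y\sqrt{d})^n - (x - y\sqrt{d})^n}{2y\sqrt{d}}$$
(we normalise by $y$ so that $y_1 = 1$). If $y \cdot y_n$ is a $d$-unit, then so is $y_n$, so we want to show that if $y_n$ is a $d$-unit, then $n = 1$. Suppose there is a solution where $n \neq 1$. Since $y_m \mid y_n$ when $m \mid n$, we may assume $n = p$ is prime.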

We have
$$y_p = \binom{p}{1}x^{p-1} + \binom{p}{3}x^{p-3}y^2 d + \binom{p}{5}x^{p-5}y^4 d^2 + \ldots \qquad (*)$$
Let $q$ be a prime factor of $y_p$; by assumption $q \mid d$. Every term of this sum except the first one is divisible by $d$, thus
$$q \mid \binom{p}{1}x^{p-1} = px^{p-1}.$$
Since $x^2 - dy^2 = 1$, $x$ is coprime with $d$ so $q \mid p$ which means $q = p$. Thus, $y_p$ is a power of $p$ and in particular divisible by $p^2$ unless $p = 2$.

However, as $p \mid \binom{p}{k}$ for $0 < k < p$, if $p > 3$, every term of $(*)$ is divisible by $p^2$ except the first one, which is $px^{p-1}$. This is a contradiction.

It remains to settle the cases $p = 2$ and $p = 3$. The first one is trivial: we have $y_2 = 2x$ so $x = 1$ as it’s coprime with $d$, which is impossible since $x + y\sqrt{d}$ was a non-trivial unit.

Finally, in the case $p = 3$ we get $y_3 = 3x^2 + dy^2$, and since $x^2 - dy^2 = 1$ this means

$$y_3 = 3x^2 + (x^2 - 1) = (2x - 1)(2x + 1).$$

This is a product of two numbers which differ by 2, thus it can only be a power of 3 if $x = 1$ since the only powers of 3 which differ by 2 are 1 and 3. This is again impossible as $x + y\sqrt{d}$ is a non-trivial unit.


Exercise 7.3.1∗ . Prove that ym | yn iff m | n.



7.4 Units in Complex Cubic Fields and Thue’s Equation


In this section, we will prove that, for any finite set of rational primes $S$ and any fixed integer $k \neq 0$,
the equation $u - v = k$ has finitely many solutions in integral S-units. We will, however, not find an
explicit bound like we did in the last section for $k = 1$. In fact, our method cannot give bounds, and
it does not let us effectively compute all solutions for a fixed $k$ and $S$.
Exercise 7.4.1. Why does looking at the $(2^k)^2$ Pell-type equations $ax^2 - by^2 = k$ for squarefree integral
S-units $a, b$ not prove that $u - v = k$ has finitely many integral S-unit solutions?

Thus, instead of considering Pell-type equations $ax^2 - by^2 = k$ that usually have infinitely many
solutions, we shall consider equations of the form $ax^3 - by^3 = k$. Indeed, if $u = \prod_{p \in S} p^{r_p}$ and
$v = \prod_{p \in S} p^{s_p}$ then, by choosing $a_p, b_p \in \{1, 2, 3\}$ such that $a_p \equiv r_p \pmod 3$ and $b_p \equiv s_p \pmod 3$ and
defining $a = \prod_{p \in S} p^{a_p}$ and $b = \prod_{p \in S} p^{b_p}$, we get that $(\sqrt[3]{u/a}, \sqrt[3]{v/b})$ is a solution of one of the $3^k$
Thue equations
$$ax^3 - by^3 = k.$$
Indeed, it is a theorem of Thue that such an equation has only finitely many solutions for $k \neq 0$. This
is what we’ll prove in this section.

The theorem that u−v = k has finitely many integral S-units solutions is also known as Kobayashi’s
theorem. It is usually written as such:

Theorem 7.4.1 (Kobayashi’s Theorem)

Let M be an infinite set of rational integers with finitely many prime divisors, meaning that there
are finitely many rational primes which divide at least one element of M . Then, the translate
$k + M$ has infinitely many prime divisors for any rational integer $k \neq 0$.

Note that this is indeed equivalent to our result on the finiteness of integral S-units equations: M
has finitely many prime divisors if and only if all its elements are S-units for some finite S, and the
same holds for k + M . So if they’re both sets of S-units for some finite S, we can assume it’s the same
S for both of them but then the equation u − v = k has infinitely many solutions in integral S-units.
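
To make the statement concrete, the following Python sketch enumerates positive integral S-units up to a bound and lists the pairs with $u - v = k$. It is only an experiment under stated assumptions: the function names `s_units_up_to` and `s_unit_pairs` and the search bound are hypothetical, and a bounded search of course proves nothing about finiteness.

```python
def s_units_up_to(primes, bound):
    """All positive integers <= bound whose prime factors lie in `primes`."""
    units = {1}
    for p in primes:
        new = set()
        for u in units:
            while u <= bound:
                new.add(u)
                u *= p
        units = new
    return sorted(units)

def s_unit_pairs(primes, k, bound):
    """Pairs (u, v) of positive S-units with u - v = k and u, v <= bound."""
    units = s_units_up_to(primes, bound)
    unit_set = set(units)
    return [(v + k, v) for v in units if v + k in unit_set]

# Consecutive S-units for S = {2, 3}, searched up to one million.
print(s_unit_pairs([2, 3], 1, 10**6))
```

For $S = \{2, 3\}$ and $k = 1$ the two members of a pair are coprime, so one is a power of 2 and the other a power of 3, which explains why so few pairs show up.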

Thus, we need to prove the following special case of Thue’s theorem.

Theorem 7.4.2 (Thue)

For any non-zero rational integers $a$, $b$ and $k$, the equation $ax^3 + by^3 = k$ has finitely many
solutions in rational integers.
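
A small brute-force experiment illustrates the flavour of the theorem. This is a sketch only: the function `thue_solutions` and the size of the search box are assumptions made for illustration, and searching a finite box says nothing about solutions outside it.

```python
def thue_solutions(a, b, k, box):
    """Integer solutions of a*x^3 + b*y^3 = k with |x|, |y| <= box."""
    return [(x, y)
            for x in range(-box, box + 1)
            for y in range(-box, box + 1)
            if a * x**3 + b * y**3 == k]

# Solutions of x^3 + 2y^3 = 1 in a box of half-width 60.
print(thue_solutions(1, 2, 1, 60))
```

Contrast this with a Pell equation such as $x^2 - 2y^2 = 1$, where enlarging the box keeps producing new solutions.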

The equation $ax^2 - by^2 = k$ was linked to units in $\mathbb{Q}(\sqrt{b/a})$; thus, for $ax^3 + by^3 = k$ we will consider
units in $\mathbb{Q}(\sqrt[3]{b/a})$. When $a/b$ is a perfect cube this problem is more or less trivial so we can assume that
it isn’t the case, i.e. that $\mathbb{Q}(\sqrt[3]{b/a})$ is a field of degree 3.

Exercise 7.4.2∗ . Prove Theorem 7.4.2 in the case where a/b is a rational cube.

As before, we first consider the case where $a = 1$ and $k = 1$ since it corresponds to units of $\mathbb{Q}(\sqrt[3]{b})$.
Indeed, if $x^3 + by^3 = 1$, then $N(x + y\sqrt[3]{b}) = 1$. Why should we expect $\mathbb{Q}(\sqrt[3]{b})$ to have finitely many
such units when there are infinitely many of them for $K = \mathbb{Q}(\sqrt{d})$? It’s because this unit does not
have a term in $\sqrt[3]{b^2}$, so for instance units of that form are absolutely not closed under multiplication,
contrary to the quadratic case.

We shall find a characterisation of the units of $\mathbb{Q}(\sqrt[3]{b})$. This relies on the fact that $\mathbb{Q}(\sqrt[3]{b})$ is a
complex cubic field, not because it’s not real but because some of its conjugate fields $\mathbb{Q}(j\sqrt[3]{b})$ for some
primitive third root of unity $j$ aren’t.

We also say a number field $K$ is totally real if all its conjugate fields are real. Also, we say an
embedding $\sigma$ is real if $\sigma K$ is real, and complex otherwise. Complex embeddings come in pairs $\sigma, \overline{\sigma}$:
this will be quite useful for us as we only need to deduce information on an element of $\mathbb{Q}(\sqrt[3]{b})$ and one
of its conjugates to have information on all its conjugates.

We define fundamental units almost as before, but this time we don’t require θ to be non-trivial if
there are no non-trivial units (i.e. we allow θ = 1). There always exists a non-trivial unit, but since
this is non-trivial to show and we do not need it (it’s better for us if there are fewer units), we do not prove it.
Again, we say "unit of K" to mean "unit of OK ".

Definition 7.4.1 (Fundamental Unit For Complex Cubic Fields)

Let $K \subseteq \mathbb{R}$ be a complex cubic field. A unit $\theta \geq 1$ of $K$ is said to be a fundamental unit if it
generates all others: any unit can be written as $\pm\theta^n$.

Proposition 7.4.1

Let K ⊆ R be a complex cubic field. If K has a non-trivial unit, then K has a (unique)
fundamental unit.

Proof

Again, uniqueness is obvious from existence. Suppose K has a unit greater than 1, otherwise all
its units are ±1 so we can take θ = 1 (this is in fact impossible and even if it were possible we
wouldn’t call it a fundamental unit because the units are generated by no element).

We imitate the proof of Proposition 7.1.2. The key step is the existence of a minimal unit $\theta > 1$.
A unit greater than 1 exists, because if $\varepsilon \neq \pm 1$ is a unit then $|\varepsilon|^{\pm 1}$ is a unit greater than 1 for some choice
of $\pm 1$.

Now, let’s prove that a minimal one exists. As before, we prove that there are finitely many
units in any interval $[a, b]$ for positive $a, b$. Suppose $\varepsilon \in [a, b]$ is a unit. Let $\sigma, \overline{\sigma}$ be the complex
embeddings of $K$. Then,
$$1 = |\varepsilon\,\sigma(\varepsilon)\,\overline{\sigma}(\varepsilon)| = \varepsilon\,|\sigma(\varepsilon)|^2.$$
Thus, the absolute values of the conjugates of $\varepsilon$ are all bounded, which means that the minimal
polynomial of $\varepsilon$ has bounded coefficients: there exist finitely many such $\varepsilon$.

Finally, we again proceed as in the quadratic case. Let $\theta$ be the minimal unit greater than 1.
Suppose $\varepsilon > 1$ is the minimal unit which is not a power of $\theta$. Then, $\varepsilon > \theta$ by minimality of $\theta$,
which means $1 < \varepsilon/\theta < \varepsilon$, contradicting the minimality of $\varepsilon$.


Now that we have characterised units of $\mathbb{Q}(\sqrt[3]{b/a})$, we characterise elements of $\mathcal{O}_{\mathbb{Q}(\sqrt[3]{b/a})}$ of norm
$k$ for a fixed $k$. This is related to our equation $ax^3 + by^3 = k$: indeed, if $x, y$ is an integral solution of
this equation, then $N(ax + y\sqrt[3]{a^2 b}) = a^2 k$.

Proposition 7.4.2

Let $k \in \mathbb{Z}$ be a non-zero rational integer and $\theta$ the fundamental unit of $\mathbb{Q}(\sqrt[3]{d})$. There exist
elements $\alpha_1, \ldots, \alpha_n \in \mathbb{Z}[\sqrt[3]{d}]$ of norm $k$ such that the elements of $\mathcal{O}_K$ of norm $k$ all have the form
$\pm\theta^i \alpha_j$.

Proof

The proof is again the same as before. For each $(a, b, c) \in (\mathbb{Z}/k\mathbb{Z})^3$, pick an element of norm $k$
$$\alpha_{(a,b,c)} = a' + b'\sqrt[3]{d} + c'\sqrt[3]{d^2}$$
with $(a', b', c') \equiv (a, b, c) \pmod k$ if there exists one. Then, if $\alpha = a + b\sqrt[3]{d} + c\sqrt[3]{d^2}$ has norm $k$,
$$\frac{\alpha}{\alpha_{(a,b,c) \pmod k}}$$
is a unit of $\mathcal{O}_K$ so has the form $\theta^i$.




Remark 7.4.1
We did not consider the equation $N(\alpha) = k$ in $\mathcal{O}_K$ because that would require finding the structure
of $\mathcal{O}_K$, which is slightly cumbersome. This is left as Exercise 7.5.22 and we let the reader adapt
the proof for $\mathcal{O}_K$ (the conclusion is that the elements of norm $k$ are exactly those of the form
$\pm\theta^i\alpha_j$).

Finally, we prove Theorem 7.4.2.

Proof of Theorem 7.4.2


Let $K = \mathbb{Q}(\sqrt[3]{b/a}) = \mathbb{Q}(\sqrt[3]{d})$ where $d = a^2 b$, and denote its fundamental unit by $\theta$.
From the previous consideration and Proposition 7.4.2, it suffices to show that, for any non-zero
element $\alpha \in \mathcal{O}_K$, there are finitely many rational integers $n$ such that $\alpha\theta^n$ has the form $x + y\sqrt[3]{d}$.
If $\theta = \pm 1$ is trivial, the claim is obvious, thus suppose $\theta > 1$ is non-trivial.

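Let $j$ be a primitive third root of unity and $\sigma$ be the complex embedding of $K$ sending $\sqrt[3]{d}$ to
$j\sqrt[3]{d}$. Since $j^2 + j + 1 = 0$, $\alpha\theta^n$ has the form $x + y\sqrt[3]{d}$ if and only if
$$\alpha\theta^n + j\sigma(\alpha)\sigma(\theta)^n + j^2\overline{\sigma}(\alpha)\overline{\sigma}(\theta)^n = 0.$$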
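Indeed, if $\beta = r + s\sqrt[3]{d} + t\sqrt[3]{d^2}$, we have
$$3t\sqrt[3]{d^2} = (r + s\sqrt[3]{d} + t\sqrt[3]{d^2}) + (jr + j^2 s\sqrt[3]{d} + t\sqrt[3]{d^2}) + (j^2 r + js\sqrt[3]{d} + t\sqrt[3]{d^2}) = \beta + j\sigma\beta + j^2\overline{\sigma}\beta.$$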
Thus, we wish to show that the linear recurrence of algebraic numbers
$$\alpha\theta^n + j\sigma(\alpha)\sigma(\theta)^n + j^2\overline{\sigma}(\alpha)\overline{\sigma}(\theta)^n$$
has finitely many zeros. Suppose otherwise. By Corollary 8.4.2 of the Skolem-Mahler-Lech theorem 8.4.1 which will
be proven in Chapter 8, there exist two embeddings $\sigma_1$ and $\sigma_2$ such that
$$\frac{\sigma_1(\theta)}{\sigma_2(\theta)}$$
is a root of unity. By composing with another embedding, we may assume $\sigma_1 = \mathrm{id}$, and by
symmetry between $j$ and $j^2$ we may assume $\sigma_2 = \sigma$.

By an argument similar to Problem 6.3.1, we can show that the only roots of unity in $\mathbb{Q}(\sqrt[3]{d}, j)$
are $\pm 1$, $\pm j$ and $\pm j^2$. Hence, we must have $\theta/\sigma(\theta) \in \{\pm 1, \pm j, \pm j^2\}$. Write $\theta = x + y\sqrt[3]{d} + z\sqrt[3]{d^2}$.
Suppose $\theta/\sigma(\theta) = \pm 1$. We get
$$x + y\sqrt[3]{d} + z\sqrt[3]{d^2} = \pm(x + yj\sqrt[3]{d} + zj^2\sqrt[3]{d^2})$$
which means $y = z = 0$ as $j$ has degree two over $\mathbb{Q}(\sqrt[3]{d})$ (since it’s not in $\mathbb{Q}(\sqrt[3]{d}) \subseteq \mathbb{R}$) so
$1, \sqrt[3]{d}, \sqrt[3]{d^2}, j, j\sqrt[3]{d}, j^2\sqrt[3]{d^2}$ are $\mathbb{Q}$-linearly independent. This is impossible since $\theta$ is non-trivial by
assumption. The other cases yield similar contradictions, which finishes the proof.



Exercise 7.4.3∗. Prove that the only roots of unity of $\mathbb{Q}(\sqrt[3]{d}, j)$ are $\pm 1$, $\pm j$ and $\pm j^2$.

Exercise 7.4.4∗. Prove that $\theta/\sigma(\theta) \in \{\pm j, \pm j^2\}$ is also impossible.

Remark 7.4.2
In fact, if $K$ is a number field of degree $n$ with real embeddings $\tau_1, \ldots, \tau_r$ and complex embeddings
$\sigma_1, \overline{\sigma_1}, \ldots, \sigma_s, \overline{\sigma_s}$, the Dirichlet unit theorem states that the units of $K$ have the form
$$\zeta\,\varepsilon_1^{n_1} \cdot \ldots \cdot \varepsilon_{r+s-1}^{n_{r+s-1}}$$
where $\zeta$ is a root of unity and $\varepsilon_1, \ldots, \varepsilon_{r+s-1} \in K$ are multiplicatively independent units. The
cases we treated corresponded to $(r, s) = (2, 0)$ and $(r, s) = (1, 1)$ (although we didn’t prove they
were multiplicatively independent for complex cubic fields, i.e. that the units are not all roots of
unity).

Remark 7.4.3
Thue in fact proved more generally that if f ∈ Z[X, Y ] is an irreducible homogeneous polynomial
of degree n ≥ 3, i.e. f is homogeneous and f (X, 1) is irreducible in Z[X], the equation

f (x, y) = k

has a finite number of integral solutions x, y ∈ Z for any fixed k ∈ Z. In fact, this is deeply linked
with the irrationality measure of algebraic numbers (it can also be proven with p-adic methods
like the Skolem-Mahler-Lech theorem (see [7]) but Thue proved it that way).

The equality $f(x, y) = k$ yields $f(x/y, 1) = k/y^n$ which means that $x/y$ is very close to a root $\alpha$ of
$f(X, 1)$. In fact, it is equivalent to the finiteness of pairs of rational integers $(p, q)$ with $q \neq 0$ such that

$$\left|\alpha - \frac{p}{q}\right| < \frac{C}{q^n}$$

for any $C > 0$. Thue proved that there were finitely many pairs $(p, q)$ such that

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^{n/2+1+\varepsilon}}$$

for any $\varepsilon > 0$, thus establishing his theorem. See Silverman-Tate, [42, Chapter 5, Section 3].

We say a real number $\alpha \in \mathbb{R}$ has irrationality measure $\mu$ if $\mu$ is the smallest real number such that,
for any $\varepsilon > 0$, there are finitely many pairs of rational integers $(p, q)$ with $q \neq 0$ such that

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^{\mu+\varepsilon}}.$$

Dirichlet’s approximation theorem (Lemma 7.1.1) shows that any irrational real number has irrationality
measure at least 2. Conversely, the very deep Thue-Siegel-Roth theorem states that any irrational real
algebraic number has irrationality measure exactly 2 (Siegel proved you could take $\mu = 2\sqrt{n}$ and
Roth $\mu = 2$).

Remark 7.4.4
In fact, the S-unit equation u − v = 1 also has a finite number of rational solutions. This is
considerably harder than Kobayashi’s theorem.

7.5 Exercises
Diophantine Equations
Exercise 7.5.1† (ISL 1990). Find all positive rational integers $n$ such that $\frac{1^2 + \ldots + n^2}{n}$ is a perfect
square.

Exercise 7.5.2† (BMO 1 2006). Let $n$ be a rational integer. Prove that, if $2 + 2\sqrt{1 + 12n^2}$ is a rational
integer, then it is a perfect square.

Exercise 7.5.3. Find all rational integers n such that 2n + 1 and 3n + 1 are both perfect squares.

Exercise 7.5.4† (RMM 2011). Let $\Omega(\cdot)$ denote the number of prime factors counted with multiplicity
of a rational integer, and define $\lambda(\cdot) = (-1)^{\Omega(\cdot)}$. Prove that there are infinitely many rational integers $n$
such that $\lambda(n) = \lambda(n + 1) = 1$ and infinitely many rational integers $n$ such that $\lambda(n) = \lambda(n + 1) = -1$.

Exercise 7.5.5†. Let $k$ be a rational integer. Prove that there are infinitely many positive integers $n$ such
that $n^2 + k \mid n!$.

Pell-Type Equations
Exercise 7.5.6†. Let $d$ be a rational integer. Solve the equation $x^2 - dy^2 = 1$ over $\mathbb{Q}$.

Exercise 7.5.7. Let $p \equiv -1 \pmod 4$ be a rational prime. Prove that the equation $x^2 - py^2 = 2\left(\frac{2}{p}\right)$
has a non-trivial solution over $\mathbb{Z}$.

Exercise 7.5.8†. Prove that the equation $x^2 - 34y^2 = -1$ has no non-trivial solution in $\mathbb{Z}$ despite $-1$
being a square modulo 34.

Exercise 7.5.9. Solve the equation $3x^2 - 2y^2 = 10$ over $\mathbb{Z}$.

Fundamental Units
Exercise 7.5.10†. Let $d \equiv 1 \pmod 4$ be a squarefree integer, and suppose $\eta = \frac{a + b\sqrt{d}}{2} \notin \mathbb{Z}[\sqrt{d}]$ is the
fundamental unit of $\mathbb{Q}(\sqrt{d})$. Prove that $\eta^n \in \mathbb{Z}[\sqrt{d}]$ if and only if $3 \mid n$.

Exercise 7.5.11†. Let $d \neq 1$ be a squarefree rational integer, and suppose that $2^{2n} + 1 = dm^2$ for
some integers $n, m \geq 0$. Show that $2^n + m\sqrt{d}$ is the fundamental unit of $\mathbb{Q}(\sqrt{d})$, provided that $d \neq 5$.

Exercise 7.5.12†. Suppose that $d = a^2 \pm 1$ is squarefree, where $a \geq 1$ is some rational integer and
let $k \geq 0$ be a rational integer. Suppose that the equation $x^2 - dy^2 = m$ has a solution in $\mathbb{Z}$ for some
$|m| < ka$. For sufficiently large $d$, prove that $|m|$, $d + m$ or $d - m$ is a square.

Exercise 7.5.13†. Solve completely the equation $x^3 + 2y^3 + 4z^3 = 6xyz + 1$ which was seen in
Problem 6.2.2.

Exercise 7.5.14† (Weak Dirichlet’s Unit Theorem). Let K be a number field with r real embeddings
and s pairs of complex embeddings. Prove that there exist units ε1 , . . . , εk with k ≤ r + s − 1 such
that any unit of K can be written uniquely in the form

$$\zeta\,\varepsilon_1^{n_1} \cdot \ldots \cdot \varepsilon_k^{n_k}$$

for some integers ni and a root of unity ζ.

Exercise 7.5.15† (Gabriel Dospinescu). Find all monic polynomials $f \in \mathbb{Q}[X]$ such that $f(X^n)$ is
reducible in $\mathbb{Q}[X]$ for all $n \geq 2$ but $f$ is irreducible.

Miscellaneous
Exercise 7.5.16† (Liouville’s Theorem). Let $\alpha$ be an algebraic number of degree $n$. Prove that there
exists a constant $C > 0$ such that
$$\left|\alpha - \frac{p}{q}\right| > \frac{C}{q^n}$$
for any $p, q \in \mathbb{Z}$ (with $q > 0$).

Exercise 7.5.17†. Prove that $5n^2 \pm 4$ is a perfect square for some choice of $\pm$ if and only if $n$ is a
Fibonacci number.

Exercise 7.5.18† (ELMO 2020). Suppose n is a Fibonacci number modulo every rational prime.
Must it follow that n is a Fibonacci number?
Exercise 7.5.19† (Nagell, Ko-Chao, Chein). Let $p$ be an odd rational prime. Suppose that $x, y \in \mathbb{Z}$
are rational integers such that $x^2 - y^p = 1$. Prove that $2 \mid y$ and $p \mid x$. Deduce that this equation has
no solution for $p \geq 5$. (The case $p = 3$ is Exercise 8.6.29†.)

Exercise 7.5.20†. Prove that there are at most $3^{|S|}$ pairs of S-units distant by 2.
Exercise 7.5.21† . Assuming the finiteness of rational solutions to the S-unit equation u + v = 1 for
any finite S, determine all functions f : Z → Z such that m − n | f (m) − f (n) for any m, n and f is a
bijection modulo sufficiently large primes.

Exercise 7.5.22. Let $m$ be a rational integer. What are the integers of $\mathbb{Q}(\sqrt[3]{m})$?
Chapter 8

p-adic Analysis

Prerequisites for this chapter: Section A.1 for the whole chapter, Sections 6.2 and 6.4 for Section 8.4
and Chapter 2 for Section 8.5. Chapter 6 is recommended.

p-adic numbers have many applications and are absolutely fundamental in number theory nowadays.
That said, this chapter will be almost exclusively dedicated to proving the Skolem-Mahler-Lech theorem
8.4.1 and related results. We refer the reader to the Addendum 3A of [2] for more applications of p-adic
numbers.

8.1 p-adic Integers and Numbers


Again, this section will be a bit abstract. If you have trouble following, skip to Problem 8.3.1 for
motivation. In elementary number theory, when working with diophantine equations, it is often useful
to reduce the equation modulo a rational prime $p$. If that is not sufficient, one might look modulo $p^2$,
then modulo $p^3$, etc. p-adic numbers are what you get when you consider something modulo $p^n$ for all
$n$. More precisely, a p-adic integer is the data of an element of $\mathbb{Z}/p\mathbb{Z}$, of an element of $\mathbb{Z}/p^2\mathbb{Z}$, of an
element of $\mathbb{Z}/p^3\mathbb{Z}$, ..., such that these elements are compatible with each other (for instance, the element of $\mathbb{Z}/p^2\mathbb{Z}$ is
congruent to the element of $\mathbb{Z}/p\mathbb{Z}$ modulo $p$).

Definition 8.1.1 (p-adic Integer)

A p-adic integer a is a tuple

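$$(a_1, a_2, a_3, \ldots) \in \mathbb{Z}/p\mathbb{Z} \times \mathbb{Z}/p^2\mathbb{Z} \times \mathbb{Z}/p^3\mathbb{Z} \times \ldots$$

such that $a_i \equiv a_j \pmod{p^{\min(i,j)}}$ for any $i, j$. The set of p-adic integers is denoted $\mathbb{Z}_p$.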

Remark 8.1.1
We have precisely defined $\mathbb{Z}_p$ as the projective limit $\varprojlim \mathbb{Z}/p^n\mathbb{Z}$ of $(\mathbb{Z}/p^n\mathbb{Z})_{n \geq 1}$.

The p-adic integers $\mathbb{Z}_p$¹ form an integral domain under component-wise addition and multiplication,
meaning that
$$(a_1, a_2, a_3, \ldots)(b_1, b_2, b_3, \ldots) := (a_1 b_1, a_2 b_2, a_3 b_3, \ldots)$$
and
$$(a_1, a_2, a_3, \ldots) + (b_1, b_2, b_3, \ldots) := (a_1 + b_1, a_2 + b_2, a_3 + b_3, \ldots).$$

Exercise 8.1.1∗ . Check that Zp is an integral domain. What is its characteristic?


¹ Now you know why you shouldn’t use $\mathbb{Z}_n$ for $\mathbb{Z}/n\mathbb{Z}$! If you want a shorter notation you can use $\mathbb{Z}/n$.


Since p-adic integers are supposed to represent a tuple of local data modulo powers of $p$, it
makes sense to associate the rational integer $a \in \mathbb{Z}$ with the p-adic integer $(a \bmod p, a \bmod p^2, a \bmod p^3, \ldots)$.
Thus, by abuse of notation, we say $\mathbb{Z} \subseteq \mathbb{Z}_p$ because of this embedding.² In fact, since
$a \pmod{p^n}$ makes sense when $a$ is a rational number with denominator coprime with $p$, we even get

Z(p) ⊆ Zp

where Z(p) denotes the rational numbers with denominator coprime with p.
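
The embedding just described can be sketched in Python by truncating a p-adic integer to its first $n$ coordinates. This is only an illustration: the names `embed`, `is_compatible` and `mul` are hypothetical, and a finite tuple is of course only an approximation of an element of $\mathbb{Z}_p$.

```python
from fractions import Fraction

def embed(r, p, n):
    """First n coordinates (r mod p, ..., r mod p^n) of a rational r in Z_(p)."""
    r = Fraction(r)
    if r.denominator % p == 0:
        raise ValueError("denominator must be coprime with p")
    return tuple(r.numerator * pow(r.denominator, -1, p**i) % p**i
                 for i in range(1, n + 1))

def is_compatible(a, p):
    """Check that each coordinate reduces to the previous one."""
    return all(a[i + 1] % p**(i + 1) == a[i] for i in range(len(a) - 1))

def mul(a, b, p):
    """Component-wise product of two truncated p-adic integers."""
    return tuple(x * y % p**(i + 1) for i, (x, y) in enumerate(zip(a, b)))

minus_one = embed(-1, 5, 3)            # coordinates of -1 in Z_5
third = embed(Fraction(1, 3), 5, 3)    # coordinates of 1/3 in Z_5
print(minus_one, third, mul(embed(3, 5, 3), third, 5))
```

Multiplying the coordinates of $3$ and $1/3$ component-wise gives the coordinates of $1$, as expected from the ring structure.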

Exercise 8.1.2∗. Check that $a \mapsto (a \bmod p, a \bmod p^2, a \bmod p^3, \ldots)$ is indeed an embedding of $\mathbb{Z}_{(p)}$
into $\mathbb{Z}_p$, i.e. that it’s injective.

Remark 8.1.2
We use the notation Z(p) because it is the localisation of Z at the prime ideal (p).

Now, suppose we want to make sense of 1/p p-adically. This can’t be a p-adic integer because
1/p makes no sense modulo p. Thus, we define p-adic numbers by allowing a formal (subject to some
relations) division of p-adic integers by powers of p, i.e. we define Qp := Zp [1/p].

Definition 8.1.2 (p-adic numbers)

A p-adic number is an element of the form $p^k a$ for some $k \in \mathbb{Z}$ and $a \in \mathbb{Z}_p$. The set of p-adic
numbers is denoted $\mathbb{Q}_p$.

With this, we can now say (somewhat abusively) that $\mathbb{Q} \subseteq \mathbb{Q}_p$ by associating the rational number
$r = p^k a$ with $a \in \mathbb{Z}_{(p)}$ to $p^k(a \bmod p, a \bmod p^2, a \bmod p^3, \ldots)$. For instance, $1/p = p^{-1}(1, 1, 1, \ldots)$.

p-adic numbers now form a field, and as we have seen numerous times, working in a field is always
great. Here is how multiplication and addition are defined: let $x = p^k a$ and $y = p^m b$ be p-adic numbers.
Suppose without loss of generality that $m, k < 0$ otherwise they are p-adic integers. Multiplication is
defined as $(p^k a)(p^m b) = p^{k+m} ab$, and addition by

$$p^k a + p^m b = p^k(a + p^{m-k} b)$$

if $k \leq m$ and
$$p^k a + p^m b = p^m(p^{k-m} a + b)$$
if $k \geq m$.³

It remains to prove that every non-zero element of $\mathbb{Q}_p$ has a multiplicative inverse; so far we have only shown
that it is a ring. This is easy, but before we do it let us define the p-adic valuation of p-adic numbers.

Proposition 8.1.1 (Units of Zp )

The units in $\mathbb{Z}_p$ (we will also call them "units of $\mathbb{Q}_p$" abusively), $\mathbb{Z}_p^\times$, are the p-adic integers with
non-zero first coordinate: $a = (a_0, \ldots)$ and $a_0 \not\equiv 0 \pmod p$.

² $\mathbb{Z}$ is isomorphic to the subset of p-adic integers of the previous form; in general, when $f : S \to U$ is an injective
morphism, we call $f$ an embedding of $S$ into $U$. Notice that the regular embeddings of a number field are embeddings
into $\mathbb{C}$. See also Remark 6.2.2.
³ Technically, as we have defined p-adic numbers, the numbers $p^k(a_1, a_2, a_3, \ldots)$ and $(p^k a_1, p^k a_2, p^k a_3, \ldots)$ are distinct
for positive $k$. Indeed, we said our division by $p$ was formal, which means a p-adic number is a tuple $(k, a) \in \mathbb{Z} \times \mathbb{Z}_p$
which we write as $p^k a$. This is however very easy to fix: just identify these two p-adic numbers to be the same.

Proof

This is obvious: if $a_0 \equiv 0$ then $a_0 b_0 \equiv 0$ for any $b_0$ so $ab$ can never be $1 = (1, 1, 1, \ldots)$. Conversely,
if $a_0 \not\equiv 0$, the components of $a$ are all invertible since they are coprime with $p^n$ for any $n$, so

$$a^{-1} = (a_0^{-1}, a_1^{-1}, a_2^{-1}, \ldots).$$

Definition 8.1.3 (p-adic valuation)

Let $z \in \mathbb{Q}_p$ be a non-zero p-adic number. Write $z = p^k a$ where $a \in \mathbb{Z}_p^\times$ is a unit. The p-adic
valuation of $z$, $v_p(z)$, is the integer $k$. We also define $v_p(0) = +\infty$.

Of course, the p-adic valuation of rational integers is the same as the regular p-adic valuation. Now
it follows directly that $\mathbb{Q}_p$ is a field: if $z = p^{v_p(z)} a$, then $z^{-1} = p^{-v_p(z)} a^{-1}$.

To finish this section, we mention one nice property of p-adic numbers. Even if this proposition
does not convince you of the use of Qp , it should at least convince you that it is a very nice object.

Theorem 8.1.1 (Hensel’s Lemma)

Let $f \in \mathbb{Z}_p[X]$ be a polynomial. If, for some $a \in \mathbb{Z}_p$, $|f(a)|_p < 1$ and $|f'(a)|_p = 1$, then $f$ has a
unique root $\alpha \equiv a \pmod p$ in $\mathbb{Z}_p$.

Proof

This is almost exactly the regular Hensel lemma 5.3.1: if $f$ has a root $a$ modulo $p$, i.e. $|f(a)|_p < 1$,
such that $p \nmid f'(a)$, i.e. $|f'(a)|_p = 1$, then $f$ has a unique root $\alpha_k$ in $\mathbb{Z}/p^k\mathbb{Z}$ congruent to $a$ modulo
$p$. The number $\alpha = (\alpha_1, \alpha_2, \alpha_3, \ldots)$ is then the unique root of $f$ in $\mathbb{Q}_p$ congruent to $a$ modulo
$p$. The only difference is that, in our previous version of Hensel’s lemma, $f$ had coefficients in $\mathbb{Z}$
and not in $\mathbb{Z}_p$. However, it is easy to check that this does not change anything in the proof.


This usually reduces the problem of finding roots of polynomials in Qp to finding roots in Fp . For
instance, there is a square root of −1 in Q5 .
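
The lifting process behind Hensel’s lemma can be sketched as follows. The Newton-style update used here is one standard way to carry out the lift, chosen for this illustration rather than taken from the text, and the function name `hensel_lift` is hypothetical. The example computes the first coordinates of a square root of $-1$ in $\mathbb{Z}_5$, starting from the root $2$ of $X^2 + 1$ modulo $5$.

```python
def hensel_lift(f, df, a, p, n):
    """Lift a root a of f modulo p (with p not dividing f'(a)) to roots
    modulo p, p^2, ..., p^n. Returns the list of successive roots."""
    assert f(a) % p == 0 and df(a) % p != 0
    roots = [a % p]
    for k in range(2, n + 1):
        mod = p**k
        r = roots[-1]
        # Newton step: r - f(r)/f'(r), computed modulo p^k.
        r = (r - f(r) * pow(df(r), -1, mod)) % mod
        roots.append(r)
    return roots

roots = hensel_lift(lambda x: x * x + 1, lambda x: 2 * x, 2, 5, 4)
print(roots)
```

Each entry of the returned list is a root of $X^2 + 1$ modulo the corresponding power of 5 and reduces to the previous entry, so the list is exactly the beginning of a p-adic integer in the sense of Definition 8.1.1.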

8.2 p-adic Absolute Value


This p-adic valuation lets us define an absolute value on $\mathbb{Q}_p$: $|z|_p = p^{-v_p(z)}$ (and $|0|_p = 0$).

Definition 8.2.1 (p-adic Absolute Value)

The p-adic absolute value on $\mathbb{Q}_p$ is defined as $|\cdot|_p = p^{-v_p(\cdot)}$ (in particular $|p|_p = 1/p$). The regular
absolute value on $\mathbb{R}$ (or $\mathbb{C}$) will be denoted $|\cdot|_\infty$.

By an absolute value, we mean a function $|\cdot|_p : \mathbb{Q}_p \to \mathbb{R}_{\geq 0}$ which is multiplicative, zero only at
zero, and which satisfies the triangle inequality. The first two properties are obvious, and the last
one follows from the following stronger inequality.

Proposition 8.2.1 (Strong Triangle Inequality)*

For any p-adic numbers $x, y \in \mathbb{Q}_p$, we have $|x + y|_p \leq \max(|x|_p, |y|_p)$ with equality if $|x|_p \neq |y|_p$.

Proof

This is equivalent to $v_p(x + y) \geq \min(v_p(x), v_p(y))$ with equality if $v_p(x) \neq v_p(y)$, which is
obvious.


Notice that with this absolute value, the p-adic integers are now a ball: $\mathbb{Z}_p = \{z \in \mathbb{Q}_p : |z|_p \leq 1\}$
since p-adic integers are the p-adic numbers with non-negative p-adic valuation.

With this norm we can now define a distance on $\mathbb{Q}_p$: $d(x, y) = |x - y|_p$. This is completely analogous
to $\mathbb{R}$ and $\mathbb{C}$, but now two numbers are very close if their difference is divisible by a large power of $p$. With
this distance, we can define convergence: a sequence $(a_n)_{n \geq 0}$ of p-adic numbers converges to $a \in \mathbb{Q}_p$
if $d(a, a_n) \to 0$, i.e. $|a - a_n|_p \to 0$. This is also equivalent to $v_p(a - a_n) \to +\infty$. For instance, the
sequence $(p^n)_{n \geq 0}$ converges to 0 p-adically.
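
For rational numbers, which sit inside $\mathbb{Q}_p$, the valuation and absolute value are easy to compute. The sketch below (with hypothetical helper names `vp` and `abs_p`, restricted to rationals as an assumption) illustrates the strong triangle inequality and the fact that $p^n \to 0$.

```python
from fractions import Fraction

def vp(x, p):
    """p-adic valuation of a rational number x (infinity for x = 0)."""
    x = Fraction(x)
    if x == 0:
        return float("inf")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def abs_p(x, p):
    """p-adic absolute value of a rational number x, as an exact Fraction."""
    v = vp(x, p)
    return Fraction(0) if v == float("inf") else Fraction(p) ** (-v)

# The powers of 5 get smaller and smaller 5-adically.
print([abs_p(5**n, 5) for n in range(4)])

# Strong triangle inequality on a few sample rationals.
samples = [Fraction(3, 5), 50, Fraction(7, 25), -10, 1]
assert all(abs_p(x + y, 5) <= max(abs_p(x, 5), abs_p(y, 5))
           for x in samples for y in samples)
```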

Remark 8.2.1
We will usually use xn → 0 to mean that xn goes to 0 p-adically, but sometimes it will also mean
that xn → 0 over R. We hope that the distinction will be made clear from context; the latter will
normally be used when xn is a sequence of norms of p-adic numbers.
Similarly, we can define convergence of series $\sum_i a_i$: we say the series converges if its partial sums
$b_n = \sum_{i=0}^{n} a_i$ converge. Here is a fundamental proposition, which shows that the situation is very different
in the p-adic case compared to the real or complex case.

Proposition 8.2.2 (p-adic Convergence of Series)*


The series $\sum_i a_i$ converges if and only if $a_n \to 0$.

Proof
It is clear that if it converges, $a_n = \sum_{i=0}^{n} a_i - \sum_{i=0}^{n-1} a_i$ converges to 0. The surprising part is that
the converse also holds. If $a_n \to 0$, we can assume they are all p-adic integers since there will
only be a finite number of non-integral $a_i$ ($a_n \in \mathbb{Z}_p$ iff $|a_n|_p \leq 1$).

Consider the $k$th component of the series $\sum_i a_i$: it is the sum of the $k$th components of the $a_i$. But
since $a_m \to 0$, the $k$th component of $a_i$ is zero for sufficiently large $i$. Thus, the $k$th component
of $\sum_i a_i$ is a sum of a finite number of terms for each $k$, which means that they are all well-defined
and thus $\sum_i a_i$ is too. (Looking at the $k$th component is equivalent to reducing modulo $p^k$: there
are a finite number of $a_i$ not divisible by $p^k$ so the partial sums eventually stabilise modulo $p^k$,
which means that it converges p-adically since it is true for all $k$.)


Exercise 8.2.1∗ . Convince yourself of this proof.
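
The stabilisation of partial sums modulo $p^k$ can be observed on the geometric series $\sum_i p^i$, whose terms tend to 0 p-adically. The sketch below is an illustration under assumptions: the helper name `partial_sums_mod` is hypothetical, and the limit $1/(1-p)$ is the usual geometric-series value, which one can confirm by multiplying the partial sums by $1 - p$.

```python
def partial_sums_mod(terms, p, k):
    """Partial sums of an integer series, reduced modulo p^k."""
    mod = p**k
    total, sums = 0, []
    for t in terms:
        total = (total + t) % mod
        sums.append(total)
    return sums

p, k = 5, 4
sums = partial_sums_mod([p**i for i in range(10)], p, k)
# Inverse of 1 - p modulo p^k, i.e. the k-th coordinate of 1/(1 - p).
limit = pow((1 - p) % p**k, -1, p**k)
print(sums, limit)
```

From the fourth partial sum on, the value modulo $5^4$ no longer changes, in line with the proof above.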

Over $\mathbb{R}$ this is very wrong: the harmonic series $\sum_{i \geq 1} \frac{1}{i}$ diverges but $\frac{1}{i} \to 0$. As a corollary, we get
a very simple criterion for the convergence of a sequence $(a_n)_{n \geq 0}$.

Corollary 8.2.1*

A sequence $(a_n)_{n \geq 0}$ of p-adic numbers converges if and only if $a_{n+1} - a_n \to 0$.

Proof
Apply Proposition 8.2.2 to the series $\sum_i (a_{i+1} - a_i)$ (the $n$th partial sum is $a_n - a_0$).


Exercise 8.2.2∗. Prove that the strong triangle inequality also holds for series: if $a_i \to 0$ then
$\left|\sum_i a_i\right|_p \leq \max_i |a_i|_p$ with equality if the maximum is achieved only once.
Let us talk a bit more about the p-adic absolute value. Recall that real numbers are constructed
from rational numbers by adding the limits of sequences which should converge but do not in $\mathbb{Q}$. Here
is an example. If you write down the decimal digits of $\sqrt{2}$, you get a sequence of rational numbers
converging (in $\mathbb{R}$) to $\sqrt{2}$. But in $\mathbb{Q}$, this sequence does not have a limit (so does not converge) as
$\sqrt{2} \notin \mathbb{Q}$. You might ask "how do we determine which sequences should converge without having
defined $\mathbb{R}$ first?". This is achieved by the notion of a Cauchy sequence: a sequence $(a_n)_{n \geq 0}$ such that,
for any $\varepsilon > 0$, $|a_m - a_n| \leq \varepsilon$ for sufficiently large $m$ and $n$ ($m, n \geq N$ for some $N$).
This process is called completing $\mathbb{Q}$ with respect to $|\cdot|_\infty$, and $\mathbb{R}$ is said to be the completion of $\mathbb{Q}$
with respect to $|\cdot|_\infty$. For this reason we shall also denote $\mathbb{R}$ by $\mathbb{Q}_\infty$.⁴ We do not discuss the technical
details here, but it turns out the p-adic fields we constructed are the completions of $\mathbb{Q}$ with respect
to the p-adic absolute value $|\cdot|_p$ (the fact that Cauchy sequences converge follows from the stronger
Corollary 8.2.1).⁵ In fact, the only fields you can get by completing $\mathbb{Q}$ with respect to some absolute
value are $\mathbb{R}$ and the p-adic fields $\mathbb{Q}_p$ (thus called the completions of $\mathbb{Q}$) by Exercise 8.6.19†.⁶
Exercise 8.2.3∗ (Weak Approximation Theorem). Let S be a finite set of primes or ∞ and consider elements
(xp )p∈S such that xp ∈ Qp . Prove that, for any ε > 0, there is an x ∈ Q such that |x − xp |p < ε for all p ∈ S.
The completions of Q (and their finite extensions) are called local fields (because they have local
data) while Q and its finite extensions are called global fields.7 In ?? we will see one instance of how
you can piece local data together to get global data (the Hasse principle ??). More simply, though, we
have the following proposition.

Proposition 8.2.3 (Product Formula)

For any non-zero $x \in \mathbb{Q}$, we have
$$|x|_\infty \cdot \prod_p |x|_p = 1.$$

Exercise 8.2.4∗ . Prove the product formula.
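
The product formula is easy to test numerically on a given rational number, since only the primes dividing its numerator or denominator contribute a factor different from 1. The following sketch uses hypothetical helper names (`vp`, `prime_factors`, `product_formula`) and exact arithmetic with `Fraction`; it checks instances, it does not replace the proof asked for in the exercise.

```python
from fractions import Fraction

def vp(x, p):
    """p-adic valuation of a non-zero rational number x."""
    x = Fraction(x)
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def prime_factors(n):
    """Set of primes dividing the non-zero integer n (trial division)."""
    n, d, out = abs(n), 2, set()
    while d * d <= n:
        while n % d == 0:
            out.add(d)
            n //= d
        d += 1
    if n > 1:
        out.add(n)
    return out

def product_formula(x):
    """|x|_inf times the product of |x|_p over all primes p, for x != 0."""
    x = Fraction(x)
    result = abs(x)
    for p in prime_factors(x.numerator) | prime_factors(x.denominator):
        result *= Fraction(p) ** (-vp(x, p))
    return result

print(product_formula(Fraction(-360, 7)))
```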

8.3 Binomial Series


This section will be a bit more concrete. We wish to make sense of $a^b$ for p-adic numbers $a$ and $b$.
Actually, already over $\mathbb{Q}$, $a^b$ doesn’t always make sense in $\mathbb{Q}$: for instance $2^{1/2} \notin \mathbb{Q}$ (and if we consider
$\mathbb{Q}(\sqrt{2})$, then $\sqrt{2}^{\sqrt{2}}$ is not even algebraic by a deep result of Gelfond and Schneider). Thus we will only
try to make sense of $a^b$ for $b \in \mathbb{Z}_p$, although even there it won’t be defined canonically for all $a \in \mathbb{Q}_p$:
we will define $a^b$ only when $a \equiv 1 \pmod p$ and $b \in \mathbb{Z}_p$.
⁴ $\mathbb{Z}_\infty$ is sometimes thought of as $[-1, 1]$ since for $p \neq \infty$ we have $\mathbb{Z}_p = \{x \in \mathbb{Q}_p : |x|_p \leq 1\}$, but it doesn’t have
properties as nice as the other $\mathbb{Z}_p$. In ??, we will define $\mathbb{Z}_\infty$ to be $\mathbb{R}$ because it will suit our purposes.
⁵ The fact that its elements have such an explicit form in terms of $\mathbb{Q}$ is because $|\cdot|_p$ is non-Archimedean, i.e. satisfies
the strong triangle inequality Proposition 8.2.1.
⁶ There are absolute values different from $|\cdot|_\infty$ and $|\cdot|_p$ like $|\cdot|_\infty^2$, but completing $\mathbb{Q}$ with respect to $|\cdot|_\infty^2$ gives (a
field isomorphic to) $\mathbb{R}$.
⁷ Technically, there are other local or global fields as well, but these are the only ones in characteristic 0.

Write $a = 1 + u$, with $p \mid u$. Over $\mathbb{Z}$, we can define $(1 + u)^b$ for positive $b \in \mathbb{Z}$ as

$$\sum_k \binom{b}{k} u^k.$$

In fact, the same formula works for any $b \in \mathbb{Z}_p$ because $u^k \to 0$ so the series will converge. Let us
explain a bit more. We need the fundamental fact that $\mathbb{Z}$ and even $\mathbb{N}$ are dense in $\mathbb{Z}_p$.

Proposition 8.3.1

$\mathbb{N}$ is dense in $\mathbb{Z}_p$, meaning that for any $a = (a_1, \ldots) \in \mathbb{Z}_p$ and any $\varepsilon > 0$, there is a $b \in \mathbb{N}$ such
that $|a - b|_p < \varepsilon$. Similarly, $\mathbb{Q}$ is dense in $\mathbb{Q}_p$.

Proof

Simply pick $b \equiv a_n \pmod{p^n}$ for some large $n$: we get $|a - b|_p \leq p^{-n}$. For $\mathbb{Q}_p$ it’s Exercise 8.3.1∗.


Exercise 8.3.1∗ . Prove that Q is dense in Qp .

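Here is what this implies. For a fixed $k$, denote by $\binom{\cdot}{k} : \mathbb{Z}_p \to \mathbb{Q}_p$ the function

$$\binom{n}{k} = \frac{n(n-1) \cdot \ldots \cdot (n - (k-1))}{k!}.$$

This is a continuous function (Exercise 8.3.2∗) which satisfies $\left|\binom{n}{k}\right|_p \leq 1$ on $\mathbb{N}$. Since $\mathbb{N}$ is dense in
$\mathbb{Z}_p$, we in fact have $\left|\binom{n}{k}\right|_p \leq 1$ on $\mathbb{Z}_p$ so $\binom{\cdot}{k} : \mathbb{Z}_p \to \mathbb{Z}_p$ (Exercise 8.3.3∗). Finally, this means that, for
$|u|_p < 1$, the series

$$(1 + u)^n = \sum_{k=0}^{\infty} \binom{n}{k} u^k$$

converges for any $n$ in $\mathbb{Z}_p$ by Proposition 8.2.2 as $\left|\binom{n}{k} u^k\right|_p \leq |u|_p^k \to 0$. In fact, this is the unique
continuous extension of $n \mapsto (1 + u)^n$ from $\mathbb{N}$ to $\mathbb{Z}_p$ as $\mathbb{N}$ is dense in $\mathbb{Z}_p$.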


Exercise 8.3.2∗ . Let f ∈ Qp [X] be a polynomial. Prove that f is continuous on Qp .

Exercise 8.3.3∗. Let $f : \mathbb{Z}_p \to \mathbb{Q}_p$ be a continuous function. If $|f(x)|_p \leq 1$ for any $x$ in a dense subset (in
$\mathbb{Z}_p$), prove that $|f(x)|_p \leq 1$ for any $x \in \mathbb{Z}_p$.

Finally, we get the following proposition.

Proposition 8.3.2*

Let $|u|_p < 1$ be a p-adic number. Then

$$z \mapsto (1 + u)^z := \sum_{k=0}^{\infty} \binom{z}{k} u^k$$

defines a continuous function $\mathbb{Z}_p \to \mathbb{Z}_p$ such that for any $x, y \in \mathbb{Z}_p$ we have

$$(1 + u)^x (1 + u)^y = (1 + u)^{x+y}.$$



Proof

It suffices to note that $(1 + u)^x (1 + u)^y = (1 + u)^{x+y}$ for any $x, y \in \mathbb{N}$, thus for any $x, y \in \mathbb{Z}_p$ by
density.
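
To see the binomial series at work, one can truncate it modulo $p^N$. The sketch below assumes $u$ is an integer divisible by $p$ and $z$ is a rational number with denominator coprime with $p$; the names `binom` and `power_mod` are hypothetical. Terms with $k \geq N$ are divisible by $p^N$, so only the first $N$ terms matter.

```python
from fractions import Fraction

def binom(z, k):
    """Generalised binomial coefficient z(z-1)...(z-k+1)/k! as a Fraction."""
    z = Fraction(z)
    out = Fraction(1)
    for i in range(k):
        out = out * (z - i) / (i + 1)
    return out

def power_mod(u, z, p, N):
    """(1 + u)^z modulo p^N, for an integer u divisible by p and z in Z_(p)."""
    mod = p**N
    total = 0
    for k in range(N):
        c = binom(z, k) * u**k  # denominator is coprime with p
        total += c.numerator * pow(c.denominator, -1, mod)
    return total % mod

# A square root of 6 = 1 + 5 in Z_5, truncated modulo 5^6.
s = power_mod(5, Fraction(1, 2), 5, 6)
print(s, s * s % 5**6)
```

Squaring the truncated value of $(1 + 5)^{1/2}$ gives $6$ modulo $5^6$, which is the functional equation of Proposition 8.3.2 with $x = y = 1/2$.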


Before presenting an application, let us present a philosophical remark about p-adic numbers taken
from Evan Chen [10]. Imagine you are given the following problem: estimate $\frac{1}{1^2} + \ldots + \frac{1}{10000^2}$ to within
0.001. This is a statement solely about rational numbers, but it is considerably easier to solve if one
knows about real numbers:

$$\frac{1}{1^2} + \ldots + \frac{1}{10000^2} = \frac{\pi^2}{6} - \sum_{k=10001}^{\infty} \frac{1}{k^2}$$

and it is now very easy to estimate $\frac{\pi^2}{6}$ and $\sum_{k=10001}^{\infty} \frac{1}{k^2}$. Similarly, suppose you are given the following
problem.

Problem 8.3.1 (USA TST 2002 Problem 2)

Let $p > 5$ be a rational prime. Prove that the sum

$$f_p(x) = \sum_{k=1}^{p-1} \frac{1}{(px + k)^2}$$

does not depend on $x \in \mathbb{Z}$ modulo $p^3$.

We wish to compute this sum modulo $p^3$, that is, estimate this p-adic sum $S$ to a value $s \in \mathbb{Q}$ such
that $|S - s|_p \leq p^{-3}$. This is a statement about rational numbers, but it really helps to use p-adic
numbers to estimate it p-adically.

Solution

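We work in $\mathbb{Q}_p$. We have
$$\sum_{k=1}^{p-1} \frac{1}{(px + k)^2} = \sum_{k=1}^{p-1} \frac{1}{k^2}\left(1 + \frac{px}{k}\right)^{-2}$$
$$= \sum_{k=1}^{p-1} \frac{1}{k^2} \sum_{i=0}^{\infty} \binom{-2}{i}\left(\frac{px}{k}\right)^i$$
$$\equiv \sum_{k=1}^{p-1} \frac{1}{k^2}\left(\binom{-2}{0} + \binom{-2}{1}\frac{px}{k} + \binom{-2}{2}\frac{p^2x^2}{k^2}\right) \pmod{p^3}$$
$$= \sum_{k=1}^{p-1} \frac{1}{k^2} - 2xp\sum_{k=1}^{p-1} \frac{1}{k^3} + 3x^2p^2\sum_{k=1}^{p-1} \frac{1}{k^4}.$$

By Exercise 8.3.4∗, this is congruent to $\sum_{k=1}^{p-1} \frac{1}{k^2}$ modulo $p^3$, which proves the result.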


Exercise 8.3.4∗. Prove that, if $p > 5$ is a rational prime, $p^2 \mid \sum_{k=1}^{p-1} \frac{1}{k^3}$ and $p \mid \sum_{k=1}^{p-1} \frac{1}{k^4}$.
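
Both Problem 8.3.1 and the divisibilities of Exercise 8.3.4∗ can be checked on small primes with modular inverses. This is a sketch for experimentation only: the function name `f_mod` is hypothetical, and negative exponents in the three-argument `pow` require Python 3.8 or later.

```python
def f_mod(p, x):
    """f_p(x) = sum of 1/(p*x + k)^2 for 1 <= k < p, reduced modulo p^3."""
    mod = p**3
    return sum(pow(p * x + k, -2, mod) for k in range(1, p)) % mod

for p in (7, 11, 13):
    # f_p(x) takes a single value modulo p^3 as x varies.
    assert len({f_mod(p, x) for x in range(10)}) == 1
    # The two divisibilities used at the end of the solution.
    assert sum(pow(k, -3, p**2) for k in range(1, p)) % p**2 == 0
    assert sum(pow(k, -4, p) for k in range(1, p)) % p == 0
```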

Finally, we prove that $x \mapsto (1 + u)^x$ can be expanded as a power series in $x$. This will be useful for
proving the Skolem-Mahler-Lech theorem in Section 8.4. In fact we prove the following more general
result.

Proposition 8.3.3

Let $(a_n)_{n \geq 0}$ be a sequence of p-adic numbers such that $a_n \to 0$. If $a_k/k! \to 0$, the function
$$f(x) = \sum_{k=0}^{\infty} a_k \binom{x}{k}$$
defines a convergent power series on $\mathbb{Z}_p$.

We shall simply expand the binomial coefficients in terms of x and switch the double sums to get
a power series. For this, we need a lemma to switch double sums (of infinitely many terms), similar to
Proposition 8.2.2. Over R and C it’s usually tricky and not always true, but over Qp it’s very simple
like for Proposition 8.2.2.

Proposition 8.3.4 (Switching Double Sums)

Let $(a_{i,j})_{(i,j) \in \mathbb{N}^2}$ be a family of p-adic numbers. Suppose $a_{i,j} \to 0$ when $i + j \to \infty$ (meaning
that, for any $\varepsilon > 0$, there are finitely many pairs $(i, j)$ such that $|a_{i,j}|_p > \varepsilon$). Then,

$$\sum_{i=0}^{\infty} \sum_{j=0}^{\infty} a_{i,j} = \sum_{j=0}^{\infty} \sum_{i=0}^{\infty} a_{i,j}$$

(in particular, both series converge).

Exercise 8.3.5∗ . Prove Proposition 8.3.4.

Proof of Proposition 8.3.3

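Expand $k!\binom{x}{k} = x(x-1)\cdots(x-(k-1))$ as $\sum_i c_{i,k}x^i$, where $|c_{i,k}|_p \le 1$ as $c_{i,k} \in \mathbb{Z}$. By
Proposition 8.3.4, we get
\[ \sum_{k=0}^{\infty} a_k\binom{x}{k} = \sum_{i=0}^{\infty} x^i \sum_{k=0}^{\infty} c_{i,k}\frac{a_k}{k!} \]
as $|c_{i,k}a_k/k!|_p \le |a_k/k!|_p \to 0$ when $i + k \to \infty$ (recall that $c_{i,k} = 0$ for $i > k$).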


Finally, to conclude that $x \mapsto (1 + u)^x$ is a power series, by Proposition 8.3.3, we need to estimate
$|k!|_p$ to prove that we indeed have $u^k/k! \to 0$. This follows from the following proposition.

Proposition 8.3.5 (Legendre’s Formula)*

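Let $n \in \mathbb{N}$. We have
\[ v_p(n!) = \sum_{k=1}^{\infty} \left\lfloor \frac{n}{p^k} \right\rfloor = \frac{n - s_p(n)}{p-1}. \]
In particular, $v_p(n!) = \frac{n}{p-1} + o(n)$ and $|n!|_p = p^{-n/(p-1)+o(n)}$, where $o(n) = -\frac{s_p(n)}{p-1}$ is a quantity
such that $o(n)/n \to 0$.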

Remark 8.3.1
One might notice that for $u \in \mathbb{Q}_p$ and $p$ odd, $|u|_p < p^{-1/(p-1)}$ is equivalent to $|u|_p < 1$ because the only
non-zero values $|u|_p \le 1$ can take are $1, 1/p, 1/p^2, \ldots$. There is however a reason why we stated it that way:
it's because we can do algebraic number theory over $\mathbb{Q}_p$, and over extensions of $\mathbb{Q}_p$ we might have
$p^{-1/(p-1)} < |u|_p < 1$. (See Exercise 8.6.21†.)

Proof

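The first equality is left as Exercise 8.3.6∗. For the second one, write $n = n_mp^m + \ldots + n_1p + n_0$
the base $p$ expansion of $n$. Then,
\[ \left\lfloor \frac{n}{p^k} \right\rfloor = n_mp^{m-k} + \ldots + n_{k+1}p + n_k. \]

Thus,
\begin{align*}
v_p(n!) &= \sum_{k=1}^{\infty} \left\lfloor \frac{n}{p^k} \right\rfloor \\
&= \sum_{k=1}^{m} \sum_{i=k}^{m} n_ip^{i-k} \\
&= \sum_{i=0}^{m} \sum_{k=1}^{i} n_ip^{i-k} \\
&= \sum_{i=0}^{m} n_i \cdot \frac{p^i - 1}{p-1} \\
&= \frac{n - s_p(n)}{p-1}.
\end{align*}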
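
Legendre's formula is easy to test numerically. The following Python sketch (helper names are ours) compares a direct count of the factors of $p$ in $n!$ with the floor sum and with the digit-sum expression $(n - s_p(n))/(p-1)$.

```python
def vp_factorial_direct(n: int, p: int) -> int:
    """v_p(n!) obtained by counting the factors of p in each of 1, 2, ..., n."""
    total = 0
    for m in range(1, n + 1):
        while m % p == 0:
            m //= p
            total += 1
    return total


def vp_factorial_floor_sum(n: int, p: int) -> int:
    """v_p(n!) as the sum of floor(n / p^k) over k >= 1."""
    total, power = 0, p
    while power <= n:
        total += n // power
        power *= p
    return total


def digit_sum(n: int, p: int) -> int:
    """s_p(n): the sum of the digits of n written in base p."""
    s = 0
    while n:
        n, digit = divmod(n, p)
        s += digit
    return s


for p in (2, 3, 5, 7):
    for n in range(0, 200):
        direct = vp_factorial_direct(n, p)
        assert direct == vp_factorial_floor_sum(n, p)
        assert direct * (p - 1) == n - digit_sum(n, p)
print("Legendre's formula checked for n < 200 and p in {2, 3, 5, 7}")
```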


Exercise 8.3.6∗. Let $n \in \mathbb{N}$ be a positive rational integer and $p$ be a prime number. Prove that
\[ v_p(n!) = \sum_{k=1}^{\infty} \left\lfloor \frac{n}{p^k} \right\rfloor. \]

Corollary 8.3.1*

For any $|u|_p < p^{-1/(p-1)}$, the function $x \mapsto (1 + u)^x$ is a convergent power series on $\mathbb{Z}_p$.

Exercise 8.3.7∗ . Prove Corollary 8.3.1.
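
To get a feel for Corollary 8.3.1, one can observe the p-adic continuity of $x \mapsto (1+u)^x$ on ordinary integers. The sketch below takes $u = p$ with $p$ odd (our choice of example, so that $|u|_p = 1/p < p^{-1/(p-1)}$) and checks that exponents congruent modulo $p^k$ give values congruent modulo $p^{k+1}$. The function name and the sampling scheme are illustrative assumptions.

```python
def close_inputs_give_close_outputs(p: int, k: int, samples: int = 50) -> bool:
    """Check (1+p)^x == (1+p)^y mod p^(k+1) for sample pairs with x == y mod p^k."""
    modulus = p ** (k + 1)
    base = 1 + p
    for x in range(samples):
        # y differs from x by a multiple of p^k, so |x - y|_p <= p^(-k).
        y = x + (x + 1) * p ** k
        if pow(base, x, modulus) != pow(base, y, modulus):
            return False
    return True


for p in (3, 5, 7):
    for k in range(1, 5):
        print(p, k, close_inputs_give_close_outputs(p, k))
```

This is exactly the kind of behaviour one expects from a function that extends continuously from $\mathbb{Z}$ to $\mathbb{Z}_p$.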

8.4 The Skolem-Mahler-Lech Theorem


Our goal is to prove the Skolem-Mahler-Lech theorem 8.4.1, which says that the zeros of a linear
recurrence $(a_n)_{n\in\mathbb{Z}}$ of algebraic numbers⁸ are a union of a finite set and some arithmetic progressions;
this was used in Section 7.4 for instance. Here is how we are going to approach this theorem. There
are two main steps. For the sake of simplicity, we suppose in this sketch that $a_n = \sum_i f_i(n)\alpha_i^n$ where
$f_i \in \mathbb{Z}[X]$ and $\alpha_i \in \mathbb{Z}$.
8 Actually, it is also true for sequences in any field of characteristic 0, but we only prove it for sequences of algebraic

numbers. The general case is Exercise 8.6.40† .



1. Transform $(a_n)_{n\in\mathbb{Z}}$ into (the restriction of) multiple p-adic power series. $n \mapsto \alpha_i^n$ might not
define directly a p-adic power series with Corollary 8.3.1, but $n \mapsto \alpha_i^{(p-1)n}$ does since, by Fermat's
little theorem, $\alpha_i^{p-1} \equiv 1 \pmod p$. Hence $s_k = (a_{(p-1)m+k})_{m\in\mathbb{Z}}$ define $p-1$ convergent power
series on $\mathbb{Z}_p$.
2. Show that a convergent power series on Zp is either identically zero on Zp , or has finitely many
zeros in Zp (and thus in Z too). This means that each sk is either always zero or has finitely
many zeros which was what we wanted to show (the zeros of (an )n∈Z are a union of a finite set
and arithmetic progressions of the form ((p − 1)m + k)m ).
We now show how to derive the theorem assuming that a non-zero convergent power series on Zp
has finitely many zeros – this result will be proven in the next section, as a corollary to Strassmann’s
theorem, giving an explicit bound on the number of zeros such a function can have.

Theorem 8.4.1 (Skolem-Mahler-Lech Theorem)

Let $(u_n)_{n\in\mathbb{Z}}$ be a linear recurrence of algebraic numbers. The zeros $Z((u_n)_n)$ of $(u_n)_n$, i.e. the set
of $n$ such that $u_n = 0$, is the union of a finite set and a finite number of arithmetic progressions:
\[ Z((u_n)_n) = S \cup \bigcup_{i=0}^{k} (a_i + b_i\mathbb{Z}) \]
where $S$ is a finite set and $a_i, b_i \in \mathbb{Z}$.

Remark 8.4.1
The Skolem-Mahler-Lech theorem is also valid for sequences in arbitrary fields of characteristic
0. Skolem proved it for sequences of rational numbers, Mahler for algebraic numbers and Lech
for sequences in any field of characteristic 0. Thus, the above theorem could perhaps be called
the "Skolem-Mahler theorem".
Notice that this theorem is optimal: the sequence
\[ a_n = (n - s_1)\cdots(n - s_m)\cdot(\omega_1^{n-a_1} - 1)\cdots(\omega_k^{n-a_k} - 1), \]
where $\omega_i$ is a primitive $b_i$th root of unity, vanishes exactly on $\{s_1, \ldots, s_m\} \cup \bigcup_{i=1}^{k} (a_i + b_i\mathbb{Z})$.
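
Here is a tiny instance of the optimality construction, written as a Python sketch. The specific choices ($s_1 = 4$, $s_2 = 10$, $\omega_1 = -1$, $a_1 = 1$, $b_1 = 2$) and the function name are ours, picked only so that the sequence stays integer-valued.

```python
def a(n: int) -> int:
    """a_n = (n - 4)(n - 10)((-1)^(n-1) - 1), with omega = -1 a primitive 2nd root of unity."""
    # (-1)^(n-1) is computed through the parity of n - 1 so that negative n is handled too.
    omega_power = 1 if (n - 1) % 2 == 0 else -1
    return (n - 4) * (n - 10) * (omega_power - 1)


window = range(-20, 21)
zeros = [n for n in window if a(n) == 0]
# Expected shape: the finite set {4, 10} together with the progression 1 + 2Z.
expected = sorted({4, 10} | {n for n in window if n % 2 == 1})
print(zeros == expected)
```

On the window $[-20, 20]$ the zero set is the union of the finite set $\{4, 10\}$ and the odd integers, in line with the shape predicted by Theorem 8.4.1.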

Proof

Write $u_n = \sum_i f_i(n)\alpha_i^n$ where $f_i \in \overline{\mathbb{Q}}[X]$ and $\alpha_i \in \overline{\mathbb{Q}}$ by Theorem C.4.1. Note that we can
suppose without loss of generality that $(u_n)_{n\ge 0}$ takes rational values. Indeed, if $K$ is the field
generated by the $\alpha_i$ and the coefficients of the $f_i$ as well as all their conjugates, then, for any
$\sigma \in \mathrm{Gal}(K/\mathbb{Q})$, $u_n$ is zero iff
\[ \sigma(u_n) = \sum_i \sigma f_i(n)\sigma(\alpha_i)^n \]
is, so we can consider the norm $\prod_\sigma \sum_i \sigma f_i(n)\sigma(\alpha_i)^n$.

Choose a rational prime $p$ such that $\alpha_i$ and the coefficients of $f_i$ make sense in $\mathbb{F}_p$ for all $i$. This can
be done as follows: pick a non-zero $N \in \mathbb{Z}$ such that $Nf_i \in \overline{\mathbb{Z}}[X]$ and $N\alpha_i \in \overline{\mathbb{Z}}$ and then define
$h$ as the lcm of the minimal polynomials of $N\alpha_i$ and the minimal polynomials of the (non-zero)
coefficients of $Nf_i$. Then, choose a rational prime $p \nmid N$ such that $h$ splits in $\mathbb{F}_p$; there exists
such a prime by Theorem 6.4.1.
In addition, choose $p$ sufficiently large so that if $p \mid h(a)$ then $p \nmid h'(a)$; this can be done using
Bézout's lemma 5.4.1 as $h$ is squarefree so coprime with its derivative. Finally, we also want
the roots of $h$ in $\mathbb{F}_p$ to be non-zero; this is again true for sufficiently large $p$ as $h(0) \ne 0$ (since
$\alpha_i \ne 0$).

Thus, write $u_n = \sum_i g_i(n)\beta_i^n$ where $g_i \in \mathbb{Q}_p[X]$ and $\beta_i \in \mathbb{Q}_p$ by Theorem 8.1.1. Since $|\beta_i|_p = 1$ by
construction, we have $|\beta_i^{p-1} - 1|_p \le 1/p$ by Fermat's little theorem.

Thus, the function $n \mapsto \beta_i^{(p-1)n} = (1 + (\beta_i^{p-1} - 1))^n$ is a convergent power series on $\mathbb{Z}_p$ by
Corollary 8.3.1. To conclude, for a fixed $r \in \mathbb{Z}/(p-1)\mathbb{Z}$, the function
\[ n \mapsto \sum_i g_i((p-1)n + r)\beta_i^{(p-1)n+r} = u_{(p-1)n+r} \]
is a convergent power series on $\mathbb{Z}_p$, so is either identically zero or has finitely many zeros in $\mathbb{Z}_p$
and thus in $\mathbb{Z}$ by Strassmann's theorem 8.5.1.

Finally, put the zeros in the finite set when there are finitely many of them, and in an arithmetic
progression when the function is identically zero, and we are done.


Exercise 8.4.1∗ . Convince yourself of this proof.

Exercise 8.4.2∗ . Do you think this proof could be formulated without appealing to p-adic analysis?

Remark 8.4.2
One could wonder why the fact that $s \mapsto (\alpha^N)^s$ is analytic doesn't imply that $s \mapsto \alpha^s$ is as well, by
replacing $s$ by $s/N$. The problem is that this only gives us an analytic function $f$ which is equal
to $\alpha^n$ when $n$ is a rational integer divisible by $N$. More precisely, since $f(x + y) = f(x)f(y)$
for all $x, y \in \mathbb{Z}_p$, we know $f(1)$ is an $N$th root of $f(N) = \alpha^N$, but we don't know which one. As
we shall see very shortly, roots of unity are exactly the reason why some linear recurrences
can be zero infinitely many times without being identically zero.

Corollary 8.4.1

For any linear recurrence $(u_n)_{n\in\mathbb{Z}}$ of algebraic numbers, there are finitely many $\alpha \in \overline{\mathbb{Q}}$ such that
$(u_n)_n$ reaches $\alpha$ infinitely many times.

Proof

Write $u_n = \sum_i f_i(n)\alpha_i^n$. Our proof of Theorem 8.4.1 shows that the common difference depends
only on the number field $K$ generated by the coefficients of $f_i$ as well as the $\alpha_i$. Clearly, if $(u_n)_n$
reaches $\alpha$ then $\alpha \in K$.

This means that the common difference $d$ is the same for $(u_n)_n$ as well as the linear recurrence
$(u_n - \alpha)_n$, so if the latter vanishes infinitely many times then it vanishes on $d\mathbb{Z} + c$ for some $c$.
Thus, $(u_n)_n$ can take a value $\alpha$ infinitely many times only for at most $d$ values of $\alpha$; otherwise,
$(u_n - \alpha)_n$ and $(u_n - \beta)_n$ will vanish on the same $d\mathbb{Z} + c$, which is impossible for $\alpha \ne \beta$.


Here is a very nice corollary of the Skolem-Mahler-Lech theorem.

Corollary 8.4.2
Suppose $a_n = \sum_i f_i(n)\alpha_i^n$ is a linear recurrence of algebraic numbers which is zero infinitely
many times but not identically zero. Then, $\alpha_i/\alpha_j$ is a root of unity for some $i \ne j$.

Note that this is not a weak result at all: if the field $K$ generated by the $\alpha_i$ has exactly $N$ roots of
unity, then, for any fixed $m$, the sequence $(u_{Nn+m})_{n\in\mathbb{Z}}$ is a linear recurrence such that the quotient

\[ \alpha_i^N/\alpha_j^N = (\alpha_i/\alpha_j)^N \]

of two distinct roots of its characteristic polynomial is never a root of unity since $\omega^N = 1$ for any root
of unity $\omega \in K$. Thus, we can partition $(u_n)_{n\in\mathbb{Z}}$ into subsequences of the form $(u_{Nn+m})_{n\in\mathbb{Z}}$, and each
of these subsequences must either be always zero or zero only finitely many times. If we are dealing with
sequences of integers, we can even combine this with Corollary 8.4.1 to get that each subsequence must
be constant or tend to infinity in absolute value.

Exercise 8.4.3∗. Prove that any number field $K$ has a finite number $N$ of roots of unity, and that $\omega^N = 1$ for
any root of unity $\omega$ of $K$. (In other words, the roots of unity of $K$ are exactly the $N$th roots of unity.)

We need a lemma to prove this corollary, which was already used in the proof of Theorem C.4.1.

Lemma 8.4.1

Let $K$ be a field of characteristic 0. If
\[ u_n = \sum_{i=1}^{k} g_i(n)\beta_i^n = 0 \]
for any $n \in \mathbb{Z}$, where $g_i \in K[X]$ and $\beta_i \in K$ are both non-zero for all $i$, then $\beta_i = \beta_j$ for some
$i \ne j$.

Proof of Corollary 8.4.2 using the Lemma

If $u_n = \sum_i f_i(n)\alpha_i^n$ is infinitely many times zero for some non-zero $f_i \in \overline{\mathbb{Q}}[X]$ and non-zero
$\alpha_i \in \overline{\mathbb{Q}}$, then
\[ u_{r+sn} = \sum_i \alpha_i^r f_i(r + sn)(\alpha_i^s)^n \]
is identically zero for some $s \ne 0$ and some $r$ by Theorem 8.4.1. Thus, by the lemma, we must have
$\alpha_i^s = \alpha_j^s$ for some $i \ne j$, which implies that $\alpha_i/\alpha_j$ is a root of unity as wanted.


Proof of the Lemma

We prove the contrapositive: if $\beta_1, \ldots, \beta_k$ are all distinct then $g_i = 0$ for all $i$. We proceed by
induction on $\sum_i \deg g_i$; the base case follows from the Vandermonde determinant C.3.2. For the
induction step, suppose $\deg g_1 \ge 1$ without loss of generality. Consider the sequence
\[ v_n = u_{n+1} - \beta_1u_n = \sum_i (\beta_ig_i(n+1) - \beta_1g_i(n))\beta_i^n. \]
Since $\deg(\beta_ig_i(X+1) - \beta_1g_i) \le \deg g_i$ for $i \ge 1$ and $\deg(\beta_1(g_1(X+1) - g_1)) \le \deg g_1 - 1$, by the
inductive hypothesis we have $\beta_ig_i(X+1) - \beta_1g_i = 0$ for all $i$. This means that the $g_i$ are constant,
but we have already treated this case so we are done.


Alternative Proof of the Lemma for K = Q, using Algebraic Number Theory

Here is an alternative proof, which in this case is less efficient than the first one but that we
still present because it is neat. Using an argument similar to Exercise 8.6.40†, one can also
adapt it to work over any characteristic 0 field. Consider an $N$ such that $g_1(N), \ldots, g_k(N) \ne 0$.
Pick a large prime $p$ such that $g_i$ and $\beta_i$ make sense modulo $p$, using Theorem 6.4.1, and write
$u_n \equiv \sum_i g_i'(n)(\beta_i')^n$ where $g_i' \in \mathbb{F}_p[X]$ and $\beta_i' \in \mathbb{F}_p$. By picking $p$ sufficiently large, suppose also
that $p \nmid g_i'(N), \beta_i'$ for each $i$.

Since the order of $\beta_i'$ is coprime with $p$ (it divides $p^m - 1$ for some $m$), using CRT we can choose
an $M$ such that $(\beta_i')^M = 1$ for each $i$ and $g_i'(M) = g_i'(N)$. Thus we have $\sum_i g_i'(N)(\beta_i')^n = 0$ for all
$n$. By Vandermonde C.3.2 this implies
\[ \prod_{i\ne j} (\beta_i - \beta_j) \equiv \prod_{i\ne j} (\beta_i' - \beta_j') = 0. \]

Since this is true for infinitely many primes, the LHS is zero too, i.e. $\beta_i = \beta_j$ for some $i \ne j$.


Remark 8.4.3
Note that we again proved a global statement in a local way, although we only used finite fields
instead of p-adic numbers here. Results like Exercise 8.6.40† prove that we can even prove results
about C that seem analytic or algebraic in nature with number theory. In fact, the only known
proofs of Skolem-Mahler-Lech are p-adic in essence.

Remark 8.4.4
It is interesting to note that our proof of the Skolem-Mahler-Lech theorem is non-effective. We can find
the common difference of the arithmetic progressions if the sequence has infinitely many zeros,
although our proof is not excellent for this because we chose to work over $\mathbb{Q}_p$ instead of finite extensions
of $\mathbb{Q}_p$, and we can also bound the size of the additional finite set with Theorem 8.5.1, but we
cannot decide if a linear recurrence has a zero or not. This defect is shared by all known proofs.

Remark 8.4.5
Our bound on the common difference of the arithmetic progressions is very weak because we do
not know how big the least prime $p$ such that everything exists in $\mathbb{F}_p$ is. Using finite extensions of $\mathbb{Q}_p$
and the theory of finite fields, one can get a way better bound, as we can now choose the smallest
$p$ such that the algebraic numbers are non-zero in $\overline{\mathbb{F}_p}$ (see Exercise 8.6.21† for how to extend the
p-adic absolute value to finite extensions).

8.5 Strassmann’s Theorem


Finally, we prove here that a non-zero convergent power series on $\mathbb{Z}_p$ has finitely many zeros, giving an explicit
bound on the number of zeros. With the bound we get, we are even able in some situations to determine all
zeros of certain linear recurrences, and thereby solve certain diophantine equations.

Remark 8.5.1
This theorem highlights another big difference between p-adic analysis and real or complex anal-
ysis. In real or complex analysis, the same result holds (a non-zero power series has finitely many
zeros on the unit ball {z | |z| ≤ 1}), but for another reason: one can prove that convergent power
series are analytic, i.e. can be expanded as power series in x − α locally around α for any α ∈ Zp ,
and that the zeros of an analytic function (on a connected open set) are isolated (the so-called
principle of isolated zeros). Since the unit ball {z | |z| ≤ 1} is compact, this implies that there are
finitely many zeros. Over Zp , convergent power series are still analytic, and Zp is still compact.

However, Zp is not connected – in fact it is even totally disconnected (the only connected sets of
Zp are points) – so the principle of isolated zeros fails: if a zero α of f is not isolated, we merely get
that f is locally zero around α, and that is not enough to conclude that f is globally zero. For
instance, the function f defined by f (x) = 0 if |x|p < 1 and f (x) = 1 if |x|p = 1 can be seen to
be analytic on Zp , yet is not identically zero but has infinitely many zeros.

Theorem 8.5.1 (Strassmann’s Theorem)


Let $f(x) = \sum_{k=0}^{\infty} a_kx^k$ be a non-zero power series convergent on $\mathbb{Z}_p$, i.e. $a_k \to 0$. Suppose $N$
is maximal such that $|a_N|_p = \max_n(|a_n|_p)$, i.e. $|a_N|_p > |a_n|_p$ for $n > N$, and $|a_N|_p \ge |a_n|_p$ for
$n \le N$. Then $f$ has at most $N$ zeros in $\mathbb{Z}_p$.

Notice that such an $N$ always exists, for otherwise $a_k \not\to 0$.
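
For a polynomial with rational coefficients (a power series with finitely many non-zero terms), the index $N$ of Strassmann's theorem can be read off from the p-adic valuations of the coefficients. The following Python sketch does this; the helper names `vp` and `strassmann_bound` are ours and the restriction to polynomials is an assumption made to keep the example finite.

```python
from fractions import Fraction


def vp(x, p: int) -> int:
    """p-adic valuation of a non-zero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the valuation of 0 is infinite")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def strassmann_bound(coeffs, p: int) -> int:
    """Largest index N such that |a_N|_p is maximal, for f = a_0 + a_1 x + a_2 x^2 + ..."""
    valuations = [(vp(c, p), i) for i, c in enumerate(coeffs) if c != 0]
    if not valuations:
        raise ValueError("the theorem requires a non-zero series")
    smallest = min(v for v, _ in valuations)  # maximal absolute value = minimal valuation
    return max(i for v, i in valuations if v == smallest)


print(strassmann_bound([0, -1, 1], 5))   # f = x^2 - x
print(strassmann_bound([1, 5, 25], 5))   # f = 1 + 5x + 25x^2
print(strassmann_bound([5, 1, 5], 5))    # f = 5 + x + 5x^2
```

For $f = x^2 - x$ the bound is attained, since $0$ and $1$ are two zeros in $\mathbb{Z}_5$, while for $1 + 5x + 25x^2$ the bound shows that there is no zero in $\mathbb{Z}_5$ at all.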

Proof

We proceed by induction on $N$. When $N = 0$,
\[ |f(x)|_p \le \max_i(|a_ix^i|_p) = |a_0|_p \]
by the strong triangle inequality 8.2.1 since $|a_0|_p > |a_i|_p \ge |a_ix^i|_p$ for any $i > 0$. Moreover, the
maximum is achieved only once so we have $|f(x)|_p = |a_0|_p$ for all $x \in \mathbb{Z}_p$, so $f$ never vanishes (if
$a_0 = 0$ then $f = 0$ since it's the maximum coefficient, which is impossible).
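Now, suppose $N \ge 1$ is maximal such that $|a_N|_p = \max_n(|a_n|_p)$. Suppose $\alpha \in \mathbb{Z}_p$ is a zero of $f$; if
there is none we are already done. Write
\begin{align*}
f(x) &= f(x) - f(\alpha) \\
&= \sum_{i=0}^{\infty} a_ix^i - \sum_{i=0}^{\infty} a_i\alpha^i \\
&= \sum_{i=0}^{\infty} a_i(x^i - \alpha^i) \\
&= (x - \alpha)\sum_{i=0}^{\infty} \sum_{j=0}^{i-1} a_ix^j\alpha^{i-j-1} \\
&= (x - \alpha)\sum_{j=0}^{\infty} x^j \sum_{i=j+1}^{\infty} a_i\alpha^{i-j-1}
\end{align*}
using Proposition 8.3.4. Let
\[ b_j = \sum_{i=j+1}^{\infty} a_i\alpha^{i-j-1} \]
and define
\[ g(x) = \sum_{k=0}^{\infty} b_kx^k \]
so that $f(x) = (x - \alpha)g(x)$ for $x \in \mathbb{Z}_p$.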
We shall prove that $|b_{N-1}|_p > |b_n|_p$ for $n > N - 1$, and $|b_{N-1}|_p \ge |b_n|_p$ for $n \le N - 1$; that way $g$ will
have at most $N - 1$ zeros by the inductive hypothesis, so $f$ at most $N$. Note that
\[ b_{N-1} = a_N + \alpha a_{N+1} + \alpha^2a_{N+2} + \ldots \]
so that $|b_{N-1}|_p = |a_N|_p$ by the strong triangle inequality. For $n \ne N - 1$, we have
\[ |b_n|_p = |a_{n+1} + \alpha a_{n+2} + \alpha^2a_{n+3} + \ldots|_p \le \max_{i>n}(|a_i\alpha^{i-n-1}|_p) \le \max_{i>n}(|a_i|_p) \le |a_N|_p = |b_{N-1}|_p \]
and this inequality is strict for $n > N - 1$ since we then have $|a_i|_p < |a_N|_p$ for $i \ge n + 1 > N$.


Here is an application from [8], very hard to solve by elementary means (which would be in essence
p-adic anyway).

Proposition 8.5.1 (Ramanujan, Nagell)

The positive integers $n$ such that
\[ x^2 + 7 = 2^n \]
has a solution in $\mathbb{Z}$ are $n \in \{3, 4, 5, 7, 15\}$.

Proof

First, let's analyse this equation in $\mathbb{Q}(\sqrt{-7})$. By Exercise 8.5.1, its ring of integers is Euclidean so a UFD. Suppose
$(x, n)$ is a solution; clearly $x$ is odd. We get
\[ \frac{x + \sqrt{-7}}{2} \cdot \frac{x - \sqrt{-7}}{2} = 2^{n-2} \]
and the prime factorisation of 2 is $\frac{1+\sqrt{-7}}{2} \cdot \frac{1-\sqrt{-7}}{2}$. Let $\alpha = \frac{1+\sqrt{-7}}{2}$ and $\beta = \frac{1-\sqrt{-7}}{2}$. Since $\frac{x\pm\sqrt{-7}}{2}$
isn't divisible by 2, we must have (Exercise 8.5.2)
\[ \frac{x \pm \sqrt{-7}}{2} = \alpha^{n-2}. \]
This has a solution if and only if
\[ \frac{\alpha^{n-2} - \beta^{n-2}}{\alpha - \beta} = \pm 1. \]
The LHS is a linear recurrence which we will denote by $(u_{n-2})_{n\ge 0}$.

Now, let's try to find a p-adic field where $\sqrt{-7}$ exists. Since $-7 \equiv 2^2 \pmod{11}$ we can work in
$\mathbb{Q}_{11}$. By Hensel's lemma, there are two roots of $X^2 - X + 2$ (this is the characteristic polynomial
of the sequence) which we will abusively call $\alpha$ and $\beta$ again. One of the roots is congruent to 16
modulo $11^2$, say $\alpha$, and the other one is $\beta = 1 - \alpha \equiv 106 \pmod{11^2}$.

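Let $r \in \{0, 1, \ldots, 9\}$ be an integer. Since $a = \alpha^{10} - 1 \equiv 99 \pmod{11^2}$ and $b = \beta^{10} - 1 \equiv 77 \pmod{11^2}$
are divisible by 11, the functions $s \mapsto u_{r+10s}$ are analytic. Let's find out how many
times they can be $\pm 1$. Expand $(\alpha - \beta)(u_{r+10s} \pm 1)$ as a power series in $s$:
\begin{align*}
(\alpha - \beta)(u_{r+10s} \pm 1) &= \alpha^r(1 + a)^s - \beta^r(1 + b)^s \pm (\alpha - \beta) \\
&= \alpha^r\sum_k \binom{s}{k}a^k - \beta^r\sum_k \binom{s}{k}b^k \pm (\alpha - \beta) \\
&\equiv \alpha^r(1 + as) - \beta^r(1 + bs) \pm (\alpha - \beta) \pmod{11^2}.
\end{align*}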

An easy computation shows that the Strassmann bounds are N = 1 for r ∈ {1, 2, 5} and N = 0
for r ∈ {0, 4, 6, 7, 8, 9}. For r = 3, by expanding one more term, we find that the Strassmann
bound is N = 2. Since we have exactly this many solutions (r + 10s ∈ {1, 2, 3, 5, 13}), we are
done as they correspond to n ∈ {3, 4, 5, 7, 15}. (Technically we have to consider 20 functions
because of the ±1 sign, but it is easy to see that only the + sign works for r ∈ {1, 2}, only the
− sign works for r ∈ {3, 5}, and none of them do for other r.)
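
The numerical claims of this proof can be double-checked with a short Python sketch. It brute-forces the equation for small $n$, computes the recurrence $u_0 = 0$, $u_1 = 1$, $u_{m+1} = u_m - 2u_{m-1}$ attached to $X^2 - X + 2$, and verifies the congruences modulo $11^2$. The function names and the search bound 200 are our own choices; the search is an illustration and does not replace the Strassmann argument.

```python
from math import isqrt


def solutions(max_n: int) -> list:
    """Exponents n <= max_n for which 2^n - 7 is a perfect square."""
    found = []
    for n in range(1, max_n + 1):
        t = 2 ** n - 7
        if t >= 0 and isqrt(t) ** 2 == t:
            found.append(n)
    return found


def lucas_u(count: int) -> list:
    """First `count` terms of u_m = (alpha^m - beta^m) / (alpha - beta)."""
    u = [0, 1]
    while len(u) < count:
        u.append(u[-1] - 2 * u[-2])  # recurrence from X^2 - X + 2
    return u[:count]


print(solutions(200))
print([m for m, value in enumerate(lucas_u(200)) if abs(value) == 1])

# Congruences used in the 11-adic part of the proof.
mod = 11 ** 2
print((16 * 16 - 16 + 2) % mod)  # 16 is a root of X^2 - X + 2 modulo 121
print(pow(16, 10, mod) - 1)      # a = alpha^10 - 1 modulo 121
print(pow(106, 10, mod) - 1)     # b = beta^10 - 1 modulo 121
```

The indices $m$ with $u_m = \pm 1$ correspond to the exponents $n = m + 2$ of the proposition.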



Exercise 8.5.1. Prove that $\mathbb{Q}(\sqrt{-7})$ is norm-Euclidean. (This is also Exercise 2.6.4†.)

Exercise 8.5.2. Prove that, if $x^2 + 7 = 2^n$, then $\frac{x \pm \sqrt{-7}}{2} = \left(\frac{1 + \sqrt{-7}}{2}\right)^{n-2}$ for some choice of $\pm$.

Exercise 8.5.3∗. Compute the Strassmann bounds for the function $s \mapsto (\alpha - \beta)(u_{r+10s} \pm 1)$, for each $r \in
\{0, 1, \ldots, 9\}$. (If you do not want to do it all by hand, you may use a computer. In any case, it is worth
doing to get a feel for why it works, because it's very cool.)

Exercise 8.5.4. Prove that 3, 4, 5, 7, 15 are indeed solutions to the given equation. (You may use a computer
for n = 15.)

8.6 Exercises
Analysis
Exercise 8.6.1† (Vandermonde's Identity). Let $x$ and $y$ be p-adic integers. Prove that
\[ \binom{x+y}{k} = \sum_{i+j=k,\ i,j\ge 0} \binom{x}{i}\binom{y}{j} \]
for any $k$.
Exercise 8.6.2† (Mahler's Theorem). Prove that a function $f : \mathbb{Z}_p \to \mathbb{Q}_p$ is continuous if and only if
there exist $a_i \to 0$ such that
\[ f(x) = \sum_{i=0}^{\infty} a_i\binom{x}{i} \]
for all $x \in \mathbb{Z}_p$. These $a_i$ are called the Mahler coefficients of $f$. Moreover, show that $\max_x(|f(x)|_p) =
\max_i(|a_i|_p)$.
Exercise 8.6.3 (USA TST 2011). We say a sequence $(z_n)_{n\ge 0}$ is a p-pod if
\[ v_p\left(\sum_{k=0}^{m} (-1)^k\binom{m}{k}z_k\right) \to \infty \]
as $m \to \infty$. Prove that if $(a_n)_{n\ge 0}$ and $(b_n)_{n\ge 0}$ are p-pods then $(a_nb_n)_{n\ge 0}$ is too.
Exercise 8.6.4†. Prove that the following power series converge if and only if $|x|_p < 1$ and
$|x|_p < p^{-1/(p-1)}$ respectively:
\[ \log_p(1 + x) = \sum_{k=1}^{\infty} \frac{(-1)^{k-1}x^k}{k}, \qquad \exp_p(x) = \sum_{k=0}^{\infty} \frac{x^k}{k!}. \]

In addition, prove that

1. $\exp_p(x + y) = \exp_p(x)\exp_p(y)$ for $|x|_p, |y|_p < p^{-1/(p-1)}$.
2. $\log_p(xy) = \log_p(x) + \log_p(y)$ for $|x - 1|_p, |y - 1|_p < 1$.
3. $\exp_p(\log_p(1 + x)) = 1 + x$ for $|x|_p < p^{-1/(p-1)}$.
4. $\log_p(\exp_p(x)) = x$ for $|x|_p < p^{-1/(p-1)}$.

Exercise 8.6.5†. Prove that
\[ v_2\left(\sum_{k=1}^{n} \frac{2^k}{k}\right) \to \infty. \]
Exercise 8.6.6† (Mean Value Theorem). Let $f(x) = \sum_{i=0}^{\infty} a_ix^i$ be a p-adic power series converging
for $|x|_p \le 1$, i.e. $a_i \to 0$. Prove that
\[ |f(t + h) - f(t)|_p \le |h|_p \max_i(|a_i|_p) \]
for any $|t|_p \le 1$ and $|h|_p \le p^{-1/(p-1)}$.

Topology9
Exercise 8.6.7†. Prove that $\mathbb{Z}_p$ is sequentially compact, meaning that any sequence $(a_n)_{n\ge 0} \in \mathbb{Z}_p^{\mathbb{N}}$ has
a subsequence $(a_{n_i})_{i\ge 0}$ which converges in $\mathbb{Z}_p$. Prove more generally that a set $S \subseteq \mathbb{Q}_p$ is sequentially
compact if and only if it is closed, meaning that any sequence of elements of $S$ converging in $\mathbb{Q}_p$ (for
the p-adic distance) converges in $S$, and bounded.
Exercise 8.6.8† (Bolzano-Weierstrass Theorem). Prove that a set $S \subseteq \mathbb{R}$ is sequentially compact if
and only if it is closed, meaning that any sequence of elements of $S$ converging in $\mathbb{R}$ (for the Euclidean
distance) converges in $S$, and bounded. Prove that the same holds over $\mathbb{R}^n$ for $n \ge 1$.
Exercise 8.6.9† (Extremal Value Theorem). Let $(M, d)$ be a metric space, i.e. a set with a distance
$d : M \times M \to \mathbb{R}_{\ge 0}$ such that $d(x, y) = 0$ iff $x = y$, $d(x, y) = d(y, x)$ (symmetry) and $d(x, y) \le
d(x, z) + d(z, y)$ (triangle inequality) for any $x, y, z \in M$ and let $S$ be a sequentially compact subset of
$M$. Suppose $f : S \to \mathbb{R}$ is a continuous function. Prove that $f$ has a maximum and a minimum.
Exercise 8.6.10† (The Topology of Metric Spaces). Let (M, d) be a metric space. We say a subset
U ⊆ M is open if, for every point x ∈ U , there is an ε > 0 such that the ball {y ∈ M | d(x, y) < ε}
is contained in U . A subset F 10 of M is said to be closed if the limit of any convergent sequence of
elements of F still lies in F . Prove that M and ∅ are open, that a finite intersection of open sets is
open, and that an arbitrary union of open sets is open11 . In addition, prove that F is closed if and
only if its complement is open.
Exercise 8.6.11† (Compact Sets). We say a metric space $(M, d)$ is compact if, for every open cover
$(U_i)_{i\in I}$ of $M$, i.e. a family of open sets such that
\[ \bigcup_{i\in I} U_i = M, \]
we can extract a finite subcover $(U_i)_{i\in I'}$ of $M$. Prove that a closed subset of a compact set is compact,
and that a closed subset of a sequentially compact space is sequentially compact.
Exercise 8.6.12† (Cantor's Intersection Theorem). Prove that in a compact or sequentially compact
space $(M, d)$, if
\[ F_1 \supseteq F_2 \supseteq \ldots \]
is a chain of non-empty closed subsets of $M$, the intersection
\[ \bigcap_{n\in\mathbb{N}^*} F_n \]
is non-empty. Further, prove that the same conclusion holds when $M$ is complete¹² (not necessarily
compact), and the closed sets satisfy $\mathrm{diam}(F_n) \to 0$, where $\mathrm{diam}(S) := \sup_{x,y\in S} d(x, y)$.
Exercise 8.6.13† (Baire's Theorem). Let $(M, d)$ be a complete metric space. Suppose that $U_1, U_2, \ldots$
are dense open sets, i.e. open sets that intersect any non-empty ball¹³. Prove that
\[ \bigcap_{n\in\mathbb{N}^*} U_n \]
is still dense. Equivalently, if $F_1, F_2, \ldots$ are closed sets with empty interior, i.e. that contain no ball¹⁴,
then
\[ \bigcup_{n\in\mathbb{N}^*} F_n \]
9 You may freely use the axiom of choice in this section. (The axiom of choice is only needed to do very basic things,

because it is very common to consider infinitely many sets and pick a point in each of them in topology.)
10 The "F" stands for "fermé", which means "closed" in French.
11 These are actually the axioms of a topology. A topological space (X, τ ) is a set X together with a topology

τ ⊆ 2X consisting of subsets of X, called open sets, satisfying those properties. The closed sets are then defined as the
complements of open sets.
12 Recall that completeness means that all Cauchy sequences converge. A Cauchy sequence $(u_n)_{n\ge 0}$ is a sequence such
that, for any $\varepsilon > 0$, there is an $N$ such that $d(u_m, u_n) \le \varepsilon$ for all $m, n \ge N$.
13 In general topological spaces, a set $S$ is dense if it intersects any non-empty open set, or, equivalently, if the only
closed set containing it is the space itself.


14 The interior of S is the union of its subsets which are open in the ambient space M .

has empty interior as well. Deduce that, if $(V, \|\cdot\|)$ is an infinite-dimensional normed vector space
(see Exercise 8.6.20†) with countable basis $(e_1, e_2, \ldots)$, then $(V, \|\cdot\|)$ is not complete (we say it's not
a Banach space). (You may assume Exercise 8.6.20†.)
Exercise 8.6.14† (Banach’s Fixed Point Theorem). We say a map f from a metric space (M, d) to
itself is a contraction if there is a real number λ < 1 such that d(f (x), f (y)) ≤ λd(x, y) for all x, y ∈ M .
Let f be a contraction of a complete metric space M . Prove that f has a unique fixed point x∗ , and
that, for any x0 ∈ M , limn→∞ f n (x0 ) = x∗ .
Exercise 8.6.15†. We say a metric space $(M, d)$ is separable if it has a countable dense subset, and
that it is totally bounded if, for every $\varepsilon > 0$, there is a finite cover of $M$ by open balls of radius $\varepsilon$.
Prove that a metric space is separable if and only if it has a countable basis of open sets $(U_n)_{n\in\mathbb{N}}$,
i.e. a family of non-empty open sets such that, for any $x \in M$ and any open set $U \ni x$, there is an $n$
for which $x \in U_n \subseteq U$. In addition, prove that compact spaces and sequentially compact spaces are
totally bounded, and that totally bounded spaces are separable.
Exercise 8.6.16† . Let (M, d) be a metric15 space. Prove that the following assertions are equivalent:
(i) M is compact.
(ii) M is totally bounded and complete.
(iii) M is sequentially compact.

Absolute Values
Exercise 8.6.17†. We say an absolute value $|\cdot|$ over a field $K$, i.e. a function $|\cdot| : K \to \mathbb{R}_{\ge 0}$ such that
• $|x| = 0 \iff x = 0$
• $|x + y| \le |x| + |y|$
• $|xy| = |x| \cdot |y|$
is non-Archimedean if $|m| \le 1$ for all $m \in \mathbb{Z}$ and Archimedean otherwise. Prove that $|\cdot|$ is non-Archimedean
if and only if it satisfies the strong triangular inequality $|x + y| \le \max(|x|, |y|)$ for all
$x, y \in K$. In addition, prove that, if $|\cdot|$ is non-Archimedean, we have $|x + y| = \max(|x|, |y|)$ whenever
$|x| \ne |y|$.
Exercise 8.6.18† . Let K be a field and let | · | : K → R≥0 be a multiplicative function which is an
absolute value on Q. Suppose that | · | satisfies the modified triangular inequality |x + y| ≤ c(|x| + |y|)
for all x, y ∈ K, where c > 0 is some constant. Prove that it satisfies the triangular inequality.
Exercise 8.6.19† (Ostrowski's Theorem). Let $|\cdot|$ be an absolute value of $\mathbb{Q}$. Prove that $|\cdot|$ is equal
to $|\cdot|_p^r$ for some prime $p$ and some $r > 0$, or to $|\cdot|_\infty^r$ for some $0 < r \le 1$, or is the trivial absolute value
$|\cdot|_0$ which is 0 at 0 and 1 everywhere else.
Exercise 8.6.20† (Equivalence of Norms). Let $(K, |\cdot|)$ be a complete valued field of characteristic
0, i.e. a field with an absolute value $|\cdot|$ which is complete for the distance induced by this absolute
value. A norm on a vector space $V$ over $K$ is a function $\|\cdot\| : V \to \mathbb{R}_{\ge 0}$ such that
• $\|x\| = 0 \iff x = 0$
• $\|x + y\| \le \|x\| + \|y\|$
• $\|ax\| = |a|\|x\|$
for all $x, y \in V$ and $a \in K$. We say two norms $\|\cdot\|_1$ and $\|\cdot\|_2$ are equivalent if there are two positive
real numbers $c_1$ and $c_2$ such that $\|x\|_1 \le c_1\|x\|_2$ and $\|x\|_2 \le c_2\|x\|_1$ for all $x \in V$.¹⁶ Prove that any
two norms are equivalent over a finite-dimensional $K$-vector space $V$. In addition, prove that $V$ is
complete under the induced distance of any norm $\|\cdot\|$.
15 The theorem is not always true for arbitrary topological spaces: some compact spaces are not sequentially compact,

and some sequentially compact spaces are not compact.


16 This means that they induce the same topology on V .

Exercise 8.6.21†. Let $K = \mathbb{Q}_p$ be a local field¹⁷, where $p$ is a prime number or $\infty$, and let $L$ be a
finite extension of $K$. Prove that there is only one absolute value of $L$ extending $|\cdot|_p$ on $K$, and that
it's given by $|\cdot|_p = \left|N_{L/K}(\cdot)\right|_p^{1/[L:K]}$.¹⁸ ¹⁹ ²⁰

Exercise 8.6.22†. Let $(K, |\cdot|)$ be a complete valued field of characteristic 0 and let $f \in K[X]$ be a
polynomial. Prove that $f$ either has a root in $K$, or there is a real number $c > 0$ such that $|f(x)| \ge c$
for all $x \in K$.

Exercise 8.6.23† (Ostrowski). Let (K, | · |) be a complete valued Archimedean field of characteristic
021 . Prove that it is isomorphic to (R, | · |∞ ) or (C, | · |∞ ).

Exercise 8.6.24† (Residue Field). Let K be a finite extension of Qp . We define the ring of integers
of K to be its unit ball OK = {x ∈ K | |x|p ≤ 1} (where | · |p is the extension of the absolute value of
Qp to K, see Exercise 8.6.21† ). Prove that OK is a commutative ring whose only maximal ideal (see
Exercise A.3.23† ) is pK = {x ∈ K | |x|p < 1} (we say OK is a local ring). In addition, show that the
residue field κ = OK /pK of K is a finite extension of Fp , and that we have [κ : Fp ] ≤ [K : Qp ].22

Exercise 8.6.25† . Let K be a finite extension of Qp . Prove that OK is compact.

Diophantine Equations
Exercise 8.6.26† (Brazilian Mathematical Olympiad 2010). Find all positive rational integers n and
x such that 3n = 2x2 + 1.

Exercise 8.6.27 (Taiwan TST 2021). Find all triples of positive rational integers (x, y, z) such that

x2 + 4y = 5z .

Exercise 8.6.28. Prove that the equation x3 + 11y 3 = 1 has no non-trivial rational integer solutions.

Exercise 8.6.29† . Solve the diophantine equation x2 − y 3 = 1 over Z.

Exercise 8.6.30† (Lebesgue). Solve the equation x2 + 1 = y n over Z, where n ≥ 3 is an odd integer.

Exercise 8.6.31† . Solve the equation x2 + 1 = 2y n over Z, where n ≥ 3 is an odd integer.

Linear Recurrences
Exercise 8.6.32†. Let $(u_n)_{n\ge 0}$ be a linear recurrence of rational integers given by $\sum_i f_i(n)\alpha_i^n$ such
that $\alpha_i/\alpha_j$ is not a root of unity for $i \ne j$. If $u_n$ is not of the form $a\alpha^n$ for some $a, \alpha \in \mathbb{Z}$, prove that
there are infinitely many prime numbers $p$ such that $p \mid u_n$ for some integer $n \ge 0$.

Exercise 8.6.33†. Does there exist an unbounded linear recurrence $(u_n)_{n\ge 0}$ such that $u_n$ is prime
for all $n$?
17 This result is true for any complete valued field (K, | · |), but it is harder to prove. See Cassels [8, Chapter 7] for a

proof.
18 In particular, this absolute value is still non-Archimedean if it initially was. For instance, by Exercise 8.6.17† , if p

is prime, the extension of | · |p still satisfies the strong triangle inequality. In fact, this is the only interesting case since
it’s too hard to treat the case K = R separately.
19 Here is why this absolute value is intuitive: by symmetry between the conjugates, we should have $|\alpha|_p = |\beta|_p$ if $\alpha$
and $\beta$ are conjugates. Taking the norm yields $|N_{K/\mathbb{Q}_p}(\alpha)|_p = |\alpha|_p^{[K:\mathbb{Q}_p]}$ as indicated.
20 One might be tempted to also define a p-adic valuation for elements of $K$ as $v_p(\cdot) = -\log(|\cdot|_p)/\log(p)$, and this is
also what we will do in some of the exercises. However, we warn the reader that, if $\alpha \in \overline{\mathbb{Z}}$ is an algebraic integer and $\alpha_p$
is a root of its minimal polynomial in $\overline{\mathbb{Q}_p}$, $v_p(\alpha_p) \ge 1$ does not mean anymore that $p$ divides $\alpha$ in $\overline{\mathbb{Z}}$, it only means that
$p$ divides $\alpha_p$ in $\overline{\mathbb{Z}_p} := \{x \in \overline{\mathbb{Q}_p} \mid |x|_p \le 1\}$.
21 In fact it is quite easy to show that char K = 0 follows from the assumption that | · | is Archimedean, but we add

this assumption for the convenience of the reader.


22 Equality usually doesn’t hold, when it does we say the extension is unramified.

Miscellaneous
Exercise 8.6.34† . Which roots of unity are in Qp ?

Exercise 8.6.35. Any p-adic number can be written uniquely in the following way: $a = \sum_{k>N} a_kp^k$
for some $N \in \mathbb{Z}$ and $a_k \in [p]$ (this amounts to choosing a system of representatives of $\mathbb{Z}/p\mathbb{Z}$). Prove
that $a \in \mathbb{Q}$ if and only if the sequence $(a_k)_k$ is eventually periodic.
Exercise 8.6.36 (ISL 2020). Find all functions $f : \mathbb{N} \to \mathbb{N}$ such that $f(xy) = f(x) + f(y)$ for all
integers $x, y > 0$ and for which there are infinitely many $n \in \mathbb{N}$ satisfying $f(k) = f(n - k)$ for every
integer $0 < k < n$.
Exercise 8.6.37† (China TST 2010). Let $k \ge 1$ be a rational integer. Prove that, for sufficiently
large $n$, $\binom{n}{k}$ has at least $k$ distinct prime factors.
Exercise 8.6.38†. Find all additive functions $f : \mathbb{Z}^{\mathbb{N}} \to \mathbb{Z}$, where addition is defined componentwise.
(To those who have read Section C.2, the fact that there is a nice characterisation of those functions
should come as a surprise.)
Exercise 8.6.39†. Let $f \in \mathbb{Z}_p[X]$ be a polynomial whose leading and constant coefficients are in $\mathbb{Z}_p^\times$.
Prove that, if $K$ is any finite extension of $\mathbb{Q}_p$ where $f$ splits, its roots are all in $\mathcal{O}_K^\times$ (see Exercise 7.5.1†
for the definition of $\mathcal{O}_K$).

Exercise 8.6.40† . Let K = Q(ε1 , . . . , ε` ) be a finitely generated field of characteristic 0, and let
α1 , . . . , αr ∈ K × be non-zero elements. Prove that there is a prime p and an embedding τ : K → Qp ,
i.e. an (injective) field morphism, such that τ (αi ) ∈ Z×p for every i. Deduce that the Skolem-Mahler-
Lech theorem holds over any field of characteristic 0.
Exercise 8.6.41†. Let $p$ be a prime number. Prove that the set of zeros of the linear recurrence
$(u_n)_{n\in\mathbb{Z}}$ of elements of $\mathbb{F}_p(T)$ defined by $u_n = (1 + T)^n - T^n - 1$ for all $n \in \mathbb{Z}$ is $\{p^k \mid k \ge 0\}$. Deduce that the
Skolem-Mahler-Lech theorem doesn't hold in positive characteristic.
Exercise 8.6.42† (Krasner’s Lemma). Let $(K, |\cdot|)$ be a non-Archimedean valued field and let $\alpha \in \overline{K}$
be an element with conjugates $\alpha_1, \ldots, \alpha_n$. Suppose that a separable element $\beta \in \overline{K}$ is such that

$$|\alpha - \beta| < |\alpha - \alpha_i|$$

for $i = 2, \ldots, n$, where $|\cdot|$ is the absolute value defined in Exercise 8.6.21† . Prove that $K(\alpha) \subseteq K(\beta)$.
Appendix A

Polynomials

Prerequisites for this chapter: none.

A.1 Fields and Polynomials

Definition A.1.1 (Field)

A field (K, +, ·) is a set K with at least two elements and with two binary (taking two arguments)
operations + and ·, called addition and multiplication. These operations have the following
properties: they are associative and commutative, they have an identity, they have inverses
except for 0 which doesn’t have a multiplicative inverse, and multiplication distributes over
addition. We usually just say that K is a field by abuse of terminology.

Here is what these terms mean: a binary operation $\dagger : K^2 \to K$ is associative if $(a \dagger b) \dagger c = a \dagger (b \dagger c)$
for any $a, b, c \in K$ (that way we can write $a \dagger b \dagger c$ without ambiguity).
It is commutative if $a \dagger b = b \dagger a$ for any $a, b \in K$.
It has an identity $e$ if $a \dagger e = e \dagger a = a$ for any $a \in K$ (this is denoted $0_K$ for addition and $1_K$ for
multiplication, but we usually drop the $K$ when the context is clear).
$a'$ is an inverse of $a$ for $\dagger$ if $a \dagger a' = a' \dagger a = e$ (denoted $-a$ for addition and $a^{-1}$ for multiplication).
$\cdot$ distributes over $+$ if $a(b + c) = ab + ac$ and $(b + c)a = ba + ca$, where $ab$ denotes $a \cdot b$. In some
sense, $\cdot$ "commutes" with $+$.
Exercise A.1.1∗ . Let K be a field. Prove that 0K a = 0K for any a ∈ K.

Exercise A.1.2∗ . Let † be a binary associative operation on a set M . Suppose that M has an identity. Prove
that it is unique. Similarly, prove that, if an element g ∈ M has an inverse, then it is unique.1

Remark A.1.1
Note that a field must have at least two elements (the additive and multiplicative identities must
be distinct), i.e. the trivial ring R = {0} is not a field. There are various reasons for this axiom,
akin to the convention that 1 isn’t prime, but perhaps the simplest one is that if it were a field we
would not have the uniqueness of dimension anymore since {0} and the empty set are both bases
of {0}. (This is unimportant for this appendix, see Appendix C for the definition of dimension.)

Here are some examples of fields: the familiar sets of rational numbers $\mathbb{Q}$, of real numbers $\mathbb{R}$ and
of complex numbers $\mathbb{C}$. We will define a variety of other fields throughout this book, but here is one
very important family: the fields $\mathbb{F}_p$ of integers modulo $p$, where $p$ is a prime. You can think of $\mathbb{F}_p$
1 Such a structure is called a monoid.


as $\{0, 1, \ldots, p - 1\}$ with addition and multiplication modulo $p$. It differs greatly from the previous
fields for two reasons: because it is finite (we will study these fields in Chapter 4) and because it has
non-zero characteristic² (see Section A.2 for a definition if you’re curious, but this is unimportant for
now). Why is it a field? Well, all axioms are obvious because they are true in $\mathbb{Z}$ so also in $\mathbb{Z}$ modulo
$p$, except the one about multiplicative inverses. But you already know that integers which are not
divisible by $p$ have inverses modulo $p$ since $p$ is prime.
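
As an illustration (not part of the original text), here is a minimal Python sketch of the arithmetic of $\mathbb{F}_p$. The function name `inverse_mod` and the choice $p = 7$ are assumptions made for this example, and computing the inverse as $a^{p-2}$ via Fermat's little theorem is only one possible method.

```python
# Sketch: F_p modelled as the integers 0..p-1 with operations taken modulo p.
# The prime p = 7 is an arbitrary choice for illustration.
p = 7

def inverse_mod(a: int, p: int) -> int:
    """Multiplicative inverse of a modulo the prime p (a not divisible by p)."""
    # Fermat's little theorem gives a^(p-1) = 1 in F_p, so a^(p-2) is the inverse of a.
    return pow(a, p - 2, p)

inverses = {a: inverse_mod(a, p) for a in range(1, p)}
for a, b in inverses.items():
    assert (a * b) % p == 1   # every non-zero element has an inverse
print(inverses)

# For a composite modulus the axiom fails: 2 has no inverse modulo 4.
assert all((2 * b) % 4 != 1 for b in range(4))
```

The last assertion shows why the modulus has to be prime: the integers modulo 4 form a ring but not a field.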

We now define polynomials with coefficients in a field K. We will see shortly that fields are
not complicated at all, they are simply the smallest structure which lets us establish the theory of
polynomials3 .

Definition A.1.2 (Polynomials)

A polynomial $f$ with coefficients in $K$ is an object $f = a_n X^n + \ldots + a_1 X + a_0$ where $a_0, \ldots, a_n \in K$.

The greatest $k$ such that $a_k \neq 0$ is called the degree $\deg f$ of $f$; the degree of the zero polynomial
$\deg 0$ is $-\infty$. The set of polynomials with coefficients in $K$ is denoted $K[X]$.

The coefficient $a_{\deg f}$ is called the leading coefficient of $f$, and $a_0$ is the constant coefficient of
$f$. When the leading coefficient is 1, we say the polynomial is monic (the zero polynomial isn’t
monic).

Remark A.1.2
The object X is purely formal, and we shall usually exclude it when talking about polynomials
themselves. In other words, we write f , not f (X), since f is the polynomial. As a general
rule, we will almost always reserve uppercase letters for formal objects and lowercase letters for
numbers. (More precisely, a variable in a polynomial ring will always be denoted capitally, but
some numbers may also be denoted that way occasionally.)

Remark A.1.3
We can also consider similar objects but without the restriction that $a_k = 0$ for sufficiently
large $k$. They are called formal power series. They are also very useful objects, but are not
used much in algebraic number theory so we do not consider them here (two exceptions:
see Theorem B.2.1 and Remark C.4.1). Another point to note is that, although one can obtain
many very interesting results by purely formal and algebraic considerations, we lose one advantage
of polynomials: we can not always evaluate them (since the resulting series might not converge,
or worse, we might not even have a topology to consider convergence). Thus, they demand a bit
more care if we want to do that. See Andreescu-Dospinescu [1] chapter 8 for an introduction to
the wonders of formal power series.

The sum and product of two polynomials are defined intuitively, I don’t think I have to explain
that. The formal object $X$ will be called a "variable", even if that makes it seem like it’s not a formal
object.⁴ Polynomials in multiple variables are defined analogously as

$$\sum_{i_1, \ldots, i_m \geq 0} a_{i_1, \ldots, i_m} X_1^{i_1} \cdot \ldots \cdot X_m^{i_m}$$

where all but finitely many $a_{i_1, \ldots, i_m}$ are zero. The degree is now defined as the greatest value of
$i_1 + \ldots + i_m$ for non-zero $a_{i_1, \ldots, i_m}$.
Exercise A.1.3∗ . Prove that multiplication of polynomials is associative and commutative.
2 This is a consequence of its finiteness, but it has important consequences too which explains why it is mentioned.
3 We need to be able to add (and subtract) and multiply polynomials, which already explains all the axioms except for
the one about multiplicative inverses. This axiom is the key to the theory of polynomials: it’s what makes the Euclidean
division exist.
4 The technical term is "indeterminate" but I prefer using "variable".

A polynomial is not a polynomial function! A polynomial is a purely formal object: for instance
the polynomial functions $x \mapsto x^p$ and $x \mapsto x$ are the same over the integers modulo $p$ by Fermat’s
little theorem, but the polynomials $X^p$ and $X$ are distinct. That said, we can still consider them as
polynomial functions when we want to (to evaluate polynomials at a point for instance), but it is also
important to be able to consider them only as polynomials (e.g. for Corollary A.1.1).
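
The distinction can be checked numerically. The following Python sketch is an illustration only: representing a polynomial by its list of coefficients `[a_0, a_1, ...]`, the helper `evaluate` and the choice $p = 5$ are assumptions of this example, not notation from the book.

```python
# Sketch: X^p and X are different polynomials but define the same function on F_p.
p = 5  # arbitrary prime chosen for illustration

# Polynomials as coefficient lists [a_0, a_1, ...] (hypothetical representation).
X = [0, 1]            # the polynomial X
X_p = [0] * p + [1]   # the polynomial X^p

def evaluate(coeffs, x, p):
    """Evaluate the polynomial at x in F_p using Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % p
    return result

same_function = all(evaluate(X, x, p) == evaluate(X_p, x, p) for x in range(p))
same_polynomial = (X == X_p)
print(same_function, same_polynomial)
```

The two coefficient lists differ, so the polynomials are distinct, even though every evaluation in $\mathbb{F}_5$ agrees.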

Here is why fields are nice: they are precisely the structure that lets us define polynomials (and be
able to add them and multiply them nicely) as well as have a Euclidean division.

Proposition A.1.1 (Euclidean Division of Polynomials)

Let $f, g \in K[X]$ be polynomials, with $g \neq 0$. There exists a unique pair of polynomials $(q, r) \in
K[X]^2$ with $\deg r < \deg g$ such that $f = gq + r$.

Proof

Start with the uniqueness part. Suppose $gq + r = f = gq' + r'$ with $(q, r) \neq (q', r')$. Then $(q - q')g = r' - r$,
so $q \neq q'$ (otherwise $r = r'$ as well). Thus,
$\deg((q - q')g) \geq \deg g > \deg(r' - r)$, which is impossible.

We now proceed by induction on $\deg f$ to prove the existence, for a fixed $g$. If $\deg f < \deg g$,
we are already done: $f = g \cdot 0 + f$. Otherwise, let $a$ and $b$ be the leading coefficients of $f$ and $g$
respectively, which are non-zero since $\deg f \geq \deg g \geq 0$. The polynomial $f - ab^{-1} X^{\deg f - \deg g} g$
has degree less than $\deg f$, so by the inductive hypothesis there exist polynomials $q$ and $r$ such
that $\deg r < \deg g$ and
$$f - ab^{-1} X^{\deg f - \deg g} g = gq + r.$$
Finally, this gives us
$$f = (q + ab^{-1} X^{\deg f - \deg g})g + r.$$
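
The existence proof is effectively an algorithm: repeatedly cancel the leading term of $f$ with a multiple of $g$. Here is a hedged Python sketch over $K = \mathbb{Q}$, written iteratively rather than by induction. The names `trim` and `poly_divmod` and the coefficient-list representation `[a_0, a_1, ...]` are assumptions of this example.

```python
from fractions import Fraction

def trim(f):
    """Remove trailing zero coefficients so that the last entry is the leading coefficient."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f

def poly_divmod(f, g):
    """Return (q, r) with f = g*q + r and deg r < deg g, for g != 0, over Q."""
    f = trim([Fraction(c) for c in f])
    g = trim([Fraction(c) for c in g])
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    r = f
    while len(r) >= len(g):
        shift = len(r) - len(g)      # deg r - deg g
        coeff = r[-1] / g[-1]        # a * b^(-1) in the proof: needs b to be invertible
        q[shift] = coeff
        # Subtract coeff * X^shift * g from r, which cancels the leading term of r.
        for i, c in enumerate(g):
            r[shift + i] -= coeff * c
        r = trim(r)
    return trim(q), r

# X^3 + X divided by X^2 gives quotient X and remainder X.
print(poly_divmod([0, 1, 0, 1], [0, 0, 1]))
# X^2 + 1 divided by 2X - 1 needs the inverse of 2, which exists in Q but not in Z.
print(poly_divmod([1, 0, 1], [-1, 2]))
```

The division by the leading coefficient `g[-1]` is the only place where multiplicative inverses are used.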

We now define divisibility of polynomials like we do in Z:

Definition A.1.3 (Divisibility of Polynomials)

We say a polynomial f ∈ K[X] divides a polynomial g ∈ K[X], and write f | g, if there exists a
polynomial h ∈ K[X] such that g = f h.

Note that repeated applications of the Euclidean division yield the Euclidean algorithm: given two
polynomials $f, g \in K[X]$ with $\deg f > \deg g$, we iteratively replace the pair $\{f, g\}$ by $\{g, r\}$, where $r$ is
the remainder of the division of $f$ by $g$, until one of the two polynomials is zero. For instance, $f = X^3 + X$ and $g = X^2$ yields

$$\{X^3 + X, X^2\} \to \{X^2, X\} \to \{X, 0\}.$$

This will, like in $\mathbb{Z}$,⁵ eventually produce the pair $\{0, h\}$ where $h$ is the greatest common divisor (gcd)
of $f$ and $g$, i.e. a polynomial which divides both $f$ and $g$, and such that, if $h' \mid f, g$ then $h' \mid h$ (in
particular it is the common divisor with greatest degree, except when $f = g = 0$). Note that the gcd
is only defined up to multiplication by a non-zero constant, although we will usually assume it to be
monic.
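
For readers who like to experiment, the Euclidean algorithm can be sketched in a few lines of Python over $\mathbb{Q}$. This is an illustration only: the names `trim`, `remainder` and `poly_gcd`, and the coefficient-list representation `[a_0, a_1, ...]`, are assumptions of this example.

```python
from fractions import Fraction

def trim(f):
    """Copy of f without trailing zero coefficients."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f

def remainder(f, g):
    """Remainder of the Euclidean division of f by g != 0 (lists of Fractions, g trimmed)."""
    r = trim(f)
    while len(r) >= len(g):
        shift, coeff = len(r) - len(g), r[-1] / g[-1]
        for i, c in enumerate(g):
            r[shift + i] -= coeff * c
        r = trim(r)
    return r

def poly_gcd(f, g):
    """Monic gcd of f and g via the Euclidean algorithm (with gcd(0, 0) = 0)."""
    f = trim([Fraction(c) for c in f])
    g = trim([Fraction(c) for c in g])
    while g:
        f, g = g, remainder(f, g)   # replace {f, g} by {g, r}
    return [c / f[-1] for c in f] if f else []

print(poly_gcd([0, 1, 0, 1], [0, 0, 1]))    # gcd(X^3 + X, X^2)
print(poly_gcd([-1, 0, 1], [1, -2, 1]))     # gcd(X^2 - 1, X^2 - 2X + 1)
```

The first call reproduces the example from the text and returns the coefficients of $X$; the second returns those of $X - 1$.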

Exercise A.1.4∗ . Prove that the gcd of 0 and 0 is 0.

5 The deep reason behind all these analogies with Z lies in Chapter 2: both Z and K[X] are Euclidean domains.

Exercise A.1.5∗ . Prove that the Euclidean algorithm produces the gcd. Deduce that the gcd of two
polynomials in K[X] is also in K[X]. (As a consequence, the fundamental theorem of algebra Theorem A.1.1 implies
that two polynomials with rational coefficients are coprime in Q[X] if and only if they don’t have a common
complex root.)

Exercise A.1.6∗ (Bézout’s Lemma). Consider two polynomials f, g ∈ K[X]. Prove that there exist
polynomials u, v ∈ K[X] such that uf + vg = gcd(f, g).

As another corollary of Proposition A.1.1, we get the following extremely fundamental fact.

Proposition A.1.2*

Let f ∈ K[X] be a polynomial. If f (α) = 0, then X − α | f .

Proof

Let $f = (X - \alpha)q + r$ be the Euclidean division of $f$ by $X - \alpha$. Since $\deg(X - \alpha) = 1$, we have
$\deg r < 1$ so $r$ is constant. Notice that $r = r(\alpha) = f(\alpha) = 0$, which means $f = (X - \alpha)q$, i.e.
$X - \alpha$ divides $f$.


Corollary A.1.1*

A polynomial f ∈ K[X] of degree n ≥ 0 has at most n roots in K.

Proof

We proceed by induction on $n$. A constant non-zero polynomial obviously has no roots. If $f$ has
degree $n + 1$, it either has no roots, in which case we’re already done, or it has a root $\alpha \in K$. In
that case, write $f = (X - \alpha)g$ where $g$ has degree $n$. Any root $\beta \neq \alpha$ of $f$ satisfies
$(\beta - \alpha)g(\beta) = 0$, hence $g(\beta) = 0$ since $\beta - \alpha$ is invertible. By the inductive hypothesis, $g$ has
at most $n$ roots, which means that $f$ has at most $n + 1$ roots as wanted.


Exercise A.1.7∗ . Let $f \in K[X_1, \ldots, X_n]$ be a polynomial in $n$ variables and suppose $S_1, \ldots, S_n \subseteq K$ are
subsets of $K$ such that $|S_i| > \deg_{X_i} f$. If $f$ vanishes on $S_1 \times \ldots \times S_n$, prove that $f = 0$. (This is the
generalisation of Corollary A.1.1 to multivariate polynomials.)

Here is a non-trivial application of this.

Problem A.1.1

Let $n \geq 2$ be a positive integer. What is the gcd of the numbers $1^n - 1, 2^n - 1, \ldots, n^n - 1$?

Solution

Let $d$ be this gcd. Suppose $p$ is a prime factor of $d$. If $p \leq n$, then $p \mid p^n - 1$ which is impossible.
Thus $p > n$. Consider the polynomial

$$X^n - 1 - (X - 1) \cdot \ldots \cdot (X - n)$$

in $\mathbb{F}_p[X]$. It has degree at most $n - 1$ and $n$ roots (in $\mathbb{F}_p$) by assumption, thus it is the zero
polynomial. Hence we have

$$X^n - 1 \equiv (X - 1) \cdot \ldots \cdot (X - n) \pmod{p}.$$

Expand the RHS and consider the coefficient of $X^{n-1}$: it is $-(1 + \ldots + n) = -\frac{n(n+1)}{2}$. On the
other hand, since $n \geq 2$, the coefficient of $X^{n-1}$ of the LHS is 0. Thus

$$p \mid \frac{n(n+1)}{2}.$$

Since $p > n$, this means $p = n + 1$. Thus, if $n + 1$ is composite we are already done: the gcd is 1.
If $n + 1 = p$ is prime, the gcd $d$ is a power of $p$ and we must find out what it is. Clearly, $p$ is odd.

By Fermat’s little theorem, $p \mid k^n - 1$ for $k = 1, \ldots, n$ so $p \mid d$. It remains to prove that $p^2 \nmid d$.
For this, suppose for the sake of contradiction that $p^2 \mid (p - 1)^{p-1} - 1$. Then,

$$p \mid \frac{(p-1)^{p-1} - 1}{(p-1)^2 - 1} = \sum_{k=0}^{\frac{p-1}{2} - 1} (p-1)^{2k} \equiv \frac{p-1}{2} \pmod{p}$$

which is a contradiction so we are done in this case too: $d = n + 1$ if it is prime and 1 otherwise.
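
The final answer is easy to sanity-check by brute force. The following Python sketch is only a numerical check for small $n$, not a proof; the helper names `is_prime` and `gcd_of_powers_minus_one` and the range of $n$ tested are assumptions of this example.

```python
from functools import reduce
from math import gcd

def is_prime(m):
    """Trial division, sufficient for the small values used here."""
    return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))

def gcd_of_powers_minus_one(n):
    """gcd of 1^n - 1, 2^n - 1, ..., n^n - 1 (the term 1^n - 1 = 0 does not affect the gcd)."""
    return reduce(gcd, (k ** n - 1 for k in range(1, n + 1)))

# Compare with the claimed answer: n + 1 if n + 1 is prime, and 1 otherwise.
for n in range(2, 40):
    expected = n + 1 if is_prime(n + 1) else 1
    assert gcd_of_powers_minus_one(n) == expected
```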

Remark A.1.4
The ad-hoc computation of $(p - 1)^{p-1}$ modulo $p$ will be made more systematic in Chapter 3, in
the form of the lifting the exponent lemma (LTE).

In particular, notice that $X^{p-1} - 1 = (X - 1) \cdot \ldots \cdot (X - (p - 1))$ in $\mathbb{F}_p$ which will be important for
Chapter 4.

Proposition A.1.2 motivates us to make the following definition.

Definition A.1.4 (Multiple Root)

We say $\alpha$ is a root of multiplicity $m$ if $(X - \alpha)^m \mid f$ but $(X - \alpha)^{m+1} \nmid f$. The multiplicity of $\alpha$
is denoted $v_\alpha(f)$.

Definition A.1.5 (Derivative)

The (formal) derivative of a polynomial $f = \sum_{i \geq 0} a_i X^i \in K[X]$ is $f' = \sum_{i \geq 1} i a_i X^{i-1}$. The $n$th
derivative of $f$ is denoted $f^{(n)}$ ($f^{(0)} = f$).

Exercise A.1.8∗ . Prove that $(fg)' = f'g + fg'$ and $(f + g)' = f' + g'$ for any $f, g \in K[X]$. Show also that
$(f^n)' = nf'f^{n-1}$ for any positive integer $n$, where $f^k$ denotes the $k$th power and not the $k$th iterate. More
generally, show that
$$\left(\prod_{i=1}^n f_i\right)' = \sum_{i=1}^n f_i' \prod_{j \neq i} f_j.$$

Remark A.1.5
Note that we have defined the derivative formally, without appealing to analysis (instead of the
usual definition with limits). In particular, we can, and we shall, consider the derivative over all
kinds of fields, such as Fp for instance.

Remark A.1.6
The astute reader may have remarked that this definition does not actually make sense without
further explanation. The reason is that i is an element of Z, so, technically, it doesn’t make sense
to multiply $a_i$ by $i$. However, this is easy to take care of: we simply define $ia_i$ as

$$\underbrace{a_i + \ldots + a_i}_{i \text{ times}}$$

for $i \geq 0$, and $ia_i = -((-i)a_i)$ for $i \leq 0$. It should not come off as a surprise that this is the same
construction as Definition A.2.3: this is simply because repeated addition of ones is the canonical
morphism from $\mathbb{Z}$ to a ring $R$ (i.e. the canonical way to "convert" an element of $\mathbb{Z}$ to an element
of an arbitrary field).

We can now give a criterion to compute the multiplicity of a root, using our notion of derivative.
This will however only work as long as the characteristic char K of K is greater than the multiplicity
of the root. Roughly speaking, the characteristic c ∈ N is the smallest positive number such that c = 0
in K if there exists one, and 0 otherwise. See Definition A.2.3 for a more rigorous definition. For
instance, the characteristic of Fp is p while the characteristic of Q is 0.

Proposition A.1.3 (Multiple Roots)*

Let $f \in K[X]$ be a polynomial. If $\operatorname{char} K = 0$, for any positive integer $m$ and any $\alpha \in K$, we
have $(X - \alpha)^m \mid f$ if and only if

$$f(\alpha) = f'(\alpha) = \ldots = f^{(m-1)}(\alpha) = 0.$$

Otherwise, this only holds for m < char K.

Here is how this theorem can fail when $v_\alpha(f) \geq c$: the derivative of $f = X^p$ over $\mathbb{F}_p$ is $pX^{p-1} = 0$,
so $f^{(k)}(0) = 0$ for all $k \in \mathbb{N}$ yet $X^p$ is clearly not divisible by $X^k$ for all $k \in \mathbb{N}$.

Proof

We proceed by induction on $m$. The base case is Proposition A.1.2. We shall prove that, if
$f(\alpha) = 0$, $v_\alpha(f') = v_\alpha(f) - 1$.

Let $m = v_\alpha(f) \geq 1$. Write $f = (X - \alpha)^m g$ where $X - \alpha \nmid g$. Then, by Exercise A.1.8∗ ,
$f' = m(X - \alpha)^{m-1} g + (X - \alpha)^m g'$ which is indeed divisible by $(X - \alpha)^{m-1}$ but not $(X - \alpha)^m$
as $X - \alpha \nmid g$. Here, we used the fact that $m$ is non-zero because it is positive but less than the
characteristic.
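
Both the criterion and its failure in positive characteristic can be observed numerically. The Python sketch below is an illustration: the helper names, the coefficient-list representation `[a_0, a_1, ...]`, and the two test polynomials are assumptions of this example.

```python
def derivative(f, p=None):
    """Formal derivative of [a_0, a_1, ...]; coefficients are reduced mod p if p is given."""
    d = [i * a for i, a in enumerate(f)][1:]
    return [c % p for c in d] if p else d

def evaluate(f, x, p=None):
    """Value of f at x by Horner's rule, reduced mod p if p is given."""
    value = 0
    for c in reversed(f):
        value = value * x + c
    return value % p if p else value

def derivative_values(f, alpha, count, p=None):
    """Return [f(alpha), f'(alpha), ..., f^(count-1)(alpha)]."""
    values = []
    for _ in range(count):
        values.append(evaluate(f, alpha, p))
        f = derivative(f, p)
    return values

# Characteristic 0: f = (X - 2)^3 (X + 1) = X^4 - 5X^3 + 6X^2 + 4X - 8 has a triple root at 2.
over_Q = derivative_values([-8, 4, 6, -5, 1], 2, 4)
# Characteristic 5: every derivative of X^5 vanishes at 0, although the multiplicity is only 5.
over_F5 = derivative_values([0, 0, 0, 0, 0, 1], 0, 7, p=5)
print(over_Q, over_F5)
```

In the first case exactly the first three values vanish, matching the multiplicity 3. In the second case all seven values vanish, so the derivatives no longer detect the multiplicity.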


Using our notion of multiple roots, we get that if f has degree n, leading coefficient a, and roots
α1 , . . . , αn (not necessarily distinct, we count them with multiplicity) then (X − α1 ) · . . . · (X − αn ) | f
so that
f = (X − α1 ) · . . . · (X − αn )g

for some g which must be constant equal to a by looking at the degrees. We have factorised the
polynomial with its roots. The following proposition shows that we can recover the coefficients from
the roots using this factorisation.

Proposition A.1.4 (Vieta’s Formulas)*

Suppose $f = a_0 + \ldots + a_{n-1} X^{n-1} + X^n$ is a monic polynomial of degree $n$ with roots $\alpha_1, \ldots, \alpha_n$
(counted with multiplicity). Then,
$$a_{n-k} = (-1)^k \sum_{i_1 < \ldots < i_k} \alpha_{i_1} \cdot \ldots \cdot \alpha_{i_k}$$
for any $k = 1, \ldots, n$.

Proof

It is simply the expansion of f = (X − α1 ) · . . . · (X − αn ).




In particular, $a_0 = (-1)^n \alpha_1 \cdot \ldots \cdot \alpha_n$ and $a_{n-1} = -(\alpha_1 + \ldots + \alpha_n)$. We have in fact already used a
special case of these formulas in Problem A.1.1. Here are two more applications of this, to show how
useful it is.
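
Vieta's formulas can be verified on an example by expanding the product directly. In this Python sketch, the function names `expand_monic` and `elementary_symmetric` and the sample roots are assumptions made for illustration.

```python
from itertools import combinations
from math import prod

def expand_monic(roots):
    """Coefficients [a_0, ..., a_{n-1}, 1] of (X - r_1) ... (X - r_n)."""
    coeffs = [1]
    for r in roots:
        shifted = [0] + coeffs                     # X * (current polynomial)
        scaled = [-r * c for c in coeffs] + [0]    # -r * (current polynomial)
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs

def elementary_symmetric(roots, k):
    """Sum of the products of the roots taken k at a time (indices i_1 < ... < i_k)."""
    return sum(prod(c) for c in combinations(roots, k))

roots = [1, 2, 2, -3]   # roots counted with multiplicity
a = expand_monic(roots)
n = len(roots)
for k in range(1, n + 1):
    assert a[n - k] == (-1) ** k * elementary_symmetric(roots, k)
print(a)
```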

Corollary A.1.2 (Wilson’s Theorem)

For any prime p, (p − 1)! ≡ −1 (mod p).

Proof

We have already seen that $1, 2, \ldots, p - 1$ are exactly the roots of $X^{p-1} - 1$ in $\mathbb{F}_p$ by Fermat’s
little theorem. Thus, their product is $(-1)^{p-1} \cdot (-1)$ by Vieta’s formulas as $-1$ is the constant
coefficient of $X^{p-1} - 1$. This means that $(p - 1)! \equiv (-1)^p \equiv -1 \pmod{p}$ as wanted (when $p$ is
odd it’s clear and when $p = 2$ we have $1 = -1$).
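
As a quick numerical check of both ingredients of this proof, one can expand the product of the linear factors modulo $p$ and compare it with $X^{p-1} - 1$. The function name `product_of_linear_factors` and the list of primes tested are assumptions of this Python sketch.

```python
from math import factorial

def product_of_linear_factors(p):
    """Coefficients [a_0, ..., a_{p-1}] modulo p of (X - 1)(X - 2) ... (X - (p - 1))."""
    coeffs = [1]
    for k in range(1, p):
        shifted = [0] + coeffs
        scaled = [-k * c for c in coeffs] + [0]
        coeffs = [(s + t) % p for s, t in zip(shifted, scaled)]
    return coeffs

for p in [2, 3, 5, 7, 11, 13]:
    x_pow_minus_one = [(-1) % p] + [0] * (p - 2) + [1]    # X^(p-1) - 1 reduced mod p
    assert product_of_linear_factors(p) == x_pow_minus_one
    assert factorial(p - 1) % p == p - 1                  # Wilson: (p-1)! = -1 mod p
```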


Problem A.1.2 (APMO 2014 Problem 3)

Find all positive integers $n$ such that for any integer $k$ there exists an integer $a$ for which $a^3 + a - k$
is divisible by $n$.

(Partial) Solution

This is equivalent to $x \mapsto x^3 + x$ being bijective modulo $n$. In particular, if it is bijective modulo
$n$ it is bijective modulo any prime factor $p \mid n$. We will show that $p = 3$. This will imply that
$n$ is a power of 3, and conversely all powers of 3 work but we have not established the tools to
prove this yet. It will be proven in Chapter 5 as a consequence of Hensel’s lemma 5.3.1. Clearly
$p = 2$ doesn’t work as 2 divides both $0^3 + 0$ and $1^3 + 1$, so $p$ must be odd.

Thus, we restrict ourselves to the prime case. Suppose $x \mapsto x^3 + x$ is a permutation of $\mathbb{F}_p$. Then,

$$1 \cdot 2 \cdot \ldots \cdot (p - 1) \equiv (1^3 + 1)(2^3 + 2) \cdot \ldots \cdot ((p - 1)^3 + (p - 1))$$

as $0^3 + 0 = 0$ so $x \mapsto x^3 + x$ must also be a permutation of $\mathbb{F}_p \setminus \{0\}$. After simplifying by
$1 \cdot \ldots \cdot (p - 1)$, this is equivalent to

$$(1^2 + 1)(2^2 + 1) \cdot \ldots \cdot ((p - 1)^2 + 1) \equiv 1.$$



But notice that, by Fermat’s little theorem, the numbers of the form $a^2 + 1$ are the roots of the
polynomial $(X - 1)^{\frac{p-1}{2}} - 1$ whose constant term is $(-1)^{\frac{p-1}{2}} - 1$. Moreover, in our product, every
root is present exactly twice ($a^2 + 1$ and $(-a)^2 + 1$) so we get
$$(1^2 + 1)(2^2 + 1) \cdot \ldots \cdot ((p - 1)^2 + 1) \equiv \left(\pm\left((-1)^{\frac{p-1}{2}} - 1\right)\right)^2 \pmod{p}$$
by Vieta’s formulas. But $\left(\pm\left((-1)^{\frac{p-1}{2}} - 1\right)\right)^2 \in \{0, 4\}$ so for this to be congruent to 1 modulo $p$
we must have $p = 3$. It is also easy to check that $x \mapsto x^3 + x$ is a bijection of $\mathbb{F}_3$, hence we are
done (with the prime case).
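
The expected answer (the powers of 3) can be observed experimentally before the tools of Chapter 5 are available. This Python sketch is a brute-force check only; the function name `is_bijective` and the bound 100 are assumptions of this example.

```python
def is_bijective(n):
    """Check whether x -> x^3 + x is a bijection of the integers modulo n."""
    return len({(x ** 3 + x) % n for x in range(n)}) == n

good = [n for n in range(1, 101) if is_bijective(n)]
print(good)
```

On this range the search finds exactly the powers of 3, including the trivial case $n = 1 = 3^0$.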


Conversely, the following result, which is proven in Appendix B, shows that we can always achieve
such a factorisation over the complex numbers. Fields where this result holds are said to be algebraically
closed .

Theorem A.1.1 (Fundamental Theorem of Algebra)

Any polynomial f ∈ C[X] of degree n ≥ 0 has exactly n roots, i.e.,

f = a(X − α1 ) · . . . · (X − αn )

where α1 , . . . , αn are the roots of f counted with multiplicity and a is its leading coefficient.

Over the real numbers, we have the following result.

Proposition A.1.5

Any non-zero polynomial f ∈ R[X] factorises into a product

a(X − α1 ) · . . . · (X − αm )f1 · . . . · fk

where a ∈ R is its leading coefficient, αi ∈ R are its real roots, and the fi ∈ R[X] are monic
polynomials of degree 2 with no real roots.

In fact, we shall prove that non-real roots $\alpha$ come in pairs of conjugates $\alpha, \overline{\alpha}$; since $(X - \alpha)(X - \overline{\alpha})$
has real coefficients, this will yield the result.

Proposition A.1.6*

Suppose $f \in \mathbb{R}[X]$. Then, for any $\alpha \in \mathbb{C}$, $v_\alpha(f) = v_{\overline{\alpha}}(f)$.

Proof

We have $f = (X - \alpha)^m g$ for some $g \in \mathbb{C}[X]$ if and only if $f = (X - \overline{\alpha})^m \overline{g}$, where $\overline{g}$ denotes the
polynomial whose coefficients are the complex conjugates of those of $g$.


Proof of Proposition A.1.5

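Write $f = a(X - \alpha_1) \cdot \ldots \cdot (X - \alpha_m)(X - \beta_1)(X - \overline{\beta_1}) \cdot \ldots \cdot (X - \beta_k)(X - \overline{\beta_k})$ where $\alpha_i \in \mathbb{R}$
and $\beta_i \in \mathbb{C} \setminus \mathbb{R}$ using Proposition A.1.6. Then,

$$f = a(X - \alpha_1) \cdot \ldots \cdot (X - \alpha_m) f_1 \cdot \ldots \cdot f_k$$

where $f_i = (X - \beta_i)(X - \overline{\beta_i}) = X^2 - 2\Re(\beta_i)X + |\beta_i|^2 \in \mathbb{R}[X]$.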




The fact that some polynomials in $\mathbb{R}[X]$ cannot be decomposed into a product of linear
polynomials motivates us to make the following definition.

Definition A.1.6 (Irreducible Polynomial)

A non-zero polynomial $f \in K[X]$ is said to be irreducible in $K[X]$ if it cannot be written as a
product of two polynomials of smaller degrees. We shall also say it is irreducible over $K$.

Notice that degree 2 and 3 polynomials are irreducible if and only if they don’t have a root in $K$,
since if they can be written as a product of two polynomials of smaller degrees one of them must have
degree 1. For instance, $X^3 + 2$ is irreducible in $\mathbb{Q}[X]$ since it does not have a root there, but not in
$\mathbb{R}[X]$. $X^2 + 1$ is irreducible in $\mathbb{R}[X]$, but not in $\mathbb{C}[X]$. Irreducible polynomials over $\mathbb{R}$ and $\mathbb{C}$ are a bit
degenerate because they have degree 1 or 2, but over $\mathbb{Q}$ there are irreducible polynomials of arbitrarily
large degree.

Here is one last result, which is the only one of this section which will not be used extensively
throughout this book.

Theorem A.1.2 (Lagrange Interpolation)

For any distinct $a_1, \ldots, a_{n+1} \in K$ and any $b_1, \ldots, b_{n+1} \in K$ there is a unique polynomial $f \in K[X]$
of degree at most $n$ such that $f(a_i) = b_i$ for $i = 1, \ldots, n + 1$. It is given by $f = \sum_{i=1}^{n+1} b_i f_i$ where
$$f_i := \prod_{j \neq i} \frac{X - a_j}{a_i - a_j}.$$

Proof

First, let’s prove uniqueness. If f, g ∈ K[X] have degree at most n and f (ai ) = bi = g(ai ) for any
i = 1, . . . , n + 1 then f − g has n + 1 roots and has degree at most n so it is the zero polynomial,
i.e. f = g.

For existence, notice that $f_i(a_j) = 0$ for any $j \neq i$ and $f_i(a_i) = 1$, so that
$$f(a_i) = \sum_j b_j f_j(a_i) = b_i \cdot 1 + \sum_{j \neq i} b_j \cdot 0 = b_i.$$
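
The interpolation formula translates directly into code. Below is a hedged Python sketch over $K = \mathbb{Q}$; the names `multiply` and `lagrange`, the coefficient-list representation `[c_0, c_1, ...]` and the sample points are assumptions of this example.

```python
from fractions import Fraction

def multiply(f, g):
    """Product of two polynomials given as coefficient lists [c_0, c_1, ...]."""
    product = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] += a * b
    return product

def lagrange(points):
    """Coefficients of the polynomial of degree <= n through n + 1 points with distinct abscissas."""
    result = [Fraction(0)] * len(points)
    for i, (a_i, b_i) in enumerate(points):
        f_i = [Fraction(1)]
        for j, (a_j, _) in enumerate(points):
            if j != i:
                # Multiply by (X - a_j) / (a_i - a_j), so that f_i(a_j) = 0 and f_i(a_i) = 1.
                denom = Fraction(a_i - a_j)
                f_i = multiply(f_i, [-a_j / denom, 1 / denom])
        result = [c + b_i * d for c, d in zip(result, f_i)]
    return result

# The unique polynomial of degree at most 2 through (0, 1), (1, 2), (2, 5) is X^2 + 1.
print(lagrange([(0, 1), (1, 2), (2, 5)]))
```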



Remark A.1.7
In fact, what we did was prove a special case of the Chinese remainder theorem: the system

f ≡ bi (mod X − ai )

has a unique solution modulo $(X - a_1) \cdot \ldots \cdot (X - a_{n+1})$. In the same way, if we use the Chinese
remainder theorem, we can show that (in characteristic 0), for any distinct $a_1, \ldots, a_n$, any positive integers $k_1, \ldots, k_n$
and any $(b_{i,j})_{1 \leq i \leq n, 0 \leq j < k_i}$, we can find an $f$ such that

$$f^{(j)}(a_i) = b_{i,j}$$

for all $i, j$. Indeed, these conditions translate to the system

$$f \equiv f_i := \sum_{j=0}^{k_i - 1} \frac{b_{i,j}}{j!} (X - a_i)^j \pmod{(X - a_i)^{k_i}}.$$

Corollary A.1.3

Let $K \subseteq L$ be two fields. If a polynomial $f \in L[X]$ of degree $n$ takes values in $K$ at $n + 1$
distinct points of $K$, it has coefficients in $K$.

Proof

If $f(a_i) = b_i$ with $a_i, b_i \in K$ for $i = 1, \ldots, n + 1$ and distinct $a_i$, the Lagrange interpolation
formula shows that $f \in K[X]$.


Exercise A.1.9∗ . Prove that every function f : Fp → Fp is polynomial.

To conclude this section, we make one final definition. Like polynomials, and unlike what their name
would suggest, rational functions are formal objects and are not functions.

Definition A.1.7 (Rational Function)

A rational function with coefficients in $K$ is a quotient of two polynomials $f/g$ with coefficients
in $K$ such that $g \neq 0$ (with the additional rule that $f/g = (hf)/(hg)$ for any non-zero $h \in K[X]$).
The set of rational functions with coefficients in $K$ is denoted $K(X)$.

The derivative of a rational function $f/g$ is $(f'g - g'f)/g^2$, where $g^2 = g \cdot g$.

Exercise A.1.10∗ . Prove that the derivative of a rational function does not depend on its form: i.e. $(f/g)' =
((hf)/(hg))'$ for any $f, g, h \in K[X]$ with $g, h \neq 0$.

A.2 Algebraic Structures and Morphisms


We introduced the notion of a field in the last section; here, we shall define a few additional algebraic
structures. There are two things to understand from this section: what an integral domain is6 and
what morphisms and isomorphisms are. This doesn’t mean that the other definitions are useless, but
you can ignore them for now. They will be used in some chapters: when this happens the reader
should come to this appendix to refresh their memory.
6 Although, usually in this book when something is obviously an integral domain and we don’t want to emphasise this

we will just call it a ring.



Definition A.2.1 (Ring)

We say a set $R$ with two binary operations $+$ and $\cdot$ from $R^2$ to $R$ is a ring if the following axioms
are satisfied. We write $ab$ for $a \cdot b$.
1. + is associative: (a + b) + c = a + (b + c) for any a, b, c ∈ R.
2. + is commutative: a + b = b + a for any a, b ∈ R.

3. additive identity: there is an element 0R such that 0R + a = a for any a ∈ R.


4. additive inverse: for any a ∈ R there is an element −a ∈ R such that a + (−a) = 0R .
5. · is associative: (a · b) · c = a · (b · c) for any a, b, c ∈ R.

6. multiplicative identity: there is an element 1R such that 1R · a = a · 1R = a for any a ∈ R.


7. · distributes over +: for any a, b, c ∈ R, a(b + c) = ab + ac and (b + c)a = ba + ca.

Exercise A.2.1∗ . Prove that $1_R$ and $0_R$ are unique, and that any element has a unique additive inverse and
at most one multiplicative inverse.

Exercise A.2.2∗ . Let R be a ring. Prove that 0R a = a0R = 0R for any a ∈ R.

A ring is like a field, but possibly without the existence of multiplicative inverses, as well as with a
possibly non-commutative multiplication. Non-commutative rings will only be relevant in Chapter 2.
Again, R is technically not a ring, it is (R, +, ·) that is one, but by abuse of terminology we will say
that R is a ring when the addition and multiplication are obvious. We shall usually write 0 for 0R and
1 for 1R , even if they might not technically be our usual 0, 1 ∈ Z.

Definition A.2.2 (Z/nZ)

By Z/nZ, we denote the ring with n elements of integers modulo n. In particular, Z/0Z = Z.

Remark A.2.1
There is a deeper story behind this notation, see the footnote in Exercise A.3.15† .

Rings have a certain number associated to them which is what distinguishes rings like Z from rings
like Z/nZ. We have already encountered this notion in the case of fields in Proposition A.1.3.

Definition A.2.3 (Characteristic)

Let $R$ be a ring. We say $R$ has characteristic $m$ if $m$ is the smallest positive integer such that

$$\underbrace{1 + \ldots + 1}_{m \text{ times}} = 0.$$

If no such m exists we say R has characteristic 0. The characteristic of R is denoted char R.

Exercise A.2.3∗ . Prove that char R is the smallest m ≥ 0 such that R contains a copy of Z/mZ.

In other words, the characteristic of R is the smallest m ≥ 0 such that R contains a copy of Z/mZ.7
7 The best way to define the characteristic is by noting that the kernel of the canonical morphism from Z to R is an

ideal of Z, and thus of the form mZ for some unique m ≥ 0 since Z is a PID with units ±1. This m is the characteristic.

Definition A.2.4 (Commutative Ring)

A commutative ring is a ring where multiplication is commutative.

For reference, here is the definition of ideals. These will not be used anywhere in the book (except in
a handful of exercises), but are used almost everywhere in the modern literature and are of considerable
importance. I recommend completely skipping these definitions on a first reading.

Definition A.2.5 (Ideal)

An ideal $\mathfrak{a}$ of a commutative ring $R$ is a non-empty subset of $R$ which is closed under addition, as well as under
multiplication by any element of $R$. In other words, $\mathfrak{a} + \mathfrak{a} = \mathfrak{a}$ and $R\mathfrak{a} = \mathfrak{a}$.

Exercise A.2.4. Prove that an ideal a of R is equal to R if and only if it contains 1.

Exercise A.2.5. Prove that the ideals of $\mathbb{Z}$ have the form $n\mathbb{Z}$ for some $n \in \mathbb{Z}$.⁸

Remark A.2.2
Equivalent but more modern definitions include the following: an R-module which is a subset of
R, or an additive subgroup closed under multiplication by any element of R.

The canonical example of an ideal is $aR$ for some $a \in R$, consisting of multiples of $a$. Such ideals
are called principal. Most texts on algebraic number theory have good motivation on why to consider
ideals, see for instance Milne [25]. We can also define the sum and products of ideals, but one needs to
be careful with the latter, as $\{ab \mid a \in \mathfrak{a}, b \in \mathfrak{b}\}$ need not be closed under addition in general. Hence,
we define the product $\mathfrak{a}\mathfrak{b}$ to be the ideal generated by these products.

Definition A.2.6 (Addition and Multiplication of Ideals)

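Given two ideals $\mathfrak{a}$ and $\mathfrak{b}$ of a commutative ring $R$, their sum is the ideal

$$\mathfrak{a} + \mathfrak{b} := \{a + b \mid a \in \mathfrak{a}, b \in \mathfrak{b}\}$$

and their product is the ideal

$$\mathfrak{a}\mathfrak{b} := \left\{ \sum_{i=1}^n a_i b_i \;\middle|\; a_1, \ldots, a_n \in \mathfrak{a},\ b_1, \ldots, b_n \in \mathfrak{b} \right\}.$$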

We now define a field in terms of rings, exactly like before, but hopefully this makes it clearer how
our objects are connected.

Definition A.2.7 (Field)

A field $K$ is a commutative ring with $1_K \neq 0_K$ where non-zero elements have multiplicative inverses, i.e. for
any non-zero $a \in K$ there is an element $a^{-1} \in K$ such that $aa^{-1} = 1_K$.

For fields, you should think about Q. We also have the analogous definition for non-commutative
fields.
Exercise A.2.6∗ . Prove that the characteristic of a field is either 0 or a prime number p.

8 We say Z is a principal ideal domain (PID). See Exercise 2.6.22.



Definition A.2.8 (Skew Field)

A skew field $K$ is a field but where multiplication is not necessarily commutative, i.e. a ring with
multiplicative inverses: for any $0 \neq a \in K$ there is an element $a^{-1} \in K$ such that $aa^{-1} = a^{-1}a =
1_K$ (we specify both equalities because multiplication is not necessarily commutative anymore).

Finally, we define the fundamental integral domains.

Definition A.2.9 (Domain)

A domain is a ring where the product of two non-zero elements is non-zero. A commutative
domain is called an integral domain.

For integral domains, you can again think about $\mathbb{Z}$. An example of a commutative ring which isn’t
an integral domain is $\mathbb{Z}/4\mathbb{Z}$: $2 \cdot 2 \equiv 0$ but $2 \not\equiv 0$. $\mathbb{Z}$ is really the typical example of an integral domain,
more than of a ring or commutative ring.
Exercise A.2.7. Let R be a finite integral domain (i.e. with finite cardinality). Prove that it is a field.

An important fact about integral domains is that they are precisely the subrings of fields, i.e. they
are the rings which can be embedded in a larger field. Why is this true? It’s obvious that a subring
of a field is an integral domain. For the converse, given an integral domain $R$ you can construct its
field of fractions $\operatorname{Frac} R$, exactly like how you construct $\mathbb{Q}$ from $\mathbb{Z}$. You define formal objects $a/b$ for
$a, b \in R$ with $b \neq 0$, then you say $a/b = c/d$ if $ad = bc$, and you define addition and multiplication in the
obvious way; it is then easy to check that this yields a field. For instance, for $R = K[X]$, this gives
$\operatorname{Frac} R = K(X)$.
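
To see the construction in action, here is a minimal Python sketch of the field of fractions for $R = \mathbb{Z}$. The class name `Frac` and its methods are hypothetical and written for this illustration only (Python's standard library already provides `fractions.Fraction` for practical use).

```python
class Frac:
    """Formal fraction a/b over the integers with b != 0 (a sketch of Frac Z = Q)."""

    def __init__(self, a, b):
        if b == 0:
            raise ZeroDivisionError("the denominator must be non-zero")
        self.a, self.b = a, b

    def __eq__(self, other):
        # a/b = c/d if and only if ad = bc
        return self.a * other.b == self.b * other.a

    def __add__(self, other):
        # a/b + c/d = (ad + cb)/(bd); bd is non-zero because Z is an integral domain
        return Frac(self.a * other.b + other.a * self.b, self.b * other.b)

    def __mul__(self, other):
        return Frac(self.a * other.a, self.b * other.b)

    def inverse(self):
        # Fails exactly when a = 0, i.e. for the zero element
        return Frac(self.b, self.a)

assert Frac(1, 2) == Frac(2, 4)
assert Frac(1, 2) + Frac(1, 3) == Frac(5, 6)
assert Frac(2, 3) * Frac(2, 3).inverse() == Frac(1, 1)
```

The comment in `__add__` points at where the integral domain hypothesis is used: the product of two non-zero denominators must again be non-zero.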
Exercise A.2.8∗ . Prove that a subring of a field is an integral domain.

Exercise A.2.9. What goes wrong if you try to construct the field of fractions of a commutative ring which
isn’t a domain?

Since integral domains can be embedded in fields, polynomials with coefficients there retain most
of their properties, so we can also define polynomials with coefficients in an integral domain R. The
ring of such polynomials is denoted R[X], and the ring of rational functions with coefficients in R,
Frac R[X] is denoted R(X).
Exercise A.2.10∗ . Let R be an integral domain. Prove that R[X] is also one.

In fact, perhaps the most important property we lose when restricting ourselves to an integral domain
is that we cannot do the Euclidean division of any $f$ by any $g \neq 0$. Indeed, our proof of Proposition A.1.1
involved dividing by the leading coefficient of $g$, and it is true that we can’t have $X = 2Xq + r$ for
some $q \in \mathbb{Z}[X]$ and $r \in \mathbb{Z}[X]$ of degree less than 1. However, this also means that there is one case
when we can make this Euclidean division: when $g$ is monic.

Finally, we explain what morphisms are. Imagine you have the two fields $\{0, 1\}$ and $\{a, b\}$ where
$a, b$ are formal symbols. Multiplication and addition are defined as follows. For the former, $0 + 0 = 0$,
$0 + 1 = 1$ and $1 + 1 = 0$ for addition, and $0 \cdot 0 = 0$, $0 \cdot 1 = 0$ and $1 \cdot 1 = 1$ for multiplication. For the latter,
it’s $a + a = a$, $a + b = b$ and $b + b = a$ for addition, and $a \cdot a = a$, $a \cdot b = a$ and $b \cdot b = b$ for multiplication.

These are defined exactly in the same way! Any reasonable person would want to conclude that
they are the same, that they are both equal to F2 . But they are not! Our definition of a field was
very clear: it is a triple of a set and two binary operations satisfying some axioms. Here the sets are
different so the triples are too, which means the fields are not the same.

Thus, we want to define formally what a "relabelling" of the elements is. This is exactly what an
isomorphism is (iso = same, morphism = shape). We will in fact not define them formally in general
because isomorphisms depend on the structure considered (a ring isomorphism and a group (to be
defined later) isomorphism are different), but here is what a field isomorphism is.

Definition A.2.10 (Field Isomorphism)

Let K and L be two fields. We say a function ϕ : K → L is an isomorphism if ϕ is additive,
multiplicative, sends 1 to 1, and is bijective, i.e. ϕ is a bijection such that

ϕ(x + y) = ϕ(x) + ϕ(y)
ϕ(xy) = ϕ(x)ϕ(y)
ϕ(1) = 1

for any x, y ∈ K. If there exists such a ϕ, we say K and L are isomorphic and write K ≃ L.

A few words on this definition. The function ϕ is our "relabelling" of the elements: relabel x as
ϕ(x). The conditions on ϕ are to ensure that ϕ conserves the field structure, i.e. addition gets mapped
to addition, multiplication to multiplication, inverse to inverse, identity to identity. The reason why
we ask that ϕ(1) = 1 but not that ϕ(0) = 0 is that the latter follows from ϕ(0) + ϕ(0) = ϕ(0). However,
ϕ(1) = 1 does not follow from ϕ(1)ϕ(1) = ϕ(1): ϕ could be identically zero.

Similarly, ϕ(−x) = −ϕ(x) and $\varphi(x^{-1}) = \varphi(x)^{-1}$ follow from the additivity and multiplicativity
respectively. What we really want is for ϕ to respect every single aspect of the field structure, but
we have not written down all of these conditions in the definition of an isomorphism since they are
redundant (of course we also want ϕ to be bijective: this is what it means for the fields to be "the
same up to relabelling").
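
The relabelling of {0, 1} as {a, b} can be checked mechanically. The following Python sketch encodes the two operation tables given earlier and tests the conditions of the definition; the names `ADD_1`, `MUL_1`, `ADD_2`, `MUL_2` and `is_isomorphism` are hypothetical helpers chosen for this illustration.

```python
from itertools import product

# First field: {0, 1} with arithmetic modulo 2.
ADD_1 = {(x, y): (x + y) % 2 for x, y in product((0, 1), repeat=2)}
MUL_1 = {(x, y): (x * y) % 2 for x, y in product((0, 1), repeat=2)}

# Second field: {a, b} with the tables from the text (a acts as zero, b as one).
ADD_2 = {("a", "a"): "a", ("a", "b"): "b", ("b", "a"): "b", ("b", "b"): "a"}
MUL_2 = {("a", "a"): "a", ("a", "b"): "a", ("b", "a"): "a", ("b", "b"): "b"}


def is_isomorphism(phi):
    """Check that phi: {0, 1} -> {a, b} is bijective, additive,
    multiplicative and sends the identity 1 to the identity b."""
    bijective = sorted(phi.values()) == ["a", "b"]
    additive = all(phi[ADD_1[x, y]] == ADD_2[phi[x], phi[y]] for x, y in ADD_1)
    multiplicative = all(phi[MUL_1[x, y]] == MUL_2[phi[x], phi[y]] for x, y in MUL_1)
    return bijective and additive and multiplicative and phi[1] == "b"


relabelling = {0: "a", 1: "b"}
swapped = {0: "b", 1: "a"}
```

Only `relabelling` passes the check: the swapped map is a bijection of sets, but it does not respect addition.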

In fact, all this talk about "conserving the structure" suggests that this might actually be an
important notion, and this is why we define morphisms as functions which preserve the structure (which
is implicit: technically we should say our previous ϕ is a field isomorphism, not just an isomorphism).

Definition A.2.11 (Field Morphism)

Let K and L be two fields. We say a function ϕ : K → L is a morphism if ϕ is additive,
multiplicative and sends 1 to 1, i.e.

ϕ(x + y) = ϕ(x) + ϕ(y)


ϕ(xy) = ϕ(x)ϕ(y)
ϕ(1) = 1

for any x, y ∈ K.

Exercise A.2.11. Prove that a field morphism is always injective.

Morphisms and isomorphisms of commutative rings are defined in exactly the same way, because the
additional structure that comes from fields is the existence of multiplicative inverse, but we have seen
that the fact that ϕ sends inverses to inverses already follows from its multiplicativity (and the fact
that ϕ(1) = 1).

You might think that these notions of isomorphisms and morphisms are just pedantic details about
how to define objects formally. They are not. They suggest that objects which were initially defined
very differently might in fact be similar. For instance, morphisms of sets are just functions since sets
have no structure, and isomorphisms are bijective functions. Are bijections useless to study?

An example of non-trivially isomorphic fields is that of Q(π), the field of rational functions
in π, and Q(e), the field of rational functions in e. This is not obvious, and in fact is very hard to
prove (see Exercise 1.5.31† ).9 A better example will be seen soon, but first we need to define groups
for this (they will be used in Chapter 6).
9 Well, actually, I'm exaggerating a bit here because I did not find a good example with fields: showing that Q(π) ≃
Q(e) amounts to showing that π and e are both transcendental (see Section 1.1 for a definition), and this is what's really hard.

Definition A.2.12 (Group)

We say a set G with a binary operation † : G2 → G is a group if † is associative, has an


identity, and each element has an inverse for †, i.e.
1. (a † b) † c = a † (b † c) for any a, b, c ∈ G.
2. there is an e ∈ G such that a † e = e † a = a for any a ∈ G.

3. for any a ∈ G there is an a−1 ∈ G such that a † a−1 = a−1 † a = e.

Exercise A.2.12∗ . Prove that the identity e of a group G is unique, and that any a ∈ G has a unique inverse.
Moreover, prove that (xy)−1 = y −1 x−1 .

The simplest example is the cyclic group with n elements (Z/nZ, +) where Z/nZ represents integers
modulo n. We say it’s cyclic because it’s generated by only one element: the elements of (Z/nZ, +)
have the form 1 + . . . + 1 for some number of ones. It is also commutative or abelian, which is another
way of saying that the operation is commutative.10

Note that if you consider rings as groups you must ignore their multiplicative structure, since
groups have only one operation (you can also consider the multiplicative group of units of a ring, i.e. the
elements which are invertible).

A slightly more elaborate example, yet still very important, is the symmetric group on n elements
Sn . This is the group of permutations of {1, . . . , n} (it has n! elements), and the operation is composition.

Exercise A.2.13∗ . Check that (Sn , ◦) is a group.

Morphisms of groups are extremely easy to define since groups have so little (yet so much!) structure:
it's simply a function which commutes with (i.e. distributes on) the operation: ϕ : G → H is a
morphism from (G, †) to (H, ⋆) if ϕ(a † b) = ϕ(a) ⋆ ϕ(b) for any a, b ∈ G.

Exercise A.2.14∗ . Prove that a morphism of groups from (G, †) to (H, ⋆) maps the identity of G to the
identity of H.

We now give an example of a non-trivial isomorphism (of groups). If p is a prime, the groups
((Z/pZ)× , ·) and (Z/(p − 1)Z, +) are isomorphic, where (Z/pZ)× denotes the integers mod p which are
coprime with p (so that inverses exist). Since they clearly have the same number of elements, this
is equivalent to (Z/pZ)× being cyclic, i.e. generated by one element, which we will call g. This g is
such that, for any a ∈ (Z/pZ)× , there is a k such that $a = g^k$. This is exactly the definition of a
primitive root! Thus, this isomorphism translates the fact that there is a primitive root modulo p,
which is certainly non-trivial! Here is another very interesting example: (Z/nZ, +) is isomorphic to
(Un , ·), where Un denotes the set of complex nth roots of unity. Indeed, the function $k \mapsto \exp\left(\frac{2ki\pi}{n}\right)$
is clearly an isomorphism between the two.
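
For a small prime, the first isomorphism can be verified by brute force. The sketch below assumes p = 7 and uses g = 3 as a candidate generator (both are choices made for this illustration); `exp_map` and `is_group_isomorphism` are hypothetical helper names.

```python
def exp_map(p, g):
    """The map k -> g^k from (Z/(p-1)Z, +) to ((Z/pZ)^x, .), as a dict."""
    return {k: pow(g, k, p) for k in range(p - 1)}


def is_group_isomorphism(p, g):
    phi = exp_map(p, g)
    # Bijective: every non-zero residue mod p is hit exactly once.
    bijective = sorted(phi.values()) == list(range(1, p))
    # Morphism: addition of exponents becomes multiplication mod p.
    morphism = all(
        phi[(j + k) % (p - 1)] == (phi[j] * phi[k]) % p
        for j in range(p - 1)
        for k in range(p - 1)
    )
    return bijective and morphism


powers_of_3 = exp_map(7, 3)
```

With g = 3 the map is an isomorphism, so 3 is a primitive root modulo 7. With g = 2 the map is still a morphism but fails to be bijective, since 2 has order 3 modulo 7.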

We will usually write our group operations multiplicatively, that is, we will write xy or x · y for x † y.
In this case, one should always write the inverse of y as $y^{-1}$ instead of 1/y, unless the group is already
known to be abelian. Indeed, if we were to write 1/y, would x/y mean $xy^{-1}$ or $y^{-1}x$? Sometimes the
additive notation will also be used; we shall then write x + y for xy and nx for $x^n$. We may also write
the identity e of G as 1 or 0, depending on whether we are using the multiplicative or additive notation.
In addition, when the operation is obvious, we will omit it. For instance, when we consider a ring as a
group (such as Z/nZ), the operation will always be addition. Indeed, it cannot be multiplication since
0 does not have an inverse. Conversely, if we write R× , the set of units (invertible elements) of R, this shall be
considered as a multiplicative group (it is not additive since 0 ∉ R× ).

Lastly, we define two important sets attached to a morphism.


10 Personally, I was only convinced by this terminology when I realised saying "let L/K be a commutative extension"

seemed extremely awkward. (A field extension L/K is said to be abelian if its Galois group is, see Chapter 6.)

Definition A.2.13 (Kernel)

The kernel ker ϕ of a morphism ϕ is the set of x such that ϕ(x) = 0 (where 0 denotes the identity of the
target, written additively). It measures how far ϕ is from being injective.

Definition A.2.14 (Image)

The image im ϕ of a morphism ϕ is the set of y such that y = ϕ(x) for some x. It measures how
far ϕ is from being surjective.
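
As a small illustration of these two definitions, the following Python sketch computes the kernel and image of the morphism x ↦ 3x from (Z/12Z, +) to itself. The choice of this morphism and the helper name `kernel_and_image` are assumptions made for the example.

```python
def kernel_and_image(phi, n):
    """Kernel and image of a morphism phi from (Z/nZ, +) to itself."""
    elements = range(n)
    kernel = {x for x in elements if phi(x) == 0}
    image = {phi(x) for x in elements}
    return kernel, image


n = 12


def phi(x):
    # Multiplication by 3 is additive, hence a group morphism.
    return (3 * x) % n


kernel, image = kernel_and_image(phi, n)
```

Here the kernel has 3 elements and the image has 4, and 3 · 4 = 12: the bigger the kernel, the smaller the image.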

Exercise A.2.15∗ . Prove that the kernel of a morphism (of rings or groups) is closed under addition.

Exercise A.2.16∗ . Prove that a morphism of groups is injective iff its kernel is trivial, i.e. consists of only
the identity.

As a final remark, in Appendix C, we will introduce another algebraic structure called vector spaces,
and we will define morphisms for vector spaces.

A.3 Exercises
Derivatives
Exercise A.3.1† . Let K be a field of characteristic 0 and let f, g ∈ K[X] be two polynomials. Prove
that the derivative of f ◦ g is g′ · (f′ ◦ g).
Exercise A.3.2† . Let f ∈ K[X] be a non-constant polynomial. Prove that there are finitely many pairs
of g, h ∈ K[X] such that g ◦ h = f , up to affine transformations, meaning $(g, h) \equiv \left(g(aX + b), \frac{h - b}{a}\right)$.
Exercise A.3.3. Let f ∈ R[X] be a polynomial. Suppose that f ◦ f is the square of a polynomial.
Prove that f also is the square of a polynomial.
Exercise A.3.4† (USA TST 2017). Let K be a characteristic 0 field and let f, g ∈ K[X] be non-
constant coprime polynomials. Prove that there are at most three elements λ ∈ K such that f + λg is
the square of a polynomial.
Exercise A.3.5 (All-Russian Olympiad 2014). On a blackboard, we write (only) the polynomials
$X^3 - 3X^2$ and $X^2 - 4X$ and all real numbers c ∈ R. If the polynomials f and g are written on the
board, we can also write f ± g, f · g and f ◦ g. Is it possible to write a polynomial of the form $X^n - 1$?
Exercise A.3.6† (Discrete Derivative). Let K be a field of characteristic 0 and let f ∈ K[X] be a
polynomial of degree n and leading coefficient a. Define its discrete derivative as ∆f := f (X + 1) − f (X).
Prove that, for any g ∈ K[X], ∆f = ∆g if and only if f − g is constant, and that ∆f is a polynomial
of degree n − 1 with leading coefficient a · n. Deduce the minimal
degree of a monic polynomial f ∈ Z[X] identically zero modulo m, for a given integer m ≥ 1.
Exercise A.3.7† . Let f : R → R be a function, where R is some ring. Define its discrete derivative
∆f as x ↦ f (x + 1) − f (x). Prove that, for any integer n ≥ 0,

$$\Delta^n f(x) = \sum_{k=0}^{n} (-1)^{n-k} \binom{n}{k} f(x + k).$$
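
Before proving the formula, it can be reassuring to test it numerically. This Python sketch iterates the discrete derivative directly and compares the result with the binomial sum; the function names and the test polynomial are hypothetical choices for the illustration, and a numerical check is of course not a proof.

```python
from math import comb


def delta(f):
    """Discrete derivative: x -> f(x + 1) - f(x)."""
    return lambda x: f(x + 1) - f(x)


def delta_n(f, n):
    """Apply the discrete derivative n times."""
    for _ in range(n):
        f = delta(f)
    return f


def delta_n_formula(f, n, x):
    """Right-hand side of the formula in the exercise."""
    return sum((-1) ** (n - k) * comb(n, k) * f(x + k) for k in range(n + 1))


def cubic(x):
    return x**3 - 2 * x


third_difference = delta_n(cubic, 3)(10)
```

For this cubic, the third discrete derivative is the constant 6, in line with the leading coefficient statement of Exercise A.3.6† applied three times.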

Exercise A.3.8† . Let m ≥ 0 be an integer. Prove that there is a polynomial fm ∈ Q[X] of degree
m + 1 such that

$$\sum_{k=0}^{n} k^m = f_m(n)$$

for any n ∈ N.

Roots of Unity
Exercise A.3.9† (Roots of Unity Filter). Let $f = \sum_i a_i X^i \in K[X]$ be a polynomial, and suppose
that ω1 , . . . , ωn ∈ K are distinct nth roots of unity. Prove that

$$\frac{f(\omega_1) + \ldots + f(\omega_n)}{n} = \sum_{n \mid k} a_k.$$

Deduce that, if K = C,

$$\max_{|z|=1} |f(z)| \geq |f(0)|.$$

(You may assume the existence of a primitive nth root of unity ω, meaning that $\omega^k \neq 1$ for all 0 < k < n,
or, equivalently, every nth root of unity is a power of ω. This will be proven in Chapter 3.)
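
The filter is easy to observe numerically over C. The sketch below (with hypothetical helper names, and floating-point arithmetic, so equality only holds up to rounding) averages f over the nth roots of unity and compares with the sum of the coefficients whose index is divisible by n.

```python
import cmath


def evaluate(coeffs, z):
    """Evaluate the polynomial with coefficients a_0, a_1, ... at z."""
    return sum(a * z**i for i, a in enumerate(coeffs))


def roots_of_unity_average(coeffs, n):
    """(f(w_1) + ... + f(w_n)) / n over the complex nth roots of unity."""
    roots = [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]
    return sum(evaluate(coeffs, w) for w in roots) / n


def filtered_sum(coeffs, n):
    """Sum of the coefficients a_k with n | k."""
    return sum(coeffs[0::n])


coeffs = [1, 2, 3, 4, 5, 6, 7]  # f = 1 + 2X + ... + 7X^6
average = roots_of_unity_average(coeffs, 3)
expected = filtered_sum(coeffs, 3)  # a_0 + a_3 + a_6
```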

Exercise A.3.10† . Let $f = \sum_i a_i X^i \in \mathbb{C}[X]$ be a polynomial and ω1 , . . . , ωn ∈ C be distinct nth
roots of unity with n > deg f . Prove that

$$\frac{|f(\omega_1)|^2 + \ldots + |f(\omega_n)|^2}{n} = \sum_i |a_i|^2.$$

Denote by S(f ) the sum of the squares of the moduli of the coefficients of f . Deduce that
$S(fg) = S(f \cdot X^{\deg g} g(1/X))$ for all f, g ∈ C[X]. ($X^{\deg g} g(1/X)$ is the polynomial obtained by reversing the
coefficients of g.)

Exercise A.3.11† (USEMO 2021). Denote by S(f ) the sum of the squares of the moduli of the
coefficients of a polynomial f ∈ C[X]. Suppose that f, g, h ∈ C[X] are such that $fg = h(X)^2$. Prove
that $S(f)S(g) \geq S(h)^2$.

Exercise A.3.12† . Let k ≥ 1 be an integer. Prove that $\sum_{a \in \mathbb{F}_p} a^k$ is 0 if $p - 1 \nmid k$ and −1 otherwise.
Deduce that any polynomial f ∈ Fp [X] of degree at least 1 satisfying f (a) ∈ {0, 1} for all a ∈ Fp
must have degree at least p − 1.

Exercise A.3.13† . Let p ≠ 3 be a prime number. Suppose that a and b are integers such that
$p \mid a^2 + ab + b^2$. Prove that $(a + b)^p \equiv a^p + b^p \pmod{p^3}$.

Exercise A.3.14 (China TST 2018). Let k be an integer, p a prime number, and S the set of kth
powers of elements of Fp . Prove that, if 2 < |S| < p − 1, the elements of S are not an arithmetic
progression.

Group Theory
Exercise A.3.15† . Given a group G and a normal subgroup H ⊆ G, i.e. a subgroup such that

x+H −x=H

for any x ∈ G,11 we define the quotient G/H of G by H as G modulo H 12 , i.e. we say x ≡ y (mod H)
if x − y ∈ H.13 Prove that this is indeed a group, and that |G/H| = |G|/|H| for any such finite G and H.

Exercise A.3.16† (Isomorphism Theorems). Prove the following first, second, and third isomorphism
theorems.

1. Let ϕ : A → B be a morphism of groups. Then, A/ ker ϕ ≃ im ϕ. (In particular, ker ϕ is normal
in A and | im ϕ| · | ker ϕ| = |A|.)
11 In particular, when G is abelian, any subgroup is normal.
12 This is where the notation Z/nZ comes from! In fact this shows that, in reality, we should say "modulo nZ" instead
of "modulo n".
13 A better formalism is to say that G/H is the set of cosets g + H for g ∈ G. In fact, we will almost always use

this definition in the solutions of exercises (since this is the only place where this will appear, apart from ??), but we
introduced it that way to make the analogy with Z/nZ clearer.

2. Let H be a subgroup of a group G, and N a normal subgroup of G. Then, H/(H ∩ N ) ≃ HN/N .
(In particular, you need to show that this makes sense: HN is a group and H ∩ N is normal in
H.)
3. Let N ⊆ H be normal subgroups of a group G. Then, (G/N )/(H/N ) ≃ G/H.
Exercise A.3.17† . Let G be a finite group, ϕ : G → C× be a non-trivial group morphism (i.e. not
the constant function 1), where (C× , ·) is the group of non-zero complex numbers under multiplication.
Prove that $\sum_{g \in G} \varphi(g) = 0$.

Exercise A.3.18† (Lagrange's Theorem). Let G be a finite group of cardinality n (also called the order of
G). Prove that $g^n = e$ for all g ∈ G. In other words, the order of an element divides the order of the
group. More generally, prove that the order of a subgroup divides the order of the group.
Exercise A.3.19† (5/8 Theorem). Let G be a non-commutative finite group. Prove that the probability

$$p(G) = \frac{|\{(x, y) \in G^2 \mid xy = yx\}|}{|G|^2}$$

that two elements commute is at most 5/8.
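
To get a feeling for the quantity p(G), one can compute it by brute force for small groups of permutations. The sketch below does so for S3 and for the dihedral group of the square; representing permutations as tuples and generating the dihedral group from a rotation and a reflection are choices made for this illustration, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def compose(p, q):
    """Composition p after q of permutations of {0, ..., n-1} given as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))


def generate(gens):
    """Finite group generated by the given permutations (closure under products)."""
    identity = tuple(range(len(gens[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(g, x)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return group


def commuting_probability(group):
    pairs = sum(1 for x in group for y in group if compose(x, y) == compose(y, x))
    return Fraction(pairs, len(group) ** 2)


s3 = set(permutations(range(3)))
# Symmetries of a square: rotation i -> i + 1 and reflection i -> -i (mod 4).
d4 = generate([(1, 2, 3, 0), (0, 3, 2, 1)])
```

The symmetries of the square show that the bound 5/8 cannot be improved.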
Exercise A.3.20† (Fundamental Theorem of Finitely Generated Abelian Groups). Let G be an
abelian group which is finitely generated, i.e., if we write its operation as +, there are g1 , . . . , gk ∈ G
such that any g ∈ G can be represented as n1 g1 + . . . + nk gk for integers ni ∈ Z.
a) Suppose that G is torsion-free, i.e. the only element which has finite order in G is its identity 0.
Prove that there is a unique integer n ≥ 0 such that $(G, +) \simeq (\mathbb{Z}^n, +)$.
b) Suppose now that G is torsion, i.e. all elements in G have finite order. Prove that there is
a d ∈ N∗ such that G contains a subgroup H ≃ Z/dZ and such that the order of any element of
G divides d (i.e. dG = 0). By finding a morphism ϕ : G → H which is the identity on H, show
that we have G ≃ H × G/H. Deduce that there exists a unique sequence of positive integers
1 ≠ dm | . . . | d1 such that

(G, +) ≃ (Z/d1 Z × . . . × Z/dm Z, +).

c) Deduce that if G is a finitely generated abelian group, there is a unique integer n ≥ 0 (called the
rank of the group) and a unique sequence of positive integers 1 ≠ dm | . . . | d1 such that

$$(G, +) \simeq (\mathbb{Z}^n \times \mathbb{Z}/d_1\mathbb{Z} \times \ldots \times \mathbb{Z}/d_m\mathbb{Z}, +).$$

Exercise A.3.21† (Burnside's Lemma). Let G be a finite group, S a finite set, and · a group action
of G on S, meaning a map · : G × S → S such that e · s = s and (gh) · s = g · (h · s) for any g, h ∈ G
and s ∈ S. Given a g ∈ G, denote by Fix(g) the set of elements of S fixed by g. Prove that

$$|S/G| = \frac{1}{|G|} \sum_{g \in G} |\mathrm{Fix}(g)|,$$

where |S/G| denotes the number of (disjoint) orbits Oi = Gsi . Deduce the number of necklaces that
have p beads, each of which can be of one of a colours, where p is a prime number and two necklaces are considered
to be the same up to rotation.
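
For small p and a, both sides of Burnside's lemma can be computed by brute force for the rotation action on coloured necklaces. The sketch below is only a numerical check under that assumption; `count_necklaces` and `burnside_average` are hypothetical helper names, and no closed formula is used.

```python
from itertools import product


def rotations(word):
    return [word[i:] + word[:i] for i in range(len(word))]


def count_necklaces(p, a):
    """Number of orbits of colourings of p beads with a colours under rotation."""
    # Each orbit is represented by its lexicographically smallest rotation.
    return len({min(rotations(word)) for word in product(range(a), repeat=p)})


def burnside_average(p, a):
    """Average number of colourings fixed by each of the p rotations."""
    words = list(product(range(a), repeat=p))
    fixed = sum(
        1 for i in range(p) for word in words if word[i:] + word[:i] == word
    )
    assert fixed % p == 0
    return fixed // p


orbits = count_necklaces(5, 2)
average = burnside_average(5, 2)
```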
Exercise A.3.22† (Exact Sequences). We say a sequence of group morphisms ϕi : Gi → Gi+1 , written

$$G_1 \xrightarrow{\varphi_1} G_2 \xrightarrow{\varphi_2} \cdots \xrightarrow{\varphi_n} G_{n+1},$$

is exact if im ϕk = ker ϕk+1 for every k ∈ [n − 1]. Prove that the short sequences

$$0 \to G \xrightarrow{\varphi} H$$

and

$$G \xrightarrow{\varphi} H \to 0$$

are exact if and only if ϕ is injective or surjective respectively. (0 designates the trivial group, and the
omitted maps are the trivial morphisms.) Finally, suppose that d is a measure of the size of groups which
satisfies d(G1 ) = d(G2 ) whenever G1 ≃ G2 , and d(G/H) = d(G) − d(H) for any group G and any
normal subgroup H ⊆ G. Suppose that

$$0 \to G_1 \xrightarrow{\varphi_1} G_2 \xrightarrow{\varphi_2} \cdots \xrightarrow{\varphi_n} G_{n+1} \to 0$$

is exact. Prove that14

$$\sum_{k=1}^{n+1} (-1)^k d(G_k) = 0.$$

Ideals and Noetherian Rings


Exercise A.3.23† . We say an ideal m ≠ R of a ring R is maximal if no ideal a ≠ R strictly contains it. Prove that m
is maximal if and only if R/m is a field. (The quotient is the same as in Exercise A.3.15† .)
Exercise A.3.24† . We say a commutative ring R is Noetherian if it satisfies the ascending chain
condition: for any weakly increasing chain of ideals

a1 ⊆ a2 ⊆ a3 ⊆ · · · ,

there is some N ∈ N∗ such that aN = aN +1 = aN +2 = . . .. Prove that the following assertions are
equivalent.
(i) R is Noetherian.
(ii) Any ideal a of R is finitely generated15 , meaning that there are some a1 , . . . , an such that

a = a1 R + . . . + an R.

(iii) Any set of ideals has a maximal element (with respect to the relation of inclusion)16 .
Exercise A.3.25† (Hilbert’s Basis Theorem). Let R be a Noetherian ring. Prove that R[X] is
Noetherian as well17 .

Miscellaneous
Exercise A.3.26† (China TST 2009). Prove that there exists a real number c > 0 such that, for any
prime number p, there are at most $cp^{2/3}$ positive integers n satisfying n! ≡ −1 (mod p).
Exercise A.3.27† (Mason-Stothers Theorem, ABC conjecture for polynomials). Let K be a charac-
teristic 0 field. Suppose that A, B, C ∈ K[X] are polynomials such that A + B = C. Prove that

1 + max(deg A, deg B, deg C) ≤ deg(rad ABC)

where rad ABC is the greatest squarefree divisor of ABC (in other words, deg(rad ABC) is the number
of distinct complex roots of ABC). Deduce that the Fermat equation $f^n + g^n = h^n$ for f, g, h ∈ K[X]
does not have non-trivial solutions for n ≥ 3.
Exercise A.3.28† . Find all polynomials f ∈ C[X] which send the unit circle to itself.
14 Exact sequences may seem scary at first, but this is one of the many reasons why they’re so useful (but not the most

fundamental). Many times, the goal is this equality, and we just happen to have an exact sequence which produces it.
Common examples of such size measures include d(G) = log(|G|) for finite groups, or d(V ) = dim V for vector spaces.
15 In particular, a principal ideal domain is Noetherian.
16 This lets us perform Noetherian induction over Noetherian rings, because induction just amounts to considering a

minimal n such that some property is not satisfied, and deduce a contradiction by constructing an even smaller one.
(Over Noetherian rings, every set of ideals has a maximal element instead of a minimal one, but this is the natural
generalisation of what happens over Z: a | b if and only if aZ ⊆ bZ.)
17 Since any field K is Noetherian (it has only two ideals: {0} and itself), Hilbert’s theorem implies that K[X] is

Noetherian. By Exercise A.3.24† , this means that every ideal is finitely generated (hence the name "basis theorem").
This is of considerable importance in (classical) algebraic geometry as it allows us to say that, given a set of points S in
K n , the ideal of polynomials vanishing on S is finitely generated (see Shafarevich [39]).

Exercise A.3.29. Suppose that f, g ∈ C[X] are polynomials such that, for all x ∈ C, f (x) ∈ R implies
g(x) ∈ R. Prove that there exists a polynomial h ∈ R[X] such that g = h ◦ f .
Exercise A.3.30† . Let K be a characteristic 0 field, and let f ∈ K[X] be a non-zero polynomial.
Suppose that an additive map ϕ : K → K commutes with f , i.e. ϕ(f (x)) = f (ϕ(x)) for all x ∈ K.
Prove that ϕ commutes with every monomial of f .

Exercise A.3.31. Let (K, +, ·) be a set satisfying the axioms of a field except possibly that · takes
values in K. Prove that it is in fact a field.
Exercise A.3.32† (Gauss-Lucas Theorem). Let f ∈ C[X] be a polynomial with roots α1 , . . . , αk (counted with multiplicity).
Prove that

$$\frac{f'}{f} = \sum_{i=1}^{k} \frac{1}{X - \alpha_i}.$$

Deduce the Gauss-Lucas theorem: if f ∈ C[X] is non-constant, the roots of f′ are in the convex hull of
the roots of f , that is, any root β of f′ is a linear combination $\sum_i \lambda_i \alpha_i$ with $\sum_i \lambda_i = 1$ and non-negative
λi ∈ R.
Exercise A.3.33† (Sturm's Theorem). Given a squarefree polynomial f ∈ R[X], define the sequence
f0 = f , f1 = f′ and fn+2 is minus the remainder of the Euclidean division of fn by fn+1 . Define also
V (ξ) as the number of sign changes in the sequence f0 (ξ), f1 (ξ), . . ., ignoring zeros. Prove that the
number of distinct real roots of f in the interval ]a, b] is V (a) − V (b).18
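
The sequence in this exercise is easy to compute with exact rational arithmetic. Here is a minimal Python sketch of the resulting root-counting procedure, assuming polynomials with rational coefficients stored as lists (constant term first); all function names are hypothetical, and the code only illustrates the statement, it does not prove it.

```python
from fractions import Fraction


def trim(p):
    """Remove zero coefficients of highest degree."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p


def derivative(p):
    return trim([i * c for i, c in enumerate(p)][1:])


def remainder(f, g):
    """Remainder of the Euclidean division of f by a non-zero g."""
    r = list(f)
    while len(r) >= len(g):
        c = r[-1] / g[-1]
        shift = len(r) - len(g)
        for i, gi in enumerate(g):
            r[shift + i] -= c * gi
        r = trim(r)  # the leading coefficient has just been cancelled
    return r


def sturm_sequence(f):
    seq = [trim([Fraction(c) for c in f])]
    seq.append(derivative(seq[0]))
    while seq[-1]:
        seq.append([-c for c in remainder(seq[-2], seq[-1])])
    return seq[:-1]  # drop the final zero polynomial


def evaluate(p, x):
    return sum(c * x**i for i, c in enumerate(p))


def sign_changes(seq, x):
    values = [v for v in (evaluate(p, x) for p in seq) if v != 0]
    return sum(1 for u, v in zip(values, values[1:]) if (u > 0) != (v > 0))


def count_real_roots(f, a, b):
    """Number of distinct real roots of a squarefree f in ]a, b]."""
    seq = sturm_sequence(f)
    return sign_changes(seq, Fraction(a)) - sign_changes(seq, Fraction(b))


cubic = [0, -1, 0, 1]  # X^3 - X, with roots -1, 0 and 1
```

For instance, X^3 − X has three roots in ]−2, 2] but only two in ]−1, 1], since the interval is open on the left.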
Exercise A.3.34† (Ehrenfeucht’s Criterion). Let K be a characteristic 0 field, let f1 , . . . , fk ∈ K[X]
be polynomials and define

f = f1 (X1 ) + . . . + fk (Xk ) ∈ K[X1 , . . . , Xk ].

If k ≥ 3, prove that f is irreducible. In addition, prove that this result still holds if k = 2 and f1 and
f2 have coprime degrees.
Exercise A.3.35† (IMC 2007). Let a1 , . . . , an be integers. Suppose f : Z → Z is a function such that

$$\sum_{i=1}^{n} f(k a_i + \ell) = 0$$

for any k, ` ∈ Z. Prove that f is identically zero.


Exercise A.3.36. Find all polynomials f ∈ C[X] satisfying
1. $f(X^n) = f(X)^n$ for some integer n ≥ 2.

2. $f(X^2 + 1) = f(X^2) + 1$.
3. $f(X)f(X + 1) = f(X^2 + X + 1)$.
4. $f(X^2) = f(X)f(X + 1)$.

5. $f(X^2) = f(X)f(X - 1)$.
Exercise A.3.37. Let f ∈ K[X] be a polynomial of degree n. Find f (n + 1) if
1. (USAMO 1975) $f(k) = \frac{k}{k+1}$ for k = 0, . . . , n.

2. $f(k) = 2^k$ for k = 0, . . . , n.

18 If we choose a = −∞, b = +∞, this gives an algorithm to compute the number of real roots of f , by looking at the

signs of the leading coefficients of f0 , f1 , . . . since f (±∞) only depends on the leading coefficient of f (as long as it is
non-constant).
Appendix B

Symmetric Polynomials

Prerequisites for this chapter: Section A.1.

B.1 The Fundamental Theorem of Symmetric Polynomials


Given a commutative ring R (in our case we will consider Z and Q) and an integer n ≥ 0, we can
consider the symmetric polynomials in n variables with coefficients in R. These are defined as the
polynomials in n variables invariant under all permutations of these variables.

Definition B.1.1 (Symmetric Polynomials)

We say a polynomial f ∈ R[X1 , . . . , Xn ] is symmetric if f (X1 , . . . , Xn ) = f (Xσ(1) , . . . , Xσ(n) ) for


any permutation σ of [n].

Exercise B.1.1. Let f ∈ K(X1 , . . . , Xn ) be a rational function, where K is a field. Suppose f is symmetric,
i.e. invariant under permutations of X1 , . . . , Xn . Prove that f = g/h for some symmetric polynomials g, h ∈
K[X1 , . . . , Xn ].

As an example, $f = X^2Y + XY^2 + X^2 + Y^2$ is a symmetric polynomial in two variables, and

$$g = X^2YZ + XY^2Z + XYZ^2 + XY^2 + X^2Y + XZ^2 + X^2Z + YZ^2 + Y^2Z$$

is a symmetric polynomial in three variables.

Definition B.1.2 (Elementary Symmetric Polynomials)

The kth elementary symmetric polynomial for k ≥ 0, ek ∈ R[X1 , . . . , Xn ], is defined by

$$e_k = \sum_{1 \leq i_1 < \ldots < i_k \leq n} X_{i_1} \cdots X_{i_k}.$$

Further, if k > n then ek = 0 (the empty sum) and if k = 0 then e0 = 1 (the sum of the empty
product).

The two-variable elementary symmetric polynomials are thus simply e1 = X + Y and e2 = XY . The three-variable
ones are e1 = X + Y + Z, e2 = XY + Y Z + ZX and e3 = XY Z.
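
The definition translates directly into code: sum the products over all k-element subsets of the variables. The following Python sketch evaluates ek at given numbers; the function name `elementary` and the sample values are assumptions made for the illustration.

```python
from itertools import combinations
from math import prod


def elementary(k, xs):
    """Value of the kth elementary symmetric polynomial at the numbers xs."""
    # combinations(xs, 0) yields one empty tuple, whose product is 1 (so e_0 = 1),
    # and combinations(xs, k) is empty for k > len(xs) (so e_k = 0).
    return sum(prod(c) for c in combinations(xs, k))


xs = (2, 3, 5)
values = [elementary(k, xs) for k in range(5)]
```

These values are, up to sign, the coefficients of (X − 2)(X − 3)(X − 5) = X^3 − 10X^2 + 31X − 30, which is Vieta's formulas in action.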

We now state the fundamental theorem of symmetric polynomials.


Theorem B.1.1 (Fundamental Theorem of Symmetric Polynomials)

Suppose f ∈ R[X1 , . . . , Xn ] is a symmetric polynomial. Then f ∈ R[e1 , . . . , en ]. In other words,


there is a polynomial g ∈ R[X1 , . . . , Xn ] such that

f (X1 , . . . , Xn ) = g(e1 , . . . , en ).

This theorem explains why we called ek "elementary symmetric polynomials": because they gen-
erate all symmetric polynomials.
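
For instance, the symmetric polynomial f = X^2 Y + XY^2 + X^2 + Y^2 from the example above can be written as e1 e2 + e1^2 − 2e2. The following Python sketch checks this candidate expression on a grid of integer points; the expression for g was found by hand for this illustration, and since both sides have degree at most 3 in each variable, agreement on an 11 × 11 grid is enough to conclude that they are equal as polynomials.

```python
def f(x, y):
    return x**2 * y + x * y**2 + x**2 + y**2


def g(e1, e2):
    # Candidate expression of f in terms of e1 = X + Y and e2 = XY.
    return e1 * e2 + e1**2 - 2 * e2


agrees = all(
    f(x, y) == g(x + y, x * y)
    for x in range(-5, 6)
    for y in range(-5, 6)
)
```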

Proof

We proceed by induction on the number n of variables. For n = 1 this is trivial as e1 = X. Thus


suppose it is true for n − 1 variables.

Now, we proceed again by induction on the degree of f . When f = 0 this is obvious. Otherwise,
by the first inductive hypothesis,

f (X1 , . . . , Xn−1 , 0) = g(e′1 , . . . , e′n−1 )

for some g ∈ R[X1 , . . . , Xn−1 ], where e′i is the ith elementary symmetric polynomial in n − 1
variables, i.e. e′i (X1 , . . . , Xn−1 ) = ei (X1 , . . . , Xn−1 , 0). Notice that, in particular,

deg f ≥ deg g(e′1 , . . . , e′n−1 ) = w(g)

where the weight w(g) of g is defined as the maximum of the weights of its monomials, and
$w(X_1^{m_1} \cdots X_n^{m_n}) = \sum_{k=1}^{n} k m_k$.

Write
f (X1 , . . . , Xn ) = g(e1 , . . . , en−1 ) + h(X1 , . . . , Xn ).
We have h(X1 , . . . , Xn−1 , 0) = 0, i.e.

Xn | h(X1 , . . . , Xn−1 , Xn )

(h, viewed as a polynomial in Xn with coefficients in R[X1 , . . . , Xn−1 ], has a root at 0). Since h is symmetric, we also have
Xi | h for any i so h = X1 · . . . · Xn s = en s for some s ∈ R[X1 , . . . , Xn ].

To conclude, s is symmetric and has degree at most deg f − n < deg f as

deg g(e1 , . . . , en−1 ) = w(g) = deg g(e′1 , . . . , e′n−1 ) ≤ deg f.

This implies that s = r(e1 , . . . , en ) for some polynomial r by the second inductive hypothesis, so that we get

f = g(e1 , . . . , en−1 ) + en r(e1 , . . . , en )

as wanted.


Exercise B.1.2∗ . Prove that, if e1 , . . . , en ∈ A[X1 , . . . , Xn ] are the elementary symmetric polynomials in n
variables and g ∈ A[X1 , . . . , Xm ] is a polynomial in m ≤ n variables, the degree of g(e1 , . . . , em ) is the weight
w(g) of g.

Exercise B.1.3. Prove that the decomposition of a symmetric polynomial f as g(e1 , . . . , en ) is unique.

Exercise B.1.4. Prove the following generalisation of the fundamental theorem of symmetric polynomials: if

$$f \in R[A_1^{(1)}, \ldots, A_1^{(m_1)}, \ldots, A_n^{(1)}, \ldots, A_n^{(m_n)}]$$

is symmetric in $A_k^{(1)}, \ldots, A_k^{(m_k)}$ for every k ∈ [n], then

$$f \in R[e_1^{A_1}, \ldots, e_{m_1}^{A_1}, \ldots, e_1^{A_n}, \ldots, e_{m_n}^{A_n}]$$

where $e_i^{A_k}$ designates the ith elementary symmetric polynomial in $A_k^{(1)}, \ldots, A_k^{(m_k)}$.

B.2 Newton’s Formulas


Apart from elementary symmetric polynomials, there is another type of polynomial which is of
particular interest.

Definition B.2.1 (Power Sum Polynomials)

The kth power sum polynomial for k ≥ 0, pk ∈ R[X1 , . . . , Xn ], is defined by

$$p_k = X_1^k + \ldots + X_n^k.$$

Here is how they relate to the elementary symmetric polynomials.

Theorem B.2.1 (Newton’s Formulas)

For any integer k ≥ 0, we have

$$k e_k = \sum_{i=1}^{k} (-1)^{i-1} e_{k-i} p_i = e_{k-1} p_1 - e_{k-2} p_2 + \ldots + (-1)^{k-1} p_k.$$

The main importance of these formulas is that they let us recover the ei from the pi by induction,
as the RHS only has ei for i < k. This is expressed in the following corollary.

Corollary B.2.1*

Let K be a field of characteristic 0. Then, K(e1 , . . . , en ) = K(p1 , . . . , pn ). In particular, if

p1 (x1 , . . . , xn ), . . . , pn (x1 , . . . , xn )

all lie in K, then so do


e1 (x1 , . . . , xn ), . . . , en (x1 , . . . , xn ).

Remark B.2.1
This still holds in characteristic p as long as n < p, so that the k in kek is non-zero.
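
The inductive recovery of the ei from the pi can be sketched in a few lines of Python. This is only an illustration over Q (using exact fractions, since we divide by k); the function name and the sample values are hypothetical.

```python
from fractions import Fraction


def elementary_from_power_sums(p):
    """Given p = [p_1, ..., p_n], return [e_0, e_1, ..., e_n] via Newton's formulas."""
    e = [Fraction(1)]
    for k in range(1, len(p) + 1):
        # k e_k = sum_{i=1}^{k} (-1)^(i-1) e_{k-i} p_i, and only e_j with j < k appear.
        s = sum((-1) ** (i - 1) * e[k - i] * p[i - 1] for i in range(1, k + 1))
        e.append(Fraction(s) / k)
    return e


xs = (2, 3, 5)
power_sums = [sum(x**i for x in xs) for i in range(1, 4)]  # p_1, p_2, p_3
recovered = elementary_from_power_sums(power_sums)
```

The division by k is where characteristic 0 (or n < p in characteristic p) is used.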

Exercise B.2.1∗ . Prove Corollary B.2.1.

Elementary Proof of Newton’s Formulas

Let's compute the product ek−i pi . ek−i is the sum of products of k − i distinct variables, while
pi is the sum of the ith powers of all the variables. Thus, ek−i pi is the sum of the products of the
ith power of some variable times k − i distinct variables.

In the product of the k − i distinct variables, either one of these variables will be the same as the
one raised to the ith power, or not. Hence, we get ek−i pi = r(i) + r(i + 1) where r(j) is the sum
of products of one variable raised to the jth power times k − j other distinct variables:

$$r(j) = \sum_{i_1 < \ldots < i_{k-j}} \ \sum_{\ell \neq i_1, \ldots, i_{k-j}} X_{i_1} \cdots X_{i_{k-j}} \cdot X_\ell^{j}.$$

Thus,

$$\sum_{i=1}^{k} (-1)^{i-1} e_{k-i} p_i = (r(1) + r(2)) - (r(2) + r(3)) + \ldots + (-1)^{k-1} (r(k) + r(k+1)) = r(1) + (-1)^{k-1} r(k+1).$$

To conclude, r(1) = kek and r(k + 1) = 0 (there is no sum over −1 variables, or at least none which
is a homogeneous polynomial of degree k).


Proof of Newton’s Formulas using Generating Functions

We now present a proof by generating functions. We work in R[X1 , . . . , Xn ][[T ]], i.e. the ring of
formal power series in T with coefficients in R[X1 , . . . , Xn ]. The idea behind generating functions
is that, if we want to prove that two sequences (am )m≥0 and (bm )m≥0 are the same, it suffices
to prove that their generating functions $\sum_{m \geq 0} a_m T^m$ and $\sum_{m \geq 0} b_m T^m$ are the same, which is
usually done using some algebraic manipulations. Thus, we shall compute the generating function
of the sequence (kek )k≥0 (which is in fact a polynomial since ek = 0 for k > n). Actually,
we will not compute this generating function exactly, but the generating function of this sequence shifted
by 1, i.e.

$$\sum_{k \geq 0} k e_k T^{k-1}.$$

Why? Because this is the derivative of

$$\sum_{k \geq 0} e_k T^k = (1 + X_1 T) \cdots (1 + X_n T).$$

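Hence, we get

$$\begin{aligned}
\sum_{k=0}^{n} k e_k T^{k-1} &= \sum_{i=1}^{n} X_i \prod_{j \neq i} (1 + X_j T) \\
&= \left(\sum_{i=1}^{n} \frac{X_i}{1 + X_i T}\right) \prod_{j=1}^{n} (1 + X_j T) \\
&= \left(\sum_{i=1}^{n} X_i \sum_{j=0}^{\infty} (-X_i T)^j\right) \left(\sum_{i=0}^{n} e_i T^i\right) \\
&= \left(\sum_{i=1}^{n} \sum_{j=0}^{\infty} (-1)^j X_i^{j+1} T^j\right) \left(\sum_{i=0}^{n} e_i T^i\right) \\
&= \left(\sum_{j=0}^{\infty} (-1)^j p_{j+1} T^j\right) \left(\sum_{i=0}^{n} e_i T^i\right).
\end{aligned}$$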

By comparing the $T^{k-1}$ coefficients, we get

$$k e_k = \sum_{i=1}^{k} (-1)^{i-1} p_i e_{k-i}$$

as wanted.


B.3 The Fundamental Theorem of Algebra


Recall the statement of the fundamental theorem of algebra, whose name was coined at a time when
algebra was about solving polynomial equations.1 In reality, it is a theorem about analysis and is
usually proven that way. However, in this section we will present a mostly algebraic proof (a completely
algebraic proof is impossible because R is defined analytically).

Theorem B.3.1 (Fundamental Theorem of Algebra)

Any polynomial f ∈ C[X] of degree n ≥ 0 has exactly n roots, i.e.,

f = a(X − α1 ) · . . . · (X − αn )

where α1 , . . . , αn are the roots of f counted with multiplicity and a is its leading coefficient.

By Proposition A.1.2, it suffices to show that any non-constant polynomial f ∈ C[X] has a root in
C. This is also called the d’Alembert-Gauss theorem because d’Alembert was the first one to recognise
the importance of proving this result but gave a flawed proof, while Gauss was (almost) the first one
to give a rigorous proof (in fact he even gave multiple proofs).

Theorem B.3.2 (d’Alembert-Gauss Theorem)

Any non-constant polynomial with complex coefficients has a complex root.

Notice that we can assume the polynomial f has real coefficients, since if f has complex coefficients,
$g = f\overline{f}$ has real coefficients, where $\overline{f}$ denotes the polynomial obtained by applying complex conjugation
to the coefficients of f . Thus, if we find a root α of g, either α ∈ R, in which case we can use induction on
$\frac{g}{X - \alpha}$, or we know by Proposition A.1.6 that $\overline{\alpha}$ is also a root and we can use induction on $\frac{g}{(X - \alpha)(X - \overline{\alpha})}$.
This would prove that g has as many complex roots as its degree, and thus f too.
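
The claim that the product of f with its conjugate has real coefficients is easy to observe on an example. In the Python sketch below, polynomials are coefficient lists (constant term first); the sample polynomial and the helper names are assumptions made for this illustration.

```python
def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def conjugate(f):
    """Apply complex conjugation to every coefficient."""
    return [complex(a).conjugate() for a in f]


f = [1 + 2j, 3 - 1j, 1]  # X^2 + (3 - i)X + (1 + 2i)
g = poly_mul(f, conjugate(f))
```

Each coefficient of g is a sum of terms of the form a·conj(b) + b·conj(a) or |a|^2, which is why the imaginary parts cancel.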

We shall only use two results. The first one is a corollary of the intermediate value theorem, while
the second one is left as an exercise.

Proposition B.3.1

Any polynomial f ∈ R[X] of odd degree has a real root.

Proof

Since f has odd degree, f (−∞) and f (+∞) have opposite signs so there is a root by the inter-
mediate value theorem. (Here we work in R ∪ {+∞, −∞}, which is just a nice shortcut for not
writing limits.)


1 Now it means abstract algebra and linear algebra, see Section A.2 and Appendix C (the section on abstract algebra
does not do justice to the subject, since it was only about setting up useful definitions for this book).

Proposition B.3.2

Any polynomial f ∈ C[X] of degree 2 has a root in C.

Exercise B.3.1∗ . Prove Proposition B.3.2.

Proof that any non-constant polynomial with real coefficients has a complex root

Let f ∈ R[X] be a polynomial of degree n. We proceed by induction on k = v2 (n); the case


k = 0 is Proposition B.3.1. For the induction step, suppose k ≥ 1 so that n is even.

Let α1 , . . . , αn be the roots of f . You might wonder what that means since we don’t know if
they exist (in C) or not. It’s true that we don’t know whether they lie in C or not yet, but
we can construct them formally, just like i was constructed formally to be an object such that
$i^2 = -1$. Thus we can construct α1 , . . . , αn inductively such that f has n roots in some field
K. (However, beware that, if you add a formal object α such that f (α) = 0 to a field, this will
not necessarily make a field when f isn’t irreducible. But you can factorise f into a product of
irreducible polynomials then add a root of one of the factors and repeat. See Exercise 4.2.1∗ .)

Given a real number t, we consider the polynomial


\[ g_t = \prod_{i<j} \bigl(X - (\alpha_i + \alpha_j + t\alpha_i\alpha_j)\bigr) \]
which has real coefficients by the fundamental theorem of symmetric polynomials: if
$\prod_{i<j} \bigl(X - (X_i + X_j + tX_iX_j)\bigr) = h_t(X, e_1, \ldots, e_n)$ for some $h_t \in \mathbb{R}[X][X_1, \ldots, X_n]$, then

gt = ht (X, e1 (α1 , . . . , αn ), . . . , en (α1 , . . . , αn ))

since ei (α1 , . . . , αn ) ∈ R by Vieta’s formulas.

Notice that $g_t$ has degree $\frac{n(n-1)}{2}$ and, since n is even, $v_2(\deg g_t) = v_2(n) - 1 = k - 1$. Hence, by the
inductive hypothesis, $g_t$ always has a complex root, i.e. $\alpha_i + \alpha_j + t\alpha_i\alpha_j \in \mathbb{C}$ for some i < j.

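Now pick $\frac{n(n-1)}{2} + 1$ distinct non-zero values of t ∈ R. Since there are only $\frac{n(n-1)}{2}$ pairs i < j, by the
pigeonhole principle two of these values, say r ≠ s, must yield the same pair of indices:
$\alpha_i + \alpha_j + r\alpha_i\alpha_j$ is a complex root of $g_r$ and $\alpha_i + \alpha_j + s\alpha_i\alpha_j$ is a complex root
of $g_s$ for the same i, j.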

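By subtracting these two numbers, we get $(r - s)\alpha_i\alpha_j \in \mathbb{C}$ and hence, as r ≠ s, $\alpha_i\alpha_j \in \mathbb{C}$.
Consequently, $\alpha_i + \alpha_j \in \mathbb{C}$ as well. Thus $\alpha_i$
and $\alpha_j$ are roots of the quadratic polynomial $X^2 - (\alpha_i + \alpha_j)X + \alpha_i\alpha_j$ with complex coefficients and
hence also lie in C by Proposition B.3.2. In particular, we have found a complex root of f as
wanted.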
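
As a numerical sanity check of the key step (the auxiliary polynomial has real coefficients although its roots are built from non-real numbers), here is a small sketch. The sample roots, the value t = 3 and the helper names `poly_from_roots` and `g_t` are assumptions chosen for illustration only; floating-point arithmetic is used, so the imaginary parts are only expected to vanish up to rounding.

```python
from itertools import combinations


def poly_from_roots(roots):
    """Coefficients (highest degree first) of the monic polynomial with the given roots."""
    coeffs = [1 + 0j]
    for r in roots:
        # Multiply the current polynomial by (X - r).
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def g_t(roots, t):
    """Coefficients of the product over i < j of X - (a_i + a_j + t * a_i * a_j)."""
    new_roots = [a + b + t * a * b for a, b in combinations(roots, 2)]
    return poly_from_roots(new_roots)


# Roots of the real polynomial (X^2 + 1)(X^2 - 2X + 5): two conjugate pairs.
alphas = [1j, -1j, 1 + 2j, 1 - 2j]
coefficients = g_t(alphas, t=3.0)
max_imaginary_part = max(abs(c.imag) for c in coefficients)
```

Here the degree is 6 = 4 · 3/2, whose 2-adic valuation is one less than that of 4, as in the induction.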


B.4 Exercises
Newton’s Formulas
Exercise B.4.1. Denote by $h_k \in \mathbb{R}[X_1, \ldots, X_n]$ the kth complete homogeneous polynomial, i.e. the
sum of all monomials of degree k. Prove that
\[ k h_k = \sum_{i=1}^{k} h_{k-i} p_i = h_{k-1}p_1 + h_{k-2}p_2 + \ldots + p_k \]

for any k ≥ 0.
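
Before attempting a proof, one can test the identity at a sample point. The sketch below is only a numerical check, not a proof; it assumes that p_i denotes the ith power sum, and the function names `h`, `p` and `identity_holds` as well as the sample point are illustrative.

```python
from itertools import combinations_with_replacement
from math import prod


def h(k, xs):
    """Complete homogeneous polynomial h_k at xs: the sum of all monomials of degree k."""
    return sum(prod(c) for c in combinations_with_replacement(xs, k))


def p(i, xs):
    """Power sum p_i at xs."""
    return sum(x ** i for x in xs)


def identity_holds(k, xs):
    """Check k * h_k = sum over i = 1..k of h_{k-i} * p_i at the point xs."""
    return k * h(k, xs) == sum(h(k - i, xs) * p(i, xs) for i in range(1, k + 1))


point = [2, -3, 5, 7]
results = [identity_holds(k, point) for k in range(0, 7)]
```

Integer arithmetic is exact here, so every entry of `results` is a genuine equality test at that point.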

Exercise B.4.2† (Hermite’s Theorem). Prove that a function f : Fp → Fp is a bijection if and only
if $\sum_{a \in \mathbb{F}_p} f(a)^k$ is 0 for k = 1, . . . , p − 2 and −1 for k = p − 1.
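
For very small primes the criterion can be checked exhaustively over all functions from F_p to itself. This is a brute-force sketch under the assumption that elements of F_p are represented by the integers 0, ..., p − 1; the helper names are illustrative and the check says nothing about larger p.

```python
from itertools import product


def power_sum_test(values, p):
    """Hermite's criterion: sum of f(a)^k is 0 for k = 1..p-2 and -1 for k = p-1 (mod p)."""
    for k in range(1, p - 1):
        if sum(pow(v, k, p) for v in values) % p != 0:
            return False
    return sum(pow(v, p - 1, p) for v in values) % p == p - 1


def criterion_matches_bijectivity(p):
    """Compare the criterion with bijectivity for every function from F_p to F_p."""
    for values in product(range(p), repeat=p):  # values[a] plays the role of f(a)
        is_bijection = len(set(values)) == p
        if power_sum_test(values, p) != is_bijection:
            return False
    return True


checks = {p: criterion_matches_bijectivity(p) for p in (2, 3, 5)}
```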

Exercise B.4.3† . Suppose that α1 , . . . , αn are such that $\alpha_1^k + \ldots + \alpha_n^k$ is an algebraic integer for all
k ≥ 0. Prove that α1 , . . . , αn are algebraic integers.

Algebraic Geometry2
Exercise B.4.4† (Resultant). Let R be a commutative ring, and f, g ∈ R[X] be two polynomials of
respective degrees m and n. For any integer k ≥ 0, denote by Rk [X] the subset of R[X] consisting of
polynomials of degree less than k. The resultant Res(f, g) is defined as the determinant of the linear
map
(u, v) 7→ uf + vg
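from $R_n[X] \times R_m[X]$ to $R_{m+n}[X]$. Prove that, if R is a UFD (see Definition 2.2.2), Res(f, g) = 0 if
and only if f and g have a common factor in R[X] (which is also a UFD by Gauss’s lemma 5.1.3).
Then, show that if $f = \sum_i a_i X^i$ and $g = \sum_i b_i X^i$, we have3
\[
\operatorname{Res}(f, g) =
\begin{vmatrix}
a_0 & 0 & \cdots & 0 & b_0 & 0 & \cdots & 0 \\
a_1 & a_0 & \cdots & 0 & b_1 & b_0 & \cdots & 0 \\
a_2 & a_1 & \ddots & 0 & b_2 & b_1 & \ddots & 0 \\
\vdots & \vdots & \ddots & a_0 & \vdots & \vdots & \ddots & b_0 \\
a_m & a_{m-1} & \cdots & \vdots & b_n & b_{n-1} & \cdots & \vdots \\
0 & a_m & \ddots & \vdots & 0 & b_n & \ddots & \vdots \\
\vdots & \vdots & \ddots & a_{m-1} & \vdots & \vdots & \ddots & b_{n-1} \\
0 & 0 & \cdots & a_m & 0 & 0 & \cdots & b_n
\end{vmatrix},
\]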
and, if $f = a\prod_i (X - \alpha_i)$ and $g = b\prod_j (X - \beta_j)$, then4
\[ \operatorname{Res}(f, g) = a^n b^m \prod_{i,j} (\alpha_i - \beta_j). \]

In addition, prove that Res(f, g) ∈ (f R[X] + gR[X]).5 Finally, prove that if f, g ∈ Z[X] are monic and
uf +vg = 1 for some u, v ∈ Z[X], Res(f, g) = ±1. (It is not necessarily true that (f R[X]+gR[X])∩R =
Res(f, g)R.)
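
To experiment with the resultant, one can build the matrix of (u, v) ↦ uf + vg in the monomial bases and compare its determinant with the product over the roots. The sketch below works over Q with exact fractions; the helper names and the sample polynomials are assumptions made for illustration. Since the sign of the determinant depends on how the bases are ordered, the comparison is made up to sign, and the exponent of each leading coefficient is taken to be the degree of the other polynomial (the determinant has one column per monomial of u and of v).

```python
from fractions import Fraction
from math import prod


def poly_from_roots(lead, roots):
    """Coefficients (lowest degree first) of lead * prod (X - r)."""
    coeffs = [lead]
    for r in roots:
        coeffs = [b - r * a for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def sylvester(f, g):
    """Matrix of (u, v) -> u*f + v*g in the bases 1, X, X^2, ... (columns are images)."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    columns = []
    for shift in range(n):  # images of (X^shift, 0)
        columns.append([0] * shift + f + [0] * (size - shift - m - 1))
    for shift in range(m):  # images of (0, X^shift)
        columns.append([0] * shift + g + [0] * (size - shift - n - 1))
    return [[columns[j][i] for j in range(size)] for i in range(size)]


def determinant(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size, det = len(a), Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return det


alphas, betas = [1, 2, -3], [0, 4]  # roots of f (degree 3) and of g (degree 2)
a, b = 2, 5                         # leading coefficients of f and g
f, g = poly_from_roots(a, alphas), poly_from_roots(b, betas)
res = determinant(sylvester(f, g))
product_formula = a ** len(betas) * b ** len(alphas) * prod(x - y for x in alphas for y in betas)
common_root = determinant(sylvester(poly_from_roots(1, [1, 2]), poly_from_roots(1, [2, 3])))
```

The last line illustrates the vanishing criterion: the two polynomials share the root 2, so the determinant is zero.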
Exercise B.4.5. Prove that
• Res(f g, h) = Res(f, h) Res(g, h) for any f, g, h ∈ R[X].
• Res(f − gh, g) = $b^{k-n}$ Res(f, g) for any f, g, h ∈ R[X] where k is the degree of f − gh, n is the
degree of g and b its leading coefficient.
• Res(F (f, g), G(f, g)) = $\operatorname{Res}(F, G)^k \operatorname{Res}(f, g)^{mn}$ where F, G, f, g ∈ R[X, Y ] are homogeneous
polynomials of respective degrees m, n, k, k. Here, by Res(A, B) for homogeneous A, B ∈ R[X, Y ],
we mean Res(A(X, 1), B(X, 1)).
Exercise B.4.6† (Elimination Theory). Let K be an algebraically closed field, let m, n ≥ 0 be
integers, and let d1 , . . . , dm ≥ 0 be integers. Using Exercise B.4.4† , prove that there are homogeneous
polynomials F1 , . . . , Fk such that, for any homogeneous polynomials f1 , . . . , fm ∈ K[X1 , . . . , Xn ] of
respective degrees d1 , . . . , dm (we exceptionally treat 0 as a polynomial of degree 0 in this exercise),
F1 , . . . , Fk simultaneously vanish at the coefficients of f1 , . . . , fm if and only if f1 , . . . , fm have a
non-zero common root in $K^n$.6 (F1 , . . . , Fk are polynomials in the coefficients of f1 , . . . , fm .)
2 The link with symmetric polynomials is quite feeble, I admit. I included this section because of the resultant, which

is a polynomial symmetric in the roots of its arguments.


3 This is an (m + n) × (m + n) matrix, with n times the element $a_0$ and m times the element $b_0$.
4 In particular, the discriminant of f is $\frac{(-1)^{n(n-1)/2}}{a} \cdot \operatorname{Res}(f, f')$.
5 In other words, the resultant provides an explicit value of a possible constant in Bézout’s lemma for arbitrary rings
(such as Z).
6 Note that for m = n and $d_1 = \ldots = d_m = 1$, this is what the determinant does. See also Remark C.3.1.

Exercise B.4.7† (Hilbert’s Nullstellensatz). Let K be an algebraically closed field. Suppose that
f1 , . . . , fm ∈ K[X1 , . . . , Xn ] have no common zeros in $K^n$. Using Exercise B.4.4† , prove that there exist
polynomials g1 , . . . , gm such that
f1 g1 + . . . + fm gm = 1.
Deduce that, more generally, if f is a polynomial which is zero at common roots of polynomials
f1 , . . . , fm (we do not assume anymore that they have no common roots), then there is an integer k
and polynomials g1 , . . . , gm such that

$f^k = f_1g_1 + \ldots + f_mg_m.$

Exercise B.4.8† (Weak Bézout’s Theorem). Prove that two coprime polynomials f, g ∈ K[X, Y ] of
respective degrees m and n have at most mn common roots in K. (Bézout’s theorem states that they
have exactly mn common roots counted with multiplicity, possibly at infinity.7 )

Exercise B.4.9† . Prove that n + 1 polynomials f1 , . . . , fn+1 ∈ K[X1 , . . . , Xn ] in n variables are


algebraically dependent, meaning that there is some non-zero polynomial f ∈ K[X1 , . . . , Xn+1 ] such
that
f (f1 , . . . , fn+1 ) = 0.

Exercise B.4.10† (Transcendence Bases). Let L/K be a field extension. Call a maximal set of K-
algebraically independent elements of L a transcendence basis. Prove that, if L/K has a transcendence
basis of cardinality n, then all transcendence bases have cardinality n. This n is called the transcendence
degree trdegK L. Finally, show that, if L = K(α1 , . . . , αn ) any maximal algebraically independent
subset of α1 , . . . , αn is a transcendence basis. (In particular trdegK L ≤ n.)

Exercise B.4.11† . Let K be an algebraically closed field which is contained in another field L.
Suppose that f1 , . . . , fm ∈ K[X1 , . . . , Xn ] are polynomials with a common root in L. Prove that they
also have a common root in K.

Miscellaneous
Exercise B.4.12† (ISL 2020 Generalised). Let n ≥ 1 be an integer. Find the maximal N for which
there exists a monomial f ∈ Z[X1 , . . . , Xn ] of degree N which can not be written as a sum
\[ \sum_{i=1}^{n} e_i f_i \]

with fi ∈ Z[X1 , . . . , Xn ].

Exercise B.4.13† (Lagrange). Given a rational function f ∈ K(X1 , . . . , Xn ), we denote by Gf the


set of permutations σ ∈ Sn such that

f (X1 , . . . , Xn ) = f (Xσ(1) , . . . , Xσ(n) ).

Let f, g ∈ K(X1 , . . . , Xn ) be two rational functions. If Gf ⊆ Gg , prove that there exists a rational
function r ∈ K[e1 , . . . , en ](X) such that
g = r ◦ f.

Exercise B.4.14† (Iran Mathematical Olympiad 2012). Prove that there exists a polynomial f ∈
R[X0 , . . . , Xn−1 ] such that, for all a0 , . . . , an−1 ∈ R,

f (a0 , . . . , an−1 ) ≥ 0

is equivalent to the polynomial $X^n + a_{n-1}X^{n-1} + \ldots + a_0$ having only real roots, if and only if
n ∈ {1, 2, 3}.
7 This requires some care: we need to define the multiplicity of common roots as well as what infinity means. See
any introductory text on algebraic geometry, e.g. Shafarevich [39]. See also the appendix on projective geometry of
Silverman-Tate [42].

Exercise B.4.15. Let f ∈ K[X] be a monic polynomial with roots α1 , . . . , αn . Prove that its
discriminant, as defined in Remark 1.3.3 or Exercise B.4.4† , is equal to
\[ (-1)^{n(n-1)/2} \cdot \prod_{i=1}^{n} f'(\alpha_i). \]

Deduce a formula for the discriminant of $X^n + aX + b$ and show that it is valid over any ring (not
necessarily one where $X^n + aX + b$ has at most n roots8 ).

8 Remember that Corollary A.1.1 is valid only over fields and integral domains.
Appendix C

Linear Algebra

Prerequisites for this chapter: Section A.1. Section A.2 is recommended.

C.1 Vector Spaces

Definition C.1.1 (Vector Space)

A vector space V over a field K, also called a K-vector space, is a set where you can add elements
of V and also multiply them by elements of K. More specifically, we have the following axioms.

1. associativity of addition: (u + v) + w = u + (v + w) for any u, v, w ∈ V .


2. commutativity of addition: u + v = v + u for any u, v ∈ V .
3. identity of addition: there is a 0V ∈ V such that u + 0V = 0V + u for any u ∈ V .
4. compatibility of multiplication: (ab)v = a(bv) for any a, b ∈ K and v ∈ V .

5. identity of multiplication: 1K v = v for any v ∈ V (1K is the identity of K).


6. distributivity of multiplication: a(u + v) = au + av and (a + b)u = au + bu for any a, b ∈ K
and u, v ∈ V .
The elements of the base field K are called scalars (and the elements of V vectors, although we
won’t use this terminology much).

Remark C.1.1
We will use the expressions "K-vector space" and "vector space over K" interchangeably in what
follows. The same goes for other words instead of "vector space".

Remark C.1.2
We can also define vector spaces over a commutative ring R which is not a field, in this case they
are not called vector spaces but modules. We will not concern ourselves with them, although they
will be mentioned occasionally in some exercises.

These axioms are again all very obvious and you don’t need to try to remember them: they are
exactly the properties which let us establish the next propositions (that’s why we defined it like that),
so you just need to focus on what’s next. As examples of vector spaces one can take the K-vector
space K, the K-vector space $K^n$ with componentwise addition and multiplication (these are regular
vectors), the finite field $\mathbb{F}_{p^n}$ as an Fp -vector space, and as a final more elaborate example the R-vector space of
functions f : R → R such that f (0) = 0 (it’s closed under addition and under multiplication by scalars).


When the base field is obvious from context or does not matter, we will drop the K.

Definition C.1.2 (Linear Independence)

We say elements u1 , . . . , un of a K-vector space V are linearly independent if no non-trivial linear
combination of them is zero, i.e. $\sum_i a_i u_i = 0$ for ai ∈ K implies ai = 0 for all i. Otherwise, we
say they are linearly dependent.

For instance, the vectors (1, 0) and (0, 1) are linearly independent but the vectors (1, 0), (0, 1) and
(1, 1) aren’t because (1, 0) + (0, 1) − (1, 1) = 0.

Definition C.1.3 (Bases)

A K-basis of a K-vector space V is a family of linearly independent elements (ei )i∈I such that
they span all of V : any element of V is a unique linear combination $\sum_i a_i e_i$ for some ai ∈ K (all
but finitely many ai must be equal to zero so that the sum makes sense).

The most common basis of $\mathbb{R}^n$ for instance is the family of unit vectors

(1, 0, . . . , 0), (0, 1, . . . , 0), . . . , (0, . . . , 0, 1).

The next proposition says that the cardinality of a basis (if there is one) does not depend on the
basis, but only on the vector space. But first, here is an application (even though we have only made
definitions!). When the base field K is obvious, we will drop the K and simply say "basis" and "vector
space".

Problem C.1.1

There are 2n + 1 cows such that, whenever one excludes one of them, the rest can be divided into
two groups of size n such that the sum of the weights of the cows in each group is the same.
Prove that all cows have the same weight.

Solution

Let wi be the weight of the ith cow. First, we solve the problem by induction (say, on the sum of
the absolute values of the weights) when the weights are in Z: if $\sum_{i \in I_k} w_i = \sum_{i \notin I_k,\, i \neq k} w_i$ then
\[ \sum_{i=1}^{2n+1} w_i = w_k + \sum_{i \in I_k} w_i + \sum_{i \notin I_k,\, i \neq k} w_i \equiv w_k \pmod 2 \]

so all weights have the same parity. If they are all even divide them by 2 and get a smaller
solution (unless they are all equal to 0 in which case they are indeed equal), otherwise add 1 to
all of them and divide them by 2 (unless they are all equal to 1 in which case they are indeed
equal); this yields another solution as the groups have the same size by assumption.
Now solve the problem over Q: just multiply the weights by the lcm of the denominators to get
weights in Z and we have already solved that case.
Finally, let’s do the general case: wi ∈ R. Consider the Q-vector space generated by the weights:
V = w1 Q + . . . + w2n+1 Q. Find a basis of V like this: pick any maximal subset of weights which
are linearly independent (wi )i∈I . Indeed, any other weight wk can be represented as a linear
combination of (wi )i∈I since there is a linear combination
\[ a w_k + \sum_{i \in I} a_i w_i = 0 \]
as (wk ) ∪ (wi )i∈I is linearly dependent by assumption, so
\[ w_k = \sum_{i \in I} \frac{-a_i}{a} w_i \]
as a ≠ 0 (otherwise (wi )i∈I are linearly dependent).


Thus, we have found a basis e1 , . . . , em of V . To finish, write each wi as $\sum_j a_{j,i} e_j$. We shall
prove that $a_{j,1}, a_{j,2}, \ldots, a_{j,2n+1}$ satisfy the same conditions as the wi (you can partition them
into two groups of same sum and same size when you remove any one of them) for any j. Since
they are in Q, by our previous step this implies that they are all equal and thus the weights wi
are also all equal. This is however very easy: if $\sum_{i \in I} w_i = \sum_{i \in I'} w_i$ then
\[ \sum_j e_j \sum_{i \in I} a_{j,i} = \sum_j e_j \sum_{i \in I'} a_{j,i} \]
so $\sum_{i \in I} a_{j,i} = \sum_{i \in I'} a_{j,i}$ by definition of a basis. We are done.


Now, we prove that any basis has the same cardinality if there exists a finite one. In that case
we say the vector space is finite-dimensional . Unless otherwise stated, we will always work in the
finite-dimensional case.

Proposition C.1.1 (Dimension)

Suppose e1 , . . . , en is a basis of a vector space V . Then, any basis of V has cardinality n. This
n is called the dimension of V and written dimK V .

In fact we prove more.

Proposition C.1.2

Suppose u1 , . . . , un are linearly independent elements of V and v1 , . . . , vm span all of V . Then


m ≥ n.

Since bases satisfy both of these conditions, we get n ≥ m and m ≥ n so m = n for any bases of
cardinality m and n.

Proof

We prove the contrapositive: if n > m then u1 , . . . , un are linearly dependent. Write
$u_j = \sum_i a_{i,j} v_i$ with $a_{i,j} \in K$. We proceed by induction on m (when m = 1 it is obvious).

Pick an $a_{i,j} \neq 0$; this is possible, as otherwise all ui are zero and so in particular linearly dependent.
Without loss of generality, assume i = j = 1. We will get rid of $u_1$ and $v_1$ for our induction. To
do so, consider the family of vectors $u'_2, u'_3, \ldots, u'_n$ where $u'_k$ is defined by
\[ u'_k := u_k - \frac{a_{1,k}}{a_{1,1}} u_1. \]

Note that, for any 2 ≤ j ≤ n,
\[ u'_j = u_j - \frac{a_{1,j}}{a_{1,1}} u_1 = \sum_{i \geq 2} \left( a_{i,j} - \frac{a_{i,1}a_{1,j}}{a_{1,1}} \right) v_i \]

has no coordinate in v1 . Thus by our inductive hypothesis (e.g. on $V'$, the space generated by
v2 , . . . , vm ) $u'_2, \ldots, u'_n$ are linearly dependent, which implies that u1 , . . . , un are linearly dependent
as well and we are done. (This idea of getting rid of coordinates will be used again in the proof of
Theorem C.3.1, which states that the determinant of linearly independent vectors is non-zero.)


Here is a small application.

Problem C.1.2

Let K be a field and let f, g ∈ K[X] be two non-constant polynomials. Prove that there is an
h ∈ K[X] such that f h is a polynomial in g.

Solution

This amounts to saying that some multiple of f is a polynomial in g. Let n = deg f . We work
in K[X]/(f ) (where (f ) = f K[X] is the ideal generated by f ), i.e. K[X] modulo f . This is a
K-vector space of dimension n as $(1, X, \ldots, X^{n-1})$ is a basis.

Now consider the n + 1 elements $1, g, g^2, \ldots, g^n$, where $g^k$ denotes the kth power of g. Since this
family has more elements than the dimension of the vector space, its elements must be linearly
dependent: we get
\[ \sum_i a_i g^i \equiv 0 \pmod f \]

for some not all zero ai ∈ K. This constitutes the wanted multiple of f (it is non-zero since g is non-constant).
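
The argument is effective: reducing the powers of g modulo f and solving a linear system produces the coefficients. The following is a minimal sketch over Q with exact fractions; the sample polynomials and the helper names `poly_mod`, `poly_mul` and `dependency` are assumptions made for illustration, with coefficient lists stored lowest degree first.

```python
from fractions import Fraction


def poly_mod(p, f):
    """Remainder of p modulo f, padded to length deg f."""
    p = [Fraction(c) for c in p]
    n = len(f) - 1
    while len(p) > n:
        factor = p[-1] / f[-1]
        shift = len(p) - 1 - n
        for i, c in enumerate(f):
            p[shift + i] -= factor * c
        p.pop()  # the leading coefficient is now zero
    return p + [Fraction(0)] * (n - len(p))


def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def dependency(vectors):
    """Coefficients, not all zero, of a vanishing combination (more vectors than coordinates)."""
    rows, cols = len(vectors[0]), len(vectors)
    a = [[vectors[j][i] for j in range(cols)] for i in range(rows)]
    pivots, r = {}, 0  # pivot column -> row
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if a[i][c] != 0), None)
        if pivot is None:
            continue
        a[r], a[pivot] = a[pivot], a[r]
        a[r] = [x / a[r][c] for x in a[r]]
        for i in range(rows):
            if i != r and a[i][c] != 0:
                a[i] = [x - a[i][c] * y for x, y in zip(a[i], a[r])]
        pivots[c] = r
        r += 1
    free = next(c for c in range(cols) if c not in pivots)
    solution = [Fraction(0)] * cols
    solution[free] = Fraction(1)
    for c, row in pivots.items():
        solution[c] = -a[row][free]
    return solution


f = [1, 0, 1, 1]  # f = X^3 + X^2 + 1
g = [0, 1, 1]     # g = X^2 + X
n = len(f) - 1
powers = [poly_mod([1], f)]
for _ in range(n):
    powers.append(poly_mod(poly_mul(powers[-1], g), f))
coeffs = dependency(powers)  # the sum of coeffs[i] * g^i is divisible by f
combination = [Fraction(0)]
for c in reversed(coeffs):   # Horner evaluation of the sum, without reducing modulo f
    combination = poly_mul(combination, g)
    combination[0] += c
remainder = poly_mod(combination, f)
```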


Remark C.1.3
As this solution shows, the problem is extremely flexible. For instance, for any infinite set of
non-negative integers S (e.g. the set of primes), f has a multiple whose non-zero monomials have
the form $aX^s$ for some s ∈ S and a ∈ K.
A final important result on bases, that we already saw in the proof of Problem C.1.1, is that we
can always complete a family of linearly independent vectors to get a basis.

Proposition C.1.3

Any family of linearly independent vectors of a finite-dimensional vector space V can be com-
pleted into a basis by adding elements to it, and from any generating family we can extract a
basis.

Exercise C.1.1∗ . Prove Proposition C.1.3.

Exercise C.1.2. Let V be a finite-dimensional vector space, and let W ⊆ V be a subspace of V . Prove that
dim V /W = dim V − dim W , where V /W is the group quotient from Exercise A.3.15† , i.e. we identify vectors
v ∈ V and v + w for w ∈ W . More formally, the elements of V /W are the sets v + W for v ∈ V .

Remark C.1.4
We will almost exclusively deal with finite-dimensional spaces in this book, and when we don’t
we will not use infinite linear algebra. For instance, K[X] is an infinite-dimensional vector space
over K, but we don’t use linear algebra directly over K[X]. However, it may be interesting to
know that we can prove the existence of a basis of an arbitrary vector space with the help of
Zorn’s lemma (which is equivalent to the axiom of choice). It has even been shown that Zorn’s
lemma is equivalent to this result. Here is a sketch of the proof.

Consider the set L of linearly independent subsets of a vector space V . As we have seen earlier,
a basis is a maximal element (with respect to inclusion) of L. Zorn’s lemma tells us
that such an element exists if L is non-empty, which is clearly true, and every chain (i.e. a subset
of L whose elements are pairwise comparable) has an upper bound in L. Let us check that this is the
case. Suppose that C is a chain of elements of L. Then, note that
\[ U = \bigcup_{S \in C} S \]

is an upper bound of C. Hence, we only need to show that U ∈ L. Suppose otherwise: say
that x1 , . . . , xn ∈ U are linearly dependent (linear combinations always involve a finite number of
elements, since infinite sums don’t make sense in general vector spaces). Some elements S1 , . . . , Sn
of C contain x1 , . . . , xn respectively, and since C is a chain, one of these must be maximal among them:
say Sn ⊇ S1 , . . . , Sn−1 . Then, x1 , . . . , xn ∈ Sn , contradicting the assumption that Sn ∈ L, i.e. that
its elements are linearly independent. (The same argument also shows that any field extension
L/K has a transcendence basis. See Exercise B.4.10† .)

As always, a quick application to finite fields. In Chapter 4 we prove that there exist fields of
cardinality q for every prime power $q = p^n \neq 1$. Here we show the converse: if F is a field with q
elements, then q is a prime power.

Proposition C.1.4

Suppose F is a field with q elements. Then there is some prime p and an integer n such that
$q = p^n$.

Proof

The key is to consider F as a vector space over Fp , where p is the characteristic of F . Indeed, the
characteristic c is a prime, since if 0 = c = ab then either a or b must be zero too which means
a = c or b = c by minimality of c.

Now F is an Fp -vector space in the obvious way: just define $nx = \underbrace{x + \ldots + x}_{n \text{ times}}$ (technically that’s
what we did before with the characteristic too) and this is compatible with Fp as p = 0 in F .

Since F is finite, it is also finite-dimensional as a vector space: let e1 , . . . , en be a basis (there
exists one by Proposition C.1.3). Then, every element of F can be written in a unique way as
\[ \sum_{i=1}^{n} a_i e_i \]

for some $a_i \in \mathbb{F}_p$. There are exactly $p^n$ tuples $(a_1, \ldots, a_n) \in \mathbb{F}_p^n$, so $q = p^n$ as wanted.




C.2 Linear Maps and Matrices


In this section, we consider morphisms of vector spaces which are called linear maps, or linear trans-
formations.1
1 This shows that one should not call polynomial functions of degree 1 "linear", because they are not linear maps

(unless the constant coefficient is 0)! One should call them "affine", because they correspond to affine transformations,
not linear ones.

Definition C.2.1 (Linear Maps)

Let U and V be two K-vector spaces. A linear map ϕ : U → V is a function which is additive
and homogeneous, i.e. ϕ(x + y) = ϕ(x) + ϕ(y) and ϕ(λx) = λϕ(x) for any x, y ∈ U and λ ∈ K.

For instance, the derivative map $f \mapsto f'$ is a linear map from K[X] to itself (as K-vector spaces).
There is a very simple characterisation of linear maps U → V . Let u1 , . . . , um be a basis of U and
v1 , . . . , vn of V . Write
\[ \varphi(u_j) = \sum_i a_{i,j} v_i \]
for j = 1, . . . , m. Then ϕ is uniquely determined by these $a_{i,j}$, and any system of $a_{i,j}$ gives rise to a
linear map U → V : if $x = \sum_j b_j u_j$ then
\[ \varphi(x) = \sum_{i,j} b_j a_{i,j} v_i. \]

Note in particular that this shows that the structure of finite-dimensional vector spaces is more or
less trivial: a vector space of dimension n is isomorphic to $K^n$. However, the profoundness of linear
algebra lies precisely in what these isomorphisms are.

Remark C.2.1
When K = Q, the Q-linear maps are precisely the additive maps. Indeed, it follows from additivity
that ϕ(nx) = nϕ(x) for n ∈ Z, which implies that
\[ \varphi\left(\frac{m}{n} x\right) = \frac{\varphi(mx)}{n} = \frac{m}{n} \varphi(x). \]
Additive functions are also called functions satisfying the "Cauchy equation" (ϕ(x + y) = ϕ(x) +
ϕ(y)). This explains why this equation is unsolvable over R: R is an infinite-dimensional Q-
vector space, so there are a lot of solutions: "just" pick a basis (ui )i∈I of R and send ui wherever
you want. (It is however impossible, in the general case, to prove the existence of a basis of an
infinite-dimensional vector space without the axiom of choice.)

Let us now comment a bit on our proof of Lagrange’s interpolation theorem A.1.2. What we did was
consider the canonical basis e1 , . . . , en+1 of the space $K^{n+1}$ consisting of vectors (b1 , . . . , bn+1 ),
where ei has a 1 in the ith position and zeros everywhere else. Then, for each element of this basis, we found
polynomials fi such that
\[ (f_i(a_1), \ldots, f_i(a_{n+1})) = e_i. \]
Finally, we get the wanted result by taking linear combinations of these fi since e1 , . . . , en+1 is a basis.
We come back to more abstract considerations. Given the bases B = (u1 , . . . , um ) and C =
(v1 , . . . , vn ), we denote the linear map ϕ in matrix form relative to the bases B and C by
\[
M^B_C(\varphi) =
\begin{pmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,m} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,m} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n,1} & a_{n,2} & \cdots & a_{n,m}
\end{pmatrix}.
\]
Note that we have used the index j for elements of the domain, and the index i for the codomain. This
means that, to get the matrix of ϕ, we represent ϕ(u1 ), . . . , ϕ(um ) by column vectors:
\[
\begin{pmatrix} a_{1,1} \\ a_{2,1} \\ \vdots \\ a_{n,1} \end{pmatrix}, \ldots,
\begin{pmatrix} a_{1,m} \\ a_{2,m} \\ \vdots \\ a_{n,m} \end{pmatrix}
\]
and then piece them together.

Definition C.2.2 (Matrices)


 
An m × n matrix is a family $(a_{i,j})_{(i,j) \in [m] \times [n]}$ which is denoted by
\[
\begin{pmatrix}
a_{1,1} & \cdots & a_{1,n} \\
\vdots & \ddots & \vdots \\
a_{m,1} & \cdots & a_{m,n}
\end{pmatrix}.
\]
The set of m × n matrices with coefficients in K is denoted by $K^{m \times n}$; when n = 1 we just write $K^m$.

Note that this last notation clashes with the Cartesian product, and that an element of $K^m$ is
a column vector, not a row vector! To make things worse, we will even denote elements of $K^m$ by
(a1 , . . . , am ), as column vectors take up too much space.
Here is how we define the product of two matrices A and B: if $A = M^C_D(\psi)$ and $B = M^B_C(\varphi)$ where
$U \xrightarrow{\varphi} V \xrightarrow{\psi} W$, we want AB to correspond to $M^B_D(\psi \circ \varphi)$ where D = (w1 , . . . , wℓ ) is a basis of W . Thus
we compute
\[
\psi(\varphi(u_j)) = \psi\left( \sum_k b_{k,j} v_k \right) = \sum_k b_{k,j} \psi(v_k) = \sum_k b_{k,j} \sum_i a_{i,k} w_i = \sum_i w_i \sum_k a_{i,k} b_{k,j}.
\]
Hence we define $(a_{i,j})(b_{i,j}) = (c_{i,j})$ where $c_{i,j} = \sum_k a_{i,k} b_{k,j}$ (scalar product of the ith row of A with
the jth column of B). (In particular, the product of two matrices is only defined when the dimensions
agree: m × n and n × ℓ.)

Definition C.2.3 (Matrix Multiplication)

The product of two matrices $(a_{i,j})$ and $(b_{i,j})$ of dimensions m × n and n × ℓ is the matrix $(c_{i,j})$
of dimension m × ℓ given by $c_{i,j} = \sum_k a_{i,k} b_{k,j}$.
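
The definition is designed so that multiplying matrices corresponds to composing linear maps. The sketch below checks this on one example and exhibits two matrices that do not commute; matrices are stored as lists of rows, and the names `mat_mul` and `apply` as well as the sample matrices are illustrative assumptions.

```python
def mat_mul(a, b):
    """Product of an m x n matrix a and an n x l matrix b (lists of rows)."""
    assert len(a[0]) == len(b), "inner dimensions must agree"
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def apply(matrix, vector):
    """Image of a column vector under the linear map represented by the matrix."""
    return [sum(row[k] * vector[k] for k in range(len(vector))) for row in matrix]


A = [[1, 2, 0], [0, 1, -1]]   # represents a map psi from K^3 to K^2
B = [[2, 0], [1, 1], [0, 3]]  # represents a map phi from K^2 to K^3
v = [5, -7]

composed = apply(A, apply(B, v))       # psi(phi(v))
via_product = apply(mat_mul(A, B), v)  # (AB) v

# Two square matrices whose products in the two orders differ.
P, Q = [[0, 1], [0, 0]], [[0, 0], [1, 0]]
PQ, QP = mat_mul(P, Q), mat_mul(Q, P)
```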

Matrix multiplication is clearly associative, since composition is. It is however not commutative
in general. Similarly, addition of matrices is defined componentwise because we want $M^B_C(\varphi) + M^B_C(\psi)$
to correspond to ϕ + ψ. (We do not define multiplication of matrices to correspond to multiplication
of linear maps because this does not make sense: $x \mapsto x \cdot x$ is not linear.)
         
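Exercise C.2.1∗ . Prove that $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$ but $\begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$.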

Exercise C.2.2∗ . Prove that matrix multiplication is distributive over matrix addition, i.e. A(B + C) =
AB + AC and (A + B)C = AC + BC for any A, B, C of compatible dimensions.
Suppose we want to invert a linear map, i.e. find another linear map $\varphi^{-1}$ such that $\varphi \circ \varphi^{-1} = \mathrm{id}$.
The matrix of the identity is very simple to describe (with one basis): it’s the matrix with ones on the
diagonal and zero everywhere else
\[
M^B_B(\mathrm{id}) =
\begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix} := I_n
\]
since
\[ \mathrm{id}(e_j) = e_j = 0e_1 + \ldots + 0e_{j-1} + e_j + 0e_{j+1} + \ldots + 0e_n. \]
This matrix is called the identity matrix . Thus we would like to invert matrices. Why would this
be useful? Well, this lets us, for instance, perform changes of bases: imagine that we first expressed
ϕ with respect to the bases B = (u1 , . . . , um ) and C = (v1 , . . . , vn ), but then we decided we actually
preferred to express it with respect to $B' = (u'_1, \ldots, u'_m)$ and $C' = (v'_1, \ldots, v'_n)$. Consider the following
two linear maps: $\varphi_U(u'_j) = u_j$ and $\varphi_V(v'_i) = v_i$. It is clear that
\[ M^{B'}_{C'}(\varphi) = M^B_C(\varphi_V \circ \varphi \circ \varphi_U^{-1}) \]

since if $\varphi(u'_j) = \sum_i a'_{i,j} v'_i$, then composing with $\varphi_U^{-1}$ on the right transforms $u_j$ into $u'_j$ and composing
with $\varphi_V$ on the left transforms $v'_i$ into $v_i$.

Thus,
\[ M^{B'}_{C'}(\varphi) = M^C_C(\varphi_V) \, M^B_C(\varphi) \, M^B_B(\varphi_U)^{-1} \]
(the matrices $M^C_C(\varphi_V)$ and $M^B_B(\varphi_U)^{-1}$ are called change of bases matrices). Of particular interest is
the case where U = V , B = C and B′ = C′. In that case, we get an equality of the form $M' = N M N^{-1}$.

Finally, we prove one last result linking the surjectivity and injectivity of a linear map: if a linear
map from a vector space to itself is injective, then it is surjective too and conversely. This is false in
the infinite-dimensional case, but this does not affect us as we only care about the finite-dimensional
one. Recall what the kernel and image of a morphism are. Note that if ϕ is linear, then ker ϕ and im ϕ
are vector spaces (as they are closed under addition, as well as multiplication by scalars).

Definition C.2.4 (Kernel and Image)

Let ϕ : U → V be a linear map. Its kernel ker ϕ is the set of u ∈ U such that ϕ(u) = 0, and its
image is the set of v ∈ V such that ϕ(u) = v for some u ∈ U .

Theorem C.2.1 (Rank-Nullity Theorem)

Suppose ϕ : U → V is a linear map (of finite-dimensional vector spaces). Then, dim ker ϕ +
dim im ϕ = dim U .

Remark C.2.2
This is called the "rank-nullity theorem" because dim im is also called the rank, and nullity means
dim ker.

Proof

Let u1 , . . . , uk be a basis of ker ϕ and $\varphi(u'_1), \ldots, \varphi(u'_m)$ a basis of im ϕ. We prove that
$u_1, \ldots, u_k, u'_1, \ldots, u'_m$ is a basis of U .

First we prove that these elements are linearly independent. Suppose that
\[ \sum_i a_i u_i + \sum_i b_i u'_i = 0. \]

Then, $\sum_i b_i \varphi(u'_i) = 0$ by composing with ϕ. Since $\varphi(u'_1), \ldots, \varphi(u'_m)$ are linearly independent,
this means b1 = . . . = bm = 0. Then, from $\sum_i a_i u_i = 0$, we deduce ai = 0 since u1 , . . . , uk are
also linearly independent.

It remains to prove that they span all of U . Let u ∈ U be an element. Write $\varphi(u) = \sum_i b_i \varphi(u'_i)$.
Then, $\varphi\left(u - \sum_i b_i u'_i\right) = 0$. This means $u - \sum_i b_i u'_i \in \ker \varphi$, so
\[ u - \sum_i b_i u'_i = \sum_i a_i u_i \iff u = \sum_i a_i u_i + \sum_i b_i u'_i \]

as wanted.
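
The proof can be mirrored computationally: row reduction identifies pivot columns (which count the dimension of the image) and produces one kernel vector per remaining column. The following sketch uses exact fractions; the sample matrix and the names `reduced_row_echelon` and `kernel_basis` are illustrative assumptions, and the code illustrates the statement rather than proving it.

```python
from fractions import Fraction


def reduced_row_echelon(matrix):
    """Return the reduced row echelon form and the list of pivot columns."""
    a = [[Fraction(x) for x in row] for row in matrix]
    pivots, r = [], 0
    for c in range(len(a[0])):
        pivot = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if pivot is None:
            continue
        a[r], a[pivot] = a[pivot], a[r]
        a[r] = [x / a[r][c] for x in a[r]]
        for i in range(len(a)):
            if i != r:
                a[i] = [x - a[i][c] * y for x, y in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
    return a, pivots


def kernel_basis(matrix):
    """One kernel vector per non-pivot (free) column."""
    a, pivots = reduced_row_echelon(matrix)
    cols = len(a[0])
    basis = []
    for free in (c for c in range(cols) if c not in pivots):
        vector = [Fraction(0)] * cols
        vector[free] = Fraction(1)
        for row, c in enumerate(pivots):
            vector[c] = -a[row][free]
        basis.append(vector)
    return basis


M = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 0, 1, 0]]  # a linear map from K^4 to K^3
rank = len(reduced_row_echelon(M)[1])           # number of pivot columns
kernel = kernel_basis(M)
nullity = len(kernel)
```

In this code the equality rank + nullity = 4 holds by construction (every column is either a pivot column or a free column); the substantive check is that each vector of `kernel` is indeed sent to zero by M.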


Exercise C.2.3. Define the rank of a linear map ϕ as the dimension of its image. Prove that rank(ϕ + ψ) ≤
rank ϕ + rank ψ and rank(ϕ ◦ ψ) ≤ min(rank ϕ, rank ψ) for any ψ, ϕ.

Remark C.2.3
We can in fact sum this up as the fact that the sequence $0 \to \ker \varphi \to U \xrightarrow{\varphi} \operatorname{im} \varphi \to 0$ is exact (by
definition of exactness, see Exercise A.3.22† ). By Exercise C.1.2, dimension is an acceptable size
measure for Exercise A.3.22† , which then directly yields dim ker ϕ − dim U + dim im ϕ = 0, i.e.
the rank-nullity theorem.

Corollary C.2.1*

A linear map U → U is injective if and only if it is surjective. In other words, a square matrix
has a right-inverse if and only if it has a left-inverse.

Proof

A linear map is injective if and only if its kernel is trivial, i.e. has dimension 0. By the rank-nullity
theorem, this is equivalent to dim im ϕ = dim U , i.e. ϕ being surjective.

An n × n matrix A has a right-inverse if and only if the linear map $K^{n \times n} \to K^{n \times n}$ defined by
$B \mapsto AB$ is surjective, and if $A^{-1}$ is this inverse then $A(A^{-1}A) = A$ so $A^{-1}A = I_n$ by injectivity
(from the rank-nullity theorem) which means $A^{-1}$ is a left inverse too. But if A has a left-inverse,
then $B \mapsto AB$ is injective which means it’s surjective too.


Another corollary is that this lets us deduce the existence part of the Lagrange interpolation
theorem (Theorem A.1.2) from the uniqueness part: for any distinct a1 , . . . , an ∈ K, the map from the vector
space $K_n[X]$ of polynomials of degree less than n to $K^n$ given by

\[ f \mapsto (f(a_1), \ldots, f(a_n)) \]

is injective so must be surjective too since $K_n[X]$ and $K^n$ both have dimension n.
Here is a combinatorial application of the fact that the right-inverse of a matrix is also its left-
inverse.

Problem C.2.1

There are 2n boys and 2n girls at a party. For each pair of girls, there are exactly n boys that
danced with exactly one of them. Prove that the same is true if we exchange the words "boys"
and "girls" in the last sentence.

Solution

Consider the adjacency matrix $M = (a_{i,j})$ defined by $a_{i,j} = 1$ if the ith girl and the jth boy have
danced together and −1 otherwise (so the rows correspond to the girls and the columns to the
boys). We claim that the condition of the problem is exactly equivalent to

\[ MM^T = 2nI_{2n}, \]

where $M^T$ designates the transpose matrix of M , i.e. the matrix $M^T = (b_{i,j})$ where $b_{i,j} = a_{j,i}$
(exchange the rows with the columns). Let’s compute this product: the (i, j) coordinate is
\[ c_{i,j} = \sum_k a_{i,k} b_{k,j} = \sum_k a_{i,k} a_{j,k}. \]

Now what is $a_{i,k}a_{j,k}$? $a_{i,k}$ corresponds to whether the ith girl has danced with the kth boy, and
$a_{j,k}$ to whether the jth girl has danced with the kth boy. Thus $a_{i,k}a_{j,k}$ is −1 = 1 · (−1) = (−1) · 1
if exactly one of them has danced with him, and 1 = 1 · 1 = (−1)(−1) otherwise.

We conclude that the sum $\sum_k a_{i,k}a_{j,k}$ is zero if and only if there are exactly n boys which danced
with exactly one of the girls i, j: for i ≠ j this is exactly what the problem says! For i = j the
sum is trivial: $a_{i,k}^2 = 1$ so $c_{i,i} = \sum_k 1 = 2n$. We have thus proven our claim: the condition is
equivalent to $MM^T = 2nI_{2n}$, i.e. that $M^T/(2n)$ is the right-inverse of M . But then $M^T/(2n)$ is also
the left-inverse of M by the rank-nullity theorem, so
\[ \frac{1}{2n} M^T M = I_{2n} \iff M^T (M^T)^T = 2nI_{2n} \]
which means that the statement is true with the words "boys" and "girls" exchanged (since this
is what the transpose does: it exchanges the rows with the columns).
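
Here is a small instance with 2n = 4 to see the argument at work. The particular ±1 matrix below is an assumed example (a non-symmetric rearrangement of a 4 × 4 Hadamard matrix), and the helper names are illustrative.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def mat_mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def scaled_identity(size, scale):
    return [[scale if i == j else 0 for j in range(size)] for i in range(size)]


def differing_counts(m):
    """For each pair of rows, count the columns in which the two rows differ."""
    size = len(m)
    return {(i, j): sum(1 for x, y in zip(m[i], m[j]) if x != y)
            for i in range(size) for j in range(i + 1, size)}


# Rows are girls, columns are boys; 1 means "danced together", -1 means "did not".
M = [[1, -1, 1, -1],
     [1, 1, -1, -1],
     [1, -1, -1, 1],
     [1, 1, 1, 1]]

girls_condition = set(differing_counts(M).values())            # pairs of girls
boys_condition = set(differing_counts(transpose(M)).values())  # pairs of boys
right_product = mat_mul(M, transpose(M))
left_product = mat_mul(transpose(M), M)
```

With n = 2, every pair of girls is separated by exactly 2 boys, and the same count appears for every pair of boys, matching the two matrix identities.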


C.3 Determinants
The set of matrices almost forms a non-commutative ring under addition and multiplication, except
that multiplication is not always defined (only when the dimensions are compatible). However, for
square matrices it is always possible. Thus, square matrices are usually nicer to study: for instance, we
have seen that they have a right-inverse if and only if they have a left-inverse, which is not true for other
matrices (by the rank-nullity theorem). In this section, we find a criterion to determine which square
matrices are invertible. Note that finding when an n × n matrix A is invertible is equivalent to finding
when the rows or the columns are linearly independent. Indeed, the columns are linearly independent
if and only if the images of the canonical basis
e1 = (1, 0, . . . , 0), . . . , en = (0, . . . , 0, 1)
by the linear map $v \mapsto Av$ from $K^n$ to $K^n$ are linearly independent, i.e. if and only if this map is
injective (which is equivalent to A being invertible). For the rows, one can consider the map $v \mapsto A^T v$,
as $(AB)^T = B^T A^T$ so A is invertible if and only if its transpose is.
Exercise C.3.1. Prove that an m × n matrix can only have a right-inverse if m ≤ n, and only a left-inverse
if m ≥ n. When does such an inverse exist?

Exercise C.3.2∗ . Prove that $(AB)^T = B^T A^T$ for any n × n matrices A, B.


 
Let’s start with the 2 × 2 case. Let $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ be such a matrix. The vectors (a, b) and (c, d) are
linearly dependent if and only if there are some x, y, not both zero, such that (ax + cy, bx + dy) = (0, 0). By rescaling
x and y if necessary, we may assume x = −c and y = a from ax + cy = 0 (unless a = c = 0 but in that
case they’re clearly linearly dependent). Then, bx + dy = 0 becomes ad − bc = 0.

Thus, $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is invertible if and only if ad − bc ≠ 0. In fact, we can even check that
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = (ad - bc) I_2. \]

Exercise C.3.3. Prove this identity.


 
This number ad − bc is called the determinant det M of the matrix $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Our goal will
be to define such a determinant for n × n matrices satisfying the following properties:
• det M is a homogeneous polynomial of degree n in the coordinates of M . Moreover, det M has
degree 1 in each coordinate of M .
• M is invertible if and only if det M ≠ 0.

Remark C.3.1
We can also view the determinant as a device which takes n homogeneous polynomials of degree
1 in n variables, and tells us if they have a non-trivial common root. Indeed, if we set
$f_i = \sum_{j=1}^n a_{i,j} X_j$, we have $f_1(x) = \ldots = f_n(x) = 0$ if and only if

\[ M x = 0, \]

where $M = (a_{i,j})$, and we have seen that $\ker(x \mapsto Mx)$ is non-trivial if and only if M is not invertible.

In fact, these properties define the determinant uniquely up to a multiplicative constant (this will be
proven in Theorem C.3.4)! For the sake of convenience, since we are going to be talking more about
columns and rows, we denote the ith row of M by $M_i$ and the jth column of M by $M^j$ so that
\[ M = [M^1, \cdots, M^n] = \begin{pmatrix} M_1 \\ \vdots \\ M_n \end{pmatrix}. \]

Here is how one can define the determinant of n × n matrices inductively. In Theorem C.3.4
and Theorem C.3.3, we will give two other characterisations of the determinant, the last one being
completely explicit. The reader may wish to take it for granted that the determinant exists for now,
skip to Problem C.3.1 to see an application and come back later to read the proofs.

Definition C.3.1 (Determinant)

Let $A = (a_{i,j})$ be an n × n matrix. Denote by $A_{i,j}$ the matrix obtained by removing the ith row and
the jth column. We define the determinant inductively by det[a] = a and

\[ \det A = a_{1,1} \det A_{1,1} - a_{2,1} \det A_{2,1} + \ldots + (-1)^{n+1} a_{n,1} \det A_{n,1}. \]

Remark C.3.2
 
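We shall also sometimes denote the determinant of
\[ \begin{pmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & \ddots & \vdots \\ a_{n,1} & \cdots & a_{n,n} \end{pmatrix} \quad \text{as} \quad \begin{vmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & \ddots & \vdots \\ a_{n,1} & \cdots & a_{n,n} \end{vmatrix}. \]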

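For 0 × 0 matrices, the determinant is simply 1, and for a 1 × 1 matrix [a], the determinant is a. Here
is an example for 3 × 3 matrices:
\[ \begin{aligned} \det \begin{pmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{pmatrix} &= a_{1,1} \det \begin{pmatrix} a_{2,2} & a_{2,3} \\ a_{3,2} & a_{3,3} \end{pmatrix} - a_{2,1} \det \begin{pmatrix} a_{1,2} & a_{1,3} \\ a_{3,2} & a_{3,3} \end{pmatrix} + a_{3,1} \det \begin{pmatrix} a_{1,2} & a_{1,3} \\ a_{2,2} & a_{2,3} \end{pmatrix} \\ &= a_{1,1} (a_{2,2} a_{3,3} - a_{2,3} a_{3,2}) - a_{2,1} (a_{1,2} a_{3,3} - a_{1,3} a_{3,2}) + a_{3,1} (a_{1,2} a_{2,3} - a_{2,2} a_{1,3}). \end{aligned} \]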
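The inductive definition translates almost word for word into a recursive function. Here is a minimal Python sketch of the expansion along the first column; the names `minor` and `det` are our own choices, indices are 0-based, and the running time is exponential, so it is only meant for small experiments.

```python
def minor(A, i, j):
    """Copy of A without row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det(A):
    """Determinant by expansion along the first column."""
    if len(A) == 0:
        return 1  # convention for the 0 x 0 matrix
    # With a 0-based index i, the sign (-1) ** i is the sign (-1)^(i+1)
    # attached to the (i+1)th row in the 1-based convention.
    return sum((-1) ** i * A[i][0] * det(minor(A, i, 0)) for i in range(len(A)))


print(det([[1, 2], [3, 4]]))                   # ad - bc = 1*4 - 2*3
print(det([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))  # 2*(12 - 2) - 1*(0 - 1) + 1*(0 - 3)
```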

We shall get a more explicit formula for the determinant in Theorem C.3.3, but for now we shall
use this one. Let’s prove that it is linear in each column (we say the determinant is multilinear in the
columns).

Proposition C.3.1

The determinant is linear in each column, i.e., for any k ∈ [n], t ∈ K and $C, C' \in K^n$, we have

\[ \det[A^1, \ldots, A^{k-1}, tC + C', A^{k+1}, \ldots, A^n] = t \det[A^1, \ldots, A^{k-1}, C, A^{k+1}, \ldots, A^n] + \det[A^1, \ldots, A^{k-1}, C', A^{k+1}, \ldots, A^n]. \]

Proof

This follows easily from our inductive definition. For the sake of convenience we shall write
$\det^k_D(M) = \det[M^1, \ldots, M^{k-1}, D, M^{k+1}, \ldots, M^n]$ for any $D \in K^n$ and any M . Let $C = (c_i)$
and $C' = (c'_i)$. First suppose k = 1. Then,
\[ \det^k_{tC+C'}(A) = \sum_i (-1)^{i+1} (tc_i + c'_i) \det A_{i,1} = t \sum_i (-1)^{i+1} c_i \det A_{i,1} + \sum_i (-1)^{i+1} c'_i \det A_{i,1} = t \det^k_C(A) + \det^k_{C'}(A). \]

Otherwise,
\[ \det^k_{tC+C'}(A) = \sum_i (-1)^{i+1} a_{i,1} \det^{k-1}_{tC+C'} A_{i,1} \]
(where the ith coordinate of the column $tC + C'$ is removed along with the ith row),
which is linear by the inductive hypothesis.




Thus, the determinant is invariant under adding one column of the matrix to another if and only if the
determinant of a matrix with two identical columns is 0. Before showing this however, we prove that
the determinant changes by a sign when we exchange two columns. This should satisfy the reader who
was disappointed by the lack of symmetry in our definition.

Proposition C.3.2

The determinant of a matrix is multiplied by −1 when exchanging two of its columns (which are
distinct).

Proof

Say we exchange the ith and jth columns, and without loss of generality suppose j > i. We prove the
claim when exchanging two consecutive columns. Iterating this process yields the desired conclusion:
indeed if we do the switches

\[ i \mapsto i + 1 \mapsto \ldots \mapsto i + (j - i), \]

the kth column goes to the (k − 1)th column for i + 1 ≤ k ≤ j, and then if we do the switches

\[ \underbrace{j - 1}_{\text{originally } j} \mapsto j - 2 \mapsto \ldots \mapsto j - 1 - (j - 1 - i) \]

we have in the end exchanged only the original ith column with the original jth
column. In total, we made $(j - i) + (j - 1 - i)$ switches of consecutive columns, which is an odd
number, so the determinant is multiplied by $(-1)^{(j-i)+(j-1-i)} = -1$.

To prove this for consecutive columns, we introduce a notation similar to the one we used
before:
\[ \det^k_{C,C'}(A) = \det[A^1, \ldots, A^{k-1}, C, C', A^{k+2}, \ldots, A^n]. \]
We have

\[ \begin{aligned} 0 &= \det^k_{A^k + A^{k+1},\, A^k + A^{k+1}}(A) \\ &= \det^k_{A^k, A^k}(A) + \det^k_{A^{k+1}, A^{k+1}}(A) + \det^k_{A^k, A^{k+1}}(A) + \det^k_{A^{k+1}, A^k}(A) \\ &= \det A + \det^k_{A^{k+1}, A^k}(A) \end{aligned} \]

as wanted since the first two determinants are zero as the matrices have two equal columns.


Proposition C.3.3

The determinant of a matrix with two identical columns is zero.

Proof

By induction again (what else can we do when we defined the determinant inductively?). When
n = 1, 2 this is obvious. Otherwise, by switching some columns, we may assume by Proposition C.3.2
that the second and third ones are equal. Then
\[ \det A = \sum_i (-1)^{i+1} a_{i,1} \det A_{i,1} \]
and all $A_{i,1}$ have two identical columns so have zero determinant by the inductive hypothesis.


Exercise C.3.4∗ . Prove that det In = 1.

Exercise C.3.5∗ . Prove that the determinant of a matrix with a zero column is zero.

Exercise C.3.6. Prove that the determinant of a non-invertible matrix is 0.

With this we can almost prove that the determinant is non-zero if and only if the matrix is invertible.
But first, we need to know how to compute a certain kind of determinant: determinants of upper
triangular matrices, i.e. matrices $M = (a_{i,j})$ such that $a_{i,j} = 0$ for j > i
\[ \begin{pmatrix} a_{1,1} & 0 & \cdots & 0 \\ a_{2,1} & a_{2,2} & \cdots & 0 \\ \vdots & \vdots & \ddots & 0 \\ a_{n,1} & a_{n,2} & \cdots & a_{n,n} \end{pmatrix}. \]

Here is a sketch of how we are going to proceed to prove that the determinant of an invertible matrix
is non-zero, using upper triangular matrices.

The determinant is invariant under column operations, i.e. adding a scalar times a column to
another column. Indeed, for i ≠ k,

\[ \det^k_{A^k + tA^i}(A) = \det^k_{A^k}(A) + t \det^k_{A^i}(A) = \det A \]

since $\det^k_{A^i}(A) = 0$ as it has two equal columns. Thus, we shall transform A into an upper triangular
matrix using column operations and exchanging columns. The determinant of this matrix will then be
± det A so we compute it with the following proposition and as a result we conclude that det A ≠ 0.
This idea of transforming A into a triangular matrix is also what we did to prove that bases all had
the same cardinality in Proposition C.1.2.

Proposition C.3.4

The determinant of an upper triangular matrix A = (ai,j ) is the product of the elements on the
diagonal a1,1 · a2,2 · . . . an,n .

Proof

By induction! It’s clearly true for n = 1 and for n ≥ 2 we have

\[ \det A = a_{1,1} \det A_{1,1} - a_{2,1} \det A_{2,1} + \ldots + (-1)^{n+1} a_{n,1} \det A_{n,1}. \]

Now notice that, for i ≥ 2, $A_{i,1}$ has one row full of zeros (the one corresponding to the first row of A) which
means $\det A_{i,1} = 0$ by Exercise C.3.5∗ , i.e. $\det A = a_{1,1} \det A_{1,1} = a_{1,1} a_{2,2} \cdot \ldots \cdot a_{n,n}$ by the
inductive hypothesis.


Theorem C.3.1

Any n × n matrix A is invertible if and only if its determinant det A is non-zero.

Proof

As said before, we will transform A into an upper triangular matrix by making column operations
and exchanging columns. Since this leaves the space generated by the columns unchanged and
changes the determinant by a factor of ±1, it will suffice to prove that an upper triangular matrix
is invertible if and only if its diagonal has no zero element. This is Exercise C.3.7∗ .

Here is how we do it. We proceed by induction on n (n = 1 is trivial as always). If the first row
of A is all zero then we can directly apply the inductive hypothesis on $A_{1,1}$. Otherwise, suppose
that $a_{1,i} \neq 0$. By exchanging the ith column with the first one, we can assume i = 1.

Now, consider the matrix

\[ \left[ A^1,\ A^2 - \frac{a_{1,2}}{a_{1,1}} A^1,\ A^3 - \frac{a_{1,3}}{a_{1,1}} A^1,\ \ldots,\ A^n - \frac{a_{1,n}}{a_{1,1}} A^1 \right]. \]

It is column equivalent to A and its first row is zero, except for $a_{1,1}$. Now apply the inductive
hypothesis to this matrix.


Exercise C.3.7∗ . Prove that an upper triangular matrix is invertible if and only if its determinant is non-zero,
i.e. if the elements on its diagonal are non-zero.
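The reduction used in the proof of Theorem C.3.1 is also a practical way to compute determinants. Below is a minimal Python sketch of this idea, assuming rational entries so that divisions are exact (`fractions.Fraction` is from the standard library; the function name is our own). Each step either exchanges two columns, which flips the sign, or subtracts a multiple of the pivot column from a later column, which changes nothing; at the end the determinant is the signed product of the diagonal.

```python
from fractions import Fraction


def det_by_column_reduction(A):
    """Determinant via column exchanges and column operations (exact over Q)."""
    n = len(A)
    M = [[Fraction(x) for x in row] for row in A]
    sign, result = 1, Fraction(1)
    for r in range(n):
        # Look for a column c >= r whose entry in row r is non-zero.
        pivot = next((c for c in range(r, n) if M[r][c] != 0), None)
        if pivot is None:
            # The triangular form will have a zero on its diagonal.
            return Fraction(0)
        if pivot != r:
            for row in M:
                row[r], row[pivot] = row[pivot], row[r]
            sign = -sign
        # Clear the rest of row r using multiples of column r.
        for c in range(r + 1, n):
            factor = M[r][c] / M[r][r]
            if factor != 0:
                for row in M:
                    row[c] -= factor * row[r]
        result *= M[r][r]
    return sign * result


print(det_by_column_reduction([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))
print(det_by_column_reduction([[1, 2], [2, 4]]) == 0)  # dependent columns
```

Unlike the inductive definition, this takes on the order of n³ arithmetic operations.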

We are finally ready for some applications. We shall first give a proof that algebraic integers are
closed under addition and multiplication using our machinery. In fact we even have the following more
general criterion. A module is like a vector space except it can be over any ring, not necessarily a field.
In this case a Z-module is a space where you can add and subtract elements since multiplication by
integers is trivial. Finitely generated means that $M = u_1 \mathbb{Z} + \ldots + u_m \mathbb{Z}$ for some $u_1, \ldots, u_m$.

Proposition C.3.5

An algebraic number α is an algebraic integer if and only if there is a non-zero finitely generated Z-module
M such that αM ⊆ M .

Proof

If α is an algebraic integer, then we can take $M = \mathbb{Z} + \alpha\mathbb{Z} + \ldots + \alpha^{n-1}\mathbb{Z}$ where n is the degree
of α. For the converse, suppose M is a finitely generated Z-module such that αM ⊆ M .

Let $u_1, \ldots, u_m$ be a system of generators of M . Write

\[ \alpha u_j = \sum_i a_{i,j} u_i \]

with $a_{i,j} \in \mathbb{Z}$. Subtracting $\alpha u_j$ from both sides, we get that the vectors

\[ (a_{1,j}, \ldots, a_{j-1,j}, a_{j,j} - \alpha, a_{j+1,j}, \ldots, a_{m,j}) \]

are linearly dependent over C: they all lie in the hyperplane of vectors x such that $\sum_i x_i u_i = 0$
(we assume M ≠ 0, so the $u_i$ are not all zero). Let $A = (a_{i,j})$. The previous remark means that the columns
of $A - \alpha I_m$ are linearly dependent, i.e. $A - \alpha I_m$ has determinant zero. But the determinant
$\det(A - \alpha I_m)$ is a polynomial in α with integer coefficients since A has integer coordinates.
Moreover, from Lemma C.3.1, we see that its leading coefficient is $(-1)^m$ so $\alpha \in \overline{\mathbb{Z}}$ as wanted.


Corollary C.3.1

The set of algebraic integers $\overline{\mathbb{Z}}$ is closed under addition and multiplication.

Proof

If α and β are algebraic integers of respective degrees m and n, then

\[ M = \mathbb{Z}[\alpha, \beta] := \sum_{0 \le i \le m-1,\ 0 \le j \le n-1} \alpha^i \beta^j \mathbb{Z} \]

is a finitely generated Z-module such that (α + β)M ⊆ M and αβM ⊆ M . Thus α + β and αβ
also are algebraic integers.
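To see the proof of Proposition C.3.5 at work, take α = √2 and β = √3, so that M = Z + √2 Z + √3 Z + √6 Z. The Python sketch below is a worked example of our own (the matrix `A` was computed by hand from (√2 + √3)·1 = √2 + √3, (√2 + √3)·√2 = 2 + √6, (√2 + √3)·√3 = 3 + √6 and (√2 + √3)·√6 = 3√2 + 2√3). It checks that det(A − tI) agrees with t⁴ − 10t² + 1 at enough integer points to determine a polynomial of degree 4, and that √2 + √3 is numerically a root of it.

```python
import math


def det(A):
    """Determinant by expansion along the first column (exact on integers)."""
    if not A:
        return 1
    total = 0
    for i, row in enumerate(A):
        sub = [r[1:] for k, r in enumerate(A) if k != i]
        total += (-1) ** i * row[0] * det(sub)
    return total


# Column j holds the coordinates of (sqrt2 + sqrt3) * u_j
# in the basis (u_0, ..., u_3) = (1, sqrt2, sqrt3, sqrt6).
A = [
    [0, 2, 3, 0],
    [1, 0, 0, 3],
    [1, 0, 0, 2],
    [0, 1, 1, 0],
]


def char_value(t):
    """Value of det(A - t I) at an integer t."""
    return det([[A[i][j] - (t if i == j else 0) for j in range(4)] for i in range(4)])


# Two polynomials of degree 4 agreeing at 7 points are equal.
print(all(char_value(t) == t**4 - 10 * t**2 + 1 for t in range(-3, 4)))

alpha = math.sqrt(2) + math.sqrt(3)
print(abs(alpha**4 - 10 * alpha**2 + 1) < 1e-9)
```

The polynomial is monic with integer coefficients, as the proof predicts.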


Exercise C.3.8. Prove that $\overline{\mathbb{Z}}$ is integrally closed, meaning that, if f is a monic polynomial with algebraic
integer coefficients, then any of its roots is also an algebraic integer. (This is also Exercise 1.5.21† .)

Before presenting more applications, we need two last results. Here is an explicit formula for the
determinant which will make our life a lot easier. It can easily be proven by induction. Theorem C.3.3
will determine exactly which of these ε(σ) are 1.

Lemma C.3.1

The determinant of an n × n matrix $A = (a_{i,j})$ is equal to

\[ \det \begin{pmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & \ddots & \vdots \\ a_{n,1} & \cdots & a_{n,n} \end{pmatrix} = \sum_{\sigma \in S_n} \varepsilon(\sigma) a_{\sigma(1),1} \cdot \ldots \cdot a_{\sigma(n),n} \]

where the sum is taken over all permutations σ of [n] and ε(σ) ∈ {−1, 1}. Moreover, ε(id) = 1
(i.e. the coefficient of $a_{1,1} \cdot \ldots \cdot a_{n,n}$ is 1).

Exercise C.3.9∗ . Prove Lemma C.3.1.

Now, we compute a very important determinant, and then we can move on to applications.

Theorem C.3.2 (Vandermonde Determinant)

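Let $x_1, \cdots, x_n \in K$ be elements. We have

\[ \det \begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{n-1} \\ 1 & x_3 & x_3^2 & \cdots & x_3^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^{n-1} \end{pmatrix} = \prod_{i<j} (x_j - x_i). \]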

Proof

Note that this determinant is zero when $x_i = x_j$ for some i ≠ j since it then has two equal
rows. Now replace $x_i$ by a variable $X_i$ and consider this determinant as a polynomial in
$X_1, \ldots, X_n$. The previous observation implies that it is divisible by $X_i - X_j$ for any i ≠ j, i.e. by
$\prod_{i<j} (X_j - X_i)$. In addition, Lemma C.3.1 shows that the degree of the determinant is at most $\frac{n(n-1)}{2}$
which is the same as the degree of $\prod_{i<j} (X_j - X_i)$ so they are equal up to a multiplicative constant. The same
lemma shows that the coefficient of

\[ X_2 X_3^2 \cdot \ldots \cdot X_n^{n-1} \]

in the determinant is 1, so it is equal to $\prod_{i<j} (X_j - X_i)$ since it has the same coefficient.
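The formula is easy to test numerically. In the following Python sketch the sample values `xs` are arbitrary, and `det` is the naive first-column expansion rewritten here so that the block is self-contained.

```python
from itertools import combinations
from math import prod


def det(A):
    """Determinant by expansion along the first column."""
    if not A:
        return 1
    return sum(
        (-1) ** i * row[0] * det([r[1:] for k, r in enumerate(A) if k != i])
        for i, row in enumerate(A)
    )


def vandermonde(xs):
    """Matrix whose ith row is (1, x_i, x_i^2, ..., x_i^(n-1))."""
    n = len(xs)
    return [[x ** k for k in range(n)] for x in xs]


xs = [2, -1, 5, 3]  # arbitrary sample values
lhs = det(vandermonde(xs))
# combinations(range(n), 2) lists the pairs (i, j) with i < j.
rhs = prod(xs[j] - xs[i] for i, j in combinations(range(len(xs)), 2))
print(lhs == rhs)

# With a repeated value two rows coincide and the determinant vanishes.
print(det(vandermonde([2, 7, 2])))
```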


From this we deduce a very important corollary.

Corollary C.3.2*

If $x_1, \ldots, x_n$ are distinct numbers, then the vectors

\[ (1, x_1, \ldots, x_1^n), \ldots, (1, x_n, \ldots, x_n^n) \]

are linearly independent.

Remark C.3.3
In fact, we didn’t need to do all this to prove this corollary. Indeed, the invertibility of the matrix

\[ M = \begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{n-1} \\ 1 & x_3 & x_3^2 & \cdots & x_3^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^{n-1} \end{pmatrix} \]

when $x_1, \ldots, x_n$ are distinct is exactly what Lagrange interpolation A.1.2 gives us. Indeed, the
non-explicit form of the theorem says precisely that the linear map $x \mapsto Mx$ from $K^n$ to itself
is surjective. I hope the reader doesn’t feel too disheartened by this: there are times when the
explicit value of the determinant will be useful to us (in the exercises).

Here is an arithmetic application, which is exactly how we will use the Vandermonde determinant
in this book (in the exercises).

Problem C.3.1

Let p be a prime number and $a_1, \ldots, a_m$ integers which are not divisible by p. Suppose
$p \mid a_1^k + \ldots + a_m^k$ for k = 1, . . . , m. Prove that p | m.

Solution

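Collect the $a_i$ which are equal modulo p together to get

\[ p \mid \sum_{i=1}^{n} c_i b_i^k \quad \text{for } k = 1, \ldots, m, \]

for some positive integers $c_i$ with $\sum_i c_i = m$ and distinct non-zero $b_i \in \mathbb{F}_p$. Since n ≤ m, we have in
particular a system of equations

\[ \begin{cases} c_1 b_1 + \ldots + c_n b_n \equiv 0 \\ c_1 b_1^2 + \ldots + c_n b_n^2 \equiv 0 \\ \qquad \vdots \\ c_1 b_1^n + \ldots + c_n b_n^n \equiv 0. \end{cases} \]

Since the $b_i$ are distinct and non-zero modulo p, the vectors $(b_1, \ldots, b_1^n), \ldots, (b_n, \ldots, b_n^n)$ are
linearly independent over $\mathbb{F}_p$ by Vandermonde (the ith one is $b_i$ times $(1, b_i, \ldots, b_i^{n-1})$). Thus, we must
have $c_1 \equiv \ldots \equiv c_n \equiv 0$. Since $m = \sum_i c_i$, we have p | m
as wanted.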
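For small primes the statement can be confirmed by brute force. The Python sketch below (function names are ours; the bounds on p and m are arbitrary) enumerates all multisets of residues not divisible by p and checks that whenever the first m power sums vanish modulo p, the number m is a multiple of p.

```python
from itertools import combinations_with_replacement


def power_sums_vanish(values, p):
    """True if p divides a_1^k + ... + a_m^k for every k = 1, ..., m."""
    m = len(values)
    return all(sum(pow(a, k, p) for a in values) % p == 0 for k in range(1, m + 1))


def statement_holds(p, max_m):
    """Check Problem C.3.1 for the prime p and every m up to max_m."""
    for m in range(1, max_m + 1):
        # Only residues modulo p matter, and the order of the a_i is irrelevant.
        for values in combinations_with_replacement(range(1, p), m):
            if power_sums_vanish(values, p) and m % p != 0:
                return False
    return True


print(statement_holds(3, 7), statement_holds(5, 7), statement_holds(7, 8))
print(power_sums_vanish((1, 1, 1), 3))  # here m = 3 is indeed divisible by 3
```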


As we saw from the first example, matrices are deeply linked with systems of linear equations. In
hindsight, this is obvious: the system

\[ \begin{cases} a_{1,1} x_1 + a_{1,2} x_2 + \ldots + a_{1,n} x_n = b_1 \\ a_{2,1} x_1 + a_{2,2} x_2 + \ldots + a_{2,n} x_n = b_2 \\ \qquad \vdots \\ a_{n,1} x_1 + a_{n,2} x_2 + \ldots + a_{n,n} x_n = b_n \end{cases} \]

is equivalent to

\[ \begin{pmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & \ddots & \vdots \\ a_{n,1} & \cdots & a_{n,n} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = x_1 \begin{pmatrix} a_{1,1} \\ a_{2,1} \\ \vdots \\ a_{n,1} \end{pmatrix} + x_2 \begin{pmatrix} a_{1,2} \\ a_{2,2} \\ \vdots \\ a_{n,2} \end{pmatrix} + \ldots + x_n \begin{pmatrix} a_{1,n} \\ a_{2,n} \\ \vdots \\ a_{n,n} \end{pmatrix} = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} \]

i.e. to AX = B where $X = (x_i)$, $A = (a_{i,j})$ and $B = (b_i)$. In particular, when A is invertible there is
a unique solution, so if at some point in a problem we reach such a system of equations and
know a trivial solution, then we know what the $x_i$ are since it’s the only solution.

As a last application of determinants, we give a different solution to the cows problem C.1.1.

Alternative Solution to the Cows Problem C.1.1

Again, let $w_i$ be the weight of the ith cow. Write

\[ \sum_{i \in I_k} w_i = \sum_{i \in J_k} w_i \]

where $|I_k| = |J_k| = n$ and $I_k \cup J_k = [2n + 1] \setminus \{k\}$ and suppose $2n + 1 \in I_k$ for k ≠ 2n + 1.

Consider the system of 2n linear equations in 2n unknowns

\[ \sum_{i \in J_k} w_i - \sum_{i \in I_k,\ i \neq 2n+1} w_i = w_{2n+1} \]

for k = 1, . . . , 2n. The determinant of the associated matrix has the form
\[ \begin{vmatrix} 0 & \pm 1 & \pm 1 & \cdots & \pm 1 \\ \pm 1 & 0 & \pm 1 & \cdots & \pm 1 \\ \pm 1 & \pm 1 & 0 & \cdots & \pm 1 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \pm 1 & \pm 1 & \pm 1 & \cdots & 0 \end{vmatrix} \]

where there are 0s on the diagonal and a 1 in the (i, j) coordinate if $i \in J_j$ and a −1 otherwise.
We wish to show that this determinant is non-zero. Thus, there will be a unique solution to the
system, and since $w_1 = \ldots = w_{2n} = w_{2n+1}$ is such a solution it will imply that they are indeed
all equal. Modulo 2, the determinant is simply
\[ \begin{vmatrix} 0 & 1 & \cdots & 1 \\ 1 & 0 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 0 \end{vmatrix} = \sum_{\sigma \in S_{2n}} \varepsilon(\sigma) a_{1,\sigma(1)} \cdot \ldots \cdot a_{2n,\sigma(2n)} \equiv \sum_{\sigma \in S_{2n}} a_{1,\sigma(1)} \cdot \ldots \cdot a_{2n,\sigma(2n)} \pmod 2 \]

where the matrix above is $A = (a_{i,j})$. Now, $a_{1,\sigma(1)} \cdot \ldots \cdot a_{2n,\sigma(2n)}$ is 1 if and only if σ has no fixed
point. Thus, this determinant is congruent modulo 2 to the number of derangements of [2n], i.e. permutations
without fixed points. Exercise C.3.10∗ implies that this number is odd, so the original
determinant is also odd and in particular non-zero and we are done.


Exercise C.3.10∗ . Prove that the number of derangements of [m] is

\[ \sum_{i=0}^{m} \frac{(-1)^i\, m!}{i!} \]

and that this number is odd if m is even and even if m is odd.
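Before proving the formula, one can compare it with a direct enumeration for small m. This Python sketch (function names are ours) does so and lists the parities; it is a check, not a proof.

```python
from itertools import permutations
from math import factorial


def derangements_formula(m):
    """Sum of (-1)^i * m! / i! for i = 0, ..., m (each term is an integer)."""
    return sum((-1) ** i * (factorial(m) // factorial(i)) for i in range(m + 1))


def derangements_brute(m):
    """Count the permutations of {0, ..., m-1} without fixed points."""
    return sum(all(s[i] != i for i in range(m)) for s in permutations(range(m)))


for m in range(8):
    d = derangements_formula(m)
    print(m, d, d == derangements_brute(m), "odd" if d % 2 else "even")
```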

Remark C.3.4
One can also compute directly
\[ \begin{vmatrix} 0 & 1 & \cdots & 1 \\ 1 & 0 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 0 \end{vmatrix}, \]

as a consequence of, e.g., Exercise C.5.8.

All right, now let’s finish with the determinant.

Definition C.3.2 (Signature)

The signature ε(σ) of a permutation σ of [n] is

\[ \varepsilon(\sigma) = \prod_{1 \le i < j \le n} \frac{\sigma(i) - \sigma(j)}{i - j}. \]

Note that since σ is a permutation, its signature is in {−1, 1} (up to sign, the numerators are the same
as the denominators). More precisely, the signature of a permutation
is −1 raised to its number of inversions, i.e. the number of j > i such that σ(j) < σ(i) (an inversion has a
contribution of −1 in the signature).
This definition of the signature is in fact not always convenient to work with, so we shall also
mention another one. One can see that when we apply a transposition to σ, i.e. switch two of its
values σ(i) and σ(j), the signature is multiplied by −1. Since any permutation is a composition of
transpositions, the signature is 1 if there are an even number of transpositions and −1 otherwise. In
the first case we say the permutation is even, and in the second one that it is odd.
In particular the parity of the number of transpositions does not depend on which transpositions we choose. For
example, if one starts with the sequence (1, . . . , 2m) and switches a pair of elements at each step, one will
never be able to go back to the original tuple after an odd number of steps since the signature will be −1
while the signature of the identity is 1.
As another consequence of this characterisation, we see that the signature is a morphism of groups
$S_n \to \{-1, 1\}$: indeed $\varepsilon(\sigma \circ \sigma') = \varepsilon(\sigma) \cdot \varepsilon(\sigma')$ (this is obvious if you consider σ and σ′ as a composition
of transpositions).
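These descriptions of the signature can be compared by machine on a small symmetric group. In the Python sketch below, permutations of {0, . . . , n − 1} are stored as tuples (a representation chosen for convenience, like the function names); it checks that the product formula agrees with the count of inversions and that the signature is multiplicative on S₄.

```python
from fractions import Fraction
from itertools import permutations
from math import prod


def signature(s):
    """(-1) raised to the number of inversions of s."""
    n = len(s)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if s[i] > s[j])
    return -1 if inversions % 2 else 1


def signature_product(s):
    """Product of (s(i) - s(j)) / (i - j) over all pairs i < j."""
    n = len(s)
    return prod(Fraction(s[i] - s[j], i - j) for i in range(n) for j in range(i + 1, n))


def compose(s, t):
    """The permutation i -> s(t(i))."""
    return tuple(s[t[i]] for i in range(len(t)))


perms = list(permutations(range(4)))
print(all(signature(s) == signature_product(s) for s in perms))
print(all(signature(compose(s, t)) == signature(s) * signature(t)
          for s in perms for t in perms))
print(signature((1, 0, 2, 3)))  # a transposition is odd
```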
Exercise C.3.11∗ . Prove that the signature is negated when one exchanges two values of σ (i.e. composes a
transposition with σ).

Exercise C.3.12∗ . Prove that transpositions τi,j : i ↔ j and k 7→ k for k 6= i, j generate all permutations
(through composition).

Remark C.3.5
Proposition C.3.2 now reads as follows: when we apply a transposition to the columns, the
determinant gets multiplied by −1. Thus, when we apply a permutation σ to the columns, the
determinant gets multiplied by ε(σ). (This is also a direct corollary of the next theorem.)

We get the following refinement of Lemma C.3.1 by induction.

Theorem C.3.3

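The determinant of an n × n matrix $A = (a_{i,j})$ is equal to

\[ \det \begin{pmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & \ddots & \vdots \\ a_{n,1} & \cdots & a_{n,n} \end{pmatrix} = \sum_{\sigma \in S_n} \varepsilon(\sigma) a_{\sigma(1),1} \cdot \ldots \cdot a_{\sigma(n),n} \]

where the sum is taken over all permutations σ of [n].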

Exercise C.3.13∗ . Prove Theorem C.3.3.


Since this formula is symmetric with respect to the rows and the columns (replace σ by its inverse,
which has the same signature), we get that the determinant is also symmetric with respect to the rows
and the columns. In particular, the expansion with respect to the first column that we used to define the
determinant also holds for rows (and using Proposition C.3.2 it holds, up to sign, for any row and any column).

Corollary C.3.3

For any square matrix A, det A = det AT .

Exercise C.3.14∗ . Prove that det A = det AT for any square matrix A.
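The explicit formula of Theorem C.3.3 can be implemented directly. The Python sketch below (names are ours, indices are 0-based) sums over all n! permutations, so it is only usable for small n; it also compares a sample matrix with its transpose, in the spirit of Corollary C.3.3.

```python
from itertools import permutations


def signature(s):
    """(-1) raised to the number of inversions of the tuple s."""
    n = len(s)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if s[i] > s[j])
    return -1 if inversions % 2 else 1


def det_leibniz(A):
    """Sum over all permutations s of signature(s) * product of A[s(j)][j]."""
    n = len(A)
    total = 0
    for s in permutations(range(n)):
        term = signature(s)
        for j in range(n):
            term *= A[s[j]][j]
        total += term
    return total


A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]  # arbitrary sample matrix
A_transposed = [list(col) for col in zip(*A)]
print(det_leibniz(A), det_leibniz(A_transposed))
```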

As promised in the beginning of the section, the determinant is the unique solution of a certain
functional equation. In fact, this equation is more or less equivalent to our inductive definition as
we shall see. As a consequence, we will see that this implies the multiplicativity of the determinant.

Theorem C.3.4

The determinant of n × n matrices is the only function D which is multilinear (linear in all
columns), alternating (zero when two columns are the same), and such that D(In ) = 1.

Proof

We proceed by induction on n. When n = 1 it’s obvious. For the inductive step, consider the
canonical basis of $K^n$, i.e. the column vectors $E^i$ with a 1 in ith position and zeros everywhere
else.

The same proof as Proposition C.3.2 shows that when we exchange two columns, D gets multiplied
by −1 (since the only things we used there were the multilinearity and the vanishing on matrices with two
equal columns). Note that if $A = (a_{i,j})$ is an (n − 1) × (n − 1) matrix, then
\[ D_k : A \mapsto D \begin{pmatrix} 0 & a_{1,1} & \cdots & a_{1,n-1} \\ \vdots & \vdots & & \vdots \\ 1 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & a_{n-1,1} & \cdots & a_{n-1,n-1} \end{pmatrix} \]

where the first column is $E^k$, the kth row has only zeros after its first coordinate, and the other rows have the matrix
A (a bit distorted if k ≠ 1, n) also satisfies the conditions of the theorem, except possibly
the unitary condition. One can check that $D_k(I_{n-1}) = (-1)^{k-1}$ by exchanging some columns
(Exercise C.3.15∗ ), thus $D_k(A) = (-1)^{k-1} \det A$ by the inductive hypothesis. Notice also that, for
any $b_1, \ldots, b_{n-1}$,
\[ D \begin{pmatrix} 0 & a_{1,1} & \cdots & a_{1,n-1} \\ \vdots & \vdots & & \vdots \\ 1 & b_1 & \cdots & b_{n-1} \\ \vdots & \vdots & & \vdots \\ 0 & a_{n-1,1} & \cdots & a_{n-1,n-1} \end{pmatrix} = D_k(A) \]
since by adding multiples of the first column $E^k$ to the other ones we can get rid of the $b_i$ as this doesn’t
change the determinant.

Finally, using the multilinearity we have

\[ D(A) = D\left( \sum_{i=1}^{n} a_{i,1} E^i, A^2, \ldots, A^n \right) = \sum_{i=1}^{n} a_{i,1} D_i(A_{i,1}) \]

by the previous remark. Since $D_i(A_{i,1}) = (-1)^{i-1} \det A_{i,1}$, this is the recurrence we originally
defined the determinant with so we are done.


Exercise C.3.15∗ . Prove that Dk (In−1 ) = (−1)k−1 .

From this we deduce the following.

Proposition C.3.6 (Multiplicativity of the Determinant)*

The determinant is multiplicative, i.e. for any n × n matrices A and B we have det(AB) =
det(A) det(B).

Proof

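Fix A and suppose first that det A ≠ 0. The jth column of AB is $AB^j$, which depends linearly on the jth
column of B only, so the function B ↦ det(AB)/ det(A) satisfies all conditions of Theorem C.3.4 and is thus
equal to det(B). If det A = 0, then A is not invertible by Theorem C.3.1, so neither is AB, and both sides
are zero.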


Exercise C.3.16. Prove that the determinant is multiplicative by using the explicit formula of Theorem C.3.3.

 
Exercise C.3.17. Let $M = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}$ be a block-triangular matrix (meaning that $A \in K^{m \times m}$ for some
m, $C \in K^{(n-m) \times (n-m)}$, $B \in K^{m \times (n-m)}$ and we consider 0 as the zero matrix in $K^{(n-m) \times m}$). Prove that
det M = det A det C.

As an important corollary, we get $\det(A^{-1}) = \det(A)^{-1}$ since

\[ \det(A) \det(A^{-1}) = \det(I_n) = 1. \]

This result also lets us define the determinant of a linear map, although we won’t be using this here.

Corollary C.3.4

We can define the determinant of a linear map ϕ : U → U as the determinant of any matrix
$M_B^B(\varphi)$ representing ϕ.

Proof

We need to show that this is well-defined. If M is a matrix representing ϕ, then the other
matrices representing ϕ have the form $N M N^{-1}$ for an invertible N (see Section C.2). Since the
determinant is multiplicative, we have

\[ \det(N M N^{-1}) = \det(N) \det(M) \det(N)^{-1} = \det(M) \]

as wanted.


Remark C.3.6
With this notion of determinant of a linear map, the norm $N_{L/K}(\alpha)$ is sometimes defined as the
determinant of the linear map from L to L defined by $x \mapsto x\alpha$ (see Chapter 6).

Exercise C.3.18. Let L/K be a finite extension. Prove that the determinant of the K-linear map L → L
defined by x 7→ xα is the norm of α defined in Definition 6.2.3.

Finally, one might be interested in knowing when, say, a matrix with integer coordinates has an
inverse with integer coordinates as well. This is achieved by Proposition C.3.6 and the following result,
which gives an explicit formula for the inverse of a given matrix.

Proposition C.3.7

The adjugate of A, $\operatorname{adj} A := ((-1)^{i+j} \det(A_{j,i}))_{i,j}$, satisfies $A \operatorname{adj} A = (\det A) I_n = (\operatorname{adj} A) A$.

Remark C.3.7
The transpose of the adjugate, $\operatorname{com} A := ((-1)^{i+j} \det(A_{i,j}))_{i,j}$, is called the comatrix of A.

Let us first check an easy special case of this result. When n = 2, the $A_{j,i}$ are 1 × 1 matrices which
means that their determinant is particularly simple: Proposition C.3.7 becomes
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = (ad - bc) I_2 \]

which is helpful to keep in mind as it allows one to compute very quickly the inverse of a given
2 × 2 matrix.

Proof
Let’s compute $(b_{i,j}) = A \operatorname{adj} A$. We have $b_{i,j} = \sum_{k=1}^{n} a_{i,k} (-1)^{j+k} \det(A_{j,k})$. When i = j this
is the ith row expansion of the determinant so is indeed equal to det A. When i ≠ j, it is still
the expansion of a determinant, but not of A: it is the jth row expansion of the determinant of the matrix
obtained by replacing the jth row of A by its ith row. This matrix has two identical rows so its
determinant is zero.

Thus we have $b_{i,j} = 0$ if i ≠ j and $b_{i,i} = \det A$, i.e. $A \operatorname{adj} A = (\det A) I_n$. The coordinates of
$(\operatorname{adj} A) A$ are treated in a similar fashion, by noting that they are column expansions of certain
determinants this time.


Exercise C.3.19∗ . Prove that adj AA = (det A)In .
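Here is a small Python sketch of the adjugate, with helper names of our own and a sample matrix chosen to have determinant 1, so that its adjugate is an inverse with integer coordinates. It is an illustration of the statement of Proposition C.3.7, not a proof.

```python
def minor(A, i, j):
    """Copy of A without row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det(A):
    """Determinant by expansion along the first column."""
    if not A:
        return 1
    return sum((-1) ** i * A[i][0] * det(minor(A, i, 0)) for i in range(len(A)))


def adjugate(A):
    """Entry (i, j) is (-1)^(i+j) times the determinant of A without row j and column i."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


A = [[2, 1, 1], [1, 1, 0], [1, 1, 1]]  # sample integer matrix with determinant 1
d = det(A)
scaled_identity = [[d if i == j else 0 for j in range(3)] for i in range(3)]
print(d, adjugate(A))
print(matmul(A, adjugate(A)) == scaled_identity)
print(matmul(adjugate(A), A) == scaled_identity)
```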

Remark C.3.8
There is another way to argue that $(\operatorname{adj} A) A$ is also equal to $(\det A) I_n$ once we have proven $A \operatorname{adj} A$
is. Note that this is a polynomial equation in the coordinates of A, so if it holds sufficiently many
times in a fixed infinite field K it must always hold (e.g. by Exercise A.1.7∗ ). Suppose that det A
is non-zero. Then, adj A/ det A is the inverse of A so it commutes with A as wanted. Finally,
$((\operatorname{adj} A) A - (\det A) I_n) \det A$ is zero for all $A \in K^{n \times n}$ so must be identically zero, which implies
$(\operatorname{adj} A) A = (\det A) I_n$ since the determinant is not the zero polynomial.

In fact, this also gives another proof of Theorem C.3.1. This follows from the more general corollary
below, which answers our question about invertible matrices with integer coefficients.

Corollary C.3.5*

Let A be a matrix with coefficients in a commutative ring R. A is invertible in R (i.e. A has an


inverse with coordinates in R) if and only if det A is a unit of R.

For instance, a matrix with integer coordinates has an inverse with integer coordinates if and only
if its determinant is ±1.

Proof

If $A^{-1}$ has coordinates in R, then $\det(A) \det(A^{-1}) = 1$ so det A is a unit. Conversely, if u det A =
1, then $A(u \operatorname{adj}(A)) = I_n$.


C.4 Linear Recurrences


In this short section, we derive the formula for linear recurrences using the Vandermonde determinant,
which will be used a lot in this book.2

Definition C.4.1 (Linear Recurrences)

We say a sequence $(u_n)_{n \in \mathbb{Z}}$ is a linear recurrence if there is some k ≥ 1 and numbers $a_0, \ldots, a_{k-1} \in K$
such that $a_0 \neq 0$ and
\[ u_{n+k} = \sum_{i=0}^{k-1} a_i u_{n+i} \]

for all n ∈ Z. The smallest such k is called the order of the linear recurrence, and the polynomial
$f = X^k - a_{k-1} X^{k-1} - \ldots - a_1 X - a_0$ is its characteristic polynomial. This polynomial is also called
the characteristic polynomial of the above equation (and the equation is called the equation
associated with f ).

Theorem C.4.1 (Linear Recurrences)

Let K be a field of characteristic zero and $(u_n)_{n \in \mathbb{Z}}$ be a linear recurrence of elements of K with
characteristic polynomial f . Suppose the distinct roots of f are $\alpha_1, \ldots, \alpha_r$ with multiplicities
$m_1, \ldots, m_r$. Then, there exist polynomials $f_1, \ldots, f_r$ of degrees less than $m_1, \ldots, m_r$ respectively
such that
\[ u_n = f_1(n) \alpha_1^n + \ldots + f_r(n) \alpha_r^n \]
for all n ∈ Z.
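As a concrete instance (an example of our own, not taken from the text), consider the characteristic polynomial (X − 2)²(X − 3) = X³ − 7X² + 16X − 12, i.e. the recurrence u_{n+3} = 7u_{n+2} − 16u_{n+1} + 12u_n. The theorem predicts solutions of the form (a + bn)2ⁿ + c·3ⁿ. The Python sketch below takes a = 1, b = 2, c = −1 and compares the closed form with the terms produced by the recurrence.

```python
def run_recurrence(coeffs, initial, length):
    """First `length` terms of u_{n+k} = sum_i coeffs[i] * u_{n+i}, given u_0, ..., u_{k-1}."""
    u = list(initial)
    k = len(coeffs)
    while len(u) < length:
        # The last k terms are u_n, ..., u_{n+k-1}, matching a_0, ..., a_{k-1}.
        u.append(sum(c * x for c, x in zip(coeffs, u[-k:])))
    return u[:length]


def closed_form(n):
    """(1 + 2n) * 2^n - 3^n: a root of multiplicity 2 and a simple root."""
    return (1 + 2 * n) * 2**n - 3**n


coeffs = [12, -16, 7]  # a_0, a_1, a_2
terms = run_recurrence(coeffs, [closed_form(n) for n in range(3)], 20)
print(terms[:6])
print(terms == [closed_form(n) for n in range(20)])
```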

Proof

Let d be the degree of f . Consider the K-vector space of sequences satisfying the equation
associated with f (which is indeed a vector space as it is closed under addition and multiplication by
scalars). This space has dimension d: indeed for any $(x_0, \ldots, x_{d-1}) \in K^d$ there is a unique solution
$(u_n)_n$ to the recurrence such that $u_0 = x_0, \ldots, u_{d-1} = x_{d-1}$. Thus, the dimension of this space is the
same as $\dim K^d = d$.

Now we prove that all sequences of the form $u_n = f_1(n) \alpha_1^n + \ldots + f_r(n) \alpha_r^n$ are solutions.
These sequences also form a K-vector space, with generating family given by

\[ u_n = n(n - 1) \cdot \ldots \cdot (n - (k - 1)) \alpha_i^n \]

for $k = 0, \ldots, m_i - 1$ and i = 1, . . . , r, so it suffices to check that these are solutions. Writing
$f = X^d - \sum_{j=0}^{d-1} a_j X^j$, we thus want to have

\[ (n + d)(n + d - 1) \cdot \ldots \cdot (n + d - (k - 1)) \alpha_i^{n+d} = \sum_{j=0}^{d-1} a_j (n + j)(n + j - 1) \cdot \ldots \cdot (n + j - (k - 1)) \alpha_i^{n+j} \]

which is equivalent to
\[ (X^n f)^{(k)}(\alpha_i) = 0 \]
and that’s true because $\alpha_i$ is a root of multiplicity $m_i > k$ of $X^n f$.

Finally, to show that all solutions have this form we want to prove that the dimension of the
space of solutions of this form is the same as the dimension of the space of all solutions, i.e. d.
Since our generating family had exactly $m_1 + \ldots + m_r = d$ elements, this is equivalent to it being
a basis.

2 Mainly in the exercises, though.



Thus, we want to show it is linearly independent. Suppose that a linear combination was zero,
i.e.
\[ u_n := \sum_i f_i(n) \alpha_i^n = 0 \]

for all n. We shall prove that $f_i(n) = 0$ for all n and each i, thus implying that $f_i = 0$ since
K has characteristic 0. We proceed by induction on $\sum_i \deg f_i$, the base case follows from the
Vandermonde determinant C.3.2. For the induction step, suppose $\deg f_1 \ge 1$ without loss of
generality. Consider the sequence
\[ v_n = u_{n+1} - \alpha_1 u_n = \sum_i (\alpha_i f_i(n + 1) - \alpha_1 f_i(n)) \alpha_i^n. \]

Since $\deg(\alpha_i f_i(X + 1) - \alpha_1 f_i) \le \deg f_i$ for i ≥ 1 and $\deg(\alpha_1 (f_1(X + 1) - f_1)) \le \deg f_1 - 1$, by
the inductive hypothesis we have $\alpha_i f_i(X + 1) - \alpha_1 f_i = 0$ for all i. This means that the $f_i$ are
constant, but we have already treated this case so we are done.


Remark C.4.1
This theorem is equivalent to the existence of a (unique) partial fraction decomposition of any
rational function, meaning that, given a rational function h = f /g with $g = \prod_{i=1}^{r} (X - \alpha_i)^{m_i}$ and
deg g > deg f , i.e. deg h < 0, there are polynomials $f_i$ of degree at most $m_i - 1$ such that
\[ h = \sum_{i=1}^{r} \frac{f_i}{(X - \alpha_i)^{m_i}}. \]

In fact, if we fix $g = \sum_{i=0}^{d} a_i X^i$, the rational functions with denominator g and negative degree
correspond exactly to the generating functions of linear recurrences with characteristic polynomial
$\hat{g} = X^d g(1/X) = \sum_{i=0}^{d} a_{d-i} X^i$. Indeed, suppose that $(u_n)_{n \ge 0}$ is such a sequence. Then, since
$u_{n+d} = -\frac{1}{a_0} \sum_{k=0}^{d-1} a_{d-k} u_{n+k}$, we have

\[ \begin{aligned} h &:= \sum_{n=0}^{\infty} u_n X^n \\ &= \left( \sum_{n=0}^{d-1} u_n X^n \right) + \sum_{n=0}^{\infty} u_{n+d} X^{n+d} \\ &= \left( \sum_{n=0}^{d-1} u_n X^n \right) + \sum_{n=0}^{\infty} -\frac{X^{n+d}}{a_0} \sum_{k=0}^{d-1} a_{d-k} u_{n+k} \\ &= \left( \sum_{n=0}^{d-1} u_n X^n \right) - \frac{1}{a_0} \sum_{k=0}^{d-1} a_{d-k} X^{d-k} \sum_{n=0}^{\infty} u_{n+k} X^{n+k} \\ &= \left( \sum_{n=0}^{d-1} u_n X^n + \sum_{k=0}^{d-1} \frac{a_{d-k}}{a_0} \sum_{n=0}^{k-1} u_n X^{d-k+n} \right) - \frac{1}{a_0} \sum_{k=0}^{d-1} a_{d-k} X^{d-k} \sum_{n=0}^{\infty} u_n X^n \\ &=: f + \frac{(a_0 - g) h}{a_0} \end{aligned} \]

so $h = f + \frac{(a_0 - g) h}{a_0}$, i.e. $h = \frac{f}{1 - (1 - g/a_0)} = \frac{a_0 f}{g}$ as wanted. Note that f can be any polynomial of
degree less than d since

\[ f = u_0 + (u_1 + u_0 a_1/a_0) X + (u_2 + u_1 a_1/a_0 + u_0 a_2/a_0) X^2 + \ldots \]

so we can pick the $u_i$ inductively to get any f (equivalently, the matrix of the coefficients of f
represented as linear combinations of the $u_i$ is upper-triangular with no zero coordinate on the
diagonal so is invertible).

On the other hand, since the roots of $\hat{g}$ are the $1/\alpha_i$, by our characterisation of linear recurrences,
there are polynomials $f_i$ of degree at most $m_i - 1$ such that $u_n = \sum_{i=1}^{r} f_i(n) (1/\alpha_i)^n$. Hence, we
also have

\[ \begin{aligned} h &= \sum_{n=0}^{\infty} X^n \sum_{i=1}^{r} f_i(n) (1/\alpha_i)^n \\ &= \sum_{i=1}^{r} \sum_{n=0}^{\infty} f_i(n) (X/\alpha_i)^n. \end{aligned} \]

Now, note that the $f_i$ are linear combinations of the polynomials (X + k)(X + k − 1) · . . . · (X + 1) so it suffices
to prove that the result holds for these polynomials. For this, simply note that differentiating k
times

\[ \sum_{n=0}^{\infty} (X/\alpha)^n = \frac{1}{1 - X/\alpha} = \frac{-\alpha}{X - \alpha} \]
gives, after multiplying by $\alpha^k$,

\[ \sum_{n=0}^{\infty} (n + k)(n + k - 1) \cdot \ldots \cdot (n + 1) (X/\alpha)^n = \frac{k!\, (-\alpha)^{k+1}}{(X - \alpha)^{k+1}} \]
as wanted.

In fact, we can also prove (quite directly) the existence of a partial fractions decomposition and
conclude Theorem C.4.1 from this.

Exercise C.4.1. Prove that any rational function h of negative degree has a partial fractions decomposition
and deduce another proof of Theorem C.4.1.

Remark C.4.2
This also works for fields of characteristic $p \neq 0$, but one needs the condition that no root has multiplicity $\geq p + 1$, otherwise $n \mapsto f_i(n)$ could be identically zero without $f_i$ being zero (e.g. $f_i = n^p - n$). For instance, in characteristic $2$, the equation $u_{n+4} = u_n$ has characteristic polynomial $(X - 1)^4$ but the space of solutions of the form $\sum_i f_i(n)\alpha_i^n$ has dimension $2$ while the space of solutions of $u_{n+4} = u_n$ has dimension $4$ so not all solutions have the wanted form.
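The dimension count in this example can be checked by hand, but a small computation makes it concrete. The following Python sketch (the helper `rank_mod_p` is our own illustrative name) works over $\mathbb{F}_2$, where the only root is $\alpha = 1$: it compares the rank of the sequences $n \mapsto n^j$ for $j = 0, \ldots, 3$ with the rank of four obviously $4$-periodic sequences, looking at the first $8$ terms only.

```python
def rank_mod_p(rows, p):
    """Rank of a matrix over F_p (p prime), by Gaussian elimination."""
    rows = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)          # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                c = rows[i][col]
                rows[i] = [(x - c * y) % p for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank

p = 2
# Sequences n -> n^j * 1^n for j = 0..3 (polynomials of degree < 4 times 1^n).
poly_sequences = [[(n ** j) % p for n in range(8)] for j in range(4)]
# Indicator sequences of the residue classes mod 4: all satisfy u_{n+4} = u_n.
periodic_sequences = [[1 if n % 4 == r else 0 for n in range(8)] for r in range(4)]

print(rank_mod_p(poly_sequences, p), rank_mod_p(periodic_sequences, p))
```

The polynomial sequences only span a space of dimension $2$ because $n^2 \equiv n^3 \equiv n \pmod{2}$.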

Exercise C.4.2. Prove that Theorem C.4.1 holds in a field K of characteristic p 6= 0 as long as the multiplicities
of the roots of the characteristic polynomial are at most p. In particular, for a fixed characteristic equation, it
holds for sufficiently large p.

As a corollary, we get the following result, which is not obvious at first sight.

Corollary C.4.1

The product and sum of two linear recurrences are also linear recurrences.

C.5 Exercises
Vector Spaces, Bases and Matrices
Exercise C.5.1† (Grassmann’s Formula). Let U be a vector space and V, W be two finite-dimensional
subspaces of U . Prove that
dim(V + W ) = dim V + dim W − dim(V ∩ W ).
202 APPENDIX C. LINEAR ALGEBRA

Exercise C.5.2 (Noether’s Lemma). Let U, V, W be finite-dimensional vector spaces, and let ϕ : U →
V, ψ : V → W be linear maps. Suppose that im ϕ ⊆ ker ψ. Prove that there exists a linear map τ such
that ψ = τ ◦ ϕ. Similarly, if im ϕ ⊆ im ψ, prove that there is a linear map τ such that ϕ = ψ ◦ τ .

Exercise C.5.3† . Given a vector space V of dimension n, we say a subspace H of V is a hyperplane


of V if it has dimension n − 1. Prove that H is a hyperplane of K n if and only if there are elements
a1 , . . . , an ∈ K not all zero such that

H = {(x1 , . . . , xn ) ∈ K n | a1 x1 + . . . + an xn = 0}.

Exercise C.5.4†. Let $M \in K^{m \times n}$ be a matrix. Prove that $M$ has rank $k$ (see footnote 3) if and only if there are invertible matrices $P \in K^{m \times m}$ and $Q \in K^{n \times n}$ such that $M = P J_k Q$ where
$$J_k = \begin{pmatrix} I_k & 0 \\ 0 & 0 \end{pmatrix}$$
is a block-diagonal matrix of rank $k$ (meaning that you consider the upper-right $0$ as an element of $K^{k \times (n-k)}$, the lower-left $0$ as an element of $K^{(m-k) \times k}$, and the lower-right $0$ as an element of $K^{(m-k) \times (n-k)}$). Deduce that $M, N \in K^{m \times n}$ have the same rank if and only if there exist invertible matrices $P \in K^{m \times m}$ and $Q \in K^{n \times n}$ such that $M = P N Q$.

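Exercise C.5.5 (Block Matrices). Let $M = \begin{pmatrix} A_{k \times m} & B_{k \times n} \\ C_{r \times m} & D_{r \times n} \end{pmatrix}$ and $N = \begin{pmatrix} U_{m \times s} & V_{m \times t} \\ W_{n \times s} & T_{n \times t} \end{pmatrix}$ be block-matrices (the indices indicate the dimensions). Prove that
$$MN = \begin{pmatrix} AU + BW & AV + BT \\ CU + DW & CV + DT \end{pmatrix}.$$
In other words, we can multiply block-matrices as we would multiply normal two-by-two matrices.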

Exercise C.5.6. Let E be a vector space with subspaces E1 , . . . , En . Suppose that E1 ∪ . . . ∪ En is


a vector space. Prove that one Ei contains all others.

Exercise C.5.7. Let z1 , . . . , zn+1 ∈ C be distinct complex numbers. Prove that ((X − z1 )n , . . . , (X −
zn+1 )n ) is a C-basis of the space of complex polynomials with degree at most n.

Determinants
Exercise C.5.8†. Let $a_0, \ldots, a_{n-1}$ be elements of $K$ and $\omega$ a primitive $n$th root of unity. Prove that the circulant determinant
$$\begin{vmatrix} a_0 & a_1 & \cdots & a_{n-1} \\ a_{n-1} & a_0 & \cdots & a_{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ a_1 & a_2 & \cdots & a_0 \end{vmatrix}$$
is equal to
$$f(1) f(\omega) f(\omega^2) \cdots f(\omega^{n-1})$$
where $f = a_0 + \ldots + a_{n-1} X^{n-1}$. Deduce that this determinant is congruent to $a_0 + \ldots + a_{p-1}$ modulo $p$ when $n = p$ is prime and $a_0, \ldots, a_{p-1}$ are integers.

Exercise C.5.9† (Cramer’s Rule). Consider the system of equations $MV = X$ where $M$ is an invertible $n \times n$ matrix and $V = (v_i)_{i \in [[1,n]]}$ and $X = (x_i)_{i \in [[1,n]]}$ are column vectors. Prove that, for any $k \in [[1, n]]$, $v_k$ is equal to $\det M_{k,X} / \det M$, where $M_{k,X}$ denotes the matrix $[M^1, \ldots, M^{k-1}, X, M^{k+1}, \ldots, M^n]$ obtained from $M$ by replacing the $k$th column by $X$.
3 Recall that the rank of M is the dimension of the image of the linear map V ↦ M V it induces, i.e. the maximum number of linearly independent columns, as the image is the space generated by the columns.

Exercise C.5.10† (Wronskian Determinant). Let $K$ be a characteristic $0$ field. Given $n$ formal Laurent series $f_1, \ldots, f_n \in K((X))$, i.e. formal power series divided by a power of $X$ (see footnote 4), we define the Wronskian determinant
$$W = \begin{vmatrix} f_1 & f_2 & \cdots & f_n \\ f_1' & f_2' & \cdots & f_n' \\ \vdots & \vdots & \ddots & \vdots \\ f_1^{(n-1)} & f_2^{(n-1)} & \cdots & f_n^{(n-1)} \end{vmatrix}.$$
Prove that $W = 0$ if and only if $f_1, \ldots, f_n$ are linearly dependent (over $K$).
Exercise C.5.11† . Let (un )n≥0 be a sequence of elements of a field K. Suppose that the (m + 1) ×
(m + 1) determinant det(un+i+j )i,j∈[[0,m]] is 0 for all sufficiently large n. Prove that there is some N
such that (un )n≥N is a linear recurrence of order at most m.
Exercise C.5.12† . Let f1 , . . . , fn : S → K be linearly independent functions, where S is any set and
K is a field. Prove that there exist n elements m1 , . . . , mn of S such that the tuples

(f1 (m1 ), . . . , fn (m1 )), . . . , (f1 (mn ), . . . , fn (mn ))

are linearly independent over K.


Exercise C.5.13. Suppose that $K$ is an infinite field and that $A \in K^{n \times n}$ is such that $\det(A + M) = \det M$ for any $M \in K^{n \times n}$. Prove that $A = 0$.
Exercise C.5.14. Let $a_1, \ldots, a_n$ be integers. Prove that
$$\prod_{i<j} \frac{a_i - a_j}{i - j}$$
is an integer by expressing it as the determinant of a matrix with integer coordinates.

Algebraic Combinatorics
Exercise C.5.15†. Let $A_1, \ldots, A_{n+1}$ be non-empty subsets of $[n]$. Prove that there exist disjoint subsets $I$ and $J$ of $[n + 1]$ such that
$$\bigcup_{i \in I} A_i = \bigcup_{j \in J} A_j.$$

Exercise C.5.16. Let p be a prime number and a1 , . . . , ap+1 real numbers. Suppose that, whenever
we remove one of the ai , we can divide the remaining ones into a certain amount of groups, depending
on i, each with the same arithmetic mean (and at least two groups). Prove that a1 = . . . = ap+1 .
Exercise C.5.17. We have n coins of unknown masses and a balance. We are allowed to place some
of the coins on one side of the balance and an equal number of coins on the other side. After thus
distributing the coins, the balance gives a comparison of the total mass of each side, either by indicating
that the two masses are equal or by indicating that a particular side is the more massive of the two.
Show that at least n − 1 such comparisons are required to determine whether all of the coins are of
equal mass.

Polynomials of Linear Maps and Matrices


Exercise C.5.18† (Characteristic Polynomial). Let $K$ be an algebraically closed field. Let $M \in K^{n \times n}$ be an $n \times n$ matrix. Define its characteristic polynomial as $\chi_M = \det(X I_n - M)$. Its roots (counted with multiplicity) are called the eigenvalues $\lambda_1, \ldots, \lambda_n \in K$ of $M$. Prove that $\det M$ is the product of the eigenvalues of $M$, and that $\operatorname{Tr} M$ is the sum of the eigenvalues. In addition, prove that $\lambda$ is an eigenvalue of $M$ if and only if there is a non-zero column vector $V$ such that $MV = \lambda V$ (in other words, $M$ acts like a homothety on $V$). Conclude that, if $f \in K[X]$ is a polynomial, the eigenvalues of $f(M)$ are $f(\lambda_i)$ (with multiplicity). (We are interpreting $1 \in K$ as $I_n$ for $f(M)$ here, i.e., if $f = X + 1$,
4 In other words, K((X)) = K[[X]][X −1 ]. A more conceptual way to view K((X)) is as the field of fractions of K[[X]].

In particular, rational functions are formal Laurent series.



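$f(M)$ is $M + I_n$.) In particular, the eigenvalues of $M + \alpha I_n$ are $\lambda_1 + \alpha, \ldots, \lambda_n + \alpha$, and the eigenvalues of $M^k$ are $\lambda_1^k, \ldots, \lambda_n^k$ (see footnote 5). Finally, show that if
$$M = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}$$
is block-triangular, then $\chi_M = \chi_A \chi_C$ (see footnote 6).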
Exercise C.5.19† (Cayley-Hamilton Theorem). Prove that, for any n × n matrix M , χM (M ) = 0
where χM is the characteristic polynomial of M and 0 = 0In . Conclude that, if every eigenvalue of M
is zero, M is nilpotent, i.e. M k = 0 for some k.7
Exercise C.5.20† (Kernel Lemma). Let V be a vector space over a field K, ϕ : V → V a linear map,
and f, g ∈ K[X] coprime polynomials. Prove that

ker((f g)(ϕ)) = ker(f (ϕ)) ⊕ ker(g(ϕ)).

(The action of K[X] on ϕ is defined by (X n )(ϕ) := ϕn .)


Exercise C.5.21† (Minimal Polynomial). Let $U$ be a finite dimensional vector space, and let $\varphi : U \to U$ be a linear map. Prove that, although the minimal polynomial $\pi_\varphi$ of $\varphi$ is not necessarily equal to its characteristic polynomial $\chi_\varphi$, they have the same irreducible factors (see footnote 8). In addition, show that if $V$ is a subspace of $U$ which is stable under $\varphi$, i.e. $\varphi(V) \subseteq V$, then $\chi_{\varphi|_V}$ divides $\chi_\varphi$, where $\varphi|_V : V \to V$ denotes the restriction of $\varphi$ to $V$. Finally, if $U = V \oplus W$ where $V$ and $W$ are stable under $\varphi$, prove that
$$\pi_\varphi = \operatorname{lcm}(\pi_{\varphi|_V}, \pi_{\varphi|_W})$$
and
$$\chi_\varphi = \chi_{\varphi|_V} \chi_{\varphi|_W}.$$

Exercise C.5.22† (Diagonalisation). We say an endomorphism (i.e. a linear map from a vector space to itself) $\varphi : V \to V$ of a finite-dimensional vector space $V$ is diagonalisable if there is a basis $(e_1, \ldots, e_n)$ of $V$ in which $\varphi$ is diagonal, i.e. there are $\lambda_1, \ldots, \lambda_n \in K$ such that $\varphi(e_i) = \lambda_i e_i$ for all $i$. Prove that $\varphi$ is diagonalisable if and only if its minimal polynomial $\pi_\varphi$ is squarefree (see footnote 9). If $K$ is algebraically closed and $\chi_\varphi = (X - \lambda_1)^{m_1} \cdots (X - \lambda_r)^{m_r}$ where $\lambda_1, \ldots, \lambda_r$ are distinct, then, in the decomposition
$$V = \bigoplus_{i=1}^{r} \ker((\varphi - \lambda_i \mathrm{id})^{m_i})$$
given by Exercise C.5.20† and Exercise C.5.19†, $\ker((\varphi - \lambda_i \mathrm{id})^{m_i}) = \ker(\varphi - \lambda_i \mathrm{id})$ for all $i$ if and only if $\varphi$ is diagonalisable.
Exercise C.5.23† (Ponctual Minimal Polynomial). Let ϕ : U → U be an endomorphism of a finite-
dimensional vector space U . We define the minimal polynomial of ϕ at some x ∈ U πϕ,x to be the
unique monic polynomial of smallest degree such that πϕ,x (ϕ)(x) = 0. Prove that, if U = V ⊕ W
where V and W are stable under ϕ (see Exercise C.5.21† ), then

πϕ,x+y = lcm(πϕ|V ,x , πϕ|W ,y )

for any (x, y) ∈ V × W . Deduce that there always exists an x ∈ U such that πϕ,x = πϕ .
5 One of the advantages of the characteristic polynomial is that we are able to use algebraic number theory, or more

generally polynomial theory, to deduce linear algebra results, since the eigenvalues say a lot about a matrix (if we combine
this with the Cayley-Hamilton theorem). See for instance Exercise C.5.28 and the third solution of Exercise C.5.30† .
6 Of course, all of this extends to endomorphisms (i.e. linear maps from V to itself) ϕ : V → V of a finite-dimensional

vector space.
7 Note that if, in the definition of $\chi_M$, we replace $\det$ by an arbitrary multilinear form in the coordinates of $M$ (such as the permanent $\operatorname{perm}(A) = \sum_{\sigma \in S_n} a_{1,\sigma(1)} \cdots a_{n,\sigma(n)}$), the result becomes false, so we cannot just say that "$\chi_M(M) = \det(M - M I_n) = \det 0 = 0$" (this "proof" is nonsense because the scalar $0$ is not the matrix $0$, but the point is that this intuition is fundamentally flawed).
8 Beware that, as K n×n is not a domain (meaning that AB = 0 implies A = 0 or B = 0) anymore, minimal

polynomials are not necessarily irreducible anymore.


9 This is already very useful since it gives us that whenever the eigenvalues of ϕ are distinct, ϕ is diagonalisable, which implies that diagonalisable matrices are dense (in Cn×n ). We can use this to give another proof of the Cayley-Hamilton theorem, see the solution to Exercise C.5.19† .

Exercise C.5.24† (Cyclic Endomorphisms). We say an endomorphism ϕ : V → V of a finite-


dimensional vector space V is cyclic if there is some x ∈ V for which (x, ϕ(x), . . . , ϕn−1 (x)) is a
basis of V , where n = dim V . Prove that ϕ is cyclic if and only if its minimal polynomial πϕ has
degree n, and that this happens if and only if πϕ = χϕ (you may assume the Cayley-Hamilton theorem
only for this last claim). Give a direct proof of the Cayley-Hamilton theorem when ϕ is cyclic, and
deduce a proof of the theorem in the general case by noting that, if x ∈ V , the restriction of ϕ to the
space generated by (ϕk (x))k≥0 is always cyclic.
Exercise C.5.25† . Prove that χAB = χBA for any A, B ∈ K n×n . Using Exercise C.5.4† , deduce that
if m ≥ n, then χAB = X m−n χBA for any A ∈ K m×n and B ∈ K n×m .
Exercise C.5.26†. Given an $n \times n$ matrix $A$ and a set $I \subseteq [n]$, we denote by $A[I]$ the $|I| \times |I|$ submatrix obtained by keeping only the rows and columns of $A$ indexed by elements of $I$. Prove that, for $0 \le k \le n$, the $X^k$ coefficient of $\chi_A$ is
$$(-1)^{n-k} \sum_{|I| = n-k} \det A[I].$$
In particular, the $X^{n-1}$ coefficient is minus the trace of $A$, and the constant coefficient is $(-1)^n$ times its determinant. Using Exercise C.5.37†, deduce that if $A$ has rank $r$, $0$ is a root of $\chi_A$ of multiplicity at least $n - r$.
Exercise C.5.27. Let $A \in \mathbb{C}^{n \times n}$ be a Hermitian matrix, i.e. $A = \overline{A}^{\mathsf{T}}$. Prove that all its eigenvalues are real.
Exercise C.5.28. Let M be a square matrix with integer coordinates and p a prime number. Prove
that Tr M p ≡ Tr M (mod p).

Miscellaneous
Exercise C.5.29† (Iterated Kernels). Let ϕ : V → V be an endomorphism of a finite-dimensional
vector space V . Prove that, for any k ≥ 0, we have

dim ker ϕk+1 = dim ker ϕk + dim ker ϕ ∩ im ϕk .

Deduce that the sequence (dim ker ϕk+1 − dim ker ϕk )k≥0 is weakly decreasing.
Exercise C.5.30† . Let p be a prime number, and G be a finite (multiplicative) group of n×n matrices
with integer coordinates. Prove that two distinct elements of G stay distinct modulo p. What if the
elements of G only have algebraic integer coordinates and p is an algebraic integer with all conjugates
greater than 2 in absolute value?
Exercise C.5.31† (USA TST 2019). For which integers n does there exist a function f : Z/nZ →
Z/nZ such that
f, f + id, f + 2id, . . . , f + mid
are all bijections?
Exercise C.5.32† (Finite Fields Kakeya Conjecture, Zeev Dvir). Let n ≥ 1 be an integer and F a finite
field. We say a set S ⊆ Fn is a Kakeya set if it contains a line in every direction, i.e., for every y ∈ Fn ,
there exists an x ∈ Fn such that S contains the line x + yF. Prove that any polynomial of degree less
than |F| vanishing on a Kakeya set must be zero. Deduce that there is a constant cn > 0 such that,
for any finite field F, any Kakeya set of Fn has cardinality at least cn |F|n .
Exercise C.5.33† (Siegel’s Lemma). Let $a = (a_{i,j})$ be an $m \times n$ matrix with integer coordinates. Prove that, if $n > m$, the system
$$\sum_{j=1}^{n} a_{i,j} x_j = 0$$
for $i = 1, \ldots, m$ always has a non-zero solution in integers with
$$\max_i |x_i| \le \left(n \max_{i,j} |a_{i,j}|\right)^{\frac{m}{n-m}}.$$

Exercise C.5.34. Define the trace of a matrix as the sum of its diagonal coefficients, and the trace of a linear map as the trace of any matrix representing it. Prove that this doesn’t depend on the basis chosen and is thus well-defined. In addition, let $L/K$ be a finite separable extension with embeddings $\sigma_1, \ldots, \sigma_n$ and let $\alpha \in L$. Prove that the trace of the linear map $x \mapsto x\alpha$ is
$$\sum_{i=1}^{n} \sigma_i(\alpha).$$
This function is called the trace $\operatorname{Tr}_{L/K}$ of $L/K$.

Exercise C.5.35† . How many invertible n × n matrices are there in Fp ? Deduce the number of
(additive) subgroups of cardinality pm that (Z/pZ)n has.
Exercise C.5.36†. Let $K$ be a field, and let $S \subseteq K^2$ be a set of $n$ points. Prove that there exists a non-zero polynomial $f \in K[X, Y]$ of degree at most $\sqrt{2n}$ such that $f(x, y) = 0$ for every $(x, y) \in S$.
Exercise C.5.37† . Given an m×n matrix M , we define its row rank as the maximal number of linearly independent rows of M . Similarly, its column rank is the maximal number of linearly independent columns of M . Prove that these two numbers are the same, called the rank of M and denoted rank M . Deduce that M has rank r if and only if all its minors of order r + 1 (i.e. the determinants of the (r + 1) × (r + 1) submatrices, obtained by removing a chosen set of m − (r + 1) rows and n − (r + 1) columns) vanish but some minor of order r does not.
Exercise C.5.38. Let A, B ∈ Rn×n . Prove that com(AB) = com(A) com(B).
Exercise C.5.39† (Nakayama’s Lemma). Let R be a commutative ring, I an ideal of R, and M a
finitely-generated R-module. Suppose that IM = M , where IM does not mean the set of products
of elements of I and M , but instead the R-module it generates (i.e. the set of linear combinations of
products). Prove that there exists an element r ≡ 1 (mod I) of R such that rM = 0.
Exercise C.5.40† (Homogeneous Linear Differential Equations).
a) Let $K$ be an algebraically closed field of characteristic $0$. Given elements $a_0, \ldots, a_n \in K$ with $a_0, a_n \neq 0$, solve the linear differential equation of order $n$
$$\sum_{i=0}^{n} a_i f^{(i)} = 0$$
over formal power series $f \in K[[X]]$.

b) Prove the Taylor formula with integral remainder: given an $n + 1$ times differentiable function $f : \mathbb{R} \to \mathbb{R}$ and a real number $a \in \mathbb{R}$, prove that for any $x \in \mathbb{R}$,
$$f(x) = \sum_{k=0}^{n} \frac{(x - a)^k}{k!} f^{(k)}(a) + \int_a^x \frac{(x - t)^n}{n!} f^{(n+1)}(t)\, dt.$$
Deduce, given $n + 1$ real numbers $a_0, \ldots, a_n \in \mathbb{R}$ with $a_n \neq 0$, the set of $n$ times differentiable functions $f : \mathbb{R} \to \mathbb{R}$ such that
$$\sum_{i=0}^{n} a_i f^{(i)} = 0.$$
Solutions

Chapter 1

Algebraic Numbers and Integers

1.1 Definition
Exercise 1.1.1. Is $\frac{i}{2}$ an algebraic integer?

Solution

Suppose that $f(i/2) = 0$ for some monic $f \in \mathbb{Z}[X]$. Note that the real part and imaginary part of $f(i/2)$ are both polynomials in $1/2$ with integer coefficients, and that one of them has leading coefficient $\pm 1$ (which one it is depends on the parity of $\deg f$). Thus $1/2$ would be an algebraic integer, contradicting Proposition 1.1.1.


Exercise 1.1.2 (Rational Root Theorem). Let f ∈ Z[X] be a polynomial. Suppose that u/v is a
rational root of f , written in irreducible form. Prove that u divides the constant coefficient of f and
v divides its leading coefficient. (This is a generalisation of Proposition 1.1.1.)

Solution
Let $f = \sum_{i=0}^{n} a_i X^i$. Multiplying $f(u/v) = 0$ by $v^n$, we have
$$\sum_{i=0}^{n} a_i u^i v^{n-i} = 0.$$
Modulo $v$, we get $a_n u^n \equiv 0$, i.e. $v \mid a_n$ since $u$ and $v$ are coprime by assumption. Similarly, modulo $u$ we get $a_0 v^n \equiv 0$, i.e. $u \mid a_0$.
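The theorem turns the search for rational roots into a finite check. As an illustration only, here is a short Python sketch; the names `divisors` and `rational_roots` are ours, and the sketch assumes integer coefficients with non-zero constant and leading coefficients. It lists the candidates $\pm u/v$ with $u \mid a_0$ and $v \mid a_n$ and keeps those which are actual roots.

```python
from fractions import Fraction
from math import gcd

def divisors(n):
    """Positive divisors of a non-zero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational roots of sum(coeffs[i] * X**i), for integer coefficients
    with coeffs[0] != 0 and coeffs[-1] != 0."""
    a0, an = coeffs[0], coeffs[-1]
    candidates = {Fraction(s * u, v)
                  for u in divisors(a0) for v in divisors(an)
                  for s in (1, -1) if gcd(u, v) == 1}
    return sorted(r for r in candidates
                  if sum(c * r ** i for i, c in enumerate(coeffs)) == 0)

# 2X^3 - 3X^2 - 3X + 2 = (X + 1)(2X - 1)(X - 2)
print(rational_roots([2, -3, -3, 2]))
```

For a monic polynomial the only candidates are integers, which is Proposition 1.1.1.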


1.2 Minimal Polynomial


Exercise 1.2.1∗ . Prove that the minimal polynomial of an algebraic number is irreducible and that
an irreducible polynomial is always the minimal polynomial of its roots.

Solution

Let α be an algebraic number. Assume, for the sake of contradiction, that πα = f g with 0 < deg f, deg g < deg πα . Then, one of f or g must vanish at α, a contradiction since they have smaller degree than πα .

Conversely, let π ∈ Q[X] be a monic irreducible polynomial and let α be one of its roots. By Proposition 1.2.1, πα | π. Since π is irreducible and both πα and π are monic, this must mean that they are equal.


Exercise 1.2.2. Prove that Y 4 − 3 is irreducible in Q[X].

Solution

The roots of $Y^4 - 3$ are $i^k \sqrt[4]{3}$ where $k = 0, \ldots, 3$. Note that none of these are rational so the only potential way to factorise $Y^4 - 3$ would be as a product of two degree $2$ polynomials, but the constant term of such a degree $2$ divisor would have the form $\pm i^k \sqrt{3}$ by Vieta’s formulas A.1.4 which is not rational. This can also be seen as a special case of the Eisenstein criterion 5.1.4.


Exercise 1.2.3∗ . Prove that any algebraic number of degree n has n distinct conjugates.

Solution

Suppose that α is an algebraic number of degree n with fewer than n distinct conjugates, i.e. its minimal polynomial π has a double root. Then gcd(π, π′) has degree at least 1 by Proposition A.1.3 and at most n − 1, and divides π. Thus π is not irreducible, contradicting Exercise 1.2.1∗ .


Exercise 1.2.4∗ . Prove that the conjugates of an algebraic integer are also algebraic integers.

Solution

Let α be an algebraic integer, i.e. a root of a monic polynomial f ∈ Z[X]. Then, f (β) = 0 for
any conjugate β of α by Proposition 1.2.1 so any conjugate β is also an algebraic integer.


Exercise 1.2.5. We call an algebraic number of degree 2 a quadratic number . Characterise quadratic
integers.

Solution

By Proposition 1.2.2, a quadratic integer $\alpha$ is an irrational root of a monic polynomial $f \in \mathbb{Z}[X]$ of degree $2$. Write $f = X^2 + uX + v$. Then,
$$\alpha = \frac{-u \pm \sqrt{u^2 - 4v}}{2}.$$
In particular, this has the form $\frac{a \pm \sqrt{b}}{2}$ for some $b \equiv 1 \pmod 4$ and odd $a$ if $u$ is odd, since the square of an odd rational integer is $1$ mod $4$, and the form $a \pm \sqrt{b}$ for $a, b \in \mathbb{Z}$ if $u$ is even. This is our wanted characterisation: the former is a root of $X^2 + uX + v$ where $u = -a$ and $u^2 - 4v = b$ which has a solution as $u^2 \equiv b \pmod 4$, while the latter is a root of $X^2 + uX + v$ where $u = -2a$ and $u^2 - 4v = 4b$ (again possible since $u^2 \equiv 4b \pmod 4$).


1.3 Symmetric Polynomials


Exercise 1.3.1. Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number with conjugates $\alpha_1, \ldots, \alpha_n$ and $f \in \mathbb{Q}[X_1, \ldots, X_n]$ be a symmetric polynomial. Show that $f(\alpha_1, \ldots, \alpha_n)$ is rational. Further, prove that if $\alpha$ is an algebraic integer and $f$ has integer coefficients, $f(\alpha_1, \ldots, \alpha_n)$ is in fact a rational integer.

Solution

Write f (X1 , . . . , Xn ) = g(e1 , . . . , en ) with g ∈ Z[X]. Then,

f (α1 , . . . , αn ) = g(e1 (α1 , . . . , αn ), . . . , en (α1 , . . . , αn ))

is a rational integer as ek (α1 , . . . , αn ) is ± the coefficient of X n−k of the minimal polynomial πα


of α by Vieta’s formulas A.1.4, and hence a rational integer.


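Exercise 1.3.2∗. Prove that $\overline{\mathbb{Z}}$ is closed under multiplication.

Solution

Let $m$ and $n$ be the degrees of two algebraic integers $\alpha$ and $\beta$, with conjugates $\alpha_1, \ldots, \alpha_m$ and $\beta_1, \ldots, \beta_n$ respectively. The polynomial
$$f = \prod_{i,j} (X - \alpha_i \beta_j) = \prod_i \alpha_i^n \prod_j (X/\alpha_i - \beta_j) = \prod_i \alpha_i^n \pi_\beta(X/\alpha_i)$$
is symmetric as a polynomial (over the ring $\mathbb{Z}[X]$) in $\alpha_1, \ldots, \alpha_m$ (note that $Y^n \pi_\beta(X/Y)$ is indeed a polynomial in $Y$) and hence takes value in
$$\mathbb{Z}[X][e_1(\alpha_1, \ldots, \alpha_m), \ldots, e_m(\alpha_1, \ldots, \alpha_m)] = \mathbb{Z}[X].$$
Since $f$ is monic and vanishes at $\alpha\beta$, the product $\alpha\beta$ is an algebraic integer.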

Exercise 1.3.3∗ . Prove Proposition 1.3.1.

Solution

Note that the assumption f ≡ g (mod m) implies that a ≡ b (mod m). By multiplying f and g
by the inverse of their leading coefficient modulo m, we may thus assume that they are monic.
Let s = h(e1 , . . . , en ) with h ∈ Z[X1 , . . . , Xn ] be a symmetric polynomial in Z[X1 , . . . , Xn ]. Then,
ek (α1 , . . . , αn ) ≡ ek (β1 , . . . , βn )
by Vieta’s formulas since f ≡ g (mod m). Finally, this implies that
s(α1 , . . . , αn ) = h(e1 (α1 , . . . , αn ), . . . , en (α1 , . . . , αn ))
≡ h(e1 (β1 , . . . , βn ), . . . , en (β1 , . . . , βn ))
= s(β1 , . . . , βn )
as wanted.


1.4 Worked Examples


Exercise 1.4.1∗. Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number. Prove that there exists a rational integer $N \neq 0$ such that $N\alpha$ is an algebraic integer.

Solution

Let α be an algebraic number with minimal polynomial f = X n + . . . + a1 X + a0 ∈ Q[X]. Let


N be the lcm of the denominators of an−1 , . . . , a0 . Then,

N n f (X/N ) = X n + N an−1 X n−1 + N 2 an−2 X n−2 + . . . + N n−1 a1 X + N n a0

has integer coefficients and is zero at N α, as wanted.
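The scaling trick above is easy to carry out explicitly. The following Python sketch is only an illustration (the name `clear_denominators` is ours, and it requires Python 3.9 or later for `math.lcm`): given the coefficients of a monic $f \in \mathbb{Q}[X]$, lowest degree first, it returns $N$ and the coefficients of $N^n f(X/N)$, whose $X^i$ coefficient is $N^{n-i} a_i$.

```python
from fractions import Fraction
from math import lcm

def clear_denominators(coeffs):
    """coeffs[i] is the X^i coefficient of a monic f in Q[X].
    Returns (N, coefficients of N^n f(X/N)), lowest degree first."""
    n = len(coeffs) - 1
    N = lcm(*(Fraction(c).denominator for c in coeffs))
    scaled = [Fraction(c) * N ** (n - i) for i, c in enumerate(coeffs)]
    return N, scaled

# f = X^2 - X/2 + 1/3 gives N = 6 and N^2 f(X/N) = X^2 - 3X + 12.
N, g = clear_denominators([Fraction(1, 3), Fraction(-1, 2), 1])
print(N, g)
```

If $\alpha$ is a root of $X^2 - X/2 + 1/3$, then $6\alpha$ is a root of $X^2 - 3X + 12$ and hence an algebraic integer.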




1.5 Exercises
Elementary-Looking Problems
Exercise 1.5.1†. Find all non-zero rational integers $a, b, c \in \mathbb{Z}$ such that $\frac{a}{b} + \frac{b}{c} + \frac{c}{a}$ and $\frac{b}{a} + \frac{c}{b} + \frac{a}{c}$ are integers (see footnote 1).

Solution

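Notice that the polynomial
$$\left(X - \frac{a}{b}\right)\left(X - \frac{b}{c}\right)\left(X - \frac{c}{a}\right) = X^3 - \left(\frac{a}{b} + \frac{b}{c} + \frac{c}{a}\right)X^2 + \left(\frac{b}{a} + \frac{c}{b} + \frac{a}{c}\right)X - 1$$
has integer coefficients by assumption. Thus, $\frac{a}{b}, \frac{b}{c}, \frac{c}{a}$ are rational algebraic integers, i.e. rational integers. Since their product is $1$, they must all be $\pm 1$, i.e. $|a| = |b| = |c|$. Conversely, these clearly work.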
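The conclusion $|a| = |b| = |c|$ can be confirmed by brute force on a small range. The Python sketch below is only a sanity check, not a proof; the bound `B = 6` and the helper name `both_sums_are_integers` are arbitrary choices of ours.

```python
from fractions import Fraction
from itertools import product

def both_sums_are_integers(a, b, c):
    """True when a/b + b/c + c/a and b/a + c/b + a/c are both integers."""
    s1 = Fraction(a, b) + Fraction(b, c) + Fraction(c, a)
    s2 = Fraction(b, a) + Fraction(c, b) + Fraction(a, c)
    return s1.denominator == 1 and s2.denominator == 1

B = 6
values = [v for v in range(-B, B + 1) if v != 0]
solutions = [t for t in product(values, repeat=3) if both_sums_are_integers(*t)]

# Every solution found has |a| = |b| = |c|.
print(all(abs(a) == abs(b) == abs(c) for a, b, c in solutions))
```

In this range there are exactly $6 \cdot 2^3 = 48$ solutions, one for each common absolute value and each choice of signs.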


Exercise 1.5.2† (USAMO 2009). Let $(a_n)_{n \ge 0}$ and $(b_n)_{n \ge 0}$ be two non-constant sequences of rational numbers such that $(a_i - a_j)(b_i - b_j) \in \mathbb{Z}$ for any $i, j$. Prove that there exists a non-zero rational number $r$ such that $r(a_i - a_j)$ and $\frac{b_i - b_j}{r}$ are integers for any $i, j$.

Solution

Without loss of generality, by translating the sequences, we may assume that a0 = b0 = 0. Thus,
setting j = 0, we have ai bi ∈ Z for all i. The condition (ai − aj )(bi − bj ) ∈ Z then reads

1 The point of this problem is to do it specifically with algebraic numbers. You can of course solve it elementarily,

but that will just amount to reproving one of the results we showed. The reason why this is an exercise is to enable the
reader to recognise certain situations which are immediately solved with algebraic numbers, without needing to redo all
the work each time.

ai bj + aj bi ∈ Z. We deduce that ai bj and aj bi are algebraic integers, since they are roots of

(X − ai bj )(X − aj bi ) = X 2 − X(ai bj + aj bi ) + ai bi aj bj ∈ Z[X].

Since they are rational, they must be rational integers, i.e. $a_i b_j \in \mathbb{Z}$ for every $i, j$. Choose $k$ such that $a_k \neq 0$; there exists one since $(a_n)_{n \ge 0}$ is non-constant. Let $d$ be the gcd of the numbers $a_k b_j$. Then, $r = a_k/d$ works, in the sense that $r b_j$ and $a_i/r$ are integers for all $i, j$ (so that $1/r$ is the number asked for in the statement). Indeed, we already have that $r b_j \in \mathbb{Z}$ for all $j$ so it remains to show that $a_i/r \in \mathbb{Z}$ for all $i$. By Bézout’s lemma, $d$ is a linear combination of some $a_k b_j$, so that $a_i/r = d a_i/a_k$ is a linear combination of some $a_i b_j$ and thus an integer as wanted.


Exercise 1.5.4† (Adapted from Irish Mathematical Olympiad 1998). Let x ∈ R be a real number
such that both x2 − x and xn − x for some n ≥ 3 are rational. Prove that x is rational.

Solution

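Let $a = x^2 - x$. Suppose for the sake of contradiction that $x$ is irrational, so that its minimal polynomial is $X^2 - X - a$. Let $y$ be its other conjugate. Then, $x^n - x = y^n - y$. Since the roots of $X^2 - X - a$ are $\frac{1 \pm \sqrt{4a+1}}{2}$, we get
$$\left(\frac{1}{2} + \delta\right)^n - \left(\frac{1}{2} + \delta\right) = \left(\frac{1}{2} - \delta\right)^n - \left(\frac{1}{2} - \delta\right),$$
where $\delta = \frac{\sqrt{4a+1}}{2}$. We shall prove that this is only possible for $\delta \in \{0, \pm 1/2\}$, which is a contradiction since $\delta$ is irrational by assumption. Since this equation is symmetric between $\delta$ and $-\delta$, it suffices to prove that it has no positive solution $\delta \neq \frac{1}{2}$. By dividing it by $\delta - 1/2$, we wish to show that
$$\frac{\left(\frac{1}{2} + \delta\right)^n - 1}{\delta - \frac{1}{2}} + \left(\frac{1}{2} - \delta\right)^{n-1} - 2 = \sum_{i=0}^{n-1} \left(\frac{1}{2} + \delta\right)^i + \left(\frac{1}{2} - \delta\right)^{n-1} - 2$$
is positive for positive $\delta$.

Suppose first that $\delta \le \frac{1}{2}$. Then, we have
$$\sum_{i=0}^{n-2} \left(\frac{1}{2} + \delta\right)^i > \sum_{i=0}^{n-2} \left(\frac{1}{2}\right)^i = 2 - \left(\frac{1}{2}\right)^{n-2}$$
and
$$\left(\frac{1}{2} + \delta\right)^{n-1} + \left(\frac{1}{2} - \delta\right)^{n-1} \ge \left(\frac{1}{2}\right)^{n-1} + \left(\frac{1}{2}\right)^{n-1} = \left(\frac{1}{2}\right)^{n-2}$$
by the power mean inequality.

Now, if $\delta \ge \frac{1}{2}$ the inequality is trivial: since $n \ge 3$ we have
$$\sum_{i=0}^{n-2} \left(\frac{1}{2} + \delta\right)^i \ge 1 + 1 = 2$$
and
$$\left(\frac{1}{2} + \delta\right)^{n-1} + \left(\frac{1}{2} - \delta\right)^{n-1} > 0.$$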
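The positivity of $\sum_{i=0}^{n-1} (1/2 + \delta)^i + (1/2 - \delta)^{n-1} - 2$ for $\delta > 0$ can be spot-checked with exact arithmetic. The sketch below (the function name `quotient` and the sampling grid are our own choices) also checks that multiplying this expression by $\delta - 1/2$ recovers the difference of the two sides of the original equation. It is a numerical illustration on a few values, not a substitute for the proof.

```python
from fractions import Fraction

def quotient(n, delta):
    """Value of sum_{i=0}^{n-1} (1/2 + delta)^i + (1/2 - delta)^(n-1) - 2."""
    p = Fraction(1, 2) + delta
    q = Fraction(1, 2) - delta
    return sum(p ** i for i in range(n)) + q ** (n - 1) - 2

def difference(n, delta):
    """Left-hand side minus right-hand side of the original equation."""
    p = Fraction(1, 2) + delta
    q = Fraction(1, 2) - delta
    return (p ** n - p) - (q ** n - q)

samples = [Fraction(t, 20) for t in range(1, 61)]   # delta in (0, 3]
print(all(quotient(n, d) > 0 for n in range(3, 9) for d in samples))
```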

Exercise 1.5.8†. Let $|x| < 1$ be a complex number. Define
$$S_n = \sum_{k=0}^{\infty} k^n x^k.$$
Suppose that there is an integer $N \ge 0$ such that $S_N, S_{N+1}, \ldots$ are all rational integers. Prove that $S_n$ is a rational integer for any integer $n \ge 0$.

Solution
We shall prove that $S_0 = \frac{1}{1-x}$ is a rational integer, and that this implies that $S_n$ is a rational integer for all $n$. By differentiating the equality
$$\sum_{k=0}^{\infty} x^k = \frac{1}{1-x}$$
$n$ times, we get
$$R_n := \sum_{k=0}^{\infty} (k + 1)(k + 2) \cdots (k + n) x^k = \frac{n!}{(1 - x)^{n+1}}.$$
Define $f_n(X) := (X + 1) \cdots (X + n)$. Since each $f_n$ is monic and has degree $n$, they form a $\mathbb{Z}$-basis of $\mathbb{Z}[X]$, meaning that any element of $\mathbb{Z}[X]$ can be represented as a linear combination $\sum_k a_k f_k$ for some $a_k \in \mathbb{Z}$. This is in particular the case for $X^n$, say $X^n = \sum_{k=0}^{n} a_{k,n} f_k$ so that
$$S_n = \sum_{k=0}^{n} a_{k,n} R_k.$$
This shows that, if $\frac{1}{1-x}$ is a rational integer, then so is $R_n$ for all $n$ and hence $S_n$ for all $n$.

Now, note that $S_N$ is a polynomial with integer coefficients in $\frac{1}{1-x}$, so $\frac{1}{1-x}$ and thus $x$ is algebraic. Now, write $f_n$ as a linear combination of the $X^k$, i.e. $f_n = \sum_{k=0}^{n} b_{k,n} X^k$ (this is just regular expansion). Let $p$ be a large rational prime which divides neither the numerator nor the denominator of the norm of $1 - x$, so that $1 - x$ is invertible modulo $p$ by Exercise 1.5.23†. Then,
$$R_n = \sum_{k=0}^{n} b_{k,n} S_k = \frac{n!}{(1 - x)^{n+1}}$$
is divisible by $p$ for $n \ge p$. Thus, we deduce that $\sum_{k=0}^{N} b_{k,n} S_k$ is congruent to a rational integer modulo $p$. The idea now is to take many $n$ so that the vectors $(b_{0,n}, \ldots, b_{N,n})$ are linearly independent using Exercise C.5.12†, i.e. so that they have a non-zero determinant. Then, for $p$ sufficiently large, the determinant is also non-zero modulo $p$. Since the inverse of a matrix with coordinates in $\mathbb{F}_p$ also has coordinates in $\mathbb{F}_p$, this implies that $S_0, \ldots, S_N$ are congruent to rational integers modulo $p$. By taking $p$ sufficiently large and using Exercise 1.5.25†, we deduce that $S_0, \ldots, S_N$ are rational numbers. Finally, we shall look at the $p$-adic valuation of $S_0 = \frac{1}{1-x}$ to prove that it’s an integer.

To use Exercise C.5.12†, we need to prove that $b_{0,n}, \ldots, b_{N,n}$ are linearly independent (as functions of $n$). For this, we shall prove that they all grow at different rates. Indeed, we have
$$b_{k,n} = n! \sum_{1 \le i_1 < \ldots < i_k \le n} \frac{1}{i_1 \cdots i_k} \sim \frac{n!}{k!} \left(\sum_{i=1}^{n} \frac{1}{i}\right)^k \sim \frac{n! \log(n)^k}{k!}$$
which shows that the assumptions are satisfied. As said before, we get that $S_0, \ldots, S_N$ are congruent to rational integers modulo $p$. Thus, for $p$ sufficiently large, they must all be rational.

Finally, suppose some prime $p$ divides the denominator of $\frac{1}{1-x}$. Since $b_{k,n}$ for a fixed $k$ eventually becomes divisible by anything, $\sum_{k=0}^{N} b_{k,n} S_k$ is a rational integer for sufficiently large $n$. However, it is congruent modulo $1$ to $\frac{n!}{(1-x)^{n+1}}$ which is not an integer since $v_p(n!) < n$ by Legendre’s formula 8.3.5; this is a contradiction.
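To make the change of basis between the powers $X^n$ and the polynomials $f_k = (X+1) \cdots (X+k)$ concrete, here is a small Python sketch. The function names are ours, and the sketch relies on the closed form $R_k = k!/(1-x)^{k+1}$ obtained by differentiating the geometric series $k$ times (with the convention $0^0 = 1$ in $S_0$). It expresses $X^n$ in the basis $(f_k)$ and evaluates $S_n = \sum_k a_{k,n} R_k$ exactly for a rational $x$.

```python
from fractions import Fraction
from math import factorial

def rising_basis(n):
    """Coefficient lists (lowest degree first) of f_0, ..., f_n,
    where f_k = (X + 1)(X + 2)...(X + k)."""
    basis = [[1]]
    for k in range(1, n + 1):
        prev = basis[-1]
        cur = [0] * (len(prev) + 1)
        for i, c in enumerate(prev):      # multiply prev by (X + k)
            cur[i] += k * c
            cur[i + 1] += c
        basis.append(cur)
    return basis

def power_in_basis(n):
    """Integers a_0, ..., a_n with X^n = sum_k a_k f_k."""
    basis = rising_basis(n)
    rest = [0] * n + [1]
    a = [0] * (n + 1)
    for k in range(n, -1, -1):
        a[k] = rest[k]                    # f_k is monic of degree k
        for i, c in enumerate(basis[k]):
            rest[i] -= a[k] * c
    return a

def S(n, x):
    """S_n = sum_k k^n x^k for |x| < 1, via R_k = k!/(1 - x)^(k + 1)."""
    y = 1 / (1 - Fraction(x))
    return sum(a_k * factorial(k) * y ** (k + 1)
               for k, a_k in enumerate(power_in_basis(n)))

print([S(n, Fraction(1, 2)) for n in range(4)])
```

For $x = 1/2$, where $\frac{1}{1-x} = 2$ is an integer, all the $S_n$ are indeed integers.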




Exercise 1.5.9† . Let n ≥ 3 be an integer. Suppose that there exists a regular n-gon with integer
coordinates. Prove that n = 4.

Solution

Let $A, B, C$ be three consecutive vertices. Since the sum of the angles of the $n$-gon is $(n - 2)\pi$, the angle $\angle ABC$ is $\pi \frac{n-2}{n} = \pi - \frac{2\pi}{n}$. By the cosine law, we have
$$\frac{1 + \cos\frac{4\pi}{n}}{2} = \cos^2\frac{2\pi}{n} = \cos^2\left(\pi - \frac{2\pi}{n}\right) = \frac{(AB^2 + BC^2 - AC^2)^2}{4 AB^2 \cdot BC^2}.$$
Since $A, B, C$ have integer coordinates, this is rational so $\cos\frac{4\pi}{n}$ is rational. Finally, using Problem 1.1.1, we must have $n \in \{1, 2, 4, 8\}$. It remains to prove that $n = 8$ is impossible.

We have $\angle BAC = \angle BCA = \frac{\pi}{n}$. By the sine law,
$$\frac{AB^2}{\sin^2\frac{\pi}{n}} = \frac{AC^2}{\sin^2\frac{2\pi}{n}}.$$
This is impossible since $AB^2$ and $AC^2$ are integers and $\sin^2\frac{2\pi}{n}$ is rational while $\sin^2\frac{\pi}{n}$ isn’t for $n = 8$ (we are using the identity $\sin(x)^2 = \frac{1 - \cos(2x)}{2}$).


Exercise 1.5.10† . Let P be a polygon with rational sidelengths for which there exists a real number
α ∈ R such that all its angles are rational multiples of α, except possibly one. Prove that cos α is
algebraic.

Solution

Note that $\cos\alpha$ being algebraic is equivalent to $\exp(i\alpha)$ being algebraic, since if $z$ is algebraic then so is $z + \overline{z} = 2\Re(z)$. Without loss of generality, we may assume that the angles which are rational multiples of $\alpha$ are in fact positive integer multiples of $\alpha$, by rescaling $\alpha$.

Let $X_k$ denote the $k$th vertex and let $n$ denote the number of vertices. Represent the polygon by complex numbers: $X_k$ is represented by $x_k \in \mathbb{C}$. We have
$$\frac{x_k - x_{k+1}}{x_k - x_{k-1}} = \exp(i\angle X_{k-1} X_k X_{k+1}) \frac{|x_k - x_{k+1}|}{|x_k - x_{k-1}|}.$$
Without loss of generality, we may assume that the angle which is potentially not an integer multiple of $\alpha$ is $\angle X_{n-1} X_n X_1$, so that $\angle X_{k-1} X_k X_{k+1}$ is $a_k \alpha$ for some $a_k \in \mathbb{N}$ for $k = 1, \ldots, n - 1$. Denote also $\frac{|x_k - x_{k+1}|}{|x_k - x_{k-1}|}$ by $r_k \in \mathbb{Q}$. Thus, the condition reads
$$\frac{x_k - x_{k+1}}{x_k - x_{k-1}} = r_k \exp(i a_k \alpha).$$
By rescaling the $x_i$, we may assume that $x_1 - x_n = 1$. Thus,
$$x_1 - x_2 = r_1 \exp(i a_1 \alpha) =: s_1 \exp(i b_1 \alpha).$$
This implies that
$$x_2 - x_3 = (x_2 - x_1) r_2 \exp(i a_2 \alpha) = -r_1 r_2 \exp(i(a_1 + a_2)\alpha) =: s_2 \exp(i b_2 \alpha).$$
Continuing like that, we get
$$x_k - x_{k+1} = -r_k s_{k-1} \exp(i(b_{k-1} + a_k)\alpha) =: s_k \exp(i b_k \alpha)$$
for some $s_k \in \mathbb{Q}$ and $b_k \in \mathbb{N}$. Finally, since
$$\sum_{k=1}^{n-1} s_k \exp(i b_k \alpha) = \sum_{k=1}^{n-1} (x_k - x_{k+1}) = x_1 - x_n = 1,$$
$\exp(i\alpha)$ is algebraic as wanted since it’s a root of
$$s_{n-1} X^{b_{n-1}} + s_{n-2} X^{b_{n-2}} + \ldots + s_1 X^{b_1} - 1.$$

Exercise 1.5.14† . Let ω1 , . . . , ωm be nth roots of unity. Prove that |ω1 + . . . + ωm | is either zero or
greater than m−n .

Solution

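Let $\omega = \exp\left(\frac{2i\pi}{n}\right)$ be a primitive $n$th root of unity so that each $\omega_i$ is a power of $\omega$, say $\omega^{k_i}$. We
shall multiply
\[ \omega_1 + \ldots + \omega_m = \omega^{k_1} + \ldots + \omega^{k_m} \]
by its conjugates, which are among the $\omega^{\ell k_1} + \ldots + \omega^{\ell k_m}$ by the fundamental theorem of symmetric
polynomials. Suppose that it is non-zero. Taking the product over its conjugates, we get that
\[ \prod_{\ell} \left(\omega^{\ell k_1} + \ldots + \omega^{\ell k_m}\right) \]
is a non-zero rational integer. Thus, it is at least one in absolute value. Since $|\omega^{\ell k_1} + \ldots + \omega^{\ell k_m}| \le m$
by the triangular inequality, and since there are at most $n - 1$ conjugates other than the sum itself, we finally get
\[ m^{n-1}|\omega_1 + \ldots + \omega_m| \ge 1, \]
i.e. $|\omega_1 + \ldots + \omega_m| \ge m^{1-n} \ge m^{-n}$ as wanted.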
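
The bound can be sanity-checked numerically for small parameters. The following is only a brute-force sketch (the function name `smallest_nonzero_sum` and the tolerance `tol` are illustrative choices, not part of the text): it enumerates all multisets of $m$ roots of unity of order $n$ and compares the smallest non-zero absolute value of their sum with $m^{1-n}$.

```python
import cmath
from itertools import combinations_with_replacement

def smallest_nonzero_sum(n, m, tol=1e-9):
    """Smallest non-zero |w_1 + ... + w_m| over all multisets of n-th roots of unity.

    Sums smaller than `tol` in absolute value are treated as zero (floating-point noise).
    """
    roots = [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]
    best = None
    for choice in combinations_with_replacement(roots, m):
        size = abs(sum(choice))
        if size > tol and (best is None or size < best):
            best = size
    return best

if __name__ == "__main__":
    for n in range(1, 7):
        for m in range(1, 5):
            # Compare the observed minimum with the bound m^(1-n) proved above.
            print(n, m, smallest_nonzero_sum(n, m), m ** (1 - n))
```

In these small cases the observed minimum is far above $m^{1-n}$, which is consistent with the remark that the bound is not sharp.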




Remark 1.5.1
In fact, since $\omega$ has only $\varphi(n)$ conjugates and not $n$ (see Chapter 3), we get the stronger bound
$|\omega_1 + \ldots + \omega_m| \ge m^{1-\varphi(n)}$.

Exercise 1.5.15†. Let $n \ge 1$ and $n_1, \ldots, n_k$ be integers. Prove that
\[ \left|\cos\left(\frac{2\pi n_1}{n}\right) + \ldots + \cos\left(\frac{2\pi n_k}{n}\right)\right| \]
is either zero or greater than $\frac{1}{2(2k)^{n/2}}$.

Solution

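We imitate our proof of Exercise 1.5.14†: since $2\cos\left(\frac{2\ell\pi}{n}\right) = \exp\left(\frac{2\ell i\pi}{n}\right) + \exp\left(-\frac{2\ell i\pi}{n}\right)$, the
fundamental theorem of symmetric polynomials shows that
\[ \prod_{\ell} \left(2\cos\left(\frac{2\ell n_1\pi}{n}\right) + \ldots + 2\cos\left(\frac{2\ell n_k\pi}{n}\right)\right) \]
is an integer, where the product is taken over the conjugates of $2\cos\left(\frac{2n_1\pi}{n}\right) + \ldots + 2\cos\left(\frac{2n_k\pi}{n}\right)$.
Suppose that it is non-zero, so that this product is non-zero too. Then, as
\[ \left|2\cos\left(\frac{2\ell n_1\pi}{n}\right) + \ldots + 2\cos\left(\frac{2\ell n_k\pi}{n}\right)\right| \le 2k \]
by the triangular inequality, we have
\[ (2k)^{n/2}\left|2\cos\left(\frac{2n_1\pi}{n}\right) + \ldots + 2\cos\left(\frac{2n_k\pi}{n}\right)\right| \ge \left|\prod_{\ell}\left(2\cos\left(\frac{2\ell n_1\pi}{n}\right) + \ldots + 2\cos\left(\frac{2\ell n_k\pi}{n}\right)\right)\right| \ge 1 \]
since $2\cos\left(\frac{2n_1\pi}{n}\right) + \ldots + 2\cos\left(\frac{2n_k\pi}{n}\right)$ has at most $\left\lceil\frac{n+1}{2}\right\rceil \le n/2 + 1$ conjugates (since $\cos x =
\cos(2\pi - x)$ so half the potential conjugates get discarded). Thus, we get
\[ \left|\cos\left(\frac{2\pi n_1}{n}\right) + \ldots + \cos\left(\frac{2\pi n_k}{n}\right)\right| \ge \frac{1}{2(2k)^{n/2}} \]
as wanted.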


Remark 1.5.2
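In fact, since $\cos\left(\frac{2\ell\pi}{n}\right)$ has only $\varphi(n)/2$ conjugates for $n > 2$ and not $n$ (see Chapter 3), we get
the stronger bound
\[ \left|\cos\left(\frac{2\pi n_1}{n}\right) + \ldots + \cos\left(\frac{2\pi n_k}{n}\right)\right| \ge \frac{k}{2(2k)^{\varphi(n)/2}} \]
for $n > 2$ (for $n \le 2$ we get the bound 1).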

Exercise 1.5.16† (USA TST 2014). Let $N$ be an integer. Prove that there exists a rational prime $p$
and an element $\alpha \in \mathbb{F}_p^\times$ such that the orbit $\{1, \alpha, \alpha^2, \ldots\}$ has cardinality at least $N$ and is sum-free,
meaning that $\alpha^i + \alpha^j \ne \alpha^k$ for any $i, j, k$. (You may assume that, for any $n$, there exist infinitely many
primes for which there is an element of order $n$ in $\mathbb{F}_p$. This will be proven in Chapter 3.)

Solution

Let us fix the order $n$ of $\alpha$ and suppose that any prime $p$ for which there is an element of order
$n$ fails. Let $\omega = \exp\left(\frac{2i\pi}{n}\right)$ be a primitive $n$th root of unity. By Proposition 1.3.1, we have
\[ \prod_{i,j,k} \left(\alpha^i + \alpha^j - \alpha^k\right) \equiv \prod_{i,j,k} \left(\omega^i + \omega^j - \omega^k\right). \]
Thus, if the LHS is divisible by infinitely many primes, the RHS must be zero. However, it is easy
to see that, when $\zeta$ is a root of unity, $\zeta + 1$ is also one if and only if $\zeta = \exp\left(\pm\frac{2i\pi}{3}\right)$. Indeed, if
$\zeta = \exp(i\theta)$, then, by looking at the imaginary part of $\zeta + 1$, we get $\zeta + 1 = \exp(i(\pi - \theta))$ (the other
possibility, $\zeta + 1 = \exp(i\theta) = \zeta$, is absurd). Then, by looking at the real parts we get
$\cos(\theta) + 1 = \cos(\pi - \theta) = -\cos(\theta)$ which gives $\cos(\theta) = -\frac{1}{2}$
as wanted. Thus, if $6 \nmid n$, $\omega^i + \omega^j = \omega^k$ is impossible since this implies $\omega^{i-j} + 1 = \omega^{k-j}$ so $\omega^{k-j}$
would be $\exp\left(\pm\frac{2i\pi}{6}\right)$, which has order 6. We are done: we just need to choose an $n \ge N$ which is not divisible by 3.
(As a bonus, if $6 \mid n$ then any $p$ fails since $\alpha^{n/3} + 1 = -\alpha^{2n/3} = \alpha^{n/6}$, while for $6 \nmid n$ we have
proven that all sufficiently large $p$ work.)
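
As an illustration, the sum-free condition can be tested directly for small primes. The sketch below uses two hypothetical helper names, `orbit` and `is_sum_free`, and tries $n = 5$ (which is not divisible by 6): the prime $p = 11$ fails, while $p = 41$ works, in line with the statement that all sufficiently large $p$ work.

```python
def orbit(alpha, p):
    """Return the set {1, alpha, alpha^2, ...} in F_p (alpha must be non-zero mod p)."""
    seen, x = set(), 1
    while x not in seen:
        seen.add(x)
        x = x * alpha % p
    return seen

def is_sum_free(subset, p):
    """True if a + b (mod p) never lies in the subset for a, b in the subset."""
    return all((a + b) % p not in subset for a in subset for b in subset)

if __name__ == "__main__":
    for alpha, p in [(3, 11), (10, 41), (3, 7)]:
        powers = orbit(alpha, p)
        print(p, alpha, sorted(powers), is_sum_free(powers, p))
```

The last pair, $\alpha = 3$ modulo $7$, has order 6 and is not sum-free, as predicted by the bonus remark.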


Properties of Algebraic Numbers


Exercise 1.5.19†. Let $\alpha \in \overline{\mathbb{Q}}$ be an algebraic number with conjugates $\alpha_1, \ldots, \alpha_n$ and let $f \in \mathbb{Q}[X]$
be a polynomial. Prove that the $m$ conjugates of $f(\alpha)$ are each represented exactly $\frac{n}{m}$ times among
$f(\alpha_1), \ldots, f(\alpha_n)$.

Solution

First, note that these are all conjugates since if $f(\alpha)$ is a root of $g$, then $\alpha$ is a root of $g \circ f$ so the
same goes for its conjugates. Second, note that there are no other conjugates since $\prod_{i=1}^{n} (X - f(\alpha_i))$
has rational coefficients by the fundamental theorem of symmetric polynomials. Finally, since
the roots of this polynomial are exactly the roots of $\pi_{f(\alpha)}$ (which is irreducible), it must be a
power of $\pi_{f(\alpha)}$, say $\pi_{f(\alpha)}^k$. Then, each $f(\alpha_i)$ is repeated $k$ times and $f(\alpha)$ has degree $\frac{n}{k}$ as wanted.


Exercise 1.5.20†. Let $\alpha_1, \ldots, \alpha_m \in \overline{\mathbb{Q}}$ be algebraic numbers and $f \in \mathbb{Q}[X_1, \ldots, X_m]$ a polynomial.
Denote the conjugates of $\alpha_k$ by $\alpha_k^{(1)}, \ldots, \alpha_k^{(n_k)}$. Prove that the conjugates of $f(\alpha_1, \ldots, \alpha_m)$ are among²
\[ \{f(\alpha_1^{(i_1)}, \ldots, \alpha_m^{(i_m)}) \mid i_k = 1, \ldots, n_k\}. \]

Solution

This is a consequence of the fundamental theorem of symmetric polynomials:
\[ \prod_{i_1,\ldots,i_m} \left(X - f(\alpha_1^{(i_1)}, \ldots, \alpha_m^{(i_m)})\right) \]
has rational coefficients by Exercise B.1.4.




Exercise 1.5.21†. Let $f \in \overline{\mathbb{Z}}[X]$ be a monic polynomial and $\alpha$ be one of its roots. Prove that $\alpha$ is
an algebraic integer.

Solution
Let $f = X^n + a_{n-1}X^{n-1} + \ldots + a_0 \in \overline{\mathbb{Z}}[X]$ be a polynomial and let $a_k^{(1)}, \ldots, a_k^{(m_k)}$ be the conjugates
of $a_k$. The fundamental theorem of symmetric polynomials then shows that the polynomial
\[ \prod_{i_0,\ldots,i_{n-1}} \left(X^n + a_{n-1}^{(i_{n-1})}X^{n-1} + \ldots + a_0^{(i_0)}\right) \]
has integer coefficients by Exercise B.1.4 (this is the same argument as Exercise 1.5.20†). Since
it is monic, its roots are algebraic integers, and since it is divisible by $f$, the same goes for the roots of $f$.

Alternatively, one can use Proposition C.3.5: $M = \mathbb{Z}[a_{n-1}, \ldots, a_0]$ is a finitely generated $\mathbb{Z}$-module
such that $\alpha M \subseteq M$ for any root $\alpha$ of $f$ (this is also Exercise C.3.8).

² This time, unlike Exercise 1.5.19†, these are usually not all conjugates. For instance, for $f = X + Y$, $\alpha = \sqrt{2}$ and
$\beta = 1 - \sqrt{2}$, the value $1 + 2\sqrt{2} = f(\sqrt{2}, 1 + \sqrt{2})$ is not a conjugate of $f(\alpha, \beta) = 1$.


Exercise 1.5.22†. We say an algebraic integer $\alpha \in \overline{\mathbb{Z}}$ is a unit if there exists an algebraic integer
$\alpha' \in \overline{\mathbb{Z}}$ such that $\alpha\alpha' = 1$. Characterise all units.

Solution

Let $\alpha \in \overline{\mathbb{Q}}$ be a non-zero algebraic number and let $f = X^n + a_{n-1}X^{n-1} + \ldots + a_0$ be its minimal
polynomial. Then, $1/\alpha$ is a root of $a_0X^n + a_1X^{n-1} + \ldots + 1$ which shows that its degree is at
most $n$. Applying the same argument to $1/\alpha$, we get that the degree of $\alpha$ is at most the degree of $1/\alpha$, which implies
that we have equality. Hence
\[ X^n + \frac{a_1}{a_0}X^{n-1} + \ldots + \frac{1}{a_0} \]
is the minimal polynomial of $1/\alpha$. In particular, for $\alpha \in \overline{\mathbb{Z}}$, $1/\alpha$ is an algebraic integer if and
only if $a_0 \mid 1$, i.e. $a_0 = \pm 1$. This is also equivalent to $|N(\alpha)| = 1$. An alternative solution is given
in Exercise 7.1.1∗: if $\alpha$ is invertible then so are its conjugates so the same goes for the product
of its conjugates, i.e. its norm. Hence, $|N(\alpha)| = 1$ if $\alpha$ is invertible. Conversely, if $N(\alpha) = \pm 1$,
then $\pm\alpha_2 \cdot \ldots \cdot \alpha_n$ is the inverse of $\alpha$, where $\alpha_2, \ldots, \alpha_n$ are its conjugates distinct from itself.


Exercise 1.5.23†. Let $m$ be a rational integer. We say an algebraic integer $\alpha \in \overline{\mathbb{Z}}$ is a unit mod $m$
if there exists an algebraic integer $\alpha' \in \overline{\mathbb{Z}}$ such that $\alpha\alpha' \equiv 1 \pmod m$. Characterise all units mod $m$.

Solution

Here we imitate the second solution of Exercise 1.5.22†. If $\alpha$ is invertible modulo $m$, then so are
its conjugates so the same goes for its norm. Conversely, if its norm is invertible modulo $m$, then
$(N(\alpha))^{-1}\alpha_2 \cdot \ldots \cdot \alpha_n$ is an inverse of $\alpha$ modulo $m$, where $(N(\alpha))^{-1}$ denotes an inverse of $N(\alpha)$ modulo $m$
and $\alpha_2, \ldots, \alpha_n$ are the conjugates of $\alpha$ distinct from itself.


Exercise 1.5.25†. Let $\alpha \in \overline{\mathbb{Z}}$ be a non-rational algebraic integer. Prove that there are only finitely
many rational integers $m$ such that $\alpha$ is congruent to a rational integer mod $m$.

Solution

Suppose that $\alpha \equiv k \pmod m$ for some $k \in \mathbb{Z}$. Then, all its conjugates are also congruent to $k$
modulo $m$ (by conjugating both sides) which implies
\[ \pi_\alpha \equiv (X - k)^n \pmod m \]
where $n \ge 2$ is the degree of $\alpha$. In particular, $\pi_\alpha$ has a double root modulo $m$, so $m$ divides
the discriminant of $\pi_\alpha$ by Remark 1.3.3. Since $\pi_\alpha$ has distinct roots by Exercise 1.2.3∗, the
discriminant is non-zero so there are indeed only finitely many such $m$.


Exercise 1.5.26† (Kronecker's Theorem). Let $\alpha \in \overline{\mathbb{Z}}$ be a non-zero algebraic integer such that all its
conjugates have absolute value at most 1. Prove that it is a root of unity.

Solution

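Let $\alpha_1, \ldots, \alpha_n$ be the conjugates of $\alpha$. By the fundamental theorem of symmetric polynomials,
\[ f_m := \prod_{i=1}^{n} (X - \alpha_i^m) \]
has integer coefficients for all positive integers $m$. Moreover, its coefficients are bounded: by the
triangular inequality,
\[ |e_k(\alpha_1^m, \ldots, \alpha_n^m)| = \left|\sum_{i_1 < \ldots < i_k} \alpha_{i_1}^m \cdot \ldots \cdot \alpha_{i_k}^m\right| \le \sum_{i_1 < \ldots < i_k} 1 = \binom{n}{k}. \]
Since there are only finitely many polynomials of degree $n$ with bounded integer coefficients, there exist
$r$ and $s \ge 1$ such that $f_{2^r} = f_{2^{r+s}}$. This means that raising the roots $\alpha_i^{2^r}$ of $f_{2^r}$ to
the $2^s$ power permutes them. If we iterate this permutation starting from $\alpha^{2^r}$, we will eventually
cycle back to $\alpha^{2^r}$, i.e.
\[ \alpha^{2^r} = \alpha^{2^{r+ks}} \]
for some $k > 0$. Since $\alpha \ne 0$, it must be a root of unity.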
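
The key point, that the integer polynomials $f_m$ can only take finitely many values, can be observed on a small example. The sketch below is restricted to degree 2 for simplicity, and the function name `power_polynomials` is purely illustrative. For $X^2 + aX + b$ with roots $r_1, r_2$ it computes $f_m = X^2 - (r_1^m + r_2^m)X + b^m$ using the linear recurrence satisfied by the power sums.

```python
def power_polynomials(a, b, count):
    """For X^2 + aX + b with roots r1, r2, return the pairs
    (coefficient of X, constant term) of f_m = (X - r1^m)(X - r2^m) for m = 1..count."""
    # The power sums p_m = r1^m + r2^m satisfy p_0 = 2, p_1 = -a and
    # p_m = -a * p_{m-1} - b * p_{m-2}.
    p_prev, p_cur = 2, -a
    result = []
    for m in range(1, count + 1):
        result.append((-p_cur, b ** m))
        p_prev, p_cur = p_cur, -a * p_cur - b * p_prev
    return result

if __name__ == "__main__":
    # Roots of X^2 + X + 1 are roots of unity: the f_m repeat.
    print(power_polynomials(1, 1, 6))
    # Roots of X^2 - X - 1 include the golden ratio (absolute value > 1): the f_m never repeat.
    print(power_polynomials(-1, -1, 6))
```

For $X^2 + X + 1$ one already has $f_1 = f_2$, so squaring permutes the two roots, as in the argument above.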


Exercise 1.5.27†. Determine all non-zero algebraic integers $\alpha \in \overline{\mathbb{Z}}$ such that all its conjugates are
real and have absolute value at most 2.

Solution

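Let $\alpha_1, \ldots, \alpha_n$ be the conjugates of $\alpha$. Notice that $\frac{\alpha_k + i\sqrt{4 - \alpha_k^2}}{2}$ has absolute value 1. Now,
\[ \left(X - \frac{\alpha_k + i\sqrt{4 - \alpha_k^2}}{2}\right)\left(X - \frac{\alpha_k - i\sqrt{4 - \alpha_k^2}}{2}\right) = X^2 - \alpha_kX + 1 \]
so that
\[ \prod_{k=1}^{n} \left(X - \frac{\alpha_k + i\sqrt{4 - \alpha_k^2}}{2}\right)\left(X - \frac{\alpha_k - i\sqrt{4 - \alpha_k^2}}{2}\right) \]
has integer coefficients. Since all its roots have absolute value 1, by Kronecker's theorem 1.5.26†, we
get that $\frac{\alpha + i\sqrt{4 - \alpha^2}}{2}$ is a root of unity. Since $\alpha/2$ is its real part, we get $\alpha = 2\cos\left(\frac{2k\pi}{m}\right)$ for some
$k, m$. Conversely, such $\alpha$ work since the conjugates of
\[ 2\cos\left(\frac{2k\pi}{m}\right) = \exp\left(\frac{2ki\pi}{m}\right) + \exp\left(-\frac{2ki\pi}{m}\right) \]
are among $2\cos\left(\frac{2\ell\pi}{m}\right)$ for $\ell \in \mathbb{Z}$ by the fundamental theorem of symmetric polynomials.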


Exercise 1.5.28†. Suppose that $\omega$ is a root of unity whose real part is an algebraic integer. Prove
that $\omega^4 = 1$.

Solution

Suppose that $\omega^2 \ne 1$. Then, by the triangular inequality,
\[ |\Re(\omega)| = \left|\frac{\omega + \omega^{-1}}{2}\right| < \frac{1 + 1}{2} = 1 \]
since the equality case would imply $\omega = \omega^{-1}$, i.e. $\omega^2 = 1$. Notice that, since the conjugates of
roots of unity are roots of unity, the conjugates of $\Re(\omega) = \frac{\omega + \omega^{-1}}{2}$ also all have absolute value at
most 1 by the triangular inequality, and are algebraic integers by assumption. Thus, the product
of $\Re(\omega)$ and its conjugates has absolute value strictly less than 1. Since it's a rational integer it must
hence be zero, which implies $\Re(\omega) = 0$, i.e. $\omega = \pm i$ which satisfies $\omega^4 = 1$ as wanted.


Exercise 1.5.29†. Let $\omega_1, \ldots, \omega_n$ be roots of unity. Suppose that $\frac{1}{n}(\omega_1 + \ldots + \omega_n)$ is a non-zero
algebraic integer. Prove that $\omega_1 = \ldots = \omega_n$.³

Solution

Notice that, by the triangular inequality, if $\omega_1, \ldots, \omega_n$ are not all equal, then $\frac{1}{n}(\omega_1 + \ldots + \omega_n)$
has absolute value strictly less than 1. Since its conjugates all have absolute value at most 1 by
the triangular inequality, the product of $\frac{1}{n}(\omega_1 + \ldots + \omega_n)$ with its conjugates has absolute value
strictly less than 1. Since it's a rational integer, it must be zero. This implies $\frac{1}{n}(\omega_1 + \ldots + \omega_n) = 0$,
contradicting our initial assumption.


Exercise 1.5.30†. Let $\alpha \in \overline{\mathbb{Z}}$ be an algebraic integer and let $p$ be a rational prime. Must it follow
that $\alpha^n \equiv 0 \pmod p$ or $\alpha^n \equiv 1 \pmod p$ for some $n \in \mathbb{N}$?⁴

Solution

The answer is no. As a counterexample, we can try to find an $\alpha$ such that the sequence $(\alpha^n)_{n \ge 1}$
is constant modulo $p$ and not congruent to 0 or 1 modulo $p$. The simplest possible $p$ is $p = 2$, so let's
pick $p = 2$. Then, one of the simplest ways to achieve $\alpha^2 \equiv \alpha \pmod 2$ and $\alpha \not\equiv 0, 1 \pmod 2$ is
to choose $\alpha$ such that $\alpha^2 = \alpha - 2$. Such an $\alpha$ must clearly be irrational. By Exercise 1.5.23†, $\alpha$
is not invertible modulo 2, but it is also non-zero since $2 = \alpha\beta$ is not divisible by $2 \cdot 2$, where $\beta$ is
the other conjugate of $\alpha$. Indeed, if 2 divided $\alpha$, then it would also divide $\beta$: $\alpha/2$ is an algebraic
integer so its conjugate $\beta/2$ is as well by Exercise 1.2.4∗. We can also show directly that $\alpha \not\equiv 1
\pmod 2$ without invoking Exercise 1.5.23†: if $\frac{\alpha - 1}{2}$ were an algebraic integer, then so would $\frac{\beta - 1}{2}$, so their
product
\[ \frac{\alpha - 1}{2} \cdot \frac{\beta - 1}{2} = \frac{\alpha\beta - (\alpha + \beta) + 1}{4} = \frac{2 - 1 + 1}{4} = \frac{1}{2} \]
would be one too, which is not the case.
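
Both counterexamples of this solution (the one above and the Gaussian one described in the next paragraph) can be checked by direct computation. The sketch below represents an element $a + b\theta$ of $\mathbb{Z}[\theta]$, where $\theta^2 = u\theta + v$, by the pair `(a, b)` reduced modulo $p$; the helper names `multiply` and `powers` are illustrative. It assumes, as is the case in both examples, that an element of $\mathbb{Z}[\theta]$ is divisible by $p$ exactly when both coordinates are.

```python
def multiply(x, y, u, v, p):
    """Product of x = a + b*theta and y = c + d*theta modulo p, where theta^2 = u*theta + v."""
    a, b = x
    c, d = y
    return ((a * c + b * d * v) % p, (a * d + b * c + b * d * u) % p)

def powers(x, u, v, p, count):
    """Return [x, x^2, ..., x^count] modulo p as coordinate pairs."""
    result, cur = [], (x[0] % p, x[1] % p)
    for _ in range(count):
        result.append(cur)
        cur = multiply(cur, x, u, v, p)
    return result

if __name__ == "__main__":
    # First example: alpha^2 = alpha - 2 (u = 1, v = -2), p = 2, alpha = (0, 1).
    print(powers((0, 1), 1, -2, 2, 5))
    # Gaussian example with p = 5 = (2 + i)(2 - i): alpha = 3 + 4i is 0 mod 2 + i and 1 mod 2 - i.
    print(powers((3, 4), 0, -1, 5, 5))
```

In both cases the list of powers is constant and different from $(0, 0)$ and $(1, 0)$, i.e. no power is congruent to 0 or 1.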

A more elaborate example, which also sheds a lot of light on the situation, is the following.
Consider a prime $p \equiv 1 \pmod 4$ and factorise it as $\pi\overline{\pi}$ in the Gaussian integers $\mathbb{Z}[i]$. By Theorem 2.3.1,
$\pi$ and $\overline{\pi}$ aren't associates so, with the help of the Chinese remainder theorem (in $\mathbb{Z}[i]$),
we can pick an $\alpha \in \mathbb{Z}[i]$ congruent to 0 modulo $\pi$ and to 1 modulo $\overline{\pi}$. Then, the powers of $\alpha$ are
clearly congruent to $\alpha$ modulo $\pi$ and $\overline{\pi}$, so modulo $p = \pi\overline{\pi}$. However, $\alpha$ is congruent to neither
0 nor 1 modulo $p$. In fact, this is the only way such a counterexample happens, but we need to
replace the factorisation in prime elements (which doesn't always hold) by the factorisation in
prime ideals (which always holds).

³ In fact, any algebraic integer that can be written as a linear combination of roots of unity with rational coefficients
can also be written as a linear combination of roots of unity with integer coefficients. However, this is a difficult result
to prove (see Exercise 3.5.26† for a special case).
⁴ In Chapter 4, we prove that the answer is positive for sufficiently large $p$.


Exercise 1.5.31† (Lindemann-Weierstrass).

a) Given a polynomial $f \in \mathbb{C}[X]$, we write $I(\cdot, f)$ for the function $t \mapsto \int_0^t e^{t-u}f(u)\,du$. Prove that
\[ I(t, f) = e^t\sum_{k=0}^{m} f^{(k)}(0) - \sum_{k=0}^{m} f^{(k)}(t) \]
where $m = \deg f$.
b) By looking at $\sum_{k=0}^{n} a_kI(k, f)$ with $f = X^{p-1}(X - 1)^p \cdot \ldots \cdot (X - n)^p$ for some prime number $p$
and some rational integers $a_0, \ldots, a_n$, prove Hermite's theorem: $e$ is transcendental.
c) Prove the Lindemann-Weierstrass theorem: if $\alpha_1, \ldots, \alpha_n \in \overline{\mathbb{Q}}$ are distinct algebraic numbers,
$e^{\alpha_1}, \ldots, e^{\alpha_n}$ are linearly independent (over $\overline{\mathbb{Q}}$)⁵. Deduce that $\pi$ is transcendental.

Solution

By integrating by parts, we get
\[ \begin{aligned} I(t, f) &= \int_0^t e^{t-u}f(u)\,du \\ &= \left[-e^{t-u}f(u)\right]_0^t - \int_0^t -e^{t-u}f'(u)\,du \\ &= e^tf(0) - f(t) + I(t, f') \end{aligned} \]
which proves part a by immediate induction. For part b, suppose for the sake of contradiction
that $e$ is algebraic, i.e. $\sum_{i=0}^{n} a_ie^i = 0$ for some $a_0, \ldots, a_n \in \mathbb{Z}$ and $a_0 \ne 0$ since $e \ne 0$. Then,
\[ \begin{aligned} S &:= \sum_{k=0}^{n} a_kI(k, f) \\ &= \sum_{k=0}^{n} a_ke^k\sum_{i=0}^{m} f^{(i)}(0) - \sum_{k=0}^{n} a_k\sum_{i=0}^{m} f^{(i)}(k) \\ &= -\sum_{k,i} a_kf^{(i)}(k). \end{aligned} \]
Now that we managed to get rid of $e$, we are left with an integer. The point will be to make this
integer $S$ be very large with some divisibility condition, and then show that it was in fact not that
large by returning to the integral definition. To make $S$ large, we can make the first derivatives
of $f$ vanish at $0, 1, \ldots, n$ since the coefficients of $f^{(i)}$ are divisible by $i!$ (see Proposition 5.3.1). This
suggests taking $f = X^\ell(X - 1)^\ell \cdot \ldots \cdot (X - n)^\ell$ as we then have $\ell! \mid S$. However, we run into
a problem: what if $S = 0$? To avoid $S$ being 0, we reduce the multiplicity of one $k$ by one to have
$S \equiv -a_kf^{(\ell-1)}(k) \pmod{\ell!}$, which is hopefully non-zero. One way to guarantee that this is
non-zero is to choose $\ell = p$ to be prime since we know very well when derivatives vanish modulo $p$ as
$\mathbb{F}_p$ is a field. This gives us the polynomial
\[ f = X^{p-1}(X - 1)^p \cdot \ldots \cdot (X - n)^p. \]
With this polynomial, we have $S \equiv -a_0f^{(p-1)}(0) \pmod p$, which is zero if and only if $p \mid a_0$ or
the multiplicity of 0 as a root of $f$ modulo $p$ is at least $p$. The former is easily avoided by taking $p > |a_0|$ since
$a_0 \ne 0$, and the latter by taking $p > n$ since the multiplicity of 0 is then $p - 1 < p$.

⁵ Equivalently, if $\alpha_1, \ldots, \alpha_n \in \overline{\mathbb{Q}}$ are linearly independent (over $\mathbb{Q}$), $e^{\alpha_1}, \ldots, e^{\alpha_n}$ are algebraically independent.

Hence, we get that $S$ is a non-zero integer (because it is non-zero modulo $p$) divisible by $(p - 1)!$,
so is at least $(p - 1)!$ in absolute value. However,
\[ \begin{aligned} |I(k, f)| &\le \int_0^k \left|e^{k-u}f(u)\right|\,du \\ &\le \int_0^k C^p\,du \\ &= kC^p \\ &\le nC^p \end{aligned} \]
where $C$ can be chosen as $e^n(n + 1)!$ since $e^{k-u} \le e^n$ and $|u(u - 1) \cdot \ldots \cdot (u - n)| \le (n + 1)!$ for
$k, u \in [0, n]$. Thus,
\[ |S| = \left|\sum_{k=0}^{n} a_kI(k, f)\right| \le AC^p \]
for some constants $A, C$ which do not depend on $p$. For sufficiently large $p$, this contradicts the
fact that $|S| \ge (p - 1)! \ge (p/2)^{p/2}$.
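
The identity of part a, on which the whole argument rests, is easy to test numerically. The sketch below is only a sanity check for real $t$ and real coefficients: it compares a composite Simpson approximation of $I(t, f)$ with the closed form $e^t\sum_k f^{(k)}(0) - \sum_k f^{(k)}(t)$. The function names and the number of integration steps are illustrative choices.

```python
import math

def derivative(coeffs):
    """Derivative of the polynomial c0 + c1*X + c2*X^2 + ... given as [c0, c1, c2, ...]."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x with Horner's scheme."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * x + c
    return value

def integral_I(t, coeffs, steps=2000):
    """Composite Simpson approximation of I(t, f) = int_0^t e^(t-u) f(u) du (steps must be even)."""
    def g(u):
        return math.exp(t - u) * evaluate(coeffs, u)
    h = t / steps
    total = g(0.0) + g(t)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * g(j * h)
    return total * h / 3

def closed_form_I(t, coeffs):
    """e^t * sum_k f^(k)(0) - sum_k f^(k)(t), summed over all non-zero derivatives of f."""
    at_zero = at_t = 0.0
    d = list(coeffs)
    while d:
        at_zero += evaluate(d, 0.0)
        at_t += evaluate(d, t)
        d = derivative(d)
    return math.exp(t) * at_zero - at_t

if __name__ == "__main__":
    f = [1, 2, 3]  # f = 1 + 2X + 3X^2
    print(integral_I(1.5, f), closed_form_I(1.5, f))
```

The two printed values should agree up to the accuracy of the numerical integration.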

The general case of the Lindemann-Weierstrass theorem is actually almost identical to our proof of
Hermite's theorem. Suppose that $\alpha_1, \ldots, \alpha_n \in \overline{\mathbb{Q}}$ are such that $\sum_{k=0}^{n} a_ke^{\alpha_k} = 0$ for $a_1, \ldots, a_k \in \mathbb{Z}$.
By replacing $\alpha_1, \ldots, \alpha_n$ with $\alpha_1 + r, \ldots, \alpha_n + r$ for a well chosen $r \in \mathbb{Z}$, we may assume that they
are non-zero.

As before, we consider
\[ S = \sum_{k=0}^{n} a_kI(\alpha_k, f) \]
where
\[ f = X^{p-1}(X - \alpha_1)^p \cdot \ldots \cdot (X - \alpha_n)^p. \]
(We could have also chosen $f = (X - \alpha_1)^{p-1}(X - \alpha_2)^p \cdot \ldots \cdot (X - \alpha_n)^p$ but that makes it harder to
prove that $S$ is non-zero: we would need the theory of finite fields.) If $N$ is such that $N\alpha_1, \ldots, N\alpha_n \in \overline{\mathbb{Z}}$,
then $N^{np}S$ is an algebraic integer since $N^{np}f^{(i)}(\alpha_k) \in \overline{\mathbb{Z}}$ and
\[ S = -\sum_{k,i} a_kf^{(i)}(\alpha_k). \]

Thus, we get that $N^{np}S$ is divisible by $(p - 1)!$ and is non-zero for sufficiently large $p$ as
\[ N^{np}S \equiv -a_1N^{np}f^{(p-1)}(0) \equiv -a_1(p - 1)!(-1)^n(N\alpha_1 \cdot \ldots \cdot N\alpha_n)^p \pmod p. \]
Indeed, $N\alpha_1 \cdot \ldots \cdot N\alpha_n$ is divisible by only a finite number of rational primes since it is non-zero
so any of its divisors divides its norm, i.e. the product with its conjugates. The fact that $N^{np}S$
is non-zero and is divisible by $(p - 1)!$ is however not enough alone to conclude that $S$ is large.
Therefore, we instead consider its norm: let $S_1, \ldots, S_m$ be its conjugates. Then,
\[ \prod_{i=1}^{m} (N^{np}S_i) \]
is a non-zero integer divisible by $(p - 1)!$, so is at least $(p - 1)!$ in absolute value. However, the
conjugates $S_i$ are among
\[ \sum_{k=1}^{n} a_kI(\alpha_k', f) \]

where $\alpha_k'$ is a conjugate of $\alpha_k$ by the fundamental theorem of symmetric polynomials (see
Exercise 1.5.20†). Thus, as before (this actually needs a little work, see the remark after the solution),
we deduce that there is a constant $C$ such that $|S_i| \le C^p$ for all $i$. Finally, we get
\[ (C^mN^{mn})^p \ge \prod_{i=1}^{m} |N^{np}S_i| \ge (p - 1)! \]
which is again a contradiction for large enough $p$ ($m$ may depend on $p$ but what matters is that
it's bounded by the fundamental theorem of symmetric polynomials).

As a consequence, $e^{\alpha_1}, \ldots, e^{\alpha_n}$ are algebraically independent if and only if linear combinations
of $\alpha_1, \ldots, \alpha_n$ with integer coefficients are distinct, i.e. if and only if $\alpha_1, \ldots, \alpha_n$ are linearly
independent. In particular, if $\pi$ were algebraic, $i\pi$ would be as well, and thus $e^{i\pi} = -1$ would be
transcendental (this is what it means for one number to be algebraically independent) which is
of course impossible.


Remark 1.5.3
Technically, the bounding step needs some justification for why it works. Indeed, it is not clear
that we still have the triangular inequality
\[ \left|\int_a^b f(x)\,dx\right| \le |b - a|M \]
when $|f| \le M$ on the segment $[a, b]$ when $a$ and $b$ are complex numbers. First, we define the
segment $[a, b]$ as the set of complex numbers of the form $\lambda a + (1 - \lambda)b$ with $\lambda \in [0, 1] \subseteq \mathbb{R}$. When
$a$ and $b$ are real, this corresponds to the usual $[a, b]$ if $a \le b$, and to $[b, a]$ otherwise.

We now turn to the more interesting question: how do we define integrals with complex bounds?
The answer is that we define the integral over a smooth arc $\gamma$ (i.e. a differentiable map $\gamma : [0, 1] \to
\mathbb{C}$) by
\[ \int_\gamma f(z)\,dz := \int_0^1 f(\gamma(t))\gamma'(t)\,dt. \]
It can be checked that this is independent of the parametrisation chosen in the sense that, if $\gamma$ is
replaced by $\gamma \circ \varphi$ where $\varphi : [0, 1] \to [0, 1]$ is an increasing differentiable and bijective function (i.e.
an increasing diffeomorphism), we have $\int_\gamma f = \int_{\gamma \circ \varphi} f$. This amounts to the change of variable
theorem. Note that we have the following triangular inequality: if $|f| \le M$ on $\gamma([0, 1])$,
\[ \left|\int_\gamma f\right| = \left|\int_0^1 f(\gamma(t))\gamma'(t)\,dt\right| \le M\int_0^1 |\gamma'(t)|\,dt \]
and the second factor is the length of $\gamma$. If we finally define $\int_a^b f(z)\,dz$ for complex $a, b$ as the
integral of $f$ over the arc $\gamma : t \mapsto (1 - t)a + tb$, we get the wanted inequality. Moreover, this
integral has all the properties we desire, in the sense that if $F$ is an anti-derivative of $f$, we have
$\int_\gamma f = F(\gamma(1)) - F(\gamma(0))$.
Chapter 2

Quadratic Integers

Exercise 2.0.1. Why is the "naive" approach of factorising the equation as $x^2 = (y - 1)(y^2 + y + 1)$
difficult to conclude with? Why does our solution not work as well for the equation $x^2 - 1 = y^3$?

Solution

One reason is that both of these approaches transform one diophantine equation into two simultaneous
diophantine equations. On the other hand, since $i$ and $-i$ are conjugate, $(x + i)(x - i) = y^3$
gives us only one equation, since, from $x + i = (a + bi)^3$, we also get $x - i = (a - bi)^3$ by
conjugating.


2.1 General Definitions


Exercise 2.1.1∗ . Prove that Z + αZ is a ring for any quadratic integer α. This amounts to checking
that it is closed under addition, subtraction, and multiplication. What happens if α is a quadratic
number which is not an integer?

Solution

Suppose that $\alpha^2 - u\alpha - v = 0$, i.e. $\alpha^2 = u\alpha + v$ where $u, v \in \mathbb{Z}$. We have
\[ (a + b\alpha) \pm (c + d\alpha) = (a \pm c) + (b \pm d)\alpha \]
and
\[ \begin{aligned} (a + b\alpha)(c + d\alpha) &= ac + (ad + bc)\alpha + bd\alpha^2 \\ &= ac + (ad + bc)\alpha + bd(u\alpha + v) \\ &= (ac + bdv) + (ad + bc + bdu)\alpha. \end{aligned} \]
If $\alpha$ is not a quadratic integer, then $\alpha^2$ is not a linear combination of 1 and $\alpha$ with rational
integer coefficients, so $\mathbb{Z}[\alpha]$, defined as $\mathbb{Z} + \alpha\mathbb{Z}$, would not be closed under multiplication (and
thus not a ring).


Exercise 2.1.2∗. Prove that $\mathbb{Q} + \alpha\mathbb{Q}$ is a field for any quadratic integer $\alpha$. This amounts to checking
that it is closed under addition, subtraction, multiplication, and division.

Solution

The same proof as the one of Exercise 2.1.1∗ shows that it is a ring, so it remains to prove
that every non-zero element has an inverse. If we multiply $a + b\alpha$ by $a + b\overline{\alpha}$ where $\overline{\alpha}$ is the other
conjugate of $\alpha$, we get a rational number $c$ by the fundamental theorem of symmetric polynomials.
Moreover, this number is zero only if one of $a + b\alpha$ and $a + b\overline{\alpha}$ is zero, which implies that the
other is too since they are conjugate. Indeed, if $f(a + bX)$ has a root at $\alpha$ it has one at $\overline{\alpha}$ too.

Hence, when $a + b\alpha$ is non-zero, it has an inverse
\[ \frac{a + b\overline{\alpha}}{c}. \]
To conclude, note that, if $\alpha^2 + u\alpha + v = 0$, then $\overline{\alpha} = -u - \alpha$ by Vieta's formulas so $\overline{\alpha} \in \mathbb{Q}(\alpha)$.


Exercise 2.1.3∗ . Let α be a quadratic number and β ∈ Q(α). Show that β has degree 1 or 2.

Solution

As in Exercise 2.1.2∗, let $\overline{\alpha} \in \mathbb{Q}(\alpha)$ be the conjugate of $\alpha$. Let $\beta = a + b\alpha$ with $a, b \in \mathbb{Q}$. Then
\[ (X - (a + b\alpha))(X - (a + b\overline{\alpha})) \]
has rational coefficients by the fundamental theorem of symmetric polynomials so $\beta$ has degree
at most 2 as wanted.



Exercise 2.1.4∗. Prove that a quadratic field $K$ is equal to $\mathbb{Q}(\sqrt{d})$ for some squarefree rational
integer $d \ne 1$. Moreover, prove that such fields are pairwise non-isomorphic (and in particular distinct),
meaning that, for distinct squarefree $a, b \ne 1$, there does not exist a bijective function $f : \mathbb{Q}(\sqrt{a}) \to
\mathbb{Q}(\sqrt{b})$ such that $f(x + y) = f(x) + f(y)$ and $f(xy) = f(x)f(y)$ for any $x, y \in \mathbb{Q}(\sqrt{a})$.

Solution

Let $\alpha$ be a quadratic number, i.e. a root of $aX^2 + bX + c$ for some $a, b, c \in \mathbb{Z}$. Then, $\alpha =
\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ so $\mathbb{Q}(\alpha) = \mathbb{Q}(\sqrt{b^2 - 4ac})$. By letting $d$ be the squarefree part of $b^2 - 4ac$ (but
conserving its sign), we thus get $\mathbb{Q}(\alpha) = \mathbb{Q}(\sqrt{d})$ as wanted (and $d \ne 1$ since it does not only have
rational elements).

Let $u, v \ne 1$ be squarefree and suppose $f : \mathbb{Q}(\sqrt{u}) \to \mathbb{Q}(\sqrt{v})$ is an isomorphism. We will show
that $u = v$. Note that $f(1)^2 = f(1)$ so $f(1) = 1$ or 0 but the latter is impossible since then
$f(x) = f(x)f(1) = 0$ for all $x$. We have
\[ f(u) = f(1) + f(1) + \ldots + f(1) = uf(1) = u \]
(for $u > 0$, and similarly for $u < 0$) and $f(u) = f(\sqrt{u})^2$ so $\sqrt{u} \in \mathbb{Q}(\sqrt{v})$. Thus, the problem of showing that $\mathbb{Q}(\sqrt{u})$ and $\mathbb{Q}(\sqrt{v})$
are non-isomorphic when $u \ne v$ reduces to showing that they are distinct. This is easy: if
$\sqrt{u} = a + b\sqrt{v}$ then $u = a^2 + vb^2 + 2ab\sqrt{v}$ so $a = 0$ or $b = 0$ since $\sqrt{v}$ is irrational but the former
means that $u/v$ is a perfect square and the latter that $u$ is one, i.e. that $u = v$ or $u = 1$.


Exercise 2.1.5∗ . Prove that the conjugate is well defined.



Solution
This amounts to the fact that every element of $\mathbb{Q}(\sqrt{d})$ can be written in a unique way as $a + b\sqrt{d}$
with $a, b \in \mathbb{Q}$ which is true since $\sqrt{d}$ is irrational.


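Exercise 2.1.6∗. Let $d \ne 1$ be a rational squarefree number. Prove that the conjugation satisfies
$\overline{\alpha + \beta} = \overline{\alpha} + \overline{\beta}$ and $\overline{\alpha\beta} = \overline{\alpha}\,\overline{\beta}$ for all $\alpha, \beta \in \mathbb{Q}(\sqrt{d})$. Such a function is called an automorphism of $\mathbb{Q}(\sqrt{d})$
if it is also bijective.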

Solution

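We have
\[ (a - b\sqrt{d}) + (a' - b'\sqrt{d}) = (a + a') - (b + b')\sqrt{d} \]
and
\[ (a - b\sqrt{d})(a' - b'\sqrt{d}) = (aa' + bb'd) - (ab' + ba')\sqrt{d}, \]
which are the conjugates of the sum and of the product of $a + b\sqrt{d}$ and $a' + b'\sqrt{d}$ respectively.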


Exercise 2.1.7. Let $d \ne 1$ be a rational squarefree number. Prove that the only automorphisms of
$\mathbb{Q}(\sqrt{d})$ are the identity and conjugation.

Solution

Let $f$ be an automorphism of $\mathbb{Q}(\sqrt{d})$. Since $f$ is bijective and $f(1)^2 = f(1)$, we must have $f(1) = 1$
as otherwise $f(x) = f(x)f(1) = 0$ for all $x$. By induction we have $f(nx) = f(x) + \ldots + f(x) =
nf(x)$ for any $n \in \mathbb{Z}_{\ge 1}$, and it is clearly also true for $n = 0$ since $f(0) + f(0) = f(0)$ and for $n < 0$
since $f(-nx) = f(0) - f(nx) = -f(nx)$. Thus $f(nx) = nf(x)$ for any $n \in \mathbb{Z}$. Since
\[ nf(xm/n) = f(xm) = mf(x), \]
we also have $f(ax) = af(x)$ for any $a \in \mathbb{Q}$; in particular $f$ fixes $\mathbb{Q}$. To finish, $d = f(d) = f(\sqrt{d})^2$ so
$f(\sqrt{d}) = \pm\sqrt{d}$. Since $f(a + b\sqrt{d}) = a + bf(\sqrt{d})$, the plus sign gives the identity and the minus
sign the conjugation.



Exercise 2.1.8∗. Let $d \ne 1$ be a squarefree rational integer, and $\alpha, \beta \in \mathbb{Q}(\sqrt{d})$. Prove that $N(\alpha\beta) =
N(\alpha)N(\beta)$.

Solution
Let $\alpha = a + b\sqrt{d}$ and $\beta = a' + b'\sqrt{d}$. Then,
\[ N(\alpha)N(\beta) = (a^2 - db^2)(a'^2 - db'^2) \]
and
\[ N(\alpha\beta) = N((aa' + dbb') + (ab' + ba')\sqrt{d}) = (aa' + dbb')^2 - d(ab' + ba')^2. \]
To conclude, both sides are equal to $(aa')^2 + (dbb')^2 - d((ab')^2 + (ba')^2)$.
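
The multiplicativity of the norm is also easy to confirm by machine on a grid of rational coordinates. The sketch below represents $a + b\sqrt{d}$ by the pair `(a, b)` of `Fraction`s; the names `multiply` and `norm` are illustrative.

```python
from fractions import Fraction
from itertools import product

def multiply(x, y, d):
    """Coordinates of (a + b*sqrt(d)) * (a2 + b2*sqrt(d))."""
    (a, b), (a2, b2) = x, y
    return (a * a2 + d * b * b2, a * b2 + b * a2)

def norm(x, d):
    """N(a + b*sqrt(d)) = a^2 - d*b^2."""
    a, b = x
    return a * a - d * b * b

if __name__ == "__main__":
    d = -5
    values = [Fraction(n, 2) for n in range(-3, 4)]
    ok = all(
        norm(multiply((a, b), (a2, b2), d), d) == norm((a, b), d) * norm((a2, b2), d)
        for a, b, a2, b2 in product(values, repeat=4)
    )
    print("norm is multiplicative on the sample:", ok)
    # Example with d = 2: 1 + sqrt(2) has norm -1 and inverse -1 + sqrt(2).
    print(norm((1, 1), 2), multiply((1, 1), (-1, 1), 2))
```

Using exact fractions rather than floating-point numbers means the equality test is exact.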


Exercise 2.1.9. Prove Exercise 2.1.8∗ without any computations using Exercise 2.1.6∗ .

Solution

We have
\[ N(\alpha)N(\beta) = \alpha\overline{\alpha}\beta\overline{\beta} = \alpha\beta\,\overline{\alpha\beta} = N(\alpha\beta). \]



Exercise 2.1.10. Let $d < 0$ be a squarefree integer. Prove that the conjugate of an element of $\mathbb{Q}(\sqrt{d})$
is the same as its complex conjugate. In particular, the norm over $\mathbb{Q}(\sqrt{d})$ is the absolute value squared.

Solution
When $d < 0$, the complex conjugate of $\sqrt{d}$ is $-\sqrt{d}$. Since rational numbers are real, this means
that the complex conjugate of $a + b\sqrt{d}$ for $a, b \in \mathbb{Q}$ is $a - b\sqrt{d}$.


2.2 Unique Factorisation


Exercise 2.2.1∗. Let $d \ne 1$ be a squarefree rational integer. Prove that the product of two units of
$\mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ is still a unit, and that the conjugate of a unit is also a unit.

Solution

If $uu' = 1$ and $vv' = 1$ then $(uv)(u'v') = 1$ and $\overline{u}\,\overline{u'} = 1$.




Exercise 2.2.2∗. Let $d \ne 1$ be a squarefree rational integer. Prove that $\alpha \in \mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ is a unit if and
only if $|N(\alpha)| = 1$.

Solution

If $N(\alpha) = \alpha\overline{\alpha} = \pm 1$, then it is clear that $\alpha$ is a unit. Now suppose $\alpha$ is a unit. By Exercise 2.2.1∗, $\overline{\alpha}$
is also a unit, which means $N(\alpha) = \alpha\overline{\alpha}$ is one too. However the only rational integer units are
$\pm 1$ by Proposition 1.1.1.


Exercise 2.2.3∗ . Determine the units of the ring Z[i].

Solution

$a + bi$ is a unit of $\mathbb{Z}[i]$ if and only if its norm $a^2 + b^2$ is 1 by Exercise 2.2.2∗, since it is non-negative.
This means $a = \pm 1$ and $b = 0$ or $a = 0$ and $b = \pm 1$, i.e. $a + bi \in \{1, -1, i, -i\}$.


Exercise 2.2.4∗ . Prove that an associate of a prime is also prime.



Solution

If p and q are associates, p | α if and only if q | α.




Exercise 2.2.5∗ . Prove that the conjugate of a prime is also a prime.

Solution

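$p \mid \eta$ iff $\overline{p} \mid \overline{\eta}$ so if $\overline{p} \mid \alpha\beta$ then $p \mid \overline{\alpha}\,\overline{\beta}$ which implies $p \mid \overline{\alpha}$ or $p \mid \overline{\beta}$, i.e. $\overline{p} \mid \alpha$ or $\overline{p} \mid \beta$ as wanted.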




Exercise 2.2.6∗ . Prove that primes are irreducible.

Solution

If p = αβ is prime, then p must divide α or β, we may assume it divides α, i.e. α = pγ. Then,

p = αβ = pγβ =⇒ βγ = 1

so β is a unit as wanted.


Exercise 2.2.7∗. Let $d \ne 1$ be a squarefree rational integer and let $x \in \mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ be a quadratic integer.
Suppose that $|N(x)|$ is a rational prime. Prove $x$ is irreducible.

Solution

If x = αβ then N (x) = N (α)N (β) so if N (x) is a rational prime then N (α) or N (β) must be ±1
by the uniqueness of the prime factorisation in Z, i.e. one of them is a unit.


Exercise 2.2.8∗ . Suppose a prime p divides another prime q. Prove that p and q are associates.

Solution

Write q = αp. Since q is irreducible by Exercise 2.2.6∗ and p is not a unit, α must be a unit.


Exercise 2.2.9∗ . Prove that p is a prime element of R if and only if it is non-zero and R (mod p) is an
integral domain (this means that the product of two non-zero elements is still non-zero). In particular,
if R (mod p) is a field (this means that elements which are not divisible by p have an inverse mod p),
p is prime.

Solution

αβ is divisible by p if and only if it is zero modulo p, so p is prime iff when αβ is zero modulo
p, α or β must already be zero modulo p. This is exactly what it means for R (mod p) to be an
integral domain.


Exercise 2.2.10. Let $d \ne 1$ be a squarefree rational integer and let $p \in \mathcal{O}_{\mathbb{Q}(\sqrt{d})}$ be a prime. Prove
that $p$ divides exactly one rational prime $q \in \mathbb{Z}$.

Solution

First we prove uniqueness. Suppose p | q and p | r for distinct rational primes q and r. By
Bézout, let aq + br = 1 for some a, b ∈ Z. Then p | aq + br = 1 which means that it’s a unit and
is a contradiction.

For existence, consider the prime factorisation $\pm q_1^{n_1} \cdot \ldots \cdot q_k^{n_k}$ of the norm $N(p)$ of $p$. Since
$p \mid N(p)$, $p$ must divide one of the $q_i$ since it's prime.



Exercise 2.2.11. Prove that 2 is irreducible in $\mathbb{Z}[\sqrt{-5}] = \mathcal{O}_{\mathbb{Q}(\sqrt{-5})}$ but not prime.

Solution
First, note that 2 is not prime since $2 \mid (1 + \sqrt{-5})(1 - \sqrt{-5})$ but 2 divides neither of these factors.
Now, suppose $2 = \alpha\beta$ and that neither of $\alpha$ and $\beta$ is a unit. Then,
\[ 4 = N(2) = N(\alpha)N(\beta) \]
so $N(\alpha) = N(\beta) = \pm 2$ as they are different from $\pm 1$. If we write $\alpha = a + b\sqrt{-5}$, we have
$N(\alpha) = a^2 + 5b^2$ which cannot be equal to $\pm 2$ and is thus a contradiction.
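
The two facts used in this solution can be checked mechanically. In the sketch below, an element $a + b\sqrt{-5}$ is stored as the pair `(a, b)`; the helper names are illustrative, and the search bound in `elements_of_norm` is an arbitrary choice that is more than enough for norm 2.

```python
def multiply(x, y):
    """Product of a + b*sqrt(-5) and c + d*sqrt(-5), as a coordinate pair."""
    (a, b), (c, d) = x, y
    return (a * c - 5 * b * d, a * d + b * c)

def divisible_by_two(x):
    """2 divides a + b*sqrt(-5) in Z[sqrt(-5)] exactly when a and b are both even."""
    a, b = x
    return a % 2 == 0 and b % 2 == 0

def elements_of_norm(n, bound=10):
    """All (a, b) with |a|, |b| <= bound and a^2 + 5*b^2 = n."""
    return [(a, b) for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1) if a * a + 5 * b * b == n]

if __name__ == "__main__":
    print(multiply((1, 1), (1, -1)))   # the product (1 + sqrt(-5))(1 - sqrt(-5))
    print(divisible_by_two((1, 1)), divisible_by_two((1, -1)))
    print(elements_of_norm(2))         # no element of norm 2
```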


Exercise 2.2.12. Show that the primes of Definition 2.2.2 must all be prime elements, and that there
is at least one associate of each prime element in that set. (Conversely, if we have unique factorisation,
any such set of primes work. This explains why we consider all primes defined in Definition 2.2.3.)

Solution

Suppose $p$ is a prime as in Definition 2.2.2 and $p \mid \alpha\beta$. Write $\alpha = up_1^{a_1} \cdot \ldots \cdot p_n^{a_n}$ and $\beta = vq_1^{b_1} \cdot \ldots \cdot q_m^{b_m}$
the prime factorisations of $\alpha$ and $\beta$. Then, by the uniqueness of the prime factorisation, $p$ must
be equal to some $p_i$ or $q_j$ times a unit $w$. In the first case, $p \mid p_i \mid \alpha$ and in the second case
$p \mid q_j \mid \beta$.

Now suppose $p$ is a prime as in Definition 2.2.3 and consider its prime factorisation $p_1^{a_1} \cdot \ldots \cdot
p_n^{a_n}$. Then $p$ must divide one of the $p_i$ since it's prime, which means it is associate to it by
Exercise 2.2.8∗.


Exercise 2.2.13∗ . Prove that a greatest common divisor γ of α and β really is a greatest common
divisor of α and β, in the sense that if γ | α, β and δ | α, β then δ | γ.

Solution

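Since $\alpha, \beta \in \alpha R + \beta R$, we have $\alpha, \beta \in \gamma R$, i.e. $\gamma \mid \alpha, \beta$. Now suppose $\delta \mid \alpha, \beta$. Let $x$ and $y$ be
such that $x\alpha + y\beta = \gamma$. Then,
\[ \delta \mid x\alpha + y\beta = \gamma. \]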


Exercise 2.2.14∗ . Prove that an associate greatest common divisor is also a greatest common divisor,
and that the greatest common divisor of two elements is unique up to association.

Solution

If γ and δ are two gcds of α and β then, by Exercise 2.2.13∗ , γ | δ | γ so they are associates.


Exercise 2.2.15. Let R be a Euclidean domain with Euclidean function f . Show that, if f (α) = 0,
then α = 0, and if f (α) = 1, then α is a unit or zero.

Solution

If $\alpha \ne 0$, consider the Euclidean division of 0 by $\alpha$: $0 = \rho\alpha + \tau$. Then $f(\tau) < f(\alpha) = 0$ which is
impossible.

For the second part, suppose that α is non-zero and f (α) = 1. Then, if we perform the Euclidean
division of any β by α: β = αρ + τ , we have f (τ ) < f (α) = 1, i.e. f (τ ) = 0. By the first part,
this means that τ = 0. Hence, α divides everything, and in particular it divides 1, i.e. it is a
unit.


Exercise 2.2.16∗ . Prove that a Euclidean domain is a Bézout domain.

Solution

Let R be a Euclidean domain with Euclidean function f . Let α, β be two elements of R. Consider
a non-zero element γ ∈ αR + βR such that f (γ) is minimal. We will show that αR + βR = γR.
Suppose otherwise, that there is a δ ∈ αR + βR such that the Euclidean division δ = ργ + τ has
τ non-zero (otherwise γ | δ, i.e. δ ∈ γR). Then, f (τ ) < f (γ) but τ ∈ αR + βR and is non-zero
too so that contradicts the minimality of γ.
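
In a concrete Euclidean domain, the generator $\gamma$ need not be found by a minimality argument: it can be computed with Euclid's algorithm. The sketch below does this in $\mathbb{Z}[i]$, assuming the norm as Euclidean function and rounding the exact quotient to a nearest Gaussian integer. Elements are pairs `(re, im)` of Python integers, and the function names are illustrative.

```python
def norm(z):
    """N(a + bi) = a^2 + b^2."""
    return z[0] * z[0] + z[1] * z[1]

def divide(a, b):
    """Euclidean division in Z[i]: return (q, r) with a = q*b + r and N(r) < N(b)."""
    n = norm(b)
    x = a[0] * b[0] + a[1] * b[1]   # real part of a * conj(b)
    y = a[1] * b[0] - a[0] * b[1]   # imaginary part of a * conj(b)
    # Round x/n and y/n to a nearest integer using exact integer arithmetic.
    q = ((2 * x + n) // (2 * n), (2 * y + n) // (2 * n))
    qb = (q[0] * b[0] - q[1] * b[1], q[0] * b[1] + q[1] * b[0])
    return q, (a[0] - qb[0], a[1] - qb[1])

def gcd(a, b):
    """A generator of aZ[i] + bZ[i], obtained by repeated Euclidean division."""
    while b != (0, 0):
        _, r = divide(a, b)
        a, b = b, r
    return a

if __name__ == "__main__":
    print(gcd((5, 0), (3, 4)))   # 5 = (2+i)(2-i) and 3+4i = (2+i)^2 share the factor 2+i
    print(gcd((3, 0), (2, 1)))   # coprime: the result is a unit
```

The result is only defined up to multiplication by a unit, so the program may return an associate of the generator one would write by hand.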


Exercise 2.2.17∗ . Prove that irreducible elements are prime in a Bézout domain.

Solution

Let $x$ be an irreducible element in a Bézout domain $R$. By Exercise 2.2.9∗, it suffices to show
that every $\alpha$ with $x \nmid \alpha$ has an inverse modulo $x$. Since $R$ is a Bézout domain, there is some $\beta$ such that
\[ xR + \alpha R = \beta R. \]
In particular $\beta \mid x$ so $\beta$ is a unit or a unit times $x$. The latter is not possible since $\beta$ also divides
$\alpha$, thus $\beta$ is a unit. Without loss of generality, $\beta = 1$ by Exercise 2.2.14∗. Since $\beta = ax + b\alpha$ for
some $a, b \in R$ by definition, modulo $x$ we have $b\alpha \equiv 1$ as wanted.


2.3 Gaussian Integers


Exercise 2.3.1∗ . Let n ∈ Z be a rational integer and p an odd rational prime. If n2 ≡ −1 (mod p),
prove that p ≡ 1 (mod 4).

Solution

Suppose that $p \mid n^2 + 1$. Then the order of n modulo p is 4: indeed, $n^4 \equiv (-1)^2 = 1$ so the order divides 4, and $n^2 \equiv -1 \not\equiv 1$ (as p is odd) so the order does not divide 2. Since the order divides p − 1 by Fermat's little theorem (see Exercise 3.3.4∗), we have 4 | p − 1, i.e. p ≡ 1 (mod 4) as wanted.


Exercise 2.3.2∗. Let p ≡ 1 (mod 4) be a rational prime. Prove that there exists a rational integer n
such that $n^2 \equiv -1 \pmod p$. (Hint: Consider (p − 1)!.)

Solution

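By Wilson's theorem, (p − 1)! ≡ −1 (mod p). Hence,

$$\begin{aligned}
-1 &\equiv (p-1)! \\
&= \prod_{k=1}^{\frac{p-1}{2}} k \cdot \prod_{k=1}^{\frac{p-1}{2}} (p-k) \\
&\equiv \prod_{k=1}^{\frac{p-1}{2}} k \prod_{k=1}^{\frac{p-1}{2}} (-k) \\
&= (-1)^{\frac{p-1}{2}} \left(\prod_{k=1}^{\frac{p-1}{2}} k\right)^2 \\
&\equiv \left(\prod_{k=1}^{\frac{p-1}{2}} k\right)^2 \pmod p
\end{aligned}$$

since p ≡ 1 (mod 4).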
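
The computation above says that n = ((p − 1)/2)! is a square root of −1 modulo p. Here is a minimal numerical sketch of this; the function name `sqrt_of_minus_one` is ours, and the input is assumed (not checked) to be a prime congruent to 1 modulo 4.

```python
from math import factorial

def sqrt_of_minus_one(p):
    """Return n = ((p-1)/2)! mod p, assuming p is a prime with p ≡ 1 (mod 4)."""
    assert p % 4 == 1
    return factorial((p - 1) // 2) % p

# Check n^2 ≡ -1 (mod p) on a few primes congruent to 1 modulo 4.
for p in (5, 13, 17, 29, 37, 41):
    n = sqrt_of_minus_one(p)
    assert (n * n + 1) % p == 0
```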




Exercise 2.3.3. Which rational integers can be written as a sum of two squares of rational integers?

Solution

Since the norm of Z[i] is multiplicative, if m and n are sums of two squares, then so is mn. In
particular, all integers with only prime factors equal to 2 or congruent to 1 modulo 4 are sums
of two squares. Also, perfect squares times these numbers are sums of two squares.

Now suppose that $n = a^2 + b^2$ is a sum of two squares but not of this form, i.e. there is some prime p ≡ −1 (mod 4) such that $v_p(n)$ is odd. If p ∤ b then $p \mid (ab^{-1})^2 + 1$, which is impossible by Exercise 2.3.1∗. Thus p | a, b. We can now proceed by infinite descent on $n/p^2 = (a/p)^2 + (b/p)^2$ (or equivalently suppose n was the minimal counterexample and reach a contradiction).
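
The resulting criterion (n ≥ 1 is a sum of two squares if and only if every prime congruent to −1 modulo 4 divides n to an even power) can be compared with a brute-force search. The sketch below is only a sanity check on a small range; both function names are ours.

```python
from math import isqrt

def is_sum_of_two_squares(n):
    """Brute force: is n = a^2 + b^2 for some integers a, b?"""
    return any(isqrt(n - a * a) ** 2 == n - a * a for a in range(isqrt(n) + 1))

def criterion(n):
    """True iff every prime p ≡ 3 (mod 4) divides n >= 1 to an even power."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            v = 0
            while n % p == 0:
                n //= p
                v += 1
            if p % 4 == 3 and v % 2 == 1:
                return False
        p += 1
    # What is left is 1 or a single prime appearing with exponent 1.
    return n % 4 != 3

assert all(is_sum_of_two_squares(n) == criterion(n) for n in range(1, 2000))
```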


Exercise 2.3.4∗. Find all rational integer solutions to the equation $x^2 + 1 = y^3$. (This is the example
we considered in the beginning of the chapter.)

Solution

Note that any solution of $x^2 + 1 = y^3$ must have y odd since $x^2 + 1$ is congruent to 1 or 2 modulo 4. Thus, y is not divisible by 1 + i. Write the equation as $(x + i)(x - i) = y^3$. The gcd of the two factors divides (x + i) − (x − i) = 2i but is not divisible by 1 + i, so it is 1. This means that

$$x + i = u\alpha^3$$

for some unit u and some α ∈ Z[i]. Since the units of Z[i] are 1, −1, i, −i and they are all cubes ($1 = 1^3$, $-1 = (-1)^3$, $i = (-i)^3$, $-i = i^3$), we can assume u = 1. The rest of the solution is the same as in the introduction of the chapter.


2.4 Eisenstein Integers


Exercise 2.4.1∗. Prove that the norm of a + bj is $a^2 - ab + b^2$. (Bonus: do it without any computations
using cyclotomic polynomials from Chapter 3.)

Solution

$$(a + bj)(a + b\overline{j}) = a^2 + ab(j + \overline{j}) + b^2 j\overline{j} = a^2 - ab + b^2$$

since $j + \overline{j} = -1$ and $j\overline{j} = 1$ by Vieta (j is a root of $X^2 + X + 1 = 0$). For the bonus, one can
note that $N(a + bj) = \Phi_6(a, b)$ by Exercise 3.1.8∗.


Exercise 2.4.2∗ . Determine the units of Z[j].

Solution

a + bj is a unit if and only if $a^2 - ab + b^2 = \pm 1$. If ab ≤ 0 we get a = 0 and b = ±1 as well as a = ±1 and b = 0, and if ab > 0 then $a^2 - ab + b^2 = (a - b)^2 + ab \geq 0^2 + 1$, which means a = b = ±1. In conclusion, the units of Z[j] are ±1, ±j and $\pm(1 + j) = \mp j^2$. (Note that these are all roots of unity.)


Exercise 2.4.3∗ . Prove that Z[j] is norm-Euclidean.



Solution
Let α, β ∈ Z[j] be two elements with β ≠ 0. Write $\frac{\alpha}{\beta} = x + yj$ with x, y ∈ Q, and let m and n be rational integers such that $|x - m| \leq \frac12$ and $|y - n| \leq \frac12$. Thus,

$$|N(x + yj - (m + nj))| \leq \left(\tfrac12\right)^2 + \left(\tfrac12\right)^2 + \left(\tfrac12\right)^2 < 1.$$

Hence,

$$|N(\alpha - \beta(m + nj))| = |N(\beta)| \cdot |N(x + yj - (m + nj))| < |N(\beta)|$$

which means that the remainder τ = α − β(m + nj) works since it has norm less than N(β).


Exercise 2.4.4. Characterise the primes of Z[j]. Conclude that when p ≡ 1 (mod 3) there exist
rational integers a and b such that $p = a^2 - ab + b^2$. (You may assume that there is an x ∈ Z such
that $x^2 + x + 1 \equiv 0 \pmod p$ if p ≡ 1 (mod 3). This will be proven in Chapter 3, as a corollary of
Theorem 3.3.1.)

Solution

As in the Z[i] case, it suffices to find the prime factorisation of rational primes (this can also be seen as a corollary of the fact that a prime divides exactly one rational prime). Indeed, if α is a prime of Z[j], then $N(\alpha) = \alpha\overline{\alpha}$ has at most two rational prime factors: if it has one then N(α) is prime in Z. Otherwise, $N(\alpha) = \pm p^2$ and α is an associate of a rational prime p.

Thus, suppose N(α) = p for some rational prime p (N(α) is positive). Write α = a + bj. Then, $N(\alpha) = a^2 - ab + b^2$. Clearly, p ∤ b as otherwise p | a too and $p^2 \mid N(\alpha) = p$. Thus, $p \mid (-a \cdot b^{-1})^2 + (-a \cdot b^{-1}) + 1$.

We wish to find which rational primes divide a number of the form $x^2 + x + 1$. Note that $x^3 - 1 = (x - 1)(x^2 + x + 1)$ so $x^3 \equiv 1 \pmod p$. The order of x modulo p divides 3. If it is 1 then $x^2 + x + 1 \equiv 3$ so p = 3, which factorises as $(1 - j)(1 - \overline{j}) = -j^2(1 - j)^2$ (it ramifies).

Otherwise, the order must be 3. Since it divides p − 1, we have p ≡ 1 (mod 3). In particular, primes congruent to −1 modulo 3 stay inert in Z[j]. Finally, if p ≡ 1 (mod 3), by Theorem 3.3.1, there exists an x such that $p \mid x^2 + x + 1 = (x - j)(x - j^2)$. Since $p \nmid x - j, x - j^2$, this means these primes split as a product of two conjugate Eisenstein primes $a + bj$ and $a + b\overline{j}$. (In particular $p = a^2 - ab + b^2$.)
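
As a quick numerical illustration of the conclusion, the following sketch searches for a representation $p = a^2 - ab + b^2$ by brute force and checks that one exists exactly when p = 3 or p ≡ 1 (mod 3). The helper names are ours and the search is deliberately naive.

```python
from math import isqrt

def is_prime(n):
    return n > 1 and all(n % q for q in range(2, isqrt(n) + 1))

def eisenstein_representation(p):
    """Return some (a, b) with a^2 - a*b + b^2 == p, or None if there is none."""
    # a^2 - ab + b^2 = (a - b/2)^2 + 3b^2/4, so |a|, |b| <= sqrt(4p/3).
    bound = isqrt(4 * p // 3) + 1
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            if a * a - a * b + b * b == p:
                return a, b
    return None

for p in filter(is_prime, range(2, 200)):
    representable = eisenstein_representation(p) is not None
    assert representable == (p == 3 or p % 3 == 1)
```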


Exercise 2.4.5∗. Let θ ∈ Z[j] be an Eisenstein integer. Prove that, if λ ∤ θ, then θ ≡ ±1 (mod λ).
In that case, prove that we also have $\theta^3 \equiv \pm 1 \pmod{\lambda^4}$.

Solution

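Modulo λ, every element of Z[j] is congruent to a rational integer. Indeed, a + bj ≡ a + b (mod 1 − j). Moreover, λ | 3 so every rational integer is congruent modulo λ to 0 or ±1.

Let θ = ηλ ± 1. Since $\theta^3 \mp 1 = (\theta \mp 1)(\theta^2 \pm \theta + 1)$ and λ | θ ∓ 1, we shall show that $\lambda^3 \mid \frac{\theta^3 \mp 1}{\theta \mp 1} = \theta^2 \pm \theta + 1$ when λ ∤ η. We have

$$\begin{aligned}
\theta^2 \pm \theta + 1 &= (\eta\lambda \pm 1)^2 \pm (\eta\lambda \pm 1) + 1 \\
&= \eta^2\lambda^2 \pm 2\eta\lambda + 1 \pm \eta\lambda + 1 + 1 \\
&= \eta^2\lambda^2 \pm 3\lambda\eta + 3 \\
&\equiv \eta^2\lambda^2 + 3 \pmod{\lambda^3} \\
&= \lambda^2(\eta^2 - j^2) \\
&= \lambda^2(\eta - j)(\eta + j)
\end{aligned}$$

which is divisible by $\lambda^3$ since any λ ∤ η is congruent to j ≡ 1 or −j ≡ −1 modulo λ. (We used $3 = -j^2\lambda^2$. When λ | η, we have instead $\lambda^2 \mid \theta \mp 1 = \eta\lambda$ and the same computation gives $\lambda^2 \mid \theta^2 \pm \theta + 1$, so $\lambda^4 \mid \theta^3 \mp 1$ in both cases.)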
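
Since $\lambda^2 = -3j$, the element $\lambda^4 = 9j^2$ is an associate of 9, so the statement says that $\theta^3 \equiv \pm 1 \pmod 9$ whenever λ ∤ θ. The sketch below checks this on a small box of Eisenstein integers. The encoding of a + bj as the pair (a, b) and the function names are our own choices, and we use that λ | a + bj if and only if 3 | a + b.

```python
def mul(x, y):
    """Multiply Eisenstein integers x = (a, b) ~ a + b*j, using j^2 = -1 - j."""
    a, b = x
    c, d = y
    return (a * c - b * d, a * d + b * c - b * d)

def cube_residue_mod_9(theta):
    """Coordinates of theta^3 in the basis (1, j), reduced modulo 9."""
    t3 = mul(mul(theta, theta), theta)
    return (t3[0] % 9, t3[1] % 9)

for a in range(-6, 7):
    for b in range(-6, 7):
        if (a + b) % 3 != 0:  # lambda = 1 - j does not divide a + b*j
            # (1, 0) is 1 and (8, 0) is -1 modulo 9
            assert cube_residue_mod_9((a, b)) in {(1, 0), (8, 0)}
```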




Exercise 2.4.6∗. Let α, β ∈ Z[j] be coprime Eisenstein integers not divisible by λ. Prove that, if

$$\lambda \mid \alpha^3 + \beta^3 = (\alpha + \beta)(\alpha + \beta j)(\alpha + \beta j^2),$$

each pair of factors has gcd λ.

Solution

Modulo λ, j ≡ 1 so all the factors are the same modulo λ and are thus all divisible by λ. If γ | α + β, α + βj then γ | β(1 − j), which implies γ | 1 − j = λ since γ is coprime with β (it divides α + β). By symmetry between j and $j^2$ (they are conjugate), we reach the same conclusion if γ | α + β, $\alpha + \beta j^2$.

Finally, if γ | α + βj, $\alpha + \beta j^2$ then γ | βj(1 − j) but γ is coprime with β since it divides α + βj, which implies γ | 1 − j = λ again.


Exercise 2.4.7. Check the computational details: ±1 ± µ ± η is never zero mod $\lambda^4$ for units µ, η and
±1 ± µ ≡ 0 (mod $\lambda^3$) implies µ = ±1.

Solution

We have seen in Exercise 2.4.2∗ that the units of Z[j] are roots of unity. Thus, the norms of
±ε, ±2 ± ε, ±1 ± µ have absolute value less than 3 · 3 by the triangle inequality (because the
conjugates of roots of unity are still roots of unity and thus have absolute value 1). In particular,
for it to be divisible by N (λ4 ) = 81, it must be zero, i.e. ±ε = 0, ±2 ± ε = 0 and ±1 ± µ = 0.
The first two cases are impossible, and the last one implies µ = ±1 as wanted.


2.5 Hurwitz Integers


Exercise 2.5.1∗. Prove that ij = k = −ji, jk = i = −kj and ki = j = −ik follow from $i^2 = j^2 = k^2 = ijk = -1$ and associativity of the multiplication.

Solution

We have
1. ij = −ijkk = k.

2. kj = ijj = −i.
3. jk = −iijk = i.
4. ik = jkk = −j.
5. ki = −kkj = j.

6. ji = kii = −k.


Exercise 2.5.2. Prove that i, j, k are distinct.

Solution

Exercise 2.5.1∗ tells us that everything is cyclic between i, j, k so assume i = j. Then we get
k = ij = i2 = −1 and
−i = ki = j = i
so i = 0 which is an obvious contradiction.


Exercise 2.5.3∗ . Let α, β, γ ∈ H be quaternions. Prove that (αβ)γ = α(βγ). (We say multiplication
is associative. This is why we can write αβγ without ambiguity.)

Solution

The simplest (and neatest) way to see this is to use the representation by matrices given by
Remark 2.5.2, since multiplication of matrices is associative. Without matrices, but still with a
bit of (implicit) linear algebra, we can do the following. We shall prove that (xy)z = x(yz) for
any x, y, z ∈ {1, i, j, k}. Since real numbers commute with everything and multiplication is clearly
associative when one of the factors is real, we thus get (xy)z = x(yz) for any x, y, z which are
linear combinations with real coefficients of 1, i, j, k, i.e. all quaternions (since if (xy)z = x(yz)
and (wy)z = w(yz) then ((x + w)y)z = (x + w)(yz) too).

When one of x, y, z is 1 we trivially have (xy)z = x(yz) by the previous remark (since 1 is real).
Thus, suppose without loss of generality by cyclicity (Exercise 2.5.1∗ ) that x = i. We distinguish
all possible cases.

1. y = z = j. We have
(ij)j = kj = −i = i(jj).

2. y = z = k. We have
(ik)k = −jk = −i = i(kk).

3. y = j, z = k. We have
(ij)k = kk = −1 = ii = i(jk).

4. y = k, z = j. We have
(ik)j = −jj = 1 = −ii = i(kj).

5. y = i, z = j. We have
(ii)j = −j = ik = i(ij).

6. y = i, z = k. We have
(ii)k = −k = −ij = i(ik).

7. y = j, z = i. We have
(ij)i = ki = j = −ik = i(ji).

8. y = k, z = i. We have
(ik)i = −ji = k = ij = i(ki).

9. y = z = i. We trivially have
(ii)i = i(ii).
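
The case analysis can also be delegated to a computer. In the sketch below, quaternions are encoded as 4-tuples (a, b, c, d) standing for a + bi + cj + dk and `qmul` (a name of our choosing) is the usual Hamilton product, which encodes the multiplication table of Exercise 2.5.1∗; associativity is then checked on all triples of basis elements.

```python
from itertools import product

def qmul(x, y):
    """Hamilton product of quaternions given as 4-tuples (a, b, c, d)."""
    a, b, c, d = x
    e, f, g, h = y
    return (a * e - b * f - c * g - d * h,
            a * f + b * e + c * h - d * g,
            a * g + c * e + d * f - b * h,
            a * h + d * e + b * g - c * f)

ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

# The multiplication table: ij = k, jk = i, ki = j.
assert qmul(I, J) == K and qmul(J, K) == I and qmul(K, I) == J

# Associativity on the basis, which extends to all quaternions by linearity.
for x, y, z in product((ONE, I, J, K), repeat=3):
    assert qmul(qmul(x, y), z) == qmul(x, qmul(y, z))
```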

Exercise 2.5.4. Prove that there are infinitely many square roots of −1 in H.

Solution

We have
$$(a + bi + cj + dk)^2 = (a^2 - b^2 - c^2 - d^2) + 2abi + 2acj + 2adk$$
because the other terms cancel out by Exercise 2.5.1∗, for instance bicj + cjbi = 0. In particular, if a = 0, $(bi + cj + dk)^2 = -(b^2 + c^2 + d^2)$ and there are thus clearly infinitely many square roots of any negative number.


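Exercise 2.5.5∗. Prove that, for any α, β ∈ H, $\overline{\alpha + \beta} = \overline{\alpha} + \overline{\beta}$ and $\overline{\alpha\beta} = \overline{\beta}\,\overline{\alpha}$ (the order is reversed because multiplication is not commutative anymore).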

Solution

It is clear that conjugation is additive:

$$(a - bi - cj - dk) + (a' - b'i - c'j - d'k) = (a + a') - (b + b')i - (c + c')j - (d + d')k.$$

For multiplicativity, one can see that

$$\begin{aligned}
(a + bi + cj + dk)(a' + b'i + c'j + d'k) = {} & (aa' - bb' - cc' - dd') + (ab' + ba' + cd' - dc')i \\
& + (ac' + ca' + db' - bd')j + (ad' + da' + bc' - cb')k.
\end{aligned}$$

If we exchange (a, b, c, d) with (a', b', c', d') and switch the signs of b, c, d, b', c', d', one can see that it is the same as taking the conjugate: both give

$$(aa' - bb' - cc' - dd') - (ab' + ba' + cd' - dc')i - (ac' + ca' + db' - bd')j - (ad' + da' + bc' - cb')k$$

(a times something switches sign because a stays the same but the other factor switches sign, and cd' − dc' switches sign too because (c, d) ↔ (c', d')).


Exercise 2.5.6∗. Check that (a + bi + cj + dk)(a − bi − cj − dk) is indeed $a^2 + b^2 + c^2 + d^2$.



Solution

$(a + bi + cj + dk)(a - bi - cj - dk) = a^2 + b^2 + c^2 + d^2$ since the non-real terms cancel out by Exercise 2.5.1∗, for instance bi(−cj) + cj(−bi) = 0.


Exercise 2.5.7∗ . Prove that H is a skew field. This amounts to checking that elements have multi-
plicative inverses (i.e. for any α there is a β such that αβ = βα = 1).

Solution

This follows from Exercise 2.5.6∗: the inverse of a + bi + cj + dk is $\frac{a - bi - cj - dk}{a^2 + b^2 + c^2 + d^2}$.


Exercise 2.5.8∗ . Prove that the norm is multiplicative: for any α, β ∈ H, N (αβ) = N (α)N (β).

Solution

This follows from Exercise 2.5.5∗ :

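$$N(\alpha\beta) = \alpha\beta\overline{\beta}\,\overline{\alpha} = \alpha\overline{\alpha}\beta\overline{\beta} = N(\alpha)N(\beta)$$

because real numbers commute with every quaternion and $\beta\overline{\beta}$ is a real number.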


Exercise 2.5.9∗. Prove that $H = \left\{ \frac{a + bi + cj + dk}{2} \mid a \equiv b \equiv c \equiv d \pmod 2 \right\}$. Deduce that the elements
of H have integral norms.

Solution

When you multiply two such elements you get back an element of the same form. Indeed, it is clear when one of the factors is in Z[i, j, k]. For the same reason, we can consider a, b, c, d modulo 2. Thus, we just need to show that the product of two such elements with odd a, b, c, d still has this form, which is true for $\frac{1+i+j+k}{2} \cdot \frac{1-i-j-k}{2} = 1$.

From this we conclude that H is included in this set. However it is also clear that H contains
this set so they are equal.


Exercise 2.5.10∗ . Determine the units of H.

Solution

α = a + bi + cj + dk is a unit if and only if $a^2 + b^2 + c^2 + d^2 = N(\alpha) = \pm 1$. This means

$$\alpha \in \left\{ \pm 1, \pm i, \pm j, \pm k, \frac{\pm 1 \pm i \pm j \pm k}{2} \right\}.$$

Indeed, either a, b, c, d are all integers, in which case one of them is ±1 and the rest 0, or they are all half-integers, in which case they must all be $\pm\frac12$ as otherwise the sum is too big.
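
This gives 8 + 16 = 24 units. As a small check, the sketch below enumerates them by brute force, using the description of Exercise 2.5.9∗: we store the doubled coordinates, so that a Hurwitz integer is a 4-tuple of integers of the same parity and the norm-1 condition becomes a sum of squares equal to 4. This encoding is our own convention.

```python
from itertools import product

# t = (a, b, c, d) stands for the Hurwitz integer (a + b*i + c*j + d*k) / 2.
units = [t for t in product(range(-2, 3), repeat=4)
         if len({x % 2 for x in t}) == 1       # a ≡ b ≡ c ≡ d (mod 2)
         and sum(x * x for x in t) == 4]       # norm equal to 1

assert len(units) == 24
```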

Exercise 2.5.11∗. Let α, β, γ ∈ H. Prove that if α left-divides β then α left-divides βγ, but that α does not always left-divide γβ.

Solution

Suppose that α left-divides β, i.e. β = αδ. Then βγ = α(δγ) so α left-divides βγ too.

For the second part, a very simple counter-example is α = β = 1 + i + j and γ = i. Suppose that α left-divides γβ. This means that there exists a δ ∈ H such that αδ = γβ, i.e.

(1 + i + j)δ = i(1 + i + j).

Multiplying on the left by 1 − i − j, which is the conjugate of 1 + i + j and has norm 3, we get 3δ = (1 − i − j)i(1 + i + j). In other words, 3 divides

(1 − i − j)i(1 + i + j).
However, this is equal to
(i + 1 + k)(1 + i + j) = i + 2j + 2k
which is clearly not divisible by 3.
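
The product appearing in this counter-example is easy to get wrong by hand, so here is a short sketch that recomputes it. Quaternions are encoded as integer 4-tuples (a, b, c, d) for a + bi + cj + dk, and `qmul` is a helper name of our choosing implementing the Hamilton product.

```python
def qmul(x, y):
    """Hamilton product of quaternions given as 4-tuples (a, b, c, d)."""
    a, b, c, d = x
    e, f, g, h = y
    return (a * e - b * f - c * g - d * h,
            a * f + b * e + c * h - d * g,
            a * g + c * e + d * f - b * h,
            a * h + d * e + b * g - c * f)

alpha = (1, 1, 1, 0)        # 1 + i + j
alpha_bar = (1, -1, -1, 0)  # its conjugate 1 - i - j
gamma = (0, 1, 0, 0)        # i

prod = qmul(qmul(alpha_bar, gamma), alpha)
# An integral quaternion q is divisible by 3 in the Hurwitz integers only if
# q / 3 has coordinates in (1/2)Z, i.e. only if 3 divides every coordinate of q.
divisible_by_3 = all(x % 3 == 0 for x in prod)
assert not divisible_by_3
```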


Exercise 2.5.12∗ . Prove that being left-associate is an equivalence relation, i.e., for any α, β, γ, α is
a left-associate of itself, α is a left-associate of β if and only if β is a left-associate of α, and if α is a
left-associate of β and β is a left associate of γ then α is a left-associate of γ.

Solution

We have α = α · 1,
α = βε ⇐⇒ β = αε−1 ,
and
α = βε, β = γη =⇒ α = γηε.


Exercise 2.5.13∗. Prove that a left-gcd γ of α and β satisfies the following property: γ left-divides α and β, and
if δ left-divides α and β then δ left-divides γ.

Solution

Since α, β ∈ γR, γ left-divides α and β. Write γ = αx + βy. If δ left-divides α and β, then δ left-divides αx + βy = γ by Exercise 2.5.11∗.


Exercise 2.5.14. Prove that 1 + i and 1 − j do not have a left-gcd in Z[i, j, k]. In particular, it
is not left-Bézout and thus not left-Euclidean too (and the same holds for being right-Bézout and
right-Euclidean by symmetry).

Solution

Let L = Z[i, j, k]. Suppose otherwise and let γ be a left-gcd. Note that 1 + i has norm 2 so γ must have norm 1 or 2. If it has norm 1, it is a unit so (1 + i)L + (1 − j)L = γL = L. Let α and β be such that (1 + i)α + (1 − j)β = 1. Then,

$$\begin{aligned}
1 &= (1 + i)\alpha + (1 - j)\beta \\
&= (1 + i)\alpha + (1 + i)\left(\tfrac12 - \tfrac{i}{2} - \tfrac{j}{2} + \tfrac{k}{2}\right)\beta \\
&= (1 + i)\left(\alpha + \left(\tfrac12 - \tfrac{i}{2} - \tfrac{j}{2} + \tfrac{k}{2}\right)\beta\right)
\end{aligned}$$

which is impossible since the norm of the RHS is divisible by 2: the second factor is a Hurwitz integer, so it has integral norm by Exercise 2.5.9∗.

If γ has norm 2 (this is what happens in H), then, since γ left-divides 1 + i and 1 − j and has
the same norm as them it is a left-associate of them and thus 1 + i and 1 − j are left-associates
too by Exercise 2.5.12∗ . This is a contradiction since the only units of L are ±1, ±i, ±j, ±k by
Exercise 2.5.10∗ .


Exercise 2.5.15∗ . Prove Proposition 2.5.1.

Solution

By symmetry, suppose R is a left-Euclidean domain with Euclidean function f. Let γ ∈ αR + βR be a non-zero element such that f(γ) is minimal. Suppose for the sake of contradiction that α ∉ γR. Write α = γρ + τ for some non-zero τ with f(τ) < f(γ). This contradicts the minimality of f(γ) since τ ∈ αR + βR too. Thus α ∈ γR and by symmetry β too.


Exercise 2.5.16∗ . Prove Proposition 2.5.3.

Solution

We proceed by induction on N (α), the base case is the units. If α is irreducible it is its own
factorisation, otherwise write α = βγ with N (β), N (γ) > 1. Then N (β), N (γ) < N (α) so they
can be written as a product of irreducible elements and thus α too.


Exercise 2.5.17. Prove that there is an irreducible Hurwitz integer x ∈ H for which there exist α
and β such that x left-divides αβ but x left-divides neither α nor β.

Solution

1 + i + j has norm 3, which is a rational prime, so it is irreducible. In addition, 1 + i + j left-divides 3 = (1 + i + j)(1 − i − j) = (1 + i + k)(1 − i − k), but it left-divides neither 1 + i + k nor 1 − i − k. Indeed, if 1 + i + j left-divided θ, then, multiplying on the left by its conjugate 1 − i − j, 3 would divide (1 − i − j)θ. However,

(1 − i − j)(1 + i + k) = 2 − i + 2k and (1 − i − j)(1 − i − k) = −i − 2j − 2k,

and neither of these is divisible by 3.


Exercise 2.5.18∗. Let p be a rational prime. Prove that there exist rational integers a and b such
that $p \mid 1 + a^2 + b^2$.

Solution

This is clear when p = 2, so suppose that p is odd. Then, $1 + a^2$ and $-b^2$ both reach $\frac{p+1}{2}$ values modulo p so their images cannot be disjoint since $\mathbb{F}_p$ only has $p < \frac{p+1}{2} + \frac{p+1}{2}$ elements.
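
The pigeonhole argument is effective: it suffices to look at 0 ≤ a, b ≤ (p − 1)/2. The following sketch (with a function name of our choosing, and assuming without checking that p is prime) finds such a pair by storing the residues of $-b^2$ in a dictionary.

```python
def one_plus_two_squares(p):
    """Return (a, b) with 0 <= a, b <= p // 2 and p | 1 + a^2 + b^2, for p prime."""
    half = p // 2
    # Residues of -b^2 modulo p, remembered together with a b producing them.
    lookup = {(-b * b) % p: b for b in range(half + 1)}
    for a in range(half + 1):
        r = (1 + a * a) % p
        if r in lookup:
            return a, lookup[r]
    return None  # never reached for a prime p, by the pigeonhole argument

for p in (2, 3, 5, 7, 11, 13, 101):
    a, b = one_plus_two_squares(p)
    assert (1 + a * a + b * b) % p == 0
```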


2.6 Exercises
Diophantine Equations
Exercise 2.6.2†. Prove that $\mathcal{O}_{\mathbb{Q}(\sqrt{2})}$ and $\mathcal{O}_{\mathbb{Q}(\sqrt{-2})}$ are Euclidean.

Solution

We proceed exactly as in Proposition 2.3.1. Let $\alpha, \beta \in \mathcal{O}_{\mathbb{Q}(\sqrt{\pm 2})} = \mathbb{Z}[\sqrt{\pm 2}]$ be algebraic integers, with β ≠ 0. Write $\alpha/\beta = x + y\sqrt{\pm 2}$ with x, y ∈ Q. Choose rational integers m, n such that $|x - m|, |y - n| \leq \frac12$. Then,

$$|N((x + y\sqrt{\pm 2}) - (m + n\sqrt{\pm 2}))| \leq \left(\frac12\right)^2 + 2\left(\frac12\right)^2 = \frac34 < 1$$

so that $\tau = \alpha - \beta(m + n\sqrt{\pm 2})$ works as the remainder of the Euclidean division of α by β since it has norm less than |N(β)|.


Exercise 2.6.4†. Prove that $\mathcal{O}_{\mathbb{Q}(\sqrt{-7})}$ is Euclidean.

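Solution

Let $\alpha, \beta \in \mathcal{O}_{\mathbb{Q}(\sqrt{-7})} = \mathbb{Z}\left[\frac{1 + \sqrt{-7}}{2}\right]$ be quadratic integers with β ≠ 0. Write $\frac{\alpha}{\beta} = x + y\sqrt{-7}$ with x, y ∈ Q. Pick a half-integer n (an element of $\frac12\mathbb{Z}$) such that $|y - n| \leq \frac14$, and a half-integer m ≡ n (mod 1) such that $|x - m| \leq \frac12$. Then,

$$|N((x - m) + (y - n)\sqrt{-7})| \leq \left(\frac12\right)^2 + 7\left(\frac14\right)^2 = \frac{11}{16} < 1.$$

Thus, the remainder $\tau = \beta((x - m) + (y - n)\sqrt{-7})$ works since it has norm less than |N(β)| by the previous computation and $\alpha = \beta(m + n\sqrt{-7}) + \tau$.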
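
The rounding procedure above translates directly into an algorithm. Here is a minimal sketch with exact rational arithmetic; the encoding of $x + y\sqrt{-7}$ as a pair (x, y) and the names `norm` and `divmod_quadratic` are assumptions of ours, made for illustration only.

```python
from fractions import Fraction
from itertools import product

D = -7  # we work in the ring of integers of Q(sqrt(D))

def norm(x, y):
    return x * x - D * y * y

def divmod_quadratic(alpha, beta):
    """Return (q, r) with alpha = beta*q + r and N(r) < N(beta).

    Elements are pairs (x, y) ~ x + y*sqrt(D) with x, y in (1/2)Z, x ≡ y (mod 1).
    """
    (a, b), (c, d) = alpha, beta
    n_beta = norm(c, d)
    # alpha / beta = alpha * conj(beta) / N(beta)
    x = Fraction(a * c - D * b * d) / n_beta
    y = Fraction(b * c - a * d) / n_beta
    n = Fraction(round(2 * y), 2)  # nearest half-integer: |y - n| <= 1/4
    m = n + round(x - n)           # m ≡ n (mod 1) and |x - m| <= 1/2
    r = (a - (c * m + D * d * n), b - (c * n + d * m))
    return (m, n), r

for a, b, c, d in product(range(-3, 4), repeat=4):
    if (c, d) != (0, 0):
        q, r = divmod_quadratic((a, b), (c, d))
        assert norm(*r) < norm(c, d)
```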


Exercise 2.6.6†. Solve the equation $x^2 + 11 = y^3$ over Z.



Solution

First, we prove that $\mathbb{Q}(\sqrt{-11})$ is norm-Euclidean. Let $\alpha, \beta \in \mathcal{O}_{\mathbb{Q}(\sqrt{-11})} = \mathbb{Z}\left[\frac{1 + \sqrt{-11}}{2}\right]$ be quadratic integers with β ≠ 0. Write $\frac{\alpha}{\beta} = x + y\sqrt{-11}$ with x, y ∈ Q. Pick a half-integer n such that $|y - n| \leq \frac14$, and a half-integer m ≡ n (mod 1) such that $|x - m| \leq \frac12$. Then,

$$|N((x - m) + (y - n)\sqrt{-11})| \leq \left(\frac12\right)^2 + 11\left(\frac14\right)^2 = \frac{15}{16} < 1.$$

Thus, the remainder $\tau = \beta((x - m) + (y - n)\sqrt{-11})$ works since it has norm less than |N(β)| by the previous computation and $\alpha = \beta(m + n\sqrt{-11}) + \tau$.

We conclude that $\mathbb{Q}(\sqrt{-11})$ is Euclidean and thus a UFD. It is easy to check that 2 stays prime in $\mathbb{Q}(\sqrt{-11})$. Rewrite the equation $x^2 + 11 = y^3$ as $(x + \sqrt{-11})(x - \sqrt{-11}) = y^3$. Note that, if y is even, then

$$2 \mid \frac{x + \sqrt{-11}}{2} \cdot \frac{x - \sqrt{-11}}{2}$$

which is impossible since 2 is prime and divides neither of these factors. Thus, y is odd, which means that x is even. Let δ be the gcd of $x + \sqrt{-11}$ and $x - \sqrt{-11}$. Since $\delta \mid 2\sqrt{-11}$ and 2x, we get δ = 1 as $2 \nmid x + \sqrt{-11}$ and $\sqrt{-11} \nmid x + \sqrt{-11}$ (otherwise 11 | x, y which gives a contradiction modulo $11^2$) and $\sqrt{-11}$ is prime (it has prime norm).

We conclude that, since $\mathbb{Q}(\sqrt{-11})$ is a UFD, we have

$$x + \sqrt{-11} = \varepsilon\left(\frac{a + b\sqrt{-11}}{2}\right)^3$$

for some rational integers a ≡ b (mod 2) and ε a unit. Since the units of $\mathbb{Q}(\sqrt{-11})$ are just ±1
(see Section 7.1), ε is also a cube so we may assume ε = 1.

To conclude, by looking at the real and imaginary parts, we get $8x = a(a^2 - 33b^2)$ and $8 = b(3a^2 - 11b^2)$. Hence, b | 8.
• If b = ±1, we get $3a^2 - 11 = \pm 8$, i.e. b = −1 and a = ±1. This yields (x, y) = (±4, 3).

• If b = ±2, we have $3a^2 - 44 = \pm 4$, i.e. b = 2 and a = ±4. This yields (x, y) = (±58, 15).
• If 4 | b, then, since a ≡ b (mod 2), we get $16 \mid b(3a^2 - 11b^2) = 8$ which is impossible.
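
As a sanity check of the final answer, a naive search over a finite range of y (the bound 2000 is an arbitrary choice of ours) recovers exactly the two solutions with x > 0 found above.

```python
from math import isqrt

solutions = []
for y in range(1, 2001):
    t = y ** 3 - 11
    if t >= 0:
        x = isqrt(t)
        if x * x == t:
            solutions.append((x, y))

assert solutions == [(4, 3), (58, 15)]
```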


Exercise 2.6.8† . Let n be a non-negative rational integer. In how many ways can n be written as a
sum of two squares of rational integers? (Two ways are considered different if the ordering is different,
for instance 2 = 12 + (−1)2 and 2 = (−1)2 + 12 are different.)

Solution

Let n ≥ 1 be an integer. We wish to count in how many ways it is the sum of two squares. First we do the case where $n = \prod_{i=1}^k p_i$ is squarefree and all its prime factors are 1 modulo 4. Then, $n = \prod_{i=1}^k \pi_i\overline{\pi_i}$ is a product of k Gaussian primes and their conjugates. Expressing it as a sum of two squares is equivalent to writing it as a product of two conjugate Gaussian integers $n = \alpha\overline{\alpha}$, which is equivalent to picking one out of $\pi_i$ and $\overline{\pi_i}$ for each i and putting it in α. Thus, n is a sum of two squares in $2^k$ different ways, which turns out to be its number of divisors.
Now, suppose $n = \prod_{i=1}^k p_i^{n_i}$ where the $p_i$ are distinct and 1 modulo 4 again. The same approach as before works, except we need to be more careful with where we put repeated prime factors. For instance, for $n = p^2$, $\alpha = \pi\overline{\pi}$ and $\alpha = \overline{\pi}\pi$ are in fact the same. Here is how we do it: we distribute some amount of $\pi_i$ to α, say $\pi_i^j$, and fill the rest with $\overline{\pi_i}^{\,n_i - j}$. There are $n_i + 1$ ways to do this, so in total n is a sum of two squares in

$$(n_1 + 1) \cdot \ldots \cdot (n_k + 1)$$

ways, which turns out to be again the number of divisors of n.


Finally, we treat the general case, i.e. $n = 2^r \prod_{i=1}^k p_i^{n_i} \prod_{i=1}^{\ell} q_i^{2m_i}$, where $p_i$ are distinct primes congruent to 1 modulo 4 and $q_i$ distinct primes congruent to −1 modulo 4 (any sum of two squares has this form by Exercise 2.3.3). Notice that the dyadic valuation of n doesn't change the number of ways to represent it as a sum of two squares, since $2 = -i(1 + i)^2$ so we need to distribute the same prime to α and $\overline{\alpha}$. Thus we may assume r = 0. Then, notice that since all $q_i$ are −1 modulo 4, if $n = a^2 + b^2$ then

$$\frac{n}{\prod_{i=1}^{\ell} q_i^{2m_i}} = \left(\frac{a}{\prod_{i=1}^{\ell} q_i^{m_i}}\right)^2 + \left(\frac{b}{\prod_{i=1}^{\ell} q_i^{m_i}}\right)^2$$

so that the number of ways to represent n as a sum of two squares is the same as the number of ways to represent $\prod_{i=1}^k p_i^{n_i}$ as a sum of two squares, i.e. $(n_1 + 1) \cdot \ldots \cdot (n_k + 1)$. Finally, we may in fact summarise all of the above discussion as follows: the number of ways to represent n as a sum of two squares is

$$\sum_{d \mid n} \chi_4(d) = d_1(n) - d_{-1}(n),$$

where $\chi_4(d)$ is 1 if d ≡ 1 (mod 4), −1 if d ≡ −1 (mod 4) and 0 otherwise, $d_1$ is the number of (positive) divisors congruent to 1 modulo 4, and $d_{-1}$ the number of (positive) divisors congruent to −1 modulo 4.
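
The count above identifies representations which differ by a unit of Z[i]. If one counts all pairs (a, b) ∈ Z² with $a^2 + b^2 = n$, as in the statement of the exercise, each class contributes 4 pairs, which gives the classical formula $4(d_1(n) - d_{-1}(n))$. The sketch below compares this with a brute-force count on a small range; the factor 4 and the function names are ours, not part of the solution above.

```python
from math import isqrt

def r2(n):
    """Number of pairs (a, b) of integers, signs and order included, with a^2 + b^2 == n."""
    count = 0
    for a in range(-isqrt(n), isqrt(n) + 1):
        t = n - a * a
        b = isqrt(t)
        if b * b == t:
            count += 1 if b == 0 else 2  # b and -b
    return count

def chi4_sum(n):
    """Sum of chi_4(d) over the positive divisors d of n, i.e. d_1(n) - d_{-1}(n)."""
    return sum((1 if d % 4 == 1 else -1 if d % 4 == 3 else 0)
               for d in range(1, n + 1) if n % d == 0)

assert all(r2(n) == 4 * chi4_sum(n) for n in range(1, 500))
```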


Remark 2.6.1
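In fact, the fact that the number of representations as a sum of two squares of n is $\sum_{d \mid n} \chi_4(d)$ is not at all a coincidence. Let $r_2(n)$ denote the number of ways of representing n as a sum of two squares (of positive rational integers), and s a formal object we will specify later. Then,

$$\begin{aligned}
\sum_{n=1}^{\infty} \frac{r_2(n)}{n^s} &= \sum_{a, b \geq 0,\ (a,b) \neq (0,0)} \frac{1}{(a^2 + b^2)^s} \\
&= \sum_{x \neq 0,\ \Re(x), \Im(x) \geq 0} \frac{1}{N(x)^s} \\
&= \left(\sum_{k=0}^{\infty} \frac{1}{N(1+i)^{ks}}\right) \prod_{p \equiv -1 \pmod 4} \left(\sum_{k=0}^{\infty} \frac{1}{N(p)^{ks}}\right) \prod_{p \equiv 1 \pmod 4} \left(\sum_{k=0}^{\infty} \frac{1}{N(\pi)^{ks}}\right)\left(\sum_{k=0}^{\infty} \frac{1}{N(\overline{\pi})^{ks}}\right) \\
&= \left(\sum_{k=0}^{\infty} \frac{1}{2^{ks}}\right) \prod_{p \equiv 1 \pmod 4} \left(\sum_{k=0}^{\infty} \frac{1}{p^{ks}}\right)^2 \prod_{p \equiv -1 \pmod 4} \left(\sum_{k=0}^{\infty} \frac{1}{p^{2ks}}\right) \\
&= \frac{1}{1 - 2^{-s}} \prod_{p \equiv 1 \pmod 4} \frac{1}{(1 - p^{-s})^2} \prod_{p \equiv -1 \pmod 4} \frac{1}{1 - p^{-2s}}
\end{aligned}$$

by uniqueness of the prime factorisation in Z[i], where $p = \pi\overline{\pi}$ for p ≡ 1 (mod 4) (we take the sum of $\frac{1}{N(x)^s}$ over all Gaussian integers except we only choose one associate for each integer, and the product is also taken over all Gaussian primes but with only one associate for each prime). This function is called the Dedekind zeta function $\zeta_{\mathbb{Q}(i)}$ of Q(i). Now consider the Dirichlet L-function of $\chi = \chi_4$

$$L(s, \chi) := \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}.$$

We claim that $\zeta_{\mathbb{Q}(i)}(s) = \zeta(s)L(s, \chi)$, where ζ is the usual Riemann zeta function $\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}$. This is easy: by similar manipulations as before, we have $\zeta(s) = \prod_p \frac{1}{1 - p^{-s}}$ and

$$L(s, \chi) = \prod_p \frac{1}{1 - \chi(p)p^{-s}} = \prod_{p \equiv 1 \pmod 4} \frac{1}{1 - p^{-s}} \prod_{p \equiv -1 \pmod 4} \frac{1}{1 + p^{-s}}$$

and thus $\zeta(s)L(s, \chi) = \zeta_{\mathbb{Q}(i)}(s)$ (since $1 - p^{-2s} = (1 - p^{-s})(1 + p^{-s})$). Finally, by regular expansion, we have

$$\zeta(s)L(s, \chi) = \sum_{n \geq 1} \frac{\sum_{d \mid n} \chi(d)}{n^s}.$$

Since $\zeta_{\mathbb{Q}(i)}(s) = \sum_{n \geq 1} \frac{r_2(n)}{n^s}$, we get the wanted expression for $r_2(n)$ (we do not need complex analysis as the above manipulations were purely formal). This solution might seem a lot more complicated than the previous, but it is in fact a lot deeper as this product formula can be generalised to any Galois extension of number fields.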

Exercise 2.6.11† (Euler). Let n ≥ 3 be an integer. Prove that there exist unique positive odd
rational integers x and y such that $2^n = x^2 + 7y^2$.

Solution

Rewrite this as

$$\frac{x + y\sqrt{-7}}{2} \cdot \frac{x - y\sqrt{-7}}{2} = 2^{n-2}.$$

Thus, solving $x^2 + 7y^2 = 2^n$ in odd integers amounts to writing $2^n$ as a product of conjugate odd factors in $\mathbb{Q}(\sqrt{-7})$. We have seen in Exercise 2.6.4† that $\mathbb{Q}(\sqrt{-7})$ is Euclidean, thus we seek the prime factorisation of 2. Since $\frac{1 \pm \sqrt{-7}}{2}$ have norm 2 which is a rational prime, the prime factorisation of 2 reads

$$2 = \frac{1 + \sqrt{-7}}{2} \cdot \frac{1 - \sqrt{-7}}{2} = \alpha\beta.$$

Now, write $\frac{x + y\sqrt{-7}}{2} = \alpha^i\beta^j$ with i + j = n − 2. Since it is not divisible by 2 = αβ, we have i = 0 or j = 0, and which one it is depends on the sign of its $\sqrt{-7}$ part. This shows that there is exactly one pair which works. Indeed, it is also clear that they work since they will have the same parity and thus be both odd as $\alpha^2 = \frac{-3 + \sqrt{-7}}{2} \equiv \alpha \pmod 2$.
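
For small n, existence and uniqueness are easy to confirm by exhaustive search. The sketch below lists all pairs of positive odd integers (x, y) with $x^2 + 7y^2 = 2^n$; the function name and the range 3 ≤ n < 30 are arbitrary choices of ours.

```python
from math import isqrt

def odd_representations(n):
    """All pairs of positive odd integers (x, y) with x^2 + 7*y^2 == 2**n."""
    target = 2 ** n
    reps = []
    y = 1
    while 7 * y * y < target:
        t = target - 7 * y * y
        x = isqrt(t)
        if x * x == t and x % 2 == 1:
            reps.append((x, y))
        y += 2  # only odd y
    return reps

for n in range(3, 30):
    assert len(odd_representations(n)) == 1
```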


Exercise 2.6.12† (Fermat's Last Theorem for n = 4). Show that the equations $\alpha^4 + \beta^4 = \gamma^2$ and
$\alpha^4 - \beta^4 = \gamma^2$ have no non-zero solution α, β, γ ∈ Z[i].

Solution

Surprisingly, our solution is very similar to the proof of Theorem 2.4.1. Set λ = 1 − i. Any fourth power $\theta^4$ which isn't divisible by λ is congruent to 1 modulo $\lambda^6$: θ is congruent to 1 or i modulo 2, so $\theta^2$ is congruent to ±1 modulo 4, which implies that $\theta^4 \equiv 1 \pmod 8$ as wanted (8 is an associate of $\lambda^6$). Another way to see this is to notice that two of the factors of $\theta^4 - 1 = (\theta - 1)(\theta + 1)(\theta - i)(\theta + i)$ are divisible by $\lambda^2$ and the other two by λ.

Now suppose for the sake of contradiction that α, β, γ ∈ Z[i] are pairwise coprime and non-zero such that $\alpha^4 + \beta^4 = \gamma^2$. If λ divides neither α nor β, then the equation becomes $1 + 1 \equiv \gamma^2$ modulo $\lambda^6$ by our first remark. In particular, $v_\lambda(\gamma) = 1$; set γ = λδ. Then, $\lambda^2\delta^2 \equiv 2 = i\lambda^2 \pmod{\lambda^6}$ so $\delta^2 \equiv i \pmod{\lambda^4}$ which contradicts our previous observation: $\delta^4 \equiv -1 \not\equiv 1 \pmod{\lambda^6}$.

Hence, λ must divide αβ, say it divides α. We will now prove, as we did in Theorem 2.4.1, that the more general equation

$$\varepsilon\lambda^{4n}\alpha^4 + \beta^4 = \gamma^2$$

does not have solutions λ ∤ α, β, γ ∈ Z[i] and ε ∈ Z[i]× a unit for n ≥ 1. Hence, suppose that α, β, γ, ε is a solution with minimal n. We first prove that n must be at least 2. If there were a solution with n = 1, modulo $\lambda^4$ we get $\beta^4 \equiv 1 \equiv \gamma^2$. We will prove that $\gamma^2$ is in fact congruent to 1 modulo $\lambda^5$, whence $\lambda^5 \mid \lambda^{4n}$ which implies n ≥ 2. To prove this notice that γ cannot be congruent to i modulo $\lambda^2$ so must be congruent to 1. Set $\gamma = \lambda^2\rho + 1$ and notice that

$$\begin{aligned}
\gamma^2 - 1 &= (\gamma - 1)(\gamma + 1) \\
&= \lambda^2\rho(\lambda^2\rho + 2) \\
&= \lambda^4\rho(\rho + i)
\end{aligned}$$

is divisible by $\lambda^5$ since one of ρ and ρ + i must be divisible by λ.

Hence, n ≥ 2. We have

$$\varepsilon\lambda^{4n}\alpha^4 = (\gamma - \beta^2)(\gamma + \beta^2).$$

We can see that both factors are congruent modulo 2 and that their gcd divides 2, which means that

$$\begin{cases}
\gamma \pm \beta^2 = u\lambda^2x^4 \\
\gamma \mp \beta^2 = v\lambda^{4n-2}y^4
\end{cases}$$

for some units u, v and Gaussian integers λ ∤ x, y. Subtracting the two lines yields (after absorbing the signs into the units) the equation

$$2\beta^2 = u\lambda^2x^4 + v\lambda^{4n-2}y^4,$$

i.e.

$$\mu x^4 + \eta\lambda^{4(n-1)}y^4 = \beta^2$$

where µ = −iu and η = −iv are units. It only remains to prove that µ = ±1 to conclude that we have found a solution to the equation corresponding to n − 1 ≥ 1, thus contradicting the minimality of n. Indeed, if µ = −1 we get the equation

$$x^4 - \eta\lambda^{4(n-1)}y^4 = (\beta i)^2.$$

To prove that µ = ±1, consider our equation modulo $\lambda^4$ to get µ ≡ ±1 as wanted.

We now consider the equation $\alpha^4 - \beta^4 = \gamma^2$. Suppose that (α, β, γ) is a non-zero solution of coprime Gaussian integers. Without loss of generality, suppose as well that λ ∤ α: if λ | α then λ ∤ β and (β, α, iγ) is a solution. Note that we are already done when λ | β because we solved the more general equation

$$\alpha^4 + \varepsilon\lambda^{4n}\beta^4 = \gamma^2$$

where ε is a unit and n ≥ 1. Hence, it remains to settle the case where λ ∤ α, β. In that case, rewrite the equation as $\beta^4 = (\alpha^2 - \gamma)(\alpha^2 + \gamma)$. The two factors are coprime since λ ∤ β so

$$\alpha^2 \pm \gamma = \varepsilon_\pm\delta_\pm^4$$

for some units $\varepsilon_\pm \in \mathbb{Z}[i]$ and Gaussian integers $\lambda \nmid \delta_\pm$. This then yields

$$\varepsilon_-\delta_-^4 + \varepsilon_+\delta_+^4 = 2\alpha^2.$$

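Modulo $\lambda^2$, we get $\varepsilon_- + \varepsilon_+ \equiv 0$. It is easy to see that this implies $\varepsilon_- + \varepsilon_+ = 0$. But then, since we know fourth powers are congruent to 1 modulo $\lambda^6$, we get

$$\lambda^6 \mid \varepsilon_-\delta_-^4 + \varepsilon_+\delta_+^4 = 2\alpha^2$$

which implies λ | α and is a contradiction.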




Exercise 2.6.14†. Prove that $\mathcal{O}_{\mathbb{Q}(\sqrt{5})}$ is Euclidean.

Solution

Let $\alpha, \beta \in \mathcal{O}_{\mathbb{Q}(\sqrt{5})} = \mathbb{Z}\left[\frac{1 + \sqrt{5}}{2}\right]$ be quadratic integers with β ≠ 0. Write $\frac{\alpha}{\beta} = x + y\sqrt{5}$ with x, y ∈ Q. Pick a half-integer n such that $|y - n| \leq \frac14$, and a half-integer m ≡ n (mod 1) such that $|x - m| \leq \frac12$. Then,

$$|N((x - m) + (y - n)\sqrt{5})| \leq \left(\frac12\right)^2 + 5\left(\frac14\right)^2 = \frac{9}{16} < 1.$$

Thus, the remainder $\tau = \beta((x - m) + (y - n)\sqrt{5})$ works since it has norm less than |N(β)| by the previous computation and $\alpha = \beta(m + n\sqrt{5}) + \tau$.


Hurwitz Integers and Jacobi’s Four Square Theorem


Exercise 2.6.15†. Let α ∈ H be a primitive Hurwitz integer, meaning that there does not exist a rational integer m with |m| ≥ 2 such that $\frac{\alpha}{m} \in H$, and let $N(\alpha) = p_1 \cdot \ldots \cdot p_n$ be its prime factorisation. Then, the factorisation $\alpha = \pi_1 \cdot \ldots \cdot \pi_n$ for irreducible elements $\pi_i$ of norm $p_i$ is unique up to unit-migration, meaning that if $\tau_1 \cdot \ldots \cdot \tau_k$ is another such factorisation, then k = n and

$$\begin{cases}
\tau_1 = \pi_1u_1 \\
\tau_2 = u_1^{-1}\pi_2u_2 \\
\quad\vdots \\
\tau_{n-1} = u_{n-2}^{-1}\pi_{n-1}u_{n-1} \\
\tau_n = u_{n-1}^{-1}\pi_n
\end{cases}$$

for some units $u_1, \ldots, u_{n-1}$. Deduce that α is irreducible if and only if its norm is a rational prime.

Solution

Let α be a primitive Hurwitz integer. We shall construct its factorisation into irreducibles step by step, proving at the same time its uniqueness. We proceed inductively on n. Consider the right-gcd $\pi_1$ of $p_1$ and α, i.e. $\pi_1$ such that $p_1H + \alpha H = \pi_1H$. This is unique up to multiplication on the right by a unit. Note that this must necessarily be the $\pi_1$ we're looking for: if $\alpha = \pi_1' \cdot \ldots \cdot \pi_n'$ and $N(\pi_1') = p_1$, then $\pi_1'$ divides both α and $p_1$, and, by looking at the norm, must be the right-gcd of $p_1$ and α. Indeed, if the right-gcd were $\delta = \pi_1'\rho$ with ρ not a unit, it would have norm $p_1^2$ since $N(\rho) \mid N(p_1/\pi_1') = p_1$. But then, $\delta \sim p_1$ so the right-gcd is $p_1$, which is impossible since $p_1 \nmid \alpha$.

Now, let's prove that this $\pi_1$ indeed has norm $p_1$. By construction, $\pi_1$ divides $p_1$ and α on the left, so there exist ρ and β such that $p_1 = \pi_1\rho$ and $\alpha = \pi_1\beta$. In particular, $N(\pi_1) \mid N(p_1) = p_1^2$. We have already proved that there was a problem if $N(\pi_1) = p_1^2$: in that case $p_1 \sim \pi_1$ divides α. It remains to settle the case where $N(\pi_1) = 1$. This would mean that $p_1H + \alpha H = H$, which is impossible since any element in the LHS has norm divisible by $p_1$:

$$\begin{aligned}
N(p_1u + \alpha v) &= (p_1u + \alpha v)(p_1\overline{u} + \overline{v}\,\overline{\alpha}) \\
&\equiv \alpha v\overline{v}\,\overline{\alpha} \pmod{p_1} \\
&= N(\alpha)N(v) \\
&\equiv 0.
\end{aligned}$$

Finally, suppose now that α is irreducible. It is then clearly primitive, since we have shown that rational integers were reducible. Factorise its norm as a product of rational primes $p_1 \cdot \ldots \cdot p_k$ and factorise α as $\pi_1 \cdot \ldots \cdot \pi_k$ for irreducible elements $\pi_i$ of norm $p_i$. Since α is irreducible, we must have k = 1, i.e. its norm $N(\alpha) = p_1$ is prime.


Exercise 2.6.16†. Prove that (1 + i)H = H(1 + i).¹ Set $\omega = \frac{1 + i + j + k}{2}$. We say a Hurwitz integer
α ∈ H is primary if it is congruent to 1 or 1 + 2ω modulo 2 + 2i.² Prove that, for any Hurwitz integer
α of odd norm, exactly one of its right-associates is primary.

Solution

As said in the footnote, 1 + i and its conjugate 1 − i = −i(1 + i) are associates. As a consequence,
1 + i right-divides α if and only if it left-divides α. In particular, the left and right multiples of
1 + i are the same.

For the second part, note that any Hurwitz integer α can be written in the form aω + bi + cj + dk.
Modulo 2, α is congruent to 1, i, j, k, or ω. In particular, there is a unit ε such that εα ≡ 1
(mod 2), and this unit is unique up to sign. Then, any Hurwitz integer congruent to 1 modulo
2 is congruent to ±1 or ±(1 + 2ω) modulo 2 + 2i, so this determines the sign of ε.


Exercise 2.6.17† . Let m ∈ Z be an odd integer. Prove that the Hurwitz integers modulo m, H/mH,
are isomorphic to the algebra of two by two matrices modulo m, (Z/mZ)2×2 . In addition, prove that
the determinant of the image is the norm of the quaternion.

Solution

We shall copy Remark 2.5.2. Our goal is to find matrices with coefficients in Z/mZ which square to $-I_2$. We are now going to do something very abusive: we shall define 1 for $I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $\mathbf{i}$ for $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, $\mathbf{j}$ for $\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}$ and $\mathbf{k}$ for $\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$. Then, we will consider the matrix with integral coefficients $a + b\mathbf{i} + ci\mathbf{j} + di\mathbf{k}$ for a, b, c, d ∈ Z, where i is the complex number. When we square this, we get
$$a^2 - b^2 + c^2 + d^2 + 2ab\mathbf{i} + 2iac\mathbf{j} + 2iad\mathbf{k},$$
as seen in Exercise 2.5.4. Since we want something linearly independent with 1 and $\mathbf{i}$, we shall
assume that a = b = 0. Then, if c = u and d = v are such that $u^2 + v^2 \equiv -1 \pmod m$, the

1 This means that we can manipulate congruences modulo 1 + i normally. Note that the choice of i is not arbitrary

at all, since 1 − i = −i(1 + i) and 1 − j = (1 − ω)(1 + i) are associates. By α ≡ β (mod γ), we mean that γ divides α − β
from the left and from the right.
2 Note that a primary Hurwitz integer is always in Z[i, j, k].

matrix
\[ j' := uij + vik = \begin{pmatrix} -v & -u \\ -u & v \end{pmatrix} \]
squares to −1 modulo m. We still need to find another matrix k′ which squares to −1, and
satisfies ij′k′ = −1, i.e.
\[ k' = -ij' = -i(uij + vik) = -uik + vij. \]
Clearly, this also squares to −1. It remains to prove the existence of such u, v. When m = p is
prime, this is Exercise 2.5.18∗ . In fact, our solution to this exercise also works when m = p^k is
a prime power: there are (p^k + 1)/2 squares since
\[ x^2 \equiv y^2 \iff (x + y)(x - y) \equiv 0 \iff x \equiv \pm y \]
since the two factors are coprime so p^k must divide one of them. Thus, there are (p^k + 1)/2 elements
of the form v² + 1 and (p^k + 1)/2 of the form −u², so two must be equal as wanted. When m is
composite, the existence of such u, v follows from the Chinese remainder theorem.

To conclude, we have proven that there exist u, v such that u² + v² ≡ −1 (mod m) and used them
to construct an isomorphism from H/mH to (Z/mZ)^{2×2}. Explicitly (for the reader who wasn't
convinced by our perfectly valid manipulations with very abusive notation), this isomorphism is
given by
\[ \varphi : a + bi + cj + dk \mapsto \begin{pmatrix} a + di & b + ci \\ -b + ci & a - di \end{pmatrix} = a\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + b\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} + c\begin{pmatrix} -v & -u \\ -u & v \end{pmatrix} + d\begin{pmatrix} u & -v \\ -v & -u \end{pmatrix}. \]

Actually, so far we have only shown that it is a morphism, but we will prove at the end that it
is injective and thus an isomorphism since |H/mH| = |(Z/mZ)^{2×2}|.

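Note that ϕ(ᾱ) is the adjugate of ϕ(α), so their product is
\begin{align*}
N(\alpha) I_2 &= \varphi(\alpha\bar\alpha) \\
&= \varphi(\alpha)\operatorname{adj}\varphi(\alpha) \\
&= \det(\varphi(\alpha)) I_2
\end{align*}
by Proposition C.3.7. Indeed, as we saw in the beginning of Section C.3, the adjugate of
$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is $\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$. Since this is additive, it suffices to check that
\begin{align*}
\varphi(\bar 1) = \varphi(1) &= \operatorname{adj}\varphi(1) \\
\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \varphi(-i) &= \operatorname{adj}\varphi(i) \\
\begin{pmatrix} v & u \\ u & -v \end{pmatrix} = \varphi(-j) &= \operatorname{adj}\varphi(j) \\
\begin{pmatrix} -u & v \\ v & u \end{pmatrix} = \varphi(-k) &= \operatorname{adj}\varphi(k)
\end{align*}
which is clearly true.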

Finally, ϕ is injective since its kernel is trivial (see Exercise A.2.16∗ ). Indeed, ϕ(α) = 0 implies
ϕ(ᾱ) = adj ϕ(α) = 0. As a consequence, if α = a + bi + cj + dk,

2ϕ(a) = ϕ(α + ᾱ) = 0

so a = 0 since m is odd. The same reasoning used on αi, αj and αk shows that a = b = c = d = 0.


Remark 2.6.2
The step where we assume a = b = 0 is completely legitimate: since a2 − b2 + c2 + d2 + 2abi +
2iacj + 2iadk must be −1, we need ab = ac = ad = 0, i.e. a = 0 or b = c = d = 0 but the latter
clearly doesn’t work when there’s no square root of −1 in Z/mZ. Thus, a = 0. Similarly, we want

k′ := −ij′ = −i(wi + uij + vik) = w − uik + vij

to square to −1, which implies w = 0 for the same reason.

Exercise 2.6.18† . Let m be an odd integer. We say a Hurwitz integer α = a + bi + cj + dk is primitive
modulo m if gcd(2a, 2b, 2c, 2d, m) = 1. Compute the number ψ(m) of primitive Hurwitz integers modulo
m with norm zero (modulo m).

Solution

By Exercise 2.6.17† , we need to count the number of two by two primitive modulo m matrices
$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with zero determinant, i.e. the number of a, b, c, d such that ad − bc ≡ 0 (mod m)
and gcd(a, b, c, d, m) = 1. It is immediate from the Chinese remainder theorem that this is
multiplicative, i.e. ψ(mn) = ψ(m)ψ(n) when m and n are coprime. Hence, it remains to
compute ψ(p^k) for an odd prime p. We shall first prove that ψ(p^{k+1}) = p³ψ(p^k), thus reducing
this computation to the (easy) computation of ψ(p). More precisely, we show that any primitive
modulo p^k quadruple (a, b, c, d) such that
\[ ad - bc \equiv 0 \pmod{p^k} \]
can be lifted to exactly p³ primitive modulo p^{k+1} quadruples (a′, b′, c′, d′) ≡ (a, b, c, d) (mod p^k)
such that a′d′ − b′c′ ≡ 0 (mod p^{k+1}). Hence, suppose (a, b, c, d) is such a quadruple and suppose
without loss of generality that a is non-zero modulo p. Consider a quadruple (a′, b′, c′, d′) ≡
(a, b, c, d) (mod p^k). The congruence a′d′ ≡ b′c′ (mod p^{k+1}) is equivalent to
\[ d' \equiv b'c'(a')^{-1} \pmod{p^{k+1}}. \]
Thus, for each choice of a′, b′, c′, there is exactly one d′ satisfying this congruence. Since there are p³
triplets (a′, b′, c′) modulo p^{k+1} which are congruent to (a, b, c) modulo p^k, this proves the result.
It remains to compute ψ(p). Choose a t ∈ Z/pZ and consider the equation ad ≡ t ≡ bc (mod p).
If t ≢ 0, there are (p − 1)² solutions: pick any non-zero a, b, and set d ≡ ta⁻¹ and c ≡ tb⁻¹. If
t ≡ 0, there are 2(p − 1) + 1 = 2p − 1 solutions to ad ≡ 0: if a ≡ 0, there are p − 1 non-zero
possibilities for d and inversely, and then we count the solution a ≡ d ≡ 0. Hence, there are
(2p − 1)² solutions to ad ≡ 0 ≡ bc, but since we are interested in primitive quadruples, we must
remove the solution (0, 0, 0, 0). In total, we have
\[ \psi(p) = (p - 1) \cdot (p - 1)^2 + (2p - 1)^2 - 1 = (p^2 - 1)(p + 1). \]
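To conclude, if m = p_1^{m_1} · … · p_k^{m_k}, we have
\begin{align*}
\psi(m) &= \prod_{i=1}^{k} \psi(p_i^{m_i}) \\
&= \prod_{i=1}^{k} p_i^{3(m_i - 1)} \psi(p_i) \\
&= \prod_{i=1}^{k} p_i^{3(m_i - 1)} (p_i^2 - 1)(p_i + 1) \\
&= m^3 \prod_{p \mid m} \left(1 - \frac{1}{p^2}\right)\left(1 + \frac{1}{p}\right).
\end{align*}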
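As a sanity check of this closed form, here is a minimal Python sketch that counts the primitive singular quadruples by brute force and compares the count with m³ ∏ (1 − 1/p²)(1 + 1/p). The function names `psi_bruteforce` and `psi_formula` are ours and purely illustrative, and the check only covers the few small odd m that are tried.

```python
from math import gcd


def psi_bruteforce(m):
    """Count (a, b, c, d) mod m with ad - bc = 0 (mod m) and gcd(a, b, c, d, m) = 1."""
    count = 0
    for a in range(m):
        for b in range(m):
            g_ab = gcd(gcd(a, b), m)
            for c in range(m):
                g_abc = gcd(g_ab, c)
                for d in range(m):
                    if (a * d - b * c) % m == 0 and gcd(g_abc, d) == 1:
                        count += 1
    return count


def psi_formula(m):
    """Evaluate m^3 * prod_{p | m} (1 - 1/p^2)(1 + 1/p) with exact integer arithmetic."""
    result = m ** 3
    n, p = m, 2
    while n > 1:
        if n % p == 0:
            while n % p == 0:
                n //= p
            # (1 - 1/p^2)(1 + 1/p) = (p^2 - 1)(p + 1) / p^3, and p^3 divides m^3
            result = result * (p * p - 1) * (p + 1) // p ** 3
        p += 1
    return result


for m in (3, 5, 9, 15):
    print(m, psi_bruteforce(m), psi_formula(m))
```

The brute force runs over all m⁴ quadruples, so it is only meant for very small moduli.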



Exercise 2.6.19† . Let p be an odd prime. Prove that any non-zero α ∈ H/pH of zero norm modulo p
has a representative of the form ρπ, where π is a primary element of norm p and ρ ∈ H, and that this
π is unique. Conversely, let π ∈ H have norm p. Prove that the equation ρπ ≡ 0 (mod p) has exactly
p2 solutions ρ ∈ H/pH. Deduce that there are exactly p + 1 primary irreducible Hurwitz integers with
norm p.

Solution

Lift α to a Hurwitz integer β of norm divisible by p. Consider its primitive part γ. Since β isn’t
divisible by p, the norm of γ is still divisible by p. Hence, by Exercise 2.6.15† , γ = ρπ for some
ρ ∈ H and some π of norm p. In addition, as we saw in the solution to this exercise, this π is
unique up to multiplication by a right-unit since it’s the left-gcd of γ and p. However, we still
need to justify the step where we went from β to γ, i.e. that β = δπ for some δ ∈ H and π of
norm p implies γ = ρπ for some ρ ∈ H. Let m be the non-primitive part of β, i.e. β = mγ.
Since p = ππ̄ is invertible modulo m, π is too so m must divide δ by Bézout. We are done:
γ = (δ/m)π.

For the second part, by Exercise 2.6.17† , this amounts to counting the solutions (x, y, z, t) to
\[ \begin{pmatrix} x & y \\ z & t \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} \equiv \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \pmod{p} \]
where $\begin{pmatrix} x & y \\ z & t \end{pmatrix}$ is the matrix corresponding to ρ and $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ the matrix corresponding to π. In
other words, we wish to count the solutions to
\begin{align}
xa + yc &\equiv 0 \tag{2.1} \\
xb + yd &\equiv 0 \tag{2.2} \\
za + tc &\equiv 0 \tag{2.3} \\
zb + td &\equiv 0. \tag{2.4}
\end{align}

Note that the condition that π has norm divisible by p translates to ad − bc ≡ 0. Since π is non-zero
modulo p, at least one of its coordinates is non-zero, say a. Then, (2.1) becomes x ≡ −yca⁻¹
and (2.3) becomes z ≡ −tca⁻¹. Note that the other two equations are then automatically fulfilled
since
\[ a(xb + yd) = b(xa + yc) + (ad - bc)y \]
and the same goes for z and t. Hence, there are p² solutions as claimed: we choose y and t
arbitrarily and x and z are then uniquely determined.
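The count of p² solutions can be checked numerically on the matrix side. The sketch below (our own illustration, with the hypothetical name `left_annihilator_sizes`) runs over every non-zero singular 2×2 matrix modulo a small prime p and records how many matrices (x y; z t) multiply it to zero on the left.

```python
from itertools import product


def left_annihilator_sizes(p):
    """Set of values taken by #{X : X*M = 0 (mod p)} over non-zero singular M."""
    sizes = set()
    for a, b, c, d in product(range(p), repeat=4):
        if (a, b, c, d) == (0, 0, 0, 0) or (a * d - b * c) % p != 0:
            continue  # keep only non-zero matrices with zero determinant
        count = sum(
            1
            for x, y, z, t in product(range(p), repeat=4)
            if (x * a + y * c) % p == 0      # equation (2.1)
            and (x * b + y * d) % p == 0     # equation (2.2)
            and (z * a + t * c) % p == 0     # equation (2.3)
            and (z * b + t * d) % p == 0     # equation (2.4)
        )
        sizes.add(count)
    return sizes


for p in (3, 5):
    print(p, left_annihilator_sizes(p))
```

If the argument above is right, the returned set should contain the single value p² for each prime tried.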

We have shown that each of the ψ(p) = (p² − 1)(p + 1) non-zero classes of H/pH of zero norm
can be written in the form ρπ for some unique primary π of norm p. However, each π has exactly p²
left-multiples modulo p: since ρπ takes each value exactly p² times and there are p⁴ elements ρ ∈ H/pH, π
has p⁴/p² = p² left-multiples. (This can also be seen more efficiently with the language of group
theory: the morphism from H/pH to itself sending ρ to ρπ has a kernel of cardinality p² so its
image has cardinality |H/pH|/p² = p² by the first isomorphism theorem from Exercise A.3.16† .)
Thus, each π occurs for exactly p² − 1 non-zero classes, so there are
\[ \frac{\psi(p)}{p^2 - 1} = p + 1 \]
primary elements of norm p.


Exercise 2.6.20† (Jacobi’s Four Square Theorem). Let n be a positive rational integer. In how many
ways can n be written as a sum of four squares of rational integers? (Two ways are considered different
if the ordering is different, for instance 2 = 1² + 0² + 0² + (−1)² and 2 = (−1)² + 0² + 0² + 1² are
different.)

Solution

Note that counting the number of ways to write n as a sum of four squares is the same as counting
the number of quaternions in Z[i, j, k] with norm n. We start by counting the number of primary
primitive Hurwitz integers of odd norm m, then we will consider the contribution of primary
non-primitive integers of norm m, and finally the contribution of their associates too (and treat
the even case).

Let m be an odd positive integer and let p_1^{m_1} · … · p_n^{m_n} be its prime factorisation. By
Exercise 2.6.15† , each primitive integer of norm m has an expression of the form
\[ \prod_{i=1}^{n} \prod_{j=1}^{m_i} \pi_j^{(i)} \]
where π_j^{(i)} is an element of norm p_i. Since we are interested in primary integers, we may assume
that each π_j^{(i)} is primary as well, by migrating the units. Then, this expression becomes unique,
and any such expression gives rise to a unique integer of norm m again by Exercise 2.6.15† ,
provided that it is primitive. We will prove that it is primitive as long as two consecutive factors
are not conjugates, which is obviously a necessary condition. (Note that the conjugate of a
primary integer is also primary.) Suppose that some rational prime p divides this product. Since
the product has norm m, p = p_k for some k. Since every element of norm coprime with p is
invertible modulo p, p must divide
\[ \pi_1^{(k)} \cdot \ldots \cdot \pi_{m_k}^{(k)} := \pi_1 \cdot \ldots \cdot \pi_\ell. \]

Consider the smallest i such that p divides π_1 · … · π_i. Then, π_1 · … · π_{i−1} is not divisible by p so
is primitive since its norm is a power of p. We will prove that π_i and π_{i−1} are conjugate. Write
pρ = π_1 · … · π_i for some ρ ∈ H. This is equivalent to
\[ \rho\,\overline{\pi_i} = \pi_1 \cdot \ldots \cdot \pi_{i-1}. \]

Since these elements are now primitive, Exercise 2.6.15† tells us that the conjugate of π_i and π_{i−1} are the same
up to association, i.e. the same since they are primary.

Hence, the number f(m) of primary and primitive Hurwitz integers of norm m is equal to the
number of products of the form
\[ \prod_{i=1}^{n} \prod_{j=1}^{m_i} \pi_j^{(i)} \]
where π_j^{(i)} is primary of norm p_i and no two consecutive factors are conjugate. In other words,
we have p_1 + 1 possibilities for π_1^{(1)} and then only p_1 for every other π_j^{(1)} since we need to avoid
the conjugate of π_{j−1}^{(1)}. The same goes for p_2, p_3, …, p_n. Thus,
\[ f(m) = \prod_{i=1}^{n} (p_i + 1) p_i^{m_i - 1} = m \prod_{i=1}^{n} \left(1 + \frac{1}{p_i}\right). \]

Now that we have computed the number of primary primitive integers of norm m, we shall
compute the number of primary integers of norm m. This is simply
\[ g(m) = \sum_{d^2 \mid m} f(m/d^2) \]
because a primary integer α of norm m corresponds to a primary primitive integer of norm m/d², where d is
the non-primitive part of α, i.e. the unique positive integer d | α such that α/d is primitive. By
expanding the following expression, we see that
\[ g(m) = \prod_{i=1}^{n} \sum_{k=0}^{\lfloor m_i/2 \rfloor} f(p_i^{m_i - 2k}) \]
because f is multiplicative, i.e. f(ab) = f(a)f(b) when a and b are coprime, and each d with d² | m can
be written as p_1^{d_1} · … · p_n^{d_n} with 2d_i ≤ m_i for every i. Now, note that the sum $\sum_{k=0}^{\lfloor \ell/2 \rfloor} f(p^{\ell - 2k})$ is
\[ p^{\ell-1}(p + 1) + p^{\ell-3}(p + 1) + \ldots + (p + 1) = p^{\ell} + \ldots + 1 \]
when ℓ is odd, and
\[ p^{\ell-1}(p + 1) + p^{\ell-3}(p + 1) + \ldots + p(p + 1) + 1 = p^{\ell} + \ldots + 1 \]
when ℓ is even. Thus, in all cases,
\[ g(m) = \prod_{i=1}^{n} \sum_{k=0}^{m_i} p_i^k = \sum_{d \mid m} d. \]

Now, only two things remain to be done: take into account the contribution of units, and treat the
case where n is even. Let α be a primary integer and let ε be a unit of H. Then, αε is in Z[i, j, k]
if and only if ε ∈ {±1, ±i, ±j, ±k}. Thus, there are
\[ r_4(n) = 8g(n) = 8 \sum_{d \mid n} d \]
ways to write n as a sum of four squares when n is odd. Now suppose that n is even and write
n = 2^r m with r = v_2(n). We will prove that any element of norm n has the form (1 + i)^r times
an element of norm m. As a consequence, the number of primary quaternions of norm n will be
\[ g(n) = g(m) = \sum_{d \mid m} d. \]
This time however all units will yield elements in Z[i, j, k] since (1 + i)ε ∈ Z[i, j, k] for any unit
ε. Since there are 24 units, we conclude that the number of ways to express n as a sum of four
squares is
\[ r_4(n) = 24g(m) = 24 \sum_{d \mid n,\ d \text{ odd}} d. \]
Hence, it only remains to prove that an element of even norm is divisible by 1 + i. Indeed, since
1 + i has norm 2, iterating this result yields that an element of norm divisible by 2^r is divisible
by (1 + i)^r as wanted. Suppose that α = a + bi + cj + dk has even norm. In particular,
(2a)² + (2b)² + (2c)² + (2d)² is divisible by 8. Since odd squares are 1 modulo 8, this implies
2 | 2a, 2b, 2c, 2d, i.e. α ∈ Z[i, j, k]. Modulo 1 + i, α is simply a + b + c + d, which is clearly divisible
by 1 + i since it is divisible by 2. We are done.

To summarise, there are 8 · Σ_{d|n} d ways to write n as a sum of four squares when n is odd, and
24 · Σ_{d|n, d odd} d when n is even.
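The final formula is easy to test against a direct count. The following Python sketch is only an illustration: the names `r4_bruteforce` and `r4_formula` are ours, and the comparison covers only the small range of n that is tried.

```python
from math import isqrt


def r4_bruteforce(n):
    """Number of ordered (a, b, c, d) in Z^4 with a^2 + b^2 + c^2 + d^2 = n."""
    count = 0
    bound = isqrt(n)
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            for c in range(-bound, bound + 1):
                rest = n - a * a - b * b - c * c
                if rest < 0:
                    continue
                d = isqrt(rest)
                if d * d == rest:
                    # d = 0 gives one solution, d > 0 gives the two solutions +d and -d
                    count += 1 if d == 0 else 2
    return count


def r4_formula(n):
    """8 * (sum of divisors) for odd n, 24 * (sum of odd divisors) for even n."""
    if n % 2 == 1:
        return 8 * sum(d for d in range(1, n + 1) if n % d == 0)
    return 24 * sum(d for d in range(1, n + 1, 2) if n % d == 0)


for n in range(1, 13):
    print(n, r4_bruteforce(n), r4_formula(n))
```

For instance, n = 1 has the 8 representations (±1, 0, 0, 0) and their permutations, in agreement with 8 · 1.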


Domains
Miscellaneous
Exercise 2.6.25† . Let (Fn )n∈Z be the Fibonacci sequence defined by F0 = 0, F1 = 1, and Fn+2 =
Fn+1 + Fn for any integer n. Prove that, for any integers m and n, gcd(Fm , Fn ) = Fgcd(m,n) .

Solution
Note that F_n = (α^n − β^n)/(α − β), where α, β are the roots of X² − X − 1 (see Section C.4). Thus, d | F_m, F_n
if and only if
δ := (α − β)d | αm − β m , αn − β n .
Now, note that δ = αgcd(m,n) − β gcd(m,n) = (α − β)Fgcd(m,n) works since

αk gcd(m,n) ≡ β k gcd(m,n)

for any k ∈ Z. For the converse, let k be the smallest positive integer such that

δ | αk − β k ⇐⇒ δ | (α/β)k − 1

(note that β is a unit since αβ = −1 so we can divide by β like we did). Then, we shall prove that
k | m, n. Write m = qk + r the Euclidean division of m by k. Then,

1 ≡ (α/β)^m = ((α/β)^k)^q · (α/β)^r ≡ (α/β)^r

which contradicts the minimality of k, unless r = 0. Thus, k | m, and by symmetry k | n, which


implies that k | gcd(m, n) and δ | αgcd(m,n) − β gcd(m,n) as wanted.
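The identity itself is easy to test numerically. Here is a minimal Python sketch restricted to non-negative indices (the exercise allows all integers); the helper name `fib` is ours.

```python
from math import gcd


def fib(n):
    """F_n for n >= 0, with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def identity_holds(limit):
    """Check gcd(F_m, F_n) = F_gcd(m, n) for all 0 <= m, n <= limit."""
    return all(
        gcd(fib(m), fib(n)) == fib(gcd(m, n))
        for m in range(limit + 1)
        for n in range(limit + 1)
    )


print(identity_holds(40))
```

Note that Python's `math.gcd(0, n)` returns n, which matches F_0 = 0 and gcd(0, F_n) = F_n.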


Remark 2.6.3
This is identical to the proof that gcd(a^n − b^n, a^m − b^m) = a^{gcd(m,n)} − b^{gcd(m,n)} for a, b ∈ Z using
orders, but in O_{Q(√5)}. See Section 7.2 for more.
Exercise 2.6.27† . Let n be a rational integer. Prove that (1 + √2)^n is a unit of Z[√2]. Moreover,
prove that any unit of Z[√2] has that form, up to sign.

Solution
Note that N((1 + √2)^k) = (−1)^k so (1 + √2)^k is a unit. Now, suppose a + b√2 is the smallest
unit with a, b > 0 which is not a power of 1 + √2. Then,
\[ (2b - a) + (a - b)\sqrt{2} = -(a + b\sqrt{2})(1 - \sqrt{2}) = \frac{a + b\sqrt{2}}{1 + \sqrt{2}} < a + b\sqrt{2}. \]
Thus, if we show that 2b − a > 0 and a − b > 0, we will reach a contradiction. This is easy: since
a² − 2b² = ±1, if 2b ≤ a we have
\[ a^2 - 2b^2 \ge 2b^2 > 1, \]
and if a ≤ b we have
\[ a^2 - 2b^2 \le -b^2 < -1 \]
unless b = 1 but that gives a + b√2 = 1 + √2 which we have ruled out. (See also Section 7.1.)
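A small experiment illustrates the statement for units with positive coordinates. The sketch below is ours (the function names are illustrative): it lists all pairs (a, b) with 1 ≤ a, b ≤ bound and a² − 2b² = ±1, and compares them with the successive powers of 1 + √2, using (a + b√2)(1 + √2) = (a + 2b) + (a + b)√2.

```python
def units_bruteforce(bound):
    """All (a, b) with 1 <= a, b <= bound and a^2 - 2*b^2 = +-1, sorted."""
    return sorted(
        (a, b)
        for a in range(1, bound + 1)
        for b in range(1, bound + 1)
        if abs(a * a - 2 * b * b) == 1
    )


def powers_of_fundamental_unit(bound):
    """Coordinates (a, b) of (1 + sqrt 2)^k for k >= 1, while they stay within bound."""
    powers = []
    a, b = 1, 1
    while a <= bound and b <= bound:
        powers.append((a, b))
        # multiply a + b*sqrt(2) by 1 + sqrt(2)
        a, b = a + 2 * b, a + b
    return powers


print(units_bruteforce(500))
print(powers_of_fundamental_unit(500))
```

Both lists should coincide; this only tests finitely many units, of course, and is not a substitute for the descent argument above.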


Exercise 2.6.28† (IMO 2001). Let a > b > c > d be positive rational integers. Suppose that

ac + bd = (b + d + a − c)(b + d − a + c).

Prove that ab + cd is not prime.

Solution

We shall first simplify the condition on a, b, c, d:

ac + bd = (b + d)2 − (a − c)2 = b2 + 2bd + d2 − a2 + 2ac − c2 ,

i.e. a² − ac + c² = b² + bd + d², or in other words, (a + jc)(a + j²c) = (b − jd)(b − j²d). This of
course suggests working in Z[j]. Set α = a + jc and β = b − jd. We have αᾱ = ββ̄. Let ρ be the
gcd of α and β and write α = γρ and β = δρ. Then,
\[ \gamma\bar\gamma = \delta\bar\delta \]
and gcd(γ, δ) = 1 so γ | δ̄ and δ̄ | γ, i.e. γ = εδ̄ for some unit ε ∈ Z[j]. Now, notice that
\[ \alpha\beta = (a + jc)(b - jd) = ab + cd + j(bc + cd - ad). \]
However, αβ is also equal to γδρ² and γδ = εN(γ). Hence, if ab + cd is prime, we have N(γ) ∈
{1, ab + cd}.

Suppose first that N (γ) = 1, i.e. γ is a unit. Then, β = δρ is a unit times γρ = α, say α = ηβ.
Since the only units of Z[j] are ±j k by Exercise 2.4.2∗ , we get

±(a + jc) ∈ {b − jd, d + j(b + d), b + d + jb}.

All of these clearly contradict the assumption that a > b > c > d > 0. It remains to treat the
case where N (γ) = ab + cd. In that case, we have ab + cd | bc + cd − ad since

N (γ) | αβ = ab + cd + j(bc + cd − ad).

Note that bc + cd − ad must be positive or zero for otherwise its absolute value is less than
ad < ab + cd. However, if it is positive, then bc + cd − ad ≥ ab + cd which is impossible since
ab > bc. Hence, bc + cd − ad must be 0. This implies that

ερ2 (ab + cd) = γδρ2 = ab + cd,

i.e. ρ is a unit. Then, β̄ = ε⁻¹γρ̄ is a unit times γρ = α, say α = µβ̄. As before, the only units of
Z[j] are ±j^k so we get

±(a + jc) ∈ {b + d + jd, −d + jb, b + j(b + d)}.

Each of these cases still contradicts a > b > c > d > 0.
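To see the statement in action, one can search for small quadruples satisfying the hypothesis. The Python sketch below is only an illustration (the names `is_prime` and `solutions` are ours): it enumerates all a > b > c > d ≥ 1 up to a bound and lists those satisfying the given equation, so that ab + cd can be inspected.

```python
def is_prime(n):
    """Trial-division primality test, sufficient for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def solutions(limit):
    """All (a, b, c, d) with limit >= a > b > c > d >= 1 satisfying the hypothesis."""
    found = []
    for a in range(4, limit + 1):
        for b in range(3, a):
            for c in range(2, b):
                for d in range(1, c):
                    if a * c + b * d == (b + d + a - c) * (b + d - a + c):
                        found.append((a, b, c, d))
    return found


for a, b, c, d in solutions(30):
    print((a, b, c, d), a * b + c * d, is_prime(a * b + c * d))
```

For example, (a, b, c, d) = (21, 18, 14, 1) satisfies the hypothesis since both sides equal 312, and ab + cd = 392 is indeed composite.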




Exercise 2.6.29† . Let x ∈ R be a non-zero real number and m, n ≥ 1 coprime integers. Suppose that
x^m + 1/x^m and x^n + 1/x^n are both rational integers. Prove that x + 1/x is also one.

Solution
Let a = x^m + 1/x^m and b = x^n + 1/x^n. Then, x^m = (a ± √(a² − 4))/2 and x^n = (b ± √(b² − 4))/2, i.e. x^m and x^n are
units in a quadratic field Q(√d) (it is the same field since (x^m)^n = (x^n)^m). Finally, let u, v ∈ Z
be such that um + vn = 1, by Bézout’s lemma. Then, x = x^{um} x^{vn} is a unit of Q(√d) too, i.e.
x + 1/x ∈ Z (since 1/x is the conjugate of x).


Remark 2.6.4

One might be tempted to look at x^{mn}: it is both an mth power and an nth power in Q(√d) so we
may want to conclude that it is an mnth power. For UFDs, by looking at the p-adic valuation,
we see that it is an mnth power times a unit. Since the only units in real quadratic fields (Q(√d)
for d > 0) are ±1 (see Section 7.1), we conclude that it is ± an mnth power, and by looking
at the parity of mn, we can see that in fact it must be an mnth power as wanted. In general,
Q(√d) might not be a UFD, but since it has ideal factorisation, there is still a concept of p-adic
valuation so the previous solution works too.
Chapter 3

Cyclotomic Polynomials

3.1 Definition
Exercise 3.1.1∗ . Let ω be an nth root of unity. Prove that its order divides n.

Solution

Let k be the order of ω and let n = qk + r be the Euclidean division of n by k. We have

ω^n = (ω^k)^q ω^r = ω^r

so ω^r = 1 but r < k which means that r = 0 by minimality of the order.




Exercise 3.1.2∗ . Let p be a rational prime. Prove that Φ_p = X^{p−1} + . . . + 1.

Solution

We have Φ_1 Φ_p = X^p − 1 by Proposition 3.1.1 so
\[ \Phi_p = \frac{X^p - 1}{X - 1} = X^{p-1} + \ldots + 1. \]


Exercise 3.1.3∗ . Let n ≥ 1 be an integer. Prove that Φn (0) = −1 if n = 1 and 1 otherwise.

Solution

By induction on n: true for n = 1 and for n > 1 we have
\[ \Phi_n(0) = \frac{0^n - 1}{\prod_{d \mid n,\, d < n} \Phi_d(0)} = \frac{-1}{-1 \cdot 1 \cdot \ldots \cdot 1} = 1. \]

Exercise 3.1.4. Let n > 1 be an integer. Prove that Φn (1) = p if n is a power of a prime p, and
Φn (1) = 1 otherwise.


Solution

By induction on n:
\[ \prod_{1 \ne d \mid n} \Phi_d = \frac{X^n - 1}{X - 1} = X^{n-1} + \ldots + 1 \]
so ∏_{1≠d|n} Φ_d(1) = n. Note that the function given in the statement satisfies this equation:
\[ \prod_{1 \ne p^i \mid n} p = \prod_{p \mid n} p^{v_p(n)} = n \]
since each factor p appears exactly v_p(n) times. Thus, by induction, Φ_n(1) is p if n is a power
of p and 1 otherwise.


Exercise 3.1.5∗ . Prove the Corollary 3.1.1 by induction.

Solution

By induction on n:
\[ \Phi_n = \frac{X^n - 1}{\prod_{d \mid n,\, d < n} \Phi_d} \]
and this polynomial division has integer coefficients since the divisor is monic.
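The inductive formula translates directly into a computation. The following Python sketch (our own illustration; `divide_by_monic`, `cyclotomic` and `expected_value_at_one` are hypothetical names) stores polynomials as coefficient tuples, lowest degree first, divides X^n − 1 by the Φ_d for proper divisors d, and can be used to check the values Φ_n(0) and Φ_n(1) from Exercises 3.1.3 and 3.1.4 for small n.

```python
from functools import lru_cache


def divide_by_monic(num, den):
    """Exact division of integer polynomials (lowest degree first); den must be monic."""
    assert den[-1] == 1, "divisor must be monic"
    num = list(num)
    quotient = [0] * (len(num) - len(den) + 1)
    for shift in range(len(quotient) - 1, -1, -1):
        coeff = num[shift + len(den) - 1]  # no division needed: den is monic
        quotient[shift] = coeff
        for i, c in enumerate(den):
            num[shift + i] -= coeff * c
    assert not any(num), "division was not exact"
    return quotient


@lru_cache(maxsize=None)
def cyclotomic(n):
    """Coefficients of Phi_n, computed as (X^n - 1) / prod_{d | n, d < n} Phi_d."""
    poly = [-1] + [0] * (n - 1) + [1]  # X^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = divide_by_monic(poly, cyclotomic(d))
    return tuple(poly)


def expected_value_at_one(n):
    """p if n > 1 is a power of the prime p, and 1 otherwise."""
    p = next(q for q in range(2, n + 1) if n % q == 0)  # smallest prime factor
    while n % p == 0:
        n //= p
    return p if n == 1 else 1


for n in range(2, 13):
    phi_n = cyclotomic(n)
    print(n, phi_n, "Phi_n(0) =", phi_n[0], "Phi_n(1) =", sum(phi_n))
```

Here Φ_n(0) is the constant coefficient and Φ_n(1) is the sum of the coefficients; the fact that every intermediate quotient has integer entries reflects the remark that the divisor is monic.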


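Exercise 3.1.6∗ . Prove that Φ_n(1/X) = Φ_n(X)/X^{ϕ(n)} for n > 1.

Solution

When n > 1, the primitive nth roots of unity come in pairs ω, 1/ω, so ω ↦ 1/ω permutes them. Thus,
\[ \Phi_n(1/X) = \prod_{\omega} \left(\frac{1}{X} - \omega\right) = \prod_{\omega} \left(\frac{1}{X} - \frac{1}{\omega}\right) = \prod_{\omega} \frac{\omega - X}{\omega X} = \frac{\Phi_n}{X^{\varphi(n)}} \cdot (-1)^{\varphi(n)} \prod_{\omega} \frac{1}{\omega} \]
and (−1)^{ϕ(n)} ∏_ω 1/ω = Φ_n(0) by Vieta’s formulas which is 1 by Exercise 3.1.3∗ .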

Remark 3.1.1
If f = Σ_i a_i X^i is a polynomial, the fact that f(X) = X^{deg f} f(1/X) can be seen more visually
using its coefficients: this is equivalent to
\[ \sum_i a_i X^i = \sum_i a_i X^{\deg f - i}, \]
i.e. a_i = a_{deg f − i}.

Exercise 3.1.7∗ . Prove that, for n > 1, Φn (X, Y ) is a two-variable symmetric and homogeneous, i.e.
where all monomials have the same degree, polynomial with integer coefficients.

Solution

It is homogeneous because Φ_n(X/Y) is a homogeneous rational fraction (of degree 0) and Y^{ϕ(n)}
is too. It is a polynomial because, if Φ_n = Σ_i a_i X^i then
\[ Y^{\varphi(n)} \Phi_n(X/Y) = \sum_i a_i X^i Y^{\varphi(n) - i} \]
(we can also see it is homogeneous that way). It is symmetric by Exercise 3.1.6∗ :
\[ \Phi_n(Y/X) = (Y/X)^{\varphi(n)} \Phi_n(X/Y) \iff X^{\varphi(n)} \Phi_n(Y/X) = Y^{\varphi(n)} \Phi_n(X/Y). \]

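Exercise 3.1.8∗ . Prove that
\[ \Phi_n(X, Y) = \prod_{\omega \text{ primitive } n\text{th root}} (X - \omega Y). \]

Solution

We have
\[ \Phi_n(X, Y) = Y^{\varphi(n)} \prod_{\omega} \left(X/Y - \omega\right) = \prod_{\omega} (X - Y\omega). \]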

Exercise 3.1.9∗ . Prove that, for odd n > 1, Φ_n(X)Φ_n(−X) = Φ_n(X²) and deduce Corollary 3.1.3.

Solution

We have
\[ \Phi_n(X)\Phi_n(-X) = \prod_{\omega} (X - \omega)(-X - \omega) = \prod_{\omega} -(X^2 - \omega^2) = (-1)^{\varphi(n)} \prod_{\omega} (X^2 - \omega^2). \]

Since n > 2, ϕ(n) is even, and since n is odd, ω ↦ ω² is a permutation of the primitive nth
roots of unity so this is just Φ_n(X²). Since Φ_{2n}(X) = Φ_n(X²)/Φ_n(X) by Proposition 3.1.2, we get
Φ_{2n}(X) = Φ_n(−X).


Exercise 3.1.10. Prove that, for any polynomial f , f (X)f (−X) is a polynomial in X 2 .

Solution

We present three proofs: the first two work over any ring, one using the fundamental theorem of symmetric
polynomials and one by expansion, and the last works over C (and any algebraically closed field)
using the fundamental theorem of algebra A.1.1.

For the first one, note that f(X)f(−X) is symmetric in X and −X so is a polynomial in
X · (−X) = −X² and X + (−X) = 0.

For the second one, write f(X) as g(X²) + Xh(X²) to get
\[ f(X)f(-X) = (g(X^2) + Xh(X^2))(g(X^2) - Xh(X^2)) = g(X^2)^2 - X^2 h(X^2)^2. \]

For the last one, note that the result is true for polynomials of degree 1 as
\[ (X - \alpha)(-X - \alpha) = -(X - \alpha)(X + \alpha) = -(X^2 - \alpha^2) \]
so is true for any polynomial since any polynomial factorises as a product of a constant polynomial
and degree 1 polynomials.
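The expansion f(X)f(−X) = g(X²)² − X²h(X²)² can be checked on any concrete polynomial. In the Python sketch below (our illustration; the names `poly_mul` and `check_even_product` are hypothetical), polynomials are coefficient lists with the lowest degree first, g collects the even-index coefficients of f and h the odd-index ones.

```python
def poly_mul(f, g):
    """Product of two integer polynomials given as coefficient lists."""
    result = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            result[i + j] += a * b
    return result


def check_even_product(f):
    """True if f(X)f(-X) has no odd-degree terms and equals g(X^2)^2 - X^2 h(X^2)^2."""
    reflected = [(-1) ** i * c for i, c in enumerate(f)]  # coefficients of f(-X)
    product = poly_mul(f, reflected)
    g, h = f[0::2], f[1::2]  # f(X) = g(X^2) + X h(X^2)
    g2, h2 = poly_mul(g, g), poly_mul(h, h)
    # coefficients, in the variable Y = X^2, of g(Y)^2 - Y h(Y)^2
    expected = [0] * max(len(g2), len(h2) + 1)
    for j, c in enumerate(g2):
        expected[j] += c
    for j, c in enumerate(h2):
        expected[j + 1] -= c
    return not any(product[1::2]) and product[0::2] == expected


print(check_even_product([3, -1, 4, 1, -5]))
print(check_even_product([2, 7, 1, 8, 2, 8]))
```

The sketch assumes f has degree at least 1, so that both g and h are non-empty.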


Exercise 3.1.11∗ . Let p be a prime number and n ≥ 1 an integer. Prove that if p | n then
Φ_{pn}(X, Y) = Φ_n(X^p, Y^p), and that Φ_{pn}(X, Y) = Φ_n(X^p, Y^p)/Φ_n(X, Y) otherwise.

Solution

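We have
\[ \Phi_{pn}(X, Y) = Y^{\varphi(pn)} \Phi_{pn}(X/Y) = \begin{cases} Y^{p\varphi(n)} \Phi_n(X^p/Y^p) = \Phi_n(X^p, Y^p) & \text{if } p \mid n \\[4pt] \dfrac{Y^{p\varphi(n)}}{Y^{\varphi(n)}} \cdot \dfrac{\Phi_n(X^p/Y^p)}{\Phi_n(X/Y)} = \dfrac{\Phi_n(X^p, Y^p)}{\Phi_n(X, Y)} & \text{if } p \nmid n \end{cases} \]
by Proposition 3.1.2.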


Exercise 3.1.12∗ . Let k ≥ 1 be an integer. Prove that Φ_{2^k} = X^{2^{k−1}} + 1.

Solution

By induction on k: we have Φ_2 = X^{2^0} + 1 and
\[ \Phi_{2 \cdot 2^k}(X, Y) = \Phi_{2^k}(X^2, Y^2) = X^{2 \cdot 2^{k-1}} + Y^{2 \cdot 2^{k-1}} = X^{2^k} + Y^{2^k} \]
for k ≥ 1 by Proposition 3.1.2.




3.2 Irreducibility
Exercise 3.2.1∗ . Let n ≥ 1 be an integer and ω be a primitive nth root of unity. Prove that any
primitive nth root can be written in the form ω k for some gcd(k, n) = 1.

Solution

Write ω = exp(2miπ/n) for some gcd(m, n) = 1. The other primitive roots of unity are exp(2kiπ/n)
for gcd(k, n) = 1 and the powers of ω are exp(2kmiπ/n). Since m is coprime with n, it is invertible
mod n so k ↦ km is a bijection of (Z/nZ)^× which is equivalent to k ↦ ω^k being a bijection from
(Z/nZ)^× to the primitive nth roots as wanted.


Exercise 3.2.2∗ . Let f = ∏_{i=1}^{n} (X − α_i) be a polynomial. Prove that, for any k = 1, . . . , n,
f′(α_k) = ∏_{i≠k} (α_k − α_i).

Solution

By Exercise A.1.8∗ , we have
\[ f' = \sum_{i} \prod_{j \ne i} (X - \alpha_j) \]
from which the result follows by evaluating this at α_k.




Exercise 3.2.3∗ (Frobenius Morphism). Prove the following special case of Proposition 4.1.1: for any
rational prime p and any polynomial f ∈ Z[X], f(X^p) ≡ f(X)^p (mod p).

Solution

Note that
\[ (f + g)^p = \sum_{k} \binom{p}{k} f^k g^{p-k} \equiv f^p + g^p \pmod{p} \]
for any f, g ∈ Z[X] since p | \binom{p}{k} = p!/(k!(p − k)!) for 0 < k < p. Thus, by induction, (Σ_i f_i)^p ≡ Σ_i f_i^p.
Taking f_i = a_i X^i and letting f = Σ_i a_i X^i, we get
\[ f(X)^p = \left(\sum_i a_i X^i\right)^p \equiv \sum_i a_i^p X^{ip} \equiv \sum_i a_i X^{ip} = f(X^p) \]
by Fermat’s little theorem.
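The congruence f(X)^p ≡ f(X^p) (mod p) is easy to test on examples. The Python sketch below is our own illustration (the function names are hypothetical); polynomials are coefficient lists with the lowest degree first and all arithmetic is done modulo p.

```python
def poly_mul_mod(f, g, p):
    """Product of two polynomials with coefficients reduced modulo p."""
    result = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            result[i + j] = (result[i + j] + a * b) % p
    return result


def poly_pow_mod(f, exponent, p):
    """f^exponent modulo p, by repeated multiplication."""
    result = [1]
    for _ in range(exponent):
        result = poly_mul_mod(result, f, p)
    return result


def substitute_x_to_the_p(f, p):
    """Coefficients of f(X^p) modulo p."""
    result = [0] * ((len(f) - 1) * p + 1)
    for i, a in enumerate(f):
        result[i * p] = a % p
    return result


def frobenius_holds(f, p):
    """True if f(X)^p and f(X^p) have the same coefficients modulo p."""
    return poly_pow_mod(f, p, p) == substitute_x_to_the_p(f, p)


f = [3, -1, 4, 1, -5, 9]
for p in (2, 3, 5, 7, 11):
    print(p, frobenius_holds(f, p))
```

The primality of p matters: for instance (1 + X)⁴ = 1 + 4X + 6X² + 4X³ + X⁴ is not congruent to 1 + X⁴ modulo 4 because of the middle coefficient 6.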




Exercise 3.2.4 (Alternative Proof of Theorem 3.2.1). Let ω be a primitive nth root of unity with
minimal polynomial π and let p ∤ n be a rational prime. Suppose τ is the minimal polynomial of
π(ω p ). Prove that p | τ (0) and that τ (0) is bounded when p varies. Deduce that ω p is a root of π for
sufficiently large p, and thus that ω k is a root of π for any gcd(n, k) = 1.

Solution

τ (0) is ± the product of its roots by Vieta’s formulas, and since π(ω p ) is a root it is divisible by it.
Thus, p | π(ω p ) | τ (0). Since ω p takes a finite amount of values, π(ω p ) does as well which means
that the same goes for τ . This shows that τ (0) is bounded when p varies. As a consequence, for
sufficiently large p, say p > N , since p | τ (0), we have τ (0) = 0, i.e. τ = X and π(ω p ) = 0.

To finish, say p_1, . . . , p_ℓ ∤ n are the primes at most N. Since ω^p is also a primitive nth root
of unity for p ∤ n, we can repeat our reasoning with this root of unity to show that π(ω^m) = 0
for any m whose prime factors are all greater than N. Pick any k coprime with n. Using the
Chinese remainder theorem, pick an m ≡ k (mod n) which is congruent to 1 modulo p_1, . . . , p_ℓ.
Then all prime factors of m are greater than N so

π(ω k ) = π(ω m ) = 0

as wanted.


Exercise 3.2.5. Let k and n ≥ 1 be coprime integers. Prove that the conjugates of cos(2kπ/n) are the
numbers cos(2k′π/n) for gcd(k′, n) = 1 and that they have degree ϕ(n)/2. What about sin(2kπ/n): what
are its conjugates and what is its degree?

Solution

Note that
\[ 2\cos\left(\frac{2k\pi}{n}\right) = \omega + \omega^{-1} = \omega + \omega^{n-1} \]
where ω = exp(2kiπ/n) is a primitive nth root of unity. In particular, by the fundamental theorem
of symmetric polynomials, the conjugates of 2 cos(2kπ/n) are among 2 cos(2k′π/n) for gcd(k′, n) = 1
as wanted. For the converse, note that if f(2 cos(2kπ/n)) = 0 then f(X + X⁻¹) has a root at ω so
at all other primitive nth roots of unity as wanted. In particular, for n ≥ 3, cos(2kπ/n) has ϕ(n)/2 conjugates
since the numbers cos(2k′π/n) go by pairs cos(2k′π/n), cos(−2k′π/n). (For n ≤ 2, it has degree
ϕ(n) = 1.)

The situation is more complicated for sines. The problem is that we cannot invoke the fundamental
theorem of symmetric polynomials because there is now an i in our expressions:
\[ 2\sin\left(\frac{2k\pi}{n}\right) = \frac{\omega - \omega^{-1}}{i}. \]

Perhaps the simplest way is to transform it into a cosine:
\[ \sin\left(\frac{2k\pi}{n}\right) = \cos\left(\frac{\pi}{2} - \frac{2k\pi}{n}\right) = \cos\left(\frac{2\pi(n - 4k)}{4n}\right). \]

We need to evaluate gcd(n − 4k, 4n). Since k and n are coprime, it is clear that the only
potential prime factor of this gcd is 2. In particular, if 8 | n, the gcd is 4 so sin(2kπ/n) has degree
ϕ(4n/4)/2 = ϕ(n)/2.

When n ≡ 4 (mod 8), the numerator is this time divisible by 8 because n ≡ 4k (mod 8). If
n ≡ 4k (mod 16), then the gcd is 16 so sin(2kπ/n) has degree

ϕ(4n/16)/2 = ϕ(n)/4.

If n ≢ 4k (mod 16), then the gcd is 8 so it has degree ϕ(4n/8)/2 = ϕ(n)/4 as well (ϕ(m) = ϕ(2m) when m is
odd). Of course, we are assuming that n ≠ 4 here, so that n − 4k is non-zero. If n = 4, it has
degree 1.

If n ≡ 2 (mod 4), the gcd of n − 4k and 4n is just 2, so sin(2kπ/n) has degree ϕ(4n/2)/2 = ϕ(n).
Similarly, if n is odd, the gcd is 1 so it has degree ϕ(4n)/2 = ϕ(n) as well. This time we assumed
that n was greater than 2, otherwise it has degree 1. We can summarise our results in the
following table.

n                    degree of sin(2kπ/n)
n ∈ {1, 2, 4}        1
n ≡ 0 (mod 8)        ϕ(n)/2
n ≡ 4 (mod 8)        ϕ(n)/4
n ≡ 2 (mod 4)        ϕ(n)
n ≡ 1 (mod 2)        ϕ(n)
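The case analysis behind this table is pure arithmetic and can be re-derived mechanically. The Python sketch below (our illustration, with hypothetical function names) takes the cosine result of this exercise for granted: it reduces the fraction (n − 4k)/(4n), reads off the degree of the resulting cosine, and compares it with the table for every coprime pair (k, n) up to a bound.

```python
from math import gcd


def euler_phi(n):
    """Euler's totient, computed naively."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def cosine_degree(N):
    """Degree of cos(2*pi*a/N) for gcd(a, N) = 1, as established for cosines above."""
    return 1 if N <= 2 else euler_phi(N) // 2


def sine_degree_via_cosine(k, n):
    """Degree of sin(2*k*pi/n), written as cos(2*pi*(n - 4k)/(4n)) in lowest terms."""
    g = gcd(n - 4 * k, 4 * n)
    return cosine_degree(4 * n // g)


def sine_degree_from_table(n):
    """Degree of sin(2*k*pi/n) as read from the summary table."""
    if n in (1, 2, 4):
        return 1
    if n % 8 == 0:
        return euler_phi(n) // 2
    if n % 8 == 4:
        return euler_phi(n) // 4
    return euler_phi(n)


def table_matches(limit):
    return all(
        sine_degree_via_cosine(k, n) == sine_degree_from_table(n)
        for n in range(1, limit + 1)
        for k in range(1, n + 1)
        if gcd(k, n) == 1
    )


print(table_matches(60))
```

For example, n = 12 and k = 1 give sin(π/6) = 1/2, of degree 1 = ϕ(12)/4, as the table predicts.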


Remark 3.2.1
The reason why sines turn out to be so unstructured is because of the i in the denominator of
sin(2kπ/n) = (ω^k − ω^{−k})/(2i). Suppose k = 1 without loss of generality, by symmetry between primitive
nth roots of unity. The best way to see this is with Galois theory (see Chapter 6). Because of
this i, to consider conjugates of sin(2π/n) we need to work in Q(ω, i). Then, we count the number
of automorphisms, i.e. elements of the Galois group over Q, fixing sin(2π/n) = (ω − ω⁻¹)/(2i). If there are
N such elements, then there are exactly

[Q(ω, i) : Q]/N = ϕ(lcm(4, n))/N

conjugates, by Proposition 6.3.1. Normally N = 2, i.e. there are only two embeddings fixing
(ω − ω⁻¹)/(2i): the identity and the complex conjugation. However, sometimes there are more. Let’s see
more closely what’s happening: if σ ∈ Gal(Q(ω, i)/Q) fixes (ω − ω⁻¹)/(2i), since it sends i to ±i, it must
send ω − ω⁻¹ to ±(ω − ω⁻¹). In other words, we consider the potential embeddings
\[ \mathrm{id} : \begin{cases} \omega \mapsto \omega \\ i \mapsto i \end{cases}, \qquad \tau : \begin{cases} \omega \mapsto \omega^{-1} \\ i \mapsto -i \end{cases}, \qquad \varphi : \begin{cases} \omega \mapsto -\omega \\ i \mapsto -i \end{cases} \]
and
\[ \psi : \begin{cases} \omega \mapsto -\omega^{-1} \\ i \mapsto i. \end{cases} \]
The first two always exist: they are the identity and the complex conjugation. The other two are
more delicate. First of all, if n is odd or congruent to 2 modulo 4, then −ω^{±1} is not a conjugate
of ω so they do not exist. If 4 | n, since −ω^{±1} = ω^{n/2±1} and ω^{n/4} is i or −i, we must have

(ω^{n/4})^{n/2±1} = ±ω^{n/4}

for these embeddings to exist. This means that (n/2) · (n/4) ≡ n/2 (mod n), i.e. n/4 is odd, or in other words
n ≡ 4 (mod 8).

To conclude, when 8 | n we have ϕ(lcm(4, n))/N = ϕ(n)/2, when n ≡ 4 (mod 8) we have
ϕ(lcm(4, n))/N = ϕ(n)/4, and otherwise we have ϕ(lcm(4, n))/N = 2ϕ(n)/2 as desired. (Obviously,
we exclude the exceptions 1, 2, 4.)

Exercise 3.2.6. Find all quadratic cosines.

Solution

The degree of cos(2kπ/n) is 1 for n = 1, 2 and ϕ(n)/2 for n > 2. Indeed, when n > 2, the cosines
cos(2kπ/n) for gcd(k, n) = 1 come in pairs
\[ \cos\left(\frac{2k\pi}{n}\right) = \cos\left(\frac{2(n - k)\pi}{n}\right) \]
so cos(2kπ/n) has half as many conjugates as the number of k with gcd(k, n) = 1, i.e. ϕ(n)/2. Thus, the
quadratic cosines are cos(2kπ/n) for gcd(k, n) = 1 and ϕ(n)/2 ≤ 2, i.e. n ∈ {1, 2, 3, 4, 5, 6, 8, 10, 12}.
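The list of admissible n comes down to solving ϕ(n) ≤ 4, which can be checked by a direct search. The Python sketch below is ours (the names are illustrative) and searches only up to a finite bound; that the list is complete follows from the fact that ϕ(n) tends to infinity with n, which the code does not prove.

```python
from math import gcd


def euler_phi(n):
    """Euler's totient, computed naively."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def cosine_degree(n):
    """Degree of cos(2*k*pi/n) for gcd(k, n) = 1: 1 if n <= 2, else phi(n)/2."""
    return 1 if n <= 2 else euler_phi(n) // 2


def moduli_with_degree_at_most_two(limit):
    return [n for n in range(1, limit + 1) if cosine_degree(n) <= 2]


print(moduli_with_degree_at_most_two(200))
```

Among these, n = 5, 8, 10, 12 give cosines of degree exactly 2 (for instance cos(2π/12) = √3/2), while n = 1, 2, 3, 4, 6 give rational values.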


3.3 Orders
Exercise 3.3.1∗ . Let p be a rational prime and a a rational integer. Prove that, for any n ≥ 1,
p | Φn (a) if and only if p | Φpn (a).

Solution

It suffices to note that Φ_{pn}(X) ≡ Φ_n(X)^p or Φ_n(X)^{p−1} modulo p by Proposition 3.1.2.


Exercise 3.3.2∗ . Let p be a rational prime. Prove that there always exists a primitive root or
generator modulo p, i.e. an integer g such that g^k generates all integers p ∤ m modulo p.

Solution

Primitive roots are the elements of order p − 1, i.e. the roots of Φ_{p−1} modulo p. Since

Φ_{p−1} | X^{p−1} − 1 ≡ (X − 1) · . . . · (X − (p − 1)) (mod p)

it splits in F_p and in particular has a root.
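Since Φ_{p−1} splits in F_p, it should have exactly deg Φ_{p−1} = ϕ(p − 1) roots there, i.e. there should be ϕ(p − 1) primitive roots modulo p. The Python sketch below (our illustration; the function names are hypothetical) finds the primitive roots by computing orders directly, so this count can be compared with ϕ(p − 1) for small primes.

```python
from math import gcd


def euler_phi(n):
    """Euler's totient, computed naively."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def multiplicative_order(g, p):
    """Order of g modulo the prime p, assuming p does not divide g."""
    order, power = 1, g % p
    while power != 1:
        power = power * g % p
        order += 1
    return order


def primitive_roots(p):
    """All g in {1, ..., p - 1} of order exactly p - 1 modulo p."""
    return [g for g in range(1, p) if multiplicative_order(g, p) == p - 1]


for p in (2, 3, 5, 7, 11, 13, 17, 19, 23):
    print(p, primitive_roots(p), euler_phi(p - 1))
```

For p = 7, the primitive roots are 3 and 5, and indeed ϕ(6) = 2.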




Exercise 3.3.3∗ . Let p be a rational prime and a, b two rational integers. Prove that p | Φ_n(a, b) if
and only if p | a, b or n/p^{v_p(n)} is the order of ab⁻¹ modulo p.

Solution

If p ∤ a, b, Φ_n(a, b) is zero modulo p if and only if Φ_n(ab⁻¹) is.




Exercise 3.3.4∗ . Let p be a rational prime and a an integer of order n modulo p. Prove that a^k ≡ 1
(mod p) if and only if n | k. Deduce that n divides p − 1.¹

Solution

Let k = qn + r be the Euclidean division of k by n. We have

a^k = (a^n)^q a^r ≡ a^r

so a^r ≡ 1 but r < n which means that r = 0 by minimality of the order. Since a^{p−1} ≡ 1 by
Fermat’s little theorem, we get n | p − 1.


Exercise 3.3.5∗ . Let p be a rational prime and a, b two rational integers. Suppose that p | Φn (a, b).
Prove that, if p does not divide both a and b, either p ≡ 1 (mod n) or p is the greatest prime factor
of n.

1 This is the mod p version of Exercise 3.1.1∗ . In fact the proof should be the same as it works in any group (see

Section A.2 and Theorem 6.3.2).



Solution

If p ∤ a, b then Φ_n(ab⁻¹) ≡ 0 (mod p).




Exercise 3.3.6∗ . Let p be a rational prime and a an integer. Suppose p | Φ_n(a), Φ_m(a) and n ≠ m.
Prove that m/n is a power of p.

Solution

a has both order n/p^{v_p(n)} and m/p^{v_p(m)} modulo p so m/n = p^{v_p(m)−v_p(n)} is a power of p as wanted.


Exercise 3.3.7. Prove the following strengthening of Problem 3.1.1: for any integer n ≥ 0, the
number 2^{2^{n+1}} + 2^{2^n} + 1 has at least n + 1 distinct prime factors.

Solution

We have
\[ 2^{2^{n+1}} + 2^{2^n} + 1 = \prod_{k=0}^{n} \Phi_{3 \cdot 2^k}(2) \]
and 2 is not the greatest prime factor of 3 · 2^k nor is it congruent to 1 modulo 3 · 2^k so it can’t
divide Φ_{3·2^k}(2) by Corollary 3.3.1. Since (3 · 2^k)/(3 · 2^{k′}) is a power of 2, the only possible common prime
factor of Φ_{3·2^k}(2) and Φ_{3·2^{k′}}(2) is 2 but we have already shown that they were odd. Thus, each
factor contributes at least one prime factor and we have in total at least n + 1 prime factors
as wanted (we have shown in Problem 3.1.1 that they were non-trivial).
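For small n the claim can be checked by factoring the number directly. The Python sketch below is only an illustration (the function name `distinct_prime_factors` is ours) and uses plain trial division, which is fast enough for n ≤ 4 but would not scale much further.

```python
def distinct_prime_factors(n):
    """Set of prime factors of n, by trial division."""
    primes = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            primes.add(d)
            n //= d
        d += 1
    if n > 1:
        primes.add(n)  # whatever remains is prime
    return primes


for n in range(5):
    value = 2 ** (2 ** (n + 1)) + 2 ** (2 ** n) + 1
    print(n, value, sorted(distinct_prime_factors(value)))
```

For instance, n = 2 gives 273 = 3 · 7 · 13, which has exactly n + 1 = 3 distinct prime factors.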


Remark 3.3.1
This can also be seen as a corollary of the Zsigmondy theorem: each $\Phi_{3 \cdot 2^k}(2)$ brings a primitive prime factor except when $k = 1$, but $3 = \Phi_6(2)$ is still primitive compared to $7 = \Phi_3(2)$.

Exercise 3.3.8∗ . Let n ≥ 1 be an integer. Prove that there exist infinitely many rational primes
p ≡ 1 (mod n).

Solution

Suppose that there were only finitely many primes $p_1, \ldots, p_k$ congruent to 1 modulo $n$. Consider the number $\Phi_n(np_1 \cdots p_k)$. It is congruent to $\pm 1$ modulo $np_1 \cdots p_k$ by Exercise 3.1.3∗, so any prime factor of it must be congruent to 1 modulo $n$ by Theorem 3.3.1 and distinct from $p_1, \ldots, p_k$. Since it is greater than 1 by the triangle inequality, it has a prime factor, which is a contradiction.
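The proof is constructive, and the construction can be run on small cases. The sketch below evaluates $\Phi_n(x)$ for an integer $x \geq 2$ through the product formula $\Phi_n = \prod_{d \mid n} (X^d - 1)^{\mu(n/d)}$ (Exercise 3.5.19) and factors the result by trial division, so it is only meant for small inputs. All function names are ours.

```python
def mobius(n):
    """Moebius function by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result


def cyclotomic_value(n, x):
    """Phi_n(x) for an integer x >= 2, via prod over d | n of (x^d - 1)^mu(n/d)."""
    num = den = 1
    for d in range(1, n + 1):
        if n % d == 0:
            mu = mobius(n // d)
            if mu == 1:
                num *= x**d - 1
            elif mu == -1:
                den *= x**d - 1
    return num // den


def prime_factors(m):
    """Set of prime factors of m, by trial division."""
    factors, d = set(), 2
    while d * d <= m:
        while m % d == 0:
            factors.add(d)
            m //= d
        d += 1
    if m > 1:
        factors.add(m)
    return factors


def more_primes(n, known):
    """Prime factors of Phi_n(n * p_1 * ... * p_k) for the known primes p_i = 1 (mod n)."""
    x = n
    for p in known:
        x *= p
    return prime_factors(cyclotomic_value(n, x))


print(more_primes(5, []))
```

Starting from an empty list with $n = 5$, we get the prime factors of $\Phi_5(5) = 781 = 11 \cdot 71$; feeding some of them back in produces further primes congruent to 1 modulo 5.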


3.4 Zsigmondy’s Theorem


Exercise 3.4.1∗ . Check that the exceptions stated in Theorem 3.4.1 are indeed exceptions.

Solution

When $n = 2$ and $a + b$ is a power of 2, all prime factors of $a^2 - b^2 = (a - b)(a + b)$ either divide $a - b$ or are equal to 2, which also divides $a - b$. For $a = 2$, $b = 1$ and $n = 6$, we see that all prime factors of $2^6 - 1 = 9 \cdot 7$ divide $2^3 - 1 = 7$ or $2^2 - 1 = 3$.
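These exceptions can also be found by brute force. The following sketch lists the primitive prime factors of $a^n - b^n$ directly from the definition (primes dividing $a^n - b^n$ but no $a^k - b^k$ with $1 \leq k < n$); the function names are ours and the trial-division factorisation is only suitable for small values.

```python
def prime_factors(m):
    """Set of prime factors of m >= 1, by trial division."""
    factors, d = set(), 2
    while d * d <= m:
        while m % d == 0:
            factors.add(d)
            m //= d
        d += 1
    if m > 1:
        factors.add(m)
    return factors


def primitive_prime_factors(a, b, n):
    """Primes dividing a^n - b^n but no a^k - b^k with 1 <= k < n (assumes a > b >= 1)."""
    return {p for p in prime_factors(a**n - b**n)
            if all((a**k - b**k) % p != 0 for k in range(1, n))}


# pairs (a, n) with b = 1 and n >= 2 for which no primitive prime factor exists
exceptions = {(a, n) for a in range(2, 7) for n in range(2, 13)
              if not primitive_prime_factors(a, 1, n)}
print(exceptions)
```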


Exercise 3.4.2∗. Prove that $a^2 - b^2$ has no primitive prime factor if and only if $a + b$ is $\pm$ a power of 2.

Solution

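We have already shown that $a^2 - b^2$ has no primitive prime factor if $a + b$ is a power of 2. For the converse, note that any common prime factor of $a + b$ and $a - b$ must divide $2a$ and $2b$, so must be 2 since $a$ and $b$ are coprime. Hence, if $a^2 - b^2$ has no primitive prime factor, every prime factor of $a + b$ divides $a - b$ and is thus equal to 2.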


Exercise 3.4.3. Let n ≥ 3 be an integer. Prove that Φn is positive on R.

Solution

Since Φn (0) = 1 > 0 by Exercise 3.1.3∗ , if Φn (x) were nonpositive for some real x, Φn would
have a real root by the intermediate value theorem which would imply n = 1 or n = 2 since the
only real roots of unity are 1 and −1.


Exercise 3.4.4. Prove that $2^{m-1} > m$ for any integer $m \geq 3$ and $2^m - 1 > 3m$ for any integer $m \geq 4$.

Solution

It suffices to prove the second inequality since, if $2^m > 3m + 1$, then $2^{m-1} > m + \frac{m+1}{2} \geq m + 1$, and we already have $2^2 > 3$. We use the binomial expansion:

$$2^m = (1 + 1)^m > \binom{m}{m-1} + \binom{m}{2} + \binom{m}{1} = 2m + \frac{m(m-1)}{2} > 3m + 1$$

since $m \geq 4$.


3.5 Exercises
Diophantine Equations
Exercise 3.5.2† (USA TST 2008). Let $n$ be a rational integer. Prove that $n^7 + 7$ is not a perfect square.

Solution

Suppose that $n^7 + 7 = m^2$. Then, by adding $121 = 11^2$ to both sides, we get $n^7 + 2^7 = m^2 + 11^2$, i.e.

$$\Phi_1(n, -2)\Phi_7(n, -2) = \Phi_4(m, 11).$$

Note that $n$ must be odd, since otherwise $m^2 = n^7 + 7 \equiv 3 \pmod 4$; thus $m$ is even and the RHS is odd. In particular, any prime factor of the RHS must be equal to 11 or congruent to 1 modulo 4. First suppose that $11 \nmid m$. Then, we must have $n + 2 = \Phi_1(n, -2) \equiv 1 \pmod 4$, i.e. $n \equiv -1 \pmod 4$. However, we then have $n^7 + 2^7 \equiv -1 \pmod 4$, which is impossible.

Thus, 11 must divide $m$. Since 11 is not equal to 2 or 7 nor congruent to 1 modulo 7, it can’t divide $\Phi_7(n, -2)$. Hence, it must divide $n + 2 = \Phi_1(n, -2)$. Since $v_{11}(\Phi_4(m, 11)) = 2$, we also have $v_{11}(n + 2) = 2$. But then, $n + 2$ is still congruent to 1 modulo 4, since all its prime factors are congruent to 1 modulo 4 except 11, and its $v_{11}$ is even. Hence, we get the same contradiction as before, which shows that our equation does not have any solution.
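A quick numerical check (which of course proves nothing) confirms that no small $n$ gives a square. The function names and the search bound are ours; `math.isqrt` gives an exact integer square root.

```python
from math import isqrt


def is_square(m):
    """True when the integer m is a perfect square."""
    return m >= 0 and isqrt(m) ** 2 == m


def counterexamples(limit):
    """All n with -limit <= n <= limit such that n^7 + 7 is a perfect square."""
    return [n for n in range(-limit, limit + 1) if is_square(n**7 + 7)]


print(counterexamples(5000))
```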


Exercise 3.5.5† (French TST 1 2017). Determine all positive integers $a$ for which there exist positive integers $m$ and $n$ as well as positive integers $k_1, \ldots, k_m, \ell_1, \ldots, \ell_n$ such that

$$(a^{k_1} - 1) \cdots (a^{k_m} - 1) = (a^{\ell_1} + 1) \cdots (a^{\ell_n} + 1).$$

Solution

If we multiply both sides by $(a^{\ell_1} - 1) \cdots (a^{\ell_n} - 1)$, we get

$$(a^{k_1} - 1) \cdots (a^{k_m} - 1)(a^{\ell_1} - 1) \cdots (a^{\ell_n} - 1) = (a^{2\ell_1} - 1) \cdots (a^{2\ell_n} - 1).$$

If we eliminate common factors, we get an equality of the form $(a^{u_1} - 1) \cdots (a^{u_r} - 1) = (a^{v_1} - 1) \cdots (a^{v_s} - 1)$ with even $v_i$ and disjoint $\{u_1, \ldots, u_r\}$ and $\{v_1, \ldots, v_s\}$. Now, consider $a^{\max_{i,j}(u_i, v_j)} - 1$. By the Zsigmondy theorem, unless $a = 2$ or $\max_{i,j}(u_i, v_j) \leq 2$, this has a primitive prime factor, which is a contradiction since this implies that some prime divides one side of the equality but not the other. Conversely, it is easy to see that $a = 2$ works: $(2^2 - 1)^2 = 2^3 + 1$.

Now suppose that $\max_{i,j}(u_i, v_j) \leq 2$. It cannot be 1 since the $u_i$ and $v_j$ are disjoint. Hence, it must be 2. Since the $v_j$ are even, this implies $u_1 = \ldots = u_r = 1$ and $v_1 = \ldots = v_s = 2$. We conclude that $(a - 1)^r = (a^2 - 1)^s$, i.e. $(a - 1)^{r-s} = (a + 1)^s$. The gcd of $a + 1$ and $a - 1$ divides 2, so $a - 1$ and $a + 1$ must both be powers of 2. This gives us $a = 3$. Conversely, we have $(3 - 1)^2 = 3 + 1$.

We conclude that the only solutions are $a = 2$ and $a = 3$.




Divisibility Relations
Exercise 3.5.7†. Find all coprime positive integers $a$ and $b$ for which there exist infinitely many integers $n \geq 1$ such that

$$n^2 \mid a^n + b^n.$$

Solution

We shall prove that $a$ and $b$ work if and only if $a + b$ is not a power of 2 and $\{a, b\} \neq \{1, 2\}$. Suppose that $n^2 \mid a^n + b^n$ with $n > 1$. Let $p$ be the smallest prime factor of $n$. Then, the order of $ab^{-1}$ modulo $p$ divides $2n$ and $p - 1$, so it divides 2 by minimality of $p$, which gives $p \mid a + b$. If $a + b$ were a power of 2, we would have $p = 2$, so $n$ would be even and $a, b$ odd; then 4 would not divide $a^n + b^n$, which would be a contradiction. Thus, $a + b$ is not a power of 2.

Now suppose $a = 2$ and $b = 1$. The previous reasoning shows that the smallest prime factor of $n$ is 3. Clearly, $n$ is odd, as otherwise $3 \nmid 2^n + 1$. The lifting the exponent lemma gives $v_3(2^n + 1) = v_3(n) + 1$, so that $v_3(n) \leq 1$. Let $q$ be the second smallest prime factor of $n$, if there is one. Then, the order of 2 modulo $q$ divides $2n$ and $q - 1$, so must divide 6, i.e. $q = 7$. This is impossible since the order of 2 modulo 7 is odd, so 7 never divides $2^k + 1$. Thus, $n$ has only one prime factor, i.e. $n \in \{1, 3\}$. There are finitely many such integers.

Finally, suppose $a + b$ is not a power of 2 and $\{a, b\} \neq \{1, 2\}$. We shall proceed by induction on $k$ to find an odd $n$ that works with exactly $k$ prime factors. We start with the solution $n = 1$ corresponding to $k = 0$. Then, Zsigmondy tells us that $a^n + b^n = \frac{a^{2n} - b^{2n}}{a^n - b^n}$ has an odd prime factor $p$ which doesn’t divide $n$, since a prime factor $q \mid n$ divides $a^{q-1} - b^{q-1}$ (the exception was with $\{a, b\} = \{2, 1\}$, which we have ruled out). We claim that $pn$ is also a solution:

$$n^2 \mid a^n + b^n \mid a^{np} + b^{np}$$

since $p$ is odd, and by the lifting the exponent lemma $v_p(a^{np} + b^{np}) = 1 + v_p(a^n + b^n) \geq 2$, so $p^2$ divides $a^{np} + b^{np}$ as well. Since $p$ and $n$ are coprime, we have $(np)^2 \mid a^{np} + b^{np}$ as desired.
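The induction above is effectively an algorithm, which can be sketched as follows. Instead of factoring $a^n + b^n$, the code looks for the smallest odd $p$ coprime to $n$ dividing $a^n + b^n$ (such a $p$ is automatically prime) using modular exponentiation. The function names are ours, and the guard in `chain` rejects the cases where the search need not terminate.

```python
from math import gcd


def works(a, b, n):
    """True when n^2 divides a^n + b^n (computed modulo n^2)."""
    m = n * n
    return (pow(a, n, m) + pow(b, n, m)) % m == 0


def next_solution(a, b, n):
    """Multiply the odd solution n by the smallest odd p coprime to n dividing a^n + b^n."""
    p = 3
    while gcd(p, n) != 1 or (pow(a, n, p) + pow(b, n, p)) % p != 0:
        p += 2
    return n * p


def chain(a, b, steps):
    """Solutions 1 = n_0 | n_1 | ... obtained by repeating the induction step."""
    if {a, b} == {1, 2} or (a + b) & (a + b - 1) == 0:
        raise ValueError("a + b is a power of 2 or {a, b} = {1, 2}: no infinite chain")
    n, result = 1, [1]
    for _ in range(steps):
        n = next_solution(a, b, n)
        result.append(n)
    return result


print(chain(3, 2, 3))
print([n for n in range(1, 2000) if works(2, 1, n)])
```

The second line of output illustrates the exceptional case $\{a, b\} = \{1, 2\}$, where only $n = 1$ and $n = 3$ work.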


Prime Factors
Exercise 3.5.11† (ISL 2002). Let $p_1, \ldots, p_n > 3$ be distinct rational primes. Prove that the number

$$2^{p_1 \cdots p_n} + 1$$

has at least $2^{2^n}$ distinct divisors.

Solution

Consider the $2^n$ divisors of $p_1 \cdots p_n$ and order them $d_1 < \ldots < d_{2^n}$. Then, each $2^{d_i} + 1 \mid 2^{p_1 \cdots p_n} + 1$ gives a primitive prime factor by Zsigmondy’s theorem (no exception since $p_i > 3$), so there are at least $2^n$ prime factors in total and thus at least $2^{2^n}$ divisors.
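For small primes the bound can be checked directly. The sketch below factors $2^{p_1 \cdots p_n} + 1$ by trial division (so it is only usable for very small exponents) and counts divisors; the function names are ours.

```python
def factorize(m):
    """Prime factorisation of m as a dict {prime: exponent}, by trial division."""
    factors, d = {}, 2
    while d * d <= m:
        while m % d == 0:
            factors[d] = factors.get(d, 0) + 1
            m //= d
        d += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors


def divisor_count(m):
    """Number of positive divisors of m."""
    count = 1
    for exponent in factorize(m).values():
        count *= exponent + 1
    return count


# n = 2 with p_1 = 5 and p_2 = 7: the bound is 2^(2^2) = 16 divisors
print(divisor_count(2**35 + 1))
```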


Exercise 3.5.12† (Problems from the Book). Let $a \geq 2$ be a rational integer. Prove that there exist infinitely many integers $n \geq 1$ such that the greatest prime factor of $a^n - 1$ is greater than $n \log_a n$.

Solution
We choose $n = a^k$, so that $n \log_a n = ka^k$. We consider prime factors of $\Phi_{a^k}(a) \mid a^{a^k} - 1$. They are all congruent to 1 modulo $a^k$, and suppose for the sake of contradiction that they are all less than $ka^k$ (which is the same as being at most $ka^k$ since they are congruent to 1 modulo $a^k$). Let these prime factors, counted with multiplicity, be $k_1a^k + 1, \ldots, k_ma^k + 1$. Since $\Phi_{a^k}(a) < a^{a^k}$ and its prime factors are all greater than $a^k$, we have $a^{km} < a^{a^k}$, i.e. $km < a^k$. The key claim is that $\Phi_{a^k}(a) \equiv 1 \pmod{a^{2k}}$, but

$$\prod_{i=1}^{m} (k_ia^k + 1) \equiv 1 + a^k\sum_{i=1}^{m} k_i \pmod{a^{2k}}$$

and $0 < \sum_{i=1}^{m} k_i < km < a^k$ since each $k_i$ is less than $k$, which is a contradiction. Thus, it remains to prove that $\Phi_{a^k}(a) \equiv 1 \pmod{a^{2k}}$. We shall prove that this holds modulo $p^{2vk}$ for any prime power $p^v$ which exactly divides $a$.

By Proposition 3.1.2, we have

$$\Phi_{a^k}(a) = \Phi_{p(a/p^v)^k}\left(a^{p^{kv-1}}\right).$$

Since $p^{kv-1} \geq 2kv$ for sufficiently large $k$ (in fact $k \geq 4$ suffices), modulo $p^{2kv}$ we get

$$\Phi_{p(a/p^v)^k}(0) \equiv 1$$

as wanted.


Exercise 3.5.13† (Inspired by IMO 2003). Let $m \geq 1$ be an integer. Prove that there is some rational prime $p$ such that $p \nmid n^m - m$ for any rational integer $n$.

Solution

In fact, we prove more: if $p$ is a prime factor of $m$ and $k = v_p(m)$, there is some prime $q$ such that $m$ is not a $p^k$th power modulo $q$. For didactic purposes, we shall first do the case $k = 1$ (this whole paragraph will be about motivation, and the following paragraph will have the real proof). By Exercise 4.6.20†, $m$ is a $p$th power modulo $q$ if and only if

$$m^{\frac{q-1}{\gcd(p, q-1)}} \equiv 1 \pmod q.$$

In particular, we must have $q \equiv 1 \pmod p$, otherwise this is always true. Hence, we want to have $m^{\frac{q-1}{p}} \not\equiv 1$, i.e. the order $r$ of $m$ modulo $q$ doesn’t divide $\frac{q-1}{p}$, or in other words $v_p(r) = v_p(q - 1)$. This suggests to try, for instance, $r = p$ and $q \not\equiv 1 \pmod{p^2}$. Hence, we want to pick a prime factor $q$ of $\Phi_p(m)$ which is not congruent to 1 modulo $p^2$. If there was no such prime, we would have $\Phi_p(m) \equiv 1 \pmod{p^2}$, which is impossible since $\Phi_p(m) \equiv m + 1 \pmod{p^2}$ and $p^2 \nmid m$.

Now, let’s do the general case. The proof is almost identical: we find a prime $q$ for which $m$ has order $p$ modulo $q$, and such that $q \not\equiv 1 \pmod{p^{k+1}}$. That way, $\frac{q-1}{\gcd(q-1, p^k)}$ is not divisible by $p$. Hence, if $m$ were congruent to a $\gcd(q - 1, p^k)$th power $n^{\gcd(q-1, p^k)}$ modulo $q$, we would have

$$m^{\frac{q-1}{\gcd(q-1, p^k)}} \equiv n^{q-1} \equiv 1$$

but the order of $m$ doesn’t divide $\frac{q-1}{\gcd(q-1, p^k)}$. To find such a prime $q$, consider $\Phi_p(m)$ as before. If all its prime divisors were congruent to 1 modulo $p^{k+1}$, we would have $\Phi_p(m) \equiv 1 \pmod{p^{k+1}}$, which is impossible since it is congruent to $1 + m$.
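For small $m$, such a prime can be found by exhaustive search. The sketch below does not follow the construction of the proof; it simply tries primes in increasing order. It is run for $m \geq 2$ only, since for $m = 1$ the value $n = 1$ gives $n^m - m = 0$. The function names and the search bound are ours.

```python
def is_prime(q):
    """Primality test by trial division."""
    return q >= 2 and all(q % d for d in range(2, int(q**0.5) + 1))


def witness_prime(m, bound=1000):
    """Smallest prime q <= bound such that q never divides n^m - m, or None if there is none."""
    for q in range(2, bound + 1):
        if is_prime(q) and all((pow(n, m, q) - m) % q != 0 for n in range(q)):
            return q
    return None


print({m: witness_prime(m) for m in range(2, 13)})
```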


Remark 3.5.1
This is also a consequence of (a corollary of) the Chebotarev density theorem: as said in Remark 4.6.4, if there were no such prime, $m$ would be an $\frac{m}{2}$th power if $8 \mid m$, which is impossible since $2^{m/2} > m$ for $m \geq 8$, or an $m$th power if $8 \nmid m$, which is also impossible since $2^m > m$ for $m \geq 1$.

Exercise 3.5.14† . Prove that ϕ(n)/n can get arbitrarily small. Deduce that π(n)/n → 0, where π(n)
denotes the number of primes at most n.

Solution

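We take $n = p_1 \cdots p_k$, where $p_1, \ldots, p_k$ are the first $k$ primes. We need to prove that

$$\frac{\varphi(n)}{n} = \left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_k}\right) \to 0,$$

i.e.

$$\prod_p \left(1 - \frac{1}{p}\right) = 0.$$

This follows from the following equality:

$$\prod_p \frac{1}{1 - p^{-1}} = \sum_{n=1}^{\infty} \frac{1}{n} = \infty.$$

To deduce that $\pi(n) = o(n)$, one can notice that there are $\left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_k}\right)n + o(n)$ numbers less than $n$ which are not divisible by any of $p_1, \ldots, p_k$, and that all primes greater than $p_k$ are among them.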
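To get a feeling for how slowly this product decreases, here is a small sketch computing $\varphi(n)/n$ exactly for primorials with `fractions.Fraction`. The function names are ours.

```python
from fractions import Fraction


def first_primes(k):
    """List of the first k primes, by trial division against the primes found so far."""
    primes, candidate = [], 2
    while len(primes) < k:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def phi_ratio(k):
    """phi(n)/n for n the product of the first k primes."""
    ratio = Fraction(1)
    for p in first_primes(k):
        ratio *= Fraction(p - 1, p)
    return ratio


for k in (4, 10, 30):
    print(k, float(phi_ratio(k)))
```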


Exercise 3.5.15† . Let P (n) denote the greatest prime factor of any rational integer n ≥ 1 (P (1) = 0).
Let ε > 0 be a real number. Prove that there exist infinitely many rational integers n ≥ 2 such that

$$P(n - 1), P(n), P(n + 1) < n^{\varepsilon}.$$

Solution

We choose $n = 2^{p_1 \cdots p_k}$, where $p_1, \ldots, p_k$ are the first $k$ odd primes. It is clear that $P(n) = n^{o(1)}$. By factorising the other two numbers into cyclotomic polynomials, we get that $P(n \pm 1)$ is at most

$$\max_{d \mid p_1 \cdots p_k} \left(|\Phi_d(2)|, |\Phi_d(-2)|\right) \leq 3^{\varphi(p_1 \cdots p_k)} = 2^{o(p_1 \cdots p_k)}$$

since $\frac{\varphi(p_1 \cdots p_k)}{p_1 \cdots p_k} \to 0$ by Exercise 3.5.14†.


Exercise 3.5.16† (Brazilian Mathematical Olympiad 1995). Let P (n) denote the greatest prime
factor of any rational integer n ≥ 1. Prove that there exist infinitely many rational integers n ≥ 2 such
that
P (n − 1) < P (n) < P (n + 1).

Solution
Let $p$ be an odd prime. Let $k \geq 0$ be the smallest integer such that $P(p^{2^k} + 1) > p$; there exists one since $P(p^{2^k} + 1) \to \infty$ by Zsigmondy (one may also note that two numbers of the form $p^{2^k} + 1$ have gcd 2). Note that $k \geq 1$ since $P(p + 1) < p$. We claim that $n = p^{2^k}$ works. Indeed, we have $P(p^{2^k} + 1) > p = P(p^{2^k})$ by assumption, and

$$P(p^{2^k} - 1) = P\left((p - 1)\prod_{i=0}^{k-1} \left(p^{2^i} + 1\right)\right) < p$$

by minimality of $k$.
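The construction is explicit enough to be run for small primes. The sketch below computes the greatest prime factor by trial division, so it is only practical when the minimal $k$ is small; the function names are ours.

```python
def largest_prime_factor(m):
    """Greatest prime factor of m >= 2, by trial division (returns 1 for m = 1)."""
    largest, d = 1, 2
    while d * d <= m:
        while m % d == 0:
            largest, m = d, m // d
        d += 1
    return m if m > 1 else largest


def find_n(p):
    """n = p^(2^k) for the smallest k with P(p^(2^k) + 1) > p, where p is an odd prime."""
    k = 0
    while largest_prime_factor(p**(2**k) + 1) <= p:
        k += 1
    return p**(2**k)


for p in (3, 5, 7, 11, 13):
    n = find_n(p)
    print(n, [largest_prime_factor(x) for x in (n - 1, n, n + 1)])
```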


Exercise 3.5.18† (Structure of units of Z/nZ). Let $p$ be an odd rational prime and $n \geq 1$ an integer. Prove that there is a primitive root modulo $p^n$, i.e. a number $g$ which generates all the numbers coprime with $p$ modulo $p^n$. Moreover, show that there doesn’t exist a primitive root mod $2^n$ for $n \geq 3$, but that, in that case, there exist a rational integer $g$ and a rational integer $a$ such that each odd rational integer is congruent to either $g^k$ or $ag^k$ for some $k$ modulo $2^n$.²

Solution

Let $g \in \mathbb{Z}$ be a primitive root modulo $p$. Then, if $g^{p-1} \not\equiv 1 \pmod{p^2}$, we have $v_p(g^{n(p-1)} - 1) = 1 + v_p(n)$ by LTE, which shows that $g$ is a primitive root modulo $p^n$ for any $n$. If $g^{p-1} \equiv 1 \pmod{p^2}$, then $g + p$ is also a primitive root modulo $p$ and

$$(g + p)^{p-1} \equiv g^{p-1} + p(p - 1)g^{p-2} \not\equiv 1 \pmod{p^2}$$

so our previous argument shows that $g + p$ is a primitive root modulo any power of $p$.

For $p = 2$, we have $v_2(g^k - 1) = v_2(k) + v_2(g^2 - 1) - 1 \geq v_2(k) + 2$ for odd $g$ and even $k$ since $8 \mid g^2 - 1$, so the order of any odd integer modulo $2^n$ divides $2^{n-2} < \varphi(2^n)$. However, the same argument as before shows that, if $g^2 \not\equiv 1 \pmod{16}$ (e.g. $g = 3$), then $g$ has order exactly $2^{n-2}$. Then, note that powers of $g$ are all congruent to 1 or $g$ modulo 8, and since there are exactly $2^{n-2}$ such elements modulo $2^n$, this means that they go through all of them. Thus, if $a \not\equiv 1, g \pmod 8$ is odd, every odd element of $\mathbb{Z}/2^n\mathbb{Z}$ can be represented in exactly one way as either $g^k$ or $ag^k$ as wanted.
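Both parts of the solution can be tested numerically. The sketch below follows the recipe of the proof for odd $p$ (take a primitive root $g$ modulo $p$ and replace it by $g + p$ if $g^{p-1} \equiv 1 \pmod{p^2}$), and checks the case $p = 2$ with $g = 3$ and $a = -1$. The function names and these particular choices are ours.

```python
def order(g, m):
    """Multiplicative order of g modulo m; assumes gcd(g, m) = 1."""
    n, x = 1, g % m
    while x != 1:
        x = x * g % m
        n += 1
    return n


def lift_primitive_root(p):
    """A primitive root modulo every power of the odd prime p, following the proof."""
    g = next(g for g in range(1, p) if order(g, p) == p - 1)
    if pow(g, p - 1, p * p) == 1:
        g += p
    return g


def generated_mod_power_of_two(n, g=3, a=-1):
    """Residues of the form g^k or a * g^k modulo 2^n."""
    m = 2**n
    powers = {pow(g, k, m) for k in range(2**(n - 2))}
    return powers | {a * x % m for x in powers}


g = lift_primitive_root(7)
print(g, [order(g, 7**n) for n in (1, 2, 3)])
print(generated_mod_power_of_two(5) == set(range(1, 32, 2)))
```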


Coefficients of Cyclotomic Polynomials


Exercise 3.5.20†. Let $m \geq 0$ be an integer. Prove that the coefficient of $X^m$ of $\Phi_n$ is bounded when $n$ varies.

Solution

This follows from the formula $\Phi_n = \prod_{d \mid n} (X^d - 1)^{\mu(n/d)}$ of Exercise 3.5.19. Indeed, modulo $X^{m+1}$, all terms with $d > m$ become $\pm 1$ and we are left with a finite number of cases. ($(X^d - 1)^{-1}$ is to be interpreted as the inverse of $X^d - 1$ modulo $X^{m+1}$.)


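2 In group-theoretic terms, this says that $(\mathbb{Z}/p^n\mathbb{Z})^\times \simeq \mathbb{Z}/\varphi(p^n)\mathbb{Z}$ and that $(\mathbb{Z}/2^n\mathbb{Z})^\times \simeq (\mathbb{Z}/2\mathbb{Z}) \times (\mathbb{Z}/2^{n-2}\mathbb{Z})$ for $n \geq 2$. The Chinese remainder theorem then yields

$$(\mathbb{Z}/2^n p_1^{n_1} \cdots p_m^{n_m}\mathbb{Z})^\times \simeq (\mathbb{Z}/2\mathbb{Z}) \times (\mathbb{Z}/2^{n-2}\mathbb{Z}) \times (\mathbb{Z}/\varphi(p_1^{n_1})\mathbb{Z}) \times \ldots \times (\mathbb{Z}/\varphi(p_m^{n_m})\mathbb{Z}).$$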

Remark 3.5.2
In fact, if we define $\mu(x)$ to be 0 when $x$ is not an integer, we get

$$\Phi_n = \prod_{d=1}^{\infty} (1 - X^d)^{\mu(n/d)}$$

for $n \geq 2$, since the total number of times $\mu(n/d)$ is $\pm 1$ is even, for it is $2^r$ where $r$ is the number of prime factors of $n$. This can be used to give explicit formulas for the coefficients of $\Phi_n$, since the coefficient $a_n(k)$ of $X^k$ depends on the finite product

$$\prod_{d=1}^{k} (1 - X^d)^{\mu(n/d)},$$

which we can expand as

$$\prod_{d=1}^{k} \sum_{i} \binom{\mu(n/d)}{i} (-X^d)^i$$

(this is an equality of formal power series) and then extract the coefficient of $X^k$ of this expression. For instance, we get the formulas

$$a_n(1) = -\mu(n)$$
$$a_n(2) = \frac{\mu(n)(\mu(n) - 1)}{2} - \mu(n/2)$$
$$a_n(3) = \frac{\mu(n)(\mu(n) - 1)}{2} + \mu(n/2)\mu(n) - \mu(n/3).$$
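These three formulas can be compared with the actual coefficients for $n \geq 2$. The sketch below computes $\Phi_n$ independently, by dividing $X^n - 1$ by the $\Phi_d$ with $d \mid n$, $d < n$, and then tests the formulas. Polynomials are lists of integer coefficients, lowest degree first; all function names are ours.

```python
from functools import lru_cache


def mobius(n):
    """Moebius function by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result


def mu_ratio(n, d):
    """mu(n/d), extended by 0 when d does not divide n."""
    return mobius(n // d) if n % d == 0 else 0


def poly_div_exact(num, den):
    """Quotient num / den for integer polynomials; den must be monic and divide num."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for i in range(len(quot) - 1, -1, -1):
        quot[i] = num[i + len(den) - 1]
        for j, c in enumerate(den):
            num[i + j] -= quot[i] * c
    return quot


@lru_cache(maxsize=None)
def cyclotomic(n):
    """Coefficients of Phi_n as a tuple, lowest degree first."""
    poly = [-1] + [0] * (n - 1) + [1]  # X^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = poly_div_exact(poly, cyclotomic(d))
    return tuple(poly)


def coefficient(n, k):
    """Coefficient a_n(k) of X^k in Phi_n."""
    poly = cyclotomic(n)
    return poly[k] if k < len(poly) else 0


def formulas_hold(n):
    """Check the formulas for a_n(1), a_n(2), a_n(3) of the remark, for n >= 2."""
    m = mobius(n)
    return (coefficient(n, 1) == -m
            and coefficient(n, 2) == m * (m - 1) // 2 - mu_ratio(n, 2)
            and coefficient(n, 3) == m * (m - 1) // 2 + mu_ratio(n, 2) * m - mu_ratio(n, 3))


print(all(formulas_hold(n) for n in range(2, 101)))
```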

Exercise 3.5.21†. Let $\psi(x) = \sum_{p^\alpha \leq x} \log p$. By noticing that

$$\exp(\psi(2n + 1)) \int_0^1 x^n(1 - x)^n \, dx \leq \frac{\exp(\psi(2n + 1))}{4^n},$$

prove that $\pi(n)$, the number of primes at most $n$, is greater than $Cn/\log n$ for some constant $C > 0$.

Solution
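We have $x(1 - x) \leq \frac{1}{4}$ for $x \in [0, 1]$ so

$$\int_0^1 x^n(1 - x)^n \, dx \leq \int_0^1 \frac{1}{4^n} \, dx = \frac{1}{4^n}.$$

However, since $\exp(\psi(2n + 1)) = \operatorname{lcm}(1, \ldots, 2n + 1)$, $\exp(\psi(2n + 1)) \int_0^1 x^n(1 - x)^n \, dx$ is a positive integer, since if $X^n(1 - X)^n = \sum_i a_iX^i$ we get

$$\int_0^1 x^n(1 - x)^n \, dx = \sum_i \frac{a_i}{i + 1}.$$

Hence, $\exp(\psi(2n + 1)) \geq 4^n$, which implies $\psi(2n + 1) \geq 2n \log 2$. In particular, $\psi(2n) \geq 2(n - 1) \log 2$, so $\psi(n) \geq (n - 2) \log 2$ for all $n$. Since

$$\psi(n) = \sum_{p \leq n} \lfloor \log_p(n) \rfloor \log p = \sum_{p \leq n} \left\lfloor \frac{\log n}{\log p} \right\rfloor \log p \leq \pi(n) \log n,$$

we get $\pi(n) \geq \frac{(n - 2) \log 2}{\log n}$ as wanted.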
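The two inequalities obtained above, $\operatorname{lcm}(1, \ldots, 2n + 1) \geq 4^n$ and $\pi(n) \geq \frac{(n - 2)\log 2}{\log n}$, are easy to test numerically. The sketch below does so over a small range; the function names and the ranges are ours, and the second check uses floating-point logarithms.

```python
from math import gcd, log


def lcm_up_to(m):
    """lcm(1, ..., m)."""
    result = 1
    for k in range(2, m + 1):
        result = result * k // gcd(result, k)
    return result


def prime_counts(limit):
    """List pi with pi[n] = number of primes <= n, for 0 <= n <= limit (sieve of Eratosthenes)."""
    is_prime = [False, False] + [True] * (limit - 1)
    for d in range(2, int(limit**0.5) + 1):
        if is_prime[d]:
            for multiple in range(d * d, limit + 1, d):
                is_prime[multiple] = False
    pi, count = [], 0
    for n in range(limit + 1):
        count += is_prime[n]
        pi.append(count)
    return pi


pi = prime_counts(3000)
print(all(lcm_up_to(2 * n + 1) >= 4**n for n in range(1, 150)))
print(all(pi[n] >= (n - 2) * log(2) / log(n) for n in range(2, 3001)))
```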


Exercise 3.5.22†. Let $m \geq 3$ be an odd integer and suppose that $p_1 < \ldots < p_m = p$ are rational primes such that $p_1 + p_2 > p_m$ and let $n = p_1 \cdots p_m$. What are the coefficients of $X^p$ and $X^{p-2}$ of $\Phi_n$? Deduce that any rational integer arises as a coefficient of a cyclotomic polynomial.³

Solution

By Exercise 3.5.19, we have $\Phi_n = \prod_{d \mid n} (X^d - 1)^{\mu(n/d)}$. Modulo $X^{p+1}$, if $d \geq p + 1$, $(X^d - 1)^{\mu(n/d)}$ becomes $-1$, since $n/d$ is always squarefree. Hence,

$$\Phi_n = \prod_{d \mid n} (X^d - 1)^{\mu(n/d)} \equiv \frac{(X^{p_1} - 1) \cdots (X^{p_m} - 1)}{X - 1} \pmod{X^{p+1}}$$

since we removed $2^m - (m + 1)$ factors, which is even by assumption, so the sign doesn’t change. Moreover, since $p_i + p_j \geq p + 1$ for any $i \neq j$, we have

$$\Phi_n \equiv \frac{X^p - 1}{X - 1}(1 - X^{p_1}) \cdots (1 - X^{p_{m-1}}) \equiv (1 + X + \ldots + X^{p-1})(1 - X^{p_1} - X^{p_2} - \ldots - X^{p_{m-1}}) \pmod{X^{p+1}}$$

so the coefficient of $X^p$ is $-m + 1$ since each monomial of the second factor has a contribution of $-1$ except the first one, which has no contribution since the degree of the first factor is less than $p$. Similarly, the coefficient of $X^{p-2}$ is $-m + 2$ since now the first monomial of the second factor has a contribution of 1 since the degree of the first factor is large enough.

Suppose for the sake of contradiction that there are no odd primes $p_1 < \ldots < p_m = p$ such that $p_1 + p_2 > p$. In particular, if $p_1 < \ldots < p_m$, we have $p_m > 2p_1$. Hence, the number of primes between $2^k$ and $2^{k+1}$ is always less than $m$. As a consequence, the number of primes less than $2^k$, $\pi(2^k)$, is less than $km$. This contradicts Exercise 3.5.21†. This shows that any negative coefficient can be represented, and for the positive coefficients (we can trivially get 0, e.g. $\Phi_9 = X^6 + X^3 + 1$) simply consider $\Phi_{2n}$ which is $\Phi_n(-X)$ for odd $n$: this negates our coefficients since $p$ and $p - 2$ are odd.
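The computation modulo $X^{p+1}$ translates directly into code: it suffices to multiply and divide truncated power series by the factors $1 - X^d$ with $d \leq p$, using the form $\Phi_n = \prod_{d \mid n} (1 - X^d)^{\mu(n/d)}$ valid for $n \geq 2$ (Remark 3.5.2). The sketch below does this for $n = 3 \cdot 5 \cdot 7$ and for $n = 11 \cdot 13 \cdot 17 \cdot 19 \cdot 23$ (our choice of examples); the function names are ours.

```python
def mobius(n):
    """Moebius function by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result


def cyclotomic_mod(n, N):
    """Coefficients of Phi_n modulo X^N, lowest degree first, for n >= 2."""
    series = [1] + [0] * (N - 1)
    for d in range(1, min(n, N - 1) + 1):
        if n % d:
            continue
        mu = mobius(n // d)
        if mu == 1:
            # multiply by 1 - X^d
            for i in range(N - 1, d - 1, -1):
                series[i] -= series[i - d]
        elif mu == -1:
            # divide by 1 - X^d, i.e. multiply by 1 + X^d + X^(2d) + ...
            for i in range(d, N):
                series[i] += series[i - d]
    return series


print(cyclotomic_mod(105, 8))
n = 11 * 13 * 17 * 19 * 23
coefficients = cyclotomic_mod(n, 24)
print(coefficients[23], coefficients[21])
```

With $m = 5$ and $p = 23$, the two printed coefficients should be $-m + 1 = -4$ and $-m + 2 = -3$.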


Exercise 3.5.23† . Let p and q be two rational primes. Prove that the coefficients of Φpq are in
{−1, 0, 1}.

Solution

Let $a$ and $b$ be non-negative rational integers such that $ap + bq = \varphi(pq)$; there exist such integers by ??. We claim that

$$\Phi_{pq} = \left(\sum_{i=0}^{a} X^{pi}\right)\left(\sum_{j=0}^{b} X^{qj}\right) - X^{-pq}\left(\sum_{i=a+1}^{q-1} X^{pi}\right)\left(\sum_{j=b+1}^{p-1} X^{qj}\right).$$

Note that this is monic and has degree $ap + bq = \varphi(pq)$, so it suffices to show that it is zero at any primitive $pq$th root of unity $\omega$. Here is a hint of motivation for this formula (which I don’t find extremely convincing, if anyone has something better please contact me): we start with the equations

$$\Phi_p(\omega^q) = \sum_{i=0}^{p-1} \omega^{qi} = 0$$

3 This may come off as a bit surprising considering that all the cyclotomic polynomials we saw had only ±1 and 0 coefficients.

and

$$\Phi_q(\omega^p) = \sum_{j=0}^{q-1} \omega^{pj} = 0.$$

To construct a polynomial vanishing at $\omega$, we can consider polynomials in $\Phi_p(X^q)$ and $\Phi_q(X^p)$, but it is easy to see that this will have a degree which is too high. Then, we try splitting the sum $\Phi_p(\omega^q)$, but it is again easy to see that the degree will be too high, unless we factorise one of the parts by powers of $\omega$. However, for this to work, we need to factorise by exactly $X^{pq}$ (the exponent must be a multiple of $pq$ since $\omega$ has order $pq$, and the higher the exponent the more cancellation is needed, so the best guess is $X^{pq}$).
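Back to the problem, it is trivial to show that our polynomial is zero at $\omega$: we have $\sum_{i=0}^{a} \omega^{pi} = -\sum_{i=a+1}^{q-1} \omega^{pi}$ and $\sum_{j=0}^{b} \omega^{qj} = -\sum_{j=b+1}^{p-1} \omega^{qj}$, so

$$\left(\sum_{i=0}^{a} \omega^{pi}\right)\left(\sum_{j=0}^{b} \omega^{qj}\right) = \left(\sum_{i=a+1}^{q-1} \omega^{pi}\right)\left(\sum_{j=b+1}^{p-1} \omega^{qj}\right)$$

as wanted.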

Finally, let us return to the original problem. Showing that $\Phi_{pq}$ has coefficients in $\{-1, 0, 1\}$ is now equivalent to showing that there is at most one way to write any integer $n$ in the form $pi + qj$ for $i \in [[0, a]]$ and $j \in [[0, b]]$ or in the form $pi + qj - pq$ for $i \in [[a + 1, q - 1]]$ and $j \in [[b + 1, p - 1]]$.

For this, note that $n$ can be written in two ways if and only if there are distinct pairs $(i, j)$ and $(i', j')$ with $i, i' \in [[0, q - 1]]$ and $j, j' \in [[0, p - 1]]$ such that

$$pi + qj \equiv pi' + qj' \pmod{pq}.$$

(It is clear that two such expressions give us an equality of this form, and the converse follows from $|pi + qj - (pi' + qj')| < 2pq$, although it is technically not needed in our case.) This is equivalent to $p(i - i') \equiv q(j' - j) \pmod{pq}$, which implies that $p \mid j' - j$ so $j' = j$ and $q \mid i - i'$ so $i = i'$. This contradicts the assumption that $(i, j) \neq (i', j')$. (In fact, this is a special case of the Chinese remainder theorem: the map $\psi : \mathbb{Z}/q\mathbb{Z} \times \mathbb{Z}/p\mathbb{Z} \to \mathbb{Z}/pq\mathbb{Z}$ given by $(i, j) \mapsto pi + qj$ is bijective.)
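The explicit formula can be compared with $\Phi_{pq}$ for small distinct primes. In the sketch below, $\Phi_{pq}$ is computed as the power series $\frac{(1 - X)(1 - X^{pq})}{(1 - X^p)(1 - X^q)}$ truncated at degree $\varphi(pq)$, and the formula is expanded term by term. The function names are ours.

```python
def phi_pq(p, q):
    """Coefficients of Phi_pq (lowest degree first) for distinct primes p and q."""
    N = (p - 1) * (q - 1) + 1
    series = [1] + [0] * (N - 1)
    for i in range(N - 1, 0, -1):  # multiply by 1 - X
        series[i] -= series[i - 1]
    for d in (p, q):               # divide by 1 - X^d
        for i in range(d, N):
            series[i] += series[i - d]
    return series                  # the factor 1 - X^pq is 1 modulo X^N


def formula(p, q):
    """Coefficients given by the claimed formula, with a * p + b * q = phi(pq)."""
    phi = (p - 1) * (q - 1)
    a = next(a for a in range(q) if phi >= a * p and (phi - a * p) % q == 0)
    b = (phi - a * p) // q
    coeffs = [0] * (phi + 1)
    for i in range(a + 1):
        for j in range(b + 1):
            coeffs[p * i + q * j] += 1
    for i in range(a + 1, q):
        for j in range(b + 1, p):
            coeffs[p * i + q * j - p * q] -= 1
    return coeffs


for p, q in [(2, 3), (3, 5), (5, 7), (7, 11)]:
    print(p, q, formula(p, q) == phi_pq(p, q))
```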


Cyclotomic Fields and Fermat’s Last Theorem


Exercise 3.5.24† (Sophie Germain’s Theorem). Let $p$ be a Sophie Germain prime, i.e. a rational prime such that $2p + 1$ is also prime. Prove that the equation $a^p + b^p = c^p$ does not have rational integer solutions with $p \nmid abc$.

Solution

Suppose that $a^p + b^p = c^p$ for some coprime rational integers $a, b, c$ such that $p \nmid abc$. Modulo $q = 2p + 1$, $p$th powers are congruent to $\pm 1$ or 0, so $q \mid abc$, say $q \mid c$. We have

$$\Phi_2(a, b)\Phi_{2p}(a, b) = c^p$$

and the gcd of $\Phi_2(a, b)$ and $\Phi_{2p}(a, b)$ divides $p$ by LTE and Theorem 3.3.1. Since $p \nmid c$ by assumption, the two factors are coprime and hence are both $p$th powers. Modulo $q$, this implies that $a + b$ is congruent to 0 or $\pm 1$. The same goes for $a - c$ and $b - c$ by symmetry. Since

$$0 \equiv 2c = (a + b) - (a - c) - (b - c) \pmod q,$$

one of $a + b$, $a - c$ and $b - c$ must be divisible by $q$. If it is $a - c$ or $b - c$, then $q \mid a, b, c$, contradicting the hypothesis that they are coprime. Thus, $q \mid a + b$. Since $a - c \equiv a$ and $b - c \equiv b$ are also congruent to $\pm 1$ or 0 modulo $q$, we get $a \equiv -b \equiv \pm 1$, i.e. $a \equiv 1$ and $b \equiv -1$ without loss of generality. But then,

$$\Phi_{2p}(a, b) = \sum_{k=0}^{p-1} a^k(-b)^{p-1-k} \equiv p$$

which is not a $p$th power modulo $q$. This is a contradiction.
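The two facts about $p$th powers modulo $q = 2p + 1$ used in this proof (they are congruent to 0 or $\pm 1$, and $p$ itself is not one of them) can be checked for small Sophie Germain primes. The function names and the search range are ours.

```python
def is_prime(n):
    """Primality test by trial division."""
    return n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))


def pth_power_residues(p):
    """Set of p-th powers modulo q = 2p + 1."""
    q = 2 * p + 1
    return {pow(x, p, q) for x in range(q)}


sophie_germain = [p for p in range(2, 200) if is_prime(p) and is_prime(2 * p + 1)]
print(sophie_germain)
print(all(pth_power_residues(p) == {0, 1, 2 * p} for p in sophie_germain))
print(all(p not in pth_power_residues(p) for p in sophie_germain))
```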




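Exercise 3.5.25†. Let $\omega$ be an $n$th root of unity. Define $\mathbb{Q}(\omega)$ as $\mathbb{Q} + \omega\mathbb{Q} + \ldots + \omega^{n-1}\mathbb{Q}$. Prove that

$$\mathbb{Q}(\omega) \cap \mathbb{R} = \mathbb{Q}(\omega + \omega^{-1})$$

where $\mathbb{Q}(\omega + \omega^{-1}) = \mathbb{Q} + (\omega + \omega^{-1})\mathbb{Q} + \ldots + (\omega + \omega^{-1})^{n-1}\mathbb{Q}$.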

Solution

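Let $f(\omega) = \sum_i a_i\omega^i$ be a real element of $\mathbb{Q}(\omega)$. Note that

$$2f(\omega) = f(\omega) + f(\omega^{-1}) = \sum_i a_i(\omega^i + \omega^{-i}) \in \mathbb{Q}(\omega + \omega^{-1})$$

since $X^i + \frac{1}{X^i}$ is a polynomial with rational coefficients in $X + \frac{1}{X}$ by induction on $i$ or by the fundamental theorem of symmetric polynomials: $X^i + \frac{1}{X^i}$ is symmetric in $X$ and $\frac{1}{X}$ so it is a polynomial in $X + \frac{1}{X}$ and $X \cdot \frac{1}{X} = 1$. The reverse inclusion is clear since $\omega + \omega^{-1}$ is a real element of $\mathbb{Q}(\omega)$.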
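The induction mentioned above is the recurrence $P_{i+1}(t) = tP_i(t) - P_{i-1}(t)$ with $P_0 = 2$ and $P_1 = t$, where $X^i + X^{-i} = P_i(X + 1/X)$. Here is a short sketch of it, checked exactly with rational arithmetic at $X = 3/2$; the function names are ours.

```python
from fractions import Fraction


def power_sum_polynomial(i):
    """Coefficients (lowest degree first) of P_i with X^i + X^(-i) = P_i(X + 1/X)."""
    prev, cur = [2], [0, 1]
    if i == 0:
        return prev
    for _ in range(i - 1):
        nxt = [0] + cur              # t * P_i
        for k, c in enumerate(prev):
            nxt[k] -= c              # minus P_(i-1)
        prev, cur = cur, nxt
    return cur


def evaluate(poly, t):
    """Value of the polynomial with the given coefficients at t."""
    return sum(c * t**k for k, c in enumerate(poly))


x = Fraction(3, 2)
t = x + 1 / x
print(all(evaluate(power_sum_polynomial(i), t) == x**i + x**(-i) for i in range(11)))
```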


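Exercise 3.5.26†. Let $\omega$ be a primitive $p$th root of unity, where $p$ is prime. Prove that the ring of integers of $\mathbb{Q}(\omega)$, $\mathcal{O}_{\mathbb{Q}(\omega)} := \mathbb{Q}(\omega) \cap \overline{\mathbb{Z}}$, is

$$\mathbb{Z}[\omega] := \mathbb{Z} + \omega\mathbb{Z} + \ldots + \omega^{p-1}\mathbb{Z}.$$

(In fact this holds for any $n$th root of unity but it is harder to prove.)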

Solution
Suppose that $\sum_{i=0}^{p-2} a_i\omega^i = \alpha \in \overline{\mathbb{Z}}$ for some rational numbers $a_0, \ldots, a_{p-2}$. Then, the same is true for its conjugates $\sum_{i=0}^{p-2} a_i\omega^{ki} = \alpha_k$. If we consider this as a system of equations, we know from Proposition C.3.7 that $a_i$ times the determinant of $\Omega = (\omega^{ij})_{i,j \in [p-2]}$ is an algebraic integer for all $i$. Since this is the Vandermonde determinant $\prod_{1 \leq i < j \leq p-2} (\omega^j - \omega^i)$, we get

$$a_i \prod_{0 \leq j \neq k \leq p-1} (\omega^j - \omega^k) \in \overline{\mathbb{Z}}.$$

Finally, as we saw in Theorem 3.2.1, the product $\prod_{j \neq k} (\omega^j - \omega^k)$ is $\pm p^p$ by Exercise 3.2.2∗, which means that the denominator of the $a_i$ is a power of $p$. To conclude, we shall prove that if $\sum_{i=0}^{p-2} b_i\omega^i$ is divisible by $p$, then all $b_i$ are divisible by $p$, thus showing that the denominator of the $a_i$ is in fact not divisible by $p$, i.e. $a_i \in \mathbb{Z}$ as desired.

Hence, suppose that $\sum_{i=0}^{p-2} b_i\omega^i \equiv 0 \pmod p$. The same is true for its conjugates, and summing them we get $(p - 1)b_0 \equiv 0$, i.e. $p \mid b_0$. Since $\omega$ is invertible ($\omega^p = 1$), we can simply remove $b_0$, divide by $\omega$ and repeat this process to get $p \mid b_i$ for all $i$.

Exercise 3.5.27†. Let $\omega$ be a primitive $p$th root of unity, where $p$ is prime. Prove that $p = u(1 - \omega)^{p-1}$, where $u \in \overline{\mathbb{Z}}$ is a unit of $\overline{\mathbb{Z}}$, i.e. $1/u$ is also an algebraic integer. Deduce that $1 - \omega$ is prime in $\mathbb{Q}(\omega)$.

Solution

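We have

$$p = \Phi_p(1) = \prod_{k=1}^{p-1} (1 - \omega^k)$$

so we want to prove that $\frac{1 - \omega^k}{1 - \omega}$ is a unit for every $p \nmid k$. Note that this already shows that $1 - \omega$ is prime since it has prime norm (the norm of $f(\omega)$ is defined as $\prod_{k=1}^{p-1} f(\omega^k)$ and this is clearly multiplicative). The quotient $\frac{1 - \omega^k}{1 - \omega} = 1 + \omega + \ldots + \omega^{k-1}$ is an algebraic integer. We wish to show that $\frac{1 - \omega}{1 - \omega^k}$ is also an algebraic integer, which is true by symmetry between primitive roots of unity: if $\zeta = \omega^k$ and $\omega = \zeta^\ell$ then this is just $\frac{1 - \zeta^\ell}{1 - \zeta}$.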


Exercise 3.5.28† (Kummer). Let $\omega$ be a root of unity of odd prime order $p$ and suppose $\varepsilon$ is a unit of $\mathbb{Q}(\omega)$. Prove that $\varepsilon = \eta\omega^n$ for some $n \in \mathbb{Z}$ and $\eta \in \mathbb{R}$.

Solution

Let $\varepsilon = f(\omega)$ be a unit of $\mathbb{Q}(\omega)$. Consider $\theta = \varepsilon/\overline{\varepsilon} = f(\omega)/f(\omega^{-1})$. Then, its conjugates are $f(\omega^k)/f(\omega^{-k})$ which all have modulus 1, so $\theta$ is a root of unity by Kronecker’s theorem 1.5.26†. We shall now analyze the roots of unity of $\mathbb{Q}(\omega)$: by Bézout, if $\zeta \in \mathbb{Q}(\omega)$ is a primitive $m$th root of unity, then $\xi \in \mathbb{Q}(\omega)$ where $\xi$ is a primitive $\operatorname{lcm}(m, p)$th root of unity. Indeed,

$$\exp\left(\frac{2i\pi}{m}\right)^a \exp\left(\frac{2i\pi}{p}\right)^b = \exp\left(\frac{2i\pi}{\operatorname{lcm}(p, m)}\right)$$

where $ap + bm = \gcd(m, p)$ by Bézout’s lemma. However, the degree of a primitive $kp$th root of unity is $\varphi(kp)$ which is always greater than $\varphi(p)$ (which is the maximum degree of an element of $\mathbb{Q}(\omega)$ by the fundamental theorem of symmetric polynomials), except when $k \leq 2$. Thus, the roots of unity of $\mathbb{Q}(\omega)$ have the form $\pm\omega^k$, and this means that $\theta = \pm\omega^n$ for some $n$.

Without loss of generality, we may assume that $n$ is even (by replacing it by $n + p$ if necessary). Then, consider $\eta = \varepsilon\omega^{-n/2}$. We wish to prove that it is real. By definition, $\eta/\overline{\eta} = \pm 1$, so it is either real or purely imaginary: we want to rule the second case out. Thus, suppose that $\overline{\eta} = -\eta$. We claim that $\eta$ is divisible by $1 - \omega$, and thus not a unit by Exercise 3.5.27†. Since $1 - \omega \mid p$, 2 is invertible modulo $1 - \omega$ so it suffices to show that $2\eta = \eta - \overline{\eta}$ is divisible by $1 - \omega$. Finally, if $\eta = \sum_i a_i\omega^i$ then

$$\eta - \overline{\eta} = \sum_i a_i(\omega^i - \omega^{-i})$$

which is divisible by $1 - \omega$ since $1 - \omega \mid 1 - \omega^{2i} = \omega^i(\omega^{-i} - \omega^i)$.




Exercise 3.5.29†. Let $\alpha \in \mathbb{Z}[\omega]$, where $\omega$ is a primitive $p$th root of unity. Prove that $\alpha^p$ is congruent to a rational integer modulo $p$.

Solution

Note that

$$(a_0 + a_1\omega + \ldots + a_{p-1}\omega^{p-1})^p \equiv a_0^p + a_1^p\omega^p + \ldots + a_{p-1}^p\omega^{p(p-1)} \equiv a_0 + a_1 + \ldots + a_{p-1} \pmod p$$

by Frobenius.


Exercise 3.5.30† (Kummer). Let $p$ be an odd prime and $\omega$ a primitive $p$th root of unity. Suppose that $\mathbb{Z}[\omega]$ is a UFD.⁴ Prove that there do not exist non-zero rational integers $a, b, c \in \mathbb{Z}$ such that

$$a^p + b^p + c^p = 0.$$

(You may assume that, if a unit of $\mathbb{Z}[\omega]$ is congruent to a rational integer modulo $p$, it is a $p$th power of a unit. This is known as "Kummer’s lemma". See Borevich-Shafarevich [7, Chapter 5, Section 6] or Conrad [12] for a $(1 - \omega)$-adic proof of this.)

Solution

Suppose that there are non-zero coprime $a, b, c \in \mathbb{Z}$ such that $a^p + b^p = c^p$ and, without loss of generality, $p \nmid a, b$. Working in $\mathbb{Z}[\omega]$, this gives us

$$(a + b)(a + \omega b) \cdots (a + \omega^{p-1}b) = c^p.$$

The gcd of two factors divides $(\omega^i - 1)b$ and $(\omega^j - 1)a$ for some $p \nmid i, j$. Since $\frac{\omega^k - 1}{\omega - 1}$ is a unit whenever $p \nmid k$ by Exercise 3.5.27†, the gcd of two factors divides $(\omega - 1)a$ and $(\omega - 1)b$, so divides $\omega - 1$. Since $\omega - 1$ is prime by the same exercise, either all factors are divisible by $1 - \omega$ (since $a + b\omega^k \equiv a + b \pmod{1 - \omega}$) or none of them are. We will distinguish these two cases.

First, suppose that $1 - \omega \nmid a + b$. This corresponds to $p \nmid c$. Then, by unique factorisation, there are units $\varepsilon_k \in \mathbb{Z}[\omega]^\times$ and elements $c_k \in \mathbb{Z}[\omega]$ such that $a + b\omega^k = \varepsilon_kc_k^p$. Consider $k = 1$, set $\varepsilon = \varepsilon_1$ and write $\varepsilon = \omega^m\overline{\varepsilon}$ by Exercise 3.5.28†. Then, since $c_1^p \equiv \overline{c_1}^p \pmod p$ by Exercise 3.5.29†, we have

$$\begin{aligned}
a + b\omega &= \varepsilon c_1^p \\
&\equiv \varepsilon\overline{c_1}^p \\
&= \omega^m\overline{\varepsilon}\,\overline{c_1}^p \\
&= \omega^m\,\overline{a + b\omega} \\
&= a\omega^m + b\omega^{m-1}.
\end{aligned}$$

Hence, $p \mid a + b\omega - a\omega^m - b\omega^{m-1}$. If $m \neq 1, 0$, then the coefficient of $\omega^m$ of this expression is $-a$ and this is not divisible by $p$, so the expression isn’t either by Exercise 3.5.26†. Similarly, when $m \neq 1, 2$, the coefficient of $\omega^{m-1}$ is $-b$, which isn’t divisible by $p$. Thus, $m = 1$, which yields $a \equiv b \pmod p$. But then, by symmetry, we must also have $a \equiv -c \pmod p$. This implies that $0 = a^p + b^p - c^p \equiv 3a^p$, which forces $p = 3$. It is however easy to see that $a^3 + b^3 = c^3$ has no solution with $3 \nmid abc$ by working modulo 9, which finishes the first case.

Now, we consider the second case. As in our proof of Theorem 2.4.1, we consider the more general equation

$$\alpha^p + \beta^p = \varepsilon(1 - \omega)^{pn}\gamma^p$$

4 Sadly, it has been proven that $\mathbb{Z}[\omega]$ is only a UFD when $p \in \{3, 5, 7, 11, 13, 17, 19\}$. This approach works however almost verbatim when the class number $h$ of $\mathbb{Q}(\omega)$ is not divisible by $p$. The case $h = 1$ corresponds to $\mathbb{Z}[\omega]$ being a UFD. That said, it has not been proven that there exist infinitely many $p$ such that $p \nmid h$ (but it has been conjectured to be the case), while it has been proven that there exist infinitely many $p$ such that $p \mid h$.

with coprime $\alpha, \beta, \gamma \in \mathbb{Z}[\omega]$ not divisible by $1 - \omega$, $\varepsilon \in \mathbb{Z}[\omega]^\times$ a unit, and $n \geq 1$. Suppose that $\alpha, \beta, \gamma$ is a non-trivial solution with minimal $n$. As we saw before, the gcd of two numbers of the form $\alpha + \beta\omega^k$ is $1 - \omega$. First, we prove that there are no solutions when $n = 1$. In that case, $v_{1-\omega}(\alpha + \beta\omega^k)$ must be 1 for all $k$, which implies that the numbers $\alpha + \beta\omega^k$ are non-zero multiples of $1 - \omega$ modulo $(1 - \omega)^2$. Since there are only $p - 1$ such multiples as $|(\mathbb{Z}[\omega]/(1 - \omega)\mathbb{Z}[\omega])^\times| = p - 1$, two of them must be congruent modulo $(1 - \omega)^2$, which is impossible as we saw previously. Hence, $n \geq 2$.

By replacing $\beta$ by $\beta\omega^m$ for some $m$, we may assume that $v_{1-\omega}(\alpha + \beta\omega^k) = 1$ for all $p \nmid k$ and $v_{1-\omega}(\alpha + \beta) = p(n - 1) + 1$. By unique factorisation, set $\alpha + \beta\omega = \eta(1 - \omega)\rho^p$, $\alpha + \beta\omega^{-1} = \eta'(1 - \omega)\rho'^p$ and $\alpha + \beta = \mu(1 - \omega)^{p(n-1)+1}\tau^p$ for some units $\eta, \eta', \mu$. Then, since

$$(\alpha + \beta\omega) + \omega(\alpha + \beta\omega^{-1}) = (\omega + 1)(\alpha + \beta),$$

we get

$$\eta\rho^p + \omega\eta'\rho'^p = (\omega + 1)\mu(1 - \omega)^{p(n-1)}\tau^p.$$

Dividing by $\eta$ and noticing that $\omega + 1 = \frac{\omega^2 - 1}{\omega - 1}$ is a unit, this gives us an equation of the form $x^p + uy^p = v(1 - \omega)^{p(n-1)}z^p$ for $x, y, z$ not divisible by $1 - \omega$ and $u, v$ units. We wish to prove that $u$ is a $p$th power. This is where we use this fundamental lemma of Kummer: modulo $p$, $u$ is congruent to the $p$th power $(-x/y)^p$ so is a $p$th power itself. This contradicts the minimality since $n - 1 \geq 1$, so we are done.


Exercise 3.5.31† (Fleck’s Congruences). Let $n \geq 1$ be an integer, $p$ a rational prime and $q = \left\lfloor \frac{n-1}{p-1} \right\rfloor$. Prove that, for any rational integer $m$,

$$p^q \mid \sum_{k \equiv m \pmod p} (-1)^k\binom{n}{k}.$$

Solution

Let $\omega$ be a primitive $p$th root of unity. We use a unity root filter on the polynomial $X^{-m \pmod p}(1 - X)^n$ (see Exercise A.3.9†):

$$S := \sum_{k \equiv m \pmod p} (-1)^k\binom{n}{k} = \frac{\sum_{k} \omega^{-km}(1 - \omega^k)^n}{p}.$$

Now, note that the numerator is divisible by $(1 - \omega)^n$. This means that $v_{1-\omega}(S) \geq n - (p - 1)$ since $v_{1-\omega}(p) = p - 1$ by Exercise 3.5.27† (and $1 - \omega$ is prime in $\mathbb{Q}(\omega)$). Thus,

$$v_p(S) \geq \frac{v_{1-\omega}(S)}{p - 1} \geq \frac{n}{p - 1} - 1$$

which implies that $v_p(S) \geq \left\lceil \frac{n}{p-1} \right\rceil - 1 = \left\lfloor \frac{n-1}{p-1} \right\rfloor$ as wanted.


Miscellaneous
Exercise 3.5.33† (Korea Winter Program Practice Test 1 2019). Find all non-zero polynomials $f \in \mathbb{Z}[X]$ such that, for any prime number $p$ and any integer $n$, if $p \nmid n, f(n)$, the order of $f(n)$ modulo $p$ is at most the order of $n$ modulo $p$.

Solution

Let’s see what the condition means: it says that, if $n$ is an $m$th root of unity in $\mathbb{F}_p$, then $f(n)$ is either zero or a root of unity of order $\leq m$ in $\mathbb{F}_p$. Thus, if we remember Proposition 1.3.1, we might try to prove that the same holds over $\mathbb{C}$. In fact we only need an assertion a lot weaker than this to finish with Exercise A.3.28†, but since we can prove the general result directly with cyclotomic polynomials let’s do it.

Let $m \geq 1$ be an integer and $\omega$ a complex primitive $m$th root of unity. Let $p \equiv 1 \pmod m$ be a rational prime and $z \in \mathbb{F}_p$ an element of order $m$. Then, $f(z)$ has order at most $m$ (or is zero). There are infinitely many such primes, so let $m'$ be such that for infinitely many $p \equiv 1 \pmod m$, $f(z)$ has order $m'$ (or is zero). Then,

$$\prod_{\gcd(k, m) = 1} f(\omega^k)\Phi_{m'}(f(\omega^k)) \equiv \prod_{\gcd(k, m) = 1} f(z^k)\Phi_{m'}(f(z^k)) \equiv 0 \pmod p$$

is divisible by infinitely many primes so must be zero, i.e. $f(\omega^k) = 0$ or $\Phi_{m'}(f(\omega^k)) = 0$ for some $k$. We have shown our claim: $f(\omega)$ is zero or a root of unity of order $m' \leq m$.

Finally, we can use Exercise A.3.28†: its assumption is a bit weaker than what we have, but we can see that it works for any polynomial which sends infinitely many points on the unit circle to itself, which is clearly the case here. Thus, $f = \pm X^k$ for some $k$ since real numbers of the unit circle are $\pm 1$, and it is easily seen that $-X^k$ does not work as $f(1)$ is a root of unity of order at most 1. Conversely, it is easy to see that $X^k$ works.


Exercise 3.5.34† (Korea Mathematical Olympiad Final Round 2019). Show that there exist infinitely many positive integers $k$ such that the sequence $(a_n)_{n \geq 0}$ defined by $a_0 = 1$, $a_1 = k + 1$ and

$$a_{n+2} = ka_{n+1} - a_n$$

for $n \geq 0$ contains no prime number.

Solution
Using Theorem C.4.1, we can see that $a_n = \frac{\alpha^n(1+\alpha) - \beta^n(1+\beta)}{\alpha-\beta}$, where α and β are the roots of the
characteristic polynomial $X^2 - kX + 1$. Indeed, we have $a_0 = 1$ and $a_1 = \alpha + \beta + 1 = k + 1$.
Let’s express this formula in a more convenient form:
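\[
\begin{aligned}
a_n &= \frac{\alpha^n(1+\alpha) - \beta^n(1+\beta)}{\alpha-\beta} \\
&= \frac{\alpha^n(1+\alpha) - \frac{1}{\alpha^n}\left(1+\frac{1}{\alpha}\right)}{\alpha - \frac{1}{\alpha}} \\
&= \frac{\alpha^n(1+\alpha) - \frac{1}{\alpha^n}\cdot\frac{1+\alpha}{\alpha}}{\frac{(1+\alpha)(\alpha-1)}{\alpha}} \\
&= \frac{1}{\alpha^n}\cdot\frac{\alpha^{2n+1}-1}{\alpha-1}.
\end{aligned}
\]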
These manipulations might seem a bit random at first, but they are very simple and motivated:
we have simply replaced β by α1 and simplified it as much as possible. We can now see where
cyclotomic polynomials appear:
\[ \frac{\alpha^{2n+1}-1}{\alpha-1} = \prod_{d \mid 2n+1,\ d>1} \Phi_d(\alpha) \]

is a product of cyclotomic polynomials! Note however that this product is trivial when 2n+1 = p
is prime, and that it is a product of cyclotomic polynomials evaluated at quadratic integers, so
we need to be a bit careful, but this is still a good sign. How could we transform this into a
non-trivial product even when 2n + 1 = p is prime? If α = γ m was an mth power, we would have

\[ \frac{\alpha^{2n+1}-1}{\alpha-1} = \frac{\gamma^{m(2n+1)}-1}{\gamma^m-1} = \prod_{d \mid m(2n+1),\ d \nmid m} \Phi_d(\gamma). \]

In particular, for m = 2, this product is always non-trivial. Note that given a quadratic integer
γ of norm 1, we can always construct a sequence $a_n$ associated with $\alpha = \gamma^m$, since α is also a
quadratic integer of norm 1, and quadratic integers of norm 1 are exactly the roots of polynomials
of the form $X^2 - kX + 1$. Now let’s show that all these α work.

To show this, we take the norm of Φd (γ): if δ is the conjugate of γ, we have

\[ a_n^2 = \frac{\gamma^{m(2n+1)}-1}{\gamma^m-1}\cdot\frac{\delta^{m(2n+1)}-1}{\delta^m-1} = \prod_{d \mid m(2n+1),\ d \nmid m} \Phi_d(\gamma)\Phi_d(\delta) \]

and $\Phi_d(\gamma)\Phi_d(\delta)$ is now a rational integer. First, we will prove that these factors are non-trivial,
and then that they cannot be all equal to a rational prime, thus establishing that $a_n^2$ has at least
two distinct prime factors so that $a_n$ isn’t prime as wanted. (Note that this last step isn’t needed
if we had chosen, say, m = 4, but we prefer to give the smallest possible m.)

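Without loss of generality suppose that γ > δ. Since $\Phi_d(\delta) = \Phi_d(\gamma)/\gamma^{\varphi(d)}$, we want to have

\[ \Phi_d(\gamma)^2 > \gamma^{\varphi(d)}. \]

Since $\Phi_d(\gamma) > (\gamma-1)^{\varphi(d)}$, this is true for $(\gamma-1)^2 \ge \gamma$, i.e. when $\gamma^2 + 1 = k\gamma \ge 3\gamma$. (This is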
not a bad result at all: for k < 3, the roots of X 2 − kX + 1 are either rational or non-real so it
is normal that the situation gets weirder there. In general, it is very hard to estimate the size of
linear recurrences with non-real roots. For instance, that’s why we have this condition on a2 − 4b
in Exercise 4.6.35† . See also Theorem 8.4.1.)

Now suppose that


Φ2n+1 (γ)Φ2n+1 (δ) = p = Φ2(2n+1) (γ)Φ2(2n+1) (δ).

If we were dealing with rational integers, we could say that this is impossible since $\frac{2(2n+1)}{2n+1}$
must be a power of p but p > 2 by our previous inequalities. We are dealing with quadratic
integers instead, but it is not that different: we just use higher finite fields instead of only $\mathbb{F}_p$
(see Chapter 4). If $\eta \in \mathbb{F}_{p^2}$ is a root of $\pi_\gamma = X^2 - kX + 1$, we get

Φ2n+1 (η) = Φ2(2n+1) (η) = 0

so η has order both $\frac{2n+1}{p^{v_p(2n+1)}}$ and $\frac{2(2n+1)}{p^{v_p(2(2n+1))}}$,
which implies that p = 2 as wanted.
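
As a numerical illustration of the case m = 2, take γ to be a root of $X^2 - cX + 1$ with c ≥ 3, so that $\alpha = \gamma^2$ is a root of $X^2 - kX + 1$ with $k = c^2 - 2$. The following Python sketch computes the first terms of the sequence for a few such k and checks that none of them is prime. The helper names `is_prime` and `sequence` are ours, and trial division is only adequate for the small values used here; this is a sanity check, not a proof.

```python
def is_prime(n):
    """Trial division; fine for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def sequence(k, length):
    """First `length` terms of a_0 = 1, a_1 = k + 1, a_{n+2} = k a_{n+1} - a_n."""
    terms = [1, k + 1]
    while len(terms) < length:
        terms.append(k * terms[-1] - terms[-2])
    return terms[:length]


# k = c^2 - 2 corresponds to alpha = gamma^2 with gamma a root of X^2 - cX + 1.
for c in range(3, 6):
    k = c * c - 2
    terms = sequence(k, 8)
    print(k, terms[:4], any(is_prime(t) for t in terms))
```

For k = 7 the sequence starts 1, 8, 55, 377, and 377 = 13 · 29.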


Remark 3.5.3
If α ∈ R and we choose γ to be the fundamental unit of Q(α) (which may have norm −1, see
Chapter 7), the same reasoning shows that if $\alpha = \gamma^m$ and m is not an odd prime p, $a_n$ is composite
except finitely many times (if $m = p^r$ the factorisation of $a_{\frac{p^s-1}{2}}$ for s ≤ r is trivial). In particular,
if the fundamental unit has norm −1, any α works since it has norm 1 so 2 | m. Conversely,
we can conjecture that, for m = p an odd prime, $a_n$ is prime infinitely many times. This is an
analogue of the conjecture that there exist infinitely many Mersenne primes.

Exercise 3.5.35† (Iran Mathematical Olympiad 3rd round 2018). Let a and b be rational integers
distinct from ±1, 0. Prove that there are infinitely many rational primes p such that a and b have the same
order modulo p. (You may assume Dirichlet’s theorem ??.)

Solution

Without loss of generality, suppose that a ≠ b. Note that, modulo p, if gcd(q, p − 1) = 1, a and $a^q$
always have the same order. Hence, we pick a prime q and look at prime factors p of $a^q - b$. Our
goal is to prove that there are infinitely many of them which are not congruent to 1 modulo q. Note
that if they were all congruent to 1 modulo q, then $a^q - b$ would be congruent to ±1 modulo q too
so $a - b \equiv \pm 1 \pmod q$, which is easy to avoid. The idea will be to control the (for the sake of contradiction)
finitely many primes not congruent to 1 modulo q to reach the same contradiction.

Say these primes are $p_1, \ldots, p_k$. We allow q to vary here: these are the primes p which divide at
least one term of the form $a^q - b$ without being congruent to 1 modulo q. We wish to bound the p-adic
valuation of $a^q - b$: for each i, depending on whether $p_i$ divides a or not, set $m_i = v_{p_i}(b) + 1$
in the former case and $m_i = v_{p_i}(ab - 1) + 1$ in the latter. Now consider $N = \varphi(p_1^{m_1}) \cdot \ldots \cdot \varphi(p_k^{m_k})$ and a
large prime $q \equiv -1 \pmod N$ (there exists one by Dirichlet’s theorem, or by Theorem 4.4.1). We have
\[
a^q - b \equiv
\begin{cases}
-b \pmod{p_i^{m_i}} & \text{if } p_i \mid a \\
a^{-1} - b \equiv \frac{1-ab}{a} \pmod{p_i^{m_i}} & \text{otherwise.}
\end{cases}
\]

We have successfully evaluated the contribution of our primes $p_i$: if $p = p_i \mid a$, then $v_p(a^q - b) = v_p(b)$,
otherwise $v_p(a^q - b) = v_p(ab - 1)$. If all other prime factors of $a^q - b$ were congruent to 1
modulo q, we would thus have
\[ a - b \equiv a^q - b \equiv \prod_{p_i \mid a} p_i^{v_{p_i}(b)} \prod_{p_i \nmid a} p_i^{v_{p_i}(ab-1)} \pmod q. \]

In particular, for large q,

\[ \prod_{p_i \mid a} p_i^{v_{p_i}(b)} \prod_{p_i \nmid a} p_i^{v_{p_i}(ab-1)} = a - b. \]

Now, note that the only property of the $p_i$ we have used is that every other prime factor of $a^q - b$
is congruent to 1 modulo q. Hence, we may assume that the prime factors of ab − 1 are among
them. Since any p | ab − 1 doesn’t divide a, this yields $v_p(ab - 1) = v_p(a - b)$ for every p | ab − 1.
Hence, ab − 1 | a − b. This is clearly impossible since a ≠ b and |a|, |b| > 1 so |ab − 1| > |a − b| > 0.


Remark 3.5.4
It is more natural to try this approach with q ≡ 1 (mod N ) at first. However, this only gives us
the equality
\[ a - b \equiv a^q - b \equiv \prod_{p_i \mid a} p_i^{v_{p_i}(b)} \prod_{p_i \nmid a} p_i^{v_{p_i}(a-b)} \]
which provides almost no information: it only yields $v_p(a - b) = v_p(b)$ for p | a (it implies
$v_p(a) \ge v_p(b)$, and by symmetry $v_p(a) = v_p(b)$, but since this is only for p | gcd(a, b) it is not
sufficient to finish). Thus, we need to try with q ≡ r (mod N ). Since q is prime, r needs to
be coprime with N and this can give complicated choices of r such as the smallest prime which
doesn’t divide N . Then, we need to evaluate $v_p(a^r - b)$: if it is larger than $v_p(a - b)$ we are done
by the equality $a - b = \prod_{p_i \mid a} p_i^{v_{p_i}(b)} \prod_{p_i \nmid a} p_i^{v_{p_i}(a^r - b)}$, and by the same equality we are done if it
is smaller. Then, we can vary r so that $v_p(a^r - b) < v_p(a - b)$ but this is complicated since we need
to take into account the prime factors of N and choose an r coprime with N . Finally, we can realise that
in fact there is a very natural choice of r coprime with N apart from 1, and that is −1. This
is also very good in the sense that a and $\frac{1}{a}$ are the powers of a which are the most likely to be
distinct modulo p: if they aren’t, we have $a \equiv a^r$ for any odd r so all other choices of r aren’t
better. These considerations give the above solution.

Exercise 3.5.37† (IMC 2010). Let f : R → R be a function and a < b two real numbers. Suppose
that f is zero on [a, b], and
\[ \sum_{k=0}^{p-1} f\left(x + \frac{k}{p}\right) = 0 \]
for any x ∈ R and any rational prime p. Prove that f is zero everywhere.

Solution

Let N be a positive integer that we will choose later. Define I ⊆ R[X] as the set of polynomials
$a_n X^n + \ldots + a_0$ such that the function
\[ x \mapsto \sum_{k=0}^{n} a_k f\left(x + \frac{k}{N}\right) \]

is identically zero. We claim that I is an ideal of R[X], meaning that it is closed under addition
and closed under multiplication by any polynomial. The former is clear, for the latter note
that multiplication by X k corresponds to a translation (and that multiplication by constants
obviously doesn’t change anything to the condition). The key point about ideals is that we can
take the gcd of two polynomials: indeed, if u, v are elements of I, by Bézout’s lemma there exist
polynomials r, s ∈ R[X] such that ru + sv = gcd(u, v), and since I is an ideal, ru + sv ∈ I.

Now, we use the second condition of the statement. This gives us that, for any rational prime
p | N , the polynomial
\[ u_p = 1 + X^{N/p} + X^{2N/p} + \ldots + X^{(p-1)N/p} = \frac{X^N - 1}{X^{N/p} - 1} \]
is in I. Let’s compute the gcd of these polynomials when p ranges through the prime factors of
N : the roots of $u_p$ are the N th roots of unity with order not dividing N/p. Thus, the gcd of the $u_p$
is exactly the polynomial whose roots are the primitive N th roots of unity, i.e. $\Phi_N$.

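Now, since $\varphi(N)/N$ can be arbitrarily small by Exercise 3.5.14, choose N so that $\varphi(N)/N \le b - a$.
Let x be an element of $\left[a - \frac{1}{N}, b\right]$; we may assume that x ≤ a since f is already zero on [a, b].
By definition of I, since $\Phi_N = \sum_i \phi_i X^i \in I$, we have

\[ \sum_{k=0}^{\varphi(N)} \phi_k f\left(x + \frac{k}{N}\right) = 0. \]

Note that all terms in this sum are evaluated at points of [a, b] except the first one, since

\[ a \le x + \frac{k}{N} \le x + \frac{\varphi(N)}{N} \le a + (b - a) = b \]

for $1 \le k \le \varphi(N)$. Thus, we also have $f(x) = \phi_0 f(x) = 0$, i.e. f is identically zero on $\left[a - \frac{1}{N}, b\right]$.
Similarly, f is identically zero on $\left[a, b + \frac{1}{N}\right]$. By induction, f is identically zero on $\left[a - \frac{k}{N}, b + \frac{k}{N}\right]$
for any k ∈ N, i.e. f is zero on R as wanted.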


Exercise 3.5.40† . Let n ≥ 1 be an integer. Prove that, for all x ∈ R, $\Phi_n(x) \ge (x-1)x^{\varphi(n)-1}$ with
equality if and only if n = 1.⁵

⁵ In particular, $\Phi_n(2) \ge 2^{\varphi(n)-1}$.


Solution

We clearly have equality when n = 1, thus assume that n ≥ 2. We present ABCDE’s solution on
AoPS, see https://artofproblemsolving.com/community/c6h1596694p9917603. Write
\[ \Phi_n(x) = \prod_{d \mid n} (x^d - 1)^{\mu(n/d)}, \]
by Exercise 3.5.19. We wish to prove that this product is greater than $(x-1)x^{\varphi(n)-1}$, i.e., by
dividing by $x^{\varphi(n)}$, that
\[ \prod_{d \mid n} (1 - x^{-d})^{\mu(n/d)} \ge 1 - x^{-1}. \]

Now, take the logarithm to get

\[ \sum_{d \mid n} \mu(n/d) \log(1 - x^{-d}) \ge \log(1 - x^{-1}). \]

Recall the Taylor series of the logarithm:

\[ \log(1 - y) = -\sum_{k=1}^{\infty} \frac{y^k}{k}, \]

valid for |y| < 1. Thus, we wish to prove that

\[ \sum_{k=1}^{\infty} \frac{1}{k} \sum_{d \mid n} \mu(n/d) x^{-kd} = \sum_{d \mid n} \mu(n/d) \sum_{k=1}^{\infty} \frac{x^{-kd}}{k} \le \sum_{k=1}^{\infty} \frac{x^{-k}}{k}. \]

Note that we exchanged the two sums thanks to absolute convergence. Finally, to show this, we
will prove that each term on the left is less than the term on the right, i.e. that
\[ \sum_{d \mid n} \mu(n/d) x^{-kd} \le x^{-k}. \]

More specifically, we will prove that $\sum_{d \mid n} \mu(n/d) y^{-d} \le y^{-1}$ for all y ≥ 2. We distinguish a few
cases.
1. n is squarefree and has an even number of prime factors, i.e. µ(n) = 1. This is the most
interesting case, and the one where the inequality is the sharpest. Since µ(n) = 1,
\[ \sum_{d \mid n} \mu(n/d) y^{-d} = y^{-1} - y^{-p} + \ldots, \]
where p is the smallest prime factor of n. Now, notice that the dots have absolute value
less than
\[ \sum_{d=p+1}^{\infty} y^{-d} = y^{-(p+1)} \cdot \frac{1}{1 - y^{-1}} \le y^{-p} \]
since $\frac{1}{1 - y^{-1}} \le 2 \le y$. Thus, $|\ldots| < y^{-p}$, i.e.
\[ \sum_{d \mid n} \mu(n/d) y^{-d} = y^{-1} - (y^{-p} + \ldots) \le y^{-1} \]
as wanted.
2. n is squarefree and has an odd number of prime factors, i.e. µ(n) = −1. In that case, we
have
\[ \sum_{d \mid n} \mu(n/d) y^{-d} = -y^{-1} + \ldots, \]
where the dots have absolute value less than
\[ \sum_{d=2}^{\infty} y^{-d} = y^{-2} \cdot \frac{1}{1 - y^{-1}} \le y^{-1} \]
so
\[ \sum_{d \mid n} \mu(n/d) y^{-d} = -y^{-1} + \ldots < 0 < y^{-1}. \]

3. n is not squarefree, i.e. µ(n) = 0. In that case, we have
\[ \sum_{d \mid n} \mu(n/d) y^{-d} \le \sum_{d=2}^{\infty} y^{-d} = y^{-2} \cdot \frac{1}{1 - y^{-1}} \le y^{-1}. \]
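
The inequality can be checked numerically for small values. The sketch below evaluates $\Phi_n(x)$ through the product formula of Exercise 3.5.19 with exact rational arithmetic and compares it with $(x-1)x^{\varphi(n)-1}$ for integers x ≥ 2. The function names are ours and the ranges tested are arbitrary; this only illustrates the statement.

```python
from fractions import Fraction
from math import gcd


def mobius(n):
    """Möbius function via trial factorisation."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0  # n has a square factor
            result = -result
        d += 1
    return -result if n > 1 else result


def totient(n):
    """Euler's totient, by direct counting."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def cyclotomic_value(n, x):
    """Phi_n(x) = prod_{d | n} (x^d - 1)^{mu(n/d)} for an integer x >= 2."""
    value = Fraction(1)
    for d in range(1, n + 1):
        if n % d == 0:
            value *= Fraction(x**d - 1) ** mobius(n // d)
    return value


for n in range(1, 13):
    lhs = cyclotomic_value(n, 2)
    rhs = 2 ** (totient(n) - 1)
    print(n, lhs, rhs, lhs >= rhs)
```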


Chapter 4

Finite Fields

Exercise 4.0.1. Suppose K is a field of characteristic 0, i.e.

\[ \underbrace{1 + \ldots + 1}_{n \text{ times}} \]

(where 1 is the multiplicative identity) is never zero for any n ≥ 1. Prove that K contains (up to
relabelling of the elements) Q.¹

Solution

We consider the following injective morphism Q → K; its image will be the copy of Q inside K.
Send n ∈ N to
\[ \xi(n) = \underbrace{1 + \ldots + 1}_{n \text{ times}} \]
where 1 is the multiplicative identity of K. Then send −n to the additive inverse −ξ(n) of ξ(n).
Finally, send a/b to ξ(a)/ξ(b). It is clear that this is a well-defined morphism, and this is injective
since K has characteristic 0. Indeed, by expanding we get ξ(mn) = ξ(m)ξ(n) for any m, n ∈ N,
which means it’s true for m, n ∈ Z too by adding signs where needed. In particular, if a/b = c/d
then ξ(a)/ξ(b) = ξ(c)/ξ(d), which shows that it is well-defined. To show that it is multiplicative
on all of Q and thus a morphism, we see that, for a, b, c, d ∈ Z with b, d ≠ 0, we have

\[ \xi(ac/bd) = \xi(a/b)\xi(c/d) \iff \frac{\xi(ac)}{\xi(bd)} = \frac{\xi(a)\xi(c)}{\xi(b)\xi(d)} \]

which is true since ξ is multiplicative on Z.




Exercise 4.0.2∗ . Let p be a rational prime. Prove that there exists a unique field with p elements
(it’s Z/pZ).

Solution

Let F be a field with p elements. First we prove that F has characteristic p. For this, we shall prove that the characteristic of a
finite ring divides its cardinality, thus proving that F has characteristic 1 or p but the former
is impossible since it is non-trivial. Let m be the characteristic of a ring R. Partition R into

1 Technically, it will usually not contain Q because Q is a very specific object. Indeed, the definition of a field is

extremely sensitive: if you change the set K (relabel its elements) but keep everything else the same you get a different
field. In that case we say the new field is isomorphic to the old one. So you must prove that K contains a field isomorphic
to Q, i.e. Q up to relabeling of its elements.

sets of the form {a, a + 1, . . . , a + m − 1}. These sets either coincide or are disjoint: if
a + i = b + j then {a, . . . , a + m − 1} = {b, . . . , b + m − 1}. Thus the cardinality of R is divisible
by m since each such set has cardinality m.

Now, identify n ∈ Fp with

\[ \underbrace{1 + \ldots + 1}_{n \text{ times}}. \]

This is well defined because Fp and F have the same characteristic, and thus yields a morphism (it
is clearly multiplicative and additive) between Fp and F , which is clearly injective. Since F and
Fp have the same cardinality, this is an isomorphism.


Remark 4.0.1
What we did can be summarised as follows: use Lagrange’s theorem on the additive group of F
to prove that F has characteristic p, then conclude with Exercise A.2.3∗ that F contains a copy
of Fp , which means that they are isomorphic since they have the same cardinality.

Exercise 4.0.3∗ . Prove that F3 (i) := F3 + iF3 is a field (with 9 elements). (The hard part is to prove
that each element has an inverse.)

Solution

The inverse of $a + i_3 b$ is given by $\frac{a - i_3 b}{a^2 + b^2}$ since $(a + i_3 b)(a - i_3 b) = a^2 + b^2$. Note that this is well
defined since $a^2 + b^2 = 0$ iff a = b = 0, as the polynomial $X^2 + 1$ has no root in $\mathbb{F}_3$.
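
Since the field has only 9 elements, the inverse formula can be verified exhaustively. In the sketch below a pair `(a, b)` stands for a + bi with a, b taken modulo 3; the names `mul` and `inverse` are ours.

```python
P = 3
# The pair (a, b) stands for the element a + b*i, where i^2 = -1.
elements = [(a, b) for a in range(P) for b in range(P)]


def mul(u, v):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i, reduced modulo 3."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def inverse(u):
    """(a - bi) / (a^2 + b^2); the denominator is inverted with Fermat's little theorem."""
    a, b = u
    norm_inv = pow(a * a + b * b, P - 2, P)
    return (a * norm_inv % P, -b * norm_inv % P)


for u in elements:
    if u != (0, 0):
        print(u, inverse(u), mul(u, inverse(u)))
```

Every non-zero element multiplied by its computed inverse gives the pair (1, 0), i.e. the element 1.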




4.1 Frobenius Morphism


Exercise 4.1.1. Why is commutativity (of R) needed?

Solution

We need R to be commutative for the binomial expansion to work: for instance, $(a + b)^2 = a^2 + ab + ba + b^2$,
which is $a^2 + 2ab + b^2$ if and only if a and b commute.


Exercise 4.1.2∗ . Prove that an = αn + β n + γ n .

Solution

We have $1 + 1 + 1 = 3 = a_0$, $\alpha + \beta + \gamma = 0 = a_1$ by Vieta’s formulas, and

\[ \alpha^2 + \beta^2 + \gamma^2 = (\alpha + \beta + \gamma)^2 - 2(\alpha\beta + \beta\gamma + \gamma\alpha) = 2 = a_2 \]

by Vieta’s formulas.


4.2 Existence and Uniqueness


Exercise 4.2.1∗ . Let K be a field and f ∈ K[X] an irreducible polynomial of degree n. Prove that
K(α) := K + αK + . . . + αn−1 K
is a field, where α is defined as a formal root of f , i.e. an object satisfying f (α) = 0.

Solution

It is clear that K(α) is a commutative ring, since αn is a linear combination of 1, . . . , αn−1 by


definition (f (α) = 0) so it is closed under multiplication (the other ring axioms are obvious).
Thus the tricky part is to prove that every non-zero element has an inverse. Note that this is
not necessarily true if f is reducible: if f = gh we have g(α)h(α) = 0 and the g(α), h(α) could
be both non-zero (keep in mind that α is just a formal object satisfying f (α) = 0).

Let g(α) be a non-zero element of K(α), i.e. f ∤ g. We use Bézout’s lemma in K[X] (K[X] is
Euclidean for the degree map so Bézout’s lemma holds there too): since f is irreducible, it is coprime with g so there
exist r, s ∈ K[X] such that
rf + sg = 1.
Evaluation at α yields s(α)g(α) = 1 as wanted.


Remark 4.2.1
In fact, if α is a root of f , we have K(α) ≅ K[X]/(f ), where K[X]/(f ) means K[X] modulo f .
This gives a more abstract way of constructing a field extension of K where f has a root. Indeed,
the element X ∈ K[X]/(f ) is a root of f : f (X) is divisible by f . (Note that we treat f as an
element of K[X]/(f )[Y ] here, i.e. a polynomial in Y with coefficients in K[X]/(f ) (in fact its
coefficients are simply in K).)

Exercise 4.2.2. Let R be a commutative ring. Prove, using Zorn’s lemma, that R has a maximal
ideal. More generally, prove that any strict ideal of R is contained in a maximal ideal. (You really
don’t need to do this.)

Solution

Consider the set I of strict ideals of R. I is clearly non-empty, so we must prove that any chain
(i.e. a subset C ⊆ I such that, for all a, b ∈ C, we have a ⊆ b or b ⊆ a) of elements of I has an
upper bound in I. Let C be such a chain, and consider the set
[
U= S.
S∈C

It is clear that this is an upper bound of C, provided that it lies in I. Note also that it is an
ideal:
\[ RU = \bigcup_{S \in C} RS \subseteq \bigcup_{S \in C} S = U \]
and, for any u, v ∈ U , there are S1 and S2 such that u ∈ S1 and v ∈ S2 . Since C is a chain,
suppose without loss of generality that S1 ⊆ S2 : then u + v ∈ S2 ⊆ U . Finally, U is in I since
U ≠ R: if U were equal to R, it would contain 1, so some S ∈ C would as well, contradicting the
assumption that S ≠ R.
For the second part, we just need to replace I by $I_a := \{b \mid a \subseteq b \ne R\}$ and the above proof
works verbatim.


4.3 Properties
Exercise 4.3.1∗ . Let a and b be positive integers and K a field. Prove that $X^a - 1$ divides $X^b - 1$
in K[X] if and only if a | b. Similarly, if x ≥ 2 is a rational integer, prove that $x^a - 1$ divides $x^b - 1$ in Z
if and only if a | b.

Solution

Note that the roots of X a − 1 are ath roots of unity and that these are all bth roots of unity if
and only if a | b (for instance by considering a primitive ath root of unity). Thus X a − 1 | X b − 1
if and only if a | b.

xa − 1 | xb − 1 if and only if the order of x modulo xa − 1 divides b. Since x has order a modulo
xa − 1, this means a | b.


Exercise 4.3.2∗ . Let f ∈ Fp [X] be a polynomial of degree n. Prove that f splits over $\mathbb{F}_{p^{n!}}$.

Solution

It suffices to prove that an irreducible polynomial g of degree at most n has its roots in $\mathbb{F}_{p^{n!}}$, since
any polynomial of degree n is a product of such polynomials. This is true by Corollary 4.3.3:
deg g is at most n and thus divides n!.


4.4 Cyclotomic Polynomials


Exercise 4.4.1∗ . Prove Proposition 4.4.1. (This proof is independent from the one in Chapter 3.)

Solution

The formula for $\Phi_m$ follows from the formula

\[ \prod_{d \mid m} \Phi_d = X^m - 1 \]

by induction on m. The formula for $\Phi_n$ follows from Proposition 3.1.2 by induction on $n/m = p^k$
(or from the previous formula by noting that $X^n - 1 = (X^m - 1)^{p^k}$).


Exercise 4.4.2∗ . Let m be a positive integer such that p ∤ m. Prove that $\Phi_m$ has a root in $\mathbb{F}_{p^n}$ if and only if
$m \mid p^n - 1$.

Solution

Since the order of any non-zero element of $\mathbb{F}_{p^n}$ divides $p^n - 1$ by Theorem 4.2.1, if $\Phi_m$ has a root in $\mathbb{F}_{p^n}$,
since this root has order m we get $m \mid p^n - 1$. For the converse, if $m \mid p^n - 1$ then
\[ \Phi_m \mid X^{p^n - 1} - 1 = \prod_{a \in \mathbb{F}_{p^n}^{\times}} (X - a). \]

The RHS splits in $\mathbb{F}_{p^n}$ so the LHS too and in particular has at least one root there.


Exercise 4.4.3. Prove that p2 ≡ 1 (mod 9) if and only if p ≡ ±1 (mod 9).

Solution

$p^2 \equiv 1 \pmod 9$ iff $9 \mid p^2 - 1 = (p - 1)(p + 1)$. The two factors have gcd dividing 2, so 3 divides at most one of them.
Hence 9 divides $p^2 - 1$ iff 9 divides p − 1 or p + 1, i.e. iff p ≡ ±1 (mod 9).


Exercise 4.4.4. Compute Ψ1 , . . . , Ψ8 .

Solution

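We have $\Psi_1 = X - 2$, $\Psi_2 = X + 2$,
\[ \Psi_3\left(X + \frac{1}{X}\right) = \frac{\Phi_3}{X} = X + 1 + \frac{1}{X} \]
so $\Psi_3 = X + 1$,
\[ \Psi_4\left(X + \frac{1}{X}\right) = \frac{\Phi_4}{X} = X + \frac{1}{X} \]
so $\Psi_4 = X$,
\[
\begin{aligned}
\Psi_5\left(X + \frac{1}{X}\right) &= \frac{\Phi_5}{X^2} \\
&= X^2 + X + 1 + \frac{1}{X} + \frac{1}{X^2} \\
&= \left(X + \frac{1}{X}\right)^2 + \left(X + \frac{1}{X}\right) - 1
\end{aligned}
\]
so $\Psi_5 = X^2 + X - 1$,
\[ \Psi_6\left(X + \frac{1}{X}\right) = \frac{\Phi_6}{X} = X - 1 + \frac{1}{X} \]
so $\Psi_6 = X - 1$,
\[
\begin{aligned}
\Psi_7\left(X + \frac{1}{X}\right) &= \frac{\Phi_7}{X^3} \\
&= X^3 + X^2 + X + 1 + \frac{1}{X} + \frac{1}{X^2} + \frac{1}{X^3} \\
&= \left(X + \frac{1}{X}\right)^3 + \left(X + \frac{1}{X}\right)^2 - 2\left(X + \frac{1}{X}\right) - 1
\end{aligned}
\]
so $\Psi_7 = X^3 + X^2 - 2X - 1$, and finally
\[ \Psi_8\left(X + \frac{1}{X}\right) = \frac{\Phi_8}{X^2} = X^2 + \frac{1}{X^2} = \left(X + \frac{1}{X}\right)^2 - 2 \]
so $\Psi_8 = X^2 - 2$.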
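
These computations are easy to double-check by evaluating both sides of $\Psi_n(X + 1/X)\,X^{\varphi(n)/2} = \Phi_n(X)$ at a few rational points. The sketch below hard-codes the coefficient lists of $\Psi_3, \ldots, \Psi_8$ found above and the standard ones of $\Phi_3, \ldots, \Phi_8$; the dictionary and function names are ours.

```python
from fractions import Fraction

# Coefficient lists, constant term first.
PSI = {
    3: [1, 1],          # X + 1
    4: [0, 1],          # X
    5: [-1, 1, 1],      # X^2 + X - 1
    6: [-1, 1],         # X - 1
    7: [-1, -2, 1, 1],  # X^3 + X^2 - 2X - 1
    8: [-2, 0, 1],      # X^2 - 2
}
PHI = {
    3: [1, 1, 1],
    4: [1, 0, 1],
    5: [1, 1, 1, 1, 1],
    6: [1, -1, 1],
    7: [1, 1, 1, 1, 1, 1, 1],
    8: [1, 0, 0, 0, 1],
}


def evaluate(coeffs, x):
    """Value at x of the polynomial with the given coefficients."""
    return sum(c * x**i for i, c in enumerate(coeffs))


def check(n, x):
    """Test Psi_n(x + 1/x) * x^(phi(n)/2) == Phi_n(x) at a non-zero rational x."""
    half = (len(PHI[n]) - 1) // 2  # phi(n) / 2 is half the degree of Phi_n
    return evaluate(PSI[n], x + 1 / x) * x**half == evaluate(PHI[n], x)


points = [Fraction(2), Fraction(-3), Fraction(1, 2), Fraction(5, 7)]
print(all(check(n, x) for n in PSI for x in points))
```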


Exercise 4.4.5∗ . Let p ≠ 0 be an integer. Prove that Z[1/p], i.e. the set of numbers $m/p^k$ with
m ∈ Z and k ∈ N is dense in R.

Solution

Let x ∈ R be a real number. Write it in base p: $x = \sum_{i=-\infty}^{N} a_i p^i$ with $a_i \in \{0, \ldots, p-1\}$. Let
$\alpha = \sum_{i=-(M-1)}^{N} a_i p^i$, which is a fraction with denominator a power of p. Then,

\[ x - \sum_{i=-(M-1)}^{N} a_i p^i \le \sum_{i=-\infty}^{-M} (p-1) p^i = \frac{p-1}{p^M} \sum_{i=-\infty}^{0} p^i = \frac{1}{p^M} \cdot \frac{p-1}{1 - \frac{1}{p}} \]

which goes to zero as M → ∞, thus showing the wanted density.




Exercise 4.4.6∗ . Prove that the leading coefficient of Ψn is 1.

Solution

$\Psi_1 = X - 2$, $\Psi_2 = X + 2$ are clearly monic so assume n > 2. Let a be the leading coefficient of
$\Psi_n$. Then, the leading term of $\Phi_n / X^{\varphi(n)/2} = \Psi_n(X + 1/X)$ comes from $a(X + 1/X)^{\varphi(n)/2}$
and is thus $aX^{\varphi(n)/2}$. Since $\Phi_n$ is monic, $\Psi_n$ is too.


4.5 Quadratic Reciprocity


Exercise 4.5.1∗ . Prove Proposition 4.5.1.

Solution

Let g be a primitive root modulo p. a is a square modulo p if and only if it has the form $g^{2k}$ for
some k, which is exactly equivalent to $a^{\frac{p-1}{2}} = g^{k(p-1)} = 1$.

Without primitive roots, one can also do a bit of elementary counting: there are exactly $\frac{p-1}{2}$
quadratic residues (they come by pairs $x^2, (-x)^2$, since $x^2 = y^2 \iff x = \pm y$) and all quadratic
residues are roots of $X^{\frac{p-1}{2}} - 1$ by Fermat’s little theorem. The quadratic non-residues must
therefore be roots of
\[ \frac{X^{p-1} - 1}{X^{\frac{p-1}{2}} - 1} = X^{\frac{p-1}{2}} + 1. \]


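Exercise 4.5.2. Compute $\left(\frac{77}{101}\right)$.

Solution

We have
\[
\begin{aligned}
\left(\frac{77}{101}\right) &= \left(\frac{7}{101}\right)\left(\frac{11}{101}\right) \\
&= \left(\frac{101}{7}\right)\left(\frac{101}{11}\right) \\
&= \left(\frac{3}{7}\right)\left(\frac{2}{11}\right) \\
&= \left(\frac{7}{3}\right) \\
&= 1.
\end{aligned}
\]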
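
The result can be cross-checked with Euler’s criterion (Proposition 4.5.1), which computes the Legendre symbol directly as a power modulo p. The function name `legendre` below is ours.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


print(legendre(77, 101))                    # the symbol computed above
print(legendre(7, 101), legendre(11, 101))  # the two factors
print(legendre(3, 7), legendre(2, 11))      # after applying reciprocity
```

The first line prints 1, and each of the four intermediate symbols equals −1, in agreement with the chain of equalities.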

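Exercise 4.5.3. Prove that $\Psi_8 = X^2 - 2$ and that $(-1)^{\frac{p^2-1}{8}} = 1$ if and only if p ≡ ±1 (mod 8).

Solution

We have already computed $\Psi_8$ in Exercise 4.4.4. For the second part, write p = 8t + r with r ∈ {±1, ±3}: then
$\frac{p^2-1}{8} = 8t^2 + 2tr + \frac{r^2-1}{8}$ has the same parity as $\frac{r^2-1}{8}$, which is 0 for r = ±1 and 1 for r = ±3.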




 
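Exercise 4.5.4∗ . Prove that, for any $\ell \in \mathbb{F}_q$, $g_\ell = \left(\frac{\ell}{q}\right) g$.

Solution

If ℓ = 0 then both sides are 0. Otherwise, ℓ is invertible so

\[ \left(\frac{\ell}{q}\right) g_\ell = \sum_{k \in \mathbb{F}_q} \left(\frac{k\ell}{q}\right) \omega^{k\ell} = g \]

which yields $g_\ell = \left(\frac{\ell}{q}\right) g$ since $\left(\frac{\ell}{q}\right)^2 = 1$.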


Exercise 4.5.5∗ . Prove without computing g 2 that g has exactly 2 conjugates, i.e. is a quadratic
number.

Solution
$\prod_i (X - g_i)$ has rational coefficients by the fundamental theorem of symmetric polynomials so the
conjugates of g are among g and −g. Conversely, g and −g are conjugates of g since if
\[ f\left(\sum_i \left(\frac{i}{p}\right) X^i\right) \]
has a root at ω it also has a root at $\omega^k$ for p ∤ k. Thus, if f (g) = 0 then $f(g_i) = 0$ too.


4.6 Exercises
Dirichlet Convolutions
Exercise 4.6.1† (Dirichlet Convolution). A function f from N∗ to C is said to be an arithmetic
function. Define the Dirichlet convolution² f ∗ g of two arithmetic functions f and g as
\[ n \mapsto \sum_{d \mid n} f(d) g(n/d) = \sum_{ab = n} f(a) g(b). \]

Prove that the Dirichlet convolution is associative. In addition, prove that if f and g are multiplicative³,
meaning that f (mn) = f (m)f (n) and g(mn) = g(m)g(n) for all coprime m, n ∈ N, then so is f ∗ g.

Solution

Let f, g, h be three arithmetic functions and let n ∈ N∗ . Then,

\[
\begin{aligned}
((f * g) * h)(n) &= \sum_{cd = n} (f * g)(d) h(c) \\
&= \sum_{cd = n,\ ab = d} f(a) g(b) h(c) \\
&= \sum_{abc = n} f(a) g(b) h(c).
\end{aligned}
\]

Similarly,
\[
\begin{aligned}
(f * (g * h))(n) &= \sum_{ad = n} f(a) (g * h)(d) \\
&= \sum_{ad = n,\ bc = d} f(a) g(b) h(c) \\
&= \sum_{abc = n} f(a) g(b) h(c)
\end{aligned}
\]

which shows that the Dirichlet convolution is associative. Now, suppose that f and g are multiplicative
and let m, n be two coprime positive integers. We have
\[ (f * g)(mn) = \sum_{d \mid mn} f(d) g\left(\frac{mn}{d}\right) = \sum_{a \mid m,\ b \mid n} f(ab) g\left(\frac{mn}{ab}\right) \]

because m and n are coprime, so each divisor of mn is a divisor of m times a divisor of n. By
multiplicativity of f and g, this is
\[ \sum_{a \mid m,\ b \mid n} f(a) f(b) g(m/a) g(n/b) = \left(\sum_{a \mid m} f(a) g(m/a)\right) \left(\sum_{b \mid n} f(b) g(n/b)\right) = (f * g)(m) (f * g)(n) \]

so f ∗ g is also multiplicative.
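
The definition translates directly into code, which makes it easy to experiment with both statements. In the sketch below, `convolve`, `one`, `identity` and `totient` are our own names, and the sample functions are arbitrary choices; the loops only test finitely many values.

```python
from math import gcd


def divisors(n):
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]


def convolve(f, g):
    """Dirichlet convolution: (f * g)(n) = sum over d | n of f(d) g(n/d)."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))


def one(n):
    return 1


def identity(n):
    return n


def totient(n):
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


sigma = convolve(identity, one)  # sum of the divisors of n

# Associativity on a sample of values.
left = convolve(convolve(identity, one), totient)
right = convolve(identity, convolve(one, totient))
print(all(left(n) == right(n) for n in range(1, 61)))

# Multiplicativity of identity * one on coprime pairs.
print(all(sigma(m * n) == sigma(m) * sigma(n)
          for m in range(1, 21) for n in range(1, 21) if gcd(m, n) == 1))
```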


Exercise 4.6.2† (Möbius Inversion). Define the Möbius function µ : N∗ → {−1, 0, 1} by $\mu(n) = (-1)^k$
where k is the number of prime factors of n if n is squarefree, and µ(n) = 0 otherwise. Define also δ
as the function mapping 1 to 1 and everything else to 0. Prove that δ is the identity element for the
Dirichlet convolution: f ∗ δ = δ ∗ f = f for all arithmetic functions f . In addition, prove that µ is
the inverse of 1 for the Dirichlet convolution, meaning that µ ∗ 1 = 1 ∗ µ = δ where 1 is the function
n ↦ 1.⁴

² The Dirichlet convolution appears naturally in the study of Dirichlet series: the product of two Dirichlet series
$\sum_{n=1}^{\infty} \frac{f(n)}{n^s}$ and $\sum_{n=1}^{\infty} \frac{g(n)}{n^s}$ is the Dirichlet series corresponding to the convolution of the coefficients, $\sum_{n=1}^{\infty} \frac{(f * g)(n)}{n^s}$.
³ This terminology has conflicting meanings: in algebra, it means that f (xy) = f (x)f (y) for all x, y, while for arithmetic
functions, it only means that f (xy) = f (x)f (y) for coprime x, y.

Solution

The first claim is very easy: for any n ∈ N∗ ,

\[ (f * \delta)(n) = \sum_{d \mid n} \delta(d) f(n/d) = f(n). \]

For the second claim, note that the Möbius function is multiplicative. Hence, by Exercise 4.6.1† ,
µ ∗ 1 is as well. This means that, to prove that µ ∗ 1 is zero everywhere except at 1, we just need
to prove that it’s zero on prime powers. Thus, let $p^m \ne 1$ be a prime power. We have
\[ (\mu * 1)(p^m) = \sum_{d \mid p^m} \mu(d) = \mu(1) + \mu(p) = 1 - 1 = 0 \]

since $\mu(p^j) = 0$ when j ≥ 2. To finish, we also have (µ ∗ 1)(1) = µ(1) = 1 = δ(1).
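
Both identities, and the resulting inversion formula, can be tested numerically. The sketch below uses our own helper names, and the function n ↦ n² is just an arbitrary test function.

```python
def mobius(n):
    """Möbius function via trial factorisation."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0  # n has a square factor
            result = -result
        d += 1
    return -result if n > 1 else result


def convolve(f, g):
    """Dirichlet convolution: (f * g)(n) = sum over d | n of f(d) g(n/d)."""
    return lambda n: sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)


def one(n):
    return 1


def delta(n):
    return 1 if n == 1 else 0


def square(n):
    return n * n


mu_one = convolve(mobius, one)
print(all(mu_one(n) == delta(n) for n in range(1, 201)))

# Möbius inversion: if g = 1 * f then f = mu * g.
g = convolve(one, square)
recovered = convolve(mobius, g)
print(all(recovered(n) == square(n) for n in range(1, 101)))
```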




Exercise 4.6.3† (Prime Number Theorem in Function Fields). Prove that the number of irreducible
polynomials in Fp [X] of degree n is
\[ N_n = \frac{1}{n} \sum_{d \mid n} \mu\left(\frac{n}{d}\right) p^d \]
and show that this is asymptotically equivalent to $\frac{p^n}{\log_p(p^n)}$.

Solution

The fact that µ is the inverse of 1 means that the equalities g = 1 ∗ f and f = µ ∗ g are equivalent.
Now, consider the number f (n) of elements of $\overline{\mathbb{F}}_p$ of degree n. This is n times the number of
irreducible polynomials of degree n, by grouping them by minimal polynomial. However, we also
have
\[ \sum_{d \mid n} f(d) = p^n \]

since this is the number of elements of $\mathbb{F}_{p^n}$. In other words, $f * 1 = (n \mapsto p^n)$. This means that
$f = (n \mapsto p^n) * \mu$, i.e.
\[ f(n) = \sum_{d \mid n} \mu\left(\frac{n}{d}\right) p^d. \]

Division by n yields the formula for the number of irreducible polynomials of degree n. Now,
observe that
\[ \frac{p^n}{n} - N_n = -\frac{1}{n} \sum_{d \mid n,\ d < n} \mu\left(\frac{n}{d}\right) p^d \]

is at most
\[ \frac{1}{n} \sum_{k=1}^{\lfloor n/2 \rfloor} p^k \le \frac{p^{\lfloor n/2 \rfloor + 1} - 1}{n(p-1)} = O\left(\frac{p^{n/2}}{n}\right) \]
in absolute value since the greatest strict divisor of n is at most n/2. We conclude that
\[ N_n = \frac{p^n}{n} + O\left(\frac{p^{n/2}}{n}\right) \sim \frac{p^n}{n}. \]
4 This also explains how we found the formula for Φn from Exercise 3.5.19

Linear Recurrences
Exercise 4.6.4† (China TST 2008). Define the sequence $(x_n)_{n \ge 1}$ by $x_1 = 2$, $x_2 = 12$ and $x_{n+2} = 6x_{n+1} - x_n$
for n ≥ 1. Suppose p and q are rational primes such that $q \mid x_p$. Prove that, if q ≠ 2, 3,
then q ≥ 2p − 1.

Solution
Without loss of generality, suppose that p is odd. It is easy to see that $x_n = 2 \cdot \frac{\alpha^n - \beta^n}{\alpha - \beta}$ where
α and β are the roots of $X^2 - 6X + 1$. From now on we will assume α and β to be the roots
of $X^2 - 6X + 1$ over $\mathbb{F}_q$, since we are working modulo q. We have $\alpha, \beta \in \mathbb{F}_{q^2}$ and $\Phi_p(\alpha, \beta) = 0$ by
assumption. Thus, α/β has order p unless p = q. If p = q, $x_p \equiv x_1 = 2$ so q = 2.
Otherwise, since the order of α/β divides $q^2 - 1$, we have $p \mid q^2 - 1$, i.e. q ≡ ±1 (mod p) so
q ≥ 2p − 1 as wanted since p ± 1 is even.
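
The statement is easy to test on the first few primes p. The sketch below computes $x_p$ from the recurrence, factors it by trial division (adequate only for the small values used here) and checks the bound on every prime factor other than 2 and 3. The function names are ours.

```python
def x_term(n):
    """x_n for the recurrence x_1 = 2, x_2 = 12, x_{k+2} = 6 x_{k+1} - x_k."""
    a, b = 2, 12
    for _ in range(n - 1):
        a, b = b, 6 * b - a
    return a


def prime_factors(n):
    """Set of prime factors of n, by trial division."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


for p in [3, 5, 7, 11, 13]:
    qs = sorted(prime_factors(x_term(p)) - {2, 3})
    print(p, x_term(p), qs, all(q >= 2 * p - 1 for q in qs))
```

For instance $x_5 = 2378 = 2 \cdot 29 \cdot 41$ and $x_7 = 80782 = 2 \cdot 13^2 \cdot 239$.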


 
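Exercise 4.6.6† . Let p ≠ 2, 5 be a prime number. Prove that $p \mid F_{p-\varepsilon}$ where $\varepsilon = \left(\frac{5}{p}\right)$.

Solution

Let $\alpha \in \mathbb{F}_{p^2}$ be a square root of 5. The key point is that $\left(\frac{1 \pm \alpha}{2}\right)^{p-\varepsilon}$ takes the same value for both signs.
Indeed, $\left(\frac{1 \pm \alpha}{2}\right)^{p} = \frac{1 \pm \varepsilon\alpha}{2}$ by Frobenius, as $\alpha^{p-1} = 5^{\frac{p-1}{2}} = \varepsilon$. When ε = 1, this gives $\left(\frac{1 \pm \alpha}{2}\right)^{p-1} = 1$.
When ε = −1, this gives $\left(\frac{1 \pm \alpha}{2}\right)^{p+1} = \frac{1 + \alpha}{2} \cdot \frac{1 - \alpha}{2} = -1$, the product of the two roots of $X^2 - X - 1$. Thus,
\[ F_{p-\varepsilon} \equiv \frac{\left(\frac{1+\alpha}{2}\right)^{p-\varepsilon} - \left(\frac{1-\alpha}{2}\right)^{p-\varepsilon}}{\alpha} = 0 \]
as wanted.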


 
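Exercise 4.6.7† . Let p ≠ 2, 5 be a rational prime. Prove that $p \mid F_p - \left(\frac{5}{p}\right)$.

Solution

Let $\alpha \in \mathbb{F}_{p^2}$ be a square root of 5. Then,

\[ F_p \equiv \frac{(1 + \alpha)^p - (1 - \alpha)^p}{2^p \alpha} = \frac{(1 + \alpha^p) - (1 - \alpha^p)}{2\alpha} = \alpha^{p-1} \]
which is $\left(\frac{5}{p}\right)$ as $\alpha^{p-1} = 5^{\frac{p-1}{2}}$.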
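
Both Fibonacci congruences (this exercise and the previous one) can be checked for small primes. The sketch below computes the Legendre symbol with Euler’s criterion and the Fibonacci numbers modulo p by the plain recurrence; the function names and the bound 200 are our own choices.

```python
def fib_mod(n, m):
    """F_n modulo m, with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, (a + b) % m
    return a % m


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))


for p in [q for q in range(3, 200) if is_prime(q) and q != 5]:
    eps = legendre(5, p)
    # p | F_{p - eps}  and  F_p = eps (mod p)
    assert fib_mod(p - eps, p) == 0
    assert fib_mod(p, p) == eps % p
print("both congruences hold for all primes p < 200 other than 2 and 5")
```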


Exercise 4.6.8† . Let m ≥ 1 be an integer and p a rational prime. Find the maximal possible period
modulo p ≥ m of a sequence satisfying a linear recurrence of order m.

Solution

We prove that the maximum possible is $p^m - 1$. Here is a construction: let $\alpha \in \overline{\mathbb{F}}_p$ be an element
of order $p^m - 1$ with conjugates $\alpha_1, \ldots, \alpha_m$, i.e. a primitive root of $\mathbb{F}_{p^m}$. Consider the sequence
\[ a_n := \sum_{i=1}^{m} \alpha_i^n, \]

which takes values in $\mathbb{F}_p$ by the fundamental theorem of symmetric polynomials (and is a linear
recurrence of order m). Suppose that it has period t, i.e.

\[ a_{n+t} = a_n,\ a_{n+t+1} = a_{n+1},\ \ldots,\ a_{n+t+m-1} = a_{n+m-1} \]

for some n. Then, the Vandermonde determinant gives that $\alpha_i^t = 1$, by considering this as a
system of equations with coefficients $\alpha_i^j$ and solution $\alpha_i^{n+t} = \alpha_i^n$. This shows that the period is
$p^m - 1$.
Now, let $\sum_{i=1}^{r} f_i(n) \alpha_i^n$ be a linear recurrence of order m (the $\alpha_i$ are not necessarily conjugates
anymore). Suppose first that all $f_i$ are constant. Group the $\alpha_i$ by their degrees $k_1, \ldots, k_s$. Since
the period is at most the product of the orders of the $\alpha_i$, and the order of an element of degree
k divides $p^k - 1$, the period is at most

\[ (p^{k_1} - 1) \cdot \ldots \cdot (p^{k_s} - 1) \]
which is at most $p^m - 1$ since $\sum_{i=1}^{s} k_i \le m$ (there might be repeated roots so we don’t necessarily
have equality).

Finally, if one of the $f_i$ is not constant anymore, then we group the $\alpha_i$ by their degrees as before.
The difference is that, now $\sum_{i=1}^{s} k_i \le m - 1$ (there is at least one repeated root). Since all
polynomials have period dividing p, the period is at most

\[ p(p^{k_1} - 1) \cdot \ldots \cdot (p^{k_s} - 1) < p^m - 1. \]
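
For very small p and m the maximum can be found by exhaustive search over all coefficient vectors and all initial states. The sketch below does this; the names `period` and `max_period` are ours, and the search is exponential in m, so it is only meant as an illustration.

```python
from itertools import product


def period(coeffs, state, p):
    """Length of the cycle eventually reached by the state (a_n, ..., a_{n+m-1})
    under the recurrence a_{n+m} = sum_i coeffs[i] * a_{n+i} modulo p."""
    seen = {}
    step = 0
    while state not in seen:
        seen[state] = step
        nxt = sum(c * a for c, a in zip(coeffs, state)) % p
        state = state[1:] + (nxt,)
        step += 1
    return step - seen[state]


def max_period(p, m):
    """Largest period over all recurrences of order at most m and all initial states."""
    return max(period(coeffs, state, p)
               for coeffs in product(range(p), repeat=m)
               for state in product(range(p), repeat=m))


for p, m in [(2, 3), (3, 2), (2, 4), (5, 2)]:
    print(p, m, max_period(p, m), p**m - 1)
```

For example, with p = 2 and m = 3, the recurrence $a_{n+3} = a_{n+1} + a_n$ started at (0, 0, 1) has period 7.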

Remark 4.6.1
It is interesting to note that this proof also characterises the linear recurrences with maximal
period. Indeed, their characteristic polynomial must be the minimal polynomial of a primitive
root of Fpm by what we have seen, and, conversely, Vandermonde shows that all such sequences
have period pm − 1.

Exercise 4.6.9† . Let f ∈ Z[X] be a polynomial and $(a_n)_{n \ge 0}$ be a linear recurrence of rational
integers. Suppose that $f(n) \mid a_n$ for any rational integer n ≥ 0. Prove that $\left(\frac{a_n}{f(n)}\right)_{n \ge 0}$ is also a linear
recurrence.⁵

Solution
Write $a_n = \sum_{i=1}^{m} f_i(n) \alpha_i^n$. We shall prove that $f \mid f_i$ for every i, thus showing the wanted result.
Choose some n and a large prime p such that p | f (n) using Theorem 5.2.1. Then consider
$a_n, a_{n+p}, \ldots, a_{n+(m-1)p}$. These are all zero modulo p since $f(k) \mid a_k$. However, the Vandermonde
determinant shows that this implies that either $f_i(n) \equiv 0$ for all i, or the determinant of $(\alpha_i^{jp})$ is

⁵ In fact, the Hadamard quotient theorem states that if a linear recurrence $b_n$ always divides another linear recurrence
$a_n$ then $\left(\frac{a_n}{b_n}\right)_{n \ge 0}$ is also a linear recurrence.

zero, i.e. $\alpha_i^p \equiv \alpha_j^p$ for some i ≠ j. This is clearly impossible for large p since this implies
\[ p \mid \prod_{i \ne j} (\alpha_i - \alpha_j), \]

as $\alpha_i^p - \alpha_j^p \equiv (\alpha_i - \alpha_j)^p$ by Frobenius. Thus, we get $p \mid f(n) \implies p \mid f_i(n)$ for large p. We
can then use Corollary 5.4.2 (it is clear that the proof also works for f, g ∉ Z[X]) to deduce that
all irreducible factors of f divide $f_i$ for every i. Simply divide f and all $f_i$ by these irreducible
factors, and repeat the argument.


Polynomials and Elements of Fp


Exercise 4.6.11† . Let a ∈ Fp be non-zero. Prove that $X^{p^n} - X - a$ is irreducible over $\mathbb{F}_p$ if and only
if n = 1, or n = p = 2.

Solution

Let α be a root of $f = X^{p^n} - X - a$. We have $\alpha^{p^{kn}} = \alpha + ka$ for any integer k by induction.
Since these are all conjugates of α, we get that α has at least p conjugates which shows that
$X^p - X - a$ is its minimal polynomial when n = 1. Otherwise, when k = p, we have $\alpha^{p^{pn}} = \alpha$
so $\alpha \in \mathbb{F}_{p^{pn}}$ which means that it has degree at most pn. Since $pn < p^n$ when (p, n) ≠ (2, 2), we
get the desired reducibility in these cases. Finally, when p = n = 2, $X^4 - X - 1$ is irreducible
since α doesn’t have degree dividing 2: $\alpha^{2^2} = \alpha + 1 \ne \alpha$.
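
The dichotomy can be observed on small cases with a naive irreducibility test. The sketch below represents a polynomial over $\mathbb{F}_p$ by its list of coefficients (constant term first) and tests divisibility by every monic polynomial of degree at most half the degree. All function names are ours, and the method is only practical for tiny p and n.

```python
from itertools import product


def poly_mod(f, g, p):
    """Remainder of f modulo a monic g over F_p (coefficients, constant term first)."""
    f = list(f)
    while len(f) >= len(g):
        lead = f[-1]
        shift = len(f) - len(g)
        for i, c in enumerate(g):
            f[shift + i] = (f[shift + i] - lead * c) % p
        f.pop()  # the leading coefficient is now zero
    return f


def is_irreducible(f, p):
    """Trial division by every monic polynomial of degree between 1 and deg(f) / 2."""
    degree = len(f) - 1
    for d in range(1, degree // 2 + 1):
        for tail in product(range(p), repeat=d):
            g = list(tail) + [1]
            if not any(poly_mod(f, g, p)):
                return False
    return True


def special_polynomial(p, n, a):
    """Coefficients of X^(p^n) - X - a over F_p."""
    f = [0] * (p**n + 1)
    f[0] = -a % p
    f[1] = -1 % p
    f[-1] = 1
    return f


for p, n in [(2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (5, 1)]:
    print(p, n, is_irreducible(special_polynomial(p, n, 1), p))
```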


Exercise 4.6.12† (ISL 2003). Let (an )n≥0 be a sequence of rational integers such that an+1 = a2n − 2.
Suppose an odd rational prime p divides an . Prove that p ≡ ±1 (mod 2n+2 ).

Solution
We prove that $f = X^2 - 2$ iterated $n$ times is $\Psi_{2^{n+2}}$. This means that
$$f^n\left(X + \frac{1}{X}\right) = \Phi_{2^{n+2}}/X^{2^n} = X^{2^n} + \frac{1}{X^{2^n}}.$$
Note that $f\left(X + \frac{1}{X}\right) = X^2 + \frac{1}{X^2}$ so this follows by induction.
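
The statement can be sanity-checked numerically. The following is only a minimal sketch: the helper names `odd_prime_factors` and `check_sequence` are made up for this illustration, and trial division limits it to small indices.

```python
def odd_prime_factors(n):
    """Return the set of odd primes dividing the non-zero integer n."""
    n = abs(n)
    while n and n % 2 == 0:
        n //= 2
    primes, d = set(), 3
    while d * d <= n:
        while n % d == 0:
            primes.add(d)
            n //= d
        d += 2
    if n > 1:
        primes.add(n)
    return primes


def check_sequence(a0, max_n):
    """Check p = +-1 (mod 2^(n+2)) for every odd prime p | a_n, 0 <= n <= max_n."""
    a = a0
    for n in range(max_n + 1):
        modulus = 2 ** (n + 2)
        for p in odd_prime_factors(a):
            if p % modulus not in (1, modulus - 1):
                return False
        a = a * a - 2  # a_{n+1} = a_n^2 - 2
    return True
```

For instance, starting from $a_0 = 3$ one gets $7, 47, 2207$, which are congruent to $-1$ modulo $8$, $16$ and $32$ respectively.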


Exercise 4.6.14†. Let $f \in \mathbb{F}_p[X]$ be an irreducible polynomial of odd degree. Prove that its discriminant is a square in $\mathbb{F}_p$.

Solution
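The square root of the discriminant $\Delta = \prod_{i<j} (\alpha_i - \alpha_j)^2$ of a polynomial $f = \prod_{i=1}^n (X - \alpha_i)$ is
$$\sqrt{\Delta} = \pm \prod_{i<j} (\alpha_i - \alpha_j).$$
Thus, if this was not in $\mathbb{F}_p$, $\mathbb{F}_{p^n}$ would contain $\mathbb{F}_p(\sqrt{\Delta}) = \mathbb{F}_{p^2}$, which is impossible since $2 \nmid n$.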


Remark 4.6.2

In particular, for $n = 3$, $\sqrt{\Delta} \in \mathbb{F}_p$ if and only if $f$ is irreducible or splits in $\mathbb{F}_p$.
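
For cubics this can be tested by brute force. The sketch below makes two assumptions that are not in the text: it only looks at cubics of the special shape $X^3 + aX + b$, whose discriminant is $-4a^3 - 27b^2$, and it only handles odd primes $p$ (Euler's criterion). The function names are hypothetical.

```python
def num_roots(a, b, p):
    """Number of roots of X^3 + aX + b in F_p."""
    return sum((x ** 3 + a * x + b) % p == 0 for x in range(p))


def is_nonzero_square(d, p):
    """Euler's criterion for an odd prime p."""
    return pow(d, (p - 1) // 2, p) == 1


def check_cubics(p):
    """For squarefree X^3 + aX + b: discriminant is a square iff 0 or 3 roots."""
    for a in range(p):
        for b in range(p):
            disc = (-4 * a ** 3 - 27 * b ** 2) % p
            if disc == 0:
                continue  # repeated root, so certainly not irreducible
            roots = num_roots(a, b, p)
            if is_nonzero_square(disc, p) != (roots in (0, 3)):
                return False
    return True
```

A cubic without roots is irreducible, so the case `roots == 0` is exactly the situation of the exercise.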

Exercise 4.6.15† (Chevalley-Warning Theorem). Let $f_1, \ldots, f_m \in \mathbb{F}_{p^k}[X_1, \ldots, X_n]$ be polynomials
such that $d_1 + \ldots + d_m < n$, where $d_i$ is the degree of $f_i$. Prove that, if $f_1, \ldots, f_m$ have a common
root in $\mathbb{F}_{p^k}$, then they have another one.

Solution

We shall prove more strongly that the number of common roots is divisible by $p$. This follows
from the following result: we have $\sum_{x \in \mathbb{F}_p} x^k = 0$ for $k < p - 1$ by Exercise A.3.12†, so the sum
over $\mathbb{F}_p^n$ of $f(x)$ for any polynomial $f \in \mathbb{F}_p[X_1, \ldots, X_n]$ of degree less than $n(p - 1)$ also vanishes
(since, in each monomial, one variable must have degree less than $p - 1$). This yields our claim when applied to the
polynomial
$$f = (1 - f_1^{p-1}) \cdot \ldots \cdot (1 - f_m^{p-1})$$
(the powers mean exponentiation and not iteration). Indeed, this has degree less than $n(p - 1)$
by assumption, and $f(x)$ is $1$ if $x$ is a common root of $f_1, \ldots, f_m$ and $0$ otherwise.
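
The divisibility of the number of common roots by $p$ is easy to observe experimentally over a prime field. This is a brute-force sketch only; the function `count_common_zeros` and the sample polynomials are chosen here purely for illustration.

```python
from itertools import product


def count_common_zeros(polys, n_vars, p):
    """Count the points of F_p^n_vars where every polynomial in polys vanishes."""
    return sum(
        all(f(*x) % p == 0 for f in polys)
        for x in product(range(p), repeat=n_vars)
    )


# One quadric in 3 variables: total degree 2 < 3.
f1 = lambda x, y, z: x * x + y * y + z * z
# A quadric and a linear form in 4 variables: total degree 2 + 1 < 4.
g1 = lambda x, y, z, w: x * y + z * w + 1
g2 = lambda x, y, z, w: x + y + z + w
```

Here `count_common_zeros([f1], 3, 3)` counts the zeros of $x^2 + y^2 + z^2$ over $\mathbb{F}_3$, and the count for the pair `g1, g2` is allowed to be zero, which is still divisible by $p$.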


Exercise 4.6.16† (USA TST 2016). Define $\Psi : \mathbb{F}_p[X] \to \mathbb{F}_p[X]$ by
$$\Psi\left(\sum_{i=0}^n a_i X^i\right) = \sum_{i=0}^n a_i X^{p^i}.$$
Prove that, for any $f, g \in \mathbb{F}_p[X]$, $\gcd(\Psi(f), \Psi(g)) = \Psi(\gcd(f, g))$.

Solution

Let f, g ∈ Fp [X] be polynomials. By Frobenius and Fermat’s little theorem, Ψ is a linear map.
Note that Ψ(gcd(f, g)) and gcd(Ψ(f ), Ψ(g)) are both monic so it suffices to prove that they both
divide each other. We prove that u | v implies Ψ(u) | Ψ(v). This implies in particular that
Ψ(gcd(f, g)) divides both Ψ(f ) and Ψ(g) so divides gcd(Ψ(f ), Ψ(g)). To see this, write
$$v = u\left(\sum_{i=0}^n a_i X^i\right)$$
so that
$$\Psi(v) = \sum_{i=0}^n a_i \Psi(uX^i) = \sum_{i=0}^n a_i \Psi(u)^{p^i}$$
since $\Psi(uX) = \Psi(u)^p$ by Frobenius, and this is indeed divisible by $\Psi(u)$. Conversely, let $u, v \in$
Fp [X] be such that uf + vg = gcd(f, g), using Bézout’s lemma. Then,
Ψ(uf ) + Ψ(vg) = Ψ(gcd(f, g)).
By our previous claim, this is divisible by gcd(Ψ(f ), Ψ(g)) as wanted.
Alternatively, to prove the second divisibility relation, we could have noted that the gcd of two
elements in im Ψ still lies in im Ψ, and that our implication u | v =⇒ Ψ(u) | Ψ(v) is in fact an
equivalence. If we then set gcd(Ψ(f ), Ψ(g)) = Ψ(h), we get h | f, g so that h | gcd(f, g), and thus
gcd(Ψ(f ), Ψ(g)) = Ψ(h) | Ψ(gcd(f, g)).
Note that these claims mean that $\Psi$ is a morphism of partially ordered sets (posets), where the
order relation is the divisibility relation. Indeed, the gcd is the greatest lower bound of elements under
this relation. To prove them, we can note for instance that the Euclidean division in im $\Psi$ stays
in im $\Psi$: if $f = \sum_{i=0}^m a_i X^{p^i}$ has greater degree than $g = \sum_{i=0}^n b_i X^{p^i}$, then
$$f - \frac{a_m}{b_n} g^{p^{m-n}}$$
has smaller degree than $f$ and is still in im $\Psi$. Iterating this process yields that the quotient
and remainder are also in im Ψ. For the other part, suppose that Ψ(u) | Ψ(v). Let v = uq + r
be the Euclidean division of v by u. Then, Ψ(u) divides Ψ(r) by the implication we showed, and
since deg(Ψ(r)) < deg(Ψ(u)), Ψ(r) must be 0, i.e. r = 0 and u | v.
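
The identity $\gcd(\Psi(f), \Psi(g)) = \Psi(\gcd(f, g))$ can be checked exhaustively for small degrees. The sketch below is not taken from the text: it represents polynomials over $\mathbb{F}_p$ as coefficient lists (lowest degree first), and all function names are assumptions of this example. It needs Python 3.8 or later for `pow(x, -1, p)`.

```python
from itertools import product


def trim(a):
    """Drop trailing zero coefficients in place and return the list."""
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_rem(a, b, p):
    """Remainder of a modulo the non-zero polynomial b over F_p."""
    a = a[:]
    inv = pow(b[-1], -1, p)
    while len(a) >= len(b):
        c = a[-1] * inv % p
        shift = len(a) - len(b)
        for i, bi in enumerate(b):
            a[shift + i] = (a[shift + i] - c * bi) % p
        trim(a)
    return a


def poly_gcd(a, b, p):
    """Monic gcd of two non-zero polynomials over F_p (Euclidean algorithm)."""
    a = trim([x % p for x in a])
    b = trim([x % p for x in b])
    while b:
        a, b = b, poly_rem(a, b, p)
    inv = pow(a[-1], -1, p)
    return [x * inv % p for x in a]


def psi(a, p):
    """Psi(sum a_i X^i) = sum a_i X^(p^i) for a trimmed non-zero list a."""
    out = [0] * (p ** (len(a) - 1) + 1)
    for i, ai in enumerate(a):
        out[p ** i] = ai % p
    return out


def check_gcd_identity(p, max_len):
    """Test the identity for all non-zero f, g with fewer than max_len + 1 coefficients."""
    polys = [trim(list(c)) for c in product(range(p), repeat=max_len) if any(c)]
    return all(
        poly_gcd(psi(f, p), psi(g, p), p) == psi(poly_gcd(f, g, p), p)
        for f in polys
        for g in polys
    )
```

For example, over $\mathbb{F}_2$ we have $\Psi(X + 1) = X^2 + X$ and $\Psi(X) = X^2$, whose gcd is $X = \Psi(1)$.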


Remark 4.6.3
In fact, the equality
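$$\Psi(v) = \sum_{i=0}^n a_i \Psi(uX^i) = \sum_{i=0}^n a_i \Psi(u)^{p^i}$$
shows that $\Psi$ is a ring morphism from $(\mathbb{F}_p[X], +, \cdot)$ to $(\operatorname{im} \Psi, +, \circ)$. Indeed, if we set $v = uw$, i.e.
$w = \sum_{i=0}^n a_i X^i$, then
$$\Psi(v) = \sum_{i=0}^n a_i \Psi(u)^{p^i}$$
is precisely $\Psi(v) = \Psi(w) \circ \Psi(u)$.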

Squares and the Law of Quadratic Reciprocity


Exercise 4.6.20†. Let $q$ be a prime power, $a \in \mathbb{F}_q^\times$ and $m \ge 1$ an integer. Prove that $a$ is an $m$th
power in $\mathbb{F}_q$ if and only if $a^{\frac{q-1}{\gcd(q-1,m)}} = 1$.

Solution

Let $g$ be a primitive root of $\mathbb{F}_q^\times$. Let $k$ be such that $a = g^k$. Then, $a$ is an $m$th power if and only if
there is an $n$ such that $g^k = g^{mn}$, i.e. $k \equiv mn \pmod{q-1}$, which is equivalent to $\gcd(q-1, m) \mid k$.
Finally, this is itself equivalent to $a^{\frac{q-1}{\gcd(q-1,m)}} = 1$.
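
A quick brute-force comparison illustrates the criterion. As a simplifying assumption, this sketch only treats prime fields $\mathbb{F}_p$ (integers modulo a prime $p$), not general prime powers; both function names are invented for the example.

```python
from math import gcd


def is_mth_power(a, m, p):
    """Brute force: is a an m-th power of some non-zero element of F_p?"""
    return any(pow(x, m, p) == a % p for x in range(1, p))


def criterion(a, m, p):
    """The criterion of the exercise: a^((p-1)/gcd(p-1, m)) = 1 in F_p."""
    return pow(a, (p - 1) // gcd(p - 1, m), p) == 1
```

The two functions should agree for every non-zero residue `a`, every exponent `m >= 1` and every prime `p`.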


Exercise 4.6.21†. Let $a$ be a rational integer. Suppose $a$ is a quadratic residue modulo every rational
prime $p \nmid a$. Prove that $a$ is a perfect square.

Solution

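Without loss of generality, suppose that $a = \varepsilon 2^n p_1 \cdot \ldots \cdot p_k$ is squarefree, where $\varepsilon = \pm 1$, $n \in \{0, 1\}$
and $p_1, \ldots, p_k$ are distinct odd primes. Suppose for the sake of contradiction that $k \ge 1$. Let $r$
be a quadratic non-residue modulo $p_1$. Pick a prime $p \equiv 1 \pmod{8p_2 \cdot \ldots \cdot p_k}$ and $p \equiv r \pmod{p_1}$,
using Dirichlet’s theorem. Then, since $p \equiv 1 \pmod 4$ we have $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right)$ by the law of quadratic
reciprocity for any odd prime $q \neq p$, and since $p \equiv 1 \pmod 8$ we have $\left(\frac{2}{p}\right) = \left(\frac{-1}{p}\right) = 1$. Thus,
\begin{align*}
\left(\frac{a}{p}\right) &= \left(\frac{\varepsilon}{p}\right)\left(\frac{2}{p}\right)^n\left(\frac{p_1}{p}\right) \cdot \ldots \cdot \left(\frac{p_k}{p}\right) \\
&= \left(\frac{p}{p_1}\right) \cdot \ldots \cdot \left(\frac{p}{p_k}\right) \\
&= \left(\frac{r}{p_1}\right)\left(\frac{1}{p_2}\right) \cdot \ldots \cdot \left(\frac{1}{p_k}\right) \\
&= (-1) \cdot 1 \cdot \ldots \cdot 1 \\
&= -1
\end{align*}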

which is a contradiction. This means that $a \in \{\pm 1, \pm 2\}$; we could simply give counterexamples
to $-1, \pm 2$, but we construct arbitrarily large ones so that the problem still holds with the slightly
weaker assumption that $a$ is a quadratic residue modulo sufficiently large primes. For $a = -1$,
simply pick any prime congruent to $-1$ modulo $4$. For $a = 2$, pick a prime congruent to $3$ modulo
$8$, and for $a = -2$, pick a prime congruent to $-1$ modulo $8$.

Finally, we give some ways to avoid the use of Dirichlet’s theorem on primes in arithmetic
progressions. Instead of picking a prime p ≡ 1 (mod 8p2 · . . . · pk ) and p ≡ r (mod p1 ), we could
simply choose p to be such an integer with sufficiently large prime factors, and replace Legendre
symbols by Jacobi symbols.
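
The contrapositive is easy to observe on small examples: for a non-square $a$, a prime $p \nmid a$ modulo which $a$ is a non-residue shows up very quickly. The following sketch uses hypothetical helper names and an arbitrary search bound of $1000$, which is an assumption of this illustration rather than a proven bound.

```python
from math import isqrt


def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def non_residue_witness(a, bound=1000):
    """Smallest odd prime p < bound, p not dividing a, with (a/p) = -1, else None."""
    for p in range(3, bound, 2):
        if is_prime(p) and a % p != 0 and legendre(a, p) == -1:
            return p
    return None


def is_perfect_square(a):
    return a >= 0 and isqrt(a) ** 2 == a
```

For instance, `non_residue_witness(-1)` and `non_residue_witness(2)` both return `3`, while a perfect square never has a witness.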


Remark 4.6.4
This result illustrates the celebrated Chebotarev density theorem, which implies that the set of
primes $p$ such that the polynomial $X^2 - a$ splits over $\mathbb{F}_p$ has density $\frac{1}{2}$ when it is irreducible (and
of course $1$ otherwise). (This can also be seen from a more careful observation of the quadratic
reciprocity law and our solution of the exercise.) This theorem also implies that, if $a$ is an $n$th
power modulo all sufficiently large primes, then it is an $n$th power if $8 \nmid n$, and an $\frac{n}{2}$th power otherwise
(which is sharp, as shown by Exercise 4.6.22†). A note on this theorem: it does not imply that
the set of primes $p$ such that an irreducible polynomial $f$ of degree $n$ splits over $\mathbb{F}_p$ has density
$\frac{1}{n}$; this depends on its Galois group (see Chapter 6).

Exercise 4.6.22† . Prove that 16 is an eighth power modulo every prime but not an eighth power in
Q.

Solution

Notice that $X^8 - 16 = (X^2 - 2)(X^2 + 2)(X^4 + 4)$. Thus, it has a root in $\mathbb{F}_p$ if $2$ or $-2$ is a
quadratic residue. Otherwise, $p \equiv 5 \pmod 8$, which implies that $-4$ is a fourth power in $\mathbb{F}_p$ so it
has a root as well. Indeed,
$$(-1)^{\frac{p-1}{4}} = -1 = 2^{\frac{p-1}{2}} = 4^{\frac{p-1}{4}}$$
so $(-4)^{\frac{p-1}{4}} = 1$, which means that $-4$ is a fourth power by Exercise 4.6.20†.
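
As a sanity check, one can search for an eighth root of $16$ modulo each small prime. This is a brute-force sketch with made-up helper names; the bound $200$ is arbitrary.

```python
from math import isqrt


def primes_below(n):
    return [p for p in range(2, n) if all(p % d for d in range(2, isqrt(p) + 1))]


def has_eighth_root_of_16(p):
    """Is there x in F_p with x^8 = 16?"""
    return any(pow(x, 8, p) == 16 % p for x in range(p))
```

Every prime below $200$ passes the test, while over the integers $x^8$ jumps from $1$ to $256$ and so never equals $16$.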


Exercise 4.6.23†. Prove that, if a polynomial $f \in \mathbb{Z}[X]$ of degree $2$ has a root in $\mathbb{F}_p$ for any rational
prime $p$, then it has a rational root. However, show that there exist polynomials of degree $5$ and $6$
that have a root in $\mathbb{F}_p$ for every prime $p$ but no rational root.6

6 The Chebotarev density theorem implies that such a polynomial must be reducible. In fact it even characterises polynomials which have a root in $\mathbb{F}_p$ for every rational prime $p$ based on the Galois groups of their splitting field (see Chapter 6). In particular, it shows that $5$ and $6$ are minimal.

Solution

For odd p, a quadratic polynomial f ∈ Z[X] has a root in Fp if and only if its discriminant ∆
is a square in Fp . Hence, ∆ is a square modulo sufficiently large primes, so it is a square by
Exercise 4.6.21† , i.e. f has rational roots.

For n = 6, the following polynomial works: (X 2 +1)(X 2 +2)(X 2 −2). Indeed, for any odd prime p,
if both 2 and −1 are quadratic non-residues, then −2 is a quadratic residue (and 1 is a quadratic
residue modulo 2). For n = 5, the following works: (X 2 + X + 1)(X 3 − 2). Indeed, if p ≡ 1
(mod 3) then Φ3 has a root modulo p, and otherwise 2 is a cube modulo p by Exercise 4.6.20† .
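
Both examples can be verified by brute force for small primes. This is only an illustrative sketch: the names `sextic`, `quintic` and `has_root_mod` are introduced here, and the absence of rational roots is checked through the integer divisors of the constant term (rational root theorem, both polynomials being monic).

```python
from math import isqrt


def primes_below(n):
    return [p for p in range(2, n) if all(p % d for d in range(2, isqrt(p) + 1))]


def sextic(x):
    return (x * x + 1) * (x * x + 2) * (x * x - 2)


def quintic(x):
    return (x * x + x + 1) * (x ** 3 - 2)


def has_root_mod(f, p):
    """Does the integer polynomial function f have a root modulo p?"""
    return any(f(x) % p == 0 for x in range(p))
```

The constant terms are $-4$ and $-2$, so the only candidates for rational roots are $\pm 1, \pm 2, \pm 4$ and $\pm 1, \pm 2$ respectively.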


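Exercise 4.6.24† (Jacobi Reciprocity). Define the Jacobi symbol $\left(\frac{\cdot}{n}\right)$ of an odd positive integer $n$ as
the product
$$\left(\frac{\cdot}{p_1}\right)^{n_1} \cdot \ldots \cdot \left(\frac{\cdot}{p_k}\right)^{n_k}$$
where $n = p_1^{n_1} \cdot \ldots \cdot p_k^{n_k}$ is the prime factorisation of $n$. Prove the following statements: for any odd
$m, n$
• $\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = (-1)^{\frac{m-1}{2} \cdot \frac{n-1}{2}}$.
• $\left(\frac{-1}{m}\right) = (-1)^{\frac{m-1}{2}}$.
• $\left(\frac{2}{m}\right) = (-1)^{\frac{m^2-1}{8}}$.

(The Jacobi symbol $\left(\frac{m}{n}\right)$ is $1$ if $m$ is a quadratic residue modulo $n$ but may also be $1$ if $m$ isn’t.)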

Solution
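Let $m = \prod_i p_i$ (not necessarily distinct) and $n = \prod_j q_j$. Then,
$$\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = \prod_{i,j} \left(\frac{p_i}{q_j}\right)\left(\frac{q_j}{p_i}\right) = \prod_{i,j} (-1)^{\frac{p_i-1}{2} \cdot \frac{q_j-1}{2}} = (-1)^{\sum_{i,j} \frac{p_i-1}{2} \cdot \frac{q_j-1}{2}}.$$
Thus, we want to show that $\frac{a-1}{2} + \frac{b-1}{2} \equiv \frac{ab-1}{2} \pmod 2$ for any odd $a$ and $b$, since this
implies that
$$\sum_{i,j} \frac{p_i-1}{2} \cdot \frac{q_j-1}{2} = \left(\sum_i \frac{p_i-1}{2}\right)\left(\sum_j \frac{q_j-1}{2}\right) \equiv \frac{\prod_i p_i - 1}{2} \cdot \frac{\prod_j q_j - 1}{2} = \frac{m-1}{2} \cdot \frac{n-1}{2} \pmod 2$$
as wanted. This is equivalent to $a - 1 + b - 1 \equiv ab - 1 \pmod 4$, i.e. $4 \mid (a - 1)(b - 1)$, which is
clearly true. Similarly, we have
$$\left(\frac{-1}{m}\right) = \prod_i \left(\frac{-1}{p_i}\right) = \prod_i (-1)^{\frac{p_i-1}{2}} = (-1)^{\sum_i \frac{p_i-1}{2}}$$
which is $(-1)^{\frac{m-1}{2}}$ by the previous computation. Finally,
$$\left(\frac{2}{m}\right) = \prod_i \left(\frac{2}{p_i}\right) = \prod_i (-1)^{\frac{p_i^2-1}{8}} = (-1)^{\sum_i \frac{p_i^2-1}{8}}$$
so we want to show that $\frac{a^2-1}{8} + \frac{b^2-1}{8} \equiv \frac{(ab)^2-1}{8} \pmod 2$, i.e. $16 \mid (a^2 - 1)(b^2 - 1)$, which is true.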
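
The three laws can be tested directly from the definition. In the sketch below, the Jacobi symbol is computed as a product of Legendre symbols (via Euler's criterion) over the prime factorisation of the denominator; the function names are assumptions of this example, and reciprocity is only tested for coprime odd positive $m, n$.

```python
from math import gcd


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def jacobi(a, n):
    """Jacobi symbol (a/n) for odd positive n, straight from the definition."""
    result, d = 1, 3
    while d * d <= n:
        while n % d == 0:  # d is prime here: smaller factors are already removed
            result *= legendre(a, d)
            n //= d
        d += 2
    if n > 1:
        result *= legendre(a, n)
    return result


def check_laws(limit):
    odd = range(1, limit, 2)
    reciprocity = all(
        jacobi(m, n) * jacobi(n, m) == (-1) ** ((m - 1) // 2 * ((n - 1) // 2))
        for m in odd for n in odd if gcd(m, n) == 1
    )
    minus_one = all(jacobi(-1, m) == (-1) ** ((m - 1) // 2) for m in odd)
    two = all(jacobi(2, m) == (-1) ** ((m * m - 1) // 8) for m in odd)
    return reciprocity and minus_one and two
```

For example, `jacobi(2, 15)` is $1$ although $2$ is not a square modulo $15$, in line with the parenthetical remark of the exercise.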


Exercise 4.6.25†. Suppose $a_1, \ldots, a_n$ are distinct squarefree rational integers such that
$$\sum_{i=1}^n b_i \sqrt{a_i} = 0$$
for some rational numbers $b_1, \ldots, b_n$. Prove that $b_1 = \ldots = b_n = 0$.

Solution
Let $p_1, \ldots, p_k$ be the prime factors of the $a_i$. We proceed by induction on $k$. Write $\sum_{i=1}^n b_i \sqrt{a_i}$ as
$$A + B\sqrt{p_k} := A(\sqrt{p_1}, \ldots, \sqrt{p_{k-1}}) + B(\sqrt{p_1}, \ldots, \sqrt{p_{k-1}})\sqrt{p_k},$$
where $A$ and $B$ are linear combinations of square roots of integers with prime factors among
$p_1, \ldots, p_{k-1}$. The key point is that quadratic reciprocity gives us infinitely many primes $p$ such
that all $p_i$ for $i < k$ are quadratic residues, while $p_k$ isn’t. This implies that $p \mid B$, and for
sufficiently large $p$ we get $B = 0$. We can then remove it and repeat the argument. Note that
$A + B\sqrt{p_k}$ does not directly make sense modulo $p$, but if we consider square roots $\alpha_i$ of $p_i$ we
get that (by symmetric polynomials), for some choice of $\pm 1$,
$$A(\pm\alpha_1, \ldots, \pm\alpha_{k-1}) + B(\pm\alpha_1, \ldots, \pm\alpha_{k-1})\alpha_k \in \mathbb{F}_p$$
(for sufficiently large $p$ so there’s no problem with the denominators of the coefficients). For the
$p$ mentioned earlier, we get that $B(\pm\alpha_1, \ldots, \pm\alpha_{k-1}) = 0$, otherwise this sum is in $\mathbb{F}_{p^2}$ and not
$\mathbb{F}_p$. This implies that $B = 0$ (we see that infinitely many primes divide its norm, i.e. the product
of its conjugates).

Now, we prove this key claim. The proof is the same as the solution of Exercise 4.6.21†. Pick a
quadratic non-residue $r$ modulo $p_k$ and a large prime $p \equiv r \pmod{p_k}$, $p \equiv 1 \pmod{8p_1 \cdot \ldots \cdot p_{k-1}}$ (if $p_k \neq 2$,
otherwise this simply corresponds to the irrationality of $\sqrt{2}$; or we can pick a prime $p \equiv 5 \pmod 8$
instead of $1 \pmod 8$). As in Exercise 4.6.21†, we can substitute our use of the quadratic
reciprocity law by the Jacobi reciprocity law, although we need to adapt it slightly because we used
the convenient formalism of field theory (when $p$ isn’t prime, $\mathbb{Z}/p\mathbb{Z}$ is not a field, however the
fundamental theorem of symmetric polynomials works in any ring).


Exercise 4.6.26†. Let $n \ge 2$ be an integer and $p$ a prime factor of $2^{2^n} + 1$. Prove that $p \equiv 1 \pmod{2^{n+2}}$.

Solution

Note that, since $p \mid \Phi_{2^{n+1}}(2)$ and $n \ge 2$, $p \equiv 1 \pmod 8$, which implies that $2$ is a quadratic
residue modulo $p$. Thus, we have $2^{2^n} \equiv -1 \pmod p$ but $2^{\frac{p-1}{2}} \equiv 1 \pmod p$, which implies that
$v_2\left(\frac{p-1}{2}\right) > n$, i.e. $p \equiv 1 \pmod{2^{n+2}}$.
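
The first few Fermat numbers are small enough to factor by trial division, which gives a direct check of the congruence. A minimal sketch, with hypothetical function names:

```python
def prime_factors(n):
    """Set of prime factors of n >= 1 by trial division."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def fermat_factors_ok(n):
    """Check p = 1 (mod 2^(n+2)) for every prime p dividing 2^(2^n) + 1."""
    fermat = 2 ** (2 ** n) + 1
    return all(p % 2 ** (n + 2) == 1 for p in prime_factors(fermat))
```

For $n = 5$ this recovers Euler's factorisation $2^{32} + 1 = 641 \cdot 6700417$, and both factors are congruent to $1$ modulo $128$.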


Exercise 4.6.27† (USA TST 2014). Find all functions f : Z → Z such that (m − n)(f (m) − f (n)) is
a perfect square for all m, n ∈ Z.

Solution

We shall prove that, for any prime $p$, if $f(a) \equiv f(b) \pmod p$ for some $a \not\equiv b \pmod p$, then $f$ is
constant modulo $p$. Since $(f(a) - f(b))(a - b)$ is a square, we also get that $f(a) \equiv f(b) \pmod{p^2}$.
Then, since $(f(n) - f(b))(n - b)$ and $(f(n) - f(a))(n - a)$ are squares, we get that, in fact, $f$ is
constant modulo $p^2$, by looking at the $v_p$. Thus, we can divide $f$ by $p^2$ and give rise to another
solution with a smaller value of $f(1)$. Hence, assuming we have shown this, we can assume that
$f(a) \equiv f(b) \pmod p \implies a \equiv b \pmod p$. Now, suppose we are working under this assumption.
Then, $f(n + 1) - f(n)$ has no prime factor so must be $\pm 1$. Moreover, since $f$ is injective, it
must be always $1$ or always $-1$, but since $f(n + 1) - f(n)$ is a square, it must be always $1$. Thus,
$f(n) = n + c$, and if we remove the assumption that $f(1)$ was minimal, we get that all solutions
have the form $f(n) = a^2(n + c)$ (and clearly these work).

Hence, suppose that $f(a) \equiv f(b) \pmod p$ for some $a \not\equiv b \pmod p$. We need to show that $f$ is
constant modulo $p$. Without loss of generality, suppose that $f(a) \equiv 0$ by translating $f$, and that
$b = 0$ by translating $f$ inside (replace $f$ by $x \mapsto f(x + b)$). Let $S$ be the set of integers $s$ such
that $f(s) \equiv 0 \pmod p$. Note that, if $f(x) \not\equiv 0$ for some $x \not\equiv 0, a$, then $xf(x)$ and $f(x)(x - a)$ are
quadratic residues modulo $p$, and thus $\frac{x-a}{x}$ too. Hence, if we choose $x$ such that
$\frac{x-a}{x} = t \iff x = \frac{a}{1-t}$ where $t$ is a quadratic non-residue,
then $f(x) \equiv 0$, since $x \equiv 0$ or $a$ is impossible in that case. Now, note the only condition on
$a \in S$ is $a \not\equiv 0$, so we can replace it by $\frac{a}{1-t}$ where $t$ is a quadratic non-residue. This gives that
$\frac{a/(1-t)}{1-t} = \frac{a}{(1-t)^2}$ also satisfies these conditions. Iterating this process, we get that $\frac{a}{(1-t)^k}$ is in $S$ for any
integer $k$. In particular, for $k = p - 2$, we have $a(1 - t) \in S$. Hence, $\frac{a(1-t)}{1 - \frac{1}{t}} = -at \in S$ too, since
$\frac{1}{t}$ is also a quadratic non-residue. Thus, $att' = -(-at)t' \in S$ for any quadratic non-residues $t, t'$,
i.e. $ar \in S$ for any quadratic residue $r$.

It remains to get $ar \in S$ for quadratic non-residues $r$. For this, note that if we have a $b \in S$ such
that $\left(\frac{b}{p}\right) = -\left(\frac{a}{p}\right)$ then we can get $br$ for quadratic residues $r$ and this corresponds to $ar$ for
quadratic non-residues $r$. If there was no such $b$, since $\frac{a}{1-t} \in S$, we would have $\left(\frac{1-t}{p}\right) = 1$ for any
quadratic non-residue $t$, i.e. the set of quadratic residues would be $1$ minus the set of quadratic
non-residues. This is impossible since $1$ is never reached by $1 - t$. Thus, we have $f(x) \equiv 0$ for
any $x \not\equiv 0, a$.

Finally, if $p \ge 3$, by replacing $a$ by a $b \not\equiv 0, a$, we also get $f(x) \equiv 0$ for any $x \not\equiv 0, b$ and thus
for all $x \equiv a$. Similarly, by replacing $0$ by $b \not\equiv 0$, we get $f(x) \equiv 0$ for all $x \equiv 0$. If $p = 2$, by
translating $f$ (on the inside) if necessary, it suffices to show that $f(n)$ is even when $n$ is (to also
show that $f(n)$ is odd when $n$ is). Since $nf(n)$ is a square, we have $f(n) \equiv 0$ for $n \equiv 2 \pmod 4$.
Then, since $(n - 2)(f(n) - f(2))$ is a square, we get $f(n) \equiv 0 \pmod 4$ for $n \equiv 0 \pmod 4$.


Exercise 4.6.28†. Suppose that positive integers $a$ and $b$ are such that $2^a - 1 \mid 3^b - 1$. Prove that,
either $a = 1$, or $b$ is even.

Solution

Clearly, $a$ is odd since $3 \nmid 3^b - 1$. Suppose that $b$ is odd. Then, $3$ is a quadratic residue modulo
$3^b - 1$ so that every odd prime factor of $3^b - 1$ is congruent to $\pm 1 \pmod{12}$. Hence, the same goes for
any of its odd divisors. In particular, $2^a - 1 \equiv \pm 1 \pmod{12}$. Since $2^a - 1 \equiv 1 \pmod 3$, it must be congruent
to $1$ modulo $4$, which implies $a = 1$ as wanted.
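
A direct search over small exponents finds no counterexample, as expected. This is only a brute-force sketch; the function name `violations` and the search bounds are arbitrary choices made here.

```python
def violations(max_a, max_b):
    """Pairs (a, b) with 2^a - 1 | 3^b - 1, a > 1 and b odd (there should be none)."""
    return [
        (a, b)
        for a in range(2, max_a + 1)
        for b in range(1, max_b + 1, 2)
        if (3 ** b - 1) % (2 ** a - 1) == 0
    ]
```

Divisibility does occur for even $b$, for instance $2^3 - 1 = 7$ divides $3^6 - 1 = 728$.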


Sums and Products


Exercise 4.6.29† (Tuymaada 2012). Let $p$ be an odd prime. Prove that
$$\frac{1}{0^2 + 1} + \frac{1}{1^2 + 1} + \ldots + \frac{1}{(p - 1)^2 + 1} \equiv \frac{(-1)^{\frac{p+1}{2}}}{2} \pmod p$$
where the sum is taken over the $k$ for which $k^2 + 1 \not\equiv 0$.

Solution

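Note that we have the following partial fractions decomposition in $\mathbb{F}_{p^2}$:
$$\frac{1}{k^2 + 1} = \frac{1}{(k - i)(k + i)} = \frac{i}{2}\left(\frac{1}{k + i} - \frac{1}{k - i}\right)$$
where $i \in \mathbb{F}_{p^2}$ satisfies $i^2 = -1$. First, we treat the case $p \equiv 1 \pmod 4$, i.e. $i \in \mathbb{F}_p$. Then, the
sum is telescopic: $\frac{1}{k+i}$ cancels out with $\frac{1}{(k+2i)-i}$, except when $k$ or $k + 2i$ is $\pm i$. Thus, the only terms that
don’t cancel are $\frac{1}{-2i}$ and $-\frac{1}{2i}$, i.e. the sum is
$$\frac{i}{2} \cdot \left(-\frac{1}{i}\right) = \frac{-1}{2}$$
as wanted.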

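Now, suppose $p \equiv -1 \pmod 4$. By Exercise A.3.32† and Fermat’s little theorem, we have
$$\sum_{k \in \mathbb{F}_p} \frac{1}{X - k} = \frac{(X^p - X)'}{X^p - X} = -\frac{1}{X^p - X}.$$
Evaluating this at $i$, we get
$$\sum_{k \in \mathbb{F}_p} \frac{1}{i - k} = \frac{1}{2i}$$
since $i^p$ is the conjugate of $i$, i.e. $-i$. We finally conclude that
\begin{align*}
\sum_{k \in \mathbb{F}_p} \frac{1}{k^2 + 1} &= \frac{i}{2}\left(\sum_{k \in \mathbb{F}_p} \frac{1}{k + i} - \sum_{k \in \mathbb{F}_p} \frac{1}{k - i}\right) \\
&= \frac{i}{2}\left(\frac{1}{2i} - \frac{1}{-2i}\right) \\
&= \frac{1}{2}.
\end{align*}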


Remark 4.6.5
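This argument can be adapted to compute
$$\sum_{k \in \mathbb{F}_p} \frac{1}{f(k)},$$
where $f \in \mathbb{F}_p[X]$ is a monic polynomial which is irreducible over $\mathbb{F}_p$. Indeed, it must have distinct
roots since it is irreducible: $\gcd(f, f')$ must be $1$ or $f$, and the only way it could have a common
root with its derivative at the same time is if $f' = 0$, i.e. $f = \sum_i a_i X^{pi} = \left(\sum_i a_i X^i\right)^p$ which is
not irreducible. Thus,
$$\frac{1}{f} = \sum_{f(\alpha)=0} \frac{1}{f'(\alpha)(X - \alpha)}.$$
This is called the partial fractions decomposition of $\frac{1}{f}$ (see also Remark C.4.1). Indeed, it is
equivalent to
$$\sum_{f(\alpha)=0} \frac{f}{f'(\alpha)(X - \alpha)} = 1$$
which is true, since by evaluating at $\alpha$ we get $1$ as desired by Exercise 3.2.2∗, so this is a polynomial
of degree less than $\deg f$ taking $\deg f$ times the value $1$. (If this seems unmotivated, notice that there must
exist such a partial fractions decomposition since the polynomials $\frac{f}{X - \alpha}$ are linearly independent,
as can be seen by evaluating a linear combination at $\alpha$, and then we get the precise coefficients
by also evaluating at $\alpha$).

This implies that
\begin{align*}
\sum_{k \in \mathbb{F}_p} \frac{1}{f(k)} &= \sum_{f(\alpha)=0} \frac{1}{f'(\alpha)} \sum_{k \in \mathbb{F}_p} \frac{1}{k - \alpha} \\
&= \sum_{f(\alpha)=0} \frac{1}{f'(\alpha)(\alpha^p - \alpha)},
\end{align*}
where the sign comes from $\sum_{k \in \mathbb{F}_p} \frac{1}{\alpha - k} = -\frac{1}{\alpha^p - \alpha}$; for $f = X^2 + 1$ and $p \equiv -1 \pmod 4$ this gives $2 \cdot \frac{1}{2\alpha \cdot (-2\alpha)} = \frac{1}{2}$ again.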

Exercise 4.6.32†. Let $n \ge 1$ be an integer. Prove that, for any rational prime $p$,
$$\prod_{k=1}^{p-1} \Phi_n(k) \equiv \Phi_{n/\gcd(n,p-1)}(1)^{\frac{\varphi(n)}{\varphi(n/\gcd(n,p-1))}} \pmod p.$$

Solution

First, notice that when we replace $n$ by $np$ both sides of the equality are raised to the $p$th or
$(p - 1)$th power (depending on whether $p \mid n$ or not). Thus, we may assume without loss of
generality that $p \nmid n$. We have $\Phi_n = \prod_\omega (X - \omega)$ where the product is over the elements of order
$n$ of $\overline{\mathbb{F}}_p$. Thus,
\begin{align*}
\prod_{k \in \mathbb{F}_p^\times} \Phi_n(k) &= \prod_{k \in \mathbb{F}_p^\times} \prod_\omega (k - \omega) \\
&= \prod_\omega \prod_{k \in \mathbb{F}_p^\times} (\omega - k) \\
&= \prod_\omega (\omega^{p-1} - 1)
\end{align*}
since $X^{p-1} - 1 = \prod_{k \in \mathbb{F}_p^\times} (X - k)$ (we exchanged $k - \omega$ with $\omega - k$ since this multiplies everything by
$(-1)^{p-1} = 1$ when $p$ is odd and $-1 = 1$ when $p = 2$). To conclude, we claim that each primitive
$n/\gcd(n, p - 1)$th root is represented exactly $\frac{\varphi(n)}{\varphi(n/\gcd(n,p-1))}$ times by $\omega^{p-1}$, thus yielding the
wanted result (when $n > 2$ we can replace $\omega^{p-1} - 1$ by $1 - \omega^{p-1}$ since this multiplies the product
by $(-1)^{\varphi(n)} = 1$, and when $n \le 2$ the product is zero).

Let $m = n/\gcd(n, p - 1)$. Note that, if we fix an element of order $n$ and write the others as powers
of it, this becomes equivalent to the set of elements coprime with $n$ modulo $n$ restricting to $\frac{\varphi(n)}{\varphi(m)}$
copies of the set of elements coprime with $m$ modulo $m$. Note that the cardinalities agree:
$$\varphi(n) = \frac{\varphi(n)}{\varphi(m)} \cdot \varphi(m).$$
Thus, we only need to check that each element of $(\mathbb{Z}/m\mathbb{Z})^\times$ (the subset of $\mathbb{Z}/m\mathbb{Z}$ with invertible
elements) is reached as many times by evaluating elements of $(\mathbb{Z}/n\mathbb{Z})^\times$ modulo $m$. This is easy:
if $a_1 \equiv \ldots \equiv a_k \equiv a \pmod m$, then $ca_1 \equiv \ldots \equiv ca_k \equiv b \pmod m$ for any $b \in (\mathbb{Z}/m\mathbb{Z})^\times$, where
$c$ is any element of $(\mathbb{Z}/n\mathbb{Z})^\times$ congruent to $ba^{-1}$ modulo $m$.


Miscellaneous
Exercise 4.6.34† (Lucas’s Theorem). Let $p$ be a prime number and
$$n = p^m n_m + \ldots + p n_1 + n_0$$
and
$$k = p^m k_m + \ldots + p k_1 + k_0$$
be the base $p$ expansion of rational integers $k, n \ge 0$ ($n_i$ and $k_i$ can be zero). Prove that
$$\binom{n}{k} \equiv \prod_{i=0}^m \binom{n_i}{k_i} \pmod p.$$

Solution

We have
$$(X + 1)^n = \prod_{i=0}^m (X + 1)^{n_i p^i} \equiv \prod_{i=0}^m (X^{p^i} + 1)^{n_i}$$
and considering the coefficient of $X^k$ yields the desired result.
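
Lucas’s theorem translates into a short digit-by-digit computation, which can be compared against the true binomial coefficient. A minimal sketch, where `lucas_binom` is a name chosen for this illustration:

```python
from math import comb


def lucas_binom(n, k, p):
    """binom(n, k) modulo the prime p, as a product over base-p digits."""
    result = 1
    while n or k:
        # comb returns 0 when the digit of k exceeds the digit of n
        result = result * comb(n % p, k % p) % p
        n //= p
        k //= p
    return result
```

For example, $\binom{10}{3} = 120$ and in base $3$ we have $10 = (101)_3$, $3 = (010)_3$, so the middle factor is $\binom{0}{1} = 0$, matching $120 \equiv 0 \pmod 3$.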


Exercise 4.6.35† (Carmichael’s Theorem). Let $a, b$ be two coprime integers such that $a^2 - 4b > 0$,
and let $(u_n)_{n \ge 0}$ denote the linear recurrence defined by $u_0 = 0$, $u_1 = 1$, and
$$u_{n+2} = au_{n+1} - bu_n.$$
Prove that for $n \neq 1, 2, 6$, $u_n$ always has a primitive prime factor, except when $n = 12$ and $a = \pm 1$, $b = -1$
(corresponding to the Fibonacci sequence).7

Solution
Notice that $u_n = \frac{\alpha^n - \beta^n}{\alpha - \beta}$ where $\alpha$ and $\beta$ are the roots of $X^2 - aX + b$, which are real by assumption.
Thus, Carmichael’s theorem is an analogue of Zsigmondy’s theorem for real conjugate quadratic
integers. We proceed as in the case of $\mathbb{Z}$. Note that $\Phi_n(\alpha, \beta)$ is a rational integer since it is its
own conjugate. Suppose that $p$ is a non-primitive prime factor of $\Phi_n(\alpha, \beta)$. Since $p$ is not
primitive, $p$ also divides $\Phi_m(\alpha, \beta)$ for some $m < n$. Consider temporarily $\alpha$ and $\beta$ as elements of
$\mathbb{F}_{p^2}$. If one of them is zero, i.e. $p \mid b$, the other must be too since
$$\Phi_m(\alpha, \beta) = \alpha^{\varphi(m)} + \beta^{\varphi(m)} + \alpha\beta(\cdots).$$
This is impossible since $p \nmid a \equiv \alpha + \beta$ as $a$ and $b$ are coprime. Hence, $\alpha/\beta$ is well-defined and has

7 Although Schinzel [37] proved in 1993 that, for any non-zero algebraic integers $\alpha, \beta$ such that $\alpha/\beta$ is not a root of unity, $\alpha^n - \beta^n$ had a primitive prime ideal factor for all sufficiently large $n$, the quest for determining all exceptions in the conjugate quadratic case continued until 1999, where it was finally settled by Bilu, Hanrot and Voutier [5]. In particular, there is a primitive factor for any $n > 30$.

both order $m/p^{v_p(m)}$ and $n/p^{v_p(n)}$, which implies that $p \mid n$. Since $n/p^{v_p(n)} \mid p^2 - 1$, we conclude
that $p$ is either the greatest prime factor of $n$ or the second greatest, since if $p < q, r \mid n$, we
get $n/p^{v_p(n)} \ge qr > p^2$. This means that there are at most two non-primitive prime factors not
dividing $b$.

The second step of the proof is to bound the $p$-adic valuations of $\Phi_n(\alpha, \beta)$. When $p$ is odd, the
same proof as the usual LTE works since $p \mid \Phi_{n/p}(\alpha, \beta) \mid \alpha^{n/p} - \beta^{n/p}$ so we prove the same way
that
$$\frac{\alpha^n - \beta^n}{\alpha^{n/p} - \beta^{n/p}} \equiv p\alpha^n \pmod{p^2}.$$
When $p = 2$, things get trickier. Since $n/p^{v_p(n)} \mid p^2 - 1 = 3$, we need to consider the cases
$n = 2^k$ and $n = 3 \cdot 2^k$. In fact, we will only consider the cases $n = 4$ and $n = 12$ since
$\Phi_{2m \cdot 2^k}(\alpha, \beta) = \Phi_{2m}(\alpha^{2^k}, \beta^{2^k})$ and the coefficients of $(X - \alpha^{2^k})(X - \beta^{2^k})$ are also coprime.
Indeed, the coefficients for $\alpha^2$ and $\beta^2$ are $a^2 - 2b$ and $b^2$ and we conclude by induction. (This also
follows from ideal factorisation: if $\alpha$ and $\beta$ are coprime, so are $\alpha^{2^k}$ and $\beta^{2^k}$).

We first do the easy case, $n = 4$. We wish to show that $\alpha^2 + \beta^2$ cannot be a power of $2$. If we
write $\alpha = u + v\sqrt{d}$ and $\beta = u - v\sqrt{d}$ with positive $d$, we have
$$\Phi_4(\alpha, \beta) = \alpha^2 + \beta^2 = 2(u^2 + dv^2).$$
Since $a = 2u$ and $b = u^2 - dv^2$ are coprime, either $u, v \in \mathbb{Z}$ and $u^2 - dv^2$ is odd, so $u^2 + dv^2$ is
too and can’t be a power of $2$ (it’s greater than $1$), or $2u, 2v$ are odd integers and $u^2 + dv^2 = 2u^2 - b$
is a half integer so $2(u^2 + dv^2)$ is odd and can’t be a power of $2$ (it’s greater than $1$).

Now, we treat the case where $n = 12$. We write $\alpha, \beta = u \pm v\sqrt{d}$ again. We need to determine
when
$$\Phi_{12}(\alpha, \beta) = \alpha^4 - (\alpha\beta)^2 + \beta^4 = u^4 + 14u^2v^2d + v^4d^2$$
is a power of $2$ or $3$ times a power of $2$. Since $a = 2u$ and $b = u^2 - dv^2$ are coprime, if $u, v \in \mathbb{Z}$
then $u^2 - dv^2$ is odd, but then so is $u^4 + 14u^2v^2d + v^4d^2$ so it can’t be a power of $2$. Hence, let
$u = r/2$, $v = s/2$ with odd $r, s$, so that $16\Phi_{12}(\alpha, \beta) = r^4 + 14r^2s^2d + s^4d^2$. Then, an absolutely miraculous (computer) computation shows
that $r^4 + 14r^2s^2d + s^4d^2$ can never be divisible by $64$ for odd $r, s, d$. Since $d \equiv 1 \pmod 4$, it is
at least $1 + 5 \cdot 14 + 5^2 = 96$ so must be exactly $96$ since this is the only integer of the form
$2^k$ or $3 \cdot 2^k$ which is not divisible by $64$ and greater than $70$. This yields $|r| = |s| = 1$ and $d = 5$, i.e.
$\{\alpha, \beta\} = \pm\left\{\frac{1+\sqrt{5}}{2}, \frac{1-\sqrt{5}}{2}\right\}$, corresponding to $a = \pm 1$, $b = -1$ as desired.

To conclude, if $\Phi_n(\alpha, \beta)$ doesn’t have a primitive prime factor and $n$ is not a power of $2$ or $3$
times a power of $2$ (we have already covered these cases), then $\Phi_n(\alpha, \beta) = p$ for some prime $p \mid n$
or $\Phi_n(\alpha, \beta) = pq$ for some distinct primes $p, q \mid n$. By Proposition 3.4.1, we have
$$pq \ge \Phi_n(\alpha, \beta) > |\alpha - \beta|^{\varphi(n)}$$
(set $q = 1$ if $\Phi_n(\alpha, \beta)$ is prime). Since $\varphi(p) = p - 1 > p/2$ and $\varphi(pq) = (p - 1)(q - 1) \ge pq/2$, we
conclude that
$$pq > |\alpha - \beta|^{pq/2},$$
i.e. $C^N < N^2$ where $C = |\alpha - \beta|$ and $N = pq$. However, it is easy to see that, when $C > 2.2$,
$C^N > N^2$ for any positive integer $N$. Indeed, this is true for $N \le 4$, and for $N \ge 5$ we have
$$2^N = (1 + 1)^N \ge 2\binom{N}{2} + \binom{N}{1} = N^2.$$
Hence, we must have $|\alpha - \beta| \le 2.2$. Now, notice that $|\alpha - \beta| = \sqrt{a^2 - 4b}$ so that $0 \le a^2 - 4b \le
2.2^2 < 5$. Since $a^2 - 4b$ is congruent to $0$ or $1$ modulo $4$, it must be equal to $0$, $1$, or $4$. In all
these cases $\alpha$ and $\beta$ are rational integers and we have already proven the Zsigmondy theorem for
rational integers so we are done.
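
The exceptional indices are easy to observe experimentally. The sketch below takes "primitive prime factor of $u_n$" to mean a prime dividing $u_n$ but no $u_m$ with $1 \le m < n$; the function names are hypothetical, and trial division restricts it to small $n$.

```python
def prime_factors(n):
    """Set of prime factors of the non-negative integer n by trial division."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def indices_without_primitive_factor(a, b, max_n):
    """Indices 1 <= n <= max_n for which u_n has no primitive prime factor."""
    u_prev, u_cur = 0, 1  # u_0 and u_1
    seen, missing = set(), []
    for n in range(1, max_n + 1):
        primes = prime_factors(abs(u_cur))
        if not primes - seen:
            missing.append(n)
        seen |= primes
        u_prev, u_cur = u_cur, a * u_cur - b * u_prev
    return missing
```

With $a = 1$, $b = -1$ (Fibonacci) the missing indices up to $30$ are $1, 2, 6, 12$, and with $a = 3$, $b = 2$ (so that $u_n = 2^n - 1$) they are $1$ and $6$, the classical Zsigmondy exception.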


Exercise 4.6.36† . Suppose p ≡ 2 or p ≡ 5 (mod 9) is a rational prime. Prove that the equation

α3 + β 3 + εaγ 3 = 0

where $\varepsilon \in \mathbb{Z}[j]$ is a unit and $2 \neq a \in \{p, p^2\}$ does not have solutions in $\mathbb{Z}[j]$.

Solution

Note that $p \equiv 2 \pmod 3$ so $p$ is prime in $\mathbb{Z}[j]$. Suppose that there is a solution $\alpha, \beta, \gamma \neq 0$, and
pick one which minimises $|N(\alpha\beta\gamma)|$. In particular, $\alpha, \beta, \gamma$ are coprime. Rewrite the equation as
$$(\alpha + \beta)(\alpha + j\beta)(\alpha + j^2\beta) = -\varepsilon a\gamma^3.$$
Consider the numbers $x = \alpha + \beta$, $y = j\alpha + j^2\beta$ and $z = j^2\alpha + j\beta$. By assumption, $xyz = -\varepsilon a\gamma^3$.
Since $p^3$ does not divide the RHS, exactly one of $x, y, z$ is divisible by $a$ and the other ones
are not divisible by $p$, by unique factorisation. By replacing $\alpha$ and $\beta$ by $j^k\alpha$ and $j^k\beta$ for some
$k$ if necessary, suppose without loss of generality that it is $z$. Let $d$ be the gcd of $x, y, z$ and
write
\begin{align*}
x/d &= \varepsilon_1 u^3 \\
y/d &= \varepsilon_2 v^3 \\
z/d &= \varepsilon_3 aw^3
\end{align*}
by unique factorisation. Since $x + y + z = 0$, we have
$$\varepsilon_1 u^3 + \varepsilon_2 v^3 + \varepsilon_3 aw^3 = 0,$$
i.e.
$$u^3 + \mu v^3 + \eta aw^3 = 0$$
for some units $\mu, \eta$. Suppose for a moment that we manage to prove $\mu = \pm 1$. Then, we get the
smaller solution
$$u^3 + (\pm v)^3 + \eta aw^3 = 0$$
for non-zero $u, v, w$. This implies, by assumption, that
\begin{align*}
|N(\alpha\beta\gamma)|^3 &\ge |N(uvw)|^3 \\
&= \left|N\left(\frac{xyz}{d^3 a}\right)\right| \\
&= \left|N\left(\frac{\gamma}{d}\right)\right|^3
\end{align*}
which implies that $|N(d\alpha\beta)| \le 1$: $\alpha$ and $\beta$ are units. This yields the equation $\pm 1 \pm 1 + \varepsilon a\gamma^3 = 0$,
which is clearly impossible since $a \nmid \pm 1 \pm 1$ as $a > 2$.

It remains to prove that µ = ±1 from the equation u3 + µv 3 + ηaw3 = 0. What else can we do to
prove this apart from considering the equation modulo p? This gives us that µ is congruent to a
cube modulo p. You might be wondering what the link between this problem and the theory of
finite fields is. Here is the answer: since $p \equiv 2 \pmod 3$ is prime, $\mathbb{Z}[j]/p\mathbb{Z}[j] \simeq \mathbb{F}_{p^2}$ is a field with
$p^2$ elements. Since $p \not\equiv -1 \pmod 9$, there is no primitive ninth root of unity as $9 \nmid p^2 - 1$. Hence,
if µ were a primitive cube root of unity, it could not be a cube modulo p since a cube root of µ
would be a primitive ninth root of unity modulo p. This implies that µ = ±1 as wanted.


Exercise 4.6.37† (Class Equation of a Group Action and Wedderburn’s Theorem). Let $G$ be a finite
group, $S$ a finite set, and $\cdot$ a group action of $G$ on $S$.8 Given an element $s \in S$, let $\operatorname{Stab}(s)$ and $\operatorname{Fix}(G)$
denote the set of elements of $G$ fixing $s$ and the elements of $S$ fixed by all of $G$ respectively. Finally,
let $O_i = Gs_i$ be the (disjoint) orbits of elements of $S$. Prove the class equation:
$$|S| = |\operatorname{Fix}(G)| + \sum_{|O_i| > 1} \frac{|G|}{|\operatorname{Stab}(s_i)|}.$$
By applying this to the conjugation action ($S = G$ and $g \cdot h = ghg^{-1}$), deduce Wedderburn’s theorem:
any finite skew field is a field.

8 In other words, a map $\cdot : G \times S \to S$ such that $e \cdot s = s$ and $(gh) \cdot s = g \cdot (h \cdot s)$ for any $g, h \in G$ and $s \in S$. See also Exercise A.3.21†.

Solution

For the first part, notice that an orbit $O = Gs$ has size $1$ if and only if $s$ is fixed by all of $G$, i.e.
is in $\operatorname{Fix}(G)$, and that
$$|O| = |Gs| = |G/\operatorname{Stab}(s)| = \frac{|G|}{|\operatorname{Stab}(s)|}$$
for any $s$. Indeed, the map $G/\operatorname{Stab}(s) \to Gs$ sending $h\operatorname{Stab}(s)$ to the only element $hs$ of
$h\operatorname{Stab}(s)s$ is a bijection: if $gs = hs$ then $h^{-1}gs = s$ so $h^{-1}g \in \operatorname{Stab}(s)$, i.e. $g\operatorname{Stab}(s) = h\operatorname{Stab}(s)$.
Thus, the class formula becomes
$$|S| = \sum_{|O_i| = 1} 1 + \sum_{|O_i| > 1} |O_i|$$
which is obviously true. (See Exercise A.3.15† for the definition of the group quotient $G/H$.)

We now consider the second part. We consider a finite skew field $F$ as a multiplicative group
once we remove its zero element, and our goal is to prove that it is abelian. Hence, we define its
center $Z = Z(F^\times)$ as the group of non-zero elements which commute with every other element.
Note that $Z \cup \{0\}$ is a finite field, say of cardinality $q$. Then, $F$ is naturally a vector space over
$Z \cup \{0\}$, hence of cardinality $q^n$ for some $n$.

The class equation for the action of $F^\times$ on itself defined by the conjugation $g \cdot s := gsg^{-1}$ is
$$|F^\times| = |Z| + \sum_{i=1}^r \frac{|F^\times|}{|C(x_i)|}$$
where $C(x)$ is the centraliser of $x$, i.e. the group of elements which commute with $x$. Indeed,
$\operatorname{Fix}(F^\times)$ is the set of elements $x$ such that $gxg^{-1} = x$ for any $g$, i.e. $x$ commutes with every $g$,
while $\operatorname{Stab}(x)$ is the set of $g$ such that $gxg^{-1} = x$, i.e. elements which commute with $x$. Here
is the key point: for any $x \in F$, $C(x) \cup \{0\}$ is a vector space over $Z \cup \{0\}$ as well since $Z$ commutes
with everything so $ZC(x) = C(x)$. This implies that its cardinality is a power of $q$ too, say
$|C(x_i)| = q^{n_i} - 1$ for some $n_i \mid n$ since $q^{n_i} - 1 \mid |F^\times| = q^n - 1$. Hence, our equation becomes
$$q^n - 1 = q - 1 + \sum_{i=1}^r \frac{q^n - 1}{q^{n_i} - 1}.$$
Since the sum is taken over the orbits of size greater than $1$, we have $n_i < n$ so, modulo $\Phi_n(q)$,
this becomes
$$0 \equiv q - 1 + 0.$$
In other words, $\Phi_n(q)$ divides $q - 1$. Since $\Phi_n(q) \ge (q - 1)^{\varphi(n)}$ with equality iff $n = 1$, we conclude
that $n = 1$, i.e. $F^\times = Z$ is commutative!
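
The class equation itself can be checked on a concrete example. The sketch below, which is only an illustration with made-up function names, lets the symmetric group on $n$ letters act on itself by conjugation and computes both sides.

```python
from itertools import permutations


def compose(g, h):
    """Composition of permutations given as tuples: (g o h)(i) = g[h[i]]."""
    return tuple(g[i] for i in h)


def inverse(g):
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)


def class_equation_sides(n):
    """Return (|S|, |Fix(G)| + sum of |G|/|Stab(s_i)|) for S_n acting on itself."""
    G = list(permutations(range(n)))

    def conj(g, s):
        return compose(compose(g, s), inverse(g))

    fixed = [s for s in G if all(conj(g, s) == s for g in G)]
    orbits = {frozenset(conj(g, s) for g in G) for s in G}
    rhs = len(fixed)
    for orbit in orbits:
        if len(orbit) > 1:
            s = next(iter(orbit))  # any representative of the orbit
            stabiliser = [g for g in G if conj(g, s) == s]
            rhs += len(G) // len(stabiliser)
    return len(G), rhs
```

For $n = 4$ the right-hand side is $1 + 6 + 3 + 8 + 6 = 24$, the sizes of the conjugacy classes of the symmetric group on four letters.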


Exercise 4.6.38† (Finite Projective Planes). We say a pair $(\Pi, \Lambda)$ of sets of cardinality at least $2$,
together with a relation of "a point lying on a line", where $\Pi$ is a set of "points" and $\Lambda$ a set of "lines",
is a projective plane if
1. any two distinct points lie on exactly one line, and
2. any two distinct lines meet in exactly one point.
Prove that, for any finite projective plane, there exists an integer $n$, called the order of the plane, such
that any line contains exactly $n + 1$ points and any point lies on exactly $n + 1$ lines. With this setting,
prove also that there are $N = n^2 + n + 1$ points and $N$ lines. Finally, prove that, for any prime power
$q \neq 1$, there is a projective plane of order $q$.9

Solution

By symmetry, it suffices to show that there are exactly n2 +n+1 lines. Pick any line λ, containing
the points π1 , . . . , πn+1 . We know that, for any 1 ≤ i ≤ n + 1, there are exactly n lines other
than λ going through πi . Thus, there are n(n + 1) lines (other than λ) intersecting λ, and we
know that this is one less than the total number of lines.

For the second part, we construct more generally the projective plane $\mathbb{P}^2$ over any field $K$: points
correspond to the one-dimensional subspaces of $K^3$, and lines to the two-dimensional subspaces
of $K^3$. A line contains a point if the latter is a subspace of the former. Then, there is precisely
one line going through any two points $\pi = xK$ and $\tau = yK$: it is $\lambda = xK + yK$. Similarly, any
two lines $\lambda$ and $\ell$ intersect at exactly one point $\pi = \lambda \cap \ell$ (for instance by Grassmann’s formula
from Exercise C.5.1†).

We are interested in the case where $K = \mathbb{F}_q$ is finite. Then, there are precisely $q + 1$ points on
each line: a two-dimensional vector space $V$ over $\mathbb{F}_q$ has precisely $q + 1$ one-dimensional subspaces
as such a subspace has the form $x\mathbb{F}_q$ for some $0 \neq x \in V$, and there are $q^2 - 1$ possible choices
of $x$, and each one-dimensional subspace corresponds to precisely $q - 1$ possible $x$.

Finally, we show that there are precisely $q + 1$ lines passing through each point. If $\pi = x\mathbb{F}_q$ is
a point, a line through $\pi$ has the form $x\mathbb{F}_q + y\mathbb{F}_q$ for some $y \notin x\mathbb{F}_q$. There are $q^3 - q$ possible
choices of $y$, and each line $\ell$ corresponds to $q^2 - q$ possible $y$ (the elements of $\ell \setminus x\mathbb{F}_q$).
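
The construction can be carried out explicitly for a prime field. The following sketch makes two assumptions beyond the text: it only handles prime $q = p$ (arithmetic modulo $p$), and it encodes a two-dimensional subspace as the kernel of a non-zero linear form, so that lines are also represented by normalised vectors. All names are invented for this example.

```python
from itertools import product


def normalised(v, p):
    """Scale a non-zero vector of F_p^3 so that its first non-zero entry is 1."""
    pivot = next(x for x in v if x % p)
    inv = pow(pivot, -1, p)
    return tuple(x * inv % p for x in v)


def projective_plane(p):
    """Points and lines of the projective plane over F_p, with the incidence test."""
    vectors = [v for v in product(range(p), repeat=3) if any(v)]
    points = sorted({normalised(v, p) for v in vectors})
    lines = points  # a line is the kernel of a linear form, taken up to scaling

    def incident(point, line):
        return sum(a * b for a, b in zip(point, line)) % p == 0

    return points, lines, incident


def check_plane(p):
    points, lines, incident = projective_plane(p)
    right_count = len(points) == len(lines) == p * p + p + 1
    line_sizes = all(sum(incident(P, L) for P in points) == p + 1 for L in lines)
    unique_line = all(
        sum(incident(P, L) and incident(Q, L) for L in lines) == 1
        for P in points for Q in points if P != Q
    )
    return right_count and line_sizes and unique_line
```

For $p = 2$ this produces the Fano plane, with $7$ points, $7$ lines and $3$ points on each line.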


9 These have been conjectured to be the only such integers, but this remains unproven so far. See ?? for a necessary condition for there to be a projective plane of order $n$ (which excludes in particular $n = 6$).
Chapter 5

Polynomial Number Theory

5.1 Factorisation of Polynomials


Exercise 5.1.1∗. Prove that the content is well-defined: $c(Nf)/|N| = c(Mf)/|M|$ for any non-zero
$M, N \in \mathbb{Z}$ such that $Nf, Mf \in \mathbb{Z}[X]$.

Solution

Assume without loss of generality that $M$ and $N$ are positive. The polynomial $rf$ has integer coefficients if and
only if $r$ is divisible by the lcm of the denominators of the coefficients of $f$. Thus it suffices to
prove the result for polynomials with integer coefficients. Indeed, if we write $N = mN'$ and $M = mM'$ where $m$ is the
lcm of the denominators of the coefficients, we have

$$Mc(Nf) = Nc(Mf) \iff M'c(N'g) = N'c(M'g)$$

where $g = mf$ has integer coefficients. This follows from the fact that $c(rg) = rc(g)$ for any
positive $r \in \mathbb{Z}$ (when you multiply all coefficients by $r$, the gcd also gets multiplied by $r$).


Exercise 5.1.2∗ . Suppose f ∈ Q[X] has integral content. Prove that f has integer coefficients.

Solution

Write $f = g/N$ with $g \in \mathbb{Z}[X]$ and $0 \neq N \in \mathbb{Z}$. The content of $f$ is $c(g)/|N|$. This is an integer
iff $N$ divides $c(g)$, i.e. $N$ divides all coefficients of $g$, which is equivalent to $f = g/N$ having
integer coefficients.


Exercise 5.1.3∗ . Prove Proposition 5.1.2.

Solution

Write g = f ∗ h. Then c(h) = c(g) ∈ Z. Thus h ∈ Z[X].




Exercise 5.1.4∗ . Prove Corollary 5.1.2.


Solution

Consider the factorisation of $f$ into irreducible polynomials in $\mathbb{Q}[X]$: $f = af_1 \cdot \ldots \cdot f_k$. The
factorisation of $f$ in $\mathbb{Z}[X]$ is then given as follows: replace each $f_i$ by its primitive part $f_i^* =
f_i/c(f_i) \in \mathbb{Z}[X]$. The multiplicative constant is then $c(f) \in \mathbb{Z}$ by Gauss’s lemma. The uniqueness
of the factorisation in $\mathbb{Q}[X]$ shows that this factorisation in $\mathbb{Z}[X]$ is unique too (each primitive
irreducible factor must be a constant times an irreducible factor occurring in the factorisation in
$\mathbb{Q}[X]$).


Exercise 5.1.5. Prove that $\Phi_{p^n}$ is irreducible with Eisenstein’s criterion.

Solution

We have
$$\Phi_{p^n} = \frac{X^{p^n} - 1}{X^{p^{n-1}} - 1} \equiv (X - 1)^{p^n - p^{n-1}} \pmod{p}$$
by Proposition 3.1.1 and Frobenius. Thus, $\Phi_{p^n}(X + 1)$ has all its coefficients divisible by $p$ except
the leading one. Moreover, its constant coefficient is $\Phi_{p^n}(1) = p$ (by expanding the division), which is not divisible by $p^2$.
Hence, by Eisenstein’s criterion, $\Phi_{p^n}(X + 1)$ is irreducible and thus $\Phi_{p^n}$ too.
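
As a sanity check of this argument, the following Python sketch expands $\Phi_{p^n}(X + 1)$ from the closed form $\Phi_{p^n} = \sum_{j=0}^{p-1} X^{jp^{n-1}}$ (the division above written out) and tests the three Eisenstein conditions. The function names are ours and purely illustrative.

```python
from math import comb


def shifted_prime_power_cyclotomic(p, n):
    """Coefficients (constant term first) of Phi_{p^n}(X + 1)."""
    step = p ** (n - 1)
    degree = (p - 1) * step
    coeffs = [0] * (degree + 1)
    # Phi_{p^n}(X) = sum_{j=0}^{p-1} X^(j * p^(n-1)); expand each (X + 1)^e.
    for j in range(p):
        e = j * step
        for i in range(e + 1):
            coeffs[i] += comb(e, i)
    return coeffs


def is_eisenstein(coeffs, p):
    """Check Eisenstein's conditions at p for a constant-first coefficient list."""
    *lower, lead = coeffs
    return (
        lead % p != 0
        and all(c % p == 0 for c in lower)
        and coeffs[0] % (p * p) != 0
    )


print(shifted_prime_power_cyclotomic(3, 2))  # coefficients of Phi_9(X + 1)
print(is_eisenstein(shifted_prime_power_cyclotomic(3, 2), 3))
```

For $p = 3$ and $n = 2$ the constant coefficient is 3, in agreement with $\Phi_{p^n}(1) = p$.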


5.2 Prime Divisors of Polynomials


Exercise 5.2.1∗ . Why does CRT imply that there is a value reached $2^m$ times modulo $p_1 \cdot \ldots \cdot p_m$?

Solution

For each $p_i$, there is a value which is reached twice: $N_i = f(a_i) \equiv f(b_i) \pmod{p_i}$ with $a_i \not\equiv b_i$. Thus, the value
congruent to $N_i$ modulo $p_i$ for $i = 1, \ldots, m$ is reached at $x$ as long as $x$ is congruent to $a_i$ or $b_i$ modulo $p_i$ for
each $i$. There are two choices for each $p_i$, so $2^m$ possible systems of congruences, and by CRT
all of these systems have a solution modulo $p_1 \cdot \ldots \cdot p_m$.
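
The count can be illustrated on a small case. The sketch below is only an example under our own choices: it takes $f = X^2$, the primes 3, 5, 7 and the pairs $a_i = 1$, $b_i = -1$ (so that $f(a_i) \equiv f(b_i) \equiv 1$), and glues the $2^3$ systems of congruences with a hand-written CRT helper.

```python
from itertools import product


def crt(residues, moduli):
    """Return the x modulo prod(moduli) with x = residues[i] (mod moduli[i])."""
    x, modulus = 0, 1
    for r, p in zip(residues, moduli):
        # Solve x + modulus * t = r (mod p) for t; moduli are pairwise coprime.
        t = ((r - x) * pow(modulus, -1, p)) % p
        x += modulus * t
        modulus *= p
    return x % modulus


def glued_preimages(pairs, moduli):
    """All x modulo prod(moduli) with x = a_i or b_i (mod p_i) for every i."""
    return sorted({crt(choice, moduli) for choice in product(*pairs)})


moduli = [3, 5, 7]
pairs = [(1, p - 1) for p in moduli]  # 1^2 = (-1)^2 modulo every p
xs = glued_preimages(pairs, moduli)
print(len(xs), all(x * x % 105 == 1 for x in xs))
```

Here the value 1 is reached $2^3 = 8$ times modulo $3 \cdot 5 \cdot 7 = 105$.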


Exercise 5.2.2. Prove that X − v works iff 0 ≤ v ≥ −2022, and −X + v works iff 0 ≤ v ≤ 2022.

Solution

Without loss of generality, suppose $f = X - v$ by multiplying $f$ by $-1$ if necessary. Note that,
for $0 < a, b \leq n$, we have
$$\big||f(a)| - |f(b)|\big| \leq |f(a) - f(b)| < n$$
so the only pairs which work are those for which $|f(a)| = |f(b)|$. Since $a < b$, this means that
$a < v$ and $b > v$ (they must lie in different affine parts). Thus we have $v - a = b - v$, i.e.
$b = 2v - a$. This is indeed greater than $2a - a = a$ and less than $n$ for sufficiently large $n$. Thus,
for sufficiently large $n$, there are exactly $v - 1$ solutions and for smaller $n$ there may be fewer. This
means that our solutions are those for which $v - 1 \leq 2021$, i.e. $v \leq 2022$ as wanted.


5.3 Hensel’s Lemma


Exercise 5.3.1. Give a direct proof of Taylor’s formula by expanding the RHS.

Solution

It suffices to prove this when f = X n , as it will then be true for any linear combination of these
polynomials, i.e. for any polynomial.

Notice that
$$(X^n)^{(k)} = n(n - 1) \cdot \ldots \cdot (n - (k - 1))X^{n-k}$$
so that
$$\frac{f^{(k)}}{k!} = \binom{n}{k}X^{n-k}.$$
Finally,
$$\sum_k h^k \cdot \frac{f^{(k)}}{k!} = \sum_k \binom{n}{k}h^kX^{n-k} = (X + h)^n$$
as wanted.


Exercise 5.3.2∗ . Let $p$ be an odd prime and $a$ a quadratic residue modulo $p$. Prove that $a$ is a
quadratic residue modulo $p^n$, i.e. a square modulo $p^n$ (coprime with $p^n$), for any positive integer $n$.

Solution

We apply Hensel’s lemma to the polynomial $X^2 - a$. It has a root $\alpha$ modulo $p$ by assumption, and the
derivative $2\alpha$ is not divisible by $p$ since $p$ is odd and $p \nmid a$.
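
The lifting can be made explicit. The following sketch is one possible implementation, not the only one: it finds a root of $X^2 - a$ modulo $p$ by brute force and then applies the Newton step $x \mapsto x - (x^2 - a)/(2x)$ once per extra power of $p$. The name `lift_sqrt` is ours, and the modular inverse via `pow(..., -1, ...)` assumes Python 3.8 or later.

```python
def lift_sqrt(a, p, n):
    """Return x with x*x = a (mod p**n), for an odd prime p and a non-zero square a mod p."""
    # Root modulo p, found by brute force (raises StopIteration if a is not a residue).
    x = next(r for r in range(1, p) if (r * r - a) % p == 0)
    modulus = p
    for _ in range(n - 1):
        modulus *= p
        # Hensel/Newton step for f = X^2 - a: the derivative 2x is invertible modulo p.
        inverse = pow(2 * x, -1, modulus)
        x = (x - (x * x - a) * inverse) % modulus
    return x


x = lift_sqrt(2, 7, 5)
print(x, (x * x - 2) % 7**5)
```

The invertibility of $2x$ is exactly the condition that $p$ is odd and $p \nmid a$ used in the solution.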


Exercise 5.3.3∗ . Prove that an odd rational integer $a \in \mathbb{Z}$ is a quadratic residue modulo $2^n$ for $n \geq 3$
if and only if $a \equiv 1 \pmod 8$.

Solution

Since the only odd square modulo 8 is 1, we must have $a \equiv 1 \pmod 8$. For the converse, we
consider the polynomial
$$\frac{(2Y + 1)^2 - a}{4} = Y^2 + Y - \frac{a - 1}{4}.$$
It has a root modulo 2 (e.g. 0) since $\frac{a-1}{4}$ is even, and its derivative $2Y + 1$ is congruent to 1 modulo 2, which is indeed non-zero,
so we can use Hensel’s lemma.
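
For small exponents the statement can also be confirmed by brute force. The snippet below is only a numerical check (the helper name is ours): it compares the set of odd squares modulo $2^n$ with the set of residues congruent to 1 modulo 8.

```python
def odd_squares_mod_power_of_two(n):
    """Set of residues x*x mod 2**n for odd x."""
    modulus = 2**n
    return {x * x % modulus for x in range(1, modulus, 2)}


for n in range(3, 11):
    expected = {a for a in range(1, 2**n, 2) if a % 8 == 1}
    print(n, odd_squares_mod_power_of_two(n) == expected)
```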


5.4 Bézout’s Lemma


Exercise 5.4.1∗ . Prove Corollary 5.4.1.

Solution

Just multiply u and v in Proposition 5.4.1 by the lcm N of the denominators of their coefficients
to get something in Z[X].


5.5 Exercises
Algebraic Results
Exercise 5.5.1† . Suppose $f, g \in \mathbb{Z}[X]$ are polynomials such that $f(n) \mid g(n)$ for infinitely many
rational integers $n \in \mathbb{Z}$. Prove that $f \mid g$. In addition, generalise the previous statement to $f, g \in
\mathbb{Z}[X_1, \ldots, X_m]$ such that $f(x) \mid g(x)$ for $x \in S_1 \times \ldots \times S_m$, where $S_1, \ldots, S_m \subseteq \mathbb{Z}$ are infinite sets.

Solution

We have seen in Corollary 5.4.2 that this holds when the assumption $f(n) \mid g(n)$ is true for
sufficiently large $n$ (divide $f$ and $g$ by a primitive irreducible factor of $f$ and repeat this process).
To prove this stronger result, we will use a different method, a completely analytical one. Let
$g = qf + r$ be the Euclidean division of $g$ by $f$, and let $N \in \mathbb{Z}$ be a non-zero integer such that
$Nq, Nr \in \mathbb{Z}[X]$. Then,
$$f(n) \mid Ng(n) - Nq(n)f(n) = Nr(n)$$
for infinitely many integers $n$. However, $\lim_{|n| \to \infty} \frac{Nr(n)}{f(n)} = 0$ since $\deg r < \deg f$. Since this quotient is a rational
integer for infinitely many $n$, it must be zero for all such $n$ which are sufficiently large. Hence $r$ has infinitely many roots, which implies that $r = 0$, i.e. $f \mid g$.

For the second part, we proceed by induction on $m$ (we have just done the case $m = 1$). If we fix
$x_m$, we get that $f(X_1, \ldots, X_{m-1}, x_m) \mid g(X_1, \ldots, X_{m-1}, x_m)$ for any $x_m \in S_m$. Suppose that some
irreducible factor $\pi$ of $f$ doesn’t divide $g$ (if all of them do we can divide $f$ and $g$ by them and
repeat the argument, as we outlined in the first part). Note that this makes sense as $\mathbb{Z}[X_1, \ldots, X_m]$
is a UFD by Proposition 5.1.3. We shall use Bézout’s lemma in $\mathbb{Q}(X_1, \ldots, X_{m-1})[X_m]$ to get
two polynomials $u, v \in \mathbb{Z}[X_1, \ldots, X_m]$ such that

$$0 \neq u\pi + vg = h \in \mathbb{Z}[X_1, \ldots, X_{m-1}]$$

(by clearing denominators in the Bézout relation). We know that this $h$ is divisible by
$\pi(X_1, \ldots, X_{m-1}, n)$ for infinitely many $n$. Since $h$ has a finite number of divisors in
$\mathbb{Z}[X_1, \ldots, X_{m-1}]$, we get $\pi(X_1, \ldots, X_{m-1}, n) = d$ for a fixed $d$ and infinitely many $n$. Thus,
the polynomial $\pi(X_1, \ldots, X_{m-1}, X) - d$ has infinitely many roots so is identically zero, i.e.
$\pi(X_1, \ldots, X_{m-1}, X_m)$ is constant in $X_m$. In that case, we can proceed by induction on $\deg_{X_m} g$:
we have $\pi \mid g(k)$ for some $k$ so we get
$$\pi \mid \frac{g(X_1, \ldots, X_{m-1}, n) - g(k)}{n}$$
for any $n \in \mathbb{Z}$, and this has a smaller degree in $n$.


Exercise 5.5.2† . Let f ∈ Q[X] be a polynomial. Suppose that f always takes values which are mth
powers in Q. Prove that f is the mth power of a polynomial with rational coefficients. More generally,
find all polynomials f ∈ Q[X1 , . . . , Xm ] such that f (x1 , . . . , xm ) is a (non-trivial) perfect power for
any (x1 , . . . , xm ) ∈ Zm .

Solution
Without loss of generality, suppose $f \in \mathbb{Z}[X]$. Let $f = a\prod_{i=1}^k \pi_i^{r_i}$ be the factorisation of $f$ in
primitive irreducible polynomials. As explained in Remark 5.4.1, for any $i$, we can find an $n$ and
a prime $p$ such that $v_p(f(n)) = r_i$. Thus, this implies $m \mid r_i$ for all $i$, and clearly $a$ must also be
an $m$th power then: $f$ is an $m$th power as wanted. For the general case where $f(n)$ is just always
a perfect power, we can pick distinct primes $p_i$ and integers $n_i$ such that $v_{p_i}(f(n_i)) = r_i$. Then,
if we pick an integer $n \equiv n_i \pmod{p_i^{r_i+1}}$ for every $i$, we get $v_{p_i}(f(n)) = r_i$ for every $i$, which means that the $r_i$ must all
have a non-trivial common divisor. In other words, $f$ is a constant times a perfect power, and it
suffices to look at $v_p(f(n))$ for $p$ dividing the constant to see that it is in fact a perfect power.

Now, we deduce the general case from the one variable case. Suppose again that f ∈ Z[X]. Let
π be a primitive irreducible factor of f . We shall find an arbitrarily large prime p such that
vp (f (x)) = vπ (f ) for some x ∈ Zm if f is non-constant. Then, using CRT, we can find primes
pπ for each primitive irreducible factor of f and an element y ∈ Zm such that vpπ (f (y)) = vπ (f )
for each π. Since f (y) is a perfect power by assumption, we get that all vπ have a non-trivial
common divisor, which means that f is a constant times a perfect power, and it is again easy
to see that it is in fact a perfect power. Let $\pi$ be a primitive irreducible factor of $f$. Suppose
without loss of generality that it is non-constant in $X_m$, and let $\pi'$ denote its derivative with
respect to $X_m$. Use Bézout’s lemma as in Exercise 5.5.1† to get $u, v \in \mathbb{Z}[X_1, \ldots, X_m]$ such
that
$$0 \neq u\pi + v\pi' = h \in \mathbb{Z}[X_1, \ldots, X_{m-1}].$$
Also, for every other primitive irreducible factor $\tau$ of $f$, consider a Bézout relation

$$0 \neq u_\tau\pi + v_\tau\tau = h_\tau \in \mathbb{Z}[X_1, \ldots, X_{m-1}].$$

Now, choose $x \in \mathbb{Z}^{m-1}$ such that $\pi(x, X)$ is non-constant, and such that $h(x)$ and $h_\tau(x)$ are
non-zero for all $\pi \neq \tau \mid f$. This is possible by e.g. Exercise A.1.7∗ used on the product of the
leading coefficient of $f$, viewed as a polynomial in $X_m$, with $h$ and all $h_\tau$. Then, pick a large prime $p$ and
an integer $n$ such that $p \mid \pi(x, n)$; there exists one by Theorem 5.2.1. When $p$ is sufficiently large,
by our Bézout relations, $p \nmid \tau(x, n)$ for $\pi \neq \tau \mid f$. Thus, $v_p(f(x, n)) = v_\pi(f)v_p(\pi(x, n))$. Now, if
$p^2 \mid \pi(x, n)$, then $p \nmid \pi'(x, n)$ by the first Bézout relation, so

$$p^2 \nmid \pi(x, n + p) \equiv \pi(x, n) + p\pi'(x, n) \pmod{p^2}.$$

Thus, there is an $n$ such that $v_p(f(x, n)) = v_\pi(f)$ as wanted and we are done.


Exercise 5.5.3† . Suppose f, g ∈ Z[X] are polynomials such that f (a) − f (b) | g(a) − g(b) for any
rational integers a, b ∈ Z. Prove that there exists a polynomial h ∈ Z[X] such that g = h ◦ f .

Solution

By Exercise 5.5.1† , we know that $f(X) - f(n) \mid g(X) - g(n)$ for all $n$ (in fact we even have
$f(X) - f(Y) \mid g(X) - g(Y)$ but we won’t use that). Consider the base $f$ expansion of $g$: $g =
\sum_i h_if^i$, where $f^i$ means exponentiation and not iteration, and where $h_i \in \mathbb{Q}[X]$ are polynomials
of degree less than $\deg f$. We have
$$g(X) \equiv \sum_i h_i(X)f(X)^i \equiv \sum_i h_i(X)f(n)^i \pmod{f(X) - f(n)}$$
so
$$f(X) - f(n) \mid \sum_i (h_i(X) - h_i(n))f(n)^i.$$
However, the RHS has degree strictly less than $\deg f$ so must be identically zero. By taking
$n$ sufficiently large, we see that all $h_i$ must be constant, otherwise the RHS will be non-zero.

Indeed, if $a_i$ is the coefficient of $X^k$ of $h_i$ for some $k \geq 1$, then the coefficient of $X^k$ of the RHS
is $\sum_i a_if(n)^i$ so the polynomial $\sum_i a_iX^i$ must have infinitely many roots and thus be zero, i.e.
$a_i = 0$ for all $i$. The fact that all $h_i$ are constant is exactly what it means for $g$ to be a polynomial
in $f$.


Exercise 5.5.4 (RMM SL 2016). Let $p$ be a prime number. Prove that there are only finitely many
primes $q$ such that
$$q \mid \sum_{k=1}^{\lfloor q/p \rfloor} k^{p-1}.$$

Solution

Write $q = pn + r$ with $r \in [p - 1]$. It suffices to prove that, for each $r$, $pn + r$ divides
$$\sum_{k=1}^n k^{p-1} := f_r(n)$$
finitely many times only. Note that, as we saw in Exercise A.3.8† , $f_r(n)$ is a polynomial in $n$ of
degree $p$ and leading coefficient $1/p$. As a consequence, there is some integer $N \in \mathbb{Z}$ such that
$Nf_r$ has integer coefficients and is non-zero modulo $p$. By Exercise 5.5.1† , if $pn + r$ divides $f_r(n)$
infinitely many times, $pX + r \mid f_r$ in $\mathbb{Q}[X]$. By Gauss’s lemma, since $pX + r$ is primitive, $pX + r$
divides $Nf_r$ in $\mathbb{Z}[X]$. In particular, $p$ divides the leading coefficient of $Nf_r$ so $Nf_r$ has degree
at most $p - 1$ over $\mathbb{F}_p$. Since $Nf_r(n)$ is identically zero modulo $p$, as $f_r(n) \in \mathbb{Z}$ and $p \mid N$, this
implies that $Nf_r = 0$ over $\mathbb{F}_p$ since it has degree at most $p - 1$ and $p$ roots. This contradicts our
initial assumption.
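
To get a feeling for the statement, one can list the exceptional primes numerically. The sketch below is a brute-force search under our own naming (`exceptional_primes` is not from the text); it uses trial division and reduces every power modulo $q$.

```python
def is_prime(q):
    """Trial-division primality test, sufficient for small q."""
    if q < 2:
        return False
    d = 2
    while d * d <= q:
        if q % d == 0:
            return False
        d += 1
    return True


def exceptional_primes(p, bound):
    """Primes q < bound dividing sum_{k=1}^{floor(q/p)} k^(p-1)."""
    return [
        q
        for q in range(2, bound)
        if is_prime(q)
        and sum(pow(k, p - 1, q) for k in range(1, q // p + 1)) % q == 0
    ]


print(exceptional_primes(3, 2000))
```

For $p = 3$ the sum is $n(n + 1)(2n + 1)/6$ with $n = \lfloor q/3 \rfloor$, and all three factors are smaller than $q$ when $n \geq 1$, so the only prime found is $q = 2$, for which the sum is empty.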


Polynomials over Fp
Exercise 5.5.9† (Generalised Eisenstein’s Criterion). Let $f = a_nX^n + \ldots + a_0 \in \mathbb{Z}[X]$ be a polynomial
and let $p$ be a rational prime. Suppose that $p \nmid a_n$, $p \mid a_0, \ldots, a_{n-1}$, and $p^2 \nmid a_k$ for some $k < n$. Then
any factorisation $f = gh$ in $\mathbb{Q}[X]$ satisfies $\min(\deg g, \deg h) \leq k$.

Solution

Without loss of generality suppose $f = gh$ for some $g, h \in \mathbb{Z}[X]$ using Gauss’s lemma. Modulo
$p$, $f \equiv a_nX^n$ so $g \equiv bX^r$, $h \equiv cX^s$. If $r, s > k$, we get $p^2 \mid a_k$ which is a contradiction. Indeed, if
$g = bX^r + pu$ and $h = cX^s + pv$, then $f \equiv bcX^{r+s} + p(bvX^r + cuX^s) \pmod{p^2}$ and the coefficient
of $X^k$ of this polynomial is zero when $r, s > k$.


Exercise 5.5.10† (China TST 2008). Let $f \in \mathbb{Z}[X]$ be a (non-zero) polynomial with coefficients in
$\{-1, 1\}$. Suppose that $(X - 1)^{2^k}$ divides $f$. Prove that $\deg f \geq 2^{k+1} - 1$.

Solution
We proceed as in Exercise 5.5.11† . Since $(X - 1)^{2^k} \mid f$, we have $n := \deg f \geq 2^k$. Modulo 2,
$f \equiv \frac{X^{n+1} - 1}{X - 1}$ and $(X - 1)^{2^k} = X^{2^k} - 1$ by Frobenius. Thus, $X^{2^k} - 1 \mid X^{n+1} - 1$, which implies
$2^k \mid n + 1$ by Exercise 4.3.1∗ , and in particular $n \geq 2^{k+1} - 1$ as $n \geq 2^k$.
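
The bound can be tested exhaustively for $k = 1$ and $k = 2$. The following sketch (helper names are ours) uses the fact that $(X - 1)^e \mid f$ if and only if $\sum_i \binom{i}{j}c_i = 0$ for all $j < e$, where the $c_i$ are the coefficients of $f$; these sums are the Taylor coefficients of $f$ at 1.

```python
from itertools import product
from math import comb


def divisible_by_power(coeffs, e):
    """True if (X - 1)^e divides the polynomial with constant-first coefficients."""
    # The j-th Taylor coefficient of f at 1 is sum_i C(i, j) * c_i.
    return all(
        sum(comb(i, j) * c for i, c in enumerate(coeffs)) == 0 for j in range(e)
    )


def exists_sign_multiple(degree, e):
    """Is there a polynomial of this degree with coefficients +-1 divisible by (X - 1)^e?"""
    return any(
        divisible_by_power(coeffs, e)
        for coeffs in product((-1, 1), repeat=degree + 1)
    )


print([exists_sign_multiple(d, 2) for d in range(4)])
print([exists_sign_multiple(d, 4) for d in range(7)])
```

For $k = 1$ the bound $\deg f \geq 3$ is attained by $X^3 - X^2 - X + 1 = (X - 1)^2(X + 1)$.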




Exercise 5.5.11† (Romania TST 2002). Let f, g ∈ Z[X] be polynomials with coefficients in {1, 2002}.
Suppose that f | g. Prove that deg f + 1 | deg g + 1.

Solution

Modulo 3 (which was chosen so that $1 \equiv 2002$), by Gauss’s lemma, we get $f \equiv \frac{X^{\deg f + 1} - 1}{X - 1}$ and
$g \equiv \frac{X^{\deg g + 1} - 1}{X - 1}$. Thus, $X^{\deg f + 1} - 1 \mid X^{\deg g + 1} - 1$ over $\mathbb{F}_3$, which implies $\deg f + 1 \mid \deg g + 1$ by
Exercise 4.3.1∗ .


Exercise 5.5.12† (USAMO 2006). Find all polynomials $f \in \mathbb{Z}[X]$ such that the sequence $(P(f(n^2)) -
2n)_{n \geq 0}$ is bounded above, where $P$ is the greatest prime factor function. (In particular, since $P(0) =
+\infty$, we have $f(n^2) \neq 0$ for any $n \in \mathbb{Z}$.)

Solution

Suppose that $P(f(n^2)) - 2n \leq N$ for all $n$. Suppose that $p \mid f(n^2)$ for some odd prime $p$ and a
rational integer $n$. Without loss of generality, suppose $0 \leq n < \frac{p}{2}$ by replacing $n$ by its remainder
upon its Euclidean division by $p$, and then replacing it by $p - n$ if necessary. By assumption,
$p - 2n \in [N]$. Thus, the odd prime factors of $f(n^2)$ always divide

$$(2n + 1) \cdot \ldots \cdot (2n + N)(2n - 1) \cdot \ldots \cdot (2n - N) = (4n^2 - 1) \cdot \ldots \cdot (4n^2 - N^2).$$

By Exercise 5.5.1† , this implies that $f$ divides

$$(4X - 1) \cdot \ldots \cdot (4X - N^2),$$

i.e. $f$ has the form $a\prod_i (4X - a_i^2)$ for some $a \in \mathbb{Q}$ and $a_i \in \mathbb{Z}$. Since $f(n^2)$ is never zero, all $a_i$
must be odd, and this implies $a = \pm 1$ by Gauss’s lemma since $4X - k$ is primitive for odd $k$.
Conversely, these all work: $p \mid f(n^2) \implies p \mid 2n \pm a_i$ and thus $p - 2n \leq \max_i(|a_i|) := N$.


Exercise 5.5.14† (China TST 2021). Suppose the polynomials f, g ∈ Z[X] are such that, for any
sufficiently large rational prime p, there is an element rp ∈ Fp such that f ≡ g(X + rp ) (mod p). Prove
that there exists a rational number r ∈ Q such that f = g(X + r).

Solution

It is clear that $f$ and $g$ have the same degree. We proceed by induction on $\deg f$. When $f$ is
constant it is trivial. For the inductive step, notice that the statement still holds for $f'$ and $g'$.
Since they have degree $\deg f - 1$, we conclude that $f' = g'(X + r)$ for some $r$, i.e. $f = g(X + r) + c$
for some $c \in \mathbb{Q}$. When $\deg f = 1$ this already implies that $f = g(X + s)$ for some $s$. Otherwise,
since we have $g'(X + r_p) \equiv g'(X + r)$, we get $r_p \equiv r \pmod p$. In that case, we get $c \equiv 0 \pmod p$ for
infinitely many $p$ so $c = 0$ as wanted.


Iterates
Exercise 5.5.16† . Let $f \in \mathbb{Z}[X]$ be a polynomial. Show that the sequence $(f^n(0))_{n \geq 0}$ is a Mersenne
sequence, i.e.
$$\gcd(f^i(0), f^j(0)) = f^{\gcd(i,j)}(0)$$
for any $i, j \geq 0$.

Solution

If $f^i(0) \equiv 0 \pmod d$, then we get $f^{ki}(0) \equiv 0$ for any integer $k \geq 0$ by applying $f^i$ multiple times
on both sides. This shows that $f^{\gcd(i,j)}(0) \mid \gcd(f^i(0), f^j(0))$. For the converse, suppose that $d$
divides $f^i(0)$ and $f^j(0)$ where $j > i$. Then,

$$0 \equiv f^j(0) \equiv f^{j-i}(f^i(0)) \equiv f^{j-i}(0) \pmod d$$

so $d \mid f^{j-i}(0)$ too. This shows that we can apply Euclid’s algorithm to get $d \mid f^{\gcd(i,j)}(0)$.
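
The identity is easy to test numerically. In the sketch below the polynomials $X^2 + 1$ and $X^2 - 2$ are arbitrary examples of ours, and the comparison is made up to sign since `math.gcd` always returns a non-negative integer.

```python
from math import gcd


def iterates_at_zero(f, count):
    """Return [f^0(0), f^1(0), ..., f^count(0)]."""
    values = [0]
    for _ in range(count):
        values.append(f(values[-1]))
    return values


def is_mersenne_sequence(f, count):
    """Check gcd(f^i(0), f^j(0)) = |f^gcd(i,j)(0)| for all 0 <= i, j <= count."""
    values = iterates_at_zero(f, count)
    return all(
        gcd(values[i], values[j]) == abs(values[gcd(i, j)])
        for i in range(count + 1)
        for j in range(count + 1)
    )


print(iterates_at_zero(lambda x: x * x + 1, 6))
print(is_mersenne_sequence(lambda x: x * x + 1, 8))
```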


Exercise 5.5.17† . Suppose the non-constant polynomial

$$f = a_dX^d + \ldots + a_2X^2 + a_0 \in \mathbb{Z}[X]$$

has positive coefficients and satisfies $f'(0) = 0$. Prove that the sequence $(f^n(0))_{n \geq 1}$ always has a
primitive prime factor.

Solution

Notice that the coefficient of $X$ of $f^i$ is also 0 for any $i$ (this can be seen via direct expansion
or via $(f^i)'(0) = f'(0) \cdot (f^{i-1})'(f(0)) = 0$). Thus, we have $f^i(x) \equiv f^i(0) \pmod{x^2}$ for any $x \in \mathbb{Z}$.
Letting $x = f^j(0)$ yields $f^{i+j}(0) \equiv f^i(0) \pmod{f^j(0)^2}$. By induction, we get

$$f^{km}(0) \equiv f^k(0) \pmod{f^k(0)^2} \qquad (*)$$

for any integers $k \geq 0$ and $m \geq 1$.

Suppose now that $f^n(0)$ doesn’t have a primitive prime factor for some $n \geq 2$. This means that,
if $p \mid f^n(0)$ is prime, $p \mid f^k(0)$ for some $k < n$. By Exercise 5.5.16† , we may assume that $k \mid n$ by
replacing it by $\gcd(k, n)$ if necessary. Then, by $(*)$, we have $f^n(0) \equiv f^k(0) \pmod{f^k(0)^2}$. In
particular, $v_p(f^n(0)) = v_p(f^k(0))$. Thus, we conclude that

$$v_p(f^n(0)) \leq v_p(f(0) \cdot \ldots \cdot f^{n-1}(0))$$

for any prime $p$. This means that

$$f^n(0) \leq f(0) \cdot \ldots \cdot f^{n-1}(0).$$

We shall prove that this is impossible for $n \geq 2$ by induction. We clearly have $f(0) \geq 1$ since the
coefficients are positive. Now, if $f^n(0) \geq f(0) \cdot \ldots \cdot f^{n-1}(0)$, then

$$f^n(0)^2 \geq f(0) \cdot \ldots \cdot f^n(0)$$

so it suffices to prove that $f^{n+1}(0) > f^n(0)^2$. This is clearly true since $f(x) > x^2$ for positive $x$.


Exercise 5.5.18† (Tuymaada 2003). Let $f \in \mathbb{Z}[X]$ be a polynomial and $a \in \mathbb{Z}$ a rational integer.
Suppose $|f^n(a)| \to \infty$. Prove that there are infinitely many primes $p$ such that $p \mid f^n(a)$ for some
$n \geq 0$ unless $f = AX^d$ for some $A, d$.

Solution

Suppose that there are finitely many such primes $p_1, \ldots, p_m$. Suppose first that $f(0) = 0$. Then,
if we let $f = X^kg$ where $g(0) \neq 0$, we get

$$g(f^i(a)) \equiv g(0) \pmod{f^{i-1}(a)} \qquad (*)$$

since $f(0) = 0$. Choose an $n$ such that $p_1 \cdot \ldots \cdot p_m \mid f^n(a)$; there exists one by assumption (if
$p \mid f^j(a)$ then $p \mid f^{j+1}(a)$). Let $p$ be one of $p_1, \ldots, p_m$. By $(*)$, we have $v_p(g(f^i(a))) = 0$ if $p \nmid g(0)$,
and, for $i > n$, if $p \mid g(0)$,

$$v_p(f^i(a)) = v_p(f^{i-1}(a)^kg(f^{i-1}(a))) \geq v_p(f^{i-1}(a)) + 1$$

since $g(f^{i-1}(a)) \equiv g(0) \equiv 0 \pmod p$. Hence, for sufficiently large $i$ we get $v_p(f^{i-1}(a)) > v_p(g(0))$ for
all $p = p_1, \ldots, p_m$. Combined with $(*)$, we must have $g(f^i(a)) = \pm g(0)$ since the $p_j$ are the only
prime factors, and for large $i$ this implies that $g$ is constant as wanted.

Let $n$ be a large integer. Consider the $m + 1$ numbers $f^n(a), f^{n+1}(a), \ldots, f^{n+m}(a)$. Each of them
is divisible by a large power of some $p_k$, since by assumption they are large and their only prime
factors are the $p_i$. By the pigeonhole principle, two of them must be divisible by a large power
of the same $p_k$, say $p_k^r \mid f^{n+i}(a), f^{n+j}(a)$ with $j > i$. Then, $p_k^r \mid f^{j-i}(0)$. Taking $n \to \infty$ and
thus $r \to \infty$ yields $f^s(0) = 0$ for some $1 \leq s \leq m$. Now, the previous discussion implies that
$f^s = AX^d$ for some $d$. However, if $\deg f \geq 2$, it is easy to see that $f^s$ can only be of this form if
$f$ is. Indeed, if the two leading terms of $g$ are $UX^u + VX^v$ and the leading term of $f$ is $WX^w$,
then the leading terms of $f \circ g$ are

$$WU^wX^{uw} + wWVU^{w-1}X^{u(w-1)+v}$$

(the situation is different when $\deg f = 1$ because we also need to take into account the contribution of
the second term of $f$). Thus, we are done when $\deg f \neq 1$. Otherwise, suppose that $f = uX + v$
with $v \neq 0$. Then,
$$f^n(a) = u^n\left(a + \frac{v}{u - 1}\right) - \frac{v}{u - 1}.$$
Our previous discussion implies that $a \neq 0$, otherwise the sequence $(f^n(a))_{n \geq 0}$ is bounded. If we
take
$$n = \varphi((p_1 \cdot \ldots \cdot p_m)^N),$$
we get $f^n(a) \in \left\{a, -\frac{v}{u-1}\right\}$ modulo $p_i^N$ for every $i$. For sufficiently large $N$, both of these are
non-zero modulo $p_i^N$, so the $p_i$-adic valuations are bounded, which implies that the sequence is too
since these are the only prime factors. This is a contradiction.


Exercise 5.5.19† (USA TST 2020). Find all rational integers $n \geq 2$ for which there exist a rational
integer $m > 1$ and a polynomial $f \in \mathbb{Z}[X]$ such that $\gcd(m, n) = 1$ and $n \mid f^k(0) \iff m \mid k$ for any
positive rational integer $k$.

Solution

Let $k\#$ denote the product of the first $k$ prime numbers (the $k$th primorial). We shall prove that
$n$ works if and only if $\operatorname{rad} n \neq k\#$ for any integer $k \geq 1$.

Suppose first that $\operatorname{rad} n \neq k\#$ for any integer $k \geq 1$. Let $p$ be the greatest prime factor of $n$
and $r = v_p(n)$, and let $q$ be the smallest prime which doesn’t divide $n$. By assumption, $q < p$.
Consider the cycle $\tau : 0 \to 1 \to \ldots \to q - 1 \to 0$ (extended arbitrarily to the other residues modulo $p$). Construct the polynomial
$$g = \sum_{i=0}^{p-1} \tau(i) \prod_{j \neq i} \frac{X - j}{i - j} \in (\mathbb{Z}/p^r\mathbb{Z})[X]$$
which interpolates $\tau$. Note that this is indeed in $(\mathbb{Z}/p^r\mathbb{Z})[X]$ as the denominators are in $]-p, p[$ and
non-zero, and thus coprime with $p$. Lift $g$ to any polynomial with integer coefficients which is
congruent to $g$ modulo $p^r$. We shall denote this new $g$ abusively by $g$ again. Let $u \in \mathbb{Z}$ be such
that $u \cdot \frac{n}{p^r} \equiv 1 \pmod{p^r}$. Then, $f := \frac{un}{p^r}g$ works as we have

$$n \mid f^k(0) \iff p^r \mid g^k(0) \iff q \mid k$$

($m = q$).

Now, suppose $\operatorname{rad} n = k\#$ for some $k$. Suppose for the sake of contradiction that there are some
$f \in \mathbb{Z}[X]$ and $m > 1$ coprime with $n$ such that $n \mid f^k(0)$ if and only if $m \mid k$. Notice that this implies
that the sequence $(f^k(0))_{k \geq 0}$ is periodic modulo $n$, and thus also modulo $p$ for any $p \mid n$. Since it
can take only $p$ values modulo $p$, the period is at most $p$. Since it is coprime with $n$, it must be
1 (by assumption all primes $q \leq p$ divide $n$). Thus, the sequence $(f^k(0))_{k \geq 0}$ is constant modulo
$\operatorname{rad} n$. We shall prove by induction on $\ell$ that $p^\ell \mid f(0)$ for any $p \mid n$ and $\ell \leq v_p(n)$, which implies
that $f(0) \equiv 0 \pmod n$, contradicting the fact that $m > 1$. We have already proved the base case. Note
that, by Corollary 5.3.1, we have

$$f^{k+1}(0) \equiv f(0) + f^k(0)f'(0) \pmod{p^{\ell+1}}. \qquad (*)$$

Suppose for the sake of contradiction that $p \nmid f'(0)$. If $f'(0) \equiv 1 \pmod p$, by induction, we
get $f^k(0) \equiv kf(0) \pmod{p^{\ell+1}}$. Thus, if $p^{\ell+1} \nmid f(0)$, we have $p^{\ell+1} \mid f^k(0) \iff p \mid k$ which
contradicts the fact that $m$ was coprime with $n$.

Thus, $f'(0) \not\equiv 1 \pmod p$. Accordingly, $(*)$ becomes

$$f^{k+1}(0) + \frac{f(0)}{f'(0) - 1} \equiv f'(0)\left(f^k(0) + \frac{f(0)}{f'(0) - 1}\right).$$

By induction, we get
$$f^k(0) \equiv f(0) \cdot \frac{f'(0)^k - 1}{f'(0) - 1}.$$
If $p^{\ell+1} \nmid f(0)$, we have
$$p^{\ell+1} \mid f^k(0) \iff p \mid f'(0)^k - 1 \iff s \mid k$$
where $s$ is the order of $f'(0)$ modulo $p$. However, $s \mid p - 1$ so $s < p$ and is thus not coprime with
$n$ which is again a contradiction.

Finally, we conclude that we must in fact have $p \mid f'(0)$. But then, $(*)$ becomes $f^k(0) \equiv f(0)$,
and since $f^m(0) \equiv 0$ we get $f(0) \equiv 0$ as wanted.


Exercise 5.5.20† . Let $f \in \mathbb{Q}[X]$ be a polynomial of degree $k$. Prove that there is a constant $h > 0$
such that the denominator of $f(x)$ is greater than $h$ times the denominator of $x^k$ for any $x \in \mathbb{Q}$.

Solution
Let $x = \frac{m}{n}$ where $m$ and $n$ are coprime rational integers, write $f = \sum_{i=0}^k a_iX^i$ (with $k = \deg f$)
and pick $0 \neq N \in \mathbb{Z}$ such that $Nf \in \mathbb{Z}[X]$. Denote by $\mathbb{Z}_{(p)}$ the set of rational numbers with
denominator not divisible by $p$. Let $p$ be a prime factor of $n$ and $c \geq 0$ be an integer. Then,
$$Np^{kv_p(n) - c}f\left(\frac{m}{n}\right) = \frac{Na_km^k}{p^c}\left(\frac{p^{v_p(n)}}{n}\right)^k + \sum_{i=0}^{k-1} Na_im^i\left(\frac{p^{v_p(n)}}{n}\right)^i p^{(k-i)v_p(n) - c}.$$

For $v_p(n) \geq c$, the second sum is in $\mathbb{Z}_{(p)}$, while for $c > v_p(Na_k)$, the first term is not in $\mathbb{Z}_{(p)}$.
Thus, for $v_p(n) \geq c > v_p(Na_k) := c_p$, $Np^{kv_p(n) - c}f\left(\frac{m}{n}\right) \notin \mathbb{Z}_{(p)}$, i.e. the denominator $D$ of
$Nf\left(\frac{m}{n}\right)$ is divisible by $p^{kv_p(n) + 1 - c}$. We conclude that, for $v_p(n) > c_p$, we have $p^{kv_p(n) - c_p} \mid D$.

We are now almost done. Let $P$ be the product of $p^{v_p(n)}$ over the primes $p$ for which $v_p(n) \leq c_p$.
Then,
$$P \leq \prod_p p^{c_p} = |Na_k| := C.$$
Thus, the denominator of $Nf\left(\frac{m}{n}\right)$, and hence of $f\left(\frac{m}{n}\right)$, is at least

$$\prod_{v_p(n) > c_p} p^{kv_p(n) - c_p} \geq \frac{1}{C^k} \cdot \prod_{p \mid n} p^{kv_p(n) - c_p} \geq \frac{1}{C^{k+1}} \prod_{p \mid n} p^{kv_p(n)} = \frac{|n|^k}{C^{k+1}}$$

as desired ($h = \frac{1}{C^{k+1}}$).
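
The resulting bound can be checked on examples with exact rational arithmetic. The sketch below assumes the constant $h = 1/C^{k+1}$ with $C = |Na_k|$ and $N$ the lcm of the denominators of the coefficients, as in the solution; the function names and the two test polynomials are our own choices.

```python
from fractions import Fraction
from math import gcd


def bound_constant(coeffs):
    """Return h = 1 / C^(k+1) with C = |N a_k|, for constant-first Fraction coefficients."""
    k = len(coeffs) - 1
    N = 1
    for c in coeffs:
        N = N * c.denominator // gcd(N, c.denominator)  # lcm of the denominators
    C = abs(N * coeffs[-1])
    return Fraction(1, int(C) ** (k + 1))


def bound_holds(coeffs, max_n):
    """Check denominator(f(m/n)) >= h * n^k for coprime m, n with |m|, n <= max_n."""
    k = len(coeffs) - 1
    h = bound_constant(coeffs)
    for n in range(1, max_n + 1):
        for m in range(-max_n, max_n + 1):
            if gcd(m, n) != 1:
                continue
            value = sum(c * Fraction(m, n) ** i for i, c in enumerate(coeffs))
            if value.denominator < h * n**k:
                return False
    return True


# f = 3X^2/4 + X/2, so N = 4, C = 3 and h = 1/27.
print(bound_holds([Fraction(0), Fraction(1, 2), Fraction(3, 4)], 30))
```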


Exercise 5.5.21† . Let $f \in \mathbb{Q}[X]$ be a polynomial of degree at least 2. Prove that
$$\bigcap_{k=0}^\infty f^k(\mathbb{Q})$$
is finite.

Solution

Let $D(x)$ denote the denominator of a rational number $x \in \mathbb{Q}$. Note that, by Exercise 5.5.20† ,
when $D(x)$ is sufficiently large, $D(f(x)) > D(x)$ as $\deg f \geq 2$. This implies that, for a fixed
$r$, if $r = f^k(s_k)$ for all $k$, the denominator of $s_k$ is bounded (otherwise $f^k(s_k)$ would have a
denominator which is too large). However, its absolute value is also bounded, since $|f(x)| > |x|$
for sufficiently large $|x|$. Thus, there are a finite number of possible $s_k$, and this implies that
$f^i(s) = r = f^j(s)$ for some $i \neq j$ and $s := s_i = s_j$. In other words, the intersection $\bigcap_{k=0}^\infty f^k(\mathbb{Q})$
consists only of pre-periodic points, since we also have $f^i(r) = f^{i+j}(s) = f^j(r)$. Thus, it suffices
to show that there are finitely many pre-periodic points.

Let $r$ be a pre-periodic point, i.e. such that $f^i(r) = f^j(r)$ for some $j > i$. The same trick as
before shows that $D(r)$ is bounded. Indeed, if $D(r)$ is sufficiently large, then $D(f^i(r)) > D(r)$ is
too, and thus
$$D(f^j(r)) = D(f^{j-i}(f^i(r))) > D(f^i(r)).$$
But at the same time, the absolute value of $r$ is also bounded since $|f(x)| > |x|$ for $|x|$ sufficiently
large (as $\deg f \geq 2$), so there are a finite number of pre-periodic points as wanted.


Exercise 5.5.22† (Iran TST 2004). Let $f \in \mathbb{Z}[X]$ be a polynomial such that $f(n) > n$ for any positive
rational integer $n$. Suppose that, for any $N \in \mathbb{Z}$, there is some positive rational integer $n$ such that

$$N \mid f^n(1).$$

Prove that $f = X + 1$.

Solution

Choose $N = f^{n+1}(1) - f^n(1)$ for some $n$. Then, the sequence $f^i(1)$ modulo $N$ goes as follows:

$$f(1), f^2(1), \ldots, f^n(1), f^n(1), f^n(1), \ldots.$$

Thus, by assumption, $N \mid f^k(1)$ for some $k \leq n$. If $k = n$, then $f^{n+1}(1) - f^n(1) \equiv f(0)$. Thus,
we get
$$f^{n+1}(1) - f^n(1) \leq f^{n-1}(0)$$
or $f^{n+1}(1) - f^n(1) \leq f(0)$. It is easy to see that this forces $f = X + m$. Finally, modulo $m$ the
sequence $f^n(1)$ is constant equal to 1 so $m = \pm 1$, and since $f(n) > n$ this means that $f = X + 1$
as wanted.


Divisibility Relations
Exercise 5.5.23† . Find all polynomials $f \in \mathbb{Z}[X]$ such that $f(n) \mid n^{n-1} - 1$ for sufficiently large $n$.

Solution

Let $n$ be a sufficiently large rational integer, and $p$ be a prime factor of $f(n)$. Let $m$ be a sufficiently large integer such that $m \equiv n \pmod p$
and $m \equiv 2 \pmod{p - 1}$. Then, $p \mid f(m) \mid m^{m-1} - 1 \equiv n - 1 \pmod p$. Thus, every prime
factor of $f(n)$ divides $n - 1$. By Corollary 5.4.2, this implies that $f$ is a constant times a power
of $X - 1$, say $c(X - 1)^k$. By LTE, for any odd prime $p \mid n - 1$, we have

$$v_p(n^{n-1} - 1) = v_p(n - 1) + v_p(n - 1) = 2v_p(n - 1)$$

so $k \leq 2$. Finally, every prime factor of the constant divides $n - 1$ for every sufficiently large $n$, so the constant must be $\pm 1$.
Conversely, by the previous discussion, $\pm 1, \pm(X - 1), \pm(X - 1)^2$ all work.
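
Both halves of this conclusion are easy to observe numerically. The snippet below is a small check with a helper name of our choosing: it tests whether $(n - 1)^k$ divides $n^{n-1} - 1$ using modular exponentiation.

```python
def power_of_n_minus_one_divides(n, k):
    """True if (n - 1)^k divides n^(n-1) - 1."""
    modulus = (n - 1) ** k
    return (pow(n, n - 1, modulus) - 1) % modulus == 0


print(all(power_of_n_minus_one_divides(n, 2) for n in range(2, 500)))
print(power_of_n_minus_one_divides(4, 3))  # 4^3 - 1 = 63 against 27
```

The exponent 2 always works, while the exponent 3 already fails at $n = 4$.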


Exercise 5.5.25† (ISL 2012 Generalised). Find all polynomials $f \in \mathbb{Z}[X]$ such that $\operatorname{rad} f(n) \mid
\operatorname{rad} f(n^{\operatorname{rad} n})$ for all $n \in \mathbb{Z}$. (You may assume Dirichlet’s theorem ??.)

Solution

Let $n \in \mathbb{Z}$ and suppose that $p$ is a prime factor of $f(n)$. Suppose that $p \nmid n$. Let $k \in (\mathbb{Z}/(p-1)\mathbb{Z})^\times$
be arbitrary. Pick a prime number $q \equiv n \pmod p$ and $q \equiv k \pmod{p - 1}$ using Dirichlet’s
theorem and CRT. Then, $p \mid \operatorname{rad} f(q) \mid f(q^q) \equiv f(n^k) \pmod p$ so $p \mid f(n^k)$. Thus, whenever $p \mid f(n)$, either $p \mid n$
or $p \mid f(m)$ for any $m$ with the same order as $n$ modulo $p$. In particular, if $n$ has order $u$ modulo
$p$, then $f$ has $u$ roots in $\mathbb{F}_p$. Let $d$ be the degree of $f$. Since $f$ has at most $d$ roots in $\mathbb{F}_p$, this
implies that the prime factors of $f(n)$ all divide $g(n)$ where $g$ is
$$X\Phi_1\Phi_2 \cdot \ldots \cdot \Phi_d.$$
Now, suppose $f = \lambda X^{a_0}\prod_{s \in S} \Phi_s^{a_s}$. We claim that $f$ works iff $r \mid s \implies r \in S$ for any $s \in S$.

Clearly, these all work since if $n$ has order $s$ modulo $p$, i.e. $p \mid \Phi_s(n)$, then $n^{\operatorname{rad} n}$ has order $r$
dividing $s$ and $p \mid \Phi_r(n^{\operatorname{rad} n}) \mid f(n^{\operatorname{rad} n})$ as wanted.

It suffices to show that $r \mid s \implies r \in S$ when $\frac{s}{r}$ is prime, since we can obtain any divisor of $s$ by
dividing multiple times by a prime. Thus, suppose that $s = qr$ for some prime $q$. Let $p \equiv 1
\pmod s$ be a prime not dividing any element of $S$ and let $n$ be an element of order $s$ modulo $p$.
Pick a prime $q' \neq q$ such that $qq' \equiv n \pmod p$ and $q' \equiv 1 \pmod{\frac{p-1}{q}}$; there exists one by CRT
and Dirichlet’s theorem again. Then, $p \mid \operatorname{rad} f(qq') \mid f(n^{qq'})$ so $p \mid f(n^q)$. Since $n^q$ is an element
of order $r$ modulo $p$, this means that $\Phi_r \mid f$ as wanted. We conclude that all solutions have the
form
$$f = \lambda X^{a_0}\prod_{s \in S} \Phi_s^{a_s}$$

where $S$ is such that $r \mid s \in S$ implies $r \in S$.




Remark 5.5.1
There is an elementary way to avoid the use of Dirichlet’s theorem. When we take a prime
satisfying some congruence condition, we do not really care that it’s prime, we just care about
the value of its radical. Thus, it suffices to have a (sufficiently large) squarefree n such that
n ≡ a (mod b) for some coprime a, b. This can be done by showing that the density of such n is
positive. More specifically, let $N$ be a positive integer. We want to count how many $n \in [N]$ are
congruent to $a$ (mod $b$) and not squarefree. If we show that this is $cN + o(N)$ for some $c < \frac{1}{\varphi(b)}$
we are done since there are $\frac{N}{\varphi(b)} + O(1)$ integers congruent to $a$ modulo $b$ in $[N]$. How many
such integers are there then? Well, $n$ is not squarefree if it is divisible by $p^2$ for some prime $p$.
We can assume $p \nmid b$ as these primes can’t divide $n \equiv a \pmod b$. Then, there are $\frac{N}{\varphi(b)p^2} + O(1)$
such integers congruent to $a$ modulo $b$ since the conditions $n \equiv 0 \pmod{p^2}$ and $n \equiv a \pmod b$ are
independent by CRT. Thus, the number of $n \equiv a \pmod b$ in $[N]$ which are not squarefree is at
most
$$\frac{N}{\varphi(b)} \sum_{p \nmid b,\, p \leq N} \left(\frac{1}{p^2} + O(1/N)\right) \leq O(\pi(N)) + \frac{N}{\varphi(b)} \sum_{p \nmid b} \frac{1}{p^2}$$

where $\pi(N)$ is the number of primes less than $N$. In Exercise 3.5.14, we proved that $\pi(N) = o(N)$,
so we only need to show that $\sum_{p \nmid b} \frac{1}{p^2} < 1$ to be done. This follows from the following estimate:

$$\sum_p \frac{1}{p^2} < \sum_{n=2}^\infty \frac{1}{n(n - 1)} = \sum_{n=2}^\infty \left(\frac{1}{n - 1} - \frac{1}{n}\right) = 1.$$

We end this remark with one more observation: our previous approach can be refined to show
that, in fact, the exact density of squarefree $n \equiv a \pmod b$ is
$$\frac{1}{\varphi(b)} \prod_{p \nmid b} \left(1 - \frac{1}{p^2}\right) = \frac{1}{\zeta(2)\varphi(b)\prod_{p \mid b}\left(1 - \frac{1}{p^2}\right)} = \frac{6}{\pi^2\varphi(b)\prod_{p \mid b}\left(1 - \frac{1}{p^2}\right)}.$$

Indeed, the product of $1 - \frac{1}{p^2}$ comes from the inclusion-exclusion principle: the density of $n$ not
divisible by $p^2$ is $1 - \frac{1}{p^2}$ and these densities are independent by CRT. More precisely, there are

$$\frac{N}{\varphi(b)} \prod_{p \nmid b,\, p \leq N} \left(1 - \frac{1}{p^2} + O(1/N)\right)$$

integers in $[N]$ congruent to $a$ modulo $b$ which are not divisible by a square. If we take the
logarithm, since
$$\log(x + \varepsilon) - \log x = \log(1 + \varepsilon/x) = \varepsilon/x + O(\varepsilon^2/x^2),$$
we will be able to sum the $O(1/N)$ and get $O(\pi(N)/N) \to 0$. When we retake the exponential,
this gives us a constant which goes to 1. Hence, after dividing by $N$, we get the wanted result.

Exercise 5.5.27† . Find all polynomials $f \in \mathbb{Z}[X]$ such that $f(p) \mid 2^p - 2$ for any prime $p$. (You may
assume Dirichlet’s theorem ??.)

Solution

Let $n \in \mathbb{Z}$ be an integer and suppose $p$ is an odd prime factor of $f(n)$. Suppose for the sake
of contradiction that $p \nmid n$. Using Dirichlet’s theorem, we may find a prime $q \equiv n \pmod p$ and
$q \equiv -1 \pmod{p - 1}$. This gives $p \mid f(q) \mid 2^q - 2 \equiv 2^{-1} - 2 = -\frac{3}{2} \pmod p$ so $p = 3$. Thus, the sufficiently
large prime divisors of $f(n)$ divide $n$ which implies that $f = aX^k$ for some $a \in \mathbb{Z}$ and some
integer $k \geq 0$ by Corollary 5.4.2. Since $f(2) \mid 2$, we get the solutions $f = \pm 2$ and $f = \pm X$, which
work by Fermat’s little theorem.


Miscellaneous
Exercise 5.5.29† (Generalised Hensel’s Lemma). Let $f \in \mathbb{Z}[X]$ be a polynomial and $a \in \mathbb{Z}$ an integer.
Let $m = v_p(f'(a))$. If $p^{2m+1} \mid f(a)$, prove that $f$ has exactly one root $b$ modulo $p^k$ which is congruent
to $a$ modulo $p^{m+1}$ for all $k \geq 2m + 1$.

Solution

We need to show that we can still perform the inductive step for $k \geq 2m + 1$. Write $b_{k+1} =
b_k + up^{k-m}$. We have

$$f(b_{k+1}) \equiv f(b_k) + up^{k+1-m}f'(a) \pmod{p^{k+1}}$$

as before. This can be congruent to 0 if and only if $v_p(p^{k+1-m}f'(a)) < v_p(f(b_k))$, which is true
since
$$v_p(f(b_k)) > k = (k - m) + v_p(f'(a)).$$
As before, this $u$ is unique modulo $p$ which shows the wanted result.


Remark 5.5.2
This doesn’t work under the weaker assumption that $v_p(f(a)) > m$, as can be seen from $f =
14X^2 + 3X + 9$ which doesn’t have a root modulo $3^3$ (this may seem very random but it was
in fact carefully constructed from our previous proof), because the congruence we get with the
derivative holds modulo $p^{2(k-m)}$, and $2(k - m) \geq k + 1$ only for $k \geq 2m + 1$.
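
The counterexample can be verified by direct computation. The sketch below takes $p = 3$ and $a = 0$ (our reading of the intended base point, since $f(0) = 9$ and $f'(0) = 3$) and searches for roots modulo 9 and modulo 27; the helper `valuation` is ours.

```python
def valuation(x, p):
    """p-adic valuation of a non-zero integer x."""
    v = 0
    while x % p == 0:
        x //= p
        v += 1
    return v


def f(x):
    return 14 * x * x + 3 * x + 9


def derivative(x):
    return 28 * x + 3


p, a = 3, 0
m = valuation(derivative(a), p)
print(m, valuation(f(a), p))                     # v_p(f(a)) > m but v_p(f(a)) < 2m + 1
print([x for x in range(9) if f(x) % 9 == 0])    # roots modulo 9
print([x for x in range(27) if f(x) % 27 == 0])  # roots modulo 27
```

Writing $x = 3t$ gives $f(x) = 9(14t^2 + t + 1)$, and $14t^2 + t + 1$ has no root modulo 3, which is why the last list is empty.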

Exercise 5.5.30† . Let f ∈ Z[X] be a non-constant polynomial. Is it possible that f (n) is prime for
any n ∈ Z?
322 CHAPTER 5. POLYNOMIAL NUMBER THEORY

Solution

Let f be a polynomial which is always prime. If f(n) = p, then p | f(n + kp) for every k ∈ Z, so we
must have f(n + kp) = p. Since this holds for infinitely many k, f − p has infinitely many roots,
which implies that f = p is constant.
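
To see the divisibility p | f(n + kp) in action, here is a small illustration with Euler’s polynomial n² + n + 41. This polynomial is a choice made for the example and does not appear in the text.

```python
def f(n):
    # Euler's polynomial, used here only as an illustration.
    return n * n + n + 41

n0 = 0
p = f(n0)                                    # f(0) = 41, a prime
values = [f(n0 + k * p) for k in range(1, 6)]

# Each value is a multiple of p that is larger than p, hence composite.
all_composite_multiples = all(v % p == 0 and v > p for v in values)
```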


Exercise 5.5.31† . Find all polynomials f ∈ Q[X] which are surjective onto Q.

Solution

Without loss of generality, suppose that f ∈ Z[X] and f(0) = 0. We will prove that f must have
degree 1 (and conversely, these polynomials are surjective). Let p be a rational prime. Examine
the equation f(r) = p: by Exercise 1.1.2, the potential rational solutions have the form p/s or 1/s
where s is a divisor of the leading coefficient of f. The latter is impossible for large p, while the
former is only possible for large p if f has degree 1, otherwise f(r) = f(p/s) grows too fast, of the
order of p^n where n is the degree of f.


Exercise 5.5.35† (ISL 2005). Let f ∈ Z[X] be a non-constant polynomial with positive leading
coefficient. Prove that there are infinitely many positive rational integers n such that f (n!) is composite.

Solution

Wilson’s theorem tells us that, if p is prime and 0 ≤ n ≤ p − 1,

\[ (p-1-n)! \equiv \frac{(p-1)!}{(-1)\cdot(-2)\cdots(-n)} \equiv \frac{(-1)^{n-1}}{n!} \pmod{p}. \]

In other words, if we let g(X) be the polynomial X^{deg f} f(1/X), for any prime p and any positive
rational integer n, p | f(n!) ⟺ p | g((−1)^{n−1}(p − 1 − n)!). Hence, we wish to find an integer
m such that g((−1)^{m−1} m!) has a prime factor p > m for which f((p − 1 − m)!) is greater than p.
That way, f((p − 1 − m)!) will be divisible by p and not equal to p which means that it’s composite
as desired. Suppose that there are finitely many such primes. Note that, for sufficiently large m,
if p | g(m!) and p ≤ m, then p | g(0). In particular, there are finitely many such primes since
g(0) is the leading coefficient of f, and the same argument shows that
\[ \prod_{p \mid g(m!),\ p \le m} p^{v_p(g(m!))} \le g(0). \]

This implies that, for sufficiently large m, g((−1)^{m−1} m!) has a prime factor p > m. Then, for
this p, we have p | f ((p − 1 − m)!) by construction so f ((p − 1 − m)!) = p for sufficiently large
p by assumption. In particular, since f (n!) ≥ n!/2 > 2n for large n, we have p ≤ 2m, otherwise
f ((p − 1 − m)!) > 2m−1 ≥ p. In other words, p is fairly close to m: between m and 2m. We are
almost done. Consider the sequence (f (n!))n≥0 . By assumption, p is an element of this sequence.
However, the terms of f (n!) grow further and further away: f ((n + 1)!)/f (n!) → ∞. Hence, if
we choose m = 2f (n!) for instance, p will be greater than f (n!) but smaller than f ((n + 1)!) and
so won’t be in the sequence (for large n).
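
The Wilson-type congruence that opens this solution, (p − 1 − n)! · n! ≡ (−1)^{n−1} (mod p) for 0 ≤ n ≤ p − 1, is easy to test numerically. The sketch below is only a check on a few moduli; the function name is made up for this illustration.

```python
from math import factorial

def wilson_identity_holds(p):
    """Check (p-1-n)! * n! ≡ (-1)^(n-1) (mod p) for all 0 <= n <= p-1."""
    for n in range(p):
        sign = -1 if n % 2 == 0 else 1       # (-1)^(n-1), kept as an integer
        if (factorial(p - 1 - n) * factorial(n) - sign) % p != 0:
            return False
    return True

prime_checks = [wilson_identity_holds(p) for p in (2, 3, 5, 7, 11, 13, 101)]
composite_check = wilson_identity_holds(9)   # fails already at n = 0
```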

Chapter 6

The Primitive Element Theorem and Galois Theory

6.1 General Definitions


Exercise 6.1.1∗. Prove that R[α_1, …, α_n] is indeed the smallest ring containing R and α_1, …, α_n,
in the sense that any other such ring must contain R[α_1, …, α_n]. Similarly, prove that any field
containing K and α_1, …, α_n contains K(α_1, …, α_n).

Solution

If a ring contains R and α1 , . . . , αn , it contains all polynomials in α1 , . . . , αn with coefficients in


R since these are obtained from multiplication and addition of elements of R and the αi . Thus,
it contains R[α1 , . . . , αn ]. Similarly, if a field contains K and α1 , . . . , αn , it contains all rational
functions in α1 , . . . , αn with coefficients in K since these are obtained from multiplication of
elements of K[α1 , . . . , αn ] with inverses of other elements (we have already shown that the field
must contain K[α1 , . . . , αn ] since a field is also a ring).


Exercise 6.1.2∗. Let α ∈ Q̄ be an algebraic number. Prove that Q(α) = Q[α].

Solution

We wish to prove that Q[α] is closed under inversion. Let f(α) be a non-zero element of Q[α],
i.e. π_α ∤ f. Then, since π_α is irreducible, it is coprime with f. Thus, by Bézout’s lemma, we
have rf + sπ_α = 1 for some r, s ∈ Q[X]. Evaluating at α yields r(α)f(α) = 1 as wanted.


Exercise 6.1.3∗. Prove that the minimal polynomial of (i+1)√2/2 over Q(i) is X² − i.

Solution

(i+1)√2/2 is a root of X² − i and not in Q(i), so its minimal polynomial has degree 2 and divides
X² − i, and is therefore equal to it.


Exercise 6.1.4∗ . Check that L is a K-vector space.


Solution

Since K ⊆ L, multiplication of elements of L (vectors) by elements of K (scalars) is well-defined


and satisfies the obvious properties since it’s just the multiplication of two elements of L!


Exercise 6.1.5∗. Prove that (u_i v_j)_{i∈[m], j∈[n]} is a K-basis of M.

Solution
Suppose that \sum_{i,j} a_{i,j} u_i v_j = 0 for some a_{i,j} ∈ K. Rewrite it as
\[ \sum_i u_i \Bigl( \sum_j a_{i,j} v_j \Bigr) = 0. \]
This is a linear combination of the L-basis (u_i) of M. Thus, by definition of a basis,
\sum_j a_{i,j} v_j = 0 for each i. Again by the definition of a basis, this means that a_{i,j} = 0 for
each j and each i.

Thus this family is linearly independent. It remains to show that it generates all of M. We can
proceed exactly as we did but in the reverse direction: let α = \sum_i b_i u_i be an element of M, with
b_i ∈ L (recall that (u_i) is the L-basis of M). Write each b_i as a linear combination \sum_j a_{i,j} v_j
with a_{i,j} ∈ K ((v_j) is the K-basis of L). We get
\[ \alpha = \sum_{i,j} a_{i,j} u_i v_j \]

as wanted.


Exercise 6.1.6∗ . Let M/L/K be a tower of extensions and α ∈ M . Prove that the minimal
polynomial of α over L divides the minimal polynomial of α over K. In other words, its L-conjugates
are among its K-conjugates.

Solution

The minimal polynomial of α over K is also a polynomial over L since K ⊆ L. Since it vanishes
at α, this means that it is divisible by the minimal polynomial of α over L.


Exercise 6.1.7∗ . Prove that finite extensions of K are exactly the fields of the form K(α1 , . . . , αn )
for α1 , . . . , αn algebraic elements over K, using Proposition 6.1.1.

Solution

We proceed by induction on [L : K], where L is a finite extension of K (we do not fix K). When
this is one we have L = K. For the induction step, let α ∈ L be an element which is not in K.
By the tower law,
[L : K] = [L : K(α)][K(α) : K].
Since α ∉ K, [K(α) : K] > 1 so that [L : K(α)] < [L : K]. By the inductive hypothesis, this
means
L = K(α)(α_1, …, α_n) = K(α, α_1, …, α_n)
as wanted.


6.2 The Primitive Element Theorem and Field Theory


Exercise 6.2.1. Let K be a number field. Prove that the embeddings of K are the non-zero functions
f : K → C which are both multiplicative and additive.

Solution

This is the Cauchy equation: we shall show that any additive function is Q-linear. By induction,
we have f(nx) = nf(x) for any n ∈ Z. Thus, for non-zero m, n ∈ Z, we have

nf (xm/n) = f (xm) = mf (x)

which means f (xm/n) = f (x)m/n, i.e. f is Q-linear.




Exercise 6.2.2∗ . Prove that ϕ ∈ EmbK (L) if and only if ϕ commutes with polynomials, i.e.

ϕ(f (x1 , . . . , xn )) = f (ϕ(x1 ), . . . , ϕ(xn ))

for any f ∈ K[X1 , . . . , Xn ] and any x1 , . . . , xn ∈ L.

Solution

First suppose that ϕ commutes with polynomials. Then, ϕ is additive and multiplicative since it
commutes with XY and X + Y , and fixes K since it commutes with constant polynomials, i.e.
ϕ is a K-embedding.

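Now, suppose that ϕ ∈ Emb_K(L) and let f = \sum_{i_1,…,i_n} a_{i_1,…,i_n} X_1^{i_1} ⋯ X_n^{i_n} be a
polynomial. Then,
\[ \varphi\Bigl( \sum_{i_1,\dots,i_n} a_{i_1,\dots,i_n} x_1^{i_1} \cdots x_n^{i_n} \Bigr)
 = \sum_{i_1,\dots,i_n} \varphi\bigl(a_{i_1,\dots,i_n} x_1^{i_1} \cdots x_n^{i_n}\bigr)
 = \sum_{i_1,\dots,i_n} \varphi(a_{i_1,\dots,i_n}) \varphi(x_1)^{i_1} \cdots \varphi(x_n)^{i_n}
 = \sum_{i_1,\dots,i_n} a_{i_1,\dots,i_n} \varphi(x_1)^{i_1} \cdots \varphi(x_n)^{i_n} \]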

since ϕ fixes K. (Note that we have used the fact the sum is finite here, it is not true in general
that embeddings commute with convergent power series.)


Exercise 6.2.3∗ . Let α ∈ L be an element and σ ∈ EmbK (L) be an embedding. Prove that σ(α) is
a conjugate of α.

Solution

Let f be the minimal polynomial of α. By Exercise 6.2.2∗ , we have

0 = σ(f (α)) = f (σ(α))

so σ(α) is a conjugate of α.


Exercise 6.2.4∗ . Prove that an embedding is injective.

Solution

Suppose that α ≠ β. Then,
\[ \sigma(\alpha - \beta)\, \sigma\Bigl(\frac{1}{\alpha - \beta}\Bigr) = \sigma(1) = 1 \]
so σ(α − β) = σ(α) − σ(β) is non-zero which means that σ(α) ≠ σ(β). (In algebraic terms, we have
shown that the kernel was trivial to prove that the morphism was injective. See Exercise A.2.16∗.)


Exercise 6.2.5. Let α1 , . . . , αn ∈ L be all the K-conjugates of some α ∈ L, and let σ ∈ EmbK (L) be
an embedding. Prove that σ permutates α1 , . . . , αn .

Solution

Exercise 6.2.6. Solve Problem 6.2.1 without field theory, i.e. using only the content of Chapter 1.

Solution

One could proceed as follows. Let S denote the sum of the α_i. Choose a k ∈ [n]; we shall prove
that α_k is rational. For this, consider
\[ \alpha_k = S - \sum_{i \ne k} \alpha_i. \]
The fundamental theorem of symmetric polynomials tells us that each conjugate α′_k of α_k has
the form S − \sum_{i≠k} α′_i where α′_i is some conjugate of α_i. Thus, we have
\[ \sum_{i=1}^{n} \alpha_i = S = \sum_{i=1}^{n} \alpha'_i. \]
Since the α_i are maximal among their conjugates, proceeding as in the field theory solution, we
get α′_i = α_i for each i. But since α′_k was an arbitrary conjugate of α_k, this means α_k has only
one conjugate, i.e. α_k ∈ Q. You can see that this was a lot messier than with field theory!


Exercise 6.2.7∗. Check that N_{Q(∛2)}(a + b∛2 + c∛4) = a³ + 2b³ + 4c³ − 6abc.

Solution
Let j be a primitive cube root of unity. The norm of a + b∛2 + c∛4 is
\[ (a + b\sqrt[3]{2} + c\sqrt[3]{4})(a + bj\sqrt[3]{2} + cj^2\sqrt[3]{4})(a + bj^2\sqrt[3]{2} + cj\sqrt[3]{4}). \]
This is
\begin{align*}
& a^3 + (b\sqrt[3]{2})^3 + (c\sqrt[3]{4})^3 + (1 + j + j^2)\bigl(a(b\sqrt[3]{2})^2 + a(c\sqrt[3]{4})^2\bigr) + (1 + j + j^2)\bigl(a^2 b\sqrt[3]{2} + a^2 c\sqrt[3]{4}\bigr) \\
& \quad + (1 + j + j^2)\bigl((b\sqrt[3]{2})^2 (c\sqrt[3]{4}) + (b\sqrt[3]{2})(c\sqrt[3]{4})^2\bigr) + 3(j + j^2)\, a (b\sqrt[3]{2})(c\sqrt[3]{4}) \\
& = a^3 + 2b^3 + 4c^3 + 0 + 0 + 0 - 3(2abc) \\
& = a^3 + 2b^3 + 4c^3 - 6abc.
\end{align*}

Remark 6.2.1
It is perhaps easier to use the definition of the norm as a determinant (see Remark C.3.6). One
can check that the matrix of the linear map x ↦ (a + b∛2 + c∛4)x in the basis 1, ∛2, ∛4 is
\[ \begin{pmatrix} a & 2c & 2b \\ b & a & 2c \\ c & b & a \end{pmatrix} \]
which has determinant
\[ a(a^2 - 2bc) - 2c(ba - 2c^2) + 2b(b^2 - ac) = a^3 + 2b^3 + 4c^3 - 6abc. \]
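
The agreement between the determinant and the norm formula can be confirmed mechanically on a grid of integer values. This is only a numerical sketch; the function names are introduced for this illustration.

```python
from itertools import product

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists (cofactor expansion)."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def multiplication_matrix(a, b, c):
    """Matrix of x -> (a + b*2^(1/3) + c*4^(1/3)) * x in the basis 1, 2^(1/3), 4^(1/3)."""
    return [[a, 2 * c, 2 * b],
            [b, a, 2 * c],
            [c, b, a]]

def norm_formula(a, b, c):
    return a ** 3 + 2 * b ** 3 + 4 * c ** 3 - 6 * a * b * c

agree = all(
    det3(multiplication_matrix(a, b, c)) == norm_formula(a, b, c)
    for a, b, c in product(range(-4, 5), repeat=3)
)
```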


 

6.3 Galois Theory


Exercise 6.3.1∗ . Check that K(α1 , . . . , αn )/K is Galois and prove that any Galois extension has this
form.

Solution

K(α1 , . . . , αn )/K is Galois because K-embeddings send αi to some other αj so send


K(α1 , . . . , αn ) to itself and are thus all automorphisms. Conversely, if L/K is Galois, let α
be a primitive element for L and α1 , . . . , αn its conjugates. Then,

L = K(α) = K(α1 , . . . , αn )

since L/K is Galois.




Exercise 6.3.2∗ . Can you express the Galois group of a quadratic extension L/K in a way that
doesn’t depend on L or K? (More specifically, show that the Galois groups of quadratic extensions
are all isomorphic.)

Solution

The embeddings of a quadratic extension K(√d)/K are the identity and the conjugation
a + b√d ↦ a − b√d. The Galois group is isomorphic to Z/2Z (with addition): the identity is sent
to 0 and the conjugation to 1: conjugation composed with conjugation gives the identity, i.e.
1 + 1 = 0. (Coincidentally the only group with two elements is Z/2Z so we could have directly
concluded that they were isomorphic.)


Exercise 6.3.3∗ . Check that the Galois group is a group under composition.

Solution

There are three things to check: that the operation is associative, that it has an identity and
that there are inverses. The first is trivial since composition is associative, and the second as
well since σ ∘ id = id ∘ σ = σ for any embedding σ. For the third, note that, if L = K(α) and
σ : f(α) ↦ f(β) is an element of Gal(L/K), we have K(β) ⊆ L since L is Galois and

[K(β) : K] = [K(α) : K] = [L : K]

since β is a conjugate of α, which means that L = K(β) by the tower law. Thus, σ^{−1} : f(β) ↦
f(α) is also an element of the Galois group.


Exercise 6.3.4. Let L/K be Galois and K ⊆ M ⊆ L be an intermediate field. Prove that Emb_K(M)
is a system of representatives of Gal(L/K)/Gal(L/M), where the quotient means Gal(L/K) modulo
Gal(L/M), i.e. we say σ′ ≡ σ if σ^{−1} ∘ σ′ ∈ Gal(L/M). (Our quotient A/B is more commonly thought
of as the set of right-cosets of B in A, i.e. the sets Ba for a ∈ A (which we just wrote as a in our
case).) (See also Exercise A.3.15†.)

Solution

Extend any embedding σ ∈ Emb_K(M) to an embedding σ′ ∈ Emb_K(L) = Gal(L/K). This is
a well defined map from Emb_K(M) to Gal(L/K)/Gal(L/M): if ϕ and ψ are equal on M, then
ψ^{−1} ∘ ϕ is the identity on M so is in Gal(L/M) as wanted.

This map is clearly injective; to show that it is bijective we just follow our argument in the other
way. Let σ ∈ Gal(L/K)/Gal(L/M). We prove that its image on M is well defined: if σ′ ≡ σ
then σ and σ′ have the same images on M since σ^{−1} ∘ σ′ is the identity on M.


Exercise 6.3.5. Prove Proposition 6.2.4 using Exercise 6.3.4. (This is a bit technical.)

Solution
We have N_{M/K} = \prod_{σ∈Emb_K(M)} σ and
\[ N_{L/K} \circ N_{M/L} = \prod_{\varphi \in \mathrm{Emb}_K(L)} \varphi \circ \prod_{\psi \in \mathrm{Emb}_L(M)} \psi. \]
We would like to say that this is \prod_{ϕ,ψ} ϕ ∘ ψ and then show that the ϕ ∘ ψ correspond to the
K-embeddings of M but there is one problem: im ψ is in general not contained in L, the domain
of ϕ. Thus, let F be the Galois closure of M, i.e., if M = K(α), then F = K(α_1, …, α_n)
where α_1, …, α_n are the conjugates of α. Using Exercise 6.3.4, we extend embeddings of L to
embeddings of F.

Let G_K = Gal(F/K), G_M = Gal(F/M) and G_L = Gal(F/L). By Exercise 6.3.4, we have
N_{M/K} = \prod_{σ∈G_K/G_M} σ and
\[ N_{L/K} \circ N_{M/L} = \prod_{\varphi \in G_K/G_L} \varphi \circ \prod_{\psi \in G_L/G_M} \psi = \prod_{\varphi \in G_K/G_L,\ \psi \in G_L/G_M} \varphi \circ \psi. \]

To conclude, we prove that if the ϕ_i are a system of representatives of G_K/G_L and the ψ_j of
G_L/G_M, then the ϕ_i ∘ ψ_j are a system of representatives of G_K/G_M. By looking at the
cardinalities, it suffices to show that they are distinct in G_K/G_M. Thus, suppose that

ϕ′ψ′ ≡ ϕψ (mod G_M) ⟺ ϕ^{−1}ϕ′ψ′ ≡ ψ (mod G_M).

If we look at this modulo G_L, we get ϕ^{−1}ϕ′ = id, i.e. ϕ = ϕ′. Then if we look at the remainder
modulo G_M, we have ψ′ = ψ which shows what we wanted.


Exercise 6.3.6∗ . Compute σ(0,−1) ◦ σ(1,1) and σ(1,1) ◦ σ(0,−1) .

Solution

This follows from the fact that σ_{(a,b)} ∘ σ_{(c,d)} = σ_{(a+bc,bd)}. Indeed, σ_{(a,b)} ∘ σ_{(c,d)} sends ζ to

σ_{(a,b)}(ζ^d) = (ζ^d)^b = ζ^{bd}

and ∛2 to

σ_{(a,b)}(ζ^c ∛2) = (ζ^c)^b · ζ^a ∛2 = ζ^{a+bc} ∛2.
3 3 3

Thus,
σ(0,−1) ◦ σ(1,1) = σ(0−1·1,−1·1) = σ(−1,−1)
but
σ(1,1) ◦ σ(0,−1) = σ(1+1·0,1·(−1)) = σ(1,−1)
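
The composition rule can be verified by tracking exponents. The sketch below assumes, as the computation in this solution suggests, that σ_{(a,b)} sends ζ to ζ^b and ∛2 to ζ^a ∛2, with ζ a primitive cube root of unity, so that exponents of ζ live modulo 3 and b = −1 is stored as 2. The encoding of monomials as pairs is a choice made for this illustration.

```python
def apply(sigma, monomial):
    """Apply sigma = (a, b) to the monomial zeta^e * theta^t, where theta = 2^(1/3).

    sigma sends zeta -> zeta^b and theta -> zeta^a * theta (assumed convention);
    the exponent of zeta is reduced modulo 3.
    """
    a, b = sigma
    e, t = monomial
    return ((b * e + a * t) % 3, t)

def compose(s1, s2):
    """Return (a, b) with sigma_(a,b) = s1 ∘ s2, read off from the images of zeta and theta."""
    zeta_image = apply(s1, apply(s2, (1, 0)))
    theta_image = apply(s1, apply(s2, (0, 1)))
    return (theta_image[0], zeta_image[0])

def formula(s1, s2):
    """The closed form (a + bc, bd), reduced modulo 3."""
    (a, b), (c, d) = s1, s2
    return ((a + b * c) % 3, (b * d) % 3)

group = [(a, b) for a in range(3) for b in (1, 2)]
law_holds = all(compose(s, t) == formula(s, t) for s in group for t in group)

left = compose((0, 2), (1, 1))    # sigma_(0,-1) ∘ sigma_(1,1)
right = compose((1, 1), (0, 2))   # sigma_(1,1) ∘ sigma_(0,-1)
```

Here `left` is (2, 2), i.e. σ_{(−1,−1)}, and `right` is (1, 2), i.e. σ_{(1,−1)}, so the two automorphisms do not commute.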


Exercise 6.3.7∗. Prove Corollary 6.3.1. (You may assume Exercise 6.3.10.¹)

Solution

Write L = K(α_1, …, α_n), and consider the field L′ defined as K adjoined α_1, …, α_n and all
their conjugates. By Exercise 6.3.10, L′/K is separable and thus Galois. Its Galois group is
finite which means that it has a finite number of subgroups: by the Galois correspondence, this
implies that there is a finite number of intermediate fields L′/M/K. Since L ⊆ L′, there is also
a finite number of intermediate fields L/M/K as wanted.


Exercise 6.3.8. Prove that the primitive element theorem follows from Corollary 6.3.1 by considering
the fields Kt = K(α + tβ), where α, β ∈ K are given elements.

1 Sadly, our proof of it uses the primitive element theorem. However, it is usually proved from the theory of embeddings

in any finite extension (not necessarily separable), which explains why we relied on the primitive element theorem, as
we have not delved in the theory of inseparable extensions at all. See Conrad [13] or Lang [18].

Solution

Since we have already proven the primitive element theorem for finite fields in Chapter 4, suppose
without loss of generality that K is infinite. Since there are only finitely many intermediate
fields by Corollary 6.3.1 but an infinite number of possible t ∈ K, the pigeonhole principle implies
that K_t = K_s for some t ≠ s. Set γ = α + tβ. Then,

α + tβ, α + sβ ∈ K(γ) ⊆ K(α, β).

Thus,
(t − s)β = (α + tβ) − (α + sβ) ∈ K(γ),
so that β ∈ K(γ) as t ≠ s. Then, α = (α + tβ) − tβ is in K(γ) as well, so that

K(α, β) ⊆ K(γ) ⊆ K(α, β)

as wanted.


Exercise 6.3.9∗ . Prove that ei (σ1 (α), . . . , σk (α)) is fixed by H for any i.

Solution

We shall prove that any symmetric polynomial f evaluated at σi (α) is fixed by H (this is in fact
equivalent to what we need to prove by the fundamental theorem of symmetric polynomials).
For this, note that
σ(f(σ_1(α), …, σ_k(α))) = f(σσ_1(α), …, σσ_k(α))
by a slightly generalised version of Exercise 6.2.2∗. Since σ_i ↦ σσ_i is a permutation of H (since
H is a group!) and f is symmetric, this is exactly equal to f(σ_1(α), …, σ_k(α)).


Exercise 6.3.10. Let α, β be separable over K (i.e. their minimal polynomials have distinct roots).
Prove that K(α, β) is separable over K.

Solution

Note that this is equivalent to the normal closure L of K(α, β) being Galois over K, where L is
defined as K adjoined the conjugates of α and β (which are also separable). Note also that the
primitive element theorem is still true as we saw in Remark 6.2.1, so L = K(γ) which means that
we can still define G = Gal(L/K). Finally, let δ ∈ L be any element, and let Gδ = {δ_1, …, δ_n}
be its orbit under G. Then, the coefficients of
\[ f = \prod_{i=1}^{n} (X - \delta_i) \]
are fixed by all of G, which means that f ∈ K[X]. Since f has distinct roots by construction, δ is
separable as wanted.


Exercise 6.3.11∗ . Given two subfields A and B of a field L, define their compositum or composite
field AB as the smallest subfield of L containing both A and B (in other words, the field generated
by A and B). Let L/K be a finite Galois extension and A, B be intermediate fields. Prove that
Gal(L/AB) = Gal(L/A) ∩ Gal(L/B).

Solution

Note that any embedding which fixes AB fixes both A and B. Hence, Gal(L/AB) ⊆
Gal(L/A) ∩ Gal(L/B). Conversely, if A = K(α) and B = K(β), then AB = K(α, β) and it is
clear that the embeddings which fix both α and β fix AB.


Exercise 6.3.12∗. Given two subgroups H_1, H_2 of a group H, define the subgroup they generate,
⟨H_1, H_2⟩, as the smallest subgroup containing both H_1 and H_2. Let L/K be a finite Galois extension
and A, B be intermediate fields. Prove that Gal(L/A ∩ B) = ⟨Gal(L/A), Gal(L/B)⟩.

Solution

Note that any embedding which fixes A or B fixes A ∩ B. Hence, ⟨Gal(L/A), Gal(L/B)⟩ ⊆
Gal(L/A ∩ B) (if a group contains two subgroups it also contains the subgroup generated by
them, by definition). Conversely, if ⟨Gal(L/A), Gal(L/B)⟩ fixes all of M, then M is fixed both
by Gal(L/A) and by Gal(L/B), which implies that M ⊆ A ∩ B. This shows
that Gal(L/A ∩ B) ⊆ ⟨Gal(L/A), Gal(L/B)⟩.


Exercise 6.3.13∗ . Prove Proposition 6.3.2.

Solution

If H_1 ⊆ H_2 then L^{H_1} is fixed by fewer embeddings than L^{H_2} so has more elements (any element
fixed by the embeddings of H_2 is also fixed by the embeddings of H_1). Conversely, if M_1 ⊆ M_2,
there are more embeddings which fix M_1 than M_2 since any embedding which fixes M_2 also fixes
M_1.


Exercise 6.3.14∗. Let L/K be a finite Galois extension and let M be an intermediate field. Prove
that, for any σ ∈ Gal(L/K), Gal(L/σM) = σ Gal(L/M) σ^{−1}. Deduce that the intermediate fields
which are also Galois (over K) are M = L^H where H is a normal subgroup of G = Gal(L/K),
meaning that σHσ^{−1} = H for any σ ∈ G. In particular, if L/K is abelian, meaning that its Galois
group is, any intermediate field is Galois over K.

Solution

If ϕ fixes M, then σϕ sends M to σM and thus σϕσ^{−1} fixes σM as wanted (this
is of course reversible). M is Galois over K iff it is fixed under conjugation, i.e. σM = M
for any σ ∈ Gal(L/K). By the fundamental theorem of Galois theory, this is equivalent to
Gal(L/M) = Gal(L/σM) = σ Gal(L/M) σ^{−1}. This proves the second part. For the third, simply
note that σHσ^{−1} = σσ^{−1}H = H when G is abelian.


Exercise 6.3.15∗ . Fill in the details of this proof of the quadratic reciprocity law.

Solution
Here is a summary of the proof. First, we prove that √q* ∈ Q(ω_q) where ω_q is a primitive qth root
of unity, say √q* = f(ω_q). Then, we assume that f ∈ Z_(p)[X], where Z_(p) denotes the rational
numbers with non-negative p-adic valuation. This follows for instance from Exercise 3.5.26†.
After that, we apply the Frobenius morphism on both sides to get
\[ \sigma_p(\sqrt{q^*}) = f(\omega_q^p) \equiv (\sqrt{q^*})^p, \]
i.e.
\[ \left(\frac{p}{q}\right) \sqrt{q^*} \equiv (\sqrt{q^*})^p \]
since the Galois group of Q(ω_q)/Q(√q*) is {σ_k | (k/q) = 1} (where (k/q) is the Legendre symbol),
i.e. these embeddings fix √q* and the others negate it. To see that this is necessarily the Galois
group of Q(ω_q)/Q(√q*), notice that the only subgroup of cardinality (q−1)/2 of Z/(q − 1)Z is
2Z/(q − 1)Z, which corresponds to the quadratic residues once we raise a primitive root to these
powers (since primitive roots are what give us an isomorphism Z/(q − 1)Z ≃ (Z/qZ)^×).

Now that we have this equality, we can rewrite it as
\[ \left(\frac{p}{q}\right) \equiv \left( (-1)^{\frac{q-1}{2}} q \right)^{\frac{p-1}{2}} \pmod{p}, \]
i.e.
\[ \left(\frac{p}{q}\right) \left(\frac{q}{p}\right) \equiv (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \pmod{p} \]
as wanted (both sides are ±1 and p is odd, so this congruence is an equality).
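
Both the intermediate congruence (p/q) ≡ (q*)^{(p−1)/2} (mod p) and the final law can be tested on small primes using Euler’s criterion. This is a numerical sketch only; the function names and the bound 100 are choices made for this illustration.

```python
def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, computed with Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def q_star(q):
    """q* = (-1)^((q-1)/2) * q."""
    return (-1) ** ((q - 1) // 2) * q

odd_primes = [p for p in range(3, 100) if is_prime(p)]
pairs = [(p, q) for p in odd_primes for q in odd_primes if p != q]

# (p/q) equals (q*/p), which is the congruence obtained from the Frobenius.
frobenius_step = all(legendre(p, q) == legendre(q_star(q), p) for p, q in pairs)

# The quadratic reciprocity law itself.
reciprocity = all(
    legendre(p, q) * legendre(q, p) == (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
    for p, q in pairs
)
```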


   
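Exercise 6.3.16. Prove that, for any positive integer a and primes p, q ∤ a, we have
\left(\frac{a}{p}\right) = \left(\frac{a}{q}\right) whenever p ≡ ±q (mod 4a). Deduce the quadratic
reciprocity law: for any odd primes p, q,
\[ \left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}. \]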

Solution
 
We saw that \left(\frac{a}{p}\right) depends only on σ_p(√a). Since σ_{−1} is the complex conjugation,
it fixes √a which is real, and thus σ_p(√a) = σ_q(√a) whenever p ≡ ±q (mod 4a).

Now we need to prove the quadratic reciprocity law from this. Let p and q be two odd primes.
Write p ± q = 4a, where the ± sign is chosen so that the expression is divisible by 4, i.e. is
(−1)^{(q−1)/2 + (p−1)/2 + 1}. Then, by the above considerations,
\left(\frac{a}{p}\right) = \left(\frac{a}{q}\right). On the other hand, by considering the equality
p ± q = 4a modulo p or q, we get
\[ \left(\frac{a}{p}\right) = \left(\frac{\pm q}{p}\right) \]
and
\[ \left(\frac{a}{q}\right) = \left(\frac{p}{q}\right). \]
Thus, we conclude that
\[ \left(\frac{p}{q}\right) = \left(\frac{\pm q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \left(\frac{q}{p}\right) \]
as wanted, since
\[ \left( (-1)^{\frac{q-1}{2} + \frac{p-1}{2} + 1} \right)^{\frac{p-1}{2}} = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2} + \frac{p-1}{2} + \frac{p-1}{2}} = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}. \]

Exercise 6.3.17∗ . Convince yourself of this solution and fill in the details.

Solution

Here is the proof written in the correct order. We pick a primitive element r of Q(ω)^H, say
r = \prod_{h∈H} (u − ω^h) for some u ∈ Z by Remark 6.3.2, and consider its minimal polynomial f. (We
do not actually need to know the explicit form of r, but this makes the proof slightly shorter to
write up.) Then, let p be any prime not dividing n and consider an element ζ ∈ F̄_p of order n.
If p (mod n) ∈ H, then
\[ \rho_k = \prod_{h \in H} (u - \zeta^{kh}) \]
is in F_p since
\[ \rho_k^p = \prod_{h \in H} (u - \zeta^{khp}) = \rho_k \]
as h ↦ hp is a permutation of H since p (mod n) ∈ H. By the fundamental theorem of symmetric
polynomials,
\[ \prod_{h \in H} (X - f(\rho_h)) \equiv \prod_{h \in H} (X - f(\sigma_h(r))) = X^{\varphi(n)} \pmod{p} \]
so f has all its roots ρ_h in F_p as asserted.

Now, suppose for the sake of contradiction that infinitely many p ≡ m (mod n) are such that f
has a root in F_p. Since the roots of f are still the ρ_h, this means that, for some h ∈ H, ρ_h ∈ F_p
for infinitely many p ≡ m (mod n). Since ρ_h ∈ F_p is equivalent to ρ_h^p = ρ_h and ρ_h^p = ρ_{hp} = ρ_{hm},
we get
ρ_{hm} − ρ_h = 0
for infinitely many p. As a consequence, the product
\[ \prod_{g \in (\mathbb{Z}/n\mathbb{Z})^\times / H} (\rho_{ghm} - \rho_{gh}) \equiv \prod_{g \in (\mathbb{Z}/n\mathbb{Z})^\times / H} (\sigma_{ghm}(r) - \sigma_{gh}(r)) \pmod{p} \]
is divisible by infinitely many primes p by the fundamental theorem of symmetric polynomials.
This implies that σ_{ghm}(r) = σ_{gh}(r) for some g, i.e. ghm and gh are in the same coset, which is
false since m ∉ H. (We can also assume that g = 1 by replacing ω by ω^g.)


Exercise 6.3.18∗ . Prove that the identity of a group is unique.

Solution

If e and e′ are two identities then e = ee′ = e′ so they are equal.




Exercise 6.3.19∗ . Prove the following refinement of Theorem 2.5.1: if G is a finite group and H a
subgroup of G, |H| divides |G|. Why does it imply Theorem 2.5.1?

Solution

Partition G into left-cosets aH, a ∈ G. Each coset has cardinality |H|, and two distinct cosets
are disjoint so this is indeed a partition: if ag = bh with g, h ∈ H, then a = bhg^{−1} so aH = bH.
Thus, the cardinality of a coset divides the cardinality of the union, i.e. |H| divides |G|. When
H is the subgroup generated by an element g, this means that the order of g divides the order
of G.
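
The coset partition can be made concrete on a small non-abelian example. In the sketch below, the group S_3 and the two-element subgroup H are choices made for this illustration; permutations are stored as tuples.

```python
from itertools import permutations

def compose(s, t):
    """Composition s ∘ t of permutations stored as tuples (t is applied first)."""
    return tuple(s[t[i]] for i in range(len(t)))

G = list(permutations(range(3)))     # the symmetric group S_3
H = [(0, 1, 2), (1, 0, 2)]           # subgroup generated by one transposition

# Left cosets aH; equal cosets collapse because frozensets are compared by content.
cosets = {frozenset(compose(a, h) for h in H) for a in G}

same_size = all(len(c) == len(H) for c in cosets)
covers_G = set().union(*cosets) == set(G)
disjoint = sum(len(c) for c in cosets) == len(G)
```

Since the cosets all have |H| elements, cover G and are pairwise disjoint, |G| = (number of cosets) · |H|, here 6 = 3 · 2.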


6.4 Splitting of Polynomials


Exercise 6.4.1. Does there exist an a ≢ 1 (mod n) such that any non-constant f ∈ Z[X] has infinitely
many prime factors congruent to a modulo n?

Solution

No, a counterexample is f = Φ_n: every prime factor of a value Φ_n(x) either divides n or is
congruent to 1 modulo n.
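
The property of Φ_n behind this counterexample, namely that a prime factor of Φ_n(x) either divides n or is congruent to 1 modulo n, can be observed on small values. The sketch computes Φ_n(x) through the Möbius product formula; all function names and the ranges tested are choices made for this illustration.

```python
def mobius(n):
    """Möbius function, by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def cyclotomic_value(n, x):
    """Phi_n(x) for an integer x >= 2, via Phi_n(x) = prod over d | n of (x^d - 1)^mu(n/d)."""
    num = den = 1
    for d in range(1, n + 1):
        if n % d == 0:
            mu = mobius(n // d)
            if mu == 1:
                num *= x ** d - 1
            elif mu == -1:
                den *= x ** d - 1
    return num // den

def prime_factors(m):
    factors, p = set(), 2
    while p * p <= m:
        while m % p == 0:
            factors.add(p)
            m //= p
        p += 1
    if m > 1:
        factors.add(m)
    return factors

def exceptional_primes(n, x):
    """Prime factors of Phi_n(x) that neither divide n nor are 1 modulo n."""
    return {p for p in prime_factors(cyclotomic_value(n, x))
            if p % n != 1 and n % p != 0}

no_exceptions = all(
    not exceptional_primes(n, x) for n in range(2, 13) for x in range(2, 9)
)
```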


6.5 Exercises
Field and Galois Theory
Exercise 6.5.1† . Let L/K be a finite separable extension of prime degree p. If f ∈ K[X] has prime
degree q and is irreducible over K but reducible over L, then p = q.

Solution

Let α be a root of f . On the one hand, [L(α) : K] = [L(α) : K(α)][K(α) : K] is divisible by q.


On the other hand,
[L(α) : K] = [L(α) : L][L : K].
The first factor is smaller than q so not divisible by q, and the second is p. Thus, q | p, i.e. p = q.


Exercise 6.5.2†. Let L/K be a finite Galois extension and let M/K be a finite extension. Prove that
Gal(LM/M) ≃ Gal(L/L ∩ M). In particular, [LM : M] = [L : L ∩ M]. Conclude that, if L/K and
M/K are Galois, we have
[LM : K][L ∩ M : K] = [L : K][M : K].

Solution

For the first part, let L = K(α). Consider the restriction map ϕ : Gal(LM/M) → Gal(L/K)
(an element of Gal(LM/M) is a function σ : LM → LM fixing M, which can be restricted to a
function σ : L → L fixing K). This is injective, since σ ∈ Gal(LM/M) is determined by its value
at α, and the same goes for σ ∈ Gal(L/K). We wish to show that the image of this restriction is
Gal(L/L ∩ M) (thus corresponding to an isomorphism between Gal(LM/M) and Gal(L/L ∩ M)
as wanted).

Notice for this that
\[ L^{\varphi \mathrm{Gal}(LM/M)} = \{x \in L \mid \sigma(x) = x \ \forall \sigma \in \varphi \mathrm{Gal}(LM/M)\}
 = \{x \in L \mid \sigma(x) = x \ \forall \sigma \in \mathrm{Gal}(LM/M)\} = M \cap L \]
since Gal(LM/M) fixes exactly M but here we restrict it to L.

For the second part, we have [LM : K] = [LM : L][L : K] by the tower law and [LM : L] =
[M : L ∩ M] by the first part (applied with the roles of L and M exchanged, which is allowed
since M/K is also Galois). Now, [M : L ∩ M] = [M : K]/[L ∩ M : K] by the tower law again. Thus, we
conclude that
\[ [LM : K] = [LM : L][L : K] = [M : L \cap M][L : K] = \frac{[L : K][M : K]}{[L \cap M : K]} \]
as wanted.


Exercise 6.5.3† (Artin). Let L be a field, and G ⊆ Aut(L) a finite subgroup of automorphisms of L.
Prove that L/LG is Galois with Galois group G.

Solution

Set n = |G| and K = L^G. First, we prove that L/K is Galois. Let α ∈ L and let Gα =
{α_1, …, α_m} be the orbit of α under G. Since any σ ∈ G permutes α_1, …, α_m (as automorphisms
are injective), the coefficients of
\[ \prod_{i=1}^{m} (X - \alpha_i) \]
are fixed by all of G and thus in K = L^G. In particular, α has degree at most n and is separable.
Since α_1, …, α_m ∈ L, we get that the conjugates of α all lie in L, i.e. L/K is Galois.

It remains to prove that Gal(L/K) = G. Note that G ⊆ Gal(L/K) so we only need to prove that
[L : K] ≤ n. In fact, we are more generally done if we show that L/K is finite, by the fundamental
theorem of Galois theory. To prove this, note that, for any α_1, …, α_k ∈ L, K(α_1, …, α_k) is equal
to some K(β) by the primitive element theorem, and we have shown that β has degree at most
n over K. In other words,
[K(α_1, …, α_k) : K] ≤ n
for all α_1, …, α_k ∈ L. To conclude, if we choose α ∈ L to have maximal degree over K, then
[K(α, β) : K] ≤ [K(α) : K] for any β ∈ L by maximality, so that β ∈ K(α) and thus L = K(α).


Exercise 6.5.4†. Prove that, for any n, there is a finite Galois extension K/Q such that Gal(K/Q) ≃
Z/nZ.

Solution

Pick a prime p ≡ 1 (mod n), let ω be a primitive pth root of unity, and set K = Q(ω). Note
that, since K has abelian Galois group, every subfield of K is also Galois over Q. We wish
to find a subfield L such that Gal(L/Q) ≃ Z/nZ. By Exercise 6.3.4, we have Gal(L/Q) ≃
Gal(K/Q)/Gal(K/L), so we want to find a subgroup H of Z/(p − 1)Z such that
\[ \bigl(\mathbb{Z}/(p-1)\mathbb{Z}\bigr) / H \simeq \mathbb{Z}/n\mathbb{Z}. \]
Now, note that the subgroups of Z/(p − 1)Z have the form d · Z/(p − 1)Z ≃ Z/((p−1)/d)Z for d
dividing p − 1, and that (e.g. by Exercise A.3.16†)
\[ \bigl(\mathbb{Z}/(p-1)\mathbb{Z}\bigr) / \bigl(d \cdot \mathbb{Z}/(p-1)\mathbb{Z}\bigr) \simeq \mathbb{Z}/d\mathbb{Z}. \]
Thus, if H is the subgroup of Gal(K/Q) corresponding to n · Z/(p − 1)Z, we get Gal(K^H/Q) ≃
Z/nZ as wanted.
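
The group-theoretic part of this construction can be carried out explicitly in (Z/pZ)^× ≃ Gal(Q(ω)/Q). In the sketch below, the subgroup corresponding to n · Z/(p − 1)Z is realised as the set of nth powers modulo p; taking the smallest prime p ≡ 1 (mod n) and the function names are choices made for this illustration.

```python
def is_prime(m):
    return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))

def index_n_subgroup(n):
    """Return (p, H): the smallest prime p ≡ 1 (mod n) and the subgroup H of n-th powers mod p."""
    p = n + 1
    while not is_prime(p):
        p += n
    H = {pow(x, n, p) for x in range(1, p)}
    return p, H

def quotient_has_element_of_order(n, p, H):
    """Check that some class gH has order exactly n in (Z/pZ)^x / H."""
    return any(
        all(pow(g, k, p) not in H for k in range(1, n)) and pow(g, n, p) in H
        for g in range(1, p)
    )

checks = []
for n in range(1, 11):
    p, H = index_n_subgroup(n)
    # H has index n, and the quotient contains an element of order n, so it is cyclic of order n.
    checks.append(len(H) * n == p - 1 and quotient_has_element_of_order(n, p, H))
```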


Remark 6.5.1
If we combine the structure of units from Exercise 3.5.18† with the structure of finite abelian
groups from Exercise A.3.20† , we get that every finite abelian group is a Galois group Gal(K/Q)
for some number field K.

Exercise 6.5.5† (Cayley’s Theorem). Let G be a finite group. Prove that it is a subgroup of Sn for
some n. Conclude that there is a finite Galois extension L/K of number fields such that G ≃ Gal(L/K).
(This is part of the inverse Galois problem. So far, it has only been conjectured that we can choose
K = Q.)

Solution

We claim that G ⊆ S_{|G|}. Indeed, left-multiplication by g defines a permutation s_g of G, and it
is clear that g ↦ s_g is an isomorphism onto its image: s_g ∘ s_{g^{−1}} = id and

s_g ∘ s_h = (x ↦ hx ↦ ghx) = s_{gh}.

For the second part, we can consider an L such that Gal(L/Q) ≃ S_{|G|} and then take K = L^G
since G is a subgroup of S_{|G|} by Cayley’s theorem. To prove that such an L exists without
invoking Exercise 6.5.23†, one can consider a prime number p ≥ n and a subgroup S ≃ S_n of S_p.
By Exercise 6.5.22†, there exists a number field M Galois over Q such that Gal(M/Q) ≃ S_p.
Indeed, it is clear that there exists a polynomial of degree p with real coefficients and exactly
two non-real roots. We can then refine it to an irreducible polynomial with rational coefficients
by replacing its coefficients by close rational numbers of p-adic valuation 1, except for its leading
coefficient which we replace by a close rational number of p-adic valuation 0. This gives us a
polynomial irreducible over Q by Eisenstein’s criterion. Finally, we pick L = M^S.
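
The embedding of Cayley’s theorem can be written down explicitly for a small group. In this sketch the test group S_3 (stored as tuples) is a choice made for the illustration; the map g ↦ s_g sends it into the permutations of its own six elements.

```python
from itertools import permutations

def compose(s, t):
    """Composition s ∘ t of permutations stored as tuples (t is applied first)."""
    return tuple(s[t[i]] for i in range(len(t)))

def cayley_embedding(elements, op):
    """Map each g to s_g : x -> g*x, encoded as a permutation of the indices of `elements`."""
    index = {x: i for i, x in enumerate(elements)}
    return {g: tuple(index[op(g, x)] for x in elements) for g in elements}

S3 = list(permutations(range(3)))
embedding = cayley_embedding(S3, compose)

injective = len(set(embedding.values())) == len(S3)
images_are_permutations = all(sorted(s) == list(range(6)) for s in embedding.values())
is_morphism = all(
    embedding[compose(g, h)] == compose(embedding[g], embedding[h])
    for g in S3 for h in S3
)
```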


Exercise 6.5.6† (Dedekind’s Lemma). Let L/K be a finite separable extension in characteristic 0.
Prove that the K-embeddings of L are linearly independent.

Solution

Suppose for the sake of contradiction that a non-zero linear combination of the embeddings
vanishes:
a_1σ_1 + … + a_kσ_k = 0
and pick k to be minimal, so that every a_i is non-zero. Pick an element a ∈ L such that
σ_1(a) ≠ σ_k(a). Then,

a_1σ_1(ax) + … + a_kσ_k(ax) = 0

for all x ∈ L by assumption, but this is also

a_1σ_1(a)σ_1(x) + … + a_kσ_k(a)σ_k(x)

so, dividing by σ_k(a) and subtracting from the original relation, we conclude that
\[ a_1 \Bigl(1 - \frac{\sigma_1(a)}{\sigma_k(a)}\Bigr) \sigma_1 + \dots + a_{k-1} \Bigl(1 - \frac{\sigma_{k-1}(a)}{\sigma_k(a)}\Bigr) \sigma_{k-1} = 0, \]
which is a non-zero relation since σ_1(a) ≠ σ_k(a), contradicting the minimality of k.




Exercise 6.5.7† (Hilbert’s Theorem 90). Suppose L/K is a cyclic extension in characteristic 0, meaning
its Galois group Gal(L/K) ≃ (Z/nZ, +) for some n (like Gal(F_{p^n}/F_p) or Gal(Q(exp(2iπ/p))/Q)).
Prove that α ∈ L has norm 1 if and only if it can be written as β/σ(β) for some β ∈ L, where σ is a
generator of the Galois group (element of order n).

Solution

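It is clear that β/σ(β) has norm 1 for any β. Now suppose α has norm 1 and let σ ∈ Gal(L/K)
be a generator. By Exercise 6.5.6†, pick a γ such that
\[ \beta = \sum_{k=0}^{n-1} \sigma^k(\gamma) \prod_{i=0}^{k-1} \sigma^i(\alpha) \]
is non-zero. Then,
\begin{align*}
\alpha \sigma(\beta) &= \sum_{k=0}^{n-1} \sigma^{k+1}(\gamma)\, \alpha \prod_{i=0}^{k-1} \sigma^{i+1}(\alpha) \\
&= \sigma^n(\gamma)\, \alpha \sigma(\alpha) \cdots \sigma^{n-1}(\alpha) + \sum_{k=1}^{n-1} \sigma^k(\gamma) \prod_{i=0}^{k-1} \sigma^i(\alpha) \\
&= \gamma N_{L/K}(\alpha) + \sum_{k=1}^{n-1} \sigma^k(\gamma) \prod_{i=0}^{k-1} \sigma^i(\alpha) = \beta
\end{align*}
since N_{L/K}(α) = 1 by assumption. Thus, α = β/σ(β) as wanted.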
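
For n = 2 the sum defining β reduces to β = γ + σ(γ)α. The following sketch tries this in Q(i)/Q, where σ is complex conjugation; the small `GaussianRational` class, the sample element α = (3 + 4i)/5 and the choices of γ are all assumptions made for this illustration.

```python
from fractions import Fraction

class GaussianRational:
    """Element x + y*i of Q(i) with exact rational coordinates."""

    def __init__(self, x, y=0):
        self.x, self.y = Fraction(x), Fraction(y)

    def __add__(self, o):
        return GaussianRational(self.x + o.x, self.y + o.y)

    def __mul__(self, o):
        return GaussianRational(self.x * o.x - self.y * o.y,
                                self.x * o.y + self.y * o.x)

    def conj(self):
        # The generator sigma of Gal(Q(i)/Q).
        return GaussianRational(self.x, -self.y)

    def norm(self):
        return self.x ** 2 + self.y ** 2

    def __truediv__(self, o):
        num = self * o.conj()
        return GaussianRational(num.x / o.norm(), num.y / o.norm())

    def __eq__(self, o):
        return (self.x, self.y) == (o.x, o.y)

def hilbert90_witness(alpha, gamma):
    """beta = gamma + sigma(gamma) * alpha, the n = 2 case of the sum in the solution."""
    return gamma + gamma.conj() * alpha

alpha = GaussianRational(Fraction(3, 5), Fraction(4, 5))     # norm 1
beta = hilbert90_witness(alpha, GaussianRational(1))

# For alpha = -1 the choice gamma = 1 gives beta = 0, so another gamma is needed.
minus_one = GaussianRational(-1)
beta2 = hilbert90_witness(minus_one, GaussianRational(0, 1))
```

With γ = 1 one finds β = (8 + 4i)/5 and β/σ(β) = (3 + 4i)/5. The second example shows why γ must be chosen so that β is non-zero.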




Remark 6.5.2


This theorem gives us very interesting corollaries such as Exercise 7.5.6: x + y√d has norm 1 iff
\[ x + y\sqrt{d} = \frac{a + b\sqrt{d}}{a - b\sqrt{d}} = \frac{a^2 + db^2}{a^2 - db^2} + \frac{2ab}{a^2 - db^2} \sqrt{d}. \]

Exercise 6.5.9† (Lüroth’s Theorem). Let K be a field and L a field between K and K(T ). Prove
that there exists a rational function f ∈ K(T ) such that L = K(f ).

Solution

The proof given here is taken from Bergman [3]. Without loss of generality, suppose that L ≠ K.
Given an element h ∈ K(T)[X] express it as h = f(T, X)/g(T) with coprime f, g ∈ K[T, X] (g
is constant in X) and define its height ht(h) as max(deg_T(f), deg_T(g)). Now, pick any non-constant
element of minimal height u = f(T)/g(T) ∈ L. We will prove that

f(X) − ug(X)

is the minimal polynomial of T both over L and K(u). This implies that [K(T) : K(u)] = [K(T) :
L] and K(u) ⊆ L, i.e. L = K(u) as desired.

Without loss of generality, we may assume that $\deg f \neq \deg g$, by replacing $u$ by $u + t$ where $t \in K$ is such that $f + tg$ has degree less than $\deg f$ if $\deg f = \deg g$. Similarly, we can assume that $\deg f > \deg g$ by replacing $u$ by $1/u$ if necessary. Finally, by multiplying $u$ by a constant, we can suppose that $f$ and $g$ are monic. That way, the polynomial $f(X) - ug(X)$ is monic in $X$ of degree $\deg f$. Note that, when $c = a(T, X)/b(T)$ is monic in $X$, $b(T)$ divides the leading coefficient in $X$ of $a$ so $\deg_T a \geq \deg_T b$ which implies that $\mathrm{ht}(c) = \deg_T(a)$. In particular, $\mathrm{ht}(cd) = \mathrm{ht}(c) + \mathrm{ht}(d)$ for polynomials $c, d$ monic in $X$.

Suppose now that $f(X) - ug(X)$ is divisible by another monic polynomial $\pi = \sum_i u_iX^i \in L[X]$. We have
$$\mathrm{ht}(\pi(X)) \geq \mathrm{ht}(u_k) \geq \mathrm{ht}(u) = \mathrm{ht}(f(X) - ug(X))$$
where $k$ is chosen so that $u_k$ is non-constant. Hence, we conclude that, if $f(X) - ug(X) = \pi(X)\tau(X)$, we have $\mathrm{ht}(\tau) = 0$, i.e. $\tau = h(X) \in K[X]$. This $h$ divides both $f$ and $g$: indeed, 1 and $u$ are linearly independent since $u \notin K$ so $f/h - ug/h \in K(T)[X]$ implies that $f/h$ and $g/h$ are in $K[X]$. This is of course impossible if $\tau$ is non-constant since $f$ and $g$ are coprime. Hence, $f(X) - ug(X)$ is the minimal polynomial of $T$ over $L$.

The second part is a lot easier. Suppose that $\pi = \sum_i f_i(X)u^i \in K[u][X]$ with $f_i \in K[X]$ vanishes at $X = T$. We will prove that $\deg_X \pi \geq \mathrm{ht}(u) = \deg f$, thus showing that $f(X) - ug(X)$ is the minimal polynomial of $T$ over $K(u)$. Let $m = \deg_u \pi$. We have
$$0 = g(T)^{m-1}\pi(T) = f_m(T)f(T)^m/g(T) + \sum_{i=0}^{m-1} f_i(T)f(T)^ig(T)^{m-1-i}.$$

Note that every term in the second sum is a polynomial, hence $f_m(T)f(T)^m/g(T)$ is too: $g(T)$ divides $f_m(T)$ since $f$ and $g$ are coprime so

$$\deg_X \pi \geq \deg f_m \geq \deg f = \mathrm{ht}\, u$$

as claimed.


nth Roots
Exercise 6.5.10†. Let $K$ be a field, $p$ a prime number, and $\alpha$ an element of $K$. Prove that $X^p - \alpha$ is irreducible over $K$ if and only if it has no root.

Solution

Suppose that $X^p - \alpha$ is reducible for some $\alpha \neq 0$. We will show that $\alpha$ is a $p$th power in $K$. Let $f$ be a non-trivial factor of $X^p - \alpha$, say of degree $k \in [p - 1]$. Its constant term has the form $\omega\alpha^{k/p}$ by Vieta's formulas (we are working in a field extension where $X^p - \alpha$ splits here), where $\omega$ is a $p$th root of unity. Thus, $\alpha^k = \beta^p$ is a $p$th power in $K$. If $m$ is the inverse of $k$ modulo $p$, say $mk = np + 1$, we get
$$\alpha = \frac{\alpha^{mk}}{\alpha^{np}} = \left(\frac{\beta^m}{\alpha^n}\right)^p$$
as wanted.
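The last step of this argument is completely explicit, so it can be turned into a small computation. The sketch below is an illustration under stated assumptions: the function name `pth_root_from_kth_power` is hypothetical, it works over the rationals only, and it relies on `pow(k, -1, p)` for the modular inverse (available from Python 3.8).

```python
from fractions import Fraction

def pth_root_from_kth_power(alpha, beta, k, p):
    """Given alpha**k == beta**p with p prime and 1 <= k < p, return c with c**p == alpha."""
    alpha, beta = Fraction(alpha), Fraction(beta)
    if alpha == 0 or alpha ** k != beta ** p:
        raise ValueError("need alpha != 0 and alpha**k == beta**p")
    m = pow(k, -1, p)        # m*k is congruent to 1 modulo p
    n = (m * k - 1) // p     # so that m*k = n*p + 1
    return beta ** m / alpha ** n

# alpha = 32 and k = 3: alpha^3 = 2^15 = 8^5, so beta = 8 and p = 5.
root = pth_root_from_kth_power(32, 8, 3, 5)
print(root, root ** 5)  # 2 32
```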


Exercise 6.5.11†. Let $f \in K[X]$ be a monic irreducible polynomial and $p$ a rational prime. Suppose that $(-1)^{\deg f}f(0)$ is not a $p$th power in $K$. Prove that $f(X^p)$ is also irreducible.

Solution

Suppose that $f(X^p)$ is reducible and let $\alpha$ be a root of $f$. By Lemma 6.1.1, $X^p - \alpha$ is reducible over $K(\omega)$, where $\omega$ is a primitive $p$th root of unity. By Exercise 6.5.10†, $\alpha$ is a $p$th power in $K(\omega)$, say $g(\alpha)^p$. Let $\alpha_1, \ldots, \alpha_n$ be the conjugates of $\alpha$. Then, by Vieta's formulas,
$$(-1)^nf(0) = \prod_{k=1}^n \alpha_k = \prod_{k=1}^n g(\alpha_k)^p = \left(\prod_{k=1}^n g(\alpha_k)\right)^p$$
is a $p$th power.


Exercise 6.5.12† (Vahlen, Capelli, Redei). Let $K$ be a field and $\alpha \in K$. When is $X^n - \alpha$ irreducible over $K$?

Solution

Suppose that $X^m - \alpha$ and $X^n - \alpha$ are irreducible over $K$, where $m$ and $n$ are coprime. Then, so is $X^{mn} - \alpha$. Indeed, suppose that $\beta$ is a root of $X^{mn} - \alpha$. Then, $[K(\beta) : K]$ is divisible by $[K(\beta^m) : K] = n$ and $[K(\beta^n) : K] = m$, which implies that it's divisible by $mn$ as well since they are coprime. Thus, $[K(\beta) : K] = mn$ as wanted since it's clearly at most $mn$.

Hence, it suffices to study the case where $n = p^k$ is a prime power. First, suppose that $p$ is odd. Then, we prove by induction that $X^{p^k} - \alpha$ is irreducible if and only if $\alpha$ is not a $p$th power, the base case being Exercise 6.5.10†. For the induction step, by Exercise 6.5.11†, if $f = X^{p^{k+1}} - \alpha$ is reducible, then $(-1)^{p^{k+1}}f(0) = \alpha$ is a $p$th power. For $p = 2$ we get $-\alpha$ which could be a square while $\alpha$ isn't, except if $K$ has characteristic 2 since we then have $\alpha = -\alpha$. Thus, we are already done if $\mathrm{char}\, K = 2$ so we may now suppose that $\mathrm{char}\, K \neq 2$.
k
Finally, it remains to study $X^{2^k} - \alpha$. We claim that, for $k \geq 2$, this is reducible iff $\alpha$ is a square or $-4$ times a fourth power. One implication is Sophie Germain's identity: if $\alpha = -4\beta^4$, then
$$X^4 + 4\beta^4 = (X^2 + 2\beta X + 2\beta^2)(X^2 - 2\beta X + 2\beta^2).$$
It remains to prove that $X^{2^k} - \alpha$ is irreducible if $\alpha$ is not a square or $-4$ times a fourth power. Suppose that $X^{2^{k+1}} - \alpha$ is reducible. Then, $\alpha = -\beta^2$ for some $\beta$ by Exercise 6.5.11†. Since $\alpha$ is not a square, $-1$ isn't as well. Let $\gamma$ be a root of $X^{2^{k+1}} - \alpha$. We have $\gamma^{2^k} = i\beta$ for some $i$ with $i^2 = -1$ (so $i \notin K$). We will prove that $X^{2^k} - i\beta$ is irreducible over $K(i)$, thus showing that
$$[K(\gamma) : K] = [K(\gamma) : K(i)][K(i) : K] = 2^k \cdot 2 = 2^{k+1}$$
as wanted. If it were reducible, $i\beta$ would have the form $-(u - vi)^2 = (v + ui)^2$ for some $u, v \in K$. This gives us $u^2 - v^2 = 0$ and $\beta = 2uv$ so
$$\alpha = -\beta^2 = -4u^4$$
as wanted.

We can summarise the previous discussion as follows.


• $X^n - \alpha$ is irreducible iff $\alpha$ is not a $p$th power for any prime $p \mid n$, and not $-4$ times a fourth power in the case that $4 \mid n$ and $\mathrm{char}\, K \neq 2$.
• As a corollary, if $\alpha$ is not $-4$ times a fourth power or $4 \nmid n$ or $\mathrm{char}\, K = 2$, the minimal polynomial of $\sqrt[n]{\alpha}$ is $X^{n/d} - \sqrt[n]{\alpha^d}$ where $d \mid n$ is the greatest integer such that $\alpha^d$ is an $n$th power.
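The first bullet can be tested mechanically in the special case $K = \mathbb{Q}$ with $\alpha$ a non-zero integer, where being a $p$th power (or $-4$ times a fourth power) in $\mathbb{Q}$ reduces to the same statement over the integers. The sketch below makes exactly that assumption; all function names are hypothetical, and `poly_mul` is only there to confirm Sophie Germain's identity on one numerical instance.

```python
def integer_root(a, k):
    """Return the integer r >= 0 with r**k == a, or None if there is none (a >= 0)."""
    lo, hi = 0, max(a, 1)
    while lo <= hi:
        mid = (lo + hi) // 2
        power = mid ** k
        if power == a:
            return mid
        if power < a:
            lo = mid + 1
        else:
            hi = mid - 1
    return None

def is_kth_power(a, k):
    """Is the integer a the k-th power of an integer?"""
    if a >= 0:
        return integer_root(a, k) is not None
    return k % 2 == 1 and integer_root(-a, k) is not None

def prime_divisors(n):
    """Distinct prime divisors of n >= 1, by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def binomial_is_irreducible(n, a):
    """Criterion from the summary for X^n - a over Q, with a a non-zero integer."""
    if any(is_kth_power(a, p) for p in prime_divisors(n)):
        return False
    if n % 4 == 0 and a < 0 and a % 4 == 0 and is_kth_power(-a // 4, 4):
        return False
    return True

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

print(binomial_is_irreducible(4, -4))        # False: X^4 + 4 factors
print(binomial_is_irreducible(4, 2))         # True
print(poly_mul([18, 6, 1], [18, -6, 1]))     # [324, 0, 0, 0, 1], i.e. X^4 + 4*3^4
```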



Exercise 6.5.13. Let $n \geq 1$ be an integer and $\zeta$ a primitive $n$th root of unity. What is the Galois group of $\mathbb{Q}(\sqrt[n]{2}, \zeta)$ over $\mathbb{Q}$?

Solution
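Knowing the embeddings of $\mathbb{Q}(\zeta, \sqrt[n]{2})$ is equivalent to knowing the conjugates of $\sqrt[n]{2}$ over $\mathbb{Q}(\zeta)$. By Exercise 6.5.12† and Problem 6.3.3, we know that the minimal polynomial of $\sqrt[n]{2}$ is $X^{n/2} - \sqrt{2}$ if $\sqrt{2} \in \mathbb{Q}(\zeta)$ and $2 \mid n$, and $X^n - 2$ otherwise. The embeddings are thus
$$\sigma_{(a,b)} : \begin{cases} \zeta^2 \mapsto \zeta^{2a} \\ \sqrt[n]{2} \mapsto \zeta^b\sqrt[n]{2} \end{cases}$$

for independent $a \in \mathbb{Z}/\frac{n}{2}\mathbb{Z}$ and $b \in \mathbb{Z}/n\mathbb{Z}$ in the first case, and
$$\sigma_{(a,b)} : \begin{cases} \zeta \mapsto \zeta^a \\ \sqrt[n]{2} \mapsto \zeta^b\sqrt[n]{2} \end{cases}$$

for independent $a \in (\mathbb{Z}/n\mathbb{Z})^\times$ and $b \in \mathbb{Z}/n\mathbb{Z}$ in the second case. (One may note that neither of these Galois groups is abelian for $n \geq 3$.) Since $\sqrt{2}$ is in $\mathbb{Q}(\zeta)$ iff $8 \mid n$ by Exercise 6.5.27, we are done.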
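One consequence that is easy to tabulate is the order of this Galois group, i.e. the degree $[\mathbb{Q}(\sqrt[n]{2}, \zeta) : \mathbb{Q}]$. The sketch below simply multiplies $\varphi(n)$ by the degree of the minimal polynomial of $\sqrt[n]{2}$ over $\mathbb{Q}(\zeta)$ as stated in the solution ($n/2$ when $8 \mid n$, and $n$ otherwise); the function names are hypothetical.

```python
from math import gcd

def euler_phi(n):
    """Euler's totient function, computed naively."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def degree_of_splitting_field(n):
    """[Q(2^(1/n), zeta_n) : Q], read off from the minimal polynomial given above."""
    radical_degree = n // 2 if n % 8 == 0 else n
    return euler_phi(n) * radical_degree

for n in (3, 4, 8):
    print(n, degree_of_splitting_field(n))  # 3 6, then 4 8, then 8 16
```

For $n = 3$ and $n = 4$ this recovers the familiar groups of order 6 and 8 attached to $X^3 - 2$ and $X^4 - 2$.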


Exercise 6.5.14†. Let $n \geq 1$ be an integer and $p_1, \ldots, p_m$ distinct rational primes. Prove that

$$[\mathbb{Q}(\sqrt[n]{p_1}, \ldots, \sqrt[n]{p_m}) : \mathbb{Q}] = n^m.$$

(This is a generalisation of Exercise 4.6.25† .)

Solution

Suppose that the degree is strictly smaller than $n^m$, i.e. a non-trivial linear combination of powers is zero:
$$\sum_{i=1}^k b_i\sqrt[n]{a_i} = 0$$

for non-zero rational numbers $b_1, \ldots, b_k$ and distinct $n$th-powers-free positive integers $a_1, \ldots, a_k$. By multiplying this equation by $\sqrt[n]{a_1^{n-1}}$, we may assume that $a_1 = 1$. We will prove that $b_1 = 0$, thus reaching a contradiction. Let $K$ be a Galois extension of $\mathbb{Q}$ containing all $\sqrt[n]{a_i}$ with Galois group $G$. Take the sum of the equations

$$\sum_{i=1}^k b_i\sigma(\sqrt[n]{a_i}) = 0,$$

over $\sigma \in G$. Exercise 6.5.12† shows that the sum of the conjugates of $\sqrt[n]{a_i}$ is zero for $i \neq 1$. Since the action of $\mathrm{Gal}(K/\mathbb{Q})$ on $\mathbb{Q}(\alpha)$ restricts to multiple copies of $\mathrm{Emb}(\mathbb{Q}(\alpha))$ for any $\alpha \in K$, the sum $\sum_{\sigma \in G} \sigma(\sqrt[n]{a_i})$ is zero for any $i \neq 1$. Thus, we are left with the sum
$$\sum_{\sigma \in G} b_1\sigma(1) = b_1|G|,$$

which means $b_1 = 0$ as wanted.




Exercise 6.5.15† (Kummer Theory). Let $L/K$ be a finite Galois extension in characteristic 0. Suppose that $\mathrm{Gal}(L/K) \simeq \mathbb{Z}/n\mathbb{Z}$. If $K$ contains a primitive $n$th root of unity, prove that $L = K(\alpha)$ for some $\alpha$ with $\alpha^n \in K$.

Solution

Let $\sigma$ be a generator of $\mathrm{Gal}(L/K)$ and let $\omega \in K$ be a primitive $n$th root of unity. If we find a non-zero $\alpha \in L$ such that $\sigma(\alpha) = \omega\alpha$, we are done, since, by iterating $\sigma$, we get $\sigma^k(\alpha) = \omega^k\alpha$, i.e. $L$ is generated by the $n$th root $\alpha$ of some element of $K$ (as can be seen from considering its norm for instance). For this, consider

$$f = \frac{X^n - 1}{X - \omega} = X^{n-1} + \omega X^{n-2} + \ldots + \omega^{n-1}.$$
We have $0 = \sigma^n - \mathrm{id} = (\sigma - \omega\,\mathrm{id}) \circ (f(\sigma))$, hence, $\alpha = f(\sigma)(\beta)$ works for any $\beta$. We only need to ensure that $\alpha \neq 0$, and this follows from the linear independence of the embeddings from Exercise 6.5.6†.


Exercise 6.5.16† (Artin-Schreier Theorem). Let L/K be a finite extension such that L is algebraically
closed. Prove that [L : K] ≤ 2.

Solution

First, we prove that $[L : K]$ is a power of 2. Suppose otherwise. Using Cauchy's theorem and the fundamental theorem of Galois theory, we can find an intermediate field $M$ such that $[L : M] = p$ where $p \mid [L : K]$ is an odd prime. Thus, suppose without loss of generality that $[L : K] = p$. Consider a primitive $p$th root of unity $\omega \in L$. Since $\omega$ has degree at most $p - 1$ over $K$, its degree is not divisible by $p$ which means that $\omega$ must be in $K$. Then, Kummer theory from Exercise 6.5.15† implies that $L = K(\alpha)$ for some $\beta = \alpha^p \in K$. Since $\alpha$ is not in $K$, $\beta$ is not a $p$th power in $K$. This implies that the polynomial $X^{p^2} - \beta$ is irreducible over $K$ by Exercise 6.5.12†. This is a contradiction since any element algebraic over $K$ has degree at most $p$ by assumption.

Thus, $[L : K] = 2^k$ for some $k$. In particular, any polynomial of odd degree has a root in $K$. Indeed, group the roots of a polynomial $f \in K[X]$ in disjoint orbits of the form
$$\sigma_{1,i}(\alpha_i), \ldots, \sigma_{k_i,i}(\alpha_i)$$
where $\alpha_i$ is a root of $f$ and $\sigma_{1,i}, \ldots, \sigma_{k_i,i}$ form a set of representatives of $\mathrm{Gal}(L/K)/\mathrm{Gal}(L/K(\alpha_i))$. In other words, we are simply asking that, for a fixed $i$, $\sigma_{j,i}(\alpha_i)$ goes through every conjugate of $\alpha_i$ exactly once. Then, if $\alpha_i \notin K$,

$$k_i = |\mathrm{Gal}(L/K)/\mathrm{Gal}(L/K(\alpha_i))| = 2^k/|\mathrm{Gal}(L/K(\alpha_i))|$$

is even since $|\mathrm{Gal}(L/K(\alpha_i))| < 2^k$. If this is the case for all $i$, then $f$ has an even number of roots, contradicting the assumption that its degree is odd. (This is a generalisation of the fact that non-real roots always come in pairs of complex conjugates.)

Accordingly, any polynomial of odd degree has a root in $K$. We will prove that any element $\alpha \in K$ is such that $\alpha$ is a square or $-\alpha$ is. Assuming this, notice that these are exactly the assumptions we used in our proof that $\mathbb{C} = \mathbb{R}(i)$ is algebraically closed, in Section B.3. Hence, $L = K(i)$ where $i^2 = -1$, which means in particular that $[L : K] \leq 2$ as wanted.

It remains to show this claim. For this, notice that $X^{2^{k+1}} - \alpha$ is reducible over $K$ since $L/K$ has degree $2^k$. By Exercise 6.5.12†, this implies that $\alpha$ is a square, or $-4$ times a fourth power, and thus minus a square.


Constructibility and Solvability


Exercise 6.5.18† . Prove that a real number is constructible if and only if it is algebraic and the
degree of its splitting field, meaning the field generated by its conjugates, is a power of 2. Deduce that,
using only a straightedge (a non-graded ruler) and a compass,
1. A regular n-gon is constructible if and only if ϕ(n) is a power of 2.
2. It is not always possible to trisect an angle.
3. It is not possible to construct the side of a cube of volume 2.
4. It is not possible to square the circle, i.e. construct a square with the same area as the unit
circle. (You may assume that π is transcendental. This follows from Exercise 1.5.31† .)

Solution

Note that $\alpha = \cos\frac{2\pi}{n}$ is constructible iff its degree $\varphi(n)/2$ (for $n \geq 3$) is a power of 2. In that case, we can construct the point $(\cos(2k\pi/n), 0)$ and the point $(0, \sin(2k\pi/n))$, and hence the point $(\cos(2k\pi/n), \sin(2k\pi/n))$ as well. Similarly, if we are able to trisect an angle $\alpha$, by intersecting with a vertical line we are able to construct $\cos(\alpha/3)$ (and this is equivalent). However, for $\alpha = \frac{2\pi}{3}$, we get $\cos(2\pi/9)$ which has degree 3 and is thus not constructible. For the same reason, $\sqrt[3]{2}$ is not constructible so we cannot double the cube. Finally, squaring the circle would involve constructing $\sqrt{\pi}$ which is transcendental so is impossible.
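The criterion for regular polygons is easy to tabulate. The sketch below lists the constructible $n$-gons for small $n$ by testing whether $\varphi(n)$ is a power of 2; the helper names are hypothetical and the totient is computed naively.

```python
from math import gcd

def euler_phi(n):
    """Euler's totient function, computed naively."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def is_power_of_two(k):
    """True for 1, 2, 4, 8, ..."""
    return k >= 1 and k & (k - 1) == 0

def ngon_is_constructible(n):
    """Criterion of the exercise: phi(n) must be a power of 2."""
    return is_power_of_two(euler_phi(n))

print([n for n in range(3, 21) if ngon_is_constructible(n)])
# [3, 4, 5, 6, 8, 10, 12, 15, 16, 17, 20]
```

The first missing values 7 and 9 correspond to $\varphi(7) = \varphi(9) = 6$; the case $n = 9$ is the trisection of the angle $2\pi/3$ discussed above.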

It remains to prove the characterisation of the constructible numbers. One direction is easy:
when we intersect a line with a line, the field generated by the coordinates does not change, and
when we intersect a line with a circle, the field generated by the coordinates becomes a quadratic
extension of itself or does not change. Thus, the degree over Q gets multiplied by 1 or by 2
each time, and must thus be a power of 2. It is also clear that this is Galois: each time we
take a square root, we could also take the other square root (this amounts to considering the
other intersection with the circle). The other direction is almost given by Exercise 6.5.17. To
finish, we proceed by induction on $[K : \mathbb{Q}] = 2^k$ where $K$ is the splitting field of $\alpha$. By Cauchy's theorem 6.3.3, $\mathrm{Gal}(K/\mathbb{Q})$ has an element of order 2, and thus, $K$ has a subfield of index 2 by the Galois correspondence, say $[K : L] = 2$. By assumption, the points of $L$ are constructible since $[L : \mathbb{Q}] = 2^{k-1}$, and if $K = \mathbb{Q}(\alpha)$, we can recover $\alpha$ from $L$ using the quadratic formula so $\alpha$ is also constructible by Exercise 6.5.17.


Exercise 6.5.19† . We say a finite Galois extension L/K in characteristic 0 is solvable by radicals if
there is a tower of extensions
K = K0 ⊂ K1 ⊂ . . . ⊂ Km ⊇ L

such that Ki+1 is obtained from Ki by adjoining an nth root of some element of Ki to Ki , for some
n. We also say a group G is solvable if there is a chain 0 = G0 ⊂ G1 ⊂ . . . ⊂ Gm = G such that Gi is
normal in Gi+1 (see Exercise 6.3.14∗ ) and Gi+1 /Gi is cyclic. Prove that L/K is solvable by radicals if
and only if its Galois group is. (When L is the field generated by the roots of a polynomial f ∈ K[X],
L/K being solvable by radicals means that the roots of f can be written with radicals, which explains
the name.)

Solution

First, suppose that $G$ is solvable. Consider the tower of fields $K_i = L^{G_i}$. Each $K_{i+1}/K_i$ is Galois by Exercise 6.3.14∗, with Galois group isomorphic to $G_{i+1}/G_i$ by Exercise 6.3.4. We wish to conclude that $K_{i+1}$ is generated from $K_i$ by adding the $n$th root of an element. The problem is that this is almost always false, since then $K_{i+1}/K_i$ wouldn't be a Galois extension! Hence, we shall consider the tower of fields $K_i(\omega)$, where $\omega$ is a primitive $|G|$th root of unity (since $|G_{i+1}/G_i|$ divides $|G|$ by Lagrange). By Exercise 6.5.2†, the Galois group of $K_{i+1}(\omega)/K_i(\omega)$ is isomorphic to the subgroup $\mathrm{Gal}(K_{i+1}/K_i(\omega) \cap K_{i+1})$ of $\mathrm{Gal}(K_{i+1}/K_i)$, and hence cyclic as well, say of order $n$. Then, Kummer theory from Exercise 6.5.15† states that $K_{i+1}(\omega)$ is $K_i(\omega)(\alpha_i)$ for some $\alpha_i^n \in K_i(\omega)$. This gives us that $L/K$ is solvable as wanted since we have the tower

$$K \subseteq K(\omega) \subseteq K(\omega)(\alpha_1) \subseteq \ldots \subseteq K(\omega)(\alpha_m) = L(\omega) \supseteq L.$$

Now, we need to prove that $G$ is solvable if $L/K$ is. Note that, if $G$ is solvable, then so is $G/H$ for any normal subgroup $H$. Indeed, if $0 = G_0 \subseteq \ldots \subseteq G_m = G$, then $H/H = G_0H/H \subseteq \ldots \subseteq G_mH/H = G/H$ and
$$(G_{i+1}H/H)/(G_iH/H) \simeq G_{i+1}H/G_iH$$
by Exercise A.3.16†, and this is cyclic, since if $gG_i$ generates $G_{i+1}/G_i$, then $gG_iH$ generates $G_{i+1}H/G_iH$.

Set $M = K_m(\omega)$, where $\omega$ is a root of unity chosen so that $K_{i+1}(\omega)/K$ is Galois for each $i$ (and in particular $M/K$ is). Since $\mathrm{Gal}(L/K) \simeq \mathrm{Gal}(M/K)/\mathrm{Gal}(M/L)$ is a quotient of $\mathrm{Gal}(M/K)$, it suffices to prove that $\mathrm{Gal}(M/K)$ is solvable. There is only one thing left to do now: add $\omega$ to all $K_i$ so that $K_{i+1}/K_i$ becomes Galois and we can simply take its Galois group. This gives us that $\mathrm{Gal}(M/K(\omega))$ is solvable, but what we want is for $\mathrm{Gal}(M/K)$ to be. However,

$$\mathrm{Gal}(M/K)/\mathrm{Gal}(M/K(\omega)) \simeq \mathrm{Gal}(K(\omega)/K)$$

is cyclic, so we can add to a chain $0 = G_0 \subseteq \ldots \subseteq G_m = \mathrm{Gal}(M/K(\omega)) \subseteq \mathrm{Gal}(M/K)$ to conclude that $\mathrm{Gal}(M/K)$ is solvable as wanted!

We shall however explain a bit more why taking Galois groups gives us cyclic extensions. Let us write $M_i = K_i(\omega) = M_{i-1}(\alpha_i)$ for some $\alpha_i^{n_i} \in M_{i-1}$. Then, $\mathrm{Gal}(M_i/M_{i-1})$ is cyclic since its embeddings have the form $\sigma(\alpha_i) = \omega^k\alpha_i$ for some $k$, so they form a subgroup of $\mathbb{Z}/n\mathbb{Z}$ where $n$ is the order of $\omega$, which is thus cyclic. If we let $G_i = \mathrm{Gal}(M/M_i)$, we have

$$0 = \mathrm{Gal}(M/M) = G_m \subseteq \ldots \subseteq G_0 = \mathrm{Gal}(M/K(\omega))$$

and
$$G_i/G_{i+1} = \mathrm{Gal}(M/M_i)/\mathrm{Gal}(M/M_{i+1}) \simeq \mathrm{Gal}(M_{i+1}/M_i)$$
by Exercise 6.3.4 which is cyclic as wanted.


Exercise 6.5.20† . Let n ≥ 1 be an integer. Prove that Sn is not solvable for n ≥ 5. Conclude
from Exercise 6.5.22† that some polynomial equations are not solvable by radicals.2 (This is quite
technical.)

Solution

The usual proof proves a lot more than just the non-solvability of $S_n$: it completely characterises all its descending chains of normal subgroups. More precisely, the normal subgroups of $S_n$ are the symmetric group $S_n$, the alternating group of even permutations $A_n$ (Definition C.3.2), as well as the trivial group $0 = \{\mathrm{id}\}$, while the only strict normal subgroup of $A_n$ is 0 (we say it's simple). However, this demands a lot of work, and since this is a number theory book and not an algebra one, we will not prove this. See Weintraub [48, Appendix A, Section 3], for an account of the more general result.

Note that if $G$ is solvable, so is any of its subgroups $H$: $0 = G_0 \subseteq \ldots \subseteq G_m = G$ becomes

$$0 = G_0 \cap H \subseteq \ldots \subseteq G_m \cap H = H$$

and
$$\frac{G_{i+1} \cap H}{G_i \cap H} \simeq \frac{(H \cap G_{i+1})G_i}{G_i}$$
by the second isomorphism theorem (see Exercise A.3.16†). This is a subgroup of $G_{i+1}/G_i$, which is cyclic, so it's cyclic itself. Hence, to show that $S_n$ is not solvable for $n \geq 5$, it suffices to prove that $S_5$ is not solvable. This can be done using a computer for instance, as there are only 120 elements. However, we will still present a somewhat more satisfactory proof.

We first prove that the only strict subgroup $G$ of $S_n$ such that $S_n/G$ is abelian is $A_n$. Let $G$ be such a subgroup and let $f : S_n \to S_n/G$ denote the projection. Note that, in cycle notation, we have

$$(1, 2, 4)(1, 4, 2) = \mathrm{id}$$
$$(1, 3, 5)(1, 5, 3) = \mathrm{id}$$
$$(1, 2, 3) = (1, 2, 4)(1, 3, 5)(1, 4, 2)(1, 5, 3).$$

Hence, if we let $f((1, 2, 4)) = x$ and $f((1, 3, 5)) = y$, we get

$$f((1, 2, 3)) = f((1, 2, 4)(1, 3, 5)(1, 4, 2)(1, 5, 3)) = xyx^{-1}y^{-1} = \mathrm{id}$$

in $S_n/G$ which is abelian by assumption. By symmetry, $G$ contains all 3-cycles. It remains to


prove that the 3-cycles generate An . Since An is generated by the products of two transpositions,
it suffices to prove that 3-cycles generate all products of two transpositions. This follows from
the following equalities

$$(i, k, j) = (i, j)(i, k)$$
$$(i, k, j)(i, k, \ell) = (i, j)(k, \ell)$$

for distinct $i, j, k, \ell$.
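These cycle identities depend on the convention that a product of permutations is read from right to left (the rightmost factor acts first). The sketch below checks them under that convention; the helpers `cycle` and `compose` are hypothetical names, and permutations are stored as dictionaries on $\{1, \ldots, n\}$.

```python
def cycle(n, *points):
    """Permutation of {1, ..., n}, as a dict, given by a single cycle."""
    perm = {i: i for i in range(1, n + 1)}
    for a, b in zip(points, points[1:] + points[:1]):
        perm[a] = b
    return perm

def compose(*perms):
    """Product of permutations, applying the rightmost factor first."""
    result = dict(perms[-1])
    for p in reversed(perms[:-1]):
        result = {i: p[result[i]] for i in result}
    return result

# (1,2,3) = (1,2,4)(1,3,5)(1,4,2)(1,5,3), a commutator of two 3-cycles.
commutator = compose(cycle(5, 1, 2, 4), cycle(5, 1, 3, 5),
                     cycle(5, 1, 4, 2), cycle(5, 1, 5, 3))
print(commutator == cycle(5, 1, 2, 3))  # True

# (i,k,j) = (i,j)(i,k) and (i,k,j)(i,k,l) = (i,j)(k,l) with i, j, k, l = 1, 2, 3, 4.
print(compose(cycle(5, 1, 2), cycle(5, 1, 3)) == cycle(5, 1, 3, 2))  # True
print(compose(cycle(5, 1, 3, 2), cycle(5, 1, 3, 4))
      == compose(cycle(5, 1, 2), cycle(5, 3, 4)))                    # True
```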

Now, we need to prove that G = A5 has no non-trivial normal subgroup. For this, we shall find
the conjugacy classes Cg = {hgh−1 | h ∈ G}. We can check that there are 5 such conjugacy
classes, of respective size 1, corresponding to the identity, 15, 20, 12 and 12. Assume we have
proven this. Since a normal subgroup H is a union of conjugacy classes by definition, it must
contain the trivial class of size 1, corresponding to the identity. But then, it is easy to see that
its cardinality only divides |A5 | = 60 for H = A5 or H = {id}. Thus, by Lagrange’s theorem
from Exercise 6.3.19∗ , H must be A5 or {id} as wanted. But A5 /{id} ' A5 is not cyclic so we
are done.
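Since the solution itself remarks that such facts can be checked by computer, here is a brute-force sketch of the counting argument. It enumerates $A_5$ as the even permutations of $\{0, \ldots, 4\}$, computes the conjugacy class sizes, and lists which unions of classes containing the identity have a size dividing 60. All names are hypothetical.

```python
from itertools import combinations, permutations

def is_even(p):
    """Parity of a permutation (tuple of images) via its number of inversions."""
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j]) % 2 == 0

def compose(p, q):
    """The permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_class_sizes(group):
    """Sorted sizes of the conjugacy classes of a finite permutation group."""
    remaining, sizes = set(group), []
    while remaining:
        g = next(iter(remaining))
        cls = {compose(compose(h, g), inverse(h)) for h in group}
        sizes.append(len(cls))
        remaining -= cls
    return sorted(sizes)

A5 = [p for p in permutations(range(5)) if is_even(p)]
sizes = conjugacy_class_sizes(A5)
print(sizes)  # [1, 12, 12, 15, 20]

# A normal subgroup is the identity class together with some of the other classes.
others = sizes[1:]
candidate_orders = {1 + sum(c) for r in range(len(others) + 1)
                    for c in combinations(others, r)}
possible = sorted(k for k in candidate_orders if 60 % k == 0)
print(possible)  # [1, 60]
```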

2 If one only wants to show that there is no general formula, one doesn't need to do the first part since the general polynomial $\prod_{i=1}^n (X - A_i) \in \mathbb{Q}(A_1, \ldots, A_n)[X]$ already has Galois group $S_n$ over $\mathbb{Q}(A_1, \ldots, A_n)$ (where $A_1, \ldots, A_n$ are formal variables).

To prove that these are the cardinalities of the conjugacy classes, recall Remark 6.5.3: if $\gamma = (i_1, \ldots, i_k)$ is a cycle, we have
$$\sigma\gamma\sigma^{-1} = (\sigma(i_1), \ldots, \sigma(i_k)).$$
This means that the conjugates of 3-cycles are all 3-cycles, thus forming the class of size 20 (it is not hard to see that we can pick an even $\sigma$). Similarly, there are two conjugacy classes of 5-cycles, each of size 12, because this time our $\sigma$ cannot always be chosen even. It only remains to prove that all 15 products of two transpositions $(i, j)(k, \ell)$ with $i, j, k, \ell$ distinct are conjugate. This is not very hard:
$$\sigma(i, j)(k, \ell)\sigma^{-1} = (\sigma(i), \sigma(j))(\sigma(k), \sigma(\ell)).$$
Hence, if $(i', j')(k', \ell')$ is another product of two transpositions, we pick $\sigma(i) = i'$, $\sigma(j) = j'$, $\sigma(k) = k'$ and $\sigma(\ell) = \ell'$. If this has even signature we are done, otherwise exchange $i'$ and $j'$.

Finally, to conclude that some algebraic numbers are not expressible by radicals, we only need to prove that there exist irreducible polynomials of prime degree at least 5 with exactly two non-real roots. One example is $X^5 - 4X - 2$ (irreducible by Eisenstein's criterion).
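For the quintic $X^5 - 4X - 2$, which is how the example is read here, both properties can be checked by hand or by a few lines of code. The sketch below verifies Eisenstein's criterion at the prime 2 and counts sign changes of $f$ at a few integer sample points; since $f' = 5X^4 - 4$ has exactly two real roots, $f$ has at most three real roots by Rolle's theorem, so three sign changes mean exactly three real roots and two non-real ones. The function names and the sample points are illustrative choices.

```python
def f(x):
    return x**5 - 4*x - 2

def eisenstein_applies(coeffs, p):
    """coeffs lists the integer coefficients from the constant term up to the leading one."""
    *lower, leading = coeffs
    return (leading % p != 0
            and all(c % p == 0 for c in lower)
            and lower[0] % (p * p) != 0)

print(eisenstein_applies([-2, -4, 0, 0, 0, 1], 2))  # True

samples = [-2, -1, 0, 2]
values = [f(x) for x in samples]
print(values)  # [-26, 1, -2, 22]
sign_changes = sum(1 for u, v in zip(values, values[1:]) if u * v < 0)
print(sign_changes)  # 3
```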


Exercise 6.5.21† . We say a finite Galois extension L/K of real fields, i.e. L ⊆ R, is solvable by real
radicals if there is a tower of extensions
K = K0 ⊂ K1 ⊂ . . . ⊂ Km ⊇ L
such that Ki+1 is obtained from Ki by adjoining the nth root of some positive element of Ki to Ki .
Prove that L/K is solvable by real radicals if and only if [L : K] is a power of 2.

Solution

Without loss of generality, by adding more intermediate fields, we can suppose that $K_{i+1} = K_i(\alpha_i)$, where $\alpha_i^p \in K_i$ for some prime $p$. The key point is that, if $[L : K]$ is equal to an odd prime $q$, then $[L(\alpha) : K(\alpha)]$ is also equal to $q$ for any real $\alpha$ with $\alpha^p \in K$. Let's see first how this implies our result: if $[L : K]$ is not a power of 2, say is divisible by an odd prime $q$, $\mathrm{Gal}(L/K)$ has an element of order $q$ by Cauchy's theorem 6.3.3, and hence $L$ has a subfield of index $q$ by the Galois correspondence, say $[L : M] = q$. Then, $L/M$ is also solvable by real radicals, but

$$[L : M] = [L(\alpha_1) : M(\alpha_1)] = [L(\alpha_1, \alpha_2) : M(\alpha_1, \alpha_2)] = \ldots,$$

i.e. $[LK_i : MK_i] = [L : M]$ by our lemma. This is a contradiction for $i = m$: we get $[K_m : K_m] = [L : M]$. Conversely, if $[L : K]$ is a power of 2, say $L = K(\alpha)$, $L$ has a subfield $M$ of index 2, i.e. $[L : M] = 2$. Then, $\alpha$ can be obtained by real radicals from $M$ using the quadratic formula. Repeating this process yields that $L/K$ is solvable by square roots as well.

Hence, it suffices to show that, when $[L : K] = q$, $[L(\alpha) : K(\alpha)] = q$ as well. Note that $\mathrm{Gal}(L(\alpha)/K(\alpha))$ is a subgroup of $\mathrm{Gal}(L/K)$, as we saw in Exercise 6.5.2†. In particular, $[L(\alpha) : K(\alpha)]$ divides $q$ so must be 1 or $q$. Suppose for the sake of contradiction that it is 1 (in particular $\alpha \notin K$). This gives $L \subseteq K(\alpha)$, so $q$ divides $[K(\alpha) : K]$, which is $p$ by Exercise 6.5.10†. Hence, $p = q$ and $L = K(\alpha)$. But this is impossible, since the conjugates of $\alpha$ are not in $L$ as primitive $q$th roots of unity are not real since $q \geq 3$.


Exercise 6.5.22†. Let $p$ be a prime number and $G \subseteq S_p$ a subgroup containing a transposition $\tau$ (see the paragraph after Definition C.3.2) and an element $\gamma$ of order $p$. Prove that $G = S_p$. Deduce that, if $f \in \mathbb{Q}[X]$ is an irreducible polynomial of degree $p$ with precisely two non-real complex roots, then the Galois group of the field generated by its roots (called its splitting field, because it is a field where it splits) over $\mathbb{Q}$ is $S_p$.

Solution

Suppose without loss of generality that $\tau$ is $1 \leftrightarrow 2$ (which we usually denote in cycle notation $\tau = (1, 2)$). A power of $\gamma$, say $\gamma^k$, sends 1 to 2. Suppose without loss of generality that $\gamma$ itself does (replace $\gamma$ by $\gamma^k$, which still has order $p$). By symmetry between $3, 4, \ldots, p$, we can in fact suppose that $\gamma$ is the cycle $(1, 2, \ldots, p)$ which sends $1 \to 2 \to \ldots \to p \to 1$. We will prove that $G$ contains all transpositions and thus must be $S_p$ since transpositions generate the symmetric group by Exercise C.3.12∗. Notice that $\gamma\tau\gamma^{-1}$ is the transposition $(2, 3)$ since it goes $2 \to 1 \to 2 \to 3$ and $3 \to 2 \to 1 \to 2$ and must be the identity elsewhere since $\tau$ is. Similarly, $\gamma^k\tau\gamma^{-k}$ is the transposition $(k + 1, k + 2)$. Since

$$(1, k) = (1, k - 1)(k - 1, k)(1, k - 1),$$

a straightforward induction tells us that $(1, k) \in G$ for all $k$. Finally, $(1, i)(1, j)(1, i) = (i, j)$ so $G$ contains all transpositions as wanted.
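The group-theoretic half of this exercise can be confirmed by brute force for small $p$. The sketch below computes the subgroup generated by a transposition and a $p$-cycle by closing under composition; permutations are tuples on $\{0, \ldots, n-1\}$ and the name `generated_subgroup` is hypothetical. The second computation, in $S_4$, is included to show that primality matters: there a transposition and a 4-cycle can generate a group of order only 8.

```python
def generated_subgroup(generators):
    """Closure of the generators under composition (finite, so this is the generated subgroup)."""
    n = len(generators[0])
    identity = tuple(range(n))
    seen, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = tuple(s[g[i]] for i in range(n))  # apply g first, then s
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen

# p = 5: the transposition (0 1) and the 5-cycle (0 1 2 3 4).
tau, gamma = (1, 0, 2, 3, 4), (1, 2, 3, 4, 0)
print(len(generated_subgroup([tau, gamma])))  # 120

# n = 4 (not prime): the transposition (0 2) and the 4-cycle (0 1 2 3).
print(len(generated_subgroup([(2, 1, 0, 3), (1, 2, 3, 0)])))  # 8
```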

Now, suppose f is an irreducible polynomial of degree p with only two non-real roots. Since its
degree divides the degree of its splitting field, its Galois group G has an element of order p by
Cauchy’s theorem 6.3.3. Moreover, it contains the transposition corresponding to the complex
conjugation, which exchanges the two non-real roots.


Exercise 6.5.23† . Let n be a positive integer. Prove that there is a number field K, Galois over
Q, such that Gal(K/Q) ' Sn . (You may assume the following result of Dedekind: if f ∈ Z[X] is a
polynomial, for any prime number p not dividing the discriminant ∆ of f , the Galois group of f over
Fp is a subgroup of the Galois group of f over Q.3 )

Solution

Let’s look at what the injection of GalFp (f ) in GalQ (f ) gives us. We use the cycle notation
σ1 · . . . · σk to mean the permutation σ which decomposes into the disjoint cycles σ1 , . . . , σk . A
cycle
σ : i1 7→ i2 7→ . . . 7→ ik 7→ i1
will be denoted (i1 , . . . , ik ) (it is the identity on other elements).

We know from Chapter 4 that $\mathrm{Gal}_{\mathbb{F}_p}(f)$ is generated by the Frobenius morphism. Write $f \equiv f_1 \cdot \ldots \cdot f_k \pmod p$ with distinct irreducible polynomials $f_i \in \mathbb{F}_p[X]$ of respective degrees $n_i$ (there is no repeated root since $p$ doesn't divide the discriminant). Then, the Frobenius morphism acts as a cycle on the roots of each $f_i$, of length $n_i$. Hence, Frob can be written in cycle notation as $\sigma_1 \cdot \ldots \cdot \sigma_k$ with $\sigma_i$ a cycle of length $n_i$. This implies, by assumption, that $\mathrm{Gal}_{\mathbb{Q}}(f)$ also has an element of this form.

We have a lot of freedom on the factorisation of $f$ modulo primes, so let's put aside the question of ensuring that the primes we choose do not divide the discriminant of $f$ for the moment and focus on the rest of the proof. Suppose that $G = \mathrm{Gal}_{\mathbb{Q}}(f)$ has an $n$-cycle $\sigma$, a transposition $\tau$, and an $(n - 1)$-cycle $\psi$. Without loss of generality, by symmetry, suppose that $\psi = (2, 3, \ldots, n)$. By considering $\sigma^m\tau\sigma^{-m}$ for an appropriate $m$, we can assume that $\tau$ is the transposition $(1, k)$ for some $k$. Indeed, by symmetry, if $\sigma = (1, 2, \ldots, n)$ and $\tau = (i, j)$, we have $\sigma^m\tau\sigma^{-m} = (i + m, j + m)$. Since
$$\psi = (k, k + 1, \ldots, n, 2, 3, \ldots, k - 1),$$
we can, again by symmetry, suppose that in fact $k = 2$. To conclude, we will prove that $(1, 2)$ and $(2, 3, \ldots, n)$ generate $S_n$, thus implying that $G = S_n$ as desired. By Exercise C.3.12∗, it suffices to prove that they generate all transpositions. Since $\varphi := \tau\psi = (1, 2, \ldots, n) \in G$, we can proceed as we did in Exercise 6.5.22†: $(m + 1, m + 2) = \varphi^m\tau\varphi^{-m} \in G$, and since $(1, m) = (1, m - 1)(m - 1, m)(1, m - 1)$, we have $(1, m) \in G$ for all $m$ by induction. Finally, since $(i, j) = (1, i)(1, j)(1, i)$, we have all transpositions and we are done: $G = S_n$.

3 The Galois group of a polynomial $f$ over a field $F$ is defined as the Galois group of its splitting field over $F$, i.e. as $\mathrm{Gal}(F(\alpha_1, \ldots, \alpha_k)/F)$, where $\alpha_1, \ldots, \alpha_k$ are the roots of $f$.

It remains to prove that we can ensure that the Galois group contains an $n$-cycle, a transposition, and an $(n - 1)$-cycle. For this, pick three primes $p, q, r$. Then, choose a polynomial $f \in \mathbb{Z}[X]$, using the Chinese remainder theorem, such that
• $f$ is irreducible modulo $p$,
• $f$ factorises as a product of a polynomial of degree 1 and an irreducible polynomial of degree $n - 1$ modulo $q$, and
• $f$ factorises as a product of an irreducible polynomial of degree 2 and $n - 2$ polynomials of degree 1 modulo $r$ (choose $r$ sufficiently large so that there is no common root).
Then, $\mathrm{Gal}_{\mathbb{F}_p}(f)$ contains an $n$-cycle, $\mathrm{Gal}_{\mathbb{F}_q}(f)$ an $(n - 1)$-cycle, and $\mathrm{Gal}_{\mathbb{F}_r}(f)$ a transposition. In addition, the discriminant of $f$ is not divisible by $p, q, r$ since $f$ has no repeated factor modulo these primes. Hence, by Dedekind's result and our previous observation, $\mathrm{Gal}_{\mathbb{Q}}(f) = S_n$ as desired.


Remark 6.5.3
If γ = (i1 , . . . , ik ) is a cycle, then it is straightforward to see that σγσ −1 = (σ(i1 ), . . . , σ(ik )).
This explains how we found our relations.

Cyclotomic Fields
Exercise 6.5.24† . Let ω be a primitive nth root of unity. When is Φm irreducible over Q(ω)?

Solution

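Let $\zeta$ be a primitive $m$th root of unity and let $\xi$ be a primitive $\mathrm{lcm}(m, n)$th root of unity. $\Phi_m$ is irreducible over $\mathbb{Q}(\omega)$ if and only if $\zeta$ has degree $\varphi(m)$ over $\mathbb{Q}(\omega)$. We have, by Problem 6.3.2,

$$[\mathbb{Q}(\zeta, \omega) : \mathbb{Q}(\omega)] = [\mathbb{Q}(\xi) : \mathbb{Q}(\omega)] = \frac{[\mathbb{Q}(\xi) : \mathbb{Q}]}{[\mathbb{Q}(\omega) : \mathbb{Q}]} = \frac{\varphi(\mathrm{lcm}(m, n))}{\varphi(n)}.$$

Thus, $\Phi_m$ is irreducible over $\mathbb{Q}(\omega)$ if and only if $\varphi(\mathrm{lcm}(m, n)) = \varphi(m)\varphi(n)$. Finally, note that we always have
$$\begin{aligned}
\varphi(m)\varphi(n) &= \prod_{p \mid m} p^{v_p(m)-1}(p - 1) \prod_{q \mid n} q^{v_q(n)-1}(q - 1) \\
&= \prod_{p \mid m, n} p^{\min(v_p(m), v_p(n))-1}(p - 1) \prod_{q \mid mn} q^{\max(v_q(m), v_q(n))-1}(q - 1) \\
&= \varphi(\mathrm{lcm}(m, n))\varphi(\gcd(m, n))
\end{aligned}$$

since the $p - 1$ factor is repeated twice when $p \mid m, n$ and only once otherwise, with exponent $v_p(m) + v_p(n) - 2 = \min(v_p(m), v_p(n)) - 1 + \max(v_p(m), v_p(n)) - 1$ in the first case and exponent $\max(v_p(m), v_p(n)) - 1$ in the second. Thus, $\Phi_m$ is irreducible over $\mathbb{Q}(\omega)$ iff $\varphi(\gcd(m, n)) = 1$, i.e. iff $\gcd(m, n) = 1$ or 2.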
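Both the totient identity and the resulting criterion can be checked numerically on a small range. The sketch below does so with a naive totient; the function names are hypothetical and the bound 40 is arbitrary.

```python
from math import gcd

def euler_phi(n):
    """Euler's totient function, computed naively."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def lcm(a, b):
    return a * b // gcd(a, b)

def cyclotomic_stays_irreducible(m, n):
    """Phi_m is irreducible over Q(zeta_n) iff phi(lcm(m, n)) == phi(m) * phi(n)."""
    return euler_phi(lcm(m, n)) == euler_phi(m) * euler_phi(n)

identity_holds = all(
    euler_phi(m) * euler_phi(n) == euler_phi(lcm(m, n)) * euler_phi(gcd(m, n))
    for m in range(1, 41) for n in range(1, 41)
)
criterion_matches = all(
    cyclotomic_stays_irreducible(m, n) == (gcd(m, n) in (1, 2))
    for m in range(1, 41) for n in range(1, 41)
)
print(identity_holds, criterion_matches)  # True True
```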


Exercise 6.5.25†. Let $n$ be an integer and $m \in \mathbb{Z}/n\mathbb{Z}$ be such that $m^2 \equiv 1 \pmod n$. Prove that there exist infinitely many primes congruent to $m$ modulo $n$, provided that there exists at least one which is greater than $n^2$. (It is also true that our Euclidean approach to special cases of Dirichlet's theorem only works for $m^2 \equiv 1 \pmod n$, see [29].)

Solution

Suppose that $p > n^2$ is a prime congruent to $m \pmod n$. We have already done the case $m = 1$ in Exercise 3.3.8∗ and the case $m = -1$ in Theorem 4.4.1, so suppose $m \neq \pm 1$. Let $\omega$ be a primitive $n$th root of unity, and let $H = \{1, m\}$. Let also $\sigma_k$ denote the embedding $\omega \mapsto \omega^k$. Consider $g = (X - \omega)(X - \omega^m)$. For large $N$, we have $\mathbb{Q}(g(N)) = \mathbb{Q}(\omega)^H$ by Remark 6.3.2. Now, let $f$ be the minimal polynomial of $g(N)$, where $N$ is an integer that we will choose later. Consider the discriminant
$$\Delta = \pm\prod_i f'(\sigma_i(g(N)))$$

by Exercise 3.2.2∗. This is a polynomial of degree $\varphi(n)(\varphi(n) - 1) < n^2$ in $N$, which can't always be divisible by $p$ since $p > n^2$ by assumption. Hence, there is some $N$ such that this is not divisible by $p$. Choose this $N$ to also be divisible by $n$, using CRT.

We are now ready to finish. Suppose for the sake of contradiction that there are a finite number of such primes $p_1 = p, p_2, \ldots, p_k$. Let $Q$ be the product of the possible exceptions, i.e. the prime divisors of $f$ which are not congruent to 1 or $m$ modulo $n$. Pick an $M$ congruent to $g(N)$ or $g(N) + p$ modulo $p^2$, so that $v_p(f(M)) = 1$ using Corollary 5.3.1. Pick also $M$ to be divisible by $np_2 \cdot \ldots \cdot p_kQ$. Since
$$f(0) = \pm\Phi_n(N) \equiv \Phi_n(0) = \pm 1 \pmod n,$$
we know that the prime factors of $f(0)$ are all congruent to 1 modulo $n$. Thus, the prime factors of $f(M) \equiv f(0) \pmod{np_2 \cdot \ldots \cdot p_kQ}$ are all congruent to 1 or $m$ modulo $n$ since we don't run into an exception. Since, by assumption, the only primes congruent to $m$ modulo $n$ are the $p_i$, this means that $f(M)$ is divisible only by primes congruent to 1 modulo $n$, and potentially by $p$ too. Since we also have $v_p(f(M)) = 1$, we get $f(M) \equiv \pm m \pmod n$ depending on its sign. This is a contradiction if $m \neq \pm 1$ since we have $f(M) \equiv f(0) \equiv \pm 1 \pmod n$.


Exercise 6.5.26† (Mann). Suppose that $\omega_1, \ldots, \omega_n$ are roots of unity such that $\sum_{i=1}^n a_i\omega_i = 0$ for some $a_i \in \mathbb{Q}$ and $\sum_{i \in I} a_i\omega_i \neq 0$ for any non-empty strict subset $I \subseteq [n]$. Prove that $\omega_i^m = \omega_j^m$ for any $i, j \in [n]$ where $m$ is the product of primes at most $n$.

Solution

Suppose without loss of generality that $\omega_1 = 1$, by dividing everything by $\omega_1$. Next, let $m$ be the smallest integer such that $\omega_i^m = 1$ for all $i$, let $p$ be a prime factor of $m$ and write $m = p^kr$ for some $r$ with $p \nmid r$. We will prove that $p \leq n$ and $k \leq 1$, thus yielding the wanted result. Let $\omega = \exp\left(\frac{2i\pi}{p^k}\right)$ be a primitive $p^k$th root of unity, and write $\omega_i = \zeta^{t_i}\omega^{s_i}$ for some $s_i \leq p - 1$, where $\zeta = \exp\left(\frac{2i\pi}{m/p}\right)$ is a primitive $m/p$th root of unity. Indeed, if $\omega_i = \exp\left(\frac{2\ell i\pi}{m}\right)$, we have

$$\zeta^{t_i}\omega^{s_i} = \exp\left(2i\pi\,\frac{t_ip + s_ir}{m}\right)$$

and it suffices to choose $s_ir \equiv \ell \pmod p$. The equation

$$\sum_{i=1}^n a_i\omega_i = 0$$

thus becomes $f(\omega) = 0$ for some $f \in \mathbb{Q}(\zeta)[X]$ of degree at most $p - 1$, which is non-zero by assumption. Let's compute the degree of $\omega$ over $\mathbb{Q}(\zeta)$:
$$[\mathbb{Q}(\omega, \zeta) : \mathbb{Q}(\zeta)] = \frac{[\mathbb{Q}(\omega, \zeta) : \mathbb{Q}]}{[\mathbb{Q}(\zeta) : \mathbb{Q}]} = \frac{\varphi(p^kr)}{\varphi(p^{k-1}r)} = \begin{cases} p - 1 & \text{if } k = 1 \\ p & \text{if } k \geq 2 \end{cases}.$$

Thus, we already reach a contradiction if $k \geq 2$ since $f$ has degree less than $p$. Hence, $k = 1$ and we must have
$$f = \alpha(1 + X + \ldots + X^{p-1}).$$
However, $f$ has at most $n$ non-zero coefficients, which implies $p \leq n$ as wanted.


Exercise 6.5.27. Which quadratic subfields does a cyclotomic field contain?

Solution

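Given a positive integer $m$, we let $\omega_m$ denote a primitive $m$th root of unity. We have seen that, when $p$ is odd, $\sqrt{\left(\frac{-1}{p}\right)p} \in \mathbb{Q}(\omega_p)$. Hence, $\sqrt{\left(\frac{-1}{p}\right)p} \in \mathbb{Q}(\omega_n)$ whenever $p \mid n$. This implies that $\sqrt{\left(\frac{-1}{m}\right)m} \in \mathbb{Q}(\omega_n)$ whenever $m \mid n$ is odd, where $\left(\frac{-1}{m}\right)$ is the Jacobi symbol. In Problem 6.3.3, we also saw that $\sqrt2 \in \mathbb{Q}(\omega_8)$ so $\sqrt2 \in \mathbb{Q}(\omega_n)$ when $8 \mid n$. Also, we of course have $\sqrt{-1} \in \mathbb{Q}(\omega_4)$ so $\sqrt{-1} \in \mathbb{Q}(\omega_n)$ when $4 \mid n$. To summarise the above discussion, $\mathbb{Q}(\omega_n)$ contains all the quadratic subfields of the form

• $\mathbb{Q}\left(\sqrt{\left(\frac{-1}{m}\right)m}\right)$ when $m$ is a squarefree positive odd divisor of $n$

• $\mathbb{Q}(\sqrt{\pm m})$ when $m$ is a squarefree odd divisor of $n$ and $4 \mid n$

• $\mathbb{Q}(\sqrt{\pm 2m})$ when $8 \mid n$ and $m$ is a squarefree odd divisor of $n$.

We claim that these are all the quadratic subfields of $\mathbb{Q}(\omega_n)$. Suppose that $\mathbb{Q}(\omega_n)$ contains $\mathbb{Q}(\sqrt m)$ with minimal $m$ not of the wanted form. Then, $\mathbb{Q}(\omega_n)$ and $\mathbb{Q}(\omega_{4m})$ have a non-trivial intersection so $n$ and $4m$ have gcd at least $3$ by Problem 6.3.2. First suppose that they have a common odd prime divisor $p$. Then $\mathbb{Q}(\omega_n)$ contains $\mathbb{Q}\left(\sqrt{\left(\frac{-1}{p}\right)m/p}\right)$ which contradicts the minimality of $m$. Otherwise, if the gcd is exactly $4$, the intersection is $\mathbb{Q}(i)$ so $m = -1$ which is in our list of described subfields. Otherwise, the gcd is at least $8$ which means that $m$ is even and $\sqrt2 \in \mathbb{Q}(\omega_n)$, so $\mathbb{Q}(\omega_n)$ contains $\mathbb{Q}(\sqrt{m/2})$ which contradicts the minimality of $m$ again.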
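The first step of this solution relies on $\sqrt{\left(\frac{-1}{p}\right)p}$ lying in $\mathbb{Q}(\omega_p)$. A standard witness for this is the quadratic Gauss sum $g = \sum_{k=1}^{p-1}\left(\frac kp\right)\omega_p^k$, whose square is $\left(\frac{-1}{p}\right)p$. The following is only a numerical sanity check in floating point, and the assumption that the Gauss sum is the construction the text refers to is ours; the function names are illustrative.

```python
import cmath


def legendre(k, p):
    """Legendre symbol (k/p) for an odd prime p, via Euler's criterion."""
    r = pow(k, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def gauss_sum(p):
    """Quadratic Gauss sum: sum of (k/p) * w^k for k = 1..p-1, w = exp(2*pi*i/p)."""
    w = cmath.exp(2j * cmath.pi / p)
    return sum(legendre(k, p) * w ** k for k in range(1, p))


def p_star(p):
    """(-1/p) * p, i.e. p if p = 1 (mod 4) and -p if p = 3 (mod 4)."""
    return p if p % 4 == 1 else -p


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        g = gauss_sum(p)
        # the last column should be True up to rounding error
        print(p, p_star(p), abs(g * g - p_star(p)) < 1e-9)
```

Since $g$ is a $\mathbb{Z}$-linear combination of powers of $\omega_p$, the check $g^2 = p^*$ exhibits a square root of $p^*$ inside $\mathbb{Q}(\omega_p)$.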


Exercise 6.5.28†. Prove the Gauss and Lucas formulas: given an odd squarefree integer $n > 1$, there exist polynomials $A_n, B_n, C_n, D_n \in \mathbb{Z}[X]$ such that
$$4\Phi_n = A_n^2 - (-1)^{\frac{n-1}{2}}nB_n^2 = C_n^2 - (-1)^{\frac{n-1}{2}}nXD_n^2.$$

Deduce that, given any non-zero rational number $r$, there are infinitely many pairs of distinct rational primes $(p, q)$ such that $r$ has the same order modulo $p$ and modulo $q$.

Solution
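Let $\omega$ be a primitive $n$th root of unity and set $n^* = \left(\frac{-1}{n}\right)n = (-1)^{\frac{n-1}{2}}n$. Notice that the expression $A^2 - n^*B^2$ is a norm in $\mathbb{Q}(\sqrt{n^*})[X]$. Exercise 6.5.27 tells us that $\mathbb{Q}(\omega)$ contains $\mathbb{Q}(\sqrt{n^*})$, so we just need to write $\Phi_n$ in the form $UV$ where $U, V$ are polynomials conjugate in $\mathbb{Q}(\sqrt{n^*})[X]$. This is easy: in $\mathbb{Q}(\omega)/\mathbb{Q}(\sqrt{n^*})$, we can see that the conjugates of $\omega$ are the $\omega^k$ for $\left(\frac kn\right) = 1$. Indeed, $\sigma_k : \omega \mapsto \omega^k$ fixes $\sqrt{p^*}$ iff $\left(\frac kp\right) = 1$ and negates it otherwise. Since $\left(\frac kn\right) = \prod_{p \mid n}\left(\frac kp\right)$, $\sigma_k$ negates an even number of $\sqrt{p^*}$, i.e. fixes $\sqrt{n^*} = \prod_{p \mid n}\sqrt{p^*}$, iff $\left(\frac kn\right) = 1$. This means that
$$\mathrm{Gal}(\mathbb{Q}(\omega)/\mathbb{Q}(\sqrt{n^*})) = \left\{\sigma_k \;\middle|\; \left(\tfrac kn\right) = 1\right\}.$$
Thus, we can write
$$\Phi_n = \prod_{\left(\frac kn\right) = 1}(X - \omega^k)\prod_{\left(\frac kn\right) = -1}(X - \omega^k)$$
as wanted. Note that the ring of integers of $\mathbb{Q}(\sqrt{n^*})$ is $\mathbb{Z}\left[\frac{1+\sqrt{n^*}}{2}\right]$ so the coefficients of $A$ and $B$ are in $\frac12\mathbb{Z}$, which gives us the factor of $4$ on the left as wanted. We also have the formula
$$A_n + B_n\sqrt{n^*} = \prod_{\left(\frac kn\right) = 1}(X - \omega^k),$$
which can be used to derive the explicit formulas
$$2A_n = \prod_{\left(\frac kn\right) = 1}(X - \omega^k) + \prod_{\left(\frac kn\right) = -1}(X - \omega^k)$$
and
$$2B_n\sqrt{n^*} = \prod_{\left(\frac kn\right) = 1}(X - \omega^k) - \prod_{\left(\frac kn\right) = -1}(X - \omega^k).$$

For Lucas’s formula, we wish to have
$$\Phi_n(X)\Phi_n(-X) = \Phi_n(X^2) = C_n(X^2)^2 - n^*(XD_n(X^2))^2.$$
For this consider the equality
$$U_n + V_n\sqrt{n^*} = (A_n(X) + B_n(X)\sqrt{n^*})(A_n(-X) - B_n(-X)\sqrt{n^*})$$
$$= (A_n(X)A_n(-X) - n^*B_n(X)B_n(-X)) + \sqrt{n^*}(A_n(-X)B_n(X) - A_n(X)B_n(-X)).$$
Note that $U_n(-X) = U_n(X)$ so $U_n = C_n(X^2)$ for some $C_n$, while $V_n(-X) = -V_n(X)$ so $V_n = XD_n(X^2)$ for some $D_n$. These $C_n$ and $D_n$ are the ones we were looking for.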
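To make the Gauss formula concrete, here is a small check of two classical instances, $4\Phi_5 = (2X^2 + X + 2)^2 - 5X^2$ and $4\Phi_7 = (2X^3 + X^2 - X - 2)^2 + 7(X^2 + X)^2$. The polynomials $A_5, B_5, A_7, B_7$ are supplied by hand rather than computed from the Galois-theoretic construction, and the helper names are illustrative; the sketch only handles prime $n$, where $\Phi_n = 1 + X + \ldots + X^{n-1}$.

```python
def poly_mul(f, g):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def poly_sub(f, g):
    """Subtract g from f, padding the shorter list with zeros."""
    size = max(len(f), len(g))
    f = f + [0] * (size - len(f))
    g = g + [0] * (size - len(g))
    return [a - b for a, b in zip(f, g)]


def gauss_identity_holds(n, A, B):
    """Check 4*Phi_n == A^2 - n_star*B^2 for an odd prime n."""
    n_star = n if n % 4 == 1 else -n
    lhs = [4] * n  # 4 * (1 + X + ... + X^(n-1))
    rhs = poly_sub(poly_mul(A, A), [n_star * c for c in poly_mul(B, B)])
    while len(rhs) > 1 and rhs[-1] == 0:  # drop trailing zero coefficients
        rhs.pop()
    return lhs == rhs


if __name__ == "__main__":
    print(gauss_identity_holds(5, [2, 1, 2], [0, 1]))            # A = 2 + X + 2X^2, B = X
    print(gauss_identity_holds(7, [-2, -1, 1, 2], [0, 1, 1]))    # A = 2X^3 + X^2 - X - 2, B = X + X^2
```

Both calls compare exact integer coefficient lists, so no rounding is involved.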

Finally, we prove that, for any non-zero $r = a/b \in \mathbb{Q}$, there are infinitely many integers $n$ such that (the numerator of) $\Phi_n(a, b)$ has at least two distinct prime factors, unless $r = \pm 1$ but these $r$ have the same order modulo any odd prime. One piece of notation: by multiplying the equality $4\Phi_n(X/Y) = C_n(X/Y)^2 - n^*(X/Y)D_n(X/Y)^2$ by $Y^{\varphi(n)}$, we get an equality of the form
$$4\Phi_n(X, Y) = C_n(X, Y)^2 - n^*XYD_n(X, Y)^2.$$

Without loss of generality, $a$ and $b$ are coprime and $b > 0$. First, we treat the case where $ab$ has even dyadic valuation. If we restrict ourselves to odd $n$, we can also assume that $a$ is positive, since we then have $\Phi_{2n}(a, b) = \Phi_n(-a, b)$. Thus, suppose that $a$ and $b$ are positive and coprime and let $m$ be the squarefree part of $ab$. Suppose that $m \neq 1$. Then, if $p$ is a prime factor of $m$, we have
$$4\Phi_{p^km}(a, b) = 4\Phi_m(a^{p^k}, b^{p^k}) = C_m(a^{p^k}, b^{p^k})^2 - m(ab)^{p^k}D_m(a^{p^k}, b^{p^k})^2.$$
Here is the magic: $m(ab)^{p^k}$ is a perfect square so this is a difference of two squares which factorises! It remains to prove that the two factors are not both of the form $2q^\ell$ for some prime $q$. Indeed, we can ensure that all prime factors $q$ of $\Phi_{p^km}(a, b)$ are such that $a/b$ has order $p^km$ modulo $q$: the only other possible prime factors are common divisors of $a$ and $b$, of which there are none, and prime factors of $m \mid ab$, which also implies that they are common divisors of $a$ and $b$. Finally, since $\deg C_n > \deg D_n$, the two factors are asymptotically equivalent (the quotient goes to $1$), so if they both had the form $2q^\ell$, they would need to be equal for large $k$. This is of course impossible. When $m = 1$, we can consider the equation $\Phi_3 = X^2 + X + 1 = (X + 1)^2 - X$ to get a difference of squares in the same way (replace $p$ by $3$) and the same conclusion applies.

It only remains to treat the case where $ab$ has odd dyadic valuation. In that case, we shall derive a formula of the form
$$\Phi_{2n} = C_n(X)^2 - nXD_n(X)^2$$
for any squarefree even $n$. It is clear that the above argument will work as before as long as the squarefree part $m$ of $ab$ has an odd prime factor $p$, i.e. is not equal to $2$ since it’s even by assumption. We can simply consider $\Phi_{12} = X^4 - X^2 + 1 = (X^2 + X + 1)^2 - 2X(X + 1)^2$ in that case (and raise $X$ to the power $3^k$ for large $k$).

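Hence, it suffices to show that there exist such polynomials $C_n$ and $D_n$ for even squarefree $n$. Without loss of generality, we can assume that $n$ is positive since $\Phi_{2n}(X) = \Phi_n(X^2) = \Phi_{2n}(-X)$. Our formula is equivalent to $\Phi_{4n}(X) = C_n(X^2)^2 - n(XD_n(X^2))^2$. We will now proceed as before. Let $\omega$ be a primitive $4n$th root of unity. By Exercise 6.5.27, $\mathbb{Q}(\omega)$ contains $\mathbb{Q}(\sqrt n)$ and we have
$$\mathrm{Gal}(\mathbb{Q}(\omega)/\mathbb{Q}(\sqrt n)) = \left\{\sigma_k \;\middle|\; \left(\tfrac kn\right) = 1\right\},$$
by definition of the Kronecker symbol $\left(\frac kn\right) = \left(\frac{k}{n/2}\right)\left(\frac k2\right)$, where $\left(\frac k2\right) = (-1)^{\frac{k^2-1}{8}}$. Indeed, it is easy to see that $\sigma_k$ fixes $\sqrt2$ iff $\left(\frac k2\right) = 1$, and the rest follows from a parity argument as before. Thus, the polynomials $U_n$ and $V_n$ defined by
$$U_n + \sqrt nV_n = \prod_{\left(\frac kn\right) = 1}(X - \omega^k)$$
satisfy $\Phi_{4n} = U_n^2 - nV_n^2$. We wish to show that $U_n$ is a polynomial in $X^2$ and $V_n$ is $X$ times a polynomial in $X^2$, i.e. that $U_n$ is even and $V_n$ odd. This follows from the equalities
$$2U_n = \prod_{\left(\frac kn\right) = 1}(X - \omega^k) + \prod_{\left(\frac kn\right) = -1}(X - \omega^k)$$
and
$$2\sqrt nV_n = \prod_{\left(\frac kn\right) = 1}(X - \omega^k) - \prod_{\left(\frac kn\right) = -1}(X - \omega^k).$$

Indeed, we have
$$\prod_{\left(\frac kn\right) = \pm1}(-X - \omega^k) = \prod_{\left(\frac kn\right) = \pm1}(X + \omega^k) = \prod_{\left(\frac kn\right) = \pm1}(X - \omega^{k+2n}) = \prod_{\left(\frac kn\right) = \mp1}(X - \omega^k)$$
since $\left(\frac{k+2n}{p}\right) = \left(\frac kp\right)$ for any odd prime $p \mid n$ but $\left(\frac{k+2n}{2}\right) = -\left(\frac k2\right)$ as $2n \equiv 4 \pmod 8$. This concludes the proof.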


Remark 6.5.4
Schinzel has generalised our identities to give a wide class of cyclotomic polynomials with at least
two distinct prime factors. See [38].

Miscellaneous
Exercise 6.5.30†. Let $f \in \mathbb{Q}[X]$ be an irreducible polynomial of degree at least $2$ with exactly one real root. Prove that the real parts of its non-real roots are all irrational.

Solution

Let $\alpha$ be the real root and let $\beta$ be any non-real root. Suppose for the sake of contradiction that $2\Re(\beta) = \beta + \overline{\beta}$ is rational. Let $\sigma$ be an embedding of $\mathbb{Q}(\beta, \overline{\beta})$ sending $\beta$ to $\alpha$. Then, $\alpha + \sigma(\overline{\beta}) = 2\Re(\beta)$ since $2\Re(\beta)$ is rational so fixed by $\sigma$, which implies that $\sigma(\overline{\beta})$ is real. Since $\alpha$ is the only real root by assumption, we get $\sigma(\overline{\beta}) = \alpha$ which implies that $\alpha = \Re(\beta)$ is rational. This is a contradiction since $f$ is irreducible of degree at least $2$.


Remark 6.5.5
It is not true at all in general that embeddings commute with complex conjugation. For instance, over $\mathbb{Q}(\sqrt[3]{2}, j)$, the embedding $\sigma : \sqrt[3]{2} \mapsto \sqrt[3]{2}j,\ j \mapsto j$ sends $\sqrt[3]{2}j^2$ to $\sqrt[3]{2}$, and $\sqrt[3]{2}j$ to $\sqrt[3]{2}j^2$, which are not complex conjugate.

Exercise 6.5.31† . Let K be a number field of degree n. Prove that there are elements α1 , . . . , αn of
K such that
OK ⊆ α1 Z + . . . + αn Z.
By showing that any submodule of a Z-module generated by n elements is also generated by n elements,
deduce that OK has an integral basis, i.e. elements β1 , . . . , βn such that

OK = β1 Z + . . . + βn Z.

Solution
Let $\alpha \in \mathcal{O}_K$ be a primitive element for $K$. Suppose that $\sum_{i=0}^{n-1} a_i\alpha^i \in \overline{\mathbb{Z}}$ for some $a_i \in \mathbb{Q}$. We will prove that the denominators of the $a_i$ are bounded. Set $\mathrm{Emb}(K) = \{\sigma_1, \ldots, \sigma_n\}$. Consider the following system of equations:
$$\begin{cases} a_0 + a_1\sigma_1(\alpha) + \ldots + a_{n-1}\sigma_1(\alpha)^{n-1} = A_1\\ a_0 + a_1\sigma_2(\alpha) + \ldots + a_{n-1}\sigma_2(\alpha)^{n-1} = A_2\\ \qquad\vdots\\ a_0 + a_1\sigma_n(\alpha) + \ldots + a_{n-1}\sigma_n(\alpha)^{n-1} = A_n\end{cases}$$
for some $A_1, \ldots, A_n \in \overline{\mathbb{Z}}$. This can be written in matrix form as
$$\begin{pmatrix} 1 & \sigma_1(\alpha) & \cdots & \sigma_1(\alpha)^{n-1}\\ 1 & \sigma_2(\alpha) & \cdots & \sigma_2(\alpha)^{n-1}\\ \vdots & \vdots & \ddots & \vdots\\ 1 & \sigma_n(\alpha) & \cdots & \sigma_n(\alpha)^{n-1}\end{pmatrix}\begin{pmatrix} a_0\\ a_1\\ \vdots\\ a_{n-1}\end{pmatrix} = \begin{pmatrix} A_1\\ A_2\\ \vdots\\ A_n\end{pmatrix}.$$
Write this equation as $Ma = A \iff a = M^{-1}A$. Then, by Exercise C.3.19∗, we know that the $a_i$ are linear combinations of algebraic integers divided by $\det M$. However,
$$D := \det(M)^2 = \pm\prod_{i \neq j}(\sigma_i(\alpha) - \sigma_j(\alpha)) \in \mathbb{Z}$$
by Vandermonde, so that $Da_i \in \overline{\mathbb{Z}}$ for any $i$, i.e. $Da_i \in \mathbb{Z}$ since $Da_i \in \mathbb{Q}$. (This is one of the applications of the exact value of the Vandermonde determinant promised in Remark C.3.3!) This shows that
$$\mathcal{O}_K \subseteq \frac1D\mathbb{Z} + \ldots + \frac{\alpha^{n-1}}{D}\mathbb{Z}$$
as wanted.

It remains to prove the second part. We proceed by induction on the number of generators $n$ (it is trivial when $n = 0$). Suppose that $M = \sum_{i=1}^n \alpha_i\mathbb{Z}$ and $N \subseteq M$ is a submodule. Define $M'$ as $\sum_{i=2}^n \alpha_i\mathbb{Z}$, and $N'$ as $N \cap M'$. Using the inductive hypothesis, set $N' = \beta_2\mathbb{Z} + \ldots + \beta_n\mathbb{Z}$. Now, consider the set $A$ of rational integers $k$ such that $k\alpha_1 + \beta \in N$ for some $\beta \in M'$. Since $N$ is a $\mathbb{Z}$-module, $A$ is an additive subgroup of $\mathbb{Z}$ which thus has the form $b_1\mathbb{Z}$ for some $b_1$. (Indeed, pick the smallest positive element $b_1 \in A$ and consider the remainder of the Euclidean division of $k \in A$ by $b_1$ to show that $b_1 \mid k$, as otherwise $A$ would contain a smaller positive element.)

Set
$$\beta_1 = b_1\alpha_1 + b_2\alpha_2 + \ldots + b_n\alpha_n \in N$$
for some $b_2, \ldots, b_n$. If $\alpha = \sum_{i=1}^n a_i\alpha_i$ is an element of $N$, we have $a_1 \in A$ so $a_1 = kb_1$ for some $k$ and thus $\alpha - k\beta_1 \in N'$. Hence,
$$N = \beta_1\mathbb{Z} + N' = \beta_1\mathbb{Z} + \beta_2\mathbb{Z} + \ldots + \beta_n\mathbb{Z}$$
as wanted.


Exercise 6.5.32† . Let f ∈ Q[X] be an irreducible polynomial of prime degree p and denote its roots
by α0 , . . . , αp−1 . Suppose that
λ0 α0 + . . . + λp−1 αp−1 ∈ Q
for some rational λi . Prove that λ0 = . . . = λp−1 .

Solution

Let $K = \mathbb{Q}(\alpha_0, \ldots, \alpha_{p-1})$. Since $p = [\mathbb{Q}(\alpha_0) : \mathbb{Q}]$ divides $[K : \mathbb{Q}] = |\mathrm{Gal}(K/\mathbb{Q})|$, the group $\mathrm{Gal}(K/\mathbb{Q})$ has an element $\sigma$ of order $p$ by Cauchy’s theorem 6.3.3. By relabelling the $\alpha_i$, suppose without loss of generality that $\sigma$ sends $\alpha_k$ to $\alpha_{k+1}$ (indices modulo $p$). Let $S = \sum_{i=0}^{p-1}\lambda_i\alpha_i \in \mathbb{Q}$. Then, by applying $\sigma$ to the equation $\sum_{i=0}^{p-1}\lambda_i\alpha_i = S$ multiple times, we get the following system of equations in the $\alpha_i$:
$$\begin{cases}\lambda_0\alpha_0 + \ldots + \lambda_{p-1}\alpha_{p-1} = S\\ \lambda_0\alpha_1 + \ldots + \lambda_{p-1}\alpha_0 = S\\ \qquad\vdots\\ \lambda_0\alpha_{p-1} + \ldots + \lambda_{p-1}\alpha_{p-2} = S.\end{cases}$$

Since this system has a non-trivial solution (the trivial one being $\alpha_0 = \ldots = \alpha_{p-1}$), its determinant must be zero. By Exercise C.5.8†, this circulant determinant is
$$g(\omega)g(\omega^2)\cdot\ldots\cdot g(\omega^{p-1})$$
where $\omega$ is a primitive $p$th root of unity and $g = \sum_{i=0}^{p-1}\lambda_iX^i$. Thus, $g(\omega^k) = 0$ for some $k \in [p-1]$. Since $p$ is prime, the $\omega^k$ for $k \in [p-1]$ are all conjugate with minimal polynomial $\Phi_p = 1 + \ldots + X^{p-1}$. Thus, $\Phi_p \mid g$. Since $\deg\Phi_p \ge \deg g$, we have $g = \lambda\Phi_p$ for some $\lambda \in \mathbb{Q}$. This yields $\lambda_0 = \ldots = \lambda_{p-1} = \lambda$ as wanted. (Conversely, all such $\lambda_i$ work since the sum of the conjugates of an algebraic number is rational.)

Exercise 6.5.33† (TFJM 2019). Let $N$ be an odd integer. Prove that there exist infinitely many rational primes $p \equiv 1 \pmod N$ such that $x \mapsto x^{n+1} + x$ is a bijection of $\mathbb{F}_p$, where $n = \frac{p-1}{N}$.

Solution

The idea is that, if $\omega + 1$ is an $N$th power in $\mathbb{F}_p$ for all $N$th roots of unity $\omega \in \mathbb{F}_p$, then $f : x \mapsto x^{n+1} + x$ is a bijection. Indeed, this means that $(\omega + 1)^n = (\omega + 1)^{\frac{p-1}{N}} = 1$ for any such $\omega$. Then, if $x^{n+1} + x = y^{n+1} + y$, by raising the equation to the $n = \frac{p-1}{N}$th power, since $x^n$ and $y^n$ are $N$th roots of unity, we get
$$x^n = y^n$$
as $(x^n + 1)^n = (y^n + 1)^n = 1$. But then, our original equation becomes
$$x(1 + x^n) = y(1 + y^n)$$
so $x = y$ or $x^n = -1$. However, the latter implies $(-1)^N = x^{p-1} = 1$ which is impossible since $N$ is odd by assumption.
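The implication just proved (every $\omega + 1$ is an $N$th power, hence the map is a bijection) can be tested by brute force on small primes. The sketch below is only an experiment, with illustrative function names; it uses the fact that a non-zero $c \in \mathbb{F}_p$ is an $N$th power iff $c^{(p-1)/N} = 1$, since $\mathbb{F}_p^\times$ is cyclic.

```python
def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def is_bijection(p, N):
    """Is x -> x^(n+1) + x a bijection of F_p, where n = (p-1)/N?"""
    n = (p - 1) // N
    return len({(pow(x, n + 1, p) + x) % p for x in range(p)}) == p


def criterion(p, N):
    """Is w + 1 an N-th power in F_p for every N-th root of unity w in F_p?"""
    n = (p - 1) // N
    roots = [w for w in range(1, p) if pow(w, N, p) == 1]
    return all(pow(w + 1, n, p) == 1 for w in roots)


def good_primes(N, bound):
    """Primes p = 1 (mod N) below `bound` satisfying the criterion."""
    return [p for p in range(3, bound)
            if is_prime(p) and p % N == 1 and criterion(p, N)]


if __name__ == "__main__":
    for p in good_primes(3, 400):
        print(p, is_bijection(p, 3))
```

Every prime printed should come with `True`; the converse is not claimed by the argument and is not tested here.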

Thus, we are done if we find infinitely many $p$ such that $\omega + 1$ is an $N$th power in $\mathbb{F}_p$ for any $N$th root of unity $\omega \in \mathbb{F}_p$. Let $\zeta \in \mathbb{C}$ be a complex primitive $N$th root of unity. Consider the polynomial
$$f = \prod_{i,j}\left(X - \zeta^i\sqrt[N]{\zeta^j + 1}\right),$$
which has integer coefficients by the fundamental theorem of symmetric polynomials. Then, any $p \in \mathcal{P}_{\mathrm{split}}(f)$ which divides neither $N$ nor $f(0)$ works (the idea is that $\sqrt[N]{\omega + 1}$ will exist in $\mathbb{F}_p$ for such a $p$). Let $p$ be such a prime, and let $\alpha_1, \ldots, \alpha_m$ be the roots of $f$ in $\mathbb{F}_p$. We will prove that, for any $N$th root of unity $\omega \in \overline{\mathbb{F}}_p$, $\omega + 1$ is the $N$th power of an element of $\mathbb{F}_p$ (in particular, $\omega + 1 \in \mathbb{F}_p$). Note that $\omega + 1 \neq 0$ since $p \nmid f(0)$. By the fundamental theorem of symmetric polynomials, we have
$$\prod_\omega\prod_k\left(\omega + 1 - \alpha_k^N\right) \equiv \prod_k\prod_{i,j}\left(\zeta^k + 1 - \left(\zeta^i\sqrt[N]{\zeta^j + 1}\right)^N\right) = 0$$
as wanted.


Exercise 6.5.34†. Let $f \in \mathbb{C}(X)$ be a rational function, and suppose $f$ sends rational integers to algebraic integers. Prove that $f$ is a polynomial.

Solution

By linear algebra, $f$ has coefficients in a number field $K$ (which we will assume without loss of generality to be Galois). Indeed, consider the system of linear equations in the coefficients of its numerator $g$ and denominator $h$
$$g(n_1) = \alpha_1h(n_1),\ \ldots,\ g(n_k) = \alpha_kh(n_k)$$
for $n_1, \ldots, n_k \in \mathbb{Z}$ and $\alpha_1, \ldots, \alpha_k \in \overline{\mathbb{Z}}$. It has a solution in some finite-dimensional $K := \mathbb{Q}(\alpha_1, \ldots, \alpha_k)$-vector space $V$ (the one generated by the coefficients), and thus also in $K$, for instance by considering a basis $1, v_1, \ldots, v_m$ of $V$ and looking at the coefficient of $1$. Next, notice that for $k > \deg g + \deg h$, these conditions completely determine $f$: if $\frac{g(n)}{h(n)} = \frac{u(n)}{v(n)}$ for $\deg g + \deg h + 1$ values of $n$ and some polynomials $u$ and $v$ of same degrees as $g$ and $h$, the polynomial $gv - hu$ has more roots than its degree so must be identically zero.

Now, consider its conjugates

(f1 , . . . , fk ) = (σ1 (f ), . . . , σk (f )),

where {σ1 , . . . , σk } = Gal(K/Q). By assumption, for any i,

ei (f1 , . . . , fk )

takes infinitely many values which are rational integers at rational integers. Thus, it is a poly-
nomial by Exercise 5.5.1† . This implies that f is integral over the ring of polynomials C[X]: it’s
a root of the monic polynomial

(Y − f1 ) · . . . · (Y − fk ) ∈ C[X][Y ].

However, it is also rational over C[X] since it is in C(X). Thus, an analogue of Proposition 1.1.1
shows that it must be in C[X], as a rational integral (over C[X]) element of C(X) (C[X] is a
UFD so the same proof as Proposition 1.1.1 works).

Chapter 7

Units in Quadratic Fields and Pell’s Equation

7.1 Fundamental Unit


Exercise 7.1.1∗ . Prove that α is invertible if and only if its norm is ±1.

Solution

If α is invertible then so are its conjugates σi (α) since αβ = 1 transforms into σi (α)σi (β) = 1.
Thus, so is the product of its conjugates, i.e. its norm. But we have seen that the only invertible
rational integers are ±1. Conversely, if N(α) = ±1 then α times ± the product of its other
conjugates is 1 so α is invertible.


7.2 Pell-Type Equations


Exercise 7.1.2∗ . Prove Proposition 7.1.1.

Solution

We have already shown that these were the only units of $\mathbb{Q}(i)$ and $\mathbb{Q}(j)$ in Chapter 2. The units of $\mathbb{Q}(\sqrt{-d})$ are the elements $a + b\sqrt{-d}$ of $\mathcal{O}_{\mathbb{Q}(\sqrt{-d})}$ satisfying $N(a + b\sqrt{-d}) = a^2 + db^2 = 1$ since the norm is positive so cannot be $-1$. For $d = 2$, there are only the trivial solutions $\pm1$ since $\mathcal{O}_{\mathbb{Q}(\sqrt{-2})} = \mathbb{Z}[\sqrt{-2}]$ and $|b| \ge 1$ implies $a^2 + 2b^2 \ge 2$ while $|a| \ge 2$ implies $a^2 + 2b^2 \ge 4$. If $d \ge 5$ ($d = 4$ is not squarefree), $a$ and $b$ are both half integers so $a^2 + db^2 \ge \frac54 > 1$ if $|b| \ge \frac12$, which implies $b = 0$ from which we get $a = \pm1$, corresponding to the units $\pm1$.


Exercise 7.2.1∗. Prove that $\mathcal{O}_{\mathbb{Q}(\sqrt u)}/\beta\mathcal{O}_{\mathbb{Q}(\sqrt u)}$ is finite if $\beta \neq 0$.

Solution

Let $\alpha$ be such that $\mathcal{O} := \mathcal{O}_{\mathbb{Q}(\sqrt u)} = \mathbb{Z}[\alpha]$ (see Proposition 2.1.1). Note that, when $\beta = m$ is a rational integer, $\mathcal{O}/m\mathcal{O}$ has exactly $m^2 = N(m)$ elements since $m \mid a + b\alpha$ iff $m \mid a, b$ (by definition, $c + d\alpha$ is an algebraic integer for rational $c, d$ iff $c$ and $d$ are integers).

For the general case, note that $\mathcal{O}/\beta\mathcal{O}$ is a quotient of $\mathcal{O}/N(\beta)\mathcal{O}$ since $\beta \mid N(\beta)$, so the former has a finite number of elements too.


7.3 Størmer’s Theorem


Exercise 7.3.1∗ . Prove that ym | yn iff m | n.

Solution

This is exactly the same as Exercise 4.3.1∗ but in number fields. Write $y_n = \frac{\alpha^n - \beta^n}{2\sqrt d}$ with $\alpha, \beta \in \mathcal{O}_{\mathbb{Q}(\sqrt d)}$. Note that $y_m \mid y_n$ iff $\alpha^m - \beta^m \mid \alpha^n - \beta^n$ which is equivalent to $\alpha/\beta = \alpha^2$ having order dividing $n$ modulo $\alpha^m - \beta^m$. Since its order is exactly $m$, this is equivalent to $m \mid n$.


Remark 7.3.1
This solution also works for the more general problem where α, β are in any number field K, but
the difficulty lies in showing that OK mod γ is finite for any non-zero γ ∈ OK , so that talking
about the order makes sense. This follows for instance from Exercise 6.5.31† .

7.4 Units in Complex Cubic Fields and Kobayashi’s Theorem


Exercise 7.4.1. Why does looking at the $(2^k)^2$ Pell-type equations $ax^2 - by^2 = k$ for squarefree integral $S$-units $a, b$ not prove that $u - v = k$ has finitely many integral $S$-unit solutions?

Solution

It doesn’t work because there is no reason for it to work, we were simply lucky before. Indeed, the solutions of Proposition 7.2.4 are very messy: they have the form $\alpha^k\beta_i$ for some $\alpha$ and elements $\beta_1, \ldots, \beta_n$. The problem is the additional factor: if we look at the $\sqrt d$ part we get something of the form
$$y_n = a\sum_k\binom{n}{2k+1}x^{n-2k-1}y^{2k}d^k + b\sum_k\binom{n}{2k}x^{n-2k}y^{2k}d^k$$
which has too many terms to work with. In particular, we can’t even restrict ourselves to the $n = p$ prime case since we do not necessarily have $y_n \mid y_m$ when $n \mid m$.


Exercise 7.4.2∗ . Prove Theorem 7.4.2 in the case where a/b is a rational cube.

Solution

Write $au^3 = bv^3 = c$ with non-zero $u, v \in \mathbb{Z}$. Then, $ax^3 + by^3 = k$ becomes
$$(xv)^3 + (yu)^3 = ku^3v^3/c$$
so it suffices to consider the case where $a = b = 1$. We have $x + y \mid x^3 + y^3 = k$, say $x + y = d$. It suffices to show that there are finitely many solutions for a fixed $d$ since $k$ has finitely many divisors. In fact we will prove that there is at most one solution for a fixed $d$. We have
$$x^2 - xy + y^2 = \frac{k}{x+y} = \frac kd := d'.$$

Thus, $3xy = (x + y)^2 - (x^2 - xy + y^2) = d^2 - d'$. Hence, $x$ and $y$ are roots of
$$X^2 - dX + \frac{d^2 - d'}{3}$$
by Vieta’s formulas which implies that there is at most one (unordered) pair of solutions as wanted.



Exercise 7.4.3∗. Prove that the only roots of unity of $\mathbb{Q}(\sqrt[3]{d}, j)$ are $\pm1$, $\pm j$ and $\pm j^2$.

Solution

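Note that
$$\mathbb{Q}\left(\exp\left(\frac{2i\pi}{n}\right), \exp\left(\frac{2i\pi}{m}\right)\right) = \mathbb{Q}\left(\exp\left(\frac{2i\pi}{\mathrm{lcm}(m,n)}\right)\right)$$
since the RHS clearly contains the LHS, and $\exp\left(\frac{2i\pi}{\mathrm{lcm}(m,n)}\right) = \exp\left(\frac{2i\pi}{n}\right)^a\exp\left(\frac{2i\pi}{m}\right)^b$ where $am + bn = \gcd(m, n)$ by Bézout. Thus, if $n$ is the greatest order of a root of unity $\omega$ contained in $K = \mathbb{Q}(\sqrt[3]{d}, j)$, then $6 \mid n$ since $-j$ has order $6$.

By Chapter 3, $\omega$ has degree $\varphi(n)$, and since $K$ has degree $6$ we get $\varphi(n) \mid 6$. Together with $6 \mid n$, this implies $n \in \{6, 18\}$ (note that $\varphi(12) = 4$ does not divide $6$). The case $n = 6$ is what we want, so we need to show that $n = 18$ is impossible. For this, note that $\varphi(18) = 6$, so if it were the case then we would have $K = \mathbb{Q}(\omega)$.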

To finish, we shall imitate the solution of Problem 6.3.1 to show that this is impossible, i.e. that the Galois group of $K$ is not abelian. Note that the embeddings of $K$ are
$$\sigma_{(a,b)} : \begin{cases}\sqrt[3]{d} \mapsto j^a\sqrt[3]{d}\\ j \mapsto j^b\end{cases}$$
for $a \in \mathbb{Z}/3\mathbb{Z}$ and $b \in (\mathbb{Z}/3\mathbb{Z})^\times$. Moreover, by Exercise 6.3.6∗, we have
$$\sigma_{(0,-1)} \circ \sigma_{(1,1)} = \sigma_{(-1,-1)}$$
and
$$\sigma_{(1,1)} \circ \sigma_{(0,-1)} = \sigma_{(1,-1)}$$
so $\sigma_{(0,-1)}$ and $\sigma_{(1,1)}$ do not commute, which means that the Galois group of $K$ is not abelian as wanted.


Exercise 7.4.4∗ . Prove that θ/σ(θ) ∈ {±j, ±j 2 } is also impossible.


Solution

• $\theta/\sigma(\theta) = \pm j$ yields
$$x + y\sqrt[3]{d} + z\sqrt[3]{d^2} = \pm j\left(x + yj\sqrt[3]{d} + zj^2\sqrt[3]{d^2}\right)$$
which means $x = 0$ since there’s no $j$ term on the left and $y = 0$ since there’s no $j^2$ term on the left. Thus $\theta = z\sqrt[3]{d^2}$ which is impossible since the norm of $z\sqrt[3]{d^2}$ is $z^3d^2$ which can’t be $1$.

• $\theta/\sigma(\theta) = \pm j^2$ yields
$$x + y\sqrt[3]{d} + z\sqrt[3]{d^2} = \pm j^2\left(x + yj\sqrt[3]{d} + zj^2\sqrt[3]{d^2}\right)$$
which means $x = 0$ since there’s no $j^2$ term on the left and $z = 0$ since there’s no $j$ term on the left. Thus $\theta = y\sqrt[3]{d}$ which is impossible since the norm of $y\sqrt[3]{d}$ is $y^3d$ which can’t be $1$.

7.5 Exercises
Diophantine Equations
Exercise 7.5.1† (ISL 1990). Find all positive rational integers $n$ such that $\frac{1^2 + \ldots + n^2}{n}$ is a perfect square.

Solution

Note that
$$\frac{1^2 + \ldots + n^2}{n} = \frac{(n+1)(2n+1)}{6}.$$
Thus, this is equal to $k^2$ if and only if
$$2n^2 + 3n + 1 = (n+1)(2n+1) = 6k^2,$$
i.e.
$$(4n+3)^2 - 48k^2 = 1.$$
Thus, we want to solve the Pell equation $x^2 - 48y^2 = 1$ with $x \equiv 3 \pmod 4$. The solutions are given by $x = \frac{(7+\sqrt{48})^m + (7-\sqrt{48})^m}{2}$ and it is easy to see that this is congruent to $3$ modulo $4$ exactly when $m$ is odd. Indeed, modulo $4$ it is congruent to
$$\frac{(7^m + m7^{m-1}\sqrt{48}) + (7^m - m7^{m-1}\sqrt{48})}{2} = 7^m \equiv (-1)^m.$$
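As an illustration, the solutions of $x^2 - 48y^2 = 1$ can be generated by repeatedly multiplying by the fundamental solution $7 + \sqrt{48}$, keeping those with $x \equiv 3 \pmod 4$ and recovering $n = (x - 3)/4$. This is a minimal sketch with illustrative function names; it assumes, as the solution does, that every positive solution is a power of $7 + \sqrt{48}$.

```python
def pell_48_solutions(count):
    """Yield the first `count` positive solutions (x, y) of x^2 - 48*y^2 = 1."""
    x, y = 7, 1
    for _ in range(count):
        yield x, y
        # multiply x + y*sqrt(48) by the fundamental solution 7 + sqrt(48)
        x, y = 7 * x + 48 * y, x + 7 * y


def valid_n(count):
    """Values n = (x - 3)/4 coming from the first `count` solutions with x = 3 (mod 4)."""
    result = []
    for x, k in pell_48_solutions(count):
        if x % 4 == 3:
            n = (x - 3) // 4
            # sanity check: (1^2 + ... + n^2)/n = (n+1)(2n+1)/6 equals k^2
            assert (n + 1) * (2 * n + 1) == 6 * k * k
            result.append(n)
    return result


if __name__ == "__main__":
    print(valid_n(5))  # [1, 337, 65521]
```

Only every other solution survives the congruence condition, matching the parity statement above.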



Exercise 7.5.2† (BMO 1 2006). Let $n$ be a rational integer. Prove that, if $2 + 2\sqrt{1 + 12n^2}$ is a rational integer, then it is a perfect square.

Solution
The solutions to the equation $x^2 - 3y^2 = 1$ are given by $x_m = \frac{\alpha^m + \beta^m}{2}$, $y_m = \frac{\alpha^m - \beta^m}{\alpha - \beta}$, where $\alpha$ and $\beta$ are the conjugate fundamental units $2 \pm \sqrt3$. Since $y_1 = 1$ and $y_2 = 4$, $y_m$ is even iff $m$ is. Thus, by assumption, since $\sqrt{1 + 3(2n)^2}$ is an integer, we have $2n = y_{2m}$ and $\sqrt{1 + 12n^2} = x_{2m}$ for some $m$, i.e.
$$2 + 2\sqrt{1 + 12n^2} = 2 + 2\cdot\frac{\alpha^{2m} + \beta^{2m}}{2} = 2\alpha^m\beta^m + \alpha^{2m} + \beta^{2m} = (\alpha^m + \beta^m)^2 = (2x_m)^2.$$
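The statement is easy to test numerically: the sketch below scans small $n$, keeps those for which $1 + 12n^2$ is a perfect square, and records whether $2 + 2\sqrt{1 + 12n^2}$ is itself a square. It is only a finite check with an illustrative function name, not a proof.

```python
from math import isqrt


def check(limit):
    """For 0 <= n < limit with 1 + 12 n^2 a square, return (n, value, is_square)."""
    hits = []
    for n in range(limit):
        s = 1 + 12 * n * n
        r = isqrt(s)
        if r * r == s:
            value = 2 + 2 * r
            root = isqrt(value)
            hits.append((n, value, root * root == value))
    return hits


if __name__ == "__main__":
    for n, value, ok in check(10000):
        print(n, value, ok)
```

For instance $n = 2$ gives $1 + 48 = 49$ and $2 + 2 \cdot 7 = 16 = 4^2$.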


Exercise 7.5.4† (RMM 2011). Let $\Omega(\cdot)$ denote the number of prime factors counted with multiplicity of a rational integer, and define $\lambda(\cdot) = (-1)^{\Omega(\cdot)}$. Prove that there are infinitely many rational integers $n$ such that $\lambda(n) = \lambda(n+1) = 1$ and infinitely many rational integers $n$ such that $\lambda(n) = \lambda(n+1) = -1$.

Solution

For the first part, let $(x, y)$ be a solution to the Pell equation $x^2 - 6y^2 = 1$. Then, $n = 6y^2$ has an even number of prime factors and so does $n + 1 = x^2$.

For the second part, let $(x, y)$ be a solution to the Pell-type equation $3x^2 - 2y^2 = 1$. Then, $n = 2y^2$ has an odd number of prime factors, and so does $n + 1 = 3x^2$. Note that this equation has infinitely many solutions. Indeed, $(x, y) = (1, 1)$ is a solution, and if $(x, y)$ is a solution then so is $(5x + 4y, 6x + 5y)$, which corresponds to multiplying $x\sqrt3 + y\sqrt2$ by the unit $5 + 2\sqrt6$.
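Both constructions can be checked on the first few solutions. The sketch below reads the second equation as giving $n = 2y^2$ and $n + 1 = 3x^2$ (which is what $3x^2 - 2y^2 = 1$ yields), and generates solutions of it from $(1, 1)$ via $(x, y) \mapsto (5x + 4y, 6x + 5y)$; that recurrence and all function names are assumptions of this sketch.

```python
def big_omega(n):
    """Number of prime factors of n counted with multiplicity (trial division)."""
    count = 0
    d = 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    return count + (1 if n > 1 else 0)


def liouville(n):
    """Liouville function lambda(n) = (-1)^Omega(n)."""
    return -1 if big_omega(n) % 2 else 1


def plus_candidates(count):
    """n = 6 y^2 for solutions of x^2 - 6 y^2 = 1, starting from (5, 2)."""
    x, y = 5, 2
    out = []
    for _ in range(count):
        out.append(6 * y * y)
        x, y = 5 * x + 12 * y, 2 * x + 5 * y  # multiply by 5 + 2*sqrt(6)
    return out


def minus_candidates(count):
    """n = 2 y^2 for solutions of 3 x^2 - 2 y^2 = 1, starting from (1, 1)."""
    x, y = 1, 1
    out = []
    for _ in range(count):
        out.append(2 * y * y)
        x, y = 5 * x + 4 * y, 6 * x + 5 * y
    return out


if __name__ == "__main__":
    print([(n, liouville(n), liouville(n + 1)) for n in plus_candidates(4)])
    print([(n, liouville(n), liouville(n + 1)) for n in minus_candidates(4)])
```

The first list should show only pairs of $+1$ and the second only pairs of $-1$; for example $n = 242 = 2 \cdot 11^2$ and $n + 1 = 243 = 3^5$.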


Exercise 7.5.5†. Let $k$ be a rational integer. Prove that there are infinitely many positive integers $n$ such that $n^2 + k \mid n!$.

Solution

By Proposition 7.2.3, the equation $x^2 - dy^2 = -k$ has infinitely many solutions if it has at least one and $d$ isn’t a perfect square. Thus, pick an $r$ such that $d = r^2 + k \ge 5$ is not a perfect square (this is true for sufficiently large $r$ since gaps between consecutive perfect squares are increasing) and consider any solution $(n, m)$ to the equation $n^2 - dm^2 = -k$, which has infinitely many solutions since $(n, m) = (r, 1)$ is one. Finally, note that
$$n^2 + k = dm^2 \mid d \cdot m \cdot 2m \mid n!$$
for sufficiently large $m$: since $n^2 = dm^2 - k$ and $d \ge 5$, the integers $d$, $m$ and $2m$ are then distinct and at most $n$.




Pell-Type Equations
Exercise 7.5.6†. Let $d$ be a rational integer. Solve the equation $x^2 - dy^2 = 1$ over $\mathbb{Q}$.

Solution

We will solve this geometrically! The idea is that we (almost) get a correspondence between the rational points of our conic (the curve $x^2 - dy^2 = 1$), and the rational points of the vertical line $x = 0$. Indeed, if we have a rational point $p$ on the conic, we get a rational point on the vertical line by intersecting it with the line going through $p$ and $(1, 0)$. Conversely, if we have a rational point $q$ on the vertical line and intersect the conic with the line going through $q$ and $(1, 0)$, we get a rational point on the conic.

Let’s make this more explicit. Let $(0, t)$ be a rational point on the vertical line. Then, the line joining $(0, t)$ with $(1, 0)$ is $y = t(1 - x)$. When we intersect this with the conic, we get (for $x \neq 1$)
$$x^2 - dt^2(1 - x)^2 = 1 \iff x + 1 + dt^2(1 - x) = 0 \iff x = \frac{dt^2 + 1}{dt^2 - 1}.$$
From this, we get $y = t(1 - x) = -\frac{2t}{dt^2 - 1}$. Thus, replacing $t$ by $-t$, the solutions are
$$\left\{\left(\frac{dt^2 + 1}{dt^2 - 1}, \frac{2t}{dt^2 - 1}\right) \;\middle|\; t \in \mathbb{Q},\ dt^2 \neq 1\right\} \cup \{(1, 0)\}.$$
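The parametrisation can be checked with exact rational arithmetic. In this sketch, `point` maps a parameter $t$ to the corresponding rational point and `parameter` inverts it via $t = y/(x - 1)$, which follows from $x - 1 = \frac{2}{dt^2 - 1}$; both names are illustrative, and parameters with $dt^2 = 1$ are rejected.

```python
from fractions import Fraction


def point(d, t):
    """Rational point on x^2 - d*y^2 = 1 attached to the parameter t (needs d*t^2 != 1)."""
    t = Fraction(t)
    den = d * t * t - 1
    if den == 0:
        raise ValueError("d * t^2 = 1: the line does not meet the conic a second time")
    return (d * t * t + 1) / den, 2 * t / den


def parameter(x, y):
    """Recover t from a rational point (x, y) != (1, 0) of the form returned by point()."""
    return Fraction(y) / (Fraction(x) - 1)


if __name__ == "__main__":
    d = 2
    for t in (Fraction(0), Fraction(1, 3), Fraction(-5, 7), Fraction(4)):
        x, y = point(d, t)
        print(t, (x, y), x * x - d * y * y == 1, parameter(x, y) == t)
```

Using `Fraction` avoids any rounding, so the equality tests are exact.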

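Exercise 7.5.8†. Prove that the equation $x^2 - 34y^2 = -1$ has no non-trivial solution in $\mathbb{Z}$ despite $-1$ being a square modulo $34$.

Solution

The fundamental unit of $\mathbb{Q}(\sqrt{34})$ is $35 + 6\sqrt{34}$ which has norm $1$. Every unit is, up to sign, a power of it, so every unit has norm $1$ and none has norm $-1$.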
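The fundamental unit can be found by brute force: among units $x + y\sqrt d$ with $x, y \ge 1$ in $\mathbb{Z}[\sqrt d]$, the smallest one has the smallest $y$. The sketch below searches for the least $y \ge 1$ with $dy^2 \pm 1$ a perfect square. It assumes the ring of integers is $\mathbb{Z}[\sqrt d]$ (true for $d = 34 \equiv 2 \pmod 4$), and the function name is illustrative.

```python
from math import isqrt


def smallest_unit(d):
    """Smallest y >= 1 with d*y^2 + norm a perfect square; returns (x, y, norm), norm = +-1."""
    y = 1
    while True:
        for norm in (-1, 1):
            s = d * y * y + norm
            x = isqrt(s)
            if x * x == s:
                return x, y, norm
        y += 1


if __name__ == "__main__":
    print(smallest_unit(34))   # (35, 6, 1)
    print(smallest_unit(2))    # (1, 1, -1), i.e. 1 + sqrt(2) has norm -1
    print(pow(13, 2, 34))      # 33, so -1 is indeed a square modulo 34
```

For $d = 34$ the search finds norm $+1$ first, which is the content of the solution; for $d = 2$ it finds the norm $-1$ unit $1 + \sqrt2$.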


Fundamental Units
Exercise 7.5.10†. Let $d \equiv 1 \pmod 4$ be a squarefree integer, and suppose $\eta = \frac{a + b\sqrt d}{2} \notin \mathbb{Z}[\sqrt d]$ is the fundamental unit of $\mathbb{Q}(\sqrt d)$. Prove that $\eta^n \in \mathbb{Z}[\sqrt d]$ if and only if $3 \mid n$.

Solution

Let’s look at $\mathbb{Z}[\eta] = \mathcal{O}_{\mathbb{Q}(\sqrt d)}$ modulo $2$: there are four elements since $\frac{a + b\eta}{2} \in \mathbb{Z}[\eta]$ iff $\frac a2, \frac b2 \in \mathbb{Z}$. Out of these four, three are invertible modulo $2$ and the last isn’t (it’s $0$): $1 \cdot 1 \equiv 1$, $\eta \cdot \overline{\eta} \equiv 1$, and $\eta + 1 \equiv \overline{\eta}$ which is invertible by the previous equality. Indeed, by assumption $\eta - \overline{\eta} \equiv \eta + \overline{\eta} \equiv 1 \pmod 2$.

In other words, $2$ is prime in $\mathbb{Z}[\eta]$, since an element is either invertible modulo $2$ or divisible by $2$ (note that this relies on the assumption that $\eta \notin \mathbb{Z}[\sqrt d]$; in general, for $d \equiv 1 \pmod 4$, $2$ is prime iff $d \equiv 5 \pmod 8$). Thus, by Fermat’s little theorem in $\mathbb{Z}[\eta]/2\mathbb{Z}[\eta]$ (this is a finite field with $4$ elements, see Theorem 4.2.1), the order of $\eta$ modulo $2$ divides $3$. Since $\eta \not\equiv 1$, its order must be exactly $3$. Finally, we have
$$\eta^n \in \mathbb{Z}[\sqrt d] \iff \frac{\eta^n - \eta^{-n}}{2} \in \overline{\mathbb{Z}} \iff 2 \mid \eta^{2n} - 1 \iff 3 \mid 2n \iff 3 \mid n$$
as wanted.
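Two standard examples are $\eta = \frac{1 + \sqrt5}{2}$ for $d = 5$ and $\eta = \frac{3 + \sqrt{13}}{2}$ for $d = 13$. The sketch below represents $\frac{a + b\sqrt d}{2}$ by the integer pair $(a, b)$ with $a \equiv b \pmod 2$ and lists the exponents $n$ for which $\eta^n$ has both coordinates even, i.e. lies in $\mathbb{Z}[\sqrt d]$. The representation and the function names are choices made for this sketch.

```python
def mul(u, v, d):
    """Product of (a + b*sqrt(d))/2 and (c + e*sqrt(d))/2, again as a pair of numerators.

    Assumes d = 1 (mod 4), a = b (mod 2) and c = e (mod 2), so the halvings are exact.
    """
    a, b = u
    c, e = v
    return (a * c + d * b * e) // 2, (a * e + b * c) // 2


def exponents_in_order(eta, d, max_n):
    """Exponents 1 <= n <= max_n for which eta^n lies in Z[sqrt(d)]."""
    result = []
    current = eta
    for n in range(1, max_n + 1):
        if current[0] % 2 == 0 and current[1] % 2 == 0:
            result.append(n)
        current = mul(current, eta, d)
    return result


if __name__ == "__main__":
    print(exponents_in_order((1, 1), 5, 12))    # eta = (1 + sqrt(5))/2
    print(exponents_in_order((3, 1), 13, 12))   # eta = (3 + sqrt(13))/2
```

In both cases the exponents found are the multiples of $3$; for instance $\eta^3 = 2 + \sqrt5$ when $d = 5$.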


Exercise 7.5.11†. Let $d \neq 1$ be a squarefree rational integer, and suppose that $2^{2n} + 1 = dm^2$ for some integers $n, m \ge 0$. Show that $2^n + m\sqrt d$ is the fundamental unit of $\mathbb{Q}(\sqrt d)$, provided that $d \neq 5$.

Solution

Clearly, $m$ is odd. Suppose for the sake of contradiction that $2^n + m\sqrt d$ is not the fundamental unit of $\mathbb{Q}(\sqrt d)$. Then, $2^n + m\sqrt d$ is the $\ell$th power of an element $\eta \in \mathbb{Q}(\sqrt d)$ for some $\ell \ge 2$. Without loss of generality, we may assume that $\ell = p$ is prime (by replacing $\eta$ by $\eta^{\ell/p}$). First, suppose that $p$ is odd. Suppose also that $\eta = u + v\sqrt d \in \mathbb{Z}[\sqrt d]$, where $u$ and $v$ are positive. Then,
$$2^n + m\sqrt d = (u + v\sqrt d)^p$$
gives us
$$2^n = \sum_k\binom{p}{2k}u^{p-2k}v^{2k}d^k = u\sum_k\binom{p}{2k}u^{p-2k-1}v^{2k}d^k.$$
Thus, $u \mid 2^n$. Notice that the second factor is congruent to $pv^{p-1}d^{\frac{p-1}{2}}$ modulo $u$ and that this is coprime with $u$ as $p$ and $d$ are odd and $u, v$ are coprime since $2^n$ and $m$ are. Thus, either $u = 1$ or $u = 2^n$. The former is impossible since the unit is non-trivial, and the latter as well since the second factor is at least $pv^{p-1}d^{\frac{p-1}{2}} > 1$.

Now, suppose that $\eta = \frac{u + v\sqrt d}{2}$ for some odd $u, v$. Then, we get exactly the same equation as before, but with $n$ replaced by $n + p$:
$$2^{n+p} = \sum_k\binom{p}{2k}u^{p-2k}v^{2k}d^k = u\sum_k\binom{p}{2k}u^{p-2k-1}v^{2k}d^k.$$
(We distinguished the two cases because we want $u$ and $v$ to be coprime for the two factors to be as well.) As before, $u = 1$ or $u = 2^{n+p}$: the latter is still impossible as the second factor is at least $pv^{p-1}d^{\frac{p-1}{2}} > 1$, but now the former could be possible even for a non-trivial unit (the rational part is now $\frac12$ and not $1$). However, it gives $dv^2 \in \{5, -3\}$: the first case is ruled out by the hypothesis and the latter is impossible.

It only remains to settle the case $p = 2$ now. In that case, suppose first that $\eta = u + v\sqrt d \in \mathbb{Z}[\sqrt d]$. Then, we get
$$2^n + m\sqrt d = (u + v\sqrt d)^2 = (u^2 + dv^2) + 2uv\sqrt d$$
and this is impossible since $m$ is odd. Finally, if $\eta = \frac{u + v\sqrt d}{2}$ for some odd $u, v$, we get
$$2^{n+2} + 4m\sqrt d = (u + v\sqrt d)^2 = (u^2 + dv^2) + 2uv\sqrt d$$
which is impossible since $2uv$ is not divisible by $4$.




Remark 7.5.1
We could have also used Carmichael’s theorem from Exercise 4.6.35†: we have
$$2^n = \frac{\alpha^m + \beta^m}{2},$$
with $m$ odd (since $2^n + y\sqrt d$ is not a square), where $\alpha$ and $\beta$ are the conjugate fundamental units of $\mathbb{Z}[\sqrt d]$. Since $m$ is odd, we do not have to consider any exceptions, and we get that $\alpha^m - (-\beta)^m$ has a primitive prime factor $p$ which does not divide $\alpha + \beta$. Since $\alpha + \beta$ is even (it’s twice the rational part of $\alpha$), this implies that $p$ is odd, which is a contradiction since $2^n$ has no odd prime factor. Thus, $2^n + y\sqrt d$ is the fundamental unit of $\mathbb{Z}[\sqrt d]$, but it might not be the fundamental unit of $\mathbb{Q}(\sqrt d)$. The last case we need to consider is when $2^n + y\sqrt d = \eta^3$, where $\eta$ is the fundamental unit of $\mathbb{Q}(\sqrt d)$, by Exercise 7.5.10†.

Exercise 7.5.12†. Suppose that $d = a^2 \pm 1$ is squarefree, where $a \ge 1$ is some rational integer and let $k \ge 0$ be a rational integer. Suppose that the equation $x^2 - dy^2 = m$ has a solution in $\mathbb{Z}$ for some $|m| < ka$. For sufficiently large $d$, prove that $|m|$, $d + m$ or $d - m$ is a square.

Solution

Note that the assumption that $d = a^2 \pm 1$ gives us that $\theta = a + \sqrt d$ is the fundamental unit of $\mathbb{Q}(\sqrt d)$. Indeed, if $x^2 - dy^2 = \pm1$ for some $y \neq 0$, then $x^2 \ge d - 1$ so $x \ge a$.

Suppose $(x, y)$ is a positive solution to $x^2 - dy^2 = m$. By dividing $x + y\sqrt d$ by a suitable power of $\theta$ (this may change the sign of $m$ but doesn’t change its absolute value), we may assume that $1 \le x + y\sqrt d < \theta$. Then,
$$|2y\sqrt d| \le |x + y\sqrt d| + |x - y\sqrt d| < \theta + \frac{|m|}{\theta} < 2a + k + o(1).$$
Thus, $|y| < 1 + o(1)$. For sufficiently large $a$, our inequality forces $|y| = 1$ or $y = 0$. If $|y| = 1$, we get that $m + d$ is a perfect square, and if $y = 0$ we get that $m$ is a perfect square as wanted.


Remark 7.5.2
The argument used in Remark 5.5.1 can be slightly modified to show that, for any choice of ±1,
there exist infinitely many squarefree numbers of the form a2 ± 1.

Exercise 7.5.13†. Solve completely the equation $x^3 + 2y^3 + 4z^3 = 6xyz + 1$ which was seen in Problem 6.2.2.

Solution
Since the norm of $x + y\sqrt[3]{2} + z\sqrt[3]{4}$ is $x^3 + 2y^3 + 4z^3 - 6xyz$ (see Problem 6.2.2), we wish to find units in $K = \mathbb{Q}(\sqrt[3]{2})$. We claim that the fundamental unit of $\mathbb{Q}(\sqrt[3]{2})$ is $\theta = 1 + \sqrt[3]{2} + \sqrt[3]{4}$. Thus, the solutions will be the ones considered in Problem 6.2.2, i.e. $x + y\sqrt[3]{2} + z\sqrt[3]{4} = (1 + \sqrt[3]{2} + \sqrt[3]{4})^n$ for some $n \in \mathbb{Z}$ (the only roots of unity of $\mathbb{Q}(\sqrt[3]{2})$ are $\pm1$ and $-1$ has norm $-1$ so does not work).

We need to show that this unit has minimal absolute value (among the ones greater than $1$). Suppose that there is a unit $\varepsilon := a + b\sqrt[3]{2} + c\sqrt[3]{4}$ with $1 < \varepsilon < \theta < 4$. Let $\sigma$ be a complex embedding of $K$. Then, $|\sigma(\varepsilon)|^2 = \frac1\varepsilon \in \left(\frac14, 1\right)$. Hence, the minimal polynomial $X^3 + uX^2 + vX \pm 1$ of $\varepsilon$ satisfies
$$0 < -u = \varepsilon + \sigma(\varepsilon) + \overline{\sigma(\varepsilon)} < \varepsilon + \frac{2}{\sqrt\varepsilon} < 5$$
and
$$|v| = \left|\varepsilon\left(\sigma(\varepsilon) + \overline{\sigma(\varepsilon)}\right) + |\sigma(\varepsilon)|^2\right| \le 2\sqrt\varepsilon + \frac1\varepsilon < 5.$$
Thus, $u \in [-4, -1]$ and $v \in [-4, 4]$. However, we also have $u = -3a$ and $v = 3(a^2 - 2bc)$. Thus, $u = -3$ and $v = \pm3$, which yield $a = 1$, and $b = c = 1$ or $b = 0$ or $c = 0$. If $b = 0$, then $a^3 + 2b^3 + 4c^3 - 6abc = 1 + 4c^3$ so $c$ must also be $0$, and if $c = 0$ then $a^3 + 2b^3 - 6abc = 1 + 2b^3$ so $b = 0$ or $b = -1$.

Thus, we conclude that $\varepsilon = 1 + \sqrt[3]{2} + \sqrt[3]{4}$ as wanted, or $\varepsilon = 1 - \sqrt[3]{2}$. However, $1 - \sqrt[3]{2} < 1$ so we must be in the first case, as asserted. (In fact, it turns out that $1 - \sqrt[3]{2} = -\frac{1}{1 + \sqrt[3]{2} + \sqrt[3]{4}}$, which is perhaps a neater choice of fundamental unit.)
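The non-negative powers of $\theta$ can be generated with integer arithmetic only. Writing $t = \sqrt[3]{2}$, multiplying $x + yt + zt^2$ by $1 + t + t^2$ and using $t^3 = 2$ gives $(x + 2y + 2z) + (x + y + 2z)t + (x + y + z)t^2$. The sketch below, with illustrative function names, iterates this map and checks that each triple solves the equation; negative powers are not covered.

```python
def norm(x, y, z):
    """Norm of x + y*2^(1/3) + z*4^(1/3), i.e. x^3 + 2y^3 + 4z^3 - 6xyz."""
    return x ** 3 + 2 * y ** 3 + 4 * z ** 3 - 6 * x * y * z


def times_theta(v):
    """Multiply x + y*t + z*t^2 by theta = 1 + t + t^2, where t^3 = 2."""
    x, y, z = v
    return x + 2 * y + 2 * z, x + y + 2 * z, x + y + z


def solutions(count):
    """Coordinates of theta^0, theta^1, ..., theta^(count-1)."""
    v = (1, 0, 0)
    out = []
    for _ in range(count):
        out.append(v)
        v = times_theta(v)
    return out


if __name__ == "__main__":
    for v in solutions(5):
        print(v, norm(*v))
```

The first triples are $(1, 0, 0)$, $(1, 1, 1)$, $(5, 4, 3)$ and $(19, 15, 12)$, each of norm $1$. The inverse unit $\sqrt[3]{2} - 1$ corresponds to $(-1, 1, 0)$, which also has norm $1$.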


Exercise 7.5.14† (Weak Dirichlet’s Unit Theorem). Let K be a number field with r real embeddings
and s pairs of complex embeddings. Prove that there exist units ε1 , . . . , εk with k ≤ r + s − 1 such
that any unit of K can be written uniquely in the form

$$\zeta\varepsilon_1^{n_1}\cdot\ldots\cdot\varepsilon_k^{n_k}$$

for some integers ni and a root of unity ζ.

Solution

Let $\sigma_1, \ldots, \sigma_r$ be the real embeddings of $K$, and $\sigma_{r+1}, \overline{\sigma_{r+1}}, \ldots, \sigma_{r+s}, \overline{\sigma_{r+s}}$ its pairs of complex embeddings and let $U$ be the group of units of $K$. We look at the logarithms of the embeddings of units:
$$L = \{(\log|\sigma_1(\varepsilon)|, \ldots, \log|\sigma_{r+s-1}(\varepsilon)|) \mid \varepsilon \in U\} \subseteq \mathbb{R}^{r+s-1}.$$
We claim that this set is a discrete additive subgroup of $\mathbb{R}^{r+s-1}$, meaning that it’s closed under addition and subtraction, and, for any $x \in \mathbb{R}^{r+s-1}$, there is no sequence of distinct elements of $L$ tending to $x$. To show this, we will prove that, for any $A, B > 0$, there are finitely many units such that $A < |\sigma_i(\varepsilon)| < B$ for $i = 1, \ldots, r + s - 1$. Notice that such a number also satisfies
$$\frac{1}{B^{r+s-1}} < |\sigma_{r+s}(\varepsilon)| < \frac{1}{A^{r+s-1}}$$
since it is a unit (so the product of the $|\sigma_i(\varepsilon)|$ is $1$). Thus, all the conjugates of $\varepsilon$ have bounded absolute value, which implies that its minimal polynomial has bounded coefficients. This shows that there are a finite number of such $\varepsilon$.

Next, we show that any discrete additive subgroup $\Gamma$ of $\mathbb{R}^m$ is a lattice, i.e. admits a linearly independent basis as a $\mathbb{Z}$-module, or in other words, there are $\alpha_1, \ldots, \alpha_k$ such that any element of $\Gamma$ can be written in a unique way as $\sum_{i=1}^k n_i \alpha_i$ with $n_i \in \mathbb{Z}$. Since $\mathbb{R}^m$ has dimension $m$ as an $\mathbb{R}$-vector space, this implies that $k \le m$ by Proposition C.1.2. To show this, pick any maximal set of linearly independent elements $\beta_1, \ldots, \beta_k \in \Gamma$ and let $\Gamma' = \beta_1\mathbb{Z} + \ldots + \beta_k\mathbb{Z}$. We will prove that there are finitely many elements in $\Gamma$ modulo $\Gamma'$, i.e. that $\Gamma/\Gamma'$ is finite, say has $N$ elements. Then, Lagrange's theorem 2.5.1 implies that $N\alpha \in \Gamma'$ for any $\alpha \in \Gamma$, i.e.
\[\Gamma \subseteq \frac{1}{N}\Gamma' = \frac{\beta_1}{N}\mathbb{Z} + \ldots + \frac{\beta_k}{N}\mathbb{Z}.\]
We can then conclude with Exercise 6.5.31† (so many intermediate results!) that $\Gamma$ also has a $\mathbb{Z}$-basis. Thus, it remains to prove that $\Gamma/\Gamma'$ is finite. For this, note that it suffices to prove that $\beta_1[0,1] + \ldots + \beta_k[0,1]$ contains finitely many elements of $\Gamma$, as this is a system of representatives of $\mathbb{R}^m/\Gamma'$. If there were infinitely many elements of $\Gamma$ there, they would have a convergent subsequence by the Bolzano-Weierstrass theorem from Exercise 8.6.8†, contradicting the discreteness of $\Gamma$ (we do not actually need Bolzano-Weierstrass, since we directly showed that there are finitely many elements of $L$ in any interval).

Finally, the previous discussion implies that $L$ has a basis corresponding to the image of $\varepsilon_1, \ldots, \varepsilon_k$ under the logarithmic embedding, for some $k \le r+s-1$. Thus, by exponentiating, we get that, for every unit $\varepsilon$, there are unique integers $n_1, \ldots, n_k$ such that the number
\[\frac{\varepsilon}{\varepsilon_1^{n_1} \cdots \varepsilon_k^{n_k}}\]
has all its conjugates on the unit circle. By Exercise 1.5.26†, this implies that it is a (unique) root of unity $\zeta$ as wanted.

7.5. EXERCISES 365

Remark 7.5.3
There is nothing particularly deep about the logarithm in this proof, apart from the fact that
it transforms multiplication into addition and that we feel more comfortable working with addi-
tion. We could of course transform our additive proof into a multiplicative one by removing the
logarithms and turning addition into multiplication.

Exercise 7.5.15† (Gabriel Dospinescu). Find all monic polynomials f ∈ Q[X] such that f (X n ) is
reducible in Q[X] for all n ≥ 2 but f is irreducible.

Solution

Let α be a root of f and let K be the splitting field of f , i.e. Q(α1 , . . . , αk ) where αi are the roots
of f (the conjugates of α). Note that the statement is equivalent to f (X p ) being reducible for
any prime p. In fact we only need this assumption for infinitely many primes. By Lemma 6.1.1,
f (X p ) is reducible over Q if and only if X p − α is reducible over Q(α), and by Exercise 6.5.10† ,
this is equivalent to α being a pth power in Q(α), and thus in K too.

By looking at the norm of α in K, we see that α must have norm 1 or 0 since its norm is a
pth power in Q for infinitely many p. If α = 0, then f = X since it is irreducible, which works.
Otherwise, α must be a unit. By Exercise 7.5.14†, there are multiplicatively independent units $\varepsilon_1, \ldots, \varepsilon_k \in K$ and integers $n_1, \ldots, n_k$ as well as a root of unity $\zeta$ such that

\[\alpha = \zeta \varepsilon_1^{n_1} \cdots \varepsilon_k^{n_k}.\]

Since ε1 , . . . , εk are multiplicatively independent, the fact that α is a pth power means that p | ni
for every i. For sufficiently large p, we find n1 = . . . = nk = 0. Thus, α is a root of unity.
However, since a primitive mth root of unity has degree ϕ(m) over Q, K contains finitely many
roots of unity (ϕ(m) is greater than [K : Q] for sufficiently large m), which implies ζ = 1 since
it’s a pth power for infinitely many primes p. Thus, we conclude that α = 1. The two solutions
are hence f = X and f = X − 1, which indeed work.


Miscellaneous
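Exercise 7.5.16† (Liouville’s Theorem). Let α be an algebraic number of degree n. Prove that there exists a constant C > 0 such that
\[\left|\alpha - \frac{p}{q}\right| > \frac{C}{q^n}\]
for any p, q ∈ Z (with q > 0).

Solution

Let $\alpha_1, \ldots, \alpha_n$ be the conjugates of α. We have
\[q^n \prod_{i=1}^n \left|\frac{p}{q} - \alpha_i\right| = \prod_{i=1}^n |p - q\alpha_i| \ge 1\]
since it’s a non-zero integer. If $\left|\frac{p}{q} - \alpha\right| < 1$, then $\left|\frac{p}{q} - \alpha'\right| < 1 + |\alpha - \alpha'|$ for any conjugate $\alpha' \ne \alpha$. Thus, in this case we have
\[\left|\frac{p}{q} - \alpha\right| \prod_{i=1}^n \left(1 + |\alpha - \alpha_i|\right) \ge \frac{1}{q^n}\]
as wanted ($C = \frac{1}{\prod_{i=1}^n (1 + |\alpha - \alpha_i|)}$). Otherwise, we have $\left|\frac{p}{q} - \alpha\right| \ge 1 > \frac{C}{q^n}$ too.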
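
To make the constant concrete, here is a small numerical sketch for $\alpha = \sqrt{2}$ (degree 2, conjugates $\pm\sqrt{2}$), using the value of $C$ from the solution. The function name `worst_ratio` and the search range are assumptions made for illustration; floating-point arithmetic is used, so this is an experiment rather than a proof.

```python
from math import sqrt

alpha = sqrt(2)
# C = 1 / prod(1 + |alpha - alpha_i|) over the conjugates sqrt(2) and -sqrt(2);
# the factor coming from alpha itself is equal to 1.
C = 1 / ((1 + abs(alpha - alpha)) * (1 + abs(alpha + alpha)))

def worst_ratio(max_q):
    """Smallest value of q^2 * |alpha - p/q| for 1 <= q <= max_q (p = nearest integer to q*alpha)."""
    best = float("inf")
    for q in range(1, max_q + 1):
        p = round(alpha * q)
        best = min(best, q * q * abs(alpha - p / q))
    return best

print(C, worst_ratio(2000))
```

In the tested range the smallest value of $q^2 |\alpha - p/q|$ stays above $C$, as the exercise predicts.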

Exercise 7.5.17† . Prove that 5n2 ± 4 is a perfect square for some choice of ± if and only if n is a
Fibonacci number.

Solution
Simply note that $\frac{1+\sqrt{5}}{2}$ is the fundamental unit of $\mathbb{Q}(\sqrt{5})$, and that the solutions to the equation $x^2 - 5y^2 = \pm 1$ for $x \equiv y \pmod{1}$ half-integers, i.e. the rational integer solutions to the equation $(2x)^2 - 5(2y)^2 = \pm 4$, are thus
\[2y = \frac{\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n}{\sqrt{5}} = F_n.\]
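
The criterion is easy to test by brute force. The following sketch compares, for $0 \le n \le 10000$, the set of $n$ for which $5n^2 + 4$ or $5n^2 - 4$ is a perfect square with the set of Fibonacci numbers in the same range. The bound and the helper names are arbitrary choices made for illustration.

```python
from math import isqrt

def is_square(m):
    """True when the integer m is a perfect square."""
    return m >= 0 and isqrt(m) ** 2 == m

def fibonacci_upto(limit):
    """Set of Fibonacci numbers F_0, F_1, ... that are at most limit."""
    fibs, a, b = set(), 0, 1
    while a <= limit:
        fibs.add(a)
        a, b = b, a + b
    return fibs

LIMIT = 10_000
fibs = fibonacci_upto(LIMIT)
by_squares = {n for n in range(LIMIT + 1)
              if is_square(5 * n * n + 4) or is_square(5 * n * n - 4)}
print(by_squares == fibs)
```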


Exercise 7.5.18† (ELMO 2020). Suppose n is a Fibonacci number modulo every rational prime.
Must it follow that n is a Fibonacci number?

Solution

By Exercise 7.5.17†, the statement means that, for every p, $5n^2 + 4$ or $5n^2 - 4$ is a quadratic residue (or zero). This implies that $5n^2 + 4$ or $5n^2 - 4$ is a perfect square by an argument similar to Exercise 4.6.21†, i.e. n is a Fibonacci number too. Indeed, if, modulo sufficiently large primes, one of a and b is a quadratic residue, then one of them must be a square (this is not true anymore with 3 numbers, see Exercise 4.6.23†). By Exercise 4.6.21†, we may assume that $a \ne b$.

Suppose without loss of generality that a and b are squarefree. Write $a = \varepsilon 2^r p_1 \cdots p_k$ and $b = \eta 2^s q_1 \cdots q_m$ with $\varepsilon, \eta \in \{-1, 1\}$, $r, s \in \{0, 1\}$, and $p_1, \ldots, p_k, q_1, \ldots, q_m$ odd primes. Let t be a quadratic non-residue modulo $p_1$ (if $k \ge 1$). If a and b are both divisible by an odd prime, say $p_1 = q_1$, then pick a large prime
\[p \equiv 1 \pmod{8 p_2 \cdots p_k q_2 \cdots q_m}\]
and $p \equiv t \pmod{p_1}$. Then, quadratic reciprocity gives us $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right) = -1$ which is a contradiction.

Suppose for the sake of contradiction that $k, m \ge 1$. Then, pick a large prime
\[p \equiv 1 \pmod{8 p_2 \cdots p_k q_2 \cdots q_m},\]
$p \equiv t \pmod{p_1}$ and $p \equiv t' \pmod{q_1}$ where $t'$ is a quadratic non-residue modulo $q_1$ to get $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right) = -1$. Thus, suppose without loss of generality that m = 0. If $k \ge 1$, pick a large prime $p \equiv 8 \pmod{p_1 \cdots p_k}$ and $p \equiv t \pmod{p_1}$ to get $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right) = -1$ again.

Finally, we have k = m = 0 so $a, b \in \{1, -1, 2, -2\}$. It remains to show that $\{a, b\} = \{2, -2\}$, $\{a, b\} = \{-1, -2\}$ and $\{a, b\} = \{-1, 2\}$ are all impossible. For the first, note that they are both quadratic non-residues modulo $p \equiv 5 \pmod 8$, for the second, note that they are both quadratic non-residues modulo $p \equiv -1 \pmod 8$, and for the last note that they are both quadratic non-residues modulo $p \equiv 3 \pmod 8$.

As a final remark, as in Exercise 4.6.21†, we may avoid Dirichlet’s theorem on primes in arithmetic progressions with Jacobi’s quadratic reciprocity law (by picking any rational integer $p \equiv u \pmod v$ with sufficiently large prime factors instead of a prime).
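
The three final cases can be checked on the smallest prime of each residue class with Euler's criterion. This is only an illustrative sketch: the primes 5, 7 and 11 are sample choices (the argument needs large primes in these classes, where the supplementary laws give the same symbols), and `legendre` is a hypothetical helper name.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, computed with Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

# prime p -> the pair {a, b} that should consist of two non-residues modulo p
cases = {5: (2, -2),    # p = 5 (mod 8)
         7: (-1, -2),   # p = -1 (mod 8)
         11: (-1, 2)}   # p = 3 (mod 8)

symbols = {p: [legendre(x, p) for x in pair] for p, pair in cases.items()}
print(symbols)
```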


Exercise 7.5.19† (Nagell, Ko-Chao, Chein). Let p be an odd rational prime. Suppose that x, y ∈ Z
are rational integers such that x2 − y p = 1. Prove that 2 | y and p | x. Deduce that this equation has
no solution for p ≥ 5. (The case p = 3 is Exercise 8.6.29† .)

Solution

If y is odd, then the two factors of $y^p = (x-1)(x+1)$ are coprime so $x - 1$ and $x + 1$ are pth powers. This is impossible, as there are no pth powers distant by 2: $(m+1)^p - m^p \ge p + 1$. Now, suppose for the sake of contradiction that $p \nmid x$. Then, the two factors of $x^2 = (y+1) \cdot \frac{y^p+1}{y+1}$ are coprime. Indeed, this is a product of cyclotomic polynomials, but it can also be seen more elementarily: $\frac{y^p+1}{y+1} \equiv p \pmod{y+1}$. This implies that $y + 1 = a^2$ and $y^p + 1 = b^2$. Now consider the Pell equation $u^2 - yv^2 = 1$. We have two solutions: $(u, v) = (a, 1)$ and $(u, v) = (b, y^{\frac{p-1}{2}})$. Notice that, for both of them, v is a y-unit. By Størmer’s theorem, this implies that they are both the fundamental solution, which is impossible.

Thus, $p \mid x$ and $2 \mid y$. Without loss of generality, suppose that $x + 1 = 2^{p-1}a^p$ and $x - 1 = 2b^p$ (by replacing x by −x if necessary). Since |x| > 1, a and b have the same sign, and |a| < |b|. The key (magical?) point is that
\[b^{2p} + (2a)^p = \left(\frac{x-1}{2}\right)^2 + 2(x+1) = \left(\frac{x+3}{2}\right)^2.\]
For $p \ne 3$, this is not divisible by p since $p \mid x$, so $b^2 + 2a$ and $\frac{b^{2p} + (2a)^p}{b^2 + 2a}$ are perfect squares. However,
\[b^2 < b^2 + 2a < (b+1)^2\]
if a and b are positive, and
\[(b-1)^2 < b^2 + 2a < b^2\]
if a and b are negative. In all cases, we have reached a contradiction.


Exercise 7.5.20† . Prove that there are at most $3^{|S|}$ pairs of S-units distant by 2.

Solution

If u − v = 2, then $(v+1)^2 - uv = 1$. We let $\operatorname{rad}(uv) \mid d$ be minimal such that uv/d is a square. There are $3^{|S|}$ possible d. As before, any solution of u − v = 2 gives rise to a solution to the Pell equation $x^2 - dy^2 = 1$ for some d-unit number y, which must thus be the minimal unit by Proposition 7.3.1. Thus, there are also at most $3^{|S|}$ pairs of S-units distant by 2.


Exercise 7.5.21† . Assuming the finiteness of rational solutions to the S-unit equation u + v = 1 for
any finite S, determine all functions f : Z → Z such that m − n | f (m) − f (n) for any m, n and f is a
bijection modulo sufficiently large primes.

Solution

Let S be the set of primes p for which f is not a bijection modulo p or p = 2. By assumption,
f (n + 1) − f (n), f (n + 2) − f (n + 1), and f (n + 2) − f (n) are all S-units. Thus, we have a solution

(f (n + 2) − f (n + 1)) + (f (n + 1) − f (n)) = f (n + 2) − f (n)



to the S-unit equation. There are a finite number of solutions to this equation (up to scaling), so we get that $\frac{f(n+2)-f(n+1)}{f(n+1)-f(n)}$ is in a finite set U. Now, pick a large prime $p \notin S$ such that $|U \pmod p| = |U|$ and let $a \in \mathbb{Z}$. Since $\frac{f(n+2)-f(n+1)}{f(n+1)-f(n)}$ and $\frac{f(n+ap+2)-f(n+ap+1)}{f(n+1+ap)-f(n+ap)}$ are congruent modulo p and in U, they must be equal. By picking another sufficiently large prime $q \ne p$ and $b \in \mathbb{Z}$ such that ap + bq = 1, we get
\[\frac{f(n+2)-f(n+1)}{f(n+1)-f(n)} = \frac{f(n+ap+bq+2)-f(n+ap+bq+1)}{f(n+1+ap+bq)-f(n+ap+bq)} = \frac{f(n+3)-f(n+2)}{f(n+2)-f(n+1)}\]
which means that the quotient $\frac{f(n+2)-f(n+1)}{f(n+1)-f(n)}$ is in fact constant, say equal to r. Then, f satisfies the following linear recurrence: $f(n+2) = f(n+1) + r(f(n+1) - f(n)) = sf(n+1) - f(n)$ which, unless s = 2 which implies that the characteristic polynomial has a double root, reduces to $f(n) = u\alpha^n + v\beta^n$ for some conjugate quadratic integers α, β. But then, if $p \ne 2$ is such that the characteristic polynomial $X^2 - sX + 1$ splits modulo p, we get
\[f(n) \equiv u_p \alpha_p^n + v_p \beta_p^n \pmod p\]
for some $u_p, v_p, \alpha_p, \beta_p \in \mathbb{F}_p$, so that $f(p-1) \equiv u_p + v_p \equiv f(0) \pmod p$. This is a contradiction. Thus, s = 2, which gives f(n) = un + v for some $u, v \in \mathbb{Z}$. Conversely, it is clear that arithmetic progressions work.


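Exercise 7.5.1† . Note that $\mathcal{O}_K^\times = \mathcal{O}_K \setminus \mathfrak{p}_K$. In other words, $\mathfrak{p}_K$ consists precisely of those elements of $\mathcal{O}_K$ which are not invertible. This proves that $\mathfrak{p}_K$ is the unique maximal ideal of $\mathcal{O}_K$, as if $\mathfrak{a}$ is an ideal of $\mathcal{O}_K$ and $\alpha \in \mathfrak{a} \setminus \mathfrak{p}_K$, then $\mathcal{O}_K = (\alpha)$ so that $\mathfrak{a} = \mathcal{O}_K$.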

For the second part, note that κ is a field by Exercise A.3.23† , and is naturally a field extension of
Zp /pZp ' Fp through the embedding a (mod pZp ) 7→ a (mod pK ). To show that [κ : Fp ] ≤ [K : Qp ],
we prove that if x1 (mod pK ), . . . , xn (mod pK ) ∈ κ are Fp -linearly independent, then x1 , . . . , xn ∈ K
are Qp -linearly independent. Suppose otherwise: let λ1 , . . . , λn ∈ Qp be such that

λ1 x1 + . . . + λn xn = 0,

and not all λi are zero. Let k be such that |λk |p is maximal. Then,

\[\frac{\lambda_1}{\lambda_k} x_1 + \ldots + \frac{\lambda_n}{\lambda_k} x_n = 0\]
is a Zp -linear dependence (because all λi /λk have absolute value at most 1), which is non-trivial
modulo pK because the kth coefficient is 1. This shows that x1 (mod pK ), . . . , xn (mod pK ) are
linearly dependent as well.
Chapter 8

p-adic Analysis

8.1 p-adic Integers and Numbers


Exercise 8.1.1∗ . Check that Zp is an integral domain. What is its characteristic?

Solution

ab = 0 means $a_i b_i = 0$ for all i, where $a = (a_1, a_2, a_3, \ldots)$ and $b = (b_1, b_2, b_3, \ldots)$. Suppose that a is non-zero, and let k be such that $a_k \ne 0$. Then, $v_p(a_i) = v_p(a_k)$ for $i \ge k$ since $a_i \equiv a_k \pmod{p^k}$. Thus, for $i \ge k$, we have $v_p(b_i) \ge i - v_p(a_k)$. Hence, the coordinates of b have arbitrarily large p-adic valuation which means that they are all zero by compatibility: if $v_p(b_i) \ge N$ and $i \ge N$, then $b_N \equiv b_i \equiv 0 \pmod{p^N}$.

$\mathbb{Z}_p$ has characteristic 0 since (n, n, n, . . .) is zero only when n = 0, otherwise n has a non-zero $v_p$ and thus a non-zero coordinate too.


Exercise 8.1.2∗ . Check that a 7→ (a (mod p), a (mod p2 ), a (mod p3 ), . . .) is indeed an embedding
of Z(p) into Zp , i.e. that it’s injective.

Solution

It is clearly additive and multiplicative, and it is injective since the kernel is trivial: if a is
non-zero then it has a non-zero vp so a non-zero component under this embedding too.


Exercise 8.2.1∗ . Convince yourself of this proof.

Solution

Another way to write this proof is to define $b_k$ as

\[\sum_{i=0}^{\infty} a_i \pmod{p^k} \equiv \sum_{i=0}^{|a_i|_p < p^{-k}} a_i.\]

This sequence is Cauchy since $|b_i - b_j|_p < p^{-\min(i,j)}$ by the strong triangle inequality and clearly converges to $\sum_{i=0}^{\infty} a_i$ by the strong triangle inequality again.



8.2 p-adic Absolute Value


Exercise 8.2.2∗ . Prove that the strong triangle inequality also holds for series: if $a_i \to 0$ then $\left|\sum_i a_i\right|_p \le \max_i |a_i|_p$ with equality if the maximum is achieved only once.

Solution

We have
\[\left|\sum_{i=1}^{n} a_i\right|_p \le \max_{1 \le i \le n} |a_i|_p \le \max_i |a_i|_p\]
for all n and this yields the wanted inequality by taking the limit as n → ∞. For the equality part, just note that when the maximum is achieved only once, we have equality when n is sufficiently large so taking the limit yields the equality again.


Exercise 8.2.3∗ (Weak Approximation Theorem). Let S be a finite set of primes or ∞ and consider
elements (xp )p∈S such that xp ∈ Qp . Prove that, for any ε > 0, there is an x ∈ Q such that |x−xp |p < ε
for all p ∈ S.

Solution
When $\infty \notin S$, this is the Chinese remainder theorem: for any n, there is an $x \equiv x_p \pmod{p^n}$ for all $p \in S$. Say that this congruence is equivalent to $x \equiv a \pmod{\prod_{p \in S} p^n}$. Then, any
\[x \in a + \prod_{p \in S,\, p \ne \infty} p^n \mathbb{Z}_{(S)}\]
is close to $x_p$ p-adically, where $\mathbb{Z}_{(S)} = \bigcap_{p \in S,\, p \ne \infty} \mathbb{Z}_{(p)}$. Since $\mathbb{Z}_{(S)}$ is dense in $\mathbb{R}$ (e.g. by Exercise 4.4.5∗), this yields the wanted result.


Exercise 8.2.4∗ . Prove the product formula.

Solution

This is a consequence of the prime factorisation:

\[\prod_p |x|_p = \prod_p p^{-v_p(x)} = \frac{1}{|x|_\infty}.\]
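
The product formula can be checked with exact rational arithmetic. In the sketch below, the function names are hypothetical and primes are found by naive trial division, which is enough for small examples.

```python
from fractions import Fraction

def vp(n, p):
    """p-adic valuation of a non-zero integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

def prime_factors(n):
    """Set of primes dividing the positive integer n (trial division)."""
    ps, d = set(), 2
    while d * d <= n:
        if n % d == 0:
            ps.add(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        ps.add(n)
    return ps

def product_of_absolute_values(x):
    """|x|_infinity times the product of |x|_p over all primes p, for a non-zero rational x."""
    x = Fraction(x)
    total = abs(x)
    for p in prime_factors(abs(x.numerator)) | prime_factors(x.denominator):
        v = vp(x.numerator, p) - vp(x.denominator, p)
        total *= Fraction(1, p) ** v      # |x|_p = p^(-v_p(x))
    return total

print(product_of_absolute_values(Fraction(-360, 49)))
```

Only the primes dividing the numerator or the denominator contribute, since $|x|_p = 1$ for every other prime.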

8.3 Binomial Series


Exercise 8.3.1∗ . Prove that Q is dense in Qp .

Solution

This is a consequence of the density of Z in Zp : if α is an element of Qp then write α = pk a with


a ∈ Zp . There is a sequence of rational integers approaching a, and multiplying this sequence by
pk yields a sequence of rational numbers approaching α.


Exercise 8.3.2∗ . Let f ∈ Qp [X] be a polynomial. Prove that f is continuous on Qp .

Solution

The proof is the same as in R: if ε is very small then

\[(x + \varepsilon)^n - x^n = \varepsilon \sum_{k=1}^{n} \binom{n}{k} \varepsilon^{k-1} x^{n-k}\]

is also very small by the triangle inequality (it is in fact even neater in $\mathbb{Q}_p$ since we have the strong triangle inequality to bound the second factor).


Exercise 8.3.3∗ . Let $f : \mathbb{Z}_p \to \mathbb{Q}_p$ be a continuous function. If $|f(x)|_p \le 1$ for any x in a dense subset (in $\mathbb{Z}_p$), prove that $|f(x)|_p \le 1$ for any $x \in \mathbb{Z}_p$.

Solution

Let $x \in \mathbb{Z}_p$ be a p-adic integer and let $a_1, a_2, \ldots$ be a sequence of elements of that dense subset approaching x. By the triangle inequality, we have

\[|f(x)|_p - 1 \le |f(x)|_p - |f(a_n)|_p \le |f(x) - f(a_n)|_p \to 0\]

which means that $|f(x)|_p \le 1$ as wanted.




Exercise 8.3.4∗ . Prove that, if p > 5 is a rational prime, $p^2 \mid \sum_{k=1}^{p-1} \frac{1}{k^3}$ and $p \mid \sum_{k=1}^{p-1} \frac{1}{k^4}$.

Solution

Note that
\[\frac{1}{k^3} + \frac{1}{(p-k)^3} = \frac{k^3 + (p-k)^3}{(k(p-k))^3} \equiv \frac{3k^2 p}{(k(p-k))^3} \pmod{p^2}\]
so we need to prove that $p \mid \sum_{k=1}^{\frac{p-1}{2}} \frac{k^2}{(k(p-k))^3}$. Since p is odd and $\frac{k^2}{(k(p-k))^3} \equiv \frac{(p-k)^2}{((p-k)k)^3}$, this is equivalent to
\[p \mid \sum_{k=1}^{p-1} \frac{k^2}{(k(p-k))^3}.\]

Since
\[\frac{k^2}{(k(p-k))^3} \equiv \frac{k^2}{(-k^2)^3} \equiv -k^{-4},\]
we need to show that $\sum_{k=1}^{p-1} k^{-4} \equiv 0 \pmod p$. Let ω be a primitive root modulo p. Then,
\[\sum_{k=1}^{p-1} k^{-4} \equiv \sum_{k=1}^{p-1} \omega^{-4k} \equiv \frac{\omega^{-4(p-1)} - 1}{\omega^{-4} - 1} \equiv 0\]
since the numerator is zero and the denominator is non-zero as p > 5. Note that this is also the second claim.
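
Both divisibilities can be observed numerically with exact fractions. The sketch below is only an experiment on a few small primes; `vp` and `power_sum` are hypothetical helper names.

```python
from fractions import Fraction

def vp(x, p):
    """p-adic valuation of a non-zero rational number x."""
    x = Fraction(x)
    v, num, den = 0, x.numerator, x.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def power_sum(p, e):
    """Sum of 1/k^e for k = 1, ..., p - 1, as an exact rational."""
    return sum(Fraction(1, k**e) for k in range(1, p))

for p in (7, 11, 13, 17, 19, 23):
    print(p, vp(power_sum(p, 3), p), vp(power_sum(p, 4), p))
```

The hypothesis p > 5 matters for the second sum: for p = 5 every $k^4$ is congruent to 1, so the sum is congruent to 4 and is not divisible by 5.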


Exercise 8.3.5∗ . Prove Proposition 8.3.4.

Solution

If we consider only the nth coordinate of these series, then, since ai,j → 0, the series become
finite sums (the nth coordinate of ai,j is zero for sufficiently large i + j). In particular, both sides
are equal. Letting n go to infinity, this shows both that the series converge and that they are
equal.


Exercise 8.3.6∗ . Let n ∈ N be a positive rational integer and p be a prime number. Prove that
\[v_p(n!) = \sum_{k=1}^{\infty} \left\lfloor \frac{n}{p^k} \right\rfloor.\]

Solution

There are $\left\lfloor \frac{n}{p} \right\rfloor$ numbers in [n] which are divisible by p, which explains this term in the sum. However, their contribution to the total p-adic valuation might not always be one: some of these numbers are divisible by $p^2$ too. Hence we add $\left\lfloor \frac{n}{p^2} \right\rfloor$ to account for them too (combining with $\left\lfloor \frac{n}{p} \right\rfloor$ this constitutes a contribution of 2 for the multiples of $p^2$). Then we take into account the contribution of multiples of $p^3$ with $\left\lfloor \frac{n}{p^3} \right\rfloor$, then the multiples of $p^4$, etc.
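
The counting argument translates directly into code. The sketch below computes $v_p(n!)$ in two ways, once with the sum of floors and once by adding up the valuations of 1, ..., n; the function names are illustrative.

```python
def legendre_formula(n, p):
    """v_p(n!) computed as the sum of floor(n / p^k) over k >= 1."""
    total, power = 0, p
    while power <= n:
        total += n // power
        power *= p
    return total

def vp_factorial_direct(n, p):
    """v_p(n!) computed by adding up v_p(m) for m = 1, ..., n."""
    total = 0
    for m in range(1, n + 1):
        while m % p == 0:
            m //= p
            total += 1
    return total

print(legendre_formula(100, 5), vp_factorial_direct(100, 5))
```

For n = 100 and p = 5 the sum is 20 + 4 = 24: twenty multiples of 5, four of which are also multiples of 25.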


Exercise 8.3.7∗ . Prove Corollary 8.3.1.

Solution

Since $(1+u)^x = \sum_k \binom{x}{k} u^k$, we need to prove that $u^k/k! \to 0$ for $|u|_p < p^{-1/(p-1)}$ by Proposition 8.3.3. Proposition 8.3.5 gives us $|k!|_p \ge p^{-k/(p-1)}$ so that

\[|u^k/k!|_p = |u|_p^k/|k!|_p \le \left(\frac{|u|_p}{p^{-1/(p-1)}}\right)^k \to 0.\]
p−1/(p−1)



8.4 The Skolem-Mahler-Lech Theorem


Exercise 8.4.1∗ . Convince yourself of this proof.

Solution

Not much I can say here.




Exercise 8.4.2∗ . Do you think this proof could be formulated without appealing to p-adic analysis?

Solution

As said before, there is no known proof which doesn’t use p-adic ideas. However, one could phrase the proof without mentioning p-adic numbers by looking at partial sums of our power series modulo powers of p. See Block [6] for an example.


Exercise 8.4.3∗ . Prove that any number field has a finite number N of roots of unity, and that
ω N = 1 for any root of unity ω of K. (In other words, the roots of unity of K are exactly the N th
roots of unity.)

Solution

Since ϕ(n) → ∞ and a primitive nth root of unity has degree ϕ(n) over Q, K contains finitely many roots of unity.

One way to finish from this is to say that the roots of unity of K form a subgroup of the multiplicative group of nth roots of unity for some n. Since this is a cyclic group, any of its subgroups is also cyclic, and in particular the group of roots of unity of K.

Another way to finish from the first observation is to pick a root of unity ω ∈ K of maximal
order N . Then, if ζ is another root of unity of K, say of order n, Q(ω, ζ) ⊆ K contains a root of
unity of order lcm(N, n) by Problem 6.3.2 which implies that n divides N by maximality of N .


8.5 Strassmann’s Theorem



Exercise 8.5.1. Prove that $\mathbb{Q}(\sqrt{-7})$ is norm-Euclidean. (This is also Exercise 2.6.4† .)

Solution

Let $\alpha, \beta \in \mathcal{O}_{\mathbb{Q}(\sqrt{-7})} = \mathbb{Z}\left[\frac{1+\sqrt{-7}}{2}\right]$ be quadratic integers with $\beta \ne 0$. Write $\frac{\alpha}{\beta} = x + y\sqrt{-7}$ with $x, y \in \mathbb{Q}$. Pick a half-integer n such that $|y - n| \le \frac{1}{4}$, and a half-integer $m \equiv n \pmod 1$ such that $|x - m| \le \frac{1}{2}$. Then,
\[\left|N\left((x - m) + (y - n)\sqrt{-7}\right)\right| \le \left(\frac{1}{2}\right)^2 + 7\left(\frac{1}{4}\right)^2 = \frac{11}{16} < 1.\]
Thus, the remainder $\tau = \beta\left((x - m) + (y - n)\sqrt{-7}\right)$ works since it has norm less than $|N(\beta)|$ by the previous computation and $\alpha = \beta(m + n\sqrt{-7}) + \tau$.
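
The choice of m and n above is effectively an algorithm for Euclidean division. Here is a minimal sketch of it, under the assumption that an element $x + y\sqrt{-7}$ is stored as a pair of `Fraction`s `(x, y)`; the function names are hypothetical.

```python
from fractions import Fraction
from math import floor

def norm(z):
    """Norm x^2 + 7y^2 of x + y*sqrt(-7)."""
    x, y = z
    return x * x + 7 * y * y

def mul(z, w):
    """Product of two elements written as pairs (x, y) = x + y*sqrt(-7)."""
    (a, b), (c, d) = z, w
    return (a * c - 7 * b * d, a * d + b * c)

def nearest_in_class(t, offset):
    """Element of offset + Z closest to t (offset is 0 or 1/2)."""
    return floor(t - offset + Fraction(1, 2)) + offset

def divide(alpha, beta):
    """Return (quotient, remainder) with norm(remainder) < norm(beta)."""
    nb = norm(beta)
    num = mul(alpha, (beta[0], -beta[1]))        # alpha * conjugate(beta)
    x, y = Fraction(num[0]) / nb, Fraction(num[1]) / nb
    n = Fraction(round(2 * y), 2)                # half-integer with |y - n| <= 1/4
    m = nearest_in_class(x, n % 1)               # m = n (mod 1) with |x - m| <= 1/2
    q = (m, n)
    bq = mul(beta, q)
    return q, (alpha[0] - bq[0], alpha[1] - bq[1])

alpha = (Fraction(7, 2), Fraction(3, 2))
beta = (Fraction(1), Fraction(1))
print(divide(alpha, beta))
```

The quotient lies in $\mathbb{Z}\left[\frac{1+\sqrt{-7}}{2}\right]$ because m and n are half-integers that are congruent modulo 1.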


Exercise 8.5.2. Prove that, if $x^2 + 7 = 2^n$, then $\frac{x \pm \sqrt{-7}}{2} = \left(\frac{1+\sqrt{-7}}{2}\right)^{n-2}$ for some choice of ±.

Solution

By the uniqueness of the prime factorisation in $\mathbb{Q}(\sqrt{-7})$ (Exercise 8.5.1), we have $\frac{x+\sqrt{-7}}{2} = \alpha^k \beta^m$. Since the LHS is not divisible by 2, this means min(k, m) = 0 as otherwise 2 = αβ divides the LHS.


Exercise 8.5.3∗ . Compute the Strassmann bounds for the function $s \mapsto (\alpha - \beta)(u_{r+10s} \pm 1)$, for each r ∈ {0, 1, . . . , 9}. (If you do not want to do it all by hand, you may use a computer. In any case, it is better to do it to have a feel for why it works because it’s very cool.)

Solution

Modulo 11, we can see that

\[\alpha^r - \beta^r \pm (\alpha - \beta) \equiv 5^r - 7^r \mp 2 \pmod{11}\]

can be zero only for r ∈ {1, 2, 3, 5}, which means that in the other cases the Strassmann bound is 0. Now, let’s study the second coefficient:

\[a\alpha^r - b\beta^r \equiv 99 \cdot 5^r - 77 \cdot 7^r.\]

When r = 1, this is
\[99 \cdot 16 - 77 \cdot 7 \equiv 77 \pmod{11^2}\]
so the Strassmann bound is 1 since all other coefficients are divisible by $11^2$. Similarly, when r = 2 this is $33 \not\equiv 0$, and when r = 5 this is $55 \not\equiv 0$.

Finally, we need to treat the case r = 3. This time, the second coefficient is divisible by $11^2$ so we need to consider the third one:

\begin{align*}
(\alpha - \beta)(u_{r+10s} \pm 1) &= \alpha^r (1 + a)^s - \beta^r (1 + b)^s \pm (\alpha - \beta) \\
&= \alpha^r \sum_k \binom{s}{k} a^k - \beta^r \sum_k \binom{s}{k} b^k \pm (\alpha - \beta) \\
&\equiv \alpha^r \left(1 + as + a^2 \frac{s(s-1)}{2}\right) - \beta^r \left(1 + bs + b^2 \frac{s(s-1)}{2}\right) \pm (\alpha - \beta) \pmod{11^3}.
\end{align*}

The coefficient of $s^2$ is $\frac{a^2\alpha^r - b^2\beta^r}{2}$. Since we are now working modulo $11^3$, we need to compute a and b modulo $11^3$. For this, we also need to compute α and β modulo $11^3$, but afterwards we can return to their values modulo 11 since $11^2 x \equiv 11^2 y \pmod{11^3}$ if and only if $x \equiv y \pmod{11}$. With the help of Hensel’s lemma, we find α ≡ 137 and β ≡ 1195. This yields a ≡ 1188 and b ≡ 198. Finally, $a^2\alpha^r - b^2\beta^r$ is

\[1188^2 \cdot 5^3 - 198^2 \cdot 7^3 \equiv 726 \not\equiv 0 \pmod{11^3}\]

so the Strassmann bound is 3 as claimed.
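
As the exercise suggests, these residues are easy to verify by computer. The sketch below assumes, as the expansion $\alpha^{r+10s} = \alpha^r(1+a)^s$ suggests, that $a = \alpha^{10} - 1$ and $b = \beta^{10} - 1$, and that α, β are the roots of $X^2 - X + 2$ (so that α + β = 1 and αβ = 2); the variable names are illustrative.

```python
P = 11
M3 = P**3
alpha, beta = 137, 1195            # residues modulo 11^3 quoted in the solution

# alpha should be a root of X^2 - X + 2 modulo 11^3, and beta = 1 - alpha
assert (alpha * alpha - alpha + 2) % M3 == 0 and (alpha + beta) % M3 == 1

a = (pow(alpha, 10, M3) - 1) % M3  # assumed definition: 1 + a = alpha^10
b = (pow(beta, 10, M3) - 1) % M3   # assumed definition: 1 + b = beta^10

# values of r for which the constant coefficient vanishes modulo 11 for some sign
zero_r = [r for r in range(10)
          if any((pow(alpha, r, P) - pow(beta, r, P) + s * (alpha - beta)) % P == 0
                 for s in (1, -1))]

# second coefficient a*alpha^r - b*beta^r modulo 11^2
second = {r: (a * pow(alpha, r) - b * pow(beta, r)) % P**2 for r in (1, 2, 3, 5)}

# a^2*alpha^3 - b^2*beta^3 modulo 11^3 (the case r = 3)
third = (a * a * pow(alpha, 3) - b * b * pow(beta, 3)) % M3

print(a, b, zero_r, second, third)
```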




Exercise 8.5.4. Prove that 3, 4, 5, 7, 15 are indeed solutions to the given equation. (You may use a
computer for n = 15.)

Solution

We have

• $1^2 + 7 = 8 = 2^3$.
• $3^2 + 7 = 16 = 2^4$.
• $5^2 + 7 = 32 = 2^5$.
• $11^2 + 7 = 128 = 2^7$.
• $181^2 + 7 = 32768 = 2^{15}$.
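
For the computer check, a short search over exponents suffices. The bound 200 on n below is an arbitrary choice made for illustration.

```python
from math import isqrt

solutions = []
for n in range(1, 200):
    m = 2**n - 7                       # we need m = x^2
    if m >= 0 and isqrt(m) ** 2 == m:
        solutions.append((isqrt(m), n))
print(solutions)
```

The search recovers exactly the five pairs (x, n) listed above.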




8.6 Exercises
Analysis
Exercise 8.6.1† (Vandermonde’s Identity). Let x and y be p-adic integers. Prove that
\[\binom{x+y}{k} = \sum_{i+j=k,\ i,j \ge 0} \binom{x}{i} \binom{y}{j}\]
for any k.

Solution

When x and y are natural integers, this follows from considering the coefficient of $X^k$ in $(X+1)^{x+y} = (X+1)^x (X+1)^y$. For arbitrary p-adic integers, this follows from the density of $\mathbb{N}$ in $\mathbb{Z}_p$.


Exercise 8.6.2† (Mahler’s Theorem). Prove that a function $f : \mathbb{Z}_p \to \mathbb{Q}_p$ is continuous if and only if there exist $a_i \to 0$ such that
\[f(x) = \sum_{i=0}^{\infty} a_i \binom{x}{i}\]
for all $x \in \mathbb{Z}_p$. These $a_i$ are called the Mahler coefficients of f. Moreover, show that $\max_x(|f(x)|_p) = \max_i(|a_i|_p)$.

Solution

It is clear that any such function is continuous on $\mathbb{Z}_p$, hence we need to prove that the reverse holds as well. Let $\Delta f = x \mapsto f(x+1) - f(x)$ denote the discrete derivative operator from Exercise A.3.6†. The coefficients $a_k$ are then $\Delta^k f(0)$: indeed, a straightforward induction shows that $f(n) = \sum_{k=0}^{n} a_k \binom{n}{k}$ for any n ∈ N, so if these $a_k$ go to 0, f must be equal to $x \mapsto \sum_{k=0}^{\infty} a_k \binom{x}{k}$ by density and continuity.

Thus, it only remains to show that $\Delta^k f(0) \to 0$. To prove this, we will show that they eventually all become divisible by p. We can then subtract $\sum_{p \nmid \Delta^k f(0)} \Delta^k f(0) \binom{x}{k}$ from f(x) and divide everything by p to conclude that $p^2 \mid \Delta^k f(0)$ for large k. Iterating this process yields that $v_p(\Delta^k f(0)) \to +\infty$ as desired.

To show this, let N be such that $p \mid f(x + p^N) - f(x)$ for any x. There exists such an N since f is continuous by assumption. Then, by Exercise A.3.7†,
\[\Delta^{p^N} f(x) = \sum_{k=0}^{p^N} (-1)^{p^N - k} \binom{p^N}{k} f(x + k)\]
for any $x \in \mathbb{Z}_p$. Now, by Frobenius, $(1+X)^{p^N} \equiv 1 + X^{p^N} \pmod p$ which means that $p \mid \binom{p^N}{k}$ for any $1 \le k \le p^N - 1$. Hence,
\[\Delta^{p^N} f(x) \equiv f(x + p^N) + (-1)^{p^N} f(x) \pmod p.\]

When p is odd this is $f(x + p^N) - f(x)$ which is divisible by p by construction, and when p is even the same holds since −1 ≡ 1. Hence, $p \mid \Delta^{p^N} f(x)$ for all $x \in \mathbb{Z}_p$ which implies that $p \mid \Delta^n f(x)$ for all $n \ge p^N$ as well by applying ∆ multiple times to $\Delta^{p^N} f(x)$. In particular, $p \mid \Delta^n f(0)$ for sufficiently large n as wanted.
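
The description of the coefficients as iterated differences is easy to experiment with on the natural numbers. The sketch below computes $\Delta^k f(0)$ from the values $f(0), f(1), \ldots$ and evaluates the resulting binomial expansion; the function names are hypothetical, and only finitely many coefficients are computed.

```python
from math import comb

def mahler_coefficients(f, count):
    """Return [Delta^k f(0) for k < count] using repeated forward differences."""
    values = [f(n) for n in range(count)]
    coeffs = []
    while values:
        coeffs.append(values[0])
        values = [values[i + 1] - values[i] for i in range(len(values) - 1)]
    return coeffs

def evaluate(coeffs, n):
    """Evaluate sum_k a_k * binom(n, k) at a natural number n."""
    return sum(a * comb(n, k) for k, a in enumerate(coeffs))

print(mahler_coefficients(lambda n: n**3, 6))
print(mahler_coefficients(lambda n: 3**n, 6))
```

For $f(n) = 3^n$ the coefficients are the powers of 2, which tend to 0 in $\mathbb{Z}_2$; this matches the fact that $x \mapsto 3^x = (1+2)^x$ extends continuously to $\mathbb{Z}_2$.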


Exercise 8.6.4† . Prove that the following power series converge if and only if $|x|_p < 1$ and $|x|_p < p^{-1/(p-1)}$ respectively:
\[\log_p(1 + x) = \sum_{k=1}^{\infty} \frac{(-1)^{k-1} x^k}{k}, \qquad \exp_p(x) = \sum_{k=0}^{\infty} \frac{x^k}{k!}.\]

In addition, prove that

1. $\exp_p(x + y) = \exp_p(x)\exp_p(y)$ for $|x|_p, |y|_p < p^{-1/(p-1)}$.
2. $\log_p(xy) = \log_p(x) + \log_p(y)$ for $|x|_p, |y|_p < 1$.
3. $\exp_p(\log_p(1 + x)) = 1 + x$ for $|x|_p < p^{-1/(p-1)}$.
4. $\log_p(\exp_p(x)) = x$ for $|x|_p < p^{-1/(p-1)}$.

Solution

We shall only prove the convergence, the claimed equalities follow from the general theory of power series: if g(x), (f ◦ g)(x) and f(g(x)) all converge, we have (f ◦ g)(x) = f(g(x)) (this is even easier over $\mathbb{Q}_p$ because we have the strong triangle inequality). The convergence for $\log_p$ follows from the fact that $|x^k/k|_p = |x|_p^k/|k|_p$ goes to 0 when $|x|_p < 1$ since $|k|_p \ge 1/k$, but does not go to 0 when $|x|_p = 1$ since $|k|_p \le 1$ for all k.

The convergence for $\exp_p$ is very similar: by Legendre’s formula,

\[v_p(x^k/k!) = k v_p(x) - \frac{k}{p-1} + \frac{s_p(k)}{p-1} = k\left(v_p(x) - \frac{1}{p-1}\right) + o(k)\]

where o(k)/k → 0. This forces $v_p(x) \ge \frac{1}{p-1}$, i.e. $|x|_p \le p^{-1/(p-1)}$. Finally, we need to see that we can’t have equality. This is easy: when $v_p(x) = \frac{1}{p-1}$, $v_p(x^k/k!)$ is $\frac{s_p(k)}{p-1}$ which is bounded when k is a power of p, so does not go to infinity.


Exercise 8.6.5† . Prove that
\[v_2\left(\sum_{k=1}^{n} \frac{2^k}{k}\right) \to \infty.\]

Solution

The problem is equivalent to showing that $\sum_{k=1}^{\infty} \frac{2^k}{k} = 0$ in $\mathbb{Q}_2$. Note that this sum is exactly $-\log_2(1 - 2) = -\log_2(-1)$, and $\log_2(-1) = \frac{1}{2}\log_2(1) = 0$ by Exercise 8.6.4†.
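
The growth of the valuation can be observed directly with exact fractions. This is only a numerical illustration for a few values of n; `v2` and `partial_sum` are hypothetical helper names.

```python
from fractions import Fraction

def v2(x):
    """2-adic valuation of a non-zero rational number x."""
    x = Fraction(x)
    v, num, den = 0, x.numerator, x.denominator
    while num % 2 == 0:
        num //= 2
        v += 1
    while den % 2 == 0:
        den //= 2
        v -= 1
    return v

def partial_sum(n):
    """Exact value of the sum of 2^k / k for k = 1, ..., n."""
    return sum(Fraction(2**k, k) for k in range(1, n + 1))

for n in (4, 8, 16, 32, 64):
    print(n, v2(partial_sum(n)))
```

For instance, the partial sum for n = 8 equals 8192/105, and $8192 = 2^{13}$.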


Exercise 8.6.6† (Mean Value Theorem). Let $f(x) = \sum_{i=0}^{\infty} a_i x^i$ be a p-adic power series converging for $|x|_p \le 1$, i.e. $a_i \to 0$. Prove that

\[|f(t + h) - f(t)|_p \le |h|_p \max_i(|a_i|_p)\]

for any $|t|_p \le 1$ and $|h|_p \le p^{-1/(p-1)}$.

Solution

We shall prove that $|(t + h)^n - t^n|_p \le |h|_p$ for any $|t|_p \le 1$ and $|h|_p \le p^{-1/(p-1)}$. The strong triangle inequality then implies that

\begin{align*}
\left|\sum_{i=0}^{\infty} a_i (t + h)^i - \sum_{i=0}^{\infty} a_i t^i\right|_p &= \left|\sum_{i=0}^{\infty} a_i \left((t + h)^i - t^i\right)\right|_p \\
&\le \max_i\left(\left|a_i\left((t + h)^i - t^i\right)\right|_p\right) \\
&\le |h|_p \max_i(|a_i|_p)
\end{align*}

as wanted. Our claim is however very easy to prove: since $|h|_p \le p^{-1/(p-1)}$, we have $|h^{k-1}/k!|_p \le 1$ for $k \ge 1$ by Legendre’s formula (which gives $v_p(k!) \le \frac{k-1}{p-1}$) so that
\[(t + h)^n - t^n = \sum_{k=1}^{n} t^{n-k}\, n(n-1) \cdots (n - (k-1))\, h^k/k!\]
has absolute value at most $|h|_p$ by the strong triangle inequality.




Exercise 8.6.7† . Prove that $\mathbb{Z}_p$ is sequentially compact, meaning that any sequence $(a_n)_{n \ge 0} \in \mathbb{Z}_p^{\mathbb{N}}$ has a subsequence $(a_{n_i})_{i \ge 0}$ which converges in $\mathbb{Z}_p$. Prove more generally that a set $S \subseteq \mathbb{Q}_p$ is sequentially compact if and only if it is closed, meaning that any sequence of elements of S converging in $\mathbb{Q}_p$ (for the p-adic distance) converges in S, and bounded.

Solution

Let $(a_n)_{n \ge 0}$ be a sequence of p-adic integers. By the pigeonhole principle, $(a_n)_n =: (a_{n,0})_n$ has an infinite subsequence $(a_{n,1})_n$ which is constant modulo p. Now $(a_{n,1})_n$ has an infinite subsequence constant modulo $p^2$, $(a_{n,2})_n$. Repeating this process, we get a chain of sequences

\[(a_{n,0})_n \supseteq (a_{n,1})_n \supseteq \ldots\]

such that $(a_{n,k})_n$ is constant modulo $p^k$. Thus, the subsequence

\[(a_{0,0}, a_{0,1}, a_{0,2}, \ldots)\]

converges as $a_{0,k+1} \equiv a_{0,k} \pmod{p^k}$ so $|a_{0,k+1} - a_{0,k}|_p \le p^{-k}$.

For the second part, note that a sequentially compact set must be closed and bounded, otherwise we can choose some sequence of elements of S which diverges to infinity or converges to an element not in S, and thus has no subsequence converging in S. Now, suppose that S is closed and bounded, say by $p^M$, and let $s = (s_m)_{m \ge 0}$ be a sequence of elements of S. Then, $p^M s$ is a sequence of p-adic integers, which thus has a convergent subsequence $p^M s'$. Then, $s'$ is a convergent subsequence of s, which must converge in S since S is closed.


Topology
Exercise 8.6.8† (Bolzano-Weierstrass Theorem). Prove that a set $S \subseteq \mathbb{R}$ is sequentially compact if and only if it is closed, meaning that any sequence of elements of S converging in $\mathbb{R}$ (for the Euclidean distance) converges in S, and bounded. Prove that the same holds over $\mathbb{R}^n$ for n ≥ 1.

Solution

Clearly, if S is unbounded or not closed, one can extract a sequence which diverges to infinity or converges to an element not in S, and thus has no subsequence converging in S. Now, suppose that S is closed and bounded and let $s = (s_m)_{m \ge 0}$ be a sequence of elements of S. Without loss of generality, by translating S, suppose that all its elements are in [0, M]. We shall proceed by dichotomy to extract a subsequence of s which converges in $\mathbb{R}$, and which will thus also converge in S since S is closed. By the (infinite) pigeonhole principle, there must be an interval $I_1 \in \{[0, M/2], [M/2, M]\}$ such that
\[S \cap I_1\]
is infinite. Pick an element $r_1$ in this interval and then repeat the operation: if $I_1 = [a_1, b_1]$, there must be some $I_2 \in \left\{\left[a_1, \frac{a_1+b_1}{2}\right], \left[\frac{a_1+b_1}{2}, b_1\right]\right\}$ such that
\[S \cap I_2\]
is infinite. Pick an element $r_2$ in this interval, and proceed inductively that way to get a chain of intervals $I_m = [a_m, b_m]$ of length $M/2^m$ such that $I_{m+1} \subseteq I_m$ and $S \cap I_m$ is infinite and in particular contains $r_m$. Since the length of $I_m$ is $M/2^m$, the sequences $(a_m)_{m \ge 0}$ and $(b_m)_{m \ge 0}$ are Cauchy, say they converge to c. Then, the sequence $(r_m)_{m \ge 1}$ we produced converges to c as desired (c is in S since S is closed).

For the second part, we can work by induction on n, the previous paragraph being the base case.
If (xm , ym )m≥0 is an infinite sequence of S ⊆ Rn , with xm ∈ R and ym ∈ Rn−1 for every m ≥ 0,
we can extract a convergent subsequence (xϕ(m) )m≥0 of (xm )m≥0 . Moreover, by the inductive
hypothesis, we can extract a convergent subsequence (yϕ(ψ(m)) )m≥0 of (yϕ(m) )m≥0 . Then,

(xϕ(ψ(m)) , yϕ(ψ(m)) )m≥0

is a convergent subsequence of (xm , ym )m≥0 .




Exercise 8.6.9† (Extremal Value Theorem). Let (M, d) be a metric space, i.e. a set with a distance $d : M \times M \to \mathbb{R}_{\ge 0}$ such that d(x, y) = 0 iff x = y, d(x, y) = d(y, x) (commutativity) and d(x, y) ≤ d(x, z) + d(z, y) (triangle inequality) for any x, y, z ∈ M and let S be a sequentially compact subset of M. Suppose f : S → R is a continuous function. Prove that f has a maximum and a minimum.

Solution

Suppose otherwise. There is a sequence $(s_n)_{n \ge 0}$ of elements of S such that

\[f(s_n) \to s \notin \operatorname{im} f\]

(s can be ±∞). Let $(r_n)_n$ be a subsequence of $(s_n)_n$ converging to r ∈ S. Then, we get

\[f(r) = \lim_{n \to \infty} f(r_n) = s\]

which is a contradiction.


Remark 8.6.1
If f is taken to be |g|p for some g, which is usually how the theorem is used in p-adic analysis,
there is in fact a simpler argument. Since the distance of Qp is almost discrete, i.e. the values
that it reaches 0, . . . , 1/p2 , 1/p, 1, p, p2 , . . . are all isolated except 0, we get the stronger conclusion
that f has a maximum if and only if it is bounded above, and it has a minimum if 0 ∈ im f or if
it is bounded below by a positive number.

Exercise 8.6.10† (The Topology of Metric Spaces). Let (M, d) be a metric space. We say a subset
U ⊆ M is open if, for every point x ∈ U , there is an ε > 0 such that the ball {y ∈ M | d(x, y) < ε}
is contained in U . A subset F 1 of M is said to be closed if the limit of any convergent sequence of
elements of F still lies in F . Prove that M and ∅ are open, that a finite intersection of open sets is
open, and that an arbitrary union of open sets is open2 . In addition, prove that F is closed if and only
if its complement is open.

Solution

It is obvious that, for any x ∈ M or x ∈ ∅, there is a ball around x contained in M or ∅. Thus,
M and ∅ are open. It is also clear that an arbitrary union of open sets is open: if we pick a point
in the union, it must be in one of the open sets, which means that there is a ball containing it
lying in that open set and thus in the union as well. It remains to prove that the intersection of
two open sets U and V is still open (the case of a finite intersection then follows by induction).
Let x be an element of U ∩ V. Since x ∈ U and x ∈ V, there are two open balls B1 ⊆ U and
B2 ⊆ V containing x. Note that we can assume that they are centered at x. Indeed,
if an open ball $B_{<\varepsilon}(x_0) := \{y \mid d(x_0, y) < \varepsilon\}$ contains x, we can consider the open ball
\[ B_{<\varepsilon - d(x,x_0)}(x) := \{y \mid d(x, y) < \varepsilon - d(x, x_0)\}. \]
Any element y of this set satisfies
\[ d(x_0, y) \le d(x, x_0) + d(x, y) < \varepsilon \]
so is in $B_{<\varepsilon}(x_0)$, which means that
\[ B_{<\varepsilon - d(x,x_0)}(x) \subseteq B_{<\varepsilon}(x_0) \]
and is thus still contained in our open set. Accordingly, write $B_1 = B_{<\varepsilon}(x)$ and $B_2 = B_{<\eta}(x)$.
Then, the open ball $B = B_{<\min(\varepsilon,\eta)}(x)$ is a subset of U ∩ V and contains x, which shows that U ∩ V
is open.

1 The "F" stands for "fermé", which means "closed" in French.

2 These are actually the axioms of a topology. A topological space (X, τ) is a set X together with a topology
τ ⊆ 2^X consisting of subsets of X, called open sets, satisfying those properties. The closed sets are then defined as the
complements of open sets.

First, suppose that the complement U of a closed set F is not open, i.e. that, for some x ∈ U,
there is no open ball around x contained in U. In other words, for each ε > 0, we can find a
y with d(x, y) < ε such that y ∉ U, i.e. y ∈ F. Letting ε tend to 0, this gives us a sequence
$(y_n)_{n\ge 0}$ of elements of F converging to x ∈ U, contradicting the fact that F is closed. In fact, the converse of
the statement is obvious from this argument: if U is open, then there is no sequence of elements
of F converging to an element of U, since some ball around that element contains no point of F.


Exercise 8.6.11† (Compact Sets). We say a metric space (M, d) is compact if, for every open cover
$(U_i)_{i\in I}$ of M, i.e. a family of open sets such that
\[ \bigcup_{i\in I} U_i = M, \]
we can extract a finite subcover $(U_i)_{i\in I'}$ of M. Prove that a closed subset of a compact set is compact,
and that a closed subset of a sequentially compact space is sequentially compact.

Solution

Let F be a closed subset of M. Suppose first that M is compact and let $(U_i)_{i\in I}$ be an open cover
of F. Note that, here, the $U_i$ are open in F and not in M, which means that they have the form
$U_i = V_i \cap F$ for some open set $V_i$ in M. Indeed, for every x ∈ $U_i$, if an open ball (in F) $B_{<\varepsilon}(x)$
of radius ε is contained in $U_i$, we simply add every y ∈ M with d(x, y) < ε to $U_i$ to get $V_i$. It is clear
that this yields an open set. Now, let $V_0$ (suppose without loss of generality that 0 ∉ I) be the
complement of F in M, which is open by Exercise 8.6.10†. This gives us an open cover $(V_i)_{i\in I\cup\{0\}}$ of M. Since
M is compact, we can extract a finite subcover,
\[ M = \bigcup_{i\in I'} V_i. \]
This then yields a finite subcover of F:
\[ F = \bigcup_{i\in I'\setminus\{0\}} V_i \cap F = \bigcup_{i\in I'\setminus\{0\}} U_i, \]
thus showing that F is compact.

Now suppose that M is sequentially compact. The situation is easier this time, since any sequence
of elements of F has a subsequence which converges in M, and this subsequence in fact converges in
F as it is closed.


Exercise 8.6.12† (Cantor's Intersection Theorem). Prove that in a compact or sequentially compact
space (M, d), if
\[ F_1 \supseteq F_2 \supseteq \dots \]
is a chain of non-empty closed subsets of M, the intersection
\[ \bigcap_{n\in\mathbb{N}^*} F_n \]
is non-empty. Further, prove that the same conclusion holds when M is complete³ (not necessarily
compact), and the closed sets satisfy diam($F_n$) → 0, where $\operatorname{diam}(S) := \sup_{x,y\in S} d(x, y)$.

3 Recall that completeness means that all Cauchy sequences converge. A Cauchy sequence $(u_n)_{n\ge 0}$ is a sequence such
that, for any ε > 0, there is an N such that $d(u_m, u_n) \le \varepsilon$ for all m, n ≥ N.

Solution

Suppose first that M is compact. Suppose for the sake of contradiction that a chain of non-empty
closed sets $F_1 \supseteq F_2 \supseteq \dots$ has empty intersection in a compact space M. Let $U_i$ be the
complement of $F_i$, which is open by Exercise 8.6.10†. Then, $U_1 \subseteq U_2 \subseteq \dots$ and
\[ \bigcup_{n\in\mathbb{N}^*} U_n = M \setminus \bigcap_{n\in\mathbb{N}^*} F_n = M, \]
i.e. $(U_n)_{n\in\mathbb{N}^*}$ is an open cover of M. Since M is compact, we can extract a finite subcover
\[ \bigcup_{n=1}^{m} U_n = M. \]
However, as $U_1 \subseteq U_2 \subseteq \dots$, this means that $U_m = M$, i.e. $F_m = \emptyset$. This is a contradiction since
we assumed our closed sets were non-empty.

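Next, suppose that M is sequentially compact. Then, we can pick an element $x_n \in F_n$ for every
n. Since M is sequentially compact, this sequence has a subsequence converging to an element x ∈ M.
For each n, all but finitely many terms of this subsequence lie in $F_n$ (the chain is decreasing), and since
the $F_i$ are closed, x is an element of every $F_n$, i.e. of $\bigcap_{n\in\mathbb{N}^*} F_n$.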

Finally, suppose that M is complete and that our chain of closed sets is such that diam($F_n$) → 0.
Without loss of generality (after passing to a subsequence of the chain), suppose that diam($F_n$) ≤ $2^{-n}$. Pick an $x_n \in F_n$ for every n. Then,
the sequence $(x_n)_{n\ge 1}$ is Cauchy: $d(x_{n+1}, x_n) \le 2^{-n}$ since both points lie in $F_n$, so
\[ d(x_m, x_n) \le \sum_{k=\min(m,n)}^{\max(m,n)-1} d(x_k, x_{k+1}) \le \sum_{k\ge\min(m,n)} 2^{-k} = 2^{1-\min(m,n)} \]
by the triangular inequality (see also Exercise 8.6.14†). Thus, since M is complete, it converges
to a point x ∈ M. Finally, since our $F_i$ are closed, x must in fact be in every $F_i$, i.e. in $\bigcap_{n\in\mathbb{N}^*} F_n$.


Remark 8.6.2
In the second case, when M is complete and diam(Fn ) → 0, it is trivial to see that the intersection
consists in fact of a single point: if x, y ∈ Fn , then diam(Fn ) ≥ d(x, y).

Exercise 8.6.13† (Baire's Theorem). Let (M, d) be a complete metric space. Suppose that $U_1, U_2, \dots$
are dense open sets, i.e. open sets that intersect any non-empty ball⁴. Prove that
\[ \bigcap_{n\in\mathbb{N}^*} U_n \]
is still dense. Equivalently, if $F_1, F_2, \dots$ are closed sets with empty interior, i.e. that contain no ball⁵,
then
\[ \bigcup_{n\in\mathbb{N}^*} F_n \]
has empty interior as well. Deduce that, if (V, ‖·‖) is an infinite-dimensional normed vector space
(see Exercise 8.6.20†) with countable basis $(e_1, e_2, \dots)$, then (V, ‖·‖) is not complete (we say it's not
a Banach space). (You may assume Exercise 8.6.20†.)
4 In general topological spaces, a set S is dense if it intersects any non-empty open set, or, equivalently, if the only
closed set containing it is the space itself.

5 The interior of S is the union of its subsets which are open in the ambient space M.

Solution
Let B be a non-empty open ball. We wish to prove that the intersection $B \cap \bigcap_{n\in\mathbb{N}^*} U_n$ is non-empty.
For this, consider a closed ball $F_1$ of positive radius contained in $U_1 \cap B$, which exists since $U_1$ is open
and dense. Then, consider a closed ball $F_2$ of positive radius contained in $U_2 \cap F_1$, which exists for the same
reason. Continuing that way, we get a chain of non-empty closed balls

\[ F_1 \supseteq F_2 \supseteq \dots \]

such that $F_i \subseteq B \cap U_i$. Without loss of generality, suppose that the radius of these balls goes to
0. Then, these closed sets have non-empty intersection by Exercise 8.6.12†, so that
\[ B \cap \bigcap_{n\in\mathbb{N}^*} U_n \supseteq \bigcap_{n\in\mathbb{N}^*} F_n \]
is non-empty as desired.

Baire's theorem is indeed equivalent to the fact that a countable union of closed sets with empty
interior still has empty interior, since a closed set F has empty interior if and only if its complement
U is dense. Indeed, U is not dense if and only if there is a ball B such that B ∩ U = ∅, i.e.
F contains B.

For the second part, suppose for the sake of contradiction that V is complete. Given an n ∈ N,
consider the space $V_n$ spanned by $e_1, \dots, e_n$. Then, $V_n$ is closed (in fact even complete) by
Exercise 8.6.20†, and has empty interior: for any x ∈ $V_n$ and ε > 0, the element
\[ y = x + \frac{\varepsilon\, e_{n+1}}{\|e_{n+1}\|} \notin V_n \]
satisfies ‖x − y‖ ≤ ε. Thus, by Baire's theorem, the union
\[ \bigcup_{n\in\mathbb{N}} V_n = V \]
must have empty interior as well, which is clearly false since V is the whole space.


Exercise 8.6.14† (Banach's Fixed Point Theorem). We say a map f from a metric space (M, d) to
itself is a contraction if there is a real number λ < 1 such that d(f(x), f(y)) ≤ λ d(x, y) for all x, y ∈ M.
Let f be a contraction of a complete metric space M. Prove that f has a unique fixed point $x^*$, and
that, for any $x_0 \in M$, $\lim_{n\to\infty} f^n(x_0) = x^*$.

Solution

First of all, note that any contraction is continuous. The uniqueness part is trivial: if x and y
are two fixed points, then
\[ d(x, y) = d(f(x), f(y)) \le \lambda\, d(x, y) \]
so that d(x, y) = 0 since λ < 1, i.e. x = y. Now we prove the existence part. Let $x_0$ be an
element of M, and consider the sequence $x_n = f^n(x_0)$. We have

\[ d(x_n, x_{n+1}) \le \lambda\, d(x_{n-1}, x_n) \]

for every n ≥ 1, so that $d(x_n, x_{n+1}) \le \lambda^n d(x_0, x_1)$, which means that $(x_n)_{n\ge 0}$ is Cauchy. Indeed,

\[ d(x_m, x_n) \le \sum_{k=\min(m,n)}^{\max(m,n)-1} d(x_k, x_{k+1}) \le d(x_0, x_1) \sum_{k\ge\min(m,n)} \lambda^k = \frac{d(x_0, x_1)\,\lambda^{\min(m,n)}}{1-\lambda} \]

goes to 0 when min(m, n) → ∞. Thus, since M is complete, it converges to an $x^* \in M$ which
must, by continuity, be a fixed point:

\[ f(x^*) = f\Big(\lim_{n\to+\infty} f^n(x_0)\Big) = \lim_{n\to+\infty} f(f^n(x_0)) = \lim_{n\to+\infty} f^{n+1}(x_0) = x^*. \]
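As a numerical illustration of the proof, here is a minimal Python sketch of the iteration $x_{n+1} = f(x_n)$. The example map (cos on [0, 1], a contraction with λ = sin 1), the function name iterate_contraction and the tolerance are assumptions chosen for the illustration; the stopping rule is the tail bound $d(x_n, x^*) \le \lambda^n d(x_0, x_1)/(1-\lambda)$ coming from the geometric series above.

```python
import math

def iterate_contraction(f, x0, lam, tol=1e-12, max_steps=10_000):
    """Iterate x_{n+1} = f(x_n) until the a priori bound gives |x_n - x*| <= tol.

    lam must be a contraction constant of f (assumed, not checked).
    Returns the pair (x_n, n).
    """
    x = f(x0)                      # x_1
    d01 = abs(x - x0)              # d(x_0, x_1)
    n = 1
    # Tail of the geometric series: d(x_n, x*) <= lam**n * d01 / (1 - lam).
    while lam ** n * d01 / (1 - lam) > tol and n < max_steps:
        x = f(x)
        n += 1
    return x, n

# cos maps [0, 1] into itself and |cos'(x)| = |sin x| <= sin(1) < 1 on [0, 1].
lam = math.sin(1.0)
x_star, steps = iterate_contraction(math.cos, 0.5, lam)
print(x_star, steps)
```

The a priori bound is pessimistic (it only uses λ and the first step), which is why the loop runs for more iterations than strictly necessary.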

Exercise 8.6.15†. We say a metric space (M, d) is separable if it has a countable dense subset, and
that it is totally bounded if, for every ε > 0, there is a finite cover of M by open balls of radius ε.
Prove that a metric space is separable if and only if it has a countable basis of open sets $(U_n)_{n\in\mathbb{N}}$,
i.e. a family of non-empty open sets such that, for any x ∈ M and any open set U containing x, there is an n
for which $x \in U_n \subseteq U$. In addition, prove that compact spaces and sequentially compact spaces are
totally bounded, and that totally bounded spaces are separable.

Solution

Let M be a metric space. If M is separable, we may construct a countable basis of open sets
by picking, for every point x in a countable dense set, all balls of rational radius centered at x.
Conversely, if M has a countable basis of open sets, we can construct a countable dense subset
by picking a point in each element of the basis.
Now, suppose that M is totally bounded. Then, M is separable: we can pick, for every rational
ε > 0, a point in each of the finitely many balls of radius ε we can cover M with to form a
countable dense subset.
Next, suppose that M is compact and let ε > 0. Consider the open cover
\[ M = \bigcup_{x\in M} B_{<\varepsilon}(x). \]
Since M is compact, we can extract a finite subcover as desired.


Finally, suppose that M is sequentially compact but not totally bounded. Let ε > 0 be such
that it is impossible to cover M with finitely many open balls of radius ε. Construct a sequence
(xn )n≥0 of elements of M inductively so that, for every n ∈ N,
d(xn , xk ) ≥ ε
for all k < n; there exists one by assumption. Since M is sequentially compact, $(x_n)_{n\ge 0}$ has a
convergent subsequence, which is impossible since $d(x_m, x_n) \ge \varepsilon$ for every m ≠ n. Thus, M is
totally bounded.


Exercise 8.6.16† . Let (M, d) be a metric6 space. Prove that the following assertions are equivalent:
(i) M is compact.
(ii) M is totally bounded and complete.
(iii) M is sequentially compact.

Solution

We have already proven that compact and sequentially compact spaces are totally bounded, so we
shall only prove that a metric space is compact if and only if it is sequentially compact. Indeed,
such a space will then automatically be complete, since any Cauchy sequence will then have a
convergent subsequence, which will converge to the limit of the original sequence by definition of
a Cauchy sequence. Conversely, a totally bounded and complete space is sequentially compact,
since we can extract a Cauchy subsequence from any given sequence, e.g. by picking xn+1 so
that B<2−n (xn ) contains xn+1 and infinitely many other points of the sequence.

Suppose first that M is not compact. We need to prove that it is not sequentially compact
either. Let $M = \bigcup_{i\in I} U_i$ be an open cover which has no finite subcover. We may assume that M is
sequentially compact, as otherwise there is nothing to prove; M is then totally bounded and hence separable by Exercise 8.6.15†.
Since M thus has a countable basis of open sets, we may assume that I = N is countable. Consider
the descending chain of closed sets

\[ F_n = M \setminus (U_0 \cup \dots \cup U_n). \]

Then, by assumption, $F_n$ is never empty, yet their intersection is empty. By Exercise 8.6.12†, M
is not sequentially compact.

Suppose now that M is not sequentially compact and let $(x_n)_{n\ge 0}$ be an infinite sequence with
no convergent subsequence. We will again use Exercise 8.6.12† to prove that it is not compact
either, by finding a chain of non-empty closed sets with empty intersection. Suppose for the sake
of contradiction that M is compact. Then, it is totally bounded by Exercise 8.6.15†, so we can
find a ball $F_0 = B_{\le 1}(z_0)$ which contains infinitely many elements of $(x_n)_{n\ge 0}$. Then, consider a
ball $F_1 = B_{\le\varepsilon}(z_1)$ of sufficiently small radius so that $F_1 \subseteq F_0$ which still contains infinitely many
elements of $(x_n)_{n\ge 0}$. Continuing in this manner, we get a chain of closed balls

\[ F_0 \supseteq F_1 \supseteq \dots \]

such that, for every n, $F_n$ contains infinitely many elements of $(x_n)_{n\ge 0}$. Without loss of generality,
we may also assume that the radius of $F_n$ goes to 0. Then, the intersection
\[ \bigcap_{n\in\mathbb{N}} F_n \]
consists of at most one point, since, if $x, y \in \bigcap_{n\in\mathbb{N}} F_n$, then $d(x, y) \le \operatorname{diam}(F_n) \to 0$ so that x = y. We shall
prove that it is empty, contradicting the fact that M is compact by Exercise 8.6.12†. For every n,
pick a ϕ(n), strictly increasing in n, so that $x_{\varphi(n)} \in F_n$. If $\bigcap_{n\in\mathbb{N}} F_n = \{x\}$, then $x_{\varphi(n)} \to x$, contradicting the assumption
that $(x_n)_{n\ge 0}$ had no convergent subsequence.


Absolute Values
Exercise 8.6.17†. We say an absolute value | · | over a field K, i.e. a function | · | : K → R≥0 such that
• |x| = 0 ⇐⇒ x = 0
• |x + y| ≤ |x| + |y|
• |xy| = |x| · |y|

is non-Archimedean if |m| ≤ 1 for all m ∈ Z and Archimedean otherwise. Prove that | · | is non-
Archimedean if and only if it satisfies the strong triangular inequality |x + y| ≤ max(|x|, |y|) for all
x, y ∈ K. In addition, prove that, if | · | is non-Archimedean, we have |x + y| = max(|x|, |y|) whenever
|x| ≠ |y|.

6 The theorem is not always true for arbitrary topological spaces: some compact spaces are not sequentially compact,
and some sequentially compact spaces are not compact.

Solution

It is clear that | · | is non-Archimedean if it satisfies the strong triangle inequality. Thus, suppose
that |m| ≤ 1 for all m ∈ Z. Now, notice that, for any positive integer n,

\begin{align*}
|x + y|^n &= |(x + y)^n| \\
&= \left| \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k} \right| \\
&\le \sum_{k=0}^{n} \left|\binom{n}{k}\right| |x|^k |y|^{n-k} \\
&\le (n+1) \max(|x|, |y|)^n.
\end{align*}

Taking nth roots and letting n go to ∞, we get

\[ |x + y| \le (n+1)^{1/n} \max(|x|, |y|) \to \max(|x|, |y|) \]

as wanted. For the equality, if |x| > |y|, note that, by the same inequality, we also have |x| ≤
max(| − y|, |x + y|). Since | − y| = |y| < |x|, we must have max(|x + y|, | − y|) = |x + y| so
|x + y| ≥ |x| ≥ |x + y| as wanted.
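The two statements of this exercise can be checked numerically on the standard example, the p-adic absolute value of Q. The following Python sketch is only an illustration: the helper names vp and abs_p, the prime p = 3 and the sample of rationals are choices made here, and exact arithmetic with Fraction is used to avoid rounding issues.

```python
from fractions import Fraction
from itertools import product

def vp(x, p):
    """p-adic valuation of a non-zero rational x."""
    n, d, v = x.numerator, x.denominator, 0
    while n % p == 0:
        n //= p
        v += 1
    while d % p == 0:
        d //= p
        v -= 1
    return v

def abs_p(x, p):
    """p-adic absolute value |x|_p = p^(-v_p(x)), with |0|_p = 0."""
    x = Fraction(x)
    return Fraction(0) if x == 0 else Fraction(p) ** (-vp(x, p))

p = 3
samples = [Fraction(a, b) for a, b in product(range(-12, 13), range(1, 7))]
for x, y in product(samples, repeat=2):
    ax, ay, axy = abs_p(x, p), abs_p(y, p), abs_p(x + y, p)
    assert axy <= max(ax, ay)            # strong triangle inequality
    if ax != ay:
        assert axy == max(ax, ay)        # equality when |x| != |y|
```

Of course, a finite check proves nothing; it only makes the equality case |x + y| = max(|x|, |y|) concrete.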


Exercise 8.6.18† . Let K be a field and let | · | : K → R≥0 be a multiplicative function which is an
absolute value on Q. Suppose that | · | satisfies the modified triangular inequality |x + y| ≤ c(|x| + |y|)
for all x, y ∈ K, where c > 0 is some constant. Prove that it satisfies the triangular inequality.

Solution

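The argument is very similar to our proof of Exercise 8.6.17†. Note first that c ≥ 1 (take x = 1 and
y = 0) and that, by induction on r, the modified triangular inequality gives
$|z_1 + \dots + z_{2^r}| \le c^r(|z_1| + \dots + |z_{2^r}|)$ for any $z_i \in K$. Padding with zeros, a sum of n + 1 terms
is thus bounded by $c_n := c^{\lceil \log_2(n+1) \rceil}$ times the sum of the absolute values of its terms. Let x, y be elements of K. For
any positive integer n,

\begin{align*}
|x + y|^n &= |(x + y)^n| \\
&= \left| \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k \right| \\
&\le c_n \sum_{k=0}^{n} \left|\binom{n}{k}\right| |x|^{n-k} |y|^k \\
&\le c_n \sum_{k=0}^{n} \binom{n}{k} |x|^{n-k} |y|^k \\
&= c_n (|x| + |y|)^n.
\end{align*}

Indeed, a straightforward induction shows that |m| ≤ m for m ∈ N since | · | is an absolute value
on Q so |m + 1| ≤ |m| + |1| = |m| + 1 for any m ∈ N (note that |1| = 1 since $|1|^2 = |1|$ and |1| ≠ 0). Taking the nth
root and letting n tend to infinity, we get

\[ |x + y| \le c_n^{1/n} (|x| + |y|) \to |x| + |y| \]

as wanted, since $c_n^{1/n} = c^{\lceil \log_2(n+1) \rceil / n} \to 1$.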


Exercise 8.6.19† (Ostrowski's Theorem). Let | · | be an absolute value of Q. Prove that | · | is equal
to $|\cdot|_p^r$ for some prime p and some r > 0, or to $|\cdot|_\infty^r$ for some 0 < r ≤ 1, or is the trivial absolute value
$|\cdot|_0$ which is 0 at 0 and 1 everywhere else.

Solution

First, note that $|1|^2 = |1|$ so |1| = 1 since |1| ≠ 0. For the same reason, | − 1| = 1. Now,
suppose that there is some a ∈ N such that |a| > 1 and let b ≥ 2 be any integer. By the previous
remark, we have a > 1, so, for any m ≥ 1, write $a^m$ in base b as
\[ a^m = \sum_{i=0}^{\lfloor m \log_b(a) \rfloor} a_i b^i, \qquad 0 \le a_i \le b - 1. \]
We get

\[ |a|^m \le \sum_{i=0}^{\lfloor m \log_b(a) \rfloor} |a_i| |b|^i \]

which implies that |b| > 1 as well (otherwise the right-hand side grows at most linearly in m, while the
left-hand side grows exponentially). But then,

\[ |a|^m \le \sum_{i=0}^{\lfloor m \log_b(a) \rfloor} |a_i| |b|^i \le C \left( \lfloor m \log_b(a) \rfloor + 1 \right) |b|^{\lfloor m \log_b(a) \rfloor} \]

for some constant C = max(|1|, |2|, . . . , |b − 1|) > 0, which implies that $|a| \le |b|^{\log_b(a)}$ when we
take mth roots and let m → ∞, i.e. $|a|^{1/\log a} \le |b|^{1/\log b}$. Since the reverse inequality is true as well by symmetry,
we get that $|a|^{1/\log a} = c$ is constant. This gives us $|a| = a^{\log c} =: a^r$. It is then easy to see that
this extends to $|\cdot|_\infty^r$ on all of Q using the multiplicativity of | · |. Finally, it is easy to check that
this satisfies the triangular inequality only for 0 < r ≤ 1.

Now suppose that |n| ≤ 1 for all n ∈ Z. By Exercise 8.6.17†, | · | satisfies the strong triangle
inequality. Without loss of generality, assume that | · | is non-trivial and let p ∈ N be the smallest
positive integer such that |p| < 1. Since | · | is multiplicative, p must be prime as it has no
non-trivial divisor and is distinct from 1. By assumption, |a| = 1 for any 1 ≤ a ≤ p − 1. We shall
prove that |n| = 1 for any p ∤ n to conclude that, in general,
\[ |n| = |p|^{v_p(n)} \left| n / p^{v_p(n)} \right| = |p|^{v_p(n)} = |n|_p^{-\log_p |p|}. \]

Consider any p ∤ n now and express it in base p as $\sum_i a_i p^i$. Since p ∤ n, we have $0 < a_0 < p$, so
$1 = |a_0| > \max_{i\ge 1} |a_i p^i|$. By Exercise 8.6.17†, we are in the equality case of

\[ \left| \sum_i a_i p^i \right| \le \max_i |a_i p^i| = 1 \]

so |n| = 1 as wanted. To conclude, note that the exponent $r = -\log_p |p|$ is positive since |p| < 1, and that
$|\cdot|_p^r$ satisfies |x + y| ≤ max(|x|, |y|) for every r > 0.


Exercise 8.6.20† (Equivalence of Norms). Let (K, | · |) be a complete valued field of characteristic
0, i.e. a field with an absolute value | · | which is complete for the distance induced by this absolute
value. A norm on a vector space V over K is a function ‖ · ‖ : V → R≥0 such that
• ‖x‖ = 0 ⇐⇒ x = 0
• ‖x + y‖ ≤ ‖x‖ + ‖y‖
• ‖ax‖ = |a|‖x‖

for all x, y ∈ V and a ∈ K. We say two norms $\|\cdot\|_1$ and $\|\cdot\|_2$ are equivalent if there are two positive
real numbers $c_1$ and $c_2$ such that $\|x\|_1 \le c_1 \|x\|_2$ and $\|x\|_2 \le c_2 \|x\|_1$ for all x ∈ V.⁷ Prove that any
two norms are equivalent over a finite-dimensional K-vector space V. In addition, prove that V is
complete under the induced distance of any norm ‖ · ‖.

Solution

Since we wish to show that all norms are equivalent, it suffices to prove that any norm is equivalent
to a fixed norm we choose. A particularly simple one is the maximum norm

\[ \|x\|_\infty = \max_i |a_i| \]

where $e_1, \dots, e_n$ is a basis of V and $x = \sum_{i=1}^{n} a_i e_i$ for some $a_i \in K$. In other words, this is simply
the maximum of the absolute values of the coefficients of x in the basis $(e_1, \dots, e_n)$. Clearly, V is complete under this
norm, since if $x_k = \sum_{i=1}^{n} a_{k,i} e_i$ is a Cauchy sequence, then so is every $(a_{k,i})_{k\ge 0}$ for the distance
induced by | · |, which means that $a_{k,i} \to a_i$ as k → +∞ for some $a_i \in K$ and

\[ x_k \to \sum_{i=1}^{n} a_i e_i. \]

Since two equivalent norms have the same Cauchy sequences, we are done if we prove that any
norm ‖ · ‖ is equivalent to $\|\cdot\|_\infty$. One inequality is very easy: if $x = \sum_{i=1}^{n} a_i e_i$, we have
\begin{align*}
\|x\| &= \left\| \sum_{i=1}^{n} a_i e_i \right\| \\
&\le \sum_{i=1}^{n} |a_i| \|e_i\| \\
&\le \|x\|_\infty \cdot \left( \sum_{i=1}^{n} \|e_i\| \right).
\end{align*}

For the other inequality, suppose for the sake of contradiction that there doesn't exist a c > 0
such that $\|x\|_\infty \le c\|x\|$ for all x ∈ V. In other words, for all ε > 0, there is some x such that
$\|x\| < \varepsilon \|x\|_\infty$. In particular, x ≠ 0. Since we have infinitely many x, by the pigeonhole principle,
we can assume that $\|x\|_\infty = |a_k|$ for a fixed k, where $x = \sum_{i=1}^{n} a_i e_i$. By dividing x by $a_k$, we
may also assume that $a_k = 1$. This gives us a sequence

\[ x_m = y_m + e_k \]

converging to 0 for ‖ · ‖, where $y_m$ is in the space W spanned by $e_1, \dots, e_{k-1}, e_{k+1}, \dots, e_n$. In particular,

\[ \|y_m - y_\ell\| \le \|y_m + e_k\| + \|y_\ell + e_k\| \]

also converges to 0 when min(m, ℓ) → +∞. In other words, $(y_m)_{m\ge 0}$ is a Cauchy sequence. Now,
we use induction on n = dim V. When n = 1 the result is trivial since V = K and ‖ · ‖ = ‖1‖| · |.
For the inductive step, notice that W has dimension n − 1 so, by assumption, it is complete under
‖ · ‖. Hence, $(y_m)_{m\ge 0}$ converges to some y ∈ W. This means that

\[ \|y + e_k\| = \lim_{m\to+\infty} \|y_m + e_k\| = 0, \]

which is impossible since $y + e_k \ne 0$.




7 This means that they induce the same topology on V .



Exercise 8.6.21†. Let K = Q_p be a local field⁸, where p is a prime number or ∞, and let L be a
finite extension of K. Prove that there is only one absolute value of L extending $|\cdot|_p$ on K, and that
it's given by
\[ |\cdot|_p = \left| N_{L/K}(\cdot) \right|_p^{1/[L:K]} \]
(see footnotes 9, 10 and 11).

Solution

For simplicity purposes, we write | · | for $|\cdot|_p$. We first prove the uniqueness. Suppose that | · |
and | · |′ are two absolute values extending | · |. Then, they are norms over the K-vector space L.
By Exercise 8.6.20†, they must be equivalent:
\[ a|x| \le |x|' \le b|x| \]
for some positive real numbers a, b. In particular, if we let $x = y^n$, we get $a|y|^n \le (|y|')^n \le b|y|^n$.
By taking nth roots and letting n tend to infinity, this gives us
\[ |y| \leftarrow a^{1/n}|y| \le |y|' \le b^{1/n}|y| \to |y| \]
so |y|′ = |y| as wanted. Note that we didn't use the fact that K was a field of the form Q_p here.
Now, we prove the existence. Multiplicativity is obvious, and |x| = 0 iff x = 0 too. The tricky
part is to prove that it satisfies the triangular inequality |x + y| ≤ |x| + |y|. After dividing by
|y|, this is equivalent to |x + 1| ≤ |x| + 1. We will however not prove this directly, but rather
that there is a constant c > 0 such that |x + 1| ≤ c(|x| + 1). Assuming we have proven this,
Exercise 8.6.18† tells us that we can in fact pick c = 1, i.e. that | · | satisfies the triangular
inequality (and is thus an absolute value).
It remains to prove that such a c exists. Let $e_1, \dots, e_n$ be a K-basis of L (for instance $e_i = \alpha^i$
for some primitive element α). Define the maximum norm as
\[ \left\| \sum_i a_i e_i \right\|_\infty = \max_i |a_i|. \]

The point is that this defines a distance $d(x, y) = \|x - y\|_\infty$ and that the unit sphere S = {x |
$\|x\|_\infty = 1$} is sequentially compact for this distance, so that our extension of | · | will have a
(non-zero) minimum there by the extreme value theorem from Exercise 8.6.9†.
It is also not very hard to see that the unit sphere is indeed sequentially compact: this is the
Bolzano-Weierstrass theorem from Exercise 8.6.8† for p = ∞, i.e. K = R, and is very easy when
p is prime by an argument similar to the proof of Exercise 8.6.7†.
To conclude, our extension of | · |, $\sqrt[n]{|N(\cdot)|}$, is continuous for the distance induced by $\|\cdot\|_\infty$ because
$N(\sum_i a_i e_i)$ is polynomial in the $a_i$. Thus, there are positive a and b such that a ≤ |x| ≤ b for
$\|x\|_\infty = 1$ by the extremal value theorem from Exercise 8.6.9†. Note that a is positive as | · |
doesn't vanish on S. From this we conclude that $a\|x\|_\infty \le |x| \le b\|x\|_\infty$ for any x. But then, we
have
\[ |x + 1| \le b\|x + 1\|_\infty \le b(\|x\|_\infty + \|1\|_\infty) \le \frac{b}{a}(|x| + 1) \]
which is what we wanted to show.
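To make the formula $|x| = |N_{L/K}(x)|_p^{1/[L:K]}$ concrete, here is a small Python sketch for the hypothetical example K = Q_5 and L = Q_5(√2), chosen here because 2 is not a square modulo 5. It only tests elements a + b√2 with rational a, b, and it works with the valuation w(x) = v_5(N(x))/2, so that |x| = 5^(−w(x)), in order to stay in exact arithmetic. The helper names are assumptions of this illustration.

```python
from fractions import Fraction
from itertools import product

P, D = 5, 2   # 2 is not a square mod 5, so X^2 - 2 is irreducible over Q_5

def v_p(q):
    """P-adic valuation of a non-zero rational q."""
    n, d, v = q.numerator, q.denominator, 0
    while n % P == 0:
        n //= P
        v += 1
    while d % P == 0:
        d //= P
        v -= 1
    return v

def norm(x):
    """Norm of x = a + b*sqrt(D), stored as the pair (a, b)."""
    a, b = x
    return a * a - D * b * b

def w(x):
    """Valuation of the extended absolute value: |x| = P^(-w(x)) = |N(x)|_P^(1/2)."""
    return Fraction(v_p(norm(x)), 2)

def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def mul(x, y):
    return (x[0] * y[0] + D * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

coords = [Fraction(n, d) for n in (-5, -1, 0, 1, 2) for d in (1, 5)]
elements = [x for x in product(coords, repeat=2) if x != (0, 0)]

for x, y in product(elements, repeat=2):
    assert w(mul(x, y)) == w(x) + w(y)        # multiplicativity
    s = add(x, y)
    if s != (0, 0):
        assert w(s) >= min(w(x), w(y))        # strong triangle inequality
```

In terms of valuations, the strong triangle inequality |x + y| ≤ max(|x|, |y|) reads w(x + y) ≥ min(w(x), w(y)), which is the form checked in the loop.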


8 This result is true for any complete valued field (K, | · |), but it is harder to prove. See Cassels [8, Chapter 7] for a
proof.
9 In particular, this absolute value is still non-Archimedean if it initially was. For instance, by Exercise 8.6.17†, if p
is prime, the extension of $|\cdot|_p$ still satisfies the strong triangle inequality. In fact, this is the only interesting case since
it's not too hard to treat the case K = R separately.
10 Here is why this absolute value is intuitive: by symmetry between the conjugates, we should have $|\alpha|_p = |\beta|_p$ if α
and β are conjugates. Taking the norm yields $|N_{K/\mathbb{Q}_p}(\alpha)|_p = |\alpha|_p^{[K:\mathbb{Q}_p]}$ as indicated.
11 One might be tempted to also define a p-adic valuation for elements of K as $v_p(\cdot) = -\log(|\cdot|_p)/\log(p)$, and this is
also what we will do in some of the exercises. However, we warn the reader that, if α ∈ Z is an algebraic integer and $\alpha_p$
is a root of its minimal polynomial in Q_p, $v_p(\alpha_p) \ge 1$ does not mean anymore that p divides α in Z, it only means that
p divides $\alpha_p$ in Z_p := {x ∈ Q_p | |x| ≤ 1}.

Exercise 8.6.22†. Let (K, | · |) be a complete valued field of characteristic 0 and let f ∈ K[X] be a
polynomial. Prove that f either has a root in K, or there is a real number c > 0 such that |f(x)| ≥ c
for all x ∈ K.

Solution

Suppose without loss of generality that f is irreducible and that there does not exist a c > 0 such
that |f(x)| ≥ c for all x ∈ K. In other words, |f(x)| takes arbitrarily small values for x ∈ K.
We will produce a Cauchy sequence $(x_n)_{n\ge 0}$ such that $|f(x_n)| \to 0$. The limit x of $(x_n)_{n\ge 0}$ will
then clearly be a root of f.

We use the Newton method to find such a sequence. Let $x_0 \in K$ be such that $|f(x_0)| < 1$ is
small (we will specify this later). Note that $|x_0|$ is bounded since, by the triangular inequality, if
$f = a_d X^d + \dots + a_0$, we have

\[ |f(x_0)| \ge |a_d||x_0|^d - |a_{d-1}||x_0|^{d-1} - \dots - |a_0|. \]

Define the sequence $(x_n)_{n\ge 0}$ by $x_{n+1} = x_n + \varepsilon_n$, where the small element $\varepsilon_n$ will be chosen in the next sentences.
We have, by Taylor's formula 5.3.1,
\[ f(x_{n+1}) = \sum_{k=0}^{d} \frac{\varepsilon_n^k f^{(k)}(x_n)}{k!} = f(x_n) + \varepsilon_n f'(x_n) + O(\varepsilon_n^2). \]

Hence, to kill the greatest term of this sum, we choose $\varepsilon_n = -\frac{f(x_n)}{f'(x_n)}$. Let's justify a bit the
notation $O(\varepsilon_n^2)$: we have shown that, if f(x) is very small then x is bounded, so the derivatives
$f^{(k)}(x)$ are bounded as well. We also need to ensure that f′(x) is not too small when f(x) is, so
that $\varepsilon_n = -\frac{f(x_n)}{f'(x_n)}$ is very small. This follows from Bézout's lemma: since f is irreducible, it is
coprime with its derivative f′ (we are in characteristic 0) so there are u, v ∈ K[X] such that

\[ uf + vf' = 1. \]

When f(x) is very small, u(x) is bounded (since x is) so |v(x)f′(x)| is very close to 1. Since v(x)
is also bounded, we get that |f′(x)| is bounded below as wanted.

To conclude, we have
\[ |f(x_{n+1})| = \left| \sum_{k=2}^{d} \frac{\varepsilon_n^k f^{(k)}(x_n)}{k!} \right| < c|\varepsilon_n|^2 \]
for some constant c > 0 when $f(x_n)$ is very small. Since

\[ |\varepsilon_n|^2 = \frac{|f(x_n)|^2}{|f'(x_n)|^2}, \]
there is some θ < 1 such that
\[ |f(x_{n+1})| \le \theta |f(x_n)| \]
when $f(x_n)$ is sufficiently small (in particular, it suffices to have $f(x_0)$ sufficiently small). Hence,
pick an $x_0$ such that $|f(x_0)|$ is sufficiently small and this inequality is true. Then, $|f(x_n)| \le \theta^n$
by induction so that
\[ |x_{n+1} - x_n| = \frac{|f(x_n)|}{|f'(x_n)|} \le c'\theta^n \]
for some constant c′ > 0. It is not hard to see that this implies that $(x_n)_{n\ge 0}$ is Cauchy (as in Exercise 8.6.14†), so we are
done.
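The Newton iteration $x_{n+1} = x_n - f(x_n)/f'(x_n)$ of this proof can be watched at work in a p-adic field. The following Python sketch is an illustration under stated assumptions: it takes K = Q_11 and f = X² + 2 (the polynomial that appears in Exercise 8.6.26†), starts from x_0 = 3 where f(3) = 11 is 11-adically small, and represents 11-adic integers by their residues modulo 11^k. The function name newton_root is hypothetical, and the three-argument pow with exponent −1 requires Python 3.8 or later.

```python
def newton_root(f, df, x0, p, k):
    """Newton iteration x <- x - f(x)/f'(x) on residues modulo p^k.

    Assumes f(x0) = 0 (mod p) and that f'(x0) is invertible mod p,
    so that each step at least doubles the p-adic precision.
    """
    mod = p ** k
    x = x0 % mod
    while f(x) % mod != 0:
        step = f(x) * pow(df(x), -1, mod)   # f(x)/f'(x) computed modulo p^k
        x = (x - step) % mod
    return x

f = lambda x: x * x + 2      # f = X^2 + 2
df = lambda x: 2 * x         # f' = 2X

root = newton_root(f, df, 3, 11, 3)
print(root, (root * root + 2) % 11**3)
```

Here |f(x_n)|_11 goes from 11^(−1) to 11^(−2) to at most 11^(−4), which mirrors the quadratic decay |f(x_{n+1})| < c|ε_n|² used in the proof.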


Exercise 8.6.23† (Ostrowski). Let (K, | · |) be a complete valued Archimedean field of characteristic
0¹². Prove that it is isomorphic to $(\mathbb{R}, |\cdot|_\infty)$ or $(\mathbb{C}, |\cdot|_\infty)$.

Solution

Without loss of generality, suppose that Q ⊆ K. By Exercise 8.6.19†, we may also assume that
| · | extends the usual absolute value $|\cdot|_\infty$ of Q, by replacing | · | by $|\cdot|^r$ for some suitable r ≥ 1.
A priori, this new absolute value might not satisfy the triangular inequality, but in fact it does. Indeed,
by the power mean inequality, we have
\[ \frac{|x|^r + |y|^r}{2} \ge \left( \frac{|x| + |y|}{2} \right)^r \ge \frac{|x + y|^r}{2^r}. \]

Setting $c = 2^{r-1}$, we get that this absolute value, which we will from now on abusively denote | · |
as well, satisfies the modified triangular inequality |x + y| ≤ c(|x| + |y|). Then, by Exercise 8.6.18†,
| · | satisfies the triangular inequality as desired.

Now, note that K contains (a field isomorphic to) R since it is complete and R is the set of limits
of Cauchy sequences of rational numbers. | · | is then the usual absolute value of R, by construction of
R. Without loss of generality, suppose also that C ⊆ K, by extending | · | to K(i) if necessary.
By Exercise 8.6.21†, we know that we should extend | · | to K(i) by
\[ |\alpha + \beta i| = \sqrt{|\alpha^2 + \beta^2|}, \]
but we don't know if it is indeed an absolute value. To show that it is, note that, if i ∉ K, by
Exercise 8.6.22† there is a constant c > 0 such that

\[ |\alpha^2 + \beta^2| \ge c(|\alpha|^2 + |\beta|^2) \]

for all α, β ∈ K. Indeed, if $|x^2 + 1| \ge 2c$ for all x ∈ K, we have, for any |β| ≥ |α| with β ≠ 0,

\begin{align*}
|\alpha^2 + \beta^2| &= |\beta|^2 |(\alpha/\beta)^2 + 1| \\
&\ge 2c|\beta|^2 \\
&\ge c(|\alpha|^2 + |\beta|^2).
\end{align*}

Thus, for any α, β, γ, δ ∈ K,

\begin{align*}
|(\alpha + \beta i) + (\gamma + \delta i)|^2 &= |(\alpha + \gamma)^2 + (\beta + \delta)^2| \\
&\le 2(|\alpha|^2 + |\beta|^2 + |\gamma|^2 + |\delta|^2) \\
&\le \frac{2}{c}(|\alpha + \beta i|^2 + |\gamma + \delta i|^2)
\end{align*}
where the second line follows from the triangular inequality and the inequality between the arithmetic
and geometric mean, so | · | satisfies the triangular inequality by Exercise 8.6.18† (and the
quadratic-geometric mean inequality).

We will now prove that any element of K is in fact in C, thus showing that K = C as wanted.

Let α be an element of K and let m be the minimum of |α − x| for x ∈ C. This minimum exists
by the Bolzano-Weierstrass theorem: we have |α − x| ≥ |x| − |α| so |α − x| → ∞. If we choose r
such that |α − x| > |α| for |x| > r, we get that the minimum of |α − x| over C is also its minimum
over the ball {x | |x| ≤ r}. However, this ball is compact by the Bolzano-Weierstrass theorem,
and the function x 7→ |α − x| is continuous by the triangular inequality, so a minimum exists by
the extremal value theorem. We wish to prove that this minimum m is zero.

12 In fact it is quite easy to show that char K = 0 follows from the assumption that | · | is Archimedean, but we add

this assumption for the convenience of the reader.



The idea is now to take an x such that |α − x| is large, and, at the same time, A − x divides a
polynomial f such that |f(α)| is small. If we let $g = \frac{f}{A - x}$, we get that |g(α)| is quite small so
that one of |α − z| where z is a root of g is small, and in particular smaller than m. Since the
remainder of a polynomial f modulo A − x is f(x), we can relax the condition to |f(α)| small
and |f(x)| as well. With these conditions, it is natural to pick f first and then x: an obvious
candidate for f is
\[ f = (A - y)^n \]
where y is such that |α − y| = m. Now, we need to estimate |f(α) − f(x)|. By the triangular
inequality, it is at most $m^n + |x - y|^n$. In particular, if ε = |x − y| < m, it is at most $m^n$ plus
something comparatively very small. In addition, by definition of m, we know that $|g(\alpha)| \ge m^{n-1}$, where $g = \frac{f - f(x)}{A - x}$,
since g is a monic polynomial of degree n − 1 which splits into linear factors over C.
Hence,
\[ |\alpha - x| m^{n-1} \le |g(\alpha)||\alpha - x| = |f(\alpha) - f(x)| \le m^n + \varepsilon^n. \]
This means that, if m is non-zero, by dividing by $m^{n-1}$,

\[ |\alpha - x| \le m\left(1 + (\varepsilon/m)^n\right) \to m \]

as n → ∞. Thus, |α − x| = m for all |x − y| < m (recall that |α − x| ≥ m by minimality of m). Iterating this process, we get |α − x| = m for all x ∈ C
which is obviously a contradiction since |α − x| goes to ∞ when |x| → ∞. Hence, |α − z| = 0 for
some z ∈ C, i.e. α = z ∈ C as wanted.


Exercise 7.5.1† (Residue Field). Note that $O_K^\times = O_K \setminus p_K$. In other words, $p_K$ consists precisely
of those elements of $O_K$ which are not invertible. This proves that $p_K$ is the unique maximal ideal of
$O_K$, as if a is an ideal of $O_K$ and $\alpha \in a \setminus p_K$, then $O_K = (\alpha)$ so that $a = O_K$.

For the second part, note that κ is a field by Exercise A.3.23†, and is naturally a field extension of
$\mathbb{Z}_p/p\mathbb{Z}_p \simeq \mathbb{F}_p$ through the embedding $a \pmod{p\mathbb{Z}_p} \mapsto a \pmod{p_K}$. To show that $[\kappa : \mathbb{F}_p] \le [K : \mathbb{Q}_p]$,
we prove that if $x_1 \pmod{p_K}, \dots, x_n \pmod{p_K} \in \kappa$ are $\mathbb{F}_p$-linearly independent, then $x_1, \dots, x_n \in K$
are $\mathbb{Q}_p$-linearly independent. Suppose otherwise: let $\lambda_1, \dots, \lambda_n \in \mathbb{Q}_p$ be such that

\[ \lambda_1 x_1 + \dots + \lambda_n x_n = 0, \]

and not all $\lambda_i$ are zero. Let k be such that $|\lambda_k|_p$ is maximal. Then,
\[ \frac{\lambda_1}{\lambda_k} x_1 + \dots + \frac{\lambda_n}{\lambda_k} x_n = 0 \]
is a $\mathbb{Z}_p$-linear dependence (because all $\lambda_i/\lambda_k$ have absolute value at most 1), which is non-trivial
modulo $p_K$ because the kth coefficient is 1. This shows that $x_1 \pmod{p_K}, \dots, x_n \pmod{p_K}$ are
linearly dependent as well.

Solution

Exercise 8.6.25† . Let K be a finite extension of Qp . Prove that OK is compact.

Solution

Pick a primitive element β of $K/\mathbb{Q}_p$ with conjugates $\beta_1, \dots, \beta_d$, and consider an element
$x = \sum_{i=0}^{d-1} b_i \beta^i \in O_K$. By definition of the p-adic absolute value, we also have $|x_j| \le 1$, where $x_j$ is
the image of x under the embedding $\beta \mapsto \beta_j$. To conclude, Cramer's rule (Exercise C.5.9†) or
the adjugate (Proposition C.3.7) let us express the $b_i$ as linear combinations of the $x_j$, with coefficients
depending only on the $\beta_j^i$. Then, using the triangle inequality, we conclude that $|b_0|_p, \dots, |b_{d-1}|_p$ are bounded. As a
consequence, the set
\[ \left\{ (b_0, \dots, b_{d-1}) \;\middle|\; \left| \sum_{i=0}^{d-1} b_i \beta^i \right| \le 1 \right\} \subseteq \mathbb{Q}_p^d \]
is compact, as a closed subset of a compact set. Thus, $O_K$ is as well.




Diophantine Equations
Exercise 8.6.26† (Brazilian Mathematical Olympiad 2010). Find all positive rational integers n and
x such that $3^n = 2x^2 + 1$.

Solution
We proceed as in Proposition 8.5.1: working in $\mathbb{Q}(\sqrt{-2})$, we find $1 + \sqrt{-2}\,x = \pm(1 \pm \sqrt{-2})^n$, i.e.
$$(1 + \sqrt{-2})^n + (1 - \sqrt{-2})^n = \pm 2.$$
To solve this, we shall work in $\mathbb{Q}_{11}$. We thus consider $\alpha = 1 \pm \sqrt{-2}$ and $\beta = 1 \mp \sqrt{-2}$ as elements of $\mathbb{Q}_{11}$; Hensel's lemma gives us $\alpha \equiv 20 \pmod{121}$ and $\beta \equiv 103 \pmod{121}$. We wish to find the zeros of the linear recurrence $\alpha^n + \beta^n \pm 2$. Note that we have $\alpha^n + \beta^n \equiv \pm 2$ modulo $11$ only when $n \in \{0, 1, 2\}$ modulo $5$, so we restrict our attention to these $n$.
Set $a = \alpha^5 - 1 \equiv 0 \pmod{11}$ and $b = \beta^5 - 1 \equiv 0 \pmod{11}$. We shall compute the Strassmann bounds of the power series
$$f_r(s) = \alpha^r (1 + a)^s + \beta^r (1 + b)^s \pm 2$$
for $r \in \{0, 1, 2\}$. Modulo $11^2$, we have
$$f_r(s) \equiv \alpha^r (1 + as) + \beta^r (1 + bs) \pm 2.$$
The coefficient of $s$ is $\alpha^r a + \beta^r b$. For $r \in \{1, 2\}$, this is respectively $44$ and $88$ modulo $11^2$, so non-zero in both cases. Hence, the Strassmann bounds for $f_1$ and $f_2$ are $1$. It remains to compute the Strassmann bound for $f_0$. This time, we have $a + b \equiv 0 \pmod{11^2}$ so we need to expand one more term. We get
$$f_0(s) = (1 + a)^s + (1 + b)^s - 2 \equiv 1 + as + a^2 \binom{s}{2} + 1 + bs + b^2 \binom{s}{2} - 2 \pmod{11^3}.$$
The coefficient of $s^2$ is thus $\frac{a^2 + b^2}{2}$ modulo $11^3$. We can check with Hensel's lemma that $\alpha \equiv 746 \pmod{11^3}$ and $\beta \equiv 587 \pmod{11^3}$, which yields $a \equiv 1122 \pmod{11^3}$ and $b \equiv 209 \pmod{11^3}$. One can then verify that
$$a^2 + b^2 \equiv 847 \not\equiv 0 \pmod{11^3}.$$
Hence, the Strassmann bound for $f_0$ is $2$.
To finish, we need to find solutions: two solutions congruent to $0$ modulo $5$, one congruent to $1$ modulo $5$, and one congruent to $2$ modulo $5$. It is not hard to see that we indeed have
$$3^0 = 2 \cdot 0^2 + 1, \qquad 3^1 = 2 \cdot 1^2 + 1, \qquad 3^2 = 2 \cdot 2^2 + 1, \qquad 3^5 = 2 \cdot 11^2 + 1.$$
Hence, we have found all solutions: $(n, x) \in \{(0, 0), (1, 1), (2, 2), (5, 11)\}$.
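
The modular computations in this solution are easy to double-check by machine. The following Python sketch is only a sanity check under stated assumptions: the helper names are ours (they do not come from the book), the square roots of $-2$ are found by brute force rather than by Hensel lifting, and the final search only covers exponents up to an arbitrary bound. Since $\alpha$ and $\beta$ play symmetric roles, their labelling is interchangeable.

```python
from math import isqrt

def square_roots_of_minus_two(modulus):
    """All t in [0, modulus) with t^2 ≡ -2 (mod modulus), found by brute force."""
    return [t for t in range(modulus) if (t * t + 2) % modulus == 0]

def residues_mod_5(alpha, beta):
    """Residues n mod 5 for which alpha^n + beta^n ≡ ±2 (mod 11)."""
    return [n for n in range(5)
            if (pow(alpha, n, 11) + pow(beta, n, 11)) % 11 in (2, 9)]

def linear_coefficient(alpha, beta, r, modulus=11**2):
    """alpha^r * a + beta^r * b modulo 11^2, with a = alpha^5 - 1 and b = beta^5 - 1."""
    a = pow(alpha, 5, modulus) - 1
    b = pow(beta, 5, modulus) - 1
    return (pow(alpha, r, modulus) * a + pow(beta, r, modulus) * b) % modulus

def quadratic_data(alpha, beta, modulus=11**3):
    """Return (a + b, a^2 + b^2) modulo 11^3."""
    a = pow(alpha, 5, modulus) - 1
    b = pow(beta, 5, modulus) - 1
    return (a + b) % modulus, (a * a + b * b) % modulus

def small_solutions(max_n):
    """Pairs (n, x) with 3^n = 2x^2 + 1, x >= 0 and 0 <= n <= max_n."""
    found = []
    for n in range(max_n + 1):
        half = (3**n - 1) // 2  # 3^n - 1 is always even
        x = isqrt(half)
        if x * x == half:
            found.append((n, x))
    return found

# The two values of 1 ± sqrt(-2) modulo 11^2 and modulo 11^3.
roots_121 = sorted((1 + t) % 11**2 for t in square_roots_of_minus_two(11**2))
roots_1331 = sorted((1 + t) % 11**3 for t in square_roots_of_minus_two(11**3))
```

Calling `small_solutions(60)` lists the exponents found in the range $0 \le n \le 60$; the Strassmann argument is what guarantees that nothing new appears beyond any such bound.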


Exercise 8.6.29†. Solve the diophantine equation $x^2 - y^3 = 1$ over $\mathbb{Z}$.

Solution

Write this equation as $(x - 1)(x + 1) = y^3$. The gcd of the two factors divides $2$, so we have $x - 1 = a^3$ and $x + 1 = b^3$, or $x \pm 1 = 2a^3$ and $x \mp 1 = 4b^3$, for some $a, b \in \mathbb{Z}$. In the former case, $b^3 - a^3 = 2$ forces $(a, b) = (-1, 1)$, which gives the solution $(x, y) = (0, -1)$. We now treat the latter case. The problem thus reduces to solving the equation $a^3 - 2b^3 = \pm 1$ in rational integers. We know by Section 7.4 that
$$a - b\sqrt[3]{2} = \pm\theta^n$$
for some $n$, where $\theta$ is the fundamental unit of $\mathbb{Q}(\sqrt[3]{2})$. In addition, by Exercise 7.5.13†, we know that we can choose $\theta = 1 - \sqrt[3]{2} = -\frac{1}{1 + \sqrt[3]{2} + \sqrt[3]{4}}$. Hence, we wish to have $a - b\sqrt[3]{2} = \pm(1 - \sqrt[3]{2})^n$. As we saw in the proof of Theorem 7.4.2, for a given $n$, there are such $a, b$ if and only if
$$(1 - \sqrt[3]{2})^n + j(1 - j\sqrt[3]{2})^n + j^2(1 - j^2\sqrt[3]{2})^n = 0.$$

We work in $\mathbb{Q}_3(\alpha, j)$, where $\alpha^3 = 2$ and $j$ is now a tryadic root of unity of order $3$. Note that this has degree $6$ over $\mathbb{Q}_3$ since $j \notin \mathbb{Q}_3$ and $\alpha \notin \mathbb{Q}_3(j)$ (for instance because $\mathrm{Gal}(\mathbb{Q}_3(j)/\mathbb{Q}_3)$ is abelian but $\mathrm{Gal}(\mathbb{Q}_3(\alpha)/\mathbb{Q}_3)$ isn't). In particular, $\alpha_0 = \alpha$, $\alpha_1 = j\alpha$ and $\alpha_2 = j^2\alpha$ are conjugate. We wish to find when the linear recurrence $(1 - \alpha)^n + j(1 - j\alpha)^n + j^2(1 - j^2\alpha)^n$ is zero. Here is the magic: this is already almost a tryadic convergent power series. Indeed, we can rewrite it as
$$2^n\left((1 + \pi_0)^n + j(1 + \pi_1)^n + j^2(1 + \pi_2)^n\right)$$
where $\pi_k = -(1 + \alpha j^k)/2$ has norm $-3/8$ and thus tryadic absolute value $3^{-1/3} < 1$. (In fact, $1 + \sqrt[3]{2}$ is prime in $\mathcal{O}_{\mathbb{Q}(\sqrt[3]{2})} = \mathbb{Z}[\sqrt[3]{2}]$.) However, $3^{-1/3}$ is still too large: it's greater than $3^{-1/(3-1)}$. Hence, we consider the function
$$f_r(s) = (1 + \pi_0)^r (1 + \pi_0)^{3s} + j(1 + \pi_1)^r (1 + \pi_1)^{3s} + j^2 (1 + \pi_2)^r (1 + \pi_2)^{3s}$$
for $r \in \{0, 1, 2\}$. Indeed, these converge since $(1 + \pi_k)^3 = 1 + 3(-3/8 - \alpha_k/8 + \alpha_k^2/8)$, where the term $3(-3/8 - \alpha_k/8 + \alpha_k^2/8)$ has absolute value $3^{-1} < 3^{-1/(3-1)}$. It is then straightforward to compute the Strassmann bounds: we claim that it is $1$ for $r = 0$ and $r = 1$, and $0$ for $r = 2$. Let us start with $r = 0$. In that case,
$$f_0(s) \equiv 3s \sum_{k=0}^{2} j^k \left(-3/8 - \alpha j^k/8 + \alpha^2 j^{2k}/8\right) \pmod{27}.$$
It is in fact very easy to compute a sum of the form $\sum_{k=0}^{2} j^k \sum_i a_i \alpha^i j^{ki}$: this is a unity root filter so is the sum $3\sum_{i \equiv -1 \pmod 3} a_i \alpha^i$ (see Exercise A.3.9†). This is actually normal: it's why we considered this sum in the first place. In particular, this also explains why this congruence holds modulo $3^3$ instead of simply $3^2$: it's because of the additional factor of $3$ added by the unity root filter. Hence, the coefficient of $s$ of $f_0$ is $9\alpha^2/8$ which has tryadic valuation $2 < 3$ and $1$ is thus the Strassmann bound for $f_0$. Conversely, it is clear that $s = 0$ is a solution (corresponding to $1 = (1 - \sqrt[3]{2})^0$).
We now consider $f_1$. As before, we are done if the coefficient of $s$ has tryadic valuation $2$ since all the following ones have valuation at least $3$. We expand $f_1$ modulo $27$, and remember that we only care about the coefficient of $\alpha^{3n+2}$:
$$\begin{aligned}
f_1(s) &= \sum_{k=0}^{2} j^k (1 - \alpha_k)/2 \cdot \left(1 + 3(-3/8 - \alpha_k/8 + \alpha_k^2/8)\right)^s \\
&\equiv \sum_{k=0}^{2} j^k (1 - \alpha_k)/2 + 3s \sum_{k=0}^{2} j^k (1 - \alpha_k)/2 \cdot (-3/8 - \alpha_k/8 + \alpha_k^2/8) \pmod{27} \\
&= 9s\alpha^2/8
\end{aligned}$$
which has absolute value $3^{-2}$ as desired. It is again clear that $s = 0$ is a solution (corresponding to $1 - \sqrt[3]{2} = (1 - \sqrt[3]{2})^1$).

Finally, we consider $f_2$. Here is what changes: the coefficient of $s^0$ is no longer zero because $(1 - \alpha)^2$ now has a non-zero coefficient for some $\alpha^{3n+2}$. More specifically,
$$\begin{aligned}
f_2(s) &= \sum_{k=0}^{2} j^k (1 - \alpha_k)^2/4 \cdot \left(1 + 3(-3/8 - \alpha_k/8 + \alpha_k^2/8)\right)^s \\
&\equiv \sum_{k=0}^{2} j^k (1 - \alpha_k)^2/4 \pmod 9 \\
&= 3\alpha^2/4
\end{aligned}$$
which has absolute value $3^{-1}$ as desired. This shows that the Strassmann bound is $0$, and concludes our study of the equation $a - b\sqrt[3]{2} = \pm(1 - \sqrt[3]{2})^n$: the only solutions are $a = \pm 1$, $b = 0$ as well as $a = b = \pm 1$. If we go back to the original problem, these correspond to $x \pm 1 = 2a^3 = \pm 2$, i.e. $x \in \{\pm 1, \pm 3\}$. These then yield $(x, y) \in \{(\pm 1, 0), (\pm 3, 2)\}$. Together with $(x, y) = (0, -1)$, which comes from the first case $x - 1 = a^3$, $x + 1 = b^3$ with $(a, b) = (-1, 1)$, these are, in conclusion, the only rational integer solutions to the equation $x^2 - y^3 = 1$.
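
As a numerical cross-check of this solution, the following Python sketch computes the powers of the unit $1 - \sqrt[3]{2}$ exactly and looks for exponents whose $\sqrt[3]{4}$-coefficient vanishes, then brute-forces $x^2 - y^3 = 1$ on a small window. Everything here is an illustration under our own assumptions: elements of $\mathbb{Z}[\theta]$ with $\theta^3 = 2$ are stored as triples, the function names are hypothetical, and the search bounds are arbitrary (only the tryadic argument covers all exponents).

```python
from math import isqrt

def multiply(u, v):
    """Product in Z[θ], θ^3 = 2, of u = u0 + u1 θ + u2 θ^2 and v likewise."""
    u0, u1, u2 = u
    v0, v1, v2 = v
    return (u0 * v0 + 2 * (u1 * v2 + u2 * v1),
            u0 * v1 + u1 * v0 + 2 * u2 * v2,
            u0 * v2 + u1 * v1 + u2 * v0)

THETA = (1, -1, 0)             # the unit 1 - θ
THETA_INVERSE = (-1, -1, -1)   # its inverse -(1 + θ + θ^2)

def exponents_without_theta_squared(bound):
    """Exponents n with |n| <= bound such that (1 - θ)^n has no θ^2 term."""
    good = [0]
    for step, sign in ((THETA, 1), (THETA_INVERSE, -1)):
        power = (1, 0, 0)
        for n in range(1, bound + 1):
            power = multiply(power, step)
            if power[2] == 0:
                good.append(sign * n)
    return sorted(good)

def small_solutions(y_min, y_max):
    """All integer pairs (x, y) with x^2 - y^3 = 1 and y_min <= y <= y_max."""
    found = set()
    for y in range(y_min, y_max + 1):
        square = y**3 + 1
        if square >= 0 and isqrt(square) ** 2 == square:
            x = isqrt(square)
            found.update({(x, y), (-x, y)})
    return sorted(found)
```

The first function reflects the criterion used in the solution: $\pm(1 - \sqrt[3]{2})^n$ has the shape $a - b\sqrt[3]{2}$ exactly when its $\sqrt[3]{4}$-coefficient is zero.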


Exercise 8.6.30† (Lebesgue). Solve the equation $x^2 + 1 = y^n$ over $\mathbb{Z}$, where $n \ge 3$ is an odd integer.

Solution

Suppose $(x, y)$ is a solution. By the unique factorisation in $\mathbb{Z}[i]$, we have $xi + 1 = \varepsilon(a + bi)^n$ for some $a, b \in \mathbb{Z}$ and $\varepsilon \in \mathbb{Z}[i]$ a unit. Note that $y = a^2 + b^2$ is odd, since $x^2 + 1$ is never divisible by $4$, so one of $a, b$ is even and the other is odd. Since the units of $\mathbb{Z}[i]$ have the form $i^k$ for some $k$ by Exercise 2.2.3∗, they are all $n$th powers since $n$ is odd, so we can assume $\varepsilon = 1$.

Hence, we wish to find the $a, b \in \mathbb{Z}$ such that $(a + bi)^n + (a - bi)^n = 2$. Since $n$ is odd, this is divisible by $2a$ so $a$ is $\pm 1$. Since $y = a^2 + b^2$ is odd, $b$ must be even. Expanding the real part of $(a + bi)^n$ (which must be $1$), we get
$$\sum_{k=0}^{\frac{n-1}{2}} \binom{n}{2k} a^{n-2k} (-1)^k b^{2k} = 1.$$
Modulo $b^2$ we get $b^2 \mid 1 - a^n \in \{0, 2\}$ and, since $b$ is even, $b^2$ is either $0$ or at least $4$, so $a$ must be $1$. In other words, our equation becomes
$$(1 + bi)^n + (1 - bi)^n = 2.$$
We wish to expand the LHS as a dyadic convergent power series, but this is not possible because $|b|_2$ might be equal to $2^{-\frac{1}{2-1}} = 2^{-1}$, i.e. $b$ might have dyadic valuation equal to $1$. To remedy this situation, we use the LTE lemma:
$$(1 + bi)^2 = 1 + 2b(i - b/2).$$
Since $n$ is odd, we can set $n = 2m + 1$ and reduce the problem to finding the zeros of the now dyadic convergent power series
$$f(m) = (1 + bi)(1 + 2b(i - b/2))^m + (1 - bi)(1 + 2b(-i - b/2))^m - 2$$
where $i$ is now a square root of $-1$ in a quadratic extension of $\mathbb{Q}_2$. Since this is a roots of unity filter (see Exercise A.3.9†), as in Exercise 8.6.29†, $f(m)$ is twice the "real dyadic" part of $(1 + bi)(1 + 2b(i - b/2))^m - 1$, i.e. the coefficient of $1$ in this expression. Now expand this as
$$-1 + (1 + bi)\sum_{k=0}^{\infty} \binom{m}{k} (2b(i - b/2))^k.$$
Suppose that $b \ne 0$, otherwise we get $x = 0$ and $y = 1$. Since $v_2(k!) \le k - 1$ by Legendre's formula, every term except the first two vanishes modulo $2b^2$. Hence, modulo $2b^2$, this is simply
$$-1 + (1 + bi)(1 + 2bm(i - b/2)).$$
If we expand this while focusing only on the real dyadic part, we get
$$(1 - b^2 m) + bi \cdot 2bim - 1 = -3b^2 m.$$
Since $|b^2|_2 > |2b^2|_2$, we conclude that the Strassmann bound is (at most) $1$. Since $m = 0$ is a trivial solution, we conclude that it is the only solution (corresponding to $n = 1$, which is not the case). Thus, the only solution is $(x, y) = (0, 1)$.


Remark 8.6.3
We can also finish directly with a slightly ad-hoc dyadic method once we reach
$$\sum_{k=1}^{\frac{n-1}{2}} \binom{n}{2k} (-1)^k b^{2k-2} = 0.$$
Let $m$ be the dyadic valuation of $\binom{n}{2} = \frac{n(n-1)}{2}$. We will prove that $2^{m+1}$ divides $\binom{n}{2}$, which is of course a contradiction. The denominator of $\binom{n}{2k}$ is $(2k)!$. By Legendre's formula, we have $v_2((2k)!) = 2k - s_2(2k) \le 2k - 1$. As a result, $\frac{b^{2k-2}}{(2k)!}$ has dyadic valuation at least $-1$. Since $(n - 1)(n - 3)$ divides $(2k)!\binom{n}{2k}$ for $k \ge 2$, we conclude that
$$v_2\left(\binom{n}{2k} b^{2k-2}\right) \ge v_2((n - 1)(n - 3)) + v_2\left(b^{2k-2}/(2k)!\right) \ge m + 2 - 1 = m + 1$$
as wanted since $m = v_2\left(\frac{n-1}{2}\right) = v_2(n - 1) - 1$. Hence, $2^{m+1}$ divides every term of the sum
$$\sum_{k=0}^{\frac{n-1}{2}} \binom{n}{2k+2} (-1)^k b^{2k} = 0$$
except the first one, which means that it also divides the first one.

Exercise 8.6.31†. Solve the equation $x^2 + 1 = 2y^n$ over $\mathbb{Z}$, where $n \ge 3$ is an odd integer.

Solution

Suppose $(x, y)$ is a solution. By factorising in $\mathbb{Z}[i]$, we get $xi + 1 = \varepsilon(1 + i)(a + bi)^n$ for some $a, b \in \mathbb{Z}$ and a unit $\varepsilon \in \mathbb{Z}[i]$. Note that $y = a^2 + b^2$ is odd, since $x^2 + 1$ is never divisible by $4$, so one of $a, b$ is even and the other is odd. Since the units of $\mathbb{Z}[i]$ have the form $i^k$ for some $k$ by Exercise 2.2.3∗, they are all $n$th powers since $n$ is odd, so we can assume $\varepsilon = 1$.
By assumption,
$$\begin{aligned}
2 &= (1 + ix) + (1 - ix) \\
&= (1 + i)(a + bi)^n + (1 - i)(a - bi)^n \\
&= i(1 - i)(a + bi)^n + (1 - i)(a - bi)^n \\
&= (1 - i)\left((\pm b \mp ia)^n + (a - bi)^n\right)
\end{aligned}$$
where the $\pm$ sign depends on $n$ modulo $4$. Since $n$ is odd, this last expression is divisible by
$$\pm b \mp ia + a - bi = (a - b)(1 \mp i).$$
Thus, $(1 - i)(a - b)(1 \mp i)$ divides $2$. This is equivalent to $a - b \mid 1$, so $a - b = \pm 1$. Without loss of generality, suppose that $a$ and $b$ are non-zero since $\{|a|, |b|\} = \{0, 1\}$ yields $y = 1$ and thus $x = \pm 1$. Now we distinguish a few cases, depending on which one of $a$ and $b$ is even and whether $a - b$ is $1$ or $-1$.
1. $b$ is even and $a - b = 1$. In that case, our equation is

$$f(n) := (1 + i)(1 + b(1 + i))^n + (1 - i)(1 + b(1 - i))^n - 2 = 0.$$

Unlike Exercise 8.6.30†, this is already a dyadic convergent power series since $|1 + i|_2 = 2^{-1/2}$ which means that $|b(1 + i)|_2 \le 2^{-3/2} < 2^{-\frac{1}{2-1}}$ (we are working with the dyadic $i$, a square root of $-1$ in an extension of $\mathbb{Q}_2$). This is a unity root filter, so we are just focusing on the "real dyadic" part of $(1 + i)(1 + b(1 + i))^n - 1$. When we expand this modulo $b^3$, we get

$$(1 + i)\left(1 + b(1 + i)n + b^2(1 + i)^2 n(n - 1)/2\right) - 1 = i + 2bin + (1 + i)2b^2 i n(n - 1)/2$$

since $(1 + i)^2 = 2i$, which has real dyadic part $-b^2 n(n - 1)$. Clearly, $|b^2|_2 > |b^3|_2$ since $b \ne 0$ so the Strassmann bound is $2$. The previous computation in fact shows that the first two coefficients of $f$ are zero (when written as a Mahler series $\sum_{k=0}^{\infty} a_k \binom{x}{k}$), which means that $n = 0$ and $n = 1$ are solutions. In other words, these are the only solutions, which are ruled out by the statement.
2. $b$ is even and $a - b = -1$. Since $n$ is odd, we have $(-1 + b(1 \pm i))^n = -(1 - b(1 \pm i))^n$ so our equation is

$$f(n) := (1 + i)(1 - b(1 + i))^n + (1 - i)(1 - b(1 - i))^n + 2 = 0.$$

The same computation as before shows that the coefficient of $\binom{n}{1}$ is $-2bi + 2bi = 0$. Thus, modulo $2b^2$, we have

$$f(n) \equiv 4 + 0n$$

so the Strassmann bound is $0$ since $|2b^2|_2 < |4|_2$. There are no solutions in this case.
3. $a$ is even. Then, $a + bi = \pm i + a(1 + i)$ so the equation is

$$(1 + i)(\pm i + a(1 + i))^n + (1 - i)(\pm i + a(1 - i))^n - 2 = 0.$$

Since $(\pm i + \alpha)^n = \pm i(1 \pm i\alpha)^n$ for any $\alpha \in \mathbb{Q}_2(i)$, where the $\pm$ are independent and depend on whether $n \equiv 1 \pmod 4$ or $n \equiv -1 \pmod 4$, our equation is

$$f(n) = (1 + i)(1 \pm ia(1 + i))^n + (1 - i)(1 \pm ia(1 - i))^n \pm 2i = 0$$

where the first two $\pm$ signs are the same and the last one is independent. We will prove that the Strassmann bound is always $0$. Modulo $2a$ (this is a unity root filter so the "real dyadic" part gets doubled), we have

$$f(n) \equiv 2(1 \pm i).$$

Since $|2a|_2 < |2(1 \pm i)|_2$, we are done.


To conclude, there are no solutions to the equation $x^2 + 1 = 2y^n$ when $y$ is not equal to $1$ and $n \ge 3$, i.e. the only solutions to our equation are $(\pm 1, 1)$.


Linear Recurrences
Exercise 8.6.32†. Let $(u_n)_{n \ge 0}$ be a linear recurrence of rational integers given by $u_n = \sum_i f_i(n)\alpha_i^n$ such that $\alpha_i/\alpha_j$ is not a root of unity for $i \ne j$. If $u_n$ is not of the form $a\alpha^n$ for some $a, \alpha \in \mathbb{Z}$, prove that there are infinitely many prime numbers $p$ such that $p \mid u_n$ for some integer $n \ge 0$.

Solution

Without loss of generality, suppose that $u_n$ is not identically zero. By Corollary 8.4.2 and Corollary 8.4.1, the condition on the $\alpha_i$ tells us that $|u_n| \to \infty$. The idea is that we will bound the $p$-adic valuation of $u_n$ over a subsequence $(u_{an+b})_{n \ge 0}$ to get a contradiction if $(u_n)_{n \ge 0}$ has finitely many prime divisors (since $(u_{an+b})_{n \ge 0}$ would then be bounded).

We shall analyze the local behaviour of $(u_n)_{n \ge 0}$ for a fixed prime $p$. Write $u_n = \sum_i f_i(n)\alpha_i^n$. We wish to factorise the $\alpha_i$ by a suitable power of $p$ so that $\max_i |\alpha_i|_p = 1$. Indeed, since $|p^{1/n}|_p^n = |p|_p = 1/p$, the absolute values of powers of $p$ take any value which can be taken by $|\cdot|_p$. Thus, suppose that $\max_i |\alpha_i|_p = 1$ and consider the sequence $v_n = \sum_{i \in I} f_i(n)\alpha_i^n$ where $I$ denotes the set of $i$ such that $|\alpha_i|_p = 1$. Let $K_p$ be the field generated by the $\alpha_i$. We claim that the integers $\mathcal{O}_{K_p} := \{x \in K_p \mid |x|_p \le 1\}$ of $K_p$ are finite modulo $p^k$, for any fixed $k$. This implies (by the pigeonhole principle) that $(v_n)_{n \ge 0}$ is periodic modulo $p^k$ for any $k$. To prove our claim, suppose for the sake of contradiction that there were infinitely many elements of $\mathcal{O}_{K_p}$ non-congruent modulo $p^k$, say $S$ is a set of such elements. By Exercise 8.6.25†, $\mathcal{O}_{K_p}$ is compact, so the infinite set $S$ has an accumulation point. This implies that there are distinct $s, r \in S$ such that $|s - r|_p$ is arbitrarily small, but then they will be equal modulo $p^k$ since
$$u \equiv v \pmod{p^k} \iff \frac{u - v}{p^k} \in \mathcal{O}_{K_p} \iff |u - v|_p \le |p^k|_p = p^{-k}.$$

To conclude, note that $(v_n)_{n \ge 0}$ is non-zero for large $n$ by the Skolem-Mahler-Lech theorem. Pick any $N$ so that $v_N$ is non-zero, and let $T_p$ be the period of $(v_n)_{n \ge 0}$ modulo $p^{\lfloor v_p(v_N) \rfloor + 1}$. Then, $|v_{N + kT_p}|_p$ is greater than some constant $c > 0$ for any $k$, and thus $|u_{N + kT_p}|_p$ as well for sufficiently large $N + kT_p$, since $u_n - v_n \to 0$ $p$-adically. If we finally return to the global behaviour and let $p$ vary among our finitely many prime divisors of $(u_n)_{n \ge 0}$, we get that, for any sufficiently large $N$, $v_p\left(u_{N + k\prod_p T_p}\right)$ is bounded for any $p$ and for any sufficiently large $k$. This contradicts the assumption that $|u_n| \to \infty$.


Remark 8.6.4
Note that, to prove that αn is periodic modulo p for α ∈ OKp , we cannot simply "convert" (with
the fundamental theorem of symmetric polynomials, after having introduced its conjugates) α to
an element of Fp and use the Frobenius morphism. Why? Because the minimal polynomial of
α does not necessarily have coefficients in Zp . Indeed, we only consider the constant coefficient
of the minimal polynomial of α to compute its p-adic absolute value, and disregard all other
coefficients. For instance, the roots of X 2 − X/2 + 1 over Q2 are in the unit ball.

As another remark, it has in fact been proven, using a generalisation of (a p-adic extension of)
the Thue-Siegel-Roth theorem (see Remark 7.4.3) that $u_n$ either has the form $c\alpha^n$, or its greatest
prime factor tends to infinity. See [44].

Exercise 8.6.33†. Does there exist an unbounded linear recurrence $(u_n)_{n \ge 0}$ such that $u_n$ is prime for all $n$?

Solution

Suppose for the sake of contradiction that $(u_n)_{n \ge 0}$ is such a sequence. Without loss of generality, suppose that $|u_n| \to \infty$ by replacing $(u_n)_{n \ge 0}$ by $(u_{Nn+m})_{n \ge 0}$ for some suitable $N, m$, as indicated after Corollary 8.4.2. Now, let $m$ be sufficiently large so that $u_m = p$ is a prime which doesn't divide the denominator of the norm of any algebraic number appearing in the formula of $u_m$ (so that they still make sense modulo $p$). Finite fields theory (e.g. Theorem 4.2.1) tells us that there exists a $k$ such that $u_{np^k} \equiv u_n \pmod p$ for any $n$. Indeed, if $u_n \equiv \sum_i f_i(n)\alpha_i^n$, with $f_i \in \overline{\mathbb{F}_p}[X]$ and $\alpha_i \in \overline{\mathbb{F}_p}$, it suffices to pick $k$ so that $f_i \in \mathbb{F}_{p^k}[X]$ and $\alpha_i \in \mathbb{F}_{p^k}$, by the Frobenius morphism.

In particular, $u_{mp^{\ell k}} \equiv 0 \pmod p$ for any $\ell$. By assumption, this means that $u_{mp^{\ell k}} = p$, contradicting the fact that $|u_n| \to \infty$.


Miscellaneous
Exercise 8.6.34† . Which roots of unity are in Qp ?

Solution

Let $\alpha = (a_1, a_2, \ldots)$ be a root of unity of order $n$ in $\mathbb{Q}_p$. Suppose initially that $p$ is odd. We first focus on the case where $p \nmid n$. We have $a_k^n \equiv 1 \pmod{p^k}$. However, the group of units modulo $p^k$ is isomorphic to $\mathbb{Z}/p^{k-1}(p-1)\mathbb{Z}$ by Exercise 3.5.18† (in more elementary terms: there is a primitive root) so we also have
$$a_k^{\gcd(p-1,\,n)} \equiv a_k^{\gcd(p^{k-1}(p-1),\,n)} \equiv 1 \pmod{p^k}.$$
Hence, $\alpha$ has order dividing $\gcd(p - 1, n)$, which implies that $n \mid p - 1$ since $n$ is the order of $\alpha$.
Now suppose that $p \mid n$. We wish to reach a contradiction, so suppose without loss of generality that $\alpha$ has order exactly $p$, by replacing it by $\alpha^{n/p}$. Then, $a_k^p \equiv 1 \pmod{p^k}$ so $a_k \equiv 1 \pmod p$ which implies that
$$v_p(a_k^p - 1) = 1 + v_p(a_k - 1)$$
by LTE. For large $k$, $v_p(a_k - 1)$ stabilises since $\alpha \ne 1$, which means that this cannot be at least $k$.

It remains to provide a construction for $(p - 1)$th roots of unity. One can do this using the structure of $(\mathbb{Z}/p^k\mathbb{Z})^\times$, or by means of Hensel's lemma: the derivative of $X^p - X$ is $pX^{p-1} - 1 \equiv -1 \pmod p$, which is never zero, so we can lift all roots of $X^p - X$ modulo $p$ to roots in $\mathbb{Q}_p$. This is called the Teichmüller character $\omega$, which sends $x \in (\mathbb{Z}/p\mathbb{Z})^\times$ to the unique root of $X^{p-1} - 1$ congruent to $x$ modulo $p$.

It remains to treat the case where $p = 2$. When $n$ is odd, the same argument as before works: this time we even have $a_k^{\gcd(2^{k-2}(2-1),\,n)} \equiv 1 \pmod{2^k}$ for $k \ge 2$. However, unlike the previous case, there is now a root of unity of order $2$: $-1$. Since the only root of unity of odd order is $1$, the order of any root of unity must be a power of $2$, since $\alpha^{2^{v_2(n)}}$ is a root of unity of odd order. Hence, we shall prove that there is no root of unity of order $4$. This is easy: we use the LTE for $p = 2$ (which simply amounts to the fact that an odd square is always $1$ modulo $4$) to get

$$v_2(a_k^4 - 1) = 1 + v_2(a_k^2 - 1)$$

and this stabilises since $\alpha^2 \ne 1$ by assumption. This time, the Teichmüller character is defined as $\omega : (\mathbb{Z}/4\mathbb{Z})^\times \to \mathbb{Q}_2$ sending $1$ to $1$ and $-1$ to $-1$.

To conclude, the roots of unity of $\mathbb{Q}_p$ are all $(p - 1)$th roots of unity, as well as a root of order $2$ when $p = 2$.
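
The Hensel lifting used to construct the $(p-1)$th roots of unity can be imitated numerically modulo $p^k$. The sketch below is one possible illustration for odd $p$, not the book's construction: the function name `teichmuller` and the choice of running $k$ Newton steps are our assumptions. It relies on Python's three-argument `pow` with exponent $-1$ for modular inverses, which needs Python 3.8 or later.

```python
def teichmuller(x, p, k):
    """Root of X^(p-1) - 1 modulo p^k congruent to x modulo p (p odd prime, p ∤ x)."""
    modulus = p**k
    t = x % modulus
    # Newton iteration t <- t - f(t)/f'(t) for f = X^(p-1) - 1.
    # f(x) ≡ 0 (mod p) by Fermat's little theorem, and each step at least
    # doubles the p-adic precision, so k steps are more than enough.
    for _ in range(k):
        value = pow(t, p - 1, modulus) - 1
        slope = (p - 1) * pow(t, p - 2, modulus)   # a unit modulo p
        t = (t - value * pow(slope, -1, modulus)) % modulus
    return t
```

For instance, `[teichmuller(x, 7, 5) for x in range(1, 7)]` returns the six residues modulo $7^5$ that approximate the $6$th roots of unity of $\mathbb{Q}_7$. The same residues can be obtained as $x^{p^{k-1}} \bmod p^k$, which reflects the structure of $(\mathbb{Z}/p^k\mathbb{Z})^\times$ mentioned in the solution.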


Exercise 8.6.37† (China TST 2010). Let $k \ge 1$ be a rational integer. Prove that, for sufficiently large $n$, $\binom{n}{k}$ has at least $k$ distinct prime factors.

Solution
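The key lemma is that, for any prime $p$ and any positive integer $n$, $p^{v_p\left(\binom{n}{k}\right)} \le n$. Suppose that we have proven this. Then, if $\binom{n}{k}$ has at most $k - 1$ prime factors, say $p_1, \ldots, p_m$, we have
$$\binom{n}{k} = \prod_{i=1}^{m} p_i^{v_{p_i}\left(\binom{n}{k}\right)} \le n^m \le n^{k-1}$$
which is impossible for large $n$ since $\binom{n}{k}$ is a polynomial of degree $k$ in $n$.

It remains to prove this key claim. We use Legendre's formula and the fact that $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ to write
$$v_p\left(\binom{n}{k}\right) = \sum_{i=1}^{\lfloor \log_p(n) \rfloor} \left( \left\lfloor \frac{n}{p^i} \right\rfloor - \left\lfloor \frac{n-k}{p^i} \right\rfloor - \left\lfloor \frac{k}{p^i} \right\rfloor \right).$$
The wanted result now follows from the trivial inequality $\lfloor x + y \rfloor \le \lfloor x \rfloor + \lfloor y \rfloor + 1$: each of the terms $\left\lfloor \frac{n}{p^i} \right\rfloor - \left\lfloor \frac{n-k}{p^i} \right\rfloor - \left\lfloor \frac{k}{p^i} \right\rfloor$ is at most $1$, so the whole sum is less than or equal to $\lfloor \log_p(n) \rfloor$. This gives us $v_p\left(\binom{n}{k}\right) \le \log_p(n)$, i.e. $p^{v_p\left(\binom{n}{k}\right)} \le n$ as claimed.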
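
The key lemma lends itself to a quick exhaustive check on small cases. The following Python sketch is only such a check (it proves nothing for large $n$); the helper names and the bound passed to `key_lemma_holds` are our own choices.

```python
from math import comb, isqrt

def valuation(m, p):
    """Largest v such that p^v divides the positive integer m."""
    v = 0
    while m % p == 0:
        m //= p
        v += 1
    return v

def primes_up_to(n):
    """Primes q <= n, by trial division (fine for small n)."""
    return [q for q in range(2, n + 1)
            if all(q % d for d in range(2, isqrt(q) + 1))]

def key_lemma_holds(max_n):
    """Check p^{v_p(C(n, k))} <= n for all primes p and all 0 <= k <= n <= max_n."""
    for n in range(1, max_n + 1):
        primes = primes_up_to(n)   # primes above n cannot divide C(n, k)
        for k in range(n + 1):
            c = comb(n, k)
            if any(p ** valuation(c, p) > n for p in primes):
                return False
    return True

def distinct_prime_factors(m):
    """Number of distinct primes dividing the positive integer m."""
    count, d = 0, 2
    while d * d <= m:
        if m % d == 0:
            count += 1
            while m % d == 0:
                m //= d
        d += 1
    return count + (1 if m > 1 else 0)
```

One can then tabulate `distinct_prime_factors(comb(n, k))` for a fixed `k` and growing `n` to watch the statement of the exercise take effect.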


Exercise 8.6.38†. Find all additive functions $f : \mathbb{Z}^{\mathbb{N}} \to \mathbb{Z}$, where addition is defined componentwise. (To those who have read Section C.2, the fact that there is a nice characterisation of those functions should come off as a surprise.)

Solution

We claim that the $\mathbb{Z}$-linear functions $\mathbb{Z}^{\mathbb{N}} \to \mathbb{Z}$ are given by linear combinations of the coordinates, which is surprising since the vectors $e_i$ with $1$ in the $i$th coordinate and $0$ everywhere else do not form a basis of $\mathbb{Z}^{\mathbb{N}}$: any linear combination of them has finitely many non-zero coordinates (so $(1, 1, \ldots)$ isn't one for instance)! This problem thus has two parts: proving that any such function is $0$ on all but finitely many $e_i$, and proving that an additive function which is $0$ on the $e_i$ is identically $0$. We will do the second part first.
Suppose that $f : \mathbb{Z}^{\mathbb{N}} \to \mathbb{Z}$ is additive and $f(e_i) = 0$ for all $i$, i.e. $f$ is $0$ on the space of vectors with finitely many non-zero coordinates. The special property of $\mathbb{Z}$ is that we can use the theory of divisibility. More precisely, if the coordinates of $x \in \mathbb{Z}^{\mathbb{N}}$ eventually get all divisible by $m$, then $m \mid f(x)$. Indeed, if $x = (x_0, x_1, \ldots)$ is such that $m \mid x_n$ for any $n \ge N$, we have
$$f(x) = f(0, \ldots, 0, x_N, x_{N+1}, \ldots) = m f(0, \ldots, 0, x_N/m, x_{N+1}/m, \ldots).$$
Thus, if the $x_i$ get eventually all divisible by increasingly large integers, $f(x)$ must be zero! For instance, $f(a_0, a_1 p, a_2 p^2, \ldots)$ is divisible by $p^n$ for any $n$ so must be zero. You should now be able to see the $p$-adic flavor of this problem (even if we won't really use any of the theory developed in this chapter)! In particular, $f(x)$ is congruent modulo $p$ to
$$f(x_0, x_1(p + 1), x_2(p + 1)^2, \ldots) = 0.$$
Since this is true for arbitrary $p$, $f(x)$ must be $0$ too. Alternatively, using Bézout's lemma, there are $y_n$ and $z_n$ such that $2^n y_n + 3^n z_n = x_n$. Thus,
$$f(x) = f(y_0, 2y_1, 4y_2, \ldots) + f(z_0, 3z_1, 9z_2, \ldots) = 0 + 0 = 0.$$

Now we prove that any additive function $f : \mathbb{Z}^{\mathbb{N}} \to \mathbb{Z}$ is $0$ on all but finitely many $e_i$, say $i \notin I$. This implies that the function $x \mapsto f(x) - \sum_{i \in I} f(e_i) x_i$, where $x_i$ denotes the $i$th coordinate of $x$, is zero on every $e_i$ so must be identically zero by the previous step. This shows that any additive function is a linear combination of the coordinates.

The idea will again be $p$-adic. We will produce a sequence $x = (x_0, x_1, \ldots)$ such that $v_2(x_n)$ is increasing and grows so fast that $f(e_i)$ must be $0$ for large $i$, since we have the congruence
$$f(x) \equiv \sum_{i=0}^{n-1} x_i f(e_i) \pmod{2^{v_2(x_n)}}.$$
We can rephrase this as saying that the series $\sum_{i=0}^{\infty} x_i f(e_i)$ converges dyadically to the rational integer $f(x)$. The point is that we have too many degrees of freedom for this to be always a rational integer, unless $f(e_i) = 0$ for sufficiently large $i$. This follows from the fact that, if we write the dyadic expansion of a dyadic integer as $\sum_{i=0}^{\infty} a_i 2^i$ with $a_i \in \{0, 1\}$, then the dyadic integers with a finite dyadic expansion are exactly the rational integers. Indeed, this decomposition is unique for the same reason that the base $2$ decomposition is: if $\sum_{i=0}^{\infty} a_i 2^i = \sum_{i=0}^{\infty} b_i 2^i$, pick the smallest $n$ such that $a_n \ne b_n$ to get $a_n 2^n \equiv b_n 2^n \pmod{2^{n+1}}$, i.e. $a_n = b_n$, which is a contradiction. Thus, the dyadic expansion of a rational integer must be its base $2$ expansion, which is indeed finite.

Hence, we pick $x_i = 2^{n_i}$ with $(n_i)_{i \ge 0}$ an increasing sequence which grows sufficiently fast. More precisely, if we write $2^{n_i} f(e_i)$ in base $2$ as $\sum_{k=n_i}^{m_i} a_k 2^k$, we want $n_{i+1}$ to be larger than $m_i$. That way, the base $2$ expansion of $2^{n_{i+1}} f(e_{i+1})$ only adds new terms to the dyadic expansion of $f(x) = \sum_{i=0}^{\infty} 2^{n_i} f(e_i)$, unless $f(e_{i+1}) = 0$. Since the dyadic expansion of $f(x) \in \mathbb{Z}$ is finite, for sufficiently large $i$, $2^{n_i} f(e_i)$ cannot add new terms to it, which means $f(e_i) = 0$ as wanted. This concludes the solution.


Exercise 8.6.39†. Let $f \in \mathbb{Z}_p[X]$ be a polynomial whose leading and constant coefficients are in $\mathbb{Z}_p^\times$. Prove that, if $K$ is any finite extension of $\mathbb{Q}_p$ where $f$ splits, its roots are all in $\mathcal{O}_K^\times$ (see Exercise 7.5.1† for the definition of $\mathcal{O}_K$).

Solution

Let $f = a_n X^n + \ldots + a_0$, and let $\alpha$ be an element of $K$. If $|\alpha|_p < 1$, then $|a_0|_p = 1 > |a_k \alpha^k|_p$ for every $k \ge 1$ so $|f(\alpha)|_p = 1$ by the strong triangle inequality, which shows that $f(\alpha) \ne 0$. Similarly, if $|\alpha|_p > 1$, we have $|f(\alpha)|_p = |\alpha|_p^n \ne 0$. Thus, if $\alpha$ is a root of $f$, we must have $|\alpha|_p = 1$, i.e. $\alpha \in \mathcal{O}_K^\times$.


Exercise 8.6.40†. Let $K = \mathbb{Q}(\varepsilon_1, \ldots, \varepsilon_\ell)$ be a finitely generated field of characteristic $0$, and let $\alpha_1, \ldots, \alpha_r \in K^\times$ be non-zero elements. Prove that there is a prime $p$ and an embedding $\tau : K \to \mathbb{Q}_p$, i.e. an (injective) field morphism, such that $\tau(\alpha_i) \in \mathbb{Z}_p^\times$ for every $i$. Deduce that the Skolem-Mahler-Lech theorem holds over any field of characteristic $0$.

Solution

Without loss of generality, using the primitive element theorem, suppose that K =
Q(θ1 , . . . , θk , θ) where θ1 , . . . , θk are algebraically independent, and θ is algebraic over
Q(θ1 , . . . , θk ), with minimal polynomial π(θ1 , . . . , θk ) ∈ Q(θ1 , . . . , θk )[X]. Indeed, we can choose
{θ1 , . . . , θk } to be a maximal algebraically independent subset of {ε1 , . . . , ε` }, so that K is alge-
braic over Q(θ1 , . . . , θk ), and thus finite because it is generated by finitely many elements. The
primitive element theorem then tells us that we only need to add one generator to get K.

It is trivial to find an embedding of $\mathbb{Q}(\theta_1, \ldots, \theta_k)$ in $\mathbb{Q}_p$: the latter is uncountable so contains infinitely many algebraically independent elements, which means that we can simply send $\theta_1, \ldots, \theta_k$ to $k$ algebraically independent $p$-adic numbers. By the same argument, for any fixed integers $t_1, \ldots, t_k \in \mathbb{Z}$, we can pick the images of the $\theta_i$ to satisfy $|\tau(\theta_i) - t_i|_p < 1$.

Choose these integers $T = (t_1, \ldots, t_k)$ so that the denominators of the coefficients of $\pi$ do not vanish at $T$, i.e. so that $\pi(T)$ is well-defined. Then, $\pi(T) \equiv \tau(\pi(\theta_1, \ldots, \theta_k)) \pmod p$. If we pick $p$ sufficiently large so that the LHS has a root in $\mathbb{F}_p$, the RHS will then also have a root $t \in \mathbb{Z}_p$ by Hensel's lemma (we can assume without loss of generality that $\pi(T)$ is squarefree, and then it will have a double root in $\mathbb{F}_p$ only for finitely many primes), which we can call the image $t = \tau(\theta)$ of $\theta$. This gives us an embedding $K \hookrightarrow \mathbb{Q}_p$. However, we haven't ensured that the $\alpha_i$ are sent to $\mathbb{Z}_p^\times$.

For this, consider the minimal polynomials $\pi_1(\theta_1, \ldots, \theta_k), \ldots, \pi_r(\theta_1, \ldots, \theta_k)$ of $\alpha_1, \ldots, \alpha_r$. We pick $T$ in such a way that the denominators of the $\pi_i$ do not vanish at $T$ (so that $\pi_i(T)$ is well-defined), and such that the constant coefficients of $\pi_1(T), \ldots, \pi_r(T)$ are non-zero; this is possible as $\alpha_1, \ldots, \alpha_r \ne 0$. Then, pick a large prime as before, but with the condition now that $p$ does not divide the constant coefficient of $\pi_i(T)$ for any $i$, and that $\pi_1(T), \ldots, \pi_r(T) \in \mathbb{Z}_p[X]$. This means that, for $1 \le i \le r$,

$$\tau(\pi_i(\theta_1, \ldots, \theta_k)) \equiv \pi_i(T) \pmod p$$

satisfies the assumption of Exercise 8.6.39†, which implies that $\tau(\alpha_i) \in \mathbb{Z}_p^\times$ as wanted.

Finally, let $u_n = \sum_{i=1}^{r} f_i(n)\alpha_i^n$ be a linear recurrence of elements of a characteristic $0$ field $K$. By replacing $K$ with the field generated by the $\alpha_i$ and the coefficients of the $f_i$, we may assume that $K$ is finitely generated. Let $\tau : K \hookrightarrow \mathbb{Q}_p$ be an embedding which sends $\alpha_1, \ldots, \alpha_r$ to units. Then, as before, for every $k \in [p - 1]$, the function $n \mapsto \tau(u_{k + (p-1)n})$ is the restriction of some power series on $\mathbb{Z}_p$, and thus vanishes finitely many times or is identically zero. Hence, $Z((\tau(u_n))_{n \in \mathbb{Z}})$ has the wanted form, and we are done because $Z((u_n)_{n \in \mathbb{Z}}) = Z((\tau(u_n))_{n \in \mathbb{Z}})$.


Exercise 8.6.41†. Let $p$ be a prime number. Prove that the set of zeros of the linear recurrence $(u_n)_{n \in \mathbb{Z}} \in \mathbb{F}_p(T)$ defined by $u_n = (1 + T)^n - T^n - 1$ for all $n \in \mathbb{Z}$ is $\{p^k \mid k \ge 0\}$. Deduce that the Skolem-Mahler-Lech theorem doesn't hold in positive characteristic.

Solution

Frobenius shows that $u_{p^k} = 0$ for every $k \ge 0$. If $p \nmid n$, $(1 + T)^n - T^n - 1 = 0$ is possible only if $n = 1$, as otherwise the coefficient of $T$ is $n = \binom{n}{1} \ne 0$. Thus, if $u_n = 0$ where $n = p^k m$ with $p \nmid m$, we have
$$(1 + T^{p^k})^m = (1 + T)^n = 1 + T^n = 1 + (T^{p^k})^m$$
which implies $m = 1$. The set $\{p^k \mid k \ge 0\}$ is not the union of a finite number of arithmetic progressions with a finite set, which shows that the Skolem-Mahler-Lech theorem fails for this sequence.
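
The zero set of this recurrence can be observed directly for small exponents. In the Python sketch below (an illustration restricted to exponents $n \ge 1$, with function names of our choosing), the polynomial $(1 + T)^n - T^n - 1$ is zero in $\mathbb{F}_p[T]$ exactly when all the middle binomial coefficients $\binom{n}{i}$, $0 < i < n$, vanish modulo $p$, since the constant and leading terms cancel.

```python
from math import comb

def is_zero_of_u(n, p):
    """Whether (1 + T)^n - T^n - 1 vanishes in F_p[T], for an integer n >= 1."""
    # Only the coefficients C(n, i) with 0 < i < n survive the subtraction.
    return all(comb(n, i) % p == 0 for i in range(1, n))

def zeros_up_to(bound, p):
    """Exponents 1 <= n <= bound for which u_n = 0 in characteristic p."""
    return [n for n in range(1, bound + 1) if is_zero_of_u(n, p)]
```

For example, `zeros_up_to(100, 3)` lists the powers of $3$ up to $100$: the gaps between consecutive zeros keep growing, which is what rules out a finite union of arithmetic progressions.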


Exercise 8.6.42† (Krasner's Lemma). Let $(K, |\cdot|)$ be a non-Archimedean valued field and let $\alpha \in \overline{K}$ be an element with conjugates $\alpha_1 = \alpha, \ldots, \alpha_n$. Suppose that a separable element $\beta \in \overline{K}$ is such that

$$|\alpha - \beta| < |\alpha - \alpha_i|$$

for $i = 2, \ldots, n$, where $|\cdot|$ is the absolute value defined in Exercise 8.6.21†. Prove that $K(\alpha) \subseteq K(\beta)$.

Solution

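If $\alpha$ is not separable then some $\alpha_i$ with $i \ge 2$ equals $\alpha$, so $|\alpha - \beta| < 0$, which is impossible. Let $L$ be a Galois extension of $K$ containing $\alpha$ and $\beta$. By Galois theory, the statement reduces to showing that any $\sigma \in \mathrm{Gal}(L/K)$ fixing $\beta$ also fixes $\alpha$. Let $\sigma$ be such an element of $\mathrm{Gal}(L/K)$. By definition of $|\cdot|$ (over $L$), we have

$$|\alpha - \beta| = |\sigma(\alpha) - \sigma(\beta)| = |\sigma(\alpha) - \beta|.$$

Finally, since $|\cdot|$ is non-Archimedean, we have

$$|\alpha - \sigma(\alpha)| \le \max(|\alpha - \beta|, |\beta - \sigma(\alpha)|) = |\alpha - \beta|$$

so that $|\alpha - \sigma(\alpha)|$ is strictly less than $|\alpha - \alpha'|$ for any conjugate $\alpha' \ne \alpha$ of $\alpha$. This implies $\sigma(\alpha) = \alpha$ as wanted.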


8.7 Exercises
Appendix A

Polynomials

A.1 Fields and Polynomials


Exercise A.1.1∗ . Let K be a field. Prove that 0K a = 0K for any a ∈ K.

Solution

0K a = (0K + 0K )a = 0K a + 0K a so 0K a = 0K .


Exercise A.1.2∗ . Let † be a binary associative operation on a set M . Suppose that M has an
identity. Prove that it is unique. Similarly, prove that, if an element g ∈ M has an inverse, then it is
unique.1

Solution

If e and e0 are identities, then e = ee0 = e0 so e = e0 . Similarly, if b and b0 are two inverses of a,
then
b = (b0 a)b = b0 (ab) = b0
by associativity.


Exercise A.1.3∗ . Prove that multiplication of polynomials is associative and commutative.

Solution

Let $f = \sum_i a_i X^i$, $g = \sum_j b_j X^j$ and $h = \sum_k c_k X^k$ be three polynomials. We have
$$fg = \sum_{i+j=\ell} a_i b_j X^\ell = \sum_{i+j=\ell} b_j a_i X^\ell = gf$$
since multiplication is commutative in a field and
$$(fg)h = \sum_{i+j+k=\ell} (a_i b_j) c_k X^\ell = \sum_{i+j+k=\ell} a_i (b_j c_k) X^\ell = f(gh)$$

since multiplication is associative in a field. (This also works for formal power series.)

1 Such a structure is called a monoid.


Exercise A.1.4∗ . Prove that the gcd of 0 and 0 is 0.

Solution

Any polynomial divides 0 and 0 if and only if it divides 0.




Exercise A.1.5∗ . Prove that the Euclidean algorithm produces the gcd. Deduce that the gcd of
two polynomials in K[X] is also in K[X]. (As a consequence, the fundamental theorem of algebra
Theorem A.1.1 implies that two polynomials with rational coefficients are coprime in Q[X] if and only
if they don’t have a common complex root.)

Solution

We need to prove that gcd(f, g) = gcd(f − gq, g) for any f, g. This implies that the steps in the
Euclidean algorithm preserve the gcd. Since deg f + deg g decreases at each step, we eventually
reach a situation where f = 0, and the gcd of 0 and g is clearly g. It is however very trivial that
the gcd is conserved since
h | f, g =⇒ h | f − gq, g
and
h | f − gq, g =⇒ h | g, (f − gq) + gq = f.


Exercise A.1.6∗ (Bézout’s Lemma). Consider two polynomials f, g ∈ K[X]. Prove that there exist
polynomials u, v ∈ K[X] such that uf + vg = gcd(f, g).

Solution

Without loss of generality, suppose that $\deg g \le \deg f$. We proceed by induction on $\deg g$. When this is $0$, i.e. $g$ is a non-zero constant, we have $0 \cdot f + \frac{1}{g} \cdot g = 1$ as wanted. For the induction step, perform the Euclidean division of $f$ by $g$: $f = gq + r$. Since $\deg r < \deg g$, by the inductive hypothesis applied to $g$ and $r$, there are $u$ and $v$ such that $ug + vr = 1$. Then,
$$\begin{aligned}
1 &= ug + vr \\
&= ug + v(f - gq) \\
&= vf + (u - qv)g
\end{aligned}$$
as wanted.
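
The induction above is the extended Euclidean algorithm in disguise. As an illustration, here is a minimal Python sketch over $\mathbb{Q}[X]$, using exact rational arithmetic from the standard `fractions` module. The representation (coefficient lists, lowest degree first) and all function names are our assumptions. Unlike the proof above, the sketch does not assume that $f$ and $g$ are coprime: it returns the monic gcd together with the Bézout coefficients.

```python
from fractions import Fraction

def trim(f):
    """Drop trailing zero coefficients (lists store the lowest degree first)."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f

def add(f, g):
    n = max(len(f), len(g))
    return trim([(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
                 for i in range(n)])

def mul(f, g):
    out = [Fraction(0)] * max(len(f) + len(g) - 1, 0)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)

def divide(f, g):
    """Euclidean division: (q, r) with f = g*q + r and deg r < deg g (g non-zero)."""
    r = trim(Fraction(c) for c in f)
    g = trim(Fraction(c) for c in g)
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 0)
    while len(r) >= len(g):
        shift = len(r) - len(g)
        c = r[-1] / g[-1]
        q[shift] = c
        # Subtract c * X^shift * g, which kills the leading term of r.
        r = trim(a - c * g[i - shift] if i >= shift else a
                 for i, a in enumerate(r))
    return trim(q), r

def bezout(f, g):
    """Return (d, u, v) with u*f + v*g = d, where d is the monic gcd of f and g."""
    r0, r1 = trim(Fraction(c) for c in f), trim(Fraction(c) for c in g)
    u0, v0, u1, v1 = [Fraction(1)], [], [], [Fraction(1)]
    while r1:
        q, r = divide(r0, r1)
        minus_q = [-c for c in q]
        r0, r1 = r1, r
        u0, u1 = u1, add(u0, mul(minus_q, u1))
        v0, v1 = v1, add(v0, mul(minus_q, v1))
    if r0:  # normalise the gcd to be monic
        lead = r0[-1]
        r0, u0, v0 = ([c / lead for c in p] for p in (r0, u0, v0))
    return r0, u0, v0
```

For instance, `bezout([-1, 0, 1], [2, -3, 1])` treats $f = X^2 - 1$ and $g = X^2 - 3X + 2$ and returns the gcd $X - 1$ with constant coefficients $u = 1/3$ and $v = -1/3$.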


Remark A.1.1
One might, at first sight, think that this proof also works for non-coprime $f, g$ (which is impossible for obvious reasons). However, we used the assumption that they were coprime when we said the base case was $\deg g = 0$: this is only true because the gcd is $1$ so the Euclidean algorithm eventually yields a pair $\{1, f\}$ with $f \ne 0$, right before the pair $\{1, 0\}$. Otherwise, we would have to do the base case when $\deg g = -\infty$ which is clearly impossible.

Exercise A.1.7∗. Let $f \in K[X_1, \ldots, X_n]$ be a polynomial in $n$ variables and suppose $S_1, \ldots, S_n \subseteq K$ are subsets of $K$ such that $|S_i| > \deg_{X_i} f$. If $f$ vanishes on $S_1 \times \ldots \times S_n$, prove that $f = 0$. (This is the generalisation of Corollary A.1.1 to multivariate polynomials.)

Solution

We proceed by induction on n, the base case being the previous proposition. Fix xn ∈ Sn . Then,
the polynomial
g(xn ) = f (X1 , . . . , Xn−1 , xn ) ∈ K[X1 , . . . , Xn−1 ]
vanishes on S1 × . . . × Sn−1 and has degree less than |Si | in Xi . Hence, g(xn ) = 0. Finally, g is a
polynomial in Xn (with coefficients in the ring K[X1 , . . . , Xn−1 ]) of degree less than |Sn | vanishing
on $S_n$, which implies that it’s 0 by Corollary A.1.1. (Technically, to use Corollary A.1.1 we would
need to work over a field, while we are only working over a ring: $K[X_1, \ldots, X_{n-1}]$. However, this
is trivial to fix: this is an integral domain so we can embed it in its field of fractions, i.e. work over
the field $K(X_1, \ldots, X_{n-1})$.)


Exercise A.1.8∗. Prove that $(fg)' = f'g + g'f$ and $(f + g)' = f' + g'$ for any f, g ∈ K[X]. Show
also that $(f^n)' = n f' f^{n-1}$ for any positive integer n, where $f^k$ denotes the kth power and not the kth
iterate. More generally, show that
\[ \Big(\prod_{i=1}^n f_i\Big)' = \sum_{i=1}^n f_i' \prod_{j \ne i} f_j. \]

Solution

Write $f = \sum_i a_i X^i$ and $g = \sum_j b_j X^j$. We have
\[ (f+g)' = \sum_k k(a_k + b_k)X^{k-1} = \sum_i i a_i X^{i-1} + \sum_j j b_j X^{j-1} = f' + g' \]
which shows additivity. For the multiplication, we have
\[ (fg)' = \Big(\sum_{i+j=k} a_i b_j X^k\Big)' = \sum_{i+j=k} k a_i b_j X^{k-1} \]
and
\[ f'g + g'f = \sum_{i+j=k} i a_i b_j X^{k-1} + \sum_{i+j=k} j a_i b_j X^{k-1} = \sum_{i+j=k} (i a_i b_j + j a_i b_j) X^{k-1} = \sum_{i+j=k} k a_i b_j X^{k-1} \]
as wanted. Finally, the last point follows from $(fg)' = f'g + g'f$ by induction:
\[ \Big(\prod_{i=1}^n f_i\Big)' = f_n' \prod_{i=1}^{n-1} f_i + f_n \sum_{i=1}^{n-1} f_i' \prod_{n \ne j \ne i} f_j = \sum_{i=1}^n f_i' \prod_{j \ne i} f_j. \]
The previous point is the case $f_1 = \ldots = f_n = f$.




Exercise A.1.9∗ . Prove that every function f : Fp → Fp is polynomial.

Solution

This follows from Lagrange’s interpolation theorem since Fp is finite.
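
To make the interpolation concrete, here is a small sketch that builds the interpolating polynomial of an arbitrary function $F_p \to F_p$ from the Lagrange basis. The function names are hypothetical, elements of $F_p$ are represented by the integers 0, ..., p − 1, and `pow(x, -1, p)` (Python 3.8+) is used for modular inverses.

```python
def interpolate_mod_p(values, p):
    """Coefficients (mod p) of the polynomial of degree < p taking values[a] at each a."""
    coeffs = [0] * p
    for a in range(p):
        # Build prod_{b != a} (X - b) and the matching denominator prod_{b != a} (a - b).
        basis = [1]
        denom = 1
        for b in range(p):
            if b == a:
                continue
            # Multiply basis by (X - b): new[i] = basis[i-1] - b * basis[i].
            basis = [(hi - b * lo) % p for lo, hi in zip(basis + [0], [0] + basis)]
            denom = denom * (a - b) % p
        scale = values[a] * pow(denom, -1, p) % p
        for i, c in enumerate(basis):
            coeffs[i] = (coeffs[i] + scale * c) % p
    return coeffs

def evaluate(coeffs, x, p):
    """Horner evaluation of the polynomial at x, modulo p."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % p
    return result
```

Calling `interpolate_mod_p` on any list of p values and evaluating the result at each a ∈ {0, ..., p − 1} recovers the original list.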




Exercise A.1.10∗. Prove that the derivative of a rational function does not depend on its form: i.e.
$(f/g)' = ((hf)/(hg))'$ for any f, g, h ∈ K[X] with g, h ≠ 0.

Solution

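We have
\[ (f/g)' = \frac{f'g - g'f}{g^2} \]
and
\[ \Big(\frac{hf}{hg}\Big)' = \frac{(hf)'(hg) - (hg)'(hf)}{(hg)^2} = \frac{h^2(f'g - g'f)}{h^2 g^2} = \frac{f'g - g'f}{g^2}, \]
since $(hf)'(hg) - (hg)'(hf) = (h'f + hf')hg - (h'g + hg')hf = h^2(f'g - g'f)$.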


A.2 Algebraic Structures and Morphisms


Exercise A.2.1∗ . Prove that 1R and 0R are unique, and that any element has a unique additive
inverse and a unique multiplicative inverse if it is non-zero.

Solution

This follows from Exercise A.2.12∗ .




Exercise A.2.2∗ . Let R be a ring. Prove that 0R a = a0R = 0R for any a ∈ R.

Solution

The proof is the same as for Exercise A.1.1∗ .




Exercise A.2.3∗ . Prove that char R is the smallest m ≥ 0 such that R contains a copy of Z/mZ.

Solution

If R contains a copy of Z/mZ with m ≥ 1 then R has characteristic dividing m which shows the
result when m ≠ 1. If m = 0, then R has characteristic 0 since n ≠ 0 in R for all non-zero n ∈ Z. The converse
is clear: the copy of Z/mZ is given by
\[ a \pmod{m} \mapsto \underbrace{1 + \ldots + 1}_{a \text{ times}} \]
for a ∈ N. (This is well-defined because the characteristics are the same.)

Exercise A.2.4. Prove that an ideal a of R is equal to R if and only if it contains 1.

Solution

If a = R then 1 ∈ R = a. Conversely, if 1 ∈ a, then R ⊆ Ra ⊆ a so a = R.




Exercise A.2.5. Prove that the ideals of Z have the form nZ for some n ∈ Z.2

Solution

This is a special case of Exercise 2.6.23. Let a be an ideal of Z. If a = 0 then a = 0Z, so suppose
that a ≠ 0 and let n ∈ N∗ be the minimal positive element of a. Clearly, nZ ⊆ a. Suppose for the sake
of contradiction that n doesn’t divide some a ∈ a. Perform the Euclidean division of a by n: a = qn + r
with 0 < r < n as n ∤ a. However, r ∈ a since a is an ideal, contradicting the minimality of n. (This is
one of the characteristic features of ideals: the axioms of ideals are precisely the axioms which allow us to
say that the remainder still lies in the ideal! That said, the usefulness of ideals far exceeds this
property.)


Exercise A.2.6∗ . Prove that the characteristic of a field is either 0 or a prime number p.

Solution

Let c denote the characteristic of a given field K. If c ≠ 0, then c ≥ 2 since the trivial ring is not
a field. Suppose that c = ab. Then, in K, we have ab = 0 which means a = 0 or b = 0 since it’s
an integral domain. By minimality of the characteristic, this means that c = a or c = b.


Exercise A.2.7. Let R be a finite integral domain (i.e. with finite cardinality). Prove that it is a
field.

Solution

Let a ∈ R be non-zero. Consider the powers of a: $a, a^2, \ldots$. Since R is finite, there exist i < j
such that $a^i = a^j$, i.e. $a^i(a^{j-i} - 1) = 0$. Since a ≠ 0 and R is an integral domain, we get
$a^{j-i} - 1 = 0$, so that $a^{j-i-1}$ is the inverse of a.


Exercise A.2.8∗ . Prove that a subring of a field is an integral domain.

2 We say Z is a principal ideal domain (PID). See Exercise 2.6.22.



Solution

If ab = 0 and a ≠ 0 then $b = a^{-1}ab = 0$.




Exercise A.2.9. What goes wrong if you try to construct the field of fractions of a commutative ring
which isn’t a domain?

Solution

Clearly, if uv = 0, there is something wrong with 1/u. Indeed, we would have 1/u = v/(uv) = v/0
which doesn’t make sense (even formally: 1 · 0 is not equal to …). The problem is that the relation a/b = c/d if
ad = bc is not an equivalence relation anymore: we can have a/b = c/d and c/d = x/y but
a/b ≠ x/y. Indeed, this is how the usual proof of transitivity goes: we have ad = bc and cy = dx
so
ady = bcy = bdx
which doesn’t necessarily mean ay = bx since we might not be able to cancel d. Here is a concrete
counterexample: if dd′ = 0 with d ≠ 0, then 1/d = d′/0 and d′/0 = 1/0 but 1/d ≠ 1/0.


Exercise A.2.10∗ . Let R be an integral domain. Prove that R[X] is also one.

Solution

Suppose that f and g are non-zero elements of R[X] with respective leading coefficients a and b.
Then, the leading coefficient of fg is ab since ab ≠ 0 as R is an integral domain, which implies
in particular that fg is non-zero.


Exercise A.2.11. Prove that a field morphism is always injective.

Solution

By Exercise A.2.16∗, it suffices to check that ϕ(c) = 0 implies c = 0, since ϕ(a) = ϕ(b) ⟺
ϕ(a − b) = 0. This follows from the fact that, if c ≠ 0, $\varphi(c)\varphi(c^{-1}) = 1$ so that ϕ(c) ≠ 0.


Exercise A.2.12∗. Prove that the identity e of a group G is unique, and that any a ∈ G has a unique
inverse. Moreover, prove that $(xy)^{-1} = y^{-1}x^{-1}$.

Solution

If e and e′ are two identities then e = ee′ = e′. If b and c are two inverses of a, then
b = b(ac) = (ba)c = c. Finally, the inverse of xy is $y^{-1}x^{-1}$ since $(xy)(y^{-1}x^{-1}) = xx^{-1} = e$.


Exercise A.2.13∗ . Check that (Sn , ◦) is a group.



Solution

Since permutations are bijective, they are invertible. Moreover, the identity permutation is the
identity of the group. Finally, it is clear that the operation is associative since composition is.


Exercise A.2.14∗. Prove that a morphism of groups from (G, †) to (H, ⋆) maps the identity of G to
the identity of H.

Solution

Let ϕ be such a morphism and $e_G$, $e_H$ be the identities of G and H respectively. We have

\[ \varphi(e_G) = \varphi(e_G \dagger e_G) = \varphi(e_G) \star \varphi(e_G) \]

so $\varphi(e_G) = e_H$ as wanted (by applying ⋆ with the inverse of $\varphi(e_G)$ to both sides).




Exercise A.2.15∗ . Prove that the kernel of a morphism (of rings or groups) is closed under addition.

Solution

If ϕ(a) = 0 and ϕ(b) = 0 then ϕ(a + b) = ϕ(a) + ϕ(b) = 0.




Exercise A.2.16∗ . Prove that a morphism of groups is injective iff its kernel is trivial, i.e. consists
of only the identity.

Solution

If it is injective, then the kernel is trivial. Otherwise, suppose that ϕ(a) = ϕ(b) and a ≠ b. Then
$\varphi(ab^{-1}) = e$ with $ab^{-1} \ne e$, so the kernel is non-trivial.


A.3 Exercises
Derivatives
Exercise A.3.1†. Let K be a field of characteristic 0 and let f, g ∈ K[X] be two polynomials. Prove
that the derivative of f ◦ g is $g' \cdot (f' \circ g)$.

Solution

Write $f = \sum_i a_i X^i$. Then, $(f \circ g)' = \sum_i a_i (g^i)' = \sum_i i a_i g' g^{i-1} = g' \cdot (f' \circ g)$.


Exercise A.3.2†. Let f ∈ K[X] be a non-constant polynomial. Prove that there are a finite number
of g, h ∈ K[X] such that g ◦ h = f, up to affine transformations, meaning $(g, h) \equiv \left(g(aX + b), \frac{h - b}{a}\right)$.

Solution

By composing with an affine transformation, we may assume that h(0) = 0 and that h is monic.
If we differentiate the equation g ◦ h = f, we get h′ | f′. There is a finite number of such h′
since we fixed its leading coefficient and f is non-constant. Since h(0) = 0, there is also a finite
number of h. Since g is uniquely determined from h, we are done.


Exercise A.3.4† (USA TST 2017). Let K be a characteristic 0 field and let f, g ∈ K[X] be non-
constant coprime polynomials. Prove that there are at most three elements λ ∈ K such that f + λg is
the square of a polynomial.

Solution

The key point is that, if f + λg is a square $h^2$, then h divides f + λg as well as f′ + λg′ = 2hh′
so must divide
\[ g'(f + \lambda g) - g(f' + \lambda g') = fg' - f'g \]
which is independent of λ. (Note that this is the Wronskian determinant (see Exercise C.5.10†)
$\begin{vmatrix} f & g \\ f' & g' \end{vmatrix}$
which was also used in Exercise A.3.27†. This explains why it doesn’t depend on λ.)

Also, if $f + \lambda g = r^2$ and $f + \mu g = s^2$ for µ ≠ λ, r and s are coprime since $r^2$ and $s^2$ are two
linearly independent linear combinations of f and g, and we know f and g are coprime. Thus,
if $f + \lambda_i g = h_i^2$ for i = 1, . . . , n, we get $h_1 \cdots h_n \mid fg' - f'g$ as they all divide it and are pairwise coprime.
However, when n is large (i.e. greater than 3), the degree of the LHS will be too big so this will
be impossible. Indeed, from $f + \lambda_i g = h_i^2$, we deduce that $\deg h_i$ is max(deg f, deg g)/2, except
possibly for one value of λ in the case that deg f = deg g. In the first case we are done since

4 max(deg f, deg g)/2 > deg f + deg g − 1,

so we must have f′g = g′f which is impossible as this would mean f | f′ since f and g are coprime.
For the second case, if deg(f + λg) is small, note that we can replace f by f + λg (and replace the
$\lambda_i$ by other elements $\mu_i \in K$) and this case is now impossible since deg(f + λg) < deg g = deg f.
Note that this doesn’t change the value of fg′ − f′g because we constructed it to be

\[ g'(f + \lambda g) - g(f' + \lambda g') = fg' - f'g. \]

Exercise A.3.6† (Discrete Derivative). Let K be a field of characteristic 0 and let f ∈ K[X] be a
polynomial of degree n and leading coefficient a. Define its discrete derivative as ∆f := f(X + 1) − f(X).
Prove that, for any g ∈ K[X], ∆f = ∆g if and only if f − g is constant, and that ∆f is a polynomial
of degree n − 1 with leading coefficient an. Deduce the minimal
degree of a monic polynomial f ∈ Z[X] identically zero modulo m, for a given integer m ≥ 1.

Solution

The discrete derivative operator is a morphism (from the space of polynomials of degree at most
n to the space of polynomials of degree at most n − 1), thus it suffices to show that its kernel
consists only of constants. This follows from the second part, that ∆f is a polynomial of degree
n − 1. For this, simply write $f = \sum_{i=0}^n a_i X^i$. Then,
\[ \Delta f = \sum_{i=0}^n a_i((X+1)^i - X^i) = \sum_{i=0}^n a_i \sum_{j=0}^{i-1} \binom{i}{j} X^j \]
and the term in $X^{n-1}$ is reached only once for i = n, j = n − 1, with coefficient $a_n \binom{n}{n-1} = a_n n$.
Finally, if a polynomial f is identically zero modulo m and monic of degree n, then $\Delta^n f = n!$ since the
degree decreases by one every time we apply ∆, while the leading coefficient gets multiplied by
the degree. Thus, m | n! (as $\Delta^n f$ is also identically zero modulo m). Conversely, if n is the minimal
integer such that m | n!, the polynomial
\[ f = n!\binom{X}{n} = X(X-1) \cdots (X-(n-1)) \]
works.
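
As a quick numerical sanity check of this last part, the following sketch computes the minimal n with m | n! and evaluates the falling factorial X(X − 1)···(X − (n − 1)) at integers. The function names are hypothetical and the check is a brute-force illustration, not part of the proof.

```python
from math import factorial

def minimal_degree(m):
    """Smallest n >= 0 such that m divides n!."""
    n = 0
    while factorial(n) % m != 0:
        n += 1
    return n

def falling_factorial(x, n):
    """Value of X(X-1)...(X-(n-1)) at the integer x."""
    result = 1
    for i in range(n):
        result *= x - i
    return result
```

For m = 12 this gives n = 4, and one can check that `falling_factorial(x, 4)` is divisible by 12 for every integer x in a range of one's choosing.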


Exercise A.3.7†. Let f : R → R be a function, where R is some ring. Define its discrete derivative
∆f as x ↦ f(x + 1) − f(x). Prove that, for any integer n ≥ 0,
\[ \Delta^n f(x) = \sum_{k=0}^n (-1)^{n-k} \binom{n}{k} f(x+k). \]

Solution

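We proceed by induction on n. For n = 0 it is of course trivial. If it’s true for n, then
\begin{align*}
\Delta^{n+1} f(x) &= \Delta(\Delta^n f)(x) \\
&= \sum_{k=0}^{n} (-1)^{n-k} \binom{n}{k} \big(f(x+k+1) - f(x+k)\big) \\
&= \sum_{k=0}^{n+1} \left((-1)^{n+1-k} \binom{n}{k-1} - (-1)^{n-k} \binom{n}{k}\right) f(x+k) \\
&= \sum_{k=0}^{n+1} (-1)^{n+1-k} \left(\binom{n}{k-1} + \binom{n}{k}\right) f(x+k) \\
&= \sum_{k=0}^{n+1} (-1)^{n+1-k} \binom{n+1}{k} f(x+k),
\end{align*}
where we use the convention $\binom{n}{-1} = \binom{n}{n+1} = 0$.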
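
The formula is easy to test numerically. The sketch below (with hypothetical helper names) compares the n-fold iterated difference of an integer-valued function with the closed form from the statement.

```python
from math import comb

def delta(f):
    """Discrete derivative: x -> f(x + 1) - f(x)."""
    return lambda x: f(x + 1) - f(x)

def iterated_delta(f, n):
    """Apply the discrete derivative n times."""
    for _ in range(n):
        f = delta(f)
    return f

def closed_form(f, n, x):
    """Right-hand side of the exercise: sum of (-1)^(n-k) * C(n, k) * f(x + k)."""
    return sum((-1) ** (n - k) * comb(n, k) * f(x + k) for k in range(n + 1))
```

For example, with f(x) = x^3 − 2x + 5, both `iterated_delta(f, n)(x)` and `closed_form(f, n, x)` agree for small n and x, and the third difference is the constant 3! = 6, in line with Exercise A.3.6†.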

Exercise A.3.8†. Let m ≥ 0 be an integer. Prove that there is a polynomial $f_m \in \mathbb{Q}[X]$ of degree
m + 1 such that
\[ \sum_{k=0}^n k^m = f_m(n) \]
for any n ∈ N.

Solution
We proceed by induction on m by noting that $\sum_{k=0}^n k^0 = n + 1 =: f_0(n)$ and that
\begin{align*}
(n+1)^{m+1} &= \sum_{k=0}^n \big((k+1)^{m+1} - k^{m+1}\big) \\
&= \sum_{i=0}^m \binom{m+1}{i} \sum_{k=0}^n k^i \\
&= (m+1) \left(\sum_{k=0}^n k^m\right) + \sum_{i=0}^{m-1} \binom{m+1}{i} f_i(n)
\end{align*}
so that
\[ f_m(n) = \sum_{k=0}^n k^m = \frac{(n+1)^{m+1}}{m+1} - \sum_{i=0}^{m-1} \binom{m+1}{i} \frac{f_i(n)}{m+1} \]
is a polynomial as well. Note also that its leading coefficient is $\frac{1}{m+1}$.
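
The recursion above is effective: it can be run to produce the polynomials $f_m$ explicitly. Here is a minimal sketch with exact rational arithmetic, assuming polynomials in n are stored as coefficient lists (index = exponent); the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def power_sum_polys(M):
    """Return [f_0, ..., f_M], each a list of coefficients in n (index = exponent)."""
    polys = []
    for m in range(M + 1):
        # Start from (n + 1)^(m + 1), expanded with the binomial theorem.
        acc = [Fraction(comb(m + 1, j)) for j in range(m + 2)]
        # Subtract binom(m + 1, i) * f_i(n) for every i < m.
        for i in range(m):
            for j, c in enumerate(polys[i]):
                acc[j] -= comb(m + 1, i) * c
        polys.append([c / (m + 1) for c in acc])
    return polys

def evaluate(poly, n):
    """Evaluate a coefficient list at n."""
    return sum(c * n ** j for j, c in enumerate(poly))
```

For m = 1 this yields the coefficient list of n(n + 1)/2, and in general the last coefficient of `power_sum_polys(M)[m]` is 1/(m + 1), as noted above.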


Roots of Unity
Exercise A.3.9† (Roots of Unity Filter). Let $f = \sum_i a_i X^i \in K[X]$ be a polynomial, and suppose
that $\omega_1, \ldots, \omega_n \in K$ are distinct nth roots of unity. Prove that
\[ \frac{f(\omega_1) + \ldots + f(\omega_n)}{n} = \sum_{n \mid k} a_k. \]
Deduce that, if K = C,
\[ \max_{|z|=1} |f(z)| \ge |f(0)|. \]
(You may assume the existence of a primitive nth root of unity ω, meaning that $\omega^k \ne 1$ for all 0 < k < n,
or, equivalently, every nth root of unity is a power of ω. This will be proven in Chapter 3.)

Solution

Let ω be a primitive nth root of unity. Note that, if n ∤ m,
\[ \sum_{k=1}^{n} \omega^{km} = \sum_{k=0}^{n-1} \omega^{km} = \frac{\omega^{mn} - 1}{\omega^m - 1} = 0 \]
since the numerator is zero and the denominator isn’t. When n | m, the sum is simply $\sum_{k=1}^n 1 = n$.
Thus, we have proven the result for monomials, and the general case follows by taking linear
combinations (if it’s true for f and g it’s also true for af and f + g).

For n > deg f we have $\frac{f(\omega_1) + \ldots + f(\omega_n)}{n} = f(0)$ so
\[ \max_k |f(\omega_k)| \ge \left|\frac{f(\omega_1) + \ldots + f(\omega_n)}{n}\right| = |f(0)| \]
by the triangle inequality.
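
The filter can be observed numerically over K = C. The following sketch (the function name is hypothetical, and floating-point arithmetic makes the equality only approximate) averages f over the nth roots of unity.

```python
import cmath

def filtered_sum(coeffs, n):
    """Average of f over the n-th roots of unity, where f = sum coeffs[i] * X^i."""
    roots = [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]
    total = sum(sum(a * w ** i for i, a in enumerate(coeffs)) for w in roots)
    return total / n
```

With coefficients 1, 2, ..., 7 and n = 3, the average is (up to rounding) $a_0 + a_3 + a_6 = 1 + 4 + 7 = 12$; with n larger than the degree, only $a_0 = f(0)$ survives.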




Exercise A.3.10†. Let $f = \sum_i a_i X^i \in \mathbb{C}[X]$ be a polynomial and $\omega_1, \ldots, \omega_n \in \mathbb{C}$ be distinct nth
roots of unity with n > deg f. Prove that
\[ \frac{|f(\omega_1)|^2 + \ldots + |f(\omega_n)|^2}{n} = \sum_i |a_i|^2. \]
Denote by S(f) the sum of the squares of the moduli of the coefficients of f. Deduce that $S(fg) =
S(f X^{\deg g} g(1/X))$ for all f, g ∈ C[X]. ($X^{\deg g} g(1/X)$ is the polynomial obtained by reversing the
coefficients of g.)

Solution

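Note that
\[ |f(\omega)|^2 = f(\omega)\overline{f(\omega)} = f(\omega)\overline{f}(\overline{\omega}) = f(\omega)\overline{f}(\omega^{-1}) \]
for any ω on the unit circle, since $\omega\overline{\omega} = |\omega|^2 = 1$ for these ω (here $\overline{f}$ denotes the polynomial
obtained by conjugating the coefficients of f). Thus,
\begin{align*}
\frac{1}{n} \sum_{k=1}^n |f(\omega_k)|^2 &= \frac{1}{n} \sum_{k=1}^n f(\omega_k)\overline{f(\omega_k)} \\
&= \frac{1}{n} \sum_{k=1}^n f(\omega_k)\overline{f}(\omega_k^{-1}) \\
&= \frac{1}{n} \sum_{k=1}^n \sum_i a_i \omega_k^i \sum_j \overline{a_j} \omega_k^{-j} \\
&= \frac{1}{n} \sum_{i,j} a_i \overline{a_j} \sum_{k=1}^n \omega_k^{i-j} \\
&= \sum_i |a_i|^2
\end{align*}
by Exercise A.3.9† since n | i − j iff i = j for i, j ∈ [[0, deg f]], as n > deg f. For the second part,
note that $|f(\omega)g(\omega)| = |f(\omega)\overline{g}(1/\omega)|$ for any ω on the unit circle (with $\overline{g} = g$ when g has real coefficients).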


Exercise A.3.11† (USEMO 2021). Denote by S(f) the sum of the squares of the moduli of the
coefficients of a polynomial f ∈ C[X]. Suppose that f, g, h ∈ C[X] are such that $fg = h^2$. Prove
that $S(f)S(g) \ge S(h)^2$.

Solution

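Pick n > deg f, deg g, deg h and let $\omega_1, \ldots, \omega_n$ be the nth roots of unity. By Exercise A.3.10†, we have
\begin{align*}
S(f) &= \frac{|f(\omega_1)|^2 + \ldots + |f(\omega_n)|^2}{n} \\
S(g) &= \frac{|g(\omega_1)|^2 + \ldots + |g(\omega_n)|^2}{n} \\
S(h) &= \frac{|h(\omega_1)|^2 + \ldots + |h(\omega_n)|^2}{n}.
\end{align*}
Thus, by Cauchy-Schwarz,
\begin{align*}
S(f)S(g) &= \frac{|f(\omega_1)|^2 + \ldots + |f(\omega_n)|^2}{n} \cdot \frac{|g(\omega_1)|^2 + \ldots + |g(\omega_n)|^2}{n} \\
&\ge \left(\frac{|f(\omega_1)||g(\omega_1)| + \ldots + |f(\omega_n)||g(\omega_n)|}{n}\right)^2 \\
&= \left(\frac{|h(\omega_1)|^2 + \ldots + |h(\omega_n)|^2}{n}\right)^2 \\
&= S(h)^2
\end{align*}
because $fg = h^2$.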


Exercise A.3.12†. Let k be an integer. Prove that $\sum_{a \in \mathbb{F}_p} a^k$ is 0 if p − 1 ∤ k and −1 otherwise.
Deduce that any polynomial $f \in \mathbb{F}_p[X]$ of degree at least 1 satisfying f(a) ∈ {0, 1} for all $a \in \mathbb{F}_p$
must have degree at least p − 1.

Solution

The first part is Exercise A.3.9† for $K = \mathbb{F}_p$, since non-zero elements of $\mathbb{F}_p$ are (p − 1)th roots of
unity by Fermat’s little theorem. For the second, let m be the number of times f(a) = 1. Then,
$\sum_{a \in \mathbb{F}_p} f(a) \equiv m \pmod{p}$. If deg f < p − 1, this sum is zero modulo p by the first part which is
impossible since m ∈ [1, p − 1] (if f is constant over $\mathbb{F}_p$ and has degree less than p, f − f(0) has
more roots than its degree so is zero).
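
The first part is simple to verify by brute force for small primes. In the sketch below (the function name is hypothetical), the exponent k is assumed to be a positive integer so that the term a = 0 contributes 0.

```python
def power_sum_mod(p, k):
    """Sum of a^k over a in F_p, reduced modulo p (for an exponent k >= 1)."""
    return sum(pow(a, k, p) for a in range(p)) % p
```

For p = 7, the result is 6 ≡ −1 exactly when 6 | k, and 0 otherwise.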


Exercise A.3.13†. Let p ≠ 3 be a prime number. Suppose that a and b are integers such that
$p \mid a^2 + ab + b^2$. Prove that $(a + b)^p \equiv a^p + b^p \pmod{p^3}$.

Solution

We present two solutions, one with polynomials, and one with p-adic ideas. The second is perhaps
more natural. Note that we can suppose that a, b ≢ 0 (mod p) and reduce the problem to the
case where b = 1 by considering $x \equiv ab^{-1} \pmod{p}$ so that $x^2 + x + 1 \equiv 0$. In particular, x has
order 3 modulo p since $x^3 - 1 \equiv (x - 1)(x^2 + x + 1)$ but x ≢ 1 since p ≠ 3. This implies that
p ≡ 1 (mod 3) by Exercise 3.3.4∗. (This is a special case of Theorem 3.3.1.)

The key point of the first solution is that, since p ≡ 1 (mod 3), we have $(X^2 + X + 1)^2 \mid
(X + 1)^p - X^p - 1 =: f$. Since $(X + 1)^p - X^p - 1 \equiv 0 \pmod{p}$ by the binomial expansion (see
Proposition 4.1.1 for more details), this means that $(X^2 + X + 1)^2$ divides the polynomial f/p in
Q[X], and hence in Z[X] too since it is monic. We conclude that $p(X^2 + X + 1)^2 \mid f$ in Z[X] so
that
\[ v_p(f(x)) \ge v_p\big(p(x^2 + x + 1)^2\big) \ge 3 \]
as wanted. It remains to prove the divisibility. First, note that $X^2 + X + 1$ is irreducible in Q[X] and that its roots are the primitive
third roots of unity, since $X^3 - 1 = (X^2 + X + 1)(X - 1)$. Hence, letting ω be one of them, we wish to show that f(ω) = 0
and f′(ω) = 0. We have
\[ f(\omega) = (\omega + 1)^p - \omega^p - 1 = (-\omega^2)^p - \omega^p - 1 = 0 \]
since $\omega^p$ is also a primitive third root of unity. Similarly,
\[ f'(\omega) = p(\omega + 1)^{p-1} - p\omega^{p-1} = p\omega^{2(p-1)} - p\omega^{p-1} = p - p = 0 \]
since 3 | p − 1 and p − 1 is even so we are done.

For the second solution, note that Hensel’s lemma implies the existence of a cube root of unity
j modulo $p^3$ (the derivative of $X^3 - 1$ is non-zero at x since p ≠ 3), which is congruent to x
modulo p. Set x = j + kp. Then, modulo $p^3$,
\begin{align*}
(x + 1)^p &\equiv (kp - j^2)^p \\
&\equiv (-j^2)^p + kp^2(-j^2)^{p-1} \\
&\equiv -j^2 + kp^2
\end{align*}
and
\begin{align*}
x^p + 1 &\equiv (j + kp)^p + 1 \\
&\equiv j^p + kp^2 j^{p-1} + 1 \\
&\equiv -j^2 + kp^2
\end{align*}
as wanted.
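
The congruence can be confirmed by brute force for small primes. The sketch below (the function name and the search bound are arbitrary choices made here for illustration) tests every pair (a, b) below a bound; it also shows that the hypothesis p ≠ 3 cannot be dropped.

```python
def check(p, bound):
    """True if (a+b)^p = a^p + b^p mod p^3 for all 1 <= a, b < bound with p | a^2 + ab + b^2."""
    modulus = p ** 3
    for a in range(1, bound):
        for b in range(1, bound):
            if (a * a + a * b + b * b) % p == 0:
                diff = pow(a + b, p, modulus) - pow(a, p, modulus) - pow(b, p, modulus)
                if diff % modulus != 0:
                    return False
    return True
```

For instance, with p = 7 and (a, b) = (2, 1) we have $a^2 + ab + b^2 = 7$ and $3^7 - 2^7 - 1 = 2058 = 6 \cdot 7^3$, whereas for p = 3 the pair (1, 1) already fails since $2^3 - 1 - 1 = 6$.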


Remark A.3.1
It has been conjectured that the polynomials
\[ \frac{(X+1)^n - X^n - 1}{(X^2 + X + 1)^{\varepsilon}}, \]
where $\varepsilon = v_{X^2+X+1}\big((X+1)^n - X^n - 1\big)$
is 2 if n ≡ 1 (mod 3), 1 if n ≡ −1 (mod 3) and 0 if n ≡ 0 (mod 3), are irreducible. These are
called the Cauchy-Mirimanoff polynomials.

Group Theory
Exercise A.3.15†. Given a group G and a normal subgroup H ⊆ G, i.e. a subgroup such that
x + H − x = H
for any x ∈ G,3 we define the quotient G/H of G by H as G modulo H4, i.e. we say x ≡ y (mod H)
if x − y ∈ H.5 Prove that this is indeed a group, and that |G/H| = |G|/|H| for any such G, H.

Solution

G/H is clearly closed under the operation of G and has inverses and an identity. We need however
to check that the operation is well defined: if x ≡ y (mod H) and z ∈ G, then x + z ≡ y + z (mod H)
and z + x ≡ z + y (mod H). For the former, note that (x + z) − (y + z) = x − y ∈ H since the
inverse of y + z is −z − y, and for the latter note that (z + x) − (z + y) = z + (x − y) − z is in
H because H is normal in G. The second part is obvious: any x ∈ G is equal to exactly |H|
elements modulo H: x + y for y ∈ H.


Exercise A.3.16† (Isomorphism Theorems). Prove the following first, second, and third isomorphism
theorems.
1. Let ϕ : A → B be a morphism of groups. Then, A/ ker ϕ ≃ im ϕ. (In particular, ker ϕ is normal
in A and | im ϕ| · | ker ϕ| = |A|.)
3 In particular, when G is abelian, any subgroup is normal.
4 This is where the notation Z/nZ comes from! In fact this shows that, in reality, we should say "modulo nZ" instead
of "modulo n".
5 A better formalism is to say that G/H is the set of cosets g + H for g ∈ G. In fact, we will almost always use

this definition in the solutions of exercises (since this is the only place where this will appear, apart from ??), but we
introduced it that way to make the analogy with Z/nZ clearer.

2. Let H be a subgroup of a group G, and N a normal subgroup of G. Then, H/H ∩ N ≃ HN/N.
(In particular, you need to show that this makes sense: HN is a group and H ∩ N is normal in
H.)
3. Let N ⊆ H be normal subgroups of a group G. Then, (G/N)/(H/N) ≃ G/H.

Solution

1. Note that ker ϕ is normal in A. Indeed, if ϕ(x) = 1, then $\varphi(yxy^{-1}) = \varphi(y)\varphi(x)\varphi(y)^{-1} = 1$
too. Second, note that every element in the image of ϕ has exactly one preimage in
A/ ker ϕ: indeed, if ϕ(x) = ϕ(y), then $xy^{-1} \in \ker \varphi$ so they are equal modulo ker ϕ. This
shows that the induced map A/ ker ϕ → im ϕ is an isomorphism (it is clearly surjective, and we have
shown it was injective too).
2. Note that H ∩ N is normal in H: since N is normal, $h(H \cap N)h^{-1} \subseteq N$, but this is also contained in H
when h ∈ H, so it is contained in H ∩ N (and thus equal to it, by applying the same to $h^{-1}$). Note also that HN is indeed a group since, if
gm, hn ∈ HN (with g, h ∈ H and m, n ∈ N), then mh = hℓ for some ℓ ∈ N as N is normal, so

gmhn = ghℓn ∈ HN.

Similarly, gm = kg for some k ∈ N so $(gm)^{-1} = g^{-1}k^{-1} \in HN$. Now, consider the natural
map from H to HN/N, sending h to hN. Its kernel consists of the h such that hN = N, i.e.
h ∈ N. Hence, its kernel is H ∩ N so we get H/H ∩ N ≃ HN/N by the first isomorphism
theorem.
3. Consider the surjective map G/N → G/H which sends gN to gH. It is well defined
since N ⊆ H. gN is in the kernel if gH = H, i.e. g ∈ H. Hence, the kernel consists
of hN for h ∈ H, i.e. H/N. We conclude from the first isomorphism theorem that
G/H ≃ (G/N)/(H/N) as wanted.

Exercise A.3.17†. Let G be a finite group, $\varphi : G \to \mathbb{C}^\times$ be a non-trivial group morphism (i.e. not
the constant function 1), where $(\mathbb{C}^\times, \cdot)$ is the group of non-zero complex numbers under multiplication.
Prove that $\sum_{g \in G} \varphi(g) = 0$.

Solution

Note that, for any h ∈ G, g ↦ hg is a bijection so
\[ \sum_{g \in G} \varphi(g) = \sum_{g \in G} \varphi(hg) = \varphi(h) \sum_{g \in G} \varphi(g) \]
which means that $\sum_{g \in G} \varphi(g) = 0$ by picking an h such that ϕ(h) ≠ 1.


Remark A.3.2
Alternatively, this can be done as follows: the image of ϕ is a subgroup of the group of |G|th roots
of unity by Lagrange, so must be the group of nth roots of unity for some n, greater than 1 by assumption
(this is just the fact that subgroups of cyclic groups are also cyclic). Let ω = exp(2iπ/n) be a
primitive nth root of unity. Hence, we have
\[ \sum_{x \in \operatorname{im} \varphi} x = \sum_{k=0}^{n-1} \omega^k = \frac{\omega^n - 1}{\omega - 1} = 0 \]
since the numerator is zero while the denominator isn’t, as n > 1. To conclude, by the first
isomorphism theorem from Exercise A.3.16†, every element of im ϕ has exactly | ker ϕ| preimages, so we have
\[ \sum_{g \in G} \varphi(g) = |\ker \varphi| \sum_{x \in \operatorname{im} \varphi} x = 0. \]

Exercise A.3.18† (Lagrange’s Theorem). Let G be a group of cardinality n (also called the order of
G). Prove that $g^n = e$ for all g ∈ G. In other words, the order of an element divides the order of the
group. More generally, prove that the order of a subgroup divides the order of the group.

Solution

See Theorem 2.5.1 and Exercise 6.3.19∗ .




Exercise A.3.19† (5/8 Theorem). Let G be a non-commutative finite group. Prove that the probability
\[ p(G) = \frac{|\{(x, y) \in G^2 \mid xy = yx\}|}{|G|^2} \]
that two elements commute is at most 5/8.

Solution

Denote by Z the center of the group, i.e. the set of elements which commute with every other
one. For a given x ∈ G, denote also by C(x) the centraliser of x, i.e. the set of y such that x and
y commute. The wanted probability is $\sum_{x \in G} \frac{|C(x)|}{|G|^2}$. Note that the C(x) are subgroups of G (and
hence Z is too): if xy = yx and xz = zx then

xyz = yxz = yzx.

First, let’s see how big the center can be. It’s a subgroup of G, so its cardinality divides |G| by
Lagrange’s theorem (Exercise A.3.18†). It can’t be |G| since G is non-abelian. It can’t be |G|/2 since
G/Z is then isomorphic to Z/2Z so is generated by one element, say the class of a, and hence G is generated by Z
and the additional element a, which means that it’s commutative:

\[ a^m x a^n y = a^{m+n} xy = a^n y a^m x \]

for x, y ∈ Z. For the same reason, it can’t be |G|/3 since G/Z still has prime order so must be
generated by one element by Lagrange’s theorem. Thus, $\frac{|Z|}{|G|} \le \frac{1}{4}$.

Now, if x ∉ Z, C(x) is a subgroup of G distinct from it so has cardinality at most $\frac{|G|}{2}$. To
conclude,
\begin{align*}
\frac{|\{(x, y) \in G^2 \mid xy = yx\}|}{|G|^2} &= \sum_{x \in G} \frac{|C(x)|}{|G|^2} \\
&= \sum_{x \in Z} \frac{|G|}{|G|^2} + \sum_{x \notin Z} \frac{|C(x)|}{|G|^2} \\
&\le \frac{|Z|}{|G|} + (|G| - |Z|) \cdot \frac{|G|/2}{|G|^2} \\
&= \frac{|Z|}{|G|} + \frac{1}{2} - \frac{|Z|}{2|G|} \\
&= \frac{|Z|}{2|G|} + \frac{1}{2} \\
&\le \frac{1}{8} + \frac{1}{2} \\
&= \frac{5}{8}.
\end{align*}


Remark A.3.3
One can check that the bound 5/8 is achieved by the quaternion group $Q_8$ consisting of the
elements $e, b, b^2, b^3, a, ab, ab^2, ab^3$ under the presentation $a^4 = b^4 = e$, $a^2 = b^2$, and $ba = ab^3$.
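
This can indeed be checked by brute force. The sketch below does not use the presentation above; instead it assumes the standard model of $Q_8$ as the unit quaternions {±1, ±i, ±j, ±k} under the Hamilton product, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def qmul(x, y):
    """Hamilton product of quaternions given as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

# The eight elements 1, i, j, k and their negatives.
units = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
Q8 = units + [tuple(-t for t in u) for u in units]

def commuting_probability(group, mul):
    """Proportion of ordered pairs (x, y) of the group with xy = yx."""
    pairs = sum(1 for x, y in product(group, repeat=2) if mul(x, y) == mul(y, x))
    return Fraction(pairs, len(group) ** 2)
```

Here the center {±1} contributes 2 · 8 = 16 commuting pairs and each of the six other elements has a centraliser of size 4, for a total of 40 pairs out of 64, i.e. 5/8.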

Exercise A.3.20† (Fundamental Theorem of Finitely Generated Abelian Groups). Let G be an
abelian group which is finitely generated, i.e., if we write its operation as +, there are $g_1, \ldots, g_k \in G$
such that any g ∈ G can be represented as $n_1 g_1 + \ldots + n_k g_k$ for integers $n_i \in \mathbb{Z}$.
a) Suppose that G is torsion-free, i.e. the only element which has finite order in G is its identity 0.
Prove that there is a unique integer n ≥ 0 such that $(G, +) \simeq (\mathbb{Z}^n, +)$.
b) Suppose now that G is torsion, i.e. all elements in G have finite order. Prove that there is
a d ∈ N∗ such that G contains a subgroup H ≃ Z/dZ and such that the order of any element of
G divides d (i.e. dG = 0). By finding a morphism ϕ : G → H which is the identity on H, show
that we have G ≃ H × G/H. Deduce that there exists a unique sequence of positive integers
$1 \ne d_m \mid \ldots \mid d_1$ such that

\[ (G, +) \simeq (\mathbb{Z}/d_1\mathbb{Z} \times \ldots \times \mathbb{Z}/d_m\mathbb{Z}, +). \]

c) Deduce that if G is a finitely generated abelian group, there is a unique integer n ≥ 0 (called the
rank of the group) and a unique sequence of positive integers $1 \ne d_m \mid \ldots \mid d_1$ such that

\[ (G, +) \simeq (\mathbb{Z}^n \times \mathbb{Z}/d_1\mathbb{Z} \times \ldots \times \mathbb{Z}/d_m\mathbb{Z}, +). \]

Solution

a) Suppose that G is torsion-free. We start with the uniqueness part: if we had an isomorphism
from $\mathbb{Z}^m$ to $\mathbb{Z}^n$, we would have one from $(\mathbb{Z}/2\mathbb{Z})^m$ to $(\mathbb{Z}/2\mathbb{Z})^n$ by reducing it modulo 2,
and this implies $2^m = 2^n$, i.e. m = n. Now, pick a generating family of minimal cardinality
$(\alpha_1, \ldots, \alpha_n)$. We wish to prove that it is linearly independent. Suppose that it is not the
case, and let N ≠ 0 be the minimum value of the sum of the absolute values of the coefficients of
a non-trivial linear combination which is zero. In fact, we shall also pick the generating
family to minimise N. The contradiction will then come from a construction of another
generating set with zero linear combination with smaller coefficients.

Suppose that $k_1\alpha_1 + \ldots + k_n\alpha_n = 0$ and $N = |k_1| + \ldots + |k_n|$. Suppose without loss of
generality that $0 < |k_1| \le |k_2|$. Say we replace the family $\alpha_1, \ldots, \alpha_n$ by $\alpha_1 \pm \alpha_2, \alpha_2, \ldots, \alpha_n$.
Then, $k_1\alpha_1 + \ldots + k_n\alpha_n = 0$ becomes

\[ k_1(\alpha_1 \pm \alpha_2) + (k_2 \mp k_1)\alpha_2 + k_3\alpha_3 + \ldots + k_n\alpha_n = 0. \]

By picking the sign ± appropriately, we ensure that $|k_2 \mp k_1| < |k_2|$ thus leading to a
smaller value of N, which is a contradiction.
b) First, note that a torsion finitely-generated abelian group must be finite: if g1 , . . . , gm is a
generating family of respective orders k1 , . . . , km , we have |G| ≤ k1 · · · km .

Now, pick an element h ∈ G of maximal order d. We claim that the order of any element
g ∈ G divides d. (We know that this must be true by the statement: this d is our d1 .
Note however that this is false for non-abelian groups.) Indeed, suppose that x, y ∈ G have
order a, b. We will construct an element of order lcm(a, b). First suppose that a and b are
coprime. Then, a(x + y) = ax has order b since gcd(a, b) = 1, and similarly b(x + y) = by
has order a. Thus, the order of x + y is divisible by a and b, and hence by ab. Conversely,
it clearly divides ab so must be exactly ab.

Now, if a and b are not necessarily coprime, let $a' = \prod_{v_p(a) \ge v_p(b)} p^{v_p(a)}$ and $b' = \prod_{v_p(a) < v_p(b)} p^{v_p(b)}$
so that a′, b′ are coprime and have product lcm(a, b). The elements
(a/a′)x and (b/b′)y have respective orders a′ and b′ so we are done by the previous step
since a′ and b′ are coprime. This proves that the order of any element of G divides d: if some g
had order not dividing d, there would be an element of order lcm(ord g, d) > d.

Now, let H = ⟨h⟩ be the subgroup generated by h, i.e. {0, h, . . . , (d − 1)h}. This is
isomorphic to Z/dZ. We claim that

G ≃ H × G/H.

Continuing in this fashion with G/H (which has a strictly smaller cardinality unless G is
already trivial) yields the wanted decomposition as a product of cyclic groups, since we
have shown that each $d_i$ divides the previous one (d is divisible by the order of any
element). To prove that G ≃ H × G/H, we will find a morphism ϕ from G to H which
is the identity on H. Then, g ↦ (ϕ(g), g (mod H)) will be the wanted isomorphism
between G and H × G/H: if ϕ(g) = ϕ(g′) and g ≡ g′ (mod H), then 0 = ϕ(g − g′) = g − g′
since ϕ is the identity on H so we must have g = g′. Thus, our morphism is injective and
hence bijective since |G| = |H| · |G/H|.

We proceed by induction on the minimal number of elements needed to generate G from
H. When H = G it is trivial. Now, suppose ϕ is a morphism from G′ ⊆ G to H which is the
identity on H (where H ⊆ G′), and let g ∈ G \ G′. We will extend ϕ to ⟨G′, g⟩, the subgroup generated by G′ and g. Let n be the
order of g in G/G′, i.e. the smallest n ≥ 1 such that ng ∈ G′. Then, kg ∈ G′ ⟺ n | k. Thus,
ϕ(g′ + kg) := ϕ(g′) + kϕ(g) is well-defined as long as ϕ(g) is such that

ϕ(ng) = nϕ(g).

Now, note that n divides the order of g which divides d = |H|. Hence, it is always possible
to find such a ϕ(g): if ϕ(ng) = kh, since dg = 0, we have (dk/n)h = 0, i.e. n | k which
means that ϕ(g) = (k/n)h works.

Finally, it remains to show that this sequence $d_1, \ldots, d_m$ is unique. We have already proven
that $d_1 = d$ is uniquely determined: it is the maximal order of an element of G. Suppose
that
\[ \mathbb{Z}/d_1\mathbb{Z} \times \ldots \times \mathbb{Z}/d_m\mathbb{Z} \simeq G \simeq \mathbb{Z}/d'_1\mathbb{Z} \times \ldots \times \mathbb{Z}/d'_r\mathbb{Z} \]
where $d_m \mid \ldots \mid d_1$ and $d'_r \mid \ldots \mid d'_1$ are positive integers. We prove by induction on k that
$(d_1, \ldots, d_k) = (d'_1, \ldots, d'_k)$ for k ≤ m, r. Then, by comparing the cardinality on both sides,
we get m = r as wanted. Thus, suppose that $(d_1, \ldots, d_{k-1}) = (d'_1, \ldots, d'_{k-1})$. Then, we have

\[ \mathbb{Z}/(d_1/d_k)\mathbb{Z} \times \ldots \times \mathbb{Z}/(d_{k-1}/d_k)\mathbb{Z} \simeq d_k G \simeq \mathbb{Z}/(d_1/d_k)\mathbb{Z} \times \ldots \times \mathbb{Z}/(d_{k-1}/d_k)\mathbb{Z} \times d_k\mathbb{Z}/d'_k\mathbb{Z} \times \ldots \times d_k\mathbb{Z}/d'_r\mathbb{Z}. \]

Comparing the cardinality of both sides yields $d_k\mathbb{Z}/d'_k\mathbb{Z} = 0$, i.e. $d'_k \mid d_k$. By symmetry,
we also have $d_k \mid d'_k$, which means that $d_k = d'_k$ as wanted.
c) Let T be the torsion part of G, i.e. the subgroup of elements of finite order of G (this
is indeed a subgroup as G is abelian). Note that G/T is torsion-free: if x (mod T) has
finite order, then nx ∈ T for some n so x has finite order, i.e. x ∈ T. Pick a basis $\alpha_1$
(mod T), . . . , $\alpha_n$ (mod T) of G/T. Now, we claim that

\[ G \simeq T \times (\alpha_1\mathbb{Z} + \ldots + \alpha_n\mathbb{Z}) \simeq T \times \mathbb{Z}^n \]

as wanted. This follows from the simple isomorphism (x, y) ↦ x + y. This is surjective by
definition, since $\alpha_1\mathbb{Z} + \ldots + \alpha_n\mathbb{Z}$ is a system of representatives of G/T. For the injectivity,
note that, if x + y = x′ + y′, then y − y′ = x′ − x ∈ T so y = y′ and thus x = x′ since
$\alpha_1\mathbb{Z} + \ldots + \alpha_n\mathbb{Z} \simeq G/T$ has trivial torsion.

Finally, note that if $G \simeq \mathbb{Z}^n \times H$ with H finite, then we automatically have T ≃ H and
$G/T \simeq \mathbb{Z}^n$, so the uniqueness follows from the first two questions.

Exercise A.3.21† (Burnside’s Lemma). Let G be a finite group, S a finite set, and · a group action
of G on S, meaning a map · : G × S → S such that e · s = s and (gh) · s = g · (h · s) for any g, h ∈ G
and s ∈ S. Given a g ∈ G, denote by Fix(g) the set of elements of S fixed by g. Prove that
\[ |S/G| = \frac{1}{|G|} \sum_{g \in G} |\operatorname{Fix}(g)|, \]
where |S/G| denotes the number of (disjoint) orbits $O_i = Gs_i$. Deduce the number of necklaces that
have p beads which can be of a colours, where p is a prime number and two necklaces are considered
to be the same up to rotation.

Solution
Consider the sum $\sum_{g \in G} |\operatorname{Fix}(g)|$. This is equal to the number of pairs (g, s) such that gs = s.
Hence, this is also equal to $\sum_{s \in S} |\operatorname{Stab}(s)|$, where Stab(s) denotes the set of elements of G fixing s.
Now consider the orbit Gs of s. We claim that |Gs| = |G/ Stab(s)| = |G|/| Stab(s)|. Indeed,
the map from the left-cosets G/ Stab(s) to Gs sending g Stab(s) to gs is clearly a bijection: if
gs = hs then $h^{-1}g \in \operatorname{Stab}(s)$ so g Stab(s) = h Stab(s). Hence,
\begin{align*}
\sum_{g \in G} |\operatorname{Fix}(g)| &= |G| \sum_{s \in S} \frac{1}{|Gs|} \\
&= |G| \sum_{O \in S/G} \sum_{x \in O} \frac{1}{|O|} \\
&= |G| \sum_{O \in S/G} 1 \\
&= |G| \cdot |S/G|
\end{align*}
as desired.

For the second part, consider the cyclic group Z/pZ acting on the set of words (necklaces) of length p in
an alphabet (the set of colours) of size a. Why did we choose Z/pZ? Because we consider the
necklaces up to rotation. The action of g ∈ Z/pZ is of course a rotation of g beads, say to
the right. Then, there is one element fixing all words: 0, and all the other ones only fix words
with all letters equal, i.e. monochromatic necklaces. Indeed, if 0 ≠ g ∈ Z/pZ fixes a necklace,
then so does Z/pZ = g(Z/pZ) which means that the necklace is invariant under all rotations, i.e.
monochromatic. Hence, the number of necklaces is
\[ \frac{a^p + (p-1)a}{p} \]
by Burnside’s lemma. Notice that this also proves Fermat’s little theorem.
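
The count can be compared with a direct enumeration for small p and a. In this sketch (function names are hypothetical), each word is replaced by its lexicographically smallest rotation, so that two words give the same necklace exactly when they share this representative.

```python
from itertools import product

def count_necklaces(p, a):
    """Count colourings of p beads with a colours up to rotation, by brute force."""
    seen = set()
    for word in product(range(a), repeat=p):
        # Canonical representative: the lexicographically smallest rotation.
        seen.add(min(word[i:] + word[:i] for i in range(p)))
    return len(seen)

def burnside_count(p, a):
    """The closed form (a^p + (p - 1) * a) / p, valid for prime p."""
    return (a ** p + (p - 1) * a) // p
```

For p = 5 and a = 2, both functions return (32 + 8)/5 = 8.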


Exercise A.3.22† (Exact Sequences). We say a sequence of group morphisms $\varphi_i : G_i \to G_{i+1}$, written
\[
G_1 \xrightarrow{\varphi_1} G_2 \xrightarrow{\varphi_2} \cdots \xrightarrow{\varphi_n} G_{n+1},
\]
is exact if $\operatorname{im} \varphi_k = \ker \varphi_{k+1}$ for every k ∈ [n − 1]. Prove that the short sequences
\[
0 \to G \xrightarrow{\varphi} H
\]
and
\[
G \xrightarrow{\varphi} H \to 0
\]
are exact if and only if ϕ is injective or surjective respectively. (0 designates the trivial group, and the
omitted maps are the trivial morphisms.) Finally, suppose that d is a measure of the size of groups which
satisfies $d(G_1) = d(G_2)$ whenever $G_1 \simeq G_2$, and d(G/H) = d(G) − d(H) for any group G and any
normal subgroup H ⊆ G. Suppose that
\[
0 \to G_1 \xrightarrow{\varphi_1} G_2 \xrightarrow{\varphi_2} \cdots \xrightarrow{\varphi_n} G_{n+1} \to 0
\]
is exact. Prove that⁶
\[
\sum_{k=1}^{n+1} (-1)^k d(G_k) = 0.
\]

Solution

$0 \to G \xrightarrow{\varphi} H$ is exact if and only if ker ϕ = im 0 = 0, i.e. ϕ is injective by Exercise A.2.16∗.

Similarly, $G \xrightarrow{\varphi} H \to 0$ is exact if and only if im ϕ = ker 0 = H, i.e. ϕ is surjective.

Now we prove the main part of the exercise. For didactic purposes, we shall do the cases n = 1
and n = 2 first. For the former, note that, by the above paragraph, $0 \to G \xrightarrow{\varphi} H \to 0$ is exact if
and only if ϕ is an isomorphism. This means that G ≃ H, and thus d(G) − d(H) = 0. For the
latter, suppose that
\[
0 \to A \xrightarrow{\varphi} B \xrightarrow{\psi} C \to 0
\]
is exact. Then, ϕ is injective and ψ is surjective. By the first isomorphism theorem (see
Exercise A.3.16†), we have
\[
A \simeq A/\ker \varphi \simeq \operatorname{im} \varphi
\]
and
\[
B/\operatorname{im} \varphi = B/\ker \psi \simeq \operatorname{im} \psi = C.
\]

6 Exact sequences may seem scary at first, but this is one of the many reasons why they’re so useful (but not the most

fundamental). Many times, the goal is this equality, and we just happen to have an exact sequence which produces it.
Common examples of such size measures include d(G) = log(|G|) for finite groups, or d(V ) = dim V for vector spaces.

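Thus, in some sense, we have B/A ≃ C, which gives the wanted equality by taking the size on
both sides. To be more rigorous,
\[
d(B) - d(A) = d(B) - d(\operatorname{im} \varphi) = d(B/\operatorname{im} \varphi) = d(C).
\]
We now consider the general case with n morphisms and n + 1 groups. The first isomorphism
theorem yields again
\[
\begin{aligned}
G_1 &\simeq \operatorname{im} \varphi_1 \\
G_k/\operatorname{im} \varphi_{k-1} &\simeq \operatorname{im} \varphi_k \\
G_n/\operatorname{im} \varphi_{n-1} &\simeq G_{n+1}
\end{aligned}
\]
for 2 ≤ k ≤ n (the last line is the case k = n, since $\varphi_n$ is surjective). Taking the size, we get
\[
\begin{aligned}
d(G_1) &= d(\operatorname{im} \varphi_1) \\
d(G_k) - d(\operatorname{im} \varphi_{k-1}) &= d(\operatorname{im} \varphi_k) \\
d(G_n) - d(\operatorname{im} \varphi_{n-1}) &= d(G_{n+1})
\end{aligned}
\]
for 2 ≤ k ≤ n. Finally, taking the alternating sum, the $d(\operatorname{im} \varphi_k)$ terms cancel out, and we are
left with
\[
\sum_{k=1}^{n+1} (-1)^k d(G_k) = 0
\]
as wanted.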
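As a concrete illustration (not taken from the text), the sequence 0 → Z/2Z → Z/4Z → Z/2Z → 0, with maps x ↦ 2x and y ↦ y mod 2, can be checked numerically with the size measure d(G) = log |G|. The names `phi1`, `phi2`, `is_exact` and `alternating_sum` are hypothetical and only serve this sketch.

```python
from math import log

# Hypothetical example: G_1 = Z/2Z, G_2 = Z/4Z, G_3 = Z/2Z, written as residues.
groups = [range(2), range(4), range(2)]


def phi1(x: int) -> int:
    """Z/2Z -> Z/4Z, x -> 2x."""
    return (2 * x) % 4


def phi2(y: int) -> int:
    """Z/4Z -> Z/2Z, y -> y mod 2."""
    return y % 2


def image(phi, domain):
    return {phi(x) for x in domain}


def kernel(phi, domain):
    return {x for x in domain if phi(x) == 0}


def is_exact() -> bool:
    """Check injectivity of phi1, im phi1 = ker phi2, and surjectivity of phi2."""
    injective = kernel(phi1, groups[0]) == {0}
    middle = image(phi1, groups[0]) == kernel(phi2, groups[1])
    surjective = image(phi2, groups[1]) == set(groups[2])
    return injective and middle and surjective


def alternating_sum() -> float:
    """Sum of (-1)^k d(G_k) with d(G) = log |G|."""
    return sum((-1) ** k * log(len(g)) for k, g in enumerate(groups, start=1))
```

Here the alternating sum is −log 2 + log 4 − log 2, which vanishes up to floating-point error, in line with |G_2| = |G_1| · |G_3|.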


Exercise A.3.23†. We say an ideal m ≠ R of a ring R is maximal if no ideal a ≠ R other than m itself
contains it. Prove that m is maximal if and only if R/m is a field. (The quotient is the same as in
Exercise A.3.15†.)

Solution

m is a maximal ideal if and only if, for any x ∉ m, the ideal generated by m and x is R. By
Exercise A.2.4, this is equivalent to it containing 1, i.e. to 1 having the form ax + by for some
a, b ∈ R and y ∈ m. Thus, m is a maximal ideal if and only if, for any x ∉ m, there is an a ∈ R
for which ax ≡ 1 (mod m). In other words, m is a maximal ideal if and only if any 0 ≠ x ∈ R/m
has an inverse, i.e. R/m is a field (since m ≠ R this is not the trivial ring).
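For R = Z the equivalence can be tested directly: the ideals containing nZ are the dZ with d | n, so nZ is maximal exactly when n has no divisor strictly between 1 and n. The sketch below (an illustration with hypothetical function names, not part of the original solution) compares this with the field property of Z/nZ.

```python
def is_field_mod(n: int) -> bool:
    """Z/nZ is a field iff every non-zero class has a multiplicative inverse."""
    return n > 1 and all(
        any(a * x % n == 1 for a in range(1, n)) for x in range(1, n)
    )


def is_maximal_nZ(n: int) -> bool:
    """nZ is maximal iff no ideal dZ with 1 < d < n contains it, i.e. no such d divides n."""
    return n > 1 and not any(n % d == 0 for d in range(2, n))
```

Both predicates hold exactly when n is prime, which is the case R = Z, m = nZ of the exercise.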


Exercise A.3.24†. We say a commutative ring R is Noetherian if it satisfies the ascending chain
condition: for any weakly increasing chain of ideals
\[
\mathfrak{a}_1 \subseteq \mathfrak{a}_2 \subseteq \mathfrak{a}_3 \subseteq \cdots,
\]
there is some N ∈ N∗ such that $\mathfrak{a}_N = \mathfrak{a}_{N+1} = \mathfrak{a}_{N+2} = \ldots$. Prove that the following assertions are
equivalent.

(i) R is Noetherian.

(ii) Any ideal $\mathfrak{a}$ of R is finitely generated⁷, meaning that there are some $a_1, \ldots, a_n$ such that
\[
\mathfrak{a} = a_1R + \ldots + a_nR.
\]
7 In particular, a principal ideal domain is Noetherian.

(iii) Any non-empty set of ideals has a maximal element (with respect to the relation of inclusion).⁸

Solution

It is clear that (i) is equivalent to (iii): if we have a set of ideals with no maximal element we
can create an infinite strictly increasing chain of ideals by picking an ideal strictly larger than the
previous one each time. Conversely, if every set of ideals has a maximal element, then the same
goes for an ascending chain of ideals which means that it must be eventually constant.

We now prove that (i) implies (ii). Suppose that R is Noetherian. We need to prove that every
ideal is finitely generated. Suppose otherwise: let $\mathfrak{a}$ be an ideal which is not finitely generated.
Then, we can construct a strictly increasing chain of ideals as follows: set $\mathfrak{a}_k = a_1R + \ldots + a_kR$
where $a_k \in \mathfrak{a}$ is chosen so that $a_k \notin \mathfrak{a}_{k-1}$; this is possible since $\mathfrak{a}$ is not finitely generated so
$\mathfrak{a} \neq \mathfrak{a}_k$ for all k. This contradicts the ascending chain condition.

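Finally, suppose that every ideal is finitely generated and let $\mathfrak{a}_1 \subseteq \mathfrak{a}_2 \subseteq \mathfrak{a}_3 \subseteq \cdots$ be a weakly
increasing chain of ideals. The union of the chain is again an ideal, so we can write
\[
\mathfrak{a}_\infty := \bigcup_{i=1}^{\infty} \mathfrak{a}_i = a_1R + \ldots + a_nR.
\]
Then, there is some N such that $a_1, \ldots, a_n \in \mathfrak{a}_N$, which yields $\mathfrak{a}_\infty \subseteq \mathfrak{a}_N \subseteq \mathfrak{a}_\infty$ and thus $\mathfrak{a}_N = \mathfrak{a}_\infty$:
$\mathfrak{a}_n$ is constant for n ≥ N.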


Exercise A.3.25† (Hilbert’s Basis Theorem). Let R be a Noetherian ring. Prove that R[X] is
Noetherian as well9 .

Solution

Suppose for the sake of contradiction that $\mathfrak{a}_1 \subset \mathfrak{a}_2 \subset \cdots$ is a strictly increasing chain of ideals of
R[X]. Let $f_i \in \mathfrak{a}_{i+1}$, $f_i \notin \mathfrak{a}_i$ be of minimal degree. The idea is to perform the Euclidean division
to find a smaller $f_i$, which is a contradiction. To do this however, we need to have a polynomial in
$\mathfrak{a}_{i+1}$ with the same leading coefficient as $f_i$, and smaller degree. Hence, consider the ideals
\[
\mathfrak{b}_n = a(f_1)R + \ldots + a(f_n)R
\]
for n ≥ 1, where a(f) is the leading coefficient of f ∈ R[X]. Since R is Noetherian, this stabilises,
say $\mathfrak{b}_n = \mathfrak{b}$ for n ≥ N. Then, $\mathfrak{a}_{n+2}$ contains an element which is a linear combination of
$f_1, \ldots, f_n$ of leading coefficient $a(f_{n+1})$ for all n ≥ N: by the previous remark, this implies that,
for all n ≥ N,
\[
\deg f_{n+1} < \max(\deg f_1, \ldots, \deg f_n)
\]
for it would otherwise contradict the minimality of $\deg f_{n+1}$. In particular, $(\deg f_n)_{n \geq 1}$ is
bounded. By the infinite pigeonhole principle, pick an infinite sequence of integers $k_1 < k_2 < \ldots$
and an integer m such that $\deg f_{k_n} = m$ for all n ∈ N∗. This time, define
\[
\mathfrak{c}_n = a(f_{k_1})R + \ldots + a(f_{k_n})R
\]
for n ≥ 1. Then, as before, $(\mathfrak{c}_n)_{n \geq 1}$ stabilises, say $\mathfrak{c}_n = \mathfrak{c}$ for n ≥ M. We also get the same

8 This lets us perform Noetherian induction over Noetherian rings, because induction just amounts to considering a
minimal n such that some property is not satisfied, and deduce a contradiction by constructing an even smaller one.
(Over Noetherian rings, every set of ideals has a maximal element instead of a minimal one, but this is the natural
generalisation of what happens over Z: a | b if and only if bZ ⊆ aZ.)
9 Since any field K is Noetherian (it has only two ideals: {0} and itself), Hilbert’s theorem implies that K[X] is
Noetherian. By Exercise A.3.24†, this means that every ideal is finitely generated (hence the name "basis theorem").
This is of considerable importance in (classical) algebraic geometry as it allows us to say that, given a set of points S in
$K^n$, the ideal of polynomials vanishing on S is finitely generated (see Shafarevich [39]).

inequality
\[
\deg f_{k_{n+1}} < \max(\deg f_{k_1}, \ldots, \deg f_{k_n})
\]
for n ≥ M and that is a contradiction since it is actually an equality by construction.


Miscellaneous
Exercise A.3.26† (China TST 2009). Prove that there exists a real number c > 0 such that, for any
prime number p, there are at most $cp^{2/3}$ positive integers n satisfying n! ≡ −1 (mod p).

Solution

We shall prove that any set S ⊆ {0, 1, . . . , p − 1} such that a! ≡ b! ≢ 0 (mod p) for all a, b ∈ S
has cardinality at most $2p^{2/3}$. Consider the following polynomial:
\[
f_m = (X + 1) \cdot \ldots \cdot (X + m) - 1 \in \mathbb{F}_p[X].
\]
Since $\mathbb{F}_p$ is a field, $f_m$ has at most m roots in $\mathbb{F}_p$. Thus, for each m ≥ 1, there are at most m integers
n ∈ {0, . . . , p − 1} such that n! ≡ (m + n)! ≢ 0 (mod p), since this is equivalent to $f_m(n) = 0$.

Let k be an integer which we will specify later on. Let N be the set of pairs of elements of S at
a distance less than k, i.e.
\[
N = \{\{a, b\} \subseteq S \mid a \neq b,\ |a - b| < k\}.
\]
By our previous result,
\[
|N| \leq 1 + 2 + \ldots + (k - 1) < \frac{k^2}{2}.
\]
Now, let M = {a | ∃b > a : {a, b} ∈ N} be the set of smaller elements of these pairs, so that |M| ≤ |N|.
Consider S \ M. By definition, for any distinct a, b ∈ S \ M, we
have |a − b| ≥ k. Since the elements of S are between 0 and p − 1, by the pigeonhole principle,
we have |S \ M| ≤ p/k + 1. To conclude,
\[
|S| \leq |S \setminus M| + |M| \leq |S \setminus M| + |N| \leq \frac{p}{k} + \frac{k^2}{2} + 1.
\]
If we now pick an integer k as close as possible to $\sqrt[3]{p}$, we get $|S| \leq 2p^{2/3}$ as wanted.
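The bound is far from tight in practice, which a direct computation makes visible. The sketch below is only an illustration (the helper name `count_solutions` is hypothetical): it counts the n ∈ {0, . . . , p − 1} with n! ≡ −1 (mod p); larger n never contribute since then p | n!.

```python
def count_solutions(p: int) -> int:
    """Number of n in {0, ..., p-1} with n! congruent to -1 modulo the prime p."""
    count = 0
    factorial = 1  # 0! = 1
    for n in range(p):
        if n > 0:
            factorial = factorial * n % p
        if factorial == p - 1:
            count += 1
    return count
```

By Wilson's theorem, n = p − 1 is always a solution, so the count is at least 1 for every prime p.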

Exercise A.3.27† (Mason-Stothers Theorem, ABC conjecture for polynomials). Let K be a
characteristic 0 field. Suppose that A, B, C ∈ K[X] are polynomials such that A + B = C. Prove that
\[
1 + \max(\deg A, \deg B, \deg C) \leq \deg(\operatorname{rad} ABC)
\]
where rad ABC is the greatest squarefree divisor of ABC (in other words, deg(rad ABC) is the number
of distinct roots of ABC in an algebraic closure of K). Deduce that the Fermat equation $f^n + g^n = h^n$
for f, g, h ∈ K[X] does not have non-trivial solutions for n ≥ 3.

Solution

Suppose without loss of generality that A, B, C are coprime. The problem is easy when one of A, B, C
is constant, so assume also that they are all non-constant. Consider the Wronskian determinant
(see Exercise C.5.10†)
\[
D = \det \begin{pmatrix} A & B \\ A' & B' \end{pmatrix} = AB' - BA'.
\]
Note that this is the same up to sign
when we replace A and B by two polynomials out of A, B, C: this is because the determinant
is invariant up to sign under column operations (adding certain columns to other columns and
exchanging columns, see Proposition C.3.4). Of course, it can also be proven by computing it
explicitly: (A + B)B′ − B(A + B)′ = AB′ − BA′ (and the rest follows by symmetry). In addition,
it is non-zero, for if it were, A would divide BA′ and hence its derivative A′ since A and B are coprime. This
is impossible since A′ is non-zero by assumption and has smaller degree. (More generally, see
Exercise C.5.10†.)

Now, notice that r is a multiple root of ABC only if it is a root of D: indeed such a root must be a multiple
root of one of A, B, C since they are coprime, say A. It is then a common root of A and A′ so of
D too. However, a lot more holds: if v is the multiplicity of r in ABC (thus in A in our case), r
is a root of multiplicity at least v − 1 of D since it’s a root of multiplicity v − 1 of A′. Thus,
\[
ABC \mid \operatorname{rad}(ABC)\, D,
\]
which gives the wanted bound since deg D ≤ deg A + deg B − 1 and the same with B, C and C, A
by symmetry.

Suppose that $A = f^n$, $B = g^n$, $C = h^n$ are non-zero and satisfy A + B = C. Then,
\[
\begin{aligned}
1 + n \max(\deg f, \deg g, \deg h) &= 1 + \max(\deg A, \deg B, \deg C) \\
&\leq \deg(\operatorname{rad} ABC) \\
&= \deg(\operatorname{rad} fgh) \leq \deg f + \deg g + \deg h
\end{aligned}
\]
so n < 3 as wanted.
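As a sanity check of the bound (an illustration that is not part of the original solution), one can test it on a Pythagorean-type identity, which also shows that the exponent 2 is not excluded by the argument:

```latex
\[
A = (X^2 - 1)^2, \qquad B = (2X)^2, \qquad C = (X^2 + 1)^2, \qquad A + B = C,
\]
\[
1 + \max(\deg A, \deg B, \deg C) = 1 + 4 = 5,
\qquad
\operatorname{rad}(ABC) = X (X - 1)(X + 1)(X^2 + 1),
\qquad
\deg(\operatorname{rad} ABC) = 5.
\]
```

Here A, B, C are pairwise coprime and the Mason-Stothers inequality holds with equality (the radical is taken up to a constant factor).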


Remark A.3.4
If we worked in non-zero characteristic, the only step that would potentially fail is the fact that
the Wronskian is non-zero. Let’s see when this is possible: A′B = B′A implies as before that
A | A′, i.e. A′ = 0. Similarly, B′ = 0. In other words, the theorem fails only if A, B, C are pth
powers, where p = char K. Moreover, when p ≥ 3, the theorem is always false in that case as we
saw at the end of the proof with the Fermat equation. As a bonus, we also get a characterisation
of the solutions to the Fermat equation. When p = 2, the theorem may or may not be true for
$A = U^2$, $B = V^2$ and $C = W^2$: as long as U, V, W have the same degree d and have distinct
roots, we have
\[
\deg \operatorname{rad}(ABC) = 3d > 1 + \max(\deg A, \deg B, \deg C) = 1 + 2d
\]
for d > 2. Such an example is trivial to obtain, e.g. $U = X^2 + X$, $V = \alpha X^2 + \beta X + \gamma$ and
W = U + V, where $\alpha, \beta, \gamma \in \overline{\mathbb{F}}_2$ are suitable algebraic elements such that α + β + γ ≠ 0, β ≠ 0
(that way U and V are coprime) and V doesn’t have a double root.

Exercise A.3.28† . Find all polynomials f ∈ C[X] which send the unit circle to itself.

Solution

As in Exercise A.3.9†, $\overline{f(z)} = \overline{f}(z^{-1})$ for any z on the unit circle, where $\overline{f}$ denotes the polynomial
obtained by conjugating the coefficients of f. Thus, $1 = |f(z)|^2 = f(z)\overline{f}(z^{-1})$.
Hence, $f(z)(z^n \overline{f}(z^{-1})) = z^n$ for z on the unit circle, where n = deg f. Note that $X^n \overline{f}(1/X)$ is
indeed a polynomial: if $f = \sum_{i=0}^{n} a_i X^i$, then $X^n \overline{f}(1/X) = \sum_{i=0}^{n} \overline{a_{n-i}} X^i$.

Thus, the polynomials $f(X)(X^n \overline{f}(1/X))$ and $X^n$ agree at infinitely many points, which
means that they are equal. In particular, $f \mid X^n$, which implies that $f = \varepsilon X^k$ for some ε and
some k. It is clear that ε must be on the unit circle, and conversely any such ε works (in other
words, the polynomials which send the unit circle to itself contract it and then rotate it).


Exercise A.3.30† . Let K be a characteristic 0 field, and let f ∈ K[X] be a non-zero polynomial.
Suppose that an additive map ϕ : K → K commutes with f , i.e. ϕ(f (x)) = f (ϕ(x)) for all x ∈ K.
Prove that ϕ commutes with every monomial of f .

Solution

A straightforward induction shows that ϕ(nx) = nϕ(x) for all x ∈ K and n ∈ Z. In fact, it is
even true for any rational r, although we won’t need it: if a, b ∈ Z with b 6= 0, we have

bϕ(xa/b) = ϕ(ax) = aϕ(x)

so that ϕ(xa/b) = ϕ(x)a/b.


Now, write $f = \sum_{i=0}^{m} a_i X^i$ and let x be an element of K. We have, for any n ∈ Z,
\[
\begin{aligned}
\varphi(f(nx)) &= \varphi\left( \sum_{i=0}^{m} a_i (nx)^i \right) \\
&= \sum_{i=0}^{m} \varphi(n^i a_i x^i) \\
&= \sum_{i=0}^{m} \varphi(a_i x^i) n^i
\end{aligned}
\]
and
\[
f(\varphi(nx)) = f(n\varphi(x)) = \sum_{i=0}^{m} a_i \varphi(x)^i n^i.
\]
This means that the polynomial
\[
g(n) = \sum_{i=0}^{m} \left( \varphi(a_i x^i) - a_i \varphi(x)^i \right) n^i
\]
has infinitely many roots, so must be the zero polynomial. In other words, $\varphi(a_i x^i) = a_i \varphi(x)^i$ for
all i: since this is true for all x ∈ K, ϕ commutes with the monomials $a_i X^i$ of f.


Exercise A.3.32† (Gauss-Lucas Theorem). Let f ∈ C[X] be a polynomial with roots $\alpha_1, \ldots, \alpha_n$
(counted with multiplicity). Prove that
\[
\frac{f'}{f} = \sum_{k} \frac{1}{X - \alpha_k}.
\]
Deduce the Gauss-Lucas theorem: if f ∈ C[X] is non-constant, the roots of f′ are in the convex hull of
the roots of f, that is, any root β of f′ is a linear combination $\sum_i \lambda_i \alpha_i$ with $\sum_i \lambda_i = 1$ and non-negative
$\lambda_i \in \mathbb{R}$.

Solution

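The identity follows from Exercise A.1.8∗. Let α be a root of f′, without loss of generality such
that f(α) ≠ 0. We have
\[
0 = \sum_{k=1}^{n} \frac{1}{\alpha - \alpha_k} = \sum_{k=1}^{n} \frac{\overline{\alpha - \alpha_k}}{|\alpha - \alpha_k|^2}
\]
so that
\[
\overline{\alpha} \sum_{k=1}^{n} \frac{1}{|\alpha - \alpha_k|^2} = \sum_{k=1}^{n} \frac{\overline{\alpha_k}}{|\alpha - \alpha_k|^2}.
\]
If we now conjugate this equality, we get
\[
\alpha = \frac{\sum_{k=1}^{n} \frac{\alpha_k}{|\alpha - \alpha_k|^2}}{\sum_{k=1}^{n} \frac{1}{|\alpha - \alpha_k|^2}}
\]
which has the desired expression.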
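The explicit weights obtained in the proof, proportional to 1/|α − α_k|², can be tested numerically on a cubic, where the roots of f′ are given by the quadratic formula. This is a sketch under the assumption that the three roots are distinct; the function names are hypothetical.

```python
import cmath


def critical_points_of_cubic(r1, r2, r3):
    """Roots of f' for the monic cubic f = (X - r1)(X - r2)(X - r3)."""
    # f = X^3 - s1 X^2 + s2 X - s3, hence f' = 3 X^2 - 2 s1 X + s2.
    s1 = r1 + r2 + r3
    s2 = r1 * r2 + r1 * r3 + r2 * r3
    disc = cmath.sqrt(4 * s1 * s1 - 12 * s2)
    return [(2 * s1 + disc) / 6, (2 * s1 - disc) / 6]


def convex_weights(beta, roots):
    """Weights proportional to 1 / |beta - alpha_k|^2, normalised to sum to 1."""
    raw = [1 / abs(beta - r) ** 2 for r in roots]
    total = sum(raw)
    return [w / total for w in raw]


roots = [0, 4, 2 + 3j]
for beta in critical_points_of_cubic(*roots):
    weights = convex_weights(beta, roots)
    combination = sum(w * r for w, r in zip(weights, roots))
    print(beta, combination)  # the two values should agree up to rounding
```

The weights are non-negative and sum to 1 by construction, so the agreement of the two printed values is exactly the statement that each root of f′ lies in the convex hull of the roots of f.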




Remark A.3.5
You may notice that the first identity is the logarithmic derivative (log f)′. This can be used to
produce an analytic proof of this identity: it holds when $X > \alpha_k$ for all k (in particular they are
all real), but is also a polynomial identity in X and the $\alpha_k$, so it must hold polynomially. More
specifically, if we fix the $\alpha_i \in \mathbb{R}$, it holds for sufficiently large X so it must hold for all X. Thus,
it holds for all $\alpha_i, X \in \mathbb{R}$ which means that it always holds by Exercise A.1.7∗.

Exercise A.3.33† (Sturm’s Theorem). Given a squarefree polynomial f ∈ R[X], define the sequence
$f_0 = f$, $f_1 = f'$ and $f_{n+2}$ is minus the remainder of the Euclidean division of $f_n$ by $f_{n+1}$. Define also
V(ξ) as the number of sign changes in the sequence $f_0(\xi), f_1(\xi), \ldots$, ignoring zeros. Prove that the
number of distinct real roots of f in the interval ]a, b] is V(a) − V(b).¹⁰

Solution

When x increases from a to b, it may pass through a zero of some $f_k$ (otherwise, by the
intermediate value theorem, V(a) = V(b) and there is clearly no root in the interval as claimed). We
shall prove that this leaves V(x) invariant if k ≥ 1, and decreases it by 1 precisely when k = 0,
i.e. x is a root of f. Before doing that, note that the important part of the definition of $(f_n)_{n \geq 0}$
is that $f_{n+1} \equiv -f_{n-1} \pmod{f_n}$ for all n. In particular, if $f_n(x)$ and $f_{n+1}(x)$ are zero, then so is
$f_{n-1}(x)$, which implies, by induction, that x is a root of every $f_i$. This is impossible since $f_0 = f$
and $f_1 = f'$ have no common root by assumption.
First, suppose that $f_i(\xi) = 0$ for some ξ and i ≥ 1. Then, since $f_{i+1} \equiv -f_{i-1} \pmod{f_i}$, $f_{i+1}(x)$
and $f_{i-1}(x)$ have opposite signs around ξ (and are non-zero by our previous observation). This
means that, before ξ, we had one sign change in $(f_{i-1}(x), f_i(x), f_{i+1}(x))$ since this has the form
(±1, ε, ∓1) for ε ∈ {−1, 1}. After ξ and at ξ, we still have one sign change for the same reason.
Hence, V(x) stays invariant when x passes through a root of some $f_i$ with i ≥ 1.
Now, suppose that f(ξ) = 0. Then, around ξ, $f(\xi + \varepsilon) = \varepsilon f'(\xi) + O(\varepsilon^2)$ which means that the
sign of f(x) flips before and after ξ, while the sign of f′ does not change since ξ is a simple root.
More precisely, before ξ, f(x) and f′(x) had opposite signs, while they have the same sign after
ξ. At ξ, we do not count a sign change since f(ξ) = 0 so V(ξ) = V(ξ + ε) for sufficiently small
ε > 0, which finishes the proof.
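The theorem translates directly into a root-counting procedure. The following is a minimal sketch with exact rational arithmetic, assuming a non-constant squarefree f given by its coefficient list (highest degree first); all function names are chosen here for illustration.

```python
from fractions import Fraction


def poly_rem(a, b):
    """Remainder of the Euclidean division of a by b (coefficients, highest degree first)."""
    a = a[:]
    while len(a) >= len(b):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]
        a.pop(0)  # the leading coefficient has been cancelled
    while a and a[0] == 0:  # strip leading zeros
        a.pop(0)
    return a


def derivative(f):
    n = len(f) - 1
    return [c * (n - i) for i, c in enumerate(f[:-1])]


def sturm_sequence(f):
    """f_0 = f, f_1 = f', f_{n+2} = -(f_n mod f_{n+1}), stopping at the last non-zero term."""
    f = [Fraction(c) for c in f]
    seq = [f, derivative(f)]
    while True:
        r = poly_rem(seq[-2], seq[-1])
        if not r:
            return seq
        seq.append([-c for c in r])


def evaluate(f, x):
    value = Fraction(0)
    for c in f:  # Horner's rule
        value = value * x + c
    return value


def sign_changes(seq, x):
    """V(x): number of sign changes in f_0(x), f_1(x), ..., ignoring zeros."""
    signs = [v > 0 for v in (evaluate(g, x) for g in seq) if v != 0]
    return sum(s != t for s, t in zip(signs, signs[1:]))


def count_real_roots(f, a, b):
    """Number of distinct real roots of the squarefree polynomial f in ]a, b]."""
    seq = sturm_sequence(f)
    return sign_changes(seq, a) - sign_changes(seq, b)
```

For instance, `count_real_roots([1, 0, -3, 1], -2, 2)` counts the real roots of X³ − 3X + 1 in ]−2, 2].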

10 If we choose a = −∞, b = +∞, this gives an algorithm to compute the number of real roots of f, by looking at the
signs of the leading coefficients of $f_0, f_1, \ldots$ since f(±∞) only depends on the leading coefficient of f (as long as it is
non-constant).

Exercise A.3.34† (Ehrenfeucht’s Criterion). Let K be a characteristic 0 field, let $f_1, \ldots, f_k \in K[X]$
be polynomials and define
\[
f = f_1(X_1) + \ldots + f_k(X_k) \in K[X_1, \ldots, X_k].
\]
If k ≥ 3, prove that f is irreducible. In addition, prove that this result still holds if k = 2 and $f_1$ and
$f_2$ have coprime degrees.

Solution

Let us first do the case k = 2. Suppose that f(X) + g(Y) is reducible, say equal to uv. Let
m = deg f and n = deg g. Consider $f(X^n) + g(Y^m)$, which is a polynomial of degree mn in
both X and Y. Let r and s be the homogeneous parts of $u(X^n, Y^m)$ and $v(X^n, Y^m)$, i.e. the
polynomials formed by the monomials of highest degree of $u(X^n, Y^m)$ and $v(X^n, Y^m)$. By looking
at the degrees, we must have $rs = aX^{mn} + bY^{mn}$ where a and b are the leading coefficients of f
and g respectively.

Suppose without loss of generality (by symmetry) that r has at least two monomials, i.e. u has
at least two monomials $X^{i_1}Y^{j_1}$ and $X^{i_2}Y^{j_2}$ such that
\[
ni_1 + mj_1 = ni_2 + mj_2 \iff n(i_1 - i_2) = m(j_2 - j_1).
\]
Since m and n are coprime, this implies $n \mid j_1 - j_2$ and $m \mid i_1 - i_2$. But then, $\deg_X u \geq m$ and
$\deg_Y u \geq n$, which implies that v is constant in both X and Y, i.e. constant, since f(X) + g(Y) =
uv. This is a contradiction.

Now suppose k ≥ 3 and f = uv. Let $n_i = \deg f_i$ and let $a_i$ be the leading coefficient of $f_i$. The
same argument as before shows that
\[
rs = a_1X_1^N + \ldots + a_kX_k^N,
\]
where $N = n_1 \cdot \ldots \cdot n_k$ (we replace $X_i$ by $X_i^{N/n_i}$ and take homogeneous parts). Thus, we have
reduced the problem to the case of monomials. We can however reduce it even further: if we
evaluate this at (X, Y, 1, 0, . . . , 0), we get that $aX^N + bY^N + c$ is reducible in K[X, Y] (the
factorisation we get is non-trivial since r and s have degree < N so still degree < N when we
evaluate them), say
\[
aX^N + bY^N + c = (g_MX^M + \ldots + g_0)(h_{N-M}X^{N-M} + \ldots + h_0)
\]
for some polynomials $g_i, h_i$ in Y of degree < N. Now, substitute for Y a root y of $bY^N + c$ (in an
algebraic closure of K). This gives us the polynomial $aX^N$ which can only be factored as a product of two monomials,
so
\[
g_0(y) = \ldots = g_{M-1}(y) = h_{N-M-1}(y) = \ldots = h_0(y) = 0.
\]
But since the roots of $bY^N + c$ are distinct (there is no common root with the derivative $NbY^{N-1}$),
$g_i$ and $h_j$ for i < M and j < N − M vanish at N distinct points, which is more than their degree.
Thus, they must be zero. This leaves us with the factorisation $aX^N + bY^N + c = g_Mh_{N-M}X^N$
which is clearly impossible since $X^N$ doesn’t divide the left-hand side.


Exercise A.3.35† (IMC 2007). Let $a_1, \ldots, a_n$ be integers. Suppose f : Z → Z is a function such that
\[
\sum_{i=1}^{n} f(ka_i + \ell) = 0
\]
for any k, ℓ ∈ Z. Prove that f is identically zero.



Solution
Consider the set I of polynomials $P = \sum_{i=0}^{m} b_i X^i \in \mathbb{Q}[X]$ such that
\[
\sum_{i=0}^{m} b_i f(i + x) = 0
\]
for any x ∈ Z. We claim that this set is an ideal of Q[X], meaning that it’s closed under
addition, and closed under multiplication by any polynomial in Q[X]. The first fact is clear. For
the second, note that multiplication by $X^i$ corresponds to a translation and that multiplication
by a constant is trivial, so we can deduce it from the first fact. Thus, I is closed under gcd: by
Bézout’s lemma, if P, Q ∈ I, there are u, v ∈ Q[X] such that
\[
\gcd(P, Q) = uP + vQ \in I.
\]
Our goal is to show that I contains the element 1: this gives f(x) = 0 for any x ∈ Z as wanted.
The statement gives us that
\[
\sum_{i=1}^{n} X^{ka_i + \ell} \in I
\]
for any k, ℓ such that $ka_i + \ell \geq 0$ for all i. Hence, the problem reduces to proving that these
polynomials are coprime, i.e., that for any algebraic number α, $\sum_{i=1}^{n} \alpha^{ka_i}$ can not always be
zero. This follows from our proof of Theorem C.4.1: this is a linear recurrence, and the only
linear recurrence which is identically zero is the zero recurrence. However, $\sum_{i=1}^{n} \alpha^{ka_i}$ is clearly
not the zero recurrence since the coefficient before $\alpha^{ka_i}$ is positive for every i.

Appendix B

Symmetric Polynomials

B.1 The Fundamental Theorem of Symmetric Polynomials


Exercise B.1.1. Let f ∈ K(X1 , . . . , Xn ) be a rational function, where K is a field. Suppose f is
symmetric, i.e. invariant under permutations of X1 , . . . , Xn . Prove that f = g/h for some symmetric
polynomials g, h ∈ K[X1 , . . . , Xn ].

Solution

Let r = f/g be a symmetric rational function. We write it as
\[
r = \frac{\prod_{\sigma \in S_n} f(\sigma(X_1, \ldots, X_n))}{g \prod_{\mathrm{id} \neq \sigma \in S_n} f(\sigma(X_1, \ldots, X_n))}.
\]
The numerator is a symmetric polynomial so the denominator must be too since the quotient is.


Exercise B.1.2∗ . Prove that, if e1 , . . . , en ∈ A[X1 , . . . , Xn ] are the elementary symmetric polynomials
in n variables and g ∈ A[X1 , . . . , Xm ] is a polynomial in m ≤ n variables, the degree of g(e1 , . . . , em )
is the weight w(g) of g.

Solution

Clearly, $\deg g(e_1, \ldots, e_m) \leq w(g)$. We must prove that there is some monomial occurring in
$g(e_1, \ldots, e_m)$ of degree w(g). Consider, among the monomials $X_1^{k_1} \cdots X_m^{k_m}$ of g of weight w(g),
the one which is minimal for the reversed lexicographic ordering (i.e. $(k_m, \ldots, k_1)$ is minimal for
the usual lexicographic ordering). Then,
\[
X_1^{k_1 + \ldots + k_m} X_2^{k_2 + \ldots + k_m} X_3^{k_3 + \ldots + k_m} \cdots X_m^{k_m}
\]
is a monomial of degree w(g) occurring exactly once in the expansion of $g(e_1, \ldots, e_m)$.




Exercise B.1.3. Prove that the decomposition of a symmetric polynomial f as g(e1 , . . . , en ) is unique.


Solution

This amounts to proving that $f(e_1, \ldots, e_n) = 0$ if and only if f = 0. We prove this by induction
on n (it is trivial when n = 1). Let f ≠ 0 be such a polynomial and suppose without loss of generality
that $X_n \nmid f$ (otherwise divide f by a suitable power of $X_n$: this does not change whether
$f(e_1, \ldots, e_n)$ vanishes, since $e_n \neq 0$). If we set $X_n = 0$ we get
\[
f(e_1, \ldots, e_{n-1}, 0) = 0
\]
where the $e_i$ are now the elementary symmetric polynomials in $X_1, \ldots, X_{n-1}$. By the inductive
hypothesis, this means that $f(X_1, \ldots, X_{n-1}, 0) = 0$, i.e. $X_n \mid f$, a contradiction.


Exercise B.1.4. Prove the following generalisation of the fundamental theorem of symmetric
polynomials: if
\[
f \in R[A_1^{(1)}, \ldots, A_1^{(m_1)}, \ldots, A_n^{(1)}, \ldots, A_n^{(m_n)}]
\]
is symmetric in $A_k^{(1)}, \ldots, A_k^{(m_k)}$ for every k ∈ [n], then
\[
f \in R[e_1^{A_1}, \ldots, e_{m_1}^{A_1}, \ldots, e_1^{A_n}, \ldots, e_{m_n}^{A_n}]
\]
where $e_i^{A_k}$ designates the ith elementary symmetric polynomial in $A_k^{(1)}, \ldots, A_k^{(m_k)}$.

Solution

We simply proceed by induction on n. When n = 1, this is the usual statement of the
fundamental theorem of symmetric polynomials. For the induction step, note that f is symmetric in
$A_k^{(1)}, \ldots, A_k^{(m_k)}$ for every k ∈ [n − 1], so that
\[
f \in R\left[A_n^{(1)}, \ldots, A_n^{(m_n)}\right]\left[e_1^{A_1}, \ldots, e_{m_1}^{A_1}, \ldots, e_1^{A_{n-1}}, \ldots, e_{m_{n-1}}^{A_{n-1}}\right].
\]
Now, note that f is symmetric as a polynomial in $A_n^{(1)}, \ldots, A_n^{(m_n)}$ with coefficients in
\[
R' = R\left[e_1^{A_1}, \ldots, e_{m_1}^{A_1}, \ldots, e_1^{A_{n-1}}, \ldots, e_{m_{n-1}}^{A_{n-1}}\right]
\]
by the previous step. Thus, the usual fundamental theorem of symmetric polynomials gives
\[
f \in R\left[e_1^{A_1}, \ldots, e_{m_1}^{A_1}, \ldots, e_1^{A_{n-1}}, \ldots, e_{m_{n-1}}^{A_{n-1}}\right]\left[e_1^{A_n}, \ldots, e_{m_n}^{A_n}\right]
\]
i.e.
\[
f \in R\left[e_1^{A_1}, \ldots, e_{m_1}^{A_1}, \ldots, e_1^{A_n}, \ldots, e_{m_n}^{A_n}\right]
\]
as wanted.


B.2 Newton’s Formulas


Exercise B.2.1∗ . Prove Corollary B.2.1.

Solution

We have K(p1 , . . . , pn ) ⊆ K(e1 , . . . , en ) by the fundamental theorem of symmetric polynomials,


and the reverse inclusion comes from the Newton formulas by induction, as explained before. (We
need the assumption that K is a field because the LHS of the Newton’s formulas has a factor of
k which we need to divide by in the inductive step, and we need K to have characteristic 0 so
that k 6= 0.)


B.3 The Fundamental Theorem of Algebra


Exercise B.3.1∗ . Prove Proposition B.3.2.

Solution

By the quadratic formula (or completing the square), solving quadratic equations is equivalent
to finding square roots. Thus, let a + bi ∈ C be a complex number, with a, b ∈ R. We wish to
find a square root x + iy of a + bi, i.e.
\[
x^2 - y^2 + 2ixy = (x + iy)^2 = a + bi.
\]
This means $x^2 - y^2 = a$ and 2xy = b. In particular, $x^2$ and $-y^2$ must be the roots of
$X^2 - aX - b^2/4$ by Vieta’s formulas. Since the constant coefficient is non-positive, the roots are real (e.g.
by the intermediate value theorem), and since their product is non-positive, one is non-negative and the other
non-positive. Label the former as $x^2$ and the latter as $-y^2$, take the square roots to find
x and y and adjust the sign to have 2xy = b.
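The recipe above can be written out as a short computation. This is a minimal floating-point sketch (the function name `complex_sqrt` is chosen here for illustration): the roots of X² − aX − b²/4 are (a ± √(a² + b²))/2.

```python
import math


def complex_sqrt(a: float, b: float):
    """Return real (x, y) with (x + iy)^2 = a + bi, via the roots of X^2 - aX - b^2/4."""
    root_disc = math.hypot(a, b)          # sqrt(a^2 + b^2), square root of the discriminant
    x_squared = (a + root_disc) / 2       # the non-negative root
    minus_y_squared = (a - root_disc) / 2  # the non-positive root
    x = math.sqrt(x_squared)
    y = math.sqrt(-minus_y_squared)
    if b < 0:
        y = -y  # adjust the sign so that 2xy = b
    return x, y
```

For example, `complex_sqrt(3, 4)` corresponds to a square root of 3 + 4i, and the result can be checked by squaring `complex(x, y)`.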


B.4 Exercises
Newton’s Formulas
Exercise B.4.2† (Hermite’s Theorem). Prove that a function $f : \mathbb{F}_p \to \mathbb{F}_p$ is a bijection if and only
if $\sum_{a \in \mathbb{F}_p} f(a)^k$ is 0 for k = 1, . . . , p − 2 and −1 for k = p − 1.

Solution

If f is a bijection, then this is Exercise A.3.12†. Now suppose that this condition holds. Newton’s
formulas (note k ≠ 0 for k < p so Corollary B.2.1 still holds) tell us that there is only one possible
value of $e_k(f(0), \ldots, f(p - 1))$ for any fixed k. Hence, we must have

ek (f (0), . . . , f (p − 1)) = ek (0, . . . , p − 1)

since 0, . . . , p − 1 satisfy the condition by Exercise A.3.9† (with K = Fp ). This implies that

(X − f (0)) · . . . · (X − f (p − 1)) = (X − 0) · . . . · (X − (p − 1))

so f is a bijection as wanted.
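For very small primes the criterion can be verified exhaustively over all maps from F_p to itself. The sketch below is an illustration only, with hypothetical function names; it is practical for p ≤ 5, since there are p^p maps to enumerate.

```python
from itertools import product


def satisfies_power_sums(values, p: int) -> bool:
    """Check that the sum of f(a)^k is 0 for 1 <= k <= p-2 and -1 for k = p-1 (mod p)."""
    for k in range(1, p):
        total = sum(pow(v, k, p) for v in values) % p
        expected = p - 1 if k == p - 1 else 0
        if total != expected:
            return False
    return True


def check_hermite(p: int) -> bool:
    """Compare the power-sum criterion with bijectivity for every map F_p -> F_p."""
    return all(
        satisfies_power_sums(values, p) == (len(set(values)) == p)
        for values in product(range(p), repeat=p)
    )
```

A map is encoded as the tuple (f(0), . . . , f(p − 1)), and it is a bijection exactly when its values are pairwise distinct.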


Exercise B.4.3†. Suppose that $\alpha_1, \ldots, \alpha_k$ are such that $\alpha_1^n + \ldots + \alpha_k^n$ is an algebraic integer for all
n ≥ 0. Prove that $\alpha_1, \ldots, \alpha_k$ are algebraic integers.

Solution

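Newton’s formulas give us $k!\, e_i(\alpha_1, \ldots, \alpha_k) \in \overline{\mathbb{Z}}$ for any i. Thus, $k!\alpha \in \overline{\mathbb{Z}}$ for any $\alpha = \alpha_i$, by
Exercise 1.5.21†. In particular, since the statement is also true when we replace the $\alpha_i$ by $\alpha_i^m$
for any fixed m, we get $k!\alpha^m \in \overline{\mathbb{Z}}$ for any m.

Thus, the problem reduces to showing that, if $\alpha \in \overline{\mathbb{Q}}$ is algebraic and such that $N\alpha^n \in \overline{\mathbb{Z}}$ (i.e.
powers of α have bounded denominator) for some non-zero N ∈ Z and any positive integer n,
then $\alpha \in \overline{\mathbb{Z}}$. For large n, the degree of $\alpha^{2^n}$ is constant, since the sequence
\[
[\mathbb{Q}(\alpha^{2^n}) : \mathbb{Q}] = \frac{[\mathbb{Q}(\alpha^{2^{n-1}}) : \mathbb{Q}]}{[\mathbb{Q}(\alpha^{2^{n-1}}) : \mathbb{Q}(\alpha^{2^n})]}
\]
is a weakly decreasing sequence of positive integers. By replacing α with $\alpha^{2^m}$ for some large m, we may
assume that this is true for any n ≥ 0. Let $\beta_1, \ldots, \beta_\ell$ be the conjugates of α. Consider the
minimal polynomial
\[
f_{2^k} = \prod_{i=1}^{\ell} \left( X - \beta_i^{2^k} \right)
\]
of $\alpha^{2^k}$ and let $N_k = 1/c(f_{2^k})$ be the smallest positive integer such that $N_k f_{2^k} \in \mathbb{Z}[X]$. By
assumption $N_k$ is bounded. However, we have
\[
\begin{aligned}
N_k^2 f_{2^{k+1}}(X^2) &= N_k^2 \prod_{i=1}^{\ell} \left( X^2 - \beta_i^{2^{k+1}} \right) \\
&= \left( N_k \prod_{i=1}^{\ell} \left( X - \beta_i^{2^k} \right) \right) \left( N_k \prod_{i=1}^{\ell} \left( X + \beta_i^{2^k} \right) \right) \\
&= \pm (N_k f_{2^k})(N_k f_{2^k}(-X))
\end{aligned}
\]
which is primitive by Gauss’ lemma 5.1.2. Hence, $N_{k+1} = N_k^2$ so $N_1$ must be 1 otherwise
$N_k = N_1^{2^{k-1}} \to \infty$. This means that the minimal polynomial of α has integral coefficients, i.e. α
is an algebraic integer.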


Remark B.4.1
It is necessary to mention that the key claim admits a very short and intuitive proof if we allow
ourselves some ideal theory. The idea is that, if α ∈ Q, we can simply look at the p-adic valuations
to get $n v_p(\alpha) + v_p(N) \geq 0$ for all n, which gives us $v_p(\alpha) \geq 0$ by taking n large enough. Hence, α is an integer.
For arbitrary algebraic numbers, the same proof works almost verbatim: a number field K is not
always a UFD but always has ideal factorisation. This means that we can this time consider
prime ideals $\mathfrak{p}$ of K to get $n v_{\mathfrak{p}}(\alpha) + v_{\mathfrak{p}}(N) \geq 0$ which implies $v_{\mathfrak{p}}(\alpha) \geq 0$ again. Finally, since this
is true for any prime ideal $\mathfrak{p}$, we get $\alpha \in \mathcal{O}_K$.

Algebraic Geometry
Exercise B.4.4† (Resultant). Let R be a commutative ring, and f, g ∈ R[X] be two polynomials of
respective degrees m and n. For any integer k ≥ 0, denote by Rk [X] the subset of R[X] consisting of
polynomials of degree less than k. The resultant Res(f, g) is defined as the determinant of the linear
map
(u, v) ↦ uf + vg

from Rn [X] × Rm [X] to Rm+n [X]. Prove that, if R is a UFD (see Definition 2.2.2), Res(f, g) = 0 if
and only if f and g have a common factor in R[X] (which is also a UFD by Gauss’s lemma 5.1.3).

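Then, show that if $f = \sum_i a_i X^i$ and $g = \sum_i b_i X^i$, we have¹
\[
\operatorname{Res}(f, g) =
\begin{vmatrix}
a_0 & 0 & \cdots & 0 & b_0 & 0 & \cdots & 0 \\
a_1 & a_0 & \cdots & 0 & b_1 & b_0 & \cdots & 0 \\
a_2 & a_1 & \ddots & 0 & b_2 & b_1 & \ddots & 0 \\
\vdots & \vdots & \ddots & a_0 & \vdots & \vdots & \ddots & b_0 \\
a_m & a_{m-1} & \cdots & \vdots & b_n & b_{n-1} & \cdots & \vdots \\
0 & a_m & \ddots & \vdots & 0 & b_n & \ddots & \vdots \\
\vdots & \vdots & \ddots & a_{m-1} & \vdots & \vdots & \ddots & b_{n-1} \\
0 & 0 & \cdots & a_m & 0 & 0 & \cdots & b_n
\end{vmatrix},
\]
and, if $f = a \prod_i (X - \alpha_i)$ and $g = b \prod_j (X - \beta_j)$, then²
\[
\operatorname{Res}(f, g) = a^m b^n \prod_{i,j} (\alpha_i - \beta_j).
\]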

In addition, prove that Res(f, g) ∈ (f R[X] + gR[X]).3 Finally, prove that if f, g ∈ Z[X] are monic and
uf +vg = 1 for some u, v ∈ Z[X], Res(f, g) = ±1. (It is not necessarily true that (f R[X]+gR[X])∩R =
Res(f, g)R.)

Solution

Suppose that R is a UFD. Then, Res(f, g) = 0 means that the kernel over the field of fractions
K of R of the linear map (u, v) ↦ uf + vg is non-trivial, i.e. there are non-zero u, v ∈ K[X] of
degree less than n and less than m respectively such that uf = −vg. If f and g had no common
factor, as K[X] is a UFD, we would have f | v which is impossible as v ≠ 0 and deg f > deg v.

The determinant form of the resultant simply follows from considering the matrix of the linear
function corresponding to the basis 1, X, . . . , X m+n−1 . To prove the explicit formula, consider
the case where A = a, B = b, αi = Ai and βj = Bj are indeterminates. Working over a field
K, the resultant vanishes when Ai = Bj for some i, j since the map is not surjective: it never
reaches 1. Thus, the resultant is divisible by Ai − Bj for all i, j. Looking at the determinant
formula, we see that the degree of Res(f, g) in A1 is n and its leading coefficient is Am B n , which
proves the wanted formula.

For the third part, write the equation uf + vg = r in the monomial basis as $MV = (r, 0, \ldots, 0) := re_1$,
where M is the matrix corresponding to the linear map (u, v) ↦ uf + vg. Hence, we wish to
have $rM^{-1}e_1 \in R^{m+n}$. Proposition C.3.7 tells us that r = det M = Res(f, g) works.

Finally, let f and g be monic polynomials with integer coefficients of respective degrees m and n,
and suppose that (f Z[X] + gZ[X]) ∩ Z = Z, i.e. uf + vg = 1 for some u, v ∈ Z[X]. Write f and g as
$\prod_{i=1}^{m} (X - \alpha_i)$ and $\prod_{j=1}^{n} (X - \beta_j)$.
We have $u(\beta_j)f(\beta_j) = 1$ for each j, so
\[
\prod_{j=1}^{n} f(\beta_j) = \pm \operatorname{Res}(f, g)
\]
divides 1 as desired.
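The determinant description can be experimented with directly. The following sketch (with hypothetical helper names, and coefficient lists given lowest degree first) builds the matrix of (u, v) ↦ uf + vg in the monomial basis and computes its determinant exactly by Gaussian elimination over the rationals.

```python
from fractions import Fraction


def sylvester_matrix(f, g):
    """Matrix of (u, v) -> u f + v g in the basis 1, X, ..., X^(m+n-1)."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    columns = []
    for j in range(n):  # coefficients of X^j * f
        columns.append([0] * j + list(f) + [0] * (size - j - m - 1))
    for j in range(m):  # coefficients of X^j * g
        columns.append([0] * j + list(g) + [0] * (size - j - n - 1))
    return [[Fraction(col[i]) for col in columns] for i in range(size)]


def determinant(matrix):
    """Exact determinant by Gaussian elimination with row swaps."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return det


def resultant(f, g):
    return determinant(sylvester_matrix(f, g))
```

For the monic polynomials f = (X − 1)(X − 2) and g = (X − 3)(X + 1), the absolute value of the resultant can be compared with |f(3) f(−1)|, and replacing g by a polynomial sharing a root with f should give 0.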


1 This is an (m + n) × (m + n) matrix, with n times the element $a_0$ and m times the element $b_0$.
2 In particular, the discriminant of f is $\frac{(-1)^{n(n-1)/2}}{a} \cdot \operatorname{Res}(f, f')$.
3 In other words, the resultant provides an explicit value of a possible constant in Bézout’s lemma for arbitrary rings
(such as Z).

Exercise B.4.6† (Elimination Theory). Let K be an algebraically closed field, let m, n ≥ 0 be
integers, and let $d_1, \ldots, d_m \geq 0$ be integers. Using Exercise B.4.4†, prove that there are homogeneous
polynomials $F_1, \ldots, F_k$ such that, for any homogeneous polynomials $f_1, \ldots, f_m \in K[X_1, \ldots, X_n]$ of
respective degrees $d_1, \ldots, d_m$ (we exceptionally treat 0 as a polynomial of degree 0 in this exercise),
$F_1, \ldots, F_k$ simultaneously vanish at the coefficients of $f_1, \ldots, f_m$ if and only if $f_1, \ldots, f_m$ have a
non-zero common root in $K^n$.⁴ ($F_1, \ldots, F_k$ are polynomials in the coefficients of $f_1, \ldots, f_m$.)

Solution

Without loss of generality, suppose that $d_1 = \ldots = d_m = d$, since if d is any integer greater than
$d_1, \ldots, d_m$, the polynomials $f_1, \ldots, f_m$ have a non-zero common root if and only if this is the
case of the family of polynomials $(X_j^{d - d_i} f_i)_{1 \leq i \leq m,\, 1 \leq j \leq n}$ of degree d.

We proceed by induction on the number n of variables. When n = 0 or d = 0, f1 , . . . , fm have


a common root if and only if they are all equal to zero, which obviously a polynomial condition
on the coefficients. Now, suppose that n, d ≥ 1 and let f1 , . . . , fm be polynomials of respective
degrees d1 , . . . , dm . We will eliminate one variable with the resultant. Consider the homogeneous
polynomial

g = ResXn (fm , U1 f1 + . . . + Um−1 fm−1 ) ∈ K[U1 , . . . , Um−1 ][X1 , . . . , Xn−1 ],

where U1 , . . . , Um−1 are formal variables, and where we consider both polynomials as polynomials
of degree d. Let g_1 , . . . , g_ℓ ∈ K[X1 , . . . , Xn−1 ] be the coefficients of g (as a polynomial in
U1 , . . . , Um−1 ). By Exercise B.4.4† , they have a common root (x1 , . . . , xn−1 ) ∈ K^{n−1} (this
corresponds to a root of g) if and only if the leading coefficient (in Xn ) of fm or U1 f1 + . . . + Um−1 fm−1
vanishes at (x1 , . . . , xn−1 ), or if there is some xn ∈ K such that (x1 , . . . , xn ) is a common root of
f1 , . . . , fm . Now, note that the first case can only happen if fm has degree less than d in Xn , or
if all of f1 , . . . , fm−1 have degree less than d in Xn , as they are homogeneous polynomials. We
will perform translations to make sure this doesn’t happen.

Note that the coefficient of $X_n^d$ in

\[ f_{i,c} = f_i(X_1 + c_1 X_n, \dots, X_{n-1} + c_{n-1} X_n, c_n X_n) \]

is f_i(c) where c = (c_1 , . . . , c_n ). In particular, if we consider a set S ⊆ K of cardinality 2d + 1,
we are guaranteed (by Exercise A.1.7∗ ) to find a c ∈ S^n such that at least two f_i(c) are non-zero,
unless all f_i are zero except possibly one (in which case they have a non-zero common root as
d ≥ 1). Thus, if we set, for every c ∈ S^n and i ∈ [m],

\[ g_{i,c} = \operatorname{Res}(f_{i,c},\ U_1 f_{1,c} + \dots + U_{i-1} f_{i-1,c} + U_{i+1} f_{i+1,c} + \dots + U_m f_{m,c}), \]

we know for sure that f1 , . . . , fm have a non-zero common root if all the g_{i,c} have one. Since the
converse is obvious, we conclude that f1 , . . . , fm have a non-zero common root if and only if, for
every c ∈ S^n and i ∈ [m], all the coefficients (as polynomials in U1 , . . . , Um ) of the g_{i,c} have a
non-zero common root. As the coefficients of g_{i,c} are in K[X1 , . . . , Xn−1 ] (and their coefficients
are polynomial in those of f1 , . . . , fm ), we conclude by the inductive hypothesis that this is indeed
a polynomial condition in the coefficients of f1 , . . . , fm .


Exercise B.4.7† (Hilbert’s Nullstellensatz). Let K be an algebraically closed field. Suppose that
f1 , . . . , fm ∈ K[X1 , . . . , Xn ] have no common zeros in K^n . Using Exercise B.4.4† , prove that there exist
polynomials g1 , . . . , gm such that
f1 g1 + . . . + fm gm = 1.
Deduce that, more generally, if f is a polynomial which is zero at all common roots of polynomials
f1 , . . . , fm (we no longer assume that they have no common roots), then there is an integer k
and polynomials g1 , . . . , gm such that

f^k = f1 g1 + . . . + fm gm .

4 Note that for m = n and d1 = . . . = dm = 1, this is what the determinant does. See also Remark C.3.1.

Solution

We proceed by induction on the number n of variables. When n = 1 this is just Bézout’s
lemma. Now, if n ≥ 2, we will eliminate one variable with the resultant, in the same way as in
Exercise B.4.6† . Consider the polynomial

g = Res_{Xn} (fm , U1 f1 + . . . + Um−1 fm−1 ) ∈ K[U1 , . . . , Um−1 ][X1 , . . . , Xn−1 ],

where U1 , . . . , Um−1 are formal variables. By Exercise B.4.4† , (x1 , . . . , xn−1 ) ∈ K^{n−1} is a root of
g if and only if there is some xn ∈ K such that (x1 , . . . , xn ) is a common root of f1 , . . . , fm , or if
the leading coefficient in Xn of fm or of U1 f1 + . . . + Um−1 fm−1 vanishes at (x1 , . . . , xn−1 ) (we
say (x1 , . . . , xn−1 ) is a common root at infinity). We wish to rule out the second case. This is not
very hard: perform the change of coordinates Xi → Xi + ci Xn for i = 1, . . . , n − 1 and some ci
to get constant leading coefficients in Xn (thus never vanishing).

Hence, g has no root by assumption since f1 , . . . , fm have no common root. However, a root of
g is simply a common root of its coefficients gi when g is seen as a polynomial in U1 , . . . , Um−1 .
This implies that a linear combination of the gi is 1, by the inductive hypothesis. Finally, notice
that
g = Res_{Xn} (fm , U1 f1 + . . . + Um−1 fm−1 ) = u fm + v(U1 f1 + . . . + Um−1 fm−1 )
for some u, v ∈ K[X1 , . . . , Xn ][U1 , . . . , Um−1 ], by Exercise B.4.4† . Hence, the coefficients gi of g
are linear combinations of the fi (with coefficients in K[X1 , . . . , Xn ]). We conclude that a linear
combination of the fi is 1 as wanted.

For the second part, suppose without loss of generality that f ≠ 0. Use the first part on
f1 , . . . , fm , 1 − Xn+1 f which have no common root by assumption (this is known as Rabinowitsch’s
trick). Thus, there are g1 , . . . , gm , g ∈ K[X1 , . . . , Xn+1 ] such that

g1 f1 + . . . + gm fm + g(1 − Xn+1 f ) = 1.

Now, evaluate this at Xn+1 = 1/f and multiply by a large enough power of f to get the wanted
equality.


Exercise B.4.8† (Weak Bézout’s Theorem). Prove that two coprime polynomials f, g ∈ K[X, Y ] of
respective degrees m and n have at most mn common roots in K. (Bézout’s theorem states that they
have exactly mn common roots counted with multiplicity, possibly at infinity.5 )

Solution

We can assume without loss of generality that K has as many elements as we want (by iteratively
adding new elements to K using Exercise 4.2.1∗ ).

We shall proceed as in Exercise B.4.7† . Consider the resultant h = ResY (f, g). This is a
polynomial of degree at most mn by its matrix expression of Exercise B.4.4† . By the same
exercise, if (x, y) is a common root of f, g, then x is a root of h. Thus, we would be done if
there was at most one possible value of y for each x, since h has degree at most mn and thus
has at most mn roots. Note that we already get that there are finitely many common roots
(although that’s already a consequence of Bézout’s lemma). Here is how we can achieve that: do
a change of coordinates X → X + c0 Y for some c0 chosen so that each x appears at most once as
a common root (x, y) of f and g: this is possible because the common roots in this new system
of coordinates are (α + c0 β, β) and there are finitely many c0 for which two distinct common
roots (α, β) and (α′, β′) get the same first coordinate, since

\[ \alpha + c_0 \beta = \alpha' + c_0 \beta' \iff c_0 = \frac{\alpha - \alpha'}{\beta' - \beta}. \]

5 This requires some care: we need to define the multiplicity of common roots as well as what infinity means. See
any introductory text to algebraic geometry, e.g. Shafarevich [39]. See also the appendix on projective geometry of
Silverman-Tate [42].


Exercise B.4.9† . Prove that n + 1 polynomials f1 , . . . , fn+1 ∈ K[X1 , . . . , Xn ] in n variables are
algebraically dependent, meaning that there is some non-zero polynomial f ∈ K[X1 , . . . , Xn+1 ] such
that
f (f1 , . . . , fn+1 ) = 0.

Solution

We present two solutions: one with linear algebra and one with resultants.
For the first solution, consider the linear system of equations in $(N + 1)^{n+1}$ variables

\[ \sum_{i_1, \dots, i_{n+1} \le N} a_{i_1, \dots, i_{n+1}} f_1^{i_1} \cdots f_{n+1}^{i_{n+1}} = 0. \tag{$*$} \]

We wish to find a non-trivial solution to this system. Let us count the number of equations we
have. Set M = max_i (deg f_i ). Then, the LHS of (∗) is a polynomial of degree at most (n + 1)M N , when
we consider the $a_{i_1, \dots, i_{n+1}}$ as formal variables. Hence, we have $(N + 1)^{n+1}$ unknowns and at most

\[ \sum_{k=0}^{n} ((n+1)MN)^k = \frac{((n+1)MN)^{n+1} - 1}{(n+1)MN - 1} \]

equations, one for each coefficient. For large N , $(N + 1)^{n+1} > \frac{((n+1)MN)^{n+1} - 1}{(n+1)MN - 1}$, which means
that there is a non-trivial solution as wanted (the kernel is non-trivial by e.g. the rank-nullity
theorem C.2.1, or Proposition C.1.2).
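
The comparison between the two counts can be explored numerically. The following is a minimal sketch (the function names are ours) which assumes the bound on the number of equations is the geometric sum in (n + 1)M N with n + 1 terms, as in the closed form above, and searches for the first N where the unknowns outnumber the equations.

```python
def unknowns(n, N):
    """Number of coefficients a_{i_1, ..., i_{n+1}} with all indices at most N."""
    return (N + 1) ** (n + 1)

def equations_bound(n, M, N):
    """Geometric-sum bound on the number of equations (assumed form of the count)."""
    x = (n + 1) * M * N
    return sum(x ** k for k in range(n + 1))

def smallest_N(n, M):
    """Least N >= 1 for which there are strictly more unknowns than equations."""
    N = 1
    while unknowns(n, N) <= equations_bound(n, M, N):
        N += 1
    return N

for n, M in [(1, 1), (1, 2), (2, 1)]:
    print(n, M, smallest_N(n, M))
```

The loop terminates because the number of unknowns grows like N^{n+1} while the bound grows like N^n.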
To make the idea of the second solution clearer, we treat the case n = 1 first. If f, g ∈ K[X]
are polynomials, the resultant h = ResX (f − S, g − T ) is a non-zero polynomial in S, T with
coefficients in K. Indeed, it is non-zero since when S − f and T − g are coprime it takes a
non-zero value (we can choose T = 0 and S ∈ K to be a large constant for instance). However,
when S = f and T = g, the polynomials f − S and g − T are not coprime anymore so h(f, g) = 0
as wanted.
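
As a worked instance of the case n = 1 (an illustration which is not in the original text), take f = X^2 and g = X^3. We use the convention that, for a monic polynomial A, Res(A, B) is the product of B over the roots of A; this may differ from the convention of Exercise B.4.4† by a sign, which does not affect the conclusion.

```latex
\[
  \operatorname{Res}_X(X^2 - S,\ X^3 - T)
  = \prod_{\alpha^2 = S} (\alpha^3 - T)
  = (S\sqrt{S} - T)(-S\sqrt{S} - T)
  = T^2 - S^3 .
\]
\[
  h(S, T) = T^2 - S^3 \neq 0, \qquad h(f, g) = (X^3)^2 - (X^2)^3 = 0 .
\]
```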
Now, we construct by backwards induction on k a polynomial with coefficients in K[X1 , . . . , Xk ]
vanishing at f1 , . . . , fn+1 . In other words, we eliminate one variable each time. Here is how we
do it: at first, fn,i = fi . Then, we define the polynomials

fk−1,i = ResXk (fk,k+1 − Tk,k+1 , fk,i − Tk,i )

for i = 1, . . . , k. At each step we get rid of Xk and introduce k + 1 new variables. Thus,
f0,1 ∈ K[{Ti,j | i ≤ j − 1}]. It is clear that it is zero when evaluated at Tn,i = fi for every i and
Tk,i constant for i ≤ k − 1 ≤ n − 2. Indeed, note that Res(A, B)(t) is not in general equal to
Res(A(t), B(t)), since A(t), B(t) do not have the same degree as A, B. If we consider constant
polynomials as polynomials of degree deg(fk,i − Tk,i ) > 0, then

ResXk (fk,k+1 − Tk,k+1 , fk,i − Tk,i ) = 0,

as can be seen from the matrix expression of Exercise B.4.4† . It remains to prove that there
is some choice of such T_{k,i} for which f_{0,1} is not the zero polynomial. This is easy to see: we
can choose T_{k,k+1} = 0 for all k and at each step we pick T_{k,i} so that f_{k,k+1} and f_{k,i} + T_{k,i} are
coprime. Indeed, if f_{k,k+1} has ℓ irreducible factors and we pick ℓ + 1 values of T_{k,i} , one of
them must work, as otherwise we would have

\[ \pi \mid (f_{k,i} + T_{k,i}) - (f_{k,i} + T'_{k,i}) = T_{k,i} - T'_{k,i} \]

for some irreducible π | f_{k,k+1} and distinct T_{k,i} , T′_{k,i} ∈ K by the pigeonhole principle. This is
impossible since it implies T_{k,i} = T′_{k,i} . There is still one slight technicality: we could have ℓ ≥ |K|.
However, we can simply add elements to K to get a sufficiently large K as in Exercise B.4.8† ,
and then consider the norm of the polynomial f we obtain (i.e. take the product over each of its
conjugates, exactly like we did in the solution of Exercise 1.5.21† ).


Exercise B.4.10† (Transcendence Bases). Let L/K be a field extension. Call a maximal set of K-
algebraically independent elements of L a transcendence basis. Prove that, if L/K has a transcendence
basis of cardinality n, then all transcendence bases have cardinality n. This n is called the transcendence
degree trdegK L. Finally, show that, if L = K(α1 , . . . , αn ) any maximal algebraically independent
subset of α1 , . . . , αn is a transcendence basis. (In particular trdegK L ≤ n.)

Solution

We prove a result analogous to Proposition C.1.2: if α1 , . . . , αm ∈ L are K-algebraically
independent and β1 , . . . , βn ∈ L are such that any element of L is algebraic over K(β1 , . . . , βn ), then
m ≤ n. Since transcendence bases satisfy both conditions, this shows that trdegK L is
well-defined. This is almost Exercise B.4.9† : any family of n + 1 elements algebraic over K(β1 , . . . , βn )
is algebraically dependent over K. The only difference is that, in our case, α1 , . . . , αm are not
necessarily in K(β1 , . . . , βn ). However, the first argument still works perfectly fine; the only
difference is that, if αi has degree di over K(β1 , . . . , βn ), we get (at most)

\[ \prod_{i=1}^{m} d_i \cdot \frac{(mMN)^{n+1} - 1}{mMN - 1} \]

equations this time, which is still less than $(N + 1)^m$ for large N if m > n.

For the second part, note that, by the same argument as Theorem 1.3.2 or by Chapter 6, any
element of K(α1 , . . . , αn ) is algebraic over K(S), where S ⊆ {α1 , . . . , αn } is a maximal subset of
K-algebraically independent elements.


Exercise B.4.11† . Let K be an algebraically closed field which is contained in another field L.
Suppose that f1 , . . . , fm ∈ K[X1 , . . . , Xn ] are polynomials with a common root in L. Prove that they
also have a common root in K.

Solution

We present two solutions, one based on Hilbert’s Nullstellensatz from Exercise B.4.7† and one
in characteristic 0 based on transcendence bases from Exercise B.4.10† . For the first solution, note
that, if f1 , . . . , fm have a common root in L, then there are no g1 , . . . , gm ∈ L[X1 , . . . , Xn ]
such that f1 g1 + . . . + fm gm = 1. In that case, there are no such gi in K[X1 , . . . , Xn ] either, so
f1 , . . . , fm also have a common root in K by Exercise B.4.7† .

We now present the second solution, which is perhaps more intuitive as it "lifts" (or "reduces"
in our case) the common root over L to a common root over K. Thus, suppose that char K = 0
and let α1 , . . . , αk be a K-transcendence basis for the field generated by K and the common
root. Then, let α be such that this field is equal to K(α1 , . . . , αk , α); there exists such an α by
the primitive element theorem 6.2.1. Let

(r1 (α1 , . . . , αk )(α), . . . , rn (α1 , . . . , αk )(α))

be the common root, with ri ∈ K(X). The equality

fi (r1 (α1 , . . . , αk )(α), . . . , rn (α1 , . . . , αk )(α)) = 0

is an equality modulo the minimal polynomial π(α1 , . . . , αk ) of α. Thus, if we replace αi by
ai ∈ K and α by a root a ∈ K of π(a1 , . . . , ak ), we get a common root in K. We just need to
check that the ri (a1 , . . . , ak )(a) are well-defined, i.e. their denominator is non-zero. This follows
from Exercise A.1.7∗ : the denominator is non-zero so it stays non-zero infinitely many times in
K^n . Note that ri (α) is not necessarily a polynomial, instead it is algebraic over K(α1 , . . . , αk ),
but by considering its norm (the product with its conjugates over K(α1 , . . . , αk )) we can get a
polynomial. Indeed, if the norm of ri (α) is non-zero then so is ri (α). (We also need to be careful
with the leading coefficient of π: if it vanishes α has too few conjugates and things can get weird,
but we can simply pick a1 , . . . , ak so that it doesn’t vanish either.)


Miscellaneous
Exercise B.4.12† (ISL 2020 Generalised). Let n ≥ 1 be an integer. Find the maximal N for which
there exists a monomial f ∈ Z[X1 , . . . , Xn ] of degree N which cannot be written as a sum

\[ \sum_{i=1}^{n} e_i f_i \]

with fi ∈ Z[X1 , . . . , Xn ].

Solution

The answer is N = n(n−1)/2. First, we prove that $X_2 X_3^2 \cdots X_n^{n-1}$ cannot be written in the desired
form. Suppose for the sake of contradiction that $X_2 X_3^2 \cdots X_n^{n-1} = \sum_i e_i f_i$ for some polynomials
f_i , which we suppose without loss of generality to be homogeneous of degree n(n−1)/2 − i (by
ignoring all other monomials). Then, we sum $\varepsilon(\sigma) X_{\sigma(2)}^{1} \cdots X_{\sigma(n)}^{n-1}$ over all permutations σ ∈ Sn
of [n], where ε denotes the signature (see Definition C.3.2). Since the e_i are symmetric, we have

\[ \sum_{\sigma \in S_n} \varepsilon(\sigma) X_{\sigma(2)}^{1} \cdots X_{\sigma(n)}^{n-1} = \sum_{i} e_i \sum_{\sigma \in S_n} \varepsilon(\sigma) f_i(X_{\sigma(1)}, \dots, X_{\sigma(n)}). \]

Here is the key point: if f has degree less than n(n−1)/2, then $\sum_{\sigma \in S_n} \varepsilon(\sigma) f(X_{\sigma(1)}, \dots, X_{\sigma(n)}) = 0$.
This is an obvious contradiction as the LHS is a sum of distinct monomials so is non-zero.

To prove this claim, suppose without loss of generality that f is a monomial $\prod_{i=1}^{n} X_i^{a_i}$. Since
$\sum_{i=1}^{n} a_i < \sum_{i=1}^{n} (i - 1)$, two a_i must be equal, say a_i = a_j . Denote by τ the transposition i ↔ j.
Then, by grouping permutations of [n] by orbits {σ, σ ◦ τ }, the sum is zero since

\[ f(X_{\sigma(1)}, \dots, X_{\sigma(n)}) = f(X_{\sigma \circ \tau(1)}, \dots, X_{\sigma \circ \tau(n)}) \]

but ε(σ ◦ τ ) = −ε(σ) by Exercise C.3.11∗ so the sum over each orbit cancels out.

It remains to prove that $X_1^{a_1} \cdots X_n^{a_n}$ works when a_1 + . . . + a_n > n(n−1)/2. We proceed by
induction on $a_1^2 + \dots + a_n^2$, with the following base case: when a_1 , . . . , a_n ≥ 1, the monomial is
divisible by e_n = X_1 · . . . · X_n .

Now suppose that a_1 + . . . + a_n > n(n−1)/2 and, without loss of generality, 0 = a_1 ≤ a_2 ≤ . . . ≤ a_n .
There must exist some k such that a_{k+1} ≥ a_k + 2, since otherwise a_k ≤ k − 1 for all k, contradicting
our initial assumption on the sum. Now consider

\[ X_1^{a_1} \cdots X_n^{a_n} - X_1^{a_1} \cdots X_{k-1}^{a_{k-1}} X_k^{a_k - 1} \cdots X_n^{a_n - 1} e_{n-k}. \]

We claim that the sum of the squares of the exponents in any monomial appearing in this
polynomial is less than $a_1^2 + \dots + a_n^2$, thus concluding the inductive step. To see this, express a
monomial of $X_1^{a_1} \cdots X_{k-1}^{a_{k-1}} X_k^{a_k - 1} \cdots X_n^{a_n - 1} e_{n-k}$ as

\[ X_1^{a_1 + b_1} \cdots X_{k-1}^{a_{k-1} + b_{k-1}} X_k^{a_k + b_k - 1} \cdots X_n^{a_n + b_n - 1} \]

for some b_i ∈ {0, 1} with b_1 + . . . + b_n = n − k. The wanted result then follows from the
convexity of the square function: if b_i = 1 for some i < k and b_j = 0 for some j ≥ k, then
$(a_i + 1)^2 + (a_j - 1)^2 < a_i^2 + a_j^2$. Iterating this process to "push" all the ones to the positions
greater than or equal to k, we get

\[ (a_1 + b_1)^2 + \dots + (a_{k-1} + b_{k-1})^2 + (a_k + b_k - 1)^2 + \dots + (a_n + b_n - 1)^2 \le a_1^2 + \dots + a_n^2 \]

with equality if and only if we already had equality in the beginning, i.e. if the monomial is
$X_1^{a_1} \cdots X_n^{a_n}$. However, we have ruled that case out by subtracting precisely this monomial, so
we are done.
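
The key vanishing claim of the first half of this solution can be checked by brute force for n = 3. The sketch below uses our own encoding (a monomial is its tuple of exponents, and indices start at 0): it computes the signed sum over all permutations and confirms that it is zero in degree less than 3, while the monomial with exponents (0, 1, 2) gives six distinct terms.

```python
from itertools import permutations, product

def signature(p):
    """Sign of a permutation of range(n), computed from its number of inversions."""
    inversions = sum(p[a] > p[b] for a in range(len(p)) for b in range(a + 1, len(p)))
    return -1 if inversions % 2 else 1

def antisymmetrize(exponents):
    """Signed sum over sigma of prod X_{sigma(i)}^{a_i}, as {exponent tuple: coefficient}."""
    n = len(exponents)
    result = {}
    for sigma in permutations(range(n)):
        moved = [0] * n
        for i, a in enumerate(exponents):
            moved[sigma[i]] = a  # the variable X_{sigma(i)} receives the exponent a_i
        key = tuple(moved)
        result[key] = result.get(key, 0) + signature(sigma)
    return {k: c for k, c in result.items() if c != 0}

n = 3
low_degree = [e for e in product(range(n), repeat=n) if sum(e) < n * (n - 1) // 2]
print(all(antisymmetrize(e) == {} for e in low_degree))
print(antisymmetrize((0, 1, 2)))
```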


Exercise B.4.13† (Lagrange). Given a rational function f ∈ K(X1 , . . . , Xn ), we denote by Gf the
set of permutations σ ∈ Sn such that
f (X1 , . . . , Xn ) = f (Xσ(1) , . . . , Xσ(n) ).
Let f, g ∈ K(X1 , . . . , Xn ) be two rational functions. If Gf ⊆ Gg , prove that there exists a rational
function r ∈ K[e1 , . . . , en ](X) such that
g = r ◦ f.

Solution

We present the proofs in Prasolov [31, Chapter 5, Section 1]. Partition Sn into disjoint cosets
Gf = h1 Gf , h2 Gf , . . . , hk Gf and write fi = hi f and gi = hi g for each i, where σf means
f (Xσ(1) , . . . , Xσ(n) ) (we say the group of permutations Sn acts on the field K(X1 , . . . , Xn )).
(This is where we use the assumption that Gf ⊆ Gg : it ensures that gi does not depend on the
choice of hi in its coset.)
For the first proof, notice that

\[ \sum_{i=1}^{k} \frac{g_i}{T - f_i} \]

is, by definition, symmetric in X1 , . . . , Xn . Since $\Omega = \prod_{i=1}^{k} (T - f_i)$ is as well, we get

\[ \sum_{i=1}^{k} \frac{g_i}{T - f_i} = \frac{F(T)}{\Omega(T)} \]

for some F ∈ K(e1 , . . . , en )[T ] by the fundamental theorem of symmetric polynomials. Notice
that $\Omega'(f) = \prod_{i=2}^{k} (f - f_i)$ is Ω/(T − f ) evaluated at T = f by Exercise 3.2.2∗ . Hence, we conclude
that

\[ \frac{F(f)}{\Omega'(f)} = \sum_{i=1}^{k} g_i \left( \frac{\Omega}{(T - f_i)\Omega'} \right)(f) = g \]

since $\frac{\Omega}{(T - f_i)\Omega'}$ vanishes at T = f when f_i ≠ f (and takes the value 1 there when f_i = f_1 = f ).

The second proof is perhaps more intuitive. We consider the system of equations

\[ \sum_{i=1}^{k} f_i^s g_i = T_s \qquad (s = 0, \dots, k - 1), \]

where the exponent represents powers and not iterates. Cramer’s rule from Exercise C.5.9† and
the Vandermonde determinant from Appendix C tell us that

\[ g = \frac{D}{\Delta} \]

where

\[ \Delta = \begin{vmatrix} 1 & \cdots & 1 \\ f_1 & \cdots & f_k \\ \vdots & \ddots & \vdots \\ f_1^{k-1} & \cdots & f_k^{k-1} \end{vmatrix} = \prod_{i<j} (f_j - f_i) \]

and

\[ D = \begin{vmatrix} T_0 & 1 & \cdots & 1 \\ T_1 & f_2 & \cdots & f_k \\ \vdots & \vdots & \ddots & \vdots \\ T_{k-1} & f_2^{k-1} & \cdots & f_k^{k-1} \end{vmatrix}. \]

Write this as $g = \frac{D\Delta}{\Delta^2}$. Notice that ∆^2 is symmetric, while D and ∆ both change sign when two
fi are switched, so D∆ is symmetric in f2 , . . . , fk . However, it is easy to see that, for any i,
ei (f2 , . . . , fk ) can be expressed polynomially in terms of f1 and ej (f1 , . . . , fk ). Hence, this $\frac{D\Delta}{\Delta^2}$
is a rational function in f with symmetric coefficients by the fundamental theorem of symmetric
polynomials.


Exercise B.4.14† (Iran Mathematical Olympiad 2012). Prove that there exists a polynomial f ∈
R[X0 , . . . , Xn−1 ] such that, for all a0 , . . . , an−1 ∈ R,

f (a0 , . . . , an−1 ) ≥ 0

is equivalent to the polynomial X n + an−1 X n−1 + . . . + a0 having only real roots, if and only if
n ∈ {1, 2, 3}.

Solution
If n ≤ 3, the discriminant satisfies the condition. Indeed, the discriminant of $f = \prod_i (X - \alpha_i)$ is
the square of $\prod_{i<j} (\alpha_i - \alpha_j)$ so is non-negative if all αi are real. It remains to prove that, for these
n, $\prod_{i<j} (\alpha_i - \alpha_j)$ is real if and only if all αi are (its square is real so it must be real or purely
imaginary). For n = 1, it is trivial since any polynomial of degree 1 with real coefficients splits
in R. For n = 2, if the roots of f are $\alpha \neq \overline{\alpha}$, then $\alpha - \overline{\alpha}$ is not real since complex conjugation
negates it. For n = 3, if the roots of f are $\alpha \neq \overline{\alpha}$ and β ∈ R, then complex conjugation also
negates
\[ (\alpha - \overline{\alpha})(\beta - \alpha)(\beta - \overline{\alpha}) \]
so it isn’t real as desired.

Now, if there exists such a polynomial for n ≥ 4, there exists one for n = 4 by setting g(a, b, c, d) =
f (a, b, c, d, 0, . . . , 0). Thus, it only remains to prove that there doesn’t exist such a polynomial
for n = 4. For this, consider the special polynomial f (0, b, 0, d) since we know precisely when
the roots of X 4 + bX 2 + d are real. For convenience, we shall in fact consider the polynomial

g(r, s) = f (0, −r − s, 0, rs) which is non-negative iff the roots of

X^4 − (r + s)X^2 + rs = (X^2 − r)(X^2 − s)

are all real. In other words, g(r, s) is non-negative if and only if r and s are. This implies that

\[ 0 \ge \lim_{s \to 0^-} g(r, s) = g(r, 0) = \lim_{s \to 0^+} g(r, s) \ge 0, \]

i.e. g(r, 0) = 0 for any non-negative r. But then, the polynomial g(R, 0) must be zero since it
has infinitely many roots, so g(r, 0) is also zero (and in particular non-negative) for negative r
which is a contradiction.

Appendix C

Linear Algebra

C.1 Vector Spaces


Exercise C.1.1∗ . Prove Proposition C.1.3.

Solution

Let u1 , . . . , uk be linearly independent elements of the n-dimensional vector space V . Proceed as
follows to complete them into a basis: as long as they do not generate everything, add one element
that is not generated (and is thus linearly independent from the previous ones). This process must
stop since n + 1 vectors are always linearly dependent by Proposition C.1.2.

For the second part, let u1 , . . . , uℓ be a generating family of elements and suppose without loss of
generality that u1 , . . . , um is a maximal subset of linearly independent elements. This is a basis,
since every other uk can be represented as a linear combination of them (and thus all of V too).
Indeed, since u1 , . . . , um , uk are linearly dependent for k > m, we have $a u_k + \sum_{i=1}^{m} a_i u_i = 0$ for
some a, ai ∈ K, not all zero. Since u1 , . . . , um are linearly independent, a must be non-zero and we get

\[ u_k = -\sum_{i=1}^{m} \frac{a_i}{a} u_i \]

as wanted.


Exercise C.1.2. Let V be a finite-dimensional vector space, and let W ⊆ V be a subspace of V .
Prove that dim V /W = dim V − dim W , where V /W is the group quotient from Exercise A.3.15† , i.e.
we identify vectors v ∈ V and v + w for w ∈ W . More formally, the elements of V /W are the sets
v + W for v ∈ V .

Solution

Complete a basis (w1 , . . . , wm ) of W to a basis (v1 , . . . , vn ) of V . Then, (vm+1 , . . . , vn ) is a basis
of V /W . (Or (vm+1 + W, . . . , vn + W ) if we wish to be very formal. The class v + W of v is
usually written $\overline{v}$ or $\dot{v}$.)


C.2 Linear Maps and Matrices


         
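Exercise C.2.1∗ . Prove that $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$ but $\begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$.

Solution

We have
\[ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 \cdot 1 + 0 \cdot 0 & 1 \cdot 1 + 0 \cdot 0 \\ 0 \cdot 1 + 0 \cdot 0 & 0 \cdot 1 + 0 \cdot 0 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \]
and
\[ \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 \cdot 1 + 1 \cdot 0 & 1 \cdot 0 + 1 \cdot 0 \\ 0 \cdot 1 + 0 \cdot 0 & 0 \cdot 0 + 0 \cdot 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. \]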


Exercise C.2.2∗ . Prove that matrix multiplication is distributive over matrix addition, i.e. A(B +
C) = AB + AC and (A + B)C = AC + BC for any A, B, C of compatible dimensions.

Solution

Let A = (ai,j ), B = (bi,j ) and C = (ci,j ). Then, the (i, j) coordinate of A(B + C) is
X
ai,k (bk,j + ck,j )
k
P P
which is equal to k ai,k bk,j + k ai,k ck,j , i.e. the (i, j) coordinate of AB + AC. The right-
distributivity is completely analogous by symmetry of the left and right multiplication.


Exercise C.2.3. Define the rank of a linear map ϕ as the dimension of its image. Prove that
rank(ϕ + ψ) ≤ rank ϕ + rank ψ and rank(ϕ ◦ ψ) ≤ min(rank ϕ, rank ψ) for any ψ, ϕ.

Solution

The first inequality follows from im(ϕ + ψ) ⊆ im(ϕ) + im(ψ), and the second from im(ϕ ◦ ψ) =
ϕ(im ψ) and im(ϕ ◦ ψ) ⊆ im ϕ.


C.3 Determinants
Exercise C.3.1. Prove that an m × n matrix can only have a right-inverse if m ≤ n, and only a
left-inverse if m ≥ n. When does such an inverse exist?

Solution

By symmetry, it suffices to consider right-inverses. Suppose that a matrix A has dimensions
m × n and is right-invertible. Consider the linear map from K^{n×m} to K^{m×m} defined
by B ↦ AB. It is surjective: if A′ is a right-inverse of A, then any C ∈ K^{m×m} is the image of A′C.
By surjectivity, mn ≥ m^2 , i.e. n ≥ m as wanted.

For the converse, we shall refine a bit our original argument. Note that each column of AB is
a linear combination of the columns of A: if A^1 , . . . , A^n are the columns of A and
B = (b_{i,j} ), then
\[ \sum_{i} b_{i,k} A^i \]
is the kth column of AB. Hence, A has a right-inverse if and only if its columns span K^m
(which forces n ≥ m).


Exercise C.3.2∗ . Prove that (AB)T = B T AT for any n × n matrices A, B.

Solution
Let A = (a_{i,j} ) and B = (b_{i,j} ). The (i, j) coordinate of AB is $\sum_k a_{i,k} b_{k,j}$ so the (i, j) coordinate
of its transpose is
\[ \sum_{k} a_{j,k} b_{k,i} = \sum_{k} b_{k,i} a_{j,k} \]
which is also the (i, j) coordinate of B^T A^T .




Exercise C.3.3. Prove this identity.

Solution

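We have
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \begin{pmatrix} ad + b(-c) & a(-b) + ba \\ cd + d(-c) & c(-b) + da \end{pmatrix} = \begin{pmatrix} ad - bc & 0 \\ 0 & ad - bc \end{pmatrix} = (ad - bc) I_2 . \]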

Exercise C.3.4∗ . Prove that det In = 1.

Solution

As always, by induction. From the definition of the determinant, we have det In = 1 · det In−1 +
0 · . . . = det In−1 and det I1 = 1.


Exercise C.3.5∗ . Prove that the determinant of a matrix with a zero column is zero.

Solution

Suppose that the kth column M^k of M is zero. By Proposition C.3.1 (with t = 0), we have

\[ \det M = \det{}^{k}_{0}(M) = 0 \cdot \det{}^{k}_{0}(M) = 0. \]

Exercise C.3.6. Prove that the determinant of a non-invertible matrix is 0.



Solution
Suppose that the columns of M are linearly dependent, i.e. $\sum_i a_i M^i = 0$ for some ai ∈ K with
ak ≠ 0. Then,
\[ 0 = \det{}^{k}_{0}(M) = \sum_{i} a_i \det{}^{k}_{M^i}(M) \]
by Exercise C.3.5∗ and Proposition C.3.1. Now, by Proposition C.3.3, all the determinants vanish
except the one with i = k. Thus, we get 0 = ak det M which means det M = 0 as wanted.


Exercise C.3.7∗ . Prove that an upper triangular matrix is invertible if and only if its determinant
is non-zero, i.e. if the elements on its diagonal are non-zero.

Solution

Let αi denote the ith element on the diagonal (i.e. the (i, i) coordinate). First, suppose that
αi ≠ 0 for every i and that
\[ \sum_{i=1}^{n} a_i M^i = 0 \]
for some ai ∈ K not all zero. Consider the least k such that ak ≠ 0 and let α denote the (k, k)
coordinate of M . The kth coordinate of $\sum_{i=1}^{n} a_i M^i$ is $\sum_{i=1}^{k} a_i \alpha_i = a_k \alpha_k$ since ai = 0 for
i < k. This means that ak = 0 which contradicts our assumption.

For the converse, let k be such that αk = 0. Then, the columns M^k , M^{k+1} , . . . , M^n all have the
top k coordinates zero. Thus, we can view them as vectors with n − k coordinates. We have
n − k + 1 vectors in a space of dimension n − k so they must be linearly dependent.


Exercise C.3.8. Prove that $\overline{\mathbb{Z}}$ is integrally closed, meaning that, if f is a monic polynomial with
algebraic integer coefficients, then any of its roots is also an algebraic integer. (This is also Exercise 1.5.21† .)

Solution

Let f = X^n + αn−1 X^{n−1} + . . . + α0 be a monic polynomial with algebraic integer coefficients
and let β be one of its roots. Then,

M = Z[αn−1 , . . . , α0 , β]

is a finitely generated Z-module such that βM ⊆ M , so β is an algebraic integer.




Exercise C.3.9∗ . Prove Lemma C.3.1.



Solution

By induction:
\begin{align*}
\det A &= \sum_{i=1}^{n} (-1)^{i-1} a_{i,1} \det A_{i,1} \\
&= \sum_{i=1}^{n} (-1)^{i-1} a_{i,1} \sum_{\sigma} \varepsilon(\sigma) a_{\sigma(2),2} \cdots a_{\sigma(n),n} \\
&= \sum_{i=1}^{n} \sum_{\sigma} (-1)^{i-1} \varepsilon(\sigma) a_{i,1} a_{\sigma(2),2} \cdots a_{\sigma(n),n}
\end{align*}
where the inner sum is over the bijections σ : [n] \ {1} → [n] \ {i}. This has the desired form. Finally,
note that the sign of a_{1,1} · . . . · a_{n,n} is (−1)^{1−1} times the sign of a_{2,2} · . . . · a_{n,n} which is 1 as wanted.


Exercise C.3.10∗ . Prove that the number of derangements of [m] is

m
X (−1)i m!
i=0
i!

and that this number is odd if m is even and even if m is odd.

Solution

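We shall instead count the number of permutations with at least one fixed point. Let Sk be the
set of permutations fixing k. Then, by the inclusion-exclusion principle, the number of
permutations with at least one fixed point is
\begin{align*}
|S_1 \cup \dots \cup S_m| &= \sum_{k=1}^{m} (-1)^{k-1} \sum_{i_1 < \dots < i_k} |S_{i_1} \cap \dots \cap S_{i_k}| \\
&= \sum_{k=1}^{m} (-1)^{k-1} \sum_{i_1 < \dots < i_k} (m - k)! \\
&= \sum_{k=1}^{m} (-1)^{k-1} \binom{m}{k} (m - k)! \\
&= \sum_{k=1}^{m} \frac{(-1)^{k-1} m!}{k!}.
\end{align*}
If we subtract this from m! (the total number of permutations), we get exactly the wanted formula.

For the parity, note that m!/i! is divisible by m(m − 1), hence even, for i ≤ m − 2. Thus, modulo 2,
only the terms i = m − 1 and i = m remain, and the number of derangements is congruent to
m + 1: it is odd if m is even and even if m is odd.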
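
Both the formula and the parity statement are easy to test for small m. The following sketch (function names are ours) compares the alternating sum with a brute-force count over all permutations of {0, . . . , m − 1}.

```python
from itertools import permutations
from math import factorial

def derangements_formula(m):
    """Alternating sum of m!/i! for i = 0, ..., m."""
    return sum((-1) ** i * (factorial(m) // factorial(i)) for i in range(m + 1))

def derangements_brute_force(m):
    """Count the permutations of range(m) without any fixed point."""
    return sum(all(p[i] != i for i in range(m)) for p in permutations(range(m)))

for m in range(7):
    d = derangements_formula(m)
    print(m, d, derangements_brute_force(m), "odd" if d % 2 else "even")
```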


Exercise C.3.11∗ . Prove that the signature is negated when one exchanges two values of σ (i.e.
compose a transposition with σ).

Solution

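Say that we exchange σ(i) with σ(j), i.e. we apply the transposition τ = τ_{σ(i),σ(j)} . We have
\begin{align*}
\varepsilon(\sigma)/\varepsilon(\tau \circ \sigma)
&= \frac{\dfrac{\sigma(i) - \sigma(j)}{j - i} \cdot \prod_{k \neq i, j} \dfrac{\sigma(k) - \sigma(j)}{k - i} \cdot \prod_{k \neq i, j} \dfrac{\sigma(k) - \sigma(i)}{k - j}}{\dfrac{\sigma(j) - \sigma(i)}{j - i} \cdot \prod_{k \neq i, j} \dfrac{\sigma(k) - \sigma(i)}{k - i} \cdot \prod_{k \neq i, j} \dfrac{\sigma(k) - \sigma(j)}{k - j}} \\
&= \frac{\dfrac{\sigma(i) - \sigma(j)}{j - i}}{\dfrac{\sigma(j) - \sigma(i)}{j - i}} \\
&= -1
\end{align*}
as wanted.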
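
This sign change can also be confirmed exhaustively for small n. The sketch below assumes the signature is computed as the parity of the number of inversions (which agrees with the product formula used above); `signature` and `swap_values` are names of our own choosing.

```python
from itertools import permutations

def signature(p):
    """Sign of a permutation of range(n), computed from its number of inversions."""
    inversions = sum(p[a] > p[b] for a in range(len(p)) for b in range(a + 1, len(p)))
    return -1 if inversions % 2 else 1

def swap_values(p, i, j):
    """Exchange the values p[i] and p[j], i.e. compose a transposition with p."""
    q = list(p)
    q[i], q[j] = q[j], q[i]
    return tuple(q)

n = 4
negated = all(
    signature(swap_values(p, i, j)) == -signature(p)
    for p in permutations(range(n))
    for i in range(n)
    for j in range(i + 1, n)
)
print(negated)
```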


Exercise C.3.12∗ . Prove that transpositions τi,j : i ↔ j and k 7→ k for k 6= i, j generate all
permutations (through composition).

Solution

By induction: we start with τ_{n,σ^{−1}(n)} so that σ ◦ τ_{n,σ^{−1}(n)} (n) = n. Then, ignoring the last
element of σ ◦ τ_{n,σ^{−1}(n)} , it is a permutation of [n − 1] so a composition of transpositions. We get
the wanted result by applying $\tau_{n,\sigma^{-1}(n)}^{-1} = \tau_{n,\sigma^{-1}(n)}$ to both sides.

Exercise C.3.13∗ . Prove Theorem C.3.3.

Solution

We proceed as in Exercise C.3.9∗ : we have

\[ \det(a_{i,j}) = \sum_{i=1}^{n} \sum_{\sigma} (-1)^{i-1} \varepsilon(\sigma) a_{i,1} a_{\sigma(2),2} \cdots a_{\sigma(n),n} \]

where the inner sum is over the bijections σ : [n] \ {1} → [n] \ {i}. It remains to prove that ε(σ) =
(−1)^{σ(1)−1} ε(σ′ ) where σ′ denotes the bijection [n] \ {1} → [n] \ {σ(1)} obtained by forgetting
the first element. This is easy: we want to count the number of inversions of σ which are not
inversions of σ′ . This is exactly the number of inversions (k, 1) since 1 is the only difference
between σ and σ′ . Since k > 1 for each k ≠ 1, the number of such inversions is the number of
k such that σ(k) < σ(1), i.e. σ(1) − 1 as wanted.


Exercise C.3.14∗ . Prove that det A = det AT for any square matrix A.

Solution

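Our formula from Theorem C.3.3 is symmetric in rows and columns:
\begin{align*}
\det A &= \sum_{\sigma \in S_n} \varepsilon(\sigma) a_{\sigma(1),1} \cdots a_{\sigma(n),n} \\
&= \sum_{\sigma \in S_n} \varepsilon(\sigma^{-1}) a_{1,\sigma^{-1}(1)} \cdots a_{n,\sigma^{-1}(n)} \\
&= \sum_{\sigma \in S_n} \varepsilon(\sigma) a_{1,\sigma(1)} \cdots a_{n,\sigma(n)} \\
&= \det A^T .
\end{align*}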

Exercise C.3.15∗ . Prove that Dk (In−1 ) = (−1)k−1 .

Solution

Consider the n × n matrix I that we used to define Dk , and substitute A = In−1 . We will exchange
k − 1 columns to transform it into In , thus getting a determinant of (−1)^{k−1} det In = (−1)^{k−1}
as wanted.

Note that I is already almost equal to In : its first column should be the kth one and the 2nd to kth
ones should be shifted to the left. Here is how we do this shift using transpositions.

We first exchange the first column of I with the second one so that a1,1 = 1 becomes the (1, 1)
coordinate of I, then we exchange the (new) second column with the third one so that a2,2 = 1
becomes the (2, 2) coordinate of I, etc., until we exchange the (k − 1)th column with the kth one so
that ak,k = 1 becomes the (k, k) coordinate of I. Thus, the (k − 1)th column, which was originally
the first one, becomes the kth one as wanted.


Exercise C.3.16. Prove that the determinant is multiplicative by using the explicit formula of
Theorem C.3.3.

Solution

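Write C = AB, so that

\[ C^k = b_{1,k} A^1 + \dots + b_{n,k} A^n . \]

Then, by multilinearity of the determinant,
\begin{align*}
\det C &= \det(b_{1,1} A^1 + \dots + b_{n,1} A^n , \dots, b_{1,n} A^1 + \dots + b_{n,n} A^n) \\
&= \sum_{\sigma \in S_n} b_{\sigma(1),1} \cdots b_{\sigma(n),n} \det(A^{\sigma(1)}, \dots, A^{\sigma(n)}) \\
&= \sum_{\sigma \in S_n} \varepsilon(\sigma) b_{\sigma(1),1} \cdots b_{\sigma(n),n} \det(A^1, \dots, A^n) \\
&= \det B \det A
\end{align*}
(the terms where two columns are equal vanish, which is why only permutations remain),
where the second-to-last equality comes from Remark C.3.5.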
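
As a sanity check of the multiplicativity, here is a short sketch that evaluates the explicit formula of Theorem C.3.3 literally (a sum over all permutations, so only suitable for tiny matrices). The function names and the two test matrices are our own.

```python
from itertools import permutations

def signature(p):
    """Sign of a permutation of range(n), computed from its number of inversions."""
    inversions = sum(p[a] > p[b] for a in range(len(p)) for b in range(a + 1, len(p)))
    return -1 if inversions % 2 else 1

def det(a):
    """Sum over sigma of sign(sigma) * a[sigma(0)][0] * ... * a[sigma(n-1)][n-1]."""
    n = len(a)
    total = 0
    for sigma in permutations(range(n)):
        term = signature(sigma)
        for col in range(n):
            term *= a[sigma[col]][col]
        total += term
    return total

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

A = [[1, 2, 0], [3, 4, 1], [0, 1, 2]]
B = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]
print(det(A), det(B), det(matmul(A, B)))
```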



Exercise C.3.17. Let $M = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}$ be a block-triangular matrix (meaning that A ∈ K^{m×m} for
some m, C ∈ K^{(n−m)×(n−m)} , B ∈ K^{m×(n−m)} and we consider 0 as the zero matrix in K^{(n−m)×m} ).
Prove that det M = det A det C.

Solution

We proceed by induction on the size m of A. When A has size 0, we have M = C and the
result is obvious (det A = 1). For the inductive step, let (a1,1 , . . . , am,1 ) be the first column of
A. If we expand det M with respect to its first column, we get

\[ \det M = \sum_{i=1}^{m} a_{i,1} (-1)^{i-1} \begin{vmatrix} A_{i,1} & B_{i,1} \\ 0 & C \end{vmatrix}. \]

By the inductive hypothesis, this is precisely

\[ \sum_{i=1}^{m} a_{i,1} (-1)^{i-1} \det A_{i,1} \det C = \det A \det C. \]

Exercise C.3.18. Let L/K be a finite extension. Prove that the determinant of the K-linear map
L → L defined by x 7→ xα is the norm of α defined in Definition 6.2.3.

Solution

We first treat the case where L = K(α). Consider the basis $1, \alpha, \ldots, \alpha^{n-1}$ of L and let
$f = X^n + a_{n-1} X^{n-1} + \ldots + a_0$ be the minimal polynomial of α. The matrix corresponding to x ↦ xα
in this basis is

\[
\begin{pmatrix}
0 & 0 & 0 & \cdots & -a_0 \\
1 & 0 & 0 & \cdots & -a_1 \\
0 & 1 & 0 & \cdots & -a_2 \\
0 & 0 & 1 & \cdots & -a_3 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & -a_{n-1}
\end{pmatrix}.
\]

Using Theorem C.3.3, we see that its determinant is $-\varepsilon(\sigma) a_0$, where σ is the cycle (1, 2, . . . , n). Hence, we
need to prove that $\varepsilon(\sigma) = (-1)^{n-1}$ since the product of the conjugates of α is $(-1)^n a_0$ by Vieta’s
formulas. This follows from the fact that

\[ \sigma = (1, n) \cdots (1, 3)(1, 2) \]

is a product of n − 1 transpositions.

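Now here is what we can do for the general case. Write $N_{L/K}(\alpha)$ for the determinant of x ↦ αx. Note that this quantity is multiplicative in α. Indeed,
for any linear maps ϕ and ψ,

\[ \det \varphi \cdot \det \psi = \det(\varphi \circ \psi) \]

since the determinant is multiplicative. Since the composite of x ↦ αx and x ↦ βx is x ↦ αβx,
the norm of αβ is the norm of α times the norm of β. Then, pick a primitive element γ of L/K
such that α/γ is also a primitive element. We can do this since

\[ \frac{\sigma(\alpha)}{\sigma(\gamma)} = \frac{\alpha}{\gamma} \iff \sigma(\gamma) = \gamma \cdot \frac{\sigma(\alpha)}{\alpha} \]

is false for γ = aδ + b for well-chosen a, b ∈ K and a fixed primitive element δ. Thus, from the
first observation and the first case applied to γ and α/γ, we get

\begin{align*}
N_{L/K}(\alpha) &= N_{L/K}(\gamma) N_{L/K}(\alpha/\gamma) \\
&= \prod_{\sigma \in \mathrm{Emb}_K(L)} \sigma(\gamma) \prod_{\sigma \in \mathrm{Emb}_K(L)} \sigma(\alpha/\gamma) \\
&= \prod_{\sigma \in \mathrm{Emb}_K(L)} \sigma(\alpha)
\end{align*}

as wanted.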
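The first case of the solution can be illustrated on a concrete example. Below is a minimal Python sketch (an assumption-laden illustration, not the author's code): `companion` is a hypothetical helper building the matrix of x ↦ xα in the basis 1, α, . . . , α^(n−1), and we check that its determinant is (−1)^n a_0, here for X^3 − 2, which we assume to be the minimal polynomial of α over Q.

```python
from itertools import permutations

def det(m):
    """Determinant as a signed sum over permutations (small matrices only)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= m[i][perm[i]]
        total += term
    return total

def companion(coeffs):
    """Matrix of x -> x*alpha in the basis 1, alpha, ..., alpha^(n-1), where alpha is a
    root of X^n + a_{n-1} X^{n-1} + ... + a_0 and coeffs = [a_0, ..., a_{n-1}]."""
    n = len(coeffs)
    rows = [[0] * n for _ in range(n)]
    for i in range(1, n):
        rows[i][i - 1] = 1           # alpha^(i-1) is sent to alpha^i
    for i in range(n):
        rows[i][n - 1] = -coeffs[i]  # alpha^(n-1) is sent to alpha^n, rewritten using f(alpha) = 0
    return rows

coeffs = [-2, 0, 0]  # f = X^3 - 2
n = len(coeffs)
print(det(companion(coeffs)), (-1) ** n * coeffs[0])  # both equal 2
```

The value 2 is indeed the product of the three conjugates of the cube root of 2.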


Exercise C.3.19∗ . Prove that (adj A)A = (det A)In .

Solution

Set $(b_{i,j}) = (\operatorname{adj} A) A$. This time, we have

\[ b_{i,j} = \sum_{k=1}^{n} (-1)^{i+k} \det(A_{k,i}) a_{k,j}. \]

When i = j this is the expansion of the determinant of A with respect to its ith column, which is det A, and when i ≠ j
this is the expansion, with respect to the ith column, of the determinant of the matrix obtained by replacing the ith column of A
by its jth column. This matrix has two identical columns so its determinant is zero as wanted.


C.4 Linear Recurrences


Exercise C.4.1. Prove that any rational function h of negative degree has a partial fractions decomposition
and deduce another proof of Theorem C.4.1.

Solution

Let $g = \prod_{i=1}^{r} (X - \alpha_i)^{m_i}$ be the denominator of h, and let n denote its degree. We wish to prove
that linear combinations of

\[ \frac{1}{(X - \alpha_i)^{k_i}} \]

with $1 \le k_i \le m_i$ span all of V = {f /g | deg f < n}. Since V has dimension n and we are looking
at the span of n elements of V , we just need to prove that they are linearly independent. So far
the proof is the same as before, but here is where this changes: suppose that

\[ \sum_{i,k_i} \frac{a_{i,k_i}}{(X - \alpha_i)^{k_i}} = 0, \]

i.e.

\[ \sum_{i,k_i} a_{i,k_i} (X - \alpha_i)^{m_i - k_i} \prod_{j \ne i} (X - \alpha_j)^{m_j} = 0. \]

By evaluating this at $X = \alpha_i$, we deduce that $a_{i,m_i} = 0$ for every i. Now we differentiate
the equality. Since $a_{i,m_i}$ is equal to 0, when we evaluate our new equality at $X = \alpha_i$, we get
$a_{i,m_i-1} = 0$ this time. Continuing in this manner (differentiating k times to get $a_{i,m_i-k} = 0$), we
conclude that all $a_{i,k_i}$ are equal to 0, i.e. our elements of V are indeed linearly independent.

Exercise C.4.2. Prove that Theorem C.4.1 holds in a field K of characteristic p ≠ 0 as long as the
multiplicities of the roots of the characteristic polynomial are at most p. In particular, for a fixed
characteristic equation, it holds for sufficiently large p.

Solution

The only thing to check is that if $f_i(n) = 0$ for all n ∈ Z and $f_i$ has degree less than the multiplicity of $\alpha_i$,
hence less than p, then $f_i = 0$. This is true because the image of Z in a field of characteristic
p ≠ 0 has exactly p elements, so $f_i$ has p roots, which is more than its degree, and is thus zero.


C.5 Exercises
Vector Spaces, Bases and Matrices
Exercise C.5.1† (Grassmann’s Formula). Let U be a vector space and V, W be two finite-dimensional
subspaces of U . Prove that
dim(V + W ) = dim V + dim W − dim(V ∩ W ).

Solution

Let $u_1, \ldots, u_k$ be a basis of V ∩ W . Complete it to a basis $u_1, \ldots, u_k, v_1, \ldots, v_m$ of V and a basis
$u_1, \ldots, u_k, w_1, \ldots, w_n$ of W . We claim that

\[ u_1, \ldots, u_k, v_1, \ldots, v_m, w_1, \ldots, w_n \]

is a basis of V + W . It clearly spans all of V + W so it remains to check that it’s linearly
independent. If

\[ \sum_{i=1}^{k} a_i u_i + \sum_{i=1}^{m} b_i v_i + \sum_{i=1}^{n} c_i w_i = 0, \]

then $\sum_{i=1}^{m} b_i v_i$ is both in V and W so is in V ∩ W . This means that it’s a linear combination
of the $u_i$, but by construction this implies $b_1 = \ldots = b_m = 0$ since the $u_i$ and $v_i$ are linearly
independent (they form a basis of V together). By symmetry, $c_1 = \ldots = c_n = 0$. Finally, this
forces $a_1 = \ldots = a_k = 0$ too.

We conclude that

\[ \dim(V + W) = k + m + n = (k + m) + (k + n) - k = \dim V + \dim W - \dim(V \cap W). \]

Remark C.5.1
Like the rank-nullity theorem, there is a very short and elegant proof of Grassmann’s formula
with exact sequences (see Exercise A.3.22† ). For this, note that the sequence

0→V ∩W →V ×W →V +W →0

is exact, where the map V ∩ W → V × W is given by ϕ : u ↦ (u, −u) (and not (u, u), we’ll
see why in the next sentence) and the map V × W → V + W by ψ : (v, w) ↦ v + w. Indeed, ϕ is clearly
injective, ψ is clearly surjective, and ker ψ = im ϕ as v + w = 0 with (v, w) ∈ V × W if and only
if (v, w) = (u, −u) for some u ∈ V ∩ W . Hence, Exercise A.3.22† yields

dim(V ∩ W ) + dim(V + W ) = dim(V × W ) = dim V + dim W.

Exercise C.5.3† . Given a vector space V of dimension n, we say a subspace H of V is a hyperplane
of V if it has dimension n − 1. Prove that H is a hyperplane of $K^n$ if and only if there are elements
$a_1, \ldots, a_n \in K$ not all zero such that

\[ H = \{(x_1, \ldots, x_n) \in K^n \mid a_1 x_1 + \ldots + a_n x_n = 0\}. \]

Solution

Clearly, if H is defined as the zero set of $a_1 X_1 + \ldots + a_n X_n$ then H has dimension n − 1 since,
assuming without loss of generality that $a_n \ne 0$, we get a bijective linear map $K^{n-1} \to H$ given by

\[ (x_1, \ldots, x_{n-1}) \mapsto \left( x_1, \ldots, x_{n-1}, -\frac{a_1 x_1 + \ldots + a_{n-1} x_{n-1}}{a_n} \right). \]

For the converse, pick a linear map ϕ : $K^n \to K$ mapping H to 0 without being identically 0 (its kernel has dimension n − 1 by rank-nullity, so it is exactly H). Then,

\[ (x_1, \ldots, x_n) = x \in H \iff \varphi(x) = 0 \iff \sum_{i=1}^{n} x_i \varphi(e_i) = 0 \]

where $e_i$ is the ith vector of the canonical basis of $K^n$: 0 everywhere except in its ith coordinate where there is a
1.


Exercise C.5.4† . Let $M \in K^{m \times n}$ be a matrix. Prove that M has rank k¹ if and only if there are
invertible matrices $P \in K^{m \times m}$ and $Q \in K^{n \times n}$ such that $M = P J_k Q$ where

\[ J_k = \begin{pmatrix} I_k & 0 \\ 0 & 0 \end{pmatrix} \]

is a block-diagonal matrix of rank k (meaning that you consider the upper-right 0 as an element
of $K^{k \times (n-k)}$, the lower-left 0 as an element of $K^{(m-k) \times k}$, and the lower-right 0 as an element of
$K^{(m-k) \times (n-k)}$). Deduce that $M, N \in K^{m \times n}$ have the same rank if and only if there exist invertible
matrices $P \in K^{m \times m}$ and $Q \in K^{n \times n}$ such that M = P N Q.

Solution

Consider the linear map ϕ induced by M , which maps $V \in K^n$ to $MV \in K^m$. Let
$(\varphi(e_1), \ldots, \varphi(e_k))$ be a basis of im ϕ. Complete it into a basis C of $K^m$, and complete $(e_1, \ldots, e_k)$
into a basis B of $K^n$ by adding a basis of ker ϕ (this does give a basis of $K^n$ by the rank-nullity theorem). Then, the matrix of ϕ with respect to the bases B and C is

\[ M_C^B = J_k. \]

Since the LHS has the form P M Q for invertible P, Q (corresponding to the change of bases from
the canonical bases of $K^n$ and $K^m$ to B and C), we get the wanted result. As any invertible
matrix can be a change of basis matrix, it is also clear that P M Q has the same rank as M for
any invertible P, Q, which proves the last part.


¹ Recall that the rank of M is the dimension of the image of the linear map V ↦ M V it induces, i.e. the maximum number of
linearly independent columns, as the image is the space generated by the columns.

Determinants
Exercise C.5.8† . Let $a_0, \ldots, a_{n-1}$ be elements of K and ω a primitive nth root of unity. Prove that
the circulant determinant

\[
\begin{vmatrix}
a_0 & a_1 & \cdots & a_{n-1} \\
a_{n-1} & a_0 & \cdots & a_{n-2} \\
\vdots & \vdots & \ddots & \vdots \\
a_1 & a_2 & \cdots & a_0
\end{vmatrix}
\]

is equal to

\[ f(1) f(\omega) f(\omega^2) \cdots f(\omega^{n-1}) \]

where $f = a_0 + \ldots + a_{n-1} X^{n-1}$. Deduce that this determinant is congruent to $a_0 + \ldots + a_{p-1}$ modulo
p when n = p is prime and $a_0, \ldots, a_{p-1}$ are integers.

Solution

Let A be the circulant matrix. We shall imitate the proof of Theorem C.3.2: we need to prove
that det A is zero when $a_0 = -(a_1 \omega + \ldots + a_{n-1} \omega^{n-1})$ for any nth root of unity ω. This shows
that the product of the f (ω) divides the determinant as a polynomial in $a_0$. Then, by looking at the
coefficient of $a_0^n$, we can conclude that they are in fact equal. To show that the determinant
vanishes for these $a_0$, note that the following linear combination of the columns is zero:

\[ \sum_{i=1}^{n} \omega^i A^i = 0. \]

Alternatively, one could note that A = f (J), where J is the circulant matrix with $a_1 = 1$ and
$a_0 = a_2 = a_3 = \ldots = a_{n-1} = 0$. We can check that all nth roots of unity are eigenvalues of J. To
finish, Exercise C.5.18† implies that the eigenvalues of f (J) are the f (ω), and the determinant
is their product as wanted.
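Both conclusions of the exercise can be checked numerically on a small example. The following Python sketch is only an illustration under stated assumptions: we work over the complex numbers, the roots of unity are computed in floating point (hence the tolerance), and the helper names `circulant` and `product_over_roots` are hypothetical.

```python
import cmath
from itertools import permutations

def det(m):
    """Determinant as a signed sum over permutations (small matrices only)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= m[i][perm[i]]
        total += term
    return total

def circulant(a):
    """Circulant matrix whose first row is a, each row being the previous one shifted right."""
    n = len(a)
    return [[a[(col - row) % n] for col in range(n)] for row in range(n)]

def product_over_roots(a):
    """Floating-point value of the product of f(w) over all nth roots of unity w."""
    n = len(a)
    result = 1
    for k in range(n):
        w = cmath.exp(2j * cmath.pi * k / n)
        result *= sum(coef * w ** j for j, coef in enumerate(a))
    return result

a = [3, 1, 4, 1, 5]  # n = p = 5
d = det(circulant(a))
print(abs(d - product_over_roots(a)) < 1e-6)  # agreement with the product formula
print((d - sum(a)) % 5)                        # 0: congruence modulo p
```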


Exercise C.5.9† (Cramer’s Rule). Consider the system of equations M V = X where M is an invertible n × n
matrix and $V = (v_i)_{i \in [\![1,n]\!]}$ and $X = (x_i)_{i \in [\![1,n]\!]}$ are column vectors. Prove that, for any $k \in [\![1, n]\!]$,
$v_k$ is equal to $\det M_{k,X} / \det M$, where $M_{k,X}$ denotes the matrix $[M^1, \ldots, M^{k-1}, X, M^{k+1}, \ldots, M^n]$
obtained from M by replacing the kth column by X.

Solution

Note that this formula is linear in X since the determinant is multilinear. Since the kth
coordinate of $V = M^{-1} X$ is also linear in X, we just need to check that both formulas
agree on a basis of $K^n$. This is easy: when $X = M^j$, the formula gives $v_j = 1$ and $v_i = 0$ for i ≠ j (as $M_{i,X}$ then has two identical columns), which
is indeed the solution to $M V = M^j$. Since the columns $M^1, \ldots, M^n$ form a basis of $K^n$ as M is invertible, we are
done.
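Cramer's rule translates directly into a (very inefficient, but exact) procedure. Here is a minimal Python sketch using exact rational arithmetic; the function name `cramer_solve` and the sample system are assumptions made for this illustration only.

```python
from fractions import Fraction
from itertools import permutations

def det(m):
    """Determinant as a signed sum over permutations (small matrices only)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= m[i][perm[i]]
        total += term
    return total

def cramer_solve(m, x):
    """Solve m * v = x by Cramer's rule; m must be a square integer matrix with non-zero determinant."""
    d = det(m)
    if d == 0:
        raise ValueError("matrix is not invertible")
    solution = []
    for k in range(len(m)):
        # Replace the kth column of m by x, then divide the two determinants.
        replaced = [row[:k] + [x[i]] + row[k + 1:] for i, row in enumerate(m)]
        solution.append(Fraction(det(replaced), d))
    return solution

M = [[2, 1], [1, 3]]
X = [3, 5]
V = cramer_solve(M, X)
print(V)  # [Fraction(4, 5), Fraction(7, 5)]
```

One can check by hand that 2 · 4/5 + 7/5 = 3 and 4/5 + 3 · 7/5 = 5.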


Exercise C.5.10† (Wronskian Determinant). Let K be a characteristic 0 field. Given n formal
Laurent series $f_1, \ldots, f_n \in K((X))$, i.e. formal power series divided by a power of X,² we define the
Wronskian determinant

\[
W = \begin{vmatrix}
f_1 & f_2 & \cdots & f_n \\
f_1' & f_2' & \cdots & f_n' \\
\vdots & \vdots & \ddots & \vdots \\
f_1^{(n-1)} & f_2^{(n-1)} & \cdots & f_n^{(n-1)}
\end{vmatrix}.
\]

Prove that W = 0 if and only if $f_1, \ldots, f_n$ are linearly dependent (over K).

² In other words, $K((X)) = K[[X]][X^{-1}]$. A more conceptual way to view K((X)) is as the field of fractions of K[[X]].
In particular, rational functions are formal Laurent series.

Solution

Here is how we will proceed: first we prove this result for elements of K[X, X −1 ], i.e. formal
Laurent series modulo X m for some m, and then we deduce the general result from this. Thus,
suppose first that f1 , . . . , fn are polynomials divided by a power of X. By replacing them by
linear combinations, we may assume that they have distinct degrees. In other words, the original
fi are linearly dependent if and only if one of our new fi is 0 (the Wronskian determinant is
the same by column operations). Hence, suppose otherwise; we may then assume that they are
monic by homogeneity of the determinant. Let d1 , . . . , dn be their respective degrees.

Here is the key step: the leading coefficient of W is the determinant of the leading coefficients of
the $f_i^{(j)}$, as we can see for instance with Theorem C.3.3. This is

\[
\begin{vmatrix}
1 & 1 & \cdots & 1 \\
d_1 & d_2 & \cdots & d_n \\
d_1(d_1 - 1) & d_2(d_2 - 1) & \cdots & d_n(d_n - 1) \\
\vdots & \vdots & \ddots & \vdots \\
d_1 \cdots (d_1 - (n-2)) & d_2 \cdots (d_2 - (n-2)) & \cdots & d_n \cdots (d_n - (n-2))
\end{vmatrix}.
\]

However, by row operations, this is simply the Vandermonde determinant

\[ V = \prod_{i<j} (d_j - d_i). \]

Since the $d_i$ are distinct, the Wronskian determinant has a non-zero coefficient and is thus non-zero
as well.

Now, let us do the general case. Let $f_1, \ldots, f_n \in K((X))$ be any formal Laurent series. Suppose
that their Wronskian determinant W is 0 and let m be any integer. Then, modulo $X^m$, we
get that the Wronskian determinant of $f_1 \pmod{X^m}, \ldots, f_n \pmod{X^m}$ is also 0. By the first
part of the solution, this means that they are linearly dependent. Let $V_m$ be the vector space of
$(a_1, \ldots, a_n)$ such that

\[ \sum_{i=1}^{n} a_i f_i \equiv 0 \pmod{X^m}. \]

By assumption, $V_m$ is non-zero for all m ∈ Z. However, clearly, we also have $V_{m+1} \subseteq V_m$ for
any m ∈ Z since an element which is zero modulo $X^{m+1}$ is also zero modulo $X^m$. Hence, we
have a decreasing chain of non-zero finite-dimensional vector spaces

\[ V_0 \supseteq V_1 \supseteq V_2 \supseteq \ldots \]

Then, $V_m$ must be eventually equal to some fixed V . Indeed, the sequence of dimensions $\dim V_m$
is a weakly decreasing sequence of positive integers which must thus be eventually constant.
However, the only subspace of same dimension as a finite-dimensional vector space V is V itself,
so we must have $V_{m+1} = V_m = V$ for all large m as claimed. Clearly, V is also equal to $\bigcap_m V_m$
so $\bigcap_m V_m$ is non-zero. Pick any non-zero element $(a_1, \ldots, a_n)$ of it to get a linear combination of the
$f_i$ which is zero modulo any power of X and thus is zero as wanted.


Remark C.5.2
As we saw in Remark A.3.4, the result fails in characteristic p: W (X p , X 2p ) = 0.

Exercise C.5.11† . Let $(u_n)_{n \ge 0}$ be a sequence of elements of a field K. Suppose that the (m + 1) ×
(m + 1) determinant $\det(u_{n+i+j})_{i,j \in [\![0,m]\!]}$ is 0 for all sufficiently large n. Prove that there is some N
such that $(u_n)_{n \ge N}$ is a linear recurrence of order at most m.

Solution

We proceed by induction on m (it’s trivial when m = 1). More precisely, we prove that, under the
assumptions of the problem, if the m × m determinant $\det(u_{n+i+j})_{i,j \in [\![0,m-1]\!]}$ vanishes
for one value of n, then it vanishes for all the next ones, which means that there exists some N for
which $(u_n)_{n \ge N}$ is a linear recurrence of order at most m − 1 ≤ m. If it doesn’t vanish for n ≥ N ,
then $(u_{n+i+j})_{i,j \in [\![0,m]\!]}$ has rank m by definition of the rank (see Exercise C.5.37† ), and, more
precisely, its first m rows as well as its last m rows are linearly independent and generate the
same hyperplane H. Notice that the last m rows of $(u_{n+i+j})_{i,j \in [\![0,m]\!]}$ are the first m rows of
$(u_{n+1+i+j})_{i,j \in [\![0,m]\!]}$ so this hyperplane H is always the same. Finally, with Exercise C.5.3† , we
conclude that

\[ a_0 u_n + a_1 u_{n+1} + \ldots + a_m u_{n+m} = 0 \]

for all n ≥ N , i.e. $(u_n)_{n \ge N}$ is a linear recurrence of order at most m as claimed.

It remains to prove that, if $\det(u_{n+i+j})_{i,j \in [\![0,m-1]\!]} = 0$, then $\det(u_{n+1+i+j})_{i,j \in [\![0,m-1]\!]} = 0$ as well.
Hence, suppose that the first determinant is 0, i.e. that there is a linear dependence between the
rows. If this dependence does not involve the first row, then it also creates a linear dependence
in the rows of the second matrix which implies that its determinant is 0 as wanted. Otherwise,
the first row is a linear combination of the m − 1 next ones. This implies that the first row of
$(u_{n+i+j})_{i,j \in [\![0,m]\!]}$ is the sum of a linear combination of the m − 1 next ones and a vector of the form
(0, . . . , 0, a) for some a ∈ K. Then, by performing row operations, we find

\[
0 = \det(u_{n+i+j})_{i,j \in [\![0,m]\!]} = \pm
\begin{vmatrix}
0 & \cdots & 0 & a \\
u_{n+1} & \cdots & u_{n+m} & u_{n+m+1} \\
\vdots & \ddots & \vdots & \vdots \\
u_{n+m} & \cdots & u_{n+2m-1} & u_{n+2m}
\end{vmatrix}
= \pm a \det(u_{n+1+i+j})_{i,j \in [\![0,m-1]\!]}
\]

by expanding with respect to the first row. We are done.
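To see the objects of this exercise on a familiar example, here is a hedged Python sketch using the Fibonacci sequence (our choice of example, not the book's), for which m = 2: the 3 × 3 determinants vanish, the 2 × 2 ones do not, and a normal vector of the hyperplane H spanned by the rows gives back the recurrence. The helper name `hankel_det` is hypothetical.

```python
from itertools import permutations

def det(m):
    """Determinant as a signed sum over permutations (small matrices only)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= m[i][perm[i]]
        total += term
    return total

def hankel_det(u, n, size):
    """Determinant of the size x size matrix (u[n+i+j]) for 0 <= i, j < size."""
    return det([[u[n + i + j] for j in range(size)] for i in range(size)])

fib = [0, 1]
while len(fib) < 30:
    fib.append(fib[-1] + fib[-2])

print(all(hankel_det(fib, n, 3) == 0 for n in range(20)))  # True
print(all(hankel_det(fib, n, 2) != 0 for n in range(20)))  # True

# Normal vector of the plane spanned by the first two rows (a cross product in dimension 3).
r0, r1 = fib[0:3], fib[1:4]
normal = (r0[1] * r1[2] - r0[2] * r1[1],
          r0[2] * r1[0] - r0[0] * r1[2],
          r0[0] * r1[1] - r0[1] * r1[0])
print(normal)  # (1, 1, -1), i.e. u_n + u_{n+1} - u_{n+2} = 0
```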




Exercise C.5.12† . Let $f_1, \ldots, f_n : S \to K$ be linearly independent functions, where S is any set and
K is a field. Prove that there exist n elements $m_1, \ldots, m_n$ of S such that the tuples

\[ (f_1(m_1), \ldots, f_n(m_1)), \ldots, (f_1(m_n), \ldots, f_n(m_n)) \]

are linearly independent over K.

Solution

We proceed by induction on n. It is of course trivial when n = 1. Fix $m_1, \ldots, m_{n-1}$ such that

\[ (f_1(m_1), \ldots, f_{n-1}(m_1)), \ldots, (f_1(m_{n-1}), \ldots, f_{n-1}(m_{n-1})) \]

are linearly independent, i.e. such that the determinant

\[
C = \begin{vmatrix}
f_1(m_1) & \cdots & f_{n-1}(m_1) \\
\vdots & \ddots & \vdots \\
f_1(m_{n-1}) & \cdots & f_{n-1}(m_{n-1})
\end{vmatrix}
\]

is non-zero. We wish to show that there is some $m_n$ such that

\[
\begin{vmatrix}
f_1(m_1) & \cdots & f_n(m_1) \\
\vdots & \ddots & \vdots \\
f_1(m_n) & \cdots & f_n(m_n)
\end{vmatrix}
\]

is non-zero. Expand it with respect to the last row to get

\[ \sum_{i=1}^{n} C_i f_i(m_n) \]

where the $C_i$ are constants (they do not depend on $m_n$) and $C_n = C$. Since $f_1, \ldots, f_n$ are linearly independent and $C_n \ne 0$, this
cannot always be zero, as claimed.


Algebraic Combinatorics
Exercise C.5.15† . Let $A_1, \ldots, A_{n+1}$ be non-empty subsets of [n]. Prove that there exist disjoint
subsets I and J of [n + 1] such that

\[ \bigcup_{i \in I} A_i = \bigcup_{j \in J} A_j. \]

Solution

Identify each subset $A_i \subseteq [n]$ with the vector $v_i \in \mathbb{R}^n$ whose jth coordinate is 1 if $j \in A_i$ and 0
otherwise. This makes the set of subsets of [n] into a subset of an R-vector space of dimension n.
(It’s not a vector space itself, though it would be if we chose $\mathbb{F}_2$ instead of R. $\mathbb{F}_2$ is usually very
useful in algebraic combinatorics but doesn’t work here.) We have n + 1 vectors in a space of
dimension n so they must be linearly dependent, say

\[ \sum_{i=1}^{n+1} c_i v_i = 0 \]

with the $c_i$ not all zero. Now consider the set I of indices of positive $c_i$, and the set J of indices of negative $c_i$. We have

\[ \sum_{i \in I} |c_i| v_i = \sum_{j \in J} |c_j| v_j \]

which gives us $\bigcup_{i \in I} A_i = \bigcup_{j \in J} A_j$ as wanted, by looking at which coordinates are non-zero on both sides. In addition, I and J are disjoint by construction.
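The proof above is constructive, and can be turned into a small procedure. The following Python sketch is one possible way to do so, under our own assumptions: the linear dependence is found by Gaussian elimination over Q, indices are 0-based, and the function names `kernel_vector` and `equal_unions` are hypothetical.

```python
from fractions import Fraction

def kernel_vector(vectors, dim):
    """Return non-zero c with sum(c[i] * vectors[i]) = 0; needs len(vectors) > dim."""
    k = len(vectors)
    # Row i of the linear system collects the ith coordinates of all the vectors.
    rows = [[Fraction(v[i]) for v in vectors] for i in range(dim)]
    pivots = []  # list of (row index, column index)
    r = 0
    for c in range(k):
        if r == dim:
            break
        p = next((i for i in range(r, dim) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(dim):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append((r, c))
        r += 1
    pivot_cols = {c for _, c in pivots}
    free = next(c for c in range(k) if c not in pivot_cols)
    coeffs = [Fraction(0)] * k
    coeffs[free] = Fraction(1)
    for row, col in pivots:
        coeffs[col] = -rows[row][free]
    return coeffs

def equal_unions(subsets, n):
    """Disjoint index lists I, J with equal unions, for n + 1 non-empty subsets of {1, ..., n}."""
    vectors = [[1 if j in s else 0 for j in range(1, n + 1)] for s in subsets]
    c = kernel_vector(vectors, n)
    I = [i for i, ci in enumerate(c) if ci > 0]
    J = [i for i, ci in enumerate(c) if ci < 0]
    return I, J

subsets = [{1}, {2}, {1, 2}, {3}]
I, J = equal_unions(subsets, 3)
print(I, J)  # [2] [0, 1]
```

Here the third subset {1, 2} is the union of the first two, which is what the output expresses.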


Polynomials of Linear Maps and Matrices


Exercise C.5.18† (Characteristic Polynomial). Let K be an algebraically closed field. Let $M \in K^{n \times n}$
be an n × n matrix. Define its characteristic polynomial as $\chi_M = \det(X I_n - M)$. Its roots (counted
with multiplicity) are called the eigenvalues $\lambda_1, \ldots, \lambda_n \in K$ of M . Prove that det M is the product
of the eigenvalues of M , and that Tr M is the sum of the eigenvalues. In addition, prove that λ is
an eigenvalue of M if and only if there is a non-zero column vector V such that M V = λV (in other
words, M acts like a homothety on V ). Conclude that, if f ∈ K[X] is a polynomial, the eigenvalues of
f (M ) are the $f(\lambda_i)$ (with multiplicity). (We are interpreting 1 ∈ K as $I_n$ for f (M ) here, i.e., if f = X + 1,
f (M ) is $M + I_n$.) In particular, the eigenvalues of $M + \alpha I_n$ are $\lambda_1 + \alpha, \ldots, \lambda_n + \alpha$, and the eigenvalues
of $M^k$ are $\lambda_1^k, \ldots, \lambda_n^k$.³ Finally, show that if

\[ M = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix} \]

is block-triangular, then $\chi_M = \chi_A \chi_C$.⁴

Solution

The first part follows simply from expanding the determinant and using Vieta’s formulas. For the
second one, there is a non-zero vector V such that $M V = \lambda V \iff (M - \lambda I_n)V = 0$ if and only if
$\ker(M - \lambda I_n)$ is non-trivial, which is equivalent to $\det(M - \lambda I_n) = 0$. Now, note that M V = λV
gives $M^k V = \lambda^k V$ and thus, by taking linear combinations, f (M )V = f (λ)V . This shows that
the f (λ) are eigenvalues of f (M ). To account for the multiplicity, note that we have established the
result when $\chi_M$ has simple roots, and that the general result follows by density, analytical or
algebraic. We present the proof by algebraic density (more technically, Zariski density) since it
works over any field, which is similar to Remark C.3.8.

Note that the equality $\chi_{f(M)} = \prod_{i=1}^{n} (X - f(\lambda_i))$ is a polynomial equality in the coefficients of
f and the coordinates of M by the fundamental theorem of symmetric polynomials. Let $\Delta_f$ be
the discriminant of $\prod_{i=1}^{n} (X - f(\lambda_i))$, i.e. $\pm \prod_{i \ne j} (f(\lambda_i) - f(\lambda_j))$, which is again polynomial in the
coefficients of f and the coordinates of M . We have shown that

\[ \left( \chi_{f(M)} - \prod_{i=1}^{n} (X - f(\lambda_i)) \right) \Delta_f \]

always takes the value zero. Hence, it must be the zero polynomial. If we show that $\Delta_f$ is
non-zero, we are hence done. Choosing f = X, this amounts to finding a matrix with distinct
eigenvalues. This is not very hard: we can fix all coordinates of M except one and let it vary.
The determinant $\det(X I_n - M)$ then varies affinely in this coordinate, say it is ua + v, where a is the varying
coordinate. By induction, we can choose u to have simple roots. A multiple root of ua + v would
also be a root of u′, but this is impossible for large a unless this root is a common root of u and
v, which must thus have multiplicity one: this is a contradiction. (Alternatively, we can consider
the matrix J from Exercise C.5.8† .)

Finally, the last two parts follow from Exercise C.3.17.




Exercise C.5.19† (Cayley-Hamilton Theorem). Prove that, for any n × n matrix M , $\chi_M(M) = 0$
where $\chi_M$ is the characteristic polynomial of M and $0 = 0 I_n$. Conclude that, if every eigenvalue of M
is zero, M is nilpotent, i.e. $M^k = 0$ for some k.⁵

Solution

Using Proposition C.3.7, we get

\[ (M - X I_n) \operatorname{adj}(M - X I_n) = \operatorname{adj}(M - X I_n)(M - X I_n) = \det(M - X I_n) I_n = (-1)^n \chi_M I_n. \quad (*) \]

We wish to substitute M for X in $M - X I_n$, but we can only do that if M commutes with the
coefficients of $\operatorname{adj}(M - X I_n)$ (which are matrices). Note that this is the case since

\[ X I_n \operatorname{adj}(M - X I_n) = \operatorname{adj}(M - X I_n) X I_n \]

(X is a formal variable so commutes with everything, and $I_n$ does as well) so $M \operatorname{adj}(M - X I_n) =
\operatorname{adj}(M - X I_n) M$ by (∗). The second part is obvious: if all eigenvalues of M are zero, then
$\chi_M = X^n$ so $M^n = 0$.

Alternatively, note that the theorem is obvious for diagonal matrices by the last part of Exercise C.5.18†
and hence for diagonalisable matrices as well. However, by Exercise C.5.22† , we know
that M is diagonalisable if its characteristic polynomial is squarefree. If we consider the discriminant
∆ of $\chi_M$, we thus get that $\chi_M(M) = 0$ whenever ∆ ≠ 0. This implies that $\chi_M(M) = 0$
holds polynomially, and is thus always true: if a is a coefficient of $\chi_M(M)$, we have a∆ = 0 for
all M , so the polynomial (in the coefficients of M ) a∆ is zero, i.e. a is zero as ∆ isn’t. (This is
an argument of Zariski density.)

³ One of the advantages of the characteristic polynomial is that we are able to use algebraic number theory, or more
generally polynomial theory, to deduce linear algebra results, since the eigenvalues say a lot about a matrix (if we combine
this with the Cayley-Hamilton theorem). See for instance Exercise C.5.28 and the third solution of Exercise C.5.30† .
⁴ Of course, all of this extends to endomorphisms (i.e. linear maps from V to itself) ϕ : V → V of a finite-dimensional
vector space.
⁵ Note that if, in the definition of $\chi_M$, we replace det by an arbitrary multilinear form in the coordinates of M
(such as the permanent $\operatorname{perm}(A) = \sum_{\sigma \in S_n} a_{1,\sigma(1)} \cdots a_{n,\sigma(n)}$), the result becomes false, so we cannot just say that
"$\chi_M(M) = \det(M - M I_n) = \det 0 = 0$" (this "proof" is nonsense because the scalar 0 is not the matrix 0, but the point
is that this intuition is fundamentally flawed).


Exercise C.5.20† (Kernel Lemma). Let V be a vector space over a field K, ϕ : V → V a linear map,
and f, g ∈ K[X] coprime polynomials. Prove that

ker((f g)(ϕ)) = ker(f (ϕ)) ⊕ ker(g(ϕ)).

(The action of K[X] on ϕ is defined by (X n )(ϕ) := ϕn .)

Solution

As ker f (ϕ), ker g(ϕ) ⊆ ker(f g)(ϕ), the RHS is included in the LHS. As f and g are coprime, the
RHS is indeed a direct sum: there exist u, v ∈ K[X] such that uf + vg = 1, so that

u(ϕ) ◦ f (ϕ) + v(ϕ) ◦ g(ϕ) = id

which implies ker f (ϕ) ∩ ker g(ϕ) ⊆ ker id = {0}.

Finally, let x ∈ ker(f g)(ϕ). Then, g(ϕ)(x) ∈ ker f (ϕ) and f (ϕ)(x) ∈ ker g(ϕ) so that, since u(ϕ) and v(ϕ) commute with f (ϕ) and g(ϕ),

\[ x = u(\varphi)(f(\varphi)(x)) + v(\varphi)(g(\varphi)(x)) \in \ker g(\varphi) \oplus \ker f(\varphi). \]

Exercise C.5.21† (Minimal Polynomial). Let U be a finite dimensional vector space, and let ϕ : U →
U be a linear map. Prove that, although the minimal polynomial πϕ of ϕ is not necessarily equal to
its characteristic polynomial χϕ , they have the same irreducible factors.6 In addition, show that if V
is a subspace of U which is stable under ϕ, i.e. ϕ(V ) ⊆ V , then χϕ|V divides χϕ , where ϕ|V : V → V
denotes the restriction of ϕ to V . Finally, if U = V ⊕ W where V and W are stable under ϕ, prove
that
πϕ = lcm(πϕ|V , πϕ|W )

and
χϕ = χϕ|V χϕ|W .

6 Beware that, as K n×n is not a domain (meaning that AB = 0 implies A = 0 or B = 0) anymore, minimal

polynomials are not necessarily irreducible anymore.



Solution

If λ is a root of $\chi_\varphi$, i.e. an eigenvalue of ϕ, let v be a corresponding eigenvector. Then,
$0 = \pi_\varphi(\varphi)(v) = \pi_\varphi(\lambda) v$ which means that λ is a root of $\pi_\varphi$.

Alternatively, we can express $\pi_\varphi(X)\mathrm{id} = \pi_\varphi(X \mathrm{id}) - \pi_\varphi(\varphi)$ as $(X \mathrm{id} - \varphi)\tau$ for some polynomial
τ in X and ϕ and take the determinant on both sides: this yields

\[ \chi_\varphi = \det(X \mathrm{id} - \varphi) \mid \det(\pi_\varphi(X) \mathrm{id}) = \pi_\varphi(X)^n. \]

The second part is just another way of phrasing the last part of Exercise C.5.18† : if we complete
a basis of V into a basis of U , then the matrix of ϕ in this basis is block-triangular with the
upper-left block corresponding to V . Similarly, the second equality of the third part follows
from Exercise C.5.18† : if U = V ⊕W with V, W stable under ϕ, construct a basis of U by merging
a basis of V with a basis of W . Then the matrix of ϕ in this basis is block-diagonal with one
block corresponding to V and one to W .

For the first equality, it suffices to note that f (ϕ) is zero on U if and only if it is zero on V as
well as on W , which happens if and only if $\pi_{\varphi|_V}, \pi_{\varphi|_W} \mid f$, i.e. $\operatorname{lcm}(\pi_{\varphi|_V}, \pi_{\varphi|_W}) \mid f$.


Exercise C.5.22† (Diagonalisation). We say an endomorphism (i.e. a linear map from a vector
space to itself) ϕ : V → V of a finite-dimensional vector space V is diagonalisable if there is a basis
$(e_1, \ldots, e_n)$ of V in which ϕ is diagonal, i.e. there are $\lambda_1, \ldots, \lambda_n \in K$ such that $\varphi(e_i) = \lambda_i e_i$ for
all i. Prove that ϕ is diagonalisable if and only if its minimal polynomial $\pi_\varphi$ is squarefree.⁷ If K is
algebraically closed and $\chi_\varphi = (X - \lambda_1)^{m_1} \cdots (X - \lambda_r)^{m_r}$ where $\lambda_1, \ldots, \lambda_r$ are distinct, then, in the
decomposition

\[ V = \bigoplus_{i=1}^{r} \ker((\varphi - \lambda_i \mathrm{id})^{m_i}) \]

given by Exercise C.5.20 and Exercise C.5.19† , $\ker((\varphi - \lambda_i \mathrm{id})^{m_i}) = \ker(\varphi - \lambda_i \mathrm{id})$ for all i if and only
if ϕ is diagonalisable.

Solution

Diagonalising ϕ amounts to finding a basis of eigenvectors $e_1, \ldots, e_n$. Since the space generated
by eigenvectors is $\bigoplus_{i=1}^{r} \ker(\varphi - \lambda_i \mathrm{id})$ where $\lambda_1, \ldots, \lambda_r$ are the distinct eigenvalues of ϕ, this
proves that ϕ is diagonalisable if and only if

\[ \bigoplus_{i=1}^{r} \ker(\varphi - \lambda_i \mathrm{id}) = V. \]

As the RHS is equal to $\bigoplus_{i=1}^{r} \ker(\varphi - \lambda_i \mathrm{id})^{m_i}$, this proves that ϕ is diagonalisable if and only if
$\ker(\varphi - \lambda_i \mathrm{id}) = \ker(\varphi - \lambda_i \mathrm{id})^{m_i}$ for every i. As the LHS is equal to $\ker((X - \lambda_1) \cdots (X - \lambda_r))(\varphi)$ by
the kernel lemma, this also shows that ϕ is diagonalisable if and only if its minimal polynomial
divides $(X - \lambda_1) \cdots (X - \lambda_r)$, i.e. is squarefree.


⁷ This is already very useful since it gives us that whenever the eigenvalues of ϕ are distinct, ϕ is diagonalisable, which
implies that diagonalisable matrices are dense (in $\mathbb{C}^{n \times n}$). We can use this to give another proof of the Cayley-Hamilton theorem,
see the solution to Exercise C.5.19† .

Exercise C.5.23† (Pointwise Minimal Polynomial). Let ϕ : U → U be an endomorphism of a finite-dimensional
vector space U . We define the minimal polynomial of ϕ at some x ∈ U , denoted $\pi_{\varphi,x}$, to be the
unique monic polynomial of smallest degree such that $\pi_{\varphi,x}(\varphi)(x) = 0$. Prove that, if U = V ⊕ W
where V and W are stable under ϕ (see Exercise C.5.21† ), then

\[ \pi_{\varphi,x+y} = \operatorname{lcm}(\pi_{\varphi|_V,x}, \pi_{\varphi|_W,y}) \]

for any (x, y) ∈ V × W . Deduce that there always exists an x ∈ U such that $\pi_{\varphi,x} = \pi_\varphi$.

Solution

For the first part, it suffices to note that, for f ∈ K[X], f (ϕ)(x + y) = f (ϕ)(x) + f (ϕ)(y) is equal
to 0 if and only if f (ϕ)(x) and f (ϕ)(y) are both equal to 0, as V ∩ W = {0} and V, W are stable under ϕ.

We now consider the second part. As the case ϕ = 0 is trivial, let us suppose that ϕ ≠ 0, so
that $\pi_\varphi \ne 1$. Suppose first that $\pi_\varphi = f^m$ is a power of an irreducible polynomial. Then, as $\pi_{\varphi,x}$
divides $\pi_\varphi$ for every x ∈ U , we get $\pi_{\varphi,x} = f^k$ where k is such that $x \in \ker f^k(\varphi) \setminus \ker f^{k-1}(\varphi)$ (if k = 0
set $\ker f^{k-1}(\varphi) = \emptyset$). In particular, for $x \notin \ker f^{m-1}(\varphi)$, we get $\pi_{\varphi,x} = \pi_\varphi$, and there must exist such
an x as $f^m$ is the minimal polynomial of ϕ.

Now, let ϕ ≠ 0 be any endomorphism, and let $\pi_\varphi = f_1^{m_1} \cdots f_r^{m_r}$ be the prime factorisation of its
minimal polynomial. By the kernel lemma, we have

\[ U = \bigoplus_{i=1}^{r} \ker f_i^{m_i}(\varphi) \]

and the minimal polynomial of ϕ restricted to $U_i = \ker f_i^{m_i}(\varphi)$ is $f_i^{m_i}$. Then, if $x_1, \ldots, x_r$ are
such that, for each i, $\pi_{\varphi|_{U_i},x_i} = f_i^{m_i}$ (using the claim of the previous paragraph), we have

\[ \pi_{\varphi,x_1+\ldots+x_r} = \operatorname{lcm}(f_1^{m_1}, \ldots, f_r^{m_r}) = \pi_\varphi \]

by the first paragraph, as wanted.




Exercise C.5.24† (Cyclic Endomorphisms). We say an endomorphism ϕ : V → V of a finite-dimensional
vector space V is cyclic if there is some x ∈ V for which $(x, \varphi(x), \ldots, \varphi^{n-1}(x))$ is a basis
of V , where n = dim V . Prove that ϕ is cyclic if and only if its minimal polynomial $\pi_\varphi$ has degree
n, and that this happens if and only if πϕ = χϕ (you may assume the Cayley-Hamilton theorem only
for this last claim). Give a direct proof of the Cayley-Hamilton theorem when ϕ is cyclic, and deduce
a proof of the theorem in the general case by noting that, if x ∈ V , the restriction of ϕ to the space
generated by (ϕk (x))k≥0 is always cyclic.

Solution

ϕ is cyclic if and only if there is some x ∈ V such that πϕ,x has degree n. By Exercise C.5.23† ,
this happens if and only if πϕ has degree n. As πϕ divides χϕ which has degree n, this happens
if and only if πϕ = χϕ .

Suppose that ϕ is cyclic, and let $(x, \varphi(x), \ldots, \varphi^{n-1}(x))$ be a basis of V . Let $a_0, \ldots, a_{n-1} \in K$ be
such that

\[ \varphi^n(x) = \sum_{i=0}^{n-1} a_i \varphi^i(x). \]

Then, the minimal polynomial of ϕ is $\pi = X^n - a_{n-1} X^{n-1} - \ldots - a_0$. In addition, in this basis,
the matrix of ϕ is

\[
C_\pi = \begin{pmatrix}
0 & 0 & \cdots & 0 & a_0 \\
1 & 0 & \cdots & 0 & a_1 \\
0 & 1 & \cdots & 0 & a_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & a_{n-1}
\end{pmatrix}
\]

which is called the companion matrix of π (its last column consists of minus the coefficients of π). It is then easy to check that $\chi_\varphi = \pi$: simply expand
the determinant of $X I_n - C_\pi$ with respect to the last column. Thus, the Cayley-Hamilton theorem
holds for cyclic endomorphisms. The general case now follows from this: if x is any element of V ,
let $V_x$ be the space generated by $(\varphi^k(x))_{k \ge 0}$. Then, $\varphi|_{V_x}$ is cyclic, so $\chi_{\varphi|_{V_x}}(\varphi)(x) = 0$. Thus,
$\chi_\varphi(\varphi)(x) = 0$ as well as $\chi_{\varphi|_{V_x}}$ divides $\chi_\varphi$ by Exercise C.5.21† .


Exercise C.5.25† . Prove that χAB = χBA for any A, B ∈ K n×n . Using Exercise C.5.4† , deduce that
if m ≥ n, then χAB = X m−n χBA for any A ∈ K m×n and B ∈ K n×m .

Solution

When B is invertible, this follows from the fact that $\det(B M B^{-1}) = \det B \det M (\det B)^{-1} =
\det M$: if we set C = AB, we get that

\[ \chi_{AB} = \chi_C = \det(X I_n - C) \]

and

\[ \chi_{BA} = \chi_{BCB^{-1}} = \det(X I_n - BCB^{-1}) = \det(B(X I_n - C)B^{-1}) \]

are equal. The general case follows from a density argument: the coefficients of $\chi_{AB}$ and $\chi_{BA}$
are polynomial in those of A and B; if we let a denote a coefficient of $\chi_{AB} - \chi_{BA}$ we get that the
polynomial a det B is always zero which means that a = 0 as det B is non-zero as a polynomial.

Now, let us prove the second part. First, we do the case where $B = J_r \in K^{n \times m}$ for some r.
Write $A = \begin{pmatrix} A' & B' \\ C' & D' \end{pmatrix}$ where $A' \in K^{r \times r}$. Then,

\[ AB = \begin{pmatrix} A' & B' \\ C' & D' \end{pmatrix} \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} A' & 0 \\ C' & 0_{m-r} \end{pmatrix} \]

and

\[ BA = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} A' & B' \\ C' & D' \end{pmatrix} = \begin{pmatrix} A' & B' \\ 0 & 0_{n-r} \end{pmatrix} \]

by Exercise C.5.5 where $0_k$ denotes the zero matrix in $K^{k \times k}$. Now, by Exercise C.5.18† , the first
has characteristic polynomial $\chi_{A'} \chi_{0_{m-r}} = X^{m-r} \chi_{A'}$ and the second $\chi_{A'} \chi_{0_{n-r}} = X^{n-r} \chi_{A'}$ as
wanted.

Finally, we treat the general case: by Exercise C.5.4† , any matrix $B \in K^{n \times m}$ can be written as
$P J_r Q$ for some invertible $P \in K^{n \times n}$ and $Q \in K^{m \times m}$, where r is the rank of B. Then,

\begin{align*}
\chi_{AB} &= \chi_{A P J_r Q} \\
&= \chi_{Q A P J_r} \\
&= X^{m-n} \chi_{J_r Q A P} \\
&= X^{m-n} \chi_{P J_r Q A} \\
&= X^{m-n} \chi_{BA}.
\end{align*}
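The identity can be observed on a small rectangular example. The Python sketch below is an illustration only: it computes characteristic polynomials from principal minors (the formula of Exercise C.5.26†, which we assume here), and the helper names `char_poly` and `matmul` as well as the matrices A and B are our own choices.

```python
from itertools import combinations, permutations

def det(m):
    """Determinant as a signed sum over permutations (small matrices only)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= m[i][perm[i]]
        total += term
    return total

def char_poly(a):
    """Coefficients [c_0, ..., c_n] of det(X*I - a), computed from principal minors."""
    n = len(a)
    coeffs = []
    for k in range(n + 1):
        minors = sum(det([[a[i][j] for j in idx] for i in idx])
                     for idx in combinations(range(n), n - k))
        coeffs.append((-1) ** (n - k) * minors)
    return coeffs

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

A = [[1, 2], [3, 4], [5, 6]]    # 3 x 2, so m = 3 and n = 2
B = [[1, 0, 2], [0, 1, 1]]      # 2 x 3
print(char_poly(matmul(A, B)))  # [0, -2, -21, 1], i.e. X^3 - 21X^2 - 2X
print(char_poly(matmul(B, A)))  # [-2, -21, 1],    i.e. X^2 - 21X - 2
```

The first polynomial is X times the second one, in accordance with m − n = 1.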



Remark C.5.3
We could have also deduced the general case from a density argument, by noting that if λ is a
non-zero eigenvalue of ϕ ◦ ψ, say with an eigenvector x, then it is also an eigenvalue of ψ ◦ ϕ, with
eigenvector ψ(x) (which is non-zero as λ ≠ 0). Thus, all the non-zero roots of $\chi_{BA}$ are roots of
$\chi_{AB}$. In particular, if BA is invertible (which happens on a (polynomially) dense set as n ≤ m:
this is the case whenever A and B have rank n) and has distinct eigenvalues (which happens
again on a dense set), we get $\chi_{BA} \mid \chi_{AB}$. But then, as AB has rank at most n (it is the product
of two matrices of rank at most n, see Exercise C.2.3), we know by Exercise C.5.26† that 0 is a
root of multiplicity at least m − n of $\chi_{AB}$, thus proving that $\chi_{AB} = X^{m-n} \chi_{BA}$ on this dense set
and thus everywhere.

Exercise C.5.26† . Given an n × n matrix A and a set I ⊆ [n], we denote by A[I] the |I| × |I|
submatrix obtained by keeping only the rows and columns of A indexed by elements of I. Prove that,
for 0 ≤ k ≤ n, the $X^k$ coefficient of $\chi_A$ is

\[ (-1)^{n-k} \sum_{|I| = n-k} \det A[I]. \]

In particular, the $X^{n-1}$ coefficient is minus the trace of A, and the constant coefficient is $(-1)^n$ times its determinant.
Using Exercise C.5.37† , deduce that if A has rank r, 0 is a root of $\chi_A$ of multiplicity at least
n − r.

Solution

Write $A = (a_{i,j})$. Replacing A by −A, this amounts to proving that
\[
\det(X I_n + A) = \sum_{k=0}^{n} X^k \sum_{|I| = n-k} \det A[I].
\]
Let us examine the $X^k$ coefficient on both sides. If $\sigma \in S_n$ is a permutation of [n] and $\delta_{i,j}$
denotes the Kronecker delta, i.e. 1 if i = j and 0 otherwise, the $X^k$ coefficient of
\[
(a_{1,\sigma(1)} + X\delta_{1,\sigma(1)}) \cdots (a_{n,\sigma(n)} + X\delta_{n,\sigma(n)})
\]
is the sum of $\prod_{j \notin I} a_{j,\sigma(j)}$ over the sets I of cardinality k for which σ(i) = i for all i ∈ I. Moreover,
the induced permutation on [n] \ I has the same signature as σ. To finish, if we exchange the
order of summation, i.e. fix I of cardinality k and sum over the permutations fixing I, we get
det(A[[n] \ I]). Summing over all I of cardinality k yields the result.

The last part is a direct consequence of Exercise C.5.37†: if A has rank r, det A[I] = 0 whenever
|I| > r.
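
The formula is easy to test numerically. The following Python sketch (the function names and the sample matrices are assumptions of mine) builds the coefficients from the principal minors and compares the resulting polynomial with det(xI − A) at n + 1 integer points, which is enough to identify a polynomial of degree n.

```python
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion along the first row (small n only)."""
    if not M:
        return 1  # determinant of the empty matrix
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def principal_minor_coeffs(A):
    """[c_0, ..., c_n] with c_k = (-1)^(n-k) * sum of det A[I] over |I| = n - k."""
    n = len(A)
    coeffs = []
    for k in range(n + 1):
        total = sum(det([[A[i][j] for j in I] for i in I])
                    for I in combinations(range(n), n - k))
        coeffs.append((-1) ** (n - k) * total)
    return coeffs


def char_poly_at(A, x):
    """Value of det(x*I - A) at the integer x."""
    n = len(A)
    return det([[(x if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)])


def formula_agrees(A):
    """Compare both sides at n + 1 points."""
    coeffs = principal_minor_coeffs(A)
    return all(char_poly_at(A, x) == sum(c * x ** k for k, c in enumerate(coeffs))
               for x in range(len(A) + 1))


A = [[2, -1, 0], [1, 3, 4], [0, 5, -2]]  # arbitrary example
R = [[1, 2, 3], [2, 4, 6], [3, 6, 9]]    # rank 1, so X^2 should divide chi_R
print(formula_agrees(A), principal_minor_coeffs(R))
```

For the rank 1 matrix R, the two lowest coefficients vanish, as predicted by the last part of the exercise.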


Exercise C.5.29† (Iterated Kernels). Let ϕ : V → V be an endomorphism of a finite-dimensional
vector space V. Prove that, for any k ≥ 0, we have
\[
\dim \ker \varphi^{k+1} = \dim \ker \varphi^k + \dim (\ker \varphi \cap \operatorname{im} \varphi^k).
\]
Deduce that the sequence $(\dim \ker \varphi^{k+1} - \dim \ker \varphi^k)_{k \ge 0}$ is weakly decreasing.
C.5. EXERCISES 465

Solution

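Consider the linear map $f_k : \ker \varphi^{k+1} \to V$ induced by restricting $\varphi^k$ to $\ker \varphi^{k+1}$. Its kernel is
$\ker \varphi^k \subseteq \ker \varphi^{k+1}$, and its image is $\ker \varphi \cap \operatorname{im} \varphi^k$. By rank-nullity, we thus have
\[
\dim \ker \varphi^{k+1} = \dim \ker \varphi^k + \dim (\ker \varphi \cap \operatorname{im} \varphi^k)
\]
as wanted. As a consequence, $\dim \ker \varphi^{k+1} - \dim \ker \varphi^k = \dim (\ker \varphi \cap \operatorname{im} \varphi^k)$ is weakly decreasing
because $\ker \varphi \cap \operatorname{im} \varphi^{k+1} \subseteq \ker \varphi \cap \operatorname{im} \varphi^k$.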
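
To see the statement in action, here is a small Python sketch computing the sequence of kernel dimensions for a nilpotent matrix. The matrix (Jordan blocks of sizes 3 and 1) and the helpers `rank` and `kernel_dims` are hypothetical choices for illustration; the rank is obtained by Gaussian elimination over the rationals.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def rank(M):
    """Rank over the rationals by Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


def kernel_dims(M, up_to):
    """[dim ker M^0, dim ker M^1, ..., dim ker M^up_to] via rank-nullity."""
    n = len(M)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # M^0 = I
    dims = []
    for _ in range(up_to + 1):
        dims.append(n - rank(power))
        power = mat_mul(power, M)
    return dims


# Nilpotent example with Jordan blocks of sizes 3 and 1.
M = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
dims = kernel_dims(M, 4)
diffs = [b - a for a, b in zip(dims, dims[1:])]
print(dims, diffs)
```

The differences count the Jordan blocks of size greater than k, which is one way to remember why they decrease.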


Exercise C.5.30†. Let p be an odd prime number, and G be a finite (multiplicative) group of n × n matrices
with integer coordinates. Prove that two distinct elements of G stay distinct modulo p. What if the
elements of G only have algebraic integer coordinates and p is an algebraic integer with all conjugates
greater than 2 in absolute value?

Solution

We will present three solutions, in increasing order of non-elementariness, and of generality.


Suppose that the reduction modulo p (group) morphism ϕ is not injective, i.e. has non-trivial
kernel (see Exercise A.2.16∗; this is really trivial, I'm only using this language so that you
familiarise yourself with it), say $I := I_n \neq A \in \ker \varphi$, i.e. A = I + pB for some non-zero B. Since G
is a finite group, we have $A^m = I$ for m = |G| by Lagrange's theorem (Exercise A.3.18†). We will
show that this is impossible (in fact it is equivalent to the problem: if it were possible, the group
generated by A would be a counterexample). (Note that all solutions will use the assumption
that p ≠ 2 somewhere, and for a good reason: −I ≡ I (mod 2).)

Here is the first solution, which works only for the first part of the problem. Let k be the greatest
integer such that $B/p^k$ has integer coordinates. Suppose first that $p \nmid m$. Then, modulo $p^{k+2}$,
using the binomial expansion (which we can use since I and pB commute), we have
\[
(I + pB)^m \equiv I + mpB \not\equiv I \pmod{p^{k+2}}
\]
which is a contradiction. We will now replace B by C such that $(I + pB)^p = I + pC$. That way,
m gets replaced by m/p, and by iterating this process we will eventually reach an m with $p \nmid m$, which is
a contradiction. However, we need to prove that C ≠ 0 too. Thus, suppose that $(I + pB)^p = I$.
We then have
\[
(I + pB)^p \equiv I + p \cdot pB + \frac{p(p-1)}{2} p^2 B^2 \equiv I + p^2 B \not\equiv I \pmod{p^{k+3}}
\]
since p is odd, which is also a contradiction.

We now present the second solution. Let |M| denote the maximum of the absolute values of the
coordinates of a matrix M. Since $A^m = I$ for some m, $|A^k|$ is bounded when k varies, say by C.
We have
\[
|(pB)^r| = |(I - A)^r| \le \sum_{k=0}^{r} \binom{r}{k} |A^k| \le C 2^r.
\]
However, the coordinates of the left-hand side are divisible by $p^r$, so it is at least $p^r$ unless it is 0. Since p > 2, by
taking r sufficiently large, we get $|B^r| = 0$, i.e. $B^r = 0$: B is nilpotent. When p is only an
algebraic integer with all conjugates greater than 2 in absolute value, the same argument works: the coordinates
of $B^r = (pB)^r / p^r$ are algebraic integers with absolute value at most $C 2^r / |p|^r$, so less than 1 for large r.
However, the same goes for their conjugates by assumption. Since the only algebraic integer
whose conjugates are all strictly in the unit disk is 0 (by looking at the constant coefficient of its
minimal polynomial), we get $B^r = 0$ for large r as wanted.

Consider k such that $B^k \neq 0$ but $B^{k+1} = 0$. Since B ≠ 0, we have k ≥ 1. By expanding
$(I + pB)^m = I$, we get
\[
pmB + \frac{m(m-1)}{2} p^2 B^2 + \ldots = 0.
\]
Finally, by multiplying this equation by $B^{k-1}$, we get $pmB^k = 0$ which implies m = 0 and is a
contradiction.

Finally, the third solution uses more advanced linear algebra. Let β be an eigenvalue of B. Then,
α = 1 + pβ is an eigenvalue of A which is congruent to 1 modulo p. Further, since $A^m = I$, we have
$\alpha^m = 1$ so it is a root of unity. This implies that $\beta = \frac{\alpha - 1}{p}$ has modulus less than 1 since |α − 1| ≤ 2 < p.
This is also true for all its conjugates. Thus, the constant coefficient of its minimal polynomial
must be 0, i.e. β = 0. Hence, all eigenvalues of B are zero, which implies that B is nilpotent by
Exercise C.5.19†. We finish as in the previous solution.
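
Here is a small experiment illustrating the first part of the exercise, as a Python sketch. The group chosen (the 8 signed permutation matrices of size 2) and the helper names are assumptions of mine; the point is only to watch reduction modulo 3 stay injective while reduction modulo 2 collapses.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as tuples of tuples."""
    n = len(A)
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n))
                 for i in range(n))


def generate_group(generators):
    """Closure of the generators under multiplication.

    Only terminates if the generated group is finite, which is assumed here.
    """
    group = set(generators)
    frontier = list(generators)
    while frontier:
        g = frontier.pop()
        for h in generators:
            prod = mat_mul(g, h)
            if prod not in group:
                group.add(prod)
                frontier.append(prod)
    return group


def reduce_mod(M, p):
    """Entrywise reduction of M modulo p."""
    return tuple(tuple(x % p for x in row) for row in M)


# Hypothetical example: a swap and a sign change generate the
# signed permutation matrices of size 2.
swap = ((0, 1), (1, 0))
flip = ((-1, 0), (0, 1))
G = generate_group([swap, flip])

for p in (2, 3, 5):
    print(p, len(G), len({reduce_mod(g, p) for g in G}))
```

Modulo 2 the images coincide in bunches (recall that −I ≡ I), while modulo any odd prime the count stays equal to |G|.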


Miscellaneous
Exercise C.5.31† (USA TST 2019). Let m be a positive integer. For which positive integers n does there
exist a function f : Z/nZ → Z/nZ such that
\[
f,\ f + \mathrm{id},\ f + 2\,\mathrm{id},\ \ldots,\ f + m\,\mathrm{id}
\]
are all bijections?

Solution

There exists such a function if and only if all prime factors of n are greater than m + 1. In that
case, it is clear that f = id works. Now suppose that n has a prime factor p ≤ m + 1. Pick p to
be minimal, and suppose without loss of generality that m = p − 1. For 1 ≤ k ≤ m, since f and
f + k·id are both bijections, we have
\[
\sum_{x=1}^{n} f(x)^m \equiv \sum_{x=1}^{n} (f(x) + kx)^m = \sum_{i=0}^{m} \binom{m}{i} k^{m-i} \sum_{x=1}^{n} f(x)^i x^{m-i} \pmod{n},
\]
i.e.
\[
\sum_{i=0}^{m-1} \binom{m}{i} k^{m-i} \sum_{x=1}^{n} f(x)^i x^{m-i} \equiv 0 \pmod{n}.
\]
Thus, we have a linear system with coefficients $k^{m-i}$ and solution $x_i = \binom{m}{i} \sum_{x=1}^{n} f(x)^i x^{m-i}$. By Vander-
monde, the determinant of the matrix $M = (k^{m-i})_{k,i}$ is, up to sign,
\[
m! \prod_{1 \le i < j \le m} (j - i)
\]
which is invertible modulo n since p is the smallest prime factor of n by assumption and m = p − 1.
Thus, by Exercise C.3.19∗, M is invertible modulo n which implies that our system has exactly
one solution. Since $x_0 \equiv \ldots \equiv x_{m-1} \equiv 0$ is of course a solution (the trivial one), this must in fact be the
case. In particular,
\[
x_0 = \sum_{x=1}^{n} x^{p-1}
\]
is zero modulo n. We will prove that this is impossible. If p = 2, this sum is congruent to $\frac{n(n-1)}{2}$, so $n \mid \frac{n(n-1)}{2}$, which
implies that n is odd and is a contradiction.
Now suppose that p is odd. Let $k = v_p(n)$. Since this sum is congruent to
\[
\frac{n}{p^k} \sum_{x=1}^{p^k} x^{p-1}
\]
modulo $p^k$, it suffices to prove that $p^k \nmid \sum_{x=1}^{p^k} x^{p-1}$. Let g be a primitive root modulo $p^k$; there
exists one by Exercise 3.5.18†. Then,
\[
\sum_{x \in (\mathbb{Z}/p^k\mathbb{Z})^\times} x^{p-1} \equiv \sum_{j=0}^{p^{k-1}(p-1)-1} g^{j(p-1)} = \frac{g^{p^{k-1}(p-1)^2} - 1}{g^{p-1} - 1} \pmod{p^k}.
\]
By LTE 3.4.3, the p-adic valuation of this is k − 1 < k. To take care of the terms of the sum
which are divisible by p, simply note that, for ℓ ≥ 1,
\[
v_p\Big( \sum_{v_p(x) = \ell,\ x \in \mathbb{Z}/p^k\mathbb{Z}} x^{p-1} \Big) = (p-1)\ell + v_p\Big( \sum_{x \in (\mathbb{Z}/p^{k-\ell}\mathbb{Z})^\times} x^{p-1} \Big) = (p-1)\ell + k - \ell - 1 \ge k
\]
by our previous computation.
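
For small n, the answer can be confirmed by brute force. The Python sketch below (with hypothetical function names) enumerates all bijections f of Z/nZ and compares the outcome with the criterion "no integer between 2 and m + 1 divides n", which is a restatement of "all prime factors of n are greater than m + 1".

```python
from itertools import permutations


def exists_f(n, m):
    """Brute force: is there a bijection f with f + k*id bijective for 1 <= k <= m?"""
    for f in permutations(range(n)):  # f itself is a bijection by construction
        if all(len({(f[x] + k * x) % n for x in range(n)}) == n
               for k in range(1, m + 1)):
            return True
    return False


def predicted(n, m):
    """All prime factors of n exceed m + 1, i.e. no d in [2, m + 1] divides n."""
    return all(n % d != 0 for d in range(2, m + 2))


# Compare both answers on a small range (n! grows fast, so keep n small).
for n in range(1, 7):
    for m in range(1, 4):
        print(n, m, exists_f(n, m), predicted(n, m))
```

The search is exponential in n, so this is only a consistency check, not a substitute for the proof.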




Exercise C.5.32† (Finite Fields Kakeya Conjecture, Zeev Dvir). Let n ≥ 1 be an integer and F a finite
field. We say a set $S \subseteq F^n$ is a Kakeya set if it contains a line in every direction, i.e., for every $y \in F^n$,
there exists an $x \in F^n$ such that S contains the line x + yF. Prove that any polynomial of degree less
than |F| vanishing on a Kakeya set must be zero. Deduce that there is a constant $c_n > 0$ such that,
for any finite field F, any Kakeya set of $F^n$ has cardinality at least $c_n |F|^n$.

Solution

Let q be the cardinality of F. The proof will be in two steps. Suppose that f is a non-zero polynomial of
degree d < q vanishing on a Kakeya set S. Fix any $y \in F^n$. Then, for some $x \in F^n$, f(x + ty) = 0
for any t ∈ F. The polynomial f(x + Ty) has more roots than its degree so is zero. Let g be the
homogeneous part of f of degree d, i.e. the polynomial formed by the monomials of degree d of f. Notice
that the coefficient of $T^d$ in f(x + Ty) is exactly g(y). Hence, g(y) = 0 for any $y \in F^n$, which
implies that g = 0 by Exercise A.1.7∗. This contradicts the assumption that f had degree d.

For the second step, note that the dimension of the vector space V of polynomials of degree at
most q − 1 is $\binom{n+q-1}{n}$. Indeed, the monomials $X_1^{d_1} \cdots X_n^{d_n}$ for $d_1 + \ldots + d_n \le q - 1$ form a basis
of this space. However, the number of such tuples is the same as the number of ways to choose
n elements from [n + q − 1]: choose $a_1 < \ldots < a_n$ and decide that $d_1 = a_1 - 1$, $d_1 + d_2 = a_2 - 2$, etc.,
until $d_1 + \ldots + d_n = a_n - n$ (this technique is usually called stars and bars because you have q − 1
stars and n bars used to separate them). Now consider the linear map $T : V \to F^{|S|}$ defined by
$T(f) = (f(s))_{s \in S}$. If |S| < dim V, it must have a non-trivial kernel by Proposition C.1.2 (or the
rank-nullity theorem). This contradicts the first step.

We conclude that
\[
|S| \ge \binom{n+q-1}{n} = \frac{q(q+1) \cdots (n+q-1)}{n!} \ge \frac{q^n}{n!}
\]
so we can take $c_n = \frac{1}{n!}$.
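
The bound can be tested by exhaustive search in very small cases. The sketch below, in Python, assumes that F = Z/pZ for a prime p (so arithmetic is just reduction modulo p); the function names are mine. It finds the smallest Kakeya set in $F^n$ and compares its size with $\binom{n+p-1}{n}$.

```python
from itertools import combinations, product
from math import comb


def is_kakeya(S, p, n):
    """Does S contain a line x + y*F in every direction y (F = Z/pZ, p prime)?"""
    points = list(product(range(p), repeat=n))

    def line(x, y):
        return {tuple((xi + t * yi) % p for xi, yi in zip(x, y)) for t in range(p)}

    return all(any(line(x, y) <= S for x in points) for y in points)


def min_kakeya_size(p, n):
    """Smallest size of a Kakeya set, by exhaustive search (tiny cases only)."""
    points = list(product(range(p), repeat=n))
    for size in range(len(points) + 1):
        for subset in combinations(points, size):
            if is_kakeya(set(subset), p, n):
                return size


for p, n in [(2, 2), (3, 2)]:
    print(p, n, min_kakeya_size(p, n), comb(n + p - 1, n))
```

The search space has $2^{p^n}$ subsets, so anything beyond these two cases would need a smarter approach.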


Exercise C.5.33† (Siegel's Lemma). Let $a = (a_{i,j})$ be an m × n matrix with integer coordinates.
Prove that, if n > m, the system
\[
\sum_{j=1}^{n} a_{i,j} x_j = 0
\]
for i = 1, . . . , m always has a non-zero solution in integers with
\[
\max_i |x_i| \le \Big( n \max_{i,j} |a_{i,j}| \Big)^{\frac{m}{n-m}}.
\]

Solution

Let $M = \max_{i,j} |a_{i,j}|$. Fix a constant N to be chosen later. Suppose that the integers $a_1, \ldots, a_k$
are negative while $a_{k+1}, \ldots, a_n$ are positive. Then, for any $(x_1, \ldots, x_n) \in \{0, 1, \ldots, N\}^n$, we have
\[
N(a_1 + \ldots + a_k) \le a_1 x_1 + \ldots + a_n x_n \le N(a_{k+1} + \ldots + a_n).
\]
Thus, the expression $a_1 x_1 + \ldots + a_n x_n$ can take at most $1 + N(|a_1| + \ldots + |a_n|)$ values.

Now, we return to the problem. Set $A = (a_{i,j})$. We have shown that, when $X \in \{0, 1, \ldots, N\}^n$, each
row of AX can take at most 1 + NnM values. Thus, when X ranges through $\{0, 1, \ldots, N\}^n$, AX takes
at most $(1 + NnM)^m < (1 + N)^m (nM)^m$ values. Since X can take $(1 + N)^n$ values, if
\[
(1 + N)^n > (1 + N)^m (nM)^m \iff (1 + N)^{n-m} > (nM)^m \iff 1 + N > (nM)^{\frac{m}{n-m}}
\]
then one value will be taken twice by the pigeonhole principle, say AX = AY with X ≠ Y. This yields
AZ = 0 for some non-zero $Z = X - Y \in [[-N, N]]^n$ as wanted. It is clear that $N = \lfloor (nM)^{\frac{m}{n-m}} \rfloor$
works and gives us what we want.
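
To get a feeling for the bound, here is a Python sketch that searches for a non-zero integer solution by increasing size and checks that it fits within $\lfloor (nM)^{m/(n-m)} \rfloor$. The example matrix and the function names are assumptions made for illustration, and the matrix is assumed to be non-zero; the search is a naive enumeration, not the pigeonhole argument itself.

```python
from itertools import product


def siegel_bound(A):
    """Largest integer N with N^(n-m) <= (n*M)^m, i.e. floor((nM)^(m/(n-m)))."""
    m, n = len(A), len(A[0])
    M = max(abs(a) for row in A for a in row)
    target = (n * M) ** m
    N = 0
    while (N + 1) ** (n - m) <= target:
        N += 1
    return N


def small_solution(A):
    """First non-zero integer solution of A x = 0 found by increasing max-norm."""
    n = len(A[0])
    for radius in range(1, siegel_bound(A) + 1):
        for x in product(range(-radius, radius + 1), repeat=n):
            if any(x) and all(sum(a * xi for a, xi in zip(row, x)) == 0 for row in A):
                return x
    return None  # cannot happen for a non-zero matrix with n > m, by the lemma


# Hypothetical example with m = 2 equations and n = 3 unknowns.
A = [[1, 2, 3], [4, 5, 6]]
x = small_solution(A)
print(x, siegel_bound(A))
```

In practice the smallest solution is often far below the bound, which is only meant to be uniform in the matrix.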


Exercise C.5.35†. How many invertible n × n matrices with coefficients in $\mathbb{F}_p$ are there? Deduce the number of
(additive) subgroups of cardinality $p^m$ that $(\mathbb{Z}/p\mathbb{Z})^n$ has.

Solution

We proceed inductively to determine the number of tuples of linearly independent vectors of


cardinality k. At first, we can pick any non-zero vector in Fnp , there are thus pn − 1 choices.
Then, we can pick any vector which is not a linear combination of the first one, there are thus
pn −p possible choices. Continuing like that, if we have picked k vectors, their linear combinations
generate pk elements so we have pk elements to avoid and thus pn − pk possibilities for the next
vector. In conclusion, the number of invertible n × n matrices with coefficients in Fp is

(pn − 1)(pn − p) · . . . · (pn − pn−1 ).

For the second part, note that a subgroup of $(\mathbb{Z}/p\mathbb{Z})^n$ is an $\mathbb{F}_p$-vector space, and the fact that
it has $p^m$ elements means that its dimension is m. Thus, we want to count the subspaces of
$\mathbb{F}_p^n$ of dimension m. Here is how we will proceed: we count the number of tuples of m linearly
independent elements, and divide this by the number of tuples which represent a fixed subspace.
We have already computed the first one: it is

(pn − 1) · . . . · (pn − pm−1 ).

We have also determined the second: if we fix a subspace of dimension m, it has

(pm − 1) · . . . · (pm − pm−1 )

bases. We conclude that $(\mathbb{Z}/p\mathbb{Z})^n$ has
\[
\frac{(p^n - 1) \cdots (p^n - p^{m-1})}{(p^m - 1) \cdots (p^m - p^{m-1})}
\]
subgroups of cardinality $p^m$.
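
Both formulas can be checked by brute force for tiny parameters. In the Python sketch below (function names are my own), a matrix is declared invertible when its rows span the whole space, and subspaces are collected as the spans of all m-tuples of vectors that have exactly $p^m$ elements.

```python
from itertools import product
from math import prod


def span(vectors, p, n):
    """Set of all F_p-linear combinations of the given vectors of F_p^n."""
    return frozenset(
        tuple(sum(c * v[i] for c, v in zip(coeffs, vectors)) % p for i in range(n))
        for coeffs in product(range(p), repeat=len(vectors)))


def count_invertible(n, p):
    """Number of n x n matrices over F_p whose rows span F_p^n."""
    points = list(product(range(p), repeat=n))
    return sum(1 for rows in product(points, repeat=n)
               if len(span(rows, p, n)) == p ** n)


def count_subspaces(n, m, p):
    """Number of m-dimensional subspaces of F_p^n, i.e. subgroups of order p^m."""
    points = list(product(range(p), repeat=n))
    subspaces = set()
    for vectors in product(points, repeat=m):
        s = span(vectors, p, n)
        if len(s) == p ** m:  # the m vectors are linearly independent
            subspaces.add(s)
    return len(subspaces)


def gl_formula(n, p):
    return prod(p ** n - p ** k for k in range(n))


def subspace_formula(n, m, p):
    return (prod(p ** n - p ** k for k in range(m))
            // prod(p ** m - p ** k for k in range(m)))


print(count_invertible(2, 3), gl_formula(2, 3))
print(count_subspaces(3, 2, 2), subspace_formula(3, 2, 2))
```

The enumeration has $p^{n^2}$ matrices to go through, so it is only usable for very small n and p.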


Exercise C.5.36†. Let K be a field, and let $S \subseteq K^2$ be a set of n points. Prove that there exists a
non-zero polynomial f ∈ K[X, Y] of degree at most $\sqrt{2n}$ such that f(x, y) = 0 for every (x, y) ∈ S.

Solution

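Write $f = \sum_{i,j} a_{i,j} X^i Y^j$ where the sum is over the i, j such that $i + j \le \sqrt{2n}$. By stars and
bars, there are $\binom{2 + \lfloor \sqrt{2n} \rfloor}{2}$ such pairs: we choose the two values i and i + j + 1 in $[[0, \lfloor \sqrt{2n} \rfloor + 1]]$.
Since
\[
\binom{2 + \lfloor \sqrt{2n} \rfloor}{2} = \frac{(1 + \lfloor \sqrt{2n} \rfloor)(2 + \lfloor \sqrt{2n} \rfloor)}{2} > \frac{\sqrt{2n} \cdot \sqrt{2n}}{2} = n,
\]
we have more unknowns than equations so there is a non-zero solution.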


Exercise C.5.37†. Given an m × n matrix M, we define its row rank as the maximal number of linearly
independent rows of M. Similarly, its column rank is the maximal number of linearly independent
columns of M. Prove that these two numbers are the same, called the rank of M and denoted rank M.
Deduce that M has rank r if and only if all its minors of order r + 1 (i.e. the determinant of an
(r + 1) × (r + 1) submatrix, obtained by removing a chosen set of m − (r + 1) rows and n − (r + 1)
columns) vanish but some minor of order r does not vanish.

Solution

Without loss of generality, by removing some rows if necessary, suppose that all m rows of M
are linearly independent. We will prove that M has at least m linearly independent columns,
which implies that the column rank is greater than or equal to the row rank (if we add rows the
linearly independent columns stay linearly independent). Taking the transpose then yields the
reverse inequality, so they are in fact both equal.
Suppose for the sake of contradiction that M has at most m − 1 linearly independent columns,
say M 1 , . . . , M k . If we consider the m vectors corresponding to the rows of [M 1 , . . . , M k ],
they are linearly dependent by Proposition C.1.2. Now, if we add a column M k+1 which is a
linear combination of M 1 , . . . , M k , the vectors corresponding to the rows of [M 1 , . . . , M k+1 ] stay
linearly dependent. Indeed, if
\[
M^{k+1} = \sum_{i=1}^{k} a_i M^i
\]
and the linear dependence of the rows of $[M^1, \ldots, M^k]$ is
\[
\sum_{i=1}^{m} b_i m_{i,j} = 0
\]
for any j ∈ [k], then
\[
\sum_{i=1}^{m} b_i m_{i,k+1} = \sum_{i=1}^{m} b_i \sum_{j=1}^{k} a_j m_{i,j} = \sum_{j=1}^{k} a_j \sum_{i=1}^{m} b_i m_{i,j} = 0.
\]
In other words, a linear dependence between the rows extends to a linear dependence with the
same coefficients between the rows when we add a linearly dependent column. Continuing like
that shows that, finally, the rows of [M 1 , . . . , M m ] = M are linearly dependent, contradicting
our initial assumption.

Clearly, if M has rank r, every set of r + 1 columns is linearly dependent so every minor of order
r + 1 vanishes. Similarly, if a minor of order r does not vanish, M has at least r columns linearly
independent so its rank is at least r. It remains to prove that if M has r linearly independent
rows, then there is some minor of order r which doesn’t vanish. For this, by considering only
those r rows, we may assume that r = m. Now, by the first part, we know that there is a set
of r columns which are linearly independent. The matrix formed by these r columns is then an
invertible r × r submatrix of M , i.e. which has non-zero determinant.


Exercise C.5.39† (Nakayama's Lemma). Let R be a commutative ring, I an ideal of R, and M a
finitely-generated R-module. Suppose that IM = M, where IM does not mean the set of products
of elements of I and M, but instead the R-module it generates (i.e. the set of linear combinations of
products). Prove that there exists an element r ≡ 1 (mod I) of R such that rM = 0.

Solution

Let $\alpha_1, \ldots, \alpha_n$ be generators of M. We have a system of equations as follows:
\[
\alpha_i = \sum_{j=1}^{n} \beta_{i,j} \alpha_j
\]
for i = 1, . . . , n and $\beta_{i,j} \in I$. Let $B = (\beta_{i,j})$ and $A = (\alpha_i)$ (as a column vector), so that BA = A,
i.e. $(I_n - B)A = 0$. We claim that $r = \det(I_n - B)$ works. First, note that r ≡ 1 (mod I) since
$I_n - B \equiv I_n \pmod{I}$. By Proposition C.3.7, we have

rIn = adj(In − B)(In − B)

so, after right-multiplying by A, we get rA = 0. This implies that rM = 0 as wanted.




Exercise C.5.40† (Homogeneous Linear Differential Equations).

a) Let K be an algebraically closed field of characteristic 0. Given elements $a_0, \ldots, a_n \in K$ with
$a_0, a_n \neq 0$, solve the linear differential equation of order n
\[
\sum_{i=0}^{n} a_i f^{(i)} = 0
\]
over formal power series f ∈ K[[X]].

b) Prove the Taylor formula with integral remainder: given an n + 1 times differentiable function
f : R → R and a real number a ∈ R, prove that for any x ∈ R,
\[
f(x) = \sum_{k=0}^{n} \frac{(x-a)^k}{k!} f^{(k)}(a) + \int_a^x \frac{(x-t)^n}{n!} f^{(n+1)}(t)\, dt.
\]
Deduce, given n + 1 real numbers $a_0, \ldots, a_n \in \mathbb{R}$ with $a_n \neq 0$, the set of n times differentiable
functions f : R → R such that
\[
\sum_{i=0}^{n} a_i f^{(i)} = 0.
\]

Solution

The proof will be very similar to the one of Theorem C.4.1: first we find a family of n linearly
independent solutions, and then we show that any other solution is a linear combination of
these ones. Set $\rho = \sum_{i=0}^{n} a_i X^i$ the characteristic polynomial of the equation. Our solutions are
$f = X^k \exp(\alpha X)$ where α is a root of ρ and k is less than the multiplicity of α. To see why this
is true, notice that, for any f, g, we have
\[
(fg)^{(m)} = \sum_{i=0}^{m} \binom{m}{i} f^{(i)} g^{(m-i)}.
\]
In particular,
\[
(X^k \exp(\alpha X))^{(m)} = \exp(\alpha X) \sum_{i=0}^{k} \binom{m}{i} k(k-1) \cdots (k-(i-1)) X^{k-i} \alpha^{m-i}
\]
so that
\[
\begin{aligned}
\sum_{m=0}^{n} a_m f^{(m)} &= \exp(\alpha X) \sum_{m=0}^{n} a_m \sum_{i=0}^{k} \binom{m}{i} k(k-1) \cdots (k-(i-1)) X^{k-i} \alpha^{m-i} \\
&= \exp(\alpha X) \sum_{i=0}^{k} k(k-1) \cdots (k-(i-1)) X^{k-i} \sum_{m=0}^{n} a_m \binom{m}{i} \alpha^{m-i} \\
&= \exp(\alpha X) \sum_{i=0}^{k} \binom{k}{i} X^{k-i} \rho^{(i)}(\alpha) \\
&= 0,
\end{aligned}
\]
since $\rho^{(i)}(\alpha) = 0$ for every i less than the multiplicity of α.

To prove that any solution f is a linear combination of our solutions $f_1, \ldots, f_n$, we show that we
can find coefficients $b_1, \ldots, b_n$ such that
\[
g = f - \sum_{i=1}^{n} b_i f_i
\]
is divisible by $X^n$, i.e. its first n coefficients are all zero. Note that g is still a solution to the
differential equation; we wish to prove that g = 0. Suppose otherwise, and set $g = X^m h$ with
$m = v_X(g) \ge n$. Then, if we look at the equation
\[
\sum_{i=0}^{n} a_i g^{(i)} = 0
\]
modulo $X^{m-n+1}$, we get
\[
a_n m(m-1) \cdots (m-(n-1)) X^{m-n} h \equiv 0 \pmod{X^{m-n+1}},
\]
i.e. X | h. This contradicts the fact that $m = v_X(g)$.

It only remains to prove the existence of such $b_i$, i.e. that our solutions $f_1, \ldots, f_n$ are linearly
independent modulo $X^n$ since we have a system of n equations and n unknowns. It suffices to
show that the vectors
\[
(0!\,[X^0] f_i,\ 1!\,[X^1] f_i,\ \ldots,\ (n-1)!\,[X^{n-1}] f_i),
\]
where $[X^m] f$ denotes the mth coefficient of f, are linearly independent (e.g. by homogeneity of
the determinant). However, note that this vector for $f_i = X^k \exp(\alpha X)$, k ≤ n, is simply
\[
V_i = (u_{0,i}, u_{1,i}, \ldots, u_{n-1,i}) / \alpha^k
\]
where
\[
u_{m,i} = \alpha^m m(m-1) \cdots (m-(k-1))
\]
so that the linear independence of our vectors follows from the linear independence of our solutions
to the linear recurrence with characteristic polynomial ρ. Indeed, Theorem C.4.1 tells us that
these $(u_{m,i})_{m \ge 0}$ are linearly independent and uniquely determined by their first n values, which
means that our $V_i$ are linearly independent as well.
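Now we turn to part b. Integrating by parts yields
\[
\begin{aligned}
\int_a^x \frac{(x-t)^n}{n!} f^{(n+1)}(t)\, dt &= \left[ \frac{(x-t)^n}{n!} f^{(n)}(t) \right]_a^x - \int_a^x \frac{-(x-t)^{n-1}}{(n-1)!} f^{(n)}(t)\, dt \\
&= -\frac{(x-a)^n}{n!} f^{(n)}(a) + \int_a^x \frac{(x-t)^{n-1}}{(n-1)!} f^{(n)}(t)\, dt
\end{aligned}
\]
so that, by induction,
\[
\int_a^x \frac{(x-t)^n}{n!} f^{(n+1)}(t)\, dt = -\sum_{k=1}^{n} \frac{(x-a)^k}{k!} f^{(k)}(a) + \int_a^x f'(t)\, dt = f(x) - \sum_{k=0}^{n} \frac{(x-a)^k}{k!} f^{(k)}(a)
\]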

as claimed. Now, we consider the equation
\[
\sum_{i=0}^{n} a_i f^{(i)} = 0.
\]
Note that any function satisfying this equation is automatically infinitely many times
differentiable, by induction, as $\sum_{i=0}^{n} a_i f^{(i+k)} = 0$ implies that $f^{(n+k)}$ is differentiable, i.e. that
f is n + k + 1 times differentiable. Unfortunately for us, not every real smooth function is analytic
(however this is true for complex functions!), so we cannot simply use our solution to part a.
We shall make use of Taylor's formula. The idea is that part a gives us n complex linearly
independent solutions (as we showed), which we can transform into real solutions by considering
$x \mapsto x^k (\exp(\alpha x) + \exp(\overline{\alpha} x))$ and $x \mapsto x^k (\exp(\alpha x) - \exp(\overline{\alpha} x))/i$ instead of $x \mapsto x^k \exp(\alpha x)$
and $x \mapsto x^k \exp(\overline{\alpha} x)$ when α ∉ R. These stay linearly independent over C, and are thus linearly
independent over R as well. In particular, by Exercise C.5.10†, the Wronskian determinant of our
family of n solutions $W(f_1, \ldots, f_n)$ is non-zero as a power series, which means that we can find
an a such that $W(f_1, \ldots, f_n)(a) \neq 0$. This means that, given any solution f to our differential
equation, we can find a solution $(b_1, \ldots, b_n) \in \mathbb{R}^n$ to the system
\[
f^{(k)}(a) = \sum_{i=1}^{n} b_i f_i^{(k)}(a)
\]
for k = 0, 1, . . . , n − 1. If we set
\[
g = f - \sum_{i=1}^{n} b_i f_i,
\]
then g is still a solution to our equation, and $g(a) = \ldots = g^{(n-1)}(a) = 0$. By induction,
every derivative of g vanishes at a. This gives us, for any x ∈ R and m ∈ N,
\[
g(x) = \int_a^x \frac{(x-t)^m}{m!} g^{(m+1)}(t)\, dt
\]
by Taylor's formula. Fix x ∈ R. We wish to bound $g^{(m+1)}(t)$ for t ∈ [a, x] (here we use this
to mean [x, a] when a > x) by something not too large, so that this integral goes to 0 when
m → +∞ (the function gets eaten by the factorial). This is not too hard to obtain: by the
extreme value theorem 8.6.9†, the first n derivatives $g, g', \ldots, g^{(n-1)}$ of g are bounded on [a, x], say by C. Then,
for t ∈ [a, x], we have
\[
|g^{(n)}(t)| = \left| \frac{1}{a_n} \sum_{i=0}^{n-1} a_i g^{(i)}(t) \right| \le AC
\]
where $A = 1 + \sum_{i=0}^{n-1} |a_i / a_n|$ (the +1 is just there to make sure that A ≥ 1), by the triangle
inequality. Thus, a straightforward induction shows that, for any m ∈ N,
\[
|g^{(m)}(t)| \le C A^m
\]
for all t ∈ [a, x]. Finally,
\[
|g(x)| = \left| \int_a^x \frac{(x-t)^m}{m!} g^{(m+1)}(t)\, dt \right| \le \left| \int_a^x \frac{C A^{m+1} |x-a|^m}{m!}\, dt \right| = CA|x-a| \frac{(A|x-a|)^m}{m!} \to 0
\]
as m → +∞. Thus, g = 0 and f is a linear combination of $f_1, \ldots, f_n$ as desired.




Remark C.5.4
We are in fact also able to solve the first part over any field K of characteristic 0. Con-
sider the solutions $f_1 = X^k \exp(\alpha_1 X), \ldots, f_m = X^k \exp(\alpha_m X)$ over the algebraic closure $\overline{K}$ of K, with
$\alpha_1, \ldots, \alpha_m \in \overline{K}$ conjugate over K. Then, the fundamental theorem of symmetric polynomials
tells us that, for i = 0, 1, . . . , n − 1,
\[
g_i = \sum_{j=1}^{m} \alpha_j^i f_j \in K[[X]].
\]
Moreover, by the Vandermonde determinant, these are still linearly independent over K modulo
$X^n$. Hence, we can convert our solutions in $\overline{K}[[X]]$ to solutions in K[[X]] and still be able to
conclude that they generate all others (over K), because they are still linearly independent modulo
$X^n$ and so our previous argument equally applies to this case.

As an example, when K = R, we have
\[
g_1 = f + \overline{f} = 2\Re(f)
\]
and
\[
g_2 = \alpha f + \overline{\alpha} \overline{f} = 2(\Re(\alpha)\Re(f) - \Im(\alpha)\Im(f)).
\]
Further Reading

Here are some books I like8 . As said in the foreword, I particularly recommend Andreescu-Dospinescu
[1, 2], Ireland-Rosen [15], and Murty [27].

For classical algebraic number theory, I suggest Esmonde-Murty [27].

For p-adic analysis, I wholeheartedly recommend the addendum 3B of SFTB first, and then the
excellent book by Cassels [8], which, although a bit old9 , is full of number-theoretic applications like
the one in Section 8.5. Robert [33] is also very good, but focuses a lot more on analysis than on number
theory, and assumes a fair amount of topology.10 Also, Borevich-Shafarevich [7] has a great proof of
Thue’s theorem 7.4.3 using local methods (p-adic methods).

The second half of chapter 9 on quadratic forms is largely inspired by Cox [cox]. This book
motivates quadratic forms, class field theory (an advanced topic in algebraic number theory) and some
elliptic curves theory through the question of finding which primes have the form x2 + ny 2 for a chosen
n.

The elementary theory of elliptic curves is also of similar flavour to the topics of the present book,
see Silverman-Tate [42] for a wonderful introduction.

For more on polynomials, see Prasolov [31] (things like Appendix A) and Shafarevich [39] for an
introduction to algebraic geometry. See also Rosen [34] for number theory in function fields, i.e.
algebraic number theory but with polynomials over Fq instead of rational integers!

For abstract algebra (including Galois theory), I recommend Lang [21]. For even more advanced
algebra, see his other book [18] and Shafarevich [40].

For linear algebra, see Lang [20] and Shafarevich-Remizov [41]. For applications of linear algebra to
combinatorics, see chapter 12 of PFTB [1], which assumes approximately Appendix C as background,
and Stanley [43], which assumes approximately Lang’s book as background. Finally, even if we have
discussed almost no analysis in this book, it nonetheless remains a very beautiful and fundamental
topic. See Rudin [35] and Lang [22] for an introduction to real analysis, and Rudin [36] and Lang [19]
for their (more advanced) complex analysis counterparts (Rudin also has real analysis). See Murty
[26] for an introduction to analytic number theory (which assumes knowledge of complex analysis),
and Murty-Rath [28] for transcendental number theory.

8 Disclaimer: I haven't finished reading most of these books, nor do I know the material covered there very well. Use
at your own risk, but I liked what I read from them.

9 It's not typeset in LaTeX!

10 The first chapter is particularly topologically heavy but it gets better afterwards. I would suggest to skip it at first
since it just defines the p-adic numbers, and have a look at the other chapters if you're interested in the analytical theory.

Further Reading

[1] T. Andreescu and G. Dospinescu. Problems from the Book. 2nd ed. XYZ Press, 2010.
[2] T. Andreescu and G. Dospinescu. Straight from the Book. XYZ Press, 2012.
[7] Z. I. Borevich and I. R. Shafarevich. Number Theory. Academic Press, 1964.
[8] J. W. S. Cassels. Local fields. Cambridge University Press, 1986.
[15] S. Ireland and M. Rosen. A Classical Introduction to Modern Number Theory. 2nd ed. Vol. 84.
Graduate Texts in Mathematics. Springer-Verlag, 1990.
[18] S. Lang. Algebra. 5th ed. Vol. 211. Graduate Texts in Mathematics. Springer-Verlag, 2002.
[19] S. Lang. Complex Analysis. 4th ed. Vol. 103. Graduate Texts in Mathematics. Springer-Verlag,
1999.
[20] S. Lang. Linear Algebra. 3rd ed. Undergraduate Texts in Mathematics. Springer-Verlag, 1987.
[21] S. Lang. Undergraduate Algebra. 3rd ed. Undergraduate Texts in Mathematics. Springer-Verlag,
1987.
[22] S. Lang. Undergraduate Analysis. 2nd ed. Undergraduate Texts in Mathematics. Springer-Verlag,
1997.
[26] M. R. Murty. Problems in Analytic Number Theory. 2nd ed. Vol. 206. Graduate Texts in Math-
ematics. Springer-Verlag, 2008.
[27] M. R. Murty and J. Esmonde. Problems in Algebraic Number Theory. 2nd ed. Vol. 190. Graduate
Texts in Mathematics. Springer-Verlag, 2005.
[28] M. R. Murty and P. Rath. Transcendental Numbers. Springer-Verlag, 2014.
[31] V. V. Prasolov. Polynomials. Vol. 11. Algorithms and Computation in Mathematics. Springer-
Verlag, 2004.
[33] A. M. Robert. A Course in p-adic Analysis. Vol. 198. Graduate Texts in Mathematics. Springer-
Verlag, 2000.
[34] M. Rosen. Number Theory in Function Fields. Vol. 210. Graduate Texts in Mathematics. Springer-
Verlag, 2002.
[35] W. Rudin. Principles of Mathematical Analysis. 3rd ed. International Series in Pure and Applied
Mathematics. McGraw-Hill, 1976.
[36] W. Rudin. Real and Complex Analysis. 3rd ed. McGraw-Hill, 1987.
[39] I. R. Shafarevich. Basic Algebraic Geometry. 3rd ed. Springer-Verlag, 2013.
[40] I. R. Shafarevich. Basic Notions of Algebra. Vol. 11. Encyclopaedia of Mathematical Sciences.
Springer-Verlag, 2005.
[41] I. R. Shafarevich and A. O. Remizov. Linear Algebra and Geometry. Springer-Verlag, 2013.
[42] J. H. Silverman and J. T. Tate. Rational Points on Elliptic Curves. 2nd ed. Undergraduate Texts
in Mathematics. Springer-Verlag, 2015.
[43] R. Stanley. Algebraic Combinatorics. 2nd ed. Undergraduate Texts in Mathematics. Springer-
Verlag, 2018.

Bibliography

[3] G. M. Bergman. Luroth’s Theorem and some related results, developed as a series of exercises.
url: https://math.berkeley.edu/~gbergman/grad.hndts/. (accessed: 26.09.2021).
[4] Y. Bilu, Y. Bugeaud, and M. Mignotte. The Problem of Catalan. Springer-Verlag, 2014.
[6] A. B. Block. The Skolem-Mahler-Lech Theorem. url: http://www.columbia.edu/~abb2190/Skolem-Mahler-Lech.pdf. (accessed: 26.09.2021).
[9] E. Chen. A trailer for p-adic analysis, first half: USA TST 2003. url: https://blog.evanchen.cc/2018/10/10/a-trailer-for-p-adic-analysis-first-half-usa-tst-2003/. (accessed: 26.09.2021).
[10] E. Chen. Napkin. url: https://web.evanchen.cc/napkin.html. (accessed: 26.09.2021).
[11] E. Chen. Number Theory Constructions. url: https://web.evanchen.cc/static/otis-samples/DNY-ntconstruct.pdf. (accessed: 26.09.2021).
[12] K. Conrad. Kummer's lemma. url: https://kconrad.math.uconn.edu/blurbs/gradnumthy/kummer.pdf. (accessed: 26.09.2021).
[13] K. Conrad. Separability. url: https://kconrad.math.uconn.edu/blurbs/galoistheory/separable1.pdf. (accessed: 14.11.2021).
[14] D. Djukić. Pell’s Equations. url: https://www.imomath.com/index.php?options=615&lmm=0.
(accessed: 26.09.2021).
[16] A. Khurmi. Modern Olympiad Number Theory.
[17] M. Klazar. Størmer's solution of the unit equation x − y = 1. url: https://kam.mff.cuni.cz/~klazar/stormer.pdf. (accessed: 26.09.2021).
[23] P-S. Loh. Algebraic Methods in Combinatorics. url: https://www.math.cmu.edu/~ploh/docs/math/mop2009/alg-comb.pdf. (accessed: 26.09.2021).
[24] D. Masser. Auxiliary Polynomials in Number Theory. Cambridge Tracts in Mathematics. Cam-
bridge University Press, 2016.
[25] J. S. Milne. Algebraic Number Theory. url: https://www.jmilne.org/math/CourseNotes/ant.html. (accessed: 26.09.2021).
[29] M. R. Murty and N. Thain. “Primes in Certain Arithmetic Progressions”. In: Functiones et
Approximatio 35 (2006), pp. 249–259. doi: 10.7169/facm/1229442627.
[30] P. Pollack. Thue's lemma in Z[i] and Lagrange's four-square theorem. url: http://pollack.uga.edu/lagrangethue.pdf. (accessed: 29.11.2021).
[32] P. Ribenboim. 13 Lectures on Fermat’s Last Theorem. Springer-Verlag, 1979.
[37] A. Schinzel. “An Extension of the Theorem on Primitive Divisors in Algebraic Number Fields”.
In: Mathematics of Computation 61 (203 1993), pp. 441–444. doi: 10.2307/2152966.
[38] A. Schinzel. “On Primitive Prime Factors of an − bn ”. In: Mathematical Proceedings of the Cam-
bridge Philosophical Society 58 (4 1962), pp. 556–. doi: 10.1017/s0305004100040561.
[44] C. L. Stewart. “On the greatest prime factor of terms of a linear recurrence sequence”. In: Rocky
Mountain Journal of Mathematics 35 (2 1985), pp. 599–608. doi: 10.1216/RMJ-1985-15-2-599.
[45] R. Thangadurai. On the Coefficients of Cyclotomic Polynomials. url: https://www.bprim.org/sites/default/files/th.pdf. (accessed: 26.09.2021).


[46] N. Tosel. Cours de théorie de Galois.


[47] N. Tsopanidis. “The Hurwitz and Lipschitz Integers and Some Applications”. PhD thesis. Facul-
dade De Ciências da Universidade do Porto, 2020.
[48] S. Weintraub. Galois Theory. 2nd ed. Universitext. Springer-Verlag, 2009.
Index

Symbols fixed point theorem 145, 383


5/8 theorem 165, 418 space 145, 382
basis 95, 178
A canonical 186, 196
abelian changes of bases 183
field extension 106, 162, 331 integral 113, 352
group 102, 103, 108, 162, 165, 419 transcendence 175, 439
absolute convergence 281 basis of open sets 145, 384
action binomial
of a group 77, 165, 305, 421, 441 coefficient 76, 84, 90, 303
algebra expansion 52, 59, 82, 121, 133, 310
closure 155, 174, 203, 436, 458 series 132
fundamental theorem of 155, 172 binomial expansion 264, 270, 465
algebraic BMO 1 126, 359
closure 66, 94, 257 Bolzano-Weierstrass theorem 364
field extension 98 Brazil
independence 175, 438 MO 55, 68, 146, 268, 393
integer 14
number 14
alternating map 196
C
Capelli 111, 339
AMM 22, 24
Carmichael’s theorem 76, 303
analytic function 140, 472
Cauchy
anti-derivative 223
equation 98, 182, 325
APMO 83, 154
arc sequence 132
length 223 theorem 109
Archimedean 132, 145, 386 Cauchy-Mirimanoff polynomials 416
arithmetic function 73, 290 Cauchy-Schwarz 415
multiplicative 74, 248, 290 Cauchy-Schwarz inequality 24
Artin 111, 335 Cayley
Artin-Schreier theorem 112, 341 theorem 111, 336
associate 30 Cayley-Hamilton theorem 204, 459
left or right 39 center 418
associative 38, 148, 183, 235 centraliser 306, 418
automorphism 29, 102, 226 characteristic 153
axiom of choice 64, 144, 181 of a ring 58, 59, 71, 97, 136, 149, 158, 170,
181, 199, 283
B polynomial
Bézout of a linear recurrence 142, 199
domain 31, 87, 120 of a matrix 203, 458
left or right 39 Chebotarev density theorem 75, 297
lemma 30, 151, 274, 358, 390, 405, 430 Chebyshev polynomial 68, 273
theorem 175, 437 Chevalley-Warning Theorem 75, 295
Bézout’s lemma 280 China
Banach MO 42


TST 74, 90, 147, 164, 166, 292, 313, 314, field 26, 56, 102–104, 220, 273
399, 425 quadratic subfield 113, 349
Chinese remainder theorem 55, 84, 157, 248, units 56, 274
259, 269, 272, 347 polynomial 23, 35, 44, 67, 76, 81, 232, 302
Chinese remainder theorem 319 ring of integers 56, 273
circulant determinant 202, 353, 455
class equation (of a group action) 77, 306 D
class number d’Alembert-Gauss theorem 172
of a number field 56, 275 Dedekind 113, 346
closed set 144, 380 lemma 111, 336
closure zeta function 243
algebraic 66, 94, 257 degree
Galois 102 of a field extension 94
integral 26, 191, 217, 447 of a polynomial 149
column operations 189 of an algebraic number 16
comatrix 198 density 133, 144, 382
commutative Zariski 460
group 162 derangements 194
operation 38, 148, 183 derivative 152, 182
discrete 163, 411
ring 18, 59, 159, 198
determinant 186, 187
compact
circulant 202, 353, 455
set 144, 380
norm 197
compact set 140, 144, 381
resultant 174, 434
compass and straightedge 112, 342
Vandermonde 192
complete homogeneous 173
diagonalisation 204, 461
completion 132
diffeomorphism 223
complex field
differential equation 206, 470
cubic 122
dimension 148, 179
quadratic 115
finite 179
composite field 105, 330
transcendence 175, 439
compositum 105, 330 Dirichlet
congruence 16 theorem 296
conjugate Dirichlet
complex 15, 17 approximation theorem 116
of an algebraic number 17 convolution 73, 290
quadratic 29 L-function 243
quaternion 38 series 73, 290
connected set 140 theorem 91, 319
constructible theorem (on arithmetic progressions) 50,
number 112 69, 90, 91, 321
content (of a polynomial) 79 unit theorem 125, 126, 364
continuity 383 discrete 380
contraction 145, 383 discrete derivative 163, 411
convergence 23 discriminant 75, 114, 294
absolute 281 discriminant 21
p-adic 131 distance 131, 144, 379
convex hull 167, 427 distributivity 148, 183, 445
coset 102, 328, 334 divisibility 16, 31
cosine 24–26, 48, 57, 68, 214, 215, 219 left or right 39
law 214 of polynomials 150
quadratic 48, 261 domain 160, 204, 460
rational 15 Bézout 31
cyclic Euclidean 32
field extension 111, 337 integral 31, 160, 228
group 112, 162, 343 principal ideal 43, 166, 423
cyclotomic unique factorisation 30, 174, 434

E field extension 94
effective 122, 140 abelian 106, 162, 331
Ehrenfeucht’s criterion 167, 429 algebraic 98
eigenvalues 203, 458 cyclic 111, 337
Eisenstein finite 63, 96
criterion 81, 96 Galois 101
integers 35 separable 98
ELMO 127, 366 solvable
SL 92 radicals 112, 343
embedding 129 real radicals 112, 345
complex 123 tower 95, 101
of a field extension 98 finite
real 123 field 399
endomorphism 204, 461 finite field 58, 97, 104, 140, 181
cyclic 205, 462 fixed field 103
equivalence of norms 145, 388 Fleck’s congruences 57, 276
equivalence relation 39, 238 formal 129, 149, 173
Euclid power series 149, 171
algorithm 315 France
lemma 30 TST 54, 265
Euclidean Frobenius
algorithm 97, 150 morphism 399
division 17, 32, 120, 150, 160 Frobenius morphism 48, 59, 73, 82, 106, 259
domain 32, 78, 87, 142 fundamental theorem
function 32, 39 of algebra 155, 172
left or right 39 of finitely generated abelian groups 165,
Euler 42, 243 419
criterion 70 of Galois theory 103
exact sequence 165, 422 of symmetric polynomials 169
extremal value theorem 144, 380 symmetric polynomials 19
fundamental unit 115
F
Fermat 96 G
last theorem 35, 42, 56, 243, 275 Galois
for polynomials 166, 425 closure 102
little correspondence 103
theorem 422 field extension 101
little theorem 61, 107, 108, 150, 152, 154, fundamental theorem 103
155, 301, 415 group 75, 102, 297
two square theorem 35 inverse problem 111, 336
Fibonacci sequence 43, 55, 74, 76, 127, 252, Galois theory 261
292, 303, 366 Gauss
field 148, 159 formula 113, 349
algebraically closed 155, 174, 203, 436, integers 33
458 primes 34
complex lemma 79
cubic 122 sum 72, 106
quadratic 115 gcd 150
extension 94 Gelfond-Schneider theorem 132
unramified 146 generating functions 171
finite 58, 97, 104, 140, 181, 399 generator 49, 97, 108, 111, 191, 262, 337
fixed 103 global 140
of fractions 160 field 132
real Grassmann’s Formula 201, 453
quadratic 115 greatest common divisor 31
totally 123 left or right 39
residue 146 group 50, 55, 102, 160, 162, 262, 269

abelian 103, 108, 162 TST 55, 90–92, 319


action 77, 165, 305, 421, 441 Ireland 24, 212
commutative 162 irrationality measure 125
cyclic 112, 162, 343 irreducible
of units 61 element 31
quotient 164, 416 polynomial 156
simple 344 ISL 114
solvable 112, 343 isomorphism 58, 63, 111, 161, 283
symmetric 162 IZHO 25

H J
Hadamard quotient theorem 74, 293 Jacobi
height 338 four square theorem 43, 249
Hensel’s lemma 84, 130, 375 reciprocity 75, 298
Hermite symbol 75, 297, 298
matrix 205 Japan
theorem 26, 221 MO 54
Hilbert
basis theorem 166, 424 K
Theorem 90 111, 337 Kakeya
homogeneous 29, 45, 125, 173, 186, 256 conjecture 205, 467
Hurwitz set 205, 467
integers 38 kernel 163, 184, 326
hyperplane 202, 454 Kobayashi’s theorem 122
Korea
I MO 57, 74, 277
ideal 159, 180, 206, 254, 280, 430, 470 winter program 57, 276
maximal 64, 166, 423 Krasner’s lemma 147, 402
prime 221 Kronecker
principal 159 delta 464
ideal factorisation 304 Kronecker’s theorem 26, 219
image 163, 184 Kronecker-Weber theorem 103
IMC 25, 57, 167, 280, 429 Kummer 56, 275
IMO 43, 55, 90, 253, 267 lemma 56, 275
SL 50, 55, 74, 86, 90–92, 126, 266, 294,
319, 322, 359 L
inclusion-exclusion principle 448 Lüroth’s theorem 111, 337
inductive limit 66 Lagrange 175, 441
inert prime 34, 233 four-square theorem 40
integer interpolation 156, 182, 192
algebraic 14 theorem 109, 165, 418
Eisenstein 35 lattice 364
Gaussian 33 Laurent series 203, 455
Hurwitz 38 Legendre
p-adic 128 formula 135
quadratic 18, 209 symbol 71, 297
rational 15 lifting the exponent 152
integral basis 113, 352 lifting the exponent lemma 51, 55, 319, 395
integral closure 26, 191, 217, 447 limit
integral domain 31, 58, 79, 93, 128, 160, 228 inductive 66
integral element 355 projective 128
integration by parts 221 Lindemann-Weierstrass 26, 221
interior 144, 382 linear
intermediate value theorem 172, 264, 428 independence 178
inverse Galois problem 111, 336 map 181, 182
Iran multi 187
MO 57, 91, 175, 279, 442 recurrence 199

order 199 Newton’s formulas 23, 170


transformation 181 nilpotent 204, 459
linear recurrence 60, 62, 74, 76, 292, 303 Noether’s Lemma 202
Liouville’s theorem 127, 365 Noetherian 166, 423
local 140 norm 96
field 132 absolute 29
local-global 20 determinant 197
localisation 129 Euclidean 32, 33, 43
Lucas of a field extension 101
formula 113, 349 quadratic 29
sequence 43 quaternion 38
theorem 76, 303 norm (on a vector space) 145, 387
normal subgroup 164, 416
M number field 96
Möbius Function 56, 74, 290
Mahler’s theorem 143, 376 O
Mann 113, 348 open set 144, 380
Mason-Stothers theorem 166, 425 order
matrix group 109
adjacency 185 maximal 38
adjugate 197 of a group 165, 418
block 202 of a linear recurrence 199
block-diagonal 202, 454, 461 of a projective plane 77, 307
block-triangular 204, 459 Ostrowski 145, 146, 387, 391
change of bases 184
comatrix 198 P
companion 463 absolute value 130
Hermitian 205 absolute value 130
identity 183 convergence 131
multiplication 183 exponential 143, 377
transpose 185 integer 128
upper triangular 189 logarithm 143, 377
mean value theorem 143, 378 number 129
Mersenne unit 129
prime 278 valuation 129
Mersenne sequence 91, 315 partial fractions decomposition 200, 302
metric space 144, 379 partially ordered sets 295
Miklós Schweitzer 24, 75 Pell’s equation 115
minimal polynomial 16 Pell-type 119
ponctual 204, 461 permanent 204, 459
minor 206, 469 permutation
module 177, 190 even 195
monic odd 195
polynomial 149 transposition 195
monoid 148, 404 PFTB 23, 55, 266
morphism 161 pigeonhole principle 116, 117, 316, 330, 379,
multilinear 187 388, 424, 425, 439, 468
multilinear map 196 Poland
multiplicative 71 MO 89
multiplicity polynomial 149
of a root 152 Chebyshev 68, 273
constant coefficient 149
N cyclotomic 44
Nagell 111, 142, 334 divisibility 150
Nakayama’s lemma 206, 470 elementary symmetric 18
Newton irreducible 156
method 390 leading coefficient 149

monic 149 R
power sum 170 Rabinowitsch’s trick 437
primitive 78 Ramanujan 142
root 151, 152 ramification 146
symmetric 18 ramified prime 34, 233
power mean inequality 212, 391 rank 184, 206, 445, 469
power series of a linear map 184
formal 149 of an abelian group 165, 419
pre-periodic points 91, 318 rank-nullity theorem 184
primary rational function 157
Hurwitz integer 42, 246 rational root theorem 15, 208
prime real field
divisors of a polynomial 82 quadratic 115
element 30, 31 totally real 123
Gaussian 34 Redei 111, 339
residue field 146
ideal 59, 129, 221
resultant 174, 434
inert 34, 233
Riemann
ramified 34, 233
zeta function 243, 268
rational 31
ring 158
split 34, 233
of integers 28, 59, 96
primitive
commutative 18, 59, 159, 198
root (of finite fields) 373 local 146
element 97, 114 multiplicative group 61
Hurwitz integer 42, 245 Noetherian 166, 423
polynomial 78 of integers 127
prime factor 51, 91, 315 p-adic 146
root (modulo n) 55, 70, 269 RMM 126, 360
root (of finite fields) 49, 70, 162, 262 SL 90, 313
root of unity 17, 44, 67, 96, 102, 122, 137, Romania
164, 202, 413, 455 TST 90, 314
primitive element theorem 94, 97, 110, 114 roots of unity filter 164, 413
primorial 316 Russia
principal ideal 159 All-Russian Olympiad 163
principal ideal domain 43, 159, 166, 408, 423
principle of isolated zeros 140 S
projective limit 128 scalars 177
projective plane 77, 306 separable space 145, 384
series 131
Siegel’s Lemma 205, 467
Q signature 194
quadratic
simple group 344
complex field 115 sine law 214
conjugate 29 skew field 38, 160, 237
cosine 48, 261 Skolem-Mahler-Lech theorem 124, 137, 147,
field 27, 94, 102 401
integer 18, 209 smooth arc 223
norm 29 smooth function 472
number 18, 209 solvability
real field 115 by radicals 112, 343
reciprocity 71, 106 by real radicals 112, 345
residue 70, 87, 310 Sophie Germain’s identity 339
unit 115 Sophie-Germain
quaternion prime 56, 272
conjugate 38 theorem 56, 272
norm 38 split
numbers 37 polynomial 58, 109

prime 34, 233 U


splitting field 64, 75, 112, 113, 297, 345, 346 unique factorisation domain 30, 81, 108, 174,
squarefree 56, 74, 290 434
Størmer’s theorem 120, 127, 367 unit 26, 48, 55, 115, 218, 269
stars and bars 467 circle 166, 426
Strassmann’s theorem 141 complex cubic 123
Sturm’s theorem 167, 428 complex quadratic 115
subgroup 103 Dirichlet theorem 125, 126, 364
normal 106, 331 fundamental 115, 123
sum-free set 25, 216 group 61
symmetric of a ring 30, 162
group 162 of cyclotomic fields 56, 274
polynomial 18, 168 p-adic 129
complete homogeneous 173 real quadratic 116
elementary 18, 168 S 120, 125, 127, 367
USA
fundamental theorem of 19, 169
MO 24, 76, 90, 113, 167, 211, 314
power sum 170
TST 25, 54, 75, 86, 91, 92, 101, 134, 143,
163, 205, 216, 264, 295, 316, 411, 466
T TSTST 77
Taiwan USEMO 164, 414
TST 146
Taylor V
formula 85, 206, 470 Vahlen 111, 339
series 281 Vandermonde 143, 376
Taylor’s formula 390 determinant 192
Teichmüller character 399 Vandermonde determinant 466
tensor product 64 vector space 163, 177
TFJM 113, 354 dimension 94
Thue’s equation 122 vectors 177
Thue-Siegel-Roth theorem 125 Vieta’s formulas 18, 19, 154, 173
topology 144, 380
torsion 165, 419
W
Wedderburn’s theorem 77, 306
totally bounded space 145, 384
Wilson’s theorem 154, 322
totally disconnected 141
Wronskian determinant 203, 455
totally real 123
trace 206 Z
of a field extension 206 Zariski
transcendence topology 460
basis 175, 439 Zariski density 459, 463
degree 175, 439 Zeev Dvir 205, 467
transcendence basis 181 zeta function
transcendental number 14, 99 Dedekind 243
transposition 195 Riemann 243, 268
Tuymaada 91, 316 Zorn’s lemma 64, 181
Tuymaada 76, 301 Zsigmondy’s theorem 51, 76, 303
