Number Theory
Ben Lynn
Contents
1 Number Theory
1.1 Overview
1.2 Bonus Material
2 Modular Arithmetic
2.1 Division
2.2 Inverses
3 Euclid’s Algorithm
3.1 Extended Euclidean Algorithm
3.2 The General Solution
4 Division
4.1 Computing Inverses
6 Roots of Polynomials
6.1 Composite Moduli
6.2 Wilson’s Theorem
8 Modular Exponentiation
8.1 The Discrete Log Problem
8.2 Nonunits
10 Primality Tests
10.1 The Fermat Test
10.2 The Miller-Rabin Test
11 Generators
12 Cyclic Groups
12.1 Lagrange’s Theorem
12.2 Subgroups of Cyclic Groups
12.3 Counting Generators
12.4 Group Structure
13 Quadratic Residues
13.1 The Legendre Symbol
14 Gauss’ Lemma
15 Quadratic Reciprocity
15.1 The Jacobi Symbol
16 Carmichael Numbers
16.1 Solovay-Strassen Test
17 Multiplicative Functions
17.1 Perfect Numbers
17.2 Fermat Numbers
18 Möbius Inversion
20 Cyclotomic Equations
21 The Heptadecagon
21.1 A Magic Solution
23 Gaussian Periods
23.1 A Loose End
24 Roots of Unity
1 Number Theory
I’m taking a loose informal approach, since that was how I learned. Once you have a good feel for this topic, it is easy to add
rigour.
More formal approaches can be found all over the net, e.g. Victor Shoup, A Computational Introduction to Number Theory and
Algebra.
One reader of these notes recommends I.N. Herstein, Abstract Algebra, for further reading.
I built a PDF version of these notes.
1.1 Overview
I have tried to order my pages so that the parts most relevant to cryptography are presented first.
Modular Arithmetic We begin by defining how to perform basic arithmetic modulo n, where n is a positive integer. Addition,
subtraction, and multiplication follow naturally from their integer counterparts, but we have complications with division.
Euclid’s Algorithm We will need this algorithm to fix our problems with division. It was originally designed to find the greatest
common divisor of two numbers.
Division Once armed with Euclid’s algorithm, we can easily compute divisions modulo n.
The Chinese Remainder Theorem We find we only need to study Z_{p^k} where p is a prime, because once we have a result about
the prime powers, we can use the Chinese Remainder Theorem to generalize for all n.
Units While studying division, we encounter the problem of inversion. Units are numbers with inverses.
Exponentiation The behaviour of units when they are exponentiated is difficult to study. Modern cryptography exploits this.
Order of a Unit If we start with a unit and keep multiplying it by itself, we wind up with 1 eventually. The order of a unit is the
number of steps this takes.
The Miller-Rabin Test We discuss a fast way of telling if a given number is prime that works with high probability.
Generators Sometimes powering up a unit will generate all the other units.
Cyclic Groups We focus only on multiplication and see if we can still say anything interesting.
Quadratic Residues Elements of Zn that are perfect squares are called quadratic residues.
The other topics are less relevant to cryptography, but nonetheless interesting.
2 Modular Arithmetic
What is the most natural way of doing arithmetic in Zn ? Given two elements x, y ∈ Zn , we can add, subtract or multiply them as
integers, and then the result will be congruent to one of the elements in Zn .
Example: 6 + 7 = 1 (mod 12), 3 × 20 = 10 (mod 50), 12 − 14 = 16 (mod 18).
These operations behave similarly to their mundane counterparts. However, there is no notion of size. Saying 0 < 4 (mod 8) is
nonsense for example, because if we add 4 to both sides we find 4 < 0 (mod 8). The regular integers are visualized as lying on
a number line, where integers to the left are smaller than integers on the right. Integers modulo n however are visualized as lying
on a circle (e.g. think of a clock when working modulo 12).
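The examples above can be reproduced with ordinary integer arithmetic followed by a reduction. The following is a minimal sketch; the helper names add_mod, sub_mod and mul_mod are made up for illustration and are not part of these notes.

```python
# Arithmetic in Z_n: compute with ordinary integers, then reduce modulo n.
# The function names here are illustrative only.

def add_mod(x, y, n):
    return (x + y) % n

def sub_mod(x, y, n):
    # Python's % returns a value in [0, n) for n > 0, even if x - y < 0.
    return (x - y) % n

def mul_mod(x, y, n):
    return (x * y) % n

print(add_mod(6, 7, 12))    # 1
print(mul_mod(3, 20, 50))   # 10
print(sub_mod(12, 14, 18))  # 16
```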
2.1 Division
Division is notably absent from the above discussion. If y divides x as integers, then one might guess we could use the usual
definition. Let us see where this leads: we have 10 = 4 (mod 6). Dividing both sides by 2 gives the incorrect equation 5 = 2
(mod 6).
Thus we have to change what division means. Intuitively, division should "undo multiplication", that is to divide x by y means to
find a number z such that y times z is x. The problem above is that there are different candidates for z: in Z6 both 5 and 2 give 4
when multiplied by 2.
Which answer should we choose for "4/2", 5 or 2? We could introduce some arbitrary convention, such as choosing the smallest
answer when considering the least residue as an integer, but then division will behave strangely.
Instead, we require uniqueness, that is x divided by y modulo n is only defined when there is a unique z ∈ Zn such that x = yz.
We can obtain a condition on y as follows. Suppose z_1 y = z_2 y (mod n). Then by definition, this means for some k we have
y(z_1 − z_2) = kn. Let d be the greatest common divisor of n and y. Dividing through by d gives (y/d)(z_1 − z_2) = k(n/d), and
since n/d is coprime to y/d, it must divide z_1 − z_2. Thus we have
z_1 y = z_2 y (mod n)
if and only if
z_1 = z_2 (mod n/d).
Thus a unique z exists modulo n only if the greatest common divisor of y and n is 1.
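A small exhaustive search makes the role of gcd(y, n) visible. This is only a sketch for tiny moduli, and the function name solutions is hypothetical.

```python
from math import gcd

def solutions(y, x, n):
    """All z in Z_n with y*z = x (mod n), found by trying every z."""
    return [z for z in range(n) if (y * z - x) % n == 0]

# gcd(2, 6) = 2, so "4/2" is ambiguous modulo 6.
print(gcd(2, 6), solutions(2, 4, 6))  # 2 [2, 5]
# gcd(5, 6) = 1, so "4/5" has exactly one answer modulo 6.
print(gcd(5, 6), solutions(5, 4, 6))  # 1 [2]
```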
2.2 Inverses
We shall see that a unique z exists if and only if it is possible to find a w ∈ Z_n such that yw = 1 (mod n). If such a w exists, it
must be unique: suppose yw′ is also 1. Then multiplying both sides of yw = yw′ by w gives wyw = wyw′, which implies w = w′
since wy = 1. When it exists, we call this unique w the inverse of y and denote it by y^{-1}.
How do we know if y^{-1} exists, and if it does, how do we find it? Since there are only n elements in Z_n, we can multiply each
element in turn by y and see if we get 1. If none of them work then we know y does not have an inverse. In some sense, modular
arithmetic is easier than integer arithmetic because there are only finitely many elements, so to find a solution to a problem you
can always try every possibility.
We now have a good definition for division: x divided by y is x multiplied by y^{-1} if the inverse of y exists, otherwise the answer
is undefined.
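The try-every-element approach can be sketched directly. The names inverse_by_search and divide are invented for this illustration, and the search is only sensible for small n.

```python
def inverse_by_search(y, n):
    """Return w in Z_n with y*w = 1 (mod n), or None if no inverse exists."""
    for w in range(n):
        if (y * w) % n == 1 % n:
            return w
    return None

def divide(x, y, n):
    """x divided by y modulo n, defined only when y is invertible."""
    w = inverse_by_search(y, n)
    if w is None:
        raise ZeroDivisionError(f"{y} has no inverse modulo {n}")
    return (x * w) % n

print(inverse_by_search(5, 6))  # 5
print(inverse_by_search(2, 6))  # None
print(divide(4, 5, 6))          # 2
```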
To avoid confusion with integer division, many authors avoid the / symbol completely in modular arithmetic and if they need to
divide x by y, they write xy^{-1}. Also some approaches to number theory start with inversion, and define division using inversion
without discussing how it relates to integer division, which is another reason / is often avoided. We will follow convention, and
reserve the / symbol for integer division.
Example: 2 × 3 + 4(5^{-1}) = 2 (mod 6), since 5^{-1} = 5 (mod 6).
3 Euclid’s Algorithm
Given integers a, b, c, can we write c in the form
c = ax + by
for integers x and y? If so, is there more than one solution? Can you find them all? Before answering this, let us answer a
seemingly unrelated question:
How do you find the greatest common divisor (gcd) of two integers a, b?
We denote the greatest common divisor of a and b by gcd(a, b), or sometimes even just (a, b). If (a, b) = 1 we say a and b are
coprime.
The obvious answer is to list all the divisors of a and b, and look for the greatest one they have in common. However, this requires
a and b to be factorized, and no efficient factoring algorithm is known.
A few simple observations lead to a far superior method: Euclid’s algorithm, or the Euclidean algorithm. First, if d divides a
and d divides b, then d divides their difference, a − b, where a is the larger of the two. But this means we’ve shrunk the original
problem: now we just need to find gcd(b, a − b). We repeat until we reach a trivial case.
Hence we can find gcd(a, b) by doing something that most people learn in primary school: division and remainder. We give an
example and leave the proof of the general case to the reader.
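Suppose we wish to compute gcd(27, 33). First, we divide the bigger one by the smaller one, then we keep dividing the previous
divisor by the previous remainder:
33 = 1 × 27 + 6
27 = 4 × 6 + 3
6 = 2 × 3 + 0
The last nonzero remainder is the gcd, so gcd(27, 33) = 3.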
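The division-and-remainder procedure can be written as a short loop. This is a minimal sketch; the name euclid_gcd is chosen here for illustration (Python's standard library already offers math.gcd).

```python
def euclid_gcd(a, b):
    """Greatest common divisor by repeated division with remainder."""
    while b != 0:
        # Replace (a, b) by (b, a mod b); the gcd is unchanged.
        a, b = b, a % b
    return abs(a)

print(euclid_gcd(27, 33))  # 3
```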
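The above equations actually reveal more than the gcd of two numbers. We can use them to find integers m, n such that
3 = 33m + 27n
First rearrange all the equations so that the remainders are the subjects:
6 = 33 − 1 × 27
3 = 27 − 4 × 6
Then we start from the last equation, and substitute the preceding equation into it:
3 = 27 − 4 × (33 − 1 × 27) = (−4) × 33 + 5 × 27
Thus m = −4 and n = 5. The same procedure, known as the extended Euclidean algorithm, works for any integers a, b: if
d = gcd(a, b), it finds integers m, n such that
d = ma + nb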
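One common way to organise the back-substitution is to carry the coefficients along while running Euclid's algorithm. The sketch below uses that iterative formulation, which is an assumption of this example rather than the presentation used in these notes; the name extended_gcd is illustrative.

```python
def extended_gcd(a, b):
    """Return (d, m, n) with d = gcd(a, b) and d = m*a + n*b."""
    old_r, r = a, b
    old_m, m = 1, 0   # invariant: old_r = old_m*a + old_n*b
    old_n, n = 0, 1   # invariant: r     = m*a     + n*b
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_m, m = m, old_m - q * m
        old_n, n = n, old_n - q * n
    return old_r, old_m, old_n

print(extended_gcd(33, 27))  # (3, -4, 5)
```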
We can now answer the question posed at the start of this section, that is, given integers a, b, c find all integers x, y such that
c = xa + yb.
Let d = gcd(a, b), and let b = b′d, a = a′d. Since xa + yb is a multiple of d for any integers x, y, solutions exist only when d
divides c.
So say c = kd. Using the extended Euclidean algorithm we can find m, n such that d = ma + nb, thus we have a solution
x = km, y = kn.
Suppose x′, y′ is another solution. Then
c = xa + yb = x′a + y′b
Rearranging,
(x′ − x)a = (y − y′)b
Dividing by d gives:
(x′ − x)a′ = (y − y′)b′
The numbers a′ and b′ are coprime since d is the greatest common divisor, hence (x′ − x) is some multiple of b′, that is:
x′ − x = tb/d
for some integer t. Substituting back gives y − y′ = ta/d, so every solution has the form
x′ = x + tb/d, y′ = y − ta/d
and conversely every integer t yields a solution. In particular, when p and q are coprime and 1 = mp + nq, the solutions of
1 = xp + yq are
x = m + tq, y = n − tp.
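The whole family of solutions can be generated from one particular solution. The following is a sketch under the assumption that coefficients are found with an iterative extended Euclidean routine; solve_linear and extended_gcd are names invented for this example.

```python
def extended_gcd(a, b):
    """Return (d, m, n) with d = gcd(a, b) = m*a + n*b."""
    old_r, r, old_m, m, old_n, n = a, b, 1, 0, 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_m, m = m, old_m - q * m
        old_n, n = n, old_n - q * n
    return old_r, old_m, old_n

def solve_linear(a, b, c):
    """Describe all integer solutions of c = x*a + y*b.

    Returns (x0, y0, dx, dy) meaning x = x0 + t*dx, y = y0 - t*dy for
    every integer t, or None when gcd(a, b) does not divide c.
    """
    d, m, n = extended_gcd(a, b)
    if c % d != 0:
        return None
    k = c // d
    return k * m, k * n, b // d, a // d

print(solve_linear(33, 27, 12))  # (-16, 20, 9, 11)
print(solve_linear(33, 27, 10))  # None
```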
4 Division
Intuitively, to divide x by y means to find a number z such that y times z is x, but we had trouble adopting this definition of division
because sometimes there is more than one possibility for z modulo n.
We solved this by only defining division when the answer is unique. We stated without proof that when division is defined in this
way, one can divide by y if and only if y^{-1}, the inverse of y, exists.
We shall now show why this is the case. We wish to find all z such that yz = x (mod n), which by definition means
x = zy + kn
for some integer k. But this is precisely the problem we encountered when discussing Euclid’s algorithm! Let d = gcd(y, n).
Suppose d > 1. Then no solutions exist if x is not a multiple of d. Otherwise the solutions for z, k are
z = r + tn/d, k = s − ty/d
for some integers r, s (that we get from Euclid’s algorithm) and for all integers t. But this means z does not have a unique solution
modulo n since n/d < n. (Instead z has a unique solution modulo n/d.)
On the other hand, if d = 1, that is if y and n are coprime, then x is always a multiple of d so solutions exist. Recall we find them
by using Euclid’s algorithm to find r, s such that
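1 = ry + sn
Multiplying through by x gives x = (xr)y + (xs)n, so z = xr is a solution, and it is unique modulo n because n/d = n. In particular,
taking x = 1 shows that r is the inverse of y.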
We previously asked: given y ∈ Z_n, does y^{-1} exist, and if so, what is it?
Our answer before was that since Z_n is finite, we can try every possibility. But if n is large, say a 256-bit number, this cannot be
done even if we use the fastest computers available today.
A better way is to use what we just proved: y^{-1} exists if and only if gcd(y, n) = 1 (which we can check using Euclid’s algorithm),
and y^{-1} can be computed efficiently using the extended Euclidean algorithm.
Example: does 7^{-1} (mod 19) exist, and if so, what is it? Euclid’s algorithm gives
19 = 2 × 7 + 5
7 = 1 × 5 + 2
5 = 2 × 2 + 1
Thus an inverse exists since gcd(7, 19) = 1. To find the inverse we rearrange these equations so that the remainders are the
subjects. Then starting from the third equation, and substituting in the second one gives
1 = 5 − 2 × 2
= 5 − 2 × (7 − 1 × 5)
= (−2) × 7 + 3 × 5
Substituting in the first equation then gives
1 = (−2) × 7 + 3 × (19 − 2 × 7)
= (−8) × 7 + 3 × 19
Hence 7^{-1} = −8 = 11 (mod 19).
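Computing an inverse this way is mechanical, so it can be sketched as a function. The names mod_inverse and extended_gcd are illustrative, and the iterative coefficient bookkeeping is an assumed formulation rather than the back-substitution shown above.

```python
def extended_gcd(a, b):
    """Return (d, m, n) with d = gcd(a, b) = m*a + n*b."""
    old_r, r, old_m, m, old_n, n = a, b, 1, 0, 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_m, m = m, old_m - q * m
        old_n, n = n, old_n - q * n
    return old_r, old_m, old_n

def mod_inverse(y, n):
    """Return the inverse of y modulo n, or None if gcd(y, n) != 1."""
    d, r, _ = extended_gcd(y % n, n)
    if d != 1:
        return None
    return r % n  # bring the coefficient into the range [0, n)

print(mod_inverse(7, 19))  # 11
print(mod_inverse(2, 6))   # None
```

In Python 3.8 and later, the built-in call pow(7, -1, 19) should give the same result.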
Theorem: Let p, q be coprime. Then the system of equations
x = a (mod p)
x = b (mod q)
has a unique solution for x modulo pq.
The reverse direction is trivial: given x ∈ Z_{pq}, we can reduce x modulo p and x modulo q to obtain two equations of the above
form.
Proof: Let p_1 = p^{-1} (mod q) and q_1 = q^{-1} (mod p). These must exist since p, q are coprime. Then we claim that if y is an
integer such that
y = a q q_1 + b p p_1 (mod pq)
then y satisfies both equations:
Modulo p, we have y = a q q_1 = a (mod p) since q q_1 = 1 (mod p). Similarly y = b (mod q). Thus y is a solution for x.
It remains to show no other solutions exist modulo pq. If z = a (mod p) then z − y is a multiple of p. If z = b (mod q) as well,
then z − y is also a multiple of q. Since p and q are coprime, this implies z − y is a multiple of pq, hence z = y (mod pq).
This theorem implies we can represent an element of Z_{pq} by one element of Z_p and one element of Z_q, and vice versa. In other
words, we have a bijection between Z_{pq} and Z_p × Z_q.
Examples: We can write 17 ∈ Z_{35} as (2, 3) ∈ Z_5 × Z_7. We can write 1 ∈ Z_{pq} as (1, 1) ∈ Z_p × Z_q.
In fact, this correspondence goes further than a simple relabelling. Suppose x, y ∈ Z_{pq} correspond to (a, b), (c, d) ∈ Z_p × Z_q
respectively. Then a little thought shows x + y corresponds to (a + c, b + d), and similarly xy corresponds to (ac, bd).
A practical application: if we have many computations to perform on x ∈ Z_{pq} (e.g. RSA signing and decryption), we can convert
x to (a, b) ∈ Z_p × Z_q and do all the computations on a and b instead before converting back. This is often cheaper because for
many algorithms, doubling the size of the input more than doubles the running time.
Example: To compute 17 × 17 (mod 35), we can compute (2 × 2, 3 × 3) = (4, 2) in Z_5 × Z_7, and then apply the Chinese
Remainder Theorem to find that (4, 2) is 9 (mod 35).
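The formula from the proof translates almost literally into code. This is a sketch, assuming Python 3.8 or later for the three-argument pow with exponent -1; the names to_pair and crt_pair are hypothetical.

```python
def to_pair(x, p, q):
    """Map x in Z_pq to its image (x mod p, x mod q) in Z_p x Z_q."""
    return x % p, x % q

def crt_pair(a, p, b, q):
    """Return the unique x modulo p*q with x = a (mod p), x = b (mod q).

    Assumes p and q are coprime; pow(p, -1, q) raises ValueError otherwise.
    """
    p1 = pow(p, -1, q)  # p^{-1} (mod q)
    q1 = pow(q, -1, p)  # q^{-1} (mod p)
    return (a * q * q1 + b * p * p1) % (p * q)

print(to_pair(17, 5, 7))     # (2, 3)
print(crt_pair(2, 5, 3, 7))  # 17
# 17 * 17 (mod 35), computed componentwise and then recombined:
a, b = to_pair(17, 5, 7)
print(crt_pair((a * a) % 5, 5, (b * b) % 7, 7))  # 9
```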
Let us restate the Chinese Remainder Theorem in the form it is usually presented.
Theorem: Let m_1, ..., m_n be pairwise coprime (that is gcd(m_i, m_j) = 1 whenever i ≠ j). Then the system of n equations
x = a_1 (mod m_1)
...
x = a_n (mod m_n)
has a unique solution for x modulo M where M = m_1 ... m_n.
Proof: This is an easy induction from the previous form of the theorem, or we can write down the solution directly.
Define b_i = M/m_i (the product of all the moduli except for m_i) and b′_i = b_i^{-1} (mod m_i). Then by a similar argument to before,
x = ∑_{i=1}^{n} a_i b_i b′_i (mod M)
is the unique solution.
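The direct formula for several moduli can be sketched as follows. The function name crt is illustrative, and the code assumes Python 3.8 or later (for math.prod and for pow with exponent -1).

```python
from math import prod

def crt(residues, moduli):
    """Solve x = a_i (mod m_i) for pairwise coprime moduli m_i.

    Implements x = sum(a_i * b_i * b_i') mod M with b_i = M / m_i and
    b_i' the inverse of b_i modulo m_i.
    """
    M = prod(moduli)
    x = 0
    for a_i, m_i in zip(residues, moduli):
        b_i = M // m_i
        x += a_i * b_i * pow(b_i, -1, m_i)
    return x % M

print(crt([2, 3, 2], [3, 5, 7]))  # 23
```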
An important consequence of the theorem is that when studying modular arithmetic in general, we can first study arithmetic
modulo a prime power and then appeal to the Chinese Remainder Theorem to generalize any results. For any integer n, we
factorize n into primes n = p_1^{k_1} ... p_m^{k_m} and then use the Chinese Remainder Theorem to get
Z_n = Z_{p_1^{k_1}} × ... × Z_{p_m^{k_m}}
To prove statements in Z_{p^k}, one starts from Z_p, and inductively works up to Z_{p^k}. Thus the most important case to study is Z_p.
6 Roots of Polynomials
Let n be a product of distinct primes: n = p_1 ... p_k. The Chinese Remainder Theorem implies we can solve a polynomial f(x) over
each Z_{p_i} and then combine the roots together to find the solutions modulo n. This is because a root a of f(x) in Z_n corresponds
to
(a_1, ..., a_k) ∈ Z_{p_1} × ... × Z_{p_k}
where each a_i is a root of f(x) in Z_{p_i}.
Example: Solve x^2 − 1 = 0 (mod 77).
x^2 − 1 has the solutions ±1 (mod 7) and ±1 (mod 11) (since they are both prime), thus the solutions modulo 77 are the ones
corresponding to:
(1, 1), (−1, −1), (1, −1), (−1, 1)
The first two are simply ±1 (mod 77), and the last two are ±34 (mod 77).
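For a modulus this small the roots can also be found by exhaustive search, which gives a quick check on the Chinese Remainder Theorem argument. The helper name roots_mod is made up for this sketch.

```python
def roots_mod(f, n):
    """All x in Z_n with f(x) = 0 (mod n), by exhaustive search."""
    return [x for x in range(n) if f(x) % n == 0]

def f(x):
    return x * x - 1

print(roots_mod(f, 7))   # [1, 6]
print(roots_mod(f, 11))  # [1, 10]
print(roots_mod(f, 77))  # [1, 34, 43, 76]

# Each root modulo 77 reduces to a root modulo 7 and a root modulo 11.
print([(x % 7, x % 11) for x in roots_mod(f, 77)])
```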
Generalizing the last example, whenever N is the product of two distinct odd primes we always have four square roots of unity.
(When one of the primes is 2 we have a degenerate case because 1 = −1 (mod 2).) An interesting fact is that if we are told one
of the non-trivial square roots, we can easily factorize N (how?).
In order to describe the solutions of a polynomial f(x) over Z_n for any n, we need to find the roots of f(x) over Z_{p^k} for prime
powers p^k. We shall leave this for later.
Since the only square roots of 1 modulo p are ±1 for a prime p, for any element a ∈ Z_p^*, we have a ≠ a^{-1} unless a = ±1. Thus in
the list 2, 3, ..., p − 2 we have each element and its inverse exactly once, so the product of the list is 1, hence
(p − 1)! = 1 × (p − 1) = −1 (mod p). On the other hand when p is composite, (p − 1)! is divisible by all the proper factors of p,
so it cannot equal −1 modulo p. Thus we have:
Theorem: For an integer p > 1 we have (p − 1)! = −1 (mod p) if and only if p is prime.
At first glance, this seems like a good way to tell if a given number is prime but unfortunately there is no known fast way to
compute (p − 1)!.
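The criterion is easy to try on small numbers, even though it is far too slow for large ones. The following sketch uses the hypothetical name wilson_is_prime.

```python
from math import factorial

def wilson_is_prime(p):
    """Primality via Wilson's Theorem: (p - 1)! = -1 (mod p).

    Computing the factorial takes about p multiplications, so this is
    only a demonstration and not a practical test.
    """
    return p > 1 and factorial(p - 1) % p == p - 1

print([p for p in range(2, 30) if wilson_is_prime(p)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```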
Wilson’s Theorem can be used to derive similar conditions:
If y ∈ Z_n is invertible (that is, if y^{-1} exists), then we say y is a unit. The set of units of Z_n is denoted by Z_n^*, or Z_n^×.
We know y is a unit if and only if y and n are coprime. So the size of Z_n^* is precisely the number of integers in [1..n − 1] that are
coprime to n.
We write φ(n) for the number of elements of Z_n^*. The function φ(n) is called the Euler totient function. Actually, it turns out to
be convenient to have φ(1) = 1, so we prefer to define φ(n) as the number of integers in [1..n] coprime to n. (This agrees with
our original definition except when n = 1.)
Examples: φ (6) = 2 since among [1..6] only 1 and 5 are coprime to 6, and thus are the only units in Z6 .
Let p be a prime. Then every nonzero element a ∈ Z_p is coprime to p (and hence a unit) thus we have φ(p) = p − 1.
How about powers of primes? If p is a prime, then the only numbers not coprime to p^k are the multiples of p, and there are
p^k/p = p^{k−1} of these. Hence
φ(p^k) = p^k − p^{k−1}
Now let p, q be coprime, and let x ∈ Z_{pq}. Let a = x (mod p) and b = x (mod q) (we reduce x modulo p and q). Then by the
Chinese Remainder Theorem, x is a unit if and only if a and b are. Thus Z_{pq}^* = Z_p^* × Z_q^*.
Looking at the size of these sets gives this fact: for p, q coprime, we have
φ(pq) = φ(p)φ(q)
(Thus φ is multiplicative.)
Putting this together with the previous statement φ(p^k) = p^k − p^{k−1} for prime p, we get that for any integer
n = p_1^{k_1} ... p_m^{k_m} (where we have factorized n into primes) we have
φ(n) = (p_1^{k_1} − p_1^{k_1−1}) ... (p_m^{k_m} − p_m^{k_m−1})
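The two descriptions of φ, counting coprime integers and multiplying prime-power contributions, can be compared numerically. This is a sketch with invented names phi_by_count and phi_by_formula; the factorization uses naive trial division, which is an assumption suitable only for small n.

```python
from math import gcd

def phi_by_count(n):
    """Number of integers in [1..n] coprime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def phi_by_formula(n):
    """Product of (p^k - p^(k-1)) over the prime powers p^k dividing n."""
    result, m, p = 1, n, 2
    while p * p <= m:
        if m % p == 0:
            pk = 1
            while m % p == 0:   # strip the full power of p from m
                m //= p
                pk *= p
            result *= pk - pk // p
        p += 1
    if m > 1:                   # a single prime factor is left over
        result *= m - 1
    return result

print(phi_by_count(6), phi_by_formula(6))    # 2 2
print(phi_by_count(36), phi_by_formula(36))  # 12 12
```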
8 Modular Exponentiation
Suppose we are asked to compute 3^5 modulo 7. We could calculate 3^5 = 243 and then reduce 243 mod 7, but a better way is to
observe 3^4 = (3^2)^2. Since 3^2 = 9 = 2 we have 3^4 = 2^2 = 4, and lastly
3^5 = 3^4 × 3 = 4 × 3 = 5 (mod 7).
The second way is better because the numbers involved are smaller.
This trick, known as repeated squaring, allows us to compute a^k mod n using only O(log k) modular multiplications. (We can use
the same trick when exponentiating integers, but then the multiplications are not modular multiplications, and each multiplication
takes at least twice as long as the previous one.)
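Repeated squaring can be sketched by scanning the bits of the exponent from least significant to most significant. The name power_mod is illustrative; Python's built-in pow(a, k, n) provides the same operation.

```python
def power_mod(a, k, n):
    """Compute a^k mod n with O(log k) modular multiplications (k >= 0)."""
    result = 1 % n
    base = a % n
    while k > 0:
        if k & 1:                    # this bit of k is set
            result = (result * base) % n
        base = (base * base) % n     # a, a^2, a^4, a^8, ...
        k >>= 1
    return result

print(power_mod(3, 5, 7))  # 5
```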
Consider the successive powers of 3 modulo 7:
3^1 = 3
3^2 = 2
3^3 = 6
3^4 = 4
3^5 = 5
3^6 = 1
Note we compute each power by multiplying the previous answer by 3 then reducing modulo 7. Beyond this, the sequence
repeats itself (why?):
3^7 = 3
3^8 = 2
and so on.
At a glance, the sequence 3, 2, 6, 4, 5, 1 seems to have no order or structure whatsoever. In fact, although there are things we can
say about this sequence (for example, members three elements apart add up to 7), it turns out that so little is known about the
behaviour of this sequence that the following problem is difficult to solve efficiently:
(The discrete log problem) Let p be a prime, and g, h be two elements of Z_p^*. Suppose g^x = h (mod p). Then what is x?
Example: One instance of the discrete log problem: find x so that 3^x = 6 (mod 7). (Answer: x = 3. Strictly speaking, any x = 3
(mod 6) will work.)
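For tiny primes the problem can be solved by trying each exponent in turn, which illustrates both the definition and why this approach does not scale. The function name discrete_log is hypothetical.

```python
def discrete_log(g, h, p):
    """Smallest x in [0..p-2] with g^x = h (mod p), or None if none exists.

    Exhaustive search: takes up to p - 1 multiplications, so it is
    hopeless for primes of cryptographic size.
    """
    value = 1 % p
    for x in range(p - 1):
        if value == h % p:
            return x
        value = (value * g) % p
    return None

print(discrete_log(3, 6, 7))  # 3
print(discrete_log(2, 3, 7))  # None, since the powers of 2 are only 2, 4, 1
```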
Recall when we first encountered modular inversion we argued we could try every element in turn to find an inverse, but this was
too slow to be used in practice. The same is true for discrete logs: we could try every possible power until we find it, but this is
impractical.
Euclid’s algorithm gave us a fast way to compute inverses. However no fast algorithm for finding discrete logs is known. The
best discrete log algorithms are faster than trying every element, but are not polynomial time.
8.2 Nonunits
Z_n = Z_{p_1^{k_1}} × ... × Z_{p_m^{k_m}}
Thus an element a ∈ Z_n corresponds to some element (a_1, ..., a_m) on the right-hand side, and a is a nonunit if at least one of the
a_i is a multiple of p_i. From above, this means in at most k_i steps, the ith member will reach zero, so in general, for some k, each
a_i^k is zero or a unit, hence we can restrict our study to units.
Note we have again followed an earlier suggestion: we handle the prime power case first and then generalize using the Chinese
Remainder Theorem.
The discrete log problem may be hard, but we do know some facts about the powers of a unit a ∈ Z_n^*. Firstly, a^k = 1 for some k:
since there are finitely many units, we must have a^x = a^y for some x < y eventually, and since a^{-1} exists we find a^{y−x} = 1.
Let a ∈ Z_n^*. The smallest positive integer x for which a^x = 1 (mod n) is called the order of a. The sequence a, a^2, ... repeats itself
as soon as it reaches a^x = 1 (since a^{x+k} = a^k), and we have a^k = 1 precisely when k is a multiple of x.
Example: The powers of 3 (mod 7) are 3, 2, 6, 4, 5, 1 so the order of 3 (mod 7) is 6.
The following theorems narrow down the possible values for the order of a unit.
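Let x be the order of a ∈ Z_n^*, and y be the order of b ∈ Z_n^*. What is the order of ab?
Suppose (ab)^k = 1. Raising both sides to x shows b^{kx} = 1 (since a^{kx} = 1), so y divides kx; similarly x divides ky. In particular,
if x and y are coprime then xy divides k, and since (ab)^{xy} = 1, the order of ab is exactly xy.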
Suppose we are given positive integers e, N, and a^e (mod N) for some unit a. How can we recover a?
One strategy is to find an integer d such that a^{de} = a (mod N). By Euler’s Theorem, d will satisfy this equation if de = kφ(N) + 1
for some k. In other words, provided e is coprime to φ(N), we compute
d = e^{-1} (mod φ(N))
and compute (a^e)^d to recover a.
However it is not known how to compute φ (N) from N without factoring N, and it is not known how to factor large numbers
efficiently.
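A toy instance shows the recovery strategy at work when the factorization of N is known. The numbers below (p = 5, q = 7, e = 5) are arbitrary small values chosen for this sketch, and the code assumes Python 3.8 or later for pow with exponent -1.

```python
# Toy parameters, assumed for illustration only; real moduli are far larger.
p, q = 5, 7
N = p * q                    # 35
phi_N = (p - 1) * (q - 1)    # phi(35) = 24, computable because we know p and q
e = 5                        # must be coprime to phi_N
d = pow(e, -1, phi_N)        # d = e^{-1} (mod phi(N))

def recover(c):
    """Recover a from c = a^e (mod N) by computing c^d (mod N)."""
    return pow(c, d, N)

a = 2
c = pow(a, e, N)
print(d, c, recover(c))  # 5 32 2
```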
10 Primality Tests
Given an integer n, how can we tell if n is prime? Assume n is odd, since the even case is trivial.
The most obvious idea is to look for factors of n, but no efficient factoring algorithm is known.
By Fermat’s Theorem, if n is prime, then for any a not divisible by n we have a^{n−1} = 1 (mod n). This suggests the Fermat test
for a prime: pick a random a ∈ [1..n − 1] then see if a^{n−1} = 1 (mod n). If not, then n must be composite.
However, equality may hold even when n is not prime. For example, take n = 561 = 3 × 11 × 17. By the Chinese Remainder
Theorem
Z_{561} = Z_3 × Z_{11} × Z_{17}
thus each a ∈ Z_{561}^* corresponds to some
(x, y, z) ∈ Z_3^* × Z_{11}^* × Z_{17}^*.
By Fermat’s Theorem, x^2 = 1, y^{10} = 1, and z^{16} = 1. Since 2, 10, and 16 all divide 560, this means (x, y, z)^{560} = (1, 1, 1), in other
words, a^{560} = 1 for any a ∈ Z_{561}^*.
Thus no matter what a we pick, 561 always passes the Fermat test despite being composite so long as a is coprime to n. Such
numbers are called Carmichael numbers, and it turns out there are infinitely many of them.
If a is not coprime to n then n fails the Fermat test, but in this case we can recover a factor of n by computing gcd(a, n).
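The behaviour of 561 can be checked directly. This is a minimal sketch with the illustrative name fermat_passes; the base a is passed in explicitly rather than chosen at random so that the output is reproducible.

```python
from math import gcd

def fermat_passes(n, a):
    """True if n passes the Fermat test for base a, i.e. a^(n-1) = 1 (mod n)."""
    return pow(a, n - 1, n) == 1

n = 561  # 3 * 11 * 17
# Every base coprime to 561 is fooled:
print(all(fermat_passes(n, a) for a in range(1, n) if gcd(a, n) == 1))  # True
# A base sharing a factor with 561 is not, and the gcd reveals the factor:
print(fermat_passes(n, 3), gcd(3, n))  # False 3
```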
We can do better by recalling that if n is prime, the only solutions of x^2 = 1 (mod n) are x = ±1.
So if n passes the Fermat test, that is, a^{n−1} = 1, then we also check a^{(n−1)/2} = ±1, because a^{(n−1)/2} is a square root of 1.
Unfortunately, numbers such as the third Carmichael number 1729 still fool this enhanced test. But what if we iterate? That is,
so long as it’s possible, we continue halving the exponent until we reach a number besides 1. If it’s anything but −1 then n must
be composite.
More formally, let 2^s be the largest power of 2 dividing n − 1, that is, n − 1 = 2^s q for some odd number q. Each member of the
sequence
a^{n−1} = a^{2^s q}, a^{2^{s−1} q}, ..., a^q
is a square root of the preceding member.
Then if n is prime, this sequence begins with 1 and either every member is 1, or the first member of the sequence not equal to 1
is −1.
The Miller-Rabin test picks a random a ∈ Z_n. If the above sequence does not begin with 1, or the first member of the sequence
that is not 1 is also not −1 then n is not prime.
It turns out for any composite n, including Carmichael numbers, the probability n passes the Miller-Rabin test is at most 1/4.
(On average it is significantly less.) Thus the probability n passes several runs decreases exponentially.
If n fails the Miller-Rabin test with a sequence starting with 1, then we have a nontrivial square root of 1 modulo n, and we can
efficiently factor n. Thus Carmichael numbers are always easy to factor.
Exercise: What happens when we run the Miller-Rabin test on numbers of the form pq where p, q are large primes? Can we
break RSA with it?
Given n, find s so that n − 1 = 2^s q for some odd q. Then we implement a single Miller-Rabin test as follows:
1. Pick a random a ∈ [1..n − 1].
2. If a^q = 1 then n passes.
3. Otherwise, for i = 0, ..., s − 1 see if a^{2^i q} = −1. If so, n passes.
4. Otherwise n is composite.
We also perform a few trial divisions by small primes before running the Miller-Rabin test several times.
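The steps above can be sketched as follows. The names miller_rabin_round and is_probable_prime, the choice of 20 rounds, and the particular list of small primes used for trial division are all assumptions of this example.

```python
import random

def miller_rabin_round(n, a):
    """One Miller-Rabin round for odd n > 2 with base a; True means n passes."""
    s, q = 0, n - 1
    while q % 2 == 0:          # write n - 1 = 2^s * q with q odd
        s += 1
        q //= 2
    x = pow(a, q, n)
    if x == 1:
        return True
    for _ in range(s):         # x runs through a^(2^i * q) for i = 0..s-1
        if x == n - 1:
            return True
        x = (x * x) % n
    return False

def is_probable_prime(n, rounds=20):
    """Trial division by a few small primes, then several random rounds."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    return all(miller_rabin_round(n, random.randrange(2, n - 1))
               for _ in range(rounds))

print(miller_rabin_round(561, 2))   # False: 2 exposes the Carmichael number 561
print(miller_rabin_round(1729, 2))  # False
print(is_probable_prime(2**61 - 1)) # True (a known Mersenne prime)
```

A prime always passes every round, so a False result is a proof of compositeness, while a True result only means that no witness was found.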
Strictly speaking, these tests are compositeness tests since they do not prove the input is prime, but rather prove that an input is
composite.
There exist deterministic polynomial-time algorithms for deciding primality (see Agrawal, Kayal and Saxena), though at present
they are impractical.
11 Generators
A unit g ∈ Z_n^* is called a generator or primitive root of Z_n^* if for every a ∈ Z_n^* we have g^k = a for some integer k. In other words,
if we start with g, and keep multiplying by g eventually we see every element.
Example: 3 is a generator of Z_4^* since 3^1 = 3, 3^2 = 1 are the units of Z_4^*.
Example: 3 is a generator of Z_7^*. From before the powers of 3 are 3, 2, 6, 4, 5, 1 which are the units of Z_7^*.
Example: 3 is not a generator of Z_{11}^* since the powers of 3 (mod 11) are 3, 9, 5, 4, 1 which is only half of Z_{11}^*.
Theorem: Let p be a prime. Then Z_p^* contains exactly φ(p − 1) generators. In general, for every divisor d|p − 1, Z_p^* contains
φ(d) elements of order d.
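For small moduli, generators can be found by computing the order of every unit. This is a brute-force sketch with invented names order and generators.

```python
from math import gcd

def order(a, n):
    """Order of the unit a modulo n (a must be coprime to n)."""
    x, k = a % n, 1
    while x != 1:
        x = (x * a) % n
        k += 1
    return k

def generators(n):
    """All generators of Z_n^*, i.e. units whose order equals phi(n)."""
    units = [a for a in range(1, n) if gcd(a, n) == 1]
    return [a for a in units if order(a, n) == len(units)]

print(generators(7))   # [3, 5]
print(generators(11))  # [2, 6, 7, 8]
print(order(3, 11))    # 5
print(generators(8))   # [] since Z_8^* is not cyclic
```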
Proof: Suppose some element a has order d. Then the solutions of
x^d − 1 = 0 (mod p)
are precisely a, a^2, ..., a^d = 1 and there are no other elements of order d since x^d − 1 has at most d roots.
It is easy to show that a^k has order d if and only if gcd(k, d) = 1, thus either there are no elements of order d or there are exactly
φ(d) elements of order d.
Now ∑_{d|p−1} φ(d) = p − 1 (which we can prove using multiplicative functions or cyclic groups) and if any of the φ(d) were
replaced with 0 on the left-hand side, the equality would fail. Hence there must be exactly φ(d) elements of order d for each
d|p − 1 (since each one of the p − 1 elements of Z_p^* must have some order dividing p − 1).
What about powers of primes, or composite numbers in general?
12 Cyclic Groups
Z∗n is an example of a group. We won’t formally introduce group theory, but we do point out that a group only deals with one
operation. The ∗ in Z∗n stresses that we are only considering multiplication and forgetting about addition.
Notice we rarely add or subtract elements of Z∗n . For one thing, the sum of two units might not be a unit. We performed addition
in our proof of Fermat’s Theorem, but this can be avoided by using our proof of Euler’s Theorem instead. We did need addition
to prove that Z∗n has a certain structure, but once this is done, we can focus on multiplication.
Let us see what can be said from studying multiplication alone.
When Z∗n has a generator, we call Z∗n a cyclic group. If g is a generator we write Z∗n = ⟨g⟩.
A subgroup of Z∗n is a non-empty subset H of Z∗n such that if a, b ∈ H, then ab ∈ H. Thus any subgroup contains 1, and also the
inverse of every element in the subgroup. (Why? Hint: our definition of a subgroup only works when every element has a finite
order; the real definition is different!)
Examples: Any a ∈ Z_n^* can be used to generate a cyclic subgroup ⟨a⟩ = {a, a^2, ..., a^d = 1} (for some d). For example, ⟨2⟩ = {2, 4, 1}
is a subgroup of Z_7^*. Any group is always a subgroup of itself. {1} is always a subgroup of any group. These last two examples
are the improper subgroups of a group.
We prove Lagrange’s Theorem for Z∗n . The proof can easily be modified to work for a general finite group.
Our proof of Euler’s Theorem has ideas in common with this proof.
Theorem: Let H be a subgroup of Z∗n of size m. Then m|φ (n).
Proof: If H = Z_n^* then m = φ(n). Otherwise, let H = {h_1, ..., h_m}. Let a be some element of Z_n^* not in H, and consider the set
{h_1 a, ..., h_m a} which we denote by Ha. Every element in this set is distinct (since multiplying h_i a = h_j a by a^{-1} implies h_i = h_j),
and furthermore no element of Ha lies in H (since h_i = h_j a implies a = h_j^{-1} h_i thus a ∈ H, a contradiction).
Thus if every element of Z∗n lies in H or Ha then 2m = φ (n) and we are done. Otherwise take some element b in Z∗n not in H or
Ha. By a similar argument, we see that Hb = {h1 b, ..., hm b} contains exactly m elements and has no elements in common with
either H or Ha.
Iterating this procedure if necessary, we eventually have Z∗n as the disjoint union of the sets H, Ha, Hb, ... where each set contains
m elements. Hence m|φ (n).
Corollary: Euler’s Theorem (and Fermat’s Theorem). Any a ∈ Z_n^* generates a cyclic subgroup {a, a^2, ..., a^d = 1} thus d|φ(n),
and hence a^{φ(n)} = 1.
Theorem: All subgroups of a cyclic group are cyclic. If G = ⟨g⟩ is a cyclic group of order n then for each divisor d of n there
exists exactly one subgroup of order d and it can be generated by g^{n/d}.
Proof: Given a divisor d, let e = n/d. Let g be a generator of G. Then ⟨g^e⟩ = {g^e, g^{2e}, ..., g^{de} = 1} is a cyclic subgroup of G of
size d.
Now let H = {a_1, ..., a_{d−1}, a_d = 1} be some subgroup of G. Then for each a_i, we have a_i = g^k for some k. By Lagrange’s
Theorem the order of a_i must divide d, hence g^{kd} = 1.
Since the order of g is n, we have kd = mn = mde for some m. Thus k = em and a_i = (g^e)^m, that is each a_i is some power of g^e,
hence H is one of the subgroups we previously described.
Theorem: Let G be a cyclic group of order n. Then G contains exactly φ(n) generators.
Proof: Let g be a generator of G, so G = {g, ..., g^n = 1}. Then g^k generates G if and only if g^{km} = g for some m, which happens
when km = 1 (mod n), that is k must be a unit in Z_n, thus there are φ(n) values of k for which g^k is a generator.
Example: When Z_n^* is cyclic (i.e. when n = 2, 4, p^k, 2p^k for odd primes p), Z_n^* contains φ(φ(n)) generators.
We can now prove a theorem often proved using multiplicative functions:
Theorem: For any positive integer n
$$n = \sum_{d \mid n} \phi(d).$$
Proof: Consider a cyclic group G of order n, hence G = {g, ..., gn = 1}. Each element a ∈ G is contained in some cyclic subgroup.
The theorem follows since there is exactly one subgroup H of order d for each divisor d of n and H has φ (d) generators.
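The identity can be checked by brute force for small n. The following is a minimal sketch rather than an efficient implementation; the names `totient`, `divisors` and `totientSum` are hypothetical helpers introduced only for this illustration.

```haskell
-- Euler's totient by direct counting (only sensible for small n).
totient :: Int -> Int
totient n = length [k | k <- [1 .. n], gcd k n == 1]

-- All positive divisors of n.
divisors :: Int -> [Int]
divisors n = [d | d <- [1 .. n], n `mod` d == 0]

-- Sum of phi(d) over the divisors d of n; the theorem says this equals n.
totientSum :: Int -> Int
totientSum n = sum (map totient (divisors n))
```

For instance, `map totient (divisors 12)` lists the totients of the divisors of 12, and `totientSum 12` adds them up.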
In an abstract sense, for every positive integer n, there is only one cyclic group of order n, which we denote by Cn . This is because
if g is a generator, then Cn = {g, g2 , ..., gn = 1} which completely determines the behaviour of Cn .
Example: Both Z∗3 and Z∗4 are cyclic of order 2, so they both behave exactly like C2 (when considering multiplication only). This
is an example of a group isomorphism.
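Example: For $n = 2^k p_1^{k_1} \cdots p_m^{k_m}$ with distinct odd primes $p_i$, by the Chinese Remainder Theorem we have
$$\mathbb{Z}_n^* = \mathbb{Z}_{2^k}^* \times \mathbb{Z}_{p_1^{k_1}}^* \times \cdots \times \mathbb{Z}_{p_m^{k_m}}^*.$$
Recall each $\mathbb{Z}_{p_i^{k_i}}^*$ is cyclic, and so are $\mathbb{Z}_2^*$ and $\mathbb{Z}_4^*$. Also recall for k > 2 we have that $3 \in \mathbb{Z}_{2^k}^*$ has order $2^{k-2}$ and no element has a
higher order. Using some group theory this means the group structure of $\mathbb{Z}_n^*$ is
$$C_{2^{k-1}} \times C_{p_1^{k_1} - p_1^{k_1 - 1}} \times \cdots \times C_{p_m^{k_m} - p_m^{k_m - 1}}$$
when k = 1, 2 and
$$C_2 \times C_{2^{k-2}} \times C_{p_1^{k_1} - p_1^{k_1 - 1}} \times \cdots \times C_{p_m^{k_m} - p_m^{k_m - 1}}$$
when k > 2.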
13 Quadratic Residues
Let a ∈ Zn . We say a is a quadratic residue if there exists some x such that x2 = a. Otherwise a is a quadratic nonresidue.
Efficiently distinguishing a quadratic residue from a nonresidue modulo N = pq for primes p, q is an open problem. This is
exploited by several cryptosystems, such as Goldwasser-Micali encryption, or Cocks identity-based encryption. More general
variants of this problem underlie other cryptosystems such as Paillier encryption.
Let p be an odd prime, as the case p = 2 is trivial. Let g be a generator of Z∗p . Any a ∈ Z∗p can be written as gk for some
k ∈ [0..p − 2].
Say k is even. Write k = 2m. Then (gm )2 = a, so a is a quadratic residue. Exactly half of [0..p − 2] is even (since p is odd), hence
at least half of the elements of Z∗p are quadratic residues.
Suppose we have b2 = a. Then (−b)2 = a as well, and since b ̸= −b (since p > 2) every quadratic residue has at least two square
roots (in fact, we know from studying polynomials there can be at most two), thus at most half the elements of Z∗p are quadratic
residues. (Otherwise there are more square roots than elements!)
Thus exactly half of Z∗p are quadratic residues, and they are the even powers of g.
Given $a = g^k$, consider the effect of exponentiating by (p − 1)/2. If k is odd, say k = 2m + 1, we get
$$a^{(p-1)/2} = g^{(p-1)m} g^{(p-1)/2} = g^{(p-1)/2}.$$
The square of $g^{(p-1)/2}$ is $g^{p-1} = 1$, so $g^{(p-1)/2}$ is 1 or −1. But g has order p − 1 (g is a generator) thus we must have $g^{(p-1)/2} = -1$.
If k is even, say k = 2m, then $a^{(p-1)/2} = g^{(p-1)m} = 1$.
In other words, we have proved Euler’s Criterion, which states a is a quadratic residue if and only if a(p−1)/2 = 1, and a is a
quadratic nonresidue if and only if a(p−1)/2 = −1.
Example: We have −1 is a quadratic residue in Z p if and only if p = 1 (mod 4).
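Euler's Criterion turns the residue question into a single modular exponentiation. The sketch below assumes p is an odd prime and a is not a multiple of p; `powMod`, `isResidue` and `squaresMod` are hypothetical names chosen for this illustration.

```haskell
-- Modular exponentiation by repeated squaring: b^e mod m, for e >= 0.
powMod :: Integer -> Integer -> Integer -> Integer
powMod _ 0 m = 1 `mod` m
powMod b e m
  | even e    = half * half `mod` m
  | otherwise = b `mod` m * powMod b (e - 1) m `mod` m
  where half = powMod b (e `div` 2) m

-- Euler's Criterion (assumes p is an odd prime not dividing a):
-- True exactly when a is a quadratic residue modulo p.
isResidue :: Integer -> Integer -> Bool
isResidue a p = powMod a ((p - 1) `div` 2) p == 1

-- Brute-force list of the nonzero squares modulo p, for comparison.
squaresMod :: Integer -> [Integer]
squaresMod p = [x * x `mod` p | x <- [1 .. p - 1]]
```

Comparing `isResidue a 17` with membership in `squaresMod 17` for each a in [1..16] is a quick sanity check, and `isResidue (p - 1) p` tests whether −1 is a residue.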
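The Legendre symbol $\left(\frac{a}{p}\right)$ for an integer a and an odd prime p is defined by:
• $\left(\frac{a}{p}\right) = 1$ if a is a nonzero quadratic residue of p
• $\left(\frac{a}{p}\right) = -1$ if a is a quadratic nonresidue of p
• $\left(\frac{a}{p}\right) = 0$ if a = 0 modulo p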
14 Gauss’ Lemma
There is a less obvious way to compute the Legendre symbol. Among other things, we can use it to easily find $\left(\frac{2}{p}\right)$. Before
stating the method formally, we demonstrate it with an example.
Let p = 17, and a = 7. There are 16 nonzero elements [1..16]. Consider the first half [1..8] and multiply them all by 7 to get 7,
14, 4, 11, 1, 8, 15, 5. We’ve singled out 14, 11 and 15 because they are greater than p/2 (that is, 9 or higher).
So exactly 3 of them are greater than p/2. Gauss’ Lemma states that if we take this 3 and raise −1 to this power, then we have
$\left(\frac{a}{p}\right)$, that is:
$$\left(\frac{7}{17}\right) = (-1)^3 = -1.$$
Theorem (Gauss’ Lemma): Let p be an odd prime, q be an integer coprime to p. Take the least residues of {q, 2q, ..., q(p − 1)/2},
that is, reduce them to integers in [0..p − 1]. Let u be the number of members in this set that are greater than p/2. Then
$$\left(\frac{q}{p}\right) = (-1)^u.$$
Proof: Let b1 , ..., bt be the members of the set less than p/2, and c1 , ..., cu be the members greater than p/2. Then u + t =
(p − 1)/2. Consider the sequence
0 < b1 , ..., bt , p − c1 , ..., p − cu < p/2
These are all distinct: clearly $b_i \ne b_j$ and $c_i \ne c_j$ whenever i ≠ j (since q is invertible), and if $b_i = p - c_j$, then let $b_i = rq$,
$c_j = sq$ modulo p. Then r + s = 0 (mod p), which is a contradiction since
0 < r, s < p/2.
Hence they must be precisely the numbers 1, ..., (p − 1)/2 in some order, thus
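$$\begin{aligned} q(2q)\cdots(q(p-1)/2) &= b_1 \cdots b_t \, c_1 \cdots c_u \\ &= (-1)^u \, b_1 \cdots b_t \, (p - c_1) \cdots (p - c_u) \\ &= (-1)^u \left(\frac{p-1}{2}\right)! \pmod p \end{aligned}$$
The left-hand side is $q^{(p-1)/2} ((p-1)/2)!$. Dividing both sides by ((p − 1)/2)! gives $q^{(p-1)/2} = (-1)^u$ (mod p), and Euler’s Criterion completes the proof.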
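The counting step in Gauss' Lemma is easy to mechanize. This is a small sketch under the lemma's assumptions (p an odd prime, q coprime to p); the names `gaussCount` and `legendreGauss` are made up for the example.

```haskell
-- Number of least residues of q, 2q, ..., ((p-1)/2)q modulo p that exceed p/2.
gaussCount :: Integer -> Integer -> Int
gaussCount q p =
  length [r | i <- [1 .. (p - 1) `div` 2], let r = i * q `mod` p, 2 * r > p]

-- Legendre symbol (q/p) via Gauss' Lemma (assumes p is an odd prime not dividing q).
legendreGauss :: Integer -> Integer -> Integer
legendreGauss q p = if even (gaussCount q p) then 1 else -1
```

With the worked example above, `gaussCount 7 17` counts the residues 14, 11 and 15.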
For example, let p be an odd prime and take q = 2. The sequence 2, 2 × 2, ..., 2(p − 1)/2 consists of positive least residues. We
have p = 8x + y for some integer x and y ∈ {1, 3, 5, 7}. By considering each case we see that the number of elements greater than
p/2 is even when p = 1, 7 (mod 8) and odd when p = 3, 5 (mod 8). We restate this as follows.
Theorem: Let p be an odd prime.
$$\left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8} = (-1)^{\lfloor (p+1)/4 \rfloor}$$
Proof: See the last paragraph, and note that (p2 − 1)/8 is even when p = ±1 (mod 8) and odd otherwise. Similarly for ⌊(p +
1)/4⌋.
Example: By Gauss’ Lemma, for a prime p > 3, $\left(\frac{3}{p}\right) = 1$ if p = ±1 (mod 12) and −1 otherwise (that is, p = ±5 (mod 12)). Combining this
with the above result for $\left(\frac{-1}{p}\right)$ we have $\left(\frac{-3}{p}\right) = 1$ if p = 1 (mod 6) and −1 otherwise (p = −1 (mod 6)).
15 Quadratic Reciprocity
The law of quadratic reciprocity, noticed by Euler and Legendre and proved by Gauss, helps greatly in the computation of the
Legendre symbol.
First, we need the following theorem:
Theorem: Let p be an odd prime and q be some odd integer coprime to p. Let $m = \lfloor q/p \rfloor + \lfloor 2q/p \rfloor + \cdots + \lfloor ((p-1)/2)q/p \rfloor$.
Then m = u (mod 2), where as in Gauss’ Lemma, u is the number of elements of {q, 2q, ..., q(p − 1)/2} which have a residue
greater than p/2.
Proof: For each i = 1, ..., (p − 1)/2, the equation iq = p⌊iq/p⌋ + ri holds for some 0 < ri < p. Reading the proof of Gauss’
Lemma, we see these are precisely the bi and ci .
Then summing these equations and working modulo 2 gives
$$\begin{aligned} q(p^2-1)/8 &= pm + b_1 + \cdots + b_t + c_1 + \cdots + c_u \\ &\equiv pm + b_1 + \cdots + b_t + up + (p - c_1) + \cdots + (p - c_u) \\ &= pm + up + 1 + 2 + \cdots + (p-1)/2 \\ &= pm + up + (p^2-1)/8 \pmod 2 \end{aligned}$$
That is,
$$(q-1)(p^2-1)/8 \equiv p(m+u) \pmod 2$$
Since p and q are odd, we have m = u (mod 2).
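Theorem (Law of Quadratic Reciprocity): Let p, q be distinct odd primes. If p = q = −1 (mod 4), then
$$\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right)$$
otherwise
$$\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right).$$
which we can state as
$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}$$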
Proof: From above, we need only show
$$m + n = (p-1)(q-1)/4$$
where $m = \lfloor q/p \rfloor + \lfloor 2q/p \rfloor + \cdots + \lfloor ((p-1)/2)q/p \rfloor$, and n is similarly defined by swapping p and q.
Eisenstein found an elegant geometrical proof. Consider the line L from (0, 0) to (p, q), and the rectangle R with corners at (0, 0)
and (p/2, q/2). How many lattice points lie strictly in R?
Simply computing the area of a rectangle gives (p − 1)(q − 1)/4. Alternatively, we can count the number of points above and
below L inside R, since no lattice points can lie on L in R since p, q are coprime.
Consider the points below L on the line x = 1. They have y-coordinates of 1, 2, ..., ⌊q/p⌋. When x = 2, there are ⌊2q/p⌋ points,
and so on, giving a total of m points below L. Similarly there are n points above L in R, proving the result.
We can restate the proof algebraically. Consider the numbers py − qx for x = 1, ..., (p − 1)/2 and y = 1, ..., (q − 1)/2. There
are a total of (p − 1)(q − 1)/4 numbers, not necessarily distinct. None are zero, since p, q are coprime. The result follows after
observing that n of them are positive, and m are negative.
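The lattice-point identity can be tested numerically. The sketch below assumes p and q are distinct odd primes; `floorSum`, `latticeIdentity` and `reciprocitySign` are hypothetical names used only here.

```haskell
-- floorSum q p = floor(q/p) + floor(2q/p) + ... + floor(((p-1)/2) q / p).
floorSum :: Integer -> Integer -> Integer
floorSum q p = sum [i * q `div` p | i <- [1 .. (p - 1) `div` 2]]

-- Checks m + n = (p-1)(q-1)/4, with m = floorSum q p and n = floorSum p q.
latticeIdentity :: Integer -> Integer -> Bool
latticeIdentity p q = floorSum q p + floorSum p q == (p - 1) * (q - 1) `div` 4

-- The sign (-1)^(m+n), which by the theorem equals (p/q)(q/p).
reciprocitySign :: Integer -> Integer -> Integer
reciprocitySign p q = if even (floorSum q p + floorSum p q) then 1 else -1
```

For p = 3, q = 5 the two sums are 1 and 1, matching (3 − 1)(5 − 1)/4 = 2.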
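Example:
$$\left(\frac{31}{103}\right) = -\left(\frac{103}{31}\right) = -\left(\frac{10}{31}\right) = -\left(\frac{2}{31}\right)\left(\frac{5}{31}\right) = -\left(\frac{5}{31}\right)$$
since 31 = −1 (mod 8) gives $\left(\frac{2}{31}\right) = 1$. Next, since 5 = 1 (mod 4),
$$\left(\frac{5}{31}\right) = \left(\frac{31}{5}\right) = \left(\frac{1}{5}\right) = 1.$$
Hence $\left(\frac{31}{103}\right) = -1$, that is, 31 is a quadratic nonresidue modulo 103.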
This method is flawed because it relies on factoring, so we might think we should stick to our original modular exponentiation
for computing the Legendre symbol. But it turns out all is well once we extend the Legendre symbol.
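The Jacobi symbol $\left(\frac{a}{b}\right)$ is defined for all odd positive integers b and all integers a. When b is prime, it is equivalent to the
Legendre symbol. If b = 1, define $\left(\frac{a}{1}\right) = 1$. Lastly, for other values of b, factor b into primes: $b = p_1^{k_1} \cdots p_n^{k_n}$ and define
$$\left(\frac{a}{b}\right) = \left(\frac{a}{p_1}\right)^{k_1} \cdots \left(\frac{a}{p_n}\right)^{k_n}.$$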
16 Carmichael Numbers
Recall Carmichael numbers are composite numbers that almost always fool the Fermat primality test. We can show that
Carmichael numbers must have certain properties.
First we show they cannot be of the form n = pq where p, q are distinct primes with p > q. By the Chinese Remainder Theorem
we have Zn = Z p × Zq . Then
n − 1 = pq − 1 = q(p − 1) + q − 1.
Suppose a is not a multiple of p. By Fermat
$$a^{n-1} = a^{q(p-1)} a^{q-1} = a^{q-1} \pmod p.$$
Then if a passes the Fermat test, we must have $a^{q-1} = 1$ (mod p) and hence $a^d = 1$, where d = gcd(p − 1, q − 1). Since Z∗p is
cyclic, there are exactly d choices for a that satisfy $a^d = 1$.
Since p > q, the greatest common divisor of p − 1 and q − 1 is at most (p − 1)/2. Thus at most half the choices for a can fool
the Fermat test.
Next suppose n is not squarefree, that is $n = p^k r$ for some odd prime p not dividing r and k ≥ 2. Then by the Chinese Remainder Theorem,
$\mathbb{Z}_n = \mathbb{Z}_{p^k} \times \mathbb{Z}_r$. Any $a \in \mathbb{Z}_{p^k}^*$ satisfies $a^{\phi(p^k)} = 1$ by Euler’s Theorem, so if $a^{n-1} = 1$ as well, then we must have $a^d = 1$ where
$$d = \gcd(p^k - p^{k-1}, \; p^k r - 1).$$
Since $\mathbb{Z}_{p^k}^*$ is cyclic, exactly d elements $a \in \mathbb{Z}_{p^k}^*$ satisfy $a^d = 1$. As p cannot divide $p^k r - 1$, the largest possible value for d is
p − 1, giving an upper bound of $(p-1)/(p^k - 1) \le 1/4$ probability that n will pass the Fermat test.
Hence if n is a Carmichael number, then n is squarefree and is the product of at least three distinct primes.
Recall our first suggestion for improving the Fermat test was to check a candidate x satisfies x(n−1)/2 = ±1 after checking
xn−1 = 1, since there are no nontrivial square roots of unity modulo a prime.
We can improve this by checking instead that
$$x^{(n-1)/2} = \left(\frac{x}{n}\right).$$
This is known as the Solovay-Strassen test. Recall that the Jacobi symbol can be evaluated quickly using quadratic reciprocity.
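A single round of the test can be sketched as follows. This is an illustration only, assuming n is odd and greater than 2; it relies on the standard facts that the rule for (2/n) and the reciprocity law carry over to the Jacobi symbol. The names `jacobi`, `powMod` and `passesSS` are hypothetical.

```haskell
-- Jacobi symbol (a/n) for odd positive n, computed without factoring n:
-- pull out factors of 2 using the rule for (2/n), then flip with reciprocity.
jacobi :: Integer -> Integer -> Integer
jacobi a n = go (a `mod` n) n 1
  where
    go 0 m s = if m == 1 then s else 0
    go x m s
      | even x    = go (x `div` 2) m (if m `mod` 8 `elem` [3, 5] then -s else s)
      | otherwise = go (m `mod` x) x (if x `mod` 4 == 3 && m `mod` 4 == 3 then -s else s)

-- Modular exponentiation by repeated squaring: b^e mod m, for e >= 0.
powMod :: Integer -> Integer -> Integer -> Integer
powMod _ 0 m = 1 `mod` m
powMod b e m
  | even e    = half * half `mod` m
  | otherwise = b `mod` m * powMod b (e - 1) m `mod` m
  where half = powMod b (e `div` 2) m

-- True if the base x passes the Solovay-Strassen check for odd n > 2.
passesSS :: Integer -> Integer -> Bool
passesSS n x = j /= 0 && powMod x ((n - 1) `div` 2) n == j `mod` n
  where j = jacobi x n
```

For the Carmichael number 561 the base 5 passes the Fermat test (`powMod 5 560 561` is 1) but `passesSS 561 5` is False, while every base in [1..102] passes for the prime 103.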
Why does this help? From above we know that if n is not squarefree then n fails the Fermat test with probability at least 3/4. So
we need only consider the case when n is squarefree but composite, say n = pr where p is prime and r is an odd number greater
than 1.
By the Chinese Remainder Theorem any x ∈ Zn can be written as (a, b) ∈ Z p × Zr . For any nonzero a ∈ Z p we have a p−1 = 1
by Fermat, so if an−1 = 1 then we have ad = 1 where d = gcd(p − 1, n − 1).
If d < p − 1 then d is at most (p − 1)/2. Since Z∗p is cyclic, at most (p − 1)/2 elements of a ∈ Z p satisfy ad = 1. That is, the
probability a passes the Fermat test is at most 1/2.
On the other hand, if d = p − 1, then this implies p − 1|n − 1. Since n − 1 = pr − 1 = (p − 1)r + r − 1 we have p − 1|r − 1, so
write r − 1 = s(p − 1). Then
$$a^{pr-1} = a^{(p-1)r} a^{(p-1)s} = 1.$$
But we have
$$\left(\frac{x}{n}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{r}\right)$$
As r contains at least one odd prime factor, the sign of (b|r) is positive or negative with the same probability, that is, there is a
1/2 chance that x fails the test.
In other words, for any composite n, including Carmichael numbers, the probability n passes the Solovay-Strassen test is at most
1/2.
This was the first algorithm discovered for finding large primes. The Miller-Rabin test surpasses the Solovay-Strassen test in
every way: the probability a composite number n passes is only 1/4, and no Jacobi symbol computations are required. Moreover,
any a that exposes the compositeness of n in the Solovay-Strassen test also triggers the Miller-Rabin test.
17 Multiplicative Functions
An arithmetical function, or ’number-theoretic function’ is a complex-valued function defined for all positive integers. It can be
viewed as a sequence of complex numbers.
Examples: n!, φ (n), π(n) which denotes the number of primes less than or equal to n.
An arithmetical function is multiplicative if f (mn) = f (m) f (n) whenever gcd(m, n) = 1, and totally multiplicative or completely
multiplicative if this holds for any m, n. Thus f (1) = 1 unless f is the zero function, and a multiplicative function is completely
determined by its behaviour on the prime powers.
Examples: We have seen that the Euler totient function φ is multiplicative but not totally multiplicative (this is one reason it
is convenient to have φ (1) = 1). The function n2 is totally multiplicative. The product of (totally) multiplicative functions is
(totally) multiplicative.
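Theorem: Let f (n) be a multiplicative function, and define
$$F(n) = \sum_{d \mid n} f(d).$$
Then F is also multiplicative.
Proof: Let gcd(m, n) = 1. Every divisor d of mn can be written uniquely as d = rs with r | m and s | n, and then gcd(r, s) = 1, so
$$\begin{aligned} F(mn) &= \sum_{d \mid mn} f(d) \\ &= \sum_{r \mid m,\, s \mid n} f(rs) \\ &= \sum_{r \mid m,\, s \mid n} f(r) f(s) \\ &= \sum_{r \mid m} f(r) \sum_{s \mid n} f(s) \\ &= F(m)F(n) \end{aligned}$$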
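As an application, since φ is multiplicative so is $\sum_{d \mid n} \phi(d)$, and on a prime power $p^k$ the sum telescopes to $p^k$, hence $n = \sum_{d \mid n} \phi(d)$. This theorem can also be proved using basic facts about cyclic groups.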
Examples: The divisors of 12 are 1, 2, 3, 4, 6, 12. Their totients are 1, 1, 2, 2, 2, 4 which sum to 12.
The function f (n) = 1 is (totally) multiplicative. Let d(n) be the number of divisors of n. Then since d(n) = ∑d|n 1 we see that
d(n) is multiplicative.
The function f (n) = n is (totally) multiplicative. Let σ (n) be the sum of divisors of n. Then since $\sigma(n) = \sum_{d \mid n} d$ we see that σ (n)
is multiplicative. In general, we can apply this trick to any power of n.
A positive integer n is a perfect number if σ (n) = 2n. The first perfect numbers are 6, 28, 496, 8128.
It is not known if any odd perfect numbers exist.
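The divisor functions above are short to write down directly. The following brute-force sketch is meant for small n only; `divisors`, `numDivisors`, `sigma` and `isPerfect` are hypothetical names for this illustration.

```haskell
-- All positive divisors of n.
divisors :: Int -> [Int]
divisors n = [d | d <- [1 .. n], n `mod` d == 0]

-- d(n): the number of divisors of n.
numDivisors :: Int -> Int
numDivisors = length . divisors

-- sigma(n): the sum of the divisors of n.
sigma :: Int -> Int
sigma = sum . divisors

-- A perfect number satisfies sigma(n) = 2n.
isPerfect :: Int -> Bool
isPerfect n = sigma n == 2 * n
```

Multiplicativity can be spot-checked on coprime arguments, for example by comparing `sigma 36` with `sigma 4 * sigma 9`.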
Let n be an even perfect number, so $n = 2^{q-1} m$ for some q > 1 and odd m. Since σ is multiplicative,
$$2^q m = 2n = \sigma(n) = \sigma(2^{q-1})\sigma(m) = (2^q - 1)\sigma(m)$$
hence $2^q \mid \sigma(m)$, so write $\sigma(m) = 2^q s$ for some s and hence $(2^q - 1)s = m$.
Thus two of the divisors of m are s and $m = (2^q - 1)s$. But these already sum to $2^q s$, hence m can have no other divisors, implying
that s = 1 and $m = 2^q - 1$ is prime. The converse is clear, thus n is an even perfect number if and only if
$$n = 2^{q-1}(2^q - 1)$$
with $2^q - 1$ prime.
This implies q is prime as d | q implies $2^d - 1 \mid 2^q - 1$. The converse is unsurprisingly false. A number of the form $2^q - 1$ is called
a Mersenne number, and if it is prime, then it is called a Mersenne prime.
Mersenne conjectured that for q ≤ 257 the only primes q which yielded primes $2^q - 1$ were 2, 3, 5, 7, 13, 17, 19, 31, 67, 127, 257.
He made five mistakes: 67, 257 should not be on the list, and he missed 61, 89, 107.
Modulo 10, the powers of 2 cycle through 2, 4, 8, 6, hence $n = 2^{q-1}(2^q - 1)$ cycles through 2(4 − 1), 4(8 − 1), 8(6 − 1), 6(2 − 1).
The third of these is 0, implying 5 divides n which means n cannot be perfect, because 5 is not one below a power of 2. The other
possibilities imply every even perfect number ends with 6 or 8.
Applying a similar but more exhausting calculation modulo 100 for q > 2, we find even perfect numbers other than 6 must end
with 28 or an odd digit followed by 6.
A regular n-gon can be constructed with straight-edge and compass if and only if
$$n = 2^k p_1 \cdots p_m$$
where each $p_i$ is a distinct Fermat prime and k is some nonnegative integer. In 1796, Gauss proved the sufficiency of this condition
(though not the necessity) when he was only 19.
18 Möbius Inversion
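$$F(n) = \sum_{d \mid n} f(d).$$
$$= \sum_{dr \mid n} \mu(d) f(r) = \sum_{r \mid n} f(r) \sum_{d \mid (n/r)} \mu(d)$$
A little thought leads to this unique solution, known as the ’Möbius function’:
$$\mu(n) = \begin{cases} 1 & n = 1 \\ 0 & p^2 \mid n \text{ for some prime } p \\ (-1)^r & n = p_1 \cdots p_r \text{ for distinct primes } p_i \end{cases}$$
if and only if
$$f(n) = \sum_{d \mid n} \mu(n/d) F(d)$$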
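The inversion formula can be exercised on small inputs. In this sketch `mobius` computes µ by trial division, `summatory` builds F from f, and `invert` applies the formula above; all three names, and `divisors`, are hypothetical helpers for the illustration.

```haskell
-- All positive divisors of n.
divisors :: Int -> [Int]
divisors n = [d | d <- [1 .. n], n `mod` d == 0]

-- Moebius function by trial division.
mobius :: Int -> Int
mobius n = go n 2
  where
    go 1 _ = 1
    go m p
      | p * p > m            = -1   -- m is a single remaining prime
      | m `mod` (p * p) == 0 = 0    -- a square divides n
      | m `mod` p == 0       = negate (go (m `div` p) (p + 1))
      | otherwise            = go m (p + 1)

-- F(n) = sum of f(d) over the divisors d of n.
summatory :: (Int -> Int) -> Int -> Int
summatory f n = sum [f d | d <- divisors n]

-- Moebius inversion: the sum of mu(n/d) F(d) over the divisors d of n.
invert :: (Int -> Int) -> Int -> Int
invert bigF n = sum [mobius (n `div` d) * bigF d | d <- divisors n]
```

Applying `invert (summatory f)` to any n should return `f n`, whether or not f is multiplicative.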
First we see that 1 is a generator for Z∗2 and 3 is a generator for Z∗4 . A quick check reveals Z∗8 has no generator: the square of any
odd number is 1 modulo 8.
Next suppose Z∗2k has a generator g for some k > 3. Then for each a ∈ Z∗8 , we have gx = a (mod 2k ) for some x. This equation
still holds modulo 8 since 8|2k . But this is a contradiction since it would imply g is a generator of Z∗8 .
Thus if n is a power of 2, Z∗n has a generator if and only if n = 2 or n = 4.
We examine the behaviour of units under exponentiation modulo a power of two more closely. By induction we can show that
$$(1 + 2n)^{2^{t-3}} = 1 + 2^{t-2}(n - n^2 + 2n^4) \pmod{2^t}$$
Let p be an odd prime. Let g be a generator of Z∗p . We try to find a generator of Z∗p2 .
Intuitively, such a generator ought to relate to the generators of Z∗p . So consider the problem of finding the order of g + kp in Z∗p2
for any k.
Let t be the order of g + kp in Z∗p2 . From (g + kp)t = gt = 1 (mod p) we deduce (p − 1)|t. Since t|φ (p2 ) = p(p − 1), there are
two possibilities. Either t = p − 1 or t = p(p − 1). In the latter case, g + kp generates Z∗p2 .
But the former case (g + kp) p−1 = 1 (mod p2 ) occurs if and only if
(g + kp) p = g + kp (mod p2 ).
$$g^t = 1 + kp$$
for some k, and setting s = 1, r = 1 in the lemma yields:
$$g^{tp} = 1 \pmod{p^2}$$
Since g generates Z∗p2 , the exponent t p must be a multiple of φ (p2 ) = p(p − 1). Therefore t is a multiple of p − 1 = φ (p) so g
generates Z∗p . (We can generalize to show any generator for a prime power is also a generator for any lower power.)
Using the lemma is somewhat gratuitous as we could have reasoned as above when we showed a generator of a power of 2 must
be a generator of lower power of 2, but on the other hand it’s satisfying to use one lemma to dispatch all the cases.
Now for the other powers of p. We just established g generates Z∗p , that is:
g p−1 = 1 + kp
for some k, which must be coprime to p as g generates Z∗p2 .
Applying the lemma with t = p − 1, s = 1 gives:
$$g^{(p-1)p^r} = 1 + kp^{1+r} \pmod{p^{r+2}}$$
When r = 1, this is:
$$g^{\phi(p^2)} = 1 + kp^2 \pmod{p^3}$$
Let t be the order of g modulo p3 so t|φ (p3 ). We also have:
gt = 1 (mod p2 )
which means φ (p2 )|t because g generates Z∗p2 . Thus t is either φ (p2 ) or φ (p3 ). It cannot be the former since k is coprime to p,
thus it must be the latter and thus g generates Z∗p3 .
We iterate this argument for higher powers of p.
We first consider odd n. Write $n = p_1^{k_1} \cdots p_m^{k_m}$. By the Chinese Remainder Theorem we have
$$\mathbb{Z}_n^* = \mathbb{Z}_{p_1^{k_1}}^* \times \cdots \times \mathbb{Z}_{p_m^{k_m}}^*$$
Each x ∈ Z∗n corresponds to some element $(x_1, ..., x_m)$ of the right-hand side. Now each $x_i$ satisfies
$$x_i^{\phi(p_i^{k_i})} = 1 \pmod{p_i^{k_i}}$$
so, writing $\lambda(n) = \mathrm{lcm}(\phi(p_1^{k_1}), ..., \phi(p_m^{k_m}))$,
we find $(x_1, ..., x_m)^{\lambda(n)} = (1, ..., 1)$, thus x has order dividing λ (n). On the other hand, if we choose each $g_i$ to be a generator of $\mathbb{Z}_{p_i^{k_i}}^*$
then $(g_1, ..., g_m)$ has order λ (n).
Hence a generator exists in Z∗n if and only if λ (n) = φ (n). Since each $\phi(p_i^{k_i})$ is even, λ (n) = φ (n) can only hold when m = 1, that is,
n must be a prime power.
Now suppose n = 2k q where q is odd. Again by the Chinese Remainder Theorem we have Z∗n = Z∗2k × Z∗q . If k > 2 then Z∗n has no
generator since Z∗8 doesn’t. If k = 2, since Z∗4 has order 2, the lowest common multiple of φ (4) and φ (q) is less than φ (4q), thus
no generator exists. Lastly if k = 1 then if g is a generator for Zq then λ (n) = φ (n) thus a generator exists ((1, g) is a generator).
In summary, Z∗n has a generator precisely when n = 2, 4, pk , 2pk for odd primes p and positive integers k.
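We can use the above to tighten Euler’s Theorem. Write $n = 2^k p_1^{k_1} \cdots p_m^{k_m}$ for odd distinct primes $p_i$, and define
$$\lambda(n) = \mathrm{lcm}(\phi(2^k), \phi(p_1^{k_1}), ..., \phi(p_m^{k_m}))$$
for k ≤ 2 and
$$\lambda(n) = \mathrm{lcm}(\phi(2^{k-1}), \phi(p_1^{k_1}), ..., \phi(p_m^{k_m}))$$
for k > 2.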
Then aλ (n) = 1 for all a ∈ Z∗n , and furthermore λ (n) is the smallest positive integer satisfying this condition because there exists
a ∈ Z∗n with order λ (n).
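The definition of λ(n) translates directly into code. This is a minimal sketch using trial-division factoring, so it only suits small n; `factorize` and `carmichael` are hypothetical names for the illustration.

```haskell
-- Prime factorization as (prime, exponent) pairs, by trial division.
factorize :: Integer -> [(Integer, Int)]
factorize = go 2
  where
    go p n
      | n == 1         = []
      | p * p > n      = [(n, 1)]
      | n `mod` p == 0 = let (k, rest) = strip p n 0 in (p, k) : go (p + 1) rest
      | otherwise      = go (p + 1) n
    -- Divide out every factor of p, counting how many were removed.
    strip p n k
      | n `mod` p == 0 = strip p (n `div` p) (k + 1)
      | otherwise      = (k, n)

-- lambda(n): lcm of phi over the prime powers, except that 2^k contributes
-- phi(2^(k-1)) = 2^(k-2) when k > 2.
carmichael :: Integer -> Integer
carmichael n = foldr lcm 1 [part p k | (p, k) <- factorize n]
  where
    part 2 k | k > 2 = 2 ^ (k - 2)
    part p k = (p - 1) * p ^ (k - 1)
```

For example `carmichael 561` is the lcm of 2, 10 and 16, which is much smaller than φ(561) = 320.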
We can apply our new knowledge to study quadratic residues in more general settings.
If Z∗n has a generator, then φ (n) plays the same role as p − 1 in the odd prime case for quadratic residues.
For example, let us consider when −1 is a quadratic residue.
For odd primes p, $\phi(p^k) = p^k - p^{k-1}$, which is 0 or 2 mod 4 depending on whether p is 1 or 3 mod 4, so −1 is a quadratic residue
in $\mathbb{Z}_{p^k}$ if and only if p = 1 (mod 4).
For p = 2, if n is a quadratic residue modulo $2^k$ then it must also be a quadratic residue for all lower powers of 2, which implies,
for example, −1 is a quadratic residue only when k = 1.
Write $n = 2^k \prod p_i^{k_i}$ for odd distinct primes $p_i$. By the Chinese remainder theorem, −1 is a quadratic residue if and only if k ≤ 1
and each $p_i$ = 1 mod 4.
20 Cyclotomic Equations
We try to solve the cyclotomic equation x p − 1 = (x − 1)(x p−1 + x p−2 + ... + 1) = 0 algebraically. (Transcendentally, the roots
are e2πik/p for k = 0, ..., p − 1.)
It can be easily shown that if gcd(m, n) = 1, then a primitive mth root of unity times a primitive nth root of unity is a primitive
mnth root of unity, thus we need only consider prime powers. But then if α is a primitive pth root of unity, then any $p^{k-1}$th root of α is a primitive
$p^k$th root of unity, so we need only consider the case where p is prime.
In general we can use Gauss’ method, but let us see how far elementary methods lead us.
p = 3: we merely solve the quadratic $x^2 + x + 1 = 0$ to obtain
$$x = \frac{-1 \pm i\sqrt{3}}{2}$$
p = 5: we could solve the quartic $x^4 + x^3 + x^2 + x + 1 = 0$ but since it is palindromic we make the variable substitution y = x + 1/x,
and solve
$$y^2 + y - 1 = 0$$
to find
$$y = \frac{-1 \pm \sqrt{5}}{2}$$
and $x^2 - yx + 1 = 0$ implies
$$x = \frac{y \pm \sqrt{y^2 - 4}}{2}$$
giving the four solutions
$$x = \frac{\sqrt{5} - 1 \pm \sqrt{-2\sqrt{5} - 10}}{4}, \quad \frac{-\sqrt{5} - 1 \pm \sqrt{2\sqrt{5} - 10}}{4}$$
p = 7: the palindrome yields a cubic which can be solved for x.
p = 11: the palindrome yields a quintic. Now elementary methods fail us and we need to resort to Gauss’ method as Vandermonde
did.
21 The Heptadecagon
In 1796, a teenage Gauss proved that a regular 17-gon can be constructed using a straight-edge and compass by showing that a
primitive 17th root of unity can be found by solving a succession of quadratic equations over the rationals.
Factorizing $x^{17} - 1 = 0$ yields:
$$(x - 1)(1 + x + \cdots + x^{16}) = 0$$
Let $\zeta = e^{2\pi i/17}$ be a primitive 17th root of unity. Since ζ ≠ 1, we must have:
$$\zeta + \cdots + \zeta^{16} = -1$$
Since 3 is a generator of Z∗17, the primitive 17th roots of unity can be written in the sequence
$$\zeta^{3^0}, \zeta^{3^1}, ..., \zeta^{3^{15}}$$
Define x1 to be the sum of every second member of the sequence, and x2 to be the sum of the other members, that is,
$$x_1 = \zeta^{3^0} + \zeta^{3^2} + \cdots + \zeta^{3^{14}}$$
$$x_2 = \zeta^{3^1} + \zeta^{3^3} + \cdots + \zeta^{3^{15}}$$
Then $x_1 + x_2 = -1$. By construction, x1 and x2 are Gaussian periods which means it is easy to compute $x_1 x_2 = -4$ (or use brute
force(!)), thus x1, x2 are the roots of a quadratic equation with integer coefficients, $x^2 + x - 4 = 0$, namely $(-1 \pm \sqrt{17})/2$. The solution x1 is the
positive one since only two terms in its sum point to the left on the complex plane.
Next define y1, y2 from the elements used to construct x1 in a similar way:
$$y_1 = \zeta^{3^0} + \zeta^{3^4} + \zeta^{3^8} + \zeta^{3^{12}}$$
$$y_2 = \zeta^{3^2} + \zeta^{3^6} + \zeta^{3^{10}} + \zeta^{3^{14}}$$
Thus
$$y_1 = \zeta + \zeta^{13} + \zeta^{16} + \zeta^{4}$$
$$y_2 = \zeta^{9} + \zeta^{15} + \zeta^{8} + \zeta^{2}$$
Then $y_1 + y_2 = x_1$. It turns out $y_1 y_2 = -1$, thus y1, y2 are roots of a quadratic equation with coefficients involving the integers
and x1.
Similarly we can define y3, y4 from x2
$$y_3 = \zeta^{3} + \zeta^{5} + \zeta^{14} + \zeta^{12}$$
$$y_4 = \zeta^{10} + \zeta^{11} + \zeta^{7} + \zeta^{6}$$
and solve a quadratic to obtain their values.
Now define z1, z2 from y1 in this fashion:
$$z_1 = \zeta + \zeta^{16}$$
$$z_2 = \zeta^{13} + \zeta^{4}$$
We have $z_1 + z_2 = y_1$ and $z_1 z_2 = y_3$, so z1, z2 can be found from a quadratic whose coefficients we know. Lastly we either note
that both the sum and product of ζ and $\zeta^{16}$ are known so they can be found from a quadratic, or use the fact that
$$\zeta + \zeta^{16} = 2\cos(2\pi/17)$$
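The chain of quadratics can be followed numerically as a sanity check. The sketch below is only an illustration in floating point: it assumes $y_3 y_4 = -1$ (by the same kind of computation as for $y_1 y_2$) and takes the larger root of each quadratic, which matches x1, y1, y3 and z1 here. The name `cos17` is hypothetical.

```haskell
-- cos(2*pi/17) computed with arithmetic and square roots only,
-- following the periods x1, x2, y1, y3, z1.
cos17 :: Double
cos17 = z1 / 2
  where
    r17 = sqrt 17
    x1 = (-1 + r17) / 2                        -- x1 + x2 = -1, x1 x2 = -4
    x2 = (-1 - r17) / 2
    y1 = (x1 + sqrt (x1 * x1 + 4)) / 2         -- y1 + y2 = x1, y1 y2 = -1
    y3 = (x2 + sqrt (x2 * x2 + 4)) / 2         -- y3 + y4 = x2, assuming y3 y4 = -1
    z1 = (y1 + sqrt (y1 * y1 - 4 * y3)) / 2    -- z1 + z2 = y1, z1 z2 = y3
```

Comparing `cos17` with `cos (2 * pi / 17)` shows agreement up to rounding error.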
Using the above, we can give an elementary method for finding cos(2π/17) that seems to work magically. If we don’t mention
generators the solution appears mysterious.
Let cm = cos(2πm/17). By considering the sums of the roots of unity we have 2(c1 + ... + c8 ) = −1.
Set
a = c1 c4 , b = c3 c5 , c = c2 c8 , d = c6 c7 .
By basic trigonometric identities we have
2a = c3 + c5 , 2b = c2 + c8 , 2c = c6 + c7 , 2d = c1 + c4 .
We find ac = −1/16, and similarly bd = −1/16. We also find 16ab = −1 + 4a − 4b, along with similar equations for bc, cd, da. Define
a + c = 2e, b + d = 2 f
Then
e + f = −1/8, 4e f = ab + bc + cd + ad = −1/4
so we can solve a quadratic equation to find e, f . Once we have them, we can solve a quadratic equation to find a, c, and another
to find b, d. With these values we can solve for c1 .
[I found this version in a solution that also describes a practical straight-edge-and-compass construction.]
Theorem: Let
$$f(x) = a_0 + a_1 x + \cdots + a_n x^n$$
be a polynomial with integer coefficients. Suppose a prime p divides each of $a_0, a_1, ..., a_{n-1}$ (every coefficient except the leading
coefficient) but not $a_n$, and that $p^2$ does not divide $a_0$. Then f (x) cannot be written as a product of two nonconstant polynomials with integer coefficients.
Proof: Suppose f = gh. Look at this factorization modulo p.
It turns out F p [x] is a unique factorization domain. Modulo p, since f = an xn , we find g = bxd and h = cxe for some b, c, d + e = n.
In other words, p must divide every non-leading coefficient of g and h. In particular, p divides constant terms of g and h, hence
p2 must divide their product, that is, the constant term of f .
We can prove the theorem without introducing UFDs. As above, modulo p we have f = an xn , so f (0) = 0, thus g(0)h(0) = 0
modulo p. Then at least one of g(0) and h(0) is 0. Without loss of generality g(0) = 0, so g = xg1 for some polynomial g1 .
Then an xn−1 = g1 h, and repeating this argument n − 1 times shows g = bxd and h = cxe for some b, c, d + e = n. We now argue
as before.
We can also prove the theorem more directly. Suppose f = gh for polynomials g, h with integer coefficients. Let
g(x) = bd xd + ... + b0
and
h(x) = ce xe + ... + c0
for some d + e = n. The conditions imply p divides exactly one of b0 and c0 . Without loss of generality, say p divides b0 but not
c0 .
Since p divides
a1 = b1 c0 + b0 c1
we deduce p divides b1 . We now know p divides b0 , b1 but not c0 .
Since p divides
a2 = b2 c0 + b1 c1 + b0 c2
we deduce p divides b2 . We now know p divides b0 , b1 , b2 but not c0 .
Continuing in this manner on a3 ...ad , we conclude by induction that p divides each of b0 ...bd .
But this implies p divides bd ce = an , a contradiction.
We usually combine Eisenstein’s criterion with the next theorem for a stronger statement. (The name "Gauss’ Lemma" has been
given to several results in different areas of mathematics, including the following.)
Theorem: Let f ∈ Z[x]. Then f is irreducible over Z[x] if and only if f is irreducible over Q[x].
(In other words, Let f (x) be a polynomial with integer coefficients. If f (x) has no factors with integer coefficients, then f (x) has
no factors with rational coefficients.)
Proof: Let f (x) = g(x)h(x) be a factorization of f into polynomials with rational coefficients. Then for some rational a the
polynomial ag(x) has integer coefficients with no common factor. Similarly we can find a rational b so that bh(x) has the same
properties. (Take the lcm of the denominators of the coefficients in each case, and then divide by any common factors.)
Suppose a prime p divides ab. Since
ab f (x) = (ag(x))(bh(x))
becomes 0 = (ag(x))(bh(x)) modulo p, we see ag(x) or bh(x) is the zero polynomial modulo p. (If not, then let the term of
highest degree in ag(x) be mxr , and the term of highest degree in bh(x) be nxs . Then the product contains the term mnxr+s ̸= 0
(mod p), a contradiction.)
In other words, p divides each coefficient of ag(x) or bh(x), a contradiction. Hence ab = 1 and we have a factorization over the
integers.
Example: Let p be a prime. Consider the polynomial
f (x) = 1 + x + ... + x p−1 .
We cannot yet apply the criterion, so make the variable substitution x = y + 1. Then we have
g(y) = 1 + (y + 1) + ... + (y + 1) p−1 .
Note f (x) is irreducible if and only if g(y) is irreducible.
The coefficient of $y^k$ in g(y) is
$$\sum_{m=k}^{p-1} \binom{m}{k} = \binom{p}{k+1}.$$
The last equality can be shown via repeated applications of Pascal’s identity:
$$\binom{n+1}{k} = \binom{n}{k} + \binom{n}{k-1}.$$
Alternatively, use the fact
$$g(y) = \frac{(y+1)^p - 1}{(y+1) - 1}$$
Thus p divides each coefficient except the leading coefficient, and p2 does not divide the constant term p, hence f (x) is irreducible
over the rationals.
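The hypotheses of the criterion are simple divisibility checks, so the example can be verified mechanically. This is a sketch for illustration; `eisenstein`, `choose` and `shifted` are hypothetical names, and coefficients are listed from the constant term upwards.

```haskell
-- Eisenstein's criterion for coefficients [a0, a1, ..., an] and a prime p:
-- p divides a0 .. a(n-1), p does not divide an, and p^2 does not divide a0.
eisenstein :: Integer -> [Integer] -> Bool
eisenstein p coeffs =
  length coeffs >= 2
    && all (\a -> a `mod` p == 0) (init coeffs)
    && last coeffs `mod` p /= 0
    && head coeffs `mod` (p * p) /= 0

-- Binomial coefficient C(n, k).
choose :: Integer -> Integer -> Integer
choose n k = product [n - k + 1 .. n] `div` product [1 .. k]

-- Coefficients of g(y) = ((y+1)^p - 1) / y: the y^k coefficient is C(p, k+1).
shifted :: Integer -> [Integer]
shifted p = [choose p (k + 1) | k <- [0 .. p - 1]]
```

For instance `shifted 5` gives the coefficients of $y^4 + 5y^3 + 10y^2 + 10y + 5$ in ascending order, and `eisenstein 5 (shifted 5)` confirms the hypotheses hold.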
23 Gaussian Periods
24 Roots of Unity
Gauss generalized his method to find an expression using radicals for any root of unity. (Compare with Vandermonde’s method.)
Suppose we want to find an expression for a primitive pth root of unity ζ for a prime p, and assume we have done so for smaller
primes. Let d, D be factors of p − 1 such that D = qd for some q. Let g be a generator of Z∗p . Let β be a primitive qth root of
unity.
For any expression γ containing ζ , define Sγ to be the same expression with each ζ replaced by ζ g .
Suppose γ satisfies SD γ = γ. Then define
t = γ + β Sγ = x1 − x2
(much cancellation occurs since the sum of the kth roots of unity is zero for any k > 1). By a similar argument, each $t_i^q$ is known,
and thus if we choose qth roots correctly, then
$$\gamma = \frac{1}{q} \sum_{i=1}^{q} \sqrt[q]{t_i^q}$$
(the symbol $\sqrt[q]{\phantom{x}}$ does not have its usual meaning here because the particular qth roots we need may not be real).
Instead of trying every possible root until the resulting γ is correct, we consider the expression $t_i t_1^{q-i}$. If we change each ζ to $\zeta^{g^d}$
(that is apply $S^d$) then $t_i$ changes to $\beta^{-i} t_i$, while from before we know $t_1^{q-i}$ becomes $\beta^{-(q-i)} t_1^{q-i}$, thus their product is unchanged.
Arguing as before, $t_i t_1^{q-i}$ is known for all i, so once we have made a choice for the value of $t_1$ we can easily find the values for
each $t_i$ without guesswork.
Example: Let ζ be a primitive fifth root of unity. We shall derive an expression for ζ in terms of a primitive fourth root of unity.
Set d = 1, D = 4, p = 5. Take g = 2, since 2 generates Z∗5. Then q = 4, β = i. Set γ to simply ζ, so the $t_i$s are:
$$t_1 = \zeta + i\zeta^2 - \zeta^4 - i\zeta^3$$
$$t_2 = \zeta - \zeta^2 + \zeta^4 - \zeta^3$$
$$t_3 = \zeta - i\zeta^2 - \zeta^4 + i\zeta^3$$
$$t_4 = \zeta + \zeta^2 + \zeta^4 + \zeta^3 = -1$$
We compute $t_1^4$ and choose a fourth root of the result, from which we work out $t_2, t_3, t_4$. To make the computation easier we notice
$$t_2^2 = (\zeta + \zeta^2 + \zeta^3 + \zeta^4) + 2(-\zeta^3 + 1 - \zeta^4 - \zeta + 1 - \zeta^2)$$
(Actually first equation is unnecessary since we already have $t_2$ in terms of $t_1$ from before.)
Thus after some algebraic manipulation we find
$$t_1 = \alpha(\sqrt[4]{5}\sqrt{1 + 2i})$$
$$t_2 = -\alpha^2 \sqrt{5}$$
$$t_3 = -\alpha^3(\sqrt[4]{5}\sqrt{1 - 2i})$$
$$t_4 = -1$$
Can you write 67 in the form $x^2 + 7y^2$? By brute force, we find $67 = 2^2 + 7 \times 3^2$, so the answer is yes. But what if I ask the same
question about a larger prime like 1234577?
We’ll soon learn $1234577 = x^2 + 7y^2$ only if −7 is a quadratic residue modulo 1234577. Since 1234577 = 1 (mod 4) and 1234577 = 1 (mod 7), quadratic reciprocity gives
(−7|1234577) = (7|1234577) = (1234577|7) = (1|7) = 1, so this necessary condition holds and does not rule out a solution.
A binary quadratic form is written [a, b, c] and refers to the expression ax2 + bxy + cy2 . We are interested in what numbers can
be represented in a given quadratic form.
The divisor of a quadratic form [a, b, c] is gcd(a, b, c).
Representations x, y with gcd(x, y) = 1 are primitive representations.
If the divisor of a form is 1 then it is a primitive form, but we can forget this. For our purposes, primitive representations matter,
and primitive forms don’t.
Completing the square, we find:
$$4a(ax^2 + bxy + cy^2) = (2ax + by)^2 - dy^2$$
where $d = b^2 - 4ac$. We call d the discriminant.
If d = 0, then the quadratic form is a perfect square. This case is trivial.
If d < 0 then ac > 0, so a, c have the same sign. From the above equality we see if a > 0 then the form is nonnegative for any
x, y. We call such a form positive definite. Similarly, if a < 0 then the form is negative definite.
If d > 0 then a little experimentation shows the form takes negative and positive values. Such a form is termed indefinite.
Given a form, if we swap x and y then the resulting form represents the same numbers. We consider them equivalent. There are
less trivial ways to change a form so it represents the same numbers. If we replace x with x + y, then x = u − v, y = v represents
the same number that x = u, y = v did in the original form.
More generally, let T be a 2x2 matrix with integer entries of determinant ±1, that is, an integral unimodular matrix. We state
facts that are easy but tedious to prove.
The quadratic form [a, b, c] can be written as the matrix:
$$A = \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix}$$
Why? Evaluate $\begin{pmatrix} x & y \end{pmatrix} A \begin{pmatrix} x \\ y \end{pmatrix}$.
We have d = −4 det A; some authors define d to be det A instead of the discriminant, which generalizes nicely beyond the quadratic
case.
Then let:
a′ b′ /2
A′ = T T AT =
b′ /2 c′
for some a′ , b′ , c′ . We write [a, b, c] ∼ [a′ , b′ , c′ ]; this relation is an equivalence relation.
If $\begin{pmatrix} u \\ v \end{pmatrix} = T \begin{pmatrix} u' \\ v' \end{pmatrix}$ then u, v represents the same integer under A as u′, v′ does under A′, and we call these equivalent representations.
Equivalent representations have the same divisor, that is, gcd(u, v) = gcd(u′, v′).
Equivalent forms represent the same integers, and have the same divisor and discriminant.
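To experiment with these facts, we can expand $T^T A T$ by hand and check the claims on small cases. This is a sketch under our own conventions: the type names Form and Mat and the functions transform, disc, eval and apply are illustrative, and T is stored as its two rows ((p, q), (r, s)).

```haskell
type Form = (Integer, Integer, Integer)
type Mat  = ((Integer, Integer), (Integer, Integer))  -- rows of T

-- | Coefficients [a', b', c'] of the form with matrix T^T A T,
-- obtained by substituting x = p*u + q*v, y = r*u + s*v.
transform :: Mat -> Form -> Form
transform ((p, q), (r, s)) (a, b, c) =
  ( a*p^2 + b*p*r + c*r^2
  , 2*a*p*q + b*(p*s + q*r) + 2*c*r*s
  , a*q^2 + b*q*s + c*s^2 )

-- | Discriminant b^2 - 4ac.
disc :: Form -> Integer
disc (a, b, c) = b^2 - 4*a*c

-- | Value of a form at a pair (x, y).
eval :: Form -> (Integer, Integer) -> Integer
eval (a, b, c) (x, y) = a*x^2 + b*x*y + c*y^2

-- | (u, v) = T (u', v'): the representation under A that matches
-- (u', v') under A'.
apply :: Mat -> (Integer, Integer) -> (Integer, Integer)
apply ((p, q), (r, s)) (u', v') = (p*u' + q*v', r*u' + s*v')
```

Replacing x with x + y corresponds to T = ((1, 1), (0, 1)), and transform ((1, 1), (0, 1)) (1, 0, 1) gives (1, 2, 2), which again has discriminant −4. Swapping x and y is T = ((0, 1), (1, 0)), of determinant −1.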
[If two forms represent the same integers, are they necessarily equivalent? I don’t know.]
The principal form of a discriminant d is [1, 0, −k] when d = 4k and [1, 1, −k] when d = 4k + 1. Its equivalence class is the
'principal class of forms of discriminant d'.
By applying transformations judiciously, we can reduce any definite form to [a, b, c] such that −|a| < b ≤ |a| < |c| or 0 ≤ b ≤
|a| = |c|.
Reduction is like Euclid's algorithm. There exist q, r such that −b = 2|c|q + r with −|c| < r ≤ |c|, so we apply the integral
unimodular matrix:
\[ \begin{pmatrix} 0 & 1 \\ -1 & \operatorname{sgn}(c)\,q \end{pmatrix} \]
to transform an unreduced form [a, b, c] to [c, r, c′] for some c′. Repeating eventually leads to |c| ≤ |c′|.
Thus we have a form [a, b, c] satisfying −|a| < b ≤ |a| ≤ |c|, and we are done unless |a| = |c| and b < 0. In this last case, we
apply the above procedure one more time (we’ll find q = 0 and r = −b) to get the reduced form [c, −b, a].
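import Data.List (transpose)

-- | Matrix multiplication on lists of rows.
mmul a b = [[sum $ zipWith (*) r c | c <- transpose b] | r <- a]

-- | Computes T^T A T. This helper is not shown above; we assume it
-- follows the definition A' = T^T A T.
tat t a = mmul (transpose t) (mmul a t)

-- | Reduce a definite form, working with the integer matrix 2A.
reduce (a, b, c)
  | b^2 - 4*a*c >= 0 = error "indefinite form"
  | -abs a < b, b <= abs a, abs a < abs c = (a, b, c)
  | 0 <= b, b <= abs a, abs a == abs c = (a, b, c)
  | otherwise = reduce (div a' 2, b', div c' 2)
  where
    c2 = 2 * abs c
    -- Choose q, r with -b = 2|c|q + r and -|c| < r <= |c|.
    (q0, r0) = (-b) `divMod` c2
    (q, r) | r0 * 2 > c2 = (q0 + 1, r0 - c2)
           | otherwise   = (q0, r0)
    [[a', b'], [_, c']] = tat
      [[0, 1], [-1, signum c * q]]
      [[2*a, b], [b, 2*c]]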
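Multiplying out the matrix product by hand, one step sends [a, b, c] to [c, r, a + bs + cs^2] with s = sgn(c)q. The sketch below uses that closed form instead of matrices, and records every intermediate form. The names isReduced, step and reductionPath are ours, and we assume the input is a definite form without checking it.

```haskell
type Form = (Integer, Integer, Integer)

-- | True when [a, b, c] meets the reduced conditions stated in the text.
isReduced :: Form -> Bool
isReduced (a, b, c) =
  (-abs a < b && b <= abs a && abs a < abs c) ||
  (0 <= b && b <= abs a && abs a == abs c)

-- | One step: choose q, r with -b = 2|c|q + r and -|c| < r <= |c|,
-- then move [a, b, c] to [c, r, a + b*s + c*s^2] where s = sgn(c) * q.
step :: Form -> Form
step (a, b, c) = (c, r, a + b*s + c*s^2)
  where
    m        = 2 * abs c
    (q0, r0) = (-b) `divMod` m
    (q, r) | 2 * r0 > m = (q0 + 1, r0 - m)
           | otherwise  = (q0, r0)
    s        = signum c * q

-- | All intermediate forms, ending at the first reduced one.
-- Assumes a definite form (negative discriminant); this is not checked.
reductionPath :: Form -> [Form]
reductionPath f
  | isReduced f = [f]
  | otherwise   = f : reductionPath (step f)
```

Working [2, 3, 5] (discriminant −31) by hand gives the path [2, 3, 5], [5, −3, 2], [2, −1, 4], and [2, −1, 2] takes the single extra step to [2, 1, 2] described above.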
Our reduction algorithm works for indefinite forms with nonzero a and c. However, it turns out we need to refine our definition
of "reduced" to get useful results in the indefinite case. We’ll skip this part of the theory.
Principal forms are reduced forms.
The above conditions imply $b^2 \le |ac| \le |d|/3$ when $ac \ne 0$. This suggests a brute force algorithm to find all reduced forms of
a given discriminant d.
The following function returns all reduced positive definite forms for a given discriminant. The reduced negative definite forms are the same with
a and c negated.
pos d
  | d >= 0 = error "d must be negative"
  | d `mod` 4 > 1 = error "d must be 0 or 1 mod 4"
  | otherwise = posBs ++ negBs
  where
    -- Candidates x >= n with x^2 <= |d|/3.
    upFrom n = takeWhile (\x -> x^2 <= abs d `div` 3) [n..]
    -- Reduced forms with b >= 0.
    posBs = [(a, b, c) | b <- upFrom 0, a <- upFrom b, a /= 0,
             let (c, r) = divMod (b^2 - d) (4*a), r == 0, c >= a]
    -- Mirror images [a, -b, c], reduced only when b < a < c.
    negBs = [(a, -b, c) | (a, b, c) <- posBs, a /= c, b > 0, a > b]
For example:
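Here are a few small discriminants worked by hand. The values −4, −12 and −23 are chosen for illustration and are not taken from the text. The block restates pos, with a type signature added, so that it loads on its own; the name examples is ours.

```haskell
-- | Reduced positive definite forms of discriminant d (restated from above).
pos :: Integer -> [(Integer, Integer, Integer)]
pos d
  | d >= 0 = error "d must be negative"
  | d `mod` 4 > 1 = error "d must be 0 or 1 mod 4"
  | otherwise = posBs ++ negBs
  where
    upFrom n = takeWhile (\x -> x^2 <= abs d `div` 3) [n..]
    posBs = [(a, b, c) | b <- upFrom 0, a <- upFrom b, a /= 0,
             let (c, r) = divMod (b^2 - d) (4*a), r == 0, c >= a]
    negBs = [(a, -b, c) | (a, b, c) <- posBs, a /= c, b > 0, a > b]

-- | Hypothetical driver: pairs each sample discriminant with its forms.
examples :: [(Integer, [(Integer, Integer, Integer)])]
examples = [ (d, pos d) | d <- [-4, -12, -23] ]
-- pos (-4)  == [(1,0,1)]
-- pos (-12) == [(1,0,3),(2,2,2)]
-- pos (-23) == [(1,1,6),(2,1,3),(2,-1,3)]
```

Note that [2, 2, 2] appears for d = −12 even though its divisor is 2: the function lists every reduced form, primitive or not.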
Since there are only finitely many reduced forms of discriminant d and every form is equivalent to some reduced form, the
number of equivalence classes of forms with discriminant d is finite. We call this number the class number of the discriminant d.
At least that’s what I understood from Number Theory by John Hunter. I think the class number is actually the number of
equivalence classes of positive definite forms when d < 0, as there’s no point doubling the total by also counting the negative
definite forms.
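Under that reading, the class number for d < 0 is just the length of the list of reduced positive definite forms. As a cross-check, here is a sketch that enumerates them straight from the reduced conditions; the names reducedForms and classNumber are ours, and we assume d < 0 without checking. Like the text, it counts every reduced form, primitive or not.

```haskell
-- | Reduced positive definite forms of discriminant d < 0, by direct search.
-- Uses 3a^2 <= |d|, which follows from a <= c and |ac| <= |d|/3.
reducedForms :: Integer -> [(Integer, Integer, Integer)]
reducedForms d =
  [ (a, b, c)
  | a <- takeWhile (\t -> 3 * t * t <= abs d) [1 ..]
  , b <- [1 - a .. a]                         -- -a < b <= a
  , let (c, r) = (b * b - d) `divMod` (4 * a)
  , r == 0
  , c > a || (c == a && b >= 0) ]             -- a < c, or a = c with b >= 0

-- | Number of reduced positive definite forms of discriminant d.
classNumber :: Integer -> Int
classNumber = length . reducedForms
```

By hand, classNumber (-4) is 1, classNumber (-23) is 3, and classNumber (-163) is 1, the only form in the last case being [1, 1, 41].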
Theorem: The equivalence class of a positive definite binary quadratic form contains exactly one reduced form.
Proof. Let f = [a, b, c] be a reduced positive definite binary quadratic form. Again, we complete the square:
\[ 4a(ax^2 + bxy + cy^2) = (2ax + by)^2 - dy^2 \]
Then:
\[ a = ap^2 + bpr + cr^2 > ap^2 - a|pr| + ar^2 \ge 2a|pr| - a|pr| = a|pr| \]
Thus we must have pr = 0. If p = 0 then $r \ne 0$, whence $a = ap^2 + bpr + cr^2 > c$, a contradiction. So r = 0, which means
ps = 1.
Then:
\[ -b = 2apq + b(ps + qr) + 2crs = 2apq + b \]
We deduce |b| = a|pq|, which implies b = 0 or b = a. Since [a, −a, c] is not reduced, we must have b = 0.
Theorem: An integer n is primitively representable by a quadratic form [a, b, c] if and only if [a, b, c] ∼ [n, b′, c′] for some b′, c′.
Proof. If $n = ax^2 + bxy + cy^2$ and gcd(x, y) = 1 then the extended Euclidean algorithm can find p, q so that px − qy = 1. Then applying the
integral unimodular transformation:
\[ \begin{pmatrix} x & q \\ y & p \end{pmatrix} \]
to [a, b, c] gives [n, b′, c′] for some b′, c′.
Conversely, (1, 0) primitively represents n for [n, b′ , c′ ], so transforms to a primitive representation of n for [a, b, c].
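The forward direction is constructive, so we can sketch it in code. The function names egcd and leadWith are ours. We assume forms are triples of Integers, and the coefficients b′, c′ come from expanding the substitution by hand.

```haskell
type Form = (Integer, Integer, Integer)

-- | Extended Euclid: egcd a b = (g, s, t) with s*a + t*b == g.
-- The gcd g may come out negative when the inputs are negative.
egcd :: Integer -> Integer -> (Integer, Integer, Integer)
egcd a 0 = (a, 1, 0)
egcd a b = (g, t, s - (a `div` b) * t)
  where (g, s, t) = egcd b (a `mod` b)

-- | Given a primitive representation (x, y) of n by [a, b, c], build the
-- unimodular matrix with rows (x, q), (y, p), where p*x - q*y = 1, and
-- return the transformed form [n, b', c'].
leadWith :: Form -> (Integer, Integer) -> Maybe Form
leadWith (a, b, c) (x, y)
  | abs g /= 1 = Nothing               -- the representation is not primitive
  | otherwise  = Just ( a*x^2 + b*x*y + c*y^2
                      , 2*a*x*q + b*(x*p + q*y) + 2*c*y*p
                      , a*q^2 + b*q*p + c*p^2 )
  where
    (g, s, t) = egcd x y               -- s*x + t*y == g, with g = 1 or -1
    p = g * s                          -- scaled so that p*x - q*y == 1
    q = negate (g * t)
```

For example, 5 = 1^2 + 2^2, and leadWith (1, 0, 1) (1, 2) gives Just (5, 4, 1), a form of discriminant 16 − 20 = −4 as expected.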
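Theorem: A nonzero integer n is primitively representable by some form of discriminant d if and only if
\[ x^2 \equiv d \pmod{4|n|} \]
for some x.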
Proof. If n is primitively representable, by the previous theorem there exists a form [n, b′, c′] of discriminant d that primitively represents n. The
condition follows immediately from $d = b'^2 - 4nc'$.
Conversely, the condition implies $b'^2 - 4nc' = d$ for some integers b′, c′, and 1, 0 is a primitive representation of n by the form
[n, b′, c′].
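The converse direction also gives a search procedure: look for b′ with b′^2 = d mod 4|n| and read off c′. A sketch follows, with names formsLeading and primitivelyRepresentable of our own choosing, assuming n is nonzero. It suffices to try 0 ≤ b′ < 2|n|, because b′^2 mod 4|n| depends only on b′ mod 2|n|.

```haskell
-- | All forms [n, b', c'] of discriminant d with 0 <= b' < 2|n|.
-- Assumes n /= 0; each result has b'^2 - 4*n*c' == d.
formsLeading :: Integer -> Integer -> [(Integer, Integer, Integer)]
formsLeading d n =
  [ (n, b, (b^2 - d) `div` (4 * n))     -- exact division, by the guard below
  | b <- [0 .. 2 * abs n - 1]
  , (b^2 - d) `mod` (4 * abs n) == 0 ]

-- | Is n primitively represented by some form of discriminant d?
primitivelyRepresentable :: Integer -> Integer -> Bool
primitivelyRepresentable d n = not (null (formsLeading d n))
```

By hand, formsLeading (-12) 2 gives [(2, 2, 2)], while formsLeading (-4) 3 is empty and formsLeading (-4) 5 gives [(5, 4, 1), (5, 6, 2)].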
Example: The form $x^2 + 3y^2$ of discriminant −12 cannot represent 2, yet $2^2 \equiv -12 \pmod 8$. The above theorem implies there
must exist some form of discriminant −12 in another equivalence class, and indeed we find [2, 2, 2] has the desired discriminant
and can represent 2.
The quadratic form [1, 0, 1] has discriminant −4. From the previous theorem, a positive integer n is primitively represented
by some form of discriminant −4 if and only if there exists x satisfying $x^2 \equiv -4 \pmod{4n}$, which is the same as $x^2 \equiv -1 \pmod n$.
A brief search confirms [1, 0, 1] is the only reduced positive definite form of discriminant −4. Therefore, n is primitively represented
by some form of discriminant −4 if and only if n is primitively represented by [1, 0, 1], namely, n is the sum of two coprime squares.
Factorize n:
\[ n = 2^s \prod_i p_i^{k_i} \]
where the $p_i$ are distinct odd primes. By the Chinese Remainder Theorem, and considering quadratic residues for prime powers,
n is primitively represented by $x^2 + y^2$ if and only if each $p_i \equiv 1 \pmod 4$ and $s \le 1$.
If we allow non-primitive representations, then n is a sum of two squares provided $k_i \equiv 0 \pmod 2$ whenever $p_i \equiv 3 \pmod 4$.
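Both criteria are easy to compare against a brute-force search for small n. The following sketch uses trial division, which is fine for small inputs only. All names here (factorize, primitiveCriterion, anyCriterion, representations) are ours.

```haskell
-- | Prime factorization by trial division, as (prime, exponent) pairs.
factorize :: Integer -> [(Integer, Int)]
factorize = go 2
  where
    go p n
      | n < 2          = []
      | p * p > n      = [(n, 1)]
      | n `mod` p == 0 = let (k, m) = strip p n 0 in (p, k) : go (p + 1) m
      | otherwise      = go (p + 1) n
    -- Divide out p as often as possible, counting the exponent.
    strip p n k
      | n `mod` p == 0 = strip p (n `div` p) (k + 1)
      | otherwise      = (k, n)

-- | Criterion for n = x^2 + y^2 with gcd(x, y) = 1:
-- s <= 1 and every odd prime factor is 1 mod 4.
primitiveCriterion :: Integer -> Bool
primitiveCriterion n = and [ ok p k | (p, k) <- factorize n ]
  where ok 2 k = k <= 1
        ok p _ = p `mod` 4 == 1

-- | Criterion when non-primitive representations are allowed:
-- primes that are 3 mod 4 occur to even powers.
anyCriterion :: Integer -> Bool
anyCriterion n = and [ even k | (p, k) <- factorize n, p `mod` 4 == 3 ]

-- | Brute force: all (x, y) with 0 <= x <= y and x^2 + y^2 == n.
representations :: Integer -> [(Integer, Integer)]
representations n =
  [ (x, y) | x <- small [0 ..], y <- small [x ..], x*x + y*y == n ]
  where small = takeWhile (\t -> t*t <= n)
```

For instance, representations 25 gives [(0, 5), (3, 4)], of which (3, 4) is primitive, while 20 = 2^2 + 4^2 has only a non-primitive representation, matching s = 2 > 1.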