Diophantine Equations: 2 Mordell's Equation
F.Beukers
Spring 2011
2 Mordell’s equation
2.1 Introduction
Let $d \in \mathbb{Z}$ with $d \neq 0$ and consider the equation $y^2 + d = x^3$ in $x, y \in \mathbb{Z}$. This equation is
known as Mordell’s equation. We shall prove the following Theorem.
Actually Mordell proved a more general theorem, but we will come back to that later. It
should be emphasized that Mordell’s proof gives only a finiteness result; it provides no algorithm
to actually solve the equation. Nowadays we also have methods to solve the equation explicitly.
The first results in this direction are based on A. Baker’s technique of linear forms in logarithms,
starting in 1966. This work earned Baker the Fields medal. Here is a theorem based on Baker’s
methods.
Theorem 2.1.2 (Sprindzuk, 1982) There exists an effectively computable number $C > 0$
such that any solution $x, y \in \mathbb{Z}$ of $y^2 + d = x^3$ with $d \neq 0$ satisfies
$$ |x|, |y| \le \exp\left( C|d|(\log|d| + 1)^6 \right). $$
Note that the bound for x, y is roughly exponential in |d| with a very large constant C. One
expects that a much sharper bound holds. This is based on the following conjecture.
Conjecture 2.1.3 (Hall, 1971) To every $\epsilon > 0$ there is a positive real number $C(\epsilon)$ such
that
$$ |y^2 - x^3| > C(\epsilon)\, x^{1/2-\epsilon} $$
for any $x, y \in \mathbb{Z}_{>0}$ with $y^2 \neq x^3$.
Actually Hall conjectured the lower bound $C x^{1/2}$ for some $C > 0$, but this is generally believed
not to be true.
As a consequence of Hall’s conjecture we see that $|x| \le c_1(\epsilon)|d|^{2+\epsilon}$ and hence, since $y^2 = x^3 - d$, that $|y| \le c_2(\epsilon)|d|^{3+\epsilon}$. In other words, the
expected upper bounds for $x, y$ are polynomial in $|d|$.
Nowadays there are explicit algorithms to solve Mordell’s equation. In [GPZ] all equations
with $|d| \le 10000$ are solved. A particularly spectacular example is $y^2 - 17 = x^3$. In 1930
T. Nagell showed that the complete set of solutions with $y > 0$ reads
$$ (x, y) = (-2, 3), (-1, 4), (2, 5), (4, 9), (8, 23), (43, 282), (52, 375), (5234, 378661). $$
References:
[GPZ] J. Gebel, A. Pethö, H. G. Zimmer, On Mordell’s equation, Compositio Math. 110 (1998), 335-367.
This proposition implies for example that $y^2 + 1 = x^3$ has no non-trivial solutions and that
$y^2 + 2 = x^3$ has $(x, y) = (3, \pm 5)$ as solution set. The latter fact was stated already by Fermat,
but not proved.
Here is a proof of our Proposition. Suppose $y^2 + d = x^3$. First we note that $\gcd(x, 2d) = 1$.
For if an odd prime $p$ divides both $d$ and $x$, then $p$ divides $y$ as well. But then $p^2$
divides both $x^3$ and $y^2$, so $p^2$ divides $d$, contradicting the fact that $d$ is square-free.
If $x$ were even, then $y^2 \equiv -d \pmod 8$. But since $-d \not\equiv 0, 1 \pmod 4$ we get a contradiction
again. From now on we may therefore use that $\gcd(x, 2d) = 1$.
We obtain the following factorisation
$$ (y + \sqrt{-d})(y - \sqrt{-d}) = x^3. $$
Since $d$ is square-free and $-d \not\equiv 1 \pmod 4$, the ring of integers in $\mathbb{Q}(\sqrt{-d})$ is $\mathbb{Z}[\sqrt{-d}]$. Let ℘ be
a prime ideal divisor of $(y + \sqrt{-d},\, y - \sqrt{-d})$. Then it also divides $x$ and $2\sqrt{-d}$. Hence it
divides $x$ and $2d$. This contradicts $\gcd(x, 2d) = 1$ and we conclude that the principal ideals
$(y + \sqrt{-d})$ and $(y - \sqrt{-d})$ are relatively prime. Their product is a cube and so we conclude
that $(y + \sqrt{-d})$ itself is the cube of an ideal, which we call $I$. So we get $(y + \sqrt{-d}) = I^3$.
Note that $I^3$ is a principal ideal. Hence the order of $I$ in the ideal class group is either 1 or 3. But
we are given that the class number is not divisible by 3. So the order of $I$ in the ideal class
group is 1, hence $I$ is principal. There exist $a, b \in \mathbb{Z}$ such that $I = (a + b\sqrt{-d})$. Hence
$$ y + \sqrt{-d} = \epsilon (a + b\sqrt{-d})^3 $$
where $\epsilon$ is a unit in the ring of integers. When $d > 1$ we have $\mathbb{Z}[\sqrt{-d}]^* = \{\pm 1\}$ and
$\mathbb{Z}[\sqrt{-1}]^* = \{\pm 1, \pm i\}$. In both cases the unit group has order relatively prime to 3, hence
every unit can be considered as the cube of another unit. After redefining $a, b$ if necessary,
we get
$$ y + \sqrt{-d} = (a + b\sqrt{-d})^3. $$
Comparison of the coefficients before $\sqrt{-d}$ gives $1 = 3a^2 b - d b^3 = b(3a^2 - d b^2)$. We see that
$b = \pm 1$ and $3a^2 - d b^2 = \pm 1$. If $b = 1$, then $3a^2 - d = 1$ and so $d = 3a^2 - 1$. The value of $y$ is
$a^3 - 3ab^2 d = a^3 - 3a(3a^2 - 1) = -(8a^3 - 3a)$. The value of $x$ is $a^2 + d b^2 = a^2 + 3a^2 - 1 = 4a^2 - 1$.
When $b = -1$ we proceed similarly. qed
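The two sign cases of the proof can be checked mechanically. The sketch below is only an illustration: the function name `family_solution` and the parameter `eps` are ours, with `eps = -1` corresponding to b = 1 and `eps = +1` to b = −1. It verifies the identity y^2 + d = x^3 for the resulting family and does not test the hypotheses on d (square-free, class number not divisible by 3).

```python
def family_solution(a, eps):
    """Return (d, x, y) for d = 3a^2 + eps, with eps = +1 or -1.

    eps = -1 is the case b = 1 of the proof, eps = +1 the case b = -1.
    """
    d = 3 * a * a + eps
    x = 4 * a * a + eps
    y = -(8 * a**3 + 3 * eps * a)
    return d, x, y


# The identity y^2 + d = x^3 holds for every integer a and both signs.
for a in range(-20, 21):
    for eps in (1, -1):
        d, x, y = family_solution(a, eps)
        assert y * y + d == x**3

print(family_solution(1, -1))  # d = 2 with x = 3, y = -5 (Fermat's example)
```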
In a very similar way we can show that
Proposition 2.2.2 Let $d > 0$, $d$ square-free, $d \equiv -5 \pmod 8$ and $h(\mathbb{Q}(\sqrt{-d}))$ not divisible
by 3. Suppose that $y^2 + d = x^3$ has a solution. Then one of the following cases holds,
1. There exist $a \in \mathbb{Z}$ and $\epsilon \in \{\pm 1\}$ such that $d = 3a^2 + \epsilon$. The solutions read $(x, y) =
(4a^2 + \epsilon, \pm(8a^3 + 3\epsilon a))$.
2. There exist $a \in \mathbb{Z}$ and $\epsilon \in \{\pm 1\}$ such that $d = 3a^2 + 8\epsilon$. The solutions read $(x, y) =
(a^2 + 2\epsilon, \pm(a^3 + 3\epsilon a))$.
We can apply this Proposition to $y^2 + 11 = x^3$. Note that $d = 11$ satisfies all of our conditions.
Moreover, $11 = 3 \cdot 1^2 + 8$, which gives rise to the solutions $(x, y) = (3, \pm 4)$. In addition,
$11 = 3 \cdot 2^2 - 1$, giving rise to $(x, y) = (15, \pm 58)$.
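As a small sanity check, the two families of Proposition 2.2.2 can be evaluated for a given d. The function name `prop_222_candidates` and the search bound `a_max` are our own choices for this sketch; it only evaluates the formulas and does not verify the hypotheses of the Proposition.

```python
def prop_222_candidates(d, a_max=50):
    """Pairs (x, |y|) predicted by the two families of Proposition 2.2.2."""
    found = set()
    for a in range(0, a_max + 1):
        for eps in (1, -1):
            if d == 3 * a * a + eps:          # case 1
                found.add((4 * a * a + eps, abs(8 * a**3 + 3 * eps * a)))
            if d == 3 * a * a + 8 * eps:      # case 2
                found.add((a * a + 2 * eps, abs(a**3 + 3 * eps * a)))
    return sorted(found)


candidates = prop_222_candidates(11)
for x, y in candidates:
    assert y * y + 11 == x**3   # each candidate really solves y^2 + 11 = x^3
print(candidates)
```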
So far we dealt with $d > 0$ in our equation. Let us consider an example with $d < 0$, namely
$y^2 - 17 = x^3$. This is known to have the solution set
$$ (x, y) = (-2, \pm 3), (-1, \pm 4), (2, \pm 5), (4, \pm 9), (8, \pm 23), (43, \pm 282), (52, \pm 375), (5234, \pm 378661). $$
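The listed solutions are easy to recover by a naive search. The following sketch (function name and search bound are ours) tests whether x^3 − d is a perfect square for all x in a box. It confirms the list inside the box only; that there are no solutions outside it is the content of Nagell’s theorem, not of this computation.

```python
from math import isqrt


def mordell_solutions(d, x_max):
    """All (x, y) with y >= 0, |x| <= x_max and y^2 + d = x^3."""
    sols = []
    for x in range(-x_max, x_max + 1):
        t = x**3 - d
        if t >= 0 and isqrt(t) ** 2 == t:
            sols.append((x, isqrt(t)))
    return sols


print(mordell_solutions(-17, 6000))
```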
is not possible. We conclude that $\pi$ divides 2. But then $x$ is even. We separate according to
the cases $x$ even or odd.
Suppose $x$ is odd. Then, by the above, $y + \sqrt{17}$ and $y - \sqrt{17}$ have no common prime divisor
and hence, by unique factorization, there exist an integer $\alpha \in O_K$ and a unit $\eta$ such that
$$ y + \sqrt{17} = \eta\alpha^3. $$
The units $\eta$ are of the form $\pm(4 + \sqrt{17})^k$. Of course $-1$ is a cube and the unit $\eta$ is a cube
times $1$, $4 + \sqrt{17}$ or $4 - \sqrt{17}$. Hence there exists an integer $\alpha \in O_K$ such that $y + \sqrt{17}$ equals
one of the following
$$ \alpha^3, \quad (4 + \sqrt{17})\alpha^3, \quad (4 - \sqrt{17})\alpha^3. $$
Let us write $\alpha = (a + b\sqrt{17})/2$ with $a, b \in \mathbb{Z}$ having the same parity. Then, in the case
$y + \sqrt{17} = \alpha^3$, comparison of the coefficients of $\sqrt{17}$ gives
$$ 8 = 3a^2 b + 17b^3 = b(3a^2 + 17b^2). $$
Since $b \neq 0$ implies $3a^2 + 17b^2 \ge 17$, the right-hand side has absolute value at least 17 and
we get a contradiction. Comparison of the
coefficients of $\sqrt{17}$ in $y + \sqrt{17} = (4 + \sqrt{17})\alpha^3$ yields
$$ 8 = a^3 + 12a^2 b + 51ab^2 + 68b^3. $$
Replace $a$ by $a - 4b$ to get
$$ 8 = a^3 + 3ab^2 - 8b^3. $$
Hence $a(a^2 + 3b^2) \equiv 0 \pmod 8$, which implies that $a, b$ should both be even. So replace $a, b$
by $2a, 2b$ to get
$$ 1 = a^3 + 3ab^2 - 8b^3. $$
We do not solve this equation, but note that the solutions $(a, b) = (1, 0), (-3, -2)$ give rise
to the solutions $(x, y) = (-1, 4), (43, 282)$ of the Mordell equation. The case $y + \sqrt{17} =
(4 - \sqrt{17})\alpha^3$ runs similarly.
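The passage from a solution $(a, b)$ of $1 = a^3 + 3ab^2 - 8b^3$ back to a solution of Mordell’s equation can be traced with exact arithmetic in $\mathbb{Z}[\sqrt{17}]$. In the sketch below, pairs `(u0, u1)` stand for $u_0 + u_1\sqrt{17}$; the helper names are ours. Undoing the two substitutions gives $\alpha = (a - 4b) + b\sqrt{17}$, and since $4 + \sqrt{17}$ has norm $-1$ we get $x = -N(\alpha)$. The sign of $y$ comes out as the computation dictates; the text lists solutions up to the sign of $y$.

```python
def mul(u, v):
    """Product of u0 + u1*sqrt(17) and v0 + v1*sqrt(17) as a pair."""
    return (u[0] * v[0] + 17 * u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def mordell_from_thue(a, b):
    """Map a solution (a, b) of 1 = a^3 + 3ab^2 - 8b^3 to (x, y)."""
    alpha = (a - 4 * b, b)            # undo a -> a - 4b and (a, b) -> (2a, 2b)
    cube = mul(alpha, mul(alpha, alpha))
    y, coeff = mul((4, 1), cube)      # (4 + sqrt(17)) * alpha^3
    assert coeff == 1                 # coefficient of sqrt(17) must equal 1
    x = -(alpha[0] ** 2 - 17 * alpha[1] ** 2)   # x = -N(alpha)
    assert y * y - 17 == x**3
    return x, y


print(mordell_from_thue(1, 0))
print(mordell_from_thue(-3, -2))
```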
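Suppose now that $x$ is even. Then $y$ is odd and $y \pm \sqrt{17}$ is divisible by 2. Hence, upon replacing
$x$ by $2x$,
$$ \frac{y + \sqrt{17}}{2} \cdot \frac{y - \sqrt{17}}{2} = 2x^3. $$
From this we deduce the following possibilities
$$ \frac{y + \sqrt{17}}{2} = \frac{5 \pm \sqrt{17}}{2}\,\alpha^3, \quad \frac{5 \pm \sqrt{17}}{2}\,(4 \pm \sqrt{17})\,\alpha^3 $$
for a choice of $\pm$ signs and some algebraic integer $\alpha = (a + b\sqrt{17})/2$.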
The first case with + sign gives
$$ 8 = a^3 + 15a^2 b + 51ab^2 + 85b^3. $$
Replace $a$ by $a - 5b$ to get
$$ 8 = a^3 - 24ab^2 + 80b^3. $$
Hence $a$ is even. Replace $a$ by $2a$ to get
$$ 1 = a^3 - 6ab^2 + 10b^3. $$
We have the small solutions $(a, b) = (1, 0), (-3, 1)$. They give rise to the solutions $(x, y) =
(2, 5), (52, -375)$ of Mordell’s equation.
In a similar way the other cases also give rise to diophantine equations of the form f (x, y) = 1
for cubic homogeneous polynomials f ∈ Z[x, y].
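Small solutions of the two cubic equations met above can be listed by a direct search. This is only a sketch with a search bound of our choosing; it makes no claim about solutions outside the box.

```python
def small_solutions(f, bound):
    """All integer pairs (a, b) with |a|, |b| <= bound and f(a, b) = 1."""
    return [(a, b)
            for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1)
            if f(a, b) == 1]


def f_odd(a, b):
    # cubic form from the case x odd, unit 4 + sqrt(17)
    return a**3 + 3 * a * b**2 - 8 * b**3


def f_even(a, b):
    # cubic form from the case x even, first case with + sign
    return a**3 - 6 * a * b**2 + 10 * b**3


print(small_solutions(f_odd, 20))
print(small_solutions(f_even, 20))
```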
In the following section we prove the following theorem.
Theorem 2.2.3 For any $k \in \mathbb{Z}$, $k \neq 0$, the solution of the diophantine equation $y^2 + k = x^3$
in $x, y \in \mathbb{Z}$ can be reduced to the solution of a finite set of diophantine equations of the form
$f(x, y) = 1$ in $x, y \in \mathbb{Z}$ where $f$ is a binary cubic form with integer coefficients. Moreover,
the set of forms $f$ can be computed explicitly.
where $\alpha_1, \ldots, \alpha_n$ are the zeros of the polynomial $f(X, 1)$. One can show that $D \in \mathbb{Z}[a_0, a_1, \ldots, a_n]$.
Here are two examples,
Binary quadratic forms $aX^2 + 2bXY + cY^2$ with discriminant
$$ D = 4(b^2 - ac). $$
Binary cubic forms $aX^3 + 3bX^2Y + 3cXY^2 + dY^3$ with discriminant
$$ D = 27(-a^2d^2 + 6abcd + 3b^2c^2 - 4ac^3 - 4db^3). $$
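The cubic formula can be evaluated for forms whose middle coefficients are not divisible by 3 by allowing rational $b, c$. The sketch below (function name ours) takes the plain coefficients of $AX^3 + BX^2Y + CXY^2 + DY^3$ and sets $a = A$, $b = B/3$, $c = C/3$, $d = D$. As illustrations we take $X^3 - XY^2 + Y^3$ and $X^3 + X^2Y - 2XY^2 - Y^3$.

```python
from fractions import Fraction


def cubic_discriminant(A, B, C, D):
    """Discriminant of A X^3 + B X^2 Y + C X Y^2 + D Y^3.

    Uses D = 27(-a^2 d^2 + 6abcd + 3b^2 c^2 - 4ac^3 - 4db^3)
    with a = A, 3b = B, 3c = C, d = D.
    """
    a, b, c, d = Fraction(A), Fraction(B, 3), Fraction(C, 3), Fraction(D)
    return 27 * (-a * a * d * d + 6 * a * b * c * d + 3 * b * b * c * c
                 - 4 * a * c**3 - 4 * d * b**3)


print(cubic_discriminant(1, 0, -1, 1))    # X^3 - XY^2 + Y^3
print(cubic_discriminant(1, 1, -2, -1))   # X^3 + X^2Y - 2XY^2 - Y^3
```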
For quadratic and cubic forms the discriminant $D$ and polynomials in $D$ are the only invariants.
For quartic forms $a_4X^4 + 4a_3X^3Y + 6a_2X^2Y^2 + 4a_1XY^3 + a_0Y^4$ there are two independent
invariants, namely
$$ I_2 = a_0a_4 - 4a_1a_3 + 3a_2^2 $$
$$ I_3 = a_0a_2a_4 - a_0a_3^2 - a_1^2a_4 + 2a_1a_2a_3 - a_2^3 $$
The ring of invariants is the polynomial ring generated by $I_2$ and $I_3$. In particular, $D =
256(I_2^3 - 27I_3^2)$.
We shall now concentrate on binary forms with $a_0, a_1, \ldots, a_n \in \mathbb{Z}$ and call them integral binary
forms. In particular the discriminant is an integer. Two integral forms $f(X, Y)$, $g(X, Y)$ will
be called SL(2, Z)-equivalent, or simply equivalent, if there exist $p, q, r, s \in \mathbb{Z}$ with $ps - qr = 1$
such that $g(X, Y) = \pm f(pX + qY, rX + sY)$. We have the following Theorem.
Theorem 2.3.1 The number of equivalence classes of binary integral forms of given degree
and given discriminant is finite.
For quadratic forms there is a very explicit reduction procedure from which finiteness of
the number of equivalence classes of discriminant $D$ follows. Let us start with an arbitrary
quadratic form $aX^2 + 2bXY + cY^2$ which we abbreviate by $[a, b, c]$. Note that we have chosen
the coefficient of $XY$ to be even. Such quadratic forms are called even. Although one could
also consider odd quadratic forms we concentrate here only on the even ones.
We keep on repeating the following steps. If $|b| > |a|/2$ we choose $k$ such that $|b + ka| \le
|a|/2$. Replace $X$ by $X + kY$ to get the new form $[a, b, c] := [a, b + ka, c + 2bk + ak^2]$. If
$|c| < |a|$ we make the substitution $(X, Y) \to (-Y, X)$ which changes our form into $[a, b, c] :=
[c, -b, a]$. Repeating this procedure we end up with an equivalent form $[a, b, c]$ which satisfies
the inequalities $|2b| \le |a| \le |c|$. From this we derive that $|D| \ge |ac| - b^2 \ge 3b^2$. Hence $|b|$ is
bounded by $\sqrt{|D|/3}$. This gives a finite number of values of $b$ and through $b^2 - ac = D$ we
get a finite number of values of $a, c$.
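The reduction steps translate directly into a loop. The following is a minimal sketch, assuming that $b^2 - ac$ is not a perfect square, so that $a$ never becomes 0; the function name `reduce_form` and the particular choice of $k$ (the residue of $b$ modulo $|a|$ of smallest absolute value) are ours. Each swap strictly decreases $|a|$, which is why the loop terminates.

```python
def reduce_form(a, b, c):
    """Reduce [a, b, c] = aX^2 + 2bXY + cY^2 until |2b| <= |a| <= |c|.

    Assumes b^2 - ac is not a perfect square (then a is never 0).
    """
    while True:
        if a == 0:
            raise ValueError("a = 0: b^2 - ac is a perfect square")
        if 2 * abs(b) > abs(a):
            m = abs(a)
            r = (b + m // 2) % m - m // 2   # r = b mod a with |r| <= |a|/2
            k = (r - b) // a                # exact, since a divides r - b
            a, b, c = a, r, c + 2 * b * k + a * k * k   # X -> X + kY
        elif abs(c) < abs(a):
            a, b, c = c, -b, a              # (X, Y) -> (-Y, X)
        else:
            return a, b, c


print(reduce_form(1, 4, -1))   # a form with b^2 - ac = 17
```

Both substitutions leave $b^2 - ac$ unchanged, which gives a convenient check on any implementation.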
Example. We determine all equivalence classes of even quadratic forms $aX^2 + 2bXY + cY^2$
with $b^2 - ac = 17$. According to the above reduction procedure we can restrict ourselves
to $a, b, c$ satisfying $|b| \le \sqrt{17/3}$. So $b = 0, \pm 1, \pm 2$. The corresponding $a, c$ follow from
$b^2 - ac = 17$ and $|c| \ge |a| \ge |2b|$; for $b = \pm 2$ we would need $ac = -13$ with $|c| \ge |a| \ge 4$, which is impossible. We get the following list of possibilities with $a > 0$,
$X^2 - 17Y^2$
$2X^2 \pm 2XY - 8Y^2$
$4X^2 \pm 2XY - 4Y^2$
We now turn to cubic forms $f(X, Y) = aX^3 + 3bX^2Y + 3cXY^2 + dY^3$. We construct the
Hessian form
$$ H(X, Y) = -\frac{1}{36} \begin{vmatrix} f_{XX} & f_{XY} \\ f_{XY} & f_{YY} \end{vmatrix} $$
which turns out to be $H(X, Y) = (b^2 - ac)X^2 + (bc - ad)XY + (c^2 - bd)Y^2$. We also define
the cubic form
$$ G(X, Y) = \frac{1}{3} \begin{vmatrix} f_X & f_Y \\ H_X & H_Y \end{vmatrix}. $$
The forms $H, G$ are called the covariants of $f$ of degrees 2 and 3. The discriminant of $f$ equals
$27D_1$ where $D_1 = -a^2d^2 + 6abcd + 3b^2c^2 - 4ac^3 - 4db^3$. The discriminant of $H$ equals $-D_1$.
Proposition 2.3.2 Let notation be as above. Then
$$ G^2 + D_1 f^2 = 4H^3. $$
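The identity of Proposition 2.3.2 can be spot-checked numerically. The sketch below (function name ours) evaluates $f$, $H$, $G$ and $D_1$ from the definitions above at an integer point and tests the identity on a small grid of coefficients and points. This is a check, not a proof.

```python
def covariants(a, b, c, d, X, Y):
    """Return (f, H, G, D1) at (X, Y) for f = aX^3 + 3bX^2Y + 3cXY^2 + dY^3."""
    f = a * X**3 + 3 * b * X**2 * Y + 3 * c * X * Y**2 + d * Y**3
    h0, h1, h2 = b * b - a * c, b * c - a * d, c * c - b * d
    H = h0 * X * X + h1 * X * Y + h2 * Y * Y
    fX = 3 * (a * X * X + 2 * b * X * Y + c * Y * Y)
    fY = 3 * (b * X * X + 2 * c * X * Y + d * Y * Y)
    HX = 2 * h0 * X + h1 * Y
    HY = h1 * X + 2 * h2 * Y
    G = (fX * HY - fY * HX) // 3      # exact: fX and fY are multiples of 3
    D1 = (-a * a * d * d + 6 * a * b * c * d + 3 * b * b * c * c
          - 4 * a * c**3 - 4 * d * b**3)
    return f, H, G, D1


R = range(-2, 3)
for a in R:
    for b in R:
        for c in R:
            for d in R:
                for X in R:
                    for Y in R:
                        f, H, G, D1 = covariants(a, b, c, d, X, Y)
                        assert G * G + D1 * f * f == 4 * H**3
```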
We can now make the following observation. Let $f$ be a binary cubic form with $D_1 = 4k$ such
that $f(x, y) = 1$ has the solution $x_0, y_0$. Then Mordell’s equation $y^2 + k = x^3$ has the solution
$y = G(x_0, y_0)/2$, $x = H(x_0, y_0)$. It turns out that the converse is also true.
Proposition 2.3.3 Consider the equation $y^2 + k = x^3$ and suppose that we have a solution
$p, q$. Then the cubic form $f(x, y) = x^3 - 3pxy^2 + 2qy^3$ has $D_1 = 4k$ and $p = H(1, 0)$, $q =
-G(1, 0)/2$. Note in addition that $H(X, Y) = pX^2 - 2qXY + p^2Y^2$, so $H$ is an even form. We
also have $G(X, Y) = 2(-qX^3 + 3p^2X^2Y - 3pqXY^2 + (-p^3 + 2q^2)Y^3)$, i.e. $G(X, Y)/2$ is an
integral form.
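For a concrete check we can take the known solutions of $y^2 - 17 = x^3$ (so $k = -17$), build the cubic form of Proposition 2.3.3 for each, and confirm $D_1 = 4k$. The helper names below are ours.

```python
def cubic_from_solution(p, q):
    """(a, b, c, d) of x^3 - 3p x y^2 + 2q y^3 = aX^3 + 3bX^2Y + 3cXY^2 + dY^3."""
    return 1, 0, -p, 2 * q


def D1(a, b, c, d):
    return (-a * a * d * d + 6 * a * b * c * d + 3 * b * b * c * c
            - 4 * a * c**3 - 4 * d * b**3)


k = -17
known = [(-2, 3), (-1, 4), (2, 5), (4, 9), (8, 23),
         (43, 282), (52, 375), (5234, 378661)]
for p, q in known:
    assert q * q + k == p**3                       # (p, q) solves y^2 + k = x^3
    assert D1(*cubic_from_solution(p, q)) == 4 * k
```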
3 Thue’s equation
3.1 Introduction
Let $F$ be an integral binary form and $m$ a non-zero integer. The equation
$$ F(x, y) = m $$
in $x, y \in \mathbb{Z}$ is known as Thue’s equation.
Theorem 3.1.1 (Thue, 1909) Let F be an integral binary form such that F (x, 1) has at
least three distinct zeros. Let m be a non-zero integer. Then the equation F (x, y) = m has at
most finitely many solutions.
Note that if $F$ is reducible over $\mathbb{Z}$ then we can restrict ourselves to equations of the form
$G(x, y) = m'$ where $G$ is an irreducible factor of $F$ and $m'$ a divisor of $m$. Notice also that the
requirement of at least three zeros is essential. An example of a quadratic equation would be
Pell’s equation $x^2 - dy^2 = 1$ which is known to have infinitely many solutions if $d$ is a positive
integer and not a square.
Using Thue’s theorem and Proposition 2.3.3 we conclude that Mordell’s equation has at
most finitely many solutions. Thue’s theorem is proved using methods from diophantine
approximation. Due to the nature of this technique Thue’s theorem is only a finiteness
statement. It does not give a method to solve the equation. We shall come back to this.
An effective method to solve Thue’s equation became available through A.Baker’s method on
linear forms in logarithms around 1966. As an application of these methods the following was
shown.
Theorem 3.1.2 (Feld’man, Baker) Suppose that $F(x, y)$ is a form in two variables such
that $F(x, 1)$ has at least three distinct zeros. Then there exist positive, effectively computable
numbers $C_1, C_2$, depending only on $F$, such that any solution $x, y \in \mathbb{Z}$ of $F(x, y) = m$ (with
$m \neq 0$) satisfies
$$ \log \max(|x|, |y|) \le C_1 |m|^{C_2}. $$
By ‘effective method’ we mean that the upper bound for x, y provides us with an algorithm
to determine the solution set. However, due to the enormous size of this bound the algorithm
is certainly not efficient. With the speed of present day computers a naive search of x, y up
to the bound given above would take the life time of the universe and more. So extra ideas
have to be invoked to solve the equation.
In the years before the 1980’s about the only such method was Skolem’s method, next to simple
minded congruence considerations which work only in rare cases. In solving Thue’s equation
there is a big difference between the cases when $F$ has a positive or a negative discriminant,
as we shall see.
As a first example consider $f(x, y) = 1$ where $f(x, 1)$ is a monic cubic irreducible polynomial.
Let $K$ be the field $\mathbb{Q}[x]/(f(x, 1))$. Write $f(x, 1) = (x - \alpha)(x - \alpha')(x - \alpha'')$ where $\alpha \in K$ and
$\alpha', \alpha''$ are its algebraic conjugates. Then the equation $f(x, y) = 1$ implies that $x - \alpha y = \beta$
where $\beta$ is a unit in $K$. We also have the conjugate equations $x - \alpha' y = \beta'$ and $x - \alpha'' y = \beta''$.
Writing $f'$ for the derivative of $f(x, 1)$, one can verify as an exercise that
$$ \frac{1}{f'(\alpha)} + \frac{1}{f'(\alpha')} + \frac{1}{f'(\alpha'')} = \frac{\alpha}{f'(\alpha)} + \frac{\alpha'}{f'(\alpha')} + \frac{\alpha''}{f'(\alpha'')} = 0. $$
As a consequence we get
$$ \frac{\beta}{f'(\alpha)} + \frac{\beta'}{f'(\alpha')} + \frac{\beta''}{f'(\alpha'')} = 0. $$
Now suppose that $K$ has negative discriminant, which is equivalent to $r_1 = r_2 = 1$. By
Dirichlet’s unit theorem the units in $K$ are of the form $\pm\eta^n$ where $\eta$ is a fundamental unit
and $n \in \mathbb{Z}$. So our equation becomes
$$ \frac{\eta^n}{f'(\alpha)} + \frac{(\eta')^n}{f'(\alpha')} + \frac{(\eta'')^n}{f'(\alpha'')} = 0. $$
Note that we have turned our Thue equation into an exponential equation in the unknown
exponent $n$. In the next section we show how to deal with this equation.
Consider the explicit example
$$ x^3 - xy^2 + y^3 = 1. $$
Its solution set reads $(x, y) = (1, 0), (0, 1), (1, 1), (-1, 1), (4, -3)$, which was shown by T. Nagell.
The discriminant of the form $x^3 - xy^2 + y^3$ is $-23$. This is the negative discriminant of smallest
absolute value possible for an irreducible cubic form. Since the discriminant is negative, the polynomial
$X^3 - X + 1$ has one real zero and two complex ones. Let $\alpha$ be a zero. Then the field
$\mathbb{Q}(\alpha)$ has $r_1 = r_2 = 1$. Its ring of integers is $\mathbb{Z}[\alpha]$ and the group of units is $\mathbb{Z}[\alpha]^* = \{\pm\alpha^n \mid n \in \mathbb{Z}\}$.
We compute that $1/f'(\alpha) = (4 - 9\alpha - 6\alpha^2)/23$. So our exponential equation becomes
$$ (4 - 9\alpha - 6\alpha^2)\alpha^n + (4 - 9\alpha' - 6\alpha'^2)\alpha'^n + (4 - 9\alpha'' - 6\alpha''^2)\alpha''^n = 0. $$
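Two of the claims in this example lend themselves to a quick machine check. The sketch below (names and the search bound are ours) lists the solutions of $x^3 - xy^2 + y^3 = 1$ in a box, and verifies the value of $1/f'(\alpha)$ by multiplying $3\alpha^2 - 1$ with $4 - 9\alpha - 6\alpha^2$ modulo the relation $\alpha^3 = \alpha - 1$. The box search does not prove completeness; that is Nagell’s result.

```python
def thue_solutions(bound):
    """Integer solutions of x^3 - x y^2 + y^3 = 1 with |x|, |y| <= bound."""
    return sorted((x, y)
                  for x in range(-bound, bound + 1)
                  for y in range(-bound, bound + 1)
                  if x**3 - x * y * y + y**3 == 1)


def mulmod(u, v):
    """Multiply u0 + u1*alpha + u2*alpha^2 by v likewise, using alpha^3 = alpha - 1."""
    prod = [0] * 5
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            prod[i + j] += ui * vj
    c0, c1, c2, c3, c4 = prod
    # alpha^3 = alpha - 1 and alpha^4 = alpha^2 - alpha
    return [c0 - c3, c1 + c3 - c4, c2 + c4]


print(thue_solutions(50))
print(mulmod([-1, 0, 3], [4, -9, -6]))   # f'(alpha) * (4 - 9 alpha - 6 alpha^2)
```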
The proof really depends on the use of so-called p-adic numbers, but here we give a version
which avoids mentioning them.
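Let us rewrite our equation by using the binomial theorem for $\alpha_i^{mn} = (a + p\beta_i)^n$. We get
$$ 0 = n p a^{n-1}(\theta_1\beta_1 + \cdots + \theta_k\beta_k) + \sum_{r=2}^{n} \binom{n}{r} p^r a^{n-r} (\theta_1\beta_1^r + \cdots + \theta_k\beta_k^r). $$
Dividing by $n p a^{n-1}$ and using $\binom{n}{r}/n = \binom{n-1}{r-1}/r$ gives
$$ 0 = \theta_1\beta_1 + \cdots + \theta_k\beta_k + \sum_{r=2}^{n} \frac{p^{r-1}}{r a^{r-1}} \binom{n-1}{r-1} (\theta_1\beta_1^r + \cdots + \theta_k\beta_k^r). $$
Since $p$ is an odd prime the fraction $\frac{p^{r-1}}{r a^{r-1}}$ has the factor $p$ in its numerator for every $r \ge 2$.
In particular it follows from our equation that $\theta_1\beta_1 + \cdots + \theta_k\beta_k$ is divisible by $p$. This is
impossible by our assumption. Hence we conclude that $n = 0$. qed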
Here is an application to the explicit equation of the previous section. Let $\alpha_1, \alpha_2, \alpha_3$ be the
zeros of $X^3 - X + 1$. We want to solve
(x, y) = (1, 0), (0, 1), (1, 1), (−1, 1), (4, −3).
Skolem’s method is particularly suitable to solve cubic Thue equations with negative
discriminant. A negative discriminant of a cubic is equivalent to the form having one real and two
complex (non-real) zeros. To see what goes wrong in the case of a positive discriminant
we take the equation
$$ x^3 + x^2y - 2xy^2 - y^3 = 1. $$
Baulin showed that the only solutions are
(x, y) = (1, 0), (0, −1), (−1, 1), (−1, −1), (2, −1), (−1, 2), (5, 4), (4, −9), (−9, 5).
One may wonder if an infinite number of such good quality approximations exist for π, or
any other irrational we are looking at. To that end we introduce the following concept.
Definition The irrationality measure of an irrational number α is defined as the limsup over
all qualities of all rational approximations and is denoted by µ(α).
We have taken the limsup in our definition rather than the maximum since we are for example
interested in the question whether π has infinitely many approximations of quality at least
3. The first two occurrences from the introduction may have been exceptional coincidences.
If we assume that π behaves like most other numbers, then there is very little chance that
µ(π) ≥ 3. This is shown by the following Theorem.
Theorem The set of irrational numbers with irrationality measure strictly larger than 2 has
Lebesgue measure zero.
This Theorem is not hard to prove. Let us restrict ourselves to the irrational numbers in the
interval $[0, 1]$. Choose $\epsilon > 0$. A number $\alpha$ with $\mu(\alpha) \ge 2 + 2\epsilon$ is, by definition, contained in
an interval of the form
$$ \left[ \frac{p}{q} - \frac{1}{q^{2+\epsilon}},\; \frac{p}{q} + \frac{1}{q^{2+\epsilon}} \right], $$
with $0 < p < q$ integers, infinitely many times. Let us give an upper bound for the total
length of these intervals with $q > Q$, where $Q$ is some large fixed positive integer. Such a
bound can be given by
$$ \sum_{q=Q+1}^{\infty} \sum_{p=1}^{q} \frac{2}{q^{2+\epsilon}}. $$
The inner sum is equal to $\frac{2}{q^{1+\epsilon}}$. The sum over $q$ can be estimated by the integral criterion,
$$ \sum_{q=Q+1}^{\infty} \frac{2}{q^{1+\epsilon}} < \int_Q^{\infty} \frac{2\,dx}{x^{1+\epsilon}} = \frac{2}{\epsilon Q^{\epsilon}}. $$
When we let Q → ∞ we see that the latter bound goes to zero. Hence the Lebesgue measure
of the numbers in [0, 1] with irrationality measure ≥ 2+ 2ϵ is zero. The set of numbers in [0, 1]
with irrationality measure > 2 is the union of all sets of numbers with irrationality measure
at least 2 + 2/n for n = 1, 2, 3, 4, . . .. Since a countable union of measure zero sets has again
measure zero, our result follows. qed
We note that numbers with irrationality measure $> 2$ do exist. In fact there exist irrational
numbers with irrationality measure $\infty$. These are the so-called Liouville numbers. An example
of such a number is given by
$$ \sum_{n \ge 0} \frac{1}{2^{n!}}. $$
The reader may wish to verify as an exercise that the truncated series form a sequence of
approximations whose qualities go to $\infty$. On the other hand, numbers like Liouville numbers
are a bit artificial. They are constructed for the purpose of having large irrationality
measures. It is expected that the irrationality measure for a naturally occurring number is 2.
Unfortunately, there are not many instances where this is known. A classical instance is $e$.
The fact that µ(e) = 2 can easily be shown by using the continued fraction expansion of e
which, contrary to that of π, is completely known. Although it is expected that µ(π) = 2, it
is very hard to get any results on µ(π). It was only in 1953 that K.Mahler was able to show
for the first time that µ(π) is finite. Nowadays we know that µ(π) < 8.02. The following
statement is not hard to show.
Exercise 3.3.1 Prove that any real algebraic number of degree 2 has irrationality measure 2.
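To get a feeling for the exercise, one can look at the continued fraction convergents of $\sqrt{2}$. The sketch below rests on an assumption: the definition of quality is not restated in this section, so we take the quality of $p/q$ to be the number $M$ with $|\alpha - p/q| = q^{-M}$. The function names are ours. For the convergents one has $|2q^2 - p^2| = 1$, and the computed qualities stay above 2 and decrease towards 2, in line with $\mu(\sqrt{2}) = 2$. This illustrates the statement and does not prove it.

```python
from math import log, sqrt


def sqrt2_convergents(count):
    """First `count` convergents p/q of sqrt(2), starting with 3/2."""
    p, q = 3, 2
    out = []
    for _ in range(count):
        out.append((p, q))
        p, q = p + 2 * q, p + q
    return out


def quality(p, q):
    """M with |sqrt(2) - p/q| = q^(-M) (assumed definition of quality)."""
    # |sqrt(2) - p/q| = |2q^2 - p^2| / (q^2 (sqrt(2) + p/q)); avoids cancellation
    err = abs(2 * q * q - p * p) / (q * q * (sqrt(2) + p / q))
    return -log(err) / log(q)


for p, q in sqrt2_convergents(10):
    print(p, q, round(quality(p, q), 4))
```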
Let $\alpha$ be an algebraic number of degree $n$ with $n > 2$. The first non-trivial result on
irrationality measures is by A. Thue, who showed in 1909 that $\mu(\alpha) \le n/2 + 1$. This was improved
by C. L. Siegel who showed in 1929 that $\mu(\alpha) < 2\sqrt{n}$. Finally in 1955 K. F. Roth finished the
problem by showing that $\mu(\alpha) = 2$. This result won him the Fields medal in mathematics.
Using Thue’s upper bound for $\mu(\alpha)$ we can prove Theorem 3.1.1. Suppose that the equation
$F(x, y) = m$ has infinitely many solutions $x, y \in \mathbb{Z}$. Let $\alpha_1, \alpha_2, \ldots, \alpha_n$ be the zeros of $F(x, 1)$.
Then the inequality
$$ \prod_{i=1}^{n} \left| \alpha_i - \frac{x}{y} \right| \le \frac{|m|}{y^n} \qquad (1) $$
has infinitely many solutions in $x, y \in \mathbb{Z}$ and $y > 0$. Let $A = \min_{i \neq j} |\alpha_i - \alpha_j|/2$. Suppose
$x, y$ is a solution of the inequality and suppose that $y > |m|^{1/n}/A$. Then there exists an $i$
such that $|\alpha_i - \frac{x}{y}| < A$. By the definition of $A$ this means that $|\alpha_j - \frac{x}{y}| > A$ for all $j \neq i$.
Combining this with inequality (1) again gives us
$$ \left| \alpha_i - \frac{x}{y} \right| \le \frac{|m|}{A^{n-1} y^n}. \qquad (2) $$
To every solution $x, y$ with $y > |m|^{1/n}/A$ there corresponds an $i$ such that the latter inequality
holds. Since there are infinitely many solutions, there exists an $i$ such that (2) has infinitely
many solutions. But this implies that $\mu(\alpha_i) \ge n$. This contradicts Thue’s inequality $\mu(\alpha_i) \le
n/2 + 1$ when $n \ge 3$. qed
$$ \sum_{j=1}^{n} a_{ij} x_j = 0, \quad i = 1, \ldots, m $$
has a non-trivial solution in the integers $x_1, x_2, \ldots, x_n$ with the property that
$$ \max_j |x_j| \le (2nA)^{m/(n-m)}. $$
A remarkable application is for example the following. Take 10 integers $a_1, a_2, \ldots, a_{10}$ of ten
digits each. Suppose we want to find integers $x_1, x_2, \ldots, x_{10}$, not all zero, such that
$$ a_1x_1 + a_2x_2 + \cdots + a_{10}x_{10} = 0. $$
Siegel’s Lemma with $A = 10^{10}$, $m = 1$, $n = 10$ tells us that we can find such $x_i$ of absolute
value at most 18. This is surprisingly small given the size of the numbers $a_i$.
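The ten-variable example is too large for a naive search, but a smaller instance shows the same effect. In the sketch below the coefficients are hypothetical data of our choosing with $|a_j| \le A = 100$, $n = 4$ and $m = 1$, so the bound of Siegel’s Lemma is $\lfloor 800^{1/3} \rfloor = 9$. The exhaustive search is our own illustration and is not part of the Lemma.

```python
from itertools import product


def small_relation(coeffs, bound):
    """A non-zero integer vector x with |x_j| <= bound and sum a_j x_j = 0, or None."""
    for x in product(range(-bound, bound + 1), repeat=len(coeffs)):
        if any(x) and sum(a * xj for a, xj in zip(coeffs, x)) == 0:
            return x
    return None


coeffs = (37, 91, 58, 13)                    # hypothetical data, |a_j| <= 100
n, m, A = len(coeffs), 1, 100
bound = int((2 * n * A) ** (m / (n - m)))    # floor of (2nA)^(m/(n-m))
x = small_relation(coeffs, bound)
print(bound, x)
```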
Here is a proof of Siegel’s Lemma. Choose an integer $Q$. Let $B(Q)$ be the box consisting
of points $(x_1, \ldots, x_n)$ with $x_1, \ldots, x_n$ integers with $0 \le x_i \le Q$. Consider the map $\phi :
B(Q) \to \mathbb{Z}^m$ given by $\phi : (x_1, \ldots, x_n) \mapsto (\sum_{j=1}^{n} a_{1j}x_j, \ldots, \sum_{j=1}^{n} a_{mj}x_j)$. The image of $B(Q)$
is contained in the box $[-nAQ, nAQ]^m$. The number of points with integral coordinates in
this box is at most $(2nAQ + 1)^m$. The number of points in $B(Q)$ is precisely $(Q + 1)^n$. So
if $(Q + 1)^n > (2nAQ + 1)^m$, then $\phi$ is not injective and we find two distinct integral vectors $\mathbf{x}_1, \mathbf{x}_2$
in $B(Q)$ such that $\phi(\mathbf{x}_1 - \mathbf{x}_2) = 0$. In other words, $\mathbf{x}_1 - \mathbf{x}_2$ is a solution of our system of
equations. In addition, the components of this difference are all bounded by $Q$ in absolute
value. A straightforward calculation shows that $(Q + 1)^n > (2nAQ + 1)^m$ is satisfied if we
choose $Q = [(2nA)^{m/(n-m)}]$. qed
coefficients are bounded by $C^D$, where $C$ is some number depending only on $\alpha$. There are
$2D + 2$ unknowns in $\mathbb{Z}$. We can apply Siegel’s Lemma and find polynomials $P(x), Q(x)$ with
coefficients whose absolute value is at most
$$ (4(D + 1)C^D)^{mn/(2D+2-mn)} < (4(D + 1)C^D)^{2D/(2D-(2-\epsilon n)D)} = (4(D + 1)C^D)^{2/(n\epsilon)}. $$
Notice that we can find another number $C_1$, depending only on $\alpha$, such that $(4(D + 1)C^D)^{2/n} <
C_1^D$. Numbers depending only on $\alpha$ will be denoted by $C_1, C_2, \ldots$ in the sequel. We conclude
that we have found non-trivial polynomials $P(x), Q(x)$ with integral coefficients bounded by
$C_1^{D/\epsilon}$ such that $P(x) - \alpha Q(x)$ vanishes of order at least $m$ in $x = \alpha$.
Let $x_1/y_1$, $x_2/y_2$ be two very large solutions of (3) with $y_2 \gg y_1 \gg 1$. The idea is now to
find both upper and lower bounds for
$$ \Delta = P\left(\frac{x_1}{y_1}\right) - \frac{x_2}{y_2}\, Q\left(\frac{x_1}{y_1}\right). $$
A lower bound for $\Delta$ can be attained if we assume that $\Delta \neq 0$. For then $\Delta$ is a non-zero
rational number with denominator dividing $y_2 y_1^D$. Combining this upper and lower bound we
get
$$ \frac{1}{y_2 y_1^D} < C_1^{D/\epsilon} \left( \frac{1}{y_1^{m(n/2+1+\theta)}} + \frac{1}{y_2^{n/2+1+\theta}} \right) $$
where $m \approx (2/n - \epsilon)D$. Now choose $\epsilon$ such that $(2/n - \epsilon)(n/2 + \theta) = 1 + \delta$ for some $\delta > 0$.
We choose $D$, and as a consequence $m$, in such a way that $y_1^m \approx y_2$. Then our inequality
simplifies to
$$ \frac{1}{y_2 y_1^D} < 2C_1^{D/\epsilon}\, \frac{1}{y_2\, y_1^{(1+\delta)D}}. $$
Hence
$$ 1 < 2C_1^{D/\epsilon}\, y_1^{-\delta D}. $$
But this gives a contradiction if $y_1$ is large enough. Since there are infinitely many choices for
$y_1$ we do arrive at a contradiction.
A problem arises if ∆ does vanish. To that end we prove the following Lemma.
Lemma 3.5.1 Let P (x), Q(x) be two non-trivial polynomials with rational coefficients and
of degree ≤ D. Let α be an algebraic number of degree n such that P (x) − αQ(x) vanishes
of order at least m at x = α. Suppose m > D/n. Then, for any numbers β, γ, with β not a
conjugate of α, the polynomial P (x) − γQ(x) has vanishing order at most 2D − (m − 1)n at
x = β.