The Eightfold Way

MSRI Publications
Volume 35, 1998

The Klein Quartic in Number Theory


NOAM D. ELKIES

Abstract. We describe the Klein quartic X and highlight some of its remarkable properties that are of particular interest in number theory. These
include extremal properties in characteristics 2, 3, and 7, the primes dividing the order of the automorphism group of X; an explicit identification
of X with the modular curve X(7); and applications to the class number 1
problem and the case n = 7 of Fermat.

Introduction
Overview. In this expository paper we describe some of the remarkable properties of the Klein quartic that are of particular interest in number theory. The
Klein quartic X is the unique curve of genus 3 over C with an automorphism
group G of size 168, the maximum for its genus. Since G is central to the
story, we begin with a detailed description of G and its representation on the
three-dimensional space V in whose projectivization $\mathbf{P}(V) = \mathbf{P}^2$ the Klein quartic lives. The first section is devoted to this representation and its invariants,
starting over C and then considering arithmetical questions of fields of definition
and integral structures. There we also encounter a G-lattice that later occurs as
both the period lattice and a Mordell-Weil lattice for X. In the second section we
introduce X and investigate it as a Riemann surface with automorphisms by G.
In the third section we consider the arithmetic of X: rational points, relations
with the Fermat curve and Fermat's "Last Theorem" for exponent 7, and some
extremal properties of the reduction of X modulo the primes 2, 3, 7 dividing #G.
In the fourth and last section, we identify X explicitly with the modular curve
X(7), describe some quotients of X as classical modular curves, and report on
Kenku's use of one of these quotients in a novel proof of the Stark-Heegner theorem on imaginary quadratic number fields of class number 1. We close that
section with Klein's identification of $\pi_1(X)$ with an arithmetic congruence subgroup of $PSL_2(\mathbf{R})$, and thus of X with what we now recognize as a Shimura
curve.

Notations. We reserve the much-abused word "trivial" for the identity element
of a group, the 1-element subgroup consisting solely of that element, or a group
representation mapping each element to the identity.
Matrices will act from the left on column vectors.
We fix the seventh root of unity
$$\zeta := e^{2\pi i/7} \tag{0.1}$$
and set
$$\alpha := \zeta + \zeta^2 + \zeta^4 = \frac{-1+\sqrt{-7}}{2}. \tag{0.2}$$
The seventh cyclotomic field and its real and quadratic imaginary subfields will
be called
$$K := \mathbf{Q}(\zeta), \qquad K_+ := \mathbf{Q}(\zeta + \zeta^{-1}), \qquad k := \mathbf{Q}(\sqrt{-7}) = \mathbf{Q}(\alpha). \tag{0.3}$$
These are all cyclic Galois extensions of Q. The nontrivial elements of Gal(K/Q)
fixing k are the Galois automorphisms of order 3 mapping $\zeta$ to $\zeta^2$, $\zeta^4$; the nontrivial Galois automorphism preserving $K_+$ is complex conjugation $x \leftrightarrow \bar{x}$. As
usual we write $O_F$ for the ring of integers of a number field F; recall that $O_K$,
$O_{K_+}$, $O_k$ are respectively the polynomial rings $\mathbf{Z}[\zeta]$, $\mathbf{Z}[\zeta + \zeta^{-1}]$, $\mathbf{Z}[\alpha]$.
We use G throughout for the second-smallest noncyclic simple group
$$PSL_2(\mathbf{F}_7) \cong SL_3(\mathbf{F}_2) \ [\, = GL_3(\mathbf{F}_2)\,] \tag{0.4}$$
of 168 elements.
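As a quick numerical sanity check of (0.1) and (0.2), the following sketch evaluates $\zeta + \zeta^2 + \zeta^4$ in floating point and compares it with the closed form. It assumes the branch $\sqrt{-7} = +i\sqrt{7}$; the variable names are illustrative only.

```python
import cmath
import math

# zeta is the seventh root of unity fixed in (0.1).
zeta = cmath.exp(2j * math.pi / 7)

# alpha as defined in (0.2): the sum of zeta^a over the nonzero squares a = 1, 2, 4 mod 7.
alpha = zeta + zeta**2 + zeta**4

# Closed form from (0.2), assuming sqrt(-7) = +i*sqrt(7).
alpha_closed = (-1 + 1j * math.sqrt(7)) / 2

# alpha is a root of x^2 + x + 2 (discriminant -7), and its norm to Q is 2.
min_poly_value = alpha**2 + alpha + 2
norm_alpha = (alpha * alpha.conjugate()).real
```

The norm being 2 is what later lets $(\alpha)$ and $(\bar\alpha)$ serve as the two primes of $O_k$ above 2.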
Acknowledgements. Many thanks to Silvio Levy for soliciting this paper for
the present MSRI volume and for his patience during repeated delays in the
paper's completion.
I am grateful to Allan Adler, Benedict Gross, Barry Mazur, and J.-P. Serre
for introducing me to many of the remarkable properties of the Klein quartic and
for numerous enlightening conversations on various aspects of the geometry and
arithmetic of X and of its automorphism group G. I also thank them, as well
as Michael Bennett, Enrico Bombieri, Armand Brumer, Joe Harris, and Curt
McMullen, for references to their and others' work and/or for clarifications of
specific concepts and questions that arose in the process of putting this exposition
together.
Hardly any of the results contained in this paper are original with me; some
go back to Klein's work over a century ago, such as the explicit formulas for the
representation of G and the determinantal expressions for its invariants [Klein
1879b], and the equations Kenku [1985] uses, referring to [Klein 1879a, §7]. Much
of the mathematical work of writing this paper lay in finding explicit equations
that not only work locally to exhibit particular aspects of (X, G) but are also
consistent between different parts of the exposition. The extensive symbolic
computations needed to do this were greatly facilitated by the computer packages
pari and macsyma.

This work was made possible in part by funding from the National Science
Foundation and the Packard Foundation.

1. The Group G and its Representation $(V, \rho)$


1.1. G and its characters. We reproduce from the ATLAS [Conway et al.
1985, p. 3] some information about G and its representations over C. (That
ATLAS page is also the source of facts concerning G cited without proof in the
sequel.) The conjugacy classes c and character table of G are as follows:
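$$\begin{array}{c|rrrrrr}
c & 1A & 2A & 3A & 4A & 7A & 7B \\
\#c & 1 & 21 & 56 & 42 & 24 & 24 \\ \hline
\chi_1 & 1 & 1 & 1 & 1 & 1 & 1 \\
\chi_3 & 3 & -1 & 0 & 1 & \alpha & \bar\alpha \\
\bar\chi_3 & 3 & -1 & 0 & 1 & \bar\alpha & \alpha \\
\chi_6 & 6 & 2 & 0 & 0 & -1 & -1 \\
\chi_7 & 7 & -1 & 1 & -1 & 0 & 0 \\
\chi_8 & 8 & 0 & -1 & 0 & 1 & 1
\end{array} \tag{1.1}$$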
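The table (1.1) can be checked against the orthogonality relations. The sketch below reads the table with $\chi_3(7A) = \alpha$ (the class sizes and values are transcribed from the table; the helper names are illustrative) and computes the Gram matrix of the six characters, which should be the identity.

```python
import math

alpha = complex(-0.5, math.sqrt(7) / 2)   # alpha from (0.2)
abar = alpha.conjugate()

class_sizes = [1, 21, 56, 42, 24, 24]      # classes 1A, 2A, 3A, 4A, 7A, 7B
chars = {
    "chi1":    [1, 1, 1, 1, 1, 1],
    "chi3":    [3, -1, 0, 1, alpha, abar],
    "chi3bar": [3, -1, 0, 1, abar, alpha],
    "chi6":    [6, 2, 0, 0, -1, -1],
    "chi7":    [7, -1, 1, -1, 0, 0],
    "chi8":    [8, 0, -1, 0, 1, 1],
}

def inner(a, b):
    """<a, b> = (1/#G) * sum over classes c of #c * a(c) * conj(b(c))."""
    total = sum(n * complex(x) * complex(y).conjugate()
                for n, x, y in zip(class_sizes, a, b))
    return total / sum(class_sizes)

names = list(chars)
gram = [[inner(chars[a], chars[b]) for b in names] for a in names]

order = sum(class_sizes)                                  # the class sizes add up to #G
sum_sq_dims = sum(chars[n][0] ** 2 for n in names)        # sum of squared degrees
```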

The outer automorphism group Aut(G)/G of G has order 2; an outer automorphism switches the conjugacy classes 7A, 7B and the characters $\chi_3$, $\bar\chi_3$, and
(necessarily) preserves the other conjugacy classes and characters. Having specified $\alpha$ in (0.2), we can distinguish $\chi_3$ from $\bar\chi_3$ by labeling one of the conjugacy
classes of 7-cycles as 7A; we do this by regarding G as $PSL_2(\mathbf{F}_7)$ and selecting
for 7A the conjugacy class of $\pm\begin{pmatrix}1&1\\0&1\end{pmatrix}$. When we regard G as $PSL_2(\mathbf{F}_7)$, the group
Aut(G) is $PGL_2(\mathbf{F}_7)$; if we use the $SL_3(\mathbf{F}_2)$ description of G, we obtain an outer
involution of G by mapping each 3 × 3 matrix to its inverse transpose.
Modulo the action of Aut(G) there are only two maximal subgroups in G
(every other noncyclic simple group has at least three), of orders 21 and 24. These
are the point stabilizers in the doubly transitive permutation representations
of G on 8 and 7 letters respectively. These come respectively from the action
of $G \cong PSL_2(\mathbf{F}_7)$ on the projective line mod 7 and of $G \cong SL_3(\mathbf{F}_2)$ on the
projective plane mod 2. The 21-element subgroup is the normalizer of a 7-Sylow
subgroup of G, and is the semidirect product of that subgroup (which is of course
cyclic of order 7) with a group of order 3. Since all the 7-Sylows are conjugate
under G, so are the 21-element subgroups, which extend to 42-element maximal
subgroups of Aut(G) isomorphic to the group of permutations $x \mapsto ax + b$ of $\mathbf{F}_7$.
The 24-element subgroup is the normalizer of a noncyclic subgroup of order 4
in G, and is the semidirect product of that subgroup with its automorphism
group, isomorphic with the symmetric group $S_3$; thus the 24-element maximal
subgroup is isomorphic with $S_4$. There are 14 such subgroups, in two orbits of
size 7 under conjugation by G that are switched by an outer automorphism; thus
these groups do not extend to 48-element subgroups of Aut(G).¹
From these groups we readily obtain the irreducible representations of G with
characters $\chi_6$, $\chi_7$, $\chi_8$: the first two are the nontrivial parts of the 7- and 8-letter
permutation representations of G, and the last is induced from a nontrivial one-dimensional character of the 21-element subgroup.
We now turn to $\chi_3$ and $\bar\chi_3$. Let $(V, \rho)$ and $(V^*, \rho^*)$ be the representation with
character $\chi_3$ and its contragredient representation with character $\bar\chi_3$. Both V
and $V^*$ remain irreducible as representations of the 21- and 24-element subgroups; we use this to exhibit generators for $\rho(G)$ explicitly.
Fix an element g in the conjugacy class 7A. Then V decomposes as a direct
sum of one-dimensional eigenspaces for $\rho(g)$ with eigenvalues $\zeta, \zeta^2, \zeta^4$. The normalizer of $\langle g \rangle$ in G is generated by g and a 3-cycle h such that $h^{-1}gh = g^2$. Thus
h cyclically permutes the three eigenspaces. The images of any eigenvector under
$1, h, h^2$ therefore constitute a basis for V; relative to this basis, the matrices for
$\rho(g)$, $\rho(h)$ are simply
$$\rho(g) = \begin{pmatrix} \zeta^4 & 0 & 0 \\ 0 & \zeta^2 & 0 \\ 0 & 0 & \zeta \end{pmatrix}, \qquad
\rho(h) = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}. \tag{1.2}$$
In other words, the representation $(V, \rho)$ restricted to the 21-element subgroup
$\langle g, h \rangle$ of G is induced from a one-dimensional character of $\langle g \rangle$ sending g to $\zeta$.
Since this subgroup is maximal in G, we need only exhibit the image under $\rho$ of
some group element not generated by g, h. In his historic paper introducing $(V, \rho)$
and his eponymous quartic curve, Klein [1879b, §5] found that the involution
$$-\frac{1}{\sqrt{-7}} \begin{pmatrix}
\zeta - \zeta^6 & \zeta^2 - \zeta^5 & \zeta^4 - \zeta^3 \\
\zeta^2 - \zeta^5 & \zeta^4 - \zeta^3 & \zeta - \zeta^6 \\
\zeta^4 - \zeta^3 & \zeta - \zeta^6 & \zeta^2 - \zeta^5
\end{pmatrix} \tag{1.3}$$
fills this bill. We thus refer to the image of G in $SL_3(\mathbf{C})$ generated by the
matrices (1.2, 1.3) as the Klein model of $(V, \rho)$.
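The relations these generators should satisfy can be tested numerically. The sketch below assumes the reading of (1.2) as $\rho(g) = \mathrm{diag}(\zeta^4, \zeta^2, \zeta)$ with $\rho(h)$ the cyclic permutation matrix, and of (1.3) as $-1/\sqrt{-7}$ times the symmetric matrix whose rows are the cyclic shifts of $(\zeta - \zeta^6, \zeta^2 - \zeta^5, \zeta^4 - \zeta^3)$, with $\sqrt{-7} = +i\sqrt{7}$; all helper names are illustrative.

```python
import cmath
import math

zeta = cmath.exp(2j * math.pi / 7)
sqrt_m7 = 1j * math.sqrt(7)          # sqrt(-7), assumed to be +i*sqrt(7)

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def power(m, n):
    result = IDENTITY
    for _ in range(n):
        result = matmul(result, m)
    return result

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def trace(m):
    return m[0][0] + m[1][1] + m[2][2]

def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(3) for j in range(3))

rho_g = [[zeta**4, 0, 0], [0, zeta**2, 0], [0, 0, zeta]]      # (1.2)
rho_h = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]                      # (1.2)
a, b, c = zeta - zeta**6, zeta**2 - zeta**5, zeta**4 - zeta**3
rho_s = [[-x / sqrt_m7 for x in row]
         for row in ([a, b, c], [b, c, a], [c, a, b])]         # (1.3)

checks = {
    "g^7 = 1": close(power(rho_g, 7), IDENTITY),
    "h^3 = 1": close(power(rho_h, 3), IDENTITY),
    # h^-1 = h^2 because h has order 3
    "h^-1 g h = g^2": close(matmul(matmul(power(rho_h, 2), rho_g), rho_h),
                            power(rho_g, 2)),
    "s^2 = 1": close(power(rho_s, 2), IDENTITY),
    "det s = 1": abs(det3(rho_s) - 1) < 1e-9,
    "(s g)^3 = 1": close(power(matmul(rho_s, rho_g), 3), IDENTITY),
    "tr s = chi3(2A)": abs(trace(rho_s) + 1) < 1e-9,
    "tr g = chi3(7A)": abs(trace(rho_g) - (zeta + zeta**2 + zeta**4)) < 1e-9,
}
```

The sign of the prefactor matters: with $+1/\sqrt{-7}$ the matrix is still an involution, but its determinant becomes $-1$.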
¹ Let H, H′ be two subgroups of G isomorphic to $S_4$ in different orbits. Then H, H′ are
not conjugate in G, but are almost conjugate (a.k.a. "Gassmann equivalent" [Perlis 1977]):
H, H′ intersect each G-conjugacy class in subsets of equal size. Equivalently, the permutation
representations of the action of G on the coset sets G/H, G/H′ are isomorphic (in our case with
character $\chi_6 \oplus \chi_1$). This has been used by Perlis to construct non-isomorphic number fields
of degree 7 (the minimum) with the same zeta function [Perlis 1977] and, following [Sunada
1985], to exhibit isospectral planar domains [Gordon et al. 1992; Buser et al. 1994].

The transformation (1.3) may seem outlandish, especially compared with (1.2),
but we can explain it as follows. Except for the scaling factor $-1/\sqrt{-7}$, it is just
the discrete Fourier transform on the space of odd functions $\mathbf{F}_7 \to \mathbf{C}$: identify
such a function f with the vector $(f(1), f(2), f(4)) \in V$. It follows that this
involution (1.3), as well as the transformations $\rho(g)$, $\rho(h)$, are contained in Weil's
group of unitary operators of the space of complex-valued functions on $\mathbf{F}_7$ [Weil
1964, I]; they all commute with the parity involution $\iota : f(x) \leftrightarrow f(-x)$, and
together generate the restriction to V of the commutator of $\iota$ in Weil's group.
Starting with any odd prime p instead of 7, this would produce the $((p-1)/2)$-dimensional representation of $PSL_2(\mathbf{F}_p)$ or of its double cover according as p is
congruent to 3 or 1 mod 4; see also [Adler 1981, p. 116] for a concrete approach to
the first case, of which G is the instance p = 7. If we take $g = \pm\begin{pmatrix}1&1\\0&1\end{pmatrix}$, $h = \pm\begin{pmatrix}2&0\\0&4\end{pmatrix}$
in $PSL_2(\mathbf{F}_7)$ then (1.3) is the image under $\rho$ of the involution $s = \pm\begin{pmatrix}0&1\\-1&0\end{pmatrix}$.
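The group-theoretic claims about g, h, s can be confirmed by brute force inside $PSL_2(\mathbf{F}_7)$. The sketch below assumes the matrices $g = \pm\begin{pmatrix}1&1\\0&1\end{pmatrix}$, $h = \pm\begin{pmatrix}2&0\\0&4\end{pmatrix}$, $s = \pm\begin{pmatrix}0&1\\-1&0\end{pmatrix}$; it stores each element as a 4-tuple (a, b, c, d) taken up to sign and closes generator sets under multiplication. Function names are illustrative.

```python
P = 7

def normalize(m):
    """Canonical representative of +-m in PSL2(F_7): the smaller of m and -m."""
    m = tuple(x % P for x in m)
    neg = tuple((-x) % P for x in m)
    return min(m, neg)

def mul(x, y):
    a, b, c, d = x
    e, f, g_, h_ = y
    return normalize((a * e + b * g_, a * f + b * h_,
                      c * e + d * g_, c * f + d * h_))

def inverse(x):
    a, b, c, d = x            # determinant 1, so the inverse is the adjugate
    return normalize((d, -b, -c, a))

def generated_subgroup(gens):
    """All products of the generators; in a finite group this is the subgroup they generate."""
    identity = normalize((1, 0, 0, 1))
    seen = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for y in gens:
            z = mul(x, y)
            if z not in seen:
                seen.add(z)
                frontier.append(z)
    return seen

g = normalize((1, 1, 0, 1))
h = normalize((2, 0, 0, 4))
s = normalize((0, 1, -1, 0))
g2 = mul(g, g)
t = mul(mul(g2, s), inverse(g2))          # g^2 s g^-2

order_G = len(generated_subgroup([g, h, s]))        # the whole group
order_borel = len(generated_subgroup([g, h]))       # normalizer of <g>
order_s3 = len(generated_subgroup([h, s]))
order_s4 = len(generated_subgroup([s, h, t]))       # a 24-element maximal subgroup
relation_ok = mul(mul(inverse(h), g), h) == g2      # h^-1 g h = g^2
```

Here `t` is the third generator used for the $S_4$ subgroup in the next paragraph; the computation gives it as $\pm\begin{pmatrix}2&2\\1&-2\end{pmatrix}$.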
The restriction of $\rho$ to $S_4 \subset G$ is the group of orientation-preserving symmetries of the cube, that is, the group of signed 3 × 3 permutation matrices of determinant 1.
(The action on the four diagonals of the cube identifies this group with $S_4$;
the 3-dimensional representation is the nontrivial part of the permutation representation of $S_4$ twisted by its sign character.) Unlike $(V, \rho)$ and its restriction to the 21-element subgroup, this representation leaves a quadratic form
invariant. We choose the subgroup isomorphic with $S_4$ generated by s, h, and
$g^2 s g^{-2} = \pm\begin{pmatrix}2&2\\1&-2\end{pmatrix}$. Then the invariant quadric (which we shall need later) is a
multiple of
$$X^2 + Y^2 + Z^2 + \bar\alpha\,(XY + XZ + YZ); \tag{1.4}$$
under the change of basis with matrix
[(1.5): 3 × 3 matrix; entries not recoverable from the source]
we find that s, h, $g^2 s g^{-2}$ map to the signed permutation matrices
[(1.6): three signed 3 × 3 permutation matrices; signs not recoverable from the source]
while g maps to
[(1.7): 3 × 3 matrix; entries not recoverable from the source]

The matrices (1.6) and (1.7) generate an image of G in $SL_3(\mathbf{C})$, which we shall
call the $S_4$ model of $(V, \rho)$.
We can also recover from $(V, \rho)$ and $(V^*, \rho^*)$ the irreducible representations
of G of dimensions 6, 7, 8: the first is the symmetric square $\mathrm{Sym}^2(V)$; the second
is $\mathrm{Sym}^3(V) \ominus V^*$; and the last is $(V \otimes V^*) \ominus 1$.
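These three decompositions can be checked on characters. The sketch below uses the standard formulas for the characters of $\mathrm{Sym}^2$ and $\mathrm{Sym}^3$ in terms of power maps; the power maps on conjugacy classes are an assumption read off from element orders (2 is a square mod 7 and 3 is not, so squaring preserves 7A and 7B while cubing swaps them).

```python
import math

alpha = complex(-0.5, math.sqrt(7) / 2)   # alpha from (0.2)
abar = alpha.conjugate()

classes = ["1A", "2A", "3A", "4A", "7A", "7B"]
# Class of x^2 and of x^3 for x in each class (assumed power maps, see lead-in).
square = {"1A": "1A", "2A": "1A", "3A": "3A", "4A": "2A", "7A": "7A", "7B": "7B"}
cube = {"1A": "1A", "2A": "2A", "3A": "1A", "4A": "4A", "7A": "7B", "7B": "7A"}

def char(values):
    return dict(zip(classes, values))

chi3 = char([3, -1, 0, 1, alpha, abar])
chi3bar = char([3, -1, 0, 1, abar, alpha])
chi6 = char([6, 2, 0, 0, -1, -1])
chi7 = char([7, -1, 1, -1, 0, 0])
chi8 = char([8, 0, -1, 0, 1, 1])

def sym2(chi):
    # character of Sym^2: (chi(x)^2 + chi(x^2)) / 2
    return {c: (chi[c] ** 2 + chi[square[c]]) / 2 for c in classes}

def sym3(chi):
    # character of Sym^3: (chi(x)^3 + 3 chi(x) chi(x^2) + 2 chi(x^3)) / 6
    return {c: (chi[c] ** 3 + 3 * chi[c] * chi[square[c]] + 2 * chi[cube[c]]) / 6
            for c in classes}

def same(a, b, tol=1e-9):
    return all(abs(a[c] - b[c]) < tol for c in classes)

sym2_V = sym2(chi3)
sym3_V_minus_dual = {c: sym3(chi3)[c] - chi3bar[c] for c in classes}
V_tensor_dual_minus_1 = {c: chi3[c] * chi3bar[c] - 1 for c in classes}
```

Subtracting $\chi_3$ instead of $\bar\chi_3$ from the character of $\mathrm{Sym}^3(V)$ leaves a nonzero value on the 7-cycles, so the dual is indeed the summand to remove.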
1.2. G-invariant polynomials in V. The action of G on V extends to an
action on the ring
$$\mathbf{C}[V] = \bigoplus_{m=0}^{\infty} \mathrm{Sym}^m(V^*) \tag{1.8}$$
of polynomials on V. Klein determined over a century ago [1879b, §6] the
subring $\mathbf{C}[V]^G$ of polynomials invariant under this action: it is generated by
three algebraically independent homogeneous polynomials of degrees 4, 6, 14,
and a fourth polynomial of degree 21 whose square is a polynomial in the first
three. It follows that the subring of polynomials invariant under the 336-element
group $\pm G = \{\pm 1\} \times G$ is a polynomial ring generated by invariants of degrees
4, 6, 14. It is known [Shephard and Todd 1954] that a finite subgroup of $GL_n(\mathbf{C})$
has a polynomial invariant ring if and only if it is a complex reflection group, that
is, a group generated by its elements g such that $1_n - g$ has rank 1. In our case
the complex reflections in $\{\pm 1\} \times G$ are $-\rho(s)$ and its conjugates, of which there
are 21 (the size of the conjugacy class 2A). We next find explicit polynomials
$\Phi_4, \Phi_6, \Phi_{14}, \Phi_{21}$ such that the invariant rings $\mathbf{C}[V]^{\pm G}$ and $\mathbf{C}[V]^G$ are generated
by $\{\Phi_4, \Phi_6, \Phi_{14}\}$ and $\{\Phi_4, \Phi_6, \Phi_{14}, \Phi_{21}\}$ respectively, and determine $\Phi_{21}^2$ as a
polynomial in $\Phi_4, \Phi_6, \Phi_{14}$.
Letting $X, Y, Z \in V^*$ be the coordinate functions in the Klein model of $(V, \rho)$,
we can write the quartic invariant as
$$\Phi_4 := X^3 Y + Y^3 Z + Z^3 X, \tag{1.9}$$
because even the action on $\mathrm{Sym}^4(V^*)$ of the 21-element subgroup of G generated
by $(X, Y, Z) \mapsto (\zeta X, \zeta^4 Y, \zeta^2 Z)$ and cyclic permutations of X, Y, Z (see (1.2)) has
only a one-dimensional invariant subspace, generated by $\Phi_4$. The Klein quartic
is the zero locus
$$X := \{(X : Y : Z) \in \mathbf{P}(V) : \Phi_4(X, Y, Z) = 0\} \tag{1.10}$$
of $\Phi_4$ in the projective plane $\mathbf{P}(V) = (V - \{0\})/\mathbf{C}^*$. In the $S_4$ model the
monomial matrices do not suffice to determine $\Phi_4$ up to scaling, but starting
from (1.9) we may use the change of basis (1.5) to find that $\Phi_4$ is proportional
to
$$X'^4 + Y'^4 + Z'^4 + 3\alpha\,(X'^2 Y'^2 + X'^2 Z'^2 + Y'^2 Z'^2). \tag{1.11}$$
[We could also have determined the coefficient $3\alpha$ by requiring invariance under the 7-cycle (1.7).] The formulas we exhibit² in the next three paragraphs
for $\Phi_6, \Phi_{14}, \Phi_{21}$ in terms of $\Phi_4$ can then be used to obtain those invariants as
polynomials in the coordinates $X', Y', Z'$ of the $S_4$ model, starting from (1.11).
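Invariance of the quartic (1.9) under the three generators of the Klein model can be spot-checked at a random point. The sketch assumes the same reading of (1.2) and (1.3) as above ($\rho(g) = \mathrm{diag}(\zeta^4, \zeta^2, \zeta)$, $\rho(s)$ with prefactor $-1/\sqrt{-7}$ and $\sqrt{-7} = +i\sqrt{7}$) and treats $\Phi_4$ as a function on column vectors; a numerical check at one point is evidence, not a proof.

```python
import cmath
import math
import random

zeta = cmath.exp(2j * math.pi / 7)
sqrt_m7 = 1j * math.sqrt(7)

def phi4(v):
    x, y, z = v
    return x**3 * y + y**3 * z + z**3 * x      # (1.9)

def apply(m, v):
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))

rho_g = [[zeta**4, 0, 0], [0, zeta**2, 0], [0, 0, zeta]]
rho_h = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
a, b, c = zeta - zeta**6, zeta**2 - zeta**5, zeta**4 - zeta**3
rho_s = [[-x / sqrt_m7 for x in row]
         for row in ([a, b, c], [b, c, a], [c, a, b])]

random.seed(0)
v = tuple(complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(3))
base = phi4(v)

# |Phi4(rho(x) v) - Phi4(v)| for each generator x; each should vanish up to rounding.
deviations = {name: abs(phi4(apply(m, v)) - base)
              for name, m in (("g", rho_g), ("h", rho_h), ("s", rho_s))}
```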
² These determinantal formulas (1.13), (1.14), and (1.17) come straight from [Klein 1879b,
§6]. Except for the coefficients −1/54, 1/9, 1/14, they can also be found in [Benson 1993,
p. 101]; note that Benson's coordinates are related with ours by an odd permutation of the
Klein coordinates X, Y, Z, and the 3 × 3 matrix for $\rho(s)$ in [Benson 1993] is missing the factor
$-1/\sqrt{-7}$ and has an incorrect (3, 3) entry.

Since $\Phi_4$ is invariant under G, so is its Hessian determinant
$$H(\Phi_4) = \begin{vmatrix}
\partial^2\Phi_4/\partial X^2 & \partial^2\Phi_4/\partial X\,\partial Y & \partial^2\Phi_4/\partial X\,\partial Z \\
\partial^2\Phi_4/\partial Y\,\partial X & \partial^2\Phi_4/\partial Y^2 & \partial^2\Phi_4/\partial Y\,\partial Z \\
\partial^2\Phi_4/\partial Z\,\partial X & \partial^2\Phi_4/\partial Z\,\partial Y & \partial^2\Phi_4/\partial Z^2
\end{vmatrix}, \tag{1.12}$$
and we may take
$$\Phi_6 := -\frac{1}{54} H(\Phi_4) = XY^5 + YZ^5 + ZX^5 - 5X^2Y^2Z^2 \tag{1.13}$$
as the sextic invariant. These polynomials $\Phi_4$, $\Phi_6$ are f and $(-\nabla)$ in Klein's
notation [1879b]. They are irreducible: each of $\Phi_4$, $\Phi_6$ can have at most 6 irreducible factors, permuted by G up to scaling, and since G has no proper subgroup
of index ≤ 6 the factors must be themselves invariant; but the only invariant
polynomials of degree < 4 are constant, so neither $\Phi_4$ nor $\Phi_6$ can admit a proper
factorization.
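The degree-14 invariant is not uniquely determined even up to scaling: one
can also add any multiple of $\Phi_4^2\Phi_6$. But we will usually work mod $\Phi_4$, so this
additional ambiguity will disappear. A G-invariant polynomial of degree 14 not
proportional to $\Phi_4^2\Phi_6$ can be obtained from either of the two conjugacy classes of
subgroups $S_4 \subset G$: each of these contains seven subgroups, each of which has a
unique invariant quadric (that is, an invariant line in $\mathrm{Sym}^2(V^*)$), and the product
of these seven quadrics is a G-invariant polynomial of degree 7 · 2 = 14. We may
choose for $\Phi_{14}$ any linear combination of this product and $\Phi_4^2\Phi_6$. Alternatively
$\Phi_{14}$ may be obtained as a differential determinant from $\Phi_4$, $\Phi_6$ by extending the
Hessian determinant we used to obtain $\Phi_6$ from $\Phi_4$:
$$\Phi_{14} = \frac{1}{9} \begin{vmatrix}
\partial^2\Phi_4/\partial X^2 & \partial^2\Phi_4/\partial X\,\partial Y & \partial^2\Phi_4/\partial X\,\partial Z & \partial\Phi_6/\partial X \\
\partial^2\Phi_4/\partial Y\,\partial X & \partial^2\Phi_4/\partial Y^2 & \partial^2\Phi_4/\partial Y\,\partial Z & \partial\Phi_6/\partial Y \\
\partial^2\Phi_4/\partial Z\,\partial X & \partial^2\Phi_4/\partial Z\,\partial Y & \partial^2\Phi_4/\partial Z^2 & \partial\Phi_6/\partial Z \\
\partial\Phi_6/\partial X & \partial\Phi_6/\partial Y & \partial\Phi_6/\partial Z & 0
\end{vmatrix}, \tag{1.14}$$
which in terms of the Klein coordinates for V is
$$\sum_{\mathrm{cyc}} \bigl(X^{14} - 34X^{11}Y^2Z - 250X^9YZ^4 + 375X^8Y^4Z^2 + 18X^7Y^7 - 126X^6Y^3Z^5\bigr) \tag{1.15}$$
(in which $\sum_{\mathrm{cyc}}$ means sum over the three cyclic permutations of X, Y, Z, so
for instance $\Phi_4 = \sum_{\mathrm{cyc}} X^3Y$). All the invariant polynomials of degree 14 are
irreducible except for $\Phi_4^2\Phi_6$ and the products of the two orbits of $S_4$-invariant
quadrics. Multiplying the images of the quadric (1.4) under powers of $\rho(g)$ yields
$$\Phi_{14} + (69 + 7\alpha)\,\Phi_4^2\Phi_6, \tag{1.16}$$
so the reducible combinations of $\Phi_4^2\Phi_6$ and $\Phi_{14}$ are $\Phi_4^2\Phi_6$ itself, (1.16), and its
conjugate $\Phi_{14} + (62 - 7\alpha)\,\Phi_4^2\Phi_6$.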
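The determinantal formulas for the sextic and degree-14 invariants can be reproduced with exact arithmetic. The sketch below is a minimal polynomial calculator (polynomials stored as dictionaries from exponent triples to coefficients; all helper names are illustrative). It assumes the normalizations $\Phi_6 = -\frac{1}{54}H(\Phi_4)$ and $\Phi_{14} = \frac19$ times the bordered Hessian, and compares the results with (1.13) and (1.15) read with minus signs on the coefficients 34, 250 and 126.

```python
from fractions import Fraction
from itertools import product

# A polynomial in X, Y, Z is a dict {(i, j, k): coefficient}.

def add(p, q, sign=1):
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + sign * c
    return {m: c for m, c in out.items() if c != 0}

def mul(p, q):
    out = {}
    for (m1, c1), (m2, c2) in product(p.items(), q.items()):
        mono = tuple(a + b for a, b in zip(m1, m2))
        out[mono] = out.get(mono, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def diff(p, var):
    out = {}
    for mono, c in p.items():
        if mono[var]:
            new = list(mono)
            new[var] -= 1
            out[tuple(new)] = c * mono[var]
    return out

def det(rows):
    """Determinant of a square matrix of polynomials, expanding along the first row."""
    if len(rows) == 1:
        return rows[0][0]
    total = {}
    for j, entry in enumerate(rows[0]):
        minor = [row[:j] + row[j + 1:] for row in rows[1:]]
        total = add(total, mul(entry, det(minor)), sign=(-1) ** j)
    return total

def cyclic_sum(p):
    """Sum of p over the cyclic substitutions X -> Y -> Z -> X."""
    out = {}
    for (i, j, k), c in p.items():
        for mono in ((i, j, k), (k, i, j), (j, k, i)):
            out[mono] = out.get(mono, 0) + c
    return out

phi4 = cyclic_sum({(3, 1, 0): 1})                                   # (1.9)
phi6_claimed = add(cyclic_sum({(1, 5, 0): 1}), {(2, 2, 2): -5})     # (1.13)
phi14_claimed = cyclic_sum({(14, 0, 0): 1, (11, 2, 1): -34, (9, 1, 4): -250,
                            (8, 4, 2): 375, (7, 7, 0): 18, (6, 3, 5): -126})  # (1.15)

hessian = [[diff(diff(phi4, i), j) for j in range(3)] for i in range(3)]
phi6 = {m: Fraction(-c, 54) for m, c in det(hessian).items()}

grad6 = [diff(phi6, i) for i in range(3)]
bordered = [hessian[i] + [grad6[i]] for i in range(3)] + [grad6 + [{}]]
phi14 = {m: c / 9 for m, c in det(bordered).items()}

matches_1_15 = (phi14 == phi14_claimed)
```

Exact rational coefficients are used so that the divisions by 54 and 9 are visible as integrality statements rather than hidden in floating-point rounding.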
Finally the invariant $\Phi_{21}$ may be described as the product of 21 linear forms:
from the character table, each of the 21 involutions in G fixes a one-dimensional
subspace of $V^*$, and we obtain $\Phi_{21}$ by multiplying generators of these subspaces.
Alternatively $\Phi_{21}$ may be described as a multiple of the Jacobian determinant
of $(\Phi_4, \Phi_6, \Phi_{14})$ with respect to (X, Y, Z). We choose the multiple
$$\Phi_{21} = \frac{1}{14}\,\frac{\partial(\Phi_4, \Phi_6, \Phi_{14})}{\partial(X, Y, Z)} = \frac{1}{14} \begin{vmatrix}
\partial\Phi_4/\partial X & \partial\Phi_4/\partial Y & \partial\Phi_4/\partial Z \\
\partial\Phi_6/\partial X & \partial\Phi_6/\partial Y & \partial\Phi_6/\partial Z \\
\partial\Phi_{14}/\partial X & \partial\Phi_{14}/\partial Y & \partial\Phi_{14}/\partial Z
\end{vmatrix}; \tag{1.17}$$
the factor 1/14 makes this an integral polynomial $X^{21} + Y^{21} + Z^{21} + \cdots$ in the
Klein coordinates. Then $\Phi_{21}^2$ is invariant under $\pm G$, and is thus a polynomial in
$\Phi_4, \Phi_6, \Phi_{14}$. By comparing coefficients we find
$$\begin{aligned}
\Phi_{21}^2 = {} & \Phi_{14}^3 - 1728\,\Phi_6^7 + 1008\,\Phi_4\Phi_6^4\Phi_{14} - 32\,\Phi_4^2\Phi_6\Phi_{14}^2 + 19712\,\Phi_4^3\Phi_6^5 \\
& - 1152\,\Phi_4^4\Phi_6^2\Phi_{14} + 11264\,\Phi_4^6\Phi_6^3 - 256\,\Phi_4^7\Phi_{14} + 12288\,\Phi_4^9\Phi_6.
\end{aligned} \tag{1.18}$$
Thus
$$\Phi_{14}^3 - \Phi_{21}^2 \equiv 1728\,\Phi_6^7 \bmod \Phi_4. \tag{1.19}$$
The existence of a linear dependence mod $\Phi_4$ between $\Phi_6^7$, $\Phi_{14}^3$, and $\Phi_{21}^2$ could
have been surmised from the degrees of these invariants; we shall see that it
is closely related to the description of X as a G-cover of $\mathbf{CP}^1$ branched at only
three points, with ramification indices 2, 3, 7. (It is also the reason that this curve
figures in the analysis of the Diophantine equation $Ax^2 + By^3 = Cz^7$ in [Darmon
and Granville 1995].) The occurrence of the coefficient $1728 = 12^3$ in (1.19),
reminiscent of the identity $E_2^3 - E_3^2 = 1728\,\Delta$ for modular forms on $PSL_2(\mathbf{Z})$,
suggests that X may be closely related with elliptic and modular curves; we shall
see that this is in fact the case in the final section.
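The shape of (1.18) can be checked by bookkeeping alone. The sketch below records each term as a coefficient with its exponents of $\Phi_4, \Phi_6, \Phi_{14}$ (signs as read here: minus on 1728, 32, 1152, 256) and confirms that every term has degree 42, that every monomial of degree 42 in the three generators occurs, and that only two terms survive modulo $\Phi_4$, which is the content of (1.19).

```python
# Terms of (1.18) as (coefficient, a, b, c), meaning coefficient * Phi4^a * Phi6^b * Phi14^c.
terms = [
    (1, 0, 0, 3),
    (-1728, 0, 7, 0),
    (1008, 1, 4, 1),
    (-32, 2, 1, 2),
    (19712, 3, 5, 0),
    (-1152, 4, 2, 1),
    (11264, 6, 3, 0),
    (-256, 7, 0, 1),
    (12288, 9, 1, 0),
]

# Phi21^2 has degree 42, so each term must satisfy 4a + 6b + 14c = 42.
degrees = [4 * a + 6 * b + 14 * c for _, a, b, c in terms]

# All exponent triples of weighted degree 42.
all_monomials = {(a, b, c)
                 for a in range(11) for b in range(8) for c in range(4)
                 if 4 * a + 6 * b + 14 * c == 42}
used_monomials = {(a, b, c) for _, a, b, c in terms}

# Terms not divisible by Phi4: these give the congruence (1.19).
surviving_mod_phi4 = [(coef, b, c) for coef, a, b, c in terms if a == 0]

# Terms with odd coefficient: only Phi14^3 survives mod 2.
odd_terms = [t for t in terms if t[0] % 2 == 1]
```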
1.3. Arithmetic of $(V, \rho)$: fields of definition. So far we have worked over C.
In fact all the representations of G except those of dimension 3 can be realized
by homomorphisms of G to $GL_d(\mathbf{Q})$; we say that these representations are defined over Q. This is obvious for the trivial representation, and clear for the 6-
and 7-dimensional ones from their relation with the 7- and 8-letter permutation
representations of G. By comparing characters we see that the direct sum of the
7- and 8-dimensional representations is isomorphic with the exterior square of
the 6-dimensional one, whence the 8-dimensional representation is also defined
over Q. We cannot hope for the 3-dimensional representations to be defined
over Q, because $\chi_3$ takes irrational values $\alpha$, $\bar\alpha$ on the 7-cycles in G. We next
investigate how close we can come to overcoming this difficulty.
The $S_4$ model shows that $(V, \rho)$ can be defined over the quadratic extension k
of Q generated by the values of $\chi_3$. On the other hand, the Klein model of $(V, \rho)$
uses matrices over the larger field K, but is defined over Q in the weaker sense
that $\rho(G) \subset SL_3(K)$ is stable under Gal(K/Q). Indeed the Galois conjugates
of $\rho(g)$ are its powers, $\rho(h) \in SL_3(\mathbf{Q})$ is fixed by Gal(K/Q), and the involution (1.3) is contained in $SL_3(K_+)$ and taken by $\mathrm{Gal}(K_+/\mathbf{Q})$ to its conjugates
by powers of h, so the group $\rho(G)$ generated by these three linear transformations is permuted by Gal(K/Q). The $S_4$ model cannot be defined over Q even
in this weaker sense: if it were, complex conjugation would induce a nontrivial
automorphism of G fixing $S_4 \subset G$ pointwise, but no such automorphism exists.
This is why the invariants $\Phi_4, \Phi_6, \Phi_{14}, \Phi_{21}$ are polynomials over Q in the Klein
model but not in the $S_4$ model. This still leaves open the possibility of finding a
model in which $\rho(G)$ is both contained in $SL_3(k)$ and stable under Gal(k/Q) by
applying a suitable $GL_3(K)$ or $GL_3(k)$ change of basis to the Klein or $S_4$ model.
Indeed it turns out that such a model, giving in effect a faithful representation
of Aut(G) into $\Gamma L_3(k)$,³ does exist, and is in fact unique up to isomorphism. This
is because constructing such a model amounts to choosing an outer involution
of G to map to the Galois involution of k/Q, and there is just one conjugacy
class of involutions in Aut(G) − G. Under the identification of Aut(G) with
$PGL_2(\mathbf{F}_7)$, one such involution is $r = \pm\begin{pmatrix}1&0\\0&-1\end{pmatrix}$. The subgroup of G fixed by this
involution is the copy of $S_3$ generated by h, s; thus only this subgroup will map
to matrices in $GL_3(\mathbf{Q})$. Allan Adler points out (in e-mail) a beautiful way to see
the image of the 42-element subgroup $\langle g, h, r \rangle$ of Aut(G): regard K as a three-dimensional vector space over k; let g be multiplication by $\zeta$; let h be the generator of
Gal(K/k) taking $\zeta$ to $\zeta^2$; and let r be complex conjugation, acting k-antilinearly
as it should. Since, as noted already, $\langle g, h \rangle$ acts irreducibly on V, this suffices to
determine the representation. We choose the basis $(\zeta - \zeta^6,\ \zeta^2 - \zeta^5,\ \zeta^4 - \zeta^3)$ for
K/k; note that this basis is orthogonal under the G-invariant Hermitian norm
$\|z\| = \mathrm{Tr}_{K/k}(z\bar{z})$ on K. We find that this basis is related with the basis for the
$S_4$ model by the change of basis with matrix
[(1.20): 3 × 3 matrix; entries not recoverable from the source]
and that in this basis the matrices for $\rho(g)$, $\rho(h)$, $\rho(s)$ are
[(1.21): three 3 × 3 matrices; entries not recoverable from the source]

We call this the rational $S_3$ model of $(V, \rho)$. Since it is weakly defined over Q,
its polynomial invariants have rational coefficients. For most purposes it is still
more convenient to use the simpler invariants of the Klein model; for instance
the quartic invariant $\Phi_4$, which is the pretty trinomial (1.9) in the Klein model,
becomes a multiple of
$$A^4 + B^4 + C^4 + 6(AB^3 + BC^3 + CA^3) - 3(A^2B^2 + B^2C^2 + C^2A^2) + 3ABC(A + B + C) \tag{1.22}$$
in our basis, and looks even worse with other coordinate choices. But it does
have the advantage not only of minimal fields of definition but also of identifying
G with linear groups over both $\mathbf{F}_2$ and $\mathbf{F}_7$ by reducing $(V, \rho)$ modulo primes
of $O_k$ with those residue fields.

³ By this is meant the semidirect product of $GL_3(k)$ with Gal(k/Q), in analogy with the
semilinear groups $\Gamma L_n(\mathbf{F}_q)$ over finite fields properly containing $\mathbf{F}_p$.
1.4. Arithmetic of $(V, \rho)$: reduction mod p and the lattice L. Remarkably
the representation $(V, \rho)$ remains irreducible at every prime, and its reductions
mod 2 and 7 reveal the identification of G with $SL_3(\mathbf{F}_2)$ and $PSL_2(\mathbf{F}_7)$ respectively. Before showing this we put it in context by briefly recalling what it means
to reduce a representation mod p.
For this paragraph only, let G be any finite group, and $(V, \rho)$ an irreducible
representation of G defined over a number field F. Let $L \subset V$ be an $O_F$-lattice
stable under G. (Such a lattice always exists; for instance we may choose any
nonzero $v \in V$ and take for L the $O_F$-linear combinations $\sum_{g \in G} a_g\,\rho(g)(v)$.) For
each prime ideal $\mathfrak{p}$ of $O_F$, we then obtain a representation of G on the $(O_F/\mathfrak{p})$-vector space $L/\mathfrak{p}L$. If this representation is irreducible then it does not depend
on the choice of L, and we may unambiguously say that $(V, \rho)$ is irreducible
mod $\mathfrak{p}$ and call $L/\mathfrak{p}L$ its reduction mod $\mathfrak{p}$. This is the case for all but finitely
many $\mathfrak{p}$, including all primes whose residual characteristic does not divide the
order of G. But it may, and usually does, happen that there are some primes $\mathfrak{p}$,
necessarily with $\#G \equiv 0 \bmod \mathfrak{p}$, such that $L/\mathfrak{p}L$ is reducible, in which case that
representation may depend on the G-stable lattice L (though the composition
factors of $L/\mathfrak{p}L$ depend only on $(V, \rho)$ and $\mathfrak{p}$). For instance, if F = Q and G is the
symmetric group $S_n$ (n > 3), and we take for $(V, \rho)$ its usual (n − 1)-dimensional
representation, then it is reducible mod p if and only if p divides n. When p
divides n, the representation L/pL depends on the choice of L. If we choose
for L the root lattice
$$A_{n-1} = \Bigl\{ (a_1, a_2, \ldots, a_n) \in \mathbf{Z}^n : \sum_{i=1}^{n} a_i = 0 \Bigr\},$$
the representation L/pL contains the 1-dimensional trivial representation generated by (1, 1, . . . , 1); if we choose instead the dual lattice $A_{n-1}^*$ then L/pL has a
G-invariant functional but no invariant proper subspace of positive dimension.
We return now to the case that G is the simple group of 168 elements and
V is its 3-dimensional representation with character $\chi_3$. We may choose either
F = K or F = k. In either case we may see without any computation that V is
irreducible mod $\mathfrak{p}$ for each prime $\mathfrak{p}$ of F. Indeed if V were reducible then G would
have a nontrivial representation mod $\mathfrak{p}$ of dimension 1 or 2; since G is simple
and non-abelian, it would thus be a subgroup of $GL_2(O_F/\mathfrak{p})$. But the only
non-abelian simple groups with an irreducible 2-dimensional representation over
some field are the groups $SL_2(\mathbf{F}_{2^r})$ for r > 1 (this follows from the classification
of finite subgroups of $SL_2$ over an arbitrary field, see for instance [Suzuki 1982,
Theorem 6.17]). But G is not such a group: it does not even have order $2^{3r} - 2^r$.
This completes the proof that V is irreducible at each prime of F.
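The order argument in the last step is easy to make concrete. The sketch below lists $\#SL_2(\mathbf{F}_{2^r}) = q(q^2 - 1) = 2^{3r} - 2^r$ with $q = 2^r$ for a range of r (the upper bound 10 is an arbitrary illustrative cutoff; the orders grow strictly with r, so larger r cannot give 168 either).

```python
order_G = 168

# #SL2(F_q) = q (q^2 - 1); for q = 2^r this is 2^(3r) - 2^r.
sl2_orders = {r: 2 ** (3 * r) - 2 ** r for r in range(2, 11)}

# 168 lies strictly between the orders for r = 2 and r = 3.
brackets_168 = sl2_orders[2] < order_G < sl2_orders[3]
```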
Thus $(V, \rho)$ is one of the few known representations of finite groups in dimension greater than 1 that are "absolutely irreducible" in the sense of [Gross
1990], that is, are irreducible and remain so in every characteristic.⁴ Since k has
unique factorization, the main result (Prop. 5.4) of [Gross 1990] then shows that
the lattice L is unique up to scaling. In the coordinates of the rational $S_3$ model
L is proportional to the self-dual lattice
$$\Bigl\{ \tfrac{1}{\sqrt{-7}}\,(x, y, z) \;:\; x, y, z \in O_k;\ \ y - 2x,\ z - 4x \in \sqrt{-7}\,O_k;\ \ x + 2y + 4z \in 7\,O_k \Bigr\}. \tag{1.23}$$
In the coordinates of the $S_4$ model we may take L to be the $O_k$-lattice generated
by the column vectors
$$(2, 0, 0), \qquad (\alpha, \alpha, 0), \qquad (\bar\alpha, 1, 1). \tag{1.24}$$
The group G can in turn be defined as the group of determinant-1 automorphisms
of this lattice [Conway et al. 1985]. Likewise the only G-invariant lattices in $V^*$
are of the form $cL^*$ for nonzero c, where $L^*$ is generated by
$$(2, 0, 0), \qquad (\bar\alpha, \bar\alpha, 0), \qquad (\alpha, 1, 1); \tag{1.25}$$
this $L^*$ may be identified with the dual lattice of L. (Of course L, $L^*$ are isomorphic qua lattices because the representations V, $V^*$ are identified by an automorphism of G.) We note two facts for future reference. First, that in our
case it is enough to assume that L or $L^*$ is a Z-lattice stable under the action
of G: we obtain the action of $O_k$ automatically because $\rho(g) + \rho(g^2) + \rho(g^4)$
is multiplication by $\alpha$ on V and by $\bar\alpha$ on $V^*$. Second, that L is known to be
the unique indecomposable positive-definite unimodular Hermitian $O_k$-lattice of
rank 3 [Hoffmann 1991, Theorem 6.1].
⁴ The best known examples of absolutely irreducible representations are the defining representations of the Weyl group of $E_8$ and the isometry group of the Leech lattice. Both of those
representations are defined over Q; thus the uniqueness up to scaling of the stable lattices for
those groups is already contained in the work of Thompson [1976], who gave those examples as
well as the 248-dimensional representation of his sporadic simple group. Gross's paper [Gross
1990] extends Thompson's work to several classes of representations not defined over Q, and
gives many examples.

We next consider the reductions of $(V, \rho)$ in characteristics 2, 7. We deal with
characteristic 2 first. There are two primes $\wp_2$, $\bar\wp_2$ above 2 in $O_k$, interchanged
by complex conjugation. We may take $\wp_2 = (\alpha)$, $\bar\wp_2 = (\bar\alpha)$. Thus the reductions
of the rational $S_3$ model for $(V, \rho)$ modulo those primes are related by an outer
automorphism of G. Using either prime, we obtain a nontrivial representation
$G \to GL_3(\mathbf{F}_2)$. Since G is simple and $GL_3(\mathbf{F}_2)$ also has 168 elements, this map must be an isomorphism. That is,
each invertible linear transformation of V mod $\wp_2$ or $\bar\wp_2$ comes from a unique
element of G; equivalently, each automorphism of $L/\wp_2 L$ or $L/\bar\wp_2 L$ lifts to a
unique determinant-1 isometry of L! Now Dickson proved that for each prime
power q and every positive integer n the ring of invariants for the action of
$GL_n(\mathbf{F}_q)$ on its defining representation is polynomial, with generators of degrees
$q^n - q^m$ for m = 0, 1, . . . , n − 1. (See the original paper [Dickson 1934], and
[Bourbaki 1968, Chapter V, §5, Ex. 6 on pp. 137-8] for a beautiful proof; the
Dickson invariants and the invariants of subgroups of $GL_n(\mathbf{F}_q)$ are treated in
greater detail in the last chapter of [Benson 1993].) In our case, (q, n) = (2, 3),
so the degrees are 4, 6, 7. Indeed (1.18) reduces mod 2 to⁵ $\Phi_{21}^2 = \Phi_{14}^3$, so that
mod 2 there is a new invariant $\Phi_7$ such that $\Phi_7^2 = \Phi_{14}$, $\Phi_7^3 = \Phi_{21}$; the Dickson
invariants for $GL_3(\mathbf{F}_2)$ are this $\Phi_7$ together with $\Phi_4$, $\Phi_6$; note that indeed the
degrees 4, 6, 7 are $2^3 - 2^2$, $2^3 - 2^1$, $2^3 - 2^0$.
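The counting behind this paragraph can be done by brute force. The sketch below enumerates all 512 matrices of size 3 × 3 over $\mathbf{F}_2$, counts the invertible ones, and lists the Dickson degrees $q^n - q^m$ for (q, n) = (2, 3); the product of those degrees is the usual formula for $\#GL_n(\mathbf{F}_q)$.

```python
from itertools import product

def det_mod2(m):
    """Determinant mod 2 of a 3x3 matrix given as a flat 9-tuple of 0/1 entries."""
    a, b, c, d, e, f, g, h, i = m
    return (a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)) % 2

# Invertible over F_2 means determinant 1, so summing determinants counts GL3(F2).
gl3_f2_order = sum(det_mod2(m) for m in product((0, 1), repeat=9))

q, n = 2, 3
dickson_degrees = sorted(q ** n - q ** m for m in range(n))

order_formula = 1
for d in dickson_degrees:
    order_formula *= d

psl2_f7_order = 7 * (7 ** 2 - 1) // 2     # #PSL2(F_p) = p (p^2 - 1) / 2 for odd p
```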

⁵ It might be objected that we should not be using (1.18) because that equation relates the
invariants of the Klein model. But that model still reduces well in characteristic 2; its only
flaw there is that the field of definition is too large: $\mathbf{F}_8$ instead of $\mathbf{F}_2$. But this does not affect
the structure of the $\mathbf{F}_2$-ring of invariants.

There is a unique prime $\wp_7 = (\sqrt{-7})$ of $O_k$ above 7. The action of G on the
3-dimensional $\mathbf{F}_7$-vector space $L/\wp_7 L$ is then the unique reduction of $(V, \rho)$ in
characteristic 7. Since $\bar{x} \equiv x \bmod \wp_7$ for all $x \in O_k$, the G-invariant Hermitian
form on L reduces to a non-degenerate quadratic form on $L/\wp_7 L$, which G must
respect. Thus the image of our representation $G \to GL_3(\mathbf{F}_7)$ is contained in the
orthogonal group $SO_3(\mathbf{F}_7)$ (not merely $O_3(\mathbf{F}_7)$ because $\rho(G) \subset SL(V)$ already
in characteristic zero). But we already know a 3-dimensional representation of
$G \cong PSL_2(\mathbf{F}_7)$ in characteristic 7, namely the symmetric square $\mathrm{Sym}^2(V_2)$ of its
defining representation. [Note that the matrix −1 in the center of $SL_2(\mathbf{F}_7)$ acts
on $\mathrm{Sym}^2(V_2)$ by multiplication by $(-1)^2 = +1$, which is to say trivially, so we actually do obtain a 3-dimensional representation of the quotient group $PSL_2(\mathbf{F}_7)$.]
Moreover, this representation has an invariant quadratic form, namely the discriminant of a binary quadric, and G acts on $\mathrm{Sym}^2(V_2)$ by linear transformations
of determinant 1. Thus we obtain a map $PSL_2(\mathbf{F}_7) \to SO_3(\mathbf{F}_7)$. The image is not
quite all of $SO_3(\mathbf{F}_7)$; indeed
$$SO_3(\mathbf{F}_7) \cong \mathrm{Aut}(G). \tag{1.26}$$
Both groups have order 336 = 2 · 168, so to obtain the isomorphism (1.26) we
need only extend the action of G on $\mathrm{Sym}^2(V_2)$ to $\mathrm{Aut}(G) \cong PGL_2(\mathbf{F}_7)$. To
do this, begin by choosing for each element of Aut(G) − G a representative
$\gamma \in GL_2(\mathbf{F}_7)$ of determinant −1; such a $\gamma$ exists since −1 is not a square in $\mathbf{F}_7$,
and is well-defined up to sign. Then $\gamma$ induces a linear transformation $\mathrm{Sym}^2\gamma$
[ $= \mathrm{Sym}^2(-\gamma)$ ] of determinant −1 on $\mathrm{Sym}^2(V_2)$ that preserves the quadratic
form. We thus obtain a well-defined $-\mathrm{Sym}^2\gamma \in SO_3(\mathbf{F}_7)$ not contained in the
image of G. These elements, together with $\mathrm{Sym}^2\gamma$ for $\gamma \in G$, fill out all of
$SO_3(\mathbf{F}_7)$. (Geometrically, the actions of $PGL_2$ and $SO_3$ induce automorphisms
of $\mathbf{P}^1$ and of a conic in $\mathbf{P}^2$ respectively, and the isomorphism (1.26) reflects the
identification of the conic with $\mathbf{P}^1$ [Fulton and Harris 1991, p. 273].) We've
seen that the G part of $SO_3(\mathbf{F}_7)$ is obtained from the action of G on $L/\wp_7 L$.
But Aut(G) acts on L too, and since $\wp_7$ is Galois-invariant, the conjugate-linear
automorphisms of L also act on $L/\wp_7 L$.
We thus see that, as in the mod-2 case, each automorphism of $L/\wp_7 L$ preserving the quadratic form lifts uniquely to an automorphism (possibly conjugate-linear and/or of determinant −1) of L. Moreover, L explains the "sporadic"
isomorphism between $SL_3(\mathbf{F}_2)$ and $PSL_2(\mathbf{F}_7)$: these two linear groups are just
the mod-2 and mod-7 manifestations of the isometries of L.⁶
The invariant quadratic form on $L/\wp_7 L$ can also be seen by reducing the ring
of G-invariants mod $\wp_7$. As in the characteristic-2 case, there is a new invariant
$\Phi_2$, and here each of $\Phi_4, \Phi_6, \Phi_{14}$ is proportional to the appropriate power $\Phi_2^2$, $\Phi_2^3$,
$\Phi_2^7$ of this invariant quadric! Note that our formulas (1.11, 1.22) for the quartic
invariant in the $S_4$ and rational $S_3$ models both reduce mod $\wp_7$ to perfect squares,
namely $(X^2 + Y^2 + Z^2)^2$ and $\bigl(X^2 + Y^2 + Z^2 + 3(XY + YZ + ZX)\bigr)^2$. Curiously,
though, it is the $S_4$ form that is pertinent for $\Phi_4 = \Phi_2^2$; that $\Phi_4$ is also a square
mod $\wp_7$ in the rational $S_3$ model is not directly relevant. This is because the
matrices (1.6, 1.7) for $\rho(G)$ in the $S_4$ model are $\wp_7$-integral, while the matrices
(1.21) in the rational $S_3$ model have denominators $\sqrt{-7}$ and even 7, and thus do
not reduce well mod $\wp_7$. [For each odd prime power q, the full ring of invariants
of the three-dimensional representations of $O_3(\mathbf{F}_q)$, $PSL_2(\mathbf{F}_q)$, and the three
intermediate groups has been determined by Kemper [1996, Theorem 2.4(c)].
Of these five groups, only two have polynomial invariants, including $O_3(\mathbf{F}_q)$ but
not $PSL_2(\mathbf{F}_q)$ or $\{\pm 1\} \times PSL_2(\mathbf{F}_q)$. In our case of q = 7, the invariants of
$O_3(\mathbf{F}_7)$ are generated by $\Phi_2$, $\Phi_{21}^2$, and a new invariant $\Phi_8$ given by $X^8 + Y^8 + Z^8$
in the coordinates of the reduced $S_4$ model; G and $\pm G$ do not have polynomial
invariant rings, though another index-2 subgroup of $O_3(\mathbf{F}_7)$ has invariant ring
$\mathbf{F}_7[\Phi_2, \Phi_8, \Phi_{21}]$. See [Kemper 1996] for further details.]

2. The Klein Quartic X as a Riemann Surface


2.1. The action of G on X. The action of $G$ on $V$ induces an action on the projective plane $(V - \{0\})/\mathbf{C}^* \cong \mathbf{CP}^2$, and on the Klein quartic $X \subset \mathbf{CP}^2$, which is the zero-locus of the invariant quartic polynomial $\Phi_4$. We use this to describe the geometry of $X$.
We have seen already that $\Phi_4$ is an irreducible polynomial. Thus its zero locus $X$ is an irreducible curve. An irreducible plane quartic curve can have at most $\binom{4-1}{2} = 3$ singularities. Any singular points of $X$ would be permuted by $G$; since the largest proper subgroups of $G$ have index 7, each singular point would have to be fixed by $G$. But $G$ fixes no point on $\mathbf{CP}^2$ because the representation $(V, \rho)$ is irreducible. Thus $X$ has no singularities, so is a curve of genus 3 canonically embedded in $\mathbf{CP}^2$.
Since each element of G other than the identity can have only finitely many
fixed points on X, there are only a finite number of orbits of G of size less than
#G = 168. We next describe these orbits and their point stabilizers:
6 Several of the other sporadic isomorphisms between linear groups in different characteristics are likewise explained by highly symmetrical lattices in small dimension. For instance the Weyl group of $E_6$ occurs as both an orthogonal group acting on $\mathbf{F}_3^5$ and a symplectic group acting on $\mathbf{F}_2^6$, these vector spaces arising as $E_6/3E_6$ and $E_6/2E_6$. See [Kneser 1967].


Proposition. (i) Each of the eight 7-Sylow subgroups $H_7 \subset G$ has three fixed points in $\mathbf{CP}^2$ and is the stabilizer in $G$ of each of these three points, all of which are on $X$. The $8 \cdot 3 = 24$ points thus obtained are all distinct and constitute a single orbit of $G$. They are Weierstrass points of $X$ of weight 1, and $X$ has no other Weierstrass points.
(ii) Each of the twenty-eight 3-Sylow subgroups $H_3 \subset G$ has three fixed points in $\mathbf{CP}^2$. The normalizer $N(H_3)$ of $H_3$ in $G$, isomorphic with the symmetric group $S_3$, is the stabilizer in $G$ of one of these points; this point is not on $X$. The remaining fixed points of $H_3$ are on $X$ and each has stabilizer $H_3$. The line joining these two points is the unique line of $\mathbf{CP}^2$ stable under $N(H_3)$, and is tangent to $X$ at both points. The $28 \cdot 2 = 56$ points thus obtained are all distinct and constitute a single orbit of $G$. The lines joining pairs of these points with the same stabilizer are the 28 bitangents of $X$.
(iii) Each of the twenty-one 2-element subgroups $H_2 \subset G$ fixes a point and a line in $\mathbf{CP}^2$. The normalizer $N(H_2)$ of $H_2$ in $G$, isomorphic with the 8-element dihedral group, is the stabilizer in $G$ of the fixed point, which is not on $X$. The fixed line meets $X$ in four distinct points, each of which has stabilizer $H_2$ in $G$; these four points are permuted transitively by $N(H_2)$. The $21 \cdot 4 = 84$ points thus obtained are all distinct and constitute a single orbit of $G$.
(iv) Every $G$-orbit in $X$, other than the orbits of size 24, 56, 84 described in (i), (ii), (iii) above, has size 168 and trivial stabilizer.
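The subgroup counts 8, 28 and 21 in the Proposition, and the orbit sizes 24, 56 and 84, can be sanity-checked by brute force. The sketch below assumes the identification of $G$ with $\mathrm{PSL}_2(\mathbf{F}_7)$ used in Section 1 and represents each element as a $2 \times 2$ matrix mod 7 up to sign; the helper names are illustrative only.

```python
from collections import Counter
from itertools import product

P = 7

def normalize(m):
    """Canonical representative of the class {m, -m} in PSL2(F_7)."""
    neg = tuple((-x) % P for x in m)
    return min(m, neg)

def mul(m, n):
    """Product of two 2x2 matrices (a, b, c, d) mod 7, up to sign."""
    a, b, c, d = m
    e, f, g, h = n
    return normalize(((a*e + b*g) % P, (a*f + b*h) % P,
                      (c*e + d*g) % P, (c*f + d*h) % P))

IDENTITY = normalize((1, 0, 0, 1))

def order(m):
    k, x = 1, m
    while x != IDENTITY:
        x = mul(x, m)
        k += 1
    return k

# All matrices of determinant 1, identified up to sign.
elements = {normalize(m) for m in product(range(P), repeat=4)
            if (m[0]*m[3] - m[1]*m[2]) % P == 1}
order_counts = Counter(order(m) for m in elements)

# A cyclic subgroup of prime order p contains p - 1 elements of order p.
sylow7 = order_counts[7] // 6
sylow3 = order_counts[3] // 2
involutions = order_counts[2]
# Orbit sizes for point stabilizers of order 7, 3, 2.
orbit_sizes = [len(elements) // n for n in (7, 3, 2)]
```

The element-order census only confirms the group-theoretic counts; the geometric statements about fixed points on $X$ still rest on the proof that follows.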
Proof. Since there are no points of $\mathbf{CP}^2$ fixed by all of $G$, the stabilizer of every point $P \in \mathbf{CP}^2$ must be contained in a maximal subgroup. For both kinds of maximal subgroup we have representations by monomial matrices relative to a suitable choice of coordinates, which let us readily describe the point stabilizers. If the stabilizer $S(P)$ has even order it must be contained in one of the 24-element subgroups. In the coordinates of the $S_4$ model, we find that such a point $P$ must be one of:
a unit vector, with $S(P)$ an 8-element dihedral group;
a vector $(1 : 1 : 1)$, with $S(P) \cong S_3$;
a permutation of $(1 : 1 : 0)$, with $S(P)$ a noncyclic group of order 4 (these last three cases coming from an opposite pair of faces, edges, or sides of the cube respectively);
a permutation of $(1 : i : 0)$, with $S(P)$ a cyclic group of order 4, or
a permutation of $(1 : x : x)$ for some $x \notin \{0, 1\}$, with $S(P)$ a two-element group.7
Moreover, the only nontrivial groups of odd order in $S_4$ are its 3-Sylows, which are conjugate to the group of cyclic permutations of the coordinates; this group fixes $(1 : 1 : 1)$, which we already saw has stabilizer $S_3$, and the two points $(1 : e^{\pm 2\pi i/3} : e^{\mp 2\pi i/3})$. The stabilizer of each of these last points must be the 3-Sylow: it cannot be a larger subgroup of $S_4$, because we have already accounted for all of these; and the only other possibility would be a 21-element subgroup, which has no fixed points at all because it acts irreducibly on $\mathbf{CP}^2$. Turning to subgroups of the 21-element subgroup, we use the coordinates of the Klein model: the 7-element normal subgroup $\langle g \rangle$ fixes only the three unit vectors, and all 3-element subgroups are conjugate to $\langle h \rangle$, which fixes only $(1 : 1 : 1)$ and the two points
$$(1 : e^{\pm 2\pi i/3} : e^{\mp 2\pi i/3}).$$
Clearly the first of these is also fixed by the involution (1.3). From our analysis of the $S_4$ model it follows that its stabilizer is the $S_3$ generated by $h$ and that involution, while the other two fixed points of $h$ have stabilizer $\langle h \rangle$.
Moreover, using the explicit formula for $\Phi_4$ in the $S_4$ and Klein models we see that the stabilizers of any points of $X$ must be cyclic of order 1, 2, 3, or 7. Thus part (iv) of the Proposition will follow from the first three parts.

7 There are several $x \notin \{0, 1\}$ for which the stabilizer of this point in $G$ is larger, but then that stabilizer is contained in a different maximal $S_4 \subset G$, and the point's coordinates in that subgroup's $S_4$ model appear earlier in this list.
Now a Weierstrass point of any Riemann surface of genus $w > 1$ is a point at which some holomorphic differential vanishes to order at least $w$. (See [Arbarello et al. 1985, 41–43] for the facts we'll need on Weierstrass points.) For a smooth plane quartic, the holomorphic differentials are linear combinations of the coordinates, so since $w = 3$ the Weierstrass points are those at which some line meets the curve at least triply, which is to say the inflection points of the curve. In our case the tangent to
$$X : X^3 Y + Y^3 Z + Z^3 X = 0$$
at $(1 : 0 : 0)$ is the line $Y = 0$, which indeed meets $X$ triply at that point. Thus $(1 : 0 : 0)$ is a Weierstrass point, and by $G$-symmetry so are all 24 points in its orbit. But each Weierstrass point of a Riemann surface has a positive integral weight, and the sum of these weights is $w^3 - w$. Since this is 24 in our case, each point has weight 1 and there are no other Weierstrass points, as claimed. (Knowing the $w^3 - w$ formula we could have also concluded this directly from the existence of a unique orbit of size as small as 24, even without computing that it consists of inflection points.) We have thus proved Part (i) of the proposition.
(ii) First we check that $N(H_3)$ is indeed $S_3$. Since all 3-Sylows are conjugate in $G$, it is enough to do this when $H_3$ is contained in a maximal $S_4$. But the normalizer of every 3-element subgroup of $S_4$ is an $S_3$, so its normalizer in $G$ is a subgroup of even order that is thus contained in a (perhaps different) maximal $S_4$, so is indeed $S_3$ as claimed.
To get at the fixed points of $H_3$ and its normalizer we again use the Klein model. We find that the fixed point $(1 : 1 : 1)$ of $h$ is not on $X$, while the other two fixed points are. Moreover the line connecting those two points is $X + Y + Z = 0$; solving for $Z$ and substituting into $\Phi_4$ we obtain $-(X^2 + XY + Y^2)^2$, so this line is indeed a bitangent of $X$. That any smooth plane quartic curve has 28 bitangents is well known; see for instance [Hartshorne 1977, p. 305, Ex. 2.3h]. The remaining claims of (ii) either follow, as in (i), from the conjugacy in $G$ of all 3-Sylow subgroups, or were already established during the above analysis of the stabilizers of points in $\mathbf{CP}^2$.
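The bitangency computation in (ii) amounts to a polynomial identity in two variables, which can be confirmed by exact integer evaluation. A minimal sketch, with function names chosen here for illustration: both sides have degree at most 4 in each variable, so agreement on a $9 \times 9$ grid already forces equality.

```python
def phi4(x, y, z):
    """The Klein quartic form in the Klein model."""
    return x**3*y + y**3*z + z**3*x

def restricted_to_line(x, y):
    # Points of the line X + Y + Z = 0 have Z = -(X + Y).
    return phi4(x, y, -(x + y))

def claimed_square(x, y):
    return -(x*x + x*y + y*y)**2

grid = range(-4, 5)
bitangent_identity_holds = all(
    restricted_to_line(x, y) == claimed_square(x, y)
    for x in grid for y in grid)
```

Since the restriction is, up to sign, the square of a quadratic form with distinct roots, the line meets the curve at two points, each with multiplicity 2.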
(iii) Again we first check that N (H2 ) is as claimed, using the fact that the
involutions in G constitute a single conjugacy class. The normalizer of a double
transposition in $S_4 \subset G$ is an 8-element dihedral group. Thus its normalizer in $G$
is either that group, a maximal S4 , or all of G, but the last two are not possible
because these groups have trivial centers. So N (H2 ) is indeed an 8-element
dihedral group.
The noncyclic 4-group N (H2 )/H2 acts on the fixed line of H2 and on its
intersection with X. Since no point of X may have stabilizer properly containing
H2 , the number of points of X on the fixed line must be a multiple of 4. But the
intersection of a line with a smooth quartic curve consists of at least 1 and at
most 4 points. Thus there are four fixed points of H2 on X, transitively permuted
by N (H2 ). The remaining claims of (iii) follow as before.

Corollary [Klein 1879b, §6]. The 24-, 56- and 84-point orbits are the zero loci of $\Phi_6$, $\Phi_{14}$, and $\Phi_{21}$ on $X$, each with multiplicity 1.
Proof. Since none of $\Phi_6$, $\Phi_{14}$, and $\Phi_{21}$ is a multiple of $\Phi_4$, these polynomials do not vanish identically on $X$, so their zero loci contain respectively 24, 56, and 84 points with multiplicity. Since the polynomials are $G$-invariant, their zero loci must be positive linear combinations of $G$-orbits. But by the Proposition there are only three orbits of size $< 168$. Moreover none of the integers 24, 56, 84 can be written as a nonnegative integer combination of the others: this is clear for 24, which is the smallest of the three; and almost as clear for 56, which is not a multiple of 24, and for 84, which is congruent to neither 0 nor 56 mod 24. Thus the vanishing loci can only be as claimed in the Corollary.
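The arithmetic behind the Corollary is small enough to enumerate. The sketch below (helper names are illustrative) lists every way of writing each of 24, 56, 84 as a nonnegative integer combination of the three orbit sizes, and records the degree $4d$ of the zero locus on $X$ of a form of degree $d$.

```python
from itertools import product

SIZES = (24, 56, 84)

def decompositions(total, parts=SIZES):
    """All nonnegative integer coefficient tuples c with sum(c_i * parts_i) == total."""
    ranges = [range(total // p + 1) for p in parts]
    return [coeffs for coeffs in product(*ranges)
            if sum(c * p for c, p in zip(coeffs, parts)) == total]

# A form of degree d cuts the quartic X in 4 * d points, counted with multiplicity.
zero_locus_degree = {d: 4 * d for d in (6, 14, 21)}
ways = {n: decompositions(n) for n in SIZES}
```

Each total has exactly one decomposition, the trivial one, which is the uniqueness the proof relies on.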

(The $\Phi_6$ case could also have been obtained from (1.13), since the inflection points of any smooth plane curve $P(X, Y, Z) = 0$ are the zeros of the Hessian $H(P)$ [Coolidge 1931, p. 95, Theorem 18]. The case of $\Phi_{21}$ could also be deduced from our description of $\Phi_{21}$ as the product of linear forms fixed by the 21 involutions in $G$. Klein also identifies the zeros of $\Phi_{21}$ on $X$ with the curve's 84 sextactic points, that is, the points at which the osculating conic meets $X$ with multiplicity 6 rather than the generic 5.)
Hirzebruch [1983, pp. 120, 140] draws attention to the configuration in $\mathbf{CP}^2$ of the 21 lines fixed by involutions in $G$. Three of these meet at each of the 28 points fixed by subgroups $S_3 \subset G$, and four lines meet at each of the 21 points fixed by 2-Sylows in $G$. These are all the points of $\mathbf{CP}^2$ that lie on more than one of the 21 lines. In the notation of [Hirzebruch 1983], we thus have a configuration of $k = 21$ lines with $t_3 = 28$, $t_4 = 21$, and $t_n = 0$ for $n \neq 3, 4$. Thus this is one of the few nondegenerate line configurations known to achieve equality in the inequality
$$t_2 + \tfrac{3}{4} t_3 \ge k + \sum_{n>4} (n - 4)\, t_n$$
of [Hirzebruch 1983, p. 140].
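Two quick consistency checks on these numbers can be scripted: every pair of the 21 lines must meet at exactly one multiple point, and the two sides of the inequality should coincide. A minimal sketch using only the values $k$, $t_3$, $t_4$ quoted above:

```python
from fractions import Fraction
from math import comb

k = 21                 # number of lines
t = {3: 28, 4: 21}     # t[n] = number of points lying on exactly n lines

# Each pair of lines meets once; a point on n lines absorbs C(n, 2) pairs.
pairs_of_lines = comb(k, 2)
pairs_accounted_for = sum(count * comb(n, 2) for n, count in t.items())

lhs = t.get(2, 0) + Fraction(3, 4) * t.get(3, 0)
rhs = k + sum((n - 4) * count for n, count in t.items() if n > 4)
```

The pair count $210 = 28 \cdot 3 + 21 \cdot 6$ confirms that no double points are missing, so $t_2 = 0$ is consistent with the description of the configuration.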


We can also use this Proposition to obtain, via the Riemann–Hurwitz formula, the genus of the quotient of $X$ by each subgroup $H \subset G$: the quotient by the trivial group is of course $X$ itself, with genus 3; the quotient by a cyclic subgroup of order 2, 3, or 4 is a curve of genus 1; and the quotient by any other subgroup has genus zero. Another way to obtain these is to identify the space $H^1(X/H)$ of holomorphic differentials on $X/H$ with the subspace $(H^1(X))^H$ of such differentials on $X$ fixed by $H$. But since $X$ is a smooth plane quartic, we can identify $H^1(X)$ with the space of linear forms in the coordinates. Thus in our case the representation of $G$ on $H^1(X)$ is isomorphic with $(V^*, \rho^*)$, and we may recover the dimension of the subspace fixed by each subgroup $H$ from the character table.
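The Riemann–Hurwitz route can be made concrete for the cyclic subgroups and for $G$ itself. The sketch below is an illustration under the stabilizer data of the Proposition (point stabilizers in $G$ are cyclic of order 1, 2, 3 or 7); the function name and input convention are choices made here, not notation from the text.

```python
from fractions import Fraction

GENUS_X = 3

def quotient_genus(group_order, stabilizer_orders):
    """Genus of X/H from Riemann-Hurwitz.

    stabilizer_orders lists |Stab_H(P)| for every point P of X with
    nontrivial stabilizer in H (one entry per point, not per orbit).
    """
    ramification = sum(e - 1 for e in stabilizer_orders)
    # 2 g(X) - 2 = |H| (2 g(X/H) - 2) + ramification
    g = Fraction(2 * GENUS_X - 2 - ramification, 2 * group_order) + 1
    assert g.denominator == 1
    return int(g)

genus = {
    "C2": quotient_genus(2, [2] * 4),   # four fixed points of an involution
    "C3": quotient_genus(3, [3] * 2),   # two fixed points of a 3-Sylow on X
    "C4": quotient_genus(4, [2] * 4),   # only the involution in C4 has fixed points
    "C7": quotient_genus(7, [7] * 3),   # three fixed points of a 7-Sylow
    "G": quotient_genus(168, [7] * 24 + [3] * 56 + [2] * 84),
}
```

For the cyclic group of order 4 the input uses the fact that no point of $X$ has a stabilizer of order 4, so the only ramification comes from the four fixed points of its involution.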
Since the quotient of $X$ by the 7-Sylow $\langle g \rangle \subset G$ has genus 0, we can regard $X$ as a cyclic cover of $\mathbf{CP}^1$ of degree 7. We can see this explicitly: the covering map sends $(X : Y : Z) \in X$ to $(X^3 Y : Y^3 Z : Z^3 X)$ on the line
$$\{(a : b : c) \in \mathbf{CP}^2 : a + b + c = 0\}. \tag{2.1}$$
Then $(Y/Z)^7 = ab^2/c^3$, and $(X : Y : Z)$ is determined by $(a : b : c)$ together with the seventh root $Y/Z$ of $ab^2/c^3$. Thus if we take $y = -Y/Z$ and $x = b/c$ we find that $X$ is birational with the curve
$$y^7 = x^2 (x + 1). \tag{2.2}$$
This model of $X$ exhibits the action of the 21-element subgroup $\langle g, h \rangle$ of $G$: $g$ multiplies $y$ by $\zeta^{-1}$, while $h$ cyclically permutes $a, b, c$ (or equivalently the points $-1, 0, \infty$ on the $x$-line). It also lets us write periods of differentials on $X$ as linear combinations of Beta integrals. For instance, for the form $dx/y^3$ we find
$$\int_{-\infty}^{-1} \frac{dx}{y^3} = B\bigl(\tfrac{2}{7}, \tfrac{4}{7}\bigr), \qquad \int_{-1}^{0} \frac{dx}{y^3} = B\bigl(\tfrac{1}{7}, \tfrac{4}{7}\bigr), \qquad \int_{0}^{\infty} \frac{dx}{y^3} = B\bigl(\tfrac{1}{7}, \tfrac{2}{7}\bigr); \tag{2.3}$$
the identity $\Gamma(u)\Gamma(1 - u) = \pi / \sin \pi u$ shows that each of these integrals is a $K_+$-multiple of
$$\varpi_7 := \frac{1}{\pi \sqrt{7}}\, \Gamma\bigl(\tfrac{1}{7}\bigr) \Gamma\bigl(\tfrac{2}{7}\bigr) \Gamma\bigl(\tfrac{4}{7}\bigr), \tag{2.4}$$
and thus that all the periods of $dx/y^3$ on $X$ are in $K \varpi_7$. We later (2.12) use this to evaluate a complete elliptic integral as a multiple of $\varpi_7$.
We also compute for later use the quotient curve $X/\langle h \rangle$ of genus 1. Since $\Phi_4$ is not fixed by odd coordinate permutations, we can do this by multiplying $\Phi_4$ by its image under such a permutation, and expressing the resulting symmetric function
$$(X^3 Y + Y^3 Z + Z^3 X)(X^3 Z + Z^3 Y + Y^3 X) \tag{2.5}$$
in terms of the elementary symmetric functions
$$s_1 = X + Y + Z, \qquad s_2 = XY + YZ + ZX, \qquad s_3 = XYZ. \tag{2.6}$$
We find that (2.5) is
$$s_2^4 + s_3 (s_1^5 - 5 s_1^3 s_2 + s_1 s_2^2 + 7 s_1^2 s_3). \tag{2.7}$$
We thus get an affine model for $X/\langle h \rangle$ by setting this polynomial equal to zero and substituting 1 for $s_1$:
$$7 s_3^2 + (s_2^2 - 5 s_2 + 1) s_3 + s_2^4 = 0. \tag{2.8}$$
To put this in Weierstrass form, divide (2.8) by $s_2^4$ and rewrite it as
$$7 \Bigl(\frac{s_3}{s_2^2}\Bigr)^{2} + (s_2^{-2} - 5 s_2^{-1} + 1)\, \frac{s_3}{s_2^2} + 1 = 0. \tag{2.9}$$
Let $u = s_3/s_2^2$. Then (2.9) is a quadratic polynomial in $s_2^{-1}$ over $\mathbf{Q}(u)$, so it has a root if and only if its discriminant $-28u^3 + 21u^2 - 4u$ is a square. The further substitution $u = -1/x$ then yields the desired form
$$E_k : y^2 = 4x^3 + 21x^2 + 28x \tag{2.10}$$
of the quotient curve. We can then compute that the curve has $j$-invariant $-3375 = -15^3$, and thus has complex multiplication (CM) by $O_k$. We note for future reference that the unit vectors, which have $s_2 = s_3 = 0$, map to the point at infinity of $E_k$, while the branch points of the cover $X \to E_k$ are the fixed points $(1 : e^{\pm 2\pi i/3} : e^{\mp 2\pi i/3})$ of $h$, which have $s_1 = s_2 = 0$ and turn out to map to two points on $E_k$ whose $x$-coordinates are the roots of
$$x^2 - x + 7 = 0. \tag{2.11}$$
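Both algebraic steps in this derivation, the symmetric-function expression (2.7) and the passage from the discriminant in $u$ to the cubic in $x$, can be verified with exact integer and rational arithmetic. The following sketch uses illustrative function names; each side of (2.7) has degree at most 6 in each variable, so agreement on a $7 \times 7 \times 7$ grid forces the identity.

```python
from fractions import Fraction
from itertools import product

def product_of_quartics(x, y, z):
    """The symmetric function (2.5)."""
    return (x**3*y + y**3*z + z**3*x) * (x**3*z + z**3*y + y**3*x)

def in_symmetric_functions(x, y, z):
    """The expression (2.7) evaluated through s1, s2, s3."""
    s1, s2, s3 = x + y + z, x*y + y*z + z*x, x*y*z
    return s2**4 + s3*(s1**5 - 5*s1**3*s2 + s1*s2**2 + 7*s1**2*s3)

identity_27_holds = all(
    product_of_quartics(*p) == in_symmetric_functions(*p)
    for p in product(range(-3, 4), repeat=3))

def discriminant(u):
    """Discriminant of (2.9) as a quadratic in 1/s2."""
    return -28*u**3 + 21*u**2 - 4*u

def weierstrass_rhs(x):
    return 4*x**3 + 21*x**2 + 28*x

# With u = -1/x the discriminant equals the cubic of (2.10) divided by the square x^4.
substitution_holds = all(
    x**4 * discriminant(Fraction(-1, x)) == weierstrass_rhs(x)
    for x in range(1, 8))
```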

The 2-element group $N(\langle h \rangle)/\langle h \rangle = \langle h, s \rangle/\langle h \rangle$ acts on $E_k$. Since $X/\langle h, s \rangle$ has genus 0, the involution in $\langle h, s \rangle/\langle h \rangle$ must multiply the invariant differential on $E_k$ by $-1$. Thus it is of the form $P \leftrightarrow P_0 - P$ for some $P_0 \in E_k$ (using the group law on $E_k$), and is determined by the image of a single point. We compute that $s$ takes the unit vectors to points on $X$ whose coordinates are proportional to the three roots of $u^3 - 7u^2 + 49$, and that these points map to the 2-torsion point $(0, 0)$ on $E_k$. Thus this point is $P_0$; in other words, the nontrivial element of $\langle h, s \rangle/\langle h \rangle$ acts on $E_k$ by the involution that switches the point at infinity with $(0, 0)$ but is not translation by that 2-torsion point of $E_k$.
We further find that the curve $E_k$ has conductor 49. (To see that the conductor is odd, note that the linear change of variable $y = 2y_1 + x$ puts $E_k$ in the form $y_1^2 + x y_1 = x^3 + 5x^2 + 7x$ with good reduction at 2.) This conductor is small enough that we may locate the curve in the tables of elliptic curves dominated by modular curves compiled by Tingley et al. (the "Antwerp Tables" in [Birch and Kuyk 1975]) and Cremona [1992]: the curve is listed as 49A and 49-A1 respectively. We find there that $E_k$ is literally a modular elliptic curve: it is not only dominated by, but in fact isomorphic with, $X_0(49)$. We shall later obtain this isomorphism from the identification of $X$ with the modular curve $X(7)$. Likewise the fact that $E_k$ has CM by $O_k$ is no accident: we shall see that if there is a nonconstant map from $X$ to an elliptic curve then the elliptic curve has CM by some order (subring of finite index) in $O_k$; equivalently, such a curve must be isogenous with $E_k$. (It is clear that conversely a curve isogenous with $E_k$ admits such a map, since we have just constructed a nonconstant map from $X$ to $E_k$ itself.) For instance this must be true of the quotient of $X$ by one of the 21 two-element subgroups of $G$. Since these subgroups are all conjugate in $G$, the resulting curves are isomorphic; in fact the reader may check (starting from the $S_4$ model of $X$, in which several of these involutions are visible) that these elliptic curves are all $\overline{\mathbf{Q}}$-isomorphic with $E_k$.
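The change of variable and the invariants quoted for $E_k$ can be reproduced from the standard Weierstrass formulas. The sketch below computes the discriminant and $j$-invariant of $y_1^2 + x y_1 = x^3 + 5x^2 + 7x$; it checks good reduction at 2 and the value $j = -3375$, but does not attempt to compute the conductor itself.

```python
from fractions import Fraction

def invariants(a1, a2, a3, a4, a6):
    """Discriminant and j-invariant of y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6."""
    b2 = a1*a1 + 4*a2
    b4 = 2*a4 + a1*a3
    b6 = a3*a3 + 4*a6
    b8 = a1*a1*a6 + 4*a2*a6 - a1*a3*a4 + a2*a3*a3 - a4*a4
    c4 = b2*b2 - 24*b4
    disc = -b2*b2*b8 - 8*b4**3 - 27*b6*b6 + 9*b2*b4*b6
    return disc, Fraction(c4**3, disc)

# Coefficients of the model obtained from y = 2*y1 + x.
disc, j = invariants(1, 5, 0, 7, 0)

def change_of_variable_ok(x, y1):
    """y^2 - (4x^3 + 21x^2 + 28x) is 4 times the new equation when y = 2*y1 + x."""
    y = 2*y1 + x
    return (y*y - (4*x**3 + 21*x**2 + 28*x)
            == 4*((y1*y1 + x*y1) - (x**3 + 5*x**2 + 7*x)))

substitution_checked = all(change_of_variable_ok(x, y1)
                           for x in range(-5, 6) for y1 in range(-5, 6))
```

The discriminant $-7^3$ is odd, which is the good reduction at 2 asserted in the text.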
An algebraic map from $X$ to $E_k$ can be used to pull back an invariant differential on $E_k$ to $H^1(X)$. Thus the periods of $E_k$ can be evaluated in terms of the Beta integrals that arise in the periods of $X$. This yields the formula
$$\int_0^\infty \frac{dx}{\sqrt{4x^3 + 21x^2 + 28x}} = \frac{1}{4} \varpi_7 = \frac{1}{4\pi\sqrt{7}}\, \Gamma\bigl(\tfrac{1}{7}\bigr) \Gamma\bigl(\tfrac{2}{7}\bigr) \Gamma\bigl(\tfrac{4}{7}\bigr), \tag{2.12}$$
equivalent to Selberg and Chowla's result [1967, pp. 102–3]; its explanation via $X$ is essentially the argument of Gross and Rohrlich [1978], though they pulled the differential all the way back to the Fermat curve $F_7$, for which see Section 3.2 below.
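Formula (2.12) lends itself to a numerical cross-check. The sketch below is an illustration, not part of the argument: it assumes the substitution $x = \tan^2\theta$, which turns the improper integral into the integral of a smooth function on $[0, \pi/2]$, and compares a composite Simpson approximation with the Gamma-function side.

```python
import math

def integrand(theta):
    # After x = tan(theta)**2 the integrand dx / sqrt(4x^3 + 21x^2 + 28x)
    # becomes 2 / sqrt(4 sin^4 + 21 sin^2 cos^2 + 28 cos^4) d(theta).
    s2, c2 = math.sin(theta)**2, math.cos(theta)**2
    return 2.0 / math.sqrt(4*s2*s2 + 21*s2*c2 + 28*c2*c2)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + i*h) for i in range(1, n, 2))
    total += 2 * sum(f(a + i*h) for i in range(2, n, 2))
    return total * h / 3

period = simpson(integrand, 0.0, math.pi / 2)
gamma_side = (math.gamma(1/7) * math.gamma(2/7) * math.gamma(4/7)
              / (4 * math.pi * math.sqrt(7)))
```

Both quantities come out close to 0.9667, in agreement with the normalization $1/(4\pi\sqrt{7})$.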
2.2. X as the simplest Hurwitz curve. A classical theorem of Hurwitz ([1893]; see also [Arbarello et al. 1985, Chapter I, Ex. F-3 ff., pp. 45–47]) asserts that a Riemann surface $S$ of genus $g > 1$ can have at most $84(g - 1)$ automorphisms, and a group of order $84(g - 1)$ is the automorphism group of some Riemann surface of genus $g$ if and only if it is generated by an element of order 2 and one of order 3 such that their product has order 7. In that case the quotient of $S$ by the group is the Riemann sphere, and the quotient map $S \to \mathbf{CP}^1$ is ramified above only three points of $\mathbf{CP}^1$, with the automorphisms of orders 2, 3, 7 of $S$ appearing as the deck transformations lifted from cycles around the three branch points. Thus the group elements of orders 2, 3, 7 specify $S$ by Riemann's existence theorem for Riemann surfaces. Note that the construction does not depend on the location of the three branch points on $\mathbf{CP}^1$, because $\mathrm{Aut}(\mathbf{CP}^1) = \mathrm{PGL}_2(\mathbf{C})$ acts on $\mathbf{CP}^1$ triply transitively.
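For genus 3 the Hurwitz condition can be tested directly in $\mathrm{PSL}_2(\mathbf{F}_7)$. The sketch below assumes the identification of $G$ with that group and uses the images of the usual generators of the modular group as a convenient, hypothetical choice of elements of orders 2 and 3; these are not claimed to be the elements $s$ and $g$ of the text.

```python
P = 7

def normalize(m):
    """Canonical representative of the class {m, -m} in PSL2(F_7)."""
    neg = tuple((-x) % P for x in m)
    return min(m, neg)

def mul(m, n):
    a, b, c, d = m
    e, f, g, h = n
    return normalize(((a*e + b*g) % P, (a*f + b*h) % P,
                      (c*e + d*g) % P, (c*f + d*h) % P))

IDENTITY = normalize((1, 0, 0, 1))

def order(m):
    k, x = 1, m
    while x != IDENTITY:
        x = mul(x, m)
        k += 1
    return k

def generated_subgroup(gens):
    """Closure of the identity under right multiplication by the generators."""
    seen = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = mul(x, g)
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return seen

sigma2 = normalize((0, P - 1, 1, 0))   # class of [[0, -1], [1, 0]]
sigma3 = normalize((0, P - 1, 1, 1))   # class of [[0, -1], [1, 1]]
sigma7 = mul(sigma2, sigma3)           # class of [[1, 1], [0, 1]]

hurwitz_bound = 84 * (3 - 1)
group = generated_subgroup([sigma2, sigma3])
```

In a finite group the closure under multiplication by generators is automatically closed under inverses, so `group` is the subgroup generated by the two elements.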
A Riemann surface with the maximal number $84(g - 1)$ of automorphisms, regarded as an algebraic curve over $\mathbf{C}$, is called a Hurwitz curve of genus $g$. Necessarily $g \ge 3$, because a curve $C$ of genus 2 over $\mathbf{C}$ has a hyperelliptic involution $\iota$, and $\mathrm{Aut}(C)/\{1, \iota\}$ is the subgroup of $\mathrm{PGL}_2(\mathbf{C}) = \mathrm{Aut}(\mathbf{CP}^1)$ permuting the six ramified points, but the stabilizer in $\mathrm{Aut}(\mathbf{CP}^1)$ of a six-point set has size at most 24. So a Hurwitz curve must have genus at least 3. We know already that $X$ is such a curve. In fact one may check that $G$ is the only group of order 168 satisfying the Hurwitz condition, and that up to $\mathrm{Aut}(G)$ there is a unique choice of elements of orders 2, 3 in $G$ whose product has order 7. (For instance we may take the involution $s$ and the 3-cycle $sg$.) Thus $X$ is the unique Hurwitz curve of genus 3. We readily write the quotient map $X \to X/G \cong \mathbf{CP}^1$ explicitly, using our invariant polynomials $\Phi_6$, $\Phi_{14}$, $\Phi_{21}$: a point $(X : Y : Z)$ on $X$ maps to
$$j := \frac{\Phi_{14}^3}{\Phi_6^7} = \frac{\Phi_{21}^2}{\Phi_6^7} + 1728 \tag{2.13}$$
on $\mathbf{CP}^1$. Note that this is a rational function of degree $4 \cdot 42 = 168 = \#G$ on $X$, and thus of degree 1 on $X/G$. That the two expressions in (2.13) are indeed equal on $X$ follows from (1.19). We then see from (2.13) that the branch points of orders 2, 3, 7 on $\mathbf{CP}^1$ have $j$ coordinates $1728, 0, \infty$ respectively. Of course we have chosen this coordinate $j$ on $X/G \cong \mathbf{CP}^1$ to facilitate the identification of $X$ and $X/G$ with the modular curves $X(7)$ and $X(1)$ later in this paper.
Hurwitz curves can also be characterized in terms of their uniformization by the hyperbolic plane $\mathcal{H}$. Any Riemann surface $S$ of genus $> 1$ can be identified with $\mathcal{H}/\pi_1(S)$; conversely, any discrete co-compact subgroup $\Gamma \subset \mathrm{Aut}(\mathcal{H}) \cong \mathrm{PSL}_2(\mathbf{R})$ that acts freely on $\mathcal{H}$ (that is, every point has trivial stabilizer) yields a Riemann surface $\mathcal{H}/\Gamma$ of genus $> 1$ whose fundamental group is $\Gamma$. The automorphism group of $\mathcal{H}/\Gamma$ is $N(\Gamma)/\Gamma$, where $N(\Gamma)$ is the normalizer of $\Gamma$ in $\mathrm{Aut}(\mathcal{H})$. It follows that $\mathcal{H}/\Gamma$ is a Hurwitz curve if and only if $N(\Gamma)$ is the triangle group $G_{2,3,7}$ of orientation-preserving transformations generated by reflections in the sides of a given hyperbolic triangle with angles $\pi/2, \pi/3, \pi/7$ in $\mathcal{H}$. Equivalently, $\Gamma$ is to be a normal subgroup of $G_{2,3,7}$. Since $G_{2,3,7}$ has the presentation
$$G_{2,3,7} = \langle \sigma_2, \sigma_3, \sigma_7 \mid \sigma_2^2 = \sigma_3^3 = \sigma_7^7 = \sigma_2 \sigma_3 \sigma_7 = 1 \rangle \tag{2.14}$$
(with $\sigma_j$ being a $2\pi/j$ rotation about the $\pi/j$ vertex of the triangle), this yields our previous characterization of the groups that can occur as $\mathrm{Aut}(S) = G_{2,3,7}/\Gamma$. In Section 4.4 we identify $X$ with a Shimura modular curve by recognizing $G_{2,3,7}$ as an arithmetic group in $\mathrm{PSL}_2(\mathbf{R})$, and $\pi_1(X)$ with a congruence subgroup of $G_{2,3,7}$.
2.3. The Jacobian of X. We have noted already that the representation of $G$ on $H^1(X)$ is isomorphic with $(V^*, \rho^*)$. In particular, the representation is irreducible and defined over $k$, and its character takes values $\notin \mathbf{Q}$. It follows as in [Ekedahl and Serre 1993] that the Jacobian $J = J(X)$ is isogenous to the cube of an elliptic curve with CM by $O_k$. This does not determine $J$ completely, but the fact that $G$ acts on the period lattice of $J$ means that this period lattice is proportional to $L$, and this does specify $J$. (See [Mazur 1986, pp. 235–6], where this is attributed to Serre; also compare [Buser and Sarnak 1994, Appendix 1], where the packing of congruent spheres in $\mathbf{R}^6$ obtained from $L$ is conjectured to maximize the density of a packing coming from the period lattice of the Jacobian of a curve of genus 3.) In the notation of [Serre 1967] we have8 $J \cong E_k \otimes L$.
We next describe a Mordell–Weil lattice associated with $X$; see for instance [Elkies 1994] for more background on Mordell–Weil lattices.
Let $E$ be an elliptic curve, and consider algebraic maps from $X$ to $E$. These constitute an abelian group using the group law on $E$. This group may also be regarded as the group of rational points of $E$ defined over the function field of $X$; we thus call it the Mordell–Weil group $M$ of maps from $X$ to $E$, in analogy with the Mordell–Weil group of an elliptic curve over a number field. This group contains a subgroup isomorphic with $E$, namely the group of constant maps; the quotient group $M/E$ may in turn be identified (via the embedding of $X$ into $J$) with the group of morphisms from $J$ to $E$. It follows that this group is trivial unless $E$ has CM by an order in $O_k$, in which case it is a free abelian group of rank 6. This proves our earlier claim that the elliptic curves $E$ admitting a nonconstant map from $X$ are exactly the curves isogenous with $E_k$.
The function $\hat h : M \to \mathbf{Z}$ taking each $f : X \to E$ to twice its degree as a rational map turns out to be a quadratic form. (For Riemann surfaces this is easy to see: let $\omega$ be a nonzero invariant differential on $E$; then $f \mapsto f^*\omega$ is a group homomorphism from $M$ to $H^1(X)$, and $\hat h(f) = 2 \deg(f)$ is the image of $f^*\omega$ under the quadratic form $\eta \mapsto 2 \int_X \eta \wedge \bar\eta \big/ \int_E \omega \wedge \bar\omega$. Several proofs that $\hat h$ is a quadratic form valid in arbitrary characteristic are given in [Elkies 1994]. We use the notation $\hat h$ because this is a special case of the Néron–Tate canonical height; note that thanks to the factor of 2 the associated bilinear pairing
$$\langle f_1, f_2 \rangle = \tfrac{1}{2}\bigl(\hat h(f_1 + f_2) - \hat h(f_1) - \hat h(f_2)\bigr)$$
is integral.) This quadratic form is positive-definite on the free abelian group $M/E$, and gives this group the structure of a Euclidean lattice, which we thus call the Mordell–Weil lattice of maps from $X$ to $E$.
Assume now that $E$ is an elliptic curve with CM by $O_k$, i.e. that $E$ is isomorphic with $E_k$. Then the Mordell–Weil lattice inherits the action of $O_k$ on $E$ as well as the action of $G$ on $X$. Therefore it is isomorphic with our lattice $L$ of (1.25) up to scaling. Moreover, the quadratic form $\hat h$ satisfies the identity
$$\hat h(\beta f) = |\beta|^2\, \hat h(f)$$
for each $\beta \in O_k$, because $|\beta|^2$ is the degree of the isogeny $\beta : E \to E$. Thus $\hat h$ is a Hermitian pairing on $L$. This pairing is again unique up to scaling, this time because $V$ is unitary and Hermitian (see again [Gross 1990]). If we identify $L$ with the lattice generated by the three vectors (1.25) then we have
$$\hat h(v) = |v_1|^2 + |v_2|^2 + |v_3|^2. \tag{2.15}$$

8 Serre actually defines $E \otimes L$ (or rather $L \otimes E$) only when $L$ is a lattice of rank 1 over $\mathrm{End}(E)$, but for each $g \ge 1$ the same construction for a lattice of rank $g$ yields a polarized abelian variety isogenous with $E^g$.

This lattice has 21 pairs of vectors such as $(2, 0, 0)$ of minimal nonzero norm 4. These correspond to maps of degree 2 from $X$ to $E$, which in turn are indexed by the 21 involutions $g \in G$. Each $g$ is counted twice, because there are up to translation in $E$ two ways to identify the quotient curve $X/\{1, g\}$ with $E$, each yielding a map $X \to E$ of degree 2. Likewise the 28 pairs of vectors such as
(, , ) of the next-lowest norm 6 correspond to maps of degree 3, all of which
turn out to be quotient maps by the twenty-eight 3-Sylow subgroups of $G$. For each $n$ the number $N_n$ of maps of degree $n$ up to translation on $E$ is the number of vectors of norm $2n$ in $L$, which is the $q^n$ coefficient of the theta series
$$\theta_L := \sum_{n=0}^{\infty} N_n q^n = \sum_{v \in L} q^{\frac{1}{2} \hat h(v)} \tag{2.16}$$
of $L$. But $\theta_L$ is a modular form of weight 3 with quadratic character on $\Gamma_0(7)$ fixed by the Fricke involution $w_7$ ([Gross 1990, §9]; we shall encounter $\Gamma_0(7)$ and $w_7$ again in Section 4.2), and the space of such modular forms is 2-dimensional. The constraints $N_0 = 1$, $N_1 = 0$ determine $\theta_L$ uniquely, and we find
$$\theta_L = \Bigl(\sum_{\beta \in O_k} q^{\beta \bar\beta}\Bigr)^{3} - 6q \prod_{n=1}^{\infty} (1 - q^n)^3 (1 - q^{7n})^3 \tag{2.17}$$
$$= 1 + 42q^2 + 56q^3 + 84q^4 + 168q^5 + 280q^6 + 336q^7 + 462q^8 + \cdots.$$
This confirms our values $N_2 = 42$ and $N_3 = 56$ and lets us easily calculate as many $N_n$ as we might reasonably desire.
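The expansion of $\theta_L$ can be regenerated with truncated power series. The sketch below assumes that $O_k = \mathbf{Z}[(1 + \sqrt{-7})/2]$ has norm form $x^2 + xy + 2y^2$ and that (2.17) reads as the cube of the theta series of $O_k$ minus $6q$ times the eta product; all function names are illustrative.

```python
N_TERMS = 9  # keep the coefficients of q^0 .. q^8

def mul_series(a, b):
    """Product of two power series truncated after q^(N_TERMS - 1)."""
    out = [0] * N_TERMS
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N_TERMS:
                out[i + j] += ai * bj
    return out

def theta_Ok():
    """Coefficients of the sum of q^(x^2 + x*y + 2*y^2) over integers x, y."""
    coeffs = [0] * N_TERMS
    bound = 2 * N_TERMS  # generous search box; the form is positive definite
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            n = x*x + x*y + 2*y*y
            if n < N_TERMS:
                coeffs[n] += 1
    return coeffs

def eta_product():
    """Coefficients of q * prod_{n>=1} (1 - q^n)^3 (1 - q^(7n))^3, truncated."""
    series = [1] + [0] * (N_TERMS - 1)
    for n in range(1, N_TERMS):
        for step in (n, 7 * n):
            factor = [0] * N_TERMS
            factor[0] = 1
            if step < N_TERMS:
                factor[step] = -1
            for _ in range(3):
                series = mul_series(series, factor)
    return [0] + series[:-1]   # multiply by q

theta = theta_Ok()
cube = mul_series(theta, mul_series(theta, theta))
theta_L = [c - 6 * e for c, e in zip(cube, eta_product())]
```

Raising `N_TERMS` produces further coefficients $N_n$ under the same assumptions.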

3. Arithmetic Geometry of X
3.1. Rational points on X. Faltings' theorem (née Mordell's conjecture) asserts that a curve of genus at least 2 over a number field has finitely many rational points. Unfortunately both of Faltings' proofs of this [1983; 1991] are ineffective, in that neither yields an algorithm for provably listing all the points; even for a specific curve of low genus over $\mathbf{Q}$ this problem can be very difficult. (See for instance [Poonen 1996].) Fortunately the special case of $X$ is much easier. One shows that the elliptic curve $E_k$ has rank zero, and its only rational points are the point at infinity and $(0, 0)$. Since $X$ admits a nonconstant map to $E_k$ defined over $\mathbf{Q}$, namely the quotient map $X \to X/\langle h \rangle \cong E_k$, the $\mathbf{Q}$-rational points of $X$ are just the rational preimages of the two points of $E_k(\mathbf{Q})$. We find that the only points of $X(\mathbf{Q})$ are the obvious ones at $(1 : 0 : 0)$, $(0 : 1 : 0)$, $(0 : 0 : 1)$. Equivalently, the only integer solutions of $X^3 Y + Y^3 Z + Z^3 X = 0$ are those in which at least two of the three variables vanish. This is all for the Klein model; one may likewise analyze the rational $S_3$ model for $X$, computing9 that its quotient by $\langle h \rangle$ is isomorphic with $E_k$, and that neither of the rational points of $E_k$ lies under a rational point of $X$. However, the fact that $X$ has no rational points in the rational $S_3$ model can be obtained much more simply, without any computation of quotient curves and analysis of elliptic curves over $\mathbf{Q}$: one need only observe that the polynomial (1.22) does not vanish mod 2 unless $X, Y, Z$ are all even.

9 This computation begins in the same way as our derivation of (2.10), but yields an equation $y^2 = x^4 - 10x^3 + 27x^2 - 10x - 27$ for the quotient curve; to bring this to Weierstrass form,
The proof that $E_k(\mathbf{Q})$ consists only of the point at infinity and $(0, 0)$ is an application of Fermat's method of descent. Suppose that $x \neq 0$ is a rational number such that $x(4x^2 + 21x + 28) = y^2$ for some $y \in \mathbf{Q}$. Necessarily $x > 0$, because $4x^2 + 21x + 28 > 0$ for all $x \in \mathbf{R}$. Write $x$ as a fraction $m/n$ in lowest terms. If $x$ works then so does $7/x$ (note that $(7/x, -7y/x^2)$ is the translate of $(x, y)$ by the 2-torsion point $(0, 0)$ in the group law of $E_k$). Replacing $x$ by $7/x$ if necessary, we may assume that the exponents of 7 in the factorizations of $m, n$ are both even. Then the integer $(n^2 y)^2 = mn(4m^2 + 21mn + 28n^2)$ is a perfect square, and its factors $m$, $n$, $4m^2 + 21mn + 28n^2$ are relatively prime in pairs
except possibly for common factors of 2 49r or 4 49r . Thus either all three are
squares, or one is a square and each of the other two is twice a square. We claim that the latter is impossible. Indeed, since $m, n$ cannot both be even, we would have either $(m, n) = (M^2, 2N^2)$ or $(m, n) = (2M^2, N^2)$. In the first case,
$$4m^2 + 21mn + 28n^2 = 2(2M^4 + 21M^2 N^2 + 56N^4). \tag{3.1}$$
But $M$ is odd (else $m, n$ are both even), so $2M^4 + 21M^2 N^2 + 56N^4$ is either 2 or 3 mod 4 according as $N$ is even or odd; in neither case can it be a perfect square. In the second case, $N$ is odd and
$$4m^2 + 21mn + 28n^2 = 2(8M^4 + 21M^2 N^2 + 14N^4). \tag{3.2}$$
Again the parenthesized factor is either 2 or 3 mod 4, this time depending on the parity of $M$, so it cannot be a square.

So we conclude that $m, n$ are both squares. Thus $x = x_1^2$ for some $x_1 \in \mathbf{Q}^*$, and $\sqrt{4x_1^4 + 21x_1^2 + 28} \in \mathbf{Q}$. We complete the square by writing $4x_1^4 + 21x_1^2 + 28$ as $\bigl(2x_1^2 + (21 - \xi)/4\bigr)^2$, finding
$$16 \xi x_1^2 = \xi^2 - 42\xi - 7. \tag{3.3}$$
Necessarily $\xi \neq 0$ because the right-hand side has irrational roots. Thus we obtain a point on the elliptic curve
$$E_k' : \eta^2 = \xi(\xi^2 - 42\xi - 7) \tag{3.4}$$
other than the origin and $(0, 0)$. We then mimic the argument in the previous paragraph to show that either $\xi$ or $-7/\xi$ must be a square. This time the possibility that must be excluded is that one of them is $-\xi_1^2$ for some $\xi_1 \in \mathbf{Q}$. Taking $\xi_1 = M/N$, we would then have a square of the form $7N^4 - 42M^2 N^2 - M^4$. But this is congruent to 3 mod 4 if either $M$ or $N$ is even, and to $-4$ mod 16 if they are both odd, so again we reach a contradiction. Thus $\xi = \xi_1^2$ and we find that $\xi_1^4 - 42\xi_1^2 - 7$ is a square, say $\bigl(\xi_1^2 - (8x_2 + 21)\bigr)^2$. This yields $x_2 \xi_1^2 = 4x_2^2 + 21x_2 + 28$; again the right-hand side has irrational roots, so we find $x_2 \in \mathbf{Q}^*$ such that $x_2(4x_2^2 + 21x_2 + 28) \in \mathbf{Q}^{*2}$, which is to say, a new point on $E_k$! Moreover, we can compute our original $x$ or $7/x$ as a rational function in $x_2$ of degree 4, which means that if the numerator and denominator of $x$ are at all large ($|M|, |N| > 100$ is more than enough) then those of $x_2$ are smaller. Iterating this descent process enough times, we eventually find a rational solution of $y^2 = 4x^3 + 21x^2 + 28x$ with nonzero $x = M/N$ such that $|M|, |N| \le 100$. But a direct search shows that there is no such $x$. This completes the proof that the only rational points on $E_k$ are the two torsion points already known.

9 (continued) complete the square as we do several times in the sequel, for instance when obtaining (3.4) or in the calculation starting with (3.9).
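The finite steps of the descent, namely the direct search over $|M|, |N| \le 100$ and the congruence obstructions, are easy to reproduce. The following sketch is one possible implementation with illustrative names; it tests whether $mn(4m^2 + 21mn + 28n^2)$ is a perfect square for each fraction $m/n$ in lowest terms.

```python
from math import gcd, isqrt

def is_square(n):
    return n >= 0 and isqrt(n)**2 == n

def small_points(bound=100):
    """Nonzero x = m/n in lowest terms, |m| <= bound and 0 < n <= bound,
    for which x(4x^2 + 21x + 28) is the square of a rational number."""
    found = []
    for n in range(1, bound + 1):
        for m in range(-bound, bound + 1):
            if m == 0 or gcd(m, n) != 1:
                continue
            # (n^2 y)^2 = m n (4 m^2 + 21 m n + 28 n^2)
            if is_square(m * n * (4*m*m + 21*m*n + 28*n*n)):
                found.append((m, n))
    return found

points = small_points()

# Residues behind the congruence arguments: (3.1) with M odd, (3.2) with N odd,
# and 7N^4 - 42M^2N^2 - M^4 with M, N both odd.
case_31 = {(2*M**4 + 21*M*M*N*N + 56*N**4) % 4
           for M in range(1, 16, 2) for N in range(16)}
case_32 = {(8*M**4 + 21*M*M*N*N + 14*N**4) % 4
           for N in range(1, 16, 2) for M in range(16)}
both_odd = {(7*N**4 - 42*M*M*N*N - M**4) % 16
            for M in range(1, 16, 2) for N in range(1, 16, 2)}
```

Squares are 0 or 1 mod 4 and 0, 1, 4 or 9 mod 16, so none of the residues found can belong to a perfect square.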
[In modern terminology, Fermat's method is descent via a 2-isogeny $E_k' \to E_k$ [Silverman 1986, pp. 301 ff.]. The method can be used on any elliptic curve with a rational 2-torsion point, and will often prove that the curve has only finitely many rational points. The reappearance of $E_k$ at the second step, which makes it possible to iterate the process until reaching a small point, is due to the existence of a dual isogeny $E_k \to E_k'$ also of degree 2. Composing these two isogenies yields the multiplication-by-2 map on $E_k$; thus we proved in effect that any rational point on $E_k$ is either divisible by 2 or of the form $2P + (0, 0)$ in $E_k(\mathbf{Q})$, and then used the fact that multiplication by 2 in $E_k(\mathbf{Q})$ quadruples the height to reduce the determination of $E_k(\mathbf{Q})$ to a finite search. In our case the 2-isogenous curve $E_k'$ has $j$-invariant $255^3$ and CM by $\mathbf{Z}[\sqrt{-7}]$; it is the elliptic curve numbered 49B in [Birch and Kuyk 1975] and 49-A2 in [Cremona 1992].]
It remains to find the preimages on X of the two rational points of $E_k$. We
saw already that the point at infinity comes from the unit vectors on X, and
that the 2-torsion point $(x, y) = (0, 0)$ is the image of an $\langle h \rangle$-orbit of points on X
whose coordinates are proportional to the three roots of $u^3 - 7u^2 + 49$. These
roots (and their ratios) are contained in $K_+$ but not in $\mathbf{Q}$. Thus the unit vectors
are the only rational points on X, as claimed.
3.2. Fermat's Last Theorem for exponent 7. The Fermat curve
$$\mathcal{F}_7 : A^7 + B^7 + C^7 = 0$$
admits a nonconstant map to X defined over $\mathbf{Q}$, namely
$$(A : B : C) \mapsto (A^3 C : B^3 A : C^3 B).$$
(The map is a cyclic unramified cover of degree 7, but we do not need this for
now.) Thus any rational point on $\mathcal{F}_7$ maps to a rational point on X. Having
just listed the rational points on X we can thus determine the rational points
on $\mathcal{F}_7$. It turns out that each point of $X(\mathbf{Q})$ lies under a unique point of $\mathcal{F}_7(\mathbf{Q})$.
This yields a proof of the case n = 7 of Fermat's Last Theorem, a proof that is
elementary in that it uses only tools available to Fermat (algebraic manipulation
and 2-descent on an elliptic curve with a rational 2-torsion point); in particular
it does not require arithmetic in cyclotomic number fields such as K. Indeed the
proof is analogous to Fermat's own proof of the case n = 4, in the sense that in
both cases one maps $\mathcal{F}_n$ to an elliptic curve and proves that the elliptic curve
has rank 0; it is arguably easier than Euler's proof of the case n = 3, for which
$\mathcal{F}_3$ is already an elliptic curve but the determination of $\mathcal{F}_3(\mathbf{Q})$ requires what we
now recognize as a 3-descent. As is the case for n = 4, the map from $\mathcal{F}_7$ to $E_k$
is a quotient map, here by a 21-element subgroup of $\mathrm{Aut}(\mathcal{F}_7)$ isomorphic with
$\langle g, h \rangle \subset G$.
Stripped of all algebro-geometric machinery, this elementary proof runs as
follows: Suppose there existed nonzero integers $a, b, c$ such that $a^7 + b^7 + c^7 = 0$.
Then
$$x := a^3 c, \quad y := b^3 a, \quad z := c^3 b \tag{3.5}$$
would be nonzero integers with
$$x^3 y + y^3 z + z^3 x = a^3 b^3 c^3 (a^7 + b^7 + c^7) = 0, \tag{3.6}$$
which we showed impossible in the previous section.
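The polynomial identity behind (3.6) is easy to spot-check on random integers. The sketch below is illustrative only; the names `klein` and `check_identity` are ours, and random evaluation is evidence for the identity rather than a proof (expanding the three monomials by hand gives the proof).

```python
import random


def klein(x: int, y: int, z: int) -> int:
    """The Klein quartic form x^3 y + y^3 z + z^3 x."""
    return x**3 * y + y**3 * z + z**3 * x


def check_identity(trials: int = 1000, seed: int = 0) -> bool:
    """Compare both sides of (3.6) after the substitution (3.5), in exact integers."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = (rng.randint(-50, 50) for _ in range(3))
        x, y, z = a**3 * c, b**3 * a, c**3 * b
        if klein(x, y, z) != a**3 * b**3 * c**3 * (a**7 + b**7 + c**7):
            return False
    return True


print(check_identity())
```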


Curiously there is yet another proof of the n = 7 case of Fermat along the
same lines, which was discovered in the mid-19th century [Genocchi 1864]$^{10}$ but
is practically unknown today. Here we use the quotient of $\mathcal{F}_7$ by the group $S_3$ of
coordinate permutations. This yields the following nice generalization of Fermat
for n = 7:
Theorem [Genocchi 1864]$^{11}$. Let $a, b, c$ be the solutions of a cubic $x^3 - px^2 + qx - r = 0$
with rational coefficients $p, q, r$. If $a^7 + b^7 + c^7 = 0$ then either $abc = 0$
or $a^3 = b^3 = c^3$.
That is, the only rational points on $\mathcal{F}_7/S_3$ are the orbits of $(1 : -1 : 0)$ and
$(1 : e^{2\pi i/3} : e^{-2\pi i/3})$. We compute equations for $\mathcal{F}_7/S_3$ by writing $a^7 + b^7 + c^7$
as a polynomial in the elementary symmetric functions
$$p = a + b + c, \quad q = ab + ac + bc, \quad r = abc \tag{3.7}$$
of $a, b, c$.
$^{10}$ From Dickson [1934, p. 746], footnote 85. Dickson further notes that Genocchi's method
may be viewed as a simplification of Lamé's, and that Genocchi does not carry out the descent
for proving that (3.9) has no finite rational points. We likewise leave the 2-descent on the
equivalent curve (3.12) to the reader, who may either do it by hand using the method described
in [Silverman 1986, pp. 301 ff.] or automatically with Cremona's mwrank program.
$^{11}$ In fact Genocchi states at the end of his paper that he had announced the results several
years earlier in Cimento di Torino, vol. VI, fasc. VIII, 1855.

Proof. We easily calculate
$$0 = a^7 + b^7 + c^7 = p^7 - 7p^5 q + 7p^4 r + 14p^3 q^2 - 21p^2 qr - 7pq^3 + 7pr^2 + 7q^2 r \tag{3.8}$$
(for instance by using the fact that the power moments $\pi_n = a^n + b^n + c^n$ satisfy
the recursion $\pi_{n+3} - p\pi_{n+2} + q\pi_{n+1} - r\pi_n = 0$ and starting from $\pi_0 = 3$, $\pi_1 = p$,
$\pi_2 = p^2 - 2q$ to reach the formula (3.8) for $\pi_7$). Now if $p = 0$ then (3.8) reduces
to $\pi_7 = 7q^2 r$, so if $\pi_7 = 0$ then either $r = 0$ or $q = 0$, which yields $abc = 0$
or $a^3 = b^3 = c^3$ respectively. If on the other hand $p \neq 0$ then we may assume
$p = 1$ by replacing $a, b, c$ by $a/p, b/p, c/p$. We then find that (3.8) is a quadratic
polynomial in $r$ of discriminant $49q^4 - 98q^3 + 147q^2 - 98q + 21$. We note that
the resulting elliptic curve
$$u^2 = 49q^4 - 98q^3 + 147q^2 - 98q + 21 \tag{3.9}$$
has rational points at infinity, and use them to obtain a Weierstrass form for the
curve by the usual device of completing the square: let
$$u = 7(q^2 - q + 1 - 2t) \tag{3.10}$$
in (3.9) to find
$$7t(q^2 - q) = 7t^2 - 7t + 1, \tag{3.11}$$
a quadratic in $q$ with discriminant $196t^3 - 147t^2 + 28t$. Thus $196t^3 - 147t^2 + 28t$
must be a square. Taking $t = -x/7$, then, we obtain the elliptic curve
$$-7y^2 = 4x^3 + 21x^2 + 28x, \tag{3.12}$$
which we recognize as the $(-7)$-twist of $E_k$. Since that curve has CM by $\mathbf{Z}[\alpha]$,
this new curve (3.12) is also $\sqrt{-7}$-isogenous with $E_k$, and thus has rank zero. (This
curve appears as 49C in [Birch and Kuyk 1975] and 49-A3 in [Cremona 1992].)
In fact we can apply a 2-descent directly to (3.12) using the 2-torsion point (0, 0),
and then find that this point is the only rational point of (3.12) other than the
point at infinity. But if $x = 0$ then $t = -x/7 = 0$ and (3.11) becomes $0 = 1$, which
is impossible (indeed the points $x = 0, \infty$ on (3.12) come from the solutions
$p = r = 0$ and $p = q = 0$ of (3.8); which solution is which depends on the
choice of square root $u$ implicit in (3.9)). Thus indeed $p = 0$ in any rational
solution of (3.8), which completes the proof of Genocchi's theorem.
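The power-sum computation in the proof can be replayed mechanically. The following sketch (function names are ours, and the random sampling is only a sanity check) runs the three-term recursion for the power moments from the stated initial values, compares the result with the closed form (3.8), and also confirms the discriminant in (3.9) after setting $p = 1$.

```python
import random


def power_sum_7(p: int, q: int, r: int) -> int:
    """pi_7 from pi_{n+3} = p*pi_{n+2} - q*pi_{n+1} + r*pi_n, pi_0 = 3, pi_1 = p, pi_2 = p^2 - 2q."""
    s = [3, p, p * p - 2 * q]
    for _ in range(5):                      # fills pi_3 .. pi_7
        s.append(p * s[-1] - q * s[-2] + r * s[-3])
    return s[7]


def formula_38(p: int, q: int, r: int) -> int:
    """Right-hand side of (3.8)."""
    return (p**7 - 7 * p**5 * q + 7 * p**4 * r + 14 * p**3 * q**2
            - 21 * p**2 * q * r - 7 * p * q**3 + 7 * p * r**2 + 7 * q**2 * r)


def check(trials: int = 500, seed: int = 1) -> bool:
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = (rng.randint(-20, 20) for _ in range(3))
        p, q, r = a + b + c, a * b + a * c + b * c, a * b * c
        if not (a**7 + b**7 + c**7 == power_sum_7(p, q, r) == formula_38(p, q, r)):
            return False
        # With p = 1, (3.8) reads 7 r^2 + B r + C; compare its discriminant with (3.9).
        B = 7 * q * q - 21 * q + 7
        C = 1 - 7 * q + 14 * q * q - 7 * q**3
        if formula_38(1, q, r) != 7 * r * r + B * r + C:
            return False
        if B * B - 28 * C != 49 * q**4 - 98 * q**3 + 147 * q**2 - 98 * q + 21:
            return False
    return True


print(check())
```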

[Along these lines we note that Gross and Rohrlich [1978] have shown that the
orbits of $(1 : -1 : 0)$ and $(1 : e^{2\pi i/3} : e^{-2\pi i/3})$ also contain the only points of $\mathcal{F}_7$
rational over any number field of degree at most 3.]
3.3. Reduction of X modulo 2,3,7. For each of the primes p = 2, 3, 7 dividing #G, the reduction of X mod p enjoys some remarkable extremal properties:
maximal or minimal numbers of points over finite fields in each case, and maximal group of automorphisms for p = 3. We consider these three primes in
turn.
Characteristic 2. Since we want all the automorphisms in G to be defined
over $\mathbf{F}_2$, we use the $S_4$ or rational $S_3$ model for X. Then the Jacobian of X is
$\mathbf{F}_2$-isogenous to the cube of an elliptic curve with CM by $\mathbf{Z}[\alpha]$ and trace 1. It
follows that the characteristic polynomial of Frobenius for $X/\mathbf{F}_2$ is $(T^2 - T + 2)^3$,
with triple roots $-\alpha$, $-\bar\alpha$. Thus for each $m \ge 1$ our curve has
$$2^m + 1 - 3\bigl((-\alpha)^m + (-\bar\alpha)^m\bigr) \tag{3.13}$$
rational points over $\mathbf{F}_{2^m}$. We tabulate this for the first few m:

    m                       1    2    3    4    5    6    7    8   ...
    #(X(F_{2^m}))           0   14   24   14    0   38  168  350   ...        (3.14)
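The entries of (3.14) follow from (3.13) by a two-term integer recursion, since $-\alpha$ and its conjugate are the roots of $T^2 - T + 2$. A minimal sketch, with hypothetical helper names `trace_powers` and `point_counts`:

```python
def trace_powers(max_m: int) -> list:
    """t_m = (-alpha)^m + conj(-alpha)^m, using t_0 = 2, t_1 = 1, t_m = t_{m-1} - 2*t_{m-2}."""
    t = [2, 1]
    for _ in range(2, max_m + 1):
        t.append(t[-1] - 2 * t[-2])
    return t


def point_counts(max_m: int) -> dict:
    """Number of points over F_{2^m} according to (3.13), for m = 1 .. max_m."""
    t = trace_powers(max_m)
    return {m: 2**m + 1 - 3 * t[m] for m in range(1, max_m + 1)}


counts = point_counts(13)
print([counts[m] for m in range(1, 9)])
print(counts[13])
```

The same recursion gives the value at $m = 13$ quoted later in this section.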

We noted already that the reduction mod 2 of the rational $S_3$ model for X
has no $\mathbf{F}_2$-rational points. That it has no $\mathbf{F}_{32}$-rational points is rather more
remarkable. By the Weil estimates, a curve of genus $w$ over $\mathbf{F}_q$ has at least
$q - 2wq^{1/2} + 1$ rational points; if $w > 1$, this lower bound may be negative,
but only for $q \le 4w^2 - 3$. In our case $w = 3$, this bound on $q$ is 33, which
is not a prime power, so $\mathbf{F}_{32}$ is the largest finite field over which a curve of
genus 3 may fail to have any rational point. [For $w = 2$, the bound $4w^2 - 3$ is
the prime 13, but Stark showed ([1973]; see in particular pages 287-288) that
there is no pointless curve of genus 2 over $\mathbf{F}_{13}$; an explicit such curve over $\mathbf{F}_{11}$
is $y^2 = -(x^2 + 1)(x^4 + 5x^2 + 1)$.]
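Both claims in this paragraph can be checked by brute force. The sketch below is a hedged illustration with our own function names: it counts $\mathbf{F}_{11}$-points on $y^2 = -(x^2+1)(x^4+5x^2+1)$, treating the two points at infinity of a degree-6 model as rational exactly when the leading coefficient is a nonzero square, and it finds the largest integer $q$ for which the Weil lower bound is negative.

```python
def count_points_genus2(p: int = 11) -> int:
    """Points over F_p on y^2 = -(x^2 + 1)(x^4 + 5x^2 + 1), including those at infinity."""
    squares = {(t * t) % p for t in range(1, p)}      # nonzero squares mod p
    total = 0
    for x in range(p):
        f = (-(x * x + 1) * (x**4 + 5 * x * x + 1)) % p
        if f == 0:
            total += 1
        elif f in squares:
            total += 2
    # Degree-6 model: two rational points at infinity iff the leading
    # coefficient -1 is a nonzero square mod p, otherwise none.
    if (-1) % p in squares:
        total += 2
    return total


def largest_q_with_negative_bound(w: int):
    """Largest integer q with q + 1 - 2*w*sqrt(q) < 0, tested exactly as (q+1)^2 < 4*w^2*q."""
    best = None
    for q in range(1, 8 * w * w):
        if (q + 1) ** 2 < 4 * w * w * q:
            best = q
    return best


print(count_points_genus2())
print(largest_q_with_negative_bound(3), largest_q_with_negative_bound(2))
```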
The 14 points of our curve over $\mathbf{F}_4$ are all the points of $\mathbf{P}^2(\mathbf{F}_4) - \mathbf{P}^2(\mathbf{F}_2)$. It is
known that this is the maximal number of points of a genus-3 curve over $\mathbf{F}_4$ [Serre
1983a; 1983b; 1984]. Note that the only $\mathbf{F}_{16}$-points are those already defined
over $\mathbf{F}_4$; indeed one can use the Riemann hypothesis (which is a theorem of
Weil for curves over finite fields) to show as in [Serre 1983b] that a genus-3 curve
over $\mathbf{F}_4$ with more than 14 points would have fewer $\mathbf{F}_{16}$ points than $\mathbf{F}_4$ points,
and thus prove that 14 is the maximum. The 24 points over $\mathbf{F}_8$ likewise attain
the maximum for a genus-3 curve over that field [Serre 1983a; 1984]. Note that
the only $\mathbf{F}_{64}$-points are those already rational over a subfield $\mathbf{F}_4$ or $\mathbf{F}_8$.
Upon reading a draft of this paper, Serre noted that in fact for m = 2, 3, 5 the
curve X is the unique curve of genus 3 over $\mathbf{F}_{2^m}$ with the maximal (m = 2, 3)
or minimal (m = 5) number of rational points. He shows this as follows. Let
$C/\mathbf{F}_{2^m}$ be any curve with the same number of points as X. First Serre proves
that C has the same eigenvalues of Frobenius as X. For m = 3, 5 this follows
from the fact that C attains equality in the refined Weil bound
$$\bigl|\#C(\mathbf{F}_q) - (q + 1)\bigr| \le g \lfloor 2q^{1/2} \rfloor \tag{3.15}$$
(see [Serre 1983b, Theorem 1]). For m = 2 we instead use the fact that
$\#C(\mathbf{F}_{16}) \ge \#C(\mathbf{F}_4) = 14$. Serre then notes that in each of the three cases
m = 2, 3, 5 we have $(-\alpha)^m = (x \pm \sqrt{-7})/2$ for some $x \in \mathbf{Z}$, from which it follows
that $\mathbf{Z}[(-\alpha)^m]$ is the full ring of integers in k. Thus the Jacobian of C is isomorphic as a principally polarized abelian threefold with $E_k \otimes M$, where M is some indecomposable positive-definite unimodular Hermitian $O_k$-lattice of rank 3. But by
Hoffmann's result [1991, Theorem 6.1] cited above, L is the unique such lattice.
Thus C has the same Jacobian as X, from which $C \cong X$ follows by Torelli.

Since k has unique factorization, the condition $(-\alpha)^m = (x \pm \sqrt{-7})/2$ is equivalent to the Diophantine equation
$$x^2 + 7 = 2^n \tag{3.16}$$
with n = m + 2. (This equation also arises in [Serre 1983a], in connection
with curves of genus 2 over $\mathbf{F}_{2^m}$ with many points, and even in coding theory
[MacWilliams and Sloane 1977, p. 184], because it is equivalent to the condition
that the volume of the Hamming sphere of radius 2 in $(\mathbf{Z}/2)^{(x-1)/2}$ be a power
of 2.) Ramanujan observed$^{12}$ that, in addition to the cases n = 3, 4, 5, 7 already
encountered, this equation has a pair of solutions $(n, x) = (15, \pm 181)$. We find
that $(-\alpha)^{13}$ has negative real part, and conclude from Serre's argument that X
is the unique curve of genus 3 over $\mathbf{F}_{2^{13}}$ with the maximal number of rational
points, namely $8736 = 2^5 \cdot 3 \cdot 7 \cdot 13$. Nagell [1960] was apparently the first to show
that the Diophantine equation (3.16) has no further integer solutions.
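A finite search illustrates (3.16); it cannot of course replace Nagell's proof. In the sketch below the search limit `max_n = 200` and the function name are arbitrary choices of ours.

```python
from math import isqrt


def ramanujan_nagell(max_n: int = 200) -> list:
    """All (n, x) with x > 0, 3 <= n <= max_n and x^2 + 7 = 2^n."""
    solutions = []
    for n in range(3, max_n + 1):
        value = 2**n - 7
        x = isqrt(value)
        if x * x == value:
            solutions.append((n, x))
    return solutions


print(ramanujan_nagell())
```

With n = m + 2, the exponents found correspond to m = 1, 2, 3, 5, 13.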
The 24 points over $\mathbf{F}_8$ are, as could be expected, the reduction mod 2 of the
24-point orbit of Weierstrass points of X in characteristic zero. The $\mathbf{F}_4$ points
require some more comment: since G acts on X by automorphisms defined over
the prime field, it permutes these 14 points, whereas in characteristic zero there
was no orbit as small as 14 in the action of G on X, or even on $\mathbf{P}^2$. But in
characteristic 2 the 24-element subgroups of $G \cong SL_3(\mathbf{F}_2)$ arise naturally as
stabilizers of points and lines in $\mathbf{P}^2(\mathbf{F}_2)$. The stabilizer of a line $\mathbf{P}^1(\mathbf{F}_2) \subset \mathbf{P}^2(\mathbf{F}_2)$
permutes the two points of the line rational over $\mathbf{F}_4$ but not $\mathbf{F}_2$; the subgroup
fixing each of those points thus has index 2 in the line stabilizer. Moreover each
point of $\mathbf{P}^2(\mathbf{F}_4) - \mathbf{P}^2(\mathbf{F}_2)$ lies on a unique $\mathbf{F}_2$-rational line. Thus the stabilizer
of each of these points is a subgroup $A_4 \subset G$. Such a subgroup contains three
involutions, each now having two instead of four fixed points on X, and four
3-Sylows. Thus the 14 points of $X(\mathbf{F}_4)$ are the reductions mod 2 of both the
56-point and the 84-point G-orbits. All points of X not defined over $\mathbf{F}_4$ or $\mathbf{F}_8$
have trivial stabilizer in G; such points first occur over $\mathbf{F}_{2^7}$, where the 168 points
of $X(\mathbf{F}_{2^7})$ constitute a single G-orbit. The image of this orbit, together with
those of $X(\mathbf{F}_4)$ and $X(\mathbf{F}_8)$, account for the three $\mathbf{F}_2$-points of $X/G \cong \mathbf{P}^1$. The
350 - 14 = 336 points in $X(\mathbf{F}_{2^8}) - X(\mathbf{F}_{2^2})$ are likewise the preimages of the two
points of X/G defined over $\mathbf{F}_4$ but not $\mathbf{F}_2$.
$^{12}$ On page 120 of the Journal of the Indian Mathematical Society, Volume 5 #3 (6/1913),
we find under "Questions for Solution":
464. (S. Ramanujan): $(2^n - 7)$ is a perfect square for the values 3, 4, 5, 7, 15 of n.
Find other values.
No other values were found, but it does not seem that a proof that none exist ever appeared
in the Journal.

We conclude the description of X in characteristic 2 with an amusing observation of Seidel concerning the $\mathbf{F}_8$-rational points of X, reported by R. Pellikaan at
a 1997 conference talk. Since $\mathbf{F}_8$ is the residue field of the primes above 2 of K,
the reductions mod 2 of the Klein and $S_4$ models of X become isomorphic over
that field; we use the Klein model. Consider the 24 - 3 = 21 points of $X(\mathbf{F}_8)$
other than the three unit vectors. These may be identified with the solutions in
$\mathbf{F}_8^*$ of the affine equation $x^3 y + y^3 + x = 0$ (with $x = X/Z$, $y = Y/Z$) for the
Klein model of X. We choose $(\alpha)$ for our prime above 2, so $\zeta$ reduces to a root of
$\zeta^3 + \zeta + 1$ in $\mathbf{F}_8$. The 21 solutions $(x, y)$ are then entered in the following table:

[Table (3.17): the entries of this table did not survive extraction.]        (3.17)

(note that we have listed the x- and y-coordinates in different orders so as to make
the $\langle g \rangle$ symmetry visible). Seidel's observation is that (3.17) is the adjacency
matrix for the finite projective plane of order 2! The explanation is that for
$x, y \in \mathbf{F}_8^*$,
$$x^3 y + y^3 + x = 0 \iff x^4 y^2 + xy^4 + x^2 y = 0 \iff \mathrm{Tr}_{\mathbf{F}_8/\mathbf{F}_2}\, x^2 y = 0. \tag{3.18}$$
Now $(x, y) \mapsto \mathrm{Tr}_{\mathbf{F}_8/\mathbf{F}_2}\, x^2 y$ is a nondegenerate pairing from $\mathbf{F}_8 \times \mathbf{F}_8$ to $\mathbf{F}_2$, so
if we regard $x \in \mathbf{F}_8$ as an element of a 3-dimensional vector space over $\mathbf{F}_2$ then $y$
is a functional on that vector space and (3.18) is the condition that a nonzero
functional annihilate a nonzero vector. Thus if we regard $x, y \in \mathbf{F}_8^*$ as 1- and
2-dimensional subspaces of $\mathbf{F}_2^3$ then $x^3 y + y^3 + x = 0$ if and only if the x-line is
contained in the y-plane, which is precisely the incidence relation on the points
and lines of the finite projective plane $\mathbf{P}^2(\mathbf{F}_2)$.
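Seidel's observation can be confirmed by enumeration. The sketch below is an illustration under our own encoding: elements of $\mathbf{F}_8 = \mathbf{F}_2[t]/(t^3 + t + 1)$ are stored as 3-bit integers, and the function names are hypothetical. It lists the solutions of $x^3y + y^3 + x = 0$ with $x, y \ne 0$, checks the equivalence (3.18), and checks the incidence structure of a projective plane of order 2.

```python
def gf8_mul(a: int, b: int) -> int:
    """Multiply in F_8 = F_2[t]/(t^3 + t + 1); bit i of an element is the coefficient of t^i."""
    prod = 0
    for i in range(3):
        if (b >> i) & 1:
            prod ^= a << i
    for deg in (4, 3):                       # reduce t^4 and t^3 via t^3 = t + 1
        if (prod >> deg) & 1:
            prod ^= 0b1011 << (deg - 3)
    return prod


def gf8_pow(a: int, e: int) -> int:
    result = 1
    for _ in range(e):
        result = gf8_mul(result, a)
    return result


def trace(a: int) -> int:
    """Tr_{F_8/F_2}(a) = a + a^2 + a^4, an element of {0, 1}."""
    return a ^ gf8_pow(a, 2) ^ gf8_pow(a, 4)


def klein_affine(x: int, y: int) -> int:
    """x^3 y + y^3 + x in F_8 (addition is XOR)."""
    return gf8_mul(gf8_pow(x, 3), y) ^ gf8_pow(y, 3) ^ x


nonzero = range(1, 8)
solutions = [(x, y) for x in nonzero for y in nonzero if klein_affine(x, y) == 0]
print(len(solutions))
```

Here each nonzero $y$ plays the role of a line and each nonzero $x$ the role of a point.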
Characteristic 3. Since 3 is inert in k, the smallest field over which all the
automorphisms in G might be defined is $\mathbf{F}_9$. Again we make sure that they are
in fact defined over that field by using the $S_4$ or rational $S_3$ model for X. That 3
does not split in k also makes the elliptic curve $E_k$, with CM by $O_k$, supersingular
in characteristic 3; we find that its characteristic polynomial of Frobenius over
$\mathbf{F}_9$ is $(T + 3)^2$, and hence that X has $9^m - 6(-3)^m + 1$ rational points over $\mathbf{F}_{9^m}$.
Thus, depending on whether m is odd or even, X has the maximal or minimal
number of rational points for a curve of genus 3 over $\mathbf{F}_{9^m}$. Moreover, the curve
has 28 points over both $\mathbf{F}_9$ and $\mathbf{F}_{81}$, and thus maximizes the genus $w$ of a curve
$C/\mathbf{F}_9$ that can attain the Weil upper bound $9 + 6w + 1$ on $\#C(\mathbf{F}_9)$.

In fact this turns out to be a special case of a known construction of curves
attaining the Weil bound over $\mathbf{F}_{q^2}$. Note that $\Phi_4$, as given by either (1.11) or
(1.22), reduces mod 3 to $X'^4 + Y'^4 + Z'^4$ or $A^4 + B^4 + C^4$. That is, the Klein and
Fermat quartics are isomorphic in characteristic 3. Now for each prime power q,
the equation $x^{q+1} + y^{q+1} + z^{q+1} = 0$ defining the Fermat curve $\mathcal{F}_{q+1}$ can be
written as
$$x^q x + y^q y + z^q z = 0. \tag{3.19}$$
For $a \in \mathbf{F}_{q^2}$ we note that $a^q a$ is just the norm of $a$ from $\mathbf{F}_{q^2}$ to $\mathbf{F}_q$. This lets us
easily count the solutions of (3.19) in $\mathbf{F}_{q^2}$, and we calculate that $\mathcal{F}_{q+1}$ has $q^3 + 1$
rational points over $\mathbf{F}_{q^2}$. Since this curve has genus $(q^2 - q)/2$, it thus attains
the Weil bound. Therefore its characteristic polynomial of Frobenius over $\mathbf{F}_{q^2}$ is
$(T + q)^{q^2 - q}$, so the number of $\mathbf{F}_{q^4}$-rational points of $\mathcal{F}_{q+1}$ is
$$q^4 - (q^2 - q)q^2 + 1 = q^3 + 1 \tag{3.20}$$
again. If there were a curve $C/\mathbf{F}_{q^2}$ of genus $w > (q^2 - q)/2$ attaining the Weil
bound, its number $q^2 + 2qw + 1$ of $\mathbf{F}_{q^2}$-rational points would exceed the number
$q^4 - 2q^2 w + 1$ of points rational over $\mathbf{F}_{q^4}$; thus again $\mathcal{F}_{q+1}$ is the curve of largest
genus attaining the Weil bound over $\mathbf{F}_{q^2}$. These properties of $\mathcal{F}_{q+1}$ over $\mathbf{F}_{q^2}$ are
well-known, see for instance [Serre 1983a; 1984].
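For q = 3 the point count $q^3 + 1 = 28$ can be reproduced by listing the projective points of the Fermat quartic over $\mathbf{F}_9$. The sketch below models $\mathbf{F}_9$ as $\mathbf{F}_3[i]$ with $i^2 = -1$; this representation and the function names are choices made here for illustration.

```python
F9 = [(a, b) for a in range(3) for b in range(3)]   # a + b*i with i^2 = -1
ZERO, ONE = (0, 0), (1, 0)


def add(u, v):
    return ((u[0] + v[0]) % 3, (u[1] + v[1]) % 3)


def mul(u, v):
    return ((u[0] * v[0] - u[1] * v[1]) % 3, (u[0] * v[1] + u[1] * v[0]) % 3)


def fourth_power(u):
    square = mul(u, u)
    return mul(square, square)


def fermat_quartic_points():
    """Projective F_9-points of x^4 + y^4 + z^4 = 0, one representative per point."""
    representatives = (
        [(ONE, y, z) for y in F9 for z in F9]
        + [(ZERO, ONE, z) for z in F9]
        + [(ZERO, ZERO, ONE)]
    )
    return [
        p for p in representatives
        if add(add(fourth_power(p[0]), fourth_power(p[1])), fourth_power(p[2])) == ZERO
    ]


print(len(fermat_quartic_points()))
```

The count agrees with the Weil upper bound $9 + 6 \cdot 3 + 1$ for genus 3 over $\mathbf{F}_9$.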
Since $X \cong \mathcal{F}_4$ in characteristic 3, its group of automorphisms over $\mathbf{F}_9$ must
accommodate both G and the 96-element group of automorphisms of $\mathcal{F}_4$ in characteristic zero. In fact $\mathrm{Aut}_{\mathbf{F}_9}(X)$ is the considerably larger group $U_3(3)$ of order
6048, consisting of the unitary $3 \times 3$ matrices over $\mathbf{F}_9$; it is the largest automorphism group of any genus-3 curve over an arbitrary field. Again this is a
special case of the remarkable behavior of the "Hermitian curve" $\mathcal{F}_{q+1}/\mathbf{F}_{q^2}$: by
regarding $x^q x + y^q y + z^q z$ as a ternary Hermitian form over $\mathbf{F}_{q^2}$ we see that any
linear transformation of $x, y, z$ which preserves this form up to scalar multiples
also preserves the zero-locus (3.19); since of those transformations only multiples
of the identity act trivially on $\mathcal{F}_{q+1}$, we conclude that the group $PGU_3(q)$ acts
on $\mathcal{F}_{q+1}$ over $\mathbf{F}_{q^2}$. Once $q > 2$, this is the full $\bar{\mathbf{F}}_q$-automorphism group of $\mathcal{F}_{q+1}$,
and is the only example of a group of order $> 16w^4$ acting on a curve of genus
$w > 1$ (here $w = (q^2 - q)/2$ and the group has order $q^3 (q^2 - 1)(q^3 + 1)$) over an
arbitrary field [Stichtenoth 1973].
Returning to the special case of X, we note that the stabilizer in G of each of
its 28 $\mathbf{F}_9$-rational points is a subgroup $N(H_3) \cong S_3$. Thus the two fixed points
on X of $H_3$ collapse mod 3 to a single point; for each of the three involutions
in $S_3$, this point is also the reduction of one of its four fixed points. Thus the
56- and 84-point G-orbits reduce mod 3 to the same 28-point orbit. The 24-point orbit is undisturbed, and is first seen in $X(\mathbf{F}_{9^3})$; all other points of X in
characteristic 3 have trivial stabilizer.
Since $E_k$ is supersingular in characteristic 3, its ring of $\bar{\mathbf{F}}_3$-endomorphisms
has rank 4 instead of 2; thus the Mordell-Weil lattice of $\bar{\mathbf{F}}_3$-maps from X to $E_k$
now has rank 12 instead of 6. Gross [1990, p. 957] used the action of $U_3(3)$
on this lattice to identify it with the Coxeter-Todd lattice. This lattice has
$756 = 63 \cdot 12$ minimal vectors of norm 4, which as before come from involutions
of the curve; the count is higher than in characteristic zero because there are
63 involutions in $U_3(3) = \mathrm{Aut}_{\bar{\mathbf{F}}_3} X$ and 12 automorphisms of $E_k$, rather than
21 and 2 respectively. To see the new automorphisms, reduce (2.10) mod 3 to
obtain $y^2 = x^3 + x$, with automorphisms generated by $(x, y) \mapsto (x + i, y)$ and
$(x, y) \mapsto (-x, iy)$ with $i^2 = -1$.
Adler [1997] found that the modular curve X(11), with automorphism group
$PSL_2(\mathbf{F}_{11})$ in characteristic zero, has the larger automorphism group $M_{11}$ when
reduced mod 3. Once we identify X with the modular curve X(7) in the next
section we'll be able to regard its extra automorphisms mod 3 as a similar phenomenon. This quartic in characteristic 3 has another notable feature: each of
its points is an inflection point! See [Hartshorne 1977, p. 305, Ex. 2.4], where
the curve$^{13}$ is described as "funny" for this reason. (The 28 points of $X(\mathbf{F}_9)$
are distinguished in that their tangents meet X with multiplicity 4 instead of 3;
these fourfold tangents are the reductions mod 3 of the bitangents of X in characteristic zero.) Again Adler found in [1997] that X(11), naturally embedded
in the 5-dimensional representation of $PSL_2(\mathbf{F}_{11})$, is also "funny" in this sense
when reduced mod 3. While it is not reasonable to expect the extra automorphisms of X(7) and X(11) in characteristic 3 to generalize to higher modular
curves X(N) (the Mathieu group $M_{11}$, being sporadic, can hardly generalize),
one might ask whether further modular curves are "funny" mod 3 or in other
small characteristics.
Characteristic 7. The curve X even has good reduction in characteristic 7
over a large enough extension of $\mathbf{Q}$; that is, X has potentially good reduction
mod 7. We can see this from our realization of X as a cyclic triple cover of $E_k$.
The elliptic curve $E_k$ has potentially good reduction mod 7, because the change
of variable $x = \sqrt{-7}\, x_1$ puts its Weierstrass equation (2.10) in the form
$$(\sqrt{-7})^{-3} y^2 = 4x_1^3 - 3\sqrt{-7}\, x_1^2 - 4x_1, \tag{3.21}$$
and over a number field containing $(-7)^{1/4}$ the further change of variable $y = 2(-7)^{3/4} y_1$
makes (3.21) reduce to $y_1^2 = x_1^3 - x_1$ at a prime above 7. [In general
any CM elliptic curve has potentially good reduction at all primes; equivalently,
the j-invariant of any CM curve is an algebraic integer.] Since the x-coordinates
of the two branch points of the cover $X \to E_k$ are the roots of (2.11), their $x_1$-coordinates
are the roots of $\sqrt{-7}\, x_1^2 + x_1 = \sqrt{-7}$, one of which has negative
7-valuation while the other's 7-valuation is positive. Thus these points reduce to
distinct points on $y_1^2 = x_1^3 - x_1$, namely the point at $\infty$ and the 2-torsion point
(0, 0), and the cover $X \to E_k$ branched at those points has good reduction mod 7
as well.
$^{13}$ In its Klein model, but for once the distinction is irrelevant.

On the other hand, the homogeneous quartic defining X cannot have good
reduction mod 7, even potentially: we have seen that even in the rational $S_3$
model the quartic invariant $\Phi_4$ factors mod 7 as $\Phi_2^2$. How can a plane quartic
curve have good reduction if its defining equation becomes so degenerate?
This apparent paradox is resolved only by realizing that the moduli space
of curves of genus 3 contains not only plane quartics but also hyperelliptic
curves. While a non-hyperelliptic curve of genus 3 is embedded as a quartic in
$\mathbf{P}^2$ canonically$^{14}$, the canonical map to $\mathbf{P}^2$ from a hyperelliptic curve of genus 3
is a double cover of a conic $C : Q_2 = 0$. Moreover, the moduli space of curves
of genus 3 is connected, so a hyperelliptic curve S of genus 3 may be contained
in a one-parameter family of curves of the same genus most of which are not
hyperelliptic. In that case, the neighbors of S in the family are plane quartics
$Q_4 = 0$ that approach the double conic $Q_2^2 = 0$ coming from S; if we write $Q_4$ as
$Q_2^2 + \varepsilon Q_4' + O(\varepsilon^2)$ in a neighborhood of S then the branch points of the double
cover $S \to C$ are the $2 \cdot 4 = 8$ zeros of $Q_4'$ on C.$^{15}$ This means that a smooth plane
quartic curve $Q_4 = 0$ may reduce to a hyperelliptic curve of genus 3 modulo a
prime at which $Q_4 \equiv Q_2^2$. This is in fact the case for our curve X, with $Q_4 = \Phi_4$
and $Q_2 = \Phi_2$: Serre found [Mazur 1986, p. 238, footnote] that, over an extension
of k sufficiently ramified above $(\sqrt{-7})$, the Klein quartic reduces to
$$v^2 = u^7 - u \tag{3.22}$$
at that prime, where $u$ is a degree-1 function on the conic $\Phi_2 = 0$ in $\mathbf{P}^2$ that
identifies this conic with $\mathbf{P}^1$. This reduced curve (3.22) inherits the action of G:
a group element $\pm\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in PSL_2(\mathbf{F}_7)$ acts on (3.22) by
$$\pm\begin{pmatrix} a & b \\ c & d \end{pmatrix} : (u, v) \mapsto \Bigl( \frac{au + b}{cu + d}, \frac{v}{(cu + d)^4} \Bigr). \tag{3.23}$$
$^{14}$ Here "canonically" means via curves' holomorphic differentials, which are sections of the
canonical divisor; see for instance [Hartshorne 1977, p. 341].
$^{15}$ Thanks to Joe Harris for explaining this point; it should be well-known, but is not easy
to find in the literature. Armand Brumer points out that this picture is explained in [Clemens
1980, pp. 155-157].

As in the case of characteristic 3, the group of automorphisms of the reduced
curve properly contains G; here it is the direct product of G with the two-element group $(u, v) \mapsto (u, \pm v)$ generated by the hyperelliptic involution. Also
as in characteristic 3, this reduced curve attains the upper or lower Weil bound
on the number of points of a genus-3 curve over finite fields of even degree over
the prime field. This is because the prime 7 is not split in k, so the reduction
of $E_k$ to an elliptic curve in characteristic 7 is supersingular. The supersingularity
could also be seen directly from its Weierstrass model $y_1^2 = x_1^3 - x_1$; that the
eigenvalues of Frobenius for $v^2 = u^7 - u$ over $\mathbf{F}_{49}$ all equal $-7$ could also be seen
by counting points: since $(u^7 - u)/\sqrt{-1} \in \mathbf{F}_7$ for all $u \in \mathbf{F}_{49}$, and $\sqrt{-1}$ is a square
in $\mathbf{F}_{49}$, the preimages of each $u \in \mathbf{P}^1(\mathbf{F}_{49}) - \mathbf{P}^1(\mathbf{F}_7)$ are $\mathbf{F}_{49}$-rational, and these
$2 \cdot 42 = 84$ points together with the 8 Weierstrass points $u \in \mathbf{P}^1(\mathbf{F}_7)$ add up to
92, which attains the Weil bound $49 + 6 \cdot 7 + 1$.
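The count of 92 points can be reproduced directly. In this sketch $\mathbf{F}_{49}$ is modelled as $\mathbf{F}_7[i]$ with $i^2 = -1$ (a representation chosen here, valid because $-1$ is not a square mod 7), and the single point at infinity of the odd-degree model $v^2 = u^7 - u$ is added by hand.

```python
P = 7
F49 = [(a, b) for a in range(P) for b in range(P)]   # a + b*i with i^2 = -1


def mul(u, v):
    return ((u[0] * v[0] - u[1] * v[1]) % P, (u[0] * v[1] + u[1] * v[0]) % P)


def sub(u, v):
    return ((u[0] - v[0]) % P, (u[1] - v[1]) % P)


def power(u, e):
    result = (1, 0)
    for _ in range(e):
        result = mul(result, u)
    return result


def count_points() -> int:
    """Number of F_49-points on v^2 = u^7 - u, including the point at infinity."""
    roots_of = {}
    for v in F49:                       # tabulate how many v have a given square
        s = mul(v, v)
        roots_of[s] = roots_of.get(s, 0) + 1
    total = 1                           # the one point at infinity
    for u in F49:
        total += roots_of.get(sub(power(u, 7), u), 0)
    return total


print(count_points(), 49 + 6 * 7 + 1)
```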

4. X as a Modular Curve
4.1. X as the modular curve X(7). Since $G \cong PSL_2(\mathbf{F}_7)$ we can realize G
as the quotient group $\Gamma(1)/\Gamma(7)$, where $\Gamma(1)$ is the modular group $PSL_2(\mathbf{Z})$ and
$\Gamma(7)$ is the subgroup of matrices congruent to the identity mod 7. The following
facts are well known: $\Gamma(1)$ acts on the upper half-plane $H = \{\tau \in \mathbf{C} : \mathrm{Im}\, \tau > 0\}$
by fractional linear transformations $\pm\begin{pmatrix} a & b \\ c & d \end{pmatrix} : \tau \mapsto (a\tau + b)/(c\tau + d)$; the quotient
curve $H/\Gamma(1)$ parametrizes elliptic curves up to $\mathbf{C}$-isomorphism; if we extend
$H$ to $H^*$ by including the cusps $\mathbf{Q} \cup \{\infty\} = \mathbf{P}^1(\mathbf{Q})$, the resulting quotient
curve X(1) may be regarded as a compact Riemann surface of genus 0; and for
each $N \ge 1$, the quotient of $H^*$ by the normal subgroup $\Gamma(N)$ of $\Gamma(1)$ is the
modular curve X(N) whose non-cusp points parametrize elliptic curves E with
a full level-N structure. A full level-N structure means an identification of the
group $E[N]$ of N-torsion points with some fixed group $T_N$. Why $T_N$ and not
simply $(\mathbf{Z}/N)^2$ as expected? We can certainly use $T_N = (\mathbf{Z}/N)^2$ if we regard
X(N) as a curve over an algebraically closed field such as $\mathbf{C}$. But that will not
do over $\mathbf{Q}$ once $N > 2$: the Weil pairing (see for instance [Silverman 1986, III.8,
pp. 95 ff.]) identifies $\wedge^2 E[N]$ with the N-th roots of unity $\mu_N$, which are not
contained in $\mathbf{Q}$. So $T_N$ must be some group $\cong (\mathbf{Z}/N)^2$ equipped with an action
of $\mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q})$ such that $\wedge^2 T_N \cong \mu_N$ as Galois modules. There are many choices
for $T_N$ (for instance, $E[N]$ for any elliptic curve $E/\mathbf{Q}$!) which in general yield
different modular curves over $\mathbf{Q}$ (though they all become isomorphic over $\bar{\mathbf{Q}}$): $T_N$
and $T_N'$ yield the same curve only if $T_N' \cong T_N \otimes \chi$ for some quadratic character $\chi$.
The simplest choice is
$$T_N = (\mathbf{Z}/N) \times \mu_N, \tag{4.1}$$
and that is the choice that we shall use to define X(N) as a curve over $\mathbf{Q}$. Note,
however, that the action of $\Gamma(1)/\Gamma(N)$ is still defined only over the cyclotomic
field $\mathbf{Q}(\mu_N)$. The canonical map $X(N) \to X(1)$ that forgets the level-N structure is a Galois cover with group $\Gamma(1)/\Gamma(N) = PSL_2(\mathbf{Z}/N)$; it is ramified only
above three points of X(1), namely the cusp and the elliptic points that parametrize elliptic curves with complex multiplication by $\mathbf{Z}[i]$ and $\mathbf{Z}[e^{2\pi i/3}]$, and the
ramification indices at these points are N, 2, and 3 respectively.
Now consider N = 7. Then X(7) is a G-cover of the genus-0 curve X(1) with
three branch points of indices 2, 3, 7; therefore it is a Hurwitz curve, and thus
isomorphic with X at least over $\mathbf{C}$. The 24-point orbit is the preimage of the
cusp, and the 56- and 84-point orbits are the preimages of the elliptic points
$\tau = e^{2\pi i/3}$ and $\tau = i$ on X(1) parametrizing CM elliptic curves with j-invariants
0 and 1728. We shall show that the choice (4.1) of $T_7$ yields X(7) as a curve
over $\mathbf{Q}$ isomorphic with the Klein model of X, and give explicitly an elliptic
curve and 7-torsion points parametrized by a generic point $(x : y : z) \in \mathbf{P}^2$ with
$x^3 y + y^3 z + z^3 x = 0$.
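The numerology of this paragraph follows from the Riemann-Hurwitz formula for a Galois cover of a genus-0 curve with branch indices 2, 3, 7. A minimal sketch, with hypothetical function names, using exact rational arithmetic:

```python
from fractions import Fraction


def psl2_order(p: int) -> int:
    """Order of PSL_2(F_p) for an odd prime p."""
    return p * (p * p - 1) // 2


def genus_of_galois_cover(group_order: int, indices, base_genus: int = 0) -> Fraction:
    """Solve 2g - 2 = |G| * (2*g0 - 2 + sum(1 - 1/e)) for g."""
    total = Fraction(2 * base_genus - 2) + sum(1 - Fraction(1, e) for e in indices)
    return (group_order * total + 2) / 2


order = psl2_order(7)
print(order, genus_of_galois_cover(order, (2, 3, 7)))
print([order // e for e in (7, 3, 2)])   # sizes of the three special orbits
```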
The projective coordinates for X can be considered as a basis for H1 (X).
Holomorphic differentials on a modular curve $H^*/\Gamma$ are differentials $f(\tau)\, d\tau$ on
$H$ that are regular and invariant under $\Gamma$, i.e. such that $f(\tau)$ is a modular cusp
form of weight 2 for $\Gamma$: a holomorphic function satisfying the identity
$$f\Bigl( \frac{a\tau + b}{c\tau + d} \Bigr) = (c\tau + d)^2 f(\tau) \tag{4.2}$$
for all $\pm\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$ and vanishing at the cusps. We next choose a convenient basis
for the modular cusp forms of weight 2 for $\Gamma(7)$.
Taking $\pm\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \pm\begin{pmatrix} 1 & 7 \\ 0 & 1 \end{pmatrix}$ in (4.2) we see that $f$ must be invariant under
$\tau \mapsto \tau + 7$; thus it has a Fourier expansion in powers of $q^{1/7}$, where as usual
$$q := e^{2\pi i \tau} \quad \Bigl(\text{so } d\tau = \frac{1}{2\pi i} \frac{dq}{q}\Bigr). \tag{4.3}$$
Since we require vanishing at the cusp $\tau = i\infty$, the expansion must involve only
positive powers of $q^{1/7}$. The action of $g = \pm\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ on modular forms multiplies
$q^{1/7}$ by $\zeta$; thus g decomposes our space of modular forms into eigen-subspaces with
eigenvalues $\zeta^a$, such that for each $a$ mod 7 the $\zeta^a$ eigenspace consists of forms
$\sum_{m>0} c_m q^{m/7}$ whose coefficients $c_m$ vanish at all $m \not\equiv a \bmod 7$. We find three
such forms:
$$\begin{aligned}
x &= q^{4/7}(-1 + 4q - 3q^2 - 5q^3 + 5q^4 + 8q^6 - 10q^7 + 4q^9 - 6q^{10} \cdots), \\
y &= q^{2/7}(1 - 3q - q^2 + 8q^3 - 6q^5 - 4q^6 + 2q^8 + 9q^{10} \cdots), \\
z &= q^{1/7}(1 - 3q + 4q^3 + 2q^4 + 3q^5 - 12q^6 - 5q^7 + 7q^9 + 16q^{10} \cdots).
\end{aligned} \tag{4.4}$$
These can be expressed as the modified theta series
$$x, y, z = \sum_{\beta} \mathrm{Re}(\beta)\, q^{\beta\bar\beta/7}, \tag{4.5}$$
the sum extending over $\beta \in \mathbf{Z}[\alpha]$ congruent mod $(\sqrt{-7})$ to 2, 4, 1 respectively.


They also have the product expansions
$$x, y, z = \varepsilon q^{a/7} \prod_{n=1}^{\infty} (1 - q^n)^3 (1 - q^{7n}) \prod_{\substack{n > 0 \\ n \equiv \pm n_0 \bmod 7}} (1 - q^n), \tag{4.6}$$
where the parameters $\varepsilon, a, n_0$ are: for x, $-1, 4, 1$; for y, $+1, 2, 2$; and for z, $+1, 1, 4$.
That these in fact yield modular forms can be seen by factoring the resulting
products (4.6) into Klein forms (for which see for instance [Kubert and Lang
1981, pp. 25 ff. and 68 ff.]); it follows that x, y, z do not vanish except at the
cusps of X(7).
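The agreement between the expansions (4.4), the theta series (4.5) and the products (4.6) can be tested on the first eleven coefficients. The sketch below is an illustration under stated conventions of ours: $\beta = s + t\alpha$ with $\alpha = (-1 + \sqrt{-7})/2$, so that $\beta\bar\beta = s^2 - st + 2t^2$, $2\,\mathrm{Re}(\beta) = 2s - t$, and $\beta \equiv s + 3t$ modulo $(\sqrt{-7})$; the coefficient lists in `EXPECTED` are transcribed from (4.4), with zeros for the missing powers.

```python
from math import isqrt

TERMS = 11  # coefficients of q^0 .. q^10 inside the parentheses of (4.4)

EXPECTED = {
    "x": [-1, 4, -3, -5, 5, 0, 8, -10, 0, 4, -6],
    "y": [1, -3, -1, 8, 0, -6, -4, 0, 2, 0, 9],
    "z": [1, -3, 0, 4, 2, 3, -12, -5, 0, 7, 16],
}
# name: (epsilon, a, n0, residue of beta mod sqrt(-7))
PARAMS = {"x": (-1, 4, 1, 2), "y": (1, 2, 2, 4), "z": (1, 1, 4, 1)}


def theta_coeffs(a: int, residue: int) -> list:
    """Coefficient of q^{k + a/7} in sum Re(beta) q^{N(beta)/7}, for k < TERMS."""
    max_norm = 7 * (TERMS - 1) + a
    bound = 2 * isqrt(max_norm) + 2
    twice = [0] * TERMS                         # accumulates 2*Re(beta)
    for t in range(-bound, bound + 1):
        for s in range(-bound, bound + 1):
            norm = s * s - s * t + 2 * t * t
            if 0 < norm <= max_norm and (s + 3 * t) % 7 == residue:
                twice[(norm - a) // 7] += 2 * s - t
    return [c // 2 for c in twice]


def product_coeffs(eps: int, n0: int) -> list:
    """Coefficients of the product in (4.6) without the factor q^{a/7}."""
    series = [eps] + [0] * (TERMS - 1)

    def times_one_minus_q_to(n: int) -> None:
        for k in range(TERMS - 1, n - 1, -1):   # descending, so old values are used
            series[k] -= series[k - n]

    for n in range(1, TERMS):
        for _ in range(3):
            times_one_minus_q_to(n)
        if 7 * n < TERMS:
            times_one_minus_q_to(7 * n)
        if n % 7 in (n0 % 7, (-n0) % 7):
            times_one_minus_q_to(n)
    return series


for name, (eps, a, n0, residue) in PARAMS.items():
    print(name, theta_coeffs(a, residue) == product_coeffs(eps, n0) == EXPECTED[name])
```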
Since x, y, z are $\zeta^4$-, $\zeta^2$-, and $\zeta$-eigenforms for g, the three-dimensional representation of G that they generate must be isomorphic with $(V, \rho)$, and so the
action of G on X(7) will make it a quartic in the projectivization not of $(V, \rho)$ but
of $(V^*, \rho^*)$.$^{16}$ Using either the sum or the product formulas for x, y, z, together
with the action of $\Gamma(1)$ on theta series or on Klein forms, we can compute that h
cyclically permutes x, y, z. This is enough to identify (x, y, z) up to scaling with
our standard basis for V (again thanks to the fact that the 21-element subgroup
$\langle g, h \rangle$ of G acts irreducibly on V). This leads us to expect that
$$\Phi_4(x, y, z) = x^3 y + y^3 z + z^3 x = 0, \tag{4.7}$$
and the q-expansions corroborate this. To prove it we note that $\Phi_4(x, y, z)$, being
a G-invariant polynomial in the cusp forms x, y, z, must be a cusp form of weight
$4 \cdot 2 = 8$ for the full modular group $\Gamma(1)$; but the only such form is zero. (See
for instance [Serre 1973, Ch. VII] for the complete description of cusp forms on
$\Gamma(1)$.) Thus the coordinates $(x : y : z)$ for the canonical image of X(7) in $\mathbf{CP}^2$
identify it with the Klein model of X.
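The remark that the q-expansions corroborate (4.7) can be made concrete with truncated power series. The sketch below assumes the product expansions of x, y, z given earlier (sign, exponent $a/7$ and residue $n_0$ equal to $-1, 4, 1$ for x; $+1, 2, 2$ for y; $+1, 1, 4$ for z) and writes $x = q^{4/7}X$, $y = q^{2/7}Y$, $z = q^{1/7}Z$, so that $x^3y = q^2 X^3 Y$, $y^3 z = q\, Y^3 Z$ and $z^3 x = q\, Z^3 X$. The truncation order `TERMS` and all helper names are our own choices.

```python
TERMS = 30  # work modulo q^TERMS


def product_form(eps: int, n0: int) -> list:
    """eps * prod (1-q^n)^3 (1-q^{7n}) * prod_{n = +-n0 mod 7} (1-q^n), truncated."""
    series = [eps] + [0] * (TERMS - 1)

    def times_one_minus_q_to(n: int) -> None:
        for k in range(TERMS - 1, n - 1, -1):
            series[k] -= series[k - n]

    for n in range(1, TERMS):
        for _ in range(3):
            times_one_minus_q_to(n)
        if 7 * n < TERMS:
            times_one_minus_q_to(7 * n)
        if n % 7 in (n0 % 7, (-n0) % 7):
            times_one_minus_q_to(n)
    return series


def mul(a: list, b: list) -> list:
    out = [0] * TERMS
    for i, ai in enumerate(a):
        if ai:
            for j in range(TERMS - i):
                out[i + j] += ai * b[j]
    return out


def shift(a: list, k: int) -> list:
    """Multiply a truncated series by q^k."""
    return [0] * k + a[: TERMS - k]


X, Y, Z = product_form(-1, 1), product_form(1, 2), product_form(1, 4)


def cube_times(a: list, b: list) -> list:
    return mul(mul(mul(a, a), a), b)


klein = [
    u + v + w
    for u, v, w in zip(shift(cube_times(X, Y), 2), shift(cube_times(Y, Z), 1), shift(cube_times(Z, X), 1))
]
print(all(c == 0 for c in klein))
```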
We next identify the other G-invariant polynomials in x, y, z with known modular cusp forms for $\Gamma(1)$. We find that
$$\Phi_6(x, y, z) = \Delta \Bigl[ = q \prod_{n=1}^{\infty} (1 - q^n)^{24} = q - 24q^2 + 252q^3 \cdots \Bigr], \tag{4.8}$$
which requires only checking the $q^1$ coefficient because every $\Gamma(1)$ cusp form
of weight 12 is a multiple of $\Delta$. Likewise the leading terms of $\Phi_{14}(x, y, z)$ and
$\Phi_{21}(x, y, z)$, together with their weights 28, 42, suffice to identify these modular
forms with
$$\Phi_{14}(x, y, z) = \Delta^2 E_2 \Bigl[ = \Delta^2 \Bigl( 1 + 240 \sum_{n=1}^{\infty} \frac{n^3 q^n}{1 - q^n} \Bigr) = q^2 + 192q^3 - 8280q^4 \cdots \Bigr], \tag{4.9}$$
$$\Phi_{21}(x, y, z) = \Delta^3 E_3 \Bigl[ = \Delta^3 \Bigl( 1 - 504 \sum_{n=1}^{\infty} \frac{n^5 q^n}{1 - q^n} \Bigr) = q^3 - 576q^4 + 22140q^5 \cdots \Bigr]. \tag{4.10}$$
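The leading coefficients quoted in (4.8) to (4.10) can be recomputed from the definitions of $\Delta$, $E_2$ and $E_3$ (the Eisenstein series of weights 4 and 6 in this notation). A short sketch with truncated integer series; the helper names are ours, and the Lambert series are expanded using $\sum_n n^k q^n/(1 - q^n) = \sum_n \sigma_k(n) q^n$.

```python
TERMS = 6  # work modulo q^6


def mul(a: list, b: list) -> list:
    out = [0] * TERMS
    for i, ai in enumerate(a):
        for j in range(TERMS - i):
            out[i + j] += ai * b[j]
    return out


def sigma(k: int, n: int) -> int:
    """Sum of the k-th powers of the divisors of n."""
    return sum(d**k for d in range(1, n + 1) if n % d == 0)


def delta_series() -> list:
    """q * prod_{n >= 1} (1 - q^n)^24, truncated."""
    series = [0, 1] + [0] * (TERMS - 2)
    for n in range(1, TERMS):
        for _ in range(24):
            for k in range(TERMS - 1, n - 1, -1):
                series[k] -= series[k - n]
    return series


def eisenstein(k: int, factor: int) -> list:
    """1 + factor * sum_{n >= 1} sigma_k(n) q^n, truncated."""
    return [1] + [factor * sigma(k, n) for n in range(1, TERMS)]


delta = delta_series()
phi14 = mul(mul(delta, delta), eisenstein(3, 240))
phi21 = mul(mul(mul(delta, delta), delta), eisenstein(5, -504))
print(delta[1:4], phi14[2:5], phi21[3:6])
```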

Thus the elliptic curve parametrized by a non-cusp point $(x : y : z)$ on X is
$$E_{(x:y:z)} : v^2 = u^3 - \frac{1}{48} \delta^2 \Phi_{14}(x, y, z)\, u + \frac{1}{864} \delta^3 \Phi_{21}(x, y, z), \tag{4.11}$$
for some yet unknown $\delta$ of weight $-14$ (that is, homogeneous of degree $-7$ in
x, y, z) that only changes $E_{(x:y:z)}$ by a quadratic twist.
To determine the values of $u$ at 7-torsion points of $E_{(x:y:z)}$ we identify that
curve with $\mathbf{C}/(\mathbf{Z} \oplus \mathbf{Z}\tau) \cong \mathbf{C}^*/q^{\mathbf{Z}}$ and expand the Weierstrass $\wp$-function of that
curve at some point $q_1 \in \mathbf{C}^*/q^{\mathbf{Z}}$ in a q-series depending on $q_1$. We find
$$u = \delta \Delta \Bigl[ \frac{1}{12} - 2 \sum_{n=1}^{\infty} \frac{q^n}{(1 - q^n)^2} + \sum_{n=-\infty}^{\infty} \frac{q^n q_1}{(1 - q^n q_1)^2} \Bigr]. \tag{4.12}$$

$^{16}$ This mildly unfortunate circumstance could only have been avoided by more awkward
artifices such as declaring $\zeta$ to be $e^{-2\pi i/7}$ instead of $e^{+2\pi i/7}$ in (0.1). Of course the distinction
between the $V$ and $V^*$ models of X is harmless because the two representations are related by
an outer automorphism of G.

The 7-torsion points of $\mathbf{C}^*/q^{\mathbf{Z}}$ are generated by $\zeta$ and $q^{1/7}$. Substituting these for
$q_1$ in (4.12) we obtain $u = \delta P(x, y, z)$ for certain polynomials P of degree 7 determined
up to multiples of $\Phi_4$. We find P by comparing q-expansions. For $q_1 = \zeta$ we
obtain the symmetrical form
$$\begin{aligned}
P = {} & \tfrac{1}{7} \bigl[ (c_1 - 2c_2 - \tfrac{53}{12}) x^7 + (c_2 - 2c_4 - \tfrac{53}{12}) y^7 + (c_4 - 2c_1 - \tfrac{53}{12}) z^7 \bigr] \\
& + \tfrac{2}{3} \bigl[ (c_2 - c_4) x^4 y^2 z + (c_4 - c_1) y^4 z^2 x + (c_1 - c_2) z^4 x^2 y \bigr],
\end{aligned} \tag{4.13}$$
using the abbreviation $c_j := \zeta^j + \zeta^{-j} \in K_+$. The polynomials for $\zeta^2$, $\zeta^4$ are
obtained from these by cyclically permuting $c_1, c_2, c_4$ and x, y, z. That only these
six monomials can occur is forced by the invariance of the polynomial under $\langle g \rangle$.
The polynomial for $q_1 = q^{1/7}$ looks more complicated, because invariance under
$sgs$ is not so readily detectable; we refrain from exhibiting that polynomial in
full, but note that it can be obtained from (4.13) by the linear substitution $\rho(s)$,
and that its coefficients, unlike those of (4.13), are rational.$^{17}$
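A floating-point spot check at a single point illustrates how the polynomial for $q_1 = \zeta$ arises. The sketch below makes several assumptions that should be read as ours: x, y, z are evaluated from their product expansions (sign, exponent and residue $-1, 4, 1$; $+1, 2, 2$; $+1, 1, 4$), the normalized Weierstrass function is taken as $1/12 - 2\sum_{n \ge 1} q^n/(1-q^n)^2 + \sum_{n \in \mathbf{Z}} q^n q_1/(1 - q^n q_1)^2$, and P is taken to be $\Delta$ times that function with coefficients $\tfrac17(c_j - 2c_{2j} - \tfrac{53}{12})$ on the seventh powers and $\tfrac23(c_2 - c_4)$, $\tfrac23(c_4 - c_1)$, $\tfrac23(c_1 - c_2)$ on the mixed monomials. The evaluation point $\tau = i$ and the truncation are arbitrary.

```python
import cmath
import math


def compare(t: float = 1.0, terms: int = 16):
    """Return (P(x, y, z), Delta * wp) at tau = i*t with q1 = zeta = exp(2 pi i / 7)."""
    q = math.exp(-2 * math.pi * t)            # q = exp(2 pi i tau), real here
    r = math.exp(-2 * math.pi * t / 7)        # q^(1/7)
    zeta = cmath.exp(2j * math.pi / 7)

    def cusp_form(eps, a, n0):
        value = eps * r**a
        for n in range(1, terms):
            value *= (1 - q**n) ** 3 * (1 - q ** (7 * n))
            if n % 7 in (n0 % 7, (-n0) % 7):
                value *= 1 - q**n
        return value

    x, y, z = cusp_form(-1, 4, 1), cusp_form(1, 2, 2), cusp_form(1, 1, 4)

    delta = q
    for n in range(1, terms):
        delta *= (1 - q**n) ** 24

    # Terms with n < 0 are rewritten as (q^n / zeta) / (1 - q^n / zeta)^2 with n > 0.
    wp = 1 / 12 + zeta / (1 - zeta) ** 2
    for n in range(1, terms):
        qn = q**n
        wp -= 2 * qn / (1 - qn) ** 2
        wp += qn * zeta / (1 - qn * zeta) ** 2 + (qn / zeta) / (1 - qn / zeta) ** 2

    c = {j: 2 * math.cos(2 * math.pi * j / 7) for j in (1, 2, 4)}
    k = 53 / 12
    p_value = (
        ((c[1] - 2 * c[2] - k) * x**7 + (c[2] - 2 * c[4] - k) * y**7
         + (c[4] - 2 * c[1] - k) * z**7) / 7
        + (2 / 3) * ((c[2] - c[4]) * x**4 * y**2 * z + (c[4] - c[1]) * y**4 * z**2 * x
                     + (c[1] - c[2]) * z**4 * x**2 * y)
    )
    return p_value, delta * wp


lhs, rhs = compare()
print(abs(lhs - rhs))
```

Agreement at one value of $\tau$ is of course only a consistency check on the signs and constants, not a derivation.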
It remains to choose $\delta$. We would have liked to make it G-invariant, since the
action of G would then preserve our model (4.11) for $E_{(x:y:z)}$ and only permute
its 7-torsion points. But we cannot make $\delta$ an arbitrary homogeneous function of
degree $-7$ in x, y, z because we are constrained by the condition that $E_{(x:y:z)}[7] \cong T_7$
for all non-cusp $(x : y : z) \in X(7)$. This means, first, that $E_{(x:y:z)}$ must be a
nondegenerate elliptic curve, and second, that its 7-torsion group be generated
by a rational point (for the $\mathbf{Z}/7$ part of $T_7$) and a point that every $\mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q})$
element taking $\zeta$ to $\zeta^a$ multiplies by $a$ (for the $\mu_7$ part). The first condition
amounts to the requirement that the divisor of $\delta$ be supported on the cusps
of X(7); this determines $\delta$ up to multiplication by a modular unit in $\mathbf{C}(X(7))$.
The second condition then determines $\delta$ up to multiplication by the square of
a modular unit. It turns out that already the first condition prevents us from
choosing a G-invariant $\delta$: such a $\delta$ would be $\Phi_{14}/\Phi_{21}$ times a rational function
of j, and thus would have zeros or poles on the elliptic points of order 2 and 3
(the 56- and 84-point orbits of X).
We next find a $\lambda$, necessarily not G-invariant, that does the job. From our
computation of u-coordinates at 7-torsion points we know that $u/\lambda$ is a polynomial $P(x, y, z) \in \mathbf{Q}[x, y, z]$. Moreover

$$Q := P^3 - \frac{1}{48}\,\Phi_{14}\,P + \frac{1}{864}\,\Phi_{21} \tag{4.14}$$

cannot vanish except at a cusp, lest a 7- and a 2-torsion point on $\mathbf{C}^*/q^{\mathbf{Z}}$ coincide.


[In fact Q has the product expansion

$$\tfrac14\, q^{23/7} \prod_{n=1}^{\infty} \bigl((1 - q^{n-6/7})(1 - q^{n-1/7})\bigr)^{-8} (1 - q^{n-5/7})^{2} (1 - q^{n-2/7})^{2} (1 - q^n)^{84}, \tag{4.15}$$
17 This is ultimately due to the fact that the coefficients of (4.12) are rational. In fact it is
no accident that the least common denominator of the coefficients of P for $q_1 = q^{1/7}$ is 12, the same
as for (4.12); but we need not pursue this here.

which manifestly has neither zero nor pole in $X(7) - \{\text{cusps}\}$.] Thus for any $\lambda_0$
homogeneous of degree 14 in x, y, z whose divisor is supported on the cusps (for
instance $\lambda_0 = x^{14}$) we may take

$$\lambda = Q/\lambda_0^2, \tag{4.16}$$

which satisfies the first condition and yields a 7-torsion point on the curve (4.11)
rational over $\mathbf{Q}(x, y, z)$.
We claim that this, together with our computations thus far, lets us deduce
that $\lambda$ also satisfies the second condition, and thus completes our proof that X(7)
is Q-isomorphic with X, as well as the determination of the 7-torsion points on
the generic elliptic curve (4.11) parametrized by X. We must show that E[7] is
isomorphic as a $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$ module with $T_7 = (\mathbf{Z}/7) \oplus \mu_7$. Indeed, consider the
action on E[7] of an element $\gamma$ of $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$ that takes $\zeta$ to $\zeta^a$. By our choice
of $\lambda$, this $\gamma$ fixes the point with $q_1 = q^{1/7}$; thus this point generates a subgroup
$\cong \mathbf{Z}/7$ of E[7]. From our computation of (4.13) we see that $\gamma$ multiplies the
$q_1 = \zeta$ point by either a or $-a$. But the Weil pairing of the $\zeta$ and $q^{1/7}$ points is
$\zeta$, which $\gamma$ takes to $\zeta^a$. Thus $\gamma$ must also take the $q_1 = \zeta$ point to $\zeta^a$. In other
words, the $q_1 \in \mu_7$ points comprise a subgroup of E[7] isomorphic as a Galois
module with $\mu_7$. Having found subgroups of E[7] isomorphic with $\mathbf{Z}/7$ and $\mu_7$,
we are done.
4.2. The modular interpretation of quotients of X. Now let H be a
subgroup of G, and consider the quotient curve X/H. When H is trivial, this
quotient is X itself, which we have just identified with the moduli space X(7) of
elliptic curves with full level-7 structure. When H = G, the quotient is the moduli space X(1) of elliptic curves with no further structure, and the quotient map
$X(7) \to X(1)$ in effect forgets the level-7 structure. For intermediate groups H,
the quotient curve, which can still be regarded also as the quotient of $\mathcal{H}$ by
a congruence subgroup of $\Gamma(1)$, parametrizes elliptic curves with partial level-7
structure such as a choice of a 7-torsion point or 7-element subgroup. In this
section we describe the three classical modular curves $X_0(7)$, $X_1(7)$, and $X_0(49)$
that arise in this way. The same constructions yield for each N > 1 the curves
$X_0(N)$, $X_1(N)$, $X_0(N^2)$ as quotients of X(N), though of course for each N we
face anew the problem of finding explicit coordinates and equations for these
modular curves and covers.
Each of the eight 7-element subgroups T of E (equivalently, of E[7]) yields
an isogeny of degree 7 from E to the quotient elliptic curve E/T. The T's may
be regarded as points of the projective line $(E[7] - \{0\})/\mathbf{F}_7^* \cong \mathbf{P}^1(\mathbf{F}_7)$, permuted
by G. The stabilizer in G of a point on this $\mathbf{P}^1(\mathbf{F}_7)$ is a 21-element subgroup;
for instance, $\langle g, h\rangle$ is the stabilizer of $\infty$. Taking $H = \langle g, h\rangle$ we conclude that
X/H parametrizes elliptic curves E together with a 7-element subgroup T, or
equivalently together with a 7-isogeny $E \to E/T$. This X/H is the quotient of $\mathcal{H}$
by the subgroup

$$\Gamma_0(7) := \Bigl\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{PSL}_2(\mathbf{Z}) : c \equiv 0 \bmod 7 \Bigr\} \tag{4.17}$$

of $\Gamma(1)$, and is called the modular curve $X_0(7)$. This curve has genus 0, with
rational coordinate ("Hauptmodul")

$$j_7 = q^{-1} \prod_{n=1}^{\infty} \bigl((1 - q^n)/(1 - q^{7n})\bigr)^4 = q^{-1} - 4 + 2q + 8q^2 - 5q^3 - 4q^4 - \cdots. \tag{4.18}$$

Comparing this with the product expansions for x, y, z, $\Delta$, we may express $j_7$ as
a quotient of $\langle g, h\rangle$-invariant sextics in x, y, z:

$$j_7 = \frac{\Delta}{(xyz)^2} = \frac{\Phi_6(x, y, z)}{(xyz)^2}. \tag{4.19}$$

Either by comparing this with (2.13), or directly from the q-expansions, we then
find that the degree-8 cover $X_0(7)/X(1)$ is given by

$$j = (j_7^2 + 13j_7 + 49)(j_7^2 + 245j_7 + 7^4)^3 / j_7^7. \tag{4.20}$$
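The relation (4.20) can be spot-checked numerically. The following is only a sketch: it evaluates $j_7$ from the product (4.18) and j from the standard identity $j = E_4^3/\Delta$ (an assumption about normalization, with $E_4 = 1 + 240\sum n^3q^n/(1-q^n)$ and $\Delta = q\prod(1-q^n)^{24}$) at a real value of q, and compares both sides. The helper names and the truncation length are illustrative choices.

```python
def j7_and_j(q: float, terms: int = 200):
    """Return (j7, j) at a real 0 < q < 1 from truncated q-products and series."""
    prod1 = 1.0  # prod_{n>=1} (1 - q^n)
    prod7 = 1.0  # prod_{n>=1} (1 - q^(7n))
    e4 = 1.0     # 1 + 240 * sum_{n>=1} n^3 q^n / (1 - q^n)
    for n in range(1, terms + 1):
        qn = q ** n
        prod1 *= 1.0 - qn
        prod7 *= 1.0 - q ** (7 * n)
        e4 += 240.0 * n ** 3 * qn / (1.0 - qn)
    delta = q * prod1 ** 24
    j7 = (prod1 / prod7) ** 4 / q  # Hauptmodul (4.18)
    j = e4 ** 3 / delta            # j = E_4^3 / Delta = 1/q + 744 + ...
    return j7, j


def j_from_j7(t: float) -> float:
    """Right-hand side of (4.20)."""
    return (t * t + 13 * t + 49) * (t * t + 245 * t + 7 ** 4) ** 3 / t ** 7


if __name__ == "__main__":
    j7, j = j7_and_j(0.05)
    print(j7, j, j_from_j7(j7))
```

A consistency check that needs no computer: since $j_7 = q^{-1} - 4 + \cdots$, the right-hand side of (4.20) expands as $j_7 + 13 + 3\cdot 245 + O(1/j_7) = q^{-1} + 744 + \cdots$, as it should.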

Given a 7-isogeny $E \to E/T$, the image of E[7] in E/T is a 7-element subgroup
of E/T and thus yields a new 7-isogeny $E/T \to E/E[7] \cong E$. This is in fact
the dual isogeny [Silverman 1986, p. 84 ff.] of the isogeny $E \to E/T$. Thus we
have a rational map $w_7 : X_0(7) \to X_0(7)$ that takes a non-cusp point of $X_0(7)$,
parametrizing an isogeny $E \to E/T$, to the point parametrizing the dual isogeny
$E/T \to E$. Moreover, iterating this construction recovers our original isogeny
$E \to E/T$; thus $w_7$ is an involution of $X_0(7)$. This $w_7$ is known as the Fricke
involution of $X_0(7)$. In general $X_0(N) = \mathcal{H}/\Gamma_0(N)$ parametrizes N-isogenies
with cyclic kernel (a.k.a. "cyclic N-isogenies") between elliptic curves, and the
dual isogeny yields the Fricke involution $w_N$ of $X_0(N)$. This involution can
also be described over C as the action of the fractional linear transformation
$\tau \leftrightarrow -1/N\tau$ on $\mathcal{H}$, which descends to an automorphism of $X_0(N)$ because it
normalizes $\Gamma_0(N)$. In our case of N = 7 we find the formula

$$w_7(j_7) = 49/j_7 \tag{4.21}$$

for the action of $w_7$ on $X_0(7)$. The coefficients of the curve E/T and the 7-isogenies
$E \rightleftarrows E/T$ parametrized by $X_0(7)$ can be computed as explicit functions
of $j_7$ by the methods of [Elkies 1998a].
The modular curve $X_1(7)$ parametrizes elliptic curves with a rational 7-torsion
point. It is thus the quotient of X(7) by the subgroup of G that fixes a 7-torsion
point. To obtain this modular curve, and the elliptic curve it parametrizes,
over Q, we must be careful to use a 7-torsion point that generates the subgroup
$\mathbf{Z}/7$ of $T_7$: we have already computed in (2.2) the quotient of X by the 7-element
subgroup $\langle g\rangle$ of G, which is the stabilizer of a 7-torsion point; but this is the point
(4.13), which generates the subgroup $\mu_7$ of $T_7$, and so is not rational over Q.
The $\mathbf{Z}/7$ subgroup has stabilizer $\langle sgs\rangle$, so we may obtain $X_1(7)$ as $X(7)/\langle sgs\rangle$.
Alternatively we may start from $X(7)/\langle g\rangle$ and apply $w_7$. This second approach
requires some explanation. At the level of Riemann surfaces, there is no problem:
for any N > 1, the modular curve $X_1(N)$ is $\mathcal{H}/\Gamma_1(N)$ where

$$\Gamma_1(N) := \Bigl\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{PSL}_2(\mathbf{Z}) : c \equiv 0,\ a, d \equiv 1 \bmod N \Bigr\}, \tag{4.22}$$

and again $\tau \leftrightarrow -1/N\tau$ normalizes this subgroup and so yields an involution
of $X_1(N)$. But over Q some care is required. The curve $X_1(N)$ parametrizes
pairs (E, P) where E is an elliptic curve and $P \in E$ is a point of order N. The
involution takes (E, P) to $(E', P')$, where $E' = E/\langle P\rangle$ and $P'$ generates the
image of E[N] under the quotient map $E \to E'$. But to specify the generator we
must use the Weil pairing: $P'$ must be the image of a point $\tilde{P} \in E[N]$ whose Weil
pairing with P is $e^{2\pi i/N}$. Once N > 2 the root of unity $e^{2\pi i/N}$ is not rational, so
we cannot demand that both P and $P'$ be rational N-torsion points on E, $E'$.
Instead, P, $P'$ must generate Galois modules such that $\langle P\rangle \otimes \langle P'\rangle \cong \mu_N$. So, for
instance, if P is rational then $\langle P'\rangle \cong \mu_N$, and conversely if $\langle P\rangle \cong \mu_N$ then $P'$ is
rational. The latter case applies for us: in our model of X(7), the distinguished
7-torsion points on the elliptic curve E parametrized by $X(7)/\langle g\rangle$ constitute a
subgroup $\cong \mu_7$ of E[7]; thus the curve $E'$ has a rational 7-torsion point.
Using $X/\langle g\rangle$ for $X_1(7)$, we find that this modular curve has rational coordinate

$$d := -\frac{y^2 z}{x^3} = q^{-1} + 3 + 4q + 3q^2 - 5q^4 - 7q^5 - 2q^6 + 8q^7 + \cdots, \tag{4.23}$$

and that the cyclic cubic cover $X_1(7) \to X_0(7)$ is given by

$$j_7 = d + \frac{1}{1-d} + \frac{d-1}{d} - 8 = \frac{d^3 - 8d^2 + 5d + 1}{d^2 - d}. \tag{4.24}$$
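Both displays can be checked mechanically. The sketch below rests on an assumption not restated in this section, namely that the product expansions of x, y, z from earlier in the paper combine to $qd = \prod_{n\ge1}(1-q^n)^{e(n)}$ with $e(n) = -3, 2, 1$ according as $n \equiv \pm1, \pm2, \pm3 \bmod 7$ (and $e(n) = 0$ when 7 divides n). It recomputes the coefficients in (4.23) with integer power series and confirms the rational identity in (4.24) with exact fractions; all helper names are illustrative.

```python
from fractions import Fraction


def series_mul(a, b, n):
    """Product of two power series (coefficient lists of length n), truncated to n terms."""
    out = [0] * n
    for i in range(n):
        if a[i] == 0:
            continue
        for k in range(n - i):
            out[i + k] += a[i] * b[k]
    return out


def d_times_q(n=9):
    """First n coefficients of q*d under the assumed product formula."""
    series = [1] + [0] * (n - 1)
    for m in range(1, n):
        r = min(m % 7, 7 - m % 7)  # residue class of m modulo 7, up to sign
        if r == 0:
            continue
        one_minus = [0] * n        # the series 1 - q^m
        one_minus[0], one_minus[m] = 1, -1
        inverse = [1 if k % m == 0 else 0 for k in range(n)]  # 1 / (1 - q^m)
        factor, times = {1: (inverse, 3), 2: (one_minus, 2), 3: (one_minus, 1)}[r]
        for _ in range(times):
            series = series_mul(series, factor, n)
    return series


def j7_as_sum(d):
    """Left form of (4.24)."""
    return d + 1 / (1 - d) + (d - 1) / d - 8


def j7_as_quotient(d):
    """Right form of (4.24)."""
    return (d ** 3 - 8 * d ** 2 + 5 * d + 1) / (d ** 2 - d)


if __name__ == "__main__":
    print(d_times_q(9))  # coefficients of q^-1, q^0, ..., q^7 in d
    print(j7_as_sum(Fraction(3)), j7_as_quotient(Fraction(3)))
```

The vanishing coefficient of $q^3$ in the computed list explains why no $q^3$ term appears in (4.23).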

The elliptic curve with a 7-torsion point parametrized by $X_1(7)$ was already
exhibited in extended Weierstrass form by Tate [1974, p. 195]:

$$y^2 + (1 + d - d^2)xy + (d^2 - d^3)y = x^3 + (d^2 - d^3)x^2 \tag{4.25}$$

(we chose our coordinate d so as to agree with this formula). Besides making the
coefficients simpler compared to the standard Weierstrass form $y^2 = x^3 + a_4x + a_6$,
Tate's formula has the advantage of putting the origin at a 7-torsion point.
Tate actually obtained (4.25) starting from a generic elliptic curve

$$y^2 + a_1xy + a_3y = x^3 + a_2x^2 \tag{4.26}$$

tangent to the x-axis at the origin, and working out the condition for the origin
to be a 7-torsion point. The equations for the curve 7-isogenous with (4.25) can
again be obtained by the methods of [Elkies 1998a], or (since here the points
of the isogeny's kernel are rational) already from Vélu's formulas [Vélu 1971]
on which those methods are based.
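That the origin of (4.25) really has order 7 can be confirmed for sample values of d with exact rational arithmetic. The following is a minimal sketch using the textbook chord-and-tangent formulas for a general Weierstrass equation $y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6$; the function names and the choice of sample values are illustrative, and the point at infinity is represented by `None`.

```python
from fractions import Fraction


def tate_curve(d):
    """Coefficients (a1, a2, a3, a4, a6) of Tate's curve (4.25) for a given d."""
    d = Fraction(d)
    return (1 + d - d * d, d * d - d ** 3, d * d - d ** 3, Fraction(0), Fraction(0))


def add(P, Q, curve):
    """Add two points on a general Weierstrass curve; None is the point at infinity."""
    a1, a2, a3, a4, a6 = curve
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and y1 + y2 + a1 * x2 + a3 == 0:
        return None  # Q is the negative of P
    if x1 == x2:     # same x and not inverse, so P == Q: tangent line
        den = 2 * y1 + a1 * x1 + a3
        lam = (3 * x1 * x1 + 2 * a2 * x1 + a4 - a1 * y1) / den
        nu = (-x1 ** 3 + a4 * x1 + 2 * a6 - a3 * y1) / den
    else:            # chord through P and Q
        lam = (y2 - y1) / (x2 - x1)
        nu = (y1 * x2 - y2 * x1) / (x2 - x1)
    x3 = lam * lam + a1 * lam - a2 - x1 - x2
    y3 = -(lam + a1) * x3 - nu - a3
    return (x3, y3)


def multiples_of_origin(d, count=7):
    """Return [P, 2P, ..., count*P] for P = (0, 0) on Tate's curve with parameter d."""
    curve = tate_curve(d)
    P = (Fraction(0), Fraction(0))
    out, R = [], None
    for _ in range(count):
        R = add(R, P, curve)
        out.append(R)
    return out


if __name__ == "__main__":
    for k, R in enumerate(multiples_of_origin(2), start=1):
        print(k, R)
```

The values d = 0 and d = 1, and the roots of $d^3 - 8d^2 + 5d + 1$ (the numerator in (4.24)), should be avoided, since the curve degenerates there.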

From our discussion in the previous paragraph, the involution $w_7$ of $X_1(7)$ cannot be defined over Q, only over $K_+$. (The full cyclotomic field K is not needed
because $X_1(7)$ cannot distinguish a 7-torsion point from its inverse, so only the
squares in $(\mathbf{Z}/7)^* = \mathrm{Gal}(K/\mathbf{Q})$ are needed, and they comprise $\mathrm{Gal}(K_+/\mathbf{Q})$; in
general for each prime $p \equiv 3 \bmod 4$ the Fricke involution $w_p$ of $X_1(p)$ is defined over the real subfield of the cyclotomic field $\mathbf{Q}(e^{2\pi i/p})$.) In fact there are
three choices of $w_7$, cyclically permuted by $\Gamma_0(7)/\Gamma_1(7)$ (and $\mathrm{Gal}(K_+/\mathbf{Q})$); we
calculate that the choice associated with $\tau \leftrightarrow -1/7\tau$ gives

$$w_7(d) = \frac{(4 + 3c_1 + c_2)\,d - (3 + 3c_1 + c_2)}{d - (4 + 3c_1 + c_2)}, \tag{4.27}$$

where $c_j := \zeta^j + \zeta^{-j} \in K_+$ as in (4.13).
We have seen already that $X/\langle h\rangle$ coincides with $X_0(49)$, and hinted that this
is in fact no mere coincidence. We can now explain this: where a point on X(7)
specifies an elliptic curve E together with a basis $\{P_1, P_2\}$ for E[7], the $\langle h\rangle$-orbit of the point specifies only the two subgroups $\langle P_1\rangle$ and $\langle P_2\rangle$ generated by
the basis elements. Equivalently, it specifies two elliptic curves $E_1 = E/\langle P_1\rangle$,
$E_2 = E/\langle P_2\rangle$ among the eight curves 7-isogenous with E. (Note that $\langle h\rangle$ is the
stabilizer in $\mathrm{PSL}_2(\mathbf{F}_7)$ of the two points $0, \infty$ on $\mathbf{P}^1(\mathbf{F}_7)$.) But then we obtain a cyclic 49-isogeny $E_1 \to E_2$ by composing the isogenies $E_1 \to E$, $E \to E_2$.
Conversely, any cyclic 49-isogeny between elliptic curves factors as a product
of two 7-isogenies and thus comes from a point of $X/\langle h\rangle$. Thus $X/\langle h\rangle$ is indeed
the modular curve $X_0(49)$ parametrizing cyclic 49-isogenies. In this description
of $X_0(49)$, the involution $w_{49}$ of $X/\langle h\rangle$ is the involution we have already constructed from the normalizer of $\langle h\rangle$ in G. Note that $w_{49}$ switches the roles of
$E_1$, $E_2$ but preserves E. In terms of congruence subgroups of $\Gamma(1)$, the identification of
$X/\langle h\rangle$ with $X_0(49)$ is explained by noting that the congruence groups
$\bigl\{ \bigl(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\bigr) \in \mathrm{PSL}_2(\mathbf{Z}) : b, c \equiv 0 \bmod 7 \bigr\}$ and $\Gamma_0(49)$ are conjugate in $\mathrm{PSL}_2(\mathbf{R})$ by
$7^{-1/2}\bigl(\begin{smallmatrix} 7 & 0 \\ 0 & 1 \end{smallmatrix}\bigr) : \tau \mapsto 7\tau$.
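The subgroup orders used in this section can be confirmed by brute-force enumeration. The sketch below lists $\mathrm{PSL}_2(\mathbf{F}_7)$ as matrices of determinant 1 modulo sign, lets it act on $\mathbf{P}^1(\mathbf{F}_7)$ by fractional linear transformations, and counts the stabilizer of $\infty$ (expected order 21), the pointwise stabilizer of 0 and $\infty$ (order 3) and the stabilizer of the pair $\{0, \infty\}$ (order 6). The representation of $\infty$ by `None` and the function names are illustrative choices; `pow(x, -1, p)` needs Python 3.8 or later.

```python
from itertools import product

P = 7


def psl2_elements(p=P):
    """Matrices (a, b, c, d) over F_p with ad - bc = 1, one representative per pair {M, -M}."""
    seen, reps = set(), []
    for m in product(range(p), repeat=4):
        a, b, c, d = m
        if (a * d - b * c) % p != 1:
            continue
        neg = tuple((-x) % p for x in m)
        if neg in seen:
            continue  # -M was already chosen as the representative
        seen.add(m)
        reps.append(m)
    return reps


def act(m, pt, p=P):
    """Fractional linear action on P^1(F_p); the point at infinity is None."""
    a, b, c, d = m
    if pt is None:
        num, den = a, c
    else:
        num, den = (a * pt + b) % p, (c * pt + d) % p
    if den == 0:
        return None
    return num * pow(den, -1, p) % p


def stabilizer_orders():
    """Orders of G, Stab(infinity), pointwise Stab(0, infinity), setwise Stab({0, infinity})."""
    G = psl2_elements()
    stab_inf = [m for m in G if act(m, None) is None]
    stab_both = [m for m in stab_inf if act(m, 0) == 0]
    stab_pair = [m for m in G if {act(m, 0), act(m, None)} == {0, None}]
    return len(G), len(stab_inf), len(stab_both), len(stab_pair)


if __name__ == "__main__":
    print(stabilizer_orders())
```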
Some final remarks on this curve $E_k = X/\langle h\rangle = X_0(49)$: recall that we
showed that its only Q-rational points are the point at infinity and (0, 0). Since
these are both cusps of $X_0(49)$ we conclude that there are no elliptic curves
over Q admitting a rational cyclic 49-isogeny. However, there are infinitely
many number fields, including quadratic ones such as $\mathbf{Q}(i)$ and $\mathbf{Q}(e^{2\pi i/3}) = \mathbf{Q}(\sqrt{-3})$,
over which $E_k$ is an elliptic curve of positive rank. (Take $x = -2$ or
$x = -3$ in the Weierstrass equation (2.10) for $E_k$.) Over such a number field
there are infinitely many pairs of elliptic curves with different j-invariants that
admit a rational cyclic 49-isogeny. Moreover 49 is the largest integer for which
this can happen: the curve $X_0(N)$ for N > 49 has genus > 1, and thus by
Faltings only finitely many points over any given number field. See the tables
and introductory remarks of [Birch and Kuyk 1975] for more information on the
genera and rational points of the modular curves $X_0(N)$.

4.3. Kenku's proof of the solution of the class number 1 problem.


What of the quotients of X by S4 and the 2-Sylow subgroup of G? The first
of these we calculate using the fact that (S4 ) is itself a reflection group,
with invariant ring generated by polynomials of degrees 2, 4, 6; we choose the
elementary symmetric functions of X 2 , Y 2 , Z 2 as our generators:
2 := X 2 + Y 2 + Z 2 , 4 := (XY )2 + (XZ)2 + (YZ)2 , 6 := (XY Z)2 . (4.28)
We then express a basis for the G-invariants in the S4 model as polynomials in
2 , 4 , 6 . Clearly the invariant quartic (1.11) is 22 + (3 2)4 . The degree6 invariant is proportional to (1 + )32 + (2 3)2 4 (42 + 7)6 . The
determinant (1.14) defining the degree-14 invariant is proportional to
(9+9)72 + (5670)52 4 (294+105)32 24 + (28+154)2 34

+6 (1008+2198)42 + (11487014)22 4 + (12348+1078)24
+(15778 + 15435)226 . (4.29)
Now the genus-0 curve X/S4 is rationally parametrized by the function f :=
32 /6 , which is of degree 24 on X and thus of degree 1 on X/S4 . So to obtain
the degree-7 cover X/S4 X/G we need only write the rational parameter 314 /76
of X/G as a rational function of 32 /6 on X. Since 22 = (2 3)4 on X, our
expressions for the G-invariant polynomials of degrees 6, 14 simplify to multiples
of

32 (1 + (14 + 7)f), 72 3 + (490 + 196)f + (3430 + 2401)f 2 . (4.30)
Thus j = 314 /76 is given by
26 3 + (490 + 196)f + (3430 + 2401)f 2

3 

7
1 + (14 + 7)f ,

(4.31)

in which the coefficient 26 may either be obtained by keeping track of all the
constants of proportionality along the way, or by requiring that the third point
of ramification of j (other than the points j = 0, forced by the factorization
in (4.18)) occur at j = 123 . To put (4.31) in a nicer form we replace f by the
equivalent coordinate , related with f by
f=

( + 3) + 14 + 26
,
56( + 3(1 + ))

(4.32)

which puts the pole of j at = and thus makes j a seventh-degree polynomial


in :

3
3
j = 3(1 + ) (2 + ) + (3 + 2)
(4.33)
 2
 2
2
3
= 12 + + (2 + 4) + 2 (6 + 9) 2(1 + ) + (1 2) .
We noted already that the $S_4$ model of X cannot be defined over Q because $S_4$ is
its own normalizer in Aut(G). For the same reason this polynomial (4.33) cannot
have rational coefficients. Over a number field F containing k, we may choose a
conjugacy class of subgroups $S_4 \subset G$, and then depending on our choice either
(4.33) or its $\mathrm{Gal}(k/\mathbf{Q})$ conjugate parametrizes elliptic curves E/F such that
$\mathrm{Gal}(\overline{F}/F)$ acts on E[7] by a subgroup of a 24-element group in that conjugacy
class.18
On the other hand, the 8-element dihedral subgroups $D_8$ of G do extend to
16-element subgroups of Aut(G). This is a consequence of Sylow theory, but
the subgroups in question can also be seen from the interpretation of G and
Aut(G) as $\mathrm{PSL}_2(\mathbf{F}_7)$, $\mathrm{PGL}_2(\mathbf{F}_7)$: choose an identification of $\mathbf{F}_7^2$ with $\mathbf{F}_{49}$, and
consider the action of $\Gamma\mathrm{L}_1(\mathbf{F}_{49})$ on $\mathbf{F}_{49}$. Multiplication by some $a \in \mathbf{F}_{49}^*$ and Galois conjugation are $\mathbf{F}_7$-linear transformations of determinant $a^8$ and $-1$ respectively. Using only $\mathbf{F}_{49}^*$ we obtain cyclic subgroups of orders 4, 8 in $\mathrm{PSL}_2(\mathbf{F}_7)$ and
$\mathrm{PGL}_2(\mathbf{F}_7)$, the nonsplit Cartan subgroups of these linear groups; allowing also
Galois conjugation, we obtain the normalizers of the nonsplit Cartan subgroups,
which are 8- and 16-element dihedral groups and are the 2-Sylow subgroups of
$\mathrm{PSL}_2(\mathbf{F}_7)$, $\mathrm{PGL}_2(\mathbf{F}_7)$ respectively. Since $D_8 \subset G$ is normalized by outer automorphisms of G, the quotient of X by $D_8$ can be defined over Q even though it
factors through the quotient by $S_4$, which is only defined over k! To obtain that
quotient as a degree-3 cover of the -line we may either proceed as we did to
obtain (4.33), namely, writing 2 , 4 , 6 in terms of the invariants of D8 , or
locate the ramification points of the cover. This triple cover is totally ramified
at the simple root = 3(1 + ) of j, and has double points at the solutions of
2 + 2 = (6 + 9) at which j = 123 . We find that the cover is given by
=

(2 + 3)3 (18 + 15)2 + (42 + 21) + (14 + 7)


,
3 72 + 7 + 7

(4.34)

in which we chose the degree-1 function $\psi$ on $X/D_8$ so that $j \in \mathbf{Q}(\psi)$:

$$j = 64\,\frac{\bigl[\psi(\psi^2 + 7)(\psi^2 - 7\psi + 14)(5\psi^2 - 14\psi - 7)\bigr]^3}{(\psi^3 - 7\psi^2 + 7\psi + 7)^7} = 12^3 + 56^2\,\frac{(\psi - 3)(2\psi^4 - 14\psi^3 + 21\psi^2 + 28\psi + 7)\,P^2(\psi)}{(\psi^3 - 7\psi^2 + 7\psi + 7)^7}, \tag{4.35}$$

where $P(\psi)$ is the polynomial

$$P(\psi) = (\psi^4 - 14\psi^2 + 56\psi + 21)(\psi^4 - 7\psi^3 + 14\psi^2 - 7\psi + 7). \tag{4.36}$$

In the modular setting $\psi$ parametrizes elliptic curves E such that the Galois
action on E[7] is contained in a subgroup $D_8 \subset G$, i.e. by the normalizer of
a nonsplit Cartan subgroup; we thus refer to the $\psi$-line as the modular curve
$X_n(7)$.
18 Note that, since $F \supseteq k$, any $\gamma \in \mathrm{Gal}(\overline{F}/F)$ must take $\zeta$ to one of $\zeta, \zeta^2, \zeta^4$; thus the
determinant of its action on E[7] is a square in $\mathbf{F}_7^*$. Thus $\gamma$ acts on E[7] by a scalar multiple
of a unimodular $\mathbf{F}_7$-linear transformation of E[7], and may be regarded as an element of
$\mathrm{PSL}_2(\mathbf{F}_7) \cong G$.

Kenku [1985] used this curve to obtain a novel proof of the Stark–Heegner
theorem, which states that the only quadratic imaginary fields with unique factorization
are $\mathbf{Q}(\sqrt{D})$ with D = −3, −4, −7, −8, −11, −19, −43, −67, −163. Let
$F = \mathbf{Q}(\sqrt{D})$ be a quadratic imaginary field of discriminant D < 0 and class
number 1. There is then an elliptic curve E/Q with CM by $O_F$, unique up to
$\overline{\mathbf{Q}}$-isomorphism. Assume that the prime 7 is inert in F; this certainly happens if
|D| > 28, else the prime(s) above 7 in F cannot be principal. (The fields with
D = −4, −8, −11 also satisfy this condition.) Then the action of $O_F$ on E[7] gives
E[7] the structure of a one-dimensional vector space over $\mathbf{F}_{49}$, and $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$
must respect this structure. Thus E yields a rational point of $X_n(7)$. But this
point is constrained by the condition that $j_E \in \mathbf{Z}$. That is, $\psi = \psi(E)$ must be a
rational number such that $j(\psi)$, given by (4.35), is an integer. Writing $\psi = m/n$
in lowest terms, we find $j(\psi) = A(m, n)/B(m, n)$ with A, B homogeneous polynomials of degree 21 without common factors. Thus gcd(A(m, n), B(m, n)) is
bounded given gcd(m, n) = 1; one may calculate that this gcd is a factor of
$56^7$, and thus that $m^3 - 7m^2n + 7mn^2 + 7n^3$ divides 56. Thus if m, n are at all
large then m/n must be a very good rational approximation to one of the roots
$3 + 4\cos(2\pi a/7)$ ($a \in \mathbf{F}_7^*$) of $\psi^3 - 7\psi^2 + 7\psi + 7$. In the present case Kenku was able
to list all $\psi \in \mathbf{Q}$ such that $j(\psi) \in \mathbf{Z}$ using Nagell's list [1969] of the solutions of
x + y = 1 in units x, y of $K_+$. The list can also be obtained from general bounds
on rational approximation, provided all the constants are given explicitly as they
are in [Bugeaud and Győry 1996]. For our specific problem of approximating elements of $K_+ \setminus \mathbf{Q}$, much better results are available, which make the computation
easily tractable; for instance Michael Bennett reports that the methods of [Bennett 1997]
yield the bound $|\cos(\pi/7) - p/q| > 0.099\,q^{-7/3}$ for all nonzero $p, q \in \mathbf{Z}$,
which is more than enough to find all solutions of $|m^3 - 7m^2n + 7mn^2 + 7n^3| \le 56$.
We find that the list of integral points on $X_n(7)$, however obtained, consists of
the points with

$$\psi \in \bigl\{0,\ \infty,\ 1,\ -1,\ 2,\ 3,\ 5,\ -\tfrac{3}{5},\ 7,\ \tfrac{7}{3},\ \tfrac{11}{2},\ \tfrac{19}{9}\bigr\}. \tag{4.37}$$

Of the resulting integral values of $j(\psi)$, the first eight are j-invariants of CM
elliptic curves, with discriminant −3, −8, −11, −16, −67, −4, −43, −163 respectively.
(The discriminant −3 occurs even though 7 is split in $\mathbf{Q}(\sqrt{-3})$ thanks
to the cube roots of unity in $\mathbf{Q}(\sqrt{-3})$, which yield extra automorphisms of a
curve of j-invariant zero; D = −16 occurs because the order $\mathbf{Z}[2i] \subset \mathbf{Q}(i)$ still has
unique factorization.) It is easy to check that none of the remaining four values
$j = 10^3 \cdot 7^5$, $2^{15} \cdot 7^5$, $2^6 \cdot 11^3 \cdot 23^3 \cdot 149^3 \cdot 269^3$, $2^9 \cdot 17^6 \cdot 19^3 \cdot 29^3 \cdot 149^3$ can be the j-invariants of a
CM curve, and this completes Kenku's proof that the list of imaginary quadratic
fields of class number 1 is complete.
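The arithmetic in this argument is easy to replay with exact rational numbers. The sketch below evaluates the first form of (4.35), written homogeneously in (m, n) so that n = 0 represents $\psi = \infty$, at the twelve points of (4.37). It then compares the first eight values with the classical j-invariants of discriminants −3, −8, −11, −16, −67, −4, −43, −163 (quoted here from standard tables rather than from the text above) and the last four with the factorizations given in the text. The function and list names are illustrative.

```python
from fractions import Fraction


def cubic(m, n):
    """Homogeneous form m^3 - 7 m^2 n + 7 m n^2 + 7 n^3 (the denominator in (4.35))."""
    return m ** 3 - 7 * m * m * n + 7 * m * n * n + 7 * n ** 3


def j_ns7(m, n):
    """j-invariant (4.35) at psi = m/n; n = 0 corresponds to psi = infinity."""
    num = (m * (m * m + 7 * n * n)
             * (m * m - 7 * m * n + 14 * n * n)
             * (5 * m * m - 14 * m * n - 7 * n * n))
    return Fraction(64 * num ** 3, cubic(m, n) ** 7)


# The twelve points of (4.37) as pairs (m, n) in lowest terms.
POINTS = [(0, 1), (1, 0), (1, 1), (-1, 1), (2, 1), (3, 1), (5, 1), (-3, 5),
          (7, 1), (7, 3), (11, 2), (19, 9)]

# Classical CM j-invariants for D = -3, -8, -11, -16, -67, -4, -43, -163.
CM_J = [0, 20 ** 3, -32 ** 3, 66 ** 3, -5280 ** 3, 12 ** 3, -960 ** 3, -640320 ** 3]

# The four remaining integral values quoted in the text.
NON_CM_J = [10 ** 3 * 7 ** 5,
            2 ** 15 * 7 ** 5,
            (2 ** 2 * 11 * 23 * 149 * 269) ** 3,
            (2 ** 3 * 17 ** 2 * 19 * 29 * 149) ** 3]

if __name__ == "__main__":
    for (m, n) in POINTS:
        print((m, n), cubic(m, n), j_ns7(m, n))
```

In every case the cubic form takes a value dividing 56, as the argument requires.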
[We remark that Siegel [1968] had already given a similar proof of the Stark–Heegner
theorem using $X_n(5)$ together with the condition that $j_E$ is a cube,
which is tantamount to using the degree-30 cover of X(1) by $X_n(15)$. An amusing feature of Siegel's argument which I have not seen mentioned elsewhere is
that the Diophantine equation for an integral point on $X_n(15)$ is equivalent to
the condition that a Fibonacci number be a perfect cube, and thus that Siegel
in effect reduced the Stark–Heegner theorem to the fact that the only such Fibonacci numbers are 0, 1, 8.]
What of the four discriminants D = −3, −12, −19, −27 of imaginary quadratic
orders with unique factorization in which 7 splits? Let E be an elliptic curve with
CM by the order of discriminant D. The primes above 7 yield a distinguished
pair of 7-element subgroups of E, which must be respected by the Galois group.
Thus $j_E$ lifts to a rational point on the quotient of X(7) by the normalizer of the
split Cartan group of diagonal matrices. In our case the split Cartan group is
$\langle h\rangle$, and its normalizer is $\langle h, s\rangle$, so we know these quotient curves already. Since
$S_4$ contains the normalizers of both the split and the non-split Cartan groups
(note that p = 7 is the largest case in which $\mathrm{PSL}_2(\mathbf{F}_p)$ has a proper subgroup
containing Cartan normalizers of both kinds), the j-invariant of a CM curve lifts
to a rational point of $X(7)/S_4$ in both the split and inert cases. These points
(necessarily rational only over k, since $X(7)/S_4$ is not defined over Q) are as
follows:
3

D
x
D
x

11

12

2 + , 3 + 3, 3 2 4 2 2 + 3 5 + 2 3 +
16

19

27

43

67

163

6 + 4 5 2 3 + 6 3 14 42 + 13 283 182

This accounts for all but two of the thirteen rational j-invariants. The remaining
rational j's have D = −7 and D = −28; these are the j-invariants $-15^3$, $255^3$
of the curves $E_k$, $E_k'$, for which 7 is ramified in the CM field $\mathbf{Q}(\sqrt{D})$, a.k.a. k.
These two j's lift to rational points not on $X(7)/S_4$ but on $X_0(7)$, in fact to the
fixed points $j_7 = -7$ and $j_7 = +7$ of the involution $w_7$.
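This last claim can be verified directly from (4.20) and (4.21): the fixed points of $j_7 \mapsto 49/j_7$ are $j_7 = \pm 7$, and substituting them into the degree-8 cover should return $-15^3$ and $255^3$. A minimal sketch with exact rational arithmetic follows; the function names are illustrative.

```python
from fractions import Fraction


def j_from_j7(t):
    """The degree-8 cover (4.20): j as a rational function of j7."""
    t = Fraction(t)
    return (t * t + 13 * t + 49) * (t * t + 245 * t + 7 ** 4) ** 3 / t ** 7


def w7(t):
    """The Fricke involution (4.21) on the j7-line."""
    return Fraction(49) / Fraction(t)


if __name__ == "__main__":
    for t in (-7, 7):
        print(t, w7(t), j_from_j7(t))
```

By hand: at $j_7 = -7$ the two factors are 7 and $735 = 3 \cdot 5 \cdot 7^2$, giving $7 \cdot 735^3 / (-7^7) = -15^3$; at $j_7 = 7$ they are $189 = 3^3 \cdot 7$ and $4165 = 5 \cdot 7^2 \cdot 17$, giving $3^3 \cdot 85^3 = 255^3$.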
4.4. X as a Shimura curve. Our identification of X with $X(7) = \mathcal{H}/\Gamma(7)$
identifies $\Gamma(7)$ with the fundamental group not of X but of X punctured at the
24-point orbit. We have seen already that in the hyperbolic uniformization of X
the fundamental group $\pi_1(X)$ becomes a normal subgroup of the triangle group
$G_{2,3,7}$. Remarkably this too is an arithmetic group: let
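$$c = \zeta + \zeta^{-1} = 2\cos(2\pi/7), \tag{4.38}$$

so $O_{K_+} = \mathbf{Z}[c]$; then there exist matrices $i, j \in \mathrm{GL}_2(\mathbf{R})$ such as
$c^{1/2}\bigl(\begin{smallmatrix} 1 & 0 \\ 0 & -1 \end{smallmatrix}\bigr)$ and $c^{1/2}\bigl(\begin{smallmatrix} 0 & 1 \\ 1 & 0 \end{smallmatrix}\bigr)$ with

$$i^2 = j^2 = c \cdot 1, \qquad ij = -ji \tag{4.39}$$

(this determines i, j uniquely up to $\mathrm{GL}_2(\mathbf{R})$ conjugation) such that $G_{2,3,7}$ consists of the images in $\mathrm{PSL}_2(\mathbf{R})$ of $\mathbf{Z}[c]$-linear combinations of $1, i, j', ij'$ whose
determinant equals 1. Here

$$j' := \tfrac{1}{2}\bigl(1 + ci + (c^2 + c + 1)j\bigr), \tag{4.40}$$

and the determinant of $a_1 1 + a_2 i + a_3 j + a_4 ij$ ($a_1, a_2, a_3, a_4 \in \mathbf{R}$) is

$$a_1^2 - c a_2^2 - c a_3^2 + c^2 a_4^2. \tag{4.41}$$

For instance, $G_{2,3,7}$ is generated by the images in $\mathrm{PSL}_2(\mathbf{R})$ of

$$g_2 := ij/c, \qquad g_3 := \tfrac{1}{2}\bigl(1 + (c^2 - 2)j + (3 - c^2)ij\bigr), \qquad g_7 := \tfrac{1}{2}\bigl(c^2 + c - 1 + (2 - c^2)i + (c^2 + c - 2)ij\bigr), \tag{4.42}$$

with $g_2^2 = g_3^3 = g_7^7 = 1$ and $g_2 = g_7 g_3$. (Note that $g_2 = ij/c$ is legitimate
since c is a unit.) Shimura [1967] found that the quotients of $\mathcal{H}$ by arithmetic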
groups or their congruence subgroups also have modular interpretations, analogous to the interpretation of $\mathcal{H}/\Gamma(N)$ as the moduli space for elliptic curves with
full level-N structure. The objects parametrized by Shimura's modular curves
are more complicated than elliptic curves; for instance X and X/G parametrize
families of principally polarized abelian varieties of dimension 6. These abelian
sixfolds can be described precisely, but there is as yet no hope of presenting them
explicitly enough to derive formulas for the sixfold parametrized by a given point
of X/G or of X. Still these curves hold a place in number theory comparable
to that of the classical modular curves coming from congruence subgroups of
$\Gamma(1)$, and limited computational investigation of these curves is now feasible (see
for instance [Elkies 1998b]). For our present purposes we content ourselves
with describing the specific arithmetic groups and moduli problems connected
with the Klein quartic, referring the reader to [Vignéras 1980] for the arithmetic
of quaternion algebras over number fields in general, and to [Vignéras 1980;
Shimura 1967] for their associated Shimura modular curves.
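The relations claimed for the generators $g_2, g_3, g_7$ in (4.42) can be tested in floating point with the explicit matrices for i and j given above. The sketch below is illustrative only: it uses the embedding $c = 2\cos(2\pi/7)$, realizes $a_1 1 + a_2 i + a_3 j + a_4 ij$ as a real 2 × 2 matrix, and checks that each generator has determinant 1, that $g_7 g_3 = g_2$, and that $g_2^2$, $g_3^3$, $g_7^7$ all equal $-1$ (hence are trivial in $\mathrm{PSL}_2(\mathbf{R})$). The helper names are assumptions of this example.

```python
import math

C = 2 * math.cos(2 * math.pi / 7)
ROOT_C = math.sqrt(C)


def element(a1, a2, a3, a4):
    """Matrix of a1*1 + a2*i + a3*j + a4*ij with i = sqrt(c)*diag(1, -1), j = sqrt(c)*[[0, 1], [1, 0]]."""
    return ((a1 + a2 * ROOT_C, a3 * ROOT_C + a4 * C),
            (a3 * ROOT_C - a4 * C, a1 - a2 * ROOT_C))


def mul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(A[r][k] * B[k][s] for k in range(2)) for s in range(2))
                 for r in range(2))


def power(A, n):
    """n-th power of a 2x2 matrix by repeated multiplication."""
    R = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(n):
        R = mul(R, A)
    return R


def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def close(A, B, tol=1e-9):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(A[r][s] - B[r][s]) < tol for r in range(2) for s in range(2))


G2 = element(0.0, 0.0, 0.0, 1.0 / C)
G3 = element(0.5, 0.0, (C * C - 2) / 2, (3 - C * C) / 2)
G7 = element((C * C + C - 1) / 2, (2 - C * C) / 2, 0.0, (C * C + C - 2) / 2)
MINUS_ONE = ((-1.0, 0.0), (0.0, -1.0))

if __name__ == "__main__":
    print([round(det(g), 12) for g in (G2, G3, G7)])
    print(close(mul(G7, G3), G2), close(power(G7, 7), MINUS_ONE))
```

The traces explain the orders: $g_3$ has trace 1 and $g_7$ has trace $c^2 + c - 1 = 2\cos(\pi/7)$, so their eigenvalues are $e^{\pm i\pi/3}$ and $e^{\pm i\pi/7}$.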
The $K_+$-algebra A generated by i, j is a quaternion algebra over $K_+$: a simple
associative algebra with unit, containing $K_+$, such that $K_+$ is the center of A
and $\dim_{K_+} A = 4$. The ring $\mathcal{O} = O_{K_+}[i, j'] \subset A$ is a maximal order in A. For
each of the three real places v of $K_+$ we may form a quaternion algebra over R
by tensoring A with $(K_+)_v \cong \mathbf{R}$. It is known that a quaternion algebra over R
is isomorphic with either the algebra $M_2(\mathbf{R})$ of 2 × 2 real matrices, or with the
Hamilton quaternions $\mathbf{H}$. We have seen that in our chosen real embedding of $K_+$,
taking c to $2\cos(2\pi/7)$, the algebra $A \otimes_{K_+} (K_+)_v$ is $M_2(\mathbf{R})$; for the other two
places, in which c is $2\cos(4\pi/7)$ and $2\cos(8\pi/7)$, that algebra is isomorphic with
$\mathbf{H}$ because then $i^2, j^2 < 0$. It is known that if a quaternion algebra over a number
field F becomes isomorphic with $M_2(\mathbf{R})$ over at least one of F's real places then
the maximal order $\mathcal{O}$ is unique up to conjugation in the algebra; moreover, that if
(as in our case) there is exactly one such place and F is totally real then the group
$\mathcal{O}_1^*$ of units of norm 1 in $\mathcal{O} \hookrightarrow \mathrm{GL}_2(\mathbf{R})$ yields a co-compact subgroup $\Gamma = \mathcal{O}_1^*/\{\pm 1\}$
of $\mathrm{PSL}_2(\mathbf{R})$, and thus a compact Riemann surface $\mathcal{X}(1) := \mathcal{H}/\Gamma$, except in
the classical case of the algebra $M_2(\mathbf{Q})$ over Q. Since all maximal orders are
conjugate, the resulting curve does not depend on the choice of maximal order $\mathcal{O}$.
As a modular curve, $\mathcal{X}(1)$ parametrizes principally polarized abelian varieties
of dimension $2[F : \mathbf{Q}]$ (= 6 in our case) with endomorphisms by $\mathcal{O}$. This means
that the curve $\mathcal{X}(1)$, though constructed transcendentally, is defined over some
number field; in our case that field may even be taken to be Q thanks to the facts
that $K_+$ has unique factorization and is Galois over Q. Since for us $\Gamma = G_{2,3,7}$,
this curve is rational: the quotient of $\mathcal{H}$ by any triangle group has genus zero.
Our quaternion algebra A over $K_+$ has the remarkable property that, for each
finite place v of $K_+$, the quaternion algebra $A \otimes_{K_+} (K_+)_v$ over $(K_+)_v$ is isomorphic with $M_2((K_+)_v)$. (In other words, A is unramified at each finite prime v.)
Using this isomorphism, one may define arithmetic subgroups of $\Gamma$ and modular curves covering $\mathcal{X}(1)$ analogous to the classical modular curves X(N),
$X_0(N)$ etc. For instance if $\mathfrak{p}$ is a prime of $O_{K_+}$ then the units of $\mathcal{O}$ congruent
to 1 mod $\mathfrak{p}$ constitute a normal subgroup of $\mathcal{O}_1^*$ that maps to a normal subgroup
$\Gamma(\mathfrak{p})$ of $\Gamma$. Thanks to the isomorphism of $A \otimes_{K_+} (K_+)_v$ with $M_2((K_+)_v)$ we
have $\Gamma/\Gamma(\mathfrak{p}) \cong \mathrm{PSL}_2(k_{\mathfrak{p}})$ [where $k_{\mathfrak{p}}$ is the residue field $O_{K_+}/\mathfrak{p}$ of $\mathfrak{p}$]. The Riemann surface $\mathcal{X}(\mathfrak{p}) := \mathcal{H}/\Gamma(\mathfrak{p})$ is then a normal cover of $\mathcal{X}(1)$ with Galois
group $\mathrm{PSL}_2(k_{\mathfrak{p}})$. This too is a Shimura modular curve, parametrizing principally polarized abelian sixfolds with endomorphisms by $\mathcal{O}$ and complete level-$\mathfrak{p}$
structure (this last makes sense because $O_{K_+} \subset \mathcal{O}$ acts on the sixfold, so we may
speak about the sixfold's $\mathfrak{p}$-torsion points). The isomorphism $\Gamma/\Gamma(\mathfrak{p}) \cong \mathrm{PSL}_2(k_{\mathfrak{p}})$
lets us define groups $\Gamma_0(\mathfrak{p})$, $\Gamma_1(\mathfrak{p})$ intermediate between $\Gamma$ and $\Gamma(\mathfrak{p})$, and thus
Shimura modular curves $\mathcal{X}_0(\mathfrak{p})$ and $\mathcal{X}_1(\mathfrak{p})$, which parametrize $\mathcal{O}$-sixfolds
with partial level-$\mathfrak{p}$ structure. The curves $\mathcal{X}(\mathfrak{p})$, $\mathcal{X}_0(\mathfrak{p})$ and $\mathcal{X}_1(\mathfrak{p})$ are
defined over $K_+$, and even over Q if $\mathfrak{p}$ is Galois-stable. Note that the Galois-stable primes of $K_+$ are those that lie over an inert rational prime, i.e. a prime
congruent to ±2 or ±3 mod 7, and the prime $\mathfrak{p}_7 = (2 - c)$ lying over the ramified prime 7.
We remarked already that Hurwitz curves come from normal subgroups of
$G_{2,3,7}$. Shimura observed [1967, p. 83] that since each of the groups $\Gamma(\mathfrak{p})$ is a
normal subgroup of $\Gamma$, and $\Gamma = G_{2,3,7}$, the resulting curves $\mathcal{X}(\mathfrak{p})$ are Hurwitz
curves. In particular $\mathcal{X}(\mathfrak{p}_7)$ is a Hurwitz curve of genus 3. We already know
what this means: $\mathcal{X}(\mathfrak{p}_7)$ is none other than the Klein quartic X. Furthermore,
its fundamental group $\pi_1(X)$ is the congruence subgroup of $\Gamma$ consisting of the
images in $\mathrm{PSL}_2(\mathbf{R})$ of $\mathbf{Z}[c]$-linear combinations $a_1 1 + a_2 i + a_3 j' + a_4 ij'$ of norm 1
with $2 - c$ dividing $a_2, a_3, a_4$.
[The four Hurwitz curves of the next smallest genera also arise as $\mathcal{X}(\mathfrak{p})$ for
primes $\mathfrak{p}$ of $K_+$: the prime above 2 yields the Fricke–Macbeath curve [Fricke
1899; Macbeath 1965] of genus 7 and automorphism group $(\mathrm{P})\mathrm{SL}_2(\mathbf{F}_8)$, and the
primes above 13 yield three curves of genus 14 with automorphisms by $\mathrm{PSL}_2(\mathbf{F}_{13})$
first found by Shimura. The next two Hurwitz curves have genus 17 and come
from non-arithmetic quotients of $G_{2,3,7}$. See [Conder 1990] for more information
on the groups that can arise as automorphism groups of Hurwitz curves, and
[Conder 1987] for the list of all such groups of order less than $10^6$.]

The quotient curves $X_0(7)$, $X_1(7)$, $X_0(49)$ of X now reappear as Shimura
modular curves $\mathcal{X}_0(\mathfrak{p}_7)$, $\mathcal{X}_1(\mathfrak{p}_7)$, $\mathcal{X}_0(\mathfrak{p}_7^2)$. These curves have involutions
$w_{\mathfrak{p}_7}$ and $w_{\mathfrak{p}_7^2}$ analogous to the Fricke involutions of the classical modular curves.
However, the involutions of $\mathcal{X}_0(\mathfrak{p}_7)$ and $\mathcal{X}_1(\mathfrak{p}_7)$ are not the same as the
involutions of the same quotients of X when considered as the classical modular
curves $X_0(7)$ and $X_1(7)$. For instance, on $X_0(7)$ the involution $w_7 : j_7 \leftrightarrow 49/j_7$
switched the two cusps $j_7 = 0, \infty$, and also the elliptic points of order 3, at which
$j_7^2 + 13j_7 + 49 = 0$.
On $\mathcal{X}_0(\mathfrak{p}_7)$, the elliptic points of order 3 remain the same and are still switched
by $w_{\mathfrak{p}_7}$; but there are no cusps; instead, the simple pole $j_7 = \infty$ of j is the
unique elliptic point of order 7 of $\mathcal{X}_0(\mathfrak{p}_7)$, and must thus be fixed by $w_{\mathfrak{p}_7}$.
Therefore $w_{\mathfrak{p}_7}$ takes $j_7$ not to $49/j_7$ but to $-13 - j_7$. In this setting the three
Fricke involutions of $\mathcal{X}_1(\mathfrak{p}_7)$ are defined over Q, and take d to $1 - d$, $1/d$, and
$d/(d - 1)$.
We have seen already that the Fermat curve F7 is an unramified cover of X.
It follows that 1 (F7 ) is a subgroup of 1 (X), and thus of G2,3,7. That subgroup obligingly turns out to be a congruence subgroup, with the result that
F7 , like X, is a Shimura modular curve. That subgroup call it 7 is intermediate between (7 ) and (27 ), and may be described as follows: under
an identification of /(27 ) with PSL2 (OK+ /27 ), the group 7 /(27 ) consists
of matrices congruent to the identity mod whose bottom left entry vanishes.
Clearly 7 , thus defined, contains (7 ) as a normal subgroup of index 7, so
H/7 is a degree-7 unramified cyclic cover of X. This is not yet enough to identify H/7 with F7 , but we obtain more automorphisms of H/7 by observing
that 0 (7 ) is also a normal subgroup. Thus the quotient group 0 (7 )/7 acts
on H/7 . This group of automorphisms contains as an index-3 normal subgroup
1 (7 )/7 , which is an elementary abelian group of order 72 . The quotient of
H/7 by this subgroup is the genus-zero curve H/1(7 ) = X1 (7 ), which we
have already described as X1 (7) = X/hhi; and the ramification behavior of this
quotient map (H/7)X1 (7 ) does suffice to identify H/7 with F7 . The
147-element group 0 (7 )/7 is then an index-2 subgroup of Aut(F7 ), generated
by diagonal 3 3 matrices and cyclic coordinate permutations; extending 0 (7 )
by w7 yields the full group of automorphisms of F7 .

Noam D. Elkies
Department of Mathematics
Harvard University
Cambridge, MA 02138
United States
elkies@math.harvard.edu
