Combi Book
Combinatorics
[Math 701, Spring 2021 lecture notes]
Darij Grinberg
April 6, 2024 (unfinished!)
Contents
1. What is this? 6
2. Before we start... 7
2.1. What is this? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2. Prerequisites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3. Notations and elementary facts . . . . . . . . . . . . . . . . . . . . 7
3. Generating functions 10
3.1. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.1.1. Example 1: The Fibonacci sequence . . . . . . . . . . . . . 11
3.1.2. Example 2: Dyck words and Catalan numbers . . . . . . . 13
3.1.3. Example 3: The Vandermonde convolution . . . . . . . . . 22
3.1.4. Example 4: Solving a recurrence . . . . . . . . . . . . . . . 23
3.2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2.1. Reminder: Commutative rings . . . . . . . . . . . . . . . . 26
3.2.2. The definition of formal power series . . . . . . . . . . . . 33
3.2.3. The Chu–Vandermonde identity . . . . . . . . . . . . . . . 46
3.2.4. What next? . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.3. Dividing FPSs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.3.1. Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.3.2. Inverses in commutative rings . . . . . . . . . . . . . . . . 49
3.3.3. Inverses in K [[ x ]] . . . . . . . . . . . . . . . . . . . . . . . . 51
3.3.4. Newton’s binomial formula . . . . . . . . . . . . . . . . . . 53
Math 701 Spring 2021, version April 6, 2024 page 2
3.3.5. Dividing by x . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.3.6. A few lemmas . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.4. Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.4.1. Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.4.2. Reminders on rings and K-algebras . . . . . . . . . . . . . 65
3.4.3. Evaluation aka substitution into polynomials . . . . . . . 67
3.5. Substitution and evaluation of power series . . . . . . . . . . . . . 69
3.5.1. Defining substitution . . . . . . . . . . . . . . . . . . . . . . 69
3.5.2. Laws of substitution . . . . . . . . . . . . . . . . . . . . . . 74
3.6. Derivatives of FPSs . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
3.7. Exponentials and logarithms . . . . . . . . . . . . . . . . . . . . . 89
3.7.1. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3.7.2. The exponential and the logarithm are inverse . . . . . . . 89
3.7.3. The exponential and the logarithm of an FPS . . . . . . . 95
3.7.4. Addition to multiplication . . . . . . . . . . . . . . . . . . 97
3.7.5. The logarithmic derivative . . . . . . . . . . . . . . . . . . 102
3.8. Non-integer powers . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
3.8.1. Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
3.8.2. The Newton binomial formula for arbitrary exponents . . 108
3.8.3. Another application . . . . . . . . . . . . . . . . . . . . . . 115
3.9. Integer compositions . . . . . . . . . . . . . . . . . . . . . . . . . . 118
3.9.1. Compositions . . . . . . . . . . . . . . . . . . . . . . . . . . 118
3.9.2. Weak compositions . . . . . . . . . . . . . . . . . . . . . . . 123
3.9.3. Weak compositions with entries from {0, 1, . . . , p − 1} . . 124
3.10. $x^n$-equivalence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
3.11. Infinite products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
3.11.1. An example . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
3.11.2. A rigorous definition . . . . . . . . . . . . . . . . . . . . . . 134
3.11.3. Why $\prod_{i\in\mathbb{N}}\left(1+x^{2^i}\right)$ works and $\prod_{i\in\mathbb{N}}\left(1+ix\right)$ doesn't . . . . . 140
3.11.4. A general criterion for multipliability . . . . . . . . . . . . 141
3.11.5. $x^n$-approximators . . . . . . . . . . . . . . . . . . . . . . . 143
3.11.6. Properties of infinite products . . . . . . . . . . . . . . . . 144
3.11.7. Product rules (generalized distributive laws) . . . . . . . . 150
3.11.8. Another example . . . . . . . . . . . . . . . . . . . . . . . . 157
3.11.9. Infinite products and substitution . . . . . . . . . . . . . . 161
3.11.10. Exponentials, logarithms and infinite products . . . . . . 161
3.12. The generating function of a weighted set . . . . . . . . . . . . . . 162
3.12.1. The theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
3.12.2. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
3.12.3. Domino tilings . . . . . . . . . . . . . . . . . . . . . . . . . 170
3.13. Limits of FPSs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
3.13.1. Stabilization of scalars . . . . . . . . . . . . . . . . . . . . . 178
3.13.2. Coefficientwise stabilization of FPSs . . . . . . . . . . . . . 180
5. Permutations 269
5.1. Basic definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
5.2. Transpositions and cycles . . . . . . . . . . . . . . . . . . . . . . . 273
5.2.1. Transpositions . . . . . . . . . . . . . . . . . . . . . . . . . . 273
5.2.2. Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
5.3. Inversions, length and Lehmer codes . . . . . . . . . . . . . . . . . 276
5.3.1. Inversions and lengths . . . . . . . . . . . . . . . . . . . . . 276
5.3.2. Lehmer codes . . . . . . . . . . . . . . . . . . . . . . . . . . 279
5.3.3. More about lengths and simples . . . . . . . . . . . . . . . 286
5.4. Signs of permutations . . . . . . . . . . . . . . . . . . . . . . . . . 299
5.5. The cycle decomposition . . . . . . . . . . . . . . . . . . . . . . . . 303
1. What is this?
These are the notes for an introductory course on algebraic combinatorics held
in the Spring Quarter 2021 at Drexel University. The topics covered are
• formal power series and their use as generating functions (Chapter 3);
Most (but not all) of these chapters are in a finished state (the final few sec-
tions of Chapter 3 need details). However, they are improvable both in detail
and in coverage. They might grow in later iterations of this course (in particular,
various further topics could get included). Errors and confusions will be fixed
whenever I become aware of them (any assistance is greatly appreciated!).
Exercises of varying difficulty appear at the end of the text (Chapter A).
Acknowledgments
Thanks to the students in my Math 701 course for what was in essence an
alpha-test of these notes. Some exercises have been adapted from collections by
Richard P. Stanley, Martin Aigner, Donald Knuth, Miklós Bóna, Mark Wildon
and Igor Pak. A math.stackexchange user named Mindlack has contributed the
proof of Proposition 3.11.30. Andrew Solomon has reported typos.
2. Before we start...
2.1. What is this?
This is a course on algebraic combinatorics. This subject can be viewed either
as a continuation of enumerative combinatorics by other means (specifically, al-
gebraic ones), or as the part of algebra where one studies concrete polynomials
(more precisely, families of polynomials). For example, the Schur polynomi-
als can be viewed on the one hand as a tool for enumerating certain kinds of
tableaux (essentially, tabular arrangements of numbers that increase along rows
and columns), while on the other hand they form a family of polynomials with
a myriad surprising properties, generalizing (e.g.) the Vandermonde determi-
nant. I hope to cover both aspects of the subject to a reasonable amount in this
course.
2.2. Prerequisites
To understand this course, you are assumed to speak the language of rings
and fields (we will mostly need the basic properties of polynomials and linear
maps; we will define what we need about power series), and to have some basic
knowledge of enumerative combinatorics (see below). My notes [23wa], and
the references I gave therein, can help refresh your knowledge of the former.
As for the latter, there are dozens of sources available (I made a list at
https://math.stackexchange.com/a/1454420/ , focusing mostly on texts available
online), including my own notes [22fco].
• The symbol “#” means “number”. For example, the size | A| of a set A is
the # of elements of A.
We will need some basics from enumerative combinatorics (see, e.g., [Newste19,
§8.1] for details, and [19fco, Chapters 1 and 2] for more details):
• addition principle = sum rule: If A and B are two disjoint sets, then
| A ∪ B | = | A | + | B |.
• multiplication principle = product rule: If A and B are any two sets, then
| A × B | = | A | · | B |.
• bijection principle: There is a bijection (= bijective map = invertible map
= one-to-one correspondence) between two sets X and Y if and only if
| X | = |Y | .
• A set with n elements has $2^n$ subsets, and has $\binom{n}{k}$ size-k subsets for any
k ∈ ℕ.
• A set with n elements has n! permutations (= bijective maps from this set
to itself).
If n, k ∈ ℕ and n ≥ k, then
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}. \tag{2}$$
But this formula only applies to the case when n, k ∈ ℕ and n ≥ k. Our above
definition is more general than it. The combinatorial meaning of the binomial
coefficient $\binom{n}{k}$ (as the # of k-element subsets of a given n-element set) also
cannot be used for negative or non-integer values of n.
Example 2.3.3. Let n ∈ ℕ. Then, $\dbinom{2n}{n} = \dfrac{1\cdot 3\cdot 5\cdots (2n-1)}{n!}\cdot 2^n$.
Note that Proposition 2.3.5 really requires m ∈ ℕ. For example, 1.5 < 2 but
$\dbinom{1.5}{2} = 0.375 \neq 0$.
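The general definition of binomial coefficients (the falling-factorial formula $\binom{m}{k} = \frac{m(m-1)\cdots(m-k+1)}{k!}$) can be checked numerically; the helper `gbinom` below is mine, not part of the notes.

```python
from fractions import Fraction
from math import factorial

def gbinom(m, k):
    """General binomial coefficient m(m-1)...(m-k+1)/k!, defined for any
    (here rational) m and any integer k; it is 0 for k < 0."""
    if k < 0:
        return Fraction(0)
    prod = Fraction(1)
    for i in range(k):
        prod *= Fraction(m) - i
    return prod / factorial(k)

# binom(1.5, 2) = 0.375 != 0, so Proposition 2.3.5 genuinely needs m in N:
assert gbinom(Fraction(3, 2), 2) == Fraction(3, 8)

# For n, k in N with n >= k, the general definition agrees with n!/(k!(n-k)!):
assert gbinom(6, 2) == factorial(6) // (factorial(2) * factorial(4))
```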
Yet another useful property of the binomial coefficients is the following ([19fco,
Theorem 1.3.11]):
3. Generating functions
In this first chapter, we will discuss generating functions: first informally, then
on a rigorous footing. You may have seen generating functions already, as their
usefulness extends far beyond combinatorics; but they are so important to this
course that they are worth covering twice in case of doubt.
Rigorous introductions to generating functions (and formal power series in
general) can also be found in [Loehr11, Chapter 7 (in the 1st edition)], in
[Henric74, Chapter 1], in [Sambal22], and (to some extent) in [19s, Chapter
7].3 A quick overview is given in [Niven69], and many applications are found
in [Wilf09]. There are furthermore numerous books that explore enumerative
combinatorics through the lens of generating functions ([GouJac83], [Wagner08],
[Lando03] and others).
3 Bourbaki’s[Bourba03, §IV.4] contains what might be the most rigorous and honest treatment
of formal power series available in the literature; however, it is not the most readable source,
as the notation is dense and heavily relies on other volumes by the same author.
3.1. Examples
Let me first show what generating functions are good for. Then, starting in the
next section, I will explain how to rigorously define them. For now, I will work
informally; please suspend your disbelief until the next section.
The idea behind generating functions is easy: Any sequence ( a0 , a1 , a2 , . . .) of
numbers gives rise to a “power series” a0 + a1 x + a2 x2 + · · · , which is called
the generating function of this sequence. This “power series” is an infinite sum
(an “infinite polynomial” in an indeterminate x), so it is not immediately clear
what it means and what we are allowed to do with it; but before we answer
such questions, let us first play around with these power series and hope for
the best. The following four examples show how they can be useful.
$$\frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots. \tag{5}$$
Indeed, this follows by observing that
$$(1-x)\left(1 + x + x^2 + x^3 + \cdots\right) = \left(1 + x + x^2 + x^3 + \cdots\right) - x\left(1 + x + x^2 + x^3 + \cdots\right)$$
$$= \left(1 + x + x^2 + x^3 + \cdots\right) - \left(x + x^2 + x^3 + x^4 + \cdots\right) = 1$$
(again, we are hoping that these manipulations of infinite sums are allowed).
Note that the equality (5) is a version of the geometric series formula familiar from
real analysis. Now, for any α ∈ C, we can substitute αx for x in the equality (5),
and thus obtain
$$\frac{1}{1-\alpha x} = 1 + \alpha x + (\alpha x)^2 + (\alpha x)^3 + \cdots = 1 + \alpha x + \alpha^2 x^2 + \alpha^3 x^3 + \cdots. \tag{6}$$
$$F(x) = \frac{1}{\sqrt{5}}\cdot\frac{1}{1-\varphi_+ x} - \frac{1}{\sqrt{5}}\cdot\frac{1}{1-\varphi_- x}$$
$$= \frac{1}{\sqrt{5}}\cdot\left(1 + \varphi_+ x + \varphi_+^2 x^2 + \varphi_+^3 x^3 + \cdots\right) - \frac{1}{\sqrt{5}}\cdot\left(1 + \varphi_- x + \varphi_-^2 x^2 + \varphi_-^3 x^3 + \cdots\right)$$
(by (6), applied to $\alpha = \varphi_+$ and again to $\alpha = \varphi_-$)
$$= \frac{1}{\sqrt{5}}\sum_{k\geq 0}\varphi_+^k x^k - \frac{1}{\sqrt{5}}\sum_{k\geq 0}\varphi_-^k x^k = \sum_{k\geq 0}\left(\frac{1}{\sqrt{5}}\cdot\varphi_+^k - \frac{1}{\sqrt{5}}\cdot\varphi_-^k\right) x^k.$$
Now, for any given n ∈ ℕ, the coefficient of $x^n$ in the power series on the
left hand side of this equality is $f_n$ (since $F(x) = f_0 + f_1 x + f_2 x^2 + f_3 x^3 + \cdots$),
whereas the coefficient of $x^n$ on the right hand side is clearly $\frac{1}{\sqrt{5}}\cdot\varphi_+^n - \frac{1}{\sqrt{5}}\cdot\varphi_-^n$.
Thus, comparing coefficients before $x^n$, we obtain
$$f_n = \frac{1}{\sqrt{5}}\cdot\varphi_+^n - \frac{1}{\sqrt{5}}\cdot\varphi_-^n$$
(assuming that "comparing coefficients" is allowed, i.e., that equal power series really have equal coefficients)
$$= \frac{1}{\sqrt{5}}\cdot\left(\frac{1+\sqrt{5}}{2}\right)^n - \frac{1}{\sqrt{5}}\cdot\left(\frac{1-\sqrt{5}}{2}\right)^n$$
The names “NE-steps” and “SE-steps” in the definition of a Dyck path refer
to compass directions: If we treat the Cartesian plane as a map with the x-axis
directed eastwards and the y-axis directed northwards, then an NE-step moves
to the northeast, and an SE-step moves to the southeast.
Note that any NE-step and any SE-step increases the x-coordinate by 1 (that
is, the step goes from a point with x-coordinate k to a point with x-coordinate
k + 1). Thus, any Dyck path from (0, 0) to (2n, 0) has precisely 2n steps. Of these
2n steps, exactly n are NE-steps while the remaining n are SE-steps (because
any NE-step increases the y-coordinate by 1, while any SE-step decreases the
y-coordinate by 1). Since a Dyck path must never fall below the x-axis, we see
that the number of SE-steps up to any given point can never be larger than the
number of NE-steps up to this point. But this is exactly the condition (7) from
the definition of a Dyck word, except that we are talking about NE-steps and
SE-steps instead of 1’s and 0’s. Thus, there is a simple bijection between Dyck
words of length 2n and Dyck paths (0, 0) → (2n, 0):
• send each 1 in the Dyck word to a NE-step in the Dyck path;
• send each 0 in the Dyck word to a SE-step in the Dyck path.
So the # of Dyck words (of length 2n) equals the # of Dyck paths (from (0, 0)
to (2n, 0)). But what is this number?
Example: For n = 3, this number is 5. Indeed, the five Dyck paths from
(0, 0) to (6, 0) correspond to the Dyck words
(1, 1, 0, 0, 1, 0), (1, 1, 1, 0, 0, 0), (1, 0, 1, 0, 1, 0), (1, 0, 1, 1, 0, 0), (1, 1, 0, 1, 0, 0).
(We will soon stop writing the commas and parentheses when writing down
words. For example, the word (1, 1, 0, 0, 1, 0) will just become 110010.)
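A brute-force enumeration (a Python sketch of mine, not part of the notes) confirms the count of Dyck words for small n:

```python
from itertools import product

def is_dyck(word):
    """Condition (7): every prefix of the word has at least as many 1's as
    0's, and the whole word has equally many of both."""
    height = 0
    for letter in word:
        height += 1 if letter == 1 else -1
        if height < 0:
            return False
    return height == 0

def dyck_words(n):
    return [w for w in product((0, 1), repeat=2 * n) if is_dyck(w)]

# For n = 3 there are exactly the 5 words listed above:
assert len(dyck_words(3)) == 5
assert (1, 1, 0, 0, 1, 0) in dyck_words(3)

# The counts for n = 0, 1, 2, 3, 4 are 1, 1, 2, 5, 14:
assert [len(dyck_words(n)) for n in range(5)] == [1, 1, 2, 5, 14]
```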
Back to the general question.
For each n ∈ ℕ, we define
$$c_n := (\text{\# of Dyck paths from } (0, 0) \text{ to } (2n, 0)).$$
Then, c0 = 1 (since the only Dyck path from (0, 0) to (0, 0) is the trivial path)
and c1 = 1 and c2 = 2 and c3 = 5 and c4 = 14 and so on. These numbers cn
are known as the Catalan numbers. Entire books have been written about them,
such as [Stanle15].
Let us first find a recurrence relation for cn . The argument below is best
understood by following an example; namely, consider the following Dyck path
from (0, 0) to (16, 0) (so the corresponding n is 8):
Fix a positive integer n. If D is a Dyck path from (0, 0) to (2n, 0), then the
first return of D (this is short for “first return of D to the x-axis”) shall mean
the first point on D that lies on the x-axis but is not the origin (i.e., that has the
form (i, 0) for some integer i > 0). For instance, in the example that we just
gave, the first return is the point (6, 0). If D is a Dyck path from (0, 0) to (2n, 0),
and if (i, 0) is its first return, then i is even⁴, and therefore we have i = 2k for
some k ∈ {1, 2, . . . , n}. Hence, for any Dyck path from (0, 0) to (2n, 0), the first
return is a point of the form (2k, 0) for some k ∈ {1, 2, . . . , n}. Thus,
$$c_n = \sum_{k=1}^{n} \left(\text{\# of Dyck paths from } (0,0) \text{ to } (2n,0) \text{ whose first return is } (2k,0)\right).$$
Now, let us fix some k ∈ {1, 2, . . . , n}. We shall compute the # of Dyck paths
from (0, 0) to (2n, 0) whose first return is (2k, 0). Any such Dyck path has a
natural “two-part” structure: Its first 2k steps form a path from (0, 0) to (2k, 0),
while its last (i.e., remaining) 2 (n − k ) steps form a path from (2k, 0) to (2n, 0).
Thus, in order to construct such a path, we
4 Proof. The number of NE-steps before the first return must equal the number of SE-steps
before the first return (because these steps have altogether taken us from the origin to a
point on the x-axis, and thus must have increased and decreased the y-coordinate an equal
number of times). This shows that the total number of steps before the first return is even.
In other words, i is even (because the total number of steps before the first return is i).
• first choose its first 2k steps: They have to form a Dyck path from (0, 0) to
(2k, 0) that never returns to the x-axis until (2k, 0). Hence, they begin with
a NE-step and end with a SE-step (since any other steps here would cause
the path to fall below the x-axis). Between these two steps, the remaining
2k − 2 = 2 (k − 1) steps form a path that not only never falls below the
x-axis, but also never touches it (since (2k, 0) is the first return of our
Dyck path, so that our Dyck path does not touch the x-axis between (0, 0)
and (2k, 0)). In other words, these 2 (k − 1) steps form a path from (1, 1)
to (2k − 1, 1) that never falls below the y = 1 line (i.e., below the x-axis
shifted by 1 upwards). This means that it is a Dyck path from (0, 0) to
(2 (k − 1) , 0) (shifted by (1, 1)). Thus, there are ck−1 possibilities for this
path. Hence, there are ck−1 choices for the first 2k steps of our Dyck path.
• then choose its last 2 (n − k ) steps: They have to form a path from (2k, 0)
to (2 (n − k ) , 0) that never falls below the x-axis (but is allowed to touch
it any number of times). Thus, they form a Dyck path from (0, 0) to
(2 (n − k) , 0) (shifted by (2k, 0)). So there are cn−k choices for these last
2 (n − k ) steps.
Thus, there are ck−1 cn−k many options for such a Dyck path from (0, 0) to
(2n, 0) (since choosing the first 2k steps and choosing the last 2 (n − k) steps are
independent).
Let me illustrate this reasoning on the Dyck path from (0, 0) to (16, 0) shown
above. This Dyck path has first return (6, 0); thus, the corresponding k is 3.
Since this Dyck path does not return to the x-axis before (2k, 0) = (6, 0), its first
2k steps stay above (or on) the yellow trapezoid shown here:
In particular, the first and the last of these 2k steps are uniquely determined,
while the steps between them form a diagonally shifted Dyck path that is filled
in green here:
Finally, the last 2 (n − k) steps form a horizontally shifted Dyck path that is
filled in purple here:
Our above argument shows that there are ck−1 choices for the green Dyck path
and cn−k choices for the purple Dyck path, therefore ck−1 cn−k options in total.
Forget that we fixed k. Our counting argument above shows that
$$\left(\text{\# of Dyck paths from } (0,0) \text{ to } (2n,0) \text{ whose first return is } (2k,0)\right) = c_{k-1}\,c_{n-k} \tag{8}$$
Thus,
$$C(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots$$
$$= 1 + (c_0 c_0) x + (c_0 c_1 + c_1 c_0) x^2 + (c_0 c_2 + c_1 c_1 + c_2 c_0) x^3 + \cdots$$
(since $c_0 = 1$ and $c_n = c_0 c_{n-1} + c_1 c_{n-2} + c_2 c_{n-3} + \cdots + c_{n-1} c_0$ for each n > 0)
$$= 1 + x\left((c_0 c_0) + (c_0 c_1 + c_1 c_0) x + (c_0 c_2 + c_1 c_1 + c_2 c_0) x^2 + \cdots\right)$$
$$= 1 + x\left(c_0 + c_1 x + c_2 x^2 + \cdots\right)^2$$
(because if we multiply out $\left(c_0 + c_1 x + c_2 x^2 + \cdots\right)^2$ and collect like powers of x, we obtain
exactly $(c_0 c_0) + (c_0 c_1 + c_1 c_0) x + (c_0 c_2 + c_1 c_1 + c_2 c_0) x^2 + \cdots$)
$$= 1 + x\,(C(x))^2.$$
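The first-return recurrence and the functional equation $C(x) = 1 + x(C(x))^2$ can be sanity-checked with truncated coefficient lists; the following Python sketch (mine, not from the notes) does so for the first 12 coefficients.

```python
N = 12  # truncation order (arbitrary)

# Catalan numbers via the recurrence c_0 = 1, c_n = sum_{k=1}^{n} c_{k-1} c_{n-k}:
c = [1]
for n in range(1, N):
    c.append(sum(c[k - 1] * c[n - k] for k in range(1, n + 1)))

def mul(a, b):
    """Product of two power series truncated to N coefficients."""
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(N)]

# Coefficients of 1 + x * C(x)^2: multiplying by x shifts C(x)^2 by one slot.
C_squared = mul(c, c)
rhs = [1] + C_squared[: N - 1]
assert rhs == c  # C(x) = 1 + x * C(x)^2, coefficientwise
```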
Let us pretend that the formula (11) holds not only for n ∈ ℕ, but also for
n = 1/2. That is, we have
$$(1+x)^{1/2} = \sum_{k\geq 0}\binom{1/2}{k} x^k. \tag{12}$$
Now, substitute −4x for x in this equality (here we are making the rather plau-
sible assumption that we can substitute −4x for x in a power series); then, we
get
$$(1-4x)^{1/2} = \sum_{k\geq 0}\binom{1/2}{k}(-4x)^k = \sum_{k\geq 0}\binom{1/2}{k}(-4)^k x^k$$
$$= \underbrace{\binom{1/2}{0}}_{=1}\underbrace{(-4)^0}_{=1} x^0 + \sum_{k\geq 1}\binom{1/2}{k}(-4)^k x^k = 1 + \sum_{k\geq 1}\binom{1/2}{k}(-4)^k x^k.$$
Hence,
$$1 - (1-4x)^{1/2} = 1 - \left(1 + \sum_{k\geq 1}\binom{1/2}{k}(-4)^k x^k\right) = -\sum_{k\geq 1}\binom{1/2}{k}(-4)^k x^k.$$
The definition of $\dbinom{1/2}{n+1}$ yields
$$\binom{1/2}{n+1} = \frac{(1/2)\,(1/2-1)\,(1/2-2)\cdots(1/2-n)}{(n+1)!} = \frac{\dfrac{1}{2}\cdot\dfrac{-1}{2}\cdot\dfrac{-3}{2}\cdot\dfrac{-5}{2}\cdots\dfrac{-(2n-1)}{2}}{(n+1)!}$$
$$= \frac{\left(1\cdot(-1)\cdot(-3)\cdot(-5)\cdots(-(2n-1))\right)/2^{n+1}}{(n+1)!} = \frac{\left((-1)\cdot(-3)\cdot(-5)\cdots(-(2n-1))\right)/2^{n+1}}{(n+1)!}$$
$$= \frac{(-1)^n\left(1\cdot 3\cdot 5\cdots(2n-1)\right)/2^{n+1}}{(n+1)!}.$$
Thus, (13) rewrites as
$$c_n = \frac{(-1)^n\left(1\cdot 3\cdot 5\cdots(2n-1)\right)/2^{n+1}}{(n+1)!}\cdot 2(-4)^n = \frac{1\cdot 3\cdot 5\cdots(2n-1)}{(n+1)!}\cdot\underbrace{\frac{(-1)^n\cdot 2(-4)^n}{2^{n+1}}}_{=2^n}$$
$$= \frac{1}{n+1}\cdot\underbrace{\frac{1\cdot 3\cdot 5\cdots(2n-1)}{n!}\cdot 2^n}_{=\binom{2n}{n}\text{ (by Example 2.3.3)}} \qquad\left(\text{since } (n+1)! = (n+1)\cdot n!\right)$$
$$= \frac{1}{n+1}\binom{2n}{n}.$$
Here is the upshot: The # of Dyck words of length 2n is $c_n = \frac{1}{n+1}\binom{2n}{n}$. In
other words, a 2n-tuple that consists of n entries equal to 0 and n entries equal
to 1 (chosen uniformly at random) is a Dyck word with probability $\frac{1}{n+1}$.
(There are also combinatorial ways to prove this; see, e.g., [GrKnPa94, §7.5,
discussion at the end of Example 4] or [Stanle15, §1.6] or [Martin13] or [22fco,
Lecture 29, Theorem 5.6.7] or [Loehr11, Theorem 1.56]⁷ or [Spivey19, §8.5,
proofs of Identity 244]⁸.)
Here is a table of the first 12 Catalan numbers $c_n$:

n:    0  1  2  3  4   5   6    7    8     9     10     11
c_n:  1  1  2  5  14  42  132  429  1430  4862  16796  58786
7 Note that the "Dyck paths" in [Loehr11] differ from ours in that they use N-steps (i.e., steps
(i, j) ↦ (i, j + 1)) and E-steps (i.e., steps (i, j) ↦ (i + 1, j)) instead of NE-steps and SE-steps,
and stay above the x = y line instead of above the x-axis. But this notion of Dyck paths is
equivalent to ours, since a clockwise rotation by 45° followed by a $\sqrt{2}$-homothety transforms
it into ours.
8 Again, [Spivey19] works not directly with Dyck paths, but rather with paths that use E-steps
(i.e., steps (i, j) ↦ (i + 1, j)) and N-steps (i.e., steps (i, j) ↦ (i, j + 1)) instead of NE-steps
and SE-steps, and stay below the x = y line instead of above the x-axis. But this kind of
Dyck paths is equivalent to our Dyck paths, since a reflection across the x = y line, followed
by a clockwise rotation by 45° followed by a $\sqrt{2}$-homothety, transforms it into ours.
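The closed form and the probability $\frac{1}{n+1}$ can be double-checked in a few lines of Python (my sketch, not part of the notes):

```python
from fractions import Fraction
from math import comb

def catalan(n):
    """Closed form c_n = binom(2n, n)/(n + 1); the division is always exact."""
    return comb(2 * n, n) // (n + 1)

# Matches the table of the first 12 Catalan numbers:
expected = [1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796, 58786]
assert [catalan(n) for n in range(12)] == expected

# Equivalent restatement: a uniformly random 2n-tuple with n zeros and
# n ones is a Dyck word with probability 1/(n + 1):
assert all(Fraction(catalan(n), comb(2 * n, n)) == Fraction(1, n + 1)
           for n in range(12))
```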
Thus, the first entries of this sequence are 1, 2, 5, 12, 27, 58, 121, . . .. This se-
quence appears in the OEIS (= Online Encyclopedia of Integer Sequences) as
A000325, with index shifted.
Can we find an explicit formula for an (without looking it up in the OEIS)?
Again, generating functions are helpful. Set
A ( x ) = a0 + a1 x + a2 x 2 + a3 x 3 + · · · .
Then,
$$A(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots$$
$$= 1 + (2a_0 + 0) x + (2a_1 + 1) x^2 + (2a_2 + 2) x^3 + \cdots$$
(since $a_0 = 1$ and $a_{n+1} = 2a_n + n$ for all n ≥ 0)
$$= 1 + 2\underbrace{\left(a_0 x + a_1 x^2 + a_2 x^3 + \cdots\right)}_{=xA(x)} + \underbrace{\left(0x + 1x^2 + 2x^3 + \cdots\right)}_{=x\left(0 + 1x + 2x^2 + 3x^3 + \cdots\right)}$$
$$= 1 + 2xA(x) + x\left(0 + 1x + 2x^2 + 3x^3 + \cdots\right). \tag{16}$$
Hence,
$$1 + 2x + 3x^2 + 4x^3 + \cdots = \left(1 + x + x^2 + x^3 + \cdots\right)' = \left(\frac{1}{1-x}\right)'$$
(since (5) yields $1 + x + x^2 + x^3 + \cdots = \frac{1}{1-x}$). Using the quotient rule, we can
easily find that $\left(\frac{1}{1-x}\right)' = \frac{1}{(1-x)^2}$, so that
$$1 + 2x + 3x^2 + 4x^3 + \cdots = \left(\frac{1}{1-x}\right)' = \frac{1}{(1-x)^2}. \tag{17}$$
The left hand side of this looks very similar to the power series 0 + 1x + 2x2 +
3x3 + · · · that we want to simplify. And indeed, we have the following:
$$0 + 1x + 2x^2 + 3x^3 + \cdots = x\underbrace{\left(1 + 2x + 3x^2 + 4x^3 + \cdots\right)}_{=\frac{1}{(1-x)^2}} = x\cdot\frac{1}{(1-x)^2} = \frac{x}{(1-x)^2}. \tag{18}$$
Alternatively, the same sum can be computed by rewriting $0 + 1x + 2x^2 + 3x^3 + \cdots = \sum_{k\geq 0} k x^k$ as a double sum and interchanging the order of summation:
$$\sum_{k\geq 0} k x^k = \sum_{k\geq 0}\sum_{i=1}^{k} x^k = \sum_{i\geq 1}\underbrace{\sum_{k\geq i} x^k}_{=x^i\left(1+x+x^2+x^3+\cdots\right)} = \sum_{i\geq 1} x^i\cdot\frac{1}{1-x} \qquad(\text{by (5)})$$
$$= \frac{1}{1-x}\underbrace{\sum_{i\geq 1} x^i}_{\substack{=x\left(1+x+x^2+\cdots\right)\\ =x\cdot\frac{1}{1-x}\text{ (by (5))}}} = \frac{1}{1-x}\cdot x\cdot\frac{1}{1-x} = \frac{x}{(1-x)^2}. \tag{19}$$
(We used some unstated assumptions here about infinite sums – specifically,
we assumed that we can rearrange them without worrying about absolute con-
vergence or similar issues – but we will later see that these assumptions are
well justified. Besides, we obtained the same result as by our first way, which
is reassuring.)
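Both derivations of $\frac{x}{(1-x)^2}$ can be verified on truncated coefficient lists (a Python sketch of mine, not from the notes):

```python
N = 10  # truncation order (arbitrary)

# 1/(1-x) has coefficient list (1, 1, 1, ...) by the geometric series (5).
geom = [1] * N

def mul(a, b):
    """Product of two power series truncated to N coefficients."""
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(N)]

# 1/(1-x)^2 should have coefficients 1, 2, 3, 4, ... as in (17):
sq = mul(geom, geom)
assert sq == list(range(1, N + 1))

# Multiplying by x shifts the coefficients, giving x/(1-x)^2
# = 0 + 1x + 2x^2 + 3x^3 + ... as in (18) and (19):
shifted = [0] + sq[: N - 1]
assert shifted == list(range(N))
```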
Having computed $0 + 1x + 2x^2 + 3x^3 + \cdots$, we can now simplify (16), obtaining
$$A(x) = 1 + 2xA(x) + x\underbrace{\left(0 + 1x + 2x^2 + 3x^3 + \cdots\right)}_{=\frac{x}{(1-x)^2}} = 1 + 2xA(x) + x\cdot\frac{x}{(1-x)^2}.$$
$$= \sum_{k\geq 0} 2^{k+1} x^k - \sum_{k\geq 0} (k+1) x^k = \sum_{k\geq 0}\left(2^{k+1} - (k+1)\right) x^k.$$
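The resulting closed form $a_n = 2^{n+1} - (n+1)$ can be checked against the recurrence directly (a Python sketch of mine, not part of the notes):

```python
# The recurrence a_0 = 1, a_{n+1} = 2 a_n + n ...
a = [1]
for n in range(20):
    a.append(2 * a[n] + n)

# ... reproduces the first entries 1, 2, 5, 12, 27, 58, 121, ...
assert a[:7] == [1, 2, 5, 12, 27, 58, 121]

# ... and matches the closed form a_n = 2^{n+1} - (n + 1) read off
# from the final coefficient formula.
assert all(a[n] == 2 ** (n + 1) - (n + 1) for n in range(len(a)))
```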
3.2. Definitions
The four examples above should have convinced you that generating functions
can be useful. Thus, it is worthwhile to put them on a rigorous footing by
first defining generating functions and then justifying the manipulations we
have been doing to them in the previous section (e.g., dividing them, solving
quadratic equations, taking infinite sums, taking derivatives, ...). We are next
going to sketch how this can be done (see [Loehr11, Chapter 7 (in the 1st edi-
tion)] and [19s, Chapter 7] for some details).
First things first: Generating functions are not actually functions. They are
so-called formal power series (short FPSs). Roughly speaking, a formal power
series is a “formal” infinite sum of the form a0 + a1 x + a2 x2 + · · · , where x
is an “indeterminate” (we shall soon see what this all means). You cannot
substitute x = 2 into such a power series. (For example, substituting x = 2
into $\frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots$ would lead to the absurd equality
$\frac{1}{-1} = 1 + 2 + 4 + 8 + 16 + \cdots$.) The word "function" in "generating function" is
somewhat of a historical artifact.
We refer to any textbook on abstract algebra for more details, but we recall
the definition for the sake of completeness.
Informally, a commutative ring is a set K equipped with binary operations
⊕, ⊖ and ⊙ and elements 0 and 1 that “behave” like addition, subtraction
and multiplication (of numbers) and the numbers 0 and 1, respectively. For
example, they should satisfy rules like ( a ⊕ b) ⊙ c = ( a ⊙ c) ⊕ (b ⊙ c).
Formally, commutative rings are defined as follows:
Definition 3.2.1. A commutative ring means a set K equipped with three maps
⊕ : K × K → K,
⊖ : K × K → K,
⊙ : K×K → K
7. Distributivity: We have
a ⊙ (b ⊕ c) = ( a ⊙ b) ⊕ ( a ⊙ c) and ( a ⊕ b) ⊙ c = ( a ⊙ c) ⊕ (b ⊙ c)
for all a, b, c ∈ K.
The operations ⊕, ⊖ and ⊙ are called the addition, the subtraction and the
multiplication of the ring K. This does not imply that they have any connection
with the usual addition, subtraction and multiplication of numbers; it merely
means that they play similar roles to the latter and behave similarly. When
confusion is unlikely, we will denote these three operations ⊕, ⊖ and ⊙ by
+, − and ·, respectively, and we will abbreviate a ⊙ b = a · b by ab.
The elements 0 and 1 are called the zero and the unity (or the one) of the
ring K. Again, this does not imply that they equal the numbers 0 and 1, but
merely that they play analogous roles. We will simply call these elements 0
and 1 when confusion with the corresponding numbers is unlikely.
We will use PEMDAS conventions for the three operations ⊕, ⊖ and ⊙.
These imply that the operation ⊙ has higher precedence than ⊕ and ⊖, while
the operations ⊕ and ⊖ are left-associative. Thus, for example, “ab + ac”
means ( ab) + ( ac) (that is, ( a ⊙ b) ⊕ ( a ⊙ c)). Likewise, “a − b + c” means
( a − b) + c = ( a ⊖ b) ⊕ c.
• The sets Z, Q, R and C are commutative rings. (Of course, the operations
⊕, ⊖ and ⊙ of these rings are just the usual operations +, − and · known
from high school.)
• The set N is not a commutative ring, since it has no subtraction. (It is,
however, something called a commutative semiring.)
• The matrix ring Qm×m (this is the ring of all m × m-matrices with rational
entries) is not a commutative ring for m > 1 (because it fails the “commu-
tativity of multiplication” axiom). However, it satisfies all axioms other
than “commutativity of multiplication”. This makes it a noncommutative
ring.
• The set
$$\mathbb{Z}\left[\sqrt{5}\right] = \left\{ a + b\sqrt{5} \mid a, b \in \mathbb{Z} \right\}$$
is a commutative ring with operations +, − and · inherited from ℝ. This
is because any a, b, c, d ∈ ℤ satisfy
$$\left(a + b\sqrt{5}\right) + \left(c + d\sqrt{5}\right) = (a+c) + (b+d)\sqrt{5} \in \mathbb{Z}\left[\sqrt{5}\right];$$
$$\left(a + b\sqrt{5}\right) - \left(c + d\sqrt{5}\right) = (a-c) + (b-d)\sqrt{5} \in \mathbb{Z}\left[\sqrt{5}\right];$$
$$\left(a + b\sqrt{5}\right)\left(c + d\sqrt{5}\right) = (ac + 5bd) + (ad + bc)\sqrt{5} \in \mathbb{Z}\left[\sqrt{5}\right].$$
$$\overline{a} + \overline{b} = \overline{a+b}, \qquad \overline{a} - \overline{b} = \overline{a-b}, \qquad \overline{a}\cdot\overline{b} = \overline{ab}.$$
If m > 0, then this ring Z/m is finite and has size m. It is also known
as Z/mZ or Zm (careful with the latter notation; it can mean different
things to different people). When m is prime, the ring Z/m is actually a
finite field and is called Fm . (But, e.g., the ring Z/4 is not a field, and not
the same as F4 .)
• In the examples we have seen so far, the elements of the commutative ring
either are numbers or (as in the case of matrices or residue classes) consist
of numbers. For a contrast, here is an example where they are sets:
For any two sets X and Y, we define the symmetric difference X △ Y of X
and Y to be the set
( X ∪ Y ) \ ( X ∩ Y ) = ( X \ Y ) ∪ (Y \ X )
= {all elements that belong to exactly one of X and Y } .
Fix a set S. Consider the power set P (S) of S (that is, the set of all
subsets of S). This power set P (S) is a commutative ring if we equip
it with the operation △ as addition (that is, X ⊕ Y = X △ Y for any
subsets X and Y of S), with the same operation △ as subtraction (that
is, X ⊖ Y = X △ Y), and with the operation ∩ as multiplication (that
is, X ⊙ Y = X ∩ Y), and with the elements ∅ and S as zero and unity
(that is, with 0 = ∅ and 1 = S). Indeed, it is straightforward to see
that all the axioms in Definition 3.2.1 hold for this ring. (For example,
distributivity holds because any three sets A, B, C satisfy A ∩ ( B △ C ) =
( A ∩ B) △ ( A ∩ C ) and ( A △ B) ∩ C = ( A ∩ C ) △ ( B ∩ C ).) This is an
example of a Boolean ring (i.e., a ring in which aa = a for each element a
of the ring).
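The ring axioms for this power-set example can be checked exhaustively on a small S (a Python sketch of mine; the choice S = {1, 2, 3} is arbitrary):

```python
from itertools import combinations, product

# The power set of a small set S.
S = frozenset({1, 2, 3})
subsets = [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

add = lambda X, Y: X ^ Y  # symmetric difference = ring addition
mul = lambda X, Y: X & Y  # intersection = ring multiplication

for A, B, C in product(subsets, repeat=3):
    # distributivity: A ∩ (B △ C) = (A ∩ B) △ (A ∩ C)
    assert mul(A, add(B, C)) == add(mul(A, B), mul(A, C))

for A in subsets:
    assert add(A, frozenset()) == A  # zero is the empty set
    assert mul(A, S) == A            # unity is S itself
    assert add(A, A) == frozenset()  # every element is its own negative
    assert mul(A, A) == A            # the Boolean ring property aa = a
```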
and
• You can compute finite sums (of elements of K) without specifying the
order of summation or the placement of parentheses. For example, for
any a, b, c, d, e ∈ K, we have
(( a + b) + (c + d)) + e = ( a + (b + c)) + (d + e) ,
a + b + c + d + e = d + b + a + e + c.
(See [Grinbe15, §2.14] for very detailed proofs⁹. Alternatively, you can
treat them as exercises on induction.)
If S = ∅, then $\sum_{s\in S} a_s = 0$ by definition. Such a sum $\sum_{s\in S} a_s$ with S = ∅ is
called an empty sum.
• The same holds for finite products. If S = ∅, then $\prod_{s\in S} a_s = 1$ by definition.
• If a ∈ K, then −a denotes 0 ⊖ a = 0 − a ∈ K.
• If n ∈ ℤ and a ∈ K, then we can define an element na ∈ K by
$$na = \begin{cases} \underbrace{a + a + \cdots + a}_{n\text{ times}}, & \text{if } n \geq 0; \\[2ex] -\left(\underbrace{a + a + \cdots + a}_{-n\text{ times}}\right), & \text{if } n < 0. \end{cases}$$
In particular, $a^0 = \underbrace{a a \cdots a}_{0\text{ times}} = 1$ (since an empty product is always 1).
⊕ : M × M → M,
⊖ : M × M → M,
⇀ : K × M → M
(notice that the third map has domain K × M, not M × M) and an element
$\overrightarrow{0}$ ∈ M satisfying the following axioms:
This all having been said, we can now define formal power series.
Definition 3.2.4. A formal power series (or, short, FPS) in 1 indeterminate over
K means a sequence ( a0 , a1 , a2 , . . .) = ( an )n∈N ∈ KN of elements of K.
(a) The sum of two FPSs a = (a_0, a_1, a_2, . . .) and b = (b_0, b_1, b_2, . . .) is
defined to be the FPS
$$(a_0 + b_0,\ a_1 + b_1,\ a_2 + b_2,\ \ldots).$$
It is denoted by a + b.
(b) The difference of two FPSs a = ( a0 , a1 , a2 , . . .) and b = (b0 , b1 , b2 , . . .) is
defined to be the FPS
( a0 − b0 , a1 − b1 , a2 − b2 , . . .) .
It is denoted by a − b.
(c) If λ ∈ K and if a = ( a0 , a1 , a2 , . . .) is an FPS, then we define an FPS
(b) Furthermore, K [[ x ]] is a K-module (with its scaling being the map that
sends each (λ, a) ∈ K × K [[ x ]] to the FPS λa defined in Definition 3.2.5 (c)).
Its zero vector is 0. Concretely, this means that:
(Also, some of the facts from part (a) are included in this statement.)
(c) We have λ (a · b) = (λa) · b = a · (λb) for all λ ∈ K and a, b ∈ K [[ x ]].
(d) Finally, we have λa = λ · a for all λ ∈ K and a ∈ K [[ x ]].
• Powers exist: that is, you can take an for each FPS a and each n ∈ N.
• Standard rules hold: e.g., we have an+m = an am and (ab)n = an bn for any
a, b ∈ K [[ x ]] and any n, m ∈ N.
If a = (a_0, a_1, a_2, . . .) ∈ K[[x]] is an FPS and n ∈ N, then the n-th coefficient of a is denoted by [x^n] a; that is,

[x^n] a := a_n .
Thus, the definition of the sum of two FPSs (Definition 3.2.5 (a)) rewrites as
follows: For any a, b ∈ K [[ x ]] and any n ∈ N, we have
[ x n ] (a + b) = [ x n ] a + [ x n ] b. (20)
Similarly, the definition of the difference of two FPSs (Definition 3.2.5 (b)) rewrites as follows: For any a, b ∈ K[[x]] and any n ∈ N, we have

[x^n] (a − b) = [x^n] a − [x^n] b. (21)
Meanwhile, the definition of the product of two FPSs (Definition 3.2.5 (d))
rewrites as follows: For any a, b ∈ K [[ x ]] and any n ∈ N, we have
[x^n] (ab)
= [x^0] a · [x^n] b + [x^1] a · [x^{n−1}] b + [x^2] a · [x^{n−2}] b + · · · + [x^n] a · [x^0] b
= ∑_{i=0}^{n} [x^i] a · [x^{n−i}] b (22)
= ∑_{j=0}^{n} [x^{n−j}] a · [x^j] b. (23)
Proof of Theorem 3.2.6. Most parts of Theorem 3.2.6 are straightforward to verify.
Let us check the associativity of multiplication.
Let a, b, c ∈ K [[ x ]]. We must prove that a (bc) = (ab) c. Let n ∈ N. Consider
the two equalities
[x^n] ((ab) c) = ∑_{j=0}^{n} [x^{n−j}] (ab) · [x^j] c   (by (23), applied to ab and c instead of a and b)
= ∑_{j=0}^{n} ( ∑_{i=0}^{n−j} [x^i] a · [x^{n−j−i}] b ) · [x^j] c   (by (22), applied to n−j instead of n)
= ∑_{j=0}^{n} ∑_{i=0}^{n−j} [x^i] a · [x^{n−j−i}] b · [x^j] c
and
[x^n] (a (bc)) = ∑_{i=0}^{n} [x^i] a · [x^{n−i}] (bc)   (by (22), applied to bc instead of b)
= ∑_{i=0}^{n} [x^i] a · ( ∑_{j=0}^{n−i} [x^{n−i−j}] b · [x^j] c )   (by (23), applied to n−i, b and c instead of n, a and b)
= ∑_{i=0}^{n} ∑_{j=0}^{n−i} [x^i] a · [x^{n−i−j}] b · [x^j] c.
The right hand sides of these two equalities are equal, since¹⁰

∑_{j=0}^{n} ∑_{i=0}^{n−j} = ∑_{(i,j)∈N²; i+j≤n} = ∑_{i=0}^{n} ∑_{j=0}^{n−i}

¹⁰The first equality we are about to state is an equality of summation signs. Such an equality
means that whatever you put inside the summation signs, they produce equal results. For
example, ∑_{i∈{1,2,3}} = ∑_{i=1}^{3} and ∑_{i∈N} = ∑_{i≥0}.
and n − j − i = n − i − j. Thus, their left hand sides are equal as well. In other
words,
[ x n ] ((ab) c) = [ x n ] (a (bc)) .
Now, forget that we fixed n. We thus have shown that [ x n ] ((ab) c) = [ x n ] (a (bc))
for each n ∈ N. In other words, each entry of (ab) c equals the corresponding
entry of a (bc). This entails (ab) c = a (bc) (since an FPS is just the sequence
of its entries). In other words, a (bc) = (ab) c. This concludes the proof of
associativity of multiplication.
The remaining claims of Theorem 3.2.6 are LTTR11 (their proofs follow the
same pattern, but are easier to execute).
Since K [[ x ]] is a commutative ring, any finite sum of FPSs is well-defined.
Sometimes, however, infinite sums of FPSs make sense as well: for example, it
stands to reason that
(1, 1, 1, 1, . . .)
+ (0, 1, 1, 1, . . .)
+ (0, 0, 1, 1, . . .)
+ (0, 0, 0, 1, . . .)
+···
= (1, 2, 3, 4, . . .) , (26)
because FPSs are added entrywise. Let us rigorously define such sums. First,
we define “essentially finite” sums of elements of K:
So the idea behind Definition 3.2.8 (b) is that addends that equal 0 can be
discarded in a sum, even when there are infinitely many of them.
Sums of essentially finite families satisfy the usual rules for sums (such as
the breaking-apart rule ∑_{s∈S} a_s = ∑_{s∈X} a_s + ∑_{s∈Y} a_s when a set S is the union of
two disjoint sets X and Y). See [Grinbe15, §2.14.15] for details¹³. There is
only one caveat: Interchange of summation signs (e.g., replacing ∑_{i∈I} ∑_{j∈J} a_{i,j} by
∑_{j∈J} ∑_{i∈I} a_{i,j}) works only if the family (a_{i,j})_{(i,j)∈I×J} is essentially finite (i.e., all but
finitely many pairs (i, j) ∈ I × J satisfy a_{i,j} = 0); it does not suffice that the
sums ∑_{i∈I} ∑_{j∈J} a_{i,j} and ∑_{j∈J} ∑_{i∈I} a_{i,j} themselves are essentially finite (i.e., that the
families (a_{i,j})_{j∈J} for all i ∈ I, the families (a_{i,j})_{i∈I} for all j ∈ J, and the families
(∑_{j∈J} a_{i,j})_{i∈I} and (∑_{i∈I} a_{i,j})_{j∈J} are essentially finite).
For a counterexample, consider the family (a_{i,j})_{(i,j)∈I×J} of integers with I = {1, 2, 3, . . .}
and J = {1, 2, 3, . . .}, where a_{i,j} is given by the following table (rows indexed by i,
columns by j):

        j=1   j=2   j=3   j=4   j=5   ···
  i=1    1    −1
  i=2          1    −1
  i=3                1    −1
  i=4                      1    −1
  i=5                            1    ···
   ⋮                                   ⋱
¹³Note that [Grinbe15, §2.14.15] uses the words “finitely supported” instead of “essentially finite”.
(where all the entries in the empty cells are 0). For this family (a_{i,j})_{(i,j)∈I×J}, both
sums ∑_{i∈I} ∑_{j∈J} a_{i,j} and ∑_{j∈J} ∑_{i∈I} a_{i,j} are essentially finite, but they are not equal (indeed,
the former sum is ∑_{i∈I} ∑_{j∈J} a_{i,j} = ∑_{i∈I} 0 = 0, since each row sums to 0, whereas the
latter sum is ∑_{j∈J} ∑_{i∈I} a_{i,j} = ∑_{i∈I} a_{i,1} + ∑_{j>1} ∑_{i∈I} a_{i,j} = 1 + ∑_{j>1} 0 = 1). And indeed,
this family (a_{i,j})_{(i,j)∈I×J} is not essentially finite.
We have now made sense of infinite sums of elements of K when all but
finitely many addends are 0. Of course, we can do the same for K [[ x ]] instead
of K (since K [[ x ]], too, is a commutative ring). However, this does not help
make sense of the sum on the left hand side of (26), because this sum is not
essentially finite (it is a sum of infinitely many nonzero FPSs). Thus, for sums
of FPSs, we need a weaker version of essential finiteness. Here is its definition:
Definition 3.2.9. A (possibly infinite) family (a_i)_{i∈I} of FPSs is said to be
summable (or entrywise essentially finite) if for each n ∈ N, all but finitely
many i ∈ I satisfy [x^n] a_i = 0.
For example, for each i ∈ N, let a_i ∈ K[[x]] be the FPS whose first i entries
are 0 and whose remaining entries are 1 (these are the addends on the left
hand side of (26)). Let us check that the family (a_i)_{i∈N} is summable and that
its sum ∑_{i∈N} a_i really equals the right hand side of (26).
For each n ∈ N, all but finitely many i ∈ N satisfy [ x n ] ai = 0 (indeed, all
i > n satisfy this equality). Thus, the family (ai )i∈N is summable. For each
n ∈ N, we have
[x^n] ( ∑_{i∈N} a_i ) = ∑_{i∈N} [x^n] a_i   (by (27))
= ∑_{i=0}^{n} [x^n] a_i + ∑_{i>n} [x^n] a_i = ∑_{i=0}^{n} 1 + ∑_{i>n} 0   (since [x^n] a_i = 1 for i ≤ n, and [x^n] a_i = 0 for i > n)
= ∑_{i=0}^{n} 1 = n + 1.
Thus, ∑_{i∈N} a_i = (1, 2, 3, 4, . . .). This is precisely the right hand side of (26).
Thus, (26) has been justified rigorously.
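This coefficientwise computation can be replayed numerically. The following is a small sketch (names are mine, not from these notes): the function returns [x^n] a_i for the family just considered, and the coefficients of the sum come out as 1, 2, 3, . . .:

```python
def coeff(i, n):
    """[x^n] of the FPS a_i = (0, ..., 0, 1, 1, 1, ...) with i leading zeros."""
    return 1 if n >= i else 0

# For each n, only the finitely many addends with i <= n contribute
# (all addends with i > n have [x^n] a_i = 0), so the sum is n + 1:
coeffs_of_sum = [sum(coeff(i, n) for i in range(n + 1)) for n in range(6)]
print(coeffs_of_sum)  # [1, 2, 3, 4, 5, 6]
```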
You can think of summable infinite sums of FPSs as a crude algebraic imitation
of uniformly convergent infinite sums of holomorphic functions in complex
analysis. (However, the former are a much simpler concept than the latter. In
particular, complex analysis is completely unnecessary in their study.)
The following fact is nearly obvious:14
Proposition 3.2.12. Let (ai )i∈ I be a summable family of FPSs. Then, any
subfamily of (ai )i∈ I is summable as well.
Proof of Proposition 3.2.12. Let J be a subset of I. We must prove that the sub-
family (ai )i∈ J is summable.
Let n ∈ N. Then, all but finitely many i ∈ I satisfy [ x n ] ai = 0 (since the
family (ai )i∈ I is summable). Hence, all but finitely many i ∈ J satisfy [ x n ] ai =
0 (since J is a subset of I). Since we have proved this for each n ∈ N, we
thus conclude that the family (ai )i∈ J is summable. This proves Proposition
3.2.12.
Just as with essentially finite families, we can work with summable sums of
FPSs “as if they were finite” most of the time:
Proposition 3.2.13. Sums of summable families of FPSs satisfy the usual rules
for sums (such as the breaking-apart rule ∑_{s∈S} a_s = ∑_{s∈X} a_s + ∑_{s∈Y} a_s when a set
S is the union of two disjoint sets X and Y). See [19s, Proposition 7.2.11] for
¹⁴A subfamily of a family (f_i)_{i∈I} means a family of the form (f_i)_{i∈J}, where J is a subset of I.
details. Again, the only caveat is about interchange of summation signs: The
equality

∑_{i∈I} ∑_{j∈J} a_{i,j} = ∑_{j∈J} ∑_{i∈I} a_{i,j}

holds when the family (a_{i,j})_{(i,j)∈I×J} is summable (i.e., when for each n ∈ N,
all but finitely many pairs (i, j) ∈ I × J satisfy [x^n] a_{i,j} = 0); it does not
generally hold if we merely assume that the sums ∑_{i∈I} ∑_{j∈J} a_{i,j} and ∑_{j∈J} ∑_{i∈I} a_{i,j}
are summable.
Proof of Proposition 3.2.13. The proof is tedious (as there are many rules to check),
but fairly straightforward (the idea is always to focus on a single coefficient,
and then to reduce the infinite sums to finite sums). For example, consider the
“discrete Fubini rule”, which says that

∑_{i∈I} ∑_{j∈J} a_{i,j} = ∑_{j∈J} ∑_{i∈I} a_{i,j}

whenever the family (a_{i,j})_{(i,j)∈I×J} is summable. To prove it, it suffices to show
that both sides have the same coefficient [x^n] for each n ∈ N. Fix n ∈ N; then, we have
[x^n] a_{i,j} = 0 for all but finitely many (i, j) ∈ I × J (since the family (a_{i,j})_{(i,j)∈I×J} is
summable). That is, the set of all pairs (i, j) ∈ I × J satisfying [x^n] a_{i,j} ≠ 0 is
finite. Hence, the set I′ of the first entries of all these pairs is finite, and so is the set J′ of
the second entries of all these pairs. Now, the definitions of I′ and J′ ensure
that any pair (i, j) ∈ I × J satisfies [x^n] a_{i,j} = 0 unless i ∈ I′ and j ∈ J′. Hence,
we easily obtain the equalities

[x^n] ( ∑_{i∈I} ∑_{j∈J} a_{i,j} ) = ∑_{i∈I} ∑_{j∈J} [x^n] a_{i,j} = ∑_{i∈I′} ∑_{j∈J′} [x^n] a_{i,j}

and

[x^n] ( ∑_{j∈J} ∑_{i∈I} a_{i,j} ) = ∑_{j∈J} ∑_{i∈I} [x^n] a_{i,j} = ∑_{j∈J′} ∑_{i∈I′} [x^n] a_{i,j} .

However, the right hand sides of these equalities are equal (since the sums
appearing in them are finite sums, and thus satisfy the usual rules for sums).
Thus, the left hand sides are equal, exactly as we needed to show. See [19s,
proof of Proposition 7.2.11] for more details of this proof. Proving the other
properties of sums is easier.
A few conventions about infinite sums will be used rather often:
Convention 3.2.14. (a) For any given integer m ∈ Z, the summation sign ∑_{k≥m}
is to be understood as ∑_{k∈{m,m+1,m+2,...}}. We also write ∑_{k=m}^{∞} for this summation
sign.
(b) For any given integer m ∈ Z, the summation sign ∑_{k>m} is to be under-
stood as ∑_{k∈{m+1,m+2,m+3,...}}.
(c) Let I be a set, and let A(i) be a logical statement for each i ∈ I. (For
example, I can be N, and A(i) can be the statement “i is odd”.) Then,
the summation sign ∑_{i∈I; A(i)} is to be understood as ∑_{i∈{j∈I | A(j)}}. (For example,
the summation sign ∑_{i∈N; i is odd} means ∑_{i∈{j∈N | j is odd}}, that is, a sum over all odd
elements of N.)
We can now define the x that figured so prominently in our informal explo-
ration of formal power series back in Section 3.1:
The following simple lemma follows almost immediately from the definition
of multiplication of FPSs:
A similar argument can be used for n = 0 (except that now, the sum ∑_{i=0}^{n} [x^i] x ·
a_{n−i} has no [x^1] x · a_{n−1} addend), and results in the conclusion that [x^n] (x · a) = 0.
x · x^m = (0, 0, 0, . . . , 0, 1, 0, 0, 0, . . .) = (0, 0, . . . , 0, 1, 0, 0, 0, . . .) ,

where the 1 is preceded by m + 1 zeroes in total (the initial 0 contributed by the
factor x, followed by the m zeroes of x^m).
( a_0 , a_1 , a_2 , . . . ) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + · · · = ∑_{n∈N} a_n x^n .
In particular, the right hand side here is well-defined, i.e., the family
( an x n )n∈N is summable.
Proof of Corollary 3.2.18 (sketched). (See [19s, Corollary 7.2.16] for details.) By
\binom{a+b}{n} = ( (a + b) (a + b − 1) (a + b − 2) · · · (a + b − n + 1) ) / n!   and

∑_{k=0}^{n} \binom{a}{k} \binom{b}{n−k} = ∑_{k=0}^{n} ( a (a − 1) · · · (a − k + 1) / k! ) · \binom{b}{n−k} .
a ∈ N), then they must be identical (because two univariate polynomials that
are equal at infinitely many points must necessarily be identical16 ). Hence, the
two sides of (31) must be identical as polynomials in a. Thus, the equality (31)
holds not only for each a ∈ N, but also for each a ∈ C.
Now, forget that b was fixed. Instead, let us fix a ∈ C. As we just have
proved, the equality (31) holds for each b ∈ N. We want to show that it holds
for each b ∈ C. But this can be achieved by the same argument that we just
used to extend it from a ∈ N to a ∈ C: We view both sides of the equality
as polynomials (but this time in b, not in a), and argue that these polynomials
must be identical because they are equal at infinitely many points. The upshot
is that the equality (31) holds for all a, b ∈ C; thus, Theorem 3.2.21 is proven.
(See [20f, proofs of Lemma 7.5.8 and Theorem 7.5.3] or [19s, §2.17.3] for this
proof in more detail. Alternatively, see [Grinbe15, §3.3.2 and §3.3.3] for two
other proofs.)
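For what it is worth, the identity (31) is easy to spot-check numerically in the case a, b, n ∈ N. This checks only the nonnegative-integer case, of course, not the polynomial-identity argument above (a sketch; nothing here is from the notes):

```python
from math import comb

# numeric spot-check of the Chu-Vandermonde identity for small a, b, n in N:
for a in range(6):
    for b in range(6):
        for n in range(6):
            assert sum(comb(a, k) * comb(b, n - k) for k in range(n + 1)) == comb(a + b, n)
print("Chu-Vandermonde verified for all a, b, n < 6")
```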
• when and why we can take the square root of an FPS and solve a quadratic
equation using the quadratic formula.
Convention 3.3.1. From now on, we identify each a ∈ K with the constant
FPS a = ( a, 0, 0, 0, 0, . . .) ∈ K [[ x ]].
16 Quick reminder on why this is true: If p and q are two univariate polynomials (with rational,
real or complex coefficients) that are equal at infinitely many points (i.e., if there exist
infinitely many numbers z satisfying p (z) = q (z)), then p = q (because the assumption
entails that the difference p − q has infinitely many roots, but this entails p − q = 0 and thus
p = q). See [20f, Corollary 7.5.7] for this argument in more detail.
(check this!), and because the zero and the unity of the ring K [[ x ]] are 0 and 1,
respectively.
Furthermore, I will stop using boldfaced letters (like a, b, c) for FPSs. (I did
this above for the sake of convenience, but this is rarely done in the literature.)
Note that the condition “ab = ba = 1” in Definition 3.3.2 (a) can be restated
as “ab = 1”, because we automatically have ab = ba (since L is a commutative
ring). I have chosen to write “ab = ba = 1” in order to state the definition in a
form that applies verbatim to noncommutative rings as well.
Example 3.3.3. (a) In the ring Z, the only two invertible elements are 1 and
−1. Each of these two elements is its own inverse.
(b) In the ring Q, every nonzero element is invertible. The same holds for
the rings R and C (and, more generally, for any field).
(e) Laws of fractions hold: If a and c are two invertible elements of L, and
if b and d are any two elements of L, then

b/a + d/c = (bc + ad) / (ac)   and   (b/a) · (d/c) = (bd) / (ac) .
(f) Division undoes multiplication: If a, b, c are three elements of L with a
being invertible, then the equality c/a = b is equivalent to c = ab.
Proof. Exercise. (See, e.g., [19s, solution to Exercise 4.1.1] for a proof of parts (c)
and (d) in the special case where L = C; essentially the same argument works
in the general case. The remaining parts of Proposition 3.3.6 are even easier to
check. Note that parts (a) and (c) as well as the (ab)^{−1} = b^{−1} a^{−1} part of part
(b) would hold even if L was a noncommutative ring.)
3.3.3. Inverses in K [[ x ]]
Now, which FPSs are invertible in the ring K [[ x ]] ? For example, we know
from (5) that the FPS 1 − x is invertible, with inverse 1 + x + x2 + x3 + · · · . On
the other hand, the FPS x is not invertible, since Lemma 3.2.16 shows that any
product of x with an FPS must begin with a 0 (but the unity of K [[ x ]] does not
begin with a 0). (Strictly speaking, this is only true if the ring K is nontrivial
– i.e., if not all elements of K are equal. If K is trivial, then K [[ x ]] is trivial,
and thus any FPS in K [[ x ]] is invertible, but this does not make an interesting
statement.)
It turns out that we can characterize invertible FPSs in K [[ x ]] in a rather
simple way:
Proof. =⇒: Assume that a is invertible in K[[x]]. That is, a has an inverse b ∈
K[[x]]. Consider this b. Since b is an inverse of a, we have ab = ba = 1 (where
“1” means the FPS 1 = (1, 0, 0, 0, . . .), by Convention 3.3.1). However, (24) (applied to a = a and b = b)
yields [x^0] (ab) = [x^0] a · [x^0] b. Comparing this with [x^0] (ab) = [x^0] 1 = 1,
we find [x^0] a · [x^0] b = 1. Thus, [x^0] b is an inverse of [x^0] a in K (since
[x^0] b · [x^0] a = [x^0] a · [x^0] b = 1).
Therefore, [x^0] a is invertible in K (with inverse [x^0] b). This proves the “=⇒”
direction of Proposition 3.3.7.
⇐=: Assume that [x^0] a is invertible in K. Write the FPS a in the form a =
(a_0, a_1, a_2, . . .). Thus, [x^0] a = a_0, so that a_0 is invertible in K (since [x^0] a is
invertible in K). Thus, its inverse a_0^{−1} is well-defined.
Now, we want to prove that a is invertible in K [[ x ]]. We thus try to find an
inverse of a.
We work backwards at first: We assume that b = (b0 , b1 , b2 , . . .) ∈ K [[ x ]] is an
inverse for a, and we try to figure out what this inverse looks like.
Since b is an inverse of a, we have ab = 1 = (1, 0, 0, 0, . . .). However, from
a = ( a0 , a1 , a2 , . . .) and b = (b0 , b1 , b2 , . . .), we have
ab = ( a0 , a1 , a2 , . . .) (b0 , b1 , b2 , . . .)
= ( a0 b0 , a0 b1 + a1 b0 , a0 b2 + a1 b1 + a2 b0 , a0 b3 + a1 b2 + a2 b1 + a3 b0 , . . .)
(by the definition of the product of FPSs). Comparing this with ab = (1, 0, 0, 0, . . .),
we obtain
(1, 0, 0, 0, . . .)
= ( a0 b0 , a0 b1 + a1 b0 , a0 b2 + a1 b1 + a2 b0 , a0 b3 + a1 b2 + a2 b1 + a3 b0 , . . .) .
I claim that this system of equations uniquely determines (b0 , b1 , b2 , . . .). In-
deed, we can solve the first equation (1 = a0 b0 ) for b0 , thus obtaining b0 = a0−1
(since a0 is invertible). Having thus found b0 , we can solve the second equation
(0 = a0 b1 + a1 b0 ) for b1 , thus obtaining b1 = − a0−1 ( a1 b0 ) (again because a0 is
invertible). Having thus found both b0 and b1 , we can solve the third equation
(0 = a0 b2 + a1 b1 + a2 b0 ) for b2 , thus obtaining b2 = − a0−1 ( a1 b1 + a2 b0 ). Proceed-
ing like this, we obtain recursive expressions for all coefficients b0 , b1 , b2 , . . . of
b, namely
b_0 = a_0^{−1} ,
b_1 = −a_0^{−1} (a_1 b_0) ,
b_2 = −a_0^{−1} (a_1 b_1 + a_2 b_0) , (33)
b_3 = −a_0^{−1} (a_1 b_2 + a_2 b_1 + a_3 b_0) ,
. . . .
(This procedure for solving systems of linear equations is well-known from lin-
ear algebra – it is a form of Gaussian elimination, but a particularly simple
one because our system is triangular with invertible coefficients on the diago-
nal. The only complication is that it has infinitely many variables and infinitely
many equations.)
So we have shown that if b is an inverse of a, then the entries bi of the FPS b
are given recursively by (33). This yields that b is unique; alas, this is not what
we want to prove. Instead, we want to prove that b exists.
Fortunately, we can achieve this by simply turning our above argument around:
Forget that we fixed b. Instead, we define a sequence (b0 , b1 , b2 , . . .) of elements
of K recursively by (33), and we define the FPS b = (b0 , b1 , b2 , . . .) ∈ K [[ x ]].
Then, the equalities (32) hold (because they are just equivalent restatements of
the equalities (33)). In other words, we have
(1, 0, 0, 0, . . .)
= ( a0 b0 , a0 b1 + a1 b0 , a0 b2 + a1 b1 + a2 b0 , a0 b3 + a1 b2 + a2 b1 + a3 b0 , . . .) .
However, as before, we can show that
ab = ( a0 b0 , a0 b1 + a1 b0 , a0 b2 + a1 b1 + a2 b0 , a0 b3 + a1 b2 + a2 b1 + a3 b0 , . . .) .
Comparing these two equalities, we find ab = (1, 0, 0, 0, . . .) = 1. Thus, ba =
ab = 1, so that ab = ba = 1. This shows that b is an inverse of a, so that a is
invertible. This proves the “⇐=” direction of Proposition 3.3.7.
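The recursion (33) is perfectly practical for computing inverses on a computer. Here is a small sketch (the function name and truncated-list representation are mine, not from the notes); exact rational arithmetic is used so that nothing is lost to rounding:

```python
from fractions import Fraction

def fps_inverse(a, n):
    """First n+1 coefficients of the inverse of an FPS a = (a_0, a_1, ...),
    computed by the triangular recursion (33); needs a_0 = a[0] invertible."""
    b = [Fraction(1) / a[0]]
    for m in range(1, n + 1):
        # solve the m-th equation 0 = a_0 b_m + a_1 b_{m-1} + ... + a_m b_0 for b_m
        b.append(-b[0] * sum(a[i] * b[m - i] for i in range(1, m + 1)))
    return b

# the inverse of 1 - x is 1 + x + x^2 + x^3 + ... (all coefficients equal 1):
print(fps_inverse([1, -1, 0, 0], 3))
```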
Corollary 3.3.8. Assume that K is a field. Let a ∈ K[[x]]. Then, the FPS a is
invertible in K[[x]] if and only if [x^0] a ≠ 0.
(1 + x)^{−1} = 1 − x + x^2 − x^3 + x^4 − x^5 ± · · · = ∑_{n∈N} (−1)^n x^n .
(since we have a telescoping sum in front of us, in which all powers of x other
than 1 cancel out). This shows that 1 − x + x^2 − x^3 + x^4 − x^5 ± · · · is an inverse
of 1 + x (since K[[x]] is a commutative ring). Thus, 1 + x is invertible, and its
inverse is (1 + x)^{−1} = 1 − x + x^2 − x^3 + x^4 − x^5 ± · · · = ∑_{n∈N} (−1)^n x^n. This
proves Proposition 3.3.9.
Proposition 3.3.9 shows that the FPS 1 + x is invertible; thus, its powers
(1 + x)^n are defined for all n ∈ Z (by Definition 3.3.5 (c)). The following for-
mula – known as Newton’s binomial theorem¹⁷ – describes these powers explicitly:
Note that the sum ∑_{k∈N} \binom{n}{k} x^k is summable for each n ∈ Z (indeed, it equals
the FPS ( \binom{n}{0}, \binom{n}{1}, \binom{n}{2}, . . . )). If n ∈ N, then it is essentially finite.
The reader may want to check that the particular case n = −1 of Theorem
3.3.10 agrees with Proposition 3.3.9. (Recall Example 2.3.2!)
Of course, Theorem 3.3.10 should look familiar – an identical-looking for-
mula appears in real analysis under the same name. However, the result in real
analysis is concerned with infinite sums of real numbers, while our Theorem
3.3.10 is an identity between FPSs over an arbitrary commutative ring. Thus,
the two facts are not the same.
We will prove Theorem 3.3.10 in a somewhat roundabout way, since this
gives us an opportunity to establish some auxiliary results that are of separate
interest (and usefulness). The first of these auxiliary results is a fundamental
property of binomial coefficients, known as the upper negation formula (see, e.g.,
[19fco, Proposition 1.3.7]):
\binom{−n}{k} = (−1)^k \binom{k+n−1}{k} .
Proof of Theorem 3.3.11 (sketched). If k < 0, then this is trivial because both \binom{−n}{k}
and \binom{k+n−1}{k} are 0 (by (1)). Thus, we WLOG assume that k ≥ 0. Hence, the
definition of binomial coefficients yields

\binom{−n}{k} = ( (−n) (−n − 1) · · · (−n − k + 1) ) / k! = (−1)^k ( n (n + 1) · · · (n + k − 1) ) / k! = (−1)^k \binom{k+n−1}{k} ,

as desired.

Using Theorem 3.3.11, we shall first prove Theorem 3.3.10 in the case of negative
exponents:

Proposition 3.3.12. Let n ∈ N. Then,

(1 + x)^{−n} = ∑_{k∈N} (−1)^k \binom{n+k−1}{k} x^k .
Proof of Proposition 3.3.12 (sketched). We induct on n. For the induction base, we compare

(1 + x)^{−0} = (1 + x)^0 = 1 = (1, 0, 0, 0, . . .) = 1

with

∑_{k∈N} (−1)^k \binom{0+k−1}{k} x^k
= (−1)^0 \binom{0+0−1}{0} x^0 + ∑_{k∈N; k>0} (−1)^k \binom{0+k−1}{k} x^k   (here, we have split off the addend for k = 0 from the sum)
= 1 + ∑_{k∈N; k>0} (−1)^k · 0 · x^k = 1

(since (−1)^0 = \binom{0+0−1}{0} = x^0 = 1, whereas each k > 0 satisfies \binom{0+k−1}{k} = \binom{k−1}{k} = 0
(by Proposition 2.3.5, applied to m = k − 1 and n = k, since k − 1 < k and k − 1 ∈ N)).
Thus, we obtain (1 + x)^{−0} = ∑_{k∈N} (−1)^k \binom{0+k−1}{k} x^k. In other words, Proposition
3.3.12 holds for n = 0.
For the induction step, fix j ∈ N, and assume that Proposition 3.3.12 holds for
n = j. We must prove that it holds for n = j + 1 as well; in other words, we must
prove that

(1 + x)^{−(j+1)} = ∑_{k∈N} (−1)^k \binom{(j+1)+k−1}{k} x^k .
The induction hypothesis says that (1 + x)^{−j} = ∑_{k∈N} (−1)^k \binom{j+k−1}{k} x^k. Now,
we compute:

∑_{k∈N} (−1)^k \binom{j+k}{k} x^k · (1 + x)
= ∑_{k∈N} (−1)^k \binom{j+k}{k} x^k + ∑_{k∈N} (−1)^k \binom{j+k}{k} x^{k+1}
= ∑_{k∈N} (−1)^k ( \binom{j+k−1}{k−1} + \binom{j+k−1}{k} ) x^k + ∑_{k≥1} (−1)^{k−1} \binom{j+(k−1)}{k−1} x^k
   (by Proposition 2.3.4, applied to m = j + k and n = k; in the second sum, we have substituted k − 1 for k)
= ∑_{k∈N} (−1)^k \binom{j+k−1}{k−1} x^k + ∑_{k∈N} (−1)^k \binom{j+k−1}{k} x^k − ∑_{k≥1} (−1)^k \binom{j+k−1}{k−1} x^k
   (since (−1)^{k−1} = −(−1)^k)
= ( ∑_{k∈N} (−1)^k \binom{j+k−1}{k−1} x^k − ∑_{k≥1} (−1)^k \binom{j+k−1}{k−1} x^k ) + ∑_{k∈N} (−1)^k \binom{j+k−1}{k} x^k
= (−1)^0 \binom{j+0−1}{0−1} x^0 + (1 + x)^{−j}
   (since the two sums in the parentheses differ only in their k = 0 addend, and since
   ∑_{k∈N} (−1)^k \binom{j+k−1}{k} x^k = (1 + x)^{−j} by the induction hypothesis)
= (1 + x)^{−j}
   (since \binom{j+0−1}{0−1} = 0 by (1), because 0 − 1 ∉ N),
so that

(1 + x)^{−(j+1)} = (1 + x)^{−j} (1 + x)^{−1} = ∑_{k∈N} (−1)^k \binom{j+k}{k} x^k = ∑_{k∈N} (−1)^k \binom{(j+1)+k−1}{k} x^k

(since j + k = (j + 1) + k − 1). This completes the induction step; Proposition 3.3.12
is thus proved.
Proof of Theorem 3.3.10 (sketched). For each n ∈ N, we have

(1 + x)^{−n} = ∑_{k∈N} \binom{−n}{k} x^k ,

since Proposition 3.3.12 yields

(1 + x)^{−n} = ∑_{k∈N} (−1)^k \binom{n+k−1}{k} x^k = ∑_{k∈N} \binom{−n}{k} x^k

(because each k ∈ N satisfies (−1)^k \binom{k+n−1}{k} = \binom{−n}{k}, by Theorem 3.3.11).
Moreover, for each n ∈ N, the binomial formula yields (1 + x)^n = ∑_{k=0}^{n} \binom{n}{k} x^k;
comparing this with

∑_{k∈N} \binom{n}{k} x^k = ∑_{k=0}^{n} \binom{n}{k} x^k + ∑_{k>n} \binom{n}{k} x^k = ∑_{k=0}^{n} \binom{n}{k} x^k + ∑_{k>n} 0 x^k   (by Proposition 2.3.5, since n < k)
= ∑_{k=0}^{n} \binom{n}{k} x^k ,

we see that Theorem 3.3.10 holds for all n ∈ N.
It remains to prove Theorem 3.3.10 for negative n. So let n ∈ Z be negative.
Then, −n ∈ N, so that the first equality above (applied to −n instead of n) yields

(1 + x)^{−(−n)} = ∑_{k∈N} \binom{−(−n)}{k} x^k .

Since −(−n) = n, this rewrites as (1 + x)^n = ∑_{k∈N} \binom{n}{k} x^k. Thus, Theorem 3.3.10
is proven.
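As a numeric cross-check of the negative-exponent case, one can multiply the claimed coefficients of (1 + x)^{−n} by those of (1 + x)^n and watch the product truncate to 1 (a sketch; the helper `binom` and the variable names are mine, not from the notes):

```python
from fractions import Fraction
from math import comb, prod

def binom(a, k):
    """a(a-1)...(a-k+1) / k! for an arbitrary integer a and k >= 0."""
    return Fraction(prod(a - i for i in range(k)), prod(range(1, k + 1)))

N, n = 8, 3
inv = [binom(-n, k) for k in range(N + 1)]   # claimed coefficients of (1 + x)^(-n)
pos = [comb(n, k) for k in range(N + 1)]     # coefficients of (1 + x)^n
prod_coeffs = [sum(inv[i] * pos[m - i] for i in range(m + 1)) for m in range(N + 1)]
print([int(c) for c in prod_coeffs])  # [1, 0, 0, 0, 0, 0, 0, 0, 0]
```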
We thus have a formula for (1 + x )n for each integer n. We don’t yet have
such a formula for (1 + x )1/2 (nor do we have a proper definition of (1 + x )1/2 ),
but this was clearly a step forward.
3.3.5. Dividing by x
Let us see how this all helps us justify our arguments in Section 3.1. Proposition
3.3.7 justifies the fractions that appear in (4), but it does not justify dividing by
the FPS 2x in (10), since the constant term [x^0] (2x) = 0 is surely not invertible
(as long as the ring K is nontrivial). And indeed, the FPS 2x is not invertible;
the fraction 1/(2x) is not a well-defined FPS.
However, it is easy to see directly which FPSs can be divided by x (and thus
by 2x, if K = Q), and what it means to divide them by x. In fact, Lemma
3.2.16 shows that multiplying an FPS by x means moving all its entries by one
position to the right, and putting a 0 into the newly vacated starting position.
Thus, it is rather clear what dividing by x should be:
Proposition 3.3.15. Let a ∈ K[[x]] and b ∈ K[[x]] be two FPSs. Then, a = xb
if and only if [x^0] a = 0 and b = a/x.
Proof. Exercise.
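In a truncated-coefficient-list model of FPSs, dividing by x is literally the shift just described. A tiny sketch (names mine, not from the notes):

```python
def fps_div_x(a):
    """Divide an FPS (given as a coefficient list) by x: shift all entries one
    position to the left, dropping the leading 0. Requires constant term 0."""
    assert a[0] == 0, "a/x is only defined when [x^0] a = 0"
    return a[1:]

a = [0, 5, 7, 9]
b = fps_div_x(a)
print(b)             # [5, 7, 9]
assert [0] + b == a  # multiplying back by x (shift right, prepend 0) recovers a
```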
Having defined a/x in Definition 3.3.14 (when a has constant term 0), we can
also define a/(2x) when 2 is invertible in K (just set a/(2x) = (1/2) · (a/x)). Thus, the fraction
(1 ± √(1 − 4x)) / (2x) in (10) makes sense when the ± sign is a − sign (but not when it is
a + sign), at least if we interpret the square root √(1 − 4x) as ∑_{k≥0} \binom{1/2}{k} (−4x)^k.
¹⁸Proof. We have a/x = (a_1, a_2, a_3, . . .) (by Definition 3.3.14). Thus, Lemma 3.2.16 (applied to
a/x and a_{i−1} instead of a and a_i) yields

x · (a/x) = (0, a_1, a_2, a_3, . . .) = (a_0, a_1, a_2, a_3, . . .)   (since 0 = a_0)
= (a_0, a_1, a_2, . . .) = a.

Thus, a = x · (a/x).
In other words, f = x^k a for a = ∑_{n=k}^{∞} f_n x^{n−k}. This shows that f is a multiple of
x^k. Thus, the “=⇒” direction of Lemma 3.3.18 is proved.
Then,
[ x m ] ( a f ) = [ x m ] ( ag) for each m ∈ {0, 1, . . . , n} .
Then,
[ x m ] ( ac) = [ x m ] (bd) for each m ∈ {0, 1, . . . , n} .
words, [ x m ] ( ac) = [ x m ] (bc). On the other hand, (21) yields [ x m ] (bc − bd) =
[ x m ] (bc) − [ x m ] (bd). Comparing this with (42), we obtain [ x m ] (bc) − [ x m ] (bd) =
0. In other words, [ x m ] (bc) = [ x m ] (bd). Hence, [ x m ] ( ac) = [ x m ] (bc) =
[ x m ] (bd). This proves Lemma 3.3.22.
3.4. Polynomials
3.4.1. Definition
Let us take a little side trip to relate FPSs to polynomials. As should be clear
enough from the definitions, we can think of an FPS as a “polynomial with
(potentially) infinitely many nonzero coefficients”. This can be easily made
precise. Indeed, we can define polynomials as FPSs that have only finitely
many nonzero coefficients:
Proof of Theorem 3.4.2 (sketched). This is a rather easy exercise. The hardest part
is to show that K [ x ] is closed under multiplication. But this, too, is easy: Let
a, b ∈ K [ x ]. Then, all but finitely many n ∈ N satisfy [ x n ] a = 0 (since a ∈ K [ x ]).
In other words, there exists a finite subset I of N such that
[x^i] a = 0 for all i ∈ N \ I. (43)
Thus, in either case, at least one of the two coefficients [x^i] a and [x^j] b is 0, so that their
product is 0.
Forget that we fixed i. We thus have shown that [x^i] a · [x^{n−i}] b is 0 for each i ∈
{0, 1, . . . , n}. In other words, all addends of the sum ∑_{i=0}^{n} [x^i] a · [x^{n−i}] b are 0. Hence, the
whole sum is 0. In other words, [x^n] (ab) = 0 (since [x^n] (ab) = ∑_{i=0}^{n} [x^i] a · [x^{n−i}] b), qed.
• For any n ∈ N, the matrix ring R^{n×n} (that is, the ring of all n × n-matrices
with real entries) is a ring. This ring is commutative if n ≤ 1, but not if
n > 1.
More generally, if K is any ring (commutative or not), then the matrix ring
K^{n×n} is a ring for every n ∈ N.
Next, let us recall the notion of a K-algebra ([23wa, §3.11]). Recall that K is a
fixed commutative ring.
⊕ : A × A → A,
⊖ : A × A → A,
⊙ : A × A → A,
⇀ : K × A → A

and two elements 0⃗ ∈ A and 1⃗ ∈ A satisfying the following properties:
1. The set A, equipped with the maps ⊕, ⊖ and ⊙ and the two elements
0⃗ and 1⃗, is a (noncommutative) ring²⁰.

2. The set A, equipped with the maps ⊕, ⊖ and ⇀ and the element 0⃗, is
a K-module.
a K-module.
20 Note that the word “noncommutative ring” does not imply that the ring is not commuta-
tive; it merely means that commutativity is not required. Thus, any commutative ring is a
noncommutative ring.
3. We have
λ ⇀ ( a ⊙ b) = (λ ⇀ a) ⊙ b = a ⊙ (λ ⇀ b) (45)
for all λ ∈ K and a, b ∈ A.
Note that the axiom (45) in the definition of a K-algebra can be rewritten as

λ (ab) = (λa) b = a (λb)

using our conventions (to write ab for a ⊙ b and to write λc for λ ⇀ c). It says
that scaling a product in A by a scalar λ ∈ K is equivalent to scaling either
of its two factors by λ.
f [a] := ∑_{n∈N} f_n a^n .
Many people write f ( a) for the value f [ a] we have just defined. Unfortu-
nately, this leads to ambiguities (for example, f ( x + 1) could mean either the
value of f at x + 1, or the product of f with x + 1). By writing f [ a] or f ◦ a
instead, I will avoid these ambiguities.
For example, if f = 4x3 + 2x + 7, then f [ a] = 4a3 + 2a + 7. For another
example, if f = ( x + 5)3 , then f [ a] = ( a + 5)3 (although this is not obvious; it
follows from Theorem 3.4.6 below).
If f and g are two polynomials in K [ x ], then the value f [ g] = f ◦ g (this is
the value of f at g; it is well-defined because K [ x ] is a K-algebra) is also known
as the composition of f with g. We note that any polynomial f ∈ K [ x ] satisfies
f [x] = f   and
f [0] = [x^0] f = (the constant term of f)   and
f [1] = [x^0] f + [x^1] f + [x^2] f + · · · = (the sum of all coefficients of f) .
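These evaluation rules are easy to mirror for polynomials stored as coefficient lists; the following sketch (names mine, not from the notes) evaluates via Horner's rule and recovers the f[0] and f[1] facts for the example f = 4x³ + 2x + 7 mentioned above:

```python
def poly_eval(f, a):
    """Evaluate the polynomial f = f_0 + f_1 x + f_2 x^2 + ... (coefficient
    list, lowest degree first) at a, using Horner's rule."""
    result = 0
    for c in reversed(f):
        result = result * a + c
    return result

f = [7, 2, 0, 4]        # f = 4x^3 + 2x + 7
print(poly_eval(f, 0))  # 7  (the constant term of f)
print(poly_eval(f, 1))  # 13 (the sum of all coefficients of f)
```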
( f + g) [ a] = f [ a] + g [ a] and ( f g) [ a] = f [ a] · g [ a] .
(λ f ) [ a] = λ · f [ a] .
Proof. See [19s, Theorem 7.6.3] for parts (a), (b), (c), (d) and (e). Part (f) is [19s,
Proposition 7.6.14].
²¹For instance, it is not hard to see that there is no nonzero complex number that can be
substituted into the FPS ∑_{n∈N} n! x^n to obtain a convergent result. Thus, even though some
complex numbers can be substituted into some FPSs, there is no complex number other
than 0 that can be substituted into every FPS.
However, let us not give up on FPSs yet. Some things can be substituted into
an FPS. For example:
• We can always substitute 0 for x in an FPS a0 + a1 x + a2 x2 + a3 x3 + · · · .
The result is
a_0 + a_1 0 + a_2 0^2 + a_3 0^3 + · · · = a_0 + 0 + 0 + 0 + · · · = a_0 .
Only the first n + 1 addends of this infinite sum (i.e., only the addends
a_k (x + x^2)^k with k ≤ n) can contribute to the coefficient of x^n, since any
of the remaining addends is a multiple of x^{n+1} (because it has the form
a_k (x + x^2)^k = a_k (x (1 + x))^k = a_k x^k (1 + x)^k with k ≥ n + 1) and thus has
a zero coefficient of x^n. Hence, the coefficient of x^n in this infinite sum
equals the coefficient of x^n in the finite sum

a_0 + a_1 (x + x^2) + a_2 (x + x^2)^2 + a_3 (x + x^2)^3 + · · · + a_n (x + x^2)^n .

But the latter coefficient is clearly a finite sum of a_i's. Thus, my claim is
proved, and it follows that the result of substituting x^2 + x for x in an FPS
a_0 + a_1 x + a_2 x^2 + a_3 x^3 + · · · is well-defined.
The idea of the last example can be generalized; there was nothing special
about x + x2 that we used other than the fact that x + x2 is a multiple of x (that
is, an FPS whose constant term is 0). Thus, generalizing our reasoning from this
example, we can convince ourselves that any FPS g that is a multiple of x (that
is, whose constant term is 0) can be substituted into any FPS. Let us introduce
a notation for this, exactly like we did for substituting things into polynomials:
f [g] := ∑_{n∈N} f_n g^n . (46)
Once again, it is not uncommon to see this FPS f [ g] denoted by f ( g), but I
will eschew the latter notation (since it can be confused with a product).
In order to prove that Definition 3.5.1 makes sense, we need to ensure that the
infinite sum ∑_{n∈N} f_n g^n in (46) is well-defined. The proof of this fact is analogous
to the reasoning I used in the last example; let me present it again in the general
case:
of the FPS x n hn are 0. In other words, the first n coefficients of the FPS gn are 0
(since gn = x n hn ). Thus, Proposition 3.5.2 (a) is proved.
(b) This follows from part (a). Here are the details.
We must prove that the family ( f_n g^n )_{n∈N} is summable. In other words, we
must prove that the family ( f_i g^i )_{i∈N} is summable (since ( f_i g^i )_{i∈N} = ( f_n g^n )_{n∈N} ).
In other words, we must prove that for each n ∈ N, all but finitely many i ∈ N
satisfy [x^n] ( f_i g^i ) = 0 (by the definition of “summable”). So let us prove this.
Fix n ∈ N. We must prove that all but finitely many i ∈ N satisfy [x^n] ( f_i g^i ) = 0.
Indeed, let i ∈ N satisfy i > n. Then, n < i. Now, the first i coefficients of the
FPS g^i are 0 (by Proposition 3.5.2 (a), applied to i instead of n). However, the
coefficient [x^n] g^i of g^i is one of these first i coefficients (because n < i). Thus,
this coefficient [x^n] g^i must be 0. Now, f_i ∈ K; thus, (25) (applied to λ = f_i
and a = g^i ) yields [x^n] ( f_i g^i ) = f_i · [x^n] g^i = 0.
Forget that we fixed i. We thus have shown that all i ∈ N satisfying i > n
satisfy [x^n] ( f_i g^i ) = 0. Hence, all but finitely many i ∈ N satisfy [x^n] ( f_i g^i ) = 0
(because all but finitely many i ∈ N satisfy i > n). This is precisely what we
wanted to prove. Thus, Proposition 3.5.2 (b) is proved.
(c) Let n be a positive integer. We shall first show that [x^0] ( f_n g^n ) = 0.
Indeed, Proposition 3.5.2 (a) shows that the first n coefficients of the FPS
g^n are 0. However, the coefficient [x^0] ( g^n ) is one of these first n coefficients
(since n is positive). Thus, this coefficient [x^0] ( g^n ) must be 0. Now, f_n ∈ K;
thus, (25) (applied to f_n , g^n and 0 instead of λ, a and n) yields [x^0] ( f_n g^n ) =
f_n · [x^0] ( g^n ) = 0.
Forget that we fixed n. We thus have shown that

    [x^0] ( f_n g^n ) = 0    for each positive integer n.                    (47)
Now,

    [x^0] ( ∑_{n∈N} f_n g^n ) = ∑_{n∈N} [x^0] ( f_n g^n )        (by (27))
        = [x^0] ( f_0 g^0 ) + ∑_{n>0} [x^0] ( f_n g^n )
              (here, we have split off the addend for n = 0 from the sum)
        = [x^0] ( f_0 · 1 ) + ∑_{n>0} 0        (by (47), and since g^0 = 1)
        = [x^0] ( f_0 , 0, 0, 0, . . .)        (since f_0 · 1 = f_0 · (1, 0, 0, 0, . . .) = ( f_0 , 0, 0, 0, . . .))
        = f_0 .
Cancelling x from this equality (this is indeed allowed – make sure you
understand why!), we obtain

    1 / (1 − x − x^2) = f_1 + f_2 x + f_3 x^2 + f_4 x^3 + · · · .
    1 / (1 − x − x^2) = (1 / (1 − x)) [x + x^2] ,                    (48)

because substituting x + x^2 for x in the expression 1 / (1 − x) results in
1 / (1 − x − x^2). This is plausible but not obvious – after all, we defined
(1 / (1 − x)) [x + x^2] to be the result of substituting x + x^2 for x into the
expanded version of 1 / (1 − x) (which is 1 + x + x^2 + x^3 + · · · ), not into the
fractional expression 1 / (1 − x).
Nevertheless, (48) is true (and will soon be proved). If we take this fact for
granted, then our claim easily follows:

    f_1 + f_2 x + f_3 x^2 + f_4 x^3 + · · · = 1 / (1 − x − x^2) = (1 / (1 − x)) [x + x^2]
        = ( 1 + x + x^2 + x^3 + · · · ) [x + x^2]

(since 1 / (1 − x) = 1 + x + x^2 + x^3 + · · · ).
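This identity is easy to test numerically. Here is a short Python sanity check with truncated coefficient lists (the helpers `mul` and `substitute` and the truncation order `N` are ad-hoc names for this illustration, not notation from the notes): substituting x + x^2 into the expanded geometric series 1 + x + x^2 + · · · reproduces the coefficients 1, 1, 2, 3, 5, 8, . . . of 1/(1 − x − x^2), which are exactly the Fibonacci numbers f_1, f_2, f_3, . . . :

```python
N = 8  # we track the coefficients of x^0, ..., x^{N-1}

def mul(f, g):
    """Truncated product of two coefficient lists of length N."""
    return [sum(f[i] * g[n - i] for i in range(n + 1)) for n in range(N)]

def substitute(f, g):
    """Truncated f[g] = sum_n f_n g^n; requires the constant term g[0] to be 0."""
    assert g[0] == 0
    result, power = [0] * N, [1] + [0] * (N - 1)  # power starts as g^0 = 1
    for n in range(N):
        result = [result[k] + f[n] * power[k] for k in range(N)]
        power = mul(power, g)
    return result

geom = [1] * N                 # 1 + x + x^2 + ... = 1/(1-x), truncated
g = [0, 1, 1] + [0] * (N - 3)  # x + x^2

coeffs = substitute(geom, g)   # should be the coefficients of 1/(1-x-x^2)
assert coeffs == [1, 1, 2, 3, 5, 8, 13, 21]  # Fibonacci numbers f_1, f_2, ...
```

Each output coefficient satisfies the recurrence a_n = a_{n−1} + a_{n−2}, as the generating-function identity predicts.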
Proposition 3.5.4. Composition of FPSs satisfies the rules you would expect
it to satisfy:
(a) If f_1 , f_2 , g ∈ K[[x]] satisfy [x^0] g = 0, then ( f_1 + f_2 ) ◦ g = f_1 ◦ g + f_2 ◦ g.
(b) If f_1 , f_2 , g ∈ K[[x]] satisfy [x^0] g = 0, then ( f_1 · f_2 ) ◦ g = ( f_1 ◦ g) · ( f_2 ◦ g).
(c) If f_1 , f_2 , g ∈ K[[x]] satisfy [x^0] g = 0, then ( f_1 / f_2 ) ◦ g = ( f_1 ◦ g) / ( f_2 ◦ g) , as long as
f_2 is invertible. (In particular, f_2 ◦ g is automatically invertible under these
assumptions.)
(d) If f , g ∈ K[[x]] satisfy [x^0] g = 0, then f^k ◦ g = ( f ◦ g)^k for each k ∈ N.

22 We are treating the symbol “◦” similarly to the multiplication sign · in our PEMDAS conven-
tion. Thus, an expression like “ f_1 ◦ g + f_2 ◦ g” is understood to mean ( f_1 ◦ g) + ( f_2 ◦ g).
(e) If f , g, h ∈ K[[x]] satisfy [x^0] g = 0 and [x^0] h = 0, then [x^0] ( g ◦ h) = 0
and ( f ◦ g) ◦ h = f ◦ ( g ◦ h).
(f) We have a ◦ g = a for each a ∈ K and g ∈ K[[x]] with [x^0] g = 0.
(g) We have x ◦ g = g for each g ∈ K[[x]] with [x^0] g = 0, and g ◦ x = g for
each g ∈ K[[x]].
(h) If ( f_i )_{i∈I} ∈ K[[x]]^I is a summable family of FPSs, and if g ∈ K[[x]] is an
FPS satisfying [x^0] g = 0, then the family ( f_i ◦ g)_{i∈I} ∈ K[[x]]^I is summable
as well, and we have ( ∑_{i∈I} f_i ) ◦ g = ∑_{i∈I} ( f_i ◦ g) .
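Before proving these rules, one can check parts (a) and (b) numerically. Here is a quick Python sanity check with truncated coefficient lists (the helper names `mul` and `substitute` and the truncation order `N` are ad-hoc choices for this illustration, not notation from the notes):

```python
N = 8  # truncation order

def mul(f, g):
    """Truncated product of two coefficient lists of length N."""
    return [sum(f[i] * g[n - i] for i in range(n + 1)) for n in range(N)]

def substitute(f, g):
    """Truncated f ◦ g = sum_n f_n g^n; requires g[0] == 0."""
    assert g[0] == 0
    result, power = [0] * N, [1] + [0] * (N - 1)
    for n in range(N):
        result = [result[k] + f[n] * power[k] for k in range(N)]
        power = mul(power, g)
    return result

f1 = [2, 0, 1, 3, 0, 1, 4, 1]
f2 = [1, 5, 0, 2, 2, 0, 1, 3]
g  = [0, 1, 2, 0, 1, 0, 0, 0]

# (a): (f1 + f2) ◦ g = f1 ◦ g + f2 ◦ g
assert substitute([a + b for a, b in zip(f1, f2)], g) == \
       [a + b for a, b in zip(substitute(f1, g), substitute(f2, g))]

# (b): (f1 · f2) ◦ g = (f1 ◦ g) · (f2 ◦ g)
assert substitute(mul(f1, f2), g) == mul(substitute(f1, g), substitute(f2, g))
```

Of course, a finite check of truncated coefficients proves nothing; the actual proofs follow below.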
For our proof of Proposition 3.5.4, we will need the following lemma:

Lemma 3.5.5. Let k ∈ N. Let f , g ∈ K[[x]] be two FPSs such that [x^0] g = 0
and such that the first k coefficients of f are 0. Then, the first k coefficients of
f ◦ g are 0 as well.
Proof of Lemma 3.5.5. This is very similar to the proof of Proposition 3.5.2 (a).
We have [x^0] g = 0. Hence, Lemma 3.3.16 (applied to a = g) yields that there
exists an h ∈ K[[x]] such that g = xh. Consider this h.
Write the FPS f in the form f = ( f_0 , f_1 , f_2 , . . .). Then, the first k coefficients
of f are f_0 , f_1 , . . . , f_{k−1} . Hence, these coefficients f_0 , f_1 , . . . , f_{k−1} are 0 (since the
first k coefficients of f are 0). In other words,

    f_n = 0    for each n ∈ N satisfying n < k.                    (49)
Now, the definition of f [g] yields

    f [g] = ∑_{n∈N} f_n g^n = ∑_{n∈N; n<k} f_n g^n + ∑_{n∈N; k≤n} f_n g^n
          = ∑_{n∈N; n<k} 0 g^n + ∑_{n∈N; k≤n} f_n (xh)^n        (by (49), and since g = xh)
          = ∑_{n∈N; k≤n} f_n (xh)^n = ∑_{n∈N; k≤n} f_n x^n h^n .

Thus,

    f ◦ g = f [g] = ∑_{n∈N; k≤n} f_n x^n h^n = ∑_{n∈N; k≤n} x^n f_n h^n .
But this ensures that the first k coefficients of f ◦ g are 0 ^{23} . Thus, Lemma
3.5.5 follows.

23 Proof. We must show that [x^m] ( f ◦ g) = 0 for any nonnegative integer m < k. But we can do this as follows:
Our proof of Proposition 3.5.4 will furthermore use the Kronecker delta nota-
tion:

Definition 3.5.6. If i and j are any objects, then δ_{i,j} means the element of K
that equals 1 if i = j and equals 0 if i ≠ j.
Write the FPSs f_1 and f_2 as f_1 = ∑_{n∈N} f_{1,n} x^n and f_2 = ∑_{n∈N} f_{2,n} x^n (50)
with f_{1,0} , f_{1,1} , f_{1,2} , . . . ∈ K and f_{2,0} , f_{2,1} , f_{2,2} , . . . ∈ K. Then, adding the two
equalities in (50) together, we find f_1 + f_2 = ∑_{n∈N} ( f_{1,n} + f_{2,n} ) x^n .
(Footnote 23, continued:)

    [x^m] ( f ◦ g) = [x^m] ( ∑_{n∈N; k≤n} x^n f_n h^n )        (since f ◦ g = ∑_{n∈N; k≤n} x^n f_n h^n)
        = ∑_{n∈N; k≤n} [x^m] ( x^n f_n h^n )
        = ∑_{n∈N; k≤n} ∑_{i=0}^{m} [x^i] ( x^n ) · [x^{m−i}] ( f_n h^n )
              (by (22), applied to x^n , f_n h^n and m instead of a, b and n)
        = ∑_{n∈N; k≤n} ∑_{i=0}^{m} 0 · [x^{m−i}] ( f_n h^n )
              (since [x^i] ( x^n ) = 0 , because i ≤ m < k ≤ n and thus i ≠ n)
        = 0 .
Indeed, for each a ∈ K, we have a = ∑_{n∈N} a δ_{n,0} x^n , since

    ∑_{n∈N} a δ_{n,0} x^n = a δ_{0,0} x^0 + ∑_{n∈N; n≠0} a δ_{n,0} x^n
        = a · 1 + ∑_{n∈N; n≠0} a · 0 x^n        (since δ_{0,0} = 1 and δ_{n,0} = 0 for n ≠ 0)
        = a · 1 = a · (1, 0, 0, 0, . . .) = ( a · 1, a · 0, a · 0, a · 0, . . .)
        = ( a, 0, 0, 0, . . .) = a .                    (52)

Hence, the definition of substitution yields

    a [g] = ∑_{n∈N} a δ_{n,0} g^n = a δ_{0,0} g^0 + ∑_{n∈N; n≠0} a δ_{n,0} g^n
          = a · 1 + ∑_{n∈N; n≠0} a · 0 g^n = a · 1 = a.
Similarly, x = ∑_{n∈N} δ_{n,1} x^n , since

    ∑_{n∈N} δ_{n,1} x^n = δ_{1,1} x^1 + ∑_{n∈N; n≠1} δ_{n,1} x^n = x + ∑_{n∈N; n≠1} 0 x^n = x .                    (53)

Hence,

    x [g] = ∑_{n∈N} δ_{n,1} g^n = δ_{1,1} g^1 + ∑_{n∈N; n≠1} δ_{n,1} g^n = g + ∑_{n∈N; n≠1} 0 g^n = g .
(b) This appears in [Loehr11, Theorem 7.62] and [Brewer14, Proposition 2.2.2].
Here is the proof:
Let f_1 , f_2 , g ∈ K[[x]] satisfy [x^0] g = 0. Write the FPSs f_1 and f_2 as

    f_1 = ∑_{n∈N} f_{1,n} x^n    and    f_2 = ∑_{n∈N} f_{2,n} x^n

with f_{1,0} , f_{1,1} , f_{1,2} , . . . ∈ K and f_{2,0} , f_{2,1} , f_{2,2} , . . . ∈ K. Thus, Definition 3.5.1 yields

    f_1 [g] = ∑_{n∈N} f_{1,n} g^n    and    f_2 [g] = ∑_{n∈N} f_{2,n} g^n .
Multiplying these two equalities, we obtain

    f_1 [g] · f_2 [g] = ∑_{n∈N} ( ∑_{(i,j)∈N²; i+j=n} f_{1,i} f_{2,j} ) g^n .                    (55)
However, we can apply the same computations to x instead of g (since x is
also an FPS with [x^0] x = 0). Thus, we obtain

    f_1 [x] · f_2 [x] = ∑_{n∈N} ( ∑_{(i,j)∈N²; i+j=n} f_{1,i} f_{2,j} ) x^n .
In other words,

    f_1 · f_2 = ∑_{n∈N} ( ∑_{(i,j)∈N²; i+j=n} f_{1,i} f_{2,j} ) x^n .
Hence, substituting g for x, we find

    ( f_1 · f_2 ) [g] = ∑_{n∈N} ( ∑_{(i,j)∈N²; i+j=n} f_{1,i} f_{2,j} ) g^n

(since ∑_{(i,j)∈N²; i+j=n} f_{1,i} f_{2,j} ∈ K for each n ∈ N). Comparing this with (55), we obtain

    ( f_1 · f_2 ) [g] = f_1 [g] · f_2 [g] .
= 0.
Thus, all but finitely many pairs (i, j) ∈ N × N satisfy [x^m] ( f_{1,i} f_{2,j} g^{i+j} ) = 0 (because
all but finitely many such pairs satisfy m < i + j). This proves Statement 1.]
As explained above, Statement 1 shows that the family ( f_{1,i} f_{2,j} g^{i+j} )_{(i,j)∈N×N}
is summable, and thus our interchange of summation signs made above is
justified.
(by Proposition 3.5.4 (f), applied to a = 1). Thus, the FPS f_2^{−1} ◦ g is an inverse of
f_2 ◦ g. Hence, f_2 ◦ g is invertible. The expression ( f_1 ◦ g) / ( f_2 ◦ g) is therefore
well-defined.
It now remains to prove that ( f_1 / f_2 ) ◦ g = ( f_1 ◦ g) / ( f_2 ◦ g). To this purpose,
we argue as follows: The expression f_1 / f_2 is well-defined, since f_2 is invertible.
Proposition 3.5.4 (b) (applied to f_1 / f_2 instead of f_1 ) yields

    ( ( f_1 / f_2 ) · f_2 ) ◦ g = ( ( f_1 / f_2 ) ◦ g ) · ( f_2 ◦ g ) .

In view of ( f_1 / f_2 ) · f_2 = f_1 , this rewrites as f_1 ◦ g = ( ( f_1 / f_2 ) ◦ g ) · ( f_2 ◦ g).
We can divide both sides of this equality by f_2 ◦ g (since f_2 ◦ g is invertible),
and thus obtain ( f_1 ◦ g) / ( f_2 ◦ g) = ( f_1 / f_2 ) ◦ g. In other words, ( f_1 / f_2 ) ◦ g =
( f_1 ◦ g) / ( f_2 ◦ g). Thus, Proposition 3.5.4 (c) is proven.
(d) Let f , g ∈ K[[x]] satisfy [x^0] g = 0. We must prove that f^k ◦ g = ( f ◦ g)^k
for each k ∈ N.
We prove this by induction on k:
Induction base: We have f^0 ◦ g = 1 ◦ g = 1 (by Proposition 3.5.4 (f), applied
to a = 1). Comparing this with ( f ◦ g)^0 = 1, we find f^0 ◦ g = ( f ◦ g)^0 . In other
words, f^k ◦ g = ( f ◦ g)^k holds for k = 0.
Induction step: Let m ∈ N. Assume that f^k ◦ g = ( f ◦ g)^k holds for k = m. We
must prove that f^k ◦ g = ( f ◦ g)^k holds for k = m + 1.
We have assumed that f^k ◦ g = ( f ◦ g)^k holds for k = m. In other words, we
have f^m ◦ g = ( f ◦ g)^m . Now, f^{m+1} = f · f^m , so that

    f^{m+1} ◦ g = ( f · f^m ) ◦ g = ( f ◦ g) · ( f^m ◦ g)
              (by Proposition 3.5.4 (b), applied to f_1 = f and f_2 = f^m )
        = ( f ◦ g) · ( f ◦ g)^m = ( f ◦ g)^{m+1} .

This completes the induction. Thus, Proposition 3.5.4 (d) is proven.
(h) For each n ∈ N, all but finitely many i ∈ I satisfy [x^n] f_i = 0
(by the definition of “summable”). In other words, for each n ∈ N, there exists
a finite subset I_n of I such that

    every i ∈ I \ I_n satisfies [x^n] f_i = 0 .

Consider this subset I_n . Thus, all the sets I_0 , I_1 , I_2 , . . . are finite subsets of I.
Now, let n ∈ N be arbitrary. The set J := I_0 ∪ I_1 ∪ · · · ∪ I_n is a union of n + 1
finite subsets of I (because all the sets I_0 , I_1 , I_2 , . . . are finite subsets of I), and
thus itself is a finite subset of I. Hence, the set J × {0, 1, . . . , n} is finite (since it is
the product of the two finite sets J and {0, 1, . . . , n}).
Now, let (i, m) ∈ ( I × N) \ ( J × {0, 1, . . . , n}). We shall prove that [x^n] ( f_{i,m} g^m ) = 0.
We have (i, m) ∉ J × {0, 1, . . . , n} (since (i, m) ∈ ( I × N) \ ( J × {0, 1, . . . , n})).
We note that f_{i,m} ∈ K and thus [x^n] ( f_{i,m} g^m ) = f_{i,m} · [x^n] ( g^m ) (by (25)). However,
we have [x^0] g = 0. Thus, Proposition 3.5.2 (a) (applied to m instead of n)
yields that the first m coefficients of the FPS g^m are 0. In other words, we have

    [x^k] ( g^m ) = 0    for each k ∈ {0, 1, . . . , m − 1} .                    (59)

Now, if m > n, then [x^n] ( g^m ) = 0 by (59), so that [x^n] ( f_{i,m} g^m ) = 0. If instead
m ≤ n, then i ∉ J ⊇ I_m (since (i, m) ∉ J × {0, 1, . . . , n} forces i ∉ J in this case),
so that f_{i,m} = [x^m] f_i = 0 and again [x^n] ( f_{i,m} g^m ) = 0.
Forget that we fixed (i, m). We thus have shown that all pairs (i, m) outside
the finite set J × {0, 1, . . . , n} satisfy [x^n] ( f_{i,m} g^m ) = 0. In other words, the family
( f_{i,m} g^m )_{(i,m)∈I×N} is summable. This proves (58).]
Now, we have shown that the family ( f_{i,m} g^m )_{(i,m)∈I×N} is summable. Renam-
ing the index (i, m) as (i, n), we thus conclude that the family ( f_{i,n} g^n )_{(i,n)∈I×N}
is summable. The same argument (but with g replaced by x) shows that the
family ( f_{i,n} x^n )_{(i,n)∈I×N} is summable (since the FPS x satisfies [x^0] x = 0).
The proof of ( ∑_{i∈I} f_i ) ◦ g = ∑_{i∈I} ( f_i ◦ g) is now just a matter of computation:
summing the equalities f_i = ∑_{n∈N} f_{i,n} x^n over all i ∈ I, we obtain

    ∑_{i∈I} f_i = ∑_{i∈I} ∑_{n∈N} f_{i,n} x^n = ∑_{n∈N} ∑_{i∈I} f_{i,n} x^n
(here, we have been able to interchange the summation signs, since the family
( f_{i,n} x^n )_{(i,n)∈I×N} is summable). Thus,

    ∑_{i∈I} f_i = ∑_{n∈N} ∑_{i∈I} f_{i,n} x^n = ∑_{n∈N} ( ∑_{i∈I} f_{i,n} ) x^n .                    (60)

Also,

    ∑_{i∈I} ( f_i [g] ) = ∑_{i∈I} ∑_{n∈N} f_{i,n} g^n = ∑_{n∈N} ∑_{i∈I} f_{i,n} g^n = ∑_{n∈N} ( ∑_{i∈I} f_{i,n} ) g^n

(again, we have been able to interchange the summation signs, since the family
( f_{i,n} g^n )_{(i,n)∈I×N} is summable). Comparing this with the result of substituting
g for x in (60), we find ( ∑_{i∈I} f_i ) [g] = ∑_{i∈I} ( f_i [g] ). In other words,
( ∑_{i∈I} f_i ) ◦ g = ∑_{i∈I} ( f_i ◦ g). This proves Proposition 3.5.4 (h).
(e) This is [Loehr11, Theorem 7.63] and [Brewer14, Proposition 2.2.5].^{24} Again,
let us give the proof:
Write the FPS g in the form g = ∑_{n∈N} g_n x^n for some g_0 , g_1 , g_2 , . . . ∈ K. Then,
g_0 = [x^0] g = 0. Moreover, g ◦ h = g [h] = ∑_{n∈N} g_n h^n (by Definition 3.5.1, because
g = ∑_{n∈N} g_n x^n with g_0 , g_1 , g_2 , . . . ∈ K). But Proposition 3.5.2 (c) (applied to g,
h and g_n instead of f , g and f_n ) yields [x^0] ( ∑_{n∈N} g_n h^n ) = g_0 = 0. In view
of g ◦ h = ∑_{n∈N} g_n h^n , this shows that [x^0] ( g ◦ h) = 0.

24 See also [19s, Proposition 7.6.14] for a similar property for polynomials.
Moreover, the family ( f_n g^n )_{n∈N} is summable (by Proposition 3.5.2 (b)). Hence,
Proposition 3.5.4 (h) (applied to ( f_n g^n )_{n∈N} and h instead of ( f_i )_{i∈I} and g) yields
that the family (( f_n g^n ) ◦ h)_{n∈N} ∈ K[[x]]^N is summable as well and that we have

    ( ∑_{n∈N} f_n g^n ) ◦ h = ∑_{n∈N} (( f_n g^n ) ◦ h) .                    (61)

In view of f ◦ g = ∑_{n∈N} f_n g^n , this rewrites as

    ( f ◦ g) ◦ h = ∑_{n∈N} (( f_n g^n ) ◦ h) .                    (62)
Example 3.5.7. Let us use Proposition 3.5.4 (c) to justify the equality (48) that
we used in Example 3.5.3. Indeed, we know that the FPS 1 − x is invertible.
Thus, applying Proposition 3.5.4 (c) to f_1 = 1 and f_2 = 1 − x and g = x + x^2 ,
we obtain

    (1 / (1 − x)) ◦ ( x + x^2 ) = ( 1 ◦ ( x + x^2 ) ) / ( (1 − x) ◦ ( x + x^2 ) ) .

Using the notation f [g] instead of f ◦ g, we can rewrite this as

    (1 / (1 − x)) [ x + x^2 ] = ( 1 [ x + x^2 ] ) / ( (1 − x) [ x + x^2 ] ) .

In view of 1 [ x + x^2 ] = 1 and (1 − x) [ x + x^2 ] = 1 − ( x + x^2 ) = 1 − x − x^2 ,
this rewrites as (1 / (1 − x)) [ x + x^2 ] = 1 / (1 − x − x^2) . Thus, (48) is proved.
This justifies some more of the things we did back in Section 3.1; in particular,
Example 1 from that section is now fully justified. But we still have not defined
(e.g.) the square root of an FPS, which we used in Example 2.
Before I explain square roots, let me quickly survey differentiation of FPSs.
To make sure that this derivative behaves nicely, we need to check that it
satisfies the familiar properties of derivatives. And indeed, it does:
(b) If ( f_i )_{i∈I} is a summable family of FPSs, then the family ( f_i′ )_{i∈I} is
summable as well, and we have

    ( ∑_{i∈I} f_i )′ = ∑_{i∈I} f_i′ .
    ( f ◦ g)′ = ( f ′ ◦ g) · g′
In view of ( f / g) · g = f , this rewrites as f ′ = ( f / g)′ · g + ( f / g) · g′ . Solving
this for ( f / g)′ , we find

    ( f / g)′ = ( f ′ g − f g′ ) / g² .

This proves Theorem 3.6.2 (e).
(f) This follows by induction on n, using part (d) (in the induction step) and
1′ = 0 (in the induction base).
(g) Let f , g ∈ K[[x]] be two FPSs such that f is a polynomial or [x^0] g = 0.
Write f as f = ∑_{n∈N} f_n x^n . Then,

    ( f ◦ g)′ = ∑_{n>0} f_n ( g^n )′ = ∑_{n>0} f_n n g′ g^{n−1} = ∑_{n>0} n f_n g^{n−1} g′ .                    (64)

On the other hand, the definition of substitution (applied to f ′ , which is
legitimate since [x^0] g = 0 or f is a polynomial) yields

    f ′ [g] = ∑_{m∈N} (m + 1) f_{m+1} g^m = ∑_{n>0} n f_n g^{n−1}

(here, we have substituted n − 1 for m in the sum). Multiplying this by g′ and
comparing with (64), we obtain ( f ◦ g)′ = ( f ′ ◦ g) · g′ . This proves Theorem 3.6.2 (g).
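The chain rule, too, can be checked numerically on truncated coefficient lists. Here is a quick Python sanity check (the helpers `mul`, `substitute`, `deriv` and the truncation order `N` are ad-hoc names for this illustration; note that the last tracked coefficient of a truncated derivative is unknown, so we only compare the first N − 1 coefficients):

```python
N = 8  # truncation order

def mul(f, g):
    """Truncated product of two coefficient lists of length N."""
    return [sum(f[i] * g[n - i] for i in range(n + 1)) for n in range(N)]

def substitute(f, g):
    """Truncated f ◦ g = sum_n f_n g^n; requires g[0] == 0."""
    assert g[0] == 0
    result, power = [0] * N, [1] + [0] * (N - 1)
    for n in range(N):
        result = [result[k] + f[n] * power[k] for k in range(N)]
        power = mul(power, g)
    return result

def deriv(f):
    """Truncated derivative; the coefficient of x^{N-1} is unknown, so pad with 0."""
    return [(n + 1) * f[n + 1] for n in range(N - 1)] + [0]

f = [1] * N                    # truncation of 1/(1-x)
g = [0, 1, 1] + [0] * (N - 3)  # x + x^2

lhs = deriv(substitute(f, g))                    # (f ◦ g)'
rhs = mul(substitute(deriv(f), g), deriv(g))     # (f' ◦ g) · g'
assert lhs[:N - 1] == rhs[:N - 1]
```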
3.7.1. Definitions

Convention 3.7.1 entails that elements of K can be divided by the positive inte-
gers 1, 2, 3, . . .. We can use this to define three specific (and particularly impor-
tant) FPSs over K:

    exp := ∑_{n∈N} (1/n!) x^n ,

    log := ∑_{n≥1} ((−1)^{n−1}/n) x^n ,

    \overline{exp} := exp − 1 = ∑_{n≥1} (1/n!) x^n .

(The last equality sign here follows from exp = ∑_{n∈N} (1/n!) x^n =
(1/0!) x^0 + ∑_{n≥1} (1/n!) x^n = 1 + ∑_{n≥1} (1/n!) x^n .)
Note that the FPS exp is the usual exponential series from analysis, but now
manifesting itself as an FPS. Likewise, log is the Mercator series for log (1 + x ),
where log stands for the natural logarithm function. The natural logarithm
function itself cannot be interpreted as an FPS, since log 0 is undefined.
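These three series are easy to write down explicitly with exact rational coefficients. Here is a small Python illustration (the variable names `exp_s`, `log_s`, `expbar` and the truncation order `N` are ad-hoc choices for this illustration, not notation from the notes):

```python
from fractions import Fraction
from math import factorial

N = 8
exp_s  = [Fraction(1, factorial(n)) for n in range(N)]                        # exp
log_s  = [Fraction(0)] + [Fraction((-1) ** (n - 1), n) for n in range(1, N)]  # log
expbar = [Fraction(0)] + exp_s[1:]                                            # exp - 1

assert exp_s[:4] == [1, 1, Fraction(1, 2), Fraction(1, 6)]
assert log_s[:5] == [0, 1, Fraction(-1, 2), Fraction(1, 3), Fraction(-1, 4)]
```

Working over Q (here via `Fraction`) is exactly where Convention 3.7.1 is needed: the coefficients 1/n! and (−1)^{n−1}/n only make sense because K is a Q-algebra.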
exp and log around 1, with the reservation that the point 1 has been moved
to the origin by a shift). This approach uses nontrivial results from complex
analysis, so I will not follow it and instead start from scratch.
The main tool in the proof of (65) will be the following useful proposition
([Grinbe17, Lemma 0.4]):
Proposition 3.7.3. Let g ∈ K[[x]] be an FPS with [x^0] g = 0. Then:
(a) We have (exp ◦ g)′ = ( \overline{exp} ◦ g)′ = (exp ◦ g) · g′ .
(b) We have (log ◦ g)′ = (1 + g)^{−1} · g′ .
Proof of Proposition 3.7.3. (a) Let us first show that exp′ = \overline{exp}′ = exp. Indeed,
\overline{exp} = exp − 1, so that \overline{exp}′ = exp′ , and

    exp′ = ∑_{n≥1} n · (1/n!) x^{n−1} = ∑_{n≥1} (1/(n − 1)!) x^{n−1}        (since n! = n · (n − 1)!)
         = ∑_{n∈N} (1/n!) x^n        (here, we have substituted n for n − 1 in the sum)
         = exp .                    (67)
Now, we can apply the chain rule (Theorem 3.6.2 (g)) to f = exp (since
[x^0] g = 0), and thus obtain

    (exp ◦ g)′ = (exp′ ◦ g) · g′ = (exp ◦ g) · g′        (by (67)).                    (68)

The same computation (but with exp replaced by \overline{exp}) yields ( \overline{exp} ◦ g)′ =
(exp ◦ g) · g′ . Combining these two formulas, we obtain (exp ◦ g)′ = ( \overline{exp} ◦ g)′ =
(exp ◦ g) · g′ . Thus, we have proved Proposition 3.7.3 (a).
(b) We have log = ∑_{n≥1} ((−1)^{n−1}/n) x^n . Thus,

    log′ = ( ∑_{n≥1} ((−1)^{n−1}/n) x^n )′
         = ∑_{n≥1} ((−1)^{n−1}/n) ( x^n )′        (by Theorem 3.6.2 (b))
         = ∑_{n≥1} ((−1)^{n−1}/n) · n x′ x^{n−1}        (by Theorem 3.6.2 (f), applied to x instead of g)
         = ∑_{n≥1} (−1)^{n−1} x^{n−1}        (since x′ = 1)
         = ∑_{n∈N} (−1)^n x^n

(here, we have substituted n for n − 1 in the sum). On the other hand, Proposi-
tion 3.3.7 yields (1 + x )^{−1} = ∑_{n∈N} (−1)^n x^n . Comparing these two equalities,
we find

    log′ = (1 + x )^{−1} .                    (69)
Now, we can apply the chain rule (Theorem 3.6.2 (g)) to f = log (since
[x^0] g = 0), and thus obtain

    (log ◦ g)′ = (log′ ◦ g) · g′ = ( (1 + x )^{−1} ◦ g ) · g′        (by (69)).                    (70)

Since Proposition 3.5.4 (c) yields

    (1 + x )^{−1} ◦ g = ( 1 ◦ g ) / ( (1 + x ) ◦ g )

and

    (1 + x ) ◦ g = 1 + g        (this follows easily from Definition 3.5.1) ,

this rewrites as (1 + x )^{−1} ◦ g = 1 / (1 + g) = (1 + g)^{−1} . Hence, (70) becomes

    (log ◦ g)′ = ( (1 + x )^{−1} ◦ g ) · g′ = (1 + g)^{−1} · g′ .

This proves Proposition 3.7.3 (b).
We will need a very simple lemma, which says (in particular) that if two FPSs
have constant terms 0, then so does their composition:

Lemma 3.7.4. Let f , g ∈ K[[x]] be two FPSs with [x^0] g = 0. Then,
[x^0] ( f ◦ g) = [x^0] f .
However, \overline{exp} = exp − 1 and thus 1 + \overline{exp} = exp. Now, Proposition 3.7.3 (b)
(applied to g = \overline{exp}) yields

    (log ◦ \overline{exp})′ = (1 + \overline{exp})^{−1} · \overline{exp}′
        = exp^{−1} · exp        (since 1 + \overline{exp} = exp and \overline{exp}′ = exp′ = exp)
        = 1 = x′

(since x′ = 1). Hence, Theorem 3.6.2 (h) (applied to f = log ◦ \overline{exp} and g = x)
yields that log ◦ \overline{exp} − x is constant. In other words, log ◦ \overline{exp} − x = a for some
a ∈ K. Consider this a. From log ◦ \overline{exp} − x = a, we obtain [x^0] ( log ◦ \overline{exp} − x ) =
[x^0] a = a. Comparing this with (71), we find a = 0. Hence, log ◦ \overline{exp} − x = a
rewrites as log ◦ \overline{exp} − x = 0. In other words, log ◦ \overline{exp} = x.
Now it remains to prove that \overline{exp} ◦ log = x. There are (at least) two ways to
do this:

• 1st way: A homework exercise (Exercise A.2.5.2) says that any FPS f with
[x^0] f = 0 and with [x^1] f invertible has a unique compositional inverse
(i.e., there is a unique FPS g with [x^0] g = 0 and f ◦ g = g ◦ f = x). We
can apply this to f = log (since [x^0] log = 0 and since [x^1] log = 1 is
invertible), and thus see that log has a unique compositional inverse g.
This compositional inverse g must be \overline{exp}, since log ◦ \overline{exp} = x (indeed,
comparing ( g ◦ log ) ◦ \overline{exp} = x ◦ \overline{exp} = \overline{exp} with

    ( g ◦ log ) ◦ \overline{exp} = g ◦ ( log ◦ \overline{exp} )        (by Proposition 3.5.4 (e))
        = g ◦ x = g ,

we obtain g = \overline{exp}). Hence, \overline{exp} ◦ log = g ◦ log = x.
• 2nd way: Here is a more direct argument. We shall first show that exp ◦ log =
1 + x.
To wit: The FPS 1 + x is invertible (by Proposition 3.3.9). Thus, applying
the quotient rule (Theorem 3.6.2 (e)) to f = exp ◦ log and g = 1 + x, we
obtain

    ( (exp ◦ log) / (1 + x) )′ = ( (exp ◦ log)′ · (1 + x ) − (exp ◦ log) · (1 + x )′ ) / (1 + x )² .
In view of

    (exp ◦ log)′ = (exp ◦ log) · log′        (by Proposition 3.7.3 (a), applied to g = log)
               = (exp ◦ log) · (1 + x )^{−1}        (by (69))
and (1 + x )′ = 1, the numerator on the right hand side equals
(exp ◦ log) · (1 + x )^{−1} · (1 + x ) − exp ◦ log = 0, so that ( (exp ◦ log) / (1 + x ) )′ = 0.
Thus, Theorem 3.6.2 (h) (applied to f = (exp ◦ log) / (1 + x ) and g = 0) yields that
(exp ◦ log) / (1 + x ) − 0 is constant. In other words, (exp ◦ log) / (1 + x ) is constant.
In other words, (exp ◦ log) / (1 + x ) = a for some a ∈ K. Consider this a. From
(exp ◦ log) / (1 + x ) = a, we obtain exp ◦ log = a (1 + x ). Thus,

    [x^0] (exp ◦ log) = [x^0] ( a (1 + x )) = a.
However, it is easy to see that [x^0] (exp ◦ log) = 1 ^{25} . Comparing these
two equalities, we find a = 1. Thus, exp ◦ log = a (1 + x ) = 1 + x.
Now, \overline{exp} = exp − 1 = exp + (−1). Hence,

    \overline{exp} ◦ log = (exp + (−1)) ◦ log = exp ◦ log + (−1) ◦ log
              (by Proposition 3.5.4 (a), applied to f_1 = exp and f_2 = −1 and g = log;
               here, (−1) ◦ log = −1 by Proposition 3.5.4 (f), applied to −1 and log
               instead of a and g)
        = (1 + x ) + (−1) = x.

Either way, we have shown that \overline{exp} ◦ log = x. Thus, the proof of Theorem
3.7.5 is complete.
25 Proof. Recall that [x^0] log = 0. Hence, Lemma 3.7.4 (applied to f = exp and g = log) yields

    [x^0] (exp ◦ log) = [x^0] exp = 1/0!        (since exp = ∑_{n∈N} (1/n!) x^n)
                      = 1.
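Theorem 3.7.5 can also be verified numerically on truncated coefficient lists. Here is a quick Python sanity check (the helpers `mul` and `substitute` and the names `exp_s`, `log_s`, `expbar`, `x_s` are ad-hoc choices for this illustration): composing log with exp − 1 in either order reproduces the coefficients of x exactly, since every truncated coefficient of these compositions is computed by finite exact arithmetic:

```python
from fractions import Fraction
from math import factorial

N = 8

def mul(f, g):
    return [sum(f[i] * g[n - i] for i in range(n + 1)) for n in range(N)]

def substitute(f, g):
    """Truncated f ◦ g = sum_n f_n g^n; requires g[0] == 0."""
    assert g[0] == 0
    result, power = [Fraction(0)] * N, [Fraction(1)] + [Fraction(0)] * (N - 1)
    for n in range(N):
        result = [result[k] + f[n] * power[k] for k in range(N)]
        power = mul(power, g)
    return result

exp_s  = [Fraction(1, factorial(n)) for n in range(N)]
log_s  = [Fraction(0)] + [Fraction((-1) ** (n - 1), n) for n in range(1, N)]
expbar = [Fraction(0)] + exp_s[1:]                       # exp - 1
x_s    = [Fraction(0), Fraction(1)] + [Fraction(0)] * (N - 2)

assert substitute(log_s, expbar) == x_s  # log ◦ (exp - 1) = x
assert substitute(expbar, log_s) == x_s  # (exp - 1) ◦ log = x
```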
Definition 3.7.6. Let K[[x]]_0 denote the set of all FPSs f ∈ K[[x]] with
[x^0] f = 0, and let K[[x]]_1 denote the set of all FPSs f ∈ K[[x]] with [x^0] f = 1.
We define two maps

    Exp : K[[x]]_0 → K[[x]]_1 ,    f ↦ exp ◦ f

and

    Log : K[[x]]_1 → K[[x]]_0 ,    f ↦ log ◦ ( f − 1) .

(These two maps are well-defined according to parts (c) and (d) of Lemma
3.7.7 below.)
The maps Exp and Log are algebraic analogues of the maps in complex anal-
ysis that take any holomorphic function f to its exponential and logarithm,
respectively (at least within certain regions in which these things are well-
defined). As one would hope, and as we will soon see, they are mutually
inverse. Let us first check that their definition is justified:
Lemma 3.7.7. (a) For any f , g ∈ K[[x]]_0 , we have f ◦ g ∈ K[[x]]_0 .
(b) For any f ∈ K[[x]]_1 and g ∈ K[[x]]_0 , we have f ◦ g ∈ K[[x]]_1 .
(c) For any g ∈ K[[x]]_0 , we have exp ◦ g ∈ K[[x]]_1 .
(d) For any f ∈ K[[x]]_1 , we have f − 1 ∈ K[[x]]_0 and log ◦ ( f − 1) ∈ K[[x]]_0 .
Proof of Lemma 3.7.7. (a) Let f , g ∈ K[[x]]_0 . In view of the definition of K[[x]]_0 ,
this entails that [x^0] f = 0 and [x^0] g = 0. Hence, Lemma 3.7.4 yields
[x^0] ( f ◦ g) = [x^0] f = 0. In other words, f ◦ g ∈ K[[x]]_0 (by the definition of
K[[x]]_0 ). This proves Lemma 3.7.7 (a).
(b) This is analogous to the proof of Lemma 3.7.7 (a).
(c) Let g ∈ K[[x]]_0 . From exp = ∑_{n∈N} (1/n!) x^n , we obtain [x^0] exp = 1/0! = 1,
so that exp ∈ K[[x]]_1 . Hence, Lemma 3.7.7 (b) (applied to f = exp) yields
exp ◦ g ∈ K[[x]]_1 . This proves Lemma 3.7.7 (c).
(d) Let f ∈ K[[x]]_1 . Thus, [x^0] f = 1. Now, (21) yields [x^0] ( f − 1) =
[x^0] f − [x^0] 1 = 1 − 1 = 0, so that f − 1 ∈ K[[x]]_0 . Furthermore, [x^0] log = 0
(since log = ∑_{n≥1} ((−1)^{n−1}/n) x^n ) and thus log ∈ K[[x]]_0 . Hence, Lemma 3.7.7 (a)
(applied to log and f − 1 instead of f and g) yields log ◦ ( f − 1) ∈ K[[x]]_0 .
Thus, Lemma 3.7.7 (d) is proven.
Lemma 3.7.8. The maps Exp and Log are mutually inverse bijections between
K [[ x ]]0 and K [[ x ]]1 .
(by Proposition 3.5.4 (a), applied to f_1 = \overline{exp} and f_2 = 1). However, Proposition
3.5.4 (f) (applied to a = 1) yields 1 ◦ g = 1. Hence, exp ◦ g = \overline{exp} ◦ g + 1 ◦ g =
\overline{exp} ◦ g + 1. This proves (72).]
Now, let us show that Exp ◦ Log = id. Indeed, we fix some f ∈ K[[x]]_1 . Then,
f − 1 ∈ K[[x]]_0 (by Lemma 3.7.7 (d)). Hence, Proposition 3.5.4 (e) (applied
to \overline{exp}, log and f − 1 instead of f , g and h) yields ( \overline{exp} ◦ log ) ◦ ( f − 1) =
\overline{exp} ◦ ( log ◦ ( f − 1) ) . Thus,

    \overline{exp} ◦ ( log ◦ ( f − 1) ) = ( \overline{exp} ◦ log ) ◦ ( f − 1)
        = x ◦ ( f − 1)        (by Theorem 3.7.5)
        = f − 1                    (73)

(by Proposition 3.5.4 (g)).

26 As before, the “◦” operation behaves like multiplication in the sense of PEMDAS conventions.
Thus, the expression “ \overline{exp} ◦ g + 1” means ( \overline{exp} ◦ g) + 1.
Proof of Lemma 3.7.9 (sketched). (a) Like many of our arguments involving FPSs,
this will be a short computation followed by lengthy technical arguments justi-
fying the interchanges of summation signs. (In this aspect, our algebraic replica
of the analysis of infinite sums doesn’t differ that much from the original.) We
begin with the computation; the justifying arguments will be sketched after-
wards.
Let f , g ∈ K[[x]]_0 . Thus, [x^0] f = 0 and [x^0] g = 0. Hence, f + g ∈ K[[x]]_0
(since (20) yields [x^0] ( f + g) = [x^0] f + [x^0] g = 0 + 0 = 0).
By the definition of Exp, we have

    Exp f = exp ◦ f = exp [ f ] = ∑_{n∈N} (1/n!) f^n

(by Definition 3.5.1, since exp = ∑_{n∈N} (1/n!) x^n ). Similarly,

    Exp g = ∑_{n∈N} (1/n!) g^n

and

    Exp ( f + g) = ∑_{n∈N} (1/n!) ( f + g)^n .
Now,

    Exp ( f + g) = ∑_{n∈N} (1/n!) ( f + g)^n
        = ∑_{n∈N} (1/n!) ∑_{k=0}^{n} \binom{n}{k} f^k g^{n−k}        (by the binomial theorem)
        = ∑_{n∈N} ∑_{k=0}^{n} (1/n!) \binom{n}{k} f^k g^{n−k}
        = ∑_{n∈N} ∑_{k=0}^{n} (1/(k! (n − k)!)) f^k g^{n−k}        (by (2))
        = ∑_{k∈N} ∑_{n≥k} (1/(k! (n − k)!)) f^k g^{n−k}
        = ∑_{k∈N} ∑_{ℓ∈N} (1/(k! ℓ!)) f^k g^ℓ

(here, we have substituted ℓ for n − k in the second sum).
Comparing this with

    Exp f · Exp g = ( ∑_{k∈N} (1/k!) f^k ) · ( ∑_{ℓ∈N} (1/ℓ!) g^ℓ )
          (since Exp f = ∑_{n∈N} (1/n!) f^n = ∑_{k∈N} (1/k!) f^k
           and Exp g = ∑_{n∈N} (1/n!) g^n = ∑_{ℓ∈N} (1/ℓ!) g^ℓ )
        = ∑_{k∈N} ∑_{ℓ∈N} (1/(k! ℓ!)) f^k g^ℓ ,

we obtain Exp ( f + g) = Exp f · Exp g.
Statement 2.]
Note that Statement 2 entails that the family ( f^k g^ℓ )_{(k,ℓ)∈N×N} is summable (because
when m ∈ N is given, all but finitely many pairs (k, ℓ) ∈ N × N satisfy m < k + ℓ).
However, we need to prove Statement 1, so let us do this:
[Proof of Statement 1: Let m ∈ N. If (n, k ) ∈ N × N is a pair satisfying k ≤ n and
m < n, then

    [x^m] ( (1/(k! (n − k)!)) f^k g^{n−k} ) = (1/(k! (n − k)!)) · [x^m] ( f^k g^{n−k} )        (by (25))
        = 0

(by Statement 2 (applied to ℓ = n − k), since m < n = k + (n − k)).
Thus, all but finitely many pairs (n, k ) ∈ N × N satisfying k ≤ n satisfy
[x^m] ( (1/(k! (n − k)!)) f^k g^{n−k} ) = 0 (because all but finitely many such pairs
satisfy m < n). This proves Statement 1.]
As explained above, Statement 1 shows that the family
( (1/(k! (n − k)!)) f^k g^{n−k} )_{(n,k)∈N×N satisfying k≤n} is summable, and thus our
interchange of summation signs made above is justified. This completes our proof
of Lemma 3.7.9 (a).
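Lemma 3.7.9 (a) is the algebraic shadow of the familiar rule e^{f+g} = e^f e^g, and a truncated numerical check is easy. Here is a short Python sanity check (the helpers `mul` and `substitute` and the names `exp_s`, `N` are ad-hoc choices for this illustration), using f = x and g = x²:

```python
from fractions import Fraction
from math import factorial

N = 8

def mul(f, g):
    return [sum(f[i] * g[n - i] for i in range(n + 1)) for n in range(N)]

def substitute(f, g):
    """Truncated f ◦ g = sum_n f_n g^n; requires g[0] == 0."""
    assert g[0] == 0
    result, power = [Fraction(0)] * N, [Fraction(1)] + [Fraction(0)] * (N - 1)
    for n in range(N):
        result = [result[k] + f[n] * power[k] for k in range(N)]
        power = mul(power, g)
    return result

exp_s = [Fraction(1, factorial(n)) for n in range(N)]  # the FPS exp

f = [Fraction(0), Fraction(1)] + [Fraction(0)] * (N - 2)               # f = x
g = [Fraction(0), Fraction(0), Fraction(1)] + [Fraction(0)] * (N - 3)  # g = x^2

Exp_f = substitute(exp_s, f)
Exp_g = substitute(exp_s, g)
Exp_sum = substitute(exp_s, [a + b for a, b in zip(f, g)])
assert Exp_sum == mul(Exp_f, Exp_g)  # Exp(f + g) = Exp f · Exp g
```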
(b) This easily follows from part (a), since we know that Log is inverse to
Exp. Here are the details:
Let f , g ∈ K [[ x ]]1 . Set u = Log f and v = Log g; then, u, v ∈ K [[ x ]]0 (since
Log is a map from K [[ x ]]1 to K [[ x ]]0 ). Hence, Lemma 3.7.9 (a) (applied to u and
v instead of f and g) yields Exp (u + v) = Exp u · Exp v.
However, Lemma 3.7.8 says that the maps Exp and Log are mutually inverse
bijections between K [[ x ]]0 and K [[ x ]]1 . Hence, Exp ◦ Log = id and Log ◦ Exp =
id.
Proof of Proposition 3.7.10. (a) It is clear that the set K[[x]]_0 contains the FPS
0 (since [x^0] 0 = 0). Thus, it remains to show that K[[x]]_0 is closed under
addition and subtraction. But this is easy: If f , g ∈ K[[x]]_0 , then [x^0] f = 0
and [x^0] g = 0, and therefore f + g ∈ K[[x]]_0 (since (20) yields [x^0] ( f + g) =
[x^0] f + [x^0] g = 0) and f − g ∈ K[[x]]_0 (by a similar argument using (21)).
Thus, K[[x]]_0 is closed under addition and subtraction. This proves Proposition
3.7.10 (a).
(b) Any a ∈ K[[x]]_1 is invertible in K[[x]] (indeed, a ∈ K[[x]]_1 shows that
[x^0] a = 1; thus, [x^0] a is invertible in K; therefore, Proposition 3.3.7 entails that
a is invertible in K[[x]]). Hence, f / g is well-defined for any f , g ∈ K[[x]]_1 .
Next, we claim that K[[x]]_1 is closed under multiplication. Indeed, if f , g ∈
K[[x]]_1 , then [x^0] f = 1 and [x^0] g = 1, and therefore f g ∈ K[[x]]_1 (since (24)
yields [x^0] ( f g) = [x^0] f · [x^0] g = 1). This shows that K[[x]]_1 is closed under
multiplication.
It remains to prove that K[[x]]_1 is closed under division. Indeed, if f , g ∈
K[[x]]_1 , then [x^0] f = 1 and [x^0] g = 1, and therefore f / g ∈ K[[x]]_1 (because we
have f = ( f / g) · g and thus

    [x^0] f = [x^0] ( ( f / g) · g ) = [x^0] ( f / g) · [x^0] g        (by (24))
            = [x^0] ( f / g)        (since [x^0] g = 1),

so that [x^0] ( f / g) = [x^0] f = 1, which means f / g ∈ K[[x]]_1 ). This shows that
K[[x]]_1 is closed under division. Thus, Proposition 3.7.10 (b) is proven.
The two groups in Proposition 3.7.10 can now be connected through Exp and
Log:

Theorem 3.7.11. The maps

    Exp : (K[[x]]_0 , +, 0) → (K[[x]]_1 , ·, 1)

and

    Log : (K[[x]]_1 , ·, 1) → (K[[x]]_0 , +, 0)

are mutually inverse group isomorphisms.
Proof of Theorem 3.7.11 (sketched). Lemma 3.7.9 yields that these two maps are
group homomorphisms27 . Lemma 3.7.8 shows that they are mutually inverse.
Combining these results, we conclude that these two maps are mutually inverse
group isomorphisms. This proves Theorem 3.7.11.
Theorem 3.7.11 helps us turn addition into multiplication and vice versa
when it comes to FPSs, at least if the constant terms are the right ones. This
will come in useful rather soon.
27 Here, we are using the following fact: If ( G, ∗, e_G ) and ( H, ∗, e_H ) are any two groups, and if
Φ : G → H is a map such that every f , g ∈ G satisfy Φ ( f ∗ g) = Φ ( f ) ∗ Φ ( g), then Φ is a
group homomorphism.
The reason why this FPS loder f is called the logarithmic derivative of f is
made clear by the following simple fact:

    (Log f )′ = ( f − 1)′ / f .
More generally, let us try to define non-integer powers of FPSs (since square
roots are just 1/2-th powers). Thus, we are trying to solve the following prob-
lem:
• It should not conflict with the existing notion of f^c for c ∈ N. That is,
if c ∈ N, then our new definition of f^c should yield the same result as
the existing meaning that f^c has in this case (namely, f f · · · f with c
factors). The same should hold for c ∈ Z when f is invertible.
• For any positive integer n and any FPS f ∈ K[[x]], the 1/n-th power f^{1/n}
should be an n-th root of f (that is, an FPS whose n-th power is f ). (This
actually follows from the previous two properties, since we can apply the
rule ( f^a )^b = f^{ab} to a = 1/n and b = n.)
Besides imposing the above wishlist of properties, we want this c-th power f^c
itself to belong to K[[x]]_1 , since otherwise the iterated power ( f^a )^b in our rules
of exponents might be undefined.
It turns out that this is still too much to ask. Indeed, if K = Z/2, then the
FPS 1 + x ∈ K[[x]]_1 has no square root (you get to prove this in Exercise A.2.8.1
(c)), so its 1/2-th power (1 + x )^{1/2} cannot be reasonably defined.
However, if we assume (as in Convention 3.7.1) that K is a commutative Q-
algebra, then we get lucky: Our “more realistic problem” can be solved in (at
least) two ways:
1st solution: We define

    (1 + x )^c := ∑_{k∈N} \binom{c}{k} x^k    for each c ∈ K,

in order to make Newton’s binomial formula (Theorem 3.3.10) hold for arbi-
trary exponents^{30}. Subsequently, we define
from Theorem 3.7.11. Thus, for any f ∈ K[[x]]_1 and any c ∈ Z, the equation

    Log ( f^c ) = c Log f

holds.

30 Note that \binom{c}{k} = c (c − 1) (c − 2) · · · (c − k + 1) / k! is well-defined since K is a
commutative Q-algebra.
This definition of f^c does not conflict with our original definition of f^c when
c ∈ Z, because (as we said) the original definition of f^c already satisfies
Log ( f^c ) = c Log f and therefore f^c = Exp (c Log f ).
Moreover, Definition 3.8.1 makes the rules of exponents hold:

Theorem 3.8.2. Assume that K is a commutative Q-algebra. For any a, b ∈ K
and f , g ∈ K[[x]]_1 , we have

    f^{a+b} = f^a f^b ,    ( f g)^a = f^a g^a ,    ( f^a )^b = f^{ab} .
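The definition f^c = Exp (c Log f ) is directly computable on truncated coefficient lists, and the first rule of exponents can be sanity-checked numerically in Python. (The helper names `mul`, `substitute`, `fps_pow` and the truncation order `N` are ad-hoc choices for this illustration, not notation from the notes.)

```python
from fractions import Fraction
from math import factorial

N = 8

def mul(f, g):
    return [sum(f[i] * g[n - i] for i in range(n + 1)) for n in range(N)]

def substitute(f, g):
    """Truncated f ◦ g = sum_n f_n g^n; requires g[0] == 0."""
    assert g[0] == 0
    result, power = [Fraction(0)] * N, [Fraction(1)] + [Fraction(0)] * (N - 1)
    for n in range(N):
        result = [result[k] + f[n] * power[k] for k in range(N)]
        power = mul(power, g)
    return result

exp_s = [Fraction(1, factorial(n)) for n in range(N)]
log_s = [Fraction(0)] + [Fraction((-1) ** (n - 1), n) for n in range(1, N)]

def fps_pow(f, c):
    """f^c := Exp(c · Log f) for f with constant term 1 (as in Definition 3.8.1)."""
    assert f[0] == 1
    log_f = substitute(log_s, [Fraction(0)] + list(f[1:]))  # Log f = log ◦ (f - 1)
    return substitute(exp_s, [c * t for t in log_f])        # Exp(c · Log f)

one_plus_x = [Fraction(1), Fraction(1)] + [Fraction(0)] * (N - 2)
a, b = Fraction(1, 2), Fraction(1, 3)

# f^{a+b} = f^a · f^b, and f^2 agrees with the old meaning f · f
assert fps_pow(one_plus_x, a + b) == mul(fps_pow(one_plus_x, a), fps_pow(one_plus_x, b))
assert fps_pow(one_plus_x, Fraction(2)) == mul(one_plus_x, one_plus_x)
```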
    (1 − 2xC )^2 = 1 − 4x.

Taking both sides of this equation to the 1/2-th power, we obtain

    ( (1 − 2xC )^2 )^{1/2} = (1 − 4x )^{1/2}

(since both sides are FPSs with constant term 1). However, the FPS 1 − 2xC
has constant term 1; thus, the rules of exponents yield ( (1 − 2xC )^2 )^{1/2} =
(1 − 2xC )^{2 · (1/2)} = 1 − 2xC. Hence,

    1 − 2xC = ( (1 − 2xC )^2 )^{1/2} = (1 − 4x )^{1/2} .
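Solving the last equality for C gives C = (1 − (1 − 4x)^{1/2}) / (2x), and this is easy to verify numerically against the Catalan numbers. Here is a quick Python sanity check using the generalized binomial series for (1 + x)^c with c = 1/2 (the names `gbinom`, `sqrt_1_minus_4x`, `catalan` are ad-hoc choices for this illustration):

```python
from fractions import Fraction

def gbinom(c, k):
    """Generalized binomial coefficient c(c-1)···(c-k+1)/k!."""
    out = Fraction(1)
    for i in range(k):
        out = out * (c - i) / (i + 1)
    return out

half = Fraction(1, 2)

# coefficients of (1 - 4x)^{1/2}: substitute -4x into the binomial series for (1+x)^{1/2}
sqrt_1_minus_4x = [gbinom(half, k) * Fraction(-4) ** k for k in range(9)]

# C = (1 - (1-4x)^{1/2}) / (2x): the constant terms cancel, so shift down one power of x
catalan = [-sqrt_1_minus_4x[k + 1] / 2 for k in range(8)]

assert catalan == [1, 1, 2, 5, 14, 42, 132, 429]  # the Catalan numbers
```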
The following proof illustrates a technique that will probably appear prepos-
terous if you are seeing it for the first time, but is in fact both legitimate and
rather useful.
Proof of Theorem 3.8.3 (sketched). The definition of (1 + x )^c (combined with
Log (1 + x ) = log, which follows from the definition of Log) yields

    (1 + x )^c = Exp (c Log (1 + x )) = Exp (c log)        (since Log (1 + x ) = log)
        = exp ◦ (c log)        (by the definition of Exp)
        = exp ◦ ( c ∑_{n≥1} ((−1)^{n−1}/n) x^n )        (since log = ∑_{n≥1} ((−1)^{n−1}/n) x^n)
        = exp ◦ ( ∑_{n≥1} ((−1)^{n−1}/n) c x^n )
        = ∑_{m∈N} (1/m!) ( ∑_{n≥1} ((−1)^{n−1}/n) c x^n )^m                    (77)

(by Definition 3.5.1, since exp = ∑_{n∈N} (1/n!) x^n = ∑_{m∈N} (1/m!) x^m ).
Now, fix m ∈ N. We shall expand ( ∑_{n≥1} ((−1)^{n−1}/n) c x^n )^m . Indeed, we can
replace the “ ∑_{n≥1} ” sign by an “ ∑_{n∈P} ” sign, since P = {1, 2, 3, . . .}. Thus,

    ( ∑_{n≥1} ((−1)^{n−1}/n) c x^n )^m
        = ( ∑_{n∈P} ((−1)^{n−1}/n) c x^n )^m
        = ( ∑_{n∈P} ((−1)^{n−1}/n) c x^n ) ( ∑_{n∈P} ((−1)^{n−1}/n) c x^n ) · · · ( ∑_{n∈P} ((−1)^{n−1}/n) c x^n )
              (m factors)
        = ( ∑_{n_1∈P} ((−1)^{n_1−1}/n_1) c x^{n_1} ) ( ∑_{n_2∈P} ((−1)^{n_2−1}/n_2) c x^{n_2} ) · · · ( ∑_{n_m∈P} ((−1)^{n_m−1}/n_m) c x^{n_m} )
              (here, we have renamed the summation indices)
        = ∑_{(n_1,n_2,...,n_m)∈P^m} ( ((−1)^{n_1−1}/n_1) c x^{n_1} ) ( ((−1)^{n_2−1}/n_2) c x^{n_2} ) · · · ( ((−1)^{n_m−1}/n_m) c x^{n_m} ) .
Now, forget that we fixed m. We thus have proved (78) for each m ∈ N.
Now, (77) becomes

    (1 + x )^c = ∑_{m∈N} (1/m!) ( ∑_{n≥1} ((−1)^{n−1}/n) c x^n )^m
        = ∑_{m∈N} (1/m!) ∑_{k∈N} ∑_{(n_1,n_2,...,n_m)∈P^m; n_1+n_2+···+n_m=k} ((−1)^{n_1+n_2+···+n_m−m}/(n_1 n_2 · · · n_m)) c^m x^k        (by (78)) .
for any m sets A_1 , A_2 , . . . , A_m and any elements a_{i,j} ∈ K, provided that all the sums on the
left hand side of this equality are summable. We leave it to the reader to convince himself
of this rule (intuitively, it just says that we can expand a product of sums in the usual way,
even when the sums are infinite) and to check that the sums we are applying it to are indeed
summable.
Now, let $k \in \mathbb{N}$. Let us rewrite the "middle sum" $\sum_{m \in \mathbb{N}} \sum_{\substack{(n_1, n_2, \ldots, n_m) \in \mathbb{P}^m; \\ n_1 + n_2 + \cdots + n_m = k}} \frac{1}{m!} \cdot \frac{(-1)^{n_1 + n_2 + \cdots + n_m - m}}{n_1 n_2 \cdots n_m} c^m$ on the right hand side as a finite sum. Indeed, a composition of $k$ shall mean a tuple $(n_1, n_2, \ldots, n_m)$ of positive integers satisfying $n_1 + n_2 + \cdots + n_m = k$. (For example, $(1, 3, 1)$ is a composition of 5. We will study compositions in more detail in Section 3.9.) Let $\operatorname{Comp}(k)$ denote the set of all compositions of $k$. It is easy to see that this set $\operatorname{Comp}(k)$ is finite³².
Now, we can rewrite the double summation sign "$\sum_{m \in \mathbb{N}} \sum_{\substack{(n_1, n_2, \ldots, n_m) \in \mathbb{P}^m; \\ n_1 + n_2 + \cdots + n_m = k}}$" as a single summation sign "$\sum_{(n_1, n_2, \ldots, n_m) \in \operatorname{Comp}(k)}$" (since $\operatorname{Comp}(k)$ is precisely the set of all tuples $(n_1, n_2, \ldots, n_m) \in \mathbb{P}^m$ satisfying $n_1 + n_2 + \cdots + n_m = k$). Hence, we obtain
$$\sum_{m \in \mathbb{N}} \ \sum_{\substack{(n_1, n_2, \ldots, n_m) \in \mathbb{P}^m; \\ n_1 + n_2 + \cdots + n_m = k}} \frac{1}{m!} \cdot \frac{(-1)^{n_1 + n_2 + \cdots + n_m - m}}{n_1 n_2 \cdots n_m} c^m = \sum_{(n_1, n_2, \ldots, n_m) \in \operatorname{Comp}(k)} \frac{1}{m!} \cdot \frac{(-1)^{n_1 + n_2 + \cdots + n_m - m}}{n_1 n_2 \cdots n_m} c^m. \qquad (80)$$
Forget that we fixed k. Thus, for each k ∈ N, we have defined a finite set
Comp (k ) and shown that (80) holds.
³² Proof. Let $(n_1, n_2, \ldots, n_m) \in \operatorname{Comp}(k)$. Thus, $(n_1, n_2, \ldots, n_m)$ is a composition of $k$. In other words, $(n_1, n_2, \ldots, n_m)$ is a finite tuple of positive integers satisfying $n_1 + n_2 + \cdots + n_m = k$. Hence, all its $m$ entries $n_1, n_2, \ldots, n_m$ are positive integers and thus are $\geq 1$; therefore, $n_1 + n_2 + \cdots + n_m \geq \underbrace{1 + 1 + \cdots + 1}_{m \text{ times}} = m$, so that $m \leq n_1 + n_2 + \cdots + n_m = k$. Thus, $m \in \{0, 1, \ldots, k\}$.
Furthermore, the sum $n_1 + n_2 + \cdots + n_m$ is $\geq$ each of its $m$ addends (since its $m$ addends $n_1, n_2, \ldots, n_m$ are positive). In other words, we have $n_1 + n_2 + \cdots + n_m \geq n_i$ for each $i \in \{1, 2, \ldots, m\}$. Thus, for each $i \in \{1, 2, \ldots, m\}$, we have $n_i \leq n_1 + n_2 + \cdots + n_m = k$ and therefore $n_i \in \{1, 2, \ldots, k\}$ (since $n_i$ is a positive integer). Hence, $(n_1, n_2, \ldots, n_m) \in \{1, 2, \ldots, k\}^m$. Therefore, $\operatorname{Comp}(k) \subseteq \bigcup_{\ell \in \{0, 1, \ldots, k\}} \{1, 2, \ldots, k\}^{\ell}$, and the latter union is a finite set (being a finite union of finite sets). Hence, $\operatorname{Comp}(k)$ is finite.
$$(1+x)^c = \sum_{k \in \mathbb{N}} \left( \sum_{(n_1, n_2, \ldots, n_m) \in \operatorname{Comp}(k)} \frac{1}{m!} \cdot \frac{(-1)^{n_1 + n_2 + \cdots + n_m - m}}{n_1 n_2 \cdots n_m} c^m \right) x^k. \qquad (81)$$
Hence, the equality (82) also holds for each c ∈ N (and each k ∈ N). And this
is precisely what we needed to show!
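As a sanity check of (81), the inner sum over compositions of $k$ can be evaluated with exact rational arithmetic and compared against $\binom{c}{k}$ for small $c$ and $k$. A Python sketch (the function names are ours):

```python
from fractions import Fraction
from math import comb, factorial

def compositions(k):
    """Yield all compositions of k (tuples of positive integers summing to k)."""
    if k == 0:
        yield ()
        return
    for first in range(1, k + 1):
        for rest in compositions(k - first):
            yield (first,) + rest

def coeff(c, k):
    """Coefficient of x^k in (1+x)^c, computed via the composition sum in (81)."""
    total = Fraction(0)
    for comp in compositions(k):
        m = len(comp)
        prod = 1
        for n in comp:
            prod *= n
        # the sign is (-1)^(n1+...+nm - m) = (-1)^(k - m), since the parts sum to k
        total += Fraction((-1) ** (k - m), factorial(m) * prod) * c ** m
    return total

for c in range(6):
    for k in range(6):
        assert coeff(c, k) == comb(c, k)
```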
Let me explain this argument in detail now, as it is somewhat vertigo-inducing.
We forget that we fixed K and c. Now, fix c ∈ N. Thus, c ∈ N ⊆ Z. Hence, in
the ring Q [[ x ]], we have
$$(1+x)^c = \sum_{k \in \mathbb{N}} \binom{c}{k} x^k \qquad \left( \text{by Theorem 3.3.10, applied to } n = c \right).$$
and
$$g := \binom{x}{k} = \frac{x (x-1) (x-2) \cdots (x-k+1)}{k!} \in \mathbb{Q}[x]$$
satisfy $f[c] = g[c]$ for each $c \in \mathbb{N}$ (because $f[c]$ is the left hand side of (83), while $g[c]$ is the right hand side of (83)). Thus, each $c \in \mathbb{N}$ satisfies $(f - g)[c] = \underbrace{f[c]}_{= g[c]} - g[c] = g[c] - g[c] = 0$. In other words, each $c \in \mathbb{N}$ is a root of $f - g$.
Hence, the polynomial f − g has infinitely many roots in Q (since there are
infinitely many c ∈ N). Since f − g is a polynomial with rational coefficients,
$$f = \sum_{i \in \mathbb{N}} \binom{n+i-1}{i} x^{2i} \qquad (86)$$
and
$$g = \sum_{j \in \mathbb{N}} \binom{n}{j} x^j. \qquad (87)$$
(We will soon see why we chose to define them this way.) Multiplying the two
³³ Specifically, [19fco, Exercise 2.10.7] proves Proposition 3.8.4 in the particular case when $n \in \mathbb{N}$; then, [19fco, Exercise 2.10.8] extends it to the case when $n \in \mathbb{R}$. However, the latter argument can just as well be used to extend it to arbitrary $n \in \mathbb{C}$.
$$\left[ x^k \right] (fg) = \sum_{\substack{(i,j) \in \mathbb{N} \times \mathbb{N}; \\ 2i + j = k}} \binom{n+i-1}{i} \binom{n}{j}. \qquad (88)$$
rewriting this sum as $\sum_{\substack{i \in \mathbb{N}; \\ k - 2i \in \mathbb{N}}} \binom{n+i-1}{i} \binom{n}{k-2i}$. Hence,
$$\begin{aligned}
\sum_{\substack{(i,j) \in \mathbb{N} \times \mathbb{N}; \\ 2i + j = k}} \binom{n+i-1}{i} \binom{n}{j}
&= \sum_{\substack{i \in \mathbb{N}; \\ k - 2i \in \mathbb{N}}} \binom{n+i-1}{i} \binom{n}{k-2i} \\
&= \sum_{\substack{i \in \mathbb{N}; \\ 2i \leq k}} \binom{n+i-1}{i} \binom{n}{k-2i} \qquad \left( \begin{array}{c} \text{since an } i \in \mathbb{N} \text{ satisfies } k - 2i \in \mathbb{N} \\ \text{if and only if it satisfies } 2i \leq k \end{array} \right) \\
&= \sum_{\substack{i \in \mathbb{N}; \\ i \leq k}} \binom{n+i-1}{i} \binom{n}{k-2i}
\end{aligned}$$
(here, we have replaced the condition "$2i \leq k$" under the summation sign by the weaker condition "$i \leq k$", thus extending the range of the sum; but this did not change the sum, since all the newly introduced addends are 0 because of the vanishing factor $\binom{n}{k-2i}$). Thus, (88) becomes
$$\begin{aligned}
\left[ x^k \right] (fg) &= \sum_{\substack{(i,j) \in \mathbb{N} \times \mathbb{N}; \\ 2i + j = k}} \binom{n+i-1}{i} \binom{n}{j} = \sum_{\substack{i \in \mathbb{N}; \\ i \leq k}} \binom{n+i-1}{i} \binom{n}{k-2i} \\
&= \sum_{i=0}^{k} \binom{n+i-1}{i} \binom{n}{k-2i}. \qquad (89)
\end{aligned}$$
Note that the right hand side here is precisely the left hand side of the identity
we are trying to prove. This is why we defined f and g as we did. With a bit
of experience, the computation above can easily be reverse-engineered, and the
definitions of f and g are essentially forced by the goal of making (89) hold.
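The identity (89) is easy to test numerically for small integer $n$: multiply the truncated coefficient lists of $f$ and $g$ directly and compare with the right hand side. A Python sketch (names are ours):

```python
from math import comb

def check_89(n, N=10):
    """Verify (89): [x^k](f g) = sum_{i=0}^k binom(n+i-1, i) binom(n, k-2i)."""
    # truncated coefficient lists of f = sum binom(n+i-1, i) x^(2i)
    # and g = sum binom(n, j) x^j
    f = [0] * (N + 1)
    for t in range(N // 2 + 1):
        f[2 * t] = comb(n + t - 1, t)
    g = [comb(n, j) for j in range(N + 1)]
    for k in range(N + 1):
        lhs = sum(f[i] * g[k - i] for i in range(k + 1))
        rhs = sum(
            comb(n + i - 1, i) * (comb(n, k - 2 * i) if k - 2 * i >= 0 else 0)
            for i in range(k + 1)
        )
        assert lhs == rhs

for n in range(1, 6):
    check_89(n)
```

(The guard on `k - 2*i` implements the convention that a binomial coefficient with negative lower index is 0.)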
Anyway, it is now clear that a simple expression for f g would move us for-
ward. So let us try to simplify f and g. For g, the answer is easiest: We have
$$g = \sum_{j \in \mathbb{N}} \binom{n}{j} x^j = (1+x)^n,$$
because Theorem 3.8.3 (applied to $c = n$) yields $(1+x)^n = \sum_{j \in \mathbb{N}} \binom{n}{j} x^j$. For $f$,
Let us write $\mathbb{P}$ for the set $\{1, 2, 3, \ldots\}$. Then, a composition of $n$ into $k$ parts is nothing but a $k$-tuple $(\alpha_1, \alpha_2, \ldots, \alpha_k) \in \mathbb{P}^k$ satisfying $\alpha_1 + \alpha_2 + \cdots + \alpha_k = n$. Hence, (91) can be rewritten as
$$\begin{aligned}
a_{n,k} &= \left( \text{\# of all } k\text{-tuples } (\alpha_1, \alpha_2, \ldots, \alpha_k) \in \mathbb{P}^k \text{ satisfying } \alpha_1 + \alpha_2 + \cdots + \alpha_k = n \right) \\
&= \sum_{\substack{(\alpha_1, \alpha_2, \ldots, \alpha_k) \in \mathbb{P}^k; \\ \alpha_1 + \alpha_2 + \cdots + \alpha_k = n}} 1. \qquad (93)
\end{aligned}$$
$$\begin{aligned}
A_k &= \sum_{n \in \mathbb{N}} \ \sum_{\substack{(\alpha_1, \alpha_2, \ldots, \alpha_k) \in \mathbb{P}^k; \\ \alpha_1 + \alpha_2 + \cdots + \alpha_k = n}} 1 \cdot x^n
= \sum_{n \in \mathbb{N}} \ \sum_{\substack{(\alpha_1, \alpha_2, \ldots, \alpha_k) \in \mathbb{P}^k; \\ \alpha_1 + \alpha_2 + \cdots + \alpha_k = n}} \underbrace{x^n}_{\substack{= x^{\alpha_1 + \alpha_2 + \cdots + \alpha_k} \\ \text{(since } \alpha_1 + \alpha_2 + \cdots + \alpha_k = n\text{)}}} \\
&= \sum_{n \in \mathbb{N}} \ \sum_{\substack{(\alpha_1, \alpha_2, \ldots, \alpha_k) \in \mathbb{P}^k; \\ \alpha_1 + \alpha_2 + \cdots + \alpha_k = n}} \underbrace{x^{\alpha_1 + \alpha_2 + \cdots + \alpha_k}}_{= x^{\alpha_1} x^{\alpha_2} \cdots x^{\alpha_k}}
= \sum_{(\alpha_1, \alpha_2, \ldots, \alpha_k) \in \mathbb{P}^k} x^{\alpha_1} x^{\alpha_2} \cdots x^{\alpha_k} \\
&= \left( \sum_{\alpha_1 \in \mathbb{P}} x^{\alpha_1} \right) \left( \sum_{\alpha_2 \in \mathbb{P}} x^{\alpha_2} \right) \cdots \left( \sum_{\alpha_k \in \mathbb{P}} x^{\alpha_k} \right) \qquad \left( \begin{array}{c} \text{by the same product rule that we used back} \\ \text{in the proof of Theorem 3.8.3} \end{array} \right) \\
&= \left( \sum_{n \in \mathbb{P}} x^n \right)^k
\end{aligned}$$
(here, we have renamed all the k summation indices as n, and realized that all
k sums are identical). Since
$$\sum_{n \in \mathbb{P}} x^n = x^1 + x^2 + x^3 + \cdots = x \underbrace{\left( 1 + x + x^2 + \cdots \right)}_{= \frac{1}{1-x}} = x \cdot \frac{1}{1-x} = \frac{x}{1-x},$$
this can be rewritten further as
$$A_k = \left( \frac{x}{1-x} \right)^k = x^k (1-x)^{-k}. \qquad (94)$$
$$(1+x)^{-k} = \sum_{j \in \mathbb{N}} \binom{-k}{j} x^j.$$
Substituting $-x$ for $x$ on both sides of this equality (i.e., applying the map $f \mapsto f \circ (-x)$), we obtain
$$\begin{aligned}
(1-x)^{-k} &= \sum_{j \in \mathbb{N}} \underbrace{\binom{-k}{j}}_{\substack{= (-1)^j \binom{j+k-1}{j} \\ \text{(by Theorem 3.3.11, applied} \\ \text{to } k \text{ and } j \text{ instead of } n \text{ and } k\text{)}}} \underbrace{(-x)^j}_{= (-1)^j x^j}
= \sum_{j \in \mathbb{N}} (-1)^j \binom{j+k-1}{j} \cdot (-1)^j x^j \\
&= \sum_{j \in \mathbb{N}} \underbrace{(-1)^{2j}}_{=1} \binom{j+k-1}{j} x^j = \sum_{j \in \mathbb{N}} \binom{j+k-1}{j} x^j \qquad (95) \\
&= \sum_{n \geq k} \binom{n-1}{n-k} x^{n-k}
\end{aligned}$$
(here, we have substituted n − k for j in the sum). Hence, our above computa-
tion of Ak can be completed as follows:
$$\begin{aligned}
A_k &= x^k \underbrace{(1-x)^{-k}}_{= \sum_{n \geq k} \binom{n-1}{n-k} x^{n-k}}
= x^k \sum_{n \geq k} \binom{n-1}{n-k} x^{n-k}
= \sum_{n \geq k} \binom{n-1}{n-k} \underbrace{x^k x^{n-k}}_{= x^n} \\
&= \sum_{n \geq k} \binom{n-1}{n-k} x^n = \sum_{n \in \mathbb{N}} \binom{n-1}{n-k} x^n
\end{aligned}$$
(here, we have extended the range of the summation from all n ≥ k to all n ∈ N;
this did not change the sum, since all the newly introduced addends with n < k
are 0). Comparing coefficients, we thus obtain
$$a_{n,k} = \binom{n-1}{n-k} \qquad \text{for each } n \in \mathbb{N}. \qquad (96)$$
If $n > 0$, then we can rewrite the right hand side of this equality as $\binom{n-1}{k-1}$ (using Theorem 2.3.6). However, if $n = 0$, then this right hand side equals $\delta_{k,0}$ instead (where we are using Definition 3.5.6). Thus, we can rewrite (96) as
$$a_{n,k} = \begin{cases} \dbinom{n-1}{k-1}, & \text{if } n > 0; \\ \delta_{k,0}, & \text{if } n = 0 \end{cases} \qquad \text{for each } n \in \mathbb{N}. \qquad (97)$$
We have thus answered our Question 2. Let us summarize the two an-
swers we have found ((96) and (97)) in the following theorem ([19fco, Theorem
2.10.1]):
This theorem has other proofs as well. See [19fco, Proof of Theorem 2.10.1]
for a proof by bijection and [19fco, solution to Exercise 2.10.2] for a proof by
induction.
As an easy consequence of Theorem 3.9.3, we can now answer Question 1 as
well:
$$\begin{aligned}
(\text{\# of compositions of } n)
&= \sum_{k \in \{1, 2, \ldots, n\}} \underbrace{(\text{\# of compositions of } n \text{ into } k \text{ parts})}_{\substack{= \binom{n-1}{n-k} \\ \text{(by Theorem 3.9.3)}}} \\
&= \sum_{k \in \{1, 2, \ldots, n\}} \binom{n-1}{n-k} = \sum_{k=1}^{n} \binom{n-1}{n-k} = \sum_{k=0}^{n-1} \binom{n-1}{k}
\end{aligned}$$
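By the binomial theorem, the last sum equals $2^{n-1}$ for $n \geq 1$. This count is easy to confirm recursively: a composition of $n$ is a first part together with a composition of the remainder. A Python sketch (the helper is ours):

```python
def count_compositions(n):
    """# of compositions of n: pick the first part, recurse on the remainder."""
    if n == 0:
        return 1  # only the empty composition
    return sum(count_compositions(n - first) for first in range(1, n + 1))

# there are 2^(n-1) compositions of n for every positive n
for n in range(1, 13):
    assert count_compositions(n) == 2 ** (n - 1)
```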
Proof of Theorem 3.9.7 (sketched). (See [19fco, Theorem 2.10.5] for details and al-
ternative proofs.) Adding 1 to a nonnegative integer yields a positive integer.
Furthermore, if we add 1 to each entry of a k-tuple, then the sum of all entries
of the k-tuple increases by k.
Thus, if (α1 , α2 , . . . , αk ) is a weak composition of n into k parts, then
(α1 + 1, α2 + 1, . . . , αk + 1) is a composition of n + k into k parts. Hence, we can
34 Beware that the word “weak composition” does not have a unique meaning in the literature.
define a map
{weak compositions of n into k parts} → {compositions of n + k into k parts} ,
(α1 , α2 , . . . , αk ) 7→ (α1 + 1, α2 + 1, . . . , αk + 1) .
Furthermore, it is easy to see that this map is a bijection (indeed, its inverse is
rather easy to construct). Thus, by the bijection principle, we have
$$\begin{aligned}
&|\{\text{weak compositions of } n \text{ into } k \text{ parts}\}| \\
&= |\{\text{compositions of } n+k \text{ into } k \text{ parts}\}| \\
&= (\text{\# of compositions of } n+k \text{ into } k \text{ parts}) \\
&= \binom{n+k-1}{n+k-k} \qquad \left( \begin{array}{c} \text{by the first equality sign in Theorem 3.9.3,} \\ \text{applied to } n+k \text{ instead of } n \end{array} \right) \\
&= \binom{n+k-1}{n}.
\end{aligned}$$
Thus, we have shown that the \# of weak compositions of $n$ into $k$ parts is $\binom{n+k-1}{n}$. It remains to prove that this equals $\begin{cases} \binom{n+k-1}{k-1}, & \text{if } k > 0; \\ \delta_{n,0}, & \text{if } k = 0 \end{cases}$ as well. This is similar to how we obtained (97): If $k = 0$, then it is clear by inspection; otherwise it follows from Theorem 2.3.6. Theorem 3.9.7 is proven.
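Theorem 3.9.7 can likewise be confirmed by brute force for small parameters: enumerate all $k$-tuples of nonnegative integers summing to $n$ and compare the count with $\binom{n+k-1}{n}$. A Python sketch (names are ours; the case $n = k = 0$ is skipped, since Python's `math.comb` does not know the convention $\binom{-1}{0} = 1$):

```python
from itertools import product
from math import comb

def weak_compositions_count(n, k):
    """Brute-force # of k-tuples of nonnegative integers summing to n."""
    return sum(1 for t in product(range(n + 1), repeat=k) if sum(t) == n)

for n in range(6):
    for k in range(5):
        if n == 0 and k == 0:
            continue  # binom(-1, 0) would need a separate convention
        assert weak_compositions_count(n, k) == comb(n + k - 1, n)
```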
$$\begin{aligned}
W_{k,p} &= \sum_{n \in \mathbb{N}} \ \sum_{\substack{(\alpha_1, \alpha_2, \ldots, \alpha_k) \in \{0, 1, \ldots, p-1\}^k; \\ \alpha_1 + \alpha_2 + \cdots + \alpha_k = n}} 1 \cdot x^n
= \sum_{n \in \mathbb{N}} \ \sum_{\substack{(\alpha_1, \alpha_2, \ldots, \alpha_k) \in \{0, 1, \ldots, p-1\}^k; \\ \alpha_1 + \alpha_2 + \cdots + \alpha_k = n}} \underbrace{x^n}_{\substack{= x^{\alpha_1 + \alpha_2 + \cdots + \alpha_k} \\ \text{(since } \alpha_1 + \alpha_2 + \cdots + \alpha_k = n\text{)}}} \\
&= \sum_{n \in \mathbb{N}} \ \sum_{\substack{(\alpha_1, \alpha_2, \ldots, \alpha_k) \in \{0, 1, \ldots, p-1\}^k; \\ \alpha_1 + \alpha_2 + \cdots + \alpha_k = n}} \underbrace{x^{\alpha_1 + \alpha_2 + \cdots + \alpha_k}}_{= x^{\alpha_1} x^{\alpha_2} \cdots x^{\alpha_k}}
= \sum_{(\alpha_1, \alpha_2, \ldots, \alpha_k) \in \{0, 1, \ldots, p-1\}^k} x^{\alpha_1} x^{\alpha_2} \cdots x^{\alpha_k} \\
&= \left( \sum_{\alpha_1 \in \{0, 1, \ldots, p-1\}} x^{\alpha_1} \right) \left( \sum_{\alpha_2 \in \{0, 1, \ldots, p-1\}} x^{\alpha_2} \right) \cdots \left( \sum_{\alpha_k \in \{0, 1, \ldots, p-1\}} x^{\alpha_k} \right) \\
&\qquad \left( \begin{array}{c} \text{by the same product rule that we used back} \\ \text{in the proof of Theorem 3.8.3} \end{array} \right) \\
&= \left( \sum_{n \in \{0, 1, \ldots, p-1\}} x^n \right)^k
\end{aligned}$$
(here, we have renamed all the k summation indices as n, and realized that all
k sums are identical). Since
$$\sum_{n \in \{0, 1, \ldots, p-1\}} x^n = x^0 + x^1 + \cdots + x^{p-1} = \frac{1 - x^p}{1 - x}$$
(the last equality sign here is easy to check³⁵), this can be rewritten further as
$$W_{k,p} = \left( \frac{1 - x^p}{1 - x} \right)^k = (1 - x^p)^k (1 - x)^{-k}. \qquad (98)$$
In order to expand the right hand side, let us expand (1 − x p )k and (1 − x )−k
separately.
The binomial theorem yields
$$(1 - x^p)^k = \sum_{j=0}^{k} (-1)^j \binom{k}{j} \underbrace{(x^p)^j}_{= x^{pj}} = \sum_{j=0}^{k} (-1)^j \binom{k}{j} x^{pj} = \sum_{j \in \mathbb{N}} (-1)^j \binom{k}{j} x^{pj}$$
³⁵ Just show that $(1-x) \left( x^0 + x^1 + \cdots + x^{p-1} \right) = 1 - x^p$.
$$\begin{aligned}
&(1 - x^p)^k (1 - x)^{-k} \\
&= \left( \sum_{j \in \mathbb{N}} (-1)^j \binom{k}{j} x^{pj} \right) \left( \sum_{j \in \mathbb{N}} \binom{j+k-1}{j} x^j \right) \\
&= \left( \sum_{j \in \mathbb{N}} (-1)^j \binom{k}{j} x^{pj} \right) \left( \sum_{i \in \mathbb{N}} \binom{i+k-1}{i} x^i \right) \qquad \left( \begin{array}{c} \text{here, we have renamed the summation index } j \\ \text{as } i \text{ in the second sum} \end{array} \right) \\
&= \sum_{(i,j) \in \mathbb{N} \times \mathbb{N}} (-1)^j \binom{k}{j} \binom{i+k-1}{i} x^{pj+i}
= \sum_{n \in \mathbb{N}} \ \sum_{\substack{(i,j) \in \mathbb{N} \times \mathbb{N}; \\ pj + i = n}} (-1)^j \binom{k}{j} \binom{i+k-1}{i} \underbrace{x^{pj+i}}_{\substack{= x^n \\ \text{(since } pj + i = n\text{)}}} \\
&= \sum_{n \in \mathbb{N}} \ \sum_{\substack{(i,j) \in \mathbb{N} \times \mathbb{N}; \\ pj + i = n}} (-1)^j \binom{k}{j} \binom{i+k-1}{i} x^n
= \sum_{n \in \mathbb{N}} \left( \sum_{\substack{(i,j) \in \mathbb{N} \times \mathbb{N}; \\ pj + i = n}} (-1)^j \binom{k}{j} \binom{i+k-1}{i} \right) x^n.
\end{aligned}$$
$$w_{n,k,p} = \sum_{\substack{(i,j) \in \mathbb{N} \times \mathbb{N}; \\ pj + i = n}} (-1)^j \binom{k}{j} \binom{i+k-1}{i}
= \sum_{\substack{j \in \mathbb{N}; \\ pj \leq n}} (-1)^j \binom{k}{j} \binom{n - pj + k - 1}{n - pj}$$
(here, we have removed all addends with j > k from the sum; this does not
change the sum, since all these addends are 0).
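The resulting formula for $w_{n,k,p}$ (the \# of $k$-tuples in $\{0, 1, \ldots, p-1\}^k$ summing to $n$) can be checked by brute force for small parameters. A Python sketch (names are ours; we restrict to $k \geq 1$ to avoid the edge case $\binom{-1}{0}$, which `math.comb` does not handle):

```python
from itertools import product
from math import comb

def bounded_count(n, k, p):
    """Brute-force # of k-tuples in {0, ..., p-1}^k summing to n."""
    return sum(1 for t in product(range(p), repeat=k) if sum(t) == n)

def w_formula(n, k, p):
    """The alternating-sum formula for w_{n,k,p} derived above."""
    return sum(
        (-1) ** j * comb(k, j) * comb(n - p * j + k - 1, n - p * j)
        for j in range(n // p + 1)
    )

for p in range(1, 4):
    for k in range(1, 5):
        for n in range(8):
            assert bounded_count(n, k, p) == w_formula(n, k, p)
```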
Thus, we have proved the following fact:
to $p = 2$) yields that this \# equals $\sum_{j=0}^{k} (-1)^j \binom{k}{j} \binom{n - 2j + k - 1}{n - 2j}$. Comparing these two results, we obtain the following identity:
3.10. $x^n$-equivalence

We now return to general properties of FPSs.

Definition 3.10.1. Let $n \in \mathbb{N}$. Let $f, g \in K[[x]]$ be two FPSs. We write $f \overset{x^n}{\equiv} g$ if and only if each $m \in \{0, 1, \ldots, n\}$ satisfies $[x^m] f = [x^m] g$.

Thus, we have defined a binary relation $\overset{x^n}{\equiv}$ on the set $K[[x]]$. We say that an FPS $f$ is $x^n$-equivalent to an FPS $g$ if and only if $f \overset{x^n}{\equiv} g$.
and
$$\frac{1}{1 - 3x} = 1 + 3x + (3x)^2 + (3x)^3 + \cdots = x^0 + 3x^1 + 9x^2 + 27x^3 + \cdots.$$
Thus, $(1+x)^3 \overset{x^1}{\equiv} \frac{1}{1 - 3x}$, since each $m \in \{0, 1\}$ satisfies $[x^m] (1+x)^3 = [x^m] \frac{1}{1 - 3x}$ (indeed, we have $\left[ x^0 \right] (1+x)^3 = 1 = \left[ x^0 \right] \frac{1}{1 - 3x}$ and $\left[ x^1 \right] (1+x)^3 = 3 = \left[ x^1 \right] \frac{1}{1 - 3x}$). Of course, this also shows that $(1+x)^3 \overset{x^0}{\equiv} \frac{1}{1 - 3x}$. However, we don't have $(1+x)^3 \overset{x^2}{\equiv} \frac{1}{1 - 3x}$ (at least not for $K = \mathbb{Z}$), since $\left[ x^2 \right] (1+x)^3 = 3$ does not equal $\left[ x^2 \right] \frac{1}{1 - 3x} = 9$.
(b) More generally, $(1+x)^n \overset{x^1}{\equiv} \frac{1}{1 - nx}$ and $(1+x)^n \overset{x^1}{\equiv} 1 + nx$ for each $n \in \mathbb{Z}$.

(c) The generating function $F = F(x) = 0 + 1x + 1x^2 + 2x^3 + 3x^4 + 5x^5 + \cdots$ from Section 3.1 satisfies $F \overset{x^2}{\equiv} x + x^2$ and $F \overset{x^3}{\equiv} x + x^2 + 2x^3$.

(d) If $f \in K[[x]]$ is any FPS, and if $n \in \mathbb{N}$, then there exists a polynomial $p \in K[x]$ such that $f \overset{x^n}{\equiv} p$. Indeed, we can take $p = \sum_{k=0}^{n} \left( \left[ x^k \right] f \right) x^k$.
One way to get an intuition for the relation $\overset{x^n}{\equiv}$ is to think of it as a kind of "approximate equality" up to degree $n$. (This makes the most sense if one thinks of $x$ as an infinitesimal quantity, in which case a term $\lambda x^k$ (with $\lambda \in K$) is the more "important" the lower $k$ is. From this viewpoint, $f \overset{x^n}{\equiv} g$ means that the FPSs $f$ and $g$ agree in their $n+1$ most "important" terms and differ at most in their "error terms".) For this reason, the statement "$f \overset{x^n}{\equiv} g$" is sometimes written as "$f = g + o(x^n)$" (an algebraic imitation of Landau's little-o notation from asymptotic analysis). Another intuition comes from elementary number theory: The relation $\overset{x^n}{\equiv}$ is similar to congruence of integers modulo a given integer. This is more than a similarity; the relation $\overset{x^n}{\equiv}$ can in fact be restated as a divisibility in the same fashion as for congruences of integers (see Proposition 3.10.4 below). For this reason, the statement "$f \overset{x^n}{\equiv} g$" is sometimes written as "$f \equiv g \mod x^{n+1}$". We shall, however, eschew both of these alternative notations, and use the original notation "$f \overset{x^n}{\equiv} g$" from Definition 3.10.1, as both intuitions (while useful) would distract from the simplicity of Definition 3.10.1³⁶.
Here are some basic properties of the relation $\overset{x^n}{\equiv}$ (some of which will be used without explicit reference):

• This relation is reflexive (i.e., we have $f \overset{x^n}{\equiv} f$ for each $f \in K[[x]]$).

• This relation is transitive (i.e., if three FPSs $f, g, h \in K[[x]]$ satisfy $f \overset{x^n}{\equiv} g$ and $g \overset{x^n}{\equiv} h$, then $f \overset{x^n}{\equiv} h$).
36 Case in point: Definition 3.10.1 can be generalized to multivariate FPSs, but the two intu-
itions are no longer available (or, worse, give the “wrong” concepts) when extended to this
generality.
• This relation is symmetric (i.e., if two FPSs $f, g \in K[[x]]$ satisfy $f \overset{x^n}{\equiv} g$, then $g \overset{x^n}{\equiv} f$).
(b) If $a, b, c, d \in K[[x]]$ are four FPSs satisfying $a \overset{x^n}{\equiv} b$ and $c \overset{x^n}{\equiv} d$, then we also have
$$a + c \overset{x^n}{\equiv} b + d; \qquad (99)$$
$$a - c \overset{x^n}{\equiv} b - d; \qquad (100)$$
$$ac \overset{x^n}{\equiv} bd. \qquad (101)$$

(c) If $a, b \in K[[x]]$ are two FPSs satisfying $a \overset{x^n}{\equiv} b$, then $\lambda a \overset{x^n}{\equiv} \lambda b$ for each $\lambda \in K$.

(d) If $a, b \in K[[x]]$ are two invertible FPSs satisfying $a \overset{x^n}{\equiv} b$, then $a^{-1} \overset{x^n}{\equiv} b^{-1}$.

(e) If $a, b, c, d \in K[[x]]$ are four FPSs satisfying $a \overset{x^n}{\equiv} b$ and $c \overset{x^n}{\equiv} d$, and if the FPSs $c$ and $d$ are invertible, then we also have
$$\frac{a}{c} \overset{x^n}{\equiv} \frac{b}{d}. \qquad (102)$$

(f) Let $S$ be a finite set. Let $(a_s)_{s \in S} \in K[[x]]^S$ and $(b_s)_{s \in S} \in K[[x]]^S$ be two families of FPSs such that
$$\text{each } s \in S \text{ satisfies } a_s \overset{x^n}{\equiv} b_s. \qquad (103)$$
Then, we have
$$\sum_{s \in S} a_s \overset{x^n}{\equiv} \sum_{s \in S} b_s; \qquad (104)$$
$$\prod_{s \in S} a_s \overset{x^n}{\equiv} \prod_{s \in S} b_s. \qquad (105)$$
Proof of Theorem 3.10.3 (sketched). All of these properties are analogous to famil-
iar properties of integer congruences, except for Theorem 3.10.3 (d), which is
moot for integers (since there are not many integers that are invertible in Z).
The proofs are similarly simple (using (20), (21), (22) and (25)). Thus, we shall
only give some hints for the proof of Theorem 3.10.3 (d) here; detailed proofs
of all parts of Theorem 3.10.3 can be found in Section B.1.
(d) Let $a, b \in K[[x]]$ be two invertible FPSs satisfying $a \overset{x^n}{\equiv} b$. We want to show that $a^{-1} \overset{x^n}{\equiv} b^{-1}$.
The FPS $a$ is invertible; thus, its constant term $\left[ x^0 \right] a$ is invertible in $K$ (by Proposition 3.3.7).
Recall that $a \overset{x^n}{\equiv} b$. In other words,
$$\text{each } m \in \{0, 1, \ldots, n\} \text{ satisfies } [x^m] a = [x^m] b. \qquad (106)$$
Now, we want to prove that $a^{-1} \overset{x^n}{\equiv} b^{-1}$. In other words, we want to prove that each $m \in \{0, 1, \ldots, n\}$ satisfies $[x^m] a^{-1} = [x^m] b^{-1}$. We shall prove this by strong induction on $m$: We fix some $k \in \{0, 1, \ldots, n\}$, and we assume (as an induction hypothesis) that
$$[x^m] a^{-1} = [x^m] b^{-1} \qquad \text{for each } m \in \{0, 1, \ldots, k-1\}. \qquad (107)$$
$$\left[ x^k \right] \left( a a^{-1} \right) = \sum_{i=0}^{k} \left[ x^i \right] a \cdot \left[ x^{k-i} \right] a^{-1} \qquad \left( \text{by (22)} \right)$$
$$= \left[ x^0 \right] a \cdot \left[ x^k \right] a^{-1} + \sum_{i=1}^{k} \left[ x^i \right] a \cdot \left[ x^{k-i} \right] a^{-1}$$
(here, we have split off the addend for $i = 0$ from the sum). Thus,
$$\left[ x^0 \right] a \cdot \left[ x^k \right] a^{-1} + \sum_{i=1}^{k} \left[ x^i \right] a \cdot \left[ x^{k-i} \right] a^{-1} = \left[ x^k \right] \underbrace{\left( a a^{-1} \right)}_{= 1} = \left[ x^k \right] 1.$$
We can solve this equation for $\left[ x^k \right] a^{-1}$ (since $\left[ x^0 \right] a$ is invertible), and thus obtain
$$\left[ x^k \right] a^{-1} = \frac{1}{\left[ x^0 \right] a} \left( \left[ x^k \right] 1 - \sum_{i=1}^{k} \left[ x^i \right] a \cdot \left[ x^{k-i} \right] a^{-1} \right).$$
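This recurrence is effective: it computes the coefficients of $a^{-1}$ one by one from those of $a$. A Python sketch (names are ours), tested on $a = 1 - x$, whose inverse is $1 + x + x^2 + \cdots$:

```python
from fractions import Fraction

def inverse_coeffs(a, N):
    """First N+1 coefficients of a^{-1}, computed by the recurrence
    [x^k] a^{-1} = (1/[x^0]a) ([x^k]1 - sum_{i=1}^k [x^i]a [x^{k-i}]a^{-1})."""
    inv = [Fraction(1) / a[0]]
    for k in range(1, N + 1):
        # only terms with i < len(a) contribute; [x^k] 1 = 0 for k >= 1
        s = sum(a[i] * inv[k - i] for i in range(1, min(k, len(a) - 1) + 1))
        inv.append(-s / a[0])
    return inv

# a = 1 - x, whose inverse is the geometric series 1 + x + x^2 + ...
assert inverse_coeffs([Fraction(1), Fraction(-1)], 6) == [Fraction(1)] * 7
# a = 1 + x, whose inverse is 1 - x + x^2 - x^3 + ...
assert inverse_coeffs([Fraction(1), Fraction(1)], 3) == [Fraction(v) for v in (1, -1, 1, -1)]
```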
Proof of Proposition 3.10.4. See Section B.1 for this proof (a simple consequence
of Lemma 3.3.18).
Finally, here is a subtler property of x n -equivalence similar to the ones in
Theorem 3.10.3 (b):
$$a \circ c \overset{x^n}{\equiv} b \circ d.$$
3.11.1. An example
We start with a motivating example (due to Euler, in [Euler48, §328–329]),
which we shall first discuss informally.
Assume (for the time being) that the infinite product
$$\prod_{i \in \mathbb{N}} \left( 1 + x^{2^i} \right) = \left( 1 + x^1 \right) \left( 1 + x^2 \right) \left( 1 + x^4 \right) \left( 1 + x^8 \right) \cdots \qquad (108)$$
We can observe that each $i \in \mathbb{N}$ satisfies $1 + x^{2^i} = \dfrac{1 - x^{2^{i+1}}}{1 - x^{2^i}}$ (since $1 - x^{2^{i+1}} = 1 - \left( x^{2^i} \right)^2 = \left( 1 - x^{2^i} \right) \left( 1 + x^{2^i} \right)$). Multiplying these equalities over all $i \in \mathbb{N}$, we obtain
$$\prod_{i \in \mathbb{N}} \left( 1 + x^{2^i} \right) = \prod_{i \in \mathbb{N}} \frac{1 - x^{2^{i+1}}}{1 - x^{2^i}} = \frac{1 - x^2}{1 - x^1} \cdot \frac{1 - x^4}{1 - x^2} \cdot \frac{1 - x^8}{1 - x^4} \cdot \frac{1 - x^{16}}{1 - x^8} \cdots.$$
The product on the right hand side here is a telescoping product – meaning that
each numerator is cancelled by the denominator of the following fraction. As-
suming (somewhat plausibly, but far from rigorously) that we are allowed to
cancel infinitely many factors from an infinite product, we thus end up with
a single $1 - x^1$ factor in the denominator. That is, our product simplifies to $\dfrac{1}{1 - x^1}$. Thus, we obtain
$$\prod_{i \in \mathbb{N}} \left( 1 + x^{2^i} \right) = \frac{1}{1 - x^1} = \frac{1}{1 - x} = 1 + x + x^2 + x^3 + \cdots. \qquad (109)$$
i
This was not very rigorous, so let us try to compute the product ∏ 1 + x2
i ∈N
in a different way. Namely, we recall a simple fact about finite products: If
a0 , a1 , . . . , am are finitely many elements of a commutative ring, then the product
m
∏ (1 + a i ) = (1 + a0 ) (1 + a1 ) · · · (1 + a m ) (110)
i =0
$$\prod_{i \in \mathbb{N}} (1 + a_i) = \sum_{i_1 < i_2 < \cdots < i_k} a_{i_1} a_{i_2} \cdots a_{i_k} \qquad (111)$$
where qn is the # of ways to write the integer n as a sum 2i1 + 2i2 + · · · + 2ik
with nonnegative integers i1 , i2 , . . . , ik satisfying i1 < i2 < · · · < ik . Comparing
this with (109), we obtain
$$\sum_{n \in \mathbb{N}} q_n x^n = 1 + x + x^2 + x^3 + \cdots = \sum_{n \in \mathbb{N}} x^n,$$
at least if our assumptions were valid. Comparing coefficients, this would mean
that qn = 1 for each n ∈ N. In other words, each n ∈ N can be written
in exactly one way as a sum 2i1 + 2i2 + · · · + 2ik with nonnegative integers
i1 , i2 , . . . , ik satisfying i1 < i2 < · · · < ik . In other words, each n ∈ N can
be written uniquely as a finite sum of distinct powers of 2.
Is this true? Yes, because this is just saying that each n ∈ N has a unique
binary representation. For example, 21 = 24 + 22 + 20 corresponds to the binary
representation 21 = (10101)2 .
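This uniqueness claim can also be verified mechanically: expanding the truncated product $\prod \left( 1 + x^{2^i} \right)$ must give every coefficient equal to 1, matching $\frac{1}{1-x}$. A Python sketch (helper names are ours):

```python
N = 100  # check coefficients of x^0 .. x^N

def mul_trunc(a, b):
    """Multiply two truncated power series (coefficient lists of length N+1)."""
    c = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j <= N:
                    c[i + j] += ai * bj
    return c

prod = [1] + [0] * N  # start with the empty product, 1
i = 0
while 2 ** i <= N:  # factors 1 + x^(2^i) with 2^i > N cannot affect x^0 .. x^N
    factor = [0] * (N + 1)
    factor[0] = 1
    factor[2 ** i] = 1  # the FPS 1 + x^(2^i)
    prod = mul_trunc(prod, factor)
    i += 1

# q_n = 1 for every n: each n has a unique binary representation
assert prod == [1] * (N + 1)
```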
Thus, the two results we have obtained in (109) and (112) are actually equal,
which is reassuring. Yet, this does not replace a formal definition of infinite
products that rigorously justifies the above arguments.
This is how we defined infinite sums of FPSs. We cannot use the same defi-
nition for infinite products, because usually
we don't expect to have $[x^n] \left( \prod_{i \in I} a_i \right) = \prod_{i \in I} \left( [x^n] a_i \right)$.
(You can think of this condition as saying "If you add any further $a_i$s to the sum $\sum_{i \in M} a_i$, then the $x^n$-coefficient stays unchanged", or, more informally: "If you want to know the $x^n$-coefficient of $\sum_{i \in I} a_i$, it suffices to take the partial sum over all $i \in M$".)

(b) We say that $M$ determines the $x^n$-coefficient in the product of $(a_i)_{i \in I}$ if every finite subset $J$ of $I$ satisfying $M \subseteq J \subseteq I$ satisfies
$$[x^n] \left( \prod_{i \in J} a_i \right) = [x^n] \left( \prod_{i \in M} a_i \right).$$
(You can think of this condition as saying "If you multiply any further $a_i$s to the product $\prod_{i \in M} a_i$, then the $x^n$-coefficient stays unchanged", or, more informally: "If you want to know the $x^n$-coefficient of $\prod_{i \in I} a_i$, it suffices to take the partial product over all $i \in M$".)
(this is simply a consequence of the fact that the only two entries of our family that have a nonzero $x^3$-coefficient are the entries $\left( x + x^2 \right)^i$ for $i \in \{2, 3\}$). Thus, any finite subset of $\mathbb{N}$ that contains $\{2, 3\}$ as a subset determines the $x^3$-coefficient in the sum of this family $\left( \left( x + x^2 \right)^i \right)_{i \in \mathbb{N}}$.
(b) Consider the family
$$\left( 1 + x^i \right)_{i \in \mathbb{N}} = \left( 1 + 1, \ 1 + x, \ 1 + x^2, \ 1 + x^3, \ 1 + x^4, \ \ldots \right).$$
(The philosophical reason is that, even though the monomial $x^3$ itself does not appear in any of the entries $1 + x^1$ and $1 + x^2$, it does emerge in the product of these two entries with the constant term of $\prod_{i \in \{0, 3\}} \left( 1 + x^i \right) = (1 + 1) \left( 1 + x^3 \right)$.)
(c) Here is a simple but somewhat slippery example: Let I = {1, 2, 3},
and define the FPS ai = 1 + (−1)i x over K = Z for each i ∈ I (so that
a1 = a3 = 1 − x and a2 = 1 + x). Consider the finite family (ai )i∈ I . Its
product is already well-defined by dint of its finiteness:
$$\prod_{i \in I} a_i = (1 - x)(1 + x)(1 - x) = 1 - x - x^2 + x^3.$$
For example, $J = \{1, 2\}$ does not, since $\left[ x^1 \right] \left( \prod_{i \in \{1\}} a_i \right) = -1$ but $\left[ x^1 \right] \left( \prod_{i \in \{1, 2\}} a_i \right) = 0$.)

This shows that, in order for a subset $M$ of $I$ to determine the $x^n$-coefficient in the product of a family $(a_i)_{i \in I}$, it does not suffice to check that $[x^n] \left( \prod_{i \in I} a_i \right) = [x^n] \left( \prod_{i \in M} a_i \right)$; it rather needs to be shown that $[x^n] \left( \prod_{i \in J} a_i \right) = [x^n] \left( \prod_{i \in M} a_i \right)$ for every finite subset $J$ of $I$ satisfying $M \subseteq J \subseteq I$.
Using these concepts, we can now reword our definition of infinite sums as
follows:
Definition 3.11.5. Let $(a_i)_{i \in I}$ be a (possibly infinite) family of FPSs. Then:

(a) The family $(a_i)_{i \in I}$ is said to be multipliable if and only if each coefficient in its product is finitely determined.

(b) If the family $(a_i)_{i \in I}$ is multipliable, then its product $\prod_{i \in I} a_i$ is defined to be the FPS whose $x^n$-coefficient (for any $n \in \mathbb{N}$) can be computed as follows: If $n \in \mathbb{N}$, and if $M$ is a finite subset of $I$ that determines the $x^n$-coefficient in the product of $(a_i)_{i \in I}$, then
$$[x^n] \left( \prod_{i \in I} a_i \right) = [x^n] \left( \prod_{i \in M} a_i \right).$$
So let us prove this. Let M1 and M2 be two finite subsets of I that each
determine the x n -coefficient in the product of (ai )i∈ I . Thus, in particular, M1
determines the x n -coefficient in the product of (ai )i∈ I . In other words, every
finite subset J of I satisfying M1 ⊆ J ⊆ I satisfies
$$[x^n] \left( \prod_{i \in J} a_i \right) = [x^n] \left( \prod_{i \in M_1} a_i \right).$$
The left hand sides of the equalities (114) and (115) are equal (since $M_1 \cup M_2 = M_2 \cup M_1$). Thus, the right hand sides are equal as well. In other words,
Proposition 3.11.7. Let $(a_i)_{i \in I}$ be a finite family of FPSs. Then, the product $\prod_{i \in I} a_i$ defined according to Definition 3.11.5 (b) equals the finite product $\prod_{i \in I} a_i$ defined in the usual way (i.e., defined as in any commutative ring).
Proof. Argue that I itself is a subset of I that determines all coefficients in the
product of (ai )i∈ I . See Section B.2 for a detailed proof.
We shall apply Convention 3.2.14 to infinite products just like we have been applying it to infinite sums. For instance, the product sign $\prod_{k=m}^{\infty}$ (for a fixed $m \in \mathbb{Z}$) means $\prod_{k \in \{m, m+1, m+2, \ldots\}}$.
3.11.3. Why $\prod_{i \in \mathbb{N}} \left( 1 + x^{2^i} \right)$ works and $\prod_{i \in \mathbb{N}} (1 + ix)$ doesn't

Let us now see how Definition 3.11.5 legitimizes our product $\prod_{i \in \mathbb{N}} \left( 1 + x^{2^i} \right)$ from Subsection 3.11.1. Indeed,
$$\prod_{i \in \mathbb{N}} \left( 1 + x^{2^i} \right) = \left( 1 + x^1 \right) \left( 1 + x^2 \right) \left( 1 + x^4 \right) \left( 1 + x^8 \right) \cdots.$$
If you want to compute the $x^6$-coefficient in this product, you only need to multiply the first 3 factors $\left( 1 + x^1 \right) \left( 1 + x^2 \right) \left( 1 + x^4 \right)$; none of the other factors will change this coefficient in any way, because multiplying an FPS by $1 + x^m$
(for some m > 0) does not change its first m coefficients38 . Likewise, if you
want to compute the x13 -coefficient of the above product, then you only need
to multiply the first 4 factors; none of the others will have any effect on this
coefficient. The same logic applies to the x n -coefficient for any n ∈ N; it is
determined by the first ⌊log2 n⌋ + 1 factors of the product. Thus, each coefficient
in the product is finitely determined. This means that the family is multipliable;
thus, its product makes sense.
In contrast, the product
(1 + 0x ) (1 + 1x ) (1 + 2x ) (1 + 3x ) (1 + 4x ) · · · = ∏ (1 + ix)
i ∈N
does not make sense. Indeed, its x1 -coefficient is not finitely determined (any
of the factors other than 1 + 0x affects it), so the family (1 + ix )i∈N is not mul-
tipliable.
Likewise, the product
$$\left( 1 + \frac{x}{2^1} \right) \left( 1 + \frac{x}{2^2} \right) \left( 1 + \frac{x}{2^3} \right) \cdots = \prod_{i \in \{1, 2, 3, \ldots\}} \left( 1 + \frac{x}{2^i} \right)$$
Then,
$$[x^m] \left( a (1 + f) \right) = [x^m] (a + af) = [x^m] a + \underbrace{[x^m] (af)}_{\substack{= 0 \\ \text{(by (117))}}} \qquad \left( \text{by (20)} \right)$$
$$= [x^m] a.$$
This proves Lemma 3.11.8.
For convenience, let us extend Lemma 3.11.8 to products of several factors:
Then,
$$[x^m] \left( a \prod_{i \in J} (1 + f_i) \right) = [x^m] a \qquad \text{for each } m \in \{0, 1, \ldots, n\}.$$
Proof of Lemma 3.11.9. This is just Lemma 3.11.8, applied several times (specifi-
cally, | J | many times). See Section B.2 for a detailed proof.
Now, using Lemma 3.11.9, we can obtain the following convenient criterion
for multipliability:
Proof of Theorem 3.11.10. This is an easy consequence of Lemma 3.11.9. See Sec-
tion B.2 for a detailed proof.
We notice two simple sufficient (if rarely satisfied) criteria for multipliability:
Proposition 3.11.11. If all but finitely many entries of a family (ai )i∈ I ∈
K [[ x ]] I equal 1 (that is, if all but finitely many i ∈ I satisfy ai = 1), then
this family is multipliable.
Proof. Assume that the family (ai )i∈ I contains 0 as an entry. That is, there exists
some j ∈ I such that a j = 0. Consider this j. Now, it is easy to see that the
subset { j} of I determines all coefficients in the product of (ai )i∈ I . The details
are LTTR.
3.11.5. x n -approximators
Working with multipliable families gets slightly easier using the following no-
tion:
Proof of Lemma 3.11.15 (sketched). This is an easy consequence of the fact that a
union of finitely many finite sets is finite. A detailed proof can be found in
Section B.2.
As promised above, we can use x n -approximators to “approximate” infinite
products of FPSs (in the sense of: compute the first n + 1 coefficients of these
products). Here is why this works:39
$$\prod_{i \in J} a_i \overset{x^n}{\equiv} \prod_{i \in M} a_i.$$
$$\prod_{i \in I} a_i \overset{x^n}{\equiv} \prod_{i \in M} a_i.$$
Proof. This follows easily from Definition 3.11.13 and Definition 3.11.5 (b). See
Section B.2 for a detailed proof.
³⁹ See Definition 3.10.1 for the meaning of the symbol "$\overset{x^n}{\equiv}$" appearing in this proposition.
(b) We have
$$\prod_{i \in I} a_i = \left( \prod_{i \in J} a_i \right) \cdot \prod_{i \in I \setminus J} a_i.$$
$$\prod_{i \in I} a_i = a_j \cdot \prod_{i \in I \setminus \{j\}} a_i$$
(assuming that the family (ai )i∈ I \{ j} is multipliable40 ). This rule allows us to
split off any factor from a multipliable product, as long as the rest of the product
is still multipliable.
Our next property generalizes the classical rule $\prod_{i \in I} (a_i b_i) = \prod_{i \in I} a_i \cdot \prod_{i \in I} b_i$ of finite products to infinite ones:
Proposition 3.11.18. Let $(a_i)_{i \in I} \in K[[x]]^I$ and $(b_i)_{i \in I} \in K[[x]]^I$ be two multipliable families of FPSs. Then:

(a) The family $(a_i b_i)_{i \in I}$ is multipliable.

(b) We have
$$\prod_{i \in I} (a_i b_i) = \left( \prod_{i \in I} a_i \right) \cdot \left( \prod_{i \in I} b_i \right).$$
Proposition 3.11.19. Let $(a_i)_{i \in I} \in K[[x]]^I$ and $(b_i)_{i \in I} \in K[[x]]^I$ be two multipliable families of FPSs. Assume that the FPS $b_i$ is invertible for each $i \in I$. Then:

(a) The family $\left( \dfrac{a_i}{b_i} \right)_{i \in I}$ is multipliable.

(b) We have
$$\prod_{i \in I} \frac{a_i}{b_i} = \frac{\prod_{i \in I} a_i}{\prod_{i \in I} b_i}.$$
Proof of Proposition 3.11.21 (sketched). This is another proof in the tradition of the proofs of Proposition 3.11.17 and Proposition 3.11.18. We must show that the family $(a_i)_{i \in J}$ is multipliable whenever $J$ is a subset of $I$. The idea is to show that if $U$ is an $x^n$-approximator for $(a_i)_{i \in I}$, then $U \cap J$ determines the $x^n$-coefficient in the product of $(a_i)_{i \in J}$ (and, in fact, is an $x^n$-approximator for $(a_i)_{i \in J}$). This relies on the invertibility of $\prod_{i \in U \setminus J} a_i$; this is why the FPSs $a_i$ are required to be invertible in the proposition.
The details of this proof can be found in Section B.2.
As Remark 3.11.20 shows, Proposition 3.11.21 would not hold without the
word “invertible”.
Now let us state a trivial rule that is nevertheless worth stating. Recall that finite products can be reindexed using a bijection – i.e., if $f : S \to T$ is a bijection between two finite sets $S$ and $T$, then any product $\prod_{t \in T} a_t$ can be rewritten as $\prod_{s \in S} a_{f(s)}$. The same holds for infinite products:
$$\prod_{t \in T} a_t = \prod_{s \in S} a_{f(s)}$$
(and, in particular, the product on the right hand side is well-defined, i.e., the family $\left( a_{f(s)} \right)_{s \in S}$ is multipliable).
$$\prod_{s \in S} a_s = \prod_{w \in W} \ \prod_{\substack{s \in S; \\ f(s) = w}} a_s. \qquad (119)$$
(In particular, the right hand side is well-defined – i.e., the family $\left( \prod_{\substack{s \in S; \\ f(s) = w}} a_s \right)_{w \in W}$ is multipliable.)
⁴¹ Note that this assumption automatically holds if we assume that all FPSs $a_s$ are invertible. Indeed, the family $(a_s)_{s \in S; \ f(s) = w}$ is a subfamily of the multipliable family $(a_s)_{s \in S}$ and thus must itself be multipliable if all the $a_s$ are invertible (by Proposition 3.11.21).
Proof of Proposition 3.11.23 (sketched). This can be derived from the analogous
property of finite products, since all coefficients in a multipliable product are
finitely determined (conveniently using x n -approximators, which determine
several coefficients at the same time). See Section B.2 for the details of this
proof.
The next rule allows us (under certain technical conditions) to interchange product signs, and to rewrite a nested product as a product over pairs:
Proposition 3.11.24 (Fubini rule for infinite products of FPSs). Let $I$ and $J$ be two sets. Let $\left( a_{(i,j)} \right)_{(i,j) \in I \times J} \in K[[x]]^{I \times J}$ be a multipliable family of FPSs. Assume that for each $i \in I$, the family $\left( a_{(i,j)} \right)_{j \in J}$ is multipliable. Assume that for each $j \in J$, the family $\left( a_{(i,j)} \right)_{i \in I}$ is multipliable. Then,
$$\prod_{i \in I} \prod_{j \in J} a_{(i,j)} = \prod_{(i,j) \in I \times J} a_{(i,j)} = \prod_{j \in J} \prod_{i \in I} a_{(i,j)}.$$
(In particular, all the products appearing in this equality are well-defined.)
Proof of Proposition 3.11.24 (sketched). The first equality follows by applying Propo-
sition 3.11.23 to S = I × J and W = I and f (i, j) = i (and appropriately rein-
dexing the products). The second equality is analogous. See Section B.2 for the
details of this proof.
The above rules (Proposition 3.11.17, Proposition 3.11.18, Proposition 3.11.19,
Proposition 3.11.21, Proposition 3.11.22, Proposition 3.11.23, Proposition 3.11.24)
show that infinite products (as we have defined them) are well-behaved – i.e.,
satisfy the usual rules that finite products satisfy, with only one minor caveat: Subfamilies of multipliable families might fail to be multipliable (as we saw in Remark 3.11.20). If we restrict ourselves to multipliable families of invertible
FPSs, then even this caveat is avoided:
(In particular, all the products appearing in this equality are well-defined.)
These rules (and some similar ones, which the reader can easily invent and
prove42 ) allow us to work with infinite products almost as comfortably as with
finite ones. Of course, we need to check that our families are multipliable,
and occasionally verify that a few other technical requirements are met (such
as multipliability of subfamilies), but usually such verifications are straightfor-
ward and easy and can be done in one’s head.
Does this justify our manipulations in Subsection 3.11.1? To some extent.
We need to be careful with the telescope principle, whose infinite analogue is
rather subtle and needs some qualifications. Here is an example of how not to
use the telescope principle:
$$\frac{1}{2} \cdot \frac{2}{2} \cdot \frac{2}{2} \cdot \frac{2}{2} \cdots \neq 1.$$
It is tempting to argue that the infinitely many 2’s in these fractions cancel
each other out, and yet the 1 that remains is not the right result. See Exercise
A.2.13.4 (b) for how an infinite telescope principle should actually look like.
Anyway, our computations in Subsection 3.11.1 did not truly need the telescope principle; they can just as well be made using more fundamental rules43.

42 For example, if $\left(a_i\right)_{i\in I}$ is a multipliable family of FPSs, and if $k\in\mathbb{N}$, then $\prod_{i\in I} a_i^k = \left(\prod_{i\in I} a_i\right)^k$. If the $a_i$ are moreover invertible, then this equality also holds for all $k\in\mathbb{Z}$.

Math 701 Spring 2021, version April 6, 2024 page 150
There is one more rule that we have used in Subsection 3.11.1 and have not
justified yet: the equality (111). We will justify it next.
43 For example, the equality $\prod_{i\in\mathbb{N}}\left(1+x^{2^i}\right) = \dfrac{1}{1-x}$ can be obtained as follows: We have
$$\prod_{i\in\mathbb{N}}\left(1+x^{2^i}\right) = \prod_{i\in\mathbb{N}}\frac{1-x^{2^{i+1}}}{1-x^{2^i}} = \frac{\prod\limits_{i\in\mathbb{N}}\left(1-x^{2^{i+1}}\right)}{\prod\limits_{i\in\mathbb{N}}\left(1-x^{2^i}\right)}$$
(since $1-x^{2^{i+1}} = \left(1-x^{2^i}\right)\left(1+x^{2^i}\right)$ for each $i\in\mathbb{N}$), where
the last step used Proposition 3.11.19 and relied on the fact that both families
$\left(1-x^{2^{i+1}}\right)_{i\in\mathbb{N}}$ and $\left(1-x^{2^i}\right)_{i\in\mathbb{N}}$ are multipliable (this is important, but very easy to check
in this case) and that each FPS $1-x^{2^i}$ is invertible (because its constant term is $1$). However,
splitting off the factor for $i=0$ from the product $\prod_{i\in\mathbb{N}}\left(1-x^{2^i}\right)$, we obtain
$$\prod_{i\in\mathbb{N}}\left(1-x^{2^i}\right) = \underbrace{\left(1-x^{2^0}\right)}_{=1-x^1=1-x}\cdot\prod_{i>0}\left(1-x^{2^i}\right) = (1-x)\cdot\underbrace{\prod_{i>0}\left(1-x^{2^i}\right)}_{=\prod\limits_{i\in\mathbb{N}}\left(1-x^{2^{i+1}}\right)}$$
(here, we substituted $i+1$ for $i$ in the product), so that
$$\frac{\prod\limits_{i\in\mathbb{N}}\left(1-x^{2^{i+1}}\right)}{\prod\limits_{i\in\mathbb{N}}\left(1-x^{2^i}\right)} = \frac{1}{1-x}.$$
Hence, $\prod_{i\in\mathbb{N}}\left(1+x^{2^i}\right) = \dfrac{\prod\limits_{i\in\mathbb{N}}\left(1-x^{2^{i+1}}\right)}{\prod\limits_{i\in\mathbb{N}}\left(1-x^{2^i}\right)} = \dfrac{1}{1-x}$. So we don't need the telescope principle
to justify this equality.
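The identity $\prod_{i\in\mathbb{N}}\left(1+x^{2^i}\right) = \frac{1}{1-x}$ can also be sanity-checked numerically on truncations. Here is a short Python sketch (purely illustrative, not part of the notes' formal development) that multiplies out $(1+x)(1+x^2)(1+x^4)\cdots(1+x^{2^k})$ and confirms that all coefficients up to degree $2^{k+1}-1$ equal $1$, reflecting the uniqueness of binary representations:

```python
# Sanity check (not a proof): the partial products of (1 + x^(2^i)) should
# agree with 1/(1-x) = 1 + x + x^2 + ... up to degree 2^(k+1) - 1.

def poly_mul(p, q, trunc):
    """Multiply two polynomials (dicts exponent -> coefficient), truncating
    all terms of degree >= trunc."""
    r = {}
    for i, a in p.items():
        for j, b in q.items():
            if i + j < trunc:
                r[i + j] = r.get(i + j, 0) + a * b
    return r

def partial_product(k, trunc):
    """Compute (1+x)(1+x^2)(1+x^4)...(1+x^(2^k)), truncated to degree < trunc."""
    prod = {0: 1}
    for i in range(k + 1):
        prod = poly_mul(prod, {0: 1, 2**i: 1}, trunc)
    return prod

k = 4
trunc = 2 ** (k + 1)  # the partial product determines coefficients 0 .. 2^(k+1)-1
prod = partial_product(k, trunc)
# every coefficient should be 1, since each n has a unique binary representation
assert all(prod.get(n, 0) == 1 for n in range(trunc))
```

The truncation parameter matters: the factors beyond $i = k$ do not affect coefficients below degree $2^{k+1}$, which is exactly the coefficientwise convergence used throughout this section.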
44 Keep in mind that an empty Cartesian product (i.e., a Cartesian product of 0 sets) is always
a 1-element set; its only element is the 0-tuple (). Thus, a sum ranging over an empty
Cartesian product has exactly 1 addend.
of L. Then,
$$\prod_{i=1}^{n}\ \sum_{k=1}^{m_i} p_{i,k} = \sum_{(k_1,k_2,\ldots,k_n)\in[m_1]\times[m_2]\times\cdots\times[m_n]}\ \prod_{i=1}^{n} p_{i,k_i}. \tag{120}$$
Here, the right hand side is the sum of all $m_1 m_2\cdots m_n$ products that can be obtained by multiplying one addend from each of the factors on the left hand side.
See [Grinbe15, solution to Exercise 6.9] for a formal proof of Proposition
3.11.26. (The idea is to reduce it to the case n = 2 by induction, then to use the
discrete Fubini rule.)
Let us now move on to product rules for infinite sums and products. First,
let us extend Proposition 3.11.26 to a finite product of infinite sums (which are
now required to be in K [[ x ]] in order to have a notion of summability):
Proposition 3.11.27. For every $n\in\mathbb{N}$, let $[n]$ denote the set $\{1,2,\ldots,n\}$.
Let $n\in\mathbb{N}$. For every $i\in[n]$, let $S_i$ be a set, and let $\left(p_{i,k}\right)_{k\in S_i}$ be a summable family of elements of $K[[x]]$. Then,
$$\prod_{i=1}^{n}\ \sum_{k\in S_i} p_{i,k} = \sum_{(k_1,k_2,\ldots,k_n)\in S_1\times S_2\times\cdots\times S_n}\ \prod_{i=1}^{n} p_{i,k_i}. \tag{121}$$
In particular, the family $\left(\prod_{i=1}^{n} p_{i,k_i}\right)_{(k_1,k_2,\ldots,k_n)\in S_1\times S_2\times\cdots\times S_n}$ is summable.
Proof. Same method as for Proposition 3.11.26, but now using the discrete Fu-
bini rule for infinite sums.
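The finite product rule (120) is easy to illustrate numerically: expanding a product of finite sums gives the same result as summing, over all ways of choosing one addend per factor, the products of the chosen addends. A minimal Python sketch (illustrative only, with arbitrary sample numbers):

```python
# Illustration of the finite product rule (120): the product of sums equals
# the sum over all one-addend-per-factor choices of the product of the choices.
from itertools import product
from functools import reduce

p = [
    [2, 3],       # factor 1: p_{1,1} + p_{1,2}
    [5, 7, 11],   # factor 2: p_{2,1} + p_{2,2} + p_{2,3}
    [13, 17],     # factor 3: p_{3,1} + p_{3,2}
]

lhs = 1
for factor in p:
    lhs *= sum(factor)  # left hand side: product of the sums

# right hand side: one summand per element of [2] x [3] x [2]
rhs = sum(reduce(lambda a, b: a * b, choice) for choice in product(*p))

assert lhs == rhs  # both equal (2+3)(5+7+11)(13+17) = 3450
```

The right hand side here has $2\cdot3\cdot2 = 12$ summands, matching the count $m_1 m_2 \cdots m_n$ in the proposition.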
Proposition 3.11.27 is rather general, but is only concerned with finite prod-
ucts. Thus, it cannot be directly used to justify (111), since the product in (111)
is infinite. Thus, we need a product rule for infinite products of sums. Such
rules are subtle and require particular care: Not only do our sums have to
be summable and our product multipliable, but we also must avoid cases like
(1 − 1) (1 − 1) (1 − 1) · · · , which would produce non-summable infinite sums
when expanded (despite being multipliable). We also need to be clear about
what addends we get when we expand our products: For example, when expanding the product
$$(1+a_0)(1+a_1)(1+a_2)(1+a_3)(1+a_4)\cdots,$$
Note that the assumption (122) in Proposition 3.11.29 ensures that each of
the sums being multiplied contains an addend that equals 1 (and that this
addend is named pi,0 ; but this clearly does not restrict the generality of the
proposition). The equality (123) shows how we can expand an infinite prod-
uct of such sums. The result is a huge sum of products (the right hand side
of (123)); each of these products is formed by picking an addend from each
sum $\sum_{k\in S_i} p_{i,k}$ (just as in the finite case). The picking has to be done in such
a way that the addend $1$ gets picked from all but finitely many of our sums
(for instance, when expanding $\left(1+x^1\right)\left(1+x^2\right)\left(1+x^3\right)\cdots$, we don't want to
pick the $x^i$'s from all sums, as this would lead to $x^1 x^2 x^3 \cdots =$ "$x^{\infty}$"); this is
why the sum on the right hand side of (123) is ranging only over the essen-
tially finite sequences (k1 , k2 , k3 , . . .) ∈ S1 × S2 × S3 × · · · . Finally, the condi-
tion that the family ( pi,k )(i,k)∈S be summable is our way of ruling out cases like
(1 − 1) (1 − 1) (1 − 1) · · · , which cannot be expanded. Proposition 3.11.29 is
not the most general version of the product rule, but it is sufficient for many of
our needs (and we will see some more general versions below).
The rest of this subsection should probably be skipped at first reading – it
is largely a succession of technical arguments about finite and infinite sets in
service of making rigorous what is already intuitively clear.
Before we sketch a proof of Proposition 3.11.29, let us show how (111) can be
derived from it:
Proof of (111) using Proposition 3.11.29. Let ( a0 , a1 , a2 , . . .) = ( an )n∈N be a summable
sequence of FPSs in K [[ x ]]. We must prove the equality (111).
Set $S_i = \{0,1\}$ for each $i\in\{1,2,3,\ldots\}$.
Define the set $S$ as in Proposition 3.11.29; then, $S = \{(1,1), (2,1), (3,1), \ldots\}$.
Now, set $p_{i,k} = a_{i-1}^k$ for each $i\in\{1,2,3,\ldots\}$ and each $k\in\{0,1\}$. Thus, the family $\left(p_{i,k}\right)_{(i,k)\in S}$ is summable45, and the statement (122) holds as well (indeed, we
have $p_{i,0} = a_{i-1}^0 = 1$ for any $i\in\{1,2,3,\ldots\}$). Hence, Proposition 3.11.29 can be
applied.
Moreover, for each $i\in\{1,2,3,\ldots\}$, we have $S_i = \{0,1\}$. Thus, for each
$i\in\{1,2,3,\ldots\}$, we have $\sum_{k\in S_i} p_{i,k} = p_{i,0} + p_{i,1} = 1 + a_{i-1}$.
45 Indeed, this family can be rewritten as $\left(p_{1,1}, p_{2,1}, p_{3,1}, \ldots\right) = \left(a_0^1, a_1^1, a_2^1, \ldots\right) = \left(a_0, a_1, a_2, \ldots\right)$, which is summable by assumption.
Hence,
$$\prod_{i=1}^{\infty}\underbrace{\sum_{k\in S_i} p_{i,k}}_{=1+a_{i-1}} = \prod_{i=1}^{\infty}\left(1+a_{i-1}\right) = \prod_{i\in\mathbb{N}}\left(1+a_i\right)$$
Now, there is a bijection from
$$\left\{\text{essentially finite sequences } (k_0,k_1,k_2,\ldots)\in\{0,1\}^{\infty}\right\}$$
to
$$\left\{\text{finite lists } (i_1,i_2,\ldots,i_k) \text{ of nonnegative integers such that } i_1<i_2<\cdots<i_k\right\}$$
that sends each sequence $(k_0,k_1,k_2,\ldots)\in\{0,1\}^{\infty}$ to the list of all $i\in\mathbb{N}$ satisfying $k_i=1$ (written in increasing order). Furthermore, if this bijection sends an
essentially finite sequence $(k_0,k_1,k_2,\ldots)\in\{0,1\}^{\infty}$ to a finite list $(i_1,i_2,\ldots,i_k)$,
then
$$\prod_{i\in\mathbb{N}} a_i^{k_i} = \left(\prod_{\substack{i\in\mathbb{N};\\ k_i=0}}\underbrace{a_i^{k_i}}_{=a_i^0=1}\right)\cdot\left(\prod_{\substack{i\in\mathbb{N};\\ k_i=1}}\underbrace{a_i^{k_i}}_{=a_i^1=a_i}\right) \qquad (\text{since each } k_i \text{ is either } 0 \text{ or } 1)$$
$$= \underbrace{\left(\prod_{\substack{i\in\mathbb{N};\\ k_i=0}} 1\right)}_{=1}\cdot\left(\prod_{\substack{i\in\mathbb{N};\\ k_i=1}} a_i\right) = \prod_{\substack{i\in\mathbb{N};\\ k_i=1}} a_i = a_{i_1}a_{i_2}\cdots a_{i_k}$$
(since $(i_1,i_2,\ldots,i_k)$ is the list of all $i\in\mathbb{N}$ satisfying $k_i=1$). Thus, we can use
this bijection to re-index the sum on the right hand side of (124); we obtain
$$\sum_{\substack{(k_0,k_1,k_2,\ldots)\in\{0,1\}^{\infty}\\ \text{is essentially finite}}}\ \prod_{i\in\mathbb{N}} a_i^{k_i} = \sum_{i_1<i_2<\cdots<i_k} a_{i_1}a_{i_2}\cdots a_{i_k}.$$
Hence,
$$\prod_{i\in\mathbb{N}}\left(1+a_i\right) = \sum_{i_1<i_2<\cdots<i_k} a_{i_1}a_{i_2}\cdots a_{i_k}.$$
In other words,
$$\prod_{i\in\mathbb{N}}\left(1+a_i\right) = \sum_{\substack{J \text{ is a finite}\\ \text{subset of } \mathbb{N}}}\ \prod_{i\in J} a_i. \tag{125}$$
Indeed, the right hand side of this equality is precisely the right hand side of (111). Thus, (111) is proven.
We will prove Proposition 3.11.29 soon. First, however, let us generalize
Proposition 3.11.29 to products indexed by arbitrary sets instead of {1, 2, 3, . . .}:
Proposition 3.11.30. Let $I$ be a set. For any $i\in I$, let $S_i$ be a set that contains
the number $0$. Set
$$S := \left\{(i,k) \mid i\in I \text{ and } k\in S_i\setminus\{0\}\right\}.$$
For any $i\in I$ and any $k\in S_i$, let $p_{i,k}$ be an element of $K[[x]]$. Assume that
$$p_{i,0} = 1 \qquad \text{for each } i\in I.$$
Assume further that the family $\left(p_{i,k}\right)_{(i,k)\in S}$ is summable. Then, the product
$\prod_{i\in I}\sum_{k\in S_i} p_{i,k}$ is well-defined (i.e., the family $\left(p_{i,k}\right)_{k\in S_i}$ is summable for each
$i\in I$, and the family $\left(\sum_{k\in S_i} p_{i,k}\right)_{i\in I}$ is multipliable), and we have
$$\prod_{i\in I}\ \sum_{k\in S_i} p_{i,k} = \sum_{\substack{(k_i)_{i\in I}\in\prod_{i\in I} S_i;\\ \text{all but finitely many } i\in I\\ \text{satisfy } k_i=0}}\ \prod_{i\in I} p_{i,k_i}.$$
Proposition 3.11.31. Let $N$ be a finite set. For every $i\in N$, let $S_i$ be a set, and let $\left(p_{i,k}\right)_{k\in S_i}$ be a summable family of elements of $K[[x]]$. Then,
$$\prod_{i\in N}\ \sum_{k\in S_i} p_{i,k} = \sum_{(k_i)_{i\in N}\in\prod_{i\in N} S_i}\ \prod_{i\in N} p_{i,k_i}.$$
In particular, the family $\left(\prod_{i\in N} p_{i,k_i}\right)_{(k_i)_{i\in N}\in\prod_{i\in N} S_i}$ is summable.
Proof of Proposition 3.11.31. This is just Proposition 3.11.27, with the indexing
set {1, 2, . . . , n} replaced by N. It can be proved by reindexing the products (or
directly by induction on | N |).
Let me furthermore state an infinite version of Lemma 3.11.9, which will be
used in the proof of Proposition 3.11.30 given below:
Proof of Lemma 3.11.32 (sketched). The idea here is to argue that the first $n+1$
coefficients of $a\prod_{i\in J}\left(1+f_i\right)$ agree with those of $a\prod_{i\in M}\left(1+f_i\right)$ for some finite subset $M$ of $J$. Then, apply Lemma 3.11.9 to this subset $M$. The details of this
argument can be found in Section B.2.
With this lemma proved, we have all necessary prerequisites for the proof of
Proposition 3.11.30. The proof, however, is rather long due to the bookkeeping
required, and therefore has been banished to the appendix (Section B.2, to be
specific).
We notice that Proposition 3.11.30 (and thus also Proposition 3.11.29) can be
generalized slightly by lifting the requirement that all sets $S_i$ contain $0$ (this
means that the sums being multiplied no longer need to contain $1$ as an addend). See Exercise A.2.11.5 for this generalization. This lets us expand products such as $x\left(1+x^1\right)\left(1+x^2\right)\left(1+x^3\right)\cdots$ (but of course, we could just as
well expand this particular product by splitting off the $x$ factor and applying
Proposition 3.11.29).
Proof. First, it is easy to see that the families $\left(1+x^k\right)_{k>0}$ and $\left(1-x^k\right)_{k>0}$ and
$\left(1-x^{2k}\right)_{k>0}$ and $\left(1-x^{2i-1}\right)_{i>0}$ are multipliable46. This shows that all the products of these families are well-defined. Moreover, the FPS $1-x^{2i-1}$ is invertible for each $i>0$ (since its constant term is $1$). In other words, the family
$\left(\left(1-x^{2i-1}\right)^{-1}\right)_{i>0}$ is multipliable. Thus, the product on the left hand side of
Proposition 3.11.33 is well-defined.
46 Indeed, this follows from Theorem 3.11.10, since the families $\left(x^k\right)_{k>0}$ and $\left(-x^k\right)_{k>0}$ and
$\left(-x^{2k}\right)_{k>0}$ and $\left(-x^{2i-1}\right)_{i>0}$ are summable.
For each $k>0$, we have $1+x^k = \dfrac{1-x^{2k}}{1-x^k}$ (since $1-x^{2k} = \left(1-x^k\right)\left(1+x^k\right)$). Thus,
$$\prod_{k>0}\left(1+x^k\right) = \prod_{k>0}\frac{1-x^{2k}}{1-x^k} = \frac{\prod\limits_{k>0}\left(1-x^{2k}\right)}{\prod\limits_{k>0}\left(1-x^k\right)} = \frac{\left(1-x^2\right)\left(1-x^4\right)\left(1-x^6\right)\left(1-x^8\right)\cdots}{\left(1-x^1\right)\left(1-x^2\right)\left(1-x^3\right)\left(1-x^4\right)\cdots}$$
$$= \frac{1}{\left(1-x^1\right)\left(1-x^3\right)\left(1-x^5\right)\left(1-x^7\right)\cdots} = \prod_{i>0}\left(1-x^{2i-1}\right)^{-1}$$
(here, we cancelled each factor $1-x^{2k}$ of the numerator against the corresponding even-indexed factor of the denominator). This is precisely the claim of Proposition 3.11.33. Now, expanding the product $\prod_{k>0}\left(1+x^k\right)$ using (125) (applied to $a_i = x^{i+1}$), we obtain
$$\prod_{k>0}\left(1+x^k\right) = \sum_{n\in\mathbb{N}} d_n x^n,$$
where dn is the # of all strictly increasing tuples (i1 < i2 < · · · < ik ) of positive
integers such that n = i1 + i2 + · · · + ik . We can rewrite this definition as follows:
dn is the # of ways to write n as a sum of distinct positive integers, with no
regard for the order47 .
Next, let us expand the left hand side: For each positive integer $i$, we have
$$\left(1-x^{2i-1}\right)^{-1} = 1 + x^{2i-1} + \left(x^{2i-1}\right)^2 + \left(x^{2i-1}\right)^3 + \cdots$$
$$\left(\text{by substituting } x^{2i-1} \text{ for } x \text{ in the equality } (1-x)^{-1} = 1+x+x^2+x^3+\cdots\right)$$
$$= 1 + x^{2i-1} + x^{2(2i-1)} + x^{3(2i-1)} + \cdots = \sum_{k\in\mathbb{N}} x^{k(2i-1)}.$$
47 "No regard for the order" means that, for example, $3+4+1$ and $1+3+4$ count as the same
way of writing $8$ as a sum of distinct integers.
Multiplying these geometric series together for all $i>0$ and expanding (again by the product rule), we obtain
$$\prod_{i>0}\left(1-x^{2i-1}\right)^{-1} = \sum_{n\in\mathbb{N}} o_n x^n, \tag{131}$$
where $o_n$ is the # of ways to write $n$ as a sum of odd positive integers (with no regard for the order). Comparing the two expansions, we find
$$\sum_{n\in\mathbb{N}} d_n x^n = \sum_{n\in\mathbb{N}} o_n x^n.$$
For example, for $n=6$, we have $d_n = 4$, because the ways to write $n=6$ as a sum of distinct positive integers are
$$6, \qquad 1+5, \qquad 2+4, \qquad 1+2+3$$
(don't forget the first of these ways, trivial as it looks!). On the other hand,
$o_n = 4$, because the ways to write $n=6$ as a sum of odd positive integers are
$$1+5, \qquad 3+3, \qquad 3+1+1+1, \qquad 1+1+1+1+1+1.$$
We will soon learn a bijective proof of Theorem 3.11.34 as well (see the Second
proof of Theorem 4.1.14 below).
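Theorem 3.11.34 is easy to sanity-check by brute force for small $n$. The following Python sketch (illustrative only, not part of the notes) counts both kinds of partitions directly by recursion on the largest part:

```python
# Brute-force check of Euler's theorem: # of partitions of n into distinct
# parts equals # of partitions of n into odd parts.

def partitions_distinct(n, max_part=None):
    """# of ways to write n as a sum of distinct positive integers <= max_part."""
    if max_part is None:
        max_part = n
    if n == 0:
        return 1
    # choose the largest part k, then recurse with strictly smaller parts
    return sum(partitions_distinct(n - k, k - 1)
               for k in range(1, min(n, max_part) + 1))

def partitions_odd(n, max_part=None):
    """# of ways to write n as a sum of odd positive integers <= max_part."""
    if max_part is None:
        max_part = n
    if n == 0:
        return 1
    # choose the largest part k (odd), which may repeat
    return sum(partitions_odd(n - k, k)
               for k in range(1, min(n, max_part) + 1) if k % 2 == 1)

assert partitions_distinct(6) == 4   # 6, 1+5, 2+4, 1+2+3
assert partitions_odd(6) == 4        # 1+5, 3+3, 3+1+1+1, 1+1+1+1+1+1
assert all(partitions_distinct(n) == partitions_odd(n) for n in range(20))
```

Such a check is of course no substitute for the generating-function (or bijective) proof, but it quickly catches misremembered statements.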
$$\overline{A} := \sum_{a\in A} x^{|a|} = \sum_{n\in\mathbb{N}} \left(\text{\# of } a\in A \text{ having weight } n\right)\cdot x^n \in \mathbb{Z}[[x]].$$
Example 3.12.2. Let $B$ be the weighted set of all binary strings, i.e., finite
tuples consisting of 0s and 1s, where the weight of a string is its length. Thus,
$$\overline{B} = \sum_{n\in\mathbb{N}}\underbrace{\left(\text{\# of } a\in B \text{ having weight } n\right)}_{=\left(\text{\# of binary strings of length } n\right) = 2^n}\cdot\, x^n = \sum_{n\in\mathbb{N}} 2^n x^n = \frac{1}{1-2x}.$$
Example 3.12.3. Let B′ be the weighted set of all binary strings, i.e., finite
tuples consisting of 0s and 1s. The weight of an n-tuple ( a1 , a2 , . . . , an ) ∈ B′ is
defined to be a1 + a2 + · · · + an . (Thus, B′ is the same set as the B in Example
3.12.2, but equipped with a different weight function.) This weighted set
B′ is not finite-type, since there are infinitely many a ∈ B′ having weight 0
(namely, all tuples of the form (0, 0, . . . , 0), with any number of zeroes.)
Those familiar with graded vector spaces (or graded modules) will recognize
weighted sets as their combinatorial analogues: Weighted sets are to sets what
graded vector spaces are to vector spaces. (In particular, the weight generating function of a weighted set is the analogue of the Hilbert series of a graded vector space.)
Proof. This is almost trivial: The weighted sets $A$ and $B$ are isomorphic; thus,
there exists an isomorphism $\rho : A \to B$. Consider this $\rho$. Then, $\rho$ is a bijection
and preserves the weight (since $\rho$ is an isomorphism of weighted sets). The
latter property says that we have $|\rho(a)| = |a|$ for each $a\in A$. Now, the definition of $\overline{B}$ yields
$$\overline{B} = \sum_{b\in B} x^{|b|} = \sum_{a\in A} x^{|\rho(a)|} \qquad \left(\text{here, we substituted } \rho(a) \text{ for } b \text{ in the sum, since } \rho \text{ is a bijection}\right)$$
$$= \sum_{a\in A} x^{|a|} = \overline{A} \qquad \left(\text{since } \overline{A} = \sum_{a\in A} x^{|a|} \text{ by the definition of } \overline{A}\right).$$
Definition 3.12.5. Let A and B be two weighted sets. Then, the weighted set
A + B is defined to be the disjoint union of A and B, with the weight function
inherited from A and B (meaning that each element of A has the same weight
that it had in A, and each element of B has the same weight that it had in B).
Formally speaking, this means that A + B is the set ({0} × A) ∪ ({1} × B),
with the weight function given by
$$|(0,a)| = |a| \qquad \text{for each } a\in A \tag{133}$$
and
$$|(1,b)| = |b| \qquad \text{for each } b\in B. \tag{134}$$
$$\overline{A+B} = \sum_{c\in A+B} x^{|c|} = \sum_{c\in\{0\}\times A} x^{|c|} + \sum_{c\in\{1\}\times B} x^{|c|}$$
(here, we have split the sum, since the set $A+B$ is the union of the two disjoint sets $\{0\}\times A$ and $\{1\}\times B$)
$$= \sum_{a\in A} x^{|(0,a)|} + \sum_{b\in B} x^{|(1,b)|}$$
(here, we have substituted $(0,a)$ for $c$ in the first sum, since the map $A\to\{0\}\times A,\ a\mapsto(0,a)$ is a bijection, and likewise substituted $(1,b)$ for $c$ in the second sum, since the map $B\to\{1\}\times B,\ b\mapsto(1,b)$ is a bijection)
$$= \sum_{a\in A}\underbrace{x^{|(0,a)|}}_{\substack{=x^{|a|}\\ \text{(by (133))}}} + \sum_{b\in B}\underbrace{x^{|(1,b)|}}_{\substack{=x^{|b|}\\ \text{(by (134))}}} = \underbrace{\sum_{a\in A} x^{|a|}}_{=\overline{A}} + \underbrace{\sum_{b\in B} x^{|b|}}_{=\overline{B}} = \overline{A} + \overline{B}.$$
We can easily extend Definition 3.12.5 and Proposition 3.12.6 to disjoint unions
of any number (even infinite) of weighted sets. (However, in the case of infinite
disjoint unions, we need to require the disjoint union to be finite-type in order
for Proposition 3.12.6 to make sense.)
We note that disjoint unions of weighted sets do not satisfy associativity
“on the nose”: If A, B and C are three weighted sets, then the weighted sets
A + B + C and ( A + B) + C and A + ( B + C ) are not literally equal but rather
are isomorphic via canonical isomorphisms. But Proposition 3.12.4 shows that
their weight generating functions are the same, so that we need not distinguish
between them if we are only interested in these functions.
Another operation on weighted sets is the Cartesian product:
Definition 3.12.7. Let A and B be two weighted sets. Then, the weighted set
A × B is defined to be the Cartesian product of the sets A and B (that is, the
set {( a, b) | a ∈ A and b ∈ B}), with the weight function defined as follows:
For any $(a,b)\in A\times B$, we set
$$|(a,b)| := |a| + |b|.$$
The weight generating function of $A\times B$ is then
$$\overline{A\times B} = \sum_{(a,b)\in A\times B}\underbrace{x^{|a|+|b|}}_{=x^{|a|}\cdot x^{|b|}} = \sum_{(a,b)\in A\times B} x^{|a|}\cdot x^{|b|} = \underbrace{\left(\sum_{a\in A} x^{|a|}\right)}_{=\overline{A}}\cdot\underbrace{\left(\sum_{b\in B} x^{|b|}\right)}_{=\overline{B}} = \overline{A}\cdot\overline{B}.$$
48 Here is the idea: Let $n\in\mathbb{N}$. We must prove that there are only finitely many pairs $(a,b)\in A\times B$ having weight $|(a,b)| = n$. Since $|(a,b)| = |a|+|b|$, these pairs have the property
that $a\in A$ has weight $k$ for some $k\in\{0,1,\ldots,n\}$, and that $b\in B$ has weight $n-k$ for the
same $k$. This leaves only finitely many options for $k$, only finitely many options for $a$ (since
$A$ is finite-type and thus has only finitely many elements of weight $k$), and only finitely
many options for $b$ (since $B$ is finite-type and thus has only finitely many elements of weight $n-k$).
Altogether, we thus obtain only finitely many options for $(a,b)$.
|( a1 , a2 , . . . , ak )| = | a1 | + | a2 | + · · · + | ak | . (136)
Note that the 0-th Cartesian power A0 of a weighted set A always consists of
a single element – namely, the empty 0-tuple (), which has weight 0.
If A is a weighted set, then the infinite disjoint union A0 + A1 + A2 + · · ·
consists of all (finite) tuples of elements of A (including the 0-tuple, the 1-
tuples, and so on). The weight of a tuple is the sum of the weights of its entries
(indeed, this is just what (136) says).
3.12.2. Examples
Now, let us use this theory to revisit some of the things we have already
counted:
$$C_k := \{\text{compositions of length } k\} = \{(a_1,a_2,\ldots,a_k) \mid a_1,a_2,\ldots,a_k \text{ are positive integers}\} = P^k, \qquad \text{where } P = \{1,2,3,\ldots\}.$$
As weighted sets, $C_k = P^k$ as well49. Hence,
$$\overline{C_k} = \overline{P}^k = \left(\frac{x}{1-x}\right)^k \qquad \left(\text{since } \overline{P} = x^1 + x^2 + x^3 + \cdots = \frac{x}{1-x}\right).$$
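The coefficient of $x^n$ in $\left(\frac{x}{1-x}\right)^k$, i.e., the number of compositions of $n$ into $k$ parts, is the binomial coefficient $\binom{n-1}{k-1}$ (a standard fact, easily derived by expanding $(1-x)^{-k}$ with Newton's binomial formula). A Python sketch (illustrative only) checking this against brute-force enumeration:

```python
# Brute-force check: the # of k-tuples of positive integers summing to n
# (i.e. compositions of n into k parts) equals C(n-1, k-1), which is the
# coefficient of x^n in (x/(1-x))^k.
from itertools import product
from math import comb

def compositions(n, k):
    """# of k-tuples of positive integers summing to n (brute force)."""
    return sum(1 for t in product(range(1, n + 1), repeat=k) if sum(t) == n)

for n in range(1, 8):
    for k in range(1, n + 1):
        assert compositions(n, k) == comb(n - 1, k - 1)
```

For instance, `compositions(5, 2)` counts $(1,4), (2,3), (3,2), (4,1)$, giving $4 = \binom{4}{1}$.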
• Recall the notion of Dyck paths (as defined in Example 2 in Section 3.1),
as well as the Catalan numbers c0 , c1 , c2 , . . . (defined in the same place).
Let
These two sets Dtriv and Dnon are subsets of D, and thus are weighted sets
themselves (we define their weight functions by restricting the one of D).
49 Proof. The weight of a composition $(a_1,a_2,\ldots,a_k)\in C_k$ in $C_k$ is
$$|(a_1,a_2,\ldots,a_k)| = a_1+a_2+\cdots+a_k \qquad (\text{by definition}).$$
On the other hand, its weight in $P^k$ is
$$|(a_1,a_2,\ldots,a_k)| = |a_1|+|a_2|+\cdots+|a_k| \qquad (\text{by (136)})$$
$$= a_1+a_2+\cdots+a_k \qquad (\text{since } |n| = n \text{ for each } n\in P).$$
These two weights are clearly equal. Thus, the weight functions of $C_k$ and $P^k$ agree. Hence,
$C_k = P^k$ as weighted sets.
The set Dtriv consists of a single Dyck path, which has weight 0; thus, its
weight generating function is
Dtriv = x0 = 1.
In Example 2 of Section 3.1, we have seen that any nontrivial Dyck path
π has the following structure:50
– a NE-step,
– followed by a (diagonally shifted) Dyck path (drawn in green),
– followed by a SE-step,
– followed by another (horizontally shifted) Dyck path (drawn in purple).
If we denote the green Dyck path by α and the purple Dyck path by β,
then we obtain a pair (α, β) ∈ D × D of two Dyck paths. Thus, each
nontrivial Dyck path π ∈ Dnon gives rise to a pair (α, β) ∈ D × D of two
Dyck paths. This yields a map
Dnon → D × D,
π 7→ (α, β) ,
This is a one-element set, so the only real difference between the weighted
sets X × D × D and D × D is in the weights. Indeed, the sets D × D and
X × D × D are in bijection (any pair (α, β) ∈ D × D corresponds to the triple $(1, \alpha, \beta) \in X\times D\times D$), and we thus obtain an isomorphism of weighted sets
$$D_{\mathrm{non}} \to X\times D\times D, \qquad \pi\mapsto(1,\alpha,\beta),$$
$$D \cong D_{\mathrm{triv}} + \underbrace{D_{\mathrm{non}}}_{\cong\, X\times D\times D} \cong D_{\mathrm{triv}} + X\times D\times D,$$
so that
$$\overline{D} = \overline{D_{\mathrm{triv}}} + \overline{X}\cdot\overline{D}\cdot\overline{D} = 1 + x\overline{D}^2.$$
(note that the colors are purely ornamental here: we are coloring a domino
pink if it lies horizontally and green if it stands vertically, for the sake of con-
venience).
This sounds geometric, but actually is a combinatorial object hiding behind
geometric language. Our rectangles and dominos all align to a square grid.
Thus, rectangles can be modeled simply as finite sets of grid squares, and domi-
nos are unordered pairs of adjacent grid squares. Grid squares, in turn, can be
modeled as pairs (i, j) of integers (corresponding to the Cartesian coordinates
of their centers). Thus, we redefine domino tilings combinatorially as follows:
For example, one domino tiling of the rectangle $R_{3,2}$ is the set
partition
{{(1, 1) , (1, 2)} , {(2, 1) , (3, 1)} , {(2, 2) , (3, 2)}}
of R3,2 (here, {(1, 1) , (1, 2)} is the vertical domino, whereas {(2, 1) , (3, 1)} is
the bottom horizontal domino, and {(2, 2) , (3, 2)} is the top horizontal domino).
n = 0: d_{0,2} = 1
n = 1: d_{1,2} = 1
n = 2: d_{2,2} = 2
n = 3: d_{3,2} = 3
n = 4: d_{4,2} = 5
(the tilings themselves are pictured alongside each count)
A height-2 rectangle shall mean a rectangle of the form $R_{n,2}$ with $n\in\mathbb{N}$. Let
us define the weighted set $D$ to consist of all domino tilings of height-2 rectangles, where a tiling of $R_{n,2}$ has weight $n$. Its weight generating function is then
$$\overline{D} = \sum_{n\in\mathbb{N}} d_{n,2}\, x^n.$$
So we want to compute $\overline{D}$. Let us define a new weighted set that will help
us with that.
Namely, we say that a fault of a domino tiling T is a vertical line ℓ such that
• each domino of T lies either left of ℓ or right of ℓ (but does not straddle
ℓ), and
• there is at least one domino of T that lies left of ℓ, and at least one domino
of T that lies right of ℓ.
The tiling on the left has a fault (namely, the vertical line separating the 2nd
from the 3rd column), but the tiling on the right has none (a fault must be a
vertical line by definition; a horizontal line doesn’t count).
A domino tiling will be called faultfree if it is nonempty and has no fault.
Thus, the tiling on the right (in the above example) is faultfree.
We now observe a crucial (but trivial) lemma:
decomposes as a tuple of faultfree tilings, by cutting along all of its faults.
(Note that if the original tiling was faultfree, then it will decompose into
a 1-tuple. If the original tiling was empty, then it will decompose into a
0-tuple.)
Moreover, the sum of the weights of the faultfree tilings in the tuple is the
weight of the original tiling. (In other words, if a tiling $T$ decomposes into
the tuple $(T_1, T_2, \ldots, T_k)$, then $|T| = |T_1| + |T_2| + \cdots + |T_k|$.)
(with the same weights as in $D$), then we obtain an isomorphism (i.e., weight-preserving bijection)
$$D \to F^0 + F^1 + F^2 + F^3 + \cdots$$
(the right hand side being an infinite disjoint union of weighted sets). Hence,
$$D \cong F^0 + F^1 + F^2 + F^3 + \cdots,$$
and therefore
$$\overline{D} = \overline{F^0 + F^1 + F^2 + F^3 + \cdots} = \overline{F^0} + \overline{F^1} + \overline{F^2} + \overline{F^3} + \cdots \qquad (\text{by the infinite analogue of Proposition 3.12.6})$$
$$= \overline{F}^0 + \overline{F}^1 + \overline{F}^2 + \overline{F}^3 + \cdots \qquad (\text{by Proposition 3.12.10})$$
$$= \frac{1}{1-\overline{F}} \qquad \left(\text{by (5), with } \overline{F} \text{ substituted for } x\right).$$
(Figure: the two faultfree tilings of height-2 rectangles, namely a single vertical domino in $R_{1,2}$ and two stacked horizontal dominos in $R_{2,2}$.)
I claim that these two tilings are the only faultfree tilings of height-2 rectan-
gles. Indeed, consider any faultfree tiling of a height-2 rectangle. In this tiling,
look at the domino that covers the box (1, 1). If it is a vertical domino, then this
vertical domino must constitute the entire tiling, since otherwise there would
be a fault to its right. If it is a horizontal domino, then there must be a second
horizontal domino stacked atop it, and these two dominos must then constitute
the entire tiling, since otherwise there would be a fault to their right. This leads
to the two options we just named.
Thus, the weighted set $F$ consists of just the two tilings shown above: one
tiling of weight 1 and one tiling of weight 2. Hence, its weight generating
function is $\overline{F} = x + x^2$. So
$$\overline{D} = \frac{1}{1-\overline{F}} = \frac{1}{1-(x+x^2)} = \frac{1}{1-x-x^2} = f_1 + f_2 x + f_3 x^2 + f_4 x^3 + \cdots,$$
where $(f_0, f_1, f_2, \ldots)$ is the Fibonacci sequence. Thus, comparing coefficients,
we find
$$d_{n,2} = f_{n+1} \qquad \text{for each } n\in\mathbb{N}.$$
There are, of course, more elementary proofs of this (see [19fco, Proposition
1.1.11]).
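The identity $d_{n,2} = f_{n+1}$ can also be verified independently by brute-force counting. The following Python sketch (illustrative, not part of the notes) enumerates domino tilings directly, without any recursion on the rectangle's width:

```python
# Brute-force check that the # of domino tilings of R_{n,2} is the Fibonacci
# number f_{n+1} (with f_0 = 0, f_1 = 1).

def count_tilings(n, m):
    """# of domino tilings of R_{n,m} (n columns, m rows), found by always
    covering the smallest empty square with a horizontal or vertical domino."""
    def go(empty):
        if not empty:
            return 1
        i, j = min(empty)
        total = 0
        for nb in ((i + 1, j), (i, j + 1)):  # horizontal resp. vertical domino
            if nb in empty:
                total += go(empty - {(i, j), nb})
        return total
    return go(frozenset((i, j) for i in range(1, n + 1)
                               for j in range(1, m + 1)))

fib = [0, 1]
for _ in range(12):
    fib.append(fib[-1] + fib[-2])

assert [count_tilings(n, 2) for n in range(8)] == fib[1:9]  # d_{n,2} = f_{n+1}
```

Since every tiling covers the smallest empty square in exactly one of two ways, each tiling is counted exactly once; the same counter works for any small rectangle (e.g., `count_tilings(2, 3)` returns 3, matching $d_{2,3} = 3$ below).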
Remark 3.12.13. Here is an outline of an alternative proof of $\overline{D} = \dfrac{1}{1-\overline{F}}$:
Any tiling in $D$ is either empty, or can be uniquely split into a pair of a
faultfree tiling and an arbitrary tiling (just split it along its leftmost fault, or
along the right end if there is no fault). Thus,
$$D \cong \{0\} + F\times D,$$
where the set $\{0\}$ here is viewed as a weighted set with $|0| = 0$. Hence,
$$\overline{D} = \overline{\{0\} + F\times D} = \overline{\{0\}} + \overline{F}\cdot\overline{D} = 1 + \overline{F}\cdot\overline{D}.$$
Solving this for $\overline{D}$, we find $\overline{D} = \dfrac{1}{1-\overline{F}}$.
Now, let us try to solve the analogous problem for height-3 rectangles. Forget
about the $D$ and $F$ we defined above. Instead, define a new weighted set $D$ consisting of all domino tilings of height-3 rectangles $R_{n,3}$ (with the weight of a tiling of $R_{n,3}$ again being $n$), so that
$$\overline{D} = \sum_{n\in\mathbb{N}} d_{n,3}\, x^n.$$
• One infinite family of faultfree tilings, one for each even width $2, 4, 6, \ldots$ (pictured in the original). The weights of these tilings are $2, 4, 6, \ldots$, so their total contribution to the weight generating function $\overline{F}$ of $F$ is $x^2 + x^4 + x^6 + \cdots$.

• A second infinite family (these are the top-down mirror images of the tilings from the previous bullet point). The weights of these tilings are $2, 4, 6, \ldots$, so their total contribution to the weight generating function $\overline{F}$ of $F$ is $x^2 + x^4 + x^6 + \cdots$.

• One further faultfree tiling of width $2$ (yes, there is only one such tiling). The weight of this tiling is $2$, so its total contribution to the weight generating function $\overline{F}$ of $F$ is $x^2$.
This classification of faultfree domino tilings entails
$$\overline{F} = \left(x^2+x^4+x^6+\cdots\right) + \left(x^2+x^4+x^6+\cdots\right) + x^2$$
$$= x^2\cdot\frac{1}{1-x^2} + x^2\cdot\frac{1}{1-x^2} + x^2 \qquad \left(\text{since } x^2+x^4+x^6+\cdots = x^2\cdot\frac{1}{1-x^2}\right)$$
$$= \frac{3x^2 - x^4}{1-x^2}.$$
Thus,
$$\overline{D} = \frac{1}{1-\overline{F}} = \frac{1}{1 - \dfrac{3x^2-x^4}{1-x^2}} = \frac{1-x^2}{1-4x^2+x^4} = 1 + 3x^2 + 11x^4 + 41x^6 + 153x^8 + \cdots.$$
You will notice that only even powers of x appear in this FPS. In other words,
dn,3 = 0 when n is odd.
This is not surprising, because if n is odd, then the rectangle Rn,3 has an odd #
of squares, and thus cannot be tiled by dominos.
But we can also compute $d_{n,3}$ for even $n$. Indeed, using the same method
(partial fractions) that we used for the Fibonacci sequence in Section 3.1, we
can expand $\dfrac{1-x^2}{1-4x^2+x^4}$ as a sum of geometric series:
$$\frac{1-x^2}{1-4x^2+x^4} = \frac{3+\sqrt{3}}{6}\cdot\frac{1}{1-\left(2+\sqrt{3}\right)x^2} + \frac{3-\sqrt{3}}{6}\cdot\frac{1}{1-\left(2-\sqrt{3}\right)x^2}.$$
Thus, we find
$$d_{n,3} = \frac{3+\sqrt{3}}{6}\left(2+\sqrt{3}\right)^{n/2} + \frac{3-\sqrt{3}}{6}\left(2-\sqrt{3}\right)^{n/2} \qquad \text{for any even } n.$$
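Both the series expansion and the closed form above are easy to check numerically. Clearing the denominator in $\overline{D} = \frac{1-x^2}{1-4x^2+x^4}$ and comparing coefficients shows that the numbers $e_k = d_{2k,3}$ satisfy the linear recurrence $e_k = 4e_{k-1} - e_{k-2}$ with $e_0 = 1$ and $e_1 = 3$. A Python sketch (illustrative only):

```python
# Check the expansion 1, 3, 11, 41, 153, ... of (1-x^2)/(1-4x^2+x^4) and the
# closed form for d_{n,3} involving 2 +- sqrt(3).
from math import sqrt, isclose

# e_k = d_{2k,3} satisfies e_k = 4 e_{k-1} - e_{k-2}, e_0 = 1, e_1 = 3
e = [1, 3]
for _ in range(10):
    e.append(4 * e[-1] - e[-2])
assert e[:5] == [1, 3, 11, 41, 153]

# closed form: d_{n,3} = (3+sqrt3)/6 (2+sqrt3)^(n/2) + (3-sqrt3)/6 (2-sqrt3)^(n/2)
s = sqrt(3)
for k, ek in enumerate(e):
    n = 2 * k
    closed = (3 + s) / 6 * (2 + s) ** (n // 2) + (3 - s) / 6 * (2 - s) ** (n // 2)
    assert isclose(closed, ek, rel_tol=1e-9)
```

The floating-point comparison is safe here because the second term $(2-\sqrt{3})^{n/2}$ is tiny and the values stay well within double precision for the range checked.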
Now, what about computing $d_{n,m}$ in general? The above reasoning leading up
to $\overline{D} = \dfrac{1}{1-\overline{F}}$ can be applied for any $m\in\mathbb{N}$, but describing $\overline{F}$ becomes harder
and harder as $m$ grows larger. The generating function $\overline{D}$ is still a quotient of
two polynomials for any $m$ (see, e.g., [KlaPol79]), but this requires more insight
to prove. For $m\geq 6$, it appears that there is no formula for $d_{n,m}$ that uses
only quadratic irrationalities.
It is worth mentioning a different formula for dn,m , found by Kasteleyn in
1961 (motivated by a theoretical physics model):
See [Loehr11, Theorem 12.85] or [Stucky15] for proofs of this rather surprising formula (which, alas, require some more advanced methods than those
introduced in this text). Note that it can indeed be used for exact computation of $d_{n,m}$ (as there are algorithms for exact manipulation of "trigonometric
irrationals"51 such as $\cos\dfrac{j\pi}{m+1}$); for example, it yields $d_{8,8} = 12\,988\,816$.
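Since the display with Kasteleyn's formula did not survive in this excerpt, the following Python sketch evaluates one standard form of it (an assumption on my part; the version in [Loehr11] may be stated slightly differently), namely $d_{n,m} = \prod_{j=1}^{n}\prod_{k=1}^{m}\left(4\cos^2\frac{j\pi}{n+1} + 4\cos^2\frac{k\pi}{m+1}\right)^{1/4}$, in floating point and rounds the result:

```python
# Numerical evaluation of one standard form of Kasteleyn's product formula
# for the # of domino tilings of an n-by-m rectangle. (The exact statement
# in the notes/in [Loehr11] may differ cosmetically; this form is assumed.)
from math import cos, pi, prod

def kasteleyn(n, m):
    """d_{n,m} via the product over (j, k) of 4cos^2 + 4cos^2, fourth root."""
    p = prod(
        4 * cos(j * pi / (n + 1)) ** 2 + 4 * cos(k * pi / (m + 1)) ** 2
        for j in range(1, n + 1)
        for k in range(1, m + 1)
    )
    return round(p ** 0.25)

assert kasteleyn(8, 8) == 12_988_816  # the value quoted in the text
assert kasteleyn(4, 2) == 5           # matches d_{4,2} = 5 from the height-2 table
```

Rounding is legitimate for small boards because the true value is an integer and the floating-point error is far below $1/2$; for exact computation one would work in the cyclotomic fields mentioned above.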
• The sequence
$$\left(\left\lfloor\frac{5}{i}\right\rfloor\right)_{i\geq1} = \left(\left\lfloor\frac{5}{1}\right\rfloor, \left\lfloor\frac{5}{2}\right\rfloor, \left\lfloor\frac{5}{3}\right\rfloor, \ldots\right) = (5, 2, 1, 1, 1, 0, 0, 0, 0, \ldots) \in \mathbb{Z}^{\mathbb{N}}$$
stabilizes to $0$, since all integers $i\geq 6$ satisfy $\left\lfloor\dfrac{5}{i}\right\rfloor = 0$.

• On the other hand, the sequence $\left(\dfrac{5}{i}\right)_{i\geq1}$ does not stabilize to
anything. (It does converge to $0$ in the sense of real analysis, but this is a
weaker statement than stabilization.)
Remark 3.13.2. If you are familiar with point-set topology, then you will rec-
ognize Definition 3.13.1 as an instance of topological convergence: Namely,
a sequence ( ai )i∈N stabilizes to some a ∈ K if and only if ( ai )i∈N converges
to a in the discrete topology.
a single $1$ and infinitely many $0$s; thus, this sequence stabilizes to $0$.) In other
words, we have $\lim\limits_{i\to\infty} x^i = 0$.
(b) The sequence $\left(\dfrac{1}{i}x\right)_{i\geq1}$ of FPSs does not stabilize (since the $x^1$-coefficients of these FPSs $\dfrac{1}{i}x$ never stabilize). Thus, $\lim\limits_{i\to\infty}\dfrac{1}{i}x$ does not exist.
(c) We have
$$\lim_{i\to\infty}\left(\left(1+x^1\right)\left(1+x^2\right)\cdots\left(1+x^i\right)\right) = \prod_{k=1}^{\infty}\left(1+x^k\right).$$
(Indeed, for each $n\in\mathbb{N}$, the $x^n$-coefficient of the finite product $\left(1+x^1\right)\left(1+x^2\right)\cdots\left(1+x^i\right)$ depends only on those factors in which the exponent is $\leq n$, and thus stops
changing after $i$ surpasses $n$; therefore, it stabilizes to the $x^n$-coefficient of the
infinite product $\prod_{k=1}^{\infty}\left(1+x^k\right)$.)
(d) It would be nice to have $\lim\limits_{n\to\infty}\left(1+\dfrac{x}{n}\right)^n = \exp$, as in real analysis.
Unfortunately, this is not the case. In fact, the binomial formula yields
$$\left(1+\frac{x}{n}\right)^n = 1 + \underbrace{n\cdot\frac{x}{n}}_{=x} + \binom{n}{2}\cdot\left(\frac{x}{n}\right)^2 + \cdots = 1 + x + \binom{n}{2}\frac{x^2}{n^2} + \cdots.$$
This shows that the $x^0$-coefficient and the $x^1$-coefficient of $\left(1+\dfrac{x}{n}\right)^n$ stabilize
as $n\to\infty$, but the $x^2$-coefficient does not. Thus, $\lim\limits_{n\to\infty}\left(1+\dfrac{x}{n}\right)^n$ does not exist
(according to our definition of limit).
(e) Coefficientwise stabilization is weaker than stabilization: If a sequence
$(f_0, f_1, f_2, \ldots)$ of FPSs stabilizes to an FPS $f$ (meaning that $f_i = f$ for all
sufficiently high $i\in\mathbb{N}$), then it coefficientwise stabilizes to $f$ as well. In
particular, any constant sequence $(f, f, f, \ldots)$ of FPSs stabilizes to $f$.
Remark 3.13.5. If you are familiar with point-set topology, then you will
again recognize Definition 3.13.3 as an instance of topological convergence:
Namely, we recall that each FPS is an infinite sequence of elements of K (its
coefficients). Thus, the set K [[ x ]] is the infinite Cartesian product K × K ×
K × · · · . If we equip each factor K in this product with the discrete topology,
then the entire product K [[ x ]] becomes a product space equipped with the
product topology. Now, a sequence ( f i )i∈N of FPSs in K [[ x ]] coefficientwise
stabilizes to some FPS f ∈ K [[ x ]] if and only if ( f i )i∈N converges to f in this
product topology. (This is not surprising: After all, the product topology
is also known as the “topology of pointwise convergence”, and in our case
“pointwise” means “coefficientwise”.)
Proof. Obvious.
The following easy lemma connects limits with the notion of x n -equivalence
(as introduced in Definition 3.10.1):
Proof of Lemma 3.13.7 (sketched). Let $n\in\mathbb{N}$. For each $k\in\{0,1,\ldots,n\}$, we pick
an $N_k\in\mathbb{N}$ such that all $i\geq N_k$ satisfy $\left[x^k\right] f_i = \left[x^k\right] f$ (such an $N_k$ exists, since
$\lim\limits_{i\to\infty} f_i = f$). Then, all integers $i\geq\max\{N_0, N_1, \ldots, N_n\}$ satisfy $f_i \overset{x^n}{\equiv} f$. See
Section B.4 for the details of this proof.
The following proposition is an analogue of the classical “limits respect sums
and products” theorem from real analysis:
Proposition 3.13.8. Assume that $(f_i)_{i\in\mathbb{N}}$ and $(g_i)_{i\in\mathbb{N}}$ are two sequences of
FPSs, and that $f$ and $g$ are two FPSs such that
$$\lim_{i\to\infty} f_i = f \qquad\text{and}\qquad \lim_{i\to\infty} g_i = g.$$
Then,
$$\lim_{i\to\infty}\left(f_i + g_i\right) = f+g \qquad\text{and}\qquad \lim_{i\to\infty}\left(f_i g_i\right) = fg.$$
(In other words, the sequences $(f_i+g_i)_{i\in\mathbb{N}}$ and $(f_i g_i)_{i\in\mathbb{N}}$ coefficientwise stabilize to $f+g$ and $fg$, respectively.)
Proof of Proposition 3.13.8 (sketched). Let $n\in\mathbb{N}$. Then, Lemma 3.13.7 shows that
there exists some integer $N\in\mathbb{N}$ such that
all integers $i\geq N$ satisfy $f_i \overset{x^n}{\equiv} f$.
Similarly, there exists some integer $M\in\mathbb{N}$ such that
all integers $i\geq M$ satisfy $g_i \overset{x^n}{\equiv} g$.
Pick such integers $N$ and $M$, and set $P := \max\{N, M\}$. Then, all integers $i\geq P$
satisfy both $f_i \overset{x^n}{\equiv} f$ and $g_i \overset{x^n}{\equiv} g$, and therefore $f_i g_i \overset{x^n}{\equiv} fg$ (by (101)), and thus
$\left[x^n\right](f_i g_i) = \left[x^n\right](fg)$. This shows that the sequence $\left(\left[x^n\right](f_i g_i)\right)_{i\in\mathbb{N}}$ stabilizes
to $\left[x^n\right](fg)$. Since this holds for each $n\in\mathbb{N}$, we conclude that $\lim\limits_{i\to\infty}(f_i g_i) = fg$.
The proof of $\lim\limits_{i\to\infty}(f_i + g_i) = f+g$ is analogous.
See Section B.4 for the details of this proof.
Corollary 3.13.9. Let k ∈ N. For each i ∈ {1, 2, . . . , k}, let f i be an FPS, and
let ( f i,n )n∈N be a sequence of FPSs such that
lim ( f i,n ) = f i
n→∞
Remark 3.13.10. The analogue of Corollary 3.13.9 for infinite sums (or products) is not true without further requirements. For instance, for each $i\in\mathbb{N}$,
we can define a constant FPS $f_{i,n} = \delta_{i,n}$ (using Definition 3.5.6). Then, we
have $\lim\limits_{n\to\infty} f_{i,n} = 0$ for any fixed $i\in\mathbb{N}$, but
$$\lim_{n\to\infty}\underbrace{\sum_{i=0}^{\infty} f_{i,n}}_{=1} = \lim_{n\to\infty} 1 = 1.$$
Proposition 3.13.11. Assume that $(f_i)_{i\in\mathbb{N}}$ and $(g_i)_{i\in\mathbb{N}}$ are two sequences of
FPSs, and that $f$ and $g$ are two FPSs such that
$$\lim_{i\to\infty} f_i = f \qquad\text{and}\qquad \lim_{i\to\infty} g_i = g.$$
Assume that each FPS $g_i$ is invertible. Then, $g$ is also invertible, and we have
$$\lim_{i\to\infty}\frac{f_i}{g_i} = \frac{f}{g}.$$

Proof of Proposition 3.13.11 (sketched). First use Proposition 3.3.7 to show that $g$
is invertible; then use Theorem 3.10.3 (e) to prove $\lim\limits_{i\to\infty}\dfrac{f_i}{g_i} = \dfrac{f}{g}$. A detailed proof
can be found in Section B.4.
Limits of FPSs furthermore respect composition:
Proposition 3.13.12. Assume that $(f_i)_{i\in\mathbb{N}}$ and $(g_i)_{i\in\mathbb{N}}$ are two sequences of
FPSs, and that $f$ and $g$ are two FPSs such that
$$\lim_{i\to\infty} f_i = f \qquad\text{and}\qquad \lim_{i\to\infty} g_i = g,$$
where each of the FPSs $g_i$ and $g$ has constant term $0$ (so that all compositions below are well-defined). Then,
$$\lim_{i\to\infty}\left(f_i \circ g_i\right) = f\circ g.$$
Proof of Proposition 3.13.12 (sketched). Similar to Proposition 3.13.11, but now us-
ing Proposition 3.10.5. See Exercise A.2.13.1 (b) for details.
Limits of FPSs also respect derivatives (unlike in real analysis, where this
holds only under subtle additional conditions):
$$\lim_{i\to\infty}\ \sum_{n=0}^{i} f_n = \sum_{n\in\mathbb{N}} f_n.$$
In other words, the infinite sum $\sum_{n\in\mathbb{N}} f_n$ is the limit of the finite partial sums
$\sum_{n=0}^{i} f_n$.
$$\lim_{i\to\infty}\ \prod_{n=0}^{i} f_n = \prod_{n\in\mathbb{N}} f_n.$$
In other words, the infinite product $\prod_{n\in\mathbb{N}} f_n$ is the limit of the finite partial
products $\prod_{n=0}^{i} f_n$.
$$a = \lim_{i\to\infty}\ \sum_{n=0}^{i} a_n x^n.$$
This corollary can be restated as “the polynomials are dense in the FPSs”
(more formally: K [ x ] is dense in K [[ x ]]). This fact is useful, as it allows you
Theorem 3.13.17. Let $(f_0, f_1, f_2, \ldots)$ be a sequence of FPSs such that $\lim\limits_{i\to\infty}\sum_{n=0}^{i} f_n$
exists. Then, the family $(f_n)_{n\in\mathbb{N}}$ is summable, and satisfies
$$\sum_{n\in\mathbb{N}} f_n = \lim_{i\to\infty}\ \sum_{n=0}^{i} f_n.$$
Theorem 3.13.18. Let $(f_0, f_1, f_2, \ldots)$ be a sequence of FPSs such that $\lim\limits_{i\to\infty}\prod_{n=0}^{i} f_n$
exists. Then, the family $(f_n)_{n\in\mathbb{N}}$ is multipliable, and satisfies
$$\prod_{n\in\mathbb{N}} f_n = \lim_{i\to\infty}\ \prod_{n=0}^{i} f_n.$$
$$n = \sum_{i\in\mathbb{N}} b_i 2^i.$$
(Recall that “essentially finite” means “all but finitely many i ∈ N satisfy
bi = 0”.)
Note that we are encoding the digits (actually, bits) of a binary representation
as essentially finite sequences instead of finite tuples. This way, we don’t have
to worry about leading zeros breaking the uniqueness in Theorem 3.14.2.
Let us now define a variation of binary representation:
n = ∑_{i∈N} b_i 3^i.
19 = 1 − 9 + 27 = 3^0 − 3^2 + 3^3 = 1 · 3^0 + (−1) · 3^2 + 1 · 3^3.
• The integer 42 has a balanced ternary representation (0, −1, −1, −1, 1, 0, 0, 0, ...), because
42 = 81 − 27 − 9 − 3 = 3^4 − 3^3 − 3^2 − 3^1.
• Note that (unlike with binary representations) even negative integers can
have balanced ternary representations. For example, the integer −11 has
a balanced ternary representation (1, −1, −1, 0, 0, 0, . . .).
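As a computational aside (not part of the text), balanced ternary digits can be computed greedily; here is a short Python sketch (the function name `balanced_ternary` is ours) that finds the digits b_0, b_1, ... of any integer n and re-derives the three examples above.

```python
def balanced_ternary(n):
    """Return the list [b_0, b_1, ...] of balanced ternary digits
    (each in {0, 1, -1}) of the integer n, without trailing zeros."""
    digits = []
    while n != 0:
        r = n % 3          # remainder in {0, 1, 2}
        if r == 2:
            r = -1         # use the digit -1 instead of 2 ...
        n = (n - r) // 3   # ... and carry 1 to the next position
        digits.append(r)
    return digits

print(balanced_ternary(19))   # [1, 0, -1, 1]
print(balanced_ternary(42))   # [0, -1, -1, -1, 1]
print(balanced_ternary(-11))  # [1, -1, -1]
```

Note that the loop also terminates for negative n, which matches the observation that negative integers have balanced ternary representations as well.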
There are various ways to prove this (see, e.g., [20f, solution to Exercise 3.7.8]
for an elementary one). Let us here try to prove it using FPSs (imitating our
proof of Theorem 3.14.2 in Subsection 3.11.1). Since the bi can be −1s, we must
allow for negative powers of x.
Let us first argue informally; we will later see whether we can make sense of
what we have done. The following informal argument was proposed by Euler
([Euler48, §331]). We shall compute the product
(1 + x + x^{−1}) (1 + x^3 + x^{−3}) (1 + x^9 + x^{−9}) ⋯ = ∏_{i≥0} (1 + x^{3^i} + x^{−3^i})
in two ways:
∏_{i≥0} (1 + x^{3^i} + x^{−3^i}) = ∏_{i≥0} ∑_{b∈{0,1,−1}} x^{b·3^i}
= ∑_{(b_0,b_1,b_2,...)∈{0,1,−1}^N is essentially finite} x^{b_0·3^0} x^{b_1·3^1} x^{b_2·3^2} ⋯
(here, we have just expanded the product using (127), hoping that (127) still works in our setting)
= ∑_{(b_0,b_1,b_2,...)∈{0,1,−1}^N is essentially finite} x^{b_0·3^0 + b_1·3^1 + b_2·3^2 + ⋯}
= ∑_{n∈Z} ∑_{(b_0,b_1,b_2,...)∈{0,1,−1}^N is essentially finite; b_0·3^0 + b_1·3^1 + b_2·3^2 + ⋯ = n} x^n
= ∑_{n∈Z} (# of balanced ternary representations of n) · x^n,   (137)
1 + x^{3^i} + x^{−3^i} = (1 + x^{3^i} + x^{2·3^i}) / x^{3^i} = (1 − x^{3^{i+1}}) / (x^{3^i} (1 − x^{3^i}))   for each i ≥ 0.
Hence,
∏_{i≥0} (1 + x^{3^i} + x^{−3^i}) = ∏_{i≥0} (1 − x^{3^{i+1}}) / (x^{3^i} (1 − x^{3^i}))
= (1 − x^3)/(x (1 − x)) · (1 − x^9)/(x^3 (1 − x^3)) · (1 − x^{27})/(x^9 (1 − x^9)) · (1 − x^{81})/(x^{27} (1 − x^{27})) · ⋯
= 1/(x x^3 x^9 x^{27} ⋯) · 1/(1 − x)   (by cancelling factors)
(here, 1/(x x^3 x^9 x^{27} ⋯) “is” x^{−∞}, whatever this means, while 1/(1 − x) = 1 + x + x^2 + x^3 + ⋯; so each power x^n with n ∈ Z ends up appearing with coefficient 1)
= ∑_{n∈Z} x^n.   (138)
⋯ + x^{−2} + x^{−1} + x^0 + x^1 + x^2 + ⋯ = 0.
(a_n)_{n∈Z} + (b_n)_{n∈Z} = (a_n + b_n)_{n∈Z};
λ (a_n)_{n∈Z} = (λ a_n)_{n∈Z}   for each λ ∈ K;
(a_n)_{n∈Z} · (b_n)_{n∈Z} = (c_n)_{n∈Z},   where c_n = ∑_{i∈Z} a_i b_{n−i}
(since this is what we would get if we expanded (∑_{n∈Z} a_n x^n) (∑_{n∈Z} b_n x^n) and combined like powers of x). Unfortunately, the sum ∑_{i∈Z} a_i b_{n−i} is now infinite (unlike for K[[x]]), and is not always well-defined. Thus, the product (a_n)_{n∈Z} · (b_n)_{n∈Z} of two elements of K[[x^±]] does not always exist^52. Therefore, the K-module K[[x^±]] is not a K-algebra. This explains why our above computations have led us astray.
^52 The simplest example for this is (1)_{n∈Z} · (1)_{n∈Z}. If this product existed, then it should be (c_n)_{n∈Z}, where c_n = ∑_{i∈Z} 1 · 1, but the latter sum clearly does not exist.
(a_n)_{n∈Z} · (b_n)_{n∈Z} = (c_n)_{n∈Z},   where c_n = ∑_{i∈Z} a_i b_{n−i}.
x = (δ_{i,1})_{i∈Z}
Here, the powers x^i are taken in the Laurent polynomial ring K[x^±], but the infinite sum ∑_{i∈Z} a_i x^i is taken in the K-module K[[x^±]]. (The notions of summable families and infinite sums are defined in K[[x^±]] in the same way as they are defined in K[[x]].)
^53 For proofs of the statements made (implicitly and explicitly) in the following (or at least for the less obvious among these proofs), we refer to Section B.5.
• any polynomial in K[x];
• x^{−15};
• x^2 + 3 + 7 x^{−3}.
There are other, equivalent ways to define the Laurent polynomial ring K [ x ± ]:
(These are done in some textbooks on abstract algebra – e.g., see [Ford21,
Exercise 3.6.31] for a quick overview.)
∏_{i=0}^{k} (1 + x^{3^i} + x^{−3^i})
= (1 + x + x^{−1}) (1 + x^3 + x^{−3}) ⋯ (1 + x^{3^k} + x^{−3^k})
= (1 − x^3)/(x (1 − x)) · (1 − x^9)/(x^3 (1 − x^3)) · ⋯ · (1 − x^{3^{k+1}})/(x^{3^k} (1 − x^{3^k}))
(this is somewhat unrigorous, since the 1 − x^{3^i} are not invertible in K[x^±], but this will soon be made rigorous)
= 1/(x x^3 x^9 ⋯ x^{3^k}) · (1 − x^{3^{k+1}})/(1 − x)   (by cancelling factors)
= x^{−(3^0 + 3^1 + ⋯ + 3^k)} · (1 + x + x^2 + ⋯ + x^{3^{k+1} − 1})
(since x x^3 x^9 ⋯ x^{3^k} = x^{3^0 + 3^1 + ⋯ + 3^k} and (1 − x^{3^{k+1}})/(1 − x) = 1 + x + x^2 + ⋯ + x^{3^{k+1} − 1})
= x^{−(3^0 + 3^1 + ⋯ + 3^k)} · (1 + x + x^2 + ⋯ + x^{2 (3^0 + 3^1 + ⋯ + 3^k)})
(since 3^{k+1} − 1 = 2 (3^0 + 3^1 + ⋯ + 3^k)   (check this!))
= x^{−(3^0 + 3^1 + ⋯ + 3^k)} + x^{−(3^0 + 3^1 + ⋯ + 3^k) + 1} + x^{−(3^0 + 3^1 + ⋯ + 3^k) + 2} + ⋯ + x^{3^0 + 3^1 + ⋯ + 3^k}
= ∑_{n∈Z; |n| ≤ 3^0 + 3^1 + ⋯ + 3^k} x^n.
k
=∏ ∑
i
x b ·3
i =0 b∈{0,1,−1}
by expanding the product
= ∑ x b0 30 b1 31
x ···x bk 3 k
using Proposition 3.11.26
k +1
(b0 ,b1 ,...,bk )∈{0,1,−1}
∑
0 + b 31 +···+ b 3k
= x b0 3 1 k
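This finite computation is easy to verify by machine. The following Python sketch (ours, not from the text; the helper name `expand_product` is an assumption) expands ∏_{i=0}^{k} (1 + x^{3^i} + x^{−3^i}) as a dictionary mapping exponents to coefficients, and confirms that every coefficient equals 1 exactly when |n| ≤ 3^0 + 3^1 + ⋯ + 3^k — equivalently, that every such n has exactly one balanced ternary representation with digits b_0, ..., b_k.

```python
def expand_product(k):
    """Expand prod_{i=0}^{k} (1 + x^{3^i} + x^{-3^i}) as a dict
    {exponent: coefficient}, with exponents of both signs."""
    coeffs = {0: 1}                        # the empty product is 1
    for i in range(k + 1):
        new = {}
        for e, c in coeffs.items():
            for b in (0, 1, -1):           # the three terms x^{b * 3^i}
                ne = e + b * 3**i
                new[ne] = new.get(ne, 0) + c
        coeffs = new
    return coeffs

k = 3
bound = sum(3**i for i in range(k + 1))    # 3^0 + 3^1 + ... + 3^k = 40
coeffs = expand_product(k)
assert all(coeffs.get(n, 0) == 1 for n in range(-bound, bound + 1))
assert all(c == 1 for c in coeffs.values())   # no exponent occurs twice
```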
(a_n)_{n∈Z} · (b_n)_{n∈Z} = (c_n)_{n∈Z},   where c_n = ∑_{i∈Z} a_i b_{n−i}.
Now, the ring K (( x )) contains both the FPS ring K [[ x ]] and the Laurent
polynomial ring K [ x ± ] as subrings (and actually as K-subalgebras). This makes
it one of the most convenient places for formal manipulation of FPSs. (However,
it has a disadvantage compared to the FPS ring K [[ x ]]: Namely, you cannot
easily substitute something for x in a Laurent series f ∈ K (( x )).)
Now, our above computation of ∏_{i=0}^{k} (1 + x^{3^i} + x^{−3^i}) makes perfect sense in the Laurent series ring K((x)): Indeed, for each positive integer i, the power series 1 − x^i is invertible in K[[x]] and thus also invertible in K((x)). Hence, at last, we have a rigorous proof of Theorem 3.14.4.
Actually, you could also make sense of our original argument for proving Theorem 3.14.4, with the infinite product ∏_{i≥0} (1 + x^{3^i} + x^{−3^i}), as long as you made sure to interpret it correctly: First, compute the finite products ∏_{i=0}^{k} (1 + x^{3^i} + x^{−3^i}) in the ring K((x)). Then, take their limit lim_{k→∞} ∏_{i=0}^{k} (1 + x^{3^i} + x^{−3^i}) in K[[x^±]] (this is not a ring, but the notion of a limit in K[[x^±]] is defined just as it was in K[[x]]).
More can be said about the K-algebra K((x)) when K is a field: Indeed, in this case, it is itself a field! This fact (whose proof is Exercise A.2.14.2) is not very useful in combinatorics, but quite useful in abstract algebra.
(so that each indeterminate has exactly one coefficient equal to 1, while all other
coefficients are 0). The rules for addition, subtraction, scaling and multiplication are essentially as they are for univariate FPSs, except that now the indexing is by k-tuples in N^k. In particular:
• the sum i + j means the entrywise sum of the k-tuples i and j (that
is, i + j = (i1 + j1 , i2 + j2 , . . . , ik + jk ), where i = (i1 , i2 , . . . , ik ) and j =
( j1 , j2 , . . . , jk ));
• if m ∈ N^k and h is an FPS in x_1, x_2, ..., x_k, then [x^m] h is the m-th entry of the family h. Just as in the univariate case, this entry [x^m] h is called the coefficient of the monomial x^m := x_1^{m_1} x_2^{m_2} ⋯ x_k^{m_k} in h (where m = (m_1, m_2, ..., m_k)).
With this notation, the formula for ab doesn’t look any more complicated
than the analogous formula in the univariate case. The only difference is that
the monomials and the coefficients are now indexed not by the nonnegative
integers, but by the k-tuples of nonnegative integers instead.
The indeterminates x_1, x_2, ..., x_k are defined by
x_i = (δ_{n, (0, 0, ..., 0, 1, 0, 0, ..., 0)})_{n∈N^k}
(where the single 1 is in the i-th position). Every FPS f = (f_m)_{m∈N^k} can thus be written as
f = ∑_{m=(m_1, m_2, ..., m_k)∈N^k} f_m x_1^{m_1} x_2^{m_2} ⋯ x_k^{m_k}.
Most of what we have said about FPSs in one variable applies similarly to
FPSs in multiple variables. The proofs are similar but more laborious due to
the need for subscripts. Instead of the derivative of an FPS, there are now k
derivatives (one for each variable); they are called partial derivatives. One needs
to be somewhat careful with substitution – e.g., one cannot substitute non-commuting elements into a multivariate polynomial. For example, you cannot substitute two non-commuting matrices A and B for x and y into the polynomial xy, at least not without sacrificing the rule that a value of the product of two polynomials should be the product of their values (since x[A, B] · y[A, B] = AB, where x[A, B] = A and y[A, B] = B, would have to equal y[A, B] · x[A, B] = BA). But you can still substitute k
commuting elements for the k indeterminates in a k-variable polynomial. You
can also compose multivariate FPSs as long as appropriate summability condi-
tions are satisfied.
Sometimes we will use different names for our variables. For example, if we
work with 2 variables, we will commonly call them x and y instead of x1 and
x2 . Correspondingly, we will use the notation K [[ x, y]] (instead of K [[ x1 , x2 ]])
for the K-algebra of FPSs in these two variables.
Let me give an example of working with multivariate FPSs.
x^k / (1 − x)^{k+1} = ∑_{n∈N} \binom{n}{k} x^n   for each k ∈ N.   (140)
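Identity (140) can be spot-checked by truncating the power series; the Python sketch below (an illustration of ours, with the hypothetical helper name `xk_over_1mx_pow`) computes the coefficients of x^k / (1 − x)^{k+1} and compares them with the binomial coefficients.

```python
from math import comb

def xk_over_1mx_pow(k, N):
    """Coefficients (up to x^N) of x^k / (1 - x)^{k+1}, computed by starting
    from 1, multiplying k+1 times by 1/(1-x) (= taking prefix sums),
    then shifting by k places (= multiplying by x^k)."""
    coeffs = [1] + [0] * N
    for _ in range(k + 1):
        coeffs = [sum(coeffs[: n + 1]) for n in range(N + 1)]
    return ([0] * k + coeffs)[: N + 1]

# identity (140): the coefficient of x^n should be binom(n, k)
N = 12
for k in range(5):
    assert xk_over_1mx_pow(k, N) == [comb(n, k) for n in range(N + 1)]
print("identity (140) verified up to x^%d" % N)
```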
∑_{k∈N} f_k y^k = ∑_{k∈N} g_k y^k   in K[[x, y]].   (141)
Proof. For each k ∈ N, let us write the two FPSs f_k and g_k as f_k = ∑_{n∈N} f_{k,n} x^n and g_k = ∑_{n∈N} g_{k,n} x^n with f_{k,n}, g_{k,n} ∈ K. Then, the equality (141) can be rewritten as
∑_{k∈N} ∑_{n∈N} f_{k,n} x^n y^k = ∑_{k∈N} ∑_{n∈N} g_{k,n} x^n y^k.
x^k / (1 − x)^{k+1} = ∑_{n∈N} \binom{n}{k} x^n   for each k ∈ N.
p (n) := (# of partitions of n) .
p_0(5) = 0;
p_1(5) = 1;
p_2(5) = 2;
p_3(5) = 2;
p_4(5) = 1;
p_5(5) = 1;
p_k(5) = 0   for any k > 5;
Here are the values of p (n) for the first 15 nonnegative integers n:
n    | 0 | 1 | 2 | 3 | 4 | 5 | 6  | 7  | 8  | 9  | 10 | 11 | 12 | 13  | 14
p(n) | 1 | 1 | 2 | 3 | 5 | 7 | 11 | 15 | 22 | 30 | 42 | 56 | 77 | 101 | 135
The sequence ( p (0) , p (1) , p (2) , . . .) is remarkable for being an integer se-
quence that grows faster than polynomially, but still considerably slower than
exponentially. (See (168) for an asymptotic expansion.) This not-too-fast growth
(for instance, p (100) = 190 569 292 is far smaller than 2100 ) makes integer par-
titions rather convenient for computer experiments.
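Such computer experiments are easy to set up. Here is a short Python sketch (ours; the function name `partition_numbers` is an assumption) that computes p(n) by dynamic programming, processing allowed part sizes one at a time, and reproduces the table above as well as the value p(100) quoted in the text.

```python
def partition_numbers(N):
    """Return [p(0), p(1), ..., p(N)]: p[n] counts the partitions of n.
    Parts of sizes k = 1, 2, ..., N are made available one at a time."""
    p = [1] + [0] * N          # with no parts allowed, only n = 0 has a partition
    for k in range(1, N + 1):  # now also allow parts of size k
        for n in range(k, N + 1):
            p[n] += p[n - k]
    return p

p = partition_numbers(100)
print(p[:15])   # [1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42, 56, 77, 101, 135]
print(p[100])   # 190569292
```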
Proof of Proposition 4.1.7 (sketched). (a) The size of a partition is always nonneg-
ative (being a sum of positive integers). Thus, a negative number n has no
partitions whatsoever. Thus, pk (n) = 0 whenever n < 0 and k ∈ N.
(b) If (λ1 , λ2 , . . . , λk ) is a partition, then λi ≥ 1 for each i ∈ {1, 2, . . . , k }
(because a partition is a tuple of positive integers, i.e., of integers ≥ 1). Hence,
if (λ_1, λ_2, ..., λ_k) is a partition of n into k parts, then
n = λ_1 + λ_2 + ⋯ + λ_k ≥ 1 + 1 + ⋯ + 1 = k.
Thus, a partition of n into k parts cannot satisfy k > n. Thus, no such partitions exist if k > n. In other words, p_k(n) = 0 if k > n.
(c) The integer 0 has a unique partition into 0 parts, namely the empty tuple
(). A nonzero integer n cannot have any partitions into 0 parts, since the empty
tuple has size 0 ̸= n. Thus, p0 (n) equals 1 for n = 0 and equals 0 for n ̸= 0. In
other words, p0 (n) = [n = 0].
(d) Any positive integer n has a unique partition into 1 part – namely, the
1-tuple (n). On the other hand, if n is not positive, then this 1-tuple is not a
partition, so in this case n has no partition into 1 part. Thus, p1 (n) equals 1 if
n is positive and equals 0 otherwise. In other words, p1 (n) = [n > 0].
(e) Assume that k > 0. We must prove that pk (n) = pk (n − k) + pk−1 (n − 1).
We consider all partitions of n into k parts. We classify these partitions into two types: A type-1 partition shall mean a partition that has 1 as a part, whereas a type-2 partition shall mean a partition that has no part equal to 1.
For example, among the partitions of 5, the partitions (4, 1), (3, 1, 1), (2, 2, 1), (2, 1, 1, 1) and (1, 1, 1, 1, 1) are of type 1, while the partitions (5) and (3, 2) are of type 2.
Let us count the type-1 partitions and the type-2 partitions separately.
Any type-1 partition has 1 as a part, therefore as its last part (because it is weakly decreasing). Hence, any type-1 partition has the form (λ_1, λ_2, ..., λ_{k−1}, 1). If (λ_1, λ_2, ..., λ_{k−1}, 1) is a type-1 partition (of n into k parts), then (λ_1, λ_2, ..., λ_{k−1}) is a partition of n − 1 into k − 1 parts. Thus, we have a map
{type-1 partitions of n into k parts} → {partitions of n − 1 into k − 1 parts},
(λ_1, λ_2, ..., λ_{k−1}, 1) ↦ (λ_1, λ_2, ..., λ_{k−1}).
This map is a bijection (since it has an inverse map, which simply inserts a 1 at the end of a partition). Thus, the bijection principle shows that
(# of type-1 partitions of n into k parts) = p_{k−1}(n − 1).
On the other hand, subtracting 1 from each entry of a type-2 partition of n into k parts yields a partition of n − k into k parts (since each entry is ≥ 2). Thus, we have a map
{type-2 partitions of n into k parts} → {partitions of n − k into k parts}.
This map is a bijection (since it has an inverse map, which simply adds 1 to each entry of a partition). Thus, the bijection principle shows that
(# of type-2 partitions of n into k parts) = p_k(n − k).
Since each partition of n into k parts is of type 1 or of type 2 (but not both), adding these two equalities together yields
p_k(n) = p_k(n − k) + p_{k−1}(n − 1).
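The recurrences in Proposition 4.1.7 translate directly into code. The following Python sketch (ours; `p_parts` is a hypothetical name) implements p_k(n) using parts (a), (c) and (e), and re-derives the values p_k(5) listed above.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def p_parts(k, n):
    """Number of partitions of n into exactly k parts, via Prop. 4.1.7:
    (a) p_k(n) = 0 for n < 0;  (c) p_0(n) = [n == 0];
    (e) p_k(n) = p_k(n - k) + p_{k-1}(n - 1) for k > 0."""
    if n < 0:
        return 0
    if k == 0:
        return 1 if n == 0 else 0
    return p_parts(k, n - k) + p_parts(k - 1, n - 1)

print([p_parts(k, 5) for k in range(7)])     # [0, 1, 2, 2, 1, 1, 0]
print(sum(p_parts(k, 5) for k in range(6)))  # 7 = p(5)
```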
(f) Let n ∈ N. The partitions of n into 2 parts are
(n − 1, 1), (n − 2, 2), (n − 3, 3), ..., (n − ⌊n/2⌋, ⌊n/2⌋)
(note that n − ⌊n/2⌋ = ⌈n/2⌉).
(g) Let n ∈ N. Any partition of n must have k parts for some k ∈ N. Thus,
p(n) = ∑_{k∈N} p_k(n) = ∑_{k=0}^{n} p_k(n) + ∑_{k=n+1}^{∞} p_k(n)
= ∑_{k=0}^{n} p_k(n)   (since p_k(n) = 0 for all k > n, by Proposition 4.1.7 (b))
= p_0(n) + p_1(n) + ⋯ + p_n(n).
Example 4.1.9. Let us check the above equality “up to x^5”, i.e., let us compare the coefficients of x^i for i < 5. (In doing so, we can ignore all powers of x that are ≥ x^5.)
= ∑_{n∈N} |Q_n| x^n,
where Q_n is the set of all essentially finite sequences (u_1, u_2, u_3, ...) of nonnegative integers satisfying 1u_1 + 2u_2 + 3u_3 + ⋯ = n. Here, the partition corresponding to such a sequence contains each i exactly u_i times:
(..., 3, 3, ..., 3, 2, 2, ..., 2, 1, 1, ..., 1)
(with 3 appearing u_3 times, 2 appearing u_2 times, and 1 appearing u_1 times).
π : Qn → {partitions of n} .
ρ : {partitions of n} → Qn
because λ is a partition of n). It is now easy to check that the maps π and ρ are
mutually inverse, so that π is a bijection. The bijection principle therefore yields
| Qn | = (# of partitions of n) = p (n); but this is precisely what we wanted to
show. The proof of Theorem 4.1.8 is thus complete.
Theorem 4.1.8 has a “finite” analogue (finite in the sense that the infinite product ∏_{k=1}^{∞} 1/(1 − x^k) is replaced by a finite product; the FPSs are still infinite):
Proof of Theorem 4.1.10. This proof is mostly analogous to the above proof of
Theorem 4.1.8, and to some extent even simpler because it uses m-tuples instead
of infinite sequences.
We have
∏_{k=1}^{m} 1/(1 − x^k) = ∏_{k=1}^{m} (1 + x^k + x^{2k} + x^{3k} + ⋯)   (since 1/(1 − x^k) = 1 + x^k + x^{2k} + x^{3k} + ⋯ = ∑_{u∈N} x^{ku})
= ∏_{k=1}^{m} ∑_{u∈N} x^{ku}
= ∑_{(u_1, u_2, ..., u_m)∈N^m} x^{1u_1} x^{2u_2} ⋯ x^{mu_m}   (here, we expanded the product using Proposition 3.11.27)
= ∑_{(u_1, u_2, ..., u_m)∈N^m} x^{1u_1 + 2u_2 + ⋯ + mu_m} = ∑_{n∈N} |Q_n| x^n,
where
Q_n = {(u_1, u_2, ..., u_m) ∈ N^m | 1u_1 + 2u_2 + ⋯ + mu_m = n}.
Thus, it will suffice to show that
|Q_n| = p_{parts≤m}(n)   for each n ∈ N.
Let us fix n ∈ N. We want to construct a bijection from Q_n to the set {partitions λ of n such that all parts of λ are ≤ m}.
Here is how to do this: For any (u_1, u_2, ..., u_m) ∈ Q_n, define a partition
π(u_1, u_2, ..., u_m) := (the partition that contains each i exactly u_i times)
= (m, m, ..., m, ..., 2, 2, ..., 2, 1, 1, ..., 1)
(with m appearing u_m times, ..., 2 appearing u_2 times, and 1 appearing u_1 times).
55 The argument is analogous to the one used in the proof of Theorem 4.1.8.
Theorem 4.1.11. Let I be a subset of {1, 2, 3, ...}. For each n ∈ N, let p_I(n) be the # of partitions λ of n such that all parts of λ belong to I. Then,
∑_{n∈N} p_I(n) x^n = ∏_{k∈I} 1/(1 − x^k).
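Theorem 4.1.11 is easy to test numerically. The Python sketch below (ours; both helper names are assumptions) expands ∏_{k∈I} 1/(1 − x^k) up to degree N — only the finitely many k ∈ I with k ≤ N matter — and compares the coefficients with an independent brute-force count of partitions with parts in I, here for I = {1, 4, 9, 16, ...} (partitions into squares).

```python
def product_coeffs(I, N):
    """Coefficients up to x^N of prod_{k in I} 1/(1 - x^k): multiply in
    the truncated geometric series for each k in I with k <= N."""
    c = [1] + [0] * N
    for k in sorted(i for i in I if i <= N):
        for n in range(k, N + 1):
            c[n] += c[n - k]       # multiply by 1/(1 - x^k)
    return c

def count_partitions(n, parts):
    """Brute-force: # of partitions of n into parts from the given
    (descending) list, by recursing on the largest part used."""
    if n == 0:
        return 1
    total = 0
    for idx, p in enumerate(parts):
        if p <= n:
            total += count_partitions(n - p, parts[idx:])
    return total

N = 30
I = [k * k for k in range(1, N + 1)]          # parts that are perfect squares
parts_desc = sorted((k for k in I if k <= N), reverse=True)
assert product_coeffs(I, N) == [count_partitions(n, parts_desc) for n in range(N + 1)]
```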
Proof of Theorem 4.1.11 (sketched). This is analogous to the proof of Theorem 4.1.10, with some minor changes: The ∏_{k=1}^{m} sign has to be replaced by ∏_{k∈I}; the m-tuples (u_1, u_2, ..., u_m) ∈ N^m must be replaced by the essentially finite families (u_i)_{i∈I} ∈ N^I; the bijection π has to be replaced by the new bijection
π : Q_n → {partitions λ of n such that all parts of λ belong to I}
(where Q_n is now the set of all essentially finite families (u_i)_{i∈I} ∈ N^I satisfying ∑_{i∈I} i u_i = n) defined by
π((u_i)_{i∈I}) := (the partition that contains each i ∈ I exactly u_i times)
= (..., i_3, i_3, ..., i_3, i_2, i_2, ..., i_2, i_1, i_1, ..., i_1)
(with each i_j appearing u_{i_j} times, where i_1 < i_2 < i_3 < ⋯ are the elements of I),
Theorem 4.1.14 (Euler’s odd-distinct identity). We have podd (n) = pdist (n)
for each n ∈ N.
We have already encountered this theorem before (as Theorem 3.11.34, albeit
in less precise language), and we have proved it using the generating function
identity
∏_{i>0} (1 − x^{2i−1})^{−1} = ∏_{k>0} (1 + x^k).
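Before turning to the bijective proof, the identity podd(n) = pdist(n) is easy to confirm by brute force for small n; a Python sketch (ours, with hypothetical helper names):

```python
def podd(n, largest=None):
    """# of partitions of n into odd parts, each part <= largest."""
    if n == 0:
        return 1
    if largest is None:
        largest = n
    return sum(podd(n - p, p) for p in range(1, largest + 1, 2) if p <= n)

def pdist(n, largest=None):
    """# of partitions of n into distinct parts, each part <= largest."""
    if n == 0:
        return 1
    if largest is None:
        largest = n
    return sum(pdist(n - p, p - 1) for p in range(1, largest + 1) if p <= n)

print([podd(n) for n in range(11)])   # [1, 1, 1, 2, 2, 3, 4, 5, 6, 8, 10]
print([pdist(n) for n in range(11)])  # [1, 1, 1, 2, 2, 3, 4, 5, 6, 8, 10]
assert all(podd(n) == pdist(n) for n in range(25))
```

Each recursion picks the largest part first, so every partition is counted exactly once.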
Let me outline a different, bijective proof.
Second proof of Theorem 4.1.14 (sketched). Let n ∈ N. We want to construct a bijection
Why is this map A well-defined? We only specified the sort of steps we are al-
lowed to take when computing A (λ); however, there is often a choice involved
in taking these steps (since there are often several pairs of equal parts).57 So
56 The two entries underlined are the two equal entries that are going to get merged in the next
step. Note that there are usually several candidates, and we just pick one pair at will.
57 For example, we could have also computed A (5, 5, 3, 1, 1, 1) as follows:
(5, 5, 3, 1, 1, 1) → (5, 5, 3, 2, 1) → (10, 3, 2, 1).
Convention 4.1.16. We agree to say that the largest part of the empty parti-
tion () is 0 (even though this partition has no parts).
and draw a table whose i-th row consists of λ_i boxes. (For the partition (5, 4, 4, 1), the rows of this table have lengths 5, 4, 4, 1 from top to bottom.)
Now, let us flip this table across the “main diagonal” (i.e., the diagonal that starts at its top-left corner). The columns of the flipped table then have lengths 5, 4, 4, 1.
The lengths of the rows of the resulting table again form a partition of n. (In
our case, this new partition is (4, 3, 3, 3, 1).) Moreover, the largest part of this
new partition is k (because the original table had k rows, so the flipped table
has k columns, and this means that its top row has k boxes). This procedure
(i.e., turning a partition into a table, then flipping the table across the “main
diagonal”, and then reading the lengths of the rows of the resulting table again
as a partition) therefore gives a map from
to
{partitions of n whose largest part is k} .
Moreover, this map is a bijection (indeed, its inverse can be effected in the exact
same way, by flipping the table). This bijection is called conjugation of partitions,
and will be studied in more detail later.
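Conjugation is easy to implement. The Python sketch below (ours; the name `conjugate` is an assumption) computes λ^t by counting the boxes in each column of the Young diagram, and checks the example above together with the properties stated below.

```python
def conjugate(lam):
    """Conjugate (transpose) of a partition, given as a weakly decreasing
    tuple: the j-th part of the conjugate is #{i : lam_i >= j}."""
    if not lam:
        return ()
    return tuple(sum(1 for part in lam if part >= j)
                 for j in range(1, lam[0] + 1))

lam = (5, 4, 4, 1)
print(conjugate(lam))                     # (4, 3, 3, 3, 1)
assert sum(conjugate(lam)) == sum(lam)    # |conjugate| = |lam|
assert conjugate(conjugate(lam)) == lam   # conjugation is an involution
assert conjugate(lam)[0] == len(lam)      # largest part of conjugate = length
```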
Here are some pointers to how this proof can be formalized (see Exercise
A.3.1.1 for much more): For any partition λ = (λ1 , λ2 , . . . , λk ), we define the
Young diagram of λ to be the set
Y(λ) := {(i, j) ∈ Z^2 | 1 ≤ i ≤ k and 1 ≤ j ≤ λ_i}.
This Young diagram is precisely the table that we drew above, as long as we
agree to identify each pair (i, j) ∈ Y (λ) with the box in row i and column j.
Now, the conjugate of the partition λ is the partition λ^t uniquely determined by
Y(λ^t) = {(j, i) | (i, j) ∈ Y(λ)}.
(This conjugate λ^t is also often called λ′, and is also known as the transpose of λ.) Now, it is not hard to show that |λ^t| = |λ| and (λ^t)^t = λ for each partition λ, and that the largest part of λ^t equals the length of λ. Using these observations (which are proved in Exercise A.3.1.1), we see that the map
p_0(n) + p_1(n) + ⋯ + p_k(n) = (# of partitions of n whose largest part is ≤ k).
p_0(n) + p_1(n) + ⋯ + p_k(n) = ∑_{i=0}^{k} p_i(n)
= ∑_{i=0}^{k} (# of partitions of n whose largest part is i)   (by Proposition 4.1.15, applied to i instead of k)
= (# of partitions of n whose largest part is ≤ k).
p_0(n) + p_1(n) + ⋯ + p_m(n)
= (# of partitions of n whose largest part is ≤ m)   (by Corollary 4.1.18)
= (# of partitions of n all of whose parts are ≤ m)   (because the condition “the largest part is ≤ m” for a partition is clearly equivalent to “all parts are ≤ m”)
= p_{parts≤m}(n),
Theorem 4.1.20. For any positive integer n, let σ (n) denote the sum of all
positive divisors of n. (For example, σ (6) = 1 + 2 + 3 + 6 = 12 and σ (7) =
1 + 7 = 8.)
For any n ∈ N, we have
n p(n) = ∑_{k=1}^{n} σ(k) p(n − k).
For reference, let us give a table of the first 15 sums σ (n) defined in Theorem
4.1.20 (see the Sequence A000203 in the OEIS for more values):
n    | 1 | 2 | 3 | 4 | 5 | 6  | 7 | 8  | 9  | 10 | 11 | 12 | 13 | 14 | 15
σ(n) | 1 | 3 | 4 | 7 | 6 | 12 | 8 | 15 | 13 | 18 | 12 | 28 | 14 | 24 | 24
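Theorem 4.1.20 is also easy to verify numerically. The Python sketch below (ours; both function names are assumptions) recomputes the σ table above and checks the identity n p(n) = ∑_{k=1}^{n} σ(k) p(n − k) for all n ≤ 40.

```python
def sigma(n):
    """Sum of all positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

def partition_numbers(N):
    """[p(0), ..., p(N)] by dynamic programming over allowed part sizes."""
    p = [1] + [0] * N
    for k in range(1, N + 1):
        for n in range(k, N + 1):
            p[n] += p[n - k]
    return p

assert [sigma(n) for n in range(1, 16)] == \
    [1, 3, 4, 7, 6, 12, 8, 15, 13, 18, 12, 28, 14, 24, 24]

N = 40
p = partition_numbers(N)
for n in range(N + 1):
    assert n * p[n] == sum(sigma(k) * p[n - k] for k in range(1, n + 1))
print("Theorem 4.1.20 verified for n <= %d" % N)
```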
Hence,
SP = (∑_{k>0} σ(k) x^k) (∑_{n∈N} p(n) x^n) = ∑_{k>0} ∑_{n∈N} σ(k) p(n) x^{k+n}
= ∑_{k>0} ∑_{m∈N; m≥k} σ(k) p(m − k) x^m   (here, we have substituted m − k for n)
= ∑_{m∈N} ∑_{k=1}^{m} σ(k) p(m − k) x^m.
On the other hand,
xP′ = ∑_{n>0} n p(n) x^n = ∑_{n∈N} n p(n) x^n   (144)
(here, we have extended the range of the sum to include n = 0, which caused no change to the value of the sum, since the newly added n = 0 addend is 0 p(0) x^0 = 0).
Let us now recall the notion of a logarithmic derivative (as defined in Definition 3.7.12). The FPS P has constant coefficient [x^0] P = p(0) = 1, and thus belongs to Z[[x]]_1. Hence, its logarithmic derivative loder P is well-defined.
We shall now use loder P to compute P′ in a roundabout way. Namely, we
have
P = ∑_{n∈N} p(n) x^n = ∏_{k=1}^{∞} 1/(1 − x^k)
(by Theorem 4.1.8). In other words,
P = ∏_{k>0} 1/(1 − x^k)   (145)
(since the product sign ∏_{k=1}^{∞} is synonymous with ∏_{k>0}).
But Corollary 3.7.15 says that any k FPSs f_1, f_2, ..., f_k ∈ K[[x]]_1 (for any commutative ring K) satisfy
loder(f_1 f_2 ⋯ f_k) = loder(f_1) + loder(f_2) + ⋯ + loder(f_k);
this extends to multipliable infinite families, so that
loder(∏_{k>0} f_k) = ∑_{k>0} loder(f_k).
Applying this to K = Z and f_k = 1/(1 − x^k), we obtain
loder(∏_{k>0} 1/(1 − x^k)) = ∑_{k>0} loder(1/(1 − x^k)).
In view of (145), this rewrites as
loder P = ∑_{k>0} loder(1/(1 − x^k)).   (146)
= ∑_{n>0} (∑_{k>0; k|n} k) x^{n−1} = ∑_{n>0} σ(n) x^{n−1}
(since ∑_{k>0; k|n} k is the sum of all positive divisors of n, which equals σ(n) by the definition of σ(n)).
Since loder P = P′/P (by Definition 3.7.12), we obtain
P′/P = ∑_{n>0} σ(n) x^{n−1}.
Multiplying both sides of this equality by xP, we find
Theorem 4.1.21. Let I be a subset of {1, 2, 3, . . .}. For each n ∈ N, let p I (n)
be the # of partitions λ of n such that all parts of λ belong to I.
For any positive integer n, let σI (n) denote the sum of all positive divisors
of n that belong to I. (For example, if O = {all odd positive integers} and
E = {all even positive integers}, then σO (6) = 1 + 3 = 4 and σE (6) = 2 +
6 = 8.)
For any n ∈ N, we have
n p_I(n) = ∑_{k=1}^{n} σ_I(k) p_I(n − k).
w_k = (3k − 1) k / 2.
This is called the k-th pentagonal number.
k   | ⋯ | −5 | −4 | −3 | −2 | −1 | 0 | 1 | 2 | 3  | 4  | 5  | ⋯
w_k | ⋯ | 40 | 26 | 15 | 7  | 2  | 0 | 1 | 5 | 12 | 22 | 35 | ⋯
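A quick Python check of this table (the helper name `w` is ours), including the nonnegativity and distinctness of the pentagonal numbers used below:

```python
def w(k):
    """The k-th pentagonal number w_k = (3k - 1) k / 2, for k in Z."""
    return (3 * k - 1) * k // 2

print([w(k) for k in range(-5, 6)])
# [40, 26, 15, 7, 2, 0, 1, 5, 12, 22, 35]
assert all(w(k) >= 0 for k in range(-100, 101))       # always nonnegative
assert len({w(k) for k in range(-100, 101)}) == 201   # all distinct
```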
Note that w_k really is a nonnegative integer for any k ∈ Z (check this!). The name “pentagonal numbers” is historically motivated (see the Wikipedia page for details); the only thing we need to know about them (beside their definition) is the fact that they are nonnegative integers and grow quadratically with k in both directions (i.e., when k → ∞ and when k → −∞). The latter fact ensures that the infinite sum ∑_{k∈Z} (−1)^k x^{w_k} is a well-defined FPS in Z[[x]]. Rather surprisingly, this infinite sum coincides with a particularly simple infinite product:
∏_{k=1}^{∞} (1 − x^k) = ∑_{k∈Z} (−1)^k x^{w_k}
= ⋯ + x^{w_{−4}} − x^{w_{−3}} + x^{w_{−2}} − x^{w_{−1}} + x^{w_0} − x^{w_1} + x^{w_2} − x^{w_3} + x^{w_4} − x^{w_5} ± ⋯
= ⋯ + x^{26} − x^{15} + x^7 − x^2 + 1 − x + x^5 − x^{12} + x^{22} − x^{35} ± ⋯
= 1 − x − x^2 + x^5 + x^7 − x^{12} − x^{15} + x^{22} + x^{26} ± ⋯.
We will prove Theorem 4.2.2 in the next section (as a particular case of Jacobi’s Triple Product Identity).^59 First, let us use it to derive the following recursive formula for the partition numbers p(n):
p(n) = ∑_{k∈Z; k≠0} (−1)^{k−1} p(n − w_k)
= p(n − 1) + p(n − 2) − p(n − 5) − p(n − 7) + p(n − 12) + p(n − 15) − p(n − 22) − p(n − 26) ± ⋯.
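This recursion is fast in practice. Here is a Python sketch (ours; the function name is an assumption) that computes p(n) from the pentagonal recurrence and reproduces the values quoted earlier, including p(100).

```python
def partitions_pentagonal(N):
    """[p(0), ..., p(N)] via Euler's pentagonal recurrence
    p(n) = sum over k in Z, k != 0, of (-1)^(k-1) p(n - w_k)."""
    p = [1] + [0] * N
    for n in range(1, N + 1):
        total = 0
        for k in range(1, n + 1):
            sign = 1 if k % 2 == 1 else -1     # (-1)^(k-1), same for k and -k
            # w_k = (3k-1)k/2 and w_{-k} = (3k+1)k/2
            for wk in ((3 * k - 1) * k // 2, (3 * k + 1) * k // 2):
                if wk <= n:
                    total += sign * p[n - wk]
            if (3 * k - 1) * k // 2 > n:       # all later w_k exceed n
                break
        p[n] = total
    return p

p = partitions_pentagonal(100)
print(p[:10])   # [1, 1, 2, 3, 5, 7, 11, 15, 22, 30]
print(p[100])   # 190569292
```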
(∑_{m∈N} p(m) x^m) · (∑_{k∈Z} (−1)^k x^{w_k}) = (∏_{k=1}^{∞} 1/(1 − x^k)) · (∏_{k=1}^{∞} (1 − x^k))
= ∏_{k=1}^{∞} (1/(1 − x^k) · (1 − x^k)) = 1.   (147)
59 See [Bell06] for the history of Theorem 4.2.2.
= (−1)^0 p(n − w_0) + ∑_{k∈Z; k≠0} (−1)^k p(n − w_k)
= p(n) + ∑_{k∈Z; k≠0} (−1)^k p(n − w_k)   (since (−1)^0 = 1 and w_0 = 0).
But the x^n-coefficient on the right hand side of (147) is 0 (since n is positive). Hence, comparing the coefficients yields
p(n) + ∑_{k∈Z; k≠0} (−1)^k p(n − w_k) = 0.
What are q and z here? It appears that (148) should be an identity between
multivariate Laurent series (in the indeterminates q and z), but we have never
defined such a concept. Multivariate Laurent series can indeed be defined,
but this is not as easy as the univariate case and involves some choices (see
[ApaKau13] for details).
A simpler ring in which the identity (148) can be placed is (Z [z± ]) [[q]] (that
is, the ring of FPSs in the indeterminate q whose coefficients are Laurent poly-
nomials over Z in the indeterminate z). In other words, we state the following:
Theorem 4.3.1 (Jacobi’s triple product identity, take 1). In the ring (Z[z^±])[[q]], we have
∏_{n>0} (1 + q^{2n−1} z) (1 + q^{2n−1} z^{−1}) (1 − q^{2n}) = ∑_{ℓ∈Z} q^{ℓ^2} z^ℓ.
Before we start proving this theorem, let us check that the infinite product on
its left hand side and the infinite sum on its right are well-defined:
• The infinite product is
∏_{n>0} (1 + q^{2n−1} z) (1 + q^{2n−1} z^{−1}) (1 − q^{2n})
= ∏_{n>0} (1 + (u x^a)^{2n−1} · v x^b) (1 + (u x^a)^{2n−1} · (v x^b)^{−1}) (1 − (u x^a)^{2n})
= ∏_{n>0} (1 + u^{2n−1} v x^{(2n−1)a+b}) (1 + u^{2n−1} v^{−1} x^{(2n−1)a−b}) (1 − u^{2n} x^{2na}).
All factors in this product belong to the ring Q[[x]] (not just to Q((x))), since the exponents (2n − 1) a + b and (2n − 1) a − b and 2na are always nonnegative for any n > 0 (indeed, for any n > 0, we have (2n − 1) a + b ≥ |b| + b ≥ 0 and (2n − 1) a − b ≥ |b| − b ≥ 0 (since 2n − 1 ≥ 1 and a ≥ |b|), as well as 2na > 0 (since a > 0)). Moreover, this product is multipliable, because
– the number (2n − 1) a + b grows linearly when n → ∞ (since a > 0);
– the number (2n − 1) a − b grows linearly when n → ∞ (since a > 0);
– the number 2na grows linearly when n → ∞ (since a > 0).
Thus, the infinite product is well-defined.
This argument also shows that the three subproducts ∏_{n>0} (1 + q^{2n−1} z) and ∏_{n>0} (1 + q^{2n−1} z^{−1}) and ∏_{n>0} (1 − q^{2n}) are well-defined.
All addends in this sum belong to the ring Q[[x]] (not just to Q((x))), since the exponent aℓ^2 + bℓ is always nonnegative for any ℓ ∈ Z (indeed, we have
aℓ^2 + bℓ ≥ |b| · |ℓ|^2 − |b| · |ℓ| = |b| · |ℓ| · (|ℓ| − 1) ≥ 0
for any ℓ ∈ Z, since a ≥ |b| and ℓ^2 = |ℓ|^2 and bℓ ≥ −|bℓ| = −|b| · |ℓ|). Moreover, this sum is summable, because aℓ^2 + bℓ grows quadratically when ℓ → +∞ or ℓ → −∞ (since a > 0).
= ∏_{n>0} (1 − x^{6n−2}) (1 − x^{6n−4}) (1 − x^{6n})
= ∏_{n>0} (1 − (x^2)^{3n−1}) (1 − (x^2)^{3n−2}) (1 − (x^2)^{3n})   (since x^{6n−2} = (x^2)^{3n−1} and x^{6n−4} = (x^2)^{3n−2} and x^{6n} = (x^2)^{3n})
= ∏_{k>0} (1 − (x^2)^k),
Lemma 4.3.3. Let K be a commutative ring. Let f and g be two FPSs in K[[x]]. Assume that f(x^2) = g(x^2). Then, f = g.
where
• A white (=hollow) circle means a level that is contained in the state
(you can think of it as an “electron”).
• A black (=filled) circle means a level that is not contained in the state
(think of it as a “hole”).
For any state S,
• we define the energy of S to be
energy S := ∑_{p>0; p∈S} 2p − ∑_{p<0; p∉S} 2p ∈ N
(where the summation index p in the first sum runs over the finitely many positive levels contained in S, while the summation index p in the second sum runs over the finitely many negative levels not contained in S; note that each addend 2p of the first sum is positive, while each addend 2p of the second sum is negative).
• we define the particle number of S to be
parnum S := (# of levels p > 0 such that p ∈ S) − (# of levels p < 0 such that p ∉ S) ∈ Z.
We will first transform this identity into an equivalent one: Namely, we move the 1 − q^{2n} factors from the left hand side to the right hand side by multiplying both sides with ∏_{n>0} (1 − q^{2n})^{−1}. Thus, we can rewrite our identity as
∏_{n>0} (1 + q^{2n−1} z) (1 + q^{2n−1} z^{−1}) = (∑_{ℓ∈Z} q^{ℓ^2} z^ℓ) ∏_{n>0} (1 − q^{2n})^{−1}.
We will prove this new identity by showing that both of its sides are
∑_{S is a state} q^{energy S} z^{parnum S}.
∏_{n>0} (1 − x^n)^{−1} = ∏_{n>0} 1/(1 − x^n) = ∑_{n∈N} p(n) x^n   (by Theorem 4.1.8)
= ∑_{λ is a partition} x^{|λ|}
(because the sum ∑_{λ is a partition} x^{|λ|} contains each monomial x^n precisely p(n) times).
Substituting q^2 for x in this equality, we find
∏_{n>0} (1 − (q^2)^n)^{−1} = ∑_{λ is a partition} (q^2)^{|λ|}.
In other words,
∏_{n>0} (1 − q^{2n})^{−1} = ∑_{λ is a partition} q^{2|λ|}.
Multiplying both sides of this equality by ∑_{ℓ∈Z} q^{ℓ^2} z^ℓ, we obtain
(∑_{ℓ∈Z} q^{ℓ^2} z^ℓ) ∏_{n>0} (1 − q^{2n})^{−1} = (∑_{ℓ∈Z} q^{ℓ^2} z^ℓ) (∑_{λ is a partition} q^{2|λ|})
= ∑_{ℓ∈Z} ∑_{λ is a partition} q^{ℓ^2 + 2|λ|} z^ℓ.   (152)
Let us do this. Fix ℓ ∈ Z. We define the state G_ℓ (called the “ℓ-ground state”) by
G_ℓ := {all levels < ℓ} = {ℓ − 1/2, ℓ − 3/2, ℓ − 5/2, ...}.
One easily checks that its energy is
energy G_ℓ = 1 + 3 + 5 + ⋯ + (2|ℓ| − 1) = ℓ^2
and its particle number is parnum G_ℓ = ℓ.
We say that this state jump p,q S is obtained from S by letting the electron at
level p jump q steps to the right. Note that jump p,q S has the same particle
number as S (check this!60 ), whereas its energy is 2q higher than that of S
(check this!61 ). Thus, a jumping electron raises the energy but keeps the particle
number unchanged.
For any partition λ = (λ_1, λ_2, ..., λ_k), we define the state E_{ℓ,λ} (called an “excited state”) by starting with the ℓ-ground state G_ℓ, and then successively letting the k electrons at the highest levels (which are – from highest to lowest – the levels ℓ − 1 + 1/2, ℓ − 2 + 1/2, ..., ℓ − k + 1/2) jump λ_1, λ_2, ..., λ_k steps to the right, respectively (starting with the rightmost electron). In other words,
E_{ℓ,λ} := jump_{ℓ−k+1/2, λ_k} (⋯ (jump_{ℓ−2+1/2, λ_2} (jump_{ℓ−1+1/2, λ_1} (G_ℓ))) ⋯)
= {all levels < ℓ − k} ∪ {ℓ − i + 1/2 + λ_i | i ∈ {1, 2, ..., k}}.
(Check that these jumps are well-defined – i.e., that each electron jumps to an unoccupied level.)
[Example: Let ℓ = 3 and k = 4 and λ = (λ_1, λ_2, λ_3, λ_4) = (4, 2, 2, 1). Then,
E_{ℓ,λ} = {all levels < −1} ∪ {1/2, 5/2, 7/2, 13/2}.
(A figure illustrated the four jumps jump_{ℓ−1+1/2, λ_1}, jump_{ℓ−2+1/2, λ_2}, jump_{ℓ−3+1/2, λ_3}, jump_{ℓ−4+1/2, λ_4} here, with the negative levels drawn to the left of 0 and the positive levels to the right.)
Note how the order of the electrons after the jumps is the same as before – i.e., the electron in the rightmost position before the jumps is still the rightmost one after the jumps, etc.]
Recall that the ℓ-ground state Gℓ has energy ℓ2 and particle number ℓ. The
state Eℓ,λ that we have just defined is obtained from this ℓ-ground state Gℓ by k
jumps, which are jumps by λ1 , λ2 , . . . , λk steps respectively. Recall that a jump
by q steps raises the energy by 2q but keeps the particle number unchanged.
Thus, the “excited state” E_{ℓ,λ} has energy ℓ^2 + 2λ_1 + 2λ_2 + ⋯ + 2λ_k = ℓ^2 + 2|λ| and particle number ℓ. Furthermore, every state with particle number ℓ can be written as E_{ℓ,λ} for a unique partition λ (check this!^{62}).
62 Here is how to obtain this λ: For each i ∈ {1, 2, 3, . . .}, let ui be the i-th largest level in the
Hence,
∑_{λ is a partition} q^{ℓ^2 + 2|λ|} = ∑_{λ is a partition} q^{energy(Φ_ℓ(λ))}   (since energy(Φ_ℓ(λ)) = ℓ^2 + 2|λ|)
= ∑_{S is a state with particle number ℓ} q^{energy S}   (153)
(here, we have substituted S for Φ_ℓ(λ) in the sum, since the map Φ_ℓ is a bijection).
Forget that we fixed ℓ. We thus have proved (153) for each ℓ ∈ Z.
Now, (152) becomes
(∑_{ℓ∈Z} q^{ℓ^2} z^ℓ) ∏_{n>0} (1 − q^{2n})^{−1} = ∑_{ℓ∈Z} ∑_{λ is a partition} q^{ℓ^2 + 2|λ|} z^ℓ
= ∑_{ℓ∈Z} (∑_{S is a state with particle number ℓ} q^{energy S}) z^ℓ   (by (153))
= ∑_{ℓ∈Z} ∑_{S is a state with particle number ℓ} q^{energy S} z^{parnum S}   (since ℓ = parnum S)
= ∑_{S is a state} q^{energy S} z^{parnum S}.
state, and let λ_i = u_i − ℓ + i − 1/2. Then, (λ_1, λ_2, λ_3, ...) is a weakly decreasing sequence of nonnegative integers, and all but finitely many of its entries are 0 (since the state has particle number ℓ, so that it is not hard to see that u_i = ℓ − i + 1/2 for any sufficiently large i). Removing all 0s from this sequence (λ_1, λ_2, λ_3, ...) thus results in a finite tuple (λ_1, λ_2, ..., λ_k), which is precisely the partition λ whose corresponding E_{ℓ,λ} is our state.
This proves Jacobi’s Triple Product Identity (Theorem 4.3.1 and Theorem 4.3.2).
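The bookkeeping in this proof is concrete enough to test by computer. The following Python sketch (ours; all names are assumptions) builds the excited state E_{ℓ,λ}, tracking only its finitely many deviations from the “sea” of occupied levels and storing each level p as the odd integer 2p, and checks that its energy is ℓ^2 + 2|λ| and its particle number is ℓ.

```python
def excited_state(ell, lam):
    """Finite data of E_{ell,lam}: the set of its positive occupied levels
    and the set of its negative unoccupied levels (each level p stored
    as the odd integer 2p)."""
    k = len(lam)
    # the k moved electrons sit at levels ell - i + 1/2 + lam_i  (i = 1..k)
    extra = {2 * ell - 2 * i + 1 + 2 * lam[i - 1] for i in range(1, k + 1)}
    base_top = 2 * (ell - k)   # all levels p with 2p < base_top are occupied
    pos_occupied = {m for m in extra if m > 0} | set(range(1, base_top, 2))
    neg_missing = {m for m in range(base_top + 1, 0, 2) if m not in extra}
    return pos_occupied, neg_missing

def energy_and_parnum(pos_occupied, neg_missing):
    """energy = sum of 2p over positive occupied levels, minus sum of 2p
    over negative unoccupied levels; parnum = difference of the counts."""
    return (sum(pos_occupied) - sum(neg_missing),
            len(pos_occupied) - len(neg_missing))

for ell in range(-3, 4):
    for lam in [(), (1,), (4, 2, 2, 1), (5, 5, 3, 1, 1, 1)]:
        e, pn = energy_and_parnum(*excited_state(ell, lam))
        assert e == ell * ell + 2 * sum(lam) and pn == ell
print("energy and particle number check out")
```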
Theorem 4.3.4. For any positive integer n, let σ (n) denote the sum of all
positive divisors of n. (For example, σ (6) = 1 + 2 + 3 + 6 = 12 and σ (7) =
1 + 7 = 8.)
Then, for each positive integer n, we have
∑_{k∈Z; w_k < n} (−1)^k σ(n − w_k) =
  (−1)^{k−1} n,   if n = w_k for some k ∈ Z;
  0,              if not.
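Theorem 4.3.4, too, is easy to confirm numerically; a Python sketch (ours, with hypothetical helper names):

```python
def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

def w(k):
    return (3 * k - 1) * k // 2          # the pentagonal number w_k, k in Z

def lhs(n):
    # sum over k in Z with w_k < n of (-1)^k sigma(n - w_k);
    # note that (-1)^k = (-1)^|k|
    return sum((-1) ** abs(k) * sigma(n - w(k))
               for k in range(-n, n + 1) if w(k) < n)

def rhs(n):
    # (-1)^{k-1} n if n = w_k for some k, else 0 (the w_k are distinct)
    for k in range(-n, n + 1):
        if w(k) == n and n > 0:
            return (-1) ** (abs(k) - 1) * n
    return 0

assert all(lhs(n) == rhs(n) for n in range(1, 60))
print("Theorem 4.3.4 verified for n < 60")
```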
The left hand side of the equality in Theorem 4.3.4 looks as follows:
σ(n) − σ(n − 1) − σ(n − 2) + σ(n − 5) + σ(n − 7) − σ(n − 12) − σ(n − 15) + σ(n − 22) + σ(n − 26) ± ⋯
(the sum goes only as far as the arguments remain positive integers, so it is a finite sum). Thus, by solving this equality for σ(n), we obtain a recursive
formula for σ (n). (In this form, Theorem 4.3.4 appears in many places, such as
[Ness61, §23] and [Johnso20, Theorem 37].) This does not give an efficient algo-
rithm for computing σ (n) (it is much slower than the formula in [19s, Exercise
Math 701 Spring 2021, version April 6, 2024 page 234
2.18.1 (b)], even though the latter requires computing the prime factorization of
n), but its beauty and element of surprise (why would you expect the pentago-
nal numbers to have anything to do with sums of divisors?) make up for this
practical uselessness. Even more surprisingly, it can be proved using Euler’s
pentagonal number theorem, even though it has seemingly nothing to do with
partitions!
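To illustrate the recursion, here is a short Python sketch (ours, not from the notes; the function name `sigma` is our choice) that computes σ(n) by solving Theorem 4.3.4 for σ(n):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def sigma(n):
    """Sum of divisors of n, via the recursion obtained by solving
    Theorem 4.3.4 for sigma(n):
        sigma(n) = sum over k != 0 with w_k < n of (-1)^(k-1) sigma(n - w_k)
                   plus (-1)^(k-1) * n if n = w_k for some k != 0,
    where w_k = k(3k-1)/2 are the (generalized) pentagonal numbers."""
    total = 0
    k = 1
    while True:
        hit = False
        for j in (k, -k):
            w = j * (3 * j - 1) // 2
            sign = 1 if j % 2 == 1 else -1   # this is (-1)^(j-1)
            if w < n:
                total += sign * sigma(n - w)
                hit = True
            elif w == n:
                total += sign * n
                hit = True
        if not hit:          # all remaining pentagonal numbers exceed n
            break
        k += 1
    return total
```

For example, `sigma(6)` returns 12 and `sigma(7)` returns 8, matching the examples in Theorem 4.3.4.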
Proof of Theorem 4.3.4 (sketched). First, let us show that the right hand side in Theorem 4.3.4 is well-defined. Indeed, it is easy to see that the map Z → N, k ↦ w_k is injective; thus, the pentagonal numbers w_k for all k ∈ Z are distinct. Hence, if a positive integer n satisfies n = w_k for some k ∈ Z, then this latter k is uniquely determined by n, and therefore the expression
\[
\begin{cases} (-1)^{k-1}\, n, & \text{if } n = w_k \text{ for some } k\in\mathbb{Z};\\ 0, & \text{if not} \end{cases}
\]
is well-defined. In other words, the right hand side in Theorem 4.3.4 is well-defined.
Now, define the two FPSs P and S as in our above proof of Theorem 4.1.20. In that very proof, we have shown that these FPSs satisfy
\[
xP' = SP \tag{154}
\]
and
\[
P = \prod_{k=1}^{\infty} \frac{1}{1-x^k}.
\]
Since
\[
Q = \sum_{k\in\mathbb{Z}} (-1)^k x^{w_k} = \prod_{k=1}^{\infty} \left(1-x^k\right) \qquad (\text{by Theorem 4.2.2}),
\]
we obtain
\[
PQ = \left(\prod_{k=1}^{\infty} \frac{1}{1-x^k}\right)\left(\prod_{k=1}^{\infty} \left(1-x^k\right)\right)
= \prod_{k=1}^{\infty} \underbrace{\frac{1}{1-x^k}\cdot\left(1-x^k\right)}_{=1} = 1.
\]
Combining this equality with (154), we can easily see that xQ′ = −QS. Indeed, here are two (essentially equivalent) ways to show this:

• One way uses logarithmic derivatives: From PQ = 1, we obtain
\[
\frac{Q'}{Q} = \operatorname{loder} Q = -\operatorname{loder} P = -\frac{P'}{P} \qquad \left(\text{since } \operatorname{loder} P = \frac{P'}{P}\right).
\]
Multiplying this by x, we find
\[
\frac{xQ'}{Q} = -\frac{xP'}{P} = -\frac{SP}{P} \qquad (\text{by (154)})
= -S,
\]
so that xQ′ = −QS.

• A less conceptual but slicker way to draw the same conclusion is the following: Taking derivatives in the equality PQ = 1, we find (PQ)′ = 1′ = 0, so that 0 = (PQ)′ = P′Q + PQ′ (by Theorem 3.6.2 (d)). Thus, PQ′ = −P′Q. Multiplying this equality by x, we find
\[
PQ'\cdot x = -P'Q\cdot x = -\underbrace{xP'}_{\substack{=SP\\ \text{(by (154))}}}Q = -S\underbrace{PQ}_{=1} = -S.
\]
Multiplying this further by Q, we obtain PQ′ · xQ = −SQ = −QS. In view of PQ′ · xQ = xQ′ · PQ = xQ′ (since PQ = 1), we can rewrite this as xQ′ = −QS.
Either way, we now know that xQ′ = −QS. However, from Q = ∑_{k∈Z} (−1)^k x^{w_k}, we obtain
\[
Q' = \left(\sum_{k\in\mathbb{Z}} (-1)^k x^{w_k}\right)'
= \sum_{k\in\mathbb{Z}} (-1)^k \underbrace{\left(x^{w_k}\right)'}_{\substack{=w_k x^{w_k-1}\\ \text{(where we understand this}\\ \text{expression to mean } 0 \text{ if } w_k=0\text{)}}}
= \sum_{k\in\mathbb{Z}} (-1)^k w_k x^{w_k-1},
\]
so that
\[
xQ' = \sum_{k\in\mathbb{Z}} (-1)^k w_k \underbrace{x\cdot x^{w_k-1}}_{=x^{w_k}} = \sum_{k\in\mathbb{Z}} (-1)^k w_k x^{w_k}.
\]
Hence (recalling that S = ∑_{i>0} σ(i) x^i),
\[
\sum_{k\in\mathbb{Z}} (-1)^k w_k x^{w_k} = xQ' = -QS
= \sum_{k\in\mathbb{Z}} \sum_{i>0} \underbrace{-(-1)^k}_{=(-1)^{k-1}} x^{w_k}\, \underbrace{\sigma(i)\, x^i}_{=\sigma(i)\, x^{w_k+i}/x^{w_k}}
\]
\[
= \sum_{k\in\mathbb{Z}} \sum_{m>w_k} (-1)^{k-1}\, \sigma(m-w_k)\, \underbrace{x^{w_k+(m-w_k)}}_{=x^m}
\qquad \left(\text{here, we have substituted } m-w_k \text{ for } i \text{ in the inner sum}\right)
\]
\[
= \sum_{m>0}\ \sum_{\substack{k\in\mathbb{Z};\\ m>w_k}} (-1)^{k-1}\, \sigma(m-w_k)\, x^m. \tag{155}
\]
Now, fix a positive integer n. On the left hand side of (155), the coefficient of x^n is
\[
\begin{cases} (-1)^k w_k, & \text{if } n = w_k \text{ for some } k\in\mathbb{Z};\\ 0, & \text{if not} \end{cases}
\]
(since the pentagonal numbers w_k for all k ∈ Z are distinct, and thus the different addends on the left hand side of (155) contribute to different monomials).
On the right hand side, the coefficient of x^n is obviously
\[
\sum_{\substack{k\in\mathbb{Z};\\ n>w_k}} (-1)^{k-1}\, \sigma(n-w_k).
\]
Since these two coefficients are equal (because (155) is an identity), we thus conclude that
\[
\begin{cases} (-1)^k w_k, & \text{if } n = w_k \text{ for some } k\in\mathbb{Z};\\ 0, & \text{if not} \end{cases}
= \sum_{\substack{k\in\mathbb{Z};\\ n>w_k}} \underbrace{(-1)^{k-1}}_{=-(-1)^k}\, \sigma(n-w_k)
= -\sum_{\substack{k\in\mathbb{Z};\\ w_k<n}} (-1)^k\, \sigma(n-w_k).
\]
Thus,
\[
\sum_{\substack{k\in\mathbb{Z};\\ w_k<n}} (-1)^k\, \sigma(n-w_k)
= -\begin{cases} (-1)^k w_k, & \text{if } n = w_k \text{ for some } k\in\mathbb{Z};\\ 0, & \text{if not} \end{cases}
\]
\[
= \begin{cases} -(-1)^k w_k, & \text{if } n = w_k \text{ for some } k\in\mathbb{Z};\\ 0, & \text{if not} \end{cases}
= \begin{cases} (-1)^{k-1} w_k, & \text{if } n = w_k \text{ for some } k\in\mathbb{Z};\\ 0, & \text{if not} \end{cases}
\qquad \left(\text{since } -(-1)^k = (-1)^{k-1}\right)
\]
\[
= \begin{cases} (-1)^{k-1}\, n, & \text{if } n = w_k \text{ for some } k\in\mathbb{Z};\\ 0, & \text{if not} \end{cases}
\qquad (\text{since } w_k = n \text{ in the first case}).
\]
This proves Theorem 4.3.4.
4.4.1. Motivation
For any n ∈ N, we have
\[
p(n) = (\text{\# of partitions of } n).
\]
Instead of fixing the size of a partition, let us now fix its number of parts and its largest part: say, let us count the partitions with 4 parts and largest part 6 (of arbitrary size). One such partition is (6, 3, 3, 2); here is its Young diagram:

[figure omitted]

Consider the lower boundary of this Young diagram – i.e., the "irregular" southeastern border between what is in the diagram and what is outside of it. Let me mark it in thick red:

[figure omitted]
This lower boundary can be viewed as a lattice path from the point (0, 0) to the point (6, 4) (where we are using Cartesian coordinates to label the intersections of grid lines, so that the southwesternmost point in our diagram is (0, 0); note that this is completely unrelated to our labeling of cells used in defining the Young diagram!63). This lattice path consists of east-steps (i.e., steps (i, j) → (i + 1, j)) and north-steps (i.e., steps (i, j) → (i, j + 1)); moreover, it begins with an east-step (since otherwise, our partition would have fewer than 4 parts) and ends with a north-step (since otherwise, our partition would have largest part < 6). Moreover, the Young diagram (and thus the partition) is uniquely determined by this lattice path, since its cells are precisely the cells "northwest" of this lattice path. Conversely, any lattice path from (0, 0) to (6, 4) that consists of east-steps and north-steps and begins with an east-step and ends with a north-step uniquely determines a Young diagram and therefore a partition. Therefore, in order to count the partitions that have 4 parts and largest part 6, we only need to count such lattice paths.
To count them, we notice that any such lattice path has precisely 10 steps (since any step increases the sum of the coordinates by 1; but this sum must increase from 0 + 0 = 0 to 6 + 4 = 10). The first and the last steps are predetermined; it remains to decide which of the remaining 8 steps are north-steps. The # of ways to do this is $\binom{8}{3}$, because we want precisely 3 of our 8 non-predetermined steps to be north-steps (in order to end up at (6, 4) rather than some other point).

As a consequence of this all, we find
\[
(\text{\# of partitions with 4 parts and largest part 6}) = \binom{8}{3}.
\]
The same argument, applied to arbitrary positive integers k and ℓ instead of 4 and 6, yields the following:

Proposition 4.4.1. Let k and ℓ be two positive integers. Then,
\[
(\text{\# of partitions with } k \text{ parts and largest part } \ell) = \binom{k+\ell-2}{k-1}.
\]
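The count in Proposition 4.4.1 is easy to sanity-check by direct enumeration. Here is a quick Python sketch (ours, not from the notes; the function name is our choice):

```python
from math import comb

def count_partitions(k, largest):
    """# of partitions with exactly k parts and largest part exactly `largest`:
    the first part equals `largest`, and the remaining k - 1 parts form a
    weakly decreasing tuple of integers in {1, ..., largest}."""
    def count(parts_left, bound):
        if parts_left == 0:
            return 1
        # choose the next (largest remaining) part b, then recurse below it
        return sum(count(parts_left - 1, b) for b in range(1, bound + 1))
    return count(k - 1, largest)
```

For k = 4 and ℓ = 6, this returns 56 = $\binom{8}{3}$, matching the lattice-path count above; the symmetry in k and ℓ discussed below is also visible in the output.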
63 For additional clarity, here are the Cartesian coordinates of all grid points on our lattice path: [figure omitted; the labelled points include (2, 1), (3, 1), (3, 2), (3, 3), (4, 3), (5, 3), (6, 3) and (6, 4)].
• This is a finite number, even without fixing the size of the partition. This
is not surprising, since you have only finitely many parts and only finitely
many options for each part.
• The number is symmetric in k and ℓ. This, too, is not surprising, because
conjugation (as defined in the proof of Theorem 4.1.15) gives a bijection
from {partitions with k parts and largest part ℓ}
to {partitions with ℓ parts and largest part k} .
Now, let us integrate the size of the partition back into our count – i.e., let us try to count the partitions of a given n ∈ N with k parts and largest part ℓ. No simple formula (like Proposition 4.4.1) exists for this number any more, so we switch our focus to the generating function of such numbers (for fixed k and ℓ). In other words, we try to compute the FPS
\[
\sum_{n\in\mathbb{N}} (\text{\# of partitions of } n \text{ with } k \text{ parts and largest part } \ell)\, x^n.
\]
It will turn out convenient to modify this goal in three ways:

• we rename ℓ as n − k (note that n will no longer stand for the size of the partition);

• we replace "largest part n − k and length k" by "largest part ≤ n − k and length ≤ k" (this changes the results of our counts, but we can easily recover the answer to the original question from an answer to the new question; e.g., in order to count the length-k partitions, it suffices to subtract the # of length-(≤ k − 1) partitions from the # of length-(≤ k) partitions);

• we rename the indeterminate x as q.
4.4.2. Definition

Convention 4.4.2. In this section, we will mostly be using FPSs in the indeterminate q. That is, we call the indeterminate q rather than x. Thus, e.g., our formula
\[
\prod_{n>0} (1-x^n)^{-1} = \prod_{n>0} \frac{1}{1-x^n} = \sum_{n\in\mathbb{N}} p(n)\, x^n = \sum_{\substack{\lambda \text{ is a}\\ \text{partition}}} x^{|\lambda|}
\]
becomes
\[
\prod_{n>0} (1-q^n)^{-1} = \prod_{n>0} \frac{1}{1-q^n} = \sum_{n\in\mathbb{N}} p(n)\, q^n = \sum_{\substack{\lambda \text{ is a}\\ \text{partition}}} q^{|\lambda|}.
\]

Definition 4.4.3. Let n ∈ N and k ∈ N.
(a) The q-binomial coefficient $\binom{n}{k}_q$ is defined by
\[
\binom{n}{k}_q := \sum_{\substack{\lambda \text{ is a partition}\\ \text{with largest part} \le n-k\\ \text{and length} \le k}} q^{|\lambda|} \in \mathbb{Z}[q].
\]
This is also denoted by $\begin{bmatrix} n \\ k \end{bmatrix}$ (but this notation has other meanings, too, and suppresses q).
(b) If a is any element of a ring A, then we set
\[
\binom{n}{k}_a := \left(\text{the result of substituting } a \text{ for } q \text{ in } \binom{n}{k}_q\right)
= \sum_{\substack{\lambda \text{ is a partition}\\ \text{with largest part} \le n-k\\ \text{and length} \le k}} a^{|\lambda|} \in A.
\]
Remark 4.4.4. The $\binom{n}{k}_q$ we defined in Definition 4.4.3 (a) is really a polynomial, not merely a FPS, because (for any given n and k) there are only finitely many partitions with largest part ≤ n − k and length ≤ k.
Remark 4.4.5. The notation $\binom{n}{k}_q$ (and the name "q-binomial coefficient") suggests a similarity to the usual binomial coefficient $\binom{n}{k}$. And indeed, we will soon see that $\binom{n}{k}_1 = \binom{n}{k}$.
Note, however, that $\binom{n}{k}_q$ is only defined for n, k ∈ N (unlike $\binom{n}{k}$, which we defined for arbitrary n, k ∈ C). It is possible to extend it to negative integers n, but this will result in a Laurent polynomial. (See Exercise A.3.4.4 for this extension.)
Example 4.4.6. We have, for instance,
\[
\binom{4}{2}_q = \sum_{\substack{\lambda \text{ is a partition}\\ \text{with largest part} \le 2\\ \text{and length} \le 2}} q^{|\lambda|}
= q^0 + q^1 + q^2 + q^2 + q^3 + q^4 = 1 + q + 2q^2 + q^3 + q^4
\]
(the relevant partitions being (), (1), (2), (1, 1), (2, 1) and (2, 2)).

Here are some alternative expressions for the q-binomial coefficients:

Proposition 4.4.7. Let n, k ∈ N. Then:
(a) We have
\[
\binom{n}{k}_q = \sum_{0 \le i_1 \le i_2 \le \cdots \le i_k \le n-k} q^{i_1 + i_2 + \cdots + i_k}.
\]
Here, the sum ranges over all weakly increasing k-tuples (i_1, i_2, . . . , i_k) ∈ {0, 1, . . . , n − k}^k. If k > n, then this is an empty sum (since the set {0, 1, . . . , n − k} is empty in this case, and thus its k-th power {0, 1, . . . , n − k}^k is also empty because k > n ≥ 0).
(b) Set sum S = ∑_{s∈S} s for any finite set S of integers. (For example, sum {2, 4, 5} = 2 + 4 + 5 = 11.) Then, we have
\[
\binom{n}{k}_q = \sum_{\substack{S\subseteq\{1,2,\ldots,n\};\\ |S|=k}} q^{\operatorname{sum} S - (1+2+\cdots+k)}.
\]
(c) We have
\[
\binom{n}{k}_1 = \binom{n}{k}.
\]
Example 4.4.8. For example, let us compute $\binom{5}{2}_q$ using Proposition 4.4.7 (b). Namely, applying Proposition 4.4.7 (b) to n = 5 and k = 2, we obtain
\[
\binom{5}{2}_q = \sum_{\substack{S\subseteq\{1,2,\ldots,5\};\\ |S|=2}} q^{\operatorname{sum} S - (1+2)}
\]
\[
= q^{(1+2)-(1+2)} + q^{(1+3)-(1+2)} + q^{(1+4)-(1+2)} + q^{(1+5)-(1+2)} + q^{(2+3)-(1+2)} + q^{(2+4)-(1+2)} + q^{(2+5)-(1+2)} + q^{(3+4)-(1+2)} + q^{(3+5)-(1+2)} + q^{(4+5)-(1+2)}
\]
\[
\left(\begin{array}{c}\text{since the 2-element subsets of } \{1,2,\ldots,5\} \text{ are}\\ \{1,2\}, \{1,3\}, \{1,4\}, \{1,5\}, \{2,3\}, \{2,4\},\\ \{2,5\}, \{3,4\}, \{3,5\}, \{4,5\}\end{array}\right)
\]
\[
= q^0 + q^1 + q^2 + q^3 + q^2 + q^3 + q^4 + q^4 + q^5 + q^6
= 1 + q + 2q^2 + 2q^3 + 2q^4 + q^5 + q^6.
\]
Proof of Proposition 4.4.7. (a) The definition of $\binom{n}{k}_q$ yields
\[
\binom{n}{k}_q = \sum_{\substack{\lambda \text{ is a partition}\\ \text{with largest part} \le n-k\\ \text{and length} \le k}} q^{|\lambda|}
= \sum_{\ell=0}^{k}\ \sum_{\substack{\lambda \text{ is a partition}\\ \text{with largest part} \le n-k\\ \text{and length } \ell}} q^{|\lambda|}. \tag{156}
\]
Now, let us simplify the inner sum on the right hand side.
Fix ℓ ∈ {0, 1, . . . , k}. Then, any partition λ with length ℓ has the form (λ_1, λ_2, . . . , λ_ℓ) for some nonnegative integers λ_1, λ_2, . . . , λ_ℓ satisfying λ_1 ≥ λ_2 ≥ · · · ≥ λ_ℓ > 0 (by the definitions of "partition" and "length"). Moreover, this partition λ has largest part ≤ n − k if and only if its entries satisfy n − k ≥ λ_1 ≥ λ_2 ≥ · · · ≥ λ_ℓ > 0. Finally, the size |λ| of this partition equals λ_1 + λ_2 + · · · + λ_ℓ. Hence, we can rewrite the sum
\[
\sum_{\substack{\lambda \text{ is a partition}\\ \text{with largest part} \le n-k\\ \text{and length } \ell}} q^{|\lambda|}
\qquad\text{as}\qquad
\sum_{\substack{(\lambda_1,\lambda_2,\ldots,\lambda_\ell)\in\mathbb{N}^\ell;\\ n-k\ge\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_\ell>0}} q^{\lambda_1+\lambda_2+\cdots+\lambda_\ell}.
\]
Furthermore,
\[
\lambda_1+\lambda_2+\cdots+\lambda_k
= (\lambda_1+\lambda_2+\cdots+\lambda_\ell) + \underbrace{(0+0+\cdots+0)}_{=0}
= \lambda_1+\lambda_2+\cdots+\lambda_\ell.
\]
This proves Observation 1 (c).

(d) The ℓ-tuple (λ_1, λ_2, . . . , λ_ℓ) is weakly decreasing (since λ_1 ≥ λ_2 ≥ · · · ≥ λ_k entails λ_1 ≥ λ_2 ≥ · · · ≥ λ_ℓ) and consists of positive integers (since Observation 1 (a) says that we have λ_i > 0 for all i ∈ {1, 2, . . . , ℓ}). Thus, it is a weakly decreasing tuple of positive integers, i.e., a partition. This proves Observation 1 (d).]
Let us recall that ℓ ∈ {0, 1, . . . , k}, so that ℓ ≤ k. Hence, any ℓ-tuple (λ_1, λ_2, . . . , λ_ℓ) ∈ N^ℓ can be extended to a k-tuple (λ_1, λ_2, . . . , λ_k) ∈ N^k by inserting k − ℓ zeroes at the end (i.e., by setting λ_{ℓ+1} = λ_{ℓ+2} = · · · = λ_k = 0). If the original ℓ-tuple (λ_1, λ_2, . . . , λ_ℓ) ∈ N^ℓ was a partition with largest part ≤ n − k, then the resulting k-tuple (λ_1, λ_2, . . . , λ_k) ∈ N^k satisfies n − k ≥ λ_1 ≥ λ_2 ≥ · · · ≥ λ_k ≥ 0 and numpos (λ_1, λ_2, . . . , λ_k) = ℓ; conversely, any such k-tuple can be turned back into an ℓ-tuple of this kind by removing its last k − ℓ entries (which are 0). These two maps are mutually inverse, and therefore are bijections. In particular, this shows that the first map is a bijection. This bijection allows us to replace our partitions (λ_1, λ_2, . . . , λ_ℓ) ∈ N^ℓ by k-tuples (λ_1, λ_2, . . . , λ_k) ∈ N^k in the sum ∑ q^{λ_1+λ_2+···+λ_ℓ}; we thus find
\[
\sum_{\substack{(\lambda_1,\lambda_2,\ldots,\lambda_\ell)\in\mathbb{N}^\ell;\\ n-k\ge\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_\ell>0}} q^{\lambda_1+\lambda_2+\cdots+\lambda_\ell}
= \sum_{\substack{(\lambda_1,\lambda_2,\ldots,\lambda_k)\in\mathbb{N}^k;\\ n-k\ge\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_k\ge0;\\ \operatorname{numpos}(\lambda_1,\lambda_2,\ldots,\lambda_k)=\ell}} q^{\lambda_1+\lambda_2+\cdots+\lambda_k}.
\]
Now, forget that we fixed ℓ. We thus have proved (158) for each ℓ ∈ {0, 1, . . . , k }.
(Why are these two maps mutually inverse? The second map inserts k − ℓ zeroes at the end of the partition. Thus, if we apply the first map after the second map, we clearly recover the partition that we started with. If we apply the second map after the first map, then we end up replacing the last k − ℓ entries of our k-tuple by zeroes. However, if (λ_1, λ_2, . . . , λ_k) ∈ N^k is any k-tuple satisfying n − k ≥ λ_1 ≥ λ_2 ≥ · · · ≥ λ_k ≥ 0 and numpos (λ_1, λ_2, . . . , λ_k) = ℓ, then the last k − ℓ entries of this k-tuple are 0 (by Observation 1 (b)), and therefore the k-tuple does not change if we replace these k − ℓ entries by zeroes.)
Hence we obtain
\[
\binom{n}{k}_q = \sum_{\substack{S\subseteq\{1,2,\ldots,n\};\\ |S|=k}} q^{\operatorname{sum} S - (1+2+\cdots+k)}.
\]
Proposition 4.4.9. Let n, k ∈ N satisfy k > n. Then, $\binom{n}{k}_q = 0$.

Proof. From k > n, we obtain n − k < 0. The definition of $\binom{n}{k}_q$ yields
\[
\binom{n}{k}_q = \sum_{\substack{\lambda \text{ is a partition}\\ \text{with largest part} \le n-k\\ \text{and length} \le k}} q^{|\lambda|}. \tag{159}
\]
The sum on the right hand side is an empty sum, since there exists no partition with largest part ≤ n − k (because n − k < 0). Thus, (159) rewrites as $\binom{n}{k}_q = (\text{empty sum}) = 0$, and this proves Proposition 4.4.9.
Proposition 4.4.10. We have $\binom{n}{0}_q = \binom{n}{n}_q = 1$ for each n ∈ N.

Proof. This is easy and left as a homework exercise (Exercise A.3.4.1 (a)).

The next convention mirrors a convention we made for the (usual) binomial coefficients:

Convention 4.4.11. Let n ∈ N. For any k ∉ N, we set $\binom{n}{k}_q := 0$.
Theorem 4.4.12. Let n be a positive integer, and let k ∈ N. Then:
(a) We have
\[
\binom{n}{k}_q = q^{n-k}\binom{n-1}{k-1}_q + \binom{n-1}{k}_q.
\]
(b) We have
\[
\binom{n}{k}_q = \binom{n-1}{k-1}_q + q^k \binom{n-1}{k}_q.
\]
Proof. (a) This is similar to the combinatorial proof of the recurrence relation for binomial coefficients.

If k = 0, then the claim we are proving boils down to 1 = q^{n−k} · 0 + 1 (because Proposition 4.4.10 yields $\binom{n}{0}_q = 1$ and $\binom{n-1}{0}_q = 1$, and because Convention 4.4.11 yields $\binom{n-1}{-1}_q = 0$). Hence, we WLOG assume that k > 0. Thus, k − 1 ∈ N.

Proposition 4.4.7 (b) says that
\[
\binom{n}{k}_q = \sum_{\substack{S\subseteq\{1,2,\ldots,n\};\\ |S|=k}} q^{\operatorname{sum} S - (1+2+\cdots+k)}. \tag{160}
\]
The same proposition (applied to n − 1 and k − 1 instead of n and k) says that
\[
\binom{n-1}{k-1}_q = \sum_{\substack{S\subseteq\{1,2,\ldots,n-1\};\\ |S|=k-1}} q^{\operatorname{sum} S - (1+2+\cdots+(k-1))}. \tag{161}
\]
Finally, the same proposition (applied to n − 1 instead of n) says that
\[
\binom{n-1}{k}_q = \sum_{\substack{S\subseteq\{1,2,\ldots,n-1\};\\ |S|=k}} q^{\operatorname{sum} S - (1+2+\cdots+k)}. \tag{162}
\]
Let us introduce two notations:

• A type-1 subset will mean a k-element subset of {1, 2, . . . , n} that contains n.

• A type-2 subset will mean a k-element subset of {1, 2, . . . , n} that does not contain n.

Each k-element subset of {1, 2, . . . , n} is either type-1 or type-2 (but not both at the same time). Thus,
\[
\sum_{\substack{S\subseteq\{1,2,\ldots,n\};\\ |S|=k}} q^{\operatorname{sum} S - (1+2+\cdots+k)}
= \sum_{\substack{S\subseteq\{1,2,\ldots,n\};\\ |S|=k;\\ S \text{ is type-1}}} q^{\operatorname{sum} S - (1+2+\cdots+k)}
+ \sum_{\substack{S\subseteq\{1,2,\ldots,n\};\\ |S|=k;\\ S \text{ is type-2}}} q^{\operatorname{sum} S - (1+2+\cdots+k)}.
\]
The type-2 subsets are precisely the k-element subsets of {1, 2, . . . , n − 1}. Hence,
\[
\sum_{\substack{S\subseteq\{1,2,\ldots,n\};\\ |S|=k;\\ S \text{ is type-2}}} q^{\operatorname{sum} S - (1+2+\cdots+k)}
= \sum_{\substack{S\subseteq\{1,2,\ldots,n-1\};\\ |S|=k}} q^{\operatorname{sum} S - (1+2+\cdots+k)}
= \binom{n-1}{k}_q \qquad (\text{by (162)}).
\]
The type-1 subsets are just the (k − 1)-element subsets of {1, 2, . . . , n − 1} with an n inserted into them; i.e., the map
\[
\{(k-1)\text{-element subsets of } \{1,2,\ldots,n-1\}\} \to \{\text{type-1 subsets}\},
\qquad S \mapsto S\cup\{n\}
\]
is a bijection. Hence,
\[
\sum_{\substack{S\subseteq\{1,2,\ldots,n\};\\ |S|=k;\\ S \text{ is type-1}}} q^{\operatorname{sum} S - (1+2+\cdots+k)}
= \sum_{\substack{S\subseteq\{1,2,\ldots,n-1\};\\ |S|=k-1}} \underbrace{q^{\operatorname{sum} S + n - (1+2+\cdots+k)}}_{\substack{=q^{\operatorname{sum} S + n - (1+2+\cdots+(k-1)) - k}\\ =q^{n-k}\, q^{\operatorname{sum} S - (1+2+\cdots+(k-1))}}}
= q^{n-k}\binom{n-1}{k-1}_q \qquad (\text{by (161)}).
\]
Adding together the two equalities just obtained, and recalling (160), we find
\[
\binom{n}{k}_q = q^{n-k}\binom{n-1}{k-1}_q + \binom{n-1}{k}_q.
\]
This proves Theorem 4.4.12 (a).
(b) This is somewhat similar to Theorem 4.4.12 (a) (but a little bit more com-
plicated). It is left as a homework exercise (Exercise A.3.4.1 (b)).
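The recurrence in Theorem 4.4.12 (a) also gives a fast, Pascal-triangle-style way to compute q-binomial coefficients. Here is a Python sketch (ours, not from the notes), working with coefficient lists:

```python
def q_binomial_rec(n, k):
    """Coefficient list of the q-binomial coefficient, computed via the
    recurrence of Theorem 4.4.12 (a):
        [n, k]_q = q^(n-k) * [n-1, k-1]_q + [n-1, k]_q."""
    if k < 0 or k > n:
        return [0]          # Convention 4.4.11 / Proposition 4.4.9
    if k == 0 or k == n:
        return [1]          # Proposition 4.4.10
    left = q_binomial_rec(n - 1, k - 1)   # gets multiplied by q^(n-k)
    right = q_binomial_rec(n - 1, k)
    out = [0] * max(len(left) + n - k, len(right))
    for i, c in enumerate(left):
        out[i + n - k] += c
    for i, c in enumerate(right):
        out[i] += c
    return out
```

For instance, `q_binomial_rec(5, 2)` returns `[1, 1, 2, 2, 2, 1, 1]`, i.e., 1 + q + 2q² + 2q³ + 2q⁴ + q⁵ + q⁶, as in Example 4.4.8.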
Next, we shall derive a q-analogue of the formula
\[
\binom{n}{k} = \frac{n(n-1)\cdots(n-k+1)}{k!} = \frac{n(n-1)\cdots(n-k+1)}{k(k-1)\cdots 1}:
\]
Theorem 4.4.13. Let n, k ∈ N satisfy n ≥ k. Then:
(a) We have
\[
\binom{n}{k}_q \cdot \left(1-q^k\right)\left(1-q^{k-1}\right)\cdots\left(1-q^1\right)
= (1-q^n)\left(1-q^{n-1}\right)\cdots\left(1-q^{n-k+1}\right).
\]
(b) We have
\[
\binom{n}{k}_q = \frac{(1-q^n)\left(1-q^{n-1}\right)\cdots\left(1-q^{n-k+1}\right)}{\left(1-q^k\right)\left(1-q^{k-1}\right)\cdots\left(1-q^1\right)}
\]
(in the ring Z[[q]] or in the field of rational functions over Q).
Note that part (b) of Theorem 4.4.13 is the more intuitive statement, but part
(a) is easier to substitute things in (because substituting something for q in part
(b) requires showing that the denominator remains invertible, whereas part (a)
has no denominators and thus requires no such diligence).
Proof of Theorem 4.4.13. This is left as a homework exercise (Exercise A.3.4.1 (c)).
(Use induction on n and Theorem 4.4.12.)
Remark 4.4.14. If I just gave you the fraction $\dfrac{(1-q^n)\left(1-q^{n-1}\right)\cdots\left(1-q^{n-k+1}\right)}{\left(1-q^k\right)\left(1-q^{k-1}\right)\cdots\left(1-q^1\right)}$, you would be surprised to hear that it is a polynomial (i.e., that the denominator divides the numerator) and has nonnegative coefficients. But given the way we defined $\binom{n}{k}_q$, you are now getting this for free from Theorem 4.4.13.
Theorem 4.4.13 (b) can be rewritten in a somewhat simpler way using the following notations:

Definition 4.4.15. Let n ∈ N.
(a) We set
\[
[n]_q := q^0 + q^1 + \cdots + q^{n-1} \in \mathbb{Z}[q].
\]
(b) We set
\[
[n]_q! := [1]_q [2]_q \cdots [n]_q \in \mathbb{Z}[q].
\]
(c) As usual, if a is an element of a ring A, then [n]_a and [n]_a! will mean the results of substituting a for q in [n]_q and [n]_q!, respectively. Thus, explicitly, [n]_a = a^0 + a^1 + · · · + a^{n−1} and [n]_a! = [1]_a [2]_a · · · [n]_a.

Note that
\[
[n]_q = \frac{1-q^n}{1-q} \qquad (\text{in } \mathbb{Z}[[q]] \text{ or in the ring of rational functions over } \mathbb{Q})
\]
and
\[
[n]_1 = n \qquad\text{and}\qquad [n]_1! = n!.
\]
Using these notations, Theorem 4.4.13 (b) rewrites as
\[
\binom{n}{k}_q = \frac{[n]_q!}{[k]_q!\,[n-k]_q!}
\]
(in the ring Z[[q]] or in the ring of rational functions over Q).
Next, recall the binomial formula
\[
(a+b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}.
\]
This formula holds whenever a and b are two elements of a commutative ring, or even more generally, whenever a and b are two commuting elements of an arbitrary ring. If we want to integrate a q into this formula, we need to modify it somewhat. This gives rise to two different "q-analogues" of the binomial formula. Both are important (one for the theory of partitions, and another for the theory of quantum groups). Here is the first one:

Theorem 4.4.19. Let K be a commutative ring, and let a, b ∈ K. Let n ∈ N. Then, in the polynomial ring K[q], we have
\[
\left(aq^0+b\right)\left(aq^1+b\right)\cdots\left(aq^{n-1}+b\right)
= \sum_{k=0}^{n} q^{k(k-1)/2}\binom{n}{k}_q a^k b^{n-k}.
\]
To prove this, we will use the following well-known lemma:
Lemma 4.4.20. Let L be a commutative ring. Let n ∈ N. Let [n] denote the set {1, 2, . . . , n}. Let a_1, a_2, . . . , a_n be n elements of L. Let b_1, b_2, . . . , b_n be n further elements of L. Then,
\[
\prod_{i=1}^{n}(a_i+b_i) = \sum_{S\subseteq[n]} \left(\prod_{i\in S} a_i\right)\left(\prod_{i\in[n]\setminus S} b_i\right). \tag{164}
\]

Lemma 4.4.20 is well-known and intuitively clear: When expanding the product $\prod_{i=1}^{n}(a_i+b_i) = (a_1+b_1)(a_2+b_2)\cdots(a_n+b_n)$, you obtain a sum of $2^n$ terms, each of which is a product of one addend chosen from each of the n sums a_1 + b_1, a_2 + b_2, . . . , a_n + b_n. This is precisely what the right hand side of (164) is. A rigorous proof of Lemma 4.4.20 can be found in [Grinbe15, Exercise 6.1 (a)].
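The subset expansion (164) is easy to check numerically. In the following Python sketch (ours, not from the notes), subsets of [n] are encoded as n-bit masks:

```python
def subset_expansion(a, b):
    """Right hand side of (164): the sum over all subsets S of {1, ..., n}
    of (product of a_i for i in S) * (product of b_i for i not in S).
    Subsets are encoded as n-bit masks."""
    n = len(a)
    total = 0
    for mask in range(1 << n):
        term = 1
        for i in range(n):
            term *= a[i] if (mask >> i) & 1 else b[i]
        total += term
    return total
```

For a = (2, 3) and b = (5, 7), both sides of (164) equal (2 + 5)(3 + 7) = 70.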
Proof of Theorem 4.4.19. Let [n] denote the set {1, 2, . . . , n}. We have
\[
\left(aq^0+b\right)\left(aq^1+b\right)\cdots\left(aq^{n-1}+b\right)
= \prod_{i=1}^{n}\left(aq^{i-1}+b\right)
= \sum_{S\subseteq[n]} \underbrace{\left(\prod_{i\in S} aq^{i-1}\right)}_{=a^{|S|}\prod_{i\in S} q^{i-1}}\underbrace{\left(\prod_{i\in[n]\setminus S} b\right)}_{=b^{|[n]\setminus S|}}
\]
\[
\left(\text{by Lemma 4.4.20, applied to } L = K[q],\ a_i = aq^{i-1} \text{ and } b_i = b\right)
\]
\[
= \sum_{S\subseteq[n]} a^{|S|} \underbrace{\left(\prod_{i\in S} q^{i-1}\right)}_{\substack{=q^{\operatorname{sum} S - |S|}\\ \text{(since the sum of the exponents } i-1\\ \text{over all } i\in S \text{ is precisely } \operatorname{sum} S - |S|\text{)}}} b^{|[n]\setminus S|}
= \sum_{k=0}^{n}\ \sum_{\substack{S\subseteq[n];\\ |S|=k}} \underbrace{a^{|S|}}_{\substack{=a^k\\ \text{(since } |S|=k\text{)}}}\, \underbrace{q^{\operatorname{sum} S - |S|}}_{\substack{=q^{\operatorname{sum} S - k}\\ \text{(since } |S|=k\text{)}}}\, \underbrace{b^{|[n]\setminus S|}}_{\substack{=b^{n-k}\\ \text{(since } S \text{ is a } k\text{-element}\\ \text{subset of the } n\text{-element set } [n],\\ \text{so that } |[n]\setminus S|=n-k\text{)}}}
\]
\[
= \sum_{k=0}^{n}\ \sum_{\substack{S\subseteq[n];\\ |S|=k}} a^k \underbrace{q^{\operatorname{sum} S - k}}_{\substack{=q^{\operatorname{sum} S - (1+2+\cdots+k)}\, q^{1+2+\cdots+(k-1)}\\ =q^{\operatorname{sum} S - (1+2+\cdots+k)}\, q^{k(k-1)/2}\\ \text{(since } 1+2+\cdots+(k-1)=k(k-1)/2\text{)}}} b^{n-k}
= \sum_{k=0}^{n} q^{k(k-1)/2}\underbrace{\left(\sum_{\substack{S\subseteq[n];\\ |S|=k}} q^{\operatorname{sum} S - (1+2+\cdots+k)}\right)}_{\substack{=\binom{n}{k}_q\\ \text{(by Proposition 4.4.7 (b),}\\ \text{since } [n]=\{1,2,\ldots,n\}\text{)}}} a^k b^{n-k}
\]
\[
= \sum_{k=0}^{n} q^{k(k-1)/2}\binom{n}{k}_q a^k b^{n-k}.
\]
This proves Theorem 4.4.19.
It is easy to check that these two matrices satisfy ba = −ab, that is, ba = ωab. Thus, Theorem 4.4.21 predicts that
\[
(a+b)^n = \sum_{k=0}^{n} \binom{n}{k}_\omega a^k b^{n-k}.
\]
• Linear algebra (i.e., the notions of vector spaces, subspaces, linear inde-
pendence, bases, matrices, Gaussian elimination, etc.) can be done over
any field. In fact, many of its concepts can be defined over any commu-
tative ring, but only over fields do they behave as nicely as they do over
the real numbers. Thus, much of the linear algebra that you have learned
over the real numbers remains valid over any field. (Exceptions are some
properties that rely on positivity or on characteristic 0.)
Thus, it makes sense to talk about finite-dimensional vector spaces over finite
fields. Such spaces are finite as sets, and thus can be viewed as combinatorial
objects. An n-dimensional vector space over a finite field F has size | F |n .
Now, we might wonder how many k-dimensional subspaces such an n-dimensional vector space has. The answer is given by the following theorem:

Theorem 4.4.24. Let F be a finite field. Let n, k ∈ N. Let V be an n-dimensional F-vector space. Then,
\[
(\text{\# of } k\text{-dimensional vector subspaces of } V) = \binom{n}{k}_{|F|}.
\]

Compare this with the classical fact that if S is an n-element set, then
\[
\binom{n}{k} = (\text{\# of } k\text{-element subsets of } S).
\]
This hints at an analogy between finite sets and finite-dimensional vector spaces.
Such an analogy does indeed exist; the expository paper [Cohn04] gives a great
overview over its reach.
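For the smallest finite field F = F_2, the subspace count in Theorem 4.4.24 can be verified by brute force: a subset of F_2^n containing 0 and closed under XOR (which is vector addition in F_2^n) is exactly a subspace. A Python sketch (ours, not from the notes; feasible only for tiny n):

```python
from itertools import combinations

def subspaces_f2(n, k):
    """Brute-force count of k-dimensional subspaces of F_2^n.
    Vectors are n-bit masks 0..2^n - 1; vector addition is XOR."""
    size = 1 << k        # a k-dimensional F_2-subspace has exactly 2^k elements
    count = 0
    for subset in combinations(range(1 << n), size):
        s = set(subset)
        if 0 in s and all(a ^ b in s for a in s for b in s):
            count += 1
    return count
```

Here `subspaces_f2(3, 1)` returns 7, which equals the q-binomial coefficient for n = 3, k = 1 evaluated at q = 2, namely 1 + 2 + 4 = 7; likewise `subspaces_f2(4, 2)` returns 35.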
The easiest proof of Theorem 4.4.24 uses three lemmas. The first one is a classical fact from linear algebra, which holds for any vector space (not necessarily finite-dimensional) over any field (not necessarily finite):

Lemma 4.4.25. Let V be a vector space over a field F. Let v_1, v_2, . . . , v_k be some vectors in V. Then, the list (v_1, v_2, . . . , v_k) is linearly independent if and only if we have
\[
\begin{aligned}
& v_1 \notin \underbrace{\operatorname{span}()}_{=\{0\}} &&\text{and}\\
& v_2 \notin \operatorname{span}(v_1) &&\text{and}\\
& v_3 \notin \operatorname{span}(v_1, v_2) &&\text{and}\\
& \cdots &&\text{and}\\
& v_k \notin \operatorname{span}(v_1, v_2, \ldots, v_{k-1}).
\end{aligned}
\]
\[
\beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_k v_k
= (\beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_{i-1} v_{i-1}) + \beta_i v_i + \underbrace{(\beta_{i+1} v_{i+1} + \beta_{i+2} v_{i+2} + \cdots + \beta_k v_k)}_{=0v_{i+1}+0v_{i+2}+\cdots+0v_k = 0}
= (\beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_{i-1} v_{i-1}) + \beta_i v_i,
\]
so that
\[
\beta_i v_i = \underbrace{(\beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_k v_k)}_{=0} - (\beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_{i-1} v_{i-1})
= -(\beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_{i-1} v_{i-1})
= (-\beta_1) v_1 + (-\beta_2) v_2 + \cdots + (-\beta_{i-1}) v_{i-1}
\in \operatorname{span}(v_1, v_2, \ldots, v_{i-1}).
\]
Since β_i ≠ 0, we thus obtain v_i ∈ span (v_1, v_2, . . . , v_{i−1}) (since span (v_1, v_2, . . . , v_{i−1}) is an F-vector subspace of V and thus preserved under scaling). This contradicts our assumption that v_i ∉ span (v_1, v_2, . . . , v_{i−1}). This contradiction shows that our assumption was wrong, and thus completes our proof of the "⇐=" direction of our claim.

Thus, both directions of our claim are proved. This concludes the proof of Lemma 4.4.25.
The next lemma we are going to use is itself an answer to a rather natural counting problem. Indeed, it is well-known (see, e.g., [19fco, Proposition 2.7.2]) that if X is an n-element set, and if k ∈ N, then
\[
(\text{\# of } k\text{-tuples of distinct elements of } X) = n(n-1)(n-2)\cdots(n-k+1). \tag{165}
\]
Here is an analogue of this fact for vector spaces over finite fields:

Lemma 4.4.26. Let F be a finite field, and let V be an n-dimensional F-vector space for some n ∈ N. Let k ∈ N. Then,
\[
(\text{\# of linearly independent } k\text{-tuples of vectors in } V)
= \left(|F|^n-|F|^0\right)\left(|F|^n-|F|^1\right)\cdots\left(|F|^n-|F|^{k-1}\right).
\]

Proof (sketched). By Lemma 4.4.25, a k-tuple (v_1, v_2, . . . , v_k) of vectors in V is linearly independent if and only if
\[
\begin{aligned}
& v_1 \notin \underbrace{\operatorname{span}()}_{=\{0\}} &&\text{and}\\
& v_2 \notin \operatorname{span}(v_1) &&\text{and}\\
& v_3 \notin \operatorname{span}(v_1, v_2) &&\text{and}\\
& \cdots &&\text{and}\\
& v_k \notin \operatorname{span}(v_1, v_2, \ldots, v_{k-1}).
\end{aligned}
\]
Hence, we can construct a linearly independent k-tuple (v_1, v_2, . . . , v_k) of vectors in V by choosing its entries one by one:

• First, we choose v_1, which must avoid the 1-element set span () = {0}; this gives |F|^n − |F|^0 options (since V has |F|^n elements).

• Then, we choose v_2, which must avoid the |F|-element set span (v_1); this gives |F|^n − |F|^1 options.

• And so on, until the last vector v_k in our list has been chosen.

Hence,
\[
(\text{\# of linearly independent } k\text{-tuples of vectors in } V)
= \left(|F|^n-|F|^0\right)\left(|F|^n-|F|^1\right)\cdots\left(|F|^n-|F|^{k-1}\right).
\]
Lemma 4.4.27 (Multijection principle). Let A and B be two finite sets. Let
m ∈ N. Let f : A → B be any map. Assume that each b ∈ B has exactly m
preimages under f (that is, for each b ∈ B, there are exactly m many elements
a ∈ A such that f ( a) = b). Then,
| A| = m · | B| .
Now, let f be the map that sends each linearly independent k-tuple of vectors in V to its span (which is a k-dimensional vector subspace of V). We claim:

Observation 1: Each k-dimensional vector subspace W of V has exactly $\left(|F|^k-|F|^0\right)\left(|F|^k-|F|^1\right)\cdots\left(|F|^k-|F|^{k-1}\right)$ preimages under f.

[Proof of Observation 1: Let W be a k-dimensional vector subspace of V. The preimages of W under f are precisely the linearly independent k-tuples of vectors in W (since any k linearly independent vectors in the k-dimensional space W automatically span W). Hence,
\[
(\text{\# of preimages of } W \text{ under } f)
= (\text{\# of linearly independent } k\text{-tuples of vectors in } W)
= \left(|F|^k-|F|^0\right)\left(|F|^k-|F|^1\right)\cdots\left(|F|^k-|F|^{k-1}\right)
\]
(by Lemma 4.4.26, applied to W and k instead of V and n). This proves Observation 1.]

Now, Observation 1 shows that each k-dimensional vector subspace of V has exactly $\left(|F|^k-|F|^0\right)\left(|F|^k-|F|^1\right)\cdots\left(|F|^k-|F|^{k-1}\right)$ preimages under f. Hence, Lemma 4.4.27 (applied to A = {linearly independent k-tuples of vectors in V}, B = {k-dimensional vector subspaces of V} and m = $\left(|F|^k-|F|^0\right)\left(|F|^k-|F|^1\right)\cdots\left(|F|^k-|F|^{k-1}\right)$) shows that
\[
(\text{\# of linearly independent } k\text{-tuples of vectors in } V)
= m \cdot (\text{\# of } k\text{-dimensional vector subspaces of } V).
\]
Therefore,
\[
\binom{n}{k}_q = \frac{(1-q^n)\left(1-q^{n-1}\right)\cdots\left(1-q^{n-k+1}\right)}{\left(1-q^k\right)\left(1-q^{k-1}\right)\cdots\left(1-q^1\right)} \qquad (\text{by Theorem 4.4.13 (b)})
\]
\[
= \frac{\left(1-q^{n-k+1}\right)\left(1-q^{n-k+2}\right)\cdots(1-q^n)}{\left(1-q^1\right)\left(1-q^2\right)\cdots\left(1-q^k\right)}
\qquad (\text{here, we have turned both products upside down})
\]
\[
= \frac{\prod_{i=1}^{k}\left(1-q^{n-k+i}\right)}{\prod_{i=1}^{k}\left(1-q^i\right)}
= \frac{1}{\prod_{i=1}^{k}\left(1-q^i\right)} \cdot \prod_{i=1}^{k}\left(1-q^{n-k+i}\right).
\]
However, we have $\lim\limits_{n\to\infty} q^n = 0$ (check this!71). Thus, for each i ∈ {1, 2, . . . , k}, we have
\[
\lim_{n\to\infty} q^{n-k+i} = 0
\]
(since the family $\left(q^{n-k+i}\right)_{n\ge k-i}$ is just a reindexing of the family $(q^n)_{n\ge0}$), and therefore
\[
\lim_{n\to\infty}\left(1-q^{n-k+i}\right) = 1.
\]
Hence,
\[
\lim_{n\to\infty} \sum_{i=1}^{k}\left(1-q^{n-k+i}\right) = \sum_{i=1}^{k} 1
\qquad\text{and}\qquad
\lim_{n\to\infty} \prod_{i=1}^{k}\left(1-q^{n-k+i}\right) = \prod_{i=1}^{k} 1.
\]
Thus, in particular,
\[
\lim_{n\to\infty} \prod_{i=1}^{k}\left(1-q^{n-k+i}\right) = \prod_{i=1}^{k} 1 = 1.
\]
Hence,
\[
\lim_{n\to\infty}\binom{n}{k}_q
= \lim_{n\to\infty}\left(\frac{1}{\prod_{i=1}^{k}\left(1-q^i\right)}\cdot\prod_{i=1}^{k}\left(1-q^{n-k+i}\right)\right)
= \frac{1}{\prod_{i=1}^{k}\left(1-q^i\right)}\cdot\underbrace{\lim_{n\to\infty}\prod_{i=1}^{k}\left(1-q^{n-k+i}\right)}_{=1}
= \frac{1}{\prod_{i=1}^{k}\left(1-q^i\right)}
= \prod_{i=1}^{k}\frac{1}{1-q^i}. \tag{166}
\]
to obtain that
\[
\lim_{n\to\infty} \sum_{\substack{\lambda \text{ is a partition}\\ \text{with largest part} \le n-k\\ \text{and length} \le k}} q^{|\lambda|}
= \sum_{\substack{\lambda \text{ is a partition}\\ \text{with length} \le k}} q^{|\lambda|}
\]
(by Theorem 4.1.19, with the letters x, m and k renamed as q, k and i). Thus, Proposition 4.4.28 is proved again.
4.5. References
Thus ends our foray into integer partitions and related FPSs. We will partially
revisit this topic later, as we discuss symmetric functions. Here are just a few
things we are omitting:
• Ramanujan's congruences for the partition function, including the identity
\[
\sum_{n\in\mathbb{N}} p(5n+4)\, x^n = 5 \prod_{i=1}^{\infty} \frac{\left(1-x^{5i}\right)^5}{\left(1-x^i\right)^6},
\]
whose proof is far from straightforward. All of these results (and some rather subtle generalizations) are shown in [Berndt06, Chapter 2] and [Hirsch17, Chapters 3 and 5]; see also [Aigner07, Chapter 3, Highlight] for a proof of the latter equality.
• The Rogers–Ramanujan identities, which can be used to count partitions into parts that are congruent to ±1 mod 5 or congruent to ±2 mod 5, respectively. These surprising identities can be proved using Proposition 4.4.28 and the Jacobi Triple Product Identity; see [Doyle19] for a self-contained writeup of this proof. A whole book [Sills18] has been written about these two identities and their many variants.
• The book [AndEri04] by Andrews and Eriksson is a beautiful (if not al-
ways fully precise) introduction to integer partitions and related topics.
5. Permutations
We now come back to the foundations of combinatorics: We will study permu-
tations of finite sets. I will assume that you know their most basic properties
(see, e.g., [Strick13, Appendix B] and [Goodma15, §1.5] for refreshers; see also
[Grinbe15, Chapter 5] for many more details on inversions), and will show
some more advanced results. For deeper treatments, see [Bona12], [Sagan01]
and [Stanle11, Chapter 1].
Definition 5.1.2. Let n ∈ Z. Then, [n] shall mean the set {1, 2, . . . , n}. This is
an n-element set if n ≥ 0, and is an empty set if n ≤ 0.
The symmetric group S[n] (consisting of all permutations of [n]) will be
denoted Sn and called the n-th symmetric group. Its size is n! (when n ≥ 0).
For instance, S3 is the group of all 6 permutations of the set [3] = {1, 2, 3}.
If two sets X and Y are in bijection, then their symmetric groups SX and SY
are isomorphic. Intuitively, this is clear (just think of Y as a “copy” of X with
all elements relabelled, and use this to reinterpret each permutation of X as a
permutation of Y). We can formalize this as the following proposition:
Proposition 5.1.3. Let X and Y be two sets, and let f : X → Y be a bijection. Then, the map
\[
S_f : S_X \to S_Y, \qquad \sigma \mapsto f \circ \sigma \circ f^{-1}
\]
is well-defined and is a group isomorphism.

A permutation σ ∈ S_n can be written down in two-line notation: this is the array
\[
\begin{pmatrix} p_1 & p_2 & \cdots & p_n \\ \sigma(p_1) & \sigma(p_2) & \cdots & \sigma(p_n) \end{pmatrix},
\]
where the entries p_1, p_2, . . . , p_n of the top row are the n elements of [n] in some order. Note that this is a standard notation for any kind of map from a finite set. Commonly, we pick p_i = i, so we get the array
\[
\begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n) \end{pmatrix}.
\]
The one-line notation ("OLN") of σ is the n-tuple (σ(1), σ(2), . . . , σ(n)) (i.e., the bottom row of the two-line notation above). Finally, the cycle digraph of σ is drawn as follows:

• For each i ∈ [n], draw a node labelled i.

• For each i ∈ [n], draw an arrow ("arc") from the node labelled i to the node labelled σ(i).
Example 5.1.6. Let σ : [4] → [4] be the map that sends the elements 1, 2, 3, 4 to 2, 4, 3, 1, respectively. Then, σ is a bijection, thus a permutation of [4].
(a) A two-line notation of σ is $\begin{pmatrix}1&2&3&4\\2&4&3&1\end{pmatrix}$. Another is $\begin{pmatrix}3&1&4&2\\3&2&1&4\end{pmatrix}$. Another is $\begin{pmatrix}4&1&3&2\\1&2&3&4\end{pmatrix}$. There are 24 two-line notations of σ in total, since we can freely choose the order in which the elements of [4] appear in the top row.
(b) The one-line notation of σ is (2, 4, 3, 1). Omitting the commas and the parentheses, we can rewrite this as 2431.
(c) One way to draw the cycle digraph of σ is: [figure omitted]. Another is: [figure omitted].
(When drawing cycle digraphs, one commonly tries to place the nodes in such a way as to make the arcs as short as possible. Thus, it is natural to keep the cycles separate in the picture. But formally speaking, any picture is fine, as long as the nodes and arcs don't overlap.)
Example 5.1.7. Let σ : [10] → [10] be the map that sends the elements 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 to 5, 4, 3, 2, 6, 10, 1, 9, 8, 7, respectively. Then, σ is a bijection, hence a permutation of [10]. The one-line notation of σ is (5, 4, 3, 2, 6, 10, 1, 9, 8, 7). If we omit the commas and the parentheses, then this becomes
5 4 3 2 6 (10) 1 9 8 7.
(We have put the 10 in parentheses to make its place clearer.) The cycle digraph of σ is: [figure omitted].
5.2.1. Transpositions

Definition 5.2.1. Let X be a set, and let i and j be two distinct elements of X. Then, t_{i,j} denotes the permutation of X that swaps i with j while leaving all other elements of X unchanged. Such permutations are called transpositions.
Example 5.2.2. The permutation t2,4 of the set [7] sends the elements
1, 2, 3, 4, 5, 6, 7 to 1, 4, 3, 2, 5, 6, 7, respectively. Its one-line notation (with com-
mas and parentheses omitted) is therefore 1432567.
Note that t_{i,j} = t_{j,i} for any two distinct elements i and j of a set X.

Definition 5.2.3. Let n ∈ N. For each i ∈ {1, 2, . . . , n − 1}, we let s_i denote the transposition t_{i,i+1} ∈ S_n. The transpositions s_1, s_2, . . . , s_{n−1} are called the simple transpositions (in S_n).

Example 5.2.4. The permutation s_2 of the set [7] sends the elements 1, 2, 3, 4, 5, 6, 7 to 1, 3, 2, 4, 5, 6, 7, respectively. Its one-line notation is therefore 1324567.
Here are some very basic properties of simple transpositions:72

Proposition 5.2.5. Let n ∈ N. Then:
(a) We have s_i^2 = id for each i ∈ {1, 2, . . . , n − 1}.
(b) We have s_i s_j = s_j s_i for any i, j ∈ {1, 2, . . . , n − 1} satisfying |i − j| > 1.
(c) We have s_i s_{i+1} s_i = s_{i+1} s_i s_{i+1} for each i ∈ {1, 2, . . . , n − 2}.

Proof. To prove that two permutations α and β of [n] are identical, it suffices to show that α(k) = β(k) for each k ∈ [n]. Using this strategy, we can prove all three parts of Proposition 5.2.5 straightforwardly (distinguishing cases corresponding to the relative positions of k, i, i + 1, j and j + 1). This is done in detail for Proposition 5.2.5 (c) in [Grinbe15, solution to Exercise 5.1 (a)]; the proofs of parts (a) and (b) are easier and LTTR.
5.2.2. Cycles
The following definition can be viewed as a generalization of Definition 5.2.1:

Definition 5.2.6. Let X be a set. Let i_1, i_2, . . . , i_k be finitely many distinct elements of X. Then, cyc_{i_1,i_2,...,i_k} denotes the permutation of X that sends
i_1 to i_2,  i_2 to i_3,  i_3 to i_4,  . . .,  i_{k−1} to i_k,  i_k to i_1
and leaves all other elements of X unchanged. In other words, cyc_{i_1,i_2,...,i_k} means the permutation of X that satisfies
\[
\operatorname{cyc}_{i_1,i_2,\ldots,i_k}(p) =
\begin{cases} i_{j+1}, & \text{if } p = i_j \text{ for some } j\in\{1,2,\ldots,k\};\\ p, & \text{otherwise} \end{cases}
\]
for every p ∈ X (where i_{k+1} is to be understood as i_1). Such a permutation is called a k-cycle.
72 Recall Definition 5.1.1. Thus, for example, s2i means si si = si ◦ si , whereas si s j means si ◦ s j .
The name "k-cycle" harkens back to the cycle digraph of cyc_{i_1,i_2,...,i_k}, which consists of a cycle of length k (containing the nodes i_1, i_2, . . . , i_k in this order) along with |X| − k isolated nodes (more precisely, each of the elements of X \ {i_1, i_2, . . . , i_k} has an arrow from itself to itself in this cycle digraph). Here is an example:

Example 5.2.7. Let X = [8]. The permutation cyc_{2,6,5} of X sends
2 to 6,  6 to 5,  5 to 2
and leaves all other elements of X unchanged. Thus, this permutation has OLN 16342578 and cycle digraph: [figure omitted].
Example 5.2.8. Let X be a set. If i and j are two distinct elements of X, then cyc_{i,j} = t_{i,j}. Thus, the 2-cycles in S_X are precisely the transpositions in S_X, so there are $\binom{|X|}{2}$ many of them (since any 2-element subset {i, j} of X gives rise to a transposition t_{i,j}, and this assignment of transpositions to 2-element subsets is bijective).
Note that the k-cycle cyci1 ,i2 ,...,ik is often denoted by (i1 , i2 , . . . , ik ), but I will not
use this notation here, since it clashes with the standard notation for k-tuples.
Exercise 5.2.2.1. Let n ∈ N and let k ∈ [n]. Let X be an n-element set. How
many k-cycles exist in SX ?
Solution to Exercise 5.2.2.1 (sketched). First, we note that there is exactly one 1-
cycle in SX (for n > 0), since a 1-cycle is just the identity map. This should be
viewed as a degenerate case; thus, we WLOG assume that k > 1.
Note that
\[
\operatorname{cyc}_{i_1,i_2,\ldots,i_k} = \operatorname{cyc}_{i_2,i_3,\ldots,i_k,i_1} = \operatorname{cyc}_{i_3,i_4,\ldots,i_k,i_1,i_2} = \cdots.
\]
That is, cyc_{i_1,i_2,...,i_k} does not change if we cyclically rotate the list (i_1, i_2, . . . , i_k).

Any k-cycle cyc_{i_1,i_2,...,i_k} uniquely determines the elements i_1, i_2, . . . , i_k up to cyclic rotation (since k > 1). Indeed, if σ = cyc_{i_1,i_2,...,i_k} is a k-cycle, then the elements i_1, i_2, . . . , i_k are precisely the elements of X that are not fixed by σ (it is here that we use our assumption k > 1), and furthermore, if we know which of these elements is i_1, then we can reconstruct the remaining elements i_2, i_3, . . . , i_k recursively by
\[
i_2 = \sigma(i_1), \quad i_3 = \sigma(i_2), \quad i_4 = \sigma(i_3), \quad \ldots, \quad i_k = \sigma(i_{k-1})
\]
(that is, i_2, i_3, . . . , i_k are obtained by iteratively applying σ to i_1). Therefore, if σ ∈ S_X is a k-cycle, then there are precisely k lists (i_1, i_2, . . . , i_k) for which σ = cyc_{i_1,i_2,...,i_k} (coming from the k possibilities for which of the k non-fixed points of σ should be i_1).
Hence, the map
f : {k-tuples of distinct elements of X } → {k-cycles in SX } ,
(i1 , i2 , . . . , ik ) 7→ cyci1 ,i2 ,...,ik
is a k-to-1 map (i.e., each k-cycle in SX has precisely k preimages under this
map). Therefore, Lemma 4.4.27 (applied to m = k and
A = {k-tuples of distinct elements of X } and B = {k-cycles in SX }) yields
(# of k-tuples of distinct elements of X ) = k · (# of k-cycles in SX ) .
Therefore,
\[
(\text{\# of } k\text{-cycles in } S_X)
= \frac{1}{k} \cdot \underbrace{(\text{\# of } k\text{-tuples of distinct elements of } X)}_{\substack{= n(n-1)(n-2) \cdots (n-k+1) \\ \text{(by (165), since } X \text{ is an } n\text{-element set)}}}
= \frac{1}{k} \cdot n(n-1)(n-2) \cdots (n-k+1)
= \binom{n}{k} \cdot (k-1)! \qquad \text{(by a bit of simple algebra)}.
\]
This is the answer to Exercise 5.2.2.1 in the case k > 1. Hence, Exercise 5.2.2.1
is solved.
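We can double-check this count by brute force for small n. Here is a sketch in Python (the helper names `is_k_cycle` and `count_k_cycles` are mine, not notation from these notes):

```python
from itertools import permutations
from math import comb, factorial

def is_k_cycle(perm, k):
    """Check whether perm (a tuple with perm[i-1] = image of i, a permutation
    of [n]) is a k-cycle: for k = 1 it must be the identity; for k > 1 it must
    move exactly k elements, and these must form a single orbit."""
    n = len(perm)
    moved = [i for i in range(1, n + 1) if perm[i - 1] != i]
    if k == 1:
        return not moved
    if len(moved) != k:
        return False
    orbit, x = set(), moved[0]
    while x not in orbit:
        orbit.add(x)
        x = perm[x - 1]
    return orbit == set(moved)

def count_k_cycles(n, k):
    return sum(1 for p in permutations(range(1, n + 1)) if is_k_cycle(p, k))

# For k > 1, the count should be binom(n, k) * (k-1)!:
for n in range(2, 7):
    for k in range(2, n + 1):
        assert count_k_cycles(n, k) == comb(n, k) * factorial(k - 1)
```

For instance, S5 contains 20 many 3-cycles, matching $\binom{5}{3} \cdot 2! = 20$.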
(In LaTeX, the symbol “ℓ” is obtained by typing “\ell”. If you just type “l”,
you will get a plain italic “l” instead.)
An inversion of a permutation σ can thus be viewed as a pair of elements of
[n] whose relative order changes when σ is applied to them. (We require this
pair (i, j) to satisfy i < j in order not to count each such pair doubly.)
Example 5.3.2. Let π ∈ S4 be the permutation with OLN 3142. The inversions
of π are (1, 2), (1, 4) and (3, 4).
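The inversions of a permutation can be read off mechanically from its OLN; here is a small Python sketch (the helper name is mine):

```python
def inversions(w):
    """All inversions of a permutation w in one-line notation
    (w[i-1] is the image of i): pairs (i, j) with i < j and w(i) > w(j)."""
    n = len(w)
    return [(i, j) for i in range(1, n + 1)
                   for j in range(i + 1, n + 1)
                   if w[i - 1] > w[j - 1]]

print(inversions((3, 1, 4, 2)))  # OLN 3142 -> [(1, 2), (1, 4), (3, 4)]
```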
(d) If n ≥ 1, then
\[
(\text{\# of } \sigma \in S_n \text{ with } \ell(\sigma) = 1) = n - 1.
\]
If n ≥ 2, then
\[
(\text{\# of } \sigma \in S_n \text{ with } \ell(\sigma) = 2) = \frac{(n-2)(n+1)}{2}.
\]
Indeed, the only permutations σ ∈ Sn with ℓ (σ) = 2 are the products si s j
with 1 ≤ i < j < n as well as the products si si−1 with i ∈ {2, 3, . . . , n − 1}. If
n ≥ 2, then there are $\frac{(n-2)(n+1)}{2}$ such products (and they are all distinct).
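Both formulas in part (d) are easy to confirm by exhaustive search over small symmetric groups; a Python sketch (the helper name is mine):

```python
from itertools import permutations

def ell(w):
    """Coxeter length = number of inversions of w (one-line notation)."""
    n = len(w)
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])

for n in range(2, 8):
    perms = list(permutations(range(1, n + 1)))
    assert sum(1 for w in perms if ell(w) == 1) == n - 1
    assert sum(1 for w in perms if ell(w) == 2) == (n - 2) * (n + 1) // 2
```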
Example 5.3.4. Written in one-line notation, the permutations of the set [3]
are 123, 132, 213, 231, 312, and 321. Their lengths are 0, 1, 1, 2, 2 and 3,
respectively.
Thus,
\[
\sum_{\sigma \in S_n} x^{\ell(\sigma)}
= \prod_{i=1}^{n-1} \left( 1 + x + x^2 + \cdots + x^i \right)
= (1 + x) \left( 1 + x + x^2 \right) \left( 1 + x + x^2 + x^3 \right) \cdots \left( 1 + x + x^2 + \cdots + x^{n-1} \right)
= [n]_x!.
\]
(Here, we are using Definition 4.4.15, so that [n] x ! means the result of substi-
tuting x for q in the q-factorial [n]q !.)
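The identity $\sum_{\sigma \in S_n} x^{\ell(\sigma)} = [n]_x!$ can likewise be confirmed for small n by comparing coefficient lists; a Python sketch (all helper names are mine):

```python
from itertools import permutations
from collections import Counter

def ell(w):
    n = len(w)
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])

def length_poly(n):
    """Coefficient list of sum_{sigma in S_n} x^ell(sigma), constant term first."""
    c = Counter(ell(w) for w in permutations(range(1, n + 1)))
    return [c[d] for d in range(max(c) + 1)]

def x_factorial(n):
    """Coefficient list of [n]_x! = prod_{i=1}^{n-1} (1 + x + ... + x^i)."""
    poly = [1]
    for i in range(1, n):
        new = [0] * (len(poly) + i)
        for a, ca in enumerate(poly):
            for b in range(i + 1):
                new[a + b] += ca
        poly = new
    return poly

for n in range(1, 7):
    assert length_poly(n) == x_factorial(n)
```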
(The last equality sign here is clear, since the j ∈ [n] satisfying i < j are
precisely the j ∈ {i + 1, i + 2, . . . , n}.)
(b) For each m ∈ Z, we let [m]0 denote the set {0, 1, . . . , m}. (This is an
empty set when m < 0.)
(c) We let Hn denote the set
\[
[n-1]_0 \times [n-2]_0 \times \cdots \times [n-n]_0
= \left\{ (j_1, j_2, \ldots, j_n) \in \mathbb{N}^n \mid j_i \le n - i \text{ for each } i \in [n] \right\}.
\]
(d) We let L denote the map
\[
L : S_n \to H_n, \qquad \sigma \mapsto (\ell_1(\sigma), \ell_2(\sigma), \ldots, \ell_n(\sigma)).
\]
(This map is well-defined, since each σ ∈ Sn and each i ∈ [n] satisfy ℓi (σ) ∈
{0, 1, . . . , n − i } = [n − i ]0 .)
(e) If σ ∈ Sn is a permutation, then the n-tuple L (σ ) =
(ℓ1 (σ) , ℓ2 (σ) , . . . , ℓn (σ)) is called the Lehmer code (or just the code) of σ.
σ      L(σ)
123    (0, 0, 0)
132    (0, 1, 0)
213    (1, 0, 0)
231    (1, 1, 0)
312    (2, 0, 0)
321    (2, 1, 0)
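The table above can be reproduced mechanically; a short Python sketch (the helper name is mine):

```python
from itertools import permutations

def lehmer_code(w):
    """Lehmer code of w (one-line notation): the i-th entry counts
    the j > i with w(j) < w(i)."""
    n = len(w)
    return tuple(sum(1 for j in range(i + 1, n) if w[j] < w[i])
                 for i in range(n))

# reproduces the table of Lehmer codes for S_3:
for w in permutations((1, 2, 3)):
    print(''.join(map(str, w)), lehmer_code(w))
```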
(since the elements of [n] \ {σ (1) , σ (2) , . . . , σ (i )} are precisely the entries that
appear after σ (i ) in the OLN of σ). We can replace the set [n] \ {σ (1) , σ (2) , . . . , σ (i )}
by [n] \ {σ (1) , σ (2) , . . . , σ (i − 1)} in this equality73 (this will not change the
# of elements of this set that are smaller than σ (i ), because it only inserts the
element σ (i ) into the set, and obviously this element σ (i ) is not smaller than
σ (i )). Thus, we obtain
Therefore,
73 If i = 1, then the set {σ (1) , σ (2) , . . . , σ (i − 1)} is empty, so that we have [n] \
{σ (1) , σ (2) , . . . , σ (i − 1)} = [n] in this case.
This map σ
• is always well-defined (indeed, we never run out of values in the pro-
cess of constructing σ, because of the following argument: each i ∈
[n] satisfies ji ≤ n − i 74 , and thus the ( n − i + 1)-element set [ n ] \
if and only if there exists some k ∈ [n] such that ak < bk and (ai = bi for all i < k).
For example, (4, 1, 2, 5) <lex (4, 1, 3, 0) and (1, 1, 0, 1) <lex (2, 0, 0, 0). The re-
lation (172) is usually pronounced as “( a1 , a2 , . . . , an ) is lexicographically smaller
than (b1 , b2 , . . . , bn )”; the word “lexicographic” comes from the idea that if num-
bers were letters, then a “word” a1 a2 · · · an would appear earlier in a dictionary
than b1 b2 · · · bn if and only if ( a1 , a2 , . . . , an ) <lex (b1 , b2 , . . . , bn ).
Now, the following is easy to see:
Actually, it is not hard to show that the relation <lex is a total order on the
set Zn (known as the lexicographic order); however, Proposition 5.3.11 is the only
part of this statement that we will need.
Then,
From these two equalities, we can easily see that ℓk (σ ) < ℓk (τ ). (In fact, any
element of Z that is smaller than σ (k) must also be smaller than τ (k ) (since
σ (k ) < τ (k)), but there is at least one element of Z that is smaller than τ (k )
but not smaller than σ (k) (namely, the element σ (k )). Hence, there are fewer
elements of Z that are smaller than σ (k ) than there are elements of Z that are
smaller than τ (k).)
Combining (175) with ℓk (σ ) < ℓk (τ ), we obtain (ℓ1 (σ) , ℓ2 (σ) , . . . , ℓn (σ)) <lex
(ℓ1 (τ ) , ℓ2 (τ ) , . . . , ℓn (τ )) (by Definition 5.3.10). This proves Proposition 5.3.12.
\[
= \left( \sum_{j_1 \in [n-1]_0} x^{j_1} \right) \left( \sum_{j_2 \in [n-2]_0} x^{j_2} \right) \cdots \left( \sum_{j_n \in [n-n]_0} x^{j_n} \right)
\]
(the last equality sign here is easy to check). This proves Proposition 5.3.5.
Proof of Proposition 5.3.13 (sketched). (See [Grinbe15, Exercise 5.2 (f)] for details.)
Recall that ℓ (σ) is the # of inversions of σ, while ℓ(σ−1 ) is the # of inversions
of σ−1 .
Recall also that an inversion of σ is a pair (i, j) ∈ [n] × [n] such that i < j and
σ (i ) > σ ( j). Likewise, an inversion of σ−1 is a pair (u, v) ∈ [n] × [n] such that
u < v and σ−1 (u) > σ−1 (v).
Thus, if (i, j) is an inversion of σ, then (σ ( j) , σ (i )) is an inversion of σ−1 .
principle yields
\[
(\text{\# of inversions of } \sigma) = (\text{\# of inversions of } \sigma^{-1}).
\]
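Proposition 5.3.13 (that is, ℓ(σ) = ℓ(σ⁻¹)) can also be confirmed by brute force for small n; a Python sketch (helper names mine):

```python
from itertools import permutations

def ell(w):
    """Number of inversions of w in one-line notation."""
    n = len(w)
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])

def inverse(w):
    """One-line notation of the inverse permutation."""
    inv = [0] * len(w)
    for pos, val in enumerate(w, start=1):
        inv[val - 1] = pos
    return tuple(inv)

assert all(ell(w) == ell(inverse(w))
           for n in range(1, 7)
           for w in permutations(range(1, n + 1)))
```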
(a) We have
\[
\ell(\sigma s_k) = \begin{cases} \ell(\sigma) + 1, & \text{if } \sigma(k) < \sigma(k+1); \\ \ell(\sigma) - 1, & \text{if } \sigma(k) > \sigma(k+1). \end{cases}
\]
(b) We have
\[
\ell(s_k \sigma) = \begin{cases} \ell(\sigma) + 1, & \text{if } \sigma^{-1}(k) < \sigma^{-1}(k+1); \\ \ell(\sigma) - 1, & \text{if } \sigma^{-1}(k) > \sigma^{-1}(k+1). \end{cases}
\]
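Lemma 5.3.14 can be sanity-checked by brute force; here is a Python sketch verifying part (b) for n = 5 (the helper names are mine):

```python
from itertools import permutations

def ell(w):
    n = len(w)
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])

def s_then(k, w):
    """One-line notation of s_k composed after w (i.e., s_k o w):
    swap the values k and k+1 wherever they occur in w."""
    swap = {k: k + 1, k + 1: k}
    return tuple(swap.get(v, v) for v in w)

n = 5
for w in permutations(range(1, n + 1)):
    inv = [0] * n                       # inv[v-1] = position of value v = w^{-1}(v)
    for pos, val in enumerate(w, start=1):
        inv[val - 1] = pos
    for k in range(1, n):
        delta = 1 if inv[k - 1] < inv[k] else -1
        assert ell(s_then(k, w)) == ell(w) + delta
```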
We will only outline the proof of Lemma 5.3.14; a detailed proof can be found
in [Grinbe15, Exercise 5.2 (a)] (although not completely the same proof).
Proof of Lemma 5.3.14 (sketched). (b) The OLN77 of sk σ is obtained from the OLN
of σ by swapping the two entries k and k + 1. This is best seen on an example:
For example, if σ = 512364 (in OLN), then s3 σ = 512463. In general, this
follows by observing that
\[
(s_k \sigma)(i) = s_k(\sigma(i)) = \begin{cases} k+1, & \text{if } \sigma(i) = k; \\ k, & \text{if } \sigma(i) = k+1; \\ \sigma(i), & \text{otherwise} \end{cases} \qquad \text{for each } i \in [n].
\]
Let us now use this observation to see how the inversions of sk σ differ from
the inversions of σ. Indeed, let us call an inversion (i, j) of a permutation τ
exceptional if we have τ (i ) = k + 1 and τ ( j) = k. All other inversions of τ will
be called non-exceptional.
Now, we make the following observation:
(# of non-exceptional inversions of sk σ)
= (# of non-exceptional inversions of σ) . (176)
• If σ−1 (k ) < σ−1 (k + 1), then the permutation sk σ has a unique excep-
tional inversion, whereas the permutation σ has none.
• If σ−1 (k) > σ−1 (k + 1), then the permutation σ has a unique exceptional
inversion, whereas the permutation sk σ has none.
Thus,
\[
(\text{\# of exceptional inversions of } s_k \sigma)
= (\text{\# of exceptional inversions of } \sigma)
+ \begin{cases} 1, & \text{if } \sigma^{-1}(k) < \sigma^{-1}(k+1); \\ -1, & \text{if } \sigma^{-1}(k) > \sigma^{-1}(k+1). \end{cases} \tag{177}
\]
Since every inversion is either exceptional or non-exceptional, combining this with (176) yields
\[
(\text{\# of inversions of } s_k \sigma)
= (\text{\# of inversions of } \sigma) + \begin{cases} 1, & \text{if } \sigma^{-1}(k) < \sigma^{-1}(k+1); \\ -1, & \text{if } \sigma^{-1}(k) > \sigma^{-1}(k+1) \end{cases}
= \begin{cases} (\text{\# of inversions of } \sigma) + 1, & \text{if } \sigma^{-1}(k) < \sigma^{-1}(k+1); \\ (\text{\# of inversions of } \sigma) - 1, & \text{if } \sigma^{-1}(k) > \sigma^{-1}(k+1). \end{cases}
\]
In other words,
\[
\ell(s_k \sigma) = \begin{cases} \ell(\sigma) + 1, & \text{if } \sigma^{-1}(k) < \sigma^{-1}(k+1); \\ \ell(\sigma) - 1, & \text{if } \sigma^{-1}(k) > \sigma^{-1}(k+1) \end{cases}
\]
where
Proof of Proposition 5.3.15 (sketched). (b) This follows by a diligent analysis of the
possible interactions between an inversion and composition by ti,j . To be more
concrete:
• The fact that ℓ(σti,j ) = ℓ(σ) − 2|Q| − 1 when σ(i) > σ(j) is [Grinbe15,
Exercise 5.20]. A straightforward solution was given by Elafandi in [18f-hw4se].
(The solution given in [Grinbe15, Exercise 5.20] is more circuitous, as it
uses summation tricks to bypass case distinctions.)

• The fact that ℓ(σti,j ) = ℓ(σ) + 2|R| + 1 when σ(i) < σ(j) follows by
applying the previous fact to σti,j instead of σ. (Indeed, if σ(i) < σ(j),
then (σti,j )(i) = σ(j) > σ(i) = (σti,j )(j) and (σti,j )ti,j = σ (since
t²i,j = id). Moreover, when we replace σ by σti,j , the sets Q and R trade places.)
Theorem 5.3.17 (1st reduced word theorem for the symmetric group). Let
n ∈ N and σ ∈ Sn . Then:
(a) We can write σ as a composition (i.e., product) of ℓ (σ ) simples.
(b) The number ℓ (σ ) is the smallest p ∈ N such that we can write σ as a
composition of p simples.
[Keep in mind: The composition of 0 simples is id, since id is the neutral
element of the group Sn .]
Example 5.3.18. Let σ ∈ S4 be the permutation 4132 (in OLN). How can we
write σ as a composition of simples? There are several ways to do this; for
example,
σ = s2 s3 s2 s1 = s3 s2 s3 s1 = s3 s2 s1 s3 = s2 s1 s1 s3 s2 s1 = s2 s1 s3 s1 s2 s1 = · · ·

(here, the second equality sign uses s2 s3 s2 = s3 s2 s3 , and the third uses s1 s3 = s3 s1 ).
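These factorizations can be checked mechanically, using the standard convention (f ∘ g)(x) = f(g(x)); a Python sketch (the helper names are mine):

```python
def simple(i, n):
    """One-line notation of the simple transposition s_i in S_n."""
    w = list(range(1, n + 1))
    w[i - 1], w[i] = w[i], w[i - 1]
    return tuple(w)

def compose(f, g):
    """One-line notation of f o g (first apply g, then f)."""
    return tuple(f[g[x] - 1] for x in range(len(g)))

def product_of_simples(word, n=4):
    result = tuple(range(1, n + 1))     # identity permutation
    for i in word:
        result = compose(result, simple(i, n))
    return result

# all the factorizations listed above yield the permutation 4132:
words = [(2, 3, 2, 1), (3, 2, 3, 1), (3, 2, 1, 3),
         (2, 1, 1, 3, 2, 1), (2, 1, 3, 1, 2, 1)]
for word in words:
    assert product_of_simples(word) == (4, 1, 3, 2)
```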
The convex hull of all these n! many points Vσ (for σ ∈ Sn ) is a polytope (i.e.,
a bounded convex polyhedron in Rn ). This polytope is known as the per-
mutahedron (corresponding to n). It is actually (n − 1)-dimensional, since
all its vertices lie on the hyperplane with equation x1 + x2 + · · · + xn =
1 + 2 + · · · + n. It can be shown (see, e.g., [GaiGup77]) that two vertices Vσ and Vτ of this polytope are joined by an edge if and only if the permutations σ and τ differ by swapping the values k and k + 1 for some k ∈ [n − 1].
The (intuitively obvious) fact that any two vertices of a polytope can be con-
nected by a sequence of edges therefore yields that any σ ∈ Sn can be written
as a product of simples. This is a weaker version of Theorem 5.3.17 (a).
[Figures: the permutahedron for n = 3, a hexagon in the plane x1 + x2 + x3 = 6
with vertices (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1); and the
permutahedron for n = 4, a three-dimensional polytope whose 24 vertices are the
points (σ(1), σ(2), σ(3), σ(4)) for σ ∈ S4, from (1, 2, 3, 4) to (4, 3, 2, 1).]
Thus, the IH tells us that we can apply Theorem 5.3.17 (a) to σsk instead of σ.
We conclude that we can write σsk as a composition of ℓ (σsk ) simples. In other
words, we can write σsk as a composition of h simples (since ℓ (σsk ) = h). That
is, we have
\[
\sigma = \underbrace{(\sigma s_k)}_{= s_{i_1} s_{i_2} \cdots s_{i_h}} \underbrace{s_k^{-1}}_{= s_k} = s_{i_1} s_{i_2} \cdots s_{i_h} s_k.
\]
(This follows from Lemma 5.3.14 (a), since both numbers ℓ (σ ) + 1 and ℓ (σ) − 1
This proves (179), and thus concludes our proof of Theorem 5.3.17 (b).
Proof of Corollary 5.3.20 (sketched). (a) (See [Grinbe15, Exercise 5.2 (b)] for de-
tails.)
For any σ ∈ Sn and any k ∈ [n − 1], we have
(This follows from Lemma 5.3.14 (a), since both numbers ℓ (σ) + 1 and ℓ (σ) − 1
are congruent to ℓ (σ) + 1 modulo 2.)
Now, let σ ∈ Sn and τ ∈ Sn . Theorem 5.3.17 (a) yields that we can write τ as a
composition of ℓ (τ ) simples. In other words, we can write τ as τ = sk1 sk2 · · · skq
Now, starting from each ×, we draw a vertical ray downwards and a hori-
zontal ray eastwards. I will call these two rays the Lehmer lasers. Here is how
the rays look in our running example:
Now, we draw a little circle ◦ into each cell that is not hit by any laser.
[Figure: the matrix of the running example with a letter sj written into each
circled cell; its rows contain s4 s3 s2 s1 , then s3 , then s4 , then s5 .]
Finally, read the matrix row by row, starting with the top row, and reading
each row from left to right. The result, in our running example, is
s4 s3 s2 s1 s3 s4 s5 .
The claim of Remark 5.3.23 can be restated in a more direct (if less visual)
fashion:
We refer to [Grinbe15, Exercise 5.21 parts (b) and (c)] for a (detailed, but an-
noyingly long) proof of Proposition 5.3.24. (You are probably better off proving
it yourself.)
\[
(-1)^{\sigma} = \prod_{1 \le i < j \le n} \frac{\sigma(i) - \sigma(j)}{i - j} \qquad \text{for each } \sigma \in S_n.
\]
Proof of Proposition 5.4.2 (sketched). Most of this follows easily from what we
have proved above, but here are references to complete proofs:
(a) This is [Grinbe15, Proposition 5.15 (a)], and follows easily from ℓ (id) = 0.
(d) This is [Grinbe15, Proposition 5.15 (c)], and follows easily from Corollary
5.3.20 (a). A different proof appears in [Strick13, Proposition B.13].
(b) This is [Grinbe15, Exercise 5.10 (b)], and follows easily from Exercise
A.4.3.2 (a).
(c) This is [Grinbe15, Exercise 5.17 (d)], and follows easily from Exercise
A.4.2.1 (a) and Exercise A.4.3.2 (b) using Proposition 5.4.2 (d).
(e) This is [Grinbe15, Proposition 5.28], and follows by induction from Propo-
sition 5.4.2 (d).
(f) This is [Grinbe15, Proposition 5.15 (d)], and follows easily from Proposi-
tion 5.4.2 (d) or from Proposition 5.3.13.
(h) This is [Grinbe15, Exercise 5.13 (a)] (or, rather, the straightforward gener-
alization of [Grinbe15, Exercise 5.13 (a)] to arbitrary commutative rings). The
proof is fairly easy: Each factor xσ(i) − xσ( j) on the left hand side appears also
on the right hand side, albeit with a different sign if (i, j) is an inversion of
σ. Thus, the products on both sides agree up to a sign, which is precisely
(−1)ℓ(σ) = (−1)σ .
(g) This is [Grinbe15, Exercise 5.13 (c)], and is a particular case of Proposition
5.4.2 (h).
Sn → {1, −1} ,
σ 7→ (−1)σ
79 See, e.g., https://groupprops.subwiki.org/wiki/Alternating_groups_are_simple or
[Goodma15, Theorem 10.3.4] for a proof.
Proof of Corollary 5.4.6 (sketched). The symmetric group Sn contains the simple
transposition s1 (since n ≥ 2). If σ ∈ Sn , then
Both sides of this equality must furthermore equal n!/2, since they add up
to |Sn | = n!. This proves Corollary 5.4.6. (See [Grinbe15, Exercise 5.4] for
details.)
As a consequence of Corollary 5.4.6, we see that
Proposition 5.4.7. Let X be a finite set. We want to define the sign of any
permutation of X.
Fix a bijection ϕ : X → [n] for some n ∈ N. (Such a bijection always exists,
since X is finite.) For every permutation σ of X, set
\[
(-1)^{\sigma}_{\varphi} := (-1)^{\varphi \circ \sigma \circ \varphi^{-1}}.
\]
[Figure: the cycle digraph of σ, consisting of the cycle 1 → 4 → 3 → 1, the
cycle 2 → 6 → 2, the cycle 7 → 9 → 7, and the loops at 5 and 8.]
thus, we can view the permutation σ as acting on the five subsets {1, 4, 3},
{2, 6}, {5}, {7, 9} and {8} of [9] separately. On the first of these five subsets,
σ acts as the 3-cycle cyc1,4,3 (in the sense that σ (k) = cyc1,4,3 (k ) for each
k ∈ {1, 4, 3}). On the second, it acts as the 2-cycle cyc2,6 . On the third, it acts
as the 1-cycle cyc5 (which, of course, is the identity map). On the fourth, it
acts as the 2-cycle cyc7,9 . On the fifth, it acts as the 1-cycle cyc8 (which, again,
is just the identity map). Combining these observations, we conclude that
\[
\sigma(k) = \left( \operatorname{cyc}_{1,4,3} \circ \operatorname{cyc}_{2,6} \circ \operatorname{cyc}_{5} \circ \operatorname{cyc}_{7,9} \circ \operatorname{cyc}_{8} \right)(k) \qquad \text{for each } k \in [9]
\]
and thus
\[
\sigma = \operatorname{cyc}_{1,4,3} \circ \operatorname{cyc}_{2,6} \circ \operatorname{cyc}_{5} \circ \operatorname{cyc}_{7,9} \circ \operatorname{cyc}_{8}. \tag{184}
\]
• we have
Such a list is called a disjoint cycle decomposition (or short DCD) of σ. Its
entries (which themselves are lists of elements of X) are called the cycles of
σ.
(b) Any two DCDs of σ can be obtained from each other by (repeatedly)
swapping the cycles with each other, and rotating each cycle (i.e., replacing
ai,1 , ai,2 , . . . , ai,ni by ai,2 , ai,3 , . . . , ai,ni , ai,1 ).
(c) Now assume that X is a set of integers (or, more generally, any totally
ordered finite set). Then, there is a unique DCD
\[
(a_{1,1}, a_{1,2}, \ldots, a_{1,n_1}),\ (a_{2,1}, a_{2,2}, \ldots, a_{2,n_2}),\ \ldots,\ (a_{k,1}, a_{k,2}, \ldots, a_{k,n_k})
\]
of σ with the following two properties:
• we have ai,1 ≤ ai,p for each i ∈ [k ] and each p ∈ [ni ] (that is, each cycle
in this DCD is written with its smallest entry first), and
• we have a1,1 > a2,1 > · · · > ak,1 (that is, the cycles appear in this DCD
in the order of decreasing first entries).
Example 5.5.3. Let σ ∈ S9 be the permutation from Example 5.5.1. Then, the
representation (184) shows that
(1, 4, 3) , (2, 6) , (5) , (7, 9) , (8)
is a DCD of σ. By swapping the five cycles of this DCD, and by rotating each
cycle, we can produce various other DCDs of σ, such as
(7, 9) , (6, 2) , (3, 1, 4) , (8) , (5) .
The unique DCD of σ that satisfies the two additional requirements of Theo-
rem 5.5.2 (c) is
(8) , (7, 9) , (5) , (2, 6) , (1, 4, 3) .
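The canonical DCD of Theorem 5.5.2 (c) is easy to compute; a Python sketch (the helper name is mine):

```python
def canonical_dcd(sigma):
    """The unique DCD from Theorem 5.5.2 (c): every cycle starts with its
    smallest element, and the cycles are sorted by decreasing first entries.
    Here sigma is a dict mapping each element of X to its image."""
    remaining = set(sigma)
    cycles = []
    for start in sorted(sigma):
        if start in remaining:          # start is the smallest element of a new cycle
            cycle, x = [], start
            while x in remaining:
                remaining.discard(x)
                cycle.append(x)
                x = sigma[x]
            cycles.append(tuple(cycle))
    cycles.sort(key=lambda c: -c[0])    # decreasing first entries
    return cycles

# the permutation of Example 5.5.1 (with cycles (1,4,3), (2,6), (5), (7,9), (8)):
sigma = {1: 4, 2: 6, 3: 1, 4: 3, 5: 5, 6: 2, 7: 9, 8: 8, 9: 7}
assert canonical_dcd(sigma) == [(8,), (7, 9), (5,), (2, 6), (1, 4, 3)]
```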
Proof of Theorem 5.5.2 (sketched). This is a classical result with an easy proof;
sadly, this easy proof does not present well in writing. I will try to be as clear
as the situation allows. Some familiarity with digraphs (= directed graphs) is
recommended80 .
(a) Let D be the cycle digraph of σ, as in Example 5.5.1. This cycle digraph
D has the following two properties:
• Outbound uniqueness: For each node i, there is exactly one arc outgoing
from i. (Indeed, this is the arc from i to σ (i ), as should be clear from the
construction of D .)
• Inbound uniqueness: For each node i, there is exactly one arc incoming into
i. (Indeed, this is the arc from σ−1 (i ) to i, since σ is a permutation and
therefore invertible.)
Using these two properties, we will now show that the cycle digraph D con-
sists of several node-disjoint cycles (i.e., several cycles that pairwise share no
nodes).
Indeed, let us first observe the following: If two cycles C and D of D have a
node in common, then they are identical81 (because the outbound uniqueness
property prevents these cycles from ever separating after meeting at the com-
mon node). In other words, any two cycles C and D of D are either identical or
node-disjoint (i.e., share no nodes with each other).
Now, let i be any node of D . Then, if we start at i and follow the outgoing
arcs, then we obtain an infinite walk
\[
\sigma^0(i) \to \sigma^1(i) \to \sigma^2(i) \to \sigma^3(i) \to \cdots
\]
along our digraph D . Since X is finite, the Pigeonhole Principle guarantees that
this walk will eventually revisit a node it has already been to; i.e., there exist
80 See, e.g., [Guicha20, §5.11] or [Loehr11, §3.5] for brief introductions to digraphs.
81 Here, we identify any cycle with its cyclic rotations. For example, if a → b → c → a is a
cycle, then we consider b → c → a → b to be the same cycle.
\[
\sigma^0(i) \to \sigma^1(i) \to \sigma^2(i) \to \cdots \to \sigma^v(i) = \sigma^0(i).
\]
(This is indeed a cycle, since (185) shows that its first v nodes are distinct.) This
shows that the node i lies on a cycle Ci of D (namely, the cycle that we just
found).
Now, forget that we fixed i. We thus have shown that each node i of D lies
on a cycle Ci . The cycles Ci for all nodes i ∈ X will be called the chosen cycles.
Any arc of our digraph D must belong to one of these chosen cycles. Indeed,
if a is an arc from a node i to a node j, then a must be the only arc outgoing
from i (by the outbound uniqueness property); but this means that this arc a
belongs to the chosen cycle Ci .
Now, let us look back at what we have shown:
• Some of the chosen cycles may be identical, but apart from that, the cho-
sen cycles are pairwise node-disjoint (since any two cycles of D are either
identical or node-disjoint).
(making sure to label each cycle only once). Then, each element of X appears
exactly once in the composite list
\[
(a_{1,1}, a_{1,2}, \ldots, a_{1,n_1},\ a_{2,1}, a_{2,2}, \ldots, a_{2,n_2},\ \ldots,\ a_{k,1}, a_{k,2}, \ldots, a_{k,n_k}),
\]
and we have
\[
\sigma = \operatorname{cyc}_{a_{1,1}, a_{1,2}, \ldots, a_{1,n_1}} \circ \operatorname{cyc}_{a_{2,1}, a_{2,2}, \ldots, a_{2,n_2}} \circ \cdots \circ \operatorname{cyc}_{a_{k,1}, a_{k,2}, \ldots, a_{k,n_k}}
\]
(since σ moves any node i ∈ X one step forward along its chosen cycle). This
proves Theorem 5.5.2 (a).
Alternative proofs of Theorem 5.5.2 (a) can be found (e.g.) in [Goodma15,
Theorem 1.5.3] or in [Knapp16, §I.4, Proposition 1.21] or in [Bourba74, Chap-
ter I, §5.7, Proposition 7] or in [Sagan19, §1.9, proof of Theorem 1.5.1] (this
is essentially our proof) or in https://proofwiki.org/wiki/Existence_and_Uniqueness_of_Cycle_Decomposition (see also [17f-hw7s, Exercise 7 (e) and
(d)] for a rather formalized proof). Note that some of these sources work with
a slightly modified concept of a DCD, in which they throw away the 1-cycles
(i.e., they replace “appears exactly once” by “appears at most once”, and re-
quire all cycle lengths n1 , n2 , . . . , nk to be > 1). For instance, the DCD (184)
becomes
σ = cyc1,4,3 ◦ cyc2,6 ◦ cyc7,9
if we use this modified notion of a DCD.
(b) See [Goodma15, Theorem 1.5.3] or [Bourba74, Chapter I, §5.7, Proposition
7]. The idea is fairly simple: Let
\[
(a_{1,1}, a_{1,2}, \ldots, a_{1,n_1}),\ (a_{2,1}, a_{2,2}, \ldots, a_{2,n_2}),\ \ldots,\ (a_{k,1}, a_{k,2}, \ldots, a_{k,n_k})
\]
be a DCD of σ. Then, for each i ∈ X, the cycle of this DCD that contains i
is uniquely determined by σ and i up to cyclic rotation (indeed, it is a rotated
version of the list (i, σ(i), σ²(i), . . . , σ^{r−1}(i)), where r is the smallest positive
integer satisfying σ^r(i) = i). Therefore, all cycles of this DCD are uniquely
determined by σ up to cyclic rotation and up to the relative order in which
these cycles appear in the DCD. But this is precisely the claim of Theorem 5.5.2
(b).
(c) In order to obtain a DCD of σ that satisfies these two requirements, it
suffices to
• start with an arbitrary DCD of σ (such a DCD exists by Theorem 5.5.2 (a)),
• then rotate each cycle of this DCD so that it begins with its smallest entry,
and
• then repeatedly swap these cycles so they appear in the order of decreas-
ing first entries.
Example 5.5.5. Let σ ∈ S9 be the permutation from Example 5.5.1. Then, the
cycles of σ are
(1, 4, 3) , (2, 6) , (5) , (7, 9) , (8) .
Their lengths are 3, 2, 1, 2, 1. Hence, the cycle lengths partition of σ is
(3, 2, 2, 1, 1).
The number of cycles of a permutation determines its sign. Let us state this
for permutations of [n] in particular (the reader can easily extend this to the
general case using Proposition 5.4.7):
5.6. References
We end our discussion of permutations here, although we will revisit it every
once in a while. Much more about permutations can be found in [Bona12],
[Kitaev11] (focussing on permutation patterns), [Sagan01] (focussing on the
representation theory of the symmetric group) and various other texts.
It is worth mentioning that the symmetric groups Sn are a particular case of
Coxeter groups – a class of groups highly significant to algebra, combinatorics
and geometry. One of the most combinatorial introductions to this subject
(which sheds new light on the combinatorics of symmetric groups) is the highly
readable text [BjoBre05]. Other texts include [Cohen08] and (for the particularly
resolute) [Bourba02, Chapter IV].
up with I \ {1}. In other words, we pair up a finite set I with either I \ {1} or
I ∪ {1}, depending on whether 1 ∈ I or 1 ∉ I.
Let us do this systematically for all finite sets: For each finite set I, we define
the partner of I to be the set
\[
I' := \begin{cases} I \setminus \{1\}, & \text{if } 1 \in I; \\ I \cup \{1\}, & \text{if } 1 \notin I \end{cases} \;=\; I \bigtriangleup \{1\}.
\]
Now, recall our line of reasoning: We start with the sum of the signs of all ac-
ceptable sets, and we cancel any two addends that correspond to an acceptable
set and its acceptable partner. What remains are the addends corresponding
to the acceptable sets that have non-acceptable partners. According to Claim
1, these are precisely the acceptable sets I that satisfy (1 ∉ I and |I| = m). In
other words, these are precisely the m-element subsets of [n] that do not con-
tain 1. In other words, these are precisely the m-element subsets of [n] \ {1}
(since a subset of [n] that does not contain 1 is the same as a subset of [n] \ {1}).
Thus, there are precisely $\binom{n-1}{m}$ of these subsets (since [n] \ {1} is an (n − 1)-
element set), and each of them has sign (−1)^m. Hence, there are precisely
$\binom{n-1}{m}$ addends left in the sum after our cancellations, and each of these
addends is (−1)^m. Hence,
\[
(\text{the sum of the signs of all acceptable sets}) = (-1)^m \binom{n-1}{m}.
\]
Comparing this with (190), we obtain
\[
\sum_{k=0}^{m} (-1)^k \binom{n}{k} = (-1)^m \binom{n-1}{m}.
\]
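This identity (a partial alternating sum along a row of Pascal's triangle) is easy to confirm numerically; a small Python sketch:

```python
from math import comb

# partial alternating sums of a binomial row collapse to a single
# binomial coefficient from the previous row, up to sign:
for n in range(1, 12):
    for m in range(0, n + 1):
        lhs = sum((-1) ** k * comb(n, k) for k in range(m + 1))
        assert lhs == (-1) ** m * comb(n - 1, m)
```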
A := {acceptable sets}
and
\[
f : X \to X, \qquad I \mapsto I'.
\]
The reason for this equality is that in the sum of the signs of all acceptable sets,
the contributions of the sets that belong to X (that is, of the acceptable sets
that have acceptable partners) cancel each other out. This principle is worth
generalizing and stating as a lemma:
Lemma 6.1.2 (Cancellation principle, take 1). Let A be a finite set. Let X be
a subset of A.
For each I ∈ A, let sign I be a real number (not necessarily 1 or −1). Let
f : X → X be a bijection with the property that
\[
\operatorname{sign}(f(I)) = -\operatorname{sign} I \qquad \text{for all } I \in X. \tag{191}
\]
Then,
\[
\sum_{I \in A} \operatorname{sign} I = \sum_{I \in A \setminus X} \operatorname{sign} I.
\]
Note that we did not require that f ◦ f = id in Lemma 6.1.2; we only required
that f is a bijection. That said, most examples that I know do have f ◦ f = id.
Proof of Lemma 6.1.2. Intuitively, this is clear: The contributions of all I ∈ X to
the sum ∑ sign I cancel out, to the extent they are not already zero. However,
I ∈A
rather than formalize this cancellation, let us give an even slicker argument:
We have
\begin{align*}
\sum_{I \in X} \operatorname{sign} I
&= \sum_{I \in X} \underbrace{\operatorname{sign}(f(I))}_{\substack{= -\operatorname{sign} I \\ \text{(by (191))}}}
\qquad \left( \begin{array}{c} \text{here, we have substituted } f(I) \text{ for } I \\ \text{in the sum, since } f : X \to X \text{ is a bijection} \end{array} \right) \\
&= \sum_{I \in X} (-\operatorname{sign} I) = - \sum_{I \in X} \operatorname{sign} I.
\end{align*}
But there are many other situations in which Lemma 6.1.2 can be applied. For
example, A can be some set of permutations, and sign σ can be the sign of σ
(as in Definition 5.4.1).
Let us observe that Lemma 6.1.2 can be generalized. Indeed, in Lemma 6.1.2,
we can replace “real number” by “element of any Q-vector space” or even by
“element of any additive abelian group with the property that 2a = 0 implies
a = 0”. We cannot, however, remove this requirement entirely. Indeed, if all the
signs sign I were the element 1 of Z/2, then the sign-reversing condition (191)
would hold automatically (since 1 = −1 in Z/2), but the claim of Lemma 6.1.2
would not necessarily be true.
However, if we replace the word “bijection” by “involution with no fixed
points”, then Lemma 6.1.2 holds even without any requirements on the group:
Lemma 6.1.3 (Cancellation principle, take 2). Let A be a finite set. Let X be
a subset of A.
For each I ∈ A, let sign I be an element of some additive abelian group.
Let f : X → X be an involution (i.e., a map satisfying f ◦ f = id) that has no
fixed points. Assume that
\[
\operatorname{sign}(f(I)) = -\operatorname{sign} I \qquad \text{for all } I \in X.
\]
Then,
\[
\sum_{I \in A} \operatorname{sign} I = \sum_{I \in A \setminus X} \operatorname{sign} I.
\]
Proof. The idea is that all addends corresponding to the I ∈ X cancel out from
the sum ∑ sign I (because they come in pairs of addends with opposite signs).
I ∈A
See Section B.6 for a detailed proof.
A more general version of Lemma 6.1.3 allows for f to have fixed points, as
long as these fixed points have sign 0:
Lemma 6.1.4 (Cancellation principle, take 3). Let A be a finite set. Let X be
a subset of A.
For each I ∈ A, let sign I be an element of some additive abelian group.
Let f : X → X be an involution (i.e., a map satisfying f ◦ f = id). Assume
that
sign ( f ( I )) = − sign I for all I ∈ X .
Then,
\[
\sum_{I \in A} \operatorname{sign} I = \sum_{I \in A \setminus X} \operatorname{sign} I.
\]
Proof. This is similar to Lemma 6.1.3, except that the addends corresponding to
the I ∈ X satisfying f ( I ) = I don’t cancel (but are already zero and thus can
be removed right away). See Section B.6 for a detailed proof.
Let us try to use this idea in another setting. Recall the notion of q-binomial
coefficients, and specifically their values (Definition 4.4.3 (b)).
Exercise 6.1.0.1. Let n, k ∈ N. Simplify $\binom{n}{k}_{-1}$.
Example 6.1.5. Let us compute $\binom{4}{2}_{-1}$. Theorem 4.4.13 (b) yields
\[
\binom{4}{2}_q = \frac{\left(1 - q^4\right)\left(1 - q^3\right)}{\left(1 - q^2\right)\left(1 - q^1\right)} = q^4 + q^3 + 2q^2 + q + 1.
\]
\[
\binom{n}{k}_q = \sum_{\substack{S \subseteq \{1,2,\ldots,n\}; \\ |S| = k}} q^{\operatorname{sum} S - (1+2+\cdots+k)},
\]
where sum S denotes the sum of the elements of a finite set S of integers. Sub-
stituting −1 for q in this equality, we find
\[
\binom{n}{k}_{-1} = \sum_{\substack{S \subseteq \{1,2,\ldots,n\}; \\ |S| = k}} (-1)^{\operatorname{sum} S - (1+2+\cdots+k)}.
\]
Using the shorthand [n] for the set {1, 2, . . . , n}, we can rewrite this as
\[
\binom{n}{k}_{-1} = \sum_{\substack{S \subseteq [n]; \\ |S| = k}} (-1)^{\operatorname{sum} S - (1+2+\cdots+k)}. \tag{192}
\]
Let us analyze the sum on the right hand side using sign-reversing involutions.
Thus, we set
\[
A := \{ S \subseteq [n] \mid |S| = k \}
\]
and
\[
\operatorname{sign} S := (-1)^{\operatorname{sum} S - (1+2+\cdots+k)} \qquad \text{for every } S \in A.
\]
Hence, (192) rewrites as
\[
\binom{n}{k}_{-1} = \sum_{S \in A} \operatorname{sign} S. \tag{193}
\]
Now, we seek a reasonable subset X ⊆ A and a sign-reversing bijection f :
X → X in order to cancel addends in the sum ∑ sign S.
S∈A
To wit, let us try to construct f as a partial map first, and then (as an af-
terthought) define X to be the set of all S ∈ A for which f (S) is defined.
Consider a k-element subset S of [n]. What is a way to transform S that leaves
its size |S| = k unchanged, but flips its sign (i.e., flips the parity of sum S) ? One
thing we can do is switching 1 with 2. By this I mean the following operation:
• If 1 ∈ S and 2 ∉ S, then we replace 1 by 2 in S.

• Otherwise, if 2 ∈ S and 1 ∉ S, then we replace 2 by 1 in S.
For example,
Indeed, the condition “|S ∩ {1, 2}| = 1” is equivalent to having either (1 ∈ S and 2 ∉ S)
or (1 ∉ S and 2 ∈ S); and in this case, the symmetric difference S △ {1, 2} is
precisely the set we need (i.e., the set (S \ {1}) ∪ {2} if we have (1 ∈ S and 2 ∉ S),
and the set (S \ {2}) ∪ {1} if we have (1 ∉ S and 2 ∈ S)).
This map switch1,2 : A → A is certainly a bijection (and, in fact, an involu-
tion). It is not sign-reversing on the entire set A; however, it has the property
that the sign of switch1,2 (S) is opposite to the sign of S whenever we have
(1 ∈ S and 2 ∉ S) or (1 ∉ S and 2 ∈ S) (because in these two cases, sum S ei-
ther increases by 1 or decreases by 1, respectively). We can restate this property
ther increases by 1 or decreases by 1, respectively). We can restate this property
as follows: The sign of switch1,2 (S) is opposite to the sign of S whenever we
have |S ∩ {1, 2}| = 1. Thus, we can use switch1,2 to cancel many addends from
our sum ∑ sign S. Still, many other addends (of different signs) remain, and
S∈A
the result is far from simple.
Thus, we need a “Plan B” if the map switch1,2 does not succeed. Assuming
that |S ∩ {1, 2}| ̸= 1 (that is, the set S ∈ A contains none or both of 1 and 2),
we gain nothing by switching 1 with 2 in S, but maybe we get lucky switching
2 with 3 in S (which is defined in the same way as switching 1 with 2, but with
the obvious changes)? If that, too, fails, we can try to switch 3 with 4. If that
fails as well, we can try to switch 4 with 5, and so on, until we get to the end of
the set [n].
In other words, we try to define a bijection f : A → A as follows: For any
S ∈ A, we pick the smallest i ∈ [n − 1] such that |S ∩ {i, i + 1}| = 1 (in other
words, the smallest i ∈ [n − 1] such that exactly one of the two elements i and
i + 1 belongs to S); and we switch i with i + 1 in S (that is, we replace S by
S △ {i, i + 1}). This produces a new subset S′ of [n] that has the same size as S
but has the opposite sign (actually, we have sum S′ = sum S ± 1), except for the
two cases when S = ∅ and when S = [n] (these are the cases where we cannot
find any i ∈ [n − 1] such that |S ∩ {i, i + 1}| = 1). We set f (S) := S △ {i, i + 1}.
Alas, the last two of these examples show that f is not injective (as f ({1, 4}) =
{2, 4} = f ({3, 4})). Thus, f is not a bijection. The underlying problem is that
the i that was picked in the construction of f (S) is not uniquely recoverable
from f (S). Hence, our map f does not work for us – we cannot use it to cancel
addends, since we cannot cancel (e.g.) a single 1 against multiple −1s.
How can we salvage this argument? We change our map f to “space” our
switches apart. That is, we again start by trying to switch 1 with 2; if this fails,
we jump straight to trying to switch 3 with 4; if this fails too, we jump further
to trying to switch 5 with 6; and so on, until we either succeed at some switch
or run out of pairs to switch. For the explicit description of f , this means that
instead of picking the smallest i ∈ [n − 1] such that |S ∩ {i, i + 1}| = 1, we
pick the smallest odd i ∈ [n − 1] such that |S ∩ {i, i + 1}| = 1; and then we set
f (S) := S △ {i, i + 1} as before.
In other words, we define our new map f : A → A as follows: For any
S ∈ A, we set
f (S) := S △ {i, i + 1} ,
where i is the smallest odd element of [n − 1] such that |S ∩ {i, i + 1}| = 1. If
no such i exists, we just set f (S) := S. (We will soon see when this happens.)
Here are some examples (for n = 8 and k = 3): for instance, f ({1, 2, 3}) = {1, 2, 4}, f ({3, 4, 5}) = {3, 4, 6}, and f ({1, 4, 6}) = {2, 4, 6}.
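This map can also be checked by machine. The following Python sketch (the helper name `switch` is ours, not from the notes) implements f and verifies that it is an involution for n = 8, k = 3:

```python
from itertools import combinations

def switch(S, n):
    # Sketch of the "spaced" switching map f: find the smallest odd i in [n-1]
    # with exactly one of i, i+1 in S, and toggle both (i.e., take S △ {i, i+1});
    # return S unchanged if no such i exists.
    S = set(S)
    for i in range(1, n, 2):  # odd i = 1, 3, 5, ...
        if len(S & {i, i + 1}) == 1:
            return frozenset(S ^ {i, i + 1})
    return frozenset(S)

n, k = 8, 3
A = [frozenset(c) for c in combinations(range(1, n + 1), k)]
assert all(switch(switch(S, n), n) == S for S in A)  # f is an involution
assert sorted(switch({1, 2, 3}, n)) == [1, 2, 4]
assert sorted(switch({3, 4, 5}, n)) == [3, 4, 6]
```

(For k = 3 and n = 8, the involution has no fixed points, which is why the alternating sum over A vanishes in this case.)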
Once again, it is clear that the set f (S) has size k whenever S does. Hence,
f : A → A is at least a well-defined map. This time, the map f is furthermore
an involution (that is, f ◦ f = id). Here is a quick argument for this (details
are left to the reader): Since we have “spaced” the switches apart, they don’t
interfere with each other. Thus, the i that gets chosen in the construction of
f (S) will again get chosen in the construction of f ( f (S)) (since the elements
of f (S) that are smaller than this i will not have changed from S). Thus, the
switch that happens in the construction of f ( f (S)) undoes the switch made in
the construction of f (S), and as a result, the set f ( f (S)) will be S again. This
shows that f ◦ f = id.
Thus, f is an involution, hence a bijection. Moreover, sign ( f (S)) = − sign S
holds whenever f (S) ̸= S (because f (S) ̸= S implies that f (S) = S △ {i, i + 1}
for some i satisfying |S ∩ {i, i + 1}| = 1, and therefore sum ( f (S)) = sum S ±
1). Thus, we set
X := {S ∈ A | f (S) ̸= S} ,
and we restrict f to a map X → X (this is well-defined, since it is easy to see
from f ◦ f = id that f (S) ∈ X for each S ∈ X ). Then, the map f becomes a
sign-reversing bijection from X to X . Hence, Lemma 6.1.2 yields
∑_{S∈A} sign S = ∑_{S∈A\X} sign S .
Now, what is A \ X ? In other words, what addends are left behind uncancelled?
In order to answer this question, we need to consider the case when n is even
and the case when n is odd separately. We begin with the case when n is even.
A k-element subset S of [n] belongs to A \ X if and only if it satisfies f (S) =
S. In other words, S belongs to A \ X if and only if there exists no odd i ∈
[n − 1] such that |S ∩ {i, i + 1}| = 1 (because f has been defined in such a way
that f (S) = S in this case, while f (S) = S △ {i, i + 1} ̸= S in the other case).
In other words, S belongs to A \ X if and only if for each odd i ∈ [n − 1], the
size |S ∩ {i, i + 1}| is either 0 or 2. This is equivalent to saying that if we break
up the n elements 1, 2, . . . , n into the n/2 “blocks”
{1, 2} , {3, 4} , {5, 6} , . . . , {n − 1, n}
(this can be done, since n is even), then the intersection of S with each block
has size 0 or 2. In other words, this is saying that the set S consists of entire
blocks (i.e., each block is either fully included in S or is disjoint from S). In
other words, this is saying that the set S is a union (possibly empty) of blocks.
It remains to handle the case when n is odd. This case is different in that the
n elements 1, 2, . . . , n are now subdivided into (n + 1) /2 “blocks”
{1, 2} , {3, 4} , . . . , {n − 2, n − 1} , {n} ,
with the last of these blocks having size 1. As a consequence, this time, a blocky
subset of [n] (i.e., a subset that is a union of blocks) can have odd size. Moreover,
the parity of k determines whether a blocky k-element subset of [n] will contain n:
such a subset contains n if and only if k is odd (since its size equals twice the
number of two-element blocks it contains, plus 1 if it contains the block {n}).
Thus, when classifying the blocky k-element subsets of [n], we can either
dismiss n immediately (if k is even) or take n for granted (if k is odd); in either
case, the problem gets reduced to classifying the blocky k-element or (k − 1)-element
subsets of [n − 1], which we already know how to do (since n − 1 is
even). The result is that the # of blocky k-element subsets of [n] (in the case
when n is odd) is
\binom{(n−1)/2}{k/2} if k is even, and \binom{(n−1)/2}{(k−1)/2} if k is odd;
in either case, this number equals \binom{(n−1)/2}{⌊k/2⌋} .
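This count can be confirmed by brute force; a short sketch (the helper name `is_blocky` is ours), checking the odd case n = 7 against \binom{3}{⌊k/2⌋}:

```python
from itertools import combinations
from math import comb

def is_blocky(S, n):
    # S is "blocky" iff it meets each pair-block {1,2}, {3,4}, ... in 0 or 2
    # elements (the singleton block {n}, present when n is odd, is unrestricted).
    return all(len(S & {i, i + 1}) != 1 for i in range(1, n, 2))

n = 7  # an odd n
for k in range(n + 1):
    count = sum(1 for c in combinations(range(1, n + 1), k)
                if is_blocky(set(c), n))
    assert count == comb((n - 1) // 2, k // 2)  # = C((n-1)/2, floor(k/2))
```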
ω^d = 1   but   ω^i ≠ 1 for each i ∈ {1, 2, . . . , d − 1} .
In other words, a primitive d-th root of unity in K means an element of the
multiplicative group K^× whose order is d.
(Figure: the six 6-th roots of unity e^{2πik/6} for k = 0, 1, 2, 3, 4, 5, arranged on the unit circle.)
Note that the equality (200) contains two ω-binomial coefficients and one
regular binomial coefficient.
It is not hard to check that (199) is the particular case of Theorem 6.1.7 for
d = 2 and ω = −1. Indeed, the only possible remainders of an integer upon
division by 2 are 0 and 1, and the ω-binomial coefficients \binom{r}{v}_ω for r, v ∈ {0, 1}
are
\binom{0}{0}_ω = 1,   \binom{0}{1}_ω = 0,   \binom{1}{0}_ω = 1,   \binom{1}{1}_ω = 1.
Theorem 6.2.1 (size version of the PIE). Let n ∈ N. Let U be a finite set. Let
A1 , A2 , . . . , An be n subsets of U. Then,
(# of u ∈ U that satisfy u ∉ Ai for all i ∈ [n])
= ∑_{I⊆[n]} (−1)^{|I|} · (# of u ∈ U that satisfy u ∈ Ai for all i ∈ I) .
• Here and in the following, we are using the notation [n] for the set {1, 2, . . . , n},
as defined in Definition 5.1.2.
• The summation sign “∑_{I⊆[n]}” means a sum over all subsets I of [n]. More
generally, if S is a given set, then the summation sign “∑_{I⊆S}” shall always
mean a sum over all subsets I of S.
• The shorthand “PIE” in the name of Theorem 6.2.1 is short for “Principle
of Inclusion and Exclusion”.
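Before moving on, here is a brute-force check of Theorem 6.2.1 on a small instance (the instance itself – U = [30] with the Ai being the multiples of 2, 3, 5 – is our choice, not from the notes):

```python
from itertools import chain, combinations

def subsets(s):
    # all subsets of s, as tuples
    s = list(s)
    return list(chain.from_iterable(combinations(s, r) for r in range(len(s) + 1)))

# Hypothetical instance: U = [30], with A_i = multiples of the i-th prime among 2, 3, 5.
U = range(1, 31)
A = {1: {u for u in U if u % 2 == 0},
     2: {u for u in U if u % 3 == 0},
     3: {u for u in U if u % 5 == 0}}
n = 3

# left hand side: elements lying in none of the A_i
lhs = sum(1 for u in U if all(u not in A[i] for i in range(1, n + 1)))
# right hand side: the alternating sum over all subsets I of [n]
rhs = sum((-1) ** len(I) * sum(1 for u in U if all(u in A[i] for i in I))
          for I in subsets(range(1, n + 1)))
assert lhs == rhs == 8  # the u coprime to 30: 1, 7, 11, 13, 17, 19, 23, 29
```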
In one form or another, Theorem 6.2.1 appears in almost any text on com-
binatorics (e.g., in [Sagan19, Theorem 2.1.1], in [Loehr11, §4.11], in [Strick20,
Theorem 5.3], in [19fco, Theorem 2.9.7], or – in almost the same form as above
– in [20f, Theorem 7.8.6]). Most commonly, its claim is stated in the shorter (if
less transparent) form
|U \ (A1 ∪ A2 ∪ · · · ∪ An)| = ∑_{I⊆[n]} (−1)^{|I|} · |⋂_{i∈I} Ai| ,
This is equivalent to our version above, because
|U \ (A1 ∪ A2 ∪ · · · ∪ An)| = (# of u ∈ U that satisfy u ∉ A1 ∪ A2 ∪ · · · ∪ An)
= (# of u ∈ U that satisfy u ∉ Ai for all i ∈ [n])
and because each I ⊆ [n] satisfies
|⋂_{i∈I} Ai| = (# of u ∈ U that satisfy u ∈ Ai for all i ∈ I)
(where the empty intersection ⋂_{i∈∅} Ai is understood to mean the whole set U).
Rather than prove Theorem 6.2.1 directly, we shall soon derive it from more
general results (in order to avoid duplicating arguments). First, however, let us
give an interpretation that makes Theorem 6.2.1 a little bit more intuitive, and
sketch four applications (more can be found in textbooks – e.g., [Sagan19, §2.1],
[Stanle11, Chapter 2], [Wildon19, Chapter 3], [AndFen04, Chapter 6], [19fco,
§2.9]).
Imagine that we impose n “rules” 1, 2, . . . , n, each of which any given element of U may or may not satisfy. (For instance, one
element of U might satisfy all of these rules; another might satisfy
none; yet another might satisfy rules 1 and 3 only. A rule can be
something like “thou shalt be divisible by 5” (if the elements of U
are numbers) or “thou shalt be a nonempty set” (if the elements of
U are sets).)
Assume that, for each I ⊆ [n], we know how many elements u ∈ U
satisfy all rules in I (but may or may not satisfy the remaining rules).
For example, this means that we know how many elements u ∈ U
satisfy rules 2, 3, 5 (simultaneously). Then, we can compute the # of
elements u ∈ U that violate all n rules 1, 2, . . . , n by the following
formula:
(# of elements u ∈ U that violate all n rules 1, 2, . . . , n)
= ∑_{I⊆[n]} (−1)^{|I|} · (# of elements u ∈ U that satisfy all rules in I) .
Thus, if you have a counting problem that can be restated as “count things
that violate a bunch of rules”, then you can apply Theorem 6.2.1 (in the inter-
pretation we just gave) to “turn the problem positive”, i.e., to make it about
counting rule-followers instead of rule-violators. If the “positive” problem is
easier, then this is a useful technique. We will now witness this on four examples.
6.2.2. Examples
Example 1. Let n, m ∈ N. Let us compute the # of surjective maps from [m] to
[n]. (We will outline the argument here; details can be found in [20f, §7.8.2] or
[19fco, §2.9.4].)
What are surjective maps? They are maps that take each element of the target
set as a value. Thus, in particular, a map f : [m] → [n] is surjective if and only
if it takes each i ∈ [n] as a value.
Hence, if we impose n rules 1, 2, . . . , n on a map f : [m] → [n], where rule i
says “thou shalt not take i as a value”, then the surjective maps f : [m] → [n]
are precisely the maps f : [m] → [n] that violate all n rules. Hence,
(# of surjective maps f : [m] → [n])
= (# of maps f : [m] → [n] that violate all n rules 1, 2, . . . , n)
= ∑_{I⊆[n]} (−1)^{|I|} · (# of maps f : [m] → [n] that satisfy all rules in I)
= ∑_{I⊆[n]} (−1)^{|I|} (n − |I|)^m
(since the maps f : [m] → [n] that satisfy all rules in I are precisely the maps
from [m] into the (n − |I|)-element set [n] \ I, and there are (n − |I|)^m of them)
= ∑_{k=0}^{n} ∑_{I⊆[n]; |I|=k} (−1)^k (n − k)^m   (here, we have split the sum according to the size k = |I|)
= ∑_{k=0}^{n} \binom{n}{k} (−1)^k (n − k)^m   (since the inner sum is a sum of \binom{n}{k} many equal addends)
= ∑_{k=0}^{n} (−1)^k \binom{n}{k} (n − k)^m .
This is the simplest expression for this number. It has no product formula
(unlike the # of injective maps f : [m] → [n], which is n (n − 1) (n − 2) · · · (n − m + 1)).
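The formula just derived can be compared against direct enumeration; a sketch (the helper names `surjections_brute` and `surjections_pie` are ours):

```python
from itertools import product
from math import comb

def surjections_brute(m, n):
    # count surjections [m] -> [n] by enumerating all n^m maps
    return sum(1 for f in product(range(n), repeat=m)
               if set(f) == set(range(n)))

def surjections_pie(m, n):
    # the PIE formula: sum_{k=0}^{n} (-1)^k C(n,k) (n-k)^m
    return sum((-1) ** k * comb(n, k) * (n - k) ** m for k in range(n + 1))

for m in range(6):
    for n in range(5):
        assert surjections_brute(m, n) == surjections_pie(m, n)
```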
Before we move on to the next example, let us draw a few consequences from
Theorem 6.2.2:
Indeed, recall that any surjective map between two finite sets of the same size is bijective. Thus, any surjective map f : [n] → [n] is bijective,
and hence is a permutation of [n]. The converse is true as well (i.e., any permutation of [n]
is a surjective map f : [n] → [n]). Thus, the surjective maps f : [n] → [n] are precisely the
permutations of [n].
However, the orbits of this action form a partition of U (since each element of
U belongs to exactly one orbit). Thus, we have
|U| = ∑_{O is an orbit} |O| = (# of orbits) · n!
(since each orbit has size n!). This entails n! | |U|, which is exactly what we
wanted to show. This proves Corollary 6.2.3 (d).
We note that parts (a) and (b) of Corollary 6.2.3 can also be proved alge-
braically (see, e.g., [20f, Exercise 5.4.2 (d)] for an algebraic generalization of
Corollary 6.2.3 (a)); but this is harder for Corollary 6.2.3 (d) and (to my knowl-
edge) impossible for Corollary 6.2.3 (c).
90 For a refresher on group actions, see (e.g.) [Quinla21, §3.2] or [Loehr11, §9.11–§9.15] or
[Artin10, §6.7–§6.9] or [Armstr19, Fall 2018, Weeks 8–10] or [Aigner07, §6.1]. When G is a
group, we use the word “G-set” to mean a set on which G acts.
91 This is called “acting by post-composition” (since σ ◦ f is obtained from f by composing with
σ “after” f ).
92 Proof. Let f ∈ U. We must prove that the stabilizer of f is {id}.
Let σ belong to the stabilizer of f . Thus, σ ◦ f = f . Now, let j ∈ [n]. Recall that f ∈ U;
hence, f is surjective. Thus, there exists some i ∈ [m] such that j = f (i ). Consider this i.
Now, from j = f (i), we obtain σ(j) = σ(f(i)) = (σ ◦ f)(i) = f(i) = j (since σ ◦ f = f). Now, forget that
we fixed j. We thus have shown that σ(j) = j for each j ∈ [n]. In other words, σ = id.
Forget that we fixed σ. We thus have shown that any σ in the stabilizer of f satisfies
σ = id. In other words, the stabilizer of f is a subset of {id}. Therefore, the stabilizer of f
must be {id} (since id is clearly in the stabilizer of f ).
Example 2. (See [19fco, §2.9.5] for details.) We will use the following definition:
a derangement of a set X means a permutation of X that has no fixed points. For each n ∈ N, we let Dn denote the # of derangements of [n].
• In the symmetric group S3 , the derangements are the 3-cycles cyc1,2,3 and
cyc1,3,2 . Thus, D3 = 2.
n  | 0 | 1 | 2 | 3 | 4 | 5  | 6   | 7    | 8      | 9       | 10
Dn | 1 | 0 | 1 | 2 | 9 | 44 | 265 | 1854 | 14 833 | 133 496 | 1 334 961
Dn = (# of derangements of [n])
= (# of permutations u ∈ U that violate all n rules 1, 2, . . . , n)
(where U is the set of all permutations of [n], and where rule i says “thou shalt leave i fixed”)
= ∑_{I⊆[n]} (−1)^{|I|} · (# of permutations u ∈ U that satisfy all rules in I)
= ∑_{k=0}^{n} ∑_{I⊆[n]; |I|=k} (−1)^k (n − k)!
(since the permutations that satisfy all rules in I are precisely the permutations
fixing each i ∈ I, and these are in bijection with the permutations of [n] \ I,
of which there are (n − |I|)! many)
= ∑_{k=0}^{n} \binom{n}{k} (−1)^k (n − k)!   (since the inner sum is a sum of \binom{n}{k} many equal addends)
= ∑_{k=0}^{n} (−1)^k · n!/k!   (since \binom{n}{k} (n − k)! = n!/k! by (2))
= n! · ∑_{k=0}^{n} (−1)^k / k! .
Remark 6.2.6. The sum on the right hand side is a partial sum of the well-known
infinite series ∑_{k=0}^{∞} (−1)^k / k! = e^{−1} (where e = 2.718 . . . is Euler’s number).
This is quite helpful in approximating Dn; indeed, it is easy to see that
Dn = round (n!/e)   for each n > 0,
where round x means the result of rounding a real number x to the nearest
integer (fortunately, since e is irrational, we never get a tie).
See Exercise A.5.2.3 and [Wildon19, Chapter 1] for more about derangements.
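Both the alternating-sum formula and the rounding formula are easy to confirm numerically; a sketch (the helper name `derangements_brute` is ours):

```python
from itertools import permutations
from math import factorial, e

def derangements_brute(n):
    # count permutations of {0, ..., n-1} with no fixed points
    return sum(1 for p in permutations(range(n))
               if all(p[i] != i for i in range(n)))

for n in range(1, 8):
    Dn = derangements_brute(n)
    # the PIE formula: D_n = sum_{k=0}^{n} (-1)^k n!/k!
    assert Dn == sum((-1) ** k * factorial(n) // factorial(k)
                     for k in range(n + 1))
    # the rounding formula: D_n = round(n!/e) for n > 0
    assert Dn == round(factorial(n) / e)
```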
Example 3. Like many other things in these lectures, the following elementary
number-theoretical result (Theorem 6.2.7) is due to Euler: If c is a positive integer
whose distinct prime factors are p1, p2, . . . , pn, then the # of all u ∈ [c] that are
coprime to c equals c · (1 − 1/p1) (1 − 1/p2) · · · (1 − 1/pn).
Note that the # of all u ∈ [c] that are coprime to c is usually denoted by ϕ (c)
in number theory, and the map ϕ : {1, 2, 3, . . .} → N that sends each c to ϕ (c)
is called Euler’s totient function.
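Euler’s product formula for ϕ can be verified computationally; a sketch (the helper names `phi_direct` and `phi_euler` are ours):

```python
from math import gcd

def phi_direct(c):
    # count the u in [c] that are coprime to c
    return sum(1 for u in range(1, c + 1) if gcd(u, c) == 1)

def phi_euler(c):
    # Euler's formula phi(c) = c * prod over primes p | c of (1 - 1/p),
    # computed in integer arithmetic (multiply by (p-1)/p for each prime p)
    result, m, d = c, c, 2
    while d * d <= m:
        if m % d == 0:
            result = result // d * (d - 1)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:  # a leftover prime factor > sqrt of the original
        result = result // m * (m - 1)
    return result

assert all(phi_direct(c) == phi_euler(c) for c in range(1, 300))
```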
Theorem 6.2.7 can be proved in many ways (and a proof can be found in
almost any text on elementary number theory). Probably the most transparent
proof relies on the PIE:
Proof of Theorem 6.2.7 (sketched). (See [19fco, proof of Theorem 2.9.19] for the de-
tails of this argument.) Let U = [c]. A number u ∈ U is coprime to c if and
only if it is not divisible by any of the prime factors p1 , p2 , . . . , pn of c. Again,
this means that u breaks all n rules 1, 2, . . . , n, where rule i says “thou shalt be
divisible by pi ”. Thus, by the “rule-breaking” interpretation of Theorem 6.2.1,
we obtain
(# of u ∈ [c] that are coprime to c)
= ∑_{I⊆[n]} (−1)^{|I|} · (# of u ∈ [c] that are divisible by pi for each i ∈ I)
= ∑_{I⊆[n]} (−1)^{|I|} · c / ∏_{i∈I} pi
(since the u ∈ [c] divisible by each pi with i ∈ I are precisely the multiples of
∏_{i∈I} pi, of which there are exactly c / ∏_{i∈I} pi)
= c · ∏_{i=1}^{n} (1 − 1/pi)   (by expanding the product).
This proves Theorem 6.2.7.
Example 4. (This one is taken from [Sagan19, Theorem 2.3.3].) Recall Theorem
4.1.14, which states that each n ∈ N satisfies pdist (n) = podd (n), where pdist (n)
is the # of partitions of n into distinct parts and podd (n) is the # of partitions
of n into odd parts. Applying the PIE to each side (with rule i saying “thou
shalt contain the entry i twice” for the former, and “thou shalt contain the entry
2i” for the latter), we obtain
pdist (n) = ∑_{I⊆[n]} (−1)^{|I|} (# of partitions that contain each of the entries i ∈ I twice)   (203)
and
podd (n) = ∑_{I⊆[n]} (−1)^{|I|} (# of partitions that contain the entry 2i for each i ∈ I) .   (204)
(Here, “partitions” means partitions of n.)
Now, comparing these two equalities, we see that in order to prove that
pdist (n) = podd (n), it will suffice to show that
(# of partitions that contain each of the entries i ∈ I twice)
= (# of partitions that contain the entry 2i for each i ∈ I )
for any subset I of [n].
So let I be a subset of [n]. We are looking for a bijection
from {partitions that contain each of the entries i ∈ I twice}
to {partitions that contain the entry 2i for each i ∈ I } .
Such a bijection can be obtained as follows: For each i ∈ I (from highest to
lowest94 ), we remove two copies of i from the partition, and insert a 2i into
93 Once again, “twice” means “at least twice”.
94 Actually, a bit of thought reveals that the order in which we go through the i ∈ I does
not affect the result; this becomes particularly clear if we identify each partition with the
multiset of its entries. Thus, me saying “from highest to lowest” is unnecessary.
the partition in their stead. For example, if I = {2, 4, 5} and n = 33, then our
bijection sends the partition
(5, 5, 4, 4, 3, 3, 2, 2, 2, 2, 1) to (10, 8, 4, 3, 3, 2, 2, 1) .
(Note that the 4 in the resulting partition is not one of the original two 4s, but
rather a new 4 that was inserted when we removed two copies of 2. On the
other hand, the two 2s in the resulting partition are inherited from the original
partition, because (unlike the bijection A in our Second proof of Theorem 4.1.14
above) our bijection only removes two copies of each i ∈ I.)
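The bijection and its bijectivity can be checked computationally; a sketch (the helper names `partitions` and `forward` are ours), including the example from the text:

```python
from collections import Counter

def partitions(n, max_part=None):
    # generate all partitions of n as weakly decreasing tuples
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for p in range(min(n, max_part), 0, -1):
        for rest in partitions(n - p, p):
            yield (p,) + rest

def forward(lam, I):
    # the bijection: remove two copies of each i in I, insert a part 2i instead
    c = Counter(lam)
    for i in I:
        c[i] -= 2
        c[2 * i] += 1
    return tuple(sorted(c.elements(), reverse=True))

# the example from the text: I = {2, 4, 5}, n = 33
assert forward((5, 5, 4, 4, 3, 3, 2, 2, 2, 2, 1), {2, 4, 5}) == (10, 8, 4, 3, 3, 2, 2, 1)

# bijectivity check for a small case (n = 12, I = {2, 3})
n, I = 12, {2, 3}
dom = [lam for lam in partitions(n) if all(Counter(lam)[i] >= 2 for i in I)]
cod = [lam for lam in partitions(n) if all(Counter(lam)[2 * i] >= 1 for i in I)]
assert sorted(forward(lam, I) for lam in dom) == sorted(cod)
```

(Note that the Counter operations for different i ∈ I commute, which matches the observation in footnote 94 that the order of the i ∈ I does not matter.)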
It is easy to see that this purported bijection really is a bijection95 . Thus, we
have found our bijection. The bijection principle therefore yields
(# of partitions that contain each of the entries i ∈ I twice)
= (# of partitions that contain the entry 2i for each i ∈ I ) .
We have proved this equality for all I ⊆ [n]. Hence, the right hand sides of
the equalities (203) and (204) are equal. Thus, their left hand sides are equal as
well. In other words, pdist (n) = podd (n). This proves Theorem 4.1.14 again.
Remark 6.2.8. It is worth contrasting the above four examples (in which we
applied the PIE to solve a counting problem, obtaining an alternating sum as
a result) with our arguments in Section 6.1 (in which we computed alternat-
ing sums using sign-reversing involutions). Sign-reversing involutions help
turn alternating sums into combinatorial problems, while the PIE moves us
in the opposite direction. The two techniques are thus, in some way, inverse
to each other. This will become less mysterious once we prove the PIE itself
using a sign-reversing involution. The PIE can also be used backwards, to
turn an alternating sum into a counting problem, which is how we proved
Corollary 6.2.3 above.
95 Its inverse, of course, does what you would expect: For each i ∈ I, we remove a 2i from the
partition, and insert two copies of i in its stead.
Example 6.2.11. Let S = [2] = {1, 2}. Then, the assumptions of Theorem
6.2.10 state that
b∅ = a∅ ;
b{ 1 } = a ∅ + a { 1 } ;
b{ 2 } = a ∅ + a { 2 } ;
b{1,2} = a∅ + a{1} + a{2} + a{1,2} .
a∅ = b∅ ;
a{1} = −b∅ + b{1} ;
a{2} = −b∅ + b{2} ;
a{1,2} = b∅ − b{1} − b{2} + b{1,2} .
These four equalities can be verified easily. For instance, let us check the last
of them: substituting the formulas for b∅, b{1}, b{2} and b{1,2} into its right hand side, we find
b∅ − b{1} − b{2} + b{1,2} = a∅ − (a∅ + a{1}) − (a∅ + a{2}) + (a∅ + a{1} + a{2} + a{1,2}) = a{1,2} ,
which is precisely its left hand side.
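All four equalities (and the general pattern behind them) can also be checked by machine; the following sketch verifies, for arbitrary integer values aI, that the inversion formula aQ = ∑_{I⊆Q} (−1)^{|Q\I|} bI recovers the aI from the bI:

```python
from itertools import chain, combinations

def subsets(s):
    # all subsets of s, as frozensets
    s = sorted(s)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

S = frozenset({1, 2, 3})
# arbitrary integer values a_I (any choice works)
a = {I: 7 * len(I) - sum(i * i for i in I) + 3 for I in subsets(S)}
# the hypothesis: b_I = sum of a_J over all subsets J of I
b = {I: sum(a[J] for J in subsets(I)) for I in subsets(S)}
# the conclusion: a_Q = sum over I subset of Q of (-1)^{|Q \ I|} b_I
for Q in subsets(S):
    assert a[Q] == sum((-1) ** len(Q - I) * b[I] for I in subsets(Q))
```

The case Q = {1, 2} of the final loop is precisely the fourth equality displayed above.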
Before we prove Theorem 6.2.10, let us show that the weighted PIE (Theorem
6.2.9) is a particular case of it:
Proof of Theorem 6.2.9 using Theorem 6.2.10. Let S = [n]. We note that the map
{subsets of S} → {subsets of S} ,
J 7→ S \ J
is a bijection. (Indeed, this map is an involution, since each subset J of S satisfies
S \ (S \ J ) = J.)
For each u ∈ U, define a subset Viol u of S by
Viol u := {i ∈ S | u ∉ Ai} .
(In terms of the “rule-breaking” interpretation, Viol u is the set of all rules that
u violates.) Now, for each subset I of [n], we set
aI := ∑_{u∈U; Viol u = I} w (u)   and   bI := ∑_{u∈U; Viol u ⊆ I} w (u) .
Then, every subset I of [n] satisfies
bI = ∑_{u∈U; Viol u ⊆ I} w (u) = ∑_{J⊆I} ∑_{u∈U; Viol u = J} w (u) = ∑_{J⊆I} aJ
(since every u ∈ U satisfying Viol u ⊆ I satisfies Viol u = J for exactly one
subset J of I). Thus, the assumption of Theorem 6.2.10 is satisfied.
Note furthermore that each subset J of S satisfies
bJ = ∑_{u∈U; Viol u ⊆ J} w (u) = ∑_{u∈U; u∈Ai for all i∈S\J} w (u)
(because the elements u ∈ U that satisfy Viol u ⊆ J are precisely the elements
u ∈ U that satisfy (u ∈ Ai for all i ∈ S \ J) 96 ).
Now, applying (208) to I = S, we obtain
aS = ∑_{J⊆S} (−1)^{|S\J|} bJ = ∑_{I⊆S} (−1)^{|I|} b_{S\J}|_{J=S\I}
(here, we have substituted S \ I for J in the sum, using the above bijection J ↦ S \ J)
= ∑_{I⊆S} (−1)^{|I|} ∑_{u∈U; u∈Ai for all i∈I} w (u)
(since each I ⊆ S satisfies b_{S\I} = ∑_{u∈U; u∈Ai for all i∈S\(S\I)} w (u) = ∑_{u∈U; u∈Ai for all i∈I} w (u)).
On the other hand, the definition of aS yields
aS = ∑_{u∈U; Viol u=S} w (u) = ∑_{u∈U; u∉Ai for all i∈[n]} w (u)
(since the elements u ∈ U that satisfy Viol u = S are precisely the elements
u ∈ U that satisfy (u ∉ Ai for all i ∈ [n]) 97 ).
96 Proof. For each u ∈ U and each J ⊆ S, we have the chain of equivalences
(Viol u ⊆ J) ⇐⇒ ({i ∈ S | u ∉ Ai} ⊆ J)
⇐⇒ (each i ∈ S satisfying u ∉ Ai belongs to J)
⇐⇒ (each i ∈ S that does not belong to J must satisfy u ∈ Ai)   (by contraposition)
⇐⇒ (each i ∈ S \ J must satisfy u ∈ Ai)
⇐⇒ (u ∈ Ai for all i ∈ S \ J) .
Comparing these two equalities, we obtain
∑_{u∈U; u∉Ai for all i∈[n]} w (u) = ∑_{I⊆[n]} (−1)^{|I|} ∑_{u∈U; u∈Ai for all i∈I} w (u)
(since S = [n]). But this is precisely the claim of Theorem 6.2.9. Thus, Theorem
6.2.9 follows from Theorem 6.2.10.
It remains to prove Theorem 6.2.10 itself, i.e., to show that every subset Q of
S satisfies
aQ = ∑_{I⊆Q} (−1)^{|Q\I|} bI .
Substituting the assumption bI = ∑_{P⊆I} aP (with the summation index renamed
from J to P) into the right hand side, we obtain
∑_{I⊆Q} (−1)^{|Q\I|} bI = ∑_{I⊆Q} ∑_{P⊆I} (−1)^{|Q\I|} aP .   (210)
The two summation signs “∑_{I⊆Q} ∑_{P⊆I}” on the right hand side of this equality
result in a sum over all pairs (I, P) of subsets of Q satisfying P ⊆ I ⊆ Q. The
same result can be obtained by the two summation signs “∑_{P⊆Q} ∑_{I⊆Q; P⊆I}” (indeed,
the only difference between “∑_{I⊆Q} ∑_{P⊆I}” and “∑_{P⊆Q} ∑_{I⊆Q; P⊆I}” is the order in which
the two subsets I and P are chosen). Thus, we can replace “∑_{I⊆Q} ∑_{P⊆I}” by
“∑_{P⊆Q} ∑_{I⊆Q; P⊆I}”.
97 Proof. For each u ∈ U, we have the chain of equivalences
(Viol u = S) ⇐⇒ ({i ∈ S | u ∉ Ai} = S)
⇐⇒ (each i ∈ S satisfies u ∉ Ai)
⇐⇒ (u ∉ Ai for all i ∈ S)
⇐⇒ (u ∉ Ai for all i ∈ [n])   (since S = [n]) .
Hence, (210) becomes
∑_{I⊆Q} (−1)^{|Q\I|} bI = ∑_{I⊆Q} ∑_{P⊆I} (−1)^{|Q\I|} aP = ∑_{P⊆Q} ∑_{I⊆Q; P⊆I} (−1)^{|Q\I|} aP .   (211)
We want to prove that this equals aQ. Since the aP’s are arbitrary elements
of an abelian group, the only way this can possibly be achieved is by showing
that the sum on the right hand side simplifies to aQ formally – i.e., that the
coefficient ∑_{I⊆Q; P⊆I} (−1)^{|Q\I|} in front of aP is 0 whenever P ≠ Q, and is 1 whenever
P = Q. Thus, we now set out to prove this. Using Definition A.1.5, we can
restate this goal as follows: We want to prove that every subset P of Q satisfies
∑_{I⊆Q; P⊆I} (−1)^{|Q\I|} = [P = Q] .   (212)
We shall prove this soon (in Lemma 6.2.12 (b) below). For now, let us explain
how the proof of Theorem 6.2.10 can be completed if (212) is known to be true.
Indeed, (211) becomes
∑_{I⊆Q} (−1)^{|Q\I|} bI = ∑_{P⊆Q} ∑_{I⊆Q; P⊆I} (−1)^{|Q\I|} aP = ∑_{P⊆Q} [P = Q] aP   (by (212))
= [Q = Q] aQ + ∑_{P⊆Q; P≠Q} [P = Q] aP
(here, we have split off the addend for P = Q from the sum (since Q ⊆ Q))
= aQ + ∑_{P⊆Q; P≠Q} 0 aP   (since [Q = Q] = 1, whereas [P = Q] = 0 whenever P ≠ Q)
= aQ .
(b) We have
∑_{I⊆Q; P⊆I} (−1)^{|Q\I|} = [P = Q] .   (214)
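This identity is easy to confirm by brute force before proving it; a sketch for a small Q:

```python
from itertools import chain, combinations

def subsets(s):
    # all subsets of s, as frozensets
    s = sorted(s)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

Q = frozenset({1, 2, 3, 4})
for P in subsets(Q):
    # sum of (-1)^{|Q \ I|} over all I with P <= I <= Q
    total = sum((-1) ** len(Q - I) for I in subsets(Q) if P <= I)
    assert total == (1 if P == Q else 0)  # the Iverson bracket [P = Q]
```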
As promised, Lemma 6.2.12 (b) (once proved) will yield (212) and thus will
complete our above proof of Theorem 6.2.10 (and, with it, the proofs of Theorem
6.2.9 and Theorem 6.2.1).
Proof of Lemma 6.2.12. (a) There are many ways to prove this (in particular, a
simple one using the binomial theorem – do you see it?); but staying true to the
spirit of this chapter, we pick one using a sign-reversing involution. (A variant
of this proof can be found in [19fco, solution to Exercise 2.9.1].98 )
We must prove the equality (213). If P = Q, then this equality is easily seen
to hold99 . Hence, for the rest of this proof, we WLOG assume that P ̸= Q.
Thus, [ P = Q] = 0.
Now, P is a proper subset of Q (since P is a subset of Q and satisfies P ̸= Q).
Hence, there exists some q ∈ Q such that q ∈ / P. Fix such a q.
98 Our sets Q and P are called S and T in [19fco, solution to Exercise 2.9.1].
99 Proof. Assume that P = Q. Then, the only subset I of Q that satisfies P ⊆ I is the set Q itself
(since any such subset I has to satisfy both Q = P ⊆ I and I ⊆ Q, which in combination
entail I = Q). Thus, the sum ∑_{I⊆Q; P⊆I} (−1)^{|I|} has only one addend, namely the addend for
I = Q. Consequently, this sum simplifies as follows:
∑_{I⊆Q; P⊆I} (−1)^{|I|} = (−1)^{|Q|} .   (215)
Let
A := {I ⊆ Q | P ⊆ I} ,
and let
sign I := (−1)^{|I|}   for each I ∈ A.
Then,
∑_{I∈A} sign I = ∑_{I⊆Q; P⊆I} (−1)^{|I|} .   (216)
Now, define a map f : A → A by setting f (I) := I △ {q} for each I ∈ A 100 .
This map f is well-defined, because every I ∈ A satisfies I ∪ {q} ∈ A and
I \ {q} ∈ A (indeed, since q ∈ Q and q ∉ P, inserting q into or removing q
from a subset I with P ⊆ I ⊆ Q preserves the condition P ⊆ I ⊆ Q). Moreover,
this map f is an involution101 . This involution f has no fixed points (because
if I ∈ A, then f ( I ) = I △ {q} ̸= I). Furthermore, if I ∈ A, then the set
f ( I ) = I △ {q} differs from I in exactly one element (namely, q), and thus
satisfies | f ( I )| = | I | ± 1, so that
(−1)| f ( I )| = − (−1)| I | ,
or, equivalently,
sign ( f ( I )) = − sign I
On the other hand, from P = Q, we obtain
(−1)^{|P|} [P = Q] = (−1)^{|Q|} [Q = Q] = (−1)^{|Q|}   (since [Q = Q] = 1).
Comparing this with (215), we obtain ∑_{I⊆Q; P⊆I} (−1)^{|I|} = (−1)^{|P|} [P = Q]. Thus, we have shown
that (213) holds under the assumption that P = Q.
100 Here, the notation X △ Y means the symmetric difference ( X ∪ Y ) \ ( X ∩ Y ) of two sets X
and Y (as in Subsection 3.2.1).
101 Indeed, the map f merely removes q from a set I if q is contained in I, and inserts it into I
otherwise; but this is clearly an operation that undoes itself when performed a second time.
(since the definition of sign I yields sign I = (−1)| I | , and similarly sign ( f ( I )) =
(−1)^{|f(I)|} ). Thus, Lemma 6.1.3 (applied to X = A) shows that
∑_{I∈A} sign I = 0.
In view of (216), this rewrites as
∑_{I⊆Q; P⊆I} (−1)^{|I|} = 0 = (−1)^{|P|} [P = Q]   (since [P = Q] = 0).
Thus, (213) holds in the case when P ≠ Q as well. This completes the proof of
Lemma 6.2.12 (a).
(b) Each subset I of Q satisfies (−1)^{|Q\I|} = (−1)^{|Q|−|I|} = (−1)^{|Q|} (−1)^{|I|}
(since (−1)^{−|I|} = (−1)^{|I|}). Hence,
∑_{I⊆Q; P⊆I} (−1)^{|Q\I|} = (−1)^{|Q|} ∑_{I⊆Q; P⊆I} (−1)^{|I|}
= (−1)^{|Q|} (−1)^{|P|} [P = Q]   (by (213)).
102 Proof of (217): If P ≠ Q, then the equality (217) boils down to (−1)^{|Q|} (−1)^{|P|} · 0 = 0
(since P ≠ Q entails [P = Q] = 0), which is obviously true. Hence, (217) is proved
if P ≠ Q. Thus, for the rest of this proof, we WLOG assume that P = Q. Hence,
(−1)^{|Q|} (−1)^{|P|} = (−1)^{|Q|} (−1)^{|Q|} = (−1)^{|Q|+|Q|} = 1 (since |Q| + |Q| = 2 |Q| is even).
Therefore, (−1)^{|Q|} (−1)^{|P|} [P = Q] = [P = Q]. This proves (217).
Hence,
∑_{I⊆Q; P⊆I} (−1)^{|Q\I|} = (−1)^{|Q|} (−1)^{|P|} [P = Q] = [P = Q]   (by (217)).
This proves Lemma 6.2.12 (b).
As said above, this completes the proofs of Theorem 6.2.10, of Theorem 6.2.9
and of Theorem 6.2.1.
While Theorem 6.2.10 has played the part of the ultimate generalization to
us, it can be generalized further. Indeed, it is merely a particular case of
Möbius inversion for arbitrary posets (see, e.g., [Stanle11, Proposition 3.7.1] or
[Martin21, Theorem 2.3.1] or [Sagan19, Theorem 5.5.5] or [Sam21, Theorem
6.10] or [Wagner20, Theorem 14.6.4]).
6.4. Determinants
Determinants were introduced by Leibniz in the 17th century, and quickly be-
came one of the most powerful tools in mathematics. They remained so until
the early 20th century. There is a 5-volume book by Thomas Muir [Muir30] that
merely summarizes the results found on determinants... until 1920.
Most of these old results are still interesting and nontrivial. The relative role
of determinants in mathematics has declined mainly because other parts of
mathematics have “caught up” and have produced easier ways to many of the
places that were previously only accessible through the study of determinants.
As with anything else, we will just present some of the most basic results
and methods related to determinants. For more, see [MuiMet60], [Zeilbe85],
[Grinbe15, Chapter 6], [Prasol94, Chapter I], [BruRys91, Chapter 9] and vari-
ous other sources. A good introduction to the most fundamental properties is
[Strick13].
Convention 6.4.1. For the rest of Section 6.4, we fix a commutative ring K. In
most examples, K will be Z or Q or a polynomial ring.
Note that the letters “i” and “j” in the notation “(ai,j)_{1≤i≤n, 1≤j≤m}” are not
carved in stone. We could just as well use any other letters instead, and write
(ax,y)_{1≤x≤n, 1≤y≤m} or (somewhat misleadingly, but technically correctly)
(aj,i)_{1≤j≤n, 1≤i≤m} for the exact same matrix. (However, (aj,i)_{1≤i≤m, 1≤j≤n} is
a different matrix. Whichever index is mentioned first in the subscript after
the closing parenthesis is used to index rows; the other index is used to index
columns.)
(c) We let K^{n×m} denote the set of all n × m-matrices with entries in K. This
is a K-module. If n = m, this is also a K-algebra.
(d) Let A ∈ K^{n×m} be an n × m-matrix. The transpose A^T of A is defined to
be the m × n-matrix whose entries are given by
(A^T)_{i,j} = A_{j,i}   for all i ∈ [m] and j ∈ [n] .
6.4.1. Definition
There are several ways to define the determinant of a square matrix. The fol-
lowing is the most direct one:
Definition 6.4.3. Let n ∈ N, and let A = (ai,j)_{1≤i≤n, 1≤j≤n} ∈ K^{n×n} be an
n × n-matrix. Then, the determinant det A of A is defined to be the element
∑_{σ∈Sn} (−1)^σ a1,σ(1) a2,σ(2) · · · an,σ(n) = ∑_{σ∈Sn} (−1)^σ ∏_{i=1}^{n} ai,σ(i)
of K. Here, as before:
• we let Sn denote the n-th symmetric group (i.e., the group of permuta-
tions of [n] = {1, 2, . . . , n});
• we let (−1)σ denote the sign of the permutation σ (as defined in Defi-
nition 5.4.1).
Using less cumbersome notations, we can rewrite this as follows: For any
a, b, a′, b′ ∈ K, we have
det \begin{pmatrix} a & b \\ a′ & b′ \end{pmatrix} = ab′ − ba′ .
Similarly, for any a, b, c, a′, b′, c′, a′′, b′′, c′′ ∈ K, we have
det \begin{pmatrix} a & b & c \\ a′ & b′ & c′ \\ a′′ & b′′ & c′′ \end{pmatrix} = ab′c′′ − ac′b′′ − ba′c′′ + bc′a′′ + ca′b′′ − cb′a′′ .
(The six addends on the right hand side here correspond to the six permutations
in S3, which in one-line notation are 123, 132, 213, 231, 312, and 321,
respectively.)
Meanwhile, for n = 1, we obtain that the determinant of the 1 × 1-matrix
(a) is
det (a) = a.
Here, the “(a)” on the left hand side is a 1 × 1-matrix, whereas the “a” on the
right hand side is an element of K.
Finally, for n = 0, we obtain that the determinant of the 0 × 0-matrix ()
(this is an empty matrix, with no rows and no columns) is
det () = 1
(since the only permutation in S0 is the identity, whose sign is 1 and whose
corresponding product is empty and thus equals 1).
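The definition transcribes directly into code; a sketch (the helper names `sign` and `det` are ours), with the sign computed via an inversion count:

```python
from itertools import permutations

def sign(perm):
    # sign of a permutation in one-line notation (0-based), via inversion count
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
              if perm[i] > perm[j])
    return (-1) ** inv

def det(A):
    # the determinant straight from the definition:
    # sum over all permutations of sign(sigma) * prod_i A[i][sigma(i)]
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod *= A[i][perm[i]]
        total += sign(perm) * prod
    return total

assert det([[1, 2], [3, 4]]) == 1 * 4 - 2 * 3  # ab' - ba'
assert det([[5]]) == 5                         # the 1 x 1 case
assert det([]) == 1                            # the 0 x 0 case
```

Of course, this is exponential in n and only suitable for tiny matrices, which mirrors the remark below that the definition is rarely useful for computation.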
Some (particularly, older) texts use the notation | A| instead of det A for the
determinant of the matrix A.
The above definition of the determinant is purely combinatorial: it is an
alternating sum over the n-th symmetric group Sn . Typically, when computing
determinants, this definition is not in itself very useful (e.g., because Sn gets
rather large when n is large). However, in some cases, it suffices. Here are a
few examples:
Example 6.4.5. For any sixteen elements a, b, c, . . . , p of K, we have
det \begin{pmatrix} a & b & c & d & e \\ f & 0 & 0 & 0 & g \\ h & 0 & 0 & 0 & i \\ j & 0 & 0 & 0 & k \\ l & m & n & o & p \end{pmatrix} = 0 .
(The “o” is a letter “oh”, not a zero. Not that it matters much...)
Proof of Example 6.4.5. Let A be the 5 × 5-matrix whose determinant we are try-
ing to identify as 0; thus, A1,1 = a and A1,2 = b and A3,2 = 0 and so on. Notice
that
Ai,j = 0 whenever i, j ∈ {2, 3, 4} (218)
(since A has a “hollow core”, i.e., a 3 × 3-square consisting entirely of zeroes in
its middle). We must prove that det A = 0.
Our definition of det A yields
det A = ∑_{σ∈S5} (−1)^σ ∏_{i=1}^{5} Ai,σ(i) .   (219)
Now, I claim that each of the addends in the sum on the right hand side is 0.
In other words, I claim that ∏_{i=1}^{5} Ai,σ(i) = 0 for each σ ∈ S5.
To prove this, fix σ ∈ S5 . The three numbers σ (2) , σ (3) , σ (4) are three dis-
tinct elements of [5] (distinct because σ is injective), so they cannot all belong to
the 2-element set {1, 5} (since there are no three distinct elements in a 2-element
set). Hence, at least one of them must belong to the complement {2, 3, 4} of this
set. In other words, there exists some i ∈ {2, 3, 4} such that σ (i ) ∈ {2, 3, 4}. This
i must then satisfy Ai,σ(i) = 0 (by (218), applied to j = σ (i )). Thus, we have
shown that there exists some i ∈ {2, 3, 4} such that Ai,σ(i) = 0.
This shows that at least one factor of the product ∏_{i=1}^{5} Ai,σ(i) is 0. Thus, the
entire product is 0.
Forget that we fixed σ. We thus have proved that ∏_{i=1}^{5} Ai,σ(i) = 0 for each
σ ∈ S5. Hence,
det A = ∑_{σ∈S5} (−1)^σ ∏_{i=1}^{5} Ai,σ(i) = 0,
and thus Example 6.4.5 is proved.
[We note that there are various alternative proofs, e.g., using Laplace expan-
sion. Also, if K is a field, you can argue that det A = 0 using rank arguments.
See [Grinbe15, Exercise 6.47 (a)] for a generalization of Example 6.4.5.]
det () = 1;
det (x1 y1) = x1 y1;
det \begin{pmatrix} x1 y1 & x1 y2 \\ x2 y1 & x2 y2 \end{pmatrix} = 0;
det \begin{pmatrix} x1 y1 & x1 y2 & x1 y3 \\ x2 y1 & x2 y2 & x2 y3 \\ x3 y1 & x3 y2 & x3 y3 \end{pmatrix} = 0.
More generally, for any n ≥ 2, the definition of the determinant yields
det ( (xi yj)_{1≤i≤n, 1≤j≤n} ) = ∑_{σ∈Sn} (−1)^σ ∏_{i=1}^{n} (xi yσ(i))
= ∑_{σ∈Sn} (−1)^σ (x1 x2 · · · xn) (yσ(1) yσ(2) · · · yσ(n)) ,
where yσ(1) yσ(2) · · · yσ(n) = y1 y2 · · · yn (since σ is a bijection [n] → [n]). Hence,
det ( (xi yj)_{1≤i≤n, 1≤j≤n} ) = (x1 x2 · · · xn) (y1 y2 · · · yn) ∑_{σ∈Sn} (−1)^σ = 0
(by (183)).
In particular, setting all yj = 1 and all xi = x, we see that the determinant of
the n × n-matrix whose entries all equal x is 0 whenever n ≥ 2, for any x ∈ K.
In other words, if all entries of a square matrix of size ≥ 2 are equal, then the
determinant of this matrix is 0.
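A quick computational check of these vanishing claims (`det` is our own brute-force implementation of the defining formula):

```python
from itertools import permutations

def det(A):
    # determinant via the permutation-sum definition
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        prod = 1
        for i in range(n):
            prod *= A[i][p[i]]
        total += (-1) ** inv * prod
    return total

x = [3, 1, 4, 1]
y = [5, 9, 2, 6]
for n in (2, 3, 4):
    # the matrix (x_i * y_j) has determinant 0 for every n >= 2
    assert det([[x[i] * y[j] for j in range(n)] for i in range(n)]) == 0
assert det([[7] * 3 for _ in range(3)]) == 0  # all entries equal, n >= 2
```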
det () = 1;
det (x1 + y1) = x1 + y1;
det \begin{pmatrix} x1 + y1 & x1 + y2 \\ x2 + y1 & x2 + y2 \end{pmatrix} = − (x1 − x2) (y1 − y2) ;
det \begin{pmatrix} x1 + y1 & x1 + y2 & x1 + y3 \\ x2 + y1 & x2 + y2 & x2 + y3 \\ x3 + y1 & x3 + y2 & x3 + y3 \end{pmatrix} = 0.
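These values can be confirmed numerically before reading the proof (`det` is our own brute-force implementation of the defining formula):

```python
from itertools import permutations

def det(A):
    # determinant via the permutation-sum definition
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        prod = 1
        for i in range(n):
            prod *= A[i][p[i]]
        total += (-1) ** inv * prod
    return total

x = [2, 7, -3, 5]
y = [4, -1, 6, 8]
A2 = [[x[i] + y[j] for j in range(2)] for i in range(2)]
assert det(A2) == -(x[0] - x[1]) * (y[0] - y[1])  # the 2 x 2 formula
for n in (3, 4):
    # the determinant of (x_i + y_j) vanishes for every n >= 3
    assert det([[x[i] + y[j] for j in range(n)] for i in range(n)]) == 0
```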
Proof of Proposition 6.4.9 (sketched). (See [Grinbe15, Example 6.7] for more details.)
The definition of the determinant yields
det ( (xi + yj)_{1≤i≤n, 1≤j≤n} ) = ∑_{σ∈Sn} (−1)^σ ∏_{i=1}^{n} (xi + yσ(i)) .
Expanding each of the products ∏_{i=1}^{n} (xi + yσ(i)) (each factor contributes either
its xi addend or its yσ(i) addend, so the expansion is a sum over all subsets I of
[n], with I collecting the positions that contribute their xi), we obtain
det ( (xi + yj)_{1≤i≤n, 1≤j≤n} ) = ∑_{σ∈Sn} (−1)^σ ∑_{I⊆[n]} ∏_{i∈I} xi ∏_{i∈[n]\I} yσ(i)
= ∑_{I⊆[n]} ∑_{σ∈Sn} (−1)^σ ∏_{i∈I} xi ∏_{i∈[n]\I} yσ(i)
= ∑_{I⊆[n]} ( ∏_{i∈I} xi ) ∑_{σ∈Sn} (−1)^σ ∏_{i∈[n]\I} yσ(i) .
Now, I claim that the inner sum is 0 for each I. In other words, I claim that
∑_{σ∈Sn} (−1)^σ ∏_{i∈[n]\I} yσ(i) = 0   (221)
for each subset I of [n].
[Proof of (221): Fix a subset I of [n]. We shall show that all addends in the
sum ∑_{σ∈Sn} (−1)^σ ∏_{i∈[n]\I} yσ(i) cancel each other – i.e., that for each addend in this
sum, there is a different addend with the same product of yj’s but a different
sign (−1)^σ. To achieve this, we need to pair up each σ ∈ Sn with a different
permutation σ′ = σ tu,v ∈ Sn that satisfies ∏_{i∈[n]\I} yσ′(i) = ∏_{i∈[n]\I} yσ(i) but (−1)^{σ′} =
− (−1)^σ. (Indeed, this pairing will then produce the required cancellations:
the addend for each σ will cancel the addend for the corresponding σ′. To
be more rigorous, we are here applying Lemma 6.1.3 to A = Sn, X = Sn
and sign σ = (−1)^σ ∏_{i∈[n]\I} yσ(i) (of course, this should not be confused with the
notation sign σ for (−1)^σ) and
f = (the map Sn → Sn that sends each σ ∈ Sn to the corresponding σ′).)
So let us construct our pairing. Indeed, from I ⊆ [n], we obtain |I| +
|[n] \ I| = |[n]| = n ≥ 3; hence, at least one of the two sets I and [n] \ I has size
> 1. In other words, we must be in one of the following two cases:
103 Both factors yσ(u) and yσ(v) do indeed appear in these products, since u and v belong to
[n] \ I.
Theorem 6.4.10 (Transposes preserve determinants). Let n ∈ N. If A ∈ K^{n×n}
is any n × n-matrix, then det (A^T) = det A.
Proof. See [Strick13, Proposition B.11] or [Grinbe15, Exercise 6.3 and the para-
graph after Exercise 6.4].
As a consequence of Theorem 6.4.11, we see that the determinant of a diag-
onal matrix is the product of its diagonal entries (since any diagonal matrix is
triangular).
Then,
det C = det A + det B.
Example 6.4.13. Let us see what Theorem 6.4.12 is saying in some particular
cases (specifically, for 3 × 3-matrices):
(a) One instance of Theorem 6.4.12 (a) is
det \begin{pmatrix} a & b & c \\ a′′ & b′′ & c′′ \\ a′ & b′ & c′ \end{pmatrix} = − det \begin{pmatrix} a & b & c \\ a′ & b′ & c′ \\ a′′ & b′′ & c′′ \end{pmatrix} .
(b) One instance of Theorem 6.4.12 (b) is
det \begin{pmatrix} a & b & c \\ 0 & 0 & 0 \\ a′′ & b′′ & c′′ \end{pmatrix} = 0 .
(c) One instance of Theorem 6.4.12 (c) is
det \begin{pmatrix} a & b & c \\ a′ & b′ & c′ \\ a & b & c \end{pmatrix} = 0 .
(d) One instance of Theorem 6.4.12 (d) is
det \begin{pmatrix} a & b & c \\ λa′ & λb′ & λc′ \\ a′′ & b′′ & c′′ \end{pmatrix} = λ det \begin{pmatrix} a & b & c \\ a′ & b′ & c′ \\ a′′ & b′′ & c′′ \end{pmatrix} .
(Specifically, this is the particular case of Theorem 6.4.12 (g) for n = 3 and
k = 2.)
Parts (b), (d) and (g) of Theorem 6.4.12 are commonly summarized under the
mantle of “multilinearity of the determinant” or “linearity of the determinant in the
k-th row”. In fact, they say that (for any given n ∈ N and k ∈ [n]) if we hold all
rows other than the k-th row of an n × n-matrix A fixed, then det A depends
K-linearly on the k-th row of A.
Proof of Theorem 6.4.12. (a) See [Grinbe15, Exercise 6.7 (a)]. This is also a partic-
ular case of [Strick13, Corollary B.19].
(b) See [Grinbe15, Exercise 6.7 (c)]. This is also near-obvious from Definition
6.4.3.
(c) See [Grinbe15, Exercise 6.7 (e)] or [Laue15, §5.3.3, property (iii)] or [19fla,
2019-10-23 blackboard notes, Theorem 1.3.3].104
(d) See [Grinbe15, Exercise 6.7 (g)] or [Laue15, §5.3.3, property (ii)]. This is
also a particular case of [Strick13, Corollary B.19].
104 Warning: Several authors claim to give an easy proof of part (c) using part (a). This "proof" goes as follows: If A has two equal rows, then swapping these rows leaves A unchanged, but (because of Theorem 6.4.12 (a)) flips the sign of det A. Hence, in this case, we have det A = −det A, so that 2 det A = 0 and therefore det A = 0, right? Not so fast! In order to obtain det A = 0 from 2 det A = 0, we need the element 2 of K to be invertible or at least be a non-zero-divisor (since we have to divide by 2). This is true when K is one of the "high-school rings" Z, Q, R and C, but it is not true when K is the field F₂ with 2 elements (or, more generally, any field of characteristic 2). This slick argument can be salvaged, but in the form just given it is incomplete.
(f) See [Grinbe15, Exercise 6.8 (a)]. This is also a particular case of [Strick13,
Corollary B.19].
(e) This is the particular case of part (f) for λ = 1.
(g) See [Grinbe15, Exercise 6.7 (i)] or [Laue15, §5.3.3, property (i)] or [19fla,
2019-10-30 blackboard notes, Theorem 1.2.3].
Proof. Theorem 6.4.10 shows that the determinant of a matrix does not change
when we replace it by its transpose; however, the rows of this transpose A T are
the transposes of the columns of A. Thus, Theorem 6.4.14 follows by applying
Theorem 6.4.12 to the transposes of all the matrices involved. (See [Grinbe15,
Exercises 6.7 and 6.8] for the details.)
and
$$\det\left( \left( A_{i,\tau(j)} \right)_{1\le i\le n,\ 1\le j\le n} \right) = (-1)^{\tau} \cdot \det A. \qquad (223)$$
In words: When we permute the rows or the columns of a matrix, its deter-
minant gets multiplied by the sign of the permutation.
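This sign rule can be verified exhaustively for a small matrix. The following Python snippet (an illustration, not part of the original notes) checks both the row version (222) and the column version (223) for all six permutations of [3], computing determinants exactly via the Leibniz formula:

```python
from itertools import permutations
from math import prod

def sign(p):
    # (-1)^(number of inversions of p, written in one-line notation)
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def det(M):
    # Leibniz formula for the determinant
    return sum(sign(p) * prod(M[i][p[i]] for i in range(len(M)))
               for p in permutations(range(len(M))))

A = [[3, 1, 4], [1, 5, 9], [2, 6, 5]]
n = 3
for tau in permutations(range(n)):
    rows_permuted = [A[tau[i]] for i in range(n)]                       # the matrix (A_{tau(i), j})
    cols_permuted = [[A[i][tau[j]] for j in range(n)] for i in range(n)]  # the matrix (A_{i, tau(j)})
    assert det(rows_permuted) == sign(tau) * det(A)   # (222)
    assert det(cols_permuted) == sign(tau) * det(A)   # (223)
```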
Proof of Corollary 6.4.15. Let us first prove (223).

The definition of det A yields
$$\det A = \sum_{\sigma \in S_n} (-1)^{\sigma} \prod_{i=1}^{n} A_{i,\sigma(i)}. \qquad (224)$$
The definition of a determinant likewise yields
$$\det\left( \left( A_{i,\tau(j)} \right)_{1\le i\le n,\ 1\le j\le n} \right) = \sum_{\sigma \in S_n} (-1)^{\sigma} \prod_{i=1}^{n} A_{i,\tau(\sigma(i))}. \qquad (225)$$
The map
$$S_n \to S_n, \qquad \sigma \mapsto \tau^{-1}\sigma$$
is a bijection. Hence, we can substitute $\tau^{-1}\sigma$ for $\sigma$ in the sum $\sum\limits_{\sigma\in S_n} (-1)^{\sigma} \prod\limits_{i=1}^{n} A_{i,\tau(\sigma(i))}$. Thus, we obtain
$$\begin{aligned}
\sum_{\sigma\in S_n} (-1)^{\sigma} \prod_{i=1}^{n} A_{i,\tau(\sigma(i))}
&= \sum_{\sigma\in S_n} \underbrace{(-1)^{\tau^{-1}\sigma}}_{\substack{=(-1)^{\tau^{-1}} \cdot (-1)^{\sigma}\\ \text{(by Proposition 5.4.2 (d),}\\ \text{applied to } \tau^{-1} \text{ and } \sigma \text{ instead of } \sigma \text{ and } \tau\text{)}}} \prod_{i=1}^{n} \underbrace{A_{i,\tau((\tau^{-1}\sigma)(i))}}_{\substack{=A_{i,\sigma(i)}\\ \text{(since } \tau((\tau^{-1}\sigma)(i)) = (\tau\tau^{-1}\sigma)(i) = \sigma(i)\\ \text{(because } \tau\tau^{-1}\sigma = \sigma\text{))}}} \\
&= \underbrace{(-1)^{\tau^{-1}}}_{\substack{=(-1)^{\tau}\\ \text{(by Proposition 5.4.2 (f),}\\ \text{applied to } \tau^{-1} \text{ instead of } \sigma\text{)}}} \cdot \underbrace{\sum_{\sigma\in S_n} (-1)^{\sigma} \prod_{i=1}^{n} A_{i,\sigma(i)}}_{\substack{=\det A\\ \text{(by (224))}}} = (-1)^{\tau} \cdot \det A.
\end{aligned}$$
In view of this, we can rewrite (225) as
$$\det\left( \left( A_{i,\tau(j)} \right)_{1\le i\le n,\ 1\le j\le n} \right) = (-1)^{\tau} \cdot \det A.$$
This proves (223).

It remains to prove (222). Applying (223) to $A^T$ instead of A, and using Theorem 6.4.10, we obtain
$$\det\left( \left( \left(A^T\right)_{i,\tau(j)} \right)_{1\le i\le n,\ 1\le j\le n} \right) = (-1)^{\tau} \cdot \det\left(A^T\right) = (-1)^{\tau} \cdot \det A. \qquad (226)$$
Thus,
$$\left( \left(A^T\right)_{i,\tau(j)} \right)_{1\le i\le n,\ 1\le j\le n} = \left( A_{\tau(j),i} \right)_{1\le i\le n,\ 1\le j\le n} = \left( \left( A_{\tau(i),j} \right)_{1\le i\le n,\ 1\le j\le n} \right)^T,$$
so that (by Theorem 6.4.10)
$$\det\left( \left( A_{\tau(i),j} \right)_{1\le i\le n,\ 1\le j\le n} \right) = \det\left( \left( \left(A^T\right)_{i,\tau(j)} \right)_{1\le i\le n,\ 1\le j\le n} \right) = (-1)^{\tau} \cdot \det A$$
(by (226)). This proves (222). Thus, the proof of Corollary 6.4.15 is complete.
The following is probably the most remarkable property of determinants:
(Only the first column of A and the first row of B have any nonzero entries.)
Now,
$$\left( x_i y_j \right)_{1\le i\le n,\ 1\le j\le n} = \begin{pmatrix} x_1 y_1 & x_1 y_2 & \cdots & x_1 y_n \\ x_2 y_1 & x_2 y_2 & \cdots & x_2 y_n \\ \vdots & \vdots & \ddots & \vdots \\ x_n y_1 & x_n y_2 & \cdots & x_n y_n \end{pmatrix} = \underbrace{\begin{pmatrix} x_1 & 0 & 0 & \cdots & 0 \\ x_2 & 0 & 0 & \cdots & 0 \\ x_3 & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_n & 0 & 0 & \cdots & 0 \end{pmatrix}}_{=A} \underbrace{\begin{pmatrix} y_1 & y_2 & y_3 & \cdots & y_n \\ 0 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}}_{=B} = AB,$$
so that
$$\det\left( \left( x_i y_j \right)_{1\le i\le n,\ 1\le j\le n} \right) = \det(AB) = \det A \cdot \det B$$
(by Theorem 6.4.16). However, the matrix A has a zero column (since n ≥ 2), and thus satisfies det A = 0 (by Theorem 6.4.14 (b)105). Hence,
$$\det\left( \left( x_i y_j \right)_{1\le i\le n,\ 1\le j\le n} \right) = \underbrace{\det A}_{=0} \cdot \det B = 0.$$
(Only the first two columns of A and the first two rows of B have any nonzero
entries.)
105 Of course, by "Theorem 6.4.14 (b)", we mean "the analogue of Theorem 6.4.12 (b) for columns instead of rows".
Now,
$$\left( x_i + y_j \right)_{1\le i\le n,\ 1\le j\le n} = \begin{pmatrix} x_1 + y_1 & x_1 + y_2 & \cdots & x_1 + y_n \\ x_2 + y_1 & x_2 + y_2 & \cdots & x_2 + y_n \\ \vdots & \vdots & \ddots & \vdots \\ x_n + y_1 & x_n + y_2 & \cdots & x_n + y_n \end{pmatrix} = \underbrace{\begin{pmatrix} x_1 & 1 & 0 & \cdots & 0 \\ x_2 & 1 & 0 & \cdots & 0 \\ x_3 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_n & 1 & 0 & \cdots & 0 \end{pmatrix}}_{=A} \underbrace{\begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ y_1 & y_2 & y_3 & \cdots & y_n \\ 0 & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}}_{=B} = AB,$$
so that
$$\det\left( \left( x_i + y_j \right)_{1\le i\le n,\ 1\le j\le n} \right) = \det(AB) = \det A \cdot \det B$$
(by Theorem 6.4.16). However, the matrix A has a zero column (since n ≥ 3), and thus satisfies det A = 0 (by Theorem 6.4.14 (b)). Hence,
$$\det\left( \left( x_i + y_j \right)_{1\le i\le n,\ 1\le j\le n} \right) = \underbrace{\det A}_{=0} \cdot \det B = 0.$$
and
$$\det\left( \left( d_j A_{i,j} \right)_{1\le i\le n,\ 1\le j\le n} \right) = d_1 d_2 \cdots d_n \cdot \det A. \qquad (228)$$
First proof of Corollary 6.4.17 (sketched). The matrix $\left( d_i A_{i,j} \right)_{1\le i\le n,\ 1\le j\le n}$ is obtained from the matrix A by multiplying the 1-st row by d₁, multiplying the 2-nd row by d₂, multiplying the 3-rd row by d₃, and so on. Theorem 6.4.12 (d) (applied repeatedly – once for each row) shows that these multiplications result in the determinant of A getting multiplied by d₁, d₂, …, dₙ (in succession). Hence,
$$\det\left( \left( d_i A_{i,j} \right)_{1\le i\le n,\ 1\le j\le n} \right) = d_1 d_2 \cdots d_n \cdot \det A.$$
Thus, (227) is proved. The proof of (228) is analogous (using columns instead of rows). Corollary 6.4.17 is proven.
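Corollary 6.4.17 admits a quick machine check. The Python snippet below (an illustration, not part of the original notes) scales the rows, then the columns, of a sample integer matrix and compares against $d_1 d_2 \cdots d_n \cdot \det A$:

```python
from itertools import permutations
from math import prod

def sign(p):
    # (-1)^(number of inversions of p, written in one-line notation)
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def det(M):
    # Leibniz formula for the determinant
    return sum(sign(p) * prod(M[i][p[i]] for i in range(len(M)))
               for p in permutations(range(len(M))))

n = 3
A = [[3, 1, 4], [1, 5, 9], [2, 6, 5]]
d = [2, 3, 5]
rows_scaled = [[d[i] * A[i][j] for j in range(n)] for i in range(n)]  # i-th row scaled by d_i
cols_scaled = [[d[j] * A[i][j] for j in range(n)] for i in range(n)]  # j-th column scaled by d_j
assert det(rows_scaled) == prod(d) * det(A)   # (227)
assert det(cols_scaled) == prod(d) * det(A)   # (228)
```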
and
$$\begin{aligned}
\det\left( \left( d_i A_{i,j} \right)_{1\le i\le n,\ 1\le j\le n} \right)
&= \sum_{\sigma\in S_n} (-1)^{\sigma} \underbrace{d_1 A_{1,\sigma(1)} \, d_2 A_{2,\sigma(2)} \cdots d_n A_{n,\sigma(n)}}_{=(d_1 d_2 \cdots d_n)\left( A_{1,\sigma(1)} A_{2,\sigma(2)} \cdots A_{n,\sigma(n)} \right)} \\
&= \sum_{\sigma\in S_n} (-1)^{\sigma} (d_1 d_2 \cdots d_n) A_{1,\sigma(1)} A_{2,\sigma(2)} \cdots A_{n,\sigma(n)} \\
&= d_1 d_2 \cdots d_n \cdot \underbrace{\sum_{\sigma\in S_n} (-1)^{\sigma} A_{1,\sigma(1)} A_{2,\sigma(2)} \cdots A_{n,\sigma(n)}}_{\substack{=\det A\\ \text{(by (229))}}} \\
&= d_1 d_2 \cdots d_n \cdot \det A
\end{aligned}$$
and
$$\begin{aligned}
\det\left( \left( d_j A_{i,j} \right)_{1\le i\le n,\ 1\le j\le n} \right)
&= \sum_{\sigma\in S_n} (-1)^{\sigma} \underbrace{d_{\sigma(1)} A_{1,\sigma(1)} \, d_{\sigma(2)} A_{2,\sigma(2)} \cdots d_{\sigma(n)} A_{n,\sigma(n)}}_{=\left( d_{\sigma(1)} d_{\sigma(2)} \cdots d_{\sigma(n)} \right)\left( A_{1,\sigma(1)} A_{2,\sigma(2)} \cdots A_{n,\sigma(n)} \right)} \\
&= \sum_{\sigma\in S_n} (-1)^{\sigma} \underbrace{d_{\sigma(1)} d_{\sigma(2)} \cdots d_{\sigma(n)}}_{\substack{=d_1 d_2 \cdots d_n\\ \text{(since } \sigma \text{ is a permutation of the set } [n])}} A_{1,\sigma(1)} A_{2,\sigma(2)} \cdots A_{n,\sigma(n)} \\
&= \sum_{\sigma\in S_n} (-1)^{\sigma} (d_1 d_2 \cdots d_n) A_{1,\sigma(1)} A_{2,\sigma(2)} \cdots A_{n,\sigma(n)} \\
&= d_1 d_2 \cdots d_n \cdot \det A \qquad (\text{as we have seen above}).
\end{aligned}$$
6.4.3. Cauchy–Binet
The multiplicativity of the determinant generalizes to non-square matrices A
and B, but the general statement is subtler and less famous:
$$\det(AB) = \sum_{\substack{(g_1, g_2, \ldots, g_n) \in [m]^n;\\ g_1 < g_2 < \cdots < g_n}} \det\left( \operatorname{cols}_{g_1, g_2, \ldots, g_n} A \right) \cdot \det\left( \operatorname{rows}_{g_1, g_2, \ldots, g_n} B \right).$$
The sum runs over all ways to form an n × n-matrix by picking n columns of A
(in increasing order, with no repetitions). The corresponding n rows of B form
an n × n-matrix as well.
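The Cauchy–Binet formula can be spot-checked numerically. The Python snippet below (an illustration, not part of the original notes) verifies it for a 2 × 3 matrix A and a 3 × 2 matrix B with integer entries:

```python
from itertools import permutations, combinations
from math import prod

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def det(M):
    # Leibniz formula for the determinant
    return sum(sign(p) * prod(M[i][p[i]] for i in range(len(M)))
               for p in permutations(range(len(M))))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

n, m = 2, 3
A = [[1, 2, 3], [4, 5, 6]]          # n x m
B = [[7, 8], [9, 10], [11, 12]]     # m x n

lhs = det(matmul(A, B))
rhs = 0
for g in combinations(range(m), n):   # strictly increasing choices g_1 < ... < g_n
    colsA = [[A[i][j] for j in g] for i in range(n)]   # the n chosen columns of A
    rowsB = [B[j] for j in g]                          # the corresponding n rows of B
    rhs += det(colsA) * det(rowsB)
assert lhs == rhs   # both sides equal 36 for this choice of A and B
```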
Example 6.4.19. Let n = 2 and m = 3, and let $A = \begin{pmatrix} a & b & c \\ a' & b' & c' \end{pmatrix}$ and $B = \begin{pmatrix} x & x' \\ y & y' \\ z & z' \end{pmatrix}$. Then, Theorem 6.4.18 yields
$$\begin{aligned}
\det(AB) &= \sum_{\substack{(g_1, g_2) \in [3]^2;\\ g_1 < g_2}} \det\left( \operatorname{cols}_{g_1, g_2} A \right) \cdot \det\left( \operatorname{rows}_{g_1, g_2} B \right) \\
&= \det(\operatorname{cols}_{1,2} A) \cdot \det(\operatorname{rows}_{1,2} B) + \det(\operatorname{cols}_{1,3} A) \cdot \det(\operatorname{rows}_{1,3} B) + \det(\operatorname{cols}_{2,3} A) \cdot \det(\operatorname{rows}_{2,3} B) \\
&\qquad \left( \begin{array}{c} \text{since the only 2-tuples } (g_1, g_2) \in [3]^2\\ \text{satisfying } g_1 < g_2 \text{ are } (1,2) \text{ and } (1,3) \text{ and } (2,3) \end{array} \right) \\
&= \det\begin{pmatrix} a & b \\ a' & b' \end{pmatrix} \cdot \det\begin{pmatrix} x & x' \\ y & y' \end{pmatrix} + \det\begin{pmatrix} a & c \\ a' & c' \end{pmatrix} \cdot \det\begin{pmatrix} x & x' \\ z & z' \end{pmatrix} + \det\begin{pmatrix} b & c \\ b' & c' \end{pmatrix} \cdot \det\begin{pmatrix} y & y' \\ z & z' \end{pmatrix} \\
&= \left( ab' - ba' \right)\left( xy' - yx' \right) + \left( ac' - ca' \right)\left( xz' - zx' \right) + \left( bc' - cb' \right)\left( yz' - zy' \right).
\end{aligned}$$
This agrees with what we obtain by computing det(AB) directly: indeed,
$$AB = \begin{pmatrix} a & b & c \\ a' & b' & c' \end{pmatrix} \begin{pmatrix} x & x' \\ y & y' \\ z & z' \end{pmatrix} = \begin{pmatrix} ax + by + cz & ax' + by' + cz' \\ a'x + b'y + c'z & a'x' + b'y' + c'z' \end{pmatrix},$$
so that
$$\det(AB) = \left( ax + by + cz \right)\left( a'x' + b'y' + c'z' \right) - \left( ax' + by' + cz' \right)\left( a'x + b'y + c'z \right).$$
$$\det(AB) = \sum_{\substack{(g_1, g_2, \ldots, g_n) \in [m]^n;\\ g_1 < g_2 < \cdots < g_n}} \det\left( \operatorname{cols}_{g_1, g_2, \ldots, g_n} A \right) \cdot \det\left( \operatorname{rows}_{g_1, g_2, \ldots, g_n} B \right) = (\text{empty sum}) = 0$$
(since the set [m] has no n distinct elements, so that the sum is empty).
When K is a field, this can also be seen trivially from rank considerations (to
wit, the matrix A has rank ≤ m < n, and thus the product AB has rank < n
as well). When K is not a field, the notion of a rank is not available, so we do
need Theorem 6.4.18 to obtain this (although there are ways around this).
If m = n, then the claim of Theorem 6.4.18 becomes
$$\begin{aligned}
\det(AB) &= \sum_{\substack{(g_1, g_2, \ldots, g_n) \in [n]^n;\\ g_1 < g_2 < \cdots < g_n}} \det\left( \operatorname{cols}_{g_1, g_2, \ldots, g_n} A \right) \cdot \det\left( \operatorname{rows}_{g_1, g_2, \ldots, g_n} B \right) \\
&= \det\Big( \underbrace{\operatorname{cols}_{1,2,\ldots,n} A}_{=A} \Big) \cdot \det\Big( \underbrace{\operatorname{rows}_{1,2,\ldots,n} B}_{=B} \Big) \qquad \left( \begin{array}{c} \text{since the only } n\text{-tuple } (g_1, g_2, \ldots, g_n) \in [n]^n\\ \text{satisfying } g_1 < g_2 < \cdots < g_n \text{ is } (1, 2, \ldots, n) \end{array} \right) \\
&= \det A \cdot \det B,
\end{aligned}$$
which is precisely the claim of Theorem 6.4.16.
6.4.4. det ( A + B)
So much for det ( AB). What can we say about det ( A + B) ? The answer is
somewhat cumbersome, but still rather useful. We need some notation to state
it:
with
$$u_1 < u_2 < \cdots < u_p \qquad\text{and}\qquad v_1 < v_2 < \cdots < v_q,$$
we set
$$\operatorname{sub}^V_U A := \left( A_{u_i, v_j} \right)_{1\le i\le p,\ 1\le j\le q}.$$
Roughly speaking, $\operatorname{sub}^V_U A$ is the matrix obtained from A by focusing only on the i-th rows for i ∈ U (that is, removing all the other rows) and only on the j-th columns for j ∈ V (that is, removing all the other columns).

This matrix $\operatorname{sub}^V_U A$ is called the submatrix of A obtained by restricting to the U-rows and the V-columns. If this matrix is square (i.e., if |U| = |V|), then its determinant $\det\left( \operatorname{sub}^V_U A \right)$ is called a minor of A.
and
$$\operatorname{sub}^{\{3\}}_{\{2\}} \begin{pmatrix} a & b & c \\ a' & b' & c' \\ a'' & b'' & c'' \end{pmatrix} = \begin{pmatrix} c' \end{pmatrix}.$$
Theorem 6.4.23. Let n ∈ ℕ. For any subset I of [n], we let $\widetilde{I}$ be the complement [n] \ I of I. (For example, if n = 4 and I = {1, 4}, then $\widetilde{I}$ = {2, 3}.)

For any finite set S of integers, define $\operatorname{sum} S := \sum_{s\in S} s$.

Let A and B be two n × n-matrices in $K^{n\times n}$. Then,
$$\det(A+B) = \sum_{P\subseteq[n]} \ \sum_{\substack{Q\subseteq[n];\\ |P|=|Q|}} (-1)^{\operatorname{sum} P + \operatorname{sum} Q} \det\left( \operatorname{sub}^Q_P A \right) \cdot \det\left( \operatorname{sub}^{\widetilde{Q}}_{\widetilde{P}} B \right).$$
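Theorem 6.4.23 is easy to test by machine on a small example. The Python snippet below (an illustration, not part of the original notes) evaluates the double sum for two 3 × 3 integer matrices and compares it with det(A + B); note that 0-based index sums differ from the 1-based sums in the theorem by the even number 2|P|, so the sign is unaffected:

```python
from itertools import permutations, combinations
from math import prod

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def det(M):
    # Leibniz formula; the determinant of the empty (0x0) matrix comes out as 1, as it should
    return sum(sign(p) * prod(M[i][p[i]] for i in range(len(M)))
               for p in permutations(range(len(M))))

def sub(M, rows, cols):
    # the submatrix obtained by keeping the given rows and columns (sorted, 0-based)
    return [[M[r][c] for c in cols] for r in rows]

n = 3
A = [[1, 2, 0], [3, 1, 4], [0, 2, 2]]
B = [[2, 0, 1], [1, 1, 0], [5, 3, 2]]

total = 0
for k in range(n + 1):
    for P in combinations(range(n), k):
        for Q in combinations(range(n), k):
            Pc = [i for i in range(n) if i not in P]   # complement of P
            Qc = [j for j in range(n) if j not in Q]   # complement of Q
            total += (-1) ** (sum(P) + sum(Q)) * det(sub(A, P, Q)) * det(sub(B, Pc, Qc))

AplusB = [[A[i][j] + B[i][j] for j in range(n)] for i in range(n)]
assert det(AplusB) == total
```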
Proof of Theorem 6.4.23. See [Grinbe15, Theorem 6.160]. (Note that $\operatorname{sub}^Q_P A$ is called $\operatorname{sub}^{w(Q)}_{w(P)} A$ in [Grinbe15].) The main difficulty of the proof is bookkeeping; the underlying idea is simple (expand everything and regroup).
Here is a rough outline of the argument. If σ ∈ Sn , and if P is a subset of [n], then
σ ( P) = {σ (i ) | i ∈ P} is a subset of [n] as well, and has the same size as P (since σ is
a permutation and therefore injective); thus, it satisfies | P| = |σ ( P)|.
(here, we have split the inner sum according to the value of the subset σ ( P) = {σ (i ) | i ∈ P},
recalling that it satisfies | P| = |σ ( P)|).
Now, fix two subsets P and Q of [n] satisfying |P| = |Q|. Thus, $\left|\widetilde{P}\right| = \left|\widetilde{Q}\right|$ as well. Write the sets P, Q, $\widetilde{P}$ and $\widetilde{Q}$ as
$$P = \{p_1 < p_2 < \cdots < p_k\}, \qquad Q = \{q_1 < q_2 < \cdots < q_k\}, \qquad \widetilde{P} = \{p'_1 < p'_2 < \cdots < p'_\ell\}, \qquad \widetilde{Q} = \{q'_1 < q'_2 < \cdots < q'_\ell\},$$
where the notation "U = {u₁ < u₂ < ⋯ < uₐ}" is just a shorthand way to say "U = {u₁, u₂, …, uₐ} and u₁ < u₂ < ⋯ < uₐ" (or, equivalently, "the elements of U in strictly increasing order are u₁, u₂, …, uₐ"). Now, for each permutation σ ∈ Sₙ satisfying σ(P) = Q, we see that:
• The elements σ(p₁), σ(p₂), …, σ(p_k) are the elements q₁, q₂, …, q_k in some order (since σ(P) = Q), and thus there exists a unique permutation α ∈ S_k such that
$$\sigma(p_i) = q_{\alpha(i)} \qquad \text{for each } i \in [k].$$
We denote this α by $\alpha_\sigma$.
• The elements σ(p′₁), σ(p′₂), …, σ(p′_ℓ) are the elements q′₁, q′₂, …, q′_ℓ in some order (since σ(P) = Q entails $\sigma\left(\widetilde{P}\right) = \widetilde{Q}$, because σ is a permutation of [n]), and thus there exists a unique permutation β ∈ S_ℓ such that
$$\sigma(p'_i) = q'_{\beta(i)} \qquad \text{for each } i \in [\ell].$$
We denote this β by $\beta_\sigma$.
Thus, for each permutation σ ∈ Sₙ satisfying σ(P) = Q, we have defined two permutations $\alpha_\sigma \in S_k$ and $\beta_\sigma \in S_\ell$ that (roughly speaking) describe the actions of σ on the subsets P and $\widetilde{P}$, respectively. It is not hard to see that the map
$$\{\text{permutations } \sigma \in S_n \text{ satisfying } \sigma(P) = Q\} \to S_k \times S_\ell, \qquad \sigma \mapsto (\alpha_\sigma, \beta_\sigma)$$
is a bijection, and that each permutation σ ∈ Sₙ satisfying σ(P) = Q satisfies
$$(-1)^{\sigma} = (-1)^{\operatorname{sum} P + \operatorname{sum} Q} (-1)^{\alpha_\sigma} (-1)^{\beta_\sigma}. \qquad (230)$$
(Proving this is perhaps the least pleasant part of this proof, but it is pure combina-
torics. It is probably easiest to reduce this to the case when P = [k ] and Q = [k ] by a re-
duction procedure that involves multiplying σ by sum P + sum Q − 2 · (1 + 2 + · · · + k )
many transpositions106 . Once we are in the case P = [k ] and Q = [k ], we can prove
(230) by directly counting the inversions of σ (showing that ℓ (σ) = ℓ (ασ ) + ℓ ( β σ )).
Note that the proof I give in [Grinbe15, Theorem 6.160] avoids proving (230), opting
106 Specifically: Recall the simple transpositions si defined for all i ∈ [n − 1] in Definition 5.2.3.
By replacing σ by σsi (for some i ∈ [n − 1]), we can swap two adjacent values of σ (namely,
σ (i ) and σ (i + 1)). Furthermore, σ ( P) = Q implies (σsi ) (si ( P)) = Q (since si si = s2i = id).
Thus, the equality σ ( P) = Q is preserved if we simultaneously replace σ by σsi and replace
P by si ( P). Such a simultaneous replacement will be called a shift. Furthermore, if i is
chosen in such a way that i ∉ P and i + 1 ∈ P, then this shift will be called a left shift.
Let us see what happens to σ, P, ασ and β σ when we perform a left shift. Indeed,
consider a left shift which replaces σ by σsi and replaces P by si ( P), where i ∈ [n − 1] is
chosen in such a way that i ∉ P and i + 1 ∈ P. The set sᵢ(P) is the set P with the element
i + 1 replaced by i. Thus, sum (si ( P)) = sum P − 1. In other words, our left shift has
decremented sum P by 1. Thus, our left shift has flipped the sign of (−1)sum P+sum Q (since
sum Q obviously stays unchanged). The permutation σ has been replaced by σsi , which is
the same permutation as σ but with the values σ(i) and σ(i + 1) swapped. Since i ∉ P and i + 1 ∈ P, this swap has not disturbed the relative order of the elements of P and $\widetilde{P}$ (but merely replaced i + 1 by i in P and replaced i by i + 1 in $\widetilde{P}$), so that the permutations $\alpha_\sigma$ and $\beta_\sigma$ have not changed. Our left shift thus has left $\alpha_\sigma$ and $\beta_\sigma$ unchanged. It has, however, flipped the sign of $(-1)^{\sigma}$ (because $(-1)^{\sigma s_i} = (-1)^{\sigma} \underbrace{(-1)^{s_i}}_{=-1} = -(-1)^{\sigma}$).
Let us summarize: Each left shift leaves ασ and β σ unchanged, while flipping the signs of
(−1)sum P+sum Q and (−1)σ . However, by performing left shifts, we can move the smallest
element of P one step to the left, or (if the smallest element of P is already 1) we can move
the second-smallest element of P one step to the left, or (if the two smallest elements of P
are already 1 and 2) we can move the third-smallest element of P one step to the left, and so
on. After sufficiently many such left shifts, the set P will have become [k], whereas ασ and
β σ will not have changed (because we have seen above that each left shift leaves ασ and β σ
unchanged). The total number of left shifts we need for this is ( p1 − 1) + ( p2 − 2) + · · · +
( pk − k) = sum P − (1 + 2 + · · · + k).
Likewise, we can define left co-shifts, which are operations similar to left shifts but act-
ing on the values rather than positions of σ and acting on Q rather than P. Explicitly, a
left co-shift replaces σ by si σ and replaces Q by si ( Q), where i ∈ [n − 1] is chosen such
that i ∉ Q and i + 1 ∈ Q. Again, we can see that each left co-shift leaves $\alpha_\sigma$ and $\beta_\sigma$
unchanged, while flipping the signs of (−1)sum P+sum Q and (−1)σ . After a sequence of
sum Q − (1 + 2 + · · · + k ) left co-shifts, the set Q will have become [k].
Each left shift and each left co-shift multiplies the permutation σ by a transposition. Hence, after our sum P − (1 + 2 + ⋯ + k) left shifts and our sum Q − (1 + 2 + ⋯ + k) left co-shifts, we will have multiplied σ by altogether sum P + sum Q − 2 · (1 + 2 + ⋯ + k) many transpositions.
instead for an argument using row permutations; see [Grinbe15, solution to Exercise
6.44] for the details.)
Now,
$$\begin{aligned}
&\sum_{\substack{\sigma\in S_n;\\ \sigma(P)=Q}} \underbrace{(-1)^{\sigma}}_{\substack{=(-1)^{\operatorname{sum} P + \operatorname{sum} Q} (-1)^{\alpha_\sigma} (-1)^{\beta_\sigma}\\ \text{(by (230))}}} \underbrace{\left( \prod_{i\in P} A_{i,\sigma(i)} \right)}_{\substack{=\prod_{i=1}^{k} A_{p_i, q_{\alpha_\sigma(i)}}\\ \text{(by the definition of } \alpha_\sigma)}} \underbrace{\left( \prod_{i\in\widetilde{P}} B_{i,\sigma(i)} \right)}_{\substack{=\prod_{i=1}^{\ell} B_{p'_i, q'_{\beta_\sigma(i)}}\\ \text{(by the definition of } \beta_\sigma)}} \\
&\quad= \sum_{\substack{\sigma\in S_n;\\ \sigma(P)=Q}} (-1)^{\operatorname{sum} P + \operatorname{sum} Q} (-1)^{\alpha_\sigma} (-1)^{\beta_\sigma} \left( \prod_{i=1}^{k} A_{p_i, q_{\alpha_\sigma(i)}} \right) \left( \prod_{i=1}^{\ell} B_{p'_i, q'_{\beta_\sigma(i)}} \right) \\
&\quad= \sum_{(\alpha,\beta)\in S_k\times S_\ell} (-1)^{\operatorname{sum} P + \operatorname{sum} Q} (-1)^{\alpha} (-1)^{\beta} \left( \prod_{i=1}^{k} A_{p_i, q_{\alpha(i)}} \right) \left( \prod_{i=1}^{\ell} B_{p'_i, q'_{\beta(i)}} \right) \\
&\qquad\qquad \left( \begin{array}{c} \text{here, we have substituted } (\alpha,\beta) \text{ for } (\alpha_\sigma, \beta_\sigma) \text{ in the sum,}\\ \text{since the map } \sigma \mapsto (\alpha_\sigma, \beta_\sigma) \text{ is a bijection} \end{array} \right) \\
&\quad= (-1)^{\operatorname{sum} P + \operatorname{sum} Q} \underbrace{\left( \sum_{\alpha\in S_k} (-1)^{\alpha} \prod_{i=1}^{k} A_{p_i, q_{\alpha(i)}} \right)}_{\substack{=\det\left( \operatorname{sub}^Q_P A \right)\\ \text{(by the definition of } \operatorname{sub}^Q_P A\\ \text{and its determinant)}}} \underbrace{\left( \sum_{\beta\in S_\ell} (-1)^{\beta} \prod_{i=1}^{\ell} B_{p'_i, q'_{\beta(i)}} \right)}_{\substack{=\det\left( \operatorname{sub}^{\widetilde{Q}}_{\widetilde{P}} B \right)\\ \text{(by the definition of } \operatorname{sub}^{\widetilde{Q}}_{\widetilde{P}} B\\ \text{and its determinant)}}} \\
&\quad= (-1)^{\operatorname{sum} P + \operatorname{sum} Q} \det\left( \operatorname{sub}^Q_P A \right) \cdot \det\left( \operatorname{sub}^{\widetilde{Q}}_{\widetilde{P}} B \right). \qquad (231)
\end{aligned}$$
Forget that we fixed P and Q. We thus have proved (231) for any two subsets P and Q of [n] satisfying |P| = |Q|. Thus, our original computation of det(A + B) becomes
$$\begin{aligned}
\det(A+B) &= \sum_{P\subseteq[n]} \ \sum_{\substack{Q\subseteq[n];\\ |P|=|Q|}} \underbrace{\sum_{\substack{\sigma\in S_n;\\ \sigma(P)=Q}} (-1)^{\sigma} \left( \prod_{i\in P} A_{i,\sigma(i)} \right) \left( \prod_{i\in\widetilde{P}} B_{i,\sigma(i)} \right)}_{\substack{=(-1)^{\operatorname{sum} P + \operatorname{sum} Q} \det\left( \operatorname{sub}^Q_P A \right) \cdot \det\left( \operatorname{sub}^{\widetilde{Q}}_{\widetilde{P}} B \right)\\ \text{(by (231))}}} \\
&= \sum_{P\subseteq[n]} \ \sum_{\substack{Q\subseteq[n];\\ |P|=|Q|}} (-1)^{\operatorname{sum} P + \operatorname{sum} Q} \det\left( \operatorname{sub}^Q_P A \right) \cdot \det\left( \operatorname{sub}^{\widetilde{Q}}_{\widetilde{P}} B \right).
\end{aligned}$$
This proves Theorem 6.4.23.
We shall soon see a few applications of Theorem 6.4.23. First, let us observe
a simple property of diagonal matrices:107
107 We are using the Iverson bracket notation (see Definition A.1.5) again.
Proof of Lemma 6.4.25. This is [Grinbe15, Lemma 6.163] (slightly rewritten); see [Grinbe15, Exercise 6.49] for a detailed proof. (That said, it is almost obvious: In part (a), the submatrix $\operatorname{sub}^P_P D$ is itself diagonal, and its diagonal entries are precisely the $d_i$ for i ∈ P. In part (b), the submatrix $\operatorname{sub}^Q_P D$ has a zero row (indeed, from |P| = |Q| and P ≠ Q, we see that there exists some i ∈ P \ Q, and then the corresponding row of $\operatorname{sub}^Q_P D$ is zero) and thus has determinant 0.)
Lemma 6.4.25 gives very simple formulas for minors of diagonal matrices.
Thus, the formula of Theorem 6.4.23 becomes simpler when the matrix B is
diagonal:
Then,
$$\det(A + D) = \sum_{P\subseteq[n]} \det\left( \operatorname{sub}^P_P A \right) \cdot \prod_{i\in[n]\setminus P} d_i.$$
The minors $\det\left( \operatorname{sub}^P_P A \right)$ of an n × n-matrix A are known as the principal minors of A.
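This principal-minor expansion, too, can be confirmed by machine on a small example. The Python snippet below (an illustration, not part of the original notes) compares det(A + D) with the sum over all principal minors of A, weighted by products of the complementary diagonal entries of D:

```python
from itertools import permutations, combinations
from math import prod

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def det(M):
    # Leibniz formula; the 0x0 determinant comes out as 1
    return sum(sign(p) * prod(M[i][p[i]] for i in range(len(M)))
               for p in permutations(range(len(M))))

n = 3
A = [[1, 2, 0], [3, 1, 4], [0, 2, 2]]
d = [5, 7, 11]
S = [[A[i][j] + (d[i] if i == j else 0) for j in range(n)] for i in range(n)]  # A + D

total = 0
for k in range(n + 1):
    for P in combinations(range(n), k):
        principal_minor = det([[A[i][j] for j in P] for i in P])   # det(sub_P^P A)
        total += principal_minor * prod(d[i] for i in range(n) if i not in P)

assert det(S) == total
```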
$$\begin{aligned}
\det(A + D) &= \sum_{P\subseteq[3]} \det\left( \operatorname{sub}^P_P A \right) \cdot \prod_{i\in[3]\setminus P} d_i \\
&= \underbrace{\det\left( \operatorname{sub}^{\varnothing}_{\varnothing} A \right)}_{=1} \cdot \underbrace{\prod_{i\in[3]\setminus\varnothing} d_i}_{=d_1 d_2 d_3} + \underbrace{\det\left( \operatorname{sub}^{\{1\}}_{\{1\}} A \right)}_{=A_{1,1}} \cdot \underbrace{\prod_{i\in[3]\setminus\{1\}} d_i}_{=d_2 d_3} + \underbrace{\det\left( \operatorname{sub}^{\{2\}}_{\{2\}} A \right)}_{=A_{2,2}} \cdot \underbrace{\prod_{i\in[3]\setminus\{2\}} d_i}_{=d_1 d_3} + \underbrace{\det\left( \operatorname{sub}^{\{3\}}_{\{3\}} A \right)}_{=A_{3,3}} \cdot \underbrace{\prod_{i\in[3]\setminus\{3\}} d_i}_{=d_1 d_2} \\
&\quad + \underbrace{\det\left( \operatorname{sub}^{\{1,2\}}_{\{1,2\}} A \right)}_{=\det\begin{pmatrix} A_{1,1} & A_{1,2} \\ A_{2,1} & A_{2,2} \end{pmatrix}} \cdot \underbrace{\prod_{i\in[3]\setminus\{1,2\}} d_i}_{=d_3} + \underbrace{\det\left( \operatorname{sub}^{\{1,3\}}_{\{1,3\}} A \right)}_{=\det\begin{pmatrix} A_{1,1} & A_{1,3} \\ A_{3,1} & A_{3,3} \end{pmatrix}} \cdot \underbrace{\prod_{i\in[3]\setminus\{1,3\}} d_i}_{=d_2} + \underbrace{\det\left( \operatorname{sub}^{\{2,3\}}_{\{2,3\}} A \right)}_{=\det\begin{pmatrix} A_{2,2} & A_{2,3} \\ A_{3,2} & A_{3,3} \end{pmatrix}} \cdot \underbrace{\prod_{i\in[3]\setminus\{2,3\}} d_i}_{=d_1} \\
&\quad + \underbrace{\det\left( \operatorname{sub}^{\{1,2,3\}}_{\{1,2,3\}} A \right)}_{=\det A} \cdot \underbrace{\prod_{i\in[3]\setminus\{1,2,3\}} d_i}_{=(\text{empty product}) = 1}
\end{aligned}$$
Proof of Theorem 6.4.26 (sketched). (See [Grinbe15, Corollary 6.162] for details.) We shall use the notations $\widetilde{I}$ and sum S as defined in Theorem 6.4.23. If P and Q are two subsets of [n] satisfying |P| = |Q| but P ≠ Q, then their complements $\widetilde{P}$ and $\widetilde{Q}$ are also distinct (since P ≠ Q) and satisfy $\left|\widetilde{P}\right| = \left|\widetilde{Q}\right|$ (since |P| = |Q|), and therefore Lemma 6.4.25 (b) (applied to $\widetilde{P}$ and $\widetilde{Q}$ instead of P and Q) yields
$$\det\left( \operatorname{sub}^{\widetilde{Q}}_{\widetilde{P}} D \right) = 0. \qquad (232)$$
Hence, Theorem 6.4.23 (applied to B = D) yields
$$\begin{aligned}
\det(A + D) &= \sum_{P\subseteq[n]} \ \sum_{\substack{Q\subseteq[n];\\ |P|=|Q|}} (-1)^{\operatorname{sum} P + \operatorname{sum} Q} \det\left( \operatorname{sub}^Q_P A \right) \cdot \underbrace{\det\left( \operatorname{sub}^{\widetilde{Q}}_{\widetilde{P}} D \right)}_{\substack{\text{This is } 0 \text{ if } P \neq Q\\ \text{(by (232))}}} \\
&= \sum_{P\subseteq[n]} \underbrace{(-1)^{\operatorname{sum} P + \operatorname{sum} P}}_{=1} \det\left( \operatorname{sub}^P_P A \right) \cdot \underbrace{\det\left( \operatorname{sub}^{\widetilde{P}}_{\widetilde{P}} D \right)}_{\substack{=\prod_{i\in\widetilde{P}} d_i\\ \text{(by Lemma 6.4.25 (a))}}} \qquad \left( \begin{array}{c} \text{here, we have removed all addends with } P \neq Q\\ \text{from the double sum, since these addends are } 0 \end{array} \right) \\
&= \sum_{P\subseteq[n]} \det\left( \operatorname{sub}^P_P A \right) \cdot \prod_{i\in\widetilde{P}} d_i.
\end{aligned}$$
This proves Theorem 6.4.26.
Then,
$$\det F = d_1 d_2 \cdots d_n + x \sum_{i=1}^{n} d_1 d_2 \cdots \widehat{d_i} \cdots d_n,$$
where the hat over the "$d_i$" means "omit the $d_i$ factor" (that is, the expression "$d_1 d_2 \cdots \widehat{d_i} \cdots d_n$" is to be understood as "$d_1 d_2 \cdots d_{i-1} d_{i+1} d_{i+2} \cdots d_n$").
Proof of Proposition 6.4.28 (sketched). Define the two n × n-matrices
$$A := (x)_{1\le i\le n,\ 1\le j\le n} = \begin{pmatrix} x & x & \cdots & x \\ x & x & \cdots & x \\ \vdots & \vdots & \ddots & \vdots \\ x & x & \cdots & x \end{pmatrix} \in K^{n\times n} \qquad\text{and}\qquad D := \left( d_i [i=j] \right)_{1\le i\le n,\ 1\le j\le n} = \begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{pmatrix} \in K^{n\times n}.$$
Then, F = A + D. Moreover, every subset P of [n] with |P| ≥ 2 satisfies $\det\left( \operatorname{sub}^P_P A \right) = 0$ (since the matrix $\operatorname{sub}^P_P A$ then has two equal rows). Hence, Theorem 6.4.26 yields
$$\begin{aligned}
\det F &= \sum_{\substack{P\subseteq[n];\\ |P|\le 1}} \det\left( \operatorname{sub}^P_P A \right) \cdot \prod_{i\in[n]\setminus P} d_i \\
&= \underbrace{\det\left( \operatorname{sub}^{\varnothing}_{\varnothing} A \right)}_{\substack{=1\\ \text{(since } \operatorname{sub}^{\varnothing}_{\varnothing} A \text{ is}\\ \text{a } 0\times 0\text{-matrix)}}} \cdot \underbrace{\prod_{i\in[n]\setminus\varnothing} d_i}_{=\prod_{i\in[n]} d_i = d_1 d_2 \cdots d_n} + \sum_{p=1}^{n} \underbrace{\det\left( \operatorname{sub}^{\{p\}}_{\{p\}} A \right)}_{\substack{=x\\ \text{(since } \operatorname{sub}^{\{p\}}_{\{p\}} A = (x))}} \cdot \underbrace{\prod_{i\in[n]\setminus\{p\}} d_i}_{=d_1 d_2 \cdots \widehat{d_p} \cdots d_n} \\
&\qquad \left( \begin{array}{c} \text{since the subsets } P \text{ of } [n] \text{ satisfying } |P| \le 1\\ \text{are the } n+1 \text{ subsets } \varnothing, \{1\}, \{2\}, \ldots, \{n\} \end{array} \right) \\
&= d_1 d_2 \cdots d_n + \sum_{p=1}^{n} x \cdot d_1 d_2 \cdots \widehat{d_p} \cdots d_n = d_1 d_2 \cdots d_n + x \sum_{p=1}^{n} d_1 d_2 \cdots \widehat{d_p} \cdots d_n \\
&= d_1 d_2 \cdots d_n + x \sum_{i=1}^{n} d_1 d_2 \cdots \widehat{d_i} \cdots d_n
\end{aligned}$$
(here, we have renamed the summation index p as i). This proves Proposition 6.4.28.
n
next-highest term is − Tr A · x n−1 where Tr A := ∑ Ai,i is the trace of A, and
i =1
whose constant term is (−1)n det A. We shall extend this by explicitly com-
puting all coefficients of this polynomial. For the sake of simplicity, we will
compute det ( A + xIn ) instead of det ( xIn − A) (this is tantamount to replacing
x by − x), and we will take x to be an element of K rather than an indeterminate
(but this setting is more general, since we can take K itself to be a polynomial
ring and then choose x to be its indeterminate). Here is our formula:
Proof of Proposition 6.4.29 (sketched). (See [Grinbe15, Corollary 6.164] for details.) The matrix $xI_n$ is diagonal, and its diagonal entries are x, x, …, x; in fact,
$$xI_n = \left( x[i=j] \right)_{1\le i\le n,\ 1\le j\le n} = \begin{pmatrix} x & 0 & \cdots & 0 \\ 0 & x & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & x \end{pmatrix}.$$
Hence, Theorem 6.4.26 (applied to $D = xI_n$ and $d_i = x$) yields
$$\begin{aligned}
\det(A + xI_n) &= \sum_{P\subseteq[n]} \det\left( \operatorname{sub}^P_P A \right) \cdot \underbrace{\prod_{i\in[n]\setminus P} x}_{\substack{=x^{|[n]\setminus P|} = x^{n-|P|}\\ \text{(since } |[n]\setminus P| = |[n]| - |P| = n - |P|)}} = \sum_{P\subseteq[n]} \det\left( \operatorname{sub}^P_P A \right) \cdot x^{n-|P|} \\
&= \sum_{k=0}^{n} \left( \sum_{\substack{P\subseteq[n];\\ |P|=k}} \det\left( \operatorname{sub}^P_P A \right) \right) x^{n-k} \qquad \left( \begin{array}{c} \text{here, we have split the sum}\\ \text{according to the value of } |P| \end{array} \right) \\
&= \sum_{k=0}^{n} \left( \sum_{\substack{P\subseteq[n];\\ |P|=n-k}} \det\left( \operatorname{sub}^P_P A \right) \right) x^{k} \qquad \left( \begin{array}{c} \text{here, we have substituted } n-k\\ \text{for } k \text{ in the sum} \end{array} \right).
\end{aligned}$$
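The coefficient formula just derived can be tested numerically: compute each coefficient of det(A + xIₙ) as a sum of principal minors of size n − k, then compare the resulting polynomial with the determinant evaluated at several integer values of x. The following Python snippet (an illustration, not part of the original notes) does exactly that:

```python
from itertools import permutations, combinations
from math import prod

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def det(M):
    # Leibniz formula; the 0x0 determinant comes out as 1
    return sum(sign(p) * prod(M[i][p[i]] for i in range(len(M)))
               for p in permutations(range(len(M))))

n = 3
A = [[1, 2, 0], [3, 1, 4], [0, 2, 2]]
# c[k] = coefficient of x^k in det(A + x I_n): a sum of principal minors of size n - k
c = [sum(det([[A[i][j] for j in P] for i in P]) for P in combinations(range(n), n - k))
     for k in range(n + 1)]

for x in range(-3, 4):
    AxI = [[A[i][j] + (x if i == j else 0) for j in range(n)] for i in range(n)]
    assert det(AxI) == sum(c[k] * x ** k for k in range(n + 1))
```

As expected, the leading coefficient c[n] is 1, and c[n−1] is the trace of A.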
There are many ways to prove Proposition 6.4.30, but here is a particularly
simple one:
Proof of Proposition 6.4.30 (sketched). (See [Grinbe15, Exercise 6.11] for details.)
$$\det A = \det(LU) = \underbrace{\det L}_{=1} \cdot \underbrace{\det U}_{=1} = 1 \qquad (\text{by Theorem 6.4.16});$$
this proves Proposition 6.4.30.
How can you discover such a proof? Our serendipitous factorization of A as
LU might appear unmotivated, but from the viewpoint of linear algebra it is an
instance of a well-known and well-understood kind of factorization, known as
the LU-decomposition or the LU-factorization. Over a field, almost every square
matrix110 has an LU-decomposition (i.e., a factorization as a product of a lower-
triangular matrix with an upper-triangular matrix). This LU-decomposition is
unique if you require (e.g.) the diagonal entries of the lower-triangular factor
to all equal 1. It can furthermore be algorithmically computed using Gaussian
elimination (see, e.g., [OlvSha18, §1.3, Theorem 1.3]). Now, computing the LU-
decomposition of the matrix A from Proposition 6.4.30 for n = 4, we find
$$\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 \\ 1 & 3 & 6 & 10 \\ 1 & 4 & 10 & 20 \end{pmatrix} = \underbrace{\begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 2 & 1 & 0 \\ 1 & 3 & 3 & 1 \end{pmatrix}}_{\text{this is the lower-triangular factor}} \underbrace{\begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 1 \end{pmatrix}}_{\text{this is the upper-triangular factor}}.$$
The entries of both factors appear to be the binomial coefficients familiar from Pascal's triangle. This suggests that we might have
$$L = \left( \binom{i-1}{k-1} \right)_{1\le i\le n,\ 1\le k\le n} \qquad\text{and}\qquad U = \left( \binom{j-1}{k-1} \right)_{1\le k\le n,\ 1\le j\le n},$$
not just for n = 4 but also for arbitrary n ∈ N. And once this guess has been
made, it is easy to prove that A = LU (our proof above is not the only one
possible; four proofs appear in [EdeStr04]).
This is not the only example where LU-decomposition helps compute a de-
terminant (see, e.g., [Kratte99, §2.6] for examples). Sometimes it is helpful to
transpose a matrix, or to permute its rows or columns to obtain a matrix with
a good LU-decomposition.
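The Pascal-matrix factorization above is easy to confirm by machine. The Python snippet below (an illustration, not part of the original notes) builds the 0-based versions of A, L and U from binomial coefficients, checks A = LU (which amounts to the Vandermonde convolution $\sum_k \binom{i}{k}\binom{j}{k} = \binom{i+j}{i}$), and confirms det A = 1:

```python
from itertools import permutations
from math import comb, prod

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def det(M):
    # Leibniz formula for the determinant
    return sum(sign(p) * prod(M[i][p[i]] for i in range(len(M)))
               for p in permutations(range(len(M))))

n = 5
A = [[comb(i + j, i) for j in range(n)] for i in range(n)]   # 0-based Pascal matrix: entry C(i+j, i)
L = [[comb(i, k) for k in range(n)] for i in range(n)]       # lower-triangular: C(i, k) = 0 for k > i
U = [[comb(j, k) for j in range(n)] for k in range(n)]       # upper-triangular: C(j, k) = 0 for k > j
LU = [[sum(L[i][k] * U[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

assert A == LU       # the Vandermonde convolution in matrix form
assert det(A) == 1   # both triangular factors have all-1 diagonals
```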
(b) We have
$$\det\left( \left( a_j^{n-i} \right)_{1\le i\le n,\ 1\le j\le n} \right) = \prod_{1\le i<j\le n} \left( a_i - a_j \right).$$
(c) We have
$$\det\left( \left( a_i^{j-1} \right)_{1\le i\le n,\ 1\le j\le n} \right) = \prod_{1\le j<i\le n} \left( a_i - a_j \right).$$
(d) We have
$$\det\left( \left( a_j^{i-1} \right)_{1\le i\le n,\ 1\le j\le n} \right) = \prod_{1\le j<i\le n} \left( a_i - a_j \right).$$
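Before proving these formulas, we can at least verify one of them numerically. The Python snippet below (an illustration, not part of the original notes) checks part (a), i.e., $\det\left( \left( a_i^{n-j} \right) \right) = \prod_{1\le i<j\le n} (a_i - a_j)$, for a sample choice of integers $a_1, \ldots, a_4$:

```python
from itertools import permutations
from math import prod

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def det(M):
    # Leibniz formula for the determinant
    return sum(sign(p) * prod(M[i][p[i]] for i in range(len(M)))
               for p in permutations(range(len(M))))

n = 4
a = [2, 5, 7, 11]
V = [[a[i] ** (n - 1 - j) for j in range(n)] for i in range(n)]  # 0-based version of (a_i^{n-j})
lhs = det(V)
rhs = prod(a[i] - a[j] for i in range(n) for j in range(i + 1, n))
assert lhs == rhs
```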
Example 6.4.32. Here is what the four parts of Theorem 6.4.31 say for n = 3:

(a) We have $\det\begin{pmatrix} a_1^2 & a_1 & 1 \\ a_2^2 & a_2 & 1 \\ a_3^2 & a_3 & 1 \end{pmatrix} = (a_1 - a_2)(a_1 - a_3)(a_2 - a_3)$.

(b) We have $\det\begin{pmatrix} a_1^2 & a_2^2 & a_3^2 \\ a_1 & a_2 & a_3 \\ 1 & 1 & 1 \end{pmatrix} = (a_1 - a_2)(a_1 - a_3)(a_2 - a_3)$.

(c) We have $\det\begin{pmatrix} 1 & a_1 & a_1^2 \\ 1 & a_2 & a_2^2 \\ 1 & a_3 & a_3^2 \end{pmatrix} = (a_2 - a_1)(a_3 - a_1)(a_3 - a_2)$.

(d) We have $\det\begin{pmatrix} 1 & 1 & 1 \\ a_1 & a_2 & a_3 \\ a_1^2 & a_2^2 & a_3^2 \end{pmatrix} = (a_2 - a_1)(a_3 - a_1)(a_3 - a_2)$.
Theorem 1]; a combinatorial proof can also be found in Exercise A.5.3.6; two
more proofs are obtained in Exercise A.5.3.8 and Exercise A.5.3.9). We will now
sketch a proof using factor hunting and polynomials. We will first focus on
proving part (a) of Theorem 6.4.31, and afterwards derive the other parts from
it.
The first step of our proof is reducing Theorem 6.4.31 (a) to the “particu-
lar case” in which K is the polynomial ring Z [ x1 , x2 , . . . , xn ] and the elements
a1 , a2 , . . . , an are the indeterminates x1 , x2 , . . . , xn . This is merely a particular
case (one possible choice of K and a1 , a2 , . . . , an among many); however, as we
will soon see, proving Theorem 6.4.31 (a) in this particular case will quickly
entail that Theorem 6.4.31 (a) holds in the general case. Let us elaborate on
this argument. First, let us state Theorem 6.4.31 (a) in this particular case as a
lemma:
(by the definition of a determinant). The right hand side of this equality is a homogeneous polynomial in x₁, x₂, …, xₙ of degree $\dfrac{n(n-1)}{2}$.¹¹¹ Thus, f is a homogeneous polynomial in x₁, x₂, …, xₙ of degree $\dfrac{n(n-1)}{2}$. Furthermore, the monomial $x_1^{n-1} x_2^{n-2} \cdots x_n^{n-n}$ appears with coefficient 1 on the right hand side of (237) (indeed, all the n! addends in the sum on this right hand side contain distinct monomials, and thus only the addend for σ = id makes any contribution to the coefficient of the monomial $x_1^{n-1} x_2^{n-2} \cdots x_n^{n-n}$). Hence,
$$\left[ x_1^{n-1} x_2^{n-2} \cdots x_n^{n-n} \right] f = 1. \qquad (238)$$

111 because it is a Z-linear combination of the monomials $x_1^{n-\sigma(1)} x_2^{n-\sigma(2)} \cdots x_n^{n-\sigma(n)}$, each of which has degree $(n-\sigma(1)) + (n-\sigma(2)) + \cdots + (n-\sigma(n)) = n^2 - (1+2+\cdots+n) = \dfrac{n(n-1)}{2}$.
Therefore, f ̸= 0.
Now, let u and v be two elements of [n] satisfying u < v. Then, the polynomial f becomes 0 when we set $x_u$ equal to $x_v$ (that is, when we substitute $x_v$ for $x_u$). Indeed, when we set $x_u$ equal to $x_v$, the matrix $\left( x_i^{n-j} \right)_{1\le i\le n,\ 1\le j\le n}$ becomes a matrix that has two equal rows (namely, its u-th and its v-th row both become equal to $\left( x_v^{n-1}, x_v^{n-2}, \ldots, x_v^{n-n} \right)$), and thus its determinant becomes 0 (by Theorem 6.4.12 (c)); but this means precisely that f becomes 0 (since f is the determinant of this matrix).
Now, we recall the following well-known property of univariate polynomials:
$\mathbb{Z}[x_1, x_2, \ldots, x_n]$ and $\left( \mathbb{Z}[x_1, x_2, \ldots, \widehat{x_u}, \ldots, x_n] \right)[x_u]$.
is 1.)
and mutually non-associate (i.e., no two of them are associate115)116. Since the ring Z[x₁, x₂, …, xₙ] is a unique factorization domain117, we thus conclude that any polynomial p ∈ Z[x₁, x₂, …, xₙ] that is a multiple of each of these linear polynomials must necessarily be a multiple of their product $\prod\limits_{1\le i<j\le n} \left( x_i - x_j \right)$.

Hence, the polynomial f must be a multiple of this product $\prod\limits_{1\le i<j\le n} \left( x_i - x_j \right)$ (since f is a multiple of each of the linear polynomials above). In other words, the polynomial f must be a multiple of g (since $g = \prod\limits_{1\le i<j\le n} \left( x_i - x_j \right)$). In other words, f = gq for some q ∈ Z[x₁, x₂, …, xₙ]. Consider this q. Clearly, gq = f ≠ 0 and thus q ≠ 0 and g ≠ 0.
The ring Z is an integral domain. Thus, any two nonzero polynomials a and
b over Z satisfy deg ( ab) = deg a + deg b. Applying this to a = g and b = q,
we find deg ( gq) = deg g + deg q. In other words, deg f = deg g + deg q (since
gq = f ).
Now, $g = \prod\limits_{1\le i<j\le n} \left( x_i - x_j \right)$ and thus
$$\begin{aligned}
\deg g &= \deg\left( \prod_{1\le i<j\le n} \left( x_i - x_j \right) \right) = \sum_{1\le i<j\le n} \underbrace{\deg\left( x_i - x_j \right)}_{=1} = \sum_{1\le i<j\le n} 1 \\
&= \left( \text{\# of pairs } (i,j) \in [n]^2 \text{ satisfying } i < j \right) = \binom{n}{2} = \frac{n(n-1)}{2}.
\end{aligned}$$
However, we have $\deg f \le \dfrac{n(n-1)}{2}$ (since f is a homogeneous polynomial of degree $\dfrac{n(n-1)}{2}$). In view of $\deg g = \dfrac{n(n-1)}{2}$, this rewrites as deg f ≤ deg g. Hence,
$$\deg g \ge \deg f = \deg g + \deg q.$$
Therefore, 0 ≥ deg q. This shows that the polynomial q is constant. In other words, q ∈ Z.
It remains to show that this constant q is 1. However, this can be done by comparing some coefficients of f and g. Indeed, let us look at the coefficient of $x_1^{n-1} x_2^{n-2} \cdots x_n^{n-n}$ (as we already know this coefficient for f to be 1).
Expanding the product $\prod\limits_{1\le i<j\le n} \left( x_i - x_j \right)$, we obtain a sum of several (in fact, $2^{n(n-1)/2}$ many) monomials with + and − signs. I claim that among these
115 Recall that two elements a and b of a commutative ring R are said to be associate if there exists
some unit u of R such that a = ub. Being associate is known (and easily verified) to be an
equivalence relation.
116 Check this! (Recall that the only units of the polynomial ring Z[x₁, x₂, …, xₙ] are 1 and −1.)
117 This is a nontrivial, but rather well-known result in abstract algebra. Proofs can be found in
[Ford21, Theorem 3.7.4], [Knapp16, Remark after Corollary 8.21], [MiRiRu87, Chapter IV,
Theorems 4.8 and 4.9] and [Edward05, Essay 1.4, Corollary of Theorem 1 and Corollary 1
of Theorem 2].
monomials, the monomial $x_1^{n-1} x_2^{n-2} \cdots x_n^{n-n}$ will appear exactly once, and with a + sign. Indeed, in order to obtain $x_1^{n-1} x_2^{n-2} \cdots x_n^{n-n}$ when expanding the product
$$\prod_{1\le i<j\le n} \left( x_i - x_j \right) = \begin{array}{cccc} (x_1 - x_2)\,(x_1 - x_3) & \cdots & (x_1 - x_n) \\ \phantom{(x_1 - x_2)\,}(x_2 - x_3) & \cdots & (x_2 - x_n) \\ & \ddots & \vdots \\ & & (x_{n-1} - x_n), \end{array}$$
it is necessary to pick the x₁ minuends from all n − 1 factors in the first row (since none of the other factors contain any x₁), then to pick the x₂ minuends from all n − 2 factors in the second row (since none of the remaining factors contain any x₂), and so on – i.e., to take the minuend (rather than the subtrahend) from each factor. Thus, only one of the monomials obtained by the expansion will be $x_1^{n-1} x_2^{n-2} \cdots x_n^{n-n}$, and it will appear with a + sign. Hence, the coefficient of $x_1^{n-1} x_2^{n-2} \cdots x_n^{n-n}$ in the product $\prod\limits_{1\le i<j\le n} \left( x_i - x_j \right)$ is 1. In other words, the coefficient of $x_1^{n-1} x_2^{n-2} \cdots x_n^{n-n}$ in g is 1 (since $g = \prod\limits_{1\le i<j\le n} \left( x_i - x_j \right)$). In other words,
$$\left[ x_1^{n-1} x_2^{n-2} \cdots x_n^{n-n} \right] g = 1.$$
Proof of Theorem 6.4.31 (sketched). (a) We have already given a proof of Theorem 6.4.31 (a) (and with Lemma 6.4.33 established, this proof is now complete).

(b) The matrix $\left( a_j^{n-i} \right)_{1\le i\le n,\ 1\le j\le n}$ is the transpose of the matrix $\left( a_i^{n-j} \right)_{1\le i\le n,\ 1\le j\le n}$. Thus, Theorem 6.4.31 (b) follows from Theorem 6.4.31 (a) using Theorem 6.4.10.
n− j
(c) Let A be the n × n-matrix ai . Then, Theorem 6.4.31 (a)
1≤i ≤n, 1≤ j≤n
∏
says that det A = ai − a j .
1≤ i < j ≤ n
Let τ ∈ Sn be the permutation of [n] that sends each j ∈ [n] to n + 1 − j.
Then, each i, j ∈ [n] satisfy j − 1 = n − τ ( j) and therefore
j −1 n−τ ( j)
ai = ai = Ai,τ ( j) (by the definition of A) .
j −1
Hence, ai = Ai,τ ( j) and thus
1≤i ≤n, 1≤ j≤n 1≤i ≤n, 1≤ j≤n
j −1
det ai = det Ai,τ ( j) = (−1)τ · det A
1≤i ≤n, 1≤ j≤n 1≤i ≤n, 1≤ j≤n
(by (223)).
However, the permutation τ has OLN (n, n − 1, n − 2, …, 1). Thus, each pair (i, j) ∈ [n]² satisfying i < j is an inversion of τ. Therefore, the Coxeter length ℓ(τ) of τ is the # of all pairs (i, j) ∈ [n]² satisfying i < j. Thus, the sign of τ is
$$(-1)^{\tau} = (-1)^{\ell(\tau)} = \prod_{1\le i<j\le n} (-1).$$
Now,
$$\begin{aligned}
\det\left( \left( a_i^{j-1} \right)_{1\le i\le n,\ 1\le j\le n} \right)
&= \underbrace{(-1)^{\tau}}_{=\prod_{1\le i<j\le n} (-1)} \cdot \underbrace{\det A}_{=\prod_{1\le i<j\le n} (a_i - a_j)} = \left( \prod_{1\le i<j\le n} (-1) \right) \cdot \left( \prod_{1\le i<j\le n} \left( a_i - a_j \right) \right) \\
&= \prod_{1\le i<j\le n} \underbrace{(-1)\left( a_i - a_j \right)}_{=a_j - a_i} = \prod_{1\le i<j\le n} \left( a_j - a_i \right) = \prod_{1\le j<i\le n} \left( a_i - a_j \right)
\end{aligned}$$
(here, we have renamed the index (i, j) as (j, i) in the product). This proves Theorem 6.4.31 (c).
(d) The matrix $\left( a_j^{i-1} \right)_{1\le i\le n,\ 1\le j\le n}$ is the transpose of the matrix $\left( a_i^{j-1} \right)_{1\le i\le n,\ 1\le j\le n}$. Thus, Theorem 6.4.31 (d) follows from Theorem 6.4.31 (c) using Theorem 6.4.10.
$$\det\left( \left( (x_i + y_j)^{n-1} \right)_{1\le i\le n,\ 1\le j\le n} \right) = \det\begin{pmatrix} (x_1 + y_1)^{n-1} & (x_1 + y_2)^{n-1} & \cdots & (x_1 + y_n)^{n-1} \\ (x_2 + y_1)^{n-1} & (x_2 + y_2)^{n-1} & \cdots & (x_2 + y_n)^{n-1} \\ \vdots & \vdots & \ddots & \vdots \\ (x_n + y_1)^{n-1} & (x_n + y_2)^{n-1} & \cdots & (x_n + y_n)^{n-1} \end{pmatrix}$$
$$= \left( \prod_{k=0}^{n-1} \binom{n-1}{k} \right) \left( \prod_{1\le i<j\le n} \left( x_i - x_j \right) \right) \left( \prod_{1\le i<j\le n} \left( y_j - y_i \right) \right).$$
First proof of Proposition 6.4.34 (sketched). Here is a rough outline of a proof that
uses factor hunting (in the same way as in our above proofs of Theorem 6.4.31
(a) and Lemma 6.4.33). We WLOG assume that x1 , x2 , . . . , xn and y1 , y2 , . . . , yn
are distinct indeterminates in a polynomial ring over Z. (This is again an
assumption that we can make, because the argument that we used to derive
Theorem 6.4.31 (a) from Lemma 6.4.33 can be applied here as well.) Then, we
can easily see that
\[ \det\left(\left(\left(x_i + y_j\right)^{n-1}\right)_{1\le i\le n,\ 1\le j\le n}\right) \]
is a homogeneous polynomial of degree $n(n-1)$. This polynomial vanishes if
we set any xu equal to any xv (for u < v), and also vanishes if we set any yu
equal to any yv (for u < v). Thus we have identified n (n − 1) linear factors of
this polynomial (namely, the differences xu − xv and yv − yu for u < v), and we
can again conclude (since any polynomial ring over Z is a unique factorization
domain) that
\[
\det\left(\left(\left(x_i + y_j\right)^{n-1}\right)_{1\le i\le n,\ 1\le j\le n}\right)
= \left(\prod_{1\le i<j\le n} \left(x_i - x_j\right)\right) \left(\prod_{1\le i<j\le n} \left(y_j - y_i\right)\right) \cdot q
\]
for a constant $q \in \mathbb{Z}$. It remains to prove that this constant $q$ equals $\prod_{k=0}^{n-1} \binom{n-1}{k}$. This can be done by studying the coefficients of the monomial
\[ x_1^{n-1} x_2^{n-2} \cdots x_n^{n-n} y_1^0 y_2^1 \cdots y_n^{n-1} \]
in $\det\left(\left(\left(x_i + y_j\right)^{n-1}\right)_{1\le i\le n,\ 1\le j\le n}\right)$ and in $\left(\prod_{1\le i<j\le n} \left(x_i - x_j\right)\right) \left(\prod_{1\le i<j\le n} \left(y_j - y_i\right)\right)$.
We leave the details to the reader.
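For the skeptical reader, Proposition 6.4.34 can also be checked numerically for small $n$. The following Python sketch (the helper `det_perm` is ours, not from the notes) compares both sides for $n = 3$ with arbitrary integer inputs:

```python
from itertools import permutations
from math import comb

def det_perm(M):
    # determinant via the Leibniz formula (exact for integer entries)
    n = len(M)
    total = 0
    for p in permutations(range(n)):
        sign = (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = sign
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total

x = [1, 2, 4]
y = [3, 5, 9]
n = 3

lhs = det_perm([[(x[i] + y[j]) ** (n - 1) for j in range(n)] for i in range(n)])

rhs = 1
for k in range(n):
    rhs *= comb(n - 1, k)
for i in range(n):
    for j in range(i + 1, n):
        rhs *= (x[i] - x[j]) * (y[j] - y[i])
assert lhs == rhs
```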
Second proof of Proposition 6.4.34. (See [Grinbe15, Exercise 6.17 (b)] for details.) Let
\[ C := \left(\left(x_i + y_j\right)^{n-1}\right)_{1\le i\le n,\ 1\le j\le n} \in K^{n\times n}. \]
Furthermore, define two further matrices
\[ P := \left(\binom{n-1}{k-1} x_i^{k-1}\right)_{1\le i\le n,\ 1\le k\le n} \qquad \text{and} \qquad Q := \left(y_j^{n-k}\right)_{1\le k\le n,\ 1\le j\le n}. \]
Then, each $i, j \in [n]$ satisfy
\[ C_{i,j} = \left(x_i + y_j\right)^{n-1} = \sum_{k=1}^{n} \binom{n-1}{k-1} x_i^{k-1} y_j^{n-k} = (PQ)_{i,j} \]
(by the binomial formula and by the definition of the matrix product). Since this holds for all $i, j \in [n]$, we thus obtain $C = PQ$. Hence, $\det C = \det(PQ) = \det P \cdot \det Q$ (by Theorem 6.4.16).
It now remains to compute $\det P$ and $\det Q$.
From $Q = \left(y_j^{n-k}\right)_{1\le k\le n,\ 1\le j\le n} = \left(y_j^{n-i}\right)_{1\le i\le n,\ 1\le j\le n}$, we obtain
\[ \det Q = \det\left(\left(y_j^{n-i}\right)_{1\le i\le n,\ 1\le j\le n}\right) = \prod_{1\le i<j\le n} \left(y_i - y_j\right) \tag{240} \]
Furthermore, we obtain
\[ \det P = \det\left(\left(\binom{n-1}{j-1} x_i^{j-1}\right)_{1\le i\le n,\ 1\le j\le n}\right) \]
\begin{align*}
&= \underbrace{\binom{n-1}{1-1}\binom{n-1}{2-1}\cdots\binom{n-1}{n-1}}_{=\prod_{k=0}^{n-1}\binom{n-1}{k}} \cdot \underbrace{\det\left(\left(x_i^{j-1}\right)_{1\le i\le n,\ 1\le j\le n}\right)}_{\substack{=\prod_{1\le j<i\le n}\left(x_i-x_j\right)\\ \text{(by Theorem 6.4.31 (c), applied to } a_i = x_i\text{)}}} \\
&\qquad \left(\text{by (228), applied to } A = \left(x_i^{j-1}\right)_{1\le i\le n,\ 1\le j\le n} \text{ and } d_i = \binom{n-1}{i-1}\right) \\
&= \left(\prod_{k=0}^{n-1}\binom{n-1}{k}\right) \cdot \prod_{1\le j<i\le n} \left(x_i - x_j\right) \\
&= \left(\prod_{k=0}^{n-1}\binom{n-1}{k}\right) \cdot \prod_{1\le i<j\le n} \left(x_j - x_i\right) \tag{241}
\end{align*}
Note that some authors denote the minors $\det\left(A_{\sim p,\sim q}\right)$ in Theorem 6.4.37 by $A_{p,q}$. This is, of course, totally incompatible with our notations.
Proof of Theorem 6.4.37. See [Grinbe15, Theorem 6.82] or [Ford21, Lemma 4.5.7]
or [Laue15, 5.8 and 5.8’] or [Strick13, Proposition B.24 and Proposition B.25] or
[Loehr11, Theorem 9.48].
Theorem 6.4.37 yields several ways to compute the determinant of a matrix$^{118}$. When we compute a determinant $\det A$ using Theorem 6.4.37 (a), we
say that we expand this determinant along the p-th row. When we compute a deter-
minant det A using Theorem 6.4.37 (b), we say that we expand this determinant
along the q-th column.
Theorem 6.4.37 has many applications, some of which you have probably
seen in your course on linear algebra. (A few might appear on the homework.)
118 In fact, some authors use Theorem 6.4.37 as a definition of the determinant. (However, this
is somewhat tricky, as it requires proving that all the values obtained for det A by applying
Theorem 6.4.37 are actually equal.)
The main theoretical application of Theorem 6.4.37 is the concept of the adjugate
(or classical adjoint) of a matrix, which we shall introduce in a few moments.
First, let us see what happens if we replace the A p,q in Theorem 6.4.37 by en-
tries from another row or another column. In fact, the respective sums become
0 (instead of det A), as the following proposition shows:
Proof. See [Grinbe15, Proposition 6.96]. This is also implicit in [Strick13, proof
of Proposition B.28] and [Loehr11, proof of Theorem 9.50].
Here is a sketch of the proof:
(a) Let p ∈ [n] satisfy p ̸= r. Let C be the matrix A with its p-th row replaced by its
r-th row. Then, the matrix C has two equal rows, so that det C = 0 (by Theorem 6.4.12
(c)). On the other hand, expanding det C along the p-th row (i.e., applying Theorem
6.4.37 (a) to C instead of A) yields
\[
\det C = \sum_{q=1}^{n} (-1)^{p+q} \underbrace{C_{p,q}}_{=A_{r,q}} \det\bigl(\underbrace{C_{\sim p,\sim q}}_{=A_{\sim p,\sim q}}\bigr) = \sum_{q=1}^{n} (-1)^{p+q} A_{r,q} \det\left(A_{\sim p,\sim q}\right).
\]
Comparing these two equalities, we obtain the claim of Proposition 6.4.38 (a). A similar
argument proves Proposition 6.4.38 (b).
This matrix adj A is called the adjugate of the matrix A. (Some older texts call
it the adjoint, but this name has since been conquered by a different notion.
As a compromise, some still call adj A the classical adjoint of A.)
\[
\operatorname{adj}\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}
= \begin{pmatrix} ei - fh & ch - bi & bf - ce \\ fg - di & ai - cg & cd - af \\ dh - ge & bg - ah & ae - bd \end{pmatrix}.
\]
The main property of the adjugate $\operatorname{adj} A$ is its connection to the inverse $A^{-1}$ of a matrix $A$. Indeed, if an $n \times n$-matrix $A$ is invertible, then its inverse $A^{-1}$ is $\dfrac{1}{\det A} \cdot \operatorname{adj} A$. More generally, even if $A$ is not invertible, the product of $\operatorname{adj} A$ with $A$ (in either order) equals $(\det A) \cdot I_n$ (where $I_n$ is the $n \times n$ identity matrix). Let us state this as a theorem:
More about the adjugate matrix can be found in [Grinbe15, §6.15] and [Grinbe19,
§5.4–§5.6]; see also [Robins05] for some applications.
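The identity $A \cdot \operatorname{adj} A = (\det A) \cdot I_n$ is easy to experiment with. Below is a short Python sketch (the functions `det_perm`, `minor` and `adjugate` are ours, not from the notes) that builds the adjugate from cofactors and checks the identity on one $3 \times 3$ matrix:

```python
from itertools import permutations

def det_perm(M):
    # determinant via the Leibniz formula (exact for integer entries)
    n = len(M)
    total = 0
    for p in permutations(range(n)):
        sign = (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = sign
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total

def minor(M, p, q):
    # M with row p and column q removed (0-indexed)
    return [[M[i][j] for j in range(len(M)) if j != q] for i in range(len(M)) if i != p]

def adjugate(M):
    # adj M is the transpose of the cofactor matrix:
    # (adj M)_{q,p} = (-1)^{p+q} det(M with row p and column q removed)
    n = len(M)
    return [[(-1) ** (p + q) * det_perm(minor(M, p, q)) for p in range(n)] for q in range(n)]

A = [[2, 0, 1], [3, 5, 7], [1, 1, 4]]
adjA = adjugate(A)
d = det_perm(A)

# A * adj A should equal (det A) * I_3, even without assuming invertibility
prod = [[sum(A[i][k] * adjA[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
assert prod == [[d if i == j else 0 for j in range(3)] for i in range(3)]
```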
There is also a generalization of Theorem 6.4.37, called Laplace expansion along
multiple rows (or columns):
Example 6.4.43. Let $n = 4$ and $A = \begin{pmatrix} a & b & c & d \\ a' & b' & c' & d' \\ a'' & b'' & c'' & d'' \\ a''' & b''' & c''' & d''' \end{pmatrix}$ and $P = \{3, 4\}$. Then,
\begin{align*}
\det A
&= \sum_{\substack{Q \subseteq [n];\\ |Q| = |P|}} (-1)^{\operatorname{sum} P + \operatorname{sum} Q} \det\left(\operatorname{sub}^{Q}_{P} A\right) \det\left(\operatorname{sub}^{\widetilde{Q}}_{\widetilde{P}} A\right) \\
&= (-1)^{\operatorname{sum}\{3,4\} + \operatorname{sum}\{1,2\}} \det\left(\operatorname{sub}^{\{1,2\}}_{\{3,4\}} A\right) \det\left(\operatorname{sub}^{\widetilde{\{1,2\}}}_{\widetilde{\{3,4\}}} A\right) \\
&\quad + (-1)^{\operatorname{sum}\{3,4\} + \operatorname{sum}\{1,3\}} \det\left(\operatorname{sub}^{\{1,3\}}_{\{3,4\}} A\right) \det\left(\operatorname{sub}^{\widetilde{\{1,3\}}}_{\widetilde{\{3,4\}}} A\right) \\
&\quad + (-1)^{\operatorname{sum}\{3,4\} + \operatorname{sum}\{1,4\}} \det\left(\operatorname{sub}^{\{1,4\}}_{\{3,4\}} A\right) \det\left(\operatorname{sub}^{\widetilde{\{1,4\}}}_{\widetilde{\{3,4\}}} A\right) \\
&\quad + (-1)^{\operatorname{sum}\{3,4\} + \operatorname{sum}\{2,3\}} \det\left(\operatorname{sub}^{\{2,3\}}_{\{3,4\}} A\right) \det\left(\operatorname{sub}^{\widetilde{\{2,3\}}}_{\widetilde{\{3,4\}}} A\right) \\
&\quad + (-1)^{\operatorname{sum}\{3,4\} + \operatorname{sum}\{2,4\}} \det\left(\operatorname{sub}^{\{2,4\}}_{\{3,4\}} A\right) \det\left(\operatorname{sub}^{\widetilde{\{2,4\}}}_{\widetilde{\{3,4\}}} A\right) \\
&\quad + (-1)^{\operatorname{sum}\{3,4\} + \operatorname{sum}\{3,4\}} \det\left(\operatorname{sub}^{\{3,4\}}_{\{3,4\}} A\right) \det\left(\operatorname{sub}^{\widetilde{\{3,4\}}}_{\widetilde{\{3,4\}}} A\right) \\
&= \det\begin{pmatrix} a'' & b'' \\ a''' & b''' \end{pmatrix} \det\begin{pmatrix} c & d \\ c' & d' \end{pmatrix}
 - \det\begin{pmatrix} a'' & c'' \\ a''' & c''' \end{pmatrix} \det\begin{pmatrix} b & d \\ b' & d' \end{pmatrix} \\
&\quad + \det\begin{pmatrix} a'' & d'' \\ a''' & d''' \end{pmatrix} \det\begin{pmatrix} b & c \\ b' & c' \end{pmatrix}
 + \det\begin{pmatrix} b'' & c'' \\ b''' & c''' \end{pmatrix} \det\begin{pmatrix} a & d \\ a' & d' \end{pmatrix} \\
&\quad - \det\begin{pmatrix} b'' & d'' \\ b''' & d''' \end{pmatrix} \det\begin{pmatrix} a & c \\ a' & c' \end{pmatrix}
 + \det\begin{pmatrix} c'' & d'' \\ c''' & d''' \end{pmatrix} \det\begin{pmatrix} a & b \\ a' & b' \end{pmatrix}.
\end{align*}
(This is precisely what remains of the matrix $A$ when we remove the first row, the last row, the first column and the last column.) Then,
\begin{align*}
\det A \cdot \det\left(A'\right)
&= \det\left(A_{\sim 1,\sim 1}\right) \cdot \det\left(A_{\sim n,\sim n}\right) - \det\left(A_{\sim 1,\sim n}\right) \cdot \det\left(A_{\sim n,\sim 1}\right) \\
&= \det\begin{pmatrix} \det\left(A_{\sim 1,\sim 1}\right) & \det\left(A_{\sim 1,\sim n}\right) \\ \det\left(A_{\sim n,\sim 1}\right) & \det\left(A_{\sim n,\sim n}\right) \end{pmatrix}.
\end{align*}
Proof of Theorem 6.4.44. See [Grinbe15, Proposition 6.122] or [Bresso99, §3.5, proof
of Theorem 3.12] or [Zeilbe98].
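Since the Desnanot–Jacobi identity holds for arbitrary square matrices, it can be sanity-checked on random examples. Here is a Python sketch (the helpers `det_perm` and `strike` are ours, not from the notes):

```python
from itertools import permutations
from random import randint, seed

def det_perm(M):
    # determinant via the Leibniz formula (exact for integer entries)
    n = len(M)
    total = 0
    for p in permutations(range(n)):
        sign = (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = sign
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total

def strike(M, rows, cols):
    # submatrix of M with the given (0-indexed) rows and columns removed
    return [[M[i][j] for j in range(len(M)) if j not in cols]
            for i in range(len(M)) if i not in rows]

seed(0)
n = 4
A = [[randint(-5, 5) for _ in range(n)] for _ in range(n)]

inner = strike(A, {0, n - 1}, {0, n - 1})   # A': first/last row and column removed
lhs = det_perm(A) * det_perm(inner)
rhs = (det_perm(strike(A, {0}, {0})) * det_perm(strike(A, {n - 1}, {n - 1}))
       - det_perm(strike(A, {0}, {n - 1})) * det_perm(strike(A, {n - 1}, {0})))
assert lhs == rhs
```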
Theorem 6.4.44 provides a recursive way of computing determinants: Indeed, in the setting of Theorem 6.4.44, if $\det\left(A'\right)$ is invertible (which, when $K$ is a field, simply means that $\det\left(A'\right) \neq 0$), then Theorem 6.4.44 yields
\[ \det A = \frac{\det\left(A_{\sim 1,\sim 1}\right) \cdot \det\left(A_{\sim n,\sim n}\right) - \det\left(A_{\sim 1,\sim n}\right) \cdot \det\left(A_{\sim n,\sim 1}\right)}{\det\left(A'\right)}. \]
The five matrices appearing on the right hand side of this are smaller than $A$,
so their determinants are often easier to compute than det A. In particular,
if you are proving something by strong induction on n, you will occasionally
be able to use the induction hypothesis to compute these determinants. This
method of recursively simplifying determinants is often known as Dodgson con-
densation, as it was popularized (perhaps even discovered) by Charles Lutwidge
Dodgson (aka Lewis Carroll) in [Dodgso67, Appendix II]. We outline a sample
application:
\[
\det\left(\left(\frac{1}{x_i + y_j}\right)_{1\le i\le n,\ 1\le j\le n}\right)
= \frac{\displaystyle\prod_{1\le i<j\le n} \left(x_i - x_j\right)\left(y_i - y_j\right)}{\displaystyle\prod_{(i,j)\in[n]^2} \left(x_i + y_j\right)}.
\]
Once again, there are many ways to prove this (see, e.g., [Grinbe15, Exercise 6.18], [Prasol94, §1.3], [Grinbe09, Theorem 2], or https://proofwiki.org/wiki/Value_of_Cauchy_Determinant ). But using the Desnanot–Jacobi identity, there is a rather straightforward proof of Theorem 6.4.46. Indeed, if $A$ is a Cauchy matrix (i.e., a matrix of the form $\left(\dfrac{1}{x_i + y_j}\right)_{1\le i\le n,\ 1\le j\le n}$), then so is each
submatrix of A. Thus, if we proceed by strong induction on n, we can use the
induction hypothesis to compute all five determinants on the right hand side of
(242). The only difficulty is making sure that det ( A′ ) is invertible. To achieve
this, we again have to WLOG assume that our x’s and y’s are indeterminates in
a polynomial ring, and we have to rewrite the claim of Theorem 6.4.46 in the
form
\[
\det\left(\left(\prod_{k \neq j} \left(x_i + y_k\right)\right)_{1\le i\le n,\ 1\le j\le n}\right)
= \prod_{1\le i<j\le n} \left(x_i - x_j\right)\left(y_i - y_j\right).
\]
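Theorem 6.4.46 can also be verified exactly using rational arithmetic. The following Python sketch (the helper `det_perm` is ours, not from the notes) checks it for $n = 3$:

```python
from fractions import Fraction
from itertools import permutations

def det_perm(M):
    # determinant via the Leibniz formula (exact over Q with Fraction entries)
    n = len(M)
    total = Fraction(0)
    for p in permutations(range(n)):
        sign = (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = Fraction(sign)
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total

x = [1, 3, 4]
y = [2, 6, 11]
n = 3

C = [[Fraction(1, x[i] + y[j]) for j in range(n)] for i in range(n)]

num = Fraction(1)
for i in range(n):
    for j in range(i + 1, n):
        num *= (x[i] - x[j]) * (y[i] - y[j])
den = Fraction(1)
for i in range(n):
    for j in range(n):
        den *= x[i] + y[j]
assert det_perm(C) == num / den
```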
Theorem 6.4.47 is the particular case of Theorem 6.4.48 for $P = \{u, v\}$ and $Q = \{p, q\}$.$^{119}$ Theorem 6.4.44 is, of course, the particular case of Theorem 6.4.47 for $p = 1$ and $q = n$ and $u = 1$ and $v = n$.
6.5.1. Definitions
We have already seen lattice paths (and even counted them in Subsection 4.4.1).
We shall now introduce them formally and study them in greater depth.
119 Indeed, if we set $P = \{u, v\}$ and $Q = \{p, q\}$ in the situation of Theorem 6.4.47, then
\[ \operatorname{sub}^{\widetilde{Q}}_{\widetilde{P}} A = \operatorname{sub}^{[n]\setminus\{p,q\}}_{[n]\setminus\{u,v\}} A; \]
thus, Theorem 6.4.48 is easily seen to boil down to Theorem 6.4.47 in this case (the powers of $-1$ all cancel).
Convention 6.5.1. (a) Recall that “digraph” means “directed graph”, i.e., a
graph whose edges are directed (and are called arcs). Against a widespread
convention, we will allow our digraphs to be infinite (i.e., to have infinitely
many vertices and arcs).
(b) A digraph D will be called path-finite if it has the property that for any
two vertices u and v, there are only finitely many paths from u to v. (Thus,
in particular, such paths can be counted.)
(c) A digraph D will be called acyclic if it has no directed cycles.
[Pictures omitted: an example of an acyclic digraph and of a digraph that is not acyclic.]
(d) A simple digraph D means a digraph whose arcs are merely pairs of
distinct vertices (i.e., each arc is a pair (u, v) of two vertices u and v with
u ̸= v).
We note that a path may contain 0 arcs (in which case its starting and ending
point are identical).
Definition 6.5.2. We consider the infinite simple digraph with vertex set $\mathbb{Z}^2$ (so the vertices are pairs of integers) and arcs
\[ (i, j) \to (i+1, j) \qquad \text{for all } (i, j) \in \mathbb{Z}^2 \tag{243} \]
and
\[ (i, j) \to (i, j+1) \qquad \text{for all } (i, j) \in \mathbb{Z}^2. \tag{244} \]
The arcs of the form (243) are called “east-steps” or “right-steps”; the arcs of the form (244) are called “north-steps” or “up-steps”.
The vertices of this digraph will be called lattice points or grid points or
simply points. They will be represented as points in the Cartesian plane
(in the usual way: the vertex (i, j) ∈ Z2 is represented as the point with
x-coordinate i and y-coordinate j).
The entire digraph will be denoted by Z2 and called the integer lattice or
integer grid (or, to be short, just lattice or grid). Here is a picture of a small part of it: [picture omitted]
We add and subtract points entrywise:
\[ (a, b) + (c, d) = (a + c, b + d) \qquad \text{and} \qquad (a, b) - (c, d) = (a - c, b - d). \]
Thus, the digraph Z2 has an arc from a vertex u to a vertex v if and only if
v − u ∈ {(0, 1) , (1, 0)}.
The digraph Z2 is acyclic (i.e., it has no directed cycles). Thus, its paths
are the same as its walks. We call these paths the lattice paths (or just paths).
Thus, a lattice path is a finite tuple $(v_0, v_1, \ldots, v_n)$ of points $v_i \in \mathbb{Z}^2$ with the property that $v_i - v_{i-1} \in \{(0, 1), (1, 0)\}$ for each $i \in [n]$. For example, here is a lattice path from $(0,0)$ to $(5,3)$ (picture omitted); as a tuple of points, it is
\[ ((0, 0), (0, 1), (1, 1), (2, 1), (3, 1), (3, 2), (4, 2), (4, 3), (5, 3)). \]
Its step sequence (i.e., the sequence of the directions of its steps) is
URRRURUR (meaning that its first step is an up-step, its second step is
a right-step, its third step is a right-step, and so on).
Clearly, any path is uniquely determined by its starting point and its step
sequence.
Note that we are considering one of the simplest possible notions of a lat-
tice path here. In more advanced texts, the word “lattice path” is often used
for paths in digraphs more complicated than Z2 (for instance, a digraph with
the same vertex set Z2 but allowing steps in all four directions). However, the
digraph we are considering is perhaps the most useful for algebraic combina-
torics.
For each $i \in [n]$, we have $\operatorname{cs}(v_i) - \operatorname{cs}(v_{i-1}) = 1$
(by (246), applied to u = vi−1 and v = vi ). Summing these equalities over all i ∈ [n],
we obtain
\[ \sum_{i=1}^{n} \left(\operatorname{cs}(v_i) - \operatorname{cs}(v_{i-1})\right) = \sum_{i=1}^{n} 1 = n. \]
Hence,
\begin{align*}
n &= \sum_{i=1}^{n} \left(\operatorname{cs}(v_i) - \operatorname{cs}(v_{i-1})\right) = \operatorname{cs}\bigl(\underbrace{v_n}_{=(c,d)}\bigr) - \operatorname{cs}\bigl(\underbrace{v_0}_{=(a,b)}\bigr) \qquad \text{(by the telescope principle)} \\
&= \underbrace{\operatorname{cs}(c, d)}_{=c+d} - \underbrace{\operatorname{cs}(a, b)}_{=a+b} = c + d - (a + b) = c + d - a - b.
\end{align*}
In other words, the path (v0 , v1 , . . . , vn ) has exactly c + d − a − b steps (since this path
clearly has n steps).
Forget that we fixed (v0 , v1 , . . . , vn ). We thus have shown that each path (v0 , v1 , . . . , vn )
from ( a, b) to (c, d) has exactly c + d − a − b steps. This proves Observation 1.]
[Proof of Observation 2: For any point v ∈ Z2 , we define x (v) to be the x-coordinate
of v. (Thus, x ( x, y) = x for each ( x, y) ∈ Z2 .)
Obviously, the x-coordinate of a point increases by exactly 1 along each east-step and
stays unchanged along each north-step: That is, if u → v is an arc of Z2 , then
\[
x(v) - x(u) = \begin{cases} 1, & \text{if } u \to v \text{ is an east-step}; \\ 0, & \text{if } u \to v \text{ is a north-step}. \end{cases} \tag{247}
\]
120 A “step” of a path means an arc of this path.
(by (247), applied to $u = v_{i-1}$ and $v = v_i$). Summing these equalities over all $i \in [n]$, we obtain
\begin{align*}
\sum_{i=1}^{n} \left(x(v_i) - x(v_{i-1})\right)
&= \sum_{i=1}^{n} \begin{cases} 1, & \text{if } v_{i-1} \to v_i \text{ is an east-step}; \\ 0, & \text{if } v_{i-1} \to v_i \text{ is a north-step} \end{cases} \\
&= \left(\text{\# of } i \in [n] \text{ such that } v_{i-1} \to v_i \text{ is an east-step}\right) \\
&= \left(\text{\# of east-steps in the path } (v_0, v_1, \ldots, v_n)\right).
\end{align*}
Hence, the \# of east-steps in the path $(v_0, v_1, \ldots, v_n)$ equals $\sum_{i=1}^{n} \left(x(v_i) - x(v_{i-1})\right) = x(v_n) - x(v_0) = c - a$ (by the telescope principle).
[Proof of Observation 3: Let (c′ , d′ ) be the point at which this path p ends. Then,
we can apply Observation 1 to (c′ , d′ ) instead of (c, d), and hence conclude that this
path has exactly c′ + d′ − a − b steps. Since we already know that this path has exactly
$c + d - a - b$ steps, we therefore conclude that $c' + d' - a - b = c + d - a - b$. In other words, $c' + d' = c + d$.

(Note that this condition not only forbids them from crossing each other, but also forbids them from touching or bouncing off each other, or starting or ending at the same point.)
We shall abbreviate “non-intersecting path tuple” as “nipat”. (Histori-
cally, the more common abbreviation is “NILP”, for “non-intersecting lattice
paths”, but I prefer “nipat” as it stresses the tupleness.)
(e) A path tuple ( p1 , p2 , . . . , pk ) is said to be intersecting if it is not non-
intersecting (i.e., if two of its paths have a vertex in common).
We shall abbreviate “intersecting path tuple” as “ipat”.
[Picture omitted: a path tuple $(p_1, p_2, p_3)$ from $(A_1, A_2, A_3)$ to $(B_1, B_2, B_3)$.]

[Picture omitted: another path tuple $(p_1, p_2, p_3)$ from $(A_1, A_2, A_3)$ to $(B_1, B_2, B_3)$.]
(c) The following path tuple is an ipat, too (for several reasons):
6 B3 B2
5 p2
4 p3
3 A3 B1
2
p1
1 A1 A2
0
0 1 2 3 4 5 6 7 .
(In this tuple, the paths p1 and p2 even have an arc in common. Don’t let the
picture confuse you: The two curved arcs are actually one and the same arc
of Z2 appearing in two paths, not two different arcs.)
Proposition 6.5.7 (LGV lemma for two paths). Let $(A, A')$ and $(B, B')$ be two 2-vertices (i.e., let $A, A', B, B'$ be four lattice points). Then,
\begin{align*}
&\det\begin{pmatrix} (\text{\# of paths from } A \text{ to } B) & (\text{\# of paths from } A \text{ to } B') \\ (\text{\# of paths from } A' \text{ to } B) & (\text{\# of paths from } A' \text{ to } B') \end{pmatrix} \\
&\qquad = \left(\text{\# of nipats from } (A, A') \text{ to } (B, B')\right) - \left(\text{\# of nipats from } (A, A') \text{ to } (B', B)\right).
\end{align*}
Example 6.5.8. Let A = (0, 0) and A′ = (1, 1) and B = (2, 2) and B′ = (3, 3).
Then, Proposition 6.5.7 says that
\[
\det\begin{pmatrix} \binom{4}{2} & \binom{6}{3} \\ \binom{2}{1} & \binom{4}{2} \end{pmatrix}
= \left(\text{\# of nipats from } (A, A') \text{ to } (B, B')\right) - \left(\text{\# of nipats from } (A, A') \text{ to } (B', B)\right).
\]
(The matrix entries on the left hand side have been computed using Propo-
sition 6.5.4.)
And indeed, this equality is easily verified. There are 2 nipats from ( A, A′ )
to ( B, B′ ), one of which is
[Picture omitted.]
while the other is its reflection in the x = y diagonal. There are 6 nipats from
( A, A′ ) to ( B′ , B), three of which are
[Pictures omitted.]
while the other three are their reflections in the x = y diagonal. The right
hand side of the above equality is thus 2 − 6 = −4, which is also the left
hand side.
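The counts in Example 6.5.8 can be reproduced by brute force: enumerate all lattice paths between the relevant endpoints and count the vertex-disjoint pairs. Here is a Python sketch (the functions `paths` and `nipats` are ours, not from the notes):

```python
from math import comb

def paths(a, b):
    # all lattice paths from a to b (east- and north-steps), as tuples of points
    if a == b:
        return [(a,)]
    result = []
    (ax, ay), (bx, by) = a, b
    if ax < bx:
        result += [(a,) + rest for rest in paths((ax + 1, ay), b)]
    if ay < by:
        result += [(a,) + rest for rest in paths((ax, ay + 1), b)]
    return result

def nipats(A, A2, B, B2):
    # number of pairs (p, p') with p : A -> B and p' : A2 -> B2 sharing no vertex
    return sum(1 for p in paths(A, B) for q in paths(A2, B2)
               if not set(p) & set(q))

A, A2, B, B2 = (0, 0), (1, 1), (2, 2), (3, 3)

det = comb(4, 2) * comb(4, 2) - comb(6, 3) * comb(2, 1)   # the 2x2 determinant
assert nipats(A, A2, B, B2) == 2
assert nipats(A, A2, B2, B) == 6
assert det == nipats(A, A2, B, B2) - nipats(A, A2, B2, B)
```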
We need to show that on the right hand side, all the intersecting path tuples
cancel each other out (so that only the nipats remain).
Our k-vertices are 2-vertices; thus, our path tuples are pairs. Hence, such a
path tuple ( p, p′ ) is intersecting if and only if p and p′ have a vertex in common.
We shall use these common vertices to define a sign-reversing involution on the
intersecting path tuples. Specifically, we do the following:
Define the set
\[ \mathcal{A} := \{\text{path tuples from } (A, A') \text{ to } (B, B')\} \sqcup \{\text{path tuples from } (A, A') \text{ to } (B', B)\}. \]
Here, the symbol “⊔” means “disjoint union (of sets)”, which is a way of unit-
ing two sets without removing duplicates (i.e., even if the sets are not disjoint,
we treat them as disjoint for the purpose of the union, and therefore include
two copies of each common element). As a consequence of us taking the dis-
joint union, each path tuple in A “remembers” whether it comes from the set
{path tuples from ( A, A′ ) to ( B, B′ )} or from the set
{path tuples from ( A, A′ ) to ( B′ , B)} (and if these two sets have a path tuple
in common, then A has two copies of it, each of which remembers from which
set it comes). However, in practice, this is barely relevant: Indeed, the only case
in which the sets
{path tuples from ( A, A′ ) to ( B, B′ )} and {path tuples from ( A, A′ ) to ( B′ , B)}
can fail to be disjoint is the case when B = B′ ; however, in this case, the claim
we are proving is trivial anyway, since there are no nipats$^{123}$, and our matrix
has determinant 0 (since it has two equal columns).
Define a subset $\mathcal{X}$ of $\mathcal{A}$ by $\mathcal{X} := \{\text{ipats in } \mathcal{A}\}$. Define the sign of a path tuple $(p, p') \in \mathcal{A}$ to be $1$ if it comes from the set $\{\text{path tuples from } (A, A') \text{ to } (B, B')\}$, and to be $-1$ otherwise$^{124}$. Then,
\[
\det\begin{pmatrix} (\text{\# of paths from } A \text{ to } B) & (\text{\# of paths from } A \text{ to } B') \\ (\text{\# of paths from } A' \text{ to } B) & (\text{\# of paths from } A' \text{ to } B') \end{pmatrix}
= \sum_{(p, p') \in \mathcal{A}} \operatorname{sign}(p, p')
\]
and
\begin{align*}
&\left(\text{\# of nipats from } (A, A') \text{ to } (B, B')\right) - \left(\text{\# of nipats from } (A, A') \text{ to } (B', B)\right) \\
&\qquad = \sum_{(p, p') \in \mathcal{A} \setminus \mathcal{X}} \operatorname{sign}(p, p') \qquad (\text{since } \mathcal{A} \setminus \mathcal{X} = \{\text{nipats in } \mathcal{A}\}).
\end{align*}
123 Indeed, if p and p′ are two paths with the same destination, then p and p′ automatically have
a vertex in common.
124 because we took the disjoint union of $\{\text{path tuples from } (A, A') \text{ to } (B, B')\}$ and $\{\text{path tuples from } (A, A') \text{ to } (B', B)\}$, so that each element of $\mathcal{A}$ remembers which of the two sets it comes from
We want to prove that the left hand sides of these two equalities are equal.
Thus, it clearly suffices to show that the right hand sides are equal. By Lemma
6.1.3, it suffices to find a sign-reversing involution f : X → X that has no fixed
points.
So let us define our sign-reversing involution f : X → X . For each path
tuple ( p, p′ ) ∈ X , we define f ( p, p′ ) as follows:
• We pick a vertex $v$ that the paths $p$ and $p'$ have in common (such a vertex exists, since the path tuple $(p, p')$ is intersecting); to make the choice canonical, we take (say) the first such vertex on $p$.
• Call the part of p that comes after v the tail of p. Call the part of p that
comes before v the head of p.
Call the part of p′ that comes after v the tail of p′ . Call the part of p′ that
comes before v the head of p′ .
• Now, we exchange the tails of the paths p and p′ . That is, we set
\[ q := (\text{head of } p) \cup (\text{tail of } p') \qquad \text{and} \qquad q' := (\text{head of } p') \cup (\text{tail of } p) \]
(where the symbol “∪” means combining a path ending at v with a path
starting at v in the obvious way), and set f ( p, p′ ) := (q, q′ ).
[Pictures omitted: an example pair $(p, p')$ and the pair $(q, q') = f(p, p')$ obtained from it by swapping the tails.]
Here is the same configuration, with the point v marked and with the tails of
the two paths drawn extra-thick:
[Picture omitted.]
\[ (f \circ f)(p, p') = f\bigl(\underbrace{f(p, p')}_{=(q, q')}\bigr) = f(q, q') = (p, p'). \]
\[ \sum_{(p, p') \in \mathcal{A}} \operatorname{sign}(p, p') = \sum_{(p, p') \in \mathcal{A} \setminus \mathcal{X}} \operatorname{sign}(p, p'). \tag{249} \]
Hence,
\begin{align*}
&\det\begin{pmatrix} (\text{\# of paths from } A \text{ to } B) & (\text{\# of paths from } A \text{ to } B') \\ (\text{\# of paths from } A' \text{ to } B) & (\text{\# of paths from } A' \text{ to } B') \end{pmatrix}
= \sum_{(p, p') \in \mathcal{A}} \operatorname{sign}(p, p') = \sum_{(p, p') \in \mathcal{A} \setminus \mathcal{X}} \operatorname{sign}(p, p') \\
&\qquad = \left(\text{\# of nipats from } (A, A') \text{ to } (B, B')\right) - \left(\text{\# of nipats from } (A, A') \text{ to } (B', B)\right).
\end{align*}
This proves Proposition 6.5.7.
\[ x(A') \le x(A), \qquad y(A') \ge y(A), \tag{250} \]
\[ x(B') \le x(B), \qquad y(B') \ge y(B). \tag{251} \]
Note that the condition (250) can be restated as “the point A′ lies weakly
northwest of A”, where “weakly northwest of A” allows for the options “due
north of A”, “due west of A” and “at A”. Likewise, (251) can be restated as
“the point B′ lies weakly northwest of B”. The following picture illustrates the
situation of Proposition 6.5.10:
[Picture omitted: the points $A, A', B, B'$ with paths $p : A \to B$ and $p' : A' \to B'$.]
Proposition 6.5.10 has an intuitive plausibility to it (one can think of the path
p as creating a “river” that the path p′ must necessarily cross somewhere), but
it is not obvious from a mathematical perspective. We give a rigorous proof in
Section B.7.
Proposition 6.5.7 is just the k = 2 case of a more general theorem that we will
soon derive; however, it already has a nice application:
Corollary 6.5.11. Let $n, k \in \mathbb{N}$. Then, $\dbinom{n}{k}^2 \ge \dbinom{n}{k-1} \cdot \dbinom{n}{k+1}$.
Proof of Corollary 6.5.11 (sketched). This is easy to see algebraically, but here is a
combinatorial proof: Define four lattice points A = (1, 0) and A′ = (0, 1) and
B = (k + 1, n − k ) and B′ = (k, n − k + 1). Then, Proposition 6.5.7 yields
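The inequality of Corollary 6.5.11 is the log-concavity of the rows of Pascal's triangle; it is easy to confirm numerically. In the sketch below, the helper `C` (which extends the binomial coefficient by $0$ outside its range) is ours, not from the notes:

```python
from math import comb

def C(n, k):
    # binomial coefficient, extended by 0 outside the range 0 <= k <= n
    return comb(n, k) if 0 <= k <= n else 0

# Corollary 6.5.11: each row of Pascal's triangle is log-concave.
for n in range(30):
    for k in range(n + 1):
        assert C(n, k) ** 2 >= C(n, k - 1) * C(n, k + 1)
```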
The right hand side of this equality can be viewed as a signed count of all
k-tuples of paths that start at the points A1 , A2 , . . . , Ak in this order, but end
at the points B1 , B2 , . . . , Bk in some order. For example, for k = 3, the claim of
(Here we abuse notation by saying that a pair $(\sigma, \mathbf{p}) \in \mathcal{A}$ is a nipat if the path tuple $\mathbf{p}$ is a nipat. This is a bit sloppy ($\sigma$ has nothing to do with whether $\mathbf{p}$ is an ipat or a nipat), but we hope that no confusion will ensue.)
128 This is probably obvious, but just in case: We say that two paths intersect if they have a vertex
in common.
• Then, we pick the largest j ∈ [k] such that v belongs to p j . (Note that j > i,
since v is crowded.)
• Call the part of pi that comes after v the tail of pi . Call the part of pi that
comes before v the head of pi .
Call the part of p j that comes after v the tail of p j . Call the part of p j that
comes before v the head of p j .
• Then, we exchange the tails of the paths pi and p j (while leaving all other
paths unchanged).
• Finally, we set
f (σ, ( p1 , p2 , . . . , pk )) := σ′ , (q1 , q2 , . . . , qk ) .
129 Thus,
\[ q_i = (\text{head of } p_i) \cup (\text{tail of } p_j) \qquad \text{and} \qquad q_j = (\text{head of } p_j) \cup (\text{tail of } p_i) \]
(where “head” means “part until $v$”, and “tail” means “part after $v$”). Furthermore, we have $q_m = p_m$ for any $m \in [k] \setminus \{i, j\}$, since we have left all paths other than $p_i$ and $p_j$ unchanged.
[Pictures omitted: a path tuple $(p_1, p_2, \ldots, p_5)$ from $(A_1, \ldots, A_5)$ to the permuted endpoints $(B_1', \ldots, B_5')$, and the path tuple $(q_1, q_2, \ldots, q_5)$ obtained from it by swapping the two tails at the crowded vertex $v$.]
(by the definition of the determinant), whereas the right hand side is
\[
\sum_{(\sigma, \mathbf{p}) \in \mathcal{A} \setminus \mathcal{X}} (-1)^{\sigma}
= \sum_{\substack{(\sigma, \mathbf{p}) \in \mathcal{A};\\ \mathbf{p} \text{ is a nipat}}} (-1)^{\sigma}
\qquad \left(\text{since } \mathcal{X} = \{\text{ipats in } \mathcal{A}\} \text{ entails } \mathcal{A} \setminus \mathcal{X} = \{\text{nipats in } \mathcal{A}\}\right)
\]
\[
= \sum_{\sigma \in S_k} (-1)^{\sigma} \left(\text{\# of nipats from } \mathbf{A} \text{ to } \sigma(\mathbf{B})\right)
\]
\[ w(p) := \prod_{a \text{ is an arc of } p} w(a), \qquad w(\mathbf{p}) := w(p_1) w(p_2) \cdots w(p_k). \]
130 An arc will appear multiple times in the product if it appears in multiple paths.
\[ w(p) := \prod_{a \text{ is an arc of } p} w(a), \qquad w(\mathbf{p}) := w(p_1) w(p_2) \cdots w(p_k). \]
Oftentimes, the sum
\[ \sum_{\sigma \in S_k} (-1)^{\sigma} \sum_{\substack{\mathbf{p} \text{ is a nipat}\\ \text{from } \mathbf{A} \text{ to } \sigma(\mathbf{B})}} w(\mathbf{p}) \]
has only one nonzero addend. We have already seen this happen often in the
k = 2 case (thanks to Proposition 6.5.10). Here is the analogous statement for
general k:
Corollary 6.5.15 (LGV lemma, nonpermutable lattice weight version). Con-
sider the setting of Theorem 6.5.13, but additionally assume that
\begin{align*}
x(A_1) &\ge x(A_2) \ge \cdots \ge x(A_k); \tag{253} \\
y(A_1) &\le y(A_2) \le \cdots \le y(A_k); \tag{254} \\
x(B_1) &\ge x(B_2) \ge \cdots \ge x(B_k); \tag{255} \\
y(B_1) &\le y(B_2) \le \cdots \le y(B_k). \tag{256}
\end{align*}
Then,
\[
\det\left(\left(\sum_{p : A_i \to B_j} w(p)\right)_{1\le i\le k,\ 1\le j\le k}\right)
= \sum_{\substack{\mathbf{p} \text{ is a nipat}\\ \text{from } \mathbf{A} \text{ to } \mathbf{B}}} w(\mathbf{p}). \tag{257}
\]
Proof of Corollary 6.5.15 (sketched). This is easy using Proposition 6.5.10. Here
are the details:
Let σ ∈ Sk be a permutation that is not the identity permutation id ∈ Sk . Then, we
don’t have σ (1) ≤ σ (2) ≤ · · · ≤ σ (k ) (since σ is not id). In other words, there exists
some i ∈ [k − 1] such that σ (i ) > σ (i + 1). Consider this i.
Now, let p be a nipat from A to σ (B). Write p in the form p = ( p1 , p2 , . . . , pk ). Thus,
pi is a path from Ai to Bσ(i) , whereas pi+1 is a path from Ai+1 to Bσ(i+1) . Moreover, pi
and pi+1 have no vertex in common (since p is a nipat).
The sequence $(x(B_1), x(B_2), \ldots, x(B_k))$ is weakly decreasing (by (255)). In other words, if $m$ and $n$ are two elements of $[k]$ satisfying $m > n$, then $x(B_m) \le x(B_n)$. Applying this to $m = \sigma(i)$ and $n = \sigma(i+1)$, we obtain $x\left(B_{\sigma(i)}\right) \le x\left(B_{\sigma(i+1)}\right)$ (since $\sigma(i) > \sigma(i+1)$). Likewise, using (256), we can obtain $y\left(B_{\sigma(i)}\right) \ge y\left(B_{\sigma(i+1)}\right)$. However, (253) shows that $x(A_i) \ge x(A_{i+1})$. In other words, $x(A_{i+1}) \le x(A_i)$. Furthermore, (254) shows that $y(A_i) \le y(A_{i+1})$. In other words, $y(A_{i+1}) \ge y(A_i)$.
Hence, Proposition 6.5.10 (applied to A = Ai , B = Bσ(i+1) , A′ = Ai+1 , B′ = Bσ(i) ,
p = pi and p′ = pi+1 ) yields that pi and pi+1 have a vertex in common. This contradicts
the fact that pi and pi+1 have no vertex in common.
Forget that we fixed p. We thus have found a contradiction for each nipat p from A
to σ (B). Hence, there are no nipats from A to σ (B).
Forget that we fixed σ. We thus have proved that there are no nipats from A to σ (B)
when σ ∈ Sk is not the identity permutation id ∈ Sk . Hence, if σ ∈ Sk is not the identity
permutation id ∈ Sk , then
\[ \sum_{\substack{\mathbf{p} \text{ is a nipat}\\ \text{from } \mathbf{A} \text{ to } \sigma(\mathbf{B})}} w(\mathbf{p}) = (\text{empty sum}) = 0. \]
Now,
\begin{align*}
\det\left(\left(\sum_{p : A_i \to B_j} w(p)\right)_{1\le i\le k,\ 1\le j\le k}\right)
&= \sum_{\sigma \in S_k} (-1)^{\sigma} \sum_{\substack{\mathbf{p} \text{ is a nipat}\\ \text{from } \mathbf{A} \text{ to } \sigma(\mathbf{B})}} w(\mathbf{p}) \\
&= \sum_{\substack{\mathbf{p} \text{ is a nipat}\\ \text{from } \mathbf{A} \text{ to } \operatorname{id}(\mathbf{B})}} w(\mathbf{p}) + \sum_{\substack{\sigma \in S_k;\\ \sigma \neq \operatorname{id}}} (-1)^{\sigma} \underbrace{\sum_{\substack{\mathbf{p} \text{ is a nipat}\\ \text{from } \mathbf{A} \text{ to } \sigma(\mathbf{B})}} w(\mathbf{p})}_{=0} \\
&\qquad (\text{here, we have split off the addend for } \sigma = \operatorname{id} \text{ from the sum}) \\
&= \sum_{\substack{\mathbf{p} \text{ is a nipat}\\ \text{from } \mathbf{A} \text{ to } \operatorname{id}(\mathbf{B})}} w(\mathbf{p}) = \sum_{\substack{\mathbf{p} \text{ is a nipat}\\ \text{from } \mathbf{A} \text{ to } \mathbf{B}}} w(\mathbf{p})
\end{align*}
(since $\operatorname{id}(\mathbf{B}) = \mathbf{B}$). This proves Corollary 6.5.15.
$a_1 \ge a_2 \ge \cdots \ge a_k$ and $b_1 \ge b_2 \ge \cdots \ge b_k$.
Then,
\[ \det\left(\left(\binom{a_i}{b_j}\right)_{1\le i\le k,\ 1\le j\le k}\right) \ge 0. \]
Proof of Corollary 6.5.16 (sketched). Set K = Z, and set w ( a) := 1 for each arc a
of $\mathbb{Z}^2$. Define the lattice points $A_i := (0, -a_i)$ and $B_i := (b_i, -b_i)$ for all $i \in [k]$ (one possible choice; any choice with exactly $\binom{a_i}{b_j}$ paths from $A_i$ to $B_j$ works). These lattice points satisfy the assumptions of Corollary 6.5.15.
Hence, (257) entails
\[
\det\left(\left(\sum_{p : A_i \to B_j} w(p)\right)_{1\le i\le k,\ 1\le j\le k}\right) = \sum_{\substack{\mathbf{p} \text{ is a nipat}\\ \text{from } \mathbf{A} \text{ to } \mathbf{B}}} w(\mathbf{p}).
\]
Since all the weights $w(p)$ and $w(\mathbf{p})$ are $1$ in our situation, we can rewrite this as
\[ \det\left(\left(\text{\# of paths from } A_i \text{ to } B_j\right)_{1\le i\le k,\ 1\le j\le k}\right) = (\text{\# of nipats from } \mathbf{A} \text{ to } \mathbf{B}). \]
Using Proposition 6.5.4, we can easily see that the matrix on the left hand side of this equality is $\left(\binom{a_i}{b_j}\right)_{1\le i\le k,\ 1\le j\le k}$. Thus, this equality rewrites as
\[ \det\left(\left(\binom{a_i}{b_j}\right)_{1\le i\le k,\ 1\le j\le k}\right) = (\text{\# of nipats from } \mathbf{A} \text{ to } \mathbf{B}). \]
Its left hand side is therefore ≥ 0 (since its right hand side is ≥ 0). This proves
Corollary 6.5.16.
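Corollary 6.5.16 can be spot-checked by computing a few of these determinants directly. A Python sketch (the helpers `det_perm` and `binom_det` are ours, not from the notes):

```python
from itertools import permutations
from math import comb

def det_perm(M):
    # determinant via the Leibniz formula (exact for integer entries)
    n = len(M)
    total = 0
    for p in permutations(range(n)):
        sign = (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = sign
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total

def binom_det(a, b):
    # det of the k x k matrix (C(a_i, b_j)); comb(n, k) is 0 for k > n
    k = len(a)
    return det_perm([[comb(a[i], b[j]) for j in range(k)] for i in range(k)])

# Corollary 6.5.16: nonnegative for weakly decreasing a and b
assert binom_det([9, 6, 3], [5, 4, 1]) >= 0
assert binom_det([7, 7, 2], [6, 3, 0]) >= 0
assert binom_det([10, 8, 5, 2], [7, 5, 3, 1]) >= 0
```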
Corollary 6.5.17. Let $k \in \mathbb{N}$. Recall the Catalan numbers $c_n = \dfrac{1}{n+1}\dbinom{2n}{n}$ for all $n \in \mathbb{N}$. Then,
\[
\det\left(\left(c_{i+j-2}\right)_{1\le i\le k,\ 1\le j\le k}\right)
= \det\begin{pmatrix} c_0 & c_1 & \cdots & c_{k-1} \\ c_1 & c_2 & \cdots & c_k \\ \vdots & \vdots & \ddots & \vdots \\ c_{k-1} & c_k & \cdots & c_{2k-2} \end{pmatrix} = 1.
\]
Proof of Corollary 6.5.17 (sketched). We will use not the lattice Z2 , but a different
digraph. Namely, we use the simple digraph with vertex set Z × N (that is, the
vertices are the lattice points that lie on the $x$-axis or above it) and arcs
\[ (i, j) \to (i+1, j+1) \qquad \text{for all } (i, j) \in \mathbb{Z} \times \mathbb{N} \]
and
\[ (i, j) \to (i+1, j-1) \qquad \text{for all } (i, j) \in \mathbb{Z} \times \mathbb{P} \]
(where $\mathbb{P} = \{1, 2, 3, \ldots\}$). Here is a picture of a small part of this digraph: [picture omitted]
As we know, the Catalan number cn counts the paths from (0, 0) to (2n, 0) on
this digraph (indeed, these are just the Dyck paths131 ). Hence, cn also counts
the paths from (i, 0) to (2n + i, 0) whenever i ∈ N (because these are just the
Dyck paths shifted by i in the x-direction). It is easy to see that this digraph is
acyclic and path-finite.
Now, define two k-vertices A = ( A1 , A2 , . . . , Ak ) and B = ( B1 , B2 , . . . , Bk ) by
setting
Ai := (−2 (i − 1) , 0) and Bi := (2 (i − 1) , 0)
for all i ∈ [k ]. It is not hard to show (see Exercise A.5.4.2 (a)) that there is
only one nipat from $\mathbf{A}$ to $\mathbf{B}$, which is shown in the case $k = 4$ on the following picture:$^{132}$
131 See Example 2 in Section 3.1 for the definition of a Dyck path.
[Picture omitted: the points $A_4, A_3, A_2, A_1 = B_1, B_2, B_3, B_4$ on the $x$-axis, joined by the unique nipat.]
(the point A1 coincides with B1 , and the path from A1 to B1 is invisible, since it
has no arcs). Moreover, it can be shown (see Exercise A.5.4.2 (b)) that there are
no nipats from A to σ (B) when σ ∈ Sk is not the identity permutation id ∈ Sk .
(This is analogous to Corollary 6.5.15.) Hence, if we set K = Z and w ( a) = 1
for each arc $a$ of our digraph, then (257) entails
\[ \det\left(\left(\text{\# of paths from } A_i \text{ to } B_j\right)_{1\le i\le k,\ 1\le j\le k}\right) = (\text{\# of nipats from } \mathbf{A} \text{ to } \mathbf{B}) \]
(by the same reasoning as in the proof of Corollary 6.5.15). The right hand side
of this equality is 1 (since there is only one nipat
from $\mathbf{A}$ to $\mathbf{B}$), while the matrix on the left hand side is easily seen to be $\left(c_{i+j-2}\right)_{1\le i\le k,\ 1\le j\le k}$ (since the \# of paths from $A_i$ to $B_j$ is the Catalan number $c_{i+j-2}$). This yields the claim of Corollary 6.5.17. The details are LTTR.
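Corollary 6.5.17 is easily confirmed for small $k$ by direct computation. A Python sketch (the helpers `catalan` and `det_perm` are ours, not from the notes):

```python
from itertools import permutations
from math import comb

def catalan(n):
    # the n-th Catalan number c_n = C(2n, n) / (n + 1); the division is exact
    return comb(2 * n, n) // (n + 1)

def det_perm(M):
    # determinant via the Leibniz formula (exact for integer entries)
    n = len(M)
    total = 0
    for p in permutations(range(n)):
        sign = (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = sign
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total

for k in range(1, 7):
    # entries c_{i+j-2} in the 1-based indexing of the corollary
    H = [[catalan(i + j) for j in range(k)] for i in range(k)]
    assert det_perm(H) == 1
```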
The LGV lemma in all its variants is one major place in which combinatorial
questions reduce to the computation of determinants. Other such places are
the matrix-tree theorem (see, e.g., [Zeilbe85, §4], [Loehr11, §3.17], [Stanle18, Theo-
rems 9.8 and 10.4], [Grinbe23, §5.14]) and the enumeration of perfect matchings
132 In this picture, we are drawing only “half” of the grid. Indeed, the vertices (i, j) of our
digraph Z × N can be classified into even vertices (i.e., the ones for which i + j is even) and
odd vertices (i.e., the ones for which i + j is odd). Any arc either connects two even vertices or
connects two odd vertices. Hence, a path starting at an even vertex cannot contain any odd
vertex (and vice versa). Since all our vertices A1 , A2 , . . . , Ak and B1 , B2 , . . . , Bk are even, we
thus don’t have to bother even drawing the odd vertices (as they have no chance to appear
in any paths between our vertices). As a consequence, we are drawing only the grid lines
containing the even vertices.
7. Symmetric functions
This final chapter is devoted to the theory of symmetric functions. Specifically,
we will restrict ourselves to symmetric polynomials (the “functions” part is a
technical tweak that makes the theory neater but we won’t have time to intro-
duce). Serious treatments of the subject can be found in [Wildon20], [Loehr11,
Chapters 10–11], [Egge19], [MenRem15], [Macdon95], [Aigner07, Chapter 8],
[Stanle23, Chapter 7], [Sagan19, Chapter 7], [Sagan01, Chapter 4], [Krishn86],
[FoaHan04, Chapters 14–19], [Savage22], [GriRei20, Chapter 2] and [LLPT95].
We begin with some oversimplified historical motivation.
Symmetric polynomials first(?) appeared in the study of roots of polynomials. Consider a monic univariate polynomial
\[ f = x^n + a_1 x^{n-1} + a_2 x^{n-2} + \cdots + a_n x^0 \in \mathbb{C}[x] \]
with roots $r_1, r_2, \ldots, r_n \in \mathbb{C}$ (listed with multiplicities), so that
\[ f = (x - r_1)(x - r_2) \cdots (x - r_n). \]
Comparing coefficients, we obtain
\begin{align*}
r_1 + r_2 + \cdots + r_n &= -a_1; \\
\sum_{i < j} r_i r_j &= a_2; \\
\sum_{i < j < k} r_i r_j r_k &= -a_3; \\
&\ \ \vdots \\
r_1 r_2 \cdots r_n &= (-1)^n a_n.
\end{align*}
These equalities are now known as Vieta's formulas. They allow computing
certain expressions in the ri ’s without having to compute the ri ’s themselves.
For instance, we can compute r12 + r22 + · · · + rn2 (that is, the sum of the squares
of all roots of f ) by observing that
\[ (r_1 + r_2 + \cdots + r_n)^2 = r_1^2 + r_2^2 + \cdots + r_n^2 + 2 \sum_{i < j} r_i r_j, \]
so that
\[ r_1^2 + r_2^2 + \cdots + r_n^2 = (r_1 + r_2 + \cdots + r_n)^2 - 2 \sum_{i < j} r_i r_j = (-a_1)^2 - 2 a_2 = a_1^2 - 2 a_2. \]
This shows, among other things, that $r_1^2 + r_2^2 + \cdots + r_n^2$ is an integer if the coefficients of $f$ are integers. Newton and others found similar formulas for
$r_1^3 + r_2^3 + \cdots + r_n^3$ and other such polynomials. (These formulas are now known
as the Newton–Girard identities – see Theorem 7.1.12 below.) Gauss extended
this to arbitrary symmetric polynomials in r1 , r2 , . . . , rn (by algorithmically ex-
pressing them as polynomials in a1 , a2 , . . . , an ), and used it in one of his proofs
of the Fundamental Theorem of Algebra [Gauss16]; Galois used this to build
what is now known as Galois theory (even though modern treatments of Ga-
lois theory often avoid symmetric polynomials); some harbingers of this can be
seen in Cardano’s solution of the cubic equation. See [Tignol16] and [Armstr19]
for the real history.
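The computation of $r_1^2 + r_2^2 + \cdots + r_n^2$ above can be replayed on a concrete cubic, say $f = (x-1)(x-2)(x-3) = x^3 - 6x^2 + 11x - 6$. In the Python sketch below, the helper `e` for elementary symmetric polynomials is ours, not from the notes:

```python
from itertools import combinations
from math import prod

roots = [1, 2, 3]   # roots of f = (x-1)(x-2)(x-3) = x^3 - 6x^2 + 11x - 6

def e(k, r):
    # elementary symmetric polynomial e_k evaluated at the tuple r
    return sum(prod(c) for c in combinations(r, k))

# Vieta: a_k = (-1)^k e_k(roots) in the notation f = x^3 + a1 x^2 + a2 x + a3
a1 = -e(1, roots)
a2 = e(2, roots)
a3 = -e(3, roots)
assert (a1, a2, a3) == (-6, 11, -6)

# the power sum identity: r1^2 + ... + rn^2 = a1^2 - 2 a2
assert sum(r * r for r in roots) == a1 ** 2 - 2 * a2   # 14 == 36 - 22
```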
Here is a simple modern application of the same ideas: Let A ∈ Cn×n be a
matrix with eigenvalues λ1 , λ2 , . . . , λn (listed with algebraic multiplicities). Let
f ∈ C [ x ] be a univariate polynomial. The spectral mapping theorem says that the
eigenvalues of the matrix f [ A] are f [λ1 ] , f [λ2 ] , . . . , f [λn ] (here, I am using the
notation f [ a] for the value of f at some element a; this is usually written f ( a)).
Thus, the characteristic polynomial of $f[A]$ is $(x - f[\lambda_1])(x - f[\lambda_2]) \cdots (x - f[\lambda_n])$.
Recall that S N denotes the N-th symmetric group, i.e., the group of all permu-
tations of the set [ N ] := {1, 2, . . . , N }.
and σ · ( x1 − x3 x4 ) = x2 − x1 x4 .)
Roughly speaking, the group S N is thus acting on P by permuting vari-
ables: A permutation σ ∈ S N transforms a polynomial f by substituting xσ(i)
for each xi .
Note that this action of S N on P is a well-defined group action (as we will
see in Proposition 7.1.4 below).
(c) A polynomial f ∈ P is said to be symmetric if it satisfies
σ· f = f for all σ ∈ S N .
(c) We have $(x-y)(y-z)(z-x) \notin \mathcal{S}$ (since the simple transposition $s_1 \in S_3$ transforms $(x-y)(y-z)(z-x)$ into
\begin{align*}
s_1 \cdot \left( (x-y)(y-z)(z-x) \right) &= (y-x)(x-z)(z-y) \\
&= -(x-y)(y-z)(z-x) \\
&\neq (x-y)(y-z)(z-x) ,
\end{align*}
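The sign flip just observed can also be checked numerically, by evaluating $(x-y)(y-z)(z-x)$ at all rearrangements of a sample point (the point $(2,3,5)$ below is an arbitrary choice):

```python
from itertools import permutations

def g(x, y, z):
    # the polynomial (x - y)(y - z)(z - x)
    return (x - y) * (y - z) * (z - x)

pt = (2, 3, 5)
vals = sorted({g(*perm) for perm in permutations(pt)})
# g is not symmetric: odd permutations flip its sign, so two values occur
assert vals == [-g(*pt), g(*pt)] == [-6, 6]
```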
where $f_{(a_1,a_2,\ldots,a_N)} \in K$ are its coefficients. The definition of $\tau \cdot f$ yields
\[
\tau \cdot f = f\left[ x_{\tau(1)}, x_{\tau(2)}, \ldots, x_{\tau(N)} \right]
= \sum_{(a_1,a_2,\ldots,a_N) \in \mathbb{N}^N} f_{(a_1,a_2,\ldots,a_N)} x_{\tau(1)}^{a_1} x_{\tau(2)}^{a_2} \cdots x_{\tau(N)}^{a_N}
\]
(here, we have substituted $x_{(\sigma\tau)(1)}, x_{(\sigma\tau)(2)}, \ldots, x_{(\sigma\tau)(N)}$ for $x_1, x_2, \ldots, x_N$ on both sides
of (260)). Comparing these two equalities, we obtain
\[
(\sigma\tau) \cdot f = (\tau \cdot f)\left[ x_{\sigma(1)}, x_{\sigma(2)}, \ldots, x_{\sigma(N)} \right] = \sigma \cdot (\tau \cdot f)
\]
\[
\sigma \cdot (fg) = (\sigma \cdot f) \cdot (\sigma \cdot g) .
\]
In other words, the map
\[
\mathcal{P} \to \mathcal{P}, \qquad f \mapsto \sigma \cdot f
\]
respects multiplication. Similarly, this map respects addition, respects scaling, respects
the zero and respects the unity. Hence, this map is a $K$-algebra morphism from $\mathcal{P}$ to
$\mathcal{P}$. Furthermore, this map is invertible, since its inverse is the map
\[
\mathcal{P} \to \mathcal{P}, \qquad f \mapsto \sigma^{-1} \cdot f .
\]
Proof of Theorem 7.1.6 (sketched). We need to show that S is closed under addition, mul-
tiplication and scaling, and that S contains the zero and the unity of P . Let me just
show that S is closed under multiplication (since all the other claims are equally easy):
Let f , g ∈ S . We must show that f g ∈ S .
The polynomial f is symmetric (since f ∈ S ); in other words, σ · f = f for each
σ ∈ S N . Similarly, σ · g = g for each σ ∈ S N . Now, for each σ ∈ S N , we have
\[
\sigma \cdot (fg) = \underbrace{(\sigma \cdot f)}_{\substack{= f \\ \text{(as we have seen)}}} \cdot \underbrace{(\sigma \cdot g)}_{\substack{= g \\ \text{(as we have seen)}}} = fg .
\]
In other words, $fg$ is symmetric; that is, $fg \in \mathcal{S}$, as desired.
Definition 7.1.8. (a) A monomial is an expression of the form $x_1^{a_1} x_2^{a_2} \cdots x_N^{a_N}$
with $a_1, a_2, \ldots, a_N \in \mathbb{N}$.
(b) The degree $\deg m$ of a monomial $m = x_1^{a_1} x_2^{a_2} \cdots x_N^{a_N}$ is defined to be
$a_1 + a_2 + \cdots + a_N \in \mathbb{N}$.
(c) A monomial $m = x_1^{a_1} x_2^{a_2} \cdots x_N^{a_N}$ is said to be squarefree if $a_1, a_2, \ldots, a_N \in \{0, 1\}$. (This is saying that no square or higher power of an indeterminate
appears in $m$; thus the name “squarefree”.)
(d) A monomial $m = x_1^{a_1} x_2^{a_2} \cdots x_N^{a_N}$ is said to be primal if there is at most
one $i \in [N]$ satisfying $a_i > 0$. (This is saying that the monomial $m$ contains
no two distinct indeterminates. Thus, a primal monomial is just $1$ or a power
of an indeterminate.)
\[
e_n = \sum_{\substack{(i_1, i_2, \ldots, i_n) \in [N]^n ; \\ i_1 < i_2 < \cdots < i_n}} x_{i_1} x_{i_2} \cdots x_{i_n} = (\text{sum of all squarefree monomials of degree } n) .
\]
\[
h_n = \sum_{\substack{(i_1, i_2, \ldots, i_n) \in [N]^n ; \\ i_1 \le i_2 \le \cdots \le i_n}} x_{i_1} x_{i_2} \cdots x_{i_n} = (\text{sum of all monomials of degree } n) .
\]
\begin{align*}
e_2 = \sum_{\substack{(i_1,i_2) \in [N]^2 ; \\ i_1 < i_2}} x_{i_1} x_{i_2} = \sum_{\substack{(i,j) \in [N]^2 ; \\ i < j}} x_i x_j
&= x_1 x_2 + x_1 x_3 + \cdots + x_1 x_N \\
&\quad + x_2 x_3 + \cdots + x_2 x_N \\
&\quad + \cdots \\
&\quad + x_{N-1} x_N .
\end{align*}
\begin{align*}
h_2 = \sum_{\substack{(i_1,i_2) \in [N]^2 ; \\ i_1 \le i_2}} x_{i_1} x_{i_2} = \sum_{\substack{(i,j) \in [N]^2 ; \\ i \le j}} x_i x_j
&= x_1^2 + x_1 x_2 + x_1 x_3 + \cdots + x_1 x_N \\
&\quad + x_2^2 + x_2 x_3 + \cdots + x_2 x_N \\
&\quad + \cdots \\
&\quad + x_{N-1}^2 + x_{N-1} x_N \\
&\quad + x_N^2 .
\end{align*}
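The definitions of $e_n$ and $h_n$ translate directly into code: strictly increasing index tuples correspond to `combinations`, and weakly increasing ones to `combinations_with_replacement`. A minimal Python sketch, evaluating at the arbitrarily chosen sample point $x_1 = 1$, $x_2 = 2$, $x_3 = 3$:

```python
from itertools import combinations, combinations_with_replacement
from math import prod

def e(n, xs):
    # elementary symmetric polynomial e_n, evaluated at the numbers xs
    return sum(prod(c) for c in combinations(xs, n))

def h(n, xs):
    # complete homogeneous symmetric polynomial h_n, evaluated at xs
    return sum(prod(c) for c in combinations_with_replacement(xs, n))

xs = [1, 2, 3]                       # N = 3
assert e(2, xs) == 1*2 + 1*3 + 2*3   # x1x2 + x1x3 + x2x3
assert h(2, xs) == e(2, xs) + 1**2 + 2**2 + 3**2   # h2 adds the squares
assert e(4, xs) == 0                 # e_n = 0 whenever n > N
```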
Proof. Let n > N be an integer. Then, the set [ N ] has no n distinct elements.
Thus, there exists no n-tuple (i1 , i2 , . . . , in ) ∈ [ N ]n satisfying i1 < i2 < · · · <
in (because if (i1 , i2 , . . . , in ) was such an n-tuple, then its n entries i1 , i2 , . . . , in
would be n distinct elements of [ N ]).
\[
p_2 = e_1 p_1 - 2e_2 = (x_1 + x_2 + \cdots + x_N)(x_1 + x_2 + \cdots + x_N) - 2 \sum_{i<j} x_i x_j .
\]
Before we prove (part of) Theorem 7.1.12, we establish some equalities in the
polynomial rings P [t] and P [u, v] (here, t, u, v are three new indeterminates)
and in the FPS ring P [[t]]:
\[
= \sum_{n \in \mathbb{N}} t^n \underbrace{\sum_{\substack{m \text{ is a squarefree} \\ \text{monomial of degree } n}} m}_{\substack{= (\text{sum of all squarefree monomials of degree } n) = e_n \\ \text{(by the definition of } e_n \text{)}}} = \sum_{n \in \mathbb{N}} t^n e_n .
\]
Math 701 Spring 2021, version April 6, 2024 page 435
1
= 1 + txi + (txi )2 + (txi )3 + · · ·
1 − txi
by substituting txi for x in the
geometric series formula (5)
= ∑ (txi ) a .
a ∈N
\[
= \sum_{n \in \mathbb{N}} t^n \underbrace{\sum_{\substack{m \text{ is a monomial} \\ \text{of degree } n}} m}_{\substack{= (\text{sum of all monomials of degree } n) = h_n \\ \text{(by the definition of } h_n \text{)}}} = \sum_{n \in \mathbb{N}} t^n h_n .
\]
Proof of the 1st Newton–Girard formula (261). In the FPS ring $\mathcal{P}[[t]]$, we have
\[
\prod_{i=1}^{N} (1 - tx_i) = \sum_{n \in \mathbb{N}} (-1)^n t^n e_n \qquad \text{(by Proposition 7.1.14 (a))}
\]
and
\[
\prod_{i=1}^{N} \frac{1}{1 - tx_i} = \sum_{n \in \mathbb{N}} t^n h_n \qquad \text{(by Proposition 7.1.14 (c))} .
\]
Multiplying these two equalities, we obtain
\begin{align*}
\left( \prod_{i=1}^{N} (1 - tx_i) \right) \left( \prod_{i=1}^{N} \frac{1}{1 - tx_i} \right)
&= \left( \sum_{n \in \mathbb{N}} (-1)^n t^n e_n \right) \left( \sum_{n \in \mathbb{N}} t^n h_n \right)
= \left( \sum_{j \in \mathbb{N}} (-1)^j t^j e_j \right) \left( \sum_{k \in \mathbb{N}} t^k h_k \right) \\
&\qquad \left(\begin{array}{c}\text{here, we have renamed the summation} \\ \text{indices } n \text{ and } n \text{ (in the two sums) as } j \text{ and } k\end{array}\right) \\
&= \sum_{j \in \mathbb{N}} \sum_{k \in \mathbb{N}} (-1)^j \underbrace{t^j e_j t^k h_k}_{= e_j h_k t^{j+k}}
= \sum_{(j,k) \in \mathbb{N}^2} (-1)^j e_j h_k t^{j+k}
= \sum_{n \in \mathbb{N}} \left( \sum_{\substack{(j,k) \in \mathbb{N}^2 ; \\ j+k=n}} (-1)^j e_j h_k \right) t^n
\end{align*}
(here, we have split the sum according to the value of $j+k$). Comparing this
with
\[
\left( \prod_{i=1}^{N} (1 - tx_i) \right) \left( \prod_{i=1}^{N} \frac{1}{1 - tx_i} \right)
= \prod_{i=1}^{N} \underbrace{\left( (1 - tx_i) \cdot \frac{1}{1 - tx_i} \right)}_{=1}
= \prod_{i=1}^{N} 1 = 1 ,
\]
we obtain
\[
1 = \sum_{n \in \mathbb{N}} \left( \sum_{\substack{(j,k) \in \mathbb{N}^2 ; \\ j+k=n}} (-1)^j e_j h_k \right) t^n .
\]
In other words,
\[
1 = \sum_{n \in \mathbb{N}} \left( \sum_{j=0}^{n} (-1)^j e_j h_{n-j} \right) t^n
\]
(here, we have substituted $(j, n-j)$ for $(j,k)$ in the sum, since the map $\{0, 1, \ldots, n\} \to \left\{ (j,k) \in \mathbb{N}^2 \mid j+k=n \right\}$ that sends each $j \in \{0, 1, \ldots, n\}$ to the pair $(j, n-j)$ is a bijection).
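The identity just proved — the coefficient of $t^n$ in $1$ vanishes for every $n \ge 1$, i.e., $\sum_{j+k=n} (-1)^j e_j h_k = 0$ — can be confirmed numerically. The sample point below is an arbitrary choice:

```python
from itertools import combinations, combinations_with_replacement
from math import prod

def e(n, xs):
    # elementary symmetric polynomial (0 for n > len(xs): the sum is empty)
    return sum(prod(c) for c in combinations(xs, n))

def h(n, xs):
    # complete homogeneous symmetric polynomial
    return sum(prod(c) for c in combinations_with_replacement(xs, n))

xs = [2, 3, 5, 7]
for n in range(1, 8):
    total = sum((-1)**j * e(j, xs) * h(n - j, xs) for j in range(n + 1))
    assert total == 0   # the coefficient of t^n in 1 vanishes for n >= 1
```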
\[
\underbrace{K[y_1, y_2, \ldots, y_N]}_{\substack{\text{a polynomial ring} \\ \text{in } N \text{ variables}}} \to \mathcal{S}, \qquad g \mapsto g[p_1, p_2, \ldots, p_N]
\]
is a $K$-algebra isomorphism.
Example 7.1.16. (a) Theorem 7.1.15 (a) yields that p3 can be uniquely written
as a polynomial in e1 , e2 , . . . , e N . How to write it this way?
Here is a method that (more generally) can be used to express pn (for any
given n > 0) as a polynomial in e1 , e2 , . . . , en . This method is recursive, so we
assume that all the “smaller” power sums p1 , p2 , . . . , pn−1 have already been
expressed in this way. Now, the 2nd Newton–Girard formula (262) yields
\begin{align*}
n e_n &= \sum_{j=1}^{n} (-1)^{j-1} e_{n-j} p_j
= \sum_{j=1}^{n-1} (-1)^{j-1} e_{n-j} p_j + (-1)^{n-1} \underbrace{e_{n-n}}_{= e_0 = 1} p_n \\
&\qquad \left(\text{here, we have split off the addend for } j = n \text{ from the sum}\right) \\
&= \sum_{j=1}^{n-1} (-1)^{j-1} e_{n-j} p_j + (-1)^{n-1} p_n .
\end{align*}
The right hand side can now be expressed in terms of e1 , e2 , . . . , en (since the
only power sums appearing in it are p1 , p2 , . . . , pn−1 , which we already know
how to express in these terms); therefore, we obtain an expression of pn as a
polynomial in e1 , e2 , . . . , en .
For example, here is what we obtain for $n \in [4]$ by following this method:
\begin{align*}
p_1 &= e_1 ; \\
p_2 &= e_1^2 - 2e_2 ; \\
p_3 &= e_1^3 - 3e_2 e_1 + 3e_3 ; \\
p_4 &= e_1^4 - 4e_2 e_1^2 + 2e_2^2 + 4e_3 e_1 - 4e_4 .
\end{align*}
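These four expressions can be double-checked numerically by brute force (the sample point is an arbitrary choice):

```python
from itertools import combinations
from math import prod

def e(n, xs):
    # elementary symmetric polynomial, evaluated at the numbers xs
    return sum(prod(c) for c in combinations(xs, n))

def p(n, xs):
    # power sum p_n = x1^n + x2^n + ... + xN^n
    return sum(x**n for x in xs)

xs = [2, 3, 5, 7]
e1, e2, e3, e4 = (e(k, xs) for k in (1, 2, 3, 4))
assert p(1, xs) == e1
assert p(2, xs) == e1**2 - 2*e2
assert p(3, xs) == e1**3 - 3*e2*e1 + 3*e3
assert p(4, xs) == e1**4 - 4*e2*e1**2 + 2*e2**2 + 4*e3*e1 - 4*e4
```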
If $N < n$, then this expression of $p_n$ as a polynomial in $e_1, e_2, \ldots, e_n$ becomes
an expression as a polynomial in $e_1, e_2, \ldots, e_N$ if we throw away all addends
that contain any of the polynomials $e_{N+1}, e_{N+2}, \ldots, e_n$ (since these polynomials are $0$ by Proposition 7.1.11).
\begin{align*}
p_1 &= h_1 ; \\
p_2 &= -h_1^2 + 2h_2 ; \\
p_3 &= h_1^3 - 3h_2 h_1 + 3h_3 ; \\
p_4 &= -h_1^4 + 4h_2 h_1^2 - 2h_2^2 - 4h_3 h_1 + 4h_4 .
\end{align*}
\begin{align*}
e_1 &= p_1 ; \\
e_2 &= \frac{1}{2} p_1^2 - \frac{1}{2} p_2 ; \\
e_3 &= \frac{1}{6} p_1^3 - \frac{1}{2} p_2 p_1 + \frac{1}{3} p_3 ; \\
e_4 &= \frac{1}{24} p_1^4 - \frac{1}{4} p_2 p_1^2 + \frac{1}{8} p_2^2 + \frac{1}{3} p_3 p_1 - \frac{1}{4} p_4 .
\end{align*}
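Since these coefficients are genuine fractions, a numerical check is best done in exact rational arithmetic (Python's `fractions` module); the sample point is again an arbitrary choice:

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def e(n, xs):
    # elementary symmetric polynomial, evaluated at the numbers xs
    return sum(prod(c) for c in combinations(xs, n))

def p(n, xs):
    # power sum
    return sum(x**n for x in xs)

xs = [Fraction(v) for v in (2, 3, 5, 7)]
p1, p2, p3, p4 = (p(k, xs) for k in (1, 2, 3, 4))
assert e(2, xs) == Fraction(1, 2)*p1**2 - Fraction(1, 2)*p2
assert e(3, xs) == Fraction(1, 6)*p1**3 - Fraction(1, 2)*p2*p1 + Fraction(1, 3)*p3
assert e(4, xs) == (Fraction(1, 24)*p1**4 - Fraction(1, 4)*p2*p1**2
                    + Fraction(1, 8)*p2**2 + Fraction(1, 3)*p3*p1 - Fraction(1, 4)*p4)
```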
Note the fractions on the right hand sides! This is why we required K to be
a Q-algebra in Theorem 7.1.15 (c). In general, we cannot express en in terms
of p1 , p2 , . . . , pn if the integer n is not invertible in K.
The question of expressing $e_n$ as a polynomial in $p_1, p_2, \ldots, p_N$ (as opposed
to $p_1, p_2, \ldots, p_n$) is easily reduced to what we have just done: If $n \le N$,
then we have answered it already; if n > N, then the answer is en = 0 (by
Proposition 7.1.11).
(d) Not even algebraic independence of $p_1, p_2, \ldots, p_N$ is true in general (if
we don't assume that $K$ is a $\mathbb{Q}$-algebra)! Indeed, if $K = \mathbb{Z}/2$, then $p_1^2 = p_2$.
More generally, if $K = \mathbb{Z}/p$ for some prime $p$, then the Idiot's Binomial
Formula (i.e., the formula $(x+y)^p = x^p + y^p$ that holds in any commutative
$\mathbb{Z}/p$-algebra) yields $p_1^p = p_p$. (Did I mention that lowercase letters are in
short supply in the theory of symmetric polynomials?)
(e) If N = 3, then Theorem 7.1.15 (b) yields that h4 can be written as a
polynomial in $h_1, h_2, h_3$. Here is what this looks like:
(f) If N = 3, then
\[
(\lambda_1, \lambda_2, \ldots, \lambda_\ell) \mapsto \Big( \lambda_1, \lambda_2, \ldots, \lambda_\ell, \underbrace{0, 0, \ldots, 0}_{N - \ell \text{ zeroes}} \Big) .
\]
Proof. Straightforward. (We essentially did this back in our proof of Proposition
4.4.7 (a), although we used the letter k instead of N back then.)
The N-partitions turn out to be closely connected to the ring S . Indeed, we
will soon see various bases of the K-module S , all of which are indexed by the
N-partitions. We shall construct the simplest one in a moment. First, we define
some auxiliary notations:
For example, if $N = 5$, then $x^{(1,5,0,4,4)} = x_1^1 x_2^5 x_3^0 x_4^4 x_5^4 = x_1 x_2^5 x_4^4 x_5^4$ and
$\operatorname{sort}(1, 5, 0, 4, 4) = (5, 4, 4, 1, 0)$.
and
\begin{align*}
&= x_1^3 x_2^2 x_3 + x_1^3 x_2 x_3^2 + x_1^2 x_2^3 x_3 + x_1^2 x_2 x_3^3 + x_1 x_2^3 x_3^2 + x_1 x_2^2 x_3^3 \\
&= x_1^3 x_2^2 x_3 + (\text{all other 5 permutations of this monomial})
\end{align*}
and
and
\[
m_{(2,2,2)} = \sum_{\substack{a \in \mathbb{N}^3 ; \\ \operatorname{sort} a = (2,2,2)}} x^a = x_1^2 x_2^2 x_3^2 .
\]
where the size |λ| of an N-partition λ is defined to be the sum of its entries
(i.e., if λ = (λ1 , λ2 , . . . , λ N ), then |λ| := λ1 + λ2 + · · · + λ N ).
(c) Assume that $N > 0$. For each $n \in \mathbb{N}$, we have
\[
p_n = m_{(n,0,0,\ldots,0)} ,
\]
where $(n, 0, 0, \ldots, 0)$ is the $N$-tuple that begins with an $n$ and ends with $N-1$
zeroes.
Theorem 7.2.7. (a) The family $(m_\lambda)_{\lambda \text{ is an } N\text{-partition}}$ is a basis of the $K$-module
$\mathcal{S}$.
(b) Each symmetric polynomial $f \in \mathcal{S}$ satisfies
\[
f = \sum_{\substack{\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_N) \\ \text{is an } N\text{-partition}}} \left[ x_1^{\lambda_1} x_2^{\lambda_2} \cdots x_N^{\lambda_N} \right] f \cdot m_\lambda .
\]
\[
(x+y)(y+z)(z+x) = \underbrace{x^2y + x^2z + y^2x + y^2z + z^2x + z^2y}_{= m_{(2,1,0)}} + 2 \underbrace{xyz}_{= m_{(1,1,1)}}
= m_{(2,1,0)} + 2m_{(1,1,1)} .
\]
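Monomial symmetric polynomials can be evaluated by summing $x^a$ over the distinct rearrangements $a$ of $\lambda$; this confirms the expansion above at an arbitrarily chosen sample point:

```python
from itertools import permutations
from math import prod

def m(lam, xs):
    # monomial symmetric polynomial m_lam: sum of x^a over all DISTINCT
    # rearrangements a of the tuple lam
    return sum(prod(x**a for x, a in zip(xs, perm))
               for perm in set(permutations(lam)))

x, y, z = 2, 3, 5
assert (x+y) * (y+z) * (z+x) == m((2, 1, 0), [x, y, z]) + 2*m((1, 1, 1), [x, y, z])
```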
The proof of Theorem 7.2.7 will rely on a simple proposition that expresses
how a permutation σ ∈ S N transforms the coefficients of a polynomial f ∈ P
(guess what: it permutes these coefficients):
for any $(a_1, a_2, \ldots, a_N) \in \mathbb{N}^N$.
Here, as in Section 3.15, we are using the notation $\left[ x_1^{a_1} x_2^{a_2} \cdots x_N^{a_N} \right] g$ for the
coefficient of the monomial $x_1^{a_1} x_2^{a_2} \cdots x_N^{a_N}$ in a polynomial $g$.
What about similar determinants, such as $\det\begin{pmatrix} x^5 & x^3 & 1 \\ y^5 & y^3 & 1 \\ z^5 & z^3 & 1 \end{pmatrix}$? Just as in
the proof of Lemma 6.4.33 (in which we computed the original Vandermonde
determinant), we can argue that this is a polynomial in $x, y, z$ that is divisible
by each of $x-y$ and $x-z$ and $y-z$ (since it becomes $0$ if we set one of $x, y, z$
equal to another). Hence,
\[
\det\begin{pmatrix} x^5 & x^3 & 1 \\ y^5 & y^3 & 1 \\ z^5 & z^3 & 1 \end{pmatrix} = (x-y)(x-z)(y-z) \cdot q
\]
for some polynomial $q$. This quotient $q$ is
\begin{align*}
q = \frac{\det\begin{pmatrix} x^5 & x^3 & 1 \\ y^5 & y^3 & 1 \\ z^5 & z^3 & 1 \end{pmatrix}}{(x-y)(x-z)(y-z)}
&= \frac{-x^5y^3 + x^5z^3 + x^3y^5 - x^3z^5 - y^5z^3 + y^3z^5}{-x^2y + x^2z + xy^2 - xz^2 - y^2z + yz^2} \\
&= x^2y^3 + x^3y^2 + x^2z^3 + x^3z^2 + y^2z^3 + y^3z^2 + xyz^3 \\
&\quad + xy^3z + x^3yz + 2xy^2z^2 + 2x^2yz^2 + 2x^2y^2z \\
&= m_{(3,2,0)} + m_{(3,1,1)} + 2m_{(2,2,1)} \in \mathcal{S} .
\end{align*}
Why is $q$ symmetric? If we swap two of the variables $x, y, z$, then both the determinant
$\det\begin{pmatrix} x^5 & x^3 & 1 \\ y^5 & y^3 & 1 \\ z^5 & z^3 & 1 \end{pmatrix}$ and the product
$(x-y)(x-z)(y-z)$ get multiplied by $-1$, so their ratio $q$ stays unchanged.
This shows that $\sigma \cdot q = q$ whenever $\sigma \in S_3$ is a transposition. Since the
transpositions generate the group $S_3$ (indeed, Corollary 5.3.22 yields that the
simple transpositions $s_1, s_2$ generate $S_3$), this entails that $\sigma \cdot q = q$ for any
$\sigma \in S_3$ (not just for transpositions). This means that $q$ is symmetric.
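The factorization of this determinant, and the expansion of $q$ in monomial symmetric polynomials, can be confirmed at a sample point with a hand-rolled $3 \times 3$ determinant (the point $(2, 3, 5)$ is an arbitrary choice):

```python
def det3(M):
    # cofactor expansion of a 3x3 determinant along the first row
    return (M[0][0] * (M[1][1]*M[2][2] - M[1][2]*M[2][1])
          - M[0][1] * (M[1][0]*M[2][2] - M[1][2]*M[2][0])
          + M[0][2] * (M[1][0]*M[2][1] - M[1][1]*M[2][0]))

def q(x, y, z):
    # m_{(3,2,0)} + m_{(3,1,1)} + 2 m_{(2,2,1)}
    return (x**2*y**3 + x**3*y**2 + x**2*z**3 + x**3*z**2 + y**2*z**3 + y**3*z**2
            + x*y*z**3 + x*y**3*z + x**3*y*z
            + 2*x*y**2*z**2 + 2*x**2*y*z**2 + 2*x**2*y**2*z)

x, y, z = 2, 3, 5
alt = det3([[x**5, x**3, 1], [y**5, y**3, 1], [z**5, z**3, 1]])
assert alt == (x - y) * (x - z) * (y - z) * q(x, y, z)
```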
There is nothing special about the exponents 5 and 3 and 0 in the above
determinant. More generally, for any $a, b, c \in \mathbb{N}$, we can define the so-called
alternant
\[
\det\begin{pmatrix} x^a & x^b & x^c \\ y^a & y^b & y^c \\ z^a & z^b & z^c \end{pmatrix} \in \mathcal{P} .
\]
When studying this alternant, we can WLOG assume that $a, b, c$ are distinct
(since otherwise, the alternant is just $0$) and furthermore assume that $a >
b > c$ (since the general case is reduced to this one by swapping the columns
around). The alternant is then a polynomial divisible by $x-y$ and $x-z$ and
$y-z$ (since it becomes $0$ if we set one of $x, y, z$ equal to another), and thus
divisible by $(x-y)(x-z)(y-z) = \det\begin{pmatrix} x^2 & x & 1 \\ y^2 & y & 1 \\ z^2 & z & 1 \end{pmatrix}$ (the simplest nonzero
alternant). Moreover, the ratio
\[
\frac{\det\begin{pmatrix} x^a & x^b & x^c \\ y^a & y^b & y^c \\ z^a & z^b & z^c \end{pmatrix}}{(x-y)(x-z)(y-z)}
= \frac{\det\begin{pmatrix} x^a & x^b & x^c \\ y^a & y^b & y^c \\ z^a & z^b & z^c \end{pmatrix}}{\det\begin{pmatrix} x^2 & x & 1 \\ y^2 & y & 1 \\ z^2 & z & 1 \end{pmatrix}}
\]
Note that the definition of $a_\rho$ yields
\begin{align*}
a_\rho = \det\left( \left( x_i^{\rho_j} \right)_{1 \le i \le N,\ 1 \le j \le N} \right)
&= \det\left( \left( x_i^{N-j} \right)_{1 \le i \le N,\ 1 \le j \le N} \right)
\qquad \left(\text{since } \rho_j = N-j \text{ for each } j \in [N]\right) \\
&= \prod_{1 \le i < j \le N} \left( x_i - x_j \right) \tag{265}
\end{align*}
(The 3-rd row is invisible since it has length 0.) The four boxes in the 1-st (i.e.,
topmost) row of this diagram are (1, 1), (1, 2), (1, 3) and (1, 4) (from left to
right), while the single box in its 2-nd row is (2, 1).
Now, we are going to fill our Young diagrams – i.e., to put numbers in the
boxes:
• For instance, the entry of a tableau T in box (i, j) will mean the value
T (i, j).
• Also, the u-th row of a tableau T (for a given u ≥ 1) will mean the sequence
of all entries of T in the boxes (i, j) with i = u.
• Likewise, the v-th column of a tableau T (for a given v ≥ 1) will mean the
sequence of all entries of T in the boxes (i, j) with j = v.
• If T is a Young tableau of shape λ, then the boxes of Y (λ) will also be
called the boxes of T.
• Two boxes of a Young diagram (or of a tableau) are said to be adjacent
if they have an edge in common when drawn on the picture (i.e., when
one of them has the form (i, j), while the other has the form (i, j + 1) or
(i + 1, j)).
• The words “north”, “west”, “south” and “east” are to be understood ac-
cording to the picture of a Young diagram: e.g., the box (2, 4) lies one step
north and three steps west of the box (3, 7).
We let SSYT (λ) denote the set of all semistandard Young tableaux of shape
λ. (This depends on N as well, but N is fixed, so we omit it from our no-
tation.) We will usually say “semistandard tableau” instead of “semistandard
Young tableau”.
1 3 3 4     2 1 3 4     1 1 2 3     (266)
2 3 5       3 4 5       2 4 5
4           6           6

1 2 3 4     1 1 1 1     1 2 3 4
5 6 7       2 2 2       1 2 3
8           3           1
Which of these 6 tableaux are semistandard? The first one is not semistan-
dard, since the entries in its second column do not strictly increase down the
column. The second one is not semistandard, since the entries in its first row
do not weakly increase along the row. The third one is semistandard. The
fourth one is semistandard, too. The fifth one is semistandard, too (actually
it has a special property: each of its entries is the smallest possible value that
an entry of a semistandard tableau could have in its box). The sixth one is
not semistandard, again because of the columns.
For example, the three Young tableaux in (266) have corresponding monomi-
als
\begin{align*}
&x_1 x_3 x_3 x_4 x_2 x_3 x_5 x_4 = x_1 x_2 x_3^3 x_4^2 x_5 , \\
&x_2 x_1 x_3 x_4 x_3 x_4 x_5 x_6 , \\
&x_1 x_1 x_2 x_3 x_2 x_4 x_5 x_6 .
\end{align*}
Example 7.3.10. (a) Let n ∈ N. Consider the N-partition (n, 0, 0, . . . , 0). The
semistandard tableaux T of shape (n, 0, 0, . . . , 0) are simply the fillings of a
single row with n elements of [ N ] that weakly increase from left to right:
$T = i_1\ i_2\ \cdots\ i_n$ with $i_1 \le i_2 \le \cdots \le i_n$.
Thus,
\[
s_{(n,0,0,\ldots,0)} = \sum_{i_1 \le i_2 \le \cdots \le i_n} x_{i_1} x_{i_2} \cdots x_{i_n} = h_n .
\]
Hence,
\begin{align*}
s_{(2,1,0,0,0,\ldots,0)} &= \sum_{\substack{i \le j ; \\ i < k}} x_i x_j x_k
= \underbrace{\sum_{\substack{i < k ; \\ j = i}} x_i x_j x_k}_{= \sum_{i<k} x_i x_i x_k}
+ \underbrace{\sum_{i < j < k} x_i x_j x_k}_{= e_3}
+ \underbrace{\sum_{\substack{i < k ; \\ j = k}} x_i x_j x_k}_{= \sum_{i<k} x_i x_k x_k}
+ \underbrace{\sum_{i < k < j} x_i x_j x_k}_{= e_3} \\
&\qquad \left(\begin{array}{c}\text{since each triple } (i,j,k) \text{ of elements of } [N] \\ \text{that satisfies } i \le j \text{ and } i < k \text{ must satisfy} \\ \text{exactly one of the four} \\ \text{conditions } (i < k \text{ and } j = i) \text{ and } i < j < k \\ \text{and } (i < k \text{ and } j = k) \text{ and } i < k < j , \\ \text{and conversely, each triple satisfying one} \\ \text{of the latter four conditions must} \\ \text{satisfy } i \le j \text{ and } i < k\end{array}\right) \\
&= \sum_{i<k} \underbrace{x_i x_i}_{= x_i^2} x_k + e_3 + \sum_{i<k} x_i \underbrace{x_k x_k}_{= x_k^2} + e_3
= 2e_3 + \sum_{i<k} x_i^2 x_k + \sum_{i<k} x_i x_k^2 \\
&= 2e_3 + e_2 e_1 - 3e_3 = e_2 e_1 - e_3 .
\end{align*}
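The computation $s_{(2,1)} = e_2 e_1 - e_3$ can be double-checked by brute-force enumeration of the tableaux (first row $(i, j)$ with $i \le j$, second row $(k)$ with $i < k$); the sample point below is an arbitrary choice:

```python
from itertools import combinations, product
from math import prod

def e(n, xs):
    # elementary symmetric polynomial, evaluated at the numbers xs
    return sum(prod(c) for c in combinations(xs, n))

def s21(xs):
    # sum of x_i x_j x_k over semistandard tableaux of shape (2,1):
    # first row (i, j) with i <= j, second row (k) with i < k
    N = len(xs)
    return sum(xs[i] * xs[j] * xs[k]
               for i, j, k in product(range(N), repeat=3)
               if i <= j and i < k)

xs = [1, 2, 3, 5]
assert s21(xs) == e(2, xs) * e(1, xs) - e(3, xs)
```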
\[
\alpha + \beta := (\alpha_1 + \beta_1, \alpha_2 + \beta_2, \ldots, \alpha_N + \beta_N) .
\]
This theorem (once it will be proved) will yield Conjecture 7.3.3 in the case
when α1 > α2 > · · · > α N . Indeed, if α = (α1 , α2 , . . . , α N ) ∈ N N is an N-tuple
we have ⊆ ,
134 This is easy to prove (but see Lemma 7.3.39 (a) below for a proof).
135 See Lemma 7.3.39 (b) below for the details of this.
For example, $Y((4,3,1)/(2,1,0)) = \{(1,3),\ (1,4),\ (2,2),\ (2,3),\ (3,1)\}$.
We note that any row or column of a skew Young diagram Y (λ/µ) is con-
tiguous, i.e., has no holes between boxes. Better yet, if ( a, b) and (e, f ) are two
boxes of Y (λ/µ), then any box (c, d) that lies “between” them (i.e., that satisfies
a ≤ c ≤ e and b ≤ d ≤ f ) must also belong to Y (λ/µ). Let us state this as a
lemma:
Hints to the proof of Lemma 7.3.14. This follows easily from the definition of Y (λ/µ)
using the fact that λ and µ are weakly decreasing N-tuples. A detailed proof
can be found in Section B.10.
Lemma 7.3.14 is known as the convexity of Y (λ/µ) (albeit in a very specific
sense of the word “convexity”).
Next, we can define Young tableaux of shape λ/µ whenever λ and µ are two
N-partitions satisfying µ ⊆ λ. The definition is analogous to Definition 7.3.5,
except that we are only filling the boxes of Y (λ/µ) (rather than all boxes of
Y (λ)) this time:
The notion of a semistandard tableau of shape λ/µ is, again, defined in the
same way as for shape λ:
We let SSYT (λ/µ) denote the set of all semistandard Young tableaux of
shape λ/µ. We will usually say “semistandard tableau” instead of “semistan-
dard Young tableau”.
For example, here is a semistandard Young tableau of shape $(4,3,1)/(2,1,0)$
(the dots mark the boxes of $Y((2,1,0))$, which do not belong to the skew diagram):

. . 1 3
. 2 2
1
Meanwhile, there are no Young tableaux of shape (3, 2, 1) / (2, 2, 2) (since we
don’t have (2, 2, 2) ⊆ (3, 2, 1)), and thus the set SSYT ((3, 2, 1) / (2, 2, 2)) is
empty.
The phrases “increase weakly along each row” and “increase strictly down
each column” in Definition 7.3.13 have been formalized in terms of adjacent
entries: e.g., we have declared “increase weakly along each row” to mean
“T (i, j) ≤ T (i, j + 1)” rather than “T (i, j1 ) ≤ T (i, j2 ) whenever j1 ≤ j2 ”. How-
ever, since any row or column of Y (λ/µ) is contiguous, the latter stronger
meaning actually follows from the former. To wit:
Lemma 7.3.17. Let λ and µ be two N-partitions. Let T be a semistandard
Young tableau of shape λ/µ. Then:
(a) If (i, j1 ) and (i, j2 ) are two elements of Y (λ/µ) satisfying j1 ≤ j2 , then
T (i, j1 ) ≤ T (i, j2 ).
(b) If (i1 , j) and (i2 , j) are two elements of Y (λ/µ) satisfying i1 ≤ i2 , then
T ( i1 , j ) ≤ T ( i2 , j ).
(c) If (i1 , j) and (i2 , j) are two elements of Y (λ/µ) satisfying i1 < i2 , then
T ( i1 , j ) < T ( i2 , j ).
(d) If (i1 , j1 ) and (i2 , j2 ) are two elements of Y (λ/µ) satisfying i1 ≤ i2 and
j1 ≤ j2 , then T (i1 , j1 ) ≤ T (i2 , j2 ).
(e) If (i1 , j1 ) and (i2 , j2 ) are two elements of Y (λ/µ) satisfying i1 < i2 and
j1 ≤ j2 , then T (i1 , j1 ) < T (i2 , j2 ).
Hints to the proof of Lemma 7.3.17. This is easy using Lemma 7.3.14. A detailed
proof can be found in Section B.10.
For example,
Definition 7.3.19. Let λ and µ be two N-partitions. We define the skew Schur
polynomial sλ/µ ∈ P by
\[
s_{\lambda/\mu} := \sum_{T \in \operatorname{SSYT}(\lambda/\mu)} x^T .
\]
\[
s_{(3,3,2,0,0,0,\ldots,0)/(2,1,0,0,0,\ldots,0)} = \sum_{i < j \ge k < \ell \ge m} x_i x_j x_k x_\ell x_m ,
\]
Theorem 7.3.21. Let λ and µ be any two N-partitions. Then, the polynomial
sλ/µ is symmetric.
We will now prove Theorem 7.3.21 bijectively, using a beautiful set of combi-
natorial bijections known as the Bender–Knuth involutions.
T= 1 2 2 2 2 3 .
1 1 2 3 3 4 6
2 3 3 5
2 4
(We have color-coded the entries so that 2’s are red, 3’s are blue, and all
other entries are black. You can mostly forget about the black entries, since
our construction of β k ( T ) will neither change them nor depend on them.)
We note that the entries of T increase weakly along each row (since T is
semistandard), and increase strictly down each column (for the same reason).
Now, an entry k in T is matched if and only if there is a k + 1 anywhere in its
column (because if there is a k + 1 anywhere in its column, then this k + 1 must
be directly underneath the k 137 , and therefore the k is matched). Likewise, an
entry k + 1 in T is matched if and only if there is a k anywhere in its column.
Thus, matched entries come in pairs: If a k in T is matched, then the k + 1
directly underneath it is also matched, and conversely, if a k + 1 in T is matched,
then the k directly above it is also matched. Hence, there is an obvious bijection
between the sets {matched k’s in T } and {matched (k + 1) ’s in T } 138 . Thus,
by the bijection principle, we have
(# of matched k’s in T )
= (# of matched (k + 1) ’s in T ) . (267)
Each column of T that contains a matched entry must contain exactly two
matched entries (one k and one k + 1); we shall refer to these two entries as
each other’s “partners”.
136 We will explain this argument in more detail at the end of this proof.
137 since the entries of each column of T increase down this column
138 Strictly speaking, these sets should consist not of the entries, but rather of the boxes in which
Our goal is to modify some of the entries k and k + 1 in such a way that we
obtain a new semistandard tableau that has as many k’s as our original tableau
T had (k + 1)’s, and has as many (k + 1)’s as our original tableau T had k’s. We
do not want to change any entries other than k’s and (k + 1)’s; nor do we want
to replace any k’s or (k + 1)’s by entries other than k and k + 1.
These requirements force us to leave all matched entries (both k’s and (k + 1)’s)
unchanged. Indeed, if an entry is matched, then its column contains both a k
and a k + 1, and thus neither of these two entries can be changed without
breaking the “entries increase strictly down each column” condition in Defini-
tion 7.3.16. Thus, the matched entries will have to stay unchanged.
On the other hand, we can arbitrarily replace the free entries by k’s or (k + 1)’s,
as long as we make sure to keep the rows weakly increasing; the columns will
stay strictly increasing no matter what we do (because a column containing a
free k does not contain any k + 1, and a column containing a free k + 1 does not
contain any k), so our tableau will remain semistandard.
In view of these observations, let us perform the following procedure:
• For each row of T, if there are a free k’s and b free (k + 1)’s in this row, we
replace them by b free k’s and a free (k + 1)’s (placed in this order, from
left to right).
βk (T ) = 1 2 2 2 3 3 .
1 1 2 3 3 4 6
2 3 3 5
3 4
Indeed:
• The 1-st row of T had 2 free 2’s and 1 free 3, so we replaced them by 1
free 2 and 2 free 3’s.
• The 2-nd row had 0 free 2’s and 0 free 3’s, so we replaced them by 0
free 2’s and 0 free 3’s. (Of course, this did not change anything.)
• The 3-rd row had 1 free 2 and 1 free 3, so we replaced them by 1 free 2
and 1 free 3. (Of course, this did not change anything.)
• The 4-th row had 1 free 2 and 0 free 3’s, so we replaced them by 0 free
2’s and 1 free 3.
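The row-wise replacement procedure above can be sketched in Python. The tableau representation (a dict mapping $(\text{row}, \text{column})$ to entry, with 1-based coordinates) is an implementation choice made here for illustration, not the book's notation; an entry $k$ is matched when a $k+1$ sits directly below it, and an entry $k+1$ is matched when a $k$ sits directly above it.

```python
def bender_knuth(T, k):
    # T: dict mapping (row, col) -> entry; returns beta_k(T) as a new dict
    S = dict(T)
    for r in sorted({i for (i, _) in T}):
        cols = sorted(j for (i, j) in T if i == r)
        # free k's and (k+1)'s in this row: entries not in a vertical
        # k-over-(k+1) pair
        free = [j for j in cols
                if (T[(r, j)] == k and T.get((r + 1, j)) != k + 1)
                or (T[(r, j)] == k + 1 and T.get((r - 1, j)) != k)]
        a = sum(1 for j in free if T[(r, j)] == k)   # number of free k's
        b = len(free) - a                            # number of free (k+1)'s
        # replace them by b free k's followed by a free (k+1)'s
        for idx, j in enumerate(free):
            S[(r, j)] = k if idx < b else k + 1
    return S

# a small semistandard tableau (illustration data), with k = 2
T = {(1, 1): 1, (1, 2): 2, (1, 3): 2, (2, 1): 2, (2, 2): 3}
S = bender_knuth(T, 2)
assert S == {(1, 1): 1, (1, 2): 2, (1, 3): 3, (2, 1): 3, (2, 2): 3}
assert bender_knuth(S, 2) == T   # beta_k is an involution
```

Note that the matched/free status is computed from the original tableau `T` throughout, which is consistent with the construction: matched entries are left unchanged, and free entries remain free.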
entries $< k$ $\;\big|\;$ matched $k$'s $\;\big|\;$ free $k$'s $\;\big|\;$ free $(k+1)$'s $\;\big|\;$ matched $(k+1)$'s $\;\big|\;$ entries $> k+1$,

which appear in this order from left to right. (Each of these blocks
can be empty.)
(Picture: a $k+1$ in the box $(u+1, v)$, with the columns $v'$ and $v$ marked.)
Now, the Young diagram of λ/µ contains the box (u, v′ ), but also contains
the box (u + 1, v), which lies one row south and some number of columns east
of the former box (since v > v′ ). Hence, the Young diagram of λ/µ must have
a box (u + 1, v′ ) directly underneath the box (u, v′ ) (that is, the box (u + 1, v′ )
must belong to Y (λ/µ) 139 ). Here is an illustration of this:
(Picture: the boxes $(u, v')$ and $(u+1, v)$, together with the box $(u+1, v')$ directly below $(u, v')$.)
this entry k is matched. This contradicts our assumption that this k is free.
This contradiction shows that our assumption was false. Thus, we have shown
that all matched k’s in our row stand further left than all free k’s. Similarly,
we can show that all free (k + 1)’s stand further left than all matched (k + 1)’s
(the argument is analogous, but it uses the (u − 1)-st row of T rather than the
(u + 1)-st one). As explained above, this completes the proof of Observation 1.]
Observation 1 entails that the free entries in each row of T are stuck together
between the rightmost matched k and the leftmost matched k + 1. Hence, re-
placing these free entries as in our above definition of β k ( T ) does not mess up
the weakly increasing order of the entries in this row. This completes our proof
that β k ( T ) is a semistandard tableau.
Hence, the map β k : SSYT (λ/µ) → SSYT (λ/µ) is well-defined. This map β k
is called the k-th Bender–Knuth involution.
We shall now show that this map β k is a bijection. Better yet, we will show
that it is an involution (i.e., that β k ◦ β k = id):
β k ( T ) has b free k’s and a free (k + 1)’s, and therefore the same row of β k ( β k ( T )) will, in
turn, have a free k’s and b free (k + 1)’s again; but this means that its free entries are the
same as in T.
(# of k’s in β k ( T )) = (# of (k + 1) ’s in T ) (268)
and
(# of (k + 1) ’s in β k ( T )) = (# of k’s in T ) . (269)
Moreover, if $i \in [N]$ satisfies $i \neq k$ and $i \neq k+1$, then
\[
(\text{\# of } i\text{'s in } \beta_k(T)) = (\text{\# of } i\text{'s in } T) . \tag{270}
\]
Furthermore, the map $\beta_k$ leaves all matched entries unchanged; therefore,
(# of matched k’s in β k ( T ))
= (# of matched k’s in T ) (271)
and
(# of matched (k + 1) ’s in β k ( T ))
= (# of matched (k + 1) ’s in T ) . (272)
On the other hand, the map β k flips the imbalance between free k’s and
free (k + 1)’s in each row (but all these free entries remain free, whereas the
matched entries of T remain matched in β k ( T )); therefore, it also flips the total
imbalance between free k’s and free (k + 1)’s in the entire tableau143 . Thus,
(# of free k’s in β k ( T ))
= (# of free (k + 1) ’s in T ) (273)
143 sincethe total # of free k’s in a tableau equals the sum of the #s of free k’s in all rows (and
the same holds for free (k + 1)’s)
and
(# of free (k + 1) ’s in β k ( T ))
= (# of free k’s in T ) . (274)
Now, each $k$ in $\beta_k(T)$ is either free or matched (but not both at the same
time). Hence,
\begin{align*}
&(\text{\# of } k\text{'s in } \beta_k(T)) \\
&= \underbrace{(\text{\# of free } k\text{'s in } \beta_k(T))}_{\substack{= (\text{\# of free } (k+1)\text{'s in } T) \\ \text{(by (273))}}} + \underbrace{(\text{\# of matched } k\text{'s in } \beta_k(T))}_{\substack{= (\text{\# of matched } k\text{'s in } T) \\ \text{(by (271))}}} \\
&= (\text{\# of free } (k+1)\text{'s in } T) + \underbrace{(\text{\# of matched } k\text{'s in } T)}_{\substack{= (\text{\# of matched } (k+1)\text{'s in } T) \\ \text{(by (267))}}} \\
&= (\text{\# of free } (k+1)\text{'s in } T) + (\text{\# of matched } (k+1)\text{'s in } T) \\
&= (\text{\# of } (k+1)\text{'s in } T)
\end{align*}
(since each $k+1$ in $T$ is either free or matched, but not both at the same time).
Also, each $k+1$ in $\beta_k(T)$ is either free or matched (but not both at the same
time). Hence,
\begin{align*}
&(\text{\# of } (k+1)\text{'s in } \beta_k(T)) \\
&= \underbrace{(\text{\# of free } (k+1)\text{'s in } \beta_k(T))}_{\substack{= (\text{\# of free } k\text{'s in } T) \\ \text{(by (274))}}} + \underbrace{(\text{\# of matched } (k+1)\text{'s in } \beta_k(T))}_{\substack{= (\text{\# of matched } (k+1)\text{'s in } T) \\ \text{(by (272))}}} \\
&= (\text{\# of free } k\text{'s in } T) + \underbrace{(\text{\# of matched } (k+1)\text{'s in } T)}_{\substack{= (\text{\# of matched } k\text{'s in } T) \\ \text{(by (267))}}} \\
&= (\text{\# of free } k\text{'s in } T) + (\text{\# of matched } k\text{'s in } T) \\
&= (\text{\# of } k\text{'s in } T)
\end{align*}
(since each $k$ in $T$ is either free or matched, but not both at the same time).
Moreover, if i ∈ [ N ] satisfies i ̸= k and i ̸= k + 1, then
(# of i’s in β k ( T )) = (# of i’s in T )
(since the map β k leaves all i’s in T unchanged144 , and does not replace any
other entries by i’s). Thus, Observation 3 is proved.]
From Observation 3, we can easily conclude the following:
[Proof of Observation 4: For the sake of completeness, here is a detailed proof. Let
$T \in \operatorname{SSYT}(\lambda/\mu)$. Then,
\begin{align*}
x^T &= \prod_{i=1}^{N} x_i^{(\text{\# of times } i \text{ appears in } T)} \qquad \text{(by Definition 7.3.18)} \\
&= \prod_{i=1}^{N} x_i^{(\text{\# of } i\text{'s in } T)}
\qquad \left(\text{since } (\text{\# of times } i \text{ appears in } T) = (\text{\# of } i\text{'s in } T) \text{ for each } i \in [N]\right) \\
&= x_k^{(\text{\# of } k\text{'s in } T)} \cdot x_{k+1}^{(\text{\# of } (k+1)\text{'s in } T)} \cdot \prod_{\substack{i \in [N] ; \\ i \neq k \text{ and } i \neq k+1}} x_i^{(\text{\# of } i\text{'s in } T)} \tag{275}
\end{align*}
(here, we have split off the factors for $i = k$ and for $i = k+1$ from the product). The
same argument (applied to $\beta_k(T)$ instead of $T$) yields
\begin{align*}
x^{\beta_k(T)} &= \underbrace{x_k^{(\text{\# of } k\text{'s in } \beta_k(T))}}_{\substack{= x_k^{(\text{\# of } (k+1)\text{'s in } T)} \\ \text{(by (268))}}} \cdot \underbrace{x_{k+1}^{(\text{\# of } (k+1)\text{'s in } \beta_k(T))}}_{\substack{= x_{k+1}^{(\text{\# of } k\text{'s in } T)} \\ \text{(by (269))}}} \cdot \prod_{\substack{i \in [N] ; \\ i \neq k \text{ and } i \neq k+1}} \underbrace{x_i^{(\text{\# of } i\text{'s in } \beta_k(T))}}_{\substack{= x_i^{(\text{\# of } i\text{'s in } T)} \\ \text{(by (270))}}} \\
&= x_k^{(\text{\# of } (k+1)\text{'s in } T)} \cdot x_{k+1}^{(\text{\# of } k\text{'s in } T)} \cdot \prod_{\substack{i \in [N] ; \\ i \neq k \text{ and } i \neq k+1}} x_i^{(\text{\# of } i\text{'s in } T)} \\
&= x_{k+1}^{(\text{\# of } k\text{'s in } T)} \cdot x_k^{(\text{\# of } (k+1)\text{'s in } T)} \cdot \prod_{\substack{i \in [N] ; \\ i \neq k \text{ and } i \neq k+1}} x_i^{(\text{\# of } i\text{'s in } T)} .
\end{align*}
On the other hand, applying the transposition $s_k$ (or, more precisely, the action of this
transposition $s_k \in S_N$ on the ring $\mathcal{P}$) to both sides of the equality (275), we obtain
\begin{align*}
s_k \cdot x^T &= s_k \cdot \left( x_k^{(\text{\# of } k\text{'s in } T)} \cdot x_{k+1}^{(\text{\# of } (k+1)\text{'s in } T)} \cdot \prod_{\substack{i \in [N] ; \\ i \neq k \text{ and } i \neq k+1}} x_i^{(\text{\# of } i\text{'s in } T)} \right) \\
&= x_{k+1}^{(\text{\# of } k\text{'s in } T)} \cdot x_k^{(\text{\# of } (k+1)\text{'s in } T)} \cdot \prod_{\substack{i \in [N] ; \\ i \neq k \text{ and } i \neq k+1}} x_i^{(\text{\# of } i\text{'s in } T)}
\end{align*}
(since the action of $s_k$ on $\mathcal{P}$ swaps the indeterminates $x_k$ and $x_{k+1}$ while leaving all
other indeterminates $x_i$ unchanged). Comparing the last two equalities, we obtain
$x^{\beta_k(T)} = s_k \cdot x^T$. This proves Observation 4.]
Now, the definition of $s_{\lambda/\mu}$ yields
\[
s_{\lambda/\mu} = \sum_{T \in \operatorname{SSYT}(\lambda/\mu)} x^T . \tag{276}
\]
I think it was Hermann Weyl who originally proved the existence of such an
expansion (i.e., that any product sν sλ of two Schur polynomials is a sum of
Schur polynomials). The original proof used Lie group representations. The
idea of the proof, in a nutshell, is the following (skip this paragraph if you are
unfamiliar with representation theory): The irreducible polynomial represen-
tations of the classical group GL N (C) are (more or less) in bijection with the
N-partitions, meaning that there is an irreducible polynomial representation
Vλ for each N-partition λ, and all irreps (= irreducible polynomial representa-
tions) of GL N (C) have this form145 . These Vλ ’s are known as the Weyl modules,
or in a slightly more general form as the Schur functors. The tensor product of
two such irreps can be decomposed as a direct sum of irreps (since polynomial
representations of GL N (C) are completely reducible):
\[
V_\nu \otimes V_\lambda \cong \bigoplus_{\omega \text{ is an } N\text{-partition}} \underbrace{V_\omega^{\oplus c(\nu,\lambda,\omega)}}_{\substack{\text{a direct sum of } c(\nu,\lambda,\omega) \\ \text{many } V_\omega\text{'s}}} .
\]
\[
s_\nu s_\lambda = \sum_{\omega \text{ is an } N\text{-partition}} c(\nu, \lambda, \omega)\, s_\omega .
\]
In fact, the Schur polynomials sλ are the so-called characters of the irreps Vλ ,
and it is known that tensor products of representations correspond to products
of their characters.
All of this, in the detail it deserves, is commonly taught in a 1st or 2nd course
on representation theory (e.g., [Proces07] or [EGHetc11] or [Prasad15, Chapter
6]). But we are here for something else: we want to know these c (ν, λ, ω )’s. In
other words, we want a formula that expands a product sν sλ as a finite sum of
Schur polynomials.
Such a formula was first conjectured by Dudley Ernest Littlewood and Archibald
Read Richardson in 1934. It remained unproven for 40 years, not least because
the statement was not very clear. In the 1970s, proofs were found independently
by Marcel-Paul Schützenberger and Glanffrwd Thomas. Since then, at least a
dozen different proofs have appeared. The proof that I will show was pub-
lished by Stembridge in 1997 (in [Stembr02], perhaps one of the most readable
papers in all of mathematics), and crystallizes decades of work by many au-
thors (Gasharov’s somewhat similar proof [Gashar98] probably being the main
harbinger). It will prove not just an expansion for sν sλ , but also a generalization
(replacing sλ by a skew Schur polynomial sλ/µ ) found by Zelevinsky in 1981
145 Atleast if one uses the “right” definition of a polynomial representation. See [KraPro10, §5
and §6] or [Prasad15, §6.1] for details.
\begin{align*}
\alpha + \beta &:= (\alpha_1 + \beta_1, \alpha_2 + \beta_2, \ldots, \alpha_N + \beta_N) \qquad \text{and} \\
\alpha - \beta &:= (\alpha_1 - \beta_1, \alpha_2 - \beta_2, \ldots, \alpha_N - \beta_N) .
\end{align*}
For example,
col≥3 1 1 2 = 1 2 and
2 3 3
1 3 5 5
2 2
Remark 7.3.28. What shape does the tableau col≥ j T in Definition 7.3.27 have?
We don’t care, since we will only need this tableau for its content col≥ j T
(which is defined independently of the shape). However, the answer is not
hard to give: If λ = (λ1 , λ2 , . . . , λ N ) and µ = (µ1 , µ2 , . . . , µ N ), then col≥ j T is
a skew Young tableau of shape λ′ /µ′ , where
(Thus, the first j − 1 columns of col≥ j T are empty, i.e., have no boxes.)
This is a complex and somewhat confusing notion; before we move on, let us
thus give a metaphor that might help clarify it, and several examples.
Example 7.3.31. (a) Let N = 3 and ν = 0 = (0, 0, 0). Which of the following
six tableaux are 0-Yamanouchi?
[diagrams of six skew tableaux T1, T2, . . . , T6; in particular, T1 is the skew
tableau of shape (3, 2, 0) / (1, 0, 0) with rows (1, 1) and (2, 2)]
Note that all six of these tableaux are semistandard.
Let us check whether T1 is 0-Yamanouchi. Indeed, we compute the N-tuple
ν + cont (col≥j T1) ∈ N^N for each positive integer j, obtaining (2, 2, 0) for
j = 1, then (2, 1, 0) for j = 2, then (1, 0, 0) for j = 3, and (0, 0, 0) for all j ≥ 4.
All of the results (2, 2, 0), (2, 1, 0), (1, 0, 0) and (0, 0, 0) are N-partitions. Thus,
T1 is 0-Yamanouchi.
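For the reader who wishes to experiment, the ν-Yamanouchi condition is easy to test by machine: one only has to compute ν + cont (col≥j T) for the finitely many relevant j. Here is a Python sketch; the encoding of a skew tableau as a dictionary sending boxes (row, column) to entries, as well as the concrete example tableau, are our own choices rather than conventions from the text.

```python
# Sketch: testing the nu-Yamanouchi condition by brute force.
# A (skew) tableau is encoded as a dict {(row, col): entry} with entries in [N].

def content(boxes, N):
    """cont T = (# of 1's in T, # of 2's in T, ..., # of N's in T)."""
    cont = [0] * N
    for entry in boxes.values():
        cont[entry - 1] += 1
    return tuple(cont)

def is_partition(tup):
    """An N-partition is a weakly decreasing tuple of nonnegative integers."""
    return all(tup[i] >= tup[i + 1] for i in range(len(tup) - 1))

def is_yamanouchi(boxes, nu):
    """Check that nu + cont(col>=j T) is an N-partition for every j >= 1."""
    N = len(nu)
    max_col = max((c for (_, c) in boxes), default=0)
    for j in range(1, max_col + 2):          # any j > max_col just gives nu back
        restricted = {box: e for box, e in boxes.items() if box[1] >= j}
        cont = content(restricted, N)
        if not is_partition(tuple(nu[i] + cont[i] for i in range(N))):
            return False
    return True

# A skew tableau of shape (3, 2, 0)/(1, 0, 0): row 1 has boxes in columns 2, 3
# (both filled with 1), row 2 has boxes in columns 1, 2 (both filled with 2).
T = {(1, 2): 1, (1, 3): 1, (2, 1): 2, (2, 2): 2}
print(is_yamanouchi(T, (0, 0, 0)))   # True
```

For this T, the successive tuples are (2, 2, 0), (2, 1, 0), (1, 0, 0) and (0, 0, 0), all of which are partitions.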
Let us check whether T2 is 0-Yamanouchi. Indeed,
These are, in a sense, the “minimal” choices of ν for this to happen, but of
course there are many other choices of ν that work.
• In the sum on the right hand side of (280), the Schur polynomial sν+cont T
is always well-defined. Indeed, if T is a ν-Yamanouchi semistandard
tableau of shape λ/µ, then ν + cont T = ν + cont (col≥1 T) (since col≥1 T = T) is an
N-partition (by the definition of “ν-Yamanouchi”), so that sν+cont T is a
well-defined Schur polynomial.
What are the T’s in the sum? The (1, 0, 0)-Yamanouchi semistandard tableaux
of shape (2, 1, 0) /0 are the three tableaux with rows (1, 1) and (2), with rows
(1, 2) and (2), and with rows (1, 2) and (3). Their contents are (2, 1, 0), (1, 2, 0)
and (1, 1, 1), respectively, so that the corresponding Schur polynomials are
s(1,0,0)+(2,1,0) = s(3,1,0) ,
s(1,0,0)+(1,2,0) = s(2,2,0) ,
s(1,0,0)+(1,1,1) = s(2,1,1) .
Before we prove this lemma, let us explore its consequences. One of them is
the Littlewood–Richardson rule; another is Theorem 7.3.11 (b). Let us first see
how the latter can be derived from the lemma. This derivation, in turn, relies
on another (simple) lemma:
Proof of Lemma 7.3.35. This lemma is an easy consequence of the fact that the
entries of a semistandard tableau increase strictly down each column. A de-
tailed proof is given in Section B.10.
Proof of Theorem 7.3.11 (b) using Lemma 7.3.34. Recall that 0 = (0, 0, . . . , 0) ∈ N N .
Applying Lemma 7.3.34 to µ = 0 and ν = 0, we obtain
This rewrites as
aρ · sλ = ∑ acont T +ρ (282)
T is a 0-Yamanouchi
semistandard tableau
of shape λ/0
T0 = 1 1 1 1 .
2 2
3 3
4
It turns out that this minimalistic tableau is the only T on the right hand side
of (282). This will follow from the following two observations:
that for each positive integer j, the N-tuple 0 + cont col≥ j T0 ∈ N N is an N-partition
(since the i-th row of T0 has λi many boxes, and thus the i-th row of col≥ j T0 has
max {λi − ( j − 1) , 0} many boxes). Now, the N-tuple
0 + cont (col≥j T0) = cont (col≥j T0)
= (# of 1’s in col≥j T0 , # of 2’s in col≥j T0 , . . . , # of N’s in col≥j T0)
= (max {λ1 − (j − 1) , 0} , max {λ2 − (j − 1) , 0} , . . . , max {λN − (j − 1) , 0})
(by (283))
T (i′, j′) = T0 (i′, j′)   (284)
(since we have chosen (i, j) to have maximum possible j among the pairs satisfying
T (i, j) ̸= T0 (i, j)). Furthermore, for each (i′ , j′ ) ∈ Y (λ/0) satisfying i′ < i and j′ = j,
we have
T (i′, j′) = T0 (i′, j′)   (285)
(since we have chosen (i, j) to have minimum possible i among the maximum-j pairs
satisfying T (i, j) ̸= T0 (i, j)).
The definition of the minimalistic tableau T0 yields T0 (i, j) = i. Set p := T (i, j).
Hence, p = T (i, j) ̸= T0 (i, j) = i. 146
146 Here is an example of how our tableau T can look at this point (for N = 6 and λ =
The number p appears at least once in the j-th column of T (since p = T (i, j)), and
thus appears at least once in the restricted tableau col≥ j T (since this restricted tableau
contains the j-th column of T).
The definition of Y (λ/0) yields Y (λ/0) = Y (λ) \ Y (0) = Y (λ) (since Y (0) = ∅). Hence, a tableau
of shape λ/0 is the same as a tableau of shape λ. Thus, T is a tableau of shape λ (since
T is a tableau of shape λ/0). Since T is semistandard, we can thus apply Lemma 7.3.35,
and conclude that T (i, j) ≥ i. Hence, p = T (i, j) ≥ i. Combining this with p ̸= i, we
obtain p > i. In other words, i < p.
Now, recall that T is 0-Yamanouchi; hence, 0 + cont (col≥j T) is an N-partition (by
the definition of “0-Yamanouchi”). In other words, cont (col≥j T) is an N-partition
(since 0 + cont (col≥j T) = cont (col≥j T)). Write this N-partition cont (col≥j T) as
(a1 , a2 , . . . , aN). For each k ∈ [N], its entry ak is the # of k’s in col≥j T (by the
definition of cont (col≥j T)). Applying this to k = i, we see that ai is the # of i’s in col≥j T.
Similarly, a p is the # of p’s in col≥ j T. Hence, a p ≥ 1 (since we know that the number
p appears at least once in the restricted tableau col≥ j T). However, a1 ≥ a2 ≥ · · · ≥ a N
(since ( a1 , a2 , . . . , a N ) is an N-partition), and thus ai ≥ a p (since i < p). Hence, ai ≥
a p ≥ 1. In other words, the number i appears at least once in the restricted tableau
col≥ j T (since ai is the # of i’s in col≥ j T). In other words, the number i appears at least
once in one of the columns j, j + 1, j + 2, . . . of the tableau T. In other words, there
exists some (i′ , j′ ) ∈ Y (λ/0) satisfying j′ ≥ j and T (i′ , j′ ) = i. Consider this (i′ , j′ ).
Let us first assume (for the sake of contradiction) that j′ > j. Thus, (284) yields
T (i′ , j′ ) = T0 (i′ , j′ ) = i′ (by the definition of the minimalistic tableau T0 ). There-
fore, i′ = T (i′ , j′ ) = i. Hence, we can rewrite (i′ , j′ ) ∈ Y (λ/0) and T (i′ , j′ ) = i as
(i, j′ ) ∈ Y (λ/0) and T (i, j′ ) = i. Also, j < j′ (since j′ > j). However, the tableau T is
semistandard; thus, its entries increase weakly along each row. Therefore, from j < j′ ,
we obtain T (i, j) ≤ T (i, j′ ) 147 . Thus, p = T (i, j) ≤ T (i, j′ ) = i. But this contradicts
p > i.
This contradiction shows that our assumption (that j′ > j) was false. Hence, we
must have j′ ≤ j. Combined with j′ ≥ j, this yields j′ = j. Thus, we can rewrite
(i′ , j′ ) ∈ Y (λ/0) and T (i′ , j′ ) = i as (i′ , j) ∈ Y (λ/0) and T (i′ , j) = i.
We assume (for the sake of contradiction) that i′ < i. Hence, (285) yields T (i′ , j′ ) =
T0 (i′ , j′ ) = i′ (by the definition of the minimalistic tableau T0 ), so that i′ = T (i′ , j′ ) = i;
? 1 1 1 1 1 .
? 2 2 2 2
? p 3 3 3
? ?
? ?
?
Here, the known entries come from (284) and (285) (since the definition of the minimalistic
tableau T0 shows that T0 (i′ , j′ ) = i′ for each (i′ , j′ ) ∈ Y (λ/0)).
147 Strictly speaking, this follows by applying Lemma 7.3.17 (a) to (i, j) and (i, j′) instead of (i, j1) and (i, j2).
since it is easy to see that cont ( T0 ) = λ. This proves Theorem 7.3.11 (b) (using
Lemma 7.3.34).
Let us furthermore derive Theorem 7.3.32 from Lemma 7.3.34. This relies on
some elementary properties of certain polynomials. The underlying notion is
defined in an arbitrary commutative ring:
prefer to consider 0 to be a non-zero-divisor as well (so that they can say that an integral
domain has no zero-divisors, rather than saying that the only zero-divisor in an integral
domain is 0), even though 0 is not a regular element (unless the ring L is trivial). This
exception tends to make the notion of a non-zero-divisor fickle and unreliable.
Since the element aρ ∈ P is regular (by Lemma 7.3.38), we can cancel aρ from
this equality (i.e., we can apply Lemma 7.3.37 to L = P and a = aρ and u =
sν · sλ/µ and v = ∑ sν+cont T ). As a result, we obtain
T is a ν-Yamanouchi
semistandard tableau
of shape λ/µ
sν · sλ/µ = ∑ sν+cont T .
T is a ν-Yamanouchi
semistandard tableau
of shape λ/µ
Proof of Lemma 7.3.39. This is an easy consequence of Definition 7.3.2 (b). See
Section B.10 for a detailed proof.
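Both parts of Lemma 7.3.39, in the form in which they are used below (part (a): aα = 0 whenever the N-tuple α has two equal entries; part (b): swapping two entries of α flips the sign of aα), can be spot-checked numerically by evaluating the alternant aβ = ∑_{σ∈SN} (−1)^σ x_{σ(1)}^{β1} · · · x_{σ(N)}^{βN} at concrete rational points. Here is a Python sketch (this evaluation-based check is ours and is, of course, no substitute for the proof in Section B.10):

```python
# Sketch: numerically checking Lemma 7.3.39 for N = 3.
from fractions import Fraction
from itertools import permutations

def alternant(alpha, xs):
    """Evaluate a_alpha = sum over sigma of sign(sigma) * prod_i x_{sigma(i)}^{alpha_i}."""
    n = len(alpha)
    total = Fraction(0)
    for perm in permutations(range(n)):
        # sign of the permutation via its inversion count
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        term = Fraction(1)
        for i in range(n):
            term *= Fraction(xs[perm[i]]) ** alpha[i]
        total += (-1) ** inv * term
    return total

xs = (2, 3, 5)                       # a generic evaluation point
print(alternant((4, 2, 2), xs))      # 0: alpha has two equal entries (part (a))
a = alternant((5, 3, 1), xs)
b = alternant((3, 5, 1), xs)
print(a == -b)                       # True: swapping two entries flips the sign (part (b))
```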
Proof of Lemma 7.3.34. For any β ∈ N N and any i ∈ [ N ], we let β i denote the
i-th entry of β. Thus, for example, ρk = N − k for each k ∈ [ N ] (since ρ =
( N − 1, N − 2, . . . , N − N )).
Since addition on N N is defined entrywise, we have ( β + γ)i = β i + γi for
any β, γ ∈ N N and i ∈ [ N ].
The group S N acts on P by K-algebra automorphisms. Hence, in particular,
we have
σ · ( f g) = (σ · f ) · (σ · g)
for any σ ∈ S N and any f , g ∈ P . In other words, we have
(σ · f ) · (σ · g) = σ · ( f g) (286)
= ∑_{σ∈SN} (−1)^σ x_{σ(1)}^{β1} x_{σ(2)}^{β2} · · · x_{σ(N)}^{βN}
= ∑_{σ∈SN} (−1)^σ σ · (x1^{β1} x2^{β2} · · · xN^{βN})
(because the action of each σ ∈ SN on P substitutes xσ(i) for each xi).
= ∑_{σ∈SN} (−1)^σ (σ · x^{ν+ρ}) · sλ/µ
= ∑_{σ∈SN} (−1)^σ (σ · x^{ν+ρ}) · (σ · sλ/µ)
(since sλ/µ is symmetric, so that σ · sλ/µ = sλ/µ)
= ∑_{σ∈SN} (−1)^σ σ · (x^{ν+ρ} sλ/µ)   (by (286)).   (289)
x^{ν+ρ} sλ/µ = ∑_{T∈SSYT(λ/µ)} x^{ν+ρ+cont T} = ∑_{T∈SSYT(λ/µ)} x^{ν+cont T+ρ}
(since x^{ν+ρ+cont T} = x^{ν+cont T+ρ}).
= ∑_{σ∈SN} (−1)^σ ∑_{T∈SSYT(λ/µ)} σ · x^{ν+cont T+ρ}
= ∑_{T∈SSYT(λ/µ)} ∑_{σ∈SN} (−1)^σ σ · x^{ν+cont T+ρ}
= ∑_{T∈SSYT(λ/µ)} a_{ν+cont T+ρ}   (290)
(since the inner sum ∑_{σ∈SN} (−1)^σ σ · x^{ν+cont T+ρ} equals a_{ν+cont T+ρ}, by (288),
applied to β = ν + cont T + ρ).
This almost looks like the claim we want to prove, but the sum on the right
hand side is too big: It runs over all semistandard tableaux of shape λ/µ, while
we only want it to run over the ones that are ν-Yamanouchi. Thus, we will now
try to cancel the extraneous addends (i.e., the addends corresponding to the
T’s that are not ν-Yamanouchi).
Let us first make this a bit more precise. We define two sets
sign T := aν+cont T +ρ .
T is nonempty. On the other hand, this set is finite152 . Hence, this set has a
maximum element. In other words, the largest violator of T exists. 153
Let j be the largest violator of T. Then, ν + cont (col≥j T) is not an N-partition,
but ν + cont (col≥j+1 T) is an N-partition (since j is the largest violator of T).
Define two N-tuples b ∈ N^N and c ∈ N^N by b := ν + cont (col≥j T) and c := ν + cont (col≥j+1 T).
T = [a skew tableau with rows (2); (1, 1, 2, 2, 3); (2, 2, 3, 4); (1, 3, 3, 5, 6); (2, 4, 5, 6)].
152 Proof. Let j ≥ 1 be larger than each entry of λ. Then, the restricted tableau col≥j T is empty
and thus satisfies cont (col≥j T) = 0. Hence, ν + cont (col≥j T) = ν + 0 = ν, which is an
N-partition by assumption. Thus, j is not a violator of T (by the definition of a “violator”).
Forget that we fixed j. We thus have shown that if j ≥ 1 is larger than each entry of λ,
then j is not a violator of T. Hence, if j ≥ 1 is sufficiently high, then j is not a violator of T.
Thus, the set of violators of T is bounded from above, and therefore finite (since it is a set
of positive integers).
153 For example, if ν = 0, then the tableaux T2, T3 and T5 shown in Example 7.3.31
We have
ν + cont (col≥j T) = ν + 0 = ν = (4, 2, 2, 0, 0, 0, 0) for each j ≥ 8;
ν + cont (col≥7 T) = ν + (0, 1, 1, 0, 0, 0, 0) = (4, 3, 3, 0, 0, 0, 0);
ν + cont (col≥6 T) = ν + (0, 2, 1, 1, 0, 0, 0) = (4, 4, 3, 1, 0, 0, 0);
ν + cont (col≥5 T) = ν + (0, 3, 2, 1, 0, 1, 0) = (4, 5, 4, 1, 0, 1, 0).
The first three of these N-tuples are N-partitions, while the last one is not; thus, the
largest violator of T is 5. Hence,
j := 5 and
b := ν + cont (col≥j T) = ν + cont (col≥5 T) = (4, 5, 4, 1, 0, 1, 0) and
c := ν + cont (col≥j+1 T) = ν + cont (col≥6 T) = (4, 4, 3, 1, 0, 0, 0).
The missteps of T are the numbers k ∈ [ N − 1] such that bk < bk+1 ; these
numbers are 2 and 5 (since b2 < b3 and b5 < b6 ). Thus, the smallest misstep
of T is 2. Hence, we set k := 2.
Let us next make a few general observations about b and c.
The restrictions col≥ j T and col≥ j+1 T of T are “almost the same”: The only
difference between them is that the j-th column of T is included in col≥ j T but
not in col≥ j+1 T. Hence,
cont (col≥j T) = cont (col≥j+1 T) + cont (col_j T) ,
where col j T denotes the j-th column of T (or, to be more precise, the restriction
of T to the j-th column). Now,
b = ν + cont (col≥j T) = ν + cont (col≥j+1 T) + cont (col_j T)
(since cont (col≥j T) = cont (col≥j+1 T) + cont (col_j T))
= c + cont (col_j T)   (since ν + cont (col≥j+1 T) = c).   (292)
Now, recall that the tableau T is semistandard; thus, its entries increase
strictly down each column. Hence, in particular, the entries of the j-th column
of T increase strictly down this column. Therefore, any given number
i ∈ [N] appears at most once in this column. In other words, any given number
i ∈ [N] appears at most once in col_j T. In other words, (cont (col_j T))_i ≤ 1
for each i ∈ [N] (because (cont (col_j T))_i counts how often i appears in col_j T).
Applying this inequality to i = k + 1, we obtain (cont (col_j T))_{k+1} ≤ 1. Now,
from (292), we obtain
b_{k+1} = (c + cont (col_j T))_{k+1} = c_{k+1} + (cont (col_j T))_{k+1} ≤ c_{k+1} + 1,
so that b_k < b_{k+1} ≤ c_{k+1} + 1. Since b_k and c_{k+1} + 1 are integers, this entails
b_k ≤ (c_{k+1} + 1) − 1 = c_{k+1}. However, (292) also yields
b_k = (c + cont (col_j T))_k = c_k + (cont (col_j T))_k ≥ c_k
(since (cont (col_j T))_k counts how often k appears in col_j T, and thus is ≥ 0).
But c is an N-partition, so that c_k ≥ c_{k+1}. Hence, b_k ≥ c_k ≥ c_{k+1}, which
(combined with b_k ≤ c_{k+1}) yields b_k = c_{k+1}. Now, b_k < b_{k+1} shows that
b_{k+1} ≥ b_k + 1, whereas b_{k+1} ≤ c_{k+1} + 1 = b_k + 1; hence,
b_k + 1 = b_{k+1} . (293)
col<j (T∗) = βk (col<j T)   (294)
and
col≥j (T∗) = col≥j T   (295)
(where col< j ( T ∗ ) is defined just as col< j T was defined, except that we are using
T ∗ instead of T).
T = [a skew tableau with rows (2); (1, 1, 2, 2, 3); (2, 2, 3, 4); (1, 3, 3, 5, 6); (2, 4, 5, 6)]
=⇒ T∗ = [a skew tableau with rows (2); (1, 1, 2, 2, 3); (2, 3, 3, 4); (1, 2, 3, 5, 6); (3, 4, 5, 6)]
(where we have grayed out all boxes in columns 5, 6, 7, . . ., because the en-
tries in these boxes stay unchanged and are ignored by the Bender–Knuth
involution).
We shall now show that T ∗ ∈ X . Indeed, let us first check that the tableau T ∗
is semistandard. We know that the tableau T is semistandard, so that its restrictions
col<j T and col≥j T are semistandard; thus, βk (col<j T) is semistandard
as well (since the Bender–Knuth involution β k sends semistandard tableaux to
semistandard tableaux). Now, recall that the tableau T ∗ is obtained from T by
applying β k to columns
1, 2, . . . , j − 1 only; thus, T ∗ is obtained by glueing the
tableaux β k col< j T and col≥ j T together (along a vertical line). Hence:
• It is not hard to see that the entries of T ∗ increase weakly along each
row156 .
156 Proof. Let i ∈ [N]. We must prove that the entries of T∗ increase weakly along the i-th row of
T∗. Assume the contrary. Thus, there exist two adjacent entries in the i-th row of T ∗ that are
out of order (in the sense that the one lying further left is larger than the one lying further
right). In other words, there exists some positive integer u such that (i, u) ∈ Y (λ/µ) and
(i, u + 1) ∈ Y (λ/µ) and T ∗ (i, u) > T ∗ (i, u + 1). Consider this u.
Recall that T ∗ is obtained by glueing the tableaux β k col< j T and col≥ j T together (along
a vertical line). Thus, the i-th row of T ∗ is obtained by glueing the i-th row of β k col< j T
together with the i-th row of col≥ j T. In other words, this row consists of two blocks, looking
as follows:
[ i-th row of βk (col<j T) | i-th row of col≥j T ].
We shall refer to these two blocks as the left block and the right block (so the left block is the
i-th row of βk (col<j T), whereas the right block is the i-th row of col≥j T). The boundary
between the two blocks falls between the ( j − 1)-st and j-th columns; the left block covers
columns 1, 2, . . . , j − 1, while the right block covers columns j, j + 1, j + 2, . . ..
The entries of the left block increase
weakly from left to right (since this left block is
a row of the tableau β k col< j T , which is semistandard). Thus, if both boxes (i, u) and
(i, u + 1) belonged to the left block, then we would have T ∗ (i, u) ≤ T ∗ (i, u + 1), which
would contradict T ∗ (i, u) > T ∗ (i, u + 1). Hence, it is impossible for both boxes (i, u) and
(i, u + 1) to belong to the left block; thus, at least one of these boxes must belong to the
right block. Therefore, (i, u + 1) belongs to the right block (since the right block is further
right than the left block).
The entries of the right block also increase weakly from left to right (since this right
block is a row of the tableau col≥ j T, which is semistandard). Thus, if both boxes (i, u) and
(i, u + 1) belonged to the right block, then we would have T ∗ (i, u) ≤ T ∗ (i, u + 1), which
would contradict T ∗ (i, u) > T ∗ (i, u + 1). Hence, it is impossible for both boxes (i, u) and
(i, u + 1) to belong to the right block; thus, at least one of these boxes must belong to the left
block. Therefore, (i, u) belongs to the left block (since (i, u + 1) belongs to the right block).
Thus, the boxes (i, u) and (i, u + 1) straddle the boundary between the left block and the
right block. Since this boundary falls between the ( j − 1)-st and j-th columns, this entails
that the box (i, u) lies on the ( j − 1)-st column, while the box (i, u + 1) lies on the j-th
column. In other words, u = j − 1 and u + 1 = j. Thus, the inequality T ∗ (i, u) > T ∗ (i, u + 1)
can be rewritten as T ∗ (i, j − 1) > T ∗ (i, j). Moreover, from u = j − 1, we obtain j − 1 = u
and thus (i, j − 1) = (i, u) ∈ Y (λ/µ). Furthermore, from u + 1 = j, we obtain j = u + 1 and
thus (i, j) = (i, u + 1) ∈ Y (λ/µ).
The equality (295) shows that the entries of T ∗ in columns j, j + 1, j + 2, . . . equal the
corresponding entries of T. Thus, in particular, we have T ∗ (i, j) = T (i, j) (since the box
(i, j) lies in column j). Thus, T ∗ (i, j − 1) > T ∗ (i, j) = T (i, j).
On the other hand, the tableau T is semistandard, so that its entries increase weakly along
each row. Hence, T (i, j − 1) ≤ T (i, j). Therefore, T (i, j) ≥ T (i, j − 1), so that T ∗ (i, j − 1) >
T (i, j) ≥ T (i, j − 1).
We know that the number k does not appear in the j-th column of T; thus, T (i, j) ̸= k
(since T (i, j) is an entry in the j-th column of T).
We recall a simple property of the Bender–Knuth involution βk (which follows directly
from the construction of βk ): When we apply βk to a semistandard tableau,
– some k’s get replaced by (k + 1)’s,
– some (k + 1)’s get replaced by k’s, and
– all other entries remain unchanged.
Thus, in particular, when we apply β k to a semistandard tableau, the only entries that
can get replaced by larger entries are k’s, and in that case they can only be replaced by
(k + 1)’s. In other words, if some entry of a semistandard tableau gets replaced by a larger
entry when we apply β k to the tableau, then this entry must have been k before applying
β k , and must get replaced by k + 1 when β k is applied.
Since T∗ is obtained from T by applying βk to columns 1, 2, . . . , j − 1 (while all other
columns remain unchanged), we thus conclude that if some entry of T gets replaced by a
larger entry when we pass from T to T∗, then this entry must have been k in T, and must
get replaced by k + 1 in T∗. Let us restate this in a more formal language: If (p, q) ∈ Y (λ/µ)
satisfies T∗ (p, q) > T (p, q), then T (p, q) = k and T∗ (p, q) = k + 1.
We can apply this to ( p, q) = (i, j − 1) (since T ∗ (i, j − 1) > T (i, j − 1)), and thus conclude
that T (i, j − 1) = k and T ∗ (i, j − 1) = k + 1.
Now, from T (i, j − 1) = k, we obtain k = T (i, j − 1) ≤ T (i, j). On the other hand,
T ∗ (i, j − 1) > T (i, j), so that T (i, j) < T ∗ (i, j − 1) = k + 1. Since T (i, j) and k + 1 are
integers, this entails T (i, j) ≤ (k + 1) − 1 = k. Combining this with k ≤ T (i, j), we obtain
T (i, j) = k. This contradicts T (i, j) ̸= k. This contradiction shows that our assumption was
wrong. Hence, we have shown that the entries of T ∗ increase weakly along the i-th row of
T ∗ . Qed.
157 Indeed, a misstep of T was defined to be a k ∈ [N − 1] such that b_k < b_{k+1}, where
b = ν + cont (col≥j T). In other words, a misstep of T means a k ∈ [N − 1] such that
(ν + cont (col≥j T))_k < (ν + cont (col≥j T))_{k+1}.
We claim that this construction undoes the previous construction and recov-
ers T (so that ( T ∗ )∗ = T). To see this, we argue as follows:
Thus, f (f (T)) = f (T∗) = (T∗)∗ = T   (since f (T) = T∗).
Forget that we fixed T. We thus have proved that f ( f ( T )) = T for each
T ∈ X . As explained, this completes the proof of Observation 1.]
Next, we shall show two observations about the effect of the map f on the
sign of a tableau:
158 Here is the argument in some more detail:
We have
col≥ j ( T ∗ ) = col≥ j T, (296)
and therefore we also have
col≥p (T∗) = col≥p T for each integer p ≥ j   (297)
(since the tableau col≥p T is obtained by removing some columns from col≥j T, whereas the
tableau col≥p (T∗) is obtained in the same fashion from col≥j (T∗)).
We know that j is the largest violator of T. In other words, the N-tuple ν + cont (col≥j T)
is not an N-partition, but ν + cont (col≥p T) is an N-partition for any integer p > j. In view
of (296) and (297), we can rewrite this as follows: The N-tuple ν + cont (col≥j (T∗)) is not
an N-partition, but ν + cont (col≥p (T∗)) is an N-partition for any integer p > j. In other
words, j is the largest violator of T∗.
Now, we are going to show that the N-tuple γ is obtained from α by swapping
two entries. Once this is shown, we will easily conclude sign ( f ( T )) = − sign T
by applying Lemma 7.3.39 (b).
We recall the notations from the construction of T∗: Let j be the largest violator
of T. Let k be the smallest misstep of T. Define an N-tuple b ∈ N^N
by b := ν + cont (col≥j T). Then, b_k + 1 = b_{k+1} (as we have proved in (293)).
However,
b_k = (ν + cont (col≥j T))_k   (since b = ν + cont (col≥j T))
= ν_k + (cont (col≥j T))_k
= ν_k + (# of k’s in col≥j T)   (by Definition 7.3.26).   (298)
We shall now show that γk = αk+1 . Indeed, the vertical line that separates
the ( j − 1)-st and j-th columns cuts the tableau T into its two parts col< j T and
col≥ j T. Thus, every i ∈ [ N ] satisfies
(# of i’s in T) = (# of i’s in col<j T) + (# of i’s in col≥j T).   (300)
Similarly,
(# of i’s in T∗) = (# of i’s in col<j (T∗)) + (# of i’s in col≥j (T∗)).   (301)
γ_k = α_{k+1} . (303)
A similar argument (using (269) instead of (268)) can be used to show that
γ_{k+1} = α_k . (304)
γ = ν + cont (T∗) + ρ = ν + cont T + ρ = α   (since cont (T∗) = cont T).
Hence, γk = αk . However, the equality (303) (which we have shown in the proof
of Observation 2) yields γk = αk+1 . Comparing these two equalities, we obtain
αk = αk+1 . Therefore, the N-tuple α ∈ N N has two equal entries (namely, its
k-th and its (k + 1)-st entry). Thus, Lemma 7.3.39 (a) yields aα = 0.
However, the definition of sign T yields sign T = a_{ν+cont T+ρ} = a_α = 0
(since ν + cont T + ρ = α).
∑_{I∈A} sign I = ∑_{I∈A\X} sign I.
In other words,
∑_{T∈A} sign T = ∑_{T∈A\X} sign T.
= ∑ aν+cont T +ρ
T is a ν-Yamanouchi
semistandard tableau
of shape λ/µ
Remark 7.3.42. All the above properties of skew Schur polynomials sλ/µ can
be generalized further by taking an arbitrary M ∈ N and allowing λ and
µ to be M-partitions (rather than N-partitions). Thus, the Young diagram
Y (λ/µ) is now defined to be the set {(i, j) | i ∈ [ M] and j ∈ [λi ] \ [µi ]}; in
particular, it may have more than N rows (if M > N). In this generalized
setup, the tableaux of shape λ/µ are defined just as they were in Definition
7.3.15 (in particular, they may have more than N rows, but their entries still
have to be elements of [ N ]); the same applies to the notions of semistandard
tableaux and the skew Schur polynomials (which are still polynomials in
P = K [ x1 , x2 , . . . , x N ]). All of our above results (particularly, Theorem 7.3.21
and Theorem 7.3.32) still hold in this generalized setup (note that the ν in
Theorem 7.3.32 must still be an N-partition, not an M-partition), and the
proofs given above still work.
160 because
Y (λ/µ) = [diagram].
From this picture, it is clear that this skew partition λ/µ is a horizontal strip
(and, in fact, a horizontal 6-strip, since |Y (λ/µ)| = 6), but not a vertical strip
(since, e.g., there are 3 boxes in the second row of Y (λ/µ)).
(b) If λ = (3, 3, 2, 1) and µ = (2, 2, 1, 0), then we have µ ⊆ λ, and the Young
diagram Y (λ/µ) looks as follows:
Y (λ/µ) = [diagram].
From this picture, it is clear that this skew partition λ/µ is a vertical strip
(and, in fact, a vertical 4-strip, since |Y (λ/µ)| = 4), but not a horizontal strip
(since there are 2 boxes in the third column of Y (λ/µ)).
(c) If λ = (4, 3, 1, 1) and µ = (3, 2, 1, 0), then we have µ ⊆ λ, and the Young
diagram Y (λ/µ) looks as follows:
Y (λ/µ) = [diagram].
From this picture, it is clear that this skew partition λ/µ is both a horizontal
strip (and, in fact, a horizontal 3-strip) and a vertical strip (and, in fact, a
vertical 3-strip).
(d) If λ = (3, 3, 2, 1) and µ = (1, 1, 1, 1), then we have µ ⊆ λ, and the Young
diagram Y (λ/µ) looks as follows:
Y (λ/µ) = [diagram].
From this picture, it is clear that this skew partition λ/µ is neither a horizon-
tal strip nor a vertical strip.
Horizontal and vertical strips can also be characterized in terms of the entries
of the partitions:
Proposition 7.3.45. Let λ = (λ1 , λ2 , . . . , λ N ) and µ = (µ1 , µ2 , . . . , µ N ) be two
N-partitions.
(a) The skew partition λ/µ is a horizontal strip if and only if we have
λ1 ≥ µ1 ≥ λ2 ≥ µ2 ≥ · · · ≥ λ N ≥ µ N .
(b) The skew partition λ/µ is a vertical strip if and only if we have
µi ≤ λi ≤ µi + 1 for each i ∈ [ N ] .
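Both criteria of Proposition 7.3.45 are finite checks and translate directly into code. Here is a Python sketch (the function names are ours), tested against Examples (b) and (c) above:

```python
# Sketch of the criteria in Proposition 7.3.45.

def is_horizontal_strip(lam, mu):
    """lambda/mu is a horizontal strip iff lam1 >= mu1 >= lam2 >= mu2 >= ..."""
    interleaved = [v for pair in zip(lam, mu) for v in pair]
    return all(interleaved[i] >= interleaved[i + 1]
               for i in range(len(interleaved) - 1))

def is_vertical_strip(lam, mu):
    """lambda/mu is a vertical strip iff mu_i <= lam_i <= mu_i + 1 for all i."""
    return all(m <= l <= m + 1 for l, m in zip(lam, mu))

print(is_horizontal_strip((3, 3, 2, 1), (2, 2, 1, 0)))  # False (Example (b))
print(is_vertical_strip((3, 3, 2, 1), (2, 2, 1, 0)))    # True  (Example (b))
print(is_horizontal_strip((4, 3, 1, 1), (3, 2, 1, 0)))  # True  (Example (c))
print(is_vertical_strip((4, 3, 1, 1), (3, 2, 1, 0)))    # True  (Example (c))
```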
(b) We have
en sµ = ∑ sλ .
λ is an N-partition;
λ/µ is a vertical n-strip
since the N-partitions λ for which λ/µ is a horizontal 2-strip are precisely
the four N-partitions (2, 2, 1, 1), (3, 1, 1, 1), (3, 2, 1, 0) and (4, 1, 1, 0). Here
are the Young diagrams Y (λ) of these four N-partitions λ (with the Y (µ)
subdiagram colored red each time):
[four Young diagrams]
since the N-partitions λ for which λ/µ is a vertical 2-strip are precisely the
four N-partitions (2, 2, 1, 1), (2, 2, 2, 0), (3, 1, 1, 1) and (3, 2, 1, 0). Here are the
Young diagrams Y (λ) of these four N-partitions λ (with the Y (µ) subdia-
gram colored red each time):
[four Young diagrams]
The proof of Theorem 7.3.48 is not too hard using what we have learnt about
lattice paths in Section 6.5. Here is an outline (with some details left to exer-
cises):
Proof of Theorem 7.3.48 (sketched). Let us follow Convention 6.5.1, Definition 6.5.2
and Definition 6.5.5. We will work with the digraph Z2 . For each arc a of the
digraph Z2 , we define an element w ( a) ∈ P (called the weight of a) as follows:
w ( p) := ∏ w ( a) .
a is an arc of p
Now, it is not hard to see the following (compare with Proposition 6.5.4):
∑ w ( p ) = hc− a .
p is a path
from ( a,1) to (c,N )
It is easy to see that the conditions (253), (254), (255) and (256) of Corollary
6.5.15 are satisfied. Hence, (257) yields
det ( ∑_{p: Ai → Bj} w (p) )_{1≤i≤k, 1≤j≤k} = ∑_{p is a nipat from A to B} w (p) ,   (306)
det ( ∑_{p: Ai → Bj} w (p) )_{1≤i≤k, 1≤j≤k}
= det ( h_{(λ_j − j) − (µ_i − i)} )_{1≤i≤k, 1≤j≤k}   (by Observation 1)
= det ( h_{(λ_i − i) − (µ_j − j)} )_{1≤i≤k, 1≤j≤k}   (by Theorem 6.4.10)
= det ( h_{λ_i − µ_j − i + j} )_{1≤i≤k, 1≤j≤k}
= det ( h_{λ_i − µ_j − i + j} )_{1≤i≤M, 1≤j≤M}   (307)
(since k = M). We shall now analyze the right hand side of (306). To that
purpose, we need to understand the nipats from A to B.
We define the height of an east-step (i, j) → (i + 1, j) to be the number j. We
define the height sequence of a path p to be the sequence of the heights of the
east-steps of p (going from the starting point to the ending point of p). For
example, the path shown in Example 6.5.3 has height sequence (1, 1, 1, 2, 3). It
is clear that the height sequence of a path is always weakly increasing.
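In code, the height sequence is just the list of y-coordinates at which a path takes its east-steps. A Python sketch (the encoding of a path as its list of lattice points is our own):

```python
# Sketch: the height sequence of a lattice path given as a list of points.
# Steps are east-steps (x, y) -> (x+1, y) or north-steps (x, y) -> (x, y+1).

def height_sequence(points):
    """Return the heights (y-coordinates) of the east-steps, in order."""
    heights = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if (x1, y1) == (x0 + 1, y0):      # east-step at height y0
            heights.append(y0)
    return heights

# A path from (0, 1) to (3, 3) with east-steps at heights 1, 2, 2:
p = [(0, 1), (1, 1), (1, 2), (2, 2), (3, 2), (3, 3)]
print(height_sequence(p))   # [1, 2, 2], weakly increasing as noted above
```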
If p = ( p1 , p2 , . . . , pk ) is a nipat from A to B, we let T (p) be the tableau of
shape Y (λ/µ) such that the entries in the i-th row of T (p) (for each i ∈ [k]) are
the entries of the height sequence of pi .
Example 7.3.50. Let N = 6 and M = 3 (so that k = M = 3) and λ = (4, 2, 1)
and µ = (1, 0, 0). Here is a nipat p = (p1, p2, p3) from A to B [picture of the
three paths], and the corresponding tableau T (p), whose rows are (1, 2, 5),
(2, 5) and (4).
We could have defined the tableau T (p) just as easily for any path tuple p
from A to B (not just for a nipat); however, the case of a nipat is particularly
useful, because it turns out that the tableau T (p) is semistandard if and only if
p is a nipat. Moreover, the following stronger statement holds:
Observation 2: There is a bijection
{nipats from A to B} → SSYT (λ/µ) ,
p 7→ T (p) .
∑_{p is a nipat from A to B} w (p) = ∑_{p is a nipat from A to B} x^{T(p)}   (since w (p) = x^{T(p)})
= ∑_{T∈SSYT(λ/µ)} x^T
(here, we have substituted T for T (p) in the sum, since the map in
Observation 2 is a bijection).
Thus,
Our above proof of Theorem 7.3.48 is essentially taken from [Stanle23, First
proof of Theorem 7.16.1]; other proofs can be found in [GriRei20, Exercise
2.7.13] (see also [GriRei20, paragraph after Theorem 2.4.6] for several refer-
ences).
The second Jacobi–Trudi formula involves elementary symmetric polynomials
en (instead of hn ) and transpose partitions (as in Exercise A.3.1.1):
A. Homework exercises
What follows is a collection of problems (of varying difficulty) that are meant
to illuminate, expand upon and otherwise complement the above text.
The numbers in the squares (like 3 ) are the experience points you gain for
solving the problems. They are a mix of difficulty rating and relevance score:
the harder or more important the problem, the larger the number in the
square. I believe a 5 represents a good graduate-level homework problem
that requires thinking and work. A 3 usually requires some thinking or work.
A 1 is a warm-up question. A 7 should be somewhat too hard for regular
homework. Anything above 10 is not really meant as homework, but I’d be
excited to hear your ideas. Multi-part exercises sometimes have points split
between the parts – i.e., if parts (b) and (c) of an exercise are solved using the
same idea, then they may both be assigned 3 points even if each on its own
would be a 5 .
In solving an exercise, you can freely use (without proof) the claims of all
exercises above it.
Your goal (for an A grade in the 2024 iteration of Math 531) is to gain at least
20 experience points from each of the Chapters 3–7 (counting Chapter 2 as part
of Chapter 3).
The next exercise is concerned with the notion of lacunar sets. This notion
appears all over combinatorics (particularly in connection with Fibonacci num-
bers), so chances are we will meet it again.
For example, the set {1, 5, 7} is lacunar, but {1, 5, 6} is not. Any 1-element
subset of Z is lacunar, and so is the empty set.
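Consistently with these examples, a set S of integers is lacunar when it contains no two consecutive integers (that is, s + 1 ∉ S for each s ∈ S); this reading is inferred from the examples, since the formal definition is not restated here. A Python sketch of the test:

```python
# Sketch: a set of integers is lacunar if it contains no two consecutive
# integers (any two distinct elements differ by at least 2).

def is_lacunar(s):
    elems = sorted(s)
    return all(b - a >= 2 for a, b in zip(elems, elems[1:]))

print(is_lacunar({1, 5, 7}))  # True
print(is_lacunar({1, 5, 6}))  # False
print(is_lacunar(set()))      # True: the empty set is lacunar
```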
Some people say “sparse” instead of “lacunar”, but the word “sparse” also
has other meanings.
From now on, we shall use the so-called Iverson bracket notation:
\binom{n}{a} \binom{a}{b} = \binom{n}{b} \binom{n-b}{a-b}   for any n, a, b ∈ C.
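For nonnegative integer arguments, this identity (often called trinomial revision) is easy to spot-check by machine; such a check of course says nothing about general complex n, a, b, for which a polynomial-identity argument is needed. A Python sketch:

```python
# Sketch: spot-checking C(n,a)*C(a,b) = C(n,b)*C(n-b,a-b) for small
# nonnegative integers.
from math import comb

for n in range(8):
    for a in range(n + 1):
        for b in range(a + 1):
            assert comb(n, a) * comb(a, b) == comb(n, b) * comb(n - b, a - b)
print("all checks passed")
```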
(b) 3 Let N ∈ N.
For each c ∈ C, let Lc ∈ C^{N×N} be the N × N-matrix whose rows are indexed
0, 1, . . . , N − 1 and whose columns are indexed 0, 1, . . . , N − 1, and
whose (i, j)-th entry is \binom{i}{j} c^{i−j} for each i, j ∈ {0, 1, . . . , N − 1}. (The expression
“\binom{i}{j} c^{i−j}” should be understood as 0 if i < j, even if c itself is 0.)
j
1 0 0 0 0
1c 1 0 0 0
[For example, if N = 5, then Lc = 1c2 2c 1 0 0 .]
3
1c 3c2 3c 1 0
1c4 4c3 6c2 4c 1
Prove that
Lc Ld = Lc+d for any c, d ∈ C.
(c) 1 Prove that the matrices L1 and L−1 are mutually inverse.
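The identity Lc Ld = Lc+d of part (b) can be sanity-checked numerically before looking for a proof. A Python sketch (matrices as plain lists of lists; integer values of c and d only):

```python
# Sketch: checking L_c L_d = L_{c+d} numerically for integer c, d.
from math import comb

def L(c, N):
    """The N x N lower-triangular matrix with (i, j) entry C(i, j) * c^(i-j)."""
    return [[comb(i, j) * c ** (i - j) if i >= j else 0 for j in range(N)]
            for i in range(N)]

def matmul(A, B):
    N = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

N = 5
print(matmul(L(2, N), L(3, N)) == L(5, N))   # True
print(matmul(L(1, N), L(-1, N)) == L(0, N))  # True: L_1 and L_{-1} are inverse
```

Note that L(0, N) is the identity matrix, which is consistent with part (c).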
(c) 3 Prove the claim of part (b) still holds if we replace “a, b ∈ N” by
“a, b ∈ Z”.
[Hint: The following suggests a combinatorial solution (algebraic solutions
also exist).
Consider the cyclic group C p = Z/pZ with p elements.
For part (a), let U be the set of all p-tuples of elements of {0, 1} with the
property that exactly k entries of the p-tuple are 1. The group C p acts on U
by cyclic rotation. Argue that each orbit of this action has size divisible by p.
For part (b), let W be the set of all p × a-matrices with entries in {0, 1}
and having the property that the sum of all entries of the matrix is bp
(that is, exactly bp entries are 1). Construct an action of the group
C_p^a = C_p × C_p × · · · × C_p (with a factors) on W in which the k-th C_p factor
cyclically rotates the entries of the k-th row of the matrix. Argue that all but
\binom{a}{b} of the orbits of this action have size divisible by p^2, and conclude
by writing |W| as the sum of the sizes of the orbits.
For part (c), fix b ∈ N and p (why is it enough to consider b ∈ N?), and show
that the remainders of \binom{ap}{bp} − \binom{a}{b} modulo p^2 are periodic as a
function in a.]
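The congruence targeted in part (b), namely that \binom{ap}{bp} ≡ \binom{a}{b} modulo p², is easy to spot-check for small values before attempting the proof. A Python sketch:

```python
# Sketch: spot-checking that C(ap, bp) is congruent to C(a, b) modulo p^2
# for a few small primes p and small a, b.
from math import comb

for p in (2, 3, 5, 7):
    for a in range(6):
        for b in range(a + 1):
            assert (comb(a * p, b * p) - comb(a, b)) % (p * p) == 0
print("all congruences hold")
```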
A.2.1. Examples
All the properties of generating functions that have been used without proof in
Section 3.1 can also be used in the following exercises.
Math 701 Spring 2021, version April 6, 2024 page 503
Exercise A.2.1.4. 5 Find and prove an explicit formula for the coefficient of x^n in the formal power series \frac{1}{1 − x − x^2 + x^3}.
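Before hunting for the closed form, it helps to expand the series numerically. Since (1 − x − x^2 + x^3) · F = 1, the coefficients c_n of F obey c_n − c_{n−1} − c_{n−2} + c_{n−3} = [n = 0]; a quick Python sketch (function name mine):

```python
def coeffs(num_terms):
    # c_n of 1/(1 - x - x^2 + x^3):  c_n = c_{n-1} + c_{n-2} - c_{n-3} + [n = 0]
    c = []
    for n in range(num_terms):
        val = 1 if n == 0 else 0
        if n >= 1: val += c[n - 1]
        if n >= 2: val += c[n - 2]
        if n >= 3: val -= c[n - 3]
        c.append(val)
    return c

print(coeffs(10))  # [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
```

The pattern suggests what the closed form should look like; the exercise, of course, asks for a proof.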
equally many 1’s and −1’s), and that has the additional property that for
each k, we have
(# of −1 ’s among its first k entries)
≤ (# of 1’s among its first k entries) .
A Motzkin path of length n is a path from the point (0, 0) to the point (n, 0)
in the Cartesian plane that moves only using “NE-steps” (i.e., steps of the
form ( x, y) → ( x + 1, y + 1)), “SE-steps” (i.e., steps of the form ( x, y) →
( x + 1, y − 1)) and “E-steps” (i.e., steps of the form ( x, y) → ( x + 1, y)) and
never falls below the x-axis (i.e., does not contain any point ( x, y) with y < 0).
For example, here is a Motzkin path from (0, 0) to (6, 0): (figure omitted).
For each n ∈ N, we define the Motzkin number mn by
mn := (# of Motzkin paths (0, 0) → (n, 0)) .
Here is a table of the first 12 Motzkin numbers m_n:

n    | 0  1  2  3  4  5   6   7    8    9    10    11
m_n  | 1  1  2  4  9  21  51  127  323  835  2188  5798
(a) 1 Prove that mn is also the # of Motzkin words of length n for each
n ∈ N.
(b) 2 Let cn be the n-th Catalan number (defined in Section 3.1) for each
n ∈ N. Prove that
m_n = \sum_{k=0}^{n} \binom{n}{2k} c_k   for each n ∈ N.

(The sum here could just as well range from k = 0 to ⌊n/2⌋ or range over all k ∈ N, since \binom{n}{2k} = 0 when 2k > n.)
(c) 2 Prove that

m_n = \frac{1}{n+1} \sum_{k=0}^{n} \binom{n+1}{k+1} \binom{n-k}{k} = \sum_{k=0}^{n} \frac{1}{k+1} \binom{n}{k} \binom{n-k}{k}

for each n ∈ N.
[Despite the analogy between the Motzkin numbers mn and the Catalan
numbers cn , there is no formula for mn as simple as (14) or (15).]
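For experimentation, the Motzkin numbers can be generated from the first-return recurrence m_{n+1} = m_n + \sum_{k=0}^{n−1} m_k m_{n−1−k} (this recurrence is not stated in the exercise; treat it as an assumption being tested). The sketch below reproduces the table above and checks the formula of part (b):

```python
from math import comb

def motzkin(count):
    # first-return recurrence: a Motzkin path either starts with an E-step,
    # or starts NE ... SE wrapped around a shorter Motzkin path
    m = [1]
    for n in range(count - 1):
        m.append(m[n] + sum(m[k] * m[n - 1 - k] for k in range(n)))
    return m

m = motzkin(12)
assert m == [1, 1, 2, 4, 9, 21, 51, 127, 323, 835, 2188, 5798]

def catalan(k):
    return comb(2 * k, k) // (k + 1)

for n in range(12):   # part (b): m_n = sum_k binom(n, 2k) c_k
    assert m[n] == sum(comb(n, 2 * k) * catalan(k) for k in range(n // 2 + 1))
```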
A.2.2. Definitions
(In particular, show that the sum on the left hand side is well-defined.)
(b) 3 For each positive integer n, let ν_2(n) be the highest k ∈ N such that 2^k | n. (Equivalently, ν_2(n) is the exponent with which 2 appears in the prime factorization of n; when n is odd, this is understood to be 0. For example, ν_2(40) = 3 and ν_2(41) = 0.)
Prove that

\sum_{n ∈ N} \frac{x^{2^n}}{1 − x^{2^n}} = \sum_{n > 0} (ν_2(n) + 1) x^n.
(In particular, show that the sum on the left hand side is well-defined.)
A.2.4. Polynomials
The following exercise is a generalization of Binet’s formula for the Fibonacci
sequence (Example 1 in Section 3.1):
a_n = \sum_{i=1}^{d} p_i a_{n−i}.
p = (1 − r_1 x)(1 − r_2 x) · · · (1 − r_d x)
The next exercise reveals an application of FPSs to number theory (more such
applications will appear later on):
Exercise A.2.4.2. Let p and q be two coprime positive integers. We define the
set
S ( p, q) := { ap + bq | ( a, b) ∈ N × N} .
(For example, if p = 3 and q = 5, then S ( p, q) = {0, 3, 5, 6, 8, 9, 10, 11, . . .},
where the “. . .” is saying that all integers ≥ 8 belong to S ( p, q). The set
S ( p, q) can be viewed as the set of all denominations that can be paid with
p-cent coins and q-cent coins, without getting change.)
(a) 3 Prove that

\sum_{n ∈ S(p,q)} x^n = \frac{1 − x^{pq}}{(1 − x^p)(1 − x^q)}.
(1 − x^p)(1 − x^q) \sum_{n ∈ S(p,q)} x^n = \sum_{n ∈ S(p,q)} (x^n − x^{n+p} − x^{n+q} + x^{n+p+q})
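Part (a) is easy to test numerically by comparing power series modulo x^N; the sketch below does this for the example p = 3, q = 5 (all helper names are mine):

```python
p, q, N = 3, 5, 60   # truncation order N, comfortably above pq

reachable = {a * p + b * q for a in range(N) for b in range(N)}
lhs = [1 if n in reachable else 0 for n in range(N)]   # sum over n in S(p, q)

def geom(step):
    # 1/(1 - x^step) modulo x^N
    return [1 if n % step == 0 else 0 for n in range(N)]

def mul(f, g):
    # product of two truncated power series modulo x^N
    h = [0] * N
    for i, fi in enumerate(f):
        if fi:
            for j in range(N - i):
                h[i + j] += fi * g[j]
    return h

num = [0] * N
num[0], num[p * q] = 1, -1            # the numerator 1 - x^{pq}
rhs = mul(num, mul(geom(p), geom(q)))
assert lhs == rhs
```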
Exercise A.2.4.3. Let N ∈ N. Let PN denote the C-vector space of all polyno-
mials f ∈ C [ x ] of degree < N. Consider the matrices Lc for all c ∈ C defined
in Exercise A.1.1.4 (b).
For each c ∈ C, let B_c be the basis ((x − c)^0, (x − c)^1, . . . , (x − c)^{N−1}) of P_N. (This is a basis, since it is the image of the monomial basis (x^0, x^1, . . . , x^{N−1}) under the “substitute x − c for x” automorphism.)
(b) 2 Use this to give a new solution to Exercise A.1.1.4 (b) (without using
Exercise A.1.1.4 (a)).
and a ◦ b = x and b ◦ a = x.
Prove the following:
(a) 1 If a compositional inverse of a exists, then it is unique.
(b) 4 A compositional inverse of a exists if and only if [x^1] a is invertible in K.
A.2.6. Derivatives of FPSs
Q_m := \sum_{n ∈ N} n^m x^n = 0^m x^0 + 1^m x^1 + 2^m x^2 + · · · ∈ Z[[x]].
For example,
Q_0 = x^0 + x^1 + x^2 + x^3 + · · · = \frac{1}{1 − x};

Q_1 = 0x^0 + 1x^1 + 2x^2 + 3x^3 + · · · = \frac{x}{(1 − x)^2}   (by (18));

it can furthermore be shown that

Q_2 = 0x^0 + 1x^1 + 4x^2 + 9x^3 + · · · = \frac{x(1 + x)}{(1 − x)^3};

Q_3 = 0x^0 + 1x^1 + 8x^2 + 27x^3 + · · · = \frac{x(1 + 4x + x^2)}{(1 − x)^4};

Q_4 = 0x^0 + 1x^1 + 16x^2 + 81x^3 + · · · = \frac{x(1 + 11x + 11x^2 + x^3)}{(1 − x)^5}.
The expressions become more complicated as m increases, but one will still notice that each Q_m has the form \frac{A_m}{(1 − x)^{m+1}}, where A_m is a polynomial of degree m that has constant term 0 (unless m = 0) and whose coefficients have a “palindromic” symmetry (in the sense that the sequence of coefficients is symmetric across its middle). Let us prove this.

For each m ∈ N, we define an FPS

A_m := (1 − x)^{m+1} Q_m ∈ Z[[x]].

(Thus, Q_m = \frac{A_m}{(1 − x)^{m+1}}, so that the A_m we just defined are the A_m we are interested in – but we don’t yet know that they are polynomials.)
Let ϑ : Z [[ x ]] → Z [[ x ]] be the Z-linear map that sends each FPS f ∈ Z [[ x ]]
to x f ′ . (That is, ϑ takes the derivative of an FPS and then multiplies it by x.)
(a) 1 Prove that ϑ ( f g) = ϑ ( f ) · g + f · ϑ ( g) for any f , g ∈ Z [[ x ]]. (In the
lingo of algebraists, this is saying that ϑ is a derivation of Z [[ x ]].)
(b) 1 Prove that ϑ((1 − x)^k) = −kx(1 − x)^{k−1} for each k ∈ Z.
i ∈ {0, 1, . . . , m + 1}.
The polynomials A0 , A1 , A2 , . . . are known as the Eulerian polynomials.
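Since the coefficient of x^k in A_m depends only on finitely many coefficients of Q_m, the definition A_m = (1 − x)^{m+1} Q_m can be evaluated exactly; the following sketch (an experiment, not a proof) recomputes the examples above and tests the palindromic symmetry for small m:

```python
from math import comb

def eulerian(m, extra=6):
    # A_m = (1 - x)^{m+1} Q_m, computed exactly modulo x^{m+1+extra}
    D = m + 1 + extra
    Q = [n ** m for n in range(D)]                          # Q_m = sum_n n^m x^n
    P = [(-1) ** j * comb(m + 1, j) for j in range(m + 2)]  # (1 - x)^{m+1}
    A = [sum(P[j] * Q[k - j] for j in range(min(k, m + 1) + 1))
         for k in range(D)]
    assert all(c == 0 for c in A[m + 1:])  # A_m is a polynomial of degree <= m
    return A[: m + 1]

assert eulerian(2) == [0, 1, 1]          # A_2 = x + x^2
assert eulerian(3) == [0, 1, 4, 1]       # A_3 = x + 4x^2 + x^3
assert eulerian(4) == [0, 1, 11, 11, 1]  # A_4 = x + 11x^2 + 11x^3 + x^4
for m in range(1, 8):
    a = eulerian(m)[1:]                  # palindromic symmetry of coefficients
    assert a == a[::-1]
```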
Exercise A.2.6.3. For any nonzero FPS f ∈ K[[x]], define the order ord(f) of f to be the smallest m ∈ N such that [x^m] f ≠ 0. Further define the norm ||f|| of an FPS f ∈ K[[x]] to be the rational number \frac{1}{2^{ord(f)}} if f is nonzero. If f is zero, set ||f|| := 0.
This norm on K[[x]] gives rise to a metric d : K[[x]] × K[[x]] → Q on K[[x]], defined by d(f, g) := ||f − g||. Prove that the maps

K[[x]] × K[[x]] → K[[x]],   (f, g) ↦ f + g

and

K[[x]] × K[[x]] → K[[x]],   (f, g) ↦ f g

and

K[[x]] × K[[x]]_0 → K[[x]],   (f, g) ↦ f ◦ g

are continuous with respect to the topologies induced by this metric. (Recall that K[[x]]_0 denotes the subset of K[[x]] consisting of all FPSs g ∈ K[[x]] satisfying [x^0] g = 0. This subset becomes a topological space by inheriting a subspace topology from K[[x]].)
K[[x]] → K[[x]],   f ↦ f′

d_sup((f_1, g_1), (f_2, g_2)) = max{d(f_1, f_2), d(g_1, g_2)}.

Show that all three maps are Lipschitz continuous with Lipschitz constant 1 – i.e., that any (f_1, g_1) and (f_2, g_2) in the respective product spaces satisfy

d(f_1 + g_1, f_2 + g_2) ≤ d_sup((f_1, g_1), (f_2, g_2))   and
d(f_1 g_1, f_2 g_2) ≤ d_sup((f_1, g_1), (f_2, g_2))   and
d(f_1 ◦ g_1, f_2 ◦ g_2) ≤ d_sup((f_1, g_1), (f_2, g_2)).
[n] that have k blocks. For example, S(4, 3) = 6, since the set partitions of [4] that have 3 blocks are

{{1, 2}, {3}, {4}},  {{1, 3}, {2}, {4}},  {{1, 4}, {2}, {3}},  {{2, 3}, {1}, {4}},  {{2, 4}, {1}, {3}},  {{3, 4}, {1}, {2}}.
(The first and the last equalities here hold for n = 0 as well. However,
S (0, 0) = 1 and S (0, 1) = 0.)
(b) 3 Show that every n ∈ N satisfies

x^n = \sum_{k=0}^{n} S(n, k) x^{\underline{k}}   in K[x],

where x^{\underline{k}} denotes the polynomial \prod_{i=0}^{k−1} (x − i) = x(x − 1)(x − 2) · · · (x − k + 1).
(c) 1 Show that every n ∈ N satisfies

\sum_{k=0}^{n} (−1)^k k! · S(n, k) = (−1)^n.
[Hint: For part (d), take the derivative of part (b) and evaluate at x = 0.]
Exercise A.2.7.1. 1 For this exercise, let K be any commutative ring (not
necessarily a Q-algebra). Let f ∈ K [[ x ]]1 and g ∈ K [[ x ]]0 be two FPSs. Note
that f ◦ g ∈ K [[ x ]]1 (by Lemma 3.7.7 (b)), so that loder ( f ◦ g) is well-defined.
Prove that
loder ( f ◦ g) = ((loder f ) ◦ g) · g′ .
Exercise A.2.7.2. Recall the Stirling numbers of the 2nd kind S (n, k ) defined
in Exercise A.2.6.5.
(a) 2 Show that all positive integers n and k satisfy
S (n, k ) = k · S (n − 1, k ) + S (n − 1, k − 1) .
[Hint: For part (b), denote the left hand side by f_k, and show that f_k′ = k f_k + f_{k−1} for each k ≥ 0.]
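The recurrence in part (a) makes the numbers S(n, k) easy to tabulate, which in turn lets one sanity-check the identities of Exercise A.2.6.5; a small Python sketch:

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def S(n, k):
    # Stirling numbers of the 2nd kind via S(n, k) = k S(n-1, k) + S(n-1, k-1)
    if n == 0:
        return 1 if k == 0 else 0
    if k <= 0:
        return 0
    return k * S(n - 1, k) + S(n - 1, k - 1)

assert S(4, 3) == 6   # the example from Exercise A.2.6.5
# Exercise A.2.6.5 (c): the alternating sum equals (-1)^n
for n in range(10):
    assert sum((-1) ** k * factorial(k) * S(n, k) for k in range(n + 1)) == (-1) ** n
```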
f_n = a x^m + \sum_{i > m} a_i x^i
(b) Find an explicit formula for the n-th entry an of this sequence (in terms
of binomial coefficients).
\sum_{(a_1, a_2, . . ., a_n) ∈ N^n; a_1 + a_2 + · · · + a_n = m} a_1 a_2 · · · a_n = \binom{n + m − 1}{2n − 1}.
(Note that Exercise A.2.9.2 is behind many appearances of the Fibonacci num-
bers in the research literature, e.g., in the theory of “peak algebras”.)
It is surprising that one and the same sequence (the Fibonacci sequence)
answers the three different counting questions in Exercise A.2.9.2. Even more
surprisingly, this generalizes:
Exercise A.2.9.3. 4 Let n and k be two positive integers such that k > 1.
Let u be the # of compositions α of n such that each entry of α is congruent
to 1 modulo k.
Let v be the # of compositions β of n + k − 1 such that each entry of β is
≥ k.
Let w be the # of compositions γ of n − 1 such that each entry of γ is either
1 or k.
Prove that u = v = w.
A.2.10. x^n-equivalence
Next come some more exercises on the technicalities of multipliability and in-
finite products. The first one is a (partial) converse to Theorem 3.11.10:
Exercise A.2.11.3. 5 Let (ai )i∈ I be a multipliable family of FPSs such that
each ai is invertible (in K [[ x ]]). Prove that the family (ai − 1)i∈ I is summable.
Exercise A.2.11.4. Let (ai )i∈ I and (bi )i∈ I be two families of FPSs.
(a) 1 If (ai )i∈ I and (bi )i∈ I are multipliable, is it necessarily true that the
family (ai + bi )i∈ I is multipliable?
(b) 1 If (ai )i∈ I is summable and (bi )i∈ I is multipliable, is it necessarily
true that the family (ai + bi )i∈ I is multipliable?
(c) 1 Does the answer to part (b) change if we additionally assume that
bi is invertible for each i ∈ I ?
For any i ∈ I and any k ∈ S_i, let p_{i,k} be an element of K[[x]]. Assume that the family (p_{i,k})_{(i,k) ∈ S} is summable. Then, the product \prod_{i ∈ I} \sum_{k ∈ S_i} p_{i,k} is well-defined (i.e., the family (p_{i,k})_{k ∈ S_i} is summable for each i ∈ I, and the family (\sum_{k ∈ S_i} p_{i,k})_{i ∈ I} is multipliable), and we have
Exercise A.2.12.1. Recall the Motzkin paths and the Motzkin numbers mn ,
defined in Exercise A.2.1.6.
m_n = −\frac{1}{2} \sum_{k=0}^{n+2} (−3)^{n+2−k} \binom{1/2}{k} \binom{1/2}{n+2−k}   for each n ∈ N.
and showing that F = x + x^2 + \frac{x^2}{1 − x^2} + 2 · \frac{x^2}{1 − x}.
(b) 3 Use this to find an explicit formula for dn,4 (as defined in Definition
3.12.11 (e)) that uses only quadratic irrationalities. (Note that the formula
will be rather intricate and contain nested square roots.)
Prove that the shape Rn,m has a k-omino tiling if and only if we have k | n
or k | m.
[Hint: Consider R_{n,m} as a weighted set, where the weight of a square (i, j) ∈ N^2 is defined to be i + j. If R_{n,m} has a k-omino tiling, then show that the weight generating function \sum_{(i,j) ∈ R_{n,m}} x^{i+j} must be divisible by 1 + x + x^2 + · · · + x^{k−1} (as a polynomial in Q[x], for example). However, this generating function has a simple form.]
lim_{i→∞} \widetilde{f_i} = \widetilde{f},
Exercise A.2.13.2. (a) 2 Let f ∈ K[[x]] be any FPS whose constant term [x^0] f is nilpotent. (An element u ∈ K is said to be nilpotent if there exists some m ∈ N such that u^m = 0.) Prove that lim_{i→∞} f^i = 0.
(This requires checking that the n-layered finite continued fractions are
well-defined and converge to a limit in K [[ x ]].)
(a) 2 We have

\sum_{i ∈ N} (a_i − a_{i+1}) = a_0 − lim_{n→∞} a_n
by
a = 1 + x^{−1} + x^{−2} + x^{−3} + · · · ,
b = 1 − x,
c = 1 + x + x^2 + x^3 + · · · .
[Hint: Compute the coefficients of the Laurent polynomial (x + x^{−1} − 2)^n in two ways.]
the coefficient [x^{ord f}] f (that is, the x^{ord f}-coefficient of f).
(To be fully precise, “ord f ” here means the element (ord f ) · 1K of the ring
K.)
[Hint: In parts of this exercise, it may be expedient to first prove the claim
under the assumption that K is a Q-algebra (so that 1, 2, 3, . . . can be divided
by in K), and then to argue that the assumption can be lifted.]
The following exercise tells a cautionary tale about applying some of our
results past their stated assumptions:
\sum_{n ∈ N} \binom{n}{k} x^n = \frac{x^k}{(1 − x)^{k+1}}
\sum_{n ∈ N} \frac{x^n}{1 − yq^n} = \sum_{n ∈ N} \frac{y^n}{1 − xq^n}   in the ring K[[x, y, q]].
The next two exercises are concerned with FPSs in two indeterminates x and y.
Exercise A.2.15.5. 1 Recall the Stirling numbers of the 2nd kind S (n, k )
studied in Exercise A.2.6.5 and Exercise A.2.7.2. Show that in Q [[ x, y]], we
have
\sum_{n ∈ N} \sum_{k ∈ N} \frac{S(n, k)}{n!} x^n y^k = exp[y · (exp[x] − 1)].
\sum_{n ∈ N} A_n · \frac{y^n}{n!} = \frac{1 − x}{1 − x exp[(1 − x) y]}.
for any a, b, c ∈ N.
(a) 1 Set

F(a, b, c) := \sum_{k ∈ Z} (−1)^k \binom{b + c}{c + k} \binom{c + a}{a + k} \binom{a + b}{b + k}
F ( a, b, c) = F ( a − 1, b, c) + F ( a, b − 1, c) + F ( a, b, c − 1) .
\frac{(a + b + c)!}{a! b! c!} = \frac{(a − 1 + b + c)!}{(a − 1)! b! c!} + \frac{(a + b − 1 + c)!}{a! (b − 1)! c!} + \frac{(a + b + c − 1)!}{a! b! (c − 1)!}.
\sum_{k ∈ Z} (−1)^k \binom{b + c}{c + k} \binom{c + a}{a + k} \binom{a + b}{b + k} = \frac{(a + b + c)!}{a! b! c!}

for any a, b, c ∈ N.
[Hint: For part (d), use the fact that
x 2 ( y − z ) + y2 ( z − x ) + z2 ( x − y ) = − ( y − z ) ( z − x ) ( x − y ) ,
i ≥ m.]
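The claim behind this exercise (Dixon’s identity) can be checked by brute force for small a, b, c before proving the recurrence; the sketch below sums only over the values of k that can contribute:

```python
from math import comb, factorial

def F(a, b, c):
    # only k with -min(a, b, c) <= k <= min(a, b, c) give nonzero terms
    m = min(a, b, c)
    return sum((-1) ** (k % 2)
               * comb(b + c, c + k) * comb(c + a, a + k) * comb(a + b, b + k)
               for k in range(-m, m + 1))

for a in range(5):
    for b in range(5):
        for c in range(5):
            expected = factorial(a + b + c) // (
                factorial(a) * factorial(b) * factorial(c))
            assert F(a, b, c) == expected
```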
Exercise A.3.1.1. The purpose of this exercise is to make the proof of Propo-
sition 4.1.15 rigorous.
For any partition λ = (λ_1, λ_2, . . . , λ_k), we define the Young diagram Y(λ) of λ to be the finite set

{(i, j) ∈ {1, 2, 3, . . .}^2 | i ∈ [k] and j ∈ [λ_i]}.
Visually, this set Y (λ) is represented by drawing each (i, j) ∈ Y (λ) as a cell
of an (invisible) matrix, namely as the cell in row i and in column j. The resulting
picture is a table of k left-aligned rows, where the i-th row (counted from the
top) has exactly λi cells. For example, if λ = (4, 2, 1), then the Young diagram
Y(λ) of λ is a left-aligned array of rows with 4, 2 and 1 cells (figure omitted).
(This is only one way to draw Young diagrams; it is known as English notation
or matrix notation, since our labeling of cells matches the way the cells of a
matrix are commonly labeled. If we flip our pictures across a horizontal axis,
we would get French notation aka Cartesian notation, as the labeling of cells
would then match the Cartesian coordinates of their centers.)
(a) 1 Prove that |Y (λ)| = |λ| for any partition λ.
(b) 1 Prove that the Young diagram Y (λ) uniquely determines the parti-
tion λ.
A NW-set shall mean a subset S of {1, 2, 3, . . .}2 with the following prop-
erty: If (i, j) ∈ S and (i′ , j′ ) ∈ {1, 2, 3, . . .}2 satisfy i′ ≤ i and j′ ≤ j, then
(i′ , j′ ) ∈ S as well. (In terms of our above visual model, this means that walk-
ing northwest from a cell of S never moves you out of S, unless you walk out
of the matrix. For example, the set shown in the (omitted) figure is not a NW-set, since the left neighbor of the leftmost cell in the topmost row is not in this set.)
(c) 1 Prove that Y (λ) is a NW-set for each partition λ.
Math 701 Spring 2021, version April 6, 2024 page 526
(d) 2 Prove that any finite NW-set has the form Y (λ) for a unique parti-
tion λ.
Now, let flip : {1, 2, 3, . . .}2 → {1, 2, 3, . . .}2 be the map that sends each
(i, j) ∈ {1, 2, 3, . . .}2 to ( j, i ). Visually, this map flip is a reflection in the
“main diagonal” (the diagonal going from the northwest to the southeast).
We can apply flip to a subset of {1, 2, 3, . . .}2 by applying flip to each element
of this subset. For example, flip sends a Young diagram to its reflection across the main diagonal (figures omitted).
(e) 1 Prove that for any partition λ, there is a unique partition λ^t such that Y(λ^t) = flip(Y(λ)).
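Parts (d) and (e) invite experimentation; the sketch below (helper names mine) computes the conjugate partition λ^t by counting columns and checks that flipping Y((4, 2, 1)) gives Y(λ^t):

```python
def young_diagram(lam):
    # Y(lambda) as a set of cells (i, j), with rows and columns indexed from 1
    return {(i + 1, j + 1) for i, part in enumerate(lam) for j in range(part)}

def conjugate(lam):
    # row i of Y(lambda^t) is column i of Y(lambda)
    if not lam:
        return []
    return [sum(1 for part in lam if part >= i) for i in range(1, lam[0] + 1)]

lam = [4, 2, 1]
assert conjugate(lam) == [3, 2, 1, 1]
assert {(j, i) for (i, j) in young_diagram(lam)} == young_diagram(conjugate(lam))
```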
Exercise A.3.1.2. 5 A partition will be called binarial if all its parts are powers
of 2. For instance, (8, 2, 2, 1) is a binarial partition of 13. Recall that the length
of a partition λ is denoted by ℓ (λ).
Let n > 1 be an integer. Prove that
∑ (−1)ℓ(λ) = 0.
λ is a binarial
partition of n
In other words, prove that the # of binarial partitions of n having even length
equals the # of binarial partitions of n having odd length.
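The claim is easy to test for small n by listing all binarial partitions recursively (a brute-force sketch; the helper name is mine):

```python
def binarial_partitions(n, max_part):
    # weakly decreasing lists of powers of 2 (each <= max_part) summing to n
    if n == 0:
        return [[]]
    out = []
    p = max_part
    while p >= 1:
        if p <= n:
            out += [[p] + rest for rest in binarial_partitions(n - p, p)]
        p //= 2
    return out

assert [8, 2, 2, 1] in binarial_partitions(13, 16)   # the example above
for n in range(2, 14):
    parts = binarial_partitions(n, 1 << n.bit_length())
    assert sum((-1) ** len(lam) for lam in parts) == 0
```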
as monomials in x1 , x2 , x3 , . . ..
(For example, for n = 3, this is saying that
(x_1) · (x_1 x_1) · (x_1 x_2 x_3) = x_3 · (x_1 x_2) · x_1^3,

where the three factors on the left correspond to λ = (3), λ = (2, 1) and λ = (1, 1, 1) respectively, and likewise the three factors on the right.
Make sure you understand why part (a) is a particular case of (b).)
(# of single-cell upgrades of λ)
= (# of partitions µ such that λ is a single-cell upgrade of µ) + 1.
(c) 3 We have

\sum_{λ is a partition} γ(λ) x^{|λ|} = \frac{x}{1 − x} \prod_{i=1}^{∞} \frac{1}{1 − x^i}   in Z[[x]].
(For instance, the partition (4, 2, 2, 2, 2) has no part divisible by 3, but it has
3 equal parts.)
[Remark: Theorem 4.1.14 is the particular case of this exercise for d = 2.]
Exercise A.3.1.9. Let n ∈ N. Let Parn denote the set of all partitions of n.
We define a partial order ≼ on the set Parn as follows: For two partitions
λ = (λ1 , λ2 , . . . , λk ) and µ = (µ1 , µ2 , . . . , µℓ ), we set λ ≼ µ if and only if each
positive integer i satisfies
λ1 + λ2 + · · · + λ i ≤ µ1 + µ2 + · · · + µ i . (310)
Here, we set λ j := 0 for each j > k, and we set µ j := 0 for each j > ℓ.
(For example, for n = 5, we have (2, 1, 1, 1) ≼ (2, 2, 1), since we have
2 ≤ 2,
2 + 1 ≤ 2 + 2,
2 + 1 + 1 ≤ 2 + 2 + 1,
2 + 1 + 1 + 1 ≤ 2 + 2 + 1 + 0,
2 + 1 + 1 + 1 + 0 ≤ 2 + 2 + 1 + 0 + 0,
and so on. Note that there are infinitely many inequalities to be checked,
but only finitely many of them are relevant, since both sides of (310) are
essentially finite sums that stop growing at some point.)
(a) 1 For any n ≥ 6, find two partitions λ and µ of n satisfying neither
λ ≼ µ nor µ ≼ λ. (This shows that ≼ is not a total order for n ≥ 6.)
(b) 2 Prove that two partitions λ = (λ1 , λ2 , . . . , λk ) and µ =
(µ1 , µ2 , . . . , µℓ ) of n satisfy λ ≼ µ if and only if each i ∈ {1, 2, . . . , k} sat-
isfies (310).
(c) 2 Prove that two partitions λ = (λ1 , λ2 , . . . , λk ) and µ = (µ1 , µ2 , . . . , µℓ )
of n satisfy λ ≼ µ if and only if each i ∈ {1, 2, . . . , ℓ} satisfies (310).
(d) 4 Prove that two partitions λ and µ of n satisfy λ ≼ µ if and only if
they satisfy µt ≼ λt . (See Exercise A.3.1.1 for the definition of λt and µt .)
[Note: The partial order ≼ is called the dominance order or the majorization
order; it is rather important in the theory of symmetric functions.]
[Hint: It is helpful to identify a partition λ = (λ_1, λ_2, . . . , λ_k) with the weakly decreasing essentially finite sequence \widehat{λ} = (λ_1, λ_2, . . . , λ_k, 0, 0, 0, . . .).]
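The dominance order is straightforward to implement via partial sums; the sketch below (function name mine) reproduces the example above and exhibits an incomparable pair for part (a):

```python
def leq(lam, mu):
    # lam <= mu in dominance order: each partial sum of lam is <= that of mu
    k = max(len(lam), len(mu))
    lam = list(lam) + [0] * (k - len(lam))
    mu = list(mu) + [0] * (k - len(mu))
    s_lam = s_mu = 0
    for a, b in zip(lam, mu):
        s_lam += a
        s_mu += b
        if s_lam > s_mu:
            return False
    return True

assert leq([2, 1, 1, 1], [2, 2, 1])          # the example above, for n = 5
# an incomparable pair of partitions of 6, as required in part (a):
assert not leq([3, 3], [4, 1, 1]) and not leq([4, 1, 1], [3, 3])
```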
(That is, the sequence of coefficients of this polynomial is palindromic.)
(c) We have \binom{n}{k}_{q^{−1}} = q^{−k(n−k)} \binom{n}{k}_q in the Laurent polynomial ring Z[q^{±}].
Exercise A.3.4.4. Let us extend the definition of the q-integer [n]q (Definition
4.4.15 (a)) to the case when n is a negative integer as follows: If n is a negative
integer, then we set
[n]_q := −q^{−1} − q^{−2} − · · · − q^{n} = −\sum_{k=n}^{−1} q^k ∈ Z[q^{±}].
(f) 1 Prove that \binom{n}{k}_q ∈ Z[q^{±}] (that is, \binom{n}{k}_q is a Laurent polynomial, not just a Laurent series) for any n ∈ Z and k ∈ Z.
(g) 2 Prove that Theorem 4.4.12 holds for any n ∈ Z (not just for positive
integers n).
(h) 1 Prove that Proposition 4.4.18 does not hold for negative n (in gen-
eral).
Exercise A.3.4.5. Consider the ring Z [[z, q]] of FPSs in two indeterminates z
and q. Let n ∈ N.
(a) 1 Prove that

\prod_{i=0}^{n−1} (1 + zq^i) = \sum_{k=0}^{n} q^{k(k−1)/2} \binom{n}{k}_q z^k.
where we set \binom{−1}{0}_q := 1 (this is consistent with the definition in Exercise A.3.4.4). (Up to sign, this is a q-analogue of Proposition 3.3.12.)
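Part (a) can be verified for small n with exact polynomial arithmetic. In the sketch below, polynomials in q are coefficient lists, and the q-binomial coefficients are computed from the q-Pascal recurrence \binom{n}{k}_q = \binom{n−1}{k−1}_q + q^k \binom{n−1}{k}_q (a standard recurrence, used here as an assumption):

```python
def padd(f, g):
    m = max(len(f), len(g))
    return [(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
            for i in range(m)]

def pshift(f, k):
    # multiply a q-polynomial (coefficient list) by q^k
    return [0] * k + list(f)

def norm(f):
    # strip trailing zero coefficients
    while f and f[-1] == 0:
        f = f[:-1]
    return f

def qbinom(n, k):
    # q-binomial coefficient via the q-Pascal recurrence
    if k < 0 or k > n:
        return [0]
    if k == 0 or k == n:
        return [1]
    return padd(qbinom(n - 1, k - 1), pshift(qbinom(n - 1, k), k))

n = 5
prod = [[1]]              # polynomial in z with q-polynomial coefficients
for i in range(n):        # multiply by (1 + z q^i)
    new = [list(c) for c in prod] + [[0]]
    for d in range(len(prod)):
        new[d + 1] = padd(new[d + 1], pshift(prod[d], i))
    prod = new

for k in range(n + 1):    # compare with q^{k(k-1)/2} * qbinom(n, k)
    assert norm(prod[k]) == norm(pshift(qbinom(n, k), k * (k - 1) // 2))
```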
for any m, n ∈ N.
(b) 4 Recover Theorem 4.3.1 by taking the limit m → ∞ and then n → ∞.
(This yields a new proof of Theorem 4.3.1.)
The following exercise does not explicitly involve q-binomial coefficients, but
they can be used profitably in its solution:
Exercise A.3.4.11. We work in the ring Z [[z, q]] of FPSs in two indeterminates
z and q.
(a) 2 Prove that

\prod_{i=1}^{∞} (1 + zq^i) = \sum_{k ∈ N} \frac{z^k q^{k(k+1)/2}}{(1 − q^1)(1 − q^2) · · · (1 − q^k)}.
\prod_{i=1}^{∞} \frac{1}{1 − zq^i} = \sum_{k ∈ N} \frac{z^k q^{k^2}}{(1 − q^1)(1 − q^2) · · · (1 − q^k) · (1 − zq^1)(1 − zq^2) · · · (1 − zq^k)}.
[Hint: For part (c), define the h-index h (λ) of a partition λ to be the largest
i ∈ N such that λ has at least i parts that are ≥ i. For instance, h (4, 3, 1, 1) = 2
and h (4, 3, 3, 1) = 3. Note that h (λ) is the size of the largest square that fits
into the Young diagram of λ. What remains if this square is removed from
the Young diagram of λ ? See also the Hirsch index.]
[Hint: For part (a), what is the connection between injective linear maps
and surjective linear maps (between finite-dimensional vector spaces)?]
each k ∈ N.
(c) 2 Rederive Theorem 4.4.19 by applying Theorem 4.4.21 to β, α and q
instead of a, b and ω.
(where the letters e and i have the usual meanings they have in complex
analysis) and, in particular, |Ω| = p and 1 ∈ Ω.
Let n be a positive integer.
(a) 1 Prove that \binom{np − 1}{p − 1}_ω = 1 for each ω ∈ Ω \ {1}.

(b) 2 Prove that \binom{np}{p}_ω = n for each ω ∈ Ω \ {1}.

(c) 1 Prove that \sum_{ω ∈ Ω} ω^k equals p if p | k, and equals 0 if p ∤ k, for any k ∈ N.
(a) 1 Prove that this is well-defined, i.e., that f − f [qx ] is really divisible
by (1 − q) x in R for any f ∈ R.
The map Dq is known as the q-derivative or the Jackson derivative. It is easily
seen to be K [[q]]-linear.
(b) 1 Prove that D_q(x^n) = [n]_q x^{n−1} for each n ∈ N. (For n = 0, read the right hand side as 0.)
(c) 1 Prove the “q-Leibniz rule”: For any f , g ∈ R, we have
Dq ( f g) = Dq f · g + f [qx ] · Dq g.
(d) 1 Let exp_q ∈ R be the FPS \sum_{n ∈ N} \frac{x^n}{[n]_q!} (called the “q-exponential”). Prove that D_q exp_q = exp_q. (Note that, unlike the usual exponential exp, this does not require that K be a Q-algebra, since [n]_q! is an FPS with constant term 1 and thus always invertible.)
(e) 2 Prove that each m ∈ N satisfies

D_q \frac{1}{\prod_{i=0}^{m−1} (1 − q^i x)} = \frac{[m]_q}{\prod_{i=0}^{m} (1 − q^i x)}.
For example,

Q_{q,0} = \sum_{n ∈ N} [n]_q^0 x^n = \sum_{n ∈ N} x^n = \frac{1}{1 − x};

Q_{q,1} = \sum_{n ∈ N} [n]_q^1 x^n = \sum_{n ∈ N} [n]_q x^n = \frac{x}{(1 − x)(1 − qx)}   (why?);
It appears that each Q_{q,m} has the form \frac{A_{q,m}}{(1 − x)(1 − qx) · · · (1 − q^m x)} for some polynomial A_{q,m} ∈ Z[q, x] of degree m. Let us prove this.
(Thus, Q_{q,m} = \frac{A_{q,m}}{(1 − x)(1 − qx) · · · (1 − q^m x)}.)
Let ϑq : R → R be the Z-linear map that sends each FPS f ∈ R to x · Dq f .
(f) 1 Prove that Qq,m = ϑq Qq,m−1 for each m > 0.
(g) 2 Prove that A_{q,m} = [m]_q x A_{q,m−1} + x(1 − x) · D_q A_{q,m−1} for each m > 0.
(h) 1 Conclude that Aq,m is a polynomial in x and q for each m ∈ N.
(i) 2 Show that this polynomial A_{q,m} has the form A_{q,m} = x^m q^{m(m−1)/2} + R_{q,m}, where R_{q,m} is a Z-linear combination of x^i q^j with i < m and j < m(m − 1)/2.
The polynomials Aq,m are called Carlitz’s q-Eulerian polynomials.
A.4. Permutations
The notations of Chapter 5 shall be used here. In particular, if X is a set, then
SX shall mean the symmetric group of X; and if n is a nonnegative integer, then
Sn shall mean the symmetric group S[n] of the set [n] := {1, 2, . . . , n}.
Exercise A.4.2.3. Let p be a prime number. Let Z be the set of all p-cycles in
the symmetric group S p .
Let ζ be the specific p-cycle cyc1,2,...,p ∈ S p . Note that ζ has order p in the
group S p , and thus generates a cyclic subgroup ⟨ζ ⟩ of order p.
(a) 2 Prove that a permutation σ ∈ S p satisfies σζ = ζσ if and only if
σ ∈ ⟨ζ ⟩ (that is, if and only if σ is a power of ζ).
(b) 2 Prove that |⟨ζ ⟩ ∩ Z | = p − 1.
(c) 1 Prove that the cyclic group ⟨ζ ⟩ acts on the set Z by conjugation:
(where the symbol “⇀” means the action of a group G on a G-set X – i.e., we
let g ⇀ x denote the result of a group element g ∈ G acting on some x ∈ X).
Exercise A.4.3.3. Let σ ∈ Sn and i ∈ [n]. Prove the following (using the
notation of Definition 5.3.6 (a)):
(a) 1 We have ℓi (σ) = |[σ (i ) − 1] \ σ ([i ])|.
(b) 1 We have ℓi (σ) = |[σ (i ) − 1] \ σ ([i − 1])|.
(c) 1 We have σ (i ) ≤ i + ℓi (σ).
(d) 2 Assume that i ∈ [n − 1]. We have σ (i ) > σ (i + 1) if and only if
ℓ i ( σ ) > ℓ i +1 ( σ ).
A.4.4. V-permutations
Exercise A.4.4.1. 5 Let n ∈ N. For each r ∈ [n], let cr denote the permutation
cycr,r−1,...,2,1 ∈ Sn . (Thus, c1 = cyc1 = id and c2 = cyc2,1 = s1 .)
Let G = {g_1, g_2, . . . , g_p} be a subset of [n], with g_1 < g_2 < · · · < g_p. Let
σ ∈ Sn be the permutation c g1 c g2 · · · c g p .
[Example: If n = 6 and p = 2 and G = {2, 5}, then σ = c2 c5 =
cyc2,1 cyc5,4,3,2,1 . In one-line notation, this permutation σ is 521346.]
Prove the following:
(a) We have σ (1) > σ (2) > · · · > σ ( p).
(b) We have σ ([ p]) = G.
(c) We have σ ( p + 1) < σ ( p + 2) < · · · < σ (n).
(Note that a chain of inequalities that involves fewer than two numbers is considered to be vacuously true. For example, Exercise A.4.4.1 (c) is vacuously true when p = n − 1 and also when p = n.)
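The example can be reproduced in a few lines of Python (the helper names are mine):

```python
def cyc_down(r, n):
    # cyc_{r, r-1, ..., 2, 1} in S_n: sends i -> i - 1 for 2 <= i <= r,
    # sends 1 -> r, and fixes r + 1, ..., n; returned in one-line notation
    img = {i: i for i in range(1, n + 1)}
    img[1] = r
    for i in range(2, r + 1):
        img[i] = i - 1
    return tuple(img[i] for i in range(1, n + 1))

def compose(s, t):
    # (s t)(i) = s(t(i)) for one-line tuples s, t
    return tuple(s[t[i] - 1] for i in range(len(s)))

n, G = 6, [2, 5]
sigma = compose(cyc_down(2, n), cyc_down(5, n))
assert sigma == (5, 2, 1, 3, 4, 6)          # "521346" in one-line notation
p = len(G)
assert all(sigma[i] > sigma[i + 1] for i in range(p - 1))    # part (a)
assert set(sigma[:p]) == set(G)                              # part (b)
assert list(sigma[p:]) == sorted(sigma[p:])                  # part (c)
```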
• Statement 3: For each i ∈ [n], the set σ−1 ([i ]) is an integer interval (i.e.,
there exist integers u and v such that σ−1 ([i ]) = {u, u + 1, u + 2, . . . , v}).
• Statement 5: We have ℓ1 (σ) > ℓ2 (σ ) > · · · > ℓ p (σ) and ℓ p+1 (σ) =
ℓ p+2 (σ) = · · · = ℓn (σ) = 0 for some p ∈ {0, 1, . . . , n}. (In other
words, the n-tuple L (σ) is strictly decreasing until it reaches 0, and
then remains at 0.)
and

\sum_{p=0}^{n} a^p b^{n−p} \sum_{σ ∈ S_n; σ(1)>σ(2)>···>σ(p); σ(p+1)<σ(p+2)<···<σ(n)} q^{ℓ(σ)} = \sum_{k=0}^{n} q^{k(k−1)/2} \binom{n}{k}_q a^k b^{n−k}.
Definition A.4.3. Let n ∈ N. For every σ ∈ Sn , we let Inv σ denote the set of
all inversions of σ.
We know from Corollary 5.3.20 (b) that any n ∈ N and any two permutations
σ and τ in Sn satisfy the inequality ℓ (στ ) ≤ ℓ (σ ) + ℓ (τ ). In the following
exercise, we will see when this inequality becomes an equality:
Inv(τ^{−1} σ^{−1}).
Exercise A.4.6.1 (d) shows that if two permutations in Sn have the same set
of inversions, then they are equal. In other words, a permutation in Sn is
uniquely determined by its set of inversions. The next exercise shows what set
of inversions a permutation can have:
The next few exercises cover some of the most basic results in the theory of
pattern avoidance (see [Bona12, Chapter 4] and [Kitaev11] for much more161 ).
This can be viewed as one possible way of generalizing monotonicity (i.e., in-
creasingness and decreasingness). We begin by defining some basic concepts:
Example A.4.5. Let t = (5, 1, 6, 2, 3, 4). Then, (1, 3) and (5, 6, 2) are subse-
quences of t (indeed, if we write t as (t1 , t2 , . . . , t6 ), then (1, 3) = (t2 , t5 ) and
(5, 6, 2) = (t1 , t3 , t4 )), whereas (1, 5) and (2, 1, 6) and (1, 1, 6) are not.
Next, we need to define the notion of equally ordered tuples. Roughly speaking,
these are tuples of the same length that might differ in their values, but agree
in the relative order of their values (e.g., if one tuple has a smaller value in
position 2 than in position 5, then so does the other tuple). Here is the formal
definition:
This relation is clearly symmetric in a and b (that is, a and b are equally
ordered if and only if b and a are equally ordered).
We agree that a k-tuple and an ℓ-tuple are never equally ordered when
k ̸ = ℓ.
161 There is a yearly conference on this subject!
Example A.4.8. (a) The two triples (3, 1, 6) and (1, 0, 2) are equally ordered.
(b) The two quadruples (3, 1, 1, 2) and (4, 1, 1, 3) are equally ordered.
(c) The two triples (3, 1, 2) and (2, 1, 3) are not equally ordered (indeed, we
have 3 < 2, but we don’t have 2 < 3).
• not 123-avoiding, since it contains the 123-pattern (1, 3, 4) (and also the
123-pattern (2, 3, 4));
Exercise A.4.8.2. 6 Let n ∈ N. Let cn denote the n-th Catalan number (from
Example 2 in Section 3.1).
(a) Prove that
(# of 132-avoiding permutations in Sn ) = cn .
(b) Prove that
(# of 231-avoiding permutations in Sn ) = cn .
(c) Prove that
(# of 213-avoiding permutations in Sn ) = cn .
(d) Prove that
(# of 312-avoiding permutations in Sn ) = cn .
[Hint: Easy bijections show that parts (a), (b), (c) and (d) are equivalent.
For part (b), proceed recursively: Assume that n > 0, and let σ ∈ Sn , and let
i = σ−1 (n). Show that the permutation σ is 231-avoiding if and only if the
two tuples (σ (1) , σ (2) , . . . , σ (i − 1)) and (σ (i + 1) , σ (i + 2) , . . . , σ (n)) are
231-avoiding and satisfy {σ (1) , σ (2) , . . . , σ (i − 1)} = {1, 2, . . . , i − 1} and
{σ (i + 1) , σ (i + 2) , . . . , σ (n)} = {i, i + 1, . . . , n − 1}. This yields a recursive
equation for the # of 231-avoiding permutations in Sn .]
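For small n, all of these counts can be checked by brute force against the definition of pattern containment; the sketch below tests every pattern in S_3 at once (which also previews the combined statement discussed after the next exercise):

```python
from itertools import combinations, permutations
from math import comb

def contains(perm, pat):
    # does perm have a subsequence equally ordered with pat?
    k = len(pat)
    return any(all((sub[a] < sub[b]) == (pat[a] < pat[b])
                   for a in range(k) for b in range(a + 1, k))
               for sub in combinations(perm, k))

def catalan(n):
    return comb(2 * n, n) // (n + 1)

for n in range(1, 7):
    perms = list(permutations(range(1, n + 1)))
    for pat in permutations((1, 2, 3)):
        assert sum(1 for p in perms if not contains(p, pat)) == catalan(n)
```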
Exercise A.4.8.3. 8 Let n ∈ N. Let cn denote the n-th Catalan number (from
Example 2 in Section 3.1).
(a) Prove that
(# of 123-avoiding permutations in Sn ) = cn .
(b) Prove that
(# of 321-avoiding permutations in Sn ) = cn .
[Hint: Consider any 321-avoiding permutation σ ∈ Sn . A record of σ means
a value σ (i ) for some i ∈ [n] satisfying
N^{c_1} S^{i_1+1} N^{c_2} S^{i_2+1} · · · N^{c_k} S^{i_k+1},

B(σ) = N^1 S^1 N^3 S^1 N^2 S^1 N^1 S^3 N^1 S^2 = NSNNNSNNSNSSSNSS = (lattice path figure omitted)
Exercises A.4.8.2 and A.4.8.3 can be combined into a single statement, which
says that for any τ ∈ S3 and any n ∈ N, the # of (τ (1) , τ (2) , τ (3))-avoiding
permutations in Sn equals the Catalan number cn , independently of τ. The
independence appears almost too good to be true. Parts of this miracle survive
even for τ ∈ S4 ; for example, for any n ∈ N, we have
(# of 4132-avoiding permutations in Sn )
= (# of 3142-avoiding permutations in Sn )
(a result of Stankova [Stanko94, Theorem 3.1]), but this number does not equal
the # of 1324-avoiding permutations in Sn (in general), nor does it have any
simple formula. A Wikipedia page collects known results about these and
similar numbers.
We state a few simple results about permutations avoiding several patterns:
Definition A.4.16. Let u and v be two tuples of integers. A permutation in Sn
(or a tuple of integers) is said to be (u, v)-avoiding if and only if it is both u-
avoiding and v-avoiding. Similarly we define (u, v, w)-avoiding permutations
(where u, v and w are three tuples of integers).
[Hint: In parts (a), (b) and (d), the permutations you are counting have
already appeared in one of the previous problems under a different guise.]
ℓ1 ( σ ) ≥ ℓ2 ( σ ) ≥ · · · ≥ ℓ n ( σ ) .
(b) 1 If σ and τ are two conjugate elements in the group SX , then ℓr (σ) =
ℓr ( τ ).
(c) 3 For any σ ∈ S_X and any two distinct elements i and j of X, we have

ℓ_r(σ t_{i,j}) = ℓ_r(t_{i,j} σ) = ℓ_r(σ) + 1 if we don’t have i ∼_σ j, and = ℓ_r(σ) − 1 if i ∼_σ j.
Here, the notation “i ∼_σ j” means “i and j belong to the same cycle of σ” (that is, “there exists some p ∈ N such that i = σ^p(j)”).
(d) 2 If σ ∈ SX , then the number ℓr (σ) is the smallest p ∈ N such that we
can write σ as a composition of p transpositions.
(e) 2 For any σ ∈ SX and τ ∈ SX , we have ℓr (στ ) ≤ ℓr (σ ) + ℓr (τ ).
(f) 2 For any σ ∈ SX and τ ∈ SX , we have ℓr (στ ) ≡ ℓr (σ ) + ℓr (τ ) mod 2.
(g) 1 If σ ∈ SX , then ℓr (σ) ≤ ℓ (σ).
(This is the n × n-matrix whose (i, j)-th entry is [i = σ ( j)], where we are
using the notation of Definition 4.1.5.) For instance, if n = 4 and σ = 3124 in
one-line notation, then
      ( 0 1 0 0 )
P_σ = ( 0 0 1 0 )
      ( 1 0 0 0 )
      ( 0 0 0 1 ).
Sn → GLn (K ) ,
σ 7→ Pσ
is a group homomorphism.
(e) 2 We have Log b = \frac{1}{2} (Log(1 + x) − Log(1 − x)).

(f) 1 We have b = \left(\frac{1 + x}{1 − x}\right)^{1/2} = (1 + x) · (1 − x^2)^{−1/2}.

(g) 2 We have b_n = (−1)^{⌊n/2⌋} \binom{−1/2}{⌊n/2⌋} for each n ∈ N.

(h) 2 For each n ∈ N, we have

a_n = n! · (−1)^{⌊n/2⌋} \binom{−1/2}{⌊n/2⌋} = c_{n/2}^2 if n is even, and = n c_n if n is odd,
[Hint: In part (b), first choose the cycle of σ that contains the element 1.]
Note that Theorem 5.3.17 (a) shows that any permutation σ ∈ Sn has a Cox-
eter word. Furthermore, Theorem 5.3.17 (b) says that the length ℓ (σ) of a
permutation σ ∈ Sn is the smallest length of a Coxeter word for σ. Thus, a
reduced word for a permutation σ ∈ Sn is the same as a Coxeter word for σ
that has length ℓ (σ). (This is the reason for the name “length”.)
Note that each permutation σ ∈ Sn has finitely many reduced words, but
infinitely many Coxeter words (unless n ≤ 1). The identity permutation id has
only one reduced word – namely, the 0-tuple () – but usually many Coxeter
words, such as (1, 2, 3, 3, 2, 1).
We shall now study the combinatorics of Coxeter and reduced words of a
permutation σ ∈ Sn in more depth. First, let us view them from a different
perspective:
• If a Coxeter word for σ ∈ Sn has two adjacent entries that are equal, then
we can remove these two entries and still have a Coxeter word for σ, since
Proposition 5.2.5 (a) yields si si = s2i = id for each i ∈ [n − 1].
[Figure: the graph whose nodes are the reduced words 121321, 123121, 123212, 132312, 212321, 213231, 231213, 232123, 312132, 321232, 321323, 323123, with edges joining words related by a commutation move or a braid move; drawing not reproduced.]
It turns out that each reduced word for σ appears as a node on this graph.
In other words, each reduced word for σ can be obtained from 121321 by a
sequence of commutation moves and braid moves. This is not a coincidence,
but a general result, known as Matsumoto’s theorem for the symmetric group:
The graph in Example A.4.21 is furthermore bipartite; better yet, any cycle
has an even # of thick edges and an even # of thin edges. This, too, is not a
coincidence:
How many reduced words does a given permutation σ ∈ Sn have? For most
σ, there is no nice formula for the answer. However, in at least one specific case,
a surprising (and deep) formula exists, which I am here mentioning less as a
reasonable exercise than as a curiosity:
A.4.11. Descents
Descents are one of the most elementary features of a permutation σ ∈ Sn : they
are just the positions at which σ decreases (from that position to the next).
Formally, they are defined as follows:
    (# of σ ∈ Sn satisfying Des σ ⊆ I) = n! / (d1! d2! ··· d_{k+1}!) .
(b) 5 Let us use the notations from Definition 4.4.15 (b). Prove that

    Σ_{σ ∈ Sn; Des σ ⊆ I} q^{ℓ(σ)} = [n]_q ! / ( [d1]_q ! [d2]_q ! ··· [d_{k+1}]_q ! )

in the ring Z [q] .
Note that Exercise A.4.11.2 (b) generalizes both Exercise A.4.11.2 (a) (obtained
by setting q = 1) and Proposition 5.3.5 (obtained by setting I = [n − 1] and
q = x).
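The count in part (a) is quick to confirm by brute force for small n; a sketch (function names mine; the d_j are the gaps of I ∪ {0, n}):

```python
from itertools import permutations
from math import factorial

def descent_set(perm):
    """Des σ = {i ∈ [n−1] : σ(i) > σ(i+1)} for a one-line tuple."""
    return {i for i in range(1, len(perm)) if perm[i - 1] > perm[i]}

def count_des_subset(n, I):
    """Brute-force # of σ ∈ S_n with Des σ ⊆ I."""
    return sum(1 for p in permutations(range(1, n + 1)) if descent_set(p) <= set(I))

def multinomial_count(n, I):
    """n! / (d_1! d_2! ··· d_{k+1}!) with d_j = i_j − i_{j−1} for I = {i_1 < ··· < i_k}."""
    marks = [0] + sorted(I) + [n]
    result = factorial(n)
    for lo, hi in zip(marks, marks[1:]):
        result //= factorial(hi - lo)
    return result
```

For example, n = 5 and I = {2, 3} gives 5!/(2!·1!·2!) = 30 permutations.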
What about permutations σ ∈ Sn satisfying Des σ = I rather than Des σ ⊆ I
? See Exercise A.5.4.3 further below for this.
Meanwhile, let us connect descents with Eulerian polynomials:
    Σ_{σ ∈ Sn} x^{ℓ(σ)} = Σ_{σ ∈ Sn} x^{maj σ}    in Z [x] .
Exercise A.5.1.2. Recall the concepts of Dyck words and Dyck paths defined
in Example 2 in Section 3.1.
Let n ∈ N.
If w ∈ {0, 1}^{2n} is a 2n-tuple, and if k ∈ {0, 1, . . . , 2n}, then we define the
k-height h_k (w) of w to be the number
    (# of i ∈ [k] such that w_i = 1) − (# of i ∈ [k] such that w_i = 0),
and we define the area area (w) of w to be the number Σ_{k=0}^{2n} h_k (w),
and we define the sign sign (w) of w to be the number (−1)^{(area(w)−n)/2} (we
will soon see that this is well-defined).
[For example, if n = 5 and w = (1, 1, 0, 1, 1, 0, 0, 0, 1, 0), then

    area (w) = Σ_{k=0}^{10} h_k (w)
             = h0 (w) + h1 (w) + h2 (w) + h3 (w) + h4 (w) + h5 (w)
               + h6 (w) + h7 (w) + h8 (w) + h9 (w) + h10 (w)
             = 0 + 1 + 2 + 1 + 2 + 3 + 2 + 1 + 0 + 1 + 0 = 13 .]
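The heights are just partial sums of the steps (1 counted as +1 and 0 as −1), so the worked example can be reproduced mechanically; a sketch (function names mine):

```python
def heights(w):
    """h_k(w) = (# of 1s) − (# of 0s) among the first k entries, for k = 0..len(w)."""
    hs = [0]
    for step in w:
        hs.append(hs[-1] + (1 if step == 1 else -1))
    return hs

def area(w):
    """area(w) = Σ_k h_k(w)."""
    return sum(heights(w))

def sign(w):
    """sign(w) = (−1)^{(area(w) − n)/2} for w ∈ {0,1}^{2n}; part (a) says the
    exponent is an integer."""
    exponent, remainder = divmod(area(w) - len(w) // 2, 2)
    assert remainder == 0  # the claim of part (a)
    return -1 if exponent % 2 else 1
```

On the example word this recovers area 13 and sign (−1)^4 = +1.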
(a) 1 Prove that any w ∈ {0, 1}2n satisfies (area (w) − n) /2 ∈ Z (so that
(−1)(area(w)−n)/2 really is well-defined).
(b) 2 Prove that a 2n-tuple w ∈ {0, 1}^{2n} is a Dyck word of length 2n if and
only if it satisfies h_k (w) ≥ 0 for all k ∈ {0, 1, . . . , 2n} and h_{2n} (w) = 0.
The next exercise is not about alternating sums, but rather about proving the
q-Lucas theorem (Theorem 6.1.7):
Exercise A.5.1.3. Let K be a field. Let d be a positive integer. Let ω be a
primitive d-th root of unity in K.
(a) 2 Prove that ( d choose k )_ω = 0 for each k ∈ {1, 2, . . . , d − 1}.
Now, let A be a noncommutative K-algebra, and let a, b ∈ A be such that
ba = ωab.
(b) 1 Prove that ( a + b)d = ad + bd .
(c) 3 Prove that adand bd commute with both a and b (that is, we have
uv = vu for each u ∈ ad , bd and each v ∈ { a, b}).
(d) 5 Prove Theorem 6.1.7.
[Hint: For part (a), show that ( n choose k )_q = ( [n]_q / [k]_q ) · ( n−1 choose k−1 )_q for all n > 0 and
k > 0. For part (d), first construct a noncommutative K-algebra A and two
elements a, b ∈ A satisfying ba = ωab and such that all the monomials a^i b^j
are K-linearly independent. Use Exercise A.3.4.15 for this. In this K-algebra,
expand both sides of (a + b)^n = ((a + b)^d)^q (a + b)^r. Alternatively, there is a
commutative approach using Theorem 4.4.19.]
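Part (b) can be checked numerically by modelling the algebra with basis a^i b^j and relation ba = ωab (so b^j a^k = ω^{jk} a^k b^j) over the complex numbers, used here as a stand-in for K; the choice d = 5 and the dict representation are mine:

```python
import cmath

d = 5
omega = cmath.exp(2j * cmath.pi / d)  # a primitive d-th root of unity in C

def mul(p, q):
    """Multiply two elements of the algebra with relation b a = ω a b;
    elements are dicts {(i, j): coefficient} representing Σ coeff · a^i b^j."""
    result = {}
    for (i, j), c in p.items():
        for (k, l), e in q.items():
            key = (i + k, j + l)
            # move a^k past b^j: b^j a^k = ω^{jk} a^k b^j
            result[key] = result.get(key, 0) + c * e * omega ** (j * k)
    return result

a_plus_b = {(1, 0): 1, (0, 1): 1}  # the element a + b
power = {(0, 0): 1}
for _ in range(d):
    power = mul(power, a_plus_b)
```

Up to floating-point noise, all coefficients of (a + b)^d vanish except those of a^d and b^d, as part (b) predicts (the middle coefficients are the Gaussian binomials from part (a), evaluated at ω).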
    = ( n+1 choose 2 ) · n! .
where S (m, n) is the Stirling number of the 2nd kind (as defined in Exercise
A.2.6.5).
The next exercise is concerned with the derangement numbers Dn from Defini-
tion 6.2.4.
where we set v0 := w.
(b) 1 For each m ∈ {0, 1, . . . , n − 1}, we have
!m
∑ (−1)n−| I | w + ∑ vi = 0.
I ⊆[n] i∈ I
(c) 1 We have
!n
∑ (−1)n−| I | w + ∑ vi = n!v1 v2 · · · vn .
I ⊆[n] i∈ I
(d) 2 We have
n
∑ (−1)n−| I | ∑ vi − ∑ vi = 2n n!v1 v2 · · · vn .
I ⊆[n] i∈ I i ∈[n]\ I
Prove that
∑ (−1)|X | = ∑ (−1)|Y | .
X ⊆ A; Y ⊆ B;
M( X )= B N (Y )= A
The following two exercises show some applications of the methods of Chap-
ter 6 to graph theory.
Exercise A.5.2.10. Let G be a finite undirected graph with vertex set V and
edge set E. Fix n ∈ N.
An n-coloring of G means a map c : V → [n]. If c : V → [n] is an n-coloring,
then we regard the values c (v) of c as the “colors” of the respective vertices
v.
An n-coloring c of G is said to be proper if there exists no edge of G whose
two endpoints v and w satisfy c (v) = c (w). (In other words, an n-coloring
of G is said to be proper if and only if there is no edge whose two endpoints
have the same color.)
Let χG (n) denote the # of proper n-colorings of G.
(a) 5 Prove that

    χG (n) = Σ_{F ⊆ E} (−1)^{|F|} n^{conn(V,F)} ,

where conn (V, F) denotes the # of connected components of the graph with
vertex set V and edge set F.
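This inclusion–exclusion sum over edge subsets is easy to test against direct enumeration; a sketch (the test graph, a triangle with a pendant vertex, and all names are mine):

```python
from itertools import combinations, product

def conn(V, F):
    """# of connected components of the graph (V, F), via union-find."""
    parent = {v: v for v in V}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for u, w in F:
        parent[find(u)] = find(w)
    return len({find(v) for v in V})

def chromatic_brute(V, E, n):
    """# of proper n-colorings, by direct enumeration."""
    return sum(
        1
        for colors in product(range(n), repeat=len(V))
        if all(colors[V.index(u)] != colors[V.index(w)] for u, w in E)
    )

def chromatic_whitney(V, E, n):
    """Σ_{F ⊆ E} (−1)^{|F|} n^{conn(V, F)}, the sum in part (a)."""
    return sum(
        (-1) ** k * n ** conn(V, F)
        for k in range(len(E) + 1)
        for F in combinations(E, k)
    )

# a small test graph: a triangle 1-2-3 with a pendant vertex 4 attached to 3
V = (1, 2, 3, 4)
E = ((1, 2), (2, 3), (1, 3), (3, 4))
```

For this graph, χG (n) = n (n − 1) (n − 2) (n − 1), so χG (3) = 12.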
2 3
p q r
1 8 7 4
v w t
6 5
(where the vertex v is the vertex labelled v). Then, for example, the set
{1, 2} ⊆ E infects edges 1, 2, 3, 6, 8 (but none of the other edges). The set
{1, 2, 5} infects the same edges as {1, 2} (indeed, the additional edge 5 does
not increase its infectiousness, since it is not on any {1, 2, 5}-path from v).
The set {1, 2, 3} infects every edge other than 5. The set {1, 2, 3, 4} infects
each edge, and thus is pandemic.]
[Hint: There is a direct proof, but it is perhaps neater to derive this from
Exercise A.5.2.13.]
[Remark: The (n + 1)-tuple (b0 , b1 , . . . , bn ) is called the binomial transform
of ( a0 , a1 , . . . , an ).]
The next few exercises show some ways of generalizing the Principle of In-
clusion and Exclusion (in its original form – Theorem 6.2.1). The first one
replaces the question “how many elements of U belongs to none of the n sub-
sets A1 , A2 , . . . , An ” by “how many elements of U belong to exactly k of the n
subsets A1 , A2 , . . . , An ”:
Show that

    |S_k| = Σ_{I ⊆ [n]} (−1)^{|I|−k} ( |I| choose k ) · (# of u ∈ U that satisfy u ∈ A_i for all i ∈ I) .
if U = A1 ∪ A2 ∪ · · · ∪ An .
in K.
(b) 1 Derive Theorem 6.2.9 as a particular case of part (a).
(c) 2 Prove that each n ∈ N satisfies

    Σ_{σ ∈ Sn} q^{|Fix σ|} = Σ_{k=0}^{n} (n!/k!) · (q − 1)^k

in the polynomial ring Z [q]. (See Definition A.4.2 for the definition of Fix σ.)
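Since both sides are polynomials in q of degree at most n, it suffices to compare them at enough integer values of q; a brute-force sketch (names mine):

```python
from itertools import permutations
from math import factorial

def fix_count(perm):
    """|Fix σ| for a 0-indexed one-line tuple."""
    return sum(1 for i, v in enumerate(perm) if v == i)

def lhs(n, q):
    """Σ_{σ ∈ S_n} q^{|Fix σ|}."""
    return sum(q ** fix_count(p) for p in permutations(range(n)))

def rhs(n, q):
    """Σ_{k=0}^{n} (n!/k!) (q − 1)^k."""
    return sum(factorial(n) // factorial(k) * (q - 1) ** k for k in range(n + 1))
```

At q = 1 both sides collapse to n!, and at q = 0 the left side counts derangements.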
Next comes another counting problem that can be solved in many ways:
    = ( (|A| − 1)^n + (−1)^n (|A| − 1) ) / |A| .
Exercise A.5.2.21. 5 Solve Exercise A.2.14.3 (b) again using the Principle of
Inclusion and Exclusion.
A.5.3. Determinants
We fix a commutative ring K.
    det ( A1,1 b1   A1,2   ···   A1,n )       ( A1,1   A1,2 b1   ···   A1,n )
        ( A2,1 b2   A2,2   ···   A2,n ) + det ( A2,1   A2,2 b2   ···   A2,n )
        (   ⋮          ⋮     ⋱     ⋮  )       (   ⋮        ⋮      ⋱      ⋮  )
        ( An,1 bn   An,2   ···   An,n )       ( An,1   An,2 bn   ···   An,n )

                    ( A1,1   A1,2   ···   A1,n b1 )
        + ··· + det ( A2,1   A2,2   ···   A2,n b2 )
                    (   ⋮       ⋮     ⋱       ⋮   )
                    ( An,1   An,2   ···   An,n bn )

                                    ( A1,1   A1,2   ···   A1,n )
        = (b1 + b2 + ··· + bn) det  ( A2,1   A2,2   ···   A2,n ) .
                                    (   ⋮       ⋮     ⋱     ⋮  )
                                    ( An,1   An,2   ···   An,n )
    Σ_{σ ∈ Sn is a derangement} (−1)^σ = (−1)^{n−1} (n − 1) .
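This signed count is easy to confirm by brute force for small n; a sketch (names mine):

```python
from itertools import permutations

def sign_of(perm):
    """(−1)^σ, computed by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def derangement_sign_sum(n):
    """Σ (−1)^σ over all derangements σ ∈ S_n (no fixed points)."""
    return sum(
        sign_of(p)
        for p in permutations(range(n))
        if all(p[i] != i for i in range(n))
    )
```

For instance, the two derangements of [3] are both 3-cycles of sign +1, giving 2 = (−1)² · 2.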
for any x ∈ K. (See Definition A.4.2 (b) for the definition of |Fix σ|.)
(c) 3 Prove that

    Σ_{σ ∈ Sn} (−1)^σ / (|Fix σ| + 1) = (−1)^{n+1} · n / (n + 1) .
[Table (311): the eight tournaments with vertex set [3], each drawn on the vertices 1, 2, 3; drawings not reproduced.]    (311)
(Note that “simple graph” implies that any arc is merely a pair of two distinct
vertices; thus, in particular, there are no arcs of the form (i, i ).)
Fix n ∈ N. Let T be the set of all tournaments with vertex set [n]. It is easy
to see that | T | = 2n(n−1)/2 .
For any permutation σ ∈ Sn , we define Pσ ∈ T to be the tournament with
vertex set [n] and with arcs
(For example, in the above table (311) of tournaments with vertex set [3],
the first tournament is Pid , while the second tournament is Ps2 .)
We define the scoreboard scb D of a tournament D ∈ T to be the n-tuple
(s1 , s2 , . . . , sn ) ∈ Nn , where
(c) 2 Prove that

    ∏_{1≤j<i≤n} (a_i − a_j) = Σ_{D ∈ T} w (D) .

(d) 2 Prove that

    det( (a_i^{j−1})_{1≤i≤n, 1≤j≤n} ) = Σ_{D ∈ T; scb D is injective} w (D) .
[Hint: Part (a) can be done in many ways, but the simplest is probably by
factoring the matrix p j ( ai ) 1≤i≤n, 1≤ j≤n as a product.]
    det( ( (a_i choose j−1) )_{1≤i≤n, 1≤j≤n} ) = ∏_{1≤j<i≤n} (a_i − a_j) / H (n) .

(b) 1 Conclude that H (n) | ∏_{1≤j<i≤n} (a_i − a_j) for any n integers
a1 , a2 , . . . , an .

(c) 1 Prove that H (n) = ∏_{1≤j<i≤n} (i − j) .
    det( ( (a_i + j choose j − 1) )_{1≤i≤n, 1≤j≤n} ) = ∏_{1≤j<i≤n} (a_i − a_j) / H (n) .
    det( ( 1/(a_i + j)! )_{1≤i≤n, 1≤j≤n} ) = ∏_{1≤i<j≤n} (a_i − a_j) / ∏_{i=1}^{n} (a_i + n)! .
Combining parts (b) and (c) of Exercise A.5.3.9, we see that any n integers
a1 , a2 , . . . , an satisfy
    ∏_{1≤j<i≤n} (i − j)  |  ∏_{1≤j<i≤n} (a_i − a_j) .
In other words, the product of the pairwise differences between n given inte-
gers a1 , a2 , . . . , an is always divisible by the product of the pairwise differences
between 1, 2, . . . , n. This curious fact is one of the beginnings of Bhargava’s the-
ory of generalized factorials ([Bharga00]). We will not elaborate further on this
theory here, but let us point out that our curious fact has a similarly curious
analogue, in which differences are replaced by differences of squares, and the
numbers 1, 2, . . . , n are replaced by 0, 1, . . . , n − 1. This analogue, too, can be
proved using determinants:
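Before the squares analogue, here is a quick numerical check of the divisibility just stated (the sample values and names are mine):

```python
from math import prod
from random import randint, seed

def diff_product(nums):
    """∏_{j<i} (nums[i] − nums[j]) over all index pairs j < i."""
    return prod(nums[i] - nums[j]
                for j in range(len(nums)) for i in range(j + 1, len(nums)))

n = 5
H = diff_product(list(range(1, n + 1)))  # H(n) = ∏_{1≤j<i≤n} (i − j) = 1!·2!·3!·4! = 288

seed(0)
samples = [[randint(-50, 50) for _ in range(n)] for _ in range(200)]
```

Every random sample of 5 integers passes, in line with the theorem.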
(c) 1 We have (a choose k)′ = (1/(2k)!) · ∏_{r=0}^{k−1} (a² − r²) for each a ∈ R and k ∈ N.

(d) 3 For any n integers a1 , a2 , . . . , an , we have

    det( ( (a_i choose j−1)′ )_{1≤i≤n, 1≤j≤n} ) = ∏_{1≤j<i≤n} (a_i² − a_j²) / H′ (n) .

(e) 1 For any n integers a1 , a2 , . . . , an , we have H′ (n) | ∏_{1≤j<i≤n} (a_i² − a_j²) .

(f) 1 Conclude that ∏_{0≤j<i≤n−1} (i² − j²) | ∏_{1≤j<i≤n} (a_i² − a_j²) for any n
integers a1 , a2 , . . . , an .

(g) 1 Is it true that ∏_{0≤j<i≤n−1} (i³ − j³) | ∏_{1≤j<i≤n} (a_i³ − a_j³) for any n integers
a1 , a2 , . . . , an ?
[Hint: In part (d), use Exercise A.5.3.7 (c) again, but this time apply it to
a21 , a22 , . . . , a2n instead of a1 , a2 , . . . , an .]
The following exercise gives some variations on Proposition 6.4.34:
Exercise A.5.3.11. Let n be a positive integer. Let x1 , x2 , . . . , xn be n elements
of K. Let y1 , y2 , . . . , yn be n elements of K.
(a) 2 For every m ∈ {0, 1, . . . , n − 2}, prove that

    det( ( (x_i + y_j)^m )_{1≤i≤n, 1≤j≤n} ) = 0 .
    e_r [z1 , z2 , . . . , zn] := Σ_{(i1 ,i2 ,...,ir )∈[n]^r ; i1 <i2 <···<ir} z_{i1} z_{i2} ··· z_{ir}
Prove that

                                        ( z0       z1   ···   z_{n−1}  )
    det( (z_{i+j−2})_{1≤i≤n, 1≤j≤n} ) = det ( z1       z2   ···   zn       )
                                        (  ⋮        ⋮    ⋱      ⋮      )
                                        ( z_{n−1}  zn   ···   z_{2n−2} )

    = u1 u2 ··· un · ∏_{1≤i<j≤n} (a_i − a_j)² .
for n = 4, such a matrix has the form

    ( ∗ 1 0 0 )
    ( ∗ ∗ 1 0 )
    ( ∗ ∗ ∗ 1 ) ,
    ( ∗ ∗ ∗ ∗ )

where each asterisk ∗ stands for an arbitrary entry.)
For each subset I of [n − 1], we define an element p_I (A) ∈ K as follows:
Write the subset I in the form {i1 , i2 , . . . , ik} with i1 < i2 < ··· < ik. Addi-
tionally, set i0 := 0 and i_{k+1} := n. Then, set

    p_I (A) := A_{i1, i0+1} A_{i2, i1+1} ··· A_{i_{k+1}, i_k+1} = ∏_{u=1}^{k+1} A_{i_u, i_{u−1}+1} .

Prove that

    det A = Σ_{I ⊆ [n−1]} (−1)^{n−1−|I|} p_I (A) .
Our next exercise is concerned with tridiagonal matrices. These are matrices
whose all entries are zero except for those on the diagonal and “its neighbors”.
For example, a 3 × 3-matrix is tridiagonal if it has the form

    ( ∗ ∗ 0 )
    ( ∗ ∗ ∗ )
    ( 0 ∗ ∗ )

(where each asterisk ∗ means an arbitrary entry).
    det A / det (A_{:n−1})
      = a_n − b_{n−1} c_{n−1} / ( a_{n−1} − b_{n−2} c_{n−2} / ( a_{n−2} − ··· − b_2 c_2 / ( a_2 − b_1 c_1 / a_1 ) ··· ) ) .
Thus,

        ( a     n−1   0     ···   0     0     0   )
        ( 1     a     n−2   ···   0     0     0   )
        ( 0     2     a     ···   0     0     0   )
    A = ( ⋮     ⋮     ⋮     ⋱     ⋮     ⋮     ⋮   ) .
        ( 0     0     0     ···   a     2     0   )
        ( 0     0     0     ···   n−2   a     1   )
        ( 0     0     0     ···   0     n−1   a   )
(a) 6 Prove that

    det A = ∏_{k=0}^{n−1} (a − 2k + n − 1) .
(This particular matrix A, for a = 0, is called the Kac matrix; thus, our formula
for det A computes its characteristic polynomial.)
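The product formula is easy to confirm numerically with exact arithmetic; a sketch (names mine; superdiagonal n−1, …, 1 and subdiagonal 1, …, n−1 as in the display above):

```python
from fractions import Fraction

def kac_det(n, a):
    """Determinant of the n×n matrix with a on the diagonal, n−1, n−2, ..., 1 on
    the superdiagonal and 1, 2, ..., n−1 on the subdiagonal, by Gaussian elimination."""
    A = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = Fraction(a)
        if i + 1 < n:
            A[i][i + 1] = Fraction(n - 1 - i)  # superdiagonal
            A[i + 1][i] = Fraction(i + 1)      # subdiagonal
    det = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            det = -det
        det *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    return det
```

For example, n = 4 and a = 7 gives 10 · 8 · 6 · 4 = 1920.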
(b) ? Can you find a bijective proof using the formula in Exercise A.5.3.15
(b)?
Another variation on the Vandermonde determinant:
(d) 2 Show that the Catalan sequence (c0 , c1 , c2 , . . .) is the only sequence
(a0 , a1 , a2 , . . .) of real numbers that satisfies

    det( (a_{i+j−2})_{1≤i≤k, 1≤j≤k} ) = det( (a_{i+j−1})_{1≤i≤k, 1≤j≤k} ) = 1

for all k ∈ N.

(e) 3 Compute det( (c_{i+j})_{1≤i≤k, 1≤j≤k} ) for each k ∈ N.
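The two Hankel determinants in part (d) can be tested directly for small k; a sketch with exact rational elimination (names mine):

```python
from fractions import Fraction
from math import comb

def catalan(n):
    """The n-th Catalan number c_n = (2n choose n)/(n+1)."""
    return comb(2 * n, n) // (n + 1)

def det(M):
    """Exact determinant by Gaussian elimination over Q."""
    A = [[Fraction(x) for x in row] for row in M]
    n = len(A)
    d = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            d = -d
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    return d

def hankel(shift, k):
    """The k×k Catalan Hankel matrix (c_{i+j+shift})_{0≤i,j≤k−1}."""
    return [[catalan(i + j + shift) for j in range(k)] for i in range(k)]
```

Here hankel(0, k) and hankel(1, k) are the two matrices of part (d) (both with determinant 1), and hankel(2, k) is the matrix of part (e).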
(b) 5 Let us use the notations from Definition 4.4.3. Prove that

    Σ_{σ ∈ Sn; Des σ = I} q^{ℓ(σ)} = det( ( ( n − c_{i−1} choose c_j − c_{i−1} )_q )_{1≤i≤k+1, 1≤j≤k+1} )

in the ring Z [q] .
[Hint: One way to approach this is by observing that the matrices in ques-
tion are transposes of normalized lower Hessenberg matrices as in Exercise
A.5.3.13. Another is to apply the LGV lemma (in its weighted version for part
(b)) to the (k + 1)-tuples of vertices A = ( A1 , A2 , . . . , A_{k+1} ) and B = ( B1 , B2 , . . . , B_{k+1} )
defined by
[Figure: the points A1 , A2 , A3 , A4 and B1 , B2 , B3 , B4 in the plane (with labels 0, 1, 0, 4, 5, 1 on the picture); drawing not reproduced.]
The next exercise sketches out a visual proof of the Cauchy–Binet formula
(Theorem 6.4.18) using the LGV lemma:
[Figure: the digraph D, whose vertices include 1′ , 2′ , 3′ , 4′ and 1′′ , 2′′ ; drawing not reproduced.]
(a) 1 Prove that this digraph D is acyclic.
Now, for each arc a of D, we define a weight w ( a) ∈ K as follows:
• If a is the arc i → j′ for some i ∈ [n] and j ∈ [m], then we set w ( a) :=
Ai,j .
• If a is the arc i′ → j′′ for some i ∈ [m] and j ∈ [n], then we set w ( a) :=
Bi,j .
    Σ_{σ ∈ Sn} (−1)^σ Σ_{p is a nipat from A to σ(B)} w (p)
      = Σ_{(g1 ,g2 ,...,gn )∈[m]^n ; g1 <g2 <···<gn} det( cols_{g1 ,g2 ,...,gn} A ) · det( rows_{g1 ,g2 ,...,gn} B ) ,
Exercise A.5.4.5. Consider the situation of Theorem 6.5.14. Assume that our
digraph D has vertex set [n] for some n ∈ N. Let E be the set of all arcs of D.
Let M ∈ K n×n be the n × n-matrix whose (i, j)-th entry is given by
(Note that if D is a simple digraph, then the sum on the right hand side of
this equality has at most one addend.)
(a) 2 Prove that each k ∈ N satisfies

    M^k = ( Σ_{p: i→j is a path with k steps} w (p) )_{1≤i≤n, 1≤j≤n} .
and

    h_n [x1 , x2 , . . . , xN] = Σ_{i=0}^{n} h_i [x1 , x2 , . . . , xM] · h_{n−i} [x_{M+1} , x_{M+2} , . . . , xN]
p n [ x 1 , x 2 , . . . , x N ] = p n [ x 1 , x 2 , . . . , x M ] + p n [ x M +1 , x M +2 , . . . , x N ] .
Exercise A.6.1.3. 5 Finish our proof of Theorem 7.1.12 by proving the re-
maining two Newton–Girard formulas (262) and (263).
in the ring Z [q], where we are using the notation of Definition 4.4.3.
(c) 2 Recover a nontrivial identity between q-binomial coefficients by sub-
stituting q0 , q1 , . . . , q N −1 into an identity between symmetric polynomials.
(There are several valid answers here.)
(d) 3 For any m, k ∈ N, the unsigned Stirling number of the 1st kind c (m, k ) ∈
N is defined to be the # of all permutations σ ∈ Sm that have exactly k cycles
(see Definition 5.5.4 (a)). Prove that
en [1, 2, . . . , N ] = c ( N + 1, N + 1 − n) .
(e) 3 For any m, k ∈ N, the Stirling number of the 2nd kind S (m, k ) ∈ N is
defined to be the # of all set partitions of the set [m] into k parts (i.e., the # of
sets {U1 , U2 , . . . , Uk } consisting of k disjoint nonempty subsets U1 , U2 , . . . , Uk
of [m] such that U1 ∪ U2 ∪ · · · ∪ Uk = [m]). Prove that
hn [1, 2, . . . , N ] = S ( N + n, N ) .
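Both Stirling-number identities are quick to confirm for small N by brute force; a sketch (names mine, with the 2nd-kind numbers computed by the standard recurrence S(m, k) = k·S(m−1, k) + S(m−1, k−1)):

```python
from itertools import combinations, combinations_with_replacement, permutations
from math import prod

def e_sym(vals, n):
    """e_n evaluated at the given numbers (e_0 = 1)."""
    return sum(prod(c) for c in combinations(vals, n))

def h_sym(vals, n):
    """h_n evaluated at the given numbers (h_0 = 1)."""
    return sum(prod(c) for c in combinations_with_replacement(vals, n))

def num_cycles(perm):
    seen, count = set(), 0
    for s in range(len(perm)):
        if s not in seen:
            count += 1
            j = s
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return count

def c_stirling(m, k):
    """Unsigned Stirling number of the 1st kind: # of σ ∈ S_m with exactly k cycles."""
    return sum(1 for p in permutations(range(m)) if num_cycles(p) == k)

def s_stirling(m, k):
    """Stirling number of the 2nd kind, via the standard recurrence."""
    if m == 0:
        return 1 if k == 0 else 0
    if k == 0:
        return 0
    return k * s_stirling(m - 1, k) + s_stirling(m - 1, k - 1)
```

For instance, e_2 [1, 2, 3, 4] = 35 = c (5, 3), and h_2 [1, 2, 3, 4] = 65 = S (6, 4).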
Exercise A.6.1.6. For each n ∈ N and each positive integer k, we define the
(n, k)-th Petrie symmetric polynomial g_{k,n} ∈ S by

    g_{k,n} = Σ_{(i1 ,i2 ,...,in )∈[N]^n ; i1 ≤i2 ≤···≤in ; no k of the numbers i1 ,i2 ,...,in are equal} x_{i1} x_{i2} ··· x_{in} .

(The sum on the right hand side is well-defined, since h_{n−ki} = 0 whenever
i > n/k.)
(e) 3 Set

    c_{i,j} := { 2, if i ≡ j mod 3;  −1, if i ̸≡ j mod 3 }    for any i, j ∈ Z.
(This family begins with Q1 = x1 and Q2 = x12 − 2x2 and Q3 = x13 − 3x1 x2 +
3x3 .)
(c) 3 Express the polynomials Pn explicitly as determinants of certain
matrices.
(d) 3 Express the polynomials Qn explicitly as determinants of certain
matrices.
(e) 1 What is the coefficient of xi in the polynomial Pi ?
(f) 1 What is the coefficient of xi in the polynomial Qi ?
    Σ_{c : V → [N] is a proper N-coloring} ∏_{v∈V} x_{c(v)} ∈ P .
    Σ_{c : [3] → [N]; c(1)̸=c(2); c(2)̸=c(3)} ∏_{v∈[3]} x_{c(v)}
      = Σ_{i,j,k∈[N]; i̸=j; j̸=k} x_i x_j x_k
      = Σ_{i,j∈[N]; i̸=j} x_i² x_j + Σ_{i,j,k∈[N]; i,j,k are distinct} x_i x_j x_k
        (the first sum here is p2 e1 − p3, and the second is 6e3)
      = p2 e1 − p3 + 6e3
      = e2 e1 + 3e3 (by some computation)
      = p1³ − 2p1 p2 + p3 (by some computation) .
    X_G = Σ_{F⊆E} (−1)^{|F|} ∏_{C is a connected component of the graph with vertex set V and edge set F} p_{|C|} .

    X_G (1, 1, . . . , 1) = χ_G (N)    (with N ones being substituted).
    Σ_{c : V → [N] is a proper N-coloring} ∏_{v∈V} x_{c(v)}^{w(v)} ∈ P .
Stag w := {i ∈ [n − 1] | wi = wi+1 } ;
For instance, the 7-tuple (2, 2, 4, 1, 4, 4, 2) has descent set {3, 6} and stagnation
set {1, 5}.
(a) 1 Fix s ∈ N. Prove that

    Σ_{w∈[N]^n ; |Stag w|=s} x_w ∈ S .

    Σ_{w∈[N]^n ; |Des w|=d; |Stag w|=s} x_w ∈ S .

    Σ_{w∈[N]^n ; |Des w|=d; |Stag w|=s; w1 <wn} x_w ,   Σ_{w∈[N]^n ; |Des w|=d; |Stag w|=s; w1 =wn} x_w ,   Σ_{w∈[N]^n ; |Des w|=d; |Stag w|=s; w1 >wn} x_w

all belong to S .
    Σ_{k=1}^{N} ( x_k ∏_{i∈[N]\{k}} (x_k + x_i) ) / ( ∏_{i∈[N]\{k}} (x_k − x_i) ) = x1 + x2 + ··· + xN .

    Σ_{k=1}^{N} ( ∏_{i∈[N]\{k}} (x_k + x_i) ) / ( ∏_{i∈[N]\{k}} (x_k − x_i) ) = { 0, if N is even;  1, if N is odd.
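Both identities can be verified exactly at sample points with rational arithmetic; a sketch (names and sample values mine):

```python
from fractions import Fraction
from math import prod

def ratio_sum(xs, include_xk):
    """Σ_{k} (x_k if include_xk else 1) · ∏_{i≠k}(x_k + x_i) / ∏_{i≠k}(x_k − x_i)
    over k = 1..N, for distinct numbers xs."""
    N = len(xs)
    total = Fraction(0)
    for k in range(N):
        numerator = prod(xs[k] + xs[i] for i in range(N) if i != k)
        denominator = prod(xs[k] - xs[i] for i in range(N) if i != k)
        factor = xs[k] if include_xk else 1
        total += factor * Fraction(numerator, denominator)
    return total
```

With the x_k factor the sum collapses to x1 + ··· + xN; without it, to 0 or 1 according to the parity of N.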
    h_p [x_i , x_{i+1} , . . . , xN] = Σ_{t=0}^{i−1} (−1)^t e_t [x1 , x2 , . . . , x_{i−1}] · h_{p−t} .

(The "h_{p−t}" at the end of the right hand side means h_{p−t} [x1 , x2 , . . . , xN].)
    ∂e_j / ∂x_i = e_{j−1} [x1 , x2 , . . . , x̂i , . . . , xN] ,

where the hat over the "xi" means "omit the xi entry" (that
is, the expression "x1 , x2 , . . . , x̂i , . . . , xN" is to be understood as
"x1 , x2 , . . . , x_{i−1} , x_{i+1} , x_{i+2} , . . . , xN").
(b) 3 Prove that

    det( ( ∂e_j / ∂x_i )_{1≤i≤N, 1≤j≤N} ) = ∏_{1≤i<j≤N} (x_i − x_j) .
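Using part (a) to fill in the matrix entries, part (b) can be checked exactly at sample points; a sketch (names mine):

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def e_val(vals, r):
    """The elementary symmetric polynomial e_r evaluated at vals (e_0 = 1)."""
    return sum(prod(c) for c in combinations(vals, r))

def jacobian_det(xs):
    """det((∂e_j/∂x_i)) with ∂e_j/∂x_i = e_{j−1}(x1, ..., x̂i, ..., xN), per part (a)."""
    N = len(xs)
    M = [[Fraction(e_val(xs[:i] + xs[i + 1:], j)) for j in range(N)]
         for i in range(N)]
    det = Fraction(1)
    for c in range(N):
        pivot = next((r for r in range(c, N) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            det = -det
        det *= M[c][c]
        for r in range(c + 1, N):
            f = M[r][c] / M[c][c]
            for col in range(c, N):
                M[r][col] -= f * M[c][col]
    return det
```

For N = 2 and (x1, x2) = (3, 1) this gives 3 − 1 = 2, as the Vandermonde product predicts; repeated values force determinant 0, matching the vanishing of ∏ (x_i − x_j).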
[Hint: For part (c), it is helpful to first prove the following more general
result:
Let u1 , u2 , . . . , u N be any N polynomials in P . Let v1 , v2 , . . . , v N be any
N polynomials in the polynomial ring K [y1 , y2 , . . . , y N ] (in N indeterminates
This fact is a generalization of the chain rule and can be proved, e.g., by
decomposing v j into monomials and considering each monomial separately.
Now, use parts (a) and (e) of Exercise A.6.1.7 to apply this to our setting.]
    m_λ [x1 , x2 , . . . , xN]
      = Σ_{μ is an M-partition; ν is an (N−M)-partition; μ⊔ν=λ} m_μ [x1 , x2 , . . . , xM] · m_ν [x_{M+1} , x_{M+2} , . . . , xN] .
for each j ∈ [ N ].
Prove that
    ∏_{1≤i<j≤N} (x_i + x_j) = Σ_{λ is an N-partition of N(N−1)/2} t_λ m_λ ,
The next exercise answers a rather natural question about the definition of a
semistandard tableau (Definition 7.3.6): What happens when one replaces “in-
crease strictly down each column” by “increase weakly down each column”?
This replacement gives rise to a more liberal notion of semistandard tableaux
(which I call “semi-semistandard tableaux”), and to a variant of the Schur poly-
nomial that correspondingly has more terms. As the following exercise shows,
alas, this new polynomial is rarely ever symmetric:
    ŝ_λ := Σ_{T∈SSSYT(λ)} x_T ,
The next exercise asks you to explore some possible (or impossible) variations
    s_ρ = ∏_{1≤i<j≤N} (x_i + x_j) .
    p_n = Σ_{i=0}^{min{n,N}−1} (−1)^i s_{Q(i)} ,
    s_{λ/μ, α/β} := Σ_{T∈SSYT(λ/μ); α_i < T(i,j) ≤ β_i for all (i,j)∈Y(λ/μ)} x_T ∈ P .
(This is the same sum as sλ/µ , but restricted to those semistandard tableaux
whose entries in the i-th row belong to the half-open interval (αi , β i ] for each
i ∈ [ M ]. This polynomial is known as a (row-)flagged Schur polynomial; in
general, it is not symmetric.) Then,
    s_{λ/μ, α/β} = det( ( h_{λ_i − μ_j − i + j} [x_{α_j +1} , x_{α_j +2} , . . . , x_{β_i}] )_{1≤i≤M, 1≤j≤M} ) .
each monomial x1^{a1} x2^{a2} ··· xN^{aN} to

    Σ_{i∈[N]; a_i >0} a_i x1^{a1} x2^{a2} ··· x_{i−1}^{a_{i−1}} x_i^{a_i −1} x_{i+1}^{a_{i+1}} x_{i+2}^{a_{i+2}} ··· xN^{aN}

(each addend being just the monomial x1^{a1} x2^{a2} ··· xN^{aN}, with the exponent on x_i decreased by 1).
• the sum ranges over all N-partitions µ such that the Young diagram
Y (µ) can be obtained from Y (λ) by removing a single box (without
shifting the remaining boxes);
Exercise A.6.3.17. Let I be the ideal of the ring P generated by the N poly-
nomials e1 , e2 , . . . , e N .
(a) 2 Prove that any homogeneous symmetric polynomial f ∈ S of posi-
tive degree is contained in I.
(b) 9 Prove that the quotient ring P /I is a free K-module with basis

    ( x1^{a1} x2^{a2} ··· xN^{aN} )_{(a1 ,a2 ,...,aN )∈ HN} ,
where the N!-element set HN is defined as in Definition 5.3.6 (c) (for n = N).
(Here, the notation f denotes the projection of a polynomial f ∈ P onto the
quotient ring P /I.)
(c) 5 More generally: Let u1 , u2 , . . . , uN be N polynomials in P such that
deg u_i < i for each i ∈ [N]. Let I′ be the ideal of P generated by the N polynomials
e1 − u1 , e2 − u2 , . . . , eN − uN .
where the N!-element set HN is defined as in Definition 5.3.6 (c) (for n = N).
(Here, the notation f denotes the projection of a polynomial f ∈ P onto the
quotient ring P /I ′ .)
B.1. x^n-equivalence
Detailed proof of Theorem 3.10.3. (a) We claim the following:

Claim 1: We have f ≡ f for each f ∈ K [[x]] (where "≡" denotes x^n-equivalence throughout this proof).

[Proof of Claim 2: Let f , g, h ∈ K [[x]] be three FPSs satisfying f ≡ g and g ≡ h. We
must show that f ≡ h.
We have f ≡ g and g ≡ h. Hence, each m ∈ {0, 1, . . . , n} satisfies

    [x^m] f = [x^m] g (by (313)) = [x^m] h (by (314)) .

In other words, we have f ≡ h (by Definition 3.10.1). This proves Claim 2.]

Claim 3: If two FPSs f , g ∈ K [[x]] satisfy f ≡ g, then g ≡ f .

[Proof of Claim 3: Let f , g ∈ K [[x]] be two FPSs satisfying f ≡ g. We must show that
g ≡ f .
We have f ≡ g. In other words,
    [x^m] (a + c) = [x^m] a + [x^m] c    (317)

(by (20), applied to m, a and c instead of n, a and b) and

    [x^m] (b + d) = [x^m] b + [x^m] d    (318)

(by (20), applied to m, b and d instead of n, a and b). Hence, every m ∈ {0, 1, . . . , n}
satisfies

    [x^m] (a + c) = [x^m] a + [x^m] c = [x^m] b + [x^m] d = [x^m] (b + d)

(by (317); then (315) and (316); then (318)).
In other words, a + c ≡ b + d (by Definition 3.10.1). Thus, we have proved (99). The
same argument (but with all "+" signs replaced by "−" signs, and with all references
to (20) replaced by references to (21)) can be used to prove (100). It remains to prove
(101).
Let m ∈ {0, 1, . . . , n}. Then, m ≤ n.
Now, let i ∈ {0, 1, . . . , m}. Then, i ∈ {0, 1, . . . , m} ⊆ {0, 1, . . . , n} (since m ≤ n).
Hence, (315) (applied to i instead of m) yields [x^i] a = [x^i] b. Furthermore, from i ∈
{0, 1, . . . , m}, we obtain m − i ∈ {0, 1, . . . , m} ⊆ {0, 1, . . . , n} (since m ≤ n). Hence, (316)
(applied to m − i instead of m) yields [x^{m−i}] c = [x^{m−i}] d. Multiplying the equalities
[x^i] a = [x^i] b and [x^{m−i}] c = [x^{m−i}] d, we obtain

    [x^i] a · [x^{m−i}] c = [x^i] b · [x^{m−i}] d .    (319)
Forget that we fixed i. We thus have proved (319) for each i ∈ {0, 1, . . . , m}. Now,
(22) (applied to m, a and c instead of n, a and b) yields

    [x^m] (ac) = Σ_{i=0}^{m} [x^i] a · [x^{m−i}] c = Σ_{i=0}^{m} [x^i] b · [x^{m−i}] d    (by (319)) .
    [x^m] (λa) = λ · [x^m] a    (321)

(by (25), applied to m and a instead of n and a) and

    [x^m] (λb) = λ · [x^m] b    (322)

(by (25), applied to m and b instead of n and a). Now, each m ∈ {0, 1, . . . , n} satisfies
[x^m] (λa) = λ · [x^m] a = λ · [x^m] b = [x^m] (λb) (by (321), the hypothesis a ≡ b, and (322)).
In other words, λa ≡ λb (by Definition 3.10.1). This proves Theorem 3.10.3 (c).
(d) Let a, b ∈ K [[x]] be two FPSs satisfying a ≡ b. We have a ≡ b. In other words,
Induction step: Fix some k ∈ {0, 1, . . . , n}. We assume (as an induction hypothesis)
that (324) is true for any m < k. In other words, for any m ∈ {0, 1, . . . , n} satisfying
m < k, we have
[ x m ] a −1 = [ x m ] b −1 . (325)
We shall now prove that (324) is true for m = k. In other words, we shall prove that

    [x^k] a^{−1} = [x^k] b^{−1} .

Proposition 3.3.7 shows that the FPS a is invertible in K [[x]] if and only if its constant
term [x^0] a is invertible in K. Hence, its constant term [x^0] a is invertible in K (since a
(here, we have split off the addend for i = 0 from the sum). Thus,

    [x^0] a · [x^k] a^{−1} + Σ_{i=1}^{k} [x^i] a · [x^{k−i}] a^{−1} = [x^k] (aa^{−1}) = [x^k] 1

(since aa^{−1} = 1), so that

    [x^0] a · [x^k] a^{−1} = [x^k] 1 − Σ_{i=1}^{k} [x^i] a · [x^{k−i}] a^{−1} .

We can divide this equality by [x^0] a (since [x^0] a is invertible in K), and thus obtain

    [x^k] a^{−1} = ([x^0] a)^{−1} · ( [x^k] 1 − Σ_{i=1}^{k} [x^i] a · [x^{k−i}] a^{−1} ) .    (326)
to m = 0).
In other words, (324) is true for m = k. This completes the induction step. Thus, (324)
is proved.]
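The recursion (326) is also a practical algorithm for computing the coefficients of a^{−1}; a sketch (names mine, coefficients kept as exact rationals):

```python
from fractions import Fraction

def inverse_coeffs(a, n):
    """First n+1 coefficients of a^{−1} for an FPS a given by its coefficient list
    [a_0, a_1, ...] (with a_0 invertible), via the recurrence (326):
    [x^k] a^{−1} = a_0^{−1} · ([x^k] 1 − Σ_{i=1}^{k} a_i · [x^{k−i}] a^{−1})."""
    coeff = lambda i: Fraction(a[i]) if i < len(a) else Fraction(0)
    a0_inv = 1 / coeff(0)
    inv = []
    for k in range(n + 1):
        rhs = (1 if k == 0 else 0) - sum(coeff(i) * inv[k - i] for i in range(1, k + 1))
        inv.append(a0_inv * rhs)
    return inv
```

For instance, the inverse of 1 − x is 1 + x + x² + ···, and convolving a with the computed coefficients reproduces the FPS 1 up to any desired order.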
Thus, we have shown that each m ∈ {0, 1, . . . , n} satisfies [x^m] a^{−1} = [x^m] b^{−1}. In
other words, a^{−1} ≡ b^{−1} (by Definition 3.10.1). This proves Theorem 3.10.3 (d).
(e) Let a, b, c, d ∈ K [[x]] be four FPSs satisfying a ≡ b and c ≡ d. Assume that the
FPSs c and d are invertible. Then, Theorem 3.10.3 (d) (applied to c and d instead of
a and b) yields c^{−1} ≡ d^{−1}. Hence, (101) (applied to c^{−1} and d^{−1} instead of c and d)
yields ac^{−1} ≡ bd^{−1}. In other words, a/c ≡ b/d (since ac^{−1} = a/c and bd^{−1} = b/d). This proves
Theorem 3.10.3 (e).
(f) Let us first prove (104):
[Proof of (104): We proceed by induction on |S|:
Induction base: It is easy to see that (104) holds for |S| = 0 162 .
Induction step: Let k ∈ N. Assume (as the induction hypothesis) that (104) holds for
|S| = k. We must prove that (104) holds for |S| = k + 1.
162 Proof. Let S, (a_s)_{s∈S} and (b_s)_{s∈S} be as in Theorem 3.10.3 (f), and assume that |S| = 0. From
|S| = 0, we obtain S = ∅. Hence,

    Σ_{s∈S} a_s = (empty sum) = 0    and likewise    Σ_{s∈S} b_s = (empty sum) = 0 .

Comparing these two equalities, we obtain Σ_{s∈S} a_s = Σ_{s∈S} b_s. Hence, Σ_{s∈S} a_s ≡ Σ_{s∈S} b_s (since the
relation ≡ is an equivalence relation and thus is reflexive). Thus, we have proved (104)
under the assumption that |S| = 0.
So let S, (a_s)_{s∈S} and (b_s)_{s∈S} be as in Theorem 3.10.3 (f), and assume that |S| = k + 1.
Then, |S| = k + 1 > k ≥ 0, so that the set S is nonempty. In other words, there exists
some t ∈ S. Consider this t. Each s ∈ S \ {t} satisfies s ∈ S \ {t} ⊆ S and thus a_s ≡ b_s
(by (103)). Moreover, from t ∈ S, we obtain |S \ {t}| = |S| − 1 = k (since |S| = k + 1).
Hence, we can apply (104) to S \ {t} instead of S (since our induction hypothesis says
that (104) holds for |S| = k). As a result, we obtain

    Σ_{s∈S\{t}} a_s ≡ Σ_{s∈S\{t}} b_s .

On the other hand, a_t ≡ b_t (by (103), applied to s = t). Hence, (99) (applied to a = a_t
and b = b_t and c = Σ_{s∈S\{t}} a_s and d = Σ_{s∈S\{t}} b_s) yields

    a_t + Σ_{s∈S\{t}} a_s ≡ b_t + Σ_{s∈S\{t}} b_s .
In view of

    Σ_{s∈S} a_s = a_t + Σ_{s∈S\{t}} a_s    (here, we have split off the addend for s = t from the sum)

and

    Σ_{s∈S} b_s = b_t + Σ_{s∈S\{t}} b_s    (here, we have split off the addend for s = t from the sum),

this rewrites as

    Σ_{s∈S} a_s ≡ Σ_{s∈S} b_s .
Hence, we have shown that (104) holds for |S| = k + 1. This completes the induction
step. Thus, the induction proof of (104) is complete.]
We have now proved (104). The exact same argument (but with all sums replaced by
products, and with the reference to (99) replaced by a reference to (101)) can be used
to prove (105). Hence, the proof of Theorem 3.10.3 (f) is complete.
    [x^m] (f − g) = [x^m] f − [x^m] g    (330)
[Proof of Claim 1: Let S be the set {1, 2, . . . , i}. This set S is finite, and satisfies |S| = i.
Moreover, we have c ≡ d for each s ∈ S (by assumption). Hence, (105) (applied to
a_s = c and b_s = d) yields

    ∏_{s∈S} c ≡ ∏_{s∈S} d .

In view of ∏_{s∈S} c = c^i and ∏_{s∈S} d = d^i, we can rewrite this as c^i ≡ d^i. In other words,

    each m ∈ {0, 1, . . . , n} satisfies [x^m] (c^i) = [x^m] (d^i)    (337)

(by the definition of "c^i ≡ d^i").
Now, let m ∈ {0, 1, . . . , n}. Then, (25) (applied to m, a_i and c^i instead of n, λ and
a) yields [x^m] (a_i c^i) = a_i · [x^m] (c^i). Similarly, [x^m] (b_i d^i) = b_i · [x^m] (d^i). On the other
hand,

    [x^m] (c^i) = 0    (338)

and

    [x^m] (d^i) = 0 .    (339)

j ∈ {0, 1, . . . , i − 1}. We can apply this to j = m (since m ∈ {0, 1, . . . , i − 1}), and thus
obtain [x^m] (c^i) = 0. This proves (338). The proof of (339) is analogous (but uses the
FPS d instead of c). Thus, Claim 2 is proven.]
Next, we generalize Claim 1 to all i ∈ N:
[Proof of Claim 3: If i ∈ {0, 1, . . . , m}, then this follows from Claim 1. Thus, for the
rest of this proof, we WLOG assume that we don't have i ∈ {0, 1, . . . , m}. Hence,
i ∉ {0, 1, . . . , m}, so that i ∈ N \ {0, 1, . . . , m} (since i ∈ N). Thus, Claim 2 applies, and
therefore (338) and (339) hold.
Now, (25) (applied to m, a_i and c^i instead of n, λ and a) yields [x^m] (a_i c^i) = a_i ·
[x^m] (c^i) = 0 (by (338)). The same argument (applied to b_i and d instead of a_i and c) yields
[x^m] (b_i d^i) = 0. Comparing these two equalities, we obtain [x^m] (a_i c^i) = [x^m] (b_i d^i).
In other words, a ◦ c ≡ b ◦ d (by the definition of "a ◦ c ≡ b ◦ d"). This proves Proposition
3.10.5.
Definition 3.11.5 (b) shows that if n ∈ N, and if M is a finite subset of I that deter-
mines the x^n-coefficient in the product of (a_i)_{i∈I}, then

    [x^n] ( ∏_{i∈I} a_i ) = [x^n] ( ∏_{i∈M} a_i ) .    (340)
Now, let n ∈ N. The set I is finite (since the family (ai )i∈ I is finite), and thus is a
finite subset of I. Moreover, every finite subset J of I satisfying I ⊆ J ⊆ I satisfies
    [x^n] ( ∏_{i∈J} a_i ) = [x^n] ( ∏_{i∈I} a_i )

(because combining I ⊆ J and J ⊆ I yields J = I, and thus [x^n] ( ∏_{i∈J} a_i ) = [x^n] ( ∏_{i∈I} a_i )).
In other words, I determines the x n -coefficient in the product of (ai )i∈ I (by the defi-
nition of “determining the x n -coefficient in a product”). Hence, we can apply (340) to
M = I. As a consequence, we obtain
    [x^n] ( ∏_{i∈I} a_i ) = [x^n] ( ∏_{i∈I} a_i )

(where the left-hand product is understood in the sense of Definition 3.11.5, while the right-hand product is a finite product in K [[x]]).
Detailed proof of Lemma 3.11.9. We shall prove Lemma 3.11.9 by induction on | J | (this is
allowed, since the set J is supposed to be finite):
Induction base: Lemma 3.11.9 is true in the case when | J | = 0 163 .
Induction step: Let k ∈ N. Assume (as the induction hypothesis) that Lemma 3.11.9
is true in the case when | J | = k. We must prove that Lemma 3.11.9 is true in the case
when | J | = k + 1.
163 Proof. Let a, (f_i)_{i∈J} and n be as in Lemma 3.11.9. Assume that |J| = 0. Thus, J = ∅, so
that the product ∏_{i∈J} (1 + f_i) is empty. Hence, ∏_{i∈J} (1 + f_i) = (empty product) = 1. Hence, we
have

    [x^m] ( a ∏_{i∈J} (1 + f_i) ) = [x^m] a    for each m ∈ {0, 1, . . . , n}

(since ∏_{i∈J} (1 + f_i) = 1). Thus, we have proved Lemma 3.11.9 under the assumption that |J| = 0. Therefore, Lemma
3.11.9 is true in the case when |J| = 0.
    [x^m] ( a ∏_{i∈J\{j}} (1 + f_i) (1 + f_j) ) = [x^m] ( a ∏_{i∈J\{j}} (1 + f_i) )

    a ∏_{i∈J} (1 + f_i) = a (1 + f_j) ∏_{i∈J\{j}} (1 + f_i) = ( a ∏_{i∈J\{j}} (1 + f_i) ) (1 + f_j)

(here, we have split off the factor for i = j from the product)

    [x^m] ( a ∏_{i∈J} (1 + f_i) ) = [x^m] ( a ∏_{i∈J\{j}} (1 + f_i) )    (342)

    [x^m] ( a ∏_{i∈J} (1 + f_i) ) = [x^m] ( a ∏_{i∈J\{j}} (1 + f_i) )    (by (342))
                                  = [x^m] a    (by (343)) .
This is precisely the claim of Lemma 3.11.9. Thus, we have proved that Lemma 3.11.9
is true in the case when | J | = k + 1. This completes the induction step. Thus, the
induction proof of Lemma 3.11.9 is complete.
Detailed proof of Theorem 3.11.10. The family ( f i )i∈ I is summable. In other words,
(by the definition of “summable”). In other words, for each n ∈ N, there exists a finite
subset In of I such that
all i ∈ I \ In satisfy [ x n ] f i = 0. (344)
Consider this subset In . Thus, all the sets I0 , I1 , I2 , . . . are finite subsets of I.
Now, let n ∈ N be arbitrary. Let M := I0 ∪ I1 ∪ · · · ∪ In . Then, M is a union of n + 1
finite subsets of I (because all the sets I0 , I1 , I2 , . . . are finite subsets of I), and thus itself
is a finite subset of I. Moreover, each i ∈ I \ M satisfies [x^m] f_i = 0 for all m ∈ {0, 1, ..., n} (indeed, i ∈ I \ M ⊆ I \ I_m, so that (344) applies).

Define a := ∏_{i∈M} (1 + f_i). Now, let J be a finite subset of I satisfying M ⊆ J ⊆ I. The finite set J is the union of the two disjoint sets M and J \ M (since M ⊆ J). Hence, the product ∏_{i∈J} (1 + f_i) can be split as follows:

    ∏_{i∈J} (1 + f_i) = (∏_{i∈M} (1 + f_i)) (∏_{i∈J\M} (1 + f_i)) = a ∏_{i∈J\M} (1 + f_i)

(since a = ∏_{i∈M} (1 + f_i)). Each i ∈ J \ M satisfies [x^m] f_i = 0 for all m ∈ {0, 1, ..., n} (since J \ M ⊆ I \ M). Hence, Lemma 3.11.9 (applied to J \ M instead of J) yields

    [x^m] (a ∏_{i∈J\M} (1 + f_i)) = [x^m] a    for each m ∈ {0, 1, ..., n}.

Applying this to m = n, we obtain [x^n] ∏_{i∈J} (1 + f_i) = [x^n] (a ∏_{i∈J\M} (1 + f_i)) = [x^n] a = [x^n] ∏_{i∈M} (1 + f_i).
Forget that we fixed J. We thus have shown that every finite subset J of I satisfying M ⊆ J ⊆ I satisfies

    [x^n] ∏_{i∈J} (1 + f_i) = [x^n] ∏_{i∈M} (1 + f_i).
In other words, the set M determines the x n -coefficient in the product of (1 + f i )i∈ I (ac-
cording to Definition 3.11.1 (b)). Hence, the x n -coefficient in the product of (1 + f i )i∈ I
is finitely determined (according to Definition 3.11.3 (b), since the set M is finite).
Forget that we fixed n. Hence, we have shown that for each n ∈ N, the x n -coefficient
in the product of (1 + f i )i∈ I is finitely determined. In other words, each coefficient in
this product is finitely determined. In other words, the family (1 + f i )i∈ I is multipliable
(by the definition of “multipliable”). This proves Theorem 3.11.10.
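Theorem 3.11.10 is what licenses classical infinite products such as ∏_{i≥1} (1 + x^i): the family (x^i)_{i≥1} is summable, hence (1 + x^i)_{i≥1} is multipliable, and the x^n-coefficient of the product depends only on the factors with i ≤ n. A small Python sketch (illustrative only; truncated coefficient lists stand in for FPSs; the fact that the coefficients count partitions into distinct parts is classical and is used here only as a check):

```python
N = 8   # compute the product modulo x^(N+1)

def mul(a, b):
    """Product of two truncated power series, modulo x^(N+1)."""
    c = [0] * (N + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= N:
                c[i + j] += ai * bj
    return c

prod = [1] + [0] * N
for i in range(1, N + 1):    # factors with i > N cannot touch degrees <= N
    factor = [0] * (N + 1)
    factor[0] = 1
    factor[i] = 1            # factor = 1 + x^i
    prod = mul(prod, factor)

print(prod)   # coefficient of x^k counts partitions of k into distinct parts
```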
Detailed proof of Proposition 3.11.11. Let (ai )i∈ I ∈ K [[ x ]] I be a family of FPSs. Assume
that all but finitely many entries of this family (ai )i∈ I equal 1 (that is, all but finitely
many i ∈ I satisfy ai = 1). We must prove that this family is multipliable.
We have assumed that all but finitely many i ∈ I satisfy ai = 1. In other words, there
exists a finite subset M of I such that
all i ∈ I \ M satisfy ai = 1. (347)
Consider this M.
Let n ∈ N. Every finite subset J of I satisfying M ⊆ J ⊆ I satisfies

    [x^n] ∏_{i∈J} a_i = [x^n] ∏_{i∈M} a_i

¹⁶⁴. In other words, the set M determines the x^n-coefficient in the product of (a_i)_{i∈I} (by the definition of "determining the x^n-coefficient in a product"). Hence, the x^n-
¹⁶⁴Proof. Let J be a finite subset of I satisfying M ⊆ J ⊆ I. Then, each i ∈ J \ M satisfies i ∈ J \ M ⊆ I \ M and therefore

    a_i = 1    (348)

(by (347)). However, the set J is the union of the two disjoint sets M and J \ M (since M ⊆ J). Hence, we can split the product ∏_{i∈J} a_i as follows:

    ∏_{i∈J} a_i = (∏_{i∈M} a_i) (∏_{i∈J\M} a_i) = (∏_{i∈M} a_i) (∏_{i∈J\M} 1) = ∏_{i∈M} a_i

(here, we have used (348) to replace each a_i with i ∈ J \ M by 1). Therefore, [x^n] ∏_{i∈J} a_i = [x^n] ∏_{i∈M} a_i, qed.
coefficient in the product of (ai )i∈ I is finitely determined (by the definition of “finitely
determined”).
Forget that we fixed n. We thus have proved that for each n ∈ N, the x n -coefficient
in the product of (ai )i∈ I is finitely determined. In other words, each coefficient in the
product of (ai )i∈ I is finitely determined. In other words, the family (ai )i∈ I is multipli-
able (by the definition of “multipliable”). This proves Proposition 3.11.11.
Detailed proof of Lemma 3.11.15. Fix m ∈ {0, 1, . . . , n}. Recall that the family (ai )i∈ I is
multipliable. In other words, each coefficient in its product is finitely determined.
Hence, in particular, the x m -coefficient in the product of (ai )i∈ I is finitely determined.
In other words, there is a finite subset M of I that determines the x m -coefficient in the
product of (ai )i∈ I . Consider this subset M, and denote it by Mm . Thus, Mm is a finite
subset of I that determines the x m -coefficient in the product of (ai )i∈ I .
Forget that we fixed m. Thus, for each m ∈ {0, 1, . . . , n}, we have defined a finite
subset Mm of I. Let M be the union M0 ∪ M1 ∪ · · · ∪ Mn of these (altogether n + 1)
subsets. Thus, M is a union of finitely many finite subsets of I; hence, M itself is a
finite subset of I.
Now, let m ∈ {0, 1, . . . , n}. We shall show that M determines the x m -coefficient in
the product of (ai )i∈ I .
Indeed, let N be a finite subset of I satisfying M ⊆ N ⊆ I. We have Mm ⊆ M (since
M is the union M0 ∪ M1 ∪ · · · ∪ Mn , while Mm is one of the sets in this union). Hence,
Mm ⊆ M ⊆ N. Now, recall that the set Mm determines the x m -coefficient in the product
of (ai )i∈ I (by the definition of Mm ). In other words, every finite subset J of I satisfying
M_m ⊆ J ⊆ I satisfies

    [x^m] ∏_{i∈J} a_i = [x^m] ∏_{i∈M_m} a_i    (349)

(by the definition of what it means to "determine the x^m-coefficient in the product of (a_i)_{i∈I}"). Applying this to J = N, we obtain

    [x^m] ∏_{i∈N} a_i = [x^m] ∏_{i∈M_m} a_i.

The same argument (applied to J = M, which is allowed since M_m ⊆ M ⊆ I) yields [x^m] ∏_{i∈M} a_i = [x^m] ∏_{i∈M_m} a_i. Comparing these two equalities, we obtain

    [x^m] ∏_{i∈N} a_i = [x^m] ∏_{i∈M} a_i.

Forget that we fixed N. We thus have shown that every finite subset J of I satisfying M ⊆ J ⊆ I satisfies [x^m] ∏_{i∈J} a_i = [x^m] ∏_{i∈M} a_i. In other words, M determines the x^m-coefficient in the product of (a_i)_{i∈I} (by the definition of what it means to "determine the x^m-coefficient in the product of (a_i)_{i∈I}").
Forget that we fixed m. We thus have shown that M determines the x m -coefficient in
the product of (ai )i∈ I for each m ∈ {0, 1, . . . , n}. In other words, M determines the first
n + 1 coefficients in the product of (ai )i∈ I . In other words, M is an x n -approximator
for (ai )i∈ I (by the definition of an “x n -approximator”). Hence, there exists an x n -
approximator for (ai )i∈ I . This proves Lemma 3.11.15.
Detailed proof of Proposition 3.11.16. The set M is an x n -approximator for (ai )i∈ I . In
other words, M is a finite subset of I that determines the first n + 1 coefficients in
the product of (ai )i∈ I (by the definition of an “x n -approximator”).
(a) Let m ∈ {0, 1, . . . , n}. Recall that M determines the first n + 1 coefficients in the
product of (ai )i∈ I . Thus, in particular, M determines the x m -coefficient in the product
of (ai )i∈ I (since m ∈ {0, 1, . . . , n}). In other words, every finite subset J of I satisfying
M ⊆ J ⊆ I satisfies

    [x^m] ∏_{i∈J} a_i = [x^m] ∏_{i∈M} a_i    (350)
(by the definition of “determining the x m -coefficient in the product of (ai )i∈ I ”).
Forget that we fixed m. We thus have proved that every m ∈ {0, 1, . . . , n} and every
finite subset J of I satisfying M ⊆ J ⊆ I satisfy (350).
Now, let J be a finite subset of I satisfying M ⊆ J ⊆ I. Then, each m ∈ {0, 1, ..., n} satisfies [x^m] ∏_{i∈J} a_i = [x^m] ∏_{i∈M} a_i (by (350)). In other words, we have ∏_{i∈J} a_i ≡^{x^n} ∏_{i∈M} a_i (by Definition 3.10.1). This proves Proposition 3.11.16 (a).
(b) Assume that the family (ai )i∈ I is multipliable. Let m ∈ {0, 1, . . . , n}. Thus, M
determines the x m -coefficient in the product of (ai )i∈ I (since M determines the first
n + 1 coefficients in the product of (ai )i∈ I ).
The product ∏_{i∈I} a_i is defined according to Definition 3.11.5 (b). Specifically, Definition 3.11.5 (b) (with n and M renamed as k and N) shows that the product ∏_{i∈I} a_i is defined to be the FPS whose x^k-coefficient (for any k ∈ N) can be computed as follows: If k ∈ N, and if N is a finite subset of I that determines the x^k-coefficient in the product of (a_i)_{i∈I}, then

    [x^k] ∏_{i∈I} a_i = [x^k] ∏_{i∈N} a_i.

Applying this to k = m and N = M (which is legitimate, since M determines the x^m-coefficient in the product of (a_i)_{i∈I}), we obtain

    [x^m] ∏_{i∈I} a_i = [x^m] ∏_{i∈M} a_i.

Forget that we fixed m. We thus have proved that each m ∈ {0, 1, ..., n} satisfies [x^m] ∏_{i∈I} a_i = [x^m] ∏_{i∈M} a_i. In other words, ∏_{i∈I} a_i ≡^{x^n} ∏_{i∈M} a_i (by Definition 3.10.1). This proves Proposition 3.11.16 (b).
Detailed proof of Proposition 3.11.17. (a) Fix n ∈ N. We know that the family (ai )i∈ J is
multipliable. Hence, there exists an x n -approximator U for (ai )i∈ J (by Lemma 3.11.15,
applied to J instead of I). Consider this U.
We also know that the family (ai )i∈ I \ J is multipliable. Hence, there exists an x n -
approximator V for (ai )i∈ I \ J (by Lemma 3.11.15, applied to I \ J instead of I). Consider
this V.
We know that U is an x n -approximator for (ai )i∈ J . In other words, U is a finite
subset of J that determines the first n + 1 coefficients in the product of (ai )i∈ J (by the
definition of an x n -approximator). Hence, in particular, U is finite. Similarly, V is finite.
Moreover, U ⊆ J (since U is a subset of J); similarly, V ⊆ I \ J.
Let M = U ∪ V. This set M is finite (since U and V are finite). Moreover, using U ⊆ J ⊆ I and V ⊆ I \ J ⊆ I, we obtain M = U ∪ V ⊆ I ∪ I = I. Hence, M is a finite subset of I. Note that the sets U and V are disjoint¹⁶⁵. Hence, the set M is the union of its two disjoint subsets U and V (since M = U ∪ V).
Now, let N be a finite subset of I satisfying M ⊆ N ⊆ I. We shall show that
    [x^n] ∏_{i∈N} a_i = [x^n] ∏_{i∈M} a_i.
Indeed, let m ∈ {0, 1, . . . , n}. The set N ∩ J is a finite subset of J (since N is a finite
subset of I) and satisfies U ⊆ N ∩ J (since U ⊆ U ∪ V = M ⊆ N and U ⊆ J). Now,
recall that U determines the first n + 1 coefficients in the product of (ai )i∈ J . Hence, U
determines the x m -coefficient in the product of (ai )i∈ J (since m ∈ {0, 1, . . . , n}). In other
words, every finite subset T of J satisfying U ⊆ T ⊆ J satisfies

    [x^m] ∏_{i∈T} a_i = [x^m] ∏_{i∈U} a_i

(by the definition of what it means to "determine the x^m-coefficient in the product of (a_i)_{i∈J}"). We can apply this to T = N ∩ J (since N ∩ J is a finite subset of J satisfying U ⊆ N ∩ J ⊆ J), and thus obtain

    [x^m] ∏_{i∈N∩J} a_i = [x^m] ∏_{i∈U} a_i.    (351)
Similarly (using V and (a_i)_{i∈I\J} instead of U and (a_i)_{i∈J}), we obtain

    [x^m] ∏_{i∈N∩(I\J)} a_i = [x^m] ∏_{i∈V} a_i.    (352)
Forget that we fixed m. We thus have proved the equalities (351) and (352) for each m ∈ {0, 1, ..., n}. Hence, Lemma 3.3.22 (applied to a = ∏_{i∈N∩J} a_i and b = ∏_{i∈U} a_i and c = ∏_{i∈N\J} a_i and d = ∏_{i∈V} a_i) yields that

    [x^m] ((∏_{i∈N∩J} a_i) (∏_{i∈N\J} a_i)) = [x^m] ((∏_{i∈U} a_i) (∏_{i∈V} a_i))

for each m ∈ {0, 1, ..., n}. In view of

    ∏_{i∈N} a_i = (∏_{i∈N∩J} a_i) (∏_{i∈N\J} a_i)    (since the set N is the union of its two disjoint subsets N ∩ J and N \ J)

and

    ∏_{i∈M} a_i = (∏_{i∈U} a_i) (∏_{i∈V} a_i)    (since the set M is the union of its two disjoint subsets U and V),

this rewrites as [x^n] ∏_{i∈N} a_i = [x^n] ∏_{i∈M} a_i.
Forget that we fixed N. We thus have shown that every finite subset N of I satisfying M ⊆ N ⊆ I satisfies [x^n] ∏_{i∈N} a_i = [x^n] ∏_{i∈M} a_i. In other words, the set M determines the x^n-coefficient in the product of (a_i)_{i∈I}; hence, this coefficient is finitely determined (since M is finite). Since such a set M can be found for each n ∈ N, this shows that the family (a_i)_{i∈I} is multipliable.

Now, the definition of the infinite product ∏_{i∈I} a_i (namely, Definition 3.11.5 (b)) yields that

    [x^n] ∏_{i∈I} a_i = [x^n] ∏_{i∈M} a_i    (353)

(since the set M determines the x^n-coefficient in the product of (a_i)_{i∈I}). Moreover, the set U is an x^n-approximator for (a_i)_{i∈J}. Thus, Proposition 3.11.16 (b) (applied to J and U instead of I and M) yields

    ∏_{i∈J} a_i ≡^{x^n} ∏_{i∈U} a_i    (354)

(since the family (a_i)_{i∈J} is multipliable). Furthermore, the set V is an x^n-approximator for (a_i)_{i∈I\J}. Thus, Proposition 3.11.16 (b) (applied to I \ J and V instead of I and M) yields

    ∏_{i∈I\J} a_i ≡^{x^n} ∏_{i∈V} a_i    (355)

(since the family (a_i)_{i∈I\J} is multipliable). From (354) and (355), we obtain
    (∏_{i∈J} a_i) (∏_{i∈I\J} a_i) ≡^{x^n} (∏_{i∈U} a_i) (∏_{i∈V} a_i).

In view of (∏_{i∈U} a_i) (∏_{i∈V} a_i) = ∏_{i∈M} a_i (which holds since the set M is the union of its two disjoint subsets U and V), this rewrites as

    (∏_{i∈J} a_i) (∏_{i∈I\J} a_i) ≡^{x^n} ∏_{i∈M} a_i.

In other words,

    each m ∈ {0, 1, ..., n} satisfies [x^m] ((∏_{i∈J} a_i) (∏_{i∈I\J} a_i)) = [x^m] ∏_{i∈M} a_i.

Combining this (for m = n) with (353), we obtain

    [x^n] ∏_{i∈I} a_i = [x^n] ((∏_{i∈J} a_i) (∏_{i∈I\J} a_i)).    (356)
Now, forget that we fixed n. We thus have proved that each n ∈ N satisfies (356). In other words, each coefficient of the FPS ∏_{i∈I} a_i equals the corresponding coefficient of (∏_{i∈J} a_i) (∏_{i∈I\J} a_i). Hence, we have

    ∏_{i∈I} a_i = (∏_{i∈J} a_i) · (∏_{i∈I\J} a_i).
Detailed proof of Proposition 3.11.18. (a) Fix n ∈ N. We know that the family (ai )i∈ I is
multipliable. Hence, there exists an x n -approximator U for (ai )i∈ I (by Lemma 3.11.15).
Consider this U.
We also know that the family (bi )i∈ I is multipliable. Hence, there exists an x n -
approximator V for (bi )i∈ I (by Lemma 3.11.15, applied to bi instead of ai ). Consider
this V.
We know that U is an x n -approximator for (ai )i∈ I . In other words, U is a finite
subset of I that determines the first n + 1 coefficients in the product of (ai )i∈ I (by the
definition of an x n -approximator). Hence, in particular, U is finite. Similarly, V is finite.
Moreover, U ⊆ I (since U is a subset of I); similarly, V ⊆ I.
Let M = U ∪ V. This set M is finite (since U and V are finite). Moreover, using U ⊆ I and V ⊆ I, we obtain M = U ∪ V ⊆ I ∪ I = I. Hence, M is a finite subset of I.
Now, let N be a finite subset of I satisfying M ⊆ N ⊆ I. We shall show that
    [x^n] ∏_{i∈N} (a_i b_i) = [x^n] ∏_{i∈M} (a_i b_i).
Indeed, let m ∈ {0, 1, ..., n}. Recall that U determines the first n + 1 coefficients in the product of (a_i)_{i∈I}; in particular, U determines the x^m-coefficient in this product. In other words, every finite subset T of I satisfying U ⊆ T ⊆ I satisfies

    [x^m] ∏_{i∈T} a_i = [x^m] ∏_{i∈U} a_i

(by the definition of what it means to "determine the x^m-coefficient in the product of (a_i)_{i∈I}"). We can apply this to T = N (since N is a finite subset of I satisfying U ⊆ N ⊆ I), and thus obtain

    [x^m] ∏_{i∈N} a_i = [x^m] ∏_{i∈U} a_i.

The same argument (applied to T = M) yields [x^m] ∏_{i∈M} a_i = [x^m] ∏_{i∈U} a_i. Comparing these two equalities, we obtain

    [x^m] ∏_{i∈N} a_i = [x^m] ∏_{i∈M} a_i.    (358)

Similarly (using V and the family (b_i)_{i∈I} instead of U and (a_i)_{i∈I}), we obtain

    [x^m] ∏_{i∈N} b_i = [x^m] ∏_{i∈M} b_i.    (359)
Forget that we fixed m. We thus have proved the equalities (358) and (359) for each m ∈ {0, 1, ..., n}. Hence, Lemma 3.3.22 (applied to a = ∏_{i∈N} a_i and b = ∏_{i∈M} a_i and c = ∏_{i∈N} b_i and d = ∏_{i∈M} b_i) yields that

    [x^m] ((∏_{i∈N} a_i) (∏_{i∈N} b_i)) = [x^m] ((∏_{i∈M} a_i) (∏_{i∈M} b_i))

for each m ∈ {0, 1, ..., n}; in particular, this holds for m = n.
In view of

    ∏_{i∈N} (a_i b_i) = (∏_{i∈N} a_i) (∏_{i∈N} b_i)    (by the standard rules for finite products, since N is a finite set)

and

    ∏_{i∈M} (a_i b_i) = (∏_{i∈M} a_i) (∏_{i∈M} b_i)    (by the standard rules for finite products, since M is a finite set),

this rewrites as

    [x^n] ∏_{i∈N} (a_i b_i) = [x^n] ∏_{i∈M} (a_i b_i).
Forget that we fixed N. We thus have shown that the set M determines the x^n-coefficient in the product of (a_i b_i)_{i∈I}. Moreover, U is an x^n-approximator for (a_i)_{i∈I}; hence, Proposition 3.11.16 (b) (applied to U instead of M) yields

    ∏_{i∈I} a_i ≡^{x^n} ∏_{i∈U} a_i

(since the family (a_i)_{i∈I} is multipliable). Since the relation ≡^{x^n} is symmetric, we thus obtain

    ∏_{i∈U} a_i ≡^{x^n} ∏_{i∈I} a_i.    (361)
Moreover, M is a finite subset of I satisfying U ⊆ M (since M ⊇ U) and therefore U ⊆ M ⊆ I; hence, Proposition 3.11.16 (a) (applied to U and M instead of M and J) yields

    ∏_{i∈M} a_i ≡^{x^n} ∏_{i∈U} a_i ≡^{x^n} ∏_{i∈I} a_i    (by (361)).
Hence,

    ∏_{i∈M} a_i ≡^{x^n} ∏_{i∈I} a_i    (362)

(since the relation ≡^{x^n} is transitive). The same argument (applied to (b_i)_{i∈I} and V instead of (a_i)_{i∈I} and U) yields

    ∏_{i∈M} b_i ≡^{x^n} ∏_{i∈I} b_i.    (363)
From (362) and (363), we obtain

    (∏_{i∈M} a_i) (∏_{i∈M} b_i) ≡^{x^n} (∏_{i∈I} a_i) (∏_{i∈I} b_i).

In view of ∏_{i∈M} (a_i b_i) = (∏_{i∈M} a_i) (∏_{i∈M} b_i), this rewrites as

    ∏_{i∈M} (a_i b_i) ≡^{x^n} (∏_{i∈I} a_i) (∏_{i∈I} b_i).

In other words,

    each m ∈ {0, 1, ..., n} satisfies [x^m] ∏_{i∈M} (a_i b_i) = [x^m] ((∏_{i∈I} a_i) (∏_{i∈I} b_i)).

Combining this with the fact that [x^n] ∏_{i∈I} (a_i b_i) = [x^n] ∏_{i∈M} (a_i b_i) (which follows from Definition 3.11.5 (b), since the set M determines the x^n-coefficient in the product of (a_i b_i)_{i∈I}), we obtain

    [x^n] ∏_{i∈I} (a_i b_i) = [x^n] ((∏_{i∈I} a_i) (∏_{i∈I} b_i)).    (364)
Now, forget that we fixed n. We thus have proved that each n ∈ N satisfies (364). In other words, each coefficient of the FPS ∏_{i∈I} (a_i b_i) equals the corresponding coefficient of (∏_{i∈I} a_i) (∏_{i∈I} b_i). Hence, we have

    ∏_{i∈I} (a_i b_i) = (∏_{i∈I} a_i) (∏_{i∈I} b_i).
Lemma B.2.1. Let a, b, c, d ∈ K[[x]] be four FPSs such that c and d are invertible. Let n ∈ N. Assume that

    [x^m] a = [x^m] b and [x^m] c = [x^m] d    for each m ∈ {0, 1, ..., n}.

Then,

    [x^m] (a/c) = [x^m] (b/d)    for each m ∈ {0, 1, ..., n}.
Detailed proof of Proposition 3.11.19. This can be proved similarly to Proposition 3.11.18,
with the obvious changes (replacing multiplication by division, and requiring each bi
to be invertible). Of course, instead of Lemma 3.3.22, we need to use Lemma B.2.1.
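Lemma B.2.1 can also be checked numerically. The Python sketch below (illustrative only; it uses exact rationals via fractions.Fraction, and inv computes the truncated multiplicative inverse by the usual recurrence for series with invertible constant term) verifies that a/c and b/d agree up to degree n whenever a ≡ b and c ≡ d do:

```python
from fractions import Fraction

N = 8   # truncation degree
n = 5   # coefficients 0..n are compared

def mul(a, b):
    """Product of two truncated power series, modulo x^(N+1)."""
    c = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= N:
                c[i + j] += ai * bj
    return c

def inv(c):
    """Truncated inverse of a series with invertible constant term."""
    b = [Fraction(0)] * (N + 1)
    b[0] = 1 / c[0]
    for m in range(1, N + 1):
        b[m] = -b[0] * sum(c[k] * b[m - k] for k in range(1, m + 1))
    return b

a = [Fraction(v) for v in [1, 2, 3, 4, 5, 6, 0, 0, 0]]
b = [Fraction(v) for v in [1, 2, 3, 4, 5, 6, 7, 8, 9]]   # agrees with a up to x^5
c = [Fraction(v) for v in [2, 1, 0, 0, 1, 0, 3, 0, 0]]
d = [Fraction(v) for v in [2, 1, 0, 0, 1, 0, 0, 0, 5]]   # agrees with c up to x^5

print(mul(a, inv(c))[: n + 1] == mul(b, inv(d))[: n + 1])
```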
In order to eventually prove Proposition 3.11.21, we shall first prove a slightly stronger
auxiliary statement:
Lemma B.2.2. Let (ai )i∈ I ∈ K [[ x ]] I be a family of invertible FPSs. Let J be a subset
of I. Let n ∈ N. Let U be an x n -approximator for (ai )i∈ I . Then, U ∩ J is an x n -
approximator for (ai )i∈ J .
Proof of Lemma B.2.2. The set U is an x n -approximator for (ai )i∈ I . In other words, U is
a finite subset of I that determines the first n + 1 coefficients in the product of (ai )i∈ I
(by the definition of an x n -approximator). Hence, in particular, U is finite. Moreover,
U ⊆ I (since U is a subset of I).
Let M = U ∩ J. Thus, M = U ∩ J ⊆ U, so that the set M is finite (since U is finite).
Moreover, M is a subset of J (since M = U ∩ J ⊆ J).
Now, let N be a finite subset of J satisfying M ⊆ N ⊆ J. We shall show that

    ∏_{i∈N} a_i ≡^{x^n} ∏_{i∈M} a_i.
Indeed, the set N ∪ U is finite (since N and U are finite) and is a subset of I (since N ∪ U ⊆ I ∪ I = I, because N ⊆ J ⊆ I and U ⊆ I); it also satisfies U ⊆ N ∪ U ⊆ I. Now, we recall that U is an x^n-approximator for (a_i)_{i∈I}. Hence, Proposition 3.11.16 (a) (applied to U and N ∪ U instead of M and J) yields

    ∏_{i∈N∪U} a_i ≡^{x^n} ∏_{i∈U} a_i.    (365)

Also,

    ∏_{i∈U\J} a_i^{-1} ≡^{x^n} ∏_{i∈U\J} a_i^{-1}    (366)

(since Proposition 3.10.3 (a) yields that the relation ≡^{x^n} is reflexive).
We have now proved (365) and (366). Hence, (101) (applied to a = ∏_{i∈N∪U} a_i and b = ∏_{i∈U} a_i and c = ∏_{i∈U\J} a_i^{-1} and d = ∏_{i∈U\J} a_i^{-1}) yields

    (∏_{i∈N∪U} a_i) (∏_{i∈U\J} a_i^{-1}) ≡^{x^n} (∏_{i∈U} a_i) (∏_{i∈U\J} a_i^{-1}).    (367)
On the other hand, the sets N and U \ J are disjoint¹⁶⁶, and their union is N ∪ (U \ J) = N ∪ U ¹⁶⁷. Hence, the set N ∪ U is the union of its two disjoint subsets N and U \ J. Thus, we can split the product ∏_{i∈N∪U} a_i as follows:

    ∏_{i∈N∪U} a_i = (∏_{i∈N} a_i) (∏_{i∈U\J} a_i).    (368)

¹⁶⁷Proof. We have U = (U ∩ J) ∪ (U \ J), so that

    N ∪ U = N ∪ (U ∩ J) ∪ (U \ J) = N ∪ M ∪ (U \ J) = N ∪ (U \ J)

(here, we have used U ∩ J = M, and then N ∪ M = N, which holds since M ⊆ N).
However, the set U is the union of its two disjoint subsets U ∩ J and U \ J. Thus, we can split the product ∏_{i∈U} a_i as follows:

    ∏_{i∈U} a_i = (∏_{i∈U∩J} a_i) (∏_{i∈U\J} a_i).    (369)

In view of (368) and (369), we can rewrite the relation (367) as follows:

    ∏_{i∈N} a_i ≡^{x^n} ∏_{i∈M} a_i    (370)

(here, we have cancelled the factor (∏_{i∈U\J} a_i) (∏_{i∈U\J} a_i^{-1}) = 1 on both sides, and used U ∩ J = M).
Forget that we fixed N. We have thus shown that every finite subset N of J satisfying M ⊆ N ⊆ J satisfies (370).
Now, let m ∈ {0, 1, ..., n}. Then, every finite subset N of J satisfying M ⊆ N ⊆ J satisfies

    [x^m] ∏_{i∈N} a_i = [x^m] ∏_{i∈M} a_i

(by (370)). In other words, the set M determines the x^m-coefficient in the product of (a_i)_{i∈J} (by the definition of "determining the x^m-coefficient in a product").
Forget that we fixed m. We thus have shown that M determines the x m -coefficient in
the product of (ai )i∈ J for each m ∈ {0, 1, . . . , n}. In other words, M determines the first
n + 1 coefficients in the product of (ai )i∈ J . In other words, M is an x n -approximator
for (ai )i∈ J (by the definition of an “x n -approximator”, since M is a finite subset of J).
In other words, U ∩ J is an x n -approximator for (ai )i∈ J (since M = U ∩ J). This proves
Lemma B.2.2.
Detailed proof of Proposition 3.11.21. Let J be a subset of I. We shall show that the family
(ai )i∈ J is multipliable.
Fix n ∈ N. We know that the family (ai )i∈ I is multipliable. Hence, there exists an
x n -approximator U for (ai )i∈ I (by Lemma 3.11.15). Consider this U.
Set M = U ∩ J. Lemma B.2.2 yields that U ∩ J is an x n -approximator for (ai )i∈ J . In
other words, M is an x n -approximator for (ai )i∈ J (since M = U ∩ J). In other words,
M is a finite subset of J that determines the first n + 1 coefficients in the product of
(ai )i∈ J (by the definition of an x n -approximator). Thus, the set M determines the first
n + 1 coefficients in the product of (ai )i∈ J . Hence, in particular, this set M determines
the x n -coefficient in the product of (ai )i∈ J . Therefore, the x n -coefficient in the product
of (ai )i∈ J is finitely determined (by the definition of “finitely determined”, since M is
a finite subset of J).
Forget that we fixed n. We thus have proved that for each n ∈ N, the x n -coefficient
in the product of (ai )i∈ J is finitely determined. In other words, each coefficient in the
product of (ai )i∈ J is finitely determined. In other words, the family (ai )i∈ J is multipli-
able (by the definition of “multipliable”).
Forget that we fixed J. We thus have shown that the family (ai )i∈ J is multipliable
whenever J is a subset of I. In other words, any subfamily of (ai )i∈ I is multipliable.
This proves Proposition 3.11.21.
Our next goal is to prove Proposition 3.11.23. First, however, let us restate a piece of
Theorem 3.10.3 (f) in more convenient language:
Lemma B.2.3. Let n ∈ N. Let V be a finite set. Let (c_w)_{w∈V} ∈ K[[x]]^V and (d_w)_{w∈V} ∈ K[[x]]^V be two families of FPSs such that

    each w ∈ V satisfies c_w ≡^{x^n} d_w.

Then, we have

    ∏_{w∈V} c_w ≡^{x^n} ∏_{w∈V} d_w.
Proof of Lemma B.2.3. This is just (105), with the letters S, s, as and bs renamed as V, w,
cw and dw .
Detailed proof of Proposition 3.11.23. We shall subdivide our proof into several claims:
    ∏_{s∈J} a_s = ∏_{s∈S; f(s)=w} a_s = b_w    (373)

(by (371)). On the other hand, from J = {s ∈ S | f(s) = w}, we obtain

    U ∩ J = U ∩ {s ∈ S | f(s) = w} = {s ∈ U | f(s) = w}    (since U ⊆ S).

Hence,

    ∏_{s∈U∩J} a_s = ∏_{s∈U; f(s)=w} a_s.    (374)
Indeed, each w ∈ V satisfies w ∈ V ⊆ W and therefore b_w ≡^{x^n} ∏_{s∈U; f(s)=w} a_s (by Claim 2). Hence, Lemma B.2.3 (applied to c_w = b_w and d_w = ∏_{s∈U; f(s)=w} a_s) yields

    ∏_{w∈V} b_w ≡^{x^n} ∏_{w∈V} ∏_{s∈U; f(s)=w} a_s.    (375)
Consequently,

    ∏_{w∈f(U)} b_w ≡^{x^n} ∏_{s∈U} a_s.    (377)

Since the relation ≡^{x^n} on K[[x]] is transitive (by Theorem 3.10.3 (a)), we thus obtain

    ∏_{w∈f(U)} b_w ≡^{x^n} ∏_{w∈V} b_w.
In other words, the set f (U ) determines the x m -coefficient in the product of (bw )w∈W
(by the definition of “determining the x m -coefficient in a product”, since f (U ) is a
finite subset of W).
Forget that we fixed m. We thus have shown that the set f (U ) determines the x m -
coefficient in the product of (bw )w∈W for each m ∈ {0, 1, . . . , n}. In other words, the set
f (U ) determines the first n + 1 coefficients in the product of (bw )w∈W . In other words,
f (U ) is an x n -approximator for (bw )w∈W (by the definition of an “x n -approximator”,
since f (U ) is a finite subset of W). This proves Claim 3.]
[Proof of Claim 4: Claim 3 shows that f (U ) is an x n -approximator for (bw )w∈W . Thus,
the set f (U ) determines the first n + 1 coefficients in the product of (bw )w∈W (by the
definition of an “x n -approximator”). Hence, in particular, this set f (U ) determines
the x n -coefficient in the product of (bw )w∈W . Thus, the x n -coefficient in the product of
(bw )w∈W is finitely determined (by the definition of “finitely determined”, since f (U )
is a finite subset of W). This proves Claim 4.]
Now, forget that we fixed n. We thus have shown that the x n -coefficient in the
product of (bw )w∈W is finitely determined for each n ∈ N. In other words, each
coefficient in the product of (bw )w∈W is finitely determined. In other words, the family
(b_w)_{w∈W} is multipliable (by the definition of "multipliable"). Moreover, Proposition 3.11.16 (b) (applied to S, (a_s)_{s∈S} and U instead of I, (a_i)_{i∈I} and M) yields

    ∏_{s∈S} a_s ≡^{x^n} ∏_{s∈U} a_s    (379)

(since U is an x^n-approximator for (a_s)_{s∈S}).
However, the relation ≡^{x^n} on K[[x]] is symmetric (by Theorem 3.10.3 (a)); thus, (379) entails

    ∏_{s∈U} a_s ≡^{x^n} ∏_{s∈S} a_s.    (380)

Furthermore, V is an x^n-approximator for (b_i)_{i∈W}; hence, Proposition 3.11.16 (b) (applied to W, (b_i)_{i∈W} and V instead of I, (a_i)_{i∈I} and M) yields

    ∏_{i∈W} b_i ≡^{x^n} ∏_{i∈V} b_i

(since the family (b_w)_{w∈W} is multipliable). Renaming the index i as w on both sides of this relation, we obtain

    ∏_{w∈W} b_w ≡^{x^n} ∏_{w∈V} b_w.    (381)
We also have

    ∏_{w∈V} b_w ≡^{x^n} ∏_{s∈U} a_s.    (382)

(Indeed, this is precisely the equality (376) that was shown during the proof of Claim 3, and its proof applies here just as well.)

Now, (381) becomes

    ∏_{w∈W} b_w ≡^{x^n} ∏_{w∈V} b_w ≡^{x^n} ∏_{s∈U} a_s    (by (382))
                ≡^{x^n} ∏_{s∈S} a_s    (by (380)).

Since the relation ≡^{x^n} on K[[x]] is transitive (by Theorem 3.10.3 (a)), we thus obtain

    ∏_{w∈W} b_w ≡^{x^n} ∏_{s∈S} a_s.
In other words,

    each m ∈ {0, 1, ..., n} satisfies [x^m] ∏_{w∈W} b_w = [x^m] ∏_{s∈S} a_s.

Forget that we fixed n. Since each m ∈ N falls under this statement for n = m, we conclude that each coefficient of the FPS ∏_{w∈W} b_w equals the corresponding coefficient of ∏_{s∈S} a_s. Therefore,

    ∏_{w∈W} b_w = ∏_{s∈S} a_s.

In view of (371), this rewrites as

    ∏_{w∈W} ∏_{s∈S; f(s)=w} a_s = ∏_{s∈S} a_s.

This proves Proposition 3.11.23.
Detailed proof of Proposition 3.11.24. Let f : I × J → I be the map that sends each pair
(i, j) to i. We first prove two easy claims:
Claim 1: Let w ∈ I, and let J′ be the subset {s ∈ I × J | f(s) = w} of I × J. Then, there is a bijection

    J → J′,    j ↦ (w, j).

[Proof of Claim 1: We have

    J′ = {s ∈ I × J | f(s) = w} = {(i, j) ∈ I × J | i = w}

(since f(i, j) = i by the definition of f). In other words, the set J′ consists of all pairs (w, j) with j ∈ J. Hence, there is a bijection

    J → J′,    j ↦ (w, j).

This proves Claim 1.]
Hence, the family (a_s)_{s∈J′} is a reindexing of the family (a_{(w,j)})_{j∈J}. Since the latter family (a_{(w,j)})_{j∈J} is multipliable, we thus conclude that the former family (a_s)_{s∈J′} is also multipliable (since a reindexing of a multipliable family is still multipliable¹⁶⁸). In other words, the family (a_s)_{s∈I×J; f(s)=w} is multipliable (since the family (a_s)_{s∈J′} is the family (a_s)_{s∈I×J; f(s)=w}). This proves Claim 2.
In particular, it yields that the right hand side of (383) is well-defined – i.e., the family (∏_{s∈I×J; f(s)=w} a_s)_{w∈I} is also multipliable.
Now, fix w ∈ I. Let J′ be the subset {s ∈ I × J | f(s) = w} of I × J. Thus, the family (a_s)_{s∈J′} is the family (a_s)_{s∈I×J; f(s)=w}, and therefore is multipliable (by Claim 2). Hence, the product ∏_{s∈J′} a_s is well-defined. Furthermore, Claim 1 yields that there is a bijection

    J → J′,    j ↦ (w, j).

Thus, we can substitute (w, j) for s in the product ∏_{s∈J′} a_s (because any bijection allows us to substitute the index in a product¹⁶⁹). We thus obtain

    ∏_{s∈J′} a_s = ∏_{j∈J} a_{(w,j)}    (384)
(and, in particular, the product on the right hand side of this equality is well-defined, i.e., the family (a_{(w,j)})_{j∈J} is multipliable). However, we can replace the product sign "∏_{s∈J′}" by "∏_{s∈I×J; f(s)=w}" (since J′ = {s ∈ I × J | f(s) = w}). Hence, we can rewrite (384) as

    ∏_{s∈I×J; f(s)=w} a_s = ∏_{j∈J} a_{(w,j)}.    (385)
Now,

    ∏_{s∈I×J} a_s = ∏_{w∈I} ∏_{s∈I×J; f(s)=w} a_s = ∏_{w∈I} ∏_{j∈J} a_{(w,j)}    (by (385))
                  = ∏_{i∈I} ∏_{j∈J} a_{(i,j)}

(here, we have renamed the index w as i in the outer product; the first equality follows from Proposition 3.11.23). Renaming the index s as (i, j) on the left hand side of this equality, we can rewrite it as

    ∏_{(i,j)∈I×J} a_{(i,j)} = ∏_{i∈I} ∏_{j∈J} a_{(i,j)}.
A similar argument (but using the map I × J → J, (i, j) ↦ j instead of our map f : I × J → I, (i, j) ↦ i) shows that

    ∏_{(i,j)∈I×J} a_{(i,j)} = ∏_{j∈J} ∏_{i∈I} a_{(i,j)}.
(Tracing back our above argument, we see that all products appearing in this equality
are well-defined; indeed, their well-definedness has been shown the moment they first
appeared in our proof.) Proposition 3.11.24 is thus proved.
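The "Fubini rule" for products in Proposition 3.11.24 is easy to sanity-check on a finite rectangle of truncated FPSs: the flat product over I × J agrees with both iterated products. A Python sketch (illustrative only; the entries a_{(i,j)} below are arbitrary made-up data):

```python
N = 5   # truncation degree

def mul(a, b):
    """Product of two truncated power series, modulo x^(N+1)."""
    c = [0] * (N + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= N:
                c[i + j] += ai * bj
    return c

def prod(series_list):
    r = [1] + [0] * N
    for s in series_list:
        r = mul(r, s)
    return r

I = [0, 1, 2]
J = [0, 1]

def a(i, j):
    series = [1] + [0] * N
    series[1 + (i + j) % 3] = i + 2 * j + 1    # arbitrary test entry
    return series

flat = prod([a(i, j) for i in I for j in J])
rows = prod([prod([a(i, j) for j in J]) for i in I])
cols = prod([prod([a(i, j) for i in I]) for j in J])
print(flat == rows == cols)
```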
Detailed proof of Proposition 3.11.25. We have assumed that (a_{(i,j)})_{(i,j)∈I×J} ∈ K[[x]]^{I×J} is a multipliable family of invertible FPSs. In other words, (a_s)_{s∈I×J} ∈ K[[x]]^{I×J} is a multipliable family of invertible FPSs (here, we have renamed the index (i, j) as s). Hence, any subfamily of (a_s)_{s∈I×J} is multipliable (by Proposition 3.11.21, applied to I × J and (a_s)_{s∈I×J} instead of I and (a_i)_{i∈I}).
Let f : I × J → I be the map that sends each pair (i, j) to i. Let us show three easy claims:

Claim 1: Let w ∈ I, and let J′ be the subset {s ∈ I × J | f(s) = w} of I × J. Then, there is a bijection

    J → J′,    j ↦ (w, j).

Proof of Claim 1. This is proved in the exact same way as Claim 1 in our above proof of Proposition 3.11.24.
Claim 2: For each i ∈ I, the family (a_{(i,j)})_{j∈J} is multipliable.

[Proof of Claim 2: Let i ∈ I. Let J′ be the subset {s ∈ I × J | f(s) = i} of I × J. Then, the family (a_s)_{s∈J′} is a subfamily of (a_s)_{s∈I×J}, and thus is multipliable (since any subfamily of (a_s)_{s∈I×J} is multipliable). Moreover, Claim 1 (applied to w = i) yields a bijection

    J → J′,    j ↦ (i, j).

Hence, the family (a_s)_{s∈J′} is a reindexing of the family (a_{(i,j)})_{j∈J}. Since the former family (a_s)_{s∈J′} is multipliable, we thus conclude that the latter family (a_{(i,j)})_{j∈J} is also multipliable (since a reindexing of a multipliable family is still multipliable¹⁷⁰). This proves Claim 2.]
170 This is part of Proposition 3.11.22.
Claim 3: For each j ∈ J, the family (a_{(i,j)})_{i∈I} is multipliable.
(and, in particular, all the products appearing in this equality are well-defined). This
proves Proposition 3.11.25.
Detailed proof of Lemma 3.11.32. Theorem 3.11.10 (applied to I = J) shows that the fam-
ily (1 + f i )i∈ J is multipliable. In other words, each coefficient in the product of this
family (1 + f i )i∈ J is finitely determined. In other words, for each m ∈ N, the x m -
coefficient in the product of (1 + f i )i∈ J is finitely determined. In other words, for each
m ∈ N, there is a finite subset Mm of J that determines the x m -coefficient in the product
of (1 + f i )i∈ J . Consider this Mm .
Let M = M0 ∪ M1 ∪ · · · ∪ Mn . Then, M is a finite subset of J (since M0 , M1 , . . . , Mn
are finite subsets of J). Moreover, we claim that
    [x^m] ∏_{i∈J} (1 + f_i) = [x^m] ∏_{i∈M} (1 + f_i)    (386)
                            = [x^m] a    (by (388)).
Detailed proof of Proposition 3.11.30. The following proof is an expanded version of the
argument given by Mindlack at https://math.stackexchange.com/a/4123658/ .
This will be a long grind; we thus break it up into several claims. First, however, let
us introduce a few notations:
• If J is a subset of I, then S_J^I shall denote the set of all families (s_i)_{i∈I} ∈ S^I that satisfy

      (s_i = 0 for all i ∈ I \ J).

  This set S_J^I is in a canonical bijection with S^J, as elements of both sets consist of "essentially the same data". To wit, an element of S^J is a family that only has i-th entries for i ∈ J, whereas an element of S_J^I is a family that has i-th entries for all i ∈ I, but subject to the requirement that the i-th entries for all i ∈ I \ J are 0 (so that only the i-th entries for i ∈ J carry any information). More rigorously: The map

      S_J^I → S^J,    (s_i)_{i∈I} ↦ (s_i)_{i∈J}

  is a bijection (since it merely shrinks the family by removing entries that are required to be 0 anyway). We denote this bijection by reduce_J.
• We define S_fin^I to be the set of all essentially finite families (s_i)_{i∈I} ∈ S^I. It is easy to see that S_fin^I is the union of the sets S_J^I over all finite subsets J of I.
Claim 1: For each i ∈ I, the family (p_{i,k})_{k∈S_i} is summable.

[Proof of Claim 1: Let j ∈ I. Then, the pairs (i, k) ∈ S with i = j are precisely the pairs of the form (j, k) with k ∈ S_j and k ≠ 0. In other words, the pairs (i, k) ∈ S with i = j are precisely the pairs of the form (j, k) with k ∈ S_j \ {0}.

We assumed that the family (p_{i,k})_{(i,k)∈S} is summable. Hence, its subfamily (p_{i,k})_{(i,k)∈S with i=j} is summable as well (since a subfamily of a summable family is always summable). In other words, the family (p_{j,k})_{k∈S_j\{0}} is summable (since this family is just a reindexing of the family (p_{i,k})_{(i,k)∈S with i=j}, because the pairs (i, k) ∈ S with i = j are precisely the pairs of the form (j, k) with k ∈ S_j \ {0}). Thus, the family (p_{j,k})_{k∈S_j} is summable as well (since the summability of a family does not change if we insert a single entry into it¹⁷¹).

Forget that we fixed j. We thus have shown that the family (p_{j,k})_{k∈S_j} is summable for each j ∈ I. Renaming j as i in this statement, we obtain the following: The family (p_{i,k})_{k∈S_i} is summable for each i ∈ I. This proves Claim 1.]
Claim 1 shows that the sum ∑_{k∈S_i} p_{i,k} is well-defined for each i ∈ I. Moreover, for each i ∈ I, we have

    ∑_{k∈S_i} p_{i,k} = p_{i,0} + ∑_{k∈S_i\{0}} p_{i,k}    (here, we have split off the addend for k = 0 from the sum, since 0 ∈ S_i)
                      = 1 + ∑_{k∈S_i\{0}} p_{i,k}    (389)

(since p_{i,0} = 1 by (126)).
171 Indeed, the summability of a family is an “all but finitely many k satisfy something” type of
statement. If we insert a single entry into the family, such a statement does not change its
validity.
Claim 2: The family (∑_{k∈S_i} p_{i,k})_{i∈I} is multipliable.

[Proof of Claim 2: The family (p_{i,k})_{(i,k)∈S} is summable (by assumption). We can split its sum into subsums as follows:

    ∑_{(i,k)∈S} p_{i,k} = ∑_{i∈I} ∑_{k∈S_i\{0}} p_{i,k}

(since a pair (i, k) belongs to S if and only if it satisfies i ∈ I and k ∈ S_i \ {0}). This shows that the family (∑_{k∈S_i\{0}} p_{i,k})_{i∈I} is summable. Hence, Theorem 3.11.10 (applied to f_i = ∑_{k∈S_i\{0}} p_{i,k}) yields that the family (1 + ∑_{k∈S_i\{0}} p_{i,k})_{i∈I} is multipliable. In other words, the family (∑_{k∈S_i} p_{i,k})_{i∈I} is multipliable (since (389) shows that this family is precisely the family (1 + ∑_{k∈S_i\{0}} p_{i,k})_{i∈I}). This proves Claim 2.]
Claim 2 shows that the product ∏_{i∈I} ∑_{k∈S_i} p_{i,k} is well-defined.
The family ( pi,k )(i,k)∈S is summable (by assumption). In other words, for any m ∈ N,
all but finitely many (i, k ) ∈ S satisfy [ x m ] pi,k = 0. In other words, for any m ∈ N,
there exists a finite subset Tm of S such that
all (i, k ) ∈ S \ Tm satisfy [ x m ] pi,k = 0. (390)
Consider these finite subsets Tm .
For each n ∈ N, we let Tn′ be the subset T0 ∪ T1 ∪ · · · ∪ Tn of S. This subset Tn′ is finite
(since T0 , T1 , . . . , Tn are finite).
For each n ∈ N, we let I_n be the subset {i ∈ I | (i, k) ∈ T_n′ for some k} of I. This subset I_n is finite (since T_n′ is finite).
Claim 3: Let n ∈ N, and let (i, k) ∈ S \ T_n′. Then, [x^m] p_{i,k} = 0 for each m ∈ {0, 1, ..., n}.

[Proof of Claim 3: Let m ∈ {0, 1, ..., n}. Then, T_m ⊆ T_n′ (since T_n′ is defined as the union T_0 ∪ T_1 ∪ · · · ∪ T_n, whereas T_m is one of the n + 1 sets appearing in this union). Hence, T_n′ ⊇ T_m, so that S \ T_n′ ⊆ S \ T_m. Now, (i, k) ∈ S \ T_n′ ⊆ S \ T_m. Therefore, (390) shows that [x^m] p_{i,k} = 0. This proves Claim 3.]
The following is easy to see:

Claim 4: For each (k_i)_{i∈I} ∈ S_fin^I, the family (p_{i,k_i})_{i∈I} is multipliable.

[Proof of Claim 4: Let (k_i)_{i∈I} ∈ S_fin^I. Thus, (k_i)_{i∈I} is an essentially finite family in S^I. Now, all but finitely many i ∈ I satisfy k_i = 0 (since (k_i)_{i∈I} is essentially finite) and thus p_{i,k_i} = p_{i,0} = 1 (by (126)). Hence, all but finitely many entries of the family (p_{i,k_i})_{i∈I} equal 1. Thus, this family is multipliable (by Proposition 3.11.11). This proves Claim 4.]
Claim 4 shows that the product ∏_{i∈I} p_{i,k_i} is well-defined whenever (k_i)_{i∈I} ∈ S_fin^I. Next, we claim the following:

Claim 5: Let n ∈ N and (k_i)_{i∈I} ∈ S_fin^I. Assume that there exists some j ∈ I such that (j, k_j) ∈ S \ T_n′. Then, [x^n] ∏_{i∈I} p_{i,k_i} = 0.

[Proof of Claim 5: Consider this j. Then, the product ∏_{i∈I} p_{i,k_i} is a multiple of p_{j,k_j} (since p_{j,k_j} is one of the factors of this product). Moreover, we have [x^m] p_{j,k_j} = 0 for each m ∈ {0, 1, ..., n} (by Claim 3, applied to (i, k) = (j, k_j)). Hence, Lemma 3.3.21 (applied to u = p_{j,k_j} and v = ∏_{i∈I} p_{i,k_i}) yields that [x^n] ∏_{i∈I} p_{i,k_i} = 0. This proves Claim 5.]

Claim 6: Let n ∈ N and (k_i)_{i∈I} ∈ S_fin^I \ S_{I_n}^I. Then, [x^n] ∏_{i∈I} p_{i,k_i} = 0.

[Indeed, if (k_i)_{i∈I} ∈ S_fin^I \ S_{I_n}^I, then there exists some j ∈ I \ I_n with k_j ≠ 0; this j satisfies (j, k_j) ∈ S \ T_n′ (since j ∉ I_n), and thus Claim 5 yields [x^n] ∏_{i∈I} p_{i,k_i} = 0.]
Claim 7: The family (∏_{i∈I} p_{i,k_i})_{(k_i)_{i∈I} ∈ S^I_fin} is summable.

[Proof of Claim 7: Let n ∈ N. We shall show that all but finitely many families (k_i)_{i∈I} ∈ S^I_fin satisfy

    [x^n] ∏_{i∈I} p_{i,k_i} = 0.
Fix a family (k_i)_{i∈I} ∈ S^I_fin satisfying

    [x^n] ∏_{i∈I} p_{i,k_i} ≠ 0.    (391)

We are going to show that (k_i)_{i∈I} satisfies the following two properties:

    Property 1: All i ∈ I \ I_n satisfy k_i = 0.

    Property 2: All i ∈ I_n satisfy k_i ∈ K_n ∪ {0}.

These two properties together will restrict the family (k_i)_{i∈I} to finitely many possibilities (since the sets I_n and K_n are finite).
If we had (k_i)_{i∈I} ∉ S^I_{I_n}, then we would have (k_i)_{i∈I} ∈ S^I_fin \ S^I_{I_n} (since (k_i)_{i∈I} ∈ S^I_fin) and thus [x^n] ∏_{i∈I} p_{i,k_i} = 0 (by Claim 6), which would contradict (391). Hence, we cannot have (k_i)_{i∈I} ∉ S^I_{I_n}. Thus, we have (k_i)_{i∈I} ∈ S^I_{I_n}. In other words, all i ∈ I \ I_n satisfy k_i = 0. This proves Property 1.
If there was some j ∈ I_n that satisfies k_j ∉ K_n ∪ {0}, then this k_j would satisfy (j, k_j) ∈ S \ T'_n (see footnote 172), and therefore we would have [x^n] ∏_{i∈I} p_{i,k_i} = 0 (by Claim 5, since j ∈ I_n ⊆ I); but this would contradict (391). Hence, there exists no j ∈ I_n that satisfies k_j ∉ K_n ∪ {0}. In other words, all i ∈ I_n satisfy k_i ∈ K_n ∪ {0}. This proves Property 2.
Now, we have shown that our family (k_i)_{i∈I} satisfies Property 1 and Property 2.
Forget that we fixed (k_i)_{i∈I}. We thus have shown that any family (k_i)_{i∈I} ∈ S^I_fin that satisfies [x^n] ∏_{i∈I} p_{i,k_i} ≠ 0 must satisfy Property 1 and Property 2. In other words, any such family must belong to the set of all families (k_i)_{i∈I} ∈ S^I_fin that satisfy Property 1 and Property 2. However, the latter set is finite (see footnote 173). Hence, there are only finitely many families (k_i)_{i∈I} ∈ S^I_fin that satisfy [x^n] ∏_{i∈I} p_{i,k_i} ≠ 0 (since we have shown that any such family must belong to the finite set of all families (k_i)_{i∈I} ∈ S^I_fin that satisfy Property
172 Proof. Let j ∈ I_n be such that k_j ∉ K_n ∪ {0}. We must show that (j, k_j) ∈ S \ T'_n.
Indeed, we have k_j ∉ K_n ∪ {0}; thus, k_j ∉ K_n and k_j ≠ 0. However, j ∈ I_n ⊆ I and k_j ∈ S_j (since (k_i)_{i∈I} ∈ S^I_fin ⊆ S^I), and therefore (j, k_j) ∈ S (by the definition of S, since k_j ≠ 0). If we had (j, k_j) ∈ T'_n, then we would have k_j ∈ K_n (by the definition of K_n), which would contradict k_j ∉ K_n. Thus, we have (j, k_j) ∉ T'_n. Combining this with (j, k_j) ∈ S, we obtain (j, k_j) ∈ S \ T'_n. This completes our proof.
173 Proof. We must show that Property 1 and Property 2 leave only finitely many options for the family (k_i)_{i∈I}. Indeed, Property 1 shows that all entries k_i with i ∈ I \ I_n are uniquely determined (they all equal 0); meanwhile, Property 2 ensures that the remaining entries (of which there are only finitely many, since the set I_n is finite) must belong to the finite set K_n ∪ {0} (this set is finite, since K_n is finite). Therefore, a family (k_i)_{i∈I} that satisfies Property 1 and Property 2 is uniquely determined by finitely many of its entries (namely, by its entries k_i with i ∈ I_n), and there are finitely many choices for each of them (since they must belong to the finite set K_n ∪ {0}). Hence, there are only finitely many such families (namely, at most |K_n ∪ {0}|^{|I_n|} many options). In other words, the set of all families (k_i)_{i∈I} ∈ S^I_fin that satisfy Property 1 and Property 2 is finite.
1 and 2). In other words, all but finitely many families (k_i)_{i∈I} ∈ S^I_fin satisfy

    [x^n] ∏_{i∈I} p_{i,k_i} = 0.
Forget that we fixed n. We thus have shown that for each n ∈ N, all but finitely many families (k_i)_{i∈I} ∈ S^I_fin satisfy [x^n] ∏_{i∈I} p_{i,k_i} = 0. In other words, the family (∏_{i∈I} p_{i,k_i})_{(k_i)_{i∈I} ∈ S^I_fin} is summable. This proves Claim 7.]

Claim 8: For each n ∈ N, we have

    [x^n] ∑_{(k_i)_{i∈I} ∈ S^I_fin} ∏_{i∈I} p_{i,k_i} = [x^n] ∑_{(k_i)_{i∈I_n} ∈ S^{I_n}} ∏_{i∈I_n} p_{i,k_i}.
[Proof of Claim 8: The set I_n is finite (as we know). Thus, S^I_{I_n} ⊆ S^I_fin (see footnote 174). Hence, the set S^I_fin is the union of its two disjoint subsets S^I_{I_n} and S^I_fin \ S^I_{I_n}.
Furthermore, for each (k_i)_{i∈I} ∈ S^I_{I_n}, we have

    k_i = 0 for all i ∈ I \ I_n.    (392)

174 Proof. Let (k_i)_{i∈I} ∈ S^I_{I_n}. Thus, (k_i)_{i∈I} is a family in S^I that satisfies k_i = 0 for all i ∈ I \ I_n (by the definition of S^I_{I_n}). However, I_n is finite. Thus, k_i = 0 for all but finitely many i ∈ I (since k_i = 0 for all i ∈ I \ I_n). In other words, the family (k_i)_{i∈I} is essentially finite. In other words, (k_i)_{i∈I} ∈ S^I_fin (by the definition of S^I_fin).
Forget that we fixed (k_i)_{i∈I}. We thus have shown that (k_i)_{i∈I} ∈ S^I_fin for each (k_i)_{i∈I} ∈ S^I_{I_n}. In other words, S^I_{I_n} ⊆ S^I_fin.
Now,

    [x^n] ∑_{(k_i)_{i∈I} ∈ S^I_fin} ∏_{i∈I} p_{i,k_i}
    = ∑_{(k_i)_{i∈I} ∈ S^I_fin} [x^n] ∏_{i∈I} p_{i,k_i}
    = ∑_{(k_i)_{i∈I} ∈ S^I_{I_n}} [x^n] ∏_{i∈I} p_{i,k_i} + ∑_{(k_i)_{i∈I} ∈ S^I_fin \ S^I_{I_n}} [x^n] ∏_{i∈I} p_{i,k_i}

(here, we have split the sum, since the set S^I_fin is the union of its two disjoint subsets S^I_{I_n} and S^I_fin \ S^I_{I_n}). In the first sum, each product satisfies ∏_{i∈I} p_{i,k_i} = ∏_{i∈I_n} p_{i,k_i} (by (392)); in the second sum, each addend is 0 (by Claim 6). Thus, this becomes

    = ∑_{(k_i)_{i∈I} ∈ S^I_{I_n}} [x^n] ∏_{i∈I_n} p_{i,k_i} + ∑_{(k_i)_{i∈I} ∈ S^I_fin \ S^I_{I_n}} 0
    = ∑_{(k_i)_{i∈I} ∈ S^I_{I_n}} [x^n] ∏_{i∈I_n} p_{i,k_i}
    = ∑_{(k_i)_{i∈I_n} ∈ S^{I_n}} [x^n] ∏_{i∈I_n} p_{i,k_i}

(here, we have substituted (k_i)_{i∈I_n} for (k_i)_{i∈I} in the sum, since the map S^I_{I_n} → S^{I_n}, (k_i)_{i∈I} ↦ (k_i)_{i∈I_n} is a bijection (indeed, this map is the map we have called reduce_{I_n}))

    = [x^n] ∑_{(k_i)_{i∈I_n} ∈ S^{I_n}} ∏_{i∈I_n} p_{i,k_i}.

This proves Claim 8.]
Now,

    [x^m] ∑_{k∈S_i\{0}} p_{i,k} = ∑_{k∈S_i\{0}} [x^m] p_{i,k} = ∑_{k∈S_i\{0}} 0 = 0

(since each addend [x^m] p_{i,k} equals 0 by (393)).
[Proof of Claim 10: The set I_n is a subset of I. Hence, the set I is the union of its two disjoint subsets I_n and I \ I_n. Thus, we can split the product ∏_{i∈I} ∑_{k∈S_i} p_{i,k} as follows (footnote 175):

    ∏_{i∈I} ∑_{k∈S_i} p_{i,k}
    = (∏_{i∈I_n} ∑_{k∈S_i} p_{i,k}) · (∏_{i∈I\I_n} ∑_{k∈S_i} p_{i,k})
    = (∏_{i∈I_n} ∑_{k∈S_i} p_{i,k}) · (∏_{i∈I\I_n} (1 + ∑_{k∈S_i\{0}} p_{i,k}))    (394)

(by (389)).
However, the family (∑_{k∈S_i\{0}} p_{i,k})_{i∈I} is summable (as we have seen in the proof of Claim 2). Hence, its subfamily (∑_{k∈S_i\{0}} p_{i,k})_{i∈I\I_n} is summable as well (since a subfamily of a summable family is always summable). Moreover, Claim 9 shows that

    ∏_{i∈I_n} ∑_{k∈S_i} p_{i,k} = ∑_{(k_i)_{i∈I_n} ∈ S^{I_n}} ∏_{i∈I_n} p_{i,k_i}.
[Proof of Claim 11: The set I_n is finite. For any i ∈ I_n, the family (p_{i,k})_{k∈S_i} is summable (by Claim 1). Hence, Proposition 3.11.31 (applied to N = I_n) yields

    ∏_{i∈I_n} ∑_{k∈S_i} p_{i,k} = ∑_{(k_i)_{i∈I_n} ∈ ∏_{i∈I_n} S_i} ∏_{i∈I_n} p_{i,k_i} = ∑_{(k_i)_{i∈I_n} ∈ S^{I_n}} ∏_{i∈I_n} p_{i,k_i}.]

Continuing our computation, we find

    = [x^n] ∑_{(k_i)_{i∈I_n} ∈ S^{I_n}} ∏_{i∈I_n} p_{i,k_i}
        (since Claim 11 yields ∏_{i∈I_n} ∑_{k∈S_i} p_{i,k} = ∑_{(k_i)_{i∈I_n} ∈ S^{I_n}} ∏_{i∈I_n} p_{i,k_i})
    = [x^n] ∑_{(k_i)_{i∈I} ∈ S^I_fin} ∏_{i∈I} p_{i,k_i}    (by Claim 8).
That is, any coefficient of the FPS ∏_{i∈I} ∑_{k∈S_i} p_{i,k} equals the corresponding coefficient of ∑_{(k_i)_{i∈I} ∈ S^I_fin} ∏_{i∈I} p_{i,k_i}. Hence,

    ∏_{i∈I} ∑_{k∈S_i} p_{i,k} = ∑_{(k_i)_{i∈I} ∈ S^I_fin} ∏_{i∈I} p_{i,k_i} = ∑_{(k_i)_{i∈I} ∈ ∏_{i∈I} S_i; (k_i)_{i∈I} is essentially finite} ∏_{i∈I} p_{i,k_i}.
Our proof of Proposition 3.11.36 will use the finite analogue of Proposition 3.11.36,
which is easy:
Proof of Lemma B.2.4. This follows by a straightforward induction on | I |. (The base case
is the case when | I | = 0, and relies on the fact that 1 ◦ g = 1 for any g ∈ K [[ x ]]. The
induction step relies on Proposition 3.5.4 (b). The details are left to the reader, who
must have seen dozens of such proofs by now.)
We also have g ≡_{x^n} g (since the relation ≡_{x^n} is an equivalence relation). Hence, Proposition 3.10.5 (applied to a = ∏_{i∈J} f_i and b = ∏_{i∈M} f_i and c = g and d = g) yields

    (∏_{i∈J} f_i) ∘ g ≡_{x^n} (∏_{i∈M} f_i) ∘ g.
In other words,

    each m ∈ {0, 1, . . . , n} satisfies [x^m] ∏_{i∈J} (f_i ∘ g) = [x^m] ∏_{i∈M} (f_i ∘ g)

(by the definition of the relation ≡_{x^n}). Applying this to m = n, we obtain

    [x^n] ∏_{i∈J} (f_i ∘ g) = [x^n] ∏_{i∈M} (f_i ∘ g).

Forget that we fixed J. We thus have shown that every finite subset J of I satisfying M ⊆ J ⊆ I satisfies

    [x^n] ∏_{i∈J} (f_i ∘ g) = [x^n] ∏_{i∈M} (f_i ∘ g).

In other words, the set M determines the x^n-coefficient in the product of (f_i ∘ g)_{i∈I} (by the definition of what it means to “determine the x^n-coefficient in the product of (f_i ∘ g)_{i∈I}”). This proves Claim 1.]
Now, let n ∈ N. Lemma 3.11.15 (applied to a_i = f_i) shows that there exists an x^n-approximator for (f_i)_{i∈I}. Consider this x^n-approximator for (f_i)_{i∈I}, and denote it by M. Thus, M is an x^n-approximator for (f_i)_{i∈I}; in other words, M is a finite subset of I that determines the first n + 1 coefficients in the product of (f_i)_{i∈I} (by the definition of “x^n-approximator”). Claim 1 shows that the set M determines the x^n-coefficient in the product of (f_i ∘ g)_{i∈I}. Hence, there is a finite subset of I that determines the x^n-coefficient in the product of (f_i ∘ g)_{i∈I} (namely, M). In other words, the x^n-coefficient in the product of (f_i ∘ g)_{i∈I} is finitely determined.
Forget that we fixed n. We thus have shown that for each n ∈ N, the x^n-coefficient in the product of (f_i ∘ g)_{i∈I} is finitely determined. In other words, each coefficient in the product of (f_i ∘ g)_{i∈I} is finitely determined. In other words, the family (f_i ∘ g)_{i∈I} is multipliable (by the definition of “multipliable”).
It remains to prove that (∏_{i∈I} f_i) ∘ g = ∏_{i∈I} (f_i ∘ g).
In order to do so, we again fix n ∈ N. Lemma 3.11.15 (applied to a_i = f_i) shows that there exists an x^n-approximator for (f_i)_{i∈I}. Consider this x^n-approximator for (f_i)_{i∈I}, and denote it by M. Thus, M is an x^n-approximator for (f_i)_{i∈I}; in other words, M is a finite subset of I that determines the first n + 1 coefficients in the product of (f_i)_{i∈I} (by the definition of “x^n-approximator”). Moreover, Proposition 3.11.16 (b) (applied to a_i = f_i) yields

    ∏_{i∈I} f_i ≡_{x^n} ∏_{i∈M} f_i.    (397)
We also have g ≡_{x^n} g (since the relation ≡_{x^n} is an equivalence relation). Hence, Proposition 3.10.5 (applied to a = ∏_{i∈I} f_i and b = ∏_{i∈M} f_i and c = g and d = g) yields

    (∏_{i∈I} f_i) ∘ g ≡_{x^n} (∏_{i∈M} f_i) ∘ g.    (398)

In other words,

    each m ∈ {0, 1, . . . , n} satisfies [x^m] ((∏_{i∈I} f_i) ∘ g) = [x^m] ∏_{i∈M} (f_i ∘ g)

(by the definition of the relation ≡_{x^n}). Applying this to m = n, we obtain

    [x^n] ((∏_{i∈I} f_i) ∘ g) = [x^n] ∏_{i∈M} (f_i ∘ g).    (399)

However, Claim 1 shows that the set M determines the x^n-coefficient in the product of (f_i ∘ g)_{i∈I}. Hence, the definition of the infinite product ∏_{i∈I} (f_i ∘ g) (specifically, Definition 3.11.5 (b)) yields

    [x^n] ∏_{i∈I} (f_i ∘ g) = [x^n] ∏_{i∈M} (f_i ∘ g).

Comparing this with (399), we obtain

    [x^n] ((∏_{i∈I} f_i) ∘ g) = [x^n] ∏_{i∈I} (f_i ∘ g).

Forget that we fixed n. We thus have shown that each coefficient of the FPS (∏_{i∈I} f_i) ∘ g equals the corresponding coefficient of ∏_{i∈I} (f_i ∘ g). Hence, (∏_{i∈I} f_i) ∘ g = ∏_{i∈I} (f_i ∘ g). This completes the proof.
(a) For each even positive integer n, we let A_n be the following domino tiling of R_{n,3} (originally shown as a figure):
Formally speaking, this domino tiling is the set partition of R_{n,3} consisting of the following dominos:
• the horizontal dominos {(2i − 1, 1) , (2i, 1)} for all i ∈ [n/2], which fill the
bottom row of Rn,3 , and which we call the basement dominos;
• the vertical domino {(1, 2) , (1, 3)} in the first column, which we call the left
wall;
• the vertical domino {(n, 2) , (n, 3)} in the last column, which we call the right
wall;
• the horizontal dominos {(2i, 2) , (2i + 1, 2)} for all i ∈ [n/2 − 1], which fill
the middle row of Rn,3 (except for the first and last columns), and which we
call the middle dominos;
• the horizontal dominos {(2i, 3) , (2i + 1, 3)} for all i ∈ [n/2 − 1], which fill
the top row of Rn,3 (except for the first and last columns), and which we call
the top dominos.
(b) For each even positive integer n, we let B_n be the domino tiling of R_{n,3} obtained by reflecting the tiling A_n across the horizontal axis of symmetry of R_{n,3}.
(c) We let C denote the domino tiling of R_{2,3} that consists of three horizontal dominos (one in each of the three rows).
Proposition B.3.2. The faultfree domino tilings of height-3 rectangles are precisely
the tilings
A2 , A4 , A6 , A8 , . . . , B2 , B4 , B6 , B8 , . . . , C
we have just defined. More concretely:
(a) The faultfree domino tilings of a height-3 rectangle that contain a vertical
domino in the top two squares of the first column are A2 , A4 , A6 , A8 , . . ..
(b) The faultfree domino tilings of a height-3 rectangle that contain a vertical
domino in the bottom two squares of the first column are B2 , B4 , B6 , B8 , . . ..
(c) The only faultfree domino tiling of a height-3 rectangle that contains no vertical
domino in the first column is C.
Proof of Proposition B.3.2 (sketched). It clearly suffices to prove parts (a), (b) and (c). We
begin with the easiest part, which is (c):
(c) Clearly, C is a faultfree domino tiling of R2,3 that contains no vertical domino in
the first column. It remains to show that C is the only such tiling.
Indeed, let T be any faultfree domino tiling of a height-3 rectangle R_{n,3} that contains no vertical domino in the first column. Thus, its first column must be filled with three horizontal dominos,
which all must protrude into the second column and thus cover that second column as
well. If T had any further column, then T would have a fault between its 2-nd and its
3-rd column, which is impossible for a faultfree tiling. Thus, T must consist only of the
three horizontal dominos already mentioned. In other words, T = C. This completes
the proof of Proposition B.3.2 (c).
(a) It is straightforward to check that the tilings A2 , A4 , A6 , A8 , . . . are faultfree (in-
deed, the basement dominos prevent faults between the (2i − 1)-st and (2i )-th columns,
whereas the top dominos prevent faults between the (2i )-th and (2i + 1)-st columns).
Thus, they are faultfree domino tilings of a height-3 rectangle that contain a vertical
domino in the top two squares of the first column. It remains to show that they are the
only such tilings.
Indeed, let T be any faultfree domino tiling of a height-3 rectangle that contains a
vertical domino in the top two squares of the first column. Let n be the width of this
rectangle (so that the rectangle is Rn,3 ). We shall show that n is even, and that T = An .
We know that T contains a vertical domino in the top two squares of the first column.
In other words, T contains the left wall (where we are using the terminology from
Definition B.3.1 (a)). The remaining square in the first column is the square (1, 1), and
it must thus be covered by the basement domino {(1, 1) , (2, 1)} (since no other domino
would fit). Hence, T contains the basement domino {(1, 1) , (2, 1)}.
We shall now prove the following claims:
Claim 1: For each positive integer i < n/2, the tiling T contains the base-
ment domino {(2i − 1, 1) , (2i, 1)}, the middle domino {(2i, 2) , (2i + 1, 2)}
and the top domino {(2i, 3) , (2i + 1, 3)}.
Base case: We must prove that Claim 1 holds for i = 1, provided that 1 < n/2. So
let us assume that 1 < n/2. Thus, 2 < n, so that n ≥ 3. We must prove that Claim 1
holds for i = 1; in other words, we must prove that T contains the basement domino
{(1, 1) , (2, 1)}, the middle domino {(2, 2) , (3, 2)} and the top domino {(2, 3) , (3, 3)}.
For the basement domino, we have already proved this. For the middle and the top
domino, we argue as follows: If T had a vertical domino in the 2-nd column, then this
domino would cover the top two squares of that column (since the bottom square is
already covered by the basement domino {(1, 1) , (2, 1)}), and thus T would have a
fault between the 2-nd and 3-rd columns (since the basement domino {(1, 1) , (2, 1)}
also ends at the 2-nd column), which would contradict the faultfreeness of T. Hence,
T has no vertical domino in the 2-nd column. Thus, the top two squares of the 2-
nd column of T must be covered by horizontal dominos. These horizontal dominos
must both protrude into the 3-rd column (since the corresponding squares in the 1-st
column are already covered by the left wall), and thus must be the middle domino
{(2, 2) , (3, 2)} and the top domino {(2, 3) , (3, 3)}. Hence, we have shown that T
contains the middle domino {(2, 2) , (3, 2)} and the top domino {(2, 3) , (3, 3)}. This
completes the base case.
Induction step: Let j be a positive integer such that j + 1 < n/2. Assume (as the induc-
tion hypothesis) that Claim 1 holds for i = j. In other words, T contains the basement
domino {(2j − 1, 1) , (2j, 1)}, the middle domino {(2j, 2) , (2j + 1, 2)} and the top
domino {(2j, 3) , (2j + 1, 3)}. We must now show that Claim 1 holds for i = j + 1 as
well, i.e., that T also contains the basement domino {(2j + 1, 1) , (2j + 2, 1)}, the mid-
dle domino {(2j + 2, 2) , (2j + 3, 2)} and the top domino {(2j + 2, 3) , (2j + 3, 3)}.
Indeed, we first recall that j + 1 < n/2, so that 2 ( j + 1) < n and therefore n >
2 ( j + 1) = 2j + 2, so that n ≥ 2j + 3. This shows that the rectangle Rn,3 has a (2j + 3)-
th column (along with all columns to its left).
Now, the square (2j + 1, 1) cannot be covered by a vertical domino in T, since this
vertical domino would collide with the middle domino {(2j, 2) , (2j + 1, 2)} (which
we already know to belong to T). Thus, this square must be covered by a horizon-
tal domino. This horizontal domino cannot protrude into the (2j)-th column (since it
would then collide with the basement domino {(2j − 1, 1) , (2j, 1)}, which we already
know to belong to T), and thus must be the basement domino {(2j + 1, 1) , (2j + 2, 1)}.
So we have shown that T contains the basement domino {(2j + 1, 1) , (2j + 2, 1)}. It
remains to show that T also contains the middle domino {(2j + 2, 2) , (2j + 3, 2)} and
the top domino {(2j + 2, 3) , (2j + 3, 3)}.
If T had a vertical domino in the (2j + 2)-nd column, then this domino would
cover the top two squares of that column (since the bottom square is already cov-
ered by the basement domino {(2j + 1, 1) , (2j + 2, 1)}), and thus T would have a
fault between the (2j + 2)-nd and (2j + 3)-rd columns (since the basement domino
{(2j + 1, 1) , (2j + 2, 1)} also ends at the (2j + 2)-nd column), which would contra-
dict the faultfreeness of T. Hence, T has no vertical domino in the (2j + 2)-nd column.
Thus, the top two squares of the (2j + 2)-nd column of T must be covered by horizon-
tal dominos. These horizontal dominos must both protrude into the (2j + 3)-rd col-
umn (since the corresponding squares in the (2j + 1)-st column are already covered by
the middle domino {(2j, 2) , (2j + 1, 2)} and the top domino {(2j, 3) , (2j + 1, 3)}),
and thus must be the middle domino {(2j + 2, 2) , (2j + 3, 2)} and the top domino
{(2j + 2, 3) , (2j + 3, 3)}. Hence, we have shown that T contains the middle domino
{(2j + 2, 2) , (2j + 3, 2)} and the top domino {(2j + 2, 3) , (2j + 3, 3)}. This com-
pletes the induction step.
Thus, Claim 1 is proved by induction.
Claim 2: The width n is even.

Proof of Claim 2. Assume the contrary. Thus, n is odd. But n > 1 (since R_{1,3} cannot be tiled by dominos), so that n − 1 > 0. Hence, (n − 1)/2 is a positive integer (since n is odd).
basement domino {(n − 2, 1) , (n − 1, 1)}, the middle domino {(n − 1, 2) , (n, 2)}
and the top domino {(n − 1, 3) , (n, 3)}. But T must also include a domino that
contains the square (n, 1) (since (n, 1) ∈ Rn,3 ). This domino cannot be vertical (since
it would then collide with the basement domino {(n − 2, 1) , (n − 1, 1)}, which we
know to belong to T), and cannot be horizontal either (since it would then collide with
the middle domino {(n − 1, 2) , (n, 2)}). This is clearly absurd. This contradiction
shows that our assumption was wrong, so that Claim 2 is proved.
(Alternatively, Claim 2 can be obtained from a parity argument: Since T is a tiling of
Rn,3 by dominos, the total # of squares of Rn,3 must be even (since each domino covers
exactly 2 squares). But this total # is 3n. Thus, 3n must be even, so that n must be
even.)
Now, Claim 2 shows that n is even, so that n/2 is a positive integer. Furthermore, we know that the tiling T contains the left wall as well as (by Claim 1) the basement dominos, the middle dominos and the top dominos for all positive integers i < n/2. This leaves only the four squares (n − 1, 1), (n, 1), (n, 2) and (n, 3)
unaccounted for, but there is only one way to tile them: namely, with the last basement
domino {(n − 1, 1) , (n, 1)} and the right wall {(n, 2) , (n, 3)}. Thus, T must contain
all the basement dominos, all the middle dominos, all the top dominos and both walls
(left and right). Since these dominos cover all the squares of Rn,3 , this entails that T
consists precisely of these dominos. In other words, T = An . The proof of Proposition
B.3.2 (a) is thus complete.
(b) Proposition B.3.2 (b) is just Proposition B.3.2 (a) reflected across the horizontal
axis of symmetry of Rn,3 .
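Proposition B.3.2 can be verified by brute force for small widths. The following Python sketch is our own code, not from the text: it enumerates all domino tilings of R_{n,3}, filters out those with a fault, and confirms the counts the proposition predicts — three faultfree tilings for n = 2 (namely A_2, B_2 and C), exactly two for each larger even n (namely A_n and B_n), and none for odd n:

```python
def tilings(n, height=3):
    """Enumerate all domino tilings of R_{n,height}.
    Cells are (column, row) pairs with 1 <= column <= n, 1 <= row <= height."""
    cells = [(c, r) for c in range(1, n + 1) for r in range(1, height + 1)]
    results = []

    def backtrack(uncovered, placed):
        if not uncovered:
            results.append(frozenset(placed))
            return
        c, r = min(uncovered)  # first uncovered cell in column-major order
        # try a horizontal domino {(c, r), (c+1, r)}
        if (c + 1, r) in uncovered:
            backtrack(uncovered - {(c, r), (c + 1, r)},
                      placed + [((c, r), (c + 1, r))])
        # try a vertical domino {(c, r), (c, r+1)}
        if (c, r + 1) in uncovered:
            backtrack(uncovered - {(c, r), (c, r + 1)},
                      placed + [((c, r), (c, r + 1))])

    backtrack(frozenset(cells), [])
    return results

def is_faultfree(tiling, n):
    """Faultfree = every vertical line between columns j and j+1 is crossed
    by some horizontal domino."""
    for j in range(1, n):
        if not any(d[0][0] == j and d[1][0] == j + 1 for d in tiling):
            return False
    return True

counts = {n: sum(is_faultfree(t, n) for t in tilings(n)) for n in range(1, 7)}
assert counts == {1: 0, 2: 3, 3: 0, 4: 2, 5: 0, 6: 2}
```

The odd widths give 0 because R_{n,3} has 3n squares, which is odd; the even widths ≥ 4 give exactly the pair A_n, B_n.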
Detailed proof of Lemma 3.13.7. We have lim_{i→∞} f_i = f. In other words, the sequence (f_i)_{i∈N} coefficientwise stabilizes to f. In other words, for each n ∈ N,

    the sequence ([x^n] f_i)_{i∈N} stabilizes to [x^n] f.    (400)

Now, fix n ∈ N, and let k ∈ {0, 1, . . . , n}. Then, the sequence ([x^k] f_i)_{i∈N} stabilizes to [x^k] f (by (400), applied to k instead of n). In other words, there exists some N_k ∈ N such that

    all integers i ≥ N_k satisfy [x^k] f_i = [x^k] f.    (401)

Consider this N_k.
Forget that we fixed k. Thus, for each k ∈ {0, 1, . . . , n}, we have defined an integer N_k ∈ N for which (401) holds. Altogether, we have thus defined n + 1 integers N_0, N_1, . . . , N_n ∈ N.
Let us set P := max {N_0, N_1, . . . , N_n}. Then, of course, P ∈ N.
Now, let i ≥ P be an integer. Then, for each k ∈ {0, 1, . . . , n}, we have i ≥ P = max {N_0, N_1, . . . , N_n} ≥ N_k (since k ∈ {0, 1, . . . , n}) and therefore [x^k] f_i = [x^k] f (by (401)). In other words, each k ∈ {0, 1, . . . , n} satisfies [x^k] f_i = [x^k] f. Renaming the index k as m in this statement, we obtain: each m ∈ {0, 1, . . . , n} satisfies [x^m] f_i = [x^m] f. In other words, f_i ≡_{x^n} f (by the definition of x^n-equivalence; see footnote 176).
Forget that we fixed i. We thus have shown that all integers i ≥ P satisfy f_i ≡_{x^n} f. Hence, there exists some integer N ∈ N such that

    all integers i ≥ N satisfy f_i ≡_{x^n} f

(namely, N = P). This proves Lemma 3.13.7.
Detailed proof of Proposition 3.13.8. Recall that lim_{i→∞} f_i = f. Let n ∈ N. Then, Lemma 3.13.7 shows that there exists some integer N ∈ N such that

    all integers i ≥ N satisfy f_i ≡_{x^n} f.

Let us denote this N by K. Hence, all integers i ≥ K satisfy f_i ≡_{x^n} f. Thus, we have found an integer K ∈ N such that

    all integers i ≥ K satisfy f_i ≡_{x^n} f.    (402)

Similarly, using lim_{i→∞} g_i = g, we can find an integer L ∈ N such that

    all integers i ≥ L satisfy g_i ≡_{x^n} g.    (403)

Consider these K and L. Let us furthermore set P := max {K, L}. Thus, P ∈ N.
We shall now show that each integer i ≥ P satisfies [x^n] (f_i g_i) = [x^n] (f g).
Indeed, let i ≥ P be an integer. Then, (402) yields f_i ≡_{x^n} f (since i ≥ P = max {K, L} ≥ K), whereas (403) yields g_i ≡_{x^n} g (since i ≥ P = max {K, L} ≥ L). Hence, we obtain

    f_i g_i ≡_{x^n} f g

(since x^n-equivalence respects multiplication, by Theorem 3.10.3). In other words, each m ∈ {0, 1, . . . , n} satisfies [x^m] (f_i g_i) = [x^m] (f g). Applying this to m = n, we obtain

    [x^n] (f_i g_i) = [x^n] (f g).

Forget that we fixed i. We thus have shown that all integers i ≥ P satisfy [x^n] (f_i g_i) = [x^n] (f g). Hence, there exists some N ∈ N such that

    all integers i ≥ N satisfy [x^n] (f_i g_i) = [x^n] (f g)

(namely, N = P). In other words, the sequence ([x^n] (f_i g_i))_{i∈N} stabilizes to [x^n] (f g).
Proof of Claim 1. We have assumed that lim_{i→∞} g_i = g. In other words, the sequence (g_i)_{i∈N} coefficientwise stabilizes to g. In other words, for each n ∈ N, the sequence ([x^n] g_i)_{i∈N} stabilizes to [x^n] g (by the definition of “coefficientwise stabilizing”). Applying this to n = 0, we see that the sequence ([x^0] g_i)_{i∈N} stabilizes to [x^0] g. In other words, there exists some N ∈ N such that

    all integers i ≥ N satisfy [x^0] g_i = [x^0] g

(by the definition of “stabilizes”). Consider this N. Thus, all integers i ≥ N satisfy [x^0] g_i = [x^0] g. Applying this to i = N, we obtain [x^0] g_N = [x^0] g.
However, each FPS g_i is invertible (by assumption). Hence, in particular, the FPS g_N is invertible. By Proposition 3.3.7 (applied to a = g_N), this entails that its constant term [x^0] g_N is invertible in K. In other words, [x^0] g is invertible in K (since [x^0] g_N = [x^0] g).
But the FPS g is invertible if and only if its constant term [x^0] g is invertible in K (by Proposition 3.3.7, applied to a = g). Hence, g is invertible (since its constant term [x^0] g is invertible in K). This proves Claim 1.
It remains to prove that lim_{i→∞} (f_i / g_i) = f / g.
In order to do so, we fix n ∈ N. Recall that lim_{i→∞} f_i = f. Hence, Lemma 3.13.7 shows that there exists some integer N ∈ N such that

    all integers i ≥ N satisfy f_i ≡_{x^n} f.

Let us denote this N by K. Hence, all integers i ≥ K satisfy f_i ≡_{x^n} f. Thus, we have found an integer K ∈ N such that

    all integers i ≥ K satisfy f_i ≡_{x^n} f.    (404)

Similarly, using lim_{i→∞} g_i = g, we can find an integer L ∈ N such that

    all integers i ≥ L satisfy g_i ≡_{x^n} g.    (405)

Consider these K and L. Let us furthermore set P := max {K, L}. Thus, P ∈ N.
We shall now show that each integer i ≥ P satisfies [x^n] (f_i / g_i) = [x^n] (f / g).
Indeed, let i ≥ P be an integer. Then, (404) yields f_i ≡_{x^n} f (since i ≥ P = max {K, L} ≥ K), whereas (405) yields g_i ≡_{x^n} g (since i ≥ P = max {K, L} ≥ L). Since both FPSs g_i and g are invertible (by Claim 1), we thus obtain

    f_i / g_i ≡_{x^n} f / g

(by Theorem 3.10.3 (e), applied to a = f_i, b = f, c = g_i and d = g). In other words,

    each m ∈ {0, 1, . . . , n} satisfies [x^m] (f_i / g_i) = [x^m] (f / g)

(by the definition of x^n-equivalence). Applying this to m = n, we find

    [x^n] (f_i / g_i) = [x^n] (f / g).

Forget that we fixed i. We thus have shown that all integers i ≥ P satisfy [x^n] (f_i / g_i) = [x^n] (f / g). Hence, there exists some N ∈ N such that

    all integers i ≥ N satisfy [x^n] (f_i / g_i) = [x^n] (f / g)
Proof of Theorem 3.13.14. The family (f_n)_{n∈N} is summable. In other words, the family (f_k)_{k∈N} is summable (since this is the same family as (f_n)_{n∈N}). In other words, for each n ∈ N,

    all but finitely many k ∈ N satisfy [x^n] f_k = 0    (406)

(by the definition of “summable”).
Let us define

    g_i := ∑_{k=0}^{i} f_k    for each i ∈ N.    (407)
Now, fix n ∈ N. Then, all but finitely many k ∈ N satisfy [x^n] f_k = 0 (by (406)). In other words, there exists a finite subset J of N such that

    all k ∈ N \ J satisfy [x^n] f_k = 0.    (409)

Consider this subset J. The set J is a finite set of nonnegative integers, and thus has an upper bound (since any finite set of nonnegative integers has an upper bound). In other words, there exists some m ∈ N such that

    all k ∈ J satisfy k ≤ m.    (410)

Consider this m.
Let k ∈ N be such that k ≥ m + 1. If we had k ∈ J, then we would have k ≤ m (by (410)), which would contradict k ≥ m + 1 > m. Thus, we cannot have k ∈ J. Hence, k ∈ N \ J (since k ∈ N but not k ∈ J). Therefore, (409) yields [x^n] f_k = 0.
Forget that we fixed k. We thus have shown that

    all integers k ≥ m + 1 satisfy [x^n] f_k = 0.    (411)

On the other hand, from g_i = ∑_{k=0}^{i} f_k, we obtain

    [x^n] g_i = [x^n] ∑_{k=0}^{i} f_k = ∑_{k=0}^{i} [x^n] f_k.

Let g := ∑_{k∈N} f_k, so that [x^n] g = ∑_{k∈N} [x^n] f_k = ∑_{k=0}^{m} [x^n] f_k (here, we have dropped all the addends with k ≥ m + 1, since they are 0 by (411)). Hence, every integer i ≥ m satisfies

    [x^n] g_i = ∑_{k=0}^{i} [x^n] f_k = ∑_{k=0}^{m} [x^n] f_k = [x^n] g

(again since the addends with k ≥ m + 1 are 0). Thus, there exists some N ∈ N such that all integers i ≥ N satisfy [x^n] g_i = [x^n] g (namely, N = m). In other words, the sequence ([x^n] g_i)_{i∈N} stabilizes to [x^n] g (by the definition of “stabilizes”).
Forget that we fixed n. We thus have shown that for each n ∈ N, the sequence ([x^n] g_i)_{i∈N} stabilizes to [x^n] g. In other words, the sequence (g_i)_{i∈N} coefficientwise stabilizes to g; that is, lim_{i→∞} g_i = g. In other words,

    lim_{i→∞} ∑_{k=0}^{i} f_k = ∑_{k∈N} f_k.

This proves Theorem 3.13.14.
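Theorem 3.13.14 can be illustrated with a small computation. In the following Python sketch (our own illustration; the family f_k = x^k is just a convenient summable example, and all names are ad hoc), the partial sums g_i = ∑_{k=0}^{i} x^k are stored as truncated coefficient lists, and one watches a fixed coefficient of g_i stabilize to the corresponding coefficient of ∑_{k∈N} x^k = 1/(1 − x):

```python
N = 12  # all FPSs are represented by their first N coefficients

def xpow(k):
    """The FPS x^k, truncated to N coefficients."""
    f = [0] * N
    if k < N:
        f[k] = 1
    return f

# partial sums g_i = f_0 + f_1 + ... + f_i of the summable family f_k = x^k
g = [0] * N
history = []  # history[i] records the coefficient [x^5] g_i
for i in range(N):
    g = [u + v for u, v in zip(g, xpow(i))]
    history.append(g[5])

# the sequence ([x^5] g_i) stabilizes (here: from i = 5 on) to
# [x^5] (1/(1-x)) = 1, as the theorem predicts for every fixed coefficient
assert history == [0] * 5 + [1] * (N - 5)
```

Summability is exactly what makes each such coefficient sequence eventually constant, so the coefficientwise limit exists.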
Proof of Theorem 3.13.15. The family (f_n)_{n∈N} is multipliable. In other words, the family (f_k)_{k∈N} is multipliable (since this is the same family as (f_n)_{n∈N}). In other words, each coefficient in the product of this family is finitely determined (by the definition of “multipliable”).
Let us define

    g_i := ∏_{k=0}^{i} f_k    for each i ∈ N.    (413)
Now, fix n ∈ N. Then, the x^n-coefficient in the product of the family (f_k)_{k∈N} is finitely determined (since each coefficient in the product of this family is finitely determined). In other words, there is a finite subset M of N that determines the x^n-coefficient in the product of (f_k)_{k∈N} (by the definition of “finitely determined”). Consider this subset M.
The set M is a finite set of nonnegative integers, and thus has an upper bound (since any finite set of nonnegative integers has an upper bound). In other words, there exists some m ∈ N such that

    all k ∈ M satisfy k ≤ m.    (415)

Consider this m.
Now, let i be an integer such that i ≥ m. Then, i ∈ N (since i ≥ m ≥ 0) and m ≤ i. From (415), we see that all k ∈ M satisfy k ≤ m ≤ i and therefore k ∈ {0, 1, . . . , i}. In other words, M ⊆ {0, 1, . . . , i}.
However, the set M determines the x^n-coefficient in the product of (f_k)_{k∈N}. In other words, every finite subset J of N satisfying M ⊆ J ⊆ N satisfies

    [x^n] ∏_{k∈J} f_k = [x^n] ∏_{k∈M} f_k

(by the definition of “determines the x^n-coefficient in the product of (f_k)_{k∈N}”). Applying this to J = {0, 1, . . . , i}, we obtain

    [x^n] ∏_{k∈{0,1,...,i}} f_k = [x^n] ∏_{k∈M} f_k.    (416)
In view of ∏_{k∈{0,1,...,i}} f_k = ∏_{k=0}^{i} f_k = g_i (by (413)) and ∏_{k∈N} f_k = g (by (414)), we can rewrite this as

    [x^n] g_i = [x^n] g

(here, we have also used the fact that [x^n] ∏_{k∈N} f_k = [x^n] ∏_{k∈M} f_k, which follows from the definition of the infinite product ∏_{k∈N} f_k, since M determines its x^n-coefficient).
Forget that we fixed i. We thus have shown that all integers i ≥ m satisfy [x^n] g_i = [x^n] g. Hence, there exists some N ∈ N such that all integers i ≥ N satisfy [x^n] g_i = [x^n] g (namely, N = m). In other words, the sequence ([x^n] g_i)_{i∈N} stabilizes to [x^n] g (by the definition of “stabilizes”).
Forget that we fixed n. We thus have shown that for each n ∈ N, the sequence ([x^n] g_i)_{i∈N} stabilizes to [x^n] g. In other words, lim_{i→∞} g_i = g. In other words,

    lim_{i→∞} ∏_{k=0}^{i} f_k = ∏_{k∈N} f_k.

This proves Theorem 3.13.15.
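Theorem 3.13.15 can be illustrated in the same fashion. The Python sketch below is our own (the family f_k = 1 + x^{2^k} is not from the text, just a convenient multipliable example): it computes the partial products g_i = ∏_{k=0}^{i} (1 + x^{2^k}) modulo x^32 and watches a fixed coefficient stabilize. Since every nonnegative integer has a unique binary representation, the infinite product is 1/(1 − x), so every coefficient stabilizes to 1:

```python
N = 32  # truncate all FPSs at x^N

def mul(a, b):
    """Product of two truncated coefficient lists."""
    c = [0] * N
    for i, ai in enumerate(a):
        if ai:
            for j in range(N - i):
                c[i + j] += ai * b[j]
    return c

# partial products g_i of the multipliable family f_k = 1 + x^(2^k);
# factors with 2^k >= N leave all coefficients below x^N unchanged
g = [1] + [0] * (N - 1)
history = []  # history[i] records the coefficient [x^20] g_i
for k in range(5):
    f = [0] * N
    f[0] = 1
    f[2 ** k] = 1
    g = mul(g, f)
    history.append(g[20])

# each n < N has a unique binary representation, so every coefficient
# of the partial products stabilizes to 1 (the coefficients of 1/(1-x))
assert g == [1] * N
assert history == [0, 0, 0, 0, 1]
```

The coefficient [x^20] becomes correct exactly once the factor 1 + x^16 has been multiplied in, mirroring how an x^n-approximator captures a fixed coefficient.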
In other words, a = lim_{i→∞} ∑_{n=0}^{i} a_n x^n. This proves Corollary 3.13.16.
We have assumed that the limit lim_{i→∞} ∑_{n=0}^{i} f_n exists. Let us denote this limit by g. Thus,

    g = lim_{i→∞} ∑_{n=0}^{i} f_n = lim_{i→∞} ∑_{k=0}^{i} f_k    (here, we have renamed the summation index n as k)
      = lim_{i→∞} g_i    (by (417)).    (418)

In other words, the sequence (g_i)_{i∈N} coefficientwise stabilizes to g. In other words, for each n ∈ N,

    the sequence ([x^n] g_i)_{i∈N} stabilizes to [x^n] g.    (419)

Let n ∈ N. Then, the sequence ([x^n] g_i)_{i∈N} stabilizes to [x^n] g (by (419)). In other words, there exists some N ∈ N such that all integers i ≥ N satisfy [x^n] g_i = [x^n] g. Consider this N.
Now, let i ≥ N + 1 be an integer. Then, both i ≥ N and i − 1 ≥ N, so that [x^n] g_i = [x^n] g and [x^n] g_{i−1} = [x^n] g. Comparing these two equalities, we obtain

    [x^n] g_{i−1} = [x^n] g_i.    (421)

However, the definition of the g_i yields g_i = g_{i−1} + f_i, so that [x^n] g_i = [x^n] g_{i−1} + [x^n] f_i and therefore [x^n] f_i = [x^n] g_i − [x^n] g_{i−1} = 0 (by (421)).
Forget that we fixed i. We thus have shown that all integers i ≥ N + 1 satisfy [x^n] f_i = 0. Hence, all but finitely many i ∈ N satisfy [x^n] f_i = 0.
Forget that we fixed n. We thus have shown that for each n ∈ N, all but finitely many i ∈ N satisfy [x^n] f_i = 0. In other words, the family (f_n)_{n∈N} is summable (by the definition of “summable”).
Hence, Theorem 3.13.14 yields lim_{i→∞} ∑_{n=0}^{i} f_n = ∑_{n∈N} f_n. In other words, ∑_{n∈N} f_n = lim_{i→∞} ∑_{n=0}^{i} f_n. This completes the proof of Theorem 3.13.17.
We have assumed that the limit lim_{i→∞} ∏_{n=0}^{i} f_n exists. Let us denote this limit by g. Thus,

    g = lim_{i→∞} ∏_{n=0}^{i} f_n = lim_{i→∞} ∏_{k=0}^{i} f_k    (here, we have renamed the product index n as k)
      = lim_{i→∞} g_i    (by (423)).    (424)
(since g_N was defined to be ∏_{k=0}^{N} f_k). Applying (425) to i = N, we obtain g_N ≡_{x^n} g (since N ≥ N). Therefore, g ≡_{x^n} g_N (since ≡_{x^n} is an equivalence relation and thus symmetric).
Next, we shall show two claims:

    Claim 1: For each integer j > N, we have g ≡_{x^n} g f_j.

[Proof of Claim 1: Let j > N be an integer. The definition of g_{j−1} yields g_{j−1} = ∏_{k=0}^{j−1} f_k. Furthermore, the definition of g_j yields

    g_j = ∏_{k=0}^{j} f_k = (∏_{k=0}^{j−1} f_k) · f_j = g_{j−1} f_j.

But recall that g_{j−1} ≡_{x^n} g and f_j ≡_{x^n} f_j. Hence, g_{j−1} f_j ≡_{x^n} g f_j (by (101), applied to a = g_{j−1}, b = g, c = f_j and d = f_j). In view of g_j = g_{j−1} f_j, we can rewrite this as g_j ≡_{x^n} g f_j.
Now, recall that g ≡_{x^n} g_j. Thus, g ≡_{x^n} g_j ≡_{x^n} g f_j, so that g ≡_{x^n} g f_j (since ≡_{x^n} is an equivalence relation and thus transitive). This proves Claim 1.]
Claim 2: If U is any finite subset of N satisfying M ⊆ U, then g ≡_{x^n} ∏_{k∈U} f_k.
(by (101), applied to a = g, b = ∏_{k∈U\{u}} f_k, c = f_u and d = f_u). In view of ∏_{k∈U} f_k = (∏_{k∈U\{u}} f_k) · f_u, we can rewrite this as g f_u ≡_{x^n} ∏_{k∈U} f_k.
But u ∈ U \ M ⊆ N \ M = N \ {0, 1, . . . , N} = {N + 1, N + 2, N + 3, . . .} (since U ⊆ N and M = {0, 1, . . . , N}), so that u ≥ N + 1 > N. Hence, Claim 1 (applied to j = u) yields g ≡_{x^n} g f_u. Therefore, g ≡_{x^n} g f_u ≡_{x^n} ∏_{k∈U} f_k. Therefore, g ≡_{x^n} ∏_{k∈U} f_k (since ≡_{x^n} is an equivalence relation and thus transitive).
Forget that we fixed U. We thus have shown that if U is any finite subset of N satisfying M ⊆ U and |U \ M| = s + 1, then g ≡_{x^n} ∏_{k∈U} f_k. In other words, Claim 2 holds when |U \ M| = s + 1. This completes the induction step. Thus, Claim 2 is proved by induction.
Now, let J be a finite subset of N satisfying M ⊆ J ⊆ N. Then, g ≡_{x^n} ∏_{k∈J} f_k (by Claim 2, applied to U = J). In other words,

    each m ∈ {0, 1, . . . , n} satisfies [x^m] g = [x^m] ∏_{k∈J} f_k

(by the definition of x^n-equivalence). Applying this to m = n, we obtain [x^n] g = [x^n] ∏_{k∈J} f_k. The same argument can be applied to M instead of J (since M ⊆ M ⊆ N), and thus yields

    [x^n] g = [x^n] ∏_{k∈M} f_k.

Comparing these two equalities, we conclude that [x^n] ∏_{k∈J} f_k = [x^n] ∏_{k∈M} f_k.
Forget that we fixed J. We have now shown that every finite subset J of N satisfying M ⊆ J ⊆ N satisfies

    [x^n] ∏_{k∈J} f_k = [x^n] ∏_{k∈M} f_k.

In other words, the set M determines the x^n-coefficient in the product of (f_k)_{k∈N} (by the definition of “determines the x^n-coefficient in the product of (f_k)_{k∈N}”). Hence, the x^n-coefficient in the product of (f_k)_{k∈N} is finitely determined (by the definition of “finitely determined”, since M is a finite subset of N).
Forget that we fixed n. We thus have shown that for each n ∈ N, the x n -coefficient
in the product of ( f k )k∈N is finitely determined. In other words, each coefficient in
the product of ( f k )k∈N is finitely determined. In other words, the family ( f k )k∈N is
multipliable (by the definition of “multipliable”). In other words, the family ( f n )n∈N is
multipliable (since this is the same family as ( f k )k∈N ).
Hence, Theorem 3.13.15 yields $\lim\limits_{i \to \infty} \prod\limits_{n=0}^{i} f_n = \prod\limits_{n \in \mathbb{N}} f_n$. In other words, $\prod\limits_{n \in \mathbb{N}} f_n = \lim\limits_{i \to \infty} \prod\limits_{n=0}^{i} f_n$.
This completes the proof of Theorem 3.13.18.
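As a computational illustration of what "finitely determined" buys us, here is a small sketch of my own (not part of the notes; the sample family $f_k = 1 + x^{k+1}$ and all function names are my assumptions): the $x^n$-coefficient of the partial products stops changing once all factors congruent to $1$ modulo $x^{n+1}$ are accounted for.

```python
# A sketch of my own (not from the notes): truncated formal power series over Z,
# stored as coefficient lists [c_0, c_1, ..., c_n]. The family f_k = 1 + x^{k+1}
# is multipliable: every factor with k >= n is congruent to 1 modulo x^{n+1},
# so the x^n-coefficient of the partial products stabilizes.

def mul_trunc(a, b, n):
    """Product of two power series, truncated at degree n (inclusive)."""
    c = [0] * (n + 1)
    for i, ai in enumerate(a[: n + 1]):
        for j, bj in enumerate(b[: n + 1]):
            if i + j <= n:
                c[i + j] += ai * bj
    return c

def partial_product(num_factors, n):
    """Coefficients up to x^n of f_0 f_1 ... f_{num_factors - 1}."""
    prod = [1] + [0] * n
    for k in range(num_factors):
        f_k = [0] * (n + 1)
        f_k[0] = 1
        if k + 1 <= n:
            f_k[k + 1] = 1  # f_k = 1 + x^(k+1), truncated at degree n
        prod = mul_trunc(prod, f_k, n)
    return prod

n = 6
stable = partial_product(n, n)[n]
# Factors f_6, f_7, ... cannot change the x^6-coefficient any more:
assert all(partial_product(m, n)[n] == stable for m in range(n, 15))
print(stable)  # the number of partitions of 6 into distinct parts, namely 4
```

Here the stabilized coefficient counts the partitions of $6$ into distinct parts ($6$, $5+1$, $4+2$, $3+2+1$), in line with the classical generating function $\prod_{k \geq 1} (1 + x^k)$.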
Proof of Theorem 3.14.7 (sketched). First, we need to prove that the set K [ x ± ] is closed
under addition, under scaling, and under the multiplication introduced in Definition
3.14.6. The proof of this is analogous to the proof of Theorem 3.4.2 (with the obvious changes, such as replacing the $\sum\limits_{i=0}^{n}$ sums by $\sum\limits_{i \in \mathbb{Z}}$ sums).
Next, we need to prove that the multiplication on K [ x ± ] is associative, commutative,
distributive and K-bilinear. This is analogous to the corresponding parts of Theorem
3.2.6. Likewise, we can show that the element (δi,0 )i∈Z (which is our analogue of the
FPS 1) is a neutral element for this multiplication. Hence, K [ x ± ] is a commutative
K-algebra with unity (δi,0 )i∈Z .
It remains to show that the element $x$ is invertible in this $K$-algebra. But this is easy: Set $\overline{x} := (\delta_{i,-1})_{i \in \mathbb{Z}}$, and show (by direct calculation) that $x \overline{x} = \overline{x} x = 1$. (For example, let us prove that $x \overline{x} = 1$. Indeed, from $x = (\delta_{i,1})_{i \in \mathbb{Z}}$ and $\overline{x} = (\delta_{i,-1})_{i \in \mathbb{Z}}$, we obtain $x \overline{x} = (c_n)_{n \in \mathbb{Z}}$, where $c_n = \sum\limits_{i \in \mathbb{Z}} \delta_{i,1} \delta_{n-i,-1} = \delta_{n-1,-1} = \delta_{n,0}$.
Thus, $(c_n)_{n \in \mathbb{Z}} = (\delta_{n,0})_{n \in \mathbb{Z}} = 1$, so that $x \overline{x} = (c_n)_{n \in \mathbb{Z}} = 1$, as we wanted to prove. The proof of $\overline{x} x = 1$ is analogous, or follows from $x \overline{x} = 1$ by commutativity.)
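This direct calculation can be replayed mechanically. The following is a minimal sketch of my own, assuming a dict-based encoding `{exponent: coefficient}` of the sequences in Definition 3.14.6 (none of these names come from the notes):

```python
# Sketch (my own encoding, not from the notes): a Laurent polynomial over Z is
# a dict mapping exponents (in Z) to nonzero coefficients; the product is
# given by the convolution c_n = sum_i a_i b_{n-i}.

def lmul(a, b):
    c = {}
    for i, ai in a.items():
        for j, bj in b.items():
            c[i + j] = c.get(i + j, 0) + ai * bj
    return {n: v for n, v in c.items() if v != 0}

x = {1: 1}      # x = (delta_{i,1})_{i in Z}
xbar = {-1: 1}  # the candidate inverse (delta_{i,-1})_{i in Z}
one = {0: 1}    # the unity (delta_{i,0})_{i in Z}

# x * xbar = xbar * x = 1, so x is invertible:
assert lmul(x, xbar) == one and lmul(xbar, x) == one
```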
$$x \cdot a = (a_{n-1})_{n \in \mathbb{Z}} \qquad \text{and} \qquad x^{-1} \cdot a = (a_{n+1})_{n \in \mathbb{Z}}.$$
Proof of Lemma B.5.1 (sketched). From $x = (\delta_{i,1})_{i \in \mathbb{Z}}$ and $a = (a_n)_{n \in \mathbb{Z}} = (a_i)_{i \in \mathbb{Z}}$, we obtain $x \cdot a = (c_n)_{n \in \mathbb{Z}}$, where
$$c_n = \sum_{i \in \mathbb{Z}} \delta_{i,1} a_{n-i} = a_{n-1} + \sum_{\substack{i \in \mathbb{Z}; \\ i \neq 1}} \underbrace{\delta_{i,1}}_{=0} a_{n-i} = a_{n-1} + \sum_{\substack{i \in \mathbb{Z}; \\ i \neq 1}} 0 a_{n-i} = a_{n-1}.$$
Thus, $(c_n)_{n \in \mathbb{Z}} = (a_{n-1})_{n \in \mathbb{Z}}$, so that $x \cdot a = (c_n)_{n \in \mathbb{Z}} = (a_{n-1})_{n \in \mathbb{Z}}$. Thus we have proved
$$x \cdot a = (a_{n-1})_{n \in \mathbb{Z}}. \tag{427}$$
It remains to prove that $x^{-1} \cdot a = (a_{n+1})_{n \in \mathbb{Z}}$. Here we use a trick: Let $b = (a_{n+1})_{n \in \mathbb{Z}}$; this is again a Laurent polynomial in $K[x^{\pm}]$. Hence, (427) (applied to $b$ and $a_{n+1}$ instead of $a$ and $a_n$) yields
$$x \cdot b = \left( a_{(n-1)+1} \right)_{n \in \mathbb{Z}} = (a_n)_{n \in \mathbb{Z}} = a.$$
Hence, $b = x^{-1} \cdot a$ (since $x$ is invertible). In other words, $x^{-1} \cdot a = (a_{n+1})_{n \in \mathbb{Z}}$. This completes the proof of Lemma B.5.1.
Proof of Proposition B.5.2 (sketched). This is similar to Proposition 3.2.17, which we proved
by induction on k. Here, too, we can use induction, but (since k can be negative) we
have to use “two-sided induction”, which contains both an induction step from k to
k + 1 and an induction step from k to k − 1. (See [Grinbe15, §2.15] for a detailed expla-
nation of two-sided induction.)
Both induction steps rely on Lemma B.5.1. (Specifically, the step from $k$ to $k+1$ uses the $x \cdot a = (a_{n-1})_{n \in \mathbb{Z}}$ part of Lemma B.5.1, whereas the step from $k$ to $k-1$ uses the $x^{-1} \cdot a = (a_{n+1})_{n \in \mathbb{Z}}$ part.)
Now, Proposition 3.14.8 is easy:
Proof of Proposition 3.14.8 (sketched). Analogous to Corollary 3.2.18, but using Proposi-
tion B.5.2 instead of Proposition 3.2.17.
Next, let us prove Theorem 3.14.10:
Proof of Theorem 3.14.10 (sketched). This is analogous to the proof of Theorem 3.14.7.
The only (slightly) different piece is the proof that the set K (( x )) is closed under mul-
tiplication. So let us prove this:
Let $(a_n)_{n \in \mathbb{Z}}$ and $(b_n)_{n \in \mathbb{Z}}$ be two elements of $K((x))$. We must prove that their product $(a_n)_{n \in \mathbb{Z}} \cdot (b_n)_{n \in \mathbb{Z}}$ belongs to $K((x))$ as well. This product is defined by
$$(a_n)_{n \in \mathbb{Z}} \cdot (b_n)_{n \in \mathbb{Z}} = (c_n)_{n \in \mathbb{Z}}, \qquad \text{where } c_n = \sum_{i \in \mathbb{Z}} a_i b_{n-i}.$$
Thus, we need to prove that (cn )n∈Z belongs to K (( x )). In other words, we must prove
that the sequence (c−1 , c−2 , c−3 , . . .) is essentially finite (by the definition of K (( x ))).
So let us prove this now. We know that the sequence ( a−1 , a−2 , a−3 , . . .) is essentially
finite (since $(a_n)_{n \in \mathbb{Z}} \in K((x))$). Thus, there exists some negative integer $p$ such that
$$\text{all } i \leq p \text{ satisfy } a_i = 0. \tag{428}$$
Similarly, there exists some negative integer $q$ such that
$$\text{all } j \leq q \text{ satisfy } b_j = 0 \tag{429}$$
(since (bn )n∈Z ∈ K (( x ))). Consider these p and q. Now, set r := p + q. We shall show
that all negative integers n ≤ r satisfy cn = 0.
Indeed, let $n \leq r$ be any integer. Then, each integer $i \geq p$ satisfies $\underbrace{n}_{\leq r = p+q} - \underbrace{i}_{\geq p} \leq p + q - p = q$ and therefore
$$b_{n-i} = 0 \tag{430}$$
(by (429), applied to $j = n - i$). Now,
$$c_n = \sum_{i \in \mathbb{Z}} a_i b_{n-i} = \sum_{\substack{i \in \mathbb{Z}; \\ i < p}} \underbrace{a_i}_{\substack{=0 \\ \text{(by (428))}}} b_{n-i} + \sum_{\substack{i \in \mathbb{Z}; \\ i \geq p}} a_i \underbrace{b_{n-i}}_{\substack{=0 \\ \text{(by (430))}}} = \sum_{\substack{i \in \mathbb{Z}; \\ i < p}} \underbrace{0 b_{n-i}}_{=0} + \sum_{\substack{i \in \mathbb{Z}; \\ i \geq p}} \underbrace{a_i 0}_{=0} = 0.$$
Forget that we fixed n. We thus have shown that all n ≤ r satisfy cn = 0. Hence, the
sequence (c−1 , c−2 , c−3 , . . .) is essentially finite. As explained, this completes our proof
of the claim that the set K (( x )) is closed under multiplication.
The rest of Theorem 3.14.10 is proved just like Theorem 3.14.7.
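The bound $r = p + q$ from this closure proof can be observed numerically. Here is a sketch of my own (not from the notes), using Laurent polynomials as stand-ins for Laurent series, with the hypothetical dict encoding `{exponent: coefficient}`:

```python
# Sketch (my own, not from the notes): the bound r = p + q in the closure
# proof, checked on two Laurent polynomials (dict encoding {exponent: coeff}).

def lmul(a, b):
    c = {}
    for i, ai in a.items():
        for j, bj in b.items():
            c[i + j] = c.get(i + j, 0) + ai * bj
    return {n: v for n, v in c.items() if v != 0}

a = {-3: 2, -1: 1, 0: 4, 2: 7}   # all exponents below -3 have coefficient 0
b = {-5: 1, 0: 3, 1: -2}         # all exponents below -5 have coefficient 0
c = lmul(a, b)

# No exponent of the product lies below p + q = (-3) + (-5) = -8:
assert min(c) == min(a) + min(b) == -8
```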
Detailed proof of Lemma 6.1.4. The set $\mathcal{X}$ is finite (since it is a subset of the finite set $\mathcal{A}$). Thus, $|\mathcal{X}| = n$ for some $n \in \mathbb{N}$. Consider this $n$.
Let $[n]$ be the set $\{1, 2, \ldots, n\}$. Then, $|[n]| = n$. Comparing this with $|\mathcal{X}| = n$, we obtain $|\mathcal{X}| = |[n]|$. Hence, there exists a bijection $\alpha : \mathcal{X} \to [n]$. Consider this $\alpha$.
Now, define two subsets $\mathcal{U}$ and $\mathcal{W}$ of $\mathcal{X}$ by
$$\mathcal{U} := \{ I \in \mathcal{X} \mid \alpha(f(I)) < \alpha(I) \};$$
$$\mathcal{W} := \{ I \in \mathcal{X} \mid \alpha(f(I)) > \alpha(I) \}.$$
$$g : \mathcal{U} \to \mathcal{W}, \qquad I \mapsto f(I)$$
$$h : \mathcal{W} \to \mathcal{U}, \qquad I \mapsto f(I)$$
only changes are that all “<” signs have to be replaced by “>” signs and vice versa, and
that all “U ”s have to be replaced by “W ”s and vice versa.
maps $g$ and $h$. It is clear that $g \circ h = \mathrm{id}$ (see footnote 179) and $h \circ g = \mathrm{id}$ (see footnote 180). Hence, the maps $g$ and $h$ are mutually inverse, and thus are bijections.
We have
$$\operatorname{sign} I + \operatorname{sign}(g(I)) = 0 \qquad \text{for all } I \in \mathcal{U} \tag{431}$$
(see footnote 181). Furthermore, we have
However, each $I \in \mathcal{X}$ satisfies exactly one of the three conditions "$\alpha(f(I)) < \alpha(I)$", "$\alpha(f(I)) = \alpha(I)$" and "$\alpha(f(I)) > \alpha(I)$" (because $\alpha(f(I))$ and $\alpha(I)$ are two integers).
Thus, $(g \circ h)(I) = g\left( \underbrace{h(I)}_{= f(I)} \right) = f(f(I)) = \underbrace{(f \circ f)}_{= \mathrm{id}}(I) = \mathrm{id}(I) = I$.
Forget that we fixed $I$. We thus have shown that $(g \circ h)(I) = \mathrm{id}(I)$ for each $I \in \mathcal{W}$. In other words, $g \circ h = \mathrm{id}$.
180 The proof of this is analogous to the proof of g ◦ h = id we just gave.
181 Proof of (431): Let J ∈ U . Then, J ∈ U ⊆ X and g ( J ) = f ( J ) (by the definition of g).
Now, recall our assumption saying that sign ( f ( I )) = − sign I for all I ∈ X . Applying
this to I = J, we obtain sign ( f ( J )) = − sign J. In view of g ( J ) = f ( J ), this rewrites as
sign ( g ( J )) = − sign J. In other words, sign J + sign ( g ( J )) = 0.
Forget that we fixed J. We thus have shown that sign J + sign ( g ( J )) = 0 for all J ∈ U .
Renaming J as I in this statement, we obtain that sign I + sign ( g ( I )) = 0 for all I ∈ U . This
proves (431).
182 Proof of (432): Recall our assumption saying that
$$= \sum_{\substack{I \in \mathcal{X}; \\ \alpha(f(I)) < \alpha(I)}} \operatorname{sign} I + \underbrace{\sum_{\substack{I \in \mathcal{X}; \\ \alpha(f(I)) = \alpha(I)}} 0}_{= 0} + \sum_{\substack{I \in \mathcal{X}; \\ \alpha(f(I)) > \alpha(I)}} \operatorname{sign} I = \underbrace{\sum_{\substack{I \in \mathcal{X}; \\ \alpha(f(I)) < \alpha(I)}} \operatorname{sign} I}_{\substack{= \sum_{I \in \mathcal{U}} \operatorname{sign} I \\ \text{(since } \{I \in \mathcal{X} \mid \alpha(f(I)) < \alpha(I)\} = \mathcal{U}\text{)}}} + \underbrace{\sum_{\substack{I \in \mathcal{X}; \\ \alpha(f(I)) > \alpha(I)}} \operatorname{sign} I}_{\substack{= \sum_{I \in \mathcal{W}} \operatorname{sign} I \\ \text{(since } \{I \in \mathcal{X} \mid \alpha(f(I)) > \alpha(I)\} = \mathcal{W}\text{)}}}$$
However, the set $\mathcal{A}$ is the union of its two disjoint subsets $\mathcal{X}$ and $\mathcal{A} \setminus \mathcal{X}$ (since $\mathcal{X} \subseteq \mathcal{A}$). Thus, we can split the sum $\sum\limits_{I \in \mathcal{A}} \operatorname{sign} I$ as follows:
Detailed proof of Lemma 6.1.3. The map $f$ has no fixed points (by assumption). In other words, there exists no $I \in \mathcal{X}$ satisfying $f(I) = I$. Hence, we have $\operatorname{sign} I = 0$ for all $I \in \mathcal{X}$ satisfying $f(I) = I$ (because non-existing objects satisfy any possible claim; this is known as being "vacuously true"). Thus, Lemma 6.1.4 yields $\sum\limits_{I \in \mathcal{A}} \operatorname{sign} I = \sum\limits_{I \in \mathcal{A} \setminus \mathcal{X}} \operatorname{sign} I$. This proves Lemma 6.1.3.
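Here is a toy instance of Lemma 6.1.3 in action (my own example, not from the notes): a fixed-point-free, sign-reversing involution cancels an entire signed sum.

```python
# Sketch (my own toy instance): A = all subsets of {1,2,3}, sign(S) = (-1)^|S|,
# and f toggles the element 1. Then f is a fixed-point-free sign-reversing
# involution on X = A, so the signed sum over A equals the (empty) sum over
# A \ X, i.e. 0 -- exactly the conclusion of Lemma 6.1.3.
from itertools import combinations

A = [frozenset(c) for r in range(4) for c in combinations([1, 2, 3], r)]
sign = lambda S: (-1) ** len(S)
f = lambda S: S ^ {1}   # symmetric difference: toggle the element 1

assert all(f(f(S)) == S for S in A)            # f is an involution
assert all(f(S) != S for S in A)               # f has no fixed points
assert all(sign(f(S)) == -sign(S) for S in A)  # f reverses signs
assert sum(sign(S) for S in A) == 0            # the lemma's conclusion
```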
[Figure: points $A$, $A'$, $B$, $B'$ and $P$ and lattice paths $p$, $p'$, $r$ in the grid $\{0, 1, \ldots, 7\} \times \{0, 1, \ldots, 6\}$.]
184 Importantly, this reflection preserves our digraph (in fact, it transforms north-steps into east-steps and vice versa).
$$= s_{k_1} \cdot s_{k_2} \cdot \cdots \cdot s_{k_{p-2}} \cdot \underbrace{s_{k_{p-1}} \cdot f}_{\substack{= f \\ \text{(by (264))}}} = s_{k_1} \cdot s_{k_2} \cdot \cdots \cdot s_{k_{p-3}} \cdot \underbrace{s_{k_{p-2}} \cdot f}_{\substack{= f \\ \text{(by (264))}}} = \cdots = f.$$
(To be fully rigorous, this is really an induction argument: We are showing (by induction on $i$) that $s_{k_1} s_{k_2} \cdots s_{k_i} f = f$ for each $i \in \{0, 1, \ldots, p\}$; the induction base is obvious (since $s_{k_1} s_{k_2} \cdots s_{k_0} = (\text{empty product in } S_N) = \mathrm{id}$), while the induction step relies on (264). It is straightforward to fill in the details of this induction.)
Forget that we fixed σ. We thus have proved that σ · f = f for all σ ∈ S N . In other
words, the polynomial f is symmetric. This proves Lemma 7.1.17.
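The content of Lemma 7.1.17 can be checked on a small example. This is a sketch with a hypothetical encoding of my own (not the notes' code): a polynomial in $x_1, x_2, x_3$ stored as a dict mapping exponent tuples to coefficients, with permutations acting by permuting the exponent positions.

```python
# Sketch (hypothetical encoding, my own): a permutation sigma is a tuple of
# images of the positions 0, 1, 2, and acts on a polynomial by permuting the
# exponent positions. As in Lemma 7.1.17, a polynomial fixed by the simples
# s_1, s_2 is fixed by every permutation, since the simples generate S_3.
from itertools import permutations

def act(sigma, f):
    """Apply the permutation sigma to the variables of f."""
    return {tuple(e[sigma[i]] for i in range(len(e))): c for e, c in f.items()}

f = {(1, 1, 0): 1, (1, 0, 1): 1, (0, 1, 1): 1}   # x1 x2 + x1 x3 + x2 x3
s1, s2 = (1, 0, 2), (0, 2, 1)                    # the simple transpositions

assert act(s1, f) == f and act(s2, f) == f       # fixed by the simples ...
assert all(act(p, f) == f for p in permutations(range(3)))  # ... hence by all of S_3
```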
where the coefficients $f_{b_1, b_2, \ldots, b_N}$ belong to $K$. Thus, the coefficient of any monomial $x_1^{b_1} x_2^{b_2} \cdots x_N^{b_N}$ in $f$ is $f_{b_1, b_2, \ldots, b_N}$. In other words,
$$\left[ x_1^{b_1} x_2^{b_2} \cdots x_N^{b_N} \right] f = f_{b_1, b_2, \ldots, b_N} \tag{435}$$
is a bijection (indeed, this map simply permutes the entries of any given N-tuple using
the permutation σ; thus, it can be undone by permuting them using σ−1 ). Moreover,
the map σ itself is a bijection (since it is a permutation).
185 Alternatively, we can derive this from Corollary 5.3.22 (which is better known than Theorem 5.3.17 (a)):
Corollary 5.3.22 (applied to n = N) shows that the symmetric group S N is generated
by the simples s1 , s2 , . . . , s N −1 . Hence, each element of S N is a (finite) product of simples
and their inverses. Since the inverses of the simples are simply these simples themselves (because each $i \in [N-1]$ satisfies $s_i^{-1} = s_i$), we can simplify this statement as follows: Each
element of S N is a (finite) product of simples. Applying this to the element σ, we conclude
that σ is a (finite) product of simples.
$$\left[ x_1^{a_{\sigma(1)}} x_2^{a_{\sigma(2)}} \cdots x_N^{a_{\sigma(N)}} \right] f = f_{a_{\sigma(1)}, a_{\sigma(2)}, \ldots, a_{\sigma(N)}} = \left[ x_1^{a_1} x_2^{a_2} \cdots x_N^{a_N} \right] (\sigma \cdot f)$$
Detailed proof of Lemma 7.3.17. We shall first prove parts (a) and (c), and then quickly
derive the rest from them.
(a) The skew tableau $T$ is semistandard. Hence, we have
$$T(i, j) \leq T(i, j+1) \tag{439}$$
for any $(i, j) \in Y(\lambda/\mu)$ satisfying $(i, j+1) \in Y(\lambda/\mu)$. (Indeed, this is one of the requirements placed on $T$ in Definition 7.3.16.)
Now, let (i, j1 ) and (i, j2 ) be two elements of Y (λ/µ) satisfying j1 ≤ j2 . We must
prove that T (i, j1 ) ≤ T (i, j2 ).
Let k ∈ { j1 , j1 + 1, . . . , j2 − 1} be arbitrary. Then, j1 ≤ k ≤ j2 − 1. From k ≤ j2 − 1, we
obtain k + 1 ≤ j2 , so that k ≤ k + 1 ≤ j2 . Also, j1 ≤ k ≤ k + 1.
Thus, we have i ≤ i ≤ i and j1 ≤ k ≤ j2 . Hence, Lemma 7.3.14 (applied to ( a, b) =
(i, j1 ) and (e, f ) = (i, j2 ) and (c, d) = (i, k)) yields (i, k) ∈ Y (λ/µ). Therefore, the entry
T (i, k ) of T is well-defined.
Also, we have i ≤ i ≤ i and j1 ≤ k + 1 ≤ j2 . Hence, Lemma 7.3.14 (applied to
( a, b) = (i, j1 ) and (e, f ) = (i, j2 ) and (c, d) = (i, k + 1)) yields (i, k + 1) ∈ Y (λ/µ).
Therefore, the entry T (i, k + 1) of T is well-defined.
Now, (439) (applied to j = k) yields T (i, k ) ≤ T (i, k + 1) (since (i, k) ∈ Y (λ/µ) and
(i, k + 1) ∈ Y (λ/µ)).
Forget that we fixed k. We thus have shown that for each k ∈ { j1 , j1 + 1, . . . , j2 − 1},
the inequality T (i, k ) ≤ T (i, k + 1) holds (and both entries T (i, k ) and T (i, k + 1) are
well-defined). In other words, we have
$$T(i, j_1) \leq T(i, j_1 + 1) \leq T(i, j_1 + 2) \leq \cdots \leq T(i, j_2 - 1) \leq T(i, j_2).$$
Hence, $T(i, j_1) \leq T(i, j_2)$. This proves Lemma 7.3.17 (a).
(c) The skew tableau $T$ is semistandard. Hence, we have
$$T(i, j) < T(i+1, j) \tag{440}$$
for any $(i, j) \in Y(\lambda/\mu)$ satisfying $(i+1, j) \in Y(\lambda/\mu)$. (Indeed, this is one of the requirements placed on $T$ in Definition 7.3.16.)
Now, let (i1 , j) and (i2 , j) be two elements of Y (λ/µ) satisfying i1 < i2 . We must
prove that T (i1 , j) < T (i2 , j).
Let k ∈ {i1 , i1 + 1, . . . , i2 − 1} be arbitrary. Then, i1 ≤ k ≤ i2 − 1. From k ≤ i2 − 1, we
obtain k + 1 ≤ i2 , so that k ≤ k + 1 ≤ i2 . Also, i1 ≤ k ≤ k + 1.
Thus, we have i1 ≤ k ≤ i2 and j ≤ j ≤ j. Hence, Lemma 7.3.14 (applied to ( a, b) =
(i1 , j) and (e, f ) = (i2 , j) and (c, d) = (k, j)) yields (k, j) ∈ Y (λ/µ). Therefore, the entry
T (k, j) of T is well-defined.
Also, we have i1 ≤ k + 1 ≤ i2 and j ≤ j ≤ j. Hence, Lemma 7.3.14 (applied to
( a, b) = (i1 , j) and (e, f ) = (i2 , j) and (c, d) = (k + 1, j)) yields (k + 1, j) ∈ Y (λ/µ).
Therefore, the entry T (k + 1, j) of T is well-defined.
Now, (440) (applied to i = k) yields T (k, j) < T (k + 1, j) (since (k, j) ∈ Y (λ/µ) and
(k + 1, j) ∈ Y (λ/µ)).
Forget that we fixed k. We thus have shown that for each k ∈ {i1 , i1 + 1, . . . , i2 − 1},
the inequality T (k, j) < T (k + 1, j) holds (and both entries T (k, j) and T (k + 1, j) are
well-defined). In other words, we have
$$T(i_1, j) < T(i_1 + 1, j) < T(i_1 + 2, j) < \cdots < T(i_2 - 1, j) < T(i_2, j).$$
Hence, $T(i_1, j) < T(i_2, j)$ (since $i_1 < i_2$). This proves Lemma 7.3.17 (c).
(b) Let (i1 , j) and (i2 , j) be two elements of Y (λ/µ) satisfying i1 ≤ i2 . We must prove
that T (i1 , j) ≤ T (i2 , j). If i1 < i2 , then this follows from Lemma 7.3.17 (c). Hence, for
the rest of this proof, we WLOG assume that we don’t have i1 < i2 . Thus, we have
i1 ≥ i2 . Combining this with i1 ≤ i2 , we obtain i1 = i2 . Thus, T (i1 , j) = T (i2 , j), so that
T (i1 , j) ≤ T (i2 , j). This proves Lemma 7.3.17 (b).
(d) Let (i1 , j1 ) and (i2 , j2 ) be two elements of Y (λ/µ) satisfying i1 ≤ i2 and j1 ≤
j2 . Then, i1 ≤ i2 ≤ i2 and j1 ≤ j1 ≤ j2 . Hence, Lemma 7.3.14 (applied to ( a, b) =
(i1 , j1 ) and (e, f ) = (i2 , j2 ) and (c, d) = (i2 , j1 )) yields (i2 , j1 ) ∈ Y (λ/µ). Therefore,
the entry T (i2 , j1 ) of T is well-defined. Hence, Lemma 7.3.17 (b) (applied to j = j1 )
yields T (i1 , j1 ) ≤ T (i2 , j1 ). Furthermore, Lemma 7.3.17 (a) (applied to i = i2 ) yields
$T(i_2, j_1) \leq T(i_2, j_2)$. Thus, $T(i_1, j_1) \leq T(i_2, j_1) \leq T(i_2, j_2)$, so that $T(i_1, j_1) \leq T(i_2, j_2)$. This proves Lemma 7.3.17 (d).
yields $T(i_1, j_1) < T(i_2, j_1)$. Furthermore, Lemma 7.3.17 (a) (applied to $i = i_2$) yields $T(i_2, j_1) \leq T(i_2, j_2)$. Thus, $T(i_1, j_1) < T(i_2, j_1) \leq T(i_2, j_2)$, so that $T(i_1, j_1) < T(i_2, j_2)$.
Detailed proof of Lemma 7.3.35. Let (i, j) ∈ Y (λ). Set p := T (i, j).
All entries of $T$ are elements of $[N]$ (by the definition of a tableau), and thus are positive integers. Thus, in particular, the $i$ entries $T(1, j), T(2, j), \ldots, T(i, j)$ are positive integers (see footnote 186). Moreover, the tableau $T$ is semistandard; thus, its entries increase strictly
down each column. Hence, in particular, we have
$$T(1, j) < T(2, j) < \cdots < T(i, j).$$
Thus, the $i$ numbers $T(1, j), T(2, j), \ldots, T(i, j)$ are distinct. Moreover, all these numbers are positive integers (as we have seen above) and are $\leq p$ (since $T(1, j) < T(2, j) < \cdots < T(i, j) = p$); thus, they all belong to the set $[p]$. This shows that there are $i$ distinct numbers in the set $[p]$ (namely, the $i$ numbers $T(1, j), T(2, j), \ldots, T(i, j)$); in other words, the set $[p]$ has at least $i$ elements. In other words, we have $|[p]| \geq i$. Since $|[p]| = p = T(i, j)$, this rewrites as $T(i, j) \geq i$. This proves Lemma 7.3.35.
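Both the semistandardness conditions and the inequality $T(i,j) \geq i$ of Lemma 7.3.35 are easy to check by machine. Here is a sketch with a hypothetical encoding of my own (not the notes' code): a tableau as a dict `{(i, j): entry}` with 1-based rows and columns.

```python
# Sketch (my own encoding): a straight-shape tableau of shape (3, 2, 1),
# stored as {(row, column): entry}. We verify the two semistandardness
# conditions (rows weakly increase, columns strictly increase) and the
# consequence T(i, j) >= i from Lemma 7.3.35.
T = {(1, 1): 1, (1, 2): 1, (1, 3): 3,
     (2, 1): 2, (2, 2): 3,
     (3, 1): 4}

rows_ok = all(T[i, j] <= T[i, j + 1] for (i, j) in T if (i, j + 1) in T)
cols_ok = all(T[i, j] <  T[i + 1, j] for (i, j) in T if (i + 1, j) in T)
assert rows_ok and cols_ok                    # T is semistandard
assert all(T[i, j] >= i for (i, j) in T)      # Lemma 7.3.35
```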
Detailed proof of Lemma 7.3.39. Write the $N$-tuple $\alpha \in \mathbb{N}^N$ as $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_N)$. Then, Definition 7.3.2 (b) yields
$$a_\alpha = \det \left( \left( x_i^{\alpha_j} \right)_{1 \leq i \leq N,\ 1 \leq j \leq N} \right). \tag{441}$$
(a) Assume that the N-tuple α has two equal entries. In other words, the N-tuple
(α1 , α2 , . . . , α N ) has two equal entries (since α = (α1 , α2 , . . . , α N )). In other words, there
exist two elements u, v ∈ [ N ] such that u < v and αu = αv . Consider these u, v.
Now, from $\alpha_u = \alpha_v$, we conclude that the $u$-th and the $v$-th columns of the $N \times N$-matrix $\left( x_i^{\alpha_j} \right)_{1 \leq i \leq N,\ 1 \leq j \leq N}$ are equal. Hence, this $N \times N$-matrix has two equal columns (since $u < v$).
However, if an $N \times N$-matrix $A$ has two equal columns, then $\det A = 0$ (by Theorem 6.4.14 (c); see footnote 187). Applying this to $A = \left( x_i^{\alpha_j} \right)_{1 \leq i \leq N,\ 1 \leq j \leq N}$, we obtain
$$\det \left( \left( x_i^{\alpha_j} \right)_{1 \leq i \leq N,\ 1 \leq j \leq N} \right) = 0$$
186 Here, we are tacitly using the fact that the boxes (1, j) , (2, j) , . . . , (i, j) all belong to Y (λ) (so
that the corresponding entries T (1, j) , T (2, j) , . . . , T (i, j) are well-defined). This fact can be
checked as follows: Let u ∈ [i ]. Thus, u ≤ i. Now, write λ in the form λ = (λ1 , λ2 , . . . , λ N ).
Thus, λ1 ≥ λ2 ≥ · · · ≥ λ N (since λ is an N-partition). Hence, λu ≥ λi (since u ≤ i), so
that λi ≤ λu and therefore [λi ] ⊆ [λu ]. However, (i, j) ∈ Y (λ). In other words, i ∈ [ N ] and
j ∈ [λi ] (by the definition of the Young diagram Y (λ)). Hence, u ≤ i ≤ N (since i ∈ [ N ]),
so that u ∈ [ N ], and furthermore j ∈ [λi ] ⊆ [λu ]. Now, from u ∈ [ N ] and j ∈ [λu ], we
obtain (u, j) ∈ Y (λ). Forget that we fixed u. We thus have shown that (u, j) ∈ Y (λ) for
each u ∈ [i ]. In other words, the boxes (1, j) , (2, j) , . . . , (i, j) all belong to Y (λ), qed.
187 more precisely: by the analogue of Theorem 6.4.12 (c) for columns instead of rows
(since the $N \times N$-matrix $\left( x_i^{\alpha_j} \right)_{1 \leq i \leq N,\ 1 \leq j \leq N}$ has two equal columns). In view of (441), this rewrites as $a_\alpha = 0$. This proves Lemma 7.3.39 (a).
(b) Write the $N$-tuple $\beta \in \mathbb{N}^N$ as $\beta = (\beta_1, \beta_2, \ldots, \beta_N)$. Then, Definition 7.3.2 (b) yields
$$a_\beta = \det \left( \left( x_i^{\beta_j} \right)_{1 \leq i \leq N,\ 1 \leq j \leq N} \right). \tag{442}$$
However, the $N$-tuple $\beta$ is obtained from $\alpha$ by swapping two entries. In other words, the $N$-tuple $(\beta_1, \beta_2, \ldots, \beta_N)$ is obtained from $(\alpha_1, \alpha_2, \ldots, \alpha_N)$ by swapping two entries (since $\beta = (\beta_1, \beta_2, \ldots, \beta_N)$ and $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_N)$). Thus, the matrix $\left( x_i^{\beta_j} \right)_{1 \leq i \leq N,\ 1 \leq j \leq N}$ is obtained from the matrix $\left( x_i^{\alpha_j} \right)_{1 \leq i \leq N,\ 1 \leq j \leq N}$ by swapping two columns (see footnote 188).
However, if we swap two columns of an $N \times N$-matrix $A$, then $\det A$ gets multiplied by $-1$ (by Theorem 6.4.14 (b); see footnote 189). In other words, if $A$ and $B$ are two $N \times N$-matrices such that $B$ is obtained from $A$ by swapping two columns, then $\det B = -\det A$. Applying this to $A = \left( x_i^{\alpha_j} \right)_{1 \leq i \leq N,\ 1 \leq j \leq N}$ and $B = \left( x_i^{\beta_j} \right)_{1 \leq i \leq N,\ 1 \leq j \leq N}$, we obtain
$$\det \left( \left( x_i^{\beta_j} \right)_{1 \leq i \leq N,\ 1 \leq j \leq N} \right) = - \det \left( \left( x_i^{\alpha_j} \right)_{1 \leq i \leq N,\ 1 \leq j \leq N} \right)$$
(since the $N \times N$-matrix $\left( x_i^{\beta_j} \right)_{1 \leq i \leq N,\ 1 \leq j \leq N}$ is obtained from the matrix $\left( x_i^{\alpha_j} \right)_{1 \leq i \leq N,\ 1 \leq j \leq N}$ by swapping two columns). In view of (441) and (442), this rewrites as $a_\beta = -a_\alpha$. This proves Lemma 7.3.39 (b).
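Both parts of Lemma 7.3.39 can be observed numerically. The following is my own check (not from the notes), evaluating the alternant $a_\alpha = \det\left( (x_i^{\alpha_j}) \right)$ for $N = 3$ at sample integer points:

```python
# Numerical sketch (my own check): equal entries of alpha give equal columns,
# so the alternant vanishes (part (a)); swapping two entries of alpha swaps
# two columns, so the alternant changes sign (part (b)).
from itertools import permutations

def det(M):
    """Leibniz-formula determinant of a small square matrix."""
    n, total = len(M), 0
    for p in permutations(range(n)):
        inv = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = (-1) ** inv
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total

def alternant(xs, alpha):
    return det([[x ** a for a in alpha] for x in xs])

xs = [2, 3, 5]
assert alternant(xs, (4, 1, 4)) == 0                          # Lemma 7.3.39 (a)
assert alternant(xs, (1, 4, 2)) == -alternant(xs, (4, 1, 2))  # Lemma 7.3.39 (b)
```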
Some details omitted from the proof of Lemma 7.3.34. In our above proof of Lemma 7.3.34,
we have omitted certain arguments – namely, the proofs of the equalities (304) and
(305) in the proof of Observation 2. Let us now show these proofs:190
$$\left( \operatorname{cont}(T^*) \right)_{k+1} = \left( \text{\# of } (k+1)\text{'s in } T^* \right) = \underbrace{\left( \text{\# of } (k+1)\text{'s in } \operatorname{col}_{<j}(T^*) \right)}_{\substack{= \beta_k\left( \operatorname{col}_{<j} T \right) \\ \text{(by (294))}}} + \underbrace{\left( \text{\# of } (k+1)\text{'s in } \operatorname{col}_{\geq j}(T^*) \right)}_{\substack{= \left( \text{\# of } (k+1)\text{'s in } \operatorname{col}_{\geq j} T \right) \\ \text{(since } \operatorname{col}_{\geq j}(T^*) = \operatorname{col}_{\geq j} T \text{ by (295))}}}$$
$$\left( \operatorname{cont} T \right)_k = \left( \text{\# of } k\text{'s in } T \right) = \left( \text{\# of } k\text{'s in } \operatorname{col}_{<j} T \right) + \underbrace{\left( \text{\# of } k\text{'s in } \operatorname{col}_{\geq j} T \right)}_{\substack{= b_k - \nu_k \\ \text{(by (298))}}}$$
$$\alpha_k = \left( \nu + \operatorname{cont} T + \rho \right)_k = \nu_k + \underbrace{\left( \operatorname{cont} T \right)_k}_{= \left( \text{\# of } k\text{'s in } \operatorname{col}_{<j} T \right) + b_k - \nu_k} + \underbrace{\rho_k}_{\substack{= N - k \\ \text{(by the definition of } \rho)}} = \nu_k + \left( \text{\# of } k\text{'s in } \operatorname{col}_{<j} T \right) + b_k - \nu_k + N - k = \left( \text{\# of } k\text{'s in } \operatorname{col}_{<j} T \right) + b_k + N - k.$$
(by (300)). Comparing these two equalities, we obtain $\left( \operatorname{cont}(T^*) \right)_i = \left( \operatorname{cont} T \right)_i$.
However, $\gamma = \nu + \operatorname{cont}(T^*) + \rho$, so that
References
[17f-hw7s] Darij Grinberg, UMN Fall 2017 Math 4990 homework set #7 with solu-
tions, http://www.cip.ifi.lmu.de/~grinberg/t/17f/hw7os.pdf
[18f-hw2s] Darij Grinberg, UMN Fall 2018 Math 5705 homework set #2 with so-
lutions, http://www.cip.ifi.lmu.de/~grinberg/t/18f/hw2s.pdf
[18f-hw4s] Darij Grinberg, UMN Fall 2018 Math 5705 homework set #4 with so-
lutions, http://www.cip.ifi.lmu.de/~grinberg/t/18f/hw4s.pdf
[Armstr19] Drew Armstrong, Abstract Algebra I (Fall 2018) and Abstract Algebra
II (Spring 2019) lecture notes, 2019.
https://www.math.miami.edu/~armstrong/561fa18.php
https://www.math.miami.edu/~armstrong/562sp19.php
[BenQui03] Arthur T. Benjamin and Jennifer J. Quinn, Proofs that Really Count:
The Art of Combinatorial Proof, The Mathematical Association of
America, 2003.
[BenQui04] Arthur T. Benjamin and Jennifer J. Quinn, Proofs that Really Count:
The Magic of Fibonacci Numbers and More, Mathematical Adventures
for Students and Amateurs, (David F. Hayes and Tatiana Shubin,
editors), Spectrum Series of MAA, pp. 83–98, 2004.
[Berndt17] Bruce C. Berndt, Spring 2017, MATH 595. Theory of Partitions, lec-
ture notes, 2017.
https://conf.math.illinois.edu/~berndt/math595-tp.html
[Bourba02] Nicolas Bourbaki, Lie Groups and Lie Algebras: Chapters 4–6,
Springer 2002.
[Cohn04] Henry Cohn, Projective geometry over F1 and the Gaussian binomial
coefficients, American Mathematical Monthly 111 (2004), pp. 487–
495, arXiv:math/0407093v1.
[Comtet74] Louis Comtet, Advanced Combinatorics: The Art of Finite and Infinite
Expansions, D. Reidel Publishing Company, 1974.
[EdeStr04] Alan Edelman and Gilbert Strang, Pascal Matrices, American Math-
ematical Monthly, Vol. 111, No. 3 (March 2004), pp. 189–197.
[EGHetc11] Pavel Etingof, Oleg Golberg, Sebastian Hensel, Tiankai Liu, Alex
Schwendner, Dmitry Vaintrob, Elena Yudovina, Introduction to Rep-
resentation Theory, with historical interludes by Slava Gerovitch,
Student Mathematical Library 59, AMS 2011, updated version
2018.
[GesVie85] Ira Gessel, Gérard Viennot, Binomial Determinants, Paths, and Hook
Length Formulae, Advances in Mathematics 58 (1985), pp. 300–321.
[Grinbe09] Darij Grinberg, Solution to Problem 19.9 from “Problems from the
Book”.
http://www.cip.ifi.lmu.de/~grinberg/solutions.html
[Grinbe17] Darij Grinberg, Why the log and exp series are mutually inverse, 11
May 2018.
https://www.cip.ifi.lmu.de/~grinberg/t/17f/logexp.pdf
[Grinbe18] Darij Grinberg, The diamond lemma and its applications (talk), 20 May
2018.
https://www.cip.ifi.lmu.de/~grinberg/algebra/
diamond-talk.pdf
[Koch16] Dick Koch, The Pentagonal Number Theorem and All That, 26 August
2016.
https://darkwing.uoregon.edu/~koch/PentagonalNumbers.pdf
[Muir30] Thomas Muir, The theory of determinants in the historical order of de-
velopment, 5 volumes (1906–1930), later reprinted by Dover.
http://www-igm.univ-mlv.fr/~al/
[OlvSha18] Peter J. Olver, Chehrzad Shakiban, Applied Linear Algebra, 2nd edi-
tion, Springer 2018.
https://doi.org/10.1007/978-3-319-91041-3
See http://www.math.umn.edu/~olver/ala.html for errata.
[Robins05] Donald W. Robinson, The classical adjoint, Linear Algebra and its
Applications 411 (2005), pp. 254–276.
[Sam19] Steven V. Sam, Notes for Math 184: Combinatorics, 9 December 2019.
https://math.ucsd.edu/~ssam/old/19F-184/notes.pdf
[Sam21] Steven V. Sam, Notes for Math 188: Algebraic Combinatorics, 17 May
2021.
https://math.ucsd.edu/~ssam/188/notes-188.pdf
[Spivey19] Michael Z. Spivey, The Art of Proving Binomial Identities, CRC Press
2019.
See https://mathcs.pugetsound.edu/~mspivey/Errata.html for
errata.
[Zeng93] Jiang Zeng, A bijective proof of Muir’s identity and the Cauchy-Binet
formula, Linear Algebra and its Applications 184, 15 April 1993,
pp. 79–82.