compsci-120-coursebook
COMPSCI 120
Mathematics for Computer Science
Above: How to tile a 16 × 16 grid minus its top-right corner with shapes.
Contents
Numbers
1.1 Integers
1.2 Even and Odd Integers
1.3 Divisibility and Primes
1.4 Modular Arithmetic
1.5 % and Arithmetic
1.6 Other Number Systems
1.7 Practice Problems
Graphs
5.1 Graph Theory
5.2 Graphs: Useful Concepts
5.3 Walks and connectedness
5.4 Practice Problems
Trees
6.1 Trees
6.2 Useful Results on Trees
6.3 Rooted Trees
6.4 Practice Problems
Proofs
7.1 Proofs: Motivation and Fundamentals
7.2 Direct Proofs
7.3 Proof by Cases
7.4 Proof by Contradiction
7.5 Common Contradiction Mistakes: Not Understanding Negation
7.6 Proof by Construction
7.7 Proof by Induction: First Examples
7.8 Induction: Two Recurrence Relations
7.9 Induction, Graphs, and Trees
7.10 Proof Methods: How to Choose
7.11 Practice Problems
$7^{-2} = \frac{1}{7^2} = \frac{1}{49}$, $a^{-b} = \frac{1}{a^b}$.
❼ Logarithms. You should know what $\log_2(n)$ and $\log_{10}(n)$ mean,
and be comfortable with calculations like the following:
$\log_2(32) = \log_2(2^5) = 5$, $\log_{10}(1) = 0$, $\log_2(2^n) = n$, $2^{\log_2(a)} = a$.
$3x - 4 = 12 \Rightarrow 3x = 16 \Rightarrow x = \frac{16}{3}$,
$\frac{20}{x^2+1} > 2 \Rightarrow \frac{x^2+1}{20} < \frac{1}{2} \Rightarrow x^2 + 1 < 10 \Rightarrow |x| < 3$,
$x^2 - 3x + 2 > 0 \Rightarrow (x - 2)(x - 1) > 0$
⇒ (x − 2) and (x − 1) are both < 0, or (x − 2) and (x − 1) are both > 0
⇒ (x < 2 and x < 1), or (x > 2 and x > 1)
⇒ (x < 1) or (x > 2).
Exercise 1.2. A mistake people often make when adding fractions is the
following:
$\frac{1}{x} + \frac{1}{y} \overset{?}{=} \frac{1}{x+y}$
Now, we know that this isn’t right, and that $\frac{1}{x} + \frac{1}{y}$ is actually $\frac{y}{xy} + \frac{x}{xy} = \frac{x+y}{xy}$.
Sometimes, however, even a formula that is wrong in general
might be right in a specific situation! (Think of the old adage “a stopped
clock is right twice a day.”)
Does this ever happen with this mistake? In other words: are there any
values of x, y such that $\frac{1}{x} + \frac{1}{y} = \frac{1}{x+y}$? Or is this false for literally every
value of x and y?
1.1 Integers
We start our coursebook by studying the integers! In case you have
not seen the word “integer” before, we define it here:
Definition 1.1. The integers are the collection of all whole numbers:
that is, they consist of the whole positive numbers 1, 2, 3, 4, . . ., together
with the whole negative numbers −1, −2, −3, −4, . . ., and the number 0.
We denote this set by writing the symbol Z.
(Margin note: In this course and in mathematics in general, we will define
lots of useful concepts. When we do so, we’ll label these things by writing
“Definition” in bold, and then give you a carefully-written and precise
definition of that word. When studying for this class, it is a good idea to
make sure that you know all of the definitions in this coursebook, as well
as some examples and nonexamples for each definition where appropriate.)
The symbol Z comes from the German word “Zahl,” which means “number,”
in case you were curious.
Example 1.1. The numbers 1, −7, 10000, 78, 45, 0, −345678 are integers,
but things like √2, −2.787878787, 81 and π are not.
In some form or another, integers have been used by humans for almost
as long as humans have existed themselves. The Lebombo bone, one
of the oldest human artifacts, is a device on which people used tally
marks to count the number of days in the lunar cycle. The base-10
system and idea of numerals took a bit longer for us to discover, but we
can trace these concepts back to at least the Egyptians and Sumerians
around 4,000-3,000 BC. Finally, negative numbers and zero have been
with us for at least two millennia: historical documents dating back
to at least 1000 BC describe how Chinese mathematicians used negative
missing; one solution is drawn in the margin. When you remove a square,
however, things get a bit more interesting! Try it for a while; no mat-
ter how you do this, you’ll always have at least one square left over
uncovered.
However, saying “I tried it a lot and it didn’t work” is not a very persua-
sive argument. Suppose that you had a boss that said that Chad from
Marketing said this was totally possible, and to have a working version
on their desk by 5pm or you’ll be fired. What do you do?
Well, maybe you start revising your resumé and looking for other work.
Before that, though, you may want to try to make a logical argument to
your boss that shows that it’s impossible to come up with such a tiling
— a set of reasons so airtight that no matter what they think or would
respond with, they’d have to agree that you’re right. In mathematics
and computer science, we call such logical arguments proofs!
In this class, we’re going to write these kinds of arguments for almost
all of the claims that we make. We’re going to leave the rules for what
constitutes a “proof” a little vague at first; in our first few chapters, we’re
going to just try to write a solid argument that tells us why something
is true, and anything that does that we’ll count as a proof. (Later in
this course, we’ll approach proofs a bit more formally: feel free to read
ahead if you can’t wait!)
For this problem, let’s prove that an 8 × 8 chessboard with a single
square removed cannot be covered with 2 × 1 dominoes as follows:
you try this for a while, you’ll find yourself quite stuck again. So: why
are we stuck? What is a compelling argument we could write that would
persuade someone that this is truly impossible, and not just that we’re
bad at tiling?
After some thought and clever observations, you might come up with
the following:
Proof.
i. Notice that each time we place a 2 × 1 domino on our grid, it’s not
just true that it covers two units of area: it also covers exactly one
unit of white area and one unit of black area! This is because our
chessboard has alternating white and black squares.
ii. Because dominoes cannot overlap, this means that if we have placed
k dominoes on our board, there are k white squares and k black
squares covered by our dominoes. In particular, this means that
we always have as much black area covered as white area!
iii. However, if we count the black and white area in our 8 × 8 board
with the top-left and bottom-right squares removed, we can see
that there are 32 black squares and 30 white squares. These num-
bers are different! Therefore, it is impossible to cover this board
with dominoes as well.
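The counting in step iii can be double-checked with a short sketch. It assumes a colouring convention that is not fixed in the text: square (row, col) is white when row + col is even, which makes both removed corners white. The function name `colour_counts` is made up for this illustration.

```python
def colour_counts(size, removed):
    """Count (white, black) squares on a size x size board, skipping `removed`."""
    white = black = 0
    for row in range(size):
        for col in range(size):
            if (row, col) in removed:
                continue
            # Assumed convention: even row + col means white, odd means black.
            if (row + col) % 2 == 0:
                white += 1
            else:
                black += 1
    return white, black

# Remove the top-left and bottom-right corners of an 8 x 8 board.
print(colour_counts(8, {(0, 0), (7, 7)}))  # (30, 32)
```

Since every domino covers one square of each colour, unequal counts rule out a complete tiling.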
In our working above, the fact that we could not write 63 as an integer
multiple of 2 was a very useful observation! We can generalize this into
the concepts of even and odd integers:
To get some practice writing logical arguments, let’s look at a few prop-
erties of even and odd numbers that you already know:
Before getting into a “good” solution for this claim, let’s first study an
argument that doesn’t work.
“Bad” solution: Well, 1 + 1 = 2 is even, 3 + 7 = 10 is even, −13 + 5 = −8 is
even, and 1001 + 2077 = 3078 is even. Certainly seems to be true!
A defense of the “bad” solution: This might seem like a silly argument,
but suppose we’d listed a thousand examples, or a billion examples, or
set a computer program to work overnight and had it check all of the
pairs of numbers below $10^{12}$. In many other fields of study, that would
be enough to “show” a claim is true! (Think about science labs: there,
we prove claims via experimentation, and any theory you could test a
billion times and get the same result would certainly seem very true!)
Why this argument is not acceptable in mathematics and computer sci-
ence: When we make a claim about “any” number, or say that something
is true for “all” values, we want to really mean it. If we have not liter-
ally shown that the claim holds for every possible case, we don’t believe
that this is sufficient!
This is not just because computer scientists are fussy. In the world of
numbers, there are tons of “eventual” counterexamples out there:
❼ Consider the following claim: “The numbers in the sequence 12, 121,
1211, 12111, 121111, . . . are all not prime.”
(Margin note: Skip ahead a bit to Definition 1.4 if you haven’t
encountered prime numbers before!)
If you were to just go through and check things by hand, you’d
probably be persuaded by the first few entries: 12 = 3 ⋅ 4, 121 =
11 ⋅ 11, 1211 = 173 ⋅ 7, 12111 = 367 ⋅ 33, 121111 = 431 ⋅ 281, . . .
However, when you get to $12\underbrace{111\ldots1}_{136\ \text{1's}}$, that one’s prime! This is well
beyond the range of any reasonable human’s ability to calculate
things, and yet something that could come up in the context of
computer programming and information security (where we make
heavy use of 500+ digit primes all the time.)
❼ Here’s a fun exercise one of the writers of this coursebook saw in
the wild on Facebook, shared by one of our troll-ier friends:
Exercise 1.4. (++) Can you find positive integer values for a, b
and c so that the equation $\frac{a}{b+c} + \frac{b}{a+c} + \frac{c}{a+b} = 4$ holds?
On its surface, this looks simple, right? Just solve it.
And yet: suppose you set a computer program to work at this
problem, by just bashing out all possible triples of numbers. To be
fair, let’s give you access to the world’s fastest supercomputer. By
the end of a week, you wouldn’t have found an answer. By the end
of a year, you wouldn’t have found an answer! Indeed: by the time
we reached the heat death of the universe, you’d still have found
nothing. You’d be tempted to say that no answer exists, right?
(Margin note: Roughly $10^{106}$ years, or $3 \cdot 10^{113}$ seconds, or about
$6 \cdot 10^{130}$ floating-point operations on the world’s fastest
supercomputer as of early 2019.)
And yet, an answer exists! The math required to find this goes way
beyond the scope of this course, but if you’re curious: the smallest
known answer is the following:
a = 154476802108746166441951315019919837485664325669565431700026634898253202035277999,
b = 36875131794129999827197811565225474825492979968971970996283137471637224634055579,
and
c = 4373612677928697257861252602371390152816537558161613618621437993378423467772036.
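Checking a candidate answer is far easier than finding one. The sketch below assumes the equation in Exercise 1.4 reads a/(b+c) + b/(a+c) + c/(a+b) = 4, with the three large values playing the roles of a, b and c (the names `a`, `b`, `c` and `fraction_sum` are labels chosen for this illustration). It uses exact rational arithmetic, so no rounding is involved.

```python
from fractions import Fraction

def fraction_sum(a, b, c):
    """Return a/(b+c) + b/(a+c) + c/(a+b) as an exact fraction."""
    return Fraction(a, b + c) + Fraction(b, a + c) + Fraction(c, a + b)

a = 154476802108746166441951315019919837485664325669565431700026634898253202035277999
b = 36875131794129999827197811565225474825492979968971970996283137471637224634055579
c = 4373612677928697257861252602371390152816537558161613618621437993378423467772036

print(fraction_sum(1, 1, 1))       # 3/2, a small sanity check of the function
print(fraction_sum(a, b, c) == 4)  # exact comparison against 4
```

Python integers have arbitrary precision, which is why 80-digit values need no special handling here.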
Proof. Take any two odd numbers. Let’s give them names, for ease of
reference: let’s call them M and N . By definition, because M and N
are odd, we can write M = 2k + 1 and N = 2l + 1, for two integers k, l.
Therefore, M + N = (2k + 1) + (2l + 1) = 2k + 2l + 2 = 2(k + l + 1). In
particular, this means that M + N is an even number, as we’ve written
it as a multiple of 2!
Proof. As before, let’s start by taking any two odd numbers. Let’s give
them names, and call them M and N respectively.
Also as before, let’s consult our definitions! By definition, because M, N
are both odd, we can again write M = 2k + 1 and N = 2l + 1, for two
integers k, l.
Therefore, M ⋅ N = (2k + 1) ⋅ (2l + 1). Expanding this product gives you
4kl + 2k + 2l + 1, which you can regroup as (4kl + 2k + 2l) + 1. Factoring
a 2 out of the left-hand part gives you 2(2kl + k + l) + 1.
Because k, l are integers, the expression 2kl + k + l inside the parentheses
is an integer as well, as any product or sum of integers is still an integer.
Therefore, we have written M ⋅ N as 2 times an integer, plus 1; in other
words, M ⋅ N is odd.
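The algebra in the last two proofs can be spot-checked by machine. As the “bad” solution earlier makes clear, this is only a sanity check of the two identities on a finite range, not a proof; the function name `check_odd_identities` is made up for this sketch.

```python
def check_odd_identities(limit):
    """Spot-check both identities for every k, l with |k|, |l| <= limit."""
    for k in range(-limit, limit + 1):
        for l in range(-limit, limit + 1):
            m, n = 2 * k + 1, 2 * l + 1  # two odd numbers
            # Sum: (2k + 1) + (2l + 1) = 2(k + l + 1)
            if m + n != 2 * (k + l + 1):
                return False
            # Product: (2k + 1)(2l + 1) = 2(2kl + k + l) + 1
            if m * n != 2 * (2 * k * l + k + l) + 1:
                return False
    return True

print(check_odd_identities(50))  # True
```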
Claim 1.3. No integer is both even and odd at the same time.
Example 1.3.
❼ 4 divides 12; this is because we can multiply 4 by 3 to get 12.
❼ 72 can be divided by −6; this is because we can multiply −6 by −12
to get 72.
❼ 2 does not divide 15; this is because for any integer k, 2k is an
even number, and so is in particular never equal to an odd integer
like 15.
❼ n is a multiple of 1 for any integer n; this is because we can always
multiply 1 by n to get n.
❼ n is a factor of 0 for any integer n; this is because we can always
multiply n by 0 to get 0.
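The examples above can be reproduced with a small helper. This is a sketch that assumes the usual definition (a divides b when a ⋅ k = b for some integer k), since that is how the examples use the word; the name `divides` is chosen for this illustration.

```python
def divides(a, b):
    """Return True when a * k == b for some integer k."""
    if a == 0:
        return b == 0  # 0 * k is always 0, so 0 only divides 0
    return b % a == 0  # a zero remainder means b is a multiple of a

print(divides(4, 12))   # True:  4 * 3 = 12
print(divides(-6, 72))  # True:  -6 * -12 = 72
print(divides(2, 15))   # False: 2k is never 15
print(divides(1, -37))  # True:  1 * n = n
print(divides(5, 0))    # True:  n * 0 = 0
```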
Example 1.4. Note that the phrases “divides” and “can be divided
by,” while quite similar-sounding in English, put their two numbers in
opposite orders in mathematics! For instance, “3 divides 9” and “9 can be
divided by 3” are the same claim.
Definition 1.4. A prime number is any positive integer with only two
distinct positive factors; namely, 1 and itself.
Example 1.5. The first few prime numbers are 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, . . .
Proof. This is because 1’s only factor is 1, and so 1 does not have two
distinct positive integer factors.
Theorem 1.1. Every positive integer can be factorized into a product
of prime numbers in exactly one way, up to the ordering of those prime
factors.
(Margin note: To see a proof of Theorem 1.1, take Compsci 225!)
In other words, this theorem is saying the following:
❼ Every positive integer can be factored into primes, as illustrated
above. So, for example, we can take 60 and write it as 2 ⋅ 2 ⋅ 3 ⋅ 5.
❼ No number can be factored into primes in two different ways, up
to the ordering. That is: while you could write 60 as 5 ⋅ 2 ⋅ 3 ⋅ 2 or
5 ⋅ 3 ⋅ 2 ⋅ 2, you’re never going to write a prime factorization of 60
that has a 7 as one of its prime factors, or doesn’t have a 5.
(Margin note: You might object to Theorem 1.1 by saying that 1 cannot be
factored into a product of prime numbers. If so, good thinking! For now,
regard 1 as a special case. Later on, though, we’ll consider the idea of an
“empty product” when we get to the factorial and exponential functions.
At that time, we’ll argue that the “empty product” or “product of no
numbers” should be considered to be 1, because 1 is the multiplicative
identity. If you believe this, then 1 can indeed be written as the product
of prime numbers; it specifically can be written as the product of no prime
numbers!)
Proving this theorem is a bit beyond our skill set at the moment. Instead,
let’s mention a second useful fact about primes:
Theorem 1.2. There are infinitely many primes.
This proof is also a bit beyond us for now. However, if you skip ahead
to the proof by contradiction section of our notes / wait a few weeks,
you’ll see a proof of this in our course!
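To see the factorizations promised by Theorem 1.1 concretely, here is a minimal sketch that finds one by repeatedly dividing out the smallest remaining factor. The function name `prime_factors` is an assumption of this illustration, and the method shown is plain trial division rather than anything the course prescribes.

```python
def prime_factors(n):
    """Return the prime factors of n >= 1 in non-decreasing order."""
    factors = []
    d = 2
    while d * d <= n:
        # Divide out d as many times as it goes in; d is prime whenever this
        # succeeds, because all smaller factors were already removed.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors

print(prime_factors(60))  # [2, 2, 3, 5]
print(prime_factors(1))   # [] (no primes at all)
```

Returning the factors in sorted order is one way to pick a single representative from all the reorderings the theorem allows.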
Prime numbers (as you’ll see in Compsci 110) are incredibly useful for
communicating securely. Using processes like the RSA cryptosystem,
one of the first public-key cryptosystems to be developed, you can use
prime numbers to communicate secretly over public channels.
Prime numbers are also quite baffling objects, in that despite having
studied them since at least 300 BC there are still so many things we do
not know about them! Here are a few particularly outstanding problems,
the solutions for which would earn you an instant Ph.D/professorship
basically anywhere you like in the world:
Exercise 1.5. (++)(Goldbach conjecture.) Show that every even integer
greater than 2 can be written as the sum of two prime numbers. For
example, we can write 8 = 3 + 5, 14 = 11 + 3, 24 = 17 + 7, 6 = 3 + 3, . . .
Exercise 1.6. (++)(Twin prime conjecture.) A pair of prime numbers
are called twin primes if one is exactly two larger than the other. For
example, (5, 7), (11, 13) and (41, 43) are twin primes. Show that there
are infinitely many twin primes.
In general, working with prime numbers can get tricky very fast; even
simple problems can get out of hand! With that said, there are some
claims that are approachable. We study two such statements here:
Claim 1.5. Let ab be a two-digit positive integer (where b is that number’s
ones’ digit and a is its tens’ digit.) Show that the number abab is
not prime.
Proof. As noted before, examples alone aren’t enough for a solution.
However, if you’re stuck (as many of us would be on first seeing a problem
like this,) they can be useful for helping us find a place to start!
So: let’s create a bunch of two-digit numbers and factor them (say, via
WolframAlpha.) If our claim is wrong, then maybe we’ll stumble across
a number whose only factors are 1 and itself. Conversely, if our claim is
right, maybe we’ll see a pattern we can generalize! We do this in the
table below.
ab    abab    prime factorization of abab
10    1010    2 ⋅ 5 ⋅ 101
98    9898    2 ⋅ 7² ⋅ 101
21    2121    3 ⋅ 7 ⋅ 101
88    8888    2³ ⋅ 11 ⋅ 101
92    9292    2² ⋅ 23 ⋅ 101
43    4343    43 ⋅ 101
As we do this, a pattern quickly emerges: it looks like all of these numbers
are multiples of 101! This isn’t a solution yet: we just checked six
numbers out of quite a few possibilities. It does, however, tell us how to
write a proper argument here:
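Notice that for any two-digit number ab, ab ⋅ 101 = ab ⋅ 100 + ab = ab00 + ab =
abab. Therefore, any number of the form abab is a multiple of 101 and
of ab. Both of these factors are larger than 1 and smaller than abab, so
abab is not prime.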
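Because there are only ninety two-digit numbers, the identity abab = 101 ⋅ ab is one of the rare claims a computer can check exhaustively. The sketch below does so; the helper name `repeat_twice` is made up for this illustration.

```python
def repeat_twice(ab):
    """Write the digits of ab twice in a row, e.g. 43 -> 4343."""
    return int(f"{ab}{ab}")

# Check every two-digit number, not just a sample of six.
print(all(repeat_twice(ab) == 101 * ab for ab in range(10, 100)))  # True
print(repeat_twice(43))  # 4343
```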
Proof. We use the “suppose we’re wrong” technique from Claim 1.3.
That is: take any number n with the “no factors in the set 2, 3, . . . k”
property listed above. There are two possibilities:
❼ n is prime. This is what we want: if this case holds, we’re done!
❼ n is not prime. We want to show that this case cannot hold; if we
can do this, then we are left with only the case above, and have
thus proven our claim!
To do this: note that if n is not prime, then it must have more
than 2 factors. Let a be one of those positive factors that is not 1
or n; by the definition of factor, then, we can write n = ab for some
positive integer b, and thus have that b is also another factor of n.
We know that because a ≠ n, b ≠ 1. We also know that because
both a, b are factors of n, by our “suppose we’re wrong”
assumption a, b ≠ 2, 3, 4, . . . k. Therefore, by combining these
results, both a, b > k; that is, both a, b are greater than √n.
But this means that ab > √n ⋅ √n = n. But this is impossible, as
ab = n.
Therefore the only possibility left is that n is prime, as desired.
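This result is what makes trial division practical: to test whether n is prime, it is enough to look for factors up to √n. The sketch below assumes the claim being proven reads “if n ≥ 2 has no factors among 2, 3, . . ., k, where k is √n rounded down, then n is prime,” since the statement itself does not appear alongside this proof; `is_prime` is a name chosen for the illustration.

```python
from math import isqrt

def is_prime(n):
    """Return True when n is prime, checking candidate factors only up to sqrt(n)."""
    if n < 2:
        return False  # 0 and 1 are not prime
    for d in range(2, isqrt(n) + 1):  # isqrt(n) is sqrt(n) rounded down
        if n % d == 0:
            return False  # found a factor other than 1 and n
    return True

print([p for p in range(30) if is_prime(p)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```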
Unlike the process above, you would not find the answer here by
adding 18 to 11 and getting “29 o’clock.” Instead, you’d find 11 +
18 = 29, and then subtract off 24 to get the right answer, 5 a.m.
Definition 1.7. Take any two integers a, n, where n > 0. We define the
number a % n, pronounced “a mod n”, by the following algorithm:
❼ If a ≥ n, repeatedly subtract n from a until a < n. The result of
this process is a % n.
❼ If a < 0, repeatedly add n to a until a ≥ 0. The result of this process
is a % n.
❼ If neither of these cases applies, then by definition 0 ≤ a < n. In this
case, a % n is simply a (that is, we don’t have to do anything!)
(Margin note: An algorithm is a step-by-step process for solving a problem
or performing a calculation! We will study algorithms in more depth later
in this class; you’ll also see them everywhere in Compsci 110 / 130 / 220 /
225 / essentially, every paper you’ll see in your Compsci degree!)
(Margin note: if n < 0, we can use the same process as above to calculate
a % n. The only change is that we replace n with ∣n∣ in our steps, to
compensate for the fact that n is negative.)
We call % the modulus operator. Most programming languages provide it,
and agree with the description here when a ≥ 0 and n > 0; for negative
inputs the behaviour varies between languages, so (as always) you should
read the documentation for details.
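The algorithm in Definition 1.7 translates almost line for line into code. The sketch below is an illustration only (the function name `mod` is made up), and it stops adding as soon as the value is no longer negative, so that the result always lands between 0 and n − 1.

```python
def mod(a, n):
    """Compute a % n for n > 0 by repeated subtraction or addition."""
    if n <= 0:
        raise ValueError("this sketch only handles n > 0")
    while a >= n:  # too big: subtract n until a < n
        a -= n
    while a < 0:   # negative: add n until a is at least 0
        a += n
    return a

print(mod(29, 24))   # 5, the clock example
print(mod(-7, 3))    # 2
print(mod(-24, 24))  # 0
```

For n > 0, Python's built-in `%` gives the same answers, so `mod(a, n) == a % n` on these inputs. The loop version is far slower for large a and is shown only because it mirrors the definition.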
Claim 1.7. If a, n are any two integers with n > 0, the quantity a % n
exists and is between 0 and n − 1. That is: the algorithm given above
to calculate % will never “crash” nor “run forever,” and it will always
generate an output between 0 and n − 1.
– To set this up, take the alphabet {a, b, . . . z}, and label each
letter with the numbers {0, 1, . . . 25}, as shown below. Also
choose a secret key k from the set of numbers {0, 1, . . . 25};
for this problem, let’s pick our secret key to be k = 15.
letter  a   b   c   d   e   f   g   h   i   j   k   l   m
number  0   1   2   3   4   5   6   7   8   9   10  11  12
letter  n   o   p   q   r   s   t   u   v   w   x   y   z
number  13  14  15  16  17  18  19  20  21  22  23  24  25
– Now, take the message you want to send: for example, let’s
send the phrase chain fusion .
– Swap the letters for numbers, to get 2.7.0.8.13 5.20.18.8.14.13 .
– Take each number in your code and add our secret key to it:
this gives us 17.22.15.23.28 20.35.33.23.29.28 .
– Now replace each number in our code with its value modulo
26; this gives us 17.22.15.23.2 20.9.7.23.3.2 .
– Translate this back to letters, to get the encrypted message
rwpxc ujhxdc . To an outsider, this would look like gibber-
ish! In ancient Roman times, most people who would inter-
cept such a message would assume that it was written in a
foreign language and not be able to translate it.
– But, if you knew the secret key, you could just translate this
message back to numbers and subtract 15 from each number
modulo 26. This would give you the original message, as
desired.
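The steps above can be sketched in a few lines. This illustration assumes lowercase letters labelled 0 to 25 as in the text and leaves spaces untouched; the names `ALPHABET` and `shift` are chosen for the example.

```python
import string

ALPHABET = string.ascii_lowercase  # 'a'..'z', labelled 0..25 by position

def shift(message, key):
    """Shift each lowercase letter by `key` places modulo 26; keep other characters."""
    result = []
    for ch in message:
        if ch in ALPHABET:
            number = ALPHABET.index(ch)                    # letter -> number
            result.append(ALPHABET[(number + key) % 26])   # add key, reduce, back to letter
        else:
            result.append(ch)
    return "".join(result)

secret = shift("chain fusion", 15)
print(secret)              # rwpxc ujhxdc
print(shift(secret, -15))  # chain fusion
```

Decryption reuses the same function with the key negated, which works because Python's `%` returns a value between 0 and 25 even when `number + key` is negative.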
If you tried to use a computer to directly expand this problem, the size
of the resulting number (approximately $10^{10^{29}}$) would take up ≈ $10^{17}$
terabytes, i.e. several orders of magnitude in excess of the current total
storage capacity of humanity.
And yet, if you pop over to WolframAlpha and ask it this question, you’ll
get the following answer in about three seconds:
Example 1.9.
A quick corollary to Claim 1.9 is the following:
Corollary 1.1. If a % n = b % n, then for any positive integer k, we have
$(a^k) \% n = (b^k) \% n$.
(Margin note: More English/mathematics word definitions! A corollary is a
claim that follows quickly from the claim that just came before. That is, a
corollary is a “consequence” of the earlier claim, or something that you get
“for free” once you’ve proven some earlier result.)
We leave proving this for Exercise 9; try proving this by using the
multiplication property in Claim 1.9!
Instead of spending more time with this sort of abstract stuff, let’s do
something practical: let’s talk about how we could solve our
WolframAlpha problem! Specifically, let’s solve the following problem:
Proof. The clever thing here is not that we can calculate this (again,
if we just wanted an answer, we could plug this into WolframAlpha),
but rather that we can calculate this easily! That is: we can use mod-
ular arithmetic to find this number without needing a calculator, with
relatively little work overall.
To do so, just make the following observations:
❼ First, 213047 % 10 = 7, and in general any positive number is con-
gruent to its last digit modulo 10. This is by definition: if you
take any number a greater than 10, subtracting 10 from it does
not change its last digit! Therefore, the process we defined to cal-
culate a % 10 will never change the last digit of a, and thus its
output at the end is precisely a’s last digit.
❼ Therefore, we have that $(213047^{129314}) \% 10 = (7^{129314}) \% 10$, by
using our “exponentiation” result from earlier.
❼ Now, notice that $(7^2) \% 10 = 49 \% 10 = 9$, and thus that $(7^4) \% 10 =
(7^2 \cdot 7^2) \% 10 = (9 \cdot 9) \% 10 = 81 \% 10 = 1$.
It’s worth noting that this trick isn’t just useful for humans, but for
computers as well: you can use tricks like this to massively speed up
calculations in which you only care about the number’s last few digits,
and/or other pieces of partial information.
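As one illustration of how a computer exploits this, Python's built-in three-argument `pow` reduces modulo n after every multiplication, so the enormous power is never written out in full. The sketch below compares it with the by-hand route from the observations above; the step 129314 = 4 ⋅ 32328 + 2 is a piece of arithmetic added here for the example.

```python
# pow(base, exponent, modulus) keeps every intermediate value below the modulus.
last_digit = pow(213047, 129314, 10)

# By hand: replace the base with its last digit 7, then use 7**4 % 10 == 1.
# Since 129314 = 4 * 32328 + 2, the answer matches the last digit of 7**2.
by_hand = (7 ** (129314 % 4)) % 10

print(last_digit, by_hand)  # 9 9
```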
We can use this trick to find the last digit of basically any number
raised to the power of any other number! For example, the task of
finding the last digit of
$12345678910111213141516171819^{12345678910111213141516171819}$
If $\frac{x+y}{xy} = \frac{1}{x+y}$, then because x, y and x + y are all nonzero, we can multiply
both sides by the denominators (i.e. multiply both sides by xy(x + y).)
Doing so will get rid of our fractions, as
So, for example, the real numbers contain all of the rational numbers,
because you can do things like write
❼ $\frac{1}{2} = 0.5$,
❼ $\frac{1}{3} = 0.333333\ldots = 0.\overline{3}$, and
❼ $\frac{22}{7} = 3.142857142857142857\ldots = 3.\overline{142857}$.
However, there are also real numbers that are not rationals: i.e. there
are quantities out there in the world that we cannot express as a ratio
of integers! We call such numbers irrational.
Observation 1.4. Notice that every real number, by definition, is either
rational or irrational.
It’s a bit beyond the skills we have at the moment, but numbers like
π = 3.1415926535897932384626433832795028841971693 . . .
and
e = 2.7182818284590452353602874713526624977572470 . . .
are both irrational numbers. Later on in this class, we’ll prove that √2
is also an irrational number (check out the proof by contradiction section
of this book if you cannot wait!)
With that said, though, it would be a bit of a shame to not show you at
least one irrational number. So, let’s close this chapter by doing this!
Proof. We prove this claim by using the “suppose we’re wrong” structure
we’ve successfully used in Claims 1.6 and 1.3. That is: what would
happen if $\log_2(3)$ were rational?
Well: we could write $\log_2(3) = \frac{x}{y}$, for two integers x, y. Note that because
$\log_2(3)$ is positive (it’s greater than $\log_2(2) = 1$), we can take x, y > 0.
Raising 2 to the power of each side of this equation gives us $2^{\log_2(3)} = 2^{x/y}$.
By definition, $\log_2(\star)$ and $2^{\star}$ cancel each other out, so this simplifies to
$3 = 2^{x/y}$.
Now, raising both sides to the y-th power gives us $3^y = (2^{x/y})^y$. By using
We close this chapter with some exercises for you to try out. Give these
an attempt, and post your answers / attempts on Piazza!
14. (+) Find the last digit of 987654321 without using a computer or
calculator.
15. (+) What is the longest English word that can be shift-ciphered
into another English word?
16. (++) Take an integer n. If it’s odd, replace it with 3n + 1; if it’s
even, replace it with n/2. Repeat this until the number is equal to
1. Will this process always eventually stop?
17. Consider the following claim: “Let $\frac{a}{b}$, $\frac{c}{d}$ be a pair of rational numbers.
Suppose that ad > bc. Then we have $\frac{a}{b} > \frac{c}{d}$.”
Is this claim true? If it is, explain why. If it is not, find a pair of
fractions that demonstrate that this claim is false.
(Margin figure from https://xkcd.com/710/)
18. If x is rational and y is irrational, must x + y be irrational? Either
explain why or find a counterexample (i.e. find a pair of numbers
x, y such that x is rational, y is irrational, and x + y is rational.)
19. If x and y are both irrational numbers, must x + y be irrational?
Either explain why or find a counterexample (i.e. find a pair of
numbers x, y such that x, y are irrational, but x + y is rational.)
20. (+) Show that if a, b, c are all odd integers, then there is no rational
number x such that the equation $ax^2 + bx + c = 0$ holds.
21. (++) Is π + e irrational?
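The process described in Exercise 16 is easy to experiment with, though no amount of experimenting settles the question. The sketch below assumes n is a positive integer and gives up after a fixed number of steps; the name `steps_to_one` and the cut-off `max_steps` are choices made for this illustration.

```python
def steps_to_one(n, max_steps=10_000):
    """Apply the Exercise 16 rule until n reaches 1.

    Returns the number of steps taken, or None if max_steps is reached first.
    """
    steps = 0
    while n != 1:
        if steps >= max_steps:
            return None  # gave up; this says nothing about whether it ever stops
        n = 3 * n + 1 if n % 2 == 1 else n // 2
        steps += 1
    return steps

print(steps_to_one(6))   # 8, via 6, 3, 10, 5, 16, 8, 4, 2, 1
print(steps_to_one(27))  # 111
```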
Exercise 2.1. You’re trying to break into a safe that has a PIN lock.
The safe has two buttons: 0 and 1. The PIN you’re trying to guess is a
sequence of three binary digits, and the safe accepts the last three digits
you’ve typed in without needing you to hit enter: i.e. if you typed in
“00010,” the safe would open if the PIN was either “000” or “001” or
“010”.
Sounds easy, right? There are only eight possible PINs to check (two
possibilities per digit, three digits in the PIN ⇒ 2³ = 8 possible PINs), so we
should be able to brute-force the lock by checking all possibilities.
However, the safe is wired to call the cops if more than ten buttons are
pressed and the correct PIN is not entered. As such, we can’t use our
brute-force approach: that could take 8 ⋅ 3 entries!
Is there an approach that is guaranteed to break us into the safe?
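To experiment with this exercise, it helps to have a way of listing which PINs a given sequence of button presses actually tries. The sketch below is one such checker (the function name `pins_tried` is made up here); it does not answer the exercise, but lets you test a candidate sequence.

```python
def pins_tried(presses, length=3):
    """Return every PIN the safe sees while `presses` is typed in.

    The safe checks each run of `length` consecutive digits.
    """
    return {presses[i:i + length] for i in range(len(presses) - length + 1)}

print(sorted(pins_tried("00010")))  # ['000', '001', '010']
```

A sequence solves the exercise if it has at most ten presses and `pins_tried` returns all eight PINs.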
2.1 Strings
First, let’s define what an alphabet is:
0,1,2,3,4,5,6,7,8,9
If we’re working in binary, we use a much smaller alphabet:
0,1
If we’re working with DNA, we’d use
A,C,G,T
as our alphabet, where these represent the four nucleotide bases adenine
[A], cytosine [C], guanine [G] and thymine [T].
There are other alphabets that are too big to write down here: for
example, the set of all Unicode symbols, or the set of all emojis!
Given an alphabet, it’s often useful to be able to refer to the whole
thing with a symbol. We’ll do this by writing something like Σ =
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. This notation, where we list our symbols be-
tween a pair of curly braces and separate them with commas, tells us
that Σ is an alphabet containing the ten symbols 0,1,. . . 9.
With the definition of an alphabet in hand, we can define strings:
Definition 2.2. Take any alphabet Σ. A string over the alphabet Σ is
any sequence of symbols from Σ.
(Margin note: Some people refer to strings as “words”: if you see an author
referring to a collection of words over a given alphabet, this is just a
synonym for strings!)
Example 2.2. If we let Σ be the Roman alphabet described earlier, then
“cat,” “mongoose,” and “ssssssssssss” are all strings over this alphabet.
Note that these strings don’t have to correspond to any particular mean-
ing; they’re just sequences of symbols!
If we let Σ be the decimal alphabet, then “123,” “00012,” and “999”
are each possible strings over this alphabet. Again, these don’t always
have to correspond to numbers! In particular, notice that as strings
we think that “00012” and “12” are different things. Even though as
numbers they’re the same, as strings they’re quite different: “00012”
has zeroes in it, while “12” does not. (That is, think about entering a
password on your phone. There, if someone has a password of “00012,”
entering “12” shouldn’t unlock their phone!)
We will sometimes not specify an alphabet, and instead just refer to
strings by listing their entries. If so, we assume that their alphabet is
the most reasonable one to work with that string in (usually either the
Roman alphabet, decimal, or binary.)
A particularly useful string to refer to is the empty string “”, i.e. the
string containing no symbols. We denote this string by writing λ.
In other words, two strings are equal if and only if they are literally
character-for-character identical! Note that two strings of different lengths
are never equal.
Example 2.4.
❼ The strings “00001” and “1” are different. Even though the un-
derlying numbers they represent are the same, these are different-
length strings.
❼ If we take the alphabet given by all characters on a keyboard, the
strings “12+23” and “10+25” are different. Even though these are
the same length and represent the same underlying integer, the
characters are different in some places: for instance, the second
character of the first string is 2, while the second character of the
second string is 0.
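Most programming languages compare strings in exactly this character-for-character way. As a small illustration in Python (the helper name `compare` is made up for this sketch):

```python
def compare(s, t):
    """Return (equal as strings, equal as decimal integers)."""
    return s == t, int(s) == int(t)

print(compare("00001", "1"))  # (False, True)
print("12+23" == "10+25")     # False: the second characters differ
print(12 + 23 == 10 + 25)     # True: both sums are 35
```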
Example 2.5.
❼ Let s =“song” and t =“bird”. Then st is the string “songbird”.
❼ Let s =“12” and t =“0”. Then st is equal to “120”. Notice that
this is very different to what we would mean by writing st if we
thought of s, t as integers; there, st would denote 12 ⋅ 0 = 0!
In general, if you’re using string concatenation on strings of num-
bers, make sure to indicate this to your reader repeatedly through
your working so that they know what you’re doing. The use of
quotation marks can help keep things clear: that is, because we
wrote s =“12” and t =“0”, we’ve told you that we are thinking of
s, t as strings, and thus that concatenation is the appropriate way
to combine them.
❼ You can concatenate multiple strings at once: i.e. if s =“3”, t =“.”,
and u =“14159265...”, then stu is just “3.14159265...”
Notice that if s has length n and t has length m, then st has length
n + m.
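In Python, the `+` operator on strings is concatenation, which makes the contrast with multiplication of integers easy to see. The sketch below replays Example 2.5; the variable names follow the example, and `concat` is a helper invented for this illustration.

```python
def concat(*parts):
    """Concatenate any number of strings, in order."""
    return "".join(parts)

s, t = "12", "0"
print(concat("song", "bird"))            # songbird
print(concat(s, t))                      # 120 (concatenation of strings)
print(int(s) * int(t))                   # 0   (multiplication of integers)
print(concat("3", ".", "14159265..."))   # 3.14159265...
print(len(concat(s, t)) == len(s) + len(t))  # True: lengths add
```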
Concatenation is used in tons of practical applications:
❼ Every bank account number is a concatenation of a bank code
(telling you what company you bank with,) an account number
(which tells the bank who owns this account,) and an account type
code (telling you what kind of account that number is attached to.)
❼ We saw that many ID numbers have “check digits” when we worked
with modular arithmetic! As such, your full ID number is usu-
ally created by concatenating your account number with the check
digit.
Related to the idea of concatenation, we have the concepts of prefixes,
substrings and suffixes:
Example 2.6.
❼ If t = “snowball,” then “snow” is a prefix of t, “ball” is a suffix of
t, and “now” is a substring of t.
❼ If t = “112323411,” then “112” is a prefix of t, “323411” is a suffix
of t, and “2” is a substring of t.
Claim 2.1. The empty string λ is a prefix, suffix, and substring of every
string t.
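The three notions above can be checked mechanically. The sketch below uses Python's built-in string methods; the helper names is_prefix, is_suffix and is_substring are ours, chosen just for illustration.

```python
def is_prefix(p: str, t: str) -> bool:
    """True if t starts with p."""
    return t.startswith(p)

def is_suffix(p: str, t: str) -> bool:
    """True if t ends with p."""
    return t.endswith(p)

def is_substring(p: str, t: str) -> bool:
    """True if p appears somewhere inside t as a contiguous block."""
    return p in t

t = "snowball"
print(is_prefix("snow", t), is_suffix("ball", t), is_substring("now", t))

# The empty string is a prefix, suffix, and substring of every string.
empty = ""
print(is_prefix(empty, t), is_suffix(empty, t), is_substring(empty, t))
```

Both lines print three True values, matching Example 2.6 and Claim 2.1.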
2.2 Sets
A second useful object, that we will often study in relation to strings, is
the concept of a set:
Definition 2.7. A set A is just a collection of things. We call those
things the elements of A, and write x ∈ A to denote with symbols the
statement “x is an element of A.”
To describe a set, we just list its elements between a pair of curly braces:
for example, {1, 2, 3} would be how we would describe the set consisting
of the three numbers 1, 2 and 3.
Notice that sets can be finite (in the case of things like “the collection
of all English words”) or infinite (in the case of the set of all prime
numbers!)
To make our lives easier when working with sets, let’s make a few nota-
tional conventions about how we should treat them:
❼ When we’re describing a set, we don’t care about the order in
which we list our elements: i.e. {cat, tag, tact} and {tag, cat, tact}
are both the “same” to us! This is because we only care about what
things are contained within a set; the order is something that we’ll
wind up changing a lot depending on the context (i.e. sometimes
alphabetical, sometimes by length...) and isn’t itself something we
want to care about.
❼ Similarly, when we’re describing a set, we only want to list each
element once. This is because otherwise it would be quite irri-
tating to try to look things up in our set: imagine a dictionary
that just listed the word “mongoose” forty times in a row!
As such, if someone gives you a set in which an element is repeated,
we just remove duplicates: i.e. we say { cat, tag, tact, tact,
tact } and {cat, tag, tact} are the same, and would never write
the first thing if we could help it.
❼ In the case of {cat, tag, tact}, we were able to describe our set by
just listing its elements. This works for small cases, but becomes
quite unwieldy for larger sets: imagine having to write out all of
the words in French before discussing the French language!
To deal with this, we have an alternate way of writing sets: you
can describe them by giving a property. For instance, when
we say “the set of all words in Māori” above, we’re giving you a
property that a given string of letters may or may not satisfy (i.e.
“is it a word in Māori”), and then taking the set of all words that
satisfy that property.
While the sentences we used in our examples above do work as
definitions for sets, you can also use the following more “math-y”
construction: to describe the set of all strings s with property blah,
you can just write
{s ∣ s has property blah}
For instance, the set of all odd-length binary strings could be
described as the following:
{s ∣ length(s) = 2k + 1 for some k ∈ Z, and s is a binary string.}
Side note: We use the notation “∈” as shorthand for the word “in.”
The “s” on the left tells you the variable name, the divider ∣ just
separates the variable from its property, and the text at the right
gives the required property.
You can also use the left-hand part to describe the structure of
your set’s elements: i.e. something like
{001s ∣ s is a binary string}
gives you all binary strings that start with the prefix “001.”
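Set-builder notation translates almost word-for-word into a Python set comprehension. Since a computer can't hold all (infinitely many) binary strings, the sketch below assumes a cut-off length; the helper binary_strings and the cut-offs 5 and 2 are our own choices for illustration.

```python
from itertools import product

def binary_strings(max_len):
    """All binary strings of length 0, 1, ..., max_len (a finite stand-in)."""
    return {"".join(bits)
            for n in range(max_len + 1)
            for bits in product("01", repeat=n)}

universe = binary_strings(5)

# {s | length(s) is odd, and s is a binary string}, restricted to length <= 5.
odd_length = {s for s in universe if len(s) % 2 == 1}

# Using the left-hand part to describe structure: "001" followed by any
# binary string s of length at most 2.
starts_with_001 = {"001" + s for s in binary_strings(2)}

print(len(odd_length))           # 42
print(sorted(starts_with_001))
```

The count 42 is 2 + 8 + 32, i.e. the number of binary strings of lengths 1, 3 and 5.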
One useful concept when working with sets is a notion of “size:”
Definition 2.8. A set A has size n if it contains precisely n different
elements. If A contains infinitely many different elements, we say that
A has “infinite” size. We denote the size of A by writing ∣A∣.
Example 2.8.
❼ The set {1, 2, 3, π, 7} has size 5.
❼ The set of all binary strings of length 2, i.e. {00, 01, 10, 11}, has
size 4.
Side note: In maths, the word cardinality is used to refer to the size of
a set. If you take papers like Maths 190 or Compsci 225, you can learn
to study the idea of “different sizes of infinity” by working with
cardinality! In particular, using the idea of a bijection in those courses,
you can show that the integers, rationals, and natural numbers somehow all
have the same “countable” size of infinity, while the real numbers somehow
have a larger and “uncountable” size of infinity. . .
Another useful concept when working with sets is the idea of a “subset:”
Definition 2.9. Take two sets A, B. We say that B is a subset of A,
and write B ⊆ A, if every object in B is also an object in A.
Example 2.9.
❼ Let A be the collection of all University of Auckland ID numbers,
and let B be the collection of all University of Auckland ID num-
bers corresponding to active Compsci 120 students. Then B is a
subset of A!
❼ Let A be the set of all binary strings of length 3, and let B be the
set of all binary strings with exactly two 1’s.
Then B is not a subset of A. This is because B contains things
like “11000”, which are not in A. Similarly, A is not a subset of
B, because A contains things like “000” that are not in B!
❼ Let A be the English language, and B be the collection of all
English words that rhyme with “avocado.” Then B is a subset of
A, as every word in B is by definition a word in A!
Example 2.10.
❼ Let A be the collection of all English words with even length and
B be the collection of all English words with odd length. In this
case, A ∪ B is the collection of all English words.
❼ Let A be the collection of all Compsci 120 students that turned in
assignment 1, and B be the collection of all Compsci 120 students
that attended tutorial 1. Then A∪B is the collection of all Compsci
120 students who either attended tutorial 1 or turned in assignment
1, or both.
In general, unions work like “or” operations: the union of a set
defined by property A with a set defined by property B is just the
collection of all elements that satisfy property A or B.
❼ Let A be the collection of the 1000 most common phrases used in
spam emails (things like “You be a Winner!!!1!!”) and B be a col-
lection of dodgy email addresses (e.g. “bi11.gates@micr0soft.ie”).
Then, the union A ∪ B is a good start for a “block list,” i.e. some-
thing that an email filter can use to automatically trash certain
emails.
Example 2.11.
❼ Let A be the English language and B be the German language.
Then A∩B is the set of words that are both in English and German
at the same time: i.e. words like “alphabet,” “computer” and “tag”
would be in A∩B, as they are all both English and German words.
❼ Let A be the set of numbers that are multiples of 3, and B be the
set of numbers that are multiples of 2. Then A ∩ B is the set of
numbers that are multiples of both 2 and 3; i.e. it’s the set of all
numbers that are multiples of 6!
Like how union was an “or,” intersection works like an “and” oper-
ation: that is, the intersection of a set defined by property A with
a set defined by property B is just the collection of all elements
that satisfy property A and B.
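Python's built-in set type follows the same conventions as our sets (no order, no repeats), and has union and intersection operators. The sketch below replays the “multiples of 2 and 3” example; restricting to the numbers 1 to 30 is our assumption, made so that the sets are finite.

```python
numbers = range(1, 31)

A = {n for n in numbers if n % 3 == 0}   # multiples of 3
B = {n for n in numbers if n % 2 == 0}   # multiples of 2

union = A | B           # elements in A or B (or both)
intersection = A & B    # elements in A and B

print(sorted(intersection))   # [6, 12, 18, 24, 30]

# Order and repeats don't matter when describing a set.
assert {"cat", "tag", "tact", "tact"} == {"tag", "cat", "tact"}

# Subsets: every element of the intersection is an element of A.
assert intersection <= A
```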
If you go back to our remarks earlier, this should make sense. We said
that the only thing we cared about for a set was the elements it con-
tained; i.e. we didn’t care about the order, and we ignored repeats/etc.
Therefore, two sets should be the same if they contain the same elements!
A useful proof technique that we’ll often use to show that two sets are
the same is the following. Take two sets A, B that you want to show are
equal. Suppose you showed that
1. every element in A is an element in B, and also
2. every element in B is an element in A.
Then, by the definition above, we would know that A and B are equal!
As such, we can use this two-part approach to prove that many pairs
of sets are equal. We study a few examples here, to get the hang of
this:
Claim 2.3. Let A, B be any two sets such that A ⊆ B. Then A ∪ B = B.
This is not the only way to prove that two sets are equal! As always,
simply expanding the definitions of both sets can often do the same trick:
These are the same statements; therefore we’ve shown that these sets
are equal!
We’re not going to practice counting things like this! Instead, consider
the following four exercises:
Exercise 3.1. How many strings of five letters are palindromes (i.e.
can be read the same way forwards and backwards?)
Exercise 3.2. A lattice path in the plane R² is a path joining integer
points via steps of length 1 either upward or rightward. How many lattice
paths are there from (0, 0) to (3, 3)?
Exercise 3.3. How many seven-digit phone numbers exist in which the
digits are all nondecreasing?
Exercise 3.4. In how many ways can you roll a six-sided die three times
and get different values each time?
All of these are “counting” problems, in that they’re asking you to figure
out how many objects of a specific kind exist. However, because the sets
in question are trickily defined, these problems are much harder than
our “how many elements are in {3, 5, 7, Snape}” question.
To approach them, we’ll need some new counting techniques! This is
the goal of this chapter: we’re going to study combinatorics, the art
of counting, and develop techniques for solving problems like the ones
above.
Let’s start off with a simple task:
Problem 3.1. Suppose that we have 3 different postcards and 2 friends.
In how many ways can we mail out all of our postcards to our friends
while we’re on vacation?
Answer. Let’s give our cards names A, B, C, and also give our friends
names X and Y , for easy reference.
In the setup above, a valid “way” to mail postcards to friends is some
way to assign each postcard to a friend (because we’re mailing out all
three of our postcards.) To do this, think of going through each card
A, B, C one-by-one and choosing a friend X, Y for each card.
By using brute-force, we can just enumerate all of the possibilities:
❼ X gets A, B, C, Y gets nothing.
❼ X gets A, B, Y gets C.
❼ X gets A, C, Y gets B.
❼ X gets B, C, Y gets A.
❼ X gets A, Y gets B, C.
❼ X gets B, Y gets A, C.
❼ X gets C, Y gets A, B.
❼ X gets nothing, Y gets A, B, C.
This works, and tells us that there are 8 different ways to do this.
However, if we had more cards this brute-force process seems like a bad
idea:
Problem 3.2. Suppose that we have 10 different postcards and 4 friends.
In how many ways can we mail out all of our postcards to our friends
while we’re on vacation?
When doing this process, we have 4 choices of friend for each card, and
any combination of those choices will give us a valid way to send out
cards. Therefore, we should have
$\underbrace{4 \cdot 4 \cdot \ldots \cdot 4}_{10 \text{ 4's}} = 4^{10}$
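If you'd like to double-check counts like these by machine, here is a small brute-force sketch. It treats a “way to mail postcards” as a choice of one friend per card, exactly as in the answer to Problem 3.1; the function name count_mailings is ours.

```python
from itertools import product

def count_mailings(num_cards, num_friends):
    """Count assignments of a friend to each card by listing them all."""
    friends = range(num_friends)
    return sum(1 for _ in product(friends, repeat=num_cards))

print(count_mailings(3, 2))   # 8, matching the list we wrote out by hand
print(4 ** 10)                # 1048576 ways for 10 cards and 4 friends
```

Listing all 1048576 options by hand is clearly out of the question, which is why the “multiply the number of choices” shortcut is so handy.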
Problem 3.4. Take a six-sided die, and roll it twice in a row. In how
many ways can you do this and not see the same value repeat?
$\underbrace{1, 2, 3, 4, 5 \text{ or } 6}_{\text{roll 1}} \qquad \underbrace{1, 2, 3, 4, 5 \text{ or } 6,\ \neq \text{roll 1}}_{\text{roll 2}}$
There are 6 possibilities for the first roll. Once this is done, there are 5
possibilities for the second roll, as we can have any value show up other
than the one that occurs in the first roll. Therefore, there are 6 ⋅ 5 = 30
possibilities in total by the multiplication principle.
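As a sanity check on the 6 ⋅ 5 = 30 count, we can list every pair of rolls and keep only those where the two values differ. This is just a brute-force sketch; the variable names are ours.

```python
from itertools import product

faces = range(1, 7)

# Every (roll 1, roll 2) pair in which the second roll differs from the first.
distinct_rolls = [(r1, r2) for r1, r2 in product(faces, repeat=2) if r1 != r2]

print(len(distinct_rolls))   # 30
```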
It is worth noting that the multiplication principle does not apply to all
possible situations! Consider the following counting problem:
Problem 3.6. Suppose that you have four friends Aang, Korra, Zuko,
and Toph. In how many ways can you arrange them in a row, if Aang
and Zuko are currently fighting and can’t be placed next to each other?
Answer. If we just had the problem “In how many ways can we ar-
range Aang, Korra, Zuko, and Toph in a row,” our problem is pretty
straightforward! By using our multiple-choice framework from before,
we can see that we have 4 choices for the first friend in our order, 3 for
our second (as we can’t repeat a friend), 2 for our third (can’t choose
any of the two previously-placed friends), and 1 for our last (everyone
else is already in place!) So, in total, we have 4 ⋅ 3 ⋅ 2 ⋅ 1 = 24 ways to do
this.
However, in this process we’ve allowed all of the possible arrangements,
including ones like “A,Z,T,K” where Aang and Zuko are together. How
do we correct for this?
Well, let’s try to count all of the ways in which Aang and Zuko could
be placed together! There are two ways in which this can happen:
❼ Aang and Zuko are placed in the order “AZ.” To find all of the
arrangements where this could happen, think of all of the ways to
place three things T, K, AZ in order! There are 3 ⋅ 2 ⋅ 1 = 6
ways to do this, by the same reasoning as above.
❼ Aang and Zuko are placed in the order “ZA.” To find all of the
arrangements where this could happen, think of all of the ways to
place three things T, K, ZA in order! There are again 3 ⋅ 2 ⋅ 1 = 6
ways to do this.
Because the “AZ” and “ZA” cases are completely distinct, we add them
to get that there are 6 + 6 = 12 ways for Aang and Zuko to be placed
together in either order.
Finally, in any way of arranging our friends at all, we either have Aang
and Zuko together or not. Therefore, we add these situations together,
to get
(all arrangements) = (arrangements with Aang and Zuko together) + (arrangements with Aang and Zuko separate)
24 = 12 + (arrangements with Aang and Zuko separate).
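The “count everything, then subtract the bad cases” idea is easy to test by brute force. The sketch below lists all orderings of the four friends (abbreviated A, K, Z, T, as in the text) and sorts them into “together” and “separate”; the helper name together is ours.

```python
from itertools import permutations

friends = "AKZT"   # Aang, Korra, Zuko, Toph

def together(order):
    """True if Aang and Zuko stand next to each other in this order."""
    return abs(order.index("A") - order.index("Z")) == 1

all_orders = list(permutations(friends))
together_count = sum(1 for order in all_orders if together(order))
separate_count = len(all_orders) - together_count

print(len(all_orders), together_count, separate_count)   # 24 12 12
```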
This probably feels somewhat abstract, so let’s illustrate this with a few
examples:
Problem 3.7. Take two fair 6-sided dice and roll them. What is the
probability that the sum of these dice is 7?
Answer. On one hand, we know that there are six possible ways for
our dice to sum to 7: the six pairs (1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)
each correspond to a way to roll two dice and get a 7. (Notice that
we count (1, 6) and (6, 1) differently! This is because our two dice are
different, and so rolling a 1 on the first die and a 6 on the second is a
different event than rolling a 6 followed by a 1.)
On the other hand, we know that there are 36 possible outcomes for a
given roll of our two dice; this is because there are 6 choices for what
the first die can be, 6 for the second, and we multiply these together.
Thus, the probability that we roll two dice and they sum to 7 is $\frac{6}{36} = \frac{1}{6}$.
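Here is the same calculation done by listing outcomes, as a minimal sketch. We use Python's Fraction type so the answer stays an exact fraction; the variable names are ours.

```python
from fractions import Fraction
from itertools import product

# All 36 outcomes (first die, second die) of rolling two six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# The outcomes whose values sum to 7.
favourable = [roll for roll in outcomes if sum(roll) == 7]

probability = Fraction(len(favourable), len(outcomes))
print(favourable)
print(probability)   # 1/6
```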
Problem 3.8. You and three of your friends are graduating. At gradu-
ation, you each throw your hats in the air to celebrate! At the end, you
each pick up a hat from the ground. What are the odds that each of you
pick up your own hat? (Assume that you’re picking up hats at random,
i.e. you’re as likely to pick up your own hat as anyone else’s hat: also
assume that for some reason no-one else is graduating, and so there are
only four possible hats to pick up.)
and first characters at the end of the string: e.g. you can transform “rad”
to “radar,” “ten” to “tenet” and “eev” to “eevee.” This process is clearly
reversible: i.e. you can take any five-letter palindrome and cut off its
last two letters to get a 3-letter string.
Any such three-letter string is formed by making three choices in a row,
and we have 26 choices for each letter; this gives us $26^3$ many such
strings, and thus $26^3$ many five-letter palindromes.
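The “glue a mirrored copy onto a three-letter string” trick can be written out as code. In this sketch, extend is our name for the transformation, and we assume the 26 lowercase English letters as the alphabet; the two-letter alphabet at the end is only there so that a full brute-force check stays small.

```python
from itertools import product
from string import ascii_lowercase

def extend(s):
    """Turn a 3-letter string into a 5-letter palindrome, e.g. rad -> radar."""
    return s + s[-2::-1]      # append the 2nd and 1st characters, in that order

print(extend("rad"), extend("ten"), extend("eev"))   # radar tenet eevee

# One palindrome per three-letter string: 26 * 26 * 26 of them.
count = sum(1 for _ in product(ascii_lowercase, repeat=3))
print(count)                  # 17576

# Brute-force check on a tiny alphabet: the construction hits every palindrome.
small = "ab"
built = {extend("".join(p)) for p in product(small, repeat=3)}
brute = {"".join(p) for p in product(small, repeat=5)
         if "".join(p) == "".join(p)[::-1]}
assert built == brute
```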
This can get a bit more complex, however. Let’s try changing our post-
card problem a bit from before:
Problem 3.9. Suppose that we have k different kinds of postcards, n
friends, and that we want to mail these postcards to our friends. Last
time, however, it was possible that we just mailed all of our postcards
to the same friend. That’s a bit silly, so let’s add in a new restriction:
let’s never send any friend more than one postcard.
In how many ways can we mail out postcards now?
n ⋅ (n − 1) ⋅ (n − 2) ⋅ . . . ⋅ (n − (k − 1))
slot!
n! = 1 ⋅ 2 ⋅ 3 ⋅ 4 ⋅ . . . ⋅ n
With this notation in mind, we can simplify our answer to the postcard
problem considerably: notice that
$n \cdot (n-1) \cdot (n-2) \cdot \ldots \cdot (n-(k-1)) = \frac{n!}{(n-k)!}.$
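To see that the product formula and the factorial formula really agree, here is a short sketch that computes both and compares them with a brute-force count. The function name ordered_choices and the sample values n = 5, k = 3 are ours; math.perm is the standard-library function for this count (Python 3.8 and later).

```python
import math
from itertools import permutations

def ordered_choices(n, k):
    """n * (n - 1) * ... * (n - (k - 1)): one factor per postcard."""
    result = 1
    for i in range(k):
        result *= n - i
    return result

n, k = 5, 3
by_product = ordered_choices(n, k)
by_factorials = math.factorial(n) // math.factorial(n - k)
by_listing = sum(1 for _ in permutations(range(n), k))

print(by_product, by_factorials, by_listing, math.perm(n, k))   # 60 60 60 60
```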
Problem 3.10. If you take a six-sided die and roll it three times, what
is the probability that you get different values each time?
Problem 3.11. Suppose that you have six lightbulbs: two identical red
bulbs, and one green / one blue / one white / one pink.
In how many ways can you screw these bulbs into a string of six lightbulb
sockets? (Assume that the order in which we string these bulbs matters:
i.e. we consider “RRGBWP” and “PWBGRR” to be different.)
Problem 3.12. Suppose that we have just one friend that we want to
send postcards to. We still have n different kinds of postcards, but now
want to send that one friend k different postcards in a bundle (say as a
gift!) In how many ways can we pick out a set of k cards to send our
friend?
Side note: In mathematics: whenever you have a problem at hand, constantly
look for modifications like these to make to the problem! If you’re stuck,
it can give you different avenues to approach or think about the problem;
conversely, if you think you understand the problem, this can be a way to
test and deepen that understanding.
In this problem, we have n different kinds of postcards, and we want to
find out how many ways to send k different cards to a given friend. At
first glance, you might think that this is the same as the answer to our
second puzzle: i.e. we have k slots, and we clearly have n choices for the
first slot, n − 1 choices for the second slot, and so on/so forth until we
have n − (k − 1) choices for our last slot.
This would certainly seem to indicate that there are $\frac{n!}{(n-k)!}$ many ways
to assign cards. However, our situation from before is not quite the same
as the one we have now! In particular: notice that the order in which
we pick our postcards to send to this one friend does not matter to our
friend, as they will receive them all at once anyways! Therefore, our
process above is over-counting the total number of ways to send out
postcards: it would think that sending card X and Y is a different action
to sending card Y and card X!
To fix this, we need to correct for our over-counting errors above. Notice
that for any given set of k distinct cards, there are k! different ways to
order that set: this is because in ordering a set of k things, you make
k choices for where to place the first element, k − 1 choices for where to
place the 2nd element, and so on / so forth until you have just 1 choice
for the k-th element.
Therefore, if we are looking at the collection of ordered length-k se-
quences of cards, each unordered sequence of k cards corresponds to
k! elements in this ordered sequence! That is, we have the following
equality:
$(\text{Unordered ways to pick } k \text{ cards from } n \text{ choices}) = \frac{n!}{k!\,(n-k)!}$
This concept (given a set of n things, in how many ways can we pick k
of them, if we don’t care about the order in which we pick those elements?)
is an incredibly useful one, and as such lends itself to the following
definition:
Definition 3.2. (Unordered choice without repetition.) The binomial
coefficient $\binom{n}{k}$ is the number of ways to choose k things from n
choices, if repeated choices are not allowed and the order of those choices
does not matter.
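The “divide out the k! orderings” argument can be checked numerically. The sketch below compares the formula n!/(k!(n − k)!) with a direct listing of unordered choices; the function name binomial and the sample values are ours, and math.comb is the standard-library version of the same quantity (Python 3.8 and later).

```python
import math
from itertools import combinations, permutations

def binomial(n, k):
    """n! / (k! (n - k)!), computed with exact integer arithmetic."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

n, k = 5, 3
ordered = sum(1 for _ in permutations(range(n), k))     # order matters
unordered = sum(1 for _ in combinations(range(n), k))   # order ignored

print(ordered, unordered, binomial(n, k))   # 60 10 10

# Each unordered choice was counted k! times in the ordered count.
assert ordered == unordered * math.factorial(k)
```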
It first bears noting that this problem does not fall under the situations
of our earlier problems. In this problem, our choices are unordered:
i.e. we’re just picking out a bundle of cards to buy, and the order in
which they’re bought is irrelevant. Therefore, we cannot use the “or-
dered choice with repetitions” observation we made earlier, as this would
massively overcount things (i.e. we’d count orders of the same cards as
different if the cashier rang things up in a different order, which is silly.)
However, unlike our two “ordered/unordered choice without repeats”
situations, we can repeat choices! This means that this is not at all
like those situations: in particular, k can be larger than n and we will
still have lots of possibilities here, whereas in the “without repeats”
situations this was always impossible. So we need a new method!
To develop this method, think of the n different kinds of postcards as n
“bins.” Here’s a visualization for when n = 5:
Picking out k cards to buy, then, can be thought of as pulling a few cards
from the first bin, a few from the second, and so on/so forth until we’ve
pulled out k cards in total. In other words, this is the same problem
as distributing k balls amongst n bins:
Now, forget the difference between objects and dividers! That is,
take the diagram above and suppose that you cannot tell the difference
between an object and a divider between our choices.
? ? ? ? ? ? ? ? ? ? ? ? ?
How can we return this back to a way to choose k things from n choices?
Well: take the set of k + (n − 1) objects, of which k used to be things
and n − 1 were dividers. Now choose n − 1 of them to be dividers! This
returns us back to a way to pick out k things from n choices.
? ? ? ? ? ? ? ? ? ? ? ? ?
To practice this, let’s answer our last two problems of this chapter:
Problem 3.15. Suppose that we have ten identical cookies, and want
to distribute them to four of our friends, so that each friend gets at least
one cookie. In how many ways can we do this?
Answer. First, if every friend gets at least one cookie, we can start by
just distributing one cookie to each friend! This leaves us with 6 cookies
left over to distribute further to our friends.
If we think of taking each cookie and “choosing” a friend to give it to, this
is unordered choice with repetition: the order doesn’t matter because the
cookies are identical, and repeats are allowed because we can give one
friend multiple cookies.
By our formula above, then, there are $\binom{6+4-1}{4-1} = \binom{9}{3} = \frac{9 \cdot 8 \cdot 7}{3 \cdot 2} = 84$ ways
for this to happen.
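To double-check the cookie answer, the sketch below compares the “objects and dividers” formula with a brute-force count. The function name multichoose is ours; it takes the number of choices n and the number of things k, and chooses which n − 1 of the k + (n − 1) positions are dividers.

```python
import math
from itertools import product

def multichoose(n, k):
    """Unordered choice of k things from n choices, repeats allowed."""
    return math.comb(k + n - 1, n - 1)

# After handing one cookie to each of the 4 friends, 6 cookies remain.
by_formula = multichoose(4, 6)

# Brute force: every way to give each friend between 1 and 10 cookies,
# keeping only those that use exactly 10 cookies in total.
by_listing = sum(1 for shares in product(range(1, 11), repeat=4)
                 if sum(shares) == 10)

print(by_formula, by_listing)   # 84 84
```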
Exercise 4.1. Two processes that you can use to multiply two nonnegative
integers a, b together are listed below:
Algorithm 4.1.
1. Define a new number prod, and initialize it (i.e. set it equal) to 0.
2. If a = 0, stop, and return the number prod.
3. Otherwise, add b to prod, and subtract 1 from a. Then go to 2.
Algorithm 4.2.
1. Define a new number prod, and initialize it (i.e. set it equal) to 0.
2. If a = 0, stop, and return the number prod.
3. Otherwise, if a is odd, subtract 1 from a and set prod = prod + b.
4. Divide a by 2, and multiply b by 2.
5. Go to step 2.
In general, which of these is the faster way to multiply two numbers?
Why?
An example run of Algorithm 4.1 when a = 3, b = 12:
step        a    b    prod
1           3    12   0
2
3           2    12   12
2
3           1    12   24
2
3           0    12   36
2 (halt!)
Similarly, an example run of Algorithm 4.2 when a = 3, b = 12:
step        a    b    prod
1           3    12   0
2
3           2    12   12
4           1    24   12
2
3           0    24   36
4           0    48   36
2 (halt!)
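If you want to experiment with these two processes, here is one way to write them in Python. This is only a sketch: the function names are ours, and each function also returns how many times it went around its loop, which is one rough way (our choice, not the only one) to compare how much work each process does.

```python
def multiply_by_adding(a, b):
    """Algorithm 4.1: add b to prod, once for each unit of a."""
    prod = 0
    passes = 0
    while a != 0:               # step 2
        prod += b               # step 3
        a -= 1
        passes += 1
    return prod, passes

def multiply_by_halving(a, b):
    """Algorithm 4.2: halve a and double b on every pass."""
    prod = 0
    passes = 0
    while a != 0:               # step 2
        if a % 2 == 1:          # step 3
            a -= 1
            prod += b
        a //= 2                 # step 4
        b *= 2
        passes += 1
    return prod, passes

print(multiply_by_adding(3, 12))      # (36, 3)
print(multiply_by_halving(3, 12))     # (36, 2)
print(multiply_by_adding(1000, 7))    # (7000, 1000)
print(multiply_by_halving(1000, 7))   # (7000, 10)
```

Try a few larger values of a and watch how the two pass counts grow.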
4.1 Functions in General
In your high-school mathematics classes, you’ve likely seen functions described
as things like “f (x) = 2x + 3” or “g(x, y) = max(x, y).” When we’re
writing code, however, we don’t do this! That is: in most programming
languages, you can’t just type in expressions like the ones before and
trust that the computer will understand what you mean. Instead, you’ll
often write something like the following:
return result ;
}
Notice how in this example we didn’t just get to define the rules for our
function: we also had to specify the kind of inputs the function should
expect, and also the type of outputs the function will generate! On one
hand, this creates more work for us at the start: we can’t just tell people
our rules, and we’ll often find ourselves having to go back and edit our
functions as we learn what sorts of inputs we actually want to generate.
On the other hand, though, this lets us describe a much broader class
of functions than what we could do before! Under our old high-school
While the objects above have had relatively “nice” rules, not all functions
can be described quite so cleanly! Consider the following example:
Note that the range is usually different to the codomain! In the examples
we studied earlier, we saw the following:
❼ f ∶ Z → Q defined by the rule $f(n) = \frac{1}{n^2+1}$ does not output every
rational number! Amongst other values, it will never output any
number greater than 1 (as $\frac{1}{n^2+1} \leq \frac{1}{0^2+1} = 1$ for every integer n.)
As such, its codomain (Q) is not equal to its range.
❼ The function f ∶ A → B from Example 4.3, that takes in any
student at the University of Auckland and outputs their student
ID, does not output every integer: amongst other values, it will
never output a negative integer! As such, its codomain (B) is not
equal to its range.
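For the first function above, f(n) = 1/(n² + 1), we can get a feel for the range by computing a batch of outputs. The sketch below only samples the integers from −50 to 50 (our choice of window), so it illustrates the claim rather than proving it.

```python
from fractions import Fraction

def f(n):
    """f(n) = 1 / (n^2 + 1), returned as an exact rational number."""
    return Fraction(1, n * n + 1)

outputs = {f(n) for n in range(-50, 51)}

print(max(outputs))                    # 1, attained at n = 0
print(Fraction(1, 3) in outputs)       # False: n^2 + 1 = 3 has no integer solution
print(all(0 < value <= 1 for value in outputs))   # True
```

So rationals like 2, or even 1/3, sit in the codomain Q without ever being outputs.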
Example 4.6. Consider the emoji function from Example 4.4. If we look
at the diagram we drew before, we can see that our function generates
three possible outputs: , and . Therefore, the collection of
these three emojis is our range!
Example 4.7. Let A be the collection of all pairs of words in the En-
glish language, and B be the two values {true, false}. Define the func-
tion f ∶ A → B by saying that f (w1 , w2 ) = true if the words w1 , w2
rhyme, and false otherwise. For example, f (cat, bat) = true, while f (cat,
cataclysm) = false.
The range of this function is {true, false}, i.e. the same as its codomain!
It is possible for the range and codomain to agree. (If this happens, we
call such a function a surjective function. We’re not going to focus
on these functions here, but you’ll see more about them in courses like
Compsci 225 and Maths 120/130!)
Example 4.8. Let R denote the set of all real numbers (i.e. all numbers
regardless of whether they’re rational or irrational; alternately, anything
you can describe with a decimal expansion.) Define the function f ∶ R →
R by the rule $f(x) = 2^x$.
This function has range equal to the set of all positive numbers! This
takes more math to see than we currently have: again, take things like
Maths 130 to see the “why” behind this. However, if you draw a graph
of $2^x$ you’ll see that the outputs (i.e. y-values) range over all of the
[Figure: graph of $2^x$ for x from −6 to 2, with y-values from 0 to 3.]
To close this section, we give a useful bit of notation for talking about
functions defined in terms of other functions: function composition.
Fact 4.2. Given any two functions f ∶ B → C, g ∶ A → B, we can combine
these functions via function composition: that is, we can define the
function f ○ g ∶ A → C, defined by the rule f ○ g(x) = f (g(x)). We
pronounce the small open circle symbol ○ as “composed with.”
Example 4.9. ❼ If f, g ∶ R → R are defined by the rules f (x) = x + 1
and $g(x) = x^2 - 1$, then we would have $g \circ f(x) = g(f(x)) = g(x+1) =
(x+1)^2 - 1 = x^2 + 2x$.
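In code, composing functions just means feeding the output of one into the other. Here is a minimal sketch; compose is a helper name of our own, and f, g are the two rules from the example above.

```python
def compose(outer, inner):
    """Return the function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))

def f(x):
    return x + 1

def g(x):
    return x * x - 1

g_after_f = compose(g, f)     # g composed with f

# g(f(x)) should agree with x^2 + 2x at every input we try.
for x in [-2, 0, 1, 3]:
    assert g_after_f(x) == x * x + 2 * x

print(g_after_f(3))   # 15
```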
Handy!
Notice that in the definition above, we required that the domain of f was
the codomain of g. That is: if we wanted to study f ○ g(x) = f (g(x)),
we needed to ensure that every output of g is a valid input to f .
This makes sense! If you tried to compose functions whose domains and
codomains did not match up in this fashion, you’d get nonsense / crashes
when the inner function g returns an output at which the outer function
is undefined. For example:
Example 4.10. ❼ If f ∶ R ∖ {0} → R is defined by the rule $f(x) = \frac{1}{x}$
and g ∶ R → R is defined by the rule $g(x) = x^2 - 1$, then you might
think that $f \circ g(x) = f(g(x)) = \frac{1}{x^2-1}$.
However, this is not a function! When x = ±1, for example, we
have $f(g(\pm 1)) = \frac{1}{(\pm 1)^2 - 1} = \frac{1}{0}$, which is undefined. This is why we
insist that the codomain of g is the domain of f ; we need all of g’s
outputs to be valid inputs to f .
❼ Let A be the set of all people in your tutorial room, B be the set
of all ID numbers of UoA students, and C be the set of all ID
numbers of Compsci 120 students. Then f ∶ C → R defined by
taking any Compsci 120 student’s ID and outputting their grade
on the mid-sem test is a function; as well, g ∶ A → B, defined by
mapping each person in your tutorial room to their ID number is
a function.
However, f ○ g, the function that tries to take each person in your
tutorial room and output their mid-sem test score, is undefined! In
particular, your tutor is someone in your tutorial room, who even
though they do have an ID number, will not have a score on the
mid-sem test. Another reason to insist that the codomain of g is
the domain of f !
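The first bullet above is exactly the kind of mismatch that crashes a program. In this sketch (f_after_g is our name for the attempted composition), Python raises a ZeroDivisionError at the inputs where g hands f a value outside f's domain.

```python
def f(x):
    return 1 / x          # only defined when x is not 0

def g(x):
    return x * x - 1

def f_after_g(x):
    return f(g(x))

print(f_after_g(2))       # fine: g(2) = 3 is a valid input to f

try:
    f_after_g(1)          # g(1) = 0 is NOT a valid input to f
except ZeroDivisionError:
    print("f(g(1)) is undefined: g(1) = 0 is outside the domain of f")
```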
4.2 Algorithms
In the previous section, we came up with a “general” concept for function
that we claimed would be better for our needs in computer science, as
it would let us think of things like
else
result = num2 ;
return result ;
}
Second, we can turn Claim 1.6 into an algorithm for how to tell if a
number is prime:
Algorithm 4.4. This is an algorithm that takes in a positive integer n,
and determines whether or not n is prime. It proceeds as follows:
❼ If n = 1, stop: n is not prime.
❼ Otherwise, if n > 1, find all of the numbers 2, 3, 4, . . . ⌊√n⌋. Take
each of these numbers, and test whether they divide n.
❼ If one of them does, then n is not prime!
❼ Otherwise, if none of them divide n, then by Claim 1.6, n is prime.
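The steps above can be sketched in Python as follows. The function name is_prime is ours, and we use the standard-library function math.isqrt (Python 3.8 and later) to compute ⌊√n⌋ exactly.

```python
import math

def is_prime(n):
    """Decide whether the positive integer n is prime, by trial division."""
    if n == 1:
        return False                        # 1 is not prime
    for d in range(2, math.isqrt(n) + 1):   # d = 2, 3, ..., floor(sqrt(n))
        if n % d == 0:
            return False                    # d divides n, so n is not prime
    return True                             # no divisor found up to sqrt(n)

print([n for n in range(1, 30) if is_prime(n)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```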
Back in our first chapter, to understand our two previous algorithms 4.3
and 4.4 we started by running these algorithms on a few example inputs!
In general, this is a good tactic to use when studying algorithms; actually
plugging in some concrete inputs can make otherwise-obscure instructions
much simpler to understand.
To do so here, let’s run our algorithm on the list (1, 7, 1, 0), following
each step as written:
list        step   locmin   valmin   current k   current lk
(1,7,1,0)   1
(1,7,1,0)   2      1        1
(1,7,1,0)   3                        2           7
(1,7,1,0)   3                        3           1
(1,7,1,0)   3                        4           0
(1,7,1,0)   3(a)   4        0
(0,7,1,1)   4
(0,7,1,1)   5
(0,7,1,1)   1
(0,7,1,1)   2      2        7
(0,7,1,1)   3                        3           1
(0,7,1,1)   3(a)   3        1
(0,7,1,1)   3                        4           1
(0,1,7,1)   4
(0,1,7,1)   5
(0,1,7,1)   1
(0,1,7,1)   2      3        7
(0,1,7,1)   3                        4           1
(0,1,7,1)   3(a)   4        1
(0,1,1,7)   4
(0,1,1,7)   5
(0,1,1,7)   1
Proof. In general, there are three things we need to check to show that
a given algorithm works:
❼ The algorithm doesn’t have any bugs: i.e. every step of the
process is defined, you don’t have any division by zero things or
undefined cases, or stuff like that.
This is true here! The only steps we perform in this algorithm
are comparisons and swaps, and the only case we encounter is
“valmin > lk is true” or “valmin > lk is false,” which clearly covers
all possible situations. As such, there are no undefined cases or
undefined operations.
❼ The algorithm doesn’t run forever: i.e. given a finite input,
the algorithm will eventually stop and not enter an infinite loop.
This is also true here! To see why, let’s track the number of com-
parisons and write operations performed by this process, given a
list of length n as input:
– Step 1: one comparison (we checked the size of the list.)
– Step 2: two write operations (we defined valmin , locmin .)
So, in total, what does this mean? Well: by the above, we know
that l1 ≤ l2 ≤ l3 ≤ . . . ≤ ln , as by definition each element is smaller
than all of the ones that come afterwards. In other words, this list
is sorted!
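For readers who like to see things as code: the sketch below is one way to write a sorting process of this kind in Python. It is an assumption-laden reconstruction, based only on the names locmin and valmin and the comparison “valmin > lk” used above: find the smallest remaining value, swap it to the front of the unsorted part, and repeat. Positions here count from 0, as is usual in Python.

```python
def selection_sort(values):
    """Sort a list by repeatedly swapping the smallest remaining value forward."""
    lst = list(values)                  # work on a copy
    n = len(lst)
    for start in range(n - 1):
        # Assume the first unsorted entry is the minimum, then scan the rest.
        locmin, valmin = start, lst[start]
        for k in range(start + 1, n):
            if valmin > lst[k]:         # found something strictly smaller
                locmin, valmin = k, lst[k]
        # Swap the minimum into the first unsorted position.
        lst[start], lst[locmin] = lst[locmin], lst[start]
    return lst

print(selection_sort([1, 7, 1, 0]))     # [0, 1, 1, 7]
```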
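❼ $\begin{bmatrix} -0.15 & 0.28 \\ 0.26 & 0.24 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 0 \\ 0.44 \end{bmatrix}$
❼ $\begin{bmatrix} 0 & 0 \\ 0 & 0.16 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \end{bmatrix}$.
Side note: This is the precise math-y way of describing the operations
we’re doing to these rectangles!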
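To see what a “matrix times (x, y), plus a shift” rule does to a single point, here is a small sketch using the two sets of numbers listed above. The helper affine and the names first_map and second_map are ours; the arithmetic is just the usual row-by-column matrix product followed by adding the shift.

```python
def affine(matrix, shift):
    """Return the map (x, y) -> matrix * (x, y) + shift."""
    (a, b), (c, d) = matrix
    e, f = shift
    return lambda x, y: (a * x + b * y + e, c * x + d * y + f)

first_map = affine(((-0.15, 0.28), (0.26, 0.24)), (0.0, 0.44))
second_map = affine(((0.0, 0.0), (0.0, 0.16)), (0.0, 0.0))

print(first_map(0.0, 0.0))    # the origin is shifted up to (0.0, 0.44)
print(second_map(1.0, 1.0))   # (0.0, 0.16): everything is squashed onto the y-axis
```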
For example, T (S0 ), i.e. T applied to our fern spore shape, is the
following:
$11 \cdot 13 = \underbrace{13 + 13 + 13 + 13 + 13 + 13 + 13 + 13 + 13 + 13 + 13}_{11 \text{ times}} = 143.$
Now, however, suppose that someone asked you to calculate 12⋅13. While
you could use the process above to find this product, you could also
shortcut this process by noting that
12 ⋅ 13 = 13 + 11 ⋅ 13 = 13 + 143 = 156.
This sort of trick is essentially recursion! That is: if you want, you could
define the product n ⋅ k recursively for any nonnegative integer n by the
following two-step process:
❼ 0 ⋅ k = 0, for every k.
❼ For any n ≥ 1, n ⋅ k = k + (n − 1) ⋅ k.
The second bullet point is a recursive definition, because we defined
multiplication in terms of itself! In other words, when we say that 12⋅13 =
13 + 11 ⋅ 13, we’re really thinking of multiplication recursively: we’re not
trying to calculate the whole thing out all at once, but instead are trying
to just relate our problem to a previously-worked example.
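The two-bullet definition above translates almost word for word into code. Here is a minimal sketch in Python; the function name multiply is our own choice, and (as the text notes elsewhere) this is an illustration of recursion rather than a good way to multiply in practice.

```python
def multiply(n: int, k: int) -> int:
    """Compute n * k using only addition, following the recursive definition."""
    if n < 0:
        raise ValueError("n must be a nonnegative integer")
    if n == 0:
        # First bullet: 0 * k = 0, for every k.
        return 0
    # Second bullet: n * k = k + (n - 1) * k, for any n >= 1.
    return k + multiply(n - 1, k)


print(multiply(11, 13))  # 143
print(multiply(12, 13))  # 156, i.e. 13 + multiply(11, 13)
```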
Exponentiation does a similar thing! Because exponentiation is just
repeated multiplication, we know that
2^10 = 2 ⋅ 2 ⋅ 2 ⋅ 2 ⋅ 2 ⋅ 2 ⋅ 2 ⋅ 2 ⋅ 2 ⋅ 2 (10 times) = 1024.
❼ a^0 = 1, and
❼ For any n ≥ 1, a^n = a ⋅ a^(n−1).
In other words, with this idea, we can just say that
2^11 = 2 ⋅ 2^10 = 2 ⋅ 1024 = 2048,
instead of calculating the entire thing from scratch!
These ideas can be useful if you’re working with code where you’re calculating a bunch of fairly similar products/exponents/etc. With recursion, you can store some precalculated values and just do a few extra steps of working rather than doing the whole thing out by scratch each time. Efficiency!
In classes like Compsci 220 / 320, you’ll study the idea of efficiency in depth, and come up with more sophisticated ideas than this one! In most practical real-life situations there are better ways to implement multiplication and exponentiation than this recursive idea; however, it can be useful in some places, and the general principle of “storing commonly-calculated values and extrapolating other values from those recursively” is one that does come up in lots of places!
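To make the “store some precalculated values” idea concrete, here is a small Python sketch. It assumes we are happy to let the standard library’s functools.lru_cache do the storing for us; the function name power is our own.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # remembers every (a, n) pair we have already computed
def power(a: int, n: int) -> int:
    """Compute a**n via a^0 = 1 and a^n = a * a^(n-1)."""
    if n == 0:
        return 1
    return a * power(a, n - 1)


print(power(2, 10))  # 1024, computed by recursing all the way down to 2^0
print(power(2, 11))  # 2048, one extra multiplication on top of the stored 2^10
```

After the first call, the second call only does one new multiplication: the value of 2^10 is looked up rather than recalculated.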
        1  2  3  4  5  6
Large   0  1  1  2  3  5
Small   1  0  1  1  2  3
Total   1  1  2  3  5  8
We call relations like the one above recurrence relations, and will study them in greater depth when we get to induction as a proof method.
For now, though, notice that they are remarkably useful objects! By generalizing the model above (i.e. subtracting a_(n−3) to allow for old age killing off amoebas, or having a −c ⋅ (a_n)^2 term to denote that as the population grows too large, predation or starvation will cause the population to die off, etc.) one can basically model an arbitrarily-complicated population over the long term.
These are particularly cool because they actually accurately model predator/prey relations in real life! See Isle Royale, amongst other examples.
For more details on these sorts of population-modelling processes, see the logistic map, the Lotka-Volterra equations, and most of the applied mathematics major here at Auckland!
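If you’d like to watch a recurrence relation run, here is a short Python sketch that regenerates the Large/Small/Total table. The update rule used here is an assumption read off from the table itself: each step, every small amoeba becomes large, and every large amoeba stays around and contributes one new small amoeba.

```python
def amoeba_table(steps: int):
    """Return rows (step, large, small, total) for the first `steps` time steps."""
    large, small = 0, 1  # step 1 in the table: no large amoebas, one small one
    rows = []
    for step in range(1, steps + 1):
        rows.append((step, large, small, large + small))
        # Assumed rule: smalls grow up, and each large produces one new small.
        large, small = large + small, large
    return rows


for row in amoeba_table(6):
    print(row)
```

The totals this prints are 1, 1, 2, 3, 5, 8, matching the table: each total is the sum of the previous two.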
It bears noting that the amoeba recurrence relation is not the first recur-
rence relation you’ve seen! If you go back to Algorithm 4.5, we proved
that
n                       1   2   3   4   5   6   7
InsertionSortSteps(n)   1  10  22  37  55  76  100
We don’t have the techniques to prove this just yet. If you would like
to see a proof, though, skip ahead to the induction section of our proofs
chapter! We’ll tackle this problem there (along with another recurrence
relation), in Section 7.8.
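While the proof has to wait, nothing stops us from checking a formula against the table by computer. The sketch below uses the expression (3n^2 + 9n − 10)/2, which is the InsertionSort step count this chapter uses when comparing sorting algorithms; the function name is our own.

```python
def insertion_sort_steps(n: int) -> int:
    """Closed-form step count (3n^2 + 9n - 10) / 2 used later in this chapter."""
    # 3n^2 + 9n = 3n(n + 3) is always even, so integer division is exact here.
    return (3 * n * n + 9 * n - 10) // 2


table = [1, 10, 22, 37, 55, 76, 100]  # the values of InsertionSortSteps(1..7)
computed = [insertion_sort_steps(n) for n in range(1, 8)]
print(computed == table)
```

A check like this is evidence rather than proof: it only tells us the formula agrees with the first seven values.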
Here’s a sample run of this algorithm on the list (1, 7, 1, 0), where I’ve
used random.org to shuffle our list when needed by step 3:
list       iteration  step  sorted  |  list       iteration  step  sorted  |  list       iteration  step  sorted
(1,7,1,0)  1          1     no      |  (7,0,1,1)  6          1     no      |  (1,7,0,1)  11         1     no
(1,1,0,7)             2             |  (1,1,0,7)             2             |  (1,7,1,0)             2
(1,1,0,7)  2          1     no      |  (1,1,0,7)  7          1     no      |  (1,7,1,0)  12         1     no
(7,0,1,1)             2             |  (1,1,7,0)             2             |  (1,7,1,0)             2
(7,0,1,1)  3          1     no      |  (1,1,7,0)  8          1     no      |  (1,7,1,0)  13         1     no
(1,0,1,7)             2             |  (0,1,7,1)             2             |  (0,1,7,1)             2
(1,0,1,7)  4          1     no      |  (0,1,7,1)  9          1     no      |  (0,1,7,1)  14         1     no
(7,1,1,0)             2             |  (7,1,0,1)             2             |  (0,1,1,7)             2
(7,1,1,0)  5          1     no      |  (7,1,0,1)  10         1     no      |  (0,1,1,7)  15         1     yes!
(7,0,1,1)             2             |  (1,7,0,1)             2             |
This . . . is not great. If you used BogoSort to sort a deck of cards, your
process would look like the following:
❼ One-by-one, go through your deck of cards and see if they’re or-
dered.
❼ If during this process you spot any cards that are out of order,
throw the whole deck in the air, collect the cards together, and
start again.
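The card-throwing process above can be sketched in a few lines of Python. This is an illustration under our own naming (is_sorted, bogo_sort), using Python’s random module in place of random.org; it also counts shuffles so you can see how unlucky you were.

```python
import random


def is_sorted(items) -> bool:
    """Go through the list one-by-one and check that nothing is out of order."""
    return all(items[i] <= items[i + 1] for i in range(len(items) - 1))


def bogo_sort(items, rng=None):
    """Shuffle until sorted. Returns (sorted list, number of shuffles used)."""
    rng = rng or random.Random()
    items = list(items)  # work on a copy
    shuffles = 0
    while not is_sorted(items):
        rng.shuffle(items)  # "throw the whole deck in the air"
        shuffles += 1
    return items, shuffles


result, shuffles = bogo_sort([1, 7, 1, 0])
print(result, "after", shuffles, "shuffles")
```

Nothing in this loop bounds the number of shuffles, so please only try it on very short lists.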
By studying the running time of this algorithm, we can make “not great”
into something rigorous:
Claim 4.3. The worst-case running time for BogoSort to sort any list
containing more than one element is ∞.
In terms of running time, then, we’ve shown that BogoSort has a strictly worse runtime than InsertionSort, as ∞ > (3n^2 + 9n − 10)/2. Success!
This sort of comparison was particularly easy to perform, as one of the
two things we were comparing was ∞. However, this comparison process
can get trickier if we examine more interesting algorithms. Let’s consider
a third sorting algorithm:
Algorithm 4.7. The following algorithm, MergeSort(L), takes in a list
L = (l_1, l_2, . . . , l_n) of n numbers and orders it from least to greatest. It
does this as follows:
1. If L contains at most one number, L is trivially sorted! In this
situation, stop.
2. Otherwise, L contains at least two numbers. In this case,
(a) Split L in half into two lists L1 , L2 .
(b) Apply MergeSort to each of L1 , L2 to sort them.
3. Now, we “merge” these two sorted lists:
(a) Create a big list with n entries in it, all of which are initially
blank.
(b) Compare the first element in L1 to the first element in L2 .
If L1 , L2 are both sorted, these first elements are the smallest
elements in L1 , L2 .
(c) Therefore, the smaller of those two first elements is the small-
est element in our entire list. Take it, remove it from the list
it came from, and put it in the first blank location in our big
list.
(d) Repeat (b)+(c) until our big list is full!
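Here is one way the steps above might look in Python. It is a sketch, not the only way to code MergeSort: rather than physically removing elements from L1 and L2 it keeps an index into each list, and it assumes that once one list runs out we simply copy over what is left of the other (a detail the steps above leave implicit).

```python
def merge_sort(values):
    """Return a new list containing `values` ordered from least to greatest."""
    # Step 1: a list with at most one number is trivially sorted.
    if len(values) <= 1:
        return list(values)

    # Step 2: split in half and sort each half recursively.
    middle = len(values) // 2
    left = merge_sort(values[:middle])
    right = merge_sort(values[middle:])

    # Step 3: merge by repeatedly taking the smaller of the two first elements.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One list is exhausted; whatever remains in the other is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([1, 7, 1, 0]))  # [0, 1, 1, 7]
```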
Here, we use the colored boxes to help us visualize the recursive appli-
cations of MergeSort to smaller and smaller lists.
As before, it is worth taking a moment to explain why this algorithm
works:
k                     1   2    3    4    5     6     7
MergeSortSteps(2^k)  11  39  111  287  703  1663  3839
(Note that we’re using 2^k as the length of our lists. This is because our
result only works for even numbers, so we want something that stays
even when we keep dividing it by 2.)
Spotting the pattern here is a pain without more advanced mathematics;
even the Online Encyclopedia of Integer Sequences doesn’t recognize it.
WolframAlpha, however, does!
Again, this is a claim whose proof will have to wait for Section 7.8. For
now, though, let’s take the following as given:
Claim 4.6. MergeSortSteps(2^k) = k ⋅ 2^(k+2) − 2^(k+1) − 1, for every natural
number k.
We can easily extend this to a list whose length is not a power of two
by just “rounding its length up to the nearest power of 2” by adding in
some blank cells: i.e. if we had a list of length 28, we’d add in 4 blank
cells to get a list of length 32.
If we do this, then Claim 4.6 becomes the following:
Observation 4.10. MergeSortSteps(n) ≤ k ⋅ 2^(k+2) − 2^(k+1) − 1, where 2^k
is n rounded up to the nearest power of 2.
Recall that ⌈x⌉ is “x rounded up.”
To simplify this a bit to just write things in terms of n, notice that if 2^k
is n rounded up to the nearest power of 2, then k = ⌈log_2(n)⌉. Plug this
into Observation 4.10, and you’ll get the following:
MergeSortSteps(n) ≤ ⌈log_2(n)⌉ ⋅ 2^(⌈log_2(n)⌉+2) − 2^(⌈log_2(n)⌉+1) − 1
                  ≤ (log_2(n) + 1) ⋅ 2^((log_2(n)+1)+2) − 2^((log_2(n)+1)+1) − 1
                  = (log_2(n) + 1) ⋅ 2^(log_2(n)) ⋅ 2^3 − 2^(log_2(n)) ⋅ 2^2 − 1
                  = 8n log_2(n) + 8n − 4n − 1
                  = 8n log_2(n) + 4n − 1.
Two useful tricks we’re using in this calculation:
❼ x ≤ ⌈x⌉. That is: rounding a number up only makes it larger.
❼ ⌈x⌉ < x + 1. That is: rounding up never increases a number by more than 1.
8n log_2(n) + 4n − 1   vs.   (3n^2 + 9n − 10)/2

n                       1   2   3     4   5      6      7
(3n^2 + 9n − 10)/2      1  10  22    37  55     76     100
8n log_2(n) + 4n − 1    3  23  49.0  79  111.9  147.1  184.2
It looks like the MergeSort algorithm needs more steps than InsertionSort
so far. However, this table only calculated the first few values of n; in
real life, however, we often find ourselves sorting huge lists! So: in the
long run, which should we prefer?
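Before bringing in any theory, we can simply ask a computer to extend the table. The Python sketch below evaluates the two expressions being compared, (3n^2 + 9n − 10)/2 for InsertionSort and the bound 8n log_2(n) + 4n − 1 for MergeSort, and searches for the first n at which the MergeSort expression becomes the smaller one. The function names are our own.

```python
import math


def insertion_steps(n: int) -> float:
    return (3 * n**2 + 9 * n - 10) / 2


def merge_bound(n: int) -> float:
    return 8 * n * math.log2(n) + 4 * n - 1


def first_n_where_merge_is_smaller(limit: int = 10**6):
    """Smallest n <= limit with merge_bound(n) < insertion_steps(n), else None."""
    for n in range(1, limit + 1):
        if merge_bound(n) < insertion_steps(n):
            return n
    return None


for n in (10, 100, 1000, 10**6):
    print(n, insertion_steps(n), round(merge_bound(n), 1))
print("MergeSort's bound first wins at n =", first_n_where_merge_is_smaller())
```

Running this suggests that the early lead InsertionSort has in the table does not last; the rest of this section develops the tools to say so rigorously.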
To answer this, we need the idea of a limit:
Definition 4.4. Given any function f that is defined on the natural
numbers N, we say that lim_{n→∞} f(n) = L if “as n goes to infinity, f(n)
gets closer and closer to L.” In particular, we say that lim_{n→∞} f(n) = +∞
If you would like a more rigorous definition here than “gets close,” look into
classes like Maths 130 and Maths 250!
Problem 4.1. What is lim_{n→∞} 1/(2 − 1/n)?
Answer. lim_{n→∞} log_2(n) = ∞. This is because for any positive integer k,
we can make log_2(n) ≥ k by setting n to be any value ≥ 2^k.
In other words, as n grows, it eventually gets larger than 2^k for any fixed
k, and thus log_2(n) itself grows without bound.
Problem 4.3. What is lim_{n→∞} 1/log_2(1/n)?
Answer. To find this limit, we simply break our function down into
smaller pieces:
❼ First, notice that as n takes on increasingly large positive values,
1/n goes to 0 and stays positive.
❼ Therefore, log_2(1/n) goes to negative infinity, as log_2(tiny positive
numbers) yields increasingly huge negative numbers.
❼ Therefore, 1/log_2(1/n) goes to 0, as 1/(huge negative numbers)
yields tiny negative numbers.
In total, then, lim_{n→∞} 1/log_2(1/n) = 0.
Problem 4.4. What is lim_{n→∞} (n^2 + 3n − 4)/(n^3 + 2)?
Answer. It is tempting to just “plug in infinity” into the fraction above,
and say that
“because (∞^2 + 3∞ − 4)/(∞^3 + 2) = ∞/∞ = 1, our limit is 1.”
However, you can’t do manipulations like this with infinity! For example,
because 1/n = n/n^2, we have
lim_{n→∞} n/n^2 = lim_{n→∞} 1/n = 0,
even though the method above would say that
“because ∞/∞^2 = ∞/∞ = 1, our limit is 1.”
The issue here is that there are different growth rates at which various
expressions approach infinity: i.e. in our example above, n^2 approaches
infinity considerably faster than n, and so the ratio n/n^2 approaches 0 even
though the numerator and denominator individually approach infinity.
Instead, if we ever have both the numerator and denominator approaching
+∞, we need to first simplify our fraction to proceed further! In
this problem, notice that if we divide both the numerator and the denominator
by n^3, the highest power present in either the numerator or
denominator, we get the following:
(n^2 + 3n − 4)/(n^3 + 2) ⋅ (1/n^3)/(1/n^3) = (1/n + 3/n^2 − 4/n^3)/(1 + 2/n^3).
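As a sanity check on this “divide by the highest power” step, the Python sketch below evaluates the fraction from Problem 4.4 both before and after dividing through by n^3. The two function names are our own. In the divided form, every term in the numerator shrinks toward 0 while the denominator approaches 1, which points toward a limit of 0.

```python
def original(n: int) -> float:
    return (n**2 + 3 * n - 4) / (n**3 + 2)


def divided_by_n_cubed(n: int) -> float:
    return (1 / n + 3 / n**2 - 4 / n**3) / (1 + 2 / n**3)


for n in (10, 100, 1000, 10**6):
    # The two columns agree (up to rounding), and both shrink as n grows.
    print(n, original(n), divided_by_n_cubed(n))
```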
With the idea of limits in mind, we can now talk about how to compare
functions:
Definition 4.5. Let f(n) and g(n) be two functions which depend on
n. We say that the function f(n) grows faster than the function g(n)
if lim_{n→∞} ∣f(n)∣/∣g(n)∣ = +∞.
Intuitively, this definition is saying that for huge values of n, the ratio of
f (n) to g(n) goes to infinity: that is, f (n) is eventually as many times
larger than g(n) as we could want.
We work an example of this idea here, to see how it works in practice:
Example 4.14. We claim that the function f(n) = n^2 + 2n + 1 grows
faster than g(n) = 5n + 5. To see this, by our definition above, we want
to look at the limit lim_{n→∞} (n^2 + 2n + 1)/(5n + 5).
Notice that we can factor the numerator into (n + 1)^2. Plugging this
into our fraction leaves us with (n + 1)^2/(5(n + 1)), which we can simplify
(as in Problem 4.4) to (n + 1)/5.
As n goes to infinity, (n + 1)/5 = (1/5)n + 1/5 grows without bound.
Therefore we’ve shown that n^2 + 2n + 1 grows faster than 5n + 5.
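If you’d like to see this ratio grow, here is a tiny Python sketch that prints f(n)/g(n) next to the simplified form (n + 1)/5 for a few values of n. The function names f and g just mirror the example.

```python
def f(n: int) -> int:
    return n**2 + 2 * n + 1


def g(n: int) -> int:
    return 5 * n + 5


for n in (9, 99, 999, 9999):
    # Both columns agree, and both keep growing as n does.
    print(n, f(n) / g(n), (n + 1) / 5)
```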
To do this, we could just plug in various values of n that are close to but
greater than zero, and see what happens!
x f(x)
1 0.33
10 0.63
100 2.70
1000 17.97
10000 136.03
If you plugged in n = 100, 1000, 10000, you’d get ≈ −0.9, −0.34, −0.11.
This looks like it’s slowly increasing to 0, so you’d be tempted to
guess that
lim_{n→∞} log_2(log_2(log_2(log_2(log_2(n))))) = 0.
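One way to see why a guess like this is risky is to plug in a value of n far larger than you would ever type by hand. The Python sketch below applies log_2 five times; it leans on the fact that Python’s math.log2 accepts arbitrarily large integers, so we can feed it n = 2^65536. The helper name iterated_log2 is our own.

```python
import math


def iterated_log2(n, times: int = 5) -> float:
    """Apply log_2 to n repeatedly, `times` times in a row."""
    value = n
    for _ in range(times):
        value = math.log2(value)
    return value


for n in (100, 1000, 10000):
    print(n, round(iterated_log2(n), 2))

# 2^65536 -> 65536 -> 16 -> 4 -> 2 -> 1 under five applications of log_2.
print(iterated_log2(2**65536))  # 1.0
```

So the function does eventually climb past 0: the values for “small” n like 10000 were simply too early in the process to show it.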
This often gives you a new function where you no longer have the top
and bottom going to zero, which is often much easier to work with.
Example 4.15. If we had the limit lim_{x→+∞} (x^2 − x)/x, we could factor an x
out of the top and bottom to get lim_{x→+∞} x − 1. This is +∞.
For a second example, if we had the limit lim_{x→+∞} (x^2 − x)/x^2, we could factor
an x^2 out of the top and bottom to get the simplified limit lim_{x→+∞} (1 − 1/x)/1.
The numerator goes to 1 and the denominator just is 1, so this is 1.
Within those groups, we sort these expressions by degrees and bases: i.e.
and
and so on / so forth.
Finally, any sum of expressions grows as fast as its largest expression:
i.e. a factorial plus a log plus a constant grows at a factorial rate, an n^4
plus a log plus a square root grows as fast as n^4 grows, etc.
As a brief justification for this, let’s look at a table:
To study one more example of this idea and finish our chapter, let’s use
this principle to answer our last exercise:
Answer to 4.1. In this problem, we had two multiplication algorithms
and wanted to determine which is best.
a times
For the purpose of this problem, we don’t have to think about why
this process works, though look at practice problem 6 if you’re curious!
Instead, if we simply take it on faith that this algorithm does work, it’s
easy to calculate how long it takes to complete: it will need as many
iterations as it takes to reduce a to 0 by these operations.
In the worst-case scenario for the number of iterations, we only ever
divide by 2; i.e. a is a power of 2. In this case, it takes k iterations if
a = 2^k; i.e. we need log_2(a) iterations, and thus need a constant times
log_2(a) many operations.
As we saw before, logarithms grow much more slowly than linear functions!
Therefore, Algorithm 4.2 is likely the better algorithm to work with.
2. (-) Let f(x) = 10x + 1, g(x) = x − 3, h(x) = log_10(x^3 + 3) and j(x) = ∛x.
Calculate the following compositions:
❼ f ○ h(x). ❼ g ○ f(x).
❼ h ○ j(x). ❼ f ○ g(x).
(a) lim_{n→∞} (2^n + 2^(n+1))/(2^n − 2^(n−1))
(d) lim_{n→∞} (sin(2^n + n) + n)/(n − log_2(n))
Graphs
2019 Chapter 5
Exercise 5.1. In the 1700’s, the city of Königsberg was divided by the
river Pregel into four parts: a northern region, a southern region, and
two islands. These regions were connected by seven bridges, drawn in
red in the map at right.
These bridges were particularly beautiful pieces of architecture, and so
when people would offer tours of the city they would try to take tourists
across each bridge to show it off. However, tour guides at the time
noticed that no matter how they could construct their walks around the
city, they’d always either miss a bridge or have to double a few bridges
up: they could never find a “perfect” tour that crossed each bridge exactly
once.
Can you find such a tour? That is: can you come up with a walk through
the city that starts and ends at the same place, and walks over each bridge
exactly once?
A map of Königsberg, c. 1730. Source: Wikipedia.
Exercise 5.2. You’re a civil engineer! Suppose that you have a set of
buildings, and you want to hook up each of these buildings to the city’s
utilities, namely water, gas, and electricity.
However, your city is very earthquake-prone. As a safety precaution,
they’ve asked that no two utilities are allowed to vertically “cross over”
each other: that is, you can’t have water mains running beneath the gas
mains, or gas pipes running above the electricity wires. So, for example,
while the configuration drawn at left wouldn’t work for two buildings, the
right one would!
Can you connect three buildings to your utilities? Or is this impossible?
If Definition 5.1 above feels a bit too informal for you, here’s a more
precise way to describe a graph:
Definition 5.2. A simple undirected loopless graph G consists of
two things: a collection V of vertices, and another collection E of
edges, which we think of as distinct unordered pairs of distinct elements
in V. We think of the vertices in a graph as the objects we’re studying,
and the edges in a graph as a way to describe how those objects are
connected.
For simplicity’s sake, we will use the word “graph” to refer to a simple
undirected loopless graph, and assume that all graphs are simple undirected
loopless graphs unless we explicitly say otherwise.
To describe an edge connecting a pair of vertices a, b in our graph G, we
use our set language from earlier and write this as {a, b}. We say that
a and b are the endpoints of the edge {a, b} when this happens.
Example 5.2. The following describes a graph G:
❼ V = {a, b, c, d, e}
❼ E = {{a,b}, {b,c}, {c,d}, {d,e}, {e,a}}
Given a graph G = (V, E), we can visualize G by drawing its vertices as
points on a piece of paper, and its edges as connections between those
points.
Several ways of drawing the graph G defined above are drawn at right.
Notice that there are many ways to draw the same graph! Also note that
edges don’t have to be straight lines, and that we allow them to cross.
In general, it is often much easier to describe a graph by drawing it
rather than listing its vertices and edges one-by-one. Wherever we can,
we’ll describe graphs by drawing pictures; we encourage you to do the
same!
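When a picture isn’t an option (say, inside a program), the vertex+edge description is exactly what you’d store. Here is a minimal Python sketch of the graph G from Example 5.2, using a set for V and a set of two-element frozensets for E so that edges are unordered pairs. The helper names is_simple_graph and neighbours are our own.

```python
V = {"a", "b", "c", "d", "e"}
E = {frozenset(pair) for pair in
     [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a")]}


def is_simple_graph(vertices, edges) -> bool:
    """Check that every edge is an unordered pair of two distinct vertices."""
    return all(len(edge) == 2 and edge <= vertices for edge in edges)


def neighbours(vertex, edges):
    """All vertices that share an edge with `vertex`."""
    return {other for edge in edges if vertex in edge
            for other in edge if other != vertex}


print(is_simple_graph(V, E))      # True
print(sorted(neighbours("a", E)))  # ['b', 'e']
```

Because E is a set, listing the same pair twice would not create a second edge, which matches the “distinct” requirement in Definition 5.2.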
Here are a few examples of graphs, described with this vertex+edge
language:
Example 5.3.
❼ We can represent a maze as a graph! Take the collection of “rooms”
in our maze, and think of them as our vertices. Now, connect any
two room-vertices with an edge if they are linked by a doorway.
With this done, finding a way out of our maze is the same as
finding a walk through our graph! This can often simplify things
for us, as the graph visualization lets us ignore some of the more
irrelevant information (i.e. right angles, walls without doors, etc)
in the maze.
❼ You and your three flatmates (Aang, Korra, and Zuko) have just
moved to a new house. You have a list of tasks to do over the
weekend: you all want to paint the walls, move in your furniture,
cook dinner, and do some gardening. Suppose that you have the
following skills:
– You really like to paint and garden.
– Aang is a great cook who enjoys gardening.
– Zuko is also a great cook who’s strong (furniture.)
– Korra is strong (furniture) and likes to paint.
You can visualize this with a graph! Make a vertex for each flatmate,
and a vertex for each task. Then, connect flatmates to their
preferred tasks with edges. With this, a “good” way to divide
up tasks is to find a “matching:” that is, a way to pick out four
edges in our graph, such that every person and every task is in
exactly one edge (so that the work is divided!) Such a matching is
highlighted in the graph at right.
[Figure: the flatmate graph, with vertices You, Aang, Zuko, Korra on one side and Paint, Garden, Furniture, Cook on the other.]
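With only four flatmates, a computer can find every such matching by brute force. The Python sketch below stores the flatmate graph as a dictionary from each person to the tasks they are joined to, and then tries every way of handing out the four tasks. The names likes and perfect_matchings are our own, and the approach assumes there are exactly as many people as tasks.

```python
from itertools import permutations

# The flatmate graph from the example: person -> tasks they are connected to.
likes = {
    "You": {"Paint", "Garden"},
    "Aang": {"Cook", "Garden"},
    "Zuko": {"Cook", "Furniture"},
    "Korra": {"Furniture", "Paint"},
}


def perfect_matchings(likes):
    """List every assignment giving each person a distinct task they like."""
    people = list(likes)
    tasks = sorted(set().union(*likes.values()))
    found = []
    for order in permutations(tasks):
        # Keep this assignment only if every (person, task) pair is an edge.
        if all(task in likes[person] for person, task in zip(people, order)):
            found.append(dict(zip(people, order)))
    return found


for matching in perfect_matchings(likes):
    print(matching)
```

Trying all orderings is fine for four tasks, but the number of orderings grows factorially, so this is not how you would do it for a large graph.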
Just as there are many different ways to model the connections between a
set of objects, there are other notions of graphs beyond that of a simple
graph. Here are some such definitions:
Definition 5.3. A simple graph with loops is just like a simple graph
G, except we no longer require the pairs of elements in E to be distinct;
that is, we can have things like {v, v} ∈ E.
A multigraph is a simple graph, except we allow ourselves to repeat
edges in E multiple times; i.e. we could have three distinct edges e_1, e_2, e_3 ∈
E with each equal to the same pair {x, y}.
[Figures: a simple graph with loops on five vertices; a multigraph on four vertices.]
You can mix-and-match these definitions: you can have a directed graph
with loops, or a multigraph with loops but not directions, or pretty much
anything else you’d want!
Remark 5.1. In this course, we’ll assume that the word graph means
simple undirected graph without loops unless explicitly stated oth-
erwise.
To practice all of this graph theory language, let’s try solving one of our
exercises!
Answer to Exercise 5.2. You cannot solve the utility problem!
To see why, let’s use the language of graph theory. Make each of our
three buildings into vertices; as well, make each of the three utilities into
vertices. If we connect each building to each utility, notice that we’ve
created the graph K_{3,3}!
[Figure: K_{3,3} drawn with vertices a, b, c in one row and d, e, f in the other, next to its hexagon subgraph on the same six vertices.]
We claim that K_{3,3} cannot be drawn without any edges crossing, and
thus that this puzzle cannot be solved. To see why this is true, let’s use
the “contradiction” technique we used above.
Think about what a solution would look like. In particular, notice that
K_{3,3} contains a “hexagon,” i.e. a C_6, as drawn at right. Therefore, in
any drawing of K_{3,3}, we will have to draw this cycle C_6 first.
Because this cycle is a closed loop, it separates space into an “inside” and
an “outside.” Therefore, if we are drawing K_{3,3} without edges crossing,
after drawing the C_6 part, any remaining edge will have to either be
drawn entirely “inside” the C_6 or “outside” the C_6. That is, we can’t
have an edge cross from inside to outside or vice-versa, because that
would involve us crossing over pre-existing edges.
Therefore, if we have a crossing-free drawing of K_{3,3}, after we draw the
C_6 part of this graph, when we go to draw the {d, c} edge we have two
options: either draw this edge entirely on the inside of our C_6, or entirely
on the outside.
If we draw this edge on the inside, then on the inside the vertices f and
a are separated by this edge; therefore, to draw the edge {a, f} we must
go around the outside. Similarly, if we draw this edge on the outside,
then a, f are separated from each other on any outside walk, and the
edge {a, f} must be drawn inside of the hexagon.
In either of these cases, notice that there is no walk that can be drawn
from b to e on either the inside or the outside! Therefore we cannot
draw our last edge {b, e} without having a crossing, and thus have a
contradiction to our claim that such a crossing-free drawing of K_{3,3} was
possible.
Example 5.4. If we look at the flatmate graph from Example 5.3 above,
we can see that the vertices “You” and “Paint” are adjacent, but that
“You” and “Furniture” are not adjacent. Similarly, “Aang” and “Cook”
are adjacent, but “Aang” and “Zuko” are not adjacent.
Notice that adjacency is just a property of individual edges! Even though
we drew the Aang and Zuko vertices next to each other on our graph,
we did not connect them with an edge, so they are not adjacent. As
well, even though you can go from “You” to “Furniture” by using the
multiple-edge walk
Claim 5.1 says that this should be twice the number of edges; i.e. that
this graph should have 14 edges. This is true: count them by hand!
Claim 5.2. You cannot have a graph on seven vertices in which the
degree of every vertex is 3.
Proof. Before reading this proof, try to show this yourself: that is, take
pen and paper, and try to draw a graph on seven vertices where all of
the degrees are 3!
In doing so, we make two claims:
❼ You won’t succeed: you’ll always wind up with one vertex forced
to have degree 2 or 4 (or you’ll accidentally make something that’s
not a simple graph by drawing multiple edges / loops.)
❼ It won’t be obvious why this keeps failing: i.e. there will be a lot of
different things you could have tried, and writing down a “proof”
for why this is impossible may feel like it would involve a ton of
tedious casework.
However, if we use the degree-sum formula, we claim that this problem
is quite simple! We start by using a principle that’s served us well in
many prior problems: let’s suppose that our claim is somehow false, and
it is possible to have a graph G on seven vertices in which all degrees
are 3.
The degree-sum formula, when applied to G, tells us that the sum of
the degrees in G must be twice the number of edges. Because all of the
degrees in G are 3 and there are seven vertices, we know that this degree
sum is just 3 ⋅ 7 = 21.
Therefore, we must have that two times the number of edges in our graph
is 21. That is, we have 10.5 edges . . . which is impossible, because edges
come in whole-number quantities! (I.e. you either have an edge {a, b}
or you don’t: there is no “half” of an edge in a simple graph.)
Therefore, such a graph G cannot exist, as the number of edges required
for such a graph G is impossible to construct. In other words, no graphs
exist on seven vertices in which all degrees are 3!
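The counting argument in this proof is easy to mirror in code. The Python sketch below does two things: it computes the degrees of a graph from its edge list (so you can check that they sum to twice the number of edges), and it tests whether “n vertices, all of degree d” even passes the degree-sum check. The function names, and the five-edge example graph (the one from Example 5.2), are our own choices for illustration.

```python
from collections import Counter


def degrees(edges):
    """Count how many edges each vertex appears in."""
    count = Counter()
    for a, b in edges:
        count[a] += 1
        count[b] += 1
    return count


def passes_degree_sum_check(n: int, d: int) -> bool:
    """A graph with n vertices of degree d needs n*d/2 edges, so n*d must be even."""
    return (n * d) % 2 == 0


edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a")]
print(sum(degrees(edges).values()), "=", 2 * len(edges))
print(passes_degree_sum_check(7, 3))  # False: 21 is odd, so 10.5 "edges"
```

Note that passing this check is only a necessary condition: it rules graphs out, but a True answer does not by itself build a graph for you.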
v0 → v1 → v2 → . . . → vn−1 → vn
is a valid way to describe a walk.
Example 5.7. In the graph in the margins, the following sequences are
walks from e to g:
❼ e→a→f →b→e→d→f →b→g
❼ e→c→g
Notice that walks can repeat edges and vertices!
A useful property that graphs can have, related to this concept of walks,
is being connected:
[Figure: a map of the South Island’s regions (Tasman, Buller, West Coast, Canterbury, Mid-Canterbury, South Canterbury, North Otago, Otago, Southland), drawn alongside a graph on the same regions whose edges carry numeric labels.]
In graphs like this, we’ll often solve problems like the following:
Example 5.10. Suppose that you’re a traveling salesman. In partic-
ular, you’re traveling the South Island, and trying to sell rugby tickets
for nine rugby teams there (one for each region in the map above.)
You want to start and finish in Mid-Canterbury, and visit each other
region exactly once to sell tickets in it. What circuit can you take through
these regions that minimizes your total travel time, while still visiting
each region exactly once?
Without knowing any mathematics, you’d probably guess that the shortest
route is to just go around the perimeter of the island. Intuitively,
at the least, this makes sense: avoiding the Southern Alps is probably a
good way to save time!
In real life, however, maps can get a lot messier than this. Consider a
map of all of the airports in the world, or even just in New Zealand (at
right.) If you were an Air New Zealand representative and wanted to
visit each airport, how would you do so in the shortest amount of time
and still return home to Auckland?
Publicly-available map sourced from http://www.airlineroutemaps.com’s Air New Zealand page.
In general: suppose you have n cities C_1, . . . , C_n that you need to visit for
work, and you’re trying to come up with an order to visit them in that’s
the fastest. For each pair of cities {C_i, C_j}, assume that you know the
time it takes to travel from C_i to C_j. How can you find the cheapest way
to visit each city exactly once, so that you start and end at the same
place?
These sorts of tasks are known as traveling salesman problems, and
companies all over the world solve them daily to move pilots, cargo, and
people to where they need to be. Given that it’s a remarkably practical
problem, you’d think that we’d have a good solution to this problem by
now, right?
. . . not so much. Finding a “quick” solution (i.e. one with non-exponential
runtime) to the traveling salesman problem is an open problem in the-
oretical computer science; if you could do this, you would solve a problem
that’s stumped mathematicians for nearly a century, advance mathemat-
ics and computer science into a new golden age, and quite likely go down
in history as one of the greatest minds of the millennium.
So, uh, extra-credit problem.
This is a fancy way of saying “this problem is really hard.” So: why
mention it here? Well: in computer science in general, and graph theory
in particular, we often find ourselves having to solve problems that don’t
have known good or efficient algorithms. Despite this, people expect
us to find answers anyways: so it’s useful to know how to find “good
enough” solutions in cases like this!
For the traveling salesman problem, one brute-force approach you could
use to find the answer could be coded like this:
Assume that c({x, y}) is infinite if the edge doesn’t exist (i.e. that
it would take “forever” to travel along a walk that is impossible to
travel along.)
3. Output the smallest number/walk you find.
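Here is one way the brute-force idea might look in Python: try every ordering of the cities other than the start, add up the cost of each resulting circuit (treating missing edges as infinitely expensive), and keep the cheapest. The function names and the four-city example graph are hypothetical, made up purely for illustration.

```python
from itertools import permutations
import math


def circuit_cost(tour, cost):
    """Total cost of visiting `tour` in order and then returning to its start."""
    total = 0
    for x, y in zip(tour, tour[1:] + tour[:1]):
        # c({x, y}) is infinite if the edge doesn't exist.
        total += cost.get(frozenset((x, y)), math.inf)
    return total


def brute_force_tsp(vertices, start, cost):
    """Check every ordering of the non-start vertices; return (cost, tour)."""
    others = [v for v in vertices if v != start]
    best_cost, best_tour = math.inf, None
    for ordering in permutations(others):
        tour = [start, *ordering]
        total = circuit_cost(tour, cost)
        if total < best_cost:
            best_cost, best_tour = total, tour
    return best_cost, best_tour


# A made-up example: four cities, with travel times on each pair.
cost = {frozenset(pair): time for pair, time in [
    (("A", "B"), 1), (("B", "C"), 2), (("C", "D"), 1),
    (("D", "A"), 2), (("A", "C"), 5), (("B", "D"), 5),
]}
print(brute_force_tsp(["A", "B", "C", "D"], "A", cost))
```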
Points in favor of this algorithm: it works! Also, it’s not too hard to
code (try it!)
Points against this algorithm: if you were trying to visit 25 cities in a
week, it would take the world’s fastest supercomputer over ten thousand
years to answer your problem. (If you were trying to visit 75 cities,
the heat death of the universe occurs before this algorithm is likely to
terminate.)
This is because the algorithm needs us to consider every possible order
of the n vertices in G to complete. There are (n − 1)! = (n − 1) ⋅ (n − 2) ⋅
(n − 3) ⋅ . . . ⋅ 3 ⋅ 2 ⋅ 1 many ways in which we can order our n cities, and
the factorial function grows incredibly quickly, as we saw before!
To see why, think about how you’d make an ordering of the cities. You’d
start by choosing a city to travel to from s: there are n − 1 choices here,
as we can possibly go anywhere other than s. From there, we have n − 2
choices for our second city, and then n − 3 for our third city, and so on/so
forth!
Another approach (which, as authors who would like to book their travel
before the heat death of the universe, we are in favor of) is to use randomness
to solve this problem! Consider the following algorithm:
Algorithm 5.10. Init: Take our graph G, starting vertex s, and cost
function c just like before.
1. Start from s and randomly choose a city we haven’t visited, and
then go to that city.
2. Keep randomly picking new cities until we’ve run out of new choices,
and then return to s.
3. Calculate the total cost of that path.
4. Run this process, say, ten thousand times (which, while large, will
be much smaller than n! for almost all values of n that you’ll run
into.)
5. Output the smallest number/path you find.
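A Python sketch of this randomized approach is below. Shuffling the list of unvisited cities is used here as a shortcut for “repeatedly pick a random city we haven’t visited”; the function name, the attempts parameter, and the four-city example graph are all our own illustrative choices.

```python
import math
import random


def random_tour_search(vertices, start, cost, attempts=10_000, rng=None):
    """Try `attempts` random circuits from `start`; return the best (cost, tour)."""
    rng = rng or random.Random()
    others = [v for v in vertices if v != start]
    best_cost, best_tour = math.inf, None
    for _ in range(attempts):
        rng.shuffle(others)       # steps 1-2: a random order of the new cities
        tour = [start, *others]
        total = sum(              # step 3: cost of the circuit, back to start
            cost.get(frozenset((x, y)), math.inf)
            for x, y in zip(tour, tour[1:] + tour[:1])
        )
        if total < best_cost:     # step 5: remember the smallest seen so far
            best_cost, best_tour = total, tour
    return best_cost, best_tour


# A made-up example: four cities, with travel times on each pair.
cost = {frozenset(pair): time for pair, time in [
    (("A", "B"), 1), (("B", "C"), 2), (("C", "D"), 1),
    (("D", "A"), 2), (("A", "C"), 5), (("B", "D"), 5),
]}
print(random_tour_search(["A", "B", "C", "D"], "A", cost))
```

On a graph this small the random search will almost certainly stumble onto the best circuit; on large graphs it only promises a “good enough” answer, which is the whole point.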
Pick any vertex x in our graph. Notice that each time x comes up in the
above circuit, it does so twice: if x = vi for some i, it shows up in both
{vi−1 , vi } and {vi , vi+1 }. You can think of this as saying that each time
our circuit “enters” a vertex along some edge, it must “leave” it along
another edge!
As a result, any vertex x shows up an even number of times in the circuit
we’ve come up with here. As well, we assumed that this circuit contains
every edge exactly once. Therefore, every vertex x shows up in an even
number of edges in our graph!
That is, deg(x) is even for every vertex x.
However, we know that our graph has odd-degree vertices. This is a
contradiction! Therefore, we have shown that it is impossible to solve
this puzzle.
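You can check the “odd-degree vertices” claim for Königsberg directly. The Python sketch below lists the seven bridges as pairs of regions and counts how many bridges touch each region. The region labels are our own, and the bridge list is the standard layout of the historical puzzle (one island joined twice to each bank and once to the other island, which is itself joined once to each bank).

```python
from collections import Counter

# The seven bridges, as pairs of the regions they connect.
bridges = [
    ("north", "island_1"), ("north", "island_1"),
    ("south", "island_1"), ("south", "island_1"),
    ("north", "island_2"), ("south", "island_2"),
    ("island_1", "island_2"),
]

degree = Counter()
for region_a, region_b in bridges:
    degree[region_a] += 1
    degree[region_b] += 1

print(dict(degree))
print(all(d % 2 == 1 for d in degree.values()))  # True: every degree is odd
```

Since the argument above shows that a circuit using every edge exactly once forces every degree to be even, a single odd degree would already be enough to rule such a tour out.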
Trees
2019 Chapter 6
6.1 Trees
In our last section, we saw that the language of graph theory could be
used to describe tons of real-life objects: the internet, transportation,
social networks, tasks, and many other things! Even your computer (a
particularly relevant thing to consider in a computer science class) can
be described as a graph:
❼ Vertices: all of the files and folders in your computer.
❼ Edges: Draw an edge from a file or folder to every object it con-
tains.
If you draw this out, you’d get something similar to the drawing below:
C:
This graph represents the file system for your computer, and is ex-
tremely useful for organizing files: imagine trying to find a document
if literally every file on your computer had to live on your desktop, for
instance!
This graph has a particularly useful structure: starting from C: , there’s
always exactly one way to get to any other file or folder if you don’t allow
backtracking. That is: there are no files you can’t get to by starting from
your root and working your way down, and also there are no files that
you can get to in multiple different ways! This is a very nice property
for a file system to have: you want to be able to navigate to every file
in some way, and it’s very nice to know that files in different places are
different (imagine deleting a file from your desktop and having all copies
of it disappear in other places!)
We call graphs with the structural property described above trees. Trees
come up all the time in real life:
❼ PDF documents (like the one you’re reading right now!) are tree-based
formats. Every PDF has a root node, followed by various
sections, each of which contains various subsections.
❼ In genealogy and genetics, people study family trees: i.e. take
your great-grandmother, all of her children, all of her children’s
children, and so on/so forth until you’re out of relatives. This is
a tree, as starting from your great-grandmother there should only
be one way to get to any relative.
❼ Given any game (e.g. chess, or tic-tac-toe, or Starcraft), you can
build a decision tree to model possible outcomes as the game
progresses. To do so, make a vertex for the starting state. Then
make a vertex for every possible move player 1 could make, and
connect the starting state to all of these. For each of those states,
make a vertex for every response player 2 could make, and connect
those states up as well; doing this for all possible moves generates
a decision tree, which you can use to win!
In short: they’re useful!
To define what a tree is, we first need to introduce a useful concept from
graph theory that we didn’t have time to discuss last chapter:
For example, the three graphs in the margins are not trees, as each of
them has a cycle graph of some length as a subgraph.
However, the three graphs below are all trees:
We call vertices of degree 1 in a tree the leaves of the tree. For example,
the leaves of the trees above are colored green.
To get a bit of practice with these ideas, let’s prove a straightforward
claim about trees:
Algorithm 6.11.
1. Choose any edge e = {x_0, y_0} in T.
2. Starting from i = 0, repeatedly do the following: if x_i has degree
≥ 2, then pick a new edge {x_i, x_{i+1}} leaving x_i. Because T is a
tree, x_{i+1} is not equal to any of our previously-chosen vertices (if
it were, then we’d have created a cycle.) Stop when x_i eventually
has degree 1.
3. Starting from i = 0, do the same thing for y_i.
Notice that this process must eventually stop: on a tree with n vertices,
we can only put n vertices in our path because the “no cycle” property
stops us from repeating vertices. When it stops, the endpoints of the
path generated are both leaves because this is the only way we stop this
process. Therefore, this process eventually finds two leaves in any tree!
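The process above can be sketched in code. This is only an illustration under some assumptions: the tree is stored as a dictionary mapping each vertex to a list of its neighbors, and the names `find_two_leaves` and `walk` are made up for this sketch.

```python
def find_two_leaves(adj):
    """Walk outward from both ends of an arbitrary edge until each end hits a leaf.

    Assumes adj describes a tree with at least two vertices.
    """
    # Step 1: pick any edge {x0, y0}.
    x0 = next(v for v in adj if adj[v])
    y0 = adj[x0][0]

    def walk(prev, cur):
        # Steps 2 and 3: while the current vertex has degree >= 2,
        # leave it along an edge other than the one we arrived on.
        while len(adj[cur]) >= 2:
            nxt = next(u for u in adj[cur] if u != prev)
            prev, cur = cur, nxt
        return cur

    return walk(y0, x0), walk(x0, y0)


# A small example tree: 1 - 2 - 3, with 2 - 4 - 5 hanging off vertex 2.
tree = {1: [2], 2: [1, 3, 4], 3: [2], 4: [2, 5], 5: [4]}
leaf_a, leaf_b = find_two_leaves(tree)
```

Because a tree has no cycles, neither walk can revisit a vertex, so both loops stop and the two endpoints returned are distinct vertices of degree 1.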
As a bit of extra practice, let’s try to use our tree language to sketch a
solution to our second exercise:
Answer to Exercise 6.2. It turns out that you can guard any n-sided
polygon (without any holes, and where all sides are straight) with at
most ⌊n/3⌋ cameras! To do so, use the following process:
❼ Take your n-sided polygon. By connecting opposite vertices, divide
it up into triangles.
❼ Turn this into a graph: think of each triangle as a vertex, and
connect two triangles with an edge when they share a side.
❼ This graph is a tree! (Why? Justify this to yourself.)
❼ Use this tree structure to do the following:
– Take any triangle. Color its 3 vertices red, blue, and green.
– Now, go to any triangle that shares a boundary with that
colored triangle. It will have 2 of its three vertices given
colors. Give its third vertex the color it’s currently missing.
– Repeat this process! It never runs into conflicts, because our
graph is a tree (and so we don’t have cycles.)
❼ Result of the above: every triangle has one red vertex, one blue
vertex, and one green vertex.
❼ Put a camera on the least-used color! This needs at most n/3
rounded down cameras, as we’re using the least popular of three
colors. It also guards everything, as a camera sees everything in
each triangle it’s in!
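Here is a rough sketch of the coloring process in code. It assumes the polygon has already been cut into triangles, given as triples of polygon-vertex labels; the function name `place_cameras` and the hexagon example are invented for illustration.

```python
from collections import Counter, deque


def place_cameras(triangles):
    """3-color the corners of a triangulated polygon; return the least-used color class."""

    def share_side(a, b):
        # Two triangles are neighbors in our graph when they share a side.
        return len(set(a) & set(b)) == 2

    # Color the three corners of the first triangle with colors 0, 1, 2.
    color = dict(zip(triangles[0], (0, 1, 2)))
    seen = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j, tri in enumerate(triangles):
            if j in seen or not share_side(triangles[i], tri):
                continue
            # Two corners are already colored; the third gets the missing color.
            (new_vertex,) = [v for v in tri if v not in color]
            used = {color[v] for v in tri if v in color}
            (color[new_vertex],) = {0, 1, 2} - used
            seen.add(j)
            queue.append(j)

    counts = Counter(color.values())
    rarest = min((0, 1, 2), key=lambda c: counts[c])
    return sorted(v for v, c in color.items() if c == rarest)


# A hexagon with corners 0..5, triangulated as a "fan" around corner 0.
hexagon_fan = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 5)]
cameras = place_cameras(hexagon_fan)
```

On this hexagon the sketch places a single camera at corner 0, which is within the bound ⌊6/3⌋ = 2.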
Proof of Theorem 6.2. Because we’re proving that these two statements
are equivalent, we need to show that if either of them is true, then the
other statement follows. That is, if you wanted to show that “attending
office hours” and “getting an A+ in Compsci 120” were equivalent things,
you wouldn’t be satisfied if I said “everyone who got an A+ in Compsci
120 attended office hours:” you’d also want to know whether “everyone
who attended office hours got an A+”!
As such, this proof needs to go in 2 steps:
1. First, we need to show that if T is a tree, then there’s a unique
path between any two vertices in T .
2. Then, we need to show that if there’s a unique path between any
two vertices in T , then T is a tree.
We do each of these one-by-one:
1. Because T is a tree, by definition we know that T is connected.
By the definition of connected, we know that for any two vertices
x, y there is at least one path that goes from x to y. To complete
our proof, then, we just need to show that there aren’t multiple
paths between any two vertices.
To see why two distinct paths is impossible, we proceed by contra-
diction: i.e. we suppose that we’re wrong, and that it is somehow
possible for us to have two different paths linking a pair of vertices.
Let’s give those vertices and paths names: that is, let’s assume
that there are vertices x, y linked by two different paths
P_1 = {v_1 = x, v_2}, {v_2, v_3}, . . . {v_{n−1}, y} and
P_2 = {w_1 = x, w_2}, {w_2, w_3}, . . . {w_{m−1}, y}.
Because these two paths are different, there must be some value i
such that v_{i+1} ≠ w_{i+1}. Let i be the smallest such value, so that these
paths agree at v_i = w_i and diverge immediately afterwards.
(Margin figure: the two paths drawn side by side, agreeing up to v_i = w_i and then diverging.)
These two paths must eventually meet back up, as they end at the
This result should help us understand how our “trees don’t have cycles”
property connects to the “there’s exactly one path from C: to any other
file” property that made our filesystem example so useful!
The second of these results requires induction, and so we’ll delay its proof
until Section 7.9. It’s quite useful, though, and worth knowing even if
we can’t prove it yet! For example, it lets us solve one of our exercises:
Answer to Exercise 6.1. Let T be a tree on 2n vertices with the “doubled
degrees” property. Notice that if a graph T on 2n vertices has
the “doubled degree” property, then the sum of the degrees in T is
\[ 1 + 1 + \ldots + n + n = 2(1 + 2 + \ldots + n) = 2 \cdot \frac{n^2 + n}{2} = n^2 + n. \]
As well, the degree-sum formula from our graph theory chapter tells us
that the sum of the degrees in T is twice the number of edges. Therefore,
we have that n^2 + n = 2E, where E is the number of edges in T.
Finally, because T is a tree, we know that it has one less edge than it
has vertices (i.e. it has 2n − 1 edges, because T is on 2n vertices.)
Combining this all together tells us that n^2 + n = 2(2n − 1) = 4n − 2; i.e.
n^2 − 3n + 2 = 0; i.e. (n − 1)(n − 2) = 0, i.e. n = 1 or n = 2. In other words,
if T is a tree with the doubled degrees property, then T is either a tree
on 2 or 4 vertices.
For n = 1, the “doubled degree” property would tell us that T should be
a two-vertex graph with two vertices of degree 1. There is exactly one
tree of this form, namely .
For n = 2, the “doubled degree” property would tell us that T should be
a four-vertex graph with two vertices of degree 1 and two of degree 2.
There is also exactly one tree of this form, namely .
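If you would like to double-check the algebra in this answer, a brute-force search is a quick sanity check. This is only a sketch: the search range of 1 to 1000 is an arbitrary choice, and the name `solutions` is made up here.

```python
# The degrees 1, 1, 2, 2, ..., n, n sum to 2 * (1 + 2 + ... + n).
# In a tree on 2n vertices, that sum must equal twice the edge count, 2 * (2n - 1).
solutions = [
    n
    for n in range(1, 1001)
    if sum(2 * d for d in range(1, n + 1)) == 2 * (2 * n - 1)
]
```

The search finds only n = 1 and n = 2 in this range, matching the factorization (n − 1)(n − 2) = 0. A finite search like this is evidence rather than a proof; the algebra is what rules out every other n.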
Definition 6.4. We say that the children of a vertex v are all of the
neighbors of v at the level directly below v, and the parent of v is the
neighbor of v at the level directly above v. The height of a rooted tree is
the largest level index created when drawing the graph as above.
To finish out our chapter and practice working with this concept, let’s
study a quick result about binary trees:
Exercise 6.3. Suppose that T is a full binary (i.e. 2-ary) tree with 100
leaves. How many vertices does T have in total?
Proofs
2019 Chapter 7
Exercise 7.1. You’re about to leave on holiday, but you forgot to pack
socks! You’ve run back to your room, but the light’s burnt out, so you
can’t see the colours of your socks.
You know that in your sock drawer there are ten pairs of green socks,
ten pairs of black socks, and eleven pairs of blue socks (all mixed up.)
How many of your socks do you need to take before you can be sure
you’ve grabbed at least one matching pair?
Exercise 7.2. You’re a mad scientist! You’ve conducted an experiment
on yourself to get superpowers. It worked, but to keep the powers you
need to take two different tablets each day; if you forget one, or take
more than one of either type, you’ll, um, explode.
Unfortunately, they look completely identical, and you’ve just dropped
your last two days of supply (four tablets) on the floor.
What can you do?
Story 7.1. Janelle Shane is a researcher in optics who works with neural
networks and machine learning. Roughly speaking, the way that a neural
network works is the following:
❼ Take a bunch of examples of the thing you want the neural network
to recognize, as well as a bunch of nonexamples.
❼ “Show” the neural network these examples and nonexamples.
(Margin note: “Show” is hard to define in words, but there are some great
YouTube videos: see SethBling’s MarI/O videos for a fun and accessible
introduction!)
❼ The neural network will then come up with a set of rules that it
believes describes what it means for
People are often tempted to just use the results of a neural network
directly, without checking whether its discovered rules make sense. Doing
so, as Janelle notes, leads to some fascinatingly weird behaviour:
❼ “There was an algorithm that was supposed to sort a list of num-
bers. Instead, it learned to delete the list, so that it was no longer
technically unsorted.”
❼ “In 1997, some programmers built algorithms that could play tic-
tac-toe remotely against each other on an infinitely large board.
One programmer, rather than designing their algorithm’s strategy,
let it evolve its own approach. Surprisingly, the algorithm suddenly
began winning all its games. It turned out that the algorithm’s
strategy was to place its move very, very far away, so that when its
opponent’s computer tried to simulate the new greatly-expanded
board, the huge gameboard would cause it to run out of memory
and crash, forfeiting the game.”
❼ “An algorithm that was supposed to figure out how to apply a
minimum force to a plane landing on an aircraft carrier. Instead,
it discovered that if it applied a *huge* force, it would overflow
the program’s memory and would register instead as a very *small*
force. The pilot would die but, hey, perfect score.”
In short: just because something works for a bunch of examples doesn’t
mean it’s good!
Story 7.2. A somewhat darker story on the importance of being able
to read and understand proofs comes from the NSA, and something
called a Dual Elliptic Curve Deterministic Random Bit Generator. This
was an algorithm, designed by the NSA (a USA security agency,) that
they claimed was a cryptographically secure way to generate random
numbers.
(Margin note: See the New York Times for an article summarizing the
scandal. Also, check out Kleptography: Using Cryptography Against
Cryptography and Cryptanalysis of the Dual Elliptic Curve Pseudorandom
Generator, if you’d like to read through some research papers describing
the NSA’s algorithm/its weaknesses.)
However, this algorithm was one that the NSA had built a “backdoor”
into. That is, they designed the algorithm around certain secret values so
that anyone with knowledge of those values (i.e. the NSA) could predict
the randomly-generated numbers with a higher-than-normal degree of
accuracy and thereby defeat cryptographic systems using this algorithm.
The NSA managed to get their algorithm used as a “standard” for over
seven years. However, many mathematicians and computer scientists
were suspicious of the NSA’s algorithm from the very start, in large
part because it was not something that was proven to work!
Their research led to the eventual revocation of the NSA’s algorithm as
a standard.
So: we have some motivation for why we would want to write clear,
logical arguments. The next question for us, then, is what counts as a
valid argument?
Every major field of study in academia, roughly speaking, has a way of
“showing” that something is true. In English, if you wanted to argue
that the whale in Melville’s Moby Dick was intrinsically tied up with
mortality, you would write an essay that quoted Melville’s story alongside
some of his other writings and perhaps some contemporary literature,
and logically argue (using these quotations as “evidence”) that your
claim holds. Similarly, if you were a physicist and you wanted to show
that the speed of light is roughly 3.0 ⋅ 10^8 meters per second, you’d set up
a series of experiments, collect data, and see if it supports your claim.
In mathematics, a proof is an argument that mathematicians use to
show that something is true. However, the concepts of “argument” and
“truth” aren’t quite as precise as you might like; certainly, you’ve had
lots of “arguments” with siblings or classmates that haven’t proven some-
thing is true!
In mathematics, the same sort of thing happens: there are many ar-
guments that (to an outsider) look like a convincing reason for why
something is true, but fail to live up to the standards of a mathemati-
cian. In Chapter 1, we already studied a pair of “failed” proofs: namely,
our first attempts at proving Claim 1.1 and Exercise 1.1. We said that
these arguments failed because they did not work in general: that is,
they only considered a few cases, and did not consider all of the possible
ways to put dominoes on a chessboard, or to pick a pair of integers.
This, however, is not the only way in which a proof might fail us! Here’s
another dodgy proof:
Claim 7.1. Given any two nonnegative real numbers x, y, we have
(x + y)/2 ≥ √(xy).
\[
\begin{aligned}
& 0 \le (x - y)^2 \\
\Rightarrow\ & 0 \le x^2 - 2xy + y^2 \\
\Rightarrow\ & 4xy \le x^2 + 2xy + y^2 \\
\Rightarrow\ & 4xy \le (x + y)^2 \\
\Rightarrow\ & xy \le \frac{(x + y)^2}{4}.
\end{aligned}
\]
Because x and y are both nonnegative, we can take square roots of both
sides to get
\[ \sqrt{xy} \le \frac{|x + y|}{2}. \]
Again, because both x and y are nonnegative, we can also remove the
absolute-value signs on the sum x + y, which gives us
\[ \sqrt{xy} \le \frac{x + y}{2}, \]
which is what we wanted to prove.
Much better! This proof doesn’t have logical flaws, it’s easier to read,
and we’ve justified all of our steps so that even a skeptical reader would
believe us.
Proof. We start by “assuming” the part by the “if:” that is, we assume
that n is an odd integer. By definition, this means that we can write
n = 2k + 1 for some other integer k.
We seek to study n^2. By our observation above, this is just (2k + 1)^2 =
4k^2 + 4k + 1 = 4(k^2 + k) + 1. This is a multiple of 4 plus 1, as claimed!
Therefore we have completed our proof.
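As a quick sanity check of the algebra in this proof, we can test a handful of odd integers by machine. This is just a spot check on an arbitrarily chosen range (the names below are made up for the sketch); the proof above is what covers every odd n.

```python
# Every odd n can be written as 2k + 1; check the expansion and the remainder.
odd_numbers = range(-99, 100, 2)
expansion_holds = all(
    n * n == 4 * (k * k + k) + 1
    for n in odd_numbers
    for k in [(n - 1) // 2]  # the k with n = 2k + 1
)
remainders = {(n * n) % 4 for n in odd_numbers}
```

Here `remainders` collects every value of n^2 % 4 seen over the range, and it only ever contains 1.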
Proof. We start this proof by thinking about all of the facts that we know
about graphs and degrees. There’s one result that should immediately
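jump to mind, namely the degree-sum formula: for any graph G,
\[ \sum_{v} \deg(v) = 2E, \]
where the sum runs over all vertices v of G and E is the number of edges in G.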
Let’s use this result! Specifically: in this problem, we’re studying vertices
with odd degree. How can we turn this result into something that talks
about odd-degree vertices? Well: from our work in our first chapter, we
know that every integer is either even or odd. If we apply this idea to
our degree-sum formula, we get the following:
\[ \sum_{\deg(v)\ \text{odd}} \deg(v) = 2E - \sum_{\deg(v)\ \text{even}} \deg(v). \]
On the right-hand side, notice that we have an even number (twice the
number of edges) minus a bunch of even numbers (the degrees of all
even-degree vertices in G); therefore, the right-hand-side is even!
As a result, the left-hand-side is also even. But this means that the sum
of the degrees of all odd-degree vertices is an even number.
We know that summing an odd number of odd numbers is always odd,
and that summing an even number of odd numbers is always even. Be-
cause the left-hand side is even, we know we must be in the second case;
that is, that we have an even number of vertices of odd degree, as
claimed!
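To see this result in action, here is a small sketch that counts odd-degree vertices in a graph given as a list of edges. The function name `odd_degree_vertices` and the example graph are invented for illustration.

```python
def odd_degree_vertices(edges):
    """Return the vertices of odd degree in a graph given as a list of (u, v) edges."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return [v for v, d in degree.items() if d % 2 == 1]


# A triangle a-b-c with a tail c-d-e attached.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("d", "e")]
odd = odd_degree_vertices(edges)
```

In this example the degrees are 2, 2, 3, 2, 1, so exactly two vertices (c and e) have odd degree, and the degrees sum to 10, which is twice the number of edges.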
Theorem 7.2. For every natural number n, if n is a square number,
then n ≢ 2 mod 3.
(Margin note: An integer n is said to be a square number if we can write
n = k^2 for some other integer k. For example, 0, 1, 4, 9, 16, 25 . . . are
all square numbers!)
Proof. As always, we start by expanding our definitions. If n is a square
number, then by definition we know that n = k^2 for some integer k.
From here, we use the particularly clever trick that this section is devoted
to: we consider cases. That is: we want to look at what n is congruent
to modulo 3.
We don’t have any information about what n or k themselves are modulo
3, so it would seem hard to introduce this information into our proof!
However, by the definition of the modulus operator % , we know that
every number is congruent to one of 0, 1 or 2 modulo 3. By definition,
then, this means that we must always be in one of the following three
cases: k ≡ 0 mod 3, k ≡ 1 mod 3 or k ≡ 2 mod 3.
In each of these cases, we can now expand our definitions and use our
knowledge of modular arithmetic to proceed further:
1. Assume that we’re in the k ≡ 0 mod 3 case. In this situation, we
have that k = 3m for some m, which means that k^2 = 9m^2 = 3(3m^2)
is also a multiple of 3. Thus, k^2 ≡ 0 mod 3.
2. Now, assume instead that we’re in the k ≡ 1 mod 3 case. In this
situation, we have that k = 3m + 1 for some m, which means that
k^2 = 9m^2 + 6m + 1 = 3(3m^2 + 2m) + 1. Thus, k^2 ≡ 1 mod 3.
3. Finally, consider the last remaining case, where k ≡ 2 mod 3. In
this situation, we have that k = 3m + 2 for some m, which means
that k^2 = 9m^2 + 12m + 4 = 3(3m^2 + 4m + 1) + 1. Thus, k^2 ≡ 1 mod 3.
In all three of these cases, we’ve seen that n = k 2 is not congruent to 2
modulo 3. These cases cover all of the possibilities! Therefore, we know
that n is simply never congruent to 2 modulo 3 in any situation, and
have therefore proven our claim.
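The three cases in this proof can be mirrored directly in code. The sketch below groups a range of integers k by their remainder k % 3 and records which remainders k^2 % 3 show up in each group; the range and the name `by_case` are arbitrary choices for illustration.

```python
# For each case r = k % 3, collect every value of (k * k) % 3 that occurs.
by_case = {
    r: {(k * k) % 3 for k in range(-300, 301) if k % 3 == r}
    for r in range(3)
}
```

Each case produces a single remainder (0, 1 and 1 respectively), and the remainder 2 never appears, just as the proof predicts.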
The trick to the proof above was that we were able to introduce addi-
tional information about k (namely, its remainder on division by 3) by
simply considering all possible remainders as separate cases! This
Claim 7.5. For every two numbers x, y, we always have that max(x, y)+
min(x, y) = x + y.
In the next example, we return to the tricks we used to calculate the last
digit of a number in Claim 1.10:
Claim 7.6. For any integer k, we have that (k^4) % 10 is always either
0, 1, 5, or 6.
(Margin note: Proof by cases is also usually a good idea if you see the
modulus operator!)
Proof. We saw before in our chapter on integers that if d_0 is the last
digit of an integer n, then n^m % 10 is equal to d_0^m % 10 for any positive
integer power m.
Therefore, in our claim, we don’t have to actually consider every possible
integer k; we can just consider the ten different possible last digits k could
have, and calculate the fourth powers of each of those! We do so here:
❼ 0^4 % 10 = 0 % 10 = 0.
❼ 1^4 % 10 = 1 % 10 = 1.
❼ 2^4 % 10 = 16 % 10 = 6.
❼ 3^4 % 10 = 81 % 10 = 1.
❼ 4^4 % 10 = 256 % 10 = 6.
❼ 5^4 % 10 = 625 % 10 = 5.
❼ 6^4 % 10 = (36)^2 % 10 = 6^2 % 10 = 6.
❼ 7^4 % 10 = (49)^2 % 10 = 9^2 % 10 = 1.
❼ 8^4 % 10 = (64)^2 % 10 = 4^2 % 10 = 6.
❼ 9^4 % 10 = (81)^2 % 10 = 1^2 % 10 = 1.
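The ten cases above are easy to recompute by machine, which is a handy way to catch arithmetic slips. In this sketch the names `last_digits` and `matches_last_digit` are made up, and the range of k values tested is arbitrary.

```python
# Fourth power of each possible last digit, reduced mod 10.
last_digits = {d: (d ** 4) % 10 for d in range(10)}

# Spot check: the last digit of k**4 depends only on the last digit of k.
matches_last_digit = all(
    (k ** 4) % 10 == last_digits[k % 10] for k in range(-1000, 1000)
)
```

The values that appear are exactly 0, 1, 5 and 6, as the claim states.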
Proof. As always, let’s start by unpacking our definitions:
❼ √2 is the unique positive real number such that when we square
it, we get 2.
❼ A number x is rational if we can write x = m/n, where m and n are
integers and n is nonzero.
With this done, our claim can be unpacked to the following:
“For a real number x, if x = √2, then there are no values of m, n ∈ Z
with n ≠ 0 such that x = m/n.”
Proof. In the example we’re studying here, we want to show that it’s
impossible for a^b to be irrational for every pair of irrational numbers
a, b. To do this via a proof by contradiction, we do the following: first,
assume that a^b is irrational for every pair of irrational numbers a, b! If
we apply this knowledge to one of the few numbers (√2) we know is
irrational, our assumption tells us that in specific
\[ \sqrt{2}^{\sqrt{2}} \]
is irrational.
\[ \left(\sqrt{2}^{\sqrt{2}}\right)^{\sqrt{2}} = \sqrt{2}^{\sqrt{2}\cdot\sqrt{2}} = \sqrt{2}^{2} = 2, \]
Proof. As we did with our argument that no number can be both even
and odd at the same time, let’s approach this with a bit of a thought
experiment: what would happen if there were not infinitely many prime
numbers?
Well: if this were to happen, then there would be some fixed number of
primes in existence. Let’s give that number a name, and say that there
were n primes in existence. Then, if we had a piece of paper with n
lines on it, we could in theory write down all of the prime numbers that
existed!
If we labeled those lines 1, 2, . . . n, we could then refer to those prime
numbers by their labels: that is, we could refer to our prime numbers
by calling them p_1, p_2, p_3, . . . p_n. (Giving things names: a very useful
technique!)
In this world where we have all of these prime numbers, what can we
do with them? Well: as we saw before, a particularly useful property
about prime numbers is that they form the building blocks out of which
we can make all integers. Therefore, we’re motivated to take our primes
and stick them together, and see what happens!
After a lot of effort, you might eventually hit on the clever combination
of our prime numbers that Euclid discovered: think about what happens
if we multiply all of our prime numbers together, and then add 1 to that
entire product. That is: look at the number
M = 1 + (p_1 ⋅ p_2 ⋅ p_3 ⋅ . . . ⋅ p_n)
On one hand: take any of the prime numbers on our list. To indicate
that we’re taking a general prime number from our list, let’s refer to
that prime number as p_i, where i could be any index. Look at M/p_i. By
definition, this is equal to
\[ \frac{1}{p_i} + \Bigl( \overbrace{p_1 \cdot p_2 \cdot \ldots \cdot p_n}^{\text{all of the primes except for } p_i} \Bigr). \]
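To get a feel for Euclid's number M, we can compute it for a small pretend "complete list" of primes. The list below is just the first six primes, chosen for illustration; the names `primes`, `M` and `remainders` are assumptions of this sketch.

```python
from math import prod

# Pretend, for the sake of the thought experiment, that these are all the primes.
primes = [2, 3, 5, 7, 11, 13]
M = 1 + prod(primes)

# Dividing M by any prime on the list leaves the 1 we added as the remainder,
# so M / p_i is a whole number plus 1 / p_i.
remainders = [M % p for p in primes]
```

Here M = 30031, and every prime on the list leaves remainder 1, so none of them divides M.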
Claim 7.10. For every pair of integers x, y such that x and y are both
odd, we have that x ⋅ y is also odd.
Here are a number of incorrect ways that people will try to negate this
claim:
1. “For every pair of integers x, y such that x, y are both even, we
have that x ⋅ y is also even.”
The first mistake made here is in the first two words, where we
wrote “for every!” That is: Claim 7.10 is a claim about all pairs
of integers. As such, if someone were to say that Claim 7.10 was false,
they’d just have to have one counterexample to prove us wrong!
That is: if someone made a claim that every UoA student was
enrolled in Compsci 120, you wouldn’t prove them wrong by trying
to show that every UoA student is not enrolled in Compsci 120;
you’d just have to find at least one student not in Compsci 120.
This tells us the first part of how we should write the negation of
this claim: it should go “There is a pair of integers x, y. . . ”
“There is a pair of integers x, y such that both x, y are odd, and yet
x ⋅ y is even.”
Observation 7.16. The phrases “For every” and “There exist” get
switched around when writing a proof by negation. This is because we
disprove a claim about everything by finding a single counterexample,
and we prove that no example of a thing can exist by showing that
everything is not a counterexample!
Observation 7.17. The “universe” of a claim remains the same: i.e.
we don’t disprove a claim about all even numbers by studying odd num-
bers.
Observation 7.18. The opposite of an “if A then B” statement is “there
is a situation where A holds and B fails.” That is: if someone tells you
that when it rains outside the sidewalk gets wet, you just need to find a
situation where (1) it’s raining and (2) some bit of sidewalk is still dry
to disprove their claim!
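These three observations line up neatly with Python's `all` and `any`. The sketch below tests Claim 7.10 and its negation on a small finite universe of integers; the range and variable names are arbitrary choices for illustration, so this shows the logical relationship rather than proving the claim.

```python
from itertools import product

universe = range(-20, 21)

# Same universe for both statements: pairs where x and y are both odd.
odd_pairs = [
    (x, y) for x, y in product(universe, repeat=2) if x % 2 == 1 and y % 2 == 1
]

# "For every pair of odd x, y, the product x * y is odd."
claim = all((x * y) % 2 == 1 for x, y in odd_pairs)

# "There is a pair of odd x, y such that x * y is even."
negation = any((x * y) % 2 == 0 for x, y in odd_pairs)
```

Note how "for every" became "there is", the universe (pairs of odd integers) stayed the same, and only the conclusion flipped. Exactly one of `claim` and `negation` can be true.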
To finish this section and put this to use, let’s prove Claim 7.12!
Proof. Behold!
In this proof, we don’t have much to really explain: the solution pre-
sented self-evidently has the desired property (just check every row and
column.) If it was unclear, though, we’d have to have some explanation
along with our answer!
We close by giving a pair of slightly trickier examples for how construc-
tion can work, by using processes and algorithms:
Definition 7.1. Given a graph G, a vertex coloring of G with k colors
is any way to assign each vertex of G one of k different colors, so that
no two adjacent vertices get the same color.
1. Take all currently uncolored vertices that are connected to any red
vertices by an edge, and color them blue.
2. Take all currently uncolored vertices that are connected to any blue
vertices by an edge, and color them red.
3. If there are any uncolored vertices left, go back to (1) and repeat.
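The three steps above translate into a short loop. This sketch makes a few assumptions: the graph is a dictionary of neighbor lists, the process begins with one chosen vertex already colored red, and the names `two_color` and `is_proper` are invented here.

```python
def two_color(adj, start):
    """Alternately color uncolored neighbors of red vertices blue, and of blue vertices red."""
    color = {start: "red"}  # assumption: the process starts from one red vertex
    while len(color) < len(adj):
        before = len(color)
        # Step 1 (red -> blue), then step 2 (blue -> red).
        for current, other in (("red", "blue"), ("blue", "red")):
            frontier = [
                u
                for v in adj
                if color.get(v) == current
                for u in adj[v]
                if u not in color
            ]
            for u in frontier:
                color[u] = other
        if len(color) == before:
            break  # nothing new was reachable, so the graph is not connected
    return color


def is_proper(adj, color):
    """Check that no edge joins two vertices of the same color."""
    return all(color[u] != color[v] for u in adj for v in adj[u])


# A cycle on six vertices: 0 - 1 - 2 - 3 - 4 - 5 - 0.
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
coloring = two_color(cycle, 0)
```

The process always assigns colors, but whether the result is a valid vertex coloring depends on the graph; `is_proper` lets you check. On the six-cycle it is valid, with the colors alternating around the cycle.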
❼ Label the (2n)^2 vertices in the grid graph G_{2n,2n} with coordinates
(i, j), where vertex (1, 1) is the vertex in the bottom-left-hand
corner and (2n, 2n) is the vertex in the upper-right-hand corner.
❼ Start at (1, 1).
❼ From this vertex, walk to the right until you’re at the bottom-right
corner (1, 2n).
❼ Go up one step to (2, 2n), and then walk back to the left until
you’re at (2, 2).
❼ Go up one step to (3, 2), then walk back to the right until you’re
at (3, 2n).
❼ Go up one step to (4, 2n), and then walk back to the left until
you’re at (4, 2).
❼ Go up one step to (5, 2), then walk back to the right until you’re
at (5, 2n).
❼ . . . Keep doing this! Eventually, you will find yourself at (2n, 2),
having walked on all of the vertices whose second coordinates are
not equal to 1, and not having visited any vertices whose second
coordinate is 1 other than (1,1). (This is where the “2n” part
comes in: because we go right on odd rows and left on even rows,
if our grid has even height then we’ll be going left on our top row
and thus wind up at (2n, 2) as claimed.)
❼ Walk from (2n, 2) to (2n, 1), and then go down to (1, 1).
By construction we have visited all vertices in our graph exactly once,
and thus created a Hamiltonian circuit, as desired.
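The walk described above can be generated and checked by a short program. In this sketch, vertices are (row, column) pairs as in the construction, and the function names `grid_circuit` and `is_hamiltonian_circuit` are made up for illustration.

```python
def grid_circuit(n):
    """List the vertices of the 2n-by-2n grid in the order the walk visits them."""
    size = 2 * n
    # Row 1: walk right from (1, 1) to (1, 2n).
    path = [(1, c) for c in range(1, size + 1)]
    # Rows 2..2n: snake over columns 2..2n, going left on even rows, right on odd rows.
    for r in range(2, size + 1):
        cols = range(size, 1, -1) if r % 2 == 0 else range(2, size + 1)
        path += [(r, c) for c in cols]
    # Finish by stepping to (2n, 1) and walking down column 1 toward (1, 1).
    path += [(r, 1) for r in range(size, 1, -1)]
    return path


def is_hamiltonian_circuit(path, size):
    """Check each step (including the closing step) is a grid edge and every vertex appears once."""
    steps = zip(path, path[1:] + path[:1])
    adjacent = all(
        abs(r1 - r2) + abs(c1 - c2) == 1 for (r1, c1), (r2, c2) in steps
    )
    return adjacent and len(path) == len(set(path)) == size * size


small = grid_circuit(1)
checks = [is_hamiltonian_circuit(grid_circuit(n), 2 * n) for n in range(1, 6)]
```

For n = 1 the walk is simply the four corners of a square in order. The check passes for each n tried here, though of course it is the argument above that covers all n.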
The claim we proved above (one where we were in some sense “growing”
or “extending” a result on small values of n to get to larger values of n)
is precisely the kind of question that induction is set up to solve! The
Fibonacci numbers, which we introduce in the next question, are another
object where this sort of “extension” approach is useful to consider.
To illustrate how it works, let’s use it to calculate the first few values of
the Fibonacci sequence! We know that f_0 = 0, f_1 = 1 by definition.
To find f_2, we can use the fact that for any n ≥ 2, f_n = f_{n−2} + f_{n−1} to
calculate that
f_2 = f_0 + f_1 = 0 + 1 = 1.
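The same rule can be carried out by a short loop. This is a minimal sketch using the definition f_0 = 0, f_1 = 1, f_n = f_{n−2} + f_{n−1}; the function name `fibonacci` is our own choice.

```python
def fibonacci(n):
    """Return f_n, where f_0 = 0, f_1 = 1 and f_n = f_(n-2) + f_(n-1) for n >= 2."""
    a, b = 0, 1  # a holds f_i and b holds f_(i+1), starting at i = 0
    for _ in range(n):
        a, b = b, a + b
    return a


first_values = [fibonacci(i) for i in range(8)]
```

Running this gives 0, 1, 1, 2, 3, 5, 8, 13 for f_0 through f_7.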
\[ \text{InsertionSortSteps}(n) = \frac{3n^2 + 9n - 10}{2}. \]
.
In other words, we’ve shown that our claim holds at k + 1, and have thus
proven our claim by induction!
We define this as follows: take any graph G and any edge e in G with two
distinct endpoints. We define G_e, the graph obtained by contracting this
edge, as follows: take G, delete e, and then combine e’s two endpoints
together into a single vertex, preserving all of the other edges that the
graph has along the way.
We draw examples of this process at right: here, we have started with
a graph on six vertices, and then contracted one by one the edges high-
lighted in red at each step.
Notice that contracting an edge decreases the number of vertices by 1
at each step, as it “squishes together” two adjacent vertices into one
vertex. It also decreases the number of edges by 1 at each step, as we
are contracting an edge to a point!
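Here is one possible way to carry out a contraction in code. It assumes a multigraph stored as a set of vertices plus a list of (u, v) edge pairs, and it names the merged vertex by the pair (u, v); both the representation and the function name `contract` are assumptions of this sketch.

```python
def contract(vertices, edges, index):
    """Contract edges[index]: delete it and merge its two endpoints into one vertex."""
    u, v = edges[index]
    assert u != v, "the contracted edge needs two distinct endpoints"
    merged = (u, v)  # hypothetical label for the combined vertex

    def rename(w):
        return merged if w in (u, v) else w

    new_vertices = {rename(w) for w in vertices}
    # Keep every other edge, redirecting endpoints u and v to the merged vertex.
    new_edges = [
        (rename(a), rename(b)) for i, (a, b) in enumerate(edges) if i != index
    ]
    return new_vertices, new_edges


# A triangle 1-2-3 with a pendant vertex 4 attached to 3.
vertices = {1, 2, 3, 4}
edges = [(1, 2), (2, 3), (3, 1), (3, 4)]
new_vertices, new_edges = contract(vertices, edges, 0)
```

Contracting the edge {1, 2} leaves 3 vertices and 3 edges, one fewer of each. Notice that the two remaining triangle edges now both join the merged vertex to 3, which is why contraction is naturally an operation on multigraphs.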
Finally, notice that contracting an edge preserves the property that our
graph is connected. To see why, take any walk
in our graph. Notice that if we contracted an edge {v_i, v_{i+1}} in this walk,
this would collapse the vertices v_i, v_{i+1} into some new vertex v_{i⊕i+1} and
preserve all of the edges other than {v_i, v_{i+1}}. As a result, our walk
would just become
Notice that this result applies to simple graphs as well, as any simple
graph is certainly a multigraph!
We can also use induction to prove Theorem 6.2! We split this result
into two parts, as it’s a longish equivalence proof:
Paper is cheap, and it’s usually just a lot faster to try stuff and see which
things break than to predict ahead of time which method is “best.” Also,
most problems in maths can be solved by a number of different meth-
ods: there’s rarely a single “correct” approach to a problem! Instead,
many problems can be solved with many different techniques, and each
different proof can help illustrate a new way of thinking about the task
at hand.
With that said, though, there are clues or signs in a problem statement
that can indicate that certain techniques might be useful. There are no
hard-and-fast rules here, but the following observations often come in
handy:
❼ Are you proving a claim of the form “if (some claim A is true),
then (some other claim B is true)?” If so, a direct proof is maybe
a good idea! Write down what it would mean for A to be true, and
try to use that assumption to prove that B is also true.
❼ Are you dealing with modular arithmetic, even versus odd num-
bers, claims about “is a multiple of,” or absolute values? Cases
are often useful here. (More generally: if you have any problem
where the inputs or outputs can be split into cases, do so! Proofs
by cases often combine with other proof methods.)
❼ Are you being asked to prove a claim of the form “Show that blah exists?”
Construction’s a good way to go here! (This is opposed to claims of
the form “Show that every x has property foo,” which you usually
do not do by construction, as it’s hard to construct every x!)
❼ Are you proving a claim where it seems like your previous results
stick together to give you a later result? (Tiling problems that
involve a general integer n, anything defined recursively like the
Fibonacci sequence, processes that have recursion in them, . . . )
When you’re writing your proof, do you find yourself wanting to
Proof. Let’s think about which of our proof methods we want to try:
❼ Direct proof: we could try this. This would involve expanding out
what it means to be a multiple of 15, and trying to use logic/known
results to get to the conclusion.
❼ Cases: even though cases is often a good technique when working
with mods / multiple problems, this is likely not a great idea here.
This is because there isn’t really a clear set of cases you’d want to
divide n into: even versus odd doesn’t seem relevant, and consider-
ing all fifteen possible remainders of n % 15 seems painful enough
to not do unless absolutely necessary.
❼ Contradiction: could do, if we’re stuck!
❼ Construction: not relevant. We’re proving something for every
integer, not building examples for some values.
❼ Induction: This doesn’t obviously look like induction, in that it’s
not clear how you’d relate 16^n − 1 to the “next” value 16^{n+1} − 1.
With some algebraic trickery, though, this is possible! Notice that
16(16^n − 1) = 16^{n+1} − 16 = (16^{n+1} − 1) − 15, and thus that we’ve
related one step to the next (in a way that involves a 15, which
seems promising.) So if you saw this trick, then this is promising!
❼ Disproof: If you were suspicious of this claim, you could start by
calculating a handful of values of 16^n − 1, and see if any failed to
be a multiple of 15.
(Margin table:)
n | 16^n − 1
0 | 1 − 1 = 0
1 | 16 − 1 = 15
2 | 16^2 − 1 = 255 = 15 ⋅ 17
3 | 16^3 − 1 = 4095 = 15 ⋅ 273
4 | 16^4 − 1 = 65535 = 15 ⋅ 4369
No obvious counterexamples immediately showed up in our table
in the margins, so let’s not try to disprove this just yet.
So, amongst our proof methods, a direct proof and induction look
promising. Let’s try induction first!
Base case: we saw in our table in the margins that our claim holds for
n = 0, 1, 2, 3 and 4. So we’ve established our claim for a number of base
cases.
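The margin table and the algebraic trick behind the inductive step are both easy to recompute. In this sketch the names `rows` and `identity_holds` are made up, and the ranges are arbitrary.

```python
# Rebuild the table: n, 16^n - 1, and the quotient (16^n - 1) / 15.
rows = [(n, 16 ** n - 1, (16 ** n - 1) // 15) for n in range(5)]

# The trick relating one step to the next: 16 * (16^n - 1) = (16^(n+1) - 1) - 15.
identity_holds = all(
    16 * (16 ** n - 1) == (16 ** (n + 1) - 1) - 15 for n in range(50)
)
```

The quotients come out as 0, 1, 17, 273 and 4369, agreeing with the table.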
Inductive step: For the inductive step, we assume that we’ve proven
our claim for n: i.e. that 16^n − 1 is a multiple of 15. We seek to use this
claim to prove that our claim holds for the “next” value n + 1: i.e. that
16^{n+1} − 1 is also a multiple of 15.
This is not too hard to do! Notice that as we observed above,
This is not the only way you could prove this result! We could also use
a direct proof:
Proof. We want to show that 16^n − 1 is always a multiple of 15, for any
positive integer n.
By definition, this holds true if and only if 16^n ≡ 1 mod 15.
So: we know that 16 ≡ 1 mod 15, because 16 − 1 is itself a multiple of 15.
We also know from Claim ?? that for any positive integers a, b, c, n that
if a ≡ b mod c, then a^n ≡ b^n mod c as well.
Combining these facts tells us that for any positive integer n, we have
16^n ≡ 1^n mod 15. Because 1^n = 1 for all n, this gives us 16^n ≡ 1 mod 15.
By definition, this means that for every positive integer n we’ve shown
that 16^n − 1 is a multiple of 15, as desired!
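Python's built-in three-argument `pow(a, n, c)` computes a^n % c directly, which makes the modular facts used in this proof easy to spot-check. The ranges below are arbitrary, and the variable names are our own.

```python
# 16 and 1 leave the same remainder on division by 15 ...
same_remainder = (16 % 15) == (1 % 15)

# ... so their powers should too: 16^n % 15 == 1^n % 15 == 1.
powers_agree = all(
    pow(16, n, 15) == pow(1, n, 15) == 1 for n in range(1, 200)
)

# Equivalent statement: 16^n - 1 is a multiple of 15.
multiples = all((16 ** n - 1) % 15 == 0 for n in range(1, 200))
```

As before, a finite check is only evidence; the proof above is what handles every positive integer n.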
Proof. While we could go through all of the proof methods again, we’ll
shortcut the process and explain why we know this is a constructive
proof: it’s asking us to show that there is a graph with some property!
This isn’t a “show all graphs have property foo” problem or a “take any
graph G, show that it cannot be blah” task: this is just asking us to find
some single graph with a given property.
So, uh: behold the graph at right!
10. Consider the following two-player game: starting with the single
number 123, two players alternately subtract numbers from the set
{1, 2, 3} from this value. The player who first gets this value to 0
wins.
If you want to win this game, should you go first or second? Prove
that your chosen player has a winning strategy. (Hint: try induc-
tion!)
11. Take an equilateral triangle with side length 2^n. Divide it up into
side-length 1 equilateral triangles, and delete the top triangle. Call
this shape T_n:
Can you ever reach the following configuration? Prove your claim.