Discrete Maths Notes
Lecture Notes
Alexander Tiskin
University of Warwick
Autumn Term 2004/05
It is less suitable as a general reference for the course material, but instead
concentrates on what is arguably its most important aspect: the concept of
a proof. It is very clearly written, and in many respects complements the
books on the course's main reading list.
Electronic resources
As the course progresses, the material will be available on the course website:
http://www.dcs.warwick.ac.uk/~tiskin/teach/dm1.html . The Rosen
book has a website of its own: http://www.mhhe.com/rosen .
A forum (discussion group) on Warwick Forums has been set up to exchange messages relevant to the course. In the past, it proved to be a useful tool for communication within the CS127 student population, and also
between students and tutors. The forum is available at http://forums.
warwick.ac.uk . The University IT Services should be able to help in case
of any problems with accessing this forum. As with all discussion groups,
its abuse will not be tolerated.
Assessment
One of the main challenges of the course is the lack of continuous coursework
assessment. This means that you have to work hard, without being forced
to. The course is assessed by a two-hour examination in week 1 of Summer
Term. Results of this and other exams will be announced at the end of the
academic year.
A new element of the course introduced last year is the class test, which
will be held in week 7 of Autumn Term. The test will consist of a one-hour
paper with 20 true or false questions, to be answered on specially prepared
sheets, which then will be scanned and marked automatically. The resulting
mark will not contribute to your official course assessment, and the class
test itself is not mandatory. However, it is strongly recommended to take
the test, in order to get feedback on your progress and to prepare yourself
for the Summer Term exam.
Mathematics studies concepts that are abstract, idealised images of the real
world. An example of such a concept is natural numbers:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, . . .
We all learn it in early childhood, yet nobody has ever seen "three", as
opposed to three oranges or the figure 3 in black ink in the top-right corner
of this page.
A philosopher would say here: well, our concept of "three" captures the
"threeness" of all three-element sets that we have seen before or may see in
the future: three apples, three penguins, or two sheep with a sheepdog in the
field. The number 0 can be accommodated by this view as well: it represents
an empty set, a set that contains nothing.
While the philosopher's answer makes a lot of sense, it is also true that
in mathematics, concepts depart from immediate reality and start to live a
life of their own. Consider, for example, the notion of a set, which our friend
the philosopher has used to define natural numbers. We can have a set of
apples or penguins, so why not think about sets of numbers? Say, the set of
this week's National Lottery winning numbers: {14, 20, 25, 32, 47, 49} (note
the use of curly brackets to denote a set). We could then think of some
more interesting (in my opinion) examples, such as the set N of all natural
numbers:
N = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, . . . },
the set of all integers (natural numbers and their negatives):
Z = {…, −6, −5, −4, −3, −2, −1, 0, 1, 2, 3, 4, 5, 6, …},
or the set of all even integers:
{…, −10, −8, −6, −4, −2, 0, 2, 4, 6, 8, 10, …}.
For a mathematician, the last three sets are just as legitimate as a set
of three apples. However, there is a crucial difference: the new sets are
infinite. Infinite sets do not occur in reality: even the number of atoms in
the Universe is finite. Yet, we have just imagined a few infinite sets. Even
if we cannot write down the elements of these sets without resorting to the
"…" notation, we can capture these sets in our mind, and treat them as
we would treat any real-world set.
Of course, to make our theory of sets useful, we will have to answer some
important questions:
do infinite sets have a size? (yes they do, but of course these sizes
are beyond natural numbers);
Logic
2.1
We use all sorts of sentences in everyday speech. Our language has special
ways in which we can communicate information, ask a question, give a command, express our thoughts, feelings or emotions. In mathematics, however,
we restrict ourselves to only one type of sentence: statements, which must
be either true or false. Here are some examples of statements:
Five is less than ten.
Pigs can fly.
There is life on Mars.
Note that we know the last statement must be true or false, despite the fact
that we cannot decide between true and false from our present knowledge.
Here are some examples of sentences that are not statements:
Welcome to Tweedy's farm!
What's in the pies?
It's not as bad as it seems…
The last sentence will become a statement if we substitute the name of a
particular object for the pronoun "it". Of course, we must also give a clear,
unambiguous definition of "bad", "seems", etc.
Thus, every statement has a value taken from the set B = {F, T }. The
two elements of this set are called Boolean values. There are special operations, called Boolean operators, that one can perform on Boolean values
(rather like addition and multiplication on natural numbers):
The operator ¬ (negation) is defined by ¬F = T and ¬T = F. The binary
operators ∧ (conjunction), ∨ (disjunction), → (implication) and ↔
(equivalence) are defined by the following truth tables:

A  B | A ∧ B    A  B | A ∨ B    A  B | A → B    A  B | A ↔ B
F  F |   F      F  F |   F      F  F |   T      F  F |   T
F  T |   F      F  T |   T      F  T |   T      F  T |   F
T  F |   F      T  F |   F      T  F |   F      T  F |   F
T  T |   T      T  T |   T      T  T |   T      T  T |   T

Note that A → B is false only in the case T → F, that is, when A is true
and B is false.
2.2
Laws of logic
The truth tables completely define Boolean operators, so, in principle, the
truth value of any compound statement, however complicated, can be found
by a series of truth table lookups. In practice, we often want an easier
and more intuitive method of dealing with compound statements. One such
method consists in applying certain properties of Boolean operators, known
as the laws of logic. From the formal point of view, these laws do not add
anything new to the operator definitions: each of the laws follows directly
from the truth tables. However, the laws offer an alternative, complementary
approach to logic, and are widely applicable. Many of these laws are similar
to the properties of the arithmetic operators + and ·.
In the following formulas, letters A, B, C stand for arbitrary statements.
The statements of the laws are always true, irrespective of the truth values
of A, B, C.
The first group of laws involves only one operator and at most two elementary statements each:

¬¬A ≡ A                             double negation
A ∧ A ≡ A
A ∨ A ≡ A                           idempotence of ∧, ∨
A ∧ B ≡ B ∧ A
A ∨ B ≡ B ∨ A                       commutativity of ∧, ∨

(A ∧ B) ∧ C ≡ A ∧ (B ∧ C)           associativity of ∧
(A ∨ B) ∨ C ≡ A ∨ (B ∨ C)           associativity of ∨
A ∧ (B ∨ C) ≡ (A ∧ B) ∨ (A ∧ C)     distributivity of ∧ over ∨
A ∨ (B ∧ C) ≡ (A ∨ B) ∧ (A ∨ C)     distributivity of ∨ over ∧
The following pair of laws, called De Morgan's laws, describes the close
relationship between the operators ∧, ∨:

¬(A ∧ B) ≡ ¬A ∨ ¬B
¬(A ∨ B) ≡ ¬A ∧ ¬B

The remaining laws involve the constants F, T:

A ∨ F ≡ A
A ∧ T ≡ A                           identity laws
A ∧ F ≡ F
A ∨ T ≡ T                           annihilation laws
A ∧ ¬A ≡ F
A ∨ ¬A ≡ T                          excluded middle
A ∨ (A ∧ B) ≡ A ≡ A ∧ (A ∨ B)       absorption laws

The operators → and ↔ can be expressed via ¬, ∧, ∨:

A → B ≡ ¬A ∨ B
A ↔ B ≡ (A → B) ∧ (B → A) ≡ (A ∧ B) ∨ (¬A ∧ ¬B)

Again, both → and ↔ are formally redundant but, as we mentioned before,
very useful in practice.
All the above laws are in fact theorems, and proving them is a good
exercise in applying truth tables. Here is a table that proves one of De
Morgan's laws, ¬(A ∧ B) ≡ ¬A ∨ ¬B:
A  B | A ∧ B  ¬(A ∧ B) | ¬A  ¬B | ¬A ∨ ¬B
T  T |   T       F     |  F   F |    F
T  F |   F       T     |  F   T |    T
F  T |   F       T     |  T   F |    T
F  F |   F       T     |  T   T |    T
The columns for the two sides of the law, ¬(A ∧ B) and ¬A ∨ ¬B, are
identical; hence their truth values agree for any A, B.
We can use our laws of logic to prove new theorems. Here is an example.
Theorem 1 (Principle of proof by contradiction). For any statements
A, B, we have (A → B) ≡ (¬B → ¬A).
Proof. We apply the law for →, then the law of double negation, commutativity of ∨, and finally the law for → once again, this time in the opposite
direction:

(¬B → ¬A) ≡ ¬¬B ∨ ¬A ≡ B ∨ ¬A ≡ ¬A ∨ B ≡ (A → B)
The above theorem gives us a useful generic proof method. When we are
given a statement A, and we are asked to prove a statement B, we may start
by assuming that B is false (i.e. ¬B holds), and then show that a statement
contradicting A (i.e. ¬A) follows from our assumption. The principle of proof
by contradiction tells us that in this case, B must be a logical consequence
of A.
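Since each side of the law is a compound statement in A and B only, the equivalence can also be verified by brute force over all four truth assignments; a minimal Python check (the helper implies is an ad hoc name):

```python
from itertools import product

def implies(a, b):
    return (not a) or b   # A → B

# (A → B) ≡ (¬B → ¬A) holds for every assignment of truth values to A, B
assert all(implies(a, b) == implies(not b, not a)
           for a, b in product((False, True), repeat=2))
```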
2.3
Statements we have been making so far declared facts about specific objects:
Five is less than ten.
The pie is not as bad as it looks.
Often we need more than that: we want to declare a fact about a specific
set of objects. For example, we could say:
Some natural numbers are less than ten.
All pies are not as bad as they look.
In the first case, we could try to come up with a specific example that
proves it: say, five is less than ten. In the second case, we could restrict our
attention to a finite number of possible pies; let this set be {Chicken pie,
Mushroom pie, Cabbage pie}. Then the statement All pies are not as bad
as they look is a conjunction:
(Chicken pie is not as bad as it looks) ∧ (Mushroom pie is not as bad
as it looks) ∧ (Cabbage pie is not as bad as it looks)
There are problems with both these approaches. In the first case, it was
easy to find a specific instance (five) that proved our statement; for other
as:
∃x ∈ N : x < 10
∃x ∈ S : P(x)
With predicates having more than one variable, we can write more complicated quantified statements:
∀x ∈ N : ∃y ∈ N : x < y
∃y ∈ N : ∀x ∈ N : x < y
Note that the meaning, and even the truth value of the above two statements is different: the first one is true (for every natural number, there is
a greater number), the second is false (there is a natural number greater
than all natural numbers). In general, the meaning of a quantified statement depends on the order of the quantifiers.
The meaning of a quantified statement does not change if we change the
quantifier variable consistently throughout the statement. For example, we
can write:
∃z ∈ N : z < 10
∀x : P(x) ≡ ∀y : P(y)
These equivalences do not hold for an infinite S, since their right-hand sides
would not be well-defined. However, the following laws will hold for any
nonempty range, finite or infinite:
∀x : T ≡ T
∀x : F ≡ F
∃x : T ≡ T
∃x : F ≡ F

¬(∀x : P(x)) ≡ ∃x : ¬P(x)
¬(∃x : P(x)) ≡ ∀x : ¬P(x)
On a finite range, these laws can be proved by the laws of Boolean logic,
using properties of conjunction for ∀, and those of disjunction for ∃. On an
infinite range, the new laws must be taken as axioms.
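On a finite range the last two laws are just De Morgan's laws applied to a long conjunction or disjunction, and can be checked with Python's all/any; the sample predicates below are arbitrary:

```python
S = range(-5, 6)   # a small finite range

for P in (lambda x: x < 10, lambda x: x % 2 == 0, lambda x: x > 100):
    # ¬(∀x : P(x)) ≡ ∃x : ¬P(x)
    assert (not all(P(x) for x in S)) == any(not P(x) for x in S)
    # ¬(∃x : P(x)) ≡ ∀x : ¬P(x)
    assert (not any(P(x) for x in S)) == all(not P(x) for x in S)
```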
When several predicates are involved in a quantified statement, all the
usual laws of Boolean logic apply to these predicates. However, when we
introduce a quantifier, we must be careful not to "capture" inadvertently
any existing free variables, or any variables bound by other quantifiers. For
example, the statement (∃x : P(x)) ∧ (∃x : Q(x)) is, in general, not equivalent to ∃x : (P(x) ∧ Q(x)). This is because in the former statement, P(x)
and Q(x) may be satisfied by different values of x, whereas in the latter
statement the value of x must be the same for both P and Q. We can
make this argument even more forceful by replacing the first statement by
its logical equivalent: (∃x : P(x)) ∧ (∃y : Q(y)). By a similar reasoning,
there is no equivalence between the statements (∀x : P(x)) ∨ (∀x : Q(x)) and
∀x : (P(x) ∨ Q(x)), since the former is equivalent to (∀x : P(x)) ∨ (∀y : Q(y)).
However, the following equivalences hold:

(∀x : P(x)) ∧ (∀x : Q(x)) ≡ ∀x : (P(x) ∧ Q(x))
(∃x : P(x)) ∨ (∃x : Q(x)) ≡ ∃x : (P(x) ∨ Q(x))
As before, they can be proved by laws of Boolean logic for a finite range,
but must be taken as axioms when the range is infinite.
In general, a quantifier ∀x or ∃x is safe to "capture" a predicate Q, as
long as Q does not contain x as a free variable (in other words, as long as
all occurrences of x in Q are bound by other quantifiers). Therefore, we
have the following laws, where Q is always assumed to be a predicate not
containing x as a free variable:
(∀x : P(x)) ∧ Q ≡ ∀x : (P(x) ∧ Q)
(∃x : P(x)) ∧ Q ≡ ∃x : (P(x) ∧ Q)
(∀x : P(x)) ∨ Q ≡ ∀x : (P(x) ∨ Q)
(∃x : P(x)) ∨ Q ≡ ∃x : (P(x) ∨ Q)
Q → (∀x : P(x)) ≡ ∀x : (Q → P(x))
Q → (∃x : P(x)) ≡ ∃x : (Q → P(x))
(∀x : P(x)) → Q ≡ ∃x : (P(x) → Q)
(∃x : P(x)) → Q ≡ ∀x : (P(x) → Q)
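The quantifier-flipping laws for → are the least intuitive of the group; on a nonempty finite range they, too, can be checked exhaustively (the helper implies and the sample predicates are ad hoc):

```python
S = range(5)   # a nonempty finite range

def implies(a, b):
    return (not a) or b   # A → B

for P in (lambda x: x < 3, lambda x: x < 10, lambda x: False):
    for Q in (False, True):
        # (∀x : P(x)) → Q  ≡  ∃x : (P(x) → Q)
        assert implies(all(P(x) for x in S), Q) == any(implies(P(x), Q) for x in S)
        # (∃x : P(x)) → Q  ≡  ∀x : (P(x) → Q)
        assert implies(any(P(x) for x in S), Q) == all(implies(P(x), Q) for x in S)
```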
Just like the laws of Boolean logic, which are useful in simplifying statements
involving Boolean operators, the above laws, along with the other laws introduced in this section, allow us to simplify statements involving quantifiers.
The ultimate purpose of all these laws, and of logic as a whole, is to allow
us to express and prove facts about objects and sets that we build across
all branches of mathematics. In the following sections of the course, we will
make extensive use of this section's language and ideas.
Sets
3.1
The notion of a set is central to mathematics. However, it was not until the
late 1800s and early 1900s that mathematicians began to study sets in their
own right. Sets and set elements are basic concepts, and, as such, are left
without a formal definition. Georg Cantor (1845–1918), one of the creators
of modern set theory, gave the following description:
By a set we shall understand any collection into a whole M of
definite, distinct objects of our intuition or of our thought. These
objects are called the elements of M .
The above is not a mathematical definition: it just describes our intuitive
idea of sets (collections) and their elements (objects). However, we can
formulate some characteristic properties that we associate with sets:
Any object can be an element of a set. For example, we can form the
following sets:
Planets = {Mercury, Venus, . . . , Pluto}
Neven = {0, 2, 4, 6, 8, 10, . . .}
NonpositiveNaturals = {0}
EmptySets = {∅}
Note that the set EmptySets is not empty: it contains an element, which
happens to be the set ∅. Likewise, the set MorningStars is distinct from the
planet Venus, and the set NonpositiveNaturals is distinct from the number
zero.
The fact that x is an element of set S is written as x ∈ S. Thus,
Jupiter ∈ Planets, orange ∉ Junk. A set A is called a subset of a set B
(A ⊆ B), if all elements of A are also elements of B (but not necessarily the
other way round). For example, Neven is a subset of N (Neven ⊆ N), since
every even natural number is a natural number. We can write the definition
formally as follows:

A ⊆ B ≡ ∀x : x ∈ A → x ∈ B
By this definition, the empty set is a subset of any set (since the range
of the quantified statement in the definition is empty), and every set is a
subset of itself.
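Over finite sets, the quantified statement in the definition becomes an all(...) test; a minimal Python sketch (Python's built-in sets also expose this test as the <= operator):

```python
def is_subset(A, B):
    # A ⊆ B  ≡  ∀x : x ∈ A → x ∈ B
    return all(x in B for x in A)

N20    = set(range(20))                  # finite stand-in for N
N_even = {x for x in N20 if x % 2 == 0}

assert is_subset(N_even, N20)            # Neven ⊆ N
assert is_subset(set(), N20)             # the empty set is a subset of any set
assert is_subset(N20, N20)               # every set is a subset of itself
assert is_subset(N_even, N20) == (N_even <= N20)   # agrees with built-in <=
```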
It is very important to distinguish between the signs ∈ (element inclusion) and ⊆ (subset inclusion). Despite their superficial similarity, their
meaning is very different: the first indicates an individual member of a set,
the second an arbitrary subset of a set, including the two possible extremes: the empty set and the whole working set. Element inclusion is
a basic concept, and therefore has no formal definition; the definition of
subset inclusion in terms of element inclusion was given in the previous
paragraph.
Our intuitive idea of a set is an arbitrary collection of elements, where
the order and any repetitions of elements are ignored. Can we make this
idea formal by giving the basic concept of a set appropriate axioms?
The fact that order and repetitions do not matter is easy to express:
Axiom (The Law of Extensionality). If two sets contain the same elements, they are equal.
In other words, for any sets A, B, we have
(A ⊆ B ∧ B ⊆ A) → A = B
In particular, any two sets without elements are equal; therefore there is
only one empty set ∅.
When dealing with sets, we often need to select from a given set a subset
that satisfies a certain property. For example, we could start from the set
N, and select from it only those numbers that are even. In general, let S
be our working set; then we can express any property of its elements by a
predicate P (x), where x is a variable ranging over S. A set of all elements
x of S for which P(x) is T is denoted {x ∈ S | P(x)}. For example,
Neven = {x ∈ N | x is even}
The variable x in the above expression is a dummy: the set Neven will not
change if we replace all occurrences of x in its definition by y, or by any
other variable.
For any set S, we have
{x ∈ S | T} = S
{x ∈ S | F} = ∅
{x ∈ Planets | x is a banana} = ∅
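The notation {x ∈ S | P(x)} corresponds exactly to a set comprehension in Python, which selects from a finite working set the elements satisfying a predicate; the finite stand-in for N below is my own:

```python
Planets = {"Mercury", "Venus", "Earth", "Mars", "Jupiter",
           "Saturn", "Uranus", "Neptune", "Pluto"}
N30 = set(range(30))                       # finite stand-in for N

Neven = {x for x in N30 if x % 2 == 0}     # {x ∈ N | x is even}

assert {x for x in N30 if True}  == N30    # {x ∈ S | T} = S
assert {x for x in N30 if False} == set()  # {x ∈ S | F} = ∅
assert {x for x in Planets if x == "a banana"} == set()
```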
Using the predicate notation, we can attempt to formalise completely
our intuitive notion of a set. We have described a set as an "arbitrary"
collection of elements: that is, we can form a set of elements satisfying
any given predicate. We can now make this our second axiom.
Axiom (The Law of Abstraction). For any predicate P (x), there is a
set A = {x | P (x)}, such that an element x is in A if and only if P (x) is
true.
Our two axioms, the law of extensionality and the law of abstraction,
formalise our intuition about sets. We could try to base a whole theory
on these two axioms. Indeed, such attempts were made in the early stages
of set theory development. Unfortunately, it was soon realised that the
extensionality and abstraction laws, taken together, are inconsistent; that
is, a theory based on these laws leads to contradictions. The simplest of
these contradictions is called Russell's paradox, after the great logician and
philosopher Bertrand Russell (1872–1970).
Consider the following predicate: P(x) ≡ x ∉ x (note that it involves
element inclusion, rather than subset inclusion). In words, we could say that
P(x) means "x is not a member of itself". This would be definitely true if
x is not a set; it is also true for all sets we have seen so far, and for all sets
we can think of (except perhaps an imaginary set of all sets). We may or
may not believe that P (x) is true for all x: whether this is the case or not
is irrelevant, since both possibilities will lead to a contradiction. What is
relevant is that P (x) is a well-formed predicate (i.e. is true or false for any
given x). Therefore, by the law of abstraction, we can form the set B of all
objects x that satisfy the predicate P (x):
B = {x | P(x)} = {x | x ∉ x}
In words, B is the set of all objects that are not their own members.
Now consider the following statement R: B ∈ B. It is a well-formed
statement, so it must be either true or false. Suppose statement R is true,
so B is a member of B, and, like all members of B, must not be a member
of itself. This makes the statement R false, which is impossible, since
we assumed it was true. Now suppose statement R is false, so B is not
a member of B. By definition of set B, everything that is not a member
of itself must be a member of B, so B itself must be a member of B. This
makes statement R true, which is impossible, since we assumed it was
false! Thus, statement R cannot be either true or false, so there must be
something wrong in our reasoning. The only thing that can be wrong is
the law of abstraction that we used to form the set B.
There is an alternative, somewhat lighter form of Russell's paradox.
Imagine a village that has a single (male) barber with the following code of
practice: the barber will shave every man in the village, but only if this man
does not shave himself. Must the barber shave himself? The question has no
answer, since both choices of the answer lead to a contradiction. Therefore,
the barber's rule is inconsistent.
Because of Russell's paradox, the theory based on the laws of extensionality and abstraction is often called the naïve set theory. It captures our
intuitive notion of a set but, being inconsistent, cannot serve as a formal
foundation of mathematics. A lot of time and effort have been spent in
order to provide a more sound axiomatic system for sets. Now, several such
systems exist; they are all significantly more complicated than the naïve set
theory. We shall not go into their details in this course. For the rest of
the course, we will use implicitly the laws of extensionality and abstraction,
and in particular the convenient notation for set abstraction {x | P(x)}. On
the level of our course, no paradoxes similar to Russell's will arise. Indeed,
unless mathematicians create them artificially, they seldom arise at all.
3.2
Operations on sets
We have already studied the concept of set abstraction, which allows us (ideally) to form a set {x | P(x)} from any predicate P(x). We will now use this
method to define operations that create new sets from existing ones. Despite
the problems with abstraction arising due to Russell's paradox, these new
set operations will be completely non-controversial.
Let A, B be any sets. The intersection of A, B, denoted A ∩ B, is a set
that contains all elements which are members of both A and B:

A ∩ B = {x | (x ∈ A) ∧ (x ∈ B)}

The union of A, B, denoted A ∪ B, is a set that contains all elements which
are members of either A, or B (or both):

A ∪ B = {x | (x ∈ A) ∨ (x ∈ B)}
A ∩ A = A
A ∪ A = A                           idempotence of ∩, ∪
A ∩ B = B ∩ A
A ∪ B = B ∪ A                       commutativity of ∩, ∪

Also,

(A ∩ B) ∩ C = A ∩ (B ∩ C)           associativity of ∩
(A ∪ B) ∪ C = A ∪ (B ∪ C)           associativity of ∪
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)     distributivity of ∩ over ∪
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)     distributivity of ∪ over ∩

In the following laws, S is the universal set, and A′ = S \ A denotes the
complement of a set A ⊆ S:

(A ∩ B)′ = A′ ∪ B′
(A ∪ B)′ = A′ ∩ B′                  De Morgan's laws
A ∪ ∅ = A
A ∩ S = A                           identity laws
A ∩ ∅ = ∅
A ∪ S = S                           annihilation laws
A ∩ A′ = ∅
A ∪ A′ = S                          excluded middle
A ∪ (A ∩ B) = A = A ∩ (A ∪ B)       absorption laws
All the above laws are theorems, and are easy to prove by the laws of Boolean
logic. Here is an example:
Theorem 2 (De Morgan's Law). For any universal set S, and for any
sets A, B ⊆ S, we have (A ∪ B)′ = A′ ∩ B′.
Proof. We apply the definition of complement, the definition of set union,
the Boolean De Morgan's law, the Boolean distributivity law, once again the
definition of complement, and finally the definition of set intersection:

(A ∪ B)′ = S \ (A ∪ B)
= {x | (x ∈ S) ∧ ¬(x ∈ A ∪ B)}
= {x | (x ∈ S) ∧ ¬((x ∈ A) ∨ (x ∈ B))}
= {x | (x ∈ S) ∧ (¬(x ∈ A) ∧ ¬(x ∈ B))}
= {x | ((x ∈ S) ∧ ¬(x ∈ A)) ∧ ((x ∈ S) ∧ ¬(x ∈ B))}
= {x | (x ∈ S \ A) ∧ (x ∈ S \ B)}
= {x | (x ∈ A′) ∧ (x ∈ B′)}
= A′ ∩ B′
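With a small universal set, both De Morgan's laws for sets can be confirmed directly, since Python's set type provides ∩, ∪ and \ as the operators &, | and -; the sample sets below are arbitrary:

```python
S = set(range(10))       # universal set
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

def complement(X):
    return S - X         # the complement of X within S, i.e. S \ X

assert complement(A | B) == complement(A) & complement(B)   # law for A ∪ B
assert complement(A & B) == complement(A) | complement(B)   # law for A ∩ B
```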
Let us compare once again the laws of Boolean logic with the laws of
sets. In logic, we have the set of Boolean values B = {F, T}, and the Boolean
operators ¬, ∧, ∨. In set theory, we have a fixed universal set S, and the set
operations ′ (complement), ∩, ∪. The laws obeyed by these two structures
(set B and the set of all subsets of S) are essentially the same. There are
many other similar structures in mathematics, with operations governed by
exactly the same laws. Such structures are called Boolean algebras.
The Boolean algebra formed by all subsets of a given set S is called the
powerset of S. Formally, the powerset of S is the set P(S) = {A | A ⊆ S}.
In other words, a set is a member of P(S), if and only if it is a subset of S:
∀A : A ∈ P(S) ≡ A ⊆ S.
Let us consider some examples. The simplest case is S = ∅. The empty
set contains exactly one subset: the empty set itself. Hence, the powerset
of ∅ is a singleton: P(∅) = {∅}. Note: the powerset of the empty set is not
empty.
Now let S be a singleton, for example S = {Bunty}. Set S contains two
subsets: S itself, and the empty set. Hence, the powerset of S consists of
two elements: P(S) = {∅, {Bunty}}. In general, the powerset of any set S
contains, among other elements, the set S itself, and the empty set. For
example,
P({a, b, c}) = {, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}
When we form a subset of a given set S, we have two choices for each
element: either to include, or not to include this element in the subset.
Thus, for a finite set of n ∈ N elements, we make n independent choices,
leading to 2ⁿ different subsets. Therefore, the powerset of a finite set is
finite. Furthermore, the powerset of an n-element finite set consists of 2ⁿ
elements. Note that this also holds for P(∅), since 2⁰ = 1.
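The 2ⁿ count is easy to confirm by generating powersets of small sets; a sketch using itertools.combinations (frozenset is used so that the subsets can themselves be members of a set):

```python
from itertools import combinations

def powerset(s):
    """All subsets of s, as a list of frozensets."""
    items = list(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

ps = powerset({"a", "b", "c"})
assert len(ps) == 2 ** 3                    # an n-element set has 2^n subsets
assert frozenset() in ps                    # the empty set is always a member
assert frozenset({"a", "b", "c"}) in ps     # so is the set itself
assert powerset(set()) == [frozenset()]     # P(∅) = {∅} is not empty
```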
If set S is infinite, then its powerset P(S) must also be infinite. This is
because P(S) contains, among other elements, all singletons {a}, such that
a S. Since S is infinite, the number of such singletons is also infinite.
The last set operation that we consider in this section is based on the
idea of a sequence. Let x₁, x₂, …, xₙ be any objects (n ∈ N). A (finite)
sequence (x₁, x₂, …, xₙ) is different from a set {x₁, x₂, …, xₙ} in that the
order and repetition of elements do matter in a sequence. For example,
the sequence JunkSeq₁ = (239, banana, ace of spades) is different from the
sequence JunkSeq₂ = (banana, 239, ace of spades, 239). The natural number n
is called the length of the sequence. For example, the length of JunkSeq₁ is
three, and the length of JunkSeq₂ is four. We will give a formal definition
of sequences further in the course.
A sequence of length two is called an ordered pair. Let A, B be any sets.
The Cartesian product of sets A, B, denoted A × B, is the set of all ordered
pairs (a, b), where a ∈ A, b ∈ B. In other words, A × B = {(a, b) | (a ∈
A) ∧ (b ∈ B)}.
The Cartesian product is named after the great philosopher and mathematician René Descartes (1596–1650). Descartes lived long before sets
emerged as a separate mathematical concept. However, Descartes was the
first to realise that in geometry, a point in the plane can be represented by
a pair of numbers, called coordinates. Therefore, the whole plane is represented by what we now call a Cartesian product of two lines.
Here are some examples of Cartesian products:
A × ∅ = ∅ × A = ∅  for any set A
{Bunty} × {Fowler} = {(Bunty, Fowler)}
of m · n elements. Note that this also holds for the products involving the
empty set: the Cartesian product of the empty set with any other set is
empty.
If one of the sets A, B is infinite, and the other is non-empty, then the
Cartesian product A × B must be infinite. This is because if, say, A is
infinite, and b ∈ B, then we can form an infinite number of distinct pairs
(x, b), where x ∈ A. Each of such pairs belongs to A × B.
In general, A × B ≠ B × A (the equality only holds when A = B, or
when one of A, B is empty). Hence, the Cartesian product operator is not
commutative. Furthermore, a nested pair ((a, b), c) is different from the
nested pair (a, (b, c)), hence (A × B) × C ≠ A × (B × C), so the Cartesian
product operator is not associative. However, it still has some distributive
properties with respect to other set operations:
A × (B ∩ C) = (A × B) ∩ (A × C)     distributivity of × over ∩
A × (B ∪ C) = (A × B) ∪ (A × C)     distributivity of × over ∪
A × (B \ C) = (A × B) \ (A × C)     distributivity of × over \
(A ∩ B) × C = (A × C) ∩ (B × C)
(A ∪ B) × C = (A × C) ∪ (B × C)
(A \ B) × C = (A × C) \ (B × C)
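These distributivity laws, and the failure of commutativity, can be confirmed on small sets with itertools.product; the sample sets are arbitrary:

```python
from itertools import product

def cartesian(A, B):
    return {(a, b) for a, b in product(A, B)}   # A × B as a set of pairs

A, B, C = {1, 2}, {2, 3}, {3, 4}

assert cartesian(A, B & C) == cartesian(A, B) & cartesian(A, C)   # × over ∩
assert cartesian(A, B | C) == cartesian(A, B) | cartesian(A, C)   # × over ∪
assert cartesian(A, B - C) == cartesian(A, B) - cartesian(A, C)   # × over \
assert cartesian(A, B) != cartesian(B, A)   # × is not commutative
```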
Relations
4.1
Introduction to relations
4.2
Equivalence relations
An equivalence relation is a relation that is reflexive, symmetric and transitive. Examples of equivalence relations are abundant in mathematics and
in everyday life. For example, consider the relation on the set of all people,
where person a is related to person b, if a and b are of the same age (in
whole number of years). It is easy to check that all necessary properties in
the definition of an equivalence relation are satisfied. A relation where a is
related to b if a and b were born on the same day (but possibly in different
years) is another equivalence relation. In geometry, we can define an equivalence relation on the set of all straight lines in the plane, where line a is
related to line b, if a and b are parallel (every line is considered to be parallel
to itself).
In arithmetic, given a fixed number n ∈ Z, we can define the relation
Rₙ : Z ↔ Z, where two numbers are related, if their difference is a multiple
of n: a ∼ₙ b ≡ n | (a − b). The relation Rₙ is called congruence modulo
n. It is an equivalence relation for every natural n > 0.
Let A be any set, and R∼ : A ↔ A an equivalence relation (∼ is a
general mathematical sign for equivalence). For any element a ∈ A, the
equivalence class of a, denoted [a]∼, is the set of all elements in A related
to a: [a]∼ = {x ∈ A | x ∼ a}. Since R∼ is reflexive, every element belongs
to its own equivalence class: for all a ∈ A, a ∈ [a]∼. Sometimes an element
a is called a representative of the equivalence class [a]∼.
For example, if a ∼ b means that a and b are two people of the same
age, then the equivalence classes are all possible ages, and every person
represents all people of his or her age. If a ∼ b means that persons a
and b share a birthday, then the equivalence classes are all 366 possible
birthdays, and every person represents all people with the same birthday.
If a ∼ b means that lines a and b are parallel, then these lines share the
same direction, and we can think of all possible directions as the equivalence
classes. For the congruence relation Rₙ, the equivalence class of any a ∈ Z
consists of all numbers that give the same remainder as a, when divided by
n. Thus, [2]₅ = {…, −18, −13, −8, −3, 2, 7, 12, 17, …}.
The importance of equivalence classes is that in a set with an equivalence
relation, every element belongs to one, and only one, equivalence class. In
other words, we have the following theorem.
Theorem 3. Let R∼ : A ↔ A be an equivalence relation. The equivalence
classes of R∼ are pairwise disjoint. The union of all equivalence classes is
the whole set A.
Proof. To prove that the classes are pairwise disjoint, we need to show that
for all a, b ∈ A : ([a]∼ = [b]∼) ∨ ([a]∼ ∩ [b]∼ = ∅). Consider two cases:
Case a ∼ b. Consider any x ∈ [a]∼. By transitivity of R∼, we have:

x ∼ a, a ∼ b ⟹ x ∼ b ⟹ x ∈ [b]∼
By the law of excluded middle, one of the above two cases must be true,
hence ∀a, b ∈ A : ([a]∼ = [b]∼) ∨ ([a]∼ ∩ [b]∼ = ∅).
Finally, by reflexivity of R∼, we have a ∼ a, therefore a ∈ [a]∼, so every
element of A belongs to some equivalence class. On the other hand, every
equivalence class is a subset of A, therefore the union of all equivalence
classes is the whole set A.
Theorem 3 allows us to think of any equivalence relation as a partitioning
of the set into disjoint subsets. In many cases, such partitioning has a well-understood intuitive meaning:
The equivalence relation "person a is of the same age as person b
(in whole number of years)" has approximately 110–120 equivalence
classes, corresponding to all possible ages. Note that these ages need
not be a contiguous set of natural numbers, if e.g. there is a person of
age 120, but no person of age 119.
The equivalence relation "person a was born on the same day as person
b (possibly in different years)" has exactly 366 equivalence classes,
corresponding to every date in a year. Note that the sizes of all classes
will be nearly equal, except the class corresponding to 29 February,
which will be approximately four times smaller than others.
The equivalence relation "line a is parallel (or equal) to line b" has
an infinite number of equivalence classes corresponding to all possible
directions of a line in the plane. In fact, we can define "direction" as
an equivalence class of this relation.
The "congruence modulo n" relation Rₙ has n equivalence classes,
represented by the numbers 0, 1, …, n − 1. For example, for n = 5, we
have:
[0]₅ = {…, −10, −5, 0, 5, 10, …}
[1]₅ = {…, −9, −4, 1, 6, 11, …}
[2]₅ = {…, −8, −3, 2, 7, 12, …}
[3]₅ = {…, −7, −2, 3, 8, 13, …}
[4]₅ = {…, −6, −1, 4, 9, 14, …}
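Restricting Z to a finite window, the classes of congruence modulo 5 and the partition property of Theorem 3 can both be checked in a few lines; the window −10..14 is an arbitrary choice:

```python
def eq_class(a, n, window=range(-10, 15)):
    """[a]_n restricted to a finite window of Z."""
    return {x for x in window if (x - a) % n == 0}

assert eq_class(2, 5) == {-8, -3, 2, 7, 12}

# the five classes are pairwise disjoint and together cover the whole window
classes = [eq_class(r, 5) for r in range(5)]
assert set().union(*classes) == set(range(-10, 15))
assert sum(len(c) for c in classes) == len(range(-10, 15))
```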
4.3
Partial orders
An element c ∈ A is the least upper bound of elements a, b (writing ⊑ for
the partial order), if

(a ⊑ c) ∧ (b ⊑ c) ∧ ∀x ∈ A : (a ⊑ x) ∧ (b ⊑ x) → (c ⊑ x)
The least upper bound of a, b does not have to exist, even if elements a, b
have some common upper bounds.
All the above definitions can be easily restated for lower, rather than
upper, bounds. Thus, d ∈ A is a lower bound of a, if d ⊑ a. Every element
is a lower bound of itself. An element d ∈ A is a (common) lower bound of
a, b, if d ⊑ a and d ⊑ b; it is the greatest lower bound of a, b, if every
common lower bound x of a, b satisfies x ⊑ d:

(d ⊑ a) ∧ (d ⊑ b) ∧ ∀x ∈ A : (x ⊑ a) ∧ (x ⊑ b) → (x ⊑ d)
Two elements may not have the greatest lower bound, even if they have
some common lower bounds. However, if two elements have the greatest
lower bound, then it is unique (why?). The same applies to the least upper
bound.
As an example, consider the partial order "a is a descendant of b". For
any two people, their common upper bound is any common ancestor, if one
exists. Thus, if two persons are cousins, then either of the two common
grandparents is their common upper bound. Neither of these upper bounds
is the least, since the two grandparents are not ancestors of each other.
There are many other common upper bounds, provided by ancestors of
these grandparents, but none of these upper bounds is the least.
In the same partial order, the common lower bound of any two people
is their common descendant, if one exists. Thus, if two persons are in-laws, i.e. each of them is a parent of the other's child's partner, then each
of their common grandchildren is their common lower bound. There may
be many other common lower bounds, provided by descendants of common
grandchildren. If the two in-laws have exactly one common grandchild,
he/she is their greatest lower bound, since all other common lower bounds
would be that grandchild's descendants.
In arithmetic, the greatest lower bound of two numbers a, b ∈ N with
respect to the divisibility relation R| is the two numbers' greatest common
divisor: glb|(a, b) = gcd(a, b). (Sometimes the greatest common divisor is
called highest common factor.) In the same partial order, the least upper
bound of two numbers a, b N is their least common multiple: lub| (a, b) =
lcm(a, b). In contrast with the previous example, every two non-zero natural
numbers have the greatest common divisor and the least common multiple,
and therefore the greatest lower bound and the least upper bound in R| .
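This can be checked directly in Python: `math.gcd` computes the glb under divisibility, and the lub follows from the identity lcm(a, b) = a · b / gcd(a, b) (the helper function is ours; `math.lcm` itself exists from Python 3.9 onwards):

```python
from math import gcd

def lcm(a, b):
    """Least upper bound of a, b under divisibility."""
    return a * b // gcd(a, b)

a, b = 12, 18
g, m = gcd(a, b), lcm(a, b)   # glb = 6, lub = 36

# the glb property: every common divisor of a and b divides gcd(a, b)
common_divisors = [d for d in range(1, min(a, b) + 1)
                   if a % d == 0 and b % d == 0]
assert all(g % d == 0 for d in common_divisors)
```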
Another example of an arithmetic partial order with guaranteed greatest
lower and least upper bounds is the total order R≤ : N × N. Here, the
greatest lower bound of two numbers a, b is simply their minimum a ⊓ b
(a ⊓ b = a if a ≤ b, and a ⊓ b = b otherwise). The least upper bound of a, b
is their maximum a ⊔ b (a ⊔ b = b if a ≤ b, and a ⊔ b = a otherwise). In fact,
it is easy to see that greatest lower and least upper bounds are guaranteed
to exist in every totally ordered set.
Finally, consider the subset inclusion relation R⊆ : P(S) × P(S) on
the subsets of any (not necessarily finite) set S. The greatest lower bound
of two subsets A, B ⊆ S is their intersection A ∩ B, and their least upper
bound is their union A ∪ B, so greatest lower and least upper bounds are
again guaranteed to exist. In this partial order, the least (and the
only minimal) element is ∅, and the greatest (and the only maximal) element
is S. If ∅ and S are excluded, and S is neither empty nor a singleton, then
there will be many minimal elements (all singletons {a}, where a ∈ S)
and many maximal elements (all complements of such singletons), but no
greatest or least element.
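In Python, frozensets model this partial order directly: `<=` is subset inclusion, `&` is intersection and `|` is union, so greatest lower and least upper bounds of subsets can be computed and checked:

```python
A = frozenset({1, 2, 3})
B = frozenset({2, 3, 4})

glb = A & B   # intersection: the greatest lower bound under inclusion
lub = A | B   # union: the least upper bound

assert glb <= A and glb <= B       # a common lower bound ...
assert lub >= A and lub >= B       # ... and a common upper bound
assert glb == frozenset({2, 3})
assert lub == frozenset({1, 2, 3, 4})
```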
It is easy to prove that any greatest element is maximal, and that any
least element is minimal (try it!). As the above examples show, the converse
is not true: a maximal element need not be the greatest, and a minimal
element need not be the least. It is also easy to prove that if the greatest
(or the least) element exists, then it must be unique (try it!). However, if
a maximal or a minimal element is unique, it still does not have to be the
greatest or the least (why?).
The results of this section show us that the concept of a relation, and in
particular equivalence relations and partial orders, gives us a useful general
tool, applicable in various branches of mathematics and computer science.
We will apply our knowledge of relations in the following sections.
5 Functions

5.1 Introduction to functions

[Figure 1: A function]
A finite sequence of k elements of A can be regarded as a function a : Nk → A:

a(0) = a0    a(1) = a1    . . .    a(k − 1) = ak−1

Similarly, an infinite sequence of elements of A is a function a : N → A. Notation
(a0, a1, a2, a3, . . .), where ∀i ∈ N : ai ∈ A, is an alternative to

a(0) = a0    a(1) = a1    a(2) = a2    a(3) = a3    . . .
5.2 Set cardinality
Putting two sets in one-to-one correspondence is one of the most basic
activities that can be performed on sets. Intuition tells us that it is possible
if and only if both sets have the same size. In fact, the idea of one-to-one
correspondence, or bijection, allows us to define precisely what "size"
means, even for infinite sets.
1   2   3   4   5   6   7   8   . . .
↕   ↕   ↕   ↕   ↕   ↕   ↕   ↕
0   2   4   6   8   10  12  14  . . .
For an equivalence relation on a countable set, the quotient set (i.e. the set of all equivalence classes) is
either finite or countable. This can be shown by selecting an arbitrary
representative from every equivalence class. The function that maps every
equivalence class to its representative is a bijection (why?), therefore the
quotient set is equinumerous with a subset of the initial set. Since the
initial set is countable, its quotient set must be finite or countable.
It turns out that not only subsets, but also certain supersets of N may
be countable.
Theorem 9. Set Z is countable.
Proof. Consider the function f : N → Z, which counts the non-positive integers by
even naturals, and the positive integers by odd naturals:

∀n : f(n) = (n + 1)/2   if n is odd
     f(n) = −n/2        if n is even
Function f is bijective (proof left as an exercise).
−1   0   1   2   3   4   . . .
 ↕   ↕   ↕   ↕   ↕   ↕
 2   0   1   3   5   7   . . .
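The two-case formula translates directly into Python; a quick check of bijectivity on an initial segment (an illustration, not a proof):

```python
def f(n):
    """The function of Theorem 9: odd naturals enumerate the positive
    integers, even naturals enumerate zero and the negative integers."""
    return (n + 1) // 2 if n % 2 == 1 else -(n // 2)

values = [f(n) for n in range(11)]
# distinct values covering -5 .. 5: consistent with f being a bijection
assert len(set(values)) == len(values)
assert set(values) == set(range(-5, 6))
```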
The pairs in N × N can be arranged in a table and counted along the diagonals;
the entry in row i, column j is the number assigned to the pair (i, j):

       j=0   j=1   j=2   j=3   j=4
i=0      0     2     5     9    14
i=1      1     4     8    13
i=2      3     7    12
i=3      6    11
i=4     10
This method gives us a bijection between N and N2 ; with a little extra effort,
the formula for this bijection can be given explicitly (left as an advanced
exercise).
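One standard explicit formula is the Cantor pairing function, offered here as a hedged solution sketch for the advanced exercise (the indexing convention, rows i and columns j counted diagonal by diagonal, is our assumption):

```python
def pair(i, j):
    """Position of (i, j) in the diagonal enumeration of N x N: all pairs
    with smaller i + j come first, then the diagonal i + j is walked
    in order of increasing j."""
    d = i + j
    return d * (d + 1) // 2 + j

# the enumeration hits each position exactly once on the first 20 diagonals
positions = {pair(i, j) for i in range(20) for j in range(20) if i + j < 20}
assert positions == set(range(210))   # 1 + 2 + ... + 20 = 210 pairs
```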
The above theorem implies that any finite Cartesian power of a countable
set is countable. For instance,

N³ = (N × N) × N ∼ N × N ∼ N

where ∼ denotes equinumerosity.
In our quest for uncountable infinity, we may be tempted to extend the
set of natural numbers so that, roughly speaking, we would have an infinity
of numbers everywhere. More precisely, we may want to consider the set
Q of rational numbers, defined as fractions m/n, where m, n ∈ Z, n ≠ 0.
Two fractions a/b and c/d are considered equal, i.e. representing the same
rational number, if a · d = b · c. Therefore, we have an equivalence relation
on the set of all integer pairs:

R∼ : Z² × Z²    (a, b) ∼ (c, d) ⇔ a · d = b · c
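Python's fractions module implements exactly this equivalence, reducing every pair to a canonical representative of its class; a small check of the defining condition:

```python
from fractions import Fraction

def equivalent(a, b, c, d):
    """(a, b) ~ (c, d) iff a*d = b*c, i.e. a/b and c/d name the same rational."""
    return a * d == b * c

assert equivalent(1, 2, 3, 6)        # 1/2 and 3/6 are the same rational
assert not equivalent(1, 2, 2, 3)

# Fraction picks the representative in lowest terms from each class
assert Fraction(3, 6) == Fraction(1, 2)
```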
For example, the fractions

31/10, 314/100, 3141/1000, . . .

approach the number π from below, while the fractions

32/10, 315/100, 3142/1000, . . .

approach it from above.
Thus, a real number splits the set of all rationals into two subsets, "below"
and "above". More precisely, a real number is defined as a partitioning
Q = Q1 ∪ Q2, such that for all x ∈ Q1, y ∈ Q2, we have x < y. These
partitionings are traditionally called Dedekind cuts of Q. Taken together,
all possible Dedekind cuts form the set of real numbers R.
Since the real numbers are "nothing else than gaps between rationals",
one might expect that there cannot be more gaps than rationals themselves.
Here the intuition fails us once again: unlike Q, the set R is uncountable.
Consider the set of real numbers between 0 and 1. Every such number
can be represented in the decimal (or binary, or any other positional) system,
which is just another form of approximation by rationals. For example,
the number π − 3 = 0.141592 . . . corresponds to the following infinite sequence
of decimal digits:

(1, 4, 1, 5, 9, 2, . . . )

The set of all such sequences includes as a subset the set of all sequences
composed of numbers 0 and 1. We already know that this set is equinumerous
with the set of all functions N → B, which is uncountable. Therefore
the whole set R is also uncountable.
6 Induction
This theorem justifies our choice of Boolean operators: using just three
of them, we can express every possible Boolean function of n variables.
(Exercise: what statement should be the base of induction in the omitted
proof?)
In the following chapter, we shall see more examples of induction.
7 Graphs

7.1 Motivating examples
[Figure: state graph of the wolf, goat and cabbage puzzle, with states such as FWGC, FG, WC]

[Figure: the houses and wells graph on H1, H2, H3 and W1, W2, W3]
[Figure 10: the complete graph K(N5)]
7.2 Graphs as relations
From our first two motivating examples, it is clear that in our definition of a
graph, it should only matter which nodes are connected by edges; the layout
and shape of nodes and edges are irrelevant. Therefore, graphs for us are a
special type of relations.
Let Rρ : A × A. We say that relation Rρ is

irreflexive, if no element is related to itself: ∀a ∈ A : ¬(a ρ a);

symmetric, if every two elements are related in both possible orders,
as long as they are related at all: ∀a, b ∈ A : a ρ b → b ρ a.
Let V be any finite set. We call elements of V nodes. An irreflexive,
symmetric relation E = R∗ : V × V is called a graph on V . The pairs of
nodes that are elements of relation E are called edges. Two nodes that are
connected by an edge are called adjacent. A graph with set of nodes V and
set of edges E is usually denoted G = (V, E).

A special case of a graph on the set of nodes V is the empty graph, which
has no edges: (V, ∅). The other extreme is the complete graph, which contains
all possible edges: K(V ) = (V, E), where E = {(u, v) ∈ V ² | u ≠ v}.
Figure 10 shows the complete graph K(N5 ).
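This definition is directly executable: a graph is just a set of ordered pairs that is irreflexive and symmetric. A minimal Python sketch (function names ours):

```python
def is_graph(V, E):
    """Check that E is an irreflexive, symmetric relation on V."""
    irreflexive = all((v, v) not in E for v in V)
    symmetric = all((v, u) in E for (u, v) in E)
    return irreflexive and symmetric

def complete_graph(V):
    """K(V): all ordered pairs of distinct nodes."""
    return {(u, v) for u in V for v in V if u != v}

V = {0, 1, 2, 3, 4}
assert is_graph(V, set())               # the empty graph
assert is_graph(V, complete_graph(V))   # K(N_5)
assert len(complete_graph(V)) == 5 * 4  # each undirected edge stored twice
```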
When studying the structure of a graph, we usually want to identify
graphs which are the same up to a renaming of nodes. This informal idea
is captured by the following definition. Graphs G1 = (V1, E1) and G2 =
(V2, E2) are called isomorphic, if there is a bijective function f : V1 → V2
which preserves the edges:

∀u, v ∈ V1 : (u, v) ∈ E1 ⇔ (f(u), f(v)) ∈ E2
Bijective function f is called an isomorphism between G1 and G2. Figure 11
shows three isomorphic graphs with different layouts.
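For small graphs, the definition can be tested by brute force, trying every bijection f : V1 → V2 (exponential in the number of nodes, so an illustration only; all names are ours):

```python
from itertools import permutations

def isomorphic(V1, E1, V2, E2):
    """Search for an edge-preserving bijection between the node sets."""
    if len(V1) != len(V2) or len(E1) != len(E2):
        return False
    V1, V2 = sorted(V1), list(V2)
    for image in permutations(V2):
        f = dict(zip(V1, image))
        if all(((u, v) in E1) == ((f[u], f[v]) in E2)
               for u in V1 for v in V1):
            return True
    return False

def sym(pairs):
    """Build a symmetric edge relation from undirected pairs."""
    return {(u, v) for a, b in pairs for (u, v) in [(a, b), (b, a)]}

triangle = sym({(0, 1), (1, 2), (0, 2)})
triangle2 = sym({('a', 'b'), ('b', 'c'), ('a', 'c')})
path = sym({(0, 1), (1, 2)})
assert isomorphic({0, 1, 2}, triangle, {'a', 'b', 'c'}, triangle2)
assert not isomorphic({0, 1, 2}, path, {'a', 'b', 'c'}, triangle2)
```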
The notion of isomorphism can be useful when the exact set of nodes
in the graph is irrelevant. For example, for every n N, there is, up to
isomorphism, just one complete graph on n nodes. We will denote this
graph by K(n). This can be read as any graph isomorphic to K(Nn ).
A graph G = (V, E) is called bipartite (or two-coloured), if the set of
nodes can be partitioned into two disjoint subsets V = V1 ∪ V2, such that
Figure 12: The complete bipartite graph on two sets of three nodes
every edge in E connects two nodes from different subsets. The subsets V1 ,
V2 are called colour classes. From Figures 7, 8, 9 it is clear that the three
graphs introduced in the previous subsection are bipartite, with the colour
classes indicated by black and white colouring of the nodes. The complete
graph K(5) in Figure 10 is not bipartite.
The bipartite graph that contains all possible edges between its colour
classes is called a complete bipartite graph: K(V1, V2) = (V1 ∪ V2, (V1 × V2)
∪ (V2 × V1)). Figure 12 shows a "straightened" picture of the houses and
wells graph, which is the complete bipartite graph K(H, W) on the sets of
houses H = {H1, H2, H3} and wells W = {W1, W2, W3}. When the exact
set of nodes is irrelevant, we will denote the complete bipartite graph by
K(m, n), where m, n ∈ N are the sizes of the colour classes. This can be
read as "any graph isomorphic to K(H, W) with m houses and n wells".
The definition of bipartite graphs can be generalised for any fixed number
of colour classes. A graph with k colour classes is called k-partite. The
k-colourability problem consists in determining whether a given graph is
k-partite, for a fixed value of k. The 2-colourability problem can be solved
efficiently; however, nobody knows an efficient algorithm for 3-colourability.
In fact, deciding if such an algorithm exists amounts to a solution of the
famous "P versus NP" problem. A correct solution can bring the author,
apart from worldwide fame, a $1 000 000 prize from the Clay Mathematics
Institute. See www.claymath.org for details.
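The efficient 2-colourability test is essentially breadth-first search: colour a start node, give each newly reached neighbour the opposite colour, and report failure on a colour clash. A sketch (adjacency represented as a dict of neighbour sets, our own convention):

```python
from collections import deque

def is_bipartite(adj):
    """Try to 2-colour the graph; adj maps each node to its neighbours."""
    colour = {}
    for start in adj:               # loop handles disconnected graphs
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in colour:
                    colour[v] = 1 - colour[u]
                    queue.append(v)
                elif colour[v] == colour[u]:
                    return False    # an odd cycle: not 2-colourable
    return True

# K(3, 3), the houses-and-wells graph, is bipartite; K(5) is not
k33 = {i: (set(range(3, 6)) if i < 3 else set(range(3))) for i in range(6)}
k5 = {i: {j for j in range(5) if j != i} for i in range(5)}
assert is_bipartite(k33)
assert not is_bipartite(k5)
```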
7.3 Graph connectivity
[Figure 14: an example graph on the nodes a, b, c, d, e, f]

. . . w * v is itself a path. Otherwise, the path u ⇝ w visits node v:
u ⇝ v ⇝ w * v. We now have a path u ⇝ v as an initial segment of u ⇝ w.
In both cases, the existence of a walk u # v implies the existence of a path
u ⇝ v.
connected by a path.
Let us now recall the Königsberg Bridges problem. It consists in finding
a tour that visits every edge in a graph exactly once. Such a tour is called
an Euler tour of the graph.
The graph in Figure 14 has the following Euler tour:
a * b * c * f * e * d * c * e * b * f * a
It turns out that the Euler tour problem has a simple solution for any
graph G = (V, E). The solution is based on the following definition. For
any node v ∈ V , its degree is the number of nodes adjacent to it: deg(v) =
|{u ∈ V | v * u}|. For example, in Figure 14:
deg(a) = deg(d) = 2
deg(b) = deg(c) = deg(e) = deg(f ) = 4
We are now able to describe a simple test for existence of the Euler tour
in a graph.
Theorem 15. Consider a graph G = (V, E). Graph G has an Euler tour,
if and only if

• G is connected;
• every node in V has even degree.
Proof. If G has an Euler tour, then it is connected, since the Euler tour
contains a walk between any pair of nodes. Consider any node v V .
Suppose node v is visited k times by the Euler tour. On every visit, the
tour uses two edges: an incoming and an outgoing edge. Since every edge
adjacent to v is used exactly once, the total number of edges adjacent to v
must be twice the number of visits: deg(v) = 2k. Therefore, deg(v) is even.
The proof of the opposite implication is done in several steps. First, we
build a tour that visits every edge at most once, but may miss some of the
edges. We then show that such a tour can be extended to cover all the edges.
Let G = (V, E) be a connected graph, where each node has even degree.
Let us fix any starting node u ∈ V . Consider any walk u # v with v ≠ u. The
final node of this walk v may have been visited by the walk several times;
on each such visit, the walk uses two edges adjacent to v. However, on the
final visit, only one incoming edge is used. Therefore, the number of visited
edges adjacent to v is odd. Since the total number of adjacent edges deg(v)
is even, there is at least one unvisited edge adjacent to v. Let us add this
edge to the walk: u # v * w. If w ≠ u, we can repeat the previous step,
extending the walk u # w by more edges. Eventually, the walk will return
back to node u.
At this point, we have a tour u # u that visits every edge at most once,
but may not visit some of the edges at all. Suppose there are some unvisited
edges. We now recall that graph G is connected. Consider all nodes in our
tour u # u. If none of these nodes had an adjacent unvisited edge, then there
would be no path connecting any of them to an unvisited edge, and
hence the graph would not be connected. Therefore, some node s in the
tour u # s # u has an adjacent unvisited edge s * t.
Let us now make s the initial node of our tour: s # u # s. The tour
still visits every edge at most once. Let us extend the tour by visiting the
previously unvisited edge s * t: s # u # s * t. As before, the final
node t has an odd number of adjacent visited edges, but the total number of
adjacent edges deg(t) is even. Therefore, there is an unvisited edge adjacent
to t, so we can extend the walk by another edge. As before, we can repeat
this process until the walk returns back to node s. If there still are any
unvisited edges in the graph, we can repeat the whole process once again.
Eventually, the walk will return back to the starting node, having visited all
edges in the graph. We have constructed an Euler tour of the graph G.
Even though the above proof is longer than our previous proofs, it is less
formal: we use phrases such as "repeat the whole process until eventually
it yields an Euler tour". This proof can be completely formalised using
induction.
To illustrate the tour-building procedure outlined in the proof, consider
the graph in Figure 14. Let us take a as the starting node, and begin the
walk by moving along the edge a * b. Node b has now one adjacent visited
edge; since deg(b) is even, it is also guaranteed to have at least one adjacent
unvisited edge. In fact, it has three such edges; let us take the edge b * c.
We now have the walk a * b * c. Node c, in its turn, is guaranteed to have
at least one adjacent unvisited edge, so we can keep extending the walk.
Eventually we will return back to node a. Suppose at this point our walk is
the tour a * b * c * f * a.
Since the graph is connected, and not all edges have been visited, at
least one node in the current tour must have an adjacent unvisited edge.
For instance, let us take node b with the unvisited edge b * f . We can now
make b the starting node in our existing tour, and extend the tour by a new
edge, making it into a walk:
b*c*f *a*b*f
We can keep extending the walk by more edges, until eventually we return
to node b:
b*c*f *a*b*f *e*d*c*e*b
At this point, all edges have been visited, so our current tour is an Euler
tour.
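The procedure just illustrated is known as Hierholzer's algorithm. A compact Python rendering (our own, using an adjacency-set representation; the example edges are read off the Euler tour of Figure 14 given earlier):

```python
def euler_tour(adj, start):
    """Build an Euler tour by walking until stuck and splicing in
    detours, as in the proof of Theorem 15. Edges are consumed as
    they are used, so we work on a local copy of adj."""
    adj = {v: set(ns) for v, ns in adj.items()}
    stack, tour = [start], []
    while stack:
        u = stack[-1]
        if adj[u]:                  # an unvisited edge at u: extend the walk
            v = adj[u].pop()
            adj[v].discard(u)
            stack.append(v)
        else:                       # u exhausted: emit it and backtrack
            tour.append(stack.pop())
    return tour[::-1]

adj = {'a': {'b', 'f'}, 'b': {'a', 'c', 'e', 'f'}, 'c': {'b', 'd', 'e', 'f'},
       'd': {'c', 'e'}, 'e': {'b', 'c', 'd', 'f'}, 'f': {'a', 'b', 'c', 'e'}}
tour = euler_tour(adj, 'a')
assert tour[0] == tour[-1] == 'a'
assert len(tour) == 11              # 10 edges, each used exactly once
```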
The power of Theorem 15 is in replacing a complex global condition (existence
of an Euler tour) by a much simpler global condition (connectivity),
plus a number of very simple local conditions (node degrees). By Theorem 15,
the original Königsberg graph in Figure 7 has no Euler tour, since
some nodes (in fact, all nodes representing islands) have odd degree.
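Theorem 15 thus yields an efficient executable test; a sketch in Python (adjacency as a dict of neighbour sets, our own convention; the example is the graph of Figure 14, with its edges read off its Euler tour):

```python
def has_euler_tour(adj):
    """Theorem 15 as a test: connected and all degrees even."""
    nodes = set(adj)
    seen, stack = set(), [next(iter(nodes))]
    while stack:                    # depth-first search for connectivity
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(adj[u] - seen)
    return seen == nodes and all(len(ns) % 2 == 0 for ns in adj.values())

fig14 = {'a': {'b', 'f'}, 'b': {'a', 'c', 'e', 'f'}, 'c': {'b', 'd', 'e', 'f'},
         'd': {'c', 'e'}, 'e': {'b', 'c', 'd', 'f'}, 'f': {'a', 'b', 'c', 'e'}}
assert has_euler_tour(fig14)

# removing one edge makes the degrees of its endpoints odd
no_ab = {v: set(ns) for v, ns in fig14.items()}
no_ab['a'].discard('b')
no_ab['b'].discard('a')
assert not has_euler_tour(no_ab)
```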
Inspired by our success in finding an efficient test for the existence of an
Euler tour, we may want to formulate an analogous (and practically more
important) problem for cycles. It consists in finding a cycle that visits every
node (but not necessarily every edge) in a graph exactly once. Such a cycle
is called a Hamiltonian cycle of the graph.
The graph in Figure 14 has the following Hamiltonian cycle:
a*b*e*d*c*f *a
It turns out that the Hamiltonian cycle problem, despite its similarity
with the Euler tour problem, is hard. In fact, nobody has managed so far
to find an efficient test for existence of a Hamiltonian cycle, or to prove that
no such test exists. The status of this problem is very similar to that of
the colourability problem introduced in the previous subsection. And, like
colourability, the Hamiltonian cycle problem is also worth $1 000 000!
7.4 Trees
Note that a connected graph stays connected if we add some edges to it.
Therefore, a connected graph cannot have too few edges. Also note that
an acyclic graph stays acyclic if we remove some edges from it. Therefore,
an acyclic graph cannot have too many edges. A tree, being both connected and acyclic, must therefore have some middling number of edges.
It turns out that we can specify this number precisely: every tree with a
given number of nodes has the same number of edges.
Theorem 16. Let G = (V, E) be a tree. We have |V | = |E| + 1.
Proof. Induction.
Induction base. Consider graph G with one node and no edges. It is
connected and acyclic, and therefore a tree. We have |E| = 0, |V | = 1 =
|E| + 1.
Inductive step. Let G = (V, E) be any tree with at least one edge u * v.
Let G′ = (V, E \ {(u, v), (v, u)}) be the spanning subgraph of G obtained by
removing the edge u * v.
Consider the connectivity relation R in the graph G′. A node w ∈ V
cannot be connected both to u and to v in G′, otherwise we would have a cycle
u ⇝ w ⇝ v * u in G. However, every node w must be connected either
to u or to v in G′, otherwise graph G would not be connected. Therefore,
graph G′ has two connected components with node sets Vu = [u] (all nodes
connected to u) and Vv = [v] (all nodes connected to v). Let us denote
these components by Gu = (Vu, Eu) and Gv = (Vv, Ev).
Both Gu and Gv are connected and acyclic, therefore they are trees. By
the inductive hypothesis, we have

|Vu| = |Eu| + 1    |Vv| = |Ev| + 1

We also have |V | = |Vu| + |Vv| (all nodes in G are nodes in Gu plus nodes
in Gv), and |E| = |Eu| + |Ev| + 1 (all edges in G are edges in Gu plus edges
in Gv plus the edge u * v). Therefore,

|V | = |Vu| + |Vv| = (|Eu| + 1) + (|Ev| + 1) = (|Eu| + |Ev| + 1) + 1 = |E| + 1
The above theorem does not simply give us an edge count for trees; we
can draw from it some important conclusions on the structure of a tree. Let
us call a node of degree 1 a leaf.
Theorem 17. Every tree with at least one edge has a leaf.
Proof. Let G = (V, E) be a tree with at least one edge. The sum of all
node degrees in any graph is twice the number of edges, since every edge
contributes to the degree of both its ends. Suppose every node in G has
degree at least 2. Then 2 · |E| ≥ 2 · |V |, therefore |E| ≥ |V |. But since G
is a tree, by the previous theorem |E| = |V | − 1. This is a contradiction, so
our assumption must be false, and G has some nodes of degree less than 2.
Since G is connected and has at least one edge, it cannot have any nodes of
degree 0. Therefore, there must be at least one node of degree 1.
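Both results are easy to confirm on any concrete tree; a small check in Python (the example tree and the function name are ours):

```python
def degrees(V, undirected_edges):
    """Node degrees: each edge contributes to the degree of both ends."""
    deg = {v: 0 for v in V}
    for u, v in undirected_edges:
        deg[u] += 1
        deg[v] += 1
    return deg

V = {0, 1, 2, 3, 4, 5}                      # a small tree on 6 nodes
edges = [(0, 1), (1, 2), (1, 3), (3, 4), (3, 5)]

assert len(V) == len(edges) + 1             # Theorem 16: |V| = |E| + 1
deg = degrees(V, edges)
assert any(d == 1 for d in deg.values())    # Theorem 17: a leaf exists
assert sum(deg.values()) == 2 * len(edges)  # the degree-sum identity
```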
To complete our study of trees, we characterise them in terms of the
spanning subgraph relation.
Recall that relation R⊑ is a partial order on the set G(V ) of all graphs
with node set V . This partial order has the least element (V, ∅) (the empty
graph), and the greatest element K(V ) (the complete graph). Things become
more interesting if we restrict the relation R⊑ to the set of all connected, or
acyclic, graphs on V .
Theorem 18. Let V be any finite set. Consider the partial order R⊑ on
the set of all connected graphs on V . A graph G = (V, E) is minimal in
this partial order, if and only if it is a tree.
Proof. Since graph G is connected by the condition of the theorem, we need
to prove that G is minimal if and only if it is acyclic. Equivalently, we need
to prove that G has a cycle, if and only if it is not minimal in the partial
order.
Suppose that graph G has a cycle. Let u * v be any edge in the cycle.
Remove edge u * v from the graph. The graph stays connected, since every
walk that passed through the edge u * v can be redirected via the remaining
path u ⇝ v. Since the graph stays connected after removing an edge, it is
not minimal.
To prove the opposite implication, suppose that graph G is not minimal
connected. This means that for some u, v ∈ V , removing the edge u * v
does not disconnect the graph. It can only happen if nodes u and v are
connected, apart from the edge u * v, by some path u ⇝ v. Therefore,
graph G has a cycle u * v ⇝ u.
In short, trees are minimal among connected graphs. This can be viewed
as an alternative definition of a tree, not using the word "acyclic".
Theorem 19. Let V be any finite set. Consider the partial order R⊑ on
the set of all acyclic graphs on V . A graph G = (V, E) is maximal in this
partial order, if and only if it is a tree.
Proof. Since graph G is acyclic by the condition of the theorem, we need
to prove that G is maximal if and only if it is connected. Equivalently, we
need to prove that G is disconnected, if and only if it is not maximal in the
partial order.
Suppose that graph G is disconnected. Let u, v be any two unconnected
nodes. Add the edge u * v to the graph. The graph stays acyclic, since u
and v are not connected by any path apart from the new edge u * v. Since
the graph stays acyclic after adding an edge, it is not maximal.
To prove the opposite implication, suppose that graph G is not maximal
acyclic. This means that for some u, v ∈ V , adding the edge u * v does
not create a cycle. It can only happen if nodes u and v are unconnected.
Therefore, graph G is disconnected.
In short, trees are maximal among acyclic graphs. This can be viewed
as an alternative definition of a tree, not using the word "connected".