CMTM101 (Updated)
CUMT101 CALCULUS 1
Calculus of Single Variables
Author: D.F. Mamutse
Department: Mathematics
Calculus is one of the milestones of Western thought. Building on ideas of Archimedes, Fermat,
Newton, Leibniz, Cauchy, and many others, the calculus is arguably the cornerstone of modern
science. Any well-educated person should at least be acquainted with the ideas of calculus, and a
scientifically literate person must know calculus solidly. Calculus has two main aspects: differential
calculus and integral calculus. Differential calculus concerns itself with rates of change. Various
types of change, both mathematical and physical, are described by a mathematical quantity called
the derivative. Integral calculus is concerned with a generalized type of addition, or amalgamation,
of quantities. Many kinds of summation, both mathematical and physical, are described by a
mathematical quantity called the integral.
Calculus is one of the most important parts of mathematics. It is fundamental to all of modern
science. How could one part of mathematics be of such central importance? It is because calculus
gives us the tools to study rates of change and motion. All analytical subjects, from biology to
physics to chemistry to engineering to mathematics, involve studying quantities that are growing
or shrinking or moving, in other words, they are changing. Astronomers study the motions of the
planets, chemists study the interaction of substances, physicists study the interactions of physical
objects. All of these involve change and motion. [1] [2]
[1] To Archimedes, Pierre de Fermat, Isaac Newton, and Gottfried Wilhelm von Leibniz, the fathers of calculus.
[2] "The true sign of intelligence is not knowledge but imagination." Albert Einstein
Chapter 1
The Basics
Mathematics has its own language with numbers as the alphabet. The language is given structure
with the aid of connective symbols, rules of operation, and a rigorous mode of thought (logic). The
number systems that we use in calculus are the natural numbers, the integers, the rational numbers,
and the real numbers. Let us describe each of these :
1. The natural numbers are the system of positive counting numbers 1, 2, 3 . . . . We denote the
set of all natural numbers by N.
N = {1, 2, 3, 4, 5, 6, 7, 8, . . . }.
2. The integers are the positive and negative whole numbers and zero, . . . , −3, −2, −1, 0, 1, 2, 3, . . . .
We denote the set of all integers by Z.
3. The rational numbers are quotients of integers or fractions, such as 3/2, −5/4. Any number
of the form p/q, with p, q ∈ Z and q ≠ 0, is a rational number. We denote the set of all rational
numbers by Q.
Q = { p/q : p, q ∈ Z, q ≠ 0 }.
4. The real numbers are the set of all decimals, both terminating and non-terminating. We
denote the set of all real numbers by R. A decimal number of the form x = 3.16792 is actually
a rational number, for it represents
x = 3.16792 = 316792/100000.
A decimal number of the form
m = 4.27519191919 . . . ,
with a group of digits that repeats itself interminably, is also a rational number. To see this,
notice that
100 · m = 427.519191919 . . .
and therefore we may subtract
100m = 427.519191919 . . .
m = 4.27519191919 . . .
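The subtraction above gives 100m − m = 99m = 423.244, so m = 423244/99000. As a rough check, the following Python sketch converts any repeating decimal to an exact fraction. The function name `repeating_to_fraction` and its argument layout are illustrative assumptions, not part of the notes.

```python
from fractions import Fraction

def repeating_to_fraction(integer_part: int, prefix: str, block: str) -> Fraction:
    """Exact value of integer_part.prefix block block block ...

    prefix: digits after the decimal point that do not repeat (may be "").
    block:  the non-empty group of digits that repeats forever.
    """
    k, r = len(prefix), len(block)
    non_repeating = Fraction(int(prefix or "0"), 10**k)
    # A block of r digits repeating from decimal position k+1 onward sums to
    # block / (10^k * (10^r - 1)), by the geometric-series argument.
    repeating = Fraction(int(block), 10**k * (10**r - 1))
    return integer_part + non_repeating + repeating

m = repeating_to_fraction(4, "275", "19")   # m = 4.27519191919...
print(m)                                    # the fraction in lowest terms
print(99 * m == Fraction(423244, 1000))     # the subtraction step: 99m = 423.244
```

Working with `Fraction` rather than floats keeps every step exact, which matters here because the point is that m is rational.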
(a) algebraic means formalisations of the rules of calculation (addition, subtraction, multiplication, division). Example : 2(3 + 5) = 2 · 3 + 2 · 5 = 6 + 10 = 16.
(b) order denotes inequalities. Example : −3/4 < 1/3.
(c) completeness implies that there are “no gaps” on the real line.
Algebraic properties of the reals for addition (a, b, c ∈ R) are :
(A1) a + (b + c) = (a + b) + c. associativity
(A2) a + b = b + a. commutativity
(A3) There is a 0 such that a + 0 = a. identity
(A4) There is an x such that a + x = 0. inverse
Why these rules? They define an algebraic structure (commutative group). Now define analogous
algebraic properties for multiplication :
(M1) a(bc) = (ab)c.
(M2) ab = ba.
(M3) There is a 1 such that a · 1 = a.
(M4) There is an x such that ax = 1 for a ≠ 0.
Some useful rules for calculations with inequalities are : If a, b, c are real numbers, then :
⇒ N⊂Z⊂Q⊂R
In summary, the real numbers R are complete in the sense that they correspond to all points on
the real line, i.e., there are no “holes” or “gaps”, whereas the rationals have “holes” (namely
the irrationals).
You Try It : What type of real number is 3.41287548754875 . . . ? Can you express this
number in more compact form?
1.2 Intervals
Definition 1.2.1. A subset of the real line is called an interval if it contains at least two numbers
and all the real numbers between any of its elements.
Examples :
1. x > −2 defines an infinite interval. Geometrically, it corresponds to a ray on the real line.
Finite Intervals. Let a and b be two points such that a < b. By the open interval (a, b) we mean
the set of all points between a and b, that is, the set of all x such that a < x < b. By the closed
interval [a, b] we mean the set of all points between a and b or equal to a or b, that is, the set of all
x such that a ≤ x ≤ b. The points a and b are called the endpoints of the intervals (a, b) and [a, b].
By a half-open interval we mean an open interval (a, b) together with one of its endpoints. There
are two such intervals : [a, b) is the set of all x such that a ≤ x < b and (a, b] is the set of all x such
that a < x ≤ b.
Infinite Intervals. Let a be any number. The set of all points x such that a < x is denoted by
(a, ∞), the set of all points x such that a ≤ x is denoted by [a, ∞). Similarly, (−∞, b) denotes the
set of all points x such that x < b and (−∞, b] denotes the set of all x such that x ≤ b.
Solve inequalities to find intervals of x ∈ R. The set of all solutions is the solution set of the inequality.
Examples:
1. 2x − 1 < x + 3
2x < x + 4
x < 4.
2. x + 3(2 − x) ≥ 4 − x when
x + 6 − 3x ≥ 4 − x
6 − 2x ≥ 4 − x
2 ≥ x ⇒ x ≤ 2.
You Try It: Solve the inequality 2/(x − 1) < 3/(2x + 1).
It is a quantity that gives the magnitude or size of a real number. The absolute value or modulus
of a real number x, denoted by |x|, is given by
|x| = x, if x ≥ 0,
|x| = −x, if x < 0.
Geometrically, |x| is the distance between x and 0. For example, | − 6| = 6, |5| = 5, |0| = 0.
2. The absolute value of a real number x is zero if and only if x = 0, that is, |x| = 0 ⇐⇒ x = 0.
(a) −|x| ≤ x ≤ |x|.
(b) | − x| = |x| and |x − y| = |y − x|.
(c) |x| = |y| implies x = ±y.
(d) |xy| = |x| · |y| and |x/y| = |x|/|y| if y ≠ 0.
(e) |x + y| ≤ |x| + |y|. (Triangle inequality)
4. If a is any positive number, then
(a) |x| = a if and only if x = ±a.
(b) |x| < a if and only if −a < x < a.
(c) |x| > a if and only if x > a or x < −a.
(d) |x| ≤ a if and only if −a ≤ x ≤ a.
(e) |x| ≥ a if and only if x ≥ a or x ≤ −a.
2x − 3 = 7 or 2x − 3 = −7
2x = 10 or 2x = −4
x = 5 or x = −2
Solution: We have
|5 − 2/x| < 1 ⇐⇒ −1 < 5 − 2/x < 1
⇐⇒ −6 < −2/x < −4
⇐⇒ 3 > 1/x > 2
⇐⇒ 1/3 < x < 1/2.
Solve the inequalities and show the solution set on the real line. (a) |2x − 3| ≤ 1 (b) |2x − 3| ≥ 1.
Solution: (a)
|2x − 3| ≤ 1 ⇐⇒ −1 ≤ 2x − 3 ≤ 1
⇐⇒ 2 ≤ 2x ≤ 4
⇐⇒ 1 ≤ x ≤ 2.
(b)
|2x − 3| ≥ 1 ⇐⇒ 2x − 3 ≥ 1 or 2x − 3 ≤ −1
⇐⇒ x ≥ 2 or x ≤ 1.
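A quick way to sanity-check solution sets like these is to test sample points. The sketch below, which is only a spot check on an arbitrarily chosen grid and not a proof, compares each inequality with its claimed solution set using exact fractions.

```python
from fractions import Fraction

def part_a(x: Fraction) -> bool:
    """True when x satisfies |2x - 3| <= 1."""
    return abs(2 * x - 3) <= 1

def part_b(x: Fraction) -> bool:
    """True when x satisfies |2x - 3| >= 1."""
    return abs(2 * x - 3) >= 1

# Sample points -2, -1.9, ..., 5 (the step 1/10 is an arbitrary choice).
grid = [Fraction(k, 10) for k in range(-20, 51)]

# Compare each inequality with the solution set derived above.
mismatch_a = [x for x in grid if part_a(x) != (1 <= x <= 2)]
mismatch_b = [x for x in grid if part_b(x) != (x >= 2 or x <= 1)]
inside_a = [x for x in grid if part_a(x)]

print(min(inside_a), max(inside_a))   # endpoints of the sampled solution set for (a)
print(mismatch_a, mismatch_b)         # empty lists mean no sampled point disagrees
```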
It is an important property of the positive integers (natural numbers) and is used in proving statements
involving all positive integers when it is known, for example, that the statements are valid
for n = 1, 2, 3, but it is suspected or conjectured that they hold for all positive integers.
1.5.1 Steps
1. Prove the statement for n = 1 or some other positive integer. (Initial Step)
4. Since the statement is true for n = 1 (from 1) it must (from 3) be true for n = 1 + 1 = 2 and
from this for n = 2 + 1 = 3, and so on, so must be true for all positive integers. (Conclusion)
1 + 2 + · · · + n = n(n + 1)/2.
Solution:
1. Prove for n = 1: 1 = 1(1 + 1)/2 = 2/2 = 1, which is clearly true.
2. Assume that the statement holds for n = k, that is,
1 + 2 + · · · + k = k(k + 1)/2.
3. Prove for n = k + 1. So
1 + 2 + · · · + k + (k + 1) = k(k + 1)/2 + (k + 1) (by inductive hypothesis)
= [k(k + 1) + 2(k + 1)]/2
= (k^2 + 3k + 2)/2
= (k + 1)(k + 2)/2,
so the statement holds for n = k + 1.
4. Hence by induction, 1 + 2 + · · · + n = n(n + 1)/2 is true for any positive integer n.
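Induction proves the formula for every n, whereas a computer can only test finitely many cases. Still, a numerical check is a useful habit before attempting a proof. The sketch below compares the term-by-term sum with the closed form; the range 1 to 500 is an arbitrary choice.

```python
def sum_first(n: int) -> int:
    """Add 1 + 2 + ... + n term by term."""
    total = 0
    for k in range(1, n + 1):
        total += k
    return total

def closed_form(n: int) -> int:
    """The formula n(n + 1)/2; n(n + 1) is always even, so // is exact."""
    return n * (n + 1) // 2

# Evidence, not proof: the two agree for every n tested.
checked = all(sum_first(n) == closed_form(n) for n in range(1, 501))
print(checked)
```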
1 + 3 + 5 + · · · + (2n − 1) = n^2.
Solution:
1 + 3 + 5 + · · · + (2k − 1) = k^2.
So it is true for n = k + 1.
Example: Prove that 3^n > 2^n for all natural numbers n.
Solution:
3. Prove for n = k + 1.
3^{k+1} = 3^k · 3
> 2^k · 3 by inductive hypothesis
> 2^k · 2 since 3 > 2
= 2^{k+1},
which is true.
2. Assume that the statement holds for n = k, that is, for k ≥ 1, 2^{2k} − 1 is divisible by 3, i.e.,
2^{2k} − 1 = 3l, for some l ∈ Z.
3. Prove for n = k + 1.
which is true.
1.6 Tutorial 1
1. Express the following recurring decimals in the form p/q where p and q are integers
(i) 2, 1737̇3̇ (ii) 0, 3̇2̇4̇.
2. Express 0, mnmnmnmn . . . = 0, ṁṅ, where m and n are distinct integers, in the form p/q
where p and q are integers.
3. State, giving a reason, whether each of the following numbers is rational or irrational.
(i) 0.20200200020 . . . (ii) 537.137137137 . . . .
5. Show that if 0 < a < b then a^2 < b^2. If a^2 < b^2, is it necessarily true that a < b? Give an
example to illustrate your answer.
6. If a ≥ 0 and b ≥ 0, prove that (1/2)(a + b) ≥ √(ab).
7. Solve the following inequalities.
(i) x^2 + x − 2 > 0 (ii) (2x + 3)/(x − 5) > 3 (iii) 2|x| > 3x − 10 (iv) |x + 1| ≥ 3
8. Prove that |ab| = |a||b| for all a, b ∈ R.
11. If x and y are real numbers, prove that |x| − |y| ≤ |x − y|.
Chapter 2
Sequences
Each number in the sequence is called a term and u_n is called the nth term. The sequence
u_1, u_2, u_3, . . . is written briefly as {u_n}, e.g., {u_n} = {2n}, where u_1 = 2, u_2 = 4, u_3 = 6 and so
on. The sequence is called finite or infinite according as there are or are not a finite number of
terms.
Example:
Find the values of the first four terms of the sequence defined by
u_{n+1} = 2/u_n, u_0 = 1, n ∈ N.
Solution:
u_1 = u_{0+1} = 2/u_0 = 2/1 = 2
u_2 = u_{1+1} = 2/u_1 = 2/2 = 1
u_3 = u_{2+1} = 2/u_2 = 2/1 = 2.
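A recursive definition translates directly into a loop: each new term is computed from the one before it. The following sketch reproduces the four terms found above; the helper name `first_terms` is an illustrative assumption.

```python
from fractions import Fraction

def first_terms(u0: Fraction, count: int) -> list:
    """Return [u_0, u_1, ..., u_{count-1}] for the rule u_{n+1} = 2 / u_n."""
    terms = [u0]
    while len(terms) < count:
        terms.append(2 / terms[-1])   # apply the recurrence to the latest term
    return terms

terms = first_terms(Fraction(1), 4)
print([str(t) for t in terms])        # u_0, u_1, u_2, u_3
```

The same pattern works for recurrences that look back two terms; one would simply keep the last two values instead of one.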
You Try It: Define recursively
a_0 = a_1 = 1, and a_n = a_{n−1} + 2a_{n−2}, n ≥ 2.
Find a_6 recursively.
Let us consider the sequence u_n = 1/n. The sequence has the terms 1, 1/2, 1/3, 1/4, . . . . We see that the
terms of the sequence tend to or approach 0.
Definition 2.1.1. A number L is called the limit of an infinite sequence a_1, a_2, a_3, . . . or {a_n}, if
for any positive number ε, we can find a positive number N depending on ε such that |a_n − L| < ε
for all integers n > N. We write lim_{n→∞} a_n = L.
If {an } is a convergent sequence, it means that the terms an can be made arbitrarily close to L for
n sufficiently large.
Example: If u_n = 3 + 1/n = (3n + 1)/n, the sequence is 4, 7/2, 10/3, . . . and we can show that
lim_{n→∞} u_n = 3.
If the limit of a sequence exists, the sequence is called convergent, otherwise, it is called divergent.
Example: Prove that lim_{n→∞} 1/n = 0.
Proof: Let ε > 0. We must find N(ε) such that |1/n − 0| = |1/n| = 1/n < ε for all n > N. The inequality 1/n < ε holds when n > 1/ε.
So taking N to be the smallest integer greater than 1/ε, we have lim_{n→∞} 1/n = 0.
You Try It: Prove that lim_{n→∞} 1/n^p = 0 if p ∈ N.
Example: Use the definition of a limit to prove that lim_{n→∞} (2n − 1)/(3n + 2) = 2/3.
Given ε > 0, we need |(2n − 1)/(3n + 2) − 2/3| = |−7/(3(3n + 2))| = 7/(3(3n + 2)), so the condition is
7/(3(3n + 2)) < ε
n > (7 − 6ε)/(9ε).
So taking N to be the smallest integer greater than (7 − 6ε)/(9ε), we have
|(2n − 1)/(3n + 2) − 2/3| < ε for all n > N, i.e., lim_{n→∞} (2n − 1)/(3n + 2) = 2/3.
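The ε-N argument can be tried out numerically: pick an ε, compute N from the bound (7 − 6ε)/(9ε), and confirm that later terms stay within ε of 2/3. The sketch below does this for ε = 1/100; the choice of ε and the number of terms tested are arbitrary, and a finite test only illustrates the definition rather than proving the limit.

```python
from fractions import Fraction
import math

def a(n: int) -> Fraction:
    """n-th term (2n - 1)/(3n + 2), kept exact."""
    return Fraction(2 * n - 1, 3 * n + 2)

def choose_N(eps: Fraction) -> int:
    """Smallest positive integer greater than (7 - 6*eps)/(9*eps)."""
    bound = (7 - 6 * eps) / (9 * eps)
    return max(1, math.floor(bound) + 1)

eps = Fraction(1, 100)
N = choose_N(eps)
# Check |a_n - 2/3| < eps for a stretch of indices beyond N.
ok = all(abs(a(n) - Fraction(2, 3)) < eps for n in range(N + 1, N + 2001))
print(N, ok)
```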
4. lim_{n→∞} a_n/b_n = (lim_{n→∞} a_n)/(lim_{n→∞} b_n) = A/B if lim_{n→∞} b_n = B ≠ 0.
Proof: We must show that if lim_{n→∞} u_n = l_1 and lim_{n→∞} u_n = l_2, then l_1 = l_2. By hypothesis, given any
ε > 0, we can find N such that |u_n − l_1| < ε/2 when n > N and |u_n − l_2| < ε/2 when n > N. Then
|l_1 − l_2| = |l_1 − u_n + u_n − l_2| ≤ |l_1 − u_n| + |u_n − l_2| < ε/2 + ε/2 = ε,
i.e., |l_1 − l_2| is less than any positive ε (however small) and so must be zero, i.e., l_1 − l_2 = 0 =⇒ l_1 = l_2.
Proof: We must show that for any ε > 0, we can find N > 0, such that |(an + bn ) − (A + B)| < ε
for all n > N . We have
n tends to infinity, n → ∞ (n grows or increases beyond any limit ). Infinity is not a number and
the sequences that tend to infinity are not convergent.
We write lim_{n→∞} a_n = ∞, if for each positive number M, we can find a positive number N (depending
on M) such that a_n > M for all n > N.
Similarly, we write lim_{n→∞} a_n = −∞, if for each positive number M, we can find a positive number N
such that a_n < −M for all n > N.
Example: Prove that (a) lim_{n→∞} 3^{2n−1} = ∞ (b) lim_{n→∞} (1 − 2n) = −∞.
Proof: (a) We must show that for each positive number M we can find a positive number N such
that a_n > M for all n > N. Now 3^{2n−1} > M when (2n − 1) ln 3 > ln M, i.e., n > (1/2)(ln M/ln 3 + 1).
Taking N to be the smallest integer greater than (1/2)(ln M/ln 3 + 1), we have lim_{n→∞} 3^{2n−1} = ∞.
(b) We must show that for each positive number M we can find a positive number N such that
a_n < −M for all n > N. Now 1 − 2n < −M when 2n − 1 > M, i.e., n > (1/2)(M + 1). Taking N to be the
smallest integer greater than (1/2)(M + 1), we have lim_{n→∞} (1 − 2n) = −∞.
A sequence that tends to a limit l is said to be convergent and the sequence converges to l. A
sequence may tend to +∞ or −∞, and is said to be divergent and it diverges to +∞ or −∞.
If u_n ≥ m, the sequence is bounded below and m is called a lower bound. The largest lower bound
is called the greatest lower bound (g.l.b).
If u_{n+1} ≥ u_n, the sequence is called monotonic increasing and if u_{n+1} > u_n it is called strictly
increasing. If u_{n+1} ≤ u_n, the sequence is called monotonic decreasing, while if u_{n+1} < u_n it is
strictly decreasing.
Examples: 1. The sequence 1, 1.1, 1.11, 1.111, . . . is bounded and monotonic increasing.
2. The sequence 1, −1, 1, −1, 1, . . . is bounded but not monotonic increasing or decreasing.
Definition 2.4.1. A null sequence is a sequence that converges to 0, e.g., u_n = 1/(n − 10), n ≥ 11.
If {un } does not tend to a limit or +∞ or −∞, we say that {un } oscillates (or is an oscillating
sequence). It can oscillate finitely (bounded) or infinitely (unbounded).
We want to be able to evaluate limits, for example, of the form lim_{n→∞} (2 − 1/n + 3/n^2) or lim_{n→∞} (5 − 2n^2)/(4 + 3n + 2n^2).
Example: lim_{n→∞} (2 − 1/n + 3/n^2) = lim_{n→∞} 2 − lim_{n→∞} 1/n + 3 lim_{n→∞} 1/n^2 = 2 − 0 + 0 = 2.
Example: lim_{n→∞} (3n^2 − 5n)/(5n^2 + 2n − 6) = lim_{n→∞} (3 − 5/n)/(5 + 2/n − 6/n^2) = (3 − 0)/(5 + 0 − 0) = 3/5.
Example: lim_{n→∞} (√(n + 1) − √n) = lim_{n→∞} (√(n + 1) − √n) · (√(n + 1) + √n)/(√(n + 1) + √n) = lim_{n→∞} 1/(√(n + 1) + √n) = 0.
If lim_{n→∞} a_n = l = lim_{n→∞} b_n and there exists an N such that a_n ≤ c_n ≤ b_n, for all n > N, then
lim_{n→∞} c_n = l.
Example: Find lim_{n→∞} (cos n)/n.
Chapter 3
Infinite Series
is called an infinite series (or simply a series). The numbers a_1, a_2, a_3, . . . are called the terms of
the series. To find the sum of an infinite series, consider the following sequence of partial sums.
S_1 = a_1
S_2 = a_1 + a_2
S_3 = a_1 + a_2 + a_3
. . .
S_n = a_1 + a_2 + a_3 + · · · + a_n.
If this sequence of partial sums converges, then the series is said to converge and has the sum
indicated in the following definition.
For the infinite series Σ a_n, the nth partial sum is given by
S_n = a_1 + a_2 + a_3 + · · · + a_n.
If the sequence of partial sums {S_n} converges to S, then the series Σ a_n converges. The limit S
is called the sum of the series. If {S_n} diverges, then the series diverges.
Example 3.1.1. The series Σ_{n=1}^{∞} 1/2^n = 1/2 + 1/4 + 1/8 + 1/16 + · · · has the following partial sums.
S_1 = 1/2
S_2 = 1/2 + 1/4 = 3/4
S_3 = 1/2 + 1/4 + 1/8 = 7/8
. . .
S_n = 1/2 + 1/4 + 1/8 + · · · + 1/2^n = (2^n − 1)/2^n.
Because lim_{n→∞} (2^n − 1)/2^n = 1, it follows that the series converges and its sum is 1.
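The partial sums in this example are easy to compute exactly. The sketch below adds the terms one by one and compares the result with the closed form (2^n − 1)/2^n; the range of n tested is an arbitrary choice.

```python
from fractions import Fraction

def partial_sum(n: int) -> Fraction:
    """S_n = 1/2 + 1/4 + ... + 1/2^n, added term by term."""
    return sum(Fraction(1, 2**k) for k in range(1, n + 1))

def closed_form(n: int) -> Fraction:
    """The formula (2^n - 1)/2^n for S_n."""
    return Fraction(2**n - 1, 2**n)

agree = all(partial_sum(n) == closed_form(n) for n in range(1, 31))
print(agree)
for n in (1, 2, 3, 10):
    # The gap 1 - S_n equals 1/2^n, which shrinks to 0 as n grows.
    print(n, partial_sum(n), 1 - partial_sum(n))
```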
Example 3.1.2. The nth partial sum of the series
Σ_{n=1}^{∞} (1/n − 1/(n + 1)) = (1 − 1/2) + (1/2 − 1/3) + (1/3 − 1/4) + · · ·
is given by S_n = 1 − 1/(n + 1). Because the limit of S_n is 1, the series converges and its sum is 1.
Example 3.1.3. The series Σ_{n=1}^{∞} 1 = 1 + 1 + 1 + · · · diverges, because S_n = n and the sequence of
partial sums diverges.
The series in Example 3.1.2 is a telescoping series. That is, it is of the form
(b_1 − b_2) + (b_2 − b_3) + (b_3 − b_4) + (b_4 − b_5) + · · ·
Note that b_2 is canceled by the second term, b_3 is canceled by the third term, and so on. Because the
nth partial sum of the series is S_n = b_1 − b_{n+1}, it follows that a telescoping series converges
if and only if b_n approaches a finite number as n → ∞. Moreover, if the series converges, then its
sum is
S = b_1 − lim_{n→∞} b_{n+1}.
Example 3.1.4. Find the sum of the series Σ_{n=1}^{∞} 2/(4n^2 − 1).
3.2 Geometric Series
Theorem 3.2.1. A geometric series with ratio r diverges if |r| ≥ 1. If 0 < |r| < 1, then the series
converges to the sum Σ_{n=0}^{∞} a r^n = a/(1 − r), 0 < |r| < 1.
has a ratio of r = 1/2 with a = 3. Because 0 < |r| < 1, the series converges and its sum is
S = a/(1 − r) = 3/(1 − 1/2) = 6.
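To see the theorem at work for a = 3 and r = 1/2, one can compute partial sums and watch the gap to a/(1 − r) shrink. This is a minimal sketch; the sample indices are arbitrary, and the gap formula a·r^{n+1}/(1 − r) is the standard closed form for a geometric partial sum rather than something stated in the notes.

```python
from fractions import Fraction

def geometric_partial_sum(a: Fraction, r: Fraction, n: int) -> Fraction:
    """a + a*r + a*r^2 + ... + a*r^n (the terms with index 0 through n)."""
    return sum(a * r**k for k in range(n + 1))

a, r = Fraction(3), Fraction(1, 2)
limit = a / (1 - r)                      # the sum claimed by the theorem
for n in (0, 5, 10, 20):
    gap = limit - geometric_partial_sum(a, r, n)
    # The gap equals a * r^(n+1) / (1 - r), which tends to 0 when |r| < 1.
    print(n, gap, gap == a * r**(n + 1) / (1 - r))
```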
If Σ a_n = A and Σ b_n = B and c is a real number, then the following series converge to the
indicated sums. (i) Σ c a_n = cA (ii) Σ (a_n ± b_n) = Σ a_n ± Σ b_n = A ± B.
If the series Σ a_n converges, then the sequence {a_n} converges to 0.
If the sequence {a_n} does not converge to 0, then the series Σ a_n diverges.
In this and the following section, we will study several convergence tests that apply to series with
positive terms.
3.3.1 The Integral Test
If f is positive, continuous, and decreasing for x ≥ 1 and a_n = f(n), then Σ_{n=1}^{∞} a_n and ∫_1^∞ f(x) dx
either both converge or both diverge.
Example 3.3.1. Apply the integral test to the series Σ_{n=1}^{∞} n/(n^2 + 1).
Because f(x) = x/(x^2 + 1) satisfies the conditions for the integral test (check this), we can integrate to
obtain
∫_1^∞ x/(x^2 + 1) dx = (1/2) ∫_1^∞ 2x/(x^2 + 1) dx = (1/2) lim_{b→∞} ∫_1^b 2x/(x^2 + 1) dx
= (1/2) lim_{b→∞} [ln(x^2 + 1)]_1^b
= (1/2) lim_{b→∞} [ln(b^2 + 1) − ln 2]
= ∞.
Solution: Because f(x) = 1/(x^2 + 1) satisfies the conditions for the integral test, we can integrate to
obtain
∫_1^∞ dx/(x^2 + 1) = lim_{b→∞} ∫_1^b dx/(x^2 + 1) = lim_{b→∞} [tan^{−1} x]_1^b
= lim_{b→∞} (tan^{−1} b − tan^{−1} 1)
= π/2 − π/4 = π/4.
Thus, the series converges.
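Note that the integral test shows convergence but does not say the sum equals π/4. The standard argument behind the test, which is not spelled out in these notes, sandwiches each partial sum between two integrals: ∫_1^{N+1} f(x) dx ≤ S_N ≤ f(1) + ∫_1^N f(x) dx. The sketch below checks that sandwich numerically for f(x) = 1/(x^2 + 1), using the antiderivative found above.

```python
import math

def f(x: float) -> float:
    """The comparison function f(x) = 1/(x^2 + 1)."""
    return 1.0 / (x * x + 1.0)

def integral_from_1(b: float) -> float:
    """Integral of f from 1 to b, using the antiderivative arctan(x)."""
    return math.atan(b) - math.pi / 4

for N in (10, 100, 1000):
    s = sum(f(n) for n in range(1, N + 1))     # partial sum S_N
    lower = integral_from_1(N + 1)             # integral bound from below
    upper = f(1) + integral_from_1(N)          # first term plus integral bound from above
    print(N, lower <= s <= upper)
```

Because the upper bounds never exceed f(1) + π/4, the partial sums are bounded, which is exactly why the series converges.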
is a p−series, where p is a positive constant. For p = 1, the series
Σ_{n=1}^{∞} 1/n = 1 + 1/2 + 1/3 + · · ·
Example 3.3.3. From the Theorem it follows that the harmonic series
Σ_{n=1}^{∞} 1/n = 1 + 1/2 + 1/3 + · · ·
diverges.
This is a test for positive-term series. It allows you to compare a series having complicated terms
with a simpler series whose convergence or divergence is known.
1. If Σ_{n=1}^{∞} b_n converges, then Σ_{n=1}^{∞} a_n converges.
2. If Σ_{n=1}^{∞} a_n diverges, then Σ_{n=1}^{∞} b_n diverges.
Example 3.4.1. Determine the convergence or divergence of Σ_{n=1}^{∞} 1/(2 + 3^n).
Solution: This series resembles Σ_{n=1}^{∞} 1/3^n (convergent geometric series). Term-by-term comparison
yields
a_n = 1/(2 + 3^n) < 1/3^n = b_n, n ≥ 1.
Thus, by the Direct Comparison Test, the series converges.
Example 3.4.2. Determine the convergence or divergence of Σ_{n=1}^{∞} 1/(2 + √n).
Solution: The series resembles Σ_{n=1}^{∞} 1/n^{1/2} (divergent p−series). Term-by-term comparison yields
1/(2 + √n) ≤ 1/√n, n ≥ 1
which does not meet the requirements for divergence. Still expecting the series to diverge, we can
compare the given series with Σ_{n=1}^{∞} 1/n (divergent harmonic series). In this case, term-by-term
comparison yields
a_n = 1/n ≤ 1/(2 + √n) = b_n, n ≥ 4
and, by the Direct Comparison Test, the given series diverges.
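The restriction n ≥ 4 in the comparison matters, and it is easy to see why by testing small values. The sketch below lists every n up to an arbitrarily chosen cutoff of 10000 for which 1/n ≤ 1/(2 + √n) fails.

```python
import math

def a(n: int) -> float:
    """Term of the harmonic series, 1/n."""
    return 1.0 / n

def b(n: int) -> float:
    """Term of the given series, 1/(2 + sqrt(n))."""
    return 1.0 / (2.0 + math.sqrt(n))

# Indices where the comparison a_n <= b_n does NOT hold.
fails = [n for n in range(1, 10001) if not a(n) <= b(n)]
print(fails)
```

Only the first few indices fail, and dropping finitely many terms does not affect whether a series converges or diverges, so the comparison is still valid.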
Often a given series closely resembles a p−series or a geometric series, yet we cannot establish the
term-by-term comparison necessary to apply the Direct Comparison Test. We can apply a second
comparison test, called the Limit Comparison Test.
Suppose that a_n > 0 and b_n > 0 and lim_{n→∞} a_n/b_n = L where L is finite and positive. Then the two
series Σ a_n and Σ b_n either both converge or both diverge. If L = 0 and Σ b_n converges, then
Σ a_n converges. If L = ∞ and Σ b_n diverges, then Σ a_n diverges.
Solution: By comparison with Σ_{n=1}^{∞} 1/n we have lim_{n→∞} [1/(an + b)]/(1/n) = lim_{n→∞} n/(an + b) = 1/a. Because this limit is
greater than 0, we can conclude from the Limit Comparison Test that the given series diverges.
The limit Comparison Test works well for comparing a messy algebraic series with a p−series. In
choosing an appropriate p−series, we must choose one with an nth term of the same magnitude as
the nth term of the given series.
So far, most series we have dealt with have had positive terms. In this section, we will study series
that contain both positive and negative terms. The simplest such series is an alternating series,
whose terms alternate in sign. For example, the geometric series
Σ_{n=0}^{∞} (−1/2)^n = Σ_{n=0}^{∞} (−1)^n (1/2^n) = 1 − 1/2 + 1/4 − 1/8 + 1/16 − · · ·
is an alternating geometric series with r = −1/2. Alternating series occur in two ways: either the odd
terms are negative or the even terms are negative.
Let a_n > 0. The alternating series Σ_{n=1}^{∞} (−1)^n a_n and Σ_{n=1}^{∞} (−1)^{n+1} a_n converge, if the following two
conditions are met.
1. a_{n+1} ≤ a_n for all n.
2. lim_{n→∞} a_n = 0.
Example 3.5.1. Determine the convergence or divergence of Σ_{n=1}^{∞} (−1)^{n+1} (1/n).
Solution: Because 1/(n + 1) ≤ 1/n for all n and the limit (as n → ∞) of 1/n is 0, we can apply the
Alternating Series Test to conclude that the series converges. (This series is called the alternating
harmonic series)
Example 3.5.2. Determine the convergence or divergence of Σ_{n=1}^{∞} n/(−2)^{n−1}.
passes the first condition in the alternating series test because a_{n+1} ≤ a_n for all n. We cannot apply
the Alternating Series Test, because the series does not pass the second condition.
3.6 Absolute and Conditional Convergence
Occasionally, a series may have both positive and negative terms and not be an alternating series,
for example, the series
Σ_{n=1}^{∞} (sin n)/n^2 = (sin 1)/1 + (sin 2)/4 + (sin 3)/9 + · · ·
has both positive and negative terms, yet it is not an alternating series. One way to obtain some
information about the convergence of this series is to investigate the convergence of the series
Σ_{n=1}^{∞} |(sin n)/n^2|. By direct comparison, we have |sin n| ≤ 1, for all n, so
|(sin n)/n^2| ≤ 1/n^2, n ≥ 1.
Thus, by the Direct Comparison Test, the series Σ_{n=1}^{∞} |(sin n)/n^2| converges. But the question still is “Does
the original series converge?”
Theorem 3.6.1 (Absolute Convergence). If the series Σ |a_n| converges, then the series Σ a_n
also converges.
The converse of the Theorem is not true. For example, the alternating harmonic series
Σ_{n=1}^{∞} (−1)^{n+1}/n = 1 − 1/2 + 1/3 − 1/4 + · · ·
converges by the Alternating Series Test. Yet the harmonic series diverges. This type of convergence
is called conditional.
Example 3.6.1. Determine whether the following series are convergent or divergent. Classify any
convergent series as absolutely or conditionally convergent.
(a) Σ_{n=1}^{∞} (−1)^{n(n+1)/2}/3^n = −1/3 − 1/9 + 1/27 + 1/81 − · · ·.
Solution: This is not an alternating series. However, because
Σ_{n=1}^{∞} |(−1)^{n(n+1)/2}/3^n| = Σ_{n=1}^{∞} 1/3^n
is a convergent geometric series, the given series is absolutely convergent, hence convergent.
(b) Σ_{n=1}^{∞} (−1)^n/ln(n + 1) = −1/ln 2 + 1/ln 3 − 1/ln 4 + 1/ln 5 − · · ·.
Solution: In this case, the alternating series test indicates that the given series converges. However,
the series
Σ_{n=1}^{∞} |(−1)^n/ln(n + 1)| = 1/ln 2 + 1/ln 3 + 1/ln 4 + · · ·
diverges by direct comparison with terms of the harmonic series. Therefore, the given series is
conditionally convergent.
(c) Σ_{n=1}^{∞} (−1)^n/√n = −1/√1 + 1/√2 − 1/√3 + 1/√4 − · · · .
Solution: The given series converges by the Alternating Series Test. Moreover, because the
p−series
Σ_{n=1}^{∞} |(−1)^n/√n| = 1/√1 + 1/√2 + 1/√3 + 1/√4 + · · ·
diverges (p = 1/2), the given series is conditionally convergent.
Ratio Test
1. Σ a_n converges absolutely if lim_{n→∞} |a_{n+1}/a_n| < 1.
2. Σ a_n diverges if lim_{n→∞} |a_{n+1}/a_n| > 1 or lim_{n→∞} |a_{n+1}/a_n| = ∞.
3. The Ratio Test is inconclusive if lim_{n→∞} |a_{n+1}/a_n| = 1.
Although the Ratio Test is not a cure for all ills related to tests for convergence, it is particularly
useful for series that converge rapidly. Series involving factorials or exponentials are frequently of
this type.
Example 3.7.1. Determine the convergence or divergence of Σ_{n=0}^{∞} 2^n/n!.
Solution: Because a_n = 2^n/n!, we can write the following
lim_{n→∞} |a_{n+1}/a_n| = lim_{n→∞} [2^{n+1}/(n + 1)! ÷ 2^n/n!]
= lim_{n→∞} [2^{n+1}/(n + 1)! · n!/2^n]
= lim_{n→∞} 2/(n + 1)
= 0.
Therefore, the series converges.
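The simplification of the ratio to 2/(n + 1) can be confirmed by computing the ratios of consecutive terms exactly. The sketch below does so for the first few n; the number of ratios shown is an arbitrary choice.

```python
from fractions import Fraction
from math import factorial

def term(n: int) -> Fraction:
    """a_n = 2^n / n!"""
    return Fraction(2**n, factorial(n))

# Ratios a_{n+1}/a_n for n = 0, 1, ..., 7; each should equal 2/(n + 1).
ratios = [term(n + 1) / term(n) for n in range(8)]
print([str(q) for q in ratios])
print(all(q == Fraction(2, n + 1) for n, q in enumerate(ratios)))
```

The ratios fall below 1 from n = 2 onward and keep shrinking, which is the behaviour the Ratio Test detects.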
Example 3.7.2. Determine whether the following series converge or diverge.
(a) Σ_{n=0}^{∞} n^2 2^{n+1}/3^n (b) Σ_{n=1}^{∞} n^n/n!.
Solution:
(a) This series converges because the limit of |a_{n+1}/a_n| is less than 1.
lim_{n→∞} |a_{n+1}/a_n| = lim_{n→∞} [(n + 1)^2 (2^{n+2}/3^{n+1}) (3^n/(n^2 2^{n+1}))]
= lim_{n→∞} 2(n + 1)^2/(3n^2)
= 2/3 < 1.
(b) This series diverges because the limit of |a_{n+1}/a_n| is greater than 1.
lim_{n→∞} |a_{n+1}/a_n| = lim_{n→∞} [((n + 1)^{n+1}/(n + 1)!) (n!/n^n)]
= lim_{n→∞} [((n + 1)^{n+1}/(n + 1)) (1/n^n)]
= lim_{n→∞} (n + 1)^n/n^n = lim_{n→∞} (1 + 1/n)^n
= e > 1.
3.7.2 The Root Test
This test of convergence or divergence of series works especially well for series involving nth powers.
Root Test
Let Σ a_n be a series with non-zero terms.
1. Σ a_n converges absolutely if lim_{n→∞} ⁿ√|a_n| < 1.
2. Σ a_n diverges if lim_{n→∞} ⁿ√|a_n| > 1 or lim_{n→∞} ⁿ√|a_n| = ∞.
3. The Root Test is inconclusive if lim_{n→∞} ⁿ√|a_n| = 1.
Example 3.7.3. Determine the convergence or divergence of Σ_{n=1}^{∞} e^{2n}/n^n.
Because this limit is less than 1, we can conclude that the series converges absolutely.
is called a power series. More generally, a series of the form
Σ_{n=0}^{∞} a_n (x − c)^n = a_0 + a_1 (x − c) + a_2 (x − c)^2 + · · · + a_n (x − c)^n + · · ·
where the domain of f is the set of all x for which the power series converges.
2. There exists a real number R > 0 such that the series converges absolutely for |x − c| < R,
and diverges for |x − c| > R.
The number R is the radius of convergence of the power series. If the series converges only at
c, then the radius of convergence is R = 0, and if the series converges for all x, then the radius
of convergence is R = ∞. The set of all values of x for which the power series converges is the
interval of convergence of the power series.
Example 3.8.2. Find the radius of convergence of Σ_{n=0}^{∞} n! x^n.
For any fixed value of x such that |x| > 0, let u_n = n! x^n. Then
lim_{n→∞} |u_{n+1}/u_n| = lim_{n→∞} |(n + 1)! x^{n+1}/(n! x^n)|
= |x| lim_{n→∞} (n + 1)
= ∞.
Therefore, by the Ratio Test, the series diverges for |x| > 0, and converges only at its center, 0.
Hence, the radius of convergence is R = 0.
Example 3.8.3. Find the radius of convergence of Σ_{n=0}^{∞} 3(x − 2)^n.
Solution: For x ≠ 2, let u_n = 3(x − 2)^n. Then
lim_{n→∞} |u_{n+1}/u_n| = lim_{n→∞} |3(x − 2)^{n+1}/(3(x − 2)^n)|
= lim_{n→∞} |x − 2|
= |x − 2|.
By the Ratio Test, the series converges if |x − 2| < 1 and diverges if |x − 2| > 1. Therefore, the
radius of convergence of the series is R = 1.
Example 3.8.4. Find the interval of convergence of Σ_{n=1}^{∞} x^n/n.
Solution: Letting u_n = x^n/n produces
lim_{n→∞} |u_{n+1}/u_n| = lim_{n→∞} |[x^{n+1}/(n + 1)]/[x^n/n]|
= lim_{n→∞} |nx/(n + 1)|
= |x|.
Therefore, by the Ratio Test, the radius of convergence is R = 1. Moreover, because the series is
centered at 0, it converges in the interval (−1, 1). This interval, however, is not necessarily the
interval of convergence. To determine this, we must test for convergence at each endpoint. When
x = 1, we obtain the divergent harmonic series
Σ_{n=1}^{∞} 1/n = 1 + 1/2 + 1/3 + · · ·
Solution: Letting u_n = (−1)^n (x + 1)^n/2^n produces
lim_{n→∞} |u_{n+1}/u_n| = lim_{n→∞} |[(−1)^{n+1} (x + 1)^{n+1}/2^{n+1}]/[(−1)^n (x + 1)^n/2^n]|
= lim_{n→∞} |2^n (x + 1)/2^{n+1}|
= |(x + 1)/2|.
By the Ratio Test, the series converges if |(x + 1)/2| < 1 or |x + 1| < 2. Hence, the radius of convergence
is R = 2. Because the series is centered at x = −1, it will converge in the interval (−3, 1).
Furthermore, at the endpoints we have
Σ_{n=0}^{∞} (−1)^n (−2)^n/2^n = Σ_{n=0}^{∞} 2^n/2^n = Σ_{n=0}^{∞} 1 (diverges when x = −3)
and
Σ_{n=0}^{∞} (−1)^n (2)^n/2^n = Σ_{n=0}^{∞} (−1)^n (diverges when x = 1)
Example 3.8.6. Find the interval of convergence of Σ_{n=1}^{∞} x^n/n^2.
Solution: Letting u_n = x^n/n^2 produces
lim_{n→∞} |u_{n+1}/u_n| = lim_{n→∞} |[x^{n+1}/(n + 1)^2]/[x^n/n^2]| = lim_{n→∞} |n^2 x/(n + 1)^2| = |x|.
Therefore, the interval of convergence for the given series is [−1, 1].
Example 3.8.7. Find the interval of convergence of Σ_{n=1}^{∞} n x^n.
Solution: The series is a power series with a_n = n and c = 0. Let u_n = n x^n, so u_{n+1} = (n + 1) x^{n+1}.
Then
|u_{n+1}/u_n| = (n + 1)|x|^{n+1}/(n|x|^n) = |x| (n + 1)/n → |x| as n → ∞.
The limit is less than one whenever |x| < 1. The Ratio Test then shows Σ_{n=1}^{∞} u_n is convergent for
|x| < 1, and the series diverges for |x| > 1. This means the radius of convergence is R = 1. We
know that the series is convergent for −1 < x < 1. We need to check convergence at the endpoints
of this interval. When x = 1, we have Σ_{n=1}^{∞} n. Since the terms of this series do not approach zero as n → ∞, we
know this series must diverge. Similarly, the series is divergent when x = −1. The interval of
convergence is (−1, 1).
Example 3.8.8. Find the interval of convergence of Σ_{n=1}^{∞} (−1)^n (x − 2)^n/(n 4^n).
Solution: Let u_n = |x − 2|^n/(n 4^n), so u_{n+1} = |x − 2|^{n+1}/((n + 1) 4^{n+1}). Then
u_{n+1}/u_n = [|x − 2|^{n+1}/((n + 1) 4^{n+1})] · [n 4^n/|x − 2|^n] = n|x − 2|/(4(n + 1)) → |x − 2|/4 as n → ∞.
The Ratio Test gives convergence for |x − 2|/4 < 1 and divergence for |x − 2|/4 > 1. Solving the first
inequality, we have
|x − 2| < 4 =⇒ −4 < x − 2 < 4 =⇒ −2 < x < 6.
When x = −2, the series is
Σ_{n=1}^{∞} (−1)^n (−2 − 2)^n/(n 4^n) = Σ_{n=1}^{∞} 1/n,
which is a divergent p−series. When x = 6, we have
Σ_{n=1}^{∞} (−1)^n (6 − 2)^n/(n 4^n) = Σ_{n=1}^{∞} (−1)^n/n.
The Alternating Series Test shows that the series is convergent. The interval of convergence is
−2 < x ≤ 6.
Example 3.8.9. Find the interval of convergence of the series Σ_{n=1}^{∞} x^n/(n 3^n).
Solution: With u_n = x^n/(n 3^n), we find that
lim_{n→∞} |u_{n+1}/u_n| = lim_{n→∞} |[x^{n+1}/((n + 1) 3^{n+1})]/[x^n/(n 3^n)]| = lim_{n→∞} n|x|/(3(n + 1)) = |x|/3.
Now |x|/3 < 1 provided |x| < 3, so the Ratio Test implies that the given series converges absolutely if
|x| < 3 and diverges if |x| > 3. When x = 3, we have the divergent harmonic series Σ 1/n and when
x = −3, we have the convergent alternating series Σ (−1)^n/n. Thus the interval of convergence of
the given power series is [−3, 3).
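When the limit exists, the Ratio Test computation amounts to R = lim |c_n/c_{n+1}|, where c_n is the coefficient of x^n. This restatement is a standard consequence of the test rather than a formula given in these notes. The sketch below evaluates that quotient for c_n = 1/(n 3^n) and shows it approaching 3; the sample indices are arbitrary.

```python
from fractions import Fraction

def coeff(n: int) -> Fraction:
    """Coefficient c_n = 1/(n * 3^n) of x^n in the series above."""
    return Fraction(1, n * 3**n)

def radius_estimate(n: int) -> Fraction:
    """The quotient |c_n / c_{n+1}|, which simplifies to 3(n + 1)/n here."""
    return coeff(n) / coeff(n + 1)

for n in (1, 10, 100, 1000):
    print(n, radius_estimate(n), float(radius_estimate(n)))
```

The estimate only locates the radius; the endpoints x = 3 and x = −3 still have to be tested separately, as done in the solution.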
Example 3.8.10. Find the interval of convergence of Σ_{n=0}^{∞} 2^n x^n/n!.
Solution: With u_n = 2^n x^n/n!, we find that
lim_{n→∞} |[2^{n+1} x^{n+1}/(n + 1)!]/[2^n x^n/n!]| = lim_{n→∞} 2|x|/(n + 1) = 0
for all x. Hence the Ratio Test implies that the power series converges for all x, and its interval of
convergence is (−∞, ∞).
3.9 Tutorial 2
1. Write the first five terms of each of the following sequences.
(i) {(2n − 1)/(3n + 2)} (ii) {(1 − (−1)^n)/n^3} (iii) {(−1)^{n−1}/(2 · 4 · 6 · · · 2n)} (iv) {(−1)^{n−1} x^{2n−1}/(2n − 1)!}
4. Using the definition of a limit, show that each of the following sequences cannot have the limit
shown:
(i) u_n = (2n − 1)/(3n + 4), 1/2 (ii) u_n = (n + 1)/(7n − 4), 1/6 (iii) u_n = n/(n^2 + 1), 1.
5. Use the definition of a limit to verify each of the following limits.
(i) lim_{n→∞} (2n − 1)/(3n + 2) = 2/3 (ii) lim_{n→∞} (4 − 2n)/(3n + 2) = −2/3 (iii) lim_{n→∞} (sin n)/n = 0
6. Find lim_{n→∞} (a^n − b^n)/(a^n + b^n), where a > 0 and b > 0 for the three cases:
(i) a > b (ii) a < b (iii) a = b.
(Series)
2. Determine whether or not the series converges and find its sum if it converges
(i) Σ_{r=1}^{∞} 1/((3r − 1)(3r + 2)) (ii) Σ_{r=1}^{∞} 1/((5r − 2)(5r + 3)) (iii) Σ_{r=1}^{∞} 20/((7r − 3)(7r + 4)).
3. Use the integral test to determine the convergence or divergence of the series.
(i) Σ_{n=1}^{∞} 1/(n + 1) (ii) Σ_{n=1}^{∞} n e^{−n} (iii) Σ_{n=1}^{∞} 1/(4n + 1) (iv) Σ_{n=1}^{∞} 1/n^3 (v) Σ_{n=1}^{∞} 1/n^{1/3}.
4. Use the Direct Comparison Test to determine the convergence or divergence of the series.
(i) Σ_{n=1}^{∞} 1/(n^2 + 1) (ii) Σ_{n=2}^{∞} 1/(n − 1) (iii) Σ_{n=0}^{∞} 1/n!
5. Use the Limit Comparison Test to determine the convergence or divergence of the series.
(i) Σ_{n=1}^{∞} n/(n^2 + 1) (ii) Σ_{n=1}^{∞} (2n^2 − 1)/(3n^5 + 2n + 1) (iii) Σ_{n=1}^{∞} 1/(n√(n^2 + 1)) (iv) Σ_{n=1}^{∞} (n + 3)/(n(n + 2)) (v) Σ_{n=1}^{∞} n/((n + 1) 2^{n−1}).
6. Use the Alternating Series Test to determine the convergence or divergence of the series.
(i) Σ_{n=1}^{∞} (−1)^{n+1}/n (ii) Σ_{n=1}^{∞} (−1)^{n+1} n/(2n − 1) (iii) Σ_{n=2}^{∞} (−1)^n/ln n
9. Use the Root Test to test for convergence or divergence of the series.
(i) Σ_{n=1}^{∞} (n/(2n + 1))^n (ii) Σ_{n=2}^{∞} (−1)^n/(ln n)^n (iii) Σ_{n=0}^{∞} e^{−n}
Chapter 4
Functions
Definition 4.1.1. A function f from a set X to a set Y is a rule that assigns to each element x
in X a unique element y in Y .
The set X is called the domain of the function f and the range is the set of all elements of Y
assigned to an element of X. The element of Y assigned to an element x of X is called the image
of x under f and is denoted by f (x). We write f : X → Y for saying f is a function from X to Y .
In this course, both X and Y are sets of real numbers. Thus, the functions are called real functions.
We usually specify a function f by giving the expression for f (x). Below are a few examples of
functions:
Note that in the above examples, the letters f, g, h are used to denote functions whereas the let-
ters x, t, s are used to denote the variables. A variable is an arbitrary element of a set. In the
above examples, the letters x, t, s denote the independent variables and f (x), g(t), h(s) denote
the dependent variables since their values depend on the values of x, t, s respectively.
The domain of a function f is the largest set of real numbers for which the rule makes sense.
Example: Let $f(x) = \frac{1}{x}$. We cannot compute $f(0)$, since $\frac{1}{0}$ is not defined. Hence the domain of $f(x) = \frac{1}{x}$ is the set of all real numbers except 0.
It is useful to draw pictures which represent functions. These pictures, or graphs, are a device
for helping us to think about functions. We graph functions in the x − y plane. The elements of
the domain of the function are thought of as points of the x−axis. The values of a function are
measured on the y−axis. The graph of f associates to x a unique y value that the function f assigns
to x. The graph of a function f is the set of points {(x, y) | y = f(x), x in the domain of f} in the
Cartesian plane.
As a consequence, a function is characterized geometrically by the fact that any vertical line inter-
secting its graph does so in exactly one point.
4.3 Monotone and Bounded Functions
A real function f is increasing (strictly increasing) on an interval I if for all points x1 and x2
in I with x1 < x2 , f (x1 ) ≤ f (x2 ) (f (x1 ) < f (x2 )).
A real function f is decreasing (strictly decreasing) on an interval I if for all points x1 and x2
in I with x1 < x2 , f (x1 ) ≥ f (x2 ) (f (x1 ) > f (x2 )).
Example: Consider the function f (x) = (2x − 1)(x + 5). We observe that f is increasing on
the interval (−9/4, ∞) and is decreasing on the interval (−∞, −9/4).
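The monotonicity claim in this example can be spot-checked numerically. The sketch below is only an illustration: the helper names and the sample grids are assumptions of this note, and sampling finitely many points suggests, but does not prove, monotonicity.

```python
def f(x):
    """Example function f(x) = (2x - 1)(x + 5) = 2x^2 + 9x - 5."""
    return (2 * x - 1) * (x + 5)


def is_strictly_increasing(func, points):
    """True if func values strictly increase along the sorted sample points."""
    values = [func(p) for p in sorted(points)]
    return all(a < b for a, b in zip(values, values[1:]))


def is_strictly_decreasing(func, points):
    """True if func values strictly decrease along the sorted sample points."""
    values = [func(p) for p in sorted(points)]
    return all(a > b for a, b in zip(values, values[1:]))


turning_point = -9 / 4  # vertex of the parabola, where f'(x) = 4x + 9 = 0
right = [turning_point + 0.5 * k for k in range(20)]  # samples in [-9/4, ...)
left = [turning_point - 0.5 * k for k in range(20)]   # samples in (..., -9/4]

print(is_strictly_increasing(f, right))
print(is_strictly_decreasing(f, left))
```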
Bounded Functions
A function f is bounded above if there is a real number M such that f (x) ≤ M for all points x
in its domain. The number M is then called an upper bound of f .
A function f is bounded below if there is a real number m such that f (x) ≥ m for all points x
in its domain. The number m is then called a lower bound of f .
A function f is bounded if f is bounded above and below, that is, there exist real numbers
M and m such that m ≤ f (x) ≤ M for all points x in its domain.
4.4 Types of Functions
Polynomial Function
A polynomial function has the form $f(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n$, where $a_0, a_1, \ldots, a_n$ are constants and $n$ is a positive integer called the degree of the polynomial, provided $a_0 \neq 0$.
Rational Functions
A rational function is a function $f(x) = \frac{P(x)}{Q(x)}$, where $P(x)$ and $Q(x)$ are polynomial functions.
Example: $f(x) = \frac{x^3 + x + 5}{x^2 - 3x - 4}$ is a rational function. Since $x^2 - 3x - 4 = (x + 1)(x - 4) = 0$ for $x = -1$ and $x = 4$, the domain of $f$ is the set of all real numbers except $-1$ and $4$.
Power Function
Examples: $y = \frac{1}{x}$, $y = x^{1/2}$, $y = x^{2/3}$.
Piecewise Defined Functions
A function need not be defined by a single formula. A piecewise defined function is a function
described by using different formulas on different parts of its domain.
Examples:
(a) $f(x) = \begin{cases} -1, & x < 0 \\ 0, & x = 0 \\ x + 2, & x > 0 \end{cases}$
(b) $f(x) = \begin{cases} -x, & x < 0 \\ x^2, & 0 \le x \le 1 \\ 1, & x > 1. \end{cases}$
Transcendental Functions
3. Trigonometric functions (also called circular functions because of their geometric interpretation with respect to the unit circle), e.g., $\sin x$, $\cos x$, $\tan x = \frac{\sin x}{\cos x}$, $\csc x$, $\cot x$, $\sec x$.
4. Inverse trigonometric functions, e.g., $y = \sin^{-1} x$, $y = \cos^{-1} x$.
Even and Odd Functions
Let f (x) be a real-valued function of a real variable. Then f is even if f (x) = f (−x). (Symmetric
with respect to the y−axis)
Let f (x) be a real-valued function of a real variable. Then f is odd if −f (x) = f (−x) or
f (x) + f (−x) = 0. (Symmetric with respect to the origin)
Example: Determine whether the function $f(x) = \frac{3x}{x^2 + 1}$ is odd or even.
Solution:
\[ f(-x) = \frac{3(-x)}{(-x)^2 + 1} = -\frac{3x}{x^2 + 1} = -f(x). \]
The function is odd.
A function f can be combined with another function g by means of arithmetic operations to form other functions. The sum $f + g$, difference $f - g$, product $fg$ and quotient $\frac{f}{g}$ are defined as follows:
Let f and g denote functions, then
Example: If $f(x) = 2x^2 - 5$ and $g(x) = 3x + 4$, find $f + g$, $f - g$, $fg$ and $\frac{f}{g}$.
Solution:
Let f and g denote functions. The composition of f and g, written f ◦ g is the function
(f ◦g)(x) = f (g(x)) and the composition of g and f , written g◦f , is the function (g◦f )(x) = g(f (x)).
Solution:
\[ (f \circ g)(x) = f(g(x)) = f(x^2 + 1) = (x^2 + 1)^2 = x^4 + 2x^2 + 1 \]
and
\[ (g \circ f)(x) = g(f(x)) = g(x^2) = (x^2)^2 + 1 = x^4 + 1. \]
In general, $f \circ g \neq g \circ f$.
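The two compositions can be compared on a few sample inputs. This is a minimal sketch: the functions f(x) = x^2 and g(x) = x^2 + 1 are inferred from the worked solution above, and the helper `compose` is a name introduced here for illustration.

```python
def compose(outer, inner):
    """Return the composition outer o inner, i.e. x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


def f(x):
    return x ** 2


def g(x):
    return x ** 2 + 1


f_after_g = compose(f, g)  # (f o g)(x) = (x^2 + 1)^2
g_after_f = compose(g, f)  # (g o f)(x) = x^4 + 1

for x in (0, 1, 2, -3):
    print(x, f_after_g(x), g_after_f(x))
```

The two columns agree only at x = 0, which illustrates that composition is not commutative in general.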
Classes of functions may be distinguished by the manner in which arguments and images are related
or mapped to each other.
Example: Show that the functions f (x) = 2x + 3 and g(x) = x3 − 2 are injective.
A function f : X → Y is called onto if for all y in Y there is an x in X such that f (x) = y. All
elements in Y are used. Such functions are referred to as surjective.
Solution: For onto, set $f(x) = y$, i.e., $3x - 5 = y$. Solving for $x$ gives $x = \frac{y + 5}{3}$.
So $f\left(\frac{y+5}{3}\right) = 3\left(\frac{y+5}{3}\right) - 5 = y$. Therefore f is onto.
Let X and Y be sets. A function f : X → Y that is one-to-one and onto is called a bijection
or bijective function from X to Y . If f is both one-to-one and onto, then we call f a 1 − 1
correspondence.
Inverse of a Function. Suppose f is a 1 − 1 function that has domain X and range Y . Since
every element y ∈ Y corresponds with precisely one element x of X, the function f must actually
determine a reverse function g whose domain is Y and range is X, where f and g must satisfy
f (x) = y and g(y) = x. The function g is given the formal name inverse of f and usually written
f −1 and read f inverse. Not all functions have inverses, those that do are called invertible functions.
\[ y = (2x + 8)^3 \]
\[ \sqrt[3]{y} = 2x + 8 \]
\[ \sqrt[3]{y} - 8 = 2x \]
\[ x = \frac{\sqrt[3]{y} - 8}{2}. \]
Hence the inverse function $f^{-1}$ is given by $f^{-1}(x) = \frac{\sqrt[3]{x} - 8}{2}$.
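A quick round-trip check of this inverse is sketched below. The helper `real_cbrt` is an assumption of this note (a real cube root that also handles negative inputs); it is not part of the original derivation.

```python
import math


def real_cbrt(v):
    """Real cube root, valid for negative inputs as well."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)


def f(x):
    return (2 * x + 8) ** 3


def f_inverse(y):
    return (real_cbrt(y) - 8) / 2


for x in (-10, -4, 0, 1.5, 7):
    # f_inverse(f(x)) should reproduce x up to rounding error
    print(x, f_inverse(f(x)))
```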
A 1−1 function f can have only one inverse, i.e., f −1 is unique. A function f : X → Y is invertible
if and only if f is one-to-one and maps X onto Y .
4.8 Operations on Functions
Equality of Functions
Equality of functions does not mean the same as equality of two numbers (numbers have a fixed
value but values of functions vary). Each function is a relationship between x and y, the two
relationships are the same if for every value of x we get the same value of y.
Identity Function
Generally, an identity function is one which does not change the domain values at all. It is the
function $f(x) = x$, denoted by $I_X$.
Chapter 5
The single most important idea in calculus is the idea of limit. More than 2000 years ago, the
ancient Greeks wrestled with the limit concept, and they did not succeed. It is only in the past 200
years that we have finally come up with a firm understanding of limits. The study of calculus went
through several periods of increased mathematical rigour beginning with the French mathematician
Augustin-Louis Cauchy (1789-1857) and later continued by the German mathematician, and former
high school teacher, Karl Wilhelm Weierstrass (1815-1897).
If f is a function, then we say $\lim_{x \to a} f(x) = A$ if the value of $f(x)$ gets arbitrarily close to $A$ as $x$ gets closer and closer to $a$. For example, $\lim_{x \to 3} x^2 = 9$, since $x^2$ gets arbitrarily close to 9 as $x$ approaches as close as one wishes to 3.
The definition can be stated more precisely as follows: $\lim_{x \to a} f(x) = A$ if and only if, for any chosen positive number ε, however small, there exists a positive number δ such that, whenever $0 < |x - a| < \delta$, then $|f(x) - A| < \varepsilon$.
$\lim_{x \to a} f(x) = A$ means that $f(x)$ can be made as close as desired to $A$ by making $x$ close enough, but not equal, to $a$. How close is “close enough to a” depends on how close one wants to make f(x) to A. It also of course depends on which function f is and on which number a is. The positive number ε is how close one wants to make f(x) to A; one wants the distance to be no more than ε. The positive number δ is how close one will make x to a; if the distance from x to a is less than δ (but not zero), then the distance from f(x) to A will be less than ε. Thus δ depends on ε. The limit statement means that no matter how small ε is made, δ can be made small enough. The letters ε and δ can be understood as “error” and “distance”. In these terms the error (ε) can be made as small as desired by reducing the distance (δ).
The ε−δ definition of $\lim_{x \to a} f(x) = A$
For any chosen positive number ε, however small, there exists a positive number δ, such that,
whenever 0 < |x − a| < δ, then |f (x) − A| < ε.
Solution: We need to find δ so that, for a given ε > 0, $|x^2 + 1 - 2| < \varepsilon$ whenever $0 < |x - 1| < \delta$.
Now
\[ |x^2 + 1 - 2| = |x^2 - 1| = |x + 1|\,|x - 1|. \]
First require $|x - 1| < 1$, so that $-1 < x - 1 < 1 \Rightarrow 0 < x < 2 \Rightarrow 1 < x + 1 < 3$. Then $|x^2 + 1 - 2| < 3|x - 1|$, so $|x^2 + 1 - 2| < \varepsilon$ if $3|x - 1| < \varepsilon$, i.e., $|x - 1| < \frac{\varepsilon}{3}$. We now have two conditions on x:
\[ |x - 1| < 1 \quad \text{and} \quad |x - 1| < \frac{\varepsilon}{3}. \]
For a given ε > 0, choose $\delta = \min\{1, \frac{\varepsilon}{3}\}$. Then, whenever $|x - 1| < \delta$, it is true that $|x^2 + 1 - 2| < \varepsilon$.
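The choice δ = min{1, ε/3} can be sanity-checked numerically. The sketch below samples points with 0 < |x - 1| < δ and tests the inequality; the function names and the sampling scheme are assumptions of this note, and a finite sample is an illustration rather than a proof.

```python
def delta_for(eps):
    """The delta chosen in the solution: min{1, eps/3}."""
    return min(1.0, eps / 3.0)


def inequality_holds(eps, samples=1000):
    """Check |x^2 + 1 - 2| < eps at sampled x with 0 < |x - 1| < delta."""
    delta = delta_for(eps)
    for k in range(1, samples):
        h = delta * k / samples  # 0 < h < delta
        for x in (1 - h, 1 + h):
            if not abs(x ** 2 + 1 - 2) < eps:
                return False
    return True


for eps in (10, 1, 0.1, 1e-3):
    print(eps, delta_for(eps), inequality_holds(eps))
```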
Solution: Let ε > 0. We must produce a δ > 0 such that, whenever $0 < |x - 2| < \delta$, then $|(x^2 + 3x) - 10| < \varepsilon$. First we note that
\[ |(x^2 + 3x) - 10| = |(x - 2)^2 + 7(x - 2)| \le |x - 2|^2 + 7|x - 2|. \]
Also, if $0 < \delta \le 1$, then $\delta^2 \le \delta$. Hence, if we take δ to be the minimum of 1 and $\frac{\varepsilon}{8}$, then, whenever $0 < |x - 2| < \delta$,
\[ |(x^2 + 3x) - 10| < \delta^2 + 7\delta \le \delta + 7\delta = 8\delta \le \varepsilon. \]
You Try It: Prove that $\lim_{x \to 0} x \sin\frac{1}{x} = 0$.
Considering x and a as points on the real axis where a is fixed and x is moving, then x can approach
a from the right or from the left. We indicate these respective approaches by writing x → a+ and
x → a− .
If $\lim_{x \to a^+} f(x) = A_1$ and $\lim_{x \to a^-} f(x) = A_2$, we call $A_1$ and $A_2$ respectively the right and left hand limits of $f(x)$ at $a$.
We have $\lim_{x \to a} f(x) = A$ if and only if $\lim_{x \to a^+} f(x) = \lim_{x \to a^-} f(x) = A$. The existence of the limit from the left does not imply the existence of the limit from the right, and conversely. When a function f is defined on only one side of a point a, then $\lim_{x \to a} f(x)$ is identical to the one-sided limit, if it exists. For example, if $f(x) = \sqrt{x}$, then f is only defined to the right of zero. Hence, $\lim_{x \to 0} \sqrt{x} = \lim_{x \to 0^+} \sqrt{x} = 0$. Of course, $\lim_{x \to 0^-} \sqrt{x}$ does not exist, since $\sqrt{x}$ is not defined when $x < 0$. On the other hand, consider the function $g(x) = \sqrt{\frac{1}{x}}$, which is defined only for $x > 0$. In this case, $\lim_{x \to 0^+} g(x)$ does not exist and, therefore, $\lim_{x \to 0} g(x)$ does not exist.
Solution: We must show that if $\lim_{x \to a} f(x) = A_1$ and $\lim_{x \to a} f(x) = A_2$, then $A_1 = A_2$.
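Example: Given $\lim_{x \to a} f(x) = A$ and $\lim_{x \to a} g(x) = B$. Prove that $\lim_{x \to a} [f(x) + g(x)] = A + B$.
Solution: We must show that for any ε > 0, we can find δ > 0 such that $|(f(x) + g(x)) - (A + B)| < \varepsilon$ when $0 < |x - a| < \delta$.
By hypothesis, given ε > 0, we can find $\delta_1 > 0$ and $\delta_2 > 0$ such that
\[ |f(x) - A| < \frac{\varepsilon}{2} \quad \text{when } 0 < |x - a| < \delta_1, \]
\[ |g(x) - B| < \frac{\varepsilon}{2} \quad \text{when } 0 < |x - a| < \delta_2. \]
Then
\[ |(f(x) + g(x)) - (A + B)| \le |f(x) - A| + |g(x) - B| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \]
when $0 < |x - a| < \delta$, where δ is chosen as the smaller of $\delta_1$ and $\delta_2$.
You Try It: Given $\lim_{x \to a} f(x) = A$ and $\lim_{x \to a} g(x) = B$. Prove that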
3. $\lim_{x \to 0} \frac{e^x - 1}{x} = 1$, $\lim_{x \to 1} \frac{x - 1}{\ln x} = 1$.
If f(a) is defined
If $x = a$ is in the domain of $f(x)$ and a is not an endpoint of the domain, and $f(x)$ is defined by a single expression, then
\[ \lim_{x \to a} f(x) = f(a). \]
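Example: Find $\lim_{x \to 1} (x + 3)$.
Solution: $\lim_{x \to 1} (x + 3) = 1 + 3 = 4$.
Example: Find $\lim_{x \to 1} \frac{1}{x + 2}$.
Solution: $\lim_{x \to 1} \frac{1}{x + 2} = \frac{1}{1 + 2} = \frac{1}{3}$.
Example: Find $\lim_{x \to 2} \frac{x^2 - 4}{x - 2}$.
Solution: $\lim_{x \to 2} \frac{x^2 - 4}{x - 2} = \lim_{x \to 2} \frac{(x + 2)(x - 2)}{x - 2} = \lim_{x \to 2} (x + 2) = 4$.
Suppose that f(x) is defined by one expression for x < a and by a different expression for x > a.
Example: Show that $\lim_{x \to 0} \frac{|x|}{x}$ does not exist.
\[ \lim_{x \to 0^-} \frac{|x|}{x} = \lim_{x \to 0^-} (-1) = -1. \]
\[ \lim_{x \to 0^+} \frac{|x|}{x} = \lim_{x \to 0^+} (1) = 1. \]
Since the one-sided limits differ, $\lim_{x \to 0} \frac{|x|}{x}$ does not exist.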
Example: Find $\lim_{x \to 0} \frac{\sin 3x}{x}$.
Solution: Since $\frac{\sin 3x}{x} = 3\,\frac{\sin 3x}{3x}$, we have
\[ \lim_{x \to 0} \frac{\sin 3x}{x} = \lim_{x \to 0} 3\,\frac{\sin 3x}{3x} = 3 \lim_{x \to 0} \frac{\sin 3x}{3x} = 3(1) = 3. \]
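Example: Find $\lim_{x \to 0} \frac{1 - \cos 2x}{\sin 3x}$.
Solution: Since $\frac{1 - \cos 2x}{\sin 3x} = \frac{1 - \cos 2x}{2x} \cdot 2x \cdot \frac{3x}{\sin 3x} \cdot \frac{1}{3x} = \frac{2}{3} \cdot \frac{1 - \cos 2x}{2x} \cdot \frac{3x}{\sin 3x}$, we have
\[ \lim_{x \to 0} \frac{1 - \cos 2x}{\sin 3x} = \frac{2}{3} \lim_{x \to 0} \frac{1 - \cos 2x}{2x} \cdot \lim_{x \to 0} \frac{3x}{\sin 3x} = \frac{2}{3}(0)(1) = 0. \]
Example: Find $\lim_{x \to \pi/4} \frac{\sin x}{\cos x}$.
Solution: $\lim_{x \to \pi/4} \frac{\sin x}{\cos x} = \frac{\sin \frac{\pi}{4}}{\cos \frac{\pi}{4}} = 1$.
You Try It: Show that $\lim_{\theta \to 0} \frac{1 - \cos \theta}{\theta} = 0$.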
Limits at Infinity
Similarly, we say that $\lim_{x \to a} f(x) = -\infty$ if for each positive number M we can find a positive number δ (depending on M in general) such that $f(x) < -M$ whenever $0 < |x - a| < \delta$.
Note that $\lim_{x \to \infty} \frac{1}{x} = 0$ and $\lim_{x \to -\infty} \frac{1}{x} = 0$.
A rational function is a quotient of two polynomials, $f(x) = \frac{p_m(x)}{q_n(x)}$, where m and n are the degrees of the two polynomials.
Solution: The degree of the numerator is one, the degree of the denominator is two. Therefore
\[ \lim_{x \to \infty} \frac{x + 1}{x^2 + 4} = 0, \quad \text{since} \quad \lim_{x \to \infty} \frac{\frac{1}{x} + \frac{1}{x^2}}{1 + \frac{4}{x^2}} = \frac{0 + 0}{1 + 0} = 0. \]
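2. If m > n, then $\lim_{x \to \infty} f(x) = \pm\infty$. (The sign depends on the polynomials $p_m(x)$ and $q_n(x)$: if they are of the same sign as x gets larger, the quotient is positive; if they are of opposite signs, the quotient is negative.)
Example: Find $\lim_{x \to \infty} \frac{x^3 - 2x^2 + 3x + 4}{3x + 5}$.
Solution: Dividing the numerator and denominator by $x^3$,
\[ \lim_{x \to \infty} \frac{x^3 - 2x^2 + 3x + 4}{3x + 5} = \lim_{x \to \infty} \frac{1 - \frac{2}{x} + \frac{3}{x^2} + \frac{4}{x^3}}{\frac{3}{x^2} + \frac{5}{x^3}} = \infty, \]
since the numerator tends to 1 while the denominator tends to 0 through positive values.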
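3. If m = n, then $\lim_{x \to \infty} f(x) = \frac{a}{b}$, where a is the coefficient of $x^m$ in the numerator and b is the coefficient of $x^n$ in the denominator.
Example: Find $\lim_{x \to \infty} \frac{x^3 - 4x + 1}{3x^3 + 2x + 7}$.
Solution:
\[ \lim_{x \to \infty} \frac{x^3 - 4x + 1}{3x^3 + 2x + 7} = \lim_{x \to \infty} \frac{1 - \frac{4}{x^2} + \frac{1}{x^3}}{3 + \frac{2}{x^2} + \frac{7}{x^3}} = \frac{1 - 0 + 0}{3 + 0 + 0} = \frac{1}{3}. \]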
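The three degree cases for rational functions can be observed numerically by evaluating the worked examples at increasingly large x. This is only a sketch; the function names are labels chosen for this note, and large finite inputs merely suggest the limiting behaviour.

```python
def numerator_lower_degree(x):
    """(x + 1)/(x^2 + 4): degree 1 over degree 2, tends to 0."""
    return (x + 1) / (x ** 2 + 4)


def numerator_higher_degree(x):
    """(x^3 - 2x^2 + 3x + 4)/(3x + 5): degree 3 over degree 1, grows without bound."""
    return (x ** 3 - 2 * x ** 2 + 3 * x + 4) / (3 * x + 5)


def equal_degree(x):
    """(x^3 - 4x + 1)/(3x^3 + 2x + 7): equal degrees, tends to 1/3."""
    return (x ** 3 - 4 * x + 1) / (3 * x ** 3 + 2 * x + 7)


for x in (1e2, 1e4, 1e6):
    print(x, numerator_lower_degree(x), numerator_higher_degree(x), equal_degree(x))
```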
You Try It: What is $\lim_{x \to \infty} \frac{a_0 x^m + a_1 x^{m-1} + \cdots + a_m}{b_0 x^n + b_1 x^{n-1} + \cdots + b_n}$, where $a_0, b_0 \neq 0$ and m and n are positive integers, when (a) m > n (b) m = n (c) m < n?
You Try It: Find $\lim_{x \to 4} \frac{x - 4}{\sqrt{x} - 2}$.
5.5 Continuity
1. f (a) is defined.
Notice that, for f(x) to be continuous at x = a, all three conditions must be satisfied. If at least one condition fails, f is said to have a discontinuity at x = a. For example, $f(x) = x^2 + 1$ is continuous at x = 2 since $\lim_{x \to 2} f(x) = 5 = f(2)$. The first condition above implies that a function can be continuous only at points of its domain. Thus, $f(x) = \sqrt{4 - x^2}$ is not continuous at x = 3 because f(3) is imaginary, i.e., not defined.
Example: Determine whether $f(x) = \frac{|x|}{x}$ is continuous at x = 0.
\[ f(x) = \begin{cases} \frac{|x|}{x}, & \text{if } x \neq 0 \\ 0, & \text{if } x = 0, \end{cases} \]
is continuous at x = 0.
Since the limits are not the same, $\lim_{x \to 0} f(x)$ does not exist and f(x) is not continuous at x = 0.
You Try It: Determine whether the function defined by
\[ f(x) = \begin{cases} x^2, & \text{if } x < 2 \\ 5, & \text{if } x = 2 \\ -x + 6, & \text{if } x > 2, \end{cases} \]
A function f (x) is discontinuous at x = a if one or more of the conditions for continuity fails
there.
Example: (a) $f(x) = \frac{1}{x - 2}$ is discontinuous at x = 2, because f(2) is not defined (has a zero denominator) and because $\lim_{x \to 2} f(x)$ does not exist ($|f(x)| \to \infty$). The function is, however, continuous everywhere except at x = 2, where it is said to have an infinite discontinuity.
(b) $f(x) = \frac{x^2 - 4}{x - 2}$ is discontinuous at x = 2 because f(2) is not defined (both numerator and denominator are zero), even though $\lim_{x \to 2} f(x) = 4$. The discontinuity here is called removable since it may be removed by redefining the function as $f(x) = \frac{x^2 - 4}{x - 2}$ for $x \neq 2$ and $f(2) = 4$. (Note the discontinuity in (a) cannot be removed because the limit also does not exist.)
f (x) is continuous at x = a, if for any ε > 0, we can find δ > 0, such that, |f (x) − f (a)| < ε
whenever 0 < |x − a| < δ.
Solution: We must show that, given any ε > 0, we can find δ > 0 such that $|f(x) - f(2)| = |x^2 - 4| < \varepsilon$ when $|x - 2| < \delta$.
Choose δ ≤ 1, so that $|x - 2| < 1$, i.e., $1 < x < 3$ ($x \neq 2$). Then $|x^2 - 4| = |(x - 2)(x + 2)| = |x - 2|\,|x + 2| < \delta|x + 2| < 5\delta$. Taking $\delta = \min\{1, \frac{\varepsilon}{5}\}$, we have $|x^2 - 4| < \varepsilon$ whenever $|x - 2| < \delta$.
You Try It: (a) Prove that f (x) = x is continuous at any point x = x0 .
(b) Prove that f (x) = 2x3 + x is continuous at any point x = x0 .
Theorems on Continuity
Theorem 1
If f(x) and g(x) are continuous at x = a, so are the functions $f(x) \pm g(x)$, $f(x)g(x)$ and $\frac{f(x)}{g(x)}$ if $g(a) \neq 0$.
Theorem 2
The following functions are continuous in every finite interval: (a) all polynomials, (b) $\sin x$ and $\cos x$, (c) $a^x$, $a > 0$.
Theorem 3
If y = f (x) is continuous at x = a and z = g(y) is continuous at y = b and if b = f (a), then the
function z = g[f (x)] called a function of a function or composite function is continuous at x = a.
Theorem 4
If f (x) is continuous in a closed interval, it is bounded in the interval.
5.7 Tutorial 3
1. Find the domain of each of the following functions.
(i) $\frac{1}{\sqrt{1 - x}}$ (ii) $\sqrt{\frac{x}{2 - x}}$ (iii) $\frac{x^3 - 8}{x^2 - 4}$
2. Determine whether or not each of the following correspondences is a function.
(i) x2 + y = 1 (ii) x2 y 2 = 5 (iii) x2 y = 4 (iv) {(1, 5), (2, 5), (5, 1)}.
3. For each of the following pairs of functions, calculate f ◦ g and g ◦ f .
(i) $f(x) = e^{x^2}$ and $g(x) = e^{-x^2}$.
4. Let the function f : R → R be defined by f (x) = 2x − 1. Show that f is bijective and hence
find f −1 .
5. Show that if f : A → B is onto and g : B → C is onto, then the product function
(g ◦ f ) : A → C is onto.
6. Show that the function $f(x) = \ln(x + \sqrt{x^2 + 1})$ is an odd function and find the inverse function of f(x).
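7. Let
\[ f(x) = \begin{cases} 3x - 1, & x < 0 \\ 0, & x = 0 \\ 2x + 5, & x > 0. \end{cases} \]
Evaluate (i) $\lim_{x \to 2} f(x)$ (ii) $\lim_{x \to -3} f(x)$ (iii) $\lim_{x \to 0^+} f(x)$ (iv) $\lim_{x \to 0^-} f(x)$ (v) $\lim_{x \to 0} f(x)$.
8. If $f(x) = \frac{3x + |x|}{7x - 5|x|}$, evaluate (i) $\lim_{x \to \infty} f(x)$ (ii) $\lim_{x \to -\infty} f(x)$.
11. Verify (i) $\lim_{x \to 2} \frac{x^2 - 4}{x - 2} = 4$ (ii) $\lim_{x \to 0} x^2 \cos\frac{1}{x} = 0$.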
12. Give the points of discontinuity of each of the following functions.
(i) $f(x) = \frac{x}{(x - 2)(x - 4)}$ (ii) $f(x) = x^2 \sin\frac{1}{x}$, $f(0) = 0$
(iii) $f(x) = \sqrt{(x - 3)(6 - x)}$, $3 \le x \le 6$ (iv) $f(x) = \frac{1}{1 + 2\sin x}$.
13. Given that the function f : R → R is defined by
\[ f(x) = \begin{cases} x^2 - 4, & x \ge 2 \\ 2ax + b, & 0 < x < 2 \\ e^x, & x \le 0, \end{cases} \]
Chapter 6
Differentiation
Let f(x) be defined at any point $x_0$ in (a, b). The derivative of f(x) at $x = x_0$ is defined as
\[ f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} \]
if this limit exists. A function is called differentiable at a point $x = x_0$ if it has a derivative at that point, i.e., if $f'(x_0)$ exists. If we write $x = x_0 + h$, then $h = x - x_0$ and h approaches 0 if and only if x approaches $x_0$. Therefore, an equivalent way of stating the definition of the derivative is
\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}. \]
Solution:
\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{[(x + h)^3 - (x + h)] - [x^3 - x]}{h} \]
\[ = \lim_{h \to 0} \frac{x^3 + 3x^2 h + 3xh^2 + h^3 - x - h - x^3 + x}{h} \]
\[ = \lim_{h \to 0} \frac{3x^2 h + 3xh^2 + h^3 - h}{h} \]
\[ = \lim_{h \to 0} (3x^2 + 3xh + h^2 - 1) = 3x^2 - 1. \]
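The limit just computed can be illustrated with difference quotients for shrinking h. The sketch below assumes the function in this solution is f(x) = x^3 - x, as the algebra indicates; the helper names are introduced here for illustration.

```python
def f(x):
    return x ** 3 - x


def difference_quotient(func, x, h):
    """(func(x + h) - func(x)) / h, the quantity whose limit defines the derivative."""
    return (func(x + h) - func(x)) / h


def f_prime(x):
    """Derivative obtained from the definition above: 3x^2 - 1."""
    return 3 * x ** 2 - 1


x0 = 2.0
for h in (1e-1, 1e-2, 1e-3, 1e-6):
    # the quotient approaches f_prime(x0) = 11 as h shrinks
    print(h, difference_quotient(f, x0, h), f_prime(x0))
```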
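Example: If $f(x) = \sqrt{x}$, find the derivative of f.
Solution:
\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \]
\[ = \lim_{h \to 0} \frac{\sqrt{x + h} - \sqrt{x}}{h} \]
\[ = \lim_{h \to 0} \frac{\sqrt{x + h} - \sqrt{x}}{h} \cdot \frac{\sqrt{x + h} + \sqrt{x}}{\sqrt{x + h} + \sqrt{x}} \]
\[ = \lim_{h \to 0} \frac{(x + h) - x}{h(\sqrt{x + h} + \sqrt{x})} \]
\[ = \lim_{h \to 0} \frac{1}{\sqrt{x + h} + \sqrt{x}} \]
\[ = \frac{1}{\sqrt{x} + \sqrt{x}} = \frac{1}{2\sqrt{x}}. \]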
The derivative at x may be denoted by $f'(x)$, $y'$, $\frac{dy}{dx}$, $\frac{d}{dx}(f(x))$. The symbol $\frac{d}{dx}$ is called the differentiation operator because it indicates the operation of differentiation. The process of finding derivatives of functions is called differentiation.
Solution: If x > 0, then |x| = x and we can choose h small enough that x + h > 0 and hence |x + h| = x + h. Therefore, for x > 0,
\[ f'(x) = \lim_{h \to 0} \frac{|x + h| - |x|}{h} = \lim_{h \to 0} \frac{(x + h) - x}{h} = \lim_{h \to 0} \frac{h}{h} = 1, \]
and so f is differentiable for any x > 0.
Similarly, for x < 0, we have |x| = −x and h can be chosen small enough that x + h < 0 and so |x + h| = −(x + h). Therefore, for x < 0,
\[ f'(x) = \lim_{h \to 0} \frac{|x + h| - |x|}{h} = \lim_{h \to 0} \frac{-(x + h) - (-x)}{h} = \lim_{h \to 0} \frac{-h}{h} = -1, \]
and so f is differentiable for any x < 0.
For x = 0 we have to investigate
\[ f'(0) = \lim_{h \to 0} \frac{f(0 + h) - f(0)}{h} = \lim_{h \to 0} \frac{|0 + h| - |0|}{h} \quad \text{(if it exists)}. \]
Let’s compute the left and right limits separately:
\[ \lim_{h \to 0^+} \frac{|0 + h| - |0|}{h} = \lim_{h \to 0^+} \frac{|h|}{h} = \lim_{h \to 0^+} 1 = 1. \]
\[ \lim_{h \to 0^-} \frac{|0 + h| - |0|}{h} = \lim_{h \to 0^-} \frac{|h|}{h} = \lim_{h \to 0^-} (-1) = -1. \]
Since these limits are different, $f'(0)$ does not exist. Thus, f is differentiable at all x except 0.
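The disagreement of the one-sided limits at 0 shows up directly in the difference quotients of |x|. The following sketch uses a helper name chosen for this note.

```python
def abs_quotient_at_zero(h):
    """Difference quotient of f(x) = |x| at x = 0 for a nonzero step h."""
    return (abs(0 + h) - abs(0)) / h


steps = (0.1, 0.01, 0.001)
from_right = [abs_quotient_at_zero(h) for h in steps]   # h -> 0+
from_left = [abs_quotient_at_zero(-h) for h in steps]   # h -> 0-

print(from_right)
print(from_left)
```

The right-hand quotients stay at 1 and the left-hand quotients stay at -1, so no single limit exists at 0.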
For example,
\[ (3x^5 - 2x^2 + 1)' = (3x^5)' - (2x^2)' + (1)' = 3(x^5)' - 2(x^2)' + 0 = 3(5x^4) - 2(2x) = 15x^4 - 4x. \]
6. Product Rule
\[ (f(x)g(x))' = f(x)g'(x) + g(x)f'(x). \]
7. Quotient Rule
\[ \left(\frac{f(x)}{g(x)}\right)' = \frac{g(x)f'(x) - f(x)g'(x)}{(g(x))^2}. \]
The Second Derivative $\frac{d^2 y}{dx^2}$ is given by
\[ \frac{d^2 y}{dx^2} = \frac{d}{du}\left(\frac{dy}{dx}\right)\frac{du}{dx}. \]
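Example: Find $\frac{dy}{dx}$ and $\frac{d^2 y}{dx^2}$ given $x = \theta - \sin\theta$ and $y = 1 - \cos\theta$.
Solution: Note that $\frac{dx}{d\theta} = 1 - \cos\theta$ and $\frac{dy}{d\theta} = \sin\theta$, so
\[ \frac{dy}{dx} = \frac{dy/d\theta}{dx/d\theta} = \frac{\sin\theta}{1 - \cos\theta}. \]
Also
\[ \frac{d^2 y}{dx^2} = \frac{d}{d\theta}\left(\frac{\sin\theta}{1 - \cos\theta}\right)\frac{d\theta}{dx} = \frac{\cos\theta - 1}{(1 - \cos\theta)^2} \cdot \frac{1}{1 - \cos\theta} = -\frac{1}{(1 - \cos\theta)^2}. \]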
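The closed form for the second derivative can be compared with a numerical estimate. The sketch below differentiates dy/dx with respect to θ by a central difference and divides by dx/dθ; the function names and the step size are assumptions of this note.

```python
import math


def dydx(theta):
    """First derivative along the curve: sin(theta) / (1 - cos(theta))."""
    return math.sin(theta) / (1 - math.cos(theta))


def d2ydx2_formula(theta):
    """Closed form derived above: -1 / (1 - cos(theta))^2."""
    return -1 / (1 - math.cos(theta)) ** 2


def d2ydx2_numeric(theta, h=1e-5):
    """Central-difference estimate of d/dtheta(dy/dx), divided by dx/dtheta."""
    slope_change = (dydx(theta + h) - dydx(theta - h)) / (2 * h)
    dx_dtheta = 1 - math.cos(theta)
    return slope_change / dx_dtheta


for theta in (0.5, 1.0, 2.0, 4.0):
    print(theta, d2ydx2_numeric(theta), d2ydx2_formula(theta))
```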
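Example: Find $\frac{dy}{dx}$ and $\frac{d^2 y}{dx^2}$ given $x = e^t \cos t$ and $y = e^t \sin t$.
Solution: Note that $\frac{dx}{dt} = e^t(\cos t - \sin t)$ and $\frac{dy}{dt} = e^t(\sin t + \cos t)$, so
\[ \frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{\sin t + \cos t}{\cos t - \sin t}. \]
Also
\[ \frac{d^2 y}{dx^2} = \frac{d}{dt}\left(\frac{\sin t + \cos t}{\cos t - \sin t}\right)\frac{dt}{dx} = \frac{2}{(\cos t - \sin t)^2} \cdot \frac{1}{e^t(\cos t - \sin t)} = \frac{2}{e^t(\cos t - \sin t)^3}. \]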
Theorem 6.1.1. If f(x) = c is a constant function, then $f'(x) = 0$ for all real numbers x.
Proof. Observe that $f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{c - c}{h} = 0$.
Theorem 6.1.2 (Product Rule). If f and g are both differentiable at x, then the product function fg is also differentiable at x and $(fg)'(x) = f(x)g'(x) + g(x)f'(x)$.
Proof.
\[ (fg)'(x) = \lim_{h \to 0} \frac{f(x + h)g(x + h) - f(x)g(x)}{h}. \]
The trick is to add and subtract $f(x + h)g(x)$ in the numerator.
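Proof.
\[ \lim_{h \to 0} \frac{\frac{1}{g(x + h)} - \frac{1}{g(x)}}{h} = \lim_{h \to 0} \frac{g(x) - g(x + h)}{h\,g(x)\,g(x + h)} \]
\[ = \lim_{h \to 0} \frac{-(g(x + h) - g(x))}{h} \cdot \frac{1}{g(x)g(x + h)} \]
\[ = \lim_{h \to 0} \frac{-(g(x + h) - g(x))}{h} \cdot \lim_{h \to 0} \frac{1}{g(x)g(x + h)} \]
\[ = -g'(x)\,\frac{1}{(g(x))^2}. \]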
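Theorem 6.1.4. If f and g are differentiable at x and $g(x) \neq 0$, then $\frac{f}{g}$ is differentiable at x and
\[ \left(\frac{f}{g}\right)'(x) = \frac{g(x)f'(x) - f(x)g'(x)}{(g(x))^2}. \]
Proof. Since $\frac{f}{g} = f \cdot \frac{1}{g}$, we have
\[ \left(\frac{f}{g}\right)'(x) = \left(f \cdot \frac{1}{g}\right)'(x) \]
\[ = f'(x)\,\frac{1}{g(x)} + f(x)\left(\frac{1}{g}\right)'(x) \]
\[ = \frac{f'(x)}{g(x)} + f(x)\left(-\frac{g'(x)}{(g(x))^2}\right) \]
\[ = \frac{f'(x)g(x) - f(x)g'(x)}{(g(x))^2}. \]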
Example: If $f(x) = \frac{x}{x^2 + 1}$, then $f'(x) = \frac{(x^2 + 1) - x(2x)}{(x^2 + 1)^2} = \frac{1 - x^2}{(x^2 + 1)^2}$.
Example: If $f(x) = \frac{1}{x}$, then $f'(x) = -\frac{1}{x^2}$.
Example: If $x = \cos t$ and $y = t \sin t$, find $\frac{dy}{dx}$.
Solution:
\[ \frac{dy}{dx} = \frac{\frac{d}{dt}(t \sin t)}{\frac{d}{dt}(\cos t)} = \frac{\sin t + t \cos t}{-\sin t}. \]
Recall:
\[ \lim_{h \to 0} \frac{\sin h}{h} = 1 \quad \text{and} \quad \lim_{h \to 0} \frac{1 - \cos h}{h} = 0. \]
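\[ (\sin x)' = \lim_{h \to 0} \frac{\sin(x + h) - \sin x}{h} \]
\[ = \lim_{h \to 0} \frac{\sin x \cos h + \cos x \sin h - \sin x}{h} \]
\[ = \lim_{h \to 0} \frac{\sin x(\cos h - 1) + \cos x \sin h}{h} \]
\[ = \lim_{h \to 0} \left(-\sin x\,\frac{1 - \cos h}{h} + \cos x\,\frac{\sin h}{h}\right) \]
\[ = -\sin x\,(0) + \cos x\,(1) = \cos x. \]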
Example: Find $\frac{dy}{dx}$ if $y = x^3 \sin x$.
Solution: $\frac{dy}{dx} = (x^3 \sin x)' = x^3(\sin x)' + \sin x\,(x^3)' = x^3 \cos x + 3x^2 \sin x$.
is differentiable at x = 0.
Derivatives of $\ln x$ and $e^x$ are $\frac{d}{dx} e^x = e^x$ and $\frac{d}{dx} \ln x = \frac{1}{x}$. Also $\frac{d}{dx}(a^x) = a^x \ln a$, $a > 0$.
Example: Calculate the derivative $\frac{d}{dx}[(\sin x + x) \cdot (x^3 - \ln x)]$.
Solution: We know that $\frac{d}{dx}\sin x = \cos x$, $\frac{d}{dx} x = 1$, $\frac{d}{dx} x^3 = 3x^2$ and $\frac{d}{dx}\ln x = \frac{1}{x}$. Therefore, by the addition rule,
\[ \frac{d}{dx}(\sin x + x) = \frac{d}{dx}\sin x + \frac{d}{dx} x = \cos x + 1 \]
and
\[ \frac{d}{dx}(x^3 - \ln x) = \frac{d}{dx} x^3 - \frac{d}{dx}\ln x = 3x^2 - \frac{1}{x}. \]
Now we may conclude the calculation by applying the product rule:
\[ \frac{d}{dx}[(\sin x + x) \cdot (x^3 - \ln x)] = \frac{d}{dx}(\sin x + x) \cdot (x^3 - \ln x) + (\sin x + x) \cdot \frac{d}{dx}(x^3 - \ln x) \]
\[ = (\cos x + 1) \cdot (x^3 - \ln x) + (\sin x + x) \cdot \left(3x^2 - \frac{1}{x}\right) \]
\[ = 4x^3 - 1 + x^3 \cos x + 3x^2 \sin x - \frac{1}{x}\sin x - \ln x \cos x - \ln x. \]
Example: Calculate the derivative $\frac{d}{dx}(\sin(x^3 - x^2))$.
Solution: This is the composition of functions, so we must apply the Chain Rule. It is essential to recognize what function will play the role of f and what function will play the role of g. Notice that, if x is the variable, then $x^3 - x^2$ is applied first and sin applied next. So it must be that $g(x) = x^3 - x^2$ and $f(s) = \sin s$. Notice that $\frac{d}{ds} f(s) = \cos s$ and $\frac{d}{dx} g(x) = 3x^2 - 2x$. Then
\[ \sin(x^3 - x^2) = (f \circ g)(x) \]
and
\[ \frac{d}{dx}(\sin(x^3 - x^2)) = \frac{d}{dx}((f \circ g)(x)) \]
\[ = \frac{df}{ds}(g(x)) \cdot \frac{d}{dx} g(x) \]
\[ = \cos(g(x)) \cdot (3x^2 - 2x) \]
\[ = [\cos(x^3 - x^2)] \cdot (3x^2 - 2x). \]
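Example: Calculate the derivative $\frac{d}{dx}\ln\left(\frac{x^2}{x - 2}\right)$.
Solution: Let $h(x) = \ln\left(\frac{x^2}{x - 2}\right)$. Then $h = f \circ g$, where $f(s) = \ln s$ and $g(x) = \frac{x^2}{x - 2}$. So $\frac{d}{ds} f(s) = \frac{1}{s}$ and $\frac{d}{dx} g(x) = \frac{(x - 2) \cdot 2x - x^2 \cdot 1}{(x - 2)^2} = \frac{x^2 - 4x}{(x - 2)^2}$. As a result,
\[ \frac{d}{dx} h(x) = \frac{d}{dx}(f \circ g)(x) \]
\[ = \frac{df}{ds}(g(x)) \cdot \frac{d}{dx} g(x) \]
\[ = \frac{1}{g(x)} \cdot \frac{x^2 - 4x}{(x - 2)^2} \]
\[ = \frac{1}{\frac{x^2}{x - 2}} \cdot \frac{x^2 - 4x}{(x - 2)^2} \]
\[ = \frac{x - 4}{x(x - 2)}. \]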
What is the relationship between continuity and differentiation? It appears that functions that
have derivatives must be continuous.
Theorem 6.4.1. If a function f is differentiable at a point x, then it is continuous at x.
Proof. We want to show that f is continuous at x, i.e., $\lim_{t \to x} f(t) = f(x)$ or $\lim_{h \to 0} f(x + h) = f(x)$, where h = t − x. It will be sufficient to show that $\lim_{h \to 0} [f(x + h) - f(x)] = 0$.
Now,
\[ \lim_{h \to 0} [f(x + h) - f(x)] = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \cdot h \]
\[ = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \cdot \lim_{h \to 0} h \]
\[ = f'(x) \cdot 0 \]
\[ = 0. \]
Hence f is continuous at x.
Converse is false: For example, the function f (x) = |x| is continuous at x = 0, but it is not
differentiable there.
If f(x) is differentiable in an interval, its derivative is given by $f'(x)$, $y'$ or $\frac{dy}{dx}$, where y = f(x). If $f'(x)$ is also differentiable in the interval, its derivative is denoted by $f''(x)$, $y''$ or $\frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d^2 y}{dx^2}$. Similarly, the nth derivative of f(x), if it exists, is denoted by $f^{(n)}(x)$, $y^{(n)}$ or $\frac{d^n y}{dx^n}$, where n is called the order of the derivative.
Solution: Derivative $y' = f'(x) = \frac{d}{dx}\left(\frac{1}{2}x^4 - 3x^2 + 1\right) = 2x^3 - 6x$.
Second derivative $y'' = f''(x) = \frac{d^2 y}{dx^2} = \frac{d}{dx}(2x^3 - 6x) = 6x^2 - 6$.
Third derivative $y''' = f'''(x) = \frac{d^3 y}{dx^3} = \frac{d}{dx}(6x^2 - 6) = 12x$.
Fourth derivative $y^{(4)} = f^{(4)}(x) = \frac{d^4 y}{dx^4} = \frac{d}{dx}(12x) = 12$.
6.6 Implicit Differentiation
Compare
1. $x^2 - y^3 = 3 \iff y = \sqrt[3]{x^2 - 3}$.
2. $x^2 + y^2 = 1 \iff y = \pm\sqrt{1 - x^2}$.
3. $x^3 + y^2 = 3xy \iff$ ????????.
Implicit Functions. A function in which the dependent variable is expressed solely in terms of
the independent variable x, namely y = f (x), is said to be an explicit function, for example,
$y = \frac{1}{2}x^3 - 1$. An equation $f(x, y) = 0$, on perhaps certain restricted ranges of the variables, is said to define y implicitly as a function of x.
Example: (a) The equation $xy + x - 2y - 1 = 0$, with $x \neq 2$, defines the function $y = \frac{1 - x}{x - 2}$.
(b) The equation $4x^2 + 9y^2 - 36 = 0$ defines the function $y = \frac{2}{3}\sqrt{9 - x^2}$ when $|x| \le 3$ and $y \ge 0$, and the function $y = -\frac{2}{3}\sqrt{9 - x^2}$ when $|x| \le 3$ and $y \le 0$.
2. Thinking of y as a function of x, differentiate both sides of the given equation with respect
to x and solve the resulting relation for y 0 . This differentiation process is known as implicit
differentiation.
Example: Find $\frac{dy}{dx}$ if $x^2 + y^2 = 4$.
Example: Find $\frac{d^2 y}{dx^2}$ if $x^2 + y^2 = 4$.
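Solution: From the above example, we already know that the first derivative is
\[ \frac{dy}{dx} = -\frac{x}{y}. \]
Hence by the Quotient Rule
\[ \frac{d^2 y}{dx^2} = \frac{d}{dx}\left(-\frac{x}{y}\right) = -\frac{y \cdot 1 - x \cdot \frac{dy}{dx}}{y^2} \]
\[ = -\frac{y - x\left(-\frac{x}{y}\right)}{y^2} \quad \left(\text{substituting for } \frac{dy}{dx}\right) \]
\[ = -\frac{y^2 + x^2}{y^3}. \]
Since $x^2 + y^2 = 4$,
\[ \frac{d^2 y}{dx^2} = -\frac{4}{y^3}. \]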
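The result can be checked on the upper branch of the circle, where y is given explicitly by the square root of 4 - x^2. The sketch below compares a central second difference with the formula -4/y^3; the function names, step size, and sample points are assumptions of this note.

```python
import math


def y_upper(x):
    """Upper branch of x^2 + y^2 = 4."""
    return math.sqrt(4 - x * x)


def second_derivative_numeric(x, h=1e-4):
    """Central second difference of the upper branch."""
    return (y_upper(x + h) - 2 * y_upper(x) + y_upper(x - h)) / (h * h)


def second_derivative_formula(x):
    """Result of implicit differentiation: -4 / y^3."""
    return -4 / y_upper(x) ** 3


for x in (-1.0, 0.0, 0.5, 1.0):
    print(x, second_derivative_numeric(x), second_derivative_formula(x))
```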
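Example: Find $\frac{dy}{dx}$ if $\sin y = y \cos 2x$.
Solution:
\[ \frac{d}{dx}\sin y = \frac{d}{dx}(y \cos 2x) \]
\[ \cos y\,\frac{dy}{dx} = y(-\sin 2x \cdot 2) + \cos 2x\,\frac{dy}{dx} \]
\[ (\cos y - \cos 2x)\frac{dy}{dx} = -2y \sin 2x \]
\[ \frac{dy}{dx} = -\frac{2y \sin 2x}{\cos y - \cos 2x}. \]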
Solution: We have
\[ x\frac{d}{dx}(y) + y\frac{d}{dx}(x) + \frac{d}{dx}(x) - 2\frac{d}{dx}(y) - \frac{d}{dx}(1) = \frac{d}{dx}(0) \]
or $xy' + y + 1 - 2y' = 0$, then $y' = \frac{1 + y}{2 - x}$.
Solution:
\[ \frac{d}{dx}(x^2 y) - \frac{d}{dx}(xy^2) + \frac{d}{dx}(x^2) + \frac{d}{dx}(y^2) = 0 \]
\[ x^2\frac{d}{dx}(y) + y\frac{d}{dx}(x^2) - x\frac{d}{dx}(y^2) - y^2\frac{d}{dx}(x) + \frac{d}{dx}(x^2) + \frac{d}{dx}(y^2) = 0. \]
Hence, $x^2 y' + 2xy - 2xyy' - y^2 + 2x + 2yy' = 0$ and $y' = \frac{y^2 - 2x - 2xy}{x^2 + 2y - 2xy}$.
Solution:
\[ \frac{d}{dx}(x^2) - \frac{d}{dx}(xy) + \frac{d}{dx}(y^2) = 2x - xy' - y + 2yy' = 0. \quad \text{So } y' = \frac{2x - y}{x - 2y}. \]
Then
\[ y'' = \frac{(x - 2y)\frac{d}{dx}(2x - y) - (2x - y)\frac{d}{dx}(x - 2y)}{(x - 2y)^2} = \frac{(x - 2y)(2 - y') - (2x - y)(1 - 2y')}{(x - 2y)^2} \]
\[ = \frac{3xy' - 3y}{(x - 2y)^2} = \frac{3x\,\frac{2x - y}{x - 2y} - 3y}{(x - 2y)^2} = \frac{6(x^2 - xy + y^2)}{(x - 2y)^3} \]
\[ = \frac{18}{(x - 2y)^3}. \]
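Take the natural logarithm (ln) of both sides, differentiate implicitly and solve for $y'$.
Example: Compute $y'$ if $y = \frac{x^2 \sqrt[3]{7x - 14}}{(1 + x^2)^4}$.
Solution: $y = \frac{x^2 \sqrt[3]{7x - 14}}{(1 + x^2)^4} \Rightarrow \ln y = \ln\left(\frac{x^2 \sqrt[3]{7x - 14}}{(1 + x^2)^4}\right)$.
\[ \ln y = 2\ln x + \frac{1}{3}\ln(7x - 14) - 4\ln(1 + x^2) \]
\[ \frac{1}{y}\,y' = 2\,\frac{1}{x} + \frac{1}{3}\,\frac{(7x - 14)'}{7x - 14} - 4\,\frac{(1 + x^2)'}{1 + x^2} \]
\[ = \frac{2}{x} + \frac{7}{3(7x - 14)} - \frac{8x}{1 + x^2} \]
\[ y' = y\left(\frac{2}{x} + \frac{7}{3(7x - 14)} - \frac{8x}{1 + x^2}\right) \]
\[ = \frac{x^2 \sqrt[3]{7x - 14}}{(1 + x^2)^4}\left(\frac{2}{x} + \frac{7}{3(7x - 14)} - \frac{8x}{1 + x^2}\right). \]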
If x = sin y, the inverse function is written y = sin−1 x or y = arcsin x. The inverse trigonometric
functions are multivalued functions.
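\[ (\sin y)' = x' \]
\[ (\cos y)\,y' = 1 \]
\[ y' = \frac{1}{\cos y} \]
\[ y' = \frac{1}{\sqrt{1 - \sin^2 y}} = \frac{1}{\sqrt{1 - x^2}}. \]
Some Derivatives. $\frac{d}{dx}(\cos^{-1} x) = -\frac{1}{\sqrt{1 - x^2}}$, $\frac{d}{dx}(\cot^{-1} x) = -\frac{1}{1 + x^2}$, $\frac{d}{dx}(\sec^{-1} x) = \frac{1}{x\sqrt{x^2 - 1}}$.
You Try It: Show that the derivative of $\tan^{-1} x$ is $\frac{1}{1 + x^2}$.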
Applications of the Derivative
Suppose f(x) is continuous on [a, b] and differentiable on (a, b). Then there exists a c in (a, b) at which the tangent line is parallel to the secant line joining the points (a, f(a)) and (b, f(b)), i.e., at which $f'(c) = \frac{f(b) - f(a)}{b - a}$,
OR
If f(x) is continuous in [a, b] and differentiable in (a, b), then there exists a point c in (a, b) such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}, \quad a < c < b. \]
The word mean in The Mean Value Theorem refers to the mean (or average) rate of change of f
in the interval [a, b].
If f(a) = f(b) = 0, then the theorem says that there exists a c in (a, b) at which $f'(c) = 0$. The graphs suggest that there must be at least one point on the graph, corresponding to a number c in (a, b), at which the tangent is horizontal. This special case of the Mean Value Theorem is called Rolle’s Theorem¹.
Example: Consider $f(x) = \sqrt{x - 1}$ on [2, 5]. f(x) is continuous when x − 1 ≥ 0, i.e., x ≥ 1. In particular, f(x) is continuous on [2, 5], and $f'(x) = \frac{1}{2\sqrt{x - 1}}$, so f is differentiable when x > 1. In particular, f(x) is differentiable on (2, 5).
\[ \frac{f(b) - f(a)}{b - a} = \frac{f(5) - f(2)}{5 - 2} = \frac{\sqrt{5 - 1} - \sqrt{2 - 1}}{3} = \frac{1}{3}. \]
The Mean Value Theorem asserts that, for some c in (2, 5), $f'(c) = \frac{1}{3}$. Let us find it.
\[ f'(x) = \frac{1}{3} \]
\[ \frac{1}{2\sqrt{x - 1}} = \frac{1}{3} \]
\[ 2\sqrt{x - 1} = 3 \]
\[ 4(x - 1) = 9 \]
\[ x - 1 = \frac{9}{4} \]
\[ x = \frac{13}{4}. \]
Notice that $\frac{13}{4}$ is in (2, 5), so we may take $c = \frac{13}{4}$.
Example: Show that if $f(x) = \tan x$ on the interval $0 \le x \le k$, where $k < \frac{\pi}{2}$, then $\tan k \ge k$.
¹ Michel Rolle, a French mathematician (1652-1719)
Solution: By the Mean Value Theorem
\[ \frac{\tan k - \tan 0}{k - 0} = \sec^2 c \]
for some c ∈ (0, k). But $\sec^2 c \ge 1$ and $\tan 0 = 0$. So
\[ \frac{\tan k}{k} \ge 1 \implies \tan k \ge k. \]
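Example: Use the Mean Value Theorem to show that $|\cos a - \cos b| \le |a - b|$.
Solution: The function cos x is continuous and differentiable for all x. By the Mean Value Theorem, for some c between a and b,
\[ (\cos x)'\big|_{x = c} = \frac{\cos a - \cos b}{a - b}, \]
\[ \left|(\cos x)'\big|_{x = c}\right| = \left|\frac{\cos a - \cos b}{a - b}\right|. \]
Since $|(\cos x)'| = |-\sin x| \le 1$ for all x, it follows that $|\cos a - \cos b| \le |a - b|$.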
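Example: Prove that $\frac{b - a}{1 + b^2} < \tan^{-1} b - \tan^{-1} a < \frac{b - a}{1 + a^2}$ for a < b.
Solution: Let $f(x) = \tan^{-1} x$. Since $f'(x) = \frac{1}{1 + x^2}$, $f'(c) = \frac{1}{1 + c^2}$. By the Mean Value Theorem
\[ \frac{f(b) - f(a)}{b - a} = \frac{\tan^{-1} b - \tan^{-1} a}{b - a} = \frac{1}{1 + c^2}, \quad a < c < b. \]
Then, from a < c < b, we have
\[ a^2 < c^2 < b^2 \implies 1 + a^2 < 1 + c^2 < 1 + b^2 \]
\[ \frac{1}{1 + a^2} > \frac{1}{1 + c^2} > \frac{1}{1 + b^2} \implies \frac{1}{1 + b^2} < \frac{1}{1 + c^2} < \frac{1}{1 + a^2} \]
\[ \frac{1}{1 + b^2} < \frac{\tan^{-1} b - \tan^{-1} a}{b - a} < \frac{1}{1 + a^2} \implies \frac{b - a}{1 + b^2} < \tan^{-1} b - \tan^{-1} a < \frac{b - a}{1 + a^2}. \]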
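Example: Use the Mean Value Theorem to prove that if 0 < a < b, then $1 - \frac{a}{b} < \ln\frac{b}{a} < \frac{b}{a} - 1$. Hence show that $\frac{1}{6} < \ln 1.2 < \frac{1}{5}$.
Solution: Let $f(x) = \ln x$, so that $f'(x) = \frac{1}{x}$. By the Mean Value Theorem, there exists c ∈ (a, b) such that
\[ f'(c) = \frac{1}{c} = \frac{\ln b - \ln a}{b - a}. \]
Then, from a < c < b we have
\[ a < c < b \implies \frac{1}{b} < \frac{1}{c} < \frac{1}{a} \]
\[ \frac{1}{b} < \frac{\ln b - \ln a}{b - a} < \frac{1}{a} \implies \frac{b - a}{b} < \ln b - \ln a < \frac{b - a}{a} \]
\[ \implies 1 - \frac{a}{b} < \ln\frac{b}{a} < \frac{b}{a} - 1. \]
Now, $\ln(1.2) = \ln\frac{12}{10} = \ln\frac{6}{5}$. Therefore a = 5 and b = 6. Substituting in $1 - \frac{a}{b} < \ln\frac{b}{a} < \frac{b}{a} - 1$, we have
\[ 1 - \frac{5}{6} < \ln\frac{6}{5} < \frac{6}{5} - 1 \implies \frac{1}{6} < \ln 1.2 < \frac{1}{5}. \]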
Corollary 6.10.1. If f 0 (x) = 0 at all points of the interval (a, b), then f (x) must be a constant in
the interval.
Proof. Let $x_1 < x_2$ be any two different points in (a, b). By the Mean Value Theorem, for some x with $x_1 < x < x_2$,
\[ \frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(x) = 0. \]
Thus f (x1 ) = f (x2 ). Since x1 and x2 are arbitrarily chosen, the function f (x) has the same value
at all points in the interval. Thus, f (x) is constant.
Corollary 6.10.2. If f 0 (x) > 0 at all points of the interval (a, b), then f (x) is strictly increasing.
Proof. Let x1 < x2 be any two different points in (a, b). By the Mean Value Theorem for
x1 < x < x 2 ,
f (x2 ) − f (x1 )
= f 0 (x) > 0.
x2 − x1
Thus f (x2 ) > f (x1 ) for x2 > x1 and so f (x) is strictly increasing.
74
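6.11 Indeterminate Forms
An indeterminate form arises when $\lim_{x\to a} \frac{f(x)}{g(x)}$ tends to $\frac{0}{0}$ or $\frac{\infty}{\infty}$ as $x \to a$. Think of the situation $\lim_{x\to a} \frac{f(x)}{g(x)} \to \frac{0}{0}$ where $f(x)$ and $g(x)$ are differentiable (and therefore continuous, so $f(a) = \lim_{x\to a} f(x) = 0$ and $g(a) = \lim_{x\to a} g(x) = 0$). Then
\[
\begin{aligned}
\lim_{x\to a} \frac{f(x)}{g(x)}
&= \lim_{x\to a} \frac{\dfrac{f(x) - f(a)}{x-a}}{\dfrac{g(x) - g(a)}{x-a}} \quad \text{(provided the denominator is not zero)} \\
&= \frac{\displaystyle\lim_{x\to a} \frac{f(x) - f(a)}{x-a}}{\displaystyle\lim_{x\to a} \frac{g(x) - g(a)}{x-a}} \\
&= \frac{f'(a)}{g'(a)} = \frac{\lim_{x\to a} f'(x)}{\lim_{x\to a} g'(x)} \quad \text{(provided } f'(x) \text{ and } g'(x) \text{ are also continuous).}
\end{aligned}
\]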
Example:
\[
\lim_{x\to 2} \frac{x^2 - 4}{x - 2} = \lim_{x\to 2} \frac{(x^2 - 4)'}{(x - 2)'} = \lim_{x\to 2} \frac{2x}{1} = 2 \cdot 2 = 4.
\]
The form $\infty - \infty$. A given limit that is not immediately $\frac{0}{0}$ or $\frac{\infty}{\infty}$ can be converted to one of these forms by a combination of algebra and a little cleverness.
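Example: Evaluate $\lim_{x\to 0} \left[ \frac{1+3x}{\sin x} - \frac{1}{x} \right]$.
Solution: We note $\frac{1+3x}{\sin x} \to \infty$ and $\frac{1}{x} \to \infty$. However, after writing the difference as a single fraction, we recognize the form $\frac{0}{0}$.
\[
\begin{aligned}
\lim_{x\to 0} \left[ \frac{1+3x}{\sin x} - \frac{1}{x} \right]
&= \lim_{x\to 0} \frac{3x^2 + x - \sin x}{x \sin x} \\
&= \lim_{x\to 0} \frac{6x + 1 - \cos x}{x\cos x + \sin x} \\
&= \lim_{x\to 0} \frac{6 + \sin x}{-x\sin x + 2\cos x} \\
&= \frac{6+0}{0+2} = 3.
\end{aligned}
\]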
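A numerical look at the same limit can make the result more believable. The sketch below evaluates the original difference at points approaching 0 from both sides; the function name `g` is just an illustrative label, and floating-point evaluation is only a rough check, not a proof.

```python
import math

def g(x):
    # The difference whose limit as x -> 0 is sought.
    return (1 + 3 * x) / math.sin(x) - 1 / x

for x in (1e-1, 1e-2, 1e-3, 1e-4, -1e-4):
    print(x, g(x))
```

The printed values should drift toward 3 as $|x|$ shrinks, in agreement with the two applications of L'Hôpital's Rule.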
The form $0 \cdot \infty$. By suitable manipulation, L'Hôpital's Rule can sometimes be applied to the limit form $0 \cdot \infty$.
Example: Evaluate $\lim_{x\to\infty} x \sin\frac{1}{x}$.
and we see
\[
\lim_{x\to a} \ln y = \lim_{x\to a} g(x) \ln f(x),
\]
so that, if this limit equals $L$,
\[
\lim_{x\to a} f(x)^{g(x)} = e^{L}.
\]
Example: Evaluate $\lim_{x\to 0^+} x^{\frac{1}{\ln x}}$.
Solution: The form is $0^0$. Now, if we set $y = x^{\frac{1}{\ln x}}$, then
\[
\ln y = \frac{1}{\ln x} \ln x = 1.
\]
Notice we do not need L'Hôpital's Rule in this case since
\[
\lim_{x\to 0^+} \ln y = 1.
\]
Hence, $\lim_{x\to 0^+} y = e^1$ or equivalently $\lim_{x\to 0^+} x^{\frac{1}{\ln x}} = e$.
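Example: Evaluate $\lim_{x\to 0} (1+x)^{\frac{1}{x}}$.
Solution: The limit is of the form $1^\infty$. If $y = (1+x)^{\frac{1}{x}}$, then
\[
\ln y = \frac{1}{x} \ln(1+x).
\]
Now, $\lim_{x\to 0} \frac{\ln(1+x)}{x}$ has the form $\frac{0}{0}$ and so
\[
\lim_{x\to 0} \frac{\ln(1+x)}{x} = \lim_{x\to 0} \frac{\frac{1}{1+x}}{1} = \lim_{x\to 0} \frac{1}{1+x} = 1.
\]
Thus,
\[
\lim_{x\to 0} (1+x)^{\frac{1}{x}} = e.
\]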
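Example: Evaluate $\lim_{x\to\infty} \left(1 - \frac{3}{x}\right)^{2x}$.
Solution: The limit form is $1^\infty$. If $y = \left(1 - \frac{3}{x}\right)^{2x}$ then $\ln y = 2x \ln\left(1 - \frac{3}{x}\right)$. Observe that the form of $\lim_{x\to\infty} 2x \ln\left(1 - \frac{3}{x}\right)$ is $\infty \cdot 0$, whereas the form of $\lim_{x\to\infty} \frac{2\ln(1 - \frac{3}{x})}{\frac{1}{x}}$ is $\frac{0}{0}$. Therefore,
\[
\lim_{x\to\infty} \frac{2\ln\left(1 - \frac{3}{x}\right)}{\frac{1}{x}}
= \lim_{x\to\infty} \frac{2 \cdot \dfrac{3/x^2}{1 - \frac{3}{x}}}{-\dfrac{1}{x^2}}
= \lim_{x\to\infty} \frac{-6}{1 - \frac{3}{x}} = -6.
\]
Hence $\lim_{x\to\infty} \left(1 - \frac{3}{x}\right)^{2x} = e^{-6}$.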
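The two $1^\infty$ limits above can be probed numerically. This is only a sketch with illustrative function names: since $\ln y \to 1$ in the first example and $\ln y \to -6$ in the second, the powers themselves should approach $e$ and $e^{-6}$ respectively.

```python
import math

def one_plus_x(x):
    # (1 + x)^(1/x), examined as x -> 0
    return (1 + x) ** (1 / x)

def one_minus_3_over_x(x):
    # (1 - 3/x)^(2x), examined as x -> infinity
    return (1 - 3 / x) ** (2 * x)

for k in (2, 4, 6):
    small, large = 10.0 ** -k, 10.0 ** k
    print(k, one_plus_x(small), one_minus_3_over_x(large))

print(math.e, math.exp(-6))  # reference values
```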
6.12 Tutorial 4
1. Let $f(x) = \frac{3+x}{3-x}$, $x \ne 3$. Evaluate $f'(2)$ from the definition.
2. Show from the definition that $\frac{d}{dx}\left(\sqrt{x-1}\right) = \frac{1}{2\sqrt{x-1}}$.
3. Determine whether each of the following functions is differentiable at $x = 0$.
(i) $f(x) = \begin{cases} x^3 \sin\frac{1}{x}, & x \ne 0 \\ 0, & x = 0 \end{cases}$ (ii) $f(x) = x|x|$ (iii) $f(x) = \begin{cases} \frac{x - |x|}{x}, & x \ne 0 \\ 0, & x = 0. \end{cases}$
4. Show that $f(x) = \cos x$ is differentiable at any $x \in \mathbb{R}$ and that $f'(x) = -\sin x$.
5. Find $\frac{dy}{dx}$ for each of the following functions
(i) $y = \sin^3(6x)$ (ii) $y = (ax+b)^m (cx+d)^n$ (iii) $y = \cos(x^2 - 3x + 1)$
8. Find $\frac{dy}{dx}$ if $xy^2 + 4y^3 + 3x = 0$ at $(1, -1)$.
9. Find $\frac{d^2y}{dx^2}$.
(i) $4y^3 = 6x^2 + 1$ (ii) $xy^4 = 5$ (iii) $x = 2 + t$, $y = 1 + t^2$
13. Evaluate each of the following limits using L'Hôpital's Rule.
(i) $\lim_{x\to 0} \frac{\sin x}{x}$ (ii) $\lim_{x\to 0} \frac{\sin x}{e^x - e^{-x}}$ (iii) $\lim_{x\to\infty} \frac{\ln x}{e^x}$ (iv) $\lim_{x\to\infty} \frac{6x^2 - 5x + 7}{4x^2 - 2x}$
(ix) $\lim_{x\to 0} \left[\frac{1+3x}{\sin x} - \frac{1}{x}\right]$ (xiii) $\lim_{x\to\infty} \frac{3x}{3x+1}$ (xiv) $\lim_{x\to 0^+} x^x$ (xv) $\lim_{x\to\infty} x\left(e^{\frac{1}{x}} - 1\right)$ (xvi) $\lim_{x\to 0} \left[\frac{1}{x} - \frac{1}{\sin x}\right]$.
14. Let $f(x) = \begin{cases} e^{-\frac{1}{x^2}}, & x \ne 0 \\ 0, & x = 0. \end{cases}$ Prove that $f'(0) = 0$.
15. Explain why Rolle's theorem is not applicable for the function $f(x) = |x|$ on the interval $[-1, 1]$.
18. Find the value of $c$ in Rolle's Theorem when $f(x) = (x-a)^m (x-b)^n$ where $m$ and $n$ are positive integers.
19. If $f'(x) \le 0$ at all points of $(a, b)$, prove that $f(x)$ is monotonic decreasing in $(a, b)$. Under what conditions is $f(x)$ strictly decreasing in $(a, b)$?
20. Use the Mean Value Theorem to show that $\sin x \le x$ and $\tan x \ge x$. Hence show that $\frac{\sin x}{x}$ is strictly decreasing on $(0, \frac{\pi}{2})$.
21. Use the Mean Value Theorem to prove that, if $0 < a < b$, then $1 - \frac{a}{b} < \ln\frac{b}{a} < \frac{b}{a} - 1$. Hence show that $\frac{1}{6} < \ln(1.2) < \frac{1}{5}$.
22. Use the Mean Value Theorem to prove the following inequalities
Chapter 7
Integration
7.1 Anti-derivatives
There is always more than one anti-derivative of a function. For instance, in the foregoing example, $F_1(x) = x^2 - 1$ and $F_2(x) = x^2 + 10$ are also anti-derivatives of $f(x) = 2x$ since $F_1'(x) = F_2'(x) = f(x)$. Indeed, if $F$ is an anti-derivative of a function $f$, then so is $G(x) = F(x) + C$, for any constant $C$. This is a consequence of the fact that
\[
G'(x) = \frac{d}{dx}\left(F(x) + C\right) = F'(x) + 0 = F'(x) = f(x).
\]
Thus, $F(x) + C$ stands for a set of functions of which each member has a derivative equal to $f(x)$.
Theorem 7.1.1. If $G'(x) = F'(x)$ for all $x$ in some interval $[a, b]$, then
\[
G(x) = F(x) + C
\]
for all $x$ in the interval.
7.2 Indefinite Integral
For convenience let us introduce a notation for an anti-derivative of a function. If $F'(x) = f(x)$, we shall represent the most general anti-derivative of $f$ by
\[
\int f(x)\,dx = F(x) + C.
\]
The symbol $\int$ is called an integral sign, and the notation $\int f(x)\,dx$ is called the indefinite integral of $f(x)$ with respect to $x$. The function $f(x)$ is called the integrand. The process of finding an anti-derivative is called anti-differentiation or integration. The number $C$ is called a constant of integration. Just as $\frac{d}{dx}(\ )$ denotes differentiation with respect to $x$, the symbol $\int (\ )\,dx$ denotes integration with respect to $x$.
When differentiating the power $x^n$, we multiply by the exponent $n$ and decrease the exponent by 1. To find an anti-derivative of $x^n$, the reverse of the differentiation rule would be: increase the exponent by 1 and divide by the new exponent $n + 1$. For $n \ne -1$,
\[
\int x^n\,dx = \frac{x^{n+1}}{n+1} + C.
\]
Example: Evaluate (a) $\int x^6\,dx$ (b) $\int \frac{1}{x^5}\,dx$.
Solution:
(a) $\int x^6\,dx = \frac{x^7}{7} + C$.
(b) By writing $\frac{1}{x^5}$ as $x^{-5}$, we have $\int x^{-5}\,dx = \frac{x^{-4}}{-4} + C = -\frac{1}{4x^4} + C$.
Example: Evaluate $\int \sqrt{x}\,dx$.
Solution: We first write $\int \sqrt{x}\,dx = \int x^{\frac{1}{2}}\,dx$ and therefore
\[
\int x^{\frac{1}{2}}\,dx = \frac{x^{\frac{3}{2}}}{\frac{3}{2}} + C = \frac{2}{3} x^{\frac{3}{2}} + C.
\]
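One way to check an anti-derivative is to differentiate it and compare with the integrand. The sketch below does this numerically for the three power-rule examples above, using a central-difference approximation; the helper `derivative` and the sample points are assumptions made for illustration.

```python
def derivative(F, x, h=1e-6):
    # Central-difference approximation of F'(x).
    return (F(x + h) - F(x - h)) / (2 * h)

# (anti-derivative, integrand) pairs taken from the worked examples
pairs = [
    (lambda x: x**7 / 7,         lambda x: x**6),
    (lambda x: -1 / (4 * x**4),  lambda x: x**-5),
    (lambda x: (2 / 3) * x**1.5, lambda x: x**0.5),
]

max_error = max(
    abs(derivative(F, x) - f(x))
    for F, f in pairs
    for x in (0.5, 1.0, 2.0)
)
print(max_error)
```

A very small `max_error` indicates that each claimed anti-derivative really does differentiate back to its integrand at the sampled points.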
The following property of indefinite integrals is an immediate consequence of the fact that the derivative of a sum is the sum of derivatives.
Theorem 7.3.1. If $F'(x) = f(x)$ and $G'(x) = g(x)$, then
\[
\int [f(x) \pm g(x)]\,dx = \int f(x)\,dx \pm \int g(x)\,dx = F(x) \pm G(x) + C.
\]
Example: Evaluate $\int \left(x^{-\frac{1}{2}} + x^4\right) dx$.
The anti-derivative, or indefinite integral, of any finite sum can be obtained by integrating each term.
Example: Evaluate $\int \left(4x - 2x^{-\frac{1}{3}} + \frac{5}{x^2}\right) dx$.
2. $\int [f(x) \pm g(x)]\,dx = \int f(x)\,dx \pm \int g(x)\,dx$.
3. $\int x^n\,dx = \frac{1}{n+1} x^{n+1} + C$ for $n \ne -1$.
4. $\int a\,dx = ax + C$.
5. $\int \cos x\,dx = \sin x + C$.
6. $\int \sin x\,dx = -\cos x + C$.
7. $\int \sec^2 x\,dx = \tan x + C$.
8. $\int e^x\,dx = e^x + C$.
9. $\int \frac{1}{x}\,dx = \ln|x| + C$.
10. $\int \frac{1}{\sqrt{1-x^2}}\,dx = \sin^{-1} x + C$.
11. $\int \frac{1}{1+x^2}\,dx = \tan^{-1} x + C$.
Example: Evaluate $\int \left(\frac{5}{x^2} - 2\sqrt[3]{x}\right) dx$.
7.5 u−Substitution
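Example: Evaluate $\int \frac{x}{(4x^2+3)^6}\,dx$.
Taking $u = 4x^2 + 3$, so that $du = 8x\,dx$,
\[
\begin{aligned}
\int (4x^2+3)^{-6}\, x\,dx &= \frac{1}{8} \int \underbrace{(4x^2+3)^{-6}}_{u^{-6}}\ \underbrace{8x\,dx}_{du} \\
&= \frac{1}{8} \int u^{-6}\,du \\
&= \frac{1}{8} \cdot \frac{u^{-5}}{-5} + C \\
&= -\frac{1}{40} (4x^2+3)^{-5} + C.
\end{aligned}
\]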
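The result of the substitution can be verified by differentiating it numerically and comparing with the original integrand. This is a minimal sketch; `F`, `integrand`, and `derivative` are illustrative names and the sample points are arbitrary.

```python
def F(x):
    # Anti-derivative obtained with u = 4x^2 + 3.
    return -((4 * x**2 + 3) ** -5) / 40

def integrand(x):
    return x / (4 * x**2 + 3) ** 6

def derivative(G, x, h=1e-5):
    # Central-difference approximation of G'(x).
    return (G(x + h) - G(x - h)) / (2 * h)

errors = [abs(derivative(F, x) - integrand(x)) for x in (-1.0, 0.0, 0.5, 2.0)]
print(max(errors))
```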
Example: Evaluate $\int x(x^2+2)^3\,dx$.
You Try It: Evaluate $\int \sqrt[3]{(7 - 2x^3)^4}\; x^2\,dx$.
Example: Evaluate $\int \sin 10x\,dx$.
You Try It: Evaluate $\int \sec^2(1 - 4x)\,dx$.
As the derivative is motivated by the geometric problem of constructing a tangent to a curve, the
historical problem leading to the definition of a definite integral is the problem of finding area.
Specifically, we are interested in finding the area A of a region bounded between the x−axis, the
graph of a non-negative function y = f (x) defined on some interval [a, b] and
7.6.1 The Definite Integral
The geometric problems that motivated the development of the integral calculus (determination
of lengths, areas, and volumes) arose in the ancient civilizations of Northern Africa. Where so-
lutions were found, they related to concrete problems such as the measurement of a quantity of
grain. Greek philosophers took a more abstract approach. In fact, Eudoxus (around 400 B.C.)
and Archimedes (250 B.C.) formulated ideas of integration as we know it today. Integral calculus
developed independently, and without an obvious connection to differential calculus. The calculus
became a “whole” in the last part of the seventeenth century when Isaac Barrow, Isaac Newton,
and Gottfried Wilhelm Leibniz (with help from others) discovered that the integral of a function
could be found by asking what was differentiated to obtain that function.
Definition 7.6.1. Let $f$ be a function defined on a closed interval $[a, b]$. Then the definite integral of $f$ from $a$ to $b$, denoted by $\int_a^b f(x)\,dx$, is defined to be
\[
\int_a^b f(x)\,dx = \lim_{n\to\infty} \sum_{i=1}^{n} f(x_i)\,\Delta x.
\]
The numbers $a$ and $b$ are called the lower and upper limits of integration, respectively. If the limit exists, the function $f$ is said to be integrable on the interval.
The following two definitions prove to be useful when working with definite integrals.
Theorem 7.6.1. If $f(a)$ exists, then
\[
\int_a^a f(x)\,dx = 0.
\]
Example: By definition
\[
\int_1^1 (x^3 + 3x)\,dx = 0.
\]
(i) $\int_a^b k f(x)\,dx = k \int_a^b f(x)\,dx$ where $k$ is any constant.
(ii) $\int_a^b [f(x) \pm g(x)]\,dx = \int_a^b f(x)\,dx \pm \int_a^b g(x)\,dx$.
(iii) $\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$, where $c$ is any number in $[a, b]$.
The independent variable $x$ in a definite integral is called a dummy variable of integration. The value of the integral does not depend on the symbol used. In other words,
\[
\int_a^b f(x)\,dx = \int_a^b f(r)\,dr = \int_a^b f(s)\,ds = \int_a^b f(t)\,dt
\]
and so on.
Theorem 7.6.4. For any constant $k$,
\[
\int_a^b k\,dx = k \int_a^b dx = k(b - a).
\]
Theorem 7.6.5. Let $f$ be integrable on $[a, b]$ and $f(x) \ge 0$ for all $x$ in $[a, b]$, then
\[
\int_a^b f(x)\,dx \ge 0.
\]
In this theorem we shall see that the concept of an anti-derivative of a continuous function provides the bridge between the differential calculus and the integral calculus.
Theorem 7.7.1. Let $f$ be continuous on $[a, b]$ and let $F$ be any function for which $F'(x) = f(x)$. Then
\[
\int_a^b f(x)\,dx = F(b) - F(a).
\]
Example: Evaluate $\int_1^3 x\,dx$.
Solution: An anti-derivative of $f(x) = x$ is $F(x) = \frac{x^2}{2}$. Consequently,
\[
\int_1^3 x\,dx = \left[\frac{x^2}{2}\right]_1^3 = \frac{9}{2} - \frac{1}{2} = 4.
\]
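The limit-of-sums definition of the definite integral and the anti-derivative evaluation above should give the same number. The sketch below approximates $\int_1^3 x\,dx$ by Riemann sums; it assumes equal subintervals of width $\Delta x = (b-a)/n$ and right endpoints for the sample points $x_i$, a choice the definition in the text leaves open.

```python
def riemann_sum(f, a, b, n):
    # Right-endpoint sum with n equal subintervals of width dx = (b - a) / n.
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(1, n + 1)) * dx

for n in (10, 100, 1000, 10000):
    print(n, riemann_sum(lambda x: x, 1, 3, n))
```

As $n$ grows, the sums should approach 4, the value $F(3) - F(1)$ obtained from the anti-derivative.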
Integration by parts is useful for integrands involving products of algebraic and exponential or logarithmic functions, such as $\int x^2 e^x\,dx$ and $\int x \ln x\,dx$. This is the inverse operation of differentiating a product. If $u$ and $v$ are functions of $x$, then
\[
\frac{d}{dx}(uv) = u\frac{dv}{dx} + v\frac{du}{dx}.
\]
Integrate both sides; if $\frac{du}{dx}$ and $\frac{dv}{dx}$ are continuous, then
\[
uv = \int u\frac{dv}{dx}\,dx + \int v\frac{du}{dx}\,dx
\]
\[
\int u\frac{dv}{dx}\,dx = uv - \int v\frac{du}{dx}\,dx
\]
\[
\int u\,dv = uv - \int v\,du.
\]
Example: Evaluate $\int x e^x\,dx$.
Solution: Let $u = x$, $du = dx$ and $dv = e^x\,dx \Rightarrow v = \int dv = \int e^x\,dx = e^x$. Then,
\[
\int x e^x\,dx = x e^x - e^x + C.
\]
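Example: Evaluate $\int x^2 \ln x\,dx$.
Solution: Choose
\[
dv = x^2\,dx \Rightarrow v = \int dv = \int x^2\,dx = \frac{x^3}{3}
\]
and $u = \ln x \Rightarrow du = \frac{1}{x}\,dx$. Therefore,
\[
\begin{aligned}
\int x^2 \ln x\,dx &= \frac{x^3}{3}\ln x - \int \frac{x^3}{3}\cdot\frac{1}{x}\,dx \\
&= \frac{x^3}{3}\ln x - \frac{1}{3}\int x^2\,dx \\
&= \frac{x^3}{3}\ln x - \frac{x^3}{9} + C.
\end{aligned}
\]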
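Example: Evaluate $\int \ln x\,dx$.
Solution: Choose $dv = dx \Rightarrow v = \int dv = \int dx = x$ and $u = \ln x \Rightarrow du = \frac{1}{x}\,dx$, therefore
\[
\begin{aligned}
\int \ln x\,dx &= x\ln x - \int x \cdot \frac{1}{x}\,dx \\
&= x\ln x - \int dx \\
&= x\ln x - x + C.
\end{aligned}
\]
You Try It: Evaluate $\int x \tan^{-1} x\,dx$.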
Solution: Notice that the derivative of $x^2$ becomes simpler, whereas the derivative of $e^x$ does not. So you should let $u = x^2$ and $dv = e^x\,dx$. So
\[
dv = e^x\,dx \Rightarrow v = \int dv = \int e^x\,dx = e^x
\]
\[
u = x^2 \Rightarrow du = 2x\,dx.
\]
Integrating by parts one time we get,
\[
\int x^2 e^x\,dx = x^2 e^x - \int 2x e^x\,dx.
\]
Then, integrating by parts a second time,
\[
\begin{aligned}
\int x^2 e^x\,dx &= x^2 e^x - \int 2x e^x\,dx \\
&= x^2 e^x - \left(2x e^x - \int 2e^x\,dx\right) \\
&= x^2 e^x - 2x e^x + 2e^x + C \\
&= e^x (x^2 - 2x + 2) + C.
\end{aligned}
\]
Example: Evaluate $\int_1^e \ln x\,dx$.
Solution:
\[
\begin{aligned}
\int_1^e \ln x\,dx &= \Big[x\ln x - x\Big]_1^e \\
&= (e\ln e - e) - (1\cdot\ln 1 - 1) \\
&= (e - e) - (0 - 1) \\
&= 1.
\end{aligned}
\]
These are formulas in which a given integral is expressed in terms of similar integrals of simpler form.
Example: Let $n$ be a positive integer. Use integration by parts to derive the reduction formula
\[
\int x^n e^x\,dx = x^n e^x - n\int x^{n-1} e^x\,dx + C.
\]
Solution: Let $u = x^n$, $dv = e^x\,dx$. Then $du = n x^{n-1}\,dx$, $v = \int dv = \int e^x\,dx = e^x$. So
\[
\begin{aligned}
\int x^n e^x\,dx &= x^n e^x - \int e^x (n x^{n-1})\,dx \\
&= x^n e^x - n\int x^{n-1} e^x\,dx + C.
\end{aligned}
\]
To illustrate the use of the reduction formula we calculate $\int x^n e^x\,dx$ for $n = 1, 2$.
\[
n = 1:\quad \int x e^x\,dx = x e^x - \int e^x\,dx = x e^x - e^x + C.
\]
\[
n = 2:\quad \int x^2 e^x\,dx = x^2 e^x - 2\int x e^x\,dx = x^2 e^x - 2x e^x + 2e^x + C.
\]
Example: Evaluate $\int \sin^n x\,dx$.
Solution: Rewrite as $\int \sin^n x\,dx = \int \sin^{n-1} x \sin x\,dx$.
Then $u = \sin^{n-1} x \Rightarrow du = (n-1)\sin^{n-2} x \cos x\,dx$ and $dv = \sin x\,dx \Rightarrow v = \int dv = \int \sin x\,dx = -\cos x$. Then,
\[
\begin{aligned}
\int \sin^n x\,dx &= -\sin^{n-1} x \cos x + (n-1)\int \sin^{n-2} x \cos x \cos x\,dx \\
&= -\sin^{n-1} x \cos x + (n-1)\int \sin^{n-2} x \cos^2 x\,dx \\
&= -\sin^{n-1} x \cos x + (n-1)\int \sin^{n-2} x (1 - \sin^2 x)\,dx \\
&= -\sin^{n-1} x \cos x + (n-1)\int \sin^{n-2} x\,dx - (n-1)\int \sin^n x\,dx
\end{aligned}
\]
\[
\int \sin^n x\,dx + (n-1)\int \sin^n x\,dx = -\sin^{n-1} x \cos x + (n-1)\int \sin^{n-2} x\,dx
\]
\[
n\int \sin^n x\,dx = -\sin^{n-1} x \cos x + (n-1)\int \sin^{n-2} x\,dx
\]
\[
\int \sin^n x\,dx = -\frac{\sin^{n-1} x \cos x}{n} + \frac{n-1}{n}\int \sin^{n-2} x\,dx.
\]
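A reduction formula translates naturally into a recursive function. The sketch below builds an anti-derivative of $\sin^n x$ from the formula just derived, with base cases $n = 0$ and $n = 1$, and compares the resulting definite integral over $[0, \pi/2]$ with an independent Simpson-rule estimate. The function names and the choice of Simpson's rule as a check are assumptions for illustration.

```python
import math

def sin_power_antiderivative(n, x):
    """Anti-derivative of sin^n(x) from the reduction formula (constant omitted)."""
    if n == 0:
        return x
    if n == 1:
        return -math.cos(x)
    return (-(math.sin(x) ** (n - 1)) * math.cos(x) / n
            + (n - 1) / n * sin_power_antiderivative(n - 2, x))

def simpson(f, a, b, steps=1000):
    # Composite Simpson rule (steps must be even); used only as a numerical check.
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

n, a, b = 5, 0.0, math.pi / 2
by_formula = sin_power_antiderivative(n, b) - sin_power_antiderivative(n, a)
by_quadrature = simpson(lambda x: math.sin(x) ** n, a, b)
print(by_formula, by_quadrature)
```

For $n = 5$ both numbers should be close to $8/15$, the value the recursion produces by hand.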
This technique involves the decomposition of a rational function into the sum of two or more simpler rational functions. We will consider rational functions (quotients of polynomials) in which the numerator has a lower degree than the denominator. If this condition is not met, we first carry out long division, dividing the numerator by the denominator, until we reduce the problem to an equivalent one involving a fraction in which the numerator has a lower degree than the denominator.
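Example: $\frac{x+7}{x^2 - x - 6} = \frac{2}{x-3} - \frac{1}{x+2}$, then
\[
\begin{aligned}
\int \frac{x+7}{x^2 - x - 6}\,dx &= \int \left(\frac{2}{x-3} - \frac{1}{x+2}\right) dx \\
&= 2\int \frac{1}{x-3}\,dx - \int \frac{1}{x+2}\,dx \\
&= 2\ln|x-3| - \ln|x+2| + C.
\end{aligned}
\]
You Try It: Evaluate $\int \frac{2x+1}{(x-1)(x+3)}\,dx$.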
Example: Find $\int \frac{5x^2 + 20x + 6}{x^3 + 2x^2 + x}\,dx$.
Solution: Factorise the denominator as $x(x+1)^2$. Then write the partial fraction decomposition as
\[
\frac{5x^2 + 20x + 6}{x(x+1)^2} = \frac{A}{x} + \frac{B}{x+1} + \frac{C}{(x+1)^2}.
\]
Substituting we get $A = 6$, $B = -1$, $C = 9$, then
\[
\begin{aligned}
\int \frac{5x^2 + 20x + 6}{x^3 + 2x^2 + x}\,dx &= \int \frac{6}{x}\,dx - \int \frac{1}{x+1}\,dx + \int \frac{9}{(x+1)^2}\,dx \\
&= 6\ln|x| - \ln|x+1| + \frac{9(x+1)^{-1}}{-1} + C \\
&= \ln\left|\frac{x^6}{x+1}\right| - \frac{9}{x+1} + C.
\end{aligned}
\]
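The coefficients in a partial fraction decomposition can be checked by comparing both sides at a few sample points. The sketch below does this with exact rational arithmetic, so no rounding is involved; the sample points are arbitrary (any values other than $0$ and $-1$ would do) and the function names are illustrative.

```python
from fractions import Fraction

def lhs(x):
    # Original rational function.
    return (5 * x**2 + 20 * x + 6) / (x**3 + 2 * x**2 + x)

def rhs(x, A=6, B=-1, C=9):
    # Decomposition with the coefficients found in the example.
    return A / x + B / (x + 1) + C / (x + 1) ** 2

samples = [Fraction(n, 3) for n in (1, 2, 4, 5, 7, -5, -7)]
identity_holds = all(lhs(x) == rhs(x) for x in samples)
print(identity_holds)
```

Because both sides are rational functions with the same denominator, agreement at this many points confirms the decomposition.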
You Try It: Evaluate $\int \frac{6x-1}{x^3(2x-1)}\,dx$.
Example: Find $\int \frac{x^5 + x - 1}{x^4 - x^3}\,dx$.
Solution: This rational function is improper: its numerator has a degree greater than that of its denominator. Carrying out long division, we have
\[
\frac{x^5 + x - 1}{x^4 - x^3} = x + 1 + \frac{x^3 + x - 1}{x^4 - x^3}.
\]
Now, applying partial fraction decomposition produces
\[
\frac{x^3 + x - 1}{x^3(x-1)} = \frac{A}{x} + \frac{B}{x^2} + \frac{C}{x^3} + \frac{D}{x-1}.
\]
We see that $A = 0$, $B = 0$, $C = 1$ and $D = 1$. So now we can integrate,
\[
\begin{aligned}
\int \frac{x^5 + x - 1}{x^4 - x^3}\,dx &= \int \left(x + 1 + \frac{x^3 + x - 1}{x^4 - x^3}\right) dx \\
&= \int \left(x + 1 + \frac{1}{x^3} + \frac{1}{x-1}\right) dx \\
&= \frac{x^2}{2} + x - \frac{1}{2x^2} + \ln|x-1| + C.
\end{aligned}
\]
You Try It: Evaluate $\int \frac{x^3 - 2x}{x^2 + 3x + 2}\,dx$.
Example: Find $\int \frac{dx}{x^2 - 4x + 5}$.
Solution: Here we cannot find real factors, but we can complete the square,
\[
x^2 - 4x + 5 = (x-2)^2 + 1,
\]
therefore
\[
\int \frac{dx}{x^2 - 4x + 5} = \int \frac{dx}{(x-2)^2 + 1} = \tan^{-1}(x-2) + C.
\]
Example: Find $\int \frac{(x+3)\,dx}{x^2 - 4x + 5}$.
Solution: Completing the square, $\frac{x+3}{x^2 - 4x + 5} = \frac{x+3}{(x-2)^2 + 1}$. Now since $x + 3 = x - 2 + 5$, we have
\[
\begin{aligned}
\int \frac{(x+3)\,dx}{x^2 - 4x + 5} &= \int \frac{(x - 2 + 5)\,dx}{(x-2)^2 + 1} \\
&= \int \frac{(x-2)\,dx}{(x-2)^2 + 1} + \int \frac{5\,dx}{(x-2)^2 + 1} \\
&= \frac{1}{2}\ln\left((x-2)^2 + 1\right) + 5\tan^{-1}(x-2) + C.
\end{aligned}
\]
Example: Find $\int \frac{(x+1)\,dx}{x(x^2+1)}$.
Solution: Decompose into partial fractions,
\[
\frac{x+1}{x(x^2+1)} = \frac{A}{x} + \frac{B + Cx}{x^2+1}.
\]
You Try It: Evaluate $\int \frac{4x}{(x^2+1)(x^2+2x+3)}\,dx$.
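Integrals of rational expressions that involve $\sin x$ and $\cos x$ can be reduced to integrals of quotients of polynomials by means of the substitution
\[
t = \tan\frac{x}{2}.
\]
It then follows that $\cos x = \frac{1 - t^2}{1 + t^2}$, $\sin x = \frac{2t}{1 + t^2}$ and $\frac{dx}{dt} = \frac{2}{1 + t^2}$.
Example: Evaluate $\int \frac{dx}{2 + 2\sin x + \cos x}$.
Example: Evaluate $\int \frac{dx}{5 + 3\cos x}$.
Solution: The integral becomes
\[
\begin{aligned}
\int \frac{dx}{5 + 3\cos x} &= \int \frac{\dfrac{2\,dt}{1+t^2}}{5 + 3\,\dfrac{1 - t^2}{1 + t^2}} \\
&= \int \frac{\dfrac{2\,dt}{1+t^2}}{\dfrac{5(1+t^2) + 3(1-t^2)}{1+t^2}} = \int \frac{2\,dt}{8 + 2t^2} \\
&= \int \frac{dt}{t^2 + 4} = \frac{1}{2}\tan^{-1}\frac{t}{2} + C \\
&= \frac{1}{2}\tan^{-1}\left(\frac{1}{2}\tan\frac{x}{2}\right) + C.
\end{aligned}
\]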
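The anti-derivative produced by the $t = \tan\frac{x}{2}$ substitution can be checked by numerical differentiation. The sketch below assumes $-\pi < x < \pi$, where $\tan\frac{x}{2}$ is defined and continuous; the helper names are illustrative.

```python
import math

def F(x):
    # Anti-derivative found with t = tan(x/2); assumed valid for -pi < x < pi.
    return 0.5 * math.atan(0.5 * math.tan(x / 2))

def integrand(x):
    return 1 / (5 + 3 * math.cos(x))

def derivative(G, x, h=1e-6):
    # Central-difference approximation of G'(x).
    return (G(x + h) - G(x - h)) / (2 * h)

errors = [abs(derivative(F, x) - integrand(x)) for x in (-2.0, -0.5, 0.0, 1.0, 2.5)]
print(max(errors))
```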
You Try It: Evaluate $\int \frac{dx}{5 + 4\cos x}$.
With the aid of trigonometric identities, it is possible to evaluate integrals of the type
\[
\int \sin^m x \cos^n x\,dx.
\]
Case I: $m$ or $n$ is an odd positive integer. Let us first assume that $m$ is an odd positive integer. By writing
\[
\sin^m x = \sin^{m-1} x \sin x,
\]
where $m - 1$ is even, and using $\sin^2 x = 1 - \cos^2 x$, the integrand can be expressed as a sum of powers of $\cos x$ times $\sin x$.
Example: Evaluate $\int \sin^3 x\,dx$.
Solution:
\[
\begin{aligned}
\int \sin^3 x\,dx &= \int \sin^2 x \sin x\,dx \\
&= \int (1 - \cos^2 x)\sin x\,dx \\
&= \int \sin x\,dx + \int \cos^2 x(-\sin x)\,dx \\
&= -\cos x + \frac{1}{3}\cos^3 x + C.
\end{aligned}
\]
Example: Evaluate $\int \sin^5 x \cos^2 x\,dx$.
Solution:
\[
\begin{aligned}
\int \sin^5 x \cos^2 x\,dx &= \int \cos^2 x \sin^4 x \sin x\,dx \\
&= \int \cos^2 x (\sin^2 x)^2 \sin x\,dx \\
&= \int \cos^2 x (1 - \cos^2 x)^2 \sin x\,dx \\
&= \int \cos^2 x (1 - 2\cos^2 x + \cos^4 x)\sin x\,dx \\
&= -\int \cos^2 x(-\sin x)\,dx + 2\int \cos^4 x(-\sin x)\,dx - \int \cos^6 x(-\sin x)\,dx \\
&= -\frac{1}{3}\cos^3 x + \frac{2}{5}\cos^5 x - \frac{1}{7}\cos^7 x + C.
\end{aligned}
\]
If $n$ is an odd positive integer, the procedure for evaluation is the same except that we seek an integrand that is the sum of powers of $\sin x$ times $\cos x$.
Example: Evaluate $\int \sin^4 x \cos^3 x\,dx$.
Solution:
\[
\begin{aligned}
\int \sin^4 x \cos^3 x\,dx &= \int \sin^4 x \cos^2 x \cos x\,dx \\
&= \int \sin^4 x (1 - \sin^2 x)\cos x\,dx \\
&= \int \sin^4 x (\cos x)\,dx - \int \sin^6 x (\cos x)\,dx \\
&= \frac{1}{5}\sin^5 x - \frac{1}{7}\sin^7 x + C.
\end{aligned}
\]
You Try It: Evaluate $\int \sin^2 x \cos^3 x\,dx$.
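Case II: $m$ and $n$ are both even non-negative integers. When both $m$ and $n$ are even non-negative integers, the evaluation of the integral relies heavily on the identities
\[
\sin x \cos x = \frac{1}{2}\sin 2x, \quad \sin^2 x = \frac{1 - \cos 2x}{2}, \quad \cos^2 x = \frac{1 + \cos 2x}{2}.
\]
Example: Evaluate $\int \cos^4 x\,dx$.
Solution:
\[
\begin{aligned}
\int \cos^4 x\,dx &= \int (\cos^2 x)^2\,dx \\
&= \int \left(\frac{1 + \cos 2x}{2}\right)^2 dx \\
&= \frac{1}{4}\int (1 + 2\cos 2x + \cos^2 2x)\,dx \\
&= \frac{1}{4}\int \left(1 + 2\cos 2x + \frac{1 + \cos 4x}{2}\right) dx \\
&= \frac{1}{4}\int \left(\frac{3}{2} + 2\cos 2x + \frac{1}{2}\cos 4x\right) dx \\
&= \frac{3}{8}x + \frac{1}{4}\sin 2x + \frac{1}{32}\sin 4x + C.
\end{aligned}
\]
You Try It: Evaluate $\int \sin^2 x \cos^2 x\,dx$.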
More generally, find $\int_0^{2\pi} \sin px \cos qx\,dx$. Use the identity
\[
\sin px \cos qx = \frac{1}{2}\sin(p+q)x + \frac{1}{2}\sin(p-q)x.
\]
Thus $\sin 8x \cos 6x = \frac{1}{2}\sin 14x + \frac{1}{2}\sin 2x$. Separated like this, the sines are easy to integrate,
\[
\int_0^{2\pi} \sin 8x \cos 6x\,dx = \left[-\frac{1}{2}\frac{\cos 14x}{14} - \frac{\cos 2x}{4}\right]_0^{2\pi} = 0.
\]
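Both the product-to-sum identity and the value of the definite integral can be checked numerically. The sketch below uses a composite midpoint rule, which is an assumed, deliberately simple quadrature chosen for this check; the names `integrate`, `product`, and `identity_gap` are illustrative.

```python
import math

def integrate(f, a, b, steps=20000):
    # Composite midpoint rule; a simple quadrature used only as a check.
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

p, q = 8, 6
product = integrate(lambda x: math.sin(p * x) * math.cos(q * x), 0, 2 * math.pi)

# Largest difference between the two sides of the identity on a grid of points.
identity_gap = max(
    abs(math.sin(p * x) * math.cos(q * x)
        - 0.5 * (math.sin((p + q) * x) + math.sin((p - q) * x)))
    for x in (0.1 * k for k in range(63))
)
print(product, identity_gap)
```

Both printed numbers should be zero up to rounding error.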
With two sines or two cosines, the addition formulas give these identities,
\[
\sin px \sin qx = -\frac{1}{2}\cos(p+q)x + \frac{1}{2}\cos(p-q)x.
\]
\[
\cos px \cos qx = \frac{1}{2}\cos(p+q)x + \frac{1}{2}\cos(p-q)x.
\]
You Try It: Evaluate $\int \sin 2x \sin 4x\,dx$.
7.15 Tutorial 5
1. Evaluate the following.
(i) $\int (x+2)\sin(x^2 + 4x - 6)\,dx$ (ii) $\int x^2 e^{\sin x^3} \cos x^3\,dx$ (iii) $\int \frac{\tan^{-1} x}{1 + x^2}\,dx$
(iv) $\int \frac{dx}{x(\ln x)^4}$ (v) $\int x^2 \sqrt{2x^3 - 4}\,dx$
2. Integrate (i) $\frac{1}{(x^2+1)(x+1)}$ (ii) $\frac{x+5}{(2x-1)(x+3)}$ (iii) $\frac{2x-5}{(x^2+4)(x+6)}$.
3. Evaluate each of the following integrals by using an appropriate substitution.
(i) $\int \frac{x^5\,dx}{\sqrt{x^3+1}}$ (ii) $\int x^7 \sqrt{x^4+1}\,dx$ (iii) $\int \sqrt{\frac{1+x}{1-x}}\,dx$
4. Evaluate each of the following integrals.
(i) $\int \sin^3 x\,dx$ (ii) $\int \sin x \cos x\,dx$ (iii) $\int \sin^3 x \cos^3 x\,dx$ (iv) $\int \sin 6x \cos 3x\,dx$
(e) Let $I_n = \int \sec^n x\,dx$, $n = 2, 3, \ldots$. Show that $I_n = \frac{\sec^{n-2} x \tan x}{n-1} + \frac{n-2}{n-1} I_{n-2}$.
(f) Let $I_n = \int (\ln x)^n\,dx$, $n = 1, 2, \ldots$. Show that $I_n = x(\ln x)^n - n I_{n-1}$.
(h) Let $I_n = \int (1 + ax^2)^n\,dx$. Show that $(2n+1) I_n = 2n I_{n-1} + x(1 + ax^2)^n$.
Chapter 8
So far we have dealt only with functions of single (independent) variables. Many familiar quantities, however, are functions of two or more variables. For instance, the work done by a force ($W = FD$) and the volume of a right circular cylinder ($V = \pi r^2 h$) are both functions of two variables. The volume of a rectangular solid ($V = lwh$) is a function of three variables. The notation for a function of two or three variables is as follows
\[
z = f(x, y) = x^2 + xy \quad \text{(2 variables)}
\]
and
\[
w = f(x, y, z) = x + 2y - 3z \quad \text{(3 variables)}.
\]
Let D be a set of ordered pairs of real numbers. If to each ordered pair (x, y) in D there corresponds
a unique real number f (x, y), then f is called a function of x and y. The set D is the domain
of f and the corresponding set of values for f (x, y) is the range of f . For the function given by
z = f (x, y), we call x and y the independent variables and z the dependent variable.
As with functions of one variable, the most common way to describe a function of several variables is with an equation, and unless otherwise restricted, we can assume that the domain is the set of all points for which the equation is defined. For example, the domain of the function given by
\[
f(x, y) = x^2 + y^2
\]
is assumed to be the entire $xy$-plane.
Example 8.1.1. Find the domains of the following functions.
(i) $f(x, y) = \frac{\sqrt{x^2 + y^2 - 9}}{x}$ (ii) $g(x, y, z) = \frac{x}{\sqrt{9 - x^2 - y^2 - z^2}}$.
Solution: (i) The function $f$ is defined for all points $(x, y)$ such that $x \ne 0$ and $x^2 + y^2 \ge 9$. Thus, the domain is the set of all points lying on or outside the circle $x^2 + y^2 = 9$, except those on the $y$-axis (where $x = 0$).
(ii) The function $g$ is defined for all points $(x, y, z)$ such that $x^2 + y^2 + z^2 < 9$. Consequently, the domain is the set of all points $(x, y, z)$ lying inside a sphere of radius 3 that is centred at the origin.
Functions of several variables can be combined in the same ways as functions of single variables. For
instance, we can form the sum, difference, product and quotients of two functions of two variables
as follows
As with functions of single variables, we can learn a lot about the behaviour of a function of two
variables by sketching its graph. The graph of a function f of two variables is the set of all points
(x, y, z) for which z = f (x, y) and (x, y) is in the domain of f . The graph can be interpreted as a
surface in space.
A second way to visualize a function of two variables is to use a scalar field in which the scalar
z = f (x, y) is assigned to the point (x, y). A scalar field can be characterized by level curves (or
contour lines) along which the value of f (x, y) is constant. For example, the weather map shows
level curves of equal pressure called isobars. In weather maps for which the level curves represent
points of equal temperature, the level curves are called isotherms. Another common use of level
curves is in representing electrical potential fields, in this type of map, the level curves are called
equipotential lines.
The concept of a level curve can be extended by one dimension to define a level surface. If f is
a function of three variables and c is a constant, then the graph of the equation f (x, y, z) = c is a
level surface of the function f .
8.3 Limits and Continuity
In this section, we will study limits and continuity involving functions of two or three variables. We begin our discussion of the limit of a function of two variables by defining a two-dimensional analog to an interval on the real line. Using the formula for the distance between two points $(x, y)$ and $(x_0, y_0)$ in the plane, we can define the $\delta$-neighborhood about $(x_0, y_0)$ to be the disc centered at $(x_0, y_0)$ with radius $\delta > 0$
\[
\left\{(x, y) : \sqrt{(x - x_0)^2 + (y - y_0)^2} < \delta\right\}.
\]
When this formula contains the less than inequality, $<$, the disc is called open, and when it contains the less than or equal to inequality, $\le$, the disc is called closed.
8.3.2 Limit of a Function of Two Variables
Let $f$ be a function of two variables defined on an open disc centered at $(x_0, y_0)$, except possibly at $(x_0, y_0)$, and let $L$ be a real number. Then
\[
\lim_{(x,y)\to(x_0,y_0)} f(x, y) = L
\]
Definition
A function $f(x, y)$ has a limit $L$ as $(x, y)$ approaches $(x_0, y_0)$ if given any $\varepsilon > 0$ there exists $\delta > 0$ (depending on $\varepsilon$ and $(x_0, y_0)$) such that whenever $0 < (x - x_0)^2 + (y - y_0)^2 < \delta^2$, then $|f(x, y) - L| < \varepsilon$.
The definition of the limit of a function of two variables is similar to the definition of the limit of a function of a single variable, yet there is a critical difference. For a function of two variables, the statement $(x, y) \to (x_0, y_0)$ means that the point $(x, y)$ is allowed to approach $(x_0, y_0)$ from any direction. If the value of
\[
\lim_{(x,y)\to(x_0,y_0)} f(x, y)
\]
is not the same for all possible approaches, or paths, to $(x_0, y_0)$, then the limit does not exist. We usually choose convenient paths. Some of these are
Solution: Let $f(x, y) = x$ and $L = a$. We need to show that for each $\varepsilon > 0$, there exists a $\delta$-neighborhood about $(a, b)$ such that
\[
|f(x, y) - L| = |x - a| < \varepsilon
\]
whenever $(x, y) \ne (a, b)$ lies in the neighborhood. We can observe that from
\[
0 < \sqrt{(x - a)^2 + (y - b)^2} < \delta
\]
it follows that
\[
|f(x, y) - a| = |x - a| = \sqrt{(x - a)^2} \le \sqrt{(x - a)^2 + (y - b)^2} < \delta.
\]
So choosing $\delta = \varepsilon$ gives $|f(x, y) - L| < \varepsilon$, as required.
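Example 8.3.2. Prove that $\lim_{(x,y)\to(1,2)} (x^2 + 2y) = 5$.
Solution: Using the definition of limits, we must show that, given $\varepsilon > 0$, we can find a $\delta > 0$ such that $|x^2 + 2y - 5| < \varepsilon$ when $0 < |x - 1| < \delta$ and $0 < |y - 2| < \delta$. If $0 < |x - 1| < \delta$ and $0 < |y - 2| < \delta$, then
\[
1 - \delta < x < 1 + \delta
\]
and
\[
2 - \delta < y < 2 + \delta
\]
excluding $x = 1$ and $y = 2$. Thus, taking $\delta \le 1$ so that $1 - \delta \ge 0$ and the first inequality may be squared,
\[
1 - 2\delta + \delta^2 < x^2 < 1 + 2\delta + \delta^2
\]
and
\[
4 - 2\delta < 2y < 4 + 2\delta.
\]
Adding,
\[
5 - 4\delta + \delta^2 < x^2 + 2y < 5 + 4\delta + \delta^2
\]
or
\[
-4\delta + \delta^2 < x^2 + 2y - 5 < 4\delta + \delta^2.
\]
If $\delta \le 1$, it follows that
\[
-5\delta < x^2 + 2y - 5 < 5\delta
\]
i.e., $|x^2 + 2y - 5| < 5\delta$ whenever $0 < |x - 1| < \delta$ and $0 < |y - 2| < \delta$. Choosing $5\delta = \varepsilon$, i.e., $\delta = \frac{\varepsilon}{5}$, or $\delta = 1$, whichever is smaller, it follows that $|x^2 + 2y - 5| < \varepsilon$ when $0 < |x - 1| < \delta$ and $0 < |y - 2| < \delta$, i.e., $\lim_{(x,y)\to(1,2)} (x^2 + 2y) = 5$.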
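The choice $\delta = \min(1, \varepsilon/5)$ can be stress-tested by sampling points of the square $|x - 1| < \delta$, $|y - 2| < \delta$ and recording the largest value of $|x^2 + 2y - 5|$ seen. This is only an empirical sketch with hypothetical helper names and a fixed random seed; it illustrates the bound but does not replace the proof.

```python
import random

def delta_for(eps):
    # The choice made in the solution: the smaller of 1 and eps / 5.
    return min(1.0, eps / 5)

def worst_gap(eps, trials=20000, seed=0):
    # Largest |x^2 + 2y - 5| observed over random points within delta of (1, 2).
    rng = random.Random(seed)
    d = delta_for(eps)
    worst = 0.0
    for _ in range(trials):
        x = 1 + rng.uniform(-d, d)
        y = 2 + rng.uniform(-d, d)
        worst = max(worst, abs(x * x + 2 * y - 5))
    return worst

for eps in (2.0, 0.5, 0.01):
    print(eps, delta_for(eps), worst_gap(eps))
```

In each row the last number should stay below the first, as the argument above guarantees.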
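Example 8.3.3. Show that the following limit does not exist.
\[
\lim_{(x,y)\to(0,0)} \left(\frac{x^2 - y^2}{x^2 + y^2}\right)^2.
\]
Solution: The domain of the function given by $f(x, y) = \left(\frac{x^2 - y^2}{x^2 + y^2}\right)^2$ consists of all points in the $xy$-plane except for the point $(0, 0)$. To show that the limit as $(x, y)$ approaches $(0, 0)$ does not exist, consider approaching $(0, 0)$ along two different paths. Along the $x$-axis, every point is of the form $(x, 0)$ and the limit along this approach is
\[
\lim_{(x,0)\to(0,0)} \left(\frac{x^2 - 0^2}{x^2 + 0^2}\right)^2 = \lim_{(x,0)\to(0,0)} (1)^2 = 1.
\]
Along the line $y = x$, every point is of the form $(x, x)$ and the limit along this approach is
\[
\lim_{(x,x)\to(0,0)} \left(\frac{x^2 - x^2}{x^2 + x^2}\right)^2 = \lim_{(x,x)\to(0,0)} (0)^2 = 0.
\]
Since the two approaches give different values, the limit does not exist.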
Example 8.3.4. Show that $\lim_{(x,y)\to(0,0)} \frac{xy}{x^2 + y^2}$ does not exist.
Solution: The fact that the limits taken along the $x$- and $y$-axes exist and equal zero may lead us to suspect that $\lim_{(x,y)\to(0,0)} f(x, y)$ exists. However, we have not examined every path to $(0, 0)$. We now try any line through the origin given by $y = mx$,
\[
\lim_{(x,y)\to(0,0)} f(x, y) = \lim_{(x,y)\to(0,0)} \frac{mx^2}{x^2 + m^2x^2} = \frac{m}{1 + m^2}.
\]
This limit changes as the gradient $m$ changes. For example, (i) on $y = 2x$, $\lim_{(x,y)\to(0,0)} f(x, y) = \frac{2}{5}$ and (ii) on $y = 5x$, $\lim_{(x,y)\to(0,0)} f(x, y) = \frac{5}{26}$. There is no single number $L$ that we can call the limit of $f$ as $(x, y) \to (0, 0)$. So the limit does not exist.
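Path dependence is easy to see numerically: evaluate the function very close to the origin along several lines $y = mx$ and compare with $\frac{m}{1 + m^2}$. The helper names and the small value of $x$ below are illustrative choices.

```python
def f(x, y):
    return x * y / (x**2 + y**2)

def value_along_line(m, x=1e-8):
    # Evaluate f at a point close to the origin on the line y = m x.
    return f(x, m * x)

for m in (0, 1, 2, 5):
    print(m, value_along_line(m), m / (1 + m**2))
```

Each line gives a different value no matter how close to the origin the point is taken, which is why no single limit $L$ can exist.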
Solution:
\[
\lim_{(x,y)\to(2,1)} (2x + 5xy - 3y^2) = \lim_{(x,y)\to(2,1)} 2x + \lim_{(x,y)\to(2,1)} 5xy + \lim_{(x,y)\to(2,1)} (-3y^2)
\]
as (x, y) → (1, 2) can be evaluated by direct substitution. That is, the limit is f (1, 2) = 2. In such
cases the function f is said to be continuous at the point (1, 2).
1. $f$ is defined at $(x_0, y_0)$,
2. $\lim_{(x,y)\to(x_0,y_0)} f(x, y)$ exists, and
3. $\lim_{(x,y)\to(x_0,y_0)} f(x, y) = f(x_0, y_0)$.
If $k$ is a real number and $f$ and $g$ are continuous at $(x_0, y_0)$, then the following functions are continuous at $(x_0, y_0)$,
1. Scalar multiple $kf$.
2. Sum and difference $f \pm g$.
3. Product $fg$.
4. Quotient $\frac{f}{g}$, if $g(x_0, y_0) \ne 0$.
Polynomials and rational functions in two variables are continuous at any point at which they are
defined.
In the application of functions of several variables, the question often arises, “How will a function
be affected by a change in one of its independent variables?”. You can answer by considering the
independent variables one at a time. The process is called partial differentiation, and the result
is referred to as the partial derivative of f with respect to the chosen independent variable 1 .
If $z = f(x, y)$, then the first partial derivatives of $f$ with respect to $x$ and $y$ are the functions $f_x$ and $f_y$ defined by
\[
f_x(x, y) = \lim_{\Delta x\to 0} \frac{f(x + \Delta x, y) - f(x, y)}{\Delta x}
\]
\[
f_y(x, y) = \lim_{\Delta y\to 0} \frac{f(x, y + \Delta y) - f(x, y)}{\Delta y}
\]
provided the limits exist. This definition indicates that if $z = f(x, y)$, then to find $f_x$ we consider $y$ constant and differentiate with respect to $x$. Similarly, to find $f_y$, we consider $x$ constant and differentiate with respect to $y$.
Example 8.5.1. Find $f_x$ and $f_y$ for $f(x, y) = 3x - x^2y^2 + 2x^3y$.
and
\[
\left.\frac{\partial z}{\partial y}\right|_{(a,b)} = f_y(a, b).
\]
1
The introduction of partial derivatives followed Newton’s and Leibniz’s work in calculus by several years. Between
1760, Leonhard Euler and Jean Le Rond d’Alembert (1717-1783) separately published several papers on dynamics,
in which they established much of the theory of partial derivatives
Example 8.5.2. For $f(x, y) = x e^{x^2 y}$, find $f_x$ and $f_y$ and evaluate each at the point $(1, \ln 2)$.
Solution: Because
\[
f_x(x, y) = x e^{x^2 y}(2xy) + e^{x^2 y}
\]
the partial derivative of $f$ with respect to $x$ at $(1, \ln 2)$ is
\[
f_x(1, \ln 2) = e^{\ln 2}(2\ln 2) + e^{\ln 2} = 4\ln 2 + 2.
\]
Because
\[
f_y(x, y) = x e^{x^2 y}(x^2) = x^3 e^{x^2 y}
\]
the partial derivative of $f$ with respect to $y$ at $(1, \ln 2)$ is
\[
f_y(1, \ln 2) = e^{\ln 2} = 2.
\]
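The limit definition of a partial derivative suggests a direct numerical check: vary one variable by a small step while holding the other fixed. The sketch below applies central differences to $f(x, y) = x e^{x^2 y}$ at $(1, \ln 2)$; substituting this point into the formula for $f_x$ above gives $4\ln 2 + 2$, and $f_y$ gives 2. The helper names and step size are illustrative assumptions.

```python
import math

def f(x, y):
    return x * math.exp(x**2 * y)

def partial_x(g, x, y, h=1e-6):
    # Hold y fixed and difference in x.
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def partial_y(g, x, y, h=1e-6):
    # Hold x fixed and difference in y.
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 1.0, math.log(2)
print(partial_x(f, x0, y0), 4 * math.log(2) + 2)
print(partial_y(f, x0, y0), 2.0)
```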
The concept of a partial derivative can be extended naturally to functions of three or more variables.
For instance, if w = f (x, y, z), then there are three partial derivatives, each of which is formed by
holding two of three variables constant.
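\[
\frac{\partial w}{\partial x} = f_x(x, y, z) = \lim_{\Delta x\to 0} \frac{f(x + \Delta x, y, z) - f(x, y, z)}{\Delta x}
\]
\[
\frac{\partial w}{\partial y} = f_y(x, y, z) = \lim_{\Delta y\to 0} \frac{f(x, y + \Delta y, z) - f(x, y, z)}{\Delta y}
\]
\[
\frac{\partial w}{\partial z} = f_z(x, y, z) = \lim_{\Delta z\to 0} \frac{f(x, y, z + \Delta z) - f(x, y, z)}{\Delta z}.
\]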
Example 8.5.3. (i) To find the partial derivative of $f(x, y, z) = xy + yz^2 + xz$ with respect to $z$, consider $x$ and $y$ to be constant and obtain
\[
\frac{\partial}{\partial z}\left[xy + yz^2 + xz\right] = 2yz + x.
\]
(ii) To find the partial derivative of $f(x, y, z) = z\sin(xy^2 + 2z)$ with respect to $z$, consider $x$ and $y$ to be constant. Then, using the Product Rule, we obtain
\[
\begin{aligned}
\frac{\partial}{\partial z}\left[z\sin(xy^2 + 2z)\right] &= (z)\frac{\partial}{\partial z}\left[\sin(xy^2 + 2z)\right] + \sin(xy^2 + 2z)\frac{\partial}{\partial z}[z] \\
&= (z)\left[\cos(xy^2 + 2z)\right](2) + \sin(xy^2 + 2z) \\
&= 2z\cos(xy^2 + 2z) + \sin(xy^2 + 2z).
\end{aligned}
\]
(iii) To find the partial derivative of $f(x, y, z, w) = \frac{x + y + z}{w}$ with respect to $w$, consider $x$, $y$ and $z$ to be constant and obtain
\[
\frac{\partial}{\partial w}\left[\frac{x + y + z}{w}\right] = -\frac{x + y + z}{w^2}.
\]
8.5.3 Higher-Order Partial Derivatives
It is possible to take second, third and higher partial derivatives of a function of several variables,
provided such derivatives exist. Higher-order derivatives are denoted by the order in which the
differentiation occurs. For instance, the function z = f (x, y) has the following second partial
derivatives.
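\[
\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2} = f_{xx}.
\]
\[
\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2} = f_{yy}.
\]
\[
\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x} = f_{xy}.
\]
\[
\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\,\partial y} = f_{yx}.
\]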
The third and fourth cases are called mixed partial derivatives.
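Example 8.5.4. Find the second partial derivatives of $f(x, y) = 3xy^2 - 2y + 5x^2y^2$ and determine the value of $f_{xy}(-1, 2)$.
Solution: Begin by finding the first partial derivatives with respect to $x$ and $y$.
\[
f_x(x, y) = 3y^2 + 10xy^2, \quad f_y(x, y) = 6xy - 2 + 10x^2y.
\]
Differentiating each of these with respect to $x$ and $y$ gives
\[
f_{xx}(x, y) = 10y^2, \quad f_{yy}(x, y) = 6x + 10x^2, \quad f_{xy}(x, y) = 6y + 20xy \quad \text{and} \quad f_{yx}(x, y) = 6y + 20xy.
\]
At $(-1, 2)$, the value of $f_{xy}$ is $f_{xy}(-1, 2) = 12 - 40 = -28$.
Notice that the two mixed partials are equal. Sufficient conditions for this occurrence are given in the next theorem.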
Theorem 8.5.1 (Equality of Mixed Partial Derivatives). If $f$ is a function of $x$ and $y$ such that $f_{xy}$ and $f_{yx}$ are continuous on an open disc $R$, then for every $(x, y)$ in $R$,
\[
f_{xy}(x, y) = f_{yx}(x, y).
\]
Example 8.5.5. Show that $f_{xz} = f_{zx}$ and $f_{xzz} = f_{zxz} = f_{zzx}$ for the function given by
\[
f(x, y, z) = y e^x + x\ln z.
\]
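If $z = f(x, y)$ and $\Delta x$ and $\Delta y$ are increments of $x$ and $y$, then the differentials of the independent variables $x$ and $y$ are
\[
dx = \Delta x \quad \text{and} \quad dy = \Delta y
\]
and the total differential of the dependent variable $z$ is
\[
dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy = f_x(x, y)\,dx + f_y(x, y)\,dy.
\]
This definition can be extended to a function of three or more variables. For example, if $w = f(x, y, z, u)$, then $dx = \Delta x$, $dy = \Delta y$, $dz = \Delta z$, $du = \Delta u$, and the total differential of $w$ is
\[
dw = \frac{\partial w}{\partial x}\,dx + \frac{\partial w}{\partial y}\,dy + \frac{\partial w}{\partial z}\,dz + \frac{\partial w}{\partial u}\,du.
\]
\[
dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy = (2\sin y - 6xy^2)\,dx + (2x\cos y - 6x^2y)\,dy.
\]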
(ii) The total differential $dw$ for $w = x^2 + y^2 + z^2$ is
\[
dw = \frac{\partial w}{\partial x}\,dx + \frac{\partial w}{\partial y}\,dy + \frac{\partial w}{\partial z}\,dz = 2x\,dx + 2y\,dy + 2z\,dz.
\]
Theorem 8.6.1. If a function of $x$ and $y$ is differentiable at $(x_0, y_0)$, then it is continuous at $(x_0, y_0)$.
Solution: You can show that $f$ is not differentiable at $(0, 0)$ by showing that it is not continuous at this point. To see that $f$ is not continuous at $(0, 0)$, look at the values of $f(x, y)$ along two different approaches to $(0, 0)$. Along the line $y = x$, the limit is
\[
\lim_{(x,x)\to(0,0)} f(x, y) = \lim_{(x,x)\to(0,0)} \frac{-3x^2}{2x^2} = -\frac{3}{2}
\]
whereas along $y = -x$ we have
\[
\lim_{(x,-x)\to(0,0)} f(x, y) = \lim_{(x,-x)\to(0,0)} \frac{3x^2}{2x^2} = \frac{3}{2}.
\]
Thus, the limit of $f(x, y)$ as $(x, y) \to (0, 0)$ does not exist, and we can conclude that $f$ is not continuous at $(0, 0)$. Hence $f$ is not differentiable at $(0, 0)$. On the other hand, by the definition of the partial derivatives $f_x$ and $f_y$, we have
\[
f_x(0, 0) = \lim_{\Delta x\to 0} \frac{f(\Delta x, 0) - f(0, 0)}{\Delta x} = \lim_{\Delta x\to 0} \frac{0 - 0}{\Delta x} = 0
\]
and
\[
f_y(0, 0) = \lim_{\Delta y\to 0} \frac{f(0, \Delta y) - f(0, 0)}{\Delta y} = \lim_{\Delta y\to 0} \frac{0 - 0}{\Delta y} = 0.
\]
Theorem 8.7.1. Let w = f (x, y), where f is a differentiable function of x and y. If x = g(t) and
y = h(t), where g and h are differentiable functions of t, then w is a differentiable function of t,
and
\[ \frac{dw}{dt} = \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt}. \]
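[Figure: tree diagram for the Chain Rule with one independent variable. w depends on x and y (branches ∂w/∂x and ∂w/∂y), and x and y each depend on t (branches dx/dt and dy/dt).]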
Example 8.7.1. Let $w = x^2y - y^2$, where $x = \sin t$ and $y = e^t$. Find $\frac{dw}{dt}$ at t = 0.
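A minimal numerical sketch of this example, assuming only the Chain Rule formula of Theorem 8.7.1: the derivative is computed once from the formula and once by a central difference on w written directly as a function of t. The step size `h` is an illustrative choice.

```python
import math

def dw_dt(t):
    """dw/dt from the Chain Rule for w = x^2 y - y^2, x = sin t, y = e^t."""
    x, y = math.sin(t), math.exp(t)
    dx_dt, dy_dt = math.cos(t), math.exp(t)
    dw_dx = 2 * x * y        # partial derivative of w with respect to x
    dw_dy = x**2 - 2 * y     # partial derivative of w with respect to y
    return dw_dx * dx_dt + dw_dy * dy_dt

def w_of_t(t):
    x, y = math.sin(t), math.exp(t)
    return x**2 * y - y**2

h = 1e-6
numeric = (w_of_t(h) - w_of_t(-h)) / (2 * h)   # central difference at t = 0
print(dw_dt(0.0), numeric)
```

At t = 0 we have x = 0 and y = 1, so the Chain Rule gives 0 · 1 + (0 − 2) · 1 = −2, and the finite-difference value should agree closely.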
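The Chain Rule can be extended to any number of variables. For example, if each $x_i$ is a differentiable function of a single variable t, then for $w = f(x_1, x_2, \ldots, x_n)$, we have
\[ \frac{dw}{dt} = \frac{\partial w}{\partial x_1}\frac{dx_1}{dt} + \frac{\partial w}{\partial x_2}\frac{dx_2}{dt} + \cdots + \frac{\partial w}{\partial x_n}\frac{dx_n}{dt}. \]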
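Solution: Begin by substituting $x = s^2 + t^2$ and $y = \frac{s}{t}$ into the equation w = 2xy to obtain
\[ w = 2xy = 2(s^2 + t^2)\left(\frac{s}{t}\right) = 2\left(\frac{s^3}{t} + st\right). \]
Then, to find $\frac{\partial w}{\partial s}$, hold t constant and differentiate with respect to s.
\[ \frac{\partial w}{\partial s} = 2\left(\frac{3s^2}{t} + t\right) = \frac{6s^2 + 2t^2}{t}. \]
Similarly, to find $\frac{\partial w}{\partial t}$, hold s constant and differentiate with respect to t to obtain
\[ \frac{\partial w}{\partial t} = 2\left(-\frac{s^3}{t^2} + s\right) = 2\left(\frac{-s^3 + st^2}{t^2}\right) = \frac{2st^2 - 2s^3}{t^2}. \]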
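The following theorem gives an alternative method for finding the partial derivatives without explicitly writing w as a function of s and t.
Theorem 8.7.2 (Chain Rule: Two Independent Variables). Let w = f(x, y), where f is a differentiable function of x and y. If x = g(s, t) and y = h(s, t) such that the first partials $\frac{\partial x}{\partial s}$, $\frac{\partial x}{\partial t}$, $\frac{\partial y}{\partial s}$ and $\frac{\partial y}{\partial t}$ all exist, then $\frac{\partial w}{\partial s}$ and $\frac{\partial w}{\partial t}$ exist and are given by
\[ \frac{\partial w}{\partial s} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial s} \quad\text{and}\quad \frac{\partial w}{\partial t} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t}. \]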
Example 8.7.3. Use the Chain Rule to find $\frac{\partial w}{\partial s}$ and $\frac{\partial w}{\partial t}$ for w = 2xy, where $x = s^2 + t^2$ and $y = \frac{s}{t}$.
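As a sketch of how the Chain Rule applies here, the code below evaluates the two partial derivatives from the formulas of Theorem 8.7.2 and compares them with the closed forms obtained earlier by direct substitution. The sample point (s, t) = (1, 2) and the function names are illustrative assumptions.

```python
# w = 2xy with x = s^2 + t^2 and y = s / t.

def partials_by_chain_rule(s, t):
    x, y = s**2 + t**2, s / t
    dw_dx, dw_dy = 2 * y, 2 * x            # partials of w = 2xy
    dx_ds, dx_dt = 2 * s, 2 * t            # partials of x = s^2 + t^2
    dy_ds, dy_dt = 1 / t, -s / t**2        # partials of y = s / t
    dw_ds = dw_dx * dx_ds + dw_dy * dy_ds
    dw_dt = dw_dx * dx_dt + dw_dy * dy_dt
    return dw_ds, dw_dt

def partials_by_substitution(s, t):
    # Closed forms from writing w explicitly as a function of s and t.
    return (6 * s**2 + 2 * t**2) / t, (2 * s * t**2 - 2 * s**3) / t**2

chain = partials_by_chain_rule(1.0, 2.0)
direct = partials_by_substitution(1.0, 2.0)
print(chain, direct)
```

Both routes should give the same pair of values, which is the point of the theorem: w never has to be written out in terms of s and t.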
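[Figure: tree diagram for the Chain Rule with two independent variables. w depends on x and y (branches ∂w/∂x and ∂w/∂y); x and y each depend on s and t (branches ∂x/∂s, ∂x/∂t, ∂y/∂s and ∂y/∂t).]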
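When s = 1 and t = 2π, we have x = 1, y = 0 and z = 2π. Therefore
\[ \frac{\partial w}{\partial s} = 2\pi(1) + (1 + 2\pi)(0) + 0 = 2\pi. \]
Furthermore
\[ \frac{\partial w}{\partial t} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial t} = (y + z)(-s\sin t) + (x + z)(s\cos t) + (y + x)(1), \]
and when s = 1 and t = 2π this gives
\[ \frac{\partial w}{\partial t} = (0 + 2\pi)(0) + (1 + 2\pi)(1) + (0 + 1)(1) = 2 + 2\pi. \]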
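Solution: Clearly du = dx + dy and dv = y dx + x dy, and hence −y dx = x dy − dv and x dx = x du − x dy. Adding these two equations yields (x − y) dx = x du − dv. Also, x dy = dv − y dx and −y dy = y dx − y du, hence (x − y) dy = dv − y du. Thus
\[ dx = \frac{x}{x - y}\,du - \frac{1}{x - y}\,dv \]
and
\[ dy = \frac{1}{x - y}\,dv - \frac{y}{x - y}\,du. \]
Consequently, we have
\[ \frac{\partial x}{\partial u} = \frac{x}{x - y}, \quad \frac{\partial x}{\partial v} = \frac{-1}{x - y}, \quad \frac{\partial y}{\partial u} = \frac{-y}{x - y}, \quad \frac{\partial y}{\partial v} = \frac{1}{x - y}. \]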
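Example 8.7.6. Parabolic co-ordinates (u, v) are defined implicitly in terms of the Cartesian co-ordinates (x, y) by the pair of equations
\[ x = \frac{u^2 - v^2}{2}, \qquad y = uv. \]
Obtain expressions for $\frac{\partial u}{\partial x}$, $\frac{\partial u}{\partial y}$, $\frac{\partial v}{\partial x}$ and $\frac{\partial v}{\partial y}$ in terms of u and v and verify that
\[ \frac{\partial u}{\partial x}\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial v}{\partial y} = 0. \]
Given that f(x, y) = φ(u, v), obtain expressions for $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ in terms of $\frac{\partial \varphi}{\partial u}$, $\frac{\partial \varphi}{\partial v}$, u and v, and deduce that
\[ \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2 = \frac{1}{u^2 + v^2}\left[\left(\frac{\partial \varphi}{\partial u}\right)^2 + \left(\frac{\partial \varphi}{\partial v}\right)^2\right]. \]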
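Solution: Since $x = \frac{u^2 - v^2}{2}$ and y = uv, we have dx = u du − v dv and dy = v du + u dv. Multiplying the first by u and the second by v and adding, we have $u\,dx + v\,dy = (u^2 + v^2)\,du$. Multiplying the first by −v and the second by u and adding gives $-v\,dx + u\,dy = (u^2 + v^2)\,dv$. Hence
\[ \frac{\partial u}{\partial x} = \frac{u}{u^2 + v^2}, \quad \frac{\partial u}{\partial y} = \frac{v}{u^2 + v^2}, \quad \frac{\partial v}{\partial x} = \frac{-v}{u^2 + v^2}, \quad \frac{\partial v}{\partial y} = \frac{u}{u^2 + v^2}, \]
so that
\[ \frac{\partial u}{\partial x}\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial v}{\partial y} = \frac{-uv + uv}{(u^2 + v^2)^2} = 0. \]
By the Chain Rule,
\[ \frac{\partial f}{\partial x} = \frac{\partial \varphi}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial \varphi}{\partial v}\frac{\partial v}{\partial x} = \frac{u}{u^2 + v^2}\frac{\partial \varphi}{\partial u} - \frac{v}{u^2 + v^2}\frac{\partial \varphi}{\partial v} \]
and
\[ \frac{\partial f}{\partial y} = \frac{\partial \varphi}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial \varphi}{\partial v}\frac{\partial v}{\partial y} = \frac{v}{u^2 + v^2}\frac{\partial \varphi}{\partial u} + \frac{u}{u^2 + v^2}\frac{\partial \varphi}{\partial v}. \]
Now
\[ \left(\frac{\partial f}{\partial x}\right)^2 = \frac{u^2}{(u^2 + v^2)^2}\left(\frac{\partial \varphi}{\partial u}\right)^2 - \frac{2uv}{(u^2 + v^2)^2}\frac{\partial \varphi}{\partial u}\frac{\partial \varphi}{\partial v} + \frac{v^2}{(u^2 + v^2)^2}\left(\frac{\partial \varphi}{\partial v}\right)^2, \]
\[ \left(\frac{\partial f}{\partial y}\right)^2 = \frac{v^2}{(u^2 + v^2)^2}\left(\frac{\partial \varphi}{\partial u}\right)^2 + \frac{2uv}{(u^2 + v^2)^2}\frac{\partial \varphi}{\partial u}\frac{\partial \varphi}{\partial v} + \frac{u^2}{(u^2 + v^2)^2}\left(\frac{\partial \varphi}{\partial v}\right)^2. \]
Hence
\[ \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2 = \frac{1}{u^2 + v^2}\left[\left(\frac{\partial \varphi}{\partial u}\right)^2 + \left(\frac{\partial \varphi}{\partial v}\right)^2\right]. \]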
Example 8.7.7. Let w = f(x, y), where x and y are given in polar coordinates by the equations x = r cos θ and y = r sin θ. Calculate $\frac{\partial w}{\partial r}$, $\frac{\partial w}{\partial \theta}$ and $\frac{\partial^2 w}{\partial r^2}$ in terms of r and θ and the partial derivatives of w with respect to x and y.
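Solution: Here x and y are intermediate variables, while the independent variables are r and θ. First note that
\[ \frac{\partial x}{\partial r} = \cos\theta, \quad \frac{\partial y}{\partial r} = \sin\theta, \quad \frac{\partial x}{\partial \theta} = -r\sin\theta \quad\text{and}\quad \frac{\partial y}{\partial \theta} = r\cos\theta. \]
Then
\[ \frac{\partial w}{\partial r} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial r} = \frac{\partial w}{\partial x}\cos\theta + \frac{\partial w}{\partial y}\sin\theta \]
and
\[ \frac{\partial w}{\partial \theta} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial \theta} = -r\frac{\partial w}{\partial x}\sin\theta + r\frac{\partial w}{\partial y}\cos\theta. \]
Next,
\[ \frac{\partial^2 w}{\partial r^2} = \frac{\partial}{\partial r}\left(\frac{\partial w}{\partial r}\right) = \frac{\partial}{\partial r}\left(\frac{\partial w}{\partial x}\cos\theta + \frac{\partial w}{\partial y}\sin\theta\right) = \frac{\partial w_x}{\partial r}\cos\theta + \frac{\partial w_y}{\partial r}\sin\theta, \]
where $w_x = \frac{\partial w}{\partial x}$ and $w_y = \frac{\partial w}{\partial y}$. Therefore
\[ \frac{\partial^2 w}{\partial r^2} = \left(\frac{\partial w_x}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial w_x}{\partial y}\frac{\partial y}{\partial r}\right)\cos\theta + \left(\frac{\partial w_y}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial w_y}{\partial y}\frac{\partial y}{\partial r}\right)\sin\theta \]
\[ = \left(\frac{\partial^2 w}{\partial x^2}\cos\theta + \frac{\partial^2 w}{\partial y\,\partial x}\sin\theta\right)\cos\theta + \left(\frac{\partial^2 w}{\partial x\,\partial y}\cos\theta + \frac{\partial^2 w}{\partial y^2}\sin\theta\right)\sin\theta. \]
Finally, because $w_{yx} = w_{xy}$, we get
\[ \frac{\partial^2 w}{\partial r^2} = \frac{\partial^2 w}{\partial x^2}\cos^2\theta + 2\frac{\partial^2 w}{\partial x\,\partial y}\cos\theta\sin\theta + \frac{\partial^2 w}{\partial y^2}\sin^2\theta. \]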
Theorem 8.8.1 (Extreme Value Theorem). Let f be a continuous function of two variables x and y defined on a closed bounded region R in the xy-plane. Then f attains both a minimum value and a maximum value at points of R.
A minimum value is also called an absolute minimum and a maximum value is also called an absolute maximum.
Definition of Relative Extrema
To locate relative extrema of f, we can investigate the points at which the gradient of f is 0 or undefined. Such points are called critical points of f.
Let f be defined on an open region R containing $(x_0, y_0)$. The point $(x_0, y_0)$ is a critical point of f if one of the following is true.
1. $f_x(x_0, y_0) = 0$ and $f_y(x_0, y_0) = 0$.
2. $f_x(x_0, y_0)$ or $f_y(x_0, y_0)$ does not exist.
Solution: Begin by finding the critical points of f. Because $f_x(x, y) = 4x + 8$ and $f_y(x, y) = 2y - 6$ are defined for all x and y, the only critical points are those for which both first partial derivatives are 0. To locate these points, let $f_x(x, y)$ and $f_y(x, y)$ be 0, and solve the system of equations
\[ 4x + 8 = 0 \quad\text{and}\quad 2y - 6 = 0 \]
to obtain the critical point (−2, 3). By completing the square, we can conclude that for all (x, y) ≠ (−2, 3),
\[ f(x, y) = 2(x + 2)^2 + (y - 3)^2 + 3 > 3. \]
Therefore, a relative minimum of f occurs at (−2, 3). The value of the relative minimum is f(−2, 3) = 3.
The above example shows a relative minimum occurring at one type of critical point, the type for
which both fx (x, y) and fy (x, y) are 0. The next example concerns a relative maximum that occurs
at the other type of critical point, the type for which either fx (x, y) or fy (x, y) is undefined.
Example 8.8.2. Determine the relative extrema of $f(x, y) = 1 - (x^2 + y^2)^{1/3}$.
Solution: Because
\[ f_x(x, y) = -\frac{2x}{3(x^2 + y^2)^{2/3}} \quad\text{and}\quad f_y(x, y) = -\frac{2y}{3(x^2 + y^2)^{2/3}} \]
it follows that both partial derivatives are defined for all points in the xy-plane except for (0, 0). Moreover, because the partial derivatives cannot both be 0 unless both x and y are 0, we can conclude that (0, 0) is the only critical point. Note that f(0, 0) = 1, while for all other (x, y) it is clear that
\[ f(x, y) = 1 - (x^2 + y^2)^{1/3} < 1. \]
So f has a relative maximum at (0, 0).
Some critical points yield saddle points, which are neither relative maxima nor relative minima.
Theorem 8.8.3. Let f have continuous second partial derivatives on an open region containing a point (a, b) for which
\[ f_x(a, b) = 0 \quad\text{and}\quad f_y(a, b) = 0. \]
To test for relative extrema of f, we define the quantity
\[ d = f_{xx}(a, b)\,f_{yy}(a, b) - [f_{xy}(a, b)]^2. \]
1. If d > 0 and $f_{xx}(a, b) > 0$, then f has a relative minimum at (a, b).
2. If d > 0 and $f_{xx}(a, b) < 0$, then f has a relative maximum at (a, b).
3. If d < 0, then (a, b, f(a, b)) is a saddle point.
4. If d = 0, the test is inconclusive.
A convenient device for remembering the formula for d in the Second Partials Test is given by the 2 × 2 determinant
\[ d = \begin{vmatrix} f_{xx}(a, b) & f_{xy}(a, b) \\ f_{yx}(a, b) & f_{yy}(a, b) \end{vmatrix} \]
where $f_{xy}(a, b) = f_{yx}(a, b)$.
Example 8.8.3. Find the relative extrema of f (x, y) = −x3 + 4xy − 2y 2 + 1.
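A minimal sketch of how the Second Partials Test can be applied to this function. It assumes the quantity $d = f_{xx}f_{yy} - f_{xy}^2$ (the expansion of the 2 × 2 determinant above) together with the standard conventions that d < 0 indicates a saddle point and d = 0 is inconclusive. The critical points come from solving $f_x = -3x^2 + 4y = 0$ and $f_y = 4x - 4y = 0$ by hand.

```python
# Second Partials Test for f(x, y) = -x^3 + 4xy - 2y^2 + 1.

def classify(x, y):
    f_xx = -6 * x      # second partial with respect to x
    f_yy = -4.0        # second partial with respect to y
    f_xy = 4.0         # mixed partial
    d = f_xx * f_yy - f_xy**2
    if d > 0:
        return "relative minimum" if f_xx > 0 else "relative maximum"
    if d < 0:
        return "saddle point"
    return "inconclusive"

# f_y = 0 gives y = x, and then f_x = 0 gives x(4 - 3x) = 0.
critical_points = [(0.0, 0.0), (4 / 3, 4 / 3)]
results = {point: classify(*point) for point in critical_points}
for point, kind in results.items():
    print(point, kind)
```

Under these assumptions the test reports a saddle point at (0, 0), where d = −16, and a relative maximum at (4/3, 4/3), where d = 16 and $f_{xx} = -8$.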
8.9 Tutorial 6
1. If $f(x, y) = x^3 - 2xy + 3y^2$, find (i) $f(-2, 3)$ (ii) $f\left(\frac{1}{x}, \frac{2}{y}\right)$.
2. Use the definition of a limit to show that:
(i) $\lim_{(x,y)\to(4,-1)} (3x - 2y) = 14$ (ii) $\lim_{(x,y)\to(2,1)} (xy - 3x + 4) = 0$ (iii) $\lim_{(x,y)\to(2,1)} (2x + 5xy - 3y^2) = 11$.
3. Let
\[ f(x, y) = \begin{cases} \dfrac{xy^2}{x^2 + y^2}, & (x, y) \neq (0, 0) \\ 0, & (x, y) = (0, 0), \end{cases} \]
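5. Show that the following limits do not exist:
(i) $\lim_{(x,y)\to(0,0)} \frac{x^2 - 3y^2}{x^2 + 2y^2}$ (ii) $\lim_{(x,y)\to(0,0)} \frac{xy}{x^2 + y^2}$ (iii) $\lim_{(x,y)\to(0,0)} \frac{x^2y}{x^4 + y^2}$ (iv) $\lim_{(x,y)\to(0,0)} \frac{2x - y^2}{2x^2 + y}$.
6. Use the definition of partial derivative to find $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ for $z = 3x^2 - xy + 2y^2 + 3$.
7. If $f(x, y) = \frac{x - y}{x + y}$, find $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ from the definition.
8. Let $f(x, y) = xe^{x^2 + 2y^2}$. Find $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$, and verify that $\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}$.
9. If $z = x^2\tan^{-1}\left(\frac{y}{x}\right)$, find $\frac{\partial^2 z}{\partial x\,\partial y}$ at (1, 1).
10. Let $f(x, y, z) = \frac{1}{\sqrt{x^2 + y^2 + z^2}}$. Show that $f_{xx} + f_{yy} + f_{zz} = 0$.
11. Find $\frac{dw}{dt}$ given that (i) $w = x^2 + y^2$, $x = e^t$, $y = e^{-t}$ (ii) $w = \ln\frac{y}{x}$, $x = \cos t$, $y = \sin t$ (iii) $w = \sqrt{x^2 + y^2}$, $x = \sin t$, $y = e^t$.
12. Find $\frac{\partial r}{\partial x}$, $\frac{\partial r}{\partial y}$ and $\frac{\partial r}{\partial z}$.
(i) $r = e^{u+v+w}$ and u = yz, v = xz, w = xy (ii) $r = uvw - u^2 - v^2 - w^2$ and u = y + z, v = x + z, w = x + y.
13. Let f(x, y) be a function of x and y where $x = e^{u-v}\cos(u + v)$, $y = e^{u-v}\sin(u + v)$. Show that
\[ \frac{\partial f}{\partial u} + \frac{\partial f}{\partial v} = 2x\frac{\partial f}{\partial y} - 2y\frac{\partial f}{\partial x}. \]
14. Assume that w = f(u, v) where u = x + y and v = x − y. Show that
\[ \frac{\partial w}{\partial x}\frac{\partial w}{\partial y} = \left(\frac{\partial w}{\partial u}\right)^2 - \left(\frac{\partial w}{\partial v}\right)^2. \]
15. Suppose that w = f(x, y) and that x = u + v and y = u − v. Show that
\[ \frac{\partial^2 w}{\partial x^2} - \frac{\partial^2 w}{\partial y^2} = \frac{1}{2}\left(\frac{\partial^2 w}{\partial u\,\partial v} + \frac{\partial^2 w}{\partial v\,\partial u}\right). \]
16. Let V = f(x, y), where x and y are given in polar co-ordinates by the equations x = r cos θ and y = r sin θ. Show that
\[ \left(\frac{\partial V}{\partial x}\right)^2 + \left(\frac{\partial V}{\partial y}\right)^2 = \left(\frac{\partial V}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial V}{\partial \theta}\right)^2. \]
17. Given V = f(x, y). Show that if x = r cos θ and y = r sin θ, then
\[ \frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} = \frac{\partial^2 V}{\partial r^2} + \frac{1}{r}\frac{\partial V}{\partial r} + \frac{1}{r^2}\frac{\partial^2 V}{\partial \theta^2}. \]
18. Given that w = f(x, y), $x = e^u\cos v$ and $y = e^u\sin v$. Show that
\[ \left(\frac{\partial w}{\partial x}\right)^2 + \left(\frac{\partial w}{\partial y}\right)^2 = e^{-2u}\left[\left(\frac{\partial w}{\partial u}\right)^2 + \left(\frac{\partial w}{\partial v}\right)^2\right]. \]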
8.10 Tutorial 7
1. Find the total differential of the following functions.
(i) $z = 3x^2y^3$ (ii) $z = \frac{x^2}{y}$ (iii) $z = x\cos y - y\cos x$ (iv) $w = e^x\cos y + z$
(v) $w = x^2yz^2\sin yz$ (vi) $z = \frac{1}{2}\left(e^{x^2+y^2} - e^{-x^2-y^2}\right)$ (vii) $z = e^x\sin y$ (viii) $w = \frac{x + y}{z - 2y}$.
2. Find and classify the stationary points of
(b) Hence, find and classify all the critical points of the function f (x, y).
4. Suppose that w = f(u), where $u = \frac{x^2 - y^2}{x^2 + y^2}$. Show that $xw_x + yw_y = 0$.
5. Suppose that w = f(u) + g(v), where u = x − at and v = x + at. Show that
\[ \frac{\partial^2 w}{\partial t^2} = a^2\frac{\partial^2 w}{\partial x^2}. \]
Chapter 9
Multiple Integration
In the previous chapter, we saw that it is meaningful to differentiate functions of several variables with respect to one variable while holding the other variables constant. We can integrate functions of several variables by a similar procedure. For example, if we have the partial derivative $f_x(x, y) = 2xy$, then by considering y constant, we can integrate with respect to x to obtain
\[ \begin{aligned} f(x, y) &= \int f_x(x, y)\,dx && \text{Integrate with respect to } x \\ &= \int 2xy\,dx && y \text{ is held constant} \\ &= y\int 2x\,dx && \text{Factor out constant } y \\ &= y(x^2) + C(y) = x^2y + C(y). \end{aligned} \]
Note that the constant of integration, C(y), is a function of y. In other words, by integrating with respect to x, we are able to recover f(x, y) only partially. For example, by considering y constant, we can apply the Fundamental Theorem of Calculus to evaluate
\[ \int_1^{2y} 2xy\,dx = \Big[x^2y\Big]_1^{2y} = (2y)^2y - (1)^2y = 4y^3 - y. \]
Note that the variable of integration cannot appear in either limit of integration. For example, it makes no sense to write $\int_0^x y\,dx$.
Example 9.1.1. Evaluate $\int_1^x (2x^2y^{-2} + 2y)\,dy$.
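Solution: Considering x to be constant and integrating with respect to y produces
\[ \begin{aligned} \int_1^x (2x^2y^{-2} + 2y)\,dy &= \left[\frac{-2x^2}{y} + y^2\right]_1^x \\ &= \left(\frac{-2x^2}{x} + x^2\right) - \left(\frac{-2x^2}{1} + 1\right) \\ &= 3x^2 - 2x - 1. \end{aligned} \]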
Notice that in the above example the integral defines a function of x and can itself be integrated.
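Example 9.1.2. Evaluate $\int_1^2\left[\int_1^x (2x^2y^{-2} + 2y)\,dy\right]dx$.
Solution: [Figure: the region of integration 1 ≤ x ≤ 2, 1 ≤ y ≤ x, bounded above by the line y = x.] Using the result of the previous example, we have
\[ \begin{aligned} \int_1^2\left[\int_1^x (2x^2y^{-2} + 2y)\,dy\right]dx &= \int_1^2 (3x^2 - 2x - 1)\,dx \\ &= \Big[x^3 - x^2 - x\Big]_1^2 \\ &= 2 - (-1) \\ &= 3. \end{aligned} \]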
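The value 3 can be cross-checked numerically. The sketch below nests a simple midpoint rule, with the inner integral's upper limit depending on the outer variable x; the helper `midpoint` and the number of subintervals are illustrative choices.

```python
def midpoint(func, a, b, n=400):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def inner(x):
    # Inner integral over y from 1 to x, with x held constant.
    return midpoint(lambda y: 2 * x**2 * y**-2 + 2 * y, 1.0, x)

value = midpoint(inner, 1.0, 2.0)   # outer integral over x from 1 to 2
print(value)
```

The printed value should be close to 3, the result obtained analytically.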
The integral in the above example is an iterated integral. Iterated integrals are usually written
simply as
\[ \int_a^b\int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx \quad\text{and}\quad \int_c^d\int_{h_1(y)}^{h_2(y)} f(x, y)\,dx\,dy. \]
The inside limits of integration can be variable with respect to the outer variable of integration.
However, the outside limits of integration must be constant with respect to both variables of
integration. After performing the inside integration, we obtain a definite integral, and the second
integration produces a real number.
One order of integration will often produce a simpler integration problem than the other order. The
order of integration affects the ease of integration, but not the value of the integral.
Example 9.1.3. Sketch the region whose area is represented by the integral
\[ \int_0^2\int_{y^2}^4 dx\,dy. \]
Then find another iterated integral using the order dydx to represent the area, and show that both integrals yield the same value.
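[Figure: the region R between the parabola $x = y^2$ and the line x = 4, with a horizontal representative rectangle of height ∆y.]
Solution: From the inner limits of integration, we know that
\[ y^2 \le x \le 4 \qquad \text{Inner limits of integration} \]
which means that the region R is bounded on the left by the parabola $x = y^2$ and on the right by the line x = 4. Furthermore, because
\[ 0 \le y \le 2 \qquad \text{Outer limits of integration} \]
we know that R is bounded below by the x-axis. The value of this integral is
\[ \begin{aligned} \int_0^2\int_{y^2}^4 dx\,dy &= \int_0^2 \Big[x\Big]_{y^2}^4\,dy \\ &= \int_0^2 (4 - y^2)\,dy \\ &= \left[4y - \frac{y^3}{3}\right]_0^2 = \frac{16}{3}. \end{aligned} \]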
To change the order of integration to dydx, place a vertical rectangle in the region. From this we can see that the constant bounds 0 ≤ x ≤ 4 serve as the outer limits of integration. By solving for y in the equation $x = y^2$, we can conclude that the inner bounds are $0 \le y \le \sqrt{x}$. Therefore, the area of the region can be represented by
\[ \int_0^4\int_0^{\sqrt{x}} dy\,dx. \]
By evaluating this integral, we can see that it has the same value as the original integral.
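[Figure: the same region described by $0 \le y \le \sqrt{x}$, 0 ≤ x ≤ 4, with a vertical representative rectangle of width ∆x.]
\[ \begin{aligned} \int_0^4\int_0^{\sqrt{x}} dy\,dx &= \int_0^4 \Big[y\Big]_0^{\sqrt{x}}\,dx \\ &= \int_0^4 \sqrt{x}\,dx \\ &= \left[\frac{2}{3}x^{3/2}\right]_0^4 = \frac{16}{3}. \end{aligned} \]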
Example 9.1.4. Express $\int_0^4\int_{x/2}^2 e^{y^2}\,dy\,dx$ as an iterated integral with the order of integration reversed and evaluate.
Solution: From the given limits of integration we see that, for a fixed x, y varies from $y = \frac{x}{2}$ to y = 2 and x varies from x = 0 to x = 4. We can also describe the region as, for y fixed, x varies from x = 0 to x = 2y and y varies from y = 0 to y = 2. The corresponding iterated integral is $\int_0^2\int_0^{2y} e^{y^2}\,dx\,dy$. Solving, we have
\[ \begin{aligned} \int_0^2\int_0^{2y} e^{y^2}\,dx\,dy &= \int_0^2 \Big[xe^{y^2}\Big]_{x=0}^{x=2y}\,dy \\ &= \int_0^2 2ye^{y^2}\,dy \\ &= \Big[e^{y^2}\Big]_0^2 \\ &= e^4 - 1. \end{aligned} \]
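Reversing the order changes how hard the integral is to do by hand, not its value. As a rough numerical sketch of that claim, the code below approximates the integral in the original order with a nested midpoint rule and compares it with the reversed-order integrand $2ye^{y^2}$ and with $e^4 - 1$. The helper `midpoint` and the grid size are illustrative choices.

```python
import math

def midpoint(func, a, b, n=400):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

# Original order: for each x in [0, 4], y runs from x/2 to 2.
original_order = midpoint(
    lambda x: midpoint(lambda y: math.exp(y**2), x / 2, 2.0), 0.0, 4.0
)
# Reversed order: the inner integral over x in [0, 2y] equals 2y * exp(y^2).
reversed_order = midpoint(lambda y: 2 * y * math.exp(y**2), 0.0, 2.0)
exact = math.exp(4) - 1
print(original_order, reversed_order, exact)
```

All three numbers should agree to within the discretisation error of the midpoint rule.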
Example 9.1.5. Evaluate $\int_0^2\int_{y/2}^1 ye^{x^3}\,dx\,dy$.
Solution: We cannot integrate first with respect to x, as indicated, because $e^{x^3}$ has no elementary anti-derivative. So we try to evaluate the integral by first reversing the order of integration.
\[ \begin{aligned} \int_0^2\int_{y/2}^1 ye^{x^3}\,dx\,dy &= \int_0^1\int_0^{2x} ye^{x^3}\,dy\,dx \\ &= \int_0^1 \left[\frac{1}{2}y^2e^{x^3}\right]_0^{2x} dx \\ &= \int_0^1 2x^2e^{x^3}\,dx \\ &= \left[\frac{2}{3}e^{x^3}\right]_0^1 \\ &= \frac{2}{3}(e - 1). \end{aligned} \]
9.2 Double Integrals and Volume
If f is defined on a closed, bounded region R in the xy-plane, then the double integral of f over
R is given by
\[ \iint_R f(x, y)\,dA = \lim_{\|\Delta\|\to 0}\sum_{i=1}^n f(x_i, y_i)\,\Delta x_i\,\Delta y_i \]
provided the limit exists. If the limit exists, then f is integrable over R.
A double integral can be used to find the volume of a solid region that lies between the xy-plane
and the surface given by z = f (x, y).
If f is integrable over a plane region R and f (x, y) ≥ 0 for all (x, y) in R, then the volume of the
solid region that lies above R and below the graph of f is defined as
\[ V = \iint_R f(x, y)\,dA. \]
Example 9.2.1. Find the volume of the solid region R bounded by the surface $f(x, y) = e^{-x^2}$ and the planes z = 0, y = 0, y = x and x = 1.
Solution: The base of R in the xy-plane is bounded by the lines y = 0, x = 1 and y = x. The two possible orders of integration are
\[ \int_0^1\int_0^x e^{-x^2}\,dy\,dx \quad\text{and}\quad \int_0^1\int_y^1 e^{-x^2}\,dx\,dy. \]
By setting up the corresponding iterated integrals, we can see that the order dxdy requires the anti-derivative $\int e^{-x^2}\,dx$, which is not an elementary function. On the other hand, the order dydx produces the integral
\[ \begin{aligned} \int_0^1\int_0^x e^{-x^2}\,dy\,dx &= \int_0^1 \Big[e^{-x^2}y\Big]_0^x\,dx \\ &= \int_0^1 xe^{-x^2}\,dx \\ &= \left[-\frac{1}{2}e^{-x^2}\right]_0^1 \\ &= -\frac{1}{2}\left(\frac{1}{e} - 1\right) \\ &= \frac{e - 1}{2e} \approx 0.316. \end{aligned} \]
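A short numerical sketch of the same volume, integrating in the order dydx with a nested midpoint rule. The helper `midpoint` and the grid sizes are illustrative choices, not part of the original notes.

```python
import math

def midpoint(func, a, b, n=1000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def inner(x):
    # Integrate e^{-x^2} over y from 0 to x; the integrand does not depend on y.
    return midpoint(lambda y: math.exp(-x**2), 0.0, x, n=50)

volume = midpoint(inner, 0.0, 1.0)
exact = (math.e - 1) / (2 * math.e)
print(volume, exact)
```

The two printed numbers should agree to several decimal places, consistent with the rounded value 0.316.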
If f is continuous over a bounded solid region Q, then the triple integral of f over Q is defined as
\[ \iiint_Q f(x, y, z)\,dV = \lim_{\|\Delta\|\to 0}\sum_{i=1}^n f(x_i, y_i, z_i)\,\Delta V_i \]
provided the limit exists. The volume of the solid region Q is given by
\[ \text{Volume of } Q = \iiint_Q dV. \]
Solution: For the first integration, hold x and y constant and integrate with respect to z.
\[ \begin{aligned} \int_0^2\int_0^x\int_0^{x+y} e^x(y + 2z)\,dz\,dy\,dx &= \int_0^2\int_0^x e^x\Big[yz + z^2\Big]_0^{x+y}\,dy\,dx \\ &= \int_0^2\int_0^x e^x(x^2 + 3xy + 2y^2)\,dy\,dx. \end{aligned} \]
For the second integration, hold x constant and integrate with respect to y.
\[ \begin{aligned} \int_0^2\int_0^x e^x(x^2 + 3xy + 2y^2)\,dy\,dx &= \int_0^2 e^x\left[x^2y + \frac{3xy^2}{2} + \frac{2y^3}{3}\right]_0^x dx \\ &= \frac{19}{6}\int_0^2 x^3e^x\,dx \\ &= \frac{19}{6}\Big[e^x(x^3 - 3x^2 + 6x - 6)\Big]_0^2 \\ &= 19\left(\frac{e^2}{3} + 1\right). \end{aligned} \]
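Example 9.3.2. If f(x, y, z) = xy + yz and T consists of those points (x, y, z) in space satisfying the inequalities −1 ≤ x ≤ 1, 2 ≤ y ≤ 3 and 0 ≤ z ≤ 1, then
\[ \begin{aligned} \iiint_T f(x, y, z)\,dV &= \int_{-1}^1\int_2^3\int_0^1 (xy + yz)\,dz\,dy\,dx \\ &= \int_{-1}^1\int_2^3 \left[xyz + \frac{1}{2}yz^2\right]_{z=0}^{1} dy\,dx \\ &= \int_{-1}^1\int_2^3 \left(xy + \frac{1}{2}y\right) dy\,dx \\ &= \int_{-1}^1 \left[\frac{1}{2}xy^2 + \frac{1}{4}y^2\right]_{y=2}^{3} dx \\ &= \int_{-1}^1 \left(\frac{5}{2}x + \frac{5}{4}\right) dx \\ &= \left[\frac{5}{4}x^2 + \frac{5}{4}x\right]_{-1}^{1} = \frac{5}{2}. \end{aligned} \]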
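The result 5/2 can be confirmed with a three-dimensional midpoint sum over the box T. This is only a sketch: the function name `triple_midpoint` and the grid size are illustrative. Because the integrand is linear in each variable separately, the midpoint rule is exact here up to rounding.

```python
def triple_midpoint(func, x_range, y_range, z_range, n=10):
    """Midpoint-rule approximation of a triple integral over a rectangular box."""
    (x0, x1), (y0, y1), (z0, z1) = x_range, y_range, z_range
    hx, hy, hz = (x1 - x0) / n, (y1 - y0) / n, (z1 - z0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * hx
        for j in range(n):
            y = y0 + (j + 0.5) * hy
            for k in range(n):
                z = z0 + (k + 0.5) * hz
                total += func(x, y, z)
    return total * hx * hy * hz

value = triple_midpoint(lambda x, y, z: x * y + y * z, (-1, 1), (2, 3), (0, 1))
print(value)
```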
9.4.1 Jacobians
The Jacobian is named after the German mathematician Carl Gustav Jacobi (1804-1851). For the single integral
\[ \int_a^b f(x)\,dx \]
you can change variables by letting x = g(u), so that dx = g′(u) du, and obtain
\[ \int_a^b f(x)\,dx = \int_c^d f(g(u))\,g'(u)\,du \]
where a = g(c) and b = g(d). Note that the change of variable introduces an additional factor g′(u) into the integrand. This also occurs in the case of double integrals.
\[ \iint_R f(x, y)\,dA = \iint_S f(g(u, v), h(u, v))\left|\underbrace{\frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial y}{\partial u}\frac{\partial x}{\partial v}}_{\text{Jacobian}}\right| du\,dv \]
where the change of variables x = g(u, v) and y = h(u, v) introduces a factor called the Jacobian of x and y with respect to u and v.
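If x = g(u, v) and y = h(u, v), then the Jacobian of x and y with respect to u and v, denoted by $\frac{\partial(x, y)}{\partial(u, v)}$, is
\[ \frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial y}{\partial u}\frac{\partial x}{\partial v}. \]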
In cases where it is more convenient to express u and v in terms of x and y, we can first compute $\frac{\partial(u, v)}{\partial(x, y)}$ explicitly and then find the needed Jacobian $\frac{\partial(x, y)}{\partial(u, v)}$ from the formula
\[ \frac{\partial(x, y)}{\partial(u, v)} \cdot \frac{\partial(u, v)}{\partial(x, y)} = 1. \]
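The reciprocal relation between the two Jacobians can be illustrated on a concrete linear transformation. The sketch below uses u = x + y, v = x − 2y as an illustrative choice, with exact rational arithmetic; the helper `det2` is hypothetical.

```python
from fractions import Fraction

def det2(a, b, c, d):
    """Determinant of the 2 x 2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

# Illustrative transformation: u = x + y, v = x - 2y.
jac_uv_xy = det2(1, 1, 1, -2)        # rows: (u_x, u_y) and (v_x, v_y)

# Solving for x and y gives x = (2u + v)/3 and y = (u - v)/3.
third = Fraction(1, 3)
jac_xy_uv = det2(2 * third, third, third, -third)   # rows: (x_u, x_v) and (y_u, y_v)

product = jac_uv_xy * jac_xy_uv
print(jac_uv_xy, jac_xy_uv, product)
```

Computing the easier determinant first and taking its reciprocal avoids having to solve for x and y at all.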
Example 9.4.1. Find the Jacobian for the change of variables defined by x = r cos θ and y = r sin θ.
The above example points out that the change of variables from rectangular to polar coordinates for a double integral can be written as
\[ \begin{aligned} \iint_R f(x, y)\,dA &= \iint_S f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta, \qquad r > 0 \\ &= \iint_S f(r\cos\theta, r\sin\theta)\left|\frac{\partial(x, y)}{\partial(r, \theta)}\right| dr\,d\theta \end{aligned} \]
where S is the region in the rθ-plane that corresponds to the region R in the xy-plane. In general, a change of variables is given by a one-to-one transformation T from a region S in the uv-plane to a region R in the xy-plane, given by
\[ T(u, v) = (x, y) = (g(u, v), h(u, v)) \]
where g and h have continuous first partial derivatives in the region S. Note that the point (u, v) lies in S and the point (x, y) lies in R. In most cases, we look for a transformation for which the region S is simpler than the region R.
Theorem 9.4.1. Let R and S be regions in the xy- and uv-planes that are related by the equations x = g(u, v) and y = h(u, v) such that each point in R is the image of a unique point in S. If f is continuous on R, g and h have continuous partial derivatives on S, and $\frac{\partial(x, y)}{\partial(u, v)}$ is non-zero on S, then
\[ \iint_R f(x, y)\,dA = \iint_S f(g(u, v), h(u, v))\left|\frac{\partial(x, y)}{\partial(u, v)}\right| du\,dv. \]
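Example 9.4.2. Evaluate $\iint_R 3xy\,dA$, where R is the region bounded by the lines
x − 2y = 0, x − 2y = −4, x + y = 4, and x + y = 1.
Solution: To begin, let u = x + y and v = x − 2y. Solving this system of equations for x and y produces $x = \frac{1}{3}(2u + v)$ and $y = \frac{1}{3}(u - v)$. The partial derivatives of x and y are
\[ \frac{\partial x}{\partial u} = \frac{2}{3}, \quad \frac{\partial x}{\partial v} = \frac{1}{3}, \quad \frac{\partial y}{\partial u} = \frac{1}{3} \quad\text{and}\quad \frac{\partial y}{\partial v} = -\frac{1}{3} \]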
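which implies that the Jacobian is
\[ \frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \begin{vmatrix} \dfrac{2}{3} & \dfrac{1}{3} \\[2mm] \dfrac{1}{3} & -\dfrac{1}{3} \end{vmatrix} = -\frac{2}{9} - \frac{1}{9} = -\frac{1}{3}. \]
Therefore, we obtain
\[ \begin{aligned} \iint_R 3xy\,dA &= \iint_S 3\left[\frac{1}{3}(2u + v)\right]\left[\frac{1}{3}(u - v)\right]\left|\frac{\partial(x, y)}{\partial(u, v)}\right| dv\,du \\ &= \int_1^4\int_{-4}^0 \frac{1}{9}(2u^2 - uv - v^2)\,dv\,du \\ &= \frac{1}{9}\int_1^4 \left[2u^2v - \frac{uv^2}{2} - \frac{v^3}{3}\right]_{-4}^{0} du \\ &= \frac{1}{9}\int_1^4 \left(8u^2 + 8u - \frac{64}{3}\right) du \\ &= \frac{1}{9}\left[\frac{8u^3}{3} + 4u^2 - \frac{64}{3}u\right]_1^4 \\ &= \frac{164}{9}. \end{aligned} \]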
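As a numerical cross-check of the value 164/9, the transformed integral can be approximated over the rectangle 1 ≤ u ≤ 4, −4 ≤ v ≤ 0 in the uv-plane. The sketch below uses a two-dimensional midpoint rule; the helper `midpoint_2d` and the grid size are illustrative choices.

```python
def midpoint_2d(func, u_range, v_range, n=200):
    """Midpoint-rule approximation of a double integral over a rectangle."""
    (u0, u1), (v0, v1) = u_range, v_range
    hu, hv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * hu
        for j in range(n):
            v = v0 + (j + 0.5) * hv
            total += func(u, v)
    return total * hu * hv

def integrand(u, v):
    x = (2 * u + v) / 3      # x in terms of u and v
    y = (u - v) / 3          # y in terms of u and v
    jacobian = abs((2 / 3) * (-1 / 3) - (1 / 3) * (1 / 3))   # |d(x, y)/d(u, v)| = 1/3
    return 3 * x * y * jacobian

value = midpoint_2d(integrand, (1.0, 4.0), (-4.0, 0.0))
print(value, 164 / 9)
```

The rectangular limits in u and v are what make the transformed integral easy to set up, which is the reason for choosing u = x + y and v = x − 2y.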
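Example 9.4.3. Suppose R is the region in the plane bounded by the hyperbolas xy = 1, xy = 3 and $x^2 - y^2 = 1$, $x^2 - y^2 = 4$. Find $\iint_R (x^2 + y^2)\,dx\,dy$.
Solution: Let u = xy and $v = x^2 - y^2$, so that 1 ≤ u ≤ 3 and 1 ≤ v ≤ 4. Then
\[ \frac{\partial(u, v)}{\partial(x, y)} = \begin{vmatrix} y & x \\ 2x & -2y \end{vmatrix} = -2(x^2 + y^2) \quad\text{and}\quad (x^2 + y^2)^2 = (x^2 - y^2)^2 + 4x^2y^2 = v^2 + 4u^2. \]
Hence we have
\[ \frac{\partial(x, y)}{\partial(u, v)} = -\frac{1}{2(x^2 + y^2)} = -\frac{1}{2\sqrt{4u^2 + v^2}}. \]
Therefore
\[ \iint_R (x^2 + y^2)\,dx\,dy = \int_1^4\int_1^3 \sqrt{4u^2 + v^2}\cdot\frac{1}{2\sqrt{4u^2 + v^2}}\,du\,dv = \int_1^4\int_1^3 \frac{1}{2}\,du\,dv = 3. \]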
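Example 9.4.4. Find the area of the region R bounded by the curves xy = 1, xy = 3 and $xy^{1.4} = 1$, $xy^{1.4} = 2$.
Solution: Let u = xy and $v = xy^{1.4}$, so that 1 ≤ u ≤ 3 and 1 ≤ v ≤ 2. Then
\[ \frac{\partial(u, v)}{\partial(x, y)} = \begin{vmatrix} y & x \\ y^{1.4} & 1.4xy^{0.4} \end{vmatrix} = 0.4xy^{1.4} = 0.4v. \]
So $\frac{\partial(x, y)}{\partial(u, v)} = \dfrac{1}{\frac{\partial(u, v)}{\partial(x, y)}} = \frac{2.5}{v}$. Consequently,
\[ \iint_R dx\,dy = \int_1^2\int_1^3 \frac{2.5}{v}\,du\,dv = 5\ln 2. \]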
9.5 Tutorial 8
1. Evaluate the integral.
(i) $\int_0^x (2x - y)\,dy$ (ii) $\int_x^{x^2} \frac{y}{x}\,dy$ (iii) $\int_{e^y}^{y} \frac{y\ln x}{x}\,dx$ (iv) $\int_0^{x^3} ye^{-y/x}\,dy$ (v) $\int_0^{\cos x} y\,dy$.
4. Sketch the region R whose area is given by the iterated integral. Then switch the order of integration and show that both orders yield the same area.
(i) $\int_0^1\int_0^2 dy\,dx$ (ii) $\int_0^2\int_{x/2}^1 dy\,dx$ (iii) $\int_0^4\int_{\sqrt{x}}^2 dy\,dx$ (iv) $\int_{-2}^2\int_0^{4-y^2} dx\,dy$.
5. Evaluate the iterated integral. (Note that it is necessary to switch the order of integration.)
(i) $\int_0^2\int_x^2 x\sqrt{1 + y^3}\,dy\,dx$ (ii) $\int_0^2\int_x^2 e^{-y^2}\,dy\,dx$ (iii) $\int_0^1\int_y^1 \sin x^2\,dx\,dy$ (iv) $\int_0^2\int_{y^2}^4 \sqrt{x}\sin x\,dx\,dy$
(v) $\int_0^1\int_y^1 \frac{dx\,dy}{1 + x^4}$ (vi) $\int_0^\pi\int_x^\pi \frac{\sin y}{y}\,dy\,dx$ (vii) $\int_0^1\int_y^1 e^{-x^2}\,dx\,dy$.
6. Use polar coordinates to evaluate each of the following integrals.
(i) $\int_0^1\int_0^{\sqrt{1-x^2}} \frac{dy\,dx}{4 - x^2 - y^2}$ (ii) $\int_0^2\int_0^{\sqrt{4-x^2}} (x^2 + y^2)^{3/2}\,dy\,dx$ (iii) $\int_0^1\int_0^{\sqrt{1-y^2}} \sin(x^2 + y^2)\,dx\,dy$.
7. Evaluate the triple integral.
(i) $\int_0^3\int_0^2\int_0^1 (x + y + z)\,dx\,dy\,dz$ (ii) $\int_{-1}^1\int_{-1}^1\int_{-1}^1 x^2y^2z^2\,dx\,dy\,dz$ (iii) $\int_1^4\int_0^1\int_0^x 2ze^{-x^2}\,dy\,dx\,dz$
(iv) $\int_1^4\int_1^{e^2}\int_0^{1/(xz)} \ln z\,dy\,dz\,dx$ (v) $\int_0^{\pi/2}\int_0^{y/2}\int_0^{1/y} \sin y\,dz\,dx\,dy$ (vi) $\int_0^9\int_0^{y/3}\int_0^{\sqrt{y^2 - 9x^2}} z\,dz\,dx\,dy$.
8. Find the Jacobian $\frac{\partial(x, y)}{\partial(u, v)}$ for the indicated change of variables.
(i) $x = -\frac{1}{2}(u - v)$, $y = \frac{1}{2}(u + v)$ (ii) x = au + bv, y = cu + dv (iii) x = u − uv, y = uv
(iv) x = u cos θ − v sin θ, y = u sin θ + v cos θ (v) $x = e^u\sin v$, $y = e^u\cos v$.
9. Evaluate $\iint_D \sqrt{x^2 + y^2}\,dx\,dy$, where D is the region bounded by $x^2 + y^2 = 4$ and $x^2 + y^2 = 9$.
10. Evaluate $\iint_D (x + y)^2\,dx\,dy$, where D is the parallelogram bounded by the lines
x + y = 0, x + y = 1, 2x − y = 0 and 2x − y = 3.
19. Calculate $\iint_R (x + y)^3\cos^2(x - y)\,dx\,dy$, where R is the region bounded by the lines
x + y = π, x + y = 5π, x − y = π and x − y = −π.
21. Calculate the area of the parallelogram bounded by the lines x + y = 1, x + y = 2 and
2x − 3y = 2, 2x − 3y = 5.
22. Evaluate $\iint_R (x^2 + y^2)\,dx\,dy$, where R is the region in the first quadrant bounded by the
hyperbolas $x^2 - y^2 = 6$, $x^2 - y^2 = 1$, 2xy = 4 and 2xy = 1. (Hint: Use $u = x^2 - y^2$, v = 2xy)
Z 1Z 1 Z √π Z √π
− x2
23. Evaluate (i) √
e y dydx (ii) sin(x2 ) dxdy.
0 x 0 y
Chapter 10
10.1 The Importance of Set Theory
One striking feature of humans is their inherent need, and ability, to group objects according to specific criteria. Our prehistoric ancestors grouped tools based on their hunting needs. They eventually evolved into strict hierarchical societies where a person belonged to one class and not another. Many of us today like to sort our clothes at home, or group the songs on our computer into playlists.
The idea of sorting out certain objects into similar groupings, or sets, is the most fundamental concept in modern mathematics. The theory of sets has, in fact, been the unifying framework for all mathematics since the German mathematician Georg Cantor formulated it in the 1870s. No field of mathematics could be described nowadays without referring to some kind of abstract set. A geometer, for example, may study the set of parabolic curves in three dimensions or the set of spheres in a variety of different spaces. An algebraist may work with a set of equations or a set of matrices. A statistician typically works with large sets of raw data. And the list goes on. You may have also read or heard that one of the most important unresolved problems in mathematics at the moment deals with the set of prime numbers (this problem in number theory is known as the Riemann Hypothesis; the Clay Mathematics Institute will award a million dollars to whoever solves it). As it turns out, even numbers are described by mathematicians in terms of sets!
More broadly, the concept of set membership, which lies at the heart of set theory, explains how
statements with nouns and predicates are formulated in our language – or any abstract language
like mathematics. Because of this, set theory is intimately connected to logic and serves as the
foundation for all of mathematics.
A set is a collection of objects called the elements or members of the set. These objects could be
anything conceivable, including numbers, letters, colors, even sets themselves! However, none of the
objects of the set can be the set itself. We discard this possibility to avoid running into Russell’s
Paradox, a famous problem in mathematical logic unearthed by the great British logician Bertrand
Russell in 1901.
Let’s look at different types of numbers that we can have in our sets.
1. Natural Numbers
The set of natural numbers is {1, 2, 3, 4, . . . } and is denoted by N.
2. Integers
The set of integers is {. . . , −4, −3, −2, −1, 0, 1, 2, 3, 4, . . . } and is denoted by Z. The symbol Z comes from the German word Zahlen, which means numbers. The non-negative integers {0, 1, 2, 3, 4, . . . } are often denoted by Z+. All natural numbers are integers.
3. Rational Numbers
The set of rational numbers is denoted by Q and consists of all fractions, i.e., x ∈ Q if x can be written in the form p/q, where p, q ∈ Z with q ≠ 0.
4. Real Numbers
The real numbers are denoted by R.
5. Complex Numbers
The complex numbers are denoted by C.
10.4 Notation
1. A, B, C, . . . for sets.
2. a, b, c, . . . or x, y, z, . . . for members.
3. b ∈ A, if b belongs to A.
4. c ∉ A, if c does not belong to A.
5. ∅ is used for the empty set. There is exactly one set, the empty set or null set, which has no members at all.
6. A set with only one member is called a singleton or singleton set. For example, {x}.
A set is said to be well-defined if it is unambiguous which elements belong to the set. In other
words, if A is well-defined, then the question “Does x ∈ A?” can always be answered for any object
x.
For example, if we define C as the set of large numbers, then it is unclear which numbers should be
considered “large”. C is therefore not a well-defined set. Similarly, the set of all great Zimbabwean
footballers, or the set of all expensive restaurants in Harare, are also not well-defined.
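10.6 Specification of Sets
There are three main ways to specify a set:
1. By listing all its members (List Notation).
2. By stating a property that its members satisfy (Predicate Notation).
3. By defining a set of rules which generates (defines) its members (Recursive Rules).
List Notation
This is suitable for finite sets. We list the names of the elements of the set, separated by commas, and enclose them in braces. For example
Predicate Notation
Recursive Rules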
10.7 The Empty set (Null Set)
We have that the fundamental property of a set is that we can assert of each object whether or not
it is a member of the set.
Consider a set constructed by asserting of each object that it is not a member of the set. This set
has no members and is therefore called the empty set.
Definition 10.7.1. The null or empty set is the set that does not contain any elements, denoted
∅ = {} = {x | x ≠ x}.
Example 10.7.1. (i) {x ∈ R | x² = −1} (ii) {x ∈ Z | x² = 2}.
Theorem 10.7.1. There is exactly one empty set.
Two sets are identical if and only if (iff) they have the same elements or both are empty. So A = B
iff, for every x, x ∈ A ⇔ x ∈ B.
Example 10.8.1. {0, 2, 4} = {x | x is an even non-negative integer less than 5}.
The number of elements in a set A is called the cardinality of A, denoted by |A|. The cardinality
of a finite set is a natural number. Infinite sets also have cardinalities but they are not natural
numbers. The set A is said to be countable or enumerable if there is a way to list the elements
of A.
A paradox (antinomy) is an apparently true statement that seems to lead to a logical self-contradiction.
It is important to note that a given property P(x) does not necessarily determine a set, i.e., we cannot say that to an arbitrary property P there corresponds a set whose elements satisfy the property P.
Consider the following. There was once a barber in a town where every man either shaved himself or was shaved by the barber, and the barber shaved only the men who did not shave themselves. Did the barber shave himself?
Suppose that he did shave himself. Since he shaved only the men in town who did not shave themselves, he did not shave himself.
Suppose instead that he did not shave himself. Since every man in town either shaved himself or was shaved by the barber, the barber shaved him, so he did shave himself. Either way, we have a contradiction.
Russell observed that if S is a set, then either S ∈ S or S ∉ S, since a given object is either a member of a given set or is not a member of that set. Consider the set of all sets that are not members of themselves, R = {x | x is a set and x ∉ x}. Since R is an object, either R ∈ R or R ∉ R. If R ∈ R, then R satisfies the defining property of R, so R ∉ R; if R ∉ R, then R satisfies the defining property, so R ∈ R. In both cases we have inferred the paradox that R ∈ R iff R ∉ R. In other words, the assumption that R is a set has led to a contradiction, and therefore there is no such thing as the set of all sets. To avoid unnecessary paradoxes, we assume the existence of the universal set, U. All this leads to the following problems
After this paradox was described, set theory had to be reformulated axiomatically as axiomatic set theory.
10.10 Inclusion
Definition 10.10.1. Having fixed our universal set U, if A and B are sets (with all members in U), we write A ⊆ B or B ⊇ A iff, for all x ∈ U, x ∈ A =⇒ x ∈ B. (⊆ is the set inclusion symbol.)
Proof. Let x ∈ A, then since A ⊆ B, we have x ∈ B and given that B ⊆ C, we conclude that
x ∈ C, thus A ⊆ C.
Example 10.10.1. (i) {a, b} ⊆ {d, a, b, e} (ii) {a, b} ⊆ {a, b} (iii) {a, b} ⊂ {d, a, b, e}
(iv) {a, b} ⊄ {a, b}.
Note that the empty set is a subset of every set, ∅ ⊆ A, for every set A and that for any set A, we
have A ⊆ A.
10.11 Axiom of Extensionality
The set of all subsets of A is called the power set of A and is denoted by P(A). When A is finite, $|P(A)| = 2^{|A|}$.
Example 10.11.1. If A = {a, b}, then P(A) = {∅, {a}, {b}, {a, b}}.
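The power set of a small finite set can be enumerated directly. The following sketch uses Python's standard `itertools.combinations`; the function name `power_set` is an illustrative choice. Subsets are stored as `frozenset` objects because ordinary Python sets cannot be members of other sets.

```python
from itertools import combinations

def power_set(elements):
    """Return the set of all subsets of `elements`, each as a frozenset."""
    items = list(elements)
    return {
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    }

A = {"a", "b"}
P = power_set(A)
print(len(P), 2 ** len(A))
```

For A = {a, b} this produces the four subsets listed in the example, in agreement with the count $2^{|A|}$.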
Let A and B be arbitrary sets. The union of A and B, written A ∪ B, is the set whose elements
are just the elements of A or B or both.
A ∪ B := {x|x ∈ A or x ∈ B}.
Example 10.12.1. Let K = {a, b}, L = {c, d}, M = {b, d}, then K ∪ L = {a, b, c, d},
K ∪ M = {a, b, d}, L ∪ M = {b, c, d}, (K ∪ L) ∪ M = K ∪ (L ∪ M ) = {a, b, c, d}, K ∪ K = K,
K ∪ ∅ = ∅ ∪ K = K = {a, b}.
The intersection of A and B, written A ∩ B, is the set whose elements are just the elements of
both A and B.
A ∩ B := {x|x ∈ A and x ∈ B}.
Example 10.12.2. K ∩ L = ∅, K ∩ M = {b}, L ∩ M = {d}, (K ∩ L) ∩ M = K ∩ (L ∩ M ) = ∅,
K ∩ K = K, K ∩ ∅ = ∅ ∩ K = ∅.
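The unions and intersections in the two examples above can be reproduced with Python's built-in `set` type, where `|` denotes union and `&` denotes intersection. This is only an illustration of the definitions, using the same sets K, L and M.

```python
K = {"a", "b"}
L = {"c", "d"}
M = {"b", "d"}

union_KL = K | L              # elements in K or L or both
union_KM = K | M
intersection_KL = K & L       # elements in both K and L
intersection_KM = K & M

print(union_KL, union_KM, intersection_KL, intersection_KM)
print((K | L) | M == K | (L | M))   # union is associative
print(K & set() == set())           # intersecting with the empty set gives the empty set
```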
2. An element x belongs to the union A ∪ B if x belongs to A or x belongs to B; hence every element in A belongs to A ∪ B and every element in B belongs to A ∪ B, i.e.,
A ⊆ A ∪ B and B ⊆ A ∪ B.
Definition 10.13.1. Two sets A and B are called disjoint sets if the intersection of A and B is
the null set i.e., A ∩ B = ∅.
Definition 10.14.1. The difference A minus B, written A\B or A − B, is the set obtained by removing from A all elements which are in B (also called the relative complement, or the complement of B relative to A). It is defined as
A − B := {x | x ∈ A and x ∉ B}.
The complement of a set A is the set of elements which do not belong to A, i.e., the difference of the universal set U and A. Denote the complement of A by A′ or $A^c$.
A′ = {x | x ∈ U and x ∉ A} or A′ = U − A.
Example 10.14.2. With the natural numbers N as the universal set, let E = {2, 4, 6, . . . }, the set of all even natural numbers. Then $E^c$ = {1, 3, 5, . . . }, the set of odd natural numbers.
10.15 Venn Diagrams
A simple and instructive way of illustrating the relationship between sets is the use of the so-called Venn-Euler diagrams, or simply Venn diagrams.
4. Distributive Laws (i) X ∪(Y ∩Z) = (X ∪Y )∩(X ∪Z) (ii) X ∩(Y ∪Z) = (X ∩Y )∪(X ∩Z).
10.17 Counting Elements in Sets
We have considered the problem of showing that two sets are the same, however this technique
becomes tedious should the expressions involved be at all complicated. We shall develop an algebra
of sets, to assist us in simplifying a given expression. The following basic laws are easily established.
Law 1 : $(A^c)^c = A$ Law 2 : A ∪ B = B ∪ A Law 3 : A ∩ B = B ∩ A
Law 4 : A ∪ (B ∪ C) = (A ∪ B) ∪ C Law 5 : A ∩ (B ∩ C) = (A ∩ B) ∩ C
Law 6 : A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C) Law 7 : A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
Law 8 : $(A ∪ B)^c = A^c ∩ B^c$ Law 9 : $(A ∩ B)^c = A^c ∪ B^c$ Law 10 : $U^c = ∅$
Law 11 : $∅^c = U$ Law 12 : A ∪ ∅ = A Law 13 : A ∪ U = U Law 14 : A ∩ U = A
Law 15 : A ∩ ∅ = ∅ Law 16 : $A ∪ A^c = U$ Law 17 : $A ∩ A^c = ∅$.
Example 10.18.1. By using the algebra of sets, show that A ∪ (B ∩ A^c) = A ∪ B.
Proof.
A ∪ (B ∩ A^c) = (A ∪ B) ∩ (A ∪ A^c) by Law 6
= (A ∪ B) ∩ U by Law 16
= A ∪ B by Law 14.
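The identity just proved, together with De Morgan's laws (Laws 8 and 9), can be spot-checked with Python's built-in set type. This is only a sketch: the universal set U and the sets A and B below are arbitrary choices made for illustration, and a check on particular sets is not a substitute for the proof.

```python
# Universal set and two subsets, chosen arbitrarily for illustration.
U = set(range(1, 11))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}


def complement(X, universe=U):
    """Return the complement of X relative to the universal set."""
    return universe - X


lhs = A | (B & complement(A))   # A ∪ (B ∩ A^c)
rhs = A | B                     # A ∪ B
print(lhs == rhs)

# De Morgan's laws (Laws 8 and 9) on the same sets.
law8 = complement(A | B) == complement(A) & complement(B)
law9 = complement(A & B) == complement(A) | complement(B)
print(law8, law9)
```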
Definition 10.19.1. Let n be any natural number and let a1 , a2 , . . . , an be any objects. Then
(a1 , a2 , . . . , an ) denotes the ordered n-tuple with first term a1 , second term a2 , . . . and nth term an .
Example 10.19.1. (5, 7) denotes the ordered pair whose first term is 5 and second term 7. Note
that (5, 7, 2) is called an ordered triple, (5, 7, 2, 4) is called an ordered 4-tuple.
The fundamental statement we can make about an ordered n-tuple is that a given object is the kth
term of an ordered n-tuple.
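Definition 10.19.2. Let A and B be any non-empty sets, then the Cartesian product of A and B is
the set of all ordered pairs
A × B := {(a, b) | a ∈ A and b ∈ B}.
If A and B are both finite sets, then |A × B| = |A| · |B|. If A = B, we sometimes write A^2 for
A × A.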
Example 10.19.2. 1. If A = {1, 2} and B = {2, 3, 4}, then A × B = {(1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (2, 4)}
and B × A = {(2, 1), (2, 2), (3, 1), (3, 2), (4, 1), (4, 2)}.
Notice that A × B ≠ B × A, in general.
2. The Cartesian product R × R = R^2 is the set of all ordered pairs of real numbers and this
represents the 2-dimensional Cartesian plane.
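The first part of the example above can be reproduced with the standard library function itertools.product, which enumerates ordered pairs. The variable names AxB and BxA are chosen here for illustration only.

```python
from itertools import product

A = {1, 2}
B = {2, 3, 4}

# Ordered pairs (a, b) with a in A and b in B, and the reverse product.
AxB = set(product(A, B))
BxA = set(product(B, A))

print(sorted(AxB))
print(len(AxB) == len(A) * len(B))   # |A × B| = |A| · |B|
print(AxB == BxA)                    # the two products differ in general
```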
1. A × (B ∪ C) = (A × B) ∪ (A × C).
2. A × (B ∩ C) = (A × B) ∩ (A × C).
3. (A × B) ∩ (C × D) = (A ∩ C) × (B ∩ D).
4. (A × B) ∪ (C × D) ⊆ (A ∪ C) × (B ∪ D).
5. (A − B) × C = (A × C) − (B × C).
Now consider any element (u, v) ∈ (A × C) ∪ (B × C). This implies that (u, v) ∈ (A × C) or
(u, v) ∈ (B × C). In the first case u ∈ A and v ∈ C and in the second case u ∈ B and v ∈ C. Thus
u ∈ (A∪B) and v ∈ C which implies (u, v) ∈ (A∪B)×C. Therefore (A×C)∪(B×C) ⊆ (A∪B)×C.
Hence (A ∪ B) × C = (A × C) ∪ (B × C).
10.21 Tutorial 9
1. Let A = {2, 4, 6} and B = {3, 4, 5, 6, 7, 8, 9, 10}. What is the cardinality of the sets A ∩ B and A ∪ B?
3. Consider the universal set U = {1, 2, 3, . . . , 9} and the sets: A = {1, 2, 3, 4, 5}, B = {4, 5, 6, 7},
C = {5, 6, 7, 8, 9}, D = {1, 3, 5, 7, 9}, E = {2, 4, 6, 8}, F = {1, 5, 9}. Find:
(b) A × (B ∩ C) = (A × B) ∩ (A × C),
(c) (A × B) ∪ (C × D) ⊆ (A ∪ C) × (B ∪ D).
Introduction to Probability Theory
10.22 Probability
• You need to decide whether a coin is loaded (i.e., whether it tends to favor one side over the
other when tossed). You toss the coin 6 times and in all cases you get “Tails”. Would you
say that the coin is loaded?
• You are trying to figure out whether newborn babies can distinguish green from red. To do
so you present two colored cards (one green, one red) to 6 newborn babies. You make sure
that the 2 cards have equal overall luminance so that they are indistinguishable if recorded
by a black and white camera. The 6 babies are randomly divided into two groups. The first
group gets the red card on the left visual field, and the second group on the right visual field.
You find that all 6 babies look longer at the red card than at the green card. Would you say
that babies can distinguish red from green?
• A pregnancy test has a 99% sensitivity (i.e., 99 out of 100 pregnant women test positive) and 95%
specificity (i.e., 95 out of 100 non-pregnant women test negative). A woman believes she has a
10% chance of being pregnant. She takes the test and tests positive. How should she combine
her prior beliefs with the results of the test?
• You need to design a system that detects a sinusoidal tone of 1000Hz in the presence of white
noise. How should you design the system to solve this task optimally?
• How should the photoreceptors in the human retina be interconnected to maximize information
transmission to the brain?
While these tasks appear different from each other, they all share a common problem: The need to
combine different sources of uncertain information to make rational decisions. Probability theory
provides a very powerful mathematical framework to do so. We now go into the mathematical
aspects of probability theory.
10.23 Sample Spaces
A set S that consists of all possible outcomes of a random experiment is called a sample space,
and each outcome is called a sample point. Often there will be more than one sample space that
can describe outcomes of an experiment, but there is usually only one that will provide the most
information.
Example 10.23.1. If we toss a die, then one sample space is given by {1, 2, 3, 4, 5, 6} while another
is {even, odd}. It is clear, however, that the latter would not be adequate to determine, for example,
whether an outcome is divisible by 3.
The sample space is also called the outcome space, reference set, and universal set. It is
often useful to portray a sample space graphically. In such cases, it is desirable to use numbers in
place of letters whenever possible. If a sample space has a finite number of points, it is called a
finite sample space. If it has as many points as there are natural numbers 1, 2, 3, . . . , it is called a
countably infinite sample space. If it has as many points as there are in some interval on the x axis,
such as 0 ≤ x ≤ 1, it is called a noncountably infinite sample space. A sample space that is finite
or countably infinite is often called a discrete sample space, while one that is noncountably infinite is
called a nondiscrete sample space.
Example 10.23.2. The sample space resulting from tossing a die yields a discrete sample space.
However, picking any number, not just integers, from 1 to 10, yields a non-discrete sample space.
10.24 Events
We have defined outcomes as the elements of a sample space S. In practice, we are interested in
assigning probability values not only to outcomes but also to sets of outcomes. For example, we
may want to know the probability of getting an even number when rolling a die. In other words,
we want the probability of the set {2, 4, 6}. An event is a subset A of the sample space S, i.e., it is
a set of possible outcomes. If the outcome of an experiment is an element of A, we say that the event
A has occurred. An event consisting of a single point of S is called a simple or elementary event.
As particular events, we have S itself, which is the sure or certain event since an element of S must
occur, and the empty set ∅, which is called the impossible event because an element of ∅ cannot
occur.
By using set operations on events in S, we can obtain other events in S. For example, if A and B
are events, then
4. A − B = A ∩ B′ is the event “A but not B.” In particular, A′ = S − A.
If the sets corresponding to events A and B are disjoint, i.e., A ∩ B = ∅, we often say that the
events are mutually exclusive. This means that they cannot both occur. We say that a collection
of events A1 , A2 , . . . , An is mutually exclusive if every pair in the collection is mutually exclusive.
Example 10.24.1. Consider an experiment of tossing a coin twice. Let A be the event “at least one
head occurs” and B the event “the second toss results in a tail.” Find the events A ∪ B, A ∩ B, A′
and A − B.
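One way to work through this example is to list the sample space explicitly and apply set operations to it. The sketch below assumes each outcome is written as a pair such as ("H", "T"), meaning a head followed by a tail; that encoding is a choice made here, not part of the example.

```python
from itertools import product

# Sample space for two tosses: ("H", "H"), ("H", "T"), ("T", "H"), ("T", "T").
S = set(product("HT", repeat=2))

A = {s for s in S if "H" in s}       # at least one head occurs
B = {s for s in S if s[1] == "T"}    # the second toss results in a tail

union = A | B            # A ∪ B
intersection = A & B     # A ∩ B
A_complement = S - A     # A′
difference = A - B       # A − B

for name, event in [("A ∪ B", union), ("A ∩ B", intersection),
                    ("A′", A_complement), ("A − B", difference)]:
    print(name, sorted(event))
```

Here A ∪ B turns out to be the whole sample space, while A′ contains only the outcome with two tails.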
In any random experiment there is always uncertainty as to whether a particular event will or will
not occur. As a measure of the chance, or probability, with which we can expect the event to occur,
it is convenient to assign a number between 0 and 1. If we are sure or certain that an event will
occur, we say that its probability is 100% or 1. If we are sure that the event will not occur, we
say that its probability is zero. If, for example, the probability is 1/4, we would say that there is
a 25% chance it will occur and a 75% chance that it will not occur. Equivalently, we can say that
the odds against occurrence are 75% to 25%, or 3 to 1.
There are two important procedures by means of which we can estimate the probability of an
event.
Both the classical and frequency approaches have serious drawbacks, the first because the words
“equally likely” are vague and the second because the “large number” involved is vague. Because
of these difficulties, mathematicians have been led to an axiomatic approach to probability.
Suppose we have a sample space S. If S is discrete, all subsets correspond to events and conversely;
if S is nondiscrete, only special subsets (called measurable) correspond to events. To each event A
in the class C of events, we associate a real number P(A). The function P is called a probability function,
and P(A) the probability of the event A, if the following axioms are satisfied.
From the above axioms we can now prove various theorems on probability that are important in
further work.
Theorem 10.27.1. If A1 ⊂ A2 , then P (A1 ) ≤ P (A2 ) and P (A2 − A1 ) = P (A2 ) − P (A1 ).
Theorem 10.27.2. For every event A, 0 ≤ P(A) ≤ 1, i.e., a probability lies between 0 and 1.
Theorem 10.27.3. For ∅, the empty set, P (∅) = 0, i.e., the impossible event has probability zero.
Theorem 10.27.4. If A′ is the complement of A, then P(A′) = 1 − P(A).
Theorem 10.27.5. If A = A1 ∪ A2 ∪ A3 ∪ . . . ∪ An , where A1 , A2 , . . . , An are mutually exclusive
events, then
P (A) = P (A1 ) + P (A2 ) + P (A3 ) + . . . + P (An ).
In particular, if A = S, the sample space, then
P (A1 ) + P (A2 ) + P (A3 ) + . . . + P (An ) = 1.
Theorem 10.27.6. If A and B are any two events, then P (A ∪ B) = P (A) + P (B) − P (A ∩ B).
More generally, if A1 , A2 , A3 are any three events, then
P (A1 ∪A2 ∪A3 ) = P (A1 )+P (A2 )+P (A3 )−P (A1 ∩A2 )−P (A2 ∩A3 )−P (A3 ∩A1 )+P (A1 ∩A2 ∩A3 ).
Generalizations to n events can also be made.
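Theorem 10.27.6 can be checked numerically on a small example. The sketch below assumes a fair die with equally likely outcomes and uses three events A, B and C picked only for illustration; exact fractions avoid rounding issues.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space of a fair die (assumed)


def P(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event & S), len(S))


A = {2, 4, 6}        # even number
B = {1, 2, 3}        # number less than 4
C = {3, 4, 5, 6}     # number greater than 2

# Two events: P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
two_lhs = P(A | B)
two_rhs = P(A) + P(B) - P(A & B)

# Three events, as in the second formula of the theorem.
three_lhs = P(A | B | C)
three_rhs = (P(A) + P(B) + P(C)
             - P(A & B) - P(B & C) - P(C & A)
             + P(A & B & C))

print(two_lhs, two_rhs)
print(three_lhs, three_rhs)
```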
Theorem 10.27.7. For any events A and B, P(A) = P(A ∩ B) + P(A ∩ B′).
Theorem 10.27.8. If an event A must result in the occurrence of one of the mutually exclusive
events A1 , A2 , . . . , An , then
P (A) = P (A ∩ A1 ) + P (A ∩ A2 ) + · · · + P (A ∩ An ).
It follows that we can arbitrarily choose any nonnegative numbers for the probabilities of these
simple events as long as the previous equation is satisfied. In particular, if we assume equal
probabilities for all simple events, then
\[ P(A_k) = \frac{1}{n}, \quad k = 1, 2, \ldots, n. \]
And if A is any event made up of h such simple events, we have
\[ P(A) = \frac{h}{n}. \]
This is equivalent to the classical approach to probability. We could of course use other procedures
for assigning probabilities, such as the frequency approach.
Assigning probabilities provides a mathematical model, the success of which must be tested by
experiment in much the same manner that the theories in physics or other sciences must be tested
by experiment.
Example 10.28.1. A single die is tossed once. Find the probability of a 2 or 5 turning up.
Solution: The sample space is S = {1, 2, 3, 4, 5, 6}. If we assign equal probabilities to the sample
points, i.e., if we assume that the die is fair, then
\[ P(1) = P(2) = \cdots = P(6) = \frac{1}{6}. \]
The event that either 2 or 5 turns up is indicated by 2 ∪ 5. Therefore,
\[ P(2 \cup 5) = P(2) + P(5) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}. \]
10.29 Conditional Probability
Let A and B be two events such that P(A) > 0. Denote by P(B|A) the probability of B given that A
has occurred. Since A is known to have occurred, it becomes the new sample space replacing the
original S. From this we are led to the definition
\[ P(B \mid A) \equiv \frac{P(A \cap B)}{P(A)} \tag{10.1} \]
or
\[ P(A \cap B) = P(A)\, P(B \mid A). \]
In words, this is saying that the probability that both A and B occur is equal to the probability
that A occurs times the probability that B occurs given that A has occurred. We call P (B|A)
the conditional probability of B given A, i.e., the probability that B will occur given that A has
occurred. It is easy to show that conditional probability satisfies the axioms of probability previously
discussed.
Example 10.29.1. Find the probability that a single toss of a die will result in a number less than
4 if (a) no other information is given and (b) it is given that the toss resulted in an odd number.
Solution:
(a) Let B denote the event {less than 4}. Since B is the union of the events 1, 2, or 3 turning up,
we see by Theorem 10.27.5 that
\[ P(B) = P(1) + P(2) + P(3) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2} \]
assuming equal probabilities for the sample points.
(b) Letting A be the event {odd number}, we see that P(A) = 3/6 = 1/2. Also, P(A ∩ B) = 2/6 = 1/3.
Then
\[ P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{1/3}{1/2} = \frac{2}{3}. \]
Hence, the added knowledge that the toss results in an odd number raises the probability from
1/2 to 2/3.
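The same calculation can be sketched by counting outcomes directly. The helper names P and P_given are introduced here for illustration, and a fair die is assumed, as in the solution.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # fair die, equally likely outcomes (assumed)


def P(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event & S), len(S))


def P_given(B, A):
    """Conditional probability P(B|A) = P(A ∩ B) / P(A), for P(A) > 0."""
    if P(A) == 0:
        raise ValueError("P(A) must be positive")
    return P(A & B) / P(A)


B = {1, 2, 3}   # number less than 4
A = {1, 3, 5}   # odd number

print(P(B))            # unconditional probability
print(P_given(B, A))   # probability given that the toss is odd
```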
10.30 Theorems on Conditional Probability
In words, the probability that A1 and A2 and A3 all occur is equal to the probability that A1
occurs times the probability that A2 occurs given that A1 has occurred times the probability that
A3 occurs given that both A1 and A2 have occurred. The result is easily generalized to n events.
Theorem 10.30.2. If an event A must result in one of the mutually exclusive events A1 , A2 , . . . , An ,
then
If P(B|A) = P(B), i.e., the probability of B occurring is not affected by the occurrence or
nonoccurrence of A, then we say that A and B are independent events. This is equivalent to
P(A ∩ B) = P(A)P(B).
Notice also that if this equation holds, then A and B are independent.
We say that three events A1 , A2 , A3 are independent if they are pairwise independent.
and
Both of these properties must hold in order for the events to be independent. Independence of more
than three events is easily defined.
Note: In order to use this multiplication rule, all of your events must be independent.
Suppose that A1 , A2 , . . . , An are mutually exclusive events whose union is the sample space S, i.e.,
one of the events must occur. Then if A is any event, we have the following important theorem:
Theorem 10.32.1. (Bayes’ Rule):
\[ P(A_k \mid A) = \frac{P(A_k)\, P(A \mid A_k)}{\sum_{j=1}^{n} P(A_j)\, P(A \mid A_j)}. \tag{10.8} \]
This enables us to find the probabilities of the various events A1 , A2 , . . . , An that can occur. For
this reason Bayes’ theorem is often referred to as a theorem on the probability of causes.
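As an illustration of (10.8), the sketch below applies Bayes' rule to the pregnancy test question raised at the start of the probability section: a prior of 10%, 99 out of 100 pregnant women testing positive, and 95 out of 100 non-pregnant women testing negative. The function name bayes and the list layout are choices made for this example.

```python
from fractions import Fraction


def bayes(priors, likelihoods, k):
    """Return P(A_k | A) from the priors P(A_j) and likelihoods P(A | A_j)."""
    total = sum(p * l for p, l in zip(priors, likelihoods))
    return priors[k] * likelihoods[k] / total


# A_1 = pregnant, A_2 = not pregnant; A = the test is positive.
priors = [Fraction(10, 100), Fraction(90, 100)]
# P(positive | pregnant) = 99/100, P(positive | not pregnant) = 1 − 95/100.
likelihoods = [Fraction(99, 100), Fraction(5, 100)]

posterior = bayes(priors, likelihoods, 0)
print(posterior, float(posterior))
```

Under these assumptions the positive result raises the woman's probability of being pregnant from 1/10 to 11/16.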
10.33 Tutorial 10
1. There are 3 arrangements of the word DAD, namely DAD, ADD, and DDA. How many
arrangements are there of the word PROBABILITY?
2. A ball is drawn at random from a box containing 8 red balls, 17 white balls, and 9 blue balls.
Determine the probability that it is
(i) white, (ii) not blue, (iii) red or blue, (iv) neither white nor red.
3. A card is picked from a deck of 52 playing cards, without replacement, and then another one
is picked. What is the probability of picking (i) two red cards, (ii) one of each colour.
4. A die is loaded in such a way that each odd number is twice as likely to occur as each even
number. Find P (G), where G is the event that a number greater than 3 occurs on a single
roll of the die.
7. From a batch of 100 items of which 20 are defective, exactly two items are chosen, one at a
time without replacement. Calculate the probabilities that:
8. The punctuality of buses has been investigated by considering a number of bus journeys. In
the sample, 60% of buses had a destination of Masvingo, 20% Bulawayo and 20% Mutare.
The probabilities of a bus arriving late in Masvingo, Bulawayo or Mutare are 30%, 25% and
20% respectively. If a late bus is picked at random from the group under consideration, what
is the probability that it terminated in Masvingo?
9. Machines M and N produce 10% and 90% respectively of the production of a component
intended for the motor industry. From experience, it is known that the probability that
machine M produces a defective component is 0.01 while the probability that machine N
produces a defective component is 0.05. If a component is selected at random from a day's
production and is found to be defective, find the probability that it was made by
(a) machine M
(b) machine N.
Complex Numbers and Polynomials
There are no secrets about the world of nature. There are secrets about the thoughts and intentions of men.
—Robert Oppenheimer
10.34 Introduction
No one person invented complex numbers, but controversies surrounding the use of these numbers
existed in the sixteenth century. In their quest to solve polynomial equations by formulas involving
radicals, early dabblers in mathematics were forced to admit that there were other kinds of numbers
besides positive integers. Equations such as x^2 + 2x + 2 = 0 and x^3 = 6x + 4 that yielded solutions
−1 + √−1 and ∛(2 + √−4) + ∛(2 − √−4) caused particular consternation within the community of
fledgling mathematical scholars because everyone knew that there are no numbers such as √−1
and √−4, numbers whose square is negative. Such numbers exist only in one’s imagination, or
as one philosopher opined, “the imaginary, (the) bosom child of complex mysticism.” Over time
these imaginary numbers did not go away, mainly because mathematicians as a group are tenacious
and some are even practical. A famous mathematician held that even though they exist in our
imagination, nothing prevents us from employing them in calculations. Mathematicians also hate
to throw anything away. After all, a memory still lingered that negative numbers at first were
branded fictitious. The concept of number evolved over centuries; gradually the set of numbers grew
from just positive integers to include rational numbers, negative numbers, and irrational numbers.
But in the eighteenth century the number concept took a gigantic evolutionary step forward when
the German mathematician Carl Friedrich Gauss put the so-called imaginary numbers (or complex
numbers, as they were now beginning to be called) on a logical and consistent footing by treating
them as an extension of the real number system.
The set of all complex numbers is usually denoted by C. Since x^2 ≥ 0 for every real number x, the
equation
x^2 + 1 = 0
has no real solutions.
Complex numbers are usually written in the form a + bi where a and b are real numbers or can be
regarded as the ordered pair (a, b).
Geometrically, a complex number can be viewed either as a point or as a vector in the xy-plane.
Let us denote
z = a + bi.
The real number a is called the real part of z and the real number b is called the imaginary part
of z.
When complex numbers are represented geometrically in the xy-coordinate system, the x-axis is
called the real axis, the y-axis, the imaginary axis, and the plane is called the complex plane.
Definition 10.35.1. Two complex numbers a + bi and c + di are defined to be equal, written
a + bi = c + di, if a = c and b = d.
If a = 0, then a + bi reduces to 0 + bi = bi. These complex numbers, which
correspond to points on the imaginary axis, are called purely imaginary numbers. For example,
z = 8i is a purely imaginary number.
10.35.1 Operations
Solution:
z1 + z2 = (4 − 5i) + (−1 + 6i) = (4 − 1) + (−5 + 6)i = 3 + i.
z1 − z2 = (4 − 5i) − (−1 + 6i) = (4 + 1) + (−5 − 6)i = 5 − 11i.
3z1 = 3(4 − 5i) = 12 − 15i.
−z2 = −1(z2 ) = (−1)(−1 + 6i) = 1 − 6i.
Multiplying two complex numbers as (a + bi)(c + di) and using i^2 = −1 yields
(a + bi)(c + di) = ac + bdi^2 + adi + bci = (ac − bd) + (ad + bc)i.
Example 10.35.3. 1. (3 + 2i)(4 + 5i) = (3 · 4 − 2 · 5) + (3 · 5 + 2 · 4)i = 2 + 23i.
2. i^2 = (0 + i)(0 + i) = (0 · 0 − 1 · 1) + (0 · 1 + 1 · 0)i = −1.
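These operations can be reproduced with Python's built-in complex type, which writes the imaginary unit as j. The sketch below uses z1 = 4 − 5i and z2 = −1 + 6i from the solution above; the helper multiply is a hypothetical name that spells out the product rule on ordered pairs (a, b) and (c, d).

```python
z1 = complex(4, -5)    # 4 − 5i
z2 = complex(-1, 6)    # −1 + 6i

total = z1 + z2        # 3 + i
difference = z1 - z2   # 5 − 11i
scaled = 3 * z1        # 12 − 15i
negative = -z2         # 1 − 6i


def multiply(a, b, c, d):
    """Return (a + bi)(c + di) as the pair (ac − bd, ad + bc)."""
    return (a * c - b * d, a * d + b * c)


product_pair = multiply(3, 2, 4, 5)              # pair form of (3 + 2i)(4 + 5i)
product_builtin = complex(3, 2) * complex(4, 5)  # same product, built-in type

print(total, difference, scaled, negative)
print(product_pair, product_builtin)
```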
10.35.2 Rules of Complex Arithmetic
1. z1 + z2 = z2 + z1 .
2. z1 z2 = z2 z1 .
3. z1 + (z2 + z3 ) = (z1 + z2 ) + z3 .
5. z1 (z2 + z3 ) = z1 z2 + z1 z3 .
6. 0 + z = z.
7. z + (−z) = 0.
8. 1 · z = z
z̄ = a − bi.
3. If z = 4, then z̄ = 4.
10.36.2 Modulus of a Complex Number
Definition 10.36.1. The modulus of a complex number z = a + bi, denoted |z|, is defined by
\[ |z| = \sqrt{a^2 + b^2}. \]
The modulus of a complex number z has the additional properties |z1 z2| = |z1||z2| and
|z1/z2| = |z1|/|z2| (for z2 ≠ 0).
For division
\[ \frac{z_1}{z_2} = \frac{z_1 \overline{z_2}}{|z_2|^2}. \]
Example 10.36.3. Express (3 + 4i)/(1 − 2i) in the form a + bi.
Solution:
\[
\frac{3 + 4i}{1 - 2i} = \frac{(3 + 4i)(1 + 2i)}{(1 - 2i)(1 + 2i)}
= \frac{3 + 6i + 4i + 8i^2}{1 + 2i - 2i - 4i^2}
= \frac{-5 + 10i}{5}
= -1 + 2i.
\]
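The division rule z1/z2 = z1 z̄2 / |z2|^2 can be written out on real and imaginary parts. The function divide below is a hypothetical helper, not part of any library; it uses exact fractions so that the result of the example appears without rounding.

```python
from fractions import Fraction


def divide(a, b, c, d):
    """Return (a + bi)/(c + di) as the pair (real part, imaginary part).

    Multiplies numerator and denominator by the conjugate c − di, so the
    denominator becomes |c + di|^2 = c^2 + d^2.
    """
    denominator = c * c + d * d
    if denominator == 0:
        raise ZeroDivisionError("division by the complex number 0")
    real = Fraction(a * c + b * d, denominator)
    imag = Fraction(b * c - a * d, denominator)
    return real, imag


quotient = divide(3, 4, 1, -2)   # (3 + 4i)/(1 − 2i)
print(quotient)
```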
10.36.4 Properties of the Conjugate
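(a) $\overline{z_1 + z_2} = \overline{z_1} + \overline{z_2}$.
(b) $\overline{z_1 - z_2} = \overline{z_1} - \overline{z_2}$.
(c) $\overline{z_1 \cdot z_2} = \overline{z_1} \cdot \overline{z_2}$.
(d) $\overline{\left(\dfrac{z_1}{z_2}\right)} = \dfrac{\overline{z_1}}{\overline{z_2}}$.
(e) $\overline{\overline{z}} = z$.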
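Since $|z| = (z\overline{z})^{1/2} = \sqrt{a^2 + b^2} = \left((\mathrm{Re}(z))^2 + (\mathrm{Im}(z))^2\right)^{1/2}$, then
\[ \mathrm{Re}(z) \le |\mathrm{Re}(z)| = \sqrt{(\mathrm{Re}(z))^2} \le \sqrt{(\mathrm{Re}(z))^2 + (\mathrm{Im}(z))^2} = |z|. \]
Similarly,
\[ \mathrm{Im}(z) \le |\mathrm{Im}(z)| \le |z|. \]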
For any two complex numbers, z1 and z2 , we have that
|z1 + z2 | ≤ |z1 | + |z2 |.
This is called the triangle inequality.
Proof.
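\[
|z_1 + z_2|^2 = (z_1 + z_2)\overline{(z_1 + z_2)} = (z_1 + z_2)(\overline{z_1} + \overline{z_2})
= z_1\overline{z_1} + 2\,\mathrm{Re}(z_1\overline{z_2}) + z_2\overline{z_2}.
\]
Using the fact that $2\,\mathrm{Re}(z_1\overline{z_2}) \le 2|z_1\overline{z_2}| = 2|z_1||z_2|$, we get
\[ |z_1 + z_2|^2 \le |z_1|^2 + 2|z_1||z_2| + |z_2|^2 = (|z_1| + |z_2|)^2. \]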
Taking square roots the result follows, that is
|z1 + z2 | ≤ |z1 | + |z2 |.
10.37 Polar Representation of Complex Numbers
[Figure: the point P = (r, θ) in the complex plane, where r is the directed distance and θ is the directed angle.]
If z = x + iy is a non-zero complex number, r = |z| and θ measures the angle from the positive real
axis to the vector z, then
x = r cos θ and y = r sin θ,
so that z = x + iy can be written as
z = r(cos θ + i sin θ).
This is called a polar form of z. The angle θ is called an argument of z and is denoted by
θ = arg z. The argument of z is not uniquely determined because we can add or subtract any
multiple of 2π from θ to produce another value of the argument.
One value of the argument in radians that satisfies −π < θ ≤ π is called the principal argument
of z and is denoted by θ = Arg z.
Example 10.37.1. Express z = 1 + √3 i in polar form using the principal argument.
Solution: The value of r is r = |z| = √((1)^2 + (√3)^2) = √4 = 2. Since x = 1 and y = √3, it
follows that 1 = 2 cos θ and √3 = 2 sin θ. So cos θ = 1/2 and sin θ = √3/2. The only value of θ that
satisfies these relations and meets the requirement −π < θ ≤ π is θ = π/3. The polar form of z is
\[ z = 2\left(\cos\frac{\pi}{3} + i\sin\frac{\pi}{3}\right). \]
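The modulus and principal argument in this example can be checked with the standard cmath module: cmath.polar returns the pair (r, θ) and cmath.rect converts back. Floating-point arithmetic is used, so the values agree with 2 and π/3 only up to rounding.

```python
import cmath
import math

z = complex(1, math.sqrt(3))      # z = 1 + √3 i

r, theta = cmath.polar(z)         # modulus and principal argument
print(r, theta, math.pi / 3)

# Rebuild z from its polar form r(cos θ + i sin θ).
rebuilt = cmath.rect(r, theta)
print(rebuilt)
```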
We now show how polar forms can be used to give geometric interpretations of multiplication and
division of complex numbers.
Recall:
We obtain
z1 z2 = r1 r2 [cos(θ1 + θ2 ) + i sin(θ1 + θ2 )]
which is a polar form of the complex number with modulus r1 r2 and argument θ1 + θ2 . Thus, we
have shown that
|z1 z2 | = |z1 ||z2 | and arg(z1 z2 ) = arg z1 + arg z2 .
Also
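\[ \frac{z_1}{z_2} = \frac{r_1}{r_2}\left[\cos(\theta_1 - \theta_2) + i\sin(\theta_1 - \theta_2)\right], \]
from which it follows that
\[ \left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|}, \quad \text{if } z_2 \ne 0, \]
and
\[ \arg\left(\frac{z_1}{z_2}\right) = \arg z_1 - \arg z_2. \]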
or
z^n = r^n (cos nθ + i sin nθ). (10.9)
In the special case r = 1, we have z = cos θ + i sin θ, so that (10.9) becomes
(cos θ + i sin θ)^n = cos nθ + i sin nθ.
10.38.1 Application of De Moivre’s Formula
Recall from algebra that −2 and 2 are said to be square roots of the number 4 because (−2)2 = 4
and (2)2 = 4. In other words, the two square roots of 4 are distinct solutions of the equation w2 = 4.
If n is a positive integer and z is any complex number, then we define an nth root of z to be any
complex number w that satisfies the equation
w^n = z (10.10)
and denote an nth root of z by z^{1/n}.
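\[ w = \sqrt[n]{r}\left[\cos\left(\frac{\theta}{n} + \frac{2k\pi}{n}\right) + i\sin\left(\frac{\theta}{n} + \frac{2k\pi}{n}\right)\right], \quad k = 0, \pm 1, \pm 2, \ldots \]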
Although there are infinitely many values of k, it can be shown that k = 0, 1, 2, . . . , n − 1 produces
distinct values of w satisfying (10.10), but all other choices of k yield duplicates of these.
Solution: Since −8 lies on the negative real axis, we can use π as an argument.
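Here r = |z| = |−8| = 8, so a polar form of −8 is
−8 = 8(cos π + i sin π).
Here n = 3, hence
\[ (-8)^{1/3} = \sqrt[3]{8}\left[\cos\left(\frac{\pi}{3} + \frac{2k\pi}{3}\right) + i\sin\left(\frac{\pi}{3} + \frac{2k\pi}{3}\right)\right], \quad k = 0, 1, 2. \]
Thus, the cube roots of −8 are
\[ k = 0: \quad 2\left(\cos\frac{\pi}{3} + i\sin\frac{\pi}{3}\right) = 2\left(\frac{1}{2} + \frac{\sqrt{3}}{2}i\right) = 1 + \sqrt{3}\,i, \]
\[ k = 1: \quad 2(\cos\pi + i\sin\pi) = 2(-1) = -2, \]
\[ k = 2: \quad 2\left(\cos\frac{5\pi}{3} + i\sin\frac{5\pi}{3}\right) = 2\left(\frac{1}{2} - \frac{\sqrt{3}}{2}i\right) = 1 - \sqrt{3}\,i. \]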
Example 10.39.2. x^3 − 2x + 4.
A number (real or complex) a is said to be a root of the polynomial p(x) if p(a) = 0.
Example 10.39.3. x = 1 is a root of x^2 − 2x + 1, since 1^2 − 2(1) + 1 = 0.
A number a (real or complex) is a root of the polynomial p(x) if and only if (x − a) is a factor of
p(x). It may be the case that you pull more than one factor (x − a) out of the polynomial. In such
cases a is said to be a multiple root of p(x).
Theorem 10.39.1 (The Fundamental Theorem of Algebra). Let p(x) be any polynomial of degree
n. Then p(x) can be factorized into a product of a constant and n factors of the form (x − a), where
a may be real or complex.
Suppose the polynomial has real coefficients. If the complex number z is a root of the polynomial,
then the complex conjugate z̄ is also a root.
Example 10.39.4. Let p(z) = z^4 − 4z^3 + 9z^2 − 16z + 20. Given that 2 + i is a root, express p(z)
as a product of real quadratic factors.
Solution: Given that 2 + i is a root, it follows that 2 − i must also be a root and so the quadratic
(z − (2 + i))(z − (2 − i)) = z^2 − 4z + 5
must be a factor. Dividing the given polynomial by this factor gives
p(z) = z^4 − 4z^3 + 9z^2 − 16z + 20 = (z^2 − 4z + 5)(z^2 + 4).
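The factorization can be verified by multiplying the two quadratics back together and by evaluating p at the given root. In this sketch a polynomial is stored as a list of coefficients, highest power first; poly_mul and poly_eval are hypothetical helper names.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (highest power first)."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


def poly_eval(p, x):
    """Evaluate a polynomial at x using Horner's rule."""
    value = 0
    for coefficient in p:
        value = value * x + coefficient
    return value


p = [1, -4, 9, -16, 20]                       # z^4 − 4z^3 + 9z^2 − 16z + 20
factors = poly_mul([1, -4, 5], [1, 0, 4])     # (z^2 − 4z + 5)(z^2 + 4)

print(factors == p)
print(poly_eval(p, complex(2, 1)), poly_eval(p, complex(2, -1)))
```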
Example 10.39.5. Solve z^3 + 3z^2 + 2z − 6 = 0 and express the left hand side as a product of
irreducible factors.
Solution: Since the equation is a polynomial equation of odd degree with real coefficients, there is
at least one real solution. To find that solution by trial and error, the factors of the constant term
are substituted into the polynomial. The factors of 6 are ±1, ±2, ±3, ±6.
Substituting z = 1 gives
1 + 3 + 2 − 6 = 0
so z = 1 is a solution and (z − 1) is a factor. So (z^3 + 3z^2 + 2z − 6)/(z − 1) = z^2 + 4z + 6 and the other
solutions are z = −2 ± √2 i and so
z^3 + 3z^2 + 2z − 6 = (z − 1)(z^2 + 4z + 6)
as a product of irreducible real factors.
Exercise 10.39.1. Express z^5 − 1 as a product of real linear and quadratic factors.
10.40 Tutorial 11
1. Given that z1 = 3 − 8i and z2 = −7 + i, find
(i) iz1 + 2z2 (ii) z1 + z2
2. Express each of the following complex numbers in polar form and represent each number on
an Argand diagram:
(i) −1 − i (ii) 3 − 3√3 i (iii) −2 − (2/√2) i
3. If z = (2 + i)/(1 − i), find the real and imaginary parts of z + 1/z.
(a) Let z ∈ C, and let w = (z − i)/(z + i).
i. Evaluate w when z = 0, and when z = 1.
ii. Let z = β where β ∈ R. Show that for any such z the corresponding w
always has unit modulus.
(b) i. Express the complex number z = 24 + 7i in polar form.
ii. Find the four values of z^{1/4} in exponential form, and plot them on an
Argand diagram.
Chapter 11
Theory of Matrices
Any man who can drive safely while kissing a pretty girl is simply not giving the kiss the attention it
deserves.
—Albert Einstein
11.1 Matrices
Definition 11.1.1. A matrix over a field K (elements of K are called numbers or scalars) is a
rectangular array of scalars presented in the following form
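\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \cdots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\]
The rows of such a matrix are the m horizontal lists of scalars, that is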
The element aij , called the ij-entry or ij-element appears in row i and column j. Denote a matrix
simply by A = [aij ].
A matrix with m rows and n columns is called an m by n matrix, written m × n. The pair of
numbers m and n are called the size of the matrix.
Two matrices are equal, written A = B, if they have the same size and if corresponding elements
are equal.
Example 11.1.1. Find x, y, z, t such that
\[
\begin{bmatrix} x + y & 2z + t \\ x - y & z - t \end{bmatrix}
= \begin{bmatrix} 3 & 7 \\ 1 & 5 \end{bmatrix}.
\]
Solution: By definition of equality of matrices, the four corresponding entries must be equal. Thus
x + y = 3, 2z + t = 7, x − y = 1, z − t = 5.
Solving this system of equations gives
x = 2, y = 1, z = 4, t = −1.
Example 11.1.2.
\[
A = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad
P = \begin{bmatrix} 0 & 0 \end{bmatrix}.
\]
Matrices whose entries are all real numbers are called real matrices and are said to be matrices
over R. Matrices whose entries are all complex numbers are called complex matrices and are
said to be matrices over C.
Let A = [aij ] and B = [bij ] be two matrices with the same size, say m × n matrices. The sum of
A and B, written A + B, is the matrix obtained by adding corresponding elements from A and B,
that is
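\[
A + B = \begin{bmatrix}
a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\
a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\
\vdots & \vdots & \cdots & \vdots \\
a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn}
\end{bmatrix}.
\]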
The product of a matrix A by a scalar k, written kA, is the matrix obtained by multiplying each
element of A by k, that is
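\[
kA = \begin{bmatrix}
ka_{11} & ka_{12} & \cdots & ka_{1n} \\
ka_{21} & ka_{22} & \cdots & ka_{2n} \\
\vdots & \vdots & \cdots & \vdots \\
ka_{m1} & ka_{m2} & \cdots & ka_{mn}
\end{bmatrix}.
\]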
We also define
−A = (−1)A and A − B = A + (−B).
The matrix −A is called the negative of matrix A and the matrix A − B is called the difference
of matrices A and B.
11.2.1 Properties
Theorem 11.2.1. Consider any matrices A, B and C (with same size) and scalars k and l. Then
(i) (A + B) + C = A + (B + C).
(ii) A + 0 = 0 + A = A.
(iv) A + B = B + A.
(viii) 1 · A = A.
We need to show that the corresponding ij-entries on each side of each matrix equation are equal.
The ij-entry of A + B is aij + bij , hence the ij-entry of (A + B) + C is (aij + bij ) + cij . On the other
hand, the ij-entry of B + C is bij + cij and hence the ij-entry of A + (B + C) is aij + (bij + cij ).
However for scalars in K,
(aij + bij ) + cij = aij + (bij + cij ).
Thus (A+B)+C and A+(B +C) have identical ij-entries. Therefore (A+B)+C = A+(B +C).
Proof. (v) The ij-entry of A + B is aij + bij , hence k(aij + bij ) is the ij-entry of k(A + B). On
the other hand, the ij-entry of kA and kB are kaij and kbij respectively. Thus, kaij + kbij is the
ij-entry of kA + kB. However, for scalars in K,
The product of matrices A and B is written as AB. The product AB of a row matrix
A = [a_i] and a column matrix B = [b_i] with the same number of elements is defined to be the
scalar obtained by multiplying corresponding entries and adding, that is
\[
AB = [a_1, a_2, \cdots, a_n]
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}
= a_1 b_1 + a_2 b_2 + \cdots + a_n b_n = \sum_{k=1}^{n} a_k b_k.
\]
The product AB is not defined when A and B have different numbers of elements.
Example 11.3.1.
\[
[7, -4, 5] \begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix}
= 7(3) + (-4)(2) + 5(-1) = 21 - 8 - 5 = 8.
\]
Definition 11.3.1. Suppose A = [aik ] and B = [bkj ] are matrices such that the number of columns
of A is equal to the number of rows of B, say, A is an m × p matrix and B is a p × n matrix. Then
the product AB is the m × n matrix whose ij-entry is obtained by multiplying the ith row of A by
the jth column of B, that is
\[
\begin{bmatrix}
a_{11} & \cdots & a_{1p} \\
\vdots & & \vdots \\
a_{i1} & \cdots & a_{ip} \\
\vdots & & \vdots \\
a_{m1} & \cdots & a_{mp}
\end{bmatrix}
\begin{bmatrix}
b_{11} & \cdots & b_{1j} & \cdots & b_{1n} \\
\vdots & & \vdots & & \vdots \\
b_{p1} & \cdots & b_{pj} & \cdots & b_{pn}
\end{bmatrix}
=
\begin{bmatrix}
c_{11} & \cdots & c_{1n} \\
\vdots & c_{ij} & \vdots \\
c_{m1} & \cdots & c_{mn}
\end{bmatrix}
\]
where
\[ c_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{ip} b_{pj} = \sum_{k=1}^{p} a_{ik} b_{kj}. \]
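Example 11.3.3. Suppose $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 5 & 6 \\ 0 & -2 \end{bmatrix}$. Then
\[
AB = \begin{bmatrix} 5 + 0 & 6 - 4 \\ 15 + 0 & 18 - 8 \end{bmatrix}
= \begin{bmatrix} 5 & 2 \\ 15 & 10 \end{bmatrix}
\]
and
\[
BA = \begin{bmatrix} 5 + 18 & 10 + 24 \\ 0 - 6 & 0 - 8 \end{bmatrix}
= \begin{bmatrix} 23 & 34 \\ -6 & -8 \end{bmatrix}.
\]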
The above example shows that matrix multiplication is not commutative, that is, the products AB
and BA of matrices need not be equal. Matrix multiplication satisfies the following properties
Theorem 11.3.1. Let A, B and C be matrices, then, whenever the products and sums are defined.
Let A = [a_ij], B = [b_jk], C = [c_kl] and let AB = S = [s_ik] and BC = T = [t_jl]. Then
\[ s_{ik} = \sum_{j=1}^{m} a_{ij} b_{jk} \quad \text{and} \quad t_{jl} = \sum_{k=1}^{n} b_{jk} c_{kl}. \]
The above sums are equal, that is, corresponding elements in (AB)C and A(BC) are equal. Thus
(AB)C = A(BC).
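The entry formula c_ij = a_i1 b_1j + · · · + a_ip b_pj can be sketched in a few lines, with a matrix stored as a list of rows. The function matmul is a hypothetical helper written for this illustration; A and B are the matrices of Example 11.3.3, and the third matrix C is an arbitrary choice used only to spot-check associativity.

```python
def matmul(A, B):
    """Return the product AB of two matrices given as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    p = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(p))
             for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[1, 2], [3, 4]]
B = [[5, 6], [0, -2]]
C = [[2, 0], [1, 3]]      # arbitrary third matrix, for illustration only

AB = matmul(A, B)
BA = matmul(B, A)
print(AB, BA, AB == BA)   # AB and BA differ

# Associativity on this particular triple: (AB)C = A(BC).
print(matmul(AB, C) == matmul(A, matmul(B, C)))
```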
11.4 Transpose of a Matrix
Definition 11.4.1. The transpose of a matrix A, written A^t, is the matrix obtained by writing the
columns of A, in order, as rows.
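Example 11.4.1.
\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}^t
= \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}
\quad \text{and} \quad
[1, -3, -5]^t = \begin{bmatrix} 1 \\ -3 \\ -5 \end{bmatrix}.
\]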
In other words, if A = [aij ] is an m × n matrix, then At = [bij ] is the n × m matrix, where bij = aji .
Observe that the transpose of a row vector is a column vector. Similarly, the transpose of a column
vector is a row vector. The basic properties of the transpose operation are
Theorem 11.4.1. Let A and B be matrices and let k be a scalar. Then, whenever the sum and
product are defined, we have
(i) (A + B)t = At + B t .
(ii) (At )t = A.
(iii) (kA)t = kAt .
(iv) (AB)t = B t At .
This is the ji-entry (reverse order) of (AB)^t. Now column j of B becomes row j of B^t and row i of
A becomes column i of A^t. Thus, the ji-entry of B^t A^t is
\[ [b_{1j}, b_{2j}, \cdots, b_{mj}]\,[a_{i1}, a_{i2}, \cdots, a_{im}]^t = b_{1j} a_{i1} + b_{2j} a_{i2} + \cdots + b_{mj} a_{im}. \]
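Property (iv) of Theorem 11.4.1 can be spot-checked on small matrices. In this sketch, transpose and matmul are hypothetical helpers operating on lists of rows, and the matrices A and B are borrowed from Example 11.3.3 purely as test data.

```python
def transpose(A):
    """Return the transpose of A: the columns of A, in order, as rows."""
    return [list(column) for column in zip(*A)]


def matmul(A, B):
    """Return the product AB of two matrices given as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[1, 2], [3, 4]]
B = [[5, 6], [0, -2]]

left = transpose(matmul(A, B))              # (AB)^t
right = matmul(transpose(B), transpose(A))  # B^t A^t
print(left, right, left == right)

print(transpose([[1, 2, 3], [4, 5, 6]]))    # the matrix of Example 11.4.1
```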
Definition 11.5.1. A square matrix is a matrix with the same number of rows as columns.
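Example 11.5.1. The following are square matrices of order 3.
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ -4 & -4 & -4 \\ 5 & 6 & 7 \end{bmatrix}
\quad \text{and} \quad
B = \begin{bmatrix} 2 & -5 & 1 \\ 0 & 3 & -2 \\ 1 & 2 & -4 \end{bmatrix}.
\]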
Definition 11.6.1. Let $A = [a_{ij}]$ be an n-square matrix. The diagonal or main diagonal of $A$
consists of the elements with the same subscripts, that is
\[ a_{11}, a_{22}, \ldots, a_{nn}. \]
Definition 11.6.2. The trace of $A$, written $\operatorname{tr}(A)$, is the sum of the diagonal elements. Namely
\[ \operatorname{tr}(A) = a_{11} + a_{22} + \cdots + a_{nn}. \]
The n-square identity or unit matrix, denoted by I, is the n-square matrix with 1’s on the diagonal
and 0’s elsewhere.
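The two definitions translate directly into code. This is a small sketch with assumed helper names `trace` and `identity`, applied to the matrix $A$ of Example 11.5.1.

```python
def trace(M):
    """Sum of the main-diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def identity(n):
    """n-square matrix with 1's on the diagonal and 0's elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 2, 3], [-4, -4, -4], [5, 6, 7]]  # matrix A of Example 11.5.1

print(trace(A))     # 4, since 1 + (-4) + 7 = 4
print(identity(3))  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```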
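and
\[ A^3 = A^2 A = \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} = \begin{bmatrix} -11 & 38 \\ 57 & -106 \end{bmatrix}. \]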
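A square matrix $D = [d_{ij}]$ is diagonal if its non-diagonal entries are all zero.
Example 11.8.1.
\[ A = \begin{bmatrix} 3 & 0 & 0 \\ 0 & -7 & 0 \\ 0 & 0 & 2 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 4 & 0 \\ 0 & -5 \end{bmatrix}. \]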
A square matrix A = [aij ] is upper triangular if all entries below the main diagonal are equal to
zero.
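Example 11.8.2.
\[ A = \begin{bmatrix} a_{11} & a_{12} \\ 0 & a_{22} \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ 0 & b_{22} & b_{23} \\ 0 & 0 & b_{33} \end{bmatrix}. \]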
A lower triangular matrix is a square matrix whose entries above the main diagonal are all zero.
A square matrix $A$ is symmetric if
\[ A^t = A. \]
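Example 11.8.3. Let $A = \begin{bmatrix} 2 & -3 & 5 \\ -3 & 6 & 7 \\ 5 & 7 & -8 \end{bmatrix}$, then $A^t = \begin{bmatrix} 2 & -3 & 5 \\ -3 & 6 & 7 \\ 5 & 7 & -8 \end{bmatrix}$. Hence $A^t = A$, thus $A$ is
symmetric.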
11.8.4 Skew-Symmetric Matrices
A square matrix $A$ is skew-symmetric if
\[ A^t = -A. \]
Example 11.8.4.
\[ B = \begin{bmatrix} 0 & 3 & -4 \\ -3 & 0 & 5 \\ 4 & -5 & 0 \end{bmatrix}. \]
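The symmetric and skew-symmetric conditions can be tested mechanically by comparing a matrix with its transpose. The sketch below uses assumed helper names and the matrices of Examples 11.8.3 and 11.8.4.

```python
def transpose(M):
    """Write the columns of M, in order, as rows."""
    return [list(col) for col in zip(*M)]

def is_symmetric(M):
    """True when M^t = M."""
    return transpose(M) == M

def is_skew_symmetric(M):
    """True when M^t = -M."""
    return transpose(M) == [[-x for x in row] for row in M]

A = [[2, -3, 5], [-3, 6, 7], [5, 7, -8]]  # Example 11.8.3
B = [[0, 3, -4], [-3, 0, 5], [4, -5, 0]]  # Example 11.8.4

print(is_symmetric(A))       # True
print(is_skew_symmetric(B))  # True
print(is_symmetric(B))       # False
```

Note that the diagonal entries of a skew-symmetric matrix must be zero, because each satisfies $a_{ii} = -a_{ii}$.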
\[ A^t = A^{-1}, \]
that is
\[ AA^t = A^t A = I. \]
Let $A$ be a complex matrix. The conjugate of a complex matrix $A$, written $\bar{A}$, is the matrix
obtained from $A$ by taking the conjugate of each entry of $A$.
\[ A^* = (\bar{A})^t = \overline{(A^t)}. \]
If $A$ is real then $A^* = A^t$.
Example 11.9.1. Let $A = \begin{bmatrix} 2 + 8i & 5 - 3i & 4 - 7i \\ 6i & 1 - 4i & 3 + 2i \end{bmatrix}$, then $A^* = \begin{bmatrix} 2 - 8i & -6i \\ 5 + 3i & 1 + 4i \\ 4 + 7i & 3 - 2i \end{bmatrix}$.
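Python's built-in complex numbers make it straightforward to reproduce Example 11.9.1. In this sketch the function name `conjugate_transpose` is an assumption; Python writes the imaginary unit as `j`.

```python
def conjugate_transpose(M):
    """Transpose M and conjugate every entry, giving A*."""
    return [[z.conjugate() for z in col] for col in zip(*M)]

# Matrix A of Example 11.9.1.
A = [[2 + 8j, 5 - 3j, 4 - 7j],
     [6j, 1 - 4j, 3 + 2j]]

A_star = conjugate_transpose(A)
for row in A_star:
    print(row)

# Applying the operation twice returns the original matrix.
print(conjugate_transpose(A_star) == A)  # True
```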
11.9.1 Hermitian Matrices
A complex square matrix $A$ is Hermitian if
\[ A^* = A. \]
Skew-Hermitian Matrices
A complex square matrix $A$ is skew-Hermitian if
\[ A^* = -A. \]
AI = IA = A.
This raises the following question: given an $n \times n$ matrix $A$, is it possible to find another
$n \times n$ matrix $B$ such that $AB = BA = I$?
\[ B = A^{-1}. \]
Proposition 11.10.2. Suppose that $A$ is an invertible $n \times n$ matrix. Then its inverse $A^{-1}$ is
unique.
Proof. Suppose that $B$ satisfies the requirements for being the inverse of $A$. Then $AB = I = BA$.
It follows that
\[ A^{-1} = A^{-1} I = A^{-1}(AB) = (A^{-1}A)B = IB = B. \]
Hence the inverse $A^{-1}$ is unique.
Exercise 11.10.1. Suppose that $A$ and $B$ are invertible $n \times n$ matrices. Prove that
\[ (AB)^{-1} = B^{-1} A^{-1}. \]
\[ (A^{-1})^{-1} = A. \]
11.11 Determinants
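Each n-square matrix $A = [a_{ij}]$ is assigned a special scalar called the determinant of $A$, denoted
by $\det A$ or $|A|$ or
\[ \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \cdots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}. \]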
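A procedure for evaluating the determinant of a $3 \times 3$ matrix is called Sarrus' Rule. Write the first two columns again to the right of the matrix. The three products along the diagonals running down to the right are taken with a plus sign, and the three products along the diagonals running up to the right are taken with a minus sign.
\[ \begin{array}{ccc|cc} a_{11} & a_{12} & a_{13} & a_{11} & a_{12} \\ a_{21} & a_{22} & a_{23} & a_{21} & a_{22} \\ a_{31} & a_{32} & a_{33} & a_{31} & a_{32} \end{array} \]
For example,
\[ \det A = \begin{vmatrix} 2 & 1 & 1 \\ 0 & 5 & -2 \\ 1 & -3 & 4 \end{vmatrix}, \qquad \begin{array}{ccc|cc} 2 & 1 & 1 & 2 & 1 \\ 0 & 5 & -2 & 0 & 5 \\ 1 & -3 & 4 & 1 & -3 \end{array} \]
= 2(5)(4) + 1(−2)(1) + 1(0)(−3) − (1)(0)(4) − (2)(−2)(−3) − (1)(5)(1) = 40 − 2 + 0 − 0 − 12 − 5
= 21.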
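\[ \det B = \begin{vmatrix} 3 & 2 & 1 \\ -4 & 5 & -1 \\ 2 & -3 & 4 \end{vmatrix}, \qquad \begin{array}{ccc|cc} 3 & 2 & 1 & 3 & 2 \\ -4 & 5 & -1 & -4 & 5 \\ 2 & -3 & 4 & 2 & -3 \end{array} \]
= 3(5)(4) + 2(−1)(2) + 1(−4)(−3) − (2)(−4)(4) − (3)(−1)(−3) − (1)(5)(2) = 60 − 4 + 12 + 32 − 9 − 10
= 81.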
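Sarrus' Rule is short enough to write as a single expression. The following sketch (the function name `sarrus` is an assumption) evaluates the two $3 \times 3$ determinants above; it applies only to $3 \times 3$ matrices.

```python
def sarrus(M):
    """Determinant of a 3 x 3 matrix by Sarrus' Rule."""
    (a, b, c), (d, e, f), (g, h, i) = M
    # Three "plus" diagonals minus three "minus" diagonals.
    return a*e*i + b*f*g + c*d*h - c*e*g - a*f*h - b*d*i

A = [[2, 1, 1], [0, 5, -2], [1, -3, 4]]
B = [[3, 2, 1], [-4, 5, -1], [2, -3, 4]]

print(sarrus(A))  # 21
print(sarrus(B))  # 81
```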
Definition 11.13.1. If $A = [a_{ij}]$ is an $n \times n$ matrix, then the minor of the element $a_{ij}$, denoted
by $M_{ij}$, is defined as the determinant of the $(n - 1) \times (n - 1)$ sub-matrix which is obtained by
deleting all the entries in the $i$th row and the $j$th column.
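Definition 11.13.2. The co-factor of an element $a_{ij}$, denoted by $A_{ij}$, is defined as the product of
$(-1)^{i+j}$ and the minor of $a_{ij}$, that is
\[ A_{ij} = (-1)^{i+j} M_{ij}. \]
The co-factor of an element is merely the signed minor of the element. We emphasize that, with the definition above, both $M_{ij}$ and
$A_{ij}$ denote scalars: $M_{ij}$ is the determinant of a sub-matrix, not the sub-matrix itself.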
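Example 11.13.2. If $A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$, then the co-factor of $a_{11}$ is
\[ A_{11} = (-1)^{1+1} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} = + \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}, \]
and the co-factor of $a_{12}$ is
\[ A_{12} = (-1)^{1+2} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} = - \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}. \]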
11.14 Laplace Expansion of the Determinant
To compute the determinant of an $n \times n$ matrix we make use of the concept of co-factors and minors
to reduce the computation to determinants of lower order, which we already know how to calculate.
The determinant of a square matrix A = [aij ] is equal to the sum of the products obtained by
multiplying the elements of any row (column) by their respective co-factors.
\[ |A| = a_{i1}A_{i1} + a_{i2}A_{i2} + \cdots + a_{in}A_{in} = \sum_{j=1}^{n} a_{ij}A_{ij}. \]
This expansion can be carried out along any row of the matrix in question and the value of the
determinant is the same.
Example 11.14.1. Given that $A = \begin{bmatrix} 3 & -1 & 5 \\ 0 & 4 & -3 \\ 2 & 1 & 2 \end{bmatrix}$, find $|A|$.
Note that expanding by a row or column that contains zeros significantly reduces the number
of cumbersome calculations that need to be done. It is sensible to evaluate the determinant by
co-factor expansion along a row or column with the greatest number of zeros.
Example 11.14.2. Given that $A = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 3 & 5 & 0 & -1 \\ 0 & 3 & -2 & 5 \\ 1 & 0 & 0 & 2 \end{bmatrix}$, find $\det A$.
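Solution: Expanding along the first row, whose only non-zero entry is $a_{14} = 1$ with sign $(-1)^{1+4} = -1$, and then expanding the resulting $3 \times 3$ determinant along its third row, we get
\[ \det A = - \begin{vmatrix} 3 & 5 & 0 \\ 0 & 3 & -2 \\ 1 & 0 & 0 \end{vmatrix} = - \begin{vmatrix} 5 & 0 \\ 3 & -2 \end{vmatrix} = -\bigl((5)(-2) - (0)(3)\bigr) = -(-10 - 0) = 10. \]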
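The co-factor expansion lends itself to a recursive implementation. The sketch below is one possible version, with assumed helper names `submatrix` and `det`; it always expands along the first row rather than searching for the row with the most zeros, so it is simple but not efficient for large matrices.

```python
def submatrix(M, i, j):
    """Delete row i and column j of M (indices are 0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]

def det(M):
    """Determinant by co-factor (Laplace) expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(submatrix(M, 0, j))
               for j in range(len(M)))

A1 = [[3, -1, 5], [0, 4, -3], [2, 1, 2]]          # Example 11.14.1
A2 = [[0, 0, 0, 1], [3, 5, 0, -1],
      [0, 3, -2, 5], [1, 0, 0, 2]]                # Example 11.14.2

print(det(A1))  # -1
print(det(A2))  # 10
```

For Example 11.14.1 the first-row expansion reads $3(8 + 3) + 1(0 + 6) + 5(0 - 8) = 33 + 6 - 40 = -1$.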
The determinant of the identity matrix is 1. The determinant of a diagonal matrix D of order
n × n is given by the product of the elements on its main diagonal. The determinant of a triangular
matrix of order n × n is given by the product of the elements on its main diagonal.
11.15 Properties
1. For square matrices $A$ and $B$ of the same order,
$|AB| = |A||B|$.
|AB| = |BA|.
det A = − det A.
5. If the elements of any one row (column) of an $n \times n$ matrix $A$ are multiplied by the same scalar
$k$, then the determinant of the new matrix is $k$ times the determinant of $A$.
6. If the elements of any row (column) of A are all zeros, then the determinant of A is zero.
8. If A is an n × n matrix, with any two of its rows (columns) equal, then the determinant of A
is zero.
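Several of these properties can be spot-checked on concrete matrices. The sketch below is illustrative rather than a proof: the helpers `det`, `submatrix` and `matmul` are assumed names, and the test matrices are the two $3 \times 3$ matrices used in the Sarrus' Rule examples.

```python
def submatrix(M, i, j):
    """Delete row i and column j of M (indices are 0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]

def det(M):
    """Determinant by co-factor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(submatrix(M, 0, j))
               for j in range(len(M)))

def matmul(X, Y):
    """Multiply matrices stored as lists of rows."""
    return [[sum(X[i][j] * Y[j][k] for j in range(len(Y)))
             for k in range(len(Y[0]))]
            for i in range(len(X))]

A = [[3, 2, 1], [-4, 5, -1], [2, -3, 4]]
B = [[2, 1, 1], [0, 5, -2], [1, -3, 4]]

# |AB| = |A||B| and |AB| = |BA|.
print(det(matmul(A, B)) == det(A) * det(B))        # True
print(det(matmul(A, B)) == det(matmul(B, A)))      # True

# Multiplying one row by k multiplies the determinant by k.
k = 3
scaled = [A[0], [k * x for x in A[1]], A[2]]
print(det(scaled) == k * det(A))                   # True

# A zero row, or two equal rows, gives determinant zero.
print(det([A[0], [0, 0, 0], A[2]]))                # 0
print(det([A[0], A[0], A[2]]))                     # 0
```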
11.16 Adjoint
Definition 11.16.1. Let $A = [a_{ij}]$ be an $n \times n$ matrix and let $A_{ij}$ denote the co-factor of $a_{ij}$. The
adjoint of $A$, denoted by $\operatorname{adj} A$, is the transpose of the matrix of co-factors of $A$, that is
\[ \operatorname{adj} A = [A_{ij}]^t. \]
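Example 11.16.1. Let $A = \begin{bmatrix} 2 & 3 & -4 \\ 0 & -4 & 2 \\ 1 & -1 & 5 \end{bmatrix}$. The co-factors of the nine elements of $A$ are as follows,
\[ A_{11} = + \begin{vmatrix} -4 & 2 \\ -1 & 5 \end{vmatrix} = -18, \quad A_{12} = - \begin{vmatrix} 0 & 2 \\ 1 & 5 \end{vmatrix} = 2, \quad A_{13} = + \begin{vmatrix} 0 & -4 \\ 1 & -1 \end{vmatrix} = 4, \]
\[ A_{21} = - \begin{vmatrix} 3 & -4 \\ -1 & 5 \end{vmatrix} = -11, \quad A_{22} = + \begin{vmatrix} 2 & -4 \\ 1 & 5 \end{vmatrix} = 14, \quad A_{23} = - \begin{vmatrix} 2 & 3 \\ 1 & -1 \end{vmatrix} = 5, \]
\[ A_{31} = + \begin{vmatrix} 3 & -4 \\ -4 & 2 \end{vmatrix} = -10, \quad A_{32} = - \begin{vmatrix} 2 & -4 \\ 0 & 2 \end{vmatrix} = -4, \quad A_{33} = + \begin{vmatrix} 2 & 3 \\ 0 & -4 \end{vmatrix} = -8. \]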
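\[ [A_{ij}] = \begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{bmatrix} = \begin{bmatrix} -18 & 2 & 4 \\ -11 & 14 & 5 \\ -10 & -4 & -8 \end{bmatrix}. \]
The transpose of the above matrix of co-factors yields the adjoint of $A$, that is
\[ \operatorname{adj} A = \begin{bmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{bmatrix}. \]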
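The adjoint of Example 11.16.1 can be reproduced by computing every co-factor and transposing. This is a minimal sketch with assumed helper names (`submatrix`, `det`, `cofactor`, `adjoint`); indices in the code are 0-based, so `cofactor(A, 0, 0)` corresponds to $A_{11}$.

```python
def submatrix(M, i, j):
    """Delete row i and column j of M (indices are 0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]

def det(M):
    """Determinant by co-factor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(submatrix(M, 0, j))
               for j in range(len(M)))

def cofactor(M, i, j):
    """Signed minor (-1)^(i+j) * M_ij."""
    return (-1) ** (i + j) * det(submatrix(M, i, j))

def adjoint(M):
    """Transpose of the matrix of co-factors."""
    n = len(M)
    return [[cofactor(M, j, i) for j in range(n)] for i in range(n)]

A = [[2, 3, -4], [0, -4, 2], [1, -1, 5]]  # Example 11.16.1

print(cofactor(A, 0, 0))  # -18
print(adjoint(A))         # [[-18, -11, -10], [2, 14, -4], [4, 5, -8]]
```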
11.17 Properties of Inverses
1. If an $n \times n$ matrix $A$ is invertible, then $\det A \neq 0$.
Definition 11.17.1. A matrix which has an inverse is said to be invertible. A matrix whose
determinant is non-zero is said to be non-singular and if a matrix has determinant equal to
zero it is called a singular matrix.
\[ (A^{-1})^{-1} = A. \]
Chapter 12
Application of Matrices
Gravitation cannot be held responsible for people falling in love. How on earth can you explain in terms of
chemistry and physics so important a biological phenomenon as first love? Put your hand on a stove for a
minute and it seems like an hour. Sit with that special girl for an hour and it seems like a minute. That’s
relativity.
—Albert Einstein
Let ri denote row i of matrix A. There are 3 elementary row operations, namely
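Example 12.1.1. Consider the matrix $A = \begin{bmatrix} 1 & 0 & 2 \\ 4 & 1 & 3 \\ 3 & 2 & 6 \end{bmatrix}$. Then
\[ r_1 \leftrightarrow r_2 \text{ gives } \begin{bmatrix} 4 & 1 & 3 \\ 1 & 0 & 2 \\ 3 & 2 & 6 \end{bmatrix}, \]
\[ r_2 \to 2r_2 \text{ gives } \begin{bmatrix} 1 & 0 & 2 \\ 8 & 2 & 6 \\ 3 & 2 & 6 \end{bmatrix}, \]
\[ r_1 \to 2r_1 + r_3 \text{ gives } \begin{bmatrix} 5 & 2 & 10 \\ 4 & 1 & 3 \\ 3 & 2 & 6 \end{bmatrix}. \]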
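The three operations used in Example 12.1.1 can be written as small functions that return a new matrix and leave the original unchanged. The function names below are assumptions made for this sketch, and rows are numbered from 0 in the code, so `r1` in the notes is row `0` here.

```python
def swap_rows(M, i, j):
    """r_i <-> r_j."""
    N = [row[:] for row in M]
    N[i], N[j] = N[j], N[i]
    return N

def scale_row(M, i, k):
    """r_i -> k * r_i, for a non-zero scalar k."""
    N = [row[:] for row in M]
    N[i] = [k * x for x in N[i]]
    return N

def combine_rows(M, i, k, j):
    """r_i -> k * r_i + r_j."""
    N = [row[:] for row in M]
    N[i] = [k * x + y for x, y in zip(M[i], M[j])]
    return N

A = [[1, 0, 2], [4, 1, 3], [3, 2, 6]]  # Example 12.1.1

print(swap_rows(A, 0, 1))        # [[4, 1, 3], [1, 0, 2], [3, 2, 6]]
print(scale_row(A, 1, 2))        # [[1, 0, 2], [8, 2, 6], [3, 2, 6]]
print(combine_rows(A, 0, 2, 2))  # [[5, 2, 10], [4, 1, 3], [3, 2, 6]]
```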
We can use row operations to find the inverse of $A$ by writing the augmented matrix $(A \,|\, I_n)$ and then using row
operations to get $(I_n \,|\, A^{-1})$.
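Example 12.2.1. Consider $A = \begin{bmatrix} 2 & 3 \\ 2 & 2 \end{bmatrix}$.
Solution: We write
\[ \left[\begin{array}{cc|cc} 2 & 3 & 1 & 0 \\ 2 & 2 & 0 & 1 \end{array}\right]. \]
Then performing row operations we have
\[ \left[\begin{array}{cc|cc} 2 & 3 & 1 & 0 \\ 0 & -1 & -1 & 1 \end{array}\right] \quad r_2 \to r_2 - r_1 \]
\[ \left[\begin{array}{cc|cc} 2 & 0 & -2 & 3 \\ 0 & -1 & -1 & 1 \end{array}\right] \quad r_1 \to r_1 + 3r_2 \]
\[ \left[\begin{array}{cc|cc} 1 & 0 & -1 & \tfrac{3}{2} \\ 0 & -1 & -1 & 1 \end{array}\right] \quad r_1 \to \tfrac{1}{2} r_1 \]
\[ \left[\begin{array}{cc|cc} 1 & 0 & -1 & \tfrac{3}{2} \\ 0 & 1 & 1 & -1 \end{array}\right] \quad r_2 \to -r_2. \]
Therefore $A^{-1} = \begin{bmatrix} -1 & \tfrac{3}{2} \\ 1 & -1 \end{bmatrix}$. Checking can be done by verifying that $A^{-1}A = I$.
\[ \begin{bmatrix} -1 & \tfrac{3}{2} \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 2 & 3 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I. \]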
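The reduction of $(A \,|\, I_n)$ to $(I_n \,|\, A^{-1})$ can be automated. The sketch below is one possible implementation, not the notes' own procedure: the function name `inverse` is an assumption, it uses exact fractions from Python's standard library, and it applies the row operations in a different order from the worked example (it normalizes each pivot before clearing its column), although the final result is the same.

```python
from fractions import Fraction

def inverse(A):
    """Invert a square matrix by row-reducing (A | I) to (I | A^-1)."""
    n = len(A)
    # Augmented matrix (A | I) with exact rational entries.
    M = [[Fraction(x) for x in row] +
         [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        # Find a row at or below row c with a non-zero entry in column c.
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            raise ValueError("matrix is singular")
        M[c], M[p] = M[p], M[c]          # interchange rows
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]  # scale the pivot row
        for r in range(n):                # clear the rest of column c
            if r != c:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

A = [[2, 3], [2, 2]]  # Example 12.2.1
A_inv = inverse(A)
print([[str(x) for x in row] for row in A_inv])  # [['-1', '3/2'], ['1', '-1']]
```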
12.3 Linear Equations
A finite set of linear equations in the variables x1 , x2 , · · · , xn is called a system of linear equations
or a linear system.
Every system of linear equations has either no solutions, exactly one solution or infinitely
many solutions.
Example 12.3.2. Find the solution set of
\[ \begin{aligned} 2x_1 - x_2 + x_3 &= 4 \\ -3x_1 + 2x_2 - 4x_3 &= 1 \\ x_1 - 5x_3 &= 0 \end{aligned} \]
The corresponding system of linear equations derived from the augmented matrix after row reduction is
\[ \begin{aligned} x_1 - 5x_3 &= 0 \\ x_2 - 11x_3 &= -4 \\ 3x_3 &= 9. \end{aligned} \]
Now using the method of back substitution, we find the values of the unknowns as follows
\[ \begin{aligned} x_3 &= 3 \\ x_2 &= -4 + 11(3) = 29 \\ x_1 &= 0 + 5(3) = 15. \end{aligned} \]
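Back substitution works from the last equation upwards, which is simple to express in code. The sketch below (the function name `back_substitute` is an assumption) solves the reduced triangular system of Example 12.3.2 and then checks the answer against the original three equations.

```python
from fractions import Fraction

def back_substitute(U, b):
    """Solve U x = b for an upper triangular U with non-zero diagonal."""
    n = len(U)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        # Subtract the already-known terms, then divide by the pivot.
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(b[i]) - known) / U[i][i]
    return x

# Reduced system: x1 - 5x3 = 0, x2 - 11x3 = -4, 3x3 = 9.
U = [[1, 0, -5], [0, 1, -11], [0, 0, 3]]
b = [0, -4, 9]

x = back_substitute(U, b)
print([int(v) for v in x])  # [15, 29, 3]

# Check against the original system of Example 12.3.2.
original = [([2, -1, 1], 4), ([-3, 2, -4], 1), ([1, 0, -5], 0)]
print(all(sum(c * v for c, v in zip(coeffs, x)) == rhs
          for coeffs, rhs in original))  # True
```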