Math1059 - Calculus
1. Functions
2. Limits
3. Continuity
4. Differentiation
5. Curve Sketching
6. Integration
7. Methods of Integration
8. Indeterminate Forms and Improper Integrals
9. Differential Equations
10. Appendix: Mathematical Induction
1. Functions
Getting started.
Definition 1.1. A function is a map f : X −→ Y between sets X and Y with
the property that f sends each element of X to a unique element of Y. For
x ∈ X, write f(x) for the corresponding element in Y.
These examples show that it’s good to get a grip on the set from which a
function starts and the set to which it sends elements.
Definition 1.3. Let f : X −→ Y be a function. The set X is called the
domain of the function f . The range of f is the smallest subset R of Y
containing all the elements f (x). That is,
R = {f (x) | x ∈ X}.
More notation. For a < b in R, write [a, b] for the closed interval from a to b,
write (a, b) for the open interval from a to b, and write [a, b) and (a, b] for the
half-open intervals from a to b. That is:
• [a, b] = {x ∈ R | a ≤ x ≤ b};
• (a, b) = {x ∈ R | a < x < b};
• [a, b) = {x ∈ R | a ≤ x < b};
• (a, b] = {x ∈ R | a < x ≤ b}.
Example 1.4. Return to Example 1.2.
(i) The domain of f is R. The range is R.
(ii) The domain of g is R. The range is R+ .
(iii) The domain of h is R. The range is (0, 1] because $0 < h(x) \le 1$ for every
x ∈ R, and any y ∈ (0, 1] can be written as $y = \frac{1}{1+x^2}$ for $x = \sqrt{\frac{1-y}{y}}$.
surjective. It is not injective because, for example, h(1) = h(−1) = 1/2 but
1 ≠ −1. It is not surjective because we have already seen that the range of h
is (0, 1] and (0, 1] ≠ R.
The absolute value function. Write |x| for the absolute value of a real
number x. That is,
\[ |x| = \begin{cases} x & \text{if } x \ge 0, \\ -x & \text{if } x < 0. \end{cases} \]
The absolute value function is given by f : R −→ R+ where f(x) = |x|. You
should check that the domain of f is R, the range is R+ , f is not injective,
and f is surjective.
(i) $\tan x = \frac{\sin x}{\cos x}$. Observe that tan x is not defined when cos x = 0, that is,
when x = kπ + π/2 for some integer k. Let A = {kπ + π/2 | k ∈ Z}. Then
we can define a function f : R − A −→ R by f(x) = tan x. Observe that f is
a surjection but f(x + 2π) = f(x) so f is not injective.
The restricted function $f_0 : (-\frac{\pi}{2}, \frac{\pi}{2}) \to \mathbb{R}$ defined by $f_0(x) = \tan x$ is a bijection.
(ii) $\sec x = \frac{1}{\cos x}$. As with tan x, sec x is not defined on A = {kπ + π/2 | k ∈ Z}.
So there is a function g : R−A −→ R defined by g(x) = sec x. Observe that g is
neither a surjection nor an injection. It is not a surjection since −1 ≤ cos x ≤ 1
implies that sec x ≥ 1 or sec x ≤ −1. In particular, the open interval (−1, 1)
is not in the range of sec x. It is not an injection since sec(x + 2π) = sec x.
(iii) $\csc x = \frac{1}{\sin x}$. Now csc x is not defined when sin x = 0, that is, when
x = kπ for some integer k. Let B = {kπ | k ∈ Z}. Then we can define
a function h : R − B −→ R by h(x) = csc x. Observe that h is neither a
surjection nor an injection, for reasons similar to those for sec x.
This definition does not explicitly say that f has to be a bijection, but that is
inherent from the compositions g ◦ f and f ◦ g being the identity maps on X
and Y respectively. (Exercise: check this!)
Example 1.14. Let f : R −→ R be defined by f (x) = 2x + 1. Then f is a
bijection. What is the inverse of f ?
Solution 1.15. To find $f^{-1}$, let y = f(x) and solve y = 2x + 1 for x. This
gives $x = \frac{y-1}{2}$. Let $g(x) = \frac{x-1}{2}$. Let’s check that g is an inverse of f by
showing that g ◦ f and f ◦ g are the identity maps on R. We have
\[ (g \circ f)(x) = g(f(x)) = g(2x + 1) = \frac{(2x + 1) - 1}{2} = \frac{2x}{2} = x \]
and
\[ (f \circ g)(x) = f(g(x)) = f\left(\frac{x-1}{2}\right) = 2\left(\frac{x-1}{2}\right) + 1 = (x - 1) + 1 = x. \]
Thus g is the inverse of f so we can take $f^{-1} = g$. That is, $f^{-1}(x) = \frac{x-1}{2}$.
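The check in Solution 1.15 can also be spot-checked numerically. The following is only a sketch: the sample points are arbitrary choices, and a finite list of samples illustrates the identities g ◦ f = id and f ◦ g = id without proving them.

```python
def f(x):
    # The function from Example 1.14.
    return 2 * x + 1


def g(x):
    # The candidate inverse found in Solution 1.15.
    return (x - 1) / 2


# Arbitrary sample points (chosen so the floating point arithmetic is exact).
samples = [-3, -0.5, 0, 1, 2.25, 10]

# Both compositions should return the point we started from.
compositions_ok = all(g(f(x)) == x and f(g(x)) == x for x in samples)

if __name__ == "__main__":
    print("g(f(x)) = x and f(g(x)) = x on all samples:", compositions_ok)
```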
One useful observation is that the graph of f −1 is the mirror image of the
graph of f along the line y = x. That is, if y = f (x) then the point (x, y) is on
the graph of f . Now apply f −1 to y = f (x) to get f −1 (y) = f −1 (f (x)) = x.
This implies that the point (y, x) is on the graph of f −1 . You can get the
point (y, x) from the point (x, y) by reflecting along the line y = x, that is, by
swapping coordinates. Doing this for every such point on the graph of f gives
the graph of f −1 .
Inverse Trig Functions. Special cases of inverse functions occur for trig
functions, but caution needs to be exercised to get the domains and ranges
correct.
Let f : R −→ R be defined by f (x) = sin x. Then f is not a bijection so it has
no inverse. But we saw that f becomes a bijection if the domain and range
are appropriately restricted. Starting again, let $f : [-\frac{\pi}{2}, \frac{\pi}{2}] \to [-1, 1]$ be the
function defined by f(x) = sin x. Then f is a bijection, so $f^{-1}$ exists. Let
$\sin^{-1} x$ be the inverse function of sin x for this domain and range. So $\sin^{-1} x$
is a function
\[ \sin^{-1} x : [-1, 1] \to \left[-\frac{\pi}{2}, \frac{\pi}{2}\right] \]
with domain [−1, 1] and range $[-\frac{\pi}{2}, \frac{\pi}{2}]$. The graph of $\sin^{-1} x$ is the mirror
image of that for sin x, with the mirror placed along the line y = x.
Similarly, the function g : [0, π] −→ [−1, 1] defined by g(x) = cos x is a bijection
and so has an inverse
\[ \cos^{-1} x : [-1, 1] \to [0, \pi]. \]
Its graph is the mirror image of that for cos x, with the mirror placed along
the line y = x.
The five rules for logarithms are derived from the five rules for exponentials
by using inverse functions. Here is an example.
Example 1.19. Show that $\log_a(xy) = \log_a(x) + \log_a(y)$.
Solution 1.20. Let $c = \log_a(xy)$ and $d = \log_a(x) + \log_a(y)$. We want to show
that c = d. The strategy is to first show that $a^c = a^d$ and then show that
c = d.
Taking exponentials using the base a gives $a^c = a^{\log_a(xy)} = xy$. On the other
hand, consider the string of equalities
\[ a^d = a^{\log_a(x) + \log_a(y)} = a^{\log_a(x)} \, a^{\log_a(y)} = xy. \]
From left to right, the first equality holds by definition of d, the second equality
holds from the property $a^{s+t} = a^s a^t$, and the third equality holds since
$a^{\log_a(x)} = x$ and $a^{\log_a(y)} = y$. Thus we have $a^c = a^d$. Now apply the logarithm
function to get $\log_a(a^c) = \log_a(a^d)$. But $\log_a(a^c) = c$ and $\log_a(a^d) = d$, so
c = d as required.
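As a sanity check on Example 1.19, the identity can be tested in floating point. This is a sketch only: the helper names `log_base` and `product_rule_gap` are made up for illustration, and the change-of-base formula used to compute logarithms to base a is an assumption not taken from these notes.

```python
import math


def log_base(a, x):
    # Logarithm of x to the base a, via natural logarithms (change of base).
    return math.log(x) / math.log(a)


def product_rule_gap(a, x, y):
    # Distance between the two sides of log_a(xy) = log_a(x) + log_a(y).
    # It should be zero up to floating point rounding.
    left = log_base(a, x * y)
    right = log_base(a, x) + log_base(a, y)
    return abs(left - right)


if __name__ == "__main__":
    for a, x, y in [(2, 8, 32), (10, 0.5, 7.0), (3, 1.5, 4.0)]:
        print(a, x, y, product_rule_gap(a, x, y))
```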
2. Limits
[Figure: graphs of the functions g(x) and h(x).]
They are clearly different functions because they take different values at x = 1.
However, we’ll see that the concept of limits does not see this, while the
concept of continuity does.
The idea of a limit is as follows. Start with a function f : R −→ R. Fix
a point a in R. Suppose x is another point in R which can be moved; in
particular, we can let x get closer and closer to a. The lingo is to say that x
approaches a. As x approaches a, look at how f (x) changes.
Informally, we say that the function f approaches the limit L as x approaches a
if f (x) approaches L. If f is defined on either side of a, write
\[ \lim_{x \to a} f(x) = L. \]
For example, consider the functions g and h above, and take a = 1. Let x
approach 1. This could be from the left with x < 1 and x getting bigger as it
nears 1, or from the right with x > 1 and x getting smaller as it nears 1. For
all x near 1 but not equal to 1, we have g(x) = 2x + 1 so g(x) approaches 3
as x approaches 1. Thus
\[ \lim_{x \to 1} g(x) = 3. \]
On the other hand, exactly the same thing is happening with h. For all x
near 1 but not equal to 1, we have h(x) = 2x + 1 so h(x) approaches 3 as x
approaches 1. Thus
\[ \lim_{x \to 1} h(x) = 3. \]
In the first case, it happens to be the case that g(1) = 3, which matches the
limit, but in the second case h(1) = 1, which does not match the limit. There
is nothing inconsistent here if we remember the key fact that limits are only
concerned with what happens near a but not at a.
Example 2.1. Find $\lim_{x \to 2} \frac{3x-1}{x-1}$.
Solution 2.2. First, we double-check the problem makes sense. The function
$\frac{3x-1}{x-1}$ is not defined when x = 1. But as we are only concerned with what
happens near 2, this is not a problem as we can restrict the function to the
domain (1, ∞).
Now observe that the numerator approaches 5 as x approaches 2 and the
denominator approaches 1 as x approaches 2. So the function as a whole
approaches $\frac{5}{1} = 5$. That is,
\[ \lim_{x \to 2} \frac{3x - 1}{x - 1} = 5. \]
Example 2.3. Find $\lim_{x \to 1} \frac{x^2-1}{x-1}$.
The function is not defined at x = 0 but it is defined at any other point near 0,
so it makes sense to ask for the limit. Let x approach 0 from the right, that is,
take x ∈ (0, 1) and let x get smaller. Then $\frac{1}{x}$ gets bigger. In fact, the closer x
gets to 0 the bigger $\frac{1}{x}$ gets. So there is no number L such that $f(x) = \frac{1}{x}$
approaches L as x approaches 0 from the right. Therefore $\frac{1}{x}$ has no limit at
x = 0. Note also that the same argument works if we take x ∈ (−1, 0) and
let x approach 0. Then $\frac{1}{x}$ is an increasingly large negative number.
The argument in the last paragraph is somewhat intuitive and logically loose,
while mathematics likes proof and logical rigour. So having a sense for what
limits are now, let’s consider the real, formal definition of a limit.
Definition 2.5. Suppose that f is a function defined on the set (c, a) ∪ (a, b)
where c < a and b > a. Then
\[ \lim_{x \to a} f(x) = L \]
if for every ε > 0 there exists a δ > 0 such that |f(x) − L| < ε whenever
|x − a| < δ.
This is a mouthful. Bit by bit, here’s what it’s saying. First, the notation.
• The condition, “f is a function defined on the set (c, a) ∪ (a, b) where
c < a and b > a” means that f is defined near a but does not have to
be defined at a.
• ε and δ are real numbers which are usually thought of as very small,
but they don’t necessarily have to be.
• |f(x) − L| is the distance from f(x) to L and |x − a| is the distance
from x to a.
Now the content, thought of as a two-step procedure.
• Step 1: start with a given ε > 0. You have to find a δ > 0 so that
if the distance from x to a is less than δ (i.e., |x − a| < δ) then the
distance from f(x) to L must be less than ε (i.e., |f(x) − L| < ε).
• Step 2: Make sure you can do Step 1 for every ε > 0.
The two steps are consistent with the intuitive idea of a limit. They say that
whenever x approaches a (the distance between x and a is small) then f (x)
approaches L (the distance between f (x) and L is small). However, they are
phrased in a way that makes them apt for studying when limits exist, and
especially for when limits do not exist.
Let’s revisit two of the earlier examples.
Example 2.6. Using the formal definition of the limit, show that
\[ \lim_{x \to 1} (2x + 1) = 3. \]
Solution 2.7. Let ε > 0. By the definition of the limit, we need to show that
there is a δ > 0 such that whenever |x − a| < δ we obtain |f(x) − L| < ε.
In this case, f(x) = 2x + 1, a = 1 and L = 3, so we need to show there is a
δ > 0 such that whenever |x − 1| < δ we obtain |(2x + 1) − 3| < ε. Note that
|(2x + 1) − 3| = |2x − 2| = 2|x − 1|. So what we really need to show is that
there is a δ > 0 such that whenever |x − 1| < δ we obtain 2|x − 1| < ε. This
is now easy. Take $\delta = \frac{\varepsilon}{2}$. Then whenever |x − 1| < δ we obtain
\[ |(2x + 1) - 3| = 2|x - 1| < 2 \cdot \frac{\varepsilon}{2} = \varepsilon, \]
as required.
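The choice of δ in Solution 2.7 can be probed numerically. The sketch below samples points with 0 < |x − 1| < δ and checks that |f(x) − 3| < ε at each of them. The function name `delta_works` and the sampling scheme are illustrative assumptions, and sampling finitely many points is evidence rather than a proof.

```python
def f(x):
    # The function from Example 2.6.
    return 2 * x + 1


def delta_works(eps, delta=None, samples=1000):
    """Check |f(x) - 3| < eps at sampled x with 0 < |x - 1| < delta.

    By default delta = eps / 2, the choice made in Solution 2.7.
    """
    a, L = 1.0, 3.0
    if delta is None:
        delta = eps / 2
    for k in range(1, samples):
        offset = delta * k / samples  # strictly between 0 and delta
        for x in (a - offset, a + offset):
            if not abs(f(x) - L) < eps:
                return False
    return True


if __name__ == "__main__":
    for eps in (1.0, 0.1, 1e-3, 1e-6):
        print(eps, delta_works(eps))
    # A delta that is too generous fails, as expected.
    print("delta = eps:", delta_works(0.1, delta=0.1))
```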
Challenge Problem 2.8. Using the formal definition of the limit, show that
$\lim_{x \to 0} \frac{1}{x}$ does not exist.
This is not the formal argument, but let’s get a feeling for why $\lim_{x \to 0} \frac{1}{x}$ does
not exist. Suppose the limit *did* exist and took the value L. Let ε > 0.
Then by the definition of the limit, we can find a δ > 0 such that whenever
|x − a| < δ we have |f(x) − L| < ε. As a = 0 and $f(x) = \frac{1}{x}$ in this case, this
reads as whenever |x| < δ we have $|\frac{1}{x} - L| < \varepsilon$. But as |x| gets very small,
$|\frac{1}{x}|$ gets very large, so no matter what value the fixed number L is, $|\frac{1}{x} - L|$
would also get very large. In particular, it would be larger than ε. This would
give a contradiction, meaning that the assumption that the limit did exist was
incorrect, and therefore the limit does not exist.
Proof. The strategy of the proof is to show that |L − M| can be made as small
as we want. This shows that L = M since if L ≠ M then |L − M| = d for
some distance d > 0. But if we can make |L − M| as small as we want, we
could make it less than d, a contradiction.
Let ε > 0. We aim to show that |L − M| < ε. By the definition of a limit,
to say $\lim_{x \to a} f(x) = L$ means that there is a $\delta_1 > 0$ such that whenever
$|x - a| < \delta_1$ we have $|f(x) - L| < \frac{\varepsilon}{2}$. Similarly, to say $\lim_{x \to a} f(x) = M$ means
that there is a $\delta_2 > 0$ such that whenever $|x - a| < \delta_2$ we have $|f(x) - M| < \frac{\varepsilon}{2}$.
Let δ be the minimum of $\delta_1$ and $\delta_2$. Then when |x − a| < δ we have both
$|f(x) - L| < \frac{\varepsilon}{2}$ and $|f(x) - M| < \frac{\varepsilon}{2}$. By the triangle inequality, this implies
that
\[ |L - M| \le |L - f(x)| + |f(x) - M| = |f(x) - L| + |f(x) - M| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
Proof. We will only do part (a). The rest are left as exercises for you.
Let ε > 0. We wish to show that there is a δ > 0 such that
\[ |(f(x) + g(x)) - (L + M)| < \varepsilon \]
whenever |x − a| < δ.
By the definition of the limit, to say that $\lim_{x \to a} f(x) = L$ means that there
exists $\delta_1 > 0$ such that $|f(x) - L| < \frac{\varepsilon}{2}$ whenever $|x - a| < \delta_1$. Similarly, to say
that $\lim_{x \to a} g(x) = M$ means that there exists $\delta_2 > 0$ such that $|g(x) - M| < \frac{\varepsilon}{2}$
whenever $|x - a| < \delta_2$. Let δ be the minimum of $\delta_1$ and $\delta_2$. Then whenever
|x − a| < δ we have both $|f(x) - L| < \frac{\varepsilon}{2}$ and $|g(x) - M| < \frac{\varepsilon}{2}$. Also, observe
that (f(x) + g(x)) − (L + M) = (f(x) − L) + (g(x) − M). Now, by Distance
Inequality (i),
\begin{align*}
|(f(x) + g(x)) - (L + M)| &= |(f(x) - L) + (g(x) - M)| \\
&\le |f(x) - L| + |g(x) - M| \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon
\end{align*}
as required.
Theorem 2.12 greatly increases the number of limit problems we can tackle.
Starting with the very simple limit $\lim_{x \to a} x = a$, part (c) gives
\[ \lim_{x \to a} x^n = a^n \]
for any positive integer n, part (e) implies that
\[ \lim_{x \to a} c x^n = c a^n \]
for a constant c, and parts (a) and (b) then imply that
\[ \lim_{x \to a} p(x) = p(a) \]
for any polynomial p(x). Further, part (d) then implies that
\[ \lim_{x \to a} \frac{p(x)}{q(x)} = \frac{p(a)}{q(a)} \]
when both p(x) and q(x) are polynomials and q(a) ≠ 0. Let’s record this for
future reference.
Corollary 2.13. Let p(x) and q(x) be polynomials and suppose a is a real
number such that q(a) ≠ 0. Then
\[ \lim_{x \to a} \frac{p(x)}{q(x)} = \frac{p(a)}{q(a)}. \]
Example 2.14. Find $\lim_{x \to 3} \frac{x^2-3x+2}{2x^3-x+1}$.
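Corollary 2.13 reduces a limit like the one in Example 2.14 to substitution, provided the denominator is nonzero at the point. Here is a minimal sketch using exact rational arithmetic; the function name `limit_by_substitution` is made up for illustration.

```python
from fractions import Fraction


def p(x):
    # Numerator polynomial from Example 2.14.
    return x**2 - 3 * x + 2


def q(x):
    # Denominator polynomial from Example 2.14.
    return 2 * x**3 - x + 1


def limit_by_substitution(a):
    """Return p(a)/q(a), which Corollary 2.13 says is the limit at a.

    The corollary only applies when q(a) is nonzero, so that case is rejected.
    """
    a = Fraction(a)
    if q(a) == 0:
        raise ValueError("q(a) = 0, so Corollary 2.13 does not apply")
    return p(a) / q(a)


if __name__ == "__main__":
    print(limit_by_substitution(3))  # p(3) = 2 and q(3) = 52
```

At a = 3 this gives p(3)/q(3) = 2/52 = 1/26.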
Proof. Suppose that $\lim_{x \to a} g(x) = M$. Since f(x) ≤ g(x) for all x near a,
by Theorem 2.12 (f) we have L ≤ M . Since g(x) ≤ h(x) for all x near a, by
Theorem 2.12 (f) we have M ≤ L. Therefore M = L.
One-sided limits. The example of $\frac{x}{|x|}$ in Problem 2.10 illustrates a good
point. The function was 1 for x to the right of 0 and −1 for x to the left of 0.
The limit didn’t exist because the two values didn’t match up. But there is
value to thinking of the function having limit 1 at 0 as x → 0 from the right
and having limit −1 at 0 as x → 0 from the left. This leads to one-sided
limits.
Definition 2.19. Suppose that f is a function defined on the set (a, b) where
b > a. Then
\[ \lim_{x \to a^+} f(x) = L \]
if for every ε > 0 there exists a δ > 0 such that |f(x) − L| < ε whenever
0 < x − a < δ. We say that L is the right hand limit of f(x).
This definition is exactly the same as that for a limit but it is only concerned
with those x near a which also satisfy x > a. That’s why the absolute value
disappeared in the |x − a| < δ part of the definition.
A similar definition works from the left, but now it concerns those x near a
which also satisfy x < a, so |x − a| = a − x.
Definition 2.20. Suppose that f is a function defined on the set (c, a) where
c < a. Then
\[ \lim_{x \to a^-} f(x) = L \]
if for every ε > 0 there exists a δ > 0 such that |f(x) − L| < ε whenever
0 < a − x < δ. We say that L is the left hand limit of f(x).
Example 2.21. Find $\lim_{x \to \infty} \frac{2x^2-x+1}{3x^2+2x-1}$.
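Solution 2.22. Pick the highest power in either the numerator or the denominator,
say it’s $x^n$, and multiply by $\frac{1/x^n}{1/x^n}$; that is, divide both the numerator and the
denominator by $x^n$. In our case, the maximum power is 2, so we get
\[ \frac{2x^2 - x + 1}{3x^2 + 2x - 1} = \frac{1/x^2}{1/x^2} \cdot \frac{2x^2 - x + 1}{3x^2 + 2x - 1} = \frac{2 - \frac{1}{x} + \frac{1}{x^2}}{3 + \frac{2}{x} - \frac{1}{x^2}}. \]
Now take limits. By the arithmetic of limits in Theorem 2.12, we can take
limits in the numerator and denominator separately, and in each we can take
limits term by term. Further, since $\lim_{x \to \infty} \frac{1}{x} = 0$ and $\frac{1}{x^2} = \frac{1}{x} \cdot \frac{1}{x}$,
Theorem 2.12 (c) implies that $\lim_{x \to \infty} \frac{1}{x^2} = 0$. Therefore, in the numerator we
have
\[ \lim_{x \to \infty} \left( 2 - \frac{1}{x} + \frac{1}{x^2} \right) = \lim_{x \to \infty} 2 - \lim_{x \to \infty} \frac{1}{x} + \lim_{x \to \infty} \frac{1}{x^2} = 2 - 0 + 0 = 2 \]
and in the denominator we have
\[ \lim_{x \to \infty} \left( 3 + \frac{2}{x} - \frac{1}{x^2} \right) = \lim_{x \to \infty} 3 + 2 \lim_{x \to \infty} \frac{1}{x} - \lim_{x \to \infty} \frac{1}{x^2} = 3 + 0 - 0 = 3. \]
Hence
\[ \lim_{x \to \infty} \frac{2x^2 - x + 1}{3x^2 + 2x - 1} = \frac{2}{3}. \]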
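To see the limit in Example 2.21 emerge numerically, one can evaluate the rational function at larger and larger x. This is a sketch only: the sample points 10, 100, ..., 10^6 are an arbitrary choice, and a table of values suggests the limit but does not prove it.

```python
def r(x):
    # The rational function from Example 2.21.
    return (2 * x**2 - x + 1) / (3 * x**2 + 2 * x - 1)


# Distance from r(x) to the claimed limit 2/3 at x = 10, 100, ..., 10**6.
gaps = [abs(r(10.0**k) - 2 / 3) for k in range(1, 7)]

if __name__ == "__main__":
    for k, gap in enumerate(gaps, start=1):
        print(f"x = 10^{k}: |r(x) - 2/3| = {gap:.3e}")
```

The gaps shrink roughly like 7/(9x), which matches the algebra: r(x) − 2/3 = (−7x + 5) / (3(3x² + 2x − 1)).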
is the 0 in the denominator, which obscures what’s going on. Let’s instead
modify the trick to make sure the denominator has a limit which is a nonzero
constant.
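Multiply by $\frac{1/x^n}{1/x^n}$ where n is the highest power in the denominator (rather than
the highest in either the numerator or denominator). Doing so will give a limit
with a constant in the denominator. In our case, we get
\[ \frac{4x^3 - 3x + 1}{2x^2 + 8x - 2} = \frac{1/x^2}{1/x^2} \cdot \frac{4x^3 - 3x + 1}{2x^2 + 8x - 2} = \frac{4x - \frac{3}{x} + \frac{1}{x^2}}{2 + \frac{8}{x} - \frac{2}{x^2}}. \]
Now in the numerator we have
\begin{align*}
\lim_{x \to \infty} \left( 4x - \frac{3}{x} + \frac{1}{x^2} \right) &= \lim_{x \to \infty} 4x - 3 \lim_{x \to \infty} \frac{1}{x} + \lim_{x \to \infty} \frac{1}{x^2} \\
&= \lim_{x \to \infty} 4x - 0 + 0 \\
&= \lim_{x \to \infty} 4x
\end{align*}
where the term $\lim_{x \to \infty} 4x$ has been deliberately left alone for the moment.
In the denominator we have
\[ \lim_{x \to \infty} \left( 2 + \frac{8}{x} - \frac{2}{x^2} \right) = \lim_{x \to \infty} 2 + 8 \lim_{x \to \infty} \frac{1}{x} - 2 \lim_{x \to \infty} \frac{1}{x^2} = 2 + 0 - 0 = 2. \]
Thus
\[ \lim_{x \to \infty} \frac{4x^3 - 3x + 1}{2x^2 + 8x - 2} = \lim_{x \to \infty} \frac{4x}{2} = \lim_{x \to \infty} 2x = \infty. \]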
3. Continuity
[Figure: graphs of f(x) and g(x).]
For example, if
\[ f(x) = \begin{cases} \frac{x}{|x|} & \text{if } x \ne 0, \\ 1 & \text{if } x = 0 \end{cases} \]
[Figure: graph of f(x).]
Analogous definitions of continuity exist for half-open intervals (b, c] and [b, c),
as well as for extended intervals (b, ∞) or [b, ∞), and for unions of intervals
such as (−1, 1) ∪ [3, ∞).
For example, consider the function $f(x) = x^2$ restricted to the closed interval
[0, 2]. Then f is continuous at a for every a ∈ (0, 2), it is continuous on the
right at 0, and it is continuous on the left at 2. Therefore f is continuous on
[0, 2].
The arithmetic of limits translates into arithmetic for continuous functions.
Theorem 3.6. If f and g are functions that are continuous at a then:
(a) f + g is continuous at a;
(b) f − g is continuous at a;
(c) f g is continuous at a;
(d) f/g is continuous at a provided that g(a) ≠ 0;
(e) cf is continuous at a, for any constant c.
Proof. We’ll prove part (a), the others being similar. Since f and g are
continuous at a, by the definition of continuity, $\lim_{x \to a} f(x) = f(a)$ and
$\lim_{x \to a} g(x) = g(a)$. By the arithmetic of limits,
\[ \lim_{x \to a} (f(x) + g(x)) = \lim_{x \to a} f(x) + \lim_{x \to a} g(x) = f(a) + g(a). \]
Many other functions are also continuous. Without proof, we’ll simply state
that sin x and cos x are continuous at all x ∈ R. Theorem 3.6 (d) then implies
that $\tan x = \frac{\sin x}{\cos x}$ is continuous whenever cos x ≠ 0. Similarly, $\sec x = \frac{1}{\cos x}$,
$\csc x = \frac{1}{\sin x}$ and $\cot x = \frac{\cos x}{\sin x}$ are continuous whenever the denominator is
nonzero.
As well, the exponential function $a^x$ is continuous for all x ∈ R and $\log_a x$ is
continuous for all x ∈ R+.
The proof is a bit tricky, but the idea is simple. As x approaches a we have
g(x) approaching g(a). Let g(x) = y and g(a) = b. Then as y approaches b
we have f (y) approaching f (b). That is, as x approaches a we have f (g(x))
approaching f (g(a)). So f ◦ g is continuous.
Challenge Problem 3.10. Prove Theorem 3.9.
Example 3.11. Show that the function $F(x) = \sqrt{\frac{3x+1}{x^2-2x+1}}$ is continuous for
all $x \in [-\frac{1}{3}, 1) \cup (1, \infty)$.
Solution 3.12. Observe that F(x) = (f ◦ g)(x) where $g(x) = \frac{3x+1}{x^2-2x+1}$ and
$f(x) = \sqrt{x}$. First, $x^2 - 2x + 1 = (x - 1)^2$, which is 0 at x = 1, so $g(x) = \frac{3x+1}{x^2-2x+1}$
is continuous on R − {1}. Second, the domain of $f(x) = \sqrt{x}$ is x ≥ 0, so for the
composition F(x) = f(g(x)) to make sense we need g(x) ≥ 0. Observe that
the denominator of g(x) is the square $(x - 1)^2$ which is always nonnegative.
Therefore g(x) ≥ 0 if and only if the numerator satisfies 3x + 1 ≥ 0. This is
true if and only if $x \ge -\frac{1}{3}$. Putting it all together, if $x \ge -\frac{1}{3}$ and x ≠ 1 then
F(x) = f(g(x)) makes sense and is continuous because it is a composition of
two continuous functions.
Maximum and minimum values refine the notions of upper and lower bounds
respectively, provided a maximum or minimum value exists. In the previous
example of $f(x) = \frac{1}{x}$ on (0, 1], f has minimum value 1 since 1 is a
lower bound for f on (0, 1] and f(1) = 1. However, as f has no upper bound
it can have no maximum value either.
Example 3.21. Let $f(x) = x^2$ on (0, 2]. Find the minimum and maximum
values of f, if they exist.
Solution 3.22. Observe that 4 is an upper bound since f(x) ≤ 4 for any
x ∈ (0, 2]. Also, 2 is a point in the interval and f(2) = 4, so 4 is the maximum
value of f on (0, 2]. On the other hand, 0 is a lower bound since f(x) ≥ 0
for all x ∈ (0, 2], but 0 is not a minimum value since there is no d ∈ (0, 2]
such that f(d) = 0. In fact, f has no minimum value on (0, 2]. To see why,
suppose that N is the minimum value of f on (0, 2]. We claim that it must
be the case that N ≤ 0. For if N > 0 then consider $\frac{N}{2}$, which is larger than 0
but smaller than N. We have $\frac{N}{2} = f\left(\sqrt{\frac{N}{2}}\right)$, where $\sqrt{\frac{N}{2}} \in (0, 2]$ because
N ≤ f(2) = 4, so N cannot be a lower bound
for f. But with N ≤ 0, there cannot be any d ∈ (0, 2] with f(d) = N since
f(d) > 0 for all such d while N ≤ 0. Hence N cannot be a minimum value
for f, a contradiction.
Notice that continuity has not yet entered the picture in terms of upper and
lower bounds and maximum and minimum values. It only does so now. The
following theorem says that a continuous function on a closed interval does
have both a maximum and a minimum value.
Theorem 3.23 (Extreme Value Theorem). Let f be a continuous function on
a closed interval [a, b]. Then f is bounded on [a, b] and f has both a maximum
value and a minimum value on [a, b].
For example, let S = (0, 1). Then 1 is an upper bound for S, as are 2 and 17.
We claim that 1 is the least upper bound for S. This means showing that
if K is any other upper bound for S then 1 ≤ K. To prove this, suppose
instead that K is an upper bound for S and K < 1.
We’ll aim for a contradiction. Observe that the distance between K and 1 is
1 − K. So consider $L = K + \frac{1-K}{2}$. Then K < L and L < 1. As 0 < K < L < 1 we
have L ∈ (0, 1) so K < L implies that K cannot be an upper bound for (0, 1),
contradicting our choice of K.
The notion of a greatest lower bound can also be defined, although we won’t
need it right now.
The following is an axiom of the real numbers. That is, it is a statement to
be accepted without proof.
Axiom (The Least Upper Bound Axiom). Every nonempty set of real numbers
that has an upper bound has a least upper bound.
Lemma 3.27. Let f be continuous on [a, b]. If f (a) < 0 < f (b), or if
f (a) > 0 > f (b), then there is a number c between a and b such that f (c) = 0.
Proof. Suppose f(a) < 0 < f(b), the other case being proved similarly. Since f
is continuous, $\lim_{x \to a} f(x) = f(a)$. That is, as x approaches a, f(x) approaches
f(a). As f(a) < 0, this means that if x is close enough to a then
f(x) also has to be less than 0. Technically, there is some number $\varepsilon_0 > a$
such that f is negative on the interval $[a, \varepsilon_0)$. This line of reasoning will be
used repeatedly in the argument that follows.
Let S be the set of all values ε for which f is negative on [a, ε). That is,
\[ S = \{ \varepsilon \mid f \text{ is negative on } [a, \varepsilon) \}. \]
Observe that S is nonempty since $\varepsilon_0 \in S$. The set S has an upper bound
since f(b) > 0 implies that f must be positive on some interval (t, b), implying
that f is not negative on the whole interval [a, b). So b is an upper bound on S.
Therefore, by the Least Upper Bound Axiom, S has a least upper bound c.
Note that c ≤ b.
We claim that f(c) = 0, which would prove the lemma. If f(c) > 0 then f is
positive on some interval of the form (t, c) so t is also an upper bound on S
and t < c. But this contradicts the fact that c is the least upper bound on S.
If f(c) < 0 then f is negative on some interval of the form (c, s), implying
that f is negative on the interval [a, s), meaning that the least upper bound
of S must be at least as large as s. But this contradicts the fact that c < s.
Thus the only option left is that f(c) = 0.
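Lemma 3.27 says a zero exists but the least upper bound argument does not tell us how to find it. A different, standard technique that does locate a zero under the same hypotheses is bisection: keep halving the interval while keeping a sign change inside it. The sketch below is an illustration of that technique, not the proof's argument; the function name `bisect_root` and the tolerance are assumptions made here.

```python
def bisect_root(f, a, b, tol=1e-10):
    """Approximate a zero of a continuous f on [a, b] by bisection.

    Requires a < b and f(a), f(b) of opposite sign (or one of them zero),
    which is the hypothesis of Lemma 3.27.
    """
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if (fa < 0) == (fb < 0):
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        m = (a + b) / 2
        fm = f(m)
        if fm == 0:
            return m
        if (fm < 0) == (fa < 0):
            a, fa = m, fm  # sign change is in the right half [m, b]
        else:
            b, fb = m, fm  # sign change is in the left half [a, m]
    return (a + b) / 2


if __name__ == "__main__":
    # f(x) = x^3 - 2 has f(0) < 0 < f(2), so a zero lies between 0 and 2.
    print(bisect_root(lambda x: x**3 - 2, 0.0, 2.0))
```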
Proof. Saying that K is between f (a) and f (b) means that either f (a) < K <
f (b) or f (b) < K < f (a). Suppose that f (a) < K < f (b), the other case being
similar. Let g be the function defined by
g(x) = f (x) − K.
Observe that g is a difference of continuous functions on [a, b] so g is continuous
on [a, b]. Also, g(a) = f(a) − K < 0 and g(b) = f(b) − K > 0. So
by Lemma 3.27, there is a number c between a and b such that g(c) = 0.
Therefore, as f (x) = g(x) + K, we obtain f (c) = g(c) + K = K.
4. Differentiation
[Figure: graph of f(x).]
How can we do this? First, remember we want the equation of a line, and a
line is determined by its slope and one point on the line. We have the one
point, it is (x, f (x)). What we need to find is the slope.
The idea is to find the slope through a series of approximations. Pick a point
x + h close to x. To simplify the argument, suppose h > 0. Look at the red
line connecting the points (x, f (x)) and (x + h, f (x + h)) on the graph of f ,
called the secant line.
[Figure (1): graph of f(x) with the secant line through the points at x and x + h.]
This secant line approximates the tangent line. The advantage of the secant
line is that its slope is easy to calculate since we know two points on the line.
The slope of the secant line is
\[ \frac{f(x + h) - f(x)}{(x + h) - x} = \frac{f(x + h) - f(x)}{h}. \]
Thinking in terms of the graph (1), if h is made smaller then we get a new
secant line which should be a better approximation to the tangent line.
As these approximations to the tangent line improve the smaller h gets, let’s
take the limit as h approaches 0 in order to get the tangent line on the nose.
Then the slope of the tangent line to the graph of f at x should be
\[ \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \]
WARNING: The definition of the derivative is careful to make sure that the
limit actually exists. If the limit does not exist (and this can happen) then
the function is not differentiable at x.
The definition of the derivative works one point x at a time. It’s often easier
to talk about functions being defined on an interval, or some domain.
Definition 4.2. A function f is differentiable on a subset S of the real numbers
if and only if it is differentiable at every point x in S.
We’ll soon be able to calculate derivatives with ease, but let’s stick to the
definition for the time being.
Example 4.3. Using the definition of the derivative, find the derivative of
$f(x) = x^2$.
Solution 4.4. Before taking any limits, just consider the fraction $\frac{f(x+h)-f(x)}{h}$.
Using $f(x) = x^2$ and simplifying we get
\[ \frac{f(x + h) - f(x)}{h} = \frac{(x + h)^2 - x^2}{h} = \frac{(x^2 + 2xh + h^2) - x^2}{h} = \frac{2xh + h^2}{h} = 2x + h. \]
With the arithmetic done, now take limits to get
\[ \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} (2x + h) = 2x. \]
Thus the derivative of $f(x) = x^2$ is $f'(x) = 2x$. Note that this works for all
x ∈ R, so f is differentiable on R.
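The secant slopes from Solution 4.4 can be tabulated to watch them approach the derivative. This is a sketch under simple assumptions: the point x = 3 and the step sizes h = 10^-1, ..., 10^-5 are arbitrary choices, and floating point rounding means the printed values are only approximately 2x + h.

```python
def difference_quotient(f, x, h):
    # Slope of the secant line through (x, f(x)) and (x + h, f(x + h)).
    return (f(x + h) - f(x)) / h


def square(x):
    return x * x


x0 = 3.0
steps = [10.0**-k for k in range(1, 6)]
slopes = [difference_quotient(square, x0, h) for h in steps]

if __name__ == "__main__":
    for h, slope in zip(steps, slopes):
        # Solution 4.4 predicts the slope is 2x + h, which tends to 2x = 6.
        print(f"h = {h:g}: secant slope = {slope:.8f}")
```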
Solution 4.6. First observe that the domain of $f(x) = \sqrt{x}$ is [0, ∞). The
derivative of f is defined in terms of a limit involving h approaching 0, where
we can equivalently think of this as x + h approaching x. In this case, if x = 0
then the function f is not defined for points less than 0, so 0 + h can approach
0 only from the right. Therefore the limit $\lim_{h \to 0} \frac{f(0+h)-f(0)}{h}$ does not exist.
Hence f is not differentiable at x = 0.
Now suppose x > 0. Then f is defined both to the left and right of x so
it makes sense to take a limit as x + h approaches x. Before taking limits,
consider the fraction $\frac{f(x+h)-f(x)}{h}$. We have
\begin{align*}
\frac{f(x + h) - f(x)}{h} &= \frac{\sqrt{x+h} - \sqrt{x}}{h} = \frac{\sqrt{x+h} - \sqrt{x}}{h} \cdot \frac{\sqrt{x+h} + \sqrt{x}}{\sqrt{x+h} + \sqrt{x}} \\
&= \frac{(x + h) - x}{h \cdot (\sqrt{x+h} + \sqrt{x})} \\
&= \frac{h}{h \cdot (\sqrt{x+h} + \sqrt{x})} \\
&= \frac{1}{\sqrt{x+h} + \sqrt{x}}.
\end{align*}
Therefore
\[ \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{1}{\sqrt{x+h} + \sqrt{x}} = \frac{1}{\sqrt{x} + \sqrt{x}} = \frac{1}{2\sqrt{x}}. \]
In conclusion, $f(x) = \sqrt{x}$ is differentiable for x ∈ (0, ∞) and at any such x
the derivative is $f'(x) = \frac{1}{2\sqrt{x}}$.
Remember that the derivative is designed to find the slope of a tangent line.
So we can now go back to work out actual tangent lines.
Example 4.7. Find the equation of the tangent line for the function $f(x) = x^2$
at x = 3.
Solution 4.8. To get the equation of a line we need a point on the line and
its slope. For the point on the line, at x = 3 we have $f(3) = 3^2 = 9$ so (3, 9)
will be a point on the tangent line. Its slope is given by the derivative of f
at x = 3. We already saw that $f'(x) = 2x$, so at x = 3 we have $f'(3) = 6$.
Therefore the equation of the tangent line is y − 9 = 6(x − 3), which simplifies
to y = 6x − 9.
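The recipe in Solution 4.8 (a point plus a slope) is easy to package as a small function. This is a sketch: the name `tangent_line` and the choice to return the line in slope-intercept form are assumptions made for illustration, and the derivative is supplied by hand rather than computed.

```python
def tangent_line(f, f_prime, a):
    """Return (slope, intercept) of the tangent line to f at x = a.

    The line through (a, f(a)) with slope f'(a) is
    y - f(a) = f'(a) (x - a), i.e. y = f'(a) x + (f(a) - f'(a) a).
    """
    slope = f_prime(a)
    intercept = f(a) - slope * a
    return slope, intercept


if __name__ == "__main__":
    # Example 4.7: f(x) = x^2 with f'(x) = 2x, at x = 3.
    slope, intercept = tangent_line(lambda x: x**2, lambda x: 2 * x, 3)
    print(f"y = {slope}x + ({intercept})")
```

For Example 4.7 this returns slope 6 and intercept −9, matching y = 6x − 9.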
[Figure: graph of f(x).]
It’s not hard to see using the definition of continuity that f is continuous on R,
and this matches the geometric intuition of being able to draw the graph of f
without lifting your pencil off the page.
Now consider the differentiability of f. If x > 0 then f(x) = x. It’s easy
to check using the definition of the derivative that f(x) is differentiable on
(0, ∞) and $f'(x) = 1$. Geometrically, this makes sense since the tangent line
to a line is the same line, and the slope of f(x) = x is 1. Similarly, if x < 0
then f(x) = −x and this is differentiable on (−∞, 0), although now the slope
of the tangent line is −1. However, at x = 0 something different happens. We
have just seen that
\[ \lim_{h \to 0^+} \frac{f(x + h) - f(x)}{h} = 1 \quad \text{and} \quad \lim_{h \to 0^-} \frac{f(x + h) - f(x)}{h} = -1 \]
for f(x) = |x|. Therefore $\lim_{h \to 0} \frac{f(x+h)-f(x)}{h}$ does not exist. Hence f is *not*
differentiable at x = 0. Geometrically, the problem is the corner at x = 0 in the
graph of f, where the slope of the tangent line has to change abruptly.
What we conclude is that the function f (x) = |x| is continuous on R but not
differentiable on R. More precisely, f is continuous on R and differentiable
only on R − {0}.
On the other hand, a differentiable function is continuous.
Theorem 4.9. If f is differentiable at x then f is continuous at x.
$= f'(x) \cdot 0 = 0.$
Thus
\[ \lim_{h \to 0} f(x + h) = f(x). \]
This doesn’t immediately look like it, but it is equivalent to saying that f
is continuous at x. Thinking through it, as x + h approaches x we have
f(x + h) approaching f(x). This is the same intuition used in saying that
$\lim_{y \to x} f(y) = f(x)$.
Challenge Problem 4.10. Finish the proof of Theorem 4.9 by showing that
$\lim_{h \to 0} f(x + h) = f(x)$ is the same as saying that the limit of f at x is f(x).
Proof. We only prove part (a) to give a taste of how things are done. The definition
of the derivative of f + g at x considers the fraction $\frac{(f+g)(x+h)-(f+g)(x)}{h}$.
So first observe that
The proofs of Theorems 4.13 and 4.14 are trickier but not unreasonable. However,
we will not dwell on them as the point of the formulas is to start calculating.
Now let’s build. Consider $f(x) = x^2$. Think of f(x) as x · x. Then by the
Product Rule and the fact that $(x)' = 1$ we get
\[ f'(x) = (x)' \cdot x + x \cdot (x)' = 1 \cdot x + x \cdot 1 = 2x. \]
Suppose that for n ≥ 2 we have proved that $(x^{n-1})' = (n - 1)x^{n-2}$. By the
Product Rule we obtain
\begin{align*}
(x^n)' = (x \cdot x^{n-1})' &= (x)' \cdot x^{n-1} + x \cdot (x^{n-1})' \\
&= 1 \cdot x^{n-1} + x \cdot (n - 1)x^{n-2} \tag{2} \\
&= n x^{n-1}.
\end{align*}
Continuing, suppose that $p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$ is a polynomial.
From Lemma 4.11, Theorem 4.12 and (2) we obtain the following.
Proposition 4.15. If $p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$ is a polynomial then
\[ p'(x) = n a_n x^{n-1} + (n - 1) a_{n-1} x^{n-2} + \cdots + 2 a_2 x + a_1. \]
Proposition 4.15 can seem like a mouthful, but it’s just the notation that makes
it that way. In practice, differentiating polynomials is dead easy.
Example 4.16. Find the derivative of $f(x) = 3x^4 - 7x^3 + 2x + 8$.
Solution 4.17. We have $f'(x) = 12x^3 - 21x^2 + 2$.
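Proposition 4.15 is mechanical enough to write as a few lines of code. The sketch below assumes a representation that is not from these notes: a polynomial is stored as a list of coefficients with `coeffs[k]` the coefficient of x^k, and the function name `poly_derivative` is made up for illustration.

```python
def poly_derivative(coeffs):
    """Differentiate a polynomial given as [a_0, a_1, ..., a_n].

    The term a_k x^k becomes k a_k x^(k-1), so the new coefficient list
    is [a_1, 2 a_2, ..., n a_n]. A constant polynomial gives [].
    """
    return [k * c for k, c in enumerate(coeffs)][1:]


if __name__ == "__main__":
    # Example 4.16: f(x) = 3x^4 - 7x^3 + 2x + 8, lowest degree first.
    f_coeffs = [8, 2, 0, -7, 3]
    print(poly_derivative(f_coeffs))  # coefficients of 12x^3 - 21x^2 + 2
```

Applied to Example 4.16 the result is [2, 0, −21, 12], that is, 12x³ − 21x² + 2, in agreement with Solution 4.17.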
Example 4.18. If $g(x) = (x^3 - 4x + 2)(x^{11} - x^6 + 1)$, find $g'(1)$.
Solution 4.19. You could multiply the two factors of g(x) together and then
differentiate, but it may be easier to simply use the product rule. We have
Using the Quotient Rule, rational functions are also easy to differentiate.
Example 4.20. Find the derivative of $f(x) = \frac{3x^2-2x+4}{x^2-1}$.
A special case of Theorem 4.14 that sometimes gets singled out as its own rule
is when f (x) = 1.
So $h'(2) = -\frac{20}{11^2} = -\frac{20}{121}$.
Solution 4.26. Use the Quotient Rule as usual, keeping track of constants.
We get
\begin{align*}
f'(x) &= \frac{(ax^2 + bx + c)'(2x + 1) - (ax^2 + bx + c)(2x + 1)'}{(2x + 1)^2} \\
&= \frac{(2ax + b)(2x + 1) - (ax^2 + bx + c)(2)}{(2x + 1)^2} \\
&= \frac{(4ax^2 + 2bx + 2ax + b) - (2ax^2 + 2bx + 2c)}{(2x + 1)^2} \\
&= \frac{2ax^2 + 2ax + (b - 2c)}{(2x + 1)^2}.
\end{align*}
Therefore $f'(-1) = \frac{2a - 2a + (b - 2c)}{(-1)^2} = b - 2c$.
Challenge Problem 4.27. Use the Reciprocal Rule to show that if $n$ is a negative integer then $(x^n)' = nx^{n-1}$.
\[ \frac{f(x + h) - f(x)}{(x + h) - x}. \]
This is exactly the slope of the secant line connecting the points $(x, f(x))$ and $(x + h, f(x + h))$ on the graph of $f$, just as in diagram (1).
Solution 4.32. The velocity is $p'(t) = 3t^2 - 2$ and the acceleration is $p''(t) = 6t$. At $t = 2$, the velocity is $p'(2) = 10$ and the acceleration is $p''(2) = 12$.
Solution 4.34. This is just finding the derivative using the Quotient Rule:
\[
\begin{aligned}
\frac{dy}{dx} &= \frac{(2x - 1)'(3x + 4) - (2x - 1)(3x + 4)'}{(3x + 4)^2} \\
&= \frac{2(3x + 4) - (2x - 1) \cdot 3}{(3x + 4)^2} \\
&= \frac{11}{(3x + 4)^2}.
\end{aligned}
\]
The rate of change of $y$ at $x = 0$ is $\frac{dy}{dx}(0) = \frac{11}{16}$.
In terms of higher order derivatives, we write $\frac{d^2y}{dx^2}$ for $y''(x)$, $\frac{d^3y}{dx^3}$ for $y'''(x)$, and so on.
Example 4.35. An object’s motion in space over time is given by the function $y(t) = 2t^3 - 3$. Find its acceleration $\frac{d^2y}{dt^2}$.
Solution 4.36. The velocity is the first derivative of $y$, which is $\frac{dy}{dt} = 6t^2$. The acceleration is the second derivative of $y$, which is $\frac{d^2y}{dt^2} = 12t$.
The $\frac{dy}{dx}$ notation is very handy if there is more than one variable in play, which is the content of the next section on the Chain Rule.
The Chain Rule. The Chain Rule is a formula for taking the derivative
of a composition of functions. Suppose that y = f ◦ g where f and g are
differentiable. What is the derivative of the composition y?
To break it down, we’ll use rates of change and the Leibniz notation. Let’s
slow the composition down by inserting a middle term. Think of y as
y = f (u).
That is, y is a function depending on a variable u. For u we want to take g(x),
so think of u as
u = g(x).
That is, u is a function depending on a variable x. Then we have
y = f (g(x)).
Now the rate at which $y$ changes with respect to $x$ should have something to do with how $y$ changes with respect to $u$ and how $u$ changes with respect to $x$. That is, $\frac{dy}{dx}$ should have something to do with $\frac{dy}{du}$ and $\frac{du}{dx}$. The Chain Rule describes the relationship as a product:
\[ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}. \tag{3} \]
Example 4.37. If $y = \frac{1}{3u + 1}$ and $u = 2x^2$, find $\frac{dy}{dx}$.
Solution 4.38. Observe that $\frac{dy}{du} = -\frac{3}{(3u + 1)^2}$ and $\frac{du}{dx} = 4x$. So by the Chain Rule,
\[ \frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx} = -\frac{3}{(3u + 1)^2} \cdot 4x. \]
The answer should be in terms of the variable $x$, so substitute in $u = 2x^2$ to get
\[ \frac{dy}{dx} = -\frac{3}{(6x^2 + 1)^2} \cdot 4x = \frac{-12x}{(6x^2 + 1)^2}. \]
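A quick numerical sanity check of this Chain Rule computation, as a sketch only: it compares the closed-form derivative with a central difference quotient. The helper name `central_difference` and the step size `h` are assumptions chosen for illustration.

```python
def central_difference(func, x, h=1e-6):
    """Approximate func'(x) by the symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)


def y_of_x(x):
    """y = 1 / (3u + 1) with the middle variable u = 2x^2."""
    u = 2 * x**2
    return 1 / (3 * u + 1)


def dy_dx_formula(x):
    """The Chain Rule answer: dy/dx = -12x / (6x^2 + 1)^2."""
    return -12 * x / (6 * x**2 + 1) ** 2


for x in [-1.0, 0.0, 0.5, 2.0]:
    print(x, central_difference(y_of_x, x), dy_dx_formula(x))
```

The two printed columns should agree to several decimal places at each sample point; agreement at a handful of points is evidence, not a proof.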
Solution 4.40. You could multiply out $(x^2 - 1)^{32}$ and then take the derivative but this would take a while. Instead, think of $y = (x^2 - 1)^{32}$ as $y = u^{32}$ where $u = x^2 - 1$. Then by the Chain Rule,
\[ \frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx} = 32u^{31} \cdot 2x = 64x(x^2 - 1)^{31}. \]
Now let’s try to get a formula for the Chain Rule in terms of the functions $f$ and $g$. We have $y = f \circ g$ where we think of $y = f(u)$ and $u = g(x)$. The Chain Rule says
\[ \frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx}. \]
Now $\frac{dy}{du}$ is $f'(u)$ where we have taken the derivative of $f$ with respect to $u$. Remembering that $u = g(x)$, this means that $\frac{dy}{du} = f'(g(x))$. On the other hand, $\frac{du}{dx}$ is $g'(x)$. Thus
\[ \frac{dy}{dx} = f'(g(x))\, g'(x). \]
Finally, since $y = f \circ g$, we have $\frac{dy}{dx} = (f \circ g)'$. Hence we have:
Theorem 4.41 (The Chain Rule). Let $f$ and $g$ be functions such that $g$ is differentiable at $x$ and $f$ is differentiable at $g(x)$. Then
\[ (f \circ g)'(x) = f'(g(x))\, g'(x). \]
Remark 4.42. This is not a proof of the Chain Rule since (3) was not
rigorously justified.
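Example 4.43. Let $y = \left(\frac{x + 1}{x - 1}\right)^3$. Find $y'$.
Solution 4.44. Solution 1: Use the Leibniz technique. Let $u = \frac{x + 1}{x - 1}$ so that $y = u^3$. Then
\[
\begin{aligned}
\frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx} &= 3u^2 \cdot \frac{1(x - 1) - (x + 1)(1)}{(x - 1)^2} \\
&= 3\left(\frac{x + 1}{x - 1}\right)^2 \cdot \frac{-2}{(x - 1)^2} \\
&= \frac{-6(x + 1)^2}{(x - 1)^4}.
\end{aligned}
\]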
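Solution 4.46. Warning: the ideas used here are easy but the details are technically harder.
\[ \frac{dy}{dx} = \frac{dy}{du}\frac{du}{dv}\frac{dv}{dx} = \frac{1}{2\sqrt{u}} \cdot \frac{-3v^2}{(v^3 - 1)^2} \cdot 2x. \]
Substituting back $u = \frac{1}{v^3 - 1}$ and $v = x^2 + 1$:
\[ \frac{1}{2\sqrt{u}} \cdot \frac{-3v^2}{(v^3 - 1)^2} \cdot 2x = \frac{1}{2\sqrt{\frac{1}{(x^2 + 1)^3 - 1}}} \cdot \frac{-3(x^2 + 1)^2 \cdot 2x}{[(x^2 + 1)^3 - 1]^2}. \]
Solution 2: Use Theorem 4.41. We have $y = f \circ g \circ h$ where $f(x) = \sqrt{x}$, $g(x) = \frac{1}{x^3 - 1}$ and $h(x) = x^2 + 1$. Note that $f'(x) = \frac{1}{2\sqrt{x}}$, $g'(x) = -\frac{3x^2}{(x^3 - 1)^2}$ and $h'(x) = 2x$.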
The key to making the Chain Rule work for composites of multiple functions
is to start differentiating from the outermost function in the composite and
work inwards to the innermost function.
TIP: How do you get good at the Chain Rule? Practise. A lot. And when
you think you’re good at it, practise some more.
Solution 4.50. This time we’ll only use Theorem 4.41 and speed up a little bit. This is a composite of three functions, $y = f \circ g \circ h$, where $f(x) = \tan(x)$, $g(x) = \sqrt{x}$ and $h(x) = 2x^2 + 3x - 1$. So by Theorem 4.41,
\[
\begin{aligned}
y'(x) &= f'(g(h(x)))\, g'(h(x))\, h'(x) \\
&= \sec^2(g(h(x))) \cdot \frac{1}{2\sqrt{h(x)}} \cdot (4x + 3) \\
&= \sec^2\left(\sqrt{2x^2 + 3x - 1}\right) \cdot \frac{1}{2\sqrt{2x^2 + 3x - 1}} \cdot (4x + 3) \\
&= \sec^2\left(\sqrt{2x^2 + 3x - 1}\right) \cdot \frac{4x + 3}{2\sqrt{2x^2 + 3x - 1}}.
\end{aligned}
\]
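The "outermost first, then work inwards" recipe can be sketched as a small routine that multiplies the derivatives of each stage, each evaluated at the value fed into that stage. The function `chain_rule` and the representation of a composite as a list of (function, derivative) pairs are assumptions made for this illustration.

```python
import math


def chain_rule(stages, x):
    """Derivative at x of a composite, given (func, derivative) pairs
    listed from the outermost function to the innermost one."""
    # Work out the argument fed to each stage, innermost first.
    inputs = []
    value = x
    for func, _ in reversed(stages):
        inputs.append(value)
        value = func(value)
    inputs.reverse()  # inputs[i] is now the argument of stages[i]

    # Multiply the stage derivatives, starting from the outermost.
    product = 1.0
    for (_, derivative), argument in zip(stages, inputs):
        product *= derivative(argument)
    return product


# y = tan(sqrt(2x^2 + 3x - 1)) as f(g(h(x)))
stages = [
    (math.tan, lambda t: 1 / math.cos(t) ** 2),          # f and f' = sec^2
    (math.sqrt, lambda t: 1 / (2 * math.sqrt(t))),       # g and g'
    (lambda t: 2 * t**2 + 3 * t - 1, lambda t: 4 * t + 3),  # h and h'
]
print(chain_rule(stages, 1.0))
```

At $x = 1$ we have $h(1) = 4$ and $g(4) = 2$, so the printed value should match $\sec^2(2) \cdot \frac{7}{4}$ from the closed formula.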
Let’s return to (4) and see why these derivatives make sense. Two limits play an important role:
\[ \lim_{x \to 0} \frac{\sin x}{x} = 1 \quad\text{and}\quad \lim_{x \to 0} \frac{\cos x - 1}{x} = 0. \]
The first says that the tangent line to sin x as x approaches 0 matches the line
y = x, which is believable from the graph. The second says that the tangent
line to cos x as x approaches 0 matches the line y = 1, or equivalently, the
tangent line to cos x − 1 as x approaches 0 matches the line y = 0. Again,
thinking of the first interpretation, this is believable from the graph. Actually
working these out is not overly difficult but goes a bit beyond the scope of
what we want to do. Let’s take them as given and just use them.
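Although the notes take these two limits as given, they are easy to believe numerically. The following sketch simply tabulates both quotients for small $x$; the function names are assumptions for illustration.

```python
import math


def sin_quotient(x):
    """sin(x) / x, which should approach 1 as x approaches 0."""
    return math.sin(x) / x


def cos_quotient(x):
    """(cos(x) - 1) / x, which should approach 0 as x approaches 0."""
    return (math.cos(x) - 1) / x


for x in [0.1, 0.01, 0.001]:
    print(x, sin_quotient(x), cos_quotient(x))
```

A table like this suggests the limiting values but does not prove them.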
Theorem 4.51. We have $\frac{d}{dx}(\sin x) = \cos x$ and $\frac{d}{dx}(\cos x) = -\sin x$.
Proof. Since sin x and cos x can’t be interpreted as sums, products, quotients,
or compositions of other functions, we have to go back to the definition of the
derivative. So
\[ (\sin x)' = \lim_{h \to 0} \frac{\sin(x + h) - \sin x}{h} \quad\text{and}\quad (\cos x)' = \lim_{h \to 0} \frac{\cos(x + h) - \cos x}{h}. \]
We’ll need the trig identities
\[ \sin(x + h) = \sin x \cos h + \sin h \cos x, \qquad \cos(x + h) = \cos x \cos h - \sin x \sin h. \]
Now using the arithmetic of limits we obtain
\[
\begin{aligned}
(\sin x)' &= \lim_{h \to 0} \frac{\sin(x + h) - \sin x}{h} = \lim_{h \to 0} \frac{\sin x \cos h + \cos x \sin h - \sin x}{h} \\
&= \lim_{h \to 0} \frac{\sin x(\cos h - 1)}{h} + \lim_{h \to 0} \frac{\cos x \sin h}{h} \\
&= \sin x \cdot \lim_{h \to 0} \frac{\cos h - 1}{h} + \cos x \cdot \lim_{h \to 0} \frac{\sin h}{h} \\
&= \sin x \cdot 0 + \cos x \cdot 1 = \cos x.
\end{aligned}
\]
The derivative of $\cos x$ is similar and left to you.
Proof. Let $y = x^{p/q}$. Take $q$th powers of both sides to get
\[ y^q = x^p. \]
Now differentiate implicitly to get
\[ q y^{q-1} y' = p x^{p-1}. \]
Solving for $y'$ gives
\[ y' = \frac{p x^{p-1}}{q y^{q-1}} = \frac{p}{q} x^{p-1} y^{1-q}. \]
Since $y = x^{p/q}$ we have
\[ x^{p-1} y^{1-q} = x^{p-1} x^{\frac{p(1-q)}{q}} = x^{p-1+\frac{p(1-q)}{q}} = x^{\frac{(p-1)q + p(1-q)}{q}} = x^{\frac{p-q}{q}} = x^{\frac{p}{q}-1}. \]
Therefore
\[ y' = \frac{p}{q}\, x^{\frac{p}{q}-1}. \]
Example 4.62. Find the derivatives of $y = x^{2/3} - 2x^{1/4}$ and $z = e^{x^{4/5}}$.
Solution 4.63. For the first one,
\[ y' = \frac{2}{3} x^{-1/3} - \frac{1}{2} x^{-3/4}. \]
For the second, use the Chain Rule to get
\[ z' = e^{x^{4/5}} \cdot \frac{4}{5} x^{-1/5} = \frac{4}{5} x^{-1/5} e^{x^{4/5}}. \]
Differentiating Inverse Trig Functions. Recall that the six trig func-
tions have inverses, once restricted to an appropriate domain. What are their
derivatives?
Take for example $\sin^{-1} x$, which has domain $[-1, 1]$ and range $[-\frac{\pi}{2}, \frac{\pi}{2}]$. Since $\sin^{-1} x$ is the inverse function for $\sin x$ we have $\sin(\sin^{-1} x) = x$. So if we let $y = \sin^{-1} x$ then taking the sine of both sides gives $\sin y = x$. Now differentiate implicitly to get
\[ \cos y \cdot y' = 1. \]
Solving for $y'$ gives
\[ y' = \frac{1}{\cos y}. \]
Here’s a trick. Since $\sin^2 t + \cos^2 t = 1$, we have $\cos t = \sqrt{1 - \sin^2 t}$ whenever $\cos t \geq 0$, which holds here because $y$ lies in $[-\frac{\pi}{2}, \frac{\pi}{2}]$. In our case this gives
\[ y' = \frac{1}{\sqrt{1 - \sin^2 y}} = \frac{1}{\sqrt{1 - x^2}} \]
where the last step occurred since $y = \sin^{-1} x$ so $\sin y = \sin(\sin^{-1} x) = x$.
Working similarly for the other trig functions we end up with:
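\[
\begin{aligned}
\frac{d}{dx}(\sin^{-1} x) &= \frac{1}{\sqrt{1 - x^2}} & \frac{d}{dx}(\cos^{-1} x) &= \frac{-1}{\sqrt{1 - x^2}} \\
\frac{d}{dx}(\tan^{-1} x) &= \frac{1}{1 + x^2} & \frac{d}{dx}(\cot^{-1} x) &= \frac{-1}{1 + x^2} \\
\frac{d}{dx}(\sec^{-1} x) &= \frac{1}{|x|\sqrt{x^2 - 1}} & \frac{d}{dx}(\csc^{-1} x) &= \frac{-1}{|x|\sqrt{x^2 - 1}}
\end{aligned}
\]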
Note the odd form that the derivatives of sec−1 x and csc−1 x take, involving
absolute values. These are slightly unusual and won’t be dwelt on much in
what follows.
The usual rules, Chain Rule, Product Rule, Quotient Rule, apply as nor-
mal.
Example 4.64. Find the derivatives of $y = \tan^{-1}(2x^2)$ and $z = \sin^{-1}(\cos x)$.
Solution 4.65. Using the Chain Rule in the first case gives
\[ y' = \frac{1}{1 + (2x^2)^2} \cdot (2x^2)' = \frac{4x}{1 + 4x^4}. \]
The Chain Rule in the second case gives
\[ z' = \frac{1}{\sqrt{1 - (\cos x)^2}} \cdot (\cos x)' = \frac{1}{\sqrt{1 - \cos^2 x}}(-\sin x) = \frac{-\sin x}{\sqrt{\sin^2 x}} = \frac{-\sin x}{|\sin x|}. \]
Note that in the denominator we can’t say $\sqrt{\sin^2 x} = \sin x$ since $\sin x$ could be negative while the square root is never negative. That’s why the absolute value signs appeared. Note that if $\sin x$ is positive then $z' = -1$ whereas if $\sin x$ is negative then $z' = 1$. Moreover, $z'$ is not defined when $\sin x = 0$.
Challenge Problem 4.66. Prove that $\frac{d}{dx}(\tan^{-1} x) = \frac{1}{1 + x^2}$.
Solution 4.68. First, we have to make sense of the problem. The volume of the sphere is $V = \frac{4}{3}\pi r^3$ where $r$ is the radius. We’re looking for the rate at which volume changes over time, that is, we’re looking for $\frac{dV}{dt}$. We are told that the radius changes with respect to time, i.e. $\frac{dr}{dt} = 2$. Note that the variable $t$ does not explicitly appear in the formula for the volume. So when we differentiate we need to do so implicitly. This gives
\[ \frac{dV}{dt} = \frac{4}{3}\pi \cdot 3r^2 \frac{dr}{dt} = 4\pi r^2 \frac{dr}{dt}. \]
We’re told that the radius is increasing at the rate of 2 cm per minute, so $\frac{dr}{dt} = 2$. Therefore $\frac{dV}{dt} = 8\pi r^2$. When $r = 5$ we get $\frac{dV}{dt} = 200\pi$. Remembering the units, $\frac{dV}{dt} = 200\pi$ cm$^3$/min.
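As a rough check on this related-rates answer, one can pick a concrete radius function consistent with the problem and differentiate the volume numerically. The choice $r(t) = 5 + 2t$ (radius 5 cm at $t = 0$) and the names below are assumptions made only for this sketch.

```python
import math


def sphere_volume(r):
    """Volume of a sphere of radius r."""
    return 4.0 / 3.0 * math.pi * r**3


def volume_rate(r, dr_dt):
    """dV/dt = 4 * pi * r^2 * dr/dt, from implicit differentiation."""
    return 4.0 * math.pi * r**2 * dr_dt


def radius(t):
    """Hypothetical radius: 5 cm at t = 0, growing at 2 cm per minute."""
    return 5.0 + 2.0 * t


h = 1e-6
numeric_rate = (sphere_volume(radius(h)) - sphere_volume(radius(-h))) / (2 * h)
print(volume_rate(5.0, 2.0), numeric_rate)
```

Both printed numbers should be close to $200\pi$.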
Example 4.69. Two ships, one heading east and the other heading west,
approach each other on parallel courses 8 miles apart. Given that each ship
is cruising at 20 miles per hour, at what rate is the distance between them
diminishing when they are 10 miles apart?
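[Figure: right triangle with vertical side 8 miles, horizontal side x and hypotenuse y; the ship moves along the horizontal direction.]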
Here, the vertical line is the 8 mile difference between the parallel courses the
ships are moving in, y is the distance between the two ships, and x is the
“horizontal” distance between the ships.
Now let’s figure out what the problem is asking for and what information we have. We want to find the rate at which the distance between the ships is changing, that is, we want $\frac{dy}{dt}$. We are told that the ships are approaching one another so $x$ is decreasing, meaning $\frac{dx}{dt}$ should be negative. Further, the ships are each moving at 20 miles per hour, so the total speed at which they approach one another is 40 miles per hour. Therefore, $\frac{dx}{dt} = -40$. Finally, we need to relate $x$ and $y$. But the triangle does this: $x^2 + 64 = y^2$. Therefore, differentiating implicitly gives
\[ 2x\frac{dx}{dt} = 2y\frac{dy}{dt} \quad\Rightarrow\quad \frac{dy}{dt} = \frac{x}{y}\frac{dx}{dt}. \]
We want to know $\frac{dy}{dt}$ when $y = 10$. But when $y = 10$, from $x^2 + 64 = y^2$ we get $x = 6$. Therefore
\[ \frac{dy}{dt} = \frac{6}{10} \cdot (-40) = -24. \]
So the distance between the ships is diminishing at 24 miles per hour.
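The same answer can be checked numerically by writing the distance as an explicit function of time. In this sketch, time $t = 0$ is taken to be the moment when the ships are 10 miles apart, so the horizontal gap is $x(t) = 6 - 40t$; this parametrisation and the function names are assumptions for illustration.

```python
import math


def horizontal_gap(t):
    """x(t): 6 miles at t = 0, closing at 40 miles per hour."""
    return 6.0 - 40.0 * t


def distance(t):
    """y(t) from the right triangle x^2 + 8^2 = y^2."""
    return math.sqrt(horizontal_gap(t) ** 2 + 8.0**2)


h = 1e-6
numeric_rate = (distance(h) - distance(-h)) / (2 * h)
formula_rate = horizontal_gap(0.0) / distance(0.0) * (-40.0)
print(numeric_rate, formula_rate)
```

Both values should be close to the rate found by implicit differentiation.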
5. Curve Sketching
We’ll see that critical points have something to do with maximums and mini-
mums.
Definition 5.4. A function f has a local maximum at x0 if there is an inter-
val I with x0 ∈ I and f (x0 ) ≥ f (x) for all x in I.
A function f has a local minimum at x0 if there is an interval I with x0 ∈ I
and f (x0 ) ≤ f (x) for all x in I.
Proof. Suppose that $f$ has a local maximum value at $c$ for some $c \in (a, b)$. Then $f(c) \geq f(x)$ for any other $x \in (a, b)$. In particular, if $h > 0$ and $h$ is sufficiently small so that $c + h \in (a, b)$, then $f(c + h) - f(c) \leq 0$. As $h$ is positive, this also implies that $\frac{f(c + h) - f(c)}{h} \leq 0$. Further, this is true for every sufficiently small $h > 0$, so
\[ \lim_{h \to 0^+} \frac{f(c + h) - f(c)}{h} \leq 0. \]
This limit is very close to the definition of $f'(c)$, the only difference being that it’s one-sided. But by hypothesis, $f$ is differentiable at $c$ so $f'(c)$ exists, meaning the limit from the left and the limit from the right are equal. Therefore,
\[ f'(c) = \lim_{h \to 0^+} \frac{f(c + h) - f(c)}{h} \leq 0. \]
On the other hand, if $h < 0$ and $h$ is sufficiently small so that $c + h \in (a, b)$ then the numerator is still $\leq 0$ but $h$ is now negative, so the quotient is $\geq 0$ and the same argument gives
\[ f'(c) = \lim_{h \to 0^-} \frac{f(c + h) - f(c)}{h} \geq 0. \]
Thus $f'(c)$ is both $\geq 0$ and $\leq 0$, meaning we must have $f'(c) = 0$.
Maximum and Minimum values. We’ve seen local maximums and mini-
mums. There are also absolute maximums and minimums.
Definition 5.6. A function f has an absolute maximum at x0 if f (x0 ) ≥ f (x)
for every x in the domain of f .
The function has an absolute minimum at x0 if f (x0 ) ≤ f (x) for every x in
the domain of f .
Note that absolute maximum and minimum values do not have to be unique.
In an extreme case, take f (x) = 1 for all x ∈ R. Then f has an absolute
maximum of 1 and an absolute minimum of 1 at every point x in R.
For example, consider the following graph of a function y = f (x):
[Figure (5): graph of y = f(x) on [a, b], with the points a, x1, x2 and b marked on the x-axis.]
Observe that $f$ has a local maximum at the endpoint $a$, a local minimum at the point $x_1$, a local maximum at the point $x_2$, and a local minimum at the endpoint $b$. Since the value of $f$ at $x_2$ is larger than at $a$, $f(x_2)$ is the absolute maximum, and since the value of $f$ at $x_1$ is less than the value at $b$, $f(x_1)$ is the absolute minimum. The graph has a sharp corner at $x_2$, so $f$ is not differentiable at $x_2$. This leads to a definition.
Definition 5.7. Let $f$ be a function and suppose there is a point $c$ such that $f(c)$ is defined but $f'(c)$ is not. Then $c$ is a singular point.
Returning to (5), f had extreme values (local max’s and local min’s) at three
types of points:
• endpoints;
• singular points;
• critical points.
Example 5.8. Consider the function $f(x) = x^4 - x^2 + 2$ restricted to the domain $[-2, 2]$. Find the absolute maximum and absolute minimum of $f$.
Solution 5.9. The first step is to find the critical points. We have $f'(x) = 4x^3 - 2x = 2x(2x^2 - 1)$ so the critical points are $x = 0$, $x = 1/\sqrt{2}$ and $x = -1/\sqrt{2}$. Notice that $f'$ is defined on all of $(-2, 2)$ so $f$ has no singular points. Thus the extreme values of $f$ occur at critical points and endpoints. To find the absolute maximum and absolute minimum value of $f$ simply evaluate $f$ at these points. We get $f(-2) = 14$, $f(-1/\sqrt{2}) = 7/4$, $f(0) = 2$, $f(1/\sqrt{2}) = 7/4$ and $f(2) = 14$. Therefore $f$ has an absolute maximum of 14 which it takes twice, at $x = \pm 2$, and an absolute minimum of $7/4$, which it takes twice, at $x = \pm 1/\sqrt{2}$.
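The procedure used here (evaluate $f$ at the endpoints and critical points, then compare) is mechanical enough to sketch in code. The candidate list is typed in by hand from the solution; the variable names are assumptions for illustration.

```python
import math


def f(x):
    return x**4 - x**2 + 2


# Endpoints of [-2, 2] together with the critical points of f.
candidates = [-2.0, -1 / math.sqrt(2), 0.0, 1 / math.sqrt(2), 2.0]
values = {x: f(x) for x in candidates}

absolute_max = max(values.values())
absolute_min = min(values.values())

for x, value in values.items():
    print(f"f({x:.4f}) = {value:.4f}")
print("absolute max:", absolute_max, "absolute min:", absolute_min)
```

This only works because the candidate list is complete: on a closed interval with no singular points, no other point can be an extreme value.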
The Mean Value Theorem. Having found local minimums and maximums,
the next step is to try to determine on what intervals a function is increasing
or decreasing. The key tool for doing this is The Mean Value Theorem.
Theorem 5.10 (The Mean Value Theorem). Let f be a continuous function
defined on the closed interval [a, b] which is differentiable on the open interval
(a, b). Then there is a point c in (a, b) such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]
[Figure: graph of f over [a, b] showing the points (a, f(a)) and (b, f(b)), with a, c and b marked on the x-axis.]
Example 5.11. Verify the Mean Value Theorem for the function $f(x) = x^2$ on $[1, 4]$.
Solution 5.12. The Mean Value Theorem says that there is a point $c$ in $(1, 4)$ such that $f'(c) = 2c$ equals $\frac{f(b) - f(a)}{b - a} = \frac{16 - 1}{4 - 1} = 5$, that is, $2c = 5$. But just take $c = 5/2$, which lies in $(1, 4)$, and this works.
Intervals of increase and decrease. We now use the Mean Value Theorem
to determine when a function is increasing or decreasing on an interval. First,
let’s define terms.
Definition 5.13. Let $f$ be a function defined on an interval $I$ and suppose $x_1, x_2$ are two points in $I$.
• If $f(x_1) < f(x_2)$ whenever $x_1 < x_2$ then $f$ is increasing on $I$.
• If $f(x_1) > f(x_2)$ whenever $x_1 < x_2$ then $f$ is decreasing on $I$.
• If $f(x_1) \leq f(x_2)$ whenever $x_1 < x_2$ then $f$ is non-decreasing on $I$.
• If $f(x_1) \geq f(x_2)$ whenever $x_1 < x_2$ then $f$ is non-increasing on $I$.
Proof. Let $x_1, x_2 \in I$ and suppose that $x_1 < x_2$. By the Mean Value Theorem there is a point $c \in (x_1, x_2)$ such that $f'(c) = \frac{f(x_2) - f(x_1)}{x_2 - x_1}$. Notice that $x_2 - x_1 > 0$, so $f'(c)$ and $f(x_2) - f(x_1)$ have the same sign. If $f'(x) > 0$ for all $x \in I$ then $f'(c) > 0$, implying that $f(x_2) - f(x_1) > 0$. Thus $f(x_2) > f(x_1)$. As this is true for all $x_1 < x_2$ in $I$, the function $f$ must be increasing on $I$. The other cases are similar.
Example 5.15. Find the intervals of increase and decrease for the function $f(x) = x^3 - 12x + 1$.
Solution 5.16. The derivative of $f$ is $f'(x) = 3x^2 - 12 = 3(x^2 - 4) = 3(x + 2)(x - 2)$. We want to know when $f'(x) > 0$ and $f'(x) < 0$. It helps to first know when $f'(x) = 0$. We have $f'(x) = 3(x + 2)(x - 2) = 0$ when $x = \pm 2$. This means we need to consider the intervals $(-\infty, -2)$, $(-2, 2)$ and $(2, \infty)$. In each interval, $f'(x)$ will be either always positive or always negative. (For if there is a sign change then at some point the graph of $f'(x)$ must cross the x-axis, that is, $f'(x)$ must be zero, but we’ve already found all the points where $f'(x) = 0$.) Therefore all we need to do is test $f'(x)$ on one point in each interval. Observe that $f'(-3) = 15 > 0$ so $f$ is increasing on $(-\infty, -2)$; $f'(0) = -12 < 0$ so $f$ is decreasing on $(-2, 2)$; and $f'(3) = 15 > 0$ so $f$ is increasing on $(2, \infty)$.
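The "test one point per interval" method can be sketched as follows. The function `sign_chart` is a hypothetical helper: it assumes the list of zeros of $f'$ is complete and that $f'$ is continuous, exactly as in the argument above.

```python
def sign_chart(derivative, zeros):
    """Label each interval cut out by the zeros of f' as increasing or
    decreasing, by testing the derivative at one point per interval."""
    points = sorted(zeros)
    # One test point left of the first zero, one between each pair of
    # neighbouring zeros, and one right of the last zero.
    tests = (
        [points[0] - 1]
        + [(a + b) / 2 for a, b in zip(points, points[1:])]
        + [points[-1] + 1]
    )
    return ["increasing" if derivative(t) > 0 else "decreasing" for t in tests]


def f_prime(x):
    """Derivative of f(x) = x^3 - 12x + 1."""
    return 3 * x**2 - 12


print(sign_chart(f_prime, [-2, 2]))
```

For this example the test points are $-3$, $0$ and $3$, the same ones used in the solution.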
Example 5.17. Show that for all $x > 0$ we have $\sin x < x$.
Solution 5.18. Let $f(x) = x - \sin x$. We want to show that $f(x) > 0$ for all $x > 0$. Since $-1 \leq \sin x \leq 1$ for any $x$, if $x > 1$ then certainly $f(x) > 0$. So it remains to consider $f$ on $(0, 1]$. If we knew that $f(0) \geq 0$ and $f$ is increasing on $[0, 1]$ then we’d be done. We have $f(0) = 0$ and $f'(x) = 1 - \cos x$. Now $\cos x \leq 1$ for any $x$, and $\cos x = 1$ only if $x$ is a multiple of $2\pi$. Thus $\cos x < 1$ on $(0, 1]$, implying that $f'(x) > 0$ on $(0, 1]$, so $f$ is increasing on $[0, 1]$ and hence $f(x) > f(0) = 0$ for all $x$ in $(0, 1]$.
Testing critical points I. Given a function $f$ we can find its critical points by finding solutions to $f'(x) = 0$. We also know that critical points have something to do with local maximums and minimums, although the only definite statement we have so far is logically going the other way: given a local max or local min in an open interval on which $f$ is differentiable then it is a critical point. So when we find solutions to $f'(x) = 0$, we don’t know immediately whether these critical points are local minimums, local maximums, or neither. We need a way of testing this.
There are two ways to do this. The first is to look at the intervals of increase
and decrease and correctly interpret information. This is called The First
Derivative Test, and is best seen through an example.
Example 5.19. Find and classify the critical points of the function $f(x) = x^4 - 2x^2 - 2$.
Solution 5.20. First find the critical points: $f'(x) = 4x^3 - 4x = 4x(x^2 - 1) = 4x(x - 1)(x + 1)$, so the critical points are $x = 0$, $x = 1$ and $x = -1$. Consider the intervals of increase and decrease:
        (−∞, −1)   (−1, 0)   (0, 1)   (1, ∞)
f'         −          +         −        +
f          ↘          ↗         ↘        ↗
Here, in each open interval we have evaluated $f'$ at one point in order to determine whether it is positive or negative. For example, taking $-2 \in (-\infty, -1)$ gives $f'(-2) = -24$ and all we care about is the sign, which is negative. Knowing that $f'(x) < 0$ on $(-\infty, -1)$, we know that $f$ is decreasing on this interval, and so a downward pointing arrow is inserted in the table.
Now interpret what the intervals of increase and decrease say about local maximums and minimums. As $f$ is decreasing on $(-\infty, -1)$ and then increases on $(-1, 0)$, it must be the case that $f$ has a local minimum at $x = -1$. Similarly, as $f$ is increasing on $(-1, 0)$ and decreasing on $(0, 1)$, it must be the case that $f$ has a local maximum at $x = 0$. The same reasoning shows that $f$ has a local minimum at $x = 1$.
Remark 5.21. Having seen how it works, note that the local maximums and
minimums can be simply read off the table.
It’s worth being slightly cautious here. It’s possible for a critical point to be
neither a maximum nor a minimum.
Example 5.22. Find and classify the critical points of $f(x) = x^3$.
Solution 5.23. This is a simple but illustrative example, because you know the graph of $y = x^3$ is always increasing, but it flattens out momentarily at $x = 0$. Doing the algebra, $f'(x) = 3x^2$ so $f$ has a critical point at $x = 0$. The intervals of increase and decrease are:
        (−∞, 0)   (0, ∞)
f'         +         +
f          ↗         ↗
The table shows that while $x = 0$ is a critical point, the function is increasing both to the left and right of it, so $x = 0$ is neither a local maximum nor a local minimum.
Concavity and Inflection Points. Now we see what extra information the
second derivative of f adds. Essentially, the second derivative will say some-
thing about whether the graph of f is cupped upwards or cupped down-
wards.
What does this mean in terms of the graph of $f$? Remember that the derivative $f'(x)$ is the slope of the tangent line to the graph of $f$ at $x$. If $f'$ is increasing it means that over time, travelling from left to right on the x-axis, the slopes of the tangent lines to the graph of $f$ are increasing. Here is an example:
[Figure: graph whose tangent slopes increase from left to right, with a horizontal tangent at the marked point x0.]
Note that on the left part of the graph the slope of the tangent line is negative, but as we go from left to right it becomes less and less negative, until it hits 0 at $x_0$, and then it becomes increasingly positive. It’s worth repeating this: saying $f'$ is increasing does not mean the slope of the tangent line to the graph of $f$ is positive, it could be increasing from $-2$ to $-1$, for example.
Similarly, if $f'$ is decreasing then the slopes of the tangent lines to the graph of $f$ are decreasing, as in:
[Figure: graph whose tangent slopes decrease from left to right, with marked point x0.]
Here is a third graph which has one part of $f$ being concave up and another being concave down.
[Figure: graph with one concave-up part and one concave-down part, with marked point x0.]
Before explaining why this works, let’s read some fine print. First, the theorem does not say that if $f''(x_0) = 0$ then $x_0$ is an inflection point. It may be the case that some point $x_0$ satisfying $f''(x_0) = 0$ is not an inflection point. (Just as having $f'(c) = 0$ leaves open the possibility that $c$ is neither a local maximum nor a local minimum.) Second, it could be the case that the concavity of $f$ changes at a sharp corner, but $f$ is not differentiable at this corner so it does not qualify as an inflection point.
Example 5.27. Find the intervals of concavity and the inflection points of $f(x) = x^6 - 10x^4$.
Solution 5.28. We have $f'(x) = 6x^5 - 40x^3$ so $f''(x) = 30x^4 - 120x^2 = 30x^2(x^2 - 4)$. Thus $f''(x) = 0$ when $x = 0$, $x = -2$ and $x = 2$, so the possible inflection points of $f$ are $x = 0$, $x = -2$ and $x = 2$. Now consider the intervals of concavity:
        (−∞, −2)   (−2, 0)   (0, 2)   (2, ∞)
f''        +          −         −        +
f          ∪          ∩         ∩        ∪
Here, like in the case of intervals of increase and decrease, we simply test $f''$ at one point in each interval and only record the sign. We find that $f$ is concave up on $(-\infty, -2)$, concave down on $(-2, 2)$ and concave up on $(2, \infty)$. The change in concavity at $x = -2$ and $x = 2$ implies that both of these points are inflection points. However, there is no change of concavity at $x = 0$ so this is not an inflection point.
Testing Critical Points II. Information from the second derivative gives
another way of testing whether a critical point is a local maximum, a local
minimum or neither.
Theorem 5.29 (The Second Derivative Test). Suppose that $f'(c) = 0$ and $f''(c)$ exists. Then
• if $f''(c) > 0$ then $f$ has a local minimum at $c$;
• if $f''(c) < 0$ then $f$ has a local maximum at $c$;
• if $f''(c) = 0$ then more information is needed to classify the critical point.
This looks counter-intuitive: a positive second derivative says the function has a local minimum. But think about it. As $f'(c) = 0$, the tangent line to the graph of $f$ at $c$ is horizontal. As $f''(c) > 0$, Theorem 5.26 says that $f$ is concave up around $c$. The only way to be concave up around $c$ and have a horizontal tangent line at $c$ is for $f$ to have a local minimum there. The same argument works for a local maximum.
Example 5.30. Return to $f(x) = x^4 - 2x^2 - 2$ in Example 5.19. Classify the critical points of $f$ using the Second Derivative Test.
Solution 5.31. We know what the answer should be, so we’re checking that the Second Derivative Test gives the same answer. We have $f'(x) = 4x^3 - 4x$ so $f''(x) = 12x^2 - 4$. The critical points are $x = 0$, $x = 1$ and $x = -1$. Evaluating, $f''(-1) = 8 > 0$ so $x = -1$ is a local minimum, $f''(0) = -4 < 0$ so $x = 0$ is a local maximum, and $f''(1) = 8 > 0$ so $x = 1$ is a local minimum.
Example 5.32. Return to $f(x) = x^3$ in Example 5.22. Classify the critical points of $f$ using the Second Derivative Test.
Solution 5.33. We have $f'(x) = 3x^2$ so $f''(x) = 6x$. There is one critical point at $x = 0$, and $f''(0) = 0$, so the Second Derivative Test does not give enough information to determine whether $x = 0$ is a local maximum, a local minimum or neither.
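The three cases of the Second Derivative Test translate directly into a small decision function. This is a sketch only: `second_derivative_test` is a hypothetical helper that takes the second derivative as a Python function and assumes the point passed in really is a critical point.

```python
def second_derivative_test(second_derivative, c):
    """Classify a critical point c (where f'(c) = 0) by the sign of f''(c)."""
    value = second_derivative(c)
    if value > 0:
        return "local minimum"
    if value < 0:
        return "local maximum"
    return "inconclusive"  # f''(c) = 0: more information is needed


def quartic_second(x):
    """f''(x) for f(x) = x^4 - 2x^2 - 2."""
    return 12 * x**2 - 4


def cubic_second(x):
    """f''(x) for f(x) = x^3."""
    return 6 * x


for c in [-1, 0, 1]:
    print(c, second_derivative_test(quartic_second, c))
print(0, second_derivative_test(cubic_second, 0))
```

The "inconclusive" branch is the reason the First Derivative Test is still needed for examples such as $x^3$.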
This says that the graph of $f$ flattens out to a horizontal line if you go far enough along the x-axis. For example, the graph of $f(x) = e^{-x}$ satisfies $\lim_{x \to \infty} e^{-x} = 0$, so $f$ has a horizontal asymptote $y = 0$ as $x$ approaches $\infty$.
Example 5.36. Determine whether the function $f(x) = \frac{x}{x - 2}$ has vertical or horizontal asymptotes.
Solution 5.37. Vertical asymptotes tend to occur at points where the function is not defined. In this case, $f$ is not defined at $x = 2$. Think of $f$ as a product, $f(x) = x \cdot \frac{1}{x - 2}$. As $x$ approaches 2 from the right, the factor $x$ approaches 2, and $x - 2$ becomes a very small positive number, implying that the factor $\frac{1}{x - 2}$ becomes a very large positive number. Thus $\lim_{x \to 2^+} \frac{x}{x - 2} = \infty$. Therefore $f$ has a vertical asymptote at $x = 2$.
We don’t strictly need to check the left hand limit at this point, but let’s do so to be complete. Again think of $f$ as $f(x) = x \cdot \frac{1}{x - 2}$. As $x$ approaches 2 from the left, the factor $x$ approaches 2 and $x - 2$ becomes a very small negative number, implying that the factor $\frac{1}{x - 2}$ becomes a very large negative number. Thus $\lim_{x \to 2^-} \frac{x}{x - 2} = -\infty$.
For horizontal asymptotes, if $x$ is very large then $x$ and $x - 2$ have essentially the same magnitude, so the limit of $\frac{x}{x - 2}$ as $x$ approaches $\infty$ should be 1. That’s intuition, let’s now check it works for real. Do a trick by writing the denominator $x - 2$ as $x(1 - \frac{2}{x})$. This looks strange until you see it in action:
\[ \lim_{x \to \infty} \frac{x}{x - 2} = \lim_{x \to \infty} \frac{x}{x(1 - \frac{2}{x})} = \lim_{x \to \infty} \frac{1}{1 - \frac{2}{x}} = 1. \]
• critical points;
• inflection points;
• intervals of concavity;
• asymptotes;
Solution 5.39. Let’s find critical points and inflection points first and then put the intervals of increase, decrease and concavity on one table. We have $f'(x) = 4x^3 - 12x^2 = 4x^2(x - 3)$, so $f$ has critical points at $x = 0$, $x = 3$. Also, $f''(x) = 12x^2 - 24x = 12x(x - 2)$ so $f$ has inflection points at $x = 0$ and $x = 2$. Use the critical and inflection points to divide the real line into intervals and test $f'$ and $f''$ at one point in each. We get
To pin it down more, we need some values for f (x) at certain x. Ideally, we’d
find intercepts. However, finding solutions to f (x) = 0 is not terribly easy,
so let’s check the value of f at the critical and inflection points. We have
f (0) = 1, f (2) = −15 and f (3) = −26. Let’s also check for asymptotes. Since
f (x) is defined for all x there are no vertical asymptotes, and the limit of f
as x approaches ±∞ is ∞, so there are no horizontal asymptotes.
Example 5.40. Sketch the graph of the function $f(x) = \frac{x^2 - 1}{x^2 - 4}$.
Solution 5.41. First observe that $f$ is not defined at $x = \pm 2$. Note that $\lim_{x \to 2^+} f(x) = \infty$, $\lim_{x \to 2^-} f(x) = -\infty$, $\lim_{x \to -2^+} f(x) = -\infty$ and $\lim_{x \to -2^-} f(x) = \infty$. So $f$ has vertical asymptotes at $x = \pm 2$. Speaking of asymptotes, note also that as in Example 5.36 we have $\lim_{x \to \infty} f(x) = 1$ and $\lim_{x \to -\infty} f(x) = 1$. So $f$ has the horizontal asymptote $y = 1$ as $x$ approaches $\pm\infty$.
While we’re on $f$, we can find intercepts this time. We have $f(x) = 0$ if and only if $x^2 - 1 = 0$, so $f$ has x-intercepts at $x = \pm 1$.
Now take derivatives. Doing the quotient rule gives
\[ f'(x) = \frac{-6x}{(x^2 - 4)^2}, \qquad f''(x) = \frac{6(3x^2 + 4)}{(x^2 - 4)^3}. \]
So f has one critical point at x = 0, and no inflection points. Let’s put
together a table, where the real line is divided into intervals using the critical
point and the points where f has a vertical asymptote. We get
        (−∞, −2)   (−2, 0)   (0, 2)   (2, ∞)
f'         +          +         −        −
f          ↗          ↗         ↘        ↘
f''        +          −         −        +
f          ∪          ∩         ∩        ∪
For pinning points, there are the intercepts at $x = \pm 1$ and $f(0) = \frac{1}{4}$. So for the graph of $f$ we obtain something roughly like
[Figure: sketch of the graph of f, with the points −2 and 2 marked on the x-axis.]
6. Integration
Integration is about finding the area under the graph of a function f . We’ll see
that, remarkably, integration is an “inverse process” to differentiation. Let’s
start with the inverse process.
What is this saying? It says that antiderivatives are *not* unique. Whatever antiderivative you find you can always add a constant to get another antiderivative. For example, if $f(x) = 3x^2$ then one antiderivative is $g(x) = x^3$ and another is $h(x) = x^3 - 17$.
At this point, the answer is that it has no meaning. It’s just a game we can
play with functions. We’ll soon see that antiderivatives give a way of finding
integrals.
“Under” is not a good term, really. For if the graph of f is above the x-axis we
want to count this as positive area, and if the graph of f is under the x-axis
we want to count this as negative area. So more accurately, we want to find
the area A between the curve and the x-axis that lies over [a, b]. How can we
do this?
Before aiming for an exact answer, let’s first try to approximate the area. We do know how to calculate the area of a rectangle: it’s the base times the height. So let’s fit together a sequence of rectangles that are below the curve with total area $L$. Then the area $A$ is $\geq L$. Then let’s fit together a sequence of rectangles that are above the curve, with total area $U$. Then the area $A$ is $\leq U$. Here is a picture where the lower rectangles have their base on the x-axis and their tops being the black horizontal lines below the curve, and the upper rectangles have their base on the x-axis and their tops being the red horizontal lines above the curve:
[Figure: the curve over [a, b] with partition points a, x1, x2, x3, x4, b, lower rectangles (black tops) and upper rectangles (red tops).]
Notice that we were organised about this. The number of lower rectangles
equals the number of upper rectangles, and they have the same bases. The
height of the lower rectangles is the least value of the curve running along that
section of the base. The height of the upper rectangles is the greatest value
of the curve running along that section of the base.
Step 1: Divide $[a, b]$ into $n$ segments $[a, x_1], [x_1, x_2], \ldots, [x_{n-2}, x_{n-1}], [x_{n-1}, b]$ where $a < x_1 < x_2 < \cdots < x_{n-1} < b$. To align the notation, let $x_0 = a$ and $x_n = b$. Let $\ell_i$ be the length of segment $[x_{i-1}, x_i]$.
Step 3: Now form rectangles over $[x_{i-1}, x_i]$. The lower rectangle has base $\ell_i$ and height $m_i$, so its area is $\ell_i m_i$. The upper rectangle has base $\ell_i$ and height $M_i$, so its area is $\ell_i M_i$.
Step 4: Add up the areas of the lower and upper rectangles. The sum of the areas of the lower rectangles is
\[ L(f, P) = \ell_1 m_1 + \cdots + \ell_n m_n \]
where we remember in the notation $L(f, P)$ that this total lower area depends on the function $f$ and on how $[a, b]$ was divided (“partitioned”) into $n$ segments. The sum of the areas of the upper rectangles is
\[ U(f, P) = \ell_1 M_1 + \cdots + \ell_n M_n. \]
Therefore, the area between the graph of $f$ and the x-axis that lies over $[a, b]$ satisfies
\[ L(f, P) \leq A \leq U(f, P). \]
You can imagine that if the intervals are made smaller then there’s less un-
counted area between A and the total area of the upper rectangles, and less
uncounted area between A and the total area of the lower rectangles. This
should continue to improve as the interval lengths get smaller. So in the limit
we should get an accurate approximation to A.
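To see the squeeze between lower and upper sums in action, here is a minimal sketch for the special case of an increasing function on equal-width segments. In that case the least value on each segment is taken at its left endpoint and the greatest at its right endpoint; this simplification, the choice $f(x) = x^2$ on $[0, 1]$, and the function name are assumptions for illustration.

```python
def lower_upper_sums(f, a, b, n):
    """Lower and upper sums for an INCREASING function f on [a, b],
    using n segments of equal length."""
    width = (b - a) / n
    xs = [a + i * width for i in range(n + 1)]
    # For increasing f: minimum at the left end, maximum at the right end.
    lower = sum(width * f(xs[i - 1]) for i in range(1, n + 1))
    upper = sum(width * f(xs[i]) for i in range(1, n + 1))
    return lower, upper


def square(x):
    return x * x


for n in [10, 100, 1000]:
    lower, upper = lower_upper_sums(square, 0.0, 1.0, n)
    print(n, lower, upper, upper - lower)
```

The gap between the two sums shrinks as the number of segments grows, which is the behaviour the paragraph above describes.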
This limiting process is what is meant by the definite integral, which measures the area $A$ between the curve of $f$ and the x-axis that lies over the interval $[a, b]$. Being cautious, it’s necessary to know that this limit exists, and that the limit for $L(f, P)$ equals the limit for $U(f, P)$. We won’t prove it but all this holds, provided $f$ is continuous. The following definition then makes sense.
Definition 6.7. Let $f$ be a continuous function on the interval $[a, b]$. The definite integral of $f$ from $a$ to $b$ is the unique number $I$ such that $L(f, P) \leq I \leq U(f, P)$ for all partitions $P$ of $[a, b]$. We write this as
\[ I = \int_a^b f(x)\, dx. \]
The definite integral has nice properties because of the arithmetic of limits and
the fact that within those limits we’re just adding up areas of rectangles.
Theorem 6.8. Let $f$ and $g$ be continuous functions on the interval $[a, b]$. Then:
(a) $\int_a^b (cf(x) + dg(x))\, dx = c\int_a^b f(x)\, dx + d\int_a^b g(x)\, dx$;
(b) $\int_a^a f(x)\, dx = 0$;
(c) $\int_a^t f(x)\, dx + \int_t^b f(x)\, dx = \int_a^b f(x)\, dx$;
(d) $\int_b^a f(x)\, dx = -\int_a^b f(x)\, dx$;
(e) if $f(x) \leq g(x)$ for all $x \in [a, b]$ then $\int_a^b f(x)\, dx \leq \int_a^b g(x)\, dx$.
Proof. We’ll only show that F is differentiable and $F'(x) = f(x)$. By the
definition of the derivative,
\[ F'(x) = \lim_{h \to 0} \frac{F(x + h) - F(x)}{h}. \]
Let’s look at the righthand limit first, so h > 0. By definition of F and
properties of the definite integral, we have
\begin{align*}
F(x + h) - F(x) &= \int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt \\
&= \int_a^x f(t)\,dt + \int_x^{x+h} f(t)\,dt - \int_a^x f(t)\,dt \\
&= \int_x^{x+h} f(t)\,dt.
\end{align*}
Now $\int_x^{x+h} f(t)\,dt$ is the area between the graph of f and the x-axis that lies
over [x, x+h]. We’re thinking of h as being small. Let’s approximate this area
by going back to rectangles. Let $M_h$ be the maximum value of f on [x, x + h]
and let $m_h$ be the minimum value of f on [x, x + h]. Let L be the rectangle
with base [x, x + h] and height $m_h$, so its area is $h m_h$. Let U be the rectangle
with base [x, x + h] and height $M_h$, so its area is $h M_h$. Therefore
\[ h m_h \le \int_x^{x+h} f(t)\,dt \le h M_h. \]
Since h > 0, if we divide through by h we get
\[ m_h \le \frac{1}{h} \int_x^{x+h} f(t)\,dt \le M_h. \]
That is,
\[ m_h \le \frac{F(x + h) - F(x)}{h} \le M_h. \tag{6} \]
As h approaches 0, the minimum value of f on [x, x + h] and the maximum
value of f on [x, x + h] both approach f(x), because f is continuous. Therefore,
taking limits in all three terms in (6) we obtain
\[ f(x) \le \lim_{h \to 0^+} \frac{F(x + h) - F(x)}{h} \le f(x). \]
Hence
\[ \lim_{h \to 0^+} \frac{F(x + h) - F(x)}{h} = f(x). \]
Similarly for the lefthand limit. Therefore $F'(x)$ exists and it equals f(x).
Example 6.10. If $F(x) = \int_{-1}^x \sin t\,dt$, find $F'(\pi/2)$.
Solution 6.11. By the Fundamental Theorem of Calculus, $F'(x) = \sin x$, so
$F'(\pi/2) = 1$.
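The answer can be sanity-checked numerically. The sketch below is only an illustration: it approximates F with Simpson's rule and F' with a central difference, and the helper names `simpson` and `derivative` are our own choices rather than anything defined in the notes.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def F(x):
    """F(x) = integral of sin t from -1 to x, computed numerically."""
    return simpson(math.sin, -1.0, x)


def derivative(g, x, h=1e-4):
    """Central difference approximation to g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


# Should be close to sin(pi/2) = 1.
print(derivative(F, math.pi / 2))
```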
Example 6.12. Suppose that
\[ \int_0^x t f(t)\,dt = \sin x. \]
Find $f(\pi)$ and $f'(\pi)$.
Solution 6.13. Let $F(x) = \int_0^x t f(t)\,dt - \sin x$, so $F(x) = 0$. Taking derivatives,
\[ F'(x) = x f(x) - \cos x \quad\text{and}\quad F'(x) = 0. \]
Equating the two expressions for $F'(x)$ gives
\[ x f(x) = \cos x. \]
Therefore, evaluating at π gives $\pi f(\pi) = \cos \pi = -1$, so $f(\pi) = -\frac{1}{\pi}$. Further,
taking derivatives in $x f(x) = \cos x$ gives
\[ f(x) + x f'(x) = -\sin x \]
so evaluating at π gives $f(\pi) + \pi f'(\pi) = -\sin \pi = 0$. Knowing that $f(\pi) = -\frac{1}{\pi}$,
we get $f'(\pi) = -\frac{1}{\pi} f(\pi) = \frac{1}{\pi^2}$.
The Fundamental Theorem of Calculus Part I says that if you first integrate f
and then differentiate you get f back again. So the integral of f is an an-
tiderivative of f ! We know that antiderivatives are unique up to adding a
constant, so if we can find an antiderivative of f then we know what the in-
tegral is, up to a constant. The next theorem tells us how to deal with the
constant in a definite integral.
Theorem 6.14. Let f be a continuous function on the interval [a, b]. If G is
any antiderivative of f then
\[ \int_a^b f(t)\,dt = G(b) - G(a). \]
Example 6.20. Find the area enclosed between the curves $y = 4x$ and $y = x^2$.
Solution 6.21. The first thing to do is find where the curves intersect so
we have bounds for the integration. Intersection points lie on both curves, so
they are solutions to the equation $4x = x^2$. That is, $0 = x^2 - 4x = x(x - 4)$,
so the points of intersection are at x = 0 and x = 4.
The next thing to do is find which curve is on top and which is on bottom.
Pick a point in (0, 4) and evaluate both functions at that point. For example,
pick x = 1. Then on the line $y = 4x$ we get y(1) = 4 and on the parabola
$y = x^2$ we get y(1) = 1. So $y = 4x$ is on top and $y = x^2$ is on bottom.
The area between the two curves is therefore
\[ \int_0^4 (4x - x^2)\,dx = \left[ 2x^2 - \tfrac{1}{3}x^3 \right]_0^4 = \left( 32 - \tfrac{64}{3} \right) - (0 - 0) = \tfrac{32}{3}. \]
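As a cross-check, the following sketch evaluates the antiderivative exactly with Python's `fractions` module and compares it with a midpoint Riemann sum. The names `G`, `area` and `riemann` are illustrative only.

```python
from fractions import Fraction


def G(x):
    """Antiderivative of 4x - x^2, namely 2x^2 - x^3/3, in exact arithmetic."""
    x = Fraction(x)
    return 2 * x**2 - x**3 / 3


# Area between the curves via G(b) - G(a).
area = G(4) - G(0)

# Independent check: midpoint Riemann sum of (top curve) - (bottom curve).
n = 100_000
width = 4 / n
riemann = sum(
    (4 * x - x * x) * width for x in ((i + 0.5) * width for i in range(n))
)

print(area, float(area), riemann)
```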
Next, suppose that you want to know the total area enclosed between the
graph of f and the x-axis, but some of the area is above the x-axis and some
is below. The definite integral counts area below the x-axis as negative but as
we want a total enclosed area we want to count it as positive. So we’ll have
to manually insert a minus sign.
Example 6.22. Determine the area enclosed by the graph of $y = x^2 - 2x$
and the x-axis, on the interval [−1, 3].
Solution 6.23. First, we need to know where the graph crosses the x-axis,
as these are the points where the graph can change from being above the
x-axis to below or vice-versa. That is, find solutions to $0 = x^2 - 2x = x(x - 2)$.
These are x = 0 and x = 2.
Second, on the subintervals (−1, 0), (0, 2) and (2, 3), check where the graph
of y is above or below the x-axis. This is done by evaluating y at a point from the
interior of each subinterval and checking its sign. We have $y(-\frac{1}{2}) = \frac{1}{4} + 1 > 0$
so y is above the x-axis on (−1, 0); $y(1) = 1 - 2 < 0$ so y is below the x-axis
on (0, 2); and $y(\frac{5}{2}) = \frac{25}{4} - 5 > 0$ so y is above the x-axis on (2, 3).
for the indefinite integral. Here, the bounds on the integration are left out as
we’re only looking for an antiderivative, not an area, and the “+c” is used to
indicate that the antiderivative can be changed up to any constant c.
Example 6.24. Find $\int \left( e^x - \cos x + x^{-1/3} \right) dx$.
We’ll often use indefinite integrals because it’s easier to tackle methods of
integration without having to worry about carrying along bounds all the time.
WARNING: An indefinite integral without the “+c” is incorrect and you will
lose marks for doing this on the exam!
7. Methods of Integration
Differentiation is algorithmic: once you know the derivatives of the basic func-
tions and the rules of differentiation (including the Chain Rule), you can
differentiate anything you like. Integration is not algorithmic: there are no
formulas that are 100% analogues to the Product Rule, the Quotient Rule
or the Chain Rule. So it is sometimes not straightforward to integrate, and
finding an integral can be something of an art form.
There are some helpful methods of integration. We’ll look at substitution,
integration by parts, and partial fractions. In what follows, we’re mainly
concerned with finding methods to allow for calculation rather than justifying
precisely why they work.
Writing this a bit differently, let $u = g(x)$. Then $\frac{du}{dx} = g'(x)$, or loosely,
$du = g'(x)\,dx$. So
\[ \int f'(g(x))\, g'(x)\,dx = \int f'(u)\,du = f(u) + c = f(g(x)) + c. \]
WARNING: The argument we just went through is not rigorous, that’s why
we said “loosely”. Remember $\frac{du}{dx}$ is *not* a fraction, it’s just notation for
the derivative. It happens to be particularly suggestive notation because the
intuition that comes from loosely treating it like a fraction actually turns out to
be rigorously true, although this needs a proper proof, which we won’t give.
Note the process in the last example: use u for the substitution, integrate
to find a function in u, and substitute back to get an answer in the vari-
able x.
Solution 7.6. If $u = \cos x$ then $du = -\sin x\,dx$. This leaves two more powers
of sine unaccounted for when we try to substitute in u. But we can use the
usual trick of writing $\sin^2 x = 1 - \cos^2 x$ to convert the sines into cosines. This
gives
\begin{align*}
\int \sin^3 x \cos^5 x\,dx &= \int \sin x\,(1 - \cos^2 x) \cos^5 x\,dx = -\int (1 - u^2) u^5\,du \\
&= \int (u^7 - u^5)\,du \\
&= \frac{1}{8} u^8 - \frac{1}{6} u^6 + c \\
&= \frac{1}{8} \cos^8 x - \frac{1}{6} \cos^6 x + c.
\end{align*}
For even powers of sine and cosine it’s handy to remember the half-angle
formulas
\[ \cos^2 x = \frac{1}{2}(1 + \cos 2x), \qquad \sin^2 x = \frac{1}{2}(1 - \cos 2x). \]
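Example 7.7. Find $\int \sin^4 x\,dx$.
Solution 7.8. First, use the half-angle formula to convert some sines to
cosines:
\[ \sin^4 x = \left( \frac{1}{2}(1 - \cos 2x) \right)^2 = \frac{1}{4}\left( 1 - 2\cos 2x + \cos^2 2x \right). \]
Now convert the $\cos^2 2x$ using the half-angle formula, to get $\cos^2 2x = \frac{1}{2}(1 + \cos 4x)$.
Therefore
\begin{align*}
\int \sin^4 x\,dx &= \frac{1}{4} \int \left( 1 - 2\cos 2x + \frac{1}{2}(1 + \cos 4x) \right) dx \\
&= \frac{1}{4} \left( x - \sin 2x + \frac{1}{2} x + \frac{1}{8} \sin 4x \right) + c \\
&= \frac{3}{8} x - \frac{1}{4} \sin 2x + \frac{1}{32} \sin 4x + c.
\end{align*}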
Remark 7.9. It’s always easy to check whether your indefinite integral is
correct. Simply differentiate the answer to see if you get the original function
back again. If so, your answer is right. If not, your answer is wrong.
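The check in this remark can also be done numerically. The sketch below is one possible way to do it, assuming a central difference is accurate enough at a few sample points; the helper names are hypothetical. It tests two candidates for the integral of sin^4 x, one of which has a deliberately wrong coefficient.

```python
import math


def derivative(g, x, h=1e-5):
    """Central difference approximation to g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


def is_antiderivative(candidate, integrand, points, tol=1e-6):
    """True if candidate' agrees with integrand at every sample point."""
    return all(abs(derivative(candidate, x) - integrand(x)) < tol for x in points)


def integrand(x):
    return math.sin(x) ** 4


def good(x):
    return 3 * x / 8 - math.sin(2 * x) / 4 + math.sin(4 * x) / 32


def bad(x):
    # Same as good() except for a deliberately wrong coefficient of x.
    return 3 * x / 4 - math.sin(2 * x) / 4 + math.sin(4 * x) / 32


points = [0.3, 1.0, 2.5]
print(is_antiderivative(good, integrand, points))
print(is_antiderivative(bad, integrand, points))
```

A numerical agreement at a few points is evidence rather than proof, but a disagreement is a reliable sign that the candidate is wrong.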
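Here, from $2 \sin\theta = x$ we have $\sin\theta = \frac{x}{2}$, so knowing that $\sin\theta = \frac{\text{opposite}}{\text{hypotenuse}}$
we obtain that the opposite side to the angle θ is x and the hypotenuse is 2.
Therefore the adjacent side has length $\sqrt{4 - x^2}$. As $\cos\theta = \frac{\text{adjacent}}{\text{hypotenuse}}$ we
obtain $\cos\theta = \frac{\sqrt{4 - x^2}}{2}$. Thus
\begin{align*}
\int \sqrt{4 - x^2}\,dx &= 2\theta + \sin 2\theta + c = 2\theta + 2 \sin\theta \cos\theta + c \\
&= 2 \sin^{-1}\left( \frac{x}{2} \right) + \frac{x \sqrt{4 - x^2}}{2} + c.
\end{align*}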
The integral on the right side is easy to work out, so the end result is
\[ \int x e^x\,dx = x e^x - e^x + c. \]
Remark 7.14. It may not always be easy to see what u and dv should be to
get the calculation going. Sometimes experimentation is needed.
Example 7.15. Find $\int x^2 \ln x\,dx$.
Solution 7.20. This is much like the first problem. Let $u = x^2$ and $dv = e^{-2x}\,dx$.
Then $du = 2x\,dx$ and $v = -\frac{1}{2} e^{-2x}$. Integration by parts gives
\[ \int x^2 e^{-2x}\,dx = -\frac{1}{2} x^2 e^{-2x} + \frac{1}{2} \int 2x e^{-2x}\,dx = -\frac{1}{2} x^2 e^{-2x} + \int x e^{-2x}\,dx. \]
The integral on the right side can be handled by integration by parts again.
Now let $u = x$ and $dv = e^{-2x}\,dx$. Then $du = dx$ and $v = -\frac{1}{2} e^{-2x}$. We
therefore get
\[ \int x e^{-2x}\,dx = -\frac{1}{2} x e^{-2x} + \frac{1}{2} \int e^{-2x}\,dx = -\frac{1}{2} x e^{-2x} - \frac{1}{4} e^{-2x} + c. \]
Putting everything together gives
\[ \int x^2 e^{-2x}\,dx = -\frac{1}{2} x^2 e^{-2x} - \frac{1}{2} x e^{-2x} - \frac{1}{4} e^{-2x} + c. \]
Challenge Problem 7.21. Find and prove a formula for $\int x^n e^x\,dx$.
Integration by parts with sine and cosine sometimes involves doing parts twice
and then collecting like terms.
Example 7.22. Find $\int e^x \sin x\,dx$.
For the integral on the right side, do integration by parts again with $u = \cos x$
and $dv = e^x\,dx$. Then $du = -\sin x\,dx$ and $v = e^x$, so
\[ \int e^x \cos x\,dx = e^x \cos x + \int e^x \sin x\,dx. \]
Note that the integral on the right is the one we started with, so we seem to be
going in circles. But if we put the two equations together we get
\[ \int e^x \sin x\,dx = e^x \sin x - \left( e^x \cos x + \int e^x \sin x\,dx \right) = e^x \sin x - e^x \cos x - \int e^x \sin x\,dx. \]
Finally, let’s not forget that integration is about finding area.
Example 7.24. Find the area between the curve $y = \sqrt{x}\, e^{\sqrt{x}}$ and the x-axis
on the interval [0, 4].
Solution 7.25. The area is given by the definite integral
\[ \int_0^4 \sqrt{x}\, e^{\sqrt{x}}\,dx. \]
Let’s forget about the bounds for the moment and work out the indefinite
integral first. We use a combination of integration methods. Start with a
substitution: let $u = \sqrt{x}$ so $du = \frac{1}{2\sqrt{x}}\,dx$. Notice that the latter implies
that $2\sqrt{x}\,du = dx$, and substituting for the $\sqrt{x}$ again, this gives $2u\,du = dx$.
Therefore
\[ \int \sqrt{x}\, e^{\sqrt{x}}\,dx = \int u e^u \cdot 2u\,du = 2 \int u^2 e^u\,du. \]
Thus
\[ \int \sqrt{x}\, e^{\sqrt{x}}\,dx = 2\left( x e^{\sqrt{x}} - 2\sqrt{x}\, e^{\sqrt{x}} + 2 e^{\sqrt{x}} \right) + c. \]
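To see the substitution at work with the bounds put back in, the sketch below compares a numerical value of the substituted integral, 2 times the integral of u^2 e^u over [0, 2], with the value obtained from the antiderivative 2(u^2 − 2u + 2)e^u. This is only an illustrative check; `simpson` and `antiderivative` are our own helper names.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# After u = sqrt(x), the integral over x in [0, 4] becomes
# the integral of 2 * u^2 * e^u over u in [0, 2].
numeric = simpson(lambda u: 2 * u * u * math.exp(u), 0.0, 2.0)


def antiderivative(u):
    """2 * (u^2 - 2u + 2) * e^u, from integrating u^2 e^u by parts twice."""
    return 2 * (u * u - 2 * u + 2) * math.exp(u)


exact = antiderivative(2.0) - antiderivative(0.0)  # equals 4e^2 - 4
print(numeric, exact)
```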
Solution 7.27. Focus on the term inside the integral first (called the
integrand). Notice that the denominator factors, $x^2 - 5x + 6 = (x - 3)(x - 2)$.
Let’s look for an A and B such that
\[ \frac{x + 4}{(x - 3)(x - 2)} = \frac{A}{x - 3} + \frac{B}{x - 2}. \]
Cross-multiplying gives
\[ \frac{x + 4}{(x - 3)(x - 2)} = \frac{A(x - 2) + B(x - 3)}{(x - 3)(x - 2)} = \frac{(A + B)x + (-2A - 3B)}{(x - 3)(x - 2)}. \]
Equating coefficients of powers of x in the numerator gives
\[ A + B = 1, \qquad -2A - 3B = 4. \]
Solving for A and B gives A = 7 and B = −6. Thus
\[ \frac{x + 4}{(x - 3)(x - 2)} = \frac{7}{x - 3} - \frac{6}{x - 2}. \]
Now integrate to get
\[ \int \frac{x + 4}{x^2 - 5x + 6}\,dx = \int \left( \frac{7}{x - 3} - \frac{6}{x - 2} \right) dx = 7 \ln|x - 3| - 6 \ln|x - 2| + c. \]
How did we know looking for an A and a B like this would work? Essentially,
this is what led to an effective comparison of coefficients in the numerators
after cross-multiplying.
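The comparison of coefficients is just a small linear system, so it can be checked mechanically. The following sketch solves it exactly with Cramer's rule and then confirms the decomposition at a few sample points; `solve_2x2` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*A + a12*B = b1, a21*A + a22*B = b2 by Cramer's rule."""
    det = Fraction(a11 * a22 - a12 * a21)
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det


# Equations from comparing coefficients: A + B = 1 and -2A - 3B = 4.
A, B = solve_2x2(1, 1, 1, -2, -3, 4)
print(A, B)

# Confirm the decomposition at sample points away from x = 2 and x = 3.
for x in (Fraction(0), Fraction(1), Fraction(5), Fraction(-7, 2)):
    lhs = (x + 4) / ((x - 3) * (x - 2))
    rhs = A / (x - 3) + B / (x - 2)
    assert lhs == rhs
```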
This leads to a series of steps for doing partial fractions.
Step 1: If necessary, rewrite the fraction $\frac{P(x)}{Q(x)}$ so that the degree of the
numerator is strictly less than the degree of the denominator. That is, if the degree of P(x) is ≥
the degree of Q(x) then do a long division to write $P(x) = A(x)Q(x) + R(x)$
where A(x) is a polynomial and the degree of the remainder R(x) is strictly
less than the degree of Q(x). Then $\frac{P(x)}{Q(x)} = A(x) + \frac{R(x)}{Q(x)}$. Integrating the
polynomial A(x) is easy and we’re left with integrating $\frac{R(x)}{Q(x)}$.
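For example, if
\[ \frac{P(x)}{Q(x)} = \frac{P(x)}{(x - 3)(x + 1)^2 (x^2 + x + 1)(x^2 - x + 2)^3} \]
write
\begin{align*}
\frac{P(x)}{(x - 3)(x + 1)^2 (x^2 + x + 1)(x^2 - x + 2)^3} ={}& \frac{A_1}{x - 3} + \frac{A_2}{x + 1} + \frac{A_3}{(x + 1)^2} + \frac{B_1 x + C_1}{x^2 + x + 1} \\
& + \frac{B_2 x + C_2}{x^2 - x + 2} + \frac{B_3 x + C_3}{(x^2 - x + 2)^2} + \frac{B_4 x + C_4}{(x^2 - x + 2)^3}.
\end{align*}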
Example 7.28. Evaluate the definite integral $\int_1^2 \frac{x^2 + 2}{4x^5 + 4x^3 + x}\,dx$.
Solution 7.29. This will be a longer example as there are lots of bits and
pieces. First observe that
\[ 4x^5 + 4x^3 + x = x(4x^4 + 4x^2 + 1) = x(2x^2 + 1)^2. \]
So we aim to solve
\[ \frac{x^2 + 2}{x(2x^2 + 1)^2} = \frac{A}{x} + \frac{Bx + C}{2x^2 + 1} + \frac{Dx + E}{(2x^2 + 1)^2}. \]
Cross-multiplying gives
\begin{align*}
\frac{x^2 + 2}{x(2x^2 + 1)^2} &= \frac{A(2x^2 + 1)^2 + (Bx + C)x(2x^2 + 1) + (Dx + E)x}{x(2x^2 + 1)^2} \\
&= \frac{A(4x^4 + 4x^2 + 1) + B(2x^4 + x^2) + C(2x^3 + x) + Dx^2 + Ex}{x(2x^2 + 1)^2}.
\end{align*}
Comparing coefficients gives equations
\begin{align*}
4A + 2B &= 0 \\
2C &= 0 \\
4A + B + D &= 1 \\
C + E &= 0 \\
A &= 2
\end{align*}
Solving gives A = 2, B = −4, C = 0, D = −3, E = 0. Therefore
\[ \int \frac{x^2 + 2}{4x^5 + 4x^3 + x}\,dx = \int \left( \frac{2}{x} + \frac{-4x}{2x^2 + 1} + \frac{-3x}{(2x^2 + 1)^2} \right) dx. \]
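On the right the substitution $u = 2x^2 + 1$ gives $du = 4x\,dx$ and so (ignoring the
“+c” for now)
\[ \int \frac{-4x}{2x^2 + 1}\,dx = -\int \frac{1}{u}\,du = -\ln u \]
\[ \int \frac{-3x}{(2x^2 + 1)^2}\,dx = -\frac{3}{4} \int \frac{1}{u^2}\,du = \frac{3}{4} \cdot \frac{1}{u}. \]
Therefore
\begin{align*}
\int \frac{x^2 + 2}{4x^5 + 4x^3 + x}\,dx &= \int \left( \frac{2}{x} + \frac{-4x}{2x^2 + 1} + \frac{-3x}{(2x^2 + 1)^2} \right) dx \\
&= 2 \ln x - \ln u + \frac{3}{4} \cdot \frac{1}{u} \\
&= 2 \ln x - \ln(2x^2 + 1) + \frac{3}{4} \cdot \frac{1}{2x^2 + 1} + c \\
&= \ln\left( \frac{x^2}{2x^2 + 1} \right) + \frac{3}{4} \cdot \frac{1}{2x^2 + 1} + c.
\end{align*}
Finally, returning to the definite integral,
\begin{align*}
\int_1^2 \frac{x^2 + 2}{4x^5 + 4x^3 + x}\,dx &= \left[ \ln\left( \frac{x^2}{2x^2 + 1} \right) + \frac{3}{4} \cdot \frac{1}{2x^2 + 1} \right]_1^2 \\
&= \left( \ln\frac{4}{9} + \frac{3}{36} \right) - \left( \ln\frac{1}{3} + \frac{3}{12} \right) \\
&= \ln\frac{4}{3} - \frac{1}{6}.
\end{align*}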
In the linear case there is sometimes a faster way to solve the equations when
comparing coefficients. This is useful to remember when later we turn to
differential equations.
Example 7.30. Find $\int \frac{2x^2 + 3}{x(x - 1)^2}\,dx$.
Solution 7.31. Using partial fractions, we want
\[ \frac{2x^2 + 3}{x(x - 1)^2} = \frac{A}{x} + \frac{B}{x - 1} + \frac{C}{(x - 1)^2} = \frac{A(x - 1)^2 + Bx(x - 1) + Cx}{x(x - 1)^2}. \]
Comparing numerators, we want
\[ 2x^2 + 3 = A(x - 1)^2 + Bx(x - 1) + Cx. \]
Substituting in x = 1 or x = 0 produces zero terms that help quickly determine
A and C. Substituting in x = 1 gives 5 = C. Substituting in x = 0 gives
3 = A. For B, we can’t do the same trick, but now that we know A and C
we can pick any other value of x and see what happens. Take, for example,
x = 2 to get 11 = A + 2B + 2C = 3 + 2B + 10, so B = −1. Hence
\begin{align*}
\int \frac{2x^2 + 3}{x(x - 1)^2}\,dx &= \int \left( \frac{3}{x} - \frac{1}{x - 1} + \frac{5}{(x - 1)^2} \right) dx \\
&= 3 \ln|x| - \ln|x - 1| - \frac{5}{x - 1} + c.
\end{align*}
when both f (x) and g(x) approach 0, or when both f (x) and g(x) approach
±∞.
Theorem 8.1 (L’Hospital’s Rule Version I). Let f and g be differentiable
functions with $g'(x) \neq 0$. Suppose that
\[ \lim_{x \to a} f(x) = 0 \quad\text{and}\quad \lim_{x \to a} g(x) = 0. \]
If
\[ \lim_{x \to a} \frac{f'(x)}{g'(x)} = L \]
then
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = L. \]
If
\[ \lim_{x \to a} \frac{f'(x)}{g'(x)} = L \]
then
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = L. \]
Remark 8.8. As before, the case x → ∞ or x → −∞ is also allowed. Also,
if $\lim_{x \to a} \frac{f'(x)}{g'(x)} = \infty$ or $-\infty$ then $\lim_{x \to a} \frac{f(x)}{g(x)} = \infty$ or $-\infty$ respectively.
Example 8.9. Let r be any positive integer. Find $\lim_{x \to \infty} \frac{\ln x}{x^r}$.
Solution 8.10. Observe that $\lim_{x \to \infty} \ln x = \infty$ and $\lim_{x \to \infty} x^r = \infty$.
Observe also that both $\ln x$ and $x^r$ are differentiable, and $(x^r)' = r x^{r-1}$ is nonzero
since r > 0. Therefore, by L’Hospital’s Rule,
\[ \lim_{x \to \infty} \frac{\ln x}{x^r} = \lim_{x \to \infty} \frac{1/x}{r x^{r-1}} = \lim_{x \to \infty} \frac{1}{r x^r} = 0. \]
Example 8.11. Find $\lim_{x \to \infty} \frac{x^k}{e^x}$.
One way to interpret the previous example is that the exponential function
grows faster than any power of x.
Sometimes an indeterminate form arises that appears differently, like $0 \cdot \infty$
or $0^0$, but a little craftiness allows for L’Hospital’s Rule to be used.
Example 8.13. Find $\lim_{x \to 0^+} \sqrt{x} \ln x$.
Solution 8.16. Now we have the indeterminate form $0^0$. To deal with limits
in the exponent, take logarithms. We have $\ln(x^x) = x \ln x$, so now using the
same idea as in the last example,
\[ \lim_{x \to 0^+} x \ln x = \lim_{x \to 0^+} \frac{\ln x}{1/x} = \lim_{x \to 0^+} \frac{1/x}{-1/x^2} = \lim_{x \to 0^+} (-x) = 0. \]
This, remember, is the limit of the logarithm of $x^x$, which is not what we want.
To get back to $\lim_{x \to 0^+} x^x$ we need to exponentiate. Thus
\[ \lim_{x \to 0^+} x^x = e^0 = 1. \]
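A quick numerical experiment makes the two limits plausible. The sketch below simply tabulates x ln x and x^x for a few small positive x; the helper name `x_to_the_x` is our own, and the table is evidence for the limits rather than a proof.

```python
import math


def x_to_the_x(x):
    """Evaluate x^x as exp(x ln x), the rewriting used in the text."""
    return math.exp(x * math.log(x))


# x ln x should approach 0 and x^x should approach 1 as x -> 0+.
for x in (1e-1, 1e-3, 1e-6, 1e-9):
    print(x, x * math.log(x), x_to_the_x(x))
```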
Improper Integrals.
All the integrals we’ve considered so far have been of the form
\[ \int_a^b f(x)\,dx \]
for some interval [a, b]. What about an integral of the form
\[ \int_a^\infty f(x)\,dx \quad\text{or}\quad \int_{-\infty}^b f(x)\,dx\,? \]
Such integrals are called improper.
We can make sense of such an improper integral by using a limit. If
\[ \lim_{b \to \infty} \int_a^b f(x)\,dx = L \]
for some finite number L then we write
\[ \int_a^\infty f(x)\,dx = L \]
and say that $\int_a^\infty f(x)\,dx$ converges to L. If
\[ \lim_{b \to \infty} \int_a^b f(x)\,dx \ \text{does not exist} \]
then we say that $\int_a^\infty f(x)\,dx$ diverges.
We can similarly analyse $\int_{-\infty}^b f(x)\,dx$ by considering
\[ \lim_{a \to -\infty} \int_a^b f(x)\,dx. \]
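Example 8.17. Show that the improper integral $\int_1^\infty \frac{1}{x^2}\,dx$ converges.
Solution 8.18. Observe that
\[ \int_1^b \frac{1}{x^2}\,dx = \left[ -\frac{1}{x} \right]_1^b = -\frac{1}{b} - (-1) = 1 - \frac{1}{b}. \]
Therefore
\[ \lim_{b \to \infty} \int_1^b \frac{1}{x^2}\,dx = \lim_{b \to \infty} \left( 1 - \frac{1}{b} \right) = 1. \]
As the limit exists, $\int_1^\infty \frac{1}{x^2}\,dx$ converges and we have
\[ \int_1^\infty \frac{1}{x^2}\,dx = 1. \]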
Example 8.19. Show that the improper integral $\int_1^\infty \frac{1}{x}\,dx$ diverges.
Solution 8.20. Observe that
\[ \int_1^b \frac{1}{x}\,dx = \Big[ \ln x \Big]_1^b = \ln(b) - \ln(1) = \ln b. \]
Therefore
\[ \lim_{b \to \infty} \int_1^b \frac{1}{x}\,dx = \lim_{b \to \infty} \ln b = \infty. \]
Since the limit does not exist, $\int_1^\infty \frac{1}{x}\,dx$ diverges.
One way to interpret the last two examples is to say that the area under the
curve $\frac{1}{x^2}$ on the interval [1, ∞) is finite, while the area under the curve $\frac{1}{x}$ on
[1, ∞) is infinite.
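The contrast between the two areas shows up clearly if the partial areas over [1, b] are tabulated for growing b. The sketch below uses the two closed forms from the solutions above, 1 − 1/b and ln b; the function names are illustrative only.

```python
import math


def area_under_inverse_square(b):
    """Integral of 1/x^2 over [1, b], which equals 1 - 1/b."""
    return 1 - 1 / b


def area_under_inverse(b):
    """Integral of 1/x over [1, b], which equals ln b."""
    return math.log(b)


# The first column of areas levels off below 1; the second keeps growing.
for b in (10, 1_000, 1_000_000):
    print(b, area_under_inverse_square(b), area_under_inverse(b))
```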
Challenge Problem 8.21. Let p be a real number. Show that the area
under the curve $f(x) = \frac{1}{x^p}$ on the interval [1, ∞) is finite if and only if p > 1.
for some a and then regard each of the two summands as its own improper
integral.
Example 8.22. Find $\int_{-\infty}^\infty \frac{1}{1 + x^2}\,dx$.
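Solution 8.23. First observe that, as an indefinite integral,
\[ \int \frac{1}{1 + x^2}\,dx = \tan^{-1} x + c \]
since the derivative of $\tan^{-1} x$ is $\frac{1}{1 + x^2}$. So we obtain
\[ \int_{-\infty}^0 \frac{1}{1 + x^2}\,dx = \lim_{a \to -\infty} \int_a^0 \frac{1}{1 + x^2}\,dx = \lim_{a \to -\infty} \Big[ \tan^{-1} x \Big]_a^0 \]
and
\[ \int_0^\infty \frac{1}{1 + x^2}\,dx = \lim_{b \to \infty} \int_0^b \frac{1}{1 + x^2}\,dx = \lim_{b \to \infty} \Big[ \tan^{-1} x \Big]_0^b \]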
Remark 8.24. Notice that $\frac{1}{1 + x^2}$ takes the same value at x = c and x = −c
for any c ≥ 0. Therefore the graph of $\frac{1}{1 + x^2}$ is symmetric about the y-axis.
So the area under the graph on the interval (−∞, 0] equals the area under
the graph on the interval [0, ∞). Using this, the previous example could be
simplified by calculating
\[ \int_{-\infty}^\infty \frac{1}{1 + x^2}\,dx = 2 \int_0^\infty \frac{1}{1 + x^2}\,dx. \]
\[ \int_0^1 \frac{1}{\sqrt{x}}\,dx = \lim_{a \to 0^+} \int_a^1 \frac{1}{\sqrt{x}}\,dx = \lim_{a \to 0^+} \Big[ 2\sqrt{x} \Big]_a^1 = \lim_{a \to 0^+} \left( 2 - 2\sqrt{a} \right) = 2. \]
(ii) if $\int_a^\infty f(x)\,dx$ diverges then so does $\int_a^\infty g(x)\,dx$.
Example 8.28. Determine whether the improper integral
\[ \int_1^\infty \frac{1}{\sqrt{1 + x^3}}\,dx \]
converges or diverges.
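Solution 8.29. It is not easy to integrate $\frac{1}{\sqrt{1 + x^3}}$, but it is easy to integrate
$\frac{1}{\sqrt{x^3}} = \frac{1}{x^{3/2}}$. Observe that for all $x \in [1, \infty)$ we have $1 + x^3 > x^3 > 0$, implying
that $\sqrt{1 + x^3} > \sqrt{x^3} > 0$, which in turn implies that $0 < \frac{1}{\sqrt{1 + x^3}} < \frac{1}{\sqrt{x^3}}$. Write
$\frac{1}{\sqrt{x^3}}$ as $x^{-3/2}$. Then
\[ 0 \le \int_1^\infty \frac{1}{\sqrt{1 + x^3}}\,dx \le \int_1^\infty x^{-3/2}\,dx. \]
Since
\[ \int_1^\infty x^{-3/2}\,dx = \lim_{b \to \infty} \int_1^b x^{-3/2}\,dx = \lim_{b \to \infty} \Big[ -2x^{-1/2} \Big]_1^b = \lim_{b \to \infty} \left( -\frac{2}{\sqrt{b}} + 2 \right) = 2 \]
the integral $\int_1^\infty x^{-3/2}\,dx$ converges and therefore by the Comparison Test so
does $\int_1^\infty \frac{1}{\sqrt{1 + x^3}}\,dx$.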
9. Differential Equations
In the examples above, the first equation has order 2 since the highest deriv-
ative involved is the second; the second equation has order 1 since - despite
the square - the highest derivative involved is the first; and the third equation
has order 2.
There are usually many solutions to an ODE. For example, consider the
ODE
\[ \frac{dy}{dx} = y. \]
We have no tools for solving this yet, but it’s not hard to see that $y = e^x$
is a solution since $\frac{dy}{dx} = e^x = y$. Another solution is $y = \frac{1}{2} e^x$, because
$\frac{dy}{dx} = \frac{1}{2} e^x = y$.
Definition 9.2. The general solution of an ODE is the most general function
y = f (x) that satisfies the ODE.
Sometimes you want to specify which of the solutions is the one you want.
This can be done with an initial condition or boundary conditions.
Example 9.3. Find the solution of the ODE $\frac{dy}{dx} = y$ satisfying y(0) = 1.
Solution 9.4. Again, we don’t have tools yet to properly analyse this, but
notice that for the two solutions we know, $y = e^x$ and $y = \frac{1}{2} e^x$, the condition
y(0) = 1 holds only for $y = e^x$. So that’s the solution we want.
Definition 9.5. An initial value problem (IVP) is an ODE together with an
initial condition. A boundary value problem (BVP) is an ODE together with
boundary conditions.
Warning: There is no single method for dealing with all ODE’s. There are
different methods to deal with different types of ODE’s.
The Method for Solution: Given the separable ODE $g(y) \frac{dy}{dx} = h(x)$:
• Step 1: Formally rewrite the ODE as $g(y)\,dy = h(x)\,dx$.
• Step 2: Integrate $\int g(y)\,dy = \int h(x)\,dx + c$.
Example 9.7. Solve the ODE $e^y \frac{dy}{dx} = x^2$.
\[ \Longrightarrow \quad e^y = \frac{1}{3} x^3 + c. \]
Solving for y by taking logarithms gives
\[ y = \ln\left( \frac{x^3}{3} + c \right). \]
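Note that Step 1 in the Method makes no real sense, since $\frac{dy}{dx}$ is not a fraction.
However, the Method does work: to check, just differentiate both sides of
$\int g(y)\,dy = \int h(x)\,dx + c$ and see if we can recover the ODE. We have
\begin{align*}
& \frac{d}{dx} \int g(y)\,dy = \frac{d}{dx} \left( \int h(x)\,dx + c \right) \\
\Longrightarrow\ & \left( \frac{d}{dy} \int g(y)\,dy \right) \frac{dy}{dx} = \frac{d}{dx} \int h(x)\,dx && \text{(Chain Rule)} \\
\Longrightarrow\ & g(y) \frac{dy}{dx} = h(x) && \text{(Fundamental Theorem of Calculus).}
\end{align*}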
First order linear ODE’s. A first order linear ODE is of the form
\[ \frac{dy}{dx} + p(x) y = q(x) \]
for some continuous functions p(x) and q(x). Note:
• if p(x) = 0 then we have to solve $\frac{dy}{dx} = q(x)$, that is, simply find an
antiderivative of q(x);
• if q(x) = 0 then we have to solve $\frac{dy}{dx} = -p(x) y$ which is separable.
So if both $p(x) \neq 0$ and $q(x) \neq 0$ the equation is more subtle.
Definition 9.13. A first order linear ODE with q(x) = 0 is called homogeneous.
A first order linear ODE with $q(x) \neq 0$ is called non-homogeneous.
• Step 3: Integrate to get $y e^{F(x)} = \int e^{F(x)} q(x)\,dx$.
• Step 4: Solve for y to get $y = e^{-F(x)} \int e^{F(x)} q(x)\,dx$.
Example 9.14. Find the solution of the ODE $\frac{dy}{dx} - 2y = x$ given the initial
condition y(0) = 1.
Solution 9.15. First find the general solution. Observe that $\frac{dy}{dx} - 2y = x$ is a
first order linear non-homogeneous ODE. Here p(x) = −2 and q(x) = x. Let
\[ F(x) = \int p(x)\,dx = \int -2\,dx = -2x. \]
For the right hand side, integrate by parts (you do the details) to get
\[ \int x e^{-2x}\,dx = -\frac{x}{2} e^{-2x} - \frac{1}{4} e^{-2x} + c. \]
Therefore
\[ y e^{-2x} = -\frac{x}{2} e^{-2x} - \frac{1}{4} e^{-2x} + c \]
and solving for y gives
\[ y = -\frac{x}{2} - \frac{1}{4} + c e^{2x}. \]
This is the general solution of the ODE. For the initial condition y(0) = 1 we
get
\[ 1 = y(0) = -\frac{0}{2} - \frac{1}{4} + c e^0 = -\frac{1}{4} + c \]
implying that $c = \frac{5}{4}$. Hence the solution of the ODE with initial condition
y(0) = 1 is
\[ y = -\frac{x}{2} - \frac{1}{4} + \frac{5}{4} e^{2x}. \]
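One independent way to check a solution of an initial value problem is to solve the problem numerically and compare. The sketch below uses the classical fourth-order Runge-Kutta method, which is not part of these notes and is used here purely as a checking tool, on dy/dx = 2y + x with y(0) = 1, and compares the value at x = 1 with the formula y = −x/2 − 1/4 + (5/4)e^{2x}. The function names are our own.

```python
import math


def rk4(f, x0, y0, x_end, steps=1000):
    """Classical fourth-order Runge-Kutta for y' = f(x, y)."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y


def closed_form(x):
    return -x / 2 - 1 / 4 + (5 / 4) * math.exp(2 * x)


# dy/dx - 2y = x rearranged as dy/dx = 2y + x, with y(0) = 1.
numeric = rk4(lambda x, y: 2 * y + x, 0.0, 1.0, 1.0)
print(numeric, closed_form(1.0))
```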
This is a first order linear ODE and can be solved in the usual way.
Example 9.16. Find the general solution to the ODE $\frac{dy}{dx} - \frac{y}{x} = x y^2$.
Solution 9.17. Taking $p(x) = -\frac{1}{x}$ and $q(x) = x$ the ODE is
\[ \frac{dy}{dx} + p(x) y = q(x) y^2 \]
which is a Bernoulli ODE with n = 2. Use the substitution $z = y^{1-n} = y^{-1}$.
Then we get the linear ODE
\[ \frac{1}{1 - 2} \frac{dz}{dx} + p(x) z = q(x) \quad\Longrightarrow\quad -\frac{dz}{dx} - \frac{1}{x} z = x. \]
Let’s rewrite this as
\[ \frac{dz}{dx} + \frac{1}{x} z = -x. \]
Solve this using the integrating factor method. Now $F(x) = \int \frac{1}{x}\,dx = \ln x$ so
\[ e^{F(x)} = e^{\ln x} = x. \]
Multiply the ODE (in the z variable) by the integrating factor to get
\[ x \left( \frac{dz}{dx} + \frac{1}{x} z \right) = -x^2 \quad\Longrightarrow\quad \frac{d}{dx}(zx) = -x^2. \]
Integrate to get
\[ zx = -\frac{x^3}{3} + c \]
and solve for z to get
\[ z = -\frac{x^2}{3} + \frac{c}{x}. \]
This solves the linear ODE for the variable z but we want to get back to the
variable y so reverse the substitution. As $z = y^{-1}$ we obtain
\[ y^{-1} = -\frac{x^2}{3} + \frac{c}{x} = \frac{-x^3 + 3c}{3x} = -\frac{x^3 + C}{3x} \]
where C = −3c. (As c is an arbitrary constant, so is −3c, so we may as well
just say C is the arbitrary constant.) Inverting then gives
\[ y = -\frac{3x}{x^3 + C} \]
which is the general solution to the Bernoulli ODE.
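Following the advice to check answers by substituting back, here is a small sketch that plugs y = −3x/(x^3 + C) into the ODE and evaluates the residual dy/dx − y/x − x y^2 at a few points. The choice C = 1 and the sample points are arbitrary assumptions, and the derivative is approximated by a central difference.

```python
def y(x, C=1.0):
    """Candidate general solution of the Bernoulli ODE, for a chosen C."""
    return -3 * x / (x**3 + C)


def derivative(g, x, h=1e-6):
    """Central difference approximation to g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


def residual(x):
    """dy/dx - y/x - x*y^2, which should vanish for a solution."""
    return derivative(y, x) - y(x) / x - x * y(x) ** 2


residuals = [residual(x) for x in (0.5, 1.0, 2.0)]
print(residuals)
```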
This is a peculiar ODE in that, in order to solve it, one does not integrate but
instead differentiates! Using the Product and Chain Rules, differentiating (8)
gives
\[ \frac{dy}{dx} = \frac{dy}{dx} + x \frac{d^2 y}{dx^2} + f'\!\left( \frac{dy}{dx} \right) \frac{d^2 y}{dx^2}. \]
Cancelling out the $\frac{dy}{dx}$ on each side gives
\begin{align*}
& 0 = x \frac{d^2 y}{dx^2} + f'\!\left( \frac{dy}{dx} \right) \frac{d^2 y}{dx^2} \\
\Longrightarrow\ & 0 = \left( x + f'\!\left( \frac{dy}{dx} \right) \right) \frac{d^2 y}{dx^2}.
\end{align*}
Thus there is a solution if either
\[ x + f'\!\left( \frac{dy}{dx} \right) = 0 \quad\text{or}\quad \frac{d^2 y}{dx^2} = 0. \]
If $\frac{d^2 y}{dx^2} = 0$ then $\frac{dy}{dx} = c$ for some constant c. Substituting this back into
Clairaut’s ODE (8) gives the solution
\[ y = cx + f(c). \]
This is the general solution. Note that y is simply a line. The other case is
when $x + f'\!\left( \frac{dy}{dx} \right) = 0$. This turns out to have a unique solution called the
singular solution, and it is the envelope of all the general solutions, in the
following sense.
[Figure: envelope]
The singular solution can be difficult to work out and we won’t go into it.
We’ll simply restrict to the general solution.
Example 9.18. Find the general solution to the ODE
\[ y = x \frac{dy}{dx} + \left( \frac{dy}{dx} \right)^3 - 2 \frac{dy}{dx}. \]
Solution 9.19. Observe that this is a Clairaut ODE with
\[ f\!\left( \frac{dy}{dx} \right) = \left( \frac{dy}{dx} \right)^3 - 2 \frac{dy}{dx}. \]
That is, $f(z) = z^3 - 2z$. The general solution is of the form y = cx + f(c) for
some constant c, so in this case the general solution is
\[ y = cx + c^3 - 2c. \]
Second order linear ODE’s with constant coefficients. This is our first
type of second order ODE, involving the second derivative. A second order
linear ODE with constant coefficients is of the form
\[ a \frac{d^2 y}{dx^2} + b \frac{dy}{dx} + cy = f(x) \]
where a, b and c are constants and f is some function depending only on x.
Method of Solution.
• Step 1: Find the general solution of the associated homogeneous ODE
\[ a \frac{d^2 y}{dx^2} + b \frac{dy}{dx} + cy = 0. \]
This is called the complementary function (CF).
• Step 2: Find any solution you can to the full nonhomogeneous ODE
\[ a \frac{d^2 y}{dx^2} + b \frac{dy}{dx} + cy = f(x). \]
This is called a particular integral (PI).
• Step 3: The general solution to the full nonhomogeneous ODE
\[ a \frac{d^2 y}{dx^2} + b \frac{dy}{dx} + cy = f(x) \]
is y = CF+PI.
Let’s check that this makes sense. We need to check: (i) that y = CF+PI
is in fact a solution of the ODE, and (ii) that all solutions are found this
way. Write g(x) for the complementary function, and p(x) for the particular
integral.
To check (i), substitute y = g(x) + p(x) into the ODE and check that both
sides are equal. We have
\begin{align*}
& a \frac{d^2}{dx^2}\big( g(x) + p(x) \big) + b \frac{d}{dx}\big( g(x) + p(x) \big) + c\big( g(x) + p(x) \big) \\
&= \left( a \frac{d^2 g(x)}{dx^2} + b \frac{dg(x)}{dx} + c g(x) \right) + \left( a \frac{d^2 p(x)}{dx^2} + b \frac{dp(x)}{dx} + c p(x) \right) \\
&= 0 + f(x) \\
&= f(x)
\end{align*}
where the second equality comes from g(x) being a solution of the associated
homogeneous equation and p(x) being a solution of the full nonhomogeneous
ODE. In particular, g(x) + p(x) is a solution of the ODE.
To check (ii), suppose that h(x) is any solution of the full nonhomogeneous
ODE. Consider h(x) − p(x). Substituting into the ODE we get
\begin{align*}
& a \frac{d^2}{dx^2}\big( h(x) - p(x) \big) + b \frac{d}{dx}\big( h(x) - p(x) \big) + c\big( h(x) - p(x) \big) \\
&= \left( a \frac{d^2 h(x)}{dx^2} + b \frac{dh(x)}{dx} + c h(x) \right) - \left( a \frac{d^2 p(x)}{dx^2} + b \frac{dp(x)}{dx} + c p(x) \right) \\
&= f(x) - f(x) \\
&= 0
\end{align*}
where the second equality comes from both h(x) and p(x) being solutions of
the full nonhomogeneous ODE. This says that h(x) − p(x) is a solution of the
associated homogeneous ODE, so it is a special case of the complementary
function. That is, h(x) − p(x) is of the form CF, so h(x) is of the form
CF + p(x) = CF+PI. Hence y = CF+PI really is the most general solution to
the ODE.
The next question is: How do you find the complementary function and the
particular integral? First consider the complementary function.
A basic solution of the associated homogeneous ODE is $y = e^{sx}$ for some
constant s. To see which values of s work, substitute $y = e^{sx}$ into the associated
homogeneous ODE and require that you get zero. We have
\[ \frac{dy}{dx} = s e^{sx} \quad\text{and}\quad \frac{d^2 y}{dx^2} = s^2 e^{sx}. \]
So we get
\begin{align*}
& a \frac{d^2 y}{dx^2} + b \frac{dy}{dx} + cy = 0 \\
\Longrightarrow\ & a s^2 e^{sx} + b s e^{sx} + c e^{sx} = 0 \\
\Longrightarrow\ & (a s^2 + b s + c) e^{sx} = 0 \\
\Longrightarrow\ & a s^2 + b s + c = 0
\end{align*}
where the last implication holds since $e^{sx} > 0$ for all x. The equation $a s^2 + b s + c = 0$
is called the auxiliary equation and its solutions are
\[ s = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]
This gives three cases:
• Case 1: the auxiliary equation has distinct real roots α and β. Then the
complementary function is
\[ y = A e^{\alpha x} + B e^{\beta x} \]
where A and B are arbitrary constants.
• Case 2: the auxiliary equation has a repeated root α. Then the
complementary function is
\[ y = (A + Bx) e^{\alpha x} \]
Example 9.24. Solve the boundary value problem $\frac{d^2 y}{dx^2} - 4 \frac{dy}{dx} + 4y = 0$ with
y(0) = −1 and y(1) = 1.
Solution 9.25. This is a second order linear homogeneous ODE with constant
coefficients. The auxiliary equation is
\[ s^2 - 4s + 4 = 0. \]
This factors as $(s - 2)^2 = 0$ so s = 2 is a repeated root. The complementary
function is therefore
\[ y = (A + Bx) e^{2x}. \]
As the ODE is homogeneous, the complementary function is also the general
solution. The boundary condition y(0) = −1 gives
\[ -1 = y(0) = (A + 0) e^0 = A \]
and the boundary condition y(1) = 1 gives
\[ 1 = y(1) = (A + B) e^2. \]
Substituting in A = −1 and solving for B gives
\[ B = 1 + e^{-2}. \]
Therefore the solution of the boundary value problem is
\[ y = \left( -1 + (1 + e^{-2}) x \right) e^{2x} = \left( -1 + x + e^{-2} x \right) e^{2x}. \]
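The boundary value solution can be checked by substituting it back. In the sketch below the first and second derivatives of y = (A + Bx)e^{2x} are written out by hand using the Product Rule, and the residual y'' − 4y' + 4y is evaluated at a couple of arbitrary sample points, together with the two boundary values. The function names are illustrative only.

```python
import math

A = -1.0
B = 1.0 + math.exp(-2.0)


def y(x):
    return (A + B * x) * math.exp(2 * x)


def dy(x):
    # Product Rule: B e^{2x} + 2 (A + Bx) e^{2x}.
    return (B + 2 * (A + B * x)) * math.exp(2 * x)


def d2y(x):
    # Differentiating again: 4B e^{2x} + 4 (A + Bx) e^{2x}.
    return (4 * B + 4 * (A + B * x)) * math.exp(2 * x)


def residual(x):
    """y'' - 4y' + 4y, which should vanish for a solution."""
    return d2y(x) - 4 * dy(x) + 4 * y(x)


print(y(0.0), y(1.0), residual(0.3), residual(0.7))
```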
We now know how to find the complementary function for a second order
linear constant coefficient ODE. What about the particular integral? This
can be tricky, but here is a method that usually works. Suppose the ODE
is
\[ a \frac{d^2 y}{dx^2} + b \frac{dy}{dx} + cy = f(x). \]
Then as the particular integral try a function of the same form as f(x). For
example:
Sample f(x)       Try as PI
constant          y(x) = p
3x + 2            y(x) = px + q
x^2 − 2           y(x) = px^2 + qx + r
5e^{3x}           y(x) = pe^{3x}
2 sin 3x          y(x) = p sin 3x + q cos 3x
−4 cos 2x         y(x) = p sin 2x + q cos 2x
a sum of these    a sum of these.
Example 9.26. Solve the ODE $\frac{d^2 y}{dx^2} + 5 \frac{dy}{dx} + 4y = 4x^2 - 2x + 3$ where y(0) = 1
and $y'(0) = 0$.
For the particular integral, since f(x) is a quadratic polynomial, try
$y(x) = px^2 + qx + r$. Then $\frac{dy}{dx} = 2px + q$ and $\frac{d^2 y}{dx^2} = 2p$, so substituting these into the
ODE gives
\[ 2p + 5(2px + q) + 4(px^2 + qx + r) = 4x^2 - 2x + 3. \]
Rearranging to collect like terms gives
\[ 4px^2 + (10p + 4q)x + (2p + 5q + 4r) = 4x^2 - 2x + 3. \]
Thus
\begin{align*}
4p &= 4 \\
10p + 4q &= -2 \\
2p + 5q + 4r &= 3.
\end{align*}
The first equation gives p = 1, substituting this into the second equation gives
q = −3, and substituting both values for p and q into the third equation gives
2 − 15 + 4r = 3, so r = 4. Thus
\[ \text{PI} = x^2 - 3x + 4. \]
The general solution of the ODE is therefore
\[ y(x) = \text{CF} + \text{PI} = A e^{-x} + B e^{-4x} + x^2 - 3x + 4. \]
Now consider the initial values. First differentiate the general solution to get
\[ y'(x) = -A e^{-x} - 4B e^{-4x} + 2x - 3. \]
The initial value y(0) = 1 gives
\[ 1 = y(0) = A + B + 4. \]
The initial value $y'(0) = 0$ gives
\[ 0 = y'(0) = -A - 4B - 3. \]
Solving these equations gives A = −3 and B = 0. Hence the solution
to the initial value problem is
\[ y(x) = -3 e^{-x} + x^2 - 3x + 4. \]
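The bookkeeping in this kind of example is easy to get wrong, so it can help to redo the arithmetic exactly. The sketch below solves the three coefficient equations by back-substitution and then the two initial-condition equations, all in exact arithmetic with Python's `fractions` module. The variable names mirror the symbols in the text.

```python
from fractions import Fraction

# Coefficient equations: 4p = 4, 10p + 4q = -2, 2p + 5q + 4r = 3.
p = Fraction(4, 4)
q = (Fraction(-2) - 10 * p) / 4
r = (Fraction(3) - 2 * p - 5 * q) / 4

# Initial conditions for y = A e^{-x} + B e^{-4x} + p x^2 + q x + r:
#   y(0)  =  A +  B + r = 1
#   y'(0) = -A - 4B + q = 0
# Adding the two equations eliminates A: -3B + r + q = 1.
B = (r + q - 1) / 3
A = 1 - r - B

print(p, q, r, A, B)
```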
Example 9.28. Find the general solution of the ODE $\frac{d^2 y}{dx^2} + 3 \frac{dy}{dx} + 2y = x + e^{-3x}$.
For the particular integral, since $x + e^{-3x}$ is a sum of the polynomial x and
the exponential $e^{-3x}$, try $y(x) = px + q + r e^{-3x}$. Then $\frac{dy}{dx} = p - 3r e^{-3x}$ and
$\frac{d^2 y}{dx^2} = 9r e^{-3x}$. Substituting these into the ODE gives
\[ 9r e^{-3x} + 3(p - 3r e^{-3x}) + 2(px + q + r e^{-3x}) = x + e^{-3x}. \]
Rearranging to collect like terms gives
\[ (9r - 9r + 2r) e^{-3x} + 2px + (3p + 2q) = x + e^{-3x}. \]
Therefore
\begin{align*}
2r &= 1 \\
2p &= 1 \\
3p + 2q &= 0
\end{align*}
This gives r = 1/2, p = 1/2 and q = −3/4. Thus the particular integral is
\[ \text{PI} = \frac{1}{2} x - \frac{3}{4} + \frac{1}{2} e^{-3x}. \]
Hence the general solution of the ODE is
\[ y(x) = \text{CF} + \text{PI} = A e^{-x} + B e^{-2x} + \frac{1}{2} x - \frac{3}{4} + \frac{1}{2} e^{-3x}. \]
Challenge Problem 9.30. Show that a first order linear ODE can also be
solved using the CF+PI method. That is, given the ODE
\[ \frac{dy}{dx} + p(x) y = q(x), \tag{9} \]
suppose that g(x) is the solution to the associated homogeneous ODE
$\frac{dy}{dx} + p(x) y = 0$ and suppose that h(x) is any solution to the nonhomogeneous ODE.
Show that:
• CF+PI - that is, g(x) + h(x) - really is a solution to (9);
• any solution to (9) is of the form CF+PI.
Further, if the function p(x) in (9) is a constant, say p(x) = a, show that the
complementary function is $g(x) = A e^{-ax}$ for some constant A.
Now assume that, by inductive hypothesis, $\frac{d^{n-1}}{dx^{n-1}} x^{n-1} = (n - 1)!$. Consider
the nth derivative of $x^n$. To make use of the inductive hypothesis, let’s think
of the nth derivative as the (n − 1)st derivative of the first derivative. Then
we get
\[ \frac{d^n}{dx^n} x^n = \frac{d^{n-1}}{dx^{n-1}} \left( \frac{d}{dx} x^n \right) = \frac{d^{n-1}}{dx^{n-1}} \left( n x^{n-1} \right). \]
The derivative is linear so we can pull the constant n out, and use the inductive
hypothesis to get
\[ \frac{d^{n-1}}{dx^{n-1}} \left( n x^{n-1} \right) = n \cdot \frac{d^{n-1}}{dx^{n-1}} x^{n-1} = n \cdot (n - 1)! = n!. \]
Thus $\frac{d^n}{dx^n} x^n = n!$. Therefore, by induction, the asserted formula holds.
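Before or after proving a formula by induction, it can be reassuring to test it for small n. The sketch below represents x^n as a list of polynomial coefficients, differentiates it n times, and compares the result with n!. This is a spot check for a few values of n, not a substitute for the proof, and the function names are our own.

```python
from math import factorial


def differentiate(coeffs):
    """Differentiate a polynomial given as [a0, a1, a2, ...]."""
    return [k * a for k, a in enumerate(coeffs)][1:]


def nth_derivative_of_power(n):
    """Return the coefficient list of the n-th derivative of x^n."""
    coeffs = [0] * n + [1]
    for _ in range(n):
        coeffs = differentiate(coeffs)
    return coeffs


for n in range(6):
    print(n, nth_derivative_of_power(n), factorial(n))
```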
Challenge Problem 10.5. Guess a formula for $\frac{d^n}{dx^n} \sqrt{1 + x}$ and prove it by
induction.
Challenge Problem 10.6. Guess a formula for $\frac{d^n}{dx^n} \left( x e^{-x} \right)$ and prove it by
induction.