100% found this document useful (1 vote)

95 views

Calculus 2

Uploaded by

nic

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

100% found this document useful (1 vote)

95 views

Calculus 2

Uploaded by

nic

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 305

Math 138

Calculus II
for Honours Mathematics

Course Notes

Barbara A. Forrest and Brian E. Forrest

Version 1.5
c Barbara A. Forrest and Brian E. Forrest.
Copyright

All rights reserved.

September 1, 2020

All rights, including copyright and images in the content of these course notes, are
owned by the course authors Barbara Forrest and Brian Forrest. By accessing these
course notes, you agree that you may only use the content for your own personal,
non-commercial use. You are not permitted to copy, transmit, adapt, or change in
any way the content of these course notes for any other purpose whatsoever without
the prior written permission of the course authors.

Author Contact Information:

Barbara Forrest (baforres@uwaterloo.ca)
Brian Forrest (beforres@uwaterloo.ca)

i
QUICK REFERENCE PAGE 1

Right Angle Trigonometry

sin θ = opposite
hypotenuse
cos θ = ad jacent
hypotenuse
tan θ = opposite
ad jacent

csc θ = 1
sin θ
sec θ = 1
cos θ
cot θ = 1
tan θ

Radians
Definition of Sine and Cosine

The angle θ in For any θ, cos θ and sin θ are

radians equals the defined to be the x− and y−
length of the directed coordinates of the point P on the
arc BP, taken positive unit circle such that the radius
counter-clockwise and OP makes an angle of θ radians
negative clockwise. with the positive x− axis. Thus
Thus, π radians = 180◦ sin θ = AP, and cos θ = OA.
or 1 rad = 180
π
.

The Unit Circle

ii
QUICK REFERENCE PAGE 2

Trigonometric Identities

Pythagorean cos2 θ + sin2 θ = 1

Identity
Range −1 ≤ cos θ ≤ 1
−1 ≤ sin θ ≤ 1
Periodicity cos(θ ± 2π) = cos θ
sin(θ ± 2π) = sin θ
Symmetry cos(−θ) = cos θ
sin(−θ) = − sin θ

Sum and Difference Identities

cos(A + B) = cos A cos B − sin A sin B
cos(A − B) = cos A cos B + sin A sin B
sin(A + B) = sin A cos B + cos A sin B
sin(A − B) = sin A cos B − cos A sin B

Complementary Angle Identities

cos( π2 − A) = sin A
sin( π2 − A) = cos A

Double-Angle cos 2A = cos2 A − sin2 A

Identities sin 2A = 2 sin A cos A

Half-Angle cos2 θ = 1+cos 2θ

Identities sin2 θ = 1−cos 2θ

Other 1 + tan2 A = sec2 A

iii
QUICK REFERENCE PAGE 3
Table of Integrals
Differentiation Rules
xn+1
xn dx = +C
R
Function Derivative n+1
R 1
f (x) = cxa , a , 0, c ∈ R f 0 (x) = caxa−1 dx = ln(| x |) + C
R x
f (x) = sin(x) f 0 (x) = cos(x) e x dx = e x + C
f (x) = cos(x) f 0 (x) = − sin(x) sin(x) dx = − cos(x) + C
R

f (x) = tan(x) f 0 (x) = sec2 (x) cos(x) dx = sin(x) + C

f (x) = sec(x) f 0 (x) = sec(x) tan(x) sec2 (x) dx = tan(x) + C

1 1
f (x) = arcsin(x) f 0 (x) = √ dx = arctan(x) + C
R
1 − x2 1 + x2
1
f (x) = arccos(x) f 0 (x) = − √ R 1
dx = arcsin(x) + C
1 − x2 √
1 − x2
1
f (x) = arctan(x) f 0 (x) = −1
1 + x2 dx = arccos(x) + C
R
√
f (x) = e x f 0 (x) = e x 1 − x2
sec(x) tan(x) dx = sec(x) + C
R
f (x) = a x with a > 0 f 0 (x) = a x ln(a)
1 ax
a dx = +C
R x
f (x) = ln(x) for x > 0 f 0 (x) =
x ln(a)
Inverse Trigonometric Substitutions
Integral Trig Substitution Trig Identity
R √
a2 − b2 x2 dx bx = a sin(u) sin2 (x) + cos2 (x) = 1
R √
a2 + b2 x2 dx bx = a tan(u) sec2 (x) − 1 = tan2 (x)
R √
b2 x2 − a2 dx bx = a sec(u) sec2 (x) − 1 = tan2 (x)

Additional Formulas
f (x)g0 (x) dx = f (x)g(x) −
R R
Integration by Parts f 0 (x)g(x) dx
Rb
Areas Between Curves A = a |g(t) − f (t)| dt
Rb
Volumes of Revolutions: Disk I V = a π f (x)2 dx
Rb
Volumes of Revolutions: Disk II V = a π(g(x)2 − f (x)2 ) dx
Rb
Volumes of Revolutions: Shell V = a 2πx(g(x) − f (x)) dx
Rb p
Arc Length S = a 1 + ( f 0 (x))2 dx

Taylor Series (Maclaurin Series)

∞ Differential Equations
1
= xn = 1 + x + x2 + x3 + · · · R=1
P
1−x
n=0 Separable y0 = f (x)g(y)
∞
xn x2 x3
ex = n! = 1+ x
+ + + ··· R=∞
P
?
g(y) = 0, g(y) dy = f (x)dx
R 1 R
n=0
1! 2! 3! Solve
∞
x2n x2 x4 x6
cos(x) = (−1)n (2n)! = 1 − + + ··· R=∞ y0 = f (x)y + g(x)
P
2! 4! − 6! FOLDE
n=0 R
g(x)I(x)dx
R
∞
sin(x) = x2n+1
(−1)n (2n+1)!
P
= x− x3
+ x5
− x7
+ ··· R=∞ Solve y= I(x) , I(x) = e− f (x) dx
3! 5! 7!
n=0

iv
QUICK REFERENCE PAGE 4

LIST of THEOREMS: Chapter 5: Numerical Series

Geometric Series Test
Chapter 1: Integration Divergence Test
Integrability Theorem Arithmetic for Series Theorems
for Continuous Functions The Monotone Convergence Theorem
Properties of Integrals Theorem for Sequences
Integrals over Subintervals Theorem Comparison Test for Series
Average Value Theorem Limit Comparison Test
(Mean Value Theorem for Integrals) Integral Test for Convergence
Fundamental Theorem of Calculus (Part 1) p-Series Test
Extended Version of the Alternating Series Test (AST) and
Fundamental Theorem of Calculus the Error in the AST
Power Rule for Antiderivatives Absolute Convergence Theorem
Fundamental Theorem of Calculus (Part 2) Rearrangement Theorem
Change of Variables Theorem Ratio Test
Polynomial versus Factorial Growth Theorem
Chapter 2: Techniques of Integration Root Test
Integration by Parts Theorem
Integration of Partial Fractions Chapter 6: Power Series
p-Test for Type I Improper Integrals Fundamental Convergence Theorem
Properties of Type I Improper Integrals for Power Series
The Monotone Convergence Theorem Test for the Radius of Convergence
for Functions Equivalence of Radius of Convergence
Comparison Test Abel’s Theorem: Continuity of Power Series
for Type I Improper Integrals Addition of Power Series
Absolute Convergence Theorem Multiplication of Power Series by (x − a)m
for Improper Integrals Power Series of Composite Functions
p-Test for Type II Improper Integrals Term-by-Term Differentiation of Power Series
Uniqueness of Power Series Representations
Chapter 3: Applications of Integation Term-by-Term Integration of Power Series
Area Between Curves Taylor’s Theorem
Volumes of Revolution: Disk Methods Taylor’s Approximation Theorem I
Volumes of Revolution: Shell Method Convergence Theorem for Tayor Series
Arc Length Binomial Theorem
Generalized Binomial Theorem
Chapter 4: Differential Equations
Theorem for Solving
First-order Linear Differential Equations
Existence and Uniqueness Theorem
for FOLDE

v
Table of Contents

Page

1 Integration 1
1.1 Areas Under Curves . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Estimating Areas . . . . . . . . . . . . . . . . . . . . . . 1
1.1.2 Approximating Areas Under Curves . . . . . . . . . . . . 2
1.1.3 The Relationship Between Displacement and Velocity . . 8
1.2 Riemann Sums and the Definite Integral . . . . . . . . . . . . . . 13
1.3 Properties of the Definite Integral . . . . . . . . . . . . . . . . . . 18
1.3.1 Additional Properties of the Integral . . . . . . . . . . . . 19
1.3.2 Geometric Interpretation of the Integral . . . . . . . . . . 22
1.4 The Average Value of a Function . . . . . . . . . . . . . . . . . . 27
1.4.1 An Alternate Approach to the Average Value of a Function 28
1.5 The Fundamental Theorem of Calculus (Part 1) . . . . . . . . . . 30
1.6 The Fundamental Theorem of Calculus (Part 2) . . . . . . . . . . 40
1.6.1 Antiderivatives . . . . . . . . . . . . . . . . . . . . . . . . 41
1.6.2 Evaluating Definite Integrals . . . . . . . . . . . . . . . . 43
1.7 Change of Variables . . . . . . . . . . . . . . . . . . . . . . . . . 47
1.7.1 Change of Variables for the Indefinite Integral . . . . . . . 48
1.7.2 Change of Variables for the Definite Integral . . . . . . . . 52

2 Techniques of Integration 56
2.1 Inverse Trigonometric Substitutions . . . . . . . . . . . . . . . . . 56
2.2 Integration by Parts . . . . . . . . . . . . . . . . . . . . . . . . . 62
2.3 Partial Fractions . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
2.4 Introduction to Improper Integrals . . . . . . . . . . . . . . . . . . 80
2.4.1 Properties of Type I Improper Integrals . . . . . . . . . . 86
2.4.2 Comparison Test for Type I Improper Integrals . . . . . . 88
2.4.3 The Gamma Function . . . . . . . . . . . . . . . . . . . . 94
2.4.4 Type II Improper Integrals . . . . . . . . . . . . . . . . . . 96

3 Applications of Integration 100

3.1 Areas Between Curves . . . . . . . . . . . . . . . . . . . . . . . 100
3.2 Volumes of Revolution: Disk Method . . . . . . . . . . . . . . . . 108
3.3 Volumes of Revolution: Shell Method . . . . . . . . . . . . . . . . 114
3.4 Arc Length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

4 Differential Equations 122

4.1 Introduction to Differential Equations . . . . . . . . . . . . . . . . 122
4.2 Separable Differential Equations . . . . . . . . . . . . . . . . . . 124
4.3 First-Order Linear Differential Equations . . . . . . . . . . . . . . 131

vi
4.4 Initial Value Problems . . . . . . . . . . . . . . . . . . . . . . . . 136
4.5 Graphical and Numerical Solutions to Differential Equations . . . 140
4.5.1 Direction Fields . . . . . . . . . . . . . . . . . . . . . . . 140
4.5.2 Euler’s Method . . . . . . . . . . . . . . . . . . . . . . . . 142
4.6 Exponential Growth and Decay . . . . . . . . . . . . . . . . . . . 145
4.7 Newton’s Law of Cooling . . . . . . . . . . . . . . . . . . . . . . 149
4.8 Logistic Growth . . . . . . . . . . . . . . . . . . . . . . . . . . . 152

5 Numerical Series 162

5.1 Introduction to Series . . . . . . . . . . . . . . . . . . . . . . . . 162
5.2 Geometric Series . . . . . . . . . . . . . . . . . . . . . . . . . . 167
5.3 Divergence Test . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
5.4 Arithmetic of Series . . . . . . . . . . . . . . . . . . . . . . . . . 173
5.5 Positive Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
5.5.1 Comparison Test . . . . . . . . . . . . . . . . . . . . . . . 180
5.5.2 Limit Comparison Test . . . . . . . . . . . . . . . . . . . 185
5.6 Integral Test for Convergence of Series . . . . . . . . . . . . . . 189
5.6.1 Integral Test and Estimation of Sums and Errors . . . . . 199
5.7 Alternating Series . . . . . . . . . . . . . . . . . . . . . . . . . . 202
5.8 Absolute versus Conditional Convergence . . . . . . . . . . . . . 213
5.9 Ratio Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
5.10 Root Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227

6 Power Series 229

6.1 Introduction to Power Series . . . . . . . . . . . . . . . . . . . . 229
6.1.1 Finding the Radius of Convergence . . . . . . . . . . . . 236
6.2 Functions Represented by Power Series . . . . . . . . . . . . . . 240
6.2.1 Building Power Series Representations . . . . . . . . . . 241
6.3 Differentiation of Power Series . . . . . . . . . . . . . . . . . . . 244
6.4 Integration of Power Series . . . . . . . . . . . . . . . . . . . . . 253
6.5 Review of Taylor Polynomials . . . . . . . . . . . . . . . . . . . . 257
6.6 Taylor’s Theorem and Errors in Approximations . . . . . . . . . . 269
6.7 Introduction to Taylor Series . . . . . . . . . . . . . . . . . . . . . 277
6.8 Convergence of Taylor Series . . . . . . . . . . . . . . . . . . . . 281
6.9 Binomial Series . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
6.10 Additional Examples and Applications of Taylor Series . . . . . . 290

vii
Chapter 1

Integration

Many operations in mathematics have an inverse operation: addition and subtraction;

multiplication and division; raising a number to the nth power and finding its nth
root; taking a derivative and finding its antiderivative. In each case, one operation
“undoes” the other. In this chapter, we begin the study of the integral and integration.
Soon you will understand that integration is the inverse operation of differentiation.

1.1 Areas Under Curves

The two most important ideas in calculus - differentiation and integration - are both
motivated from geometry. The problem of finding the tangent line led to the definition
of the derivative. The problem of finding area will lead us to the definition of the
definite integral.

1.1.1 Estimating Areas

Our objective is to find the area under the curve of some function.
What do we mean by the area under a curve?
The question about how to calculate areas is actually thousands of years old and it is
one with a very rich history. To motivate this topic, let’s first consider what we know
about finding the area of some familiar shapes. We can easily determine the area of
a rectangle or a right-angled triangle, but how could we explain to someone why the
area of a circle with radius r is πr2 ?
The problem of calculating the area of a circle was studied by the ancient Greeks. In
particular, both Archimedes and Eudoxus of Cnidus used the Method of Exhaustion
to calculate areas. This method used various regular inscribed polygons of known
area to approximate the area of an enclosed region.
Chapter 1: Integration 2

In the case of a circle, as the number of sides of the inscribed polygon increased, the
error in using the area of the polygon to approximate the area of the circle decreased.
As a result, the Greeks had effectively used the concept of a limit as a key technique
in their calculation of the area.

1.1.2 Approximating Areas Under Curves

Let’s use the ideas from the Method of Exhaustion and try to find the area underneath
a parabola by using rectangles as a basis for the approximation.
y
f (x) = x2
2

Suppose that we have the 1.5

function f (x) = x2 . Consider the
region R bounded by the graph of 1 (1, 1)
f , by the x-axis, and by the lines
0.5
x = 0 and x = 1.
R
−1 −0.5 0 0.5 1 x

How could we determine the area of this irregular region?

y
f (x) = x2
For our first estimate, we can 2
approximate the area of R by
1.5
constructing a rectangle R1 of
length 1 (from x = 0 to x = 1) R1
1 (1, 1)
and height 1 (y = f (1) = 12 = 1).
This rectangle (in this case a 0.5 height
square) has area R
length × height = 1 × 1 = 1. x
−1 −0.5 0 0.5 1
length
y
f (x) = x2
2

1.5
The diagram shows that the area
of rectangle R1 is larger than the R1
1 (1, 1)
area of region R. Moreover, the
error is actually quite large. 0.5 error
R
−1 −0.5 0 0.5 1 x

We can find a better estimate if we split the interval [0, 1] into 2 equal subintervals,
[0, 12 ] and [ 21 , 1].

Calculus 2 (B. Forrest)2

Section 1.1: Areas Under Curves 3

y
Using these intervals, two f (x) = x2
2
rectangles are constructed. The
first rectangle R1 has its length 1.5
from x = 0 to x = 21 with height
R2
equal to f ( 12 ) = 212 = 41 . 1 (1, 1)

The second rectangle R2 has its 0.5

R1
length from x = 21 to x = 1 with 0.25 ( 12 , 14 )
height f (1) = 12 = 1. −1 −0.5 0 0.5 1 x

The area of rectangle R1 is equal to

1 1 1
R1 = length × height = × 2 =
2 2 8
while the area of rectangle R2 is equal to
1 1
R2 = length × height = ×1=
2 2

Our second estimate for the area of the original region R is obtained by adding the
areas of these two rectangles to get
1 1 5
R1 + R2 = + = = 0.625
8 2 8
Observe from the diagram that y
our new estimate using two f (x) = x2
2
rectangles for the area under
f (x) = x2 on the interval [0, 1] is 1.5
much better than our first
estimate since the error is R2 (1, 1)
1
smaller. The region containing error
the dashed lines indicates the 0.5
improvement in our estimate (this R1
0.25 error ( 12 , 41 )
is the amount by which we have
reduced the error from our first −1 −0.5 0 0.5 1 x
estimate).
To improve our estimate even further, divide the interval [0, 1] into five equal
subintervals of the form
i−1 i
[ , ]
5 5
where i ranges from 1 to 5.
This produces the subintervals
1 1 2 2 3 3 4 4 5
[0, ], [ , ], [ , ], [ , ], [ , ]
5 5 5 5 5 5 5 5 5
each having equal lengths of 15 .

Calculus 2 (B. Forrest)2

Chapter 1: Integration 4

Next we construct five new rectangles where the ith rectangle forms its length from
i−1
5
to 5i and has height equal to the value of the function at the right-hand endpoint
of the interval. That is, the height of a rectangle is f (x) = x2 where x = 5i or
i i
f ( ) = ( )2
5 5
i2
= 2
5

The area of the ith rectangle is given by

1 i2 i2
length × height = × 2 = 3
5 5 5

y
f (x) = x2
2 error
area under curve

1.5

R5
1 (1,1)
R4
0.5 R3
R2
R1
0 1
5
2
5
3
5
4
5
5
5
=1 x
1
5

Our new estimate is the sum of the areas of these rectangles which is
1 1 2 1 3 1 4 1 5 1
R1 + R2 + R3 + R4 + R5 = [( )2 ( )] + [( )2 ( )] + [( )2 ( )] + [( )2 ( )] + [( )2 ( )]
5 5 5 5 5 5 5 5 5 5
12 22 32 42 52
= 3+ 3+ 3+ 3+ 3
5 5 5 5 5
1 2
= (1 + 22 + 32 + 42 + 52 )
53
5
1 X 2
= i
53 i=1

Note: It can be shown that for any n

n
X (n)(n + 1)(2n + 1)
i2 =
i=1
6

Calculus 2 (B. Forrest)2

Section 1.1: Areas Under Curves 5

This means that the sum of the areas of the rectangles is

5
1 X 2 1 (5)(5 + 1)(2(5) + 1)
3
i = 3×
5 i=1 5 6
1 (5)(6)(11)
= ×
53 6
11
=
25

= 0.44

So far the estimates for the area under the curve of f (x) = x2 on the interval [0, 1]
are:

Number of Subintervals Length of Subinterval Estimate for Area

(Rectangles) (Width of Rectangle) under Curve
1 1 1
1
2 2
0.625
1
5 5
0.44

Observe from the diagram that the estimate for the area is getting better while the
error in the estimate is getting smaller.

y
f (x) = x2
2

1.5

Let’s repeat this process again by 1 (1,1)

using 10 equal subintervals.

Ri
0.5

0 i−1 i x
10 10

Calculus 2 (B. Forrest)2

Chapter 1: Integration 6

In this case, the sum of the areas of the 10 rectangles will be

10
X
Ri = R1 + R2 + R3 + R4 + R5 + R6 + R7 + R8 + R9 + R10
i=1
10
1 X 2
= i
103 i=1
1 (10)(10 + 1)(2(10) + 1)
=
103 6
77
=
200

= 0.385

If we were to use 1000 subintervals, the estimate for the area would be

1000
X
Ri = R1 + R2 + R3 + . . . + R1000
i=1
1000
1 X 2
= i
10003 i=1
1 (1000)(1000 + 1)(2(1000) + 1)
=
10003 6

= 0.3338335

You should begin to notice that as we increase the number of rectangles (number
of subintervals), the total area of these rectangles seems to be getting closer and
closer to the actual area of the original region R. In particular, if we were to produce
an accurate diagram that represents 1000 rectangles, we would see no noticeable
difference between the estimated area and the true area. For this reason we would
expect that our latest estimate of 0.3338335 is actually very close to the true value of
the area of region R.
We could continue to divide the interval [0, 1] into even more subintervals. In fact,
we can repeat this process with n subintervals for any n ∈ N. In this generic case, the
estimated area Rn would be
n
1 X 2
Rn = 3 i
n i=1
1 (n)(n + 1)(2(n) + 1)
=
n3 6
1
n3
(2n3 + 3n2 + n)
=
6
2 + n3 + 1
n2
=
6

Calculus 2 (B. Forrest)2

Section 1.1: Areas Under Curves 7

Note that if we let the number of subintervals n approach ∞, then

2 + n3 + 1
n2
lim Rn = lim
n→∞ n→∞ 6
2
=
6
1
=
3

By calculating the area under the graph of f (x) = x2 using an increasing number of
rectangles, we have constructed a sequence of estimates where each estimate is larger
than the actual area. Though it appears that the limiting value 31 is a plausible guess
for the actual value of the area, at this point the best that we can say is that the area
should be less than or equal to 13 .

Number of Subintervals Length of Subinterval Estimate for Area

(Rectangles) (Width of Rectangle) under Curve
1 1 1
1
2 2
0.625
1
5 5
0.44
1
10 10
0.385
1
1000 1000
0.3338335
1
approaches ∞ approaches 0 approaches 3

y
f (x) = x2
Alternately, we can use a 2
similar process that would
produce an estimate for the
area that will be less than the 1.5
actual value. To do so we
again divide the interval
[0, 1] into n subintervals of
length n1 with the i-th interval 1 (1,1)
[ i−1
n n
, i ]. This interval again
forms the length of a
rectangle Li , but this time we 0.5 Li
will use the left-hand First
rectangle has
endpoint of the interval so height 0
that the value f ( i−1n
) is the
height of the rectangle. 0 i−1 i x
n n

Calculus 2 (B. Forrest)2

Chapter 1: Integration 8

In this case, notice that since f (0) = 0 the first rectangle is really just a horizontal
line with area 0. Then the estimated area Ln for this generic case would be
n
1 X
Ln = (i − 1)2
n3 i=1
1 (n − 1)(n + 1 − 1)(2(n − 1) + 1)
=
n3 6
1 (n − 1)(n)(2n − 1)
=
n3 6
1 2n3 − 3n2 + n
=
n3 6
2 − n3 + 1
n2
=
6
Finally, observe that
2 − 3n + 1
n2 1
lim Ln = lim = .
n→∞ n→∞ 6 3
In summary, we have now shown that if R is the area of the region under the graph
of f (x) = x2 , above the x-axis, and between the lines x = 0 and x = 1, then for each
n ∈ N,
Ln ≤ R ≤ Rn .
It would be reasonable to y
conclude that the area under the f (x) = x2
2
graph of f (x) = x2 bounded by
the x-axis and the lines x = 0 and 1.5
x = 1 is precisely the limit 13 .
That is, the process of using more 1 (1, 1)
and more rectangles to estimate
0.5
the area under the curve gives us
a sequence of values that R = 13
converge to the actual area under −1 −0.5 0 0.5 1 x
the curve.

1.1.3 The Relationship Between Displacement and Velocity

In the previous section we looked at a method to determine the area under a curve.
In this section we will look at a different problem–finding a geometric relationship
between displacement and velocity. Perhaps surprisingly, the problem of finding the
area under a curve and the geometric relationship between displacement and velocity
are related to one another.
From the study of differentiation, we know that if s(t) represents the displacement
(or position) of an object at time t and v(t) represents its velocity, then
ds
= s 0 (t) = v(t)
dt

Calculus 2 (B. Forrest)2

Section 1.1: Areas Under Curves 9

In other words, the derivative of the displacement (position) function is the velocity
function. By implementing the method we used to calculate area in the last section,
we will now see that another relationship exists between displacement and velocity.
Suppose that we are going to take a trip in a car along a highway. Our task is to
determine how far we have travelled after two hours. Unfortunately, the odometer in
the car is broken. However, the speedometer is in working condition.
With proper planning, the data from the speedometer can be used to help estimate
how far we travelled. To see that this is plausible, suppose that we always travel
forward on the highway at a constant velocity, say 90 km/hr. We know from basic
physics that provided our velocity is constant, if s = displacement, v = velocity, and
∆t = time elapsed, then
s = v∆t

In our case, the velocity is v = 90 km/hr and the elapsed time is ∆t = 2 hrs. Hence,
the displacement (or distance travelled since we are always moving forward), is

s = v∆t = 90 km/hr × 2 hrs = 180 km

v
v(t) = 90 km/hr
Notice that the velocity function
v(t) = 90 is a constant function
90 s = v∆t and so its graph is a horizontal
line. The area below this constant
function is just a rectangle with
length from t = 0 to t = 2.
0 2 t

∆t
Hence, the area below the curve v(t) = 90 on the interval [0, 2] is

v × ∆t = 90 km/hr × 2 hrs = 180 km = s

Notice that the

area below the constant velocity function is equal to the displacement

during time ∆t!

Unfortunately, we will drive through some small towns with heavy traffic so traveling
at a constant velocity over the entire trip is impossible. How can we calculate the
distance travelled if our velocity varies over the 2 hour duration of the trip?
Consider that it is still reasonable to assume that over any very short interval of time
our velocity will be relatively constant.
Let us proceed in the following familiar manner. First we will assume our velocity
is always positive and so in this case our displacement will be equal to the distance
travelled.

Calculus 2 (B. Forrest)2

Chapter 1: Integration 10

Next let’s separate the 2 hour duration of the trip into 120 one minute intervals

0 = t0 < t1 < t2 < t3 < · · · < ti−1 < ti < · · · < t120 = 2 hours

so that ti = i minutes = 60i hours. Let si be the distance travelled during time ti−1 until
ti . In other words, each si is the distance travelled in the i th minute of our trip. Then
if s is the total displacement (distance travelled), we have

s = (distance travelled in 1 st minute) + (distance travelled in 2nd minute) + · · ·

. . . + (distance travelled in i th minute) + · · · + (distance travelled in 120th minute)
= s1 + s2 + s3 + · · · + si + · · · + s120

120
X
= si
i=1

Let v(t) be the function that represents the velocity at time t along the trip, again
assuming that v(t) > 0. At the end of each minute ti , the velocity on the speedometer
is recorded. That is, v(ti ) is determined.

v si v(ti )∆ti
(ti , v(ti ))
v = v(t)

0 2 t
ti−1 ti

∆ti

Next let ∆ti denote the elapsed time between ti−1 and ti so that ∆ti = ti −ti−1 . However,
each interval has the same elapsed time, namely 601 of an hour (or 1 minute). Since
it makes sense to assume that the velocity does not vary much over any one minute
period, we can assume that the velocity during the interval [ti−1 , ti ] was the same as it
was at ti . From this assumption, the previous formula (s = v∆t) is used to estimate si
so that
1
si v(ti )∆ti = v(ti )
60
Finally, we have the estimate for s:
120 120 120
X X X 1
s= si v(ti )∆ti = v(ti )
i=1 i=1 i=1
60

Calculus 2 (B. Forrest)2

Section 1.1: Areas Under Curves 11

To find an even better estimate, we could measure the velocity every second. This
means we would divide the two hour period into equal subintervals [ti−1 , ti ] each of
length 36001
hours (i.e., 1 hr × 60 min/hr × 60 sec/min = 3600 sec/hr and 2 hrs ×
3600 seconds/hr = 7200 seconds). We again let si denote the distance travelled over
the i th interval. This time we have
1
si v(ti )∆ti = v(ti )
3600
and
7200 7200 7200
X X X 1
s= si v(ti )∆ti = v(ti )
i=1 i=1 i=1
3600

In fact, for any Natural number n > 0, we can divide the interval [0, 2] into n equal
parts of length 2n by choosing

0 = t0 < t1 < t2 < t3 < · · · < ti−1 < ti < · · · < tn = 2

where ti = 2i
n
for each i = 1, 2, 3, . . . , n. If we let
n n
X X 2
Sn = v(ti )∆ti = v(ti )
i=1 i=1
n

then it can be shown that the sequence {S n } converges and that

lim {S n } = s
n→∞

Let’s consider what this last statement means geometrically. The diagram shows the
graph of velocity as a function of time over the interval [0, 2] partitioned into n equal
subintervals.

v = v(t)

t
t0 = 0 tn = 2
ti−1 ti

We have that
si v(ti )∆ti
but
v(ti )∆ti
is just the area of the shaded rectangle with height v(ti ) and length ∆ti .

Calculus 2 (B. Forrest)2

Chapter 1: Integration 12

v si = v(ti )∆ti
= area of rectangle
v = v(t)

v(ti )

0 2 t
ti−1 ti

∆ti

Moreover, if
n
X
Sn = v(ti )∆ti ,
i=1

then S n is the sum of the areas of all of the rectangles in the diagram.

v = v(t)

0 2 t
S n = sum of areas of all rectangles

Notice that S n closely approximates the area bounded by the graph of v = v(t), the
t-axis, the line t = 0 and the line t = 2. If n approaches ∞, we are once again led to
conclude that the

displacement (distance travelled) equals the area under the graph of the velocity function.

v
Note: This is the same process we used to
find the area under the graph of f (x) = x2 . v = v(t)
This example shows geometrically that
the displacement (distance travelled) from Distance travelled (S)
ti−1 through ti is equal to the area under = Area under curve
the graph of the velocity function v = v(t)
bounded by the t-axis, t = ti−1 and t = ti . 0 2 t

Calculus 2 (B. Forrest)2

Section 1.2: Riemann Sums and the Definite Integral 13

1.2 Riemann Sums and the Definite Integral

In this section, the notion of a Riemann sum is introduced and it is used to define the
definite integral.1
Suppose that we have a function f that is bounded on a closed interval [a, b]. We
begin the construction of a Riemann sum by first choosing a partition P for the
interval [a, b]. By a partition we mean a finite increasing sequence of numbers of the
form
a = t0 < t1 < t2 < · · · < ti−1 < ti < · · · < tn−1 < tn = b.

This partition subdivides the interval [a, b] into n subintervals

[t0 , t1 ], [t1 , t2 ], · · · , [ti−1 , ti ], · · · , [tn−2 , tn−1 ], [tn−1 , tn ].

Note that these subintervals need not be equal in length.

For each i = 1, 2, . . . , n, the
length of the ith subinterval ∆ti f
[ti−1 , ti ] is denoted by ∆ti . In other f (ci )
words, ∆ti = ti − ti−1 .
The norm of the partition P is the
length of the widest subinterval
which we denote by
t
0 a b
kPk = max{∆t1 , ∆t2 , . . . , ∆tn }. ti−1 ti
ci
Now for each i = 1, 2, . . . , n, a
point ci ∈ [ti−1 , ti ] is chosen.2
Given these conditions, we can now define a Riemann sum for the partition P.

DEFINITION Riemann Sum

Given a bounded function f on [a, b], a partition P

a = t0 < t1 < t2 < · · · < ti−1 < ti < · · · < tn−1 < tn = b

of [a, b], and a set {c1 , c2 , . . . , cn } where ci ∈ [ti−1 , ti ], then a Riemann sum for f with
respect to P is a sum of the form
n
X
S = f (ci )∆ti .
i=1

1
Riemann sums are named after the German mathematician Georg Friedrich Bernhard Riemann
(1826-1866) who worked on the theory of integration among many other accomplishments in analysis,
number theory and geometry.
2
For a generic Riemann sum, the widths of the subintervals do not have to be equal to one another
and each ci need not be the midpoint of each subinterval.

Calculus 2 (B. Forrest)2

Chapter 1: Integration 14

The next diagram represents a Riemann sum for a function f defined on the interval
[1, 4].
∆ti f
Since the function is positive on f (ci )
this interval, the terms f (ci )∆ti
represent the area of the rectangle
with length equal to the
subinterval [ti−1 , ti ] and height
given by f (ci ). In the diagram, t
0 1
the dashed lines represent the ti−1 ti 4
location of the points ci . ci

Notice the similarity between these sums and the sums we used in the previous sec-
tion to determine the area under the graph of f (x) = x2 . This similarity occurs
because the latter sums were actually special types of Riemann sums.

DEFINITION Regular n-Partition

Given an interval [a, b] and an n ∈ N, the regular n−partition of [a, b] is the partition
P(n) with
a = t0 < t1 < t2 < · · · < ti−1 < ti < · · · < tn−1 < tn = b

of [a, b] where each subinterval has the same length ∆ti = b−a
n
.

In this case,
b−a
a = t0 = a + 0 · ( ),
n
b−a
t1 = a+1·( ),
n
b−a
t2 = a+2·( ),
n
..
.
b−a
ti = a+i·( ),
n
..
.
b−a
tn = a+n·( ) = b.
n

Calculus 2 (B. Forrest)2

Section 1.2: Riemann Sums and the Definite Integral 15

DEFINITION Right-hand Riemann Sum

The right-hand Riemann sum for f with respect to the partition P is the Riemann sum
R obtained from P by choosing ci to be ti , the right-hand endpoint of [ti−1 , ti ]. That is
n
X
R= f (ti )∆ti .
i=1

If P(n) is the regular n-partition, we denote the right-hand Riemann sum by

n n
X X b−a
Rn = f (ti )∆ti = f (ti )
i=1 i=1
n
n !! !
X b−a b−a
= f a+i
i=1
n n

DEFINITION Left-hand Riemann Sum

The left-hand Riemann sum for f with respect to the partition P is the Riemann sum
L obtained from P by choosing ci to be ti−1 , the left-hand endpoint of [ti−1 , ti ]. That is
n
X
L= f (ti−1 )∆ti .
i=1

If P(n) is the regular n-partition, we denote the left-hand Riemann sum by

n n
X X b−a
Ln = f (ti−1 )∆ti = f (ti−1 )
i=1 i=1
n
n !! !
X b−a b−a
= f a + (i − 1)
i=1
n n

EXAMPLE 1
(a) Right-hand Riemann sum (b) Left-hand Riemann sum
using right endpoints using left endpoints
overestimate underestimate
(1, 1) (1, 1)
f (x) = x2 f (x) = x2

0 x 0 x
xi−1 xi 1 xi−1 xi 1

Calculus 2 (B. Forrest)2

Chapter 1: Integration 16

A closer look at the examples in the previous section reveals that the sums used
to estimate the area under the graph of f (x) = x2 were right-hand Riemann sums.
Similarly, the sums used to find the distance travelled were right-hand Riemann sums
of the velocity function v(t). Moreover, we saw that if we let n approach ∞, then
these sequences of Riemann sums converged to the area under the graph of f (x) = x2
and the total distance travelled, respectively. These examples motivate the following
definition:

DEFINITION Definite Integral

We say that a bounded function f is integrable on [a, b] if there exists a unique
number I ∈ R such that if whenever {Pn } is a sequence of partitions with lim kPn k = 0
n→∞
and {S n } is any sequence of Riemann sums associated with the Pn ’s, we have

lim S n = I.
n→∞

In this case, we call I the integral of f over [a, b] and denote it by3
Z b
f (t) dt
a

The points a and b are called the limits of integration and the function f (t) is called
the integrand. The variable t is called the variable of integration.

NOTE
The variable of integration is sometimes called a dummy variable in the sense that
if we were to replace t’s by x’s everywhere, we would not change the value of the
integral.

It might seem difficult to find such a number I or even to know if it exists. The next
result tells us that if f is continuous on [a, b], then it is integrable. It also shows that
the integral can be obtained as a limit of Riemann sums associated with the regular
n-partitions.
R
3
In the 17th century, Gottfried Wilhelm Leibniz introduced the notation for the integral sign
which represents an elongated S from the Latin word summa.

Calculus 2 (B. Forrest)2

Section 1.2: Riemann Sums and the Definite Integral 17

THEOREM 1 Integrability Theorem for Continuous Functions

Let f be continuous on [a, b]. Then f is integrable on [a, b]. Moreover,
Z b
f (t) dt = lim S n
a n→∞

where n
X
Sn = f (ci )∆ti
i=1

is any Riemann sum associated with the regular n-partitions. In particular,

Z b n
X b−a
f (t) dt = lim Rn = lim f (ti )
a n→∞ n→∞
i=1
n

and n
Z b X b−a
f (t) dt = lim Ln = lim f (ti−1 )
a n→∞ n→∞
i=1
n

REMARK
This theorem also holds if f is bounded and has finitely many discontinuities on
[a, b]. The proof of this theorem is beyond the scope of this course.

EXAMPLE 2 We have already seen that if f (x) = x2 on [0, 1], then

n
1 X 2 1 f (x) = x2 (1, 1)
Rn = i
n3 i=1
1 (n)(n + 1)(2(n) + 1) Ri
=
n3 6
2 + 3n + 1
n2
=
6 1
It follows that

1 f (x) = x2 (1, 1)
1 2 + 3n + 1
Z
n2
x dx = lim
2
0 n→∞ 6
2
=
6 R1
1 x2 dx
= 0
3
1

Calculus 2 (B. Forrest)2

Chapter 1: Integration 18

Soon we will see how to calculate integrals by means other than using limits of
Riemann sums. However, before ending this section, consider the following
important example.

EXAMPLE 3 Let f (t) = α for each t ∈ [a, b].

Then
n
X
Rn = f (ti )∆ti
i=1
n
b−a f (t) = α
α
X
= α( )
i=1
n
Rb
n α dt = α(b − a) α
a
X b−a
= α
i=1
n
a b
b−a
= α(b − a) n

n intervals
Since Rn = α(b − a) for each n, it
Z b
follows that α dt = α(b − a).
a

In other words, if f is any constant function (for example, α), then the integral of
f over the limits of integration from a to b is just α times the length of the interval
[a, b] or α(b − a).

1.3 Properties of the Definite Integral

Since the integral is a limit of a sequence, we would expect many of the limit laws to
hold. The next theorem shows that this is indeed the case.

THEOREM 2 Properties of Integrals

Assume that f and g are integrable on the interval [a, b]. Then:
Rb Rb
i) For any c ∈ R, a c f (t) dt = c a f (t) dt.
Rb Rb Rb
ii) a ( f + g)(t) dt = a f (t) dt + a g(t) dt.
Rb
iii) If m ≤ f (t) ≤ M for all t ∈ [a, b], then m(b − a) ≤ a f (t) dt ≤ M(b − a).
Rb
iv) If 0 ≤ f (t) for all t ∈ [a, b], then 0 ≤ a f (t) dt.
Rb Rb
v) If g(t) ≤ f (t) for all t ∈ [a, b], then a g(t) dt ≤ a f (t) dt.
Rb Rb
vi) The function | f | is integrable on [a, b] and | a f (t) dt | ≤ a | f (t) | dt.

Calculus 2 (B. Forrest)2

Section 1.3: Properties of the Definite Integral 19

Properties (i) and (ii) in the previous theorem follow immediately from the rules of
arithmetic for convergent sequences. Property (iv) can be deduced from Property (iii)
and Property (v) can be obtained from Properties (i), (ii) and (iv).
Let’s consider why Property (iii) is true.

Assume that M
f
m ≤ f (t) ≤ M
m
for all t ∈ [a, b].
t
Let
a = t0 < t1 < t2 < · · · < ti−1 < ti < · · · < tn−1 < tn = b
n
be any partition of [a, b]. We first observe that ∆ti = b − a. Then since
P
i=1
m ≤ f (ti ) ≤ M,
n
X n
X n
X
m(b − a) = m∆ti ≤ f (ti )∆ti ≤ M∆ti = M(b − a).
i=1 i=1 i=1

M
f
m
m(b − a)
a b t
It then follows
M
Z b f
m(b − a) ≤ f (t) dt ≤ M(b − a) m
Rb
f (t) dt
a a

as expected. a b t
M
f
M(b − a)
m
a b t
Property (vi) canR be derived by applying the triangle inequality to the Riemann sums
b
associated with a f (t) dt.

1.3.1 Additional Properties of the Integral

Up until now, in defining the definite integral we have always considered integrals of
the form Z b
f (t) dt
a
where a < b. However, it is necessary to give meaning to
Z a
f (t) dt
a

Calculus 2 (B. Forrest)2

Chapter 1: Integration 20

and to Z a
f (t) dt.
b
Ra
How do we define a
f (t) dt?

If we assume that f (a) > 0, we y = f (t)

can again view this integral as the 2
“area” of the region below the
graph of y = f (t), but this time 1 (a, f (a))
the interval is “from x = a to
(a, 0)
x = a.” This is a degenerate
−1 −0.5 0 a 0.5 1
rectangle that is just the line t
segment joining (a, 0) and −1

(a, f (a)).
We can see that the line segment has height f (a) but length 0. As such it makes sense
to define its “area” to be 0. In keeping with our theme that the integral of a positive
function represents area, we are led to the following definition.

Ra
DEFINITION a
f (t) dt [Identical Limits of Integration]
Let f (t) be defined at t = a. Then we define
Z a
f (t) dt = 0.
a

Recall the convention that moving to the right represents a positive amount and mov-
ing to the left represents a negative amount. In the definition of
Z b
f (t) dt
a

where a < b, we began at the left-hand endpoint a of an interval [a, b] and moved to
the right towards b. In the case of the integral
Z a
f (t) dt
b

where a < b, we are suggesting that using the interval [a, b] we move from b to the
left towards a. This is the opposite or negative of the original orientation. For this
reason, we define:

Ra
DEFINITION b
f (t) dt [Switching the Limits of Integration]
Let f be integrable on the interval [a, b] where a < b. Then we define
Z a Z b
f (t) dt = − f (t) dt.
b a

Calculus 2 (B. Forrest)2

Section 1.3: Properties of the Definite Integral 21

EXAMPLE 4 Recall that we have already seen

Z 1
1
x2 dx = .
0 3
It follows that
Z 0
1
x2 dx = − .
1 3
The next property shows us how to separate an integral over one region into the
sum of integrals over two or more regions. To motivate this rule, assume that f is
continuous and positive on [a, b] with a < c < b.
We know that the integral 3
y = f (t)
Z b
2
f (t) dt
a
1 Rb
represents the area of the region R= a
f (t)dt
R bounded by the graph of
a 0 b t
y = f (t), the t-axis, and the lines
t = a and t = b. −1

However, the line t = c separates the region R into two subregions, which we denote
by R1 and R2 .
We also have that
Z c
3
y = f (t)
f (t) dt 2
a

and R1 c R2 =
Z b
R1 = f (t)dt b f (t)dt
R
f (t) dt a c
c 0
a c b t
represent the areas of regions R1
−1
and R2 , respectively.
The diagrams show that the area of R is the sum of the areas of R1 and R2 . In other
words, R = R1 + R2 . But this suggests that
Z b Z c Z b
f (t) dt = f (t) dt + f (t) dt.
a a c

The definition of the integral can be used to prove this fact.

THEOREM 3 Integrals over Subintervals

Assume that f is integrable on an interval I containing a, b and c. Then
Z b Z c Z b
f (t) dt = f (t) dt + f (t) dt.
a a c

Calculus 2 (B. Forrest)2

Chapter 1: Integration 22

Note: The proof of this theorem is not part of this course.

This theorem also holds when c lies outside the interval [a, b]. Consider the following
example.

EXAMPLE 5 Assume that f is integrable on the interval [a, c] where a < b < c. Then we have that

Rc Rb
Since b
f (x) dx = − c
f (x) dx, then the previous theorem holds. That is,

Z b Z c Z c
f (x) dx = f (x) dx − f (x) dx
a a b
Z c Z b
= f (x) dx + f (x) dx.
a c

1.3.2 Geometric Interpretation of the Integral

f (x) = x2
We have already seen from our
study of Riemann sums that the
area of the region R bounded by
the graph of f (x) = x2 , by the 1 (1, 1)
Z by the lines x = 0 and
x-axis, and
1
x = 1 is x2 dx.
0 R

1 x 0
In fact, whenever f(x) ≥ 0 on all of [a, b], theR area under f (x) and above the x-axis
b
bounded by the lines x = a and x = b will be a f (x) dx. However, what happens if
f (x) ≤ 0
on some part of [a, b]?

Calculus 2 (B. Forrest)2

Section 1.3: Properties of the Definite Integral 23

y = f (x)
For example, assume
Z that f is as shown in
4
the diagram. Then f (x) dx is simply R
1
the area of the region R bounded by the
graph of f , the x-axis, and the lines x = 1
0 1 4 x
and x = 4.

Z 0
Suppose instead that we wanted to calculate f (x) dx.
−2
Notice that the function f is
negative on the interval [−2, 0].
Consider a term in a generic
Riemann sum from the regular
y = f (x)
n−partition:
2
f (ci )
n
Then n2 is the length of the
rectangle (i.e., the length of the ci
interval [−2, 0] is 2 and we divide −2 x
0 1 4
2 into n intervals, so the length of
each subinterval is 2n ). However, f (ci )
in this case f (ci ) < 0 (negative) 2
and it is the negative of the height n
of the rectangle since the graph
of f lies below the x-axis in this
interval.
It follows that the Riemann sum

n
X 2
Sn = f (ci )
i=1
n

approximates the negative of the area bounded by the graph of f , the x-axis, and the
lines x = −2 and x = 0.
y = f (x)

If we let n → ∞, then
Z 0 n
X 2
f (x) dx = lim S n = lim f (ci )
−2 n→∞ n→∞
i=1
n

is the negative of the area of the −2 x

0 1 4
region R1 . R1

Calculus 2 (B. Forrest)2

Chapter 1: Integration 24

y = f (x)

Suppose now that we wanted to

evaluate
Z 4
f (x) dx.
−2
−2 0 4 x

Then, by the properties of the integral, we can write

Z 4 Z 0 Z 4
f (x) dx = f (x)dx + f (x)dx.
−2 −2 0

But we have just seen that

Z 0 y = f (x)
f (x) dx
−2

represents the negative of the

area of region R1 and we also R2
know that
Z 4
−2 0 4 x
f (x) dx R1
0

represents the area of the region

R2 .
Z 4
It follows that f (x) dx represents the area of region R2 minus the area of region
−2
R1 . Z 4
f (x) dx = R2 − R1
−2

In general, if f is a continuous function on the interval [a, b], then

Z b
f (x) dx
a

represents the area of the region under the graph of f that lies above the
x-axis between x = a and x = b minus the area of the region above the graph
of f that lies below the x-axis between x = a and x = b.

Calculus 2 (B. Forrest)2

Section 1.3: Properties of the Definite Integral 25

Z b
If you are not yet convinced that for f (x) ≤ 0 on [a, b], f (x) dx is simply the
a
negative of the area of the region above the graph of f , below the x-axis, and between
x = a and x = b, then the following example may convince you.

EXAMPLE 6 Consider the function f shown in the diagram and let g(x) = − f (x). (In other words,
g is a reflection of f in the x-axis.)

y
f

R2
x
−1 R1 1

Note that since area is preserved by reflection, the area of R1 and the area of R2
are equal positive Zvalues since area is always a positive number. Suppose that we
1
want to calculate f (x) dx. We note that f (x) ≤ 0 on [−1, 1]. This means that
−1 Z 1
g(x) = − f (x) ≥ 0 on [−1, 1]. We also have that g(x) dx is equal to the area of
−1
region R2 . But
Z 1 Z 1
f (x) dx = (−g(x)) dx
−1 −1
Z 1
= − g(x) dx
−1
= −(area of region R2 )
= −(area of region R1 )

since the area of R1 and Z the area of R2 are equal. As such, the example shows that
b
for f (x) ≤ 0 on [a, b], f (x) dx is the negative of the area of the region above the
a
graph of f , below the x-axis, and between x = a and x = b.

Z 3
EXAMPLE 7 Find (2x − 1) dx.
−2

When calculating definite integrals, it is always advisable to look at the graph of the
integrand, in this case f (x) = 2x − 1, if possible.

Calculus 2 (B. Forrest)2

Chapter 1: Integration 26

y
f (x) = 2x − 1
Since 2x − 1 = 0 when x = the 1
, 5
2
graph of f sits below the x-axis
between x = −2 and x = 12 and R2
above the x-axis between x = 12 0 x
1
and x = 3. This gives us the −2 R1 2 3
limits of integration for regions
R1 and R2 . −5

Then
1
Z 3 Z 2
Z 3
(2x − 1) dx = (2x − 1) dx + (2x − 1) dx
1
−2 −2 2

= −area region R1 + area region R2

That is, this is the area of the region R2 minus the area of the region R1 . The region
R1 is a right triangle with base extending from x = −2 to x = 12 . This means that its
base is 2.5 = 52 . Since f (−2) = −5, the diagram shows that the height of the triangle
is 5. Since the area of a triangle is 21 (base) × (height), it follows that the area of
region R1 is 21 ( 52 )5 = 25
4
. It is again the case that the base of the triangle R2 is 2.5 and
its height is f (3) = 2(3) − 1 = 5. It follows that the area of R2 is also 254 and so
Z 3
−25 25
(2x − 1) dx = + = 0.
−2 4 4

Notice that in this example we avoided using Riemann sums by interpreting the
integral geometrically (in this case, the area of two triangles).

Z 1 √
EXAMPLE 8 Find 1 − x2 dx. y
−1
1

This is the area under

√ the graph
of the function y = 1 − x2 .

−1 0 1 x
The shape of this region is that of a semi-circle with radius 1. To see that this is the
case, we note that y2 = 1 − x2 so x2 + y2 = 1. The latter equation is the√ equation of
the circle centered at the origin with radius 1. (Since by assumption 1 − x2 is the
positive square root, we are only interested in the top half of the circle.) A circle of

Calculus 2 (B. Forrest)2

Section 1.4: The Average Value of a Function 27

radius 1 has area π, so this half circle has area π2 . It follows that
1 √ π
Z
1 − x2 dx = .
−1 2

Problem!
Unfortunately, this method of evaluating integrals by identifying an easily
R 1 √calculated
area has severe limitations. For example, we would not be able to find − 1 1 − x2 dx
2
with what we know at present. Instead, there exists a powerful tool that can be used to
find the integral of general functions. This tool is called The Fundamental Theorem of
Calculus. Along with this theorem, you will be required to learn various techniques
in order to integrate a variety of functions. The remainder of this chapter will focus
on this task.
However, before we can state and prove the Fundamental Theorem of Calculus, we
must first investigate what is meant by the average value of a function over an interval
[a, b].

1.4 The Average Value of a Function

Question: What is meant by “the average value of a continuous function over an

interval [a, b]?”
We know that the average of n real numbers α1 , α2 , . . . , αn is
α1 + α2 + . . . + αn
.
n
But how do you add all of the values of f on [a, b]? Is this possible?
One approach would be to take sample
values of f and calculate the average of
these samples as an estimate of the
average value. However, we need a
satisfactory method for obtaining such f
samples. One method to ensure that the
choice of sample points is as
representative as possible is to use the
regular n-partition
f (ti )
a = t0 < t1 < · · · < tn−1 < tn = b
i(b − a) a ti b
where ti = a + and consider
n
n
P
f (ti )
i=1
.
n

Calculus 2 (B. Forrest)2

Chapter 1: Integration 28

To acquire an even better sample, more and more points need to be considered.
Therefore, it might make sense to define the average of f on [a, b] to be
n
P
f (ti )
i=1
lim
n→∞ n
if this limit exists.
However, for continuous functions the limit always exists. In fact,
n
P
f (ti ) n
i=1 1 X (b − a)
lim = lim f (ti )
n→∞ n n→∞ b − a n
i=1
n
1 X (b − a)
= lim f (ti )
b − a n→∞ i=1 n
1
= lim Rn (where Rn is the right-hand Riemann sum)
b − a n→∞
Z b
1
= f (t) dt
b−a a

We are led to the following definition:

DEFINITION Average Value of f

If f is continuous on [a, b], the average value of f on [a, b] is defined as
Z b
1
f (t) dt
b−a a

1.4.1 An Alternate Approach to the Average Value of a Function

Recall that the Extreme Value Theorem implies that there exists m, M such that

m ≤ f (x) ≤ M

for all x ∈ [a, b]. Moreover, there exists c1 , c2 ∈ [a, b] such that f (c1 ) = m, f (c2 ) = M.
It make sense that the average of f on [a, b] should occur between m and M. Now
Z b Z b Z b
m dx ≤ f (x) dx ≤ M dx
a a a

Therefore Z b
m(b − a) ≤ f (x) dx ≤ M(b − a)
a

Calculus 2 (B. Forrest)2

Section 1.4: The Average Value of a Function 29

Equivalently, Z b
1
m≤ f (x) dx ≤ M
b−a a
Z b
1
Let α = f (x) dx. Then
b−a a

f (c1 ) ≤ α ≤ f (c2 ).
By the Intermediate Value Theorem, there exists c between c1 and c2 such that
Z b
1
f (c) = α = f (x) dx
b−a a

Geometrically, it follows that

Area R1 + Area R3 = Area R2

y = f (x)

R1 R3
α = f (c)
c
R2
a b

In other words, the area above α = f (c) but below y = f (x) equals the area below
α = f (c) but above y = f (x).
Once again it makes sense to say that
Z b
1
f (c) = f (x) dx
b−a a

represents the average value of the function f on [a, b].

The following theorem is established.

THEOREM 4 Average Value Theorem (Mean Value Theorem for Integrals)

Assume that f is continuous on [a, b].
Then there exists a ≤ c ≤ b such that
Z b
1
f (c) = f (t) dt
b−a a

Calculus 2 (B. Forrest)2

Chapter 1: Integration 30

Important Note:
If b < a and if f is continuous on [b, a], then there exists b < c < a with
Z a
1
f (c) = f (t)dt
a−b b
Z b !
1
= − f (t)dt
a−b a
Z b
1
= f (t)dt
b−a a

so the Average Value Theorem holds even if b < a.

You have now been presented with all of the background information required to
understand the The Fundamental Theorem of Calculus. It is so named because it is
one of the most important results in mathematics!

1.5 The Fundamental Theorem of Calculus (Part 1)

The goal in this section is to introduce the Fundamental Theorem of Calculus which
is attributed independently to Sir Issac Newton and to Gottfried Leibniz. As the name
suggests, this is perhaps the most important theorem in Calculus and many would ar-
gue, one of the most important discoveries in the history of mathematics. Despite this
lofty claim, the Fundamental Theorem is at its heart a simple rule of differentiation.
However, from this simple rule, we can derive a method that will allow us to evaluate
many types of integrals without having to appeal to the complicated process involv-
ing Riemann sums. Consequently, the Fundamental Theorem of Calculus enables us
to link together differential calculus and integral calculus in a very profound way.
Let’s begin by assuming that the function f is continuous on an interval [a, b].
Let’s also define the integral function
Z x
G(x) = f (t) dt.
a

What does this integral function do? If f ≥ 0, then G(x) is the function that
calculates the area under the graph of y = f (t) as x varies over an interval [a, b]
starting from a.

Calculus 2 (B. Forrest)2

Section 1.5: The Fundamental Theorem of Calculus (Part 1) 31

y = f (t)

x
G(x) = Area =
R
a
f (t) dt
t
a 0 x

constant varies

The objective is to determine the rate of change in the area G(x) as x changes. In
other words, to find the derivative of the integral function G(x).
Before we consider the general case, let’s look at a simple example.

EXAMPLE 9 Let f (t) = 2t on the interval [0, 3]. Find a formula for G(x).
Recall that the integral function is defined by
Z x
G(x) = f (t) dt.
a

2t
)=
f (t

Then the integral function for

f (t) = 2t starting at a = 0 is
Z x
G(x) = 2t dt.
0

x
G(x) =
R
0
2t dt
t
0 x 3

constant varies
In other words, G(x) is the integral function that calculates the area under the curve

Calculus 2 (B. Forrest)2

Chapter 1: Integration 32

of f (t) = 2t as x varies over the interval [0, 3] starting at t = 0.

Let’s try to calculate the area under f (t) as t varies from x = 0, x = 1, x = 2 and
x = 3. We will use these values to see if we can determine G(x).
Case x = 0:
R0
If x = 0, we have that G(0) = 0 2t dt = 0 since the limits of integration are
identical. (There is no area to calculate.) Thus we have the area under f (t) on the
interval [0, 0] is 0 and G(0) = 0 .
Case x = 1:
R1
If x = 1, we have that G(1) = 0 2t dt and G(1) is the area under the graph of
f (t) = 2t on the interval [0, 1]. We can calculate this using geometry since the area
is a triangle.
Z 1
Area = G(1) = 2t dt
0
1
=

2t
× base × height
2

)=
f (t
1
= (1)(2(1))
2
2 (1, 2)
= 1 R1
G(1) = 0
2t dt = 1
Thus we have the area under f (t)
on the interval [0, 1] is 1 and t
G(1) = 1 . 0 x=1

Case x = 2:
R2
If x = 2, we have that G(2) = 0 2t dt and G(2) is the area under the graph of
f (t) = 2t on the interval [0, 2]. We can again calculate this using geometry since the
area is a triangle.

Z 2
Area = G(2) = 2t dt
0

1
2t

= × base × height 4 (2, 4)

2
f (t

1
= (2)(2(2))
2 R2
G(2) = 2t dt = 4
= 4 0

Thus we have the area under f (t)

t
on the interval [0, 2] is 4 and
0 x=2
G(2) = 4 .

Calculus 2 (B. Forrest)2

Section 1.5: The Fundamental Theorem of Calculus (Part 1) 33

Case x = 3:
R3
If x = 3, we have that G(3) = 0 2t dt and G(3) is the area under the graph of
f (t) = 2t on the interval [0, 3]. Once more we can calculate this using geometry
since the area is a triangle.

Z 3 6 (3, 6)
Area = G(3) = 2t dt
0

2t
= × base × height

)=
2

f (t
1
= (3)(2(3))
2 R3
G(3) = 0
2t dt = 9
= 9

Thus we have the area under f (t)

t
on the interval [0, 3] is 9 and
0 x=3
G(3) = 9 .
Though we could continue to allow x to vary and x G(x)
perform this calculation, let’s consider the results
for G(x). They are summarized in the following 1 1
table. 2 4
3 9
Notice that a pattern is forming. It appears that as .. ..
“x” varies, G(x) takes on the value of “x2 .” In fact, . .
this is indeed the case. x x2
Case: Generic x:
Rx
If x > 0, we have that G(x) = 0 2t dt and G(x) is the area under the graph of
f (t) = 2t on the interval [0, x]. We are still able to calculate this area using geometry
since the region is a triangle.

Z x
Area = G(x) = 2t dt
0 (x, 2x)
1
2t

= × base × height
)=

2
f (t

1
= (x)(2(x))
2 x
G(x) = 2t dt = x2
R
0
= x 2

Thus we have the area under f (t)

t
on the interval [0, x] is x2 and
0 x
G(x) = x2 .
Important Observation: Notice that G(x) = x2 and the derivative of G is

Calculus 2 (B. Forrest)2

Chapter 1: Integration 34

G0 (x) = 2x. In other words, we have just seen that

G0 (x) = f (x).

This means that Z x

d
G (x) =
0
f (t) dt = f (x).
dx 0

In the previous example, we were able to calculate the area geometrically because f
was a linear function and the region under the graph of f was always triangular.
Normally we will not have an integrand that has its area calculated so easily. We
will now discuss the case where f is a generic function.

y = f (t)
Again we begin by assuming that f (t) ≥ 0
is continuous on the interval [a, b] and let
the integral function be defined by
Z x
G(x) = f (t) dt.
a
x
G(x) = Area =
R
In this case, G(x) represents the area a
f (t) dt
bounded by the graph of f (t), the t-axis, t
a 0 x
and the lines t = a and t = x.
constant varies
The objective is to determine the rate of change in the area G(x) as x changes. In
other words, to find the derivative G0 (x) of the integral function G(x).

y = f (t)

First we increment x by adding a

small amount denoted by h. Then
G(x + h) is the area obtained by
adding the first region G(x) with
the shaded area between the lines
t = x and t = x + h. G(x + h)
G(x)
t
a 0 x x+h
h

Calculus 2 (B. Forrest)2

Section 1.5: The Fundamental Theorem of Calculus (Part 1) 35

y = f (t)
Next, consider G(x + h) − G(x).
This difference is exactly the
shaded area.
G(x + h) − G(x)
It is important to remember that
the area of this region can also be
expressed as an integral, namely
Z x+h
f (t) dt.
x t
a 0 x x+h
h

Moreover, the Average Value y = f (t)

Theorem tells us that there is a c
with x < c < x + h such that
G(x + h) − G(x)
Z x+h
f (t)dt = f (c)((x + h) − x)
x
f (c)
= f (c)h

So the area of this rectangle is

t
f (c)h. a 0 xcx+h
h
As the diagram suggests, when h is small this means that there exists a c with
x < c < x + h such that
G(x + h) − G(x) = f (c)h
and hence
G(x + h) − G(x)
= f (c).
h
However, if h is very small, then c must also be very close to x. Since f is
G(x + h) − G(x)
continuous, this means that f (c) = must be very close to f (x). In
h
fact, if we let h approach 0 from the right, we get that c must also approach x from
the right. All of this together with the assumption that f is continuous gives us
G(x + h) − G(x)
lim+ = lim+ f (c) = f (x).
h→0 h c→x

This is one side of the limit that defines G 0 (x). A similar argument shows that
G(x + h) − G(x)
lim− = f (x)
h→0 h

Calculus 2 (B. Forrest)2

Chapter 1: Integration 36

and hence that

G(x + h) − G(x)
lim = f (x).
h→0 h
That is,
G 0 (x) = f (x).

Note: When we calculated the one-sided limit we made a number of assumptions.

First, we assumed that f was continuous. This was used in two places. First, so that
we could apply the Average Value Theorem to get that
G(x + h) − G(x)
= f (c)
h
and second, to conclude that
lim f (c) = f (x).
c→x+

The other assumptions were that f (t) ≥ 0 and that the increment h was positive.
The assumption that f be continuous is essential, but the other two assumptions
were only for our convenience and they can actually be omitted. This gives us a very
simple rule of differentiation for integral functions, though a rule with a profound
impact.

THEOREM 5 Fundamental Theorem of Calculus (Part 1) [FTC1]

Assume that f is continuous on an open interval I containing a point a. Let
Z x
G(x) = f (t) dt.
a

Then G(x) is differentiable at each x ∈ I and

G 0 (x) = f (x).

Equivalently, Z x
d
G (x) =
0
f (t) dt = f (x).
dx a

PROOF
Rx
Assume that G(x) = a f (t)dt and that f is continuous at x0 ∈ I. Let > 0. Then
there exists a δ > 0 so that if 0 < |c − x0 | < δ, then

| f (c) − f (x0 )| < .

Calculus 2 (B. Forrest)2

Section 1.5: The Fundamental Theorem of Calculus (Part 1) 37

Now let 0 < |x − x0 | < δ. Then

R x R x0
G(x) − G(x0 ) f (t)dt − f (t)dt
= a a
x − x0 x − x0
Z x
1
= f (t)dt
x − x0 x0

But then there exists a c between x and x0 with

Z x
1
f (c) = f (t)dt
x − x0 x0

by the Average Value Theorem.

= f (x0 ).

NOTE
If we use Leibniz notation for derivatives, the Fundamental Theorem of
Calculus (Part 1) can be written as
Z x
d
f (t) dt = f (x)
dx a
This equation roughly states that if you first integrate f and then differentiate the
result, you will return back to the original function f .

In the following example, a physical interpretation of the Fundamental Theorem is

presented.

EXAMPLE 10 Assume that a vehicle travels forward along a straight road with a velocity at time t
given by the function v(t). If we fix a starting point at t = 0, then we saw from the
section about Riemann sums that the displacement s(x) up to time t = x is the area
Z x
under the velocity graph. That is, s(x) = v(t) dt.
0

Calculus 2 (B. Forrest)2

Chapter 1: Integration 38

v(t)

x
s(x) =
R
0
v(t) dt

t
0 x

We can assume that velocity is a continuous function of time. Consequently, the

Fundamental Theorem of Calculus applies to the function s(x). Moreover, the
theorem tells us that s(x) is differentiable and that the derivative of displacement is
velocity
s 0 (x) = v(x)
exactly as we would expect!

Z x
2
EXAMPLE 11 (a) Find F (x) if F(x) =
0
et dt.
3

Since f (t) = e is a continuous function, the Fundamental Theorem of Calculus

applies. Therefore to find F 0 (x) we can simply replace t by x in f (t) to get

2
F 0 (x) = e x .

Z x2
2
(b) Let’s modify the previous question. Let G(x) = et dt. Find G 0 (x).
3

This is not quite the same as the previous example. In fact, in order to find G 0 (x) we
note that
G(x) = F(x2 )
Z x
2
where F(x) = et dt. But this means we can use the Chain Rule to get that
3

d 2
G 0 (x) = F 0 (x2 ) (x ).
dx
2
But to find F 0 (x2 ) we replace t by x2 in et . That is
2 2 4
F 0 (x2 ) = e(x ) = e x
Z x2
2
and d
dx
(x2 ) = 2x. It follows that if G(x) = et dt, then
3

4
G 0 (x) = 2xe x .

Calculus 2 (B. Forrest)2

Section 1.5: The Fundamental Theorem of Calculus (Part 1) 39

R xsolve this problem. First let u = x since the goal is

2
There is an alternate method to
to get an integral in the form a . Then
Z x2
d 2
G (x) =
0
et dt
dx 3
Z u
d 2
= et dt (substituting u = x2 )
dx 3
Z u !
d t2 du
= e dt (by the Chain Rule)
du 3 dx
2 du
= eu (by FTCI)
dx
4
= e x · 2x

which is the same answer we calculated by the previous method.

In the statement of the Fundamental Theorem of Calculus, the lower limit of the
integral was always fixed. That is, it did not vary with x. We can now make our
example even more complicated by letting the lower limit of the integral vary as a
function of x. Let Z 2 x
2
H(x) = et dt.
cos(x)
0
How would we find H (x)?
We can cleverly use the properties of the integral. In fact, we can write
Z x2 Z 3 Z x2
t2 t2 2
H(x) = e dt = e dt + et dt.
cos(x) cos(x) 3

Furthermore, we know that

Z 3 Z cos(x)
t2 2
e dt = − et dt
cos(x) 3

and this integral is in the form where we can use the Fundamental Theorem.
Therefore, we have that
Z x2 Z x2 Z cos(x)
t2 t2 2
H(x) = e dt = e dt − et dt.
cos(x) 3 3

If we now let Z cos(x)

2
H1 (x) = et dt
3
then
H(x) = G(x) − H1 (x)
where G(x) is defined as before. But then

H 0 (x) = G 0 (x) − H1 0 (x)

Calculus 2 (B. Forrest)2

Chapter 1: Integration 40

and we already know that

4
G 0 (x) = 2xe x .
This means that we only need to find H1 0 (x). To accomplish this we do exactly what
we did to find G0 (x). We note that

H1 (x) = F(cos(x))

so
d
H1 0 (x) = F 0 (cos(x)) (cos(x))
dx
2
= − sin(x)e(cos(x))

Combining all of this together gives us that

H 0 (x) = G 0 (x) − H1 0 (x)

4 2
= 2xe x − (− sin(x)e(cos(x)) )
4 2
= 2xe x + sin(x)e(cos(x))

The previous example leads us to an extended version of the Fundamental Theorem

of Calculus.

THEOREM 6 Extended Version of the Fundamental Theorem of Calculus

Assume that f is continuous and that g and h are differentiable. Let
Z h(x)
H(x) = f (t) dt.
g(x)

Then H(x) is differentiable and

H 0 (x) = f (h(x))h 0 (x) − f (g(x))g 0 (x).

1.6 The Fundamental Theorem of Calculus (Part 2)

We have seen that the Fundamental Theorem of Calculus provides us with a simple
rule for differentiating integral functions and so it provides the key link between
differential and integral calculus. However, we will soon see it also provides us with
a powerful tool for evaluating integrals. First we must briefly review the topic of
antiderivatives from your study of differential calculus.

Calculus 2 (B. Forrest)2

Section 1.6: The Fundamental Theorem of Calculus (Part 2) 41

1.6.1 Antiderivatives

We know a number of techniques for calculating derivatives. In this section, we will

review how we can sometimes “undo” differentiation. That is, given a function f ,
we will look for a new function F with the property that F 0 (x) = f (x).

DEFINITION Antiderivative
Given a function f , an antiderivative is a function F such that

F 0 (x) = f (x).

If F 0 (x) = f (x) for all x in an interval I, we say that F is an antiderivative for f on I.

x4
EXAMPLE 12 Let f (x) = x3 . Let F(x) = 4
. Then

4x4−1
F 0 (x) = = x3 = f (x),
4
x4
so F(x) = 4
is an antiderivative of f (x) = x3 .

While the derivative of a function is always unique, this is not true of antiderivatives.
4
In the previous example, if we let G(x) = x4 + 2, then G 0 (x) = x3 . Therefore, both
4 4
F(x) = x4 and G(x) = x4 + 2 are antiderivatives of the same function f (x) = x3 .
This holds in greater generality: if F is an antiderivative of a given function f , then
so is G(x) = F(x) + C for every C ∈ R. A question naturally arises–are these all of
the antiderivatives of f ?
To answer this question, we appeal to the Mean Value Theorem. Assume that F and
G are both antiderivatives of a given function f . Let

H(x) = G(x) − F(x).

Then

H 0 (x) = G 0 (x) − F 0 (x)

= f (x) − f (x)
= 0

for every x.
The Mean Value Theorem showed that there exists a constant C such that

H(x) = G(x) − F(x) = C.

But this means that

G(x) = F(x) + C.

Calculus 2 (B. Forrest)2

Chapter 1: Integration 42

It follows that once we have one antiderivative F of a function f , we can find all of
the antiderivatives by considering all functions of the form

G(x) = F(x) + C.

EXAMPLE 13 Let f (x) = x3 . Find all of the antiderivatives of f .

4
We have already seen that F(x) = x4 is an antiderivative of x3 . It follows that the
family of all antiderivatives consists of functions of the form

x4
G(x) = +C
4
for C ∈ R.

Notation: We will denote the family of antiderivatives of a function f by

Z
f (x) dx.

For example,
x4
Z
x3 dx = + C.
4

The symbol Z
f (x) dx

is called the indefinite integral of f . The function f is called the integrand.

We will be content to find the antiderivatives of many of the basic functions that we
will use in this course. The next theorem tells us how to find the antiderivatives of
one of the most important classes of functions, the powers of x.

THEOREM 7 Power Rule for Antiderivatives

If α , −1, then

xα+1
Z
xα dx = + C.
α+1

To see that this theorem is correct we need only differentiate. Since

d xα+1
( + C) = xα ,
dx α + 1
we have found all of the antiderivatives.
The following table lists the antiderivatives of several basic functions. You can use
differentiation to verify each antiderivative.

Calculus 2 (B. Forrest)2

Section 1.6: The Fundamental Theorem of Calculus (Part 2) 43

Integrand Antiderivative

xn+1
f (x) = xn xn dx = +C
R
where n , −1
n+1
1 R 1
f (x) = dx = ln(| x |) + C
x x
f (x) = e x e x dx = e x + C
R

f (x) = sin(x) sin(x) dx = − cos(x) + C

f (x) = cos(x) cos(x) dx = sin(x) + C

f (x) = sec2 (x) sec2 (x) dx = tan(x) + C

1 1
f (x) = dx = arctan(x) + C
R
1 + x2 1 + x2
1 1
f (x) = √ dx = arcsin(x) + C
R
√
1 − x2 1 − x2
−1 −1
f (x) = √ dx = arccos(x) + C
R
√
1 − x2 1 − x2
f (x) = sec(x) tan(x) sec(x) tan(x) dx = sec(x) + C
R

ax
f (x) = a x where a > 0 and a , 1 a x dx = +C
R
ln(a)

1.6.2 Evaluating Definite Integrals

Z 2
Suppose that we want to evaluate t3 dt. At this point we would have to resort to
0
using Riemann sums, a process that we have seen is very tedious and is best avoided
if possible. Instead let’s define
Z x
G(x) = t3 dt
0

and note that Z 2

G(2) = t3 dt.
0

Why does this help us? The Fundamental Theorem of Calculus shows that

G 0 (x) = x3 .

That is, G(x) is an antiderivative of x3 . However, we know from the Mean Value
Theorem and by the power rule for antiderivatives that if F is any antiderivative of
x3 then
x4
F(x) = +C
4
Calculus 2 (B. Forrest)2
Chapter 1: Integration 44

where C is some unknown constant. This means that

Z x
x4
G(x) = t3 dt = + C1
0 4
for some constant C1 . If we knew C1 we would be done.
To determine C1 we know that
x
x4
Z
G(x) = t3 dt = + C1
0 4
and
0
04
Z
0= t3 dt = G(0) = + C1 = C1
0 4
so x
x4
Z
G(x) = t3 dt = .
0 4
Finally,
2
24
Z
t3 dt = G(2) = = 4.
0 4
Question: Did we really need to find C1 ?
To answer this question we will make the following very important observation.
Key Observation: Let F and G be any two antiderivatives of the same function f .
Then
G(x) = F(x) + C

Let a, b ∈ R. Then
G(b) − G(a) = (F(b) + C) − (F(a) + C)
= F(b) − F(a)

Assume that f is continuous. We want to calculate

Z b
f (t) dt.
a

Let Z x
G(x) = f (t) dt.
a

Then the Fundamental Theorem of Calculus (Part1) shows that G is an

antiderivative of f . Moreover, if F is any other antiderivative of f , then
Z b
f (t) dt = G(b)
a
Z a
= G(b) − G(a) (since G(a) = f (t) dt = 0)
a
= F(b) − F(a)

Calculus 2 (B. Forrest)2

Section 1.6: The Fundamental Theorem of Calculus (Part 2) 45

For example, to evaluate Z 2

t3 dt,
0
we know that
x4
F(x) =
4
is an antiderivative for f (x) = x . It follows from the previous observation that
3

Z 2
t3 dt = F(2) − F(0)
0

24 04
= −
4 4

= 4

This example shows us how we can now use antiderivatives to help us evaluate an
integral, which is further evidence that the two branches of calculus–differentiation
and integration–are intimately linked. Moreover, the observation we have just made
establishes a procedure for evaluating definite integrals that works in general
because of the Fundamental Theorem of Calculus (Part 1). This is summarized in
the following theorem. Because this procedure is essentially a consequence of the
Fundamental Theorem of Calculus (Part 1), this result is called the Fundamental
Theorem of Calculus (Part 2).

THEOREM 8 Fundamental Theorem of Calculus (Part 2) [FTC2]

Assume that f is continuous and that F is any antiderivative of f .
Then Z b
f (t)dt = F(b) − F(a).
a

Going forward, it is no longer necessary to use Riemann sums to calculate integrals;

we can use the Fundamental Theorem of Calculus (Part 2) instead.
We will now introduce the following notation to use in evaluating integrals. We
write b
F(x) = F(b) − F(a)
a
to indicate that the value of the antiderivative F evaluated at b minus the value of the
antiderivative F evaluated at a.

Calculus 2 (B. Forrest)2

Chapter 1: Integration 46

Z π
EXAMPLE 14 Evaluate sin(t) dt.
0

This is the area of the region R1 under the graph of sin(t) between t = 0 and t = π.
The value for the area is not a number that we can guess since the region is not a
familiar shape.

1 f (t) = sin(t)

R1
t
0 π

−1

However, f (t) = sin(t) is continuous and F(t) = − cos(t) is an antiderivative of f .

The Fundamental Theorem of Calculus II tells us that
Z π
sin(t) dt = F(π) − F(0)
0
= (− cos(π)) − (− cos(0))
= −(−1) − (−1)
= 1+1
= 2

Z π
Next let’s evaluate sin(t)dt.
−π

1 f (t) = sin(t)

R1
t
−π 0 π
R2
−1

Using a geometric argument, the value of this integral should be the area of region
R1 minus the area of region R2 . But since sin(x) is an odd function, the symmetry of
the graph shows that R1 and R2 should have the same area. This means the integral
should be 0. To confirm this result we can again use the Fundamental Theorem of
Calculus to get

Calculus 2 (B. Forrest)2

Section 1.7: Change of Variables 47

Z π
sin(t) dt = (− cos(t)) |π−π
−π
= (− cos(π)) − (− cos(−π))
= (−(−1)) − (−(−1))
= 1−1
= 0

as expected.

Before we end this section, it is important that we emphasize the difference between
the meaning of Z b Z
f (t) dt and f (t)dt
a
The first expression, Z b
f (t) dt
a
is called a definite integral. It represents a number that is defined as a limit of
Riemann sums.
The second expression, Z
f (t)dt

is called an indefinite integral. It represents the family of all functions that are
antiderivatives of the given function f .
The use of similar notation for these very distinct objects is a direct consequence of
the Fundamental Theorem of Calculus.

1.7 Change of Variables

While the Fundamental Theorem of Calculus is a very powerful tool for evaluating
definite integrals, the ability to use this tool is limited by our ability to identify
antiderivatives.
Finding antiderivatives is essentially “undoing differentiation.” While we have
antiderivative rules for polynomials and for some of the trigonometric and
exponential functions, unfortunately it is generally much more difficult to find
antiderivatives than it is to differentiate. For example, it is actually possible to prove
(using sophisticated algebra that is well beyond this course) that the function
2
f (x) = e x

does not have an antiderivative that we can state in terms of any functions with
which we are familiar. This is a serious flaw in our process since, for example,

Calculus 2 (B. Forrest)2

Chapter 1: Integration 48

integrals involving such functions are required for statistical analysis. However, in
the next section a method is presented that can undo the most complex rule of
differentiation—the Chain Rule. By using this technique, you will be able to
evaluate many more types of integrals.

1.7.1 Change of Variables for the Indefinite Integral

Assume that we have two functions f (u) and h(u) with

h 0 (u) = f (u).

Then using the notation of antiderivatives we have

Z
f (u) du = h(u) + C.

Now let u = g(x) be a function of x. The Chain Rule says that if H(x) = h(g(x)) then

H 0 (x) = h 0 (g(x))g 0 (x)

= f (g(x))g 0 (x) (since h 0 (u) = f (u))

Integrating both sides we get

Z
f (g(x))g 0 (x) dx = H(x) + C
= h(g(x)) + C (since H(x) = h(g(x)))

However, this shows that

Z
f (g(x))g 0 (x) dx = h(g(x)) + C

= h(u) +C
u=g(x)
Z
= f (u) du
u=g(x)

where the symbol h(u) means replace u by g(x) in the formula for h(u) and the
u=g(x)
R
symbol f (u) du means replace u by g(x) once the antiderivative has been
u=g(x)
found.
Let’s see how this works in practice.

Z
2
EXAMPLE 15 Evaluate 2xe x dx.

In this case, note that if we let u = g(x) = x2 , then g 0 (x) = 2x. If we also let
f (u) = eu , then

Calculus 2 (B. Forrest)2

Section 1.7: Change of Variables 49

x 2
2x e dx = f(g(x)) g (x) dx

We get that
Z Z
x2
2xe dx = f (g(x))g 0 (x) dx
Z
= f (u) du
u=g(x)
Z
= eu du 2
u=x

= eu 2 + C
u=x

2
= ex + C

To verify that Z
2 2
2xe x dx = e x + C

we can check the answer by differentiating. Using the Chain Rule, we see that
d x2 2
(e + C) = 2xe x
dx
which is the integrand in the original question, exactly as we expected.

The method just outlined is called Change of Variables. It is often also called
Substitution, since we “substitute g(x) for u.”
There is a notational trick that can help you to remember the process. Start with
Z
f (g(x))g 0 (x) dx.

We want to make the substitution

u = g(x).
Differentiating both sides gives us
du
= g 0 (x).
dx
If we treat du and dx as if they were “numbers”, then
du = g 0 (x) dx.
We can now substitute u for g(x) and du for g 0 (x)dx to get

Calculus 2 (B. Forrest)2

Chapter 1: Integration 50

f(g(x)) g (x) dx = f(u) du

u = g(x)

which is the Change of Variables formula.

It is important to note that the expression

du = g 0 (x) dx

does not really have any mathematical meaning. None the less, this trick works and
it is how integrals are actually computed in practice.

Z
2x
EXAMPLE 16 Evaluate dx by making the substitution u = 1 + x2 .
1+x 2

If
u = 1 + x2 ,
then
du = 2x dx.
Substituting u = 1 + x2 and du = 2x dx into the original integral gives us
Z Z
2x 1
dx = du
1 + x2 u u=1+x2

but Z
1
du = ln(| u |) + C.
u
Hence
Z Z
2x 1
dx = du
1 + x2 u u=1+x2

= ln(| u |) +C
u=1+x2

= ln(| 1 + x2 |) + C

= ln(1 + x2 ) + C

where the last equality holds since 1 + x2 > 0.

Until you become comfortable with this technique it is always a good idea to check
your answer by differentiating. (In this case, the answer is f (x) = ln(1 + x2 ) + C.
Differentiating we get f 0 (x) = (1+x1
2 ) dx (1 + x ) = (1+x2 ) which is the original
d 2 2x

integrand for the question . . . so the answer is correct!)

Calculus 2 (B. Forrest)2

Section 1.7: Change of Variables 51

Z
EXAMPLE 17 Evaluate x cos(x2 ) dx.

In this example, using the substitution u = x2 causes us to be out by a constant

factor. However, this will not be a problem by proceeding as follows.
The method of substitution suggests letting u = x2 and f (u) = cos(u). In this case
we have
du = 2x dx (∗)
However, it appears that we really wanted

du = x dx.

We can rearrange equation (∗) to get that

1
du = x dx.
2
Then substituting 12 du for x dx and u for x2 we get
Z Z
1
x cos(x ) dx =
2
cos(u) du 2
2 u=x
Z
1
= cos(u) du 2
2 u=x

1
= sin(u) 2 + C
2 u=x

1
= sin(x2 ) + C
2

You should check that this answer is correct by differentiating 21 sin(x2 ) + C to

ensure it equals the integrand.

The next unusual substitution allows us to evaluate an important trigonometric

integral.
Z
EXAMPLE 18 Evaluate sec(θ) dθ.

At first glance this does not appear to be an integral that can be evaluated by using
substitution. However, we can make the following clever observation:

sec(θ) + tan(θ)
!
sec(θ) = sec(θ)
sec(θ) + tan(θ)
sec2 (θ) + sec(θ) tan(θ)
=
sec(θ) + tan(θ)

Calculus 2 (B. Forrest)2

Chapter 1: Integration 52

Let u = sec(θ) + tan(θ). Then

du = sec(θ) tan(θ) + sec2 (θ) dθ.

It follows that
sec2 (θ) + sec(θ) tan(θ)
Z Z
sec(θ) dθ = dθ
sec(θ) + tan(θ)
Z
du
=
u

The antiderivative rules give us that

Z
du
= ln(|u|) + C.
u

Letting u = sec(θ) + tan(θ), we get

Z
sec(θ) dθ = ln(| sec(θ) + tan(θ)|) + C.

1.7.2 Change of Variables for the Definite Integral

We can also use the Change of Variables technique for definite integrals. However,
for the case of definite integrals, we must be careful with the limits of integration.
Suppose that we want to evaluate
Z b
f (g(x))g 0 (x) dx
a

where f and g 0 are continuous functions. We have just seen that if h(u) is an
antiderivative of f (u), then
H(x) = h(g(x))
is an antiderivative of
f (g(x))g 0 (x).
This means that we can apply the Fundamental Theorem of Calculus to get

Z b
f (g(x))g 0 (x) dx = H(b) − H(a)
a
= h(g(b)) − h(g(a))

Calculus 2 (B. Forrest)2

Section 1.7: Change of Variables 53

However, since h 0 (u) = f (u), the Fundamental Theorem of Calculus also shows us
that Z g(b)
f (u) du = h(g(b)) − h(g(a)).
g(a)

Combining these two results shows that

Z b Z g(b)
f (g(x))g (x) dx =
0
f (u) du.
a g(a)

THEOREM 9 Change of Variables

Assume that g0 (x) is continuous on [a, b] and f (u) is continuous on g([a, b]),
then

Z x=b Z u=g(b)
f (g(x))g (x) dx =
0
f (u) du.
x=a u=g(a)

Let’s see what this theorem implies by looking at some examples. Notice that you
must give special attention to the limits of integration.

Z 4
EXAMPLE 19 Evaluate (5x − 6)3 dx.
2

Let u = g(x) = 5x − 6. Then f (u) = u3 . Since

du = g 0 (x) dx = 5 dx,

we have
1
du = dx.
5
The Change of Variables Theorem shows us that

Z 4 Z u=g(b)
(5x − 6) dx =3
f (u) du
2 u=g(a)
Z u=g(4)
1
= u3 du
u=g(2) 5

Now since g(a) = g(2) = 5(2) − 6 = 4 and g(b) = g(4) = 5(4) − 6 = 14 we have

Calculus 2 (B. Forrest)2

Chapter 1: Integration 54

Z 14
1
= u3 du
5 4
!14
1 1 4
= u
5 4 4
1
= (144 − 44 )
20

= 1908

Z 1
x dx
EXAMPLE 20 Evaluate √ .
0 x2 + 1
Let u = g(x) = x2 + 1. Then

du = g 0 (x) dx = 2x dx,

so
1
du = x dx.
2
We also have,
1 1
f (u) = √ = u− 2 .
u
The Change of Variables Theorem shows us that
Z 1 Z u=g(b)
x dx
√ = f (u) du
0 x2 + 1 u=g(a)
Z u=g(1)
1 1
= (u− 2 ) ( du)
u=g(0) 2

Now since g(a) = g(0) = 02 + 1 = 1 and g(b) = g(1) = 12 + 1 = 2 we have

1 2 − 12
Z
= u du
2 1
1 12 2
= 2u
2 1

1 1
= 22 − 12
√
= 2−1
0.41

Calculus 2 (B. Forrest)2

Section 1.7: Change of Variables 55

Z π
4
EXAMPLE 21 Evaluate −2 cos2 (2x) sin(2x) dx.
0

Let u = g(x) = cos(2x) and f (u) = u2 . Then since

du = g 0 (x) dx = −2 sin(2x) dx

the Change of Variables Theorem shows us that

π
Z 4
Z u=g(b)
−2 cos (2x) sin(2x) dx =
2
f (u) du
0 u=g(a)
Z cos(2( π4 ))
= u2 du
cos(2(0))
Z cos( π2 )
= u2 du
cos(0)
Z 0
= u2 du
1
0
u3
=

3

1

03 13
= −
3 3
−1
=
3

Calculus 2 (B. Forrest)2

Chapter 2

Techniques of Integration

The Change of Variables Theorem gave us a method for evaluating integrals when
the antiderivative of the integrand was not obvious. The underlying method involved
“substitution” of one variable for another. In this chapter, we continue using the
Change of Variables Theorem to calculate integrals. However, the substitutions will
involve trigonometric functions.

2.1 Inverse Trigonometric Substitutions

From the geometric interpretation

of the integral we have seen that

π
Z 1√
1 − x2 dx = . 1 √
−1 2 f (x) = 1 − x2

However, we did not explicitly

calculate this integral because we
did not know many integration
techniques. Instead, we√found the
area by deducing that 1 − x2 0 x
was the top half of a circle with −1 1
radius 1 and using the high
school formula for the area of a
circle, area = πr2 .
We will now explicitly calculate this integral using a trigonometric substitution.

EXAMPLE 1 Evaluate
1 √ π
Z
1 − x2 dx = .
−1 2

We will require the use of the Pythagorean Identity for trigonometric functions to
evaluate this integral:
sin2 (x) + cos2 (x) = 1.
Section 2.1: Inverse Trigonometric Substitutions 57

Let
x = sin(u)
−π π
for ≤u≤ .
2 2
This might seem like an unusual suggestion since the substitution rule usually asks
us to make a substitution of the form

u = g(x).

However, we have actually done this! To see why this is the case, recall that on the
interval [ −π , π ], the function x = sin(u) has a unique inverse given by u = arcsin(x).
2 2
In fact, we are really making the substitution u = arcsin(x) and this method is called
inverse trigonometric substitution.
Using the substitution x = sin(u), the integrand
√
1 − x2

becomes
q p
1 − sin2 (u) = cos2 (u)
= | cos(u) |
= cos(u)

where the last equality holds since cos(u) ≥ 0 on the interval [ −π , π ].

2 2

We can differentiate
x = sin(u)
to get
dx = cos(u) du.

This means that

Z 1 √ Z arcsin(1)
1− x2 dx = cos(u) cos(u) du
−1 arcsin(−1)
Z π
2
= cos2 (u) du
− π2

Note: Since u = g(x) = arcsin(x), we have that the new limits of integration are
u = g(−1) = arcsin(−1) = − π2 and u = g(1) = arcsin(1) = π2 .
To finish the integral calculation we need to use the following trigonometric identity:

1 + cos(2u)
cos2 (u) = .
2

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 58

This means that

π
Z 1 √ Z 2
1− x2 dx = cos2 (u) du
−1 − π2
π
1 + cos(2u)
Z 2
= du
− π2 2
Z π2 Z π2
1 cos(2u)
= du + du
− π2 2 −π
2
2
Z π2 Z π
1 1 2
= 1 du + cos(2u) du
2 − π2 2 − π2

The first integral is

Z π π
1 2 1 2
1 du = u
2 − π2 2 − π2
1 π π
= − −
2 2 2
π
=
2

To evaluate the second integral

Z π
1 2
cos(2u) du
2 − π2

use the substitution

v = 2u
with
dv = 2 du.
If u = π2 , then v = π and if u = − π2 , then v = −π. We can apply the Change of
Variable Formula to get that

π
Z Z π
1 2 1 dv
cos(2u) du = cos(v)
2 − π2 2 −π 2
Z π
1
= cos(v) dv
4 −π
π
1
= sin(v)
4 −π

1
= (sin(π) − sin(−π))
4

Calculus 2 (B. Forrest)2

Section 2.1: Inverse Trigonometric Substitutions 59

1
= (0 − 0)
4

= 0

We have then shown that

Z 1√ Z π
2
1 − x dx =
2 cos2 (u) du
−1 − π2
Z π Z π
1 2 1 2
= 1 du + cos(2u) du
2 − π2 2 − π2
π
= +0
2
π
=
2
which agrees with our previous argument.

REMARK
In general, there are three main classes of inverse trigonometric substitutions. The
first class are integrals with integrands of the form
√
a2 − b2 x2 .
The substitution, based on the pythagorean identity sin2 (x) + cos2 (x) = 1, is
bx = a sin(u).
The previous example demonstrated this class.
The second class of trigonometric substitution covers integrands of the form
√
a2 + b2 x2 .
The substitution is based on the identity sec2 (x) − 1 = tan2 (x) and is given by
bx = a tan(u).

The third type of substitution covers integrands of the form

√
b2 x2 − a2 .
The substitution is again based on the identity sec2 (x) − 1 = tan2 (x) and is given by
bx = a sec(u).

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 60

The next example illustrates the third class of trigonometric substitution.

Z 3
√
4x2 − 9
EXAMPLE 2 Evaluate √ dx.
3 x
√
Consider the expression 4x2 − 9. We could try to substitute u = 4x2 − 9 giving us
du = 8x dx or dx = du
8x
. This would have worked if the integral had been
Z 3 √
2
√ x 4x − 9 dx
3

because then the x’s would cancel. (You should verify this statement.) However, the
substitution u = 4x2 − 9 does not help in this example.
√
Since the numerator of the integrand takes the form b2 x2 − a2 where b = 2 and
a = 3, the correct substitution is
2x = 3 sec(u)
or !
2x
u = arcsec
3
where 0 ≤ u < π2 .
Since 4x2 = (2x)2 , we have (3 sec(u))2 = 9 sec2 (u). Therefore,
√ p
4x2 − 9 = 9 sec2 (u) − 9
p
= 9(sec2 (u) − 1)
p
= 3 sec2 (u) − 1
p
= 3 tan2 (u)
= 3 | tan(u) |
= 3 tan(u)

with the last equality holding since if 0 ≤ u < π2 , then tan(u) ≥ 0.

Since 2x = 3 sec(u) we also have that
2dx = 3 sec(u) tan(u) du
or
3
dx = sec(u) tan(u) du.
2
√
Next, we must find the new limits of integration for this integral. When x = 3, we
have
√
2 3
sec(u) =
3
2
= √
3

Calculus 2 (B. Forrest)2

Section 2.1: Inverse Trigonometric Substitutions 61

This means that

1
cos(u) =
sec(u)
√
3
=
2
√
π
The only angle u with 0 ≤ u < 2
and cos(u) = 2
3
, is u = π6 . (Verify this fact by
considering the Unit Circle.)
Similarly, if x = 3, then sec(u) = 2 so cos(u) = 21 . This means that u = π3 .
Finally, we must also substitute for the x in the denominator of the integrand

√
4x2 − 9
x

since it has not cancelled in the substitution. However, we know that x = 23 sec(u).
Combining everything we know gives us

Z 3
√ Z π3 !
4x2 − 9 3 tan(u) 3
√ dx = 3
sec(u) tan(u) du
3 x π
6 2
sec(u) 2
Z π3
= 3 tan2 (u) du
π
6
Z π
3
= 3 (sec2 (u) − 1) du
π
6
π
= 3(tan(u) − u) π
3

6
π π π π
= 3 tan − − 3 tan −
3 3 6 6
√ π
!
3
= 3 3−π − √ −
3 2
√ π
= 2 3−
2

In general, when presented with an integral that you are unsure of how to solve, be
aware of the following classes of trigonometric substitutions to see if they can help
you evaluate the integral.

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 62

Summary of Inverse Trigonometric Substitutions

Class of Integrand Integral Trig Substitution Trig Identity

√ R √
a2 − b2 x2 a2 − b2 x2 dx bx = a sin(u) sin2 (x) + cos2 (x) = 1

√ R √
a2 + b2 x2 a2 + b2 x2 dx bx = a tan(u) sec2 (x) − 1 = tan2 (x)

√ R √
b2 x2 − a2 b2 x2 − a2 dx bx = a sec(u) sec2 (x) − 1 = tan2 (x)

2.2 Integration by Parts

Suppose that we want to calculate

Z
x sin(x) dx.

There is no obvious substitution that will help. Fortunately, there is another method
that will work for this integral called Integration by Parts.
While the method of integration by substitution was based on trying to undo the
Chain Rule, Integration by Parts is derived from the Product Rule. If f and g are
differentiable, then the Product Rule states that
d
( f (x)g(x)) = f 0 (x)g(x) + f (x)g 0 (x).
dx
Since the antiderivative of a derivative is just the original function up to a constant,
we have
Z
d
f (x)g(x) = ( f (x)g(x)) dx
dx
Z
= ( f 0 (x)g(x) + f (x)g 0 (x)) dx
Z Z
= f (x)g(x) dx +
0
f (x)g 0 (x) dx

Rearranging this equation leads to the following formula:

DEFINITION The Integration by Parts Formula

Z Z
f (x)g (x) dx = f (x)g(x) −
0
f 0 (x)g(x) dx.

Calculus 2 (B. Forrest)2

Section 2.2: Integration by Parts 63

R
EXAMPLE 3 Use integration by parts to evaluate x sin(x) dx.
The
R task is to choose the functions
R f and g 0 in such a way that the integral
x sin(x) dx has the form f (x)g 0 (x) dx and the expression
Z
f (x)g(x) − f 0 (x)g(x) dx

can be easily evaluated. The key is to view x sin(x) as a product of the functions x
and sin(x) and to note that differentiating x produces the constant 1. This will leave
us with only a simple trigonometric function to integrate. Therefore, we let f (x) = x
and let g 0 (x) = sin(x).
The next step is to determine f 0 and g.
Since f (x) = x, we have that f 0 (x) = 1. We can choose any antiderivative of sin(x)
to play the role of g(x) so we choose g(x) = − cos(x). Substituting into the
Integration by Parts Formula
Z Z
f (x)g (x) dx = f (x)g(x) −
0
f 0 (x)g(x) dx

gives Z Z
x sin(x) dx = x(− cos(x)) − (1)(− cos(x)) dx
or Z Z
x sin(x) dx = −x cos(x) + cos(x) dx.

Since Z
cos(x) dx = sin(x) + C

we get Z
x sin(x) dx = −x cos(x) + sin(x) + C.

We can verify this solution by differentiation:

d
(−x cos(x) + sin(x) + C) = − cos(x) + x sin(x) + cos(x) + 0
dx
= x sin(x)

which is the original integrand as we expected.

The next example shows that we might have to combine Integration by Parts with
substitution to calculate an integral.

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 64

R
EXAMPLE 4 Evaluate x cos(2x) dx.

The strategy for this integral is again to use Integration by Parts to eliminate the x
from the integrand so that we are left with a simple trigonometric function to
integrate. Therefore, we let

f (x) = x and g 0 (x) = cos(2x).

We must now find f 0 (x) and g(x). Since f (x) = x, then f 0 (x) = 1. To find g(x), we
evaluate Z
cos(2x) dx.

Let u = 2x, then du = 2 dx so that dx = du

2
. Substitute these into the integral to get
Z Z
du
cos(2x) dx = cos(u)
2
Z
1
= cos(u) du
2
sin(u)
= +C
2
sin(2x)
= +C
2

We can choose any antiderivative of cos(2x) for g(x), so let C = 0 to get

f (x) = x g 0 (x) = cos(2x)

f 0 (x) = 1 g(x) = sin(2x)

Applying the Integration by Parts formula gives

Z Z
x sin(2x) sin(2x)
x cos(2x) dx = − (1) dx
2 2
Z
x sin(2x) 1
= − sin(2x) dx
2 2

sin(2x) dx, we can again use the substitution u = 2x so that du = 2dx

R
To evaluate

Calculus 2 (B. Forrest)2

Section 2.2: Integration by Parts 65

and dx = du
2
. This shows that
Z Z
du
sin(2x) dx = sin(u)
2
Z
1
= sin(u) du
2
− cos(u)
= +C
2
− cos(2x)
= +C
2

Therefore,
Z Z
x sin(2x) 1
x cos(2x) dx = − sin(2x) dx
2 2
!
x sin(2x) 1 − cos(2x)
= − +C
2 2 2
x sin(2x) cos(2x)
= + +C
2 4

NOTE
1
Since C is an arbitray
R constant we did not need to multiply it by 2
when we
substituted for sin(2x) dx in this calculation.

R
EXAMPLE 5 Evaluate x2 e x dx.
Once again there is no obvious substitution so we will try Integration by Parts. We
will use differentiation to eliminate the polynomial x2 so that only a simple
exponential function is left to integrate. However, this time we will need to apply
the process twice.
We begin with

f (x) = x2 g 0 (x) = e x
f 0 (x) = 2x g(x) = e x

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 66

Applying the Integration by Parts formula gives

Z Z
x e dx = x e − (2x)e x dx
2 x 2 x

Z
= x e − 2 xe x dx
2 x

R
We are left to evaluate xe x dx. This integral is again an ideal candidate for
Integration by Parts. Let

f (x) = x g 0 (x) = e x
f 0 (x) = 1 g(x) = e x

to get
Z Z
xe dx = xe −
x x
(1)e x dx
Z
= xe − x
e x dx
= xe x − e x + C

R
We can now substitute for xe x dx to get,
Z Z
x e dx = x e − 2
2 x 2 x
xe x dx

= x2 e x − 2(xe x − e x ) + C
= x2 e x − 2xe x + 2e x + C

REMARK
You might guess from the previous example that the integral
Z
x3 e x dx

could be evaluated with three applications of Integration by Parts since we would

have to differentiate three times to eliminate the polynomial x3 .

The next example illustrates another class of functions that are ideally suited to
Intergation by Parts.

Calculus 2 (B. Forrest)2

Section 2.2: Integration by Parts 67

R
EXAMPLE 6 Evaluate e x sin(x) dx.
This example presents a different type of problem than any of the previous
examples. It is not clear which function should be f and which should be g 0 since
no amount of differentiation will eliminate either e x or sin(x). In this case, we will
simply choose g 0 to be the easiest function to integrate. For this example, this
means that g 0 (x) = e x . Therefore, we have

f (x) = sin(x) g 0 (x) = e x

f 0 (x) = cos(x) g(x) = e x

The Integration by Parts Formula gives

Z Z
e sin(x) dx = e sin(x) − e x cos(x) dx.
x x

This result may appear somewhat discouraging because there is no reason to believe
that the integral Z
e x cos(x) dx
R
is any easier to evaluate than the original integral e x sin(x) dx.
R
The key is to apply the formula again to the integral e x cos(x) dx with

f (x) = cos(x) g 0 (x) = e x

f 0 (x) = − sin(x) g(x) = e x

to get
Z Z
e cos(x) dx = e cos(x) −
x x
e x (− sin(x)) dx
Z
= e cos(x) +
x
e x (sin(x)) dx

R
We seem to be left with having to evaluate e x (sin(x)) dx which is exactly where
we started! However, if we substitute e cos(x) + e (sin(x)) dx for e cos(x) dx
R R
x x x

in the original equation something interesting happens. We get

Z Z
e sin(x) dx = e sin(x) − e x cos(x) dx
x x

Z
= e sin(x) − (e cos(x) + e x (sin(x)) dx)
x x

Z
= e sin(x) − e cos(x) − e x (sin(x)) dx
x x

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 68

R
You will notice that e x (sin(x)) dx appears on both sides of our equation but with
opposite signs. We can treat this expression as some unknown variable
R and then
gather like terms as we would in basic algebra. This means adding e x (sin(x)) dx to
both sides of the expression to get

Z !
2 e (sin(x)) dx = e x sin(x) − e x cos(x).
x

Divide by 2 so that
e x sin(x) − e x cos(x)
Z
e x (sin(x)) dx = .
2

At this point, you might notice that the constant of integration is missing in this
expression yet all general antiderivatives must include a constant. In fact, this is due
to the way that the Integration by Parts formula handles these constants (the
constants are always there implicitly even if they are not explicitly written). We
have identified just one possible antiderivative for the function e x sin(x). To state all
of the antiderivatives we know that we simply add an arbitrary constant so that
e x sin(x) − e x cos(x)
Z
e x (sin(x)) dx = + C.
2

Observation: The method we outlined in the previous example works because of

the cyclic nature of the derivatives for sin(x) and e x . This statement is also true of
both
R cos(x) and e x . Therefore, it should not be surprising to discover that
e x cos(x) dx can be evaluated in the same manner.
Important Note: In summary, the Integration by Parts formula is ideally suited to
evaluating integrals of the following types:
R
• xn cos(x) dx
R
• xn sin(x) dx
R
• xn e x dx
R
• e x cos(x) dx
R
• e x sin(x) dx

However, there are other more unusual examples of integrals that are also suitable
for Integration by Parts.

Calculus 2 (B. Forrest)2

Section 2.2: Integration by Parts 69

EXAMPLE 7 Evaluate Z
arctan(x) dx.

At first glance this integral does not seem to be of the form

Z
f (x)g 0 (x) dx

since there is no product in the integrand. However, the key is to rewrite the
integrand as
arctan(x) = (1) arctan(x).
Since arctan(x) is easy to differentiate and 1 is easily integrated, we can now try
Integration by Parts with f (x) = arctan(x) and g0 (x) = 1. This leads to

f (x) = arctan(x) g 0 (x) = 1

f 0 (x) = 1+x
1
2 g(x) = x

Applying the Integration by Parts formula gives

Z Z !
1
arctan(x) dx = x arctan(x) − x dx
1 + x2
Z
x
= x arctan(x) − dx
1 + x2

2 dx can be handled by substitution. Let u = 1 + x so du = 2xdx

R x
2
The integral 1+x
and dx = du
2x
. Substitution gives
Z Z
x x du
dx =
1 + x2 u 2x
Z
1 1
= du
2 u
ln(| u |)
= +C
2
ln(| 1 + x2 |)
= +C
2
ln(1 + x2 )
= +C
2

since 1 + x2 > 0.

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 70

Returning to the original equation, we have

Z Z
x
arctan(x) dx = x arctan(x) − dx
1 + x2
ln(1 + x2 )
!
= x arctan(x) − +C
2

We can check this answer by differentiating:

ln(1 + x2 )
! ! !
d 1 1 1
x arctan(x) − + C = arctan(x) + x − (2x)
dx 2 1 + x2 2 1 + x2
x x
= arctan(x) + −
1+x 2 1 + x2
= arctan(x)

exactly as expected.

R
EXAMPLE 8 Evaluate ln(x) dx.
Notice

ln(x) = 1 · ln(x)
↑ ↑
g 0 (x) f (x)

This gives

f (x) = ln(x) g 0 (x) = 1

f 0 (x) = 1
x
g(x) = x

Applying the Integration by Parts Formula

Z Z !
1
ln(x) dx = x ln(x) − x dx
x
Z
= x ln(x) − 1 dx
= x ln(x) − x + C

It is important to be cautious when trying to evaluate an integral. For example, the

Calculus 2 (B. Forrest)2

Section 2.3: Partial Fractions 71

integral Z
2
xe x dx
appears to be a candidate for Integration by Parts. However, this is not the case.
Instead, it can be evaluated by using the substitution u = x2 .
The Integration by Parts Formula can also be applied to definite integrals. The
following theorem is a direct consequence of combining the Integration by Parts
formula with the Fundamental Theorem of Calculus.

THEOREM 1 Integration by Parts

Assume that f and g are such that both f 0 and g 0 are continuous on an interval
containing a and b. Then
Z b Z b
f (x)g (x) dx = f (x)g(x)|a −
0 b
f 0 (x)g(x) dx.
a a

R1
EXAMPLE 9 Evaluate 0
xe x dx.
Let
f (x) = x g 0 (x) = e x

f 0 (x) = 1 g(x) = e x

Integration by Parts shows that

Z 1 1 Z 1
xe dx = xe −
x x
e x dx
0 0 0
1
= [e − 0] − e x
0

= e − [e − 1]
= 1

2.3 Partial Fractions

Recall that a rational function is a function of the form

p(x)
f (x) =
q(x)
where p and q are polynomials. In this section, we will discuss a method for
integrating rational functions called Partial Fractions. We will illustrate this method
with an example.

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 72

R 1
EXAMPLE 10 Evaluate x2 −1
dx.
Step 1:
First factor the denominator to get that
1 1
= .
x2 − 1 (x − 1)(x + 1)

Step 2:
Find constants A and B so that
1 A B
= + (*)
(x − 1)(x + 1) x − 1 x + 1

To find A and B, we multiply both sides of the identity (∗) by (x − 1)(x + 1) to get
1 = A(x + 1) + B(x − 1) (**)

The two roots of the denominator were x = 1 and x = −1. If we substitute x = 1 into
equation (**), we have
1 = A(1 + 1) + B(1 − 1)
or
1 = 2A.
Therefore,
1
A= .
2
If we then substitute x = −1 into equation (**), we get
1 = A(−1 + 1) + B(−1 − 1)
or
1 = −2B
and
1
B=− .
2
Using these values of A and B we have
1
1 1 − 12
= = 2
+ .
x2 − 1 (x − 1)(x + 1) x − 1 x + 1

Therefore Z Z Z
1 1 1 1 1
dx = dx − dx.
2
x −1 2 x−1 2 x+1
Recall that Z
1
dx = ln(| x − a |) + C.
x−a

Calculus 2 (B. Forrest)2

Section 2.3: Partial Fractions 73

Hence
Z Z Z
1 1 1 1 1
dx = dx − dx
2
x −1 2 x−1 2 (x + 1)
1 1
= ln(| x − 1 |) − ln(| x + 1 |) + C
2 ! 2
1 | x−1|
= ln +C
2 | x+1|

since ln(b) − ln(a) = ln b
a
.

The method we have just outlined required us to separate the function f (x) = 1
x2 −1
into rational functions with first degree denominators. This is called a partial
fraction decomposition of f .

DEFINITION Type I Partial Fraction Decomposition

Assume that
p(x)
f (x) =
q(x)
where p and q are polynomials such that

1. degree(p(x)) < degree(q(x)) = k,

2. q(x) can be factored into the product of linear terms each with distinct roots.
That is
q(x) = a(x − a1 )(x − a2 )(x − a3 ) · · · (x − ak )
where the ai ’s are unique and none of the ai ’s are roots of p(x).

Then there exists constants A1 , A2 , A3 , · · · , Ak such that

" #
1 A1 A2 A3 Ak
f (x) = + + + ··· +
a x − a1 x − a2 x − a3 x − ak

we say that f admits a Type I Partial Fraction Decomposition.

Key Observation: The existence of the constants A1 , A2 , · · · , Ak follows from

some basic algebra. However, what is important to us is that if the rational function
f (x) = q(x)
p(x)
has a Type I Decomposition, then it is easy to find its integral.

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 74

THEOREM 2 Integration of Type I Partial Fractions

Assume that f (x) = p(x)
q(x)
admits a Type I Partial Fraction Decomposition of the form
" #
1 A1 A2 Ak
f (x) = + + ··· + .
a x − a1 x − a2 x − ak
Then
Z "Z Z Z #
1 A1 A2 Ak
f (x) dx = dx + dx + · · · + dx
a x − a1 x − a2 x − ak
1
= [A1 ln(| x − a1 |) + A2 ln(| x − a2 |)) + · · · + Ak ln(| x − ak |)] + C
a

x+2
Z
EXAMPLE 11 Evaluate dx.
x(x − 1)(x − 3)
In this case, p(x) = x + 2 and q(x) = x(x − 1)(x − 3). Notice that
1 = degree(p(x)) < degree(q(x)) = 3. Since q(x) has the distinct roots 0, 1 and 3, we
have a Type I Decomposition. Therefore, there are constants A, B, and C such that
x+2 A B C
= + + .
x(x − 1)(x − 3) x x−1 x−3

Step 1: Cross-multiply to get

x + 2 = A(x − 1)(x − 3) + B(x)(x − 3) + C(x)(x − 1) (*)

Step 2: To find the constants we will substitute each of the roots into the identity
(*). If x = 0, then

0 + 2 = A(0 − 1)(0 − 3) + B(0)(0 − 3) + C(0)(0 − 1)

or
2 = 3A.

Hence
2
A= .
3
Let x = 1 to get

1 + 2 = A(1 − 1)(1 − 3) + B(1)(1 − 3) + C(1)(1 − 1).

Therefore,
3 = −2B

Calculus 2 (B. Forrest)2

Section 2.3: Partial Fractions 75

so that
3
B=− .
2
Finally, with x = 3

3 + 2 = A(3 − 1)(3 − 3) + B(3)(3 − 3) + C(3)(3 − 1)

so
5 = 6C
and
5
C= .
6
Notice that when we substitute the root a j into identity (*), we get back the
coefficient A j corresponding to this root.
This means that
2 5
x+2 − 32
= +
3
+ 6 .
x(x − 1)(x − 3) x x − 1 x − 3
Therefore
x+2
Z Z Z Z
2 1 3 1 5 1
dx = dx − dx + dx
x(x − 1)(x − 3) 3 x 2 x−1 6 x−3
2 3 5
= ln(| x |) − ln(| x − 1 |) + ln(| x − 3 |) + c
3 2 6

DEFINITION Type II Partial Fraction Decomposition

Assume that
p(x)
f (x) =
q(x)
where p(x) and q(x) are polynomials such that

1. degree(p(x)) < degree(q(x)) = k,

2. q(x) can be factored into the product of linear terms with non-distinct roots.
That is
q(x) = a(x − a1 )m1 (x − a2 )m2 (x − a3 )m3 · · · (x − al )ml
where at least one of the m j ’s is greater than 1.

We say that f admits a Type II Partial Fraction Decomposition.

In this case, the partial fraction decomposition can be built as follows.

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 76

Each expression (x − a j )m j in the factorization of q(x) will contribute m j terms to the

decomposition, one for each power of x − a j from 1 to m j , which when combined
will be of the form
l
p(x) X A j,1 A j,2 A j,3 A j,m j
= + + + ··· + .
q(x) j=1
x − a j (x − a j )2 (x − a j )3 (x − a j )m j

The number m j is called the multiplicity of the root a j .

Z
1
EXAMPLE 12 Evaluate dx.
x2 (x− 1)
In this case, p(x) = 1 and q(x) = x2 (x − 1). The roots of q(x) are 0 and 1 and since
the root 0 has multiplicity 2, this is a Type II Partial Fraction. Therefore, we can find
constants A, B and C such that
1 A B C
= + 2+ .
x2 (x − 1) x x x−1

To find the constants A, B and C, we follow a similar procedure.

Step 1: Cross-multiply to get

1 = Ax(x − 1) + B(x − 1) + Cx2 (*)

Step 2: Substitute the roots x = 0 and x = 1 into identity (*).

If x = 0, then
1 = A(0)(0 − 1) + B(0 − 1) + C(02 )
or
1 = −B
so
B = −1.

Notice that substituting x = 0 only gave us the coefficient of the term with the
highest power of x in the decomposition.
Next, let x = 1. Then

1 = A(1)(1 − 1) + B(1 − 1) + C(12 )

and hence
1 = C.

Step 3: We have not yet found the coefficient A. There are a number of methods we
could use to find A. We could, for example, substitute into the identity (*) any value

Calculus 2 (B. Forrest)2

Section 2.3: Partial Fractions 77

other than 0 and 1 and use the fact that we already know B and C to solve for A. For
example, if we let x = 2, we have

1 = A(2)(2 − 1) + B(2 − 1) + C(22 )

or
1 = 2A + B + 4C.
Substituting B = −1 and C = 1 gives

1 = 2A − 1 + 4

or
−2 = 2A.
Hence
A = −1.

Perhaps an easier method is to compare coefficients. If we expand the expressions

on the right-hand side of identity (*) we get

1 = A(x2 − x) + B(x − 1) + C(x2 ) = (A + C)x2 + (B − A)x − B.

We can rewrite this as

0x2 + 0x + 1 = (A + C)x2 + (B − A)x − B.

Since the two sides must agree for all x, they must both be the same polynomial.
This means that the coefficients must be equal. In particular, the coefficients of x2
must agree so that
0 = A+C
or
A = −C.
Since C = 1, this tells us that
A = −1.

Therefore,
Z Z Z Z
1 A B C
dx = dx + dx + dx
x2 (x − 1) x x2 x−1
Z Z Z
−1 −1 1
= dx + 2
dx + dx
x x x−1
Z Z Z
1 1 1
= − dx − 2
dx + dx
x x x−1
1
= − ln(| x |) + + ln(| x − 1 |) + c
x

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 78

since
Z Z
1
dx = x−2 dx
x2
x−1
= +c
(−1)
1
= − +c
x

Unfortunately, not all polynomials factor over the real numbers into products of
linear terms. For example, the polynomial x2 + 1 cannot be factored any further.
This is an example of an irreducible quadratic. In fact, a quadratic ax2 + bx + c is
irreducible if its discriminant b2 − 4ac < 0.
However, the Fundamental Theorem of Algebra shows that every polynomial q(x)
factors in the form

q(x) = a(p1 (x))m1 (p2 (x))m2 (p3 (x))m3 · · · (pk (x))mk

where each pi (x) is either of the form (x − a) or it is an irreducible quadratic of the

form x2 + bx + c.

DEFINITION Type III Partial Fraction Decomposition

Let f (x) = q(x)
p(x)
be a rational function with degree(p(x)) < degree(q(x)), but q(x)
does not factor into linear terms. We say that f admits a Type III Partial Fraction
Decomposition.
In this case, the partial fraction decomposition can be built as follows:
Suppose that q(x) has an irreducible factor x2 + bx + c with multiplicity m. Then this
factor will contribute terms of the form
B1 x + C1 B2 x + C2 Bm x + Cm
+ 2 + ··· + 2
x + bx + c (x + bx + c)
2 2 (x + bx + c)m
to the decomposition.
The linear terms are handled exactly as they were in the previous cases.

Note: We will not consider the case where m > 1 for some irreducible quadratic in
evaluating integrals of rational functions with Type III Partial Fraction
Decompositions.
The method for finding the constants in a Type III Partical Fraction Decomposition
is very similar to that of the first two types. We illustrate this with an example.

Calculus 2 (B. Forrest)2

Section 2.3: Partial Fractions 79

Z
1
EXAMPLE 13 Evaluate dx.
x3 + x
First observe that
1 1
f (x) = =
+ x x(x + 1)
x3
2

so there will be constants A, B and C such that

1 1 A Bx + C
f (x) = = = + 2 .
x3 + x x(x + 1)
2 x x +1

Step 1: To find the constants, we begin by cross-multiplying to obtain the identity

1 = A(x2 + 1) + (Bx + C)x (*)

Step 2: Substitute x = 0, the only Real root, to find the coefficient A. This gives
1 = A(02 + 1) + (B(0) + C)(0)
or
A = 1.

Step 3: The remaining constants are found by comparing coefficients. Expanding

the identity (*) gives
1 = (A + B)x2 + Cx + A.

Comparing the coefficients of x2 gives

0= A+B
or
B = −A
Since A = 1, we get
B = −1.

Comparing the coefficients of x gives

C = 0.

Therefore,
1 1 x
= − 2
x(x2+ 1) x x + 1
and
Z Z
1 1
dx = dx
x +x
3 + 1)
x(x2
Z Z
1 x
= dx − dx
x x +1
2

Z
x
= ln(| x |) − dx
x +1
2

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 80

To finish the calculation, we use the substitution u = x2 + 1, so du = 2x dx and

dx = du
2x
to get
Z Z
x x du
dx =
x +1
2 u 2x
Z
1 1
= du
2 u
1
= ln(| u |) + c
2
1
= ln(x2 + 1) + c
2

Putting this all together we get that

Z
1 1
dx = ln(| x |) − ln(x2 + 1) + c.
x +x
3 2

NOTE

1. Since C = 0 in the previous example, the integral was straightforward to

evaluate. Unfortunately, this is not the case for most Type III Partial Fractions.
2. All of the Partial Fraction Decompositions required the rational function to
satisfy degree(p(x)) < degree(q(x)). If this is not the case, then we need to
use division of polynomials to find new polynomials r(x) and p1 (x) such that
p(x) p1 (x)
= r(x) +
q(x) q(x)
with degree(p1 (x)) < degree(q(x)).

2.4 Introduction to Improper Integrals

The Riemann Integral was defined for certain bounded functions on closed intervals
[a, b]. In many applications, most notably for statistical and data analysis, we want
to be able to integrate functions over intervals of infinite length. However, at this
point Z ∞
f (x) dx
a

Calculus 2 (B. Forrest)2

Section 2.4: Introduction to Improper Integrals 81

has no meaning.
If f (x) ≥ 0 and a < b, then Z b
f (x) dx
a
can be interpreted geometrically as the area bounded by the graph of y = f (x), the
x-axis and the vertical lines x = a and x = b. We could use this to guide us in
defining the integral over an infinite interval.

For example, for the function

f (x) = x12 , we might want
Z ∞
1
dx 1
1 x2 f (x) =
x2
to represent the area bounded
by the graph of y = f (x), the
x-axis and the vertical line
x = 1. 1

Since this region is unbounded it might seem likely that this area should be infinite.

However, if this was true

then it should be the case that
by choosing b large enough
we should be able to make
1
the area bounded by the f (x) =
graph of y = f (x), the x-axis x2
and the vertical lines x = 1
and x = b at least as large as Area > 2?
2. 1 b

Rb 1
However, this area is given by 1 x2
dx and
Z b Z b
1
dx = x−2 dx
1 x2 1

1 b

= −
x1
1
= − +1
b
1
= 1−
b
< 2

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 82

This shows that no matter how large b is the area bounded by the graph of y = f (x),
the x-axis and the vertical lines x = 1 and x = b will always be less than 2. In fact, it
is always less than 1. This suggests that the original region should have finite area
despite the fact that it is unbounded.
We are now in a position similar to when we first defined the integral. While we
have an intuitive idea of what area means, we do not have a formal definition of area
that applies to such unbounded regions. One method to avoid this problem would be
to define the area of the unbounded region as the limit of the bounded areas as b
goes to ∞. That is Z b
1
Area = lim dx.
b→∞ 1 x2

In particular, we would have

Z b
1
Area = lim dx
b→∞ 1 x2
1
= lim 1 −
b→∞ b

= 1

In this case, we would like to have

Z ∞
1
dx = 1.
1 x2
This motivates the following definition:

DEFINITION Type I Improper Integral

1) Let f be integrable on [a, b] for each a ≤ b. We say that the Type I Improper
Integral Z ∞
f (x) dx
a
converges if Z b
lim f (x) dx
b→∞ a
exists. In this case, we write
Z ∞ Z b
f (x) dx = lim f (x) dx.
a b→∞ a

R∞
Otherwise, we say that a
f (x) dx diverges.

Calculus 2 (B. Forrest)2

Section 2.4: Introduction to Improper Integrals 83

2) Let f be integrable on [b, a] for each b ≤ a. We say that the Type I Improper
Integral Z a
f (x) dx
−∞
converges if Z a
lim f (x) dx
b→−∞ b
exists. In this case, we write
Z a Z a
f (x) dx = lim f (x) dx.
−∞ b→−∞ b

Ra
Otherwise, we say that −∞
f (x) dx diverges.

3) Assume that f is integrable on [a, b] for each a, b ∈ R with a < b . We say that
the Type I Improper Integral
Z ∞
f (x) dx
−∞
Rc R∞
converges if both −∞
f (x) dx and c
f (x) dx converge for some c ∈ R.
In this case, we write
Z ∞ Z c Z ∞
f (x) dx = f (x) dx + f (x) dx
−∞ −∞ c

R∞
Otherwise, we say that −∞
f (x) dx diverges.

Note: In general, we will focus our attention on Type I improper integrals of the
form Z ∞
f (x) dx
a

R∞ 1
EXAMPLE 14 Show that 1 x
dx diverges.

R∞ 1
1
dx represents the area
x
of the region bounded by the 1
f (x) =
graph of f (x) = 1x , the x-axis x
and the vertical line x = 1.

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 84

This unbounded region looks very similar to the region discussed previously.
However, this time
Z ∞ Z b
1 1
dx = lim dx
1 x b→∞ 1 x
b
= lim ln(x)
b→∞ 1

= lim (ln(b) − ln(1))

b→∞

= lim ln(b)
b→∞

= ∞

R∞
This shows that 1 1x dx diverges to ∞ and hence that the area of the region bounded
by the graph of f (x) = 1x , the x-axis and the vertical line x = 1 is also infinite.

R∞ R∞
We have seen that 1 x12 dx converges while 1 1x dx diverges. More generally, we
have the following natural question.
Question: For which p does Z ∞
1
dx
1 xp
converge? In fact, the answer to this question will be crucial to our study of series.
Since we already know what happens if p = 1, we can assume that p , 1.
To answer this question we require the following facts. If α > 0, then
lim bα = ∞
b→∞

and if α < 0, then

lim bα = 0.
b→∞

For any b > 1,

Z b Z b
1
dx = x−p dx
1 xp 1

x−p+1 b
=
−p + 1 1

b1−p 1
= −
1 − p −p + 1
b1−p 1
= +
1− p p−1

Calculus 2 (B. Forrest)2

Section 2.4: Introduction to Improper Integrals 85

If p < 1, then 1 − p > 0. Therefore,

b1−p
lim =∞
b→∞ 1 − p

since the exponent is positive. This means that

b1−p 1
lim + =∞
b→∞ 1 − p p−1
R∞ 1
and hence that 1 xp
dx diverges.
However, if p > 1, then 1 − p < 0. This time since the exponent is negative,

b1−p
lim =0
b→∞ 1 − p

and hence

b1−p 1 1
lim + = .
b→∞ 1 − p p−1 p−1
R∞
Therefore, if p > 1, 1
1
xp
dx converges and
Z ∞
1 1
p
dx = .
1 x p−1

This is summarized in the following important theorem.

THEOREM 3 p-Test for Type I Improper Integrals

The improper integral Z ∞
1
dx
1 xp
converges if and only if p > 1.
If p > 1, then Z ∞
1 1
p
dx = .
1 x p−1

We end this section with one more important example of a convergent improper
integral.

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 86

R∞
EXAMPLE 15 Evaluate 0
e−x dx.
We have
Z ∞ Z b
e dx = lim
−x
e−x dx
0 b→∞ 0
b
= lim −e −x
b→∞ 0

= lim (−e−b + e0 )
b→∞

= lim (−e−b + 1)
b→∞

= 1

since lim −e−b = 0.

b→∞

2.4.1 Properties of Type I Improper Integrals

Since the evaluation of an improper integral results from taking limits at ∞, it makes
sense that improper integrals should inherit many of the properties of these types of
limits.

THEOREM 4 Properties of Type I Improper Integrals

R∞ R∞
Assume that a
f (x) dx and a
g(x) dx both converge.
R∞
1. a
c f (x) dx converges for each c ∈ R and
Z ∞ Z ∞
c f (x) dx = c f (x) dx.
a a
R∞
2. a
( f (x) + g(x)) dx converges and
Z ∞ Z ∞ Z ∞
( f (x) + g(x)) dx = f (x) dx + g(x) dx.
a a a

3. If f (x) ≤ g(x) for all a ≤ x, then

Z ∞ Z ∞
f (x) dx ≤ g(x) dx.
a a
R∞
4. If a < c < ∞, then c
f (x) dx converges and
Z ∞ Z c Z ∞
f (x) dx = f (x) dx + f (x) dx.
a a c

Calculus 2 (B. Forrest)2

Section 2.4: Introduction to Improper Integrals 87

Note: Unfortunately, it is usually not possible to explicitly evaluate an integral

Rb
a
f (x) dx for every b > 1 . Therefore, it may be difficult to apply the definition to
determine if an improper integral converges. For example, it is not obvious how to
evaluate the integral Z b
1
dx.
1 e + x
x 2

However, we do know that

Z b Z b
1 1
dx ≤ dx 1
1 e + x
x 2
1 x
2 f (x) =
x2
for each b ≥ 1 since e x > 0
for each x ≥ 1.
1 1 b
g(x) =
ex + x2

From this we can

immediately conclude that
the area under the graph of
1
g(x) = x between 1
e + x2 f (x) =
x = 1 and x = b should be x2
less than the area under the
1
graph of f (x) = 2 between
x
x = 1 and x = b. 1 1 b
g(x) =
e x + x2
Z ∞
1
We also know that dx is finite and
1 x2
Z b Z ∞
1 1
2
dx < dx
1 x 1 x2
for each b ≥ 1. It follows that for every b ≥ 1
Z b Z ∞
1 1
dx < dx.
1 e + x
x 2 x2
1

1
This suggests that the area of the region under the graph of g(x) = x which
e + x2
lies above the x-axis and to the right of the line x = 1 should be finite.

R ∞ functions g with g(x) > 0 for all x ≥ a, we want to interpret the integral
For
a
g(x) dx as the area of the region bounded by the graph of y = g(x), the x-axis and
R∞
the vertical line x = a. To say that the a g(x) dx converges should be equivalent to

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 88

Z ∞
1
the area being finite. As such we should be able to conclude that dx
1 ex + x2
converges.
Unfortunately, we have not explicitly shown that
Z b
1
lim dx
b→∞ 1 e + x2
x

Z b
1
actually exists because we cannot evaluate dx with any of the techniques
1 e + x
x 2
we have developed.
Z b
1
We can however make the following observation: If we let G(b) = dx,
1 e + x
x 2
then when viewed as a function on the interval [1, ∞), G is increasing and
Z ∞
1
G(b) ≤ dx
1 x2
for all b ∈ [1, ∞). In the next section, we will see that this is enough to show that
Z b
1
lim G(b) = lim dx
b→∞ b→∞ 1 e + x2
x

Z ∞
1
exists and hence that the improper integral dx does in fact converge.
1 e + x
x 2

2.4.2 Comparison Test for Type I Improper Integrals

Suppose that
0 ≤ g(x) ≤ f (x)
R∞
on [a, ∞). Assume also that a f (x) dx converges. Then the area under the graph of
f from x = a to ∞ is finite. But 0 ≤ g(x) ≤ f (x) so the area under the graph of g
from x = a to ∞ should be less than the area Runder the graph of f , and hence it
∞
should also be finite. This should imply that a g(x) dx converges.
R∞
On the other hand, if a g(x) dx diverges, then the area under the graph of g from
x = a to ∞ is infinite. Since f (x) is larger than g(x),R it should be true that area under
∞
the graph of f from x = a to ∞ is infinite. That is, a f (x) dx diverges.
In summary, if
0 ≤ g(x) ≤ f (x)
on [a, ∞) and the integral of the larger function converges, so does the integral of the
smaller function. However, if the smaller function has an integral that diverges to
infinity, so should the larger function. To see why this is the case, we can use an
analogue of the Monotone Convergence Theorem for functions.
Recall that the Monotone Convergence Theorem tells us that a non-decreasing
sequence {an } converges if and only if it is bounded above and that if it does
converge, then lim an = lub({an }). We can now prove a similar result for functions.
n→∞

Calculus 2 (B. Forrest)2

Section 2.4: Introduction to Improper Integrals 89

THEOREM 5 The Monotone Convergence Theorem for Functions

Assume that f is non-decreasing on [a, ∞).

1. If { f (x) | x ∈ [a, ∞)} is bounded above, then lim f (x) exists and
x→∞

lim f (x) = L = lub({ f (x) | x ∈ [a, ∞)}).

x→∞

2. If { f (x) | x ∈ [a, ∞)} is not bounded above, then lim f (x) = ∞.

x→∞

PROOF
The proof of this theorem is very similar to that of the Monotone Convergence
Theorem for sequences.

1. Assume that { f (x) | x ∈ [a, ∞)} is bounded above, and let

L = lub({ f (x) | x ∈ [a, ∞)}).
Let > 0. Then L − is not an upper bound for { f (x) | x ∈ [a, ∞)}. Therefore,
there exists an N ∈ [a, ∞)} so that
L − < f (N) ≤ L.
But if x ≥ N we would have that
L − < f (N) ≤ f (x) ≤ L.
This means that if x ≥ N, then | f (x) − L| < so that lim f (x) = L as claimed.
x→∞

L−
f

N x

2. Assume { f (x) | x ∈ [a, ∞)} is not bounded above. Let M > 0. Since M is not
an upper bound for { f (x) | x ∈ [a, ∞)}, there exists an N ∈ [a, ∞)} so that
M < f (N). But if x ≥ N, we have
M < f (N) ≤ f (x)
which shows that lim f (x) = ∞.
x→∞

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 90

N x

We can now establish one of the most important tools for determining the
convergence or divergence of improper integrals.

THEOREM 6 Comparison Test for Type I Improper Integrals

Assume that 0 ≤ g(x) ≤ f (x) for all x ≥ a and that both f and g are continuous on
[a, ∞).
R∞ R∞
1. If a
f (x) dx converges, then so does a
g(x) dx.
R∞ R∞
2. If a g(x) dx diverges, then so does a f (x) dx.

PROOF
The two statements are logically equivalent (why?). This means that we only have
to prove
R ∞ the first statement and then the second statement follows. As such, assume
that a f (x) dx converges. Next let
Z t
F(t) = f (x) dx.
a

Since f (x) ≥ 0, we have that F is non-decreasing on [a, ∞) and that

Z t Z ∞
lim F(t) = lim f (x) dx = f (x) dx < ∞.
t→∞ t→∞ a a

Let Z t
G(t) = g(x) dx.
a

Calculus 2 (B. Forrest)2

Section 2.4: Introduction to Improper Integrals 91

This time since 0 ≤ g(x) ≤ f (x) we have that G is non-decreasing on [a, ∞) and that
G(t) ≤ F(t) for any t ∈ [a, ∞). But then
Z ∞
G(t) ≤ f (x) dx < ∞
a

for all t ∈ [a, ∞). This shows that {G(t) | t ∈ [a, ∞)} is bounded above. Finally, the
Monotone Convergence Theorem for Functions tells us that
Z t
lim G(t) = lim G(x) dx
t→∞ t→∞ a

exists. This proves statement 1.

R∞ 1
EXAMPLE 16 Show that 1 e x +x2
dx converges.
We have already seen that
1 1
0< <
e x + x2 x2
R∞
for all x ≥ 1 and that 1 x12 dx converges. It follows immediately from the
R∞ 1
Comparison Theorem that 1 ex +x 2 dx also converges.

R∞ 1
EXAMPLE 17 Does √ dx converge or diverge?
1
x+ x
We know that
1 1
0< √ <
x+ x x
R∞
for all x ≥ 1. However, 1 1x dx diverges so the Comparison Test does not apply
since we cannot say anything about the smaller integral if the larger one diverges.
√
The key observation is that x + x ≤ x + x = 2x for x ≥ 1. Therefore,
1 1
0< ≤ √
2x x + x
R∞ 1
for x ≥ 1. Moreover, since 1 x
dx diverges, so does
Z ∞
1
dx.
1 2x
R∞ 1√
This time we can use the Comparison Test to conclude that 1 x+ x
dx diverges.

So far we have dealt almost exclusively with improper integrals involving positive
functions. In particular, the Comparison Test applies to positive functions.

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 92

Moreover the Monotone Convergence Theorem for Functions allows us to establish

the following important fact.

Fact

R ∞f is integrable on [a, b) for every b ≥ a and if f (x) ≥ 0 on [a, ∞), then

If
a
f (x) dx converges if and only if there exists an M such that
Z b
f (x) dx ≤ M
a

for all b > a.

Note: The previous statement is not true without the assumption that f (x) ≥ 0 as
the following example illustrates.

R∞
EXAMPLE 18 Show that 0
cos(x) dx diverges.
By definition,
Z ∞ Z b
cos(x) dx = lim cos(x) dx
0 b→∞ 0
b
= lim sin(x)
b→∞ 0
= lim (sin(b) − sin(0))
b→∞
= lim sin(b)
b→∞

However, sin(b) osculates between −1 and 1 as b → ∞ so that lim sin(b) does not
R∞ b→∞
exist. Therefore, 0 cos(x) dx diverges despite the fact that
Z b
−1 ≤ cos(x) dx = sin(b) ≤ 1
0

for all b > 0.

However, we can still often use our tools for improper integrals of positive functions
to determine convergence of some improper integrals of more general functions. To
do so we introduce the notion of absolute convergence which is an analog of
absolute convergence for sequences.

Calculus 2 (B. Forrest)2

Section 2.4: Introduction to Improper Integrals 93

DEFINITION Absolute Convergence for Type I Improper Integrals

RLet f be integrable on [a, b) for all b ≥ a. We say that the improper integral
∞
a
f (x) dx converges absolutely if
Z ∞
| f (x) | dx
a

converges.

Similar to the case for sequences we will now see that absolute convergence implies
convergence for improper integrals.

THEOREM 7 Absolute Convergence Theorem for Improper Integrals

Let f be integrable on [a, b] for all b > a. Then | f | is also integrable on [a, b] for all
b > a. Moreover, if we assume that
Z ∞
| f (x) | dx
a

converges, then so does Z ∞

f (x) dx.
a

In particular, if 0 ≤ R| f (x)| ≤ g(x) for all x ≥ a, both f and g are integrable on [a, b]
∞
for all b ≥ a, and if a g(x) dx converges, then so does
Z ∞
f (x) dx.
a

PROOF
R∞
Assume that a
| f (x) | dx converges. Then so does
Z ∞
2 | f (x) | dx.
a

We also have that

0 ≤ f (x) + | f (x)| ≤ 2| f (x)|
so by the Comparison Theorem
Z ∞
f (x)+ | f (x) | dx
a

converges. Finally, since

f (x) = [ f (x) + | f (x)|] − | f (x)|

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 94

R∞
we get that a
f (x) dx converges with
Z ∞ Z ∞ Z ∞
f (x) dx = f (x)+ | f (x) | dx − | f (x) | dx.
a a a
R∞
To prove the second statement, assume that a g(x) dx converges. The Comparison
Test shows that Z ∞
| f (x) | dx
a
also converges. We can now apply the first statement to conclude that
Z ∞
f (x) dx
a

converges.

R∞ cos(x)
EXAMPLE 19 Show that 3 x2 +2x+1
dx converges.
We know

cos(x) 1
≤ 2
x + 2x + 1
2 x + 2x + 1
1
≤
x2

for all x ≥ 3.
R∞ 1
The p-Test shows that 3 x2
dx converges.
Therefore, by the Comparison Test,
Z ∞
cos(x)
dx

3 x2 + 2x + 1
converges.
R∞ cos(x)
The Absolute Convergence Theorem shows that 3 x2 +2x+1
dx converges.

2.4.3 The Gamma Function

An important class of examples of improper integrals arise as values of a function

called the Gamma function.

Calculus 2 (B. Forrest)2

Section 2.4: Introduction to Improper Integrals 95

DEFINITION The Gamma Function

For each x ∈ R, define Z ∞
Γ(x) = t x−1 e−t dt.
0
The function Γ is called the Gamma function.

Observation: In order to properly define the Γ function we should really show that
the improper integral that arises from each choice of x is actually convergent. We
have actually already seen this to be true for x = 1 as the example below reminds us.
Later we will provide strong evidence of why this is so in general, but the
verification of convergence is left as an exercise.

EXAMPLE 20 Calculate Z ∞ Z ∞
Γ(1) = t e dt =
0 −t
e−t dt.
0 0

By definition
Z ∞
Γ(1) = e−t dt
0
Z b
= lim e−t dt
b→∞ 0
b
= lim −e−t
b→∞ 0

= lim (−e−b ) − (−e−0 )

b→∞

= 1

Note: To see why the integrals involved in the definition of the Gamma function
always converge we first note that by modifying the previous example we can show
that for any M > 0, the improper integral
Z ∞
t
e− 2 dt
M

also converges. Next we observe that the Fundamental Log Limit shows that for any
x ∈ R we have that t
lim t x−1 e− 2 = 0.
t→∞

Combining these two observations

R∞ and using the Comparison Test for Improper
x−1 −t
Integrals we can show that 0 t e dt is always convergent.

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 96

Observation: If we apply the Integration by Parts formula we get the following

interesting result:
Z Z
t e dt = −t e + x ·
x −t x −t
t x−1 e−t dt.

It follows that
Z b
Γ(x + 1) = lim t x e−t dt
b→∞ 0
b Z b
= lim −t e + x ·
x −t
t x−1 e−t dt
b→∞ 0 0
Z b
= lim −b e + x · lim
x −b
t x−1 e−t dt
b→∞ b→∞ 0
Z ∞
= 0+x· t x−1 e−t dt
0
= x · Γ(x)

As an immediate application of this observation we can show that for any n ∈ N, we

have that
Γ(n) = (n − 1)!
In fact, we know that Γ(1) = 1 = (1 − 0)!. Now suppose that Γ(k) = (k − 1)!. Then

Γ(k + 1) = k · Γ(k)
= k · (k − 1)!
= k!

so we can deduce that Γ(n) = (n − 1)! by using Mathematical Induction. For this
reason the Γ function is viewed as a means of generating factorial values for
non-natural numbers.
Note: The Γ function also has important applications in statistics and probability
theory.

2.4.4 Type II Improper Integrals

So far in considering improper integrals we have only considered the case where the
interval over which we are integrating is unbounded. There is a second type of
improper integral which we call a Type II Improper Integral. In the case of a Type II
Improper Integral the assumption will be that the integrand f has a vertical
asymptote at some point a ∈ R. To illustrate what we mean by a Type II Improper
Integral we consider the function f (x) = 11 on the interval (0, 1]. We might ask:
x2

Calculus 2 (B. Forrest)2

Section 2.4: Introduction to Improper Integrals 97

Problem: What is the area under the

graph of f above the x-axis and between
the lines x = 0 and x = 1? f (x) =
1
1
We note that f has a vertical asymptote at x2
x = 0 and as a consequence this region
(which we denote by R) is also
unbounded. R

0 1
However, if this was the case, then for any M > 0 we should be able to find a
b ∈ (0.1] so that the area under the graph of f above the x-axis and between the lines
x = b and x = 1 should be at least M. We know that this latter area is
Z 1
1 1 1

1
dx = 2x 2
b
b x2
√
= 2(1 − b)
< 2

Just as we deduced that the unbounded area under the graph of f (x) = x12 on the
interval [1, ∞) could be finite, we again see that the area of this unbounded region R
could be finite. In fact, similar to Improper Integrals of Type I, we could define the
area A of region R to be
Z 1
1 1 1
√
A = lim+ 1
dx = lim 2x 2 = lim 2(1 − b) = 2.
b→0+ b→0+
b→0
b
b x2

DEFINITION Type II Improper Integral

1) Let f be integrable on [t, b] for every t ∈ (a, b] with either lim+ f (x) = ∞ or
x→a
lim+ f (x) = −∞ . We say that the Type II Improper Integral
x→a
Z b
f (x) dx
a

converges if Z b
lim+ f (x) dx
t→a t
exists. In this case, we write
Z b Z b
f (x) dx = lim+ f (x) dx.
a t→a t
Rb
Otherwise, we say that a
f (x) dx diverges.

Calculus 2 (B. Forrest)2

Chapter 2: Techniques of Integration 98

2) Let f be integrable on [a, t] for every t ∈ [a, b) with either lim− f (x) = ∞ or
x→b
lim− f (x) = −∞ . We say that the Type II Improper Integral
x→b

Z b
f (x) dx
a

converges if Z t
lim− f (x) dx
t→b a
exists. In this case, we write
Z b Z t
f (x) dx = lim− f (x) dx.
a t→b a

Rb
Otherwise, we say that a
f (x) dx diverges.

3) If f has an infinite discontinuity at x = c where a < c < b, then we say that the
Type II Improper Integral Z b
f (x) dx
a
Z c Z b
converges if both f (x) dx and f (x) dx converge. In this case, we write
a c
Z b Z c Z b
f (x) dx = f (x) dx + f (x) dx.
a a c
Z b
If one or both of these integrals diverge, then we say that f (x) dx diverges.
a

Remark: In determining the convergence or divergence of a Type I Improper

Integral the p-test was an important tool. There is a natural analog of the p-test for
Type II Improper Integrals.

THEOREM 8 p-Test for Type II Improper Integrals

The improper integral Z 1
1
dx
0 xp
converges if and only if p < 1.
If p < 1, then Z 1
1 1
p
dx = .
0 x 1− p

Calculus 2 (B. Forrest)2

Section 2.4: Introduction to Improper Integrals 99

PROOF
First assume that p , 1. By definition
Z 1 Z 1
1 1
p
dx = lim+ p
dx
0 x t→0 t x

1 1−p 1

= lim+ x
t→0 1 − p t

1 1 1−p
= lim+ − t
t→0 1− p 1− p

Now if p < 1, then 1 − p > 0 so

1 1 1−p 1
lim+ − t = .
t→0 1− p 1− p 1− p
Now if p > 1, then 1 − p < 0 so
1 1 1−p
lim+ − t = ∞.
t→0 1− p 1− p

Finally, if p = 1, then
Z 1 Z 1
1 1
dx = lim+ dx
0 x t→0 t x
1
= lim+ ln(x)
t→0 t

= lim+ ln(1) − ln(t)

t→0

= ∞

R1
It follows that 1
0 xp
dx converges precisely when p < 1 and that in this case
Z 1
1 1
p
dx =
0 x 1− p
as claimed.

Calculus 2 (B. Forrest)2

Chapter 3

Applications of Integration

In this chapter, we will consider four types of calculations that use integration: areas
between curves, volumes using the disk method, volumes using the shell method,
and arc length.

3.1 Areas Between Curves

We have already seen that there is a strong relationship between integration and
area. In particular, if the continuous function f is positive on [a, b], then we
Z b
interpreted f (t) dt to be the area under the graph of f that is above the t-axis and
a
bounded by the lines t = a and t = b.
In this section, integration is used to answer a more general problem—that of
calculating areas between curves (rather than between the curve and the x-axis).

Problem f
Let f and g be continuous on an
interval [a, b]. Find the area of
the region bounded by the graphs g
of the two functions, f and g, and
the lines t = a and t = b. t
a 0 b

Let’s begin with a simple case. f

Assume that f (t) ≤ g(t) for all g
t ∈ [a, b]. The task is to find the A
area A of the region in the
diagram.
t
a 0 b
We will use a method similar to our previous area calculation under a curve with
Riemann sums. That is, we begin by constructing a regular n-partition
a = t0 < t1 < t2 < · · · < ti−1 < ti < · · · < tn−1 < tn = b
Section 3.1: Areas Between Curves 101

with ∆ti = b−a

n
and ti = a + i(b−a)
n
.

This partition divides A f

into n subregions which
we label as g
A1 , A2 , · · · , An where Ai A1 A2 Ai An
is the region bounded by
the graphs f and g, and
the lines t = ti−1 and
t = ti .
0
a = t0 t1 t2 ti−1ti tn = b
A rectangle Ri can now be used to estimate the area Ai as follows:

f
g(ti )

g
Ri
g(ti ) − f (ti )

f (ti )
0 ti
a = t0 t1 t2 ti−1 tn = b
∆ti

The height of the rectangle Ri is h = g(ti ) − f (ti ) and its width is ∆ti = b−a
n
, so the
area Ai is estimated by
Ai (g(ti ) − f (ti )) ∆ti .

Thus
n
X
A = Ai
i=1
n
X
(g(ti ) − f (ti ))∆ti
i=1
n
X b−a
(g(ti ) − f (ti ))
i=1
n

with the latter sum equal to a right-hand Riemann sum for the function g − f on
[a, b].

Calculus 2 (B. Forrest)2

Chapter 3: Applications of Integration 102

0 t
a = t0 t1 t2 ti−1 i tn = b

Let n → ∞. Then

n
X
A = lim (g(ti ) − f (ti ))∆ti
n→∞
i=1
n
X b−a
= lim (g(ti ) − f (ti ))
n→∞
i=1
n
Z b
= (g(t) − f (t)) dt
a

The general case where f and g may cross at one or more locations on the interval
[a, b] is similar. We again construct a regular n-partition

a = t0 < t1 < t2 < · · · < ti−1 < ti < · · · < tn−1 < tn = b

with ∆ti = b−a

n
and ti = a + i(b−a)
n
. This divides A into n subregions A1 , A2 , · · · , An
where Ai is the region bounded by the graphs f and g, and the lines t = ti−1 and t = ti .

f
Moreover, we can again
estimate the area Ai by
constructing rectangle Ri .
However, this time we must Ai
be concerned with whether g
f (ti ) ≤ g(ti ) or whether
g(ti ) ≤ f (ti ).
0
a = t0 t1 t2 ti−1ti tn = b
If f (ti ) ≤ g(ti ), then the height of the rectangle Ri is hi = g(ti ) − f (ti ) and its width is
∆ti = b−a n
. That is
Ai (g(ti ) − f (ti )) ∆ti

Calculus 2 (B. Forrest)2

Section 3.1: Areas Between Curves 103

g(ti )
g(ti ) − f (ti ) f
f (ti )
∆ti

0 t
a = t0 t1 t2 ti−1 i tn = b

However, if g(ti ) ≤ f (ti ), then the height of the rectangle is now hi = f (ti ) − g(ti ).
The width remains as ∆ti = b−a n
, so

Ai ( f (ti ) − g(ti )) ∆ti

f (ti )
f (ti ) − g(ti ) f
g(ti )
∆ti

0 t
a = t0 t1 t2 ti−1 i tn = b

To summarize, we have that

Ai hi ∆ti
where (
g(ti ) − f (ti ) if g(ti ) − f (ti ) ≥ 0
hi =
f (ti ) − g(ti ) if g(ti ) − f (ti ) < 0

Calculus 2 (B. Forrest)2

Chapter 3: Applications of Integration 104

0 t
a = t0 t1 t2 ti−1 i tn = b

f (ti ) > g(ti ) g(ti ) > f (ti )

Once more, the estimate for the area between f and g on [a, b] is
n
X
A = Ai
i=1
Xn
hi ∆ti
i=1

However, since
(
g(ti ) − f (ti ) if g(ti ) − f (ti ) ≥ 0
hi =
f (ti ) − g(ti ) if g(ti ) − f (ti ) < 0

then hi is equivalent to
hi =| g(ti ) − f (ti ) |
so
n
X
A = Ai
i=1
Xn
| g(ti ) − f (ti ) | ∆ti
i=1

Calculus 2 (B. Forrest)2

Section 3.1: Areas Between Curves 105

Area Between Curves

Let f and g be continuous on [a, b]. Let A be the region bounded by the
graphs of f and g, the line t = a and the line t = b. Then the area of region A
is given by Z b
A= | g(t) − f (t) | dt.
a

EXAMPLE 1 Find the area A of the closed g(x) = x2

region bounded by the graphs of A
the functions g(x) = x2 and f (x) = x3
f (x) = x3 . This area is the shaded
region in the diagram. −1 0 1

−1
The graphs cross when x3 = x2 or equivalently when

0 = x3 − x2
⇒ 0 = x2 (x − 1)

This occurs when x = 0 and x = 1. It follows that we are looking for the area
bounded by the functions g(x) = x2 and f (x) = x3 between the lines x = 0 and x = 1.
Moreover, on the interval [0, 1] notice that x2 ≥ x3 . This means that the area is

Z 1
A = x2 − x3 dx
0
! 1
x3 x4
= −
3 4 0

!
1 1
= − − (0 − 0)
3 4
1
=
12

Calculus 2 (B. Forrest)2

Chapter 3: Applications of Integration 106

1
f (x) = x
EXAMPLE 2 Find the total area A of the closed
regions bounded by the graphs of g(x) = x3
the functions f (x) = x and
g(x) = x3 . The shaded regions in −1 0 1
the diagram represent A. g(x) = x3
f (x) = x
−1

First we must locate the points where the graphs intersect. That is, where x3 = x or
equivalently where

0 = x3 − x
⇒ 0 = x(x2 − 1)
⇒ 0 = x(x + 1)(x − 1)

The solutions are x = −1, x = 0, and x = 1. This means that the left-hand bound is
x = −1 and the right-hand bound is x = 1. Using this information, we know that the
area is given by Z 1
| x3 − x | dx.
−1
However, we cannot apply the Fundamental Theorem of Calculus directly to
| x3 − x | to finish the calculation since f and g intersect on the interval [−1, 1].
Instead, we must consider the area in two parts, A1 and A2.

Area of A1
On the interval [−1, 0] we have 1
f (x) = x
x3 ≥ x
g(x) = x3
It follows that
Z 0 Z 0 −1 0 1
A1
| x − x | dx =
3
(x3 − x) dx g(x) = x3
−1 −1
f (x) = x
This integral represents A1, the −1
shaded area in the diagram.

Calculus 2 (B. Forrest)2

Section 3.1: Areas Between Curves 107

Area of A2
On the interval [0, 1] we have 1
f (x) = x
x ≥ x3 .
A2 g(x) = x3
It follows that
Z 1 Z 1 −1 0 1
| x − x | dx =
3
(x − x3 ) dx g(x) = x3
0 0
f (x) = x
This integral represents A2, the −1
shaded area in the diagram.

Total Area Between the Curves

The total area A between the curves f (x) = x and g(x) = x3 on the interval [−1, 1] is
A = A1 + A2. Thus the total area is:

Z 1
A = | x3 − x | dx
−1

= A1 + A2
Z 0 Z 1
1
f (x) = x = | x − x | dx +
3
| x3 − x | dx
−1 0
Z 0 Z 1
A2 g(x) = x3 = (x3 − x) dx + (x − x3 ) dx
−1 0
−1 0 1 ! 0 ! 1
A1 x4 x2 x2 x4
g(x) = x3 = − + −
4 2 −1 2 4 0

f (x) = x !! ! !
−1 1 1 1 1
= (0 − 0) − − + − − (0 − 0)
4 2 2 4
1 1
= +
4 4
1
=
2

Calculus 2 (B. Forrest)2

Chapter 3: Applications of Integration 108

3.2 Volumes of Revolution: Disk Method

In this section we will use integration to calculate the volume of various types of
solids obtained by rotating a region in the plane around a fixed line.

y
Problem 1:
Assume that f is continuous on
[a, b] and that f (x) ≥ 0 on [a, b]. y = f (x)
Let W be the region bounded by
the graph of f , the lines x = a W
and x = b and the line y = 0. a b x

If region W is revolved around

y = f (x)
the x-axis an object called a solid
of revolution is generated with
the property that each vertical W
cross section of the solid is a a b x
circle with radius equal to the
value of the function at the
location of the slice.

Our goal is to determine the volume V of this solid.

We will use integration to solve this problem and begin with a regular n-partition

a = t0 < t1 < t2 < · · · < ti−1 < ti < · · · < tn−1 < tn = b

of [a, b] with ∆ti = b−a

n
and ti = a + i(b−a)
n
. This partition subdivides the region W into
n subregions. Let Wi denote the subregion of W in the interval [xi−1 , xi ].

y = f (x)

a b x

Calculus 2 (B. Forrest)2

Section 3.2: Volumes of Revolution: Disk Method 109

If we let Vi be the volume obtained by rotating Wi around the axis, then

n
X
V= Vi .
i=1

y = f (x)
We will use the same idea that
was used to calculate areas to
estimate the volume Vi . In
Ri f (xi )
particular, replace Wi by the
rectangle Ri with height f (xi ) and
base on the interval [xi−1 , xi ].
xi−1 xi x

∆xi

If 4xi is small, then Vi is

approximately equal to the Di
volume obtained by rotating Ri
around the x-axis. Rotating each
Ri generates a thin cylindrical
disk Di . Therefore, the solid is
approximated by a series of thin
disks.

For this reason, this method to find the volume of revolution is often called the
Disk Method.
The next step is to determine the volume Vi∗ of the disk Di .
However, a close look at this disk shows that it has radius equal to the value of the
function at xi and its thickness is 4xi . Therefore, since the volume of a cylindrical
disk is
π × (radius)2 × (thickness)
we get that
Vi∗ = π f (xi )2 4xi .

Calculus 2 (B. Forrest)2

Chapter 3: Applications of Integration 110

Radius = f (xi )

x
∆xi

Volume = π f (xi )2 ∆xi

Then the approximation for the total volume of the solid of revolution is:
n
X
V = Vi
i=1
n
X
Vi∗
i=1
n
X
= π f (xi )2 4xi
i=1

It follows that n
X
V π f (xi )2 4xi
i=1

and this is a Riemann sum for the function π f (x)2 over the interval [a, b].
Therefore, letting n → ∞, we achieve the formula for the volume of revolution.

Volumes of Revolution: The Disk Method I

Let f be continuous on [a, b] with f (x) ≥ 0 for all x ∈ [a, b]. Let W be the
region bounded by the graphs of f , the x-axis and the lines x = a and x = b.
Then the volume V of the solid of revolution obtained by rotating the region
W around the x-axis is given by
Z b
V= π f (x)2 dx.
a

EXAMPLE 3 Find the volume of the solid of revolution obtained by rotating the region bounded
by the graph of the function f (x) = x2 , the x-axis, and the lines x = 0 and x = 1,
around the x-axis.

Calculus 2 (B. Forrest)2

Section 3.2: Volumes of Revolution: Disk Method 111

y
Using the formula, the volume is
f (x) = x 2
Z 1
V = π f (x)2 dx
0
(1, 1)
Z 1
= π(x2 )2 dx
0
1
x2
Z
= π x4 dx
0 1 x 0
1
x5
= π
5 0
π
=
5

In the next example, we will use what we have learned about volumes of revolution
to derive the formula for the volume of a sphere.

EXAMPLE 4 Find the volume of the sphere of radius √

r obtained by rotating the semi-circular
region bounded by the graph of f (x) = r2 − x2 , the lines x = −r, x = r and y = 0
around the x-axis.
Applying the disk method we get
Z r
V = π f (x)2 dx
√ Z−rr √
y f (x) = r2 − x2 2
= π r2 − x2 dx
−r

√ Z r
r 2 − x2 = π(r2 − x2 ) dx
−r

3 r
!
−r 0 r x x
= π r2 x −

3 −r

r3 (−r)3
! !!
= π r −
3 3
− −r −
3 3
4 3
= πr
3
which is the general formula for the
volume of a sphere.

Calculus 2 (B. Forrest)2

Chapter 3: Applications of Integration 112

Until now we have looked at volume problems that involved a region that was
bounded by a function f and the x-axis. Next we will look at a more general
problem where the region that is revolved is bounded by two functions.

Problem 2:
Suppose that 0 ≤ f (x) ≤ g(x). We want to W
find the volume V of the solid formed by
revolving the region W bounded by the
graphs of f and g and the lines x = a and
x = b around the x-axis. f

a b

Observe that if we let W1 denote the and we let W2 denote region bounded by
region bounded by the graph of g, the the graph f , the x-axis, and the lines x = a
x-axis, and the lines x = a and x = b and x = b,

g g

f
W2
a b a b

then W is the region that remains when we remove W2 from W1 . It follows that the
solid generated by revolving W around the x-axis is the same as the solid we would
get by revolving W1 around the x-axis and then removing the portion that would
correspond to the solid obtained by revolving W2 around the x-axis.
If we let V1 be the volume of the solid obtained by rotating W1 and V2 be the volume
of the solid obtained by rotating W2 , then we have

V = V1 − V2 .

However, the formula for volumes of revolution tells us that

Z b
V1 = πg(x)2 dx
a

Calculus 2 (B. Forrest)2

Section 3.2: Volumes of Revolution: Disk Method 113

and Z b
V2 = π f (x)2 dx.
a

Therefore,
V = V1 − V2
Z b Z b
= πg(x) dx − 2
π f (x)2 dx
a a
Z b
= π(g(x)2 − f (x)2 ) dx
a

Volumes of Revolution: The Disk Method II

Let f and g be continuous on [a, b] with 0 ≤ f (x) ≤ g(x) for all x ∈ [a, b].
Let W be the region bounded by the graphs of f and g, and the lines x = a
and x = b. Then the volume V of the solid of revolution obtained by rotating
the region W around the x-axis is given by
Z b
V= π(g(x)2 − f (x)2 ) dx.
a

EXAMPLE 5 Find the volume V of the solid obtained by revolving the closed region bounded by
the graphs of g(x) = x and f (x) = x2 around the x-axis.
Since we have not been given the interval over which we will integrate, we must
find the x-coordinates of the points where the graphs intersect. But this means that
we must solve x = x2 so x2 − x = x(x − 1) = 0. Hence, the points of intersection are
located at x = 0 or x = 1. Moreover, on [0, 1], we have 0 ≤ x2 ≤ x (i.e., the graph of
x2 lies below the graph of x on this interval). Therefore, the region appears as
follows:
y y
f (x) = x2 f (x) = x2
g(x) = x g(x) = x
(1, 1) (1, 1)

0 1 x 0 1 x

Calculus 2 (B. Forrest)2

Chapter 3: Applications of Integration 114

Then the volume is

Z 1
V = π g(x)2 − f (x)2 dx
0
Z 1
= π (x)2 − (x2 )2 dx
0
Z 1
= π x2 − x4 dx
0
! 1
x3 x5
= π

−
3 5

0
!
1 1
= π −
3 5
2π
=
15

Exercise:
Suppose that f and g are continuous on y g
[a, b] with c ≤ f (x) ≤ g(x) for all
x ∈ [a, b]. Let W be the region bounded
by the graphs of f and g, and the lines W
x = a and x = b. What is the volume V of
the solid of revolution obtained by
revolving the region W around the line f
y = c? The previous analysis still applies. y=c
Therefore, as an exercise, verify that the
volume in this case is
a b x
Z b
V= π((g(x) − c)2 − ( f (x) − c)2 ) dx.
a

3.3 Volumes of Revolution: Shell Method

Sometimes using the Disk Method to find the volume of a solid of revolution can be
onerous due to the algebra involved. Additionally, the Disk Method is sometimes
difficult to use if the region is revolved around the y-axis instead of the x-axis. There
is an alternate method called the Shell Method that may be easier to implement in
such cases.

Calculus 2 (B. Forrest)2

Section 3.3: Volumes of Revolution: Shell Method 115

Problem: Assume that f and g are continuous on [a, b], with a ≥ 0 and f (x) ≤ g(x)
on [a, b]. Let W be the region bounded by the graphs of f and g and the lines x = a
and x = b. Find the volume V of the solid obtained by rotating the region W around
the y-axis.

y
g

a b x

We can proceed in a manner similar to the development of the Disk Method by

constructing a regular n-partition of [a, b] with

a = x0 < x1 < · · · < xi−1 < xi < · · · < xn = b.

y
g

Once again, this partition subdivides the Wi

region W into n subregions. Let Wi denote
the subregion of W on the interval [xi−1 , xi ]
f

a xi b x
xi−1

Let Vi be the volume obtained by rotating Wi around the y-axis so that

n
X
V= Vi .
i=1

Calculus 2 (B. Forrest)2

Chapter 3: Applications of Integration 116

y
g

Next, approximate Wi by the rectangle Ri Ri

with height g(xi ) − f (xi ), and base on the
line y = f (xi ) and top on the line y = g(xi )
in the interval [xi−1 , xi ].
f

a xi b x
xi−1
∆xi

It follows that if 4xi is small, then Vi is

approximately equal to the volume Ri
obtained by rotating Ri around the y-axis. shell S i
This time rotating Ri generates a thin
cylindrical shell S i . ∆xi

For this reason, this method for finding volumes is called the Shell Method (or
Cylindrical Shell Method).
The volume Vi∗ of the shell generated by Ri is

(circumference) × (height) × (thickness)

which is the same as

2π × (radius) × (height) × (thickness).

The height of the shell is g(xi ) − f (xi ), its thickness is 4xi , and the radius of
revolution is xi (the distance from the y-axis). Therefore, the volume Vi∗ of S i is

2πxi (g(xi ) − f (xi ))4xi .

Calculus 2 (B. Forrest)2

Section 3.3: Volumes of Revolution: Shell Method 117

y
thickness = ∆xi
radius = xi

circumference = 2π xi

height = g(xi ) − f (xi )

xi x

Volume = 2π xi (g(xi ) − f (xi )) ∆xi

It follows that
n
X
V = Vi
i=1
n
X
Vi∗
i=1
n
X
= 2πxi (g(xi ) − f (xi ))4xi
i=1

Letting n → ∞, we get
Z b
V= 2πx(g(x) − f (x)) dx.
a

Volumes of Revolution: The Shell Method

Let a ≥ 0. Let f and g be continuous on [a, b] with f (x) ≤ g(x) for all
x ∈ [a, b]. Let W be the region bounded by the graphs of f and g, and the
lines x = a and x = b. Then the volume V of the solid of revolution obtained
by rotating the region W around the y-axis is given by
Z b
V= 2πx(g(x) − f (x)) dx.
a

Calculus 2 (B. Forrest)2

Chapter 3: Applications of Integration 118

EXAMPLE 6 Find the volume of the solid obtained by revolving the closed region in the first
quadrant bounded by the graphs of g(x) = x and f (x) = x2 around the y-axis.

y y
f (x) = x2
g(x) = x f (x) = x2
g(x) = x
(1, 1) (1, 1)

0 1 x 0 1 x

As we have seen from a previous example, the graphs intersect in the first quadrant
when x = 0 and x = 1 on the interval [0, 1] with f (x) ≤ g(x). Thus
Z 1
V = 2πx(g(x) − f (x))dx
0
Z 1
= 2πx(x − x2 )dx
0
1
π
Z
= 2π(x2 − x3 )dx =
0 6

Observe that previously we calculated the volume obtained by rotating this same
region around the x-axis to equal 2π
15
(which is less than π6 ). This should not be
surprising since the region is closer to the x-axis than to the y-axis and the further
away from the axis of revolution, the larger the volume.

3.4 Arc Length

The next application of integration that we will develop is a method for finding the
length of the graph of a function over an interval [a, b]. The calculation of arc length
has many important applications though most are beyond the scope of this course.
Problem: Let f be continuously differentiable on [a, b]. What is the arc length S of
the graph of f on the interval [a, b]?

Calculus 2 (B. Forrest)2

Section 3.4: Arc Length 119

S = arc length y = f (x)

a b x

Let
a = x0 < x1 < · · · < xi−1 < xi < · · · < xn = b
be a regular n-partition of [a, b].
Let S i denote the length of the arc joining (xi−1 , f (xi−1 )) and (xi , f (xi )).

y = f (x)
Si

a xi b x
xi−1
∆xi

Then the length of the graph of f on the interval [a, b] is

n
X
S = S i.
i=1

Observe that if 4xi is small, then S i is approximately equal to the length of the
secant line joining (xi−1 , f (xi−1 )) and (xi , f (xi )).

y
(xi−1 , f (xi−1 )) secant

(xi , f (xi ))
portion of Si
arc on y = f (x)

xi−1 xi x
∆xi

Calculus 2 (B. Forrest)2

Chapter 3: Applications of Integration 120

It follows that
p
Si (4xi )2 + (4yi )2
p
= (4xi )2 + ( f (xi ) − f (xi−1 ))2

y
(xi−1 , f (xi−1 )) length =
p
(∆xi )2 + ( f (xi−1 ) − f (xi ))2
f (xi−1 ) − f (xi )
(xi , f (xi ))
∆xi

xi−1 xi x

Next, applying the Mean Value Theorem guarantees a ci ∈ (xi−1 , xi ) such that

f (xi ) − f (xi−1 ) = f 0 (ci )4xi .

Therefore,
p
Si (4xi )2 + ( f (xi ) − f (xi−1 ))2
p
= (4xi )2 + ( f 0 (ci )4xi )2
p
= (4xi )2 + ( f 0 (ci ))2 (4xi )2
p
= (4xi )2 (1 + ( f 0 (ci ))2 )
p
= 1 + ( f 0 (ci ))2 4xi .

Hence
n
X
S = Si
i=1
n p
X
1 + ( f 0 (ci ))2 4xi
i=1

1 + ( f 0 (x))2 over the interval [a, b].

p
This is a Riemann sum for the function
Therefore, letting n → ∞, we get

Calculus 2 (B. Forrest)2

Section 3.4: Arc Length 121

Z b p
S = 1 + ( f 0 (x))2 dx.
a

Arc Length
Let f be continuously differentiable on [a, b]. Then the arc length S of the
graph of f over the interval [a, b] is given by
Z bp
S = 1 + ( f 0 (x))2 dx
a

REMARK
The derivation of the arc length formula has many important applications that are
beyond the scope of this course. Unfortunately, due to the square root in the
integrand of the formula, there are very few functions for which we can calculate the
arc length explicitly. Even calculating the arc length of the graph of f (x) = x3 over
the interval [0, 1] is beyond our current ability. However, there are a few examples
that we can evaluate explicitly.

3
2x 2
EXAMPLE 7 Find the length S of the portion of the graph of the function f (x) = between
3
x = 1 and x = 2.
1
In this case, f 0 (x) = x 2 . Hence
Z 2 p
S = 1 + ( f 0 (x))2 dx
1
Z 2 q
= 1 + (x 2 )2 dx
1

1
Z 2 √
= 1 + x dx
1

3 2
2(1 + x) 2
=

3

1
3 3
2(3) 2 2(2) 2
= −
3 3
2 32 3
= (3 − 2 2 )
3

1.578

Calculus 2 (B. Forrest)2

Chapter 4

Differential Equations

In this chapter we introduce and study differential equations. Differential equations

are equations involving functions and their derivatives. These equations are used to
model problems in the physical, biological, and social sciences.

4.1 Introduction to Differential Equations

Differential equations (DEs) often arise from studying real world problems. For
example, if we let
P(t)
denote the population of a colony of bacteria at time t, then empirical evidence
suggests that in an environment with unlimited resources the population will grow at
a rate that is proportional to its size. This makes sense since the more bacteria that
are present, the more “offspring” they will produce. Mathematically, this gives rise
to the differential equation
P 0 (t) = kP(t)
where k is the constant of proportionality. If a function satisfying this equation can
be determined, it would be helpful in predicting how the population will evolve.
The goal of this section is to introduce differential equations and to see how to find
solutions for some basic examples.

DEFINITION Differential Equation

A differential equation is an equation involving an independent variable such as x, a
function y = y(x) and various derivatives of y. In general, we will write
F(x, y, y 0 , y 00 , · · · , y(n) ) = 0.

A solution to the differential equation is a function ϕ such that

F(x, ϕ(x), ϕ 0 (x), · · · , ϕ(n) (x)) = 0.

The highest order of a derivative appearing in the equation is called the order of the
differential equation.
Section 4.2: Introduction to Differential Equations 123

EXAMPLE 1 Consider the equation

F(x, y, y 00 ) = (cos(x))y + y 00 = 0.

This is an example of a differential equation of order 2. The constant function

ϕ(x) = 0 is a solution to this equation since

cos(x)ϕ + ϕ 00 = cos(x) · 0 + 0 = 0.

However, at this point we have no tools to find any other solutions should they exist.

NOTE

1) In this course, we will typically consider only first-order differential

equations. Such DEs can be written in the form

y 0 = f (x, y).

A solution for a first-order differentiable equation is a function ϕ for which

ϕ 0 (x) = f (x, ϕ(x)).

2) The simplest first-order DE is the equation

y 0 = f (x).

Hence y = y(x) is a solution if and only if y is an antiderivative of f .

Therefore, the solutions to this equation are given by
Z
f (x) dx = F(x) + C

where F is any antiderivative of f and C ∈ R is an arbitrary constant. This

shows that differential equations do not need to have unique solutions. In
particular, each different choice of C results in a new solution. The constant C
is called a parameter and the collection of solutions {F(x) + C | C ∈ R} is
called a one parameter family.

EXAMPLE 2 Solve the differential equation y 0 = cos(x).

We have that Z
y= cos(x) dx = sin(x) + C

where C is an arbitrary constant.

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 124

4.2 Separable Differential Equations

In this section we consider an important class of first-order differential equations

and outline a technique for finding their solutions.

DEFINITION Separable Differential Equation

A first-order differentiable equation is separable if there exists functions f = f (x)
and g = g(y) such that the differentiable equation can be written in the form

y 0 = f (x)g(y).

EXAMPLE 3 Consider the following differentiable equations:

i) y 0 = xy2 is separable. In this case, f (x) = x and g(y) = y2 .

ii) y 0 = y is separable. In this case, f (x) = 1 and g(y) = y.

iii) y 0 = cos(xy) is not separable since it can not be written in the

form y 0 = f (x)g(y).

Solving Separable Differential Equations

There is a simple process to follow to find the solutions to a separable differential

equation y 0 = f (x)g(y). The steps are:

Step 1: Identify f (x) and g(y)

Step 2: Find all constant (equilibrium) solutions

Step 3: Find the implicit solution

Step 4: Find the explicit solutions

We will illustrate this process with a series of examples.

Step 1: Identify f (x) and g(y)
Often when you are presented with a differential equation, it will not be obvious
whether the DE is separable. You may have to factor the differential equation in
order to identify f (x) and g(y).

Calculus 2 (B. Forrest)2

Section 4.2: Separable Differential Equations 125

EXAMPLE 4 Let y0 = xy2 + x. Determine if this DE is separable.

In this case, the DE must be factored first in order to identify f (x) and g(y). Note
that

y0 = xy2 + x
= x(y2 + 1)

so f (x) = x and g(y) = y2 + 1. Since this differential equation can be rewritten in the
form y 0 = f (x)g(y), it is a separable DE.

Step 2: Find all constant (equilibrium) solutions

Let
y0 = f (x)g(y)
be a separable DE.
Suppose that g(y0 ) = 0 for some y0 . Then the constant function

y = ϕ(x) = y0

is a solution to the separable differential equation since

ϕ 0 (x) = 0 = f (x)g(y0 ) = f (x)g(ϕ(x))

for every x.

DEFINITION Constant (Equilibrium) Solution to a Separable Differential Equation

If
y 0 = f (x)g(y)
is a separable differential equation and if y0 ∈ R is such that g(y0 ) = 0, then

φ(x) = y0

is called a constant or equilibrium solution to the differential equation.

EXAMPLE 5 Consider the separable differential equation

y 0 = y(1 − y).

Then f (x) = 1 and g(y) = y(1 − y). Moreover, g(y) = 0 if y = 0 or y = 1.

Let ϕ(x) = 0 for each x. Then, since ϕ is constant ϕ 0 (x) = 0,

0 = ϕ 0 (x) = ϕ(x)(1 − ϕ(x))

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 126

and hence ϕ(x) = 0 is an constant (equilibrium) solution.

If ψ(x) = 1 for all x, then ψ 0 (x) = 0 and (1 − ψ(x)) = 0 for each x. Therefore,

0 = ψ 0 (x) = ψ(x)(1 − ψ(x))

for each x. Hence ψ(x) = 1 is also a constant solution of the differential equation.
These are the only two constant solutions to this separable differential equation.

Step 3: Find the implicit solution

If y0 = f (x)g(y) is a separable differential equation, when g(y) , 0 we can divide by
g(y) to get
y0
= f (x).
g(y)

Integrating both sides with respect to x gives

y0
Z Z
dx = f (x) dx.
g(y)

However, if we note that y = y(x), we can apply the Change of Variables theorem to
the left-hand integral to get
y0
Z Z
1
dx = y 0 (x) dx
g(y) g(y(x))
Z
1
= dy
g(y)

This gives us the formula

Z Z
1
dy = f (x) dx
g(y)

Evaluating these integrals gives us an implicit solution to the differential equation of

the form
G(y) = F(x) + C
where C is an arbitrary real constant.
R 1 R
This step will be successful only if we are able to evaluate dy and f (x) dx.
g(y)
Step 4: Find the explicit solutions

Calculus 2 (B. Forrest)2

Section 4.2: Separable Differential Equations 127

Try to solve the implicit equation

G(y) = F(x) + C

for y in terms of x. This will be the explicit solution to the differential equation.
Unfortunately, it is not always easy to solve this equation for y in terms of x.
The next example illustrates all four of the steps required to solve a separable
differential equation.

EXAMPLE 6 Solve the separable differential equation

y 0 = x(y2 + 1).

Step 1: Identify f (x) and g(y)

Since this separable differential equation is already in factored form, we can see that
f (x) = x and g(y) = y2 + 1.
Step 2: Find all constant (equilibrium) solutions
We must find all values y0 such that g(y0 ) = (y20 + 1) = 0. Since y2 + 1 > 0 for all y,
there are no such y0 ’s. Thus we conclude that there are no constant solutions.
Step 3: Find the implicit solution
1
dy = f (x) dx, we get
R R
Using the formula
g(y)
Z Z
1
dy = x dx.
y2 + 1

Since Z
1
dy = arctan(y) + C1
y2 + 1
and
x2
Z
x dx = + C2
2
evaluating the integrals gives the implicit solution

x2
arctan(y) = +C (∗)
2

Note: We only need to include the constant once since it is arbitrary.

Step 4: Find the explicit solutions
To find the explicit solutions, we must try to solve equation (∗) for y. Observe that

tan (arctan(y)) = y

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 128

so that we can apply the tangent function to both sides of the implicit solution (∗) to
get
x2
!
y = tan (arctan(y)) = tan +C .
2

Therefore, the explicit solutions to the separable differential equation

y 0 = x(y2 + 1)

are all functions of the form

x2
!
y = tan +C
2
where C is an arbitrary constant. There were no constant solutions.
Check your work: You can verify that these are the correct solutions by
differentiating.
2
If y = tan x2 + C , then

x2
! !
2x
y 0
= sec 2
+C
2 2
2
!
x
= x sec2 +C
2

The trigonometric identity sec2 (θ) = 1 + tan2 (θ) give us that

2
!
2 x
y = x sec
0
+C
2
2
!!
2 x
= x 1 + tan +C
2
= x(1 + y2 )

which was the original separable differential equation, exactly as expected.

EXAMPLE 7 Solve the differentiable equation

y 0 = xy + y.

Step 1: Identify f (x) and g(y) (if possible) to determine if the DE is separable
This DE factors as
y 0 = (x + 1)y
so f (x) = x + 1 and g(y) = y, so this DE is separable.

Calculus 2 (B. Forrest)2

Section 4.2: Separable Differential Equations 129

Step 2: Find all the constant (equilibrium) solutions

Since g(y) = y, the only y0 such that g(y0 ) = 0 is y0 = 0. Therefore,

y=0

is the only constant solution.

Step 3: Find the implicit solution
1
dy = f (x) dx, we get
R R
If y , 0, using the formula
g(y)
Z Z
1
dy = (x + 1) dx
y
so
x2
ln(| y |) = + x+C (∗)
2
where C is an arbitrary constant.
Step 4: Find the explicit solutions
To solve the implicit equation (∗) for y, we first exponentiate both sides to get
x2
eln(|y|) = e 2 +x+C

so
x2
| y | = eC e 2 +x
x2
= C1 e 2 +x

where C1 = eC > 0.
However,
x2
| y |= C1 e 2 +x
means that
x2
y = ±C1 e 2 +x
x2
= C2 e 2 +x

where C2 = ±C1 , 0.
Therefore, the solutions are

y=0 (equilibrium solution)

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 130

or
x2
y = C2 e 2 +x (explicit solutions)
where C2 , 0.
Finally, since
x2
y = 0 = 0e 2 +x
we actually have that all of the solutions are of the form
x2
y = C3 e 2 +x

where C3 is an arbitrary constant.

Strategy [Solving Separable Differential Equations]

Solving the separable differential equation

y 0 = f (x)g(y)

consists of 4 steps.

Step 1: Determine whether the DE is separable. You may have to

factor the DE to identify f (x) and g(y).
Step 2: Determine the constant solution(s) by finding all the values y0
such that g(y0 ) = 0. For each such y0 , the constant function

y = y(x) = y0

is a solution.
Step 3: If g(y) , 0, integrate both sides of the following equation
Z Z
1
dy = f (x) dx
g(y)
to solve the differential equation implicitly.
Step 4: Solve the implicit equation from Step 3 explicitly for y in terms
of x.
Step 5: [Optional] Check your solution by differentiating y to
deterimine if this derivative is equal to the original DE y0 .

Calculus 2 (B. Forrest)2

Section 4.3: First-Order Linear Differential Equations 131

NOTE
Each of these steps could be difficult or even impossible to complete! For this
reason, it is often necessary to find qualitative solutions or numerical solutions for
the differential equation. We will discuss qualitative solutions later in the chapter.

4.3 First-Order Linear Differential Equations

Linear differential equations form one of the most important classes of differential
equations. There is a very well developed theory for dealing with these equations
both algebraically and numerically. Furthermore, a common strategy for handling
many differential equations that arise in real world problems is to use approximation
techniques to replace the given equation by a linear one. In this section, we will
develop an algorithm for solving first-order linear differential equations that
provides a rather simple formula for determining all solutions to this class of
equations.

DEFINITION First-Order Linear Differentiable Equations [FOLDE]

A first-order differential equation is said to be linear if it can be written in the form

y 0 = f (x)y + g(x).

EXAMPLE 8 Consider the following differential equations:

i) The separable differential equation

y 0 = 3x(y − 1)

may be rewritten as
y 0 = 3xy − 3x
so it is also linear.

ii) The differentiable equation

y 0 = x2 y3
is not linear since the term y3 is of third degree.

The next example will introduce a method for solving first-order linear differential
equations.

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 132

EXAMPLE 9 Solve the first-order linear differential equation

y 0 = 3xy − 3x.

The first step is to rewrite the differential equation so that “g(x)” is alone on the
right-hand side of the equation,

y 0 − 3xy = −3x.

The next step is to multiply both sides of the equation by a nonzero function I = I(x)
to get
Iy 0 − 3xIy = −3xI (1)

The goal is to find the nonzero function I = I(x) such that if we differentiate I(x)y(x)
we will get the left-hand side of equation (1). That is,
d
(I(x)y(x)) = Iy 0 − 3xIy
dx

Using the Product Rule we see that

d
(I(x)y(x)) = Iy 0 + I 0 y
dx
so we require that
Iy 0 + I 0 y = Iy 0 − 3xIy (2)

A close look at equation (2) shows us that we need

I 0 = −3xI (3)

Equation (3) is a separable differential equation which we know how to solve.

Since the only constant solution is I = I(x) = 0 and we require a nonzero function,
we proceed to Step 2 of the algorithm for solving separable equations.
Write Z Z
1
dI = (−3x) dx.
I
This gives
3
ln(| I |) = − x2 + C.
2
Exponentiating shows that
3 2
| I |= C1 e− 2 x
where C1 = eC > 0 and hence that
3 2
I = C2 e− 2 x

with C2 , 0.

Calculus 2 (B. Forrest)2

Section 4.3: First-Order Linear Differential Equations 133

We only require one such function, so choose C2 = 1. Then

3 2
I = I(x) = e− 2 x .

With this choice of I we now have an equation of the form

d
(I(x)y(x)) = −3xI
dx
−3 2
where I = I(x) = e 2 x .
Integrating both sides of this equation gives us
Z !
d
I(x)y = (I(x)y(x)) dx
dx
Z
= −3xI(x) dx
Z
3 2
= −3xe− 2 x dx

Let u = −3 2
2
x to get that du = −3xdx so dx = −3x
du
which gives
Z Z
−3 2
(−3xI(x)) dx = −3xe 2 x dx
Z
= eu du
= eu + C

−3 2
= e 2 x +C

This means
−3 2
I(x)y = e 2 x + C.

Solving for y gives us

−3 2
e 2 x +C
y =
I(x)
−3 2
e 2 x +C
= −3 2
e2x
3 2
= 1 + Ce 2 x

where C is an arbitrary constant.

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 134

Finally, we can verify this answer by differentiating y to get

3 2
y 0 = Ce 2 x (3x)
= (3x)(y − 1)
= 3xy − 3x

which, as we expected, is the original DE that we were trying to solve.

The function I = I(x) in the previous example is called the integrating factor. The
reasons for introducing such a function may look a little mysterious. However, in
general, the use of I(x) works for solving all first-order linear differential equations.

Strategy [Solving First-Order Linear Differential Equations]

Solving the first-order linear differential equation

y 0 = f (x)y + g(x)

consists of 3 steps.

Step 1: Determine whether the DE is linear. Write the equation in

the form
y 0 − f (x)y = g(x)
and identify f (x) and g(x).
Step 2: Calculate the integrating factor I(x) with I(x) , 0. Solve for I
by using R
I = e− f (x) dx

Step 3: Since I(x) , 0, the solution is

R
g(x)I(x) dx
y=
I(x)

Step 4: [Optional] Check your solution by differentiating y.

We can now state the following theorem which summarizes what we have learned
about solving first-order linear differential equations.

Calculus 2 (B. Forrest)2

Section 4.3: First-Order Linear Differential Equations 135

THEOREM 1 Solving First-Order Linear Differential Equations

Let f and g be continuous and let

y 0 = f (x)y + g(x)

be a first-order linear differential equation. Then the solutions to this equation are of
the form R
g(x)I(x) dx
y=
I(x)
R
where I(x) = e− f (x) dx
.

Note: In theory, the method we have just outlined provides us with a means of
solving all first-order linear differential equations. However, in practice this only
works provided that we can perform the required integrations.

EXAMPLE 10 Solve the first-order linear differential equation

y 0 = x − y.

First we rewrite the DE in the form y 0 − f (x)y = g(x) to get

y 0 − (−1y) = x

so this DE is linear and we have f (x) = −1 and g(x) = x.

The integrating factor is R
I(x) = e− (−1) dx
= ex .

The general solution can be found by using

R
g(x)I(x) dx
y =
I(x)
R
xe x dx
=
ex

Integration by parts can be used to show

Z
xe x dx = xe x − e x + C.

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 136

It then follows that

R
xe x dx
y =
ex
xe x − e x + C
=
ex

= x − 1 + Ce−x

The general solution is y = x − 1 + Ce−x .

Check your work. You can verify that this is the correct solution by differentiating
y. Since
y = x − 1 + Ce−x
then
y0 = 1 − Ce−x .
We get

y0 = 1 − Ce−x
= x − (x − 1 + Ce−x )
= x−y

which is the original FOLDE. This verifies that y = x − 1 + Ce−x is the correct
solution.

4.4 Initial Value Problems

We have seen that a first-order differential equation

y 0 = f (x, y)

generally produces infinitely many solutions.

However, there are times when we are looking for a particular solution. This
happens, for example, when a real world problem dictates that the solution must
take particular values at a set of predetermined points. That is, we must satisfy
constraints such as

y(x0 ) = y0
y(x1 ) = y1
y(x2 ) = y2
y(x3 ) = y3
..
.

Calculus 2 (B. Forrest)2

Section 4.4: Initial Value Problems 137

These constraints are called initial values or initial conditions and a differential
equation specified with initial values is called an initial value problem.
The use of initial values are often important in real world problems. For example,
we have seen that with unlimited resources, we can expect the population P(t) of a
bacteria colony to satisfy the differential equation
P 0 = kP
for some k. It is then easy to verify that the general solution to this equation is
P(t) = Cekt
where C is arbitrary and k is potentially unknown.
In this form, it is impossible to use the solutions to derive information about the
population at a specific time. However, suppose we knew that the initial population
at time t = 0 was P0 . This specifies the initial condition
P(0) = P0 .

Substituting t = 0 into the general solution gives us

P0 = P(0) = Cek(0) = Ce0 = C.
Therefore, we now know that
P(t) = P0 ekt .
If we also knew that the population at t = 1 was P1 we would have
P1 = P0 ek(1) = P0 ek .
This gives us that
P1
ek =
P0
and hence that !
P1
k = ln .
P0
The population function has now been completely determined. (In fact, knowing the
population at any two distinct times completely determines the population function).
We have just seen how initial values can help us to determine which of the
potentially infinitely many solutions to a differential equation arising from a real
world problem gives the desired solution. For linear equations, this fact is illustrated
by the next theorem.

THEOREM 2 Existence and Uniqueness Theorem

for First-Order Linear Differential Equations
Assume that f and g are continuous functions on an interval I. Then for each x0 ∈ I
and for all y0 ∈ R, the initial value problem

y 0 = f (x)y + g(x)
y(x0 ) = y0

has exactly one solution y = ϕ(x) on the interval I.

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 138

EXAMPLE 11 Solve the initial value problem

y 0 = xy
with y(0) = 1.
Observe that this differential equation is linear since it takes the form
y0 = f (x)y + g(x) where f (x) = x and g(x) = 0, so the previous theorem tells us that
there will be a unique solution. However, this differential equation is also separable
since it can be written in the form y0 = f (x)g(y) with f (x) = x and g(y) = y, so we
can use the method developed for separable equations to find the solution.
The only constant solution is y = y(x) = 0 which does not satisfy the initial
conditions. Hence we have Z Z
1
dy = x dx.
y
This shows that
x2
ln(| y |) = +C
2
so
x2
y = C1 e 2 .
We also have that
02
1 = y(0) = C1 e 2 = C1 e0 = C1 .
x2
Therefore y = e 2 is the unique solution to this initial value problem.

EXAMPLE 12 A Mixing Problem

Assume that a brine containing 30g of salt per litre of water is pumped into a 1000L
tank at a rate of 1 litre per second. The tank initially contains 1000L of fresh water.
It also contains a device that thoroughly mixes its contents. The resulting solution is
simultaneously drained from the tank at a rate of 1 litre per second.
Problem: How much salt will be in the tank at any given time?
Let s(t) denote the amount of salt in the tank at time t. Then s 0 (t) is the difference
between the rate at which salt is entering the tank (in the brine) and the rate at which
salt is leaving the tank (in the discharge). Label these rin (t) and rout (t), respectively.
That is,
s 0 (t) = rin (t) − rout (t).

s(t)

rin (t) rout (t)

Salt in Salt out

Calculus 2 (B. Forrest)2

Section 4.4: Initial Value Problems 139

To find rin (t) we note that the concentration of salt in the brine entering the tank is
constant at 30g per litre. The flow rate is 1L per second and the rate at which the salt
is entering the tank is the product of the concentration and the flow rate. Hence

g L g
rin (t) = 30 × 1 = 30
L s s
and so the rate at which salt is entering the tank is 30 grams per second.
Calculating rout (t) is similar. It is the concentration of the discharge times the rate of
flow. The rate of flow is again 1L per second but this time the concentration is not
constant. In fact the concentration of the discharge is the same as that of the tank.
s(t)
Since the concentration of salt in the tank is 1000 , we get

s(t) s(t)
rout (t) = ×1=
1000 1000
grams per second. It follows that

s(t)
s 0 (t) = 30 − .
1000

This is a first-order linear differential equation with f (t) = − 1000

1
and g(t) = 30.
(Note: It is also a separable DE). To solve the equation as a FOLDE, the integrating
factor I(t) is R −1 t
I(t) = e− 1000 dt = e 1000 .

Using the FOLDE formula R

g(t)I(t)dt
y=
I(t)
gives us
t
R
30e 1000 dt
s(t) = t
e 1000
t
30000e 1000 + C
= t
e 1000
t
= 30000 + Ce− 1000

Since s(0) = 0, we get

0 = 30000 + Ce0 = 30000 + C

and hence
C = −30000.

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 140

Therefore, at any given time

t
s(t) = 30000 − 30000e− 1000

grams.
Finally, since lim e−x → 0, observe that
x→∞

t
lim s(t) = lim 30000 − 30000e− 1000 = 30000
t→∞ t→∞

grams. This means that if the system was allowed to continue indefinitely, the
amount of salt in the tank would approach 30000 grams. At that level, the
concentration in the 1000L tank would be 30 grams per litre, which would be the
same as the inflow rate. Therefore, the system is moving towards a stable
equilibrium.

Note: In general, initial value problems need not have any solutions or may not
have unique solutions. For example, to see that the solutions need not be unique
consider the initial value problem
1
y 0 = y3

with y(1) = 0. Then the constant solution

y = y(x) = 0

satisfies the initial condition. However so do both

2 3
y = ±[ (x − 1)] 2 .
3

4.5 Graphical and Numerical Solutions to Differential Equations

It is often is the case that an explicit formula for the solution to a differential
equation cannot be determined. When this occurs, we can still learn about the
possible solutions to a differential equation through a graphical analysis (direction
fields) or a numerical analysis (Euler’s Method).

4.5.1 Direction Fields

Most differential equations cannot be solved by obtaining an explicit formula for the
solution. However, we can construct local approximations to solutions by looking at
short segments of their tangent lines at a number of points (x, y), with the slope of

Calculus 2 (B. Forrest)2

Section 4.5: Graphical and Numerical Solutions to Differential Equations 141

these tangent lines determined by the differential equation. This set of tangent line
segments form a direction field and the direction field helps to visualize the solution
curve that passes through any point that sits on a solution to the differential equation.
For example, consider the differential equation

y 0 = x + y.

This differential equation tells us that the derivative y 0 (or slope of the tangent line)
of any solution whose graph contains the point (x, y) is the sum of the components
of the points, x + y.
Let’s consider a set of points (x, y) chosen at random. For each pair (x, y) the
corresponding value of y 0 is calculated. This information is listed in the following
table. The tangent line segments through (x, y) with the given slopes y 0 are then
plotted.
x y tangent line slope from DE
y0 = x + y
-2 0 y 0 = −2 + 0 = −2
-1 0 y 0 = −1 + 0 = −1
0 0 y0 = 0 + 0 = 0
1 0 y0 = 1 + 0 = 1
2 0 y0 = 2 + 0 = 2
0 1 y0 = 0 + 1 = 1
1 1 y0 = 1 + 1 = 2
2 1 y0 = 2 + 1 = 3
-1 3 y 0 = −1 + 3 = 2
-1 -1 y 0 = −1 + −1 = −2

For example, at the origin (0, 0), we have y 0 = 0 + 0 = 0, so the tangent line has
slope 0 at the origin which gives us a horizontal line segment there. Similarly, at
(1, 1), the tangent line has slope 2 so a line segment rising to the right is drawn. At
(−1, −1), the tangent line has slope −2 so a line segment falling to the right is drawn.
The more tangent line segments that are drawn in the direction field, the easier it is
to visualize the solution curves to this differential equation. However, this exercise
can become tedious if done by hand. Instead, a mathematical software program is
normally used to render the direction field.

The completed direction field

for
y0 = x + y
is shown.

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 142

Once we can view the direction field, specific solutions can be sketched by drawing
along the tangent line segments. For example, the following diagrams show the
hand sketch of the solution to y 0 = x + y for y(0) = 1 and for y(−1) = 0 suggested by
the direction field. Notice that the solution curve for y(−1) = 0 is linear! By
studying the shape of these solution curve sketches, we can better understand the
nature of the solution set of the differential equation even though we may not know
the explicit solutions.

Note: Since this linear differential equation y 0 = x + y can be solved explicitly with
the initial values y(0) = 1 or y(−1) = 0, it is a worthwhile exercise to compare the
explicit solutions with the solution curves we obtained from the direction field.

4.5.2 Euler’s Method

In the previous section, we used linear approximation to construct direction fields to

graphically approximate solutions to differential equations. We will explore this
idea further and describe an algorithm known as Euler’s Method for numerically
building an approximate solution to a differential equation y 0 = f (x, y) on a closed
interval [a, b].
Suppose the function y = y(x) is known to be a solution of a differential equation
y 0 = f (x, y)
and that y(x0 ) = y0 . This means that the point (x0 , y0 ) is located on the graph of
y = y(x). Moreover,
y 0 (x0 ) = f (x0 , y0 )
is the slope of the tangent line to the graph of y = y(x) through the point (x0 , y0 ). We
know near x0 that y(x) can be approximated by its linear approximation:
L x0 (x) = y(x0 ) + y 0 (x0 )(x − x0 )
= y0 + f (x0 , y0 )(x − x0 )

In the following algorithm, the key idea will be the fact that if x is close to x0 , then
y(x) L x0 (x) = y(x0 ) + y 0 (x0 )(x − x0 ).

Calculus 2 (B. Forrest)2

Section 4.5: Graphical and Numerical Solutions to Differential Equations 143

Euler’s Method

The first step in this algorithm is to choose a partition

P = {a = x0 < x1 < x2 < · · · < xn = b}
of the closed interval [a, b].
Let
ϕ(x0 ) = ϕ(a) = y0 .
On the interval [x0 , x1 ], we define ϕ(x) to be the function
L x0 (x) = y0 + f (x0 , y0 )(x − x0 ).

slope = f (x0 , y0 )
Since this is the tangent
line approximation to y0 at
the point (x0 , y0 ), the graph
of L x0 (x) is a line through the
point (x0 , y0 ) with slope
equal to f (x0 , y0 ).
a = x0 x1

The next step is to calculate the value of this linear approximation at x1 to determine
the y-coordinate of the right-hand endpoint of the tangent line approximation. This
is given by
y1 = L x0 (x1 ) = y0 + f (x0 , y0 )(x1 − x0 ).

We can now find the linear approximation to the solution function through the new
point (x1 , y1 ). The approximate solution will then be defined as
L x1 (x) = y1 + f (x1 , y1 )(x − x1 )
on the interval [x1 , x2 ].

slope = f (x1 , y1 )
The graph is again a line
through the point (x1 , y1 )
with slope equal to
f (x1 , y1 ).

a = x0 x1 x2

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 144

We then find y2 , the y-coordinate of the right-hand endpoint of the new tangent line
approximation. We have

y2 = L x1 (x2 ) = y1 + f (x1 , y1 )(x2 − x1 ).

In a similar manner we find the linear approximation to the solution function

through the new point (x2 , y2 ). This new approximation is

L x2 (x) = y2 + f (x2 , y2 )(x − x2 )

and its graph is the line through the point (x2 , y2 ) with slope equal to f (x2 , y2 ).

Once again we let the slope = f (x2 , y2 )

approximate solution agree
with this linear
approximation on the
interval [x2 , x3 ].
a = x0 x1 x2 x3

ϕ(x)
We proceed in this manner
moving left to right until we
have defined ϕ(x) on the
entire interval [a, b].

a = x0 x1 x2 x3 x4 x5 = b

In summary, Euler’s method begins at the left-hand endpoint x0 = a of an interval

and a short tangent line is created in the direction indicated by the direction field.
After proceeding a short distance along this tangent line, stop at the next location in
the partition x = x1 , adjust the slope of the tangent line, and proceed in this new
direction. Repeat this process according to the direction field to find an approximate
solution ϕ(x). It is important to emphasize that Euler’s method will not produce an
exact solution to an initial value problem. However, the larger the number of terms
in the partition and the closer together we choose successive points in the partition,
the closer ϕ(x) will be to a true solution to the differential equation.

Calculus 2 (B. Forrest)2

Section 4.6: Exponential Growth and Decay 145

4.6 Exponential Growth and Decay

It is known that a population of bacteria in an environment with unlimited resources

grows at a rate that is proportional to the size of the population. Therefore, if P(t)
represents the size of the population at time t, there is a constant k such that

P 0 = kP.

The general solution to this differential equation is given by

P(t) = Cekt

where C = P(0) represents the initial population.

P(t) = P(0)ekt

C = P(0)

Exponential Growth

From the shape of the graph, it makes sense when we say that the bacteria
population exhibits exponential growth.
Physical considerations generally limit the possible solutions to the equation. In the
case of the bacteria population we will see that if we know the initial population as
well as the size of the population at a one other fixed time, then the exact population
function can be determined.

EXAMPLE 13 At time t = 0, a bacteria colony’s population is estimated to be 7.5 × 105 . One hour
later, at t = 1, the population has doubled to 1.5 × 106 . How long will it take until
the population reaches 107 ?
Let P(t) represent the size of the population at time t. We know that there is a
constant k such that
P 0 = kP
so
P(t) = Cekt
and C = P(0) = 7.5 × 105 .
We also know that
1.5 × 106 = P(1) = 7.5 × 105 ek(1) .

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 146

Therefore
1.5 × 106
ek = = 2.
7.5 × 105
To find k, take the natural logarithm of both sides of the equation to get

k = ln(2).

This tells us that the population function is

P(t) = 7.5 × 105 e(ln(2))t .

Now that we know the general formula for P(t), to answer the original question we
need to find t0 such that

P(t0 ) = 7.5 × 105 e(ln(2))t0 = 107 .

Therefore,
107
e(ln(2))t0 =
7.5 × 105
so
107
!
(ln(2))t0 = ln
7.5 × 105
and
107
ln 7.5×105
t0 = 3.74 hours.
ln(2)

There are many other real world phenomena that behave in a manner similar to the
growth of a bacteria population. In other cases, rather than exponential growth, we
have exponential decay. For example, the rate at which radioactive material breaks
down is proportional to the mass of material present.
Let m(t) denote the mass of a certain radioactive material at time t. Then there is a
constant k such that
dm
= m 0 = km.
dt
We have
m(t) = Cekt
where C = m(0) = M0 is the initial mass of the material. Therefore,

m(t) = M0 ekt .

Since the amount of material is decreasing, m 0 (t) < 0. But

m 0 (t) = km(t)

Calculus 2 (B. Forrest)2

Section 4.6: Exponential Growth and Decay 147

and m(t) > 0 so it follows that k < 0. Therefore, the graph of m(t) appears as
follows:

m(t) = M0 ekt

C = m(0) = M0

Exponential Decay

In particular, notice that

lim m(t) = lim M0 ekt = 0
t→∞ t→∞

since k < 0.
We call such a process exponential decay.
All radioactive materials have associated with them a quantity th known as the
half-life of the material. This is the amount of time it would take for one-half of the
material to decay. The half-life is a fundamental characteristic of the material.
Mathematically, if
m(t) = M0 ekt
then th is the time at which
M0
m(th ) = M0 ekth = .
2

Dividing by M0 shows that

1
ekth =
2
and hence that !
1
kth = ln = − ln(2).
2

Therefore, the half-life is given by the formula

− ln(2)
th = .
k

In particular, this shows that the half-life of a material is independent of the original
mass.

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 148

m(t) = M0 ekt

C = m(0) = M0

M0
2

th = − ln(2)
k

Half-life

EXAMPLE 14 Carbon Dating

All living organisms contain a small amount of radioactive carbon-14. Moreover,
each type of organism has a particular equilibrium ratio of carbon-14 compared to
the stable isotope carbon-12.
When an organism dies the equilibrium is no longer maintained since the
radioactive carbon-14 slowly breaks down into carbon-12. It is also known that
carbon-14 breaks down at a rate of 1 part in 8000 per year. This means that after 1
year an initial quantity of 8000 particles will be reduced to 7999. Hence

7999 = m(1) = 8000ek(1)

so that !
7999
k = ln .
8000

Problem 1: Find the half-life of carbon-14.

From the previous discussion, we know that
− ln(2)
th =
k
− ln(2)
=
ln 7999
8000

5544.83 years

Problem 2: After a fossil was found research showed that the amount of carbon-14
was 23% of the amount that would have been present at the time of death. How old
was the fossil?

Calculus 2 (B. Forrest)2

Section 4.7: Newton’s Law of Cooling 149

Let M0 be the expected amount of carbon-14 in the fossil and let to be the age of the
fossil. Then the research shows that

(0.23)M0 = m(t0 ) = M0 ekt0 .

We must solve this equation for t0 . The first step is to recognize that
(0.23)M0
ekt0 = = 0.23
M0
This shows that we did not need to find the quantity M0 explicitly to solve this
question.
Taking the natural logarithm of both sides of the equation gives

kt0 = ln(0.23)

and hence that

ln(0.23)
t0 =
k
ln(0.23)
=
ln 7999
8000

= 11 756 years

4.7 Newton’s Law of Cooling

Newton’s law of cooling states that an object will cool (or warm) at a rate that is
proportional to the difference between the temperature of the object and the ambient
temperature T a of its surroundings. Therefore, if T (t) denotes the temperature of an
object at time t, then there is a constant k such that

T 0 = k(T − T a ).

If D = D(t) = T (t) − T a , then

D 0 = T 0 = kD
so D satisfies the equation for exponential growth (or decay). We know

D = Cekt .

It follows that
T (t) = Cekt + T a

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 150

where C = D(0) = T 0 − T a and T 0 = T (0).

Therefore,
T (t) = (T 0 − T a )ekt + T a .

There are three possible cases.

1. T 0 > T a .
Physically, this means that the object is originally at a temperature that is
greater than the ambient temperature. This means that the object will be
cooling.
Since T (t) is decreasing
T 0 = k(T − T a ) < 0.
However, T > T a , so that k < 0.
2. T 0 < T a .
In this case, the object is originally at a temperature that is lower than the
ambient temperature. Therefore, the object will be warming.
This time T (t) is increasing so
T 0 = k(T − T a ) > 0.
Since T < T a , it follows again that k < 0.
3. T 0 = T a .
Then
T 0 = k(T − T a ) = 0
so the temperature remains constant. We call this the equilibrium state.

The diagram summarizes the possible graphs of the temperature function.

k<0

T0 > Ta
Ta = T0

T0 < Ta

k<0
Newton’s Law of Cooling
T (t) = (T 0 − T a )ekt + T a

Calculus 2 (B. Forrest)2

Section 4.7: Newton’s Law of Cooling 151

Notice that in all three cases,

lim T (t) = T a .
t→∞

Regardless of the initial starting point, if a process always moves towards a

particular equilibrium value, we call this value a stable equilibrium.

EXAMPLE 15 A cup of boiling water at 100◦C is allowed to cool in a room where the ambient
temperature is 20◦C. If after 10 minutes the water has cooled to 70◦C, what will be
the temperature after the water has cooled for 25 minutes?
Let T (t) denote the temperature of the water at time t minutes after cooling
commences. The initial temperature is T 0 = 100◦C and the ambient temperature is
T a = 20◦C. Newton’s Law of Cooling shows that there is a constant k < 0 such that

T (t) = (T 0 − T a )ekt + T a
= (100 − 20)ekt + 20
= 80ekt + 20

The next step is to determine k. Note that

70 = T (10) = 80ek(10) + 20

so
50 = 80e10k .
Hence, !
50
10k = ln
80
and

50
ln 80
k =
10

= −0.047

We can now evaluate T (25) to get that the temperature after 25 minutes is

T (25) = 80e−0.047(25) + 20
= 44.71

degrees Celsius.

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 152

4.8 Logistic Growth

We have seen that a population with unlimited resources grows at a rate that is
proportional to its size. This leads to the differential equation

P 0 = kP.

However, the assumption that resources will be unlimited is usually unrealistic.

More likely, there is a maximum population M that the surrounding environment
can sustain. This means that as the population P(t) approaches M, resources will
become more scarce and the growth rate of the population will slow. On the other
hand, when the population is small in comparison to the maximum population
possible, the growth rate will be similar to that of the unrestricted case since there
will be little resource pressure. It is known that such a population satisfies a
differential equation of the form

P 0 = kP(M − P).

This equation means that the rate of growth is proportional to the product of the
current population and the difference from the maximum sustainable population.
Populations of this type are said to satisfy logistic growth and the differential
equation
y 0 = ky(M − y)
is called the logistic equation.
The logistic equation need not only model a population. However, in the special
case where we are trying to describe the behavior of a population, we have the
additional constraint that P(t) > 0.
Let P0 = P(0) be the initial population at the beginning of a study.
Observe that if the initial population is smaller than M, then the population will be
growing. This means that we would have

0 < P 0 = kP(M − P)

since both P and M − P are positive. As such, we would expect that k > 0.
However, if the initial population exceeds the maximum sustainable population,
then the population would decrease so

0 > P 0 = kP(M − P)

and again we would have k > 0 since P > 0 and M − P < 0.

A third possible case occurs when the initial population is already at the maximum.
In this case,
P 0 = kP(M − P) = 0
so the population would remain constant. This shows that P(t) = M is an
equilibrium solution.

Calculus 2 (B. Forrest)2

Section 4.8: Logistic Growth 153

The last case we will consider occurs when P0 = 0. In this case, we have that

P 0 = kP(M − P) = 0

which makes sense since there are no parents to produce offspring. Therefore,
P(t) = 0 is also an equilibrium, but its nature is quite different than that of the
equilibrium at P(t) = M.
It follows that in all cases, we may assume that P(t) > 0 for all t so that the possible
solutions look as follows:

P0 > M
M

0 < P0 < M

P0 = 0
t

Logistic Growth

You will notice that as long as P0 , 0 we have

lim P(t) = M.
t→∞

This means that P(t) = M is a stable equilibrium. However, since we will never
move towards an equilibrium of P(t) = 0 once there is a nonzero population,
P(t) = 0 is called an unstable equilibrium.
So far, we have presented a qualitative solution to the logistic growth problem.
However, since the equation is separable, we can try to solve it algebraically. We
have already observed that P(x) = 0 and P(x) = M are the constant solutions. We
can then try to solve
Z Z
1
dP = k dt = kt + C1
P(M − P)
R 1
To evaluate P(M−P)
dP we use partial fractions.
The constants A and B are such that
1 A B
= +
P(M − P) P M − P

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 154

or
1 = A(M − P) + B(P).

Letting P = 0 gives
1 = A(M)
so
1
A= .
M
Letting P = M, we get
1 = B(M)
and again
1
B= .
M
Therefore " #
1 1 1 1
= + .
P(M − P) M P M − P

It follows that
Z "Z Z #
1 1 1 1
dP = dP + dP
P(M − P) M P M−P
1
= [ln(| P |) − ln(| M − P |)] + C2
M !
1 |P|
= ln + C2
M | M−P|

We now have that !

1 | P(t) |
ln + C2 = kt + C1 .
M | M − P(t) |
Therefore, !
| P(t) |
ln = Mkt + C3
| M − P(t) |
where C3 is arbitrary.
This shows that
| P(t) |
= Ce Mkt
| M − P(t) |
where C = eC3 > 0.
There are two cases to consider.
Case 1: Assume that 0 < P(t) < M. Then

| P(t) | P(t)
= = Ce Mkt .
| M − P(t) | M − P(t)

Calculus 2 (B. Forrest)2

Section 4.8: Logistic Growth 155

Solving for P(t) would give

P(t) = (M − P(t))Ce Mkt

= MCe Mkt − P(t)Ce Mkt

so that
P(t) + P(t)Ce Mkt = MCe Mkt .
We then have
P(t)(1 + Ce Mkt ) = MCe Mkt
and finally that

MCe Mkt
P(t) =
1 + Ce Mkt
Ce Mkt
= M
1 + Ce Mkt

There are two important observations we can make about this solution.

(a) Since C > 0, the denominator is never 0 so the function P(t) is continuous and

Ce Mkt
0< <1
1 + Ce Mkt
so that
0 < P(t) < M
which agrees with our assumption.
(b) Since k > 0, we have that

Ce Mkt
lim P(t) = lim M
t→∞ t→∞ 1 + Ce Mkt
Ce Mkt
= M lim
t→∞ 1 + Ce Mkt

= M

and
Ce Mkt
lim P(t) = lim M = 0.
t→−∞ t→−∞ 1 + Ce Mkt
This shows that the population would eventually approach the maximum
population M but if you went back in time far enough, the population would
be near 0. Both of these limits are consistent with our expectations.

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 156

If t = 0, then
Ce0 C
P0 = P(0) = M =M .
1 + Ce0 1+C
Solving for C yields
P0 (1 + C) = MC
P0 + P0C = MC
P0 = (M − P0 )C
and finally that
P0
C= .
M − P0

Ce Mkt
The graph of the function P(t) = M looks as follows:
1 + Ce Mkt

0 < P0 < M

P0 = 0
t

Logistic Growth

Case 2: If P(0) > M, then

| P(t) | P(t) P
=− = = Ce Mkt .
| M − P(t) | M − P(t) P − M

Proceeding in a manner similar to the previous case, we get that there exists a
positive constant C such that

Ce Mkt
P(t) = M .
Ce Mkt − 1

Notice that this function has a vertical asymptote when the denominator

Ce Mkt − 1 = 0.

Calculus 2 (B. Forrest)2

Section 4.8: Logistic Growth 157

Moreover, the function is only positive if

Ce Mkt > 1
or equivalently if
1
e Mkt > .
C

1
ln C
The use of some algebra shows that this happens if and only if t > = t0 .
Mk
If we ignore the fact that population must be positive, the graph of the solution
Ce Mkt
function P(t) = M Mkt appears as follows:
Ce − 1

Since we are looking for a population function and so we require P(t) ≥ 0, we will
only consider values of t which exceed t0 . Therefore, the graph of the population
function is:

P0 > M

t0 t

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 158

It is still true that

Ce Mkt
lim P(t) = lim M = M.
t→∞ t→∞ Ce Mkt − 1
and
lim P(t) = ∞.
t→t0+

EXAMPLE 16 A game reserve can support at most 800 elephants. An initial population of 50
elephants is introduced in the park. After 5 years the population has grown to 120
elephants. Assuming that the population satisfies a logistic growth model, how large
will the population be 25 years after this introduction?
Let P(t) denote the elephant population t years after they are introduced to the park.
We know that there are positive constants C and k such that the population of
elephants is given by
Ce800kt
P(t) = 800 .
1 + Ce800kt
Recall that if P0 = P(0), then
P0
C= .
M − P0
We are given that P0 = P(0) = 50 and M = 800. Then
50 50 1
C= = = .
800 − 50 750 15
Therefore,
1 800kt
e
P(t) = 800 15
.
1 + 151 e800kt

To find k, we use the fact that

1 800k(5)
e
120 = P(5) = 800 15
.
1 + 151 e800k(5)

Hence
1
120 3 e800k(5)
= = 15
800 20 1 + 151 e800k(5)
and thus
9 e4000k
= .
4 1 + 15
1 4000k
e

Cross-multiplying gives
9 1
(1 + e4000k ) = e4000k
4 15
and
9 3
+ e4000k = e4000k
4 20

Calculus 2 (B. Forrest)2

Section 4.8: Logistic Growth 159

so
9 17 4000k
= e .
4 20
This means
45
= e4000k
17
and finally that
45
ln 17
k= .
4000
Substituting k back into the population model and evaluating at t = 25 we get

1 800
( ) (25)
ln 45
17
e 4000
P(25) = 800 15
( ) (25)
ln 45
17
1+ 1 800
15
e 4000

1 5 ln( 45
e 17 )
= 800 15
1 5 ln( 45
1 + 15 e 17 )

= 717 elephants

It follows that after 25 years the population has very nearly reached its maximum
(800 elephants).

Logistic growth also applies to many other situations.

EXAMPLE 17 A rumor is circulating around a university campus. A survey revealed that at one
point only 5% of the students in the school were aware of the rumor. However, since
news on campus spreads quickly, after 10 hours the rumor is known by 10% of the
student body. How long will it take until 30% of the students are aware of the
rumor?
Let r(t) be the fraction of the student body at time t that have heard this rumor. Then
0 ≤ r(t) ≤ 1.
Experiments have shown that the rate at which a rumor spreads through a population
is proportional to the product of the fraction of the population that have heard the
rumor and the fraction that have not. Therefore, there is a constant k such that

r 0 = kr(1 − r)

and so this is a logistic growth model with M = 1. It follows that there is a positive
constant C such that
Cekt
r(t) = .
1 + Cekt

Calculus 2 (B. Forrest)2

Chapter 4: Differential Equations 160

We know that at r(0) = 0.05 so

r(0) 0.05
C= = = 0.0526315
1 − r(0) 0.95
and hence that

0.1 = r(10)

0.05 10k
e
= 0.95
1 + 0.95
0.05 10k
e

Therefore
0.005 10k 0.05 10k
0.1 + e = e
0.95 0.95
and
0.045 10k
0.1 = e .
0.95
This gives
0.095
e10k =
0.045
and
0.095
ln 0.045
k= = 0.07472
10
Finally, we want to find t0 such that

Cekt0
0.3 = .
1 + Cekt
Therefore,
0.3(1 + Cekt ) = Cekt0
so
0.3 = 0.7Cekt0
and
0.3
ekt0 = .
0.7C
This shows that
0.3 0.3
ln 0.7C
ln 0.7(0.0526315)
t0 = = = 28.07
k 0.07472
hours.
After 28.07 hours, 30% of the student population had heard the rumor.

Calculus 2 (B. Forrest)2

Section 4.8: Logistic Growth 161

There are many other important examples of logistic models that are similar to the
previous example. For example, the spread of disease through a population also
behaves like the spread of a rumor and as such can be studied with a logistic growth
model.

Calculus 2 (B. Forrest)2

Chapter 5

Numerical Series

The main topic in this chapter is infinite series. You will learn that a series is just a
sum of infinitely many terms. One of the main problems that you will encounter is
to try to determine what it means to add infinitely many terms . We will accomplish
this task by defining the sequence of partial sums and then studying the convergence
of the series.

5.1 Introduction to Series

The Greek philosopher Zeno, who lived from 490-425 BC, proposed many
paradoxes. The most famous of these is the Paradox of Achilles and the Tortoise. In
this paradox, Achilles is supposed to race a tortoise. To make the race fair, Achilles
(A) gives the tortoise (T) a substantial head start.

A T
P0 P1

Zeno would argue that before Achilles could catch the tortoise, he must first go from
his starting point at P0 to that of the tortoise at P1 . However, by this time the tortoise
has moved forward to P2 .

A T
P0 P1 P2

This time, before Achilles could catch the tortoise, he must first go from P1 to where
the tortoise was at P2 . However, by the time Achilles completes this task, the
tortoise has moved forward to P3 .

A T
P0 P1 P2 P3
Section 5.1: Introduction to Series 163

Each time Achilles reaches the position that the tortoise had been, the tortoise has
moved further ahead.

AT
P0 P1 P2 P3 Pn-1Pn

This process of Achilles trying to reach where the tortoise was ad infinitum led
Zeno to suggest that Achilles could never catch the tortoise.
Zeno’s argument seems to be supported by the following observation:
Let t1 denote the time it would take for Achilles to get from his starting point P0 to
P1 . Let t2 denote the time it would take for Achilles to get from P1 to P2 , and let t3
denote the time it would take for Achilles to get from P2 to P3 . More generally, let
tn denote the time it would take for Achilles to get from Pn−1 to Pn . Then the time it
would take to catch the tortoise would be at least as large as the sum

t1 + t2 + t3 + t4 + · · · + tn + · · ·

of all of these infinitely many time periods.

Since each tn > 0, Achilles is being asked to complete infinitely many tasks (each of
which takes a positive amount of time) in a finite amount of time. It may seem that
this is impossible. However, this is certainly a paradox because we know from our
own experience that someone as swift as Achilles will eventually catch and even
pass the tortoise. Hence, the sum

t1 + t2 + t3 + t4 + · · · + tn + · · ·

must be finite.
This statement brings into question the following very fundamental problem.

Problem: Given an infinite sequence {an } of real numbers, what do we mean by the
sum

a1 + a2 + a3 + a4 + · · · + an + · · · ?

To see why this is an issue, consider the following example.

EXAMPLE 1 Let an = (−1)n−1 . Consider

1 + (−1) + 1 + (−1) + 1 + (−1) + 1 + (−1) + · · · .

If we want to find this sum, we could try to use the associative property of finite
sums and group the terms as follows:

[1 + (−1)] + [1 + (−1)] + [1 + (−1)] + [1 + (−1)] + · · · .

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 164

This would give

0 + 0 + 0 + 0 + ···

which must be 0. Therefore, we might expect that

1 + (−1) + 1 + (−1) + 1 + (−1) + 1 + (−1) + · · · = 0.

This makes sense since there appears to be the same number of 1’s and −1’s, so
cancellation should make the sum 0.
However, if we choose to group the terms the differently,

1 + [(−1) + 1] + [(−1) + 1] + [(−1) + 1] + [(−1) + 1] + · · ·

then we get

1 + 0 + 0 + 0 + 0 + · · · = 1.

Both methods seem to be equally valid so we cannot be sure of the real sum. It
seems that the usual rules of arithmetic do not hold for infinite sums. We must look
for an alternate approach.

Since finite sums behave very well, we might try adding all of the terms up to a
certain cut-off k and then see if a pattern develops as k gets very large. This is in fact
how we will proceed.

DEFINITION Series
Given a sequence {an }, the formal sum

a1 + a2 + a3 + a4 + · · · + an + · · ·

is called a series. The series is called formal because we have not yet given it a
meaning numerically.
The an ’s are called the terms of the series. For each term an , the index of the term is
n.
We will denote the series by
∞
X
an .
n=1

Calculus 2 (B. Forrest)2

Section 5.1: Introduction to Series 165

Note that all of the series we have listed so far have started with the first term
indexed by 1. This is not necessary. In fact, it is quite common for a series to begin
with the initial index being 0. In fact, the series can start at any initial point.
final index
∞
an = a j + a j+1 + a j+2 + a j+3 + . . .
P
n= j

initial index

DEFINITION Convergence of a Series

Given a series ∞
X
an
n=1

for each k ∈ N, we define the k-th partial sum S k by

k
X
Sk = an .
n=1

∞
P
We say that the series an converges if the sequence {S k } of partial sums
n=1
converges. In this case, if L = lim S k , then we write
k→∞

∞
X
an = L
n=1

∞
P
and assign the sum this value. Otherwise, we say that the series an diverges.
n=1

We can apply these definitions to the series that we considered earlier in this section.

EXAMPLE 2 Let an = (−1)n−1 . Consider the sequence of partial sums:

S1 = a1 =1
S2 = a1 + a2 = S 1 + a2 = S1 − 1 =0
S3 = a1 + a2 + a3 = S 2 + a3 = S2 + 1 =1
S4 = a1 + a2 + a3 + a4 = S 3 + a4 = S3 − 1 =0
S5 = a1 + a2 + a3 + a4 + a5 = S 4 + a5 = S4 + 1 =1
..
.
Therefore, (
0 if k is even
Sk =
1 if k is odd

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 166

This shows that the sequence of partial sums {S k } diverges, and hence so does
∞
(−1)n−1 .
P
n=1

EXAMPLE 3 Determine if the series ∞

X 1
n=1
n2 +n
converges or diverges.
Observe that
1 1
an = = .
n2 + n n(n + 1)
Moreover, we can write
1 1 1
an = = − .
n(n + 1) n n + 1
Therefore the series becomes
∞ !
X 1 1
− .
n=1
n n + 1

To calculate S k note that

k !
X 1 1
Sk = −
n=1
n n+1
1 1 1 1 1 1 1 1 1
= (1 − ) + ( − ) + ( − ) + ( − ) + · · · + ( − )
2 2 3 3 4 4 5 k k+1

If we regroup these terms, we get

1 1 1 1 1 1 1 1 1
S k = (1 − ) + ( − ) + ( − ) + ( − ) + · · · + ( − )
2 2 3 3 4 4 5 k k+1
1 1 1 1 1 1 1 1 1 1 1
= 1 − ( − ) − ( − ) − ( − ) − ( − ) − ··· − ( − ) −
2 2 3 3 4 4 5 5 k k k+1
1
= 1 − 0 − 0 − 0 − 0 − ··· − 0 −
k+1
1
= 1−
k+1

Then !
1
lim S k = lim 1 − = 1.
k→∞ k→∞ k+1

Calculus 2 (B. Forrest)2

Section 5.2: Geometric Series 167

∞
P 1
Since the sequence of partial sums {S k } converges to 1, the series n2 +n
converges
n=1
and ∞
X 1
= 1.
n=1
n2 + n

What is remarkable about the previous series is not that we were able to show that it
converges, but rather that we could find its sum so easily. Generally, this will not be
the case. In fact, even if we know a series converges, it may be very difficult or even
impossible to determine the exact value of its sum. In most cases, we will have to be
content with either showing that a series converges or that it diverges and, in the
case of a convergent series, estimating its sum.
The next section deals with an important class of series known as geometric series.
Not only can we determine if such a series converges, but we can easily find the
sum.

5.2 Geometric Series

Perhaps the most important type of series are the geometric series.

DEFINITION Geometric Series

A geometric series is a series of the form
∞
X
rn = 1 + r + r2 + r3 + r4 + · · ·
n=0

The number r is called the ratio of the series.

If r = (−1), the series is

∞
X
(−1)n = 1 + (−1) + 1 + (−1) + 1 + (−1) + 1 + (−1) + · · ·
n=0

which we have already seen diverges.

If r = 1, the series is
∞
X
1n = 1 + 1 + 1 + 1 + 1 + · · ·
n=0
k
which again diverges since S k = 1n = k + 1 diverges to ∞.
P
n=0

Question: Which if any of the geometric series converge?

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 168

Assume that r , 1. Let

S k = 1 + r + r2 + r3 + r4 + · · · + rk .

Then

rS k = r(1 + r + r2 + r3 + r4 + · · · + rk )
= r + r2 + r3 + r4 + · · · + rk+1

Therefore

S k − rS k = (1 + r + r2 + r3 + r4 + · · · + rk ) − (r + r2 + r3 + r4 + · · · + rk + rk+1 )
= 1 − rk+1

Hence
(1 − r)S k = S k − rS k = 1 − rk+1
and since r , 1,
1 − rk+1
Sk = .
1−r

The only term in this expression that depends on k is rk+1 , so lim S k exists if and
k→∞
only if lim rk+1 exists. However, if | r |< 1, then rk+1 becomes very small for large k.
k→∞
That is lim rk+1 = 0.
k→∞

If | r |> 1, then | rk+1 | becomes very large as k grows. That is, lim | rk+1 |= ∞.
k→∞
Hence, lim rk+1 does not exist.
k→∞

Finally, if r = −1, then rk+1 alternates between 1 and −1, so lim rk+1 again diverges.
k→∞
∞
This shows that rk+1 , and hence the series rn , converges if and only if | r |< 1.
P
n=0
Moreover, in this case,

1 − rk+1
lim S k = lim
k→∞ k→∞ 1 − r

1 − lim rk+1
k→∞
=
1−r
1
=
1−r

Calculus 2 (B. Forrest)2

Section 5.3: Divergence Test 169

THEOREM 1 Geometric Series Test

∞
rn converges if | r |< 1 and diverges otherwise.
P
The geometric series
n=0

If | r |< 1, then
∞
X 1
rn = .
n=0
1−r

∞ n
P 1
EXAMPLE 4 Evaluate 2
.
n=0

SOLUTION This is a geometric series with ratio r = 12 . Since 0 < 1

2
< 1, the
∞ n
P 1
Geometric Series Test shows that 2
converges. Moreover,
n=0

∞ !n
X 1 1
= 1
n=0
2 1− 2
= 2

5.3 Divergence Test

It makes sense that if we are to add together infinitely many positive numbers and
get something finite, then the terms must eventually be small. We will now see that
this statement holds for any convergent series.

THEOREM 2 Divergence Test

∞
P
Assume that an converges. Then
n=1

lim an = 0.
n→∞

∞
P
Equivalently, if lim an , 0 or if lim an does not exist, then an diverges.
n→∞ n→∞ n=1

The Divergence Test gets its name because it can identify certain series as being
divergent, but it cannot show that a series converges.

∞
rn with | r |≥ 1. Then lim rn = 1 if r = 1 and it
P
EXAMPLE 5 Consider the geometric series
n=0 n→∞
does not exist for all other r with | r |≥ 1 (in other words, if r = −1 or if |r| > 1). The
∞
rn diverges.
P
Divergence Test shows that if | r |≥ 1, then
n=0

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 170

∞
P
The Divergence Test works for the following reason. Assume that an converges
n=1
to L. This is equivalent to saying that
lim S k = L.
k→∞

By the basic properties of convergent sequences, we get that

lim S k−1 = L
k→∞

as well.
However, for k ≥ 2,
k
X k−1
X
S k − S k−1 = an − an
n=1 n=1
= (a1 + a2 + a3 + a4 + · · · + ak−1 + ak ) − (a1 + a2 + a3 + a4 + · · · + ak−1 )
= ak

Therefore,
lim ak = lim (S k − S k−1 )
k→∞ k→∞
= lim S k − lim S k−1
k→∞ k→∞
= L−L
= 0.

EXAMPLES
n
1. Consider the sequence { n+1 }. Then
n
lim = 1.
n→∞ n+1
Therefore, the Divergence Test shows that
∞
X n
n=1
n+1
diverges.
2. While it is difficult to do so, it is possible to show that
lim sin(n)
n→∞

does not exist. Therefore, the Divergence Test shows that the series
∞
X
sin(n)
n=1

diverges.

Calculus 2 (B. Forrest)2

Section 5.3: Divergence Test 171

3. The Divergence Test shows that if either lim an , 0 or if lim an does not
n→∞ n→∞
∞
P
exist, then an diverges. It would seem natural to ask if the converse
n=1
statement holds. That is:

∞
Question: If lim an = 0, does this mean that
P
an converges?
n→∞ n=1
We will see that the answer to the question above is: No, the fact that
∞
lim an = 0, does not mean that
P
an converges.
n→∞ n=1

Let an = 1
n
and
k
X 1 1 1 1 1 1
Sk = =1+ + + + + ··· + .
n=1
n 2 3 4 5 k
Then
S1 = 1
1
S2 = 1 +
2
1 1 1
S4 = 1 + + +
2 3 4
1 1 1
= 1+ +( + )
2 3 4
1 1 1
> 1+ +( + )
2 4 4
1 1
= 1+ +
2 2
2
= 1+
2
1 1 1 1 1 1 1
S8 = 1 + + + + + + +
2 3 4 5 6 7 8
1 1 1 1 1 1 1
= 1+ +( + )+( + + + )
2 3 4 5 6 7 8
1 1 1 1 1 1 1
> 1+ +( + )+( + + + )
2 4 4 8 8 8 8
1 1 1
= 1+ + +
2 2 2
3
= 1+
2
..
.

We have seen that

S 1 = S 20 = 1+ 0
2
S 2 = S 21 = 1+ 1
2
S 4 = S 22 > 1+ 2
2
S 8 = S 23 > 1+ 3
2
..
.

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 172

A pattern has emerged. In general, we can show that for any m

m
S 2m ≥ 1 + .
2
However, the sequence 1 + m2 grows without bounds. It follows that the partial
sums of the form S 2m also grow without bound. This shows that the series
∞
P 1
n
diverges to ∞ as well.
n=1

∞
This example shows that even if lim an = 0, it is still possible for
P
an to diverge!!!
n→∞ n=1

Note:

1. The sequence { n1 } was first studied in detail by Pythagoras who felt that these
ratios represented musical harmony. For this reason the sequence { n1 } is called
∞
P 1
the Harmonic Progression and the series n
is called the Harmonic Series.
n=1
∞
P 1
We have just shown that the Harmonic Series n
diverges to ∞. However,
n=1
the argument to do this was quite clever. Instead, we might ask if we could
use a computer to add up the first k terms for some large k and show that the
sums are getting large? In this regard, we may want to know how many terms
it would take so that
1 1 1 1 1
S k = 1 + + + + + · · · + > 100 ?
2 3 4 5 k
The answer to this question is very surprising. It can be shown that k must be
at least 1030 , which is an enormous number. No modern computer could ever
perform this many additions!!!
2. Recall that in Zeno’s paradox, Achilles had to travel infinitely many distances
in a finite amount of time to catch the tortoise. If Dn represents the distance
between points Pn−1 (where Achilles is after n − 1 steps) and Pn (where the
tortoise is currently located), then the Dn ’s are becoming progressively
smaller.

If tn is the time it takes Achilles to cover the distance Dn , then the tn ’s are also
becoming progressively smaller. In fact, they are so small that lim tn = 0 and
n→∞
indeed it is reasonable to assume that
X∞
tn
n=1

converges! This is how we can resolve Zeno’s paradox.

Calculus 2 (B. Forrest)2

Section 5.4: Arithmetic of Series 173

5.4 Arithmetic of Series

Since convergent series can be viewed as the limit of their sequences of partial
sums, the arithmetic rules for sequences can be applied whenever they are
appropriate. With this in mind, we get:

THEOREM 3 Arithmetic for Series I

∞
P ∞
P
Assume that an and bn both converge.
n=1 n=1

∞
P
1. The series can converges for every c ∈ R and
n=1

∞
X ∞
X
can = c an .
n=1 n=1

∞
(an + bn ) converges and
P
2. The series
n=1

∞
X ∞
X ∞
X
(an + bn ) = an + bn .
n=1 n=1 n=1

These rules should not be surprising. They follow immediately from the
corresponding rules for sequences.
There is one other rule that we will need that does not have an analog for sequences.
∞
P
Given a series an , let j ∈ N. Let
n=1

∞
X
an = a j + a j+1 + a j+2 + a j+3 + · · · .
n= j

∞
P
We say that an converges if
n= j
lim T k
k→∞
exists, where
j+k−1
X
Tk = an = a j + a j+1 + a j+2 + a j+3 + · · · + a j+k−1 .
n= j

∞
P
The following theorem relates the convergence of the series an with that of the
n= j
∞
P
original series an .
n=1

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 174

THEOREM 4 Arithmetic for Series II

∞
P ∞
P
1. If an converges, then an also converges for each j.
n=1 n= j

∞
P ∞
P
2. If an converges for some j, then an converges.
n= j n=1

In either of these two cases,

∞
X ∞
X
an = a1 + a2 + · · · + a j−1 + an .
n=1 n= j

Observation: As a consequence of the previous theorem, we make the following

very important observation. Given a sequence
a1 , a2 , a3 , · · · , an , · · ·
and the first j − 1 terms are changed to create a new sequence
b1 , b2 , b3 , · · · , bn , · · ·
∞ ∞
where bn = an and if n ≥ j, then the series
P P
an and bn are identical. Hence, they
n= j n= j
either both converge or both diverge. The previous theorem can now be used to
∞
P ∞
P
show that either both an and bn converge or they both diverge!!! Therefore,
n=1 n=1
convergence of a series depends completely on the tail of the sequence of terms.
Changing finitely many terms will not affect convergence, though in the case of a
convergent sequence it will almost always affect the final sum.

EXAMPLE 6 A ball is launched straight up from the ground to a height of 30m. When the ball
returns to the ground it will bounce to a height that is exactly 13 of its previous
height. Assuming that the ball continues to bounce each time it returns to the
ground, how far does the ball travel before coming to rest?
Prior to returning to the ground for the first time, the ball travels 30m on its way up
and then 30m down for a total of 2(30) = 60m.
30
On the first bounce, the ball will travel upwards 3
m and down again the same
distance for a total of 2 30
3
m.
On the second bounce, the ball will travel upwards one third the distance of the first
bounce or !
1 30 30
= 2 m.
3 3 3

It will also travel down the same distance for a total of 2 30
32
m.

Calculus 2 (B. Forrest)2

Section 5.4: Arithmetic of Series 175

On the third bounce, the ball will again travel upwards one third the distance it
traveled on the second bounce or
! !
1 30 30
= 3 m.
3 32 3

With the downward trip, the third bounce covers a distance of 2 30 33
m.
Note the pattern that has formed. On the n-th bounce, the ball will travel a distance
of 2 30
3n
m. The total distance D the ball travels will be the sum of each of these
distances.

Therefore, using our rules of arithmetic and what we know about geometric series,
we get
! ! ! !
30 30 30 30
D = 2(30) + 2 + 2 2 + 2 3 + ··· + 2 n + ···
3 3 3 3
1 1 1 1
= 2(30)[1 + + 2 + 3 + ··· + n + ···]
3 3 3 3
∞ !n
X 1
= 60
n=0
3
60
=
1 − 31

= 90 meters.

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 176

Since we are assuming that the ball will bounce infinitely often, we might expect
that this process would continue forever. However, this is not the case. In fact, the
reasoning is very similar to that of the resolution of Zeno’s paradox since the
amount of time it takes for the ball to complete each bounce decreases very rapidly.
Indeed, by using some basic physics, we can actually calculate the total time it
would take for the ball to complete its travels.

It is known from physics that if a ball is dropped, it will fall a distance

1
S = gt2
2

meters in t seconds, where g = 9.81 m/(sec)2 is the acceleration due to gravity.

Therefore, if a ball is dropped from a height h we can determine how long it will
take to reach the ground. We have

1
h = gt2
2

so that
2h
t2 =
g
or
s
2h
t= .
g

In our case, the ball will take the same amount of time to make the upward trip as
the downward trip. Therefore, the total time it will take to complete the n-th bounce
(the 0-th bounce is the original trip) will be
s
2hn
tn = 2
g

where hn is the height of the n-th bounce. But

30
hn =
3n
so
s  s  !n
2(30)  60  1
tn = 2 = 2  √
g(3n )  g 3

It follows that the total time it will take for the ball to complete all of the bounces is

Calculus 2 (B. Forrest)2

Section 5.5: Positive Series 177

∞ ∞ 
 s  !n
X X 60  1
=

tn 2  √
n=0 n=0
g 3
s
∞ !n
60 X 1
= 2 √
g n=0 3
s  
60  1 
= 2  
g  1 − √1 
3

= 11.708 seconds.

5.5 Positive Series

We will soon see that the Monotone Convergence Theorem can help us determine
the convergence of series, particularly those series with positive terms. Recall the
following definition:

DEFINITION Monotonic Sequences

Given a sequence {an }, we say that the sequence is

i) non-decreasing if an+1 ≥ an for every n ∈ N.

ii) increasing if an+1 > an for every n ∈ N.
iii) non-increasing if an+1 ≤ an for every n ∈ N.

iv) decreasing if an+1 < an for every n ∈ N.

We say that {an } is monotonic if it satisfies one of these four conditions.

The Monotone Convergence Theorem gives us a simple criterion for determining

the convergence or divergence of a monotonic sequence.

THEOREM 5 Monotone Convergence Theorem (MCT)

Let {an } be a non-decreasing sequence.

1. If {an } is bounded above, then {an } converges to L = lub({an }).

2. If {an } is not bounded above, then {an } diverges to ∞.

In particular, {an } converges if and only if it is bounded above.

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 178

Note: A similar statement can be made about non-increasing sequences by

replacing the least upper bound with the greatest lower bound and ∞ by −∞.

DEFINITION Positive Series

∞
P
We call a series an positive if the terms an ≥ 0 for all n ∈ N.
n=1

Assume that an ≥ 0 for all n. Let

k
X
Sk = an = a1 + a2 + · · · + ak
n=1

be the k-th partial sum of the series with terms {an }. Then
k+1
X k
X
S k+1 − S k = an − an
n=1 n=1
= (a1 + a2 + · · · + ak + ak+1 ) − (a1 + a2 + · · · + ak )
= ak+1
≥ 0

This shows that

S k+1 ≥ S k
so {S k } is a non-decreasing sequence. The Monotone Convergence Theorem tells us
that there are two possibilities for the sequence {S k }:

1. {S k } is bounded and therefore, by the MCT it converges.

2. {S k } diverges to ∞.

In terms of the series this is equivalent to

∞
P
1. an converges.
n=1

∞
P
2. an diverges to ∞.
n=1

Key Observation: Therefore, for positive series, the convergence of the series
essentially depends only on how large are the terms an . Generally speaking, the
larger the an ’s, the more likely it is that a series will diverge to ∞ and the smaller the
an ’s, the more likely it is that a series will converge.

Calculus 2 (B. Forrest)2

Section 5.5: Positive Series 179

To make this statement more precise, assume that we have two series,
∞
X ∞
X
an and bn
n=1 n=1
∞
P
with 0 ≤ an ≤ bn for all n ∈ N. Assume also that the series bn with the larger
n=1
terms converges to some number L. Since an ≤ bn for all n ∈ N, we would not
∞
P
expect an to diverge to ∞. In fact, if we let
n=1

k
X
Sk = an
n=1

and
k
X
Tk = bn ,
n=1
then
Sk = a1 + a2 + · · · + ak
≤ b1 + b2 + · · · + bk
= Tk
≤ L

since L = lim T k .
k→∞

We have shown that for each k,

∞
X
Sk ≤ L = bn .
n=1

However, the sequence {S k } is increasing and we have just shown that it is bounded
above by L. The Monotone Convergence Theorem shows that {S k } converges to
some M with M ≤ L. In other words,
∞
X
an = M.
n=1

∞
P
On the other hand, if the series an with the smaller terms diverges to infinity, then
n=1
we can make the partial sum S k as large as we like. But, S k ≤ T k , so that we can
make the T k ’s as large as we like. This shows that
lim T k = ∞
k→∞
∞
P
so that bn diverges to ∞.
n=1

This leads us to one of the most important tools we will have for determining the
convergence or divergence of positive series.

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 180

5.5.1 Comparison Test

THEOREM 6 Comparison Test for Series

Assume that 0 ≤ an ≤ bn for each n ∈ N.

∞
P ∞
P
1. If bn converges, then an converges.
n=1 n=1

∞
P ∞
P
2. If an diverges, then bn diverges.
n=1 n=1

Remark: We must make three important observations concerning the Comparison

Test.
∞
P ∞
P
1. If an converges, then we cannot say anything about bn .
n=1 n=1

∞
P ∞
P
2. If bn diverges, then we cannot say anything about an .
n=1 n=1

3. Since the first few terms do not affect whether or not a series diverges, for the
Comparison Test to hold, we really only need that
0 ≤ an ≤ bn
for each n ≥ K, where K ∈ N. That is, the conditions of the theorem need only
be satisfied by the elements of the tails of the two sequences.

∞
P 1
EXAMPLE 7 We have seen that the Harmonic Series n
diverges. We also know that
n=1

1 1
0< ≤ √
n n

for all n ∈ N. Let an = 1n and bn = √1 .

n
Then the Comparison Test shows that the
∞
√1 also diverges.
P
series n
n=1

EXAMPLE 8 Consider the series ∞ ∞

X 1 X 1
= .
n=2
n2 − n n=2 n(n − 1)
If we use a method similar to that of the Partial Fraction decomposition, we get that
1 1 1
= − .
n(n − 1) n − 1 n

Calculus 2 (B. Forrest)2

Section 5.5: Positive Series 181

Let
k
X 1
Sk =
n=2
n2 −n
k
X 1 1
= −
n=2
n−1 n
! ! ! ! !
1 1 1 1 1 1 1 1 1
= 1− + − + − + ··· + − + −
2 2 3 3 4 k−2 k−1 k−1 k
! ! ! ! !
1 1 1 1 1 1 1 1 1 1 1
= 1− − − − − − − ··· − − − − −
2 2 3 3 4 4 k−2 k−2 k−1 k−1 k
1
= 1−
k

∞
Since lim S k = lim 1 − 1
= 1, the series 1
P
k n2 −n
converges with
k→∞ k→∞ n=2

∞
X 1
= 1.
n=2
n2 −n

Next, observe that for n ≥ 2, we have n2 > n2 − n and hence that

1 1
0< 2
< 2 .
n n −n
∞
P 1
However, we have just shown that n2 −n
converges to 1. We can use the
n=2
∞
P 1
Comparison Test to conclude that n2
converges and that
n=2

∞
X 1
≤ 1.
n=2
n2

∞
P 1
We can now immediately conclude that the series n2
also converges and
n=1

∞ ∞
X 1 1 X 1
2
= 2
+ 2
≤ 1 + 1 = 2.
n=1
n 1 n=2
n

In fact, using techniques that are beyond the scope of this course, it can be shown
∞
= π6 1.64493.
P 1 2
that n2
n=1

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 182

EXAMPLE 9 Let p ∈ R with p ≥ 2. Then for each n ∈ N, n2 ≤ n p . Hence

1 1
0≤ p
≤ 2.
n n
∞ ∞
P 1 P 1
Since n2 converges, the Comparison Test shows that if p ≥ 2, np
also
n=1 n=1
converges.
If p ≤ 1, then n p ≤ n for each n ∈ N. It follows that
1 1
0≤ ≤ p.
n n
∞
P 1
This time, we know that the series n
diverges. The Comparison Test shows that
n=1
∞
P 1
np
also diverges.
n=1
∞
P 1
Therefore, np
converges if p ≥ 2 and diverges if p ≤ 1. It would be natural to ask
n=1
∞
about what happens when 1 < p < 2. For example, does 1
P
3 converge?
n=1 n 2

Unfortunately, we have
1 1 1
2
≤ 3 ≤ .
n n2 n
∞ ∞
P 1 1 1 P 1
Since n2
converges and n2
≤ 3 , this tells us nothing about 3 . Similarly,
n=1 n2 n=1 n 2
∞ ∞
1 1
≤ 1n , this tells us nothing about 1
P P
since n
diverges and 3 3 .
n=1 n2 n=1 n 2

By making use of what we know about Improper Integrals, we will see later that
∞
P 1
3 actually converges.
n=1 n 2

EXAMPLE 10 Recall that

n! = 1 · 2 · 3 · · · n
for n ≥ 1 and that
0! = 1.
∞
P 1
Problem: Show that n!
converges and that
n=0

∞
X 1
2< < 3.
n=0
n!

∞
1
= e.
P
Note: In fact n!
n=0

Calculus 2 (B. Forrest)2

Section 5.5: Positive Series 183

First consider that

∞
X 1 1 1 1
= + + + ···
n=0
n! 0! 1! 2!
∞
X 1
= 1+
n=1
n!

∞
P 1
We will show that n!
converges and that
n=1

∞
X 1
1< < 2.
n=1
n!

Let an = 1
n!
and let bn = 1
2n−1
.
Then
1 1 1
a1 = = 0 = 1−1 = b1
1 2 2
1 1
a2 = = = b2
1·2 2
1 1 1
a3 = < = 2 = b3
1·2·3 1·2·2 2
1 1 1
a4 = < = 3 = b4
1·2·3·4 1·2·2·2 2
..
.
1 1 1
an = < = n−1 = bn
1 · 2 · 3 · 4···n 1 · 2 · 2 · 2···2 2

This shows that 0 < an ≤ bn for each n ∈ N.

Now
∞ ∞
X X 1
bn = n−1
n=1 n=1
2
1 1 1
= 0
+ 1 + 2 + ···
2 2 2
∞
X 1
=
j=0
2j
1
= 1
1− 2
= 2

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 184

∞
Since 0 < an ≤ bn for each n ∈ N and
P
bn converges, the Comparison Test shows
n=1
∞ ∞
an = 1
P P
that n!
also converges and that
n=1 n=1

∞ ∞
X 1 X
0< ≤ bn = 2.
n=1
n! n=1

However, since an < bn for each n ≥ 3, we have that

∞ ∞
X 1 X
0< < bn = 2.
n=1
n! n=1

∞
1
= 1
+ 1
+ 1
+ · · · > 1. Therefore,
P
But n! 1! 2! 3!
n=1

∞
X 1
1< < 2.
n=1
n!

Finally, since
∞ ∞
X 1 X 1
=1+ ,
n=0
n! n=1
n!
we get that
∞
X 1
2=1+1< < 1 + 2 = 3.
n=0
n!

We have seen that the Comparison Test can help determine whether certain series
converge. We will now present a variation of the Comparison Test that will work for
a significant collection of series, including all of those series where the terms are
ratios of polynomials in n. We begin with such an example.

EXAMPLE 11 Let an = n3 −n+1

2n
. Notice that if n is very large, then (−n + 1) is negligible in
comparison to n3 . Therefore, we could say that for large n

n3 − n + 1 n3

and so
2n 2n 2
an = 3 = 2.
n3 −n+1 n n

In other words, for very large n the terms an = 2n

n3 −n+1
are roughly comparable in size
to n22 .

Calculus 2 (B. Forrest)2

Section 5.5: Positive Series 185

∞ ∞
P 1 P 2
We know that n2
converges and hence so does n2
. Since
n=1 n=1

2n 2
an = 2
n3 −n+1 n
∞
P 2n
we might guess that n3 −n+1
also converges. Unfortunately,
n=1

2 2n
0< ≤ 3
n2 n −n+1
for all n. This means that we cannot immediately apply the Comparison Test to
establish the convergence. However, we will be able to show that the next theorem
will work in this case. It is essentially an upgraded version of the Comparison Test.

5.5.2 Limit Comparison Test

THEOREM 7 Limit Comparison Test (LCT)

Assume that an > 0 and bn > 0 for each n ∈ N. Assume also that
an
lim =L
n→∞ bn
where either L ∈ R or L = ∞.
∞ ∞
1. If 0 < L < ∞, then
P P
an converges if and only if bn converges.
n=1 n=1

∞ ∞ ∞
2. If L = 0 and
P P P
bn converges, then an converges. Equivalently, if an
n=1 n=1 n=1
∞
P
diverges, then so does bn .
n=1

∞ ∞ ∞
3. If L = ∞ and
P P P
an converges, then bn converges. Equivalently, if bn
n=1 n=1 n=1
∞
P
diverges, then so does an .
n=1

PROOF
Assume that an
lim = L.
n→∞ bn

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 186

1) If 0 < L < ∞, the interval ( L2 , 2L) is an open interval containing L. It follows

that we can find a cutoff N ∈ N so that if n ≥ N, then
L an
< < 2L
2 bn
or equivalently that
L
· bn < an < 2Lbn
2
∞
P
Now if an converges, then the Comparison Test shows that
n=1
∞
X L
· bn
n=1
2
converges and hence so does
∞
X
bn .
n=1
∞
P
If bn converges, then so does
n=1
∞
X
2L · bn .
n=1
But again we can use the Comparison Test to show that
∞
X
an
n=1
converges.
2) If L = 0, then we can find a cut off N ∈ N so that if n ≥ N, then
an
0< <1
bn
or equivalently that
0 < an < bn .
∞
P P∞
In this case, if bn converges, then an converges as well by the
n=1 n=1
∞
P ∞
P
Comparison Test. Equivalently, if an diverges, then so does bn .
n=1 n=1

3) If L = ∞, then we can find a cut off N ∈ N so that if n ≥ N, then

an
>1
bn
or equivalently that
bn < an .
∞
P ∞
P
If an converges, then bn converges by the Comparison Test.
n=1 n=1
∞
P ∞
P
Equivalently, if bn diverges, then so does an .
n=1 n=1

Calculus 2 (B. Forrest)2

Section 5.5: Positive Series 187

Remarks:
We can informally summarize why the Limit Comparison Test works.
If lim an
= L where 0 < L < ∞, then for large n we have
n→∞ bn

an
L
bn
or
an Lbn .
∞
P ∞
P
This suggests that an converges if and only if Lbn converges. However, the
n=1 n=1
∞
properties of convergent series show that if 0 < L < ∞, then
P
Lbn converges if and
n=1
∞
P ∞
P
only if bn converges. Combining these statements gives us that an converges if
n=1 n=1
∞
P
and only if bn converges.
n=1

When lim bann = L where 0 < L < ∞, we say that an and bn have the same order of
n→∞
magnitude. We write
an ≈ bn .

The Limit Comparison Test says that two positive series with terms of the same
order of magnitude will have the same convergence properties.
If lim abnn = 0, then bn must eventually be much larger than an . In this case, we write
n→∞
an bn and we say that the order of magnitude of an is smaller than the order of
magnitude of bn .
∞
P ∞
P
In this case, if the smaller series an diverges to ∞, it would make sense that bn
n=1 n=1
also diverges to ∞.
Finally, if lim = ∞, then an must eventually be much larger than bn . That is
an
n→∞ bn
∞
P
bn an . This time, if the larger series an converges, it would make sense that
n=1
∞
P
bn would converge as well.
n=1

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 188

EXAMPLE 12 Let an = 2n
n3 −n+1
and bn = 1
n2
. Then
2n
an n3 −n+1
= 1
bn n2
2n3
=
n3 −
3 
 n+1 
n  2
= 3 

n 1 − n2 +
1 1

n3
2
=
1 − n2 + n13
1

Therefore,
an 2 2
lim = lim = = 2.
n→∞ bn n→∞ 1 − 1 + 1 1
n2 n3

This confirms for large n that an = 2n

n3 −n+1
2bn = 2
n2
.
∞
P 1
Since n2
converges, the first statement in the Limit Comparison Test shows that
n=1
∞
P 2n
n3 −n+1
converges as we expected.
n=1

∞
P 1
EXAMPLE 13 Show that sin n
diverges.
n=1

It can be shown that for any 0 < x ≤ 1 that 0 < sin(x) < x and hence that
!
1 1
0 < sin <
n n
∞
P 1
for each n ∈ N. However, since n
diverges, we cannot use the Comparison Test
n=1
∞
sin n1 diverges. (Why?)
P
directly to show that
n=1

Recall that the Fundamental Trigonometric Limit states that as x → 0, sin(x)

x
→ 1. As
1
n → ∞, we have n → 0. Therefore, by the Sequential Characterization of Limits

sin 1n
lim 1 = 1.
n→∞
n

If we let an = sin 1
n
and bn = 1n , then we have just shown that
an
lim = 1.
n→∞ bn
∞ ∞
P 1 P 1
Since n
diverges, the Limit Comparison Test shows that sin n
diverges.
n=1 n=1

Calculus 2 (B. Forrest)2

Section 5.6: Integral Test for Convergence of Series 189

5.6 Integral Test for Convergence of Series

∞ ∞
P 1 P 1
We have seen that the series n
diverges while the series n2
converges. The
n=1 n=1
∞
P 1
Comparison Test can then be used to show that if p ≥ 2 the series np
converges,
n=1
∞
P 1
while if p ≤ 1 the series np
diverges. We are not yet able to determine what
n=1
happens when 1 < p < 2 since the Comparison Test fails in this case since
1
n2
< n1p < 1n for n > 1, and the convergence of a series with smaller terms or the
divergence of a series with larger terms does not help us determine whether a
particular series converges or diverges.
It turns out that we can use improper integrals to establish the convergence or
divergence of this remaining case. To see how we do this, we will consider the
∞
P 1
series 3 .
n=1 n 2

∞
P 1
EXAMPLE 14 Show that 3 converges.
n=1 n 2
∞ ∞
P 1 P 1
We begin by noting that the series 3 converges if and only if the series 3
n=1 n 2 n=2 n 2
converges. Next we will consider the function f (x) = 3 . This function is
1
x2
continuous on [1, ∞) and is decreasing on this interval. You can verify the last
−5
statement by noting that the derivative is f 0 (x) = − 23 x 2 , which is negative if x > 0.

f (x) = 1
3
x2
The function appears as follows:

1
Observe that on the interval [1, 2], f has a minimum value of 3 at the
22
right-endpoint x = 2. Therefore,
1
f (x) ≥ 3
22
for all x ∈ [1, 2].

From this fact and the following diagram,

we see that the area of the rectangle with f (x) = 1
3
x2
height 13 and width 1 is less than the
2 2R
2 1
integral 1 13 dx. 3
22
1
3
x2 22
1 2

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 190

Similarly, we see that f (x) = 1

3
x2
Z 3
1 1 1
3
+ 3
≤ 3
dx. 1
2 2 3 2 1 x2 3 1
22 3
32
1 2 3
Continuing on we get that for any k ∈ N, k ≥ 2,

k
X 1
Sk = 3
n=2 n2 f (x) = 1
3
1 1 1 x2
= 3
+ 3 + ··· + 3
22 32 k2
Z k
1
≤ 3
dx
1 x2 1 1 1 1 ... ... ... 1 ... ... ... ... 1
3 3 3 3 3 3
22 32 42 52 n2 k2

1 2 3 4 5 n−1 n k−1 k

We know that
Z ∞ Z b
1 1
3
dx = lim dx 3
1 x 2 b→∞ 1 x2
1 b

= lim −2x− 2
b→∞ 1
" #
−2 2
= lim √ + √
b→∞ b 1
= 2

This shows that for every k,

k Z k
X 1 1
Sk = 3
≤ 3
dx < 2.
n=2 n 2 1 x2
Therefore, {S k } is an increasing sequence that we have just seen is bounded. The
∞
P 1
Monotone Convergence Theorem tells us that it must converge. Therefore, 3
n=2 n 2
∞
P 1
converges. Finally, this shows that 3 also converges as we expected.
n=1 n 2

Calculus 2 (B. Forrest)2

Section 5.6: Integral Test for Convergence of Series 191

∞
P 1
EXAMPLE 15 We can use a similar argument to provide us another way to show that n
diverges.
n=1

Let f (x) = 1x . Then we know that

Z ∞ Z b
1 1
dx = lim dx
1 x b→∞ 1 x
b
= lim ln(x)
b→∞ 1
= lim [ln(b) − ln(1)]
b→∞
= ∞

R∞ 1
so that 1 x
dx diverges.
Since f (x) = 1x is decreasing, the maximum value for the function on an interval of
the form [n, n + 1] occurs at the left-hand endpoint n with f (n) = n1 . It follows that
R n+1 1
n x
dx is smaller than the area of the rectangle with height f (n) = n1 and base of
width 1 between n and n + 1.

1
f (x) =
x

1 1 1 1 1 1
... ... ...
1 2 3 4 n k
1 2 3 4 5 n n+1 k k+1

Moreover, as the diagram illustrates, for each k ∈ N

Z k+1
1 1 1 1 1
+ + + ··· + > dx.
1 2 3 k 1 x
R∞ 1
R k+1 1
However, because 1 x
dx diverges, 1 x
dx → ∞ as k → ∞. This means that
k
X 1 1 1 1 1
Sk = = + + + ··· +
n=1
n 1 2 3 k
∞
P 1
must also grow toward ∞ as k gets large. Therefore, n
diverges.
n=1

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 192

Remark: It turns out that the process we have used in the last two examples
provides us with powerful tools for studying the convergence and divergence of
many important series. In general, we will assume that

1. f is continuous on [1, ∞).

2. f (x) > 0 on [1, ∞).
3. f is decreasing on [1, ∞).

Let an = f (n). Then just as was the case for f (x) = 13 , for any n ∈ N, n ≥ 2, the
x2
minimum value for the function f (x) on the interval [n − 1, n] is at the right-hand
endpoint and as such is f (n) = an . Again, as in the case of f (x) = 13 , for any k ∈ N,
x2
k ≥ 2, we have
Xk Z k
an = a2 + a3 + · · · + ak ≤ f (x) dx.
n=2 1

f (2)

f (3)
y = f (x)
f (4)
f (5)

f (n)
f (k)
a2 a3 a4 a5 ......... an ............ ak

1 2 3 4 5 n−1 n k−1 k

R∞
Therefore, if 1
f (x) dx converges, then
k
X Z ∞
an ≤ f (x) dx < ∞
n=2 1

for each k. Using the Monotone Convergence Theorem, this shows that if
R∞ ∞
P ∞
P
1
f (x) dx converges, then so does an . Finally, we get that an also converges.
n=2 n=1
k Rk
an = a2 + a3 + · · · + ak ≤
P
As we have just seen 1
f (x) dx. From this observation
n=2
we get that
k
X k
X
an = a1 + an
n=1 n=2
Z k
≤ a1 + f (x) dx.
1

Calculus 2 (B. Forrest)2

Section 5.6: Integral Test for Convergence of Series 193

R∞
Allowing k to approach ∞, we see that if 1
f (x) dx < ∞ converges, then
∞
X Z ∞
an ≤ a1 + f (x) dx.
n=1 1

Again, just as was the case for f (x) = 1x , for any n ∈ N, the maximum value for the
function f (x) on the interval [n, n + 1] is at the left-hand endpoint and as such is
f (n) = an . It follows that for any k ∈ N, we have
Z k+1 k
X
f (x) dx ≤ a1 + a2 + a3 + · · · + ak = an .
1 n=1

f (1)

f (2)
y = f (x)
f (3)
f (4)

f (n)
f (k)
a1 a2 a3 a4 ... ... an ... ak

1 2 3 4 5 n n+1 k k+1

R∞ ∞
P
This means that if 1
f (x) dx diverges to ∞, then an must also diverge or
n=1
P∞ R∞
equivalently that if an converges, then so does 1
f (x) dx.
n=1

P∞
Combining what we have done so far we get that an converges if and only if
R∞ n=1

1
f (x) dx converges!
Note: We have just seen how improper integrals can help us analyze the growth
rates of the partial sums of a series. In fact, so long as the series arises from a
function f with the stated properties we have that for each k ∈ N that
Z k+1 k
X Z k
f (x) dx ≤ an ≤ a1 + f (x) dx.
1 n=1 1

∞
P
Assume now that an converges to S . Then by allowing k to approach ∞ in the
n=1
previous inequality we get
Z ∞ Z ∞
f (x) dx ≤ S ≤ f (x) dx + a1 .
1 1

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 194

Unfortunately, if a1 is large, this estimate for S is rather crude. The good news is
that we can do better!
Observe that since the terms in the series are positive, we have

0 ≤ S − Sk
X∞ k
X
= an − an
n=1 n=1
X∞
= an
n=k+1

This means that estimating how close the partial sum is to the final limit is
equivalent to estimating how large is the sum of the tail of the series.
However, as the following diagram shows
∞
X Z ∞
0 ≤ S − Sk = an ≤ f (x) dx.
n=k+1 k

y = f (x)

............. ak+1 ak+2 ak+3 ak+4 ... ... an+1 ... .............
k k+1 k+2 k+3 k+4 n n+1

The previous discussion leads us to the following important test for convergence.

THEOREM 8 Integral Test for Convergence

Assume that

1. f is continuous on [1, ∞),

2. f (x) > 0 on [1, ∞),

3. f is decreasing on [1, ∞), and
4. ak = f (k).

Calculus 2 (B. Forrest)2

Section 5.6: Integral Test for Convergence of Series 195

n
For each n ∈ N, let S n =
P
ak . Then
k=1

i) For all n ∈ N, Z n+1 Z n

f (x) dx ≤ S n ≤ a1 + f (x) dx.
1 1

∞
P R∞
ii) ak converges if and only if 1
f (x) dx converges.
k=1
∞
P
iii) In the case that ak converges, then
k=1

Z ∞ ∞
X Z ∞
f (x) dx ≤ ak ≤ a1 + f (x) dx
1 k=1 1

and Z ∞ Z ∞
f (x) dx ≤ S − S n ≤ f (x) dx,
n+1 n

∞ R∞
where S =
P
ak . (Note that by (ii), n
f (x) dx exists.)
k=1

Note: In the case where the series and the improper integral converge, they do not
have to converge to the same value. Furthermore, all of the conditions are important,
particularly that f is eventually decreasing (or at least non-increasing). Otherwise,
the series and the improper integral could converge or diverge independent of one
another.
The conditions of the Integral Test do not have to hold on all of [1, ∞) for this
analysis to be useful. What is really important is that they hold from some point
onward. In fact, if these three conditions hold on the interval [m, ∞) for some
positive integer m, then we can conclude that
∞
X Z ∞
an converges if and only if f (x) dx converges.
n=m m

∞
P 1
We have essentially used the Integral Test to show that 3 converges and that
n=1 n 2
∞ ∞ ∞
P 1 P 1 P 1
n
diverges. Since 3 converges, the Comparison Test shows that np
n=1 n=1 n 2 n=1
∞
3 P 1
converges for any p ≥ 2
and we know from before that np
diverges if p ≤ 1. We
n=1
still don’t know what happens if 1 < p < 23 . The Integral Test can help us fill in this
gap.

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 196

THEOREM 9 p-Series Test

∞
1
converges if and only if p > 1.
P
The series np
n=1

PROOF
∞
P 1
We already know that np
converges if p ≥ 2 and it diverges for p ≤ 1. We can use
n=1
the Integral Test to address the missing interval.
Consider, the function f (x) = x1p . It is easy to see that if p > 0, then f (x) satisfies the
three hypotheses of the Integral Test. If we let an = f (n) = n1p , then for any p > 0,
∞ R∞ 1
P 1
the series n p will converge if and only if the improper integral 1 xp
dx
n=1
converges. However, we know that
Z ∞
1
dx
1 xp
∞
converges if and only if p > 1. Since we know that 1
P
np
diverges if p ≤ 0, this tells
n=1
∞
1
will converge if and only if p > 1.
P
us that np
n=1

EXAMPLE 16 Show that ∞

X 1
n −n+1
3
2
n=1
converges.
We will use the Limit Comparison Test and the p-Series Test.
A close look at the terms of the series shows that for large n,
1 1
.
n −n+1
3 3
2 n2

In fact
1 3
3
n2 − n + 1
lim n2
1
= lim 3
n→∞ n→∞ n2
3
n 2 −n+1
1 1
= lim (1 − 1
+ 3
)
n→∞ n 2 n2
= 1

Calculus 2 (B. Forrest)2

Section 5.6: Integral Test for Convergence of Series 197

∞
P 1
Therefore, the Limit Comparison Test shows that 3 converges if and only if
n=1 n 2 −n+1
∞ ∞
P 1 P 1
3 converges. However, the p-Series test shows that 3 converges. We can
n=1 n 2 n=1 n 2
∞
P 1
conclude that 3 converges as well.
n=1 n 2 −n+1

∞
P 1
EXAMPLE 17 Determine whether the series n ln(n)
converges.
n=2

Consider the function f (x) = 1

x ln(x)
. Then

1. f (x) is continuous on [2, ∞).

2. f (x) > 0 on [2, ∞).
3. f (x) is decreasing on [2, ∞).

While all of these claims are easy to verify, we will explicitly show that condition 3
holds. To do this, note that from the quotient rule it follows that
(x ln(x))(0) − (ln(x) + 1)(1)
f 0 (x) =
x2 (ln(x))2
−(ln(x) + 1)
=
x2 (ln(x))2
< 0

if x ≥ 2.
This shows that f 0 (x) < 0 for every x ∈ [2, ∞). Hence, f (x) is decreasing on [2, ∞).
Alternatively, we could have observed that the function x ln(x) is increasing on
1
[2, ∞) and as such its reciprocal x ln(x) must be decreasing.
∞
P 1
We can apply the Integral Test to see that n ln(n)
converges if and only if
R∞ 1 n=2

2 x ln(x)
dx converges.
Now Z ∞ Z b
1 1
dx = lim dx.
2 x ln(x) b→∞ 2 x ln(x)
Rb
To evaluate 1
2 x ln(x)
dx use the substitution u = ln(x), du = dx
x
to get
Z b Z ln(b)
1 1
dx = du
2 x ln(x) ln(2) u
ln(b)
= ln(u)
ln(2)
= ln(ln(b)) − ln(ln(2))

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 198

Therefore,
Z ∞ Z b
1 1
dx = lim dx
2 x ln(x) b→∞ 2 x ln(x)

= lim ln(ln(b)) − ln(ln(2))

b→∞

= ∞
R∞ ∞
1 P 1
Since 2 x ln(x)
dx diverges, the Integral Test shows that n ln(n)
also diverges.
n=2

∞
P 1
EXAMPLE 18 Show that n(ln(n))2
converges.
n=2

Let f (x) = 1
x(ln(x))2
. It is easy to verify that

1. f is continuous on [2, ∞).

2. f (x) > 0 on [2, ∞).

Moreover, since x(ln(x))2 is increasing for x ≥ 2, it follows that f (x) = x(ln(x))

1
2 is

decreasing. Alternatively, f (x) = − x2 (ln(x))3 which is negative for x ≥ 2. Therefore,

0 2+ln(x)

∞
P 1
the Integral Test can be used to conclude that n(ln(n))2
converges if and only if
R∞ 1 n=2

2 x(ln(x))2
dx converges.
Rb 1
2 dx, use the substitution u = ln(x), du = x to get
dx
To evaluate 2 x(ln(x))
Z b Z ln(b)
1 1
dx = du
2 x(ln(x))2 ln(2) u2
1 ln(b)

= −
u ln(2)
1 1
= −
ln(2) ln(b)

Therefore,
Z ∞ Z b
1 1
dx = lim dx
2 x(ln(x))2 b→∞ 2 x(ln(x))2
1 1
= lim −
b→∞ ln(2) ln(b)
1
=
ln(2)

Calculus 2 (B. Forrest)2

Section 5.6: Integral Test for Convergence of Series 199

R∞ ∞
1 P 1
Since 2 x(ln(x))2
dx converges, so does n(ln(n))2
.
n=2

5.6.1 Integral Test and Estimation of Sums and Errors

The Integral Test is a powerful tool for determining the convergence or divergence
of many important series. However, much more can be said. In fact, we have seen
that if

1. f is continuous on [1, ∞),

2. f (x) > 0 on [1, ∞),
3. f (x) is decreasing on [1, ∞),
4. an = f (n), and
k
5. S k =
P
an ,
n=1

then Z k+1 Z k
f (x) dx ≤ S k ≤ f (x) dx + a1 .
1 1
Therefore, we can use integration to estimate the value of the partial sum S k of the
∞
P
series an .
n=1

∞
P 1
EXAMPLE 19 How large is the k-th partial sum S k of the harmonic series n
?
n=1

In this case, f (x) = 1

x
and a1 = 1. But we know that
Z k+1 Z k
1 1 1 1 1
dx ≤ S k = + + · · · + ≤ dx + a1
1 x 1 2 k 1 x
Since Z c
1 c
dx = ln(x) = ln(c) − ln(1) = ln(c)
1 x 1

for any value of c > 0, this inequality becomes

1 1 1
ln(k + 1) ≤ + + · · · + ≤ ln(k) + 1
1 2 k

If we let k = 1000, we see that ln(1001) = 6.908754779 and

ln(1000) = 6.907755279. This tells us that
1 1 1
6.908754779 ≤ + + ··· + ≤ 7.907755279
1 2 1000

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 200

If k = 109 , then the inequality would show us that

1 1 1
20.72326584 ≤ + + · · · + 9 ≤ 21.72326584
1 2 10

What is remarkable about these estimates is not their accuracy in estimating a

particular partial sum, but that while we know the harmonic series will diverge to ∞,
after 1000 terms the sum has not yet exceeded 8, and after 1 billion terms, the sum is
still less than 22! This shows us that we would not have been able to guess that this
series diverged by testing some partial sums on a computer.
To emphasize home this last comment consider how many terms one would have to
add together so that
1 1 1
S k = + + · · · + > 1000.
1 2 k
We know that
S k ln(k).
As such to have S k > 1000 we would need enough terms so that

1000 < ln(k).

This means that we should have

k > e1000 10434 .

This is an enormous number.

∞
P 1
EXAMPLE 20 The p-Series Test shows that the series n4
converges. Let
n=1

k ∞
X 1 X 1
Sk = and S = .
n=1
n4 n=1
n4

Estimate the error in using the first 100 terms in the series to approximate S . That is,
estimate |S − S 100 |.
The first observation we make is that since all the terms are positive we have that
S − S 100 > 0 and hence that

|S − S 100 | = S − S 100 .

But we also know from the Integral Test that

Z ∞ Z ∞
1 1
4
dx ≤ S − S 100 ≤ 4
dx.
101 x 100 x

Calculus 2 (B. Forrest)2

Section 5.6: Integral Test for Convergence of Series 201

But for any m ∈ N, we have that

Z ∞ Z b
1 1
4
dx = lim dx
m x b→∞ m x4
1 b
= lim − 3
b→∞ 3x m
1 1
= lim 3
− 3
b→∞ 3m 3b
1
=
3m3

Substituting m = 101 and m = 100 respectively in 1

3m3
we get that

1 1
3
≤ S − S 100 ≤
3(101) 3(100)3
or
3.2353 × 10−7 ≤ S − S 100 ≤ 3.3333 × 10−7 .

Now if we calculate S 100 we get S 100 = 1.082322905 (up to 9 decimal places) and
hence our prediction would be that

1.082323229 ≤ S ≤ 1.082323238

In fact, it is actually known that

∞
X 1 π4
S = = = 1.082323234
n=1
n4 90

which does indeed lie within our range.

∞
P 1
EXAMPLE 21 The Integral Test tells us that the series n(ln(n))2
converges. But it can also show us
n=2
∞
that the series converges very slowly. For example, suppose that S = 1
P
n(ln(n))2
and
n=2
k
Sk = 1
P
n(ln(n))2
. Then we know that
n=2
Z ∞
1
S − Sk dx.
k x(ln(x))2
But
Z b Z ln(b)
1 1
dx = du
k x(ln(x))2 ln(k) u2
ln(b)
1
= −
u ln(k)

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 202

1 1
= −
ln(k) ln(b)

Therefore,
Z ∞ Z b
1 1
dx = lim dx
k x(ln(x))2 b→∞ k x(ln(x))
2

1 1
= lim −
b→∞ ln(k) ln(b)
1
=
ln(k)

This means if we want

1
S − Sk <
100
we would choose k so that
1 1
<
ln(k) 100
or equivalently that
100 < ln(k) ⇐⇒ k > e100 .
1
Therefore, to have the partial sum S k approximate the final sum to within only 100
we would need roughly e100 1043 terms.

5.7 Alternating Series

∞
P
We have seen that for a series an with positive terms (in other words, an ≥ 0 for
n=1
∞
P
all n) that an will either converge if the terms are small enough or it will diverge
n=1
to ∞.
Without the assumption that an ≥ 0 for all n, the situation can become much more
complicated. In this section, we will look at one more class of series whose
behavior is particularly nice.

DEFINITION Alternating Series

A series of the form
∞
X
(−1)n−1 an = a1 − a2 + a3 − a4 + · · ·
n=1

or of the form ∞
X
(−1)n an = −a1 + a2 − a3 + a4 − · · ·
n=1
is said to be alternating provided that an > 0 for all n.

Calculus 2 (B. Forrest)2

Section 5.7: Alternating Series 203

EXAMPLE 22 The most important example of an alternating series is

∞
X 1 1 1 1
(−1)n−1 = 1 − + − + ··· .
n=1
n 2 3 4

∞
(−1)n−1 1n converge?
P
Problem: Does the series
n=1

For positive series, we saw that two series with terms of the same order of magnitude
would either both converge or both diverge. If an = (−1)n−1 1n and bn = 1n , then

1
| an |= =| bn |
n
so the terms an and bn are of the same order of magnitude. Our rule of thumb would
∞ ∞
1
diverges, we might expect that (−1)n−1 1n would also
P P
suggest that since n
n=1 n=1
∞
(−1)n−1 n1
P
diverge. However, is not a positive series.
n=1

Pj ∞
Let S j = (−1)n−1 n1 be the j-th partial sum of the series (−1)n−1 n1 . Then
P
n=1 n=1

S 1 = 1.

We can represent this graphically by beginning at 0 and then moving 1 unit to the
right to reach S 1 = 1.

Next we have that

1 1 1
S2 = 1 − = S1 − = .
2 2 2

Again, this can be represented graphically. This time we begin at S 1 = 1 and then
move 12 units to the left to reach S 2 = 12 .

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 204

Notice that since we have moved to the left, we have S 2 < S 1 , but since 1
2
was
smaller than 1, we did not get back to 0. This means that

0 < S 2 < S 1.

In the third step, we get

1 1 1 5
S3 = 1 − + = S2 + = .
2 3 3 6

To reach S 3 we move to the right a total of 13 units. It is also very important to note
that because 13 < 12 , we do not get all the way back to S 1 . That is,

0 < S 2 < S 3 < S 1.

The fourth step will take us to the left a total of 14 units. Since our previous move to
the right was 13 units and clearly 14 < 13 , we now have

0 < S 2 < S 4 < S 3 < S 1.

After two more steps, we see a clear pattern emerging with

0 < S 2 < S 4 < S 6 < S 5 < S 3 < S 1.

Calculus 2 (B. Forrest)2

Section 5.7: Alternating Series 205

The terms with even indices are getting larger, while the terms with odd indices are
decreasing. In fact, if we continue on we will see a picture that looks as follows:

If we denote the odd indexed terms by S 2k−1 for k = 1, 2, 3, · · · and the even indexed
terms by S 2k for k = 1, 2, 3, · · · , then after 2k steps we have
0 < S 2 < S 4 < S 6 < · · · < S 2k < S 2k−1 < · · · < S 5 < S 3 < S 1 .

Eventually, we will have two sequences consisting of the odd partial sums {S 2k−1 }
with
S 1 > S 3 > S 5 > · · · > S 2k−1 > S 2k+1 > · · · > 0

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 206

and the even partial sums {S 2k } with

S 2 < S 4 < S 6 < · · · < S 2k < S 2k+2 < · · · < 1.

Both of the sequences are monotonic and bounded. The Monotone Convergence
Theorem shows that they both converge. Let

lim S 2k−1 = M
k→∞

and
lim S 2k = L.
k→∞

Since the elements of {S 2k−1 } decrease to M and the elements of {S 2k } increase to L,

we can show that
L≤M
.

Moreover, since the odd terms and the even terms combine to give the entire
sequence {S j } of partial sums, to show that {S j } converges we need only show that
M = L. The key observation is that

S 2k < L ≤ M < S 2k−1

for every k. But to get to S 2k from S 2k−1 , we subtract 2k1 . This is equivalent to stating
that
1
S 2k−1 − S 2k = .
2k
Moreover, the distance between M and L is less than the distance from S k−1 to S 2k .
Putting this all together gives us
1
0≤ M−L≤
2k
for every k ∈ N. Since 2k1 can be made as small as we would like, the last statement
can only be true if L = M.

Calculus 2 (B. Forrest)2

Section 5.7: Alternating Series 207

We have just succeeded in showing that the sequence {S j } of partial sums of the
∞ ∞
series (−1)n−1 n1 converges. This means that (−1)n−1 n1 also converges.
P P
n=1 n=1

There is one more observation that we can make. The process above shows that any
two consecutive partial sums S m and S m+1 will always sit on opposite sides of the
∞
final sum. If we denote (−1)n−1 n1 by L, then this means that the distance from S m
P
n=1
to L is less than the distance from S m to S m+1 .
1
However, to get to S m+1 from S m , we either add or subtract m+1 units depending on
whether m is odd or even. Either way this tells us that the distance from S m to S m+1
1
is exactly m+1 . Therefore, we get that for any m,

1
| S m − L |≤| S m − S m+1 |= .
m+1

∞
(−1)n−1 n1 with an
P
That is, the partial sum S m approximates the sum of the series
n=1
1
error of less than m+1
.

Important Observation: A careful observation of the analysis of the series

∞
(−1)n−1 n1 will show that we used the following properties of the sequence an = 1
P
n
n=1
to show that the series converges:

1. an > 0 for all n.

2. an+1 ≤ an for all n.

3. lim an = 0.
n→∞

In fact our analysis is valid for any alternating series with these properties. We can
summarize this in the following theorem:

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 208

THEOREM 10 Alternating Series Test (AST)

Assume that

1. an > 0 for all n.

2. an+1 ≤ an for all n.

3. lim an = 0.
n→∞

Then the alternating series

∞
X
(−1)n−1 an
n=1
converges.
k ∞
If S k = (−1)n−1 an , then S k approximates the sum S = (−1)n−1 an with an error
P P
n=1 n=1
that is at most ak+1 . That is
| S k − S |≤ ak+1 .

PROOF
We will show that the two subsequences of partial sums {S 2k−1 } and {S 2k } converge
to the same limit L.
We first prove that both subsequences are monotonic. We have

S 2(k+1)−1 − S 2k−1 = S 2k+1 − S 2k−1

2k+1
X 2k−1
X
= n−1
(−1) an − (−1)n−1 an
n=1 n=1
= (−1) a2k + (−1)
2k−1 (2k+1)−1
a2k+1
= −a2k + a2k+1
≤ 0

This shows that {S 2k−1 } is decreasing. Similarly,

S 2(k+1) − S 2k = S 2k+2 − S 2k
2k+2
X 2k
X
= n−1
(−1) an − (−1)n−1 an
n=1 n=1
= (−1) a2k+1 + (−1)(2k+2)−1 a2k+2
(2k+1)−1

= a2k+1 − a2k+2
≥ 0

This shows that {S 2k } is increasing.

Calculus 2 (B. Forrest)2

Section 5.7: Alternating Series 209

Now we will show that both subsequences are bounded. We have

S 2k−1 = (a1 − a2 ) + (a3 − a4 ) + · · · + (a2k−3 − a2k−2 ) + a2k−1

≥ 0 + 0 + · · · + 0 + a2k−1
≥ 0

and

S 2k = a1 − (a2 − a3 ) − (a4 − a5 ) − · · · − (a2k−2 − a2k−1 ) − a2k

≤ a1 − 0 − 0 − · · · − 0 − a2k
≤ a1

Hence {S 2k−1 } is bounded below by 0 and {S 2k } is bounded above by a1 . By the

Monotone Convergence Theorem, lim S 2k−1 = L ∈ R and lim S 2k = M ∈ R.
k→∞ k→∞

Next we show that L = M. To see why this is the case we note that

X 2k 2k−1
X
|S 2k − S 2k−1 | = (−1)n−1 an − (−1)n−1 an = a2k .
n=1 n=1

Then

|M − L| = lim |S 2k − S 2k−1 |
k→∞
= lim a2k
k→∞
= 0

so L = M and the series converges. Next we let

∞
X
S = (−1)n−1 an .
n=1

Finally, since {S 2k } increases to S and {S 2k−1 } decreases to S , we have

S 2k ≤ S ≤ S 2k−1

for all k ∈ N. This shows that S sits between S k and S k+1 for each k ∈ N.
Thus we get that

|S k − S | ≤ |S k − S k+1 | = |(−1)k ak+1 | = ak+1

for all k ∈ N. This proves the last part of the theorem.

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 210

Remark: There are a few important observations we should make concerning this
theorem.
∞
(−1)n−1 n1 is the most important of the alternating
P
1. Historically, the series
n=1
series. For this reason it is usually called The Alternating Series.
2. All three of the conditions in the statement of the theorem are important for
the theorem to be valid. However, it is actually sufficient for the first two to
hold for all n ≥ M where M is some fixed integer. In this case, the error
estimate will only be valid when k ≥ M.
∞
(−1)n−1 an , we have
P
3. For the series
n=1
∞
X
0 ≤ S 2 ≤ S 4 ≤ S 6 ≤ · · · ≤ S 2k ≤ · · · (−1)n−1 an
n=1
· · · ≤ S 2k−1 ≤ · · · ≤ S 5 ≤ S 3 ≤ S 1 = a1 .
Therefore, if j is even, S j under estimates the sum and if j is odd, S j over
estimates the sum.
4. With the obvious changes, the theorem remains valid for series of the form
X∞
(−1)n an
n=1

provided that the three assumptions on the sequence {an } hold.

∞
(−1)n−1 n13 converges and determine how large k must be so
P
EXAMPLE 23 Show that the series
n=1
that ∞
X 1
| Sk − (−1)n−1 |< 10−6 .
n=1
n3

Let an = 1
n3
. Then
1 1
<
0<
(n + 1)3 n3
∞
1
= 0. Therefore, the series (−1)n−1 n13 converges by the Alternating
P
and lim 3
n→∞ n n=1
Series Test.
The Alternating Series Test tells us that
∞
X 1 1
| Sk − (−1)n−1 |≤ a k+1 = .
n=1
n3 (k + 1)3

We must choose k large enough so that

1
< 10−6 .
(k + 1)3

Calculus 2 (B. Forrest)2

Section 5.7: Alternating Series 211

After cross-multiplying this inequality is equivalent to

106 < (k + 1)3 .

Taking cube roots of both sides gives

102 < k + 1

and hence that

99 < k.
Therefore, if k ≥ 100, then
∞
X 1
| Sk − (−1)n−1 3
|< 10−6 .
n=1
n

It is not surprising that this series converges since the terms become quite small very
∞
P 1
quickly. In fact, the series n3
also converges.
n=1

We can use the Integral Test to see how many terms are needed so that
∞
X 1
| Tk − 3
|< 10−6
n=1
n

k
where T k = 1
P
n3
. The Integral Test tells us that
n=1

∞ Z ∞
X
n−1 1 1
| Tk − (−1) |< dx.
n=1
n3 k x3
Now
Z ∞ Z b
1 1
dx = lim dx
k x3 b→∞ k x
3

−1 b

= lim 2
b→∞ 2x k
!
−1 1
= lim 2
+ 2
b→∞ 2b 2k
1
=
2k2

We would like to choose k large enough so that

1
< 10−6
2k2

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 212

or equivalently so that
106
< k2 .
2
Taking square roots of both sides of this inequality shows us that we require
1000
√ <k
2

and since √= 707.1068, we get that k must be at least 708 to ensure that the error
1000
2
∞
in approximating (−1)n−1 n13 by T k is no more than 10−6 .
P
n=1

EXAMPLE 24 The previous example illustrates the fact that alternating series converge much more
quickly than positive series with terms of equal magnitude. An even more extreme
example of this phenomenon can be seen by comparing the number of terms it
would take for the partial sums of the alternating series
∞
X 1
(−1)n−1
n=2
n(ln(n))2

versus the partial sums of the positive series

∞
X 1
n=2
n(ln(n))2

to be within 10−2 of the corresponding sum.

In the case of the series ∞
X 1
(−1)n−1
n=2
n(ln(n))2
we can use the Alternating Series Test to show not only that the series converges,
but also that

k ∞
X 1 X 1 1
(−1)n−1 − (−1)n−1 ≤ .
n=2 n(ln(n)) 2
n=2
n(ln(n))2
(k + 1)(ln(k + 1))2

If we sum up to k = 14, then since

1 1 1
= = 0.009091 <
(k + 1)(ln(k + 1))2 (15)(ln(15))2 100
we will be within 10−2 of the final sum.
We can show that the series ∞
X 1
n=2
n(ln(n))2

Calculus 2 (B. Forrest)2

Section 5.8: Absolute versus Conditional Convergence 213

converges using the Integral Test. Moreover, we have already seen that the Integral
Test shows us that to approximate the final sum within a tolerance of 10−2 we would
need to use approximately e100 terms. This number is larger than 1043 which as we
mentioned before is unimaginably big! In particular, this example shows us that we
could not find a reasonable approximation to this latter sum by simply asking a
computer to add the terms one by one.

5.8 Absolute versus Conditional Convergence

∞
P 1
Recall that the Harmonic Series n
diverges, while the Alternating Series
n=1
∞
(−1)n−1 1n converges even though the terms have the same order of magnitude. In
P
n=1
fact,
1 = 1 = (−1)n−1 1
n n n
and the second series converges because of the cancellation that occurs as the terms
of the series alternate in sign.
∞ ∞
1
(−1)n−1 n12 converge because 1
P P
On the other hand, both n2
and n2
is small enough!
n=1 n=1

We will see that there are differences between series that converge because the
magnitude of the terms is small and those that rely on cancellation. We begin with
the following definition:

DEFINITION Absolute vs Conditional Convergence

∞
P
A series an is said to converge absolutely if
n=1

∞
X
| an |
n=1

converges.
∞
P
A series an is said to converge conditionally if
n=1

∞
X
| an |
n=1

diverges while
∞
X
an
n=1
converges.

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 214

∞
(−1)n−1 1n converges by the Alternating Series Test. However,
P
EXAMPLE 25 The series
n=1
∞ ∞ ∞
(−1)n−1 1n 1
(−1)n−1 n1 is conditionally convergent.
P P P
| |= n
diverges so that
n=1 n=1 n=1

∞
P
EXAMPLE 26 If an ≥ 0 for each n, then | an |= an , so the series an either converges absolutely or
n=1
it diverges.

∞ ∞ ∞
( −1 )n converges absolutely since | ( −1 )n |= ( 21 )n converges by the
P P P
EXAMPLE 27 The series 2 2
n=0 n=0 n=0
Geometric Series Test.

The terminology for absolute versus conditional convergence seems to suggest that
if a series converges absolutely it should also converge without the absolute values.
∞
P ∞
P
Question: Is it possible that | an | converges while an does not?
n=1 n=1

It turns out that such a scenario is not possible as the following theorem illustrates.

THEOREM 11 Absolute Convergence Theorem

∞
P ∞
P
If | an | converges, then so does an .
n=1 n=1
∞
P ∞
P
Note: The sums | an | and an will converge to different values unless an ≥ 0
n=1 n=1
for all n.

PROOF
The proof is an application of the Comparison Test.
∞
P
Assume that | an | converges. Then so does
n=1

∞
X
2 | an | .
n=1

Next observe that

0 ≤ an + | an |≤ 2 | an | .
(This is true because if an ≥ 0, then an =| an |, so an + | an |= 2 | an | and if an < 0,
then an = − | an | so an + | an |= 0.)

Calculus 2 (B. Forrest)2

Section 5.8: Absolute versus Conditional Convergence 215

Since 0 ≤ an + | an |≤ 2 | an |, we can apply the Comparison Test to show that

∞
X
(an + | an |)
n=1

converges. Finally, we have that

an = (an + | an |)− | an |

and hence ∞ ∞ ∞
X X X
an = (an + | an |) − | an | .
n=1 n=1 n=1
P∞
The two series on the right-hand side both converge. Therefore, an will also
n=1
converge.

Remark: This theorem is very useful because there are many more tests for
determining the convergence of positive series than there are for general series. In
fact, the only test we have so far that will determine if a non-positive series
converges is the Alternating Series Test. However, the conditions under which the
Alternating Series Test applies are rather restrictive.

∞
P cos(n)
EXAMPLE 28 Show that the series n2
converges.
n=1

It can be shown that as n goes from 1 to ∞ the values of cos(n) will be positive
infinitely often and negative infinitely often. Moreover, since

cos(1) = -0.540
cos(2) = -0.416
cos(3) = -0.9899
cos(4) = -0.6536
cos(5) = -0.2836

∞
P cos(n)
n2
is not an alternating series. Strictly speaking, none of the tests we have
n=1
discussed up to this point apply to this series. However, if we can show that
∞ ∞
P cos(n) converges, then P cos(n)
also converges.
n2 n2
n=1 n=1

We know that | cos(x) |≤ 1 for any value of x. Therefore, for every n

cos(n) 1
0 ≤ 2 ≤ 2 .

n n

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 216

∞
P 1
The p-Series Test shows that n2
converges. Therefore, by the Comparison Test
n=1

∞
cos(n)
X
n2
n=1

∞
P cos(n)
converges. Hence, by the Absolute Convergence Theorem, the series n2
n=1
converges.

xn
∞
P
EXAMPLE 29 Show that the series converges absolutely if | x |< 2, converges
n=0 2 (n + 1)
n
conditionally at x = −2 and diverges if x = 2.
x0n
Choose x0 with | x0 |< 2. Let an = . Then
2n (n + 1)

x0n
| an | = n

2 (n + 1)

1 x0n
=
n + 1 2n

1 x0 n

=
n+12

n
x0
≤
2

x0
If | x0 |< 2, then | 2
|< 1 and so by the Geometric Series Test,
∞ n
x0
X
2
n=0

converges. The Comparison Test shows that

∞
X x0n
2 (n + 1)
n
n=0

converges.
If x = 2, the series becomes
∞ ∞
X 2n X 1
=
n=0
2n (n + 1) n=0
n+1
1 1 1
= + + + ···
1 2 3

Calculus 2 (B. Forrest)2

Section 5.8: Absolute versus Conditional Convergence 217

which is the Harmonic Series and as such diverges.

If x = −2, the series is
∞ ∞
X (−2)n X (−1)n
=
n=0
2n (n + 1) n=0
n+1
1 1 1
= − + − ···
1 2 3

This is the Alternating Series and as such it converges conditionally.

Remark: We have seen that absolutely convergent series also converge and that
testing for absolute convergence allows us to use most of the tools we have
developed. There is one more important reason why we would want to know if a
series converges absolutely.
If we have a finite sum
a1 + a2 + a3
we can add the terms in any order and we will get the same sum. For example,

a1 + a2 + a3 = a3 + a2 + a1

and
a1 + a2 + a3 = a2 + a3 + a1 .

We would hope that this would also be true for infinite series. Unfortunately, this is
not the case. However, if the series converges absolutely then it is true. That is, no
matter how we rearrange the terms, the result will be a new series that has the
same sum as the original. Hence,

a17 + a357 + a45 + a10437 + · · · = a1 + a2 + a3 + a4 + · · · .

∞
an converges conditionally, then given any α ∈ R or
P
Alternately, if a series
n=1
∞
α = ±∞, there is a new series
P
bn consisting of exactly the same terms as our
n=1
original series except in a different order but with
∞
X
bn = α.
n=1

This means that absolutely convergent series are very stable, whereas conditionally
convergent series are not.
We can make this remark more formal.

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 218

DEFINITION Rearrangement of a Series

∞
an and a 1-1 and onto function φ : N → N, if we let
P
Given a series
n=1

bn = aφ(n) ,

then the series ∞

X
bn
n=1
∞
P
is called a rearrangement of an .
n=1

This definition leads to the following theorem.

THEOREM 12 Rearrangement Theorem

∞
P ∞
P
1) Let an be an absolutely convergent series. If bn is any rearrangement of
n=1 n=1
∞
P ∞
P
an , then bn also converges and
n=1 n=1

∞
X ∞
X
bn = an .
n=1 n=1

∞
an be a conditionally convergent series. Let α ∈ R or α = ±∞. Then
P
2) Let
n=1
∞
P ∞
P
there exists a rearrangement bn of an such that
n=1 n=1

∞
X
bn = α.
n=1

Remark: In summary, whenever you must test a series with terms of mixed signs
for convergence it is always a good idea to first check if the series converges
absolutely.

5.9 Ratio Test

Recall that the Geometric Series Test states that a geometric series
∞
X
rn
n=0

Calculus 2 (B. Forrest)2

Section 5.9: Ratio Test 219

will converge if and only if | r |< 1.

Suppose that we had a series where

an+1 1
lim = .
n→∞ an 2
Then, for a large number N, we would have
1
| aN+1 | | aN |
2
!2
1 1
| aN+2 | | aN+1 | | aN |
2 2
!3
1 1
| aN+3 | | aN+2 | | aN |
2 2
!4
1 1
| aN+4 | | aN+3 | | aN |
2 2
..
.
!k
1 1
| aN+k | | aN+k−1 | | aN |
2 2

0
1
Since | aN |= 1· | aN |= 2
| aN | this would suggest that
∞ ∞ !k
X X 1
| aN+k | | aN |
k=0 k=0
2

The series
∞ !k ∞ !k
X 1 X 1
| aN | =| aN |
k=0
2 k=0
2
converges by the Geometric Series Test. Therefore, we might expect that
∞
X
| aN+k |
k=0

also converges. Since

∞
X ∞
X
| aN+k | = | an |,
k=0 n=N
∞
P ∞
P
if | aN+k | converges, so does | an |. Finally, from the properties for series, we
k=0 n=N
∞
P
could conclude that | an | would converge.
n=1

This argument seems plausible, but can we make the argument above more
rigorous? In fact we can.

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 220

If we assume that
an+1 1
lim = ,
n→∞ an 2
then we can find an N large enough so that if n ≥ N then | aan+1 n
| approximates 21 with
an error of less than 14 . This means that for n ≥ N, | aan+1
n
| must be in the interval

, .
1 3
4 4

an+1 3
In particular, if n ≥ N we get that | an
|< 4
or equivalently, that

3
| an+1 |< | an |
4
for every n ≥ N. This shows that

3
| aN+1 | < | aN |
4
!2
3 3
| aN+2 | < | aN+1 | < | aN |
4 4
!3
3 3
| aN+3 | < | aN+2 | < | aN |
4 4
!4
3 3
| aN+4 | < | aN+3 | < | aN |
4 4
..
.
!k
3 3
| aN+k | < | aN+k−1 | < | aN |
4 4

The Geometric Series Test shows that

∞ !k ∞ !k
X 3 X 3
| aN | = | aN |
k=0
4 k=0
4

converges. Since
3
| aN+k |< ( )k | aN |
4
the Comparison Test tells us that
∞
X
| aN+k |
k=0

Calculus 2 (B. Forrest)2

Section 5.9: Ratio Test 221

also converges. From this we can conclude that

∞
X
| an |
n=0

converges.
In fact, if lim | aan+1
n
| = L and 0 ≤ L < 1, then a similar method would show that
n→∞
P∞
| an | converges.
n=0

On the other hand, if lim | an+1

an
| = 2, then for large n we would have that
n→∞

| an+1 | 2 | an | .

This means that rather than going to 0, the terms in the tail are getting larger. Since
P∞
we would then have that lim an , 0, the Divergence Test would tell us that an
n→∞ n=0
diverges.
A similar statement would hold whenever lim | an+1
an
| = L and L > 1.
n→∞

This is summarized in the next theorem which gives us one of the most important
tests for convergence of series.

THEOREM 13 Ratio Test

∞
P
Given a series an , assume that
n=0

an+1
lim =L
n→∞ an

where L ∈ R or L = ∞.
∞
1. If 0 ≤ L < 1, then
P
an converges absolutely.
n=0

∞
2. If L > 1, then
P
an diverges.
n=0

3. If L = 1, then no conclusion is possible.

Remarks:

1) If 0 ≤ L < 1, the Ratio Test shows that the given series converges absolutely
and hence that the original series also converges.
∞
an+1
| = L exists with L , 1, then the series
P
2) If lim | an
an behaves like the
n→∞ n=0
∞
Ln as far as convergence is concerned.
P
geometric series
n=0

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 222

3) While the Ratio Test is one of the most important tests for convergence, we
will see that it cannot detect convergence or divergence for many of the series
we have seen so far. In fact, it can only detect convergence if the terms an
approach 0 very rapidly, and it can only detect divergence if lim |an | = ∞.
n→∞
This means that the Ratio Test is appropriate for a very special class of series.

P∞ 1
EXAMPLE 30 Show that converges.
n=0 n!

We have already seen how this could be done using the Comparison Test. However,
the Ratio Test is perfectly suited to series involving factorials.
With an = 1
n!
, we see that
1
an+1 (n+1)!
= 1
an n!
n!
=
(n + 1)!
1
=
n+1

Therefore
an+1 1
lim = lim = 0.
n→∞ an n→∞ n + 1

P∞ 1
The Ratio Test shows that converges.
n=0 n!

EXAMPLE 31 In the previous example, we saw how that Ratio Test could be used to show that the
P∞ 1
series converges. This series actually converges very rapidly since n! grows
n=0 n!
very quickly. However, if we let
1000000n
an =
n!
the situation is quite different. For example, a10 > 1050 .
Still,
1000000n+1
an+1 (n+1)!
= 1000000n
an n!

1000000n+1 n!
=
1000000n (n + 1)!
1000000
=
n+1

Calculus 2 (B. Forrest)2

Section 5.9: Ratio Test 223

and so we again have that

an+1 1000000
lim = lim = 0.
n→∞ an n→∞ n + 1

This means that despite the enormous size of 1000000n in the numerator, n!
eventually dominates. Consequently, we can use the Ratio Test to show that
∞
P 1000000n
n!
converges.
n=0

EXAMPLE 32 Based on the two previous examples, for which values of x would the series
∞
X xn
n=0
n!

converge?
To answer this question, we first note that if x = 0
∞
X 0n
= 1 + 0 + 0 + 0 + ··· = 1
n=0
n!

so the series converges when x = 0. Next fix a value for x , 0.

xn
If an = , then
n!
n+1
x
an+1 (n+1)!
= xn
an
n!

| x |n+1 n!
=
| x |n (n + 1)!
|x|
=
n+1

Then
an+1 |x|
lim = lim = 0.
n→∞ an n→∞ n + 1

The Ratio Test shows that ∞ n

x
X
n!
n=0

converges and hence that

∞
X xn
n=0
n!
also converges.

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 224

Remark: One final observation can be made from the previous example. Since
P∞ xn
converges for any x and since the Divergence Test tells us that the terms of a
n=0 n!
convergent series must approach 0 we have the following theorem.

THEOREM 14 Polynomial vs Factorial Growth

For any x ∈ R
xn
lim = 0.
n→∞ n!

Remark: This important limit tells us that exponentials are of a lower order of
magnitude compared to factorials. That is, for any fixed x0 ∈ R, | x0 |n n!
∞
rn will diverge if | r |= 1. Therefore, since the
P
Note: We know that the series
n=0
conclusions of the Ratio Test are based on the Geometric Series Test, it might be
surprising that if L = 1, the Ratio Test would not show that the series diverges.
However, it is important to recognize that lim | aan+1
n
|= 1 does not actually mean that
n→∞
| aN+k |=| aN | for large N as would be the case if the ratio was exactly 1.
The next two examples show that when L = 1 we could have either convergence or
divergence.

∞
P 1
EXAMPLE 33 Apply the Ratio Test to the divergent series n
.
n=1

In this example, an = 1
n
so that
1
an+1 = n+1
an 1
n
n
=
n+1
1
=
1 + n1

Therefore, liman+1
= lim 1
1 = 1 and the Ratio Test fails.
n→∞ an n→∞ 1+ n

∞
P 1
EXAMPLE 34 Apply the Ratio Test to the convergent series n2
.
n=1

In this example, an = 1
n2
so that

Calculus 2 (B. Forrest)2

Section 5.9: Ratio Test 225

1
an+1 = (n+1)2
an 1
n2

n2
= 2
n + 2n + 1
1
=
1+ + 2
n
1
n2

Therefore, lim an+1

= lim 1
= 1 and the Ratio Test fails again.
n→∞ 1+ n + n2
2 1
n→∞ an

We see that the Ratio Test cannot detect convergence or divergence of many other
series similar to the previous two examples.
Fact: If p(x) = a0 + a1 x + · · · + ak xk and q(x) = b0 + b1 x + · · · + bm xm are two
polynomials, then the Ratio Test will always fail for the series
∞
X p(n)
n=1
q(n)

EXAMPLE 35 Consider the series ∞

X nn
.
n=1
n!

It is even more difficult to predict how the growth of nn compares with that of n!
than it was for xn versus n!. Since the base is increasing as well as the exponent, nn
will get very large, very quickly. However, this is also true of n!. For example, there
6
(106 )10
is no easy way to determine the value of 106 !
. Instead, observe that

nn n · n · n · · · n n n n n
= = · · · · · · · ≥ 1.
n! 1 · 2 · 3 · · · n 1 2 3 n

Since the terms of the series are always larger than 1, they cannot converge to 0 and
hence the series diverges.

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 226

The Ratio Test would also show this result because

(n+1)n+1
an+1 = (n+1)!
an nn
n!

(n + 1)n+1 n!
=
nn (n + 1)!
(n + 1)n+1
=
nn (n + 1)
(n + 1)n
=
nn
!n
n+1
=
n
!n
1
= 1+
n

Recall that
!n
1
lim 1 + = e.
n→∞ n

Therefore, lim an+1

= e > 1. The Ratio Test shows that the series diverges.
n→∞ an

This also shows that

nn
lim =∞
n→∞ n!

since for large N we will have that

aN+k aN ek → ∞.

Hence
n! nn .

Alternately, if we consider the series

∞
X n!
n=1
nn

Calculus 2 (B. Forrest)2

Section 5.10: Root Test 227

then
nn
an+1 = n!
an (n+1)n+1
(n+1)!

nn (n + 1)!
=
(n + 1)n+1 n!
nn (n + 1)
=
(n + 1)n+1
(n)n
=
(n + 1)n
n n
=
n+1
1
=
n+1 n

n

1
= n
1 + 1n

This time, we get

an+1 1 1
lim = lim = <1
n→∞ an n→∞ (1 + )n
1 e
n
so the series ∞
X n!
n=1
nn
converges.

The following is a summary of what we have learned about the order of magnitude
of various functions:

ln(n) n p xn n! nn
for | x |> 1.
Therefore,
1 1 1 1 1
n
n p .
n n! x n ln(n)

5.10 Root Test

The next test is related to the Ratio Test. In fact, it can be derived in a manner similar
to the Ratio Test by comparing the series with a suitably chosen geometric series.

Calculus 2 (B. Forrest)2

Chapter 5: Numerical Series 228

THEOREM 15 Root Test

∞
P
Given a series an , assume that
n=1
pn
lim |an | = L
n→∞

where L ∈ R or L = ∞.
∞
1. If 0 ≤ L < 1, then
P
an converges absolutely.
n=1

∞
2. If L > 1, then
P
an diverges.
n=1

3. If L = 1, then no conclusion is possible.

∞
3n2 +1 n
P
EXAMPLE 36 Does the series 4n2 +n−1
converge or diverge?
n=1

3n2 +1 n
√n 2
+1
Let an = 4n +n−1
2 . Then an = 4n3n2 +n−1 .
We know that

3n2 + 1
!
√ 3
lim n an = lim = < 1.
n→∞ n→∞ 4n2 + n − 1 4
∞
3n2 +1 n
P
It follows from the Root Test that 4n +n−1
2 converges.
n=1

∞ n2
1 + 1n converge or diverge?
P
EXAMPLE 37 Does the series
n=1
r
n2 √ n
n2 n
Let an = 1 + 1n . Then n an = 1 + 1n = 1 + n1 .

We know that
!n
1
lim 1 + = e > 1.
n→∞ n
∞ n2
1 + 1n diverges.
P
Hence, by the Root Test the series
n=1

Calculus 2 (B. Forrest)2

Chapter 6

Power Series

So far all of the series we have considered have been numerical. That is, they have
consisted of an infinite sum of real numbers which either converged or diverged. In
this chapter, we will introduce a type of series called a power series which
resembles a polynomial of infinite degree.

6.1 Introduction to Power Series

In this section, we will introduce an important class of series called power series.

DEFINITION Power Series

A power series centered at x = a is a series of the form
∞
X
an (x − a)n
n=0

where x is considered a variable and the value an is called the coefficient of the term
(x − a)n .

Once we assign a value to the variable x, the series becomes a numerical series. In
particular, the following is the fundamental problem that we must answer.
∞
an (x − a)n converge?
P
Problem: For which values of x does the power series
n=0

Note that if x = a, the series becomes

∞
X
an (0)n .
n=0

Since our convention is that 00 = 1, we have

Chapter 6: Power Series 230

∞
X
an (0)n = a0 (1) + a1 (0) + a2 (0) + · · ·
n=0
= a0 + 0 + 0 + 0 + · · ·
= a0

This shows that every power series centered at x = a will converge at x = a to the
value a0 . The problem now is to determine what happens for other values of x.
As you might expect, the answer depends on the coefficients an .
Note: Before we proceed to study the convergence properties for power series we
note that if we let u = x − a, then
∞
X ∞
X
an (x − a) = n
an un .
n=0 n=0

This allows us to transform a power series centered at x = a into a power series

∞
centered at u = 0. Moreover, the power series an (x − a)n converges at a point x0 if
P
n=0
∞
P n
and only if the power series an u converges at x0 − a. This means that if we know
n=0
∞
an un converges, then we know all the
P
all of the values of u at which the series
n=0
∞
an (x − a)n converges. For this reason, in studying
P
values of x for which the series
n=0
the basic properties of convergence of power series we can focus almost exclusively
on the case where a = 0. Therefore, unless otherwise specified, we will be dealing
with power series of the form
X∞
an x n .
n=0

EXAMPLE 1 If we choose an = 1 for each n, we get the power series

∞
X
xn .
n=0

We know from the Geometric Series Test that this series will converge if and only if
| x |< 1.

EXAMPLE 2 We saw as an application of the Ratio Test that the series

∞
X xn
n=0
n!

Calculus 2 (B. Forrest)2

Section 6.1: Introduction to Power Series 231

will converge no matter what value we assign to x. Moreover, it converges

absolutely for each x.

EXAMPLE 3 For which values of x does the series

∞
X xn
n=0
n+1

converge?
We can use the Ratio Test to answer this question.
Fix a value for x and let
xn
bn = .
n+1
Then
(x)n+1
b n+1

lim = lim n+2 n

n→∞ bn n→∞ (x)
n+1

(n + 1) | x |n+1
= lim
n→∞ (n + 2) | x |n

(n + 1)
= lim |x|
n→∞ (n + 2)
(n + 1)
= | x | lim
n→∞ (n + 2)

= | x | ·1

= |x|

∞ ∞
xn
=
P P
The Ratio Test shows that the series n+1
bn converges (absolutely) if
n=0 n=0

bn+1
lim | | = | x |< 1
n→∞ bn
and it diverges if
bn+1
lim | | = | x |> 1.
n→∞ bn

The Ratio Test fails to tell us what happens if | x |= 1, so we must consider this case
separately.
When x = 1, the series is
∞
X 1
n=0
n+1

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 232

which is just the Harmonic Series written in a different form. Therefore, when
x = 1, the series diverges.
When x = −1 the series becomes
∞
X (−1)n
.
n=0
n+1

This is just the Alternating Series and as such when x = −1, the series converges.
∞
P xn
In summary, the power series n+1
converges absolutely if | x |< 1, diverges if
n=0
| x |> 1 and if x = 1, and converges conditionally at x = −1. That is, the series
converges if
x ∈ [−1, 1)

If we take a closer look at these three examples we will see that they have certain
∞
xn converges if x ∈ (−1, 1). In fact
P
common features. First observe that the series
n=0
∞
P xn
it converges absolutely on this interval. The series n!
converges absolutely on the
n=0
∞
P xn
interval (−∞, ∞). The series n+1
converges if x ∈ [−1, 1) and it converges
n=0
absolutely if x ∈ (−1, 1).
These three power series have the following two properties:
Property 1: The set of points on which the power series converge is an interval
centered around x = 0.
Property 2: There exists an R with either R ∈ [0, ∞) or R = ∞ such that the power
series converges absolutely if | x |< R and it diverges if | x |> R when R ∈ [0, ∞),
and the series converges absolutely at each x ∈ R when R = ∞. In the first and third
examples R = 1, while in the second example R = ∞.
We will now show that two properties are shared by all power series centered at
x = 0. Moreover, if a power series is centered at x = a, then Property 1 will hold
with an interval centered around x = a and Property 2 will hold if we replace | x | by
| x − a |.
Key Observation:
∞
an xn converges at x = x0 . If 0 ≤| x1 |<| x0 |, then we
P
Assume that the power series
n=0
claim that the series ∞
X
| an x1n |
n=0

Calculus 2 (B. Forrest)2

Section 6.1: Introduction to Power Series 233

also converges. To see why this is the case, we first note that since
∞
X
an x0n
n=0

converges, the Divergence Test shows that

lim |an x0n | = 0.

n→∞

It follows that there exists an N0 such that

|an x0n | < 1

for all n ≥ N0 .
Next observe that n n
x1 x1
|an x1n | = |an x0n | · ≤
x0 x0

for all n ≥ N0 . But xx10 < 1 so the series

∞ n
x1
X
x0
n=N0
n
converges by the Geometric Series Test. Since we know that |an x1n | ≤ xx01 the
Comparison Test shows that
X∞
|an x1n |
n=N0

also converges. Finally we get that

∞
X
|an x1n |
n=0

converges as claimed.
∞
an xn then the set
P
Summary: We have actually shown that if we are given a series
n=0

∞
X
I = {x0 | | an x0n converges}
n=0

is an interval centered at x = 0. Moreover, a similar result is true for a series of the

∞
an (x − a)n .
P
form
n=0

This leads us to the following definition:

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 234

DEFINITION Interval and Radius of Convergence

∞
an (x − a)n , the set
P
Given a power series of the form
n=0

∞
X
I = {x0 | | an (x0 − a)n converges}
n=0

is an interval centered at x = a which we call the interval of convergence for the

power series.
Let 
lub({|x0 − a| | x0 ∈ I}) if I is bounded,


R := 
∞
 if I is not bounded.

Then R is called the radius of convergence of the power series.

The following theorem summarizes what we now know about convergence of a

power series.

THEOREM 1 Fundamental Convergence Theorem for Power Series

∞
an (x − a)n centered at x = a, let R be the radius of
P
Given a power series
n=0
convergence.
∞
1. If R = 0, then an (x − a)n converges for x = a but it diverges for all other
P
n=0
values of x.
∞
2. If 0 < R < ∞, then the series an (x − a)n converges absolutely for every
P
n=0
x ∈ (a − R, a + R) and diverges if | x − a |> R.
∞
3. If R = ∞, then the series an (x − a)n converges absolutely for every x ∈ R.
P
n=0

∞
an (x − a)n converges on an interval that is centered at x = a which
P
In particular,
n=0
may or may not include one or both of the endpoints.

The following are some important observations concerning the interval of

convergence.

∞
1. As the previous theorem states, if 0 < R < ∞, the power series an x n
P
n=0
converges absolutely on the interval (−R, R). It may or may not converge at
x = R or at x = −R. These points must be tested separately. As we will see
later, the interval of convergence could be (−R, R), [−R, R), (−R, R] or [−R, R].
The next few examples show that all four cases are possible.

Calculus 2 (B. Forrest)2

Section 6.1: Introduction to Power Series 235

2. If R = ∞, then the power series has interval of convergence (−∞, ∞) and it

∞ n
P x
converges absolutely at each point. For example, the series n!
has (−∞, ∞)
n=0
as its interval of convergence.
3. If R = 0, then the power series only converges at x = 0. That is, “the interval
of convergence” is just the single point {0}. An example of such a power
∞
n!xn .
P
series is
n=0

EXAMPLE 4 The following power series all have radius of convergence R = 1. The interval of
convergence is specified.

∞
xn has interval of convergence (−1, 1) since when x = 1, the series
P
1.
n=0

∞
X
1n
n=0

diverges and when x = −1, the series

∞
X
(−1)n
n=0

also diverges.
∞
xn
has interval of convergence [−1, 1) since when x = 1, the series
P
2. n
n=1

∞
X 1
n=1
n

diverges but when x = −1, the series

∞
X (−1)n
n=1
n
converges.
∞
(−1)n xn
has interval of convergence (−1, 1] since when x = 1, the series
P
3. n
n=1

∞
X (−1)n
n=1
n

converges but when x = −1, the series

∞
X 1
n=1
n

diverges.

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 236

∞
xn
has interval of convergence [−1, 1] since when x = 1, the series
P
4. n2
n=1

∞
X 1
n=1
n2

converges and when x = −1, the series

∞
X (−1)n
n=1
n2

also converges.

6.1.1 Finding the Radius of Convergence

∞
an xn , we have seen that we can often use the Ratio Test to
P
Given a power series
n=0
find the radius of convergence. To see how this works in general, assume that

an+1
lim =L
n→∞ an

where 0 ≤ L < ∞ For x , 0, let

bn = an x n .
Then

an+1 xn+1

bn+1
lim = lim
n→∞ bn n→∞ an xn

an+1
= lim | x |
n→∞ an

an+1
= | x | lim
n→∞ an

= L|x|

∞ ∞
bn = an xn converges absolutely if
P P
The Ratio Test shows that the series
n=0 n=0
L | x |< 1 and diverges if L | x |> 1.
Assume that 0 < L < ∞. Then L | x |< 1 if and only if | x |< L1 . Therefore, the radius
of convergence is R = L1 .

Calculus 2 (B. Forrest)2

Section 6.1: Introduction to Power Series 237

If L = 0, then no matter what x is, we have L | x |= 0 < 1. Therefore, R = ∞. At this

point we might be tempted to cheat with our notation and write ∞ 10 , since this is
consistent with the first case. However, the symbol
1
0
actually has no numerical meaning.
If
an+1
lim = L = ∞,
n→∞ an

then the same calculation would show

bn+1
lim | |= ∞
n→∞ bn
and so the series diverges for all non-zero x. However, we know that the series must
converge at x = 0 so the radius of convergence is 0. Again, we might be tempted
write 0 = ∞1 but once more
1
∞
is not actually defined. Instead we can simply make use of the following theorem
which summarizes what we have just discussed.

THEOREM 2 Test for the Radius of Convergence

∞
an (x − a)n be a power series for which
P
Let
n=0

an+1
lim =L
n→∞ an

where 0 ≤ L < ∞ or L = ∞. Let R be the radius of convergence of the power series.

1. If 0 < L < ∞, then R = L1 .

2. If L = 0, then R = ∞.
3. If L = ∞, then R = 0.

EXAMPLE 5 Find the radius and interval of convergence for the power series
∞
X xn
.
n=0
3n (n2 + 1)

We begin by trying to calculate lim | an+1

an
|. In this case, an = 1
3n (n2 +1)
.
n→∞

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 238

Therefore
1

a n+1
3 ((n+1) +1)
n+1 2
lim = lim 1

n→∞ an n→∞
3 (n +1)
n 2
3n (n2 + 1)

= lim n+1

n→∞ 3 ((n + 1) + 1)
2

n2 + 1
!
1
= lim
n→∞ 3 n2 + 2n + 2

1 1 + n12
= lim
3 n→∞ 1 + n2 + n22
1
= · (1)
3
1
=
3

It follows from the Test for Radius of Convergence that R = 1

1 = 3.
3

We know that the series converges absolutely on (−3, 3). We must check for
convergence at x = 3 and x = −3.
For x = 3, the series becomes
∞ ∞
X 3n X 1
= .
n=0
3n (n2 + 1) n=0 n2 + 1

Since
1 1
< 2
n2 +1 n
∞ ∞
P 1 P 1
and since n2
converges, the Comparison Test shows that n2 +1
converges and
n=1 n=1
hence that ∞
X 1
n=0
n2 +1
converges.
Similarly, if x = −3, the series becomes
∞ ∞
X (−3)n X (−1)n
= .
n=0
3n (n2 + 1) n=0 n2 + 1

Notice that the Alternating Series Test applies and so this series also converges.
Alternatively, we have n
(−1) = 1 .
n2 + 1 n2 + 1

Calculus 2 (B. Forrest)2

Section 6.1: Introduction to Power Series 239

∞ ∞
P 1 P (−1)n
We have just shown that the series n2 +1
converges, which shows that n2 +1
n=0 n=0
converges absolutely.
We have just shown that the interval of convergence includes both endpoints,
therefore the interval of convergence is [−3, 3].
∞
P xn
If the previous calculation is repeated, it would show that the power series 3n (n2 +1)
n=0
∞
P xn
and the power series 3n
have the same radius of convergence, namely 3, though
n=0
they have a different interval of convergence. (A close look at the second series
shows that it really looks like a geometric series with r = 3x so it must converge
when | 3x |< 1 and diverge when | 3x |> 1.)

The following theorem will be very useful when we consider differentiation and
integration for functions obtained from power series. It may also help to find the
radius of convergence of many series more quickly. It essentially says that
multiplying or dividing the terms of a power series by a fixed polynomial in n will
not change the radius of convergence, though it is important to remember that it may
change the interval of convergence.

THEOREM 3 Equivalence of Radius of Convergence

Let p and q be non-zero polynomials where q(n) , 0 for n ≥ k. Then the following
series have the same radius of convergence:

∞
an (x − a)n
P
1.
n=k

∞
P an p(n)(x−a)n
2. q(n)
n=k

However, they may have different intervals of convergence.

Note: The limit lim aan+1
n
may not always exist. In this case, it might require some
n→∞
clever thought to find the radius of convergence.

EXAMPLE 6 Consider the power series

∞
X
an xn = 1 + 2x + x2 + 2x3 + x4 + 2x5 + · · ·
n=0

with an = 1 if n is even and an = 2 if n is odd. When x = 1 the series is

1 + 2 + 1 + 2 + 1 + 2 + ···

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 240

which diverges. In fact the series also diverges for x = −1. This shows that R ≤ 1.
Next pick x0 ∈ (−1, 1). Then |x0 | < 1 then

|an x0n | ≤ 2|x0n |

and since the series ∞

X
2 x0n
n=0

converges, the Comparison Test shows that the original series converges absolutely
at x = x0 . This shows that the interval of convergence is (−1, 1) and that R = 1.

However, the sequence an osculates between 2 and 2 so lim aan+1
an+1 1
n
does not exist.
n→∞

6.2 Functions Represented by Power Series

In this section, we consider power series centered at x = a with radius of

convergence R > 0. We will see how these series define functions with particularly
nice properties.

EXAMPLE 7 Recall that the geometric series

∞
X
xn
n=0

converges for each x with | x |< 1. Moreover, if | x |< 1, then we also know that
∞
1 X
= xn .
1 − x n=0

This means that the series provides us with a means to represent the function
f (x) = 1−x
1
on the interval (−1, 1).

DEFINITION Functions Represented by a Power Series

∞
an (x − a)n be a power series with radius of convergence R > 0. Let I be the
P
Let
n=0
∞
an (x − a)n . Let f be the function defined on the
P
interval of convergence for
n=0
interval I by the formula
∞
X
f (x) = an (x − a)n
n=0

for each x ∈ I.

Calculus 2 (B. Forrest)2

Section 6.2: Functions Represented by Power Series 241

∞
an (x − a)n on I.
P
We say that the function f (x) is represented by the power series
n=0

The next theorem tells us that a function represented by a power series must be
continuous on its interval of convergence.

THEOREM 4 Abel’s Theorem: Continuity of Power Series

∞
an (x − a)n has interval of convergence I. Let
P
Assume that the power series
n=0

∞
X
f (x) = an (x − a)n
n=0

for each x ∈ I. Then f (x) is continuous on I.

The proof of this theorem is beyond the scope of this course.

6.2.1 Building Power Series Representations

Suppose that f and g are two functions represented by power series centered at x=a
with

∞
X
f (x) = an (x − a)n
n=0
and ∞
X
g(x) = bn (x − a)n ,
n=0

and that the two series have intervals of convergence I f and Ig , respectively.
Question: Can this information be used to build a power series for the sum f + g?
To see why this is possible, we first start by noting that if both series converge at a
point x0 ∈ I f ∩ Ig , then

( f + g)(x0 ) = f (x0 ) + g(x0 )

X k k
X
= lim an (x0 − a) + lim
n
bn (x0 − a)n
k→∞ k→∞
n=0 n=0
Xk
= lim (an + bn )(x0 − a)n
k→∞
n=0
∞
X
= (an + bn )(x0 − a)n
n=0

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 242

This tells us that the function f + g can be represented by the power series
∞
X
(an + bn )(x − a)n
n=0

on I f ∩ Ig .

THEOREM 5 Addition of Power Series

Assume that f and g are represented by power series centered at x = a with

∞
X
f (x) = an (x − a)n
n=0

and ∞
X
g(x) = bn (x − a)n ,
n=0

respectively.
Assume also that the radii of convergence of these series are R f and Rg with
intervals of convergence I f and Ig . Then

∞
X
( f + g)(x) = (an + bn )(x − a)n .
n=0

Moreover, if R f , Rg , then the radius of convergence of the power series

representing f + g is R = min{R f , Rg } and the interval of convergence is I = I f ∩ Ig .
If R f = Rg , then R ≥ R f .

Next assume that h(x) = (x − a)m f (x) where m ∈ N. We might guess that h(x) would
be represented by the following power series centered at x = a:
∞
X ∞
X
h(x) = (x − a) m
an (x − a) =n
an (x − a)n+m .
n=0 n=0

To see why this is the case, we note that if x0 ∈ I f , then

h(x0 ) = (x0 − a)m f (x0 )

X k
= (x0 − a) lim
m
an (x0 − a)n
k→∞
n=0
k
X
= lim an (x0 − a)n+m
k→∞
n=0
∞
X
= an (x0 − a)n+m
n=0

Calculus 2 (B. Forrest)2

Section 6.2: Functions Represented by Power Series 243

We can summarize this in the following Theorem.

THEOREM 6 Multiplication of a Power Series by (x − a)m

Assume that f is represented by a power series centered at x = a as

∞
X
f (x) = an (x − a)n
n=0

with radius of convergence R f and interval of convergence I f .

Assume that h(x) = (x − a)m f (x) where m ∈ N. Then h(x) can also be represented by
a power series centered at x = a with
∞
X
h(x) = an (x − a)n+m
n=0

Moreover, the series that represents h has the same radius of convergence and the
same interval of convergence as the series that represents f .

Finally, assume that f has a power series representation

∞
X
f (u) = an un
n=0

centered at u = 0 with interval of convergence I f .

Question: Can we find a power series representation for h(x) = f (c · xm ) centered
at x = 0 where c is some non-zero constant?
In fact, if we choose x0 so that c · x0m ∈ I f , then substituting c · x0m for u gives us

∞
X ∞
X
h(x) = f (c · x0m ) = an (c · x0m )n = (an · cn )x0mn .
n=0 n=0

This leads to the following Theorem.

THEOREM 7 Power Series of Composite Functions

Assume that f has a power series representation

∞
X
f (u) = an un
n=0

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 244

centered at u = 0 with radius of convergence R f and interval of convergence I f . Let

h(x) = f (c · xm ) where c is a non-zero constant. Then h has a power series
representation centered at x = 0 of the form
∞
X
h(x) = f (c · xm ) = (an · cn )xmn
n=0

The interval of convergence is

Ih = {x ∈ R | c · xm ∈ I f }
q
R
and the radius of convergence is Rh = m |c|f if R f < ∞ and Rh = ∞ otherwise.

x
EXAMPLE 8 Find a power series representation for f (x) = centered at x = 0.
1 − 2x2
We know that ∞
1 X
= un
1 − u n=0
for u ∈ (−1, 1). Then
∞ ∞
1 X X
= (2x 2 n
) = 2n x2n
1 − 2x2 n=0 n=0

provided that 2x2 ∈ (−1, 1). However, 2x2 ∈ (−1, 1) if and only if x2 ∈ − 12 , 12 .
Therefore,
∞ ∞
x X X
2
= x· 2 x =
n 2n
2n x2n+1
1 − 2x n=0 n=0

if and only if x ∈ − √12 , √12 .

6.3 Differentiation of Power Series

While the Continuity of Power Series Theorem has important theoretical

applications, we will be more interested to see if we can differentiate or integrate
functions that can be represented by power series.
∞
Let f (x) = an (x − a)n be a function that is represented by a power series. We
P
n=0
could naively try to differentiate f by differentiating the series one term at a time.
We would hope that we can do this because we know that for the sum of two
functions we have
d
(g(x) + h(x)) = g 0 (x) + f 0 (x).
dx

Calculus 2 (B. Forrest)2

Section 6.3: Differentiation of Power Series 245

Since d
(a (x
dx n
− a)n ) = nan (x − a)n−1 , we would also hope that f 0 (x) would exist and
that ∞ ∞
X X
f 0 (x) = nan (x − a)n−1 = nan (x − a)n−1
n=0 n=1

where the last equality holds since multiplying by 0 gives 0.

Unfortunately, since we are dealing with infinite sums rather than finite sums, we do
not even know if the above series will converge. Even if it does, it is not at all
obvious that it will sum to f 0 (x). At the very least, the process of term-by-term
differentiation does produce another power series centered at x = a.

DEFINITION The Formal Derivative of a Power Series

∞
an (x − a)n , the formal derivative is the series
P
Given a power series
n=0

∞
X ∞
X
nan (x − a)n−1 = nan (x − a)n−1 .
n=0 n=1

We are left to consider two fundamental problems.

Problem 1: For which values of x does the formal power series

∞
X
nan (x − a)n−1
n=1

converge? In particular, does this series converge for the same values as does the
∞
an (x − a)n ?
P
original series
n=0

∞ ∞
an (x − a)n and nan (x − a)n−1 converge at the
P P
Problem 2: If both the series
n=0 n=1
same at x, must it be the case that

∞
X
f (x) =
0
nan (x − a)n−1 ?
n=1

In other words, why must the statement that the derivative of a sum is the sum of the
derivatives carry over from finite sums to infinite sums?
Fortunately, Problem 1 is not too difficult.
In the previous section, we saw that multiplying the terms of a power series by a
polynomial in n does not change the radius of convergence R. Therefore, the series
∞ ∞
an xn and the series nan xn have the same radius of convergence. A minor
P P
n=0 n=0

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 246

∞
an (x − a)n and its formal
P
modification of this result shows that the series
n=0
∞
P n−1
derivative nan (x − a) also have the same radius of convergence, though the
n=1
interval of convergence may be different. Therefore, we have that
∞
X
g(x) = nan (x − a)n−1
n=1

is defined for all x ∈ (a − R, a + R). The question that remains is:

Does g(x) = f 0 (x)?

A rigorous answer to this question is very difficult. It involves a number of very

sophisticated results including the Fundamental Theorem of Calculus. However, the
following theorem tells us that functions represented by power series do indeed have
the remarkable property that we require.

THEOREM 8 Term-by-Term Differentiation of Power Series

∞
an (x − a)n has radius of convergence R > 0. Let
P
Assume that the power series
n=0

∞
X
f (x) = an (x − a)n
n=0

for all x ∈ (a − R, a + R). Then f is differentiable on (a − R, a + R) and for each

x ∈ (a − R, a + R),
X∞
f (x) =
0
nan (x − a)n−1 .
n=1

Remarks:

1) It may be tempting to dismiss the previous theorem as obvious since we

would expect differentiation to be preserved by sums. However, this theorem
illustrates a very special property of power series that is not shared by other
types of series of functions. For example, it can be shown as an application of
the Comparison Test and the Geometric Series Test that for each x ∈ R, the
series ∞ !n
X 3
sin(9n x)
n=0
4
converges. This allows us to define a function
∞ !n
X 3
f (x) = sin(9n x)
n=0
4

Calculus 2 (B. Forrest)2

Section 6.3: Differentiation of Power Series 247

that is defined for all x ∈ R.

Notice that f is the sum of infinitely many differentiable functions, namely
!n
3
fn (x) = sin(9n x)
4
for each n. Based on our experience with power series, we might be tempted
to think that f (x) should be differentiable and that its derivative would be the
sum of all of the derivatives of the fn ’s. Unfortunately, this is false. In fact, it
turns out that the function f (x) is a continuous function that is not
differentiable at any point on the real line. This is about as strange as a
continuous function can be!
This example, which is one of a class of everywhere continuous but nowhere
differentiable functions discovered by the 19th century mathematician Karl
Weierstrass. This is a historically important example because up until
Weierstrass’ discovery it was thought that a continuous function had to have
some points of differentiability. The graph of such a function has ”fractal-like”
properties. This means that no matter how close we zoom in to any given
point, the graph of the function does not start to resemble a straight line.
2) While it may not be immediately obvious, the Differentiation Theorem for
Power Series actually says much more about a function f that can be
represented by a power series than just simply that it is differentiable. To see
∞
why, observe that if f (x) = an (x − a)n is represented by a power series on
P
n=0
∞
the interval (a − R, a + R), then f 0 (x) = nan (x − a)n−1 is also represented by
P
n=1
a power series on (a − R, a + R). Applying the previous theorem again shows
that f 0 (x) is also differentiable on (a − R, a + R). Moreover, we can calculate
d
dx
( f 0 (x)) = f 00 (x) by differentiating the series representing f 0 (x)
term-by-term. Therefore
X∞ X∞
f 00 (x) = n(n − 1)an (x − a)n−2 = n(n − 1)an (x − a)n−2 .
n=1 n=2

(Note: We begin the series at n = 2 since for n = 1, n(n − 1) = 0.)

Furthermore, we need not stop here. We can term-by-term differentiate again
to show that ∞
X
f 000 (x) = n(n − 1)(n − 2)an (x − a)n−3
n=3
and that ∞
X
f (4) (x) = n(n − 1)(n − 2)(n − 3)an (x − a)n−4 .
n=4

In fact, we can show that f (x) has derivatives of all orders on (a − R, a + R)

with
X∞
f (x) =
(k)
n(n − 1)(n − 2)(n − 3) · · · (n − k + 1)an (x − a)n−k
n=k

for each k.

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 248

EXAMPLE 9 We know that if | x |< 1, then

∞
1 X
= xn .
1 − x n=0

Let f (x) = 1
1−x
= (1 − x)−1 . Differentiating, we get

1
f 0 (x) = .
(1 − x)2

We also get a power series representation for f 0 (x) by term-by-term differentiation

∞
xn . It follows that
P
of the series
n=0

∞
1 X
f (x) =
0
2
= nxn−1 .
(1 − x) n=1

We can use this to evaluate the sum of the series

∞
X n
n−1
.
n=1
2

∞
nxn−1 by letting x = 12 . Therefore,
P
Observe that this series is obtained from
n=1

∞ !
X n 1
n−1
= f 0

n=1
2 2
1
= 2
1 − 21

= 4

EXAMPLE 10 We have seen that if a function f satisfies the differential equation y 0 = y, then there
∞ n
is a constant C such that f (x) = Ce x . We also know that the series x
P
n!
converges
n=0
for every value of x. Let
∞
X xn x0 x1 x2 x3
g(x) = = + + + + ··· .
n=0
n! 0! 1! 2! 3!

We can find g 0 (x) by term-by-term differentiation. Therefore

∞
X nxn−1
g (x) =
0
.
n=1
n!

Calculus 2 (B. Forrest)2

Section 6.3: Differentiation of Power Series 249

If we take a close look at this series we see immediately that it becomes

∞
X xn−1
.
n=1
(n − 1)!

Writing out the terms, we get

x0 x1 x2 x3
+ + + + ···
0! 1! 2! 3!
which is identical to the series that gave us g(x). This shows that

g 0 (x) = g(x)

for every x. Therefore, there is a C such that g(x) = Ce x . Moreover, g(0) = Ce0 = C.
But
∞
X 0n
g(0) =
n=0
n!

00 01 02 03
= + + + + ···
0! 1! 2! 3!

= 1 + 0 + 0 + 0 + ···

= 1

Hence C = 1 and ∞
X xn
g(x) = = ex
n=0
n!
for every x ∈ R.
This is an extremely important example that we will come back to again and again.
one immediate application is that
∞ ∞
X 1n X 1
e= = .
n=0
n! n=0 n!

This series converges quite rapidly so we can get a very accurate approximation for
e by summing a relatively small number of terms.

Find a power series representation for the function f (x) = e−x .

2
EXAMPLE 11
We have just seen for any u ∈ R that
∞
X un
e =
u
(∗)
n=0
n!

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 250

Given x ∈ R, let u = −x2 and substitute for u in the expression (∗). This gives us
∞ ∞
−x2
X (−x2 )n X
nx
2n
e = = (−1) .
n=0
n! n=0
n!

At first glance

2 x2 x4 x6 x2n
e−x = 1 − + − + · · · + (−1)n + ···
1! 2! 3! n!
may not look like a power series since there are no terms involving xn when n is
odd. But in fact, it is a power series where the coefficients are of the form a2k−1 = 0
and a2k = (−1)k k!1 for each k = 0, 1, 2, 3, 4, . . ..
Moreover, since the original series converges for all u ∈ R, the power series for
f (x) = e−x will also converge for all x ∈ R. That is, its radius of convergence is
2

R = ∞.

Key Observation: We have seen that if

∞
X
f (x) = an x n
n=0

for all x ∈ (−R, R), then f (x) is infinitely differentiable on (−R, R). We can now use
this to show that once a function f (x) has a power series representation at x = 0, the
coefficients are uniquely determined by the various values of the derivatives of f (x).
In particular, once we fix the center x = 0, the function f can only be represented by
one such power series at x = 0 (though there may well be other representations of f
with different centers). To see why this is true, recall that for any function
∞
X
g(x) = bn xn
n=0

represented by a power series, if we substitute in 0 for x, we get that

g(0) = b0 .

That is, g(0) is simply the coefficient of the term x0 in the power series
representation of g(x). Therefore, if
∞
X
f (x) = an x n
n=0

is any power series representation for the function f (x), then a0 = f (0). This shows
that a0 is in fact, uniquely determined by the value of f (x) at x = 0.
We can show that something similar occurs for all of the other coefficients. For
example, we note that since
∞
X
f (x) =
0
nan xn−1
n=1

Calculus 2 (B. Forrest)2

Section 6.3: Differentiation of Power Series 251

then f 0 (0) is the coefficient of x0 in this series new representation. But we get x0
when n = 1 and when n = 1, the coefficient is (1)(a1 ). It follows that

f 0 (0) = (1)a1 = a1 .

Alternatively, we could get this directly by substituting x = 0 in the expression for

f 0 (x) to get

f 0 (0) = (1)a1 (0)0 + (2)a2 (0)1 + (3)a3 (0)2 + (4)a4 (0)4 + · · ·

= a1 + 0 + 0 + 0 + 0 + · · ·
= a1

By either method, we see that f 0 (0) = a1 , so the coefficient a1 is also uniquely

determined by f (x).
A similar calculation shows that f 00 (0) is the coefficient of x0 in
∞
X
f (x) =
00
n(n − 1)an xn−2 .
n=2

This time to get xn−2 = x0 , we let n = 2. From this we see that the coefficient of x0 in
the previous expression is given by 2(2 − 1)a2 or 2 · 1a2 = 2!a2 . As a result we have

f 00 (0) = 2!a2

or that
f 00 (0)
a2 = .
2!
Once again, a2 is uniquely determined.
To find a3 , start with
∞
X
f 000
(x) = n(n − 1)(n − 2)an xn−3 .
n=3

000
Then f (0) is the coefficient of x0 in this series and therefore

f 000
(0) = (3)(3 − 1)(3 − 2)a3 = 3 · 2 · 1a3 = 3!a3 .

Solving for a3 shows that

000
f (0)
a3 = .
3!
In fact, for any k ≥ 2, we have
∞
X
f (x) =
(k)
n(n − 1)(n − 2)(n − 3) · · · (n − k + 1)an xn−k
n=k

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 252

so that the coefficient of x0 is obtained when n = k and hence

f (k) (0) = (k)(k − 1)(k − 2) · · · (k − (k − 2))(k − (k − 1))ak

= k · (k − 1) · (k − 2) · · · 2 · 1ak
= k!ak

Solving this expression for ak shows that for k ≥ 2,

f (k) (0)
ak = .
k!
If we note that
f (0) (0)
a0 = f (0) =
0!
and
f (1) (0)
a1 = f 0 (0) =
1!
then we see that for every k,
f (k) (0)
ak = .
k!
∞
an xn centered at
P
This tells us if a function can be represented by a power series
n=0
x = 0, then the function f (x) and its various derivatives uniquely determine the
coefficients.
A similar argument shows that the previous observation holds for a power series
centered at x = a. This situation motivates the next theorem.

THEOREM 9 Uniqueness of Power Series Representations

Suppose that
∞
X
f (x) = an (x − a)n
n=0

for all x ∈ (a − R, a + R) where R > 0. Then

f (n) (a)
an = .
n!

In particular, if
∞
X
f (x) = bn (x − a)n ,
n=0

then
bn = an
for each n = 0, 1, 2, 3, · · · .

Calculus 2 (B. Forrest)2

Section 6.4: Integration of Power Series 253

6.4 Integration of Power Series

Up until now we have seen that a function which is represented by a power series
has the remarkable property that it is infinitely differentiable and that its derivatives
can be obtained by repeated term-by-term differentiation of the power series. It
would make sense to determine if a similar statement could be made with respect to
anti-differentiation and integration.
We begin with the following definition:

DEFINITION Formal Antiderivative of a Power Series

∞
an (x − a)n , we define the formal antiderivative to be the
P
Given a power series
n=0
power series
∞ Z ∞
X X an
an (x − a)n dx = C + (x − a)n+1 .
n=0 n=0
n + 1
where C is an arbitrary constant.

In the definition, we called the series a formal antiderivative because at this point we
do not know if it is an actual antiderivative of f . In fact we do not even know if the
formal antiderivative converges at any point other than the center.
∞
an (x − a)n has radius of convergence
P
Problem: Suppose that the power series
n=0
R > 0. Let ∞
X
f (x) = an (x − a)n
n=0

be the function that is represented by this power series on the interval (a − R, a + R).
Are the formal anitderivatives
∞
X an
C+ (x − a)n+1
n=0
n+1

true antiderivatives of the function f ?

To see why this is in fact the case we first note that the series
∞
X an
(x − a)n
n=0
n + 1
∞
an (x − a)n by dividing each coefficient an by q(n) = n + 1. We
P
is obtained from
n=0
∞
an
− a)n will
P
know that this will not change the radius of convergence. Hence, n+1
(x
n=0
also converge on (a − R, a + R). Then the formal antiderivative
∞
X an
C+ (x − a)n+1
n=0
n+1

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 254

will have radius of convergence R. This means that we can define a function
∞
X an
F(x) = C + (x − a)n+1
n=0
n + 1

on (a − R, a + R).
Next we show that F is an antiderivative of f .
Since F is represented by a power series, it is differentiable. Moreover, its derivative
can be obtained by term-by-term differentiation. Therefore
∞
d X d an
F (x) =
0
(C) + (x − a)n+1
dx n=0
dx n + 1
X∞ a
n
= 0+ (n + 1) (x − a)n
n=0
n+1
X∞
= an (x − a)n
n=0

= f (x)

so F is an antiderivative of f . Moreover, since the constant C is arbitrary, all

antiderivatives have this form.
The next theorem summarizes what we have just discussed. It also tells us that we
can evaluate definite integrals using term-by-term methods. (The proof of this last
statement is beyond the scope of the course.)

THEOREM 10 Term-by-Term Integration of Power Series

∞
an (x − a)n has radius of convergence R > 0. Let
P
Assume that the power series
n=0

∞
X
f (x) = an (x − a)n
n=0

for every x ∈ (a − R, a + R). Then the series

∞ Z ∞
X X an
an (x − a)n dx = C + (x − a)n+1
n=0 n=0
n + 1

also has radius of convergence R and if

∞
X an
F(x) = C + (x − a)n+1
n=0
n + 1

then F 0 (x) = f (x).

Calculus 2 (B. Forrest)2

Section 6.4: Integration of Power Series 255

Furthermore, if [c, b] ⊂ (a − R, a + R), then

Z b ∞
Z bX
f (x) dx = an (x − a)n dx
c c n=0
∞ Z
X b
= an (x − a)n dx
n=0 c

∞
X an
= · ((b − a)n+1 − (c − a)n+1 )
n=0
n + 1

Note: Similar to the case with the term-by-term differentiation rule for power
series, it should seem natural that we are also able to integrate term-by-term the
functions that are represented by a power series. But as was the case with
differentiation, the reason we are able to do this is because of some very special
properties impacting how power series converge. If
∞
X
F(x) = fn (x)
n=1

for each x ∈ [a, b], then we might hope that

Z b ∞ Z
X b
F(x) dx = fn (x) dx.
a n=0 a

Unfortunately, if we do not make any additional assumptions about the nature of the
functions fn or how the series converges, then this statement could be false. In fact,
it is possible that the function F need not even be integrable on [a, b].
We can use the previous theorem to build power series representations for many
functions.

EXAMPLE 12 Find a power series representation for ln(1 + x).

We begin by recognizing that
d 1
(ln(1 + x)) = .
dx 1+x
1
If we had a power series representation for 1+x
, we could use the previous theorem
to build a representation for ln(1 + x).
We know that for any u ∈ (−1, 1),
∞
1 X
= un .
1 − u n=0

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 256

Let x ∈ (−1, 1). Then (−x) ∈ (−1, 1). If we let u = −x and substitute this into the
previous equation, then
∞
1 X
= (−x)n
1 − (−x) n=0
so that for any (−x) ∈ (−1, 1),
1 1
=
1+x 1 − (−x)
X∞
= (−x)n
n=0
∞
X
= (−1)n xn
n=0

We now know d
dx
(ln(1 + x)) = 1
1+x
and that
∞
1 X
= (−1)n xn .
1 + x n=0

Therefore, there is a constant C such that

∞ Z
X
ln(1 + x) = C + (−1)n xn dx
n=0
∞
X (−1)n
= C+ xn+1
n=0
n+1

for all x ∈ (−1, 1).

To find C, let x = 0. Then
0 = ln(1 + 0)
∞
X (−1)n n+1
= C+ 0
n=0
n + 1
= C

This shows that for all x ∈ (−1, 1),

∞
X (−1)n
ln(1 + x) = xn+1 .
n=0
n+1

There is one more very useful observation that we can make regarding this example.
We know that the series ∞
X (−1)n n+1
x
n=0
n+1

Calculus 2 (B. Forrest)2

Section 6.5: Review of Taylor Polynomials 257

has radius of convergence R = 1. However, if we let x = 1, the series becomes

∞
X (−1)n 1 1 1 1 1
1n+1 = − + − + − ···
n=0
n+1 1 2 3 4 5

which is exactly the Alternating Series. Therefore, the series also converges at
x = 1. However, the equation
∞
X (−1)n
ln(1 + x) = xn+1
n=0
n+1

is actually valid wherever this new series converges. This tells us that
1 1 1 1 1
ln(2) = ln(1 + 1) = − + − + − ···
1 2 3 4 5
so we have found the sum of the alternating series.

6.5 Review of Taylor Polynomials

Recall that if f is differentiable at x = a, then if x a

f (x) − f (a)
f 0 (a) .
x−a
Cross-multiplying gives us

f 0 (a)(x − a) f (x) − f (a)

and finally that

f (x) f (a) + f 0 (a)(x − a) .

This led us to define the linear approximation to f at x = a to be the function

La (x) = f (a) + f 0 (a)(x − a).

We saw that the geometrical significance of the linear approximation is that its
graph is the tangent line to the graph of f through the point (a, f (a)).

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 258

La (x) = f (a) + f 0 (a)(x − a)

Recall also that the linear approximation has the following two important properties:

1. La (a) = f (a).
2. La0 (a) = f 0 (a).

In fact, amongst all polynomials of degree at most 1, that is functions of the form
p(x) = c0 + c1 (x − a),
the linear approximation is the only one with both properties (1) and (2) and as such,
the only one that encodes both the value of the function at x = a and its derivative.
We know that for x near a that
f (x) La (x).
This means that we can use the simple linear function La to approximate what could
be a rather complicated function f at points near x = a. However, any time we use a
process to approximate a value, it is best that we understand as much as possible
about the error in the procedure. In this case, the error in the linear approximation is
Error(x) =| f (x) − La (x) |
and at x = a the estimate is exact since La (a) = f (a).
There are two basic factors that affect the potential size of the error in using linear
approximation. These are

1. The distance between x and a. That is, how large is | x − a |?

2. How curved the graph is near x = a?

Note that the larger | f 00 (x) | is, the more rapidly the tangent lines turn, and hence the
more curved the graph of f . For this reason the second factor affecting the size of
the error can be expressed in terms of the size of | f 00 (x) |. Generally speaking, the
further x is away from a and the more curved the graph of f , the larger the potential
for error in using linear approximation. This is illustrated in the following diagram
which shows two different functions, f and g, with the same tangent line at x = a.
The error in using the linear approximation is the length of the vertical line joining
the graph of the function and the graph of the linear approximation.

Calculus 2 (B. Forrest)2

Section 6.5: Review of Taylor Polynomials 259

La (x) = f (a) + f 0 (a)(x − a)

Error(1)

Error(2)

a x

Notice that in the diagram, the graph of g is much more curved near x = a than is
the graph of f . You can also see that at the chosen point x the error

Error(1) =| f (x) − La (x) |

in using La (x) to estimate the value of f (x) is extremely small, whereas the error

Error(2) =| g(x) − La (x) |

in using La (x) to estimate the value of g(x) is noticeably larger. The diagram also
shows that for both f and g, the further away x is from a, the larger the error is in
the linear approximation process.
In the case of the function g, its graph looks more like a parabola (second degree
polynomial) than it does a line. This suggests that it would make more sense to try
and approximate g with a function of the form

p(x) = c0 + c1 (x − a) + c2 (x − a)2 .

(Notice that the form for this polynomial looks somewhat unusual. You will see that
we write it this way because this form makes it easier to properly encode the
information about f at x = a).
In constructing the linear approximation, we encoded the value of the function and
of its derivative at the point x = a. We want to again encode this local information,
but we want to do more. If we can include the second derivative, we might be able
to capture the curvature of the function that was missing in the linear approximation.
In summary, we would like to find constants c0 , c1 , and c2 , so that

1. p(a) = f (a),
2. p 0 (a) = f 0 (a), and

3. p 00 (a) = f 00 (a).

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 260

It may not seem immediately obvious that we can find such constants. However, this
task is actually not too difficult. For example, if we want p(a) = f (a), then by noting
that
p(a) = c0 + c1 (a − a) + c2 (a − a)2 = c0
we immediately know that we should let c0 = f (a).
We can use the standard rules of differentiation to show that

p 0 (x) = c1 + 2c2 (x − a).

In order that p 0 (a) = f 0 (a), we have

f 0 (a) = p 0 (a) = c1 + 2c2 (a − a) = c1 .

Finally, since
p 00 (x) = 2c2
f 00 (a)
for all x, if we let c2 = 2
, we have

f 00 (a)
p 00 (a) = 2c2 = 2( ) = f 00 (a)
2
exactly as required. This shows that if
f 00 (a)
p(x) = f (a) + f 0 (a)(x − a) + (x − a)2 ,
2
then p is the unique polynomial of degree 2 or less such that

1. p(a) = f (a),
2. p 0 (a) = f 0 (a), and

3. p 00 (a) = f 00 (a).

The polynomial p is called the second degree Taylor polynomial for f centered at
x = a. We denote this Taylor polynomial by T 2,a .

EXAMPLE 13 Let f (x) = cos(x). Then,

f (0) = cos(0) = 1, and f 0 (0) = − sin(0) = 0,

and
f 00 (0) = − cos(0) = −1.
It follows that

L0 (x) = f (0) + f 0 (0)(x − 0) = 1 + 0(x − 0) = 1

Calculus 2 (B. Forrest)2

Section 6.5: Review of Taylor Polynomials 261

for all x while

f 00 (0)
T 2,0 (x) = f (0) + f 0 (0)(x − 0) + (x − 0)2
2
−1
= 1 + 0(x − 0) + (x − 0)2
2
x2
= 1− .
2

The following diagram shows cos(x) with its linear approximation and its second
degree Taylor polynomial centered at x = 0.

L0 (x) = 1
1

0.5
f (x) = cos(x)

−2 −1 0 1 2
−0.5

x2
−1 T 2,0 (x) = 1 − 2

Notice that the second degree Taylor polynomial T 2,0 does a much better job
approximating cos(x) over the interval [−2, 2] than does the linear approximation L0 .

We might guess that if f has a third derivative at x = a, then by encoding the value
f 000 (a) along with f (a), f 0 (a) and f 00 (a), we may do an even better job of
approximating f (x) near x = a than we did with either La or with T 2,a . As such we
would be looking for a polynomial of the form

p(x) = c0 + c1 (x − a) + c2 (x − a)2 + c3 (x − a)3

such that

1. p(a) = f (a),
2. p 0 (a) = f 0 (a),
3. p 00 (a) = f 00 (a), and

4. p 000 (a) = f 000

(a).

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 262

To find such a p, we follow the same steps that we outlined before. We want
p(a) = f (a), but p(a) = c0 + c1 (a − a) + c2 (a − a)2 + c3 (a − a)3 = c0 , so we can let
c0 = f (a).
Differentiating p we get

p 0 (x) = c1 + 2c2 (x − a) + 3c3 (x − a)2

so that
p 0 (a) = c1 + 2c2 (a − a) + 3c3 (a − a)2 = c1 .
Therefore, if we let c1 = f 0 (a) as before, then we will get p 0 (a) = f 0 (a).
Differentiating p 0 gives us

p 00 (x) = 2c2 + 3(2)c3 (x − a).

Therefore,
p 00 (a) = 2c2 + 3(2)c3 (a − a) = 2c2 .
f 00 (a)
Now if we let c2 = 2
, we get

p 00 (a) = f 00 (a).

Finally, observe that

p 000 (x) = 3(2)c3 = 3(2)(1)c3 = 3!c3

for all x, so if we require

p 000 (a) = 3!c3 = f 000
(a),
000 (a)
then we need only let c3 = f
3!
.
It follows that if
f 00 (a) f 000 (a)
p(x) = f (a) + f 0 (a)(x − a) + (x − a)2 + (x − a)3 ,
2 3!
then

1. p(a) = f (a),

2. p 0 (a) = f 0 (a),
3. p 00 (a) = f 00 (a), and
4. p 000 (a) = f 000
(a).

In this case, we call p the third degree Taylor polynomial centered at x = a and
denote it by T 3,a .
Given a function f , we could also write

T 0,a (x) = f (a)

Calculus 2 (B. Forrest)2

Section 6.5: Review of Taylor Polynomials 263

and
T 1,a (x) = La (x) = f (a) + f 0 (a)(x − a)
and call these polynomials the zero-th degree and the first degree Taylor
polynomials of f centered at x = a, respectively.
Observe that using the convention where 0! = 1! = 1 and (x − a)0 = 1, we have the
following:
f (a)
T 0,a (x) = (x − a)0
0!
f (a) f 0 (a)
T 1,a (x) = (x − a)0 + (x − a)1
0! 1!
f (a) f 0 (a) f 00 (a)
T 2,a (x) = (x − a)0 + (x − a)1 + (x − a)2
0! 1! 2!
f (a) f 0 (a) f 00
(a) f 000 (a)
T 3,a (x) = (x − a)0 + (x − a)1 + (x − a)2 + (x − a)3 .
0! 1! 2! 3!

Recall that f (k) (a) denotes the k-th derivative of f at x = a. By convention,

f (0) (x) = f (x). Then using summation notation, we have
0
X f (k) (a)
T 0,a (x) = (x − a)k
k=0
k!
1
X f (k) (a)
T 1,a (x) = (x − a)k
k=0
k!
2
X f (k) (a)
T 2,a (x) = (x − a)k
k=0
k!
and
3
X f (k) (a)
T 3,a (x) = (x − a)k .
k=0
k!

This leads us to the following definition:

DEFINITION Taylor Polynomials

Assume that f is n-times differentiable at x = a. The n-th degree Taylor polynomial
for f centered at x = a is the polynomial

n
X f (k) (a)
T n,a (x) = (x − a)k
k=0
k!
f 00 (a) f (n) (a)
= f (a) + f 0 (a)(x − a) + (x − a)2 + · · · + (x − a)n
2! n!

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 264

NOTE
A remarkable property about T n,a is that for any k between 0 and n,
(k)
T n,a (a) = f (k) (a).

That is, T n,a encodes not only the value of f (x) at x = a but all of its first n
derivatives as well. Moreover, this is the only polynomial of degree n or less that
does so!

EXAMPLE 14 Find all of the Taylor polynomials up to degree 5 for the function f (x) = cos(x) with
center x = 0.
We have already seen that f (0) = cos(0) = 1, f 0 (0) = − sin(0) = 0, and
f 00 (0) = − cos(0) = −1. It follows that
T 0,0 (x) = 1,
and
T 1,0 (x) = L0 (x) = 1 + 0(x − 0) = 1
for all x, while
−1 x2
T 2,0 (x) = 1 + 0(x − 0) + (x − 0)2 = 1 − .
2! 2

Since f 000 (x) = sin(x), f (4) (x) = cos(x), and f (5) (x) = − sin(x), we get
f 000 (0) = sin(0) = 0, f (4) (0) = cos(0) = 1 and f (5) (0) = − sin(0) = 0. Hence,
−1 0
T 3,0 (x) = 1 + 0(x − 0) + (x − 0)2 + (x − 0)3
2! 3!
x2
= 1−
2
= T 2,0 (x)

We also have that

−1 0 1
T 4,0 (x) = 1 + 0(x − 0) + (x − 0)2 + (x − 0)3 + (x − 0)4
2! 3! 4!
x2 x4
= 1− +
2 24

and
−1 0 1 0
T 5,0 (x) = 1 + 0(x − 0) + (x − 0)2 + (x − 0)3 + (x − 0)4 + (x − 0)5
2! 3! 4! 5!
x2 x4
= 1− +
2 24
= T 4,0 (x)

Calculus 2 (B. Forrest)2

Section 6.5: Review of Taylor Polynomials 265

An important observation to make is that not all of these polynomials are distinct. In
fact, T 0,0 (x) = T 1,0 (x), T 2,0 (x) = T 3,0 (x), and T 4,0 (x) = T 5,0 (x). In general, this
equality of different order Taylor polynomials happens when one of the derivatives
is 0 at x = a. (In this example at x = 0.) This can be seen by observing that for any n

f (n+1) (a)
T n+1,a (x) = T n,a (x) + (x − a)n+1
(n + 1)!

so if f (n+1) (a) = 0, we get T n+1,a (x) = T n,a (x).

The following diagram shows cos(x) and its Taylor polynomials up to degree 5. You
will notice that there are only four distinct graphs.

T 0,0 (x) = T 1,0 (x) = 1

x2 x4
T 4,0 (x) = T 5,0 (x) = 1 − 2
+ 24
0.5

0
−2 −1 1 2
−0.5 f (x) = cos(x)

−1
x2
T 2,0 (x) = T 3,0 (x) = 1 − 2

In the next example, we will calculate the Taylor Polynomials for f (x) = sin(x).

EXAMPLE 15 Find all of the Taylor polynomials up to degree 5 for the function f (x) = sin(x) with
center x = 0.
We can see that f (0) = sin(0) = 0, f 0 (0) = cos(0) = 1, f 00 (0) = − sin(0) = 0,
f 000 (0) = − cos(0) = −1, f (4) (0) = sin(0) = 0, and f (5) (0) = cos(0) = 1. It follows
that
T 0,0 (x) = 0,
and
T 1,0 (x) = L0 (x) = 0 + 1(x − 0) = x
and
0
T 2,0 (x) = 0 + 1(x − 0) + (x − 0)2
2!
= x
= T 1,0 (x).

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 266

Next we have
0 −1
T 3,0 (x) = 0 + 1(x − 0) + (x − 0)2 + (x − 0)3
2! 3!
x3
= x−
6

and that
0 −1 0
T 4,0 (x) = 0 + 1(x − 0) + (x − 0)2 + (x − 0)3 + (x − 0)4
2! 3! 4!
x3
= x−
6
= T 3,0 (x).

Finally,
0 −1 0 1
T 5,0 (x) = 0 + 1(x − 0) + (x − 0)2 + (x − 0)3 + (x − 0)4 + (x − 0)5
2! 3! 4! 5!
x3 x5
= x− +
6 5!
x3 x5
= x− + .
6 120

The following diagram includes the graph of sin(x) with its Taylor polynomials up
to degree 5, excluding T 0,0 since its graph is the x-axis.

T 1,0 (x) = T 2,0 (x) = x

1
x3 x5
T 5,0 (x) = x − 6
+ 120
f (x) = sin(x)
−3 −2 −1 0 1 2 3

−1

−2
x3
T 3,0 (x) = T 4,0 (x) = x − 6

−3

Calculus 2 (B. Forrest)2

Section 6.5: Review of Taylor Polynomials 267

Notice again that the polynomials are not distinct though, in general, as the degree
increases so does the accuracy of the estimate near x = 0.
To illustrate the power of using Taylor polynomials to approximate functions, we
can use a computer to aid us in showing that for f (x) = sin(x) and a = 0, we have
1 1 5 1 7 1 1 1
T 13,0 (x) = x − x3 + x − x + x9 − x11 + x13
6 120 5040 362880 39916800 62270 20800
The next diagram represents a plot of the function sin(x) − T 13,0 (x). (This represents
the error between the actual value of sin(x) and the approximated value of T 13,0 (x).)

0.02

sin(x) − T 13,0 (x)

0.01

0
−4 −2 2 4

−0.01

−0.02

Notice that the error is very small until x approaches 4 or −4. However, the y-scale
is different from that of the x-axis, so even near x = 4 or x = −4 the actual error is
still quite small. The diagram suggests that on the slightly more restrictive interval
[−π, π], T 13,0 (x) does an exceptionally good job of approximating sin(x).
To strengthen this point even further, we have provided the plot of the graph of
sin(x) − T 13,0 (x) on the interval [−π, π].

0.000024

sin(x) − T 13,0 (x)

0
−3 −2 −1 1 2 3

−0.000024

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 268

Note again the scale for the y-axis. It is clear that near 0, T 13,0 (x) and sin(x) are
essentially indistinguishable. In fact, we will soon have the tools to show that for
x ∈ [−1, 1],
| sin(x) − T 13,0 (x) |< 10−12
while for x ∈ [−0.01, 0.01],

| sin(x) − T 13,0 (x) |< 10−42 .

Indeed, in using T 13,0 (x) to estimate sin(x) for very small values of x, round-off
errors and the limitations of the accuracy in floating-point arithmetic become much
more significant than the true difference between the functions.

EXAMPLE 16 The function f (x) = e x is particularly well-suited to the process of creating

estimates using Taylor polynomials. This is because for any k, the k-th derivative of
e x is again e x . This means that for any n,
n
X f (k) (a)
T n,0 (x) = (x − a)k
k=0
k!
n
X e0
= (x − 0)k
k=0
k!
n
X xk
= .
k=0
k!

In particular,

T 0,0 (x) = 1,
T 1,0 (x) = 1 + x,
x2
T 2,0 (x) = 1 + x + ,
2
x2 x3
T 3,0 (x) = 1 + x + + ,
2 6
x2 x3 x4
T 4,0 (x) = 1 + x + + + , and
2 6 24
x2 x3 x4 x5
T 5,0 (x) = 1 + x + + + + .
2 6 24 120

Observe that in the case of e x , the Taylor polynomials are distinct since e x , and
hence all of its derivatives, is never 0.

The next diagram shows the graphs of e x and its Taylor polynomials up to degree 5.

Calculus 2 (B. Forrest)2

Section 6.6: Taylor’s Theorem and Errors in Approximations 269

12 f (x) = e x T 5,0 (x)

T 4,0 (x)
10
T 3,0 (x)
8
T 2,0 (x)
6

4 T 1,0 (x)

2
T 0,0 (x)

−2 −1 0 1 2

6.6 Taylor’s Theorem and Errors in Approximations

We have seen that using linear approximation and higher order Taylor polynomials
enable us to approximate potentially complicated functions with much simpler ones
with surprising accuracy. However, up until now we have only had qualitative
information about the behavior of the potential error. We saw that the error in using
Taylor polynomials to approximate a function seems to depend on how close we are
to the center point. We have also seen that the error in linear approximation seems
to depend on the potential size of the second derivative and that the approximations
seem to improve as we encode more local information. However, we do not have
any precise mathematical statements to substantiate these claims. In this section, we
will correct this deficiency by introducing an upgraded version of the Mean Value
Theorem called Taylor’s Theorem.
We begin by introducing some useful notation.

DEFINITION Taylor Remainder

Assume that f is n times differentiable at x = a. Let

Rn,a (x) = f (x) − T n,a (x).

Rn,a (x) is called the n-th degree Taylor remainder function centered at x = a.

The error in using the Taylor polynomial to approximate f is given by

Error =| Rn,a (x) | .

The following is the central problem for this approximation process.

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 270

Problem: Given a function f and a point x = a, how do we estimate the size of

Rn,a (x)?
The following theorem provides us with the answer to this question.

THEOREM 11 Taylor’s Theorem

Assume that f is n + 1-times differentiable on an interval I containing x = a. Let
x ∈ I. Then there exists a point c between x and a such that

f (n+1) (c)
f (x) − T n,a (x) = Rn,a (x) = (x − a)n+1 .
(n + 1)!

We will make three important observations about Taylor’s theorem.

1) First, since T 1,a (x) = La (x), when n = 1 the absolute value of the remainder
R1,a (x) represents the error in using the linear approximation. Taylor’s
Theorem shows that for some c,

f 00 (c)
| R1,a (x) |= | (x − a)2 |.
2
This shows explicitly how the error in linear approximation depends on the
potential size of f 00 (x) and on | x − a |, the distance from x to a.
2) The second observation involves the case when n = 0. In this case, the
theorem requires that f be differentiable on I and its conclusion states that for
any x ∈ I there exists a point c between x and a such that

f (x) − T 0,a (x) = f 0 (c)(x − a).

But T 0,a (x) = f (a), so we have

f (x) − f (a) = f 0 (c)(x − a).

Dividing by x − a shows that there is a point c between x and a such that

f (x) − f (a)
= f 0 (c).
x−a
This is exactly the statement of the Mean Value Theorem. Therefore, Taylor’s
Theorem is really a higher-order version of the MVT.

Calculus 2 (B. Forrest)2

Section 6.6: Taylor’s Theorem and Errors in Approximations 271

(x, f (x))

(a, f (a))

a c x

3) Finally, Taylor’s Theorem does not tell us how to find the point c, but rather
that such a point exists. It turns out that for the theorem to be of any value, we
really need to be able to say something intelligent about how large | f (n+1) (c) |
might be without knowing c. For an arbitrary function, this might be a
difficult task since higher order derivatives have a habit of being very
complicated. However, the good news is that for some of the most important
functions in mathematics, such as sin(x), cos(x), and e x , we can determine
roughly how large | f (n+1) (c) | might be and in so doing, show that the
estimates obtained for these functions can be extremely accurate.

EXAMPLE 17 Use linear approximation to estimate sin(.01) and show that the error in using this
approximation is less than 10−4 .
SOLUTION We know that f (0) = sin(0) = 0 and that f 0 (0) = cos(0) = 1, so

L0 (x) = T 1,0 (x) = x.

Therefore, the estimate we obtain for sin(.01) using linear approximation is

sin(.01) L0 (.01) = .01

Taylor’s Theorem applies since sin(x) is always differentiable. Moreover, if

f (x) = sin(x), then f 0 (x) = cos(x) and f 00 (x) = − sin(x). It follows that there exists
some c between 0 and .01 such that the error in the linear approximation is given by
f 00 (c)
− sin(c)

R1,0 (.01) = (.01 − 0) =
2 2
(.01)
2 2

Recall that the theorem does not tell us the value of c, but rather just that it exists.
Not knowing the value of c may seem to make it impossible to say anything

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 272

significant about the error, but this is actually not the case. The key observation in
this example is that regardless the value of point c, | − sin(c) |≤ 1. Therefore,

− sin(c)
| R1,0 (.01) | = 2
(.01)
2
1
≤ (.01)2
2
< 10−4 .

This simple process seems to be remarkably accurate. In fact, it turns out that this
estimate is actually much better than the calculation suggests. This is true because
not only does T 1,0 (x) = x, but we also have that T 2,0 (x) = T 1,0 (x) = x. This means
that there is a new number c between 0 and .01 such that
| sin(.01) − .01 | = | R2,0 (.01) |
000
f (c)
= (.01 − 0) 3
6

− cos(c)
= 3
(.01)
6

< 10−6

since | − cos(c) |≤ 1 for all values of c.

This shows that the estimate sin(.01) .01 is accurate to six decimal places. In fact,
the actual error is approximately −1.666658333 × 10−7 .
Finally, we know that for 0 < x < π2 , the tangent line to the graph of f (x) = sin(x) is
above the graph of f since sin(x) is concave downward on this interval. (In fact, the
Mean Value Theorem can be used to show that sin(x) ≤ x for every x ≥ 0.) Since the
tangent line is the graph of the linear approximation, this means that our estimate is
actually too large.

2
y=x

1 f (x) = sin(x)

0
1 2 3

Calculus 2 (B. Forrest)2

Section 6.6: Taylor’s Theorem and Errors in Approximations 273

Taylor’s Theorem can be used to confirm this because

− sin(c) 2
sin(x) − x = R1,0 (x) = (x) < 0
2
since sin(c) > 0 for any c ∈ (0, π2 ).

In the next example we will see how Taylor’s Theorem can help in calculating
various limits. In order to simplify the notation, we will only consider limits as
x → 0.

sin(x) − x
EXAMPLE 18 Find lim .
x→0 x2
SOLUTION First notice that this is an indeterminate limit of the type 00 .
We know that if f (x) = sin(x), then T 1,0 (x) = T 2,0 (x) = x. We will assume that we
are working with T 2,0 . Then Taylor’s Theorem shows that for any x ∈ [−1, 1], there
exists a c between 0 and x such that

− cos(c) 3 1
| sin(x) − x | = x ≤ | x |3
3! 6
since | − cos(c) | ≤ 1 regardless where c is located. This inequality is equivalent to
−1 1
| x |3 ≤ sin(x) − x ≤ | x |3 .
6 6
If x , 0, we can divide all of the terms by x2 to get that for x ∈ [−1, 1]

− | x |3 sin(x) − x | x |3
≤ ≤
6x2 x2 6x2
or equivalently that
− | x | sin(x) − x | x |
≤ ≤ .
6 x2 6
We also know that
−|x| |x|
lim= lim =0
x→0 6 x→0 6

The Squeeze Theorem guarantees that

sin(x) − x
lim = 0.
x→0 x2

The technique we outlined in the previous example can be used in much more
generality. However, we require the following observation.

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 274

Suppose that f (k+1) is a continuous function on [−1, 1]. Then so is the function
(k+1)
f (x)
g(x) = .
(k + 1)!

The Extreme Value Theorem tells us that g has a maximum on [−1, 1]. Therefore,
there is an M such that (k+1)
f (x)
≤M
(k + 1)!

for all x ∈ [−1, 1].

Let x ∈ [−1, 1]. Taylor’s Theorem assures us that there is a c between x and 0 such
that (k+1)
f (c) k+1
| Rk,0 (x) | = x .
(k + 1)!
Therefore,

| f (x) − T k,0 (x) | = | Rk,0 (x) |

(k+1)
f (c) k+1
= x
(k + 1)!
≤ M | x |k+1

since c is also in [−1, 1].

It follows that
−M | x |k+1 ≤ f (x) − T k,0 (x) ≤ M | x |k+1 .

We summarize this technique as follows:

THEOREM 12 Taylor’s Approximation Theorem I

Assume that f (k+1) is continuous on [−1, 1]. Then there exists a constant M > 0 such
that
| f (x) − T k,0 (x) |≤ M | x |k+1
or equivalently that

−M | x |k+1 ≤ f (x) − T k,0 (x) ≤ M | x |k+1

for each x ∈ [−1, 1].

This theorem is very helpful in calculating many limits.

Calculus 2 (B. Forrest)2

Section 6.6: Taylor’s Theorem and Errors in Approximations 275

cos(x) − 1
EXAMPLE 19 Calculate lim .
x→0 x2
2
SOLUTION We know that for f (x) = cos(x) we have T 2,0 = 1 − x2 . Moreover, all
of the derivatives of cos(x) are continuous everywhere. The Taylor Approximation
Theorem tells us that there is a constant M such that
x2
−M | x |3 ≤ cos(x) − (1 − ) ≤ M | x |3
2
for all x ∈ [−1, 1]. Dividing by x2 with x , 0 we have that
x2
cos(x) − (1 − 2
)
−M | x | ≤ ≤M|x|
x2
for all x ∈ [−1, 1]. Simplifying the previous expression produces
cos(x) − 1 1
−M | x | ≤ + ≤M|x|
x2 2
for all x ∈ [−1, 1].
Applying the Squeeze Theorem we have that
cos(x) − 1 1
lim + =0
x→0 x2 2
which is equivalent to
cos(x) − 1 −1
lim = .
x→0 x2 2

This limit is consistent with the behavior of the function h(x) = cos(x)−1
x2
near 0. This
is illustrated in the following graph.

−1 −0.5 0.5 1 x
0

−0.1

−0.2

−0.3
cos(x) − 1
h(x) =
−0.4 x2

−0.5

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 276

The previous limit can actually be calculated quite easily using L’Hôpital’s Rule. As
an exercise, you should try to verify the answer using this rule. The next example
would require much more work using L’Hôpital’s Rule. It is provided to show you
how powerful Taylor’s Theorem can be for finding limits.

x4
e 2 − cos(x2 )
EXAMPLE 20 Find lim .
x→0 x4
SOLUTION This is an indeterminate limit of type 00 . We know from Taylor’s
Approximation Theorem that we can find a constant M1 such that for any u ∈ [−1, 1]

−M1 u2 ≤ eu − (1 + u) ≤ M1 u2

since 1 + u is the first degree Taylor polynomial of eu . Now if x ∈ [−1, 1], then
4
u = x2 ∈ [−1, 1]. In fact, u ∈ [0, 12 ]. It follows that if x ∈ [−1, 1] and we substitute
4
u = x2 , then we get
−M1 x8 x4 x4 M1 x 8
≤ e − (1 + ) ≤
2 .
4 2 4
We also can show that there exists a constant M2 such that for any v ∈ [−1, 1]

v2
−M2 v4 ≤ cos(v) − (1 − ) ≤ M2 v4
2
v2
since 1 − 2
is the third degree Taylor polynomial for cos(v).
If x ∈ [−1, 1] then so is x2 . If we let v = x2 , then we see that
x4
−M2 x8 ≤ cos(x2 ) − (1 − ) ≤ M2 x8 .
2

The next step is to multiply each term in the previous inequality by −1 to get

x4
−M2 x8 ≤ (1 − ) − cos(x2 ) ≤ M2 x8 .
2
(Remember, multiplying by a negative number reverses the inequality.)
Now add the two inequalities together:

M1 x4 x4 x4 M1
−( + M2 )x8 ≤ e 2 − (1 + ) + (1 − ) − cos(x2 ) ≤ ( + M2 )x8 .
4 2 2 4
If we let M = M1
4
+ M2 and simplify, this inequality becomes

x4
−Mx8 ≤ e 2 − cos(x2 ) − x4 ≤ Mx8
for all x ∈ [−1, 1]. Dividing by x4 gives us that
x4
e 2 − cos(x2 )
−Mx ≤4
− 1 ≤ Mx4 .
x4

Calculus 2 (B. Forrest)2

Section 6.7: Introduction to Taylor Series 277

The final step is to apply the Squeeze Theorem to show that

x4
e 2 − cos(x2 )
lim −1=0
x→0 x4
or equivalently that
x4
e 2 − cos(x2 )
lim = 1.
x→0 x4
x4
−cos(x2 )
This limit can be confirmed visually from the graph of the function h(x) = e 2
x4
.

1.1
1.08
1.06
1.04
1.02
1

0.98 x4
e 2 − cos(x2 )
0.96 h(x) =
x4
0.94
0.92
0.9
−1 −0.5 0 0.5 1

The previous example involved a rather complicated argument. However, with a

little practice using Taylor polynomials and the mastery of a few techniques, limits
like this can actually be done by inspection!

6.7 Introduction to Taylor Series

Given a function f (x) that can be represented by a power series

∞
X
f (x) = an (x − a)n
n=0

centered at x = a with radius of convergence R > 0, we have seen that f (x) has
derivatives of all orders at x = a and that
f (n) (a)
an = .
n!

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 278

In fact,
∞
X f (n) (a)
f (x) = (x − a)n .
n=0
n!

If we assume that a function f has derivatives of all orders at a ∈ R then this series
can certainly be constructed.

DEFINITION Taylor Series

∞
f (n) (a)
− a)n is
P
Assume that f has derivatives of all orders at a ∈ R. The series n!
(x
n=0
called the Taylor series for f centered at x = a.
We write ∞
X f (n) (a)
f (x) ∼ (x − a)n .
n=0
n!

In the special case where a = 0, the series is referred to as the Maclaurin series for
f.

Remark:
Up until now, we have started with a function that was represented by a power series
on its interval of convergence. In this case, the series that represents the function
must be the Taylor Series.
However, suppose that f is any function for which f (n) (a) exists for each n. Then we
can build the power series
∞
X f (n) (a)
(x − a)n .
n=0
n!
However, we do not know the following:

1) For which values of x does the series

∞
X f (n) (a)
(x − a)n
n=0
n!

converge?
2) If the series converges at x0 , is it true that
∞
X f (n) (a)
f (x0 ) = (x0 − a)n ?
n=0
n!

These two questions essentially ask whether a function f can be fully reconstructed
from the data set consisting of the values the derivatives of all order at a point a ∈ R.

Calculus 2 (B. Forrest)2

Section 6.7: Introduction to Taylor Series 279

The answer to the first question can be answered by using the method developed for
finding the interval of convergence of a power series.
The second problem seems intuitively like it should be true at any point where the
series converges. However, a closer look reveals why this may not be true.
Essentially we are trying to rebuild a function over an interval that could very well
be the entire Real line by using only the information provided by the function at one
single point. In this respect, it seems that using only information about e x at x = 0 to
get
∞
X xn
ex =
n=0
n!
and as such to completely reproduce the function for all values of x seems quite
remarkable and indeed it is! To further illustrate why e x is such a remarkable
function in this regard, consider the following example.

EXAMPLE 21 Consider the function g which is obtained by modifying f (x) = e x outside the
interval [−1, 1]:
if x < −1
 1


 e




g(x) = 

 x
 e if − 1 ≤ x ≤ 1




 e if x > 1



On the interval [−1, 1], g(x) behaves exactly like e x . In particular, g(0) = e0 = 1 and
g(n) (0) = e0 = 1 for every n. This means that the Taylor series centered at x = 0 for
g(x) is
∞
X xn
n=0
n!
which is exactly the same Taylor Series for e x . We already know that this series
converges for all x ∈ R and that
∞
X xn
ex = .
n=0
n!

This means that the Taylor series for g centered at x = 0 also converges for all x ∈ R
and in particular at x = 2. However at x = 2, g(2) = e while
∞
X 2n
= e2 , g(2).
n=0
n!

Hence, this is an example of a function g with the property that its Taylor Series
converges at a point x0 but
∞
X g(n) (a)
g(x0 ) , (x0 − a)n .
n=0
n!

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 280

EXAMPLE 22 Find the Taylor series centered at x = 0 for f (x) = cos(x) and g(x) = sin(x).
We have that
f 0 (x) = − sin(x) =⇒ f 0 (0) = − sin(0) = 0
f 00 (x) = − cos(x) =⇒ f 00 (0) = − cos(0) = −1
f 000 (x) = sin(x) =⇒ f 000 (0) = sin(0) = 0
f (4) (x) = cos(x) =⇒ f (4) (0) = cos(0) = 1
f (5) (x) = − sin(x) =⇒ f (5) (0) = − sin(0) = 0
f (6) (x) = − cos(x) =⇒ f (6) (0) = − cos(0) = −1
f (7) (x) = sin(x) =⇒ f (7) (0) = sin(0) = 0
f (8) (x) = cos(x) =⇒ f (8) (0) = cos(0) = 1
..
.

with the cycle repeating itself every four derivatives. Therefore

f (4k)
(x) = cos(x) =⇒ f (4k)
(0) = cos(0) = 1
f (4k+1)
(x) = − sin(x) =⇒ f (4k+1)
(0) = − sin(0) = 0
f (4k+2)
(x) = − cos(x) =⇒ f (4k+2)
(0) = − cos(0) = −1
f (4k+3)
(x) = sin(x) =⇒ f (4k+3)
(0) = sin(0) = 0

Hence, ∞
X f (n) (0) n
cos(x) ∼ x
n=0
n!
0x −1x2 0x3 1x4
= 1+ + + + + ···
1! 2! 3! 4!
x2 x4 x6
= 1− + − + ···
2! 4! 6!
∞ 2k
k x
X
= (−1)
k=0
(2k)!

A similar calculation shows that

x3 x5 x7
sin(x) ∼ x − + − + ···
3! 5! 7!
∞
X x2k+1
= (−1)k
k=0
(2k + 1)!

The problems that remain is to determine if

∞
X x2k
cos(x) = (−1)k
k=0
(2k)!
and if ∞
X x2k+1
sin(x) = (−1)k .
k=0
(2k + 1)!

Calculus 2 (B. Forrest)2

Section 6.8: Convergence of Taylor Series 281

To answer these questions we will need to use Taylor’s Theorem.

6.8 Convergence of Taylor Series

In this section we return to a question that we asked earlier.

Question: Given a function f that is infinitely differentiable at x = a, is f equal to
its Taylor Series? That is, does
∞
X f (n) (a)
f (x) = (x − a)n
n=0
n!

for any x at which the Taylor Series converges?

Unfortunately, we saw that this need not be true even if the Taylor Series converges
everywhere. Taylor’s Theorem gives us a means to show that for many important
functions this equality does hold. To see why this is the case, note that if we fix an
x0 , then
k
X f (n) (a)
T k,a (x0 ) = (x0 − a)n
n=0
n!
is not only the k-th partial sum of the Taylor Series for f centered at x = a, but it is
also the k-th degree Taylor polynomial. As such, Taylor’s Theorem shows that

| f (x0 ) − T k,a (x0 ) |=| Rk,a (x0 ) | .

If we can show that

lim Rk,a (x0 ) = 0
k→∞

then ∞
X f (n) (a)
f (x) = lim T k,a (x0 ) = (x − a)n .
k→∞
n=0
n!
Therefore, f (x) agrees with its Taylor series precisely when the Taylor remainders

Rk,a (x) → 0

as k goes to ∞.
Remark: Before we present the next example we need to recall the following limit
which we previously established as a consequence of the Ratio Test.
Let x0 ∈ R. Then
M | x0 |k
lim = 0.
k→∞ k!

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 282

EXAMPLE 23 Let f (x) = cos(x) and a = 0. Let x0 be any point in R. Taylor’s Theorem shows that
for each k there exists a point ck between 0 and x0 such that
(k+1)
f (ck ) k+1
| Rk,a (x0 ) |= x
(k + 1)! 0

We have seen that if f (x) = cos(x), then f 0 (x) = − sin(x), f 00 (x) = − cos(x),
f 000 (x) = sin(x) and f (4) (x) = cos(x). Since the fourth derivative is again cos(x), the
5-th, 6-th, 7-th and 8-th derivative will be, respectively, f (5) (x) = − sin(x),
f (6) (x) = − cos(x), f (7) (x) = sin(x) and f (8) (x) = cos(x). This pattern will be
repeated for the 9-th, 10-th, 11-th and 12-th derivatives, and then for every group of
four derivatives thereafter. In fact, what we have just shown is that if f (x) = cos(x),
then for any k

cos(x) if k = 4 j








− sin(x) if k = 4 j + 1





f (x) = 
(k)


− cos(x) if k = 4 j + 2










 sin(x) if k = 4 j + 3



where j = 0, 1, 2, · · · . However, this means that regardless the value of k or where ck

is, we will have

| f (k+1) (ck ) |≤ 1.

It follows immediately that

(k+1)
f (ck ) k+1 | x0 |k+1
Rk,a (x0 ) = x0 ≤ .
(k + 1)! (k + 1)!

|x0 |k
However, we know that lim = 0, so the Squeeze Theorem shows that
k→∞ k!

lim Rk,a (x0 ) = 0.

k→∞

Therefore, since x0 was chosen at random, for f (x) = cos(x) and any x ∈ R, we have

∞
X f (n) (0) n
f (x) = x.
n=0
n!

In particular, for any x ∈ R

∞
X x2k
cos(x) = (−1)k .
k=0
2k!

Calculus 2 (B. Forrest)2

Section 6.8: Convergence of Taylor Series 283

A similar argument applies to sin(x) as it did for cos(x) to show that for any x ∈ R,
sin(x) agrees with the value of its Taylor series. That is,

∞
X x2k+1
sin(x) = (−1)k .
k=0
(2k + 1)!

Remark: Notice that in each of the previous examples that if either f (x) = cos(x) or
f (x) = sin(x), then the function f had the property that for any k = 0, 1, 2, 3, . . . and
for each x ∈ R, then (k)
f (x) ≤ 1.

The fact that we can find a simultaneous uniform bound for the size of all of the
derivatives of f over all of R was the key to showing that both cos(x) and sin(x)
agree with their Taylor series. In fact, these two examples suggest the following
very useful theorem.

THEOREM 13 Convergence Theorem for Taylor Series

Assume that f (x) has derivatives of all orders on an interval I containing x = a.
Assume also that there exists an M such that

| f (k) (x) | ≤ M

for all k and for all x ∈ I. Then

∞
X f (n) (a)
f (x) = (x − a)n
n=0
n!

for all x ∈ I.

PROOF
We know that T k,a (x) is the k-th partial sum of the Taylor series centered at x = a.
We also know that the Taylor series converges at x = a and that
∞
X f (n) (a)
(a − a)n = f (a) + 0 + 0 + 0 + · · · = f (a)
n=0
n!

so we only need to show the theorem holds for x0 ∈ I with x0 , a.

Choose x0 ∈ I with x0 , a. Let k ∈ N ∪ {0}. Then Taylor’s Theorem tells us that
there exists a c between a and x0 so that
| f (k+1) (c)|
| f (x0 ) − T k,a (x0 )| = |x0 − a|k+1 .
(k + 1)!

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 284

But since
| f (k+1) (c) | ≤ M
we have that
|x0 − a|k+1
0 ≤ | f (x0 ) − T k,a (x0 )| ≤ M · .
(k + 1)!
Since
|x0 − a|k+1 |x0 − a|k+1
lim M · = M · lim =0
k→∞ (k + 1)! k→∞ (k + 1)!

the Squeeze Theorem shows that

lim | f (x0 ) − T k,a (x0 )| = 0

k→∞

and hence that ∞

X f (n) (a)
f (x0 ) = lim T k,a (x0 ) = (x0 − a)n
k→∞
n=0
n!
for all x0 ∈ I.

We showed using term-by-term differentiation of power series that

∞ ∞
X f (n) (0) n X xn
ex = (x) = .
n=0
n! n=0
n!

We can now provide a different proof of this important result.

EXAMPLE 24 Let f (x) = e x and let a = 0. Let I = [−B, B]. We know that for each k, f (k) (x) = e x .
Moreover, since e x is increasing,

0 < e−B ≤ e x ≤ eB

for all x ∈ [−B, B]. This means that if M = eB , then for all x ∈ [−B, B] and all k, we
have

| f (k) (x) |= e x ≤ eB = M.

All of the conditions of the Convergence Theorem for Taylor Series are satisfied. It
follows that for any x ∈ [−B, B],

∞ ∞
X f (n) (0) n X xn
e =
x
(x) = .
n=0
n! n=0
n!

Finally, we see that this would work regardless of what B we choose. However,
given any x ∈ R, if we pick a B such that | x |< B, then x ∈ [−B, B]. This means that
for this x

Calculus 2 (B. Forrest)2

Section 6.9: Binomial Series 285

∞
X xn
e =
x
.
n=0
n!

Hence for every x ∈ R, the equality

∞
X xn
e =
x

n=0
n!

holds.

6.9 Binomial Series

Consider the following version of the Binomial Theorem:

THEOREM 14 Binomial Theorem

Let a ∈ R and n ∈ N. Then for each x ∈ R we have that
n !
X n n−k k
(a + x) =
n
a x
k=0
k

where !
n n!
= .
k k!(n − k)!
In particular, when a = 1 we have
n
X n(n − 1)(n − 2) · · · (n − k + 1)
(1 + x) = 1 +
n
xk .
k=1
k!

Remark: Consider the expression

n(n − 1)(n − 2) · · · (n − k + 1)
.
k!
Typically we are only concerned with the case where k ∈ {0, 1, 2, . . . , n}. But the
expression actually makes sense for any k ∈ N ∪ {0}. If k > n, then one of the terms
in the expression
n(n − 1)(n − 2) · · · (n − k + 1)
will be 0 and so
n(n − 1)(n − 2) · · · (n − k + 1)
= 0.
k!

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 286

Consequently,
n
X n(n − 1)(n − 2) · · · (n − k + 1)
(1 + x)n = 1 + xk
k=1
k!
∞
X n(n − 1)(n − 2) · · · (n − k + 1) k
= 1+ x
k=1
k!

This leaves us to make the rather strange observation that the polynomial function
(1 + x)n is actually represented by the power series
∞
X n(n − 1)(n − 2) · · · (n − k + 1)
1+ xk
k=1
k!
∞
In other words, 1 + n(n−1)(n−2)···(n−k+1) k
is the Taylor Series centered at x = 0 for
P
k!
x
k=1
the function (1 + x) .n

By itself the observation above does not tell us anything new about the function
(1 + x)n . However it does give us an important clue towards answering the following
question.
Question: Suppose that α ∈ R. Is there an analog of the Binomial Theorem for the
function
(1 + x)α ?

To answer this question, one strategy would be to mimic what happens with the
classical Binomial Theorem. We begin by defining the generalized binomial
coefficients and the generalized binomial series.

DEFINITION Generalized Binomial Coefficients and Binomial Series

Let α ∈ R and let k ∈ {0, 1, 2, 3, . . .}. Then we define the generalized binomial
coefficient
α α(α − 1)(α − 2) · · · (α − k + 1)
!
=
k k!
if k , 0 and
α
!
= 1.
0

We also define the generalized binomial series for α to be the power series
∞ ∞
α(α − 1)(α − 2) · · · (α − k + 1) α k
X X !
1+ x =
k
x.
k=1
k! k=0
k

Remark: The first problem is to find the

radius of convergence for the generalized
binomial series. To do this let bk =| αk |. Then a straight-forward calculation shows
that if k ≥ 1
bk+1 | α − k |
= .
bk k+1

Calculus 2 (B. Forrest)2

Section 6.9: Binomial Series 287

It follows that
bk+1 |α−k |
lim = lim = 1.
k→∞ bk k→∞ k + 1

This tells us that the radius of convergence for the binomial series is 1. In particular,
the series converges absolutely on (−1, 1).
Next we must determine if
∞ ∞
α(α − 1)(α − 2) · · · (α − k + 1) α k
X X !
α
(1 + x) = 1 + x = k
x ?
k=1
k! k=0
k

To see why this is true we start with the following calculation which
shows that if k ≥ 1, then
α α α(α − 1) · · · (α − k + 1)
! !
(k + 1) + k = (α − k)
k+1 k k!
α(α − 1) · · · (α − k + 1)
+ (k)
k!
α(α − 1) · · · (α − k + 1)
= (α)
k!
α
!
= α
k

Next let
∞ ∞
α(α − 1)(α − 2) · · · (α − k + 1) α k
X X !
f (x) = 1 + x =
k
x
k=1
k! k=0
k

for each x ∈ (−1, 1). We claim that

f 0 (x) + x f 0 (x) = α f (x)

for each x ∈ (−1, 1). To see why this is true we use term-by-term differentiation to
get that

∞ ∞
α k−1 X α k
X ! !
f (x) + x f (x) =
0 0
kx + kx
k=1
k k=1
k
∞ ∞
α α k−1 X α k
! X ! !
= + kx + kx
1 k=2
k k=1
k
∞ ∞
α α k
X ! X !
= α+ (k + 1)x +
k
kx
k=1
k + 1 k=1
k
∞
X α α
! !
= α+ ( (k + 1) + k)xk
k=1
k + 1 k

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 288

But if k ≥ 1 we have
α α α
! ! !
(k + 1) + k=α .
k+1 k k
It follows that
∞
α k
X !
f (x) + x f (x) = α + α
0 0
x
k=1
k
∞
X α!
= α xk
k=0
k

= α f (x)

as claimed.
Next let
f (x)
g(x) = .
(1 + x)α
Then g is differentiable on (−1, 1) with
f 0 (x)(1 + x)α − α f (x)(1 + x)α−1
g 0 (x) =
(1 + x)2α
f 0 (x)(1 + x)α − (1 + x) f 0 (x)(1 + x)α−1
=
(1 + x)2α
f 0 (x)(1 + x)α − f 0 (x)(1 + x)α
=
(1 + x)2α

= 0

since α f (x) = (1 + x) f 0 (x).

Since g 0 (x) = 0 for all x ∈ (−1, 1), g(x) is constant on this interval. However,
g(0) = f (0) = 1.
Therefore, g(x) = 1 for all x ∈ (−1, 1). It follows that
f (x) = (1 + x)α
for all x ∈ (−1, 1).

THEOREM 15 Generalized Binomial Theorem

Let α ∈ R. Then for each x ∈ (−1, 1) we have that
∞ ∞
α(α − 1)(α − 2) · · · (α − k + 1) α k
X X !
α
(1 + x) = 1 + x =
k
x.
k=1
k! k=0
k

Calculus 2 (B. Forrest)2

Section 6.10: Binomial Series 289

EXAMPLE 25 Use the Generalized Binomial Theorem to find a power series representation
for (1 + x)−2 .
The Generalized Binomial Theorem shows that

∞ !
X −2 k
(1 + x) −2
= x.
k=0
k

For k ≥ 1,
(−2)(−2 − 1) · · · (−2 − k + 1)
!
−2
= = (−1)k (k + 1).
k k!
It is also true that !
−2
= 1 = (−1)0 (0 + 1).
0

Therefore,
∞
X
(1 + x) −2
= (−1)k (k + 1)xk
k=0
X∞
= (−1)k−1 kxk−1
k=1

We can also use term-by-term differentiation to verify the previous calculation. First
begin with
∞
1 X
= uk
1 − u k=0
for all u ∈ (−1, 1). Differentiating both sides gives us
∞
1 X
= kuk−1
(1 − u)2 k=1

for all u ∈ (−1, 1).

Finally, if we let u = −x, we have
∞ ∞
1 X X
= k(−x)k−1
= (−1)k−1 kxk−1
(1 + x)2 k=1 k=1

for all x ∈ (−1, 1).

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 290

6.10 Additional Examples and Applications of Taylor Series

In this section we will present some further examples of functions that are
representable by their Taylor series and see what this tells us about these functions.

EXAMPLE 26 Find a power series representation for f (x) = arctan(x) and determine the interval
on which the representation is valid.
We begin with the observation that dx d
(arctan(x)) = 1+x
1
2 . Therefore, if we can find a
1
power series representation for 1+x2 , we can use the integration techniques to find a
representation for arctan(x).
We know that for any u ∈ (−1, 1),
∞
1 X
= un .
1 − u n=0

Let x ∈ (−1, 1). If we let u = −x2 , then u ∈ (−1, 1). It follows that
1 1
=
1+x 2 1 − (−x2 )
X∞
= (−x2 )n
n=0
∞
X
= (−1)n x2n
n=0

1
Since arctan(x) is an antiderivative of 1+x 2 , the Integration of Power Series Theorem

shows that there exists a constant C such that

X ∞ Z
arctan(x) = C + (−1)n x2n dx
n=0
∞
X x2n+1
= C+ (−1)n
n=0
2n + 1

To find C, note that arctan(0) = 0, so

0 = arctan(0)
∞
X 02n+1
= C+ (−1)n
n=0
2n + 1
= C

Calculus 2 (B. Forrest)2

Section 6.10: Additional Examples and Applications of Taylor Series 291

Therefore, we have shown that if x ∈ (−1, 1)

∞
X x2n+1
arctan(x) = (−1)n .
n=0
2n + 1

The radius of convergence of this series is 1. At x = 1, the series becomes

∞
X 1
(−1)n
n=0
2n + 1

which converges by the Alternating Series Test.

When x = −1, the series is
∞ 2n+1 ∞
n (−1) 1
X X
(−1) = (−1)3n+1 .
n=0
2n + 1 n=0
2n + 1

Since (
1 if n is odd
(−1) 3n+1
=
−1 if n is even
the series is the same as ∞
X 1
(−1)n+1
n=0
2n + 1
which also converges by the Alternating Series Test.
Therefore, the Continuity Theorem for Power Series shows that
∞
X x2n+1
arctan(x) = (−1)n
n=0
2n + 1

for all x ∈ [−1, 1].

This last statement has an interesting application. We get that
π
= arctan(1)
4
∞
X 1
= (−1)n
n=0
2n + 1

Multiplying both sides of this equation by 4 gives

1 1 1 1
π = 4(1 − + − + − ···)
3 5 7 9
4 4 4 4
= 4 − + − + − ···
3 5 7 9

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 292

Note: This series representation for arctan(x) is called the Gregory’s series after the
Scottish mathematician of the same name. The famous series expansion for π which
we derived from Gregory’s series is called Leibniz’s formula for π.

EXAMPLE 27 (i) Find the Taylor series centered at x = 0 for the integral function
Z x
F(x) = cos(t2 ) dt.
0

(ii) Find F (9) (0) and F (16) (0).

R 0.1 1
(iii) Estimate 0
cos(t2 ) dt with an error of less than 106
.

SOLUTIONS
(i) For any u ∈ R,
∞
X u2n
cos(u) = (−1)n .
n=0
(2n)!

Let u = t2 to get that for any t ∈ R,

∞ ∞
X (t2 )2n X t4n
cos(t2 ) = (−1)n = (−1)n .
n=0
(2n)! n=0 (2n)!

The Integration Theorem for Power Series gives us that

Z x
F(x) = cos(t2 ) dt
0
Z xX ∞ 4n
n t
= (−1) dt
0 n=0 (2n)!
∞ Z x
X t4n
= (−1)n dt
n=0 0
(2n)!
∞ " x #
X t4n+1
= n
(−1)
(4n + 1)(2n)! 0

n=0
∞
X x4n+1
= (−1)n
n=0
(4n + 1)(2n)!

This is valid for any x ∈ R. Moreover, by the Uniqueness Theorem for Power Series
Representations, this must be the Taylor series centered at x = 0 for F.
(ii) To find F (9) (0), we recall that if
∞
X
F(x) = ak x k
k=0

Calculus 2 (B. Forrest)2

Section 6.10: Additional Examples and Applications of Taylor Series 293

then
F (9) (0)
a9 = .
9!

This tells us that to find F (9) (0) we must first identify the coefficient of x9 in
∞
X x4n+1
(−1)n .
n=0
(4n + 1)(2n)!

Notice for x9 , we let n = 2. The coefficient is then

1 1
(−1)2 ( )=
(4(2) + 1)(2(2))! 9(4!)
Therefore,
1
a9 = .
9(4!)

Finally, this means

F (9) (0) = 9!a9

9!
=
9(4!)
= 5·6·7·8
= 1680

Next, to find F (16) (0) we look for the coefficient of x16 in the Taylor series for F(x).
However, this time there is no n such that x4n+1 = x16 . This means that a16 = 0 and
hence that
F (16) (0) = 0.

(iii) Since
∞
X x4n+1
F(x) = (−1)n
n=0
(4n + 1)(2n)!
we have
0.1 ∞
(0.1)4n+1
Z X
cos(t ) dt = F(0.1) =
2
(−1)n .
0 n=0
(4n + 1)(2n)!
This is an alternating series with

(0.1)4n+1
an = .
(4n + 1)(2n)!
Moreover, we see that
(0.1)5 1
a1 = = 6
(5)(2)! 10

Calculus 2 (B. Forrest)2

Chapter 6: Power Series 294

and
0
X (0.1)4n+1 (0.1)
(−1)n = .
n=0
(4n + 1)(2n)! 1
Using the error estimate in the Alternating Series Test we get that
Z 0.1
1
cos(t2 ) dt − 0.1 < a1 = 6 .

10

0

Suppose that we wanted to know the value of

Z 1
2 1
dx.
0 1 + x9
1
Since 1+x 9 looks like a rather simple rational function, we might be tempted to use

partial fractions to try and calculate the integral exactly. At least theoretically, this
should work. However, in practice, this would require us to factor the polynomial
1 + x9 which would certainly require the aid of a sophisticated computer algebra
program such as Maple. Even then, the answer that we would get would not be very
useful. (Try it!)
Fortunately, we can use what we know about series to get an extremely accurate
approximation to this integral with surprisingly little effort.

R 1
1
EXAMPLE 28 Estimate 0
2
1+x9
dx with an error of less than 10−12 .
We know that for any −1 < u < 1,
∞
1 X
= un .
1 − u n=0

If 0 ≤ x ≤ 12 , then −1 < −x9 < 1, so we can let u = −x9 to get

∞
1 1 X
= = (−1)n x9n
1+x 9 9
1 − (−x ) n=0

for all x ∈ [0, 12 ].

Calculus 2 (B. Forrest)2

Section 6.10: Additional Examples and Applications of Taylor Series 295

The Integration Theorem for Power Series shows that

Z 1 Z 1 ∞
2 1 2 X
dx = (−1)n x9n dx
0 1 + x9 0 n=0
∞ Z 1
X 2
= (−1) n
x9n dx
n=0 0
∞ 1
X x9n+1 2
= (−1) n
9n + 1 0

n=0
∞
X ( 1 )9n+1
= (−1) n 2

n=0
9n + 1

Notice that the numerical series we have just obtained satisfies the conditions of the
Alternating Series Test. In particular, we can use the error estimation in the
Alternating Series Test to conclude that
1 k
( 1 )9n+1 ( 12 )9(k+1)+1
Z 2 1 X
n 2
| dx − (−1) | ≤
0 1 + x9 n=0
9n + 1 9(k + 1) + 1

( 12 )9k+10
=
9k + 10

If we let k = 3, we get
1 3
( 1 )9n+1 1
Z 2 1 X 1 1 1
dx (−1) n 2
= − + −
0 1+x 9
n=0
9n + 1 2 10(2 ) 19(2 ) 28(228 )
10 19

with an error that is less than

1 1
=
29(3)+10 [9(3) + 10] 37(237 )
1
=
5.08524 × 1012

Calculus 2 (B. Forrest)2

Notes-Higher Orde ODE
No ratings yet
Notes-Higher Orde ODE
52 pages
Haskell Solutions
No ratings yet
Haskell Solutions
31 pages
Quetta-Pishin District Gazetteer 1905
100% (1)
Quetta-Pishin District Gazetteer 1905
420 pages
The Concept of Knowledge The Ankara Seminar (Guido Küng (Auth.), Ioanna Kuçuradi Etc.)
No ratings yet
The Concept of Knowledge The Ankara Seminar (Guido Küng (Auth.), Ioanna Kuçuradi Etc.)
288 pages
AAA Companion Notes-Rudin
100% (2)
AAA Companion Notes-Rudin
434 pages
AISHE Final Report 2020-21 PDF
No ratings yet
AISHE Final Report 2020-21 PDF
310 pages
PDE Notes
No ratings yet
PDE Notes
55 pages
Concepts in Calculus I - Beta Version - 2011 - Miklos Bona - Sergei Shabanov
No ratings yet
Concepts in Calculus I - Beta Version - 2011 - Miklos Bona - Sergei Shabanov
188 pages
On Cellular Automata Representation of Submicroscopic Physics: From Static Space To Zuse's Calculating Space Hypothesis
No ratings yet
On Cellular Automata Representation of Submicroscopic Physics: From Static Space To Zuse's Calculating Space Hypothesis
11 pages
Math. Intelligencer 1989 Krantz
No ratings yet
Math. Intelligencer 1989 Krantz
5 pages
Abstract Algebra Notes - by Me-1
No ratings yet
Abstract Algebra Notes - by Me-1
282 pages
AP Live Review Calculus AB - 2022
No ratings yet
AP Live Review Calculus AB - 2022
3 pages
Graphblas
No ratings yet
Graphblas
8 pages
Seidenberg A.-Elements of The Theory of Algebraic Curves-Addison-Wesley (1968)
No ratings yet
Seidenberg A.-Elements of The Theory of Algebraic Curves-Addison-Wesley (1968)
224 pages
Introduction To Theory of The Complexity
100% (2)
Introduction To Theory of The Complexity
290 pages
Volker Strassen (Auth.), Raymond E. Miller, James W. Thatcher, Jean D. Bohlinger (Eds.) - Complexity of Computer Computations
No ratings yet
Volker Strassen (Auth.), Raymond E. Miller, James W. Thatcher, Jean D. Bohlinger (Eds.) - Complexity of Computer Computations
227 pages
Maths Sample Question Paper - Year 7 Entrance Examination (1) For Alma
No ratings yet
Maths Sample Question Paper - Year 7 Entrance Examination (1) For Alma
2 pages
Eog GH8 Aec
No ratings yet
Eog GH8 Aec
971 pages
Lambek
No ratings yet
Lambek
27 pages
Euler's Constant
No ratings yet
Euler's Constant
8 pages
Digital Communications: Chapter 2: Deterministic and Random Signal Analysis
No ratings yet
Digital Communications: Chapter 2: Deterministic and Random Signal Analysis
106 pages
Sanet ST
No ratings yet
Sanet ST
385 pages
Bena R. A Simple Approach To Trigonometry. Proving... 2022
No ratings yet
Bena R. A Simple Approach To Trigonometry. Proving... 2022
128 pages
Updating The QR Factorization and The Least Squares Problem (2008)
No ratings yet
Updating The QR Factorization and The Least Squares Problem (2008)
73 pages
RidleyWorks15 02
No ratings yet
RidleyWorks15 02
112 pages
0262072629.MIT Press - Peter D. Grunwald, in Jae Myung, Mark A. P Advances in Minimum Description Length & Applications - Apr.2009
No ratings yet
0262072629.MIT Press - Peter D. Grunwald, in Jae Myung, Mark A. P Advances in Minimum Description Length & Applications - Apr.2009
455 pages
BLL Export
No ratings yet
BLL Export
372 pages
Reinhardt - Set Existence Principles of Shoenfield, Ackermann, and Powell.
No ratings yet
Reinhardt - Set Existence Principles of Shoenfield, Ackermann, and Powell.
16 pages
Theory of Field Extensions
No ratings yet
Theory of Field Extensions
70 pages
Boston Studies in The Philosophy of Science The Method of Analysis
No ratings yet
Boston Studies in The Philosophy of Science The Method of Analysis
164 pages
J. C. Taylor (Auth.) - An Introduction To Measure and Probability-Springer-Verlag New York (1997) PDF
No ratings yet
J. C. Taylor (Auth.) - An Introduction To Measure and Probability-Springer-Verlag New York (1997) PDF
315 pages
Lattice-Theoretical Fixpoint Theorem
No ratings yet
Lattice-Theoretical Fixpoint Theorem
26 pages
Matroid
No ratings yet
Matroid
18 pages
Tarski's World Textbook
No ratings yet
Tarski's World Textbook
136 pages
Mitre A 2013
No ratings yet
Mitre A 2013
475 pages
PSMT Functions and Relations 10.5
No ratings yet
PSMT Functions and Relations 10.5
16 pages
The Lost Cafe - Rota
No ratings yet
The Lost Cafe - Rota
10 pages
Carol Ann Duffy Poems - Exam Practice
No ratings yet
Carol Ann Duffy Poems - Exam Practice
14 pages
(Daum) Nonlinear Filters - Beyond The Kalman Filter
No ratings yet
(Daum) Nonlinear Filters - Beyond The Kalman Filter
13 pages
Question's
No ratings yet
Question's
8 pages
Counterfactual Explanations and Algorithmic Recourses For Machine Learning: A Review
No ratings yet
Counterfactual Explanations and Algorithmic Recourses For Machine Learning: A Review
23 pages
A Theory of Asymptotic Series (G.N. Watson)
No ratings yet
A Theory of Asymptotic Series (G.N. Watson)
36 pages
Geometry With An Introduction To Cosmic Topology 2018 Edition Michael P. Hitchman
No ratings yet
Geometry With An Introduction To Cosmic Topology 2018 Edition Michael P. Hitchman
238 pages
Mathematics and Statistics in The Social Sciences
No ratings yet
Mathematics and Statistics in The Social Sciences
26 pages
Course Outline (2021-2025)
100% (1)
Course Outline (2021-2025)
50 pages
Analytic Number Theory
No ratings yet
Analytic Number Theory
8 pages
Eulerzig Zag
No ratings yet
Eulerzig Zag
6 pages
AND Sciences: Hegel THE
No ratings yet
AND Sciences: Hegel THE
366 pages
The Epistemology of Quantum Physics (Taha Sochi)
No ratings yet
The Epistemology of Quantum Physics (Taha Sochi)
258 pages
09 - June - Greater Kashmir - 2021
No ratings yet
09 - June - Greater Kashmir - 2021
10 pages
Zermelo Fraenkel Set Theory - Wikipedia
No ratings yet
Zermelo Fraenkel Set Theory - Wikipedia
12 pages
DR John Chungs SAT Math by DR John Chung
No ratings yet
DR John Chungs SAT Math by DR John Chung
529 pages
Quine 1969
No ratings yet
Quine 1969
13 pages
Mellin Transform
No ratings yet
Mellin Transform
32 pages
Example 12 Chapter 9 Transforming Trig Graphs
No ratings yet
Example 12 Chapter 9 Transforming Trig Graphs
7 pages
PrepScholar Sat Practice Test 3
No ratings yet
PrepScholar Sat Practice Test 3
31 pages
Algorithms For Diophantine Equations - B.M.M. de Weger
No ratings yet
Algorithms For Diophantine Equations - B.M.M. de Weger
220 pages
New Microsoft Office Word Document
No ratings yet
New Microsoft Office Word Document
9 pages
2009 Seemous Problems Solutions
No ratings yet
2009 Seemous Problems Solutions
4 pages
Applications of Model Theory to Functional Analysis
From Everand
Applications of Model Theory to Functional Analysis
José Iovino
No ratings yet
MATH1043 Calculus: Study Guide For MATH1043 First Year Semester 2 Course: Engineering Mathematics
No ratings yet
MATH1043 Calculus: Study Guide For MATH1043 First Year Semester 2 Course: Engineering Mathematics
74 pages
Mobilization Construction Equipment & Tools
No ratings yet
Mobilization Construction Equipment & Tools
1 page
Engg Services Civil Engineering Objective Paper 1 2010
No ratings yet
Engg Services Civil Engineering Objective Paper 1 2010
24 pages
Amended and Restated Bylaws - Amarillo EDC
No ratings yet
Amended and Restated Bylaws - Amarillo EDC
13 pages
Amino Acid
No ratings yet
Amino Acid
18 pages
C1. Introduction
No ratings yet
C1. Introduction
38 pages
In A Digital Continuous-Time Communication System, The Bit Rate of NRZ Data Stream Is 1 Mbps and Carrier Frequency of Transmissi
No ratings yet
In A Digital Continuous-Time Communication System, The Bit Rate of NRZ Data Stream Is 1 Mbps and Carrier Frequency of Transmissi
1 page
1) Ethiopia Profile On Meat Processing and Preserving
No ratings yet
1) Ethiopia Profile On Meat Processing and Preserving
55 pages
UserManualforCorrectionModuleofWBTN Rural
No ratings yet
UserManualforCorrectionModuleofWBTN Rural
32 pages
MSDS Petron XD3
No ratings yet
MSDS Petron XD3
5 pages
Sony - XM-2252
No ratings yet
Sony - XM-2252
20 pages
A little light on the spiritual laws Cooper download
No ratings yet
A little light on the spiritual laws Cooper download
66 pages
The Mandala A Model:: Cosmic
No ratings yet
The Mandala A Model:: Cosmic
14 pages
CALIFORNICATION TAB by Red Hot Chili Peppers @
No ratings yet
CALIFORNICATION TAB by Red Hot Chili Peppers @
7 pages
Elo Basic Presentation
No ratings yet
Elo Basic Presentation
41 pages
Noetic Maltego Report
No ratings yet
Noetic Maltego Report
273 pages
Metis Restaurant A) Assess The Existing Performance Report and Suggest Improvements To Its Content and Presentation
No ratings yet
Metis Restaurant A) Assess The Existing Performance Report and Suggest Improvements To Its Content and Presentation
3 pages
Template of Research Proposal For Master-Ph. D Degree
No ratings yet
Template of Research Proposal For Master-Ph. D Degree
7 pages
XI-CS Material 2023-24
No ratings yet
XI-CS Material 2023-24
107 pages
316 Special Issue No. 7 Vol. XVIII (2013) METALURGIA INTERNATIONAL
No ratings yet
316 Special Issue No. 7 Vol. XVIII (2013) METALURGIA INTERNATIONAL
6 pages
Chapter II.9
No ratings yet
Chapter II.9
16 pages
Bimco Standard Time Charter Party For Container Vessels Code Name: Boxtime 2004
No ratings yet
Bimco Standard Time Charter Party For Container Vessels Code Name: Boxtime 2004
16 pages
Curl Activator Gel
No ratings yet
Curl Activator Gel
1 page
Time of Supply-20
No ratings yet
Time of Supply-20
44 pages
Chapter 1 Introduction To Corrosion
No ratings yet
Chapter 1 Introduction To Corrosion
41 pages
ADVANCED BOOKLET
No ratings yet
ADVANCED BOOKLET
183 pages
Pachan Sudha Syrup Digestive Support and Wellness
No ratings yet
Pachan Sudha Syrup Digestive Support and Wellness
9 pages
Ibra
100% (1)
Ibra
18 pages
Mathlinks 8 - Section 8 2
No ratings yet
Mathlinks 8 - Section 8 2
17 pages
Chemistry Chromatography
No ratings yet
Chemistry Chromatography
19 pages

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.