Autumn 2011
Charles Staats
26 September 2011
1 Introduction
Loosely speaking, there are two sides to mathematics: the ideas and the tech-
nical skills. Most people who say that they hate math have probably gotten
hung up on the technical side. And it is an unfortunate fact that the technical
side cannot be done away with. However, the ideas are what make the technical
side interesting. Without them, no one would ever have discovered the technical
side, and certainly no one would care to study mathematics as their life’s work.
In this course, I will try to flavor the technical details with the ideas that
explain why people were thinking like this in the first place. Think of studying
mathematics like studying a map. One can simply sit down and try to memorize
all the rivers, lakes, and mountain ranges. Or one can imagine how an explorer
might travel, and bring the landforms to life. A river, for instance, becomes
at once an obstacle, a water source, and a highway. Some rivers you can wade
across; others are difficult enough that you may want to build a bridge. My goal
is to present the mathematical “landforms” with some kind of narrative about
how the first explorers might have seen them, and why they built the things
they did.
The situation with calculus is especially tricky. The basics of calculus, as
invented by Newton and Leibniz in the late 1600s, might be seen as “explor-
ing on top of the clouds.” There are plenty of interesting things to explore on
top of these clouds, but you can never be sure what’s under your feet. You
might step on a spot that looks solid, only to find yourself standing on air.
In the 1800s, mathematicians (most notably Cauchy and Weierstrass) built a
solid “foundation” to fix this problem, which is not so much a foundation as a
skyscraper. In this course, we will try to explore both the “castle in the clouds”
and the skyscraper—called analysis—that holds it up. Even if you have seen
some calculus before, you have almost certainly not seen much of the skyscraper.
2 Proofs
One of the key things that distinguishes mathematics from other disciplines is
the presence of logical proofs. In physics, you know that a ball will fall when
you release it because it has done so every time you released it in the past. But
in some sense, there is no absolute reason why it would have to keep behaving
in this fashion. One can imagine that gravity might suddenly stop working
tomorrow.
In mathematics, we know that $\sqrt{2}$ is irrational because we can show, definitively,
that it could never be rational. And in fact, the way we prove this is
called proof by contradiction: imagine a world in which $\sqrt{2}$ were rational, and
show that this world is contradictory; thus, it cannot exist.
The Fundamental Theorem of Arithmetic. Every natural number greater
than 1 can be written as the product of primes in a unique way, except for the
order of the factors.
For example, 45 = 3 · 3 · 5.
Claim. Let n be a natural number greater than 1. Then n and $n^2$ have exactly
the same prime factors.
Proof. If
$$n = p_1 \cdot p_2 \cdots p_k$$
is a factorization of n into primes $p_1, \dots, p_k$, then
$$n^2 = p_1 p_1 \cdot p_2 p_2 \cdots p_k p_k$$
Now, since q > 1 and both are positive, we know $q^2 > 1^2 = 1$. Hence,
$p^2/q^2$ is not a natural number (the denominator is bigger than 1). But since
$\sqrt{n} = p/q$,
$$n = \frac{p^2}{q^2}.$$
Thus, n is not a natural number. Since we originally assumed it was, this is a
contradiction.
Corollary. $\sqrt{2}$ is irrational.
Proof. By the theorem above (with n = 2), $\sqrt{2}$ is either a natural number or
an irrational number. Since $\sqrt{2}$ is not a natural number, it is irrational.
Bonus Exercise. Show that $\sqrt[3]{31}$ is irrational.
Math 131, Lecture 2
Charles Staats
28 September 2011
Typically, when you see the symbol ε (Greek letter epsilon), you should
think “small positive number.” This is purely psychological: the statement
would be just as correct if you replaced every ε with a y. Nevertheless,
this “psychological” choice of variable can provide an important guide for
intuition. When you see a statement like the theorem above, you should
get the following idea:
“If we can do a good enough job of showing that x is really close
to zero, we’ll know that x is actually equal to zero.”
$x^2 < 2$,
• “I can square both sides.” ISSUE: This only works if both sides are
positive.
• “If I have an inequality like (x−a)(x−b) < 0, where a product is compared
to zero, I can split it into the factors: x−a < 0, x−b < 0.” ISSUE: What
you can actually say in this particular case is that x − a and x − b have
opposite signs. In other words, one is positive and the other is negative.
Quadratic inequalities are more complicated than quadratic equations.
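The second bullet can be sanity-checked numerically. This is only a minimal Python sketch: the values a = 1 and b = 3 and all of the helper names are illustrative assumptions, not anything from the textbook.

```python
def product_negative(x, a, b):
    """True when (x - a)(x - b) < 0."""
    return (x - a) * (x - b) < 0


def opposite_signs(x, a, b):
    """True when one of x - a, x - b is positive and the other is negative."""
    return (x - a > 0 and x - b < 0) or (x - a < 0 and x - b > 0)


def both_negative(x, a, b):
    """The tempting but wrong 'split': x - a < 0 and x - b < 0."""
    return x - a < 0 and x - b < 0


a, b = 1, 3  # illustrative values only
samples = [k / 4 for k in range(-8, 25)]  # -2.0, -1.75, ..., 6.0

# The product is negative exactly when the factors have opposite signs.
agrees = all(product_negative(x, a, b) == opposite_signs(x, a, b) for x in samples)

# The naive split fails at x = 0: both factors are negative,
# yet the product (0 - 1)(0 - 3) = 3 is positive.
counterexample = (both_negative(0, a, b), product_negative(0, a, b))
```

A check on finitely many sample points is evidence, not a proof; the proof is the sign argument in the bullet itself.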
If there’s an “interesting idea” in manipulating inequalities, it’s this: in some sit-
uations (for instance, in quadratic inequalities), we divide into cases, connected
by words like and and or. At this point, we are not only doing algebraic ma-
nipulations. We are also playing around with the logical relationships among
the different inequalities. Here’s an example:
3 Assignment due Friday, September 30
Read “A Bit of Logic” and “Quantifiers” on pp. 4–6.
Problem Set 0.1, numbers 45, 46, 63, and 64. Problems 45 and 46 will be graded
carefully.
Math 131, Lecture 3
Charles Staats
30 September 2011
2 Functions: writing f instead of √
[Note: the following history is partly fictional. But it could have happened this
way, and in my opinion, it’s a lot more interesting to think about it like this
than just to go through a dry “definition of a function.”]
For many centuries, algebra was, essentially, the study of formulas. Periodi-
cally, when dealing with formulas, people would pose a problem that could not
be solved using existing formulas. For instance, to the ancient Greeks, such a
problem was, “What is the side length of a square with area 2?” They knew
that the answer would be a solution to the equation $x^2 = 2$. Unfortunately, this
presented a dilemma, since they had no formulas to solve such an equation.
There were, roughly speaking, two approaches to this dilemma. One ap-
proach, which was taken by Diophantus of Alexandria in the third century A.D.,
was to accept that certain equations have no solutions, and then try to deter-
mine which equations had solutions and which did not. Diophantus produced
some marvelous mathematics this way, and the sorts of questions he asked have
become important in many areas—for instance, in modern cryptography.
Unfortunately, Diophantus’ marvelous mathematics was little comfort to the
farmer who wanted to know how long his fence should be to get a square corral
with a given area. The other approach, which might have been more useful to
said farmer, was to say, “well, since we don’t have a formula for this, let’s invent
one—and then figure out how to calculate it.” Thus, the square root was born.
As the centuries progressed, mathematicians continued to add new notation
to their formulas—exponential, logarithm, sine and cosine, and others. But
eventually, this approach stopped working. In studying differential equations,
the variety of solutions became so great that it was wholly impractical to invent
a new notation for every type of solution. Thus, they started using the same
notation, f (x), for many different “formulas.” They might say something like,
“Let f be defined as the solution to the differential equation under consideration,”
and then proceed to use f as though it were √. Later on, they might
use the same letter f for the solution to a different equation.
In mathematics, notation is usually just notation. But sometimes, a new
notation can lead to new insights. For instance, the symbol 0 was originally
introduced as a placeholder, so that one could write down numbers like 101.
But once the symbol was introduced, people began to realize that it made sense
to think of zero as a number—a conceptual breakthrough.
In the case at hand, mathematicians began to realize that they could study
the “set of all things that can be written as f .” In trying to understand what
these “things that can be written as f ” really were, they came up with the
following definition.
Definition. A function f is a rule that, given a number x, outputs a number
f (x).
Let’s consider the case of the square root function f, defined by $f(x) = \sqrt{x}$
(or, if you prefer, defined by f = √). We would like to define f as follows:
For each number x, the function f assigns to x that number y such
that $y^2 = x$.
Unfortunately, this definition has a couple of problems:
• This definition is ambiguous. For instance, if x = 1, then f (x) could
be either 1 or −1. To resolve this ambiguity, we require that f (x) be
nonnegative.
• If x is negative, there is no number y such that $y^2 = x$; in this case, f (x)
is undefined.
To resolve these difficulties, we make the following, better definition:
For each nonnegative number x, the function f assigns to x the
unique nonnegative number y such that $y^2 = x$.
The second difficulty, in particular, illustrates an important fact: a function
may be defined on only some real numbers.
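To see what "defined on only some real numbers" looks like in practice, here is a minimal Python sketch of the square root function with its domain made explicit. The helper name in_domain and the choice to raise an error outside the domain are my own assumptions for illustration.

```python
import math


def f(x):
    """Square root function, defined only for nonnegative x.

    Returns the unique nonnegative y with y * y == x (up to rounding).
    """
    if x < 0:
        raise ValueError("x is outside the domain of f")
    return math.sqrt(x)


def in_domain(x):
    """Membership test for the domain of f (hypothetical helper)."""
    return x >= 0
```

For example, f(4) returns 2.0 (never −2), while f(-1) raises an error instead of returning a number.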
Definition. The domain of a function f is the set of all numbers x such that
f (x) is defined.
Definition. Let f and g be functions. We say that f = g if f and g have the
same domain, and for every value of x in that domain, f (x) = g(x).
Warning. If f is a function, it may be tempting to write something like
$f = x^2 + 1$.
3 Composing functions
In the functions above, I always used x for a variable. There is nothing special
about x; the square root function can be defined by $f(t) = \sqrt{t}$ just as easily
as $f(x) = \sqrt{x}$. More importantly, we can plug in other things for a variable:
numbers, other variables, expressions, even other functions. For instance, if f
is defined by $f(x) = x^2$, then we may write things like
$$f(-2) = (-2)^2 = 4$$
$$f(x + t) = (x + t)^2 = x^2 + 2tx + t^2$$
$$f(x^2) = (x^2)^2 = x^4.$$
Note that none of these is a definition for f ; they are all consequences of the
definition that $f(x) = x^2$.
If f and g are both functions, then we may define a new function, denoted
f ◦ g, by
(f ◦ g)(x) = f (g(x)).
This is called the composition of f and g; it is read “f composed with g.”
Example. Let f be the function $x \mapsto x^2$, and let g be the function $x \mapsto x^2 + 1$.
Compute f ◦ f , f ◦ g, and g ◦ f .
Solution. f ◦ f is defined by
$$(f \circ f)(x) = f(x^2) = (x^2)^2 = x^4.$$
f ◦ g is defined by
$$(f \circ g)(x) = f(x^2 + 1) = (x^2 + 1)^2 = x^4 + 2x^2 + 1.$$
g ◦ f is defined by
$$(g \circ f)(x) = g(x^2) = (x^2)^2 + 1 = x^4 + 1.$$
We could just as well have computed g ◦ f by
$$(g \circ f)(x) = (f(x))^2 + 1 = (x^2)^2 + 1 = x^4 + 1.$$
Note that functional composition is not commutative: in the example above,
f ◦ g is not equal to g ◦ f .
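The example above can be replayed in code. This is a small Python sketch; the helper compose and the variable names are assumptions made for illustration.

```python
def compose(f, g):
    """Return the function f o g, that is, x -> f(g(x))."""
    return lambda x: f(g(x))


def f(x):
    return x ** 2


def g(x):
    return x ** 2 + 1


f_after_g = compose(f, g)  # x -> (x^2 + 1)^2
g_after_f = compose(g, f)  # x -> x^4 + 1

# At x = 2: (f o g)(2) = (4 + 1)^2 = 25, while (g o f)(2) = 16 + 1 = 17.
values = (f_after_g(2), g_after_f(2))
```

The two entries of values differ, which is the non-commutativity noted above, seen at a single input.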
4 Piecewise-defined functions
The easiest way to define a function is, of course, using an algebraic formula.
But in a way, this misses the whole point of functions—that they can be used
for rules that cannot be described using an algebraic formula. For instance, it
is a (not entirely obvious) fact that the “rule”
$$f(x) = \text{the number } y \text{ such that } y^5 + 20y + 16 = x \qquad (1)$$
gives an unambiguous definition for a function f , and a (much less obvious) fact
that this function f cannot be expressed in terms of simpler algebraic functions
like rth roots. Even though we can’t “solve” the equation $y^5 + 20y + 16 = x$ for
y in terms of x in the sense of giving a formula, one can show that the solution
exists as a function.
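Even without a formula, values of the function in (1) can be approximated. Below is a minimal Python sketch using bisection, a technique I am bringing in for illustration (it is not from these notes). It relies on the fact that the left-hand side of (1) gets larger whenever y gets larger, so there is exactly one solution for each x. The names p and f and the iteration count are assumptions.

```python
def p(y):
    """The polynomial from rule (1)."""
    return y ** 5 + 20 * y + 16


def f(x):
    """Approximate the unique y with p(y) = x by bisection."""
    lo, hi = -1.0, 1.0
    # Widen the bracket until p(lo) <= x <= p(hi).
    while p(lo) > x:
        lo *= 2
    while p(hi) < x:
        hi *= 2
    # Halve the bracket a fixed number of times.
    for _ in range(100):
        mid = (lo + hi) / 2
        if p(mid) < x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Since p(0) = 16 and p(1) = 37, the calls f(16) and f(37) should return values very close to 0 and 1 respectively.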
Another way to define functions is using a combination of logic and algebra.
For instance, the following are two perfectly good functions:
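$$f(x) = \begin{cases} x^3 - 7 & \text{if } x < 1, \\ x^2 & \text{if } x \ge 1. \end{cases}$$
$$g(x) = \begin{cases} -1 & \text{if } x \text{ is rational}, \\ 1 & \text{if } x \text{ is irrational}. \end{cases}$$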
You may have a tendency to think that functions like these are somehow less
“real” than functions defined using only algebra. You need to get past this.
Piecewise-defined functions, as they are called, are just as “real” as functions
given by algebraic formulae, and are extremely important. For instance, if you
want to actually compute approximate values for the function defined in (1)
above, your best bet may well be to construct a piecewise-defined function that
is very close to it, and compute the values of that function.
An extremely important piecewise-defined function is the absolute value
function.
Definition. The absolute value function is defined by
$$f(x) = \begin{cases} x & \text{if } x \ge 0, \\ -x & \text{if } x < 0. \end{cases}$$
f (x) is typically denoted |x|, read “the absolute value of x.” Note that |x| is
always nonnegative.
Example. Let f be the function defined by
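$$f(x) = \begin{cases} x + 1 & \text{if } x \le 1, \\ x - 1 & \text{if } x > 1. \end{cases}$$
\begin{align*}
2x \le 1 &\quad\text{and}\quad 2x + 1 \le x + 1 \\
x \le \tfrac{1}{2} &\quad\text{and}\quad x \le 0 \\
x \le 0.
\end{align*}
\begin{align*}
2x > 1 &\quad\text{and}\quad 2x - 1 \le x + 1 \\
x > \tfrac{1}{2} &\quad\text{and}\quad x - 1 \le 1 \\
x > \tfrac{1}{2} &\quad\text{and}\quad x \le 2 \\
\tfrac{1}{2} < x \le 2.
\end{align*}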
Combining the two cases, we see that the condition f (2x) ≤ x + 1 is equiv-
alent to the condition
$$x \le 0 \quad\text{or}\quad \tfrac{1}{2} < x \le 2.$$
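The combined answer can be compared against the original condition at many sample points. This is a minimal Python sketch; the function names and the grid of sample points are my own choices. Exact fractions are used so that the boundary values 0, 1/2, and 2 are tested without rounding.

```python
from fractions import Fraction


def f(x):
    """The piecewise function from the example."""
    return x + 1 if x <= 1 else x - 1


def satisfies(x):
    """The original condition f(2x) <= x + 1."""
    return f(2 * x) <= x + 1


def in_solution_set(x):
    """The derived answer: x <= 0 or 1/2 < x <= 2."""
    return x <= 0 or Fraction(1, 2) < x <= 2


# Exact rational sample points from -3 to 4 in steps of 1/8.
samples = [Fraction(k, 8) for k in range(-24, 33)]
all_agree = all(satisfies(x) == in_solution_set(x) for x in samples)
```

Agreement on a grid does not replace the case analysis, but a disagreement would immediately expose an algebra slip in one of the cases.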
forbid you to use the quadratic formula on anything you turn in (including tests
and homework). Instead, I will expect you to use the technique of completing
the square, which is a much more powerful idea that is in fact used to derive
the quadratic formula. It’s also easier to remember, in that the only formula
involved is $(b/2)^2$.
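\begin{align*}
x^2 - 2x - 4 &< 0 \\
x^2 - 2x + 1 &< 5 \\
(x - 1)^2 &< 5 \\
|x - 1| &< \sqrt{|5|} = \sqrt{5}
\end{align*}
Thus,
\begin{align*}
-\sqrt{5} < x - 1 &< \sqrt{5} \\
1 - \sqrt{5} < x &< 1 + \sqrt{5}.
\end{align*}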
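As a quick check that completing the square lost nothing along the way, the first and last lines of the derivation can be compared numerically. A minimal Python sketch, with names and sample grid chosen by me for illustration:

```python
import math


def original(x):
    """The inequality we started with: x^2 - 2x - 4 < 0."""
    return x * x - 2 * x - 4 < 0


def solved(x):
    """The answer obtained by completing the square."""
    return 1 - math.sqrt(5) < x < 1 + math.sqrt(5)


# Sample points from -5 to 7 in steps of 0.01. None of them is an endpoint,
# since the endpoints 1 - sqrt(5) and 1 + sqrt(5) are irrational.
samples = [k / 100 for k in range(-500, 701)]
all_agree = all(original(x) == solved(x) for x in samples)
```

The two descriptions should pick out the same sample points: roughly those between −1.24 and 3.24.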
6 Assignment: Due Monday, October 3
Section 0.2, problems 45 and 46. DO NOT use the quadratic formula, contrary
to the book’s instructions. Problem 46 will be graded carefully.
Math 131, Lecture 4
Charles Staats
3 October 2011
The textbook calls the addition rule the “Triangle Inequality.” This term
is properly reserved for another inequality. Consider three points P, Q, and R.
Let d(P, Q) denote the distance from P to Q.
The standard fact that “the shortest distance between any two points is a line”
tells us that
d(P, R) ≤ d(P, Q) + d(Q, R).
If a, b, and c are real numbers, they also represent points on the number line.
Moreover, the distance between a and b is precisely |b − a|, and so the triangle
inequality for absolute values is
|c − a| ≤ |b − a| + |c − b|.
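The number-line version of the triangle inequality is easy to spot-check. This is only a sketch in Python; the list of sample points and the name d (mirroring the distance notation above) are illustrative choices.

```python
import itertools


def d(p, q):
    """Distance between two points p and q on the number line."""
    return abs(q - p)


points = [-3, -1.5, 0, 0.5, 2, 7]

# Check d(a, c) <= d(a, b) + d(b, c) for every ordered triple of sample points.
holds = all(
    d(a, c) <= d(a, b) + d(b, c)
    for a, b, c in itertools.product(points, repeat=3)
)

# Equality occurs when b lies between a and c, for example a = 0, b = 0.5, c = 2.
equality_case = d(0, 2) == d(0, 0.5) + d(0.5, 2)
```

The equality case matches the geometric picture: a detour through Q costs nothing extra exactly when Q is on the way from P to R.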
c−a=α+β
b−a=α
c − b = β.
|α + β| ≤ |α| + |β|,
2 Graphing functions
One of the keystones of modern mathematics is the interaction between algebra
and geometry via the graphing of equations. In some cases, one can use algebra
to prove a geometric result; you may have seen this sort of approach used in
analyzing the conic sections. However, in this course, we will be going mostly in
the opposite direction: we will be using the geometry to gain additional insight
about the algebra. See, for instance, the discussion of the triangle inequality
above.
The basic approach to graphing functions is, of course, quite simple:
1. Choose some values of x.
2. Calculate and plot the points (x, f (x)).
3. “Connect the dots.”
Example. Graph the function f defined by f (x) = x2 .
Solution. We first calculate f at a few points:
x f(x)
-2 4
-1 1
0 0
1 1
2 4
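The table above follows steps 1 and 2 of the recipe mechanically, so it can be generated by a few lines of code. A minimal Python sketch (the names xs and points are mine):

```python
def f(x):
    return x ** 2


# Step 1: choose some values of x.
xs = [-2, -1, 0, 1, 2]

# Step 2: calculate the points (x, f(x)).
points = [(x, f(x)) for x in xs]

# Print the points as a two-column table.
for x, y in points:
    print(f"{x:>3} {y:>3}")
```

Step 3, connecting the dots, is where judgment comes in, and that is what the guidelines below are about.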
Now, we plot these points and “connect the dots”:
You don’t—not really. Later on, we’ll discuss how to show definitively that
you’ve plotted enough points, but no one ever does this in real life. But here
are some general guidelines. They’re not guaranteed to work, but they usually
do if you’re smart about it.
1. Make sure it is “clear” how to connect the dots. If your points are too far
apart, either vertically or horizontally, you may need to plot some more.
Generally speaking, you want the graph to be going “up” or “down” for
several points at a time.
2. Use your knowledge of the function. If the graph you’ve drawn is a line,
then the function had better be equal to a function of the form f (x) =
mx + b; if it’s not, then you probably need to plot some more points.
If your function has an (x − a) in the denominator, then you are dividing
by zero at x = a. So, you probably want to plot extra points near x = a.
3. Test some extra points in between the ones you’ve already plotted. When
you think you know what the graph looks like, plot a few more points in
between the ones you’ve already plotted. If they are about where your
drawing says they should be, that’s a good sign.
Question: How do I know I’ve got all the interesting features of the
graph?
The best answer to this is to use calculus. Since we can’t do that yet, it
may be helpful to try to figure out what the function looks like in the “boring”
part. Most of the functions we will give explicit formulas for this quarter will
look like $ax^n$ for very positive and very negative values of x. When the function
starts looking like this, there’s a good chance you’re in the “boring” part.
You do probably want to make sure you get all the x-intercepts, i.e., all the
points where f (x) = 0.
Example. Recall the inequality from the end of the last lecture:
f (2x) ≤ x + 1,
where
$$f(x) = \begin{cases} x + 1 & \text{if } x \le 1, \\ x - 1 & \text{if } x > 1. \end{cases}$$
Graph the functions g(x) = f (2x) and h(x) = x + 1. Use the resulting graph to
study the set of values of x satisfying the inequality g(x) ≤ h(x).
3 Assignment 3 due Wednesday, October 5
Section 0.2, problems 53, 54, 57, and 63. Problems 54 and 63 will be graded
carefully.
Math 131, Lecture 5
Charles Staats
5 October 2011
I then did a somewhat abysmal job of explaining how the Addition Rule implies
the Subtraction Rule. I’m going to see if I can do any better the second time
around. I’ll also try to use this as an excuse to discuss quantifiers. Please let
me know immediately if I start talking gibberish again.
Example. Take the following statement as given: For every pair of real numbers
a and b,
|a + b| ≤ |a| + |b|.
Use it to prove the Subtraction Rule: For every pair of real numbers a and b,
|a − b| ≥ |a| − |b|.
Solution. The Addition Rule applies to every pair of real numbers. We’ve chosen
to write this pair as a and b. But if a and b are a pair of real numbers, so are b
and a − b. Applying the Addition Rule to the pair b, a − b, we obtain
|b + (a − b)| ≤ |b| + |a − b|
|a| ≤ |b| + |a − b|
|a| − |b| ≤ |a − b|
|a − b| ≥ |a| − |b|.
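The key move in this proof is feeding the pair b, a − b into the Addition Rule. The sketch below replays that move in Python on a handful of sample numbers; the sample values and function names are my own, and a finite check like this illustrates the argument rather than proving it.

```python
import itertools


def addition_rule(a, b):
    """The Addition Rule for one pair: |a + b| <= |a| + |b|."""
    return abs(a + b) <= abs(a) + abs(b)


def subtraction_rule(a, b):
    """The Subtraction Rule for one pair: |a - b| >= |a| - |b|."""
    return abs(a - b) >= abs(a) - abs(b)


values = [-4, -1.5, 0, 0.25, 3]
pairs = list(itertools.product(values, repeat=2))

# The key step of the proof: apply the Addition Rule to the pair (b, a - b).
key_step = all(addition_rule(b, a - b) for a, b in pairs)

# The conclusion: the Subtraction Rule holds for the original pair (a, b).
conclusion = all(subtraction_rule(a, b) for a, b in pairs)
```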
Let’s review the logic here. The Subtraction Rule has the appearance of a
condition on a and b: that |a − b| ≥ |a| − |b|. Let’s call this condition P (a, b).
Like all conditions, P (a, b) is either true or false, but in principle, we don’t know
which until someone tells us what a and b are.
However, we want to show that this condition P (a, b) holds for every possible
choice of a and b. Statements of the form
for all x, the condition P (x) holds
will be increasingly common as we progress into the study of limits, continuity,
and ultimately derivatives. The part of the statement “for all x” is called a
quantifier. It may seem more reasonable to talk about “the quantifier” when
we write the statement in symbols:
∀x, P (x),
where ∀ stands for “for all.” In this case, ∀ is the quantifier. The other important
quantifier is ∃, which stands for “there exists.” It showed up in one of the quiz
questions yesterday:
An important note here is that when you say, “Let x be a . . . ,” you don’t get to
choose x. If it helps, imagine that someone else—an “opponent” or “enemy”—is
going to try to find an x to spite you. What you are doing for the rest of the
proof is showing that, no matter what x they choose, P (x) holds.
The narrative structure for an ∃ proof is a bit more confusing, because the
way you tell the proof is usually in the opposite order from the way you figure
out the proof. If you’re going to prove that
2 Walking on clouds: What does this formula mean?
At this point, I’m supposed to start motivating the notion of a “limit,” as we
ease toward the dreaded ε-δ definition. The problem is that, with the functions
we know how to use right now, there aren’t any really interesting or unexpected
limits—they tend to be a bit boring. Certainly, there is no real need for this
elaborate and potentially confusing definition. One of the few interesting exam-
ples is the difference quotient, not because it is hard to evaluate, but because
without understanding limits well, it is easy to be skeptical that the answer has
any meaning. One philosopher once called it the “ghost of a departed quantity.”
Suppose we are considering the function
$$f(x) = (x + 1)^2.$$
[Figure: graph of f with the points $(x_1, f(x_1))$ and $(x_2, f(x_2))$ marked.]
Given any two distinct values $x_1, x_2$ for x, there is a unique line through the
two points $(x_1, f(x_1))$ and $(x_2, f(x_2))$. We may calculate its slope as follows:
\begin{align*}
\text{slope} &= \frac{\text{change in } y}{\text{change in } x} \\
&= \frac{f(x_2) - f(x_1)}{x_2 - x_1} \\
&= \frac{(x_2 + 1)^2 - (x_1 + 1)^2}{x_2 - x_1}.
\end{align*}
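Let’s see what happens when $x_1 = 0$, so the first point is (0, f (0)) = (0, 1):
\begin{align*}
\text{slope} &= \frac{(x_2 + 1)^2 - 1}{x_2 - 0} \\
&= \frac{x_2^2 + 2x_2 + 1 - 1}{x_2} \\
&= \frac{x_2^2 + 2x_2}{x_2} \\
&= \frac{x_2 (x_2 + 2)}{x_2} \\
&= x_2 + 2
\end{align*}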
The interesting thing is that this formula can be evaluated when $x_2 = 0$, even
though the original formula cannot. Moreover, if you look at the graph
you can get an idea that this is probably the slope of the tangent line at (0, 1).
This is something of a miracle, since the original reasoning for the formula
breaks down completely:
1. In the original formula, we’re dividing by zero. How can this possibly give
a meaningful answer?
2. If we look at how we derived the formula, it’s supposed to give us the
“slope of the line through $(x_1, f(x_1))$ and $(x_2, f(x_2))$.” But if $(x_2, f(x_2))$
is the same point as $(x_1, f(x_1))$, this is ambiguous: there are infinitely
many lines through these “two” points, all having different slopes. So
where does this choice of a single line with slope 2 come from?
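Both halves of the puzzle show up if you just compute. In this minimal Python sketch (function names and sample values of x2 are my own), the original slope formula gives values creeping toward 2 as the second point approaches (0, 1), yet it refuses to produce anything when the two points coincide.

```python
def f(x):
    return (x + 1) ** 2


def secant_slope(x1, x2):
    """Slope of the line through (x1, f(x1)) and (x2, f(x2)); needs x1 != x2."""
    if x1 == x2:
        raise ZeroDivisionError("the two points coincide, so the formula divides by zero")
    return (f(x2) - f(x1)) / (x2 - x1)


# With x1 = 0 the slope simplifies to x2 + 2, so it approaches 2 as x2 shrinks.
slopes = [secant_slope(0, x2) for x2 in (1, 0.5, 0.25, 0.125)]
```

Here slopes comes out as 3.0, 2.5, 2.25, 2.125, matching the simplified formula at each sample, while secant_slope(0, 0) raises an error.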
The original inventors of calculus more or less just accepted these miracles.
They used this formula to “walk on clouds,” even though they could not really
explain why dividing by zero could give a reasonable answer—or any answer at
all. And while they did great things this way, there were always mathematicians
who distrusted these “miracles.” Even more significantly, once people began
looking at arbitrary functions, they realized that these “miracles” don’t always
work; and it became necessary to understand this miracle so they could know
when it would work.
The solution to this turned out to be the definition of the limit, and was one
of the most important steps in building the “skyscraper” to support the “cloud
castle” of calculus.
3 Assignment 4 (due Friday, 7 October)
Section 0.2, problems 35, 36, 37, 38, 39, and 40. Problems 38 and 40 will be
graded carefully.
Bonus Exercise. (Not due until Wednesday, 12 October; worth three points)
Find a piecewise-constant function f , defined on all x for which −2 < x < 2,
such that for all such x,
$$|f(x) - x^2| < 1.$$
Essentially, you are trying to approximate the squaring function by an easier-
to-calculate function.
Remember—you need to actually give a proof that your function f works.
This will involve showing, on each “piece” on which f (x) is a constant c, that
$|c - x^2| < 1$ on that “piece.” (You’ll never be able to find one c that works for
all x, which is why you need several “pieces.”)
Math 131, Lecture 6
Charles Staats
7 October 2011
humans. “Formal logic” uses symbols and is comprehensible to computers. The two-column
proofs you may have done in high school geometry fall somewhere in between.
2 Except for this one remark, of course.
Whatever does not make you stronger, kills you.
Here we’ve taken a statement that seemed reasonable, and showed that it means
the same thing as another statement that may seem absurd. This could be a neat
trick if you are in a debate with someone who is defending the first statement.
Now, suppose we’re trying to defend the first statement. In order to do that,
we have to think about the second statement in a way that makes it seem more
plausible. For one thing, we replace the too-general notion of “Everything” by
the more restricted notion of “every trial,” which is basically what we meant
anyway. Then, we do something like the following.
Every trial either kills you or makes you stronger. Thus, the only
way a trial can fail to make you stronger is by killing you.
One additional note: From a careful, logical perspective, the statement
“Whatever trial does not kill you, makes you stronger” does not exclude the
possibility that some trial might both kill you and make you stronger. We typ-
ically fill in this bit of information from common sense, and there is nothing
wrong with that in everyday language. But in mathematics, it is extremely
important to be able to identify when we are filling in information from “common
sense” rather than logic, for two reasons. First, someone else’s “common sense”
might differ from ours, in which case we need to be able to defend our claims
with logic. Second, “common sense” is sometimes very wrong, as we will see
later in the course.
“infinitely small” numbers, but that does not work. Instead, you deal with
arbitrarily small numbers. Rather than trying to do things with infinitely small
numbers, you take a finite number, often called ε, and do something that works
“for arbitrarily small ε.” In other words, what you are doing will still work, no
matter how small ε gets. This “arbitrarily small” number ε is your “saddle,”
the “cushion” between you and the Infinity beast that lets you ride in comfort.
An important note here, worth repeating, is that you don’t get to choose ε.
If you choose a particular ε, whether it be ε = 1 or ε = .000001, you are limiting
how small ε can be. In order for ε to be arbitrarily small, and hence provide a
connection to the infinitesimal [3], you have to remember that ε is just, well, ε.
Generally speaking, it’s a small positive real number. You can’t pretend you
know more about it than that.
Coming back to the quiz question, no matter how small you make ε, ε/2
is always smaller. Thus, no matter how small you make ε, there is always
something smaller.
3 Order of quantifiers
Last lecture, I discussed a bit about the quantifiers “for all” and “there exists.”
Let’s consider how these are used in the statement of one of the quiz problems:
Suppose that x is a positive real number. Show that there is another
positive real number y such that y < x.
I mentioned in the last lecture that this includes the quantifier “there exists.”
But if you think about it, there are actually two quantifiers. The statement
could be re-written more symbolically as
∀x > 0, ∃y > 0 s.t. y < x,
where “s.t.” stands for “such that.” If you imagine playing a game against an
opponent, the opponent gets to move first (he gets to choose x), and then you
get to move, choosing y in response. As we’ve seen, no matter what x he chooses,
you can always choose $y = \tfrac{1}{2}x$ and win.
Let’s see what happens if we reverse the order of the quantifiers and let you
move first:
∃y > 0 s.t. ∀x > 0, y < x.
In other words, “there exists a positive number less than all positive numbers.”
When you play this game, you lose: whatever y you choose, your opponent then
chooses x = y. Since x = y, it is not true that y < x, which is what you would
have needed in order to win.
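The two games can be acted out in code. This is a minimal Python sketch with hypothetical names, and it only plays the games on a few sample numbers, so it illustrates the strategies rather than proving anything about all positive reals.

```python
from fractions import Fraction


def respond_with_half(x):
    """Your move when you go second: answer any positive x with y = x/2."""
    return x / 2


def opponent_reply(y):
    """The opponent's move when you go first: answer any positive y with x = y."""
    return y


challenges = [Fraction(1), Fraction(1, 1000), Fraction(1, 10**12)]

# For all x, there exists y: whatever x is picked, y = x/2 is positive and smaller.
you_win_moving_second = all(0 < respond_with_half(x) < x for x in challenges)

# There exists y, for all x: whatever y you pick first, the reply x = y defeats y < x.
you_win_moving_first = any(y < opponent_reply(y) for y in challenges)
```

The first flag comes out True and the second False, which is the "last word" advantage in miniature.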
This example illustrates something important about the order of quantifiers:
it’s easier to “win” if the ∃ quantifier comes last (in other words, if you move
second, rather than first). If this seems somewhat counterintuitive, think about
it as a debate: it’s easier to win if you get the last word.
[3] “Infinite” means infinitely large. “Infinitesimal” means infinitely small. A woman who
confused these two once told a speaker, “I enjoyed your lecture very much and thought it was
of absolutely infinitesimal value.” This was not the compliment she apparently thought it was.
As a final point, let’s consider one formal definition of a limit:
Definition. Let f be a function whose domain includes all x > m, for some
finite m. We say that
$$\lim_{x \to \infty} f(x) = c$$
if
∀ε > 0, ∃M > m such that if x > M, then |f (x) − c| < ε.
I don’t expect you to really understand this definition yet; we’ll go over it
more carefully next lecture, with more motivation. The key point to realize is
this: if you look at the quantifiers, you will notice that the ∃ quantifier comes
second. This means that in “playing the game,” you get the last word. This
does not guarantee that you will win, but it does give you a fighting chance.
Assignment 4½ (due Monday, 10 October)
“Problems” 1 and 3 on page 33 of the book you will find at
http://www.phy.duke.edu/~rgb/Class/intro_physics_1/intro_physics_1.pdf. These
“problems” involve doing some reading—three times—and writing a couple of
short essays. This portion of the book gives advice on how to learn. It’s by one
of my favorite professors when I was a college student. (He was much better at
giving out candy than I am.) The essays will be collected and graded (by the
instructor).
One final note: The assignment will be, quite literally, impossible to complete
unless you start it by Friday, since part of the assignment is to work on it on
three different days.
Assignment 5 (due Wednesday, 12 October)
• Consider the following two problems from the quiz:
– Suppose that x is a positive real number. Show that there is another
positive real number y such that y < x.
– Either state the smallest positive real number, or prove that there is
no such thing.
Explain why these are really the same problem. This problem will be
graded carefully.
• For each of the following statements,
(a) Rewrite it as an if-then statement.
(b) Give the contrapositive. (You may use “bad” as an abbreviation for
“not good.”) Optionally, rewrite it in a form resembling the original,
rather than the “if-then” form.
Here are the statements:
1. Everyone with a beer has an ID.
2. Everyone on the plane has a ticket.
3. Every good boy does fine. [4]
4. Every good boy deserves fudge.
5. Good men die young.
6. Only good men die young. [Note: This one is tricky.]
Statements 2, 4, and 6 will be graded carefully.
• A friend of yours does not understand the formal definition of a limit.
Write a paragraph explaining it to him so that it makes sense. (You may
want to try this at least twice—once before the lecture on Monday, and
once afterwards. Ideally, you should turn in several versions that show
how your own understanding has improved.)
Bonus Exercise. (worth three points) Find a piecewise-constant function f ,
defined on all x for which −2 < x < 2, such that for all such x,
$$|f(x) - x^2| < 1.$$
Essentially, you are trying to approximate the squaring function by an easier-
to-calculate function.
Remember—you need to actually give a proof that your function f works.
This will involve showing, on each “piece” on which f (x) is a constant c, that
$|c - x^2| < 1$ on that “piece.” (You’ll never be able to find one c that works for
all x, which is why you need several “pieces.”)
[4] This sentence is used as a memory aid in music theory for the sequence of letters EGBDF,
Math 131, Lecture 7
Charles Staats
Monday, 10 October 2011
Now, suppose you’re a computer scientist, and someone hands you two programs
and asks you which is faster. You see that the first program takes f (n)
operations, and the second program takes g(n) operations. So, for any given
value of n, you can simply calculate f (n) and g(n) and see which is smaller
(hence faster).
The problem is, the person has told you absolutely nothing about what val-
ues of n they care about. Lacking this knowledge, you assume they probably
care most about what happens when n is very large. Thus, you might con-
sider the following criterion for telling them the f -program is faster than the
g-program:
• For all large n, f (n) < g(n). (1)
Unfortunately, it’s not clear what you mean by “large n.” One simple attempt
would be to say something like “n is large if n > 100.” This sort of idea would
allow you to produce more precise versions of (1):
(i) For all n > 100, f (n) < g(n).
(ii) For all n > 1,000, f (n) < g(n).
(iii) For all n > 10,000, f (n) < g(n).
Question. Which of these criteria is the hardest to prove / least likely to be
true? Which implies the others?
If you notice, each of these versions of (1) has the following form: you first fix
some N , and then take “n is large” to mean that “n > N .” You then get a
version of (1) that reads
• For all n > N , f (n) < g(n).
Unfortunately, you have no idea how fast the person’s computer is, so you have
no idea what N to choose. So why not let it be arbitrary?
• There exists N such that for all n > N , f (n) < g(n). (2)
If you recall our discussion of quantifiers as a “game,” you may notice that
we have just given ourselves an extra “move” by inserting a ∃ quantifier. Note
that you’ve given yourself the first move rather than the “last word.” You had
no choice: the opponent’s move of choosing an n > N does not even make sense
until you’ve told him what N is, so you have to go first.
Question. What happens if you give the opponent both moves?
so badly that we’re likely to get gored. Instead, in (2), we’ve just stumbled upon
another saddle that we can use to sit on the beast without actually touching it:
the “sufficiently large.” When a mathematician writes something like
• For all sufficiently large n, f(n) < g(n),
she means that we get to choose what we mean by “n is large.” In symbols, the
statement above would be written
∃N s.t. ∀n > N, f(n) < g(n).
f (n) = n + 1
g(n) = 2n.
It will turn out that for all n sufficiently large, f (n) < g(n).
Question. How exactly do we show this?
When n = 10, we see that the f -program takes 0.55 times as long as the g-
program. By the time we get to n = 10,000, the f -program takes 0.50005 times
as long as the g-program. It seems that the bigger n gets, the closer f (n)/g(n)
gets to one half. So, we might be tempted to say that “the value of the function
f/g at ∞ is 0.5,” and hence that in some sense, the f-program is twice as fast
as the g-program.
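To see these numbers for yourself, here is a minimal Python sketch. The names f, g, and ratio are just illustrative labels for the operation counts above, and the use of exact fractions is a choice made here so that rounding does not blur the pattern.

```python
from fractions import Fraction


def f(n):
    """Operation count of the first program: f(n) = n + 1."""
    return n + 1


def g(n):
    """Operation count of the second program: g(n) = 2n."""
    return 2 * n


def ratio(n):
    """Exact value of f(n)/g(n), kept as a fraction to avoid rounding."""
    return Fraction(f(n), g(n))


if __name__ == "__main__":
    for n in (10, 100, 10_000, 1_000_000):
        r = ratio(n)
        # The gap between the ratio and 1/2 is exactly 1/(2n):
        # it shrinks as n grows, but it is never 0.
        print(n, float(r), r - Fraction(1, 2))
```

The printed gap is always 1/(2n), which is why the ratio gets close to one half without ever equaling it.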
Unfortunately, if we say it this way, we’ve spooked the Infinity beast. So,
let’s try out our new “saddle” of the “sufficiently large”:
• For all sufficiently large n, f(n)/g(n) = 1/2.
Unfortunately, this is simply not true. No matter how big we make n, f (n)/g(n)
will never be exactly equal to 0.5. What seems to be true is something more
like this:
• For all sufficiently large n, f (n)/g(n) is close to 1/2.
Again, the trouble is that we don’t know what exactly we mean by “close.”
Perhaps we mean that f (n)/g(n) is within .01, or .001, or .0001 of 1/2. But it
looks like it’s even closer than this.
• For every possible notion of “close to 1/2,” for all sufficiently large n,
f(n)/g(n) is “close to 1/2.”
For an arbitrary (small) positive number ε, we get a notion of “close to 1/2” as
“within ε of 1/2.” Thus, in symbols, we may write the above as
∀ε > 0, ∃N s.t. ∀n > N, |f(n)/g(n) − 1/2| < ε.
if
∀ε > 0, ∃N s.t. ∀n > N, |h(n) − c| < ε.
This could also be rewritten as
For all ε > 0, there exists N such that if n > N, then |h(n) − c| < ε.
Writing it this way makes my claim last lecture that “you get the last word”
look a little bit better.
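The quantifier game can be played numerically for the running example h(n) = f(n)/g(n) = (n + 1)/(2n) with c = 1/2. The sketch below is only an illustration: choose_N and judge are hypothetical helper names, and the choice N = 1/(2ε), rounded up, comes from the observation that |h(n) − 1/2| = 1/(2n). A finite check like this is evidence, not a proof.

```python
import math
from fractions import Fraction

HALF = Fraction(1, 2)


def h(n):
    """The ratio f(n)/g(n) for f(n) = n + 1 and g(n) = 2n, as an exact fraction."""
    return Fraction(n + 1, 2 * n)


def choose_N(eps):
    """Our move: |h(n) - 1/2| = 1/(2n), which is below eps exactly when n > 1/(2*eps)."""
    return math.ceil(1 / (2 * eps))


def judge(eps, N, trials=1000):
    """Check the first `trials` integers n > N. The real judge checks all of them."""
    return all(abs(h(n) - HALF) < eps for n in range(N + 1, N + 1 + trials))


if __name__ == "__main__":
    for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
        N = choose_N(eps)
        print(eps, N, judge(eps, N))
```

Notice that choose_N takes ε as its input: the opponent moves first, and our N is allowed to depend on that move.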
Assignment 5 (due Wednesday, 12 October)
• Consider the following two problems from the quiz:
– Suppose that x is a positive real number. Show that there is another
positive real number y such that y < x.
– Either state the smallest positive real number, or prove that there is
no such thing.
Explain why these are really the same problem. This problem will be
graded carefully.
• For each of the following statements,
(a) Rewrite it as an if-then statement.
(b) Give the contrapositive. (You may use “bad” as an abbreviation for
“not good.”) Optionally, rewrite it in a form resembling the original,
rather than the “if-then” form.
Here are the statements:
1. Everyone with a beer has an ID.
2. Everyone on the plane has a ticket.
3. Every good boy does fine.3
4. Every good boy deserves fudge.
5. Good men die young.
6. Only good men die young. [Note: This one is tricky.]
Statements 2, 4, and 6 will be graded carefully.
• A friend of yours does not understand the formal definition of a limit.
Write a paragraph explaining it to him so that it makes sense. (You may
want to try this at least twice—once before the lecture on Monday, and
once afterwards. Ideally, you should turn in several versions that show
how your own understanding has improved.)
Bonus Exercise. (worth three points) Find a piecewise-constant function f ,
defined on all x for which −2 < x < 2, such that for all such x,
|f(x) − x²| < 1.
Essentially, you are trying to approximate the squaring function by an
easier-to-calculate function.
Remember: you need to actually give a proof that your function f works.
This will involve showing, on each “piece” on which f(x) is a constant c, that
|c − x²| < 1 on that “piece.” (You’ll never be able to find one c that works for
all x, which is why you need several “pieces.”)
3 This sentence is used as a memory aide in music theory for the sequence of letters EGBDF,
Assignment 6 (due Friday, 14 October)
Section 1.5, problems 1-9. Remember: It is not enough to get the right answer.
You have to convince the reader that your answer is right. Problems 2, 4, 6,
and 8 will be graded carefully.
Math 131, Lecture 8
Charles Staats
Wednesday, 12 October 2011
Informal: For every version of “close to”, we can choose some meaning
for “large” such that if x is “large,” then f(x) is “close to” ℓ.
Formal: For all real ε > 0, there exists N such that for all x > N,
|f(x) − ℓ| < ε.
The following table shows the correspondence between the informal version
and the formal version.
2 Newton’s “definition of a limit”
Consider the following statement from Isaac Newton’s seminal work, the Philosophiae
Naturalis Principia Mathematica:
Quantities, and the ratios of quantities, which in any finite time
converge continually to equality, and before the end of that time
approach nearer to each other than by any given difference, become
ultimately equal.
This was well over a century before mathematicians came up with the correct definition
of a limit in order to build the “skyscraper” of analysis. Newton was trying to
build his “cloud castle” of Calculus. It’s kind of hard to see in the middle of a
cloud, so it’s no wonder he was confused: he thought he was proving a theorem
rather than stating a definition.
Nevertheless, this statement has some of the key aspects of the definition of
a limit. Newton understood that it is not enough just to say that one quantity
“approaches” another. He put in a key phrase: approaches nearer than by any
given difference. In other words, when we say that “f(t) approaches ℓ,” we
really mean that f(t) becomes arbitrarily close to ℓ. In more modern language,
Newton’s “difference” would probably be called ε. We would say that for any
given ε, f(t) must approach to within ε of ℓ.
And he incorporated another key understanding: how exactly does this “becoming
close” depend on t? Newton saw t as time. What we called f(t), he
might have called “the value of f once the time t has passed.” Letting t get
larger is, for him, simply letting a lot of time pass. And when we think about it
this way, we come to the following realization. In order for f(t) to approach ℓ
“nearer than a given difference,” f(t) must become nearer than that difference
in finite time. In other words, there is a time N, after which f(t) becomes (and
remains) within ε of ℓ.
Thus, in Newton’s language, we have the following definition of limit:
We say that a function f(t) (where t represents time) has a limit
ℓ if for any given difference ε, within finite time, the quantity f(t)
approaches (and remains) nearer to ℓ than by ε.
Exercise. Relate this definition to the formal definition of a limit by making a
table like the one at the end of Section 1.
Theorem. (“Main Limit Theorem”) In the following equations, if the right
side makes sense, then the left side also makes sense and is equal to the right
side.
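1. $\lim_{x \to \infty} k = k$
2. $\lim_{x \to \infty} \frac{1}{x} = 0$
3. $\lim_{x \to \infty} [f(x) + g(x)] = \left[ \lim_{x \to \infty} f(x) \right] + \left[ \lim_{x \to \infty} g(x) \right]$
4. $\lim_{x \to \infty} [f(x) - g(x)] = \left[ \lim_{x \to \infty} f(x) \right] - \left[ \lim_{x \to \infty} g(x) \right]$
5. $\lim_{x \to \infty} [f(x) \cdot g(x)] = \left[ \lim_{x \to \infty} f(x) \right] \cdot \left[ \lim_{x \to \infty} g(x) \right]$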
“The right side makes sense” means, for now, that the limits in question
exist (as real numbers) and there is no division by 0.
This theorem can be proved from the definition of the limit. The proofs are
not even that difficult. But the only way they can ever be interesting is when
you do them yourself. Watching someone else do them is terribly boring, so I’ll
skip the proofs—at least for now—and move straight to discussing how to use
the theorem to actually compute limits.
Warning. If you use this theorem (typically, repeated applications of this theo-
rem) to compute a limit, then you will have shown, in the process, that the limit
exists. However, if you try to apply this theorem, and end up with something
that makes no sense, you will not have shown that the original limit does not
exist.
Example. (Example 2, p. 78 in the textbook) Compute
\[ \lim_{x \to \infty} \frac{x}{1 + x^2}. \]
assuming that the righthand side makes sense. Unfortunately, the right hand
side does not make sense: the limits on the righthand side do not exist.1
A more successful way to solve this problem is to first divide both the top
and the bottom by the highest power of x that appears in the denominator.
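\begin{align*}
\lim_{x \to \infty} \frac{x}{1 + x^2}
  &= \lim_{x \to \infty} \frac{x}{1 + x^2} \cdot \frac{1/x^2}{1/x^2} && \text{(algebra)} \tag{1} \\
  &= \lim_{x \to \infty} \frac{\frac{1}{x}}{\frac{1}{x^2} + 1} && \text{(algebra)} \tag{2} \\
  &= \frac{\lim_{x \to \infty} \frac{1}{x}}{\lim_{x \to \infty} \left[ \left( \frac{1}{x} \right)^2 + 1 \right]} && \text{(Rule 6)} \tag{3} \\
  &= \frac{\lim_{x \to \infty} \frac{1}{x}}{\left( \lim_{x \to \infty} \frac{1}{x} \right)^2 + \lim_{x \to \infty} 1} && \text{(Rules 3, 7)} \tag{4} \\
  &= \frac{0}{0^2 + 1} && \text{(Rules 2, 1)} \tag{5} \\
  &= 0. \tag{6}
\end{align*}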
To the right of each line is written the justification: why do we know it is equal
to the previous line (assuming it is defined)?
A few words should be said on how we actually know the limits exist. If
we actually want to be careful here, our knowledge of the limits goes from the
bottom of the stack of formulas to the top. Because line (5) makes sense, the
theorem tells us that line (4) makes sense and is equal to it. Because line (4)
makes sense, the theorem tells us that line (3) makes sense and is equal to it.
And so on, all the way up to the top (which is what we cared about to begin
with).
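As a sanity check on this computation, the following Python sketch compares the original expression with the rewritten one and then plays the ε-N game for the limit 0. The names q, q_divided, and choose_N are hypothetical, and the choice N = 1/ε is an assumption made here, justified by the inequality x/(1 + x²) < 1/x for x > 0. A finite sample can only support the claim, not prove it.

```python
def q(x):
    """The original expression x / (1 + x^2)."""
    return x / (1 + x**2)


def q_divided(x):
    """The same expression after dividing top and bottom by x^2 (valid for x != 0)."""
    return (1 / x) / ((1 / x) ** 2 + 1)


def choose_N(eps):
    """For x > 0 we have x/(1 + x^2) < 1/x, so x > 1/eps forces q(x) < eps."""
    return 1 / eps


def judge(eps, N, trials=1000):
    """Sample some x > N and test whether |q(x) - 0| < eps at each one."""
    return all(abs(q(N + k)) < eps for k in range(1, trials + 1))


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        print(eps, judge(eps, choose_N(eps)))
```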
General procedure for computing limits of rational functions:
A rational function, as you may recall, is a function of the form
\[ f(x) = \frac{a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0}{b_k x^k + b_{k-1} x^{k-1} + \cdots + b_1 x + b_0}. \]
When faced with a function like this and asked to compute $\lim_{x \to \infty} f(x)$, here
is a procedure that often works:
1. Multiply the numerator and denominator both by $1/x^k$.
2. Use the rules of the “Main Limit Theorem” to “distribute” the limit signs.
Bring them further and further “inside” the formula, until all the limits
are of the form $\lim_{x \to \infty} 1/x = 0$ or $\lim_{x \to \infty} k = k$.
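The outcome of this procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name limit_at_infinity and the encoding of a polynomial as a list of coefficients (highest power first, leading coefficient nonzero) are choices made here, not part of the course. When the numerator has higher degree than the denominator, the sketch returns None, meaning only that the procedure does not apply; as the Warning above says, that by itself shows nothing about the limit.

```python
from fractions import Fraction


def limit_at_infinity(num, den):
    """Limit as x -> infinity of a rational function, following the procedure above.

    num and den are coefficient lists, highest power first, with nonzero leading
    coefficient. For example x / (1 + x^2) is num=[1, 0], den=[1, 0, 1].
    Returns a Fraction, or None when the procedure does not apply.
    """
    n = len(num) - 1  # degree of the numerator
    k = len(den) - 1  # degree of the denominator
    if n > k:
        # After multiplying by 1/x^k a positive power of x is left on top,
        # so the limits never reduce to the two basic forms.
        return None
    # After multiplying top and bottom by 1/x^k, every term is a constant times
    # a power of 1/x. Terms with a positive power of 1/x have limit 0, so only
    # the coefficients of x^k survive.
    top = Fraction(num[0]) if n == k else Fraction(0)
    return top / Fraction(den[0])


if __name__ == "__main__":
    print(limit_at_infinity([1, 0], [1, 0, 1]))     # x / (x^2 + 1)
    print(limit_at_infinity([3, 0, 1], [2, 5, 0]))  # (3x^2 + 1) / (2x^2 + 5x)
```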
1 In a more sophisticated point of view that we will adopt later, the numerator and the
denominator are both ∞. But ∞/∞ still does not make sense, as we will discuss.
Assignment 6 (due Friday, 14 October)
Section 1.5, problems 1-9. Remember: It is not enough to get the right answer.
You have to convince the reader that your answer is right. Problems 2, 4, 6,
and 8 will be graded carefully.
In the textbook, Section 1.5, Problems 15, 16, and 18. Problems 16 and 18 will
be graded carefully.
Math 131, Lecture 9
Charles Staats
Friday, 14 October 2011
\[ \lim_{x \to \infty} f(x) = \ell \]
if
∀ε > 0, ∃N s.t. if x > N, then |f(x) − ℓ| < ε.
Less formally:
This definition involves using two different “saddles” for the Infinity beast:
the saddle of the “arbitrarily small,” and the saddle of the “sufficiently large.”
It is the interaction of these two “saddles” that makes the definition of the limit
so intricate. In order to get the definition right, you have to get both of them,
and they have to be in the right order.
It may help to think of the “quantifier game.” In this definition, there are
three moves:
1. First, your opponent moves. He gets to choose ε. In other words, he gets
to choose how small is “arbitrarily small.”
2. Then, you get to move, by choosing N . Informally, you get to say how
large is “sufficiently large.” Your choice can depend on ε, which your
opponent has already chosen. But it cannot depend on x, which has not
been chosen yet.
3. Finally, the judge gets to go—to decide who wins. In a sense, he gets to
choose x; then, he determines whether this x does what you said (in which
case you win) or not.
To reiterate:
More formally,
[Figure: graph of f(x).]
There is a “missing point” at x = 2, but it is clear what the value f (2) ought
to be. Mathematicians have a way of stating, precisely, that “the value of f (x)
at 2 ought to be 1”: we write
\[ \lim_{x \to 2} f(x) = 1. \]
More generally, for a given number c and a given value ℓ, we can claim that “as
x → c, f(x) → ℓ,” which is also written
\[ \lim_{x \to c} f(x) = \ell. \]
c = 2
ℓ = 1.
The definition of this sort of limit is quite similar to the definition we’ve
already encountered. For the previous definition, we needed the concept of
“arbitrarily large x” to let us talk about what happens at ∞ without
spooking the Infinity beast. In a sense, this is the same thing as “arbitrarily
close to ∞.”
This time around, we’re trying to understand what is (or should be) happen-
ing at c without actually touching c. (If you like, we want to avoid “spooking
the c-beast.”) The notion that replaces “arbitrarily large” is “arbitrarily close
to c.” Thus, we get the following definition.
Definition. We say
\[ \lim_{x \to c} f(x) = \ell \]
if
More formally,
The condition x ≠ c is added since we want to figure out what the value
“should be” at c, without actually touching c.
Example. Consider the function f defined by
\[ f(x) = \begin{cases} 2x - 1 & \text{if } x \neq 2, \\ 4 & \text{if } x = 2. \end{cases} \]
[Figure: graph of f(x).]
\[ \lim_{x \to 2} f(x) = 3. \]
Note: When looking at this sort of example, we are not using the formal
definition of the limit to better understand the function f . We are using the
function f to better understand the definition. The formal definition becomes
really useful when we are dealing with functions f for which we don’t have
formulas.
Solution. Let ε > 0 be given (by our opponent; we can’t choose it). Before we
go around choosing δ haphazardly to set what is “sufficiently close to 2,” let’s
anticipate what the judge will say. In other words, let’s “solve” the inequality
he cares about as best we can, without knowing ε:
|f (x) − 3| < ε.
Since the judge does not care what happens when x = c = 2, we can assume
f(x) = 2x − 1. In this case, the judge’s “test” is whether
|(2x − 1) − 3| < ε
|2x − 4| < ε
2|x − 2| < ε
|x − 2| < ½ε.
Now, we could proceed to finish “solving” the inequality; but in this case, that
would be counterproductive. The condition we impose, by our choice of δ, is
that
|x − 2| < δ.
Thus, if we set δ = ½ε, we are guaranteed that the judge will like all the
values of x “sufficiently close to 2” that we allow him to look at.
Important: δ may depend on ε, and usually will. (Since our opponent has
already chosen ε, we’re allowed to use it.) But δ cannot depend on x. (The
judge doesn’t choose x until after we’ve already chosen δ.)
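The three moves of the game can be acted out in Python for this example. This is only a sketch: choose_delta and judge are hypothetical names, and the judge here samples finitely many x rather than all of them, so a passing run is evidence rather than a proof.

```python
def f(x):
    """The example function: 2x - 1 everywhere except at x = 2, where it is 4."""
    return 4 if x == 2 else 2 * x - 1


def choose_delta(eps):
    """Our move. It may use eps (already chosen by the opponent) but not x."""
    return eps / 2


def judge(eps, delta, samples=1000):
    """The judge's move: pick x with 0 < |x - 2| < delta and test |f(x) - 3| < eps."""
    for i in range(1, samples + 1):
        offset = delta * i / (samples + 1)  # strictly between 0 and delta
        for x in (2 - offset, 2 + offset):
            if x != 2 and not abs(f(x) - 3) < eps:
                return False
    return True


if __name__ == "__main__":
    for eps in (1.0, 0.5, 0.001):
        print(eps, judge(eps, choose_delta(eps)))
```

The value f(2) = 4 never enters the test, which is exactly the point of excluding x = c.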
Assignment 7 (due Monday, 17 October)
Do the exercise at the end of Lecture 8, Section 2 on Newton’s “definition of a
limit.”
In the textbook, Section 1.5, Problems 15, 16, and 18. Problems 16 and 18 will
be graded carefully.
Anything that appeared on a quiz will probably show up in some form on the
test. (Exception: no contrapositive questions.) Anything that appeared on a
non-bonus homework question might show up on the test.
If I said something in a lecture that did not make it in any form into a quiz or
homework question, then it will not be on the test.
Math 131, Lecture 10
Charles Staats
Monday, 17 October 2011
1 Definition: Limits as x → c
Recall from last time the definition of the limit of f (x) as x → c:
Definition. We say
\[ \lim_{x \to c} f(x) = \ell \]
if
More formally,
• We specifically exclude the “judge” from looking at the value of f (x) when
x = c, because the limit is supposed to detect what “should be” going on
at c without actually touching c. This was not an issue for limx→∞ f (x),
where we’re sort of “setting c = ∞,” because f (∞) does not make sense
anyway.
2 Examples: Using the ε-δ definition
At this point, we’ll begin trying to understand the definition better by doing
some examples of ε-δ proofs. We are doing this to help us understand the defini-
tion, and the concept, of limit, which is much more useful in more complicated
situations.
Example. Consider the function f defined by
\[ f(x) = \begin{cases} 2x - 1 & \text{if } x \neq 2, \\ 4 & \text{if } x = 2. \end{cases} \]
Note: When looking at this sort of example, we are not using the formal
definition of the limit to better understand the function f . We are using the
function f to better understand the definition. The formal definition becomes
really useful when we are dealing with functions f for which we don’t have
formulas.
Let ε > 0 be given (by our opponent; we can’t choose it). Before we go
around choosing δ haphazardly to set what is “sufficiently close to 2,” let’s
anticipate what the judge will say. In other words, let’s “solve” the inequality
he cares about as best we can, without knowing ε:
|f (x) − 3| < ε.
Since the judge does not care what happens when x = c = 2, we can assume
f(x) = 2x − 1. In this case, the judge’s “test” is whether
|(2x − 1) − 3| < ε
|2x − 4| < ε
2|x − 2| < ε
|x − 2| < ½ε.
Now, we could proceed to finish “solving” the inequality; but in this case, that
would be counterproductive. The condition we impose, by our choice of δ, is
that
|x − 2| < δ.
Thus, if we set δ = ½ε, we are guaranteed that the judge will like all the
values of x “sufficiently close to 2” that we allow him to look at.
Important: δ may depend on ε, and usually will. (Since our opponent has
already chosen ε, we’re allowed to use it.) But δ cannot depend on x. (The
judge doesn’t choose x until after we’ve already chosen δ.)
If you recall the discussion of the “narrative” of a proof with quantifiers, the
way you “tell” a proof is often quite different from the way you work it out.
Now that we’ve worked out what the value of δ should be, let’s “tell” the story
of what goes on in the courtroom, without trying to get into the characters’
heads (as we were, earlier, by anticipating the judge). Just the facts, ma’am.
Solution. Let ε > 0 be given. Set δ = ½ε. Assume |x − 2| < δ and x ≠ 2.
Since x ≠ 2, we know f(x) = 2x − 1. Hence,
|f(x) − 3| = |2x − 1 − 3|
= |2x − 4|
= 2|x − 2|
< 2δ
= 2(½ε)
= ε.
Test Wednesday, 19 October
The test includes lectures through Wednesday, October 12, and assignments 1
through 7 (but not assignment 4.5). Note: The assignment numbers are one off
from the lecture numbers, since I gave no (non-bonus) assignment on the first
day of class. In particular, although the test does not include any new material
from Lectures 9 and 10, it does include the homework set due Monday, 17
October.
Anything that appeared on a quiz will probably show up in some form on the
test. (Exception: no contrapositive questions.) Anything that appeared on a
non-bonus homework question might show up on the test.
If I said something in a lecture that did not make it in any form into a quiz or
homework question, then it will not be on the test.
\[ \lim_{x \to 0} 7x = 0 \tag{1} \]
\[ \lim_{x \to 1} 2x = 2 \tag{2} \]
\[ \lim_{x \to -1/2} (4x + 1) = -1 \tag{3} \]
lim 1 x − 2 = − 12 (4)
x→5 2
Math 131, Lecture 11
Charles Staats
Friday, 21 October 2011
can be abbreviated as
1 < x < 2.
The condition
1 < x or x<2
has no such abbreviation. These sorts of “combined statements” can only
be used for and. If you are dealing with an or statement, you have to
write it out in full, with the two statements included.
[Incidentally, the particular or statement above is actually equivalent to
the statement x = x. Why?]
• |2x + 7| > 5 is an or condition. |2x + 7| < 5 is an and condition.
• When you complete the square on 2(b), you should end up with
(x + 2)² ≥ 25.
At this point, since the right side is positive, you take the square root of
both sides, getting
|x + 2| ≥ 5.
You can then apply the rules for solving absolute value inequalities.
If you were solving an equation, you would probably want to use a ± sign
rather than an absolute value. This does not work reliably for inequalities.
• It is certainly possible for f (x) to equal its limit for large x. We just
don’t often look at such examples because they are not very interesting.
However, when we study limits as x → c, we do explicitly disregard what
happens at x = c. It’s important to distinguish between what happens
to the y-values (f(x) can equal the limit ℓ, although it does not have to) and the
x-values (x cannot equal c, at least as far as the judge is concerned).
Definition. (Terribly Imprecise Version) We say that
\[ \lim_{x \to c} f(x) = \ell \]
Assignment 9 (due Monday, 24 October, 2011)
Read Section 1.1. If you aren’t comfortable with sine and cosine, don’t pay too
much attention to the examples involving them. You do need to understand the
picture in Example 6, however.
Section 1.1, Problems 29 and 30. These problems are (unusually) of the “answers
only” variety. Problem 30 will be graded carefully.
Section 1.2, Problems 11–15. This time, the proof needs to be done carefully.
Problems 12, 14, and 15 will be graded carefully.
Math 131, Lecture 12
Charles Staats
Monday, 24 October 2011
We have not defined this function at x = −1, but for the purpose of considering
\[ \lim_{x \to -1} f(x), \]
f does not have to be defined at −1; and even if it is, we don’t care what its
value is.
[Figure: graph of f(x) against x, with axis marks running from −4 to 4.]
In this situation, the limit does not exist. To handle “jumps” like this, we have
the notion of one-sided limits.
Definition. We say that “f(x) approaches ℓ as x approaches c from the left,”
written
\[ \lim_{x \to c^-} f(x) = \ell, \]
if
∀ε > 0, ∃δ > 0 s.t. if c − δ < x < c, then |f(x) − ℓ| < ε.
The boxed part (the condition c − δ < x < c) says that “x is to the left of c and within δ of it”:
[Figure: number line marking c − δ, c, and c + δ.]
\[ \lim_{x \to c^+} f(x) = \ell, \]
if
∀ε > 0, ∃δ > 0 s.t. if c < x < c + δ, then |f(x) − ℓ| < ε.
[Figure: number line marking c − δ, c, and c + δ.]
Exercise. Using this ε-δ definition, show, for the function f defined above, that
\[ \lim_{x \to -1^-} f(x) = 2 \]
\[ \lim_{x \to -1^+} f(x) = 1. \]
Theorem. The two-sided limit limx→c f (x) exists if and only if both the one-
sided limits exist and are equal. In this case, we have
This theorem is not that difficult to prove, but we will refrain because of
time constraints. The basic idea is as follows: when the opponent gives us an
ε, we
• Find a δ1 that works for the left-hand limit.
• Find a δ2 that works for the right-hand.
• Set δ = min{δ1 , δ2 }.
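Here is a small Python sketch of the three steps just listed. The function p is a made-up example (it is not the function from the lecture's figure): p(x) = 2x to the left of 1 and p(x) = 3 − x from 1 onward, so both one-sided limits at c = 1 equal 2. The helper names are hypothetical, and the sampled check is evidence only.

```python
def p(x):
    """A hypothetical piecewise-linear function whose one-sided limits at 1 agree."""
    return 2 * x if x < 1 else 3 - x


C, L = 1, 2  # the point c and the common value of the one-sided limits


def delta_left(eps):
    """Works on the left: |2x - 2| = 2|x - 1| < eps when |x - 1| < eps/2."""
    return eps / 2


def delta_right(eps):
    """Works on the right: |(3 - x) - 2| = |x - 1| < eps when |x - 1| < eps."""
    return eps


def choose_delta(eps):
    """A delta that works on both sides at once: the smaller of the two."""
    return min(delta_left(eps), delta_right(eps))


def judge(eps, delta, samples=1000):
    """Sample x on both sides of c with 0 < |x - c| < delta."""
    offsets = [delta * i / (samples + 1) for i in range(1, samples + 1)]
    return all(abs(p(C + s * t) - L) < eps for t in offsets for s in (-1, 1))


if __name__ == "__main__":
    print(judge(0.1, choose_delta(0.1)))  # the combined delta
    print(judge(0.1, delta_right(0.1)))   # the right-hand delta alone
```

Using the right-hand δ by itself lets the judge pick points on the left that are too far away, which is why the minimum is taken.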
1.2 Infinite limits
We say $\lim_{x \to c} f(x) = \infty$ if
\[ \lim_{x \to 1^-} f(x) = -\infty \]
\[ \lim_{x \to 1^+} f(x) = \infty. \]
[Figure: graph of y against x, with axis marks running from −4 to 4.]
Consider
\[ \lim_{x \to 0} f(x). \]
No matter how small x is, there are always smaller values at which f is 0 (say,
x = 1/n, for some really big integer n) and smaller values at which f is 1 (say,
x = √2/n, for an even bigger integer n). So, no version of the limit as x → 0
can exist: not the left-hand limit, not the right-hand limit, not even if we allow
limits that are ±∞.
[Figure: graph of y against x, with axis marks running from −4 to 4.]
f(x) = x².
Show that
\[ \lim_{x \to 2} f(x) = 4. \]
At this point, it might be tempting to say, “Let δ be ε/|x + 2|.” However, to
set δ like this, we’d have to know x, and we don’t: the judge does not choose x
until after we’ve chosen δ.
This is the sort of thing that makes ε-δ proofs so much more difficult for
nonlinear functions: We have to control x to make two things happen at once:
• Make sure ε/|x + 2| does not get too small.
• Make sure |x − 2| does not get too big.
The second task is easy—it’s precisely what the δ is designed to do. But the
first task is much harder, and it’s where we have to start.
Suppose we know that |x − 2| < 1. If you “solve” this inequality, you get
precisely that 1 < x < 3. Thus, x + 2 is clearly positive, and so |x + 2| = x + 2.
Thus, we have
1 < x < 3
3 < x + 2 < 5
3 < |x + 2| < 5
1/3 > 1/|x + 2| > 1/5
ε/3 > ε/|x + 2| > ε/5.
Note: in the inequalities above, each line implies the next, but is not necessarily
equivalent.
Thus, we have:
If |x − 2| < 1, then ε/5 < ε/|x + 2|.
As long as we choose δ ≤ 1, we have some control over how small ε/|x + 2| can
be. This takes care of the first, problematic task. Since we’ve done that, the
second task is comparatively easy: we need to ensure that
|x − 2| < ε/5.
This works as long as δ ≤ ε/5.
So, to accomplish the first task, we need δ ≤ 1. Once we’ve done this, to
accomplish the second task, we need δ ≤ ε/5. Since 1 and ε/5 are both positive
numbers, we can simply set
δ = min{1, ε/5}.
Now, we’ve finally got a plan. Let’s head into the courtroom and see what the
judge says.
Proof. Let ε > 0 be given. Set
δ = min{1, ε/5}.
Now, suppose 0 < |x − 2| < δ. Then we have
|x² − 4| = |x − 2||x + 2|.
To proceed further, we need to know something about |x + 2|. Since δ ≤ 1, we
know
|x − 2| < 1
−1 < x − 2 < 1
1<x<3
3 < x + 2 < 5,
3 < |x + 2| < 5.
Hence, |x + 2| is a positive number less than 5. Thus, we have
|x² − 4| = |x + 2||x − 2|
< 5|x − 2|
< 5δ, since |x − 2| < δ
≤ 5 · (ε/5), since δ ≤ ε/5
= ε.
Thus, if 0 < |x − 2| < δ, then |x² − 4| < ε. Hence, the limit is as claimed.
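The choice δ = min{1, ε/5} can be tested numerically. In the Python sketch below, choose_delta and judge are hypothetical helper names, and the judge only samples finitely many x, so this supports the proof rather than replacing it. The last line shows why the “1” inside the minimum matters when ε is large.

```python
def choose_delta(eps):
    """The delta from the proof: never more than 1, never more than eps/5."""
    return min(1, eps / 5)


def judge(eps, delta, samples=1000):
    """Sample x with 0 < |x - 2| < delta and test whether |x^2 - 4| < eps."""
    offsets = [delta * i / (samples + 1) for i in range(1, samples + 1)]
    return all(abs((2 + s * t) ** 2 - 4) < eps for t in offsets for s in (-1, 1))


if __name__ == "__main__":
    for eps in (100, 1, 0.01):
        print(eps, judge(eps, choose_delta(eps)))
    # Dropping the cap at 1: with eps = 100, delta = eps/5 = 20 is far too generous.
    print(judge(100, 100 / 5))
```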
Bonus Exercise. Show, using an ε-δ argument, that
\[ \lim_{x \to 3} x^2 = 9. \]
Assignment 10 (due Wednesday, 26 October, 2011)
Consider the piecewise-linear function f defined by
\[ f(x) = \begin{cases} 3x + 11 & \text{if } x \le -3, \\ 1 - \frac{1}{3}x & \text{if } -3 < x \le 3, \\ 3 - x & \text{if } x > 3. \end{cases} \]
2. Use the graph to “guess” what $\lim_{x \to -3} f(x)$ and $\lim_{x \to 3} f(x)$ are. If it looks like
the two-sided limits don’t exist, sleep on it, and then try graphing the
function again.
3. Give ε-δ proofs that the two limits in the question above are what you
say they are. Remember: You should have one proof for each (two-sided)
limit. (Hint: you will probably want to let δ = min{δ1 , δ2 }, for appropriate
values of δ1 and δ2 .)
The third problem will be graded carefully.
• Do the exercise on page 2 of the notes for Lecture 12. This will be graded
carefully.
• Let a ≠ 0 and c be arbitrary real numbers. (In other words, the “opponent”
gets to choose a and c, and he is allowed to choose anything as long
as he does not set a equal to 0.) Give an ε-δ proof that
\[ \lim_{x \to c} ax = ac. \]
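• A “sequence” $(a_n)$ is a list of numbers, for instance,
\[ 0, \frac{1}{2}, \frac{3}{4}, \frac{7}{8}, \frac{15}{16}, \ldots. \]
Typically, the nth term will be denoted $a_n$. Thus, in the sequence above,
we have
\[ a_1 = 0, \quad a_2 = \frac{1}{2}, \quad a_3 = \frac{3}{4}, \quad a_4 = \frac{7}{8}, \quad a_5 = \frac{15}{16}, \quad \ldots, \quad a_n = \frac{2^{n-1} - 1}{2^{n-1}}, \quad \ldots \]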
Math 131, Lecture 13
Charles Staats
Wednesday, 26 October 2011
[Figure: graph of f(x) against x, with axis marks running from −4 to 4.]
Solution. First, since I don’t feel like doing an extensive preliminary analysis,
I’m going to let you in on a little trick: If you have a function that looks like
f(x) = mx + b,
with m ≠ 0, and you want to do an ε-δ proof for f, then
δ = (1/|m|) ε
is a good guess for δ. I’m not saying it will always work, but it’s worth trying.
Now, for the function f in this particular example, we actually have two
linear equations. To the left of x = 1, we have f (x) = −7x + 8. This suggests
that for the left-hand limit, we should take
δ₁ = (1/|−7|) ε = (1/7) ε.
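The guess δ = ε/|m| is easy to try out in Python. In this sketch, delta_guess and judge_left are hypothetical names; the left-hand piece f(x) = −7x + 8 is taken from the text, and the target value 1 is simply that formula evaluated at x = 1, which is an assumption about what the left-hand limit should be. The check samples finitely many points to the left of 1.

```python
def delta_guess(m, eps):
    """The rule-of-thumb guess for a line of slope m: delta = eps / |m|."""
    if m == 0:
        raise ValueError("the guess needs a nonzero slope")
    return eps / abs(m)


def left_piece(x):
    """The formula for f to the left of x = 1."""
    return -7 * x + 8


def judge_left(eps, delta, c=1, limit=1, samples=1000):
    """Sample x with c - delta < x < c and test whether |f(x) - limit| < eps."""
    xs = [c - delta * i / (samples + 1) for i in range(1, samples + 1)]
    return all(abs(left_piece(x) - limit) < eps for x in xs)


if __name__ == "__main__":
    for eps in (1.0, 0.1, 0.001):
        d1 = delta_guess(-7, eps)
        print(eps, d1, judge_left(eps, d1))
```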
2 Continuity
Given what we’ve already seen, the simplest definition of continuity is the fol-
lowing:
Definition. A function f is said to be continuous at a point x0 if
(i) f is defined at x0 , and
[Figure: graph of f(x) against x, with axis marks running from −4 to 4.]
The x-values −3 and 1. The other x-values listed above do not lie in
the domain of f .
[Figure: graph of g(x) against x, with axis marks running from −4 to 4.]
Here’s the picture:
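[Figure: graph of f running from the point (a, f(a)) to the point (b, f(b)) and passing through (x0, y0), with a, x0, b marked on the horizontal axis and f(a), y0, f(b) on the vertical axis.]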
Intuitively, for the (continuous) f to go from the line y = f (a) to the line
y = f (b), it has to pass through the line y = y0 somewhere. That “somewhere”
is our x0 .
We won’t try to prove this right now.
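The theorem only says that such an x0 exists. One standard way to hunt for it numerically is bisection, which is a technique brought in here for illustration and not something from the lecture: keep halving the interval, always keeping the half on which the function values still straddle y0. In the sketch below, bisect_for_value and the sample function are hypothetical, and the result is an approximation, not a proof of existence.

```python
def bisect_for_value(f, a, b, y0, tol=1e-12):
    """Approximate an x0 in [a, b] with f(x0) = y0, for continuous f.

    Requires y0 to lie between f(a) and f(b), as in the theorem's hypothesis.
    """
    fa, fb = f(a), f(b)
    if not (min(fa, fb) <= y0 <= max(fa, fb)):
        raise ValueError("y0 is not between f(a) and f(b)")
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # Keep the half whose endpoint values still lie on opposite sides of y0.
        if (f(lo) - y0) * (f(mid) - y0) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


def sample(x):
    """A continuous function used only as an example: x^3 + x."""
    return x**3 + x


if __name__ == "__main__":
    x0 = bisect_for_value(sample, 0.0, 2.0, 5.0)
    print(x0, sample(x0))
```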
Assignment 11 (due Friday, 28 October, 2011)
• Section 1.5, Problems 51 and 52. Problem 52 will be graded carefully.
• Section 1.6, Problems 1, 3, and 5. None of these will be graded carefully.
• Do the exercise on page 2 of the notes for Lecture 12. This will be graded
carefully.
• Let a ≠ 0 and c be real numbers (which you don’t get to choose). Give
an ε-δ proof that
\[ \lim_{x \to c} ax = ac. \]
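\[ a_1 = 0, \quad a_2 = \frac{1}{2}, \quad a_3 = \frac{3}{4}, \quad a_4 = \frac{7}{8}, \quad a_5 = \frac{15}{16}, \quad \ldots, \quad a_n = \frac{2^{n-1} - 1}{2^{n-1}}, \quad \ldots \]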
Assignment 12 (due Monday, October 31, a.k.a.
Halloween)
Section 1.3, Problems 3, 4, 7, and 8. Remember, these problems are about
showing you understand how to use the Main Limit Theorem, not about finding
the limits. Problems 4 and 8 will be graded carefully.
Section 1.6, Problems 2, 4, 6-8, 32, and 33. Problems 6-8 and 32 will be graded
carefully. You do not need to give ε-δ proofs for these problems.
Bonus problem: Let f be a function (which you do not get to choose). Consider
the statement
For all x₀ in the domain of f, for all ε > 0, there exists δ > 0 such
that whenever |x − x₀| < δ, then |f(x) − f(x₀)| < ε.
Explain why this is equivalent to the statement that “f is continuous.” (One
dilemma you may need to address: Why does it make no difference if we write
|x − x₀| < δ rather than 0 < |x − x₀| < δ?)
Math 131, Lecture 14
Charles Staats
Friday, 28 October 2011
need calculus in applied science (physics, chemistry, atmospheric chemistry,. . . )
this is most likely the language you will see. But I’m honestly not sure which
approach is less confusing to see first, so I’m going to accept the following
wisdom: When in doubt, follow the textbook. More or less.
This quantity (when it exists) is called the derivative of f at t0 .
By hypothesis, f (c) is defined, i.e., the last line makes sense. By the Main Limit
Theorem, the previous line makes sense and is equal to it, and so on all the way
up. Thus, for every c in the domain of f ,
Solution. [Figure: graph with the point (4, 64) and the value x0 marked.]
Use the Intermediate Value Theorem to prove that, no matter what Diophantus
of Alexandria might have thought, √2 does, in fact, exist. (In other words, there
exists a positive real number x₀ such that x₀² = 2.) This problem will be graded
carefully.
Section 2.2, Problems 45-48 and 51, 52. Be sure to follow the instructions
carefully on 51 and 52; these problems are as much about how you find the
derivative, as what answer you get. Problems 46, 48, and 52 will be graded
carefully.
Let a ≠ 0 be a real number (which you don’t get to choose). Let f be the
function defined by
f (x) = ax.
Show that f is continuous using an ε-δ proof. This problem will be graded
carefully.
Math 131, Lecture 15
Charles Staats
Monday, 31 October 2011
However, it can fail rather drastically for more complicated curves. In the
curve below, the almost-vertical line is the one that intersects the curve in only
one point, while the almost-horizontal line clearly “ought” to be the tangent
line. (Intuitively, the almost-vertical line crosses the curve, while the almost-
horizontal line does not—at least, not at the point in question.)
For another example, in the following picture, neither the vertical nor the hori-
zontal line really “touches the curve without crossing it.” Each of them intersects
the curve exactly once. But if one of them is the tangent line, it is the horizontal
line rather than the vertical line.
Thus, we take another approach to defining what exactly the tangent line
should be. An easier definition is to define a secant line—that is, a line that
passes through two specified points on a curve. This is easy to specify, since
two points determine a line. We want to think of a tangent line as a “secant
line that passes through the same point twice.” Unfortunately, this does not
actually make any sense.
To remedy the situation, we consider another way of specifying a line: a
point (x₀, y₀) together with a slope ∆y/∆x. Thus, the secant line through
(x₀, y₀) and (x, y) is the line passing through (x₀, y₀) with slope equal to
\[ \frac{\Delta y}{\Delta x} = \frac{y - y_0}{x - x_0}. \]
If we want to take the tangent line at (x0 , y0 ), we already have a point through
which the line should pass. We just need to know what its slope ought to be.
This is essentially the same problem we were faced with last lecture—we need
a definition for “slope at a point,” in spite of the fact that slope is, inherently, a
property relating two different points. And we solve it the same way: we take
a limit. We say that the slope of the tangent line is
\[ \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}. \]
The picture below shows the tangent line as a limit of secant lines:
[Figure: secant lines through (x0, y0) approaching the tangent line at that point.]
Now, if you recall the previous definition of the derivative, you will see that, if
y is given as y = f (x) for some function f , then in fact, we will have the slope
of the tangent line equal to the derivative:
\[ \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \left.\frac{dy}{dx}\right|_{x=x_0} = f'(x_0). \]
\[ f(x) = x^2. \]
Compute the derivative of f at x0 = 1. Plot the function and the line tangent
to f at x0 .
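Solution. First, let’s solve for ∆y in terms of ∆x, using x0 = 1:
\[ \Delta y = f(x_0 + \Delta x) - f(x_0) = (1 + \Delta x)^2 - 1^2 = 2\Delta x + (\Delta x)^2. \]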
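Thus, we have
\begin{align*}
f'(x_0) &= \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{2\Delta x + (\Delta x)^2}{\Delta x} \\
&= \lim_{\Delta x \to 0} (2 + \Delta x) \\
&= 2.
\end{align*}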
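As a quick numerical sanity check (not part of the original notes), the sketch below computes secant slopes for f(x) = x^2 at x0 = 1 with shrinking ∆x. The helper name `difference_quotient` is my own choice for illustration.

```python
def difference_quotient(f, x0, dx):
    """Slope of the secant line through (x0, f(x0)) and (x0 + dx, f(x0 + dx))."""
    return (f(x0 + dx) - f(x0)) / dx


def f(x):
    return x ** 2


# As dx shrinks (from either side), the secant slopes should approach 2.
for dx in (0.1, 0.01, 0.001, -0.001):
    print(dx, difference_quotient(f, 1.0, dx))
```

Each printed slope should be close to 2 + ∆x, matching the algebra above.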
Now, we plot the function y = f(x), together with the line passing through
(x0, f(x0)) = (1, 1) and having slope $f'(x_0) = 2$:
[Figure: graph of y = f(x) with the tangent line at (x0, f(x0)).]
2 Infinitesimals
The idea of infinitesimals, as it relates to slopes of tangent lines, is to define
the tangent line to f at x0 as the line through (x0 , y0 ) and another point (x0 +
dx, y0 + dy) that is “infinitely close” to the first point. What this means, in
this example, is that dx is “so small” that $dx^2 = 0$, even though dx is not zero.
This sort of makes sense, in that the square of a small number is a much smaller
number; for instance,
\[ 0.001^2 = 0.000001. \]
It does not really make sense (no nonzero number can square to zero), but
that’s why I called this “walking on clouds.”
To start with, we treat the “infinitesimal changes” dx and dy exactly as
though they were more conventional changes ∆x and ∆y. Our earlier compu-
tation of ∆y in terms of ∆x still holds:
\begin{align*}
dy &= 2\,dx + dx^2 \\
dy &= 2\,dx \quad \text{since } dx^2 = 0 \\
\frac{dy}{dx} &= 2
\end{align*}
when evaluated at the point x0 = 1.
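One way to play with the rule "dx^2 = 0" on a computer is so-called dual-number arithmetic. The sketch below is only an illustration under that assumption; the class `Dual` and its field names are hypothetical and are not something the lecture itself uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A quantity a + b*dx, where dx*dx is declared to be 0."""
    real: float
    inf: float  # coefficient of dx

    def __add__(self, other):
        return Dual(self.real + other.real, self.inf + other.inf)

    def __sub__(self, other):
        return Dual(self.real - other.real, self.inf - other.inf)

    def __mul__(self, other):
        # (a + b dx)(c + d dx) = ac + (ad + bc) dx + bd dx^2, and dx^2 = 0.
        return Dual(self.real * other.real,
                    self.real * other.inf + self.inf * other.real)


dx = Dual(0.0, 1.0)
x0 = Dual(1.0, 0.0)

# dy = f(x0 + dx) - f(x0) for f(x) = x^2
dy = (x0 + dx) * (x0 + dx) - x0 * x0

print(dx * dx)  # Dual(real=0.0, inf=0.0)
print(dy)       # Dual(real=0.0, inf=2.0), i.e. dy = 2 dx
```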
Assignment 13 (due Wednesday, 2 November)
NOTE: From this assignment on, you no longer need to write anything about
“By the Main Limit Theorem,. . . ” when showing your work to take a limit.
(You should, however, continue to show your work.)
1
Use the Intermediate Value Theorem to prove that, no matter what Diophantus
of Alexandria might have thought, $\sqrt{2}$ does, in fact, exist. (In other words, there
exists a positive real number $x_0$ such that $x_0^2 = 2$.) This problem will be graded
carefully.
Section 2.2, Problems 45-48 and 51, 52. Be sure to follow the instructions
carefully on 51 and 52; these problems are as much about how you find the
derivative, as what answer you get. Problems 46, 48, and 52 will be graded
carefully.
Let a ≠ 0 be a real number (which you don’t get to choose). Let f be the
function defined by
f (x) = ax.
Show that f is continuous using an ε-δ proof. This problem will be graded
carefully.
Math 131, Lecture 16
Charles Staats
Wednesday, 2 November 2011
\[ f'(x_0) = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}. \tag{1} \]
One key to mastering mathematics is being able to move facilely among dif-
ferent ways of saying the same thing; which way you want to say it may
depend on what you want to use it for. We’re going to review some other
ways to write the definition of the derivative, using the various relations among
x, x0 , ∆x, y, ∆y, f (x0 ), . . ..
First, observe that
\[ \Delta x = x - x_0 \quad \text{and} \quad \Delta y = y - y_0 = f(x) - f(x_0). \]
Thus,
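\[ \frac{\Delta y}{\Delta x} = \frac{f(x) - f(x_0)}{x - x_0}, \]
and
\begin{align*}
& \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \ell \\
\iff{} & \forall \varepsilon > 0,\ \exists \delta > 0 \text{ s.t. if } 0 < |\Delta x - 0| < \delta, \text{ then } \left| \frac{\Delta y}{\Delta x} - \ell \right| < \varepsilon \\
\iff{} & \forall \varepsilon > 0,\ \exists \delta > 0 \text{ s.t. if } 0 < |x - x_0| < \delta, \text{ then } \left| \frac{f(x) - f(x_0)}{x - x_0} - \ell \right| < \varepsilon \\
\iff{} & \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \ell.
\end{align*}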
In other words, an alternate definition for the derivative is given by
\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}. \tag{2} \]
This definition highlights the feature that the derivative only depends on what
is happening to f near x0 . If we look at a different function g that cannot be
distinguished from f near x0 , then f and g will have the same derivative at x0 ;
i.e., $f'(x_0) = g'(x_0)$.
Another way to state the definition of the derivative is to express ∆y in
terms of x0 and ∆x, rather than x0 and x.
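\begin{align*}
\Delta y &= f(x) - f(x_0) \\
&= f(x_0 + \Delta x) - f(x_0),
\end{align*}
so, writing h in place of ∆x, the definition becomes
\[ f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}. \tag{3} \]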
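To see that forms (2) and (3) really are the same quotient written two ways, here is a small numerical sketch. The function names `form_2` and `form_3` and the test function `cube` are my own, chosen only for illustration.

```python
def form_2(f, x0, x):
    """Quotient from definition (2): written in terms of x0 and x."""
    return (f(x) - f(x0)) / (x - x0)


def form_3(f, x0, h):
    """Quotient from definition (3): written in terms of x0 and h."""
    return (f(x0 + h) - f(x0)) / h


def cube(x):
    return x ** 3


x0 = 2.0
for h in (0.5, 0.25, 0.125):
    # Substituting x = x0 + h turns one quotient into the other.
    print(h, form_2(cube, x0, x0 + h), form_3(cube, x0, h))
```

The two columns should match, and both should drift toward the derivative of x^3 at 2 as h shrinks.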
2 Derivative as a function
In the definition of (3), one feature is that there are no appearances of the letter
x except in the variable x0 . Thus, we can rename x0 as x, obtaining
\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \]
The interesting feature here is that when we rewrite the definition this way,
it becomes obvious that we have defined more than a number $f'(x_0)$; we have
defined a function $f'$.
There’s a subtlety here that confused me when I first saw this sort of thing.
It involves the interplay of intuition and rigorous mathematics. Intuitively,
when we write x, we think of it as a variable—something that is allowed to
range over many different numbers. On the other hand, when we write x0 ,
we think of this as a particular value of x, a particular number; we just don’t
happen to know what number it is. These intuitions are valuable. However, it
is equally valuable to realize that these intuitions have absolutely no reflection
in the rigorous mathematics. As far as the pure logic is concerned, x and x0 are
both variables, and that’s all there is to it. So whenever we have a statement
that involves only one, we can substitute the other, and get an equally true
expression that feels very different, intuitively.
This is typical of a certain kind of reasoning that appears sometimes in
mathematics. First, you let your intuition guide you, as we did (more or less) in
defining the derivative. Then you do something with rigorous mathematics to
change the statement into something equivalent, but that feels intuitively very
different. At this point, you may feel like your head wants to explode: your
intuition is screaming that what you’ve done can’t possibly be right, but you
can’t see any flaws in your logic. It may be tempting to give up and think about
something else. But instead, you may force yourself to stay on task, to turn
the thing over and over in your head until you either find a flaw in the logic, or
find a way of thinking about it that your intuition will accept. Depending on
the difficulty of the thing in question, resolving the conflict may take moments,
hours, days, weeks, months, or years. But the longer you spend puzzling over
it, the greater will be your feeling of enlightenment when it finally “clicks.”
On the other hand, some of you may be thinking that it was obvious that
the derivative is a function. You may even feel a bit smug about the fact that
this “revelation” was clear to you from the beginning. Perhaps you should. But
I think it is more likely that you were not following my lectures closely, but
were instead thinking about the derivative in terms you have learned in the
past. Or perhaps you never really understood the intuition of x0 as a “fixed
value we don’t know,” versus x as a “variable.” Either way, I suggest you review
the previous buildup to the definition of the derivative. Try to understand with
your whole mind—both logic and intuition. If you succeed, you may get a part
of the revelatory moment that you will otherwise have been cheated of.
Now, enough philosophizing. Since we’ve established that the derivative $f'$
is a function, there are two obvious sorts of questions:
1. How do we find a formula for the function, if one exists?
2. How do we characterize the function, even if it does not have a formula
we can write down?
We’ll spend a lot of time on both of these, but in light of the homework I’ve
assigned you for Friday, I’m going to spend the rest of this lecture on a version of
the second problem. Specifically: If someone gives you a graph of the function,
how do you graph its derivative? We’ll approach this mainly through examples.
My plan (which I may or may not have time for) is to give you a few minutes to
try the following examples on your own, and then we will go over them together.
Example. The graph of a function f is given on the left. On the right, sketch
the graph of the function $f'$. Remember: above each point x on the x-axis, the
value of $f'$ should be the slope of the tangent line to f at x. If f does not have
a unique tangent line at x, then $f'(x)$ will not exist.
[Figure: three pairs of axes, each running from −4 to 4 in both directions. In each pair, the left panel is labeled f(x) and the right panel is labeled f'(x).]
3 Local nature of the limit (and derivative)
I’m probably not going to have time to really go over this section in the lecture,
but I would feel like I would not be fulfilling my responsibilities as a Math 131
teacher if I did not at least mention it in the lecture notes.
Recall that, in the most vague terms, the statement
\[ \lim_{x \to x_0} f(x) = \ell \]
means something like “when x is near x0, then f(x) is near ℓ.” Thus, it seems
like this limit should only depend on “what f is doing near x0 .” In particular,
it should only depend on how f behaves on an interval (x0 − ∆x, x0 + ∆x).
[Figure: graphs of two functions f and g near x0, with the value ℓ marked on the y-axis. f = g on the interval (x0 − ∆x, x0 + ∆x), so $\lim_{x \to x_0} f(x) = \lim_{x \to x_0} g(x)$.]
The way we say that the limit “only depends on what f is doing near x0 ” is
that if we replace f by a different function g that “looks the same near x0 ,”
then we are guaranteed to get the same answer. More precisely, we have the
following theorem:
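Theorem. Suppose that f and g are two functions. Let ∆x be positive. If f
and g are defined and agree on the interval (x0 − ∆x, x0 + ∆x), then
\[ \lim_{x \to x_0} f(x) = \lim_{x \to x_0} g(x). \]
Proof. Suppose that $\lim_{x \to x_0} f(x) = \ell$; we claim that $\lim_{x \to x_0} g(x) = \ell$ as well. Let ε > 0.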
Since $\lim_{x \to x_0} f(x) = \ell$, there exists δ1 > 0 such that if 0 < |x − x0| < δ1,
then |f(x) − ℓ| < ε. Set δ = min{δ1, ∆x}.
Assume 0 < |x − x0| < δ. Since |x − x0| < δ ≤ ∆x, we know f(x) = g(x).
Consequently,
\[ |g(x) - \ell| = |f(x) - \ell| < \varepsilon, \]
as claimed.
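Similar reasoning shows that, if
\[ \lim_{x \to x_0} g(x) = \ell, \]
then $\lim_{x \to x_0} f(x) = \ell$ as well. Since the derivative is defined as a limit,
the theorem tells us that the derivative of f at x0 depends only on how f behaves
near x0.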
Assignment 14 (due Friday, 4 November)
Section 2.2, Problems 37–44. The even-numbered problems will be graded care-
fully.
\[ \frac{f(x + h) - f(x)}{h} \]
gives the slope of a secant line to the curve y = f (x). Hint: your intuition may
like this problem better if you think in terms of x0 rather than x. This problem
will be graded carefully.
Math 131, Lecture 17
Charles Staats
Friday, 4 November 2011
\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \]
Example. Suppose that f(x) = x. Compute a formula for the function $f'$.
Solution.
\begin{align*}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{(x + h) - x}{h} \\
&= \lim_{h \to 0} \frac{h}{h} \\
&= \lim_{h \to 0} 1 \\
&= 1.
\end{align*}
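Solution.
\begin{align*}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{m(x + h) + b - (mx + b)}{h} \\
&= \lim_{h \to 0} \frac{mx + mh + b - mx - b}{h} \\
&= \lim_{h \to 0} \frac{mh}{h} \\
&= \lim_{h \to 0} m \\
&= m.
\end{align*}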
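Solution.
\begin{align*}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{(x + h)^2 + (x + h) - 3 - x^2 - x + 3}{h} \\
&= \lim_{h \to 0} \frac{x^2 + 2xh + h^2 + x + h - 3 - x^2 - x + 3}{h} \\
&= \lim_{h \to 0} \frac{2xh + h^2 + h}{h} \\
&= \lim_{h \to 0} \frac{(2x + h + 1)h}{h} \\
&= \lim_{h \to 0} (2x + h + 1) \\
&= 2x + 1.
\end{align*}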
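Solution.
\begin{align*}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{\frac{1}{x + h} - \frac{1}{x}}{h} \cdot \frac{x(x + h)}{x(x + h)} \\
&= \lim_{h \to 0} \frac{x - (x + h)}{hx(x + h)} \\
&= \lim_{h \to 0} \frac{-h}{hx(x + h)} \\
&= \lim_{h \to 0} \frac{-1}{x(x + h)} \\
&= \frac{-1}{x^2}.
\end{align*}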
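The four limit computations in this lecture can be spot-checked numerically. The sketch below is only a sanity check, assuming hypothetical values m = 3 and b = −2 for the linear example; it compares the difference quotient with a small h against each formula obtained above.

```python
def slope(f, x, h=1e-6):
    """Difference quotient (f(x + h) - f(x)) / h for a small fixed h."""
    return (f(x + h) - f(x)) / h


m, b = 3.0, -2.0  # hypothetical slope and intercept for the mx + b example

# Pairs (function, derivative formula found in the lecture).
examples = [
    (lambda x: x, lambda x: 1.0),
    (lambda x: m * x + b, lambda x: m),
    (lambda x: x ** 2 + x - 3, lambda x: 2 * x + 1),
    (lambda x: 1 / x, lambda x: -1 / x ** 2),
]


def max_error(points=(0.5, 1.0, 2.0, 3.5)):
    """Largest gap between the difference quotient and the formula."""
    return max(abs(slope(f, x) - fprime(x))
               for f, fprime in examples
               for x in points)


print(max_error())  # should be tiny compared with the slopes themselves
```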
2 Product rule
Suppose that we have u and v, two functions of x. Suppose we know how to
calculate the derivatives du/dx and dv/dx. We can use this to calculate the
derivative of the product u · v, by means of the product rule.
Warning. It may be tempting to write that
\[ \frac{d}{dx}(u \cdot v) = \frac{du}{dx} \cdot \frac{dv}{dx}. \]
This is not true. For instance, suppose u(x) = 2 and v(x) = x. Then
\[ \frac{d}{dx}(u \cdot v) = \frac{d}{dx}(2 \cdot x) = 2, \]
since y = 2x is a line of slope 2. However, the “naive product rule” would give
us
\[ \frac{d}{dx}(2 \cdot x) = \frac{d}{dx}(2) \cdot \frac{d}{dx}(x) = 0 \cdot 1 = 0. \]
The naive product rule gives the wrong answer.
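The counterexample above is easy to reproduce numerically. This is a sketch only; the helper `slope` is my own name for a difference quotient with a small fixed h.

```python
def slope(f, x, h=1e-6):
    """Difference quotient (f(x + h) - f(x)) / h for a small fixed h."""
    return (f(x + h) - f(x)) / h


def u(x):
    return 2.0


def v(x):
    return x


def product(x):
    return u(x) * v(x)


x = 1.5
actual = slope(product, x)          # slope of y = 2x, about 2
naive = slope(u, x) * slope(v, x)   # "naive product rule": 0 times about 1

print(actual, naive)
```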
Leibniz gave a cute derivation of the product rule using infinitesimals. The
first equation in this proof may seem a bit confusing at first; I’ll explain it
afterwards, but if I give it now, the proof will not seem so “cute.” Remember,
the key “fact” about infinitesimals is that if you multiply two of them together,
you get something “doubly infinitesimal,” which we typically consider equal to
zero. In particular, du dv = 0.
[Figure 1: a rectangle with sides u + du and v + dv, divided into pieces of area uv, u dv, v du, and du dv.]
Figure 1: A visual illustration of the product rule. The area of the white
rectangle is uv; the area of the total rectangle is (u + du)(v + dv); and the
change in area, d(uv), is their difference. The black rectangle, with area du dv,
is so small that its contribution “can be neglected.”
Thus,
\begin{align*}
df &= f(x + dx) - f(x) \\
&= u(x + dx)v(x + dx) - u(x)v(x).
\end{align*}
Recall that
(By an abuse of notation, we’re writing things like u for u(x) when it suits us
to do so.)
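Part of the computation appears to be missing from these notes at this point. The following is a sketch of how it presumably continues, consistent with Figure 1 and with the rule du dv = 0; it assumes the substitutions u(x + dx) = u + du and v(x + dx) = v + dv, and it is a reconstruction rather than a transcription.

```latex
\begin{align*}
df &= (u + du)(v + dv) - uv \\
   &= uv + u\,dv + v\,du + du\,dv - uv \\
   &= u\,dv + v\,du \qquad \text{(since } du\,dv = 0\text{)}.
\end{align*}
% Dividing by dx gives the product rule:
\[ \frac{d}{dx}(uv) = u\,\frac{dv}{dx} + v\,\frac{du}{dx}. \]
```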
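Example. Use the product rule to find (in this order) the derivatives of $x^2$, $x^3$,
and $x^4$ with respect to x.
Solution.
\begin{align*}
\frac{d}{dx} x^2 &= \frac{d}{dx}(x \cdot x) \\
&= x \frac{dx}{dx} + x \frac{dx}{dx} \\
&= x + x \\
&= 2x.
\end{align*}
\begin{align*}
\frac{d}{dx} x^3 &= \frac{d}{dx}(x \cdot x^2) \\
&= x \frac{d}{dx}(x^2) + x^2 \frac{d}{dx}(x)
\end{align*}
We just calculated $\frac{d}{dx}(x^2) = 2x$, so this is equal to
\begin{align*}
&= x \cdot 2x + x^2 \cdot 1 \\
&= 2x^2 + x^2 \\
&= 3x^2.
\end{align*}
\begin{align*}
\frac{d}{dx} x^4 &= \frac{d}{dx}(x \cdot x^3) \\
&= x \cdot \frac{d}{dx}(x^3) + x^3 \frac{d}{dx}(x) \\
&= x \cdot 3x^2 + x^3 \cdot 1 \\
&= 3x^3 + x^3 \\
&= 4x^3.
\end{align*}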
You may start to notice a pattern here. This pattern will continue: if we
calculate on out to $\frac{d}{dx} x^{n-1}$, we’ll find that it is equal to $(n - 1)x^{n-2}$. Using this
fact, we find that
\begin{align*}
\frac{d}{dx} x^n &= \frac{d}{dx}(x \cdot x^{n-1}) \\
&= x \cdot \frac{d}{dx}(x^{n-1}) + x^{n-1} \frac{d}{dx}(x) \\
&= x \cdot (n - 1)x^{n-2} + x^{n-1} \cdot 1 \\
&= (n - 1)x^{n-1} + x^{n-1} \\
&= nx^{n-1}.
\end{align*}
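The pattern can be reproduced mechanically. The sketch below represents a polynomial by its list of coefficients (lowest degree first, a representation I am assuming for illustration) and computes the derivative of x^n using only the fact that the derivative of x is 1, together with the product rule applied to x · x^(n−1).

```python
def times_x(p):
    """Multiply a polynomial (coefficients, lowest degree first) by x."""
    return [0] + list(p)


def add(p, q):
    """Add two polynomials given as coefficient lists."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def derivative_of_power(n):
    """Coefficients of d/dx x^n for n >= 1, via repeated product rule."""
    if n == 1:
        return [1]  # d/dx x = 1
    x_to_n_minus_1 = [0] * (n - 1) + [1]
    # d/dx (x * x^(n-1)) = x * d/dx(x^(n-1)) + x^(n-1) * d/dx(x)
    return add(times_x(derivative_of_power(n - 1)), x_to_n_minus_1)


for n in range(1, 6):
    print(n, derivative_of_power(n))
```

For n = 4 the list printed is [0, 0, 0, 4], that is, 4x^3, in agreement with the computation above.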
Assignment 15 (due Monday, 7 November)
Section 2.2, Problems 5–8 and 51–54. Follow the instructions. Remember, these
problems are more about how you find the derivative than what derivative you
find. The even-numbered problems will be graded carefully.
Section 2.3, Problems 11–14 and 23–26. (Hint: 23–26 are easier if you use the
product rule.) The even-numbered problems will be graded carefully.
Math 131, Lecture 18: Rules for differentiation
Charles Staats
Monday, 7 November 2011
\begin{align*}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{c - c}{h} \\
&= \lim_{h \to 0} \frac{0}{h} \\
&= 0.
\end{align*}
\[ (f + g)' = f' + g'. \]
Proof. We apply one of the limit definitions of the derivative:
\begin{align*}
(f + g)'(x) &= \lim_{h \to 0} \frac{(f + g)(x + h) - (f + g)(x)}{h} \\
&= \lim_{h \to 0} \frac{f(x + h) + g(x + h) - f(x) - g(x)}{h} \\
&= \lim_{h \to 0} \left( \frac{f(x + h) - f(x)}{h} + \frac{g(x + h) - g(x)}{h} \right) \\
&= f'(x) + g'(x) \\
&= (f' + g')(x).
\end{align*}
Since this holds for all x at which f and g are defined, we have the equality of
functions
\[ (f + g)' = f' + g'. \]
Theorem. (Multiplication by a constant) If f is a differentiable function of x
and c is a (constant) real number, then
\[ \frac{d}{dx}(cf(x)) = c\frac{df}{dx}. \]
Proof. This is, again, a special case of the mx + b thing. This time, we’re going
to derive it from the product rule.
\begin{align*}
\frac{d}{dx}(cf(x)) &= c\frac{df}{dx} + f(x)\frac{d}{dx}(c) \\
&= c\frac{df}{dx} + f(x) \cdot 0 && \text{(constant rule)} \\
&= c\frac{df}{dx}.
\end{align*}
Theorem. (Difference rule) If f and g are differentiable functions, then
$(f - g)' = f' - g'$.
Proof.
\begin{align*}
(f - g)' &= \bigl(f + (-1) \cdot g\bigr)' \\
&= f' + \bigl((-1)g\bigr)' && \text{(sum rule)} \\
&= f' + (-1)g' && \text{(multiplication by a constant)} \\
&= f' - g'.
\end{align*}
Theorem. When n is a positive integer,
\[ \frac{d}{dx} x^n = nx^{n-1}. \]
(Actually, this theorem applies whenever n is a real number, but we won’t
be able to prove that for some time.)
Proof. Let P(n) be the statement that $D_x(x^n) = nx^{n-1}$; this is a condition
on n.
1. We first show that P(1) is true, i.e., that $D_x(x) = 1$:
\begin{align*}
\frac{d}{dx}(x) &= \lim_{h \to 0} \frac{(x + h) - x}{h} \\
&= \lim_{h \to 0} \frac{h}{h} \\
&= 1,
\end{align*}
as desired.
2. We now show, using the product rule, that whenever P (n) is true, then
P (n + 1) is also true.
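\begin{align*}
\frac{d}{dx} x^{n+1} &= \frac{d}{dx}(x \cdot x^n) \\
&= x \frac{d}{dx}(x^n) + x^n \frac{d}{dx}(x) && \text{(product rule)} \\
&= x \cdot nx^{n-1} + x^n \cdot 1 && \text{(since } P(n) \text{ is true)} \\
&= nx^n + x^n \\
&= (n + 1)x^n.
\end{align*}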
Thus, by induction, P (n) is true for every positive integer n. In other words,
for every positive integer n,
\[ \frac{d}{dx} x^n = nx^{n-1}. \]
Using the power rule, together with the “easy rules,” we can, in principle,
compute the derivative of any polynomial.
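Example 1. Differentiate $x^2 - 4x + 1$.
Solution.
\begin{align*}
\frac{d}{dx}(x^2 - 4x + 1) &= \frac{d}{dx}(x^2) - \frac{d}{dx}(4x) + \frac{d}{dx}(1) && \text{(sum rule)} \\
&= \frac{d}{dx}(x^2) - 4\frac{d}{dx}(x) + 0 && \text{(constant multiple; constant)} \\
&= 2x - 4 \cdot 1 + 0 && \text{(power rule)} \\
&= 2x - 4.
\end{align*}
Solution.
\begin{align*}
\frac{d}{dx}\left( 2x^3 - \tfrac{1}{2}x^2 - x + \tfrac{17246}{937} \right) &= 2 \cdot 3x^2 - \tfrac{1}{2} \cdot 2x - 1 + 0 \\
&= 6x^2 - x - 1.
\end{align*}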
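Since the power rule and the “easy rules” act term by term, differentiating a polynomial is a purely mechanical operation on its coefficients. The sketch below assumes a coefficient-list representation (lowest degree first), which is my own choice for illustration, and reruns the two polynomial examples from this section.

```python
from fractions import Fraction


def differentiate(coeffs):
    """coeffs[k] is the coefficient of x^k; return the derivative's coefficients.

    Each term c * x^k becomes k * c * x^(k-1); the constant term drops out.
    """
    return [k * c for k, c in enumerate(coeffs)][1:]


example_1 = [1, -4, 1]  # x^2 - 4x + 1
example_2 = [Fraction(17246, 937), -1, Fraction(-1, 2), 2]  # 2x^3 - (1/2)x^2 - x + 17246/937

print(differentiate(example_1))  # [-4, 2], i.e. 2x - 4
print(differentiate(example_2))  # coefficients of 6x^2 - x - 1
```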
We are now going to show how to prove the product rule rigorously. Pay at-
tention to how what we are doing rigorously corresponds to the non-rigorous
infinitesimal method.
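Theorem. Suppose that u is a function of x such that $\left.\frac{du}{dx}\right|_{x=x_0}$ exists. Likewise,
suppose that v is a function of x such that $\left.\frac{dv}{dx}\right|_{x=x_0}$ exists. Then the
derivative of the product uv at x0 exists, and
\[ \left.\frac{d}{dx}(uv)\right|_{x=x_0} = u_0 \left.\frac{dv}{dx}\right|_{x=x_0} + v_0 \left.\frac{du}{dx}\right|_{x=x_0}. \]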
Note the trick on the third line that was used to show that ∆u∆v/∆x → 0:
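\[ \frac{\Delta u \Delta v}{\Delta x} = \frac{\Delta u \Delta v \Delta x}{(\Delta x)^2} = \frac{\Delta u}{\Delta x} \cdot \frac{\Delta v}{\Delta x} \cdot \Delta x \to \frac{du}{dx} \cdot \frac{dv}{dx} \cdot 0 = 0 \]
as ∆x → 0. This (sort of) gives a justification for the infinitesimal idea that
du dv = 0.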
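The claim that ∆u∆v/∆x → 0 can be watched numerically. In the sketch below, the particular functions u(x) = x^2 and v(x) = 3x + 1 and the point x0 = 1 are hypothetical choices made only for illustration.

```python
def u(x):
    return x ** 2


def v(x):
    return 3 * x + 1


def cross_term(x0, dx):
    """The quantity (delta u)(delta v) / (delta x) at x0."""
    du = u(x0 + dx) - u(x0)
    dv = v(x0 + dx) - v(x0)
    return du * dv / dx


# The cross term should shrink roughly in proportion to dx.
for dx in (1e-1, 1e-2, 1e-3, 1e-4):
    print(dx, cross_term(1.0, dx))
```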
We can give a non-rigorous, infinitesimal derivation as follows: One (non-rigorous)
definition of the derivative is that, if y = f(x), then $f'(x)$ is the
number such that
\[ dy = f'(x)\,dx. \]
Now, suppose that y = f(u) and u = g(x), so that y = f(u) = f(g(x)). Then
we have $dy = f'(u)\,du$ and $du = g'(x)\,dx$, so
\begin{align*}
dy &= f'(u)\,du \\
&= f'(u)g'(x)\,dx \\
&= f'(g(x))g'(x)\,dx.
\end{align*}
Hence,
\[ \frac{dy}{dx} = f'(g(x))g'(x). \]
Example 3. Differentiate $(x + 1)^{500}$.
Solution.
\begin{align*}
\frac{d}{dx}(x + 1)^{500} &= 500(x + 1)^{499} \cdot \frac{d}{dx}(x + 1) \\
&= 500(x + 1)^{499} \cdot 1 \\
&= 500(x + 1)^{499}.
\end{align*}
It would have been possible, but very hard, to differentiate this by expanding
out all 501 terms of the polynomial and then applying the techniques of the first
section.
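Expanding all 501 terms is hard by hand but easy for a computer, which gives an exact check of the Chain Rule answer. The sketch below assumes the binomial theorem for the expansion and a coefficient-list representation (lowest degree first) of polynomials; both are my own choices for illustration.

```python
from math import comb


def expand_x_plus_1(n):
    """Coefficients of (x + 1)^n, lowest degree first (binomial theorem)."""
    return [comb(n, k) for k in range(n + 1)]


def differentiate(coeffs):
    """Termwise power rule on a coefficient list."""
    return [k * c for k, c in enumerate(coeffs)][1:]


n = 500
hard_way = differentiate(expand_x_plus_1(n))          # expand, then differentiate
chain_rule = [n * c for c in expand_x_plus_1(n - 1)]  # 500 (x + 1)^499, expanded

print(len(expand_x_plus_1(n)))  # 501
print(hard_way == chain_rule)   # True
```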
Assignment 16 (due Wednesday, 9 November)
Section 2.2, Problems 11–14 and 55–58. Follow the instructions. Remember,
these problems (except for 57 and 58) are more about how you find the derivative
than what derivative you find. The even-numbered problems will be graded
carefully (although a certain amount of leeway will be provided on problem 58).
Section 2.3, Problems 11–14 and 23–26. (Hint: 23–26 are easier if you use the
product rule.) The even-numbered problems will be graded carefully.
∀ε > 0, ∃δ > 0 s.t. if |∆x| < δ, then ∆y is within ε|∆x| of $f'(x_0)\Delta x$.
Section 2.5, Problems 1–4. Make sure it is clear, from your answer, how you are
using the Chain Rule (see, for instance, Example 3 at the end of Lecture 18).
Problems 2 and 4 will be graded carefully.
2. Let f be the function defined by
\[ f(x) = \begin{cases} -\tfrac{1}{7}x - \tfrac{18}{7} & \text{if } x < 3, \\ \tfrac{1}{6}x - \tfrac{7}{2} & \text{if } x \ge 3. \end{cases} \]
Suppose y = f (x) and f (x0 ) = y0 . A purely ε-δ version of the statement that
“f is continuous at x0 ” is given as follows:
u = f (x),
u0 = f (x0 ),
y = g(u) = g(f (x)), and
y0 = g(u0 ) = g(f (x0 )).
Part of Assignment 18
Assignment 18 will include giving ε-δ proofs of the following:
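1. Let f be the function defined by
\[ f(x) = \begin{cases} -8x + 10 & \text{if } x \le 1, \\ 3x - 1 & \text{if } x > 1. \end{cases} \]
2. Let f be the function defined by
\[ f(x) = \begin{cases} -4x + 5 & \text{if } x < 1, \\ -\tfrac{1}{2}x + \tfrac{3}{2} & \text{if } x > 1. \end{cases} \]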
Math 131, Lecture 19: The Chain Rule
Charles Staats
Wednesday, 9 November 2011
Important Note: There have been a few changes to Assignment 17. Don’t
use the version from Lecture 18.
\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \]
If I don’t ask you for εs and δs, don’t give them to me.
\[ f'(x) = \lim_{h \to 0} = \frac{f(x + h) - f(x)}{h} \]
This simply makes no sense. The notation $\lim_{h \to 0}$ is supposed to represent
the limit of an expression. If you instead follow it by an equals sign, it’s
like saying “The limit of is. . . .”
If you write this on a test, I will deduct points.
5. When I ask you to give an ε-δ proof for a limit of a piecewise-linear
function, my secret goal is to get you to understand how you might go
about proving the following statement:
The two-sided limit exists and equals ℓ if and only if both the
one-sided limits exist and equal ℓ.
Some of you gave separate ε-δ proofs of both the one-sided limits, and
then used the fact above to deduce the value of the two-sided limit. This
is perfectly correct, and would have received full credit if I were grading
the quiz for credit. However, since I was grading primarily to let you know
whether you are prepared for this sort of problem on the test, I deducted a
couple points. I very probably will give a problem like this on the test; if I
do so, I will explicitly state that you are not allowed to use the statement
above, so that I can justifiably deduct points if you do.
Having reread the sentence above, I realized it sounds like I am looking for
excuses to deduct points. This is NOT the case. When I give a particular
problem, there are certain things I am trying to see if you understand.
If you can do the problem correctly without understanding these things,
then my whole purpose in giving the problem is compromised.
The easiest definition to remember is probably
\[ \frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}. \]
The easiest to compute with is probably based on the so-called “difference quo-
tient”:
\[ \frac{dy}{dx} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \]
If I ask you to compute a derivative from the definition, I’m mostly not interested
in whether you can find the answer. I may even phrase the question something
like the following¹:
Use the definition of the derivative to prove that if f is the function
defined by f(x) = 1/x, then $f'(x) = -1/x^2$.
As you may note, I am in fact giving you the “answer” here: $f'(x) = -1/x^2$.
What I care about, when I ask a question like this, is whether you understand
the definition well enough to use it.
Although we’ve done this example in Lecture 17, I’m going to repeat it here,
since I will need it later in the lecture.
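Example 1. Use the definition of the derivative to prove that if f is the function
defined by $f(x) = \frac{1}{x}$, then $f'(x) = \frac{-1}{x^2}$.
Solution.
\begin{align*}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{\frac{1}{x + h} - \frac{1}{x}}{h} \cdot \frac{x(x + h)}{x(x + h)} \\
&= \lim_{h \to 0} \frac{x - (x + h)}{hx(x + h)} \\
&= \lim_{h \to 0} \frac{-h}{hx(x + h)} \\
&= \lim_{h \to 0} \frac{-1}{x(x + h)} \\
&= \frac{-1}{x^2}.
\end{align*}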
If you want more examples of this sort of computation, you should review
Lecture 17. (There’s a version on Chalk that includes solutions to all of the
examples.)
¹ NOTE: If I ask you this question on the test, the function f will be different.
3 Differentiating Quotients
We can use the Chain Rule together with the Product Rule and Example 1
(page 3) to differentiate quotients.
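Example 2. Find $\frac{d}{dx}\left(\frac{1}{x - 1}\right)$.
Recall, from Example 1, that $D_x(1/x) = -1/x^2$.
Solution.
\begin{align*}
\frac{d}{dx}\left(\frac{1}{x - 1}\right) &= \frac{-1}{(x - 1)^2} \cdot \frac{d}{dx}(x - 1) \\
&= \frac{-1}{(x - 1)^2}.
\end{align*}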
Informally, this means that the statement $m = f'(x)$ is equivalent to the statement
that
If the change in x is small, then ∆y ≈ m∆x.
In other words, near x0, y is approximated by the tangent line. See Figure 1.
This suggests a (non-rigorous) definition of the derivative using infinitesimals:
if y = f(x), then $f'(x)$ is the number such that
\[ dy = f'(x)\,dx. \]
This “definition” is based on the general notion that “if something is approximately
true for small ∆x, then it should be exactly true for dx because dx is so
small.” Thus, since $\Delta y \approx f'(x)\Delta x$, we get $dy = f'(x)\,dx$. This principle can get
you in big trouble if applied indiscriminately, which is why using infinitesimals
is “walking on clouds.” But in many circumstances, it can give good intuition
and correct results.
[Figure 1: graph of y = f(x) near (x0, y0), together with the tangent line ∆y = f'(x0)∆x. For |∆x| < δ, that is, for x between x0 − δ and x0 + δ, the curve stays within ε∆x of the tangent line.]
Now, suppose that y = f(u) and u = g(x), so that y = f(u) = f(g(x)).
Then we have $dy = f'(u)\,du$ and $du = g'(x)\,dx$, so
\begin{align*}
dy &= f'(u)\,du \\
&= f'(u)g'(x)\,dx \\
&= f'(g(x))g'(x)\,dx.
\end{align*}
Note: If I ask you on a test for the “Leibniz derivation of the Chain
Rule” or the “Infinitesimal derivation of the Chain Rule,” I am asking you,
more or less, to give me the paragraph above.
The proof of the Chain Rule is to use εs and δs to say exactly what is meant
by “approximately equal” in the argument
\begin{align*}
\Delta y &\approx f'(u)\Delta u \\
&\approx f'(u)g'(x)\Delta x \\
&= f'(g(x))g'(x)\Delta x.
\end{align*}
Unfortunately, there are two complications that have to be dealt with. The first
is that, for technical reasons, we need an ε-δ definition for the derivative that
allows |∆x| = 0. The following statement turns out to work:
\[ \forall \varepsilon > 0,\ \exists \delta > 0 \text{ s.t. if } |\Delta x| < \delta, \text{ then } |\Delta y - f'(x_0)\Delta x| \le \varepsilon|\Delta x|. \]
Comparing this to the earlier version, we got rid of the requirement 0 < |∆x|
by changing the final < ε|∆x| to ≤ ε|∆x|. I don’t want to explain why exactly
we can do this, but anyone who has taken (and understood) an analysis course
ought to be able to do it without much trouble.
The second complication is that the expression for δ in terms of ε turns
out to be a bit ugly. For this reason, I will spare you the details. However, I
hope I have convinced you that the basic idea of the proof of the Chain Rule is
comprehensible, even if the technical details are a bit involved.
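To get a feel for the ε-δ statement above, here is a numerical sketch for one concrete case. It assumes f(x) = x^2 and x0 = 1 (so the slope is 2), where ∆y − 2∆x equals (∆x)^2 and the choice δ = ε works; both the example and the helper names are mine, not taken from the notes.

```python
def f(x):
    return x ** 2


x0, slope_at_x0 = 1.0, 2.0  # f'(1) = 2 for f(x) = x^2


def within_tolerance(dx, eps):
    """Check |delta y - f'(x0) delta x| <= eps * |delta x| for one delta x."""
    dy = f(x0 + dx) - f(x0)
    return abs(dy - slope_at_x0 * dx) <= eps * abs(dx)


def check(eps, samples=1000):
    """Sample delta x strictly inside (-delta, delta), including delta x = 0."""
    delta = eps  # here |dy - 2 dx| = dx^2 <= eps |dx| whenever |dx| <= eps
    return all(within_tolerance(delta * k / samples, eps)
               for k in range(-samples + 1, samples))


print(check(0.1), check(0.001))
print(within_tolerance(0.5, 0.1))  # delta x too large for this eps
```

Note that ∆x = 0 passes the test, which is exactly what the change from < to ≤ was meant to allow.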
Assignment 17 (due Friday, 11 November)
From Section 2.3:
• Problems 5–8. Do each problem two ways—using the limit definition of
your choice, and using the rules of differentiation (including the Chain
Rule, if you find it helpful).
• Problems 17–20.
• Problems 31–32. Do not FOIL out the products; instead, use the product
rule for differentiation.
The even-numbered problems will be graded carefully.
Section 2.5, Problems 1–4. Make sure it is clear, from your answer, how you are
using the Chain Rule (see, for instance, Example 3 at the end of Lecture 18).
Problems 2 and 4 will be graded carefully.
Give an ε-δ proof for each of the following. Do not use the fact that if both the
one-sided limits exist and are equal, then the two-sided limit exists and is equal
to both of them.
1. Let f be the function defined by
\[ f(x) = \begin{cases} 7x - 3 & \text{if } x \le 0, \\ -\tfrac{1}{9}x - 3 & \text{if } x > 0. \end{cases} \]
Problems 1 and 3 will be graded carefully.
Suppose that
u = f (x),
u0 = f (x0 ),
Section 2.3, Problems 27–30. Use the Product Rule. Problems 28 and 30 will
be graded carefully.
Section 2.5, Problems 5–8, 13–14, and 17–18. You do not need to show every
single step, but it should be clear to the grader how you got to the answer. The
even-numbered problems will be graded carefully.
Give ε-δ proofs of the following facts. Do not use the fact that if both the
one-sided limits exist and are equal, then the two-sided limit exists and is equal
to both of them.
1. Let f be the function defined by
\[ f(x) = \begin{cases} -8x + 10 & \text{if } x \le 1, \\ 3x - 1 & \text{if } x > 1. \end{cases} \]
Math 131, Lecture 20: The Chain Rule, continued
Charles Staats
Friday, 11 November 2011
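[Figure: graph of y = f(x) near (x0, y0), together with the tangent line ∆y = f'(x0)∆x. For |∆x| < δ, that is, for x between x0 − δ and x0 + δ, the curve stays within ε∆x of the tangent line.]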
2 The Infinitesimal derivation of the Chain Rule
As you may recall from last lecture, the infinitesimal derivation of the Chain
Rule goes something like this:
Let y = f (u) and u = g(x). Then we have
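\begin{align*}
dy &= f'(u)\,\underbrace{du} \\
&= f'(\underbrace{u})\,\overbrace{g'(x)\,dx}, && \text{since } du = g'(x)\,dx \\
&= f'\bigl(\overbrace{g(x)}\bigr)\,g'(x)\,dx && \text{since } u = g(x).
\end{align*}
Hence,
\[ \frac{dy}{dx} = f'(g(x))g'(x), \quad \text{i.e.,} \quad (f \circ g)'(x) = f'(g(x))g'(x). \]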
\begin{align*}
f'(g(x_0)) = f'(u_0) &= \left.\frac{dy}{du}\right|_{u=u_0} = \left.\frac{dy}{du}\right|_{u=g(x_0)} \\
g'(x_0) &= \left.\frac{du}{dx}\right|_{x=x_0}
\end{align*}
and the Chain Rule becomes
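\[ \left.\frac{dy}{dx}\right|_{x=x_0} = \left( \left.\frac{dy}{du}\right|_{u=g(x_0)} \right) \left( \left.\frac{du}{dx}\right|_{x=x_0} \right), \]
or more simply,
\[ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}. \]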
When written this way, the Chain rule seems completely obvious—just cancel
the du’s. This is not a great way to think about why the Chain Rule is actually
true, because unlike most infinitesimal arguments, it cannot be turned into a
rigorous proof. If I ask you for the infinitesimal or Leibniz derivation
of the Chain Rule on the test, the explanation here will not receive
full credit. However, it does make a good mnemonic device.
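\begin{align*}
u &= 2x + 1 \\
y &= u^3 = (2x + 1)^3.
\end{align*}
Then
\begin{align*}
\frac{du}{dx} &= 2 \\
\frac{dy}{du} &= 3u^2
\end{align*}
and so
\begin{align*}
\frac{dy}{dx} &= \frac{dy}{du}\frac{du}{dx} \\
&= 3u^2 \cdot 2 \\
&= 6(2x + 1)^2.
\end{align*}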
Solution (short version).
\begin{align*}
\frac{d}{dx}(2x + 1)^3 &= 3(2x + 1)^2 \cdot \frac{d}{dx}(2x + 1) \\
&= 3(2x + 1)^2 \cdot 2 \\
&= 6(2x + 1)^2.
\end{align*}
5 Differentiating Quotients
Recall that last lecture, we computed that
\[ \frac{d}{dx}\left(\frac{1}{x}\right) = \frac{-1}{x^2}. \]
We can use this, together with the Chain Rule, to compute a lot of derivatives.
To start with, we will take a look at $x^n$ when n is a negative integer.
Example 2. Let $f(x) = x^{-m}$, where m is a positive integer. We may use the
Chain Rule to compute $f'(x)$, as follows:
\begin{align*}
\frac{df}{dx} &= \frac{d}{dx}\left(\frac{1}{x^m}\right) \\
&= \frac{-1}{(x^m)^2} \cdot \frac{d}{dx}(x^m) \\
&= \frac{-1}{x^{2m}} \cdot mx^{m-1} \\
&= -m \cdot x^{-2m+(m-1)} \\
&= -m \cdot x^{-m-1}.
\end{align*}
show using implicit differentiation that the Power Rule holds whenever n is a
rational number. It is in fact true even when n is irrational, although proving
that requires logarithms.)
We can also use the Chain Rule, together with the Product Rule, to differ-
entiate quotients.
You can either memorize the Quotient Rule, or remember how to differentiate
quotients by combining the Product Rule with the Chain Rule. As long as you
can differentiate quotients, I don’t much care which method you use. If you do
want to memorize this, the standard mnemonic is
“Dee quotient equals bottom Dee top minus top Dee bottom, all
over bottom squared.”
However, if you use this mnemonic, remember not to equate infinitesimals with
finite quantities. Either all the Dees should be Dx (derivative with respect to
x, a finite quantity) or they should all be d (gives infinitesimals on both sides).
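Proof.
\begin{align*}
D_x\left(\frac{f(x)}{g(x)}\right) &= D_x\left(f(x) \cdot \frac{1}{g(x)}\right) \\
&= f(x)D_x\left(\frac{1}{g(x)}\right) + \frac{1}{g(x)}D_x(f(x)) && \text{(product rule)} \\
&= f(x) \cdot \frac{-1}{(g(x))^2} \cdot D_x(g(x)) + \frac{f'(x)}{g(x)} && \text{(chain rule)} \\
&= \frac{-f(x)g'(x)}{g(x)^2} + \frac{f'(x)}{g(x)} \\
&= \frac{-f(x)g'(x) + f'(x)g(x)}{g(x)^2} \\
&= \frac{g(x)f'(x) - f(x)g'(x)}{g(x)^2}.
\end{align*}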
Example 3. Let
\[ f(x) = \frac{x + 1}{x - 1}. \]
Compute $f'(x)$.
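The worked solution to Example 3 does not appear in these notes. As a hedged sketch only, the code below applies the Quotient Rule formula from the proof above to top = x + 1 and bottom = x − 1, and compares the result with a difference quotient. The helper names are my own, and the simplification −2/(x − 1)^2 mentioned in the comment is my own algebra rather than something quoted from the lecture.

```python
def f(x):
    return (x + 1) / (x - 1)


def quotient_rule(top, dtop, bottom, dbottom, x):
    """(bottom * D(top) - top * D(bottom)) / bottom^2, evaluated at x."""
    return (bottom(x) * dtop(x) - top(x) * dbottom(x)) / bottom(x) ** 2


def f_prime(x):
    # top = x + 1 and bottom = x - 1 both have derivative 1,
    # so this simplifies to -2 / (x - 1)^2.
    return quotient_rule(lambda t: t + 1, lambda t: 1.0,
                         lambda t: t - 1, lambda t: 1.0, x)


def slope(g, x, h=1e-6):
    """Difference quotient (g(x + h) - g(x)) / h for a small fixed h."""
    return (g(x + h) - g(x)) / h


for x in (2.0, 3.0, -1.0):
    print(x, f_prime(x), slope(f, x))
```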
Assignment 18 (due Monday, 14 November)
Section 2.2, Problems 9, 10, 15, and 16. These problems are about the process
of computing the derivative from the limit; finding the “answer” by another
method will not receive full credit. You do not need to hand in any of these,
but a similar problem will appear on the test.
Section 2.3, Problems 27–30. Use the Product Rule. Problems 28 and 30 will
be graded carefully.
Section 2.5, Problems 5–8, 13–14, and 17–18. You do not need to show every
single step, but it should be clear to the grader how you got to the answer. The
even-numbered problems will be graded carefully.
Give ε-δ proofs of the following facts. Do not use the fact that if both the
one-sided limits exist and are equal, then the two-sided limit exists and is equal
to both of them.
3. Let f be the function defined by
\[ f(x) = \begin{cases} -8x - 2 & \text{if } x < 0, \\ -2 & \text{if } x = 0, \\ \tfrac{1}{7}x - 2 & \text{if } x > 0. \end{cases} \]
g(t) = 2f (t),
h(t) = f (2t).
You may want to check your answers below by considering the specific cases of
f(t) = t and $f(t) = t^2$.
1. Explain how to obtain the graphs of g and h from the graph of f by
shrinking/stretching.
2. Compute $g'$ and $h'$ in terms of $f'$. (Hint: they are NOT the same.)
Math 131, Lecture 21
Charles Staats
Monday, 14 November 2011
[Figure: graph of f on axes running from −4 to 4.]
More formally, we see that
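\begin{align*}
\lim_{h \to 0^-} \frac{|0 + h| - |0|}{h} &= \lim_{h \to 0^-} \frac{-h}{h} = -1 \\
\lim_{h \to 0^+} \frac{|0 + h| - |0|}{h} &= \lim_{h \to 0^+} \frac{h}{h} = 1.
\end{align*}
Since the one-sided limits are not equal, the two-sided limit
\[ \lim_{h \to 0} \frac{|0 + h| - |0|}{h} = f'(0) \]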
does not exist.
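The disagreement between the two one-sided limits shows up immediately if we tabulate the difference quotient of |x| at 0 for h approaching 0 from each side. A minimal sketch:

```python
def quotient(h):
    """Difference quotient of |x| at x0 = 0."""
    return (abs(0 + h) - abs(0)) / h


# h approaching 0 from the left and from the right.
left = [quotient(-10.0 ** -k) for k in range(1, 6)]
right = [quotient(10.0 ** -k) for k in range(1, 6)]

print(left)   # every entry is -1.0
print(right)  # every entry is 1.0
```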
Definition. We say that a function f is differentiable at x0 if f is defined at x0
and the derivative $f'(x_0)$ exists (and is finite). We say that f is differentiable if
it is differentiable at every point of its domain.
Example 2. Consider the function f defined by $f(x) = \sqrt[3]{x}$:
[Figure: graph of f(x) on axes running from −4 to 4.]
You should not find it hard to believe that the tangent line to f at the origin is
vertical, i.e., a line with slope infinity. Correspondingly, if one evaluates
\[ f'(0) = \lim_{h \to 0} \frac{(0 + h)^{1/3} - 0^{1/3}}{h}, \]
one will find that the limit is ∞. Since this derivative is not finite, we still say
that f is not differentiable at 0.
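A numerical sketch makes the blow-up visible. The helper `cbrt` below is my own way of computing a real cube root for negative inputs as well; it is an assumption of this example, not something from the notes.

```python
import math


def cbrt(x):
    """Real cube root, valid for negative x as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def quotient(h):
    """Difference quotient of the cube root at x0 = 0."""
    return (cbrt(0 + h) - cbrt(0)) / h


# The quotients grow without bound as h shrinks, from either side.
for h in (1e-1, 1e-3, 1e-6, -1e-6):
    print(h, quotient(h))
```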
One reason for this convention is that the Chain Rule does not work here: if
it did, it would tell us that the derivative of $\sqrt[3]{g(x)}$ at x = 0 is $\infty \cdot g'(0)$. Say that
$g(x) = x^3$; then we know that $\sqrt[3]{g(x)} = x$, so the derivative at 0 should be 1;
but the chain rule would tell us that this derivative is ∞·0, which does not make
sense. However, because the Chain Rule only applies when both functions are
differentiable, and $\sqrt[3]{\ }$ is not differentiable, we don’t run into a contradiction.
As I have said before, the “main point” of functions is, more or less, that
they give us a way to talk about things that we don’t have formulas for. Thus,
if we have a problem, we might be able to show that there is a function that
gives its solution, even if there is no formula for the solution. Once we’ve shown
that the solution is given by a function, we can ask how “nice” the function is:
Is it continuous? Is it differentiable? In this situation, we would probably find
the following theorem very interesting:
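Theorem. Let f be a function. If f is differentiable at $x_0$, then f is continuous
at $x_0$.
Proof. Assume f is differentiable at $x_0$. Then f is defined at $x_0$, by definition
of differentiability.
Let $y = f(x)$, $y_0 = f(x_0)$, and note that
\[
y = y_0 + \Delta y = y_0 + \frac{\Delta y}{\Delta x} \cdot \Delta x.
\]
Hence,
\begin{align*}
\lim_{x \to x_0} f(x) &= \lim_{\Delta x \to 0} y \\
&= \lim_{\Delta x \to 0} \left( y_0 + \frac{\Delta y}{\Delta x} \cdot \Delta x \right) \\
&= y_0 + \left( \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} \right) \cdot \left( \lim_{\Delta x \to 0} \Delta x \right) \\
&= y_0 + \left( \left. \frac{dy}{dx} \right|_{x = x_0} \right) \cdot 0 \\
&= y_0 = f(x_0).
\end{align*}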
Note: In the proof above, if $\left. dy/dx \right|_{x = x_0}$ did not exist (as a finite number),
then the limit
\[
\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}
\]
would not have made sense, and so the Main Limit Theorem would not have
been applicable.
For our purposes in this class, the most important use of this theorem may
be a way to tell when a function is not differentiable. For this, we use the
contrapositive: if f is not continuous at $x_0$, then f is not differentiable at $x_0$.
Warning. It is quite possible for a function to be continuous but not differentiable.
For instance, our earlier examples $f(x) = |x|$ and $g(x) = \sqrt[3]{x}$ are both
continuous, but neither is differentiable at 0.
2 Higher derivatives
Given a differentiable function f, its derivative $f'$ is also a function. This
function $f'$ may itself be differentiable, and have a derivative of its own, which
we call $f''$, the second derivative. The derivative of $f''$, if it exists, is denoted
$f'''$, and called the third derivative of f.
Example 3. Let f be the function defined by $f(x) = x^5$. Find the first, second,
and third derivatives of f.
Solution.
\begin{align*}
f'(x) &= 5x^4 \\
f''(x) &= 5 \cdot 4x^3 = 20x^3 \\
f'''(x) &= 20 \cdot 3x^2 = 60x^2.
\end{align*}
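The repeated differentiation in this example can be mimicked mechanically. The sketch below is only an illustration under an assumed representation: a polynomial is stored as a list of coefficients `[c0, c1, c2, ...]`, and the helper `differentiate` (a name invented here) applies the Power Rule term by term.

```python
def differentiate(coeffs):
    """Differentiate c0 + c1*x + c2*x^2 + ... given as the list [c0, c1, c2, ...].

    The term ck * x^k becomes k * ck * x^(k-1), so each coefficient is
    multiplied by its index and the list is shifted down by one position.
    """
    return [k * c for k, c in enumerate(coeffs)][1:]

f = [0, 0, 0, 0, 0, 1]   # x^5
f1 = differentiate(f)    # coefficients of the first derivative
f2 = differentiate(f1)   # coefficients of the second derivative
f3 = differentiate(f2)   # coefficients of the third derivative

if __name__ == "__main__":
    print(f1, f2, f3)
```

Reading the lists back as polynomials, `f1`, `f2` and `f3` hold the coefficients of the first, second and third derivatives computed in the solution.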
We can, of course, proceed to take higher derivatives than just the third derivative.
But since something like $f'''''''(x)$ would be rather hard to read, we denote,
e.g., the seventh derivative of f by $f^{(7)}$. There are several other notations:

read aloud                                  prime notation   D notation       Leibniz notation
the nth derivative of f with respect to x   $f^{(n)}(x)$     $D_x^n(f(x))$    $\frac{d^n f}{dx^n}$

The Leibniz notation is based on the idea that $\frac{d}{dx}\left(\frac{dy}{dx}\right)$ should be written
as $\frac{d^2 y}{dx^2}$. Unlike most other versions of the Leibniz notation, this is purely a
mnemonic device; trying to think about this as the “quotient” of “infinitesimal”
quantities $d^2 y$ and $dx^2$ ends up just giving a mess.
Test Wednesday, 16 November
Test II will be on Wednesday. It will cover the lectures up through Lecture 20
(Friday), and the homework up through Assignment 18 (due today). You should
also look at the quizzes (graded and ungraded) up to and including tomorrow’s
quiz. Although there will be one problem involving limits and ε-δ proofs, the
emphasis of the test will be on derivatives, rather than limits. (Note, however,
that you will be asked to find a derivative from the definition, which will require
you to evaluate a limit.)
If you can do all the quiz problems without consulting anyone or anything,
you should do well. When you’ve mastered the quiz problems, there still may be
some additional benefit from practicing homework problems. Bonus problems
will not be tested.
Last Tuesday, I sent out an e-mail outlining what sorts of problems I was
thinking about putting on the test. As promised, that list is not perfect—some
of the problems there did not make it, and one or two things that ended up on
the test were not on the list. However, it’s still a fairly good study guide.
The lecture notes, in addition to trying to explain the material, discuss some
pitfalls that could cost you points, even if you think you know what you are
doing.
On the last test, a lot of people who got basically the right answers still lost
points, because what they wrote did not show me that they truly understood
what was going on. If you want to know exactly how much you need to write
to get credit on certain kinds of problems, please come to office hours (Tuesday,
3–4pm) and ask me. If you cannot make my office hours, I’ll be happy to make
an appointment with you.
Math 131, Lecture 22
Charles Staats
Friday, 18 November 2011
Organizational matters
• Tutorial scheduling for next quarter
2 Acceleration
Recall that when we first introduced the derivative, the first motivation
I gave was that “the derivative is the rate of change of position with respect to
time.” If x represents position, then
\[
\frac{dx}{dt}
\]
represents velocity. One of the most important instances of a higher derivative
is acceleration, or the rate of change of velocity with respect to time:
\[
\frac{d^2 x}{dt^2}.
\]
If the velocity of an object is increasing, then its acceleration is positive; if the
velocity is decreasing, then the acceleration is negative.
Intuitively, we are inclined to think that something is “accelerating” if it is
“getting faster,” and “decelerating” if it is “slowing down.” This intuition can
be useful, but it is also dangerous. If an object has positive velocity (i.e., moving
to the right), but negative acceleration, then its velocity will decrease to zero,
and continue to decrease to be negative; i.e., the object will start moving to the
left. We could say that the object decelerates to a stop, and then accelerates
in the opposite direction; however, this is deceptive, because the (negative)
acceleration is exactly the same before, during, and after the instant at which
the object is “stopped.”
For another example, consider what happens when an object is tossed up-
wards. We might be inclined to say that under the force of gravity, it decelerates
until it reaches the apex of its path, and then starts falling downward. But re-
ally, the acceleration is the same (negative) from the moment the object leaves
the hand. Thus, it actually makes more sense to say that the object is falling
from the instant it leaves the hand—even while it is still moving upward (i.e.,
has positive velocity).
One time (in middle school, I think), I was in an auditorium with a bunch of
other students listening to an astronaut speak. At one point, he asked us why,
when an astronaut in a spaceship “drops” something, it floats rather than falling.
The auditorium shook as everyone in the audience shouted, “No gravity!” The
astronaut replied, “Everyone who just said ‘no gravity’ is 100% wrong.” If there
were no gravity, then the spaceship would not be orbiting the earth; instead, it
would be traveling away from the earth in a straight line, never to return. The
reason, he said, that an object dropped inside the spaceship appears to float
is that the object, the astronaut, and the entire spaceship are already falling.
The only reason the spaceship does not reach the ground is that by the time
it reaches “ground level,” it’s moved so far horizontally that the ground has
dropped out from beneath it.
Figure 1: The spaceship is falling, but it’s moving sidewise so quickly that by
the time it reaches “ground level,” it has not actually gotten any closer to the
surface of the earth. (Labels in the figure: “without gravity,” “ground level.”)
For trajectories short enough that we can pretend the earth is flat, the
general rule is the following:
Law of Falling Bodies. If an object is under no influences¹ but that of gravity,
then its vertical acceleration is a constant $g \approx -10\ \mathrm{m/s^2}$.
If the acceleration g is measured in feet per second squared rather than
meters per second squared, its value is approximately −32. In either case, this
acceleration is negative, because the object’s velocity is decreasing. (If the object
is moving downward, then its velocity is already negative, and is becoming more
negative.)
I’ve called this the Law of Falling Bodies rather than the Law of Gravity
because gravity generally refers to a deeper phenomenon discovered by Isaac
Newton, whereas a version of the Law of Falling Bodies was known earlier to
Galileo. (Who was forced to deal with average acceleration, because he did not
know about derivatives.)
If we translate the Law of Falling Bodies into mathematical notation, we
obtain the equation
\[
\frac{d^2 y}{dt^2} = -10,
\]
where y is the vertical position of the object. This is a very simple example
of what is called a differential equation; to “solve” the differential equation, we
ask what functions $y(t)$ would make it true. It is not hard to verify that for any
choice of a and b, the function $y(t) = -5t^2 + at + b$ is a solution:
\begin{align*}
y(t) &= -5t^2 + at + b \\
y'(t) &= -10t + a \\
y''(t) &= -10.
\end{align*}
¹ If this were a physics course, we’d use the word “forces.”
As it turns out, these are the only functions that satisfy this differential equation,
although we will not see why until next quarter. Thus, any time you toss or
drop an object, its vertical position is described by
\[ y = -5t^2 + at + b \]
for some choice of a and b. Note that there are many different paths possible,
since there are many different values of a and b. This is good, since there are
many different paths falling bodies can follow in real life. (If you throw a piece
of chalk up, it will follow a different path from the piece of chalk you throw
down, but both paths can be described by the equation $y = -5t^2 + at + b$, for
some (different) values of a and b.)
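As a small numerical sanity check, the sketch below estimates the second derivative of $y = -5t^2 + at + b$ with a central second difference for a few arbitrary choices of a, b and t. The function names, the step size `h`, and the sample values are assumptions made for illustration; the check only confirms that these particular functions satisfy the equation, not that they are the only solutions.

```python
def position(t, a, b):
    """Vertical position y(t) = -5 t^2 + a t + b (meters, seconds)."""
    return -5 * t**2 + a * t + b

def second_difference(f, t, h=1e-2):
    """Central second-difference estimate of f''(t)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / h**2

def acceleration(a, b, t):
    """Estimated y''(t) for the given choice of a and b."""
    return second_difference(lambda s: position(s, a, b), t)

if __name__ == "__main__":
    for a, b, t in [(0, 0, 0.0), (3, 1, 2.0), (-4, 10, 0.5)]:
        print(f"a={a}, b={b}, t={t}: y'' is approximately {acceleration(a, b, t):.6f}")
```

Whatever values of a and b are tried, the estimate stays at the same constant, which is the content of the differential equation.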
However, no matter what a and b are, $y = -5t^2 + at + b$ is always some
sort of upside-down parabola. Correspondingly, a falling object always moves
in some form of upside-down parabola; see Figure 2a. If you imagine that your
object is a droplet of water, and you string a bunch of these “objects” together
in a continuous stream, you can see the whole path at once, as in Figure 2b.
Notice how much more interesting nature’s parabola is than the stark, abstract
curve given in 2a. The water’s arc seems to scintillate with reflected light; cords
of water seem to twist together, like the muscles in a Michelangelo drawing of
an arm.
The textbook essentially just gives you the equation for the position,
say $y = -5t^2 + t + 1$, and asks you to calculate the acceleration. And as it turns
out, the acceleration is constantly −10 (or perhaps −32, since the textbook
seems to like feet more than meters). While finding the acceleration from the
position function is a perfectly good exercise, it somehow feels backwards. In
some sense, the basic statement is that the vertical acceleration of a falling
object is constantly −10; this basic fact is the cause of the effect that the object
travels in a parabola given by $y = -5t^2 + at + b$. By starting off with the path
and deducing the acceleration, it feels as though you are mixing up the cause
and the effect.
One final note: If you look more closely at Figure 2a, you will see that the
horizontal axis is indicating time. On the other hand, in Figure 2b, the “hor-
izontal axis,” such as it is, clearly is given by horizontal position, or distance
(more or less). Since the graph does not actually tell you where the object is
horizontally at a given time, it is not entirely clear why the “parabola” description
should be accurate; the graph could just as easily describe an object that
goes straight up and straight back down with no “sideways” movement. For the
moment, I’m just going to ignore this discrepancy. We may, or may not, discuss
it when discussing related rates.
Figure 2: (a) An object with constant negative acceleration moves in an
upside-down parabola (vertical axis y). (b) Parabolic trajectory of water. By GuidoB.
Modified (primarily to make it grayscale). This image is licensed under a Creative
Commons Attribution-Share Alike 3.0 Unported license; see
http://creativecommons.org/licenses/by-sa/3.0/deed.en.
3 Implicit Differentiation
Suppose we know, or suspect, that y is a differentiable function of x. We don’t
have a formula for y, but we may know that y and x satisfy some relation, for
instance,
\[ y^2 + x^2 = 1. \]
Often, we can use this, together with the chain rule, to figure out what the
derivative of y must be (assuming it has one). In the example at hand, we
differentiate both sides with respect to x, and then solve for the derivative $D_x y$:
\begin{align*}
D_x(y^2 + x^2) &= D_x(1) \\
2y \cdot D_x(y) + 2x &= 0 \\
2y\, D_x y &= -2x \\
D_x y &= \frac{-2x}{2y} = \frac{-x}{y}.
\end{align*}
This expression for the derivative $D_x y$ has a y in it as well as an x, which, as the
book says, can be “a nuisance.” However, it can nevertheless be quite useful. If
we should happen to know that the value of y at a point $x_0$ is $y_0$, then we can
use this to calculate $D_x y = dy/dx$ at the point $(x_0, y_0)$, assuming this derivative
exists.
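To see the formula $D_x y = -x/y$ in action, the following sketch picks one explicit solution of $y^2 + x^2 = 1$, namely the upper semicircle $y = \sqrt{1 - x^2}$, and compares the implicit answer with a finite-difference slope at the point (0.6, 0.8). The choice of branch, the sample point, and the function names are assumptions made for this illustration.

```python
import math

def implicit_slope(x, y):
    """Slope dy/dx obtained by implicit differentiation of y^2 + x^2 = 1."""
    return -x / y

def upper_semicircle(x):
    """One explicit solution of y^2 + x^2 = 1 (the branch with y >= 0)."""
    return math.sqrt(1 - x**2)

def numeric_slope(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x0 = 0.6
y0 = upper_semicircle(x0)

if __name__ == "__main__":
    print("implicit formula: ", implicit_slope(x0, y0))
    print("finite difference:", numeric_slope(upper_semicircle, x0))
```

On the lower semicircle the same formula applies with y negative, so the slope at the mirrored point has the opposite sign; this is why the answer has to be allowed to depend on y as well as x.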
Assignment 20 (due Monday, 21 November)
Section 2.3, Problems 39–40. Problem 40 will be graded carefully.
Section 2.7, Problems 1–2. Your expression for Dx y may include both xs and
ys. Problem 2 will be graded carefully.
On the other hand, we know that not every function is continuous. Thus, there
must be a flaw in the argument. What is it? (Hint: this argument can be used
to show that every differentiable function is continuous.)
Section 2.6, Problems 11–12 and 20–21. Problems 12 and 21 will be graded
carefully.
Section 2.7, Problems 3–6 and 19–20. The even-numbered problems will be
graded carefully.
(“Semi-bonus problem”) Suppose that
i.e., both the one-sided limits are defined, and they are equal. Use the ε-δ
definition of the limit to show that the two-sided limit is also defined and equal
to $\ell$, i.e., that
\[ \lim_{x \to c} f(x) = \ell. \]
The technique involved should be similar to that used to give ε-δ proofs for
piecewise linear functions.
This is a “semi-bonus problem” in the following sense:
• If you do not seriously attempt it, you will not receive full credit on your
homework.
• If you seriously attempt it, you will receive full credit for it (although your
actual homework grade will, of course, depend on the other homework
problems).
• If you get it right, you will receive a bonus point on the homework.
Math 131, Lecture 23
Charles Staats
Monday, 21 November 2011
Exercise 2. Use implicit differentiation to show that if $y = -\sqrt{x}$, then
\[
\frac{dy}{dx} = -\frac{1}{2\sqrt{x}}.
\]
Solution. This time, y also satisfies the equation $y^2 = x$, since
$(-\sqrt{x})^2 = (-1)^2 (\sqrt{x})^2 = 1 \cdot x = x$. Thus, we can differentiate implicitly:
\begin{align*}
y^2 &= x \\
2y \cdot \frac{dy}{dx} &= 1 \\
\frac{dy}{dx} &= \frac{1}{2y} = \frac{1}{2(-\sqrt{x})} = -\frac{1}{2\sqrt{x}}.
\end{align*}
Note that there is a problem we never dealt with here: we never actually
showed that $f(x) = \sqrt{x}$ is differentiable. We only figured out what its derivative
must be, if the derivative exists. There are a couple of ways to solve this problem.
• It is possible to compute the derivative of $\sqrt{x}$ directly from the definition
of the derivative (i.e., as the limit of the difference quotient). This
is done earlier in the textbook. However, this is a way to avoid using
implicit differentiation; what we really want is a way to show that implicit
differentiation works.
• There is a theorem called the “Implicit Function Theorem” that states,
roughly, that if implicit differentiation gives a reasonable answer, then the
equation in question does in fact have a solution y = f (x) where f is a
differentiable function. This is kind of like the Main Limit Theorem: If
the process gives a reasonable answer, then we know that must be the
right answer; but if the process does not give a reasonable answer, we
don’t know anything.
The Implicit Function Theorem may seem to be the answer to our problems,
but there are subtleties even here. First, the actual statement of the theorem is
something that I find confusing, so I very much doubt that you want to see it.
Second, while the Implicit Function Theorem can guarantee that some solutions
are differentiable (in this case, $f(x) = \sqrt{x}$ and $f(x) = -\sqrt{x}$ are both solutions
to $f(x)^2 = x$ that are differentiable for $x > 0$), there will also be other solutions
that are not differentiable. For instance, if f is the function defined by
\[
f(x) = \begin{cases} \sqrt{x} & \text{if } 0 < x \le 1, \\ -\sqrt{x} & \text{if } x > 1, \end{cases}
\]
then $y = f(x)$ is also a solution to the equation $y^2 = x$ for all $x > 0$, but f is
not even continuous, much less differentiable. We will not try to explain why
the Implicit Function Theorem applies for some “solutions,” but not to others.
[Figure: graph of f(x) against x.]
Instead, we will adopt a “third way”:
• Ignore the difficulties and just assume implicit differentiation works. Any
function we encounter “naturally” in this course¹ is going to work out just
fine.
In essence, we’ve reached a point where the skyscraper just gets too convoluted
to deal with, so we’re going to continue walking on clouds.
¹ That is, any function that has not been explicitly designed to cause problems.
There’s one more very important result we want to obtain using implicit differentiation.
Recall that we proved the Power Rule, $D_x(x^n) = nx^{n-1}$, whenever
n is an integer. We’re now going to show that this holds, not just for integers, but
for rational numbers.
Theorem. (Power Rule for rational exponents) Let r be any rational number.
Then
\[ D_x(x^r) = rx^{r-1}. \]
Incomplete Proof. Since r is a rational number (i.e., a “ratio” of two integers),
we may write
\[ r = \frac{p}{q}, \]
for some integers p, q, where $q \neq 0$. By definition, $y = x^{p/q}$ is a solution to the
equation
\[ y^q = x^p. \]
Applying implicit differentiation, together with the power rule for integer exponents,
we see that
\begin{align*}
q y^{q-1} \frac{dy}{dx} &= p x^{p-1} \\
\frac{dy}{dx} &= \frac{p x^{p-1}}{q y^{q-1}} \\
&= \frac{p}{q} \cdot \frac{x^{p-1}}{(x^r)^{q-1}} \\
&= r \cdot x^{(p-1) - r(q-1)} \\
&= r \cdot x^{p-1-(p/q)(q-1)} \\
&= r \cdot x^{p-1-p+p/q} \\
&= r \cdot x^{-1+p/q} \\
&= r \cdot x^{r-1}.
\end{align*}
The key point of this proof is that we could apply the power rule to $x^p$ and
$y^q$, because we already knew the power rule for integer exponents, and p, q are
integers. This proof is incomplete in that we have not really turned implicit
differentiation into a rigorous technique, so we can’t use it in “real” proofs.
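Since the proof is admittedly incomplete, it may be reassuring to test the conclusion numerically for one rational exponent. The sketch below uses $r = 2/3$ at $x = 8$, both of which are arbitrary choices for illustration, and compares $r x^{r-1}$ with a finite-difference slope of $x^r$; the helper names are invented here, and agreement at one point is evidence rather than proof.

```python
from fractions import Fraction

def power_rule_slope(r, x):
    """The value r * x^(r-1) predicted by the Power Rule."""
    return float(r) * x ** (float(r) - 1)

def numeric_slope(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

r = Fraction(2, 3)   # a rational exponent p/q with p = 2, q = 3
x0 = 8.0

predicted = power_rule_slope(r, x0)
estimated = numeric_slope(lambda x: x ** float(r), x0)

if __name__ == "__main__":
    print("power rule:       ", predicted)
    print("finite difference:", estimated)
```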
I commented at one point that calculus is “supposed” to work exactly the
same for rational and irrational numbers. Thus, it seems peculiar that we have
a rule that only seems to work for rational numbers. In fact, as it turns out,
the Power Rule does hold for all real exponents—rational or irrational. There’s
even a nice, elegant proof that does not care whether r is rational or irrational.
Unfortunately, this proof uses logarithms, so we won’t see it for some time (if
at all). Thus, for now, all our powers will be rational.
to tell that we mean “the function mapping $x \mapsto 2x$” rather than simply “the
number 2x.”
So far, this section has been entirely theoretical, but there is a practical,
computational issue as well. Suppose someone asks you to calculate the derivative
of $x^2 + 1$ at $x = 2$. You may be tempted to substitute in $x = 2$ before
differentiating, which would be a disaster. You’d be differentiating a number
rather than a function; you’d probably try to treat it as the constant function
$2^2 + 1 = 5$, and end up getting derivative 0 since the derivative of any constant
function is zero.
To be honest, I hope that none of you would make this particular error,
because this example is fairly straightforward. But when you deal with more
complicated relations—say, u and v are both functions of t, y is a function of
u, and you have some equation that involves all four letters t, u, v, y—it can
be easy to lose track of whether you are dealing with functions or numbers
“underneath.” A good rule of thumb here is the following:
Rule of Thumb. First, do all your differentiating. Then, and only then, start
treating variables as numbers.
For instance, if you are asked to find the derivative of $x^2 + 1$ at $x = 2$, you
should first differentiate (obtaining 2x) and then substitute in $x = 2$ (obtaining
4, the correct answer). Like any rule of thumb, this one has occasional exceptions.
The only truly reliable way to stay out of trouble is to know what you
are doing: to know, at each step of your argument, whether $x^2 + 1$ really means
“the number $x^2 + 1$” or “the function that maps $x \mapsto x^2 + 1$.” However, trying
to keep track of this can be quite confusing, and I think the Rule of Thumb
above will probably serve you well.
Assignment 21 (due Wednesday, 23 November)
[Note: If you won’t be in class on the day before Thanksgiving, then some time
before class, put the homework in my mailbox (in the Eckhart basement). Also,
send me an email so that I know you have done this.]
Section 2.6, Problems 11–12 and 20–21. Problems 12 and 21 will be graded
carefully.
Section 2.7, Problems 3–6 and 19–20. The even-numbered problems will be
graded carefully.
i.e., both the one-sided limits are defined, and they are equal. Use the ε-δ
definition of the limit to show that the two-sided limit is also defined and equal
to $\ell$, i.e., that
\[ \lim_{x \to c} f(x) = \ell. \]
The technique involved should be similar to that used to give ε-δ proofs for
piecewise linear functions.
This is a “semi-bonus problem” in the following sense:
• If you do not seriously attempt it, you will not receive full credit on your
homework.
• If you seriously attempt it, you will receive full credit for it (although your
actual homework grade will, of course, depend on the other homework
problems).
• If you get it right, you will receive a bonus point on the homework.
Section 2.6, Problems 23 and 38. Both of these will be graded carefully.
Section 2.7, Problems 8, 21, and 37. Problems 21 and 37 will be graded carefully.
Math 131, Lecture 24: Related Rates
Charles Staats
Wednesday, 23 November 2011
As far as I can tell, “related rates” are the textbook’s first excuse to really
start in on so-called “word problems.” Up to now, the course has been mostly
theoretical; the only real “applications” have been to studying the graphs of
functions. However, calculus was invented for real-world problems. If you can’t
understand how calculus relates to the real world, then you don’t really under-
stand calculus at all.
I think the real meat of the notion of “related rates” is in the examples, so
let us proceed to these examples without further ado.
Example 1. Suppose that a straight railroad consists of two completely rigid
segments, each 50 kilometers long. Suppose, further, that two immensely strong
men move the ends of the railroad toward each other at a constant rate of one
centimeter per hour, forcing the railroad to rise up in the center. (By this, I
mean that each end of the railroad is moving at a rate of one centimeter per
hour.) After one hour, how fast is the center point of the railroad moving up?
Solution. In any word problem like this, the first step is almost always to draw a
picture. At the same time, we probably want to assign names to all the variable
quantities.
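[Figure: the two 50 km rail segments meet at a peak of height h; each half of the
base has horizontal length $\ell$, each end moves inward at 1 cm/hr, and the peak
rises at rate dh/dt.]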
What we are interested in calculating is the rate of change of the height h with
respect to time t. We need to fix units, so let’s say we take time in hours and
distance in kilometers. We are given that the horizontal length $\ell$ is shrinking at
a rate of
\[
1\ \frac{\text{cm}}{\text{hr}} = .00001\ \frac{\text{km}}{\text{hr}}.
\]
In other words,
\[
\frac{d\ell}{dt} = -.00001.
\]
The Pythagorean Theorem tells us that
\[
h^2 + \ell^2 = 50^2;
\]
differentiating both sides with respect to t, we obtain
\begin{align*}
2h \frac{dh}{dt} + 2\ell \frac{d\ell}{dt} &= 0 \\
2h \frac{dh}{dt} + 2\ell(-.00001) &= 0 \\
\frac{dh}{dt} &= \frac{.00001\,\ell}{h} = \frac{\ell}{100000\,h}.
\end{align*}
Up to now, we have only been making substitutions when we knew that
something held for all time (or at least, all t > 0). This is because we needed
to be able to differentiate; and as discussed last time, this means we are really
working with functions rather than numbers. Our variables that could change
over the course of time would need to remain “dummy variables” so that we
could differentiate them (or with respect to them).
However, we are done differentiating now, so we can substitute in the particular
case we care about: specifically, when t = 1 (i.e., after one hour). In this
case, we have that
\[
\ell = 50 - .00001 = 49.99999.
\]
Since there is an h in the formula for dh/dt, we also need to find out what h is
at t = 1, which we do using the Pythagorean Theorem (again):
\begin{align*}
h^2 + \ell^2 &= 50^2 \\
h^2 &= 50^2 - \ell^2 \\
&= 50^2 - (50 - .00001)^2 \\
&= 50^2 - 50^2 + 2 \cdot 50(.00001) - .0000000001 \\
&= .001 - .0000000001 \\
h &= \sqrt{.001 - .0000000001} \\
&= \sqrt{10^{-3} - 10^{-10}} \\
&= \sqrt{10^{-3}(1 - 10^{-7})} \\
&= 10^{-3/2} \sqrt{1 - 10^{-7}}.
\end{align*}
Thus, plugging in this h and $\ell$, we find that
\begin{align*}
\left. \frac{dh}{dt} \right|_{t=1} &= \frac{\ell}{10^5 h} \\
&= \frac{50 - 10^{-5}}{10^5 \cdot 10^{-3/2} \sqrt{1 - 10^{-7}}} \\
&= \frac{(50 - 10^{-5}) \cdot 10^{-7/2}}{\sqrt{1 - 10^{-7}}}.
\end{align*}
Since $\sqrt{1 - 10^{-7}}$ is very nearly 1 and $50 - 10^{-5}$ is very nearly 50, this tells us
that the rate of the vertex going up, dh/dt, is very close to
$50 \cdot 10^{-7/2} \approx .0158$ kilometers per hour, or about 16 meters per hour, at time
t = 1 hr. Thus, the midpoint is going up much faster than the sides are going in (1
cm/hr).
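A computation like this is easy to get wrong by a power of ten, so a numerical cross-check can be worthwhile. The sketch below, whose function and constant names are invented for illustration, models the half-span as $\ell(t) = 50 - 10^{-5} t$ kilometers, evaluates $dh/dt = -\ell \, (d\ell/dt) / h$ at t = 1, and compares it with a finite-difference slope of h(t) that does not use the differentiated relation at all.

```python
import math

SEGMENT_KM = 50.0            # length of each rigid segment
END_SPEED_KM_PER_HR = 1e-5   # each end moves inward at 1 cm/hr

def half_span(t):
    """Horizontal length l (km) of one half of the railroad after t hours."""
    return SEGMENT_KM - END_SPEED_KM_PER_HR * t

def height(t):
    """Height h (km) of the center point, from h^2 + l^2 = 50^2."""
    l = half_span(t)
    # (50 - l)(50 + l) equals 50^2 - l^2 algebraically; the factored form
    # usually loses less precision to rounding when l is close to 50.
    return math.sqrt((SEGMENT_KM - l) * (SEGMENT_KM + l))

def height_rate(t):
    """dh/dt from the differentiated relation 2h h' + 2l l' = 0, with l' = -1e-5."""
    return half_span(t) * END_SPEED_KM_PER_HR / height(t)

def finite_difference_rate(t, dt=1e-3):
    """Central-difference estimate of dh/dt, independent of the formula above."""
    return (height(t + dt) - height(t - dt)) / (2 * dt)

if __name__ == "__main__":
    print("formula:          ", height_rate(1.0), "km/hr")
    print("finite difference:", finite_difference_rate(1.0), "km/hr")
```

If the two printed numbers agree, that is good evidence the differentiation and the substitution were carried out consistently.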
If you really think about it, the problem above does not so much calculate
the rate of change, as explain why the problem is so incredibly unrealistic. The
way to make work easier is to use leverage, or “mechanical advantage,” so that
your quick motion produces a slow motion in the thing you are trying to move.
The fictional “very strong men” in this example are doing exactly the opposite:
they are working at an enormous mechanical disadvantage.
Math 131, Lecture 25
Charles Staats
Monday, 28 November 2011
Solution. Let e denote the edge length of the cube, and let V denote its
volume. Since $V = e^3$, differentiating with respect to time t gives
\begin{align*}
\frac{dV}{dt} &= 3e^2 \frac{de}{dt} \\
&= 3e^2 \cdot 3 \\
&= 9e^2.
\end{align*}
(We substituted in $de/dt = 3$ since this is true for all time.) At the particular
instant we care about, we are given that $e = 12$, and so
\[
\frac{dV}{dt} = 9e^2 = 9(12)^2 = 9 \cdot 144 = 1296.
\]
The volume is increasing at a rate of 1296 cubic inches per second.
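The same answer can be checked numerically. In the sketch below, the clock is set (as an assumption for illustration) so that t = 0 is the instant at which e = 12, the edge grows at 3 inches per second, and the chain-rule value $3e^2 \, de/dt$ is compared with a finite-difference slope of the volume. The function names are invented here.

```python
def edge(t):
    """Edge length in inches, assuming e = 12 at t = 0 and de/dt = 3."""
    return 12.0 + 3.0 * t

def volume(t):
    """Volume of the cube in cubic inches."""
    return edge(t) ** 3

def chain_rule_rate(e, de_dt):
    """dV/dt = 3 e^2 (de/dt), obtained by differentiating V = e^3."""
    return 3 * e**2 * de_dt

def numeric_rate(t, h=1e-4):
    """Central-difference estimate of dV/dt."""
    return (volume(t + h) - volume(t - h)) / (2 * h)

if __name__ == "__main__":
    print("chain rule:       ", chain_rule_rate(12, 3))
    print("finite difference:", numeric_rate(0.0))
```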
2 Maxima and Minima
Consider a child selling lemonade on the sidewalk.¹ If she sets the price at $0
per cup (i.e., she gives it away for free), then plenty of people will take a cup,
but she won’t make any money. On the other hand, if she sets the price too
high—say, $7 per cup—then no one will buy from her, and she also won’t make
any money. If she puts the price somewhere in the middle, then she may well
sell some lemonade and make some money. But how can she figure out what
price to set so that she will make the most money? Realistically, she probably
can’t, but only because she does not know calculus.²
Let m denote the amount of money she makes, let p denote the price she
charges, and let n denote the number of cups she sells. It is fairly clear that
\[ m = n \cdot p; \]
in words, the amount of money she makes is the number of cups she sells times
the price per cup.³ Moreover, we are assuming that the number of cups she
sells is determined by the price she sets. In other words, n is a function of p.
Consequently, m is also a function of p.
¹ Let’s pretend it’s summer; otherwise, she has chosen a singularly inappropriate time of [...]
² [...] research, and even then the answer would only be approximate. But since this is a course in
calculus rather than economics, we’re going to ignore that bit.
³ You might object that we should also consider how much she has to pay for the lemonade, [...]
Ideally, we should do a fair amount of market research to figure out what
function gives n; in other words, how many cups she sells when she sets a given
price. But since we’re mathematicians rather than economists here, let’s just
make a sort of silly guess. Let’s say that if she sets the price at $0, then she will
“sell” (give away) 50 cups (maybe 50 people pass by during the hour she sits at
the stand). If she sets the price to $7 or more, she will sell zero cups. So, let’s
just draw a straight line between the points (0, 50) and (7, 0), and call it n.
[Figure: graph of n against p, the straight line from (0, 50) to (7, 0).]
Note that these functions are not defined for p < 0, since “negative price” really
does not make sense in this context.
If we graph m, the amount of money made, as a function of the price p, we
obtain
[Figure: graph of m, from \$0 to \$90, against the price p, from \$0 to \$8.]
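The maximum occurs at the point where the tangent line to the graph is horizontal;
in other words, where $m'(p) = 0$. And we can find this using calculus:
\[
m(p) = \begin{cases} 50p - \frac{50}{7} p^2 & \text{if } 0 \le p \le 7, \\ 0 & \text{if } p > 7, \end{cases}
\qquad
m'(p) = \begin{cases} 50 - \frac{100}{7} p & \text{if } 0 < p < 7, \\ 0 & \text{if } p > 7. \end{cases}
\]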
Warning. One error that a lot of people made on the test would amount, in
this case, to writing $m'(p) = 50 - \frac{100}{7} p$ for $0 \le p \le 7$. (Note the ≤ sign rather
than the < sign.) When you differentiate a piecewise-defined function, a ≤ sign
will usually (although not always) become a < sign. If you look at the graph,
you can see that the function is not differentiable at p = 7.
If we solve for the places where $m'(p) = 0$, we find that this holds when
$p = 7/2$ or $p > 7$. Looking at the graph, it is clear that m is maximized
(i.e., the girl makes the most possible money) when $p = 7/2 = 3.5$; in other
words, according to this model, she ought to set her price at \$3.50 per cup. The
maximum value of the function is
\[
m\left(\tfrac{7}{2}\right) = 50 \cdot \tfrac{7}{2} - \tfrac{50}{7} \left(\tfrac{7}{2}\right)^2 = 87.5.
\]
In other words, the most money the girl can possibly make is \$87.50.
The following, more precise mathematics allows us to handle these sorts of
things more generally:
Definition. Let f be a function defined on an interval [a, b] and $x_0$ a point in
its domain. We say that $x_0$ is a critical point of f if any of the following holds:
• $x_0$ is an endpoint of the interval (i.e., $x_0 = a$ or $x_0 = b$); or
• $f'(x_0)$ does not exist; or
• $f'(x_0) = 0$.
The last type of critical point, where $f'(x_0) = 0$, is in some sense the most
interesting sort of critical point to find (find the derivative $f'$, then solve for
$f'(x) = 0$). But the other two kinds should not be forgotten, since they are
absolutely necessary to make the following theorem true.
Theorem. Let f be a continuous function with domain a closed interval [a, b].
Then f has a maximum value and a minimum value. Moreover, every point at
which the maximum (minimum) is attained is a critical point.
In other words, if we know f is a continuous function on [a, b], then the
following procedure will allow us to find the minima and maxima of f on [a, b]:
1. Find the critical points of f (all three kinds).
2. Evaluate f at each of the critical points.
3. The largest of the resulting values is the maximum value of f on [a, b].
The least of the resulting values is the minimum value of f on [a, b].
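The three-step procedure can be sketched in a few lines of code. The example below applies it to the lemonade function $m(p) = 50p - \frac{50}{7}p^2$ restricted to the closed interval [0, 7]; the critical points are supplied by hand (the two endpoints and the solution of $m'(p) = 0$), and the function names are invented for this illustration.

```python
def revenue(p):
    """m(p) = 50 p - (50/7) p^2, the lemonade revenue for 0 <= p <= 7."""
    return 50 * p - (50 / 7) * p**2

def closed_interval_extrema(f, critical_points):
    """Evaluate f at each critical point and pick out the largest and least values.

    Returns ((x_max, f(x_max)), (x_min, f(x_min))).
    """
    values = {x: f(x) for x in critical_points}
    x_max = max(values, key=values.get)
    x_min = min(values, key=values.get)
    return (x_max, values[x_max]), (x_min, values[x_min])

# Step 1: critical points on [0, 7]: both endpoints, plus p = 7/2 where m'(p) = 0.
critical_points = [0.0, 3.5, 7.0]

# Steps 2 and 3: evaluate and compare.
(best_price, best_revenue), (worst_price, worst_revenue) = closed_interval_extrema(
    revenue, critical_points
)

if __name__ == "__main__":
    print("maximum:", best_revenue, "at p =", best_price)
    print("minimum:", worst_revenue, "at p =", worst_price)
```

Note that the code does not find the critical points for you; step 1 still requires differentiating by hand and remembering the endpoints and any points where the derivative fails to exist.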
Assignment 23 (last assignment; due Wednesday,
30 November, 2011)
Section 2.7, Problems 21, 22, and 38.
Section 3.1, Problems 1 and 5–6. On 5 and 6, include graphs of the function on
the interval. Do NOT graph the function outside the interval.
Bonus Exercise. Let f be a function defined on (0, 1); in other words, f (x) is
defined whenever 0 < x < 1.
(i) Give a formal (M-δ) definition for the statement that
\[ \lim_{x \to 0^+} f(x) = \infty. \]
(ii) Assume that $\lim_{x \to 0^+} f(x) = \infty$. Use the M-δ statement above to prove that
f has no maximum value on (0, 1).
Math 131, Lecture 26 (final lecture)
Charles Staats
Wednesday, 30 November 2011
3 Maxima and minima—the theory
We’re going to spend a few minutes talking about the basic theory (theorems
and such) before seeing the applications.
Definition. Let f be a function. The maximum value of f is a value M such
that
(i) f attains the value M ; i.e., there is some x0 such that M = f (x0 ); and
[Figure: graphs of f(x) and g(x).]
\[
g(x) = \begin{cases} x & \text{if } 0 \le x < 1, \\ \frac{1}{2} & \text{if } 1 \le x \le 2. \end{cases}
\]
• $x_0$ is an endpoint of the interval (i.e., $x_0 = a$ or $x_0 = b$); or
• $f'(x_0)$ does not exist; or
• $f'(x_0) = 0$.
The last type of critical point, where $f'(x_0) = 0$, is in some sense the most
interesting sort of critical point to find (find the derivative $f'$, then solve for
$f'(x) = 0$). But the other two kinds should not be forgotten, since they are
absolutely necessary to make the following theorem true.
Theorem. Let f be a continuous function with domain a closed interval [a, b].
Then the only points where f could possibly equal its extreme values are the
critical points.
Idea of proof. We prove the contrapositive. Suppose $x_0$ is not a critical point.
We will show that $f(x_0)$ is not an extremal value of f.
[Figure: graph of f(x) against x, with the point $x_0$ and the value $f(x_0)$ marked.]
4 Maxima and minima: example
In other words, if we know f is a continuous function on [a, b], then the following
procedure will allow us to find the minima and maxima of f on [a, b]:
1. Find the critical points of f (all three kinds).
2. Evaluate f at each of the critical points.
3. The largest of the resulting values is the maximum value of f on [a, b].
The least of the resulting values is the minimum value of f on [a, b].
Example 1. Find the critical points, minimum, and maximum for the function
f given by
\[ f(x) = \tfrac{1}{3} x^3 - x \]
on the closed interval [−2.5, 1.5].
[Figure: graph of f(x) against x.]