Higher Order Derivatives and Taylor Expansions: Example
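$$\frac{\partial^2 f}{\partial x^2} \equiv \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial}{\partial x}\,(6xy + 2x) = 6y + 2$$
$$\frac{\partial^2 f}{\partial x\,\partial y} \equiv \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial}{\partial x}\left(3x^2 + 0\right) = 6x$$
$$\frac{\partial^2 f}{\partial y\,\partial x} \equiv \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial}{\partial y}\,(6xy + 2x) = 6x + 0 = 6x$$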
Note that
$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}.$$
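This is in fact a general phenomenon: the value of a mixed partial derivative does not depend on the order
in which the derivatives are taken. Stated more formally: if $f : \mathbb{R}^n \to \mathbb{R}$ has continuous partial derivatives up to order 2,
then
$$\frac{\partial^2 f}{\partial x_i\,\partial x_j} = \frac{\partial^2 f}{\partial x_j\,\partial x_i}$$
for all $i$ and $j$.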
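As a quick numerical sanity check of this symmetry, the sketch below approximates both mixed partials by central finite differences. The function `f(x, y) = 3x^2 y + x^2` is an assumption: it is one function consistent with the partial derivatives used in the example above (∂f/∂x = 6xy + 2x and ∂f/∂y = 3x^2). The helper names `d_dx`, `d_dy` and the step size `h` are illustrative choices.

```python
def f(x, y):
    # Assumed function, consistent with df/dx = 6xy + 2x and df/dy = 3x^2.
    return 3 * x**2 * y + x**2


def d_dx(g, h=1e-3):
    """Return a central-difference approximation to the x-partial of g."""
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)


def d_dy(g, h=1e-3):
    """Return a central-difference approximation to the y-partial of g."""
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)


f_xy = d_dx(d_dy(f))  # d/dx (df/dy)
f_yx = d_dy(d_dx(f))  # d/dy (df/dx)

x0, y0 = 1.5, -0.7
mixed_xy = f_xy(x0, y0)
mixed_yx = f_yx(x0, y0)
# Both approximations should be close to the exact value 6 * x0.
print(mixed_xy, mixed_yx, 6 * x0)
```

Both orders of differentiation give approximately the same number, which also matches the exact value 6x computed by hand.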
2. TAYLOR’S FORMULA FOR FUNCTIONS OF SEVERAL VARIABLES
Recall that if $f(x)$ is a function of a single variable that is continuous and differentiable up to order $n + 1$,
then Taylor’s theorem says that
$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n + R_n(x, a),$$
where
$$R_n(x, a) = \int_a^x \frac{(x - s)^n}{n!}\, f^{(n+1)}(s)\, ds,$$
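and that, moreover, the error term is of order $(x - a)^{n+1}$. Thus, to order $(x - a)^n$, we can approximate the
function $f(x)$ by the polynomial function
$$f(x) \approx f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n.$$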
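To make the order of the error term concrete, here is a minimal sketch using the exponential function, chosen purely for illustration because all of its derivatives at a equal exp(a). The function name `taylor_exp` and the sample step sizes are assumptions, not part of the notes.

```python
import math


def taylor_exp(x, a, n):
    """Degree-n Taylor polynomial of exp about a, evaluated at x.

    Every derivative of exp at a equals exp(a), so the k-th coefficient
    is exp(a) / k!.
    """
    return sum(math.exp(a) * (x - a) ** k / math.factorial(k) for k in range(n + 1))


a, n = 0.0, 3
ratios = []
for dx in (0.1, 0.05, 0.025):
    error = abs(math.exp(a + dx) - taylor_exp(a + dx, a, n))
    # If the error is of order (x - a)^(n+1), this ratio stays roughly constant.
    ratios.append(error / dx ** (n + 1))
    print(dx, error, ratios[-1])
```

Halving the distance to a shrinks the error by roughly a factor of 2^(n+1), so the printed ratio settles near a constant.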
There is an analogous theorem for functions of several variables. However, since its general statement is
a bit messy unless we introduce some new notation, we’ll simply state the first and second order Taylor
formulae.
Theorem 10.3. Let $f : \mathbb{R}^n \to \mathbb{R}$ have continuous partial derivatives up to order 2. Then we may write
$$f(x) = f(a) + \nabla f(a) \cdot (x - a) + R_1(x, a)$$
with the error term $R_1(x, a)$ going to zero at least as fast as a constant times $\|x - a\|^2$
as $x \to a$.
Note that the approximating function $f(a) + \nabla f(a) \cdot (x - a)$ is linear in the coordinates of $x$. Its graph is thus a flat plane and generalizes the
idea of the best straight line fit to a curve: it represents the best flat plane approximation to the graph of $f$ near $a$.
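The behaviour of the first order remainder can be checked numerically. The sketch below is only an illustration: the function `f(x, y) = e^x sin y`, its hand-computed gradient, the base point and the direction of approach are all assumptions chosen for the example, not taken from the notes.

```python
import math


def f(x, y):
    # Illustrative function (not from the text).
    return math.exp(x) * math.sin(y)


def grad_f(x, y):
    # Hand-computed gradient of the illustrative f.
    return (math.exp(x) * math.sin(y), math.exp(x) * math.cos(y))


def first_order(x, y, a):
    """Tangent-plane approximation f(a) + grad f(a) . (x - a)."""
    gx, gy = grad_f(*a)
    return f(*a) + gx * (x - a[0]) + gy * (y - a[1])


a = (0.5, 1.0)
ratios = []
for t in (0.1, 0.01, 0.001):
    x, y = a[0] + t, a[1] + 2 * t  # approach a along a fixed direction
    dist_sq = (x - a[0]) ** 2 + (y - a[1]) ** 2
    remainder = f(x, y) - first_order(x, y, a)
    ratios.append(remainder / dist_sq)  # stays bounded as t -> 0
print(ratios)
```

The ratio of the remainder to the squared distance stays bounded as the point approaches a, which is what the first order error estimate predicts.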
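Theorem. Let $f : \mathbb{R}^n \to \mathbb{R}$ have continuous partial derivatives up to order 3. Then we may write
$$f(x) = f(a) + \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(a)\,(x_i - a_i) + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial^2 f}{\partial x_i\,\partial x_j}(a)\,(x_i - a_i)(x_j - a_j) + R_2(x, a)$$
with the error term $R_2(x, a)$ going to zero at least as fast as a constant times $\|x - a\|^3$
as $x \to a$.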
Example 10.5. Compute the second order Taylor formula for the function $f(x, y) = xy + x^2 + y^2$ about
the point $(1, 1)$.
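• We have
$$f(1, 1) = 1 + 1 + 1 = 3,$$
$$\left.\frac{\partial f}{\partial x}\right|_{(1,1)} = (y + 2x + 0)\big|_{(1,1)} = 3,$$
$$\left.\frac{\partial f}{\partial y}\right|_{(1,1)} = (x + 0 + 2y)\big|_{(1,1)} = 3,$$
$$\left.\frac{\partial^2 f}{\partial x^2}\right|_{(1,1)} = (0 + 2 + 0)\big|_{(1,1)} = 2,$$
$$\left.\frac{\partial^2 f}{\partial x\,\partial y}\right|_{(1,1)} = \left.\frac{\partial^2 f}{\partial y\,\partial x}\right|_{(1,1)} = (1 + 0 + 0)\big|_{(1,1)} = 1,$$
$$\left.\frac{\partial^2 f}{\partial y^2}\right|_{(1,1)} = (0 + 0 + 2)\big|_{(1,1)} = 2.$$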
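So
$$\begin{aligned}
f(x, y) &= f(1, 1) + \left.\frac{\partial f}{\partial x}\right|_{(1,1)} (x - 1) + \left.\frac{\partial f}{\partial y}\right|_{(1,1)} (y - 1) \\
&\quad + \frac{1}{2} \left[ \left.\frac{\partial^2 f}{\partial x^2}\right|_{(1,1)} (x - 1)^2 + \left.\frac{\partial^2 f}{\partial x\,\partial y}\right|_{(1,1)} (x - 1)(y - 1) + \left.\frac{\partial^2 f}{\partial y\,\partial x}\right|_{(1,1)} (y - 1)(x - 1) + \left.\frac{\partial^2 f}{\partial y^2}\right|_{(1,1)} (y - 1)^2 \right] \\
&\quad + O\!\left(\|(x, y) - (1, 1)\|^3\right) \\
&= 3 + 3(x - 1) + 3(y - 1) + \frac{1}{2} \left[ 2(x - 1)^2 + 2(x - 1)(y - 1) + 2(y - 1)^2 \right] + O\!\left(\|(x, y) - (1, 1)\|^3\right) \\
&= 3 + 3(x - 1) + 3(y - 1) + (x - 1)^2 + (x - 1)(y - 1) + (y - 1)^2 + O\!\left(\|(x, y) - (1, 1)\|^3\right).
\end{aligned}$$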
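The result of Example 10.5 can be checked with a few lines of code. This is a minimal sketch: the function name `taylor2` and the sample points are illustrative choices. Because this particular f is itself a quadratic polynomial, all of its third order partial derivatives vanish, so the second order Taylor polynomial should reproduce f up to floating point rounding.

```python
def f(x, y):
    return x * y + x**2 + y**2


def taylor2(x, y):
    """Second order Taylor polynomial of f about (1, 1), using the values computed above."""
    dx, dy = x - 1, y - 1
    return 3 + 3 * dx + 3 * dy + 0.5 * (2 * dx**2 + 2 * dx * dy + 2 * dy**2)


points = [(1.0, 1.0), (1.5, 0.5), (-2.0, 3.0)]
for x, y in points:
    # The two values agree at every sample point.
    print((x, y), f(x, y), taylor2(x, y))
```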
Below I present another (equivalent) formula for the second order Taylor expansion.
Let $(x - a)$ be the n-dimensional column vector with components
$$(x - a) = \begin{pmatrix} x_1 - a_1 \\ x_2 - a_2 \\ \vdots \\ x_n - a_n \end{pmatrix}$$
and let $(x - a)^T$ be the matrix transpose of $(x - a)$ (an n-dimensional row vector)
$$(x - a)^T = (x_1 - a_1,\; x_2 - a_2,\; \cdots,\; x_n - a_n).$$
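The gradient vector $\nabla f(a) = Df(a)$, according to the conventions of Section 2.3, is an n-dimensional row
vector:
$$\nabla f(a) = \left( \frac{\partial f}{\partial x_1}(a),\; \frac{\partial f}{\partial x_2}(a),\; \cdots,\; \frac{\partial f}{\partial x_n}(a) \right).$$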
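Let us now define the Hessian matrix at the point $a$ as the $n \times n$ matrix $Hf(a)$ defined by
$$Hf(a) = \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1\,\partial x_1}(a) & \dfrac{\partial^2 f}{\partial x_1\,\partial x_2}(a) & \cdots & \dfrac{\partial^2 f}{\partial x_1\,\partial x_n}(a) \\
\dfrac{\partial^2 f}{\partial x_2\,\partial x_1}(a) & \dfrac{\partial^2 f}{\partial x_2\,\partial x_2}(a) & \cdots & \vdots \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_n\,\partial x_1}(a) & \cdots & \cdots & \dfrac{\partial^2 f}{\partial x_n\,\partial x_n}(a)
\end{pmatrix}$$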
Then we can write
$$f(x) = f(a) + \nabla f(a) \cdot (x - a) + \frac{1}{2}\,(x - a)^T\, Hf(a)\, (x - a) + O\!\left(\|x - a\|^3\right)$$
for the second order Taylor expansion of $f$ about $a$.
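The matrix form lends itself to direct computation. The sketch below evaluates the second order Taylor polynomial from a gradient and a Hessian stored as plain Python lists; the function name `taylor2_matrix` and its argument layout are hypothetical, chosen for illustration. The sample data are the values found in Example 10.5 for f(x, y) = xy + x^2 + y^2 at a = (1, 1).

```python
def taylor2_matrix(x, a, f_a, grad, hess):
    """Evaluate f(a) + grad . (x - a) + 1/2 (x - a)^T H (x - a) using plain lists."""
    d = [xi - ai for xi, ai in zip(x, a)]
    linear = sum(g * di for g, di in zip(grad, d))
    # (x - a)^T H (x - a) written out as a double sum over the entries of H.
    quadratic = sum(
        d[i] * hess[i][j] * d[j] for i in range(len(d)) for j in range(len(d))
    )
    return f_a + linear + 0.5 * quadratic


# Data for f(x, y) = xy + x^2 + y^2 at a = (1, 1), taken from Example 10.5.
a = [1.0, 1.0]
f_a = 3.0
grad = [3.0, 3.0]
hess = [[2.0, 1.0],
        [1.0, 2.0]]

value = taylor2_matrix([1.5, 0.5], a, f_a, grad, hess)
print(value)
```

The double sum in `quadratic` is the same expression as the component form of the second order theorem, so the two formulas give the same value.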