Chapter 4: Unconstrained Optimization
Outline:
Part I: one-dimensional unconstrained optimization
Analytical method
Newton's method
Golden-section search method
Part II: multidimensional unconstrained optimization
Analytical method
Gradient method: steepest ascent (descent) method
Newton's method
PART I: One-Dimensional Unconstrained Optimization Techniques
1 Analytical Method
At a stationary point, $F'(x) = 0$. The point is a local maximum if $F''(x) < 0$ there, and a local minimum if $F''(x) > 0$.
2 Newton's Method
F(x^*) = \min_x F(x) \;\Longleftrightarrow\; \min_p F(x_k + p)

\approx \min_p \left[ F(x_k) + p\,F'(x_k) + \frac{p^2}{2} F''(x_k) \right]

Let

\frac{\partial}{\partial p}\left[ F(x_k) + p\,F'(x_k) + \frac{p^2}{2} F''(x_k) \right] = F'(x_k) + p\,F''(x_k) = 0

we have

p = -\frac{F'(x_k)}{F''(x_k)}

Newton's iteration:

x_{k+1} = x_k + p = x_k - \frac{F'(x_k)}{F''(x_k)}
Example: find the maximum value of $f(x) = 2\sin x - \frac{x^2}{10}$ with an initial guess of $x_0 = 2.5$.
Solution:
f'(x) = 2\cos x - \frac{2x}{10} = 2\cos x - \frac{x}{5}

f''(x) = -2\sin x - \frac{1}{5}

x_{i+1} = x_i - \frac{2\cos x_i - \frac{x_i}{5}}{-2\sin x_i - \frac{1}{5}}
x0 = 2.5, x1 = 0.995, x2 = 1.469.
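A minimal Python sketch of this iteration (the function and derivatives follow the example above; the function names, tolerance, and iteration cap are illustrative choices, not part of the original notes):

```python
import math

def newton_opt(df, d2f, x0, tol=1e-6, max_iter=50):
    """Newton's method for 1-D optimization: x_{k+1} = x_k - f'(x_k)/f''(x_k)."""
    x = x0
    for _ in range(max_iter):
        x_new = x - df(x) / d2f(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Example: f(x) = 2 sin(x) - x^2/10, starting from x0 = 2.5
df  = lambda x: 2 * math.cos(x) - x / 5       # f'(x)
d2f = lambda x: -2 * math.sin(x) - 1 / 5      # f''(x)
print(newton_opt(df, d2f, 2.5))               # iterates 0.995, 1.469, ...; converges to about 1.4276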
Comments:
Same as the Newton-Raphson method applied to solving $F'(x) = 0$.
Quadratic convergence: $|x_{k+1} - x^*| \propto |x_k - x^*|^2$.
May diverge.
Requires both the first and second derivatives.
The solution can be either a local minimum or a local maximum.
3 Golden-section search for optimization in 1-D
Figure 2: Golden-section search: updating the search range. If $f(x_1) > f(x_2)$, the new interval is $[x_2, x_u]$ ($x_2$ becomes the new $x_l$); if $f(x_1) < f(x_2)$, the new interval is $[x_l, x_1]$ ($x_1$ becomes the new $x_u$).
The choice of d
Define $r = \frac{d_0}{l_0} = \frac{d_1}{l_1} = \frac{l_2}{l_1}$. Then $r^2 + r - 1 = 0$, and $r = \frac{\sqrt{5} - 1}{2} \approx 0.618$.
$d = r(x_u - x_l) \approx 0.618(x_u - x_l)$ is referred to as the golden value.
Relative error:

\epsilon_a = \left| \frac{x_{\text{new}} - x_{\text{old}}}{x_{\text{new}}} \right| \times 100\%
Consider $F(x_2) < F(x_1)$. That is, the new bounds are $x_l = x_2$ and $x_u = x_u$ (unchanged), and the current estimate of the optimum is $x_1$.
For case (a), $x^* > x_2$ and $x^*$ is closer to $x_2$:

\Delta x_a = x_1 - x_2 = (x_l + d) - (x_u - d) = (x_l - x_u) + 2d = (x_l - x_u) + 2r(x_u - x_l) = (2r - 1)(x_u - x_l) \approx 0.236(x_u - x_l)

For case (b), $x^* > x_2$ and $x^*$ is closer to $x_u$:

\Delta x_b = x_u - x_1 = x_u - (x_l + d) = x_u - x_l - d = (x_u - x_l) - r(x_u - x_l) = (1 - r)(x_u - x_l) \approx 0.382(x_u - x_l)

Therefore, the maximum absolute error is $(1 - r)(x_u - x_l) \approx 0.382(x_u - x_l)$.
\epsilon_a \approx \left| \frac{\Delta x}{x^*} \right| \times 100\% \le \frac{(1 - r)(x_u - x_l)}{|x^*|} \times 100\% = \frac{0.382(x_u - x_l)}{|x^*|} \times 100\%
Example: Find the maximum of $f(x) = 2\sin x - \frac{x^2}{10}$ with $x_l = 0$ and $x_u = 4$ as the starting search range.
Solution:
Iteration 1: $x_l = 0$, $x_u = 4$, $d = \frac{\sqrt{5}-1}{2}(x_u - x_l) = 2.472$, $x_1 = x_l + d = 2.472$, $x_2 = x_u - d = 1.528$. $f(x_1) = 0.63$, $f(x_2) = 1.765$.
Since $f(x_2) > f(x_1)$, $x^* \approx x_2 = 1.528$, $x_l = x_l = 0$ and $x_u = x_1 = 2.472$.
Iteration 2: $x_l = 0$, $x_u = 2.472$, $d = \frac{\sqrt{5}-1}{2}(x_u - x_l) = 1.528$, $x_1 = x_l + d = 1.528$, $x_2 = x_u - d = 0.944$. $f(x_1) = 1.765$, $f(x_2) = 1.531$.
Since $f(x_1) > f(x_2)$, $x^* \approx x_1 = 1.528$, $x_l = x_2 = 0.944$ and $x_u = x_u = 2.472$.
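A short Python sketch of the golden-section search for a maximum (the bracket update follows Figure 2; the function name and the interval-width stopping rule are illustrative choices):

```python
import math

def golden_section_max(f, xl, xu, tol=1e-5):
    """Golden-section search for the maximum of f on [xl, xu]."""
    r = (math.sqrt(5) - 1) / 2            # golden value, about 0.618
    x1 = xl + r * (xu - xl)
    x2 = xu - r * (xu - xl)
    while (xu - xl) > tol:
        if f(x1) > f(x2):                 # maximum lies in [x2, xu]
            xl = x2
        else:                             # maximum lies in [xl, x1]
            xu = x1
        x1 = xl + r * (xu - xl)
        x2 = xu - r * (xu - xl)
    return (xl + xu) / 2

f = lambda x: 2 * math.sin(x) - x**2 / 10
print(golden_section_max(f, 0, 4))        # about 1.4276
```

For clarity this sketch recomputes both interior points on every pass; in practice one interior point and its function value can be reused, so only one new function evaluation is needed per iteration.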
PART II: Multidimensional Unconstrained Optimization Techniques
4 Analytical Method
Definitions:
If f (x, y) < f (a, b) for all (x, y) near (a, b), f (a, b) is a local maximum;
If f (x, y) > f (a, b) for all (x, y) near (a, b), f (a, b) is a local minimum.
If f (x, y) has a local maximum or minimum at (a, b), and the first order partial
derivatives of f (x, y) exist at (a, b), then
\frac{\partial f}{\partial x}\Big|_{(a,b)} = 0, \quad \text{and} \quad \frac{\partial f}{\partial y}\Big|_{(a,b)} = 0
If

\frac{\partial f}{\partial x}\Big|_{(a,b)} = 0 \quad \text{and} \quad \frac{\partial f}{\partial y}\Big|_{(a,b)} = 0,

then $(a, b)$ is a critical point or stationary point of $f(x, y)$.
If

\frac{\partial f}{\partial x}\Big|_{(a,b)} = 0 \quad \text{and} \quad \frac{\partial f}{\partial y}\Big|_{(a,b)} = 0
and the second order partial derivatives of f (x, y) are continuous, then
When $|H| > 0$ and $\frac{\partial^2 f}{\partial x^2}\big|_{(a,b)} < 0$, $f(a, b)$ is a local maximum of $f(x, y)$.
When $|H| > 0$ and $\frac{\partial^2 f}{\partial x^2}\big|_{(a,b)} > 0$, $f(a, b)$ is a local minimum of $f(x, y)$.
When $|H| < 0$, $f(a, b)$ is a saddle point.
Hessian of $f(x, y)$:

H = \begin{bmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{bmatrix}

|H| = \frac{\partial^2 f}{\partial x^2}\frac{\partial^2 f}{\partial y^2} - \frac{\partial^2 f}{\partial x \partial y}\frac{\partial^2 f}{\partial y \partial x}

When $\frac{\partial^2 f}{\partial x \partial y}$ is continuous, $\frac{\partial^2 f}{\partial x \partial y} = \frac{\partial^2 f}{\partial y \partial x}$.
When $|H| > 0$, $\frac{\partial^2 f}{\partial x^2}\frac{\partial^2 f}{\partial y^2} > 0$.
Example: Locate the critical point of $f(x, y) = 2xy + 2x - x^2 - 2y^2$ and determine whether it is a maximum, a minimum, or a saddle point.
Solution:
\frac{\partial f}{\partial x} = 2y + 2 - 2x, \quad \frac{\partial f}{\partial y} = 2x - 4y.

Let $\frac{\partial f}{\partial x} = 0$: $-2x + 2y = -2$.
Let $\frac{\partial f}{\partial y} = 0$: $2x - 4y = 0$.
Then $x = 2$ and $y = 1$, i.e., $(2, 1)$ is a critical point.

\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}(2y + 2 - 2x) = -2

\frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}(2x - 4y) = -4

\frac{\partial^2 f}{\partial x \partial y} = \frac{\partial}{\partial x}(2x - 4y) = 2, \quad \text{or}
(Figure: surface plot of $z = x^2 - y^2$, illustrating a saddle point.)
\frac{\partial^2 f}{\partial y \partial x} = \frac{\partial}{\partial y}(2y + 2 - 2x) = 2

|H| = \frac{\partial^2 f}{\partial x^2}\frac{\partial^2 f}{\partial y^2} - \frac{\partial^2 f}{\partial x \partial y}\frac{\partial^2 f}{\partial y \partial x} = (-2)(-4) - 2^2 = 4 > 0

$\frac{\partial^2 f}{\partial x^2} = -2 < 0$, so $(x^*, y^*) = (2, 1)$ is a local maximum.
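The same classification can be checked symbolically. A minimal sketch using SymPy (a third-party library, not part of these notes):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = 2*x*y + 2*x - x**2 - 2*y**2

# First-order conditions: solve df/dx = 0 and df/dy = 0
fx, fy = sp.diff(f, x), sp.diff(f, y)
crit = sp.solve([fx, fy], [x, y])      # {x: 2, y: 1}

# Second-order test: Hessian determinant and f_xx at the critical point
H = sp.hessian(f, (x, y))              # [[-2, 2], [2, -4]]
detH = sp.det(H)                       # 4 > 0 and f_xx = -2 < 0 -> local maximum
print(crit, H, detH)
```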
5 Gradient Method: Steepest Ascent (Descent) Method
Idea: starting from an initial point, find the function maximum (minimum) along the steepest direction so that the shortest search path is required.
Steepest direction: the direction in which the directional derivative is maximum, i.e., the gradient direction.
Directional derivative:

D_h f(x, y) = \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta = \left\langle \begin{bmatrix}\frac{\partial f}{\partial x} & \frac{\partial f}{\partial y}\end{bmatrix}', \begin{bmatrix}\cos\theta & \sin\theta\end{bmatrix}' \right\rangle

$\langle \cdot,\cdot \rangle$: inner product
Gradient:
When $[\cos\theta \ \sin\theta]'$ points in the same direction as $[\frac{\partial f}{\partial x} \ \frac{\partial f}{\partial y}]'$, the directional derivative is maximized. This direction is called the gradient of $f(x, y)$.
The gradient of a 2-D function is represented as $\nabla f(x, y) = \frac{\partial f}{\partial x}\vec{i} + \frac{\partial f}{\partial y}\vec{j}$, or $[\frac{\partial f}{\partial x} \ \frac{\partial f}{\partial y}]'$.
The gradient of an n-D function is represented as $\nabla f(\vec{X}) = \begin{bmatrix}\frac{\partial f}{\partial x_1} & \frac{\partial f}{\partial x_2} & \dots & \frac{\partial f}{\partial x_n}\end{bmatrix}'$, where $\vec{X} = [x_1 \ x_2 \ \dots \ x_n]'$.
Example: $f(x, y) = xy^2$. Use the gradient to evaluate the path of steepest ascent at $(2, 2)$.
Solution:

\frac{\partial f}{\partial x} = y^2, \quad \frac{\partial f}{\partial y} = 2xy.

\frac{\partial f}{\partial x}\Big|_{(2,2)} = 2^2 = 4, \quad \frac{\partial f}{\partial y}\Big|_{(2,2)} = 2 \cdot 2 \cdot 2 = 8

Gradient: $\nabla f(x, y) = \frac{\partial f}{\partial x}\vec{i} + \frac{\partial f}{\partial y}\vec{j} = 4\vec{i} + 8\vec{j}$

\theta = \tan^{-1}\frac{8}{4} = 1.107 \ \text{rad, or } 63.4^\circ.

\cos\theta = \frac{4}{\sqrt{4^2 + 8^2}}, \quad \sin\theta = \frac{8}{\sqrt{4^2 + 8^2}}.

Directional derivative at $(2, 2)$: $\frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta = 4\cos\theta + 8\sin\theta = 8.944$
If $\theta' \ne \theta$, for example, $\theta' = 0.5325$, then

D_{h'} f\big|_{(2,2)} = \frac{\partial f}{\partial x}\cos\theta' + \frac{\partial f}{\partial y}\sin\theta' = 4\cos\theta' + 8\sin\theta' = 7.608 < 8.944
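A quick numeric check of this arithmetic (variable names are illustrative):

```python
import math

# f(x, y) = x*y**2; gradient components at (2, 2)
fx, fy = 2**2, 2 * 2 * 2                              # df/dx = y^2 = 4, df/dy = 2xy = 8

theta = math.atan2(fy, fx)                            # steepest-ascent direction, ~1.107 rad (63.4 deg)
d_max = fx * math.cos(theta) + fy * math.sin(theta)   # 8.944, equals the gradient magnitude
print(theta, d_max)                                   # any other angle gives a smaller derivative
```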
Steepest ascent method
Ideally:
Start from $(x_0, y_0)$ and evaluate the gradient at $(x_0, y_0)$.
Walk a tiny distance along the gradient direction to reach $(x_1, y_1)$.
Re-evaluate the gradient at $(x_1, y_1)$ and repeat the process.
Pros: always follows the steepest direction and walks the shortest distance.
Cons: not practical, due to the continuous re-evaluation of the gradient.
Practically:
Start from $(x_0, y_0)$.
Evaluate the gradient $h$ at $(x_0, y_0)$.
Evaluate $f(x, y)$ along the direction $h$.
Find the maximum function value along this direction, at $(x_1, y_1)$.
Repeat the process until $(x_{i+1}, y_{i+1})$ is close enough to $(x_i, y_i)$.
Find $\vec{X}_{i+1}$ from $\vec{X}_i$:
For an n-D function $f(\vec{X})$,

g(\alpha) = f(\vec{X}_i + \alpha \nabla f|_{\vec{X}_i})

Let $g'(\alpha) = 0$ and find the solution $\alpha = \alpha^*$.
Update $x_{i+1} = x_i + \alpha^* \frac{\partial f}{\partial x}\big|_{(x_i, y_i)}$, $y_{i+1} = y_i + \alpha^* \frac{\partial f}{\partial y}\big|_{(x_i, y_i)}$.
Figure 5: Illustration of steepest ascent
Figure 6: Relationship between an arbitrary direction h and x and y coordinates
Example: $f(x, y) = 2xy + 2x - x^2 - 2y^2$, $(x_0, y_0) = (-1, 1)$.
First iteration:
$x_0 = -1$, $y_0 = 1$.

\frac{\partial f}{\partial x}\Big|_{(-1,1)} = 2y + 2 - 2x\Big|_{(-1,1)} = 6, \quad \frac{\partial f}{\partial y}\Big|_{(-1,1)} = 2x - 4y\Big|_{(-1,1)} = -6

\nabla f = 6\vec{i} - 6\vec{j}

g(\alpha) = f\left(x_0 + \alpha \frac{\partial f}{\partial x}\Big|_{(x_0,y_0)},\ y_0 + \alpha \frac{\partial f}{\partial y}\Big|_{(x_0,y_0)}\right)
= f(-1 + 6\alpha,\ 1 - 6\alpha)
= 2(-1 + 6\alpha)(1 - 6\alpha) + 2(-1 + 6\alpha) - (-1 + 6\alpha)^2 - 2(1 - 6\alpha)^2
= -180\alpha^2 + 72\alpha - 7

$g'(\alpha) = -360\alpha + 72 = 0$, $\alpha^* = 0.2$.
Second iteration:

x_1 = x_0 + \alpha^* \frac{\partial f}{\partial x}\Big|_{(x_0,y_0)} = -1 + 6 \cdot 0.2 = 0.2, \quad y_1 = y_0 + \alpha^* \frac{\partial f}{\partial y}\Big|_{(x_0,y_0)} = 1 - 6 \cdot 0.2 = -0.2

\frac{\partial f}{\partial x}\Big|_{(0.2,-0.2)} = 2y + 2 - 2x\Big|_{(0.2,-0.2)} = 2 \cdot (-0.2) + 2 - 2 \cdot 0.2 = 1.2

\frac{\partial f}{\partial y}\Big|_{(0.2,-0.2)} = 2x - 4y\Big|_{(0.2,-0.2)} = 2 \cdot 0.2 - 4 \cdot (-0.2) = 1.2

\nabla f = 1.2\vec{i} + 1.2\vec{j}

g(\alpha) = f\left(x_1 + \alpha \frac{\partial f}{\partial x}\Big|_{(x_1,y_1)},\ y_1 + \alpha \frac{\partial f}{\partial y}\Big|_{(x_1,y_1)}\right)
= f(0.2 + 1.2\alpha,\ -0.2 + 1.2\alpha)
= 2(0.2 + 1.2\alpha)(-0.2 + 1.2\alpha) + 2(0.2 + 1.2\alpha) - (0.2 + 1.2\alpha)^2 - 2(-0.2 + 1.2\alpha)^2
= -1.44\alpha^2 + 2.88\alpha + 0.2

$g'(\alpha) = -2.88\alpha + 2.88 = 0$, $\alpha^* = 1$.
Third iteration:

x_2 = x_1 + \alpha^* \frac{\partial f}{\partial x}\Big|_{(x_1,y_1)} = 0.2 + 1.2 \cdot 1 = 1.4, \quad y_2 = y_1 + \alpha^* \frac{\partial f}{\partial y}\Big|_{(x_1,y_1)} = -0.2 + 1.2 \cdot 1 = 1

...
$(x^*, y^*) = (2, 1)$
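A Python sketch of this practical steepest-ascent iteration. Instead of solving $g'(\alpha) = 0$ analytically as above, it picks $\alpha^*$ from a simple grid search; the grid range, step, and function names are illustrative choices, not part of the notes:

```python
def f(x, y):
    return 2*x*y + 2*x - x**2 - 2*y**2

def grad(x, y):
    return 2*y + 2 - 2*x, 2*x - 4*y            # (df/dx, df/dy)

def best_alpha(x, y, gx, gy, alphas):
    """Pick alpha maximizing g(alpha) = f(x + alpha*gx, y + alpha*gy) over a grid."""
    return max(alphas, key=lambda a: f(x + a*gx, y + a*gy))

x, y = -1.0, 1.0                                # starting point (x0, y0)
alphas = [i * 0.001 for i in range(2001)]       # alpha in [0, 2], illustrative grid
for _ in range(20):
    gx, gy = grad(x, y)
    a = best_alpha(x, y, gx, gy, alphas)
    x, y = x + a*gx, y + a*gy
print(x, y)                                     # approaches the maximum at (2, 1)
```

For this quadratic $f$ the exact steps $\alpha^* = 0.2$ and $\alpha^* = 1$ lie on the grid, so the first iterates match the hand calculation; in general a finer line search (or the analytic solution of $g'(\alpha) = 0$) would be used.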
6 Newton's Method
f(\vec{X}) \approx f(\vec{X}_i) + \nabla f(\vec{X}_i)'(\vec{X} - \vec{X}_i) + \frac{1}{2}(\vec{X} - \vec{X}_i)' H_i (\vec{X} - \vec{X}_i)

where $H_i$ is the Hessian matrix

H = \begin{bmatrix} \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \dots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \dots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & & & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \dots & \frac{\partial^2 f}{\partial x_n^2} \end{bmatrix}
At the maximum (or minimum) point, $\frac{\partial f(\vec{X})}{\partial x_j} = 0$ for all $j = 1, 2, \dots, n$, or $\nabla f = \vec{0}$. Then

\nabla f(\vec{X}_i) + H_i(\vec{X} - \vec{X}_i) = 0

If $H_i$ is non-singular,

\vec{X} = \vec{X}_i - H_i^{-1} \nabla f(\vec{X}_i)
Iteration: $\vec{X}_{i+1} = \vec{X}_i - H_i^{-1} \nabla f(\vec{X}_i)$
Example: $f(\vec{X}) = 0.5x_1^2 + 2.5x_2^2$

\nabla f(\vec{X}) = \begin{bmatrix} x_1 \\ 5x_2 \end{bmatrix}

H = \begin{bmatrix} \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} \\ \frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 5 \end{bmatrix}

\vec{X}_0 = \begin{bmatrix} 5 \\ 1 \end{bmatrix}, \quad \vec{X}_1 = \vec{X}_0 - H^{-1}\nabla f(\vec{X}_0) = \begin{bmatrix} 5 \\ 1 \end{bmatrix} - \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{5} \end{bmatrix}\begin{bmatrix} 5 \\ 5 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
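A NumPy sketch of this iteration (the gradient and Hessian match the example above; the function names, tolerance, and iteration cap are illustrative):

```python
import numpy as np

def newton_nd(grad, hess, x0, tol=1e-8, max_iter=50):
    """Newton's iteration X_{i+1} = X_i - H^{-1} grad f(X_i)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(hess(x), grad(x))   # solve H p = grad f rather than inverting H
        x_new = x - step
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Example: f(X) = 0.5*x1^2 + 2.5*x2^2
grad = lambda x: np.array([x[0], 5.0 * x[1]])
hess = lambda x: np.array([[1.0, 0.0], [0.0, 5.0]])
print(newton_nd(grad, hess, [5.0, 1.0]))           # reaches the minimum [0, 0] in one step
```

Solving the linear system $H p = \nabla f$ instead of forming $H^{-1}$ is the usual numerically preferred choice; for this quadratic example a single step lands exactly on the optimum, as in the hand calculation.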