Chapter 4: Unconstrained Optimization


Unconstrained optimization problem:  min_x F(x)  or  max_x F(x)

Constrained optimization problem:
    min_x F(x)  or  max_x F(x)
    subject to g(x) = 0
    and/or h(x) < 0 or h(x) > 0

Example: minimize the outer area of a cylinder subject to a fixed volume.

Objective function:
    F(x) = 2πr² + 2πrh,   x = [r, h]'

Constraint: πr²h = V

Outline:
Part I: One-dimensional unconstrained optimization
    Analytical method
    Newton's method
    Golden-section search method
Part II: Multidimensional unconstrained optimization
    Analytical method
    Gradient method: steepest ascent (descent) method
    Newton's method

PART I: One-Dimensional Unconstrained Optimization Techniques

1 Analytical approach (1-D)

min_x F(x)  or  max_x F(x)

Let F'(x) = 0 and solve for x = x*.
    If F''(x*) > 0, F(x*) = min_x F(x), i.e., x* is a local minimum of F(x);
    If F''(x*) < 0, F(x*) = max_x F(x), i.e., x* is a local maximum of F(x);
    If F''(x*) = 0, x* is a critical point of F(x).

Example 1: F(x) = x², F'(x) = 2x = 0, x* = 0. F''(x*) = 2 > 0. Therefore,
F(0) = min_x F(x).

Example 2: F(x) = x³, F'(x) = 3x² = 0, x* = 0. F''(x*) = 0. x* is neither a local
minimum nor a local maximum.

Example 3: F(x) = x⁴, F'(x) = 4x³ = 0, x* = 0. F''(x*) = 0.

In Example 2, F'(x) > 0 when x < x* and F'(x) > 0 when x > x*, so x* is not an
extremum.
In Example 3, x* is a local minimum of F(x): F'(x) < 0 when x < x* and F'(x) > 0
when x > x*.
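
A quick symbolic check of Examples 1-3 (a minimal sketch, assuming the SymPy library is available):

```python
import sympy as sp

x = sp.symbols('x')
for F in (x**2, x**3, x**4):
    Fp = sp.diff(F, x)                      # F'(x)
    Fpp = sp.diff(F, x, 2)                  # F''(x)
    roots = sp.solve(Fp, x)                 # candidates where F'(x) = 0
    print(F, roots, [Fpp.subs(x, r) for r in roots])
# x**2: F''(0) = 2 > 0, so x* = 0 is a local minimum
# x**3 and x**4: F''(0) = 0, so the second-derivative test is inconclusive
```
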
Figure 1: Sign of F'(x) (negative, zero, positive) around local maxima, local minima,
and inflection points

2 Newton's Method

min_x F(x)  or  max_x F(x)

Use x_k to denote the current solution.

    F(x_k + p) = F(x_k) + p F'(x_k) + (p²/2) F''(x_k) + ...
               ≈ F(x_k) + p F'(x_k) + (p²/2) F''(x_k)

    F(x*) = min_x F(x) ≈ min_p F(x_k + p)
          ≈ min_p [ F(x_k) + p F'(x_k) + (p²/2) F''(x_k) ]

Let
    ∂F(x_k + p)/∂p = F'(x_k) + p F''(x_k) = 0
we have
    p = -F'(x_k) / F''(x_k)

Newton's iteration:
    x_{k+1} = x_k + p = x_k - F'(x_k) / F''(x_k)

Example: find the maximum value of f(x) = 2 sin x - x²/10 with an initial guess
of x0 = 2.5.

Solution:
    f'(x) = 2 cos x - 2x/10 = 2 cos x - x/5
    f''(x) = -2 sin x - 1/5

    x_{i+1} = x_i - (2 cos x_i - x_i/5) / (-2 sin x_i - 1/5)

    x0 = 2.5, x1 = 0.995, x2 = 1.469.

Comments:
    Same as the Newton-Raphson method for solving F'(x) = 0.
    Quadratic convergence: |x_{k+1} - x*| ∝ |x_k - x*|².
    May diverge.
    Requires both the first and second derivatives.
    The solution can be either a local minimum or a local maximum.
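
A minimal Python sketch of this 1-D Newton iteration, applied to the example above
(the function and parameter names are illustrative):

```python
import math

def newton_opt_1d(df, d2f, x0, tol=1e-6, max_iter=50):
    """Newton's method for 1-D optimization: x_{k+1} = x_k - f'(x_k)/f''(x_k)."""
    x = x0
    for _ in range(max_iter):
        x_new = x - df(x) / d2f(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Example above: f(x) = 2 sin(x) - x^2/10, starting from x0 = 2.5
df = lambda x: 2 * math.cos(x) - x / 5       # f'(x)
d2f = lambda x: -2 * math.sin(x) - 1 / 5     # f''(x)
print(newton_opt_1d(df, d2f, 2.5))           # ~1.4276; f'' < 0 there, so a local maximum
```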

3 Golden-section search for optimization in 1-D

max_x F(x)   (min_x F(x) is equivalent to max_x [-F(x)])

Assume: only one peak (at x*) in (x_l, x_u).

Steps:
1. Select x_l < x_u.
2. Select two intermediate values x1 and x2 so that x1 = x_l + d, x2 = x_u - d, and
   x1 > x2.
3. Evaluate F(x1) and F(x2) and update the search range:
     If F(x1) < F(x2), then x* < x1. Update x_l = x_l and x_u = x1.
     If F(x1) > F(x2), then x* > x2. Update x_l = x2 and x_u = x_u.
     If F(x1) = F(x2), then x2 < x* < x1. Update x_l = x2 and x_u = x1.
4. Estimate
     x* = x1 if F(x1) > F(x2), and
     x* = x2 if F(x1) < F(x2).

Figure 2: Golden-section search: updating the search range

Calculate ε_a. If ε_a < threshold, end.

    ε_a = |(x*_new - x*_old) / x*_new| × 100%

The choice of d

Any value can be used as long as x1 > x2.

If d is selected appropriately, the number of function evaluations can be minimized.

Figure 3: Golden-section search: the choice of d

d0 = l1,  d1 = l2 = l0 - d0 = l0 - l1.  Therefore, l0 = l1 + l2.

Require d0/l0 = d1/l1.  Then l1/l0 = l2/l1.

l1² = l0 l2 = (l1 + l2) l2.  Dividing by l1²:  1 = l2/l1 + (l2/l1)².

Define r = d0/l0 = d1/l1 = l2/l1.  Then r² + r - 1 = 0, and r = (√5 - 1)/2 ≈ 0.618.

d = r(x_u - x_l) ≈ 0.618(x_u - x_l) is referred to as the golden value.

Relative error:

    ε_a = |(x*_new - x*_old) / x*_new| × 100%

Consider F(x2) < F(x1); that is, the update is x_l = x2 and x_u = x_u.

For case (a), x* > x2 and x* is closer to x2:
    Δx ≈ x1 - x2 = (x_l + d) - (x_u - d)
       = (x_l - x_u) + 2d = (x_l - x_u) + 2r(x_u - x_l)
       = (2r - 1)(x_u - x_l) ≈ 0.236(x_u - x_l)

For case (b), x* > x2 and x* is closer to x_u:
    Δx ≈ x_u - x1 = x_u - (x_l + d) = x_u - x_l - d
       = (x_u - x_l) - r(x_u - x_l) = (1 - r)(x_u - x_l)
       ≈ 0.382(x_u - x_l)

Therefore, the maximum absolute error is (1 - r)(x_u - x_l) ≈ 0.382(x_u - x_l).

    ε_a ≈ |Δx / x*| × 100%
        ≤ (1 - r)(x_u - x_l) / |x*| × 100%
        = 0.382(x_u - x_l) / |x*| × 100%

Example: Find the maximum of f(x) = 2 sin x - x²/10 with x_l = 0 and x_u = 4 as
the starting search range.

Solution:
Iteration 1: x_l = 0, x_u = 4, d = ((√5 - 1)/2)(x_u - x_l) = 2.472, x1 = x_l + d = 2.472,
x2 = x_u - d = 1.528. f(x1) = 0.63, f(x2) = 1.765.
Since f(x2) > f(x1), x* ≈ x2 = 1.528, x_l = x_l = 0 and x_u = x1 = 2.472.

Iteration 2: x_l = 0, x_u = 2.472, d = ((√5 - 1)/2)(x_u - x_l) = 1.528, x1 = x_l + d = 1.528,
x2 = x_u - d = 0.944. f(x1) = 1.765, f(x2) = 1.531.
Since f(x1) > f(x2), x* ≈ x1 = 1.528, x_l = x2 = 0.944 and x_u = x_u = 2.472.
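
A minimal Python sketch of the golden-section search described above, run on the same
example (the stopping tolerance is illustrative):

```python
import math

def golden_section_max(f, xl, xu, tol=1e-4, max_iter=100):
    """Golden-section search for the maximum of a unimodal f on [xl, xu]."""
    r = (math.sqrt(5) - 1) / 2                # golden value, ~0.618
    xopt = xl
    for _ in range(max_iter):
        d = r * (xu - xl)
        x1, x2 = xl + d, xu - d               # interior points, x1 > x2
        if f(x1) > f(x2):
            xl, xopt = x2, x1                 # maximum lies above x2
        else:
            xu, xopt = x1, x2                 # maximum lies below x1
        if (1 - r) * abs((xu - xl) / xopt) * 100 < tol:   # error bound derived above
            break
    return xopt

# Example above: f(x) = 2 sin(x) - x^2/10 on [0, 4]
f = lambda x: 2 * math.sin(x) - x**2 / 10
print(golden_section_max(f, 0, 4))            # ~1.4276
```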

PART II: Multidimensional Unconstrained Optimization

4 Analytical Method

Definitions:
If f(x, y) ≤ f(a, b) for all (x, y) near (a, b), f(a, b) is a local maximum;
If f(x, y) ≥ f(a, b) for all (x, y) near (a, b), f(a, b) is a local minimum.

If f(x, y) has a local maximum or minimum at (a, b), and the first-order partial
derivatives of f(x, y) exist at (a, b), then
    ∂f/∂x |_(a,b) = 0   and   ∂f/∂y |_(a,b) = 0

If
    ∂f/∂x |_(a,b) = 0   and   ∂f/∂y |_(a,b) = 0,
then (a, b) is a critical point or stationary point of f(x, y).

If
    ∂f/∂x |_(a,b) = 0   and   ∂f/∂y |_(a,b) = 0
and the second-order partial derivatives of f(x, y) are continuous, then
    When |H| > 0 and ∂²f/∂x² |_(a,b) < 0, f(a, b) is a local maximum of f(x, y).
    When |H| > 0 and ∂²f/∂x² |_(a,b) > 0, f(a, b) is a local minimum of f(x, y).
    When |H| < 0, (a, b) is a saddle point.

Hessian of f(x, y):
    H = [ ∂²f/∂x²    ∂²f/∂x∂y ]
        [ ∂²f/∂y∂x   ∂²f/∂y²  ]

    |H| = (∂²f/∂x²)(∂²f/∂y²) - (∂²f/∂x∂y)(∂²f/∂y∂x)

When ∂²f/∂x∂y is continuous, ∂²f/∂x∂y = ∂²f/∂y∂x.
When |H| > 0, (∂²f/∂x²)(∂²f/∂y²) > 0, i.e., ∂²f/∂x² and ∂²f/∂y² have the same sign.

Example (saddle point): f(x, y) = x² - y².

    ∂f/∂x = 2x,  ∂f/∂y = -2y.
    Let ∂f/∂x = 0, then x = 0.  Let ∂f/∂y = 0, then y = 0.
    Therefore, (0, 0) is a critical point.

    ∂²f/∂x² = ∂/∂x (2x) = 2,     ∂²f/∂y² = ∂/∂y (-2y) = -2
    ∂²f/∂x∂y = ∂/∂x (-2y) = 0,   ∂²f/∂y∂x = ∂/∂y (2x) = 0

    |H| = (∂²f/∂x²)(∂²f/∂y²) - (∂²f/∂x∂y)(∂²f/∂y∂x) = -4 < 0

Therefore, (x*, y*) = (0, 0) is a saddle point.

Figure 4: Saddle point of z = x² - y²

Example: f(x, y) = 2xy + 2x - x² - 2y², find the optimum of f(x, y).

Solution:
    ∂f/∂x = 2y + 2 - 2x,  ∂f/∂y = 2x - 4y.
    Let ∂f/∂x = 0, then -2x + 2y = -2.
    Let ∂f/∂y = 0, then 2x - 4y = 0.
    Then x = 2 and y = 1, i.e., (2, 1) is a critical point.

    ∂²f/∂x² = ∂/∂x (2y + 2 - 2x) = -2
    ∂²f/∂y² = ∂/∂y (2x - 4y) = -4
    ∂²f/∂x∂y = ∂/∂x (2x - 4y) = 2,  or  ∂²f/∂y∂x = ∂/∂y (2y + 2 - 2x) = 2

    |H| = (∂²f/∂x²)(∂²f/∂y²) - (∂²f/∂x∂y)(∂²f/∂y∂x) = (-2)(-4) - 2² = 4 > 0

    ∂²f/∂x² < 0, so (x*, y*) = (2, 1) is a local maximum.
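
The same second-derivative test can be verified symbolically; a short sketch assuming
the SymPy library is available:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = 2*x*y + 2*x - x**2 - 2*y**2

fx, fy = sp.diff(f, x), sp.diff(f, y)
crit = sp.solve([fx, fy], [x, y])       # {x: 2, y: 1}

H = sp.hessian(f, (x, y))               # [[-2, 2], [2, -4]]
print(crit, H.det(), H[0, 0])           # |H| = 4 > 0 and fxx = -2 < 0 -> local maximum
```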

5 Steepest Ascent (Descent) Method

Idea: starting from an initial point, find the function maximum (minimum) along the
steepest direction so that the shortest search time is required.

Steepest direction: the direction in which the directional derivative is maximum,
i.e., the gradient direction.

Directional derivative:
    D_h f(x, y) = (∂f/∂x) cos θ + (∂f/∂y) sin θ = < [∂f/∂x  ∂f/∂y]', [cos θ  sin θ]' >
where < , > denotes the inner product.

Gradient

When [∂f/∂x  ∂f/∂y]' points in the same direction as [cos θ  sin θ]', the directional
derivative is maximized. This direction is called the gradient of f(x, y).

The gradient of a 2-D function is written as ∇f(x, y) = (∂f/∂x) i + (∂f/∂y) j, or
[∂f/∂x  ∂f/∂y]'.

The gradient of an n-D function is written as ∇f(X) = [∂f/∂x1  ∂f/∂x2  ...  ∂f/∂xn]',
where X = [x1  x2  ...  xn]'.

Example: f(x, y) = xy². Use the gradient to evaluate the path of steepest ascent
at (2, 2).

Solution:
    ∂f/∂x = y²,  ∂f/∂y = 2xy.
    ∂f/∂x |_(2,2) = 2² = 4,   ∂f/∂y |_(2,2) = 2·2·2 = 8

    Gradient: ∇f(x, y) = (∂f/∂x) i + (∂f/∂y) j = 4i + 8j

    θ = tan⁻¹(8/4) = 1.107 rad, or 63.4°.
    cos θ = 4/√(4² + 8²),  sin θ = 8/√(4² + 8²).

    Directional derivative at (2, 2): (∂f/∂x) cos θ + (∂f/∂y) sin θ = 4 cos θ + 8 sin θ = 8.944

If θ' ≠ θ, for example θ' = 0.5535, then
    D_h' f |_(2,2) = (∂f/∂x) cos θ' + (∂f/∂y) sin θ' = 4 cos θ' + 8 sin θ' = 7.608 < 8.944
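
A small numeric check of this comparison, using the gradient [4, 8] from the example
above:

```python
import numpy as np

grad = np.array([4.0, 8.0])                  # gradient of f(x, y) = x*y^2 at (2, 2)
theta = np.arctan2(grad[1], grad[0])         # gradient direction, ~1.107 rad

for t in (theta, 0.5535):                    # gradient direction vs. another angle
    h = np.array([np.cos(t), np.sin(t)])     # unit vector at angle t
    print(t, grad @ h)                       # directional derivative: 8.944 vs. ~7.608
```
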
Steepest ascent method

Ideally:
    Start from (x0, y0). Evaluate the gradient at (x0, y0).
    Walk a tiny distance along the gradient direction to (x1, y1).
    Re-evaluate the gradient at (x1, y1) and repeat the process.
    Pros: always keeps the steepest direction and walks the shortest distance.
    Cons: not practical due to the continuous re-evaluation of the gradient.

Practically:
    Start from (x0, y0).
    Evaluate the gradient (call this direction h) at (x0, y0).
    Evaluate f(x, y) along direction h.
    Find the maximum function value along this direction; call that point (x1, y1).
    Repeat the process until (x_{i+1}, y_{i+1}) is close enough to (x_i, y_i).

Finding X_{i+1} from X_i:
For a 2-D function, evaluate f(x, y) along direction h:

    g(α) = f( x_i + (∂f/∂x)|_(x_i,y_i) α,  y_i + (∂f/∂y)|_(x_i,y_i) α )

where α is the coordinate along the h-axis.

For an n-D function f(X),

    g(α) = f( X_i + ∇f(X_i) α )

Let g'(α) = 0 and find the solution α = α*.

Update:
    x_{i+1} = x_i + (∂f/∂x)|_(x_i,y_i) α*,   y_{i+1} = y_i + (∂f/∂y)|_(x_i,y_i) α*

Figure 5: Illustration of steepest ascent

Figure 6: Relationship between an arbitrary direction h and x and y coordinates

Example: f(x, y) = 2xy + 2x - x² - 2y², starting point (x0, y0) = (-1, 1).

First iteration:
    x0 = -1, y0 = 1.
    ∂f/∂x |_(-1,1) = 2y + 2 - 2x |_(-1,1) = 6,   ∂f/∂y |_(-1,1) = 2x - 4y |_(-1,1) = -6
    ∇f = 6i - 6j

    g(α) = f( x0 + (∂f/∂x)|_(x0,y0) α,  y0 + (∂f/∂y)|_(x0,y0) α )
         = f(-1 + 6α, 1 - 6α)
         = 2(-1 + 6α)(1 - 6α) + 2(-1 + 6α) - (-1 + 6α)² - 2(1 - 6α)²
         = -180α² + 72α - 7

    g'(α) = -360α + 72 = 0,  so α* = 0.2.

Second iteration:
    x1 = x0 + (∂f/∂x)|_(x0,y0) α* = -1 + 6·0.2 = 0.2,
    y1 = y0 + (∂f/∂y)|_(x0,y0) α* = 1 - 6·0.2 = -0.2

    ∂f/∂x |_(0.2,-0.2) = 2y + 2 - 2x |_(0.2,-0.2) = 2·(-0.2) + 2 - 2·0.2 = 1.2,
    ∂f/∂y |_(0.2,-0.2) = 2x - 4y |_(0.2,-0.2) = 2·0.2 - 4·(-0.2) = 1.2

    ∇f = 1.2i + 1.2j

    g(α) = f( x1 + (∂f/∂x)|_(x1,y1) α,  y1 + (∂f/∂y)|_(x1,y1) α )
         = f(0.2 + 1.2α, -0.2 + 1.2α)
         = 2(0.2 + 1.2α)(-0.2 + 1.2α) + 2(0.2 + 1.2α) - (0.2 + 1.2α)² - 2(-0.2 + 1.2α)²
         = -1.44α² + 2.88α + 0.2

    g'(α) = -2.88α + 2.88 = 0,  so α* = 1.

Third iteration:
    x2 = x1 + (∂f/∂x)|_(x1,y1) α* = 0.2 + 1.2·1 = 1.4,
    y2 = y1 + (∂f/∂y)|_(x1,y1) α* = -0.2 + 1.2·1 = 1
    ...
    (x*, y*) = (2, 1)
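
A compact Python sketch of this practical steepest-ascent procedure, run on the same
example; the one-dimensional maximization of g(α) is delegated to SciPy's scalar
minimizer applied to -g(α), which is an assumption rather than part of the original notes:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def steepest_ascent(f, grad, x0, tol=1e-6, max_iter=100):
    """Steepest ascent: maximize f along the gradient direction at each step."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        # g(alpha) = f(x + alpha*grad); maximize it by minimizing its negative
        alpha = minimize_scalar(lambda a: -f(x + a * g)).x
        x = x + alpha * g
    return x

# Example above: f(x, y) = 2xy + 2x - x^2 - 2y^2, starting from (-1, 1)
f = lambda v: 2*v[0]*v[1] + 2*v[0] - v[0]**2 - 2*v[1]**2
grad = lambda v: np.array([2*v[1] + 2 - 2*v[0], 2*v[0] - 4*v[1]])
print(steepest_ascent(f, grad, [-1, 1]))      # converges toward (2, 1)
```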

6 Newton's Method

Extend Newton's method for the 1-D case to the multidimensional case.

Given f(X), approximate f(X) by a second-order Taylor series at X = X_i:

    f(X) ≈ f(X_i) + ∇f(X_i)'(X - X_i) + (1/2)(X - X_i)' H_i (X - X_i)

where H_i is the Hessian matrix

    H = [ ∂²f/∂x1²     ∂²f/∂x1∂x2   ...   ∂²f/∂x1∂xn ]
        [ ∂²f/∂x2∂x1   ∂²f/∂x2²     ...   ∂²f/∂x2∂xn ]
        [ ...                                         ]
        [ ∂²f/∂xn∂x1   ∂²f/∂xn∂x2   ...   ∂²f/∂xn²   ]

At the maximum (or minimum) point, ∂f(X)/∂xj = 0 for all j = 1, 2, ..., n, i.e., ∇f = 0.
Then
    ∇f(X_i) + H_i (X - X_i) = 0

If H_i is non-singular,
    X = X_i - H_i⁻¹ ∇f(X_i)

Iteration:  X_{i+1} = X_i - H_i⁻¹ ∇f(X_i)

Example: f(X) = 0.5 x1² + 2.5 x2²

    ∇f(X) = [ x1,  5 x2 ]'

    H = [ ∂²f/∂x1²     ∂²f/∂x1∂x2 ]  =  [ 1  0 ]
        [ ∂²f/∂x2∂x1   ∂²f/∂x2²   ]     [ 0  5 ]

    X0 = [ 5, 1 ]'

    X1 = X0 - H⁻¹ ∇f(X0) = [ 5, 1 ]' - [ 1  0 ; 0  1/5 ] [ 5, 5 ]' = [ 0, 0 ]'

Comments on Newton's method:

    Converges quadratically near the optimum.
    Sensitive to the initial point.
    Requires matrix inversion.
    Requires first- and second-order derivatives.
