Second-Order Methods: Newton Method and Quasi-Newton Methods
Newton Method
Quasi-Newton Method
Newton Method
Newton's method is a second-order method.
Let x^{(k)} be the current point and let \Delta x = x^{(k+1)} - x^{(k)}. The Taylor expansion of the objective function about x^{(k)} is
f(x^{(k+1)}) = f(x^{(k)}) + \nabla f(x^{(k)})^T \Delta x + \frac{1}{2} \Delta x^T \nabla^2 f(x^{(k)}) \Delta x + O(\|\Delta x\|^3)
The quadratic approximation of f(x) is
\tilde{f}(x^{(k)} + \Delta x) = f(x^{(k)}) + \nabla f(x^{(k)})^T \Delta x + \frac{1}{2} \Delta x^T \nabla^2 f(x^{(k)}) \Delta x
Setting the gradient of this approximation with respect to \Delta x to zero gives its critical point:
\nabla f(x^{(k)}) + \nabla^2 f(x^{(k)}) \Delta x = 0
\Delta x = -[\nabla^2 f(x^{(k)})]^{-1} \nabla f(x^{(k)})
Newton Method
The Newton optimization method is
x^{(k+1)} = x^{(k)} - [\nabla^2 f(x^{(k)})]^{-1} \nabla f(x^{(k)})
If the function f(x) is quadratic, the solution can be
found in exactly one step.
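As an illustration, here is a minimal Python sketch of the Newton iteration above; the function newton_step and the quadratic test objective are assumptions made for this example, not part of the original slides.

```python
import numpy as np

def newton_step(x, grad, hess):
    """One Newton update: x_new = x - [hess(x)]^{-1} grad(x)."""
    # Solve the linear system instead of forming the inverse explicitly.
    return x - np.linalg.solve(hess(x), grad(x))

# Hypothetical quadratic test objective f(x) = 1/2 x^T Q x - b^T x.
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
grad = lambda x: Q @ x - b   # gradient of f
hess = lambda x: Q           # Hessian of f (constant for a quadratic)

x0 = np.zeros(2)
x1 = newton_step(x0, grad, hess)
print(np.allclose(Q @ x1, b))  # True: the minimizer is reached in one step
```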
Quasi-Newton Method
Quasi-Newton methods use a Hessian-like matrix but without calculating
second-order derivatives.
Sometimes, these methods are referred to as the variable metric methods.
Take the general formula:
x^{(k+1)} = x^{(k)} - A^{(k)} \nabla f(x^{(k)})
When A^{(k)} = I (the identity matrix), this reduces to the steepest descent method.
When A^{(k)} = [\nabla^2 f(x^{(k)})]^{-1}, this reduces to the Newton method.
Starting from a positive definite matrix, quasi-Newton methods gradually build up an approximation of the Hessian (or its inverse) using gradient information from previous iterations.
The matrix A is kept positive definite; hence the direction
s^{(k)} = -A^{(k)} \nabla f(x^{(k)})
remains a descent direction.
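To make the general scheme concrete, below is a minimal Python sketch of a quasi-Newton iteration; the names quasi_newton and update_A, and the fixed step size alpha, are illustrative assumptions (a practical implementation would use a line search).

```python
import numpy as np

def quasi_newton(grad, x0, update_A, steps=50, alpha=1.0):
    """Generic quasi-Newton iteration x_{k+1} = x_k - alpha * A_k grad(x_k).

    update_A(A, delta, gamma) returns A_{k+1} from the step
    delta = x_{k+1} - x_k and gradient change gamma = g_{k+1} - g_k.
    """
    x = np.asarray(x0, dtype=float)
    A = np.eye(len(x))              # A^(0) = I, positive definite
    g = grad(x)
    for _ in range(steps):
        s = -A @ g                  # descent direction s^(k)
        x_new = x + alpha * s
        g_new = grad(x_new)
        A = update_A(A, x_new - x, g_new - g)   # plug in DFP, BFGS, ...
        x, g = x_new, g_new
    return x
```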
There are several ways to update the matrix A.
Davidon-Fletcher-Powell Formula
The earliest (and one of the cleverest) schemes for constructing the inverse Hessian was originally proposed by Davidon (1959) and later developed by Fletcher and Powell (1963).
It has the interesting property that, for a quadratic objective, it
simultaneously generates the directions of the conjugate gradient method
while constructing the inverse Hessian.
The method is also referred to as the variable metric method (originally
suggested by Davidon).
A^{(k+1)} = A^{(k)} - \frac{A^{(k)} \gamma^{(k)} (\gamma^{(k)})^T A^{(k)}}{(\gamma^{(k)})^T A^{(k)} \gamma^{(k)}} + \frac{\delta^{(k)} (\delta^{(k)})^T}{(\delta^{(k)})^T \gamma^{(k)}}
where A^{(0)} = I, \delta^{(k)} = x^{(k+1)} - x^{(k)}, and \gamma^{(k)} = \nabla f(x^{(k+1)}) - \nabla f(x^{(k)})
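As a rough illustration, the DFP update above might be written as follows in Python; the name dfp_update is assumed for this sketch, and safeguards (e.g. skipping the update when (\delta^{(k)})^T \gamma^{(k)} is not positive) are omitted. Under these assumptions it can be plugged in as the update_A argument of the generic quasi-Newton sketch shown earlier.

```python
import numpy as np

def dfp_update(A, delta, gamma):
    """DFP update of the inverse-Hessian approximation A.

    delta = x_{k+1} - x_k, gamma = grad f(x_{k+1}) - grad f(x_k).
    """
    Ag = A @ gamma
    return (A
            - np.outer(Ag, Ag) / (gamma @ Ag)            # - A γ γ^T A / (γ^T A γ)
            + np.outer(delta, delta) / (delta @ gamma))   # + δ δ^T / (δ^T γ)
```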
BFGS Method
Broyden-Fletcher-Goldfarb-Shanno.
BFGS is a quasi-Newton method for solving nonlinear optimization problems.
It can be viewed as a refinement of gradient descent in which the gradient is preconditioned with curvature information.
It builds an approximate Hessian matrix by analyzing successive gradient vectors,
updating this matrix so that the quasi-Newton (secant) condition holds:
B^{(k+1)} (x^{(k+1)} - x^{(k)}) = \nabla f(x^{(k+1)}) - \nabla f(x^{(k)})
This leads to the update
B^{(k+1)} = B^{(k)} - \frac{B^{(k)} \delta^{(k)} (B^{(k)} \delta^{(k)})^T}{(\delta^{(k)})^T B^{(k)} \delta^{(k)}} + \frac{\gamma^{(k)} (\gamma^{(k)})^T}{(\gamma^{(k)})^T \delta^{(k)}} + \left[(\delta^{(k)})^T B^{(k)} \delta^{(k)}\right] v v^T
where
\gamma^{(k)} = \nabla f(x^{(k+1)}) - \nabla f(x^{(k)}), \quad \delta^{(k)} = x^{(k+1)} - x^{(k)}, \quad v = \frac{\gamma^{(k)}}{(\gamma^{(k)})^T \delta^{(k)}} - \frac{B^{(k)} \delta^{(k)}}{(\delta^{(k)})^T B^{(k)} \delta^{(k)}}
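For illustration, a direct transcription of this update in Python might look like the following; bfgs_update is a name assumed for this sketch, curvature safeguards are omitted, and in practice the search direction is obtained by solving B^{(k)} s = -\nabla f(x^{(k)}) (or by updating the inverse approximation directly).

```python
import numpy as np

def bfgs_update(B, delta, gamma):
    """Update of the Hessian approximation B, transcribed from the formula above.

    delta = x_{k+1} - x_k, gamma = grad f(x_{k+1}) - grad f(x_k).
    The result satisfies the quasi-Newton condition B_new @ delta == gamma.
    """
    Bd = B @ delta
    dBd = delta @ Bd                    # δ^T B δ
    gd = gamma @ delta                  # γ^T δ
    v = gamma / gd - Bd / dBd
    return (B
            - np.outer(Bd, Bd) / dBd        # - B δ (B δ)^T / (δ^T B δ)
            + np.outer(gamma, gamma) / gd   # + γ γ^T / (γ^T δ)
            + dBd * np.outer(v, v))         # + (δ^T B δ) v v^T
```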
Algorithm
1.