(NLP)
Minimize f(x)
subject to x ∈ Ω.

Definition (Local minimum)
A point x* ∈ Ω is a local minimizer of f over Ω if there exists ε > 0 such that
f(x) ≥ f(x*), ∀x ∈ Ω ∩ B(x*, ε) \ {x*}.
Definition (Global minimum)
A point x* ∈ Ω is a global minimizer of f over Ω if
f(x) ≥ f(x*), ∀x ∈ Ω \ {x*}.
The directional derivative of f at x in the direction d is
f′(x, d) = lim_{α→0} [f(x + αd) − f(x)]/α.
Remark: f′(x, d) = (d/dα) f(x + αd)|_{α=0} = ∇f(x)ᵀd.
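As a numeric sanity check of the remark (the function f below is an arbitrary smooth sample, not from the text), the difference quotient should approach ∇f(x)ᵀd:

```python
# Numerical check: f'(x, d) = lim_{a->0} [f(x + a d) - f(x)] / a  equals  grad f(x)^T d.
def f(x):
    return x[0]**2 + 3.0 * x[0] * x[1]            # sample smooth function

def grad_f(x):
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]  # its analytic gradient

def directional_derivative(f, x, d, alpha=1e-6):
    # forward-difference approximation of the limit
    xa = [xi + alpha * di for xi, di in zip(x, d)]
    return (f(xa) - f(x)) / alpha

x, d = [1.0, 2.0], [0.5, -1.0]
fd = directional_derivative(f, x, d)
exact = sum(g * di for g, di in zip(grad_f(x), d))  # grad f(x)^T d = 1.0 here
print(fd, exact)
```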
Theorem
Assume that a function f : Rⁿ → R is twice continuously differentiable. Then the Taylor series expansion of f about the point x◦ ∈ Rⁿ is
f(x) = f(x◦) + ∇f(x◦)ᵀ(x − x◦) + (1/2)(x − x◦)ᵀ∇²f(x◦)(x − x◦) + o(‖x − x◦‖²),
where o(‖x − x◦‖²)/‖x − x◦‖² → 0 as x → x◦.
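The little-o behaviour of the remainder can be observed numerically. A sketch with an arbitrary sample f(x) = eˣ about x◦ = 0 (not an example from the text): the ratio |f − T₂|/h² should shrink as h → 0.

```python
# Check that the second-order Taylor remainder is o(h^2):
# for f(x) = e^x at x0 = 0, the second-order Taylor polynomial is T2(h) = 1 + h + h^2/2.
import math

def remainder_ratio(h):
    # |f(x0 + h) - T2(h)| / h^2 should tend to 0 as h -> 0
    return abs(math.exp(h) - (1.0 + h + h * h / 2.0)) / (h * h)

ratios = [remainder_ratio(10.0**(-k)) for k in range(1, 5)]
print(ratios)  # strictly decreasing toward 0 (roughly h/6 for this f)
```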
(P) minimize f(x) = x²
subject to Ω = {x ∈ R : x ≥ 1}.
Let x̄ = 2. Then d f′(x̄) < 0 for any feasible direction d ∈ [−1, 0), so x̄ = 2 is not a local minimizer.
Definition (Quadratic form)
An expression of the form
Q = Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ aij xi xj,
where the aij are real and aij = aji, 1 ≤ i, j ≤ n, is said to be a real quadratic
form in n variables x1, x2, . . . , xn.
For example:
12x 2 is a quadratic form of one variable x .
x 2 + 7xy + y 2 is a quadratic form of two variables x and y .
3x 2 + 11xy + 17yz + 8xz + 2y 2 + z 2 is a quadratic form of three
variables x , y and z.
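Every quadratic form can be written as vᵀAv with a symmetric matrix A: the diagonal entry aii is the coefficient of xi², and aij = aji is half the coefficient of the cross term xixj. A small Python sketch using the three-variable example above:

```python
# The quadratic form 3x^2 + 11xy + 17yz + 8xz + 2y^2 + z^2 written as v^T A v,
# with off-diagonal entries of A equal to half the cross-term coefficients.
A = [[3.0, 5.5, 4.0],
     [5.5, 2.0, 8.5],
     [4.0, 8.5, 1.0]]

def quad_form(A, v):
    # v^T A v, written out with plain loops
    n = len(v)
    return sum(A[i][j] * v[i] * v[j] for i in range(n) for j in range(n))

def q_direct(x, y, z):
    return 3*x*x + 11*x*y + 17*y*z + 8*x*z + 2*y*y + z*z

v = (1.0, -2.0, 3.0)
print(quad_form(A, v), q_direct(*v))  # both -80.0
```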
Example
Let Q = x² + 2y² + 4z² + 2xy − 4yz − 2xz be a real quadratic form. Verify
whether Q is positive definite or not.
Solution: We can rewrite the given quadratic form in the following manner:
Q = (x + y − z)² + (y − z)² + 2z².
Since Q is a sum of squares that vanishes only at x = y = z = 0, Q is positive definite.
In matrix notation Q = yᵀAy: a quadratic form is positive definite if
yᵀAy > 0 for all y ≠ 0,
and positive semidefinite if
yᵀAy ≥ 0 for all y.
Example
Verify whether the matrix
    [ 2   2  −2 ]
A = [ 2   5  −4 ]
    [−2  −4   5 ]
is positive definite or not.
Solution:
The characteristic equation of the matrix A is given by:
det(A − λI) = 0
Solving this equation, we obtain:
λ = 1, 1, 10
Since all the eigenvalues of this matrix are positive, it is a
positive definite matrix.
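An eigenvalue-free cross-check: by Sylvester's criterion, a symmetric matrix is positive definite iff all its leading principal minors are positive. A small Python sketch for the matrix A above, with the determinants rolled by hand:

```python
# Sylvester's criterion: a symmetric matrix is positive definite
# iff all of its leading principal minors are positive.
A = [[ 2,  2, -2],
     [ 2,  5, -4],
     [-2, -4,  5]]

def det(M):
    # determinant by cofactor expansion along the first row (fine for small matrices)
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

# leading principal minors: top-left 1x1, 2x2, 3x3 determinants
minors = [det([row[:k] for row in A[:k]]) for k in range(1, 4)]
print(minors)                      # [2, 6, 10]
print(all(m > 0 for m in minors))  # True -> positive definite
```

Note that the product of the minors' last entry, det(A) = 10, matches the product of the eigenvalues 1 · 1 · 10 found above.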
A symmetric matrix is positive definite if and only if all of its pivots (in
Gaussian elimination without row exchanges) are positive.
Positive definite
⇓
All eigenvalues positive
⇓
Inverse exists
⇕
Columns (rows) linearly independent
Solution:
The characteristic equation of the matrix A is given by:
det(A − λI) = 0
Solving this equation, we obtain:
λ = 1, −2, 4
Since this matrix has both positive and negative eigenvalues, it is an
indefinite matrix.
If all second partial derivatives of f exist and are continuous over the
domain of the function, then the Hessian matrix F(x̄) (or ∇²f(x̄)) of
f at x̄ ∈ Rⁿ is an n × n matrix, usually defined and arranged as follows:
Fij ≡ ∂²f/∂xi∂xj    (1)
Example
Let f(x, y) = x²/y. Then

F(x, y) = [  2/y     −2x/y² ]
          [ −2x/y²   2x²/y³ ]
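The entries of this Hessian can be cross-checked by central finite differences. At the sample point (x, y) = (1, 2) the analytic Hessian above evaluates to [[1, −0.5], [−0.5, 0.25]]; a sketch:

```python
# Finite-difference check of the Hessian of f(x, y) = x^2 / y at (1, 2).
def f(x, y):
    return x * x / y

def hessian_fd(f, x, y, h=1e-4):
    # central second differences for f_xx, f_yy, and the mixed partial f_xy
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return [[fxx, fxy], [fxy, fyy]]

H = hessian_fd(f, 1.0, 2.0)
exact = [[2 / 2.0, -2 * 1.0 / 2.0**2],
         [-2 * 1.0 / 2.0**2, 2 * 1.0**2 / 2.0**3]]  # [[1.0, -0.5], [-0.5, 0.25]]
print(H)
print(exact)
```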
Theorem (Second-order necessary condition)
Let x* be a local minimizer of f over Ω and let d be a feasible direction at x*. If
dᵀ∇f(x*) = 0,
then we have
dᵀF(x*)d ≥ 0,
where F is the Hessian of f.
Proof. Suppose, on the contrary, that there is a feasible direction d at x* such
that
d T ∇f (x ∗ ) = 0
and
d T F (x ∗ )d < 0.
Define
x (α) = x ∗ + αd ∈ Ω
B.B. Upadhyay Convexity and Its Application March 22, 2021 29 / 73
Define the composite function
φ(α) = f (x (α)).
φ(α) = φ(0) + (α²/2)φ″(0) + o(α²),
where by assumption
φ0 (0) = d T ∇f (x ∗ ) = 0
and
φ00 (0) = d T F (x ∗ )d < 0.
For sufficiently small α,
φ(α) − φ(0) = (α²/2)φ″(0) + o(α²) < 0,
which contradicts the assumption that x* is a local minimizer of f over Ω. Hence, for every feasible direction d with dᵀ∇f(x*) = 0, we must have
φ″(0) = dᵀF(x*)d ≥ 0. □
If S is a convex set and β is a real number, then the set
βS = {x : x = βv, v ∈ S}
is also convex;
If S1 and S2 are convex sets, then the set
S1 + S2 = {x : x = v1 + v2, v1 ∈ S1, v2 ∈ S2}
is also convex;
The intersection of any collection of convex sets is convex.
Examples of convex functions of one variable: x², |x|, eˣ, x.
If f is convex on S, then for any p ≥ 0 the function
θ(x) = pf(x)
is convex on S.
Theorem
For a real-valued function f defined on a convex set S ⊆ Rⁿ to be convex
on S, it is necessary and sufficient that its epigraph
Gf = {(x, ζ) | x ∈ S, ζ ∈ R, f(x) ≤ ζ}
is a convex set.
Theorem
If {fi : i ∈ I} is a family of convex functions on S whose pointwise supremum is finite on S, then
f(x) = sup_{i∈I} fi(x)
is a convex function on S.
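A quick numeric sanity check of this theorem (the affine functions fi below are arbitrary choices, not from the text): the pointwise maximum of affine functions should satisfy the convexity inequality at sampled points.

```python
# Pointwise supremum of convex (here affine) functions is convex:
# check f(l*x + (1-l)*y) <= l*f(x) + (1-l)*f(y) over a grid of samples.
fis = [lambda x: 2*x - 1, lambda x: -x + 3, lambda x: 0.5*x]

def f(x):
    return max(fi(x) for fi in fis)  # f = sup_i f_i (finite max here)

ok = all(
    f(l*x + (1 - l)*y) <= l*f(x) + (1 - l)*f(y) + 1e-12
    for x in [-3.0, 0.0, 2.0, 5.0]
    for y in [-2.0, 1.0, 4.0]
    for l in [0.0, 0.25, 0.5, 0.75, 1.0]
)
print(ok)  # True
```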
Theorem
Let f be a differentiable function defined on a nonempty open convex set
S ⊆ Rⁿ. Then f is convex on S if and only if for every x◦ ∈ S,
f(x) ≥ f(x◦) + ∇f(x◦)ᵀ(x − x◦), ∀x ∈ S,
or, equivalently
f(λx* + (1 − λ)x◦) ≤ λf(x*) + (1 − λ)f(x◦)
< λf(x◦) + (1 − λ)f(x◦) = f(x◦), ∀λ ∈ (0, 1]
(y − x )T ∇2 f (x + α(y − x ))(y − x ) ≥ 0.
Therefore, we have
f (y ) ≥ f (x ) + (y − x )T ∇f (x ).
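The first-order inequality just derived can be sampled numerically. A sketch with an arbitrary convex sample f(x1, x2) = x1² + 2x2² (not from the text):

```python
# First-order characterization of convexity:
# f(y) >= f(x) + grad f(x)^T (y - x) for a convex f; sample f(v) = v1^2 + 2*v2^2.
def f(v):
    return v[0]**2 + 2.0 * v[1]**2

def grad_f(v):
    return [2.0 * v[0], 4.0 * v[1]]

def lower_bound(x, y):
    # the tangent-plane value f(x) + grad f(x)^T (y - x)
    g = grad_f(x)
    return f(x) + sum(gi * (yi - xi) for gi, xi, yi in zip(g, x, y))

points = [(-2.0, 1.0), (0.0, 0.0), (1.5, -3.0), (4.0, 2.0)]
ok = all(f(y) >= lower_bound(x, y) - 1e-12 for x in points for y in points)
print(ok)  # True: the graph lies above every tangent plane
```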
Conversely, suppose to the contrary that there exists x ∈ S such that ∇²f(x)
is not positive semidefinite.
Therefore, there exists d ∈ Rn such that d T ∇2 f (x )d < 0.
By the continuity of the Hessian matrix, there exists a nonzero s ∈ R such that
y = x + sd ∈ S
Then for all points z lying on the line segment joining x and y, we have
dᵀ∇²f(z)d < 0.
By Taylor's theorem, it follows that
f(y) < f(x) + (y − x)ᵀ∇f(x),
which contradicts the convexity of f by the first-order characterization above.
Theorem (Second-order sufficient condition)
If for every feasible direction d at x* we have
dᵀ∇f(x*) ≥ 0,
and for every nonzero feasible direction d with
dᵀ∇f(x*) = 0
we also have
dᵀF(x*)d > 0,
where F is the Hessian of f, then x* is a strict local minimizer of f over Ω.
Nonlinear programming problem with equality constraints
(NPP)
Minimize f (x )
subject to h(x ) = 0
Minimize f (x , y ) = 2x + 3y − 4
subject to h(x , y ) := xy − 6 = 0
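For this example the Lagrange conditions (Lagrange's theorem is stated later in these notes) can be solved by hand: ∇f = (2, 3) and ∇h = (y, x), so 2 + λy = 0, 3 + λx = 0, and xy = 6 force λ = ±1. A sketch verifying the two candidate points:

```python
# Lagrange candidates for: minimize 2x + 3y - 4 subject to xy = 6.
# Stationarity grad f + lam * grad h = 0 gives 2 + lam*y = 0 and 3 + lam*x = 0,
# so x = -3/lam, y = -2/lam, and xy = 6/lam^2 = 6  =>  lam = +1 or -1.
def f(x, y):
    return 2*x + 3*y - 4

candidates = []
for lam in (1.0, -1.0):
    x, y = -3.0 / lam, -2.0 / lam
    assert abs(x * y - 6.0) < 1e-12   # feasibility: h(x, y) = 0
    assert abs(2 + lam * y) < 1e-12   # stationarity in x
    assert abs(3 + lam * x) < 1e-12   # stationarity in y
    candidates.append(((x, y), f(x, y)))

print(candidates)  # [((-3.0, -2.0), -16.0), ((3.0, 2.0), 8.0)]
```

On the branch x > 0 the restricted objective 2x + 18/x − 4 is minimized at x = 3, so (3, 2) is a local minimizer with value 8, while (−3, −2) is a local maximizer on the branch x < 0 (the objective is unbounded below along that branch).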
Definition (Regular point)
A point x* satisfying the constraints
h1(x*) = 0, . . . , hm(x*) = 0
is said to be a regular point of the constraints if the gradient vectors
∇h1(x*), . . . , ∇hm(x*)
are linearly independent. Stacking the row vectors Dhi(x*) = ∇hi(x*)ᵀ gives the Jacobian matrix

          [ Dh1(x*) ]   [ ∇h1(x*)ᵀ ]
Dh(x*) =  [    ⋮    ] = [     ⋮    ]
          [ Dhm(x*) ]   [ ∇hm(x*)ᵀ ]

Then x* is regular if and only if rank Dh(x*) = m, i.e. the Jacobian
matrix is of full rank.
Definition (Surface)
The set of equality constraints h1 (x ) = 0, . . . , hm (x ) = 0, describes a
surface
S = {x ∈ Rⁿ : h1(x) = 0, . . . , hm(x) = 0}.
Assuming the points in S are regular, the dimension of the surface S is
n − m.
Definition (Tangent space)
The tangent space at a point x* on the surface
S := {x ∈ Rⁿ : h(x) = 0}
is the set
T(x*) := {y ∈ Rⁿ : Dh(x*)y = 0},
that is, T(x*) is the null space of Dh(x*):
T(x*) = N(Dh(x*)).
The tangent plane at x* is the translate
TP(x*) = T(x*) + x* = {x + x* : x ∈ T(x*)}.
Suppose y = ẋ(t*) for a differentiable curve {x(t) : t ∈ (a, b)}
in S such that
x(t*) = x*
and
ẋ(t*) = y
for some t* ∈ (a, b). Then
h(x(t)) = 0 for all t ∈ (a, b);
differentiating at t = t* gives
Dh(x*)y = 0,
and hence y ∈ T(x*).
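A concrete sketch with the earlier constraint h(x, y) = xy − 6 (the particular curve below is one hypothetical choice lying in S, not taken from the text): the velocity of any such curve through x* must be annihilated by Dh(x*).

```python
# Tangent space T(x*) = {y : Dh(x*) y = 0} for h(x, y) = x*y - 6 at x* = (3, 2).
# Dh(x, y) = [y, x], so T(x*) = {d : 2*d1 + 3*d2 = 0}.
def Dh(x, y):
    return [y, x]

xstar = (3.0, 2.0)

# A sample curve x(t) = (3t, 2/t) lies on S = {h = 0} for all t > 0
# and passes through x* at t = 1; its velocity there must lie in T(x*).
def curve(t):
    return (3.0 * t, 2.0 / t)

def velocity(t):
    return (3.0, -2.0 / t**2)   # derivative of the curve

xt = curve(1.0)                 # (3.0, 2.0) = x*
v = velocity(1.0)               # (3.0, -2.0)
residual = Dh(*xstar)[0] * v[0] + Dh(*xstar)[1] * v[1]  # Dh(x*) v
print(xt, v, residual)          # residual is 0.0, so v is tangent at x*
```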
For the converse part, we use the implicit function theorem.
Definition (Normal space)
The normal space at a point x* on the surface
S = {x ∈ Rⁿ : h(x) = 0}
is the set
N(x*) = {x ∈ Rⁿ : x = Dh(x*)ᵀz for some z ∈ Rᵐ},
that is, the range of Dh(x*)ᵀ:
N(x*) = R(Dh(x*)ᵀ).
The tangent and normal spaces are orthogonal complements:
T(x*) = N(x*)⊥ and T(x*)⊥ = N(x*).
Lagrange’s Theorem
Let x* be a local minimizer of (NPP). Assume that x* is a regular point.
Then there exists a vector λ* ∈ Rᵐ such that
∇f(x*) + ∇h(x*)λ* = 0.
Proof. It suffices to show that, for some λ* ∈ Rᵐ,
∇f(x*) = −∇h(x*)λ* = −Dh(x*)ᵀλ*,
that is,
∇f(x*) ∈ R(Dh(x*)ᵀ) = N(x*) = T(x*)⊥,
i.e. ∇f(x*) is orthogonal to the tangent space T(x*).
Suppose, on the contrary, that
∇f(x*) ∉ N(x*).
Then there exists y ∈ T(x*) with yᵀ∇f(x*) ≠ 0. Since x* is regular, we can choose a curve
{x(t) : t ∈ (a, b)}
on the surface
{x : h(x) = 0}
such that for all t ∈ (a, b),
h(x(t)) = 0,
with x(t*) = x* and ẋ(t*) = y. Define
φ(t) = f(x(t)).
Since x* is a local minimizer, t* is a local minimizer of φ, so
0 = φ′(t*) = ∇f(x*)ᵀẋ(t*) = ∇f(x*)ᵀy ≠ 0,
a contradiction. Hence ∇f(x*) ∈ N(x*). □
L(x, λ) ≜ f(x) + λᵀh(x)
DL(x ∗ , λ∗ ) = 0T
for some λ∗ , where the derivative operation D is with respect to the entire
argument [x T , λT ]T .
Consider maximizing a quadratic form over the ellipsoid xᵀPx = 1, where Q is symmetric and P is symmetric positive definite:
Maximize xᵀQx
subject to xᵀPx = 1.
L(x , λ) = x T Qx + λ(1 − x T Px )
Dx L(x , λ) = 0 ⇒ 2x T Q − 2λx T P = 0T
Dλ L(x , λ) = 0 ⇒ 1 − x T Px = 0.
(λP − Q)x = 0
(λIn − P −1 Q)x = 0
Multiplying the stationarity condition on the right by x* and using x*ᵀPx* = 1 gives
λ* = x*ᵀQx*.
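A minimal sketch of the special case P = I, where the constraint is xᵀx = 1 and (λI − Q)x = 0 says λ is an eigenvalue of Q, so the maximum value λ* is the largest eigenvalue. The 2×2 matrix Q below is an arbitrary sample, solved by hand via the characteristic polynomial:

```python
# Maximize x^T Q x subject to x^T x = 1 (P = I): stationary points are
# eigenvectors of Q, and the maximum value is the largest eigenvalue.
import math

Q = [[2.0, 1.0],
     [1.0, 2.0]]   # sample symmetric matrix; eigenvalues are 3 and 1

# Largest eigenvalue of a symmetric 2x2 matrix from trace and determinant
tr = Q[0][0] + Q[1][1]
det = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
lam_max = (tr + math.sqrt(tr * tr - 4 * det)) / 2   # 3.0

# The corresponding unit eigenvector for this Q is (1, 1)/sqrt(2)
xstar = (1 / math.sqrt(2), 1 / math.sqrt(2))

def quad(Q, x):
    return sum(Q[i][j] * x[i] * x[j] for i in range(2) for j in range(2))

print(lam_max, quad(Q, xstar))  # both approximately 3: lam* = x*^T Q x*
```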