
ELE539A: Optimization of Communication Systems

Lecture 2: Convex Optimization and Lagrange Duality

Professor M. Chiang
Electrical Engineering Department, Princeton University

February 7, 2007
Lecture Outline

• Convex optimization

• Optimality condition

• Lagrange dual problem

• Interpretations

• KKT optimality condition

• Sensitivity analysis

Thanks: Stephen Boyd (some materials and graphs from Boyd and
Vandenberghe)
Convex Optimization

A convex optimization problem with variables x:

minimize f0 (x)
subject to fi (x) ≤ 0, i = 1, 2, . . . , m
aiT x = bi , i = 1, 2, . . . , p

where f0 , f1 , . . . , fm are convex functions.

• Minimize convex objective function (or maximize concave objective function)

• Upper bound inequality constraints on convex functions (⇒ Constraint set is
convex)

• Equality constraints must be affine


Convex Optimization

• Epigraph form:

minimize t
subject to f0 (x) − t ≤ 0
fi (x) ≤ 0, i = 1, 2, . . . , m
aiT x = bi , i = 1, 2, . . . , p

• Not in convex optimization form:

minimize x1² + x2²
subject to x1 /(1 + x2²) ≤ 0
(x1 + x2 )² = 0

Now transformed into a convex optimization problem:

minimize x1² + x2²
subject to x1 ≤ 0
x1 + x2 = 0
Locally Optimal ⇒ Globally Optimal

Given x is locally optimal for a convex optimization problem, i.e., x is
feasible and for some R > 0,

f0 (x) = inf{f0 (z) | z is feasible, ‖z − x‖₂ ≤ R}

Suppose x is not globally optimal, i.e., there is a feasible y such that

f0 (y) < f0 (x)

Then ‖y − x‖₂ > R (otherwise local optimality would already be contradicted),
so we can construct a point z = (1 − θ)x + θy with θ = R/(2‖y − x‖₂). By
convexity of the feasible set, z is feasible, and ‖z − x‖₂ = R/2 ≤ R. By
convexity of f0 , we have

f0 (z) ≤ (1 − θ)f0 (x) + θf0 (y) < f0 (x)

which contradicts local optimality of x

Therefore, there exists no feasible y such that f0 (y) < f0 (x)


Optimality Condition for Differentiable f0

x is optimal for a convex optimization problem iff x is feasible and for all
feasible y:
∇f0 (x)T (y − x) ≥ 0

−∇f0 (x) defines a supporting hyperplane to the feasible set at x

Unconstrained convex optimization: condition reduces to:

∇f0 (x) = 0

Proof: take y = x − t∇f0 (x) where t ∈ R+ . For small enough t, y is feasible,
so ∇f0 (x)T (y − x) = −t‖∇f0 (x)‖₂² ≥ 0. Thus ∇f0 (x) = 0
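
As a quick numerical illustration (an assumed one-dimensional toy problem, not
from the lecture): minimizing (x − 2)² over the feasible set [0, 1] gives
x∗ = 1, and the condition ∇f0 (x∗ )(y − x∗ ) ≥ 0 can be checked at sampled
feasible points y:

    import numpy as np

    # assumed toy problem: minimize (x - 2)^2 over [0, 1], optimum at x* = 1
    x_star = 1.0
    grad = 2.0 * (x_star - 2.0)            # gradient of (x - 2)^2 at x* = 1, equals -2
    ys = np.linspace(0.0, 1.0, 11)         # sample points from the feasible set [0, 1]
    print(np.all(grad * (ys - x_star) >= -1e-12))   # prints True
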
Unconstrained Quadratic Optimization

Minimize f0 (x) = (1/2) xT P x + qT x + r

P is positive semidefinite. So it’s a convex optimization problem

x minimizes f0 iff x satisfies this linear equation:

∇f0 (x) = P x + q = 0

• If q ∉ R(P ), no solution: f0 is unbounded below

• If q ∈ R(P ) and P ≻ 0, there is a unique minimizer x∗ = −P −1 q

• If q ∈ R(P ) and P is singular, set of optimal x: −P † q + N (P )
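
A minimal numerical sketch of the three cases above, using numpy; the matrix P
and vector q are assumed example data, not from the lecture:

    import numpy as np

    P = np.array([[2.0, 0.0], [0.0, 0.0]])   # a singular PSD matrix (assumed example)
    q = np.array([-4.0, 0.0])                # chosen to lie in the range of P

    # Check whether q is in R(P) via a least-squares solve of P x = -q
    x_ls, _, rank, _ = np.linalg.lstsq(P, -q, rcond=None)
    in_range = np.allclose(P @ x_ls, -q)

    if not in_range:
        print("q not in R(P): f0 is unbounded below")
    elif rank == P.shape[0]:
        print("unique minimizer:", np.linalg.solve(P, -q))
    else:
        # Singular case: the optimal set is -P^+ q + N(P); report the pseudoinverse solution
        print("a minimizer:", -np.linalg.pinv(P) @ q)
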


Duality Mentality

Bound or solve an optimization problem via a different optimization problem!

We’ll develop the basic Lagrange duality theory for a general optimization
problem, then specialize for convex optimization
Lagrange Dual Function

An optimization problem in standard form:

minimize f0 (x)
subject to fi (x) ≤ 0, i = 1, 2, . . . , m
hi (x) = 0, i = 1, 2, . . . , p

Variables: x ∈ Rn . Assume nonempty feasible set

Optimal value: p∗ . Optimizer: x∗

Idea: augment objective with a weighted sum of constraints


Lagrangian: L(x, λ, ν) = f0 (x) + Σi λi fi (x) + Σi νi hi (x)

Lagrange multipliers (dual variables): λ ≽ 0, ν

Lagrange dual function: g(λ, ν) = inf x L(x, λ, ν)


Lower Bound on Optimal Value

Claim: g(λ, ν) ≤ p∗ , ∀λ ≽ 0, ν

Proof: Consider feasible x̃:

L(x̃, λ, ν) = f0 (x̃) + Σi λi fi (x̃) + Σi νi hi (x̃) ≤ f0 (x̃)

since λi fi (x̃) ≤ 0 (as fi (x̃) ≤ 0 and λi ≥ 0) and hi (x̃) = 0

Hence, g(λ, ν) ≤ L(x̃, λ, ν) ≤ f0 (x̃) for all feasible x̃

Therefore, g(λ, ν) ≤ p∗
Lagrange Dual Function and Conjugate Function

• Lagrange dual function g(λ, ν)

• Conjugate function: f ∗ (y) = supx∈dom f (y T x − f (x))

Consider linearly constrained optimization:

minimize f0 (x)
subject to Ax ≼ b
Cx = d

g(λ, ν) = inf x ( f0 (x) + λT (Ax − b) + ν T (Cx − d) )
        = −bT λ − dT ν + inf x ( f0 (x) + (AT λ + C T ν)T x )
        = −bT λ − dT ν − f0∗ (−AT λ − C T ν)


Example

We’ll use the simplest version of entropy maximization as our example for the
rest of this lecture on duality. Entropy maximization is an important basic
problem in information theory:

minimize f0 (x) = Σi xi log xi
subject to Ax ≼ b
1T x = 1

Since the conjugate function of u log u is exp(y − 1), and f0 is a separable
sum, we have

f0∗ (y) = Σi exp(yi − 1)

Therefore, the dual function of entropy maximization is

g(λ, ν) = −bT λ − ν − exp(−ν − 1) Σi exp(−aiT λ)

where ai are the columns of A
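
As a sanity check (not part of the lecture; A, b, λ, ν below are assumed
example data), the sketch evaluates the closed-form dual function and compares
it with a direct numerical infimum of the Lagrangian over x > 0 using scipy:

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)
    A = rng.normal(size=(3, 4))        # assumed data: m = 3 inequality constraints, n = 4
    b = rng.normal(size=3) + 2.0
    lam = np.array([0.1, 0.5, 0.2])    # any lambda >= 0
    nu = 0.3                           # any nu

    def lagrangian(x):
        # L(x, lam, nu) = sum_i x_i log x_i + lam^T (A x - b) + nu (1^T x - 1)
        return np.sum(x * np.log(x)) + lam @ (A @ x - b) + nu * (np.sum(x) - 1.0)

    # Closed form: g = -b^T lam - nu - exp(-nu - 1) sum_i exp(-a_i^T lam), a_i = columns of A
    g_closed = -b @ lam - nu - np.exp(-nu - 1.0) * np.sum(np.exp(-(A.T @ lam)))

    # Direct numerical infimum of the Lagrangian over x > 0
    res = minimize(lagrangian, x0=np.full(4, 0.25), bounds=[(1e-9, None)] * 4)

    print(g_closed, res.fun)           # the two values should agree closely
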


Lagrange Dual Problem

Lower bound from Lagrange dual function depends on (λ, ν). What’s the
best lower bound that can be obtained from Lagrange dual function?

maximize g(λ, ν)
subject to λ ≽ 0

This is the Lagrange dual problem with dual variables (λ, ν)

Always a convex optimization problem! (The dual objective is always a concave
function since it is the pointwise infimum of a family of affine functions of
(λ, ν))

Denote the optimal value of Lagrange dual problem by d∗


Weak Duality

What’s the relationship between d∗ and p∗ ?

Weak duality always holds (even if primal problem is not convex):

d∗ ≤ p∗

Optimal duality gap: p∗ − d∗

Efficient generation of lower bounds through (convex) dual problem


Strong Duality

Strong duality (zero optimal duality gap):

d∗ = p∗

If strong duality holds, solving dual is ‘equivalent’ to solving primal. But
strong duality does not always hold

Convexity and constraint qualifications ⇒ Strong duality

A simple constraint qualification: Slater’s condition (there exists a strictly
feasible point, i.e., a feasible x with fi (x) < 0 for all non-affine fi )

Another reason why convex optimization is ‘easy’


Example

Primal optimization problem (variables x):

minimize f0 (x) = Σi xi log xi
subject to Ax ≼ b
1T x = 1

Dual optimization problem (variables λ, ν):

maximize −bT λ − ν − exp(−ν − 1) Σi exp(−aiT λ)
subject to λ ≽ 0

Analytically maximize over the unconstrained ν ⇒ Simplified dual
optimization problem (variables λ):

maximize −bT λ − log Σi exp(−aiT λ)
subject to λ ≽ 0

Strong duality holds
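
A small sketch (assumed toy data: n = 3 and a single inequality x1 ≤ 0.2) that
solves the primal problem and the simplified dual with scipy and checks that
the two optimal values coincide:

    import numpy as np
    from scipy.optimize import minimize

    A = np.array([[1.0, 0.0, 0.0]])    # one inequality constraint: x1 <= 0.2 (assumed toy data)
    b = np.array([0.2])
    n = 3

    def primal_obj(x):
        return np.sum(x * np.log(x))   # negative entropy

    primal = minimize(primal_obj, x0=np.full(n, 1.0 / n), method="SLSQP",
                      bounds=[(1e-9, None)] * n,
                      constraints=[{"type": "ineq", "fun": lambda x: b - A @ x},
                                   {"type": "eq", "fun": lambda x: np.sum(x) - 1.0}])

    def neg_simplified_dual(lam):
        # negative of -b^T lam - log sum_i exp(-a_i^T lam), to be minimized
        return b @ lam + np.log(np.sum(np.exp(-(A.T @ lam))))

    dual = minimize(neg_simplified_dual, x0=np.zeros(1), bounds=[(0.0, None)])

    print("p* =", primal.fun, " d* =", -dual.fun)   # both should be about -1.055
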


Saddle Point Interpretation

Assume no equality constraints. We can express the primal optimal value as

p∗ = inf x sup λ≽0 L(x, λ)

(since sup λ≽0 L(x, λ) equals f0 (x) if x is feasible and +∞ otherwise)

By definition of the dual optimal value:

d∗ = sup λ≽0 inf x L(x, λ)

Weak duality (max-min inequality):

sup λ≽0 inf x L(x, λ) ≤ inf x sup λ≽0 L(x, λ)

Strong duality (saddle point property):

sup λ≽0 inf x L(x, λ) = inf x sup λ≽0 L(x, λ)
Economics Interpretation

• Primal objective: cost of operation

• Primal constraints: can be violated

• Dual variables: price for violating the corresponding constraint (dollars
per unit violation). For the same price, can sell ‘unused violation’ for
revenue

• Lagrangian: total cost, including payments for constraint violations

• Lagrange dual function: optimal cost as a function of violation prices

• Weak duality: optimal cost when constraints can be violated is less than or
equal to optimal cost when constraints cannot be violated, for any violation
prices

• Duality gap: minimum possible arbitrage advantage

• Strong duality: can price the violations so that there is no arbitrage
advantage
Complementary Slackness

Assume strong duality holds:

f0 (x∗ ) = g(λ∗ , ν ∗ )
        = inf x ( f0 (x) + Σi λ∗i fi (x) + Σi νi∗ hi (x) )
        ≤ f0 (x∗ ) + Σi λ∗i fi (x∗ ) + Σi νi∗ hi (x∗ )
        ≤ f0 (x∗ )

So the two inequalities must hold with equality. This implies:

λ∗i fi (x∗ ) = 0, i = 1, 2, . . . , m

Complementary Slackness Property:

λ∗i > 0 ⇒ fi (x∗ ) = 0
fi (x∗ ) < 0 ⇒ λ∗i = 0
KKT Optimality Conditions

Since x∗ minimizes L(x, λ∗ , ν ∗ ) over x, we have

∇f0 (x∗ ) + Σi λ∗i ∇fi (x∗ ) + Σi νi∗ ∇hi (x∗ ) = 0

Karush-Kuhn-Tucker optimality conditions:

fi (x∗ ) ≤ 0, hi (x∗ ) = 0, λ∗i ≥ 0
λ∗i fi (x∗ ) = 0
∇f0 (x∗ ) + Σi λ∗i ∇fi (x∗ ) + Σi νi∗ ∇hi (x∗ ) = 0

• For any optimization problem (with differentiable objective and constraint
functions) for which strong duality holds, the KKT conditions are necessary
for primal-dual optimality

• For a convex optimization problem (with differentiable objective and
constraint functions) satisfying Slater’s condition, the KKT conditions are
also sufficient for primal-dual optimality (useful for theoretical and
numerical purposes)
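
To make the conditions concrete, here is a tiny check on an assumed toy problem
(not from the lecture): minimize x1² + x2² subject to 1 − x1 − x2 ≤ 0, whose
optimum is x∗ = (1/2, 1/2) with multiplier λ∗ = 1:

    import numpy as np

    x_star = np.array([0.5, 0.5])            # known optimum of the assumed toy problem
    lam_star = 1.0                           # known optimal multiplier

    f1 = 1.0 - np.sum(x_star)                # inequality constraint value f1(x*)
    grad_f0 = 2.0 * x_star                   # gradient of x1^2 + x2^2
    grad_f1 = np.array([-1.0, -1.0])         # gradient of 1 - x1 - x2

    print("primal feasibility:     ", f1 <= 1e-12)
    print("dual feasibility:       ", lam_star >= 0)
    print("complementary slackness:", abs(lam_star * f1) < 1e-12)
    print("stationarity:           ", np.allclose(grad_f0 + lam_star * grad_f1, 0.0))
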
Waterfilling

maximize Σi log(αi + xi )
subject to x ≽ 0, 1T x = 1

Variables: x (powers). Constants: α (noise)

KKT conditions:

x∗ ≽ 0, 1T x∗ = 1, λ∗ ≽ 0
λ∗i x∗i = 0, −1/(αi + x∗i ) − λ∗i + ν ∗ = 0

Since λ∗ are slack variables, reduce to

x∗ ≽ 0, 1T x∗ = 1
x∗i (ν ∗ − 1/(αi + x∗i )) = 0, ν ∗ ≥ 1/(αi + x∗i )

If ν ∗ < 1/αi , x∗i > 0. So x∗i = 1/ν ∗ − αi . Otherwise, x∗i = 0

Thus, x∗i = [1/ν ∗ − αi ]+ where ν ∗ is such that Σi x∗i = 1
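
One common way to compute the water level 1/ν∗ is bisection on the total-power
equation; a minimal sketch (the noise levels α are assumed example values):

    import numpy as np

    alpha = np.array([0.8, 1.0, 1.5, 2.5])   # assumed noise levels
    budget = 1.0                             # total power, 1^T x = 1

    lo, hi = alpha.min(), alpha.max() + budget   # bracket for the water level w = 1/nu*
    for _ in range(100):
        w = 0.5 * (lo + hi)
        if np.maximum(w - alpha, 0.0).sum() > budget:
            hi = w                           # too much power used: lower the water level
        else:
            lo = w                           # budget not exhausted: raise the water level
    x = np.maximum(w - alpha, 0.0)           # x_i = [1/nu* - alpha_i]_+
    print("powers:", x, " total:", x.sum())  # total should be ~1
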
Global Sensitivity Analysis

Perturbed optimization problem:

minimize f0 (x)
subject to fi (x) ≤ ui , i = 1, 2, . . . , m
hi (x) = vi , i = 1, 2, . . . , p

Optimal value p∗ (u, v) as a function of parameters (u, v)

Assume strong duality and that the dual optimum is attained. For any x that is
feasible for the perturbed problem (fi (x) ≤ ui , hi (x) = vi ):

p∗ (0, 0) = g(λ∗ , ν ∗ ) ≤ f0 (x) + Σi λ∗i fi (x) + Σi νi∗ hi (x) ≤ f0 (x) + λ∗T u + ν ∗T v

Taking the infimum over such x:

p∗ (u, v) ≥ p∗ (0, 0) − λ∗T u − ν ∗T v

• If λ∗i is large, tightening ith constraint (ui < 0) will increase optimal
value greatly

• If λ∗i is small, loosening ith constraint (ui > 0) will reduce optimal
value only slightly
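
A quick numerical illustration of the bound on an assumed toy problem with a
closed-form perturbed optimal value: for minimize x1² + x2² subject to
1 − x1 − x2 ≤ u, one has p∗ (u) = (1 − u)²/2 (for u ≤ 1) and λ∗ = 1 at u = 0,
and the inequality p∗ (u) ≥ p∗ (0) − λ∗ u holds for every u:

    import numpy as np

    lam_star = 1.0                           # optimal multiplier at u = 0 (assumed toy problem)
    p0 = 0.5                                 # p*(0) = 1/2

    for u in np.linspace(-0.5, 0.5, 5):
        p_u = 0.5 * (1.0 - u) ** 2           # perturbed optimal value p*(u)
        print(u, p_u >= p0 - lam_star * u)   # the lower bound holds for every u
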
Local Sensitivity Analysis

Assume that p∗ (u, v) is differentiable at (0, 0):

λ∗i = −∂p∗ (0, 0)/∂ui , νi∗ = −∂p∗ (0, 0)/∂vi

Shadow price interpretation of Lagrange dual variables

Small λ∗i means tightening or loosening ith constraint will not change
optimal value by much
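
On the same assumed toy problem (p∗ (u) = (1 − u)²/2), a finite-difference
check of the shadow-price identity λ∗ = −∂p∗ (0)/∂u:

    eps = 1e-6
    p = lambda u: 0.5 * (1.0 - u) ** 2       # p*(u) for the assumed toy problem
    deriv = (p(eps) - p(-eps)) / (2 * eps)   # central difference at u = 0
    print(-deriv)                            # approximately 1.0 = lambda*
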
Lecture Summary

• Convexity mentality. Convex optimization is ‘nice’ for several reasons:
local optimum is global optimum, zero optimal duality gap (under technical
conditions), KKT optimality conditions are necessary and sufficient

• Duality mentality. Can always bound primal through dual, sometimes solve
primal through dual

• Primal-dual: where is the optimum, how sensitive it is to perturbations

Readings: Sections 4.1-4.2 and 5.1-5.6 in Boyd and Vandenberghe
