Strong Duality in Cone Constrained Nonconvex Optimization
FABIÁN FLORES-BAZÁN† AND GIANDOMENICO MASTROENI‡
Abstract. In this paper we deepen the analysis of the conditions that ensure strong duality
for a cone constrained nonconvex optimization problem. We first establish a necessary and sufficient
condition for the validity of strong duality without convexity assumptions with a possibly empty
solution set of the original problem, and second, via Slater-type conditions involving quasi interior or
quasirelative interior notions, various results about strong duality are also obtained. Our conditions
can be used where no previous result is applicable, even in a finite dimensional or convex setting.
DOI. 10.1137/120861400
where $P^*$ is the positive polar cone of $P$. We say that problem (1.1) has a (Lagrangian)
zero duality gap if the optimal values of (1.1) and (1.2) coincide, that is,
$\mu = \nu$. Problem (1.1) is said to have strong duality if it has a zero duality gap and
problem (1.2) admits a solution. Our task is to characterize this property without
convexity assumptions and under minimal hypotheses. It is known that some constraint
qualification (CQ), for instance of Slater type or an interior-point condition, is needed
to obtain strong duality. In other situations, the validity of strong duality
requires a so-called closed-cone CQ. Such a CQ often restricts the range of applications.
∗ Received by the editors January 5, 2012; accepted for publication (in revised form) November
(gmastroeni@di.unipi.it).
the classical Slater condition. A similar result is proved in [21, Theorem 3.3] when
$X = \mathbb{R}^n$, $C = \{x \in \mathbb{R}^n : Hx = d\}$.
Similarly, when g is P -convex (g(tx1 + (1 − t)x2 ) ∈ tg(x1 ) + (1 − t)g(x2 ) − P for
every t ∈ ]0, 1[ and all x1 , x2 ∈ C, provided C is convex) and continuous, it is proved
in [4] that (1.1) has strong duality for each f ∈ X ∗ if and only if a certain CQ holds.
This CQ involves the epigraph of the support function of C and the epigraph of the
conjugate of the function $x \mapsto \langle \lambda^*, g(x)\rangle$. This CQ is also equivalent to the fact that
(1.1) has strong duality for each continuous and convex function f [16].
Stable zero duality gaps in convex programming (g is continuous and P -convex,
and f is a lower semicontinuous proper convex function), which means that strong
duality holds for each linear perturbation of f , were characterized in terms of a similar
CQ as above; see [18, 20] for details.
Several sufficient conditions for a zero duality gap have also been established in
the literature; see [12, 1, 2, 31, 4, 6, 7].
Unlike some of the above results (those including closed-cone CQ [4, 16]), which
involve conditions on g and C that guarantee (1.1) has strong duality for every f in
a certain class of functions, our approach allows us to derive conditions on f , g, and
C, jointly, that ensure (1.1) has strong duality under no convexity assumption. Thus,
we provide results where none of those in [12, 4, 18, 16, 5, 6, 7, 19] is applicable.
At the same time, because of many applications, our purpose is also to consider
convex cones P possibly with empty topological interior. This happens for instance
if ($1 < p < +\infty$ and $\Omega$ being an open bounded set in $\mathbb{R}^n$)
\[ P = L^p_+ \doteq \{u \in L^p(\Omega) : u \ge 0 \text{ for a.e. } x \in \Omega\}, \]
or if $P$ is of the form $P = Q \times \{0\}$ with $\operatorname{int} Q \neq \emptyset$. The former case appears when
dealing with constrained best interpolation problems; see the nice work by Qi [27]
(see also [24]).
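That the cone of a.e. nonnegative functions in $L^p(\Omega)$ has empty topological interior can be checked by a short perturbation argument. The following is only a sketch under the standing assumptions ($1 < p < +\infty$, $\Omega$ open and bounded); the symbols $M$, $E$, $\varepsilon$, and $\chi_E$ (the indicator function of $E$) are introduced here for illustration and are not part of the original text.

```latex
% Sketch: no point of L^p_+ is an interior point.
Let $u \in L^p_+$ and $\varepsilon > 0$. Since $u$ is finite a.e., there is $M > 0$
with $|\{x \in \Omega : u(x) \le M\}| > 0$. Choose a measurable set
$E \subseteq \{u \le M\}$ with $0 < |E| < (\varepsilon/(M+1))^p$ and set
\[
  v \doteq u - (M+1)\,\chi_E .
\]
Then $v \le -1 < 0$ on $E$, so $v \notin L^p_+$, while
\[
  \|v - u\|_{L^p} = (M+1)\,|E|^{1/p} < \varepsilon .
\]
Hence every ball centered at $u$ contains points outside $L^p_+$,
that is, $\operatorname{int} L^p_+ = \emptyset$.
```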
A good substitute for the topological interior is the quasi interior and even the
quasi-relative interior. Borwein and Lewis in [3] introduced the quasi-relative interior
of a convex set A ⊆ Y , although the concept of quasi interior was introduced earlier.
We use both notions, and since the sets considered are not necessarily convex, the
convex hull arises naturally.
The paper is structured as follows. Section 2 provides the basic definitions, nota-
tion, and preliminaries on quasi (relative) interior of convex sets. In section 3 we first
establish a characterization of strong duality without any additional assumption; then
we present two main theorems on the validity of strong duality under no convexity as-
sumptions, which extend and unify existing results in the literature. Furthermore, we
show instances where no previous result is applicable. Consequences and comparison
with other existing results are discussed in section 4.
2. Basic notation and preliminaries. Throughout the paper, Y is a real
Hausdorff locally convex topological vector space, its topological dual space is Y ∗ ,
and $\langle \cdot, \cdot \rangle$ denotes the duality pairing between $Y$ and $Y^*$.
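A set $P \subseteq Y$ is said to be a cone if $tP \subseteq P$ $\forall\, t \ge 0$; given $A \subseteq Y$, $\operatorname{cone}(A)$ stands
for the smallest cone containing $A$, that is,
\[ \operatorname{cone}(A) = \bigcup_{t \ge 0} tA, \]
whereas $\overline{\operatorname{cone}}(A)$ denotes the smallest closed cone containing $A$: obviously
$\overline{\operatorname{cone}}(A) = \overline{\operatorname{cone}(A)}$, where $\overline{A}$ denotes the closure of $A$. Additionally, we set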
Downloaded 11/11/19 to 142.157.252.78. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
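\[ \operatorname{cone}_+(A) \doteq \bigcup_{t > 0} tA. \]
Evidently, $\operatorname{cone}(A) = \operatorname{cone}_+(A) \cup \{0\}$ and therefore $\overline{\operatorname{cone}}(A) = \overline{\operatorname{cone}_+(A)}$.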
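A small finite dimensional instance may help to keep the three conic hulls apart. The set $A$ below is a hypothetical choice made only for illustration; $\overline{\operatorname{cone}}(A)$ is read as the smallest closed cone containing $A$.

```latex
% Illustration in Y = R^2 with the vertical line A = {1} x R.
Let $Y = \mathbb{R}^2$ and $A = \{1\} \times \mathbb{R}$. Then
\[
  \operatorname{cone}_+(A) = \bigcup_{t > 0} tA = \{(x_1, x_2) : x_1 > 0\},
\]
\[
  \operatorname{cone}(A) = \operatorname{cone}_+(A) \cup \{(0,0)\}
                         = \{(x_1, x_2) : x_1 > 0\} \cup \{(0,0)\},
\]
\[
  \overline{\operatorname{cone}}(A) = \overline{\operatorname{cone}_+(A)}
                                    = \{(x_1, x_2) : x_1 \ge 0\}.
\]
In particular $\operatorname{cone}(A)$ is not closed, although $A$ is closed.
```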
Some elementary properties of cones are collected in the next lemma, where $\operatorname{co}(A)$ and
$\operatorname{int} A$ stand for the convex hull of $A$, which is the smallest convex set containing $A$,
and the topological interior of $A$, respectively. We denote $\mathbb{R}_+ \doteq [0, +\infty[$.
Given a convex set $A \subseteq Y$ and $x \in A$, $N_A(x)$ stands for the normal cone to $A$ at
$x$, defined by $N_A(x) = \{\xi \in Y^* : \langle \xi, a - x \rangle \le 0\ \forall\, a \in A\}$. We say that $x \in A$ is a (see,
for instance, [7])
(a) quasi-interior point of $A$, denoted by $x \in \operatorname{qi} A$, if $\overline{\operatorname{cone}}(A - x) = Y$, or
equivalently, $N_A(x) = \{0\}$;
(b) quasi-relative interior point of $A$, denoted by $x \in \operatorname{qri} A$, if $\overline{\operatorname{cone}}(A - x)$ is a linear
subspace of $Y$, or equivalently, $N_A(x)$ is a linear subspace of $Y^*$.
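For any convex set $A$, we have [24, 7] that $\operatorname{qi} A \subseteq \operatorname{qri} A$ and that $\operatorname{int} A \neq \emptyset$ implies
$\operatorname{int} A = \operatorname{qi} A$. Similarly, if $\operatorname{qi} A \neq \emptyset$, then $\operatorname{qi} A = \operatorname{qri} A$. Moreover [3], if $Y$ is a finite
dimensional space, then $\operatorname{qi} A = \operatorname{int} A$ and $\operatorname{qri} A = \operatorname{ri} A$, where $\operatorname{ri} A$ means the relative
interior of $A$, which is the interior with respect to the affine hull of $A$, denoted by
$\operatorname{aff} A$.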
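The difference between the two notions already shows up for a segment in the plane. The example below is an illustrative sketch (the set $A$ and the parameter $s$ are chosen here, not taken from the paper), using the closed conic hull and the normal cone as in the definitions above.

```latex
% Illustration: A = [0,1] x {0} in Y = R^2, with Y^* identified with R^2.
Let $A = [0,1] \times \{0\} \subseteq \mathbb{R}^2$.

For $x = (s, 0)$ with $0 < s < 1$:
\[
  \overline{\operatorname{cone}}(A - x) = \mathbb{R} \times \{0\},
  \qquad
  N_A(x) = \{0\} \times \mathbb{R},
\]
both linear subspaces, so $x \in \operatorname{qri} A$; since
$\overline{\operatorname{cone}}(A - x) \neq \mathbb{R}^2$, $x \notin \operatorname{qi} A$.

For the endpoint $x = (0,0)$:
\[
  \overline{\operatorname{cone}}(A - x) = \mathbb{R}_+ \times \{0\},
  \qquad
  N_A(x) = (-\mathbb{R}_+) \times \mathbb{R},
\]
neither of which is a linear subspace, so $x \notin \operatorname{qri} A$.

Hence $\operatorname{qi} A = \operatorname{int} A = \emptyset$ and
$\operatorname{qri} A = \operatorname{ri} A = \,]0,1[\, \times \{0\}$.
```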
We recall the definition of pointedness for a cone that is not necessarily convex
(see, for instance, [29]).
Definition 2.1. A cone $P \subseteq Y$ is called “pointed” if $x_1 + \cdots + x_k = 0$ is
impossible for $x_1, x_2, \ldots, x_k$ in $P$ unless $x_1 = x_2 = \cdots = x_k = 0$.
It is easy to see that a cone $P$ is pointed if and only if $\operatorname{co}(P) \cap (-\operatorname{co}(P)) = \{0\}$ if
and only if $0$ is an extreme point of $\operatorname{co}(P)$.
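For nonconvex cones, this notion of pointedness is stronger than the familiar condition $P \cap (-P) = \{0\}$. The two cones below are hypothetical examples in $\mathbb{R}^2$, given only to illustrate the definition.

```latex
% Two unions of rays in R^2.
(i) $P_1 = \mathbb{R}_+(1,0) \cup \mathbb{R}_+(0,1)$. Here
$\operatorname{co}(P_1) = \mathbb{R}^2_+$ and
$\operatorname{co}(P_1) \cap (-\operatorname{co}(P_1)) = \{0\}$, so $P_1$ is pointed.

(ii) $P_2 = \mathbb{R}_+(1,0) \cup \mathbb{R}_+(-1,1) \cup \mathbb{R}_+(-1,-1)$.
No ray of $P_2$ is opposite to another one, so $P_2 \cap (-P_2) = \{0\}$. However,
\[
  (2,0) + (-1,1) + (-1,-1) = (0,0)
\]
with all three summands in $P_2 \setminus \{0\}$, so $P_2$ is not pointed.
Equivalently, $\operatorname{co}(P_2) = \mathbb{R}^2$, and $0$ is not an extreme
point of $\operatorname{co}(P_2)$.
```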
The positive polar of the convex cone $P \subseteq Y$ is defined by
\[ P^* \doteq \{y^* \in Y^* : \langle y^*, x \rangle \ge 0\ \ \forall\, x \in P\}. \]
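Two elementary computations of the positive polar in $\mathbb{R}^2$ (identifying $Y^*$ with $\mathbb{R}^2$ through the usual inner product) may serve as a reference point; they are illustrations added here, not statements from the paper.

```latex
% Positive polars of two convex cones in R^2; y^* = (a, b).
(i) $P = \mathbb{R}^2_+$: testing against $(1,0)$ and $(0,1)$ gives $a \ge 0$, $b \ge 0$,
and conversely such $y^*$ are nonnegative on $P$, so
\[
  P^* = \mathbb{R}^2_+ .
\]

(ii) $P = \{0\} \times \mathbb{R}_+$ (a cone with empty interior):
$\langle y^*, (0,t) \rangle = bt \ge 0$ for all $t \ge 0$ if and only if $b \ge 0$, so
\[
  P^* = \mathbb{R} \times \mathbb{R}_+ .
\]
The multiplier attached to the component constrained by $\{0\}$ (an equality
constraint) is therefore free in sign.
```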
(b) The inclusion (⊇) is obvious; the other comes from (a).
(c) ($\subseteq$) Let $m_i \in M$, $p_i \in N$, $i = 1, \ldots, q$, $\alpha_i \ge 0$, $\sum_{i=1}^{q} \alpha_i = 1$. Then
\[ \sum_{i=1}^{q} \alpha_i (m_i + p_i) = \sum_{i=1}^{q} \alpha_i m_i + \sum_{i=1}^{q} \alpha_i p_i \in \operatorname{co}(M) + N. \]
($\supseteq$) Let $m_i \in M$, $i = 1, \ldots, q$, $\alpha_i \ge 0$, $\sum_{i=1}^{q} \alpha_i = 1$, $p \in N$.
Then
\[ \sum_{i=1}^{q} \alpha_i m_i + p = \sum_{i=1}^{q} \alpha_i (m_i + p) \in \operatorname{co}(M + N). \]
It follows that there exists at least one $i \in \{1, \ldots, p\}$ such that $\langle x^*, x_i \rangle < 0$; otherwise
(2.1) would be contradicted, which proves the necessity part.
The sufficiency is straightforward.
The next result [3] is a useful characterization of the quasi-relative interior.
Theorem 2.4 (see [3, Theorem 3.10]). Let $Y$ be as above partially ordered by a
convex cone $P$ with $P - P = Y$. Then
\[ y \in \operatorname{qri} P \iff y \in P \ \text{ and } \ \langle y^*, y \rangle > 0 \ \ \forall\, y^* \in P^* \setminus \{0\}. \]
Note that in case $P$ is a closed convex cone, the condition $y \in P$ can be omitted.
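A standard infinite dimensional illustration of this characterization is the nonnegative cone of $\ell^2$. The computation below is a sketch added for orientation; the space $\ell^2$, the unit vectors $e_k$, and the identification of $(\ell^2)^*$ with $\ell^2$ are assumptions of the example, not part of the statement above.

```latex
% qri of the nonnegative cone in l^2 via the strict positivity test.
Let $Y = \ell^2$ and $P = \ell^2_+ = \{y \in \ell^2 : y_n \ge 0 \ \forall\, n\}$.
Then $P - P = \ell^2$ (write $y = y^+ - y^-$) and $P^* = \ell^2_+$.

If $y_n > 0$ for all $n$ and $y^* \in P^* \setminus \{0\}$, then
$\langle y^*, y \rangle = \sum_n y^*_n y_n > 0$, since some $y^*_n > 0$.
If $y_k = 0$ for some $k$, then $y^* = e_k \in P^* \setminus \{0\}$ gives
$\langle y^*, y \rangle = 0$. Hence
\[
  \operatorname{qri} \ell^2_+ = \{y \in \ell^2 : y_n > 0 \ \forall\, n\}.
\]
On the other hand $\operatorname{int} \ell^2_+ = \emptyset$: given $y \in \ell^2_+$
and $\varepsilon > 0$, pick $n$ with $y_n < \varepsilon/2$ and set
$z = y - (y_n + \varepsilon/2)\, e_n$; then $z_n < 0$ and $\|z - y\| < \varepsilon$.
```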
Proposition 2.5. Let A ⊆ Y be a convex set. Then,
(a) cone(A − A) = cone A − cone A provided 0 ∈ A;
(b) [0 ∈ qi A] ⇐⇒ [0 ∈ qi(A − A) and 0 ∈ qri A].
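The converse implication follows from (2.2) as well. Indeed, if $0 \in \operatorname{qri} A$, then $\overline{\operatorname{cone}}(A)$
is a linear subspace of $Y$, and therefore from (2.2) it follows that $\overline{\operatorname{cone}}(A - A) =
\overline{\operatorname{cone}}(A)$. If additionally $0 \in \operatorname{qi}(A - A)$, then $Y = \overline{\operatorname{cone}}(A - A) = \overline{\operatorname{cone}}(A)$. Hence
$0 \in \operatorname{qi} A$, and the proof is completed.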
Notice that (b) can also be found in [15].
Proposition 2.6. Let P ⊆ Y be a convex cone such that P − P = Y . Then
qri P = qi P.
This set or its conic hull arises in a natural way when dealing with duality results
or in deriving alternative theorems; see [22, 12, 8, 15, 10]. Giannessi [13] used it in
a systematic manner for a constrained extremum problem giving rise to the image
space analysis.
Proposition 3.1. The following assertions hold.
(a) Assume that $\mu \in f(K)$. Then, $\mu = \inf_{x \in K} f(x)$ if and only if
\[ E_\mu \cap H = \emptyset, \tag{3.2} \]
where $H \doteq \{(u, v) \in \mathbb{R} \times Y : u < 0,\ v \in -P\}$. Furthermore,
\[ E_\mu \cap H = \emptyset \iff \operatorname{cone}(E_\mu) \cap H = \emptyset. \]
\[ E_\rho \cap H = \emptyset \quad \forall\, \rho \in \mathbb{R}, \tag{3.3} \]
is impossible.
Let $\mu = \inf_{x \in K} f(x)$ and assume that (3.2) does not hold. Then, there exists a
solution $(\tilde{x}, \tilde{t}, \tilde{p}) \in C \times \mathbb{R}_+ \times P$ of system (3.4), i.e.,
f (x) ≥ μ ∀x ∈ K
which implies
i.e.,
\[ u + \langle \lambda_0^*, v \rangle \ge 0 \quad \forall\, (u, v) \in E_\mu. \]
It follows that
for a suitable choice of $\rho > 0$ and $V(0)$. By the separation theorem for convex sets in
a t.v.s., there exist $(\gamma_0^*, \lambda_0^*) \in \mathbb{R} \times Y^*$, $(\gamma_0^*, \lambda_0^*) \neq (0, 0)$, such that
Let us prove that $\gamma_0^* \neq 0$. By contradiction, suppose that $\gamma_0^* = 0$; then from (3.8)
it follows that $\langle \lambda_0^*, v \rangle \le 0$ $\forall\, v \in V(0)$, which implies $\lambda_0^* = 0$, thus contradicting
$(\gamma_0^*, \lambda_0^*) \neq (0, 0)$. Therefore, $\gamma_0^* \neq 0$, and with no loss of generality, we can assume
$\gamma_0^* = 1$, since by (3.8) (at the point $(-1, 0) \in A$), we have $-\gamma_0^* \le 0$. Then, (3.7)
implies
and, in turn,
so that
(a) There exist $(\gamma_0^*, \lambda_0^*) \in \mathbb{R}_+ \times P^*$, $(\gamma_0^*, \lambda_0^*) \neq (0, 0)$, such that
x∈K x∈C
or, equivalently,
Hence
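\[ \gamma_0^* \inf_{x \in K} f(x) = \inf_{x \in C} L(\gamma_0^*, \lambda_0^*, x) \quad \text{and} \quad \gamma_0^* (f(\tilde{x}) + \tilde{t}) + \langle \lambda_0^*, g(\tilde{x}) + \tilde{p} \rangle > \mu \gamma_0^*. \]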
Proof. The equivalences are consequences of Theorem 2.3. The remaining part
follows a similar reasoning as in the preceding theorem.
Looking at Theorems 3.3 and 3.4, we realize that strong duality is obtained under
the nonverticality of the linear functional (γ0∗ , λ∗0 ), that is, we need γ0∗ > 0. It holds
whenever a Slater-type condition is imposed as the following two theorems show.
Theorem 3.5. Assume that $\mu$ is finite and $\overline{\operatorname{cone}}(\operatorname{co}(g(C)) + P) = Y$, i.e.,
$0 \in \operatorname{qi}(\operatorname{co}(g(C) + P))$. Then, any of the assumptions (b) or (c) of Theorem 3.3 is equivalent
to (3.5) for some $\lambda_0^* \in P^*$. In such a situation,
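\[ \langle \lambda_0^*, v \rangle \ge 0 \quad \forall\, v \in \operatorname{cone}(\operatorname{co}(g(C)) + P). \]
Therefore, by assumption, we obtain $\lambda_0^* = 0$, which cannot happen as $(\gamma_0^*, \lambda_0^*) \neq (0, 0)$.
Hence $\gamma_0^* > 0$, and the conclusion follows.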
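For the equality in (3.11), we observe that the inequality $\ge$ is obvious. The reverse
inequality is a consequence of (3.5):
\[ \inf_{\substack{\langle \lambda_0^*, g(x) \rangle \le 0 \\ x \in C}} f(x) \ \ge \inf_{\substack{\langle \lambda_0^*, g(x) \rangle \le 0 \\ x \in C}} L(1, \lambda_0^*, x) \ \ge\ \inf_{x \in C} L(1, \lambda_0^*, x) = \inf_{x \in K} f(x). \]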
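Each link of this chain has a one-line justification. The annotated version below is a sketch that assumes the usual Lagrangian $L(1, \lambda^*, x) = f(x) + \langle \lambda^*, g(x) \rangle$ and the feasible set $K = \{x \in C : g(x) \in -P\}$; neither definition is restated in this part of the text, so both are labeled here as assumptions.

```latex
% Annotated chain; S is shorthand introduced only for this sketch.
Let $S \doteq \{x \in C : \langle \lambda_0^*, g(x) \rangle \le 0\}$.

(i) $K \subseteq S$, because $\lambda_0^* \in P^*$ and $g(x) \in -P$ give
$\langle \lambda_0^*, g(x) \rangle \le 0$. Hence
$\inf_{x \in K} f(x) \ge \inf_{x \in S} f(x)$ (the ``obvious'' inequality).

(ii) For $x \in S$,
$f(x) \ge f(x) + \langle \lambda_0^*, g(x) \rangle = L(1, \lambda_0^*, x)$, so
$\inf_{x \in S} f(x) \ge \inf_{x \in S} L(1, \lambda_0^*, x)$.

(iii) $S \subseteq C$, so
$\inf_{x \in S} L(1, \lambda_0^*, x) \ge \inf_{x \in C} L(1, \lambda_0^*, x)$.

(iv) The last equality,
$\inf_{x \in C} L(1, \lambda_0^*, x) = \inf_{x \in K} f(x)$,
is the one the text attributes to (3.5).

Combining (i)--(iv) yields
$\inf_{x \in K} f(x) = \inf_{x \in S} f(x)$.
```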
for some $\lambda_0^* \in P^*$. In such a case, if $\bar{x}$ is a solution to (3.1), then $\langle \lambda_0^*, g(\bar{x}) \rangle = 0$.
Proof. By Theorem 3.4, we have only to prove that $\gamma_0^* > 0$, taking into account
that, in such a case, the second assertion in Theorem 3.4(a)
\[ \langle \lambda_0^*, v \rangle \le 0 \quad \forall\, v \in \operatorname{co}(g(C) + P), \tag{3.13} \]
i.e., $\lambda_0^* \in N_{\operatorname{co}(g(C)+P)}(0)$, recalling that $0 \in \operatorname{co}(g(C) + P)$ because the feasible set is
nonempty. Since $0 \in \operatorname{qri}(\operatorname{co}(g(C) + P))$ is equivalent to saying that $N_{\operatorname{co}(g(C)+P)}(0)$ is
a linear subspace, $-\lambda_0^* \in N_{\operatorname{co}(g(C)+P)}(0)$, and it follows that
\[ \langle \lambda_0^*, v \rangle = 0 \quad \forall\, v \in \operatorname{co}(g(C) + P), \]
Therefore u∗ ∈ R, v ∗ ∈ Y ∗ , and
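\[ u^* \bigl(u - (f(\tilde{x}) + t - \mu)\bigr) + \langle v^*, v - (g(\tilde{x}) + \tilde{p}) \rangle \le 0 \quad \forall\, (u, v) \in \operatorname{co}(E_\mu \cup \{(0, 0)\}). \tag{3.15} \]
Setting $u \doteq f(\tilde{x}) + \frac{t}{2} - \mu$, $v \doteq g(\tilde{x}) + \tilde{p}$, we obtain $-u^* \frac{t}{2} \le 0$, and setting
$u \doteq f(\tilde{x}) + \frac{3}{2} t - \mu$, $v \doteq g(\tilde{x}) + \tilde{p}$, we obtain $u^* \frac{t}{2} \le 0$. Since $t > 0$, it must be $u^* = 0$.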
Therefore, (3.15) becomes
so that it must be v ∗ = 0.
Hence,
be deleted everywhere simply by requiring the convexity of the sets cone Eμ and
cone(g(C) + P ), since in this situation,
An important class of vector functions implying the convexity of the sets Eμ and
g(C) + P which satisfy more verifiable conditions is that introduced in [22]: given a
convex set C ⊆ X with X as above, a real locally convex topological vector space Z
along with a convex cone $Q \subseteq Z$, a mapping $G : C \to Z$ is called ∗-quasi-convex if
$\langle q^*, G(\cdot) \rangle$ is quasi-convex $\forall\, q^* \in Q^*$. Independently, the author in [30] says that $G$
is naturally $Q$-quasi-convex if $\forall\, x, y \in C$, $G([x, y]) \subseteq [G(x), G(y)] - Q$. Both classes
coincide as shown in [8, Proposition 3.9] when $\operatorname{int} Q \neq \emptyset$ and [11, Theorem 2.3] for
general $Q$. See also [23].
It is known from Corollary 3.11 in [8] that every ∗-quasi-convex function G : C →
Z satisfying
is such that $G(C) + P$ is convex, so that $G(C') + P$ is also convex for every convex
set $C' \subseteq C$.
Therefore, by setting $F = (f, g)$ and assuming the convexity of $C$, the lower
semicontinuity on any line segment of $C$ of $\langle q^*, F(\cdot) \rangle$ $\forall\, q^* \in \mathbb{R}_+ \times P^*$ and the
∗-quasi-convexity of $F : C \to \mathbb{R} \times Y$, we get the convexity of $F(C) + (\mathbb{R}_+ \times P)$ (and
so $E_\mu$ is convex as well) and the quasiconvexity of the functions $f$ and $\langle p^*, g(\cdot) \rangle$ on $C$
$\forall\, p^* \in P^*$. Hence, $f(C) + \mathbb{R}_+$ and $g(C) + P$ are convex sets as well.
Obviously there are vector functions F such that F (C) + (R+ × P ) is convex
without being ∗-quasi-convex. The convexity of F (C) + (R+ × P ) was imposed in
[7, 15]. Hence, our Theorem 3.6 is more general, even in the convex case, than
Theorem 4.4 in [7] and Theorem 10 in [15], since the last two theorems require the
stronger condition 0 ∈ qi(g(C) + P ). This is shown by Example 4.3 below. To be
more precise, Theorem 4.4 in [7] reads as follows.
Theorem 4.1 (see [7, Theorem 4.4]). Suppose that F (C) + (R+ × P ) is convex,
0 ∈ qi(g(C) + P ), and (0, 0) ∈ qri[co(Eμ ∪ {(0, 0)})]. Then, there exists λ∗0 ∈ P ∗ such
that (3.12) holds.
In order to prove the previous theorem, the authors show first that “Fenchel
and Lagrange duality” (so, some convexity assumptions are imposed) are equivalent,
generalizing an earlier result due to Magnanti [25]. Then, from such an equivalence
the strong duality is obtained.
On the other hand, from Proposition 2.5(b), it follows that
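\[ 0 \in \operatorname{qi}(\operatorname{co}(g(C)) + P) \iff 0 \in \operatorname{qi}[\operatorname{co}(g(C)) + P - (\operatorname{co}(g(C)) + P)] \ \text{ and } \ 0 \in \operatorname{qri}(\operatorname{co}(g(C)) + P). \tag{4.2} \]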
This implies that Theorems 4.2 and 4.4 in [7] are identical provided g(C)+P is convex.
Furthermore, we point out that Theorems 3.5 and 3.6 apply to more general
situations, even to non-quasi-convex functions with equality and inequality constraints
and possibly where $\operatorname{argmin}_K f$ is empty, as the next example shows. Example 4.3
below will show that even in the convex case, our Theorem 3.6 is applicable but
Theorem 4.4 in [7] or Theorem 10 in [15] are not.
Example 4.2. Notice this example shows our approach applies even if $\operatorname{int} P = \emptyset$.
Take $C = \mathbb{R}^2$, $P = \{0\} \times \mathbb{R}_+$, $x = (x_1, x_2)$, $f(x) = x_1^2 + 2e^{-x_2^2}$, and
\[ g_1(x) = x_1^4 - e^{-x_2^2}, \qquad g_2(x) = x_1^2 - x_2^2, \]
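\[ \inf_{x \in \mathbb{R}^2} L(1, \lambda, x) = \begin{cases} 0 & \text{if } \lambda_2 = 0,\ 0 \le \lambda_1 \le 2, \\ 2 - \lambda_1 & \text{if } \lambda_2 = 0,\ \lambda_1 > 2, \\ -\infty & \text{if } \lambda_2 = 0,\ \lambda_1 < 0 \ \text{ or } \ \lambda_2 > 0, \end{cases} \]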
and therefore,
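The primal and dual values of this example can be verified by hand. The computation below is a sketch that reads the data as $f(x) = x_1^2 + 2e^{-x_2^2}$, $g_1(x) = x_1^4 - e^{-x_2^2}$, $g_2(x) = x_1^2 - x_2^2$, $P = \{0\} \times \mathbb{R}_+$, and assumes the usual Lagrangian $L(1, \lambda, x) = f(x) + \lambda_1 g_1(x) + \lambda_2 g_2(x)$ with $\lambda \in P^* = \mathbb{R} \times \mathbb{R}_+$; the symbols $\mu$ and $\nu$ denote the primal and dual optimal values.

```latex
% Primal value: feasibility means g_1(x) = 0 and g_2(x) <= 0.
On the feasible set $K$ we have $x_1^4 = e^{-x_2^2}$, i.e. $x_1^2 = e^{-x_2^2/2}$, so
\[
  f(x) = e^{-x_2^2/2} + 2e^{-x_2^2} > 0 \quad \forall\, x \in K,
\]
and $g_2(x) \le 0$ reads $e^{-x_2^2/2} \le x_2^2$, which holds for $|x_2|$ large.
Letting $|x_2| \to +\infty$ along $K$ gives $f(x) \to 0$. Hence $\mu = 0$ and
the infimum is not attained.

% Dual function for lambda_2 = 0.
\[
  L(1, (\lambda_1, 0), x) = x_1^2 + \lambda_1 x_1^4 + (2 - \lambda_1)\, e^{-x_2^2}.
\]
If $0 \le \lambda_1 \le 2$, all terms are nonnegative and the infimum is $0$
($x_1 = 0$, $|x_2| \to +\infty$). If $\lambda_1 > 2$, the infimum $2 - \lambda_1 < 0$
is attained at $x = (0,0)$. If $\lambda_1 < 0$, then $\lambda_1 x_1^4 \to -\infty$.
If $\lambda_2 > 0$, then $-\lambda_2 x_2^2 \to -\infty$ along $x_1 = 0$.

% Conclusion.
\[
  \nu = \sup_{\lambda \in P^*} \inf_{x \in \mathbb{R}^2} L(1, \lambda, x) = 0 = \mu,
\]
with the supremum attained at every $\lambda = (\lambda_1, 0)$, $0 \le \lambda_1 \le 2$.
So strong duality holds although the primal solution set is empty.
```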
The following example shows that even in the convex case, our Theorem 3.6 is
applicable, but Theorem 4.4 in [7] or Theorem 10 in [15] are not.
Example 4.3. This example shows an application to a convex problem with
int P = ∅.
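Take $C = \{(x_1, x_2, x_3) \in \mathbb{R}^3 : x_3 = 0\}$, $P = \{0\} \times \{0\} \times \mathbb{R}_+$, $x = (x_1, x_2, x_3)$,
$f(x) = x_1^2$,
\[ E_0 = \{(u, v_1, v_2, v_3) \in \mathbb{R}^4 : u \ge x_1^2,\ v_1 = x_1 + x_2 + x_3,\ v_2 = x_1 + x_2 - x_3,\ v_3 \ge x_1^2 + x_2^2 - 1,\ x \in \mathbb{R}^3\} \]
is convex, and it is not difficult to check that $\operatorname{cone}(E_0) \cap H = \emptyset$, which implies that
$\operatorname{cone}(E_0) \cap (-\mathbb{R}_{++} \times \{0\}) = \emptyset$, where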
The former equality allows us to apply Theorem 3.2. It is easy to see that
is a convex set with empty interior so that $\operatorname{qi}((g_1, g_2, g_3)(C) + P) = \emptyset$. However, since
It is easy to see that μ = 0 and x̄ = 0 is the optimal solution. On the other hand,
$F(C) = \{(x^2, x^2 + x^4) : x \in \mathbb{R}\} = \{(u, v) \in \mathbb{R} \times \mathbb{R} : v = u + u^2,\ u \in \mathbb{R}_+\}$. Then
$F(C) + (\mathbb{R}_+ \times P) = \mathbb{R}^2_+$ is a closed convex cone and (b) of Theorem 3.2 is fulfilled.
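The identity for $F(C) + (\mathbb{R}_+ \times P)$ takes two lines to check. The sketch below assumes the data suggested by the formula for $F(C)$, namely $C = \mathbb{R}$, $P = \mathbb{R}_+$, $f(x) = x^2$, $g(x) = x^2 + x^4$; these are inferred from the displayed set and are not restated in this part of the text.

```latex
% Check of F(C) + (R_+ x P) = R^2_+ under the assumed data.
($\subseteq$) Both components of $F(x) = (x^2, x^2 + x^4)$ are nonnegative, and
adding an element of $\mathbb{R}_+ \times \mathbb{R}_+$ keeps them nonnegative.

($\supseteq$) For $(a, b) \in \mathbb{R}^2_+$,
\[
  (a, b) = F(0) + (a, b) \in F(C) + (\mathbb{R}_+ \times P).
\]

% Strong duality can also be seen directly for this data.
For every $\lambda \ge 0$,
\[
  f(x) + \lambda g(x) = (1 + \lambda)\, x^2 + \lambda x^4 \ge 0 \quad \forall\, x \in \mathbb{R},
\]
with equality at $x = 0$, so $\inf_{x \in \mathbb{R}} L(1, \lambda, x) = 0 = \mu$.
```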
\[ f(x) + \lambda g(x) \ge 0 \quad \forall\, x \in \mathbb{R}^n. \]
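A concrete instance may make the two conditions tangible. The quadratic forms below are hypothetical and chosen only for illustration; the sketch assumes that $f$ and $g$ are quadratic forms, that some point satisfies $g(x) < 0$, and that (a) is the implication ``$g(x) \le 0 \Rightarrow f(x) \ge 0$'', as used in the proof sketch.

```latex
% Illustration with n = 2.
Let $f(x) = 2x_1^2 - x_2^2$ and $g(x) = x_2^2 - x_1^2$ on $\mathbb{R}^2$.
Note $g(1, 0) = -1 < 0$.

(a) If $g(x) \le 0$, then $x_2^2 \le x_1^2$, hence
\[
  f(x) = 2x_1^2 - x_2^2 \ge x_1^2 \ge 0 .
\]

(b) With $\lambda = 1$,
\[
  f(x) + \lambda g(x) = x_1^2 \ge 0 \quad \forall\, x \in \mathbb{R}^2 .
\]
More generally,
$f(x) + \lambda g(x) = (2 - \lambda)\, x_1^2 + (\lambda - 1)\, x_2^2$,
which is nonnegative on $\mathbb{R}^2$ exactly when $1 \le \lambda \le 2$.
```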
Let us sketch a proof. Obviously (b) $\Longrightarrow$ (a) always holds. Assume therefore that
(a) holds. This means that $g(x) \le 0$ implies $f(x) \ge 0$, that is, $0 \le \mu \doteq \inf_{g(x) \le 0} f(x)$.
By Proposition 3.1 we have that $\operatorname{cone}(E_\mu) \cap H = \emptyset$, where (set $F = (f, g)$)
\[ E_\mu \doteq F(\mathbb{R}^n) - \mu(1, 0) + \mathbb{R}^2_+ \quad \text{and} \quad H \doteq \{(u, v) \in \mathbb{R}^2 : u < 0,\ v \le 0\}. \]
By the Dines theorem [26, Proposition 2.3], $F(\mathbb{R}^n)$ is convex, and therefore $E_\mu$ is
convex. It follows that
or, equivalently (recalling that for any nonempty convex sets $C_1, C_2 \subseteq \mathbb{R}^n$,
$\operatorname{ri}(C_1 + C_2) = \operatorname{ri} C_1 + \operatorname{ri} C_2$; see [28, Corollary 6.6.2]),
REFERENCES
[1] A. Auslender, Existence of optimal solutions and duality results under weak conditions, Math.
Program. A, 88 (2000), pp. 45–59.
[2] A. Auslender and M. Teboulle, Asymptotic Cones and Functions in Optimization and
Variational Inequalities, Springer Monogr. Math., Springer-Verlag, New York, 2003.
[3] J. M. Borwein and A. S. Lewis, Partially finite convex programming, part I: Quasi relative
interiors and duality theory, Math. Program. A, 57 (1992), pp. 15–48.
[4] R. I. Bot and G. Wanka, An alternative formulation for a new closed cone constraint
qualification, Nonlinear Anal., 64 (2006), pp. 1367–1381.
[5] R. I. Bot, S.-M. Grad, and G. Wanka, New regularity conditions for strong and total
Fenchel-Lagrange duality in infinite dimensional spaces, Nonlinear Anal., 69 (2008), pp. 323–336.
[6] R. I. Bot, E. R. Csetnek, and A. Moldovan, Revisiting some duality theorems via the
quasirelative interior in convex optimization, J. Optim. Theory Appl., 139 (2008), pp. 67–84.
[7] R. I. Bot, E. R. Csetnek, and G. Wanka, Regularity conditions via quasi-relative interior
in convex programming, SIAM J. Optim., 19 (2008), pp. 217–233.
[8] F. Flores-Bazán, N. Hadjisavvas, and C. Vera, An optimal alternative theorem and
applications to mathematical programming, J. Global Optim., 37 (2007), pp. 229–243.
[9] F. Flores-Bazán, F. Flores-Bazán, and C. Vera, A complete characterization of strong