
Control Theory and PDE’s

Dustin Connery-Grigg

December 14, 2012

1 Introduction
Differential equations are extremely useful tools in modelling all sorts of dynamical systems.
As mathematicians, studying them for their own sake is an entirely acceptable, and even
laudable venture.

Unfortunately, many people and institutions that supply actual grant money have different sensibilities, and want 'results' that are 'useful' and 'practical'. Heresy at its most profane, I agree. But it's the way of the world, and one must persevere. Onwards and upwards, as they say.

Most human endeavours involve some degree of trying to control the behaviour of some system or another. That is, a common theme in human thought is, "given that I know how this system will behave, how can I change it to make the system act in a way that I would prefer?"
What follows is then the perpetual cycle of human history between the two stages of "What harm could it do?" and "How could we have known?"

The main goal of Control Theory is to answer this first question for dynamical systems
modelled by differential equations (the latter two are left for watchdog organizations and
historians, respectively).

These notes are divided into four sections. The first is a brief introduction to the control problem, and a statement of some of the central ideas and tools used in establishing controllability (whatever that may mean). The second develops some of the theory of finite linear systems, in particular arriving at Kalman's controllability condition, which completely characterizes the controllability properties of such systems. The third is a development of parallel controllability results for the heat equation, while the fourth investigates how the theory of viscosity solutions to Hamilton-Jacobi equations can be used to build optimal controls.

2 Section One: Control versus Optimal Control


The astute reader may have noticed that 'Optimal Control' consists of two words, and that the first is a modifier of the second. There is a reason for this; Control Theory and Optimal Control Theory ask two different, but related, questions. Let's investigate.

An exceptionally (perhaps lamentably) general version of the optimal control problem is as follows: given some sort of dynamical system, together with some way to control it, which may be modelled as

$$\Lambda y = Bu \qquad (*)$$

where $\Lambda$ and $B$ are operators specifying the model of the system and how the control acts on the system, respectively, y is the state of the system, and u is the control, drawn from some admissible set of controls U.

We are additionally given some 'cost functional' J(u), mapping into the reals and expressing the 'cost of running the control u'. The question Optimal Control Theory asks is to find a control ū in the space of admissible controls such that

$$J(\bar u) = \inf_{u \in U} J(u)$$

The slight peccadillo here is the following: how ought we choose the set of admissible controls? What would we want out of these controls in the first place? Obviously, it would do us no good to find some control that minimizes the cost functional if that control does not produce the desired results.

There are many types of control problems, depending on the context of the system under study and the objectives of the controllers. Three types are generally considered; it is convenient to introduce some notation here:

Definition 1. Let U be a function space, let y1 be some given initial data in some Hilbert space V, and let T > 0. We define the set of reachable states in time T as

$$R(T; y_1) = \{\, y(T) : y \text{ is a solution of } (*) \text{ with } y(0) = y_1 \text{ and } u \in U \,\}$$

Here, U will be chosen based on the properties desired of our control. Control theory generally
considers three types of controllability:
Definition 2. We say system (*) is exactly controllable in time T if for every initial data y1, R(T; y1) = V.
Definition 3. System (*) is said to be approximately controllable in time T if for every initial data y1, R(T; y1) is dense in V.
Definition 4. System (*) is said to be null controllable in time T if, for every initial data y1, we have 0 ∈ R(T; y1).
Depending on the type of controllability required by the application, the Optimal Control setting will take as its set of admissible controls the control space that permits the type of controllability needed by the application. The question of control theory is then "for what types of controls does this system have the desired controllability properties?"
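
To make these notions concrete with the simplest possible example (not worked out in the original text, but immediate): take V = ℝ, U = L²(0,T), and the scalar system y′ = u. Given initial data y1 and any target yT, the constant control u ≡ (yT − y1)/T yields

$$y(T) = y_1 + \int_0^T u \, dt = y_T$$

so R(T; y1) = ℝ for every T > 0, and the system is exactly (hence also approximately and null) controllable in any time.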

3 Section Two: Finite Linear Systems


In order to introduce some of the concerns of control theory in a more concrete setting, let’s
develop the theory of controllability for finite, linear systems, which turns out to have an
exceptionally elegant answer. In this section, we consider the system:

$$y' = Ay + Bu, \quad t \in (0,T) \tag{1}$$

$$y(0) = y_0 \in \mathbb{R}^n$$

where A ∈ M(n, n), B ∈ M(n, m), and m ≤ n.
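
Before developing the theory, it may help to see system (1) concretely. The following is a minimal numerical sketch (my own, with a hypothetical pair (A, B) and an arbitrary control u; none of these choices are prescribed by the notes) of how a control drives the state:

```python
import numpy as np
from scipy.integrate import solve_ivp

# A hypothetical 2x2 system (n = 2) with one scalar control (m = 1).
A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
B = np.array([[0.0],
              [1.0]])

def u(t):
    # An arbitrary admissible control in L^2(0, T).
    return np.array([np.sin(2.0 * t)])

def rhs(t, y):
    # Right-hand side of y' = Ay + Bu.
    return A @ y + B @ u(t)

y0 = np.array([1.0, 0.0])
T = 5.0
sol = solve_ivp(rhs, (0.0, T), y0)
print("y(T) =", sol.y[:, -1])  # the state reached by this particular control
```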

The controllability problem for this class of systems has a satisfying algebraic answer. Our program for this section will be as follows: first, we will show the equivalence of all three types of controllability for finite linear systems; then we'll prove Kalman's rank condition, which characterizes exactly when a system is controllable by the rank of a certain matrix.

First, let's establish the link between exact and null controllability. Note that linearity of this system immediately gives us an equivalence between null controllability and exact controllability: for T > 0, if we wish to reach a target y(T) = yT ≠ 0, we can define z = y − x, which solves the system

$$z' = Az + Bu$$

$$z(0) = y(0) - x(0)$$

where x is the (uncontrolled) solution to the system

$$x' = Ax$$

$$x(T) = y_T$$

Then y(T) = yT iff z(T) = 0. So, due to the linear, finite-dimensional nature of the system, null controllability and exact controllability are equivalent. When we examine the heat equation, we'll see that this is not the case in an infinite-dimensional context.

In a similarly quick way, we can brush concerns of approximate controllability under the rug by noting that the set of reachable states is affine: the variation of constants formula

$$y(T) = e^{AT} y_0 + \int_0^T e^{A(T-s)} B u(s) \, ds$$

writes y(T) explicitly as an affine function of the control u, which makes this clear. For finite n, the only dense affine subspace of ℝⁿ is ℝⁿ itself, so in the finite case approximate controllability is equivalent to exact controllability.

Let's concern ourselves now with characterizing exactly which systems are controllable (in any of these three equivalent senses).
It is often convenient to examine the properties of a system strongly related to our primal system, namely the adjoint system of (1). Letting A* denote the adjoint of A, we consider the adjoint system, which runs backward in time:

$$-\varphi' = A^* \varphi, \quad t \in (0,T) \tag{2}$$

$$\varphi(T) = \varphi_T$$

This system gives rise to the dual notion to controllability, known as observability.
Definition 5. The adjoint system is said to be observable in time T > 0 if there exists c > 0 such that

$$\int_0^T |B^* \varphi|^2 \, dt \geq c |\varphi(0)|^2$$

for all φT ∈ ℝⁿ, with φ the corresponding solution to the adjoint system.
The above inequality is often referred to as the observation inequality. The concept of observability makes concrete the general notion that "the action of the controls is sufficient to determine the state of the system". In particular, the observation inequality guarantees that the solution of the adjoint equation at t = 0 is completely determined by the B*φ term, which is the quantity observed through the control.
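
In finite dimensions the observation inequality is quite concrete: since φ(t) = e^{A*(T−t)}φT, the left-hand side equals φTᵀGφT, where G = ∫₀ᵀ e^{As}BB*e^{A*s} ds is the controllability Gramian, and (because e^{A*T} is invertible) observability is equivalent to G being positive definite. Here is a small numerical sketch of that check (my own illustration; the pair (A, B) is the hypothetical one from the simulation sketch above):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
T = 1.0

# Approximate G = \int_0^T e^{As} B B^* e^{A^*s} ds by a midpoint rule.
N = 2000
dt = T / N
G = np.zeros((2, 2))
for i in range(N):
    E = expm(A * (i + 0.5) * dt) @ B
    G += (E @ E.T) * dt

# Strictly positive eigenvalues <=> the observation inequality holds
# for some c > 0, i.e. the adjoint system is observable in time T.
print("Gramian eigenvalues:", np.linalg.eigvalsh(G))
```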

Let us now relate null controllability (and hence exact controllability) to the adjoint system.

Lemma 1. Given T > 0, a control u ∈ L²(0, T), and an initial data point y0, the solution y of (1) satisfies y(T) = 0 iff

$$\int_0^T \langle u, B^* \varphi \rangle \, dt + \langle y_0, \varphi(0) \rangle = 0$$

for every φT ∈ ℝⁿ, where φ is the corresponding solution of the adjoint system (2).

Proof. Let φT be arbitrary in ℝⁿ, and let φ be the corresponding solution of the adjoint system. Then note that

$$\langle y', \varphi \rangle = \langle Ay + Bu, \varphi \rangle$$

$$\langle -\varphi', y \rangle = \langle \varphi, Ay \rangle$$

Summing these two gives

$$\langle y', \varphi \rangle + \langle \varphi', y \rangle = \langle Bu, \varphi \rangle \;\Rightarrow\; \frac{d}{dt}\langle y, \varphi \rangle = \langle u, B^* \varphi \rangle \;\Rightarrow\; \int_0^T \langle u, B^* \varphi \rangle \, dt = \langle y(T), \varphi_T \rangle - \langle y_0, \varphi(0) \rangle$$

Since φT was arbitrary, y(T) = 0 iff ⟨y(T), φT⟩ = 0 for every φT, which by the identity above is exactly the stated condition.

We're now almost ready to make explicit the duality between observability and controllability, but before this, we need a short lemma that follows from the above condition:

Lemma 2. Suppose the functional J : ℝⁿ → ℝ defined by

$$J(\varphi_T) = \frac{1}{2} \int_0^T |B^* \varphi|^2 \, dt + \langle y_0, \varphi(0) \rangle$$

has a minimizer φ̂T, and let φ̂ be the corresponding solution to the adjoint system with final data φ̂T. Then u = B*φ̂ is a control of system (1) that drives initial data y0 to 0. (That is, the solution to system (1) with control u and y(0) = y0 has y(T) = 0.)

Proof. If φ̂T is a minimizer for J, then we have

$$\lim_{h \to 0} \frac{J(\hat\varphi_T + h \varphi_T) - J(\hat\varphi_T)}{h} = 0, \quad \forall \varphi_T \in \mathbb{R}^n$$

$$\Leftrightarrow \int_0^T \langle B^* \hat\varphi, B^* \varphi \rangle \, dt + \langle y_0, \varphi(0) \rangle = 0, \quad \forall \varphi_T \in \mathbb{R}^n$$

which, by Lemma 1, says exactly that u = B*φ̂ is a control for (1) driving y0 to 0.

Interestingly, even though we claimed that questions of controllability precede questions of optimality, this gives a hint that we may often be able to establish controllability properties simply by finding a minimizer of the appropriate functional. It does bear mentioning, however, that this control is not necessarily unique.

In order to find minimizers of a functional, we often employ the Direct Method of the Calculus of Variations (DMCV), which we state here for completeness:

Theorem 1. Let H be a reflexive Banach space, K a closed convex subset of H, and φ : K → ℝ a function such that:
1. φ is convex
2. φ is lower semi-continuous
3. if K is unbounded, then φ is coercive
Then φ attains its minimum in K.

Now we're in a position to establish the correspondence between observability and controllability:

Theorem 2. System (1) is exactly controllable in time T iff the adjoint system is observable in time T.

Proof. (Observability ⇒ Controllability)
By the above lemma, if the adjoint system is observable in time T, then it suffices to show that for every y0 ∈ ℝⁿ the functional J has a minimum.

By the DMCV, since J is continuous and convex, it suffices to show that J is coercive. But the observation inequality gives us

$$\int_0^T |B^* \varphi|^2 \, dt \geq c |\varphi(0)|^2, \quad \forall \varphi_T \in \mathbb{R}^n$$

and since φ(0) = e^{A*T}φT with e^{A*T} invertible, the same inequality holds with |φT|² on the right for a possibly smaller constant c > 0. So, certainly,

$$J(\varphi_T) \geq \frac{c}{2} |\varphi_T|^2 - |\langle y_0, \varphi(0) \rangle|$$

and since the second term grows at most linearly in |φT|, this gives us coercivity.

(Controllability ⇒ Observability)
Suppose that system (1) is exactly controllable in time T, but the adjoint system is not observable in time T.
Then there exists a sequence (φTᵏ)ₖ≥₁ ⊂ ℝⁿ with |φTᵏ| = 1 for all k ≥ 1 and

$$\lim_{k \to \infty} \int_0^T |B^* \varphi^k|^2 \, dt = 0$$

By compactness of the unit sphere, a subsequence (φT^{km})ₘ≥₁ converges to some φT with |φT| = 1. Letting φ denote the solution to the adjoint system with final data φT, continuity of the data-to-solution map gives

$$\int_0^T |B^* \varphi|^2 \, dt = 0$$

so that B*φ ≡ 0 on [0, T]. Now, exact controllability provides, for each y0 ∈ ℝⁿ, a control u driving y0 to 0, and Lemma 1 then yields

$$\int_0^T \langle u, B^* \varphi \rangle \, dt + \langle y_0, \varphi(0) \rangle = 0$$

Since B*φ ≡ 0, the integral vanishes, so ⟨y0, φ(0)⟩ = 0 for every y0, forcing φ(0) = 0. But φ(0) = e^{A*T}φT and the matrix exponential is invertible, so φT = 0, contradicting the fact that |φT| = 1.

We need one more lemma, which relies crucially on the finite-dimensional nature of the system, in order to establish Kalman's rank condition. It is the following:

Lemma 3. To show observability, it suffices to show that B*φ(t) = 0 for all t ∈ [0, T] implies φT = 0.

Proof. Suppose the above condition holds, and define the seminorm

$$|\varphi_T|_* = \left( \int_0^T |B^* \varphi|^2 \, dt \right)^{1/2}$$

Since |·|* is a seminorm, it is a norm iff the assumed condition holds; thus it is a norm. But all norms on ℝⁿ are equivalent, so there exists C > 0 such that

$$C|\varphi_T| \leq |\varphi_T|_* \quad \Leftrightarrow \quad C^2 |\varphi_T|^2 \leq \int_0^T |B^* \varphi|^2 \, dt$$

Using once more that φ(0) = e^{A*T}φT, so that |φ(0)| ≤ ‖e^{A*T}‖ |φT|, we obtain, for some K > 0,

$$K |\varphi(0)|^2 \leq \int_0^T |B^* \varphi|^2 \, dt$$

Hence the system is observable.

Having established the equivalence of observability and controllability, we're now in a position to prove Kalman's rank condition:

Theorem 3. System (1) is exactly controllable in some time T > 0 iff

$$\mathrm{rank}\,[B, AB, \ldots, A^{n-1}B] = n$$

Moreover, if the system is controllable for some time T > 0, then it is controllable for all time.

Proof. (⇒) Suppose that rank[B, AB, ..., A^{n−1}B] < n. Then the rows are linearly dependent, so there exists some v ∈ ℝⁿ with v ≠ 0 and

$$v^T [B, AB, \ldots, A^{n-1}B] = [v^T B, v^T AB, \ldots, v^T A^{n-1}B] = 0$$

which gives v^T A^k B = 0 for k ∈ {0, 1, ..., n−1}. By the Cayley-Hamilton theorem, A satisfies its own characteristic polynomial, so there exist constants c1, ..., cn such that

$$A^n = c_1 A^{n-1} + \cdots + c_n I$$

Multiplying by v^T on the left and B on the right gives v^T A^n B = c1 v^T A^{n−1} B + ... + cn v^T B = 0 and, inductively, v^T A^k B = 0 for all k ∈ ℕ. Expanding the matrix exponential as a power series, this gives v^T e^{At} B = 0 for all t. But the variation of constants formula tells us that the state of the system at any time is given by

$$y(t) = e^{At} y_0 + \int_0^t e^{A(t-s)} B u(s) \, ds$$

Taking the inner product against v kills the integrand, giving

$$\langle v, y(T) \rangle = \langle v, e^{AT} y_0 \rangle$$

So the projection of the solution along v is independent of the control; hence the system is not controllable, for any T > 0.

(⇐) Suppose now that rank([B, AB, ..., A^{n−1}B]) = n. We know it suffices to show that the adjoint system is observable, for which, by Lemma 3, it suffices to show that B*φ(t) = 0 for all t ∈ [0, T] implies φT = 0.

So suppose B*φ ≡ 0. Since φ(t) = e^{A*(T−t)}φT, we have 0 = B*e^{A*(T−t)}φT for all 0 ≤ t ≤ T. Taking successive derivatives in t and evaluating at t = T yields B*(A*)ᵏφT = 0 for all k ≥ 0; that is, φT lies in the kernel of the matrix obtained by stacking B*, B*A*, ..., B*(A*)^{n−1}. But since [B, AB, ..., A^{n−1}B] is of full rank, so is this stacked adjoint matrix, hence φT = 0, which is our desired result. Note finally that the rank condition makes no reference to T, so controllability for one time T > 0 is controllability for every time.
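
Kalman's condition is straightforward to check numerically. The sketch below (my own; the test pairs are hypothetical examples) assembles the controllability matrix and computes its rank:

```python
import numpy as np

def kalman_rank(A: np.ndarray, B: np.ndarray) -> int:
    """Rank of the controllability matrix [B, AB, ..., A^{n-1}B]."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.linalg.matrix_rank(np.hstack(blocks))

# The oscillator-like pair used above: controllable.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = np.array([[0.0], [1.0]])
print(kalman_rank(A, B))   # 2 == n, so exactly controllable for every T

# A pair that fails the condition: the control never influences the
# second (decoupled) component, and the rank drops accordingly.
A2 = np.diag([1.0, 2.0])
B2 = np.array([[1.0], [0.0]])
print(kalman_rank(A2, B2))  # 1 < n, so not controllable
```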

4 Section Three: Controllability of the Heat Equation
As an example of how one extends the techniques and concerns of the finite linear case to the infinite-dimensional case, where exact, approximate and null controllability are no longer equivalent, let us now consider the heat equation. Let Ω ⊂ ℝⁿ, Γ = ∂Ω, and T > 0; we consider

$$y_t - \Delta y = u \quad \text{in } \Omega \times (0,T)$$

$$y = 0 \quad \text{on } \Gamma \times (0,T)$$

$$y(x,0) = y_0(x) \quad \text{in } \Omega$$

with the control supported in a subdomain, supp(u) ⊆ ω ⊂ Ω. This is known (for obvious reasons) as the interior control problem. We consider here the questions of approximate and exact controllability, as the question of null controllability, while answered in the affirmative, is considerably more involved.

Let us consider the problem of exact controllability (which is resolved exceptionally quickly):

Theorem 4. The heat equation is not exactly controllable for any ω ⊊ Ω, T > 0.

Proof. Note that for any T > 0, solutions of the heat equation are smooth on Ω \ ω̄ at time T, by the smoothing property of the heat semigroup. So for any target v ∈ L²(Ω) which fails to be C^∞ on Ω \ ω̄, we have y(·, T) ≠ v, so that the system is not exactly controllable.

Note: In the case that ω = Ω, we have by standard existence and uniqueness theory for the PDE that the system is exactly controllable in H¹(Ω).
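
The obstruction in Theorem 4 is the instantaneous smoothing of the heat semigroup, which is easy to observe numerically. The following sketch (my own illustration, using the exact sine-series solution on Ω = (0, π) with u = 0) shows the high-frequency content of rough initial data collapsing almost immediately, so only very smooth states can appear at time T away from the control region:

```python
import numpy as np

# Exact sine-series solution of y_t = y_xx on (0, pi) with Dirichlet
# boundary conditions and u = 0: the k-th Fourier coefficient decays
# like exp(-k^2 t).
N = 200
k = np.arange(1, N + 1)

rng = np.random.default_rng(0)
a0 = rng.normal(size=N)   # rough initial data: coefficients don't decay

for t in [0.0, 0.01, 0.1]:
    a = a0 * np.exp(-k**2 * t)
    tail = np.linalg.norm(a[50:])  # mass in modes k > 50: a roughness proxy
    print(f"t = {t:5.2f}   high-mode mass = {tail:.2e}")
```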
Lemma 4. The heat equation is approximately controllable for any T > 0.

We give two proofs of this fact. The first is quick but non-constructive, though it proves approximate controllability for a more general class of systems; the second involves the construction of an explicit control for the heat equation, and is somewhat more enlightening. We recall here a consequence of Holmgren's uniqueness theorem: if P is an elliptic partial differential operator with analytic coefficients, and Py is real analytic in some open neighbourhood of Ω ⊂ ℝⁿ, then y is analytic there.

For the first proof, recall that the Hahn-Banach theorem implies that if A is a subspace of a Hilbert space X, and 0 is the only element orthogonal to A, then A is dense in X.

Proof. Consider a system of the form

$$\partial_t y + Ay = f + v \quad \text{in } \Omega \times (0,T)$$

$$y = 0 \quad \text{on } \Gamma \times (0,T)$$

$$y(x, 0; v) = y_0(x) \quad \text{for } x \in \Omega$$

where A is a second-order elliptic operator whose coefficients satisfy

$$a_{ij} \in L^\infty(\Omega \times (0,T)), \qquad \sum_{i,j} a_{ij} \zeta_i \zeta_j \geq \alpha \|\zeta\|^2, \quad \alpha > 0, \ \zeta \in \mathbb{R}^n$$

Note first that, by a translation, it suffices to consider f = 0, y0 = 0.

Now let ψ ∈ L²([0,T] × Ω) be orthogonal to the subspace generated by y(v) as v runs over L²([0,T] × Ω), with y(v) the solution to the above system with control v. That is,

$$\int_{[0,T]\times\Omega} y(v)\,\psi \; dx\,dt = 0, \quad \forall v \in L^2([0,T]\times\Omega)$$

We then let ξ be the solution to the following adjoint system, writing Σ = Γ × (0,T) for the lateral boundary:

$$-\frac{d\xi}{dt} + A^*\xi = \psi \quad \text{in } [0,T]\times\Omega$$

$$\xi = 0 \quad \text{on } \Sigma$$

$$\xi(x,T) = 0 \quad \text{on } \Omega$$

Then, integrating by parts (the boundary terms vanish since y(0) = 0 and ξ(T) = 0), we have

$$\int_{[0,T]\times\Omega} y(v)\,\psi \; dx\,dt = \int_{[0,T]\times\Omega} y(v)\left(-\frac{d\xi}{dt} + A^*\xi\right) dx\,dt = \int_{[0,T]\times\Omega} \left(\frac{dy(v)}{dt} + Ay(v)\right)\xi \; dx\,dt = \int_{[0,T]\times\Omega} \xi v \; dx\,dt$$

for all v ∈ L²([0,T] × Ω). The left-hand side vanishes for every v, so ∫ξv = 0 for every v, which forces ξ = 0 in [0,T] × Ω, and in turn ψ = −dξ/dt + A*ξ = 0. So, by the Hahn-Banach theorem, the reachable set is dense in L²([0,T] × Ω), and the system is approximately controllable.

Note that the above proof holds for a much more general class of systems than just the heat equation, so we've actually proved something much stronger. This proof is, however, somewhat unsatisfying, as it gives no intuition as to how to construct the desired control. To that end, let's examine the following constructive proof of the approximate controllability of the heat equation. We first define the functional J_ε : L²(Ω) → ℝ,

$$J_\epsilon(\varphi_T) = \frac{1}{2} \int_Q \varphi^2 \; dx\,dt + \epsilon \|\varphi_T\|_{L^2(\Omega)} - \int_\Omega y_1 \varphi_T \; dx$$

where φ is the solution to the adjoint equation with final data φT, Q = ω × (0,T), and y1 ∈ L²(Ω) is the final target state. Suppose ε > 0 is given; by the linearity of the system, we may suppose that y0 = 0, as in the last proof.
The idea is that the minimizer of this functional, should it exist in L²(Ω), will generate our desired control.

Lemma 5. If φ̂T is a minimum of J_ε in L²(Ω) and φ̂ is the solution of the adjoint system with φ̂T as final data, then u = φ̂|_ω is an approximate control for the heat equation (i.e., ‖y(T) − y1‖_{L²(Ω)} ≤ ε).

Proof. Suppose that φ̂T ∈ L²(Ω) is a minimum of the functional J_ε. Then, for any ψ0 ∈ L²(Ω) and h ∈ ℝ, we have

$$J_\epsilon(\hat\varphi_T) \leq J_\epsilon(\hat\varphi_T + h\psi_0)$$

Writing ψ for the solution of the adjoint system with ψ0 as final data, and expanding J_ε(φ̂T + hψ0) explicitly, we have

$$J_\epsilon(\hat\varphi_T + h\psi_0) = \frac{1}{2}\int_Q \hat\varphi^2 \; dx\,dt + \frac{h^2}{2}\int_Q \psi^2 \; dx\,dt + h\int_Q \hat\varphi\psi \; dx\,dt + \epsilon\|\hat\varphi_T + h\psi_0\|_{L^2(\Omega)} - \int_\Omega y_1(\hat\varphi_T + h\psi_0) \; dx$$

So 0 ≤ J_ε(φ̂T + hψ0) − J_ε(φ̂T) gives us

$$0 \leq \epsilon\left(\|\hat\varphi_T + h\psi_0\|_{L^2(\Omega)} - \|\hat\varphi_T\|_{L^2(\Omega)}\right) + \frac{h^2}{2}\int_Q \psi^2 \; dx\,dt + h\left(\int_Q \hat\varphi\psi \; dx\,dt - \int_\Omega y_1\psi_0 \; dx\right)$$

The triangle inequality for the L²-norm gives us

$$0 \leq \epsilon|h|\,\|\psi_0\|_{L^2(\Omega)} + \frac{h^2}{2}\int_Q \psi^2 \; dx\,dt + h\left(\int_Q \hat\varphi\psi \; dx\,dt - \int_\Omega y_1\psi_0 \; dx\right)$$

Dividing through by h > 0 and taking the limit, and then doing similarly for h < 0, gives us

$$\left|\int_Q \hat\varphi\psi \; dx\,dt - \int_\Omega y_1\psi_0 \; dx\right| \leq \epsilon\|\psi_0\|_{L^2(\Omega)}, \quad \forall \psi_0 \in L^2(\Omega)$$

We now want to relate the first term on the left-hand side to our approximate solution. To do so, remark that y solves

$$y_t - \Delta y = u\chi_\omega$$

So, setting u = φ̂ and multiplying through by ψ (which solves the adjoint equation ψ_t + Δψ = 0), we integrate by parts; the boundary terms on Γ vanish since y and ψ both vanish there:

$$\int_Q \hat\varphi\psi \; dx\,dt = \int_{\Omega\times(0,T)} (y_t - \Delta y)\psi \; dx\,dt = \int_\Omega y\psi \,\Big|_0^T - \int_0^T\!\!\int_\Omega y(\psi_t + \Delta\psi) \; dx\,dt = \int_\Omega y(T)\psi_0 \; dx - \int_\Omega y_0\,\psi(0) \; dx = \int_\Omega y(T)\psi_0 \; dx$$

using y0 = 0. So

$$\left|\int_\Omega (y(T) - y_1)\psi_0 \; dx\right| \leq \epsilon\|\psi_0\|_{L^2(\Omega)}, \quad \forall \psi_0 \in L^2(\Omega)$$

Taking ψ0 = y(T) − y1 guarantees us that

$$\|y(T) - y_1\|_{L^2(\Omega)} \leq \epsilon$$

which proves our result.

9
Now, of course, we’ve assumed the existence of a minimizer of this functional in L2 . Once
we’ve confirmed this, we’ve established approximate controllability with the control con-
structed above.
Lemma 6. ∃φ̂T ∈ L2 (Ω) such that J (φ̂T ) = minφt ∈L2 (Ω) J (φT )
Proof. We want to apply the Direct Method of the Calculus of Variations. Notice that the bound we established in the previous proof on J_ε(φ̂T + hψ0) − J_ε(φ̂T) gives continuity of the functional, and convexity follows immediately from the linearity of integration and the triangle inequality for the L²-norm. So we just need to establish coercivity, for which it suffices to show

$$\liminf_{\|\varphi_T\|_{L^2(\Omega)} \to \infty} \frac{J_\epsilon(\varphi_T)}{\|\varphi_T\|_{L^2(\Omega)}} \geq \epsilon$$

Take an arbitrary sequence (φ_{T,j}) ⊂ L²(Ω) with ‖φ_{T,j}‖_{L²(Ω)} → ∞ and normalize:

$$\tilde\varphi_{T,j} = \frac{\varphi_{T,j}}{\|\varphi_{T,j}\|_{L^2(\Omega)}}$$

and let φ̃_j be the solution to the adjoint system with final data φ̃_{T,j}. Then

$$\frac{J_\epsilon(\varphi_{T,j})}{\|\varphi_{T,j}\|_{L^2(\Omega)}} = \frac{\|\varphi_{T,j}\|_{L^2(\Omega)}}{2}\int_{(0,T)\times\omega} \tilde\varphi_j^2 \; dx\,dt + \epsilon - \int_\Omega y_1\tilde\varphi_{T,j} \; dx$$

There are two possible cases:

1. liminf_{j→∞} ∫_{(0,T)×ω} |φ̃_j|² dxdt > 0. In this case, since the term ∫_Ω y1φ̃_{T,j} dx is bounded by the L² norm of y1 (by Hölder), the right-hand side tends to ∞, so we have coercivity directly.

2. liminf_{j→∞} ∫_{(0,T)×ω} |φ̃_j|² dxdt = 0. In this case we note that since (φ̃_{T,j}) is bounded in L²(Ω), there exists a weakly convergent subsequence φ̃_{T,j} ⇀ ψ0, and correspondingly φ̃_j ⇀ ψ, where ψ is the solution of the adjoint system with final data ψ0. By weak lower semi-continuity we have

$$\int_{(0,T)\times\omega} \psi^2 \; dx\,dt \leq \liminf_{j\to\infty} \int_{(0,T)\times\omega} \tilde\varphi_j^2 \; dx\,dt = 0$$

So ψ = 0 on (0,T) × ω, and Holmgren's uniqueness theorem guarantees that ψ = 0 on all of (0,T) × Ω, so ψ0 = 0. Therefore φ̃_{T,j} ⇀ 0 weakly in L²(Ω), so that ∫_Ω y1φ̃_{T,j} dx → 0 as well, and thus

$$\liminf_{j\to\infty} \frac{J_\epsilon(\varphi_{T,j})}{\|\varphi_{T,j}\|_{L^2(\Omega)}} \geq \liminf_{j\to\infty}\left(\epsilon - \int_\Omega y_1\tilde\varphi_{T,j} \; dx\right) = \epsilon$$

as desired.
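
To see the construction in action, here is a rough numerical sketch of the minimization of J_ε on Ω = (0, π) with ω = (1, 2) (the truncation to N sine modes, the quadrature, and the choice of optimizer are my own simplifications, not part of the notes). Writing e_k(x) = √(2/π) sin(kx), the adjoint solution with final data Σ c_k e_k is φ(t) = Σ c_k e^{−k²(T−t)} e_k, and a short computation shows that a single matrix M carries both the quadratic term of J_ε and the map from c to the Fourier coefficients of y(T); at the exact minimizer, stationarity forces ‖Mc − b‖ ≤ ε, which is Lemma 5 in coordinates:

```python
import numpy as np
from scipy.optimize import minimize

# Finite-mode sketch: Omega = (0, pi), omega = (1, 2), N modes.
N, T, eps = 12, 1.0, 1e-2
k = np.arange(1, N + 1)
lam = k.astype(float) ** 2

# S[j,l] = \int_omega e_j e_l dx, by a midpoint Riemann sum.
x = np.linspace(1.0, 2.0, 5000, endpoint=False) + 0.5 / 5000
E = np.sqrt(2.0 / np.pi) * np.sin(np.outer(k, x))
S = (E @ E.T) / 5000.0

# tau[j,l] = \int_0^T e^{-(lam_j + lam_l)(T - t)} dt, in closed form.
LS = lam[:, None] + lam[None, :]
tau = (1.0 - np.exp(-LS * T)) / LS

# If phi_T has coefficients c, then y(T) (from u = phi on omega) has
# coefficients M @ c, and the quadratic part of J_eps is c M c / 2.
M = S * tau

# Target y1(x) = x(pi - x), expanded in the sine basis.
xx = np.linspace(0.0, np.pi, 20000, endpoint=False) + np.pi / 40000
b = (np.sqrt(2.0 / np.pi) * np.sin(np.outer(k, xx)) @ (xx * (np.pi - xx))) * (np.pi / 20000)

def J(c):
    # The functional J_eps in coordinates.
    return 0.5 * c @ M @ c + eps * np.linalg.norm(c) - b @ c

c = minimize(J, 0.1 * np.ones(N), method="BFGS").x
# Lemma 5 predicts ||y(T) - y1|| <= eps at the exact minimizer.
print("||y(T) - y1||_{L^2} ~", np.linalg.norm(M @ c - b))
```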

5 Section Four: Optimal Control and Hamilton-Jacobi Equations
We would like to approach the problem of constructing optimal controls. To that end, consider the following problem:

$$\dot x(s) = f(x(s), \alpha(s)) \quad (t < s < T)$$

$$x(t) = x \in \mathbb{R}^n$$

with α a control, where we take the space of admissible controls, U, to be the set of measurable functions on [0, T] taking values in a given set A, with T > 0 a fixed time. We define the cost functional to be

$$C_{x,t}(\alpha) = \int_t^T L(x(s), \alpha(s)) \, ds + \psi(x(T))$$

and our goal is to find some α that minimizes this functional. In our current setting, we suppose that f, L and ψ are all bounded and Lipschitz continuous on their domains of definition.
As it turns out, there is a useful result that gives us a starting place for this endeavour, providing necessary conditions for such a control to be optimal. It is the Pontryagin Maximum Principle, which we state here:
Theorem 5. Given the control system

$$\dot y = f(y, u), \quad u(t) \in U, \ t \in [0,T], \qquad y(0) = y_0, \qquad \phi_i(y(T)) = 0, \ i = 1, \ldots, m, \ m \leq n$$

consider the problem of finding a control u* such that

$$\psi(y(T, u^*)) = \max_{u \in U} \psi(y(T, u)), \qquad \psi : \mathbb{R}^n \to \mathbb{R}$$

Define y* to be the trajectory of the state under the optimal control. Then there exist a non-zero vector function p(t) and constants λ0 ≥ 0, λ1, ..., λm such that:

$$p(T) = \lambda_0 \nabla\psi(y^*(T)) + \sum_{i=1}^m \lambda_i \nabla\phi_i(y^*(T))$$

$$\dot p(t) = -p(t)\, D_y f(y^*(t), u^*(t)), \quad t \in [0,T]$$

$$p(\tau) f(y^*(\tau), u^*(\tau)) = \max_{w \in U} \{\, p(\tau) f(y^*(\tau), w) \,\} \quad \text{for a.e. } \tau \in [0,T]$$
Note that we can actually use this theorem in the more general optimization problem

$$\dot y = f(y, u), \quad u(t) \in U, \ t \in [0,T], \qquad y(0) = y_0$$

with the control u maximizing

$$\max_{u \in U} \left\{ \psi(y(T, u)) - \int_0^T L(t, y(t), u(t)) \, dt \right\}$$

by introducing an auxiliary state y_{n+1} such that

$$\dot y_{n+1} = L(t, y(t), u(t)), \qquad y_{n+1}(0) = 0$$

and considering the re-defined problem

$$\max_{u \in U} \{\, \psi(y(T, u)) - y_{n+1}(T, u) \,\}$$

Obviously, we can equally well consider minimization problems with this technique.
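
As a quick worked illustration of how the principle and this reformulation are used together (my own example, argued formally with an unconstrained control set), consider ẏ = u with y(0) = y0 and the objective

$$\max_u \left\{ \psi(y(T)) - \int_0^T \frac{u(t)^2}{2} \, dt \right\}, \qquad \psi(y) = -\frac{y^2}{2}$$

Since f = u does not depend on y, the costate equation gives ṗ = 0, so p is constant; in the normal case λ0 = 1, the transversality condition gives p = ψ′(y(T)) = −y(T); and the maximum condition max_w {pw − w²/2} gives u* ≡ p. Hence y(T) = y0 + pT and p = −(y0 + pT), so

$$u^* \equiv -\frac{y_0}{1+T}$$

Maximizing the objective directly over constant controls recovers the same answer, as it should.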
The Pontryagin Maximum Principle provides us with necessary conditions for a control to be optimal, and so it helps us search for optimal controls. But how ought we check the optimality of the controls once we've determined a candidate? This is where the Hamilton-Jacobi-Bellman equation comes into play.

Let us give the context in which we will be working.
Definition 6. Given a Hamilton-Jacobi-Bellman equation of the form

$$u_t + \min_{a \in A} \{ f(x,a) \cdot Du + h(x,a) \} = 0 \quad \text{in } \mathbb{R}^n \times [0,T)$$

$$u = g \quad \text{on } \mathbb{R}^n \times \{t = T\}$$

and writing H(p, x) = min_{a∈A}{f(x,a)·p + h(x,a)} for its Hamiltonian, a function u is called a viscosity solution to the above problem if u = g on ℝⁿ × {t = T}, and for all v ∈ C^∞(ℝⁿ × (0,T)):

if u − v has a local maximum at (x0, t0) ∈ ℝⁿ × (0,T), then

$$v_t(x_0, t_0) + H(Dv(x_0, t_0), x_0) \geq 0$$

and if u − v has a local minimum at (x0, t0) ∈ ℝⁿ × (0,T), then

$$v_t(x_0, t_0) + H(Dv(x_0, t_0), x_0) \leq 0$$
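
For a concrete example of my own (chosen to match the sign conventions above): take f(x, a) = a with A = [−1, 1] and h ≡ 0, so H(p, x) = min_{|a|≤1} a·p = −|p|, and the equation is u_t − |Du| = 0 with terminal data g. The value function of the underlying control problem is

$$u(x,t) = \min_{|y - x| \leq T - t} g(y)$$

For g(x) = |x| this gives u(x, t) = max{|x| − (T − t), 0}, which satisfies the equation classically away from the cone |x| = T − t but is not differentiable on it; the viscosity inequalities are exactly what make sense of the equation there.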

The utility of viscosity solutions comes from the following fact:

Theorem 6. If a Hamilton-Jacobi PDE satisfies the following Lipschitz conditions:

$$|H(p, x) - H(q, x)| \leq C|p - q|$$

$$|H(p, x) - H(p, y)| \leq C|x - y|(1 + |p|)$$

for all x, y, p, q ∈ ℝⁿ, then the terminal value problem for the PDE has at most one viscosity solution.

Our strategy for obtaining sufficiency conditions for controls generated through the Pontryagin Maximum Principle will be the following. First, we define the value function

$$u(x,t) = \inf_{\alpha \in U} \left\{ \int_t^T L(x(s), \alpha(s)) \, ds + \psi(x(T)) \right\}$$

Our goal will be to show that this function is, in fact, the unique viscosity solution to the Hamilton-Jacobi-Bellman equation, thereby giving us a condition against which we can check potential controls.

Our first task is to establish a useful lemma, which states that an optimal control must be optimal on subintervals, allowing us to break up its computation. This is known as the Dynamic Programming Principle:

Lemma 7. For all τ ∈ [t, T] we have

$$u(x,t) = \inf_{\alpha \in U} \left\{ \int_t^\tau L(x(s), \alpha(s)) \, ds + u(x(\tau), \tau) \right\}$$

Proof. We split the problem up. First, choose some arbitrary control α1 ∈ U and solve

$$\dot x_1(s) = f(x_1(s), \alpha_1(s)) \quad (t < s < \tau), \qquad x_1(t) = x$$

Fix ε > 0 and choose α2 ∈ U such that

$$\int_\tau^T L(x_2(s), \alpha_2(s)) \, ds + \psi(x_2(T)) \leq u(x_1(\tau), \tau) + \epsilon$$

where

$$\dot x_2(s) = f(x_2(s), \alpha_2(s)) \quad (\tau < s < T), \qquad x_2(\tau) = x_1(\tau)$$

Then we set α3(s) = α1(s) on (t, τ) and α3(s) = α2(s) on (τ, T).

The fact that u(x, t) = inf_{α∈U} C_{x,t}(α) gives us

$$u(x,t) \leq C_{x,t}(\alpha_3) = \int_t^\tau L(x_1(s), \alpha_1(s)) \, ds + \int_\tau^T L(x_2(s), \alpha_2(s)) \, ds + \psi(x_2(T)) \leq \int_t^\tau L(x_1(s), \alpha_1(s)) \, ds + u(x_1(\tau), \tau) + \epsilon$$

by our choice of α2. But α1 was arbitrary, so

$$u(x,t) \leq \inf_{\alpha \in U} \left\{ \int_t^\tau L(x(s), \alpha(s)) \, ds + u(x(\tau), \tau) \right\} + \epsilon$$

In the other direction, fix again ε > 0 and let α4 ∈ U be such that

$$u(x,t) + \epsilon \geq \int_t^T L(x_4(s), \alpha_4(s)) \, ds + \psi(x_4(T))$$

where

$$\dot x_4(s) = f(x_4(s), \alpha_4(s)) \quad (t < s < T), \qquad x_4(t) = x$$

But certainly

$$u(x_4(\tau), \tau) \leq \int_\tau^T L(x_4(s), \alpha_4(s)) \, ds + \psi(x_4(T))$$

so that

$$u(x,t) + \epsilon \geq \int_t^\tau L(x_4(s), \alpha_4(s)) \, ds + u(x_4(\tau), \tau) \geq \inf_{\alpha \in U} \left\{ \int_t^\tau L(x(s), \alpha(s)) \, ds + u(x(\tau), \tau) \right\}$$

Taking ε → 0 in both inequalities finishes the proof.
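
The Dynamic Programming Principle is also the basis for computing value functions numerically: starting from u(·, T) = ψ and applying the principle over short intervals [t, t + Δt], with the running cost and dynamics frozen over each step, gives backward induction. Here is a minimal semi-Lagrangian sketch in one dimension (my own illustration; the dynamics, costs, and grids are hypothetical choices):

```python
import numpy as np

# Value function for dx/ds = a (|a| <= 1), running cost L(x, a) = x^2,
# terminal cost psi(x) = x^2, via the discrete Dynamic Programming
# Principle:  u(x, t) ~ min_a { dt * L(x, a) + u(x + dt * a, t + dt) }.
T, nt, nx = 1.0, 200, 401
dt = T / nt
xs = np.linspace(-2.0, 2.0, nx)
actions = np.linspace(-1.0, 1.0, 21)

u = xs**2                       # u(x, T) = psi(x)
for _ in range(nt):             # march backward from t = T to t = 0
    candidates = [
        dt * xs**2 + np.interp(xs + dt * a, xs, u)  # interpolate u at x + dt*a
        for a in actions
    ]
    u = np.min(candidates, axis=0)

# Here the optimal policy drives x toward 0, so u(0, 0) = 0 and the
# value grows with |x|; print a few values of the computed u(x, 0).
for x0 in [-1.0, 0.0, 1.0]:
    print(f"u({x0:+.1f}, 0) ~ {np.interp(x0, xs, u):.3f}")
```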

We now need to establish that the value function is Lipschitz continuous:

Lemma 8. Suppose that f, L, and ψ are bounded and Lipschitz continuous. Then the value function u(x, t) is bounded and Lipschitz continuous; that is, there exists a constant K such that

$$|u(x,t)| \leq K$$

$$|u(x,t) - u(x', t')| \leq K(|x - x'| + |t - t'|)$$

Proof. Boundedness of u is clear from the boundedness of the component functions, so we focus on Lipschitz continuity.
Let (x̄, t̄) be given initial data, and let ε > 0 be given. Choose a control w ∈ U such that

$$C_{\bar x, \bar t}(w) \leq u(\bar x, \bar t) + \epsilon$$

Let y(t) be the trajectory under w, and define z(t) to be the trajectory under different initial data (x̂, t̂) but with the same control. Since f is bounded and Lipschitz, we have

$$|y(\hat t) - z(\hat t)| \leq |y(\hat t) - y(\bar t)| + |y(\bar t) - z(\hat t)| \leq C|\hat t - \bar t| + |\bar x - \hat x|$$

and, by Gronwall's inequality,

$$|y(t) - z(t)| \leq e^{C|t - \hat t|}\,|y(\hat t) - z(\hat t)| \leq e^{CT}\left(C|\bar t - \hat t| + |\bar x - \hat x|\right)$$

Using next the boundedness and Lipschitz continuity of L and ψ, we have

$$C_{\hat x, \hat t}(w) = C_{\bar x, \bar t}(w) + \int_{\hat t}^{\bar t} L(z, w) \, dt + \int_{\bar t}^T \left( L(z, w) - L(y, w) \right) dt + \psi(z(T)) - \psi(y(T))$$

$$\leq C_{\bar x, \bar t}(w) + K|\hat t - \bar t| + \int_{\bar t}^T K|z(t) - y(t)| \, dt + K|z(T) - y(T)| \leq C_{\bar x, \bar t}(w) + C'\left(|\hat t - \bar t| + |\bar x - \hat x|\right)$$

so that

$$u(\hat x, \hat t) \leq C_{\hat x, \hat t}(w) \leq u(\bar x, \bar t) + \epsilon + C'\left(|\hat t - \bar t| + |\bar x - \hat x|\right)$$

We then let ε → 0 and repeat the argument with the roles of (x̂, t̂) and (x̄, t̄) reversed, which yields Lipschitz continuity of u.

We are now ready to state the main result of this section:

Theorem 7. The value function u is the unique viscosity solution of the terminal value problem for the Hamilton-Jacobi-Bellman equation:

$$u_t + \min_{\alpha \in A}\{\, f(x,\alpha)\cdot Du + L(x,\alpha) \,\} = 0$$

$$u = \psi \quad \text{on } \mathbb{R}^n \times \{t = T\}$$

Proof. From our bounds on f and L, the Hamiltonian H(p, x) = min_{α∈A}{f(x,α)·p + L(x,α)} satisfies the necessary Lipschitz conditions, so the equation has at most one viscosity solution. We aim to show that u(x, t) is this very solution.

Certainly, by construction, u(x, T) = ψ(x). So, letting φ ∈ C^∞(ℝⁿ × (0,T)), we have two things to show:

If u − φ attains a local maximum at (x0, t0), then

$$\varphi_t(x_0, t_0) + \min_{w \in A}\{\, f(x_0, w)\cdot D\varphi(x_0, t_0) + L(x_0, w) \,\} \geq 0$$

and if u − φ attains a local minimum at (x0, t0), then

$$\varphi_t(x_0, t_0) + \min_{w \in A}\{\, f(x_0, w)\cdot D\varphi(x_0, t_0) + L(x_0, w) \,\} \leq 0$$
We begin with the first. By translation and by restricting our attention to a neighbourhood of (x0, t0), we may assume u(x0, t0) = φ(x0, t0) and u(x, t) ≤ φ(x, t) otherwise. Now suppose, for contradiction, that there exist w ∈ A and θ > 0 such that

$$\varphi_t(x_0, t_0) + D\varphi(x_0, t_0)\cdot f(x_0, w) + L(x_0, w) < -\theta$$

that is, the first condition does not hold. By continuity (and the Lipschitz continuity of f and L), there is some δ > 0 such that whenever |t − t0| ≤ δ and |x − x0| ≤ Cδ, we have

$$\varphi_t(x, t) + D\varphi(x, t)\cdot f(x, w) < -\theta - L(x, w)$$

Let y(t) be the trajectory under the constant control w, that is, the solution of

$$\dot y(t) = f(y(t), w), \qquad y(t_0) = x_0$$

Then we have

$$u(y(t_0+\delta), t_0+\delta) - u(x_0, t_0) \leq \varphi(y(t_0+\delta), t_0+\delta) - \varphi(x_0, t_0) = \int_{t_0}^{t_0+\delta} \frac{d}{dt}\varphi(y(t), t) \, dt$$

$$= \int_{t_0}^{t_0+\delta} \left( \varphi_t(y(t), t) + D\varphi(y(t), t)\cdot f(y(t), w) \right) dt \leq -\int_{t_0}^{t_0+\delta} L(y(t), w) \, dt - \delta\theta$$

so that

$$u(x_0, t_0) \geq \int_{t_0}^{t_0+\delta} L(y(t), w) \, dt + u(y(t_0+\delta), t_0+\delta) + \delta\theta$$

But the Dynamic Programming Principle gives

$$u(x_0, t_0) \leq \int_{t_0}^{t_0+\delta} L(y(t), w) \, dt + u(y(t_0+\delta), t_0+\delta)$$

since the constant control w is one competitor in the infimum; so we have a contradiction.

Now we prove the second condition for u to be a viscosity solution. Again, by translation and restriction of attention, we may suppose WLOG that u(x0, t0) = φ(x0, t0) and u(x, t) ≥ φ(x, t) for all nearby (x, t). Suppose that the second condition fails, that is, that there exists θ > 0 such that

$$\varphi_t(x_0, t_0) + D\varphi(x_0, t_0)\cdot f(x_0, w) + L(x_0, w) > \theta, \quad \forall w \in A$$

By continuity again, we have, for some δ > 0 and all x, t with |t − t0| ≤ δ and |x − x0| ≤ Cδ,

$$\varphi_t(x, t) + D\varphi(x, t)\cdot f(x, w) > \theta - L(x, w), \quad \forall w \in A$$

Choosing an arbitrary control β ∈ U and letting x(t) denote the corresponding trajectory of the system started at (x0, t0), we have

$$u(x(t_0+\delta), t_0+\delta) - u(x_0, t_0) \geq \varphi(x(t_0+\delta), t_0+\delta) - \varphi(x_0, t_0) = \int_{t_0}^{t_0+\delta} \frac{d}{dt}\varphi(x(t), t) \, dt$$

$$= \int_{t_0}^{t_0+\delta} \left( \varphi_t(x(t), t) + D\varphi(x(t), t)\cdot f(x(t), \beta(t)) \right) dt \geq \int_{t_0}^{t_0+\delta} \left( \theta - L(x(t), \beta(t)) \right) dt$$

so that

$$\int_{t_0}^{t_0+\delta} L(x(t), \beta(t)) \, dt + u(x(t_0+\delta), t_0+\delta) \geq u(x_0, t_0) + \delta\theta, \quad \forall \beta \in U$$

Taking the infimum over all controls and applying the Dynamic Programming Principle to the left-hand side gives

$$u(x_0, t_0) \geq u(x_0, t_0) + \delta\theta$$

a contradiction. Hence u is the unique viscosity solution to the HJB equation.

So we now have a sufficient condition against which we may check the optimality of controls,
as promised.

