Optimal Control of Hybrid Systems
$$0 \le (\lambda_q^{jk})_1 + (\lambda_q^{jk})_2 + \underline{l}_q^{jk} \tag{6}$$

$$(\lambda_q^{jk})_{|i|} \le (\overline{f}_q^{jk})_{|i|}\, \Delta_i V_q^{jk}, \qquad i \in \{-2, -1, 1, 2\} \tag{7}$$

$$(\lambda_q^{jk})_{|i|} \le (\underline{f}_q^{jk})_{|i|}\, \Delta_i V_q^{jk}, \qquad i \in \{-2, -1, 1, 2\} \tag{8}$$

$$0 \le V_r^{jk} - V_q^{jk} + s(x^{jk}, q, r) \qquad \forall x^{jk} \in S_{q,r} \tag{9}$$

$$0 = V_{q_f}^{00} \tag{10}$$

where (6)-(8) form a combination of backward and forward difference approximations of (3).

For $x = x^{jk} + \theta_1 h e_1 + \theta_2 h e_2 \in X^{jk}$, define the interpolating function

$$V_q(x) = (1-\theta_1)(1-\theta_2) V_q^{jk} + \theta_1(1-\theta_2) V_q^{(j+1)k} + (1-\theta_1)\theta_2 V_q^{j(k+1)} + \theta_1\theta_2 V_q^{(j+1)(k+1)} \tag{11}$$

The following result applies.

THEOREM 1—DISCRETIZATION IN $R^2$
If $V_q^{jk}$ satisfy (6)-(10) for all $q \in Q$ and for all grid points $x^{jk} \in X \subset R^2$ such that $X^{jk}$ intersects $X$, then the interpolating function $V_q$ defined by (11) satisfies (3)-(5) and, for every $(x_0, q_0)$, $V_{q_0}(x_0)$ is a lower bound of $J(x_0, q_0)$.

Proof. Assume that $x \in X^{jk}$. Noting that $\Delta_1 V_q^{jk} = \Delta_{-1} V_q^{(j+1)k}$ and $\Delta_2 V_q^{jk} = \Delta_{-2} V_q^{j(k+1)}$, the inequalities (6)-(8) taken at grid points $jk$, $j(k+1)$, $(j+1)k$, and $(j+1)(k+1)$ give

$$0 \le f_{q1}(x,u)\, \Delta_1 V_q^{jk} + f_{q2}(x,u)\, \Delta_2 V_q^{jk} + l_q(x,u) \tag{12}$$

$$0 \le f_{q1}(x,u)\, \Delta_1 V_q^{j(k+1)} + f_{q2}(x,u)\, \Delta_2 V_q^{jk} + l_q(x,u) \tag{13}$$

$$0 \le f_{q1}(x,u)\, \Delta_1 V_q^{jk} + f_{q2}(x,u)\, \Delta_2 V_q^{(j+1)k} + l_q(x,u) \tag{14}$$

$$0 \le f_{q1}(x,u)\, \Delta_1 V_q^{j(k+1)} + f_{q2}(x,u)\, \Delta_2 V_q^{(j+1)k} + l_q(x,u) \tag{15}$$

The gradient of $V_q$ is given by

$$\frac{\partial V_q}{\partial x} = \begin{bmatrix} (1-\theta_2)\, \Delta_1 V_q^{jk} + \theta_2\, \Delta_1 V_q^{j(k+1)} \\ (1-\theta_1)\, \Delta_2 V_q^{jk} + \theta_1\, \Delta_2 V_q^{(j+1)k} \end{bmatrix}^T$$

and thus, adding (12)-(15) weighted with $(1-\theta_1)(1-\theta_2)$, $(1-\theta_1)\theta_2$, $\theta_1(1-\theta_2)$, and $\theta_1\theta_2$ respectively proves that (3) is met for $x$. The inequality (4) is met since $V_q$ is a convex combination of grid points that all meet (9), and (5) is the same condition as (10).
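The interpolation (11) and the gradient expression used in the proof can be checked numerically. The following Python sketch assumes that $\Delta_1$ and $\Delta_2$ denote forward difference quotients with grid size $h$ (their definition lies outside this section). The function names and the sample grid values are hypothetical.

```python
def bilinear(V, j, k, th1, th2):
    """Interpolated value (11) in cell (j, k); th1, th2 in [0, 1] are local coordinates.

    V[j][k] holds the grid value V^{jk}; the first index runs along x1.
    """
    return ((1 - th1) * (1 - th2) * V[j][k]
            + th1 * (1 - th2) * V[j + 1][k]
            + (1 - th1) * th2 * V[j][k + 1]
            + th1 * th2 * V[j + 1][k + 1])


def gradient(V, j, k, th1, th2, h):
    """Gradient of the interpolant, written with forward difference quotients.

    Assumption: Delta_1 V^{jk} = (V^{(j+1)k} - V^{jk}) / h, and similarly for Delta_2.
    """
    d1_jk = (V[j + 1][k] - V[j][k]) / h           # Delta_1 V^{jk}
    d1_jk1 = (V[j + 1][k + 1] - V[j][k + 1]) / h  # Delta_1 V^{j(k+1)}
    d2_jk = (V[j][k + 1] - V[j][k]) / h           # Delta_2 V^{jk}
    d2_j1k = (V[j + 1][k + 1] - V[j + 1][k]) / h  # Delta_2 V^{(j+1)k}
    return ((1 - th2) * d1_jk + th2 * d1_jk1,
            (1 - th1) * d2_jk + th1 * d2_j1k)


# Hypothetical values on a single cell with grid size h = 0.5.
V = [[0.0, 1.0], [2.0, 5.0]]
print(bilinear(V, 0, 0, 0.3, 0.7), gradient(V, 0, 0, 0.3, 0.7, 0.5))
```

Since each gradient component is a convex combination of two difference quotients, the four weights used in the proof add up to exactly these coefficients.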
Note a special case in which the computational load of the local optimizations in Theorem 1 is lightened: if $f_q(x,u) = h_q(x) + g_q(x)u$ and $l_q(x,u) = o_q(x) + m_q(x)u$ while $\Omega_u = [-1, 1]$, then $u$ can be entirely eliminated from (6)-(8) by replacing $\overline{f}_q^{jk}$, $\underline{f}_q^{jk}$, and $\underline{l}_q^{jk}$ with the upper bound of $h_q \pm g_q$, the lower bound of $h_q \pm g_q$, and the lower bound of $o_q \pm m_q$ over the cell, respectively. This will double the set of equations (6)-(8), but the functions $h_q$, $g_q$, $o_q$, and $m_q$ are optimized over $\hat{X}^{jk}$ solely.

Remark 1. Any function that meets the constraints, even the trivial choice $V_q(x) = 0$, is a lower bound on the true cost. Thus, to yield useful bounds, $V_q(x)$ needs to be maximized subject to (6)-(10). The maximization could be carried out in either one point, $(x_0, q_0)$, or several points, $(x, q) \in X \times Q$, simultaneously.

For the original, non-discretized problem, the result of a maximization of $V_q(x)$ is always identical to the optimal cost, regardless of whether the maximization is done at a particular initial state or by summing the values at several initial states.
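Remark 1 can be illustrated in a purely discrete setting, where no spatial discretization is involved and the constraints reduce to $0 \le V_r - V_q + s(q, r)$ together with a zero value at the target mode. The Python sketch below is a minimal illustration under these assumptions: the three-mode graph, its costs, and all function names are made up. It computes the optimal costs with Dijkstra's algorithm and checks that they satisfy the constraints, that the trivial choice $V = 0$ does as well, and that raising an optimal value breaks a constraint.

```python
import heapq


def optimal_costs(s, target):
    """Cheapest switching cost from every mode to `target` (Dijkstra on reversed edges).

    s maps a pair (q, r) to the non-negative cost s(q, r) of switching from q to r.
    """
    nodes = {q for edge in s for q in edge}
    V = {q: float("inf") for q in nodes}
    V[target] = 0.0
    heap = [(0.0, target)]
    while heap:
        d, r = heapq.heappop(heap)
        if d > V[r]:
            continue  # stale heap entry
        for (q, r2), cost in s.items():
            if r2 == r and d + cost < V[q]:
                V[q] = d + cost
                heapq.heappush(heap, (V[q], q))
    return V


def is_feasible(V, s, target, tol=1e-12):
    """Check V_target = 0 and 0 <= V_r - V_q + s(q, r) for every edge (q, r)."""
    return abs(V[target]) <= tol and all(
        V[r] - V[q] + cost >= -tol for (q, r), cost in s.items())


# Hypothetical switching costs between three modes; "c" is the target mode.
s = {("a", "b"): 1.0, ("b", "c"): 2.0, ("a", "c"): 4.0, ("c", "a"): 1.0}
V_opt = optimal_costs(s, "c")
V_zero = {q: 0.0 for q in V_opt}             # trivial lower bound
V_raised = dict(V_opt, a=V_opt["a"] + 0.1)   # exceeds the optimal cost at "a"
print(is_feasible(V_opt, s, "c"), is_feasible(V_zero, s, "c"), is_feasible(V_raised, s, "c"))
```

In this toy case every feasible $V$ lies below the optimal cost, so maximizing $V$ at any mode recovers that cost.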
However, for the discretized problem, different choices of maximization criteria may lead to different results. Fortunately, experience from examples

5. Computing the Control Law

Provided that the lower bound, $V_q$, is a good enough approximation of the optimal cost, the optimal feedback control law can be calculated as

$$\hat{u}(x,q) = \operatorname*{argmin}_{u \in \Omega_u} \left\{ \frac{\partial V_q}{\partial x} f_q(x,u) + l_q(x,u) \right\}$$

$$\hat{\mu}(x,q) = \operatorname*{argmin}_{\mu \in \Omega_\mu \,\mid\, x \in S_{q,\nu}} \left\{ V_\nu(x) + s(x,q,\nu) \right\} \tag{16}$$

where $\nu = \nu(x, q, \mu)$. Thus, the continuous input, $\hat{u}$, is computed in a standard way. The discrete input, $\hat{\mu}$, is chosen such that switching occurs whenever there exists a discrete mode whose value function is lower than the value function of the current mode minus the cost of switching there.

Consider the true optimal value function, $V_q^\star$. For those $(x, q, r)$ where the optimal trajectory requires mode switching, the inequality (4) will turn into an equality, i.e. $V_q^\star = V_r^\star + s(x, q, r)$ (this will be shown in

$$\begin{cases} \dot{x}_1 = x_2 \\ \ldots \end{cases} \tag{17}$$

Figure 2: Gear efficiency at various speeds. (Curves $g_1(x)$ and $g_2(x)$; horizontal axis $x$.)

The problem is to bring (17) from $x_i = (-5, 0)$, $q_i = 1$ to $x_f = (0, 0)$, $q_f = 1$ in minimum time. Torque losses when using the clutch call for an additional penalty for gear changes. Thus, the components of (2) have been chosen as $l_1(x,u) = l_2(x,u) = 1$, $s(x,1,2) = s(x,2,1) = 0.5$.

The problem is plugged into the machinery of Section 4 and $V_q(x)$ is maximized over the region $-5.5 \le x_1 \le 1.0$, $-0.5 \le x_2 \le 3.0$.

[Figures 3 and 4: plots referred to in the text; only the label $V_2$ and the axis name $x_2$ are recoverable.]

The result is shown in Figures 3 and 4, where $x_i$ and $x_f$ have also been marked. The functions look rather similar, since the cost for changing gears is only 0.5. One can see that $V_1$ has a threshold along the line $x_2 = 1$. Figure 2 reveals that the first gear is almost useless for high speeds, leading to $V_1 = V_2 + 0.5$ for $x_2 > 1$. This is the cost for using the second gear optimally after a gear switch.

Studying Fig. 5, where $V_1 - V_2$ is plotted, the strategy for changing gears is even more obvious: there is only one discrete mode allowed under optimal control when the difference reaches its maximum magnitude. In conformity with previous reasoning, $V_1 - V_2 = 0.5$ for $x_2 > 1$, indicating the need for a change of gears when using the first gear at high speed. Analogously, the second gear should be avoided when starting with zero speed.

A simulation of the controlled system is shown in Fig. 6, where the initial point is marked with a square. The state trajectory coincides with the one of a professional rally-driver with lousy brakes. In
the beginning, maximum throttle is used on the first gear (solid line). When the speed roughly reaches the point of equal efficiency between the gears ($x_2 = 0.5$), they are switched in favor of the second gear (dashed line). At half the distance, the gas pedal is eased off to use the braking force of the engine. In the end, the first gear is used again before the origin is hit. As seen in the figure, the granularity of the discretization grid ($h = 0.18$) prevents the solution from hitting the exact origin.

Figure 5: The difference between $V_1$ and $V_2$. (Axes: $x_1$, $x_2$.)

Figure 6: Phase portrait of a simulation. The solid line shows where gear number one has been used, the dashed line shows the second gear. The initial point is marked with a square. (Axes: $x_1$, $x_2$.)

EXAMPLE 2—ALTERNATE HEATING OF TWO FURNACES
Since the industrial power fee is determined by the highest peak of the season [5], it is desirable to spread the power consumption evenly over time. This is handled by load control, which means that the available electrical power is alternated between different loads of the mill.

In this example, the temperature of two furnaces should be controlled by alternate heating. The system has two continuous states that correspond to the temperature of the furnaces and is given by $\dot{x} = f_q(x)$, where

$$f_1(x) = \begin{bmatrix} -x_1 + u_0 \\ -2x_2 \end{bmatrix}, \qquad f_2(x) = \begin{bmatrix} -x_1 \\ -2x_2 + u_0 \end{bmatrix}, \qquad f_3(x) = \begin{bmatrix} -x_1 \\ -2x_2 \end{bmatrix}$$

Thus, there are three discrete modes: $q = 1$ means that the first furnace is heated, $q = 2$ means that the second furnace is heated, and $q = 3$ corresponds to no heating. The cost function to be minimized is

$$J(x_0, q_0) = \int_{t_0}^{\infty} \sum_{i=1}^{2} (x_i - c_i)^2 e^{-t}\, dt + \sum_{k=1}^{M} b\, e^{-t_k}$$

where the desired stationary temperature values are $c_1 = 1/4$, $c_2 = 1/8$ and the cost for switching the power is $b = 1/1000$. Since the furnaces can only be fed by a fixed amount of energy, $u_0$, it is impossible to keep them stationary at the desired temperature. Hence, the time weighting, $e^{-t}$, is necessary to get a bounded cost function.

If $V_q(x, t)$ is defined as the cost for starting in $(x, q)$ at time $t$, then the continuous part of the general time dependent Bellman inequality can be written

$$\frac{\partial V_q(x,t)}{\partial t} + \frac{\partial V_q(x,t)}{\partial x} f_q(x,u,t) + l_q(x,u,t) \ge 0 \tag{18}$$

Rewriting the functions as $V_q(x,t) = e^{-t} \tilde{V}_q(x)$ and $l_q(x,u,t) = e^{-t} \tilde{l}_q(x,u)$ for the furnace example, (18) becomes

$$-\tilde{V}_q(x) + \frac{\partial \tilde{V}_q(x)}{\partial x} f_q(x,u) + \tilde{l}_q(x,u) \ge 0 \tag{19}$$

Thus, the time dependence introduced in Bellman's inequality cancels and techniques similar to those presented above apply.

The optimal control results in a limit cycle as seen in Figure 7. The figure, which contains the phase portrait of the continuous states, shows how the temperature of one furnace always decreases as the other one is heated. By alternate heating, the temperatures first climb up to, and above, the set-point and then both furnaces are turned off and the state drifts towards the origin. This procedure is then repeated over and over again, making the trajectory enclose the desired steady state (marked with a circle in the figure). The trajectory has been dashed for $t \in [0, 2.8]$ to make the limit cycle clear.

Figure 8 shows what happens when the power supply is insufficient for driving both furnaces. Mode 3 is not entered since the temperature set-points are never reached.
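The mode dynamics and the discounted cost of this example can be explored numerically. The Python sketch below integrates $\dot{x} = f_q(x)$ with a forward-Euler step and accumulates an approximation of $J$ over a finite horizon. It assumes that $t_k$ in $J$ are the switching instants and that $t_0 = 0$. The step size, the horizon, and the `threshold_policy` switching rule are hypothetical illustrations, not the optimal control law computed in the paper.

```python
import math

C = (0.25, 0.125)  # desired temperatures c1, c2 from the example
B = 0.001          # switching cost b from the example


def f(x, q, u0):
    """Vector field f_q(x): q = 1 heats furnace 1, q = 2 heats furnace 2, q = 3 is off."""
    x1, x2 = x
    return (-x1 + (u0 if q == 1 else 0.0),
            -2.0 * x2 + (u0 if q == 2 else 0.0))


def simulate(x0, q0, policy, u0=0.8, t_end=20.0, dt=1e-3):
    """Forward-Euler simulation from t = 0; returns (approximate J, final state, final mode)."""
    x, q, cost = x0, q0, 0.0
    for n in range(int(round(t_end / dt))):
        t = n * dt
        r = policy(x, q)
        if r != q:  # assumption: each mode change costs b * exp(-t_k)
            cost += B * math.exp(-t)
            q = r
        cost += ((x[0] - C[0]) ** 2 + (x[1] - C[1]) ** 2) * math.exp(-t) * dt
        dx = f(x, q, u0)
        x = (x[0] + dt * dx[0], x[1] + dt * dx[1])
    return cost, x, q


def threshold_policy(x, q):
    """Hypothetical rule: heat the furnace furthest below its set-point, else switch off."""
    e1, e2 = C[0] - x[0], C[1] - x[1]
    if e1 <= 0.0 and e2 <= 0.0:
        return 3
    return 1 if e1 >= e2 else 2


cost, x_final, q_final = simulate((0.0, 0.0), 3, threshold_policy)
print(cost, x_final, q_final)
```

The finite horizon is a design choice of the sketch: the weight $e^{-t}$ makes the contribution beyond $t = 20$ negligible. A naive threshold rule like this one may switch very often, which the switching cost $b$ penalizes.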
Figure 7: Phase portrait of the continuous states under optimal control when $u_0 = 0.8$. The mode number, $q$, has been marked for the limit cycle. (Horizontal axis: $x_1$.)

Figure 8: Phase portrait of the continuous states under optimal control when $u_0 = 0.4$. (Horizontal axis: $x_1$.)

7. Summary

An extended version of Bellman's inequality was discretized in this paper to compute a lower bound on the optimal cost function, using linear programming. Based on these computations, an approximation of the optimal control feedback law was derived.

Hybrid systems combine discrete and continuous dynamics. The analysis should therefore contain techniques that are well suited for computer science as well as control theory. The emphasis in this paper is on the continuous part, the discrete part consisting of a few system modes. At the other end of the hybrid spectrum, where purely discrete systems are found, $X$ will reduce to a single point. The first inequality of proposition 1 will then be superfluous. The set of inequalities given by (4), possibly large depending on $Q$, should be met for $S_{q,r} = \{x_f\}$. The resulting LP formulation solves the shortest-paths problem on a non-negatively weighted, directed graph, a problem that is usually attacked using Dijkstra's algorithm.

A set of MATLAB commands has been compiled by the authors to make it easy to test the above methods and implement the examples. The LP solver that is used is "PCx", developed by the Optimization Technology Center, Illinois. The MATLAB commands and a manual of usage are available free of charge upon request from the authors.

References

... Control, 43:4, pp. 457–460, April 1998. Special issue on hybrid systems.

[2] D. P. Bertsekas and J. N. Tsitsiklis. Neuro-dynamic Programming. Athena Scientific, 1996.

[3] M. Branicky. "Multiple Lyapunov functions and other analysis tools for switched and hybrid systems." IEEE Transactions on Automatic Control, 43:4, pp. 475–482, April 1998. Special issue on hybrid systems.

... Automation, Lund Institute of Technology, Box 118, S-221 00 Lund, SWEDEN, 1997.

[6] J. Ezzine and A. H. Haddad. "Controllability and observability of hybrid systems." Int. J. Contr., 49, pp. 2045–2055, June 1989.

[7] R. Grossman, A. Nerode, A. Ravn, and H. Rischel. "Models for hybrid systems: Automata, topologies, controllability, observability." In Hybrid Systems, pp. 317–356. Springer, 1993.

[8] M. Johansson. Piecewise Linear Control Systems. PhD thesis TFRT-1052, Dept. of Automatic Control, Lund Institute of Technology, Box 118, S-221 00 Lund, SWEDEN, 1999.

[9] A. Rantzer. "Dynamic programming via convex optimization." In Proceedings of the IFAC World Congress, Beijing, 1999.

[10] A. Rantzer and M. Johansson. "Piecewise linear quadratic optimal control." In Proceedings of American Control Conference, Albuquerque, 1997. Submitted for journal publication.

[11] V. I. Utkin. "Variable structure systems with sliding modes." IEEE Transactions on Automatic Control, AC-22, pp. 212–222, 1977.

[12] R. B. Vinter and R. M. Lewis. "A necessary and sufficient condition for optimality of dynamic programming type, making no a priori assumptions on the controls." SIAM Journal on Control and Optimization, 16:4, pp. 571–583, July 1978.