Optimal Control of Hybrid Systems

Sven Hedlund and Anders Rantzer


Department of Automatic Control
Lund Institute of Technology, Box 118, 221 00 Lund, Sweden.
Phone:(+46)-46-222 42 87, Fax:(+46)-46-138118, Email:sven@control.lth.se

Abstract

This paper presents a method for optimal control of hybrid systems. An inequality of Bellman type is considered and every solution to this inequality gives a lower bound on the optimal value function. A discretization of this "hybrid Bellman inequality" leads to a convex optimization problem in terms of finite-dimensional linear programming. From the solution of the discretized problem, a value function that preserves the lower bound property can be constructed. An approximation of the optimal feedback control law is given and tried on some examples.

Keywords: hybrid systems, optimal control, linear programming, dynamic programming.

1. Introduction

Hybrid systems are systems that involve interaction between discrete and continuous dynamics. Such systems have been studied with growing interest and activity in recent years. One reason for the interest is that modeling and simulation of a complex system often require a combination of mathematical models from a variety of engineering disciplines. The structure of such submodels can be very different, some can be discrete and some continuous.

Very often, the same phenomenon can be described either by a discrete model or a continuous one, depending on the context and purpose of the model [1]. Consider for example an asynchronous discrete-event driven thermostat, which discretizes temperature information as {too hot, too cold, normal}.

Practical control systems typically involve switching between several different modes, depending on the range of operation. Even if the dynamics in each mode is simple and well understood, it is well known that automatic mode switching can give rise to unexpected phenomena.

Basic aspects of hybrid systems were treated in [6], [7], and [11]. For stability analysis, see [3, 8] and references therein. The reformulation of an optimal control problem in terms of linear programming has previously been used for continuous time systems in [9] and [10] and is closely connected to ideas of [12]. Related methods were discussed for discrete systems in [2] and on an abstract level for hybrid systems in [4].

This paper presents a novel computational approach to optimal control of hybrid systems, based on ideas from dynamic programming and convex optimization. Discretization of Bellman's inequality gives a lower bound on the optimal cost in terms of linear programming. A control law which is used for simulation is constructed from the lower bound. The results are demonstrated in some examples.

2. Problem Formulation

Define a hybrid system as

$$\begin{cases} \dot{x}(t) = f_{q(t)}(x(t), u(t)) \\ q(t) = \nu(x(t), q(t^-), \mu(t)) \end{cases} \tag{1}$$

where $x(t) \in X \subset \mathbf{R}^n$ is the state vector, $u(t) \in \Omega_u \subset \mathbf{R}^m$ is a continuous input signal of the system. There is also a discrete input, $\mu(t) \in \Omega_\mu$, which allows for the selection between $N$ different system modes, $q(t) \in Q = \{1, 2, \ldots, N\}$. The notation $q(t^-)$ is used for the left-hand limit of $q$ at $t$. $S_{q,r}$ is a set (parameterized by $q$ and $r$) such that switching from mode $q$ to $r$ is possible when $x \in S_{q,r} \subseteq X$. The time argument, $t$, will often be omitted in the sequel for readability.

The optimal control problem is to minimize the cost function

$$J(x_0, q_0) = \int_{t_0}^{t_f} l_q(x, u)\,dt + \sum_{k=1}^{M} s(x(t_k), q(t_k^-), q(t_k^+)) \tag{2}$$

subject to (1) while bringing the system from an initial state $(x_0, q_0)$ at time $t_0$, to a final state $(x_f, q_f)$
at time $t_f$, where the end time, $t_f$, is free. Here, $M$ is an arbitrary finite number of switches occurring at times $t_0 < t_1 < t_2 < \ldots < t_M < t_f$ and $s(x, q, r) > 0$ is an associated cost for switching from discrete state $q$ to $r$, the continuous part being $x$ just before the switch. Note that $s(\cdot) > 0$ removes the problem of infinitely many jumps in a finite interval.

The framework developed in this paper would also allow the number of continuous states to vary with the discrete mode according to $\dot{x}_q(t) = f_{q(t)}(x_q(t), u_q(t))$, where $x_q(t) \in X_q \subset \mathbf{R}^{n(q)}$, $u_q(t) \in \Omega_{u_q} \subset \mathbf{R}^{m(q)}$. The usage of the system description (1), however, will hopefully prevent the reader from getting stuck on details.

3. Lower Bounds on Optimal Cost

PROPOSITION 1
Let $V_q : X \to \mathbf{R}$, $q = 1, 2, \ldots, N$ be a set of continuous, piecewise $C^1$ functions that satisfy

$$0 \le \frac{\partial V_q(x)}{\partial x} f_q(x, u) + l_q(x, u) \qquad \forall x \in X,\ u \in \Omega_u,\ q \in Q \tag{3}$$

$$0 \le V_r(x) - V_q(x) + s(x, q, r) \qquad \forall x \in S_{q,r},\ q, r \in Q : q \ne r \tag{4}$$

$$0 = V_{q_f}(x_f) \tag{5}$$

where $f_q(x, u)$ gives the dynamics of a hybrid system according to (1), and $l_q(x, u)$ and $s(x, q, r)$ define a cost function for the system according to (2). Then, for every $(x_0, q_0)$, $V_{q_0}(x_0)$ gives a lower bound on the cost for optimally bringing the system from $(x_0, q_0)$ to $(x_f, q_f)$, $x(t) \in X\ \forall t \in [t_0, t_f]$.

Remark 1. Rather than having one single value function, $V(x)$, as would be the case for a purely continuous system, the proposition gives a set of value functions, $V_q(x)$, where $q$ is the initial value of the discrete mode. Note that these functions give the cost for optimal trajectories that are allowed to switch modes; the index $q$ only implies that trajectories starting in mode $q$ are considered.

It is of course possible to think of $V_q(x)$ as one single function, parameterized by $x$ and $q$. For consistent notation, however, $V_q(x)$ has been chosen instead of $V(x, q)$.

Proof. Let $\hat{u}(\cdot)$ and $\hat{\mu}(\cdot)$ be control signals that drive the system from $(x_0, q_0)$ at time $t_0$ to $(x_f, q_f)$ at time $t_f = t_{M+1}$. Let $\hat{q}(t)$ denote the mode trajectory resulting from $\hat{\mu}(t)$ and define $x_k = x(t_k)$, $x_k^- = x(t_k^-)$, and $\hat{q}_k = \hat{q}(t)$, $t_k \le t < t_{k+1}$. Then

$$\begin{aligned}
J(x_0, \hat{q}_0) &= \sum_{k=0}^{M} \int_{t_k}^{t_{k+1}} l_{\hat{q}_k}(x, \hat{u})\,dt + \sum_{k=1}^{M} s(x_k^-, \hat{q}_{k-1}, \hat{q}_k) \\
&\ge -\sum_{k=0}^{M} \int_{t_k}^{t_{k+1}} \frac{\partial V_{\hat{q}_k}(x)}{\partial x} f_{\hat{q}_k}(x, \hat{u})\,dt + \sum_{k=1}^{M} \left[ V_{\hat{q}_{k-1}}(x_k^-) - V_{\hat{q}_k}(x_k^-) \right] \\
&= \sum_{k=0}^{M} \left[ V_{\hat{q}_k}(x_k) - V_{\hat{q}_k}(x_{k+1}) \right] + \sum_{k=1}^{M} \left[ V_{\hat{q}_{k-1}}(x_k) - V_{\hat{q}_k}(x_k) \right] \\
&= V_{\hat{q}_0}(x_0) - V_{\hat{q}_M}(x_{M+1}) = V_{\hat{q}_0}(x_0)
\end{aligned}$$

Also the optimal value function, $V_q^\star(x)$, will meet the constraints (3)-(5), under appropriate interpretation of $\partial V_q(x) / \partial x$. Hence the inequalities do not introduce any conservatism in the lower bound.

4. Discretization

When a computer is used to solve (3)-(5) for a specific control problem, a straightforward approach is to grid the state space and require the inequalities to be met at a set of evenly distributed points in $X$. This approximation will, however, not guarantee a lower bound on the optimal cost, unless the nature of $f_q$ and $V_q$ between the grid points is taken into consideration.

In the case of a two-dimensional continuous state space, introduce the notation

$$\begin{aligned}
x^{jk} &= x_f + jhe_1 + khe_2 \\
X^{jk} &= \{ x^{jk} + \theta_1 he_1 + \theta_2 he_2 : 0 \le \theta_i \le 1 \} \\
\hat{X}^{jk} &= \{ x^{jk} + \theta_1 he_1 + \theta_2 he_2 : -1 \le \theta_i \le 1 \} \\
(\underline{f}_q^{jk})_i &= \min_{x \in \hat{X}^{jk},\, u \in \Omega_u} (f_q(x, u))_i \\
(\overline{f}_q^{jk})_i &= \max_{x \in \hat{X}^{jk},\, u \in \Omega_u} (f_q(x, u))_i \\
\underline{l}_q^{jk} &= \min_{x \in \hat{X}^{jk},\, u \in \Omega_u} l_q(x, u) \\
V_q^{jk} &= V_q(x^{jk}) \\
\Delta_i V_q^{jk} &= (V_q(x^{jk} + he_i) - V_q(x^{jk}))/h \\
\Delta_{-i} V_q^{jk} &= (V_q(x^{jk}) - V_q(x^{jk} - he_i))/h
\end{aligned}$$

where $e_1$ and $e_2$ are unit vectors along the coordinate axes, and $h$ is the grid size.
Figure 1: Illustration of $X^{jk}$ and $\hat{X}^{jk}$, with the grid points $x^{jk}$, $x^{(j+1)k}$ and $x^{j(k+1)}$ in the $(x_1, x_2)$ plane.

Introduce new vector variables, $\lambda_q^{jk} \in \mathbf{R}^n$ for $(j, k, q)$ such that $x^{jk} \in X$, $q \in Q$. The inequalities (3)-(5) can then be replaced by

$$0 \le (\lambda_q^{jk})_1 + (\lambda_q^{jk})_2 + \underline{l}_q^{jk} \tag{6}$$

$$(\lambda_q^{jk})_{|i|} \le (\underline{f}_q^{jk})_{|i|}\, \Delta_i V_q^{jk} \qquad i = -2, -1, 1, 2 \tag{7}$$

$$(\lambda_q^{jk})_{|i|} \le (\overline{f}_q^{jk})_{|i|}\, \Delta_i V_q^{jk} \qquad i = -2, -1, 1, 2 \tag{8}$$

$$0 \le V_r^{jk} - V_q^{jk} + s(x^{jk}, q, r) \qquad \forall x^{jk} \in S_{q,r} \tag{9}$$

$$0 = V_{q_f}^{00} \tag{10}$$

where (6)-(8) form a combination of backward and forward difference approximations of (3).

For $x = x^{jk} + \theta_1 he_1 + \theta_2 he_2 \in X^{jk}$, define the interpolating function

$$V_q(x) = (1 - \theta_1)(1 - \theta_2) V_q^{jk} + \theta_1 (1 - \theta_2) V_q^{(j+1)k} + (1 - \theta_1)\theta_2 V_q^{j(k+1)} + \theta_1 \theta_2 V_q^{(j+1)(k+1)} \tag{11}$$

The following result applies.

THEOREM 1: DISCRETIZATION IN $\mathbf{R}^2$
If $V_q^{jk}$ satisfy (6)-(10) for all $q \in Q$ and for all grid points $x^{jk} \in X \subset \mathbf{R}^2$ such that $X^{jk}$ intersects $X$, then the interpolating function $V_q$ defined by (11) satisfies (3)-(5) and, for every $(x_0, q_0)$, $V_{q_0}(x_0)$ is a lower bound of $J(x_0, q_0)$.

Remark 1. Any function that meets the constraints, even the trivial choice $V_q(x) = 0$, is a lower bound on the true cost. Thus, to yield useful bounds, $V_q(x)$ needs to be maximized subject to (6)-(10). The maximization could be carried out in either one point, $(x_0, q_0)$, or several points, $(x, q) \in X \times Q$, simultaneously.

For the original, non-discretized problem, the result of a maximization of $V_q(x)$ is always identical to the optimal cost, regardless of whether the maximization is done at a particular initial state, or by summing the values at several initial states.

However, for the discretized problem, different choices of maximization criteria may lead to different results. Fortunately, experience from examples shows that the difference between the results of a single-point and a multi-point maximization is often small, making it possible to compute the value function in a large subset of $X \times Q$ by solving one LP.

Remark 2. The restriction $x(t) \in X$ in the optimal control problem is essential. It may happen that for some initial states $x_0$ there exist no admissible solutions inside $X$. Then the maximization of $V_{q_0}(x_0)$ can lead to arbitrarily large values.

Remark 3. The theorem is easily extended to $\mathbf{R}^n$. Define $j = (j_1, j_2, \ldots, j_n)$ and exchange $jk$ for the new multi-index $j$ in the above inequalities. The limits of all summations and enumerations should also be adjusted.

Proof. Assume that $x \in X^{jk}$. Noting that $\Delta_1 V_q^{jk} = \Delta_{-1} V_q^{(j+1)k}$, $\Delta_2 V_q^{jk} = \Delta_{-2} V_q^{j(k+1)}$, the inequalities (6)-(8) taken at grid points $jk$, $j(k+1)$, $(j+1)k$, and $(j+1)(k+1)$ give

$$0 \le f_{q1}(x, u)\Delta_1 V_q^{jk} + f_{q2}(x, u)\Delta_2 V_q^{jk} + l_q(x, u) \tag{12}$$

$$0 \le f_{q1}(x, u)\Delta_1 V_q^{j(k+1)} + f_{q2}(x, u)\Delta_2 V_q^{jk} + l_q(x, u) \tag{13}$$

$$0 \le f_{q1}(x, u)\Delta_1 V_q^{jk} + f_{q2}(x, u)\Delta_2 V_q^{(j+1)k} + l_q(x, u) \tag{14}$$

$$0 \le f_{q1}(x, u)\Delta_1 V_q^{j(k+1)} + f_{q2}(x, u)\Delta_2 V_q^{(j+1)k} + l_q(x, u) \tag{15}$$

The gradient of $V_q$ is given by

$$\frac{\partial V_q}{\partial x} = \begin{bmatrix} (1 - \theta_2)\Delta_1 V_q^{jk} + \theta_2 \Delta_1 V_q^{j(k+1)} \\ (1 - \theta_1)\Delta_2 V_q^{jk} + \theta_1 \Delta_2 V_q^{(j+1)k} \end{bmatrix}^T$$

and thus, adding (12)-(15) weighted with $(1 - \theta_1)(1 - \theta_2)$, $(1 - \theta_1)\theta_2$, $\theta_1(1 - \theta_2)$, and $\theta_1\theta_2$ respectively proves that (3) is met for $x$. The inequality (4) is met since $V_q$ is a convex combination of grid points that all meet (9), and (5) is the same condition as (10).

Note a special case in which the computational load of the local optimizations in Theorem 1 is lightened: if $f_q(x, u) = h_q(x) + g_q(x)u$ and $l_q(x, u) = o_q(x) + m_q(x)u$ while $\Omega_u = [-1, 1]$, then $u$ can be entirely eliminated from (6)-(8) by replacing $\underline{f}_q^{jk}$, $\overline{f}_q^{jk}$, and $\underline{l}_q^{jk}$ with $\underline{h}_q^{jk} \pm \underline{g}_q^{jk}$, $\overline{h}_q^{jk} \pm \overline{g}_q^{jk}$, and $\underline{o}_q^{jk} \pm \underline{m}_q^{jk}$ respectively. This will double the set of equations (6)-(8), but the functions $h_q$, $g_q$, $o_q$, and $m_q$ are optimized over $\hat{X}^{jk}$ solely.

5. Computing the Control Law

Provided that the lower bound, $V_q$, is a good enough approximation of the optimal cost, the optimal feedback control law can be calculated as

$$\begin{aligned}
\hat{u}(x, q) &= \operatorname*{argmin}_{u \in \Omega_u} \left\{ \frac{\partial V_q}{\partial x} f_q(x, u) + l_q(x, u) \right\} \\
\hat{\mu}(x, q) &= \operatorname*{argmin}_{\mu \in \Omega_\mu \,|\, x \in S_{q,\nu}} \left\{ V_\nu(x) + s(x, q, \nu) \right\}
\end{aligned} \tag{16}$$

where $\nu = \nu(x, q, \mu)$. Thus, the continuous input, $\hat{u}$, is computed in a standard way. The discrete input, $\hat{\mu}$, is chosen such that switching occurs whenever there exists a discrete mode for which the value function has a lower value than the value function for the current mode minus the cost for switching there.

Consider the true optimal value function, $V_q^\star$. For those $(x, q, r)$ where the optimal trajectory requires mode switching, the inequality (4) will turn to equality, i.e. $V_q^\star = V_r^\star + s(x, q, r)$ (this will be shown in Ex. 1). A consequence of this is that for (16) to describe correct switching between the modes, $s(x, q, q)$ has to be defined as $s(x, q, q) = \varepsilon > 0$ (rather than the real cost $s(x, q, q) = 0$). For $V_q^\star$, the proper control law is achieved as $\varepsilon$ approaches $0^+$. A small value of $\varepsilon$ suffices, however, for numerical computations.

Integration of (2) along a simulated trajectory based on (16) will provide an upper bound on the optimal cost. The better the control law, the better the estimate.

6. Examples

EXAMPLE 1: A CAR WITH TWO GEARS
Consider the system

$$\begin{cases} \dot{x}_1 = x_2 \\ \dot{x}_2 = g_q(x_2)u, \quad q = 1, 2, \quad |u| \le 1 \end{cases} \tag{17}$$

where $g_q(x)$ is plotted in Fig. 2. This could be seen as a crude model of a car, $u$ being the throttle, $g_q(x)$ the efficiency for gear number $q$.

Figure 2: Gear efficiency at various speeds. The curves $g_1(x)$ and $g_2(x)$ are plotted against $x$.

The problem is to bring (17) from $x_i = (-5, 0)$, $q_i = 1$ to $x_f = (0, 0)$, $q_f = 1$ in minimum time. Torque losses when using the clutch call for an additional penalty for gear changes. Thus, the components of (2) have been chosen as $l_1(x, u) = l_2(x, u) = 1$, $s(x, 1, 2) = s(x, 2, 1) = 0.5$.

The problem is plugged into the machinery of Section 4 and $V_q(x)$ is maximized over a region $-5.5 \le x_1 \le 1.0$, $-0.5 \le x_2 \le 3.0$.

The result is shown in Figures 3 and 4, where $x_i$ and $x_f$ also have been marked. The functions look rather similar, since the cost for changing gears is only 0.5. One can see that $V_1$ has a threshold along the line $x_2 = 1$. Figure 2 reveals that the first gear is almost useless for high speeds, leading to $V_1 = V_2 + 0.5$ for $x_2 > 1$. This is the cost for using the second gear optimally after a gear switch.

Figure 3: Plot of $V_1$. The initial point, $x_i$, is marked with a vertical dashed line, the final point, $x_f$, with a solid line.

Figure 4: Plot of $V_2$.

Studying Fig. 5, where $V_1 - V_2$ is plotted, the strategy for changing gears is even more obvious: there is only one discrete mode allowed under optimal control when the difference hits its maximum distance. In conformity with previous reasoning, $V_1 - V_2 = 0.5$ for $x_2 > 1$, indicating the need for a change of gears when using the first gear at high speed. Analogously, the second gear should be avoided, starting with zero speed.

A simulation of the controlled system is shown in Fig. 6, where the initial point is marked with a square. The state trajectory coincides with the one of a professional rally-driver with lousy brakes. In
the beginning, maximum throttle is used on the first gear (solid line). When the speed roughly reaches the point of equal efficiency between the gears ($x_2 = 0.5$), they are switched in favor of the second gear (dashed line). At half the distance, the gas pedal is lightened to use the braking force of the engine. In the end, the first gear is used again before the origin is hit. As seen in the figure, the granularity of the discretization grid ($h = 0.18$) prevents the solution from hitting the exact origin.

Figure 5: The difference between $V_1$ and $V_2$.

Figure 6: Phase portrait of a simulation. The solid line shows where gear number one has been used, the dashed line shows the second gear. The initial point is marked with a square.

EXAMPLE 2: ALTERNATE HEATING OF TWO FURNACES
Since the industrial power fee is determined by the highest peak of the season [5], it is desirable to spread the power consumption evenly over time. This is handled by load control, which means that the available electrical power is altered between different loads of the mill.

In this example, the temperature of two furnaces should be controlled by alternate heating. The system has two continuous states that correspond to the temperature of the furnaces and is given by $\dot{x} = f_q(x)$, where

$$f_1(x) = \begin{bmatrix} -x_1 + u_0 \\ -2x_2 \end{bmatrix}, \qquad f_2(x) = \begin{bmatrix} -x_1 \\ -2x_2 + u_0 \end{bmatrix}, \qquad f_3(x) = \begin{bmatrix} -x_1 \\ -2x_2 \end{bmatrix}$$

Thus, there are three discrete modes: $q = 1$ means that the first furnace is heated, $q = 2$ means that the second furnace is heated, $q = 3$ corresponds to no heating. The cost function to be minimized is

$$J(x_0, q_0) = \int_{t_0}^{\infty} \sum_{i=1}^{2} (x_i - c_i)^2 e^{-t}\,dt + \sum_{k=1}^{M} b e^{-t_k}$$

where the desired stationary temperature values are $c_1 = 1/4$, $c_2 = 1/8$ and the cost for switching the power is $b = 1/1000$. Since the furnaces can only be fed by a fixed amount of energy, $u_0$, it is impossible to keep them stationary at the desired temperature. Hence, the time weighting, $e^{-t}$, is necessary to get a bounded cost function.

If $V_q(x, t)$ is defined as the cost for starting in $(x, q)$ at time $t$, then the continuous part of the general time dependent Bellman inequality can be written

$$\frac{\partial V_q(x, t)}{\partial t} + \frac{\partial V_q(x, t)}{\partial x} f_q(x, u, t) + l_q(x, u, t) \ge 0 \tag{18}$$

Rewriting the functions like $V_q(x, t) = e^{-t}\tilde{V}_q(x)$ and $l_q(x, u, t) = e^{-t}\tilde{l}_q(x, u)$ for the furnace example, (18) becomes

$$-\tilde{V}_q(x) + \frac{\partial \tilde{V}_q(x)}{\partial x} f_q(x, u) + \tilde{l}_q(x, u) \ge 0 \tag{19}$$

Thus, the time dependence introduced in Bellman's inequality cancels and techniques similar to those presented above apply.

The optimal control results in a limit cycle as seen in Figure 7. The figure, which contains the phase portrait of the continuous states, shows how the temperature of one furnace always decreases as the other one is heated. By alternate heating, the temperatures first climb up to, and above, the set-point and then both furnaces are turned off and the state drifts towards the origin. This procedure is then repeated over and over again, making the trajectory enclose the desired steady state (marked with a circle in the figure). The trajectory has been dashed for $t \in [0, 2.8]$ to make the limit cycle clear.

Figure 8 shows what happens when the power supply is insufficient for driving both furnaces. Mode 3 is not entered since the temperature set-points are never reached.
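The step from (18) to (19) in the furnace example can be checked by direct substitution. The worked derivation below is a sketch using the substitution stated in the text. Its last line, a discounted analogue of the switching inequality (4) with switching cost $b e^{-t}$, does not appear in the paper and is included here as an assumption.

```latex
\begin{align*}
\frac{\partial V_q(x,t)}{\partial t} &= -e^{-t}\,\tilde{V}_q(x),
\qquad
\frac{\partial V_q(x,t)}{\partial x} = e^{-t}\,\frac{\partial \tilde{V}_q(x)}{\partial x}, \\
\text{(18)} \;&\Longleftrightarrow\;
e^{-t}\left( -\tilde{V}_q(x)
  + \frac{\partial \tilde{V}_q(x)}{\partial x} f_q(x,u)
  + \tilde{l}_q(x,u) \right) \ge 0
\;\Longleftrightarrow\; \text{(19)},
\quad \text{since } e^{-t} > 0, \\
% Assumed analogue of (4), not stated in the paper:
0 \le V_r(x,t) - V_q(x,t) + b\,e^{-t}
\;&\Longleftrightarrow\;
0 \le \tilde{V}_r(x) - \tilde{V}_q(x) + b .
\end{align*}
```

For the furnace cost, $\tilde{l}_q(x, u) = \sum_{i=1}^{2} (x_i - c_i)^2$, so (19) involves no explicit time dependence.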
Figure 7: Phase portrait of the continuous states under optimal control when $u_0 = 0.8$. The mode number, $q$, has been marked for the limit cycle.

Figure 8: Phase portrait of the continuous states under optimal control when $u_0 = 0.4$.

7. Summary

An extended version of Bellman's inequality was discretized in this paper to compute a lower bound on the optimal cost function, using linear programming. Based on these computations, an approximation of the optimal control feedback law was derived.

Hybrid systems combine discrete and continuous dynamics. The analysis should therefore contain techniques that are well suited for computer science as well as control theory. The emphasis in this paper is on the continuous part, the discrete part consisting of a few system modes. At the other end of the hybrid spectrum, where purely discrete systems are found, $X$ will reduce to a single point. The first inequality of Proposition 1 will then be superfluous. The set of inequalities given by (4), possibly large depending on $Q$, should be met for $S_{q,r} = \{x_f\}$. The resulting LP formulation solves the shortest-paths problem on a non-negatively weighted, directed graph, a problem that is usually attacked using Dijkstra's algorithm.

A set of MATLAB commands has been compiled by the authors to make it easy to test the above methods and implement the examples. The LP solver that is used is "PCx", developed by the Optimization Technology Center, Illinois. The MATLAB commands and a manual of usage are available free of charge upon request from the authors.

8. References

[1] P. J. Antsaklis and A. Nerode. "Hybrid control systems: An introductory discussion to the special issue." IEEE Transactions on Automatic Control, 43:4, pp. 457–460, April 1998. Special issue on hybrid systems.

[2] D. P. Bertsekas and J. N. Tsitsiklis. Neuro-dynamic Programming. Athena Scientific, 1996.

[3] M. Branicky. "Multiple Lyapunov functions and other analysis tools for switched and hybrid systems." IEEE Transactions on Automatic Control, 43:4, pp. 475–482, April 1998. Special issue on hybrid systems.

[4] M. S. Branicky and S. K. Mitter. "Algorithms for optimal hybrid control." In Proceedings of the 34th Conference on Decision & Control, New Orleans, 1995.

[5] L. Ericsson. Dynamic Load Control, Power Peak Shaving Applied to a Foundry. Lic Tech thesis, Dept. of Industrial Electrical Engineering and Automation, Lund Institute of Technology, Box 118, S-221 00 Lund, SWEDEN, 1997.

[6] J. Ezzine and A. H. Haddad. "Controllability and observability of hybrid systems." Int. J. Contr., 49, pp. 2045–2055, June 1989.

[7] R. Grossman, A. Nerode, A. Ravn, and H. Rischel. "Models for hybrid systems: Automata, topologies, controllability, observability." In Hybrid Systems, pp. 317–356. Springer, 1993.

[8] M. Johansson. Piecewise Linear Control Systems. PhD thesis TFRT-1052, Dept. of Automatic Control, Lund Institute of Technology, Box 118, S-221 00 Lund, SWEDEN, 1999.

[9] A. Rantzer. "Dynamic programming via convex optimization." In Proceedings of the IFAC World Congress, Beijing, 1999.

[10] A. Rantzer and M. Johansson. "Piecewise linear quadratic optimal control." In Proceedings of American Control Conference, Albuquerque, 1997. Submitted for journal publication.

[11] V. I. Utkin. "Variable structure systems with sliding modes." IEEE Transactions on Automatic Control, AC-22, pp. 212–222, 1977.

[12] R. B. Vinter and R. M. Lewis. "A necessary and sufficient condition for optimality of dynamic programming type, making no a priori assumptions on the controls." SIAM Journal on Control and Optimization, 16:4, pp. 571–583, July 1978.
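The purely discrete special case described in Section 7 can be illustrated with a small sketch. It does not call an LP solver: the largest values $V_q$ satisfying $V_{q_f} = 0$ and $V_q \le V_r + s(q, r)$ are computed by Bellman-Ford relaxation, a technique swapped in here for the LP, and are then checked against the discrete forms of (4) and (5). The graph, the switching costs, and all function names are hypothetical.

```python
def shortest_path_values(modes, switch_cost, q_f):
    """Shortest-path cost from every mode to q_f.

    These are the largest values with V[q_f] = 0 and
    V[q] <= V[r] + switch_cost[(q, r)] for every allowed switch q -> r.
    Computed by Bellman-Ford relaxation instead of an LP solver.
    """
    V = {q: float("inf") for q in modes}
    V[q_f] = 0.0
    for _ in range(len(modes) - 1):
        for (q, r), s in switch_cost.items():
            if V[r] + s < V[q]:
                V[q] = V[r] + s
    return V


def satisfies_discrete_inequalities(V, switch_cost, q_f):
    """Check (5) and the discrete form of (4) for every allowed switch."""
    return V[q_f] == 0.0 and all(
        V[r] - V[q] + s >= 0.0 for (q, r), s in switch_cost.items()
    )


# Hypothetical four-mode example; keys are (from_mode, to_mode).
modes = [1, 2, 3, 4]
switch_cost = {(1, 2): 1.0, (2, 4): 2.0, (1, 3): 4.0, (3, 4): 0.5, (2, 3): 0.5}
V = shortest_path_values(modes, switch_cost, q_f=4)
```

In this example every mode other than the final one has at least one switch for which the inequality holds with equality, so no single value can be raised without violating a constraint. This is consistent with reading the shortest-path costs as the maximizer of the LP.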
