Direct Adaptive Control, Direct Optimization. F. Pait and Rodrigo Romano. December 2015, Steve Morse Workshop in Osaka.
The process and its parametrized observer are described by

$$\dot{x} = A_O x + B_O y + D_O u, \tag{1}$$

$$u = K(t)x, \tag{2}$$

with the design output

$$y_D = (I - G(\theta))^{-1} C_D x_D \tag{3}$$

and the output equation

$$y = C_O(\theta)x + G_O(\theta)y. \tag{4}$$

The observer matrices are

$$A_O = \begin{bmatrix} A & 0 \\ 0 & A \end{bmatrix}, \qquad B_O = \begin{bmatrix} b \\ 0 \end{bmatrix}, \qquad D_O = \begin{bmatrix} 0 \\ b \end{bmatrix}, \tag{5}$$

and the design model matrices $(A_D, B_D(\theta), D_D(\theta), C_D)$ are related to them through the transformation $E_O(\theta)$:

$$E_O(\theta)A_O = A_D E_O(\theta), \qquad E_O(\theta)B_O = B_D(\theta), \qquad E_O(\theta)D_O = D_D(\theta), \tag{6}$$

$$C_O(\theta) = C_D E_O(\theta), \qquad G_O(\theta) = G(\theta). \tag{7}$$
using only process input and output data. We write the quantity on the left-hand side as a magnitude squared to emphasize that it takes positive values only. The algebraic Riccati equation

$$A^T P + PA - PBR^{-1}B^T P + (I-G)^{-T} C_O^T\, Q\, C_O\, (I-G)^{-1} = 0,$$

satisfied by $P$, gives along closed-loop trajectories

$$\frac{d}{dt}\, x^T P x = x^T A^T P x + x^T P A x + u^T B^T P x + x^T P B u$$

$$= -x^T (I-G)^{-T} C_O^T\, Q\, C_O\, (I-G)^{-1} x + x^T P B R^{-1} B^T P x + x^T K^T B^T P x + x^T P B K x$$

$$= x^T (K + R^{-1} B^T P)^T R\, (K + R^{-1} B^T P)\, x - y^T Q y - u^T R u.$$

Integrating over an interval of length $T$, with $z^2 = y^T Q y + u^T R u$,

$$\int_{t_i}^{t_i+T} z^2\, dt = x(t_i)^T P\, x(t_i) - x(t_i+T)^T P\, x(t_i+T) + \int_{t_i}^{t_i+T} \left| R^{1/2} (K + R^{-1} B^T P)\, x \right|^2 dt, \tag{8}$$
where the stabilizing feedback gain matrix $\bar{K} = -R^{-1}B^T P$ is unknown, as is the matrix $P$. We now assume that there exists a known matrix $\bar{P}$ such that $\bar{P} - P$ is positive definite, and write

$$z_K(t_i) = x(t_i)^T \bar{P}\, x(t_i) + \int_{t_i}^{t_i+T} z^2\, dt.$$
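In the scalar case this bookkeeping can be checked numerically. The sketch below (all numerical values arbitrary, chosen for illustration) solves the scalar version of the Riccati equation and verifies that, for an arbitrary gain k, the derivative of $x^T P x$ matches the decomposition into the gain-error term minus $y^T Q y + u^T R u$:

```python
import math

# Scalar illustration (all values arbitrary): process dx/dt = a x + b u,
# with y = (1 - g)^(-1) c x standing in for (I - G)^(-1) C_O x.
a, b, c, g, q, r = -1.0, 2.0, 1.5, 0.3, 1.0, 0.5

# Scalar Riccati: 2 a p - p^2 b^2 / r + q c^2 / (1 - g)^2 = 0; positive root.
A2 = b * b / r
p = (2 * a + math.sqrt(4 * a * a + 4 * A2 * q * c * c / (1 - g) ** 2)) / (2 * A2)

# For an arbitrary gain k the identity holds for every x, so x = 1 suffices.
k = -0.8
lhs = 2 * p * (a + b * k)                # d/dt (p x^2) along dx/dt = (a + b k) x
u2 = r * k * k                           # u^T R u with u = k x
y2 = q * (c / (1 - g)) ** 2              # y^T Q y
rhs = r * (k + b * p / r) ** 2 - y2 - u2

print(lhs, rhs)
```

Since the identity is quadratic in x, agreement at a single point confirms it in the scalar case.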
The term in K looks almost like the error in a traditional parameter estimation problem. It would be amenable to treatment by a least-squares algorithm, except that in the present case the error is known in magnitude only⁴. The need arises for tuners with capabilities comparable to those of traditional least-squares or gradient type, which can function with information on the magnitude of the error only.
Direct tuning for direct control. We consider that the feedback control K is linearly parametrized by a vector of parameters $\theta$. Our task then is to tune $\theta$ in a manner that keeps

$$f(\theta, t_i) = \frac{z_{K(\theta)}(t_i)}{\epsilon + x^T(t_i - T)\, x(t_i - T)}$$

small.
1. Choose a sequence of instants $t_i$. We shall fix $t_0 = 0$ and use $t_{i+1} = t_i + T$; however, different durations of the intervals $t_{i+1} - t_i$ might be algorithmically advantageous.
2. During each interval $[t_i, t_{i+1})$ apply a constant feedback gain $K$ to obtain the control cost $f(\theta, t_i)$. The controlled process plays the role of a 0th-order oracle: when queried it supplies the value of $f$, with no hint on how to use it.
$$\frac{\partial F}{\partial x} \approx \frac{F(x+h) - F(x)}{h}.$$

A better-behaved approximation, which doesn't need the difference term, uses a complex step:

$$\frac{\partial F}{\partial x} \approx \frac{\operatorname{Im}[F(x+ih)] - \operatorname{Im}[F(x)]}{\operatorname{Im}[ih]} = \frac{\operatorname{Im}[F(x+ih)]}{h}.$$

We shall not have the opportunity to use this approximation here because there is no experimental manner to compute $f(\theta, t_i)$ for complex arguments, but I find the formula neat and wrote it down for inspiration.
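The two approximations can be compared on a toy function (F = sin here, purely illustrative): the forward difference loses digits to cancellation, while the complex step involves no subtraction and stays accurate down to roundoff.

```python
import cmath

def F(x):
    # Toy scalar function; the analytic derivative of sin is cos.
    return cmath.sin(x)

x0, h = 1.0, 1e-8

# Forward difference: subtractive cancellation limits its accuracy.
fd = (F(x0 + h).real - F(x0).real) / h
# Complex step: Im[F(x + ih)] / h, no subtraction involved.
cs = F(complex(x0, h)).imag / h

exact = cmath.cos(x0).real
print(abs(fd - exact), abs(cs - exact))
```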
3. At each instant $t_i$ use the barycenter formula to compute $\hat{\theta}(t_i)$.
4. Pick $\theta(t_i)$ as a random variable with mean $\hat{\theta}(t_i)$ and, during the interval $[t_i, t_{i+1})$, use the feedback $K(\theta(t_i))$.
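A minimal simulation of the four-step procedure, with a hypothetical quadratic cost standing in for the process oracle and the barycenter recursion of (10)-(12) as the tuner (the cost function, its minimizer at 2.0, and all constants are illustrative assumptions, not from the text):

```python
import math, random

random.seed(0)

def oracle(theta):
    # Stand-in for the control cost f(theta, t_i): the real experiment
    # would run the loop with gain K(theta) over [t_i, t_{i+1}).
    return (theta - 2.0) ** 2

lam = 1.0                  # positive constant in the exponential weights
m, theta_hat = 0.0, 0.0    # m_0 = 0, barycenter estimate starts at 0

for i in range(200):
    # Step 4: test point = current barycenter + zero-mean random curiosity.
    theta = theta_hat + random.gauss(0.0, 0.5)
    # Step 2: query the zero-order oracle for the cost of this test point.
    w = math.exp(-lam * oracle(theta))
    # Step 3, equations (10)-(11): update mass and barycenter.
    m += w
    theta_hat += (w / m) * (theta - theta_hat)

print(theta_hat)   # drifts toward the low-cost region around 2.0
```

Points with large cost receive exponentially small weight, so the running weighted average concentrates near the minimizer without ever using derivative information.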
The barycenter method can be used to tune the parameters $\theta$:

$$m_i = m_{i-1} + e^{-\lambda f(\theta_i, t_i)} \tag{10}$$

$$\hat{\theta}_i = \frac{1}{m_i} \left( m_{i-1}\, \hat{\theta}_{i-1} + e^{-\lambda f(\theta_i, t_i)}\, \theta_i \right) \tag{11}$$

$$\theta_i = \hat{\theta}_{i-1} + \xi_i \tag{12}$$

Here $m_0 = 0$, $\hat{\theta}_0 = 0$, $\theta_i$ is the sequence of test values of the controller parameters, and $\lambda$ is a positive real constant.
The rationale behind the method is that points where $f$ is large receive low weight in comparison with those for which $f$ is small. Unrolling the recursion gives the explicit formula

$$\hat{\theta}_i = \frac{\sum_{j=1}^{i} \theta_j\, e^{-\lambda f(\theta_j, t_j)}}{\sum_{j=1}^{i} e^{-\lambda f(\theta_j, t_j)}}, \tag{9}$$

which is the minimizer of $\sum_{j=1}^{i} \|\theta - \theta_j\|^2\, e^{-\lambda f(\theta_j, t_j)}$.
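A quick numerical check, with arbitrary toy values for the test points and costs, that the recursion (10)-(11) reproduces the batch formula (9):

```python
import math

lam = 0.7
# Arbitrary test points and cost values, for illustration only.
thetas = [0.0, 1.5, 2.0, 2.5, 4.0]
costs  = [4.0, 0.3, 0.0, 0.2, 3.5]

# Recursive form, equations (10)-(11).
m, theta_hat = 0.0, 0.0
for th, f in zip(thetas, costs):
    w = math.exp(-lam * f)
    m += w
    theta_hat = (1.0 / m) * ((m - w) * theta_hat + w * th)

# Batch form, equation (9): weighted average with weights e^{-lambda f}.
num = sum(th * math.exp(-lam * f) for th, f in zip(thetas, costs))
den = sum(math.exp(-lam * f) for f in costs)

print(theta_hat, num / den)   # the two forms agree
```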
In system identification, this method has been employed successfully for filter tuning⁵.
It will prove useful to consider the sequence of test points $\theta_i$ as defined by the sum of the barycenter $\hat{\theta}_{i-1}$ of the previous points and a curiosity $\xi_i$, as spelled out in (12). Then (11) reads

$$\hat{\theta}_i - \hat{\theta}_{i-1} = \frac{e^{-\lambda f(\theta_i, t_i)}}{m_{i-1} + e^{-\lambda f(\theta_i, t_i)}}\, (\theta_i - \hat{\theta}_{i-1}) = F_i(\xi_i)\, \xi_i,$$

and the derivative of $F$ with respect to the curiosity is

$$\frac{\partial F}{\partial \xi} = -\lambda\, \frac{m_{i-1}\, e^{\lambda f}}{\left( m_{i-1}\, e^{\lambda f} + 1 \right)^2}\, \frac{\partial f}{\partial \xi} = -\lambda\, F (1 - F)\, \frac{\partial f}{\partial \xi}$$

(here and in the computations that follow the subscript indicating dependence on the interval $i$ is omitted if there is no ambiguity).
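The expression for $\partial F / \partial \xi$ can be verified against a central finite difference in the scalar case (toy values for $\lambda$, $m_{i-1}$, and $f$, chosen only for the check):

```python
import math

lam, m_prev = 0.8, 3.0

def f(xi):
    # Arbitrary smooth stand-in for f(theta_hat + xi, t_i).
    return (xi - 1.0) ** 2

def F(xi):
    # F = e^{-lambda f} / (m_{i-1} + e^{-lambda f}).
    w = math.exp(-lam * f(xi))
    return w / (m_prev + w)

xi0, h = 0.3, 1e-6
numeric = (F(xi0 + h) - F(xi0 - h)) / (2 * h)   # central difference

fprime = 2 * (xi0 - 1.0)                         # df/dxi for the toy f
efl = math.exp(lam * f(xi0))
analytic = -lam * m_prev * efl / (m_prev * efl + 1) ** 2 * fprime

print(numeric, analytic)
```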
Consider the curiosity $\xi$ a zero-mean Gaussian random variable with covariance $\Sigma$ and density

$$p(\xi) = \frac{1}{\sqrt{(2\pi)^n \det \Sigma}}\; e^{-\frac{1}{2} \xi^T \Sigma^{-1} \xi},$$

for which $\partial p / \partial \xi = -\Sigma^{-1} \xi\, p(\xi)$. Integrating by parts over the search domain $X$,

$$\int_X \frac{\partial F}{\partial \xi}\, p(\xi)\, d\xi = F(\xi)\, p(\xi) \Big|_{\partial X} + \int_X F(\xi)\, \Sigma^{-1} \xi\, p(\xi)\, d\xi,$$

and the boundary term vanishes, so that

$$\mathrm{E}[\hat{\theta}_i - \hat{\theta}_{i-1}] = \mathrm{E}[F_i(\xi)\, \xi] = -\lambda\, \Sigma\, \mathrm{E}\!\left[ F_i(\xi) \left( 1 - F_i(\xi) \right) \nabla f(\hat{\theta}_{i-1} + \xi,\, t_i) \right]. \tag{13}$$
Felipe M. Pait. A tuner that accelerates parameters.
Formula (13) is the main result concerning the barycenter method. It shows that, roughly speaking, the search performed by the barycenter algorithm follows the direction of the negative average gradient of the function to be minimized, the weighted average being taken over the domain where the search is performed.
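In one dimension the Gaussian integration by parts underlying (13), $\mathrm{E}[F(\xi)\,\xi] = \sigma^2\, \mathrm{E}[F'(\xi)]$, can be checked by deterministic quadrature (toy choices of $f$, $\lambda$, $m_{i-1}$, and $\sigma$, used only for this check):

```python
import math

sigma = 0.7
lam, m_prev = 1.0, 2.0

def f(xi):
    return (xi - 0.5) ** 2          # arbitrary smooth cost

def F(xi):
    w = math.exp(-lam * f(xi))
    return w / (m_prev + w)

def Fprime(xi):
    # dF/dxi = -lambda F (1 - F) df/dxi, as derived above.
    return -lam * F(xi) * (1 - F(xi)) * 2 * (xi - 0.5)

def p(xi):
    return math.exp(-xi * xi / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Midpoint-rule quadrature over [-8 sigma, 8 sigma]; the Gaussian tails
# beyond that range are negligible.
n, a, b = 20000, -8 * sigma, 8 * sigma
h = (b - a) / n
xs = [a + (k + 0.5) * h for k in range(n)]
lhs = sum(F(x) * x * p(x) for x in xs) * h                 # E[F(xi) xi]
rhs = sigma ** 2 * sum(Fprime(x) * p(x) for x in xs) * h   # sigma^2 E[F'(xi)]

print(lhs, rhs)
```

Both sides are negative here: the curiosity is pulled toward the cost minimum at 0.5, which is the drift described after (13).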