
EMT 4203: CONTROL ENGINEERING II

BSc. Mechatronic Engineering

DeKUT

Lecture Notes

By

Dr. Inno Oduor Odira

February 2024

3.1 Optimal Control: LQR (Linear Quadratic Regulator)

3.1.1 LQR: Introduction


Eigenvalue (pole) placement in state feedback control can become unintuitive, particularly for systems with many state variables or many poles. LQR is a more intuitive method that implicitly places the poles in a way that corresponds to the desired system requirements and performance.
The so-called linear-quadratic regulator, or LQR, formulation of the controller problem for linear systems uses an integral-square (i.e. quadratic) cost criterion to pose a compromise between the desire to bring the state to zero and the desire to limit control effort. The optimal control turns out to be precisely a state feedback, which enables computation of the optimal feedback gain matrix K.

3.1.2 Optimal Control: LQR


An optimal control system seeks to maximize the return from a system for the minimum cost. In general terms, the optimal control problem is to find a control u(t) which causes the system

\dot{x} = g(x(t), u(t), t) \qquad (9.1)

to follow an optimal trajectory x(t) that minimizes the performance criterion, or cost function,

J = \int_{t_0}^{t_1} h(x(t), u(t), t)\, dt \qquad (9.2)

The problem is one of constrained functional minimization, and has several approaches namely:

1. Variational calculus - Euler-Lagrange equations

2. The maximum principle of Pontryagin - Hamiltonian function

3. Dynamic programming method of Bellman - principle of optimality (Hamilton-Jacobi-Bellman partial differential equation)

4. The Hamilton-Jacobi equation solved for the special case of the linear time-invariant plant with a quadratic performance criterion (called the performance index), which takes the form of the matrix Riccati (1724) equation.

3.1.3 Quadratic performance index


If, in the racing yacht example, the following state and control variables are defined

x_1 = y_e(t), \quad x_2 = \psi_e(t), \quad x_3 = u_e(t), \quad u = \delta_a(t)

then the performance index could be expressed as



J = \int_{t_0}^{t_1} \left\{ (q_{11} x_1 + q_{22} x_2 + q_{33} x_3) + (r_1 u) \right\} dt \qquad (9.4)

or

J = \int_{t_0}^{t_1} (Qx + Ru)\, dt \qquad (9.5)

If the state and control variables in equations (9.4) and (9.5) are squared, then the performance index becomes quadratic. The advantage of a quadratic performance index is that for a linear system
it has a mathematical solution that yields a linear control law of the form

u(t) = −Kx(t) (9.6)

A quadratic performance index for this example is therefore

J = \int_{t_0}^{t_1} \left( q_{11} x_1^2 + q_{22} x_2^2 + q_{33} x_3^2 + r_1 u^2 \right) dt \qquad (9.7)

J = \int_{t_0}^{t_1} \left( \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} \begin{bmatrix} q_{11} & 0 & 0 \\ 0 & q_{22} & 0 \\ 0 & 0 & q_{33} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + [u][r_1][u] \right) dt

or, in general
J = \int_{t_0}^{t_1} \left( x^T Q x + u^T R u \right) dt \qquad (9.8)

Q and R are the state and control weighting matrices and are always square and symmetric. J
is always a scalar quantity.
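To make the cost concrete, J can be approximated numerically along a simulated closed-loop trajectory. The Python sketch below is illustrative only: it assumes a simple forward-Euler discretisation, and the plant matrices, gain, horizon and step size are arbitrary example values, not prescribed by these notes.

    import numpy as np

    # Illustrative plant, weights and gain (arbitrary example values)
    A = np.array([[0.0, 1.0], [-1.0, -2.0]])
    B = np.array([[0.0], [1.0]])
    Q = np.diag([2.0, 1.0])          # state weighting matrix
    R = np.array([[1.0]])            # control weighting matrix
    K = np.array([[0.732, 0.542]])   # some stabilizing feedback gain

    dt, T = 1e-3, 10.0               # step size and horizon
    x = np.array([[1.0], [0.0]])     # initial condition
    J = 0.0
    for _ in range(int(T / dt)):
        u = -K @ x                                    # linear control law u = -Kx
        J += (x.T @ Q @ x + u.T @ R @ u).item() * dt  # quadratic integrand
        x = x + (A @ x + B @ u) * dt                  # Euler step of x' = Ax + Bu
    print("Approximate cost J:", J)

As the equations above state, the result is a single scalar, however many states and inputs the system has.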

3.1.4 The Linear quadratic problem (LQR)


The Linear Quadratic Regulator (LQR) provides an optimal control law for a linear system with a
quadratic performance index.

3.1.5 Dynamic Programming and Full-State Feedback


We consider here the regulation problem, that is, of keeping x_{desired} = 0. The closed-loop system is thus intended to reject disturbances and recover from initial conditions, but not necessarily to follow y-trajectories. Several definitions are needed. First we define an instantaneous penalty function l(x(t), u(t)), which is to be greater than zero for all nonzero x and u. The cost associated with this penalty, along an optimal trajectory, is

J = \int_0^{\infty} l(x(t), u(t))\, dt

i.e., the integral over time of the instantaneous penalty. Finally, the optimal return is the cost of the optimal trajectory remaining after time t:

V(x(t), u(t)) = \int_t^{\infty} l(x(\tau), u(\tau))\, d\tau.

We have directly from the dynamic programming principle

V(x(t), u(t)) = \min_u \left\{ l(x(t), u(t))\,\delta t + V(x(t + \delta t), u(t + \delta t)) \right\}. \qquad (3.1)

The minimization of V(x(t), u(t)) is made by considering all possible control inputs u in the time interval (t, t + δt). As suggested by dynamic programming, the return at time t is constructed from the return at t + δt, plus the differential component due to l(x, u).
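This construction of the return at time t from the return at t + δt is perhaps easiest to see in discrete time. The Python sketch below (a minimal illustration, assuming a discretised plant x[k+1] = Ad x[k] + Bd u[k] with arbitrary example matrices) sweeps the standard backward Riccati recursion, which is exactly this dynamic programming step for a quadratic penalty:

    import numpy as np

    # Discretised plant and quadratic weights (arbitrary example values)
    Ad = np.array([[1.0, 0.01], [-0.01, 0.98]])
    Bd = np.array([[0.0], [0.01]])
    Q = np.diag([2.0, 1.0])
    R = np.array([[1.0]])

    P = Q.copy()            # terminal cost-to-go: V_N(x) = x' P x
    for _ in range(2000):   # sweep backwards from the final time
        # V_k is built from V_{k+1}: minimise stage cost plus cost-to-go
        K = np.linalg.solve(R + Bd.T @ P @ Bd, Bd.T @ P @ Ad)
        P = Q + Ad.T @ P @ (Ad - Bd @ K)
    print("steady-state gain K =", K)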
Through a multivariate Taylor series expansion,

p_1(x, y) = f(a, b) + Df(a, b)\left((x, y) - (a, b)\right) = f(a, b) + \frac{\partial f}{\partial x}(a, b)(x - a) + \frac{\partial f}{\partial y}(a, b)(y - b) \qquad (3.2)
Since the input u(t) is an independent variable, the function V(x(t), u(t)) may be treated as a univariate function of x(t):

f(x + \Delta x) = f(x) + \Delta x\, f'(x) + \frac{\Delta x^2}{2!} f''(x) + \frac{\Delta x^3}{3!} f'''(x) + \frac{\Delta x^4}{4!} f^{(4)}(\xi_1) + \cdots \qquad (3.3)

And if V is smooth and has no explicit dependence on t, as written, then

V(x(t + \delta t), u(t + \delta t)) = V(x(t), u(t)) + \frac{\partial V}{\partial x} \frac{dx}{dt}\, \delta t + \text{h.o.t.} = V(x(t), u(t)) + \frac{\partial V}{\partial x} \left( Ax(t) + Bu(t) \right) \delta t.
Substituting this expansion of V(x(t + \delta t), u(t + \delta t)) into the Bellman equation gives

V(x(t), u(t)) = \min_u \left\{ l(x(t), u(t))\,\delta t + V(x(t), u(t)) + \frac{\partial V}{\partial x} \left( Ax(t) + Bu(t) \right) \delta t \right\}. \qquad (3.4)

Now the control input u in the interval (t, t + δt) cannot affect V(x(t), u(t)), so cancelling V(x(t), u(t)) from both sides and dividing through by δt gives

0 = \min_u \left\{ l(x(t), u(t)) + \frac{\partial V}{\partial x} \left( Ax(t) + Bu(t) \right) \right\}. \qquad (3.5)
We next make the assumption that V(x, u) has the following form:

V(x, u) = \frac{1}{2} x^T P x + \frac{1}{2} u^T Z u

where P is a symmetric, positive definite matrix. It follows that

\frac{\partial V}{\partial x} = x^T P \quad \Longrightarrow \quad 0 = \min_u \left\{ l(x, u) + x^T P (Ax + Bu) \right\}.

We finally specify the instantaneous penalty function. The LQR employs the special quadratic form

l(x, u) = \frac{1}{2} x^T Q x + \frac{1}{2} u^T R u,

where Q and R are both symmetric and positive definite. The matrices Q and R are to be set by the user, and represent the main "tuning knobs" for the LQR. Substitution of this form into the above equation gives

0 = \min_u \left\{ \frac{1}{2} x^T Q x + \frac{1}{2} u^T R u + x^T P (Ax + Bu) \right\}. \qquad (3.6)
and setting the derivative with respect to u to zero gives

0 = u^T R + x^T P B

u^T = -x^T P B R^{-1}

u = -R^{-1} B^T P x.
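This stationarity step can be checked symbolically. The sympy sketch below (using the A and B of the worked example later in these notes purely as an illustrative case, with symbolic P, Q and R entries) differentiates the bracketed term of equation (3.6) with respect to u and solves for the minimiser:

    import sympy as sp

    x1, x2, u = sp.symbols('x1 x2 u')
    p11, p12, p22, r = sp.symbols('p11 p12 p22 r', positive=True)
    q11, q22 = sp.symbols('q11 q22', positive=True)

    x = sp.Matrix([x1, x2])
    P = sp.Matrix([[p11, p12], [p12, p22]])   # symmetric P
    A = sp.Matrix([[0, 1], [-1, -2]])
    B = sp.Matrix([0, 1])
    Q = sp.diag(q11, q22)

    # Bracketed term of equation (3.6)
    expr = (sp.Rational(1, 2) * (x.T * Q * x)[0]
            + sp.Rational(1, 2) * r * u**2
            + (x.T * P * (A * x + B * u))[0])

    u_star = sp.solve(sp.diff(expr, u), u)[0]
    print(sp.simplify(u_star))   # -(p12*x1 + p22*x2)/r, i.e. u = -R^-1 B' P x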

The gain matrix for the feedback control is thus K = R^{-1} B^T P. Inserting this solution back into equation (3.6), and eliminating u in favor of x, we have

0 = \frac{1}{2} x^T Q x - \frac{1}{2} x^T P B R^{-1} B^T P x + x^T P A x.

All the matrices here are symmetric except PA. Since the scalar x^T P A x equals its own transpose,

x^T P A x = x^T A^T P x,

we can make its effect symmetric: since

2\, x^T P A x = x^T \left( A^T P + PA \right) x,

then

x^T P A x = \frac{1}{2} x^T P A x + \frac{1}{2} x^T A^T P x,

leading to the final matrix-only result

0 = Q + PA + A^T P - P B R^{-1} B^T P.

This equation (9.25) is referred to as the Algebraic Riccati Equation (ARE).
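In practice the ARE is solved numerically rather than by hand. Below is a minimal sketch, assuming SciPy is available (solve_continuous_are is SciPy's continuous-time ARE solver; the helper name lqr is ours, not a library function):

    import numpy as np
    from scipy.linalg import solve_continuous_are

    def lqr(A, B, Q, R):
        """Solve 0 = Q + PA + A'P - P B R^-1 B' P and return (K, P)."""
        P = solve_continuous_are(A, B, Q, R)
        K = np.linalg.solve(R, B.T @ P)   # K = R^-1 B' P
        return K, P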



Example:
The regulator shown in Fig. 3.1 contains a plant that is described by

\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -1 & -2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u

y = \begin{bmatrix} 1 & 0 \end{bmatrix} x

and has a performance index

J = \int_0^{\infty} \left( x^T \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} x + u^2 \right) dt
Determine
(a) the Riccati matrix P
(b) the state feedback matrix K
(c) the closed-loop eigenvalues

Fig. 3.1 LQR controller

(a)

A = \begin{bmatrix} 0 & 1 \\ -1 & -2 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}

Q = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}, \quad R = 1 \ (\text{scalar})
From equation (9.25) the reduced Riccati equation is

Q + PA + A^T P - P B R^{-1} B^T P = 0
" #" # " #
p11 p12 0 1 −p12 p11 − 2p12
PA = =
p21 p22 −1 −2 −p22 p21 − 2p22
" #" # " #
T 0 −1 p11 p12 −p21 −p22
A P= =
1 −2 p21 p22 p11 − 2p21 p12 − 2p22
" #" # " #
−1 T p11 p12 0 h i p
11 p12
PBR B P = 1 0 1
p21 p22 1 p21 p22
" #
p12 h i
= p21 p22
p22
" #
p12 p21 p12 p22
=
p22 p21 p222
Combining equations (9.34), (9.35) and (9.36) gives

" # " # " #


2 0 −p12 p11 − 2p12 −p21 −p22
+ + (9.37)
0 1 −p22 p21 − 2p22 p11 − 2p21 p12 − 2p22
" #
p12 p21 p12 p22
− =0
p22 p21 p222

Since P is symmetric, p_{21} = p_{12}. Equation (9.37) can be expressed as four simultaneous equations:

2 - p_{12} - p_{12} - p_{12}^2 = 0 \qquad (9.38)

p_{11} - 2p_{12} - p_{22} - p_{12} p_{22} = 0 \qquad (9.39)

-p_{22} + p_{11} - 2p_{12} - p_{12} p_{22} = 0 \qquad (9.40)

1 + p_{12} - 2p_{22} + p_{12} - 2p_{22} - p_{22}^2 = 0 \qquad (9.41)

Note that equations (9.39) and (9.40) are the same. From equation (9.38),

p_{12}^2 + 2p_{12} - 2 = 0

Solving,

p_{12} = p_{21} = 0.732 \ \text{or} \ -2.732

Using the positive value (so that P is positive definite),

p_{12} = p_{21} = 0.732 \qquad (9.42)



From equation (9.41),

1 + 2p_{12} - 4p_{22} - p_{22}^2 = 0

p_{22}^2 + 4p_{22} - 2.464 = 0

Solving,

p_{22} = 0.542 \ \text{or} \ -4.542

Using the positive value,

p_{22} = 0.542 \qquad (9.43)

From equation (9.39)

p11 − (2 × 0.732) − 0.542 − (0.732 × 0.542) = 0


p11 = 2.403 (9.44)

From equations (9.42), (9.43) and (9.44) the Riccati matrix is

P = \begin{bmatrix} 2.403 & 0.732 \\ 0.732 & 0.542 \end{bmatrix} \qquad (9.45)
(b) Equation (9.21) gives the state feedback matrix

K = R^{-1} B^T P = [1] \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} 2.403 & 0.732 \\ 0.732 & 0.542 \end{bmatrix} \qquad (9.46)

Hence

K = \begin{bmatrix} 0.732 & 0.542 \end{bmatrix}

(c) From equation (8.96), the closed-loop eigenvalues are given by

|sI - A + BK| = 0

\left| \begin{bmatrix} s & 0 \\ 0 & s \end{bmatrix} - \begin{bmatrix} 0 & 1 \\ -1 & -2 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} \begin{bmatrix} 0.732 & 0.542 \end{bmatrix} \right| = 0

\left| \begin{bmatrix} s & -1 \\ 1 & s + 2 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0.732 & 0.542 \end{bmatrix} \right| = 0

\begin{vmatrix} s & -1 \\ 1.732 & s + 2.542 \end{vmatrix} = 0

s^2 + 2.542 s + 1.732 = 0

s_1, s_2 = -1.271 \pm j0.341
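The hand calculation above can be cross-checked numerically. The short sketch below (assuming NumPy and SciPy are available) should reproduce P, K and the closed-loop eigenvalues to three decimal places:

    import numpy as np
    from scipy.linalg import solve_continuous_are

    A = np.array([[0.0, 1.0], [-1.0, -2.0]])
    B = np.array([[0.0], [1.0]])
    Q = np.diag([2.0, 1.0])
    R = np.array([[1.0]])

    P = solve_continuous_are(A, B, Q, R)   # solves the ARE (9.25)
    K = np.linalg.solve(R, B.T @ P)        # K = R^-1 B' P
    print(P)                               # ~ [[2.403, 0.732], [0.732, 0.542]]
    print(K)                               # ~ [[0.732, 0.542]]
    print(np.linalg.eigvals(A - B @ K))    # ~ -1.271 +/- 0.341j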

3.1.6 LQ-Observer
For a system of the form

ẋ(t) = Ax(t) + Bu(t) (1.12)


y(t) = Cx(t)

The state estimator design problem is to choose the observer gain L in the observer equation

\dot{\hat{x}} = A\hat{x} + Bu + L(y - C\hat{x})

with the observer error dynamics

\dot{e} = (A - LC)e

so that the observer error dynamics are stable.


The related state feedback problem (the dual) is to choose K in

\dot{x} = A^T x + C^T u \quad \text{with} \quad u = -Kx

which implies

\dot{x} = \left( A^T - C^T K \right) x

such that A^T - C^T K is stable. Since a matrix and its transpose have the same eigenvalues, \left( A^T - C^T K \right)^T = A - K^T C, so by choosing L = K^T for the observer, the observer is ensured to be stable.


Since the K obtained by LQ optimal control design is stabilizing as long as certain stabilizability and detectability conditions are satisfied, L = K^T can be used as a stabilizing observer gain as well. To solve the LQ control problem for the dual system

\dot{x} = A^T x + C^T u \quad \text{with} \quad u = -Kx,

transform the algebraic Riccati equation by the substitutions

A \rightarrow A^T, \quad B \rightarrow C^T

Thus

Q + P A^T + A P - P C^T R^{-1} C P = 0

The stabilizing feedback gain K for the dual system is given by

K = L^T = R^{-1} C P \quad \Rightarrow \quad L = P C^T R^{-1}

where L is the observer gain.
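Below is a minimal sketch of this dual computation in Python, assuming SciPy; the weight choices Q = I and R = I are illustrative, and C = [1 0] reuses the output matrix of the worked example:

    import numpy as np
    from scipy.linalg import solve_continuous_are

    def lq_observer(A, C, Q, R):
        """Observer gain via the dual LQ problem: L = P C' R^-1."""
        # Dual ARE: 0 = Q + P A' + A P - P C' R^-1 C P
        P = solve_continuous_are(A.T, C.T, Q, R)
        return P @ C.T @ np.linalg.inv(R)

    A = np.array([[0.0, 1.0], [-1.0, -2.0]])
    C = np.array([[1.0, 0.0]])
    L = lq_observer(A, C, np.eye(2), np.eye(1))
    print(np.linalg.eigvals(A - L @ C))   # stable observer error dynamics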
