Abstract
This entry gives a short introduction to a class of reinforcement learning algorithms, those based on value function approximation, applied to stochastic optimal control problems. It shows how core ideas from dynamic programming and Bellman equations are used in common data-driven reinforcement learning algorithms, and it discusses fundamental challenges of the approach.
Copyright information
© 2020 Springer-Verlag London Ltd., part of Springer Nature
About this entry
Cite this entry
Gatsis, K., Pappas, G.J. (2020). Reinforcement Learning for Control Using Value Function Approximation. In: Baillieul, J., Samad, T. (eds) Encyclopedia of Systems and Control. Springer, London. https://doi.org/10.1007/978-1-4471-5102-9_100067-1
DOI: https://doi.org/10.1007/978-1-4471-5102-9_100067-1
Publisher Name: Springer, London
Print ISBN: 978-1-4471-5102-9
Online ISBN: 978-1-4471-5102-9
eBook Packages: Living Reference Engineering; Reference Module Computer Science and Engineering