Reinforcement Learning for Control Using Value Function Approximation

  • Living reference work entry
Encyclopedia of Systems and Control

Abstract

This entry gives a short introduction to a class of reinforcement learning algorithms, namely those based on value function approximation, applied to stochastic optimal control problems. It shows how core ideas from dynamic programming and the Bellman equations are used in common data-driven reinforcement learning algorithms, and it discusses fundamental challenges of the approach.
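
As a rough illustration of how a Bellman equation can drive a data-driven update, the sketch below evaluates a fixed policy with semi-gradient TD(0) and a one-parameter approximation V(x) ≈ w·x². It is not taken from the entry: the scalar system, the policy gain, the cost weights, the discount factor, and all function names are hypothetical choices, and the closed loop is kept noise-free (unlike the stochastic setting the entry treats) so that the result can be checked against a closed-form value.

```python
import random

# Hypothetical scalar system and fixed policy, chosen only for illustration.
A, B = 1.0, 1.0        # dynamics: x_next = A*x + B*u (noise-free by assumption)
K = 0.5                # fixed linear policy: u = -K*x
Q, R = 1.0, 1.0        # stage cost: Q*x^2 + R*u^2
GAMMA = 0.9            # discount factor


def step(x):
    """Apply the fixed policy once; return (stage cost, next state)."""
    u = -K * x
    cost = Q * x * x + R * u * u
    return cost, A * x + B * u


def td0_policy_evaluation(num_samples=5000, step_size=0.5, seed=0):
    """Estimate w in the approximation V(x) ~ w * x^2 with semi-gradient TD(0)."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(num_samples):
        x = rng.uniform(-1.0, 1.0)          # sampled state
        cost, x_next = step(x)              # observed one-step transition
        feature = x * x
        # Temporal-difference (Bellman) error of the current approximation.
        td_error = cost + GAMMA * w * x_next * x_next - w * feature
        # Semi-gradient update: move w along the feature, scaled by the error.
        w += step_size * td_error * feature
    return w


def exact_weight():
    """Solve the Bellman equation p = (Q + R*K^2) + GAMMA * a_cl^2 * p for p."""
    a_cl = A - B * K
    return (Q + R * K * K) / (1.0 - GAMMA * a_cl * a_cl)


w_hat = td0_policy_evaluation()
print(f"TD(0) estimate: {w_hat:.4f}, exact: {exact_weight():.4f}")
```

Because the quadratic feature can represent the true value function of this toy problem exactly, the learned weight should agree with the closed-form one. With additive noise or a poorer feature set, the estimate would generally only approach the TD fixed point, which is one of the challenges the entry refers to.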

Bibliography

  • Berkenkamp F, Turchetta M, Schoellig A, Krause A (2017) Safe model-based reinforcement learning with stability guarantees. In: Advances in neural information processing systems, pp 908–918

  • Bertsekas DP (2015) Dynamic programming and optimal control, vol II, 4th edn. Athena Scientific, Nashua

  • Buşoniu L, de Bruin T, Tolić D, Kober J, Palunko I (2018) Reinforcement learning for control: performance, stability, and deep approximators. Ann Rev Control 46:8–28

  • Dean S, Mania H, Matni N, Recht B, Tu S (2018) Regret bounds for robust adaptive control of the linear quadratic regulator. In: Advances in neural information processing systems, pp 4188–4197

  • Fisac JF, Akametalu AK, Zeilinger MN, Kaynama S, Gillula J, Tomlin CJ (2019) A general safety framework for learning-based control in uncertain robotic systems. IEEE Trans Autom Control 64(7):2737–2752

  • Kumar PR, Varaiya P (2015) Stochastic systems: estimation, identification, and adaptive control, vol 75. Society for Industrial and Applied Mathematics, Philadelphia

  • LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436

  • Levine S, Finn C, Darrell T, Abbeel P (2016) End-to-end training of deep visuomotor policies. J Mach Learn Res 17(1):1334–1373

  • Lewis FL, Vrabie D (2009) Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circuits Syst Mag 9(3):32–50

  • Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, Riedmiller M, Fidjeland AK, Ostrovski G, et al (2015) Human-level control through deep reinforcement learning. Nature 518(7540):529

  • Rosolia U, Borrelli F (2017) Learning model predictive control for iterative tasks. A data-driven control framework. IEEE Trans Autom Control 63(7):1883–1896

  • Silver D, Huang A, Maddison CJ, Guez A, Sifre L, Van Den Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M et al (2016) Mastering the game of go with deep neural networks and tree search. Nature 529(7587):484

  • Sutton RS, Barto AG (2018) Reinforcement learning: an introduction. MIT Press, Cambridge

  • Sutton RS, Maei HR, Precup D, Bhatnagar S, Silver D, Szepesvári C, Wiewiora E (2009) Fast gradient-descent methods for temporal-difference learning with linear function approximation. In: Proceedings of the 26th annual international conference on machine learning. ACM, pp 993–1000

  • Tu S, Recht B (2018) Least-squares temporal difference learning for the linear quadratic regulator. In: International conference on machine learning, pp 5012–5021


Author information

Corresponding author

Correspondence to Konstantinos Gatsis.

Copyright information

© 2020 Springer-Verlag London Ltd., part of Springer Nature

About this entry

Cite this entry

Gatsis, K., Pappas, G.J. (2020). Reinforcement Learning for Control Using Value Function Approximation. In: Baillieul, J., Samad, T. (eds) Encyclopedia of Systems and Control. Springer, London. https://doi.org/10.1007/978-1-4471-5102-9_100067-1

  • DOI: https://doi.org/10.1007/978-1-4471-5102-9_100067-1

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-5102-9

  • Online ISBN: 978-1-4471-5102-9

  • eBook Packages: Living Reference Engineering; Reference Module Computer Science and Engineering
