
1 Related Works

Balduzzi and Ghifary [3] learn a G-function, which predicts the gradients of a Q-function, using a gradient perturbation trick. Fairbank et al. [4] train a G-function with DQN-style training. However, to compute the target gradients, they assume that their model and reward function are differentiable with respect to all inputs. Nguyen and Widrow [12] and Jordan and Jacobs [10] learn a forward model and differentiate through it (see the sketch below). Prokhorov and Wunsch [14] give an overview of heuristic dynamic programming, dual heuristic programming, and globalized dual heuristic programming.
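
To make the model-based line of work concrete, here is a minimal sketch, in PyTorch, of learning a forward model and then differentiating a reward through it to improve a controller. The network sizes, the `reward` function, and the training loop are illustrative assumptions, not details taken from the cited papers.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions and networks -- not from the cited works.
obs_dim, act_dim = 4, 2
model = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.Tanh(),
                      nn.Linear(64, obs_dim))        # learned dynamics: (s, a) -> s'
policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                       nn.Linear(64, act_dim))       # controller to be trained

model_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
policy_opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def reward(state):
    # Assumed differentiable reward, e.g. negative distance to the origin.
    return -(state ** 2).sum(dim=-1)

# 1) Fit the forward model on observed transitions (s, a, s').
def model_step(s, a, s_next):
    loss = ((model(torch.cat([s, a], dim=-1)) - s_next) ** 2).mean()
    model_opt.zero_grad(); loss.backward(); model_opt.step()

# 2) Improve the policy by unrolling the learned model and
#    backpropagating the (differentiable) reward through it.
def policy_step(s0, horizon=10):
    s, total = s0, 0.0
    for _ in range(horizon):
        a = policy(s)
        s = model(torch.cat([s, a], dim=-1))
        total = total + reward(s).mean()
    policy_opt.zero_grad(); (-total).backward(); policy_opt.step()
```

The key point is that the policy update never needs an RL estimator: gradients reach the controller only because the model and reward are assumed differentiable, which is exactly the assumption noted above for Fairbank et al. [4].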
Bahdanau et al. [1] use RL to train a Q-function, which is then used to train an RNN. However, this assumes that the target labels are always available. Jaques et al. [9] train a DQN to rate how good the output of an RNN is. Heess et al. [8] train an RNN using DDPG, but do not explore how to apply this approach to supervised learning tasks; as a result, their policy is slow. Hausknecht and Stone [7] train an LSTM Q-function in a DQN setup. However, they assume that either (1) rollouts always start from the beginning of an episode, or (2) rollouts start from a random point in an episode with a zero hidden state (see the sketch after this paragraph). Wierstra and Alexander [16] show how to derive the policy gradient for recurrent policies. Bakker [2] uses an LSTM to represent an advantage function and uses eligibility traces.
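
The two hidden-state conventions described for Hausknecht and Stone [7] can be written down concretely. The following is a minimal sketch, assuming a replay buffer stored as a list of episodes, each a list of (observation, action, reward, done) tuples; the network sizes are illustrative, not theirs.

```python
import random
import torch
import torch.nn as nn

# Hypothetical recurrent Q-network: observation sequence -> Q-values per step.
class RecurrentQ(nn.Module):
    def __init__(self, obs_dim=4, n_actions=3, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq, state=None):
        out, state = self.lstm(obs_seq, state)
        return self.head(out), state

q_net = RecurrentQ()

def sample_full_episode(replay):              # strategy (1): replay whole episodes
    episode = random.choice(replay)           # list of (obs, action, reward, done)
    obs = torch.stack([t[0] for t in episode]).unsqueeze(0)
    q_values, _ = q_net(obs)                  # hidden state carried across the episode
    return q_values

def sample_random_window(replay, length=8):   # strategy (2): random window, zero state
    episode = random.choice(replay)
    start = random.randrange(max(1, len(episode) - length))
    obs = torch.stack([t[0] for t in episode[start:start + length]]).unsqueeze(0)
    q_values, _ = q_net(obs, state=None)      # state=None means a zero hidden state
    return q_values
```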
Gomez and Schmidhuber [5] train RNNs with evolutionary algorithms. Schmidhuber et al. [15] train RNNs by randomly guessing weights.
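
As a rough illustration of the random-guessing baseline, the sketch below samples freshly initialized RNNs and keeps the best one on a toy memory task. The task, network sizes, and trial count are assumptions for illustration, not the benchmarks used in [15].

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy benchmark (an assumption): remember the first input bit of a sequence
# and output it at the last step.
seqs = torch.randint(0, 2, (64, 10, 1)).float()   # batch x time x feature
targets = seqs[:, 0, 0]                           # first bit of each sequence

def fitness(rnn, head):
    """Score a randomly initialized RNN on the toy task (higher is better)."""
    with torch.no_grad():
        out, _ = rnn(seqs)                        # out: batch x time x hidden
        pred = torch.sigmoid(head(out[:, -1, :])).squeeze(-1)
        return -nn.functional.binary_cross_entropy(pred, targets).item()

best, best_score = None, float("-inf")
for trial in range(200):                          # "training" = pure random guessing
    rnn = nn.RNN(1, 8, batch_first=True)          # fresh random weights each trial
    head = nn.Linear(8, 1)
    score = fitness(rnn, head)
    if score > best_score:
        best, best_score = (rnn, head), score
```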
Hasinoff [6] provides a survey of earlier work on reinforcement learning with hidden state. Lin and Mitchell [11] explore three different ways to incorporate recurrence into RL: (1) feeding a window of history to the Q-function, (2) a recurrent Q-function, and (3) a recurrent model.
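
The contrast between approaches (1) and (2) can be sketched as follows; the dimensions and architectures are illustrative assumptions rather than details from [11].

```python
import torch
import torch.nn as nn

obs_dim, n_actions, window = 4, 3, 5   # assumed sizes for illustration

# (1) Feed a fixed window of history to a feedforward Q-function.
class WindowQ(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim * window, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, obs_window):               # obs_window: batch x window x obs_dim
        return self.net(obs_window.flatten(1))   # history is concatenated, no memory

# (2) A recurrent Q-function keeps an internal hidden state instead.
class RecurrentQ(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(obs_dim, 64, batch_first=True)
        self.head = nn.Linear(64, n_actions)

    def forward(self, obs_seq, h=None):          # obs_seq: batch x time x obs_dim
        out, h = self.rnn(obs_seq, h)
        return self.head(out[:, -1]), h          # Q-values at the last step + new state
```

Approach (3) would instead learn a recurrent model of the environment and compute values on top of its predictions.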
One approach [?] augments the MDP to include memory states in the state, observation, and actions (a generic version is sketched below). However, this does not use BPTT information. Peshkin et al. [13] first introduced memory states.
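
As a generic sketch of the memory-state idea (not the exact construction in [13] or the work cited above): the agent's observation is augmented with a small external memory, and its action is augmented with writes to that memory, so a memoryless policy over the augmented spaces can behave like a policy with memory.

```python
import numpy as np

class MemoryAugmentedEnv:
    """Wrap an environment so observations and actions carry explicit memory bits.

    `env` is assumed to expose gym-style reset()/step(action) methods; the
    memory size and encoding are illustrative choices.
    """

    def __init__(self, env, memory_bits=4):
        self.env = env
        self.memory = np.zeros(memory_bits)

    def reset(self):
        obs = self.env.reset()
        self.memory = np.zeros_like(self.memory)
        return np.concatenate([obs, self.memory])     # observation includes memory

    def step(self, action, memory_write):
        # The augmented action is (environment action, new memory contents).
        obs, reward, done, info = self.env.step(action)
        self.memory = np.asarray(memory_write, dtype=float)
        return np.concatenate([obs, self.memory]), reward, done, info
```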

References
[1] Dzmitry Bahdanau, Philemon Brakel, Kelvin Xu, Anirudh Goyal, Ryan Lowe, Joelle Pineau, Aaron Courville, and Yoshua Bengio. An Actor-Critic Algorithm for Sequence Prediction. arXiv:1607.07086v1 [cs.LG], 2016. URL http://arxiv.org/abs/1607.07086.
[2] Bram Bakker. Reinforcement Learning with Long Short-Term Memory. Advances in Neural Information Processing Systems 14, pages 1475–1482, 2002. ISSN 1049-5258.
[3] David Balduzzi and Muhammad Ghifary. Compatible Value Gradients for Reinforcement Learning of Continuous Deep Policies. pages 1–27, 2015. URL http://arxiv.org/abs/1509.03005.
[4] Michael Fairbank, Eduardo Alonso, and Danil Prokhorov. An equivalence between adaptive dynamic programming with a critic and backpropagation through time. IEEE Transactions on Neural Networks and Learning Systems, 24(12):2088–2100, 2013. ISSN 2162-237X. doi: 10.1109/TNNLS.2013.2271778.
[5] F. J. Gomez and J. Schmidhuber. Co-Evolving Recurrent Neurons Learn Deep Memory POMDPs. pages 1–14, 2004. doi: 10.1145/1068009.1068092.
[6] Samuel W. Hasinoff. Reinforcement Learning for Problems with Hidden State. Technical report, University of Toronto, pages 1–18, 2003.
[7] Matthew Hausknecht and Peter Stone. Deep Recurrent Q-Learning for Partially Observable MDPs. 2015.
[8] Nicolas Heess, Jonathan J. Hunt, Timothy P. Lillicrap, and David Silver. Memory-based control with recurrent neural networks. arXiv, pages 1–11, 2015. URL http://arxiv.org/abs/1512.04455.
[9] Natasha Jaques, Shixiang Gu, Richard E. Turner, and Douglas Eck. Generating Music by Fine-Tuning Recurrent Neural Networks with Reinforcement Learning. pages 1–11, 2016.
[10] Michael I. Jordan and Robert A. Jacobs. Learning to control an unstable system with forward modeling. Advances in Neural Information Processing Systems, 2:324–331, 1990.

[11] L. J. Lin and T. M. Mitchell. Memory approaches to reinforcement learning in non-Markovian domains. Technical report, 1992. URL http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.52.319.
[12] Derrick H. Nguyen and Bernard Widrow. Neural Networks for Self-Learning Control Systems, 1990. ISSN 0272-1708.

[13] Leonid Peshkin, Nicolas Meuleau, and Leslie Kaelbling. Learning Policies with External Memory. Sixteenth International Conference on Machine Learning, 2001. URL http://arxiv.org/abs/cs/0103003.
[14] Danil V. Prokhorov and Donald C. Wunsch. Adaptive critic designs. IEEE Transactions on Neural Networks, 8(5):997–1007, 1997. ISSN 1045-9227. doi: 10.1109/72.623201.
[15] Jürgen Schmidhuber, Sepp Hochreiter, and Yoshua Bengio. Evaluating Long-term Dependency Benchmark Problems by Random Guessing. 1997.
[16] Daan Wierstra and F. Alexander. Recurrent Policy Gradients. May 2009.
