
Deep Reinforcement Learning

CS 294 - 112
Course logistics
Class Information & Resources

• Sergey Levine (Instructor)
• Kate Rakelly (Head GSI)
• Greg Kahn (GSI)
• Sid Reddy (GSI)
• Michael Chang (GSI)
• Soroush Nasiriany (uGSI)

• Course website: http://rail.eecs.berkeley.edu/deeprlcourse


• Piazza: UC Berkeley, CS294-112
• Subreddit (for non-enrolled students): www.reddit.com/r/berkeleydeeprlcourse/
• Office hours: check course website (mine are after class on Wed in Soda 341B)
Prerequisites & Enrollment
• All enrolled students must have taken CS189, CS289, CS281A, or an
equivalent course at their home institution
• Please contact Sergey Levine if you haven’t
• Please enroll for 3 units
• Students on the wait list will be notified as slots open up
• Lectures will be recorded
• Since the class is full, please watch the lectures online if you are not enrolled
What you should know
• Assignments will require training neural networks with standard
automatic differentiation packages (TensorFlow by default)
• Review Section
• Greg Kahn will cover TensorFlow and neural networks on Wed next week (8/29)
• You should be able to at least do the TensorFlow MNIST tutorial (if not, make
sure to attend Greg’s lecture and ask questions!)
What we’ll cover
• Full list on course website (click “Lecture Slides”)
1. From supervised learning to decision making
2. Model-free algorithms: Q-learning, policy gradients, actor-critic
3. Advanced model learning and prediction
4. Exploration
5. Transfer and multi-task learning, meta-learning
6. Open problems, research talks, invited lectures
Assignments
1. Homework 1: Imitation learning (control via supervised learning)
2. Homework 2: Policy gradients (“REINFORCE”)
3. Homework 3: Q learning and actor-critic algorithms
4. Homework 4: Model-based reinforcement learning
5. Homework 5: Advanced model-free RL algorithms
6. Final project: Research-level project of your choice (form a group of
up to 3 students; you're welcome to start early!)

Grading: 60% homework (12% each), 40% project
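Homework 2's "REINFORCE" boils down to: sample actions from the current policy, then nudge the policy parameters toward actions that earned above-average reward. As a rough illustration only (this is not the actual assignment, which uses TensorFlow; the two-armed bandit, hyperparameters, and function names below are all made up), a minimal pure-Python sketch:

```python
import math
import random

# Hypothetical toy problem: a two-armed bandit where arm 1 pays reward 1.0
# and arm 0 pays 0.0. A real assignment would use a sequential environment.
ARM_REWARDS = [0.0, 1.0]

def softmax(prefs):
    """Convert preference scores into action probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce(steps=2000, lr=0.1, seed=0):
    """Minimal REINFORCE: increase log-probability of rewarded actions."""
    rng = random.Random(seed)
    prefs = [0.0, 0.0]   # one preference per arm (the "policy parameters")
    baseline = 0.0       # running average reward; reduces gradient variance
    for _ in range(steps):
        probs = softmax(prefs)
        a = 0 if rng.random() < probs[0] else 1
        r = ARM_REWARDS[a]
        baseline += 0.01 * (r - baseline)
        advantage = r - baseline
        # grad of log pi(a) w.r.t. prefs: (1 - pi(a)) for the taken arm,
        # -pi(i) for the others
        for i in range(2):
            grad = (1.0 if i == a else 0.0) - probs[i]
            prefs[i] += lr * advantage * grad
    return softmax(prefs)

probs = reinforce()
# After training, the policy should put most probability on the rewarding arm.
```

The running-average baseline subtracted from the reward is a standard variance-reduction trick for policy gradients; the algorithm works without it, just more noisily.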


Your “Homework” Today
1. Sign up for Piazza (see course website)
2. Start forming your final project groups, unless you want to work
alone, which is fine
3. Check out the TensorFlow MNIST tutorial, unless you’re a
TensorFlow pro
What is reinforcement learning, and why
should we care?
How do we build intelligent machines?
Intelligent machines must be able to adapt
Deep learning helps us handle unstructured
environments
Reinforcement learning provides a formalism for
behavior

[Figure: the agent-environment loop; the agent sends decisions (actions) to the environment and receives the consequences back as observations and rewards. Example domains from Schulman et al. '14 & '15; Mnih et al. '13; Levine*, Finn*, et al. '16]
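The decisions → consequences → observations/rewards loop can be written down directly. The sketch below uses a made-up "thermostat" environment and a hand-coded policy (both purely illustrative; an RL algorithm's job would be to learn the policy from the reward signal alone):

```python
import random

# Illustrative toy environment (not from the course materials): the agent
# observes a temperature and is rewarded for keeping it near 20 degrees.
class ToyEnv:
    def reset(self):
        self.temp = random.uniform(0.0, 40.0)
        return self.temp                      # initial observation

    def step(self, action):
        # action: -1 (cool), 0 (do nothing), +1 (heat)
        self.temp += 2.0 * action + random.uniform(-0.5, 0.5)
        reward = -abs(self.temp - 20.0)       # consequence of the decision
        return self.temp, reward              # next observation, reward

def policy(obs):
    """A hand-coded policy; an RL algorithm would learn this mapping."""
    if obs < 19.0:
        return +1
    if obs > 21.0:
        return -1
    return 0

env = ToyEnv()
obs = env.reset()
total_reward = 0.0
for _ in range(100):                          # the sensorimotor loop
    action = policy(obs)                      # decision
    obs, reward = env.step(action)            # consequence
    total_reward += reward
```

Everything RL needs is visible in this loop: the agent never sees the environment's internals, only observations and a scalar reward.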
What is deep RL, and why should we care?

[Figure: pipeline comparison (Felzenszwalb '08):
• standard computer vision: hand-designed features (e.g. HOG) → mid-level features (e.g. DPM) → classifier (e.g. SVM)
• deep learning: end-to-end training, input → label
• standard reinforcement learning: hand-designed features → more features → linear policy or value function → action
• deep reinforcement learning: end-to-end training, observation → action]
What does end-to-end learning mean for
sequential decision making?

[Figure: the sensorimotor loop: perception and action (e.g. "run away") feed into each other in a closed loop, rather than being trained as separate stages.]
Example: robotics

[Figure: the classical robotic control pipeline: observations → state estimation (e.g. vision) → modeling & prediction → planning → low-level control → controls. Each stage acts as a tiny, highly specialized "visual cortex" or "motor cortex".]
Deep models are what allow reinforcement learning algorithms to solve complex problems end to end!

• There is no direct supervision; actions have consequences.
• The loop: decisions (actions) → consequences → observations and rewards.

Examples:
• Animal: actions are muscle contractions; observations are sight and smell; rewards are food.
• Robot: actions are motor currents or torques; observations are camera images; rewards are a task success measure (e.g., running speed).
• Inventory management: actions are what to purchase; observations are inventory levels; rewards are profit.

The reinforcement learning problem is the AI problem!
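To make the examples above concrete, here is a minimal tabular Q-learning sketch on an invented 5-state corridor task (the environment, reward, and hyperparameters are illustrative, not from the course): the agent only ever sees states, actions, and rewards, yet recovers the "always move right" policy.

```python
import random

# Toy problem: a 5-state corridor. Reaching the rightmost state yields
# reward 1 and ends the episode; every other transition pays 0.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]  # left, right

def env_step(state, action):
    nxt = min(max(state + action, 0), GOAL)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

def q_learning(episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # one value per (state, action)
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if rng.random() < eps:
                a = rng.randrange(2)           # explore
            else:                              # exploit, random tie-break
                best = max(q[s])
                a = rng.choice([i for i, v in enumerate(q[s]) if v == best])
            s2, r, done = env_step(s, ACTIONS[a])
            # Q-learning update: bootstrap from the best next-state value
            target = r + (0.0 if done else gamma * max(q[s2]))
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q

q = q_learning()
# Greedy policy after training: move right in every non-goal state.
```

The single update line, q[s][a] += alpha * (r + gamma * max(q[s']) - q[s][a]), is the core of Q-learning, which the course covers under model-free algorithms.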
Complex physical tasks…

Rajeswaran, et al. 2018


Unexpected solutions…

Mnih, et al. 2015


Not just games and robots!

Cathy Wu
Why should we study this now?

1. Advances in deep learning


2. Advances in reinforcement learning
3. Advances in computational capability
Why should we study this now?

Tesauro, 1995

L.-J. Lin, “Reinforcement learning for robots using neural networks.” 1993
Why should we study this now?

Atari games:
• Q-learning: V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, et al. "Playing Atari with Deep Reinforcement Learning." (2013).
• Policy gradients: J. Schulman, S. Levine, P. Moritz, M. I. Jordan, and P. Abbeel. "Trust Region Policy Optimization." (2015); V. Mnih, A. P. Badia, M. Mirza, A. Graves, T. P. Lillicrap, et al. "Asynchronous methods for deep reinforcement learning." (2016).

Real-world robots:
• Guided policy search: S. Levine*, C. Finn*, T. Darrell, P. Abbeel. "End-to-end training of deep visuomotor policies." (2015).
• Q-learning: D. Kalashnikov, et al. "QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation." (2018).

Beating Go champions:
• Supervised learning + policy gradients + value functions + Monte Carlo tree search: D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, et al. "Mastering the game of Go with deep neural networks and tree search." Nature (2016).
What other problems do we need to solve to
enable real-world sequential decision making?
Beyond learning from reward

• Basic reinforcement learning deals with maximizing rewards


• This is not the only problem that matters for sequential decision
making!
• We will cover more advanced topics
• Learning reward functions from example (inverse reinforcement learning)
• Transferring knowledge between domains (transfer learning, meta-learning)
• Learning to predict and using prediction to act
Where do rewards come from?
Are there other forms of supervision?

• Learning from demonstrations


• Directly copying observed behavior
• Inferring rewards from observed behavior (inverse reinforcement learning)
• Learning from observing the world
• Learning to predict
• Unsupervised learning
• Learning from other tasks
• Transfer learning
• Meta-learning: learning to learn
Imitation learning

Bojarski et al. 2016


More than imitation: inferring intentions

Warneken & Tomasello


Inverse RL examples

Finn et al. 2016


Prediction
What can we do with a perfect model?

Mordatch et al. 2015


Prediction for real-world control

Ebert et al. 2017


How do we build intelligent machines?
How do we build intelligent machines?
• Imagine you have to build an intelligent machine, where do you start?
Learning as the basis of intelligence
• Some things we can all do (e.g. walking)
• Some things we can only learn (e.g. driving a car)
• We can learn a huge variety of things, including very difficult things
• Therefore our learning mechanism(s) are likely powerful enough to do
everything we associate with intelligence
• But it may still be very convenient to “hard-code” a few really important bits
A single algorithm?
• An algorithm for each “module”?
• Or a single flexible algorithm?

[Figures: evidence that cortex is flexible: "seeing with your tongue" (BrainPort), human echolocation (sonar), and rewiring experiments involving auditory cortex. BrainPort; Martinez et al.; Roe et al. Adapted from A. Ng.]
What must that single algorithm do?
• Interpret rich sensory inputs

• Choose complex actions


Why deep reinforcement learning?
• Deep = can process complex sensory input
▪ …and also compute really complex functions
• Reinforcement learning = can choose complex actions
Some evidence in favor of deep learning
Some evidence for reinforcement learning
• Percepts that anticipate reward
become associated with similar
firing patterns as the reward
itself
• Basal ganglia appears to be
related to reward system
• Model-free RL-like adaptation is
often a good fit for experimental
data of animal adaptation
• But not always…
What can deep learning & RL do well now?
• Acquire high degree of proficiency in
domains governed by simple, known
rules
• Learn simple skills with raw sensory
inputs, given enough experience
• Learn from imitating enough human-provided expert behavior
What has proven challenging so far?
• Humans can learn incredibly quickly
• Deep RL methods are usually slow
• Humans can reuse past knowledge
• Transfer learning in deep RL is an open problem
• Not clear what the reward function should be
• Not clear what the role of prediction should be
"Instead of trying to produce a program to simulate the adult mind, why not rather try to produce one which simulates the child's? If this were then subjected to an appropriate course of education one would obtain the adult brain."
- Alan Turing

[Figure: a general learning algorithm interacting with the environment through observations and actions.]
