Reinforcement Learning Playground

What is reinforcement learning?

Reinforcement learning is a branch of machine learning.
It involves an agent and an environment.
The agent learns the optimal behavior for maximizing rewards.

When should we worry about sequential decision making?

Limited supervision: you know what you want, but not how to get it.

Delayed consequences: an action's effects may only show up later.

Why learn RL?

Not just for games
Make optimal decisions
Maximize efficiency

What are RL applications?

Robotics
Self-driving cars
Inventory management
Financial investments
Decision-based situations

RL terminology

What is the agent?

The agent is the algorithm.
It decides which action to take.
It monitors the environment.
It is the one that learns.
Its only outputs are decisions (actions, controls).

What is an environment?

The environment is everything the agent can interact with.
Agent's actions affect the environment.
It responds to the agent's actions with consequences (observations, rewards).
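
As a rough illustration of how an environment responds to actions, here is a minimal sketch of a toy environment. The `CorridorEnv` class, its `reset` and `step` methods, and the reward values are assumptions made up for this example; they are not taken from the code in this repository.

```python
class CorridorEnv:
    """Hypothetical toy environment: a corridor of cells 0..length-1.

    The agent starts in cell 0 and the episode ends at the last cell.
    """

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.position = 0
        return self.position

    def step(self, action):
        """Apply an action (-1 = left, +1 = right).

        Returns the consequences as (observation, reward, done).
        """
        if action not in (-1, 1):
            raise ValueError("action must be -1 or +1")
        # The agent's action changes the environment; walls clamp the move.
        self.position = min(max(self.position + action, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0
        return self.position, reward, done


env = CorridorEnv(length=3)
observation = env.reset()
print(env.step(-1))  # bumping into the left wall
print(env.step(1))
print(env.step(1))   # reaches the last cell
```

The agent never touches `position` directly; it only sees what `step` returns.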

What is a state?

The state is a representation of what the agent can sense.
It does not always cover the entire environment; it is limited to what the agent can sense.
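
To make the difference between the full environment and what the agent senses concrete, here is a small sketch. The grid layout and the `observe` helper are hypothetical and exist only for this example: the environment is the whole grid, while the state handed to the agent contains just the four neighbouring cells.

```python
def observe(grid, row, col):
    """Return only the cells adjacent to the agent's position.

    Cells outside the grid are reported as '#' (wall).
    """
    def cell(r, c):
        if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
            return grid[r][c]
        return "#"

    return {
        "up": cell(row - 1, col),
        "down": cell(row + 1, col),
        "left": cell(row, col - 1),
        "right": cell(row, col + 1),
    }


# The full environment: '.' is free, '#' is a wall, 'G' is the goal.
grid = [
    "..#",
    ".#G",
    "...",
]

# The agent in the top-left corner cannot see the goal from here.
state = observe(grid, row=0, col=0)
print(state)
```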

What is an action?

An action is what an agent can do in a given state.
Actions are limited by the environment.
The action's goal is to maximize reward.

What is the reward?

The result of taking an action.
Feedback from the environment.
It can be positive or negative.
It helps encourage or discourage certain actions, policies, or behaviours.
It is what the agent tries to optimize.
Rewards are hard to formulate.
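
The points above can be sketched with a hand-written reward function. The task (navigation), the `reward` function, and the specific values are assumptions chosen for illustration; picking such numbers well is exactly the part that is hard in practice.

```python
def reward(reached_goal, hit_obstacle):
    """Hypothetical reward signal for a navigation task."""
    if reached_goal:
        return 10.0   # positive: encourages reaching the goal
    if hit_obstacle:
        return -5.0   # negative: discourages collisions
    return -1.0       # small penalty per step: encourages short paths


# One made-up episode: two ordinary steps, one collision, then the goal.
episode = [
    (False, False),
    (False, False),
    (False, True),
    (True, False),
]

# The agent tries to make this sum as large as possible.
total = sum(reward(goal, hit) for goal, hit in episode)
print(total)
```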

Where do rewards come from?

When playing video games, rewards come from scores.

Are there other forms of supervision?

Learning from demonstrations.
- Directly copying observed behavior.
- Inferring rewards from observed behavior.
Learning from observing the world.
- Learning to predict.
- Unsupervised learning
Learning from other tasks.
- Transfer learning
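
The first option above, directly copying observed behavior, can be sketched in a few lines. This is a deliberately simple tabular version: the `clone_behavior` function and the traffic-light demonstrations are made up for this example, and real systems usually fit a model instead of counting.

```python
from collections import Counter, defaultdict


def clone_behavior(demonstrations):
    """Build a policy that copies the most frequent expert action per state."""
    counts = defaultdict(Counter)
    for state, action in demonstrations:
        counts[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


# Made-up expert demonstrations as (state, action) pairs.
demos = [
    ("red_light", "stop"),
    ("green_light", "go"),
    ("red_light", "stop"),
    ("green_light", "go"),
    ("green_light", "stop"),  # the expert is not perfectly consistent
]

policy = clone_behavior(demos)
print(policy["red_light"], policy["green_light"])
```

No reward is used here: the supervision comes entirely from the expert's recorded actions.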

What is the standard reinforcement loop?

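At each step, the agent observes the current state and chooses an action.
The environment responds with a reward and the next state.
The agent uses that feedback to improve its decisions, and the loop repeats.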
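
A minimal sketch of the agent-environment interaction loop, built from the terms defined in the terminology section. The `Corridor` environment, the `RandomAgent`, and the `run_episode` helper are hypothetical names used only for this example; the agent here acts at random and its `learn` method is a placeholder.

```python
import random


class Corridor:
    """Hypothetical environment: walk from cell 0 to the last cell."""

    def __init__(self, length=4):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action is -1 (left) or +1 (right); walls clamp the move.
        self.position = min(max(self.position + action, 0), self.length - 1)
        done = self.position == self.length - 1
        return self.position, (1.0 if done else 0.0), done


class RandomAgent:
    """Hypothetical agent that acts at random and does not learn yet."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def act(self, state):
        return self.rng.choice([-1, 1])

    def learn(self, state, action, reward, next_state):
        pass  # a learning agent would update its policy here


def run_episode(env, agent, max_steps=1000):
    state = env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done and steps < max_steps:
        action = agent.act(state)                       # agent picks an action
        next_state, reward, done = env.step(action)     # environment responds
        agent.learn(state, action, reward, next_state)  # agent gets feedback
        state = next_state                              # continue from the new state
        total_reward += reward
        steps += 1
    return total_reward, steps, done


total_reward, steps, done = run_episode(Corridor(), RandomAgent())
print(f"finished={done} steps={steps} total_reward={total_reward}")
```

The `max_steps` cap is a design choice for the sketch so that the loop always terminates, even if the random agent never reaches the goal.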

What is Deep Reinforcement Learning?

Deep learning: end-to-end training of expressive, multi-layer models.
Deep models are what allow RL algorithms to solve complex problems end-to-end.

Why Deep Reinforcement Learning?

Deep = can process complex sensory input

What can deep learning & RL do well now?

Acquire a high degree of proficiency in domains governed by simple, known rules.
Learn simple skills with raw sensory inputs, given enough experience.
Learn from imitating enough human-provided expert behavior.

What has proven challenging so far?

Humans can learn incredibly quickly
- Deep RL methods are usually slow to learn by comparison
Humans can reuse past knowledge
- Transfer learning in deep RL is an open problem
Not clear what the reward function should be

How do we build intelligent machines?

Learning as the basis of intelligence.

Some things we can all do.
Some things we can only learn.

martin-fabbri/reinforcement-learning-playground
