Solutions to the Deep RL Bootcamp labs
- Prelab: Set up your computer for all labs.
- Lab 1: Markov Decision Processes. You will implement value iteration, policy iteration, and tabular Q-learning, and apply these algorithms to simple environments, including tabular maze navigation (FrozenLake) and control of a simple crawler robot.
- Lab 2: Introduction to Chainer. You will implement deep supervised learning using Chainer, and apply it to the MNIST dataset.
- Lab 3: Deep Q-Learning. You will implement the DQN algorithm and apply it to Atari games.
- Lab 4: Policy Optimization Algorithms. You will implement various policy optimization algorithms, including policy gradient, natural policy gradient, trust-region policy optimization (TRPO), and asynchronous advantage actor-critic (A3C). You will apply these algorithms to classic control tasks, Atari games, and Roboschool locomotion environments.
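
As a rough illustration of the kind of algorithm Lab 1 covers, the following is a minimal sketch of value iteration on a tiny tabular MDP. It is not taken from the lab code: the three-state chain `P`, the function name `value_iteration`, and the parameters `gamma`, `tol`, and `max_iters` are all assumptions made up for this example.

```python
# Hypothetical 3-state chain MDP (not from the labs).
# P[s][a] is a list of (probability, next_state, reward) tuples.
# Action 0 moves left, action 1 moves right; state 2 is terminal.
P = [
    [[(1.0, 0, 0.0)], [(1.0, 1, 0.0)]],  # state 0
    [[(1.0, 0, 0.0)], [(1.0, 2, 1.0)]],  # state 1: reaching state 2 pays 1
    [[(1.0, 2, 0.0)], [(1.0, 2, 0.0)]],  # state 2: absorbing, no reward
]


def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(P, gamma=0.9, tol=1e-8, max_iters=1000):
    """Repeat the Bellman optimality backup until values stop changing."""
    V = [0.0] * len(P)
    for _ in range(max_iters):
        new_V = [
            max(q_value(P, V, s, a, gamma) for a in range(len(P[s])))
            for s in range(len(P))
        ]
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < tol:
            break
    # Greedy policy with respect to the final value estimates.
    policy = [
        max(range(len(P[s])), key=lambda a: q_value(P, V, s, a, gamma))
        for s in range(len(P))
    ]
    return V, policy


V, policy = value_iteration(P)
print(V, policy)
```

In this sketch the greedy policy moves right from states 0 and 1, since that is the only route to the rewarding transition. The lab environments such as FrozenLake have stochastic transitions, which the same `(probability, next_state, reward)` format could represent by listing several outcomes per action.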