Assignment 4: Reinforcement Learning
Prof. B. Ravindran
1. State True/False
The state transition graph for any MDP is a directed acyclic graph.
(a) True
(b) False
Sol. (b)
The statement is false. An MDP can transition from a state back to itself (a self-loop), and its
transition graph can contain longer cycles as well, so it need not be acyclic.
2. Consider the following statements:
(i) The optimal policy of an MDP is unique.
(ii) We can determine an optimal policy for an MDP using only the optimal value function ($v^*$),
without accessing the MDP parameters.
(iii) We can determine an optimal policy for a given MDP using only the optimal q-value
function ($q^*$), without accessing the MDP parameters.
Which of these statements are true?
(a) They do not require the state of the agent for solving an MDP.
(b) They do not require the action taken by the agent for solving an MDP.
(c) They do not require the state transition probability matrix for solving an MDP.
(d) They do not require the reward signal for solving an MDP.
Sol. (c)
RL algorithms need to know the state the agent is in, the action it takes, and the reward
signal received from the environment in order to solve the MDP. However, they do not need
to know the state transition probability matrix.
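To make this concrete, here is a minimal sketch (assuming numpy; the function names are illustrative, not from the assignment) contrasting statements (ii) and (iii): acting greedily with respect to $q^*$ needs only an argmax over actions, whereas acting greedily with respect to $v^*$ requires a one-step lookahead through the MDP model ($p$ and $r$).

```python
import numpy as np

def greedy_from_q(q_star, s):
    # Model-free: pick the action with the largest q*-value in state s.
    return int(np.argmax(q_star[s]))

def greedy_from_v(v_star, p, r, gamma, s):
    # Model-based: evaluate sum_s' p(s'|s,a) * [r(s,a,s') + gamma * v*(s')]
    # for every action a, which requires the transition and reward model.
    lookahead = [(p[s, a] * (r[s, a] + gamma * v_star)).sum()
                 for a in range(p.shape[1])]
    return int(np.argmax(lookahead))
```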
(i) $v^\pi(s) = \mathbb{E}_\pi\left[\sum_{i=t}^{\infty} \gamma^{i-t} R_{i+1} \mid S_t = s\right]$
(ii) $q^\pi(s, a) = \sum_{s'} p(s' \mid s, a)\, v^\pi(s')$
Sol. (d)
(i) is the definition of $v^\pi(s)$, and (iii) follows from the definitions of $v^\pi(s)$ and $q^\pi(s, a)$. (ii) is
wrong because it omits the immediate reward term (and the discount factor): the correct relation
is $q^\pi(s, a) = \sum_{s'} p(s' \mid s, a)\,[\, r(s, a, s') + \gamma\, v^\pi(s') \,]$.
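As a quick numerical illustration (a hypothetical 2-state, 2-action MDP, assuming numpy; the numbers are not from the assignment), solving the Bellman expectation equation for $v^\pi$ and then computing $q^\pi$ both ways shows that the reward-free formula in (ii) does not agree with the correct one:

```python
import numpy as np

gamma = 0.9
# hypothetical MDP: p[s, a, s'] transition probabilities, r[s, a, s'] rewards
p = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.3, 0.7]]])
r = np.array([[[1.0, 0.0], [0.0, 2.0]],
              [[0.5, 0.5], [1.0, 0.0]]])
pi = [0, 1]  # deterministic policy: action taken in each state

# v^pi solves v = r_pi + gamma * P_pi v, i.e. v = (I - gamma * P_pi)^{-1} r_pi
P_pi = np.array([p[s, pi[s]] for s in range(2)])
r_pi = np.array([(p[s, pi[s]] * r[s, pi[s]]).sum() for s in range(2)])
v = np.linalg.solve(np.eye(2) - gamma * P_pi, r_pi)

s, a = 0, 1
q_correct = (p[s, a] * (r[s, a] + gamma * v)).sum()  # with the immediate reward
q_claimed = (p[s, a] * v).sum()                      # statement (ii): reward omitted
print(q_correct, q_claimed)                          # the two values differ
```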
5. State True/False
While solving MDPs, in case of discounted rewards, the value of γ (the discount factor) cannot
affect the optimal policy.
(a) True
(b) False
Sol. (b)
Changing γ changes the relative weight of immediate versus delayed rewards, so the expected
return of each state, and hence the optimal policy, can change; see the sketch below for a small
example.
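For instance (a hypothetical two-action example, not from the assignment): suppose action a gives reward 2 and terminates, while action b gives reward 0 now and reward 3 one step later. The preferred action flips as γ crosses 2/3:

```python
# Compare the discounted returns of the two hypothetical actions for two values of gamma.
for gamma in (0.5, 0.9):
    return_a = 2.0                  # immediate reward only
    return_b = 0.0 + gamma * 3.0    # delayed reward, discounted once
    best = "a" if return_a > return_b else "b"
    print(f"gamma={gamma}: return_a={return_a:.2f}, return_b={return_b:.2f} -> best action {best}")
```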
6. Consider the following statements for a finite MDP ($I$ is the identity matrix of dimensions
$|S| \times |S|$, where $S$ is the set of all states, and $P_\pi$ is a stochastic matrix):
(i) MDP with stochastic rewards may not have a deterministic optimal policy.
(ii) There can be multiple optimal stochastic policies.
(iii) If $0 \le \gamma < 1$, then the rank of the matrix $I - \gamma P_\pi$ is equal to $|S|$.
(iv) If $0 \le \gamma < 1$, then the rank of the matrix $I - \gamma P_\pi$ is less than $|S|$.
Which of the above statements are true?
Sol. (a)
As stated in the lectures, there always exists a deterministic optimal policy, so (i) is false.
The lectures also give an example of multiple stochastic optimal policies, so (ii) is true.
Since $P_\pi$ is stochastic, every eigenvalue of $\gamma P_\pi$ has magnitude at most $\gamma < 1$, so $I - \gamma P_\pi$ has no
zero eigenvalue; its rank therefore equals the number of rows, which is $|S|$. A quick numerical
check is sketched below.
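A minimal numerical sanity check (assuming numpy; the random matrix is illustrative, not from the assignment):

```python
import numpy as np

rng = np.random.default_rng(0)
n, gamma = 5, 0.95
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)        # normalise rows so P is a stochastic matrix

M = np.eye(n) - gamma * P
print(np.linalg.matrix_rank(M))                      # n, i.e. |S|
print(np.abs(np.linalg.eigvals(gamma * P)).max())    # at most gamma < 1
```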
7. Consider an MDP with 3 states A, B, C. From each state, we can go to either of the two
states; i.e., from state A, we can perform 2 actions that lead to states B and C respectively.
The rewards for all the transitions are: r(A, B) = 2 (reward if we go from A to B), r(B, A) = 5,
r(B, C) = 7, r(C, B) = 10, r(A, C) = 1, r(C, A) = 12. The discount factor is 0.7. Find the
value function for the policy given by: π(A) = C (if we are in state A, we choose the action
to go to C), π(B) = A and π(C) = B ([$v^\pi(A)$, $v^\pi(B)$, $v^\pi(C)$]).
Sol. (c)
We can substitute each option to find which one is a fixed point of the Bellman equation;
alternatively, compute $v^\pi = (I - \gamma P_\pi)^{-1} r_\pi$ directly, as in the sketch below.
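A minimal sketch of the second approach (assuming numpy), using the deterministic transitions and rewards given in the question:

```python
import numpy as np

gamma = 0.7
# state order: A, B, C; each row is the deterministic transition under pi
P_pi = np.array([[0.0, 0.0, 1.0],   # pi(A) = C
                 [1.0, 0.0, 0.0],   # pi(B) = A
                 [0.0, 1.0, 0.0]])  # pi(C) = B
r_pi = np.array([1.0, 5.0, 10.0])   # r(A, C), r(B, A), r(C, B)

v = np.linalg.solve(np.eye(3) - gamma * P_pi, r_pi)
print(v)  # approximately [15.91, 16.13, 21.29]
```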
8. Suppose x is a fixed point for the function A, y is a fixed point for the function B, and x =
BA(x), where BA is the composition of B and A. Consider the following statements:
(i) x is a fixed point for B
(ii) x = y
(iii) BA(y) = y
Sol. (c)
(c) can be zero when x is not the zero vector.
10. Which of the following is a contraction mapping in any norm?
Sol. (a)
(a) is a contraction mapping in any norm, since $\|Tu - Tv\| = 0.5\,\|u - v\|$.
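The option text for (a) is not reproduced above; as an illustration of the property used in the solution, any map of the form $T(v) = 0.5\,v + b$ satisfies $\|Tu - Tv\| = 0.5\,\|u - v\|$ in every norm, which a quick check (assuming numpy) confirms:

```python
import numpy as np

rng = np.random.default_rng(1)
b = rng.random(4)
T = lambda v: 0.5 * v + b          # illustrative affine map with contraction factor 0.5

u, v = rng.random(4), rng.random(4)
for ord_ in (1, 2, np.inf):        # check in a few different norms
    lhs = np.linalg.norm(T(u) - T(v), ord=ord_)
    rhs = 0.5 * np.linalg.norm(u - v, ord=ord_)
    print(np.isclose(lhs, rhs))    # True for each norm
```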