
Module Details

Course : Reinforcement Learning

Subject : Computer Science

No. of Modules : 54

Level : UG

Source : SwayamPrabha; Channel 13


Sr. No. Title Creator/Author E-Text Video URL Metadata
1 POMDP introduction Prof. Balaraman Ravindran - - Click Here
2 Solving POMDPs Prof. Balaraman Ravindran - - Click Here
3 MAXQ value function decomposition Prof. Balaraman Ravindran - - Click Here
4 Option discovery Prof. Balaraman Ravindran - - Click Here
5 MAXQ Prof. Balaraman Ravindran - - Click Here
6 Semi-Markov decision processes Prof. Balaraman Ravindran - - Click Here
7 Learning with options Prof. Balaraman Ravindran - - Click Here
8 Option Prof. Balaraman Ravindran - - Click Here
9 HAM Prof. Balaraman Ravindran - - Click Here
10 Policy gradient with function approximation Prof. Balaraman Ravindran - - Click Here
11 Proof of theorem 1 (REINFORCE cont'd) Prof. Balaraman Ravindran - - Click Here
12 Hierarchical reinforcement learning Prof. Balaraman Ravindran - - Click Here
13 Types of optimality Prof. Balaraman Ravindran - - Click Here
14 Actor-critic and REINFORCE Prof. Balaraman Ravindran - - Click Here
15 Policy gradient approach Prof. Balaraman Ravindran - - Click Here
16 DQN and fitted Q-iteration Prof. Balaraman Ravindran - - Click Here
17 LSPI and fitted Q Prof. Balaraman Ravindran - - Click Here
18 Function approximation and eligibility traces Prof. Balaraman Ravindran - - Click Here
19 LSTD and LSTDQ Prof. Balaraman Ravindran - - Click Here
20 State aggregation methods Prof. Balaraman Ravindran - - Click Here
21 Linear parameterization Prof. Balaraman Ravindran - - Click Here
22 Function approximation Prof. Balaraman Ravindran - - Click Here
23 Thompson sampling Prof. Balaraman Ravindran - - Click Here
24 Backward view of eligibility traces Prof. Balaraman Ravindran - - Click Here
25 Eligibility trace control Prof. Balaraman Ravindran - - Click Here
26 Eligibility traces Prof. Balaraman Ravindran - - Click Here
27 Afterstate Prof. Balaraman Ravindran - - Click Here
28 Q-learning Prof. Balaraman Ravindran - - Click Here
29 TD(0) control Prof. Balaraman Ravindran - - Click Here
30 TD(0) Prof. Balaraman Ravindran - - Click Here
31 Control in Monte Carlo Prof. Balaraman Ravindran - - Click Here
32 Off-policy MC Prof. Balaraman Ravindran - - Click Here
33 Monte Carlo Prof. Balaraman Ravindran - - Click Here
34 UCT Prof. Balaraman Ravindran - - Click Here
35 Dynamic programming Prof. Balaraman Ravindran - - Click Here
36 Policy iteration Prof. Balaraman Ravindran - - Click Here
37 Banach fixed point theorem Prof. Balaraman Ravindran - - Click Here
38 Value iteration proof Prof. Balaraman Ravindran - - Click Here
39 Convergence proof Prof. Balaraman Ravindran - - Click Here
40 L_pi convergence Prof. Balaraman Ravindran - - Click Here
41 Cauchy sequence and Green's equation Prof. Balaraman Ravindran - - Click Here
42 Bellman optimality equation Prof. Balaraman Ravindran - - Click Here
43 Bellman equation Prof. Balaraman Ravindran - - Click Here
44 MDP modelling Prof. Balaraman Ravindran - - Click Here
45 Returns, value functions and MDPs Prof. Balaraman Ravindran - - Click Here
46 Full RL introduction Prof. Balaraman Ravindran - - Click Here
47 Contextual bandits Prof. Balaraman Ravindran - - Click Here
48 Policy search Prof. Balaraman Ravindran - - Click Here
49 REINFORCE Prof. Balaraman Ravindran - - Click Here
50 PAC explanation and naive algorithm proof (PAC bounds) Prof. Balaraman Ravindran - - Click Here
51 Theorem 1 proof (UCB1 theorem) Prof. Balaraman Ravindran - - Click Here
52 Thompson sampling Prof. Balaraman Ravindran - - Click Here
53 Median elimination Prof. Balaraman Ravindran - - Click Here
54 Concentration bounds Prof. Balaraman Ravindran - - Click Here