Session 8 Bellman Equation, Optimal Policy, Iterative Policy Evaluation, Policy & Value Iteration
Author: Mainak's PMRF Tutorials
Uploaded: Premiered Apr 9, 2025
Views: 115
In this video we introduce the Bellman optimality equations. We start with the relation between the Q-value (action-value) function and the value function. Then, taking the policy to be optimal, we derive the Bellman equations at optimality and show that the optimal value function can be obtained from the optimal Q-value function by maximising over actions.
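For reference, the relations described above can be written out as follows. This is a sketch in the common textbook notation, which is an assumption here and may differ from the symbols used in the video: $\pi(a \mid s)$ is the policy, $p(s', r \mid s, a)$ the environment dynamics, and $\gamma$ the discount factor.

```latex
% Relation between the value function and the Q-value function under a policy \pi
v_\pi(s) = \sum_{a} \pi(a \mid s)\, q_\pi(s, a)
\qquad
q_\pi(s, a) = \sum_{s', r} p(s', r \mid s, a)\,\bigl[\, r + \gamma\, v_\pi(s') \,\bigr]

% Bellman optimality equations: the optimal policy puts all its weight on a maximising action
v_*(s) = \max_{a} q_*(s, a)
       = \max_{a} \sum_{s', r} p(s', r \mid s, a)\,\bigl[\, r + \gamma\, v_*(s') \,\bigr]

q_*(s, a) = \sum_{s', r} p(s', r \mid s, a)\,\bigl[\, r + \gamma \max_{a'} q_*(s', a') \,\bigr]
```

Replacing the weighted sum over actions with a maximum is the only change between the first and second groups of equations.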
Next, we use this property to derive the value iteration and policy iteration algorithms. We consider a grid-world example for policy evaluation and write Python code to compute the value function at convergence.
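The grid-world policy evaluation mentioned above can be sketched roughly as follows. This is not the code from the video: the 4x4 grid, the terminal states in two opposite corners, the reward of -1 per step, the equiprobable random policy, and the function name `policy_evaluation` are all assumptions chosen to match the standard textbook setup.

```python
import numpy as np


def policy_evaluation(n=4, gamma=1.0, theta=1e-8, max_sweeps=100_000):
    """Iterative policy evaluation for an equiprobable random policy.

    Assumed setup (not taken from the video): n x n grid, terminal states
    in the top-left and bottom-right corners, reward -1 on every move,
    and moves that would leave the grid keep the agent in place.
    """
    terminals = {(0, 0), (n - 1, n - 1)}
    actions = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    V = np.zeros((n, n))

    for _ in range(max_sweeps):
        V_new = np.zeros_like(V)
        for r in range(n):
            for c in range(n):
                if (r, c) in terminals:
                    continue  # terminal states keep value 0
                total = 0.0
                for dr, dc in actions:
                    nr, nc = r + dr, c + dc
                    # Bumping into a wall leaves the state unchanged.
                    if not (0 <= nr < n and 0 <= nc < n):
                        nr, nc = r, c
                    # Bellman expectation backup; each action has probability 1/4.
                    total += 0.25 * (-1.0 + gamma * V[nr, nc])
                V_new[r, c] = total
        delta = np.max(np.abs(V_new - V))
        V = V_new
        if delta < theta:  # stop once a full sweep changes nothing noticeably
            break
    return V


V = policy_evaluation()
print(np.round(V, 1))
```

Each sweep applies the Bellman expectation backup to every non-terminal state using the values from the previous sweep. Value iteration would differ only in taking the maximum over actions instead of the average.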
Materials: https://drive.google.com/drive/folder...
