Monte Carlo And Off-Policy Methods | Reinforcement Learning Part 3

Monte Carlo Methods

Off-Policy Methods

Constant MC

Reinforcement Learning

Author: Mutual Information

Uploaded: Oct 26, 2022

Views: 67,927

Description:

The machine learning consultancy: https://truetheta.io
Join my email list to get educational and useful articles (and nothing else!): https://mailchi.mp/truetheta/true-the...
Want to work together? See here: https://truetheta.io/about/#want-to-w...

Part three of a six-part series on Reinforcement Learning. It covers the Monte Carlo approach to solving a Markov Decision Process using only sampled episodes, with no model of the environment. At the end, we touch on off-policy methods, which enable RL when the data was generated by a different agent.
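
To make "solving with mere samples" concrete, here is a minimal sketch of first-visit Monte Carlo evaluation. The environment is a hypothetical five-state random walk chosen for illustration (it is not the example from the video), and the names sample_episode and first_visit_mc are assumptions of this sketch.

```python
import random
from collections import defaultdict

def sample_episode(rng, n_states=5, start=2):
    """Hypothetical toy MDP: a random walk over states 0..n_states-1.
    Stepping off the right end pays reward 1; off the left end pays 0."""
    state, episode = start, []
    while 0 <= state < n_states:
        next_state = state + rng.choice((-1, 1))
        reward = 1.0 if next_state == n_states else 0.0
        episode.append((state, reward))
        state = next_state
    return episode

def first_visit_mc(n_episodes=20000, gamma=1.0, seed=0):
    """Estimate state values by averaging sampled returns (no model needed)."""
    rng = random.Random(seed)
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for _ in range(n_episodes):
        episode = sample_episode(rng)
        g = 0.0
        first_return = {}
        # Walk backwards accumulating the return; overwriting means the
        # value left for each state is the return from its first visit.
        for state, reward in reversed(episode):
            g = reward + gamma * g
            first_return[state] = g
        for state, g_first in first_return.items():
            returns_sum[state] += g_first
            returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in sorted(returns_sum)}

values = first_visit_mc()
```

For this toy walk the true value of state s is (s + 1) / 6, so the estimates should land close to that with enough episodes.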

SOCIAL MEDIA

LinkedIn: / dj-rich-90b91753
Twitter: / duanejrich
Github: https://github.com/Duane321

Enjoy learning this way? Want me to make more videos? Consider supporting me on Patreon:   / mutualinformation  

SOURCES

[1] R. Sutton and A. Barto. Reinforcement learning: An Introduction (2nd Ed). MIT Press, 2018.

[2] H. Hasselt, et al. RL Lecture Series, DeepMind and UCL, 2021, • DeepMind x UCL | Deep Learning Lectur...

SOURCE NOTES

The video covers topics from chapters 5 and 7 from [1]. The whole series teaches from [1]. [2] has been a useful secondary resource.

TIMESTAMP
0:00 What We'll Learn
0:33 Review of Previous Topics
2:50 Monte Carlo Methods
3:35 Model-Free vs Model-Based Methods
4:59 Monte Carlo Evaluation
9:30 MC Evaluation Example
11:48 MC Control
13:01 The Exploration-Exploitation Trade-Off
15:01 The Rules of Blackjack and its MDP
16:55 Constant-alpha MC Applied to Blackjack
21:55 Off-Policy Methods
24:32 Off-Policy Blackjack
26:43 Watch the next video!

NOTES

Link to Constant-alpha MC applied to Blackjack: https://github.com/Duane321/mutual_in...
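
For readers who want the idea without opening the repository, the following is a rough sketch of a constant-alpha MC update on action values. It is not the linked code: the function name constant_alpha_mc_update, the state and action labels, and the every-visit update order are all assumptions made for illustration.

```python
from collections import defaultdict

def constant_alpha_mc_update(q, episode, alpha=0.1, gamma=1.0):
    """Move each visited (state, action) value a fixed fraction alpha
    toward the return observed from that point in the episode."""
    g = 0.0
    # Iterate backwards so g is the return following each step.
    for state, action, reward in reversed(episode):
        g = reward + gamma * g
        q[(state, action)] += alpha * (g - q[(state, action)])
    return q

# Hypothetical two-step episode: (state, action, reward) triples.
q = defaultdict(float)
episode = [("s0", "hit", 0.0), ("s1", "stick", 1.0)]
constant_alpha_mc_update(q, episode, alpha=0.5)
```

A fixed alpha weights recent episodes more heavily than a running average would, which helps when the policy generating the episodes keeps changing during control.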

The off-policy method you see at 25:00 differs from the rule in the textbook at eq. 7.9 (which becomes MC as n goes to infinity). That's because the textbook shows re-weighted importance sampling (IS), while I show plain (high-variance) IS.
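
The difference between the two estimators can be sketched on a one-step problem. This is an illustration only: the two-action setup, the policy probabilities, and the function name importance_sampling_estimates are assumptions, and "weighted" here means the usual self-normalized estimator, which is assumed to be what "re-weighted" refers to.

```python
import random

def importance_sampling_estimates(n_episodes=10000, seed=0):
    """Estimate the target policy's value from behavior-policy samples."""
    rng = random.Random(seed)
    target = {0: 0.1, 1: 0.9}    # hypothetical target policy pi(a)
    behavior = {0: 0.5, 1: 0.5}  # hypothetical behavior policy b(a)
    reward = {0: 0.0, 1: 1.0}    # deterministic reward per action
    weighted_returns = 0.0
    weight_total = 0.0
    for _ in range(n_episodes):
        action = rng.choice((0, 1))  # sample from the uniform behavior policy
        rho = target[action] / behavior[action]  # importance ratio
        weighted_returns += rho * reward[action]
        weight_total += rho
    ordinary = weighted_returns / n_episodes    # plain IS: unbiased
    weighted = weighted_returns / weight_total  # weighted IS: biased, lower variance
    return ordinary, weighted

ordinary, weighted = importance_sampling_estimates()
```

Under these assumptions the target policy's true value is 0.9, and both estimates should approach it as the number of episodes grows.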
