RL 7: Monte-Carlo Method | Reinforcement Learning
Автор: AI Insights - Rituraj Kaushik
Загружено: 2019-08-17
Просмотров: 36946
Monte-Carlo Method in Reinforcement Learning - In the previous video about policy iteration and value iteration we assumed that the agen has access to the model of the environment. However, this assumption is not true always. In this video, we discuss an approach called monte-carlo method (for prediction and control) using which an agent can improve its policy by interacting in the environment. We discuss a specific variant of Monte-Carlo method called "exploring start" where each episode starts from a randomly selected state-action pair. The algorithm basically uses the framework of generalized policy iteration to improve the policy iteratively.
Reinforcement learning tutorial series:
1. Multi-armed Bandits: • RL 1: Multi-armed Bandits 1
2. Multi-Armed Bandits - Action value estimation: • RL 2: Multi-Armed Bandits 2 - Action value...
3. Upper confidence bound: • RL 3: Upper confidence bound (UCB) to solv...
4. Thompson Sampling: • RL 4: Thompson Sampling - Multi-armed bandits
5. Markov Decision Process - MDP: • RL 5: Markov Decision Process - MDP | Rein...
6. Policy iteration and value iteration: • RL 6: Policy iteration and value iteration...
7. Monte-Carlo Method: • RL 7: Monte-Carlo Method | Reinforcement L...
#monte_carlo_method #reinforcement_learning
Доступные форматы для скачивания:
Скачать видео mp4
-
Информация по загрузке: