Introduction to Reinforcement Learning | DigiKey
Author: DigiKey
Uploaded: Jul 17, 2023
Views: 41,222
Reinforcement learning (RL) is a field of machine learning that aims to find optimal solutions to control problems across a wide range of tasks. It employs an artificial intelligence (AI) “agent” that takes in observations, chooses actions, and learns from rewards. Modern RL algorithms train agents by trial and error, through direct interaction with the given environment.
In the video, we cover the basic theory behind RL and demonstrate how to use Farama Foundation Gymnasium (https://gymnasium.farama.org/) and Stable Baselines3 (https://stable-baselines3.readthedocs...) in Python to train an AI agent to solve the classic cartpole (https://gymnasium.farama.org/environm...) control theory problem. At the end of the video, we encourage you to try applying the knowledge to solve the slightly more advanced inverted pendulum problem (https://gymnasium.farama.org/environm...).
The solution to the challenge can be found here: https://www.digikey.com/en/maker/proj...
Code for training RL agents to solve both the cartpole and pendulum problems can be found here: https://github.com/ShawnHymel/reinfor...
In RL, the environment can be anything the agent interacts with, such as board games, video games, virtual settings, or the real world. We often use a code wrapper (e.g. Gymnasium) to observe this environment, perform agent-specified actions, and assign rewards. Note that rewards are considered part of the environment and are instrumental in training.
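The observe/act/reward loop described above can be sketched in plain Python. This is a minimal illustration, not the code from the video: CorridorEnv is a hypothetical toy environment invented here, and it only imitates the shape of Gymnasium's reset() and step() calls (observation and info from reset; observation, reward, terminated, truncated, and info from step).

```python
import random


class CorridorEnv:
    """Hypothetical toy environment that imitates Gymnasium's reset()/step() shape."""

    def __init__(self, length=5, max_steps=20):
        self.length = length        # the goal sits at position length - 1
        self.max_steps = max_steps  # the episode is cut off after this many steps
        self.position = 0
        self.steps = 0

    def reset(self, seed=None):
        # seed is accepted for interface parity only; this toy is deterministic.
        self.position = 0
        self.steps = 0
        return self.position, {}    # first observation and an empty info dict

    def step(self, action):
        # Action 0 moves left, action 1 moves right; the walls clamp the position.
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        self.steps += 1
        terminated = self.position == self.length - 1                # reached the goal
        truncated = not terminated and self.steps >= self.max_steps  # ran out of time
        reward = 1.0 if terminated else -0.1   # the reward is defined by the environment
        return self.position, reward, terminated, truncated, {}


def run_episode(env, choose_action):
    """Run one episode and return the total reward collected."""
    observation, info = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = choose_action(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        done = terminated or truncated
    return total_reward


rng = random.Random(0)
random_return = run_episode(CorridorEnv(), lambda obs: rng.choice([0, 1]))
always_right_return = run_episode(CorridorEnv(), lambda obs: 1)
print("random agent:", random_return, "always-right agent:", always_right_return)
```

Because the reward lives inside step(), changing it (for example, removing the per-step penalty) changes what the agent is trained to do without touching the agent itself.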
The decision-making process for choosing actions based on observations is known as the “policy.” During training, the agent selects each action either at random (to explore) or according to its current policy. The environment then returns a new observation and a reward, which the training algorithm uses to adjust the policy toward actions that lead to higher predicted total future reward.
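One common way to mix random and policy-driven action selection is an epsilon-greedy rule. The sketch below assumes the agent keeps one estimated value per action; the function name and the epsilon parameter are illustrative choices, not taken from the video's code.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, otherwise the best-valued one."""
    if rng.random() < epsilon:
        # Explore: any action, chosen uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Estimated values for two actions, e.g. "push left" and "push right".
estimates = [0.1, 0.9]
greedy_action = epsilon_greedy(estimates, epsilon=0.0)    # always exploits
explore_action = epsilon_greedy(estimates, epsilon=1.0)   # always explores
```

Training typically starts with a high epsilon and lowers it over time, so the agent explores widely at first and relies on its learned estimates later.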
The cartpole problem consists of a virtual pole balanced on top of a cart that can only move left and right. The goal is to design an AI agent that can keep the pole balanced by pushing the cart left or right. In the video, we use Deep Q-Learning (https://towardsdatascience.com/deep-q...) to train a Deep Q-Network (DQN) to solve the cartpole problem.
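Deep Q-Learning builds on the tabular Q-learning update, which nudges the estimated value of a state-action pair toward the reward plus the discounted best value of the next state. A DQN replaces the table with a neural network trained toward the same kind of target. The sketch below shows only the tabular update, under assumed values for the learning rate (alpha) and discount factor (gamma); it is not the Stable Baselines3 implementation used in the video.

```python
def q_learning_update(q_table, state, action, reward, next_state, done,
                      alpha=0.1, gamma=0.99):
    """Apply one Q-learning update in place and return the target that was used."""
    # A terminal transition has no future value to bootstrap from.
    best_next = 0.0 if done else max(q_table[next_state])
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha of the way toward the target.
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return td_target


# Two states with two actions each; the values are illustrative.
q_table = {0: [0.0, 0.0], 1: [1.0, 2.0]}
target = q_learning_update(q_table, state=0, action=1, reward=1.0,
                           next_state=1, done=False, alpha=0.5, gamma=0.5)
```

In this example the target is 1.0 + 0.5 * 2.0 = 2.0, and the estimate for state 0, action 1 moves halfway from 0.0 to that target.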
We list some recommended reading and viewing materials below if you would like to dive deeper into reinforcement learning.
Articles:
Reinforcement Learning Algorithms — an intuitive overview - / reinforcement-learning-algorithms-an-intui...
Which Reinforcement learning-RL algorithm to use where, when and in what scenario? - https://medium.datadriveninvestor.com...
Q-Learning vs. Deep Q-Learning vs. Deep Q-Network - https://www.baeldung.com/cs/q-learnin...
Deep Q Networks (DQN) With the Cartpole Environment - https://wandb.ai/safijari/dqn-tutoria...
RL — Proximal Policy Optimization (PPO) Explained - / rl-proximal-policy-optimization-ppo-explained
Proximal Policy Optimization (PPO) - https://huggingface.co/blog/deep-rl-ppo
Related Videos:
Exploring Reinforcement Learning: Can AI Learn to Play QWOP?
Intro to Edge AI
Related Project Links:
Intro to Reinforcement Learning Using Gymnasium and Stable Baselines3
Related Articles:
Teach an AI to play QWOP
What is Edge AI? Machine Learning + IoT
Learn more:
Maker.io - https://www.digikey.com/en/maker
DigiKey’s Blog – TheCircuit https://www.digikey.com/en/blog
Connect with DigiKey on Facebook / digikey.electronics
And follow us on Twitter / digikey
00:00 - Intro
00:59 - History of reinforcement learning
02:14 - Environment and agent interaction loop
06:21 - Gymnasium and Stable Baselines3
07:55 - Hands-on: how to set up a Gymnasium environment
26:57 - Markov decision process
31:02 - Bellman equation for the state-value function
34:12 - Bellman equation for the action-value function
35:47 - Bellman optimality equations
36:43 - Exploration vs. exploitation
38:39 - Recommended textbook
39:25 - Model-based vs. model-free algorithms
40:27 - On-policy vs. off-policy algorithms
41:19 - Discrete vs. continuous action space
42:36 - Discrete vs. continuous observation space
43:56 - Overview of modern reinforcement learning algorithms
46:29 - Q-learning
49:27 - Deep Q-network (DQN)
51:59 - Hands-on: how to train a DQN agent
01:12:36 - Usefulness of reinforcement learning
01:13:26 - Challenge: inverted pendulum
01:14:10 - Conclusion
