Keynote - Offline reinforcement learning

Author: Anyscale

Uploaded: 2022-04-17

Views: 5349

Description:

Reinforcement learning (RL) provides an algorithmic framework for rational sequential decision making. However, the kinds of problem domains where RL has been applied successfully seem to differ substantially from the settings where supervised machine learning has been successful. RL algorithms can learn to play Atari or board games, whereas supervised machine learning algorithms can make highly accurate predictions in complex open-world settings.

Virtually all the problems that we want to solve with machine learning are really decision making problems — deciding which product to show to a customer, deciding how to tag a photo, or deciding how to translate a string of text — so why aren't we solving them all with RL? One of the biggest issues with modern RL is that it does not effectively utilize the kinds of large and highly diverse datasets that have been instrumental to the success of supervised machine learning.

In this talk, I will discuss the technologies that can help us address this issue: enabling RL methods to use large datasets via offline RL. Offline RL algorithms can analyze large, previously collected datasets to extract the most effective policies, and then fine-tune these policies with additional online interaction as needed. I will cover the technical foundations of offline RL, discuss recent algorithmic advances, and present several applications.
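
To make the idea of learning a policy from previously collected data concrete, here is a minimal sketch, not taken from the talk. It runs tabular Q-learning over a fixed list of logged transitions and never queries an environment. The three-state chain, the dataset, and the function names are all hypothetical and chosen only for illustration. The sketch also assumes every state-action pair appears in the data; practical offline RL algorithms add mechanisms to handle actions the dataset does not cover, which this example ignores.

```python
from collections import defaultdict


def offline_q_learning(dataset, n_actions, gamma=0.9, lr=0.5, epochs=200):
    """Tabular Q-learning over a fixed list of (s, a, r, s_next, done) tuples.

    No environment interaction happens here: the only source of
    experience is the logged dataset passed in.
    """
    q = defaultdict(lambda: [0.0] * n_actions)
    for _ in range(epochs):
        for s, a, r, s_next, done in dataset:
            # Bootstrapped target; terminal transitions use the reward alone.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += lr * (target - q[s][a])
    return q


def greedy_policy(q, states):
    """Pick the highest-valued action in each listed state."""
    return {s: max(range(len(q[s])), key=lambda a: q[s][a]) for s in states}


# Hypothetical logged data from a 3-state chain (assumption for illustration):
# action 1 moves right, action 0 stays put; reaching state 2 pays 1 and ends.
dataset = [
    (0, 0, 0.0, 0, False),
    (0, 1, 0.0, 1, False),
    (1, 0, 0.0, 1, False),
    (1, 1, 1.0, 2, True),
]

q = offline_q_learning(dataset, n_actions=2)
policy = greedy_policy(q, states=[0, 1])
print(policy)
```

On this toy dataset the greedy policy moves right in both non-terminal states. The policy extracted this way could then serve as the starting point for further online fine-tuning, as the description mentions.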

Related videos:

Offline Reinforcement Learning: BayLearn 2021 Keynote Talk

Offline RL with RLlib

Introduction to Policy Gradient Methods - Deep Reinforcement Learning

Model-Based Reinforcement Learning Finally Works!

Deep RL Bootcamp Lecture 6: Nuts and Bolts of Deep RL Experimentation

NeurIPS 2020 Tutorial on Offline RL: Part 1

LLMs and GPT: How Do Large Language Models Work? A Visual Introduction to Transformers

A 24x Speedup for Reinforcement Learning with RLlib + Ray

But What Is a Neural Network? | Chapter 1, Deep Learning

Policy Gradient Theorem Explained - Reinforcement Learning

A Gentle Introduction to Offline Reinforcement Learning

Offline Reinforcement Learning: Incorporating Knowledge from Data into Reinforcement Learning

Reinforcement Learning Series: Overview of Methods

Reinforcement Learning, by the Book

Reinforcement Learning: on-policy vs off-policy algorithms

MIT 6.S191 (2023): Reinforcement Learning

Offline Reinforcement Learning

Using Reinforcement Learning to Optimize IAP Offer Recommendations in Mobile Games

The Most Embarrassing Questions About Electricity!

RobotLearning: Scaling Offline Reinforcement Learning