Zhang-Wei Hong on Explore and Exploit Data in Reinforcement Learning | Toronto AIR Seminar

Author: AI Robotics Seminar - University of Toronto

Uploaded: 2023-04-03

Views: 652

Description:

Abstract:
Reinforcement learning (RL) is a data-driven method for solving sequential decision-making problems from experience gathered by interacting with the environment. RL has been shown to learn non-trivial controllers for robot locomotion and manipulation that are challenging for model-based planning. However, its intensive data requirement prevents RL from being widely applied in robotics. Even in simulators, training can take several weeks to obtain a satisfactory policy. Prior works circumvent this data requirement either by using a curiosity-driven strategy (a.k.a. exploration bonuses or intrinsic rewards) to improve exploration (data collection), or by learning from datasets curated by humans or pre-programmed controllers (offline RL). In this talk, I will illustrate the fallacy of the curiosity-driven exploration strategy and the sensitivity of offline RL algorithms to the data distribution.

Papers:
Chen, Eric R., et al. "Redeeming intrinsic rewards via constrained optimization." Advances in Neural Information Processing Systems (NeurIPS). 2022. https://arxiv.org/abs/2211.07627
Hong, Zhang-Wei, Ge Yang, and Pulkit Agrawal. "Bi-linear value networks for multi-goal reinforcement learning." ICLR. 2022. https://arxiv.org/abs/2204.13695

Bio:
Zhang-Wei Hong is a Ph.D. candidate in the Department of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology (MIT). He received his B.S. and M.S. degrees from National Tsing Hua University in Taiwan and has conducted research internships at TU Darmstadt in Germany and Preferred Networks (PFN) in Japan. Zhang-Wei's research interests lie at the intersection of reinforcement learning and optimization, with a focus on developing principled algorithms to improve the usability of RL in real-world scenarios. His work has been published in top-tier conferences such as NeurIPS, ICLR, ICRA, and CoRL.

Toronto AIR Seminar:
The Toronto AI Robotics Seminar Series is a set of events featuring young robotics and AI experts. The talks are given by local as well as global speakers and are organized by the faculty and students of the University of Toronto’s Department of Computer Science. We welcome students, researchers, and robotics enthusiasts from around the world to join us and interact with the Toronto robotics community.
Find out more at: https://robotics.cs.toronto.edu/toron...
