Decoupling Representation Learning From Reinforcement Learning | Paper Explained

Author: Bits Of Deep Learning

Uploaded: 2020-09-20

Views: 2187

Description:

Can we improve Reinforcement Learning by decoupling Representation Learning from the RL part?
In this video you'll find out.

Decoupling Representation Learning From Reinforcement Learning Paper: https://arxiv.org/abs/2009.08319

Abstract:
In an effort to overcome limitations of reward-driven feature learning in deep reinforcement learning (RL) from images, we propose decoupling representation learning from policy learning. To this end, we introduce a new unsupervised learning (UL) task, called Augmented Temporal Contrast (ATC), which trains a convolutional encoder to associate pairs of observations separated by a short time difference, under image augmentations and using a contrastive loss. In online RL experiments, we show that training the encoder exclusively using ATC matches or outperforms end-to-end RL in most environments. Additionally, we benchmark several leading UL algorithms by pre-training encoders on expert demonstrations and using them, with weights frozen, in RL agents; we find that agents using ATC-trained encoders outperform all others. We also train multi-task encoders on data from multiple environments and show generalization to different downstream RL tasks. Finally, we ablate components of ATC, and introduce a new data augmentation to enable replay of (compressed) latent images from pre-trained encoders when RL requires augmentation. Our experiments span visually diverse RL benchmarks in DeepMind Control, DeepMind Lab, and Atari.
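
To make the core idea in the abstract concrete, here is a minimal sketch of a contrastive objective over temporally separated observation pairs. It is an illustration under stated assumptions, not the paper's implementation: the helper names `temporal_pairs` and `info_nce`, the plain dot-product similarity, and the `temperature` parameter are all hypothetical choices made here, and the convolutional encoder and image augmentations are left out (the inputs are assumed to be already-encoded feature vectors).

```python
import math

def temporal_pairs(num_steps, k):
    """Return index pairs (t, t + k): each anchor step paired with the step k later.

    Hypothetical helper; k stands in for the "short time difference".
    """
    return [(t, t + k) for t in range(num_steps - k)]

def info_nce(anchors, positives, temperature=1.0):
    """Mean InfoNCE-style loss over a batch of embedding vectors.

    Row i of `anchors` should match row i of `positives`; every other row
    of `positives` in the batch is treated as a negative. Similarity is a
    plain dot product, which is an assumption of this sketch.
    """
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [sum(x * y for x, y in zip(a, p)) / temperature
                  for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the true positive at index i.
        total += log_z - logits[i]
    return total / len(anchors)

# Toy usage: made-up 2-D "embeddings" for a 4-step trajectory.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1], [0.1, 1.0]]
pairs = temporal_pairs(len(embeddings), k=2)
loss = info_nce([embeddings[t] for t, _ in pairs],
                [embeddings[tk] for _, tk in pairs])
```

The loss is small when each anchor is most similar to its own later observation and large when it is more similar to another item in the batch, which is the "associate pairs of observations" behaviour the abstract describes.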

#reinforcementlearning #contrastivelearning #unsupervisedlearning #AugmentedTemporalContrast #ATC

