Martin Klissarov - MaestroMotif: Skill Design from AI Feedback

Author: UCL DARK

Uploaded: 2025-04-14

Views: 219

Description:

Invited talk by Martin Klissarov, Research Scientist at Google DeepMind, on April 7, 2025 at UCL DARK.

Title:
MaestroMotif: Skill Design from AI Feedback

Abstract:
Describing skills and behaviours in natural language has the potential of providing an accessible way of injecting human knowledge about decision-making tasks into an AI system. We present MaestroMotif (ICLR 2025 Oral), a method for skill design that fundamentally embraces the human-AI paradigm, yielding high-performing and adaptable agents. Starting from a natural language description of a set of skills provided by a user, it leverages an LLM's feedback to automatically design rewards corresponding to each skill. It then builds on an LLM's code generation abilities to sequence and learn these skills. On a suite of complex tasks in the NetHack Learning Environment (NLE), MaestroMotif demonstrates that it surpasses existing approaches in both performance and usability.

Bio:
Martin Klissarov is a Research Scientist at Google DeepMind working with Prof. Ed Grefenstette in the Autonomous Assistants team. He is currently wrapping up his PhD, supervised by Prof. Doina Precup and Prof. Marlos C. Machado. He works at the intersection of RL, LLMs, and human-AI interaction.
