Kaiqing Zhang - “Towards Principled AI-Agents with Decentralized and Asymmetric Information”

Author: UWMadison SILO Seminar

Uploaded: 2025-02-19

Views: 141

Description:

Time: Wednesday, Feb 19th, 12:30-1:30 pm

Speaker: Kaiqing Zhang

Abstract: AI models have been increasingly deployed to develop "Autonomous Agents" for decision-making, with prominent application examples including playing Go and video games, robotics, autonomous driving, healthcare, and human assistance. Most such success stories naturally involve multiple AI agents interacting dynamically with each other and with humans. More importantly, these agents oftentimes operate with asymmetric information in practice, both across different agents and across the training-testing phases. In this talk, we will share some of our recent explorations in understanding (multi-)AI-agent decision-making with such decentralized and asymmetric information. First, we will focus on Reinforcement Learning (RL) agents in partially observable environments: we will analyze the pitfalls and efficiency of RL in partially observable Markov decision processes when there is privileged information in training, a common practice in robot learning and deep RL, and in partially observable stochastic games, when information-sharing is allowed among decentralized agents. We will show the provable benefits of privileged information and information sharing in these cases. Second, we will examine Large Language Model (LLM)-(powered) agents, which use an LLM as the main controller for decision-making, by understanding and enhancing their decision-making capability in canonical decentralized and multi-agent scenarios. In particular, we use the metric of Regret, commonly studied in Online Learning and RL, to understand LLM agents' decision-making limits in context and in controlled experiments. Motivated by the observed pitfalls of existing LLM agents, we also propose a new fine-tuning loss to promote no-regret behaviors of the models, both provably and experimentally. Time permitting, we will conclude with some additional thoughts on building principled AI agents for decision-making with information constraints.
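The regret metric mentioned in the abstract can be made concrete with a small sketch. This is not code from the talk; it is a minimal illustration of the standard external-regret definition from online learning: the agent's cumulative loss minus the cumulative loss of the best single fixed action in hindsight. The function name and data layout are my own for illustration.

```python
def cumulative_regret(action_losses, chosen_actions):
    """External regret after T rounds.

    action_losses: list of T dicts mapping each action to its loss that round.
    chosen_actions: list of T actions the agent actually played.
    Returns the agent's total loss minus that of the best fixed action in hindsight.
    """
    # Loss the agent actually incurred over the T rounds.
    agent_loss = sum(losses[a] for losses, a in zip(action_losses, chosen_actions))
    # Loss of the best single action, had it been played every round.
    actions = action_losses[0].keys()
    best_fixed = min(sum(losses[a] for losses in action_losses) for a in actions)
    return agent_loss - best_fixed

# Tiny example: two actions over three rounds; the agent always plays "a".
losses = [{"a": 1.0, "b": 0.0}, {"a": 0.0, "b": 1.0}, {"a": 1.0, "b": 0.0}]
played = ["a", "a", "a"]  # agent's total loss: 2.0; best fixed action "b" loses 1.0
print(cumulative_regret(losses, played))  # -> 1.0
```

An agent is called no-regret when this quantity grows sublinearly in T, so its average per-round regret vanishes; that is the behavior the proposed fine-tuning loss aims to promote in LLM agents.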

Bio: Kaiqing Zhang is currently an Assistant Professor at the Department of Electrical and Computer Engineering (ECE) and the Institute for Systems Research (ISR) at the University of Maryland, College Park. He is also a member of the Center for Machine Learning, the Maryland Robotics Center, and the Artificial Intelligence Interdisciplinary Institute at Maryland (AIM). Prior to joining Maryland, he was a postdoctoral scholar affiliated with LIDS and CSAIL at MIT, and a Research Fellow at the Simons Institute for the Theory of Computing at Berkeley. He received his Ph.D. from the Department of ECE at the University of Illinois at Urbana-Champaign (UIUC). He also received M.S. degrees in both ECE and Applied Math from UIUC, and a B.E. from Tsinghua University. His research interests lie in Control and Decision Theory, Game Theory, Robotics, Reinforcement/Machine Learning, Computation, and their intersections. His work has been recognized by several awards, including the Simons-Berkeley Research Fellowship, the CSL Thesis Award, the IEEE Robotics and Automation Society TC Best-Paper Award, an ICML Outstanding Paper Award, AAAI New Faculty Highlights, and an NSF CAREER Award.

