⚡️Multi-Turn RL for Multi-Hour Agents — with Will Brown, Prime Intellect
Author: Latent Space
Uploaded: 2025-05-22
Views: 4287
Claude 4 controversies, reactions, LMArena and all that jazz.
References:
Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Credit Assignment: https://x.com/willccbb/status/1925389...
Verifiers: https://github.com/willccbb/verifiers
Timestamps
00:00 Introduction to the Podcast and Guests
01:00 Discussion on Claude 4 and AI Models
03:07 Extended Thinking and Tool Use in AI
06:47 Technical Highlights and Model Trustworthiness
10:31 Thinking Budgets and Their Implications
13:38 Controversy Surrounding Opus and AI Ethics
18:49 Reflections on AI Tools and Their Limitations
21:58 The Chaos of Predictive Systems
22:56 Marketing and Safety in AI Models
24:30 Evaluating AI Companies and Their Strategies
25:53 The Role of Academia in AI Evaluations
27:43 Teaching Taste in Research
28:41 Making Educated Bets in AI Research
30:12 Recent Developments in Multi-Turn Tool Use
32:50 Incentivizing Tool Use in AI Models
34:45 The Future of Reward Models in AI
39:10 Exploring Flexible Reward Systems