How GPT-5 Thinks — OpenAI VP of Research Jerry Tworek

Author: The MAD Podcast with Matt Turck

Uploaded: 2025-10-16

Views: 18700

Description:

What does it really mean when GPT-5 “thinks”? In this conversation, OpenAI’s VP of Research Jerry Tworek explains how modern reasoning models work in practice: why pretraining and reinforcement learning (RL/RLHF) are both essential, what that on-screen “thinking” actually does, and when extra test-time compute helps (or doesn’t). We trace the evolution from O1 (a tech demo good at puzzles) to O3 (the tool-use shift) to GPT-5 (Jerry calls it “O3.1-ish”), and talk through verifiers, reward design, and the real trade-offs behind “auto” reasoning modes.

We also go inside OpenAI: how research is organized, why collaboration is unusually transparent, and how the company ships fast without losing rigor. Jerry shares the backstory on competitive-programming results like ICPC, what they signal (and what they don’t), and where agents and tool use are genuinely useful today. Finally, we zoom out: could pretraining + RL be the path to AGI?

This is the MAD Podcast: AI for the 99%. If you’re curious about how these systems actually work (without needing a PhD), this episode is your map to the current AI frontier.

OpenAI
Website - https://openai.com
X/Twitter - https://x.com/OpenAI

Jerry Tworek
LinkedIn -   / jerry-tworek-b5b9aa56
X/Twitter - https://x.com/millionint

FIRSTMARK
Website - https://firstmark.com
X/Twitter -   / firstmarkcap

Matt Turck (Managing Director)
LinkedIn -   / turck
X/Twitter -   / mattturck

LISTEN ON:
Spotify - https://open.spotify.com/show/7yLATDS...
Apple - https://podcasts.apple.com/us/podcast...

00:00 - Intro
01:01 - What Reasoning Actually Means in AI
02:32 - Chain of Thought: Models Thinking in Words
05:25 - How Models Decide Thinking Time
07:24 - Evolution from O1 to O3 to GPT-5
11:00 - Before OpenAI: Growing up in Poland, Dropping out of School, Trading
20:32 - Working on Robotics and Rubik's Cube Solving
23:02 - A Day in the Life: Talking to Researchers
24:06 - How Research Priorities Are Determined
26:53 - Collaboration vs IP Protection at OpenAI
29:32 - Shipping Fast While Doing Deep Research
31:52 - Using OpenAI's Own Tools Daily
32:43 - Pre-Training Plus RL: The Modern AI Stack
35:10 - Reinforcement Learning 101: Training Dogs
40:17 - The Evolution of Deep Reinforcement Learning
42:09 - When GPT-4 Seemed Underwhelming at First
45:39 - How RLHF Made GPT-4 Actually Useful
48:02 - Unsupervised vs Supervised Learning
49:59 - GRPO and How DeepSeek Accelerated US Research
53:05 - What It Takes to Scale Reinforcement Learning
55:36 - Agentic AI and Long-Horizon Thinking
59:19 - Alignment as an RL Problem
1:01:11 - Winning ICPC World Finals Without Specific Training
1:05:53 - Applying RL Beyond Math and Coding
1:09:15 - The Path from Here to AGI
1:12:23 - Pure RL vs Language Models
