The "Final Boss" of Deep Learning
Author: Machine Learning Street Talk
Uploaded: 2025-12-22
Views: 10142
We often think of Large Language Models (LLMs) as all-knowing, but as the team reveals, they still struggle with arithmetic a second-grader can do. Why can’t ChatGPT reliably add large numbers? Why does it "hallucinate" the laws of physics? The answer lies in the architecture. This episode explores how *Category Theory*, an ultra-abstract branch of mathematics, could provide the "Periodic Table" for neural networks, turning the "alchemy" of modern AI into a rigorous science.
In this deep-dive exploration, *Andrew Dudzik*, *Petar Veličković*, *Taco Cohen*, *Bruno Gavranović*, and *Paul Lessard* join host Tim Scarfe to discuss the fundamental limitations of today’s AI and the radical mathematical framework that might fix them.
---
Key Insights in This Episode:
The "Addition" Problem: Andrew Dudzik explains why LLMs don't actually "know" math—they just recognize patterns. When you change a single digit in a long string of numbers, the pattern breaks because the model lacks the internal "machinery" to perform a simple carry operation.
Beyond Alchemy: Tim Scarfe argues that deep learning is currently in its "alchemy" phase—we have powerful results, but we lack a unifying theory. Category Theory is proposed as the framework to move AI from trial-and-error to principled engineering. [00:13:49]
Algebra with Colors: To make Category Theory accessible, the guests use memorable analogies, such as thinking of matrices as magnets with colors that only snap together when the types match. This "partial compositionality" is the secret to building more complex internal reasoning (see the composition sketch after this list). [00:09:17]
Synthetic vs. Analytic Math: Paul Lessard breaks down the philosophical shift needed in AI research: moving from "Analytic" math (what things are made of) to "Synthetic" math (how things behave and relate to one another). [00:23:41]
The 4D Carry: In a mind-blowing conclusion, the team discusses how simple algorithmic tasks, like "carrying the one" in addition, actually relate to complex geometric structures like *Hopf Fibrations*. [00:39:30]
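To make the "missing machinery" from the addition insight concrete, here is a minimal Python sketch (our own illustration, not code from the episode; the name `add_digit_strings` is hypothetical) of grade-school addition as an explicit fold that threads a carry state from the least significant digit upward. A pure pattern-matcher has no such state to thread, which is why a single changed digit can break it.

```python
def add_digit_strings(a: str, b: str) -> str:
    """Add two non-negative integers given as decimal digit strings."""
    a, b = a.zfill(len(b)), b.zfill(len(a))      # pad to equal length
    carry, digits = 0, []
    for x, y in zip(reversed(a), reversed(b)):   # least significant digit first
        total = int(x) + int(y) + carry
        digits.append(str(total % 10))           # emit one result digit
        carry = total // 10                      # thread the carry state forward
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))

# A one-digit change flips an entire chain of carries:
assert add_digit_strings("999999999", "1") == "1000000000"
assert add_digit_strings("999999998", "1") == "999999999"
```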
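And here is a minimal sketch of the "magnets with colors" analogy (again our own illustration; `LinearMap` and `then` are hypothetical names): composition of matrices is partial, defined only when the output "color" (dimension) of one map matches the input "color" of the next.

```python
import numpy as np

class LinearMap:
    """A matrix tagged with its input/output 'colors' (dimensions)."""
    def __init__(self, matrix: np.ndarray):
        self.matrix = matrix
        self.source = matrix.shape[1]   # input color
        self.target = matrix.shape[0]   # output color

    def then(self, other: "LinearMap") -> "LinearMap":
        """Compose self followed by other; defined only when colors match."""
        if other.source != self.target:
            raise TypeError(f"colors do not snap: {self.target} != {other.source}")
        return LinearMap(other.matrix @ self.matrix)

f = LinearMap(np.random.randn(4, 3))   # a map from R^3 to R^4
g = LinearMap(np.random.randn(5, 4))   # a map from R^4 to R^5
h = f.then(g)                          # colors match (4 == 4): a map R^3 -> R^5
# f.then(f) would raise TypeError (4 != 3): composition is only partial.
```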
---
Why This Matters for AGI
If we want AI to solve the world's hardest scientific problems, it can't just be a "stochastic parrot." It needs to internalize the rules of logic and computation. By imbuing neural networks with categorical priors, researchers are attempting to build a future where AI doesn't just predict the next word—it understands the underlying structure of the universe.
---
TIMESTAMPS:
00:00:00 The Failure of LLM Addition & Physics
00:01:26 Tool Use vs Intrinsic Model Quality
00:03:07 Efficiency Gains via Internalization
00:04:28 Geometric Deep Learning & Equivariance
00:07:05 Limitations of Group Theory
00:09:17 Category Theory: Algebra with Colors
00:11:25 The Systematic Guide of Lego-like Math
00:13:49 The Alchemy Analogy & Unifying Theory
00:15:33 Information Destruction & Reasoning
00:18:00 Pathfinding & Monoids in Computation
00:20:15 System 2 Reasoning & Error Awareness
00:23:31 Analytic vs Synthetic Mathematics
00:25:52 Morphisms & Weight Tying Basics
00:26:48 2-Categories & Weight Sharing Theory
00:28:55 Higher Categories & Emergence
00:31:41 Compositionality & Recursive Folds
00:34:05 Syntax vs Semantics in Network Design
00:36:14 Homomorphisms & Multi-Sorted Syntax
00:39:30 The Carrying Problem & Hopf Fibrations
---
REFERENCES:
Model:
[00:01:05] Veo
https://deepmind.google/models/veo/
[00:01:10] Genie
https://deepmind.google/blog/genie-3-...
Paper:
[00:04:30] Geometric Deep Learning Blueprint
https://arxiv.org/abs/2104.13478
[00:16:45] AlphaGeometry
https://deepmind.google/blog/alphageo...
[00:16:55] AlphaCode
https://arxiv.org/abs/2203.07814
[00:17:05] FunSearch
https://www.nature.com/articles/s4158...
[00:37:00] Attention Is All You Need
https://arxiv.org/abs/1706.03762
[00:43:00] Categorical Deep Learning
https://arxiv.org/abs/2402.15332
Thanks to ly3xqhl8g9 on our Discord server for the draft show review!