Google's Nested Learning (HOPE) Cures LLM Amnesia & Unlocks the Physics of Intelligence
Author: ErgoSum / X Labs
Uploaded: 2025-11-30
Views: 64
Blog: https://blog.nilayparikh.com/beyond-t...
Paper: https://abehrouz.github.io/files/NL.pdf
Google's revolutionary new framework, Nested Learning (NL), detailed in a NeurIPS 2025 paper, proposes a pivotal shift in how we build AI. This research moves beyond the traditional "stack of layers" view of deep learning, suggesting neural networks should instead be seen as a hierarchy of nested optimization loops operating at different frequencies.
The Core Problem: Anterograde Amnesia
Current Large Language Models (LLMs) suffer from a condition analogous to human Anterograde Amnesia. While they retain vast pre-trained knowledge (old knowledge), they cannot consolidate new experiences into long-term storage. Knowledge acquired within a temporary Context Window evaporates the moment that window closes. This limitation leads to frustrating real-world scenarios, such as needing to re-teach an AI specific API documentation or having a "Static Assistant" recommend something you told it you were allergic to in a previous thread.
The Nested Learning Solution
NL suggests we stop building "flat" neural networks and instead use Nested Systems. This idea is inspired by the human brain, which coordinates activity using Multiple Frequencies, such as fast Gamma Waves (Perception) and slow Delta Waves (Deep Consolidation). NL decomposes the model into nested loops where different parts update at different speeds, allowing the system to react instantly to new data without losing long-term knowledge.
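To make the multi-frequency idea concrete, here is a minimal, illustrative PyTorch sketch (not code from the paper): a "fast" module is updated on every step, while a "slow" module only consolidates its accumulated gradients every K steps. The module shapes, learning rates, and the period K are arbitrary assumptions chosen for illustration.

import torch
import torch.nn as nn

# Illustrative only: two parameter groups updated at different frequencies,
# mimicking a fast "perception" loop and a slow "consolidation" loop.
fast = nn.Linear(16, 16)   # hypothetical fast module, updated every step
slow = nn.Linear(16, 16)   # hypothetical slow module, updated every K steps
opt_fast = torch.optim.SGD(fast.parameters(), lr=1e-2)
opt_slow = torch.optim.SGD(slow.parameters(), lr=1e-3)
K = 8                      # assumed consolidation period

for step in range(64):
    x, y = torch.randn(4, 16), torch.randn(4, 16)
    loss = ((slow(fast(x)) - y) ** 2).mean()
    loss.backward()                              # gradients flow into both modules
    opt_fast.step(); opt_fast.zero_grad()        # inner loop: react instantly to new data
    if (step + 1) % K == 0:
        opt_slow.step(); opt_slow.zero_grad()    # outer loop: consolidate every K steps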
The Deep Optimizer Breakthrough
A critical insight of the paper is the mathematical proof that the standard Optimizer (e.g., Adam or SGD) is actually a primitive, linear Memory System in disguise. It attempts to compress the history of gradients into a single vector, but its linearity limits its ability to capture intricate patterns (like trying to recall a complex movie by averaging all of its frames into a dull grey blur).
NL introduces the Deep Optimizer, which replaces the fixed mathematical update rule with a Non-Linear Neural Network (MLP). This means the AI is not just learning the data; it is learning how to learn.
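The contrast can be sketched in code. Classic momentum compresses the gradient history with a fixed linear recurrence, m_t = β·m_{t-1} + g_t, whereas a deep optimizer replaces that rule with a small learned network. The cell below is a toy illustration of the idea, not the paper's formulation; in practice such a cell would itself be meta-trained rather than left at random initialization.

import torch
import torch.nn as nn

class DeepOptimizerCell(nn.Module):
    """Toy learned update rule: a per-coordinate MLP over (previous memory, gradient)."""
    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, hidden), nn.Tanh(), nn.Linear(hidden, 1))

    def forward(self, memory, grad):
        # Non-linear compression of gradient history into the memory state,
        # replacing the fixed linear rule m_t = beta * m_{t-1} + g_t.
        return self.net(torch.stack([memory, grad], dim=-1)).squeeze(-1)

cell = DeepOptimizerCell()                      # untrained here; would be meta-learned in practice
param = torch.randn(10, requires_grad=True)
memory = torch.zeros_like(param)
lr = 0.1

for _ in range(5):
    loss = (param ** 2).sum()                   # placeholder objective
    grad, = torch.autograd.grad(loss, param)
    memory = cell(memory, grad).detach()        # learned "memory" of the gradient history
    with torch.no_grad():
        param -= lr * memory                    # apply the learned update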
HOPE Architecture and Efficiency
To demonstrate this theory, Google built HOPE (Higher-Order Processing Engine), which uses a Continuum Memory System to replace the standard Transformer block. HOPE is structured around three pillars (a toy sketch follows the list):
1. Fast Weights: Layers that update instantly (for context).
2. Slow Weights: Layers that update periodically (for knowledge).
3. Self-Correction: A mechanism that allows slow weights to tune the fast weights.
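Below is a toy PyTorch sketch of how these pillars could fit together (my own illustration, not HOPE's actual block): a fast projection handles the immediate input, while a slow path feeding a correction layer re-tunes the fast output. During training, the fast weights would step every iteration and the slow/correction weights only periodically, as in the earlier frequency sketch.

import torch
import torch.nn as nn

class ToyHopeBlock(nn.Module):
    """Illustrative three-pillar block: fast weights, slow weights, self-correction."""
    def __init__(self, d=32):
        super().__init__()
        self.fast = nn.Linear(d, d)      # fast weights: in-context adaptation
        self.slow = nn.Linear(d, d)      # slow weights: consolidated knowledge
        self.correct = nn.Linear(d, d)   # self-correction: slow path re-tunes the fast output

    def forward(self, x):
        h = self.fast(x)
        return h + self.correct(self.slow(x))   # slow/correction path modulates the fast path

block = ToyHopeBlock()
out = block(torch.randn(4, 32))   # fast params would update every step, slow ones every K steps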
Performance Highlights (1.3B Parameters):
• Language Modeling (Perplexity): HOPE achieved 15.11 (Lower is Better), beating the standard Transformer (18.53) and Titans (15.60).
• Complex Reasoning (ARC-c Accuracy): HOPE achieved 42.52% accuracy (Higher is Better), surpassing Transformer++ (40.66%) and Titans (42.05%).
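For context on the perplexity comparison (standard definition, not a claim from the paper): perplexity is the exponential of the mean per-token cross-entropy, so the drop from 18.53 to 15.11 corresponds to roughly 0.2 nats less loss per token.

import math

# Perplexity = exp(mean per-token cross-entropy); lower means the model
# assigns higher probability to the observed text.
for name, ppl in [("Transformer", 18.53), ("HOPE", 15.11)]:
    print(name, round(math.log(ppl), 2), "nats/token")   # ≈ 2.92 vs ≈ 2.72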
Conclusion: Nested Learning suggests the era of "Static AI" is ending. By blending neuroscience and mathematics, NL points toward models that can self-evolve and consolidate new experiences rather than forget them, supporting the belief that the future of AI is physical. HOPE represents a first step toward dynamic, organic computation.