Llama 4 Explained: Architecture, Long Context, and Native Multimodality
Author: Julia Turc
Uploaded: April 10, 2025
Views: 4,629
Curious how Meta’s Llama 4 works under the hood? In this deep dive, I reverse-engineer the Llama 4 architecture based on Meta’s official blog post and unpack the innovations that enable its 10M token context window and native multimodality.
✅ What makes Llama 4 natively multimodal?
✅ How does it support long context lengths? Is RAG obsolete?
✅ How good is it *really*?
🔍 Topics covered (with papers):
🔵 Early fusion (https://arxiv.org/pdf/2405.09818)
🔵 Context Parallelism / Ring Attention (https://arxiv.org/pdf/2310.01889)
🔵 Rotary Positional Embeddings / RoPE (https://arxiv.org/pdf/2104.09864), sketched in code after this list
🔵 Position Interpolation (https://arxiv.org/pdf/2306.15595)
🔵 No Positional Embeddings / NoPE (https://arxiv.org/pdf/2305.19466)
🔵 New training strategies: Mid-training, MetaP
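
Of the topics above, RoPE is the easiest to make concrete. Here is a minimal NumPy sketch of the rotation described in the RoFormer paper linked above; the function name, the base of 10000, and the example sizes are illustrative defaults, not details taken from Llama 4's actual code.

```python
# Minimal RoPE sketch (assumed standard formulation, base=10000; not Llama 4's code).
import numpy as np

def rope(x: np.ndarray, position: int, base: float = 10000.0) -> np.ndarray:
    """Rotate consecutive dimension pairs of `x` by position-dependent angles."""
    d = x.shape[-1]
    assert d % 2 == 0, "RoPE expects an even embedding dimension"
    # One frequency per dimension pair: theta_i = base^(-2i/d)
    freqs = base ** (-np.arange(0, d, 2) / d)
    angles = position * freqs                 # shape (d/2,)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[0::2], x[1::2]                 # even / odd dimensions form the pairs
    rotated = np.empty_like(x)
    rotated[0::2] = x1 * cos - x2 * sin       # standard 2D rotation per pair
    rotated[1::2] = x1 * sin + x2 * cos
    return rotated

# Example: rotate an 8-dimensional query vector at position 5.
q = rope(np.random.randn(8), position=5)
```

Because attention scores between vectors rotated this way depend only on their relative offset, methods like Position Interpolation can extend the context simply by rescaling `position` before the rotation, which is one of the long-context routes discussed in the video.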
This video is ideal for engineers and researchers curious about how LLMs scale, why Llama 4 matters, and what's next for long-context transformers.
📌 Note: This is a corrected re-upload due to A/V sync issues in the previous version.
#Llama4 #MetaAI #MultimodalLLM #LongContext
00:00 Intro
00:55 Behemoth, Maverick, Scout & Mixture-of-Experts
02:36 Multimodality in Llama 3
05:02 Native multimodality in Llama 4
08:27 10M context window
09:41 Ring Attention
12:28 Length generalization
16:56 New training techniques
20:21 Is RAG dead?
21:08 Evaluation
