How RWKV-7 "Goose" and Its Linear Inference Work with Author Eugene Cheah

Author: Oxen

Uploaded: 2025-04-15

Views: 1013

Description:

Paper 📜 https://arxiv.org/abs/2503.14456

Links + Notes 📝 https://www.oxen.ai/blog/how-rwkv-7-g...

Join Arxiv Dives 🤿 https://oxen.ai/community

Discord 🗿 / discord

Use Oxen AI 🐂 https://oxen.ai/

Oxen AI makes versioning your datasets as easy as versioning your code! Even with millions of unstructured images, the tool quickly handles any type of data so you can build cutting-edge AI.

--
Chapters
0:00 Why is RWKV-7 Goose interesting
2:53 How to quickly run RWKV-7 Goose
4:04 What is RWKV-7
10:20 RNNs forget things
12:33 First paper: Reinventing RNNs for the Transformer Era
24:22 Paper author Eugene Cheah joins the dive
36:43 The intuition behind each model layer
47:57 Parallelization during training
53:01 How well did RWKV-7 do on benchmarks?
56:50 Live evals on RWKV-7 and fine-tuning tips
1:00:38 Why they made the World Tokenizer
