How DeepSeek Changes the LLM Story

Author: Sasha Rush

Uploaded: 2025-02-04

Views: 16526

Description:

A quick-turnaround survey of DeepSeek-V3 and DeepSeek-R1, the two technical papers behind the recent open-source LLM news. Presented at the Simons Institute on Feb 3, 2025.

Slides: https://docs.google.com/presentation/...
