Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 3: Policy Gradients

Author: Stanford Online

Uploaded: 2025-12-08

Views: 5394

Description:

View course details: https://online.stanford.edu/courses/x...

April 9, 2025
• Key intuition behind policy gradients
• How to implement, when to use policy gradients

To learn more about enrolling in the graduate course, visit: https://online.stanford.edu/courses/c...

To follow along with the course schedule and syllabus, visit:
https://cs224r.stanford.edu/

Chelsea Finn
Assistant Professor in Computer Science and Electrical Engineering at Stanford University and co-founder of Pi.

Similar videos

Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 4: Actor-Critic Methods

Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 5: Off-Policy Actor Critic

Stanford AI Club: Jeff Dean on Important AI Trends

Complete Machine Learning and Data Science Courses

LLM fine-tuning or TRAINING a small model? We tested it!

But what is a Laplace Transform?

Why Does Fire BURN? Feynman's Answer Will DESTROY Your Reality

Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 1: Class Intro

How to master any skill so fast it feels illegal

Does Trump despise Zelensky again?

How to Speak

Senior 1C: 10 habits you won't grow without

Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 2: Imitation Learning

Reinforcement learning is terrible - Andrej Karpathy

How attention became so efficient [GQA/MLA/DSA]

The Strange Math That Predicts (Almost) Anything

Code runs 100 times slower because of false sharing.

Math anxiety, neural networks, Millennium Prize Problems / Andrey Konyaev

Stanford CS230 | Autumn 2025 | Lecture 8: Agents, Prompts, and RAG

The Only Trait for Success in the AI Era—How to Build It | Carnegie Mellon University Po-Shen Loh
