How AI Creates Images/Videos/Audio - Diffusion Models Explained

Author: Adam Lucek

Uploaded: 2024-06-21

Views: 3233

Description:

How does AI generate images, video, and audio? With the recent releases of GenAI diffusion models like Luma’s Dream Machine, OpenAI’s Sora, and Stable Diffusion 3 Medium, I was wondering exactly that! To understand it better and to help other curious folks like myself, I’ve put together a full, intuitive breakdown of how diffusion models work and of the differences between image, video, and audio models.
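
As a rough illustration of the forward diffusion step that the breakdown starts with, here is a minimal sketch. It assumes the commonly used DDPM-style formulation, in which a noisy sample at step t is a weighted mix of the clean input and Gaussian noise; the description does not spell out the exact formulation, so treat this as one plausible reading. The function names, the linear schedule, and its endpoints (1e-4 to 0.02 over 1000 steps) are illustrative assumptions.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, evenly spaced (assumed schedule, not from the source)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar[t] = product of (1 - beta) over steps 0..t, the share of signal variance left."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def forward_diffuse(pixels, t, alpha_bars, rng):
    """Jump directly to step t: x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [signal_scale * p + noise_scale * n for p, n in zip(pixels, noise)]
    return noisy, noise

rng = random.Random(0)  # seeded so runs are repeatable
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
image = [0.5] * 16  # toy "image": 16 pixel values, assumed scaled to [-1, 1]

# Print how much of the original signal survives early, midway, and at the end.
for t in (0, 499, 999):
    noisy, _ = forward_diffuse(image, t, alpha_bars, rng)
    print(t, round(math.sqrt(alpha_bars[t]), 3))
```

The printed signal scale shrinks toward zero as t grows, which is the point of the forward process: by the last step the sample is almost pure noise. In this formulation, training then amounts to asking a network to predict the returned `noise` from `noisy` and `t`.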

To keep things from being purely theoretical, the second part of this resource shows diffusion models in action: using Stable Diffusion 3 Medium, Stable Video Diffusion img2vid, and Stable Audio Open 1.0 to generate an image, convert it into a video, and add an audio track, producing a clip generated entirely by diffusion models.

Colab Notebook: https://colab.research.google.com/dri...
Miro Board: https://miro.com/app/board/uXjVK6HcIX...

Additional Resources:
Blog by Kemal Erdem: https://erdem.pl/2023/11/step-by-step...
Blog by Lilian Weng: https://lilianweng.github.io/posts/20...

Chapters:
00:00 - Introduction
01:05 - Diffusion Overview
03:18 - Step 1: Image Forward Diffusion
06:46 - Step 2: Image Model Training
10:02 - Step 3: Image Reverse Diffusion/Generation
11:58 - Post Breakdown Overview
13:43 - Audio Diffusion Models
15:54 - Video Diffusion Models Part 1
17:29 - Video Diffusion Models - Handling Time & Space
20:07 - Video Diffusion Models Part 2
21:33 - How Diffusion Models Use Text Prompts
25:06 - Diffusion Overview Recap
25:55 - Code: Setting Up Colab
27:14 - Code: Image Gen with Stable Diffusion 3 Medium
31:14 - Code: Video Gen with Stable Video Diffusion img2vid
33:34 - Code: Audio Gen with Stable Audio Open 1.0
36:57 - Code: Combining Image, Video, & Audio
37:56 - Outro

#artificialintelligence #stablediffusion #diffusionmodel