Multimodal AI in 2025: Testing Commercial and Open Source Models & Modalities
Author: CanAIHelp
Uploaded: 2025-04-02
Views: 357
🚀 Multimodal AI in 2025! 🚀
AI isn’t just about text anymore—it sees, hears, and even reasons across multiple types of data. But which models are actually delivering? In this video, I test and explore the latest multimodal AI models, from Gemini 2 and Apple Intelligence to open-source challengers.
More content on Neural Nets here: • Neural Nets Explained
🔍 What’s inside?
✅ Hands-on tests with cutting-edge multimodal models
✅ Testing Gemini 2 with images, YouTube videos, uploaded video files, and screen sharing
✅ Open-source challengers like QVQ and InternVL—can they compete with the big names?
✅ AI beyond speech and vision—music from images, scent mapping, and even robotic action!
📖 Chapters:
1. 00:00 Intuition behind multimodal AI
2. 00:50 Gemini 2.0
3. 02:09 Gemini in Google AI Studio
4. 03:14 Screen share with Gemini 2.0
5. 03:58 Apple Intelligence
6. 06:11 Open-source multimodal models
7. 07:47 QVQ model
8. 08:58 InternVL model
9. 09:40 Other modalities
💡 Whether you're a tech enthusiast, researcher, or just curious about AI's next leap, this video breaks it all down with real examples.
🔔 Like, subscribe, and join the conversation on the future of AI!
Links:
1. MMMU: https://mmmu-benchmark.github.io/
2. QVQ model: https://qwenlm.github.io/blog/qvq-72b...
3. InternVL: https://internvl.opengvlab.com/
4. Riffusion: https://www.riffusion.com/
5. Osmo AI: https://www.osmo.ai/
#AI #MultimodalAI #ArtificialIntelligence #Gemini2 #DeepLearning #MachineLearning #TechNews #OpenSourceAI #FutureTech