Think-in-Video Reasoning and Building a Local-First Video Indexer | Multimodal Weekly 104
Author: TwelveLabs
Uploaded: 2026-01-12
Views: 146
In the 104th session of Multimodal Weekly, we feature a paper on evaluating the reasoning capabilities of video generative models and an open-source local-first video indexer.
✅ Harold Chen will present TiViBench, a hierarchical benchmark specifically designed to evaluate the reasoning capabilities of image-to-video (I2V) generation models.
TiViBench: https://haroldchen19.github.io/TiViBe...
Github: https://github.com/EnVision-Research/...
Paper: https://arxiv.org/abs/2511.13704
✅ Ilias Haddad will present Edit Mind, a web application that indexes videos with AI (object detection, face recognition, emotion analysis), enables semantic search through natural-language queries, and exports scenes.
Connect with Ilias: https://iliashaddad.com/
Check out Edit Mind: https://github.com/iliashad/edit-mind
Timestamps:
00:07 Introduction
04:23 Harold starts
06:00 The 4 dimensions of TiViBench
08:33 Data and Prompt Suite (Why Narrative Prompts)
09:29 Metrics - How to score "reasoning correctness"?
10:47 Results overview across 24 tasks
11:30 Key numbers
12:27 Failure Analysis - where and why models break
13:20 VideoTPO - Prompt Preference Optimization at Test Time
15:05 Wrap-Up
16:10 Q&A with Harold
21:20 Ilias starts
21:54 Story - Why Edit Mind was built
24:21 Process - How Edit Mind works behind the scenes and the tech stack powering it
30:48 Demo - Showcase of Edit Mind and what you can do with it
38:35 Q&A with Ilias
Join the Multimodal Minds community on Discord to receive an invite for future webinars.