Think-in-Video Reasoning and Building a Local-First Video Indexer | Multimodal Weekly 104
Author: TwelveLabs
Uploaded: 2026-01-12
Views: 146
In the 104th session of Multimodal Weekly, we feature a paper on evaluating the reasoning capabilities of video generative models and an open-source local-first video indexer.
✅ Harold Chen will present TiViBench, a hierarchical benchmark specifically designed to evaluate the reasoning capabilities of image-to-video (I2V) generation models.
TiViBench: https://haroldchen19.github.io/TiViBe...
Github: https://github.com/EnVision-Research/...
Paper: https://arxiv.org/abs/2511.13704
✅ Ilias Haddad will present Edit Mind, a web application that indexes videos with AI (object detection, face recognition, emotion analysis), enables semantic search through natural-language queries, and exports scenes.
Connect with Ilias: https://iliashaddad.com/
Check out Edit Mind: https://github.com/iliashad/edit-mind
Timestamps:
00:07 Introduction
04:23 Harold starts
06:00 The 4 dimensions of TiViBench
08:33 Data and Prompt Suite (Why Narrative Prompts)
09:29 Metrics - How to score "reasoning correctness"?
10:47 Results overview across 24 tasks
11:30 Key numbers
12:27 Failure Analysis - where and why models break
13:20 VideoTPO - Prompt Preference Optimization at Test Time
15:05 Wrap-Up
16:10 Q&A with Harold
21:20 Ilias starts
21:54 Story - Why Edit Mind was built
24:21 Process - How Edit Mind works behind the scenes and the tech stack powering it
30:48 Demo - Showcase of Edit Mind and what you can do with it
38:35 Q&A with Ilias
Join the Multimodal Minds community on Discord to receive an invite for future webinars.