[SKKU AI Colloquium 2025] Prof. Kyuhong Shim - Context Compression for Efficient Multimodal LLMs
Author: Sungkyunkwan University Graduate School of AI
Uploaded: 2026-01-11
Views: 56
Lecture title: Context Compression for Efficient Multimodal LLMs
Speaker: Prof. Kyuhong Shim (Sungkyunkwan University)
Lecture summary: As multimodal large language models (MLLMs) continue to extend their context length, a single model can now integrate information from text, audio, video, and embodied signals. Despite this progress, deploying ultra-long-context models in real systems remains difficult because of practical memory constraints and strict latency requirements. In this talk, I will outline recent approaches designed to address these challenges, with particular attention to techniques that compress the key–value (KV) cache. I will close by highlighting open research directions and practical considerations for building scalable and efficient multimodal LLM inference pipelines.
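For readers unfamiliar with the KV cache compression the abstract refers to, the following is a minimal sketch, not the speaker's method: one common family of approaches evicts cached entries once the cache exceeds a memory budget, scoring each position by the attention mass it has accumulated. The function name, shapes, and scoring rule here are illustrative assumptions.

```python
import numpy as np

def compress_kv_cache(keys, values, attn_scores, budget):
    """Illustrative eviction-based KV cache compression (toy sketch).

    keys, values : (seq_len, head_dim) arrays, the cached K/V entries
    attn_scores  : (seq_len,) accumulated attention each position received
    budget       : maximum number of cached positions to retain
    """
    seq_len = keys.shape[0]
    if seq_len <= budget:
        return keys, values, attn_scores
    # Keep the top-`budget` positions by importance, sorted back into
    # their original temporal order to preserve positional structure.
    keep = np.sort(np.argsort(attn_scores)[-budget:])
    return keys[keep], values[keep], attn_scores[keep]

# Toy usage: a cache of 8 positions squeezed down to a budget of 4.
rng = np.random.default_rng(0)
K = rng.standard_normal((8, 64))
V = rng.standard_normal((8, 64))
scores = rng.random(8)
K_c, V_c, s_c = compress_kv_cache(K, V, scores, budget=4)
print(K_c.shape)  # (4, 64)
```

Real systems apply such budgets per layer and per attention head and must integrate with the attention kernels themselves; the sketch only conveys the budget-and-evict idea.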