Interpreting and Leveraging Diffusion Representations with Deepti Ghadiyaram
Author: NDIF Team
Uploaded: 2026-01-15
Deepti Ghadiyaram is an Assistant Professor at Boston University in the Department of Computer Science, with affiliated appointments in Electrical and Computer Engineering and the Faculty of Computing & Data Sciences. Her research focuses on building safe, interpretable, and robust computer vision systems with enhanced reasoning capabilities. Prior to joining BU, she earned her PhD from UT Austin and worked at Facebook AI Applied Research and Runway.
In this seminar, Professor Ghadiyaram presents groundbreaking research on interpretability in diffusion models. The work unveils how rich visual semantic information is encoded across different layers and denoising timesteps of various diffusion architectures. Using k-sparse autoencoders, the team discovers monosemantic interpretable features and validates their findings through transfer learning experiments with lightweight classifiers on off-the-shelf diffusion models. The research demonstrates the effectiveness of diffusion features for representation learning across four datasets, while providing comprehensive analysis of how architectural choices, pre-training datasets, and language model conditioning influence visual representation granularity, inductive biases, and transfer learning performance. This work represents a significant advancement in understanding and interpreting these powerful but complex black-box models.
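As a rough illustration of the kind of analysis described above, the sketch below trains a minimal k-sparse autoencoder on pre-extracted diffusion features: the encoder keeps only the top-k latent activations per sample, and the decoder reconstructs the original feature. This is not the authors' implementation (see the linked repository for that); the dimensions, dictionary size, k, and the synthetic feature tensor are all hypothetical placeholders.

```python
# Minimal k-sparse autoencoder sketch for diffusion features (illustrative only;
# all hyperparameters and data below are placeholders, not the paper's setup).
import torch
import torch.nn as nn


class KSparseAutoencoder(nn.Module):
    def __init__(self, feature_dim: int, dict_size: int, k: int):
        super().__init__()
        self.k = k
        self.encoder = nn.Linear(feature_dim, dict_size)
        self.decoder = nn.Linear(dict_size, feature_dim)

    def forward(self, x: torch.Tensor):
        # Encode, then keep only the top-k activations per sample
        # (the "k-sparse" constraint); all other latent units are zeroed.
        z = self.encoder(x)
        topk = torch.topk(z, self.k, dim=-1)
        sparse_z = torch.zeros_like(z).scatter_(-1, topk.indices, topk.values)
        return self.decoder(sparse_z), sparse_z


if __name__ == "__main__":
    # Stand-in for intermediate diffusion activations (e.g. U-Net block outputs
    # at a chosen denoising timestep); real features would come from model hooks.
    feats = torch.randn(4096, 1280)

    sae = KSparseAutoencoder(feature_dim=1280, dict_size=16384, k=32)
    opt = torch.optim.Adam(sae.parameters(), lr=1e-4)

    for step in range(100):
        batch = feats[torch.randint(0, feats.shape[0], (256,))]
        recon, _ = sae(batch)
        loss = nn.functional.mse_loss(recon, batch)  # reconstruction objective
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Once trained, individual dictionary units can be inspected by finding the inputs that activate them most strongly, which is one common way such sparse features are probed for monosemantic, human-interpretable concepts.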
📄 Paper: https://arxiv.org/abs/2411.16725
💻 Code & Visualizations: https://github.com/revelio-diffusion/...
🌐 Deepti's Website: https://deeptigp.github.io/