F. Boenisch, "Understanding & Mitigating Memorization in Foundation Models" @ CISPA ELLIS Summer School '25
Author: CISPA
Uploaded: 2025-10-22
Views: 30
Talk by Franziska Boenisch (CISPA Helmholtz Center for Information Security) at the CISPA ELLIS Summer School 2025 on "Trustworthy AI – Secure and Safe Foundation Models"
https://cispa.de/summer-school-2025
Abstract
Memorization occurs when machine learning models store and reproduce specific training examples at inference time—a phenomenon that raises serious concerns for privacy and intellectual property. In this talk, we will explore what it means for modern ML models to memorize data, and why this behavior has become especially relevant in large foundation models. I will present concrete ways to define and measure memorization, show how it manifests in practice, and analyze which data is most vulnerable. We will examine both large self-supervised vision encoders and state-of-the-art diffusion models. For encoders, we identify the neurons responsible for memorization, revealing insights into internal model behavior and contrasting supervised with self-supervised training. For diffusion models, I will show how to localize memorization, prune responsible neurons, and reduce overfitting to the training data—helping to mitigate privacy and copyright risks while improving generation diversity.
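The abstract describes localizing memorization to specific neurons and then pruning them to reduce overfitting. A minimal sketch of that general idea follows, assuming a simple scoring rule (mean activation on a suspected-memorized example minus mean activation on typical data); the function names, the scoring rule, and the pruning step are illustrative assumptions, not the authors' actual method.

```python
# Illustrative sketch only: score neurons by how much more strongly they fire
# on suspected-memorized examples than on typical data, then "prune" the
# top-scoring ones by zeroing their outgoing weights. Not the talk's method.

def memorization_scores(acts_memorized, acts_general):
    # acts_*: lists of per-neuron activation vectors (one vector per example).
    n = len(acts_memorized[0])
    mem_mean = [sum(a[i] for a in acts_memorized) / len(acts_memorized) for i in range(n)]
    gen_mean = [sum(a[i] for a in acts_general) / len(acts_general) for i in range(n)]
    # Higher score = neuron fires disproportionately on the memorized examples.
    return [m - g for m, g in zip(mem_mean, gen_mean)]

def prune_top_neurons(weights, scores, k):
    # Zero the outgoing weights of the k highest-scoring neurons.
    top = set(sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k])
    return [0.0 if i in top else w for i, w in enumerate(weights)]
```

Under this toy rule, a neuron that activates only on the memorized example gets the highest score and is pruned first, while broadly useful neurons are left intact.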