Guy Bresler | Global Minimizers of Sigmoid Contrastive Loss
Author: Harvard CMSA
Uploaded: 2025-10-14
Views: 131
Workshop on Mathematical foundations of AI
10/6/2025
Speaker: Guy Bresler, MIT
Title: Global Minimizers of Sigmoid Contrastive Loss
Abstract: The meta-task of obtaining and aligning representations through contrastive pre-training is steadily gaining importance since its introduction in CLIP and ALIGN. In this paper we theoretically explain the advantages of synchronizing with trainable inverse temperature and bias under the sigmoid loss, as implemented in the recent SigLIP models of Google DeepMind. Temperature and bias can drive the loss function to zero for a rich class of configurations that we call (m,b)-Constellations. (m,b)-Constellations are a novel combinatorial object related to spherical codes and are parametrized by a margin m and relative bias b. We use our characterization of constellations to theoretically justify the success of SigLIP on retrieval, to explain the modality gap present in SigLIP, and to identify the necessary dimension for producing high-quality representations. We also propose a reparameterization of the sigmoid loss with explicit relative bias, which appears to improve training dynamics. Joint work with Kiril Bangachev, Iliyas Noman, and Yury Polyanskiy.
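To make the abstract's central claim concrete, here is a minimal pure-Python sketch of a sigmoid contrastive loss with an inverse temperature `t` and a relative bias `b_rel`. The exact form used here, logit = t * (<x_i, y_j> - b_rel), the averaging over the batch size, and all function and variable names are assumptions made for illustration; they are not taken from the talk or from the SigLIP code. The toy embeddings have matching inner product 1 and non-matching inner product 0, so with `b_rel = 0.5` they are separated from the bias by a margin of 0.5, and the loss shrinks toward zero as `t` grows.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid_contrastive_loss(xs, ys, t, b_rel):
    """Assumed form: mean over the batch of softplus(-z_ij * t * (<x_i, y_j> - b_rel)),
    with z_ij = +1 for matching pairs (i == j) and -1 otherwise."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        for j in range(n):
            z = 1.0 if i == j else -1.0
            logit = t * (dot(xs[i], ys[j]) - b_rel)
            total += softplus(-z * logit)
    return total / n


# Hypothetical toy data: two unit vectors per modality, perfectly aligned pairs.
xs = [[1.0, 0.0], [0.0, 1.0]]
ys = [[1.0, 0.0], [0.0, 1.0]]

for t in (1.0, 10.0, 100.0):
    print(t, sigmoid_contrastive_loss(xs, ys, t, b_rel=0.5))
```

The printed loss decreases monotonically in `t` for this configuration, which illustrates how a trainable temperature and bias can drive the loss toward zero once matching and non-matching inner products lie on opposite sides of the relative bias.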