"White-Box Transformers via Sparse Rate Reduction" - Sam Buchanan, Research at TTIC
Author: TTIC
Uploaded: 2025-06-04
Views: 107
Sam Buchanan, Toyota Technological Institute at Chicago (TTIC)
Originally recorded on May 23, 2025, at TTIC, 6045 S. Kenwood Avenue, Chicago, IL.
In this talk, Sam Buchanan introduces a new theoretical framework for understanding and designing transformer-like architectures through the lens of sparse rate reduction, a measure that balances intrinsic information compression against extrinsic sparsity. He presents CRATE, a family of mathematically interpretable architectures derived from this principle, in which multi-head self-attention and MLP layers emerge as optimization steps on this unified objective. Experiments demonstrate that CRATE models effectively compress and sparsify representations on real-world datasets, achieving performance comparable to ViT and GPT-2 while offering more interpretable structure.
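For reference, a sketch of the sparse rate reduction objective as stated in the associated CRATE paper (Yu et al., NeurIPS 2023); the notation Z = f(X), coding rates R and R^c, subspace bases U_[K], and sparsity weight λ follow that paper and are not defined in the description above.

% Sparse rate reduction objective (a sketch in the CRATE paper's notation;
% the exact form is taken from that paper, not quoted from the talk itself).
% Z = f(X): token representations; U_{[K]}: K learned subspace bases;
% R(Z): coding rate of Z; R^c(Z; U_{[K]}): coding rate of Z against the subspaces;
% \lambda > 0: weight on the sparsity penalty.
\max_{f \in \mathcal{F}} \; \mathbb{E}_{Z = f(X)}\Big[\Delta R\big(Z;\, U_{[K]}\big) - \lambda \lVert Z \rVert_0\Big]
  \;=\; \max_{f \in \mathcal{F}} \; \mathbb{E}_{Z}\Big[R(Z) - R^{c}\big(Z;\, U_{[K]}\big) - \lambda \lVert Z \rVert_0\Big]

Under this reading, the attention block corresponds to a descent step on the compression term R^c and the MLP block to a sparsification step on the ℓ0 penalty, which is the sense in which the layers "emerge as optimization steps" on the unified objective.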
Timestamps:
00:00 Introduction
01:45 Talk begins
57:30 Q&A
#Transformers #RepresentationLearning #SparseRateReduction #MachineLearning #DeepLearning #AI #WhiteBoxModels #Interpretability #NeuralNetworks #Research #TTIC #CRATE #ViT #GPT2