Unnatural Language Processing: On the Puzzling Out-of-Distribution Behavior of Language Models - M. Baroni
Author: HiTZ zentroa
Uploaded: 2024-06-12
Views: 140
Unnatural Language Processing: On the Puzzling Out-of-Distribution Behavior of Language Models - Marco Baroni (Universitat Pompeu Fabra)
Summary: Modern language models (LMs) respond with uncanny fluency when prompted using a natural language, such as English. However, they can also produce predictable, semantically meaningful output when prompted with low-likelihood "gibberish" strings, a phenomenon exploited for developing effective information extraction prompts (Shin et al. 2020) and bypassing security checks in adversarial attacks (Zou et al. 2023). Moreover, the same "unnatural" prompts often trigger the same behavior across LMs (Rakotonirina et al. 2023, Zou et al. 2023), hinting at a shared "universal" but unnatural LM code. In my talk, I will use unnatural prompts as a tool to gain insights into how LMs process language-like input. I will in particular discuss recent and ongoing work on three fronts: transferable unnatural prompts, as a window into LM invariances (Rakotonirina et al. 2023); mechanistic interpretability exploration of the activation pathways triggered by natural and unnatural prompts (Kervadec et al. 2023); and first insights into the lexical nature of unnatural prompts. Although a comprehensive understanding of how and why LMs respond to unnatural language remains elusive, I aim to present a set of intriguing facts that I hope will inspire others to explore this phenomenon.