Jan Buys - The Curious Case of Neural Text Degeneration [IndabaX SA 2020]
Author: Deep Learning IndabaX
Uploaded: 2020-03-27
Views: 628
Talk by Jan Buys from the University of Cape Town presenting at the Deep Learning IndabaX South Africa 2020
[https://indabax.co.za]
Despite considerable advances in neural language modelling, it is still an open question how to best apply language models to generate text. In this talk we investigate what the best decoding strategy is for high-quality long-form text generation. While language models are trained to maximize likelihood, maximization-based decoding methods such as beam search lead to degeneration — producing text that is bland, incoherent, or repetitive. To address this we propose Nucleus Sampling, a simple but effective decoding method that avoids text degeneration by truncating the unreliable tail of the probability distribution, and sampling from the nucleus of tokens containing most of the probability mass. We perform an extensive evaluation of multiple decoding methods, comparing generations from each method to human text along several axes including likelihood, diversity, and repetition. Our results show that Nucleus Sampling is the best currently available decoding strategy for generating long-form text that is both high-quality and diverse.
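The core of Nucleus Sampling as described above (truncate the unreliable tail, then sample from the smallest set of tokens holding most of the probability mass) can be sketched in a few lines. This is a minimal illustration assuming a numpy probability vector, not the authors' reference implementation; the threshold `p` and the helper name are placeholders:

```python
import numpy as np

def nucleus_sample(probs, p=0.9, rng=None):
    """Sample a token index from the 'nucleus': the smallest set of
    most-probable tokens whose cumulative probability reaches p.
    `probs` is a 1-D array of token probabilities summing to 1."""
    rng = rng or np.random.default_rng()
    order = np.argsort(probs)[::-1]              # token ids, most probable first
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1  # size of the smallest prefix with mass >= p
    nucleus = order[:cutoff]                     # keep the head, drop the tail
    renormalized = probs[nucleus] / probs[nucleus].sum()
    return rng.choice(nucleus, p=renormalized)
```

For example, with `probs = [0.5, 0.3, 0.15, 0.05]` and `p = 0.7`, the nucleus is the first two tokens (cumulative mass 0.8), so the tail tokens can never be sampled; this is what suppresses the degenerate low-probability continuations the talk discusses.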
About Jan:
I am a Lecturer in the Department of Computer Science at the University of Cape Town. My research focuses on Natural Language Generation and sequence modelling with deep learning. I am particularly interested in approaches that add more explicit structure or knowledge to neural networks.
During my master's and PhD, I worked on syntactic and semantic parsing, and on language models that incorporate syntax or semantics. During my postdoc, my focus shifted to the more general problem of long-form text generation with neural language models. I worked on ways to incorporate additional structure into the model in order to improve its quality and avoid gibberish. More recently, my collaborators and I have also investigated the effect of the decoding algorithm (e.g. beam search or sampling) on text generation quality, and proposed a new decoding method.
Other research questions I am currently interested in include using graph-based representations for dialog systems and question answering, and developing deep learning models for low-resource languages. I am particularly excited to apply my NLP expertise to African languages. This year I am planning to work on language modelling, morphological tagging and segmentation, and neural machine translation for African languages.