Claudio Pinhanez - The Non-Determinism of Small LLMs
Author: PyData
Uploaded: 2025-12-17
Claudio Pinhanez, Principal Research Scientist at IBM, presents a talk on
“The Non-Determinism of Small LLMs: Evidence of Low Answer Consistency in Repetition Trials of Standard Multiple-Choice Benchmarks.”
Large language models are often evaluated on accuracy, but how consistent are their answers when they are asked the same question multiple times? This talk explores how small language models (2B–8B parameters) behave under repeated questioning and what their variability reveals about reliability and evaluation.
In this session, Claudio discusses:
🔹 Answer consistency in small vs. medium-sized LLMs
🔹 The impact of inference settings, model size, and fine-tuning
🔹 Trade-offs between accuracy and consistency in model evaluation
🔹 New analytical tools for studying model stability
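To make "answer consistency" concrete, here is a minimal sketch of one possible way to score it: for each question, take the share of repeated answers that agree with the most frequent answer, then average over questions. This description does not state the metric used in the talk, so the function names (`modal_agreement`, `mean_consistency`) and the sample data are hypothetical and for illustration only.

```python
from collections import Counter


def modal_agreement(answers):
    """Fraction of repeated answers that match the most frequent answer."""
    if not answers:
        raise ValueError("need at least one answer")
    most_common_count = Counter(answers).most_common(1)[0][1]
    return most_common_count / len(answers)


def mean_consistency(trials):
    """Average modal agreement over all questions.

    `trials` maps a question id to the list of answers the model gave
    when asked that same question repeatedly.
    """
    if not trials:
        raise ValueError("need at least one question")
    return sum(modal_agreement(a) for a in trials.values()) / len(trials)


# Made-up repetition data for two multiple-choice questions (illustrative only).
trials = {
    "q1": ["A", "A", "A", "A"],  # fully consistent
    "q2": ["B", "C", "B", "D"],  # the modal answer "B" appears in half the runs
}

if __name__ == "__main__":
    for qid, answers in trials.items():
        print(qid, modal_agreement(answers))
    print("mean consistency:", mean_consistency(trials))
```

A score like this is independent of accuracy: a model can be consistently wrong or inconsistently right, which is why the two are treated as separate evaluation axes.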
This talk was recorded during the PyData Yerevan November Meetup, held on November 6, 2025, at the American University of Armenia.
--
www.pydata.org
PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
00:22 Welcome!
Want to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...