Which LLM Benchmarks Really Matter?
Author: Garrett Love
Uploaded: 2025-07-31
Views: 423
There are so many LLM benchmarks! What do they mean and how should you view them?
Sources from this video:
https://www.vals.ai/benchmarks/aime-2...
https://www.vals.ai/benchmarks/gpqa-0...
https://www.vals.ai/benchmarks/lcb-07...
https://aider.chat/2024/12/21/polyglo...
https://agi.safe.ai/
https://huggingface.co/spaces/TIGER-L...
https://mathvista.github.io/
https://www.vals.ai/benchmarks/mgsm-2...
https://huggingface.co/spaces/Krissec...
https://evalplus.github.io/leaderboar...
Sign up for my local-first AI assistant, Anna:
https://holaanna.com
Get $200 in credit on Digital Ocean and help support my channel!
https://m.do.co/c/ffbb4875a5db