Unlocking LLM Reasoning, with Simeng Sophia Han
Author: Women in AI Research WiAIR
Uploaded: 2025-08-27
Views: 291
How can we go beyond accuracy to truly understand large language models?
In this episode of the Women in AI Research podcast, hosts Jekaterina Novikova and Malikeh Ehghaghi sit down with Simeng Sophia Han (PhD candidate at @yale, Research Scientist Intern at @meta, ex @googledeepmind, ex @amazon aws) to explore the future of LLM reasoning, evaluation, and explainable AI.
🌟 What you’ll learn in this episode:
• Why evaluating reasoning goes beyond correctness
• How brain teasers uncover hidden strengths and weaknesses of LLMs
• The importance of symbolic reasoning for complex problem solving
• The role of mentorship and early research experiences in shaping careers
• Why consistency in AI outputs is essential for building trust
• How humans combine brute force and intuition — and what this means for AI
ToC:
00:00 Introduction to LLM Reasoning and Evaluation
02:36 Simeng Sophia Han's Research Journey
06:26 Reflections on Early Research Experiences
11:25 Understanding LLM Reasoning Beyond Accuracy Metrics
16:16 Exploring Brain Teasers in LLM Reasoning
22:25 Example of Mathematical Problem Solving with Constraints
24:16 Cognitive Science Insights for Language Models
29:13 Challenges in Human-Written Reasoning Chains
32:16 Explaining the Black Box of LLMs
37:28 Consistency and Trustworthiness in AI Models
39:17 Symbolic Reasoning in LLMs
42:00 Neuro-Symbolic Reasoning Approaches
46:33 Future Directions in AI Research
49:11 Advice for Women in AI Research
REFERENCES:
01:26 Simeng Sophia Han - Google Scholar profile (https://scholar.google.ca/citations?h...)
11:40 Creativity or Brute Force? Using Brainteasers as a Window into the Problem-Solving Abilities of Large Language Models (https://arxiv.org/abs/2505.10844)
25:35 HYBRIDMIND: Meta Selection of Natural Language and Symbolic Language for Enhanced LLM Reasoning (https://arxiv.org/abs/2409.19381)
29:25 P-FOLIO: Evaluating and Improving Logical Reasoning with Abundant Human-Written Reasoning Chains (https://arxiv.org/abs/2410.09207)
39:05 FOLIO: Natural Language Reasoning with First-Order Logic (https://arxiv.org/abs/2209.00840)
41:54 HYBRIDMIND: Meta Selection of Natural Language and Symbolic Language for Enhanced LLM Reasoning (https://arxiv.org/abs/2409.19381)
🎧 Subscribe to stay updated on new episodes spotlighting brilliant women shaping the future of AI.
WiAIR website:
♾️ https://women-in-ai-research.github.io
Follow us at:
♾️ LinkedIn: / women-in-ai-research
♾️ Bluesky: https://bsky.app/profile/wiair.bsky.s...
♾️ X (Twitter): https://x.com/WiAIR_podcast
#LLM #AIResearch #ExplainableAI #Reasoning #aireasoning #MachineLearning #CognitiveScience #SymbolicReasoning #WiAIRPodcast #WiAIR