OpenAI O3 & O4 Mini: The First True Reasoning Agents?
Author: Prompt Engineering
Uploaded: Apr 16, 2025
Views: 11,852
I tested GPT-4.1 on my own coding benchmark. It's impressive, but its intelligence relative to cost doesn't justify replacing better options like Gemini-2.5 Pro from Google. Learn more here!
LINKS:
https://openai.com/index/introducing-...
https://github.com/openai/codex
https://www.swebench.com/#verified
https://github.com/sierra-research/ta...
https://aider.chat/docs/leaderboards/
https://x.com/testingcatalog/status/1...
https://x.com/kimmonismus/status/1912...
RAG Beyond Basics Course:
https://prompt-s-site.thinkific.com/c...
Let's Connect:
🦾 Discord: / discord
☕ Buy me a Coffee: https://ko-fi.com/promptengineering
🔴 Patreon: / promptengineering
💼Consulting: https://calendly.com/engineerprompt/c...
📧 Business Contact: [email protected]
Become a Member: http://tinyurl.com/y5h28s6h
💻 Pre-configured localGPT VM: https://bit.ly/localGPT (use Code: PromptEngineering for 50% off).
Sign up for the newsletter, localGPT:
https://tally.so/r/3y9bb0
OpenAI's Revolutionary O3 and O4 Mini Models: A New Benchmark in AI Tools and Reasoning
In this video, we unpack OpenAI's latest groundbreaking announcement featuring the release of two new models: O3 and O4 Mini. These models mark the first time reasoning models can effectively use tools, overcoming a major limitation of previous models. With native multimodal reasoning capabilities, enhanced tool usage, and improved performance in coding, math, science, and visual perception, these models set a new standard in AI performance. The video also introduces OpenAI's Codex CLI for terminal-based reasoning and discusses the significant performance improvements and cost optimizations compared to the previous O1 models. Stay tuned for detailed benchmarks, real-world tests, and comparisons with competitors' models.
00:00 Introduction to OpenAI's New Models
00:39 Tool Usage and Multimodal Capabilities
01:06 Model Replacements and Enhancements
01:57 Performance Benchmarks and Real-World Applications
03:40 Cost Efficiency and Usage Limits
05:23 Instruction Following and Coding Performance
10:43 Reinforcement Learning and Tool Integration
15:14 Pricing and Codex CLI
16:37 Conclusion and Future Expectations
