OpenAI O3 & O4 Mini: The First True Reasoning Agents?

Tags: prompt engineering, Prompt Engineer, LLMs, AI, artificial intelligence, Llama, GPT-4, fine-tuning LLMs

Author: Prompt Engineering

Uploaded: Apr 16, 2025

Views: 11,852

Description:

I tested GPT-4.1 on my own coding benchmark. It's impressive, but its intelligence relative to cost doesn't justify replacing better options like Gemini-2.5 Pro from Google. Learn more here!

LINKS:
https://openai.com/index/introducing-...
https://github.com/openai/codex
https://www.swebench.com/#verified
https://github.com/sierra-research/ta...
https://aider.chat/docs/leaderboards/
https://x.com/testingcatalog/status/1...
https://x.com/kimmonismus/status/1912...

RAG Beyond Basics Course:
https://prompt-s-site.thinkific.com/c...

Let's Connect:
🦾 Discord: / discord
☕ Buy me a Coffee: https://ko-fi.com/promptengineering
🔴 Patreon: / promptengineering
💼Consulting: https://calendly.com/engineerprompt/c...
📧 Business Contact: [email protected]
Become a Member: http://tinyurl.com/y5h28s6h

💻 Pre-configured localGPT VM: https://bit.ly/localGPT (use Code: PromptEngineering for 50% off).

Sign up for the newsletter, localGPT:
https://tally.so/r/3y9bb0

OpenAI's Revolutionary O3 and O4 Mini Models: A New Benchmark in AI Tools and Reasoning

In this video, we unpack OpenAI's latest groundbreaking announcement featuring the release of two new models: O3 and O4 Mini. These models mark the first time reasoning models can effectively use tools, overcoming a major limitation of previous models. With native multimodal reasoning capabilities, enhanced tool usage, and improved performance in coding, math, science, and visual perception, these models set a new standard in AI performance. The video also introduces OpenAI's Codex CLI for terminal-based reasoning and discusses the significant performance improvements and cost optimizations compared to the previous O1 models. Stay tuned for detailed benchmarks, real-world tests, and comparisons with competitors' models.

00:00 Introduction to OpenAI's New Models
00:39 Tool Usage and Multimodal Capabilities
01:06 Model Replacements and Enhancements
01:57 Performance Benchmarks and Real-World Applications
03:40 Cost Efficiency and Usage Limits
05:23 Instruction Following and Coding Performance
10:43 Reinforcement Learning and Tool Integration
15:14 Pricing and Codex CLI
16:37 Conclusion and Future Expectations
