Beyond Evals - Build Environments That Make Agents Better: Jay Ram, Founder & CEO, hud.ai

Author: Fondo

Uploaded: 2025-11-18

Views: 25

Description:

Jay Ram is Founder & CEO of Hud, the evaluation and RL platform for AI agents. Hud helps startups build RL environments, run fast reward loops, and plug into any RL backend—so teams can cut costs and push last-mile accuracy once they've hit PMF. Before Hud, Jay left a lucrative quant career, shipped an AI prank-calling app that briefly hit #1 on the App Store (≈500k calls), and decided he wanted harder problems and smarter customers. He's a YC W25 alum; Hud is already used by researchers at foundation labs and is expanding into enterprise environments.

Jay's catalyst was realizing he didn't want to just talk weekends—he wanted to build. He and his co-founders first tackled computer-use evals for labs. Inside that work, the language shifted: labs asking for "evals" really needed environments—places where you design rewards, iterate, and actually improve model behavior. Today, Jay frames Hud as the "Next.js of RL environments": opinionated lifecycle, backend-agnostic training, and infra that returns signal fast. Early on, use a foundation model; post-PMF, train your own with SFT/RL—that's where environments matter. Looking ahead, he sees post-training speciation: domain-tuned models for finance, accounting, creative tooling, and more—because teams will own more of their stack again.

Key Topics Covered:

· What Hud is: tools to set up your agent for RL, define tasks, shape rewards, and plug into RFT/other RL backends.
· From evals to environments: why scores measure but rewards improve—and how iteration loops change outcomes.
· Where it fits: use foundation models early; post-PMF train your own for cost leverage + last-mile gains.
· Design + infra: a new category needs opinionated UX and fast results; why lab researchers use Hud for computer-use evals.
· Market timing: the "DeepSeek moment" pulled RL from hobbyists into enterprise interest in 2025.
· Pre-train vs post-train: scale vs accuracy + domain depth—and why post-training is the real edge.
· Future of work: enterprises will own more of the stack; model speciation by domain.
· Reality check: agents ace toy DBs, struggle in production; modeling real environments is the unlock.
· YC W25 arc: the vision matched the original application more than the mid-batch direction; enterprise demand is catching up now.
· Finance stack aside: keep ops boring; focus cycles on shipping product.

Chapters:

(00:15) Cold open — "We give you all the tools to set up your agent for RL."
(00:59) Intro — Jay Ram, Hud, and the origin story
(01:41) What Hud does — build RL environments; backend-agnostic (OpenAI RFT, Thinking Machines, etc.)
(02:12) Where environments fit — early: foundation models; post-PMF: train for cost + accuracy
(02:50) From quant to builder — leaving Wall Street to make things
(03:30) The prank-calling app — #1 on App Store; ≈500k calls; why the customers weren't it
(04:40) Evals → environments — labs' "eval" asks were really RL environments with rewards
(05:40) Evals vs RL — scores vs rewarded steps; how updates happen
(07:14) Hard parts — opinionated design + infra speed for researchers and teams
(08:08) Before Hud — no toolkit/standards; emerging gymnasium-style efforts vs Hud's opinionated path
(09:25) YC W25 — applying, partners (Aaron & Matt), why YC felt like "actual college"
(11:05) Vision vs timing — market caught up; enterprises now exploring environments
(12:20) Trend — teams rolling their own models post-PMF (SFT/RL)
(13:01) Today's fragmented stack — hosting, inference, data; Hud's role in the loop
(13:48) The "DeepSeek moment" — hobbyist RL → enterprise interest in 2025
(15:57) Future of agents — own the stack, post-training speciation
(18:26) Why end-to-end is hard — production data systems need real environments
(19:29) Forward-deployed labs — domain hires and environments; how Hud plugs into RFT
(20:15) Rapid wrap — it's early; the stack is shifting fast

Where to find Jay Ram:
X: @jayendra_ram
LinkedIn: www.linkedin.com/in/jay-ram-29003b198/

Where to find Hud:
X: @hud_evals
Website: hud.ai

Where to find David Phillips:
X: @davj
LinkedIn: linkedin.com/in/davjphillips

Brought to you by:
Fondo — All-in-one accounting for startups: fondo.com

