🧪 Live Demo: Training LLMs with RFT in the Predibase SDK (Tool Use + Reward Functions Explained)
Author: Predibase
Uploaded: 2025-04-08
Views: 167
In this live walkthrough, we show you exactly how to train an LLM using Reinforcement Fine-Tuning (RFT) in the Predibase SDK—and how to monitor performance using our built-in observability tools.
You'll learn how to:
✅ Set up your dataset and prompts
✅ Define custom reward functions for correctness, format, and length
✅ Use the Predibase SDK to launch a fine-tuning job
✅ View reward graphs, logs, and completions in the RFT dashboard
✅ Update reward functions live during training — no restart needed!
We walk through a real-world function calling task using the Glaive dataset, where the model must select the correct tool based on a user prompt (e.g., get stock price, create calendar event).
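To make the task concrete, here is an illustrative shape for a single function-calling example (Python). The field names are assumptions for this sketch, not the exact Glaive schema:

# Illustrative example row; field names are assumed for this sketch only
example = {
    "prompt": "What's the current price of AAPL stock?",
    "tools": ["get_stock_price", "create_calendar_event"],  # tools the model may call
    "expected_tool": "get_stock_price",                     # ground-truth tool choice
    "expected_arguments": {"symbol": "AAPL"},               # ground-truth arguments
}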
🔍 Unlike traditional SFT, RFT lets you define flexible, dynamic rules (e.g., Think/Tool tags, argument parsing, completion length) and reward the model accordingly, even with minimal labeled data (rough sketches of such reward functions follow below).
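As a rough illustration, here is a minimal sketch of the three reward functions covered in the video (correctness, format, length). The (prompt, completion, example) signature, the <think>/<tool> tag names, and the expected_tool field are assumptions for this sketch, not the exact Predibase interface; the linked notebook has the real definitions.

import re

def correctness_reward(prompt, completion, example):
    # 1.0 if the completion calls the expected tool, else 0.0 (assumed example schema)
    match = re.search(r"<tool>(.*?)</tool>", completion, re.DOTALL)
    if not match:
        return 0.0
    return 1.0 if example["expected_tool"] in match.group(1) else 0.0

def format_reward(prompt, completion, example):
    # Partial credit for emitting the <think>...</think> and <tool>...</tool> structure
    score = 0.0
    if re.search(r"<think>.*?</think>", completion, re.DOTALL):
        score += 0.5
    if re.search(r"<tool>.*?</tool>", completion, re.DOTALL):
        score += 0.5
    return score

def length_reward(prompt, completion, example, max_words=200):
    # Sliding scale: full reward for terse completions, tapering to 0 at max_words
    n_words = len(completion.split())
    return max(0.0, 1.0 - n_words / max_words)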
This is a must-watch if you're:
💡 Building agentic systems
💻 Customizing open-source LLMs
⚙️ Designing robust inference + training stacks
📉 Looking to reduce data labeling costs
👉 Try Reinforcement Fine-Tuning on your own task: https://pbase.ai/4brbC8u
🔗 Watch the full video and get a notebook link: • 🔥 Live Demo: Reinforcement Fine-Tuning for...
🔔 SUBSCRIBE for the latest on LLM fine-tuning, AI scaling, and reinforcement learning hacks!
👉 @predibase
👉 Schedule a live demo: https://pbase.ai/41FZKfy
👉 Learn more: https://pbase.ai/Intro-RFT-platform
00:00 - Intro: Setting up RFT in the Predibase SDK
01:15 - Loading the Glaive function calling dataset
02:05 - Prompt and tool call structure explained
03:00 - What makes a reward function in Predibase
04:30 - Writing the correctness reward function (Python)
06:15 - Writing the formatting reward function
07:35 - Adding a completion length constraint
08:50 - Launching the RFT job via the SDK (launch sketch after this chapter list)
10:00 - Defining GRPO config and training parameters
11:45 - Assigning and packaging reward functions
12:30 - Job launched! Switching to the UI
13:15 - Exploring the Reward Functions tab
14:25 - Viewing Reward Graphs and interpreting metrics
15:35 - Using logs to debug your reward functions
16:40 - Completions Viewer: Compare model generations by epoch
18:10 - Updating a reward function live during training
19:25 - Adding a more flexible length function (sliding scale)
20:30 - Pushing live updates to running RFT job
21:10 - Summary: Why Predibase RFT simplifies LLM fine-tuning
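To make the 08:50–11:45 chapters concrete, here is a hedged launch sketch in Python. Predibase(api_token=...) and pb.adapters.create(...) are the SDK entry points used in the walkthrough; the config contents, dataset name, and repo name below are placeholders and assumptions, so copy the exact GRPO config and reward-function packaging from the linked notebook.

from predibase import Predibase

pb = Predibase(api_token="<YOUR_API_TOKEN>")

adapter = pb.adapters.create(
    config={                                   # assumed dict form of the GRPO config
        "base_model": "<base model name>",     # placeholder
        "reward_fns": ["correctness_reward",   # assumed: the reward functions
                       "format_reward",        # sketched earlier in this description
                       "length_reward"],
    },
    dataset="<your uploaded function-calling dataset>",  # placeholder
    repo="<adapter repo name>",                          # placeholder
)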
#predibase #rft #llmtraining #functioncalling #reinforcementfinetuning #ai #machinelearning #mlengineering #rlhf #opensourcellms #customllm #fewshotlearning #agenticai #pythonai #AIDevTools #llmops #observability #RewardFunctions #finetuning #dataefficiency #aiinfrastructure
