Train Your Own Reasoning Model (DeepSeek Clone) Fast & With Only 7Gb Of VRAM
Author: Machine Learning With Hamza
Uploaded: 2025-02-17
Views: 10898
Hello everyone, I hope you're doing well!
In this video, I show you how to fine-tune LLMs locally for reasoning tasks using the reinforcement learning algorithm GRPO (Group Relative Policy Optimization). You can run the fine-tuning on a GPU with at least 7 GB of VRAM using Unsloth, a fast fine-tuning Python library.
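As a rough illustration of the core idea behind GRPO (a sketch, not the code from the video or from Unsloth): several completions are sampled for the same prompt, each one is scored by a reward function, and each reward is normalized against the rest of its group. The `format_reward` function and the `<answer>` tag below are hypothetical, and the population standard deviation is an assumption, since implementations differ on this detail.

```python
from statistics import mean, pstdev


def format_reward(completion: str) -> float:
    """Hypothetical reward: 1.0 if the completion contains an <answer> tag, else 0.0."""
    return 1.0 if "<answer>" in completion else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the rewards of one group of completions sampled for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so completions
    that beat their group's average get a positive advantage.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; some libraries use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four made-up completions for a single prompt
completions = [
    "<think>2 + 2 is 4</think><answer>4</answer>",
    "The answer is 4",
    "I am not sure",
    "<answer>4</answer>",
]
rewards = [format_reward(c) for c in completions]
advantages = group_relative_advantages(rewards)
print(rewards)
print(advantages)
```

Completions that follow the expected format end up with a positive advantage and the others with a negative one. If every completion in a group gets the same reward, all advantages are zero and that group contributes no learning signal.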
Used material links:
Github Repo: https://github.com/Hmzbo/Fine-tune-LL...
Hugging face post: https://huggingface.co/learn/cookbook...
Unsloth notebooks: https://docs.unsloth.ai/get-started/u...
Let's connect:
LinkedIn: https://bit.ly/3roXgQ2
GitHub: https://bit.ly/3CrfRRP
Kaggle: https://bit.ly/3C1mqZD
Twitter: https://bit.ly/3UR06e3
--------------------------------------------------------------
♪ Song: Memories
Artist: Owl Nest
Music by: CreatorMix.com
Video: • Free Lofi Music For YouTube Videos No Copy...
--------------------------------------------------------------
If you have any questions, suggestions, or remarks, feel free to leave them in a comment below!
Until next time, stay safe!
#mlwh
00:00 Intro
01:02 Explaining GRPO
08:03 Environment Setup guidelines
10:20 Data, Model & Reward functions
17:57 Training
21:24 Training results
23:47 Testing