Gen AI on Intel Arc GPUs - Building a Dual Arc B580 LLM Inference Server! (24 GB VRAM!)
Author: YourAvgDev
Uploaded: 2025-12-28
Views: 6
How do you get 24 GB of VRAM on a budget? Let's try two Intel Arc B580s!
I'm starting a video series where I optimize an LLM inference server running on two Intel Arc B580s with 12 GB of VRAM each. It's a cost-effective and efficient way to get 24 GB of VRAM total while still serving models like gpt-oss-20b at up to 83 tokens/s!
We'll be using vLLM for XPU to achieve this. vLLM for XPU is tricky to set up, so in the next video I'll walk you through, step by step, how to get it running natively without Docker, so you can always stay on the latest vLLM and run the latest models locally.
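To make the end goal concrete, here's a minimal sketch of loading a model across both cards with vLLM's Python API. It assumes a working vLLM XPU build is already installed; the model ID is the gpt-oss-20b Hugging Face repo, and the parameter values are illustrative, not the exact settings from the series.

```python
# Minimal sketch: serve a model across two Arc B580s with vLLM.
# Assumes a working vLLM XPU build; values are illustrative.
from vllm import LLM, SamplingParams

llm = LLM(
    model="openai/gpt-oss-20b",      # model mentioned in the video
    tensor_parallel_size=2,          # split weights across both B580s
    gpu_memory_utilization=0.90,     # leave headroom in each 12 GB card
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain tensor parallelism in one sentence."], params)
print(outputs[0].outputs[0].text)
```

The key setting is tensor parallelism: each layer's weights are sharded across the two GPUs, which is how a 20B-class model can run on 2× 12 GB instead of needing one large card.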
Specs of the system:
AMD Ryzen 9 9900X 12c/24t
64 GB DDR5-5600 RAM
1 TB PNY NVMe SSD
(2) Intel Arc B580 12 GB VRAM Battlemage GPUs
Motherboard: MSI MAG Tomahawk X870E