Unlocking the Full Potential of GPUs for AI Workloads on Kubernetes - Kevin Klues, NVIDIA
Dynamic Resource Allocation (DRA) is a new Kubernetes feature that puts resource scheduling in the hands of third-party developers. It moves away from the limited "countable" interface for requesting access to resources (e.g. "nvidia.com/gpu: 2"), providing an API more akin to that of persistent volumes. In the context of GPUs, this unlocks a host of new features without the need for awkward solutions shoehorned on top of the existing device plugin API. These features include:

* Controlled GPU sharing (both within a pod and across pods)
* Multiple GPU models per node (e.g. T4 and A100)
* Specifying arbitrary constraints for a GPU (min/max memory, device model, etc.)
* Dynamic allocation of Multi-Instance GPUs (MIG)
* ... the list goes on ...

In this talk, you will learn about the DRA resource driver we have built for GPUs. We walk through each of the features it provides and conclude with a series of demos showing how you can get started using it today.
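To make the contrast between a countable request and a constraint-based one concrete, here is a minimal Python sketch. It is not the DRA API and not the NVIDIA resource driver: `Gpu`, `GpuRequest`, `allocate_by_count`, and `allocate_by_constraints` are hypothetical names invented for illustration, and the example node (one T4 and one A100 with illustrative memory sizes) is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Gpu:
    """Hypothetical description of one GPU on a node."""
    name: str
    model: str
    memory_gib: int


@dataclass(frozen=True)
class GpuRequest:
    """Hypothetical constraint-based request (not a real Kubernetes object)."""
    model: Optional[str] = None
    min_memory_gib: int = 0
    max_memory_gib: Optional[int] = None

    def matches(self, gpu: Gpu) -> bool:
        # A GPU matches only if it satisfies every constraint that was set.
        if self.model is not None and gpu.model != self.model:
            return False
        if gpu.memory_gib < self.min_memory_gib:
            return False
        if self.max_memory_gib is not None and gpu.memory_gib > self.max_memory_gib:
            return False
        return True


def allocate_by_count(free: List[Gpu], count: int) -> Optional[List[Gpu]]:
    """Countable style: the caller can only say how many devices, not which kind."""
    if len(free) < count:
        return None
    return free[:count]


def allocate_by_constraints(
    free: List[Gpu], requests: List[GpuRequest]
) -> Optional[List[Gpu]]:
    """Constraint style: pick one device per request using a simple first fit.

    First fit is a simplification; it can fail on inputs where a smarter
    assignment would succeed.
    """
    remaining = list(free)
    chosen: List[Gpu] = []
    for request in requests:
        match = next((gpu for gpu in remaining if request.matches(gpu)), None)
        if match is None:
            return None
        remaining.remove(match)
        chosen.append(match)
    return chosen


# Assumed example node with two different GPU models.
NODE = [
    Gpu(name="gpu-0", model="T4", memory_gib=16),
    Gpu(name="gpu-1", model="A100", memory_gib=40),
]
```

With a count-only request, `allocate_by_count(NODE, 1)` simply returns the first free device, whichever model it happens to be. A request such as `GpuRequest(min_memory_gib=32)` lets the caller state what it actually needs, so only the larger device qualifies.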