How Runhouse Orchestrates Multi-Cluster Ray Workloads | Ray Summit 2025
Author: Anyscale
Uploaded: 2025-12-01
Views: 88
At Ray Summit 2025, Donny Greenberg from Runhouse shares how Kubetorch introduces a next-generation, Ray-inspired distributed programming paradigm for Kubernetes—enabling teams to build complex AI workloads “serverlessly” in Python, without drowning in YAML or manually managing Ray clusters.
He begins by outlining a trend across modern ML infrastructure: Ray has become a first-class distributed computing primitive, with teams composing multi-stage inference, training, and reinforcement learning workloads on Kubernetes alongside other compute backends. But this shift has surfaced new challenges—rapid debugging, fluent programmatic orchestration, fault-tolerant workflows, and the growing expectation that Ray clusters should be ephemeral and created per task, à la serverless computing.
Kubetorch aims to fill this gap.
Donny introduces Kubetorch as a programming model that extends Ray’s familiar Task and Actor abstractions to Kubernetes-native resources. In Kubetorch:
- An Actor represents not just a process, but a full Kubernetes resource, including a KubeRay RayCluster.
- Entire Ray programs, services, and pipelines can be composed as higher-order workflows directly in Python.
- Teams get a dramatically improved developer experience with fast iteration, fault tolerance, and minimal operational overhead.
- Workloads scale elastically and portably across Kubernetes environments.
- Incremental adoption is natural: Kubetorch can wrap existing Ray workloads while offering a smoother serverless-like experience.
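To make the idea of per-task, ephemeral clusters composed in plain Python more concrete, here is a minimal conceptual sketch. It does not use Kubetorch, Ray, or KubeRay, and it is not their API: `FakeCluster`, `ephemeral_cluster`, `preprocess`, `train`, and `pipeline` are all hypothetical names invented for illustration, and "provisioning" is simulated by flipping a flag. It only shows the shape of the pattern described above: a handle that stands for a whole resource, created for one unit of work and always torn down afterwards.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field


# Hypothetical stand-in for a Kubernetes-backed resource such as a RayCluster.
# Nothing here is Kubetorch's or KubeRay's real API.
@dataclass
class FakeCluster:
    name: str
    workers: int
    alive: bool = False
    log: list = field(default_factory=list)

    def run(self, fn, *args):
        """Run a function 'on' the cluster; fails if the cluster is down."""
        if not self.alive:
            raise RuntimeError(f"cluster {self.name} is not running")
        self.log.append(fn.__name__)
        return fn(*args)


@contextmanager
def ephemeral_cluster(name, workers):
    """Create a cluster for one unit of work and always tear it down."""
    cluster = FakeCluster(name=name, workers=workers)
    cluster.alive = True  # stands in for provisioning
    try:
        yield cluster
    finally:
        cluster.alive = False  # stands in for teardown, even on failure


def preprocess(rows):
    return [r.strip().lower() for r in rows]


def train(rows):
    return {"examples": len(rows), "vocab": sorted(set(rows))}


def pipeline(raw_rows):
    # Each stage gets its own short-lived cluster, composed in plain Python.
    with ephemeral_cluster("prep", workers=2) as prep:
        cleaned = prep.run(preprocess, raw_rows)
    with ephemeral_cluster("train", workers=4) as trainer:
        model = trainer.run(train, cleaned)
    return model, prep, trainer
```

Calling `pipeline([" A ", "b", "a"])` runs both stages in order and leaves both simulated clusters shut down. In a real system the context manager's setup and teardown would correspond to creating and deleting cluster resources, which is where the "serverless" feel would come from.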
The session highlights how Kubetorch brings serverless Ray to life, offering instant cluster provisioning, ephemeral execution, and scalable workflow composition—all while staying grounded in the Ray programming model that ML practitioners already know.
Attendees will walk away with a clear picture of how Kubetorch simplifies distributed ML workflows, closes critical gaps in the Ray ecosystem, and makes Kubernetes-native AI infrastructure more developer-friendly than ever.