How Runhouse Orchestrates Multi-Cluster Ray Workloads | Ray Summit 2025

Author: Anyscale

Uploaded: 2025-12-01

Views: 88

Description:

At Ray Summit 2025, Donny Greenberg from Runhouse shares how Kubetorch introduces a next-generation, Ray-inspired distributed programming paradigm for Kubernetes—enabling teams to build complex AI workloads “serverlessly” in Python, without drowning in YAML or manually managing Ray clusters.

He begins by outlining a trend across modern ML infrastructure: Ray has become a first-class distributed computing primitive, with teams composing multi-stage inference, training, and reinforcement learning workloads on Kubernetes alongside other compute backends. But this shift has surfaced new challenges—rapid debugging, fluent programmatic orchestration, fault-tolerant workflows, and the growing expectation that Ray clusters should be ephemeral and created per task, à la serverless computing.

Kubetorch aims to fill this gap.

Donny introduces Kubetorch as a programming model that extends Ray’s familiar Task and Actor abstractions to Kubernetes-native resources. In Kubetorch:

An Actor represents not just a process, but a full Kubernetes resource—including a KubeRay RayCluster.

Entire Ray programs, services, and pipelines can be composed as higher-order workflows directly in Python.

Teams get a dramatically improved developer experience with fast iteration, fault tolerance, and minimal operational overhead.

Workloads scale elastically and portably across Kubernetes environments.

Incremental adoption is natural—Kubetorch can wrap existing Ray workloads while offering a smoother serverless-like experience.
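
The idea of clusters that are created per task and torn down afterwards can be illustrated with a small sketch. This is not Kubetorch's or Ray's API: `ToyCluster`, `ephemeral_cluster`, and `square` are hypothetical names, and the "cluster" is an in-memory stand-in that runs work locally. It only shows the lifecycle pattern, assuming a context manager is an acceptable way to express "provision, run, tear down".

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class ToyCluster:
    """Hypothetical in-memory stand-in for a per-task cluster (not a Kubetorch or Ray class)."""
    name: str
    workers: int
    alive: bool = True

    def map(self, fn, items):
        # A real cluster would distribute this work; the toy runs it locally.
        if not self.alive:
            raise RuntimeError(f"cluster {self.name!r} has been torn down")
        return [fn(x) for x in items]


@contextmanager
def ephemeral_cluster(name, workers):
    """Create a cluster for one task and always tear it down afterwards."""
    cluster = ToyCluster(name=name, workers=workers)
    try:
        yield cluster
    finally:
        cluster.alive = False  # teardown runs even if the task raised


def square(x):
    return x * x


# The cluster exists only for the duration of this block.
with ephemeral_cluster("preprocess", workers=2) as cluster:
    results = cluster.map(square, [1, 2, 3])

print(results)
```

Once the `with` block exits, the toy cluster is marked as torn down and further calls to `map` raise an error, which mirrors the per-task lifetime described in the talk summary.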

The session highlights how Kubetorch brings serverless Ray to life, offering instant cluster provisioning, ephemeral execution, and scalable workflow composition—all while staying grounded in the Ray programming model that ML practitioners already know.

Attendees will walk away with a clear picture of how Kubetorch simplifies distributed ML workflows, closes critical gaps in the Ray ecosystem, and makes Kubernetes-native AI infrastructure more developer-friendly than ever.

