How Coinbase Uses Ray, vLLM & LiteLLM to Power Secure LLM Services | Ray Summit 2025
Author: Anyscale
Uploaded: 2025-12-08
Views: 559
At Ray Summit 2025, Wenyue Liu and Akshit Trehan from Coinbase share how the Coinbase Machine Learning Platform (MLP) team built trusted, production-grade LLM services using Ray, vLLM, and LiteLLM. These services run in one of the world’s most security-sensitive environments and support Coinbase’s mission to remain the most trusted crypto exchange.
They begin by outlining the unique challenges of building LLM infrastructure inside a financial institution, where trust, security, and reliability are non-negotiable. To meet these requirements, Coinbase engineered an LLM serving stack that seamlessly integrates:
Ray for distributed orchestration and scaling
vLLM for high-throughput, low-latency inference
LiteLLM for routing, abstraction, and multi-provider reliability
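As a rough illustration of the "multi-provider reliability" role listed above, here is a minimal sketch of ordered fallback routing in plain Python. It does not use LiteLLM's API and is not Coinbase's implementation; the backend names, `ProviderError`, and `complete_with_fallback` are all hypothetical, and the backends are stand-in functions rather than real model calls.

```python
from typing import Callable, List, Sequence, Tuple


class ProviderError(Exception):
    """Raised by a backend that cannot serve the request (hypothetical)."""


Backend = Tuple[str, Callable[[str], str]]


def complete_with_fallback(prompt: str, backends: Sequence[Backend]) -> Tuple[str, str]:
    """Try each backend in order; return (backend_name, completion) from the first that succeeds."""
    errors: List[str] = []
    for name, call in backends:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Record the failure and move on to the next backend in the list.
            errors.append(f"{name}: {exc}")
    raise ProviderError("all backends failed: " + "; ".join(errors))


# Stand-in backends for the demo (assumptions, not real services).
def self_hosted(prompt: str) -> str:
    raise ProviderError("replica unavailable")


def external_provider(prompt: str) -> str:
    return f"echo: {prompt}"


served_by, text = complete_with_fallback(
    "hello",
    [("self-hosted", self_hosted), ("external", external_provider)],
)
```

In this demo the first backend always fails, so the request is served by the second one. A production router would typically also handle timeouts, retries, and load balancing, which this sketch omits.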
The speakers then take a deep dive into the technical architecture behind Coinbase’s internal LLM services, including:
User authentication and authorization patterns tailored for secure LLM access
Service-to-service (s2s) trust models that allow safe and auditable communication between internal systems
LiteLLM distribution strategies to balance throughput, reliability, and fallback behavior
How vLLM and Ray work together to power scalable, production-grade LLM serving APIs
Systems built to support high-volume internal LLM traffic, ensuring consistent performance under load
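The description does not say how the service-to-service trust model is implemented. As one possible reading of "safe and auditable communication", the sketch below checks an HMAC-signed request against a per-service key and a model allowlist, and records every decision. Everything here is an assumption for illustration: the service names, keys, model names, and the `sign` and `authorize` helpers are hypothetical and are not taken from the talk.

```python
import hashlib
import hmac
from typing import Dict, List, Set, Tuple

# Hypothetical per-service shared keys and model allowlists (demo values only).
SERVICE_KEYS: Dict[str, bytes] = {"fraud-detector": b"demo-key-1"}
ALLOWED_MODELS: Dict[str, Set[str]] = {"fraud-detector": {"internal-llm"}}

# In-memory audit trail of (service, model, allowed) decisions.
AUDIT_LOG: List[Tuple[str, str, bool]] = []


def sign(service: str, model: str, key: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over the service and model names."""
    message = f"{service}:{model}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def authorize(service: str, model: str, signature: str) -> bool:
    """Allow the call only if the signature verifies and the model is allowlisted."""
    key = SERVICE_KEYS.get(service)
    allowed = (
        key is not None
        # compare_digest avoids leaking information through comparison timing.
        and hmac.compare_digest(sign(service, model, key), signature)
        and model in ALLOWED_MODELS.get(service, set())
    )
    AUDIT_LOG.append((service, model, allowed))
    return allowed


good_sig = sign("fraud-detector", "internal-llm", b"demo-key-1")
ok = authorize("fraud-detector", "internal-llm", good_sig)
rejected = authorize("fraud-detector", "internal-llm", "bad-signature")
```

A real deployment would more likely rely on short-lived credentials issued by an identity system than on static shared keys; the static keys here only keep the example self-contained.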
The session tells the end-to-end story of how Coinbase uses Ray and vLLM to deliver trustworthy, secure, and efficient LLM services that meet the strict reliability requirements of a top global crypto exchange.