How to Scale Unstructured Data Processing with Ray Data | Ray Summit 2024
Author: Anyscale
Uploaded: 2024-10-18
Views: 784
At Ray Summit 2024, Hao Chen and Praveen Gorthy from Anyscale tackle the challenge of processing unstructured data at scale. As images, videos, and other unstructured data formats become more common, the associated data volumes grow rapidly, and traditional frameworks struggle to keep up. This talk introduces Ray Data on Anyscale as a solution to this problem.
The speakers explore Ray Data's streaming batch model and adaptive scheduling, demonstrating how these features efficiently handle the heterogeneous compute requirements of unstructured data workloads. They also highlight Anyscale's enhancements to Ray Data, including autoscaling, fault tolerance, and performance optimizations.
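To make the streaming batch idea concrete, here is a minimal plain-Python sketch. It is not Ray Data's API or implementation: the names `batched`, `apply_stage`, `decode`, `infer`, and `run_pipeline` are hypothetical, and the two stages are toy stand-ins for a CPU-bound preprocessing step and a GPU-bound inference step. It only illustrates the general pattern of passing fixed-size batches through chained stages lazily, so that only a few batches are held in memory at a time instead of the whole dataset.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List

Batch = List[int]


def batched(items: Iterable[int], batch_size: int) -> Iterator[Batch]:
    """Yield successive lists of at most batch_size items."""
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def apply_stage(batches: Iterable[Batch], fn: Callable[[Batch], Batch]) -> Iterator[Batch]:
    """Lazily apply fn to each batch as it arrives from the previous stage."""
    for batch in batches:
        yield fn(batch)


def decode(batch: Batch) -> Batch:
    # Hypothetical stand-in for a CPU-bound step (e.g. decoding images).
    return [x * 2 for x in batch]


def infer(batch: Batch) -> Batch:
    # Hypothetical stand-in for a GPU-bound step (e.g. model inference).
    return [x + 1 for x in batch]


def run_pipeline(source: Iterable[int], batch_size: int) -> List[int]:
    """Chain the stages; each batch flows through both before the next is read."""
    stages = apply_stage(apply_stage(batched(source, batch_size), decode), infer)
    return [value for batch in stages for value in batch]


results = run_pipeline(range(10), batch_size=4)
```

Because the stages are generators, nothing is computed until `run_pipeline` consumes the final stage. A real system such as the one described in the talk would additionally run stages in parallel on different resource types, which this single-process sketch does not attempt.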
A key feature of this presentation is a live demo, showcasing the development and scaling of an unstructured data processing workflow using Ray Data on Anyscale. Attendees will see firsthand how Anyscale's observability tools provide real-time insights into workload performance and resource utilization, enabling on-the-fly pipeline optimization.
This session is invaluable for data scientists, engineers, and organizations grappling with large-scale unstructured data processing, offering practical solutions to improve performance and cost-efficiency in their data pipelines.
--
Interested in more?
Watch the full Day 1 Keynote: • Ray Summit 2024 Keynote Day 1 | Where Buil...
Watch the full Day 2 Keynote: • Ray Summit 2024 Keynote Day 2 | Where Buil...
--
🔗 Connect with us:
Subscribe to our YouTube channel: / @anyscale
Twitter: https://x.com/anyscalecompute
LinkedIn: / joinanyscale
Website: https://www.anyscale.com