Boost Your Apache Spark Performance with These Optimization Techniques
Author: NextGenLakehouse
Uploaded: 2024-02-23
Views: 2101
Chapters
• 0:00 - Introduction to the Databricks Optimization Guide
• 1:00 - Overview of Guide Sections & Topics
• 2:30 - Deep Dive: Spark Workload Optimization
• 3:00 - Understanding & Optimizing Data Shuffling
• 3:15 - What is Shuffling & Its Impact
• 4:00 - Avoiding Shuffles with Broadcast Joins
• 5:45 - Broadcast Caveats & Configuration Tuning
• 8:30 - Other Shuffle Optimizations & ANALYZE TABLE
• 9:30 - Addressing Data Spilling
• 9:45 - What is Data Spilling & Its Impact
• 10:30 - Solutions: AQE & Optimizing Shuffle Partitions
• 13:00 - Common Misconception:
• 13:45 - Handling Data Skewness
• 14:00 - What is Data Skew & Its Consequences
• 14:45 - Identifying Data Skew in Spark UI
• 16:00 - Strategies for Fixing Data Skew
Documentation
Ebook optimization guide: https://www.databricks.com/discover/p...
Subscribe to the newsletter: https://nextgenlakehouse.substack.com/