Apache Spark RDD Tutorial: Master RDD & Core Concepts | Data Engineering

Author: itversity

Uploaded: 2025-03-11

Views: 1427

Description:

In this video, we'll dive deep into Apache Spark RDDs (Resilient Distributed Datasets) and equip you with the skills to leverage them for efficient big data processing.

What You'll Learn in this Video:
What is Apache Spark and why is it used for Big Data?
What are RDDs (Resilient Distributed Datasets) in Spark and how do they work?
How do I set up a free Spark environment using Databricks Community Edition?
How can I implement parallel programming in Spark using RDDs?
How do I create, transform, and process data with Spark RDDs?
How do I implement key RDD transformations like map, filter, flatMap, reduceByKey, and sortBy?
What are the differences between narrow and wide transformations in Spark?
What is shuffling in Spark and why is it important?
How can I create a word count program using Spark RDDs in Python?
What are DAGs (Directed Acyclic Graphs) and lazy evaluation in Spark?
How can I monitor and troubleshoot Spark applications using the Spark UI and driver logs?
How can I save the output of a Spark application?
How can I improve my understanding of RDDs?
What are the differences between RDDs, DataFrames, and Datasets?
What are transformations and actions on RDDs?
How do narrow transformations and wide transformations work?
What are aggregate functions?
How do I use the split function on a string?
How do I create a list of rules?
What is the difference between a file and an RDD?
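
The word count question above can be sketched without a cluster. The snippet below is a plain-Python model of the flatMap, map, reduceByKey, and sortBy steps, not Spark code and not taken from the video's notebooks; the sample lines and variable names are made up for illustration. The comments name the PySpark RDD method each step stands in for.

```python
from collections import defaultdict

# Hypothetical sample input; in Spark this would be an RDD of text lines.
lines = ["spark makes big data simple", "big data needs spark"]

# Models rdd.flatMap(lambda line: line.split(" ")): one line yields many words.
words = [word for line in lines for word in line.split(" ")]

# Models rdd.map(lambda word: (word, 1)): one output record per input record.
pairs = [(word, 1) for word in words]

# Models rdd.reduceByKey(lambda a, b: a + b): values are summed per key.
counts = defaultdict(int)
for word, n in pairs:
    counts[word] += n

# Models rdd.sortBy(...): highest count first, ties broken alphabetically.
word_counts = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

print(word_counts)
# [('big', 2), ('data', 2), ('spark', 2), ('makes', 1), ('needs', 1), ('simple', 1)]
```

In PySpark the same chain would run on an RDD, and nothing would execute until an action such as collect() is called.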

Timestamps:
0:00:05 - Apache Spark Full Course Intro: RDDs & Getting Started
0:02:45 - What is Apache Spark? Features and Use Cases
0:04:15 - Distributed Computing Explained: Spark vs. Single Machine
0:07:55 - Exploring Data Sets provided by Databricks
0:11:33 - Python Collections for Spark: List, Tuple, Dict and Set
0:20:09 - Creating Spark RDDs from Python Collections
0:26:49 - Spark Data Structures: RDDs vs. DataFrames vs. DataSets
0:37:19 - Connecting to a Spark Cluster
0:40:57 - Spark RDDs: Actions and Transformations Explained
0:57:20 - Filter Transformation: How to Filter the Data
1:01:14 - Map Transformations
1:06:53 - FlatMap Transformation
1:32:56 - Reduce By Key Transformations and Sorting
1:33:08 - What is Shuffling in Spark? Understanding Wide Transformations
1:47:04 - Spark: Sort Data in Ascending and Descending Order
1:54:30 - Saving Processed Data to a File
2:04:50 - Putting It All Together
2:14:52 - Monitoring Spark Jobs Using Databricks and the Spark UI
2:27:00 - Lazy Evaluation: What are DAGs and Their Use?
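
The lazy evaluation topic that closes the timeline can be modelled with Python generators. This is only an analogy, assuming a hypothetical traced_map helper that records each time a step runs; Spark builds a DAG of transformations rather than chaining generators, but the visible behaviour is similar: defining transformations does no work, and the action triggers the whole chain.

```python
# Records the name of a step every time it processes one element.
log = []

def traced_map(fn, items, name):
    """Hypothetical helper: a lazy map that logs each element it processes."""
    for item in items:
        log.append(name)
        yield fn(item)

numbers = [1, 2, 3, 4]

# "Transformations": these lines only describe the work, nothing runs yet.
doubled = traced_map(lambda x: x * 2, numbers, "double")
plus_one = traced_map(lambda x: x + 1, doubled, "plus_one")
steps_before_action = len(log)

# "Action": consuming the result forces both steps to run for every element.
result = list(plus_one)
steps_after_action = len(log)

print(steps_before_action, result, steps_after_action)
# 0 [3, 5, 7, 9] 8
```

The count of 0 before the action is the point of the sketch: an RDD transformation such as map returns immediately, and the work happens only when an action such as collect() or count() is called.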

By the end of this tutorial, you’ll be able to:
Design and implement RDD-based workflows for large-scale data processing.
Optimize Spark jobs by understanding shuffling, partitioning, and lazy evaluation.
Confidently debug and analyze jobs using Spark UI and logs.
Apply these skills to real-world problems like log analysis, ETL, and aggregations.
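
To make the shuffling and partitioning goal above more concrete, here is a small plain-Python model of a narrow step followed by a wide step. It is a sketch under stated assumptions, not Spark's implementation: the two-partition data, the partition_for function, and the moved counter are all hypothetical, chosen so the result is deterministic.

```python
from collections import defaultdict

# Hypothetical dataset already split into two partitions of (key, value) pairs.
partitions = [
    [("a", 1), ("b", 1)],
    [("a", 1), ("c", 1)],
]

# Narrow transformation: each output partition needs only its own input partition.
scaled = [[(key, value * 10) for key, value in part] for part in partitions]

def partition_for(key, num_partitions):
    """Hypothetical deterministic partitioner (Spark uses its own hash partitioner)."""
    return sum(ord(ch) for ch in key) % num_partitions

def shuffle(parts, num_partitions):
    """Route every record to the partition that owns its key; count records that move."""
    out = [[] for _ in range(num_partitions)]
    moved = 0
    for source, part in enumerate(parts):
        for key, value in part:
            target = partition_for(key, num_partitions)
            if target != source:
                moved += 1
            out[target].append((key, value))
    return out, moved

# Wide transformation: records with the same key must meet in one partition first.
shuffled, moved = shuffle(scaled, 2)

# After the shuffle, each partition can sum its own keys independently.
reduced = []
for part in shuffled:
    totals = defaultdict(int)
    for key, value in part:
        totals[key] += value
    reduced.append(dict(totals))

print(moved, reduced)
# 1 [{'b': 10}, {'a': 20, 'c': 10}]
```

The moved counter stands in for the network and disk traffic a real shuffle causes, which is why wide transformations such as reduceByKey and sortBy tend to dominate job cost.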

Watch this video to learn how to create a cluster and get started with a free Databricks Community Edition account. It is perfect for practicing everything demonstrated in this tutorial: • Getting Started with Spark using Databrick...

📂 Resources
GitHub Repo for Notebooks: https://github.com/itversity/apache-s...

See the following link to learn how to upload notebooks to a Databricks cluster:
https://docs.databricks.com/aws/en/no...

To master Apache Spark with a complete course that includes 24/7 support, real-life case studies, and hands-on assignments, check out our Udemy course here:
https://www.udemy.com/course/apache-s...

If you're an absolute beginner and want to learn Python from scratch, this is the perfect place to start!
https://www.udemy.com/course/python-f...

Who Should Watch:
This tutorial is perfect for data engineers, developers, data scientists, students, and anyone interested in mastering Spark RDDs for efficient big data processing.

Continue Your Learning Journey here:
Full Playlist:    • Apache Spark for Beginners: Full Course Us...  
Previous Video:    • Getting Started with Spark using Databrick...  
Next Video:    • PySpark DataFrames Tutorial: ETL in Databr...  

Don’t forget to Like, Comment, and Subscribe for more content on Apache Spark, Big Data, and Data Engineering! 🚀

Connect with Us:
Newsletter: http://notifyme.itversity.com
LinkedIn:   / itversity  
Facebook:   / itversity  
Twitter:   / itversity  
Instagram:   / itversity  

Join this channel to get access to perks:
   / @itversity  

#dataengineering #cloudcomputing #ApacheSpark #python #tutorial #bigdata
