End-to-End Data Pipeline with Airflow, dbt, Cosmos, GCS, BigQuery & more
Author: Data Pipeline Lab
Uploaded: 2025-02-18
Views: 3643
Welcome back, everyone! In today’s video, I’m excited to walk you through building an end-to-end cloud data pipeline using modern data engineering tools. You'll see how to orchestrate and automate your data workflows with Apache Airflow, integrate dbt seamlessly using Cosmos, and leverage Google Cloud Storage for raw data along with BigQuery as your data warehouse.
What You’ll Learn:
Airflow Orchestration: Set up and schedule Airflow DAGs to manage complex workflows, including data generation, creating external tables in BigQuery, and running dbt transformations (a minimal DAG sketch follows this list).
Cosmos Integration: Use Cosmos to integrate dbt jobs into Airflow, dynamically rendering a task per dbt model so each run and test is individually visible and retryable (see the Cosmos sketch below).
dbt Transformations: Run dbt tests alongside your transformations to validate the raw data exposed through external tables before it lands in your curated BigQuery models.
GCS & BigQuery Setup: Configure Google Cloud Storage for raw CSV, JSON, and Parquet files, and create external tables in BigQuery that query those files in place (see the external-table sketch below).
Environment Switching: Learn how to switch between development and production environments using configuration files and environment variables (a short example closes the sketches below).
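
To make the orchestration step concrete, here is a minimal sketch of the kind of DAG covered in the video. The dag_id, schedule, and task bodies are my own illustrative assumptions, not the code from the repo:

```python
# Minimal Airflow DAG sketch: generate raw data, then register it in BigQuery.
# All names here (dag_id, task logic) are hypothetical placeholders.
from datetime import datetime

from airflow.decorators import dag, task


@dag(
    dag_id="healthcare_pipeline",      # hypothetical name
    schedule="@daily",                 # run once per day
    start_date=datetime(2025, 1, 1),
    catchup=False,
)
def healthcare_pipeline():
    @task
    def generate_data():
        # Stand-in for the step that writes raw CSV/JSON/Parquet files to GCS.
        ...

    @task
    def create_external_tables():
        # Stand-in for the step that points BigQuery external tables at GCS.
        ...

    generate_data() >> create_external_tables()


healthcare_pipeline()
```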
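For the Cosmos piece, the sketch below shows how a dbt project can be mounted as an Airflow task group; Cosmos renders one run task (and, by default, a following test task) per dbt model. The project path and profile names are assumptions:

```python
# Sketch of embedding a dbt project in a DAG via Cosmos; paths and profile
# names are placeholders and must match your own dbt project and profiles.yml.
from cosmos import DbtTaskGroup, ProfileConfig, ProjectConfig

dbt_transformations = DbtTaskGroup(
    group_id="dbt_transformations",
    project_config=ProjectConfig("/usr/local/airflow/dbt/healthcare"),
    profile_config=ProfileConfig(
        profile_name="healthcare",   # profile entry in profiles.yml
        target_name="dev",           # switchable per environment
        profiles_yml_filepath="/usr/local/airflow/dbt/healthcare/profiles.yml",
    ),
)
```

Instantiated inside the DAG body above, this group can be chained after create_external_tables() so transformations only run once the raw data is queryable.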
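The external-table setup can also be seen in isolation. Here is a sketch using the google-cloud-bigquery client; the project, dataset, bucket, and file layout are placeholders:

```python
# Sketch: register Parquet files sitting in GCS as a BigQuery external table,
# so queries read the files in place without a load job. Names are placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-gcp-project")

external_config = bigquery.ExternalConfig("PARQUET")  # or "CSV" / "NEWLINE_DELIMITED_JSON"
external_config.source_uris = ["gs://my-raw-bucket/patients/*.parquet"]

table = bigquery.Table("my-gcp-project.raw_healthcare.patients")
table.external_data_configuration = external_config
client.create_table(table, exists_ok=True)            # idempotent create
```

For CSV or JSON sources you would typically also set external_config.autodetect = True or supply an explicit schema.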
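Finally, a small sketch of the dev/prod switching pattern; the variable name and settings are illustrative:

```python
# Sketch: pick per-environment settings from one config dict, keyed by an
# environment variable (e.g. export PIPELINE_ENV=prod). Values are placeholders.
import os

CONFIG = {
    "dev":  {"project": "my-project-dev",  "dataset": "raw_dev", "bucket": "raw-data-dev"},
    "prod": {"project": "my-project-prod", "dataset": "raw",     "bucket": "raw-data-prod"},
}

ENV = os.environ.get("PIPELINE_ENV", "dev")  # default to dev for local runs
settings = CONFIG[ENV]
```

The same ENV value can drive Cosmos's target_name above, so dbt and the Airflow tasks stay pointed at the same environment.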
If you enjoy this video and find it helpful, please consider giving it a thumbs up and subscribing to my channel. Your support really helps me create more valuable content for the data engineering community.
For a detailed step-by-step guide on setting up this healthcare data pipeline, check out the README on GitHub: https://github.com/ChuQuEmeka/Airflow...