Bringing DuckDB to the Cloud: Dual Execution Explained

Author: MotherDuck

Uploaded: 2024-06-27

Views: 1117

Description:

Don't miss this special episode! Stephanie, a founding engineer at MotherDuck, will talk about what it takes to put a database in the cloud, specifically DuckDB. Mehdi and Steph will explain dual execution, showing how it works and what users need to know about running things in the cloud or locally.

--------------------------------------

Join MotherDuck founding engineer Stephanie for a deep dive into the architecture of MotherDuck, the serverless data warehouse built on DuckDB. This video demystifies what makes MotherDuck different from a self-hosted DuckDB instance by breaking down its three key components: the client layer (including DuckDB in WASM for browsers), the server-side compute layer, and the cloud storage layer. We explore how MotherDuck leverages the DuckDB extension system, a crucial design choice that avoids forking the open-source project and allows for rapid adoption of new DuckDB features. This approach extends DuckDB's parser, optimizer, and storage to create a robust, cloud-native experience for data analysts and developers.

Discover the power of dual execution, MotherDuck's hybrid query model that intelligently decides whether to run operations locally on your machine or remotely in the cloud. We provide a hands-on demonstration showing how to use `EXPLAIN` to analyze a query plan, revealing how a join between a local Parquet file and a remote cloud table is optimized. You'll learn how this unique DuckDB optimization logic minimizes data transfer and leverages your local compute for maximum efficiency. We'll also show you how to take control with the `md_run` parameter to force local or remote execution for specific scans, including Parquet, CSV, and Delta Lake files.
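As a rough illustration of the `md_run` control mentioned above, the Python sketch below only builds SQL text; it does not connect to DuckDB or MotherDuck. The spelling `MD_RUN = LOCAL | REMOTE | AUTO` as a named argument to a scan function is an assumption to check against the MotherDuck documentation, and the file path, table name, and join key are hypothetical placeholders.

```python
# Sketch only: builds SQL strings, never opens a database connection.
# Assumption: MD_RUN is passed as a named argument to the scan function
# and accepts LOCAL, REMOTE, or AUTO. Verify against the MotherDuck docs.
ALLOWED_MODES = {"LOCAL", "REMOTE", "AUTO"}


def scan_with_md_run(path: str, mode: str = "AUTO", reader: str = "read_parquet") -> str:
    """Return a scan expression that pins where the scan should run."""
    mode = mode.upper()
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported MD_RUN mode: {mode}")
    if not reader.isidentifier():
        raise ValueError(f"invalid reader function name: {reader}")
    escaped = path.replace("'", "''")  # escape single quotes for a SQL literal
    return f"{reader}('{escaped}', MD_RUN = {mode})"


def explain_join(local_path: str, remote_table: str, key: str) -> str:
    """Return an EXPLAIN statement joining a locally scanned file to a cloud table."""
    scan = scan_with_md_run(local_path, "LOCAL")
    return (
        f"EXPLAIN SELECT * FROM {scan} AS l "
        f"JOIN {remote_table} AS r ON l.{key} = r.{key}"
    )


if __name__ == "__main__":
    # Hypothetical names: data/orders.parquet, my_db.main.customers, customer_id
    print(explain_join("data/orders.parquet", "my_db.main.customers", "customer_id"))
```

Running the resulting statement in a DuckDB client attached to MotherDuck should show which operators in the plan are placed locally and which remotely; the exact plan output depends on the client version and is not reproduced here.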

This session also covers the practical challenges of running DuckDB at scale and how MotherDuck solves them. Learn about our secure, pluggable secret manager for easily querying data in S3, GCS, and Azure without exposing credentials. We'll touch on our differential storage implementation, which enables powerful features like database sharing and time travel, transforming DuckDB from a single-player tool into a collaborative data platform. We cap it off with a performance comparison, showing the speed benefits of querying large S3 files with MotherDuck's cloud compute versus a local DuckDB client. Finally, we discuss how MotherDuck contributes back to the DuckDB open source project and how you can get involved.
