Soumil Shah


I am a Data Engineer specializing in Apache Hudi and Apache Iceberg. I work across AWS big data services and data lakes, with a focus on building scalable data ingestion pipelines.

I developed the "LakeBoost" framework, integrating Apache Hudi with AWS Glue ETL to enhance efficiency and significantly reduce costs for large-scale data operations. With strong skills in Spark and data platforms, I design systems that support robust, high-performance data workflows.

Beyond my engineering work, I am a dedicated content creator, running a YouTube channel with 44,000 subscribers and over 1,600 videos on big data technologies. My work in data engineering combines technical innovation with educational outreach, making complex concepts accessible to a global audience.