Data Preprocessing for Machine Learning | Go Beyond Basic Cleaning

Author: Data Geek is my name

Uploaded: 2025-09-01

Views: 348

Description:

Most beginners stop at dropping duplicates and nulls. But if you want machine learning models to perform at their best, you need advanced data preparation. In this video, I’ll show you step-by-step how to prepare datasets for ML using Python — including imputation, scaling, encoding, outlier treatment, and pipelines. By the end, you’ll know how to transform messy data into high-performance input that powers accurate models.
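As a rough illustration of how the steps named above (imputation, scaling, encoding, and a pipeline) can fit together in scikit-learn, here is a minimal sketch. The toy DataFrame, its column names (`age`, `income`, `tenure_months`, `city`, `churned`), and the parameter values are assumptions made for this example; they are not taken from the video's dataset or notebook.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import KNNImputer, SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data with missing values in numeric and categorical columns.
df = pd.DataFrame({
    "age": [25, 32, np.nan, 51, 46, 38, 29, np.nan],
    "income": [30_000, 52_000, 47_000, np.nan, 88_000, 61_000, 35_000, 72_000],
    "tenure_months": [3, 14, 8, 40, 36, 20, 5, 27],
    "city": ["Paris", "Lyon", np.nan, "Paris", "Nice", "Lyon", np.nan, "Nice"],
    "churned": [0, 0, 1, 1, 1, 0, 0, 1],
})
X = df.drop(columns="churned")
y = df["churned"]

# Numeric columns: KNN imputation, then standardization.
numeric = Pipeline([
    ("impute", KNNImputer(n_neighbors=2)),
    ("scale", StandardScaler()),
])

# Categorical columns: fill gaps with a constant label, then one-hot encode.
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="constant", fill_value="missing")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer([
    ("num", numeric, ["age", "income", "tenure_months"]),
    ("cat", categorical, ["city"]),
])

# Preprocessing and a baseline classifier chained into one estimator.
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X, y)
predictions = model.predict(X)
```

Keeping the preprocessing inside the pipeline means the imputer, scaler, and encoder are fitted only on the data passed to `fit`, which helps avoid leaking information from a held-out set.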

🔗 Download code & sample DB here: https://github.com/data-geek-lab/beyo...

How to download Anaconda Navigator to use Jupyter Notebook and many other tools for data analytics: • How to Download Anaconda for Jupyter Noteb...

==== Support my channel ====
🔔 Don’t forget to LIKE & SUBSCRIBE for more Python & Data Analysis tutorials!
☕ Want to Buy Me A Coffee: https://buymeacoffee.com/datageekismy...
💎 Donate on PayPal : https://www.paypal.com/donate/?hosted...

== Great Books For Mastering Data Science and Data Cleaning ==
*Data Science and Machine Learning: Mathematical and Statistical Methods* (includes Python code): https://amzn.to/41AqOfT

*Linear Algebra for Data Science, Machine Learning, and Signal Processing*: https://amzn.to/3JFm4Q4

*Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow*: https://amzn.to/45YSE73

Disclaimer: This content is for educational purposes only. Affiliate links may be included, and I may earn a small commission at no extra cost to you. Thank you for supporting the channel!

Timestamps:
0:00 - Intro
1:00 - Review of the CSV dataset
1:56 - Step 1 (in Jupyter Notebook): Import libraries, then load and preview the dataset
3:37 - Step 2: Quick EDA and data types (shows information about the dataset)
5:26 - Step 3: Fix dates and basic schema
6:33 - Step 4: Feature engineering from dates
12:23 - Step 5: Split features and target
17:29 - Step 6: Outlier exploration (numeric): use z-score/IQR to detect outliers
19:30 - Step 7: Preprocessing pipeline: impute missing values (KNN for numeric; constant for categorical)
27:08 - Optional step: Outlier capping (winsorization): add a custom transformer that caps extreme values after imputation
29:33 - Step 8: Train a model with the preprocessing pipeline, using Logistic Regression as a baseline to demonstrate how preprocessing and modeling fit together
35:05 - Compare with the winsorization variant
38:55 - Step 9: Export the preprocessing pipeline (save the fitted preprocessor + model for reuse)
40:17 - Outro
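
The outlier steps in the timestamps (IQR detection and the optional capping transformer) might look roughly like the sketch below. The class name `IQRCapper`, the multiplier `k=1.5`, and the sample values are assumptions for illustration. The video's own transformer may use different thresholds, such as fixed percentiles, which is the stricter meaning of winsorization.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class IQRCapper(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: cap each column at [Q1 - k*IQR, Q3 + k*IQR]."""

    def __init__(self, k=1.5):
        self.k = k

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        # Per-column quartiles, learned from the training data only.
        q1, q3 = np.percentile(X, [25, 75], axis=0)
        iqr = q3 - q1
        self.lower_ = q1 - self.k * iqr
        self.upper_ = q3 + self.k * iqr
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        # Values outside the learned fences are pulled back to the fence.
        return np.clip(X, self.lower_, self.upper_)


# Made-up single-column sample with one extreme value.
values = np.array([[10.0], [12.0], [11.0], [13.0], [12.0], [300.0]])

capper = IQRCapper(k=1.5).fit(values)

# Detection: flag rows that fall outside the IQR fences.
outlier_mask = (values < capper.lower_) | (values > capper.upper_)

# Treatment: cap instead of dropping rows.
capped = capper.transform(values)
```

Because the transformer expects complete numeric input, it would sit after the imputation step inside a numeric pipeline, which matches the order described in the optional step.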
