Nano Course: Basics of K-means and Gaussian Mixture Clustering
Author: Phi-AI
Uploaded: 2025-01-03
Views: 970
▬▬ Description ▬▬▬▬▬▬▬▬
This video introduces the concepts of K-means and Gaussian Mixture clustering applied to word embeddings. The embeddings are computed for pairs of words extracted from the Support Tickets dataset, using #llms from the #sentencetransformers library. You will also learn about the connection between Euclidean distance and probability.
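As a rough illustration of the connection between Euclidean distance and probability mentioned above, here is a minimal sketch that assumes an isotropic Gaussian (covariance equal to sigma^2 times the identity). In that special case the log-density depends on a point only through its squared Euclidean distance to the mean. The function name and the toy points are hypothetical and are not taken from the linked repositories.

```python
import numpy as np

def isotropic_gaussian_logpdf(x, mean, sigma):
    """Log-density of N(mean, sigma^2 * I) at x (illustrative helper, not from the video's code)."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    n_dim = x.shape[-1]
    # Squared Euclidean distance between the point and the mean
    sq_dist = np.sum((x - mean) ** 2, axis=-1)
    # The first term shrinks as distance grows; the second is a normalization constant
    return -0.5 * sq_dist / sigma**2 - 0.5 * n_dim * np.log(2 * np.pi * sigma**2)

# Toy 2D points (assumed values, chosen only for illustration)
mean = np.array([0.0, 0.0])
near = np.array([0.5, 0.0])
far = np.array([2.0, 0.0])

logp_near = isotropic_gaussian_logpdf(near, mean, sigma=1.0)
logp_far = isotropic_gaussian_logpdf(far, mean, sigma=1.0)
print(logp_near > logp_far)  # the closer point has the higher density
```

With a general covariance matrix, the Mahalanobis distance takes the place of the Euclidean distance in this expression.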
▬▬ Code ▬▬▬▬▬▬▬▬▬▬▬
K-means Clustering
https://github.com/enoten/ml-numpy-py...
Gaussian Mixtures Clustering
https://github.com/enoten/ml-numpy-py...
The Support Tickets Dataset is available here:
https://huggingface.co/datasets/phi-a...
Code for the analysis of the Support Tickets Dataset and
the extraction of keyword pairs is available here:
https://github.com/enoten/support_tic...
▬▬ Recommended ML Books ▬▬▬▬▬▬▬▬▬▬▬
https://amzn.to/3Xzy22k Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow
https://amzn.to/4dPv7rS Machine Learning with PyTorch and Scikit-Learn
https://amzn.to/4gaxUxj Designing Machine Learning Systems
https://amzn.to/4gfsYqO Machine Learning Design Patterns
▬▬ Exclusive Phi AI merch ▬▬▬▬▬▬▬▬▬▬▬
https://phiaistore.etsy.com
https://phi-ai-store.printify.me/prod...
▬▬ Timeline ▬▬▬▬▬▬▬▬▬▬▬
0:05 - Basics of Clustering
0:10 - K-means Clustering
0:40 - Word Embeddings with SentenceTransformers
2:26 - Support Tickets Dataset
2:51 - BERT-based Embedding LLM and Embedding Dimensions
3:47 - Dimensionality Reduction of Embeddings
6:02 - K-means Clustering Overview
7:55 - K-means Clustering: Cluster Assignment Step
10:35 - K-means Clustering: Centroid Update Step
13:12 - Epoch Definition
13:30 - Run Multiple Epochs of K-means Clustering
14:26 - Results of Clustering
17:14 - Visualize Evolution of Cluster Centroids across Epochs
17:40 - Visualize Clusters
18:13 - Illustrate Cluster Size
18:15 - Conclusion and Next Videos
18:20 - Gaussian Mixture Clustering: Intro
18:46 - Revisit K-means clustering
20:32 - Revisit Word Embeddings
22:02 - Revisit Dimensionality Reduction
22:53 - Introducing Gaussian (Normal) Distribution: 1D case
24:33 - Two examples of Gaussian Distributions
25:23 - Problem with small values of standard deviation
26:26 - Introducing Gaussian (Normal) Distribution: N-dim and 2D cases
28:49 - Mahalanobis and Euclidean Distances: Relationship between distance and probability
30:34 - Example of Covariance Matrix
30:45 - Two examples of 2D Gaussian Distribution: Rotation and Extension
32:49 - Gaussian Mixtures
33:50 - Example of the Gaussian Mixture of 1D Normal Distributions
34:36 - Introducing EM algorithm
36:11 - EM algo vs. K-Means: Cluster Assignment step aka Expectation Step
41:26 - EM algo vs. K-Means: Centroid Update step aka Maximization Step
44:43 - Run Gaussian Mixture Clustering for Multiple Epochs
46:03 - Evolution of Centroids
46:14 - Visualize Final clusters
47:03 - Conclusion
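
The two K-means steps named in the timeline (cluster assignment and centroid update, repeated over several epochs) can be sketched roughly as follows. This is a minimal NumPy sketch under stated assumptions, not the code from the linked repositories: the function names are hypothetical, centroids are initialized from randomly chosen data points, and the data are synthetic 2D blobs rather than the reduced Support Tickets embeddings.

```python
import numpy as np

def assign_clusters(points, centroids):
    """Cluster assignment step: label each point with its nearest centroid."""
    # distances[i, k] = Euclidean distance from point i to centroid k
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(distances, axis=1)

def update_centroids(points, labels, centroids):
    """Centroid update step: move each centroid to the mean of its members."""
    new_centroids = centroids.copy()
    for k in range(len(centroids)):
        members = points[labels == k]
        if len(members) > 0:  # an empty cluster keeps its previous centroid
            new_centroids[k] = members.mean(axis=0)
    return new_centroids

def kmeans(points, n_clusters, n_epochs=10, seed=0):
    """Run a fixed number of epochs; one epoch = assignment + update."""
    rng = np.random.default_rng(seed)
    # Assumed initialization: pick distinct data points as starting centroids
    init_idx = rng.choice(len(points), size=n_clusters, replace=False)
    centroids = points[init_idx].astype(float)
    for _ in range(n_epochs):
        labels = assign_clusters(points, centroids)
        centroids = update_centroids(points, labels, centroids)
    # Final assignment so the labels match the returned centroids
    labels = assign_clusters(points, centroids)
    return labels, centroids

# Synthetic, well-separated 2D blobs (illustrative data only)
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(50, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(50, 2))
points = np.vstack([blob_a, blob_b])

labels, centroids = kmeans(points, n_clusters=2)
print(centroids)
```

Gaussian Mixture clustering follows the same two-step loop, but the hard nearest-centroid assignment is replaced by per-cluster probabilities (the Expectation step), and the update step re-estimates means, covariances, and mixture weights (the Maximization step).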