Maarten Grootendorst on BERTopic - Weaviate Podcast #28
Author: Weaviate vector database
Uploaded: 2022-11-17
Views: 1434
Thank you so much for watching the 28th Weaviate Podcast! This episode features Maarten Grootendorst, developer of the BERTopic Python library and an active evangelist of this exciting cluster analysis technology. (Maarten has written some incredible articles here: / maartengrootendorst.) In this podcast, Maarten did an incredible job explaining how BERTopic works, including particular details such as k-Means clustering vs. HDBSCAN, semi-supervised topic modeling, dynamic topic modeling, and many more! I was amazed at Maarten's expertise in the miscellaneous details of these algorithms! We are extremely excited about adding BERTopic to Weaviate. Please see this proposal if you are interested in contributing to the discussion: https://github.com/semi-technologies/...
Timestamps
0:00 Welcome Maarten!
0:44 BERTopic Algorithm
5:00 HDBSCAN vs. k-Means
7:08 Keyword Lists of “Topics”
12:30 Semi-Supervised Topic Modeling
14:55 Recursive BERTopic
18:56 Image Vectors for BERTopic
21:18 Data Analysis of Vector Clusters
22:45 Inspiration from Psychology
26:42 BERTopic for Generalization Testing
28:50 BERTopic for Data Cleaning
31:15 Subjectivity in Topic Labeling
36:22 Evaluating Topic Models
39:20 Searching through Topics
41:52 What is Dynamic Topic Modeling?
44:43 What is Online Topic Modeling?
47:58 BERTopic Python library