Building out an entity resolution pipeline with Python and dbt, Vouch.us
Author: dbt Labs
Uploaded: 2020-10-05
Views: 6286
You’d probably expect a table named companies to have one record per company, right?
But what happens when your organization uses a number of different tools to collect customer information, all with different ways to track a company? Or when two people fill in slightly different names for their company? That companies table is likely going to end up with different records representing the same real-life company.
That’s where entity resolution comes in — the practice of mapping different identifiers to a single entity.
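To make the idea concrete, here is a minimal Python sketch of one very simple form of entity resolution: grouping records whose company names match after normalization. It is not the approach presented in the talk. The `SUFFIXES` set, the `normalize` and `resolve` helpers, and the sample records are all hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical legal suffixes to ignore when comparing names (an assumption).
SUFFIXES = {"inc", "llc", "ltd", "corp", "co"}


def normalize(name: str) -> str:
    """Lowercase, strip punctuation, and drop trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", "", name.lower()).split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def resolve(records):
    """Group (source, record_id, name) tuples under one normalized key."""
    entities = defaultdict(list)
    for source, record_id, name in records:
        entities[normalize(name)].append((source, record_id))
    return dict(entities)


# Made-up records from three hypothetical tools.
records = [
    ("crm", 1, "Acme, Inc."),
    ("billing", 17, "ACME Inc"),
    ("support", 42, "Globex LLC"),
]
resolved = resolve(records)
print(resolved)
```

Exact matching on a normalized key only catches formatting differences. Misspellings or abbreviations would need fuzzy matching or additional identifiers such as domains, which this sketch does not attempt.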
In this Office Hours session, Pedram from Vouch Insurance shares how they solved this problem using a combination of dbt and Python.
Speaker: Pedram Navid, Data Engineer, Vouch.us
Slides: https://bit.ly/3lg6f0z