[90] Intro to OpenRefine for Data Cleaning and Reconciliation (Martin Magdinier)

Author: Data Umbrella

Uploaded: 2023-10-03

Views: 1565

Description:

Join our Meetup group:
https://www.meetup.com/data-umbrella

Resources
Slides: https://docs.google.com/presentation/...
Dataset: https://open.toronto.ca/dataset/build...
OpenRefine Discourse forums: https://forum.openrefine.org/

About the Event
OpenRefine is an open-source tool for working with messy data. It cleans such data and transforms it, including converting it between formats.
The talk has three parts. The first introduces OpenRefine: its purpose, its user base, and its history. The second is a tour of OpenRefine covering download and installation, data import, filtering and faceting, clustering, other data cleaning techniques, and reconciliation services. The session closes with an invitation to join the OpenRefine community and an overview of the ways to contribute: coding, design, translation, documentation, and user support.
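
To make the clustering step concrete, here is a minimal Python sketch of key-collision clustering in the spirit of OpenRefine's fingerprint method. It is an approximation written for illustration, not OpenRefine's own code: the function names `fingerprint` and `cluster` and the sample street names are assumptions, and the exact normalization order in OpenRefine may differ.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Approximate fingerprint key: normalize, tokenize, dedupe, sort."""
    # Fold accented characters to their closest ASCII form.
    ascii_text = (
        unicodedata.normalize("NFKD", value)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Trim, lowercase, and drop ASCII punctuation.
    cleaned = ascii_text.strip().lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    # Split on whitespace, remove duplicate tokens, sort, and rejoin.
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Group values that share a key; keep groups with more than one spelling."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


if __name__ == "__main__":
    # Hypothetical sample values, not taken from the Toronto dataset.
    sample = ["Bathurst St.", "bathurst st", "St Bathurst", "Queen St"]
    for group in cluster(sample):
        print(group)
```

In this sketch the three Bathurst spellings share the key `bathurst st` and land in one group, while `Queen St` stays on its own. A reviewer would then pick one canonical spelling per group, which is the decision OpenRefine's clustering dialog asks the user to make.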

Timestamps
00:00 Data Umbrella introduction
03:35 What is OpenRefine?
05:00 History of OpenRefine (from Freebase Gridworks and Google Refine to OpenRefine)
08:33 OpenRefine user base
10:42 Project statistics
11:34 Features of OpenRefine
14:00 Contributing to OpenRefine (use, promote, help, translate, fix, create, design)
19:40 Begin demo: example dataset of Toronto building permits
20:23 Running OpenRefine locally, installation
20:44 Download OpenRefine (openrefine.org/download)
21:45 Demo: reading in the data
24:15 Demo: export data from OpenRefine
24:38 Demo: working with the data
25:30 Demo: Text facet shows summary of different values
26:45 facet / filter
27:17 combine multiple facets
28:10 text filter
28:40 Cluster algorithm to clean text data (e.g., the fingerprint function)
32:54 Cluster algorithm: n-Gram fingerprint
33:30 Cluster algorithm: Cologne phonetic
34:15 Cleaning: working with numerical data
35:20 find and replace: remove commas in number
37:49 working with dates
38:40 doing reconciliations in OpenRefine (merge multiple fields into one field)
41:12 Reconciliation Service: an API
41:32 about the dataset: Bathurst Street from Wiki Foundation
44:00 connect my dataset with Wikipedia data
44:45 Reconciliation service test bench (plus: clean street name data)
47:38 Example: Excel type code for editing data
55:26 Resources list
56:20 Q: In the Reconciliation service API, which API versions are supported by OpenRefine?
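
For readers who have not seen a reconciliation request, the sketch below shows roughly what a client sends and how it might pick a candidate. It assumes the batch format used by the Reconciliation Service API, where a `queries` form parameter carries a JSON object of keyed queries and each result entry has `id`, `name`, `score`, and `match` fields. The helper names, the `EXAMPLE_TYPE` identifier, and the sample response are hypothetical. No endpoint is contacted, and the talk description does not state which API versions are supported.

```python
import json
from urllib.parse import urlencode


def build_queries(names, type_id=None, limit=3):
    """Build a batch of reconciliation queries keyed q0, q1, ..."""
    queries = {}
    for i, name in enumerate(names):
        query = {"query": name, "limit": limit}
        if type_id is not None:
            query["type"] = type_id  # optional type restriction
        queries[f"q{i}"] = query
    return queries


def encode_form(queries):
    """Encode the batch as the form body a client would POST."""
    return urlencode({"queries": json.dumps(queries)})


def best_matches(response, min_score=0.0):
    """Pick the highest-scoring candidate per query, or None if there is none."""
    best = {}
    for key, payload in response.items():
        candidates = payload.get("result", [])
        top = max(candidates, key=lambda c: c.get("score", 0), default=None)
        best[key] = top if top and top.get("score", 0) >= min_score else None
    return best


if __name__ == "__main__":
    queries = build_queries(["Bathurst Street", "Unknown Road"], type_id="EXAMPLE_TYPE")
    print(encode_form(queries))

    # Hand-written response for illustration only; not real service output.
    sample_response = {
        "q0": {"result": [
            {"id": "X1", "name": "Bathurst Street", "score": 97.0, "match": True},
            {"id": "X2", "name": "Bathurst", "score": 40.0, "match": False},
        ]},
        "q1": {"result": []},
    }
    print(best_matches(sample_response))
```

Keeping the payload builder separate from the scoring logic lets the same code work against any service that follows this batch format. The `min_score` threshold is a local policy choice, not part of the protocol.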

About the Speaker
Martin Magdinier is the OpenRefine Project Manager and a core contributor since 2013.

GitHub: https://github.com/OpenRefine/
X:   / openrefine  
LinkedIn:   / openrefine  

#python #opensource #datascience #dataanalytics


Related videos

Data Education

[34] Taking the Edge Off of Data Science with dabl (Andreas Mueller)

Using OpenRefine for Name Reconciliation

Master Data Cleaning Skills with OpenRefine & WinPure (Hands on Lab)

Data Cleaning with OpenRefine

Understanding Data Cleaning | Google Data Analytics Certificate

Proactive Security for Safeguarding Data within the Salesforce Ecosystem - by Salesforce & Accellor

[105] Polars for Data Analysis in Python (Kimberly Fessel)

Get Started with OpenRefine: Explore, Clean, and Transform your Data!

[114] Getting Started with PyMC (Chris Fonnesbeck)

[106] RAGged Edge Box: A Personal AI-Powered Document Search System (Pablo Duboue)

Introduction to OpenRefine

APIs Explained (in 4 Minutes)

Data Cleaning in OpenRefine (Library Skills Week 2021)

Introduction to Data Cleaning with OpenRefine and Word Clouds

[101] Build Your First Database Web App with Tiki Trackers (open source) (Marc Laporte)

[98] Intro to Bash Scripting (Rebecca BurWei)

Introduction to OpenRefine

Cleaning Data with OpenRefine: Streamline Data Management in Humanitarian Organizations.

[107] Polars & Narwhals: Translating from pandas (Marco Gorelli)