Becoming a Pokémon Master with DVC: Reproducible Machine Learning Experiments | Rob De Wit
Author: DVCorg
Uploaded: 2023-06-06
Views: 830
From @PyData Eindhoven 2022
In machine learning projects we need to experiment in order to find and maintain the best-performing model. While we can do initial prototyping in a Notebook, eventually we need to move towards more structured experiment tracking to facilitate the reproducibility of our experiments.
The open-source DVC library aims to tackle this problem through a Git-based approach to versioning data and artifacts. In this talk we will explore how DVC works, how we can apply it to conduct ML experiments, and how we can use it to become a great Pokémon trainer.
Every data scientist has at one point kept track of their experiments on paper, sticky notes, or in a spreadsheet. But how can we guarantee reproducibility for potentially thousands of experiments over numerous years? Can we figure out which version of a model ran in production six months ago, and what data went into its training?
The talk is aimed at data scientists and explores best practices for ML projects using a light-hearted topic. Some general knowledge of how ML works is helpful, but not required to follow the talk. The key concept is reproducibility: how can we track and version not just code, but entire experiments?
DVC is a potential solution for this. The philosophy behind it can be summarized as "Git for data and models". I will discuss its concepts and show how it works in practice for a classifier of Pokémon sprites.
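To make the "Git for data and models" idea concrete, here is a minimal Python sketch of the general pattern: large files go into a content-addressed cache, and only a small pointer file is committed to Git. This is an illustration under stated assumptions, not DVC's actual implementation. The function names `track` and `restore`, the `.pointer.json` format, and the choice of SHA-256 are all hypothetical and chosen only for this example.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def track(data_file: Path, cache_dir: Path) -> Path:
    """Store data_file in a content-addressed cache and write a pointer file.

    Hypothetical helper: the pointer file is small enough to commit to Git,
    while the data itself lives in the cache under its hash.
    """
    digest = hashlib.sha256(data_file.read_bytes()).hexdigest()
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / digest
    if not cached.exists():  # identical content is stored only once
        shutil.copy2(data_file, cached)
    pointer = data_file.with_name(data_file.name + ".pointer.json")
    pointer.write_text(json.dumps({"path": data_file.name, "sha256": digest}, indent=2))
    return pointer


def restore(pointer: Path, cache_dir: Path) -> Path:
    """Recreate the data file that a pointer file refers to (hypothetical helper)."""
    meta = json.loads(pointer.read_text())
    target = pointer.with_name(meta["path"])
    shutil.copy2(cache_dir / meta["sha256"], target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        data, cache = root / "sprites.csv", root / "cache"

        # Version 1 of the dataset; keep the pointer text as Git would.
        data.write_text("pikachu,electric\n")
        pointer = track(data, cache)
        pointer_v1 = pointer.read_text()

        # Version 2 overwrites the working copy and the pointer.
        data.write_text("pikachu,electric\ncharmander,fire\n")
        track(data, cache)

        # "Checking out" the old pointer brings back the old data.
        pointer.write_text(pointer_v1)
        restore(pointer, cache)
        print(data.read_text())
```

Because the pointer file changes whenever the data changes, an ordinary Git history of pointer files is enough to recover which data belonged to which commit.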
The main takeaways are the importance of reproducibility and a demonstration of how to achieve it.
Try out the DVC Extension for VS Code here: https://marketplace.visualstudio.com/...
To learn more about Iterative's open-source and SaaS tools please visit:
🧑🏽‍💻 Our online course: https://learn.iterative.ai
✍🏼 Our docs: https://dvc.org/doc (Data Version Control, Pipelines, Experiments)
https://cml.dev/doc (CI/CD for Machine Learning)
https://mlem.ai/doc (Package and Serve your models)
https://studio.iterative.ai (Team Collaboration, Experiments, Model Registry)
Join our Discord server: / discord
#dvc #machinelearning #datascience