HAI Seminar: Addressing Challenges of Public Web Data
Author: Stanford HAI
Uploaded: 2025-10-31
Views: 148
This HAI seminar featured the Common Crawl Foundation's work on preserving humanity's knowledge and making it accessible through its free public web dataset, a vital resource since 2008. The Common Crawl team presented insights from a new data product that uses Common Crawl's metadata to explore concerns around robots.txt exclusions, legal demands, and "bot defenses," advocating for greater transparency and informed solutions for the future of public web data.
This seminar was recorded on October 22, 2025 at Stanford University.
00:00:00 Introduction
00:01:01 Lecture
00:48:53 Q&A