Data Science with Python: Wikitext, Wikipedia API
Author: Włodzimierz Lewoniewski
Uploaded: 2025-04-06
Views: 444
In this video tutorial you’ll learn, step by step, how to use the MediaWiki API of the English Wikipedia to programmatically fetch the raw wikicode (“wikitext”) of any article, process it in Python, and save it locally. You’ll see the entire workflow: starting with a manual look at the HTML source, then moving to the API Sandbox, parsing the JSON response, and finally batch‑downloading multiple pages. By the end you’ll be able to pull not only single articles but whole collections for data analysis, machine‑learning pipelines, automated translation, or your own apps.
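The single-article part of this workflow can be sketched roughly as follows. This is a minimal illustration, not necessarily the code shown in the video: it assumes the `action=parse` module with `prop=wikitext` and `formatversion=2`, and the helper names (`build_url`, `extract_wikitext`, `fetch_wikitext`) and the User-Agent string are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Endpoint of the MediaWiki API for the English Wikipedia.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_url(title):
    """Build a request URL that asks for the wikitext of one page.

    Assumption: action=parse with prop=wikitext; the video may use
    a different module or different parameters.
    """
    params = {
        "action": "parse",
        "page": title,
        "prop": "wikitext",
        "format": "json",
        "formatversion": "2",  # wikitext is returned as a plain string
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def extract_wikitext(data):
    """Pull the raw markup out of the decoded JSON response."""
    if "error" in data:
        # The API reports problems (e.g. a missing page) in an "error" object.
        raise ValueError(data["error"].get("info", "unknown API error"))
    return data["parse"]["wikitext"]


def fetch_wikitext(title, user_agent="wikitext-tutorial-sketch/0.1 (example)"):
    """Download and return the wikitext of a single article (needs network)."""
    request = urllib.request.Request(
        build_url(title), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        data = json.load(response)
    return extract_wikitext(data)
```

Calling `fetch_wikitext("Python (programming language)")` should return the article's markup as one string, which can then be written to a local file with `open(..., "w", encoding="utf-8")`.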
What You’ll Be Able to Do Afterwards:
🟢 Test API calls – Use the Wikipedia Sandbox to set parameters and preview responses.
🟢 Build Python requests – Craft HTTP queries with both urllib and requests and store JSON.
🟢 Extract wikitext – Traverse the nested JSON to isolate and save raw markup.
🟢 Download many articles at once – Loop through multiple pages and write their code to a single file.
🟢 Lay groundwork for advanced projects – Apply the data stream to text‑mining, ML corpora, or custom web tools.
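The batch-download step from the list above might look something like this sketch. It assumes the `action=query` module with `prop=revisions` and `formatversion=2`, uses the third-party `requests` library, and stores the results in one JSON file. The function names, the output format, and the User-Agent string are illustrative assumptions rather than the video's actual code.

```python
import json

API_URL = "https://en.wikipedia.org/w/api.php"


def build_query_params(titles):
    """Parameters requesting the latest revision content of several pages."""
    return {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": "|".join(titles),  # multiple titles are pipe-separated
        "format": "json",
        "formatversion": "2",  # "pages" is returned as a list
    }


def extract_pages(data):
    """Walk the nested JSON and return a {title: wikitext} dictionary."""
    result = {}
    for page in data.get("query", {}).get("pages", []):
        # Pages that do not exist are flagged "missing" and carry no revisions.
        if page.get("missing") or "revisions" not in page:
            continue
        result[page["title"]] = page["revisions"][0]["slots"]["main"]["content"]
    return result


def download_pages(titles, out_path,
                   user_agent="wikitext-tutorial-sketch/0.1 (example)"):
    """Fetch several pages in one request and save them to a single file."""
    # Imported here so the pure helpers above work without the dependency.
    import requests

    response = requests.get(
        API_URL,
        params=build_query_params(titles),
        headers={"User-Agent": user_agent},
        timeout=30,
    )
    response.raise_for_status()
    pages = extract_pages(response.json())
    with open(out_path, "w", encoding="utf-8") as handle:
        json.dump(pages, handle, ensure_ascii=False, indent=2)
    return pages
```

For example, `download_pages(["Poland", "Python (programming language)"], "articles.json")` would write both articles to `articles.json`. Two caveats: the API limits how many titles one request may carry (50 for ordinary users), so longer lists need to be split into chunks, and the titles in the response may be normalized forms of the ones requested.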
#Wikipedia #python #datascience #wikitext #wikicode #api #requests #urllib