Python: How to Screen Scrape a Website Using BeautifulSoup (BS4) | Learn with Dr. Todd Wolfe

Author: Dr. Todd Wolfe Technology Training and Tutorials

Uploaded: 2024-10-31

Views: 382

Description:

In this video, the famous Dr. Todd Wolfe will walk you through how to scrape data from a website using Python and the BeautifulSoup (BS4) library. We'll explore the fundamentals of web scraping, showing you step-by-step how to extract the titles of articles from the Hacker News website.

Key Topics Covered in This Video:

Introduction to web scraping and its use cases.
Setting up your Python environment, including installing the BeautifulSoup and requests packages.
HTTP requests: how to get the HTML content of a webpage.
Using BeautifulSoup to parse the HTML and navigate through the tags.
Extracting and printing article titles from the Hacker News website, using simple Python loops and techniques.
Tips on scraping ethically and respecting website rules.
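
One common way to respect a website's rules is to read its robots.txt file before scraping. The sketch below uses Python's standard urllib.robotparser module for that check. The robots.txt content, the example.com URLs, and the "my-scraper" user agent name are made up for illustration and are not taken from the video or from Hacker News.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs without network access.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 30
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Build a robots.txt parser from text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# "my-scraper" is a placeholder user agent name.
print(parser.can_fetch("my-scraper", "https://example.com/"))
print(parser.can_fetch("my-scraper", "https://example.com/private/data"))
print(parser.crawl_delay("my-scraper"))
```

For a live site, RobotFileParser can also download the file itself through its set_url() and read() methods instead of parsing inlined text.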

Tutorial Objectives:
Learn how to install BeautifulSoup and requests for Python.
Understand how to fetch webpage content using the requests library.
Learn to parse and navigate HTML with BeautifulSoup to find specific elements.
Extract and print article titles from the Hacker News homepage using Python.
Get valuable insights from Dr. Todd Wolfe about best practices in web scraping.
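
To illustrate the parsing objective without touching the network, here is a minimal sketch that runs BeautifulSoup on an inline HTML string. It assumes both packages are already installed (for example with pip install beautifulsoup4 requests). The sample HTML and the extract_stories helper are invented for this example; the markup only imitates the span elements with class "titleline" that the video's code looks for.

```python
from bs4 import BeautifulSoup

# Made-up HTML that imitates the span.titleline structure targeted in the video's code.
SAMPLE_HTML = """
<table>
  <tr><td><span class="titleline"><a href="https://example.com/a">First story</a></span></td></tr>
  <tr><td><span class="titleline"><a href="https://example.com/b">Second story</a></span></td></tr>
</table>
"""


def extract_stories(html: str) -> list[tuple[str, str]]:
    """Return (title, link) pairs for every span with class 'titleline'."""
    soup = BeautifulSoup(html, "html.parser")
    stories = []
    for span in soup.find_all("span", class_="titleline"):
        link = span.find("a")  # first anchor tag nested inside the span
        if link is None:
            continue  # skip spans that contain no link
        stories.append((link.get_text(strip=True), link.get("href", "")))
    return stories


for idx, (title, href) in enumerate(extract_stories(SAMPLE_HTML), start=1):
    print(f"{idx}. {title} ({href})")
```

Keeping the parsing step in its own function means it can be tried on saved or hand-written HTML before it is pointed at a real page.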

By the end of this video, you'll be able to create your own web scraper to gather useful information from publicly available websites, and you'll gain a solid understanding of how BeautifulSoup can make HTML parsing easy.

Code Snippets Used in This Video: You can find the example code used in this video in the description below, so you can follow along as Dr. Todd Wolfe shows you each step of the process.

🚀 Subscribe to our channel for more in-depth tutorials on Python programming, data analysis, databases, and other tech topics from Dr. Todd Wolfe.

👍 Like and Comment below if you enjoyed this video or have suggestions for future content.

Resources Mentioned in This Video:

BeautifulSoup Documentation: https://www.crummy.com/software/Beaut...
Hacker News Website: https://news.ycombinator.com/

Follow Dr. Todd Wolfe for more tech and programming updates:

LinkedIn: / toddwolfe

📌 Don’t forget to hit the bell icon 🔔 to get notified whenever we post a new tutorial!

#Python #WebScraping #BeautifulSoup #DrToddWolfe #HackerNews #LearnPython #ProgrammingTutorial

Code Snippets:

import requests
from bs4 import BeautifulSoup

# URL of the website to scrape
url = "https://news.ycombinator.com/"

# Send an HTTP request to the URL
response = requests.get(url)

# Check whether the request was successful
if response.status_code == 200:
    # Parse the HTML content of the page using the BeautifulSoup library
    soup = BeautifulSoup(response.text, 'html.parser')

    # Find all story titles
    titles = soup.find_all('span', class_='titleline')

    # Print all the articles
    print("Top Articles on Hacker News:")

    # Debugging aid: uncomment to dump the raw HTML of the page
    # print(response.text)

    for idx, title in enumerate(titles):
        print(f"{idx+1}. {title.text}")
else:
    print(f"Failed to retrieve the page. Status code: {response.status_code}")
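
The snippet above checks only the status code. As a possible variation, the fetch step can also set a timeout and catch network errors so the script does not hang or crash on a bad connection. This is a sketch and not the code shown in the video: the fetch_html name, the 10-second timeout, and the "tutorial-scraper/0.1" User-Agent string are assumptions chosen for illustration.

```python
from typing import Optional

import requests


def fetch_html(url: str, timeout: float = 10.0) -> Optional[str]:
    """Return the page body as text, or None if the request fails."""
    # Hypothetical User-Agent string that identifies the script to the server.
    headers = {"User-Agent": "tutorial-scraper/0.1"}
    try:
        response = requests.get(url, headers=headers, timeout=timeout)
        # Raise an HTTPError for 4xx and 5xx status codes.
        response.raise_for_status()
    except requests.exceptions.RequestException as exc:
        # Covers connection errors, timeouts, invalid URLs, and HTTP errors.
        print(f"Failed to retrieve {url}: {exc}")
        return None
    return response.text
```

The returned text can then be passed to BeautifulSoup(html, 'html.parser') exactly as in the snippet above, after checking that the result is not None.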
