beautifulsoup-tutorial's Introduction

BeautifulSoup Web Scraping Tutorial

A beginner's tutorial to scraping websites using Python's BeautifulSoup library.

This repository is the source code for the tutorial found here.

Getting Started

Get set up locally in two steps:

Environment Variables

Replace the value in .env.example with your value, and rename this file to .env:

  • TARGET_URL: An HTTP URL to scrape and display metadata from.
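As a rough illustration of how the TARGET_URL value might be read and sanity-checked before scraping, here is a minimal sketch using only the standard library. The helper name get_target_url is hypothetical and not taken from the repository; the project itself lists python-dotenv as a dependency, which would normally load .env into the environment first (that step is omitted here).

```python
import os
from urllib.parse import urlparse


def get_target_url(env=None) -> str:
    """Return TARGET_URL from the environment, or raise if it is unusable.

    Hypothetical helper for illustration; `env` defaults to os.environ
    and can be replaced with a plain dict when testing.
    """
    env = os.environ if env is None else env
    url = env.get("TARGET_URL", "").strip()
    parsed = urlparse(url)
    # Accept only absolute http(s) URLs with a host part.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("TARGET_URL must be an http(s) URL, got: %r" % url)
    return url
```

Calling get_target_url({"TARGET_URL": "https://example.com"}) returns the URL unchanged, while a missing or malformed value raises ValueError before any request is made.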

Installation

Get up and running with make deploy:

git clone https://github.com/hackersandslackers/beautifulsoup-tutorial.git
cd beautifulsoup-tutorial
make deploy

Hackers and Slackers tutorials are free of charge. If you found this tutorial helpful, a small donation would be greatly appreciated to keep us in business. All proceeds go towards coffee, and all coffee goes towards more content.

beautifulsoup-tutorial's People

Contributors

dependabot-preview[bot], dependabot[bot], renovate-bot, toddbirchard

beautifulsoup-tutorial's Issues

main.py - Line 5 ('scrape = scrape_page_metadata()')

Love the walk-through; it's a great warm-up for web scraping.
I found an error in the call to scrape_page_metadata() in main.py.
Changing line 5 from:
5: scrape = scrape_page_metadata()
to:
5: scrape = scrape_page_metadata
makes scrape refer to the function itself, so it can then be called with the url variable from config on line 8.
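The difference between the two versions of line 5 can be sketched as follows. The function body and the TARGET_URL value here are stand-ins assumed for illustration (the real function presumably fetches and parses the page); only the name scrape_page_metadata and its required url argument come from the report.

```python
def scrape_page_metadata(url: str) -> dict:
    """Hypothetical stand-in: returns a dict instead of fetching the page."""
    return {"url": url}


TARGET_URL = "https://example.com"  # assumed placeholder for the config value

# Calling with no arguments reproduces the reported TypeError.
try:
    scrape_page_metadata()
except TypeError as exc:
    error_message = str(exc)

# Without parentheses, the name is bound to the function object; nothing runs yet.
scrape = scrape_page_metadata
result_via_alias = scrape(TARGET_URL)

# Passing the URL directly is equivalent and avoids the extra alias.
result_direct = scrape_page_metadata(TARGET_URL)
```

Both results are identical, so passing the URL at the call site is arguably the simpler fix.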

TypeError: scrape_page_metadata() missing 1 required positional argument: 'url'

Error:

Traceback (most recent call last):
  File "main.py", line 5, in <module>
    scrape = scrape_page_metadata()
TypeError: scrape_page_metadata() missing 1 required positional argument: 'url'

How to obtain error:

!git clone https://github.com/hackersandslackers/beautifulsoup-tutorial.git
%cd beautifulsoup-tutorial

!pip3 install -r requirements.txt
!python3 main.py

The URL is set in config.py, but it seems the call should also pass the URL as an argument?

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Rate-Limited

These updates are currently rate-limited. Click on a checkbox below to force their creation now.

  • Update dependency flake8 to v7.1.1
  • Update dependency soupsieve to v2.6
  • Create all rate-limited PRs at once

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

Detected dependencies

pep621
pyproject.toml
pip_requirements
requirements.txt
  • beautifulsoup4 ==4.12.2
  • certifi ==2023.11.17
  • charset-normalizer ==3.3.2
  • idna ==3.6
  • python-dotenv ==1.0.0
  • requests ==2.31.0
  • soupsieve ==2.5
  • urllib3 ==2.1.0
poetry
pyproject.toml
  • python >=3.10,<4.0
  • requests *
  • beautifulsoup4 *
  • python-dotenv *
  • black *
  • isort *
  • flake8 *
  • pylint *
  • mypy *

  • Check this box to trigger a request for Renovate to run again on this repository
