web-scraping-newegg's Introduction

Web Scraping Newegg



🧐 About

This is a web scraping program that harvests information from the website and organizes the product information in a sorted way. Every time you run the program, it outputs the updated information from the website.

Required Libraries

bs4

pip install beautifulsoup4

requests

python -m pip install requests

🏁 Installing

python3 web-scrape.py

🎈 Code Walk Through

uClient opens a connection to my_url, grabs the page, and stores its contents in page_html.

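# Assumed imports: the aliases uReq and soup used in this walk-through suggest these
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
my_url = "https://www.newegg.com/Video-Cards-Video-Devices/Category/ID-38?Tpk=graphic%20card"
# Open a connection and download the page
uClient = uReq(my_url)
page_html = uClient.read()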
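As a hedged variation on the download step, the sketch below wraps it in a function that sets a timeout and a User-Agent header and closes the connection automatically. The helper names build_request and fetch_html and the User-Agent string are illustrative assumptions, not part of the original script. Whether the site needs a browser-like User-Agent is also an assumption.

```python
from urllib.request import Request, urlopen

MY_URL = "https://www.newegg.com/Video-Cards-Video-Devices/Category/ID-38?Tpk=graphic%20card"


def build_request(url, user_agent="Mozilla/5.0 (compatible; example-scraper)"):
    """Build a Request carrying an explicit User-Agent header (hypothetical helper)."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, timeout=10):
    """Download the page body as bytes; the with-block closes the connection."""
    with urlopen(build_request(url), timeout=timeout) as response:
        return response.read()


# Usage (performs a real network request, so it is left commented out):
# page_html = fetch_html(MY_URL)
```

With this shape there is no separate uClient.close() call to forget, and a stalled connection raises an error after the timeout instead of hanging.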

Then we use soup to parse the HTML page and store the result in page_soup. To know which div contains the product information, we need to inspect the web page and find its class name. In this case, we want to find all divs that have the class name "item-container". containers is a list that contains all of the product info on this HTML page.

uClient.close()
page_soup = soup(page_html, "html.parser")
containers = page_soup.findAll("div", {"class":"item-container"})
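To see what the class lookup returns without hitting the network, here is a minimal sketch that runs the same query against a small inline HTML string. The markup in SAMPLE_HTML is invented for illustration and is much simpler than the real page. find_all is the current spelling of findAll in bs4.

```python
from bs4 import BeautifulSoup as soup

# Hypothetical markup that only mimics the class names used in this walk-through
SAMPLE_HTML = """
<div class="item-container"><a class="item-title">Example Card 8GB</a></div>
<div class="item-container featured"><a class="item-title">Example Card 4GB</a></div>
<div class="banner">Not a product</div>
"""

page_soup = soup(SAMPLE_HTML, "html.parser")

# Matches every div whose class list includes "item-container",
# so the div with the extra "featured" class is found as well
containers = page_soup.find_all("div", {"class": "item-container"})

for container in containers:
    print(container.a.text)
```

Only the two product divs end up in containers; the banner div is skipped because its class does not match.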

In the for loop, we iterate over each individual item and read its brand, title, and shipping info.

for container in containers:
	brand_container = container.findAll("a", {"class":"item-brand"})
	brand = brand_container[0].img["title"]
	title_container = container.findAll("a", {"class":"item-title"})
	product_name = title_container[0].text
	shipping_container = container.findAll("li", {"class":"price-ship"})
	shipping_price = shipping_container[0].text.strip()
	print("brand : ",brand)
	print("name : ",product_name)
	print("shipping price : ",shipping_price)
	print("--------------------")
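The loop above indexes [0] on every lookup, which raises an IndexError when a product has no brand or shipping element. The sketch below is one possible way to tolerate missing elements and to sort the results by brand. The helper extract_product, the sample markup, and the sort order are assumptions for illustration, not the original script's behavior.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the second product has no brand and no shipping info
SAMPLE_HTML = """
<div class="item-container">
  <a class="item-brand"><img title="Zeta"/></a>
  <a class="item-title">Zeta Card 8GB</a>
  <ul><li class="price-ship">Free Shipping</li></ul>
</div>
<div class="item-container">
  <a class="item-title">Unbranded Card 4GB</a>
</div>
<div class="item-container">
  <a class="item-brand"><img title="Alpha"/></a>
  <a class="item-title">Alpha Card 6GB</a>
  <ul><li class="price-ship">$4.99 Shipping</li></ul>
</div>
"""


def extract_product(container):
    """Return brand, name, and shipping for one container, using None for gaps."""
    brand_link = container.find("a", {"class": "item-brand"})
    brand_img = brand_link.img if brand_link else None
    title_link = container.find("a", {"class": "item-title"})
    shipping_item = container.find("li", {"class": "price-ship"})
    return {
        "brand": brand_img.get("title") if brand_img else None,
        "name": title_link.text.strip() if title_link else None,
        "shipping": shipping_item.text.strip() if shipping_item else None,
    }


page_soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
products = [extract_product(c) for c in page_soup.find_all("div", {"class": "item-container"})]

# Sort alphabetically by brand, placing products without a brand last
products.sort(key=lambda p: (p["brand"] is None, p["brand"] or ""))

for product in products:
    print("brand : ", product["brand"])
    print("name : ", product["name"])
    print("shipping price : ", product["shipping"])
    print("--------------------")
```

Returning a dictionary per product keeps the parsing separate from the printing, so the same list could later be written to a file instead.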

🚀 Result

[Images: program output]

⛏️ Built Using

  • [Python] - Programming Language

License

🌱 MIT 🌱

web-scraping-newegg's People

Contributors

chenhunluo321
