py-ecommerce-selenium-scraping's Introduction

Ecommerce selenium scraping

Task

This time you will implement a scraper for the E-commerce test site. It is similar to the site in the video, but with a few changes. First, you need to scrape and parse information about all products on all pages.

The pages are:

  • home page (3 random products);
  • computers page (3 random computers);
  • laptops page (117 laptops) with more button pagination;
  • tablets page (21 tablets) with more button pagination;
  • phones page (3 random phones);
  • touch page (9 touch phones) with more button pagination.

All of these pages should be scraped, and the product data from each page should be written to the corresponding .csv file. For example: home page -> home.csv, touch page -> touch.csv. Some pages show random products, so the tests only check the content of the 3 constant pages (laptops, tablets and touch). A class template for Product is in app/parse.py.
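As a minimal sketch of the "one page, one .csv file" step: the Product fields below are assumptions made for illustration (the real template is in app/parse.py and may differ), and write_products_to_csv is a hypothetical helper name. The header row is derived from the dataclass itself, so the file layout follows whatever fields the template defines.

```python
import csv
from dataclasses import astuple, dataclass, fields
from typing import List


@dataclass
class Product:
    # Assumed fields for illustration only; check app/parse.py for the real ones.
    title: str
    description: str
    price: float
    rating: int
    num_of_reviews: int


def write_products_to_csv(path: str, products: List[Product]) -> None:
    """Write a header row taken from the dataclass, then one row per product."""
    with open(path, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow([field.name for field in fields(Product)])
        writer.writerows(astuple(product) for product in products)
```

Calling write_products_to_csv("home.csv", products) would then produce the file for the home page.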

So, your task is to implement the get_all_products function, which saves all 6 pages to the corresponding .csv files with correct product data.
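One possible shape for that function is a single loop over a page table, sketched below. The relative paths in PAGES are assumptions and must be checked against the actual site; scrape_all_pages, parse_page and save are hypothetical names. The parsing and saving steps are passed in as callables so the loop itself contains no page-specific code.

```python
from typing import Callable, Dict, List

# Assumed mapping: CSV base name -> path relative to the site root.
# These paths are NOT given in the task text; verify them on the site.
PAGES: Dict[str, str] = {
    "home": "",
    "computers": "computers",
    "laptops": "computers/laptops",
    "tablets": "computers/tablets",
    "phones": "phones",
    "touch": "phones/touch",
}


def scrape_all_pages(
    parse_page: Callable[[str], List],
    save: Callable[[str, List], None],
) -> None:
    """Parse every page with the same logic and save it as <name>.csv."""
    for name, relative_path in PAGES.items():
        products = parse_page(relative_path)
        save(f"{name}.csv", products)
```

get_all_products could then create the driver once, call a loop like this one with its own parsing and saving functions, and quit the driver afterwards.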

Optional Task

  1. Run Selenium without opening a browser;
  2. Add comprehensive progress annotations for the scraping process.
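Both optional tasks could look roughly like the sketch below. It assumes Chrome is the browser in use; the function names are hypothetical, the window size is an arbitrary choice, and the `--headless=new` flag applies to recent Chrome versions (older ones use plain `--headless`). Selenium and tqdm are imported lazily, so the module still imports when they are not installed.

```python
from typing import Iterable, List


def headless_arguments() -> List[str]:
    """Command-line flags that ask Chrome to run without a visible window."""
    return ["--headless=new", "--window-size=1920,1080"]


def make_headless_driver():
    """Create a Chrome driver with the headless flags applied."""
    from selenium import webdriver  # lazy import: needs selenium installed

    options = webdriver.ChromeOptions()
    for argument in headless_arguments():
        options.add_argument(argument)
    return webdriver.Chrome(options=options)


def with_progress(items: Iterable, description: str) -> Iterable:
    """Wrap an iterable in a tqdm progress bar, or return it as-is without tqdm."""
    try:
        from tqdm import tqdm
    except ImportError:
        return items
    return tqdm(items, desc=description)
```

For example, `for name in with_progress(page_names, "Scraping pages"): ...` shows one progress step per page, where page_names is whatever list of pages the scraper iterates over.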

Hints:

  • Do not copy-paste code to scrape different pages;
  • Write general logic for parsing a single page;
  • Be aware of the accept cookies button while developing; a possible fix is simply to click it when it appears;
  • Sometimes you need to wait a bit while the driver finishes acting after an event;
  • Make your code as clean as possible;
  • Optional task №1: read about "headless" mode;
  • Optional task №2: read about the tqdm library.
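The cookie banner and the "more" button hints can share one small loop, sketched here under stated assumptions. click_while_available and visible_element_finder are hypothetical helpers, the fixed pause is the simplest form of "wait a bit" (Selenium's explicit waits are an alternative), and any CSS selectors passed in must be taken from the real page, since none are given in the task.

```python
import time
from typing import Callable, Optional


def click_while_available(
    find_clickable: Callable[[], Optional[object]],
    pause: float = 0.2,
    max_clicks: int = 1000,
) -> int:
    """Click the element returned by find_clickable until it returns None.

    Returns the number of clicks performed. max_clicks guards against a
    button that never disappears.
    """
    clicks = 0
    while clicks < max_clicks:
        element = find_clickable()
        if element is None:
            break
        element.click()
        clicks += 1
        time.sleep(pause)  # give the page time to load the next batch
    return clicks


def visible_element_finder(driver, css_selector: str) -> Callable[[], Optional[object]]:
    """Build a finder that returns the first displayed match, or None."""
    from selenium.webdriver.common.by import By  # lazy import: needs selenium

    def find() -> Optional[object]:
        for element in driver.find_elements(By.CSS_SELECTOR, css_selector):
            if element.is_displayed():
                return element
        return None

    return find
```

With a real driver, the cookie banner could be dismissed with `click_while_available(visible_element_finder(driver, cookie_selector), max_clicks=1)`, and a paginated page expanded with the same call using the "more" button's selector and the default limit. Both selector variables are placeholders to fill in after inspecting the page.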
