carp

Repository Introduction

This repository wraps Selenium in a second layer of encapsulation, so that one shared code base can crawl different websites and run different task types.

It currently provides:

  1. Logging
  2. Debugging
  3. A new robot only needs to implement two functions
  4. Screenshots on exception
  5. An email API for sending notifications to yourself
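
As a rough illustration of the exception screenshot feature, the sketch below wraps a task function so that any exception is logged and a screenshot is saved before the error is re-raised. The names screenshot_on_error and FakeDriver are hypothetical and not taken from this repository. The only Selenium behaviour assumed is WebDriver's save_screenshot(filename) method, and a stand-in driver is used so the snippet runs without a browser.

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("carp.sketch")


def screenshot_on_error(path="error.png"):
    """Decorator: log the exception and save a screenshot, then re-raise."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(driver, *args, **kwargs):
            try:
                return func(driver, *args, **kwargs)
            except Exception:
                # Record the traceback, then capture the browser state.
                logger.exception("task %s failed", func.__name__)
                driver.save_screenshot(path)  # same method name as Selenium's WebDriver
                raise
        return wrapper
    return decorator


class FakeDriver:
    """Hypothetical stand-in for a Selenium WebDriver, for demonstration only."""
    def __init__(self):
        self.screenshots = []

    def save_screenshot(self, filename):
        # Remember the requested file name instead of writing an image.
        self.screenshots.append(filename)
        return True


@screenshot_on_error(path="failed_task.png")
def failing_task(driver):
    raise ValueError("element not found")


@screenshot_on_error()
def passing_task(driver):
    return "ok"
```

With a real Selenium driver in place of FakeDriver, the same decorator would write an actual image file to the given path.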

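Currently, this repository contains tasks for these sites:

  1. 下厨房 (Xiachufang, a recipe site)
  2. 心食谱 (Xinshipu, a recipe site)
  3. 美食天下 (Meishi Tianxia, a food and recipe site)
  4. 果蔬网 (Guoshu Wang, literally "Fruit and Vegetable Net")
  5. 东方财经 (Dongfang Caijing, literally "Eastern Finance")
  6. 12306 (China's railway ticketing site)
    Note: The sites listed above are not necessarily the exact URLs that the corresponding tasks crawl.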

Install

This project uses Python, Git, and Chrome. Install them first if you don't have them locally.

git clone https://github.com/touero/carp.git

Usage

Using a Python virtual environment is recommended:

python -m venv venv

Activate the virtual environment:

source ./venv/bin/activate # Unix

.\venv\Scripts\activate # Windows

Install the packages:

pip install -r requirements.txt

Leave the virtual environment:

deactivate # Unix 

.\venv\Scripts\deactivate.bat # Windows

Run

  • default_config in run.py configures the tasks.
  • The default email config is in config/smtp.yaml.
  • To use the email feature, create your own YAML file.

python run.py

As you can see, the repository covers relatively few sites so far. Contributions are welcome, provided they follow Python's PEP 8 style guide.

To add your own robot, follow these steps:

  1. Create ***_robot.py in the robots package.
  2. Add the task's type and the task's URL in constants.py.
  3. Register the items from steps 1 and 2 in robots and urls in RobotMaster.
  4. Override the __str__ and run_task functions in your robot.
  5. To use the email API, set email=True in local_runner and fill in your config/smtp.yaml.
  6. Set task_type in local_runner and run it.
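
The procedure above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the repository's actual code: TaskType, TASK_URLS, BaseRobot, ExampleRobot, ROBOTS and run are all hypothetical names, the URL is a placeholder, and the real base class, constants and RobotMaster may be organised differently. No browser is started here. run_task only returns a string so the flow is easy to follow.

```python
from abc import ABC, abstractmethod
from enum import Enum


class TaskType(Enum):
    """Hypothetical task types; the real ones would live in constants.py."""
    EXAMPLE = 1


# Hypothetical mapping of task type to start URL, also a constants.py concern.
TASK_URLS = {TaskType.EXAMPLE: "https://example.com"}


class BaseRobot(ABC):
    """Assumed base class: holds the URL and defines the two hooks to override."""
    def __init__(self, url):
        self.url = url

    @abstractmethod
    def __str__(self):
        ...

    @abstractmethod
    def run_task(self):
        ...


class ExampleRobot(BaseRobot):
    """Would live in its own file such as robots/example_robot.py."""
    def __str__(self):
        return "ExampleRobot"

    def run_task(self):
        # A real robot would drive the browser here; this one only reports the URL.
        return f"{self} visiting {self.url}"


# Hypothetical registry standing in for the robots mapping in RobotMaster.
ROBOTS = {TaskType.EXAMPLE: ExampleRobot}


def run(task_type):
    """Look up the robot class and URL for a task type, then run the task."""
    robot = ROBOTS[task_type](TASK_URLS[task_type])
    return robot.run_task()
```

Calling run(TaskType.EXAMPLE) instantiates ExampleRobot with its registered URL and returns the result of run_task, which mirrors choosing a task_type in local_runner.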

Example command when using the email API:

python run.py -y config/smtp.yaml
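
How run.py handles the -y option is not shown in this README. The sketch below is one plausible way to parse such a flag with the standard argparse module. The long option name --yaml, the attribute name smtp_yaml, and the rule that email stays off when the flag is omitted are assumptions made for illustration. Reading the YAML file itself would additionally need a YAML parser, which is left out here.

```python
import argparse


def parse_args(argv=None):
    """Parse a -y option pointing at a YAML config file (long name is an assumption)."""
    parser = argparse.ArgumentParser(description="Run a crawl task.")
    parser.add_argument(
        "-y", "--yaml",
        dest="smtp_yaml",
        default=None,
        help="path to an SMTP YAML config; email stays off when omitted",
    )
    return parser.parse_args(argv)


# Equivalent to: python run.py -y config/smtp.yaml
args = parse_args(["-y", "config/smtp.yaml"])
email_enabled = args.smtp_yaml is not None
```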

Related Repository

  • Python — All Algorithms implemented in Python.
  • Selenium — A browser automation framework and ecosystem.

Related Driver Download

Currently, only the Chrome driver is supported. If you add support for another driver, please submit a PR.

Maintainers

@touero

Contributing

How I wish I could add more content to this repo!

Feel free to dive in! Open an issue or submit PRs.

Python code should follow the PEP 8 style guide.

Contributors

This project exists thanks to all the people who contribute.

License

GNU General Public License v3.0

carp's People

Contributors

linghaoshiyan, touero, wangpengkaireal
