dalianmao's Introduction

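DaLianMao (大脸猫)

DaLianMao is a web crawler framework built on aiohttp, uvloop, and BeautifulSoup. Its structure resembles Scrapy and its syntax resembles Flask. The current speed bottleneck is the URL de-duplication algorithm, which will later be rewritten in Cython. (A multi-process, distributed version, tentatively named "蓝皮鼠" (LanPiShu), is planned for after the author finishes a graduation thesis and finds a job.)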
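To make the URL de-duplication step concrete, here is a minimal sketch of one common approach: keep a set of hashed, normalized URLs and skip anything already seen. This is not DaLianMao's actual algorithm, which the README does not describe; the class name SeenUrls and its add method are hypothetical.

```python
import hashlib
from urllib.parse import urldefrag


class SeenUrls:
    """Set-based URL de-duplication (illustrative; not DaLianMao's own code)."""

    def __init__(self):
        # Store fixed-size digests instead of full URL strings.
        self._digests = set()

    def add(self, url: str) -> bool:
        """Return True if the URL is new, False if it was already seen."""
        # Drop the #fragment, since it does not change the fetched page.
        normalized, _fragment = urldefrag(url.strip())
        digest = hashlib.sha1(normalized.encode("utf-8")).digest()
        if digest in self._digests:
            return False
        self._digests.add(digest)
        return True
```

A crawler loop would call add() before scheduling each link and fetch only when it returns True. Every lookup hashes the URL in pure Python, which is the kind of per-URL work a Cython rewrite could plausibly speed up.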

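Dependencies

The framework depends on uvloop, aiohttp, aiofiles, Motor, BeautifulSoup, and other packages outside the Python standard library. Make sure they are installed before use; pip installs them automatically.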
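If the packages were installed by hand, a quick check like the following can confirm they are importable. It is a sketch only: the helper name missing_modules is hypothetical, and the import names are assumptions based on how these packages are usually imported (BeautifulSoup as bs4, Motor as motor).

```python
from importlib.util import find_spec

# Assumed import names for the dependencies listed above.
REQUIRED_MODULES = ["uvloop", "aiohttp", "aiofiles", "motor", "bs4"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```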

Installation

  • pip install dalianmao
  • Or download the source and install it with python setup.py install

Usage

DaLianMao's __init__.py exposes three classes: Executor, Options, and DaLianMao. A new crawler must import Options and DaLianMao:
from dalianmao import Options, DaLianMao

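  • Executor is ProcessPoolExecutor from the standard library's concurrent.futures. Use it through DaLianMao.run_in_executor(Executor, func); if func takes arguments, wrap it with functools.partial.
  • Options holds the configuration. It sets the crawler's name, name: str, and its initial links, start_urls: list. Both must be given when Options is initialized, for example:
    options = Options(name='wandoujia', start_urls=['http://www.wandoujia.com/apps', ])
    The remaining settings, which have defaults, will be described later.
  • DaLianMao creates the crawler object; it only needs to be passed the Options object:
    app = DaLianMao(options)
    To be continued...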
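The Executor and functools.partial pattern described above can be sketched with the standard library alone. The example below uses asyncio's own loop.run_in_executor, not DaLianMao.run_in_executor; that the framework method behaves the same way is an assumption, and the helper name run_blocking is hypothetical.

```python
import asyncio
import math
from concurrent.futures import ProcessPoolExecutor
from functools import partial


async def run_blocking(executor, func, *args, **kwargs):
    """Run func(*args, **kwargs) in the executor and await its result."""
    loop = asyncio.get_running_loop()
    # run_in_executor forwards positional arguments only, so bind
    # everything up front with functools.partial.
    return await loop.run_in_executor(executor, partial(func, *args, **kwargs))


async def main():
    # ProcessPoolExecutor is the class the README says Executor refers to.
    with ProcessPoolExecutor(max_workers=2) as pool:
        return await run_blocking(pool, math.gcd, 12, 18)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

With a process pool, the callable and its arguments must be picklable, so func should be a module-level function rather than a lambda or a nested function.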

dalianmao's People

Contributors

zengjianxin
