Code Monkey home page Code Monkey logo

grey_harvest's Introduction

grey_harvest

Scrapes the web for reliable http/https proxies and tests them for speed and reliability. Can be used as both a python module and a command line utility. When run as a command line utility, proxies are sent to stdout. When run as a module, it returns a generator.

Check out the project on PyPi at https://pypi.python.org/pypi/grey_harvest/0.1.3.5

Key Features

  • Quickly and easily generate a list of reliable http/https proxies
  • Usable as a comamnd line utility or a python module
  • Can filter for proxies that support SSL
  • Can filter for proxies locationed within specific countries
  • Can exclude proxies located within specific countries

Installation

First, install the following dependencies:

# On Centos/RHEL/Fedora:
sudo yum install python-devel libxlt-devel libxml-devel

# On Debian/Ubuntu:
sudo apt-get install python-dev libxml2-dev libxslt1-dev

Then install grey_harvest using pip as follows:

pip install grey_harvest

Usage

We can generate a list of 10 viable proxies with the following command:

# use the -n flag to specify number of proxies to generate
grey_harvest -n 10

To select only proxies with SSL enabled, we do this:

# use the -H flag to select only https proxies
grey_harvest -n 10 -H

We can use the -a flag to filter for proxies located within a list of specific countries. For example, to choose proxies located within Ukraine, Hong Kong, and the United States, we'd use this:

# use the -a flag to filter by country
grey_harvest -a "United States" "Hong Kong" Ukraine -n 10

We can use the -p flag to filter for ports running on specific ports:

# the -p flag to only use proxies that run on port 80
grey_harvest -p 80 -n 10

We can deny proxies located within specific countries by using the -d flag. Proxies located within China are blocked by default as they are often located behind the Great Firewall, and as such tend to be unreliable. This can be changed within grey_harvest.py's internal configs.:

# use the -d flag to deny proxies located within France and
# Germany
grey_harvest -d France Germany -n 10 -H

grey_harvest library - basic example

Before diving into the documentation for the grey_harvest library, check out how easily we can generate a list of 20 proxies:

import gray_harvest

''' spawn a harvester '''
harvester = grey_harvest.GreyHarvester()

''' harvest some proxies from teh interwebz '''
count = 0
for proxy in harvester.run():
        print proxy
        count += 1
        if count >= 20:
                break

That's it. We now have 20 http/https proxies ready to go.

grey_harvest's People

Contributors

s0lst1c3 avatar ethinallen avatar magicknight avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.