This script automates searching for a website by keyword with the Bing
search engine, going through the results page after page.
Pass a complete URL and at least 1 keyword as command line arguments:
python proxy_crawler.py -u https://www.example.com -k keyword
python proxy_crawler.py -u https://www.whatsmyip.org -k "my ip"
On a Linux system, proxy_crawler can run headless. Pass the -x option (requires Xvfb):
python proxy_crawler.py -u https://www.whatsmyip.org -k "my ip" -x
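The options used above could be parsed roughly as follows. This is only a sketch with argparse: the flags -u, -k and -x come from the usage lines, while the function name build_parser and the attribute names url, keyword and headless are assumptions and may not match proxy_crawler.py.

```python
import argparse


def build_parser():
    # Hypothetical parser: only the flags -u, -k and -x are taken from
    # the usage lines; the dest names are assumptions for illustration.
    parser = argparse.ArgumentParser(prog="proxy_crawler.py")
    parser.add_argument("-u", dest="url", required=True,
                        help="complete URL of the website to find")
    parser.add_argument("-k", dest="keyword", required=True,
                        help="keyword, or a quoted search phrase")
    parser.add_argument("-x", dest="headless", action="store_true",
                        help="run headless (Linux with Xvfb only)")
    return parser


# Same arguments as the headless example; the shell strips the quotes,
# so a quoted phrase arrives as a single argument.
args = build_parser().parse_args(
    ["-u", "https://www.whatsmyip.org", "-k", "my ip", "-x"])
print(args.url, args.keyword, args.headless)
```

Because the shell passes a quoted phrase as one argument, "my ip" ends up as a single keyword string rather than two separate values.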
- It first scrapes a list of proxies from the web using SSL Proxies
- Then, using a new proxy socket for each iteration, it searches Bing for the specified keyword(s) until the desired website is found
- The website is then visited, and one random link is clicked within the website
- The bot is slowed down on purpose
- If searching with multiple keywords, wrap them in quotes: "example search phrase"
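The proxy rotation in the steps above can be sketched without a browser. This is a minimal illustration, not the script's actual code: parse_proxies, crawl and the attempt callable are hypothetical names, the "host:port" input format is an assumption, and one proxy per search attempt is only one reading of "each iteration".

```python
import random
import time


def parse_proxies(lines):
    """Keep only well-formed "host:port" entries from scraped text lines."""
    proxies = []
    for line in lines:
        host, sep, port = line.strip().partition(":")
        if host and sep and port.isdigit():
            proxies.append((host, int(port)))
    return proxies


def crawl(proxies, attempt, delay=(0.0, 0.0)):
    """Try each proxy in turn until attempt(proxy) reports success.

    attempt is a hypothetical callable standing in for the browser-driven
    Bing search; it returns True once the desired website was found.
    """
    for proxy in proxies:
        if attempt(proxy):
            return proxy
        # Deliberate slowdown between attempts (random wait in seconds).
        time.sleep(random.uniform(*delay))
    return None


scraped = ["10.0.0.1:8080", "not a proxy", "10.0.0.2:3128"]
proxies = parse_proxies(scraped)
# Stand-in for a real search: pretend only the second proxy succeeds.
found = crawl(proxies, lambda proxy: proxy[1] == 3128)
print(found)
```

Passing the search step in as a callable keeps the rotation logic testable on its own; in real use, delay would be set to a non-zero range such as (2.0, 5.0) to slow the bot down.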
Along with Python 3 and geckodriver, the following are also required:
pip install selenium
apt-get install xvfb (Linux only)
proxy_crawler.py passes pep8/pycodestyle
I use this version of geckodriver on Ubuntu:
wget https://github.com/mozilla/geckodriver/releases/download/v0.26.0/geckodriver-v0.26.0-linux64.tar.gz
geckodriver should be extracted and saved somewhere in your PATH, e.g.:
/usr/local/bin
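A quick way to confirm that geckodriver is reachable through PATH is shutil.which from the Python standard library. The helper name find_driver is an assumption for this sketch and is not part of proxy_crawler.py.

```python
import shutil


def find_driver(name="geckodriver"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(name)


path = find_driver()
if path is None:
    print("geckodriver not found on PATH")
else:
    print("geckodriver found at", path)
```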
This was developed on Ubuntu 16.04.4 LTS with selenium/geckodriver and Firefox 60.0
Also tested on Ubuntu 18.04
Author: rootVIII 2018-2020