
instagramcrawler's Introduction

05/03/2019: This repo is now archived.

I am officially archiving this repo after a long period of, well, not maintaining it.


InstagramCrawler

A non-API Python program to crawl public photos, posts, followers, and following.

Login to crawl followers/following

To crawl followers or following, you will need to log in with your credentials, either by filling in 'auth.json' or by typing them in (as you would when simply browsing Instagram).

To use 'auth.json', copy 'auth.json.example' to 'auth.json' and fill in your username and password.
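A plausible minimal 'auth.json' (the exact key names are defined by 'auth.json.example' in the repo; the ones below are assumptions):

{
  "username": "your_instagram_username",
  "password": "your_instagram_password"
}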

PhantomJS for headless browsing

To use a headless browser, install phantomjs and add '-l' to the arguments, as in the example below.
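For example, combining flags documented in the usage section below:

$ python instagramcrawler.py -q '#breakfast' -n 50 -l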

Examples:

Download the first 100 photos and captions (from the user's posts, if any) of the username "instagram"

NOTE: when I ran this on the public account 'instagram', it somehow stopped at caption 29.
$ python instagramcrawler.py -q 'instagram' -c -n 100

Search for the hashtag "#breakfast" and download first 50 photos

$ python instagramcrawler.py -q '#breakfast' -n 50

Record the first 30 followers of the username "instagram" (requires login)

$ python instagramcrawler.py -q 'instagram' -t 'followers' -n 30 -a auth.json

Full usage:

usage: instagramcrawler.py [-h] [-d DIR] [-q QUERY] [-t CRAWL_TYPE] [-n NUMBER] [-c] [-a AUTHENTICATION] [-l] [-f FIREFOX_PATH]
  • [-d DIR]: the directory to save crawling results; default is './data/[query]'
  • [-q QUERY]: username, or prefix with '#' to search for hashtags, e.g. 'username', '#hashtag'
  • [-t CRAWL_TYPE]: crawl type; options: 'photos | followers | following'
  • [-n NUMBER]: number of posts, followers, or following to crawl
  • [-c]: add this flag to download captions (what the user wrote to describe their photos)
  • [-a AUTHENTICATION]: path to a JSON file containing your Instagram credentials; see 'auth.json'
  • [-l]: if set, uses the PhantomJS driver to run the script headless
  • [-f FIREFOX_PATH]: path to the binary (not the script) of Firefox on your system (see this issue in Selenium SeleniumHQ/selenium#3884 (comment))
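For reference, a minimal argparse sketch consistent with the usage string above; the long option names, defaults, and help strings are assumptions, not copied from the script:

import argparse

# Hypothetical reconstruction of the CLI; only the short flags are
# confirmed by the usage string above.
parser = argparse.ArgumentParser(prog="instagramcrawler.py")
parser.add_argument("-d", "--dir", dest="dir_prefix", default="./data/",
                    help="directory to save crawling results")
parser.add_argument("-q", "--query",
                    help="username, or '#hashtag' to search hashtags")
parser.add_argument("-t", "--crawl_type", default="photos",
                    choices=["photos", "followers", "following"])
parser.add_argument("-n", "--number", type=int,
                    help="number of posts/followers/following to crawl")
parser.add_argument("-c", "--caption", action="store_true",
                    help="also download captions")
parser.add_argument("-a", "--authentication",
                    help="path to a JSON file with Instagram credentials")
parser.add_argument("-l", "--headless", action="store_true",
                    help="use the PhantomJS driver (headless)")
parser.add_argument("-f", "--firefox_path",
                    help="path to the Firefox binary, not the script")
args = parser.parse_args()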

Installation

There are two required packages: selenium and requests.

NOTE: I used selenium 3.4 and geckodriver 0.16 (which fixed a bug present in earlier versions).

$ pip install -r requirements.txt

Optional: fetch geckodriver and phantomjs if they are not already on your system:

$ bash utils/get_gecko.sh
$ bash utils/get_phantomjs.sh
$ source utils/set_path.sh
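To verify which versions your system actually resolves:

$ pip show selenium
$ geckodriver --version
$ phantomjs --version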

instagramcrawler's People

Contributors

jopasserat, leabdalla, shangyusu, tzuhsial


instagramcrawler's Issues

Timeout error

I'm getting this error when I run the script (it also happens with the python command):

python3 instagramcrawler.py -q '#breakfast' -n 20 -a auth.json
Traceback (most recent call last):
  File "instagramcrawler.py", line 360, in <module>
    main()
  File "instagramcrawler.py", line 350, in main
    crawler = InstagramCrawler(headless=args.headless, firefox_path=args.firefox_path)
  File "instagramcrawler.py", line 72, in __init__
    self._driver.implicitly_wait(10)
  File "/home/mestre/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 719, in implicitly_wait
    'implicit': int(float(time_to_wait) * 1000)})
  File "/home/mestre/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 249, in execute
    self.error_handler.check_response(response)
  File "/home/mestre/.local/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: timeouts

Erratic but very common timeout exception

Frequently get this error when storing captions. Doesn't seem to be an issue when just taking images.

Traceback (most recent call last):
  File "instagramcrawler.py", line 341, in <module>
    main()
  File "instagramcrawler.py", line 337, in main
    authentication=args.authentication)
  File "instagramcrawler.py", line 114, in crawl
    self.click_and_scrape_captions(number)
  File "instagramcrawler.py", line 221, in click_and_scrape_captions
    EC.presence_of_element_located((By.TAG_NAME, "time"))
  File "/home/jake/.virtualenvs/insta/lib/python3.5/site-packages/selenium/webdriver/support/wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: 

Firefox path not solved, and geckodriver executable needs to be in PATH

I tried the program, changing the path to "C:\Program Files\Mozilla Firefox" in instagramcrawler.py, but when I run the program I get the message below. I am on Windows 10 with Firefox 59. Any idea how to fix this?

Traceback (most recent call last):
  File "instagramcrawler.py", line 360, in <module>
    main()
  File "instagramcrawler.py", line 350, in main
    crawler = InstagramCrawler(headless=args.headless, firefox_path="C:\Program Files\Mozilla Firefox")
  File "instagramcrawler.py", line 70, in __init__
    self._driver = webdriver.Firefox(firefox_binary=binary)
  File "C:\Users\Santo Wijaya\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\firefox\webdriver.py", line 142, in __init__
    self.service.start()
  File "C:\Users\Santo Wijaya\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\common\service.py", line 81, in start
    os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'geckodriver' executable needs to be in PATH.
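A workaround sketch for both problems at once: the -f value must be the Firefox binary itself (firefox.exe), not the installation folder, and geckodriver can be pointed to explicitly instead of relying on PATH. Both paths below are examples, not your actual installs:

from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary

# Example paths; adjust to your system. Selenium 3 API.
binary = FirefoxBinary(r"C:\Program Files\Mozilla Firefox\firefox.exe")
driver = webdriver.Firefox(
    firefox_binary=binary,
    executable_path=r"C:\WebDriver\geckodriver.exe")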

error

On Windows 7 64-bit, running the following command:

python instagramcrawler.py -q 'nude_yogagirl' -n 20

the output is:

d:\python\InstagramCrawler-master>python instagramcrawler.py -q 'nude_yogagirl' -n 20
Traceback (most recent call last):
File "instagramcrawler.py", line 297, in
main()
File "instagramcrawler.py", line 291, in main
crawler = InstagramCrawler()
File "instagramcrawler.py", line 58, in init
self._driver = webdriver.Firefox()
File "D:\Program Files\Python361\lib\site-packages\selenium\webdriver\firefox\webdriver.py", line 152, in init
keep_alive=True)
File "D:\Program Files\Python361\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 98, in init
self.start_session(desired_capabilities, browser_profile)
File "D:\Program Files\Python361\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 185, in start_session

response = self.execute(Command.NEW_SESSION, parameters)

File "D:\Program Files\Python361\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 249, in execute
self.error_handler.check_response(response)
File "D:\Program Files\Python361\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 194, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: Unable to find a matching set of apabilities

Error when scraping captions

Hey, so far I've crawled followers smoothly, but I have two issues:

  1. I get this when I try to crawl the captions:

     python instagramcrawler.py -d data -q 'viralnova365' -c -n 10
     dir_prefix: data, query: viralnova365, crawl_type: photos, number: 10, caption: True
     posts: 1660, number: 10
     Scraping photo links...
     Number of photo_links: 25
     Scraping captions...
     Traceback (most recent call last):
       File "instagramcrawler.py", line 297, in <module>
         main()
       File "instagramcrawler.py", line 293, in main
         caption=args.caption)
       File "instagramcrawler.py", line 85, in crawl
         self.click_and_scrape_captions(number)
       File "instagramcrawler.py", line 161, in click_and_scrape_captions
         FIREFOX_FIRST_POST_PATH).click()
       File "InstagramCrawler/crawl/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 313, in find_element_by_xpath
         return self.find_element(by=By.XPATH, value=xpath)
       File "InstagramCrawler/crawl/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 791, in find_element
         'value': value})['value']
       File "InstagramCrawler/crawl/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 256, in execute
         self.error_handler.check_response(response)
       File "InstagramCrawler/crawl/lib/python3.4/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
         raise exception_class(message, screen, stacktrace)
     selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: //a[contains(@Class, '_8mlbc _vbtk2 _t5r8b')]

  2. I would also like to crawl all the images, but it never downloads the number specified by -n. Do you have any suggestions?

Only crawls 1 following or follower

I only get 1 following or follower when -n is set to 20.
I analysed the HTML of Instagram and found something wrong with line 273: num_of_shown_follow = len(List.find_elements_by_xpath('*'))
On Instagram, the structure looks like this:

<ul>
  <div>
    <li>follow info</li>
    ...
    <li>another follow info</li>
  </div>
</ul>

Here the element List is the ul node, and we want the li nodes, but line 273 gets the div node.
I changed line 273 to List.find_elements_by_tag_name('li') and that solved the problem.
Hope this helps others.
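A sketch of the fix described above, written against the Selenium 3 API the script uses; the CSS selector for the follower list is an assumption:

from selenium import webdriver

driver = webdriver.Firefox()
# ... log in and open the followers dialog first ...

# The <ul> contains a wrapper <div>, so counting its direct children via
# find_elements_by_xpath('*') always returns 1; count the <li> rows instead.
follow_list = driver.find_element_by_css_selector("div[role='dialog'] ul")
num_of_shown_follow = len(follow_list.find_elements_by_tag_name("li"))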

Timeout exception

I am getting the following error. The CSS selector needs to be updated from what is in the code (it is now "a._1cr2e _epyes"), but that still does not solve it for me.

Any insights much appreciated!

Traceback (most recent call last):
  File "C:/Users/QS-2 SARAH/Desktop/IG/IGcrawler.py", line 360, in <module>
    main()
  File "C:/Users/QS-2 SARAH/Desktop/IG/IGcrawler.py", line 356, in main
    authentication=args.authentication)
  File "C:/Users/QS-2 SARAH/Desktop/IG/IGcrawler.py", line 119, in crawl
    self.scroll_to_num_of_posts(number)
  File "C:/Users/QS-2 SARAH/Desktop/IG/IGcrawler.py", line 171, in scroll_to_num_of_posts
    (By.CSS_SELECTOR, CSS_LOAD_MORE))
  File "C:\Users\QS-2 SARAH\Anaconda2\lib\site-packages\selenium\webdriver\support\wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
TimeoutException
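If both tokens in "a._1cr2e _epyes" are classes on the same anchor, the compound form (no space) would be the selector to try; this is a guess, and class names like these rot whenever Instagram ships a new frontend:

# In instagramcrawler.py, near the other selector constants:
CSS_LOAD_MORE = "a._1cr2e._epyes"  # one <a> carrying both classes (assumed)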

Error on scraping followers

Works fine but scraping followers gives me:

Scraping followers...
Traceback (most recent call last):
  File "instagramcrawler.py", line 302, in
    caption='None')
  File "instagramcrawler.py", line 96, in crawl
    self.scrape_followers_or_following(crawl_type, query, number)
  File "instagramcrawler.py", line 214, in scrape_followers_or_following
    title = self._driver.find_element_by_xpath(FOLLOW_PATH)
  File "...\selenium\webdriver\remote\webdriver.py", line 313, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "...\selenium\webdriver\remote\webdriver.py", line 791, in find_element
    'value': value})['value']
  File "...\selenium\webdriver\remote\webdriver.py", line 256, in execute
    self.error_handler.check_response(response)
  File "...\selenium\webdriver\remote\errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: //div[contains(text(), 'Followers')]

Any idea? Thanks for your help.

AttributeError: 'NoneType' object has no attribute 'group'

When running just a basic search on the Instagram profile with python instagramcrawler.py -q instagram -t photos -n 100 -l, I get the following error:

Traceback (most recent call last):
  File "instagramcrawler.py", line 360, in <module>
    main()
  File "instagramcrawler.py", line 356, in main
    authentication=args.authentication)
  File "instagramcrawler.py", line 117, in crawl
    self.scroll_to_num_of_posts(number)
  File "instagramcrawler.py", line 161, in scroll_to_num_of_posts
    self._driver.page_source).group()
AttributeError: 'NoneType' object has no attribute 'group'

Any idea on what I'm doing wrong?

PhantomJS version 1.9.8, Ubuntu 16.04.
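The failing line calls .group() on the result of re.search over the page source; when Instagram changes its markup, the pattern stops matching and re.search returns None. A defensive sketch, with an assumed pattern since the script's real regex is not shown here:

import re

# Hypothetical pattern; the real one lives in instagramcrawler.py.
POSTS_RE = re.compile(r'"count":\s*(\d+)')

page_source = '{"count": 2140}'  # stand-in for driver.page_source
match = POSTS_RE.search(page_source)
if match is None:
    raise RuntimeError("post-count pattern not found; markup changed?")
num_of_posts = int(match.group(1))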

TimeoutException

Hi
When we run the script in PyCharm on Windows, we get this exception:


C:\Python27\python.exe C:/Users/Vahid/PycharmProjects/instagram/instagramcrawler.py -q instagram -t photos -c -n 100
Number to crawl 100
Traceback (most recent call last):
  File "C:/Users/Vahid/PycharmProjects/instagram/instagramcrawler.py", line 314, in main
    crawler.browse(args.query,args.type).crawl(args.number,args.caption).save()
  File "C:/Users/Vahid/PycharmProjects/instagram/instagramcrawler.py", line 127, in crawl
    self.captions = self._crawl_captions()
  File "C:/Users/Vahid/PycharmProjects/instagram/instagramcrawler.py", line 234, in _crawl_captions
    EC.presence_of_element_located((By.CSS_SELECTOR,CSS_RIGHT_ARROW))
  File "C:\Python27\lib\site-packages\selenium\webdriver\support\wait.py", line 81, in until
    raise TimeoutException(message, screen, stacktrace)
TimeoutException: Message:  

How can we solve the problem?
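Not a root-cause fix, but a common stopgap is to wait longer for the element and tolerate its absence instead of letting TimeoutException propagate. CSS_RIGHT_ARROW is the constant name from the traceback; its value below is a placeholder:

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

CSS_RIGHT_ARROW = "a.right-arrow"  # placeholder; copy the real value from the script

driver = webdriver.Firefox()
driver.get("https://www.instagram.com/instagram/")
try:
    WebDriverWait(driver, 30).until(  # 30s instead of the script's default
        EC.presence_of_element_located((By.CSS_SELECTOR, CSS_RIGHT_ARROW)))
except TimeoutException:
    print("Selector never appeared; it is probably stale.")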

How does this work?

Hi, can anyone tell me how to use this? I'm new to this and can't seem to get it working. I'd appreciate it if anyone could help. Thanks!

Followers crawling

Hi guys, I was able to crawl followers on Instagram after modifying some lines in instagramcrawler.py. Now I am facing a problem when the number of followers is over 1000: the crawler scrolls down the followers page, but the page freezes in a loading state, so after some time the crawler simply quits the query.
Have you faced this issue before? If so, how can it be fixed?

#168 loadmore.click() seems to be unstable

When I crawl different users, I often get only 13 photos even though '-n' is set to 200 and the users in fact have more than 13 photos.
So I think loadmore.click() on line 168 may not work well. Does anyone have the same problem?

Some info:
posts: 148, number: 200
Scraping photo links...
Number of photo_links: 13
Saving...
Downloading 12 images to ta/linyaudavy
Quitting driver...
headless mode on
dir_prefix: ./data/, query: hecs_510, crawl_type: photos, number: 200, caption: False, authentica
tion: auth.json
posts: 273, number: 200
Scraping photo links...
Number of photo_links: 13
Saving...
Downloading 12 images to ta/hecs_510
Quitting driver...
headless mode on
dir_prefix: ./data/, query: da1sun, crawl_type: photos, number: 200, caption: False, authenticati
on: auth.json
posts: 2140, number: 200
Scraping photo links...
Number of photo_links: 13
Saving...
Downloading 12 images to ta/da1sun
Quitting driver...

BadStatusLine: '' exceptions

Hi Guys,

Thanks for the great project, which I use to get the followers of a user. This works in about 50% of cases, but sometimes I get the following error:

Traceback (most recent call last):
  File "crawler_tuintjedelen.py", line 364, in main
    crawler.browse(args.query,args.type).crawl(args.number,args.caption).save()
  File "crawler_tuintjedelen.py", line 160, in crawl
    self.followlist = self._crawl_follow()
  File "crawler_tuintjedelen.py", line 319, in _crawl_follow
    self.driver.execute_script(SCROLL_DOWN)
  File "/home/makusu/.local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 465, in execute_script
    'args': converted_args})['value']
  File "/home/makusu/.local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 234, in execute
    response = self.command_executor.execute(driver_command, params)
  File "/home/makusu/.local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 408, in execute
    return self._request(command_info[0], url, body=data)
  File "/home/makusu/.local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 478, in _request
    resp = opener.open(request, timeout=self._timeout)
  File "/usr/lib/python2.7/urllib2.py", line 429, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 447, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1228, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1201, in do_open
    r = h.getresponse(buffering=True)
  File "/usr/lib/python2.7/httplib.py", line 1136, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 453, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 417, in _read_status
    raise BadStatusLine(line)
BadStatusLine: ''

Exception urllib2.URLError: URLError(error(111, 'Connection refused'),) in <bound method InstagramCrawler.__del__ of <__main__.InstagramCrawler object at 0x7f81e3cf7bd0>> ignored

Any idea what could be going on? Running it on Ubuntu with PhantomJS.

Thanks!
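BadStatusLine means PhantomJS's local HTTP bridge dropped a response mid-command. A stopgap sketch (not a root-cause fix) is to retry the failing WebDriver call; the body of SCROLL_DOWN below is an assumption:

import time
from selenium import webdriver

SCROLL_DOWN = "window.scrollTo(0, document.body.scrollHeight);"  # assumed

driver = webdriver.PhantomJS()
driver.get("https://www.instagram.com/instagram/")
for attempt in range(3):
    try:
        driver.execute_script(SCROLL_DOWN)
        break
    except Exception:  # BadStatusLine lives in httplib (py2) / http.client (py3)
        time.sleep(2)
driver.quit()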

selenium.common.exceptions.WebDriverException: Message: Unable to find a matching set of capabilities

Do you know how can I fix this error?

[jalal@goku InstagramCrawler]$ python instagramcrawler.py -q '#breakfast' -n 50
Traceback (most recent call last):
  File "instagramcrawler.py", line 360, in <module>
    main()
  File "instagramcrawler.py", line 350, in main
    crawler = InstagramCrawler(headless=args.headless, firefox_path=args.firefox_path)
  File "instagramcrawler.py", line 70, in __init__
    self._driver = webdriver.Firefox(firefox_binary=binary)
  File "/home/grad3/jalal/.local/lib/python3.6/site-packages/selenium/webdriver/firefox/webdriver.py", line 152, in __init__
    keep_alive=True)
  File "/home/grad3/jalal/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 98, in __init__
    self.start_session(desired_capabilities, browser_profile)
  File "/home/grad3/jalal/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 185, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/home/grad3/jalal/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 249, in execute
    self.error_handler.check_response(response)
  File "/home/grad3/jalal/.local/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: Unable to find a matching set of capabilities

Also, how should I enter my username/password?
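'Unable to find a matching set of capabilities' usually indicates a Selenium/geckodriver/Firefox version mismatch. One frequently suggested workaround (not confirmed for this exact report) is to request Marionette explicitly:

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

caps = DesiredCapabilities.FIREFOX.copy()
caps["marionette"] = True  # force the geckodriver/Marionette protocol
driver = webdriver.Firefox(capabilities=caps)

As for credentials, they go in the JSON file passed via -a (see the auth.json example near the top of this page).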

Cannot crawl data

D:\Development\python\InstagramCrawler-master>python instagramcrawler.py -q daniel.hoi23s -c -n 100
dir_prefix: ./data/, query: daniel.hoi23s, crawl_type: photos, number: 100, caption: True
posts: 394, number: 100
Scraping photo links...
Number of photo_links: 27
Scraping captions...
Traceback (most recent call last):
  File "instagramcrawler.py", line 297, in <module>
    main()
  File "instagramcrawler.py", line 293, in main
    caption=args.caption)
  File "instagramcrawler.py", line 85, in crawl
    self.click_and_scrape_captions(number)
  File "instagramcrawler.py", line 161, in click_and_scrape_captions
    FIREFOX_FIRST_POST_PATH).click()
  File "C:\Users\aaaaa\AppData\Local\Programs\Python\Python35\lib\site-packages\selenium\webdriver\remote\webelement.py", line 77, in click
    self._execute(Command.CLICK_ELEMENT)
  File "C:\Users\aaaaa\AppData\Local\Programs\Python\Python35\lib\site-packages\selenium\webdriver\remote\webelement.py", line 493, in _execute
    return self._parent.execute(command, params)
  File "C:\Users\aaaaa\AppData\Local\Programs\Python\Python35\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 249, in execute
    self.error_handler.check_response(response)
  File "C:\Users\aaaaa\AppData\Local\Programs\Python\Python35\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.ElementNotInteractableException: Message:

============================================

geckodriver.log:
1496170001211 geckodriver INFO Listening on 127.0.0.1:64773
1496170003300 geckodriver::marionette INFO Starting browser \?\C:\Program Files (x86)\Mozilla Firefox\firefox.exe with args ["-marionette"]
1496170004239 addons.manager ERROR startup failed: [Exception... "Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIFile.create]" nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)" location: "JS frame :: resource://gre/modules/FileUtils.jsm :: FileUtils_getDir :: line 70" data: no] Stack trace: FileUtils_getDir()@resource://gre/modules/FileUtils.jsm:70 < FileUtils_getFile()@resource://gre/modules/FileUtils.jsm:42 < validateBlocklist()@resource://gre/modules/AddonManager.jsm:671 < startup()@resource://gre/modules/AddonManager.jsm:834 < startup()@resource://gre/modules/AddonManager.jsm:3129 < observe()@resource://gre/components/addonManager.js:65
JavaScript error: resource://gre/modules/AddonManager.jsm, line 1657: NS_ERROR_NOT_INITIALIZED: AddonManager is not initialized
1496170009047 Marionette INFO Listening on port 64780
JavaScript error: resource://gre/modules/AddonManager.jsm, line 2570: NS_ERROR_NOT_INITIALIZED: AddonManager is not initialized
1496170009378 Marionette WARN TLS certificate errors will be ignored for this session
JavaScript error: resource://gre/modules/FileUtils.jsm, line 70: NS_ERROR_ILLEGAL_VALUE: Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIFile.create]

Crawler for "followers" stops after opening follower list

After calling this:

python instagramcrawler.py -q 'instagram' -t 'followers' -n 30 -a auth.json

the process works well until it opens the follower list. After opening it, it gets stuck and nothing happens anymore. Weeks ago it worked well, auto-scrolling down the follower list and getting all the followers.

Did Instagram change something in their code? Is there a solution for it?
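Most likely Instagram changed its markup, so the auto-scroll no longer targets the follower dialog. A sketch of scrolling the dialog itself via JavaScript; the CSS selector is an assumption and will rot as Instagram updates:

import time
from selenium import webdriver

driver = webdriver.Firefox()
# ... log in and open the follower list first ...

# Selector is a guess; inspect the dialog in devtools for the real one.
dialog = driver.find_element_by_css_selector("div[role='dialog'] ul")
for _ in range(30):  # scroll in steps until enough rows have loaded
    driver.execute_script(
        "arguments[0].scrollTop = arguments[0].scrollHeight", dialog)
    time.sleep(1)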

why can't I get the photo links?

Hi! I typed:

python instagramcrawler.py -q #breakfast -n 50

Then Firefox started, but the crawler printed:

posts: 61650050, number: 50
Saving...
Saving to directory: E:/ins\breakfast.hashtag
Scraping photo links...
Number of photo_links: 33
Downloading 1 images to
Traceback (most recent call last):
  File "instagramcrawler.py", line 342, in <module>
    main()
  File "instagramcrawler.py", line 338, in main
    authentication=args.authentication)
  File "instagramcrawler.py", line 133, in crawl
    self.download_and_save(dir_prefix, query, crawl_type)
  File "instagramcrawler.py", line 294, in download_and_save
    urlretrieve(photo_link, filepath)
  File "C:\Python27\lib\urllib.py", line 98, in urlretrieve
    return opener.retrieve(url, filename, reporthook, data)
  File "C:\Python27\lib\urllib.py", line 245, in retrieve
    fp = self.open(url, data)
  File "C:\Python27\lib\urllib.py", line 213, in open
    return getattr(self, name)(url)
  File "C:\Python27\lib\urllib.py", line 443, in open_https
    h.endheaders(data)
  File "C:\Python27\lib\httplib.py", line 1038, in endheaders
    self._send_output(message_body)
  File "C:\Python27\lib\httplib.py", line 882, in _send_output
    self.send(msg)
  File "C:\Python27\lib\httplib.py", line 844, in send
    self.connect()
  File "C:\Python27\lib\httplib.py", line 1263, in connect
    server_hostname=server_hostname)
  File "C:\Python27\lib\ssl.py", line 363, in wrap_socket
    _context=self)
  File "C:\Python27\lib\ssl.py", line 611, in __init__
    self.do_handshake()
  File "C:\Python27\lib\ssl.py", line 840, in do_handshake
    self._sslobj.do_handshake()
IOError: [Errno socket error] EOF occurred in violation of protocol (_ssl.c:661)

So why can't I download them?
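The failure happens inside Python 2.7's urllib SSL handshake, which often cannot negotiate with Instagram's CDN. Since requests is already a dependency of this project, a sketch of swapping urlretrieve for requests (photo_link and filepath stand in for the variables used in download_and_save):

import requests

def save_photo(photo_link, filepath):
    # requests bundles a modern TLS setup via urllib3, unlike the
    # ancient stack behind Python 2.7's urllib.urlretrieve.
    resp = requests.get(photo_link, timeout=30)
    resp.raise_for_status()
    with open(filepath, "wb") as f:
        f.write(resp.content)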

'geckodriver' executable needs to be in PATH

Please add a note that users should install geckodriver on their PC and add its location to PATH. Thanks!
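For example, on Linux/macOS (the directory is an example):

$ export PATH=$PATH:/path/to/geckodriver-dir

On Windows, add the folder containing geckodriver.exe to the PATH environment variable, or pass executable_path explicitly as sketched earlier on this page.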

Query account 'instagram', download 20 photos and their captions

Traceback (most recent call last):
  File "instagramcrawler.py", line 355, in <module>
    main()
  File "instagramcrawler.py", line 345, in main
    crawler = InstagramCrawler(headless=args.headless)
  File "instagramcrawler.py", line 67, in __init__
    self._driver = webdriver.Firefox()
  File "C:\Python27\lib\site-packages\selenium\webdriver\firefox\webdriver.py", line 142, in __init__
    self.service.start()
  File "C:\Python27\lib\site-packages\selenium\webdriver\common\service.py", line 81, in start
    os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'geckodriver' executable needs to be in PATH.
