
pinterest-image-scraper's Introduction

Pinterest Image Scraper

Take the URL of any Pinterest board (or a CSV file listing several boards) and get back a Python list of URLs to the high-resolution versions of all the images on the board.

Requirements:

  • Python 3.5+ (Anaconda recommended)
  • Pandas (pip install pandas or conda install pandas)
  • Firefox + Gecko driver (Firefox can be omitted if you know what you're doing and have another browser set up to be used via Selenium)
  • Selenium (pip install selenium or conda install -c conda-forge selenium, then see these instructions for installing the Gecko driver if not installing it from Conda)
  • Alternatively, install the Gecko driver using conda: conda install -c conda-forge geckodriver
  • If you want to use Chrome or PhantomJS, install their respective selenium drivers: conda install python-chromedriver-binary phantomjs
  • A Pinterest Account

How to Run:

git clone https://github.com/xjdeng/pinterest-image-scraper.git
cd pinterest-image-scraper
pip install -U .
cd ..
python
from pinterest_scraper import scraper as s
ph = s.Pinterest_Helper("<Pinterest login>", "<Pinterest password>")
images = ph.runme("http://URL-to-image-board")

Or if you have a CSV file with a URL to a different image board on every line:

images = ph.getURLs("imageboards.csv")
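
To check such a file before handing it to the scraper, a minimal sketch along these lines may help. It assumes only what is stated above (one board URL per line); `read_board_urls` is a hypothetical helper, not part of pinterest_scraper, and it does not claim to mirror how `getURLs` parses the file.

```python
import csv


def read_board_urls(csv_path):
    """Return the first column of every non-empty row as a list of URLs.

    Hypothetical helper for inspecting the CSV; not part of pinterest_scraper.
    """
    urls = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle):
            # Skip blank lines and rows whose first cell is empty.
            if row and row[0].strip():
                urls.append(row[0].strip())
    return urls
```

Printing `read_board_urls("imageboards.csv")` shows which boards the file actually lists.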

Now if you want to download these images:

s.download(images, "/path/to/your/destination/dir")

or to download to your current directory:

s.download(images)
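
Before downloading, it can be useful to see which files a list of URLs would produce. The sketch below is an assumption-laden illustration: `plan_downloads` is a hypothetical helper, and the file-naming rule (last path segment of the URL) is a guess, not necessarily what `s.download` does internally.

```python
import os
from urllib.parse import urlparse


def plan_downloads(urls, dest_dir="."):
    """Pair each unique URL with a candidate local path inside dest_dir.

    Hypothetical helper; the naming rule is an assumption for illustration.
    """
    seen = set()
    plan = []
    for url in urls:
        if url in seen:
            continue  # drop duplicate URLs, keeping first-seen order
        seen.add(url)
        # Use the last path segment as the file name, with a fallback.
        name = os.path.basename(urlparse(url).path) or "image"
        plan.append((url, os.path.join(dest_dir, name)))
    return plan
```

The deduplicated URLs, `[url for url, _ in plan_downloads(images)]`, can then be passed to `s.download` as shown above.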

Note: you no longer need Firefox. If you'd like to use a different browser (e.g. Chrome or PhantomJS), initialize it through Selenium, then pass it to the Pinterest_Helper constructor. For example, using Chrome:

from selenium import webdriver
chrome = webdriver.Chrome()
ph = s.Pinterest_Helper("<Pinterest login>", "<Pinterest password>", chrome)
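
The login and password in the Pinterest_Helper calls above are placeholders and must be replaced with real Python strings. One way to avoid typing them into a script is to read them from environment variables. This is only a sketch: the variable names `PINTEREST_LOGIN` and `PINTEREST_PASSWORD` and the helper `load_credentials` are assumptions made for illustration, not something the package defines.

```python
import os


def load_credentials(env=os.environ):
    """Return (login, password) from two hypothetical environment variables."""
    login = env.get("PINTEREST_LOGIN")
    password = env.get("PINTEREST_PASSWORD")
    if not login or not password:
        raise ValueError("Set PINTEREST_LOGIN and PINTEREST_PASSWORD first")
    return login, password


# Usage (needs the package and a browser driver installed):
# from pinterest_scraper import scraper as s
# login, password = load_credentials()
# ph = s.Pinterest_Helper(login, password)
```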

pinterest-image-scraper's People

Contributors

xjdeng


pinterest-image-scraper's Issues

"The browser appears to have exited "

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/aleksandar/pinterest-image-scraper/pinterest_scraper/scraper.py", line 53, in __init__
    self.browser = webdriver.Firefox(firefox_profile=profile)
  File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/firefox/webdriver.py", line 80, in __init__
    self.binary, timeout)
  File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/firefox/extension_connection.py", line 52, in __init__
    self.binary.launch_browser(self.profile, timeout=timeout)
  File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/firefox/firefox_binary.py", line 68, in launch_browser
    self._wait_until_connectable(timeout=timeout)
  File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/firefox/firefox_binary.py", line 99, in _wait_until_connectable
    "The browser appears to have exited "
selenium.common.exceptions.WebDriverException: Message: The browser appears to have exited before we could connect. If you specified a log_file in the FirefoxBinary constructor, check it for details.

selenium.common.exceptions.ElementNotInteractableException: Message: Element <a class="dangerouslyDisableFocusStyle"> is not reachable by keyboard

When running the command:
images = ph.runme("https://www.pinterest.co.uk/artistdrummer40/arnold-bodybuilding")

I get the following error output:

>>> images =ph.runme("https://www.pinterest.co.uk/artistdrummer40/arnold-bodybuilding")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Nonwork\Downloads\pinterest-image-scraper-master\pinterest-image-scraper-master\pinterest_scraper\scraper.py", line 113, in runme
    dummy.send_keys(Keys.PAGE_DOWN)
  File "C:\Users\Nonwork\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\webelement.py", line 477, in send_keys
    self._execute(Command.SEND_KEYS_TO_ELEMENT,
  File "C:\Users\Nonwork\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\webelement.py", line 633, in _execute
    return self._parent.execute(command, params)
  File "C:\Users\Nonwork\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\Nonwork\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.ElementNotInteractableException: Message: Element <a class="dangerouslyDisableFocusStyle"> is not reachable by keyboard

Firefox does, however, browse to the correct image board, but that's as far as it goes.

README.md typo

From my IDE's point of view, the last line of your how-to should be
images = ph.runme()
instead of
images = s.runme()

SyntaxError: Invalid Syntax

This is what I get:

ph = s.Pinterest_Helper( , )
File "<stdin>", line 1
ph = s.Pinterest_Helper( , )
^
SyntaxError: invalid syntax

Is there something I'm doing wrong?

Selenium doesn't find 'id'

selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: [name="id"]

That's the error I get when running this line:

ph = s.Pinterest_Helper('myusername' , 'mypassword')

thanks

Make scrolling stop

How can the scrolling be made to stop once the end of the page has been reached?
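
One common way to detect the end of an infinitely scrolling page is to keep scrolling until the page height stops growing. The sketch below illustrates that idea in a driver-agnostic form; `scroll_until_stable` and its parameters are hypothetical and are not part of pinterest_scraper, and this is not necessarily how `runme` scrolls.

```python
def scroll_until_stable(get_height, scroll_once, patience=3, max_rounds=1000):
    """Scroll until the height is unchanged for `patience` rounds in a row.

    get_height:  callable returning the current page height
    scroll_once: callable performing one scroll step
    Returns the number of scroll steps performed.
    """
    last_height = get_height()
    unchanged = 0
    rounds = 0
    while unchanged < patience and rounds < max_rounds:
        scroll_once()
        rounds += 1
        height = get_height()
        if height == last_height:
            unchanged += 1  # nothing new loaded this round
        else:
            unchanged = 0   # new content appeared; reset the counter
            last_height = height
    return rounds


# With Selenium (assuming `driver` is an open WebDriver), the callables could be:
# get_height = lambda: driver.execute_script("return document.body.scrollHeight")
# scroll_once = lambda: driver.execute_script(
#     "window.scrollTo(0, document.body.scrollHeight);")
```

In real use, a short pause inside `scroll_once` (for example `time.sleep`) gives the page time to load new pins before the height is measured again.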

selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"[name="id"]"}

from pinterest_scraper import scraper as s
from selenium import webdriver
import os

chromedriverPath = '/home/shubhamturai/anaconda3/envs/pinterestScraper/lib/python3.8/site-packages/chromedriver_binary/chromedriver'

# Make the driver executable; the mode must be octal (0o755), not decimal 755.
os.chmod(chromedriverPath, 0o755)

chrome = webdriver.Chrome(chromedriverPath)
#chrome = webdriver.Chrome()

ph = s.Pinterest_Helper("[email protected]" , "xxxxxxxxxxx", chrome)

images = ph.runme("https://www.pinterest.at/search/pins/?q=rug&rs=typed&term_meta[]=rug%7Ctyped")

s.download(images, "pinterest_dataset/rug")

The error is:
Traceback (most recent call last):
File "main.py", line 8, in <module>
ph = s.Pinterest_Helper("[email protected]" , "xxxxxxxx", chrome)
File "/home/shubhamturai/AI/KfV/pinterest-image-scraper/pinterest_scraper/scraper.py", line 58, in __init__
emailElem = self.browser.find_element_by_name('id')
File "/home/shubhamturai/anaconda3/envs/pinterestScraper/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 496, in find_element_by_name
return self.find_element(by=By.NAME, value=name)
File "/home/shubhamturai/anaconda3/envs/pinterestScraper/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 976, in find_element
return self.execute(Command.FIND_ELEMENT, {
File "/home/shubhamturai/anaconda3/envs/pinterestScraper/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/home/shubhamturai/anaconda3/envs/pinterestScraper/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"[name="id"]"}
(Session info: chrome=85.0.4183.121)

This is my code, but I cannot get past the email and password step. Please help!

Error after executing ph = s.Pinterest_Helper('username' , 'password')

File "<stdin>", line 1, in <module>
File "/home/frostman/work_stuff/Scraped Images/pinterest-image-scraper/pinterest_scraper/scraper.py", line 58, in __init__
emailElem = self.browser.find_element_by_name('id')
File "/home/frostman/venvs/scrp/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 487, in find_element_by_name
return self.find_element(by=By.NAME, value=name)
File "/home/frostman/venvs/scrp/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 955, in find_element
'value': value})['value']
File "/home/frostman/venvs/scrp/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 312, in execute
self.error_handler.check_response(response)
File "/home/frostman/venvs/scrp/lib/python3.5/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: [name="id"]

NameError: name 'ph' is not defined

>>> images = ph.runme("https://www.pinterest.com.au/natalie_jane7/outdoor-portrait-photography/")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'ph' is not defined
>>> s.download(images)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'images' is not defined
>>> images = ph.runme("https://www.pinterest.com.au/natalie_jane7/outdoor-portrait-photography/")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'ph' is not defined
>>> ph = s.Pinterest_Helper("[email protected]" , "my_password_here")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pinterest_scraper/scraper.py", line 58, in __init__
emailElem = self.browser.find_element_by_name('id')
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 496, in find_element_by_name
return self.find_element(by=By.NAME, value=name)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 976, in find_element
return self.execute(Command.FIND_ELEMENT, {
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: [name="id"]
