
google-images-download's Issues

Naming confusion

Hi, the project is called google-images-download, but the code seems to be meant for flickr (?) rather than google, so why does the name mention google?

Also, the documentation examples refer to bing_scraper.py, while the actual script is called flickr_scraper.py.

It's not important, but I felt like mentioning it.
Thank you.
Best regards.

Similar image search doesn't work

Hi,

I have been trying to use the similar image parameter but have not been able to get it to work.

Here's the search I'm running

python bing_scraper.py --similar_images 'https://upload.wikimedia.org/wikipedia/commons/4/4d/Apis_mellifera_Western_honey_bee.jpg' --limit 10 --download -chromedriver C:\Users\Clayton\Downloads\chromedriver_win32\chromedriver.exe
and the result:

[image]

I tried with several other combinations of images and using 'link address' vs. 'image address' but have not been able to get it to work.

UnicodeEncodeError

I get the error above when the image name contains Cyrillic (UTF-8) characters:

26/30 UnicodeEncodeError on an image...trying next one... Error: 'ascii' codec can't encode characters in position 32-53: ordinal not in range(128)
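
One possible cause, assuming the failure comes from handing a URL that contains Cyrillic characters to an ASCII-only code path (the scraper's internals are not shown in this report), is that the URL was never percent-encoded. A minimal sketch of encoding it first; `to_ascii_url` is a hypothetical helper, not part of the scraper:

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters that may appear unescaped; "%" is included so that
# already-encoded sequences are not encoded a second time.
_SAFE = "/%:@!$&'()*+,;="


def to_ascii_url(url: str) -> str:
    """Percent-encode non-ASCII characters in the path, query and fragment."""
    parts = urlsplit(url)
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE + "?")
    fragment = quote(parts.fragment, safe=_SAFE + "?")
    # The host name is left untouched; a non-ASCII host would need IDNA encoding.
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))
```

If the error is actually raised while printing or writing the file name rather than while requesting the URL, this will not help, and the output encoding of the terminal or file system is the more likely culprit.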

Apparent 500-800 Image Download Limit

It seems that only a limited number of images may be downloaded for a particular search term, typically around 500-800 images based on anecdotal evidence.

After this it seems that Google/Bing search simply reaches the end of the allowable scrolling range. Since the selenium functionality of the scraper simply mimics a human using Chrome to conduct an image search manually, I don't believe a workaround is possible, other than possibly searching for slightly different search terms, and then removing duplicates in post-processing.
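
The duplicate-removal step could be sketched roughly as follows, assuming the downloads from the different search terms were saved under one directory tree. `find_duplicate_files` is a hypothetical helper, and it only catches byte-identical files, not resized or re-encoded copies of the same picture:

```python
import hashlib
from pathlib import Path


def find_duplicate_files(directory):
    """Return paths whose content matches an earlier file in the directory tree."""
    seen = {}  # SHA-256 digest -> first path seen with that content
    duplicates = []
    for path in sorted(Path(directory).rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            duplicates.append(path)
        else:
            seen[digest] = path
    return duplicates
```

The returned paths can then be reviewed and removed, for example with `path.unlink()`.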


aborts downloading prematurely

python bing_scraper.py --search 'Buzz Lightyear' --download --limit 300 -o /x/ --chromedriver /usr/local/bin/chromedriver          

Searching for https://www.bing.com/images/search?q=Buzz%20Lightyear
Downloading HTML... 1376820 elements: 100%|███████████████████████████| 30/30 [00:16<00:00,  1.85it/s]
Downloading images...
1/300 https://vignette.wikia.nocookie.net/buzz-lightyear-rides/images/e/e8/Robot_Toy.jpg/revision/latest Invalid or missing image format. Skipping...
1/300 https://cdnb.artstation.com/p/assets/images/images/026/253/135/medium/eugene-napadovskiy-nos-4-a2.jpg 
2/300 https://www.gratistodo.com/wp-content/uploads/2016/10/Toy-Story-Wallpapers-6.jpg 
3/300 http://www.littlebcakes.com/wp-content/uploads/2014/01/Bumble-Bee-Cake-764x1024.jpg 
4/300 https://spongekids.com/wp-content/uploads/2014/03/costumes-for-kids/52-buzz-lightyear-kid-costume-idea.JPG Invalid or missing image format. Skipping...
4/300 https://colorearimagenes.net/wp-content/uploads/2015/11/toystory1.gif4_.jpg 
5/300 https://spongekids.com/wp-content/uploads/2014/10/super-cool-costume-ideas/11-scarecrow-costume.jpg 
6/300 http://www.lubbockonline.com/storyimage/TX/20121121/LIFESTYLE/311219834/AR/0/AR-311219834.jpg 
7/300 http://blog.holidaydiscountcentre.co.uk/wp-content/uploads/2014/10/Alice-in-Wonderland-by-Loren-Javier-via-Flickr-576x384.jpg 
8/300 https://www.littlebcakes.com/wp-content/uploads/2014/01/Kitty-Cat-Cakes-760x1024.jpg 
Unfortunately all 291 could not be downloaded because some images were not downloadable. 8 is all we got for this search filter!
Done with 2 errors in 77.0s. All images saved to /Users/evar/Base/_Code/misc/google-images-download/images

Run without chrome driver installation: Solution

Hi, you could use

from webdriver_manager.chrome import ChromeDriverManager

# Downloads a chromedriver if needed and returns the path to the executable
chromedriver = ChromeDriverManager().install()

instead of specifying the path to the chrome driver. Makes it easier (;

selenium.common.exceptions.InvalidArgumentException: Message: invalid argument

DevTools listening on ws://127.0.0.1:53852/devtools/browser/bc25e6c0-e37c-4083-9092-7b062cd14cf8
Traceback (most recent call last):
  File "bing_scraper.py", line 936, in <module>
    main()
  File "bing_scraper.py", line 922, in main
    paths, errors = response.download(arguments) # wrapping response in a variable just for consistency
  File "bing_scraper.py", line 759, in download
    paths, errors = self.download_executor(arguments)
  File "bing_scraper.py", line 871, in download_executor
    raw_html = self.download_extended_page(url, arguments['chromedriver'])
  File "bing_scraper.py", line 206, in download_extended_page
    browser.get(url)
  File "D:\Program Files\Anaconda3\envs\python37\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 333, in get
    self.execute(Command.GET, {'url': url})
  File "D:\Program Files\Anaconda3\envs\python37\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "D:\Program Files\Anaconda3\envs\python37\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidArgumentException: Message: invalid argument
  (Session info: headless chrome=81.0.4044.129)

I am using this command:
python bing_scraper.py --url 'https://www.bing.com/images/search?q=parcel' --limit 100 --download --chromedriver ../chromedriver.exe

and the version of chromedriver is 81.0.4044.129
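
One thing worth checking, as a guess rather than a confirmed diagnosis: Windows `cmd.exe` does not treat single quotes as quoting characters, so the URL may reach the script with the quotes still attached, and the browser then rejects it as an invalid argument. A minimal sketch of normalizing the value before it is passed to `browser.get`; `clean_url` is a hypothetical helper, not part of bing_scraper.py:

```python
from urllib.parse import urlsplit


def clean_url(raw: str) -> str:
    """Strip stray surrounding quotes and check that the URL is absolute."""
    url = raw.strip().strip("'\"")
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {raw!r}")
    return url
```

Using double quotes around the URL on the command line would be the simpler test of this theory.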

missing argument 'keywords'

Hi,
My problem is: every time I try to run this module with the 'search' argument and without the 'keywords' argument, I get the error message "Uh oh! Keywords is a required argument[....]". When I use only the 'keywords' argument, the module uses google images as the source, and it obviously won't download any images. When I use both, it still uses the google images url. Here is part of my code in case it helps.

import bing_scraper

response = bing_scraper.googleimagesdownload()
arguments = {
    "search": "honeybees on flowers",
    "limit": 10,
    "download": True,
    "chromedriver": r"C:\Users\User\Desktop\python\chromedriver.exe",
}

response.download(arguments)

safe_search always on

I would like to turn safe_search off, but the option seems to have no effect. The search is always set to moderate.

This version of ChromeDriver only supports Chrome version 81

When I try to use the script I get the following error:

$ python3 bing_scraper.py --url 'https://www.bing.com/images/search?q=flowers' --limit 10 --chromedriver '/Users/Me/Downloads/chromedriver'
Searching for https://www.bing.com/images/search?q=flowers
chromedriver not found (use the '--chromedriver' argument to specify the path to the executable)or google chrome browser is not installed on your machine (exception: Message: session not created: This version of ChromeDriver only supports Chrome version 81
)
$ ls -lat ~/Downloads
-rwxr-xr-x@   1 Me  staff    14786468 Feb 12 19:09 chromedriver

Downloaded latest chromedriver from here.
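
The message indicates that the driver and the installed browser disagree on the major version. A rough sketch of comparing the two, assuming version strings of the form printed by `chromedriver --version` and by Chrome's own version output; the function names are hypothetical and the sample strings in the usage note are illustrative, not taken from this report:

```python
import re


def major_version(version_output: str) -> int:
    """Extract the major version from text such as 'ChromeDriver 81.0.4044.69'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def versions_match(driver_output: str, browser_output: str) -> bool:
    """True when chromedriver and Chrome share the same major version."""
    return major_version(driver_output) == major_version(browser_output)
```

For example, `versions_match("ChromeDriver 81.0.4044.69", "Google Chrome 80.0.3987.106")` is false, which is the situation the error describes: the latest driver can be newer than the browser that is actually installed.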
