commonists / commonsdownloader
Tool to download thumbnails of files from Wikimedia Commons
License: MIT License
When I specify a width for a file that is larger than the file width I want CommonsDownloader to get the largest thumb possible for this file instead of getting an error.
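One way to read this request is as a simple clamp on the requested width. The sketch below assumes the source width is already known; `effective_width` is a hypothetical helper name, not part of CommonsDownloader.

```python
def effective_width(requested_width, source_width):
    """Return the width to request, never larger than the source file.

    Hypothetical helper: how source_width is obtained is not shown here.
    """
    if requested_width <= 0:
        raise ValueError("width must be positive")
    return min(requested_width, source_width)
```

With this, a request for width 99999 on a 1024-pixel-wide file falls back to 1024 instead of failing.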
As a user, I want to download all images in a given Commons category.
Log is:
INFO:root:Downloading C'est_l+á_le_moulin?.JPG with width 99999
INFO:root:Requested width is bigger than source - downloading full size
Traceback (most recent call last):
File "C:\Python27\Scripts\download_from_Wikimedia_Commons-script.py", line 9, in <module>
load_entry_point('CommonsDownloader==0.2', 'console_scripts', 'download_from_Wikimedia_Commons')()
File "build\bdist.win32\egg\commonsdownloader\commonsdownloader.py", line 85, in main
File "build\bdist.win32\egg\commonsdownloader\commonsdownloader.py", line 27, in download_from_file_list
File "build\bdist.win32\egg\commonsdownloader\thumbnaildownload.py", line 122, in download_file
IOError: [Errno 22] invalid mode ('wb') or filename: ".\\Pictures\\C'est_l\xc3\xa0_le_moulin?.jpg"
Running @symac's patched version with Python 2.7.8 in PowerShell in Windows 8.1 Pro N. Issue seems to be caused by NTFS / Windows not supporting "?" in filenames.
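A possible workaround, sketched under the assumption that replacing the offending characters is acceptable, is to sanitize the local file name before opening it for writing. `make_safe_filename` is a hypothetical name, not an existing CommonsDownloader function.

```python
import re

# Printable characters that Windows rejects in file names.
# Control characters and reserved names such as CON are not handled here.
_WINDOWS_FORBIDDEN = re.compile(r'[<>:"/\\|?*]')


def make_safe_filename(name, replacement="_"):
    """Replace characters that Windows file systems do not accept."""
    return _WINDOWS_FORBIDDEN.sub(replacement, name)
```

The remote title sent to Commons would stay unchanged; only the name used on disk is rewritten.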
As a user, I want to have some feedback on CommonsDownloader progress.
Right now, CommonsDownloader does not display any information message. We probably want to display by default INFO-level messages, with an option to mute them.
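A minimal sketch of that behaviour with the standard `logging` and `argparse` modules follows. The `--quiet` flag and both function names are assumptions made for illustration, not existing CommonsDownloader options.

```python
import argparse
import logging


def parse_log_level(argv):
    """Return logging.INFO by default, logging.WARNING when --quiet is given."""
    parser = argparse.ArgumentParser(description="logging sketch")
    # Hypothetical flag: mutes INFO-level progress messages.
    parser.add_argument("-q", "--quiet", action="store_true",
                        help="only show warnings and errors")
    args = parser.parse_args(argv)
    return logging.WARNING if args.quiet else logging.INFO


def configure_logging(argv):
    """Configure the root logger from the command-line arguments."""
    level = parse_log_level(argv)
    logging.basicConfig(level=level)
    return level
```

Calling `configure_logging(sys.argv[1:])` at the start of the program would then make `logging.info(...)` messages visible unless the user mutes them.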
When you have a large text file containing filenames and have to cancel the process for any reason, it would be good to have an option that skips files that have already been downloaded.
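Such an option could be sketched as a filter over the file list, assuming files are saved under their listed names in the output directory. `files_still_to_download` is a hypothetical helper.

```python
import os


def files_still_to_download(file_names, output_dir):
    """Return the names that do not yet exist in output_dir, in input order."""
    return [name for name in file_names
            if not os.path.exists(os.path.join(output_dir, name))]
```

Note that a file left half-written by a cancelled run would also be skipped by this check, so a real implementation might write to a temporary name first.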
CommonsDownloader could use multiprocessing to download several files at once. That may not speed up the process significantly though.
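As a rough illustration, the sketch below uses a thread pool from `concurrent.futures` rather than the `multiprocessing` module named above, since downloads are mostly waiting on the network. `download_all` and its `download_one` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(file_names, download_one, max_workers=4):
    """Call download_one(name) for every name using a pool of worker threads.

    Results are returned in the same order as file_names.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(download_one, file_names))
```

Keeping `max_workers` small seems prudent, so that the tool does not put undue load on the Wikimedia servers.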
That would be more efficient.
According to @dschwen, using thumb.php "is evil for lack of caching". CommonsDownloader should not rely on it to get thumbs, but rather use Special:FilePath.
Either:
https://commons.wikimedia.org/w/index.php?title=Special:FilePath&file=Example.jpg&width=100px
https://commons.wikimedia.org/wiki/Special:FilePath/Example.jpg?width=100
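The second URL form could be built along these lines. This is a sketch only: `file_path_url` is a hypothetical helper, and replacing spaces with underscores is an assumption about how titles are normalized.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://commons.wikimedia.org/wiki/Special:FilePath/"


def file_path_url(file_name, width=None):
    """Build a Special:FilePath URL, optionally asking for a given width."""
    # Percent-encode the title so non-ASCII and reserved characters are safe.
    url = BASE_URL + quote(file_name.replace(" ", "_"), safe="")
    if width is not None:
        url += "?" + urlencode({"width": width})
    return url
```

Leaving `width` out yields the URL of the original file.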
Requested by Coyau: wants CommonsDownloader to download the full-size file.
With @symac's patched CommonsDownloader version of October 1st (see mail discussions), the resulting file names on the filesystem are not properly encoded: two-byte UTF-8 characters (like é) that are properly encoded in the file list (Abbaye Saint-Pierre de Marcilhac-sur-Célé - Eglise.JPG,99999) get translated to two one-byte characters, as in Abbaye_Saint-Pierre_de_Marcilhac-sur-Célé_-_Eglise. On the console, the output is misencoded too, but not in the same way: Downloading Abbaye_Saint-Pierre_de_Marcilhac-sur-C├®l├®_-_Eglise.JPG.
Running with Python 2.7.8 in PowerShell under Windows 8.1 Pro N.
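The two different garblings are consistent with UTF-8 bytes being interpreted under two different single-byte code pages. The Python 3 sketch below reproduces the effect. That the console used cp850 and the filesystem path went through a Latin-1-like code page is an assumption, not something the report confirms.

```python
def misdecode(text, wrong_codec):
    """Encode text as UTF-8, then decode the bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong_codec)


# 'é' is the two bytes 0xC3 0xA9 in UTF-8; each byte becomes one character.
console_view = misdecode("Célé", "cp850")       # matches the console output
filesystem_view = misdecode("Célé", "latin-1")  # another two-character garbling
```

If this is the cause, decoding the file list as UTF-8 into Unicode strings before building paths or printing would be the likely fix.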
I used pip to install CommonsDownloader and I'm using Python 3.6 on Ubuntu 18.04. When I run this command:
download_from_Wikimedia_Commons --category Giovanni_Battista_Moroni --output ~/path
I get the following error:
File "/home/user/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2456, in resolve
module = __import__(self.module_name, fromlist=['__name__'], level=0)
File "/usr/local/lib/python3.6/dist-packages/commonsdownloader/commonsdownloader.py", line 104
except DownloadException, e:
^
SyntaxError: invalid syntax
Is it something I'm doing wrong?
Thanks in advance
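The line flagged in the traceback uses the Python 2 form `except DownloadException, e:`, which Python 3 rejects as a syntax error, so the installed code appears to target Python 2. For comparison, here is the form accepted by Python 3 (and by Python 2.6+). `DownloadException` is redefined as a stand-in so the sketch runs on its own, and `try_download` is a hypothetical wrapper.

```python
class DownloadException(Exception):
    """Stand-in for the project's exception class, defined only for this sketch."""


def try_download(action):
    """Run action() and report a DownloadException instead of crashing."""
    try:
        action()
    except DownloadException as e:  # 'as' replaces the Python 2 comma form
        return "failed: %s" % e
    return "ok"
```

Running the tool under Python 2.7, as in the other reports here, would avoid the error without code changes.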