dragoonaethis / itch-dl
Download all games from a public Itch.io Game Jam
License: MIT License
I'm backing up games on Itch and I've noticed multiple inconsistencies with the archived pages generated by itch-dl.
--devlog.
I tried with and without --mirror-web, but there was not much of a difference. Screenshots are saved when specified, but I did not notice any additional benefit.
I have a quick and dirty implementation here, but I'd like this feature to be done correctly and merged upstream (or at least made pluggable).
This includes two main tasks:
Some questions:
I'm not sure I'll have the time to work on this personally, but I'd gladly put up a bounty of $200 (which could be released gradually as we merge features) from the gbdev Open Collective.
Running itch-dl on Python 3.11.1, installed via pip, in PowerShell on Windows 11.
...
INFO:root:Downloading page 49 (found 2400 keys total)
INFO:root:Downloading page 50 (found 2450 keys total)
INFO:root:Downloading page 51 (found 2500 keys total)
INFO:root:Downloading page 52 (found 2550 keys total)
INFO:root:Downloading page 53 (found 2600 keys total)
INFO:root:Fetched 2648 download keys.
Games: 0%| | 0/49557 [00:00<?, ?game/s]INFO:root:Downloading https://dkelroqur.itch.io/black-jumper
https://img.itch.zone/aW1hZ2UvMzMwNzcvMTQxODU4LnBuZw==/original/oKU282.png: 100%|██| 19.8k/19.8k [00:00<00:00, 142kB/s]
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "C:\Users\Proxy_ydkavi9\AppData\Local\Programs\Python\Python311\Scripts\itch-dl.exe\__main__.py", line 7, in <module>
File "C:\Users\Proxy_ydkavi9\AppData\Local\Programs\Python\Python311\Lib\site-packages\itch_dl\cli.py", line 81, in run
return drive_downloads(jobs, download_to, args.mirror_web, settings, keys, parallel=args.parallel)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Proxy_ydkavi9\AppData\Local\Programs\Python\Python311\Lib\site-packages\itch_dl\downloader.py", line 353, in drive_downloads
results = [downloader.download(job) for job in tqdm(jobs, **tqdm_args)]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Proxy_ydkavi9\AppData\Local\Programs\Python\Python311\Lib\site-packages\itch_dl\downloader.py", line 353, in <listcomp>
results = [downloader.download(job) for job in tqdm(jobs, **tqdm_args)]
^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Proxy_ydkavi9\AppData\Local\Programs\Python\Python311\Lib\site-packages\itch_dl\downloader.py", line 324, in download
f.write(site.prettify())
File "C:\Users\Proxy_ydkavi9\AppData\Local\Programs\Python\Python311\Lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\x80' in position 23385: character maps to <undefined>
Games: 0%| | 0/49557 [00:01<?, ?game/s]
As a workaround I've been running the script under Ubuntu WSL2 where it seems to be working fine.
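The traceback above ends inside the cp1252 codec, which suggests the HTML file was opened with the Windows default encoding rather than UTF-8. A minimal sketch of the usual fix, assuming the page is written through a plain open() call; save_page is a hypothetical helper for illustration, not itch-dl's actual code:

```python
import os
import tempfile


def save_page(path: str, html: str) -> None:
    """Write HTML text to disk as UTF-8, regardless of the platform default."""
    # Without encoding=..., open() falls back to the locale's preferred
    # encoding, which is cp1252 on many Windows installs.
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)


if __name__ == "__main__":
    # '\x80' is the character named in the UnicodeEncodeError above.
    page = "<p>caf\u00e9 \x80 \u2603</p>"
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "site.html")
        save_page(target, page)
        with open(target, encoding="utf-8") as f:
            assert f.read() == page
```

On the user side, enabling Python's UTF-8 mode (setting the PYTHONUTF8=1 environment variable before running the tool) changes the default encoding used by open() and may avoid the error without any code change.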
Hi!
INFO:root:Downloading https://andreipasynkov.itch.io/upandown
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/tmp/itch-dl/itch_dl/__main__.py", line 3, in <module>
run()
File "/tmp/itch-dl/itch_dl/cli.py", line 68, in run
return drive_downloads(jobs, download_to, args.mirror_web, args.api_key, keys, parallel=args.parallel)
File "/tmp/itch-dl/itch_dl/downloader.py", line 350, in drive_downloads
results = [downloader.download(job) for job in tqdm(jobs, **tqdm_args)]
File "/tmp/itch-dl/itch_dl/downloader.py", line 350, in <listcomp>
results = [downloader.download(job) for job in tqdm(jobs, **tqdm_args)]
File "/tmp/itch-dl/itch_dl/downloader.py", line 240, in download
metadata = self.extract_metadata(game_id, url, site)
File "/tmp/itch-dl/itch_dl/downloader.py", line 149, in extract_metadata
infobox = parse_infobox(infobox_div)
File "/tmp/itch-dl/itch_dl/infobox.py", line 116, in parse_infobox
parsed_block = parse_tr(name, content_td)
File "/tmp/itch-dl/itch_dl/infobox.py", line 100, in parse_tr
raise NotImplementedError(f"Unknown infobox block name '{name}' - please file a new itch-dl issue.")
NotImplementedError: Unknown infobox block name 'Category' - please file a new itch-dl issue.
...so I did! Got this from https://itch.io/jam/game-boy-showdown.
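One way to keep a single unrecognised infobox row from aborting the whole run is to log it and move on. The sketch below is an assumption about how that could look, not the project's real parser: KNOWN_BLOCKS, parse_row and the string-based row content are all hypothetical (the real parse_tr in the traceback receives parsed HTML cells).

```python
import logging
from typing import Callable, Dict, Optional, Tuple


def _parse_text(content: str) -> str:
    """Hypothetical parser for rows that hold plain text."""
    return content.strip()


# Hypothetical lookup table: infobox row label -> (metadata key, parser).
KNOWN_BLOCKS: Dict[str, Tuple[str, Callable[[str], object]]] = {
    "Status": ("status", _parse_text),
    "Category": ("category", _parse_text),
}


def parse_row(name: str, content: str) -> Optional[Tuple[str, object]]:
    """Parse one infobox row, or return None if the label is unknown."""
    entry = KNOWN_BLOCKS.get(name)
    if entry is None:
        # Warn instead of raising, so one new label does not stop a jam download.
        logging.warning("Unknown infobox block name '%s', skipping it.", name)
        return None
    key, parser = entry
    return key, parser(content)


if __name__ == "__main__":
    print(parse_row("Category", " Game "))
    print(parse_row("Some Future Label", "value"))
```

The trade-off is that unknown rows are silently missing from the saved metadata, so the warning should stay visible enough that new labels still get reported.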
There's a /my-games endpoint available in the API - it'd be useful to be able to archive all games in the user's library.
This consistently happens when I download:
Password protected:
https://beowulf.itch.io/rpg-boss-monsters-minions-huge-pack
https://world-land-trust.itch.io/thank-you-from-the-world-land-trust
https://jayskibean.itch.io/microhorrorarcade-trilogy-i
Access restricted:
https://witpop.itch.io/sprite-pack-fantasy-male-mage
(or https://itch.io/my-purchases).
The issue seems to be that the script can't handle any page that isn't publicly available.
A solution might be to add proper handling, or just to ignore those pages (and maybe write the page URL to a file so that they can be downloaded manually).
This effectively crashes itch-dl and requires restarting it to download other files (I ended up just running it in a loop).
Traceback (most recent call last):
File "/home/<username>/.local/bin/itch-dl", line 8, in <module>
sys.exit(run())
^^^^^
File "/home/<username>/.local/share/pipx/venvs/itch-dl/lib/python3.11/site-packages/itch_dl/cli.py", line 87, in run
return drive_downloads(jobs, download_to, args.mirror_web, settings, keys, parallel=args.parallel)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/<username>/.local/share/pipx/venvs/itch-dl/lib/python3.11/site-packages/itch_dl/downloader.py", line 355, in drive_downloads
results = thread_map(downloader.download, jobs, max_workers=parallel, **tqdm_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/<username>/.local/share/pipx/venvs/itch-dl/lib/python3.11/site-packages/tqdm/contrib/concurrent.py", line 69, in thread_map
return _executor_map(ThreadPoolExecutor, fn, *iterables, **tqdm_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/<username>/.local/share/pipx/venvs/itch-dl/lib/python3.11/site-packages/tqdm/contrib/concurrent.py", line 51, in _executor_map
return list(tqdm_class(ex.map(fn, *iterables, chunksize=chunksize), **kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/<username>/.local/share/pipx/venvs/itch-dl/lib/python3.11/site-packages/tqdm/std.py", line 1181, in __iter__
for obj in iterable:
File "/usr/lib/python3.11/concurrent/futures/_base.py", line 619, in result_iterator
yield _result_or_cancel(fs.pop())
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/concurrent/futures/_base.py", line 317, in _result_or_cancel
return fut.result(timeout)
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/concurrent/futures/_base.py", line 456, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/usr/lib/python3.11/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/<username>/.local/share/pipx/venvs/itch-dl/lib/python3.11/site-packages/itch_dl/downloader.py", line 240, in download
game_id = self.get_game_id(url, site)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/<username>/.local/share/pipx/venvs/itch-dl/lib/python3.11/site-packages/itch_dl/downloader.py", line 117, in get_game_id
game_id = int(data_request.json().get("id"))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
It is currently impossible to download metadata for some titles, such as:
These will not immediately 404 on the initial page load, so there is a valid title behind the link, but we can't download the metadata required to proceed with the download. In all cases the page is configured as restricted, optionally behind a password.
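The TypeError in the traceback comes from calling int() on a missing "id" field. A minimal sketch of handling that case and collecting the affected URLs for manual follow-up, as suggested above; extract_game_id and split_jobs are hypothetical names, and the payload shape (a JSON object with an optional "id" key) is assumed from the traceback rather than from itch.io documentation:

```python
from typing import Any, Dict, List, Mapping, Optional, Tuple


def extract_game_id(payload: Mapping[str, Any]) -> Optional[int]:
    """Return the numeric game ID, or None if the response carries none."""
    raw = payload.get("id")
    if raw is None:
        # Restricted or password-protected pages appear to answer without an ID.
        return None
    try:
        return int(raw)
    except (TypeError, ValueError):
        return None


def split_jobs(
    responses: Mapping[str, Mapping[str, Any]],
) -> Tuple[Dict[str, int], List[str]]:
    """Separate URLs with a usable game ID from URLs that must be skipped."""
    usable: Dict[str, int] = {}
    skipped: List[str] = []
    for url, payload in responses.items():
        game_id = extract_game_id(payload)
        if game_id is None:
            skipped.append(url)
        else:
            usable[url] = game_id
    return usable, skipped


if __name__ == "__main__":
    # Made-up payloads for illustration only.
    usable, skipped = split_jobs({
        "https://example.itch.io/open-game": {"id": "12345"},
        "https://example.itch.io/restricted-game": {},
    })
    print(usable)
    print("\n".join(skipped))  # could be written to a file for manual download
```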
It's possible to query the API directly for any available Download Keys for a given Game ID, instead of downloading them all up front. This would be faster when downloading a small number of titles, especially if the user has a ton of purchased/claimed games.
This is very useful! Please add support for downloading entire bundles if you can. 😃
Lately, downloading from Itch stopped working for me. I tested 3 different pages.
itch-dl --verbose --mirror-web https://beth-and-angel-make-games.itch.io/a-winter-tale
DEBUG:root:Found config file: /home/xxx/.config/itch-dl/config.json
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.itch.io:443
DEBUG:urllib3.connectionpool:https://api.itch.io:443 "GET /profile HTTP/1.1" 200 None
INFO:root:Found 1 URL(s).
INFO:root:Fetching all download keys...
INFO:root:Downloading page 1 (found 0 keys total)
DEBUG:urllib3.connectionpool:https://api.itch.io:443 "GET /profile/owned-keys HTTP/1.1" 200 None
INFO:root:Downloading page 2 (found 50 keys total)
DEBUG:urllib3.connectionpool:https://api.itch.io:443 "GET /profile/owned-keys HTTP/1.1" 200 None
INFO:root:Downloading page 3 (found 100 keys total)
…
INFO:root:Fetched 500 download keys.
Games: 0%| | 0/1 [00:00<?, ?game/s]INFO:root:Downloading https://beth-and-angel-make-games.itch.io/a-winter-tale
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): beth-and-angel-make-games.itch.io:443
DEBUG:urllib3.connectionpool:https://beth-and-angel-make-games.itch.io:443 "GET /a-winter-tale HTTP/1.1" 200 None
Traceback (most recent call last):
File "/run/media/xxx/eeea9c45-51a7-4354-94d3-0e538cb28013/itch_downloads/envname/bin/itch-dl", line 8, in <module>
sys.exit(run())
^^^^^
File "/run/media/xxx/eeea9c45-51a7-4354-94d3-0e538cb28013/itch_downloads/envname/lib/python3.11/site-packages/itch_dl/cli.py", line 81, in run
return drive_downloads(jobs, download_to, args.mirror_web, settings, keys, parallel=args.parallel)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/run/media/xxx/eeea9c45-51a7-4354-94d3-0e538cb28013/itch_downloads/envname/lib/python3.11/site-packages/itch_dl/downloader.py", line 353, in drive_downloads
results = [downloader.download(job) for job in tqdm(jobs, **tqdm_args)]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/run/media/xxx/eeea9c45-51a7-4354-94d3-0e538cb28013/itch_downloads/envname/lib/python3.11/site-packages/itch_dl/downloader.py", line 353, in <listcomp>
results = [downloader.download(job) for job in tqdm(jobs, **tqdm_args)]
^^^^^^^^^^^^^^^^^^^^^^^^
File "/run/media/xxx/eeea9c45-51a7-4354-94d3-0e538cb28013/itch_downloads/envname/lib/python3.11/site-packages/itch_dl/downloader.py", line 240, in download
metadata = self.extract_metadata(game_id, url, site)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/run/media/xxx/eeea9c45-51a7-4354-94d3-0e538cb28013/itch_downloads/envname/lib/python3.11/site-packages/itch_dl/downloader.py", line 150, in extract_metadata
infobox = parse_infobox(infobox_div)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/run/media/xxx/eeea9c45-51a7-4354-94d3-0e538cb28013/itch_downloads/envname/lib/python3.11/site-packages/itch_dl/infobox.py", line 121, in parse_infobox
parsed_block = parse_tr(name, content_td)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/run/media/xxx/eeea9c45-51a7-4354-94d3-0e538cb28013/itch_downloads/envname/lib/python3.11/site-packages/itch_dl/infobox.py", line 58, in parse_tr
return "published_at", parse_date_block(content)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/run/media/xxx/eeea9c45-51a7-4354-94d3-0e538cb28013/itch_downloads/envname/lib/python3.11/site-packages/itch_dl/infobox.py", line 38, in parse_date_block
time = datetime.strptime(time_str.strip(), "%H:%M")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/_strptime.py", line 568, in _strptime_datetime
tt, fraction, gmtoff_fraction = _strptime(data_string, format)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/_strptime.py", line 352, in _strptime
raise ValueError("unconverted data remains: %s" %
ValueError: unconverted data remains: UTC
Games: 0%| | 0/1 [00:00<?, ?game/s]
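The ValueError above indicates that the time string now carries a trailing "UTC" that the "%H:%M" format does not consume. A minimal sketch of a more tolerant parser, assuming the string looks like "15:30 UTC" (the exact page markup is not shown in the report); parse_clock is a hypothetical stand-in for the time handling in parse_date_block:

```python
import re
from datetime import time

# Match a leading HH:MM and ignore whatever follows (for example " UTC").
_TIME_RE = re.compile(r"^\s*(\d{1,2}):(\d{2})")


def parse_clock(time_str: str) -> time:
    """Parse 'HH:MM' with an optional trailing label such as 'UTC'."""
    match = _TIME_RE.match(time_str)
    if match is None:
        raise ValueError(f"Unrecognised time string: {time_str!r}")
    # datetime.time() itself rejects out-of-range hours or minutes.
    return time(int(match.group(1)), int(match.group(2)))


if __name__ == "__main__":
    print(parse_clock("15:30 UTC"))
    print(parse_clock("15:30"))
```

Accepting both forms keeps the parser working whether or not the suffix is present on a given page.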
Originally part of #8 @JoshuaFern - Itch supports storing multiple game versions (and exposes that partially through the API), but the downloader currently grabs just the latest one. There's no way to specify which versions to download, or to list existing versions. Would be neat to make this possible.
Is there a quick way to skip this error?