civitai_image_grabber's Introduction

Civit Image grabber

Downloads all images for a given username, model ID or model tag from CivitAI. The script can only download images for which the API provides data, so if the API does not return data for every image, some images will be missing.

The images are downloaded into a folder named after the username, model ID or tag.
The second level is a folder named after the model with which the image was generated.

CivitAI API is fixed

Usage

Install Python 3
pip install -r requirements.txt
python civit_image_downloader.py

The script will ask you for the following:

      Enter timeout value (in seconds): 
      Choose image quality (1 for SD, 2 for HD): 
      Allow re-downloading of images already tracked (1 for Yes, 2 for No) [default: 2]: 
      Choose mode (1 for username, 2 for model ID, 3 for tag search): 
      Mode 3 
      Enter tags (comma-separated): TAG
      Disable prompt check? (y/n):

If you leave the timeout value empty, the default timeout of 20 seconds is used.
If you leave the image quality value empty, the default image quality SD is used.

Optional: you can enter two or more items separated by commas.
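
The handling of empty answers and comma-separated entries described above could look roughly like this. It is only a minimal sketch: the function names parse_timeout and parse_identifiers are made up for illustration and are not taken from civit_image_downloader.py.

```python
from typing import List


def parse_timeout(raw: str, default: int = 20) -> int:
    """Return the timeout in seconds; an empty answer falls back to the default."""
    raw = raw.strip()
    return int(raw) if raw else default


def parse_identifiers(raw: str) -> List[str]:
    """Split a comma-separated answer into clean, non-empty items."""
    return [item.strip() for item in raw.split(",") if item.strip()]
```

With these helpers, an answer such as "userA, userB" would be turned into two separate usernames.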

Update History

0.9 Feature & Updates

New Feature

Re-download of images. The new option allows tracking to be switched off, so that already downloaded images can be downloaded again.

Allow re-downloading of images already tracked (1 for Yes, 2 for No) [default: 2]: 

If you choose 2 or just press Enter, the script runs with tracking enabled, as it always has.

New Update

When the script is finished, a summary of the usernames or Model IDs that could not be found is displayed.

Failed identifiers:
username: 19wer244rew
Failed identifiers:
ModelID: 493533

0.8 Helper script tagnames

With this script you can search a local txt file to check whether your tag is searchable.
Just launch tagnames.py and it creates a txt file with all the tags that the API returns for the model tag search (option 3).
Some entries clearly do not work. I don't know why they are in the API response.
If you run it again, it adds only new tags to the txt file.

0.7 Features Updates Performance

Features:

Model tag based image download in SD or HD, with prompt check yes or no.
Prompt check YES: the image is downloaded only when the tag is also present in the prompt. Otherwise it is skipped.
Prompt check NO: all images with the searched tag are downloaded, but the chance of unrelated images is higher.
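
The prompt check described above can be sketched as a small filter. This is an illustration only: the function name passes_prompt_check is hypothetical, and the case-insensitive substring match is an assumption about how the tag is compared with the prompt.

```python
def passes_prompt_check(tag: str, prompt: str, disable_prompt_check: bool) -> bool:
    """Decide whether an image found by tag search should be downloaded."""
    if disable_prompt_check:
        # Prompt check NO: every image returned for the tag is accepted.
        return True
    # Prompt check YES: the tag must also appear in the prompt text.
    # Assumption: the comparison ignores upper and lower case.
    return bool(prompt) and tag.lower() in prompt.lower()
```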

CSV file creation in option 3 (tag search).
The CSV file contains the data of images that, according to the JSON file, have already been downloaded under a different tag, in this format:
"Current Tag,Previously Downloaded Tag,Image Path,Download URL"

A little statistic showing how many images were downloaded and how many were skipped, with the reasons why.
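
The CSV file from the tag search, with the four columns listed above, could be written with Python's csv module along these lines. This is a sketch under assumptions: the function name write_duplicate_rows and the shape of the rows are invented for illustration.

```python
import csv

CSV_HEADER = ["Current Tag", "Previously Downloaded Tag", "Image Path", "Download URL"]


def write_duplicate_rows(file_obj, rows):
    """Write the header plus one line per image already downloaded under another tag."""
    writer = csv.writer(file_obj)
    writer.writerow(CSV_HEADER)
    for current_tag, previous_tag, image_path, url in rows:
        writer.writerow([current_tag, previous_tag, image_path, url])
```

When writing to a real file, opening it with newline="" and encoding="utf-8" avoids blank lines and encoding errors on Windows.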

Updates:

Multiple comma-separated entries can be used in all 3 options.

New folder structure for downloaded images in all options: the first folder is named after what you searched (username, model ID or tag), the second after the model that was used to generate the image.

Performance:

Code optimizations: the script now runs smoother and faster.
Better error handling for some cases.

0.6 New Function

Rate limiting set to 20 simultaneous connections. Download date format changed in the JSON tracking file.
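
A limit of 20 simultaneous connections is commonly built with an asyncio.Semaphore. The sketch below shows that general technique and is not the script's actual code: run_limited and the shape of the jobs (coroutine functions taking no arguments) are assumptions.

```python
import asyncio

MAX_CONNECTIONS = 20  # value taken from the release note above


async def run_limited(jobs, limit=MAX_CONNECTIONS):
    """Run coroutine functions concurrently, at most `limit` at the same time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        # Each job waits here until one of the `limit` slots is free.
        async with semaphore:
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))
```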

0.5 New Features

Option to download the SD (jpeg, low quality) or HD (PNG, high quality) version of images.

Better tracking of already downloaded images, with a JSON file called downloaded_images.json in the same folder as the script. The script checks this file before downloading an image, in both options (model ID or username). For SD images with jpeg extension it writes an entry like this:

        "ImageID_SD": {
        "path": "image_downloads/civitAIuser/image.jpeg",
        "quality": "SD",
        "download_date": "YYYY-MM-DD - H:M"
        }

For HD images with PNG extension:

        "ImageID_HD": {
        "path": "image_downloads/civitAIuser/Image.png",
        "quality": "HD",
        "download_date": "YYYY-MM-DD - H:M"
        }
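
As a rough illustration of the tracking described above, the sketch below builds entries keyed by image ID and quality and checks them before a download. The helper names and the allow_redownload flag are hypothetical, and the zero-padded time format is an assumption about how "H:M" is rendered.

```python
from datetime import datetime


def tracking_key(image_id, quality):
    """Build the key used in the tracking data, e.g. '123_SD'."""
    return f"{image_id}_{quality}"


def record_download(tracked, image_id, quality, path, now=None):
    """Store one downloaded image in the tracking dictionary."""
    now = now or datetime.now()
    tracked[tracking_key(image_id, quality)] = {
        "path": path,
        "quality": quality,
        "download_date": now.strftime("%Y-%m-%d - %H:%M"),
    }


def should_download(tracked, image_id, quality, allow_redownload=False):
    """Skip images that are already tracked unless re-download is allowed."""
    return allow_redownload or tracking_key(image_id, quality) not in tracked
```

The dictionary can be saved with json.dump and loaded again with json.load between runs.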

0.4 Added new Functions

Image download by model ID. The idea for it came from bbdbby. The outcome sometimes looks chaotic: a lot of folders with model names you can't find on CivitAI, because the models were renamed or deleted while older images still have the old model names in their metadata.

Sort function to put the images and meta txt files into the right model folder. The sort function relies on the metadata the API provides for the images, so the result is sometimes chaotic, especially for models that have a lot of images.

Tracking of already downloaded images with a text file called downloaded_images.txt in the same folder as the script. The script writes the image ID into it and checks it before downloading an image. This applies to both options, model ID and username.

Increased the timeout to 20

0.3 Added a new Function

The script writes the metadata for every image into a separate text file named after the ID of the image: ID_meta.txt. If no metadata is available, the text file contains the URL of the image so you can check it on the website.
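
Writing one ID_meta.txt file per image, as described above, could be sketched like this. The function name write_meta_file, the "key: value" line layout and the wording of the fallback line are assumptions. Only the file name pattern and the URL fallback come from the description.

```python
import os


def write_meta_file(folder, image_id, meta, image_url):
    """Write <ID>_meta.txt next to the image and return the file path."""
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, f"{image_id}_meta.txt")
    # UTF-8 avoids 'charmap' codec errors on Windows for non-Latin prompts.
    with open(path, "w", encoding="utf-8") as handle:
        if meta:
            for key, value in meta.items():
                handle.write(f"{key}: {value}\n")
        else:
            # No metadata from the API: store the URL for a manual check.
            handle.write(f"No metadata available. Image URL: {image_url}\n")
    return path
```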

Increased the timeout to 10

Added a delay between requests

0.2 Updated with better error handling, some json validation and an option to set a timeout

civitai_image_grabber's People

Contributors

confuzu

civitai_image_grabber's Issues

API changes possibly affecting image grabbing

With the new content rating system, they rolled out changes to image ratings. This looks like it might be affecting the grabber now.

Using the new version you just uploaded, I tried user Superspice and nothing was grabbed. I tried mnh which grabbed 7 of his 33 images. Nothing reported as to why it stopped. The 7 images are all PG that were grabbed.

I did find this on their discord which was a few days ago so not sure if they have since updated the API to use the rating numbers over True/False.

hi all, does anyone know what the new public API "nsfwLevel": 1, numbers mean?

I saw that CivitAI now has "PG", "PG-13", "R", "X" and "XXX" for NSFW filters, but I'm wondering what number correlates to what value in the public API, does anyone know?

PG = 1
PG13 = 2
R = 4
X = 8
XXX = 16
confirmed, sorry about that
I believe that the public api is currently using the nsfw: true/false to return either pg/pg13 images or R+
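
For reference, the numbers quoted above can be turned into a small lookup. Because the values are powers of two, the sketch treats them as bit flags that may be combined. That is an assumption, not something the quoted Discord messages confirm, and rating_names is a made-up helper name.

```python
# Values as quoted in the Discord answer above.
NSFW_LEVELS = {1: "PG", 2: "PG13", 4: "R", 8: "X", 16: "XXX"}


def rating_names(nsfw_level):
    """Return the rating names whose bit is set in the given nsfwLevel value."""
    return [name for bit, name in NSFW_LEVELS.items() if nsfw_level & bit]
```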

not downloading all the images

Hello, it's an exciting script, however, it doesn't download all the images from the users, in more than one instance I verified this. I'll leave just one example: user - betweenspectrums
Even when running the script twice, it fails to archive many of the older images.

There are many on the page but none is archived, in the first series only number: 34920 gets downloaded.

The user has however also those: 34917, 34918, 34919, 34921, 34922, 34923, 34924, 34925, 34926, 34927, and the list goes on.

It's a great help but it's not working perfectly and for some reason, it skips entirely the images (and many more).

Could you maybe reproduce the same error and see if there is anything that could be changed so that it harvests a little slower but more completely?

limit at 999 images

I tried to download the images of a user. Civit says it should be 2k, but the downloader stops at 999 and is done. It also refuses to continue if I start the downloader again with the same settings.

Bug - The filename, directory name, or volume label syntax is incorrect

I had another windows character error tonight.

OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'image_downloads\duskfallcrew\FFINSPOMIX.fp16 x (1-alpha-beta) + HellaineIllustrationPonyXLV2Cross.fp16 x alpha + HellaineIllustrationPonyXL.fp16 x beta (alpha = 0.0 1.0 0.995722430686905 0.982962913144534 0.961939766255643 0.933012701892219 0.896676670145617 0.853553390593274 0.80438071450436 0.75 0.5 0.434736903889974 0.37059047744874 0.308658283817455 0.25 0.195619285495639 0.146446609406726 0.103323329854382 0.0669872981077805 0.0380602337443566 0 beta = 0.0 1.0 0.995722430686905 0.982962913144534 0.961939766255643 0.933012701892219 0.896676670145617 0.853553390593274 0.80438071450436 0.75 0.5 0.434736903889974 0.37059047744874 0.308658283817455 0.25 0.195619285495639 0.146446609406726 0.103323329854382 0.0669872981077805 0.0380602337443566 0)_cosineA ID 1'

2024-04-06 040520,806 - ERROR.txt

UnboundLocalError: local variable 'dir_name' referenced before assignment - just a FYI

I had this pop up today and I realized what caused it. I don't know if any other rejections will cause it, but I was using a username which had been changed since I made note of it. A google search found an image linked to the old name, but when you followed the link it took you to the new username. I don't know if this error will pop up for any other reason, but I am guessing the API is rejecting the request due to unknown name. It does immediately cause it to fail though even if you have other names, guessing due to unable to create directory. I don't know what the api spits out when you try a wrong username, but tonyhs is what I used that was renamed.

Traceback (most recent call last):
File "E:\Civitai Image Grabber\civit_image_downloader.py", line 634, in main
results = await asyncio.gather(*tasks)
File "E:\Civitai Image Grabber\civit_image_downloader.py", line 554, in download_images
sort_images_by_model_name(dir_name)
UnboundLocalError: local variable 'dir_name' referenced before assignment

[WinError 123] The filename, directory name, or volume label syntax is incorrect: 'image_downloads\\Oppkllll\\"FINAL 0,5= F0,4loras+GGG-half"'

Traceback (most recent call last):
File "X:\CivitAI_Image_grabber-main\civit_image_downloader.py", line 649, in <module>
asyncio.run(main())
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\asyncio\runners.py", line 44, in run
return loop.run_until_complete(main)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 649, in run_until_complete
return future.result()
File "X:\CivitAI_Image_grabber-main\civit_image_downloader.py", line 613, in main
results = await asyncio.gather(*tasks)
File "X:\CivitAI_Image_grabber-main\civit_image_downloader.py", line 533, in download_images
sort_images_by_model_name(dir_name)
File "X:\CivitAI_Image_grabber-main\civit_image_downloader.py", line 216, in sort_images_by_model_name
os.makedirs(target_dir, exist_ok=True)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\os.py", line 225, in makedirs
mkdir(name, mode)
OSError: The filename, directory name, or volume label syntax is incorrect: 'image_downloads\Oppkllll\"FINAL 0,5= F0,4loras+GGG-half"'

This is all that showed in the console.
I was running a user download, user Oppklll, and got the above error. I am guessing it didn't like a character, probably the quotation marks, assuming that was what it tried to write.
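
WinError 123 is typical for folder names that contain characters Windows does not allow, such as the quotation marks in the path above. One possible way to avoid it is to clean the model name before creating the folder. The sketch below is an assumption about how that could be done: safe_folder_name, the "_" replacement, the length cap of 100 and the fallback name are all made up for illustration.

```python
import re

# Characters Windows rejects in file and folder names, plus control characters.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_folder_name(name, max_length=100, fallback="unknown_model"):
    """Return a folder name that is valid on Windows."""
    cleaned = _INVALID_CHARS.sub("_", name)
    # Windows also rejects names that end with a space or a dot.
    cleaned = cleaned[:max_length].rstrip(" .")
    return cleaned or fallback
```

The length cap would also help with very long merged model names that exceed the path length limit.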

I also added the full details in the log.
[traceback.txt]

Thanks again for all your work on the program. I really do appreciate it.

Downloading PNGs

Is it possible to download the full quality PNGs instead of jpgs from a user's gallery?

like this image, for example https://civitai.com/images/4199662

I noticed that apps like jdownloader can see the PNGs but somehow it downloads low quality jpgs instead

Traceback error after RemoteProtocolError: Server disconnected without sending a response.

This popped up tonight when the server disconnected. I wish I could help, but I have zero python knowledge. Not sure if related to the other error.

Traceback (most recent call last):
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\httpx\_transports\default.py", line 69, in map_httpcore_exceptions
yield
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\httpx\_transports\default.py", line 373, in handle_async_request
resp = await self._pool.handle_async_request(req)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\httpcore\_async\connection_pool.py", line 216, in handle_async_request
raise exc from None
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\httpcore\_async\connection_pool.py", line 196, in handle_async_request
response = await connection.handle_async_request(
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\httpcore\_async\connection.py", line 101, in handle_async_request
return await self._connection.handle_async_request(request)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\httpcore\_async\http11.py", line 143, in handle_async_request
raise exc
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\httpcore\_async\http11.py", line 113, in handle_async_request
) = await self._receive_response_headers(**kwargs)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\httpcore\_async\http11.py", line 186, in _receive_response_headers
event = await self._receive_event(timeout=timeout)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\httpcore\_async\http11.py", line 238, in _receive_event
raise RemoteProtocolError(msg)
httpcore.RemoteProtocolError: Server disconnected without sending a response.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "X:\AI\Civitai Image Grabber\civit_image_downloader.py", line 370, in <module>
asyncio.run(main())
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\asyncio\runners.py", line 44, in run
return loop.run_until_complete(main)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 649, in run_until_complete
return future.result()
File "X:\AI\Civitai Image Grabber\civit_image_downloader.py", line 352, in main
results = await asyncio.gather(*tasks)
File "X:\AI\Civitai Image Grabber\civit_image_downloader.py", line 265, in download_images_for_username
response = await client.get(url, timeout=timeout_value)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\httpx\_client.py", line 1801, in get
return await self.request(
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\httpx\_client.py", line 1574, in request
return await self.send(request, auth=auth, follow_redirects=follow_redirects)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\httpx\_client.py", line 1661, in send
response = await self._send_handling_auth(
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\httpx\_client.py", line 1689, in _send_handling_auth
response = await self._send_handling_redirects(
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\httpx\_client.py", line 1726, in _send_handling_redirects
response = await self._send_single_request(request)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\httpx\_client.py", line 1763, in _send_single_request
response = await transport.handle_async_request(request)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\httpx\_transports\default.py", line 372, in handle_async_request
with map_httpcore_exceptions():
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\contextlib.py", line 153, in __exit__
self.gen.throw(typ, value, traceback)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\httpx\_transports\default.py", line 86, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.RemoteProtocolError: Server disconnected without sending a response.
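
A dropped connection like the one in the traceback above is usually handled by retrying the request a few times with a growing pause. The sketch below shows that general pattern without depending on httpx. The name with_retries and its parameters are hypothetical. In the script, the exception to retry on would presumably be httpx.RemoteProtocolError, the type named in the traceback.

```python
import asyncio


async def with_retries(make_request, retry_on=(ConnectionError,), attempts=3, base_delay=1.0):
    """Await make_request(), retrying on the given exception types."""
    for attempt in range(1, attempts + 1):
        try:
            return await make_request()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: let the caller see the original error.
                raise
            # Exponential backoff: base_delay, then 2x, 4x, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```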

Only downloads 100 images

Used to work fine so I suspect a change on civitai, but now when attempting to download all of a user's images, only 100 are downloaded before your script says "Image download completed successfully".

Example user: WTFusion (via https://civitai.com/user/WTFusion/images) has 1860 images at time of writing, only 100 are downloaded.
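
A download that stops after the first 100 images looks like only the first page of API results is being read. A loop that follows the link to the next page until none is returned could look like the sketch below. It is not the script's code: iter_all_items and fetch_page are hypothetical names, and the response fields "items", "metadata" and "nextPage" are assumptions about the API response.

```python
def iter_all_items(fetch_page, first_url):
    """Yield items from every page, following the next-page link."""
    url = first_url
    seen_urls = set()
    # Stop when there is no next page, or when a URL repeats (loop guard).
    while url and url not in seen_urls:
        seen_urls.add(url)
        page = fetch_page(url)  # expected to return the decoded JSON as a dict
        yield from page.get("items", [])
        url = page.get("metadata", {}).get("nextPage")
```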

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 171: character maps to <undefined>

I did get this traceback error today with the updated version.

Traceback (most recent call last):
File "X:\AI\Civitai Image Grabber\civit_image_downloader.py", line 635, in <module>
asyncio.run(main())
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\asyncio\runners.py", line 44, in run
return loop.run_until_complete(main)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 649, in run_until_complete
return future.result()
File "X:\AI\Civitai Image Grabber\civit_image_downloader.py", line 599, in main
results = await asyncio.gather(*tasks)
File "X:\AI\Civitai Image Grabber\civit_image_downloader.py", line 519, in download_images
sort_images_by_model_name(dir_name)
File "X:\AI\Civitai Image Grabber\civit_image_downloader.py", line 190, in sort_images_by_model_name
lines = file.readlines()
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 171: character maps to <undefined>

Failed identifiers: Username randomly happening even though valid users

Not sure if they did something with the API or if it is spitting out an error similar, but I have had a few users that will randomly reject like it is not a valid username when run with other usernames.

ClamJam and DarkAgent were one example. When run individually they complete fine, but when run together they were treated as if neither were known users.

Currently happening with albertolologgb627 and Aishavingfun. No traceback errors, so I'm not sure what the API is actually rejecting with. Originally ran these 2 with 2 others that completed successfully, but these 2 were rejected. Running just the two names, the first succeeded, but the second failed.
Number of downloaded images: 507
Number of skipped images: 0
Failed identifiers:
username: albertolologgb627
Image download completed.

By itself Aishavingfun failed, but rerunning re-download worked. Not sure if it is a json issue, though albertolologgb627 worked without re-download enabled when I ran just the two.

I noticed a trend with this. It always seems to be the first name of multiple names that fails if it is going to fail.

UnboundLocalError: local variable 'model_dir' referenced before assignment

I am getting the following error sometimes when attempting to download via model ID. Nothing is downloaded and the folder isn't generated. This is also resulting I believe in loops being closed saying already visited even though I have nothing saved in the folder.

Traceback (most recent call last):
File "X:\AI\Civitai Image Grabber\civit_image_downloader.py", line 370, in <module>
asyncio.run(main())
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\asyncio\runners.py", line 44, in run
return loop.run_until_complete(main)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 649, in run_until_complete
return future.result()
File "X:\AI\Civitai Image Grabber\civit_image_downloader.py", line 352, in main
results = await asyncio.gather(*tasks)
File "X:\AI\Civitai Image Grabber\civit_image_downloader.py", line 244, in download_images_for_model
sort_images_by_model_name(model_dir)
UnboundLocalError: local variable 'model_dir' referenced before assignment

ssl.SSLSyscallError: Some I/O error occurred (_ssl.c:1007)

Not sure if this was just a disconnection error, but it didn't match the last one and a bunch of other exceptions popped up also so figured I would post it in case.

Traceback (most recent call last):
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\site-packages\anyio\streams\tls.py", line 131, in _call_sslobject_method
result = func(*args)
File "C:\Users\me\AppData\Local\Programs\Python\Python310\lib\ssl.py", line 975, in do_handshake
self._sslobj.do_handshake()
ssl.SSLSyscallError: Some I/O error occurred (_ssl.c:1007)

2024-04-28 004729,807 - ERROR.txt

UnicodeEncodeError: 'charmap' codec can't encode character '\uff0c' in position 4: character maps to <undefined> in tagnames.py

I just ran tagnames.py with the latest version and got the below error. It was different enough I didn't want to just add it to the other issue.

Traceback (most recent call last):
File "X:\AI\Civitai Image Grabber\tagnames.py", line 51, in <module>
process_data(items, file_path, read_existing_tag)
File "X:\AI\Civitai Image Grabber\tagnames.py", line 30, in process_data
file.write(name+ '\n')
File "C:\Users\barre\AppData\Local\Programs\Python\Python310\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\uff0c' in position 4: character maps to <undefined>
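
The traceback shows the file being written with the Windows default codec cp1252, which cannot encode characters such as '\uff0c'. Opening the file with an explicit UTF-8 encoding usually avoids this. The sketch below is not the code of tagnames.py: append_new_tags is a hypothetical helper that reads the existing tags and appends only the new ones, always with encoding="utf-8".

```python
import os


def append_new_tags(file_path, tags):
    """Append tags that are not yet in the file and return the ones added."""
    existing = set()
    if os.path.exists(file_path):
        with open(file_path, "r", encoding="utf-8") as handle:
            existing = {line.rstrip("\n") for line in handle}
    # dict.fromkeys removes duplicates while keeping the original order.
    new_tags = [tag for tag in dict.fromkeys(tags) if tag not in existing]
    with open(file_path, "a", encoding="utf-8") as handle:
        for tag in new_tags:
            handle.write(tag + "\n")
    return new_tags
```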

Unknown_meta folder causing image deletion and not saving new images

I had a weird issue tonight so not sure what you need me to upload. I reran a few usernames tonight that were previously run a few weeks ago so the files weren't in folders based off the models.

I checked one of the folders only to find the only thing in there was a folder called unknown_meta. Neither the main folder of the username or the unknown_meta folder contained any new or previously downloaded images. The window even shows that files were being downloaded, but I have been unable to locate them.

I copied the same user over to a new folder with the existing images and ran it again. The same thing happened. Images are gone and Unknown_Meta folder is created. Again it also told me it downloaded 14 images and I have none.

When I tested on my NAS, I found the images in the recycle bin, but in Windows they are just gone so it must be permanently deleting them after it thinks they have been moved. I don't know if this affected all users or only certain ones. Two users for sure that were affected were linkin911 and beg0n, linkin911 was the one I tested on windows and my NAS. Checking other users though they have unknown_meta folders which only contain the meta txt files and no images.

I don't know if the bug was introduced in this latest version or the one just before it trying to fix the folder bug.

In looking into this a bit further it looks like any user that ends up with an unknown_meta folder won't have any images saved in that folder.

On a side note, I am guessing once fixed I will need to nuke my downloaded_images json or at least wipe anything in the last 48 hours since it thinks those images were downloaded even though they are now gone.

Enhancement - create download folders in mode folders

It would be nice if Image Grabber could separate the users, model number, and model tag search into their own subfolders based off the mode even if just using users, model number, and model tag as the folder name.

Currently I move the folders myself at the end of the night otherwise the image download folder can get unwieldy especially when trying to figure out is it a username or a model tag at a glance.

Enhancement - Ability to limit number of images

While testing the new update, I realized it would be helpful if we could enter a limit on the number of images. Since I was just testing a few random tags, I only needed 10 or so images.

I don't want to annoy Civitai and end up hammering their API not realizing it is downloading an enormous number of images. Some of these older models seem to have 10s of thousands of images. Since it tracks images it has grabbed before, it would hopefully just grab images it didn't grab the last run.

Even better would be on model and tag searches if you could limit by model version ID or model number when doing a tag search. This way you could say grab 100 images of each model version ID when grabbing model X or if in the case of tag search only grab 100 images from a model before moving on to the next one.
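
The limits requested above, one overall and one per model, could be applied while iterating over the images returned by the API. This is only a sketch of the idea: take_limited and the "model" key on each image are assumptions for illustration.

```python
from collections import Counter


def take_limited(images, total_limit=None, per_model_limit=None):
    """Yield images until the overall limit is reached, capping each model."""
    taken = 0
    per_model = Counter()
    for image in images:
        if total_limit is not None and taken >= total_limit:
            break
        model = image.get("model", "unknown")
        if per_model_limit is not None and per_model[model] >= per_model_limit:
            continue  # this model already has enough images, try the next one
        per_model[model] += 1
        taken += 1
        yield image
```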
