I have a large collection of doujins from a huge siterip. So many that I have no idea

alternate metadata input procedure about happypanda HOT 13 CLOSED

twiddli commented on June 9, 2024

alternate metadata input procedure

from happypanda.

Comments (13)

twiddli commented on June 9, 2024

An edge case like this won't get supported. There is a way though. It involves making a small script (like you mention yourself). Currently, when importing galleries Happypanda supports extracting metadata from files named info.json with eze's structure which can be cut down to this (this feature will be expanded on sometime in the future):

{
  "gallery_info": {
    "title": "Hello Title",
    "title_original": "Hello Original Title",
    "category": "Manga",
    "tags": {
      "namespace": ["tag1"],
      "namespace2": ["tag1", "tag2"]
    },
    "language": "English",
    "translated": true
  },
  "image_api_key": "",
  "image_info": []
}

The most important keys are gallery_info, image_api_key and image_info. They need to be present. image_api_key and image_info can be left empty because the only reason they are included is for identifying purposes.
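A minimal sketch of checking for those three keys before writing the file. `REQUIRED_KEYS` and `missing_keys` are hypothetical names made up for illustration; they are not part of Happypanda.

```python
# Top-level keys that must be present in info.json (taken from the example above).
REQUIRED_KEYS = ("gallery_info", "image_api_key", "image_info")


def missing_keys(metadata):
    """Return the required top-level keys that are absent from metadata."""
    return [key for key in REQUIRED_KEYS if key not in metadata]


minimal = {
    "gallery_info": {"title": "Hello Title"},
    "image_api_key": "",  # may be left empty
    "image_info": [],     # may be left empty
}

print(missing_keys(minimal))                # []
print(missing_keys({"gallery_info": {}}))   # ['image_api_key', 'image_info']
```

Running a check like this on the dict before serializing should catch a forgotten key early, instead of finding out at import time.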

So what I would do in your case is write a script that gathers the metadata across the files, puts it in a dict identical to the one above, then serializes it as info.json.

I'll be available for help in making the script if you need that (not to be confused with me making the script for you).

from happypanda.

Exedge commented on June 9, 2024

I think i can write most of the code in java, if not i can try others. however i might need a little help getting it to work through multiple folders. I plan on making it a standalone little application that you can drag and drop into the folder. when you click on it it will go through all the subfolders and do the compiling/deleting.

also i really need to know how the database reads the metadata. specifically which parts are essential and which parts i can just leave blank when writing to the json file

from happypanda.

Exedge commented on June 9, 2024

so far i have made a python program that can read in the data and output it into a json file. however the json file is only in a long string for now as a test run. i still need to get it into the right format and i need the deleteOld function to work. once these are all done i can work on making it run through multiple folders.

if anyone can help I could really use it, especially getting the deleteOld function to delete the files

here is the python code, just change the .txt to .py: convertor.txt

i am using this to organize a torrent siterip of pururin. Useful since they seem to have shut down.

If i can get this to convert correctly i can make it so anyone can do the same, and the collection is around a hundred GBs, which is even larger than it sounds.

from happypanda.

twiddli commented on June 9, 2024

In the makeJSON function, I recommend making a dict first and then add the necessary data to it like so:

metadata = {
    "gallery_info": {},   # this is the key you put data in
    "image_api_key": "",  # these last two keys need to be present or else Happypanda won't accept this file (just leave them empty)
    "image_info": []
}

gallery_info = metadata['gallery_info']

gallery_info['title'] = getTitle(path)
gallery_info['category'] = getCategory(path)
gallery_info['artist'] = getArtist(path)
gallery_info['language'] = getLanguage(path)

gallery_info['tags'] = {}
gallery_tags = gallery_info['tags']

gallery_tags['Characters'] = getCharacters(path) # should return a list
# contents = getContents(path)   what is this?
gallery_tags['Group'] = getGroup(path) # should return a list
gallery_tags['Parody'] = getParody(path) # should return a list

Python has json encoder/decoder in the standard library so after filling out the metadata, you just do:

import json

# in makeJSON function
# metadata = {} from earlier

with open("info.json", "w", encoding="utf-8") as infofile:
    json.dump(metadata, infofile)

The deleteOld function is fine. You run it after making the json, I guess.
One thing that I recommend is replacing all those if statements with a for loop:

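import os

# after making the json
def deleteOld(path):
    # names of the old metadata files to remove from the gallery folder
    files = ["__Artist__", "__Category__", "__Characters__"]  # ... plus the rest of the metadata files
    for f in files:
        p = os.path.join(path, f)
        if os.path.isfile(p):  # skip files that don't exist in this folder
            os.remove(p)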

from happypanda.

Exedge commented on June 9, 2024

thank you for the help, i will see if i can get this to at least work on a single folder today. i might even manage to get it all done today!

also, can you post the python code that you use to populate from a directory? I can probably re-purpose it to work with this. I just need the part that iterates through the subfolders from the top.
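For reference, a sketch of the folder iteration using only the standard library. This is not Happypanda's own import code; `gallery_folders` is a hypothetical name, and treating "any folder that contains files" as a gallery is an assumption.

```python
import os


def gallery_folders(root):
    """Yield every folder under root (root included) that holds at least one file."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subfolders in a stable, alphabetical order
        if filenames:    # folders with no files of their own are skipped
            yield dirpath
```

Calling makeJSON and then deleteOld on each yielded folder would apply the single-folder logic to the whole tree.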

from happypanda.

Exedge commented on June 9, 2024

here is a test folder with the test program. it should do what it is supposed to but something is wrong

$1,000,000 no Best Order!.txt

change the extension to zip and then extract it

from happypanda.

twiddli commented on June 9, 2024

I don't know how you're running the python file but you should do it via cmd. That way you'll see its output and whatnot.
Here is the convertor.txt with fixed formatting.

I tried running it and it spewed some error saying some object wasn't serializable...
I didn't fix it for you but getTitle and getCategory don't return strings.
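To illustrate that kind of error, here is a small sketch. It assumes the getters hand back raw bytes read from a file, which is only a guess at what the script actually returned; `to_text` is a hypothetical helper.

```python
import json


def to_text(value):
    """Convert a value to a clean str so the json module can serialize it."""
    if isinstance(value, bytes):
        value = value.decode("utf-8")
    return str(value).strip()


raw_title = b"Hello Title\n"  # assumption: what a getter might return by mistake

metadata = {"gallery_info": {"title": raw_title}, "image_api_key": "", "image_info": []}

try:
    json.dumps(metadata)
except TypeError as error:
    # bytes (like many other objects) are not JSON serializable
    print("cannot serialize:", error)

# Converting to str before storing the value avoids the error.
metadata["gallery_info"]["title"] = to_text(raw_title)
print(json.dumps(metadata))
```

The json module only accepts dicts, lists, strings, numbers, booleans and None, so every getter needs to return one of those.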

from happypanda.

Exedge commented on June 9, 2024

ok it looks like it does the stuff for the most part but the formatting of the json is a bit off and also is there anywhere that we can put the contents file into it? if that file can also be read in then the database can search for specific tags like non-h etc

here is the info that was generated as well as the contents file.

info.txt
Contents.txt

once the formatting is fixed then i can just focus on making it iterate through folders. the end is in sight!

from happypanda.

twiddli commented on June 9, 2024

You can put the data from the contents file in the tag field:

gallery_tags['Misc'] = getContents(path) # should return a list

You can format the output of the json file with the optional indent parameter:

json.dump(metadata, infofile, indent=4)
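Put together, a rough sketch of both steps could look like this. It assumes the contents file lists one entry per line, which may not match the real file; `read_list_file` is a hypothetical helper that getContents could call.

```python
import json
import os


def read_list_file(path):
    """Return the non-empty lines of a text file as a list of strings.

    Assumption: the file holds one entry per line.
    """
    if not os.path.isfile(path):
        return []  # a missing file simply means no entries
    with open(path, encoding="utf-8") as listfile:
        return [line.strip() for line in listfile if line.strip()]


def dump_pretty(metadata, path):
    """Write metadata to path as JSON indented by four spaces."""
    with open(path, "w", encoding="utf-8") as infofile:
        json.dump(metadata, infofile, indent=4)
```

With indent=4 each key ends up on its own line, which makes the generated info.json much easier to check by eye.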

from happypanda.

Exedge commented on June 9, 2024

here is the last stage of the single folder version. all that needs to be fixed is that it needs to always output things in the same order that it takes them in. any ideas?
convertor.txt
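One possible way to keep the order, sketched with the standard library: build the dict as a collections.OrderedDict, which remembers insertion order, and json.dump will write the keys in that order as long as sort_keys is left off. The values below are placeholders. (On Python 3.7 and later a plain dict keeps insertion order too.)

```python
import json
from collections import OrderedDict

# Keys are written out in the order they are inserted here.
gallery_info = OrderedDict()
gallery_info["title"] = "Hello Title"      # placeholder values
gallery_info["category"] = "Manga"
gallery_info["artist"] = "Some Artist"
gallery_info["language"] = "English"

metadata = OrderedDict([
    ("gallery_info", gallery_info),
    ("image_api_key", ""),
    ("image_info", []),
])

text = json.dumps(metadata, indent=4)  # do not pass sort_keys=True
print(text)
```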

from happypanda.

Exedge commented on June 9, 2024

Ok now i have a program that can work with a single folder and it converts it over to a readable json. it seems like the order of the other stuff doesn't matter but i have made it so that the gallery_info is all ordered. Now all that is left is to make it iterate through folders and make it do this in each one

here is the script so far

convertor.txt

also can i post a link to the torrent on here containing the whole archive that this works with?

from happypanda.

Exedge commented on June 9, 2024

i think i have just about done it! i have run a few tests and things seem to be working smoothly. i'm going to try some larger scale stuff before i post it. it should be done within a few hours.

from happypanda.

Exedge commented on June 9, 2024

Ok i have tested it and it seems to run smoothly! I believe i can now call this the pururin convertor 1.0.

here is what you need to do in order to convert your files:

step 1: put the convertor in the root folder of the files (should contain only the folders that contain the individual chapters, use extract here when you extract the zips)

step 2: change the extension from .txt to .py

step 3: run it on the command line
-type cd
-copy the directory of the program and paste after that, hit enter
-type py pururinConvertor.py and hit enter

step 4: make a sandwich, it should take a while

step 5: finally, add them to the database

this last step can take a very VERY VEEEERY long time if you do a large quantity.

and that's all!

here is the file:

pururinConvertor.txt

from happypanda.
