j2konverter's Introduction

j2konverter

A Python script to split volume-based CBZ archives into chapter-based CBZ archives, for Tachiyomi J2K.

Usage

$ python ./j2konverter.py <source> <pattern> <dest>

source is the path to either a single archive or a directory containing multiple archives.

pattern is a sequence of tokens, separated by |, that is matched against the file names inside the archives to generate metadata. The aim is to fill the following data structure:

scanner: str = ""
volume: int = 0
chapter: int = 0
page: int = 0
name: str = ""

So, if the file names inside the archive look like, for example, SomeTitle V01 - C21 / 16 [Chapter Name] [Company], then the pattern to fill the metadata will look like

x|volume|x|chapter|x|page|[name]|[scanner]

x discards a word we don't need. Square brackets describe multi-word tokens that we need; the brackets themselves are skipped automatically.
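A rough sketch of how such a pattern could be applied to the example file name above. This is not the script's actual fillMetadata(); the names Metadata, split_words and fill_metadata_sketch are made up for illustration. It assumes that words are separated by whitespace, that a bracketed group counts as one word, and that an integer field takes the first run of digits in its word.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Metadata:
    # Same fields as the data structure described in the README.
    scanner: str = ""
    volume: int = 0
    chapter: int = 0
    page: int = 0
    name: str = ""


INT_FIELDS = {"volume", "chapter", "page"}


def split_words(filename: str) -> List[str]:
    # Assumption: a [bracketed group] is one word, everything else splits on whitespace.
    return re.findall(r"\[[^\]]*\]|\S+", filename)


def fill_metadata_sketch(filename: str, pattern: str) -> Metadata:
    meta = Metadata()
    # Pair each pattern token with the word at the same position.
    for token, word in zip(pattern.split("|"), split_words(filename)):
        if token == "x":
            continue  # discard this word
        field = token.strip("[]")
        value = word.strip("[]")
        if field in INT_FIELDS:
            # Assumption: take the first run of digits, e.g. "V01" becomes 1.
            digits = re.search(r"\d+", value)
            setattr(meta, field, int(digits.group()) if digits else 0)
        else:
            setattr(meta, field, value)
    return meta


meta = fill_metadata_sketch(
    "SomeTitle V01 - C21 / 16 [Chapter Name] [Company]",
    "x|volume|x|chapter|x|page|[name]|[scanner]",
)
print(meta)
```

Here each token consumes exactly one word, so the pattern needs one token per word in the file name.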

If there is a lot of garbage between two tokens that we need, we can discard it with

loopToken

where Token should be replaced by the first token that we want after the garbage.
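The README does not say how the script decides where the garbage ends, so the following is only a guess at the idea, not the real behaviour. It assumes that an integer field (volume, chapter, page) is recognised by a word containing a digit, and a string field by a bracketed word. The helpers looks_like and skip_garbage are hypothetical names.

```python
import re
from typing import List

INT_FIELDS = {"volume", "chapter", "page"}


def looks_like(field: str, word: str) -> bool:
    # Assumed heuristic: integer fields need a digit, string fields need brackets.
    if field in INT_FIELDS:
        return re.search(r"\d", word) is not None
    return word.startswith("[") and word.endswith("]")


def skip_garbage(words: List[str], start: int, field: str) -> int:
    """Return the index of the first word at or after `start` that could fill `field`.

    Returns len(words) if no such word exists.
    """
    index = start
    while index < len(words) and not looks_like(field, words[index]):
        index += 1
    return index


words = ["SomeTitle", "by", "Some", "Author", "V01", "C21"]
# Under this heuristic, a loop token targeting volume would skip the first four words.
print(skip_garbage(words, 0, "volume"))
```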

Finally, dest is the path to the directory where we want the outputs to be generated.

j2konverter's People

Contributors

legoeggolas

j2konverter's Issues

Better Pattern Matcher

The current pattern matcher (found in fillMetadata()) is very primitive and needs to be upgraded into a more robust version.
It should be able to fill the metadata from a multitude of different but ordered filenames through better heuristics. Some examples of these filenames can be found all over the inter-webs.

One good approach would be regular expressions; another would be a traditional parser driven by a context-free grammar.

To avoid over-engineering, the basic requirements for a new pattern matcher would be:

  • Able to match ordered tokens to filenames and extract metadata.
  • The pattern should be easy to write and flexible.
  • There can be some variation in the order of the tokens in a filename, so the matcher should be able to account for that, if possible. If an integer and a string token get switched, it should be able to compensate for that, given that prior knowledge of such cases is available.
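As one possible direction for the regex approach, the existing pattern syntax could be translated into a regular expression with named groups. This is only a sketch and not part of the project: pattern_to_regex and extract are hypothetical names, and the rules for what each token matches (x is any single word, integer fields are the first digits in a word, bracketed tokens match a bracketed group) are assumptions. It covers the first two requirements only and does nothing about token reordering.

```python
import re

INT_FIELDS = {"volume", "chapter", "page"}


def pattern_to_regex(pattern: str):
    """Translate a pattern such as 'x|volume|[name]' into a compiled regex."""
    parts = []
    for token in pattern.split("|"):
        field = token.strip("[]")
        if token == "x":
            parts.append(r"\S+")  # any single word, discarded
        elif field in INT_FIELDS:
            # Optional non-digit prefix (e.g. "V"), then the number itself.
            parts.append(rf"[^\d\s]*(?P<{field}>\d+)\S*")
        elif token.startswith("["):
            # Multi-word value enclosed in square brackets.
            parts.append(rf"\[(?P<{field}>[^\]]*)\]")
        else:
            parts.append(rf"(?P<{field}>\S+)")
    return re.compile(r"\s+".join(parts))


def extract(filename: str, pattern: str):
    """Return a dict of extracted fields, or None if the file name does not fit."""
    match = pattern_to_regex(pattern).fullmatch(filename.strip())
    if match is None:
        return None
    return {
        key: int(value) if key in INT_FIELDS else value
        for key, value in match.groupdict().items()
    }


print(extract(
    "SomeTitle V01 - C21 / 16 [Chapter Name] [Company]",
    "x|volume|x|chapter|x|page|[name]|[scanner]",
))
```

Returning None on a failed match gives the caller a place to try alternative patterns, which is one way the ordering requirement might be approached later.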
