dmmaze / comic-text-detector
Manga&Comic text detection
License: GNU General Public License v3.0
I would like to train a model for Spanish text. What dataset is needed, and what folder structure should it have?
Static font files are missing for text rendering: data/font_statics_en.csv and data/font_statics_jp.csv
I'm interested in leveraging features learned on Japanese manga to do transfer learning on the problem of detecting individual characters in a dataset of handwritten Japanese. While looking through the train_*.py scripts, I saw that you provide a train_mask_dir argument. I realize that one of the outputs of comic-text-detector is an image mask, so it makes sense that the model was trained on image segmentation annotations, but I'm only interested in the text block detection module. How can I check out a model pretrained on comics and then fine-tune it with my dataset?
Related to this, the README states that "All models were trained on around 13 thousand anime & comic style images, 1/3 from Manga109-s". Does this mean that all of Manga109-s was used and that it made up a third of the overall training dataset, or that only one third of Manga109-s was sampled and then included in the overall training data? I'm asking because Manga109-s does not provide image segmentation annotations. Did you use only the bounding box annotations, or did you use the Manga109 image segmentation annotations made for the paper Unconstrained Text Detection in Manga?
Could the instructions be made more detailed? Which file should be run for training?
File ~/imt/comic-text-detector/text_rendering.py:170, in TextLinesSampler.__init__(self, page_size, sampler_dict)
168 self.page_w, self.page_h = page_size
169 self.lang = sampler_dict['lang']
--> 170 self.lang_dict = load_dict(lang=self.lang)
171 self.orientation_sampler = ScaledSampler(sampler_dict['orientation'])
172 self.numlines_sampler = ScaledSampler(sampler_dict['num_lines'])
TypeError: load_dict() got an unexpected keyword argument 'lang'
The code in example.ipynb does not run; it fails with error messages.
What can I do?
Hey I've been trying to get a command line tool which makes use of comic text detector to work, but I ran into some version incompatibilities I thought I should quickly mention.
On NumPy 1.23 your code seems to run just fine, however things break on the latest version 1.24 (which gets installed by default).
The problem seems to be that NumPy version 1.24+ removes the np.int alias.
Would be great if you could replace all uses of np.int with a suitable alternative.
Not 100% sure, but I think it's safe to replace np.int with np.int_ or just regular int (from my understanding of the problem).
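For illustration, here is a minimal sketch of the kind of replacement suggested above. The helper name to_int_array is hypothetical and not taken from the repository; it only shows that the built-in int can stand in wherever np.int was previously passed as a dtype.

```python
import numpy as np

def to_int_array(values):
    """Convert values to an integer array without the removed np.int alias.

    Hypothetical helper for illustration only; it is not part of
    comic-text-detector.
    """
    # Previously this might have been written as .astype(np.int),
    # which fails on NumPy 1.24+ because the alias no longer exists.
    return np.asarray(values).astype(int)

coords = to_int_array([1.7, 2.2, 3.9])  # fractional parts are truncated
```

Passing the built-in int as a dtype selects NumPy's default integer type, the same type that np.int_ names, so either spelling should behave the same here.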
ValueError Traceback (most recent call last)
Cell In[2], line 7
5 img_dir = r'data/examples' # can be dir list
6 save_dir = r'data/examples/annotations'
----> 7 model2annotations(model_path, img_dir, save_dir, save_json=False)
File ~/comic-text-detector/inference.py:45, in model2annotations(model_path, img_dir_list, save_dir, save_json)
43 blk_xyxy.append(blk.xyxy)
44 blk_dict_list.append(blk.to_dict())
---> 45 blk_xyxy = xyxy2yolo(blk_xyxy, im_w, im_h)
46 if blk_xyxy is not None:
47 cls_list = [1] * len(blk_xyxy)
File ~/comic-text-detector/utils/imgproc_utils.py:40, in xyxy2yolo(xyxy, w, h)
39 def xyxy2yolo(xyxy, w: int, h: int):
---> 40 if xyxy == [] or xyxy == np.array([]) or len(xyxy) == 0:
41 return None
42 if isinstance(xyxy, list):
ValueError: operands could not be broadcast together with shapes (0,) (24,4)
How can I fix it?
Thanks!
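The traceback points at the comparison xyxy == np.array([]), which attempts an element-wise comparison between an empty array of shape (0,) and the 24 boxes of shape (24, 4). One possible workaround, sketched below, is to test for emptiness with len() only. The function name xyxy2yolo_sketch is hypothetical, and the conversion body assumes the usual YOLO convention of normalized (center x, center y, width, height); the repository's actual implementation may differ.

```python
import numpy as np

def xyxy2yolo_sketch(xyxy, w: int, h: int):
    """Convert (x1, y1, x2, y2) boxes to normalized (cx, cy, bw, bh).

    A sketch under stated assumptions, not the repository's code.
    """
    # len() works for lists and arrays alike and avoids comparing
    # arrays of incompatible shapes with ==.
    if xyxy is None or len(xyxy) == 0:
        return None
    boxes = np.asarray(xyxy, dtype=np.float64).reshape(-1, 4)
    yolo = np.empty_like(boxes)
    yolo[:, 0] = (boxes[:, 0] + boxes[:, 2]) / 2 / w  # center x
    yolo[:, 1] = (boxes[:, 1] + boxes[:, 3]) / 2 / h  # center y
    yolo[:, 2] = (boxes[:, 2] - boxes[:, 0]) / w      # box width
    yolo[:, 3] = (boxes[:, 3] - boxes[:, 1]) / h      # box height
    return yolo

# A list of per-block arrays, similar to what the traceback shows being passed in
result = xyxy2yolo_sketch([np.array([10, 20, 30, 60])], 100, 200)
```

Pinning NumPy to an older release may also sidestep the error, but that is untested here.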