autonise / craft-remade

Implementation of CRAFT Text Detection

License: MIT License

Python 100.00%

ocr detection text-detection weak-supervision craft pytorch-implementation pytorch

craft-remade's Issues

Error on loading final_model.pkl

I am getting 'RuntimeError: Error(s) in loading state_dict for DataParallelModel' while loading final_model.pkl.

What could this be?
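
One frequent cause of this error (an assumption here, since the full message is not shown) is a mismatch in the `module.` prefix that wrappers such as `torch.nn.DataParallel` add to parameter names: a checkpoint saved from a wrapped model has keys like `module.conv.weight`, while an unwrapped model expects `conv.weight`, and vice versa. Below is a minimal sketch that normalizes the keys. It uses plain dicts so that it runs without PyTorch, and the helper names and sample keys are hypothetical.

```python
from collections import OrderedDict

PREFIX = "module."  # prefix a DataParallel-style wrapper puts before each parameter name


def strip_module_prefix(state_dict):
    """Return a copy of state_dict with a leading 'module.' removed from each key."""
    return OrderedDict(
        (key[len(PREFIX):] if key.startswith(PREFIX) else key, value)
        for key, value in state_dict.items()
    )


def add_module_prefix(state_dict):
    """Return a copy of state_dict with 'module.' prepended to keys that lack it."""
    return OrderedDict(
        (key if key.startswith(PREFIX) else PREFIX + key, value)
        for key, value in state_dict.items()
    )


# Stand-in for a real checkpoint; the values would normally be tensors.
saved = {"module.basenet.conv.weight": 1, "module.basenet.conv.bias": 2}
plain = strip_module_prefix(saved)
print(list(plain))
```

With PyTorch, the normalized dict would then be passed to `model.load_state_dict(...)`. Whether final_model.pkl stores the state dict directly or nested under another key is not stated in the issue, so that part would need checking first.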

Add option in config to train sentence-level models instead of word-level models

This will require changes to add_characters and add_affinity in train_synth/dataloader.py.

RuntimeError: DataLoader worker (pid 4249) is killed by signal: Killed.

How can I solve this? I am using Colab to train the CRAFT model, but I have just one GPU available.

I think I found a solution to improving the detection rate

@mayank-git-hub I think I found a way to improve the detection rate, and I want to fully test my code. I would need a script that converts the ICDAR dataset into .json files compatible with labelme, with "shape_type": "polygon".
Not a combined JSON; I want separate JSON files.
Waiting for your reply.
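
A minimal sketch of such a converter follows. It assumes ICDAR 2015-style annotation lines (`x1,y1,x2,y2,x3,y3,x4,y4,transcription`) and the field names commonly found in labelme JSON files. The function names, the sample lines, and the version string are illustrative placeholders, and the image size has to be supplied by the caller.

```python
import json


def icdar_line_to_shape(line):
    """Convert one 'x1,y1,...,x4,y4,text' line into a labelme-style polygon shape."""
    # Split at most 8 times so commas inside the transcription are preserved.
    parts = line.strip().lstrip("\ufeff").split(",", 8)
    coords = [float(value) for value in parts[:8]]
    label = parts[8] if len(parts) > 8 else ""
    points = [[coords[i], coords[i + 1]] for i in range(0, 8, 2)]
    return {
        "label": label,
        "points": points,
        "group_id": None,
        "shape_type": "polygon",
        "flags": {},
    }


def icdar_to_labelme(gt_text, image_path, height, width):
    """Build one labelme-style dict from the annotation text of a single image."""
    shapes = [icdar_line_to_shape(row) for row in gt_text.splitlines() if row.strip()]
    return {
        "version": "4.5.6",  # placeholder; use the version of your labelme install
        "flags": {},
        "shapes": shapes,
        "imagePath": image_path,
        "imageData": None,
        "imageHeight": height,
        "imageWidth": width,
    }


sample = (
    "377,117,463,117,465,130,378,130,Genaxis Theatre\n"
    "493,115,519,115,519,131,493,131,[06]\n"
)
record = icdar_to_labelme(sample, "img_1.jpg", 720, 1280)
print(json.dumps(record, indent=2)[:200])
```

To get separate files, one would call `icdar_to_labelme` once per annotation file and write each result with `json.dump` to its own .json path.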

When I run 'python main.py synthesize --model=./model/63000_model.pkl --folder=./input', I get 'Segmentation fault (core dumped)'

What's the problem?

Pre-trained Model

When will you upload these model files?

gt.mat file

I am trying to train weak supervision on the IC13 dataset and I don't know where to find the gt.mat file that is mentioned in config.py.
The error I'm getting: FileNotFoundError: [Errno 2] No such file or directory: '/home/SharedData/Mayank/SynthText/gt.mat'

Invalid Mode `synthesize`

@mayank-git-hub
When running:
python main.py train-synth --mode=synthesize --model=./model/final.pkl --folder=./input

It seems that you didn't add an actual synthesize mode in main.py.

Note: replace train_synth with train-synth in README

Line detection

@mayank-git-hub
For line detection, what modifications to the code should be made?
PS: did you remove line detection from Projects / to do?

Model obtained after training

After training with weak supervision from a pre-trained model, we get a .pkl file. How do we convert it to .pth?

Is the .pkl an image file, or have you saved the model inside it?

Error: No such command 'train_synth'.

According to the docs, to train from scratch, I run:

python3 CRAFT-Remade/main.py train_synth

I get:

Usage: main.py [OPTIONS] COMMAND [ARGS]...
Try 'main.py --help' for help.

Error: No such command 'train_synth'.

Is this right? (weight file)

Hi.
Thank you for your successful project.

I downloaded the ICDAR2013 pre-trained file of Weakly supervised learning and made inferences.

However, it performed better than the MLT pre-trained one on the Korean dataset (even for Korean data, which is not included in ICDAR2013).
Is the weight file really learned from scratch with ICDAR 2013 data?

Thank you.

How to load a .pkl model with torch?

strong train path

In train_synth I see pretrained_path and loss_plot_training.npy. Can you help me with how to download or generate them?

No Output After Training on Weak Supervision

I tried training for Weak supervision using the SynthText Strong supervision model on a custom data set.

I trained for just three epochs with 1000 iterations each but when I use the models that get saved at each iteration at save-path the inference results are totally blank! No bounding box is drawn on any image after prediction.

Here are a few other observations:

- The Test-(iteration) folders in save-path contain images from my test dataset with blue bounding boxes, which I assume are the ground truth, but no predicted bounding boxes seem to be drawn.
- My loss drops to 0 and my accuracy goes to one after just a few iterations in the first epoch.
- My precision and recall are both 1.0. The cumulative F-score is nan, probably because at some stage it gives a division error for some reason.
- All the folders in both the (iteration)-next-target and (iteration)-predicted folders in the Generated folder at target-path are empty.
- Prediction on the SynthText Strong Supervision model works just fine.

I did write a script to convert the annotations for my data to the ICDAR format (basically what the pre_process function in main.py does), and it seems to match the required format, so I do not think that is the issue. The training runs without error.

Also, all the text annotations for the bounding boxes in my ground truth are ### since the data is not text level annotated, only the bounding boxes are drawn. I hope that is not the issue.

If you can make any sense of what the problem might be, kindly do tell me.
Regards
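
On the nan observation: a NaN F-score typically comes from a 0/0 division on NumPy or tensor values, for example when an image has no predicted boxes or no usable ground-truth boxes. A minimal sketch of a guarded computation follows. The function name and the convention of returning 0.0 for empty cases are assumptions, not the repository's code.

```python
def safe_f_score(true_positives, num_predicted, num_ground_truth):
    """Return (precision, recall, f_score) without ever dividing by zero."""
    precision = true_positives / num_predicted if num_predicted else 0.0
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    denominator = precision + recall
    f_score = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f_score


print(safe_f_score(8, 10, 16))  # ordinary case
print(safe_f_score(0, 0, 5))    # no predictions: zeros instead of NaN
```

Logging the three counts per image alongside the score would show whether the nan comes from empty predictions or from empty ground truth.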

Loading cache.pkl while debugging

In the class DataLoaderSYNTH in the file train_synth/dataloader.py, the code dumps and then loads cache.pkl when debugging. Though cache.pkl is being dumped, it can't be reloaded.
The error that appears: Ran out of input
Code: self.imnames, self.charBB, self.txt = pickle.load(f)

Any idea how to resolve this issue?
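
`EOFError: Ran out of input` is what `pickle.load` raises when the file it reads is empty. That can happen if the cache is read before the writing handle has been closed, or if an earlier run left a zero-byte file behind. Below is a minimal sketch of a cache helper that closes the file before reading and rebuilds an empty cache. `load_or_build_cache` and the sample tuple are hypothetical, not the repository's code.

```python
import os
import pickle
import tempfile


def load_or_build_cache(path, build):
    """Load a pickled cache, rebuilding it when the file is missing or empty."""
    if os.path.exists(path) and os.path.getsize(path) > 0:
        with open(path, "rb") as f:
            return pickle.load(f)
    data = build()
    # The with-block closes (and flushes) the file before anyone reads it back.
    with open(path, "wb") as f:
        pickle.dump(data, f)
    return data


cache_path = os.path.join(tempfile.mkdtemp(), "cache.pkl")
open(cache_path, "wb").close()  # simulate a zero-byte cache left by a failed run

imnames, charBB, txt = load_or_build_cache(
    cache_path, lambda: (["a.jpg"], [[0, 0, 1, 1]], ["hello"])
)
print(imnames, txt)
```

If the cache file in your setup is already zero bytes, deleting it once and letting the loader regenerate it may be enough.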

Suggestions needed for custom training

Hello, I'm currently using EasyOCR for text extraction from high-resolution sheets (7200x4800 on average). As EasyOCR uses the CRAFT text detection model to localize text, I want to train a custom model, because the pretrained model does not give perfect bounding boxes for some text, and in some cases text is missed, which lowers recognition accuracy.

Can anyone suggest how I should train the model to get better accuracy, given that the text in my images is small and congested in some places?

Paths in train_synth/config.py point to a Google Drive directory that is not shared.

Hi, when I run this command "!python main.py train-synth" I get this error:

FileNotFoundError: [Errno 2] No such file or directory: '/home/SharedData/Mayank/Models/SYNTH/config.py'

In addition, config.py in the main directory reads a series of paths that appear to be on your system. For example:
DataLoaderSYNTH_mat = '/home/SharedData/Mayank/SynthText/gt.mat'
Is it possible for you to share this data?

License

Can you please add a license to this repo?

Thanks
Ophir

Generate requirements.txt file

CUDA memory consumption

Hello,

Following the README.md, we are getting failures while running

python main.py weak_supervision --model /path/to/strong/supervision/model --iterations <num_of_iterations(20)>

python main.py weak_supervision --model model/63000_model.pkl --iterations 1

this step on a Tesla T4 with 16 GB of VRAM.

Boundary character value =  0.4012882678889833 | Threshold character value =  0.43128826788898333 | Threshold character upper value =  0.6012882678889833
Boundary affinity value =  0.45783336177161427 | Threshold affinity value =  0.4878333617716143 | Threshold affinity upper value =  0.6578333617716143
Scale character value =  1.3397710965632044 | Scale affinity value =  1.365808455018482
Training Dataset =  ICDAR2013_ICDAR2017 | Testing Dataset =  ICDAR2013
Number of parameters in the model: 20770466
Generating for iteration: 0
  0%|                                                                                                                                                        | 0/8 [00:00<?, ?it/s]THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=383 error=11 : invalid argument
F-score: 0.7510562308524278: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:45<00:00,  4.64s/it]
Testing for iteration: 0
F-score: 0.678219963897929| Cumulative F-score: 0.7265306122448979: 100%|████████████████████████████████████████████████████████████████████████████| 8/8 [01:19<00:00,  8.78s/it]
Test Results for iteration: 0  | F-score:  0.7265306122448979  | Precision:  0.7216216216216216  | Recall:  0.7315068493150685
Fine-tuning for iteration: 0
Learning Rate Changed to  5e-05
Loading Synthetic dataset
Loaded DEBUG
  0%|                                                                                                                                                    | 0/12500 [00:00<?, ?it/s]Traceback (most recent call last):
  File "main.py", line 201, in <module>
    main()
  File "/root/CRAFT-Remade/env/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/root/CRAFT-Remade/env/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/root/CRAFT-Remade/env/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/root/CRAFT-Remade/env/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/root/CRAFT-Remade/env/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "main.py", line 101, in weak_supervision
    model, optimizer, loss, accuracy = train(model, optimizer, iteration)
  File "/root/CRAFT-Remade/train_weak_supervision/trainer.py", line 133, in train
    output = model(image)
  File "/root/CRAFT-Remade/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/CRAFT-Remade/env/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/root/CRAFT-Remade/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/CRAFT-Remade/src/craft_model.py", line 55, in forward
    sources = self.basenet(x)
  File "/root/CRAFT-Remade/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/CRAFT-Remade/src/vgg16bn.py", line 66, in forward
    h = self.slice1(x)
  File "/root/CRAFT-Remade/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/CRAFT-Remade/env/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/root/CRAFT-Remade/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/CRAFT-Remade/env/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 338, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: CUDA out of memory. Tried to allocate 1.12 GiB (GPU 0; 14.73 GiB total capacity; 13.26 GiB already allocated; 675.88 MiB free; 20.96 MiB cached)

Could you please tell us how much memory is required, and whether it is possible to lower it somehow and still run on a 16 GB Tesla?

Thanks a lot!

SynthText dataset

Hi autonise, can you share a link to download your SynthText dataset?

Link Refiner

@mayank-git-hub
clovaai just released the LinkRefiner code (clovaai/CRAFT-pytorch@3cd65f5). Will you implement it, along with an option to train?

Update the ReadMe.md file

Text-Line Detection

@mayank-git-hub How can I train a text-line detection model?
Can you add documentation on training a text-line detection model?

utils_old.py

Thank you for the awesome repo.
Why don't you use the code in utils_old.py?
Thank you.

Weak Supervision Training

Hello, I would like to know whether it is possible to only train on weak supervision. During weak supervision main, it reads some configs from train_synth:
pretrained_path = '/home/SharedData/Mayank/Models/SYNTH/6000_model.pkl'
pretrained_loss_plot_training = '/home/SharedData/Mayank/Models/SYNTH/loss_plot_training.npy'

Should I worry about these?

Recall and precision on ICDAR2013

What are the recall and precision on ICDAR2013 for your implementation?

How to reproduce the provided ICDAR2013 model?

Hi.

I have tried to train a model on the ICDAR2013 dataset.
My approach:

1. Save & load your SynthText Strong Supervision model (trained on the SynthText dataset, 50k iterations)
2-a. python main.py weakly-supervision --model model/63000_model.pkl --iterations 2
2-b. python main.py weakly-supervision --model model/63000_model.pkl --iterations 5

[2-a]
I trained on the ICDAR2013 dataset, sampling the SynthText dataset with probability 1/6 for each batch, for 25k iterations in total.
But I couldn't reproduce the performance of the ICDAR2013 model you provided.
So I tried [2-b], but got the same results as [2-a].

How can I reproduce the results on the ICDAR2013 dataset?

How to test .pkl pre-trained models?

Getting wrong target_affinity when training SynthText style dataset

Hi. Thanks for the great work. I'm trying to train this code on my own generated dataset, which is a SynthText style dataset for Japanese.

The code seems to run, but I noticed in the debug folder that the target_affinity.png file is just black, while target_characters.png seems correct.

Image:

Target affinity:

Target character heat map:

Do you have any suggestions as to what might be wrong here? You can see my code in my fork; it's pretty much identical to yours (I only created a new data loader :P)

(final) home@home-desktop:~/programs/CRAFT-Remade$ python main.py train-synth --mode synthesize --model=./model/final_model.pkl --folder=./input
Will generate the predictions at:  ./target_affinity
Will generate the predictions at:  ./target_character
Will generate the predictions at:  ./word_bbox
  0%|                                                                                                                                                                           | 0/1 [00:00<?, ?it/s]Traceback (most recent call last):
  File "main.py", line 107, in <module>
    main()
  File "/home/home/anaconda3/envs/final/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/home/anaconda3/envs/final/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/home/anaconda3/envs/final/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/home/anaconda3/envs/final/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/home/anaconda3/envs/final/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "main.py", line 61, in train_synth
    base_path_bbox='/'.join(folder.split('/')[:-1])+'/word_bbox',)
  File "/home/home/programs/CRAFT-Remade/train_synth/synthesize.py", line 269, in main
    synthesize(infer_dataloader, model, base_path_affinity, base_path_character, base_path_bbox)
  File "/home/home/programs/CRAFT-Remade/train_synth/synthesize.py", line 101, in synthesize
    affinity_threshold=config.threshold_affinity)['word_bbox']
  File "/home/home/programs/CRAFT-Remade/src/utils/utils.py", line 209, in generate_word_bbox
    all_characters, hierarchy = cv2.findContours(character_heatmap, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
ValueError: too many values to unpack (expected 2)
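
This `ValueError` is consistent with OpenCV 3.x, where `cv2.findContours` returns three values (image, contours, hierarchy), while OpenCV 2.x and 4.x return two (contours, hierarchy). A common version-independent idiom is to take the last two elements of the result. The sketch below shows the idiom on stand-in tuples so that it runs without OpenCV. `last_two` is a hypothetical helper.

```python
def last_two(result):
    """Return (contours, hierarchy) from a findContours-style result of length 2 or 3."""
    contours, hierarchy = result[-2:]
    return contours, hierarchy


# Stand-ins for the tuples returned by different OpenCV major versions.
opencv3_style = ("image", ["contour_a"], "hierarchy")
opencv4_style = (["contour_a"], "hierarchy")

print(last_two(opencv3_style) == last_two(opencv4_style))
```

Applied to the failing line in src/utils/utils.py, this would mean appending `[-2:]` to the `cv2.findContours(...)` call, assuming the installed OpenCV version is indeed the cause.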

Link to download pre-trained model

Hi autonise, can you share a link to download the pre-trained model?