mhliao / db

A PyTorch implementation of "Real-time Scene Text Detection with Differentiable Binarization".

db's Issues

pretrained synthtext model on res18

Hi, thanks for the awesome job. Could you provide the pretrained synthtext model on resnet-18?

The label of threshold map?

Hi Minghui, thanks for your nice work. I have some questions:

"where the label of the threshold map can be generated by computing the distance
to the closest segment in G."
I think this distance is the distance transform (distanceTransform) of the pixels in the border region; however, when I add it to the loss, training does not converge.
"Lt is computed as the sum of L1 distances between the prediction and label inside the dilated text polygon Gd"
I think the loss for the threshold map is calculated only over the border region.
Are the pixels in the threshold map in [0, 1]? If so, the threshold map predicted by the network should be passed through a sigmoid.

Those are my questions and my thoughts.
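
One way to read the quoted sentence is as a point-to-segment distance: for every pixel, compute the distance to each edge of the polygon G and keep the minimum. The sketch below only illustrates that reading. The function names `point_segment_distance` and `distance_to_polygon` are hypothetical and are not taken from the repository.

```python
import numpy as np

def point_segment_distance(xs, ys, a, b):
    """Distance from every pixel (xs, ys) to the segment a-b."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    ab = b - a
    denom = float(ab @ ab)
    if denom == 0.0:
        # Degenerate segment: fall back to the distance to the point a.
        return np.hypot(xs - a[0], ys - a[1])
    # Projection parameter of each pixel onto the line, clamped to the segment.
    t = ((xs - a[0]) * ab[0] + (ys - a[1]) * ab[1]) / denom
    t = np.clip(t, 0.0, 1.0)
    return np.hypot(xs - (a[0] + t * ab[0]), ys - (a[1] + t * ab[1]))

def distance_to_polygon(polygon, height, width):
    """Per-pixel distance to the closest edge of a closed polygon."""
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float64)
    polygon = np.asarray(polygon, dtype=np.float64)
    dist = np.full((height, width), np.inf)
    for i in range(len(polygon)):
        a, b = polygon[i], polygon[(i + 1) % len(polygon)]
        dist = np.minimum(dist, point_segment_distance(xs, ys, a, b))
    return dist
```

A label in [0, 1] could then be formed as `np.clip(1 - dist / d, 0, 1)` for some border width `d`. That normalisation is also an assumption here, not a statement about the released code.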

A loyal fan of yours: I have followed your work from TextBoxes to TextBoxes++ and am looking forward to the open-source release

Error when running python setup.py build_ext --inplace

/usr/local/cuda:/usr/local/cuda:/usr/local/cuda-9.0/bin/nvcc -I/home/nmt/anaconda3/envs/pytorch1_2/lib/python3.6/site-packages/torch/include -I/home/nmt/anaconda3/envs/pytorch1_2/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/nmt/anaconda3/envs/pytorch1_2/lib/python3.6/site-packages/torch/include/TH -I/home/nmt/anaconda3/envs/pytorch1_2/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda:/usr/local/cuda:/usr/local/cuda-9.0/include -I/home/nmt/anaconda3/envs/pytorch1_2/include/python3.6m -c src/deform_conv_cuda_kernel.cu -o build/temp.linux-x86_64-3.6/src/deform_conv_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=deform_conv_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
unable to execute '/usr/local/cuda:/usr/local/cuda:/usr/local/cuda-9.0/bin/nvcc': No such file or directory
error: command '/usr/local/cuda:/usr/local/cuda:/usr/local/cuda-9.0/bin/nvcc' failed with exit status 1

I tried changing the CUDA environment variables in .bashrc, but it did not help. How can I solve this problem?
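
The failing command uses `/usr/local/cuda:/usr/local/cuda:/usr/local/cuda-9.0/bin/nvcc` as the compiler path, which suggests the CUDA root the build picked up (probably the `CUDA_HOME` environment variable, though the log does not confirm that) holds several directories joined with `:` instead of one. The sketch below shows how such a value could be split to see which single directory it should be. The helper names are hypothetical.

```python
import posixpath

def split_cuda_home(value):
    """Split a colon-joined CUDA root into unique directories, keeping order."""
    seen = []
    for part in value.split(":"):
        if part and part not in seen:
            seen.append(part)
    return seen

def nvcc_candidates(value):
    """Paths where nvcc would be expected under each directory."""
    return [posixpath.join(d, "bin", "nvcc") for d in split_cuda_home(value)]

# Value reconstructed from the failing command in the log above.
bad_value = "/usr/local/cuda:/usr/local/cuda:/usr/local/cuda-9.0"
print(split_cuda_home(bad_value))
print(nvcc_candidates(bad_value))
```

If this reading is right, setting the variable to exactly one of the printed directories (the one whose `bin/nvcc` exists) before rebuilding may resolve the error.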

Why is the minimum value of thresh_map set to 0.3?

Why does this line set the minimum value of canvas to 0.3?
https://github.com/MhLiao/DB/blob/master/data/processes/make_border_map.py#L41
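
For readers following along, the effect being asked about can be sketched as a linear rescale of a [0, 1] label into a narrower range. This assumes the linked line performs such a rescale. The upper bound of 0.7 and the function name are assumptions for illustration, and the sketch does not explain why the authors chose the lower bound.

```python
import numpy as np

def rescale_threshold_label(canvas, thresh_min=0.3, thresh_max=0.7):
    """Map a label in [0, 1] linearly into [thresh_min, thresh_max]."""
    canvas = np.asarray(canvas, dtype=np.float64)
    return canvas * (thresh_max - thresh_min) + thresh_min

# Pixels far from any text border (label 0) end up at thresh_min, not 0.
print(rescale_threshold_label([0.0, 0.5, 1.0]))
```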

This error occurs during training

python train.py ./experiments/seg_detector/base_totaltext.yaml --num_gpus 2

Traceback (most recent call last):
  File "train.py", line 70, in <module>
    main()
  File "train.py", line 60, in main
    experiment_args = conf.compile(conf.load(args['exp']))['Experiment']
KeyError: 'Experiment'

train

Traceback (most recent call last):
  File "train.py", line 70, in <module>
    main()
  File "train.py", line 60, in main
    experiment_args = conf.compile(conf.load(args['exp']))['Experiment']
KeyError: 'Experiment'

Upload weights to Google Drive

Could you please upload the weights to Google Drive?

python3 eval.py Error:NotImplementedError

When I run python3 eval.py, it raises NotImplementedError. Why? How can I solve it?

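How do I train on my own dataset?

When training on my own dataset, which parameters other than the data paths need attention? Thanks.

Here to show support right away!

Support!

Visualization images become blurry

Why are the visualization images produced with the --visualize option so blurry?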

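deform_conv error: is it a version problem?

I am using Red Hat 7.4, Python 3.6.2, gcc 4.9.2, Cython 0.28, torch 1.3, and CUDA 10.1.
The deformable convolution compiled without problems and the files were generated.
But when I run eval.py for testing, I get the error shown in the screenshot. I suspect a library version mismatch. Could you take a look when you have time? Thanks.

Can inference run entirely on the CPU?

Can the DCN module run on the CPU?

Validation is null when training on my own dataset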

train error in modulated_deformable_col2im_coord_cuda

python train.py experiments/seg_detector/ic15_resnet50_deform_thre.yaml --num_gpus 4
[INFO] [2019-12-05 05:32:38,907] Training epoch 0
error in modulated_deformable_col2im_coord_cuda: invalid device function
error in modulated_deformable_col2im_cuda: invalid device function

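Meaning of the parameters in the yaml files

Hello, I see many parameters in the yaml files and am not sure what some of them mean. When you have time, could you explain what these parameters are for?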

error C3203: 'templated_iterator': unspecialized class template can't be used as a template argument for template parameter '_Ty1', expected a real type

D:/STL/software/Anaconda/Anaconda33/envs/DB/lib/site-packages/torch/include\c10/util/order_preserving_flat_hash_map.h(1502): error C3203: 'templated_iterator': unspecialized class template can't be used as a template argument for template parameter '_Ty1', expected a real type
D:/STL/software/Anaconda/Anaconda33/envs/DB/lib/site-packages/torch/include\c10/util/order_preserving_flat_hash_map.h(1560): note: see reference to class template instantiation 'ska_ordered::order_preserving_flat_hash_map<K,V,H,E,A>' being compiled
D:/STL/software/Anaconda/Anaconda33/envs/DB/lib/site-packages/torch/include\c10/util/order_preserving_flat_hash_map.h(1506): error C3203: 'templated_iterator': unspecialized class template can't be used as a template argument for template parameter '_Ty1', expected a real type
D:/STL/software/Anaconda/Anaconda33/envs/DB/lib/site-packages/torch/include\c10/util/order_preserving_flat_hash_map.h(1514): error C3203: 'templated_iterator': unspecialized class template can't be used as a template argument for template parameter '_Ty1', expected a real type
D:/STL/software/Anaconda/Anaconda33/envs/DB/lib/site-packages/torch/include\c10/util/order_preserving_flat_hash_map.h(1596): error C3203: 'templated_iterator': unspecialized class template can't be used as a template argument for template parameter '_Ty1', expected a real type
D:/STL/software/Anaconda/Anaconda33/envs/DB/lib/site-packages/torch/include\c10/util/order_preserving_flat_hash_map.h(1633): note: see reference to class template instantiation 'ska_ordered::flat_hash_set<T,H,E,A>' being compiled
D:/STL/software/Anaconda/Anaconda33/envs/DB/lib/site-packages/torch/include\c10/util/order_preserving_flat_hash_map.h(1601): error C3203: 'templated_iterator': unspecialized class template can't be used as a template argument for template parameter '_Ty1', expected a real type
D:/STL/software/Anaconda/Anaconda33/envs/DB/lib/site-packages/torch/include\c10/util/order_preserving_flat_hash_map.h(1605): error C3203: 'templated_iterator': unspecialized class template can't be used as a template argument for template parameter '_Ty1', expected a real type
D:/STL/software/Anaconda/Anaconda33/envs/DB/lib/site-packages/torch/include\c10/util/order_preserving_flat_hash_map.h(1609): error C3203: 'templated_iterator': unspecialized class template can't be used as a template argument for template parameter '_Ty1', expected a real type
D:/STL/software/Anaconda/Anaconda33/envs/DB/lib/site-packages/torch/include\c10/util/order_preserving_flat_hash_map.h(1613): error C3203: 'templated_iterator': unspecialized class template can't be used as a template argument for template parameter '_Ty1', expected a real type

Trained models...?

Please check the trained model link, I just find some ground truth files?

Inconsistent results

Thanks for your excellent work! I have a question: why do I get results that are inconsistent with those reported in the paper when testing the fine-tuned ResNet-18 backbone on TotalText?

How can I generate or obtain the visualization result for predicted characters?

Hi Minghui, thanks for your awesome work!

I found that adding --visualize after the evaluation command generates visualization results for the threshold map only. How can I generate or obtain the visualization results for the predicted characters?

Thank you!

No module named 'structure.representers.boxes_from_map'

When running:

python eval.py ./experiments/seg_detector/base.yaml --resume ./modelz/totaltext_resnet50 --polygon --box_thresh 0.6

I get error:

Traceback (most recent call last):
  File "eval.py", line 8, in <module>
    from trainer import Trainer
  File "/home/home/p9/DB/trainer.py", line 6, in <module>
    from experiment import Experiment
  File "/home/home/p9/DB/experiment.py", line 4, in <module>
    from structure.representers import *
  File "/home/home/p9/DB/structure/representers/__init__.py", line 2, in <module>
    from .boxes_from_map import boxes_from_map
ModuleNotFoundError: No module named 'structure.representers.boxes_from_map'

Has anyone reproduced the results reported in the paper? Or could you show the test results on MSRA-TD500 or CTW1500 obtained by running the code?

L1 loss may divide by zero

When mask.sum() == 0 in this line https://github.com/MhLiao/DB/blob/master/decoders/l1_loss.py#L10, the L1 loss becomes NaN.
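
A guard against the empty-mask case could look like the sketch below. It is a NumPy stand-in for the PyTorch loss, written only to illustrate the idea. It is not the code at the linked line, and the name `masked_l1` is hypothetical.

```python
import numpy as np

def masked_l1(pred, gt, mask):
    """Mean absolute error over masked pixels, 0.0 when the mask is empty."""
    pred, gt, mask = (np.asarray(x, dtype=np.float64) for x in (pred, gt, mask))
    mask_sum = mask.sum()
    if mask_sum == 0:
        # Nothing to supervise in this sample: avoid 0 / 0, which would give NaN.
        return 0.0
    return float((np.abs(pred - gt) * mask).sum() / mask_sum)
```

Returning zero for an empty mask means such samples simply contribute nothing to the loss. Adding a small epsilon to the denominator would be another option with the same effect in this case.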

Fine Tune?

How can I use the model pre-trained on SynthText for fine-tuning? How do I load these weights in the train.py script?

Failed to build geventwebsocket

pip install -r requirement.txt

ERROR: Failed building wheel for geventwebsocket

Maybe you want to replace geventwebsocket, because it is failing to build.

Test results differ greatly from the paper

KeyError: 'Experiment'

Dear author, I wanted to train the model with the command "python train.py experiments/seg_detector/base_totaltext.yaml --num_gpus 8", and it raised this error:
  File "train.py", line 72, in <module>
    main()
  File "train.py", line 62, in main
    experiment_args = conf.compile(conf.load(args['exp']))['Experiment']
KeyError: 'Experiment'

How can I solve this problem?
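
The traceback shows only that the compiled config has no top-level `Experiment` key. One plausible reading, which is an assumption and not confirmed in this thread, is that `base_totaltext.yaml` is a shared base file meant to be imported by a full experiment file; other reports in this list pass files such as `totaltext_resnet18_deform_thre.yaml` instead. A minimal sketch of a clearer check, using the hypothetical helper `require_experiment` on an already-parsed config dict:

```python
def require_experiment(config, path):
    """Return config['Experiment'] or raise an error that names the file."""
    if "Experiment" not in config:
        keys = ", ".join(sorted(config)) or "<none>"
        raise ValueError(
            f"{path} has no top-level 'Experiment' section (found: {keys}). "
            "It may be a base file meant to be imported by another yaml."
        )
    return config["Experiment"]
```

Printing the top-level keys this way makes it easier to tell whether the wrong yaml file was passed on the command line.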

how to test on a single image?

Is there a script for testing on a single image?

inference - threshold, dilated

I could not find the dilation operation in the code. Was it omitted? Why?

Environment setup file name: requirements.txt

pip install -r requirements.txt
There is no requirements.txt; the actual file name has no "s":
pip install -r requirement.txt

Training error

When training the model with totaltext, I got this error when running the command
python mytrain.py experiments/seg_detector/totaltext_resnet18_deform_thre.yaml --num_gpus 4. Can you help me? Thank you!

libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
Traceback (most recent call last):
  File "train.py", line 72, in <module>
    main()
  File "train.py", line 69, in main
    trainer.train()
  File "/DB/trainer.py", line 86, in train
    epoch=epoch, step=self.steps)
  File "/DB/trainer.py", line 127, in train_step
    if step % self.experiment.logger.log_interval == 0:
TypeError: unsupported operand type(s) for %: 'int' and 'NoneType'

The training time

Dear author, could you share your training time and computing resources?
I found that it takes about 50 minutes to train one epoch with 4 Titan Xp GPUs. Is that normal?

What is the format of train_list.txt and val_list.txt?

Question about eq. 2

After feeding the probability map and threshold map into Eq. 2, what post-processing should we apply to the final result? In particular, if P equals T, the output is the midpoint value (1/2). There seems to be a lot of noise in the final binary map, so could you describe the inference procedure in detail?
Thank you~
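
For reference, the behaviour described in the question can be reproduced with a small sketch of an Eq. 2 style sigmoid, B = 1 / (1 + exp(-k (P - T))). The value k = 50 and the 0.5 cutoff in `hard_binarize` are assumptions for illustration, and the function names are hypothetical. This is not a description of the repository's inference code.

```python
import numpy as np

def differentiable_binarization(p, t, k=50.0):
    """Approximate binary map from probability map p and threshold map t."""
    p = np.asarray(p, dtype=np.float64)
    t = np.asarray(t, dtype=np.float64)
    return 1.0 / (1.0 + np.exp(-k * (p - t)))

def hard_binarize(b, cutoff=0.5):
    """Turn the soft map into a 0/1 mask with a fixed cutoff."""
    return (np.asarray(b) > cutoff).astype(np.uint8)

# Where P equals T the soft output is exactly 1/2, as noted in the question.
print(differentiable_binarization(0.4, 0.4))
```

With a large k the output saturates quickly away from P = T, so only pixels where the two maps nearly coincide stay close to 1/2.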

paper typo

page 7 table 5: CDAFT (Baek et al. 2019) 89.8 84.3 86.9 may be CRAFT

Error during evaluation: IndexError: list index out of range

Hello, thanks for your work. When I try to run the evaluation I get this error:
IndexError: list index out of range
Details:
python eval.py experiments/seg_detector/totaltext_resnet50_deform_thre.yaml --resume models/totaltext_resnet50 --polygon --box_thresh 0.6
Traceback (most recent call last):
  File "eval.py", line 193, in <module>
    main()
  File "eval.py", line 77, in main
    experiment = Configurable.construct_class_from_config(experiment_args)
  File "./DB/concern/config.py", line 130, in construct_class_from_config
    return cls(**args)
  File "./DB/experiment.py", line 96, in __init__
    self.load_all(**kwargs)
  File "./DB/concern/config.py", line 143, in load_all
    self.load(name, **kwargs)
  File "./DB/concern/config.py", line 151, in load
    (kwargs[state_name], cmd)))
  File "./DB/concern/config.py", line 164, in create_member_from_config
    return cls(**args, cmd=cmd)
  File "./DB/experiment.py", line 37, in __init__
    self.load_all(**kwargs)
  File "./DB/concern/config.py", line 143, in load_all
    self.load(name, **kwargs)
  File "./DB/concern/config.py", line 151, in load
    (kwargs[state_name], cmd)))
  File "./DB/concern/config.py", line 164, in create_member_from_config
    return cls(**args, cmd=cmd)
  File "./DB/data/data_loader.py", line 29, in __init__
    self.load_all(**kwargs)
  File "./DB/concern/config.py", line 143, in load_all
    self.load(name, **kwargs)
  File "./DB/concern/config.py", line 151, in load
    (kwargs[state_name], cmd)))
  File "./DB/concern/config.py", line 164, in create_member_from_config
    return cls(**args, cmd=cmd)
  File "./DB/data/image_dataset.py", line 33, in __init__
    self.get_all_samples()
  File "./DB/data/image_dataset.py", line 37, in get_all_samples
    with open(self.data_list[i], 'r') as fid:
IndexError: list index out of range

I am using Python 3.7.5, PyTorch 1.3.1, and CUDA 10.1.
What should I do?

forward() missing 1 required positional argument: 'data'

I get this error when I run eval.py and demo.py. It seems something went wrong with the data input?
  File "eval.py", line 194, in <module>
    main()
  File "eval.py", line 80, in main
    Eval(experiment, experiment_args, cmd=args, verbose=args['verbose']).eval(args['visualize'])
  File "eval.py", line 177, in eval
    pred = model.forward(batch, training=False)
  File "/ssd/xmzhang/TextDetection/lab1/DB/structure/model.py", line 56, in forward
    pred = self.model(data, training=self.training)

TypeError: forward() missing 1 required positional argument: 'data'

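The dataset on Baidu Netdisk has no image folders

The total_text folder in the provided Baidu Netdisk datasets link does not contain train_images or test_images.

How do I train on my own dataset?

Why is the code structure so complicated?

Of course we trust your work and believe it is fast; we do not need to re-test the speed, which I think is pointless. What we need is a simple way to load the model and run prediction. The current code is full of classes, jumps between them, and parameters whose meaning is unclear... It is really hard...

What does the adaptive parameter mean?

Hello, what does this parameter mean? Does it need to be enabled during training?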

KeyError: 'Experiment'

(db) home@home-desktop:~/p9/DB$ python eval.py ./experiments/seg_detector/base.yaml --resume ./modelz/totaltext_resnet50 --polygon --box_thresh 0.6
Traceback (most recent call last):
  File "eval.py", line 193, in <module>
    main()
  File "eval.py", line 75, in main
    experiment_args = conf.compile(conf.load(args['exp']))['Experiment']
KeyError: 'Experiment'

@MhLiao test the code, both train.py and eval.py

segmentation fault(core dumped)

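The model does not seem very robust

Is the model relatively poor at detecting Chinese text? It also tends to detect QR-code texture as text. I tested some e-commerce images with demo.py and the results were very poor.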

No class named State

In concern.config there is no class named State, but this class is imported in many files.

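A question about the SynthText dataset

I have been using TensorFlow, so my SynthText images and annotations are stored in tfrecord format. How should I prepare the SynthText dataset for training here? Should each image img_name.jpg have a corresponding annotation file img_name.txt?

The labels in the data samples contain many coordinate points. How are they organized? Also, where can the corresponding image data be downloaded?

One more question: how do I convert data that has only four coordinate points into the same label format as yours?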
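
As a rough illustration of the last question, a four-point box can be written out as one text line per instance. The line layout used here (comma-separated x,y pairs followed by the transcription) is an assumption and should be checked against the sample label files. The helper name `quad_to_label_line` is hypothetical.

```python
def quad_to_label_line(points, text="text"):
    """Format four (x, y) corner points as 'x1,y1,...,x4,y4,text'."""
    if len(points) != 4:
        raise ValueError("expected exactly four (x, y) points")
    coords = [str(int(v)) for x, y in points for v in (x, y)]
    return ",".join(coords + [text])

# Corners listed in order around the box.
print(quad_to_label_line([(10, 20), (110, 20), (110, 60), (10, 60)], "hello"))
```

A polygon with more points would follow the same pattern with a longer coordinate list, which may be why the sample labels contain many coordinates.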

TypeError: forward() missing 1 required positional argument: 'data'

home@home-desktop:~/p9/DB/datasets/total_text$ tree -L 1
.
├── test_gts
├── test_images
├── test_list.txt
├── train_gts
├── train_images
└── train_list.txt

(db) home@home-desktop:~/p9/DB$ python eval.py experiments/seg_detector/totaltext_resnet18_deform_thre.yaml --resume ./modelz/totaltext_resnet18 --polygon --box_thresh 0.6
./datasets/total_text/
[INFO] [2019-12-04 19:50:45,447] Resuming from ./modelz/totaltext_resnet18
[INFO] [2019-12-04 19:50:45,496] Resumed from ./modelz/totaltext_resnet18
  0%|                                                                                                                                                                       | 0/300 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "eval.py", line 193, in <module>
    main()
  File "eval.py", line 79, in main
    Eval(experiment, experiment_args, cmd=args, verbose=args['verbose']).eval(args['visualize'])
  File "eval.py", line 176, in eval
    pred = model.forward(batch, training=False)
  File "/home/home/p9/DB/structure/model.py", line 56, in forward
    pred = self.model(data, training=self.training)
  File "/home/home/anaconda3/envs/db/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/home/anaconda3/envs/db/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/home/anaconda3/envs/db/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/home/anaconda3/envs/db/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/home/home/anaconda3/envs/db/lib/python3.6/site-packages/torch/_utils.py", line 385, in reraise
    raise self.exc_type(msg)
TypeError: Caught TypeError in replica 1 on device 1.
Original Traceback (most recent call last):
  File "/home/home/anaconda3/envs/db/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/home/home/anaconda3/envs/db/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
TypeError: forward() missing 1 required positional argument: 'data'
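
A possible explanation, offered as a hypothesis: the progress bar shows 300 steps, which would fit a batch size of 1, and the error is raised in replica 1, so two GPUs appear to be in use. With a single sample there is nothing to send to the second replica, while the non-tensor keyword argument `training=False` is still copied to every device, so the second replica is called with keyword arguments only. The pure-Python sketch below is a simplified model of that behaviour, not PyTorch's actual code. `scatter_sketch` and this `forward` are hypothetical.

```python
def scatter_sketch(batch, kwargs, num_devices):
    """Simplified model of spreading inputs and kwargs over devices."""
    # Batch data is split into at most num_devices chunks.
    chunk = max(1, -(-len(batch) // num_devices))  # ceiling division
    inputs = [(batch[i:i + chunk],) for i in range(0, len(batch), chunk)]
    # Non-tensor keyword arguments are copied once per device.
    per_device_kwargs = [dict(kwargs) for _ in range(num_devices)]
    # The shorter list is padded so both have the same length.
    while len(inputs) < len(per_device_kwargs):
        inputs.append(())
    while len(per_device_kwargs) < len(inputs):
        per_device_kwargs.append({})
    return inputs, per_device_kwargs

def forward(data, training=False):
    return len(data)

inputs, kws = scatter_sketch(["img0"], {"training": False}, num_devices=2)
results = []
for args, kw in zip(inputs, kws):
    try:
        results.append(forward(*args, **kw))
    except TypeError as exc:
        results.append(str(exc))
print(results)
```

If this is what happens, running evaluation with a single visible GPU (for example by setting `CUDA_VISIBLE_DEVICES=0`) or with a batch size of at least the number of GPUs may avoid the error.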

Segmentation fault

The Volatile GPU-Util 0%