
pmtd's Issues

Train error

@JingChaoLiu Hello, I get an error when running PMTD_demo.py with --method="PlaneClustering", but when I change it to --method="HardThreshold" it works. I don't know why. I am using a model I trained myself.

Traceback (most recent call last):
  File "/home/donglin/projects/PMTD-inference/demo/PMTD_demo.py", line 104, in <module>
    main()
  File "/home/donglin/projects/PMTD-inference/demo/PMTD_demo.py", line 84, in main
    predictions = pmtd_demo.run_on_opencv_image(image)
  File "/home/donglin/projects/PMTD-inference/demo/predictor.py", line 175, in run_on_opencv_image
    predictions = self.compute_prediction(image)
  File "/home/donglin/projects/PMTD-inference/demo/predictor.py", line 223, in compute_prediction
    masks = self.masker.forward_single_image(masks, prediction)
  File "/home/donglin/projects/PMTD-inference/demo/inference.py", line 27, in forward_single_image
    for mask, box in zip(masks, boxes.bbox)
  File "/home/donglin/projects/PMTD-inference/demo/inference.py", line 27, in <listcomp>
    for mask, box in zip(masks, boxes.bbox)
  File "/home/donglin/projects/PMTD-inference/demo/inference.py", line 44, in reg_pyramid_in_image
    planes = plane_clustering(pos_points, planes)
  File "/home/donglin/projects/PMTD-inference/demo/inference.py", line 87, in plane_clustering
    A = torch.gels(B, X)[0][:3]
RuntimeError: Lapack Error in gels : The 1-th diagonal element of the triangular factor of A is zero at /opt/conda/conda-bld/pytorch_1556653114079/work/aten/src/TH/generic/THTensorLapack.cpp:165
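
The LAPACK message means the least-squares system handed to torch.gels is rank-deficient, which is what you would expect when the points being fitted are too few or all lie on one line. Below is a minimal sketch in plain Python of fitting z = a*x + b*y + c and returning None in the degenerate case instead of raising. It is not the repository's code: `fit_plane`, `det3`, and the `eps` tolerance are illustrative names and values, and the determinant test is scale-dependent.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points, eps=1e-9):
    """Least-squares fit of z = a*x + b*y + c; None if the fit is degenerate."""
    if len(points) < 3:
        return None
    sxx = sxy = sx = syy = sy = 0.0
    sxz = syz = sz = 0.0
    for x, y, z in points:
        sxx += x * x
        sxy += x * y
        sx += x
        syy += y * y
        sy += y
        sxz += x * z
        syz += y * z
        sz += z
    # Normal equations for design-matrix rows [x, y, 1].
    normal = [[sxx, sxy, sx],
              [sxy, syy, sy],
              [sx, sy, float(len(points))]]
    rhs = [sxz, syz, sz]
    det = det3(normal)
    if abs(det) < eps:
        # Collinear or repeated points: the system has no unique solution.
        return None
    solution = []
    for col in range(3):
        # Cramer's rule: replace one column with the right-hand side.
        replaced = [row[:] for row in normal]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / det)
    return tuple(solution)


print(fit_plane([(0, 0, 1), (1, 0, 3), (0, 1, 4), (1, 1, 6)]))
print(fit_plane([(0, 0, 0), (1, 1, 1), (2, 2, 2)]))
```

A caller could fall back to the HardThreshold result for any box where the fit returns None.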

Error when running test_net.py

Hello, I set up the environment following your README, but running test_net.py then reports the error below. Do I need to compile LAPACK?
image

Question about score threshold of Bbox Branch

Q1: When I set the score threshold to 0.05, the Mask R-CNN default, the precision was very low. When I set the score threshold to 0.5, the F-measure (88.20% on the ICDAR 2015 test set) matches the score reported in the paper, but the precision and recall do not.

Method             Precision   Recall   F-Measure
Baseline of PMTD   85.84       90.55    88.14
Our Baseline       92.50       84.20    88.20
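
The two rows trade precision against recall yet land on almost the same F-measure, because F is the harmonic mean of the two. A minimal sketch for recomputing it from the table values (`f_measure` is an illustrative helper, not part of the repository; small differences from the reported figures are expected because the inputs are already rounded):

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (same units in, same units out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


rows = [
    ("Baseline of PMTD", 85.84, 90.55),
    ("Our Baseline", 92.50, 84.20),
]
for name, precision, recall in rows:
    print(f"{name}: F = {f_measure(precision, recall):.2f}")
```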

Q2: Have you done an ablation study on Data Augmentation, RPN Anchor, and OHEM? In my experiments, Data Augmentation and OHEM improve the performance, but the modification to RPN Anchor does not.

Recommended configuration for a smaller batch size setting

Dear author,
Do you have a recommended configuration for a smaller batch size? I get NaN losses with batch_size=36 and LR=0.04, even when I use 1*binary_cross_entropy loss. When I reduce the LR to 0.004 or 0.001, the model does not seem to converge well. I also tried the AMSGrad optimizer with different learning rates.

By the way, I calculate the cropped text area via cv2.findContours(). Is it OK?
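
One common heuristic when shrinking the batch is to scale the learning rate linearly with the batch size and ramp up to it with a warmup phase. The sketch below only illustrates that heuristic. The reference batch size and learning rate are placeholders, not values taken from the PMTD configuration, and `scaled_lr` and `warmup_lr` are hypothetical helper names. Whether this removes the NaN losses described above is not confirmed here.

```python
def scaled_lr(reference_lr, reference_batch, batch):
    """Linear scaling heuristic: keep the LR-to-batch-size ratio constant."""
    return reference_lr * batch / reference_batch


def warmup_lr(target_lr, step, warmup_steps, start_factor=1.0 / 3.0):
    """Linear warmup from start_factor * target_lr up to target_lr."""
    if step >= warmup_steps:
        return target_lr
    alpha = step / warmup_steps
    return target_lr * (start_factor * (1.0 - alpha) + alpha)


# Placeholder reference setting: LR 0.02 at batch size 16.
target = scaled_lr(reference_lr=0.02, reference_batch=16, batch=36)
for step in (0, 250, 500):
    print(step, warmup_lr(target, step, warmup_steps=500))
```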

OHEM implementation?

Hello,
After reading your paper, I have some questions about your OHEM implementation.
Do you mean that OHEM is used at the RPN stage? Do you use it only on the RPN?
As I understand it, you randomly sample N proposals from the RPN output, compute the loss for all N proposals, sort them by loss, and choose the top 512 to update the network.
I don't know whether my understanding is right. Could you help? Thanks.
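
The procedure described in this question (compute a loss per sampled proposal, sort, keep the hardest 512) can be sketched in plain Python as follows. This illustrates the poster's reading only, not the authors' confirmed implementation, and `ohem_select` and `ohem_loss` are hypothetical names.

```python
def ohem_select(losses, keep=512):
    """Return indices of the `keep` largest per-proposal losses, hardest first."""
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return order[:keep]


def ohem_loss(losses, keep=512):
    """Average the loss over the selected hard examples only."""
    kept = ohem_select(losses, keep)
    if not kept:
        return 0.0
    return sum(losses[i] for i in kept) / len(kept)


per_proposal_losses = [0.1, 0.9, 0.5, 0.3]
print(ohem_select(per_proposal_losses, keep=2))
print(ohem_loss(per_proposal_losses, keep=2))
```

Only the selected proposals contribute to the averaged loss, so only they drive the gradient update.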

Question about Algorithm 1 Plane Clustering

Q1. Why is the input of Algorithm 1 the segmentation result for a whole image (H*W points), while its output is only a single text bounding box (4 planes)?

Q2. What are the details of the INITPLANES function? What are the parameters (A, B, D) after calling it? I cannot tell from the paper.

Thanks !

EOFError

Hello, @JingChaoLiu
I ran into an EOFError when running train.py. Training runs without any error for a long time (about 24 hours), and then the following problem occurs:

image

Surprisingly, after the problem occurs and I interrupt the run, I can run python train_net.py again for a while before the error comes back. This repeats in a cycle.

About configurations

First, thank you for the paper and the GitHub page.
Your work is very useful for studying text detection with a Mask R-CNN baseline.
I am reproducing the results of PMTD, but my results are a little worse (Mask R-CNN baseline: 60% F-measure on the MLT dataset).
So I'm trying to figure out what is wrong with my configuration.
It would be very helpful if you could provide the config file (.yaml), or let me know the RPN.ANCHOR_STRIDE setting (I'm currently using (4, 8, 16, 32, 64)).
Thanks!

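Questions about the training data format and training procedure for other datasets

Hello, how should I run training? Does the training data need to be converted using the provided generate_icdar2017.py script? Beyond that, is it enough to run train.py directly? I look forward to your reply, and I apologize for any disturbance.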

Question about threshold of mask in baseline

I reviewed the code history and found the commit postprocess Mask by HardThreshold.

As far as I understand, this is supposed to be the baseline described in the paper, which I'm not quite sure though.

One thing I find confusing is that the threshold for the mask head (i.e. for Masker) is set to 0.01 here. Shouldn't it be 0.5 after applying sigmoid()?

I've noticed that you moved sigmoid() from post-processing to the predictor. However, I suppose that won't change the values fed into Masker, right? Also, why was it necessary to move sigmoid()?

Looking forward to your reply! @JingChaoLiu @liuxuebo0
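
On where the threshold is applied: because sigmoid is monotonic, thresholding probabilities at p gives the same binary mask as thresholding raw logits at logit(p). The sketch below, in plain Python with illustrative helper names, shows the correspondence. It does not say which threshold the authors intended. It only shows that 0.5 on probabilities equals 0 on logits, while 0.01 on probabilities corresponds to a logit of about -4.6.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Inverse of sigmoid for 0 < p < 1."""
    return math.log(p / (1.0 - p))


for p in (0.5, 0.01):
    print(f"probability threshold {p} <-> logit threshold {logit(p):.3f}")

# The two ways of thresholding agree on every sample value.
for x in (-6.0, -4.0, -1.0, 0.5, 3.0):
    assert (sigmoid(x) > 0.01) == (x > logit(0.01))
```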

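Problems encountered during training

Hello, how should I run training? Does the training data need to be converted using the provided generate_icdar2017.py script? Beyond that, is it enough to run train.py directly? I look forward to your reply, and I apologize for any disturbance.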

RuntimeError: invalid argument 2: non-empty 4D input tensor expected but got: [0 x 256 x 14 x 14]

When running tools/test_net.py, I get a runtime error. I am using the default configuration with the pretrained model. When I increase the value of IMS_PER_BATCH the error vanishes, but the resulting predictions are highly incomplete, with most of the words not detected.

  File "tools/test_net.py", line 131, in <module>
    main()
  File "tools/test_net.py", line 116, in main
    output_folder=output_folder,
  File "/home/pranav/PMTD/maskrcnn_benchmark/engine/inference.py", line 82, in inference
    predictions = compute_on_dataset(model, data_loader, device, inference_timer)
  File "/home/pranav/PMTD/maskrcnn_benchmark/engine/inference.py", line 28, in compute_on_dataset
    output = model(images)
  File "/tmp/yes/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/pranav/PMTD/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 52, in forward
    x, result, detector_losses = self.roi_heads(features, proposals, targets)
  File "/tmp/yes/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/pranav/PMTD/maskrcnn_benchmark/modeling/roi_heads/roi_heads.py", line 39, in forward
    x, detections, loss_mask = self.mask(mask_features, detections, targets)
  File "/tmp/yes/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/pranav/PMTD/maskrcnn_benchmark/modeling/roi_heads/mask_head/mask_head.py", line 71, in forward
    mask_logits = self.predictor(x)
  File "/tmp/yes/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/pranav/PMTD/maskrcnn_benchmark/modeling/roi_heads/mask_head/roi_mask_predictors.py", line 33, in forward
    x = F.relu(self.conv5_mask(x))
  File "/tmp/yes/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/tmp/yes/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/nn/modules/container.py", line 97, in forward
    input = module(input)
  File "/tmp/yes/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/tmp/yes/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/nn/modules/upsampling.py", line 134, in forward
    return F.interpolate(input, self.size, self.scale_factor, self.mode, self.align_corners)
  File "/tmp/yes/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/nn/functional.py", line 2523, in interpolate
    return torch._C._nn.upsample_bilinear2d(input, _output_size(2), align_corners)
RuntimeError: invalid argument 2: non-empty 4D input tensor expected but got: [0 x 256 x 14 x 14] at /opt/conda/conda-bld/pytorch-nightly_1553749764730/work/aten/src/THCUNN/generic/SpatialUpSamplingBilinear.cu:21

Test PMTD model on my dataset

Hi, I want to use this model on my own dataset. I am using Colab and have successfully installed all the requirements,
but I don't know how to do the rest. Can anyone help me?

RuntimeError: copy_if failed to synchronize: device-side assert triggered

@JingChaoLiu @liuxuebo0 Hello, I keep running into the problem below and I don't know the reason. Someone said the learning rate is too large, but what learning rate is OK? Could you suggest a solution?

Traceback (most recent call last):
  File "tools/train_net.py", line 186, in <module>
    main()
  File "tools/train_net.py", line 179, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "tools/train_net.py", line 85, in train
    arguments,
  File "/home/donglin/INSTALL_DIR/PMTD-inference/maskrcnn_benchmark/engine/trainer.py", line 75, in do_train
    loss_dict = model(images, targets)
  File "/home/donglin/anaconda2/envs/maskrcnn/lib/python3.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/donglin/anaconda2/envs/maskrcnn/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 367, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/donglin/anaconda2/envs/maskrcnn/lib/python3.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/donglin/anaconda2/envs/maskrcnn/lib/python3.7/site-packages/apex-0.1-py3.7-linux-x86_64.egg/apex/amp/_initialize.py", line 204, in new_fwd
    **applier(kwargs, input_caster))
  File "/home/donglin/INSTALL_DIR/PMTD-inference/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 50, in forward
    proposals, proposal_losses = self.rpn(images, features, targets)
  File "/home/donglin/anaconda2/envs/maskrcnn/lib/python3.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/donglin/INSTALL_DIR/PMTD-inference/maskrcnn_benchmark/modeling/rpn/rpn.py", line 207, in forward
    return self._forward_train(anchors, objectness, rpn_box_regression, targets)
  File "/home/donglin/INSTALL_DIR/PMTD-inference/maskrcnn_benchmark/modeling/rpn/rpn.py", line 223, in _forward_train
    anchors, objectness, rpn_box_regression, targets
  File "/home/donglin/anaconda2/envs/maskrcnn/lib/python3.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/donglin/INSTALL_DIR/PMTD-inference/maskrcnn_benchmark/modeling/rpn/inference.py", line 140, in forward
    sampled_boxes.append(self.forward_for_single_feature_map(a, o, b))
  File "/home/donglin/INSTALL_DIR/PMTD-inference/maskrcnn_benchmark/modeling/rpn/inference.py", line 115, in forward_for_single_feature_map
    boxlist = remove_small_boxes(boxlist, self.min_size)
  File "/home/donglin/INSTALL_DIR/PMTD-inference/maskrcnn_benchmark/structures/boxlist_ops.py", line 46, in remove_small_boxes
    (ws >= min_size) & (hs >= min_size)
RuntimeError: copy_if failed to synchronize: device-side assert triggered
terminate called without an active exception
terminate called without an active exception
terminate called without an active exception
terminate called without an active exception
Traceback (most recent call last):
  File "/home/donglin/anaconda2/envs/maskrcnn/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/donglin/anaconda2/envs/maskrcnn/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/donglin/anaconda2/envs/maskrcnn/lib/python3.7/site-packages/torch/distributed/launch.py", line 238, in <module>
    main()
  File "/home/donglin/anaconda2/envs/maskrcnn/lib/python3.7/site-packages/torch/distributed/launch.py", line 234, in main
    cmd=process.args)
