
cascadetabnet's Introduction

CascadeTabNet


CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents
Devashish Prasad, Ayan Gadpal, Kshitij Kapadni, Manish Visave, Kavita Sultanpure
CVPR Link of Paper
arXiv Link of Paper
Supplementary file
The paper was presented (Orals) at CVPR 2020 Workshop on Text and Documents in the Deep Learning Era

Virtual Oral Presentation YOUTUBE VIDEO
Cascadetabnet Demo by Bhavesh Bhatt YOUTUBE VIDEO

1. Introduction

CascadeTabNet is an automatic table recognition method for the interpretation of tabular data in document images. We present an improved deep learning-based end-to-end approach for solving both problems of table detection and structure recognition using a single Convolutional Neural Network (CNN) model. CascadeTabNet is a Cascade mask Region-based CNN High-Resolution Network (Cascade mask R-CNN HRNet) based model that detects the regions of tables and recognizes the structural body cells from the detected tables at the same time. We evaluate our results on the ICDAR 2013, ICDAR 2019 and TableBank public datasets. We achieved 3rd rank in the ICDAR 2019 post-competition results for table detection while attaining the best accuracy results for the ICDAR 2013 and TableBank datasets. We also attain the highest accuracy results on the ICDAR 2019 table structure recognition dataset.

2. Setup

The models are built with the PyTorch-based MMdetection framework (version 1.2).

pip install -q mmcv terminaltables
git clone --branch v1.2.0 'https://github.com/open-mmlab/mmdetection.git'
cd "mmdetection"
pip install -r "/content/mmdetection/requirements/optional.txt"
python setup.py install
python setup.py develop
pip install -r requirements.txt
pip install pillow==6.2.1 
pip install mmcv==0.4.3

The code was developed with the following library versions:

PyTorch = 1.4.0
Torchvision = 0.5.0
Cuda = 10.0

pip install torch==1.4.0+cu100 torchvision==0.5.0+cu100 -f https://download.pytorch.org/whl/torch_stable.html
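To confirm that an environment matches the versions pinned above, a small check along these lines may help. This is only a sketch and not part of the repository: `PINNED`, `installed_version` and `check_versions` are hypothetical names, and the lookup relies on the standard library's `importlib.metadata` (Python 3.8 or newer).

```python
from importlib import metadata

# Versions pinned in the setup steps above (build tags such as "+cu100" omitted).
PINNED = {"torch": "1.4.0", "torchvision": "0.5.0", "mmcv": "0.4.3", "pillow": "6.2.1"}


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_versions(pinned, lookup=installed_version):
    """Return {package: (expected, found)} for every package that does not match."""
    mismatches = {}
    for package, expected in pinned.items():
        found = lookup(package)
        # Ignore local build tags, so "1.4.0+cu100" counts as "1.4.0".
        base = found.split("+")[0] if found else None
        if base != expected:
            mismatches[package] = (expected, found)
    return mismatches
```

Calling `check_versions(PINNED)` returns an empty dictionary when everything matches. Any entry it does return names a package to reinstall at the pinned version.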

If you are using Google Colaboratory (Colab), you need to add

from google.colab.patches import cv2_imshow

and replace every call to cv2.imshow with cv2_imshow.
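Instead of editing every call site by hand, one option is to choose the display function once at start-up. The sketch below assumes nothing about the repository's code: `get_image_viewer` is a hypothetical helper that falls back to `cv2.imshow` when the Colab patch module cannot be imported.

```python
def get_image_viewer():
    """Return a function show(title, image) that works both in Colab and locally."""
    try:
        # This module is available only inside Google Colab.
        from google.colab.patches import cv2_imshow
    except ImportError:
        def show(title, image):
            import cv2  # imported lazily, so this helper loads without OpenCV
            cv2.imshow(title, image)
        return show

    def show(title, image):
        # cv2_imshow takes only the image; the window title is ignored.
        cv2_imshow(image)
    return show
```

A script would then call `show = get_image_viewer()` once and use `show("result", img)` wherever it previously called `cv2.imshow`.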

3. Model Architecture

Model Computation Graph

4. Image Augmentation


Codes: Code for dilation transform Code for smudge transform

5. Benchmarking

5.1. Table Detection

1. ICDAR 13

2. ICDAR 19 (Track A Modern)

3. TableBank

TableBank Benchmarking : Official Leaderboard

TableBank Dataset Divisions : TableBank

5.2. Table Structure Recognition

1. ICDAR 19 (Track B2)

6. Model Zoo

Check out our demo notebook for loading checkpoints and performing inference
Open In Colab
Config file for the Models
Note: The config paths only need to be changed for training
Checkpoints of the Models we have trained :

Model Name                                               Checkpoint File
General Model table detection                            Checkpoint
ICDAR 13 table detection                                 Checkpoint
ICDAR 19 (Track A Modern) table detection                Checkpoint
Table Bank Word table detection                          Checkpoint
Table Bank Latex table detection                         Checkpoint
Table Bank Both table detection                          Checkpoint
ICDAR 19 (Track B2 Modern) table structure recognition   Checkpoint

7. Datasets

  1. End to End Table Recognition Dataset
    We manually annotated some of the ICDAR 19 table competition (cTDaR) dataset images for cell detection in the borderless tables. More details about the dataset are mentioned in the paper.
    dataset link

  2. General Table Detection Dataset (ICDAR 19 + Marmot + Github)
    We manually corrected the annotations of Marmot and Github and combined them with ICDAR 19 dataset to create a general and robust dataset.
    dataset link

8. Training

You may refer to this tutorial for training MMdetection models on your custom datasets in Colab.

You may refer to this script to convert your Pascal VOC XML annotation files to a single COCO JSON file.

9. Docker

The Docker image of this project can be found on Docker Hub.

It currently contains three models from the model zoo. For details, see the readme file on Docker Hub.

Contact

Devashish Prasad : devashishkprasad [at] gmail [dot] com
Ayan Gadpal : ayangadpal2 [at] gmail [dot] com
Kshitij Kapadni : kshitij.kapadni [at] gmail [dot] com
Manish Visave : manishvisave149 [at] gmail [dot] com

Acknowledgements

We thank the following contributors, without whom the paper would not have been possible:

  1. The MMdetection project team, for creating an amazing framework that pushes state-of-the-art computer vision research and that enabled us to experiment and build state-of-the-art models very easily.

  2. Our college, "Pune Institute of Computer Technology", for funding our research and giving us the opportunity to work and publish our research at an international conference.

  3. Kai Chen, for endorsing our paper on arXiv so that we could publish a pre-print, and also for maintaining the MMdetection repository along with the team.

  4. The Google Colaboratory team, for providing free high-end GPU resources for research and development. The entire code base was developed using Google Colab and would not have been possible without it.

  5. AP Analytica, for making us aware of a similar problem statement and giving us an opportunity to work on it.

  6. Overleaf.com, for open-sourcing the wonderful project that enabled us to write the research paper easily in LaTeX format.

License

The code of CascadeTabNet is open source under the MIT License. There is no limitation on either academic or commercial usage.

Cite as

If you find this work useful for your research, please cite our paper:

@misc{ cascadetabnet2020,
    title={CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents},
    author={Devashish Prasad and Ayan Gadpal and Kshitij Kapadni and Manish Visave and Kavita Sultanpure},
    year={2020},
    eprint={2004.12629},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

cascadetabnet's People

Contributors

akadirpamukcu, ayangadpal, devashishprasad, francescoperessini, kshitijkapadni, manishdv, mhmd-azeez, mrzilinxiao


cascadetabnet's Issues

Detection results look wrong

When I test the image named cTDaR_t10120.jpg from the Examples folder, only one table is detected, and the cell picture is as follows:

I used epoch_36 for testing. In addition, once the code reaches " table: [ 323 208 1135 557]
[Table status] : Processing table with lines" it never exits and stays stuck there.
cells

mmdetection import library error

Hi Devashish -

In the main.py file, I see the mmdetection import statement as:
from mmdet.apis import inference_detector, show_result, init_detector

The "show_result" must be changed because it has now been renamed as "show_result_pyplot" in mmdetection. The import statement should be as follows:

from mmdet.apis import inference_detector, show_result_pyplot, init_detector

Thanks,
Sekhar H.
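For code that has to run against more than one mmdetection release, the function can be looked up by name at runtime instead of hard-coding one import. The sketch below is not from the repository: `first_available` is a hypothetical helper. It only resolves the name, and the two functions are not guaranteed to accept the same arguments, so call sites may still need adjusting.

```python
import importlib


def first_available(module_name, candidates):
    """Return the first attribute listed in `candidates` that the module provides."""
    module = importlib.import_module(module_name)
    for name in candidates:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"{module_name} provides none of: {', '.join(candidates)}")


# Intended use (requires mmdetection to be installed):
# show = first_available("mmdet.apis", ["show_result_pyplot", "show_result"])
```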

Predicted image co-ordinates are lifted uniformly above original cell value.

Hi,

output xml coordinates plotted on the input images:
https://prnt.sc/sl1f28
https://prnt.sc/sl1fi5
https://prnt.sc/sl1g42
(All blue lines are manually drawn based on the xml output coordinates for these images)

I have attached the input images that I used with this model, with the bounding boxes drawn manually from the XML output.

In all these images the bounding boxes are uniformly shifted upwards. This suggests that the image I send is altered at prediction time and that the XML output holds the coordinates of the altered image (please correct me if this is not the case).

If so, please let me know where I can get the altered images, so that the XML output coordinates match when plotted.


The next question: if the above is true, how do the table detection coordinates match correctly? For all 3 attached table images, apart from the cell level, I have also drawn the bounding box for the entire table, and it fits perfectly.

If all the cell-level coordinates are realigned, how come only the table-level coordinates are correct?

Thanks,
Anand.

Post-processing in this test case is so slow

if len(res_border) != 0:
    ## call border script for each table in image
    for res in res_border:
        try:
            root.append(border(res,cv2.imread(i)))
        except:
            pass
if len(res_bless) != 0:
    if len(res_cell) != 0:
        for no,res in enumerate(res_bless):
            root.append(borderless(res,cv2.imread(i),res_cell))

test image:https://raw.githubusercontent.com/cndplab-founder/ICDAR2019_cTDaR/master/test/TRACKB2/cTDaR_t10080.jpg
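One thing that stands out in the loop above is that the image is decoded again for every table, and that the bare `except: pass` hides any failure in the bordered branch. Whether either is the real bottleneck is not established here. The sketch below shows the same control flow with a single read. `build_table_elements` is a hypothetical wrapper, and `read_image`, `border` and `borderless` are passed in as parameters (for example `cv2.imread` and the repository's functions). It assumes those functions do not modify the image in place. If they do, pass a copy instead.

```python
def build_table_elements(image_path, res_border, res_bless, res_cell,
                         read_image, border, borderless):
    """Run post-processing for every detected table, decoding the image once."""
    image = read_image(image_path)  # decode once instead of once per table
    elements = []
    for res in res_border:
        try:
            elements.append(border(res, image))
        except Exception as exc:
            # Report the failure instead of silently discarding it.
            print(f"border() failed for table {res}: {exc}")
    if res_cell:
        for res in res_bless:
            elements.append(borderless(res, image, res_cell))
    return elements
```

Timing `border` and `borderless` separately on the test image would show whether the time goes into decoding or into the line and cell processing itself.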

temp_lines_ver is None in borderFunc.py

Hi,

first of all, thank you for the great work.

I tried to run main.py in Table Structure Recognition and encountered the problem that temp_lines_ver is set to None when extract_table in borderFunc.py is called without the lines parameter.

See here

So when iterating over it here, this will obviously throw an exception.

I'm not familiar enough with the code yet. But should be an easy one to fix for someone who is.

Cheers

the function of "Table Structure Recognition" folder

Hello, I have a question about the purpose of the "Table Structure Recognition" folder. The "main" file under this folder can generate XML. Is the generated XML file used as label data for the model (for training the model from scratch)? Or does it generate XML files describing the table structure based on the data predicted by the model?

Run a prediction

I am using PyTorch for the first time. I am not able to understand how to run a prediction on an image to get the table. Please guide.

Fine Tuning

Hi, can you please provide scripts used for training for the purpose of fine-tuning the model.

Thanks

Questions about the prediction of the model

The prediction of the model is a list of 80 arrays. Which one represents the cell bounding boxes and which represents the table bounding boxes? I am interested in extracting the vertices for bounding box of table.
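A hedged sketch of how such a result could be unpacked follows. It assumes the layout commonly returned by mask models in mmdetection 1.x, a pair `(bbox_results, segm_results)` in which `bbox_results[c]` holds rows of `[x1, y1, x2, y2, score]` for class `c`. The length of 80 would then simply reflect the number of classes in the config. The class order used here (0 for bordered tables, 1 for cells, 2 for borderless tables) and the 0.85 threshold are assumptions that should be checked against the repository's main.py. `boxes_for_class` is a hypothetical helper, and the data is synthetic.

```python
def boxes_for_class(result, class_index, score_thr=0.85):
    """Return [x1, y1, x2, y2] integer boxes of one class scoring above `score_thr`.

    `result` is assumed to be a (bbox_results, segm_results) pair in which
    bbox_results[c] holds rows of [x1, y1, x2, y2, score] for class c.
    """
    bbox_results = result[0]
    boxes = []
    for row in bbox_results[class_index]:
        if row[4] > score_thr:
            boxes.append([int(v) for v in row[:4]])
    return boxes


# Assumed class order; verify it against the repository's main.py.
BORDERED, CELL, BORDERLESS = 0, 1, 2

# Synthetic stand-in for a real prediction: 80 per-class lists, mostly empty.
bbox_results = [[] for _ in range(80)]
bbox_results[BORDERED] = [[323.2, 208.7, 1135.1, 557.9, 0.99],
                          [10.0, 10.0, 50.0, 50.0, 0.30]]
fake_result = (bbox_results, None)

tables = boxes_for_class(fake_result, BORDERED)
```

Each returned box gives the top-left and bottom-right corners. The four vertices of a table are then `(x1, y1)`, `(x2, y1)`, `(x2, y2)` and `(x1, y2)`.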

CUDA error

Hi, I have the right version of CUDA and am still getting this error while running the main file. Can you help me with this?

Environment :
sys.platform: linux
Python: 3.6.9 (default, Apr 18 2020, 01:56:04) [GCC 8.4.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.0, V10.0.130
GPU 0: Tesla P100-PCIE-16GB
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.4.0+cu100
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CUDA Runtime 10.0
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.3
  • Magma 2.5.1
  • Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.5.0+cu100
OpenCV: 4.1.2
MMCV: 0.5.3
MMDetection: 1.2.0+0f33c08
MMDetection Compiler: GCC 7.5
MMDetection CUDA Compiler: 10.0

Error Traceback

Traceback (most recent call last):
File "Table Structure Recognition/main.py", line 23, in
result = inference_detector(model, i)
File "/content/drive/My Drive/dundun/mmdetection/mmdet/apis/inference.py", line 86, in inference_detector
result = model(return_loss=False, rescale=True, **data)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/content/drive/My Drive/dundun/mmdetection/mmdet/core/fp16/decorators.py", line 49, in new_func
return old_func(*args, **kwargs)
File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/detectors/base.py", line 149, in forward
return self.forward_test(img, img_metas, **kwargs)
File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/detectors/base.py", line 130, in forward_test
return self.simple_test(imgs[0], img_metas[0], **kwargs)
File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/detectors/cascade_rcnn.py", line 324, in simple_test
self.test_cfg.rpn) if proposals is None else proposals
File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/detectors/test_mixins.py", line 34, in simple_test_rpn
proposal_list = self.rpn_head.get_bboxes(proposal_inputs)
File "/content/drive/My Drive/dundun/mmdetection/mmdet/core/fp16/decorators.py", line 127, in new_func
return old_func(*args, **kwargs)
File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/anchor_heads/anchor_head.py", line 276, in get_bboxes
scale_factor, cfg, rescale)
File "/content/drive/My Drive/dundun/mmdetection/mmdet/models/anchor_heads/rpn_head.py", line 92, in get_bboxes_single
proposals, _ = nms(proposals, cfg.nms_thr)
File "/content/drive/My Drive/dundun/mmdetection/mmdet/ops/nms/nms_wrapper.py", line 54, in nms
inds = nms_cuda.nms(dets_th, iou_thr)
RuntimeError: CUDA error: no kernel image is available for execution on the device (launch_kernel at /pytorch/aten/src/ATen/native/cuda/Loops.cuh:103)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x33 (0x7fde7991b193 in /usr/local/lib/python3.6/dist-packages/torch/lib/libc10.so)
frame #1: void at::native::gpu_index_kernel<_nv_dl_wrapper_t<nv_dl_tag<void (
)(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef), &(void at::native::index_kernel_impl<at::native::OpaqueType<8> >(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef)), 1u>> >(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef, __nv_dl_wrapper_t<_nv_dl_tag<void (
)(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef), &(void at::native::index_kernel_impl<at::native::OpaqueType<8> >(at::TensorIterator&, c10::ArrayRef, c10::ArrayRef)), 1u>> const&) + 0x7bb (0x7fde7f58387b in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch.so)

Unable to fine-tune due to missing mask labels

Hi, I am currently fine-tuning the pre-trained model (epoch36.pth) but I am encountering an error whenever I load my custom dataset generated using LabelImg.

Traceback (most recent call last):
  File "tools/train.py", line 151, in <module>
    main()
  File "tools/train.py", line 147, in main
    meta=meta)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/apis/train.py", line 165, in train_detector
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/usr/local/lib/python3.6/dist-packages/mmcv/runner/runner.py", line 384, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/mmcv/runner/runner.py", line 279, in train
    for i, data_batch in enumerate(data_loader):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.6/dist-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/datasets/custom.py", line 132, in __getitem__
    data = self.prepare_train_img(idx)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/datasets/custom.py", line 145, in prepare_train_img
    return self.pipeline(results)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/datasets/pipelines/compose.py", line 24, in __call__
    data = t(data)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/datasets/pipelines/loading.py", line 147, in __call__
    results = self._load_masks(results)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.2.0+0f33c08-py3.6-linux-x86_64.egg/mmdet/datasets/pipelines/loading.py", line 125, in _load_masks
    gt_masks = results['ann_info']['masks']
KeyError: 'masks'

I noticed specifically from the config file that the training pipeline requires masks to be enabled.

    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),

Is there something to be done when annotating using LabelImg that you guys did differently to indicate the existence of label masks? I saw the example provided and did the same but still getting an error about masks. I also set with_mask=False but I don't honestly know how relevant would that be to the whole training process.

Example annotation from LabelImg:

<annotation>
	<folder>jpeg_images</folder>
	<filename>acc_2018_fs_008.jpg</filename>
	<path>/Users/rt/Desktop/99_annotated/jpeg_images/acc_2018_fs_008.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>4958</width>
		<height>7017</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
        <name>borderless</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>993</xmin>
            <ymin>1020</ymin>
            <xmax>4223</xmax>
            <ymax>5479</ymax>
        </bndbox>
    </object>
	<object>
		<name>cell</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>3559</xmin>
			<ymin>1047</ymin>
			<xmax>4021</xmax>
			<ymax>1107</ymax>
		</bndbox>
	</object>
</annotation>

Thank you and I appreciate this awesome work by the way.
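LabelImg writes bounding boxes only, so a pipeline configured with `with_mask=True` has no mask to load. One possible workaround, sketched below, is to give each box a rectangular polygon as its mask when converting the annotations to COCO JSON. This is an assumption about what would satisfy the loader, not a description of how the authors prepared their data. `voc_objects_to_coco` and the category id mapping are hypothetical, and whether a box-shaped mask is adequate depends on the task.

```python
import xml.etree.ElementTree as ET


def voc_objects_to_coco(xml_text, image_id, category_ids, first_ann_id=1):
    """Convert the <object> boxes of one Pascal VOC file into COCO annotations.

    Each box also receives a four-corner polygon as its `segmentation`, so that
    a COCO-style loader running with with_mask=True finds a mask per instance.
    """
    root = ET.fromstring(xml_text)
    annotations = []
    for offset, obj in enumerate(root.iter("object")):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        width, height = xmax - xmin, ymax - ymin
        annotations.append({
            "id": first_ann_id + offset,
            "image_id": image_id,
            "category_id": category_ids[obj.findtext("name")],
            "bbox": [xmin, ymin, width, height],  # COCO stores x, y, width, height
            "area": width * height,
            "iscrowd": 0,
            # Rectangle corners, clockwise from the top-left.
            "segmentation": [[xmin, ymin, xmax, ymin, xmax, ymax, xmin, ymax]],
        })
    return annotations


# The "cell" object from the example annotation above; the id 2 is made up.
example = """<annotation><object><name>cell</name><bndbox>
<xmin>3559</xmin><ymin>1047</ymin><xmax>4021</xmax><ymax>1107</ymax>
</bndbox></object></annotation>"""
anns = voc_objects_to_coco(example, image_id=1, category_ids={"cell": 2})
```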

Has an alternative for colab?

I want to debug the code to understand it better. I originally used VS Code but gave up because my Mac does not support CUDA, so I tried Colab. However, debugging there is not a good experience for me, and each operation is slow.

Is there an alternative to Colab, or do you have any tips?

name 'etree' is not defined

[Table status] : Processing table with lines
<PIL.Image.Image image mode=RGB size=812x349 at 0x7F9D02BBA860>
<PIL.Image.Image image mode=RGB size=1224x1584 at 0x7F9D012A60B8>
Traceback (most recent call last):
File "CascadeTabNet/Table Structure Recognition/main.py", line 53, in
root.append(borderless(res,cv2.imread(i),res_cell))
File "/content/gdrive/My Drive/CascadeTabNet/CascadeTabNet/Table Structure Recognition/Functions/blessFunc.py", line 365, in borderless
tableXML = etree.Element("table")
NameError: name 'etree' is not defined
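The traceback indicates that blessFunc.py uses the name `etree` without importing it. The repository may rely on lxml, which is not confirmed here. The sketch below uses the standard library instead, whose `Element` and `SubElement` are available under the same names. The `Coords` element and its `points` value are illustrative only.

```python
import xml.etree.ElementTree as etree

# Build a small table element the same way the failing line does.
table_xml = etree.Element("table")
coords = etree.SubElement(table_xml, "Coords")
coords.set("points", "323,208 323,557 1135,557 1135,208")

# Serialize to bytes for inspection or for writing to a file.
xml_bytes = etree.tostring(table_xml)
```

Adding the corresponding import at the top of blessFunc.py should resolve the `NameError`. If the rest of the code depends on lxml-only features, `from lxml import etree` would be the import to use.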

XML output of extracted tabular text

Hi Devashish -

For reference, is it possible to upload the XML output results of extracted tabular text for a few example documents?

Thanks,
Sekhar H.

repair table

mask

The lines at the top are missing. How can I repair them? Thank you.

Table structure recognition is not predicted for second table of demo image

First and foremost, thanks for this interesting paper and also this repository!

Now, as you can see in the README, in the demo gif not only both tables are detected but structure recognition is successful for both tables (in the last step of the animation).

However, when predicting this demo image, I get the different results:

image

As you can see in the screenshot, both tables are detected successfully. But in the right table no cell is recognised. In the left table, cells in the last columns are also not recognised. I'm using the same checkpoint file and configuration as in the demo Jupyter notebook. I tried lowering the threshold, but that didn't help.

How can I improve the prediction so that I get the same performance as shown in the demo gif? Am I missing some postprocessing, or am I not using the optimal configuration, or something else? I'm not sure, I hope you could help.

Thanks!

Variation in Results

I was running the evaluation on ICDAR-13 using pre-trained model which you have provided and using the default configuration file.
There is a huge variation in the results.
recall:1.0, precision:0.843, f_measure:0.9216

Are you doing pre-processing on the testing images as well??

mmdetection v1.2 won't install without a GPU

Hi - It looks like I can't install mmdetection v1.2 without a GPU even though I installed CUDA 10.0 and the appropriate version of cuDNN. Is this understanding correct? My installation is failing with the error - "no CUDA-capable device is detected".

I'm unable to find any proper information in open-mmlabs about this subject. However, I'm able to install v2.0 without GPU because I believe 2.0 has a default check to fallback to CPU if a GPU device is not found.

Keeping your model at v1.2 for people who run their projects with no GPU will likely make the usage of these models limited. Is there a way to convert the model trained on v1.2 to v2.0?

Thanks,
Sekhar H.

Hi all,

I'm running the demo on Colab, but the "Run the Predictions" step does not succeed.

Prediction in colab

Hi, I'm a newbie with mmdetection. I had some issues (also mentioned in the open issues on GitHub) trying to run it on CPU, so I decided to try a prediction in Colab, but I hit the error below and could not find the solution on my own.

image

directory not empty error

No CUDA runtime is found, using CUDA_HOME='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2'
Traceback (most recent call last):
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\site-packages\mmcv\utils\config.py", line 92, in _file2dict
    osp.join(temp_config_dir, temp_config_name))
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\shutil.py", line 121, in copyfile
    with open(dst, 'wb') as fdst:
PermissionError: [Errno 13] Permission denied: 'C:\\Users\\ABC\\AppData\\Local\\Temp\\tmp2x0tbf44\\tmpg99tl4cb.py'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 17, in <module>
    model = init_detector(config_fname, os.path.join(checkpoint_path, epoch))
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\site-packages\mmdet-2.0.0+a9bedfb-py3.6-win-amd64.egg\mmdet\apis\inference.py", line 28, in init_detector
    config = mmcv.Config.fromfile(config)
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\site-packages\mmcv\utils\config.py", line 165, in fromfile
    cfg_dict, cfg_text = Config._file2dict(filename)
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\site-packages\mmcv\utils\config.py", line 105, in _file2dict
    temp_config_file.close()
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\tempfile.py", line 809, in __exit__
    self.cleanup()
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\tempfile.py", line 813, in cleanup
    _shutil.rmtree(self.name)
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\shutil.py", line 494, in rmtree
    return _rmtree_unsafe(path, onerror)
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\shutil.py", line 393, in _rmtree_unsafe
    onerror(os.rmdir, path, sys.exc_info())
  File "C:\Users\ABC\AppData\Local\Continuum\anaconda3\envs\python3env\lib\shutil.py", line 391, in _rmtree_unsafe
    os.rmdir(path)
OSError: [WinError 145] The directory is not empty: 'C:\\Users\\ABC\\AppData\\Local\\Temp\\tmp2x0tbf44'


When running main.py with this config file and this model file, I keep getting this error.

Other Details:
CUDA version : 9.2
OS : Windows
Main.py config

image_path = 'Examples\\cTDaR_t10120.jpg'
xmlPath = 'Examples'


config_fname = "Examples\\faster_rcnn_hrnetv2p_w32_1x_coco.py" 
checkpoint_path = "C:\\Users\\ABC\\Music\\table-structure-rec\\CascadeTabNet\\Table Structure Recognition\\Examples\\"
epoch = 'faster_rcnn_hrnetv2p_w32_1x_coco_20200130-6e286425.pth'


Borderless tables

I am not able to produce any output for borderless tables. Is the code for cell masks in borderless tables released, or am I missing something?

Training error

I was trying to train the model on a custom dataset for table detection in my local system with COCO style annotations.

However, I encounter an error while training
mmdet - ERROR - The testing results of the whole dataset is empty.
The evaluation results are all empty on the validation data, and hence I am not able to generate results, as I get empty arrays as output.

I am not able to identify any issue. Any help will be appreciated.

How to train your model from scratch?

Hi! Very interesting paper and I am interested in training the model from scratch. Do you have a script available for reference? I am not an expert in object detection and a reference script for full pipeline training would be greatly appreciated.

Thanks.

Training Metrics (Precision, Recall, and F1)

Hello, thank you for the VOC to Coco script. It was very helpful. Would it be fine to ask if you used a custom script to measure model accuracy? Are there any resources you can point where I can get more information?
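For reference, one common way to compute these metrics for detection is to match predicted boxes to ground-truth boxes at an IoU threshold and count the matches. The sketch below is a generic illustration with greedy one-to-one matching. It is not the authors' evaluation script and not the official ICDAR protocol, and the 0.6 threshold is only an example value.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(predicted, ground_truth, iou_thr=0.6):
    """Greedily match each prediction to one unused ground-truth box at `iou_thr`."""
    unmatched = list(ground_truth)
    true_positives = 0
    for box in predicted:
        best = max(unmatched, key=lambda gt: iou(box, gt), default=None)
        if best is not None and iou(box, best) >= iou_thr:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1
```

Official competition scores are often averaged over several IoU thresholds, so numbers from a sketch like this may not match a leaderboard exactly.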

RuntimeError: cuda runtime error (209) : unrecognized error code at mmdet/ops/roi_align/src/roi_align_kernel.cu:139

I'm trying to run ICDAR-13 model. But I'm getting this error.


RuntimeError Traceback (most recent call last)

in ()
10
11 # Run Inference
---> 12 result = inference_detector(model, img)
13
14 # Visualization results

11 frames

/content/drive/My Drive/mmdetection/mmdet/apis/inference.py in inference_detector(model, img)
84 # forward the model
85 with torch.no_grad():
---> 86 result = model(return_loss=False, rescale=True, **data)
87 return result
88

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)

/content/drive/My Drive/mmdetection/mmdet/core/fp16/decorators.py in new_func(*args, **kwargs)
47 'method of nn.Module')
48 if not (hasattr(args[0], 'fp16_enabled') and args[0].fp16_enabled):
---> 49 return old_func(*args, **kwargs)
50 # get the arg spec of the decorated method
51 args_info = getfullargspec(old_func)

/content/drive/My Drive/mmdetection/mmdet/models/detectors/base.py in forward(self, img, img_metas, return_loss, **kwargs)
147 return self.forward_train(img, img_metas, **kwargs)
148 else:
--> 149 return self.forward_test(img, img_metas, **kwargs)
150
151 def show_result(self, data, result, dataset=None, score_thr=0.3):

/content/drive/My Drive/mmdetection/mmdet/models/detectors/base.py in forward_test(self, imgs, img_metas, **kwargs)
128 if 'proposals' in kwargs:
129 kwargs['proposals'] = kwargs['proposals'][0]
--> 130 return self.simple_test(imgs[0], img_metas[0], **kwargs)
131 else:
132 # TODO: support test augmentation for predefined proposals

/content/drive/My Drive/mmdetection/mmdet/models/detectors/cascade_rcnn.py in simple_test(self, img, img_metas, proposals, rescale)
340
341 bbox_feats = bbox_roi_extractor(
--> 342 x[:len(bbox_roi_extractor.featmap_strides)], rois)
343 if self.with_shared_head:
344 bbox_feats = self.shared_head(bbox_feats)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)

/content/drive/My Drive/mmdetection/mmdet/core/fp16/decorators.py in new_func(*args, **kwargs)
125 'method of nn.Module')
126 if not (hasattr(args[0], 'fp16_enabled') and args[0].fp16_enabled):
--> 127 return old_func(*args, **kwargs)
128 # get the arg spec of the decorated method
129 args_info = getfullargspec(old_func)

/content/drive/My Drive/mmdetection/mmdet/models/roi_extractors/single_level.py in forward(self, feats, rois, roi_scale_factor)
103 if inds.any():
104 rois_ = rois[inds, :]
--> 105 roi_feats_t = self.roi_layers[i](feats[i], rois_)
106 roi_feats[inds] = roi_feats_t
107 return roi_feats

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)

/content/drive/My Drive/mmdetection/mmdet/ops/roi_align/roi_align.py in forward(self, features, rois)
142 else:
143 return roi_align(features, rois, self.out_size, self.spatial_scale,
--> 144 self.sample_num, self.aligned)
145
146 def __repr__(self):

/content/drive/My Drive/mmdetection/mmdet/ops/roi_align/roi_align.py in forward(ctx, features, rois, out_size, spatial_scale, sample_num, aligned)
34 out_w)
35 roi_align_cuda.forward_v1(features, rois, out_h, out_w,
---> 36 spatial_scale, sample_num, output)
37 else:
38 output = roi_align_cuda.forward_v2(features, rois,

RuntimeError: cuda runtime error (209) : unrecognized error code at mmdet/ops/roi_align/src/roi_align_kernel.cu:139

Can anyone help me solve this?? Thanks in advance.
