
retinamask's Introduction

RetinaMask

The code is based on the maskrcnn-benchmark.


Citing RetinaMask

Please cite RetinaMask in your publications if it helps your research:

@inproceedings{fu2019retinamask,
  title = {{RetinaMask}: Learning to predict masks improves state-of-the-art single-shot detection for free},
  author = {Fu, Cheng-Yang and  Shvets, Mykhailo and Berg, Alexander C.},
  booktitle = {arXiv preprint arXiv:1901.03353},
  year = {2019}
}

Contents

  1. Installation
  2. Models

Installation

Follow the maskrcnn-benchmark instructions to install the code and set up the dataset. Use the config files in ./configs/retina/ for training and testing.

Models

Model                     BBox AP (AP/AP50/AP75/APs/APm/APl)   BBox time   Mask AP (AP/AP50/AP75/APs/APm/APl)   Mask time   Link
ResNet-50-FPN             39.4/58.6/42.3/21.9/42.0/51.0        0.124       34.9/55.7/37.1/15.1/36.7/50.4        0.139       link
ResNet-101-FPN            41.4/60.8/44.6/23.0/44.5/53.5        0.145       36.6/58.0/39.1/16.2/38.8/52.7        0.160       link
ResNet-101-FPN-GN         41.7/61.7/45.0/23.5/44.7/52.8        0.153       36.7/58.8/39.3/16.4/39.4/52.6        0.164       link
ResNeXt32x8d-101-FPN-GN   42.6/62.5/46.0/24.8/45.6/53.8        0.231       37.4/59.8/40.0/17.6/39.9/53.4        0.270       link

P.S. The evaluation metrics are AP, AP50, AP75, AP(small), AP(medium), and AP(large); please refer to COCO for a detailed explanation. Inference times are in seconds, measured on an Nvidia 1080 Ti.

Run Inference

Use the following scripts (this assumes the models have been downloaded to the ./models directory).

Run Mask and BBox

python tools/test_net.py --config-file ./configs/retina/retinanet_mask_R-50-FPN_2x_adjust_std011_ms.yaml MODEL.WEIGHT ./models/retinanet_mask_R-50-FPN_2x_adjust_std011_ms_model.pth

Run BBox only

python tools/test_net.py --config-file ./configs/retina/retinanet_mask_R-50-FPN_2x_adjust_std011_ms.yaml MODEL.WEIGHT ./models/retinanet_mask_R-50-FPN_2x_adjust_std011_ms_model.pth MODEL.MASK_ON False
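The trailing KEY VALUE pairs override entries from the yaml config through yacs. A minimal sketch of the same mechanism in Python, assuming the repo follows maskrcnn-benchmark's config module:

from maskrcnn_benchmark.config import cfg

# Load the base config, then apply command-line-style overrides via yacs.
cfg.merge_from_file("./configs/retina/retinanet_mask_R-50-FPN_2x_adjust_std011_ms.yaml")
cfg.merge_from_list([
    "MODEL.WEIGHT", "./models/retinanet_mask_R-50-FPN_2x_adjust_std011_ms_model.pth",
    "MODEL.MASK_ON", False,
])
cfg.freeze()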


retinamask's Issues

About the speed issue

Hi, I tested maskrcnn-benchmark with a ResNet-50 Mask R-CNN,
and it ran at about 15 fps at best on a GTX 1080 Ti, which is about 66 ms per frame.

Why does the one-stage RetinaNet-based model take 120+ ms in your README?
It should theoretically be much faster.
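For per-frame numbers like these, GPU timing needs explicit synchronization; a minimal measurement sketch, where model and images are placeholders:

import time
import torch

def time_inference(model, images, warmup=10, iters=100):
    # Average per-forward time in seconds; model and images are placeholders.
    model.eval()
    with torch.no_grad():
        for _ in range(warmup):       # warm up CUDA kernels and the allocator
            model(images)
        torch.cuda.synchronize()      # flush queued GPU work before timing
        start = time.time()
        for _ in range(iters):
            model(images)
        torch.cuda.synchronize()      # wait for the last forward to finish
    return (time.time() - start) / iters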

Could it work on cpu?

❓ Questions and Help

Because of the epidemic, I still can't go back to university to use a GPU. Could it work on a CPU?
It would be appreciated if you replied soon. Thanks!

AdjustSmoothL1Loss subtracts two quantities with different dimensions

❓ Questions and Help

I'm confused by AdjustSmoothL1Loss using running_mean - running_var in the paper. The running mean and running variance are quantities of different dimensions, so subtracting one from the other has no meaning.

E.g., if running_mean is measured in meters, then running_var is in m².
According to dimensional analysis:

Only commensurable quantities (physical quantities having the same dimension) may be compared, equated, added, or subtracted.

Maybe use the standard deviation instead?
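For reference, a minimal sketch of the variant this issue suggests: derive beta from the running mean minus the running standard deviation, which has the same dimension as the mean (names mirror the AdjustSmoothL1Loss code quoted further down this page):

import torch

def adjusted_beta(running_mean, running_var, beta_max=1. / 9, beta_min=1e-3):
    # The standard deviation has the same units as the mean, so this
    # subtraction is dimensionally consistent.
    running_std = running_var.clamp(min=0).sqrt()
    return (running_mean - running_std).clamp(max=beta_max, min=beta_min)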

Add speed test vs original maskrcnn-benchmark

Thanks for integrating RetinaNet into maskrcnn-benchmark. Are there any plans to post a speed evaluation of RetinaNet in the maskrcnn-benchmark architecture?

It would be great if it ran in real time (15 fps for both inference and visualization)!

Experiments on VOC

❓ Questions and Help

I converted the PASCAL VOC dataset into the COCO dataset format and trained a new model, but the AP is very low (almost zero). Do you have any suggestions on how to train RetinaMask with VOC?

Error while training RetinaMask on a COCO-format dataset with my own images

🐛 Bug

Error while starting the training using tools/train_net.py

I have my own dataset in COCO format on which I want to train RetinaMask. I am getting the following error when starting the training.

Traceback (most recent call last):
  File "tools/train_net.py", line 18, in <module>
    from maskrcnn_benchmark.engine.inference import inference
  File "/data/home/amjn_cowi/amardeep/retinamask/retinamask/maskrcnn_benchmark/engine/inference.py", line 20, in <module>
    from maskrcnn_benchmark.structures.boxlist_ops import boxlist_iou
  File "/data/home/amjn_cowi/amardeep/retinamask/retinamask/maskrcnn_benchmark/structures/boxlist_ops.py", line 6, in <module>
    from maskrcnn_benchmark.layers import nms as _box_nms
  File "/data/home/amjn_cowi/amardeep/retinamask/retinamask/maskrcnn_benchmark/layers/__init__.py", line 8, in <module>
    from .nms import nms
  File "/data/home/amjn_cowi/amardeep/retinamask/retinamask/maskrcnn_benchmark/layers/nms.py", line 3, in <module>
    from maskrcnn_benchmark import _C
ImportError: cannot import name '_C' from 'maskrcnn_benchmark' (/data/home/amjn_cowi/amardeep/retinamask/retinamask/maskrcnn_benchmark/__init__.py)

Can somebody guide me on how to fix it?

I used the following command to start the training:
python tools/train_net.py --config-file=configs/retina/retinanet_mask_R-50-FPN_1.5x.yaml

I followed the maskrcnn-benchmark INSTALL.md file to set up the environment.

Thanks in Advance

P6 is different from the paper

The original RetinaNet paper mentions that P6 is generated from C5, but in this implementation P6 is generated from P5. Did you run any experiments comparing these two implementations?
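For context, a minimal sketch of the two options being compared; the channel counts (2048 for C5 in ResNet-50/101, 256 for FPN levels) and the stride-2 3x3 conv follow the standard FPN extra-level construction, not code from this repo:

import torch
from torch import nn

c5 = torch.randn(1, 2048, 25, 25)  # last ResNet stage output (dummy tensor)
p5 = torch.randn(1, 256, 25, 25)   # corresponding FPN level (dummy tensor)

# RetinaNet paper: P6 from C5.
p6_from_c5 = nn.Conv2d(2048, 256, kernel_size=3, stride=2, padding=1)(c5)
# This implementation, per the issue: P6 from P5.
p6_from_p5 = nn.Conv2d(256, 256, kernel_size=3, stride=2, padding=1)(p5)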

Issue about ImageNetPretrained weights

❓ Questions and Help

Hi, I can't get the ImageNet-pretrained weights when running the training script. The downloaded R-50.pkl just contains an S3 error response:

NoSuchBucket: The specified bucket does not exist (Bucket: detectron)

and loading it fails with: _pickle.UnpicklingError: invalid load key, '<'.
How can I solve this issue, or can you give me some other links?
@chengyangfu
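For what it's worth, a load key of '<' means the file begins with an angle bracket, i.e. the download saved an HTML/XML error page (like the NoSuchBucket response above) instead of a real pickle. A quick check, assuming the file sits in the working directory:

with open("R-50.pkl", "rb") as f:
    print(f.read(64))  # b'<?xml ...' or b'<html ...' means a failed download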

Non-existent config key: MODEL.BACKBONE.OUT_CHANNELS

I followed the installation instructions and everything went fine until I tried to test the pre-trained model by executing:

python tools/test_net.py --config-file ./configs/retina/retinanet_mask_R-50-FPN_2x_adjust_std011_ms.yaml MODEL.WEIGHT ./models/retinanet_mask_R-50-FPN_2x_adjust_std011_ms_model.pth MODEL.MASK_ON False

This leads to the following error:

Traceback (most recent call last):
  File "tools/test_net.py", line 100, in <module>
    main()
  File "tools/test_net.py", line 55, in main
    cfg.merge_from_file(args.config_file)
  File "/home/flo/anaconda3/envs/uav-challenge/lib/python3.6/site-packages/yacs/config.py", line 213, in merge_from_file
    self.merge_from_other_cfg(cfg)
  File "/home/flo/anaconda3/envs/uav-challenge/lib/python3.6/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/home/flo/anaconda3/envs/uav-challenge/lib/python3.6/site-packages/yacs/config.py", line 460, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/home/flo/anaconda3/envs/uav-challenge/lib/python3.6/site-packages/yacs/config.py", line 460, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/home/flo/anaconda3/envs/uav-challenge/lib/python3.6/site-packages/yacs/config.py", line 473, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: MODEL.BACKBONE.OUT_CHANNELS'

Any ideas?
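For context, yacs rejects any key that is missing from the code's default config, which usually means the config file and the checked-out code come from different versions. A minimal reproduction of the mechanism, with illustrative keys:

from yacs.config import CfgNode as CN

defaults = CN()
defaults.MODEL = CN()
defaults.MODEL.BACKBONE = CN()
defaults.MODEL.BACKBONE.CONV_BODY = "R-50-FPN"

loaded = CN()
loaded.MODEL = CN()
loaded.MODEL.BACKBONE = CN()
loaded.MODEL.BACKBONE.OUT_CHANNELS = 256  # key unknown to the defaults

defaults.merge_from_other_cfg(loaded)  # raises KeyError: Non-existent config key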

RuntimeError: The size of tensor a (0) must match the size of tensor b (225603) at non-singleton dimension 0

🐛 Bug

Hi, thanks for sharing.
I'm training on a custom dataset using:
python tools/train_net.py --config-file ./configs/retina/retinanet_mask_R-50-FPN_2x_adjust_std011_ms.yaml SOLVER.IMS_PER_BATCH 2 SOLVER.MAX_ITER 180000 SOLVER.STEPS "(90000, 120000)"

But after a few iterations, I get this error.

2021-01-19 12:42:44,720 maskrcnn_benchmark.trainer INFO: eta: 1 day, 5:42:19  iter: 902  loss: 1.8773 (2.2307)  loss_retina_cls: 0.4378 (0.6734)  loss_retina_reg: 1.0318 (1.1414)  loss_mask: 0.3184 (0.4159)  time: 0.3497 (0.5971)  data: 0.0093 (0.3164)  lr: 0.005000  max mem: 3196
2021-01-19 12:42:45,114 maskrcnn_benchmark.trainer INFO: eta: 1 day, 5:41:38  iter: 903  loss: 1.7931 (2.2301)  loss_retina_cls: 0.4374 (0.6731)  loss_retina_reg: 1.0279 (1.1412)  loss_mask: 0.3223 (0.4158)  time: 0.3535 (0.5969)  data: 0.0093 (0.3160)  lr: 0.005000  max mem: 3196
Traceback (most recent call last):
  File "tools/train_net.py", line 171, in <module>
    main()
  File "tools/train_net.py", line 164, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "tools/train_net.py", line 73, in train
    arguments,
  File "/home/eldad/retinamask-master/maskrcnn_benchmark/engine/trainer.py", line 65, in do_train
    loss_dict = model(images, targets)
  File "/home/eldad/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/eldad/retinamask-master/maskrcnn_benchmark/modeling/detector/retinanet.py", line 61, in forward
    (anchors, detections), detector_losses = self.rpn(images, rpn_features, targets)
  File "/home/eldad/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/eldad/retinamask-master/maskrcnn_benchmark/modeling/rpn/retinanet.py", line 150, in forward
    return self._forward_train(anchors, box_cls, box_regression, targets)
  File "/home/eldad/retinamask-master/maskrcnn_benchmark/modeling/rpn/retinanet.py", line 157, in _forward_train
    anchors, box_cls, box_regression, targets
  File "/home/eldad/retinamask-master/maskrcnn_benchmark/modeling/rpn/retinanet_loss.py", line 108, in __call__
    labels, regression_targets = self.prepare_targets(anchors, targets)
  File "/home/eldad/retinamask-master/maskrcnn_benchmark/modeling/rpn/retinanet_loss.py", line 87, in prepare_targets
    matched_targets.bbox, anchors_per_image.bbox
  File "/home/eldad/retinamask-master/maskrcnn_benchmark/modeling/box_coder.py", line 44, in encode
    targets_dx = wx * (gt_ctr_x - ex_ctr_x) / ex_widths
RuntimeError: The size of tensor a (0) must match the size of tensor b (225603) at non-singleton dimension 0

Using Mask R-CNN I get no errors, but with the retina config there is this size-mismatch error. Any idea why I get this error?

Thank you
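A size of 0 for tensor a suggests an image whose matched ground-truth box tensor is empty. A hypothetical check for annotation-free images in a COCO-format file (the path is a placeholder):

from pycocotools.coco import COCO

coco = COCO("path/to/annotations/instances_train.json")  # placeholder path
empty = [i for i in coco.getImgIds() if not coco.getAnnIds(imgIds=i)]
print(len(empty), "of", len(coco.getImgIds()), "images have no annotations")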

Detectron2 Support

Hello, quick question: will this repository ever be updated to Detectron2? I've looked at the changes over the original maskrcnn-benchmark repo, and it seems very similar to the new Detectron2 repo.

Issue with the build process

🐛 Bug

I am trying to build the code inside an NVIDIA Docker container.

gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -m64 -fPIC -m64 -fPIC -fPIC -DWITH_CUDA -I/home/jovyan/map-workspace/code/retinamask/maskrcnn_benchmark/csrc -I/opt/conda/lib/python3.6/site-packages/torch/lib/include -I/opt/conda/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/lib/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.6m -c /home/jovyan/map-workspace/code/retinamask/maskrcnn_benchmark/csrc/cpu/ROIAlign_cpu.cpp -o build/temp.linux-x86_64-3.6/home/jovyan/map-workspace/code/retinamask/maskrcnn_benchmark/csrc/cpu/ROIAlign_cpu.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
/usr/local/cuda/bin/nvcc -DWITH_CUDA -I/home/jovyan/map-workspace/code/retinamask/maskrcnn_benchmark/csrc -I/opt/conda/lib/python3.6/site-packages/torch/lib/include -I/opt/conda/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/lib/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.6m -c /home/jovyan/map-workspace/code/retinamask/maskrcnn_benchmark/csrc/cuda/SigmoidFocalLoss_cuda.cu -o build/temp.linux-x86_64-3.6/home/jovyan/map-workspace/code/retinamask/maskrcnn_benchmark/csrc/cuda/SigmoidFocalLoss_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
In file included from /home/jovyan/map-workspace/code/retinamask/maskrcnn_benchmark/csrc/cuda/SigmoidFocalLoss_cuda.cu:6:0:
/opt/conda/lib/python3.6/site-packages/torch/lib/include/ATen/cuda/CUDAContext.h:12:22: fatal error: cusparse.h: No such file or directory
compilation terminated.

To Reproduce

Steps to reproduce the behavior:

  1. python setup.py build develop

Expected behavior

Should compile the C++ code.

Environment

Please copy and paste the output from the PyTorch environment collection script.

You can get the script and run it with:

wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py

Collecting environment information...
PyTorch version: 1.0.1.post2
Is debug build: No
CUDA used to build PyTorch: 9.0.176

OS: Ubuntu 16.04.5 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
CMake version: Could not collect

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 9.0.176
GPU models and configuration: GPU 0: Tesla V100-SXM2-16GB
Nvidia driver version: 390.46
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.2.1

Versions of relevant libraries:
[pip] numpy==1.13.3
[pip] torch==1.0.1.post2
[pip] torchsummary==1.5.1
[pip] torchvision==0.2.2.post3
[conda] torch 1.0.1.post2
[conda] torchsummary 1.5.1
[conda] torchvision 0.2.2.post3


How can I see the image with "box" and "mask" labels?

❓ Questions and Help

Sorry to bother you again. In fact, I haven't run it successfully yet because of my limited hardware, but I don't want to stop learning it. I read the code and found that the result of this project is some files ending with ".pth", isn't it? If I want to see the image with "box" and "mask" labels, should I write that part of the code myself? Thanks again!
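A note on the question: the .pth files are checkpoints (tensors), not rendered images, so visualization code has to build the model, load the weights, run inference, and draw the predictions itself. Inspecting a checkpoint, using the model path from the README above:

import torch

ckpt = torch.load(
    "./models/retinanet_mask_R-50-FPN_2x_adjust_std011_ms_model.pth",
    map_location="cpu",
)
# A checkpoint is typically a dict of tensors and training state, not images.
print(type(ckpt))
if isinstance(ckpt, dict):
    print(list(ckpt.keys())[:5])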

Clarification on the num_features parameter in the adjusted smooth L1 loss

@chengyangfu. The smooth L1 loss from Fast R-CNN is implemented in the code as:

import torch

def smooth_l1_loss(input, target, beta=1. / 9, size_average=True):
    n = torch.abs(input - target)
    cond = n < beta
    loss = torch.where(cond, 0.5 * n ** 2 / beta, n - 0.5 * beta)
    if size_average:
        return loss.mean()
    return loss.sum()
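A quick sanity check of the piecewise behavior using the definition above (the input values are illustrative): below beta the loss is quadratic, above it linear with slope 1:

import torch

x = torch.tensor([0.05, 0.5])
zero = torch.zeros(2)
# n = 0.05 < beta: 0.5 * n**2 / beta ~ 0.0113; n = 0.5 >= beta: n - beta/2 ~ 0.4444
print(smooth_l1_loss(x, zero, beta=1. / 9, size_average=False))  # ~0.4557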

In your work, you implemented the self-adjusting smooth L1 loss as:

import torch
from torch import nn
import logging


class AdjustSmoothL1Loss(nn.Module):

    def __init__(self, num_features, momentum=0.1, beta=1. / 9):
        super(AdjustSmoothL1Loss, self).__init__()
        self.num_features = num_features
        self.momentum = momentum
        self.beta = beta
        # Running statistics of the absolute regression error, one per feature.
        self.register_buffer(
            'running_mean', torch.empty(num_features).fill_(beta)
        )
        self.register_buffer('running_var', torch.zeros(num_features))
        self.logger = logging.getLogger("maskrcnn_benchmark.trainer")

    def forward(self, inputs, target, size_average=True):
        n = torch.abs(inputs - target)
        with torch.no_grad():
            # Update the running statistics, skipping batches whose variance
            # is NaN (e.g. when there is at most one sample).
            if torch.isnan(n.var(dim=0)).sum().item() == 0:
                self.running_mean = self.running_mean.to(n.device)
                self.running_mean *= (1 - self.momentum)
                self.running_mean += (self.momentum * n.mean(dim=0))
                self.running_var = self.running_var.to(n.device)
                self.running_var *= (1 - self.momentum)
                self.running_var += (self.momentum * n.var(dim=0))

        # Self-adjusted threshold between the quadratic and linear regimes.
        beta = (self.running_mean - self.running_var)
        beta = beta.clamp(max=self.beta, min=1e-3)

        beta = beta.view(-1, self.num_features).to(n.device)
        cond = n < beta.expand_as(n)
        loss = torch.where(cond, 0.5 * n ** 2 / beta, n - 0.5 * beta)
        if size_average:
            return loss.mean()
        return loss.sum()

I see that a num_features parameter needs to be passed to the class for the loss computation to succeed. Can you please help me understand what this num_features parameter is? What value should it take?
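For what it's worth, since this loss is applied to the box-regression offsets (dx, dy, dw, dh), one natural reading is num_features = 4, i.e. one running mean/variance per offset. A hypothetical usage sketch:

import torch

loss_fn = AdjustSmoothL1Loss(num_features=4)
pred = torch.randn(1000, 4)    # per-anchor regression predictions (dummy)
target = torch.randn(1000, 4)  # matched regression targets (dummy)
loss = loss_fn(pred, target)   # running stats are tracked per offset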

Multi-scale training

Thank you for making the code available online. Really solid work!

Where do you specify multi-scale training?

In the caption of Figure 1 of the paper (https://arxiv.org/pdf/1901.03353.pdf), it is mentioned that you do not use multi-scale training.

Looking at https://github.com/chengyangfu/retinamask/blob/master/maskrcnn_benchmark/data/transforms/build.py#L10.

It seems the multi-scale resizing option is always used, and it depends on the number of min sizes. If we were to specify multiple min sizes in the config files, we would get multi-scale training. Is that right?

Can you point me to a config file where you do that? Or do you not do that at all?

Also, how is the batch handled, given that the images can have an arbitrary second dimension, resulting in arbitrary feature-map sizes and a different number of predictions depending on the input image size?

Many thanks
Gurkirt
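As the question suggests, multi-scale training in maskrcnn-benchmark-style configs comes from giving INPUT.MIN_SIZE_TRAIN several values, one of which is sampled per batch. A hedged sketch; the particular scale list is illustrative, not this repo's setting:

from maskrcnn_benchmark.config import cfg

# Multiple min sizes -> a training scale is sampled from this tuple.
cfg.INPUT.MIN_SIZE_TRAIN = (640, 672, 704, 736, 768, 800)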

Why is the number of mask proposals (100 + GT) during training?

Hi, I am very glad you shared RetinaMask, and I have some questions:

  1. In your paper, you say the mask subnet is trained with (100 + GT) proposals. Why not GT only? I want to try that, so I need to confirm whether the (100 + GT) is set in the following lines (../maskrcnn_benchmark/modeling/detector/retinanet.py):

     proposals = []
     for (image_detections, image_targets) in zip(detections, targets):
         merge_list = []

         if not isinstance(image_detections, list):
             merge_list.append(image_detections.copy_with_fields('labels'))
         if not isinstance(image_targets, list):
             merge_list.append(image_targets.copy_with_fields('labels'))

         if len(merge_list) == 1:
             proposals.append(merge_list[0])
         else:
             proposals.append(cat_boxlist(merge_list))

     I mean, I want to train using the ground truth only and delete the detection net's proposals (see the sketch at the end of this issue).

  2. Is it a practical approach to ignore the mask net at first, train only the detection part, then freeze it and start training the mask subnet? Do you have any further experiments on this?

Looking forward to your reply. Thanks a lot.
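For the GT-only experiment described in question 1, a minimal sketch based on the loop quoted above, keeping only the targets branch:

# Sketch: feed the mask subnet ground-truth boxes only, dropping the
# detector's proposals from the merge.
proposals = []
for image_targets in targets:
    proposals.append(image_targets.copy_with_fields('labels'))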
