
ssds.pytorch's Introduction

ssds.pytorch

Repository for Single Shot MultiBox Detector and its variants, implemented with PyTorch and Python 3. This repo is easy to set up and has plenty of visualization methods. We hope it helps people understand SSD-like models better and train and deploy them easily.

Currently, it contains these features:

  • Multiple SSD variants: ssd, fpn, bifpn, yolo, etc.
  • Multiple base networks: resnet, regnet, mobilenet, etc.
  • Visualization of the features of SSD-like models, to help the user understand the model design and performance.
  • Fast training and inference: uses Nvidia Apex and DALI to speed up training, and supports converting the model to ONNX or TensorRT for deployment.

This repo depends on the work of ODTK, Detectron and the Tensorflow Object Detection API. Thanks for their work.

Notice: The pretrained models for the current version are not finished yet. Please check the previous version for a richer set of pretrained models.

Installation

requirements

  • python>=3.7
  • CUDA>=10.0
  • pytorch>=1.4

basic installation:

conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
git clone https://github.com/ShuangXieIrene/ssds.pytorch.git
cd ssds.pytorch
python setup.py clean -a install

extra python libs for parallel training

Currently, NVIDIA DALI and Apex are not included in requirements.txt and need to be installed manually.

pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/cuda/10.0 nvidia-dali
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Docker

git clone https://github.com/ShuangXieIrene/ssds.pytorch.git
docker build -t ssds:local ./ssds.pytorch/
docker run --gpus all -it --rm -v /data:/data ssds:local

Usage

0. Check the config file by Visualization

Define the network in a config file and tweak the config file based on the visualized anchor boxes.

python -m ssds.utils.visualize -cfg experiments/cfgs/tests/test.yml

1. Training

# basic training
python -m ssds.utils.train -cfg experiments/cfgs/tests/test.yml
# parallel training
python -m torch.distributed.launch --nproc_per_node={num_gpus} -m ssds.utils.train_ddp -cfg experiments/cfgs/tests/test.yml

2. Evaluation

python -m ssds.utils.train -cfg experiments/cfgs/tests/test.yml -e

3. Export to ONNX or TRT model

python -m ssds.utils.export -cfg experiments/cfgs/tests/test.yml -c best_mAP.pth -h

Performance

Visualization

ssds.pytorch's People

Contributors

foreveryounggithub, shuangxieirene

ssds.pytorch's Issues

Which PyTorch version?

May I know which PyTorch version you used to run this program? My current version is 0.4.1, but some variable initializations are deprecated there (e.g. images = Variable(preproc(image, anno)[0].unsqueeze(0), volatile=True)).

Thanks

Cannot re-implement the performance

I have tried to train ssdlite with Mobilenetv2 as the backbone. As suggested, I downloaded the pre-trained Mobilenetv2 model from another repo. Besides renaming the weights in the pre-trained model so that they load, I also changed the preprocessing of the input images: the input image is normalized to mean 0 and standard deviation 1, as an issue in the repo of the pre-trained model describes. I made no other modifications and followed the Readme to train the ssdlite model for 300 epochs. I used test.py to test, and the mAP after 300 epochs is about 65 to 66. I also tested the pre-trained ssdlite model downloaded from the 73.2 link in the Readme file, and its mAP is 73.4 when the nms threshold is 0.45.

Could you give some suggestions on how to improve the mAP and re-implement the results? Thanks!

Has anyone tested the accuracy on the VOC dataset?

Hi, everyone,
I tested the pre-trained models on the VOC dataset and found that the mAP didn't match the expected values.
Here is my result:
yolo_v2_mobilenetv2_voc: 0.721956
yolo_v3_mobilenetv2_voc: 0.766846
ssd_lite_mobilenetv2_voc: 0.708510
rfb_lite_mobilenetv2_voc: 0.713691
fssd_lite_mobilenetv2_voc: 0.744321
As can be seen, yolov2/3 perform better than the author's results, but the other three are worse than expected.

Has anyone met this problem?

multi gpu training

Hi, I am trying to train the ssds network with multiple GPUs, but it doesn't seem to work. I just use the DataParallel class. Are there any tricks to implement this? Thanks a lot.

Performance of FSSD MobileNetV1 on VOC2007

Hi ShuangXieIrene:

Did you use MS COCO for pretraining before you trained the FSSD MobileNetV1 on VOC2007?
When I use VOC2007 and VOC2012 as training data for FSSD MobileNetV1, my performance is 73.8%. It seems that your result is much better (78.4%).

failed when run : python demo.py --cfg=$file --demo=./experiments/person.jpg

./experiments/cfgs/ssd_lite_mobilenetv2_train_voc.yml
===> Building model
==>Feature map size:
[(19, 19), (10, 10), (5, 5), (3, 3), (2, 2), (1, 1)]
Utilize GPUs for computation
Number of GPU available 1
=> loading checkpoint ./weights/ssd_lite/mobilenet_v2_ssd_lite_voc_73.2.pth
/home/wei_li/anaconda3/lib/python3.6/site-packages/torch/tensor.py:321: UserWarning: self and other not broadcastable, but have the same number of elements. Falling back to deprecated pointwise behavior.
return self.mul(other)
ASSERT: "false" in file qasciikey.cpp, line 501
./time_benchmark.sh: line 9: 16177 Aborted (core dumped) python demo.py --cfg=$file --demo=./experiments/person.jpg

Structure about Feature map in SSD-Mobilenet

I found that the first feature layer output is 19x19 and the last two layers both give a 1x1 output. Is that correct? Why choose such a small size compared to the original structure?
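As a rough check on where such sizes come from, here is a minimal sketch of the standard convolution output-size formula. It assumes every downsampling step is a 3x3 convolution with stride 2 and padding 1, which may not match the layers actually used in this repo. The helper names conv_out and feature_map_sizes are made up for this illustration.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output length of one conv layer along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_sizes(image_size, num_downsamples):
    """Sizes after 1, 2, ..., num_downsamples stride-2 convolutions."""
    sizes = []
    size = image_size
    for _ in range(num_downsamples):
        size = conv_out(size)
        sizes.append(size)
    return sizes


sizes = feature_map_sizes(300, 9)
print(sizes)      # [150, 75, 38, 19, 10, 5, 3, 2, 1]
print(sizes[3:])  # from stride 16 onwards: [19, 10, 5, 3, 2, 1]
```

Under these assumptions a 300x300 input reaches 19x19 at stride 16, and a 1x1 map stays 1x1 if another such layer is applied to it (conv_out(1) is 1).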

When I train the model, I get this error!

File "ssds.pytorch/lib/layers/modules/multibox_loss.py", line 91, in forward
loss_c[pos] = 0 # filter out pos boxes for now
RuntimeError: The shape of the mask [32, 11620] at index 0 does not match the shape of the indexed tensor [371840, 1] at index 0

tensorboardX version problem

Hi,
When I tried to evaluate the model, I got the following error in visualize_utils.py:
File "/home/ssds.pytorch/lib/utils/visualize_utils.py", line 159, in add_pr_curve_raw
writer.add_pr_curve_raw(
AttributeError: 'SummaryWriter' object has no attribute 'add_pr_curve_raw'
In tensorboardX 1.1, writer.py only has the 'add_pr_curve' method, so I want to check whether this is a version problem. Which version of tensorboardX are you using? ; )

[Question] What is "Free Image Size" ?

Hello @ShuangXieIrene and @foreverYoungGitHub .

Thank you for your great contribution.

As the title said, what is "Free Image Size" ?

In the .cfg file there are several settings, shown below. If I want to change IMAGE_SIZE: [300, 300] to IMAGE_SIZE: [512, 512], do I have to change the other parameters as well? If so, how should I change them?

MODEL:
  SSDS: ssd
  NETS: resnet_50
  IMAGE_SIZE: [300, 300]
  NUM_CLASSES: 21
  FEATURE_LAYER: [[10, 16, 'S', 'S', '', ''], [512, 1024, 512, 256, 256, 256]]
  STEPS: [[8, 8], [16, 16], [32, 32], [64, 64], [100, 100], [300, 300]]
  SIZES: [[30, 30], [60, 60], [111, 111], [162, 162], [213, 213], [264, 264], [315, 315]]
  ASPECT_RATIOS: [[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2], [1, 2]]

Thank you for your time
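For the SIZES row, one possible starting point is the ratio schedule commonly used for SSD300, where sizes grow from 10% of the image side to 105% in fixed steps. The sketch below reproduces the seven SIZES values of the 300x300 config quoted above and scales them to another image size. The function name ssd_sizes and its defaults are assumptions for illustration. Whether this repo expects exactly these values for 512x512, and how STEPS and FEATURE_LAYER should change, is not confirmed by anything in this thread.

```python
import math


def ssd_sizes(image_size, num_layers=6, min_ratio=20, max_ratio=90):
    """Anchor sizes in pixels from a percentage schedule.

    Returns num_layers + 1 values: layer k would use sizes[k] as its
    smaller size and sizes[k + 1] as its larger size.
    """
    step = int(math.floor((max_ratio - min_ratio) / (num_layers - 2)))
    # First layer starts at half of min_ratio; the last value overshoots by one step.
    ratios = [min_ratio // 2] + list(range(min_ratio, max_ratio + 1, step))
    ratios.append(ratios[-1] + step)
    return [round(image_size * r / 100) for r in ratios]


print(ssd_sizes(300))  # [30, 60, 111, 162, 213, 264, 315]
print(ssd_sizes(512))  # the same percentages applied to a 512-pixel side
```

Each value would then be written as a [size, size] pair, as in the SIZES row of the config.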

ImportError: No module named _mask

I ran python train.py --cfg=./experiments/cfgs/ssd_vgg16_train_voc.yml and this is what I get:

python train.py --cfg=./experiments/cfgs/ssd_vgg16_train_voc.yml
Traceback (most recent call last):
  File "train.py", line 20, in <module>
    from lib.ssds_train import train_model
  File "/home/alireza/Desktop/ssd/B/lib/ssds_train.py", line 23, in <module>
    from lib.dataset.dataset_factory import load_data
  File "/home/alireza/Desktop/ssd/B/lib/dataset/dataset_factory.py", line 2, in <module>
    from lib.dataset import coco
  File "/home/alireza/Desktop/ssd/B/lib/dataset/coco.py", line 13, in <module>
    from lib.utils.pycocotools.coco import COCO
  File "/home/alireza/Desktop/ssd/B/lib/utils/pycocotools/coco.py", line 55, in <module>
    from . import mask as maskUtils
  File "/home/alireza/Desktop/ssd/B/lib/utils/pycocotools/mask.py", line 3, in <module>
    import lib.utils.pycocotools._mask as _mask
ImportError: No module named _mask

Any idea why?
Also, what Python version are you using?
Plus, where should I put the dataset (e.g. coco, vgg)?

trouble with dark2pth

Hi, when I run the dark2pth.py, it turns out like this

convert yolov2...
Traceback (most recent call last):
  File "dark2pth.py", line 375, in <module>
    yolo_v2 = build_yolo_v2(darknet_19, feature_layer_v2, mbox_v2, 81)
  File "/home/georgeokelly/ssds.pytorch/lib/modeling/ssds/yolo.py", line 215, in build_yolo_v2
    return YOLO(base_, extras_, head_, feature_layer, num_classes)
  File "/home/georgeokelly/ssds.pytorch/lib/modeling/ssds/yolo.py", line 31, in __init__
    self.softmax = nn.Softmax(dim=-1)
TypeError: __init__() got an unexpected keyword argument 'dim'

and I don't know how to fix it.

RetinaNet support

Is there a plan to support RetinaNet? The accuracy of RetinaNet seems to be higher compared to SSD according to the RetinaNet paper.

Thanks,

ASPECT_RATIOS

Why are the aspect ratios defined like this:

ASPECT_RATIOS: [[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2], [1, 2]]
Can anybody explain it, please?

In another place I also see them defined like:

ASPECT_RATIOS: [[[0.1,0.1], [0.2,0.2], [0.3,0.3], [0.5,0.5], [0.9,0.9]]]

I am not sure why. Also, shouldn't an aspect ratio look like 1:2?
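One common reading of a row such as [1, 2, 3] follows the SSD paper: each number r is a width-to-height ratio r:1, and the box for one anchor size s gets width s*sqrt(r) and height s/sqrt(r), so its area stays s*s. The sketch below assumes that convention and also adds the flipped box (1:r). Whether this repo's prior-box code adds flipped or extra boxes is not confirmed here, and the function name anchor_shapes is hypothetical. The second form quoted above, with [0.1, 0.1] pairs, is not covered by this sketch.

```python
import math


def anchor_shapes(size, aspect_ratios, flip=True):
    """(width, height) pairs for one anchor size, all with area size**2."""
    shapes = []
    for ar in aspect_ratios:
        w = size * math.sqrt(ar)
        h = size / math.sqrt(ar)
        shapes.append((w, h))
        if flip and ar != 1:
            shapes.append((h, w))  # the reciprocal ratio, e.g. 1:2 next to 2:1
    return shapes


for w, h in anchor_shapes(60, [1, 2, 3]):
    print(f"{w:6.1f} x {h:5.1f}  ratio {w / h:.2f}")
```

With these assumptions, [1, 2, 3] yields five boxes per anchor size: one square, plus 2:1, 1:2, 3:1 and 1:3.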

the implementation of the yolov3

I read the code, but I found that the loss function is SSD's. The multibox loss function isn't the loss function from the YOLO paper, right?

Loss is not decreasing

I have trained ssd with mobilenetv2 on VOC but after almost 500 epochs, the loss is still like this:

517/518 in 0.154s [##########] | loc_loss: 1.4773 cls_loss: 2.3165

==>Train: || Total_time: 79.676s || loc_loss: 1.1118 conf_loss: 2.3807 || lr: 0.000721

Wrote snapshot to: ./experiments/models/ssd_mobilenet_v2_voc/ssd_lite_mobilenet_v2_voc_epoch_525.pth
Epoch 526/1300:
0/518 in 0.193s [----------] | loc_loss: 0.8291 cls_loss: 1.9464
1/518 in 0.186s [----------] | loc_loss: 1.3181 cls_loss: 2.5404
2/518 in 0.184s [----------] | loc_loss: 1.0371 cls_loss: 2.2243

It doesn't change and the loss is very high... What's the problem with the implementation?

How to implement a train on only 1 class?

This is for MobileNetV2 SSD-lite on the COCO dataset. I want to train it to detect only pedestrians. I am sure that changing the "num_classes" arg in the yml file isn't enough. My guess is: change the COCO annotation file to "person_keypoints_train/val2017.json" and num_classes to 2. Will that be enough? Thank you.
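An alternative to switching annotation files is to filter a COCO-style instances file down to one category before training. The sketch below works on the usual top-level keys of that format (images, annotations, categories). The function name keep_one_category and the tiny example dict are made up for illustration, and how this repo's COCO loader maps category ids to class indices is not confirmed here.

```python
import json


def keep_one_category(coco, name):
    """Return a copy of a COCO-style dict restricted to one category name."""
    cat_ids = {c["id"] for c in coco["categories"] if c["name"] == name}
    annotations = [a for a in coco["annotations"] if a["category_id"] in cat_ids]
    image_ids = {a["image_id"] for a in annotations}
    return {
        **coco,  # keep any other top-level keys unchanged
        "categories": [c for c in coco["categories"] if c["id"] in cat_ids],
        "annotations": annotations,
        "images": [im for im in coco["images"] if im["id"] in image_ids],
    }


# Tiny stand-in for an instances file (made-up ids, for illustration only).
example = {
    "images": [{"id": 1}, {"id": 2}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1},
        {"id": 11, "image_id": 2, "category_id": 3},
    ],
    "categories": [{"id": 1, "name": "person"}, {"id": 3, "name": "car"}],
}
person_only = keep_one_category(example, "person")
print(json.dumps(person_only["categories"]))
```

Note that this version also drops images that contain no object of the chosen category. Keeping them as negatives would be a design choice that depends on the training code.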

How can I organize my own dataset for 1 class like voc?

I have trained “yolo_v3_mobilenetv2_voc” and “ssd_lite_mobilenetv2_train_voc” successfully.
So I want to train my own data, just 1 class: car, I have prepared “Annotations” and “JPEGImages”.
I have changed NUM_CLASSES to 2 in yml file and VOC_CLASSES (...) to VOC_CLASSES = ( 'car', 'background') in voc.py.
But I have no idea about txt files in ImageSets/Main, now I just have a trainval.txt which just contains the ids of the images.
so I have the error: Dimension out of range (expected to be in range of [-1,0), but got 1)
What additional txt files do I need? What else do I need to modify?
Thank you very much.
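For the ImageSets/Main part of the question, a minimal sketch that writes trainval.txt and test.txt from the file names in Annotations is given below. The 80/20 split, the seed and the function name write_image_sets are assumptions. The script does not address the "Dimension out of range" error itself, whose cause is not established in this thread.

```python
import os
import random


def write_image_sets(voc_root, test_fraction=0.2, seed=0):
    """Write ImageSets/Main/{trainval,test}.txt from the files in Annotations/."""
    ann_dir = os.path.join(voc_root, "Annotations")
    ids = sorted(
        os.path.splitext(f)[0] for f in os.listdir(ann_dir) if f.endswith(".xml")
    )
    random.Random(seed).shuffle(ids)  # fixed seed so the split is repeatable
    n_test = int(len(ids) * test_fraction)
    splits = {"test": sorted(ids[:n_test]), "trainval": sorted(ids[n_test:])}
    out_dir = os.path.join(voc_root, "ImageSets", "Main")
    os.makedirs(out_dir, exist_ok=True)
    for name, split_ids in splits.items():
        with open(os.path.join(out_dir, name + ".txt"), "w") as fh:
            fh.write("\n".join(split_ids) + "\n")  # one image id per line
    return splits


# Usage (hypothetical dataset path):
# write_image_sets("/data/VOCdevkit/MyCars")
```

Each output file lists one image id per line, without extension, which is the same shape as the trainval.txt described above.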

I trained a new model, but the test speed is too slow?

I downloaded mobilenet_v1_ssd_voc_72.7; its test speed (0.03s) is faster than my new model's (0.3s).

I can't find the reason; the configuration file is the same as mobilenet_v1_ssd_voc_72.

I also trained a new mobilenetv2-ssd with the same file, and the speed is still very slow.
Please give me some advice. Thank you very much!

_mask doesn't get imported

The _mask module is not getting imported.
It is present in pycocotools. A previous issue was solved by using Python 3, but I am facing the same problem with Python 3. I think the issue is that this file is compatible only with Python 3.6, not 3.5 (as the name itself is _mask.cpython-36m-x86_64-linux-gnu.so).
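The file-name theory can be checked from the interpreter: Python only picks up a compiled extension whose file name is the module name plus one of the suffixes in importlib.machinery.EXTENSION_SUFFIXES. A small sketch follows. The helper name loadable_filenames is made up, and the shipped file name is the one quoted in the report above.

```python
import importlib.machinery


def loadable_filenames(module_name):
    """File names this interpreter would accept for a compiled extension module."""
    return [module_name + suffix for suffix in importlib.machinery.EXTENSION_SUFFIXES]


shipped = "_mask.cpython-36m-x86_64-linux-gnu.so"  # name quoted in the report above
print(loadable_filenames("_mask"))
print("shipped binary importable here:", shipped in loadable_filenames("_mask"))
```

If the second line prints False, the prebuilt binary does not match the running interpreter, and rebuilding the extension with that interpreter is the usual remedy.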

Can i know the training environment?

I want to train the rfb mobilenet ssd model end-to-end myself and get its weight file.
Do I need to load pretrained feature extractor weights? If so, how can I get that weight file?
Could you share the training settings, such as epochs, learning rate, scheduler and learning-rate decay parameters?

Segmentation fault (core dumped)

Hi,

I'm trying to do a test for the ssd_vgg model on VOC dataset, but I'm getting the following error :

Segmentation fault (core dumped)

I was using Python3.6 and Pytorch 0.4.1, after seeing this error I switched back to Pytorch 0.4.0. But the error is still there.

The command I run is:
python test.py --cfg=./experiments/cfgs/ssd_vgg16_train_voc.yml

I have changed the corresponding directories and weights in the yml file to be those I need.

Any help would be more than appreciated.
Thanks in advance!

how to train from scratch

Hi.

can you tell me what is the proper way to train object detector from scratch with pretrained base architectures such as Resnet50, VGG16, MobilenetV1, MobilenetV2?
or can you give me some references regarding this?

I first tried by command line below but it seemed that the training starts without pretrained weights.
python train.py --cfg=./experiments/cfgs/ssd_vgg16_train_voc.yml

So now I'm training VGG16-SSD on voc dataset using exactly the same command below
python train.py --cfg=./experiments/cfgs/ssd_vgg16_train_voc.yml
but changed RESUME_CHECKPOINT and PHASE like this
RESUME_CHECKPOINT: './weights/ssd/vgg16_reducedfc.pth'
PHASE: ['train']
and also I downloaded vgg16_reducedfc.pth from https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth and put into ./weights/ssd/
but I wonder whether this is the right way to do it.

By the way, thanks for your great project! The code is well written and understandable. It helps me a lot.

demo.py issue

Encounter an error when running demo.py below:
python3 ./demo.py --cfg=./experiments/cfgs/fssd_lite_mobilenetv1_train_voc.yml --demo=./experiments/person.jpg

===> Building model
/home/topspin/2TB/src/ssds.pytorch/lib/modeling/model_builder.py:51: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
x = torch.autograd.Variable(x, volatile=True) #.cuda()
/usr/local/lib/python3.5/dist-packages/torch/nn/functional.py:1761: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
==>Feature map size:
[(38, 38), (19, 19), (10, 10), (5, 5), (3, 3), (1, 1)]
/home/topspin/2TB/src/ssds.pytorch/lib/ssds.py:21: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
self.priors = Variable(self.priorbox.forward(), volatile=True)
Utilize GPUs for computation
Number of GPU available 1
=> loading checkpoint ./weights/fssd_lite/mobilenet_v1_fssd_lite_voc_78.4.pth
/home/topspin/2TB/src/ssds.pytorch/lib/ssds.py:68: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
x = Variable(self.preprocessor(img)[0].unsqueeze(0),volatile=True)
Traceback (most recent call last):
File "./demo.py", line 154, in <module>
demo(args, args.demo_file)
File "./demo.py", line 59, in demo
_labels, _scores, _coords = object_detector.predict(image)
File "/home/topspin/2TB/src/ssds.pytorch/lib/ssds.py", line 82, in predict
detections = self.detector.forward(out)
File "/home/topspin/2TB/src/ssds.pytorch/lib/layers/functions/detection.py", line 151, in forward
ids, count = nms(boxes, scores, self.nms_thresh, self.top_k)
ValueError: not enough values to unpack (expected 2, got 0)

No module named 'lib.utils.nms._ext.nms._nms'

When I try to run the demo, I get the following error:
from ._nms import lib as _lib, ffi as _ffi
ImportError: No module named 'lib.utils.nms._ext.nms._nms'

Ubuntu 16.04.3
GTX 1080Ti
Cuda 8.0 installed

not able to use torch.nn.DataParallel

I uncommented the lines that use torch.nn.DataParallel,
but the model still does not run in parallel.
It runs, but it still uses just one GPU.
Any suggestions to solve it?

Almost all results are zero.

Hello, I tried to run your code using "ssd_vgg16_train_voc.yml".

But almost all results are zero and don't increase.

=======================================================
AP for aeroplane = 0.0000
AP for bicycle = 0.0000
AP for bird = 0.0000
AP for boat = 0.0000
AP for bottle = 0.0000
AP for bus = 0.0000
AP for car = 0.0000
AP for cat = 0.0000
AP for chair = 0.0000
AP for cow = 0.0000
AP for diningtable = 0.0000
AP for dog = 0.0000
AP for horse = 0.0000
AP for motorbike = 0.0000
AP for person = 0.0001
AP for pottedplant = 0.0003
AP for sheep = 0.0000
AP for sofa = 0.0000
AP for train = 0.0000
AP for tvmonitor = 0.0000
Mean AP = 0.0000

Results:
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000

================================================

The results above were printed at epoch 5.
Although the number of epochs is low, something seems wrong.
Please tell me how to fix this if you know.

Thank you.

regarding permute and contiguous

Can someone tell me what loc.append(l(x).permute(0, 2, 3, 1).contiguous())
does?
I don't understand what permute and contiguous do, and how they work.
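A way to picture the two calls without PyTorch is to track shape and strides by hand. In the sketch below, permute only reorders the axes of an (N, C, H, W) tensor to (N, H, W, C) and leaves the data in place, so the strides are no longer row-major. contiguous then corresponds to copying the data so that the strides become row-major again, which a later view or reshape into one row per box relies on. This is plain-Python bookkeeping, not PyTorch code, and the example shape is a hypothetical head output.

```python
def contiguous_strides(shape):
    """Row-major strides (in elements) for a freshly allocated tensor."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def permute(shape, strides, order):
    """Reorder the axes: only shape and strides change, no data is moved."""
    return tuple(shape[i] for i in order), tuple(strides[i] for i in order)


shape = (2, 24, 19, 19)               # N, C, H, W (hypothetical head output)
strides = contiguous_strides(shape)   # (8664, 361, 19, 1)
p_shape, p_strides = permute(shape, strides, (0, 2, 3, 1))
print(p_shape, p_strides)             # (2, 19, 19, 24) (8664, 19, 1, 361)
print(contiguous_strides(p_shape))    # (8664, 456, 24, 1), what a contiguous copy would have
```

After the permute, the channel values of one spatial location sit 361 elements apart in memory. After a contiguous copy they are adjacent, so the predictions for each location form one consecutive run.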

Conda environment, aka dependencies file

Created an env file which you can use.
Just download the yaml from my repo https://github.com/cardoso-neto/ssds.pytorch/blob/5384d8c27de82919eab956655a88834e29ac8c2d/ssds.yaml and run conda env create -f ssds.yaml

This will, however, require one small modification to line 45 of ssds_train.py from:

self.priors = self.priorbox.forward()

to:

self.priors = Variable(self.priorbox.forward())

The error message is quite clear and straightforward, suggesting to cast the torch.Tensor to an autograd.Variable.

After fiddling a bit with different Python/PyTorch versions, this seemed to work:
Python=3.6
Pytorch=0.3.1

Not really an issue. Just wanted to leave this somewhere where it'd be helpful.
