
Yet Another EfficientDet Pytorch

A PyTorch re-implementation of the official EfficientDet, with SOTA performance in real time. Original paper: https://arxiv.org/abs/1911.09070

Having trouble training? I might train it for you

If you have trouble training on a dataset, and if you are willing to share it with the public (or it's already open), post it on Issues with the help wanted tag. I might try to train it for you if I'm free, which is not guaranteed.

Requirements:

  1. The total number of images in the dataset should not be larger than 10K, the total size should be under 5GB, and it should be free to download, e.g. via baiduyun.

  2. The dataset should be in the format of this repo.

  3. If you post your dataset in this repo, it is open to the world. So PLEASE DO NOT upload your confidential datasets!

  4. If a dataset is against the law or invades someone's privacy, feel free to contact me and I will delete it.

  5. Most importantly, you can't demand that I train it; I'll only do so if I want to.

I'll post the trained weights in this repo along with the evaluation result.

Hope it helps whoever wants to try EfficientDet in PyTorch.

Training examples can be found in tutorials. The trained weights can be found in weights.

Performance

Pretrained weights and benchmark

The performance is very close to the paper's; it is still SOTA.

The speed/FPS test includes the time of post-processing, with no JIT or reduced-precision tricks.

coefficient | pth_download        | GPU Mem(MB) | FPS   | Extreme FPS (Batchsize 32) | mAP 0.5:0.95 (this repo) | mAP 0.5:0.95 (official)
----------- | ------------------- | ----------- | ----- | -------------------------- | ------------------------ | -----------------------
D0          | efficientdet-d0.pth | 1049        | 36.20 | 163.14                     | 33.1                     | 33.8
D1          | efficientdet-d1.pth | 1159        | 29.69 | 63.08                      | 38.8                     | 39.6
D2          | efficientdet-d2.pth | 1321        | 26.50 | 40.99                      | 42.1                     | 43.0
D3          | efficientdet-d3.pth | 1647        | 22.73 | -                          | 45.6                     | 45.8
D4          | efficientdet-d4.pth | 1903        | 14.75 | -                          | 48.8                     | 49.4
D5          | efficientdet-d5.pth | 2255        | 7.11  | -                          | 50.2                     | 50.7
D6          | efficientdet-d6.pth | 2985        | 5.30  | -                          | 50.7                     | 51.7
D7          | efficientdet-d7.pth | 3819        | 3.73  | -                          | 52.7                     | 53.7
D7X         | efficientdet-d8.pth | 3983        | 2.39  | -                          | 53.9                     | 55.1
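
For reference, here is a rough sketch of how a model-only FPS number can be measured. This is a simplified stand-in for the repo's benchmark in efficientdet_test.py, assuming EfficientDetBackbone from backbone.py, a CUDA GPU, and a 512x512 input for D0:

# rough FPS measurement sketch: model inference only, fp32, no post-processing
import time
import torch
from backbone import EfficientDetBackbone  # this repo's detector wrapper

model = EfficientDetBackbone(num_classes=90, compound_coef=0).cuda().eval()
x = torch.randn(1, 3, 512, 512).cuda()     # 512x512 assumed for D0

with torch.no_grad():
    for _ in range(3):                     # warm-up iterations
        model(x)
    torch.cuda.synchronize()
    t0 = time.time()
    for _ in range(10):
        model(x)
    torch.cuda.synchronize()
print(f'{10 / (time.time() - t0):.2f} FPS @ batch_size 1')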

Update Log

[2020-07-23] supports efficientdet-d7x, mAP 53.9, using efficientnet-b7 as its backbone and an extra deeper pyramid level of BiFPN. For the sake of simplicity, let's call it efficientdet-d8.

[2020-07-15] update efficientdet-d7 weights, mAP 52.7

[2020-05-11] add boolean string conversion to make sure head_only works

[2020-05-10] replace nms with batched_nms to further improve mAP by 0.5~0.7, thanks Laughing-q.

[2020-05-04] fix coco category id mismatch bug, but it shouldn't affect training on custom dataset.

[2020-04-14] fixed loss function bug. please pull the latest code.

[2020-04-14] for those who need help or can't get a good result after several epochs, check out this tutorial. You can run it on colab with GPU support.

[2020-04-10] wrap the loss function within the training model, so that memory usage is balanced when training with multiple gpus, enabling training with a bigger batchsize.

[2020-04-10] add D7 (D6 with larger input size and larger anchor scale) support and test its mAP

[2020-04-09] allow custom anchor scales and ratios

[2020-04-08] add D6 support and test its mAP

[2020-04-08] add training script and its doc; update eval script and simple inference script.

[2020-04-07] tested D0-D5 mAP, result seems nice, details can be found here

[2020-04-07] fix anchors strategies.

[2020-04-06] adapt anchor strategies.

[2020-04-05] create this repository.

Demo

# install requirements
pip install pycocotools numpy opencv-python tqdm tensorboard tensorboardX pyyaml webcolors
pip install torch==1.4.0
pip install torchvision==0.5.0
 
# run the simple inference script
python efficientdet_test.py
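
If you'd rather load the model in your own script, the core of the demo looks roughly like the sketch below. It assumes the repo's EfficientDetBackbone API from backbone.py; see efficientdet_test.py for the full pre- and post-processing.

# minimal loading/inference sketch; check efficientdet_test.py for the
# complete pipeline (resizing, normalization, box decoding, NMS, plotting)
import torch
from backbone import EfficientDetBackbone

compound_coef = 0
model = EfficientDetBackbone(compound_coef=compound_coef, num_classes=90,
                             ratios=[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)],
                             scales=[2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)])
model.load_state_dict(torch.load(f'weights/efficientdet-d{compound_coef}.pth'))
model.requires_grad_(False)
model.eval()

x = torch.randn(1, 3, 512, 512)  # stands in for a resized, normalized image batch
with torch.no_grad():
    features, regression, classification, anchors = model(x)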

Training

Training EfficientDet is a painful and time-consuming task. You shouldn't expect to get a good result within a day or two. Please be patient.

Check out this tutorial if you are new to this. You can run it on colab with GPU support.

1. Prepare your dataset

# your dataset structure should be like this
datasets/
    -your_project_name/
        -train_set_name/
            -*.jpg
        -val_set_name/
            -*.jpg
        -annotations
            -instances_{train_set_name}.json
            -instances_{val_set_name}.json

# for example, coco2017
datasets/
    -coco2017/
        -train2017/
            -000000000001.jpg
            -000000000002.jpg
            -000000000003.jpg
        -val2017/
            -000000000004.jpg
            -000000000005.jpg
            -000000000006.jpg
        -annotations
            -instances_train2017.json
            -instances_val2017.json

2. Manually set project-specific parameters

# create a yml file {your_project_name}.yml under the 'projects' folder
# modify it following 'coco.yml'
 
# for example
project_name: coco
train_set: train2017
val_set: val2017
num_gpus: 4  # 0 means using cpu, 1-N means using gpus 

# mean and std in RGB order, actually this part should remain unchanged as long as your dataset is similar to coco.
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]

# this is coco anchors, change it if necessary
anchors_scales: '[2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)]'
anchors_ratios: '[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]'

# objects from all labels of your dataset, in the order from your annotations.
# each object's index must match your dataset's category_id minus one,
# because category_id is one-indexed;
# for example, the index of 'car' here is 2, while its category_id is 3
# (see the sanity-check snippet after this example)
obj_list: ['person', 'bicycle', 'car', ...]
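
If you are unsure whether obj_list lines up with your annotations, a quick check like the following can help. This is a hypothetical helper, not part of the repo; the paths and names are placeholders for your own project:

# hypothetical check: obj_list[i] should be the category whose one-indexed
# category_id equals i + 1 in your COCO-style annotation json
import json

obj_list = ['person', 'bicycle', 'car']  # from your {your_project_name}.yml
with open('datasets/your_project_name/annotations/instances_train_set_name.json') as f:
    categories = json.load(f)['categories']
for cat in categories:
    assert obj_list[cat['id'] - 1] == cat['name'], (cat['id'], cat['name'])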

3.a. Train on coco from scratch (not necessary)

# train efficientdet-d0 on coco from scratch 
# with batchsize 64
# This takes time and requires changing
# hyperparameters every few hours.
# If you have months to kill, do it.
# It's not like anyone is going to achieve
# a better score than the one in the paper.
# The first few epochs will be rather unstable;
# that's quite normal when you train from scratch.

python train.py -c 0 --batch_size 64 --optim sgd --lr 8e-2

3.b. Train a custom dataset from scratch

# train efficientdet-d1 on a custom dataset 
# with batchsize 8 and learning rate 1e-5

python train.py -c 1 -p your_project_name --batch_size 8 --lr 1e-5

3.c. Train a custom dataset with pretrained weights (Highly Recommended)

# train efficientdet-d2 on a custom dataset with pretrained weights
# with batchsize 8 and learning rate 1e-3 for 10 epochs

python train.py -c 2 -p your_project_name --batch_size 8 --lr 1e-3 --num_epochs 10 \
 --load_weights /path/to/your/weights/efficientdet-d2.pth

# with coco-pretrained weights, you can even freeze the backbone and train only the heads
# to speed up training and help convergence.

python train.py -c 2 -p your_project_name --batch_size 8 --lr 1e-3 --num_epochs 10 \
 --load_weights /path/to/your/weights/efficientdet-d2.pth \
 --head_only True

4. Early stopping a training session

# while training, press Ctrl+C; the program will catch the KeyboardInterrupt,
# stop training, and save the current checkpoint (a simplified sketch follows).
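
Internally this is just the usual try/except pattern. Below is a simplified, self-contained sketch of the idea; the helpers are illustrative stand-ins, not the exact code in train.py:

# simplified sketch of Ctrl+C checkpointing; train_one_epoch/save_checkpoint
# are stand-ins for the real training loop and torch.save(...) in train.py
def train_one_epoch(epoch):
    print(f'training epoch {epoch}...')

def save_checkpoint(name):
    print(f'saving {name}')

compound_coef, epoch, step = 2, 0, 0
try:
    for epoch in range(500):
        train_one_epoch(epoch)
except KeyboardInterrupt:
    # the repo names checkpoints like efficientdet-d2_15_912.pth
    save_checkpoint(f'efficientdet-d{compound_coef}_{epoch}_{step}.pth')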

5. Resume training

# let's say you started a training session like this.

python train.py -c 2 -p your_project_name --batch_size 8 --lr 1e-3 \
 --load_weights /path/to/your/weights/efficientdet-d2.pth \
 --head_only True
 
# then you stopped it with Ctrl+C, and it exited with a checkpoint

# now you want to resume training from the last checkpoint:
# simply set load_weights to 'last'

python train.py -c 2 -p your_project_name --batch_size 8 --lr 1e-3 \
 --load_weights last \
 --head_only True

6. Evaluate model performance

# eval on your_project, efficientdet-d5

python coco_eval.py -p your_project_name -c 5 \
 -w /path/to/your/weights

7. Debug training (optional)

# when you get a bad result, you need to debug the training.
python train.py -c 2 -p your_project_name --batch_size 8 --lr 1e-3 --debug True

# then check out the test/ folder; there you can visualize the predicted boxes during training.
# don't panic if you see countless error boxes, that happens when training is at an early stage.
# But if you still can't see a single normal box after several epochs, not even one in any image,
# then it's likely that either the anchor config is inappropriate or the ground truth is corrupted.

TODO

  • re-implement efficientdet
  • adapt anchor strategies
  • mAP tests
  • training-scripts
  • efficientdet D6 support
  • efficientdet D7 support
  • efficientdet D7x support

FAQ

Q1: Why implement this while there are several efficientdet pytorch projects already?

A1: Because AFAIK none of them fully reproduces the true algorithm of the official efficientdet; that's why their communities could not achieve, or have a hard time achieving, the same score as the official efficientdet by training from scratch.

Q2: What exactly is the difference among this repository and the others?

A2: For example, these two are the most popular efficientdet-pytorch,

https://github.com/toandaominh1997/EfficientDet.Pytorch

https://github.com/signatrix/efficientdet

Here are the issues, and why these repos find it difficult to achieve the same score as the official one:

The first one:

  1. It altered EfficientNet the wrong way: the strides were changed to adapt to the BiFPN, but we should be aware that efficientnet's great performance comes from its specific parameter combinations. Any slight alteration could lead to worse performance.

The second one:

  1. Pytorch's BatchNormalization is slightly different from TensorFlow's: momentum_pytorch = 1 - momentum_tensorflow. I might not have noticed this trap either, had I paid less attention. signatrix/efficientdet inherited the parameter value from TensorFlow, so its BN performs badly because the running mean and running variance get dominated by the new input. (See the snippet below.)
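
     A concrete illustration of the momentum conversion (my own sketch, not code from either repo):

         # TF BN update:      running = m_tf * running + (1 - m_tf) * batch_stat
         # PyTorch BN update: running = (1 - m_pt) * running + m_pt * batch_stat
         # hence m_pt = 1 - m_tf
         import torch.nn as nn

         momentum_tf = 0.99  # a typical TensorFlow default
         bn = nn.BatchNorm2d(64, momentum=1 - momentum_tf, eps=1e-3)  # momentum=0.01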

  2. Mis-implementation of Depthwise-Separable Conv2D. A Depthwise-Separable Conv2D is a Depthwise-Conv2D followed by a Pointwise-Conv2D and a BiasAdd; there is only one BiasAdd, after the two convolutions, while signatrix/efficientdet has an extra BiasAdd on the Depthwise-Conv2D. (See the sketch below.)
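
     A minimal sketch of the correct structure in plain PyTorch (not the repo's exact block):

         import torch.nn as nn

         class DepthwiseSeparableConv(nn.Module):
             # depthwise conv carries no bias; the single BiasAdd lives in the pointwise conv
             def __init__(self, in_ch, out_ch, kernel_size=3):
                 super().__init__()
                 self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                            padding=kernel_size // 2,
                                            groups=in_ch, bias=False)
                 self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=True)

             def forward(self, x):
                 return self.pointwise(self.depthwise(x))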

  3. Misunderstanding of the first parameter of MaxPooling2D: the first parameter is kernel_size, not stride. (See the illustration below.)
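
     The same trap exists in PyTorch, for what it's worth (a tiny illustration, not code from either repo):

         import torch.nn as nn

         pool = nn.MaxPool2d(3, 2)  # kernel_size=3, stride=2
         # passing the intended stride as the first positional argument
         # silently builds a pool with the wrong kernel_size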

  4. Missing BN after downchanneling the efficientnet output features.

  5. Using the wrong output features of the efficientnet. This is a big one. It takes whatever output has a conv stride of 2, but that's wrong. It should be the feature map whose next conv has stride 2, or the final output of efficientnet.

  6. Does not apply same padding on Conv2D and Pooling.

  7. Missing swish activation after several operations.

  8. Missing Conv/BN operations in BiFPN, Regressor and Classifier. This one is very tricky: if you don't dig deep into the official implementation, you'll miss that there are identical-looking operations with different weights.

     illustration of a minimal bifpn unit
         P7_0 -------------------------> P7_2 -------->
            |-------------|                ↑
                          ↓                |
         P6_0 ---------> P6_1 ---------> P6_2 -------->
            |-------------|--------------↑ ↑
                          ↓                |
         P5_0 ---------> P5_1 ---------> P5_2 -------->
            |-------------|--------------↑ ↑
                          ↓                |
         P4_0 ---------> P4_1 ---------> P4_2 -------->
            |-------------|--------------↑ ↑
                          |--------------↓ |
         P3_0 -------------------------> P3_2 -------->
    

    For example, P4 will be downchanneled to P4_0, which then goes to P4_1. Anyone might take it for granted that P4_0 also goes to P4_2 directly, right?

    That's where they are wrong. P4 should be downchanneled again, with different weights, to a separate P4_0_another, which then goes to P4_2. (A sketch follows below.)
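
    In code, that means two downchannel layers with identical shapes but independent weights; a rough sketch (sizes are illustrative D0-like values, names loosely mirror this repo):

        import torch.nn as nn

        conv_channel_p4, num_channels = 112, 64   # illustrative D0-like sizes
        p4_down_channel = nn.Sequential(          # feeds P4_1
            nn.Conv2d(conv_channel_p4, num_channels, 1),
            nn.BatchNorm2d(num_channels, momentum=0.01, eps=1e-3),
        )
        p4_down_channel_2 = nn.Sequential(        # same shape, different weights, feeds P4_2
            nn.Conv2d(conv_channel_p4, num_channels, 1),
            nn.BatchNorm2d(num_channels, momentum=0.01, eps=1e-3),
        )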

And finally some common issues: their anchor decoders and encoders are different from the original ones, but that's not the main reason they perform badly.

Also, Conv2dStaticSamePadding from EfficientNet-PyTorch does not behave like TensorFlow; the padding strategy is different. So I implemented a real tensorflow-style Conv2dStaticSamePadding and MaxPool2dStaticSamePadding myself (sketched below).
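
A condensed sketch of the idea; the repo's actual implementation handles more cases, and the class name here is illustrative:

# TF-style 'SAME' padding: total padding depends on the input size, and the
# odd extra pixel goes to the right/bottom, unlike PyTorch's symmetric padding
import math
import torch.nn as nn
import torch.nn.functional as F

class TFSamePadConv2d(nn.Module):  # illustrative name, not the repo's class
    def __init__(self, in_ch, out_ch, kernel_size, stride=1, **kwargs):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, stride=stride, **kwargs)

    def forward(self, x):
        h, w = x.shape[-2:]
        kh, kw = self.conv.kernel_size
        sh, sw = self.conv.stride
        pad_h = max((math.ceil(h / sh) - 1) * sh + kh - h, 0)
        pad_w = max((math.ceil(w / sw) - 1) * sw + kw - w, 0)
        x = F.pad(x, [pad_w // 2, pad_w - pad_w // 2,
                      pad_h // 2, pad_h - pad_h // 2])
        return self.conv(x)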

Despite the above issues, they are great repositories that enlightened me, hence this repository.

This repository is mainly based on efficientdet, with changes to make it perform as close as possible to the paper.

Btw, debugging static-graph TensorFlow v1 is really painful. Don't try to export it with automation tools like tf-onnx or mmdnn; they will only cause more problems because of its custom/complex operations.

And even if you succeed, as I did, you will have to deal with crazy, messed-up, machine-generated code under the same class, which takes more time to refactor than translating it from scratch.

Q3: What should I do when I find a bug?

A3: Check out the update log if it's been fixed, then pull the latest code to try again. If it doesn't help, create a new issue and describe it in detail.

Known issues

  1. The official EfficientDet uses TensorFlow's bilinear interpolation to resize image inputs, which differs from many other implementations (opencv/pytorch), so the output is bound to be slightly different from the official one's.

Visual Comparison

Conclusion: they provide almost the same precision. Tip: set force_input_size=1920. The official repo uses the original image size while this repo uses the default network input size. If you compare the two repos, you must make sure the input size is consistent.

This Repo

Official EfficientDet

References

Appreciate the great work from the following repositories:

Donation

If you like this repository, or if you'd like to support the author for any reason, you can donate to the author. Feel free to send me your name or introduction page; I will make sure your name(s) appear on the sponsors list.

Sponsors

Sincerely thank you for your generosity.

cndylan claire-s11



yet-another-efficientdet-pytorch's Issues

reproduce D0 performance on COCO2017

Hi, thanks for your excellent work!
I really want to reproduce your result on coco2017 using D0. Could you please tell me the training settings, such as the number of epochs, the batch_size (is it the batch size per gpu or in total?), and the initial learning rate?

Besides, what does 'This takes time and requires change of hyperparameters every few hours.' mean? Do we need to tune the hyperparameters ourselves?

Thanks a lot again : )

advice on training a custom dataset

If my dataset has a different # of classes than the coco dataset, can I still use:

train efficientdet-d2 on a custom dataset with pretrained weights

with batchsize 8 and learning rate 1e-5 for 10 epochs

Or do I need to use this:

with a coco-pretrained model, you can even freeze the backbone and train only the heads

to speed up training and help convergence.

Another newbie question: how to select anchors for custom dataset

Hi author @zylo117 ,

Thank you for your help with my previous question. Yet I still need help with the selection of anchors. How do I do this for a custom dataset if I find the detection performance unsatisfying?

I know some practices such as the anchor selection in the Yolo family. Do you have more resources to help me on this topic?

Thank you again,

learning rate doesn't decrease

The learning rate doesn't seem to decrease, as shown in the following tensorboard graph:

[tensorboard screenshot]

I have seen this behavior on every training run on my system.
Do you know why?

training code

Do you have a plan to release it?
I want to train on a different dataset.

Thanks

About video test

hello, thanks for your great work. However, I have a little request: can you write a test.py for video?
I believe lots of people need this; adding it would make your work more complete and more attractive! Awesome!

About the same padding

I see that conv2d in pytorch has a padding argument, described as:
padding (int or tuple, optional): Zero-padding added to both sides of the input. Default: 0

Is it the same as the 'same' padding in tensorflow?

why training with 90 classes length?

I found that coco training uses a 90-class list, while the widely used category list I load the model with has 80 entries. Is there any reason for using this class count?

inference time of B0

Thanks for the great work. The analysis of the other versions is excellent. I tried your model; it runs precisely. My question is about the model's inference time: the D0 version runs inference in 268ms on average on a GTX1070, which differs from what the paper claims. Do you get the same speed? Thanks again.

Newbie Question: how to determine if mean/std value is suitable

Hi author @zylo117 ,

Thank you for your excellent work! I have some questions for mean/std normalization parameters.

  1. Is it for the image normalization configuration?
  2. Are those parameters determined by the pretrained model (this is the behavior described in mmdetection)? If yes, and we use the same pretrained model, then we don't need to modify those parameters, right?
  3. You also mentioned "mean and std in RGB order, actually this part should remain unchanged as long as your dataset is similar to coco". So it looks like this also has something to do with the dataset itself. No offense, but do you think this conflicts with the explanation from mmdetection (namely, determined by the model vs. by the dataset)?

Thank you for your response.

Any plan of EfficientDet-lite

You might be aware that the TPU model repo has published lite versions of efficientnet without the following three features, as they are not well supported by some mobile accelerators.

  1. No Squeeze-and-excite (SE)
  2. Swish replaced with RELU6
  3. Fixed stem and head while scaling models up, to keep models small and fast.
    https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet/lite

I think it would be quite valuable if you modified your code to include EfficientDet-Lite, which uses EfficientNet-Lite.
So my humble question: do you have any plan for this?

Some questions about weight initialize

Sorry, my own dataset has more classes than COCO, so I want to initialize the last layer's weight and bias, but I can't find the initialization code in the project. Please tell me how to initialize the last layer's weight and bias as if training from scratch. Thank you!

typo in speed test

Thanks for the excellent repository.

The pytorch speed ratio may be wrong. It should be 0.713/0.028 = 25.5.


training error

My system is ubuntu 18.04 with 1 GPU.
When I run the following script:
python train.py -c 0 --batch_size 4 --data ./data/
I got "TypeError: forward() takes 2 positional arguments but 5 were given" in the error log below:

loading annotations into memory...
Done (t=7.84s)
creating index...
index created!
loading annotations into memory...
Done (t=0.27s)
creating index...
index created!
initializing weights...
0%| | 0/29571 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 172, in train
    _, regression, classification, anchors = model(imgs)
  File "/home/henry/miniconda3/envs/remnav/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/henry/miniconda3/envs/remnav/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/henry/miniconda3/envs/remnav/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
TypeError: forward() takes 2 positional arguments but 5 were given

(the same traceback repeats on every iteration)

Pascal voc2007 problem

Thanks for your good job! I have a problem: training d0 on the pascal voc2007 dataset, the test result is almost 0. Is this just because of the anchor ratios? And why is the result file so large...

anchor implementation

Hi author @zylo117 ,

Just read your code for anchors. It looks like you are using the anchor implementation from Faster R-CNN.

In your efficientdet backbone, I see the definition of anchor_scale, which remains at its default. However, in #17 there is some discussion that we need to pass both anchor scales and anchor ratios to customize them.

So I want to know: why is self.anchor_scale set to [4., 4., 4., 4., 4., 4., 4., 5.] instead of kwargs.get('scales', [2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)])?

Here is the code for your reference:
self.anchor_scale = [4., 4., 4., 4., 4., 4., 4., 5.]
self.aspect_ratios = kwargs.get('ratios', [(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)])
self.num_scales = len(kwargs.get('scales', [2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)]))

Thank you,

GPU memory when training

Hi, @zylo117, thanks for your remarkable work here. I have some questions about GPU memory usage during training. First, how many GPUs did you use to train a model from scratch with coefficient D7, and how much memory is allocated on each GPU? Second, when I try to train from scratch with coefficient D7 on my own dataset with two 2080ti GPUs (each with 12G of memory) and batch size set to 1, a cuda out-of-memory error occurs. I wonder why: is it because my dataset has more classes than COCO, or because my GPUs simply cannot train coefficient D7?

how many epochs?

Hi, I want to know how many epochs of training it takes to get a good result?

loss odd

Hi, thanks for your great job first. I have a loss problem and hope you can help me:
I used my custom dataset and have verified its correctness, but when I train the model the loss is very odd. What's wrong with my setup? (I wrote the json file info into the yml file.)

python train.py -c 7 -n 2 --batch_size 1 --lr 0.00025 --optim sgd --es_patience 10 --data_path '/' --log_path /efficientdet/logs/ --load_weights /efficientdet/weights/efficientdet-d7.pth --saved_path /results/efficientdet/results/ -p 'dla'

The loss is either zero or a few repeated values, especially Cls_loss:

Step: 31. Epoch: 0/500. Iteration: 39/400. Cls loss: 2.30212. Reg loss: 0.64907. Total loss: 2.95119
Step: 32. Epoch: 0/500. Iteration: 40/400. Cls loss: 2.30212. Reg loss: 1.08668. Total loss: 3.38880
Step: 33. Epoch: 0/500. Iteration: 41/400. Cls loss: 0.00000. Reg loss: 0.00000. Total loss: 0.00000
Step: 34. Epoch: 0/500. Iteration: 42/400. Cls loss: 2.30212. Reg loss: 1.13490. Total loss: 3.43702
Step: 35. Epoch: 0/500. Iteration: 43/400. Cls loss: 0.00000. Reg loss: 0.00000. Total loss: 0.00000
Step: 36. Epoch: 0/500. Iteration: 44/400. Cls loss: 0.00000. Reg loss: 0.00000. Total loss: 0.00000
Step: 37. Epoch: 0/500. Iteration: 45/400. Cls loss: 2.30212. Reg loss: 1.03828. Total loss: 3.34040
Step: 38. Epoch: 0/500. Iteration: 46/400. Cls loss: 2.30212. Reg loss: 0.84100. Total loss: 3.14313
Step: 39. Epoch: 0/500. Iteration: 47/400. Cls loss: 2.30212. Reg loss: 0.90405. Total loss: 3.20618

Got cuda out of memory on my RTX 2080ti when running efficientdet_test.py???

(py3.6.2) @hpcl-X299-UD4-Pro:~/peng/code/Yet-Another-EfficientDet-Pytorch$ python efficientdet_test.py
running speed test...
test1: model inferring and postprocessing
inferring image for 10 times...
0.07670512199401855 seconds, 13.036939046625593 FPS, @batch_size 1
test2: model inferring only
inferring images for batch_size 32 for 10 times...
Traceback (most recent call last):
  File "efficientdet_test.py", line 127, in <module>
    _, regression, classification, anchors = model(x)
  File "/home/.pyenv/versions/py3.6.2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/peng/code/Yet-Another-EfficientDet-Pytorch/backbone.py", line 73, in forward
    _, p3, p4, p5 = self.backbone_net(inputs)
  File "/home/.pyenv/versions/py3.6.2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/peng/code/Yet-Another-EfficientDet-Pytorch/efficientdet/model.py", line 401, in forward
    x = self.model._swish(x)
  File "/home/.pyenv/versions/py3.6.2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/peng/code/Yet-Another-EfficientDet-Pytorch/efficientnet/utils.py", line 54, in forward
    return SwishImplementation.apply(x)
  File "/home/peng/code/Yet-Another-EfficientDet-Pytorch/efficientnet/utils.py", line 41, in forward
    result = i * torch.sigmoid(i)
RuntimeError: CUDA out of memory. Tried to allocate 3.52 GiB (GPU 0; 10.76 GiB total capacity; 8.64 GiB already allocated; 863.62 MiB free; 8.91 GiB reserved in total by PyTorch)

evaluation error

When I use your pretrained d0 weights and run the evaluation script as below, I get an error:
python coco_eval.py -c 0 -w weights/efficientdet-d0.pth

running coco-style evaluation on project coco, weights weights/efficientdet-d0.pth...
loading annotations into memory...
Done (t=0.28s)
creating index...
index created!
Loading and preparing results...
Traceback (most recent call last):
  File "coco_eval.py", line 146, in <module>
    eval(coco_gt, image_ids, f'{SET_NAME}_bbox_results.json')
  File "coco_eval.py", line 114, in eval
    coco_pred = coco_gt.loadRes(pred_json_path)
  File "/home/henry/miniconda3/env/test/lib/python3.6/site-packages/pycocotools/coco.py", line 317, in loadRes
    'Results do not correspond to current coco set'
AssertionError: Results do not correspond to current coco set

inference time question

Hi,

Your implementation has greatly improved the speed, but it still seems slow compared with yolov3, considering that EfficientDet-D0 has substantially fewer parameters.

Is post-processing restricting the running time?
Did you test the average cost time of YOLOv3 running on the 2080Ti?

Thanks,
Tairan Chen

Learning rate advice

Hi @zylo117 ,

I've found your tips on training the model from scratch and I've been wondering about the learning rate setup. In the official repository, they used a rate of 0.16 with 128 images per batch, so scaling this to, let's say, a batch of 12 images would result in a learning rate of 0.16 * 12/128 = 0.015. (I know there are different opinions about scaling the learning rate according to the batch size, but let's stick with linear scaling for simplicity.) In the readme, you suggest using a learning rate of 1e-4 or even 1e-5. Could you please share your experience with learning rate settings here?

Best,
Daniel

anchor box

Hi!
How can I use custom anchor boxes, the same as in yolo?
Thanks

can not load model weights

Thanks for sharing your code. When I loaded the weights, I found the dimensions were wrong, even though I strictly followed your code for loading.

def get_net():
    num_classes = 7
    anchors_ratios = '[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]'
    anchors_scales = '[2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)]'
    compound_coef = 2
    my_model = EfficientDetBackbone(num_classes=num_classes,
                                    compound_coef=compound_coef,
                                    ratios=eval(anchors_ratios),
                                    scales=eval(anchors_scales))
    weights_path = './efficientdet-d2.pth'
    my_model.load_state_dict(torch.load(weights_path), strict=False)
    return my_model

if __name__ == "__main__":
    model = get_net()

RuntimeError: Error(s) in loading state_dict for EfficientDetBackbone:
    size mismatch for classifier.header.pointwise_conv.conv.weight: copying a param with shape torch.Size([810, 112, 1, 1]) from checkpoint, the shape in current model is torch.Size([63, 112, 1, 1]).
    size mismatch for classifier.header.pointwise_conv.conv.bias: copying a param with shape torch.Size([810]) from checkpoint, the shape in current model is torch.Size([63]).

Why do I get "TypeError: object of type <class 'numpy.float64'> cannot be safely interpreted as an integer"?

When I run eval and load the bboxes, I get the error below.
Could anyone help me take a look? Many thanks.

running coco-style evaluation on project coco2017, weights /notebooks/storage/logs/coco2017/efficientdet-d2_15_912.pth...
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.88s)
creating index...
index created!
BBox
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/numpy/core/function_base.py", line 117, in linspace
    num = operator.index(num)
TypeError: 'numpy.float64' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "coco_eval.py", line 164, in <module>
    eval(coco_gt, image_ids, f'{SET_NAME}_bbox_results.json')
  File "coco_eval.py", line 133, in eval
    coco_eval = COCOeval(coco_gt, coco_pred, 'bbox')
  File "/usr/local/lib/python3.6/dist-packages/pycocotools/cocoeval.py", line 76, in __init__
    self.params = Params(iouType=iouType) # parameters
  File "/usr/local/lib/python3.6/dist-packages/pycocotools/cocoeval.py", line 527, in __init__
    self.setDetParams()
  File "/usr/local/lib/python3.6/dist-packages/pycocotools/cocoeval.py", line 507, in setDetParams
    self.iouThrs = np.linspace(.5, 0.95, np.round((0.95 - .5) / .05) + 1, endpoint=True)
  File "<__array_function__ internals>", line 6, in linspace
  File "/usr/local/lib/python3.6/dist-packages/numpy/core/function_base.py", line 121, in linspace
    .format(type(num)))
TypeError: object of type <class 'numpy.float64'> cannot be safely interpreted as an integer.

data augmentation

I saw that the data augmentation only does left-right flips. I wonder whether this single augmentation achieves the SOTA, i.e. whether the reported weights/benchmark metrics were obtained with only this flip as data augmentation.

Best, and thanks for sharing your FAQ, especially the part about your repo vs the others.

RuntimeError: Source tensor must be contiguous

Traceback (most recent call last):
  File "/home/effdet/train.py", line 221, in train
    cls_loss, reg_loss = model(imgs, annot, obj_list=params.obj_list)
  File "/home/anaconda3/envs/python3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/anaconda3/envs/python3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 148, in forward
    inputs, kwargs = self.scatter(inputs, kwargs, self.device_ids)
  File "/home/effdet/utils/utils.py", line 183, in scatter
    for device_idx in range(len(devices))], \
  File "/home/effdet/utils/utils.py", line 183, in <listcomp>
    for device_idx in range(len(devices))], \
RuntimeError: Source tensor must be contiguous.

Hi, thanks for your code. When I train on a single GPU it runs well, but when I run on two GPUs this error happens. Could you give me some advice? Thank you.

Some questions about input image size

Sorry, if I want to use the efficientdet-d7.pth model to train on my own datasets, do I need to resize the images to the corresponding size from the paper?

Bias in point-wise sep conv in BiFPN

Hi @zylo117 ,

good job finding so many differences between the other PyTorch repos and the official implementation. I've got one question about the BiFPN convolutions, i.e. here:

self.conv6_up = SeparableConvBlock(num_channels, onnx_export=onnx_export)

You're using a separable conv followed by batch norm. My question is: why do you use a bias in the pointwise conv? A bias followed by batch norm is unnecessary, right?

Table Detection

Thanks for the author's excellent work!
I am using this implementation for table detection. At epoch 66 the val_loss dropped to 0.23, but the predictions are still not very good. Does the author have any good suggestions? Thanks!

error loading pretrained weights

File "train.py", line 113, in train
model.load_state_dict(torch.load(weights_path))
File "/home/phucnhs/miniconda3/envs/python36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 830, in load_state_dict
self.class.name, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for EfficientDetBackbone:
size mismatch for classifier.header.pointwise_conv.conv.weight: copying a param with shape torch.Size([810, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([45, 64, 1, 1]).
size mismatch for classifier.header.pointwise_conv.conv.bias: copying a param with shape torch.Size([810]) from checkpoint, the shape in current model is torch.Size([45]).
Hi!
I use a custom dataset with 5 classes, and for training I used the coco pretrained weights downloaded from this repo.
How do I solve this?
Thanks

How to convert the model to onnx?

Thank you for your contribution! I want to convert the model to onnx, but I get many errors. Could you tell me when you will update the code to support the onnx format?

Support for Training Negative Samples

Hi, thanks for your code.
I have a question: what would be a good way to add support for negative samples (a background label) in training in this repository?

unstable loss during training epochs

Hi @zylo117 ,

I found the training loss changes dynamically during training epoches.
e.g. efficient-d1, during last several batch iters of epoch 0, the loss about 0.3, 0.08
batch size 32, init lr 0.005

then during the epoch 1, it will be much larger than that, i have tried several times and this will happen every time. My custom data should be ok, since offical tf implement could be trained with same dataset normally.
Do you have any clues for this>

Step: 895. Epoch: 0/500. Iteration: 896/897. Cls loss: 0.31577. Reg loss: 0.06190. Total loss: 0.37767
Step: 896. Epoch: 0/500. Iteration: 897/897. Cls loss: 0.34984. Reg loss: 0.07105. Total loss: 0.42089
Val. Epoch: 0/500. Classification loss: 0.32317. Regression loss: 0.14466. Total loss: 0.46783
Step: 897. Epoch: 1/500. Iteration: 1/897. Cls loss: 0.34505. Reg loss: 0.16021. Total loss: 0.50526
Step: 898. Epoch: 1/500. Iteration: 2/897. Cls loss: 1.12159. Reg loss: 2.12988. Total loss: 3.25147
Step: 899. Epoch: 1/500. Iteration: 3/897. Cls loss: 2.10690. Reg loss: 0.88227. Total loss: 2.98917
Step: 900. Epoch: 1/500. Iteration: 4/897. Cls loss: 2.30212. Reg loss: 80.03463. Total loss: 82.33675
Step: 901. Epoch: 1/500. Iteration: 5/897. Cls loss: 2.30212. Reg loss: 67977.46094. Total loss: 67979.76562
Step: 902. Epoch: 1/500. Iteration: 6/897. Cls loss: 2.30212. Reg loss: 65421.88281. Total loss: 65424.18359
Step: 903. Epoch: 1/500. Iteration: 7/897. Cls loss: 2.30212. Reg loss: 1573417.00000. Total loss: 1573419.25000
Step: 904. Epoch: 1/500. Iteration: 8/897. Cls loss: 2.70166. Reg loss: 272323072.00000. Total loss: 272323072.0
Step: 905. Epoch: 1/500. Iteration: 9/897. Cls loss: 2.78378. Reg loss: 5323778686976.00000. Total loss: 5323778

Step: 1142. Epoch: 1/500. Iteration: 246/897. Cls loss: 62.49104. Reg loss: 0.50637. Total loss: 62.99741
Step: 1143. Epoch: 1/500. Iteration: 247/897. Cls loss: 73.22089. Reg loss: 1213188314532348763832320.00000.
Step: 1144. Epoch: 1/500. Iteration: 248/897. Cls loss: 71.14322. Reg loss: 451076503452162755395584.00000.
Step: 1145. Epoch: 1/500. Iteration: 249/897. Cls loss: 66.21103. Reg loss: 441545013143201799733248.00000.
Step: 1146. Epoch: 1/500. Iteration: 250/897. Cls loss: 68.61477. Reg loss: 235899719417569696808960.00000.
Step: 1147. Epoch: 1/500. Iteration: 251/897. Cls loss: 61.03159. Reg loss: 359504415912071251623936.00000.
Step: 1148. Epoch: 1/500. Iteration: 252/897. Cls loss: 62.65581. Reg loss: 1484498270567262345756672.00000.
Step: 1149. Epoch: 1/500. Iteration: 253/897. Cls loss: 63.57681. Reg loss: 703274623820396710854656.00000.
Step: 1150. Epoch: 1/500. Iteration: 254/897. Cls loss: 66.59061. Reg loss: 229218953629538736668672.00000.
Step: 1151. Epoch: 1/500. Iteration: 255/897. Cls loss: 66.09634. Reg loss: 871639065787460319444992.00000.
Step: 1152. Epoch: 1/500. Iteration: 256/897. Cls loss: 68.23049. Reg loss: 1114574254499712676659200.00000.
Step: 1153. Epoch: 1/500. Iteration: 257/897. Cls loss: 61.66665. Reg loss: 350685359035363289464832.00000.

Request: Segmentation

The paper describes small changes to make a segmentation model. Can you please incorporate them into your code?

How did you get your model parameters?

Hi, I want to know how you got the pretrained models. Did you retrain them with your own code, or transfer the official model from tensorflow to pytorch?

No such operator torchvision::nms

when I run python efficientdet_test.py
there is an error, perhaps due to the version of torchvision?
But I have reinstalled the required version of every tool: pycocotools, torch==1.4.0 and torchvision==0.5.0.
I don't know why the error still appears.

I'm messed up now...
