
YOLOV3


Introduction

This is my own implementation of YOLOV3, written in PyTorch; it is also the first object detection model I have reproduced. The dataset used is PASCAL VOC, and the evaluation tool is the VOC2010 mAP metric. The mAP now reaches the target score.

I will continue to update the code to make it more concise and to add new, efficient tricks.

Note: this repository now supports model compression in the new branch model_compression.


Results

name             Train Dataset                Val Dataset   mAP(others)   mAP(mine)       notes
YOLOV3-448-544   2007trainval + 2012trainval  2007test      0.769         0.768 | -       baseline (augment + step lr)
YOLOV3-*-544     2007trainval + 2012trainval  2007test      0.793         0.803 | -       + multi-scale training
YOLOV3-*-544     2007trainval + 2012trainval  2007test      0.806         0.811 | -       + focal loss (note the conf_loss starts lower)
YOLOV3-*-544     2007trainval + 2012trainval  2007test      0.808         0.813 | -       + giou loss
YOLOV3-*-544     2007trainval + 2012trainval  2007test      0.812         0.821 | -       + label smooth
YOLOV3-*-544     2007trainval + 2012trainval  2007test      0.822         0.826 | -       + mixup
YOLOV3-*-544     2007trainval + 2012trainval  2007test      0.833         0.832 | 0.840   + cosine lr
YOLOV3-*-*       2007trainval + 2012trainval  2007test      0.858         0.858 | 0.860   + multi-scale test and flip, nms threshold 0.45

Note:

  • YOLOV3-448-544 means the training image size is 448 and the test image size is 544. "*" means multi-scale.
  • The format of mAP(mine) is (use_difficult mAP | no_difficult mAP).
  • In testing, the NMS threshold is 0.5 (except for the last row, which uses 0.45; the lower threshold increases the mAP) and the confidence threshold is 0.01.
  • Currently, only single-GPU training and testing are supported.

Environment

  • Nvidia GeForce RTX 2080 Ti
  • CUDA 10.0
  • cuDNN 7.0
  • Ubuntu 16.04
  • Python 3.5
# install packages
pip3 install -r requirements.txt --user

Brief

  • Data Augment (RandomHorizontalFlip, RandomCrop, RandomAffine, Resize)
  • Step lr Schedule
  • Multi-scale Training (320 to 640)
  • focal loss
  • GIOU
  • Label smooth
  • Mixup
  • cosine lr (a minimal schedule sketch follows this list)
  • Multi-scale Test and Flip
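
For the "cosine lr" item above, here is a minimal sketch of a cosine learning-rate schedule with linear warmup, the shape commonly paired with YOLOv3-style training; the function name and the warmup_steps/total_steps parameters are illustrative, not this repo's API:

import math

def cosine_lr(step, total_steps, lr_max, lr_min=0.0, warmup_steps=0):
    # Linear warmup from 0 to lr_max, then cosine decay down to lr_min.
    if step < warmup_steps:
        return lr_max * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))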

Prepared work

1. Git clone the YOLOV3 repository

git clone https://github.com/Peterisfar/YOLOV3.git

update the "PROJECT_PATH" in the params.py.

2. Download the dataset

  • Download the Pascal VOC datasets: VOC2012_trainval, VOC2007_trainval, and VOC2007_test. Put them in one directory and update "DATA_PATH" in params.py.
  • Convert the data format: convert the Pascal VOC *.xml annotations to the custom format (Image_path0   xmin0,ymin0,xmax0,ymax0,class0   xmin1,ymin1...). A parsing sketch follows the commands below.
cd YOLOV3 && mkdir data
cd utils
python3 voc.py # get train_annotation.txt and test_annotation.txt in data/
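
For reference, a minimal sketch of reading one line of the custom annotation format produced above; the function name is illustrative, not the repo's API:

def parse_annotation_line(line):
    # One image per line: image path, then space-separated boxes of the
    # form xmin,ymin,xmax,ymax,class.
    parts = line.strip().split()
    image_path, boxes = parts[0], []
    for box in parts[1:]:
        values = box.split(",")
        xmin, ymin, xmax, ymax = map(float, values[:4])  # pixel coordinates
        boxes.append((xmin, ymin, xmax, ymax, int(values[4])))  # class index last
    return image_path, boxes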

3. Download the weight file

Make a weight/ directory in YOLOV3 and put the weight file in it.


Train

Run the following command to start training; see the details in config/yolov3_config_voc.py.

WEIGHT_PATH=weight/darknet53_448.weights

CUDA_VISIBLE_DEVICES=0 nohup python3 -u train.py --weight_path $WEIGHT_PATH --gpu_id 0 > nohup.log 2>&1 &

Notes:

  • Training progress can be checked by running "cat nohup.log".
  • Training can be resumed by adding --resume; last.pt will be loaded automatically.

Test

You should define your weight file path WEIGHT_PATH and your test data path DATA_TEST:

WEIGHT_PATH=weight/best.pt
DATA_TEST=./data/test # your own images

CUDA_VISIBLE_DEVICES=0 python3 test.py --weight_path $WEIGHT_PATH --gpu_id 0 --visiual $DATA_TEST --eval

The resulting images can be seen in data/.


TODO

  • Mish
  • OctConv
  • Custom data


Issues

About utils/datasets.py

line111: for i in range(3):
line112: label[i][..., 5] = 1.0

Why is the conf of all the map pixels set to 1.0? Does it mean every pixel contains an object?

Questions about the YOLO loss

Your code is very concise and powerful!
But I have some questions about the YOLO loss:
Your strategy seems to differ from the one in the YOLOv3 paper.
You compute the IoU between the gt box and the predicted anchors of each feature map, and assign an anchor as positive whenever IoU > 0.3. If the IoU between the gt box and every anchor across all feature maps is below 0.3, the anchor with the largest IoU across all feature maps is assigned.
My understanding of the YOLOv3 strategy is:
compute the IoU between the gt box and the anchors across all feature maps, and assign only the single anchor with the largest IoU.
What is the benefit of your approach? I would love to discuss this with you.
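
A minimal sketch of the assignment rule described above, as an illustration of the question's reading of the code (not the repo's actual implementation):

import numpy as np

def assign_anchors(ious, thresh=0.3):
    # ious: (num_scales, anchors_per_scale) IoUs of one gt box with every
    # anchor. Anchors above the threshold become positive; if none qualify,
    # fall back to the single best-matching anchor across all scales.
    positive = ious > thresh
    if not positive.any():
        positive[np.unravel_index(np.argmax(ious), ious.shape)] = True
    return positive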

Validation is extremely slow

My setup: a cloud platform with a 4-core CPU, 24 GB of RAM, and a 2080 Ti (11 GB).
During training the GPU works normally at 100% utilization, but during testing GPU utilization is 0% and it is extremely slow. I have read the related earlier issues but did not find a solution.
I then tested with the best.pt weights on my own machine (which has no GPU): testing one image took 1-2 s there, but 15-25 s on the cloud platform.

Note: neither multi_scale nor flip was enabled.

A question about your label processing

My own YOLOv3 loss keeps producing NaN values, so I plugged your loss function into my code to check whether my own loss computation was at fault. I noticed this snippet where your loss computes the cls loss:
label_cls = label[..., 6:]
label_mix = label[..., 5:6]
In principle, label_cls should be label[..., 5:].
May I ask what the fifth position represents? I have been busy lately and have not had time to read your dataloader carefully. If you get a chance, could you briefly explain where this is handled in your code? Thank you!
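
For context, the per-anchor label layout can be inferred from the utils/datasets.py excerpt quoted in a later issue on this page; the index meanings below are an inference, not something confirmed by the author:

# label[..., 0:4] -> bbox (x, y, w, h)
# label[..., 4:5] -> objectness/conf (set to 1.0 for assigned anchors)
# label[..., 5:6] -> mixup weight (1.0 when mixup is off)
# label[..., 6:]  -> smoothed one-hot class vector
label_cls = label[..., 6:]   # classes therefore start at index 6
label_mix = label[..., 5:6]  # index 5 carries the mixup weight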

Anchor

Could you explain the scale relationship between this set of anchors and the anchors in the comments?
MODEL = {"ANCHORS":[[(1.25, 1.625), (2.0, 3.75), (4.125, 2.875)], # Anchors for small obj(12,16),(19,36),(40,28)
[(1.875, 3.8125), (3.875, 2.8125), (3.6875, 7.4375)], # Anchors for medium obj(36,75),(76,55),(72,146)
[(3.625, 2.8125), (4.875, 6.1875), (11.65625, 10.1875)]], # Anchors for big obj(142,110),(192,243),(459,401)
"STRIDES":[8, 16, 32],
"ANCHORS_PER_SCLAE":3
}
Thanks.
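
One observation that may answer this, offered as an inference rather than a confirmed reply: the float anchors match the standard YOLOv3 COCO anchors divided by each scale's stride, while the commented integer anchors do not (12 / 8 = 1.5, not 1.25):

strides = [8, 16, 32]
coco_small = [(10, 13), (16, 30), (33, 23)]  # standard COCO anchors for the stride-8 scale
grid_anchors = [(w / strides[0], h / strides[0]) for (w, h) in coco_small]
print(grid_anchors)  # [(1.25, 1.625), (2.0, 3.75), (4.125, 2.875)] -- the configured floats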

About testing

Hi, I don't quite understand the Test section. Am I supposed to test on my own images, or on the downloaded VOC2007 test set?

multi_scale does not take effect within a single epoch?

Because of my own requirements I need to add some extra data to the training set, so I printed the tensor size of each batch. I found that within a single epoch the image size stays at 448 everywhere; it does not change with the multi-scale code.

My tensor.size print is placed right after imgs = imgs.to(self.device); the multi-scale code is at https://github.com/Peterisfar/YOLOV3/blob/03a834f88d57f6cf4c5016a1365d631e8bbbacea/train.py#L131-L134.

Epoch:[ 0 | 49 ]    Batch:[ 0 | 2068 ]    loss_giou: 2.3399    loss_conf: 1.8814    loss_cls: 1.1482    loss: 5.3695    lr: 0
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
multi_scale_img_size : 512
torch.Size([8, 3, 448, 448])
Epoch:[ 0 | 49 ]    Batch:[ 10 | 2068 ]    loss_giou: 2.0293    loss_conf: 3.2374    loss_cls: 2.2489    loss: 7.5156    lr: 2.41663e-07
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
multi_scale_img_size : 512
torch.Size([8, 3, 448, 448])
Epoch:[ 0 | 49 ]    Batch:[ 20 | 2068 ]    loss_giou: 2.1348    loss_conf: 2.8944    loss_cls: 2.2905    loss: 7.3198    lr: 4.83325e-07
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])

But at the end of the first epoch the image size did change successfully. I am not sure whether this is a bug.

Epoch:[ 0 | 49 ]    Batch:[ 2050 | 2068 ]    loss_giou: 2.0007    loss_conf: 2.3295    loss_cls: 2.1310    loss: 6.4612    lr: 4.95408e-05
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
multi_scale_img_size : 544
torch.Size([8, 3, 448, 448])
Epoch:[ 0 | 49 ]    Batch:[ 2060 | 2068 ]    loss_giou: 1.9995    loss_conf: 2.3252    loss_cls: 2.1278    loss: 6.4524    lr: 4.97825e-05
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([8, 3, 448, 448])
torch.Size([7, 3, 448, 448])
torch.Size([8, 3, 544, 544])

baseline

Hello, does the baseline in your code refer to training YOLO with Darknet?

A question about the model's mAP

Hello. Although your code is concise, as I got deeper into it I found that it does not seem to achieve good detection results. With an image size of 544x544, conf_threshold = 0.01 and nms_threshold = 0.5, when detecting 000001.jpg from voc_test there are still 276 bounding boxes left after non-maximum suppression, while there is only one GT box, so the model produces many false positives. This made me question whether your eval mAP computation is correct. Running test.py does put some good-looking detection images in data/result/, but min_score_thresh in visualize_boxes in utils/visualize is 0.5, so those images are actually obtained at conf_threshold = 0.5. I evaluated your detection results with my own eval method at conf_threshold = 0.5, and the mAP was rather poor...

Validation after every epoch is extremely slow?

Thank you for the code. Can the batch size used for validation after each epoch be changed? Right now it evaluates one image at a time, and the VOC2007 test set has many images, so every epoch spends a very long time on validation. Is a parameter set incorrectly on my side, and how should I change it? Thanks.

Clustering anchors

May I ask how the anchors were obtained? Is there code you could share?
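
The repo does not appear to ship clustering code; below is a minimal sketch of the standard YOLOv2-style k-means over box widths and heights with an IoU distance (d = 1 - IoU), which is how such anchors are usually generated. All names are illustrative:

import numpy as np

def iou_wh(boxes, centroids):
    # IoU between (N, 2) box sizes and (k, 2) centroid sizes, treating all
    # boxes as if aligned at a common corner.
    inter = (np.minimum(boxes[:, None, 0], centroids[None, :, 0]) *
             np.minimum(boxes[:, None, 1], centroids[None, :, 1]))
    union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
            (centroids[:, 0] * centroids[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, centroids), axis=1)  # nearest = highest IoU
        centroids = np.array([boxes[assign == i].mean(axis=0) if (assign == i).any()
                              else centroids[i] for i in range(k)])
    return centroids[np.argsort(centroids.prod(axis=1))]  # sorted by area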

A logic error in voc.py

When use_difficult_bbox=False, the condition check is logically wrong, and the generated annotation file contains hard-to-recognize labels.
if (not use_difficult_bbox) and (difficult == 1):  # difficult marks whether the object is easy to recognize: 0 = easy, 1 = difficult
continue
should be changed to
if ( use_difficult_bbox) and (difficult == 1):  # difficult marks whether the object is easy to recognize: 0 = easy, 1 = difficult
continue

Training question

Can your code train on only a few specific VOC classes, rather than all 20?

IndexError: Caught IndexError in DataLoader worker process 1.

File "train.py", line 99, in train
for i, (imgs, label_sbbox, label_mbbox, label_lbbox, sbboxes, mbboxes, lbboxes) in enumerate(self.train_dataloader):
File "/home/aistudio/work/torch1/pytorch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 801, in next
return self._process_data(data)
File "/home/aistudio/work/torch1/pytorch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
data.reraise()
File "/home/aistudio/work/torch1/pytorch/lib/python3.6/site-packages/torch/_utils.py", line 369, in reraise
raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 1.
Original Traceback (most recent call last):
File "/home/aistudio/work/torch1/pytorch/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/home/aistudio/work/torch1/pytorch/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/aistudio/work/torch1/pytorch/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/aistudio/work/newyolov3/YOLOV3/utils/datasets.py", line 36, in getitem
img_mix, bboxes_mix = self.__parse_annotation(self.__annotations[item_mix])
File "/home/aistudio/work/newyolov3/YOLOV3/utils/datasets.py", line 84, in __parse_annotation
img, bboxes = dataAug.Resize((self.img_size, self.img_size), True)(np.copy(img), np.copy(bboxes))
File "/home/aistudio/work/newyolov3/YOLOV3/utils/data_augment.py", line 97, in call
bboxes[:, [0, 2]] = bboxes[:, [0, 2]] * resize_ratio + dw
IndexError: too many indices for array
This error appeared after about 60 batches of the first epoch.
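
The "too many indices for array" failure on bboxes[:, [0, 2]] usually means an image ended up with zero boxes, so bboxes has shape (0,) instead of (N, 5). A hedged guard along these lines (an assumption about the cause, not a confirmed fix; resize_ratio/dw/dh follow the Resize code named in the traceback):

import numpy as np

def resize_boxes(bboxes, resize_ratio, dw, dh):
    # Keep the array 2-D even when empty, and skip the transform for no boxes.
    bboxes = np.asarray(bboxes, dtype=np.float32).reshape(-1, 5)
    if len(bboxes):
        bboxes[:, [0, 2]] = bboxes[:, [0, 2]] * resize_ratio + dw
        bboxes[:, [1, 3]] = bboxes[:, [1, 3]] * resize_ratio + dh
    return bboxes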

[Question] Batch size?

Hello, I see that the batch size is set to 8 in the config file, and after reading the code carefully I did not find anywhere else that changes it. Your 2080 Ti is quite a good card; is there a particular reason for setting the batch size this small?
Thanks.

best.pt link

Hi, the best.pt link you shared has expired. Could you update it? Many thanks!

augment

Hi, and thanks for sharing the code. There is a mixup function in the data augmentation. Why is a Beta distribution used to mix the image data, and why is the parameter after class_id a value between 0 and 1?
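
A minimal sketch of box-level mixup as described in "Bag of Freebies", which this augmentation appears to follow: a Beta-distributed weight blends two images, and each box carries its image's weight (the 0-1 value after class_id) so the loss can be scaled per box. The alpha value and names are assumptions:

import numpy as np

def mixup(img1, boxes1, img2, boxes2, alpha=1.5):
    lam = np.random.beta(alpha, alpha)          # blending weight in (0, 1)
    img = lam * img1 + (1.0 - lam) * img2       # pixel-wise blend
    boxes1 = np.concatenate([boxes1, np.full((len(boxes1), 1), lam)], axis=1)
    boxes2 = np.concatenate([boxes2, np.full((len(boxes2), 1), 1.0 - lam)], axis=1)
    return img, np.concatenate([boxes1, boxes2], axis=0)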

LOSS

Sorry to bother you. I read your comment in ultralytics/yolov3. Did you change his loss function to the original YOLOv3 paper version?

VOC training problem

I trained with ultralytics' yolov3 and could not get it to train at all; the mAP would not even get above 1. I noticed you have run into this problem too. Do you know what causes it?

About training

Hey guys! When I train on my own images, it trains for 20 epochs and only then begins to validate; the total number of epochs I set is 50.

Why do you set epoch >= 20 in train.py?

            if epoch >= 20:
                print('*'*20+"Validate"+'*'*20)
                with torch.no_grad():
                    APs = Evaluator(self.yolov3).APs_voc()
                    for i in APs:
                        print("{} --> mAP : {}".format(i, APs[i]))
                        mAP += APs[i]
                    mAP = mAP / self.train_dataset.num_classes
                    print('mAP:%g'%(mAP))

            self.__save_model_weights(epoch, mAP)
            print('best mAP : %g' % (self.best_mAP))

Detecting too many extra bounding boxes with the test threshold of 0.01

Dear author,
When evaluating mAP on the VOC dataset, the performance is much better than the darknet YOLOv3 version. However, there are many more extra predicted bounding boxes. The test threshold defaults to 0.005 in the darknet version and the number of false positives is about 10000-11000. However, in this PyTorch version, even with a threshold of 0.1, the number of FPs is still high. Could you please tell me the reason and how to solve it?
Thanks

Is it possible to convert weights from pt to darknet?

Hi,
Thank you so much for your work on this repository. I would like to know whether it is possible to convert trained weights from the PyTorch (pt) format to the darknet format, so that they can be used with darknet?

Thanks.

About training

Hi, I followed your training steps, but I don't understand what this line of the command means. Is it a mistake? I hope you can explain when you have time. The command:
CUDA_VISIBLE_DEVICES=0 nohup python3 -u train.py --weight_path $WEIGHT_PATH --gpu_id 0 > nohup.log 2>&1 &
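
For reference, this is a standard shell invocation rather than anything repo-specific: CUDA_VISIBLE_DEVICES=0 restricts the process to GPU 0; nohup together with the trailing & runs training in the background and keeps it alive after the terminal closes; -u makes Python's output unbuffered so nohup.log updates live; and > nohup.log 2>&1 redirects both stdout and stderr into nohup.log.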

GPU training

Hi, thank you very much for open-sourcing a pure PyTorch YOLOv3. How can I train with multiple GPUs? Thanks.

IDE

Hi,
Which IDE are you using?
Best regards,
PeterPham

some questions about the 'datasets' and 'yolo_loss' code

Many thanks for your work. These days I have been working on an implementation of YOLOv3 and found your repository. I have some questions about the code.

  1. random_crop function: in utils/data_augment.py, in the lines 'crop_xmax = max(w_img, int(max_bbox[2] + random.uniform(0, max_r_trans)))' and 'crop_ymax = max(h_img, int(max_bbox[3] + random.uniform(0, max_d_trans)))', I think the 'max' function should be changed to 'min', because 'h_img' and 'w_img' are always the maximum values.
  2. mixup function: in utils/datasets.py, line 'item_mix = random.randint(0, len(self.__annotations)-1)'. When I tested it myself, there is a case where item_mix == item. I think it would be better to add an if condition for this case before 'bboxes = np.concatenate([bboxes_org, bboxes_mix])'.
  3. label_smooth: I read the paper 'Bag of Freebies', and in section 3.2, equation (3), the formula differs from your implementation. The one in the paper is 1 - delta if i = y else delta / (K - 1), while your code is 1 - delta + delta / K if i = y else delta / K. Is that correct?
  4. loss function: I don't understand 'bbox_loss_scale = 2.0 - 1.0 * label_xywh[..., 2:3] * label_xywh[..., 3:4] / (img_size ** 2)' in the giou loss. Is it from the GIOU paper? Why is it 2 minus the label box area over the image area? As for 'label_noobj_mask * FOCAL(input=p_conf, target=label_obj_mask)' in the confidence loss, I find that some other implementations use a scale factor to balance the negative samples; see eriklindernoren/PyTorch-YOLOv3#309. How should that scale value be determined?
    Could you please explain these? Many thanks!
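
For point 3, the two label-smoothing variants side by side as a sketch (delta is the smoothing factor, K the number of classes; names are illustrative):

import numpy as np

def smooth_paper(one_hot, delta):
    # Bag of Freebies, eq. (3): 1 - delta on the target class, delta/(K-1) elsewhere.
    K = one_hot.shape[-1]
    return one_hot * (1 - delta) + (1 - one_hot) * (delta / (K - 1))

def smooth_repo(one_hot, delta):
    # The variant described for this repo: a uniform delta/K added everywhere,
    # giving 1 - delta + delta/K on the target class and delta/K elsewhere.
    K = one_hot.shape[-1]
    return one_hot * (1 - delta) + delta / K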

About the architecture

Hi, I'd like to ask: when reproducing YOLOv3 in PyTorch, did you improve the mAP by changing the network architecture? Could you describe the main changes?

Why does training error out shortly after switching to my own dataset?

Hello, after switching to my own dataset, training errors out shortly after it starts. I looked into it and it appears to be a dataset problem, but I have rebuilt the dataset several times and made sure it is correct, yet the problem persists. I hope you can help; many thanks!

index 56 is out of bounds for axis 1 with size 56?

When I ran this code to train on VOC, I hit the following bug:
"index 56 is out of bounds for axis 1 with size 56"

It is caused by the following lines in "utils/dataset.py":
label[i][yind, xind, iou_mask, 0:4] = bbox_xywh
label[i][yind, xind, iou_mask, 4:5] = 1.0
label[i][yind, xind, iou_mask, 5:6] = bbox_mix
label[i][yind, xind, iou_mask, 6:] = one_hot_smooth
How can I solve it? Thanks for your help.
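
The message suggests xind or yind equals the grid size, i.e. a box center falls exactly on the image border. A common guard, shown here as a fragment meant to slot into the quoted lines (an assumed fix, not one confirmed by the author):

import numpy as np

grid_size = label[i].shape[1]                # cells per side at this scale; 56 in the report
xind = int(np.clip(xind, 0, grid_size - 1))  # clamp border-centered boxes into the grid
yind = int(np.clip(yind, 0, grid_size - 1))
label[i][yind, xind, iou_mask, 0:4] = bbox_xywh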

The loss won't go down

Hi, on my own dataset the loss drops to about 0.02 with v5, but with this project it only gets down to about 0.5 and the accuracy won't improve. Have you run into this problem?

About training

I'm sorry to bother you again, but I really don't know where this problem comes from. The error is: RuntimeError: shape '[512, 256, 3, 3]' is invalid for input of size 408910, when I try to run the train.py command shown in the README.md. Could you tell me what the problem is? I just followed your guide and ran train.py.

Anchors setting when training on the custom dataset

Hey, I find your code clean and elegant. When training on my own dataset, I have two questions about the anchors:

  1. What is the relationship between these float anchors and the integer anchors in the comments?
    Is 1.25 obtained by scaling 12 down by 8? Is 1.875 obtained by scaling 36 down by 16?
MODEL = {"ANCHORS":[[(1.25, 1.625), (2.0, 3.75), (4.125, 2.875)], # Anchors for small obj(12,16),(19,36),(40,28)
[(1.875, 3.8125), (3.875, 2.8125), (3.6875, 7.4375)], # Anchors for medium obj(36,75),(76,55),(72,146)
[(3.625, 2.8125), (4.875, 6.1875), (11.65625, 10.1875)]], # Anchors for big obj(142,110),(192,243),(459,401)
"STRIDES":[8, 16, 32],
"ANCHORS_PER_SCLAE":3
}
  2. Is it necessary to use the k-means algorithm to generate new anchors when training on a custom dataset?

Looking forward to your reply.
