Comments (59)

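BowieHsu commented on August 16, 2024

What I observe:
1. The accuracy reported at the end is that of best, and the accuracy of last is actually lower than best. In other words, the current semi-supervised training hyperparameters are having some side effects on your task.
2. Judging from the last epoch, the teacher also performs worse than the student, which suggests that the information the teacher receives from the student is being disturbed as well.

Here are a few things you can try:
1. Keep reducing lr0 and watch how the student and the teacher change.
2. Consider lowering nms_iou_thres: 0.65 under SSOD. If your unlabeled images are very clean and each image does not contain a large number of objects, an NMS setting of 0.65 tends to produce many overlapping pseudo-labels.
3. Consider raising ignore_thres_high: 0.6. This threshold decides whether a pseudo-label is trusted, so raising it also reduces the interference of wrong pseudo-labels with semi-supervised training.
4. The ultimate solution: add a line debug: True under SSOD. During semi-supervised training the library will then render the pseudo-labels onto the unlabeled images and save them. This costs some storage, but it lets you quickly judge the current state of pseudo-label generation.

Feel free to report or ask any question~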
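
To make suggestion 3 more concrete, here is a minimal Python sketch of how a pair of score thresholds could split teacher predictions into reliable, uncertain and discarded pseudo-labels. The three-way split and the function name `split_pseudo_labels` are assumptions for illustration, not code from the library; only the parameter names `ignore_thres_low` and `ignore_thres_high` come from the thread.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float, float]  # x1, y1, x2, y2, score


def split_pseudo_labels(
    boxes: List[Box],
    ignore_thres_low: float = 0.1,
    ignore_thres_high: float = 0.6,
) -> Dict[str, List[Box]]:
    """Group teacher predictions by confidence score (illustrative only)."""
    groups: Dict[str, List[Box]] = {"reliable": [], "uncertain": [], "discarded": []}
    for box in boxes:
        score = box[4]
        if score >= ignore_thres_high:
            groups["reliable"].append(box)    # trusted as a pseudo-label
        elif score >= ignore_thres_low:
            groups["uncertain"].append(box)   # too ambiguous to trust fully
        else:
            groups["discarded"].append(box)   # treated as background
    return groups


# Made-up predictions: raising the high threshold keeps fewer boxes as reliable.
preds = [(0, 0, 10, 10, 0.95), (5, 5, 20, 20, 0.55), (1, 1, 4, 4, 0.05)]
loose = split_pseudo_labels(preds, ignore_thres_high=0.5)
strict = split_pseudo_labels(preds, ignore_thres_high=0.75)
print(len(loose["reliable"]), len(strict["reliable"]))
```

With the made-up scores above, the stricter threshold moves the 0.55 box from the reliable group to the uncertain one, which is the effect the suggestion is after.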

from efficientteacher.

songxinkuan commented on August 16, 2024

Many thanks to the author for this work. After some experiments, here is my personal experience:
(1) Problem encountered: mAP is normal with supervised training, but with semi-supervised training mAP is very poor and gradually drops to 0.
(2) Analysis: the range of scores predicted at inference by a model trained on a custom dataset differs from that of a model trained on COCO. For example, after supervised training on my dataset, the model's prediction scores fall in the range 0.8-0.99 (reliable predictions).
(3) Solution: to make the pseudo-label bboxes more reliable, raise ignore_thres_low and ignore_thres_high to 0.8 and 0.999 respectively (the values should be based on statistics from your custom dataset). (0.8 and 0.999 are values I picked arbitrarily and have not tuned.)
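
The idea of deriving the thresholds from score statistics of the custom dataset can be sketched as follows. `suggest_thresholds` is a hypothetical helper; it assumes you have already collected the confidence scores of the supervised model's predictions on a validation set, and the choice of the 10th and 90th percentiles is arbitrary.

```python
import statistics
from typing import List, Tuple


def suggest_thresholds(scores: List[float], low_pct: int = 10, high_pct: int = 90) -> Tuple[float, float]:
    """Return (low, high) score percentiles as candidate thresholds."""
    if len(scores) < 2:
        raise ValueError("need at least two scores")
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99
    cuts = statistics.quantiles(scores, n=100, method="inclusive")
    return cuts[low_pct - 1], cuts[high_pct - 1]


# Made-up scores in the 0.8-0.99 range mentioned above
scores = [0.80, 0.85, 0.88, 0.90, 0.92, 0.95, 0.97, 0.99]
low, high = suggest_thresholds(scores)
print(f"ignore_thres_low ~ {low:.3f}, ignore_thres_high ~ {high:.3f}")
```

Percentiles are only one possible statistic; the point is that the thresholds follow the score distribution of your own model rather than the defaults.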

ppogg commented on August 16, 2024

This is a great piece of open-source work!

BowieHsu commented on August 16, 2024

@David-19940718 OK, could you paste the exact error message, and show what your txt file looks like?

David-19940718 commented on August 16, 2024

I looked into it: validation fails while merging the dicts: ValueError: Type mismatch (<class 'str'> vs. <class 'list'>) with values

These are the three forms officially supported by yolov5: Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]. I am using the third.

BowieHsu commented on August 16, 2024

@David-19940718 Could you report which line raises the error? I will fix it. If you have already solved the problem, feel free to open a PR.

David-19940718 commented on August 16, 2024

Hello, the error is raised at this line: "efficientteacher/configs/yacs.py", line 510

Line 220 of val.py calls cfg.merge_from_file(config).

BowieHsu commented on August 16, 2024

@David-19940718 Hello, we have located the problem. For now please use the txt loading method; we will add support for the list form.
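
Until the list form is supported, several image folders can be flattened into a single txt file. The sketch below does this in Python; `write_image_list`, the folder names and the output file name are hypothetical, and the demo runs on throwaway temporary folders.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def write_image_list(image_dirs: Iterable[str], out_txt: str,
                     suffixes=(".jpg", ".jpeg", ".png")) -> List[str]:
    """Collect image paths from several folders and write one absolute path per line."""
    paths: List[str] = []
    for d in image_dirs:
        for p in sorted(Path(d).rglob("*")):
            if p.is_file() and p.suffix.lower() in suffixes:
                paths.append(str(p.resolve()))
    Path(out_txt).write_text("\n".join(paths) + ("\n" if paths else ""), encoding="utf-8")
    return paths


# Demo on temporary folders so the sketch runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    for name in ("imgs1", "imgs2"):
        folder = Path(tmp, name)
        folder.mkdir()
        (folder / f"{name}.jpg").write_bytes(b"")
    out_file = Path(tmp, "train.txt")
    listed = write_image_list([str(Path(tmp, "imgs1")), str(Path(tmp, "imgs2"))], str(out_file))
    written = out_file.read_text(encoding="utf-8").splitlines()
print(len(listed), listed == written)
```

The resulting txt file can then be referenced from the dataset section of the config in place of the list.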

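David-19940718 commented on August 16, 2024

OK, thanks. I have just got it working with the txt loading method, but the accuracy seems to have dropped dramatically. Let me briefly summarise my steps:

  1. I downloaded the yolov5-7.0 code and trained a fully supervised model on an open-source dataset I picked at random, for 300 epochs, which gave the weights custom_best.pt;
  2. I converted these weights to efficient-yolov5s.pt with convert_pt_to_efficient.py;
  3. I evaluated with python val.py --config configs/sup/custom/yolov5s_custom.yaml --weights efficient-yolov5s.pt, which showed very high accuracy: three classes with a mean mAP of about 98%;
  4. I generated the txt file listing the unlabeled images (the ones to be pseudo-labeled) with find <unlabeld_data_path> -name "*.jpg" >> unlabel.txt;
  5. Following configs/ssod/custom/yolov5l_custom_ssod.yaml, I wrote my own yolov5s_custom_ssod.yaml config, with epochs set to 20 and burn_epochs left at the default of 10.

Now there are two problems. First, the accuracy is very low. Second, once burn_epochs is finished and semi-supervised training starts, it immediately runs out of GPU memory.
image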

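David-19940718 commented on August 16, 2024

I just noticed that enabling SSOD training doubles GPU memory usage, yet the number of samples does not seem to have changed. Is this because there are two models, teacher and student?
That does not seem quite right, though. I originally set batch to 64 and ran on two cards, both 2080Ti (11 GB of memory each). I then changed it to 32, and SSOD runs for a while, but memory blows up again before a full epoch completes. Is there some cache that needs to be cleared?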

David-19940718 commented on August 16, 2024

image

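BowieHsu commented on August 16, 2024

OK, I understand your situation now. First, please set batch to 16 in ssod.yaml, so that GPU memory should no longer blow up. Then put the efficient-yolov5s.pt you obtained from supervised training into the weights field of ssod.yaml, change lr0: 0.01 to 0.001, set both burn_epochs and warmup_epochs to 0, and start your training job again.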
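
The changes requested above amount to a handful of overrides on the SSOD config. As a rough illustration, the sketch below applies them to a plain nested dict. `apply_overrides` is a hypothetical helper, not part of the library; the nesting of the keys (for example `hyp.lr0` or `Dataset.batch_size`) and the starting values in `base` are assumptions about your own yaml.

```python
import copy
from typing import Any, Dict


def apply_overrides(cfg: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of cfg with dotted-key overrides applied, e.g. 'hyp.lr0'."""
    out = copy.deepcopy(cfg)
    for dotted, value in overrides.items():
        node = out
        *parents, leaf = dotted.split(".")
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return out


# Made-up starting point standing in for a loaded ssod.yaml
base = {
    "weights": "",
    "hyp": {"lr0": 0.01, "burn_epochs": 10, "warmup_epochs": 3},
    "Dataset": {"batch_size": 64},
}
tuned = apply_overrides(base, {
    "weights": "efficient-yolov5s.pt",
    "hyp.lr0": 0.001,
    "hyp.burn_epochs": 0,
    "hyp.warmup_epochs": 0,
    "Dataset.batch_size": 16,
})
print(tuned["hyp"], tuned["Dataset"])
```

Working on a copy keeps the original settings available for a side-by-side comparison of the two runs.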

BowieHsu commented on August 16, 2024

If all goes well, ss_bbox/ss_obj/ss_cls/gt_num will all show values, which means the network has started generating pseudo-labels on the unlabeled data and learning from them.

BowieHsu commented on August 16, 2024

@David-19940718 If you make progress or run into any new problem, feel free to keep contacting me. Your network will probably still need some hyperparameter fine-tuning here.

David-19940718 commented on August 16, 2024

OK, thank you very much for your patient answers. I will run it shortly and report back on any progress. Thanks again!

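David-19940718 commented on August 16, 2024

Hello, something very strange happened. Before leaving work yesterday I left the same configuration running in the background, and when I came in this morning it had actually finished training, with batch 32. Below is the full training result. Could you check whether this is a normal convergence process? Also, the final accuracy seems to have dropped a lot.
image

I will first train another version following the tips you gave and compare.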

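BowieHsu commented on August 16, 2024

@David-19940718 OK. The gt_num shown in the log is the number of pseudo-labels generated per image, and right now it looks as if not enough pseudo-labels are being produced on these images. Each training round validates two sets of metrics: the first row is the student's accuracy and the second row is the teacher's. With the approach I described last (a small learning rate plus loading the already trained model), your student should keep a fairly high accuracy and the teacher's accuracy should approach the student's. Finally, you can use the detect function to render the original s model and the semi-supervised s model on some images where false or missed detections are likely, and compare the results.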
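
Why the teacher tends to follow the student can be seen from a mean-teacher style update, in which the teacher weights are an exponential moving average (EMA) of the student weights. The sketch below assumes that scheme and uses plain Python lists in place of tensors; `ema_update` is a hypothetical name, and 0.999 is simply the `ema_rate` value that appears in a config posted in this thread.

```python
from typing import List


def ema_update(teacher: List[float], student: List[float], ema_rate: float = 0.999) -> List[float]:
    """One EMA step: teacher <- ema_rate * teacher + (1 - ema_rate) * student."""
    return [ema_rate * t + (1.0 - ema_rate) * s for t, s in zip(teacher, student)]


teacher = [0.0, 0.0]
student = [1.0, -2.0]          # pretend the student weights stay fixed
for _ in range(5000):
    teacher = ema_update(teacher, student)
print([round(w, 3) for w in teacher])
```

With a rate this close to 1 the teacher moves slowly, so it lags behind sudden changes in the student but converges to it when the student is stable.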

David-19940718 commented on August 16, 2024

Hello, after loading the pretrained weights and changing the parameters, training now looks normal, and the accuracy so far is acceptable. Is there anything to watch out for when tuning these parameters, or any experience you could share?

image

David-19940718 commented on August 16, 2024

Thank you very much for these suggestions. I will later take some real production data and run rigorous ablation experiments for comparison, and will report the results back here for discussion.

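BowieHsu commented on August 16, 2024

You're welcome. There are still many valuable problems to solve in applying semi-supervised training in production. We care a lot about feedback from engineers, and hope you will keep using it and sharing your opinions.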

nanfei666 commented on August 16, 2024

Hello author. In a long-tail scenario each unlabeled image may contain only one or two objects. Is it reasonable for pse_num to be about 0.09 and gt_num about 0.01 in that case, or do I need to adjust nms_iou_thres and ignore_thres_high? Thanks.

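BowieHsu commented on August 16, 2024

@nanfei666 Hello. Are you sure every image contains an object, or are there many images without any objects? This pse_num does not look ideal: it means that roughly one pseudo-label is produced per 10 images. You could try lowering thres_high.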
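
The estimate of roughly one pseudo-label per ten images is simply the reciprocal of pse_num, assuming pse_num is the average number of pseudo-labels per image, which is how the comment above reads it. `images_per_pseudo_label` is a hypothetical helper name.

```python
def images_per_pseudo_label(pse_num: float) -> float:
    """Average number of images needed to obtain one pseudo-label."""
    if pse_num <= 0:
        raise ValueError("pse_num must be positive")
    return 1.0 / pse_num


print(round(images_per_pseudo_label(0.09), 1))  # 11.1
```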

nanfei666 commented on August 16, 2024

There are a large number of images without objects. I am currently running semi-supervised training from weights trained with the original YOLOV5, with lr0: 0.001, nms_iou_thres: 0.4 and ignore_thres_high: 0.5, and the loss becomes NaN in the very first epoch.

BowieHsu commented on August 16, 2024

@nanfei666 Hello, could you provide a screenshot of the log so that we can analyse it?

nanfei666 commented on August 16, 2024

image
@BowieHsu Hello, this is a photo of the log.

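BowieHsu commented on August 16, 2024

@nanfei666 Hello, this looks like a single-class detection task. I suggest lowering lr0 further; the loss should then stop going NaN. If time permits, try getting the supervised task running and tested first, and then test the semi-supervised task.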

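nanfei666 commented on August 16, 2024

@BowieHsu OK, I will try that, thanks. About pse_num not looking ideal: can I take it that Efficient Teacher is better suited to cases where every image in a large unlabeled set contains objects, and that in a long-tail scenario, where most of the collected images contain no objects, Efficient Teacher may not work well?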

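BowieHsu commented on August 16, 2024

Every image containing an object is the ideal case, but feeding in long-tail data can still effectively reduce false detections. We are working on a more robust approach for the long-tail scenario you describe. You are welcome to report the status of your experiments at any time.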

nanfei666 commented on August 16, 2024

image
Hello, regarding the NaN loss: I have now lowered lr0 to 1e-5, but NaN still appears.
Regarding the supervised test: I tested the model obtained by converting the weights trained with the original YOLOv5, and its mAP is lower than the original. The mAP is 0.86 before conversion and 0.79 after.

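BowieHsu commented on August 16, 2024

@nanfei666 Hello, did you enable the autoanchor option when training with the original version? That is, the option that automatically clusters the anchors once before training. Or have you been using the default COCO anchors all along?
If it is the former, please try adding this line to your current yaml file, so that the anchors are recomputed for you when training starts.
Screenshot 2023-03-08 2.09.16 PM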

nanfei666 commented on August 16, 2024

@BowieHsu It was enabled, but AutoAnchor printed 1.000 Best Possible Recall (BPR). Current anchors are a good fit to dataset. The program did not cluster new anchors, so the default COCO anchors were used.
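
For readers wondering what the Best Possible Recall figure measures, the following pure-Python sketch mirrors the idea behind the YOLOv5 anchor check: a label counts as recallable if at least one anchor matches its width and height within a factor of `anchor_t` in both dimensions. It is a simplified re-implementation for illustration, not the library code, and the label sizes are made up. The anchors are the default COCO anchors and `anchor_t = 4.0` is the value that appears in a config posted in this thread.

```python
from typing import List, Tuple

WH = Tuple[float, float]

COCO_ANCHORS: List[WH] = [
    (10, 13), (16, 30), (33, 23),
    (30, 61), (62, 45), (59, 119),
    (116, 90), (156, 198), (373, 326),
]


def best_possible_recall(label_wh: List[WH], anchors: List[WH], anchor_t: float = 4.0) -> float:
    """Fraction of labels whose best anchor is within a factor anchor_t in width and height."""
    if not label_wh:
        raise ValueError("no labels given")
    hits = 0
    for w, h in label_wh:
        best = 0.0
        for aw, ah in anchors:
            rw, rh = w / aw, h / ah
            # worst-case ratio over the two dimensions, folded into (0, 1]
            fit = min(rw, 1 / rw, rh, 1 / rh)
            best = max(best, fit)
        if best > 1 / anchor_t:
            hits += 1
    return hits / len(label_wh)


labels = [(12, 15), (50, 40), (200, 180), (2, 2)]   # made-up pixel sizes at training resolution
bpr = best_possible_recall(labels, COCO_ANCHORS)
print(bpr)
```

In this toy example only the 2x2 label is too small for every anchor, so a BPR of 1.000 on a real dataset means no label falls outside the reach of the current anchors.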

BowieHsu commented on August 16, 2024

@nanfei666 Could you paste your training yaml file so that I can take a look?

nanfei666 commented on August 16, 2024

@BowieHsu Thank you, I will look into the accuracy problem further. As for the NaN loss during semi-supervised training, what might be causing it? lr0 has already been lowered to 1e-6 and it still occurs.
image
image

BowieHsu commented on August 16, 2024

@nanfei666 Hello, please change burn_epochs to 10 in this yaml file and restart the experiment, then check whether the NaN problem still occurs.

nanfei666 commented on August 16, 2024

@BowieHsu Hello, after changing burn_epochs to 10 the NaN problem still occurs.
image

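BowieHsu commented on August 16, 2024

@nanfei666 This looks like good progress. Now please change teacher_loss_weight to 1.0 and start the training script again; burn_epochs can be reduced a little. We can now tell that the NaN loss only starts once semi-supervised training begins. The purpose of the burn_epochs setting just now was to run supervised training first, so we can rule out a problem with the supervised loss itself; the cause should be that the semi-supervised loss is disturbing the network. You can also see that gt_num has gone up. In general it is still advisable to burn in for a few epochs first.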
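
One way to confirm which loss term goes bad first is to check each component for non-finite values before back-propagation. The sketch below is framework-free and uses hypothetical names: `first_non_finite` is made up, and the keys of the example dicts merely echo the log columns. In a real training loop the values would be tensors converted to floats first.

```python
import math
from typing import Dict, Optional


def first_non_finite(losses: Dict[str, float]) -> Optional[str]:
    """Return the name of the first loss component that is NaN or infinite, else None."""
    for name, value in losses.items():
        if not math.isfinite(value):
            return name
    return None


# Made-up values: the supervised terms are fine, one semi-supervised term is NaN
supervised_only = {"box": 0.023, "obj": 0.005, "cls": 0.001}
with_ssod = {"box": 0.023, "obj": 0.005, "cls": 0.001,
             "ss_box": 0.021, "ss_obj": float("nan"), "ss_cls": 0.0006}

print(first_non_finite(supervised_only))  # None
print(first_non_finite(with_ssod))        # ss_obj
```

Logging the offending name together with the epoch and iteration makes it clear whether the problem appears only after burn-in ends.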

nanfei666 commented on August 16, 2024

Thanks for the suggestion. After changing teacher_loss_weight to 1.0 and leaving all other parameters unchanged, I ran two experiments:

  1. burn_epochs 10, epochs: 80, lr0: 1e-6, loading weights that were trained under supervision with the original YOLOv5 and converted to the Efficient Teacher format with the script. The loss still became NaN. Lowering teacher_loss_weight further to 0.5 still gave NaN.
  2. burn_epochs 220, epochs 300, loading efficient-yolov5m.pt, lr0: 0.001. No NaN appeared, but the final mAP is lower than that of the original YOLOv5m trained on the same dataset.

nanfei666 commented on August 16, 2024

@BowieHsu Hello, what might be causing the NaN loss? I still have not been able to solve it.

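BowieHsu commented on August 16, 2024

@nanfei666 Hello, I tried to reproduce some corner cases on my side, and my overall impression is still that the cause is too few burn_epochs. We now have an approach that is friendlier to custom datasets: load a COCO-pretrained model when training and run semi-supervised training from it. Could you give it a try? Meanwhile I will upload some converted pretrained models.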

nanfei666 commented on August 16, 2024

image
@BowieHsu Hello, after 20 burn_epochs this is what my training mAP curve currently looks like. What should I change?

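jo-dean commented on August 16, 2024

@BowieHsu
Some comparisons on my own dataset: 1685 training images and 100 validation images.
This is the yolov5s result:
image
This is the result of yolov5s converted to et-yolov5s:
yolov5
This is the ssod-v5s result, with 1508 unlabeled images, 180 labeled images and 100 validation images:

image
These are the ssod-v5s configuration parameters: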
`# EfficientTeacher by Alibaba Cloud

project: 'yolov5_ssod'
adam: False
epochs: 20
weights: 'scripts/mula_convertor/weights/efficient-yolov5s.pt'
prune_finetune: False
linear_lr: True
find_unused_parameters: True

hyp:
  lr0: 0.0005
  hsv_h: 0.015
  hsv_s: 0.7
  hsv_v: 0.4
  lrf: 1.0
  scale: 0.9
  burn_epochs: 0
  no_aug_epochs: 0
  mixup: 0.1
  warmup_epochs: 0

Model:
  depth_multiple: 0.33 # model depth multiple
  width_multiple: 0.50 # layer channel multiple
  Backbone:
    name: 'YoloV5'
    activation: 'SiLU'
  Neck:
    name: 'YoloV5'
    in_channels: [256, 512, 1024]
    out_channels: [256, 512, 1024]
    activation: 'SiLU'
  Head:
    name: 'YoloV5'
    activation: 'SiLU'
  anchors: [[10,13, 16,30, 33,23],[30,61, 62,45, 59,119],[116,90, 156,198, 373,326]] # P5/32]
Loss:
  type: 'ComputeLoss'
  cls: 0.3
  obj: 0.7
  anchor_t: 4.0

Dataset:
  data_name: 'coco'
  train: ssod/coco/custom_train.txt # 118287 images
  val: ssod/coco/custom_val.txt # 5000 images
  test: ssod/coco/custom_val.txt # 20288 of 40670 images, submit to https://competitions.codalab.org/competitions/20794^
  target: ssod/coco/custom_unlabeled.txt
  nc: 5 # number of classes
  np: 0 #number of keypoints
  names: ['0','1','2','3','4']
  img_size: 640
  batch_size: 16

SSOD:
  train_domain: True
  nms_conf_thres: 0.1
  nms_iou_thres: 0.5
  teacher_loss_weight: 1.0
  cls_loss_weight: 0.3
  box_loss_weight: 0.05
  obj_loss_weight: 0.7
  loss_type: 'ComputeStudentMatchLoss'
  ignore_thres_low: 0.1
  ignore_thres_high: 0.75
  uncertain_aug: True
  use_ota: False
  multi_label: False
  ignore_obj: False
  pseudo_label_with_obj: True
  pseudo_label_with_bbox: True
  pseudo_label_with_cls: False
  with_da_loss: False
  da_loss_weights: 0.01
  epoch_adaptor: True
  resample_high_percent: 0.25
  resample_low_percent: 0.99
  ema_rate: 0.999
  cosine_ema: True
  imitate_teacher: False
  dynamic_thres: True
  ssod_hyp:
    with_gt: False
    mosaic: 1.0
    cutout: 0.5
    autoaugment: 0.5
    scale: 0.8
    degrees: 0.0
    shear: 0.0
  debug: True`

This is some of the output printed during training:
` 0/19 7.5G 9 640 0.0232 0.00544 0.001086 0.4427 0.02298 0.009129 0.001268 0.5113 0 1 8.972 4.395: 100%|█| 95/95 [0
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:01<00:00, 2.11it/s]
all 100 254 0.896 0.775 0.843 0.542
Class Images Labels P R mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00, 7.94it/s]
all 100 254 0.911 0.785 0.858 0.556

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  1/19      7.5G        16       640   0.02302  0.005081 0.0008262    0.4301   0.02073  0.008711 0.0005686    0.4358         0         1     8.657     3.663: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  4.32it/s]
             all        100        254      0.877      0.778      0.848      0.536
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  8.75it/s]
             all        100        254      0.908      0.788      0.858      0.555

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  2/19      7.5G         7       640   0.02269  0.004879 0.0006551    0.4208    0.0204   0.00798 0.0006983    0.4976         0         1     7.893     3.853: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.88it/s]
             all        100        254      0.868      0.776      0.843      0.533
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.73it/s]
             all        100        254      0.875      0.814      0.858      0.555

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  3/19      7.5G        14       640   0.02204   0.00524 0.0007224    0.4196   0.01982  0.007821 0.0006461    0.5137         0         1     7.793     3.955: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.18it/s]
             all        100        254       0.93      0.733      0.848       0.54
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.14it/s]
             all        100        254      0.911      0.786      0.859      0.554

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  4/19      7.5G        18       640   0.02198  0.005007 0.0007243    0.4149    0.0198  0.007912 0.0007847    0.5282         0         1     7.937     4.082: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.26it/s]
             all        100        254      0.886      0.758      0.844      0.532
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  6.93it/s]
             all        100        254       0.91      0.787      0.859      0.554

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  5/19      7.5G        14       640    0.0208   0.00477 0.0006679    0.3951   0.01929  0.007809  0.001213    0.5319         0         1     7.952     4.138: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.37it/s]
             all        100        254      0.853      0.776      0.844      0.518
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.14it/s]
             all        100        254       0.91      0.788      0.859      0.553

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  6/19      7.5G        14       640   0.02063  0.004849 0.0006552    0.3919   0.01955  0.007797   0.00124    0.5384         0         1      8.03     4.234: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.27it/s]
             all        100        254      0.934      0.716      0.836      0.521
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.72it/s]
             all        100        254       0.91      0.791      0.858      0.555

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  7/19      7.5G        28       640   0.02078  0.004696 0.0005436    0.3883   0.01906  0.007754 0.0006981     0.551         0         1     7.943     4.327: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.27it/s]
             all        100        254      0.842      0.779      0.839      0.539
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.87it/s]
             all        100        254       0.91      0.791      0.859      0.554

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  8/19      7.5G        20       640   0.02074  0.004727 0.0006025     0.392   0.01934  0.007588 0.0006731    0.5684         0         1     7.569     4.251: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  8.81it/s]
             all        100        254      0.842      0.776      0.837      0.525
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.27it/s]
             all        100        254      0.906      0.794      0.859      0.553

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  9/19      7.5G        15       640   0.02023  0.004765 0.0005475    0.3854   0.01936  0.007333 0.0006324    0.5599         0         1     7.568     4.152: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.03it/s]
             all        100        254      0.916      0.733      0.843      0.527
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.73it/s]
             all        100        254      0.903      0.792      0.858      0.553

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 10/19      7.5G        11       640   0.02051  0.004779 0.0006087    0.3892   0.01915  0.007693 0.0005753    0.5615         0         1      7.72     4.255: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.66it/s]
             all        100        254      0.899      0.732      0.842      0.523
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.36it/s]
             all        100        254      0.904      0.792      0.858      0.553

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 11/19      7.5G        15       640   0.01977  0.004534 0.0005935    0.3732   0.01893  0.007512 0.0006511     0.559         0         1     7.674     4.203: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  6.91it/s]
             all        100        254      0.907      0.733      0.839      0.526
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.61it/s]
             all        100        254      0.859       0.84       0.86      0.555

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 12/19      7.5G         5       640   0.02034  0.004635 0.0005117    0.3824   0.01976  0.007463  0.000523    0.5811         0         1     7.516     4.315: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.06it/s]
             all        100        254      0.857      0.745      0.834      0.526
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  6.79it/s]
             all        100        254      0.849      0.851      0.861      0.555

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 13/19      7.5G        15       640   0.01954  0.004348 0.0005915      0.37   0.01893  0.007576 0.0006197    0.5676         0         1     7.774     4.343: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.08it/s]
             all        100        254      0.874      0.753      0.835      0.523
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  6.77it/s]
             all        100        254      0.863      0.837       0.86      0.554

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 14/19      7.5G         6       640   0.01983  0.004386 0.0004745    0.3652    0.0186  0.007267 0.0008434      0.58         0         1     7.472     4.244: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  5.41it/s]
             all        100        254      0.843      0.764      0.834      0.529
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.38it/s]
             all        100        254      0.862      0.837       0.86      0.556

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 15/19      7.5G        15       640   0.01939  0.004635 0.0004191    0.3635   0.01896  0.006846 0.0004682    0.5913         0         1     7.188     4.173: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  5.98it/s]
             all        100        254      0.849      0.776      0.838      0.523
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.22it/s]
             all        100        254      0.882      0.817       0.86      0.557

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 16/19      7.5G        13       640   0.01956  0.004517 0.0004645    0.3698   0.01833  0.007239 0.0007417    0.5919         0         1      7.38     4.293: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  8.00it/s]
             all        100        254      0.866      0.749      0.839      0.524
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.59it/s]
             all        100        254      0.861      0.837       0.86      0.557

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 17/19      7.5G        16       640   0.01972  0.004691 0.0005716    0.3723   0.01849  0.007343 0.0008375    0.5918         0         1     7.309     4.265: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  8.68it/s]
             all        100        254       0.91      0.735       0.84      0.524
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  8.70it/s]
             all        100        254      0.862      0.839      0.862      0.557

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 18/19      7.5G        12       640   0.01914  0.004482 0.0005326    0.3683   0.01861  0.007116 0.0004974    0.5943         0         1     7.422     4.404: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.67it/s]
             all        100        254      0.852      0.739      0.827      0.507
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.42it/s]
             all        100        254      0.862      0.839      0.861      0.557

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 19/19      7.5G         8       640    0.0187  0.004511 0.0004423    0.3516   0.01859  0.007294 0.0005941    0.5881         0         1     7.526     4.367: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.70it/s]
             all        100        254      0.805      0.788      0.832      0.517
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  6.81it/s]
             all        100        254      0.862      0.839      0.861      0.557

Optimizer stripped from yolov5_ssod/exp8/weights/last.pt, 15.2MB
Optimizer stripped from yolov5_ssod/exp8/weights/best.pt, 15.2MB

Validating yolov5_ssod/exp8/weights/best.pt...
`
Judging from the validation results, this is still slightly worse than yolov5s. Could you help check which parameters need fine-tuning? Thanks.
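
To compare student and teacher over a run without reading every line, the summary rows can be pulled out of the log. The sketch below assumes, as stated earlier in the thread, that within each epoch the first `all` row is the student and the second is the teacher. `parse_map_rows` is a hypothetical helper, and the sample rows are copied from the first two epochs of the log above.

```python
from typing import Dict, List


def parse_map_rows(log_text: str) -> List[Dict[str, float]]:
    """Pair consecutive 'all' rows as (student, teacher) and keep mAP@.5 and mAP@.5:.95."""
    rows = []
    for line in log_text.splitlines():
        parts = line.split()
        # Summary rows look like: all <images> <labels> <P> <R> <mAP@.5> <mAP@.5:.95>
        if len(parts) == 7 and parts[0] == "all":
            rows.append((float(parts[5]), float(parts[6])))
    epochs = []
    for i in range(0, len(rows) - 1, 2):
        (s50, s5095), (t50, t5095) = rows[i], rows[i + 1]
        epochs.append({"student_map50": s50, "student_map": s5095,
                       "teacher_map50": t50, "teacher_map": t5095})
    return epochs


sample = """
             all        100        254      0.896      0.775      0.843      0.542
             all        100        254      0.911      0.785      0.858      0.556
             all        100        254      0.877      0.778      0.848      0.536
             all        100        254      0.908      0.788      0.858      0.555
"""
parsed = parse_map_rows(sample)
for epoch, m in enumerate(parsed):
    print(epoch, m["student_map50"], m["teacher_map50"])
```

Plotting the two series against the epoch number makes it easier to see whether the student drifts while the teacher stays flat.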

delicate00 avatar delicate00 commented on August 16, 2024

@BowieHsu 使用自己的数据集的一些对比,训练集1685张,验证集100张 这是yolov5s的结果 image 这是yolov5s转et-yolov5s的结果 yolov5 这是ssod-v5s的结果,无标签数据设为1508张,标签数据设为180张,验证为100张

image This is the ssod-v5s configuration: `# EfficientTeacher by Alibaba Cloud

project: 'yolov5_ssod'
adam: False
epochs: 20
weights: 'scripts/mula_convertor/weights/efficient-yolov5s.pt'
prune_finetune: False
linear_lr: True
find_unused_parameters: True

hyp:
lr0: 0.0005
hsv_h: 0.015
hsv_s: 0.7
hsv_v: 0.4
lrf: 1.0
scale: 0.9
burn_epochs: 0
no_aug_epochs: 0

mixup: 0.1

warmup_epochs: 0

Model:
depth_multiple: 0.33 # model depth multiple
width_multiple: 0.50 # layer channel multiple
Backbone:
name: 'YoloV5'
activation: 'SiLU'
Neck:
name: 'YoloV5'
in_channels: [256, 512, 1024]
out_channels: [256, 512, 1024]
activation: 'SiLU'
Head:
name: 'YoloV5'
activation: 'SiLU'
anchors: [[10,13, 16,30, 33,23],[30,61, 62,45, 59,119],[116,90, 156,198, 373,326]] # P5/32]
Loss:
type: 'ComputeLoss'
cls: 0.3
obj: 0.7
anchor_t: 4.0

Dataset:
data_name: 'coco'
train: ssod/coco/custom_train.txt # 118287 images
val: ssod/coco/custom_val.txt # 5000 images
test: ssod/coco/custom_val.txt # 20288 of 40670 images, submit to https://competitions.codalab.org/competitions/20794^
target: ssod/coco/custom_unlabeled.txt
nc: 5 # number of classes
np: 0 #number of keypoints
names: ['0','1','2','3','4']
img_size: 640
batch_size: 16

SSOD:
train_domain: True
nms_conf_thres: 0.1
nms_iou_thres: 0.5
teacher_loss_weight: 1.0
cls_loss_weight: 0.3
box_loss_weight: 0.05
obj_loss_weight: 0.7
loss_type: 'ComputeStudentMatchLoss'
ignore_thres_low: 0.1
ignore_thres_high: 0.75
uncertain_aug: True
use_ota: False
multi_label: False
ignore_obj: False
pseudo_label_with_obj: True
pseudo_label_with_bbox: True
pseudo_label_with_cls: False
with_da_loss: False
da_loss_weights: 0.01
epoch_adaptor: True
resample_high_percent: 0.25
resample_low_percent: 0.99
ema_rate: 0.999
cosine_ema: True
imitate_teacher: False

dynamic_thres: True

ssod_hyp:
with_gt: False
mosaic: 1.0
cutout: 0.5
autoaugment: 0.5
scale: 0.8
degrees: 0.0
shear: 0.0
debug: True`

These are some of the values displayed during training: `
  0/19      7.5G         9       640    0.0232   0.00544  0.001086    0.4427   0.02298  0.009129  0.001268    0.5113         0         1     8.972     4.395: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:01<00:00,  2.11it/s]
             all        100        254      0.896      0.775      0.843      0.542
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.94it/s]
             all        100        254      0.911      0.785      0.858      0.556

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  1/19      7.5G        16       640   0.02302  0.005081 0.0008262    0.4301   0.02073  0.008711 0.0005686    0.4358         0         1     8.657     3.663: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  4.32it/s]
             all        100        254      0.877      0.778      0.848      0.536
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  8.75it/s]
             all        100        254      0.908      0.788      0.858      0.555

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  2/19      7.5G         7       640   0.02269  0.004879 0.0006551    0.4208    0.0204   0.00798 0.0006983    0.4976         0         1     7.893     3.853: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.88it/s]
             all        100        254      0.868      0.776      0.843      0.533
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.73it/s]
             all        100        254      0.875      0.814      0.858      0.555

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  3/19      7.5G        14       640   0.02204   0.00524 0.0007224    0.4196   0.01982  0.007821 0.0006461    0.5137         0         1     7.793     3.955: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.18it/s]
             all        100        254       0.93      0.733      0.848       0.54
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.14it/s]
             all        100        254      0.911      0.786      0.859      0.554

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  4/19      7.5G        18       640   0.02198  0.005007 0.0007243    0.4149    0.0198  0.007912 0.0007847    0.5282         0         1     7.937     4.082: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.26it/s]
             all        100        254      0.886      0.758      0.844      0.532
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  6.93it/s]
             all        100        254       0.91      0.787      0.859      0.554

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  5/19      7.5G        14       640    0.0208   0.00477 0.0006679    0.3951   0.01929  0.007809  0.001213    0.5319         0         1     7.952     4.138: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.37it/s]
             all        100        254      0.853      0.776      0.844      0.518
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.14it/s]
             all        100        254       0.91      0.788      0.859      0.553

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  6/19      7.5G        14       640   0.02063  0.004849 0.0006552    0.3919   0.01955  0.007797   0.00124    0.5384         0         1      8.03     4.234: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.27it/s]
             all        100        254      0.934      0.716      0.836      0.521
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.72it/s]
             all        100        254       0.91      0.791      0.858      0.555

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  7/19      7.5G        28       640   0.02078  0.004696 0.0005436    0.3883   0.01906  0.007754 0.0006981     0.551         0         1     7.943     4.327: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.27it/s]
             all        100        254      0.842      0.779      0.839      0.539
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.87it/s]
             all        100        254       0.91      0.791      0.859      0.554

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  8/19      7.5G        20       640   0.02074  0.004727 0.0006025     0.392   0.01934  0.007588 0.0006731    0.5684         0         1     7.569     4.251: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  8.81it/s]
             all        100        254      0.842      0.776      0.837      0.525
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.27it/s]
             all        100        254      0.906      0.794      0.859      0.553

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
  9/19      7.5G        15       640   0.02023  0.004765 0.0005475    0.3854   0.01936  0.007333 0.0006324    0.5599         0         1     7.568     4.152: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.03it/s]
             all        100        254      0.916      0.733      0.843      0.527
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.73it/s]
             all        100        254      0.903      0.792      0.858      0.553

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 10/19      7.5G        11       640   0.02051  0.004779 0.0006087    0.3892   0.01915  0.007693 0.0005753    0.5615         0         1      7.72     4.255: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.66it/s]
             all        100        254      0.899      0.732      0.842      0.523
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.36it/s]
             all        100        254      0.904      0.792      0.858      0.553

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 11/19      7.5G        15       640   0.01977  0.004534 0.0005935    0.3732   0.01893  0.007512 0.0006511     0.559         0         1     7.674     4.203: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  6.91it/s]
             all        100        254      0.907      0.733      0.839      0.526
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.61it/s]
             all        100        254      0.859       0.84       0.86      0.555

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 12/19      7.5G         5       640   0.02034  0.004635 0.0005117    0.3824   0.01976  0.007463  0.000523    0.5811         0         1     7.516     4.315: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.06it/s]
             all        100        254      0.857      0.745      0.834      0.526
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  6.79it/s]
             all        100        254      0.849      0.851      0.861      0.555

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 13/19      7.5G        15       640   0.01954  0.004348 0.0005915      0.37   0.01893  0.007576 0.0006197    0.5676         0         1     7.774     4.343: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.08it/s]
             all        100        254      0.874      0.753      0.835      0.523
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  6.77it/s]
             all        100        254      0.863      0.837       0.86      0.554

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 14/19      7.5G         6       640   0.01983  0.004386 0.0004745    0.3652    0.0186  0.007267 0.0008434      0.58         0         1     7.472     4.244: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  5.41it/s]
             all        100        254      0.843      0.764      0.834      0.529
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.38it/s]
             all        100        254      0.862      0.837       0.86      0.556

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 15/19      7.5G        15       640   0.01939  0.004635 0.0004191    0.3635   0.01896  0.006846 0.0004682    0.5913         0         1     7.188     4.173: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  5.98it/s]
             all        100        254      0.849      0.776      0.838      0.523
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.22it/s]
             all        100        254      0.882      0.817       0.86      0.557

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 16/19      7.5G        13       640   0.01956  0.004517 0.0004645    0.3698   0.01833  0.007239 0.0007417    0.5919         0         1      7.38     4.293: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  8.00it/s]
             all        100        254      0.866      0.749      0.839      0.524
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.59it/s]
             all        100        254      0.861      0.837       0.86      0.557

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 17/19      7.5G        16       640   0.01972  0.004691 0.0005716    0.3723   0.01849  0.007343 0.0008375    0.5918         0         1     7.309     4.265: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  8.68it/s]
             all        100        254       0.91      0.735       0.84      0.524
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  8.70it/s]
             all        100        254      0.862      0.839      0.862      0.557

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 18/19      7.5G        12       640   0.01914  0.004482 0.0005326    0.3683   0.01861  0.007116 0.0004974    0.5943         0         1     7.422     4.404: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.67it/s]
             all        100        254      0.852      0.739      0.827      0.507
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.42it/s]
             all        100        254      0.862      0.839      0.861      0.557

 Epoch   gpu_mem    labels  img_size       box       obj       cls      loss    ss_box    ss_obj    ss_cls        tp    fp_cls    fp_loc   pse_num    gt_num
 19/19      7.5G         8       640    0.0187  0.004511 0.0004423    0.3516   0.01859  0.007294 0.0005941    0.5881         0         1     7.526     4.367: 100%|█| 95/95 [0
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  7.70it/s]
             all        100        254      0.805      0.788      0.832      0.517
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00,  6.81it/s]
             all        100        254      0.862      0.839      0.861      0.557

Optimizer stripped from yolov5_ssod/exp8/weights/last.pt, 15.2MB
Optimizer stripped from yolov5_ssod/exp8/weights/best.pt, 15.2MB

Validating yolov5_ssod/exp8/weights/best.pt... ` Judging from the validation results, it is still slightly behind yolov5s. Could you help me see which parameters need fine-tuning? Thanks.
image
image
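Hello, this is a comparison of my results after converting from the standard model to the ET format. The accuracy gap is far too large. What might cause this?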
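
To compare the two validation rows printed after each epoch in the log above, the `all` summary lines can be parsed with a short script. This is a sketch and not part of EfficientTeacher: the function name `parse_map_pairs` is hypothetical, and it assumes every epoch prints exactly two `all` rows in a fixed order. The log does not say which row belongs to the student and which to the teacher, so they are simply called first and second here.

```python
import re
from typing import List, Tuple

# Matches summary rows such as: "all  100  254  0.896  0.775  0.843  0.542"
ALL_ROW = re.compile(
    r"^\s*all\s+(\d+)\s+(\d+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*$"
)


def parse_map_pairs(log_text: str) -> List[Tuple[float, float]]:
    """Return (first_row, second_row) mAP@.5:.95 values for each epoch."""
    values = []
    for line in log_text.splitlines():
        match = ALL_ROW.match(line)
        if match:
            # Group 6 is the last column of the row, mAP@.5:.95.
            values.append(float(match.group(6)))
    # Pair consecutive rows: two validation passes per epoch (assumed).
    return list(zip(values[0::2], values[1::2]))


# Rows copied from epochs 1 and 2 of the log above.
sample = """
             all        100        254      0.877      0.778      0.848      0.536
             all        100        254      0.908      0.788      0.858      0.555
             all        100        254      0.868      0.776      0.843      0.533
             all        100        254      0.875      0.814      0.858      0.555
"""
pairs = parse_map_pairs(sample)
print(pairs)  # [(0.536, 0.555), (0.533, 0.555)]
```

Printing the pairs for all epochs makes it easy to see whether the two rows drift apart as training continues.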

from efficientteacher.

jo-dean avatar jo-dean commented on August 16, 2024

@delicate00 Try pulling the latest code, or convert the standard yolov5s .pt file and check the conversion result.

from efficientteacher.

delicate00 avatar delicate00 commented on August 16, 2024

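@delicate00 Try pulling the latest code, or convert the standard yolov5s .pt file and check the conversion result.

OK, I have solved that. However, when validating the semi-supervised accuracy I got "AttributeError: 'tuple' object has no attribute 'shape'". Have you run into a similar problem?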
image

from efficientteacher.

BowieHsu avatar BowieHsu commented on August 16, 2024

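@delicate00 Hello, this error occurs because val.py was run without the --val-ssod argument. The argument is needed because the SSOD version of the detector has an additional feature_map output, so the original output is wrapped into a tuple; we use that argument to handle this case.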
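
A minimal sketch of the situation described above, assuming the SSOD detector returns a tuple whose first element is the usual prediction output and whose remaining elements are extras such as the feature map. The helper name `unwrap_predictions` and the tuple layout are illustrative assumptions, not code from the repository; plain nested lists stand in for tensors so the snippet runs without PyTorch.

```python
from typing import Any


def unwrap_predictions(output: Any) -> Any:
    """Return the prediction part of a model output.

    Hypothetical helper: assumes that when the output is a tuple, the
    predictions come first and extra outputs (e.g. feature maps) follow.
    """
    if isinstance(output, tuple):
        return output[0]
    return output


# Plain nested lists stand in for tensors here.
plain_output = [[0.1, 0.2, 0.3, 0.4, 0.9, 0.0]]
ssod_output = (plain_output, "feature_map placeholder")

# Both call styles yield the same object, so code that later reads
# `.shape` would receive the predictions rather than the tuple.
print(unwrap_predictions(plain_output) is plain_output)
print(unwrap_predictions(ssod_output) is plain_output)
```

In practice, passing --val-ssod as described above is the supported route; the sketch only illustrates why code that expects a tensor fails when it receives a tuple.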

from efficientteacher.

jo-dean avatar jo-dean commented on August 16, 2024

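@BowieHsu Here is another result, this time from a model trained without pretrained weights. Do you have any suggestions?
2023-04-03_09-11

Parameter configuration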

`# EfficientTeacher by Alibaba Cloud

project: 'yolov5_ssod'
adam: False
epochs: 300
weights: ''
prune_finetune: False
linear_lr: True
find_unused_parameters: True

hyp:
lr0: 0.001
hsv_h: 0.015
hsv_s: 0.7
hsv_v: 0.4
lrf: 1.0
scale: 0.9
burn_epochs: 220
no_aug_epochs: 0

mixup: 0.1

warmup_epochs: 3

Model:
depth_multiple: 0.33 # model depth multiple
width_multiple: 0.50 # layer channel multiple
Backbone:
name: 'YoloV5'
activation: 'SiLU'
Neck:
name: 'YoloV5'
in_channels: [256, 512, 1024]
out_channels: [256, 512, 1024]
activation: 'SiLU'
Head:
name: 'YoloV5'
activation: 'SiLU'
anchors: [[10,13, 16,30, 33,23],[30,61, 62,45, 59,119],[116,90, 156,198, 373,326]] # P5/32]
Loss:
type: 'ComputeLoss'
cls: 0.3
obj: 0.7
anchor_t: 4.0

Dataset:
data_name: 'coco'
train: ssod/coco/custom_train.txt # 118287 images
val: ssod/coco/custom_val.txt # 5000 images
test: ssod/coco/custom_val.txt # 20288 of 40670 images, submit to https://competitions.codalab.org/competitions/20794^
target: ssod/coco/custom_unlabeled.txt
nc: 5 # number of classes
np: 0 #number of keypoints
names: ['0','1','2','3','4']
img_size: 640
batch_size: 16

SSOD:
train_domain: True
nms_conf_thres: 0.1
nms_iou_thres: 0.5
teacher_loss_weight: 1.0
cls_loss_weight: 0.3
box_loss_weight: 0.05
obj_loss_weight: 0.7
loss_type: 'ComputeStudentMatchLoss'
ignore_thres_low: 0.1
ignore_thres_high: 0.75
uncertain_aug: True
use_ota: False
multi_label: False
ignore_obj: False
pseudo_label_with_obj: True
pseudo_label_with_bbox: True
pseudo_label_with_cls: False
with_da_loss: False
da_loss_weights: 0.01
epoch_adaptor: True
resample_high_percent: 0.25
resample_low_percent: 0.99
ema_rate: 0.999
cosine_ema: True
imitate_teacher: False

dynamic_thres: True

ssod_hyp:
with_gt: False
mosaic: 1.0
cutout: 0.5
autoaugment: 0.5
scale: 0.8
degrees: 0.0
shear: 0.0
debug: True`

from efficientteacher.

delicate00 avatar delicate00 commented on August 16, 2024

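@delicate00 Hello, this error occurs because val.py was run without the --val-ssod argument. The argument is needed because the SSOD version of the detector has an additional feature_map output, so the original output is wrapped into a tuple; we use that argument to handle this case.

OK, it is solved now. Thanks!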

from efficientteacher.

wangsun1996 avatar wangsun1996 commented on August 16, 2024

Hello, after enabling debug, the saved pseudo-label images are just the original images with no predicted boxes drawn on them. Is this normal?

from efficientteacher.

jaideep11061982 avatar jaideep11061982 commented on August 16, 2024

@BowieHsu Why are my ss_box losses all null?
Also, after training starts, the mAP for the unsupervised dataset is extremely poor.

Please help.

image

from efficientteacher.

jaideep11061982 avatar jaideep11061982 commented on August 16, 2024

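ss_bbox/ss_obj/ss_cls/gt_num will all show values, which means the network has started generating pseudo labels on the unlabeled data and learning from them.

Hi @BowieHsu, why is it null in my case?
Also, what size should the unlabeled target images be? When the input size is set to 640, at what size does it run prediction on the unlabeled images?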

from efficientteacher.

aaaeric026 avatar aaaeric026 commented on August 16, 2024

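Hello, I am working on a pedestrian detection dataset. Is there any guidance on how to set the two thresholds?
The paper says that Epoch Adaptor updates the two thresholds automatically, yet in the code we have to set them ourselves? I do not quite understand this part and would like to ask the author.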
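
One way to pick starting values for the two thresholds, following the earlier suggestion in this thread to base them on score statistics of the custom dataset, is to look at percentiles of the confidence scores that the supervised model predicts. This is only a sketch: the function names, the percentile choices, and the example scores are all assumptions, and it is not how Epoch Adaptor itself computes thresholds.

```python
from typing import List, Tuple


def percentile(sorted_scores: List[float], q: float) -> float:
    """Linearly interpolated percentile (q in [0, 100]) of pre-sorted scores."""
    if not sorted_scores:
        raise ValueError("no scores given")
    pos = (len(sorted_scores) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_scores) - 1)
    frac = pos - lower
    return sorted_scores[lower] * (1 - frac) + sorted_scores[upper] * frac


def suggest_thresholds(
    scores: List[float], low_q: float = 10.0, high_q: float = 75.0
) -> Tuple[float, float]:
    """Suggest candidate (ignore_thres_low, ignore_thres_high) values.

    low_q and high_q are arbitrary illustrative percentiles, not values
    recommended by the EfficientTeacher authors.
    """
    ordered = sorted(scores)
    return percentile(ordered, low_q), percentile(ordered, high_q)


# Made-up confidence scores for illustration only.
scores = [0.80, 0.85, 0.90, 0.95, 0.99]
low, high = suggest_thresholds(scores)
print(round(low, 3), round(high, 3))
```

The resulting numbers are only candidates for ignore_thres_low and ignore_thres_high; they would still need to be checked against the rendered pseudo labels.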

from efficientteacher.

2363776628 avatar 2363776628 commented on August 16, 2024

After training, evaluation on the test set shows a very large accuracy gap between classes. What could be the reason?
1

from efficientteacher.

GJMY592 avatar GJMY592 commented on August 16, 2024

2024011514553618813
ni
tensorboard1
This is my yaml file:

EfficientTeacher by Alibaba Cloud

project: 'yolov5_ssod'
name: 'v5n_ratio_0.3s'
adam: False
epochs: 300
weights: ''
prune_finetune: False
linear_lr: True
find_unused_parameters: True

hyp:
lr0: 0.01
hsv_h: 0.015
hsv_s: 0.7
hsv_v: 0.4
lrf: 0.01
scale: 0.5
perspective: 0.0002
translate: 0.3
burn_epochs: 30
no_aug_epochs: 10

mixup: 0.1

warmup_epochs: 3

Model:
depth_multiple: 0.33 # model depth multiple
width_multiple: 0.25 # layer channel multiple
Backbone:
name: 'YoloV5'
activation: 'SiLU'
Neck:
name: 'YoloV5'
in_channels: [256, 512, 1024]
out_channels: [256, 512, 1024]
activation: 'SiLU'
Head:
name: 'YoloV5'
activation: 'SiLU'
anchors: [[10,13, 16,30, 33,23],[30,61, 62,45, 59,119],[116,90, 156,198, 373,326]] # P5/32]
Loss:
type: 'ComputeLoss'
cls: 0.5
obj: 0.7
anchor_t: 4.0
label_smoothing: 0.1

Dataset:
data_name: 'my_dataset'
train: data_train.txt
val: data_val.txt
test: data_test.txt
target: data_unlabeled.txt
nc: 5 # number of classes
np: 0 #number of keypoints
names: ['Background','CAAC', 'BABA', 'BBBB', 'BBAC']
img_size: 640
batch_size: 8

SSOD:
train_domain: True
nms_conf_thres: 0.1
nms_iou_thres: 0.65
teacher_loss_weight: 1.0
cls_loss_weight: 0.3
box_loss_weight: 0.05
obj_loss_weight: 0.7
loss_type: 'ComputeStudentMatchLoss'
ignore_thres_low: 0.1
ignore_thres_high: 0.6
uncertain_aug: True
use_ota: False
multi_label: False
ignore_obj: False
pseudo_label_with_obj: True
pseudo_label_with_bbox: True
pseudo_label_with_cls: False
with_da_loss: False
da_loss_weights: 0.01
epoch_adaptor: True
resample_high_percent: 0.25
resample_low_percent: 0.99
ema_rate: 0.999
cosine_ema: True
imitate_teacher: False

dynamic_thres: True

ssod_hyp:
with_gt: False
mosaic: 1.0
cutout: 0.5
autoaugment: 0.5
scale: 0.5
degrees: 15.0
perspective: 0.0002
translate: 0.3
shear: 0.0
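Hello, I ran into a problem in a semi-supervised experiment on my own dataset. I trained directly in semi-supervised mode with burn-in=30 and 300 epochs in total. After burn-in ended, the accuracy stopped changing, and when tested on another dataset it even dropped. Does anyone know what causes this?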

from efficientteacher.

Tan-Shihong avatar Tan-Shihong commented on August 16, 2024

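@David-19940718 Hello, we have located this problem. For now please use the txt loading method; we will add support for the list loading method.

OK, thanks. I have just got it working with the txt loading method. However, the accuracy seems to have "dropped dramatically". Let me briefly summarize my steps first:

  1. I downloaded the yolov5-7.0 code, picked an arbitrary open-source dataset, and ran fully supervised training for 300 epochs to obtain the weights custom_best.pt;
  2. I converted these weights to efficient-yolov5s.pt with convert_pt_to_efficient.py;
  3. I evaluated with python val.py --config configs/sup/custom/yolov5s_custom.yaml --weights efficient-yolov5s.pt; the accuracy is very high, with three classes in total and a mean mAP of about 98%;
  4. I ran find <unlabeld_data_path> -name "*.jpg" >> unlabel.txt to generate the corresponding txt file for the pseudo-label data;
  5. Following configs/ssod/custom/yolov5l_custom_ssod.yaml, I wrote my own yolov5s_custom_ssod.yaml config, with epochs set to 20 and the default burn_epochs of 10;

There are now two problems: first, the accuracy is very low; second, once burn_epochs finishes and semi-supervised training begins, it immediately reports a GPU out-of-memory error. image

Hello. I have also run into very low accuracy. Using the official yolov5 code, I have confirmed that the dataset is not the problem. How did you resolve the low accuracy at the time?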
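
The image list from step 4 can also be produced from Python. A minimal sketch, assuming the unlabeled images are .jpg files somewhere under one directory; the function name `write_image_list` and the example paths are placeholders, not part of the repository.

```python
from pathlib import Path
from typing import List


def write_image_list(image_dir: str, list_file: str, pattern: str = "*.jpg") -> List[str]:
    """Recursively collect image paths under image_dir, one per line.

    Unlike the shell command with `>>`, this overwrites list_file, so
    running it twice does not append duplicate entries. Paths are written
    as absolute paths.
    """
    paths = sorted(str(p.resolve()) for p in Path(image_dir).rglob(pattern))
    text = "\n".join(paths) + ("\n" if paths else "")
    Path(list_file).write_text(text, encoding="utf-8")
    return paths


if __name__ == "__main__":
    # Placeholder paths: replace with the real unlabeled image directory.
    found = write_image_list("path/to/unlabeled_images", "unlabel.txt")
    print(f"wrote {len(found)} image paths")
```

The resulting file can then be referenced from the `target:` entry of the Dataset section, as in the configs posted earlier in this thread.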

from efficientteacher.

Liumomo30 avatar Liumomo30 commented on August 16, 2024

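Hello, something very strange happened: before leaving work yesterday I started a background run with the same configuration, and when I came in this morning the training had actually finished, with a batch size of 32... Below are the full training results. Could you check whether this is a normal convergence process? Also, the final accuracy seems to have become much lower. image

I will first train another version following the tips you gave and compare.

Hello, how did you solve the poor accuracy problem? Thanks in advance for your reply.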

from efficientteacher.

Liumomo30 avatar Liumomo30 commented on August 16, 2024

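@David-19940718 Hello, we have located this problem. For now please use the txt loading method; we will add support for the list loading method.

OK, thanks. I have just got it working with the txt loading method. However, the accuracy seems to have "dropped dramatically". Let me briefly summarize my steps first:

  1. I downloaded the yolov5-7.0 code, picked an arbitrary open-source dataset, and ran fully supervised training for 300 epochs to obtain the weights custom_best.pt;
  2. I converted these weights to efficient-yolov5s.pt with convert_pt_to_efficient.py;
  3. I evaluated with python val.py --config configs/sup/custom/yolov5s_custom.yaml --weights efficient-yolov5s.pt; the accuracy is very high, with three classes in total and a mean mAP of about 98%;
  4. I ran find <unlabeld_data_path> -name "*.jpg" >> unlabel.txt to generate the corresponding txt file for the pseudo-label data;
  5. Following configs/ssod/custom/yolov5l_custom_ssod.yaml, I wrote my own yolov5s_custom_ssod.yaml config, with epochs set to 20 and the default burn_epochs of 10;

There are now two problems: first, the accuracy is very low; second, once burn_epochs finishes and semi-supervised training begins, it immediately reports a GPU out-of-memory error. image

Hello. I have also run into very low accuracy. Using the official yolov5 code, I have confirmed that the dataset is not the problem. How did you resolve the low accuracy at the time?

Have you solved the poor accuracy problem?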

from efficientteacher.

18874959346 avatar 18874959346 commented on August 16, 2024

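Hello, I have a question. After training on a meter dataset I built myself, predictions on images from the test and validation sets are very good, but predictions on other images taken from different angles are very poor, with low accuracy. (Our external prediction images differ slightly from the dataset.) Is it most likely that the dataset is not good enough and the model has therefore overfit?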

from efficientteacher.

18874959346 avatar 18874959346 commented on August 16, 2024

@David-19940718

from efficientteacher.
