Issues from gau-nernst's vision-toolbox

checklist

TODO

Backbones

HRNet [Paper v1] [Paper v2] [code] [mmdet]
VOLO? [paper]
PatchConvNet (training results are not good. port weights over to check correctness?)
VoVNetV1, since ESE is not supported on many edge platforms
torchvision's RegNet [paper] [code]
Darknet-YOLOv5 [code]

Training

(Low priority) JPEG GPU decoding to alleviate the CPU bottleneck in data loading for small models
- May break torchvision.transforms and Lightning
(Low priority) Model EMA
- Lightning-AI/pytorch-lightning/issues/10914 -> not a clean solution
- torchvision
Use DataModule to separate data from LightningModule for better readability
Channels-last memory layout
- Good performance gain (RTX 3090, with fp16 training). For Darknet-53: 1.4 it/s (w/ JIT) -> 2 it/s (w/o JIT, channels-last)
- JIT + channels-last is not good: 1.3 it/s
WebDataset
(Currently not possible) DALI pipeline
- NVIDIA/DALI/issues/2978
- DALI does not support conditional operations. Implementing Trivial Augment will be challenging
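
For the Model EMA item in the list above, here is a minimal sketch of the update rule, assuming the usual exponential-moving-average formulation. The function name `ema_update` and the default decay are illustrative assumptions; this is not the toolbox's, Lightning's, or torchvision's implementation.

```python
def ema_update(ema_params, model_params, decay=0.999):
    """Blend the current model parameters into a running average.

    ema_params and model_params are dicts keyed by parameter name.
    Values only need to support * and +, so plain floats work here,
    and tensors would work the same way.
    """
    for name, value in model_params.items():
        # new_ema = decay * old_ema + (1 - decay) * current_value
        ema_params[name] = decay * ema_params[name] + (1.0 - decay) * value
    return ema_params
```

The update would typically be called once after each optimizer step, and the averaged parameters used for validation instead of the raw ones.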

csd = ckpt['model'].float().state_dict()

Hey, dude. It looks like you are the only one who has trained YOLOv5 on ImageNet.
I want to use your pretrained model in YOLOv5 for object detection.
But the original YOLOv5 can't load it as-is:

python train.py --data tinyObject.yaml --cfg ./models/yolov5m.yaml  --weights darknet_yolov5m-a1eb31bd.pt

error:  File "train.py", line 123, in train
    csd = ckpt['model'].float().state_dict()  # checkpoint state_dict as FP32
KeyError: 'model'

So what can I do to swap in this backbone network?
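
The `KeyError: 'model'` above suggests that the downloaded file is not laid out the way YOLOv5's `train.py` expects, where `ckpt['model']` is a module object. A minimal sketch of a loader that tolerates both layouts follows, assuming the backbone file is a plain state dict (an assumption; the thread does not confirm it). The helper name `extract_state_dict` is hypothetical.

```python
from collections.abc import Mapping


def extract_state_dict(ckpt):
    """Return a flat {parameter_name: value} dict from a loaded checkpoint.

    Two layouts are handled:
      * YOLOv5-style: a dict whose 'model' entry exposes .state_dict()
      * plain: the checkpoint is already a state dict (assumed here for
        the backbone weights file)
    """
    if not isinstance(ckpt, Mapping):
        raise TypeError(f"unsupported checkpoint type: {type(ckpt).__name__}")
    if "model" in ckpt:
        model = ckpt["model"]
        if hasattr(model, "state_dict"):
            # Mirror the train.py line: cast to FP32 when the object allows it
            if hasattr(model, "float"):
                model = model.float()
            return dict(model.state_dict())
        return dict(model)  # the 'model' entry is itself a state dict
    return dict(ckpt)  # plain state dict


# Typical use (requires PyTorch, not executed here):
#   ckpt = torch.load("darknet_yolov5m-a1eb31bd.pt", map_location="cpu")
#   csd = extract_state_dict(ckpt)
```

Even with the state dict extracted, the parameter names would still have to be mapped onto YOLOv5's own names before the weights could be loaded into its model.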

Comparing backbone to original YOLOv5 repo

Hey,
I've been trying to load the pre-trained weights from this repo into the original YOLOv5 (changing state dict key names, etc.).
However, the weight shapes do not line up, for example on YOLOv5s:
(left: your implementation with names and sizes; right: original YOLOv5)

The first problem occurs in block 0: a mismatch in the weight shapes, as shown in the image above.
I've checked the s, m, and l variants, and all of them have the same issue.

Could you please check if there's a bug in your implementation?
Thanks!
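
A check like the one described above can be sketched as follows, assuming both state dicts list their parameters in the same order (an assumption, since the key names differ between the two repos). The function name `first_shape_mismatch` is hypothetical; it only needs values with a `.shape` attribute, which tensors provide.

```python
def first_shape_mismatch(ours, theirs):
    """Walk two state dicts in order and report the first shape mismatch.

    Returns (name_ours, shape_ours, name_theirs, shape_theirs), or None
    when every paired entry has the same shape. Entries are paired by
    position, not by name. If one dict is longer than the other, the
    extra entries are not compared.
    """
    for (name_a, value_a), (name_b, value_b) in zip(ours.items(), theirs.items()):
        shape_a, shape_b = tuple(value_a.shape), tuple(value_b.shape)
        if shape_a != shape_b:
            return name_a, shape_a, name_b, shape_b
    return None
```

Reporting both names with the mismatch makes it easier to see whether the layers are genuinely different or just paired off by one.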

Imagenet pretrained

I've seen your instructions for training on ImageNet Object Localization.

How long does it take to train on such a big dataset?

Could you please share a trained checkpoint for YOLOv5 and any other architecture that you've tested?

It would be great for people with fewer resources, and it would save a little bit of energy ;-)

If that isn't possible for any reason, it's perfectly understandable. Thanks for sharing this project.

Timm models

What happened to Timm backbones?
Before the refactor to vision toolbox, I was using timm_dla34 for example.

Add lightweight ViT

For learning purposes.

Will probably port weights.

Replace PyTorch Lightning + Webdataset with HF Accelerate + NVIDIA DALI

As per title.

Darknet derivative version support

🚀 Features and Motivation

Hi @gau-nernst, this toolbox looks very impressive!

Several derivative versions of Darknet are used in practical object detection, such as the popular YOLOv5 and YOLOX, each of which modifies Darknet slightly. Do you have any plans to support these versions?

Additional context

https://github.com/ultralytics/yolov5/blob/436ffc4/models/yolov5l.yaml#L13-L25
https://github.com/Megvii-BaseDetection/YOLOX/blob/main/yolox/models/darknet.py
https://github.com/zhiqwang/yolov5-rt-stack/blob/main/yolort/models/darknetv6.py (This is a refactored version of YOLOv5 Darknet to make it readable in the torchvision layout)
