
vision-toolbox's Issues

checklist

TODO

Backbones

Training

  • (Low priority) JPEG GPU decoding to alleviate the CPU bottleneck in data loading for small models
    • May break torchvision.transforms and Lightning
  • (Low priority) Model EMA
  • Use DataModule to separate data from LightningModule for better readability
  • Channels-last memory layout
    • Good performance gain (RTX 3090, with fp16 training). For Darknet-53: 1.4 it/s (w/ JIT) -> 2 it/s (w/o JIT, channels-last)
    • JIT + channels-last is slower than either alone: 1.3 it/s
  • WebDataset
  • (Currently not possible) DALI pipeline
    • NVIDIA/DALI/issues/2978
    • DALI does not support conditional operations. Implementing Trivial Augment will be challenging
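
To put the channels-last numbers above in relative terms, here is a small arithmetic sketch. The three throughput figures are the ones quoted in the notes; the helper name `relative_change` is only illustrative.

```python
def relative_change(baseline: float, new: float) -> float:
    """Return the change from baseline to new as a percentage of baseline."""
    return (new - baseline) / baseline * 100.0


# Darknet-53 throughput in it/s, taken from the checklist notes above
JIT_ONLY = 1.4
CHANNELS_LAST_ONLY = 2.0
JIT_AND_CHANNELS_LAST = 1.3

gain = relative_change(JIT_ONLY, CHANNELS_LAST_ONLY)
loss = relative_change(JIT_ONLY, JIT_AND_CHANNELS_LAST)

print(f"channels-last without JIT vs. JIT only: {gain:+.1f}%")
print(f"JIT + channels-last vs. JIT only: {loss:+.1f}%")
```

Channels-last without JIT comes out at roughly a 43% throughput gain over the JIT baseline, while combining the two is about 7% slower than JIT alone.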

csd = ckpt['model'].float().state_dict()

Hey dude, it looks like you are the only one who has trained YOLOv5 on ImageNet.
I want to use your pretrained model in YOLOv5 for object detection.
But it can't be loaded by the original YOLOv5?

python train.py --data tinyObject.yaml --cfg ./models/yolov5m.yaml  --weights darknet_yolov5m-a1eb31bd.pt

error:  File "train.py", line 123, in train
    csd = ckpt['model'].float().state_dict()  # checkpoint state_dict as FP32
KeyError: 'model'

So what can I do to swap in the backbone network?
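
One possible reading of the traceback, stated as an assumption rather than a confirmed diagnosis: `train.py` indexes the checkpoint with `'model'`, while the `.pt` file here may hold a bare state dict without that wrapper. A minimal sketch of a helper that accepts either layout follows. The name `extract_state_dict` is hypothetical, and the checkpoint is assumed to be already loaded into memory (for example with `torch.load`).

```python
from collections.abc import Mapping


def extract_state_dict(ckpt):
    """Return a name -> weight mapping from either checkpoint layout.

    Assumption: `ckpt` is either a dict with a 'model' entry (the layout
    that train.py indexes in the traceback above) or a bare state dict.
    """
    if not isinstance(ckpt, Mapping):
        raise TypeError(f"expected a mapping, got {type(ckpt).__name__}")
    if "model" not in ckpt:
        # No wrapper: treat the object itself as the state dict
        return dict(ckpt)
    model = ckpt["model"]
    if hasattr(model, "state_dict"):
        # Module-like object: same call chain as in the traceback
        return model.float().state_dict()
    # 'model' already holds a plain name -> weight mapping
    return dict(model)


# Tiny stand-ins for loaded checkpoints; the key and values are made up
bare = {"stem.conv.weight": [0.1, 0.2]}
wrapped = {"model": bare, "epoch": 3}
same = extract_state_dict(bare) == extract_state_dict(wrapped)
print(same)
```

This only recovers the weights. Loading them into the original YOLOv5 model would still require the key names and tensor shapes to match.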

Comparing backbone to original YOLOv5 repo

Hey,
I've been trying to load the pre-trained weights from this repo into the original YOLOv5 (changing state dict key names, etc.).
However, the weight shapes do not line up, for example on YOLOv5s:
(left - your implementation with names and sizes, right - original YOLOv5)
image

The first problem occurs in block 0: a mismatch in the weight shapes, as shown in the image above.
I've checked the s, m, and l variants, and all of them have the same issue.

Could you please check if there's a bug in your implementation?
Thanks!
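
A side-by-side comparison like the one described can also be produced programmatically. The sketch below assumes each model's state dict has been reduced to a name-to-shape mapping, and it pairs entries by position because the key names differ between the two repos. The function name `first_shape_mismatch` and all example names and shapes are hypothetical.

```python
def first_shape_mismatch(left, right):
    """Pair entries by position and return the first pair whose shapes differ.

    left, right: dicts of parameter name -> shape tuple, in layer order.
    Returns (index, left_name, left_shape, right_name, right_shape),
    or None if every compared pair matches. Comparison stops at the
    end of the shorter mapping.
    """
    pairs = zip(left.items(), right.items())
    for i, ((l_name, l_shape), (r_name, r_shape)) in enumerate(pairs):
        if tuple(l_shape) != tuple(r_shape):
            return i, l_name, tuple(l_shape), r_name, tuple(r_shape)
    return None


# Made-up names and shapes, for illustration only
ours = {"a.weight": (16, 3, 3, 3), "b.weight": (32, 16, 3, 3)}
theirs = {"x.weight": (16, 3, 5, 5), "y.weight": (32, 16, 3, 3)}

mismatch = first_shape_mismatch(ours, theirs)
print(mismatch)
```

With PyTorch, the input mappings could be built as `{k: tuple(v.shape) for k, v in model.state_dict().items()}`.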

Imagenet pretrained

I've seen your instructions to train on Imagenet Object Localization.

How long does it take to train on such a big dataset?

Could you please share a trained checkpoint for YOLOv5 and any other architecture that you've tested?

It would be great for people with fewer resources and would save a little bit of energy ;-)

If that isn't possible for any reason, it's perfectly understandable. Thanks for sharing this project.

Timm models

What happened to the timm backbones?
Before the refactor to vision-toolbox, I was using timm_dla34, for example.

Add lightweight ViT

Darknet derivative version support

🚀 Features and Motivation

Hi @gau-nernst, this toolbox looks very impressive!

Several derivative versions of Darknet are used in practical object detection, such as the popular YOLOv5 and YOLOX, which modify Darknet slightly. Do you have any plans to support these versions?

Additional context
